Adaptive finite element methods for sparse PDE–constrained optimization

# Adaptive finite element methods for sparse PDE–constrained optimization

Alejandro Allendes †Departamento de Matemática, Universidad Técnica Federico Santa María, Valparaíso, Chile. Francisco Fuica ‡Departamento de Matemática, Universidad Técnica Federico Santa María, Valparaíso, Chile.  and  Enrique Otárola§ §Departamento de Matemática, Universidad Técnica Federico Santa María, Valparaíso, Chile.
###### Abstract.

We propose and analyze reliable and efficient a posteriori error estimators for an optimal control problem that involves a nondifferentiable cost functional, the Poisson problem as state equation and control constraints. To approximate the solutions to the state and adjoint equations we consider piecewise linear discretizations whereas three different strategies are used to approximate the control variable: piecewise constant discretization, piecewise linear discretization and the so–called variational discretization approach. For the first two aforementioned solution techniques we devise an error estimator that can be decomposed as the sum of four contributions: two contributions that account for the discretization of the control variable and the associated subgradient, and two contributions related to the discretization of the state and adjoint equations. The error estimator for the variational discretization approach is decomposed only in two contributions that are related to the discretization of the state and adjoint equations. On the basis of the devised a posteriori error estimators, we design simple adaptive strategies that yield optimal rates of convergence for the numerical examples that we perform.

###### Key words and phrases:
PDE–constrained optimization, nondifferentiable objectives, sparse controls, a posteriori error analysis, adaptive finite elements
AA is partially supported by CONICYT through FONDECYT project 1170579. EO is partially supported by CONICYT through FONDECYT project 3160201. FF is supported by USM through Programa de Incentivos a la Iniciación Científica (PIIC)

fourierlargesymbols147

## 1. Introduction.

In this work we shall be interested in the design and analysis of a posteriori error estimators for a nondifferentiable optimal control problem; control constraints are also considered. To make matters precise, for , we let be an open and bounded polytopal domain in with Lipschitz boundary and . Given a desired state , a regularization parameter , and a sparsity parameter , we define the nondifferentiable cost functional

 (1.1) J(y,u):=12∥y−yΩ∥2L2(Ω)+α2∥u∥2L2(Ω)+β∥u∥L1(Ω).

We shall thus be concerned with the nonsmooth optimal control problem: Find

 (1.2) minJ(y,u)

subject to the state equation

 (1.3) −Δy=u+f in Ω,y=0 on ∂Ω,

and the control constraints

We immediately comment that, since we are interested in the nondifferentiable scenario, we assume that are such that . We refer the reader to [10, Remark 2.1] for a discussion.

The design and analysis of solution techniques for problem (1.2)–(1.4) can be motivated by the following two observations:

• The cost functional involves the –norm of the control variable. This term, that is a natural measure of the control cost, leads to sparsely supported optimal controls [10, 30, 36], i.e., optimal controls that are not zero only in a small region of the considered domain. This is a desirable feature in applications, for instance, in the optimal placement of discrete actuators [30].

• The cost functional is nondifferentiable. As a consequence, the study of solution techniques for (1.2)–(1.4) present some extra mathematical difficulties compared with the standard case and that is presented, for instance, in [31]. Fortunately, these difficulties can be overcame with elements from convex analysis [10, 36].

The analysis and discretization of PDE–constrained optimization problems that involve a cost functional containing a –control cost term have been considered in a number of works. To the best of our knowledge, the first work that provides an analysis when the state equation is a linear and elliptic PDE is [30]. In this work, the author utilizes a regularization technique that involves an –control cost term, analyzes optimality conditions, and studies the convergence properties of a proposed semismooth Newton method. Later, these results were complemented with the rates of convergence with respect to a regularization parameter derived in [36]. Subsequently, in [10], the authors consider a nonlinear version of (1.2)–(1.4) where the state equation is a semilinear elliptic PDE and analyze second order optimality conditions. We refer the reader to the recent work [7] for a complete overview of the results available in the literature. Simultaneously with these advances, discretization techniques based on finite element methods and their corresponding a priori error analyses have also been studied. We refer the reader to [36], when the state equation is a linear elliptic PDE, to [9, 10] for extensions to the semilinear case, and to [26] when the state equation (1.3) corresponds to the spectral fractional Laplacian. We also mention references [11, 12] for extensions of the aforementioned developments to evolution problems.

As opposed to the available a priori error analysis for finite element approximations of sparse PDE–constrained optimization, the design and analysis of a posteriori error estimators are rather scarce. An a posteriori error estimator is a computable quantity that depends on the discrete solution and data, and provides information about the local quality of the approximate solution. It is an essential ingredient of the so–called adaptive finite element methods (AFEMs). The theory for linear second–order elliptic boundary value problems is well–established [2, 24, 25, 33]. In contrast, the theory for constrained optimal control problems is not as developed. The main source of difficulty is its inherent nonlinear feature, which appears due to the control constraints. To the best of our knowledge, the first work that provides an advance is [20] where the authors propose an estimator and derive a reliability estimate [20, Theorem 3.1]. Subsequently, this was improved in [17] by providing efficiency estimates involving oscillation terms [17, Theorems 5.1 and 6.1]. An attempt to unify the available results in the literature was later presented in [19]: on the basis of an important error equivalence the analysis is simplified to provide reliable and efficient estimators for the state and adjoint equations. The analysis is based on the energy norm inherited by the state and adjoint equations. Recently, the authors of [29] provided a general framework that complements the one developed in [19], and measures the error in a norm that is motivated by the objective. The analysis relies on the convexity of . The common feature in all the previous cited references is that, in contrast to (1.2)–(1.4), . For different approaches based on weighted residual and goal–oriented methods and advances in the semilinear and nonlinear case, the reader is referred to [5, 16, 21, 34].

To the best of our knowledge, the only work that provides an advance concerning the a posteriori error analysis for (1.2)–(1.4) is [36]. In this work, the authors consider a piecewise constant discretization for the control variable, propose a residual–type a posteriori error estimator, and prove that it yields an upper bound for the approximation errors of the state and control variables [36, Theorem 6.2] (the errors committed in the approximation of the associated subgradient and the adjoint variable are not considered). However, no efficiency analysis is provided in [36]. In this work we consider three discretization schemes for (1.2)–(1.4) that rely on the discretization of the state and adjoint equations with piecewise linear functions. The schemes differ on the type of discretization considered for the control variable: piecewise constant, piecewise linear or variational discretization. For the two first schemes, we design an a posteriori error estimator that accounts for the discretization of the optimal control variable, its associated subgradient, and the state and adjoint variables. The a posteriori error estimator designed for the variational discretization approach only accounts for the discretization of the state and adjoint variables. We measure the total error in energy–norms and –norms and derive global reliability and local efficiency results. With these estimators at hand, we also design simple adaptive strategies that yield optimal rates of convergence for the numerical examples that we perform.

We organize our exposition as follows. We set notation in Section 2 and briefly recall elements from convex analysis. In Section 3 we present existence and uniqueness results together with first–order necessary and sufficient optimality conditions. In Section 4 we present three finite element discretizations of our optimal control problem; all of them rely on the discretization of the state and adjoint equations by using piecewise linear functions. To approximate the control variable three strategies are considered: piecewise constant, piecewise linear and variational discretization. The core of our work is Section 5 where, for each discretization presented in Section 4, we design an a posteriori error estimator and derive reliability and efficiency results. We conclude, in Section 6, with a series of numerical examples that illustrate our theory.

## 2. Notation and Preliminaries

Let us fix notation and the functional setting in which we will operate. Throughout this work and is an open and bounded polytopal domain with Lipschitz boundary. For a bounded domain , and denote the standard Lebesgue and Sobolev spaces, respectively, and is the subspace of consisting of functions whose trace is zero on . Let denote the inner product in . The norm in is denoted by while the seminorm in is denoted by . If and are normed vector spaces, we write to denote that is continuously embedded in . We denote by the dual of . The relation indicates that , with a nonessential constant that might change at each occurrence. Finally, throughout the manuscript we will frequently make use of the following Poincaré inequality

 (2.1) ∥∇w∥L2(Ω)≤C∥w∥L2(Ω)∀w∈H10(Ω).

### 2.1. Convex functions and subdifferentials

In this section we recall some elements from convex analysis that will be essential for the results that we will present in this work.

Let be a real normed vector space. Let be convex and proper, and let with . A subgradient of at is a continuous linear functional on that satisfies

 (2.2) ⟨v⋆,w−v⟩≤η(w)−η(v)∀\leavevmode\nobreak w∈E,

where denotes the duality pairing between and . We immediately remark that a function may admit many subgradients at a point of nondifferentiability. The set of all subgradients of at is called subdifferential of at and is denoted by By convexity, the subdifferential for all points in the interior of the effective domain of . Finally, we mention that the subdifferential is monotone, i.e.,

 (2.3) ⟨v⋆−w⋆,v−w⟩≥0∀v⋆∈∂η(v), ∀w⋆∈∂η(w).

We refer the reader to [14, 28] for a thorough discussion on convex analysis.

## 3. Sparse PDE–constrained optimization.

In this section we briefly review the analysis of the nondifferentiable optimal control problem (1.2)–(1.4). We recall existence and uniqueness results together with first–order necessary and sufficient optimality conditions.

For defined as in (1.1), the nondifferentiable optimal control problem reads:

subject to the linear and elliptic PDE

 (3.2) (∇y,∇v)L2(Ω)=(u+f,v)L2(Ω)∀v∈H10(Ω).

We must immediately notice that the set , defined as in (1.4), is a nonempty, closed and convex subset of .

We now define the control-to-state map as follows: given , associates to it a unique state that solves (3.2). Since , we may also consider acting from into itself. An immediate application of Lax-Milgram Lemma yields the well–posedness of . With this operator at hand we define the reduced cost functional

 j(u)=J(Su,u):=12∥Su−yΩ∥2L2(Ω)+α2∥u∥2L2(Ω)+β∥u∥L1(Ω),

###### Lemma 3.1 (well–posedness).

The sparse PDE–constrained optimization problem (3.1)–(3.2) has a unique optimal solution .

###### Proof.

Since is injective and continuous, is strictly convex and weakly lower semicontinuous. The fact that is weakly sequentially compact allows us to conclude. ∎

In order to obtain optimality conditions for (3.1)–(3.2) we introduce the following ingredients. First, we define the so–called adjoint state as follows:

 (3.3) p∈H10(Ω):(∇w,∇p)L2(Ω)=(y−yΩ,w)L2(Ω)∀w∈H10(Ω).

We define the convex and Lipschitz function by ; it corresponds to the nondifferentiable component of the reduced cost functional . The differential counterpart of the latter is defined by

 φ:L2(Ω)→R,u↦φ(u):=12∥Su−yΩ∥2L2(Ω)+α2∥u∥2L2(Ω).

Standard arguments reveal that is Fréchet differentiable with [31, Theorem 2.20].

With these ingredients at hand, we present the following necessary and sufficient optimality condition for our sparse PDE–constrained optimization problem.

###### Theorem 3.2 (optimality conditions).

The pair is optimal for problem (3.1)–(3.2) if and only if and satisfies the variational inequality

where denotes the solution to (3.3) and .

###### Proof.

See [36, Lemma 2.2]. ∎

To present the following result we introduce, for the projection formula

 (3.5) Π[a,b](v(x))=min{b,max{a,v(x)}}.
###### Corollary 3.3 (projection formula).

Let and be as in Theorem 3.2. Then, we have that

 (3.6) ¯u(x)=Π[a,b](−1α(¯p(x)+β¯λ(x))),

and

 (3.7) ¯u(x)=0⇔|¯p(x)|≤β,¯λ(x)=Π[−1,1](−1β¯p(x)).

Consequently, .

###### Proof.

See [10, Corollary 3.2]. ∎

To summarize, the pair is optimal for (3.1)–(3.2) if and only if

 (3.8) (¯y,¯p,¯u):⎧⎪ ⎪ ⎪⎨⎪ ⎪ ⎪⎩(∇¯y,∇v)L2(Ω)=(¯u+f,v)L2(Ω)∀v∈H10(Ω),(∇w,∇¯p)L2(Ω)=(¯y−yΩ,w)L2(Ω)∀w∈H10(Ω),\omit(¯p+α¯u+β¯λ,u−¯u)L2(Ω)≥0∀u∈Uad,

where .

## 4. Finite element discretization.

We present three finite element solution techniques for the nondifferentiable optimal control problem (3.1)–(3.2). All the techniques discretize the state and adjoint equations with piecewise linear functions. However, they differ on the type of discretization technique used for the optimal control variable: In Section 4.1, we consider a piecewise constant discretization, in Section 4.2, an scheme based on piecewise linear functions and, in Section, 4.3 we consider the so-called variational discretization approach.

We begin this section by introducing some finite element notation [13, 15]. Let be a conforming partition of into simplices with size , and set . We denote by the collection of conforming and shape regular meshes that are refinements of an initial mesh .

Given a mesh , we define the finite element space of continuous piecewise polynomials of degree one as

 (4.1) V(T)={vT∈C(¯Ω):vT|K∈P1(K)∀K∈T,vT|∂Ω=0}.

In what follows we will describe the three solution techniques that we will consider for our optimal control problem (3.1)–(3.2).

### 4.1. Piecewise constant discretization.

We define . The discrete admissible set ) is thus defined as

With these discrete spaces at hand, we propose the following finite element discretization of the optimality system (3.8): Find such that

 (4.2)

where and

 ψ:U0(T)→R,uT↦ψ(uT)=∫Ω|uT|\rm dx=∑K∈T|uT||K|.

The next result shows discrete projection formulas for and .

###### Lemma 4.1 (discrete projection formulas in U0(T)).

Let be the solution to (4.2). Then, we have

 (4.3) ¯uT|K=Π[a,b](−1α(1|K|∫K¯pT\rm dx+β¯λT|K)),

and

 (4.4) ¯uT|K=0⇔1|K|∣∣∣∫K¯pT\rm dx∣∣∣≤β,¯λT|K=Π[−1,1](−1β|K|∫K¯pT\rm dx).
###### Proof.

We begin by noticing that . Consequently, standard arguments, on the basis of (2.2), allow us to identify with an element of that satisfies

 (4.5) ¯λT=∑K∈TχK¯λT|K,⎧⎪⎨⎪⎩¯λT|K=+1,¯uT|K>0,¯λT|K=−1,¯uT|K<0,¯λT|K∈[−1,1],¯uT|K=0,

where corresponds to the characteristic function of the element . Thus, since , the variational inequality in the optimality system (4.2) reads

 ∑K∈T(∫K¯pT\rm dx+|K|(α¯uT|K+β¯λT|K))(uT|K−¯uT|K)≥0,

where, for every , is such that . We thus invoke similar arguments to the ones used in the proof of [31, Lemma 2.26] to obtain the projection formula (4.3).

The proof of (4.4) follows from (3.5), (4.3), and (4.5); see [10, Section 4] for details. This concludes the proof. ∎

### 4.2. Piecewise linear discretization.

Let us define . The discrete admissible set is thus defined as

We denote the set of all vertices of the mesh by , and, for , we introduce the function which is such that for all . The set is the so–called Courant basis of the space [15, 24]. We notice that every element can be written as

 uT=∑\textscv∈V(T)uT(\textscv)ϕ\textscv.

We follow [9, Section 3] and define, on the space , the discrete inner product and norm by

 (4.6) (uT,vT)T=∑\textscv∈V(T)uT(\textscv)vT(\textscv)∫Ωϕ\textscv\rm dx,∥uT∥2T=(uT,uT)T,

respectively. We also define the discrete nondifferentiable component as and the discrete cost functional

 JT(yT,uT)=12∥yT−yΩ∥2L2(Ω)+α2∥uT∥2T+βψT(uT).

With this discrete setting at hand, we propose the following finite element discretization of the optimality system (3.8) [9, Theorem 3.3]: Find such that

where . To present the following result we introduce the quasi–interpolation operator , that is defined as follows:

 (4.8) \definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@gray@stroke0\pgfsys@color@gray@fill0ΘT(w)=∑\textscv∈V(T)θ\textscv(w)ϕ\textscv,θ\textscv(w):=∫Ωwϕ\textscv\rm dx∫Ωϕ\textscv\rm dx.
###### Lemma 4.2 (discrete projection formulas in Uad,1(T)).

Let be the solution to (4.7). Then, for every , we have

 (4.9) ¯uT(\textscv)=Π[a,b](−1α(θ\textscv(¯pT)+β¯λT(\textscv))),

and

 (4.10) ¯uT(\textscv)=0⇔∣∣θ\textscv(¯pT)∣∣≤β,¯λT(\textscv)=Π[−1,1](−1βθ\textscv(¯pT)).
###### Proof.

See [9, Lemma 3.4]. ∎

###### Remark 4.3 (projection formulas).

The projection formulas obtained in Lemmas 4.1 and 4.2 are essential ingredients in the numerical resolution of system (4.2) and (4.7), respectively; see Algorithm 1 in Section 6.

###### Proposition 4.1 (a priori error estimates).

Let be such that and and . Let , with , be an open and bounded convex polytopal domain and . If (4.2) and (4.7) approximate the optimal control problem (3.1)–(3.2) when and , respectively then, the following a priori error estimates can be derived: For every , there is a constant such that for all

 (4.11) ∥¯u−¯uT∥L2(Ω)≤ChT,

where is independent of .

###### Proof.

For a proof of this result we refer the reader to [36, Proposition 4.5] when and [9, Theorem 3.13] when . ∎

### 4.3. Variational discretization

In what follows we will consider the so–called variational discretization approach introduced by Hinze in [18]. We discretize the state equation with the help of the discrete space (4.1); the admissible set of controls is not discretized. In spite of this fact, the proposed semidiscrete scheme will induce a discretization of the optimal control and its associated subgradient on the basis of projection formulas; see Lemma 4.4 below.

With the aforementioned semidiscrete setting at hand, we propose the following finite element discretization of the optimality system (3.8) [10, Section 5]: Find such that

where . We now present projection formulas for the variables and that make evident how they are implicitly discretized by the semidiscrete scheme (4.12).

###### Lemma 4.4 (variational discrete projections).

Let be the solution to (4.12). Then, for all , we have that

 (4.13) ¯uT(x)=Π[a,b](−1α(¯pT(x)+β¯λT(x))),

and

 (4.14) ¯uT(x)=0⇔|¯pT(x)|≤β,¯λT(x)=Π[−1,1](−1β¯pT(x)).
###### Proof.

See [10, Section 5].

###### Proposition 4.2 (a priori error estimate).

Let be such that and and . Let , with , be an open and bounded convex polytopal domain and . If (4.12) approximate the optimal control problem (3.1)–(3.2), then the following a priori error estimate can be derived: For every , there is a constant such that for all

 (4.15) ∥¯u−¯uT∥L2(Ω)≤Ch2T,

where is independent of .

###### Proof.

See [36, Corollary 4.7]

## 5. A posteriori error estimation.

The design and analysis of AFEMs to solve the optimal control problem (3.1)–(3.2) are motivated by the following considerations: First, the a priori error estimates obtained in [10, 36] require to be quasi–uniform and to be convex. In addition, such estimates are valid under the assumption that is sufficiently small. If the condition that is convex is violated, the optimal variables may have singularities and thus exhibit fractional regularity. As a consequence, quasi–uniform refinement of would not result in an efficient solution technique. Second, the sparsity term in the cost functional yield an optimal control that is nonzero only in sets of small support in . It is then natural, to efficiently resolve such a behavior on the optimal control variable and recover optimal rates of convergence when is not convex, to propose AFEMs.

In the next section we will construct three types of a posteriori error estimators: two of them will be based on the following four contributions: two contributions that account for the discretization of the control variable and the associated subgradient, and two contributions related to the discretization of the state and adjoint variables. Instead, the a posteriori error estimator for the variational discretization approach is based only in two contributions: one related to the discretization of the state variable, and another one related to the discretization of the adjoint variable.

### 5.1. A posteriori error analysis for the Laplacian

Since the three error estimators that we will propose involve contributions from the discretization of the state and adjoint variables, in what follows we summarize some classical a posteriori error estimates for the Laplacian.

Let and consider the following problem: Find such that

 (5.1) (∇z,∇v)L2(Ω)=(g,v)L2(Ω)∀v∈H10(Ω).

We define the Galerkin approximation to (5.1) as the solution to the following discrete problem: Find such that

We define as the set of internal –dimensional interelement boundaries of . For , let denote the subset of that contains the sides in which are sides of . We also denote by the subset of that contains the two elements that have as a side. In addition, we define the patch associated with an element as

 (5.2) ΩK:=⋃K′∈T:SK∩SK′≠∅K′.

Given a discrete function , we define, for any internal side , the jump or interelement residual by

where denote the unit normals to pointing toward , , respectively, which are such that and .

With these ingredients at hand, we introduce the following a posteriori error indicators and error estimator

 E2z,K=h2K∥g∥2L2(K)+hK∥∥[[∇zT⋅ν]]γ∥∥2L2(∂K∖∂Ω),Ez:=(∑K∈TE2z,K)12,

respectively. It is well–known that there exists a positive constant such that the following global reliability result holds:

 (5.3) |z−zT|H1(Ω)≤CzEz.

We refer the reader to [2, Section 2.2] and [33, Section 1.4] for details.

Let us define now the following a posteriori error indicators and error estimator

 E2z,K=h4K∥g∥2L2(K)+h3K∥∥[[∇zT⋅ν]]γ∥∥2L2(∂K∖∂Ω),Ez:=(∑K∈TE2z,K)12,

respectively. If is convex, a duality argument reveals that there exist a positive constant such that

 (5.4) ∥z−zT∥L2(Ω)≤CzEz.

We refer the reader to [2, Section 2.4] for details.

### 5.2. Error estimators for sparse PDE–constrained optimization: reliability

The upper bounds for the errors that we will obtain in our work are constructed using upper bounds on the error between the solution to the discretization (4.2), (4.7) or (4.12) and auxiliary variables that we define in what follows. Let be the solution to (4.2), (4.7) or (4.12), with for (4.2), for (4.7), and for (4.12). We thus define as the solution to

 (5.5) {(∇^y,∇v)L2(Ω)=(¯uT+f,v)L2(Ω)∀v∈H10(Ω),(∇w,∇^p)L2(Ω)=(¯yT−yΩ,w)L2(Ω)∀w∈H10(Ω).

We notice that can be seen as a finite element approximation of . This property motivates the following definitions. First, we define

 (5.6) E2y,K =h2K∥¯uT+f∥2L2(K)+hK∥∥[[∇¯yT⋅ν]]γ∥∥2L2(∂K∖∂Ω),E2y:=∑K∈TE2y,K, (5.7) E2p,K =h2K∥¯yT−yΩ∥2L2(K)+hK∥∥[[∇¯pT⋅ν]]γ∥∥2L2(∂K∖∂Ω),E2p:=∑K∈TE2p,K.

In view of the results of the previous section we conclude from (5.3) that there exist constants and such that

 (5.8) |^y−¯yT|H1(Ω)≤CyEy,|^p−¯pT|H1(Ω)≤CpEp.

Secondly, we define the –based a posteriori error indicators and estimators

 (5.9) E2y,K =h4K∥¯uT+f∥2L2(K)+h3K∥∥[[∇¯yT⋅ν]]γ∥∥2L2(∂K∖∂Ω),E2y:=∑K∈TE2y,K, (5.10) E2p,K =h4K∥¯yT−yΩ∥2L2(K)+h3K∥∥[[∇¯pT⋅ν]]γ∥∥2L2(∂K∖∂Ω),E2p:=∑K∈TE2p,K.

If, in addition, is convex, we thus have from (5.4) that there exist constants and such that

 (5.11) ∥^y−¯yT∥L2(Ω)≤CyEy,∥^p−¯pT∥L2(Ω)≤CpEp.

We now define

 (5.12)

The following remark is thus necessary.

###### Remark 5.1 (properties of ~u and ~λ).

We notice two properties that are consequences of definition (5.12). First, . This will be crucial in the a posteriori error analysis that we will perform in Section 5. Second, if the variational approach is considered, we thus have that

 (5.13) ~u=¯uT,~λ=¯λT.

With these ingredients at hand, we thus define the following a posteriori error indicators and estimators for the optimal control variable and the associated subgradient:

 (5.14) E2u,K :=∥~u−¯uT∥2L2(K),Eu:=(∑K∈TE2u,K)12, (5.15) E2λ,K :=∥~λ−¯λT∥2L2(K),Eλ:=(∑K∈T<