Variational discretization of a control-constrained parabolic bang-bang optimal control problem
We consider a control-constrained parabolic optimal control problem
without Tikhonov term in the tracking functional.
For the numerical treatment, we use
variational discretization of its Tikhonov regularization:
For the state and the adjoint equation, we apply
Petrov-Galerkin schemes from [DanielsHinzeVierling] in time
and usual conforming finite elements in space.
We prove a-priori estimates for the error between the
discretized regularized problem and the limit problem.
Since these estimates are not robust as the regularization parameter tends to zero, we establish robust estimates which, depending on the problem's regularity, improve upon the previous ones.
In the special case of bang-bang solutions, these estimates are further improved.
A numerical example confirms our analytical findings.
Keywords: Optimal control, Heat equation, Control constraints, Finite elements, A-priori error estimates, Bang-bang controls.
In this article we are interested in the numerical solution of the optimal control problem
Here, is basically the (weak) solution operator of the heat equation, the set of admissible controls is given by box constraints, and is a given function to be tracked.
Theoretical and numerical questions related to this control problem attracted much interest in recent years, see, e.g., [deckelnick-hinze], [wachsmuth1], [wachsmuth2], [wachsmuth3], [wachsmuth4], [wachsmuth5], [gong-yan], [felgenhauer2003], [alt-bayer-etal2], [alt-seydenschwanz-reg1], and [seydenschwanz-regkappa]. The last four papers are concerned with being the solution operator of an ordinary differential equation, the former papers with being a solution operator of an elliptic PDE or being a continuous linear operator. In [dissnvd], a brief survey of the content of these and some other related papers is given at the end of the bibliography.
Problem () is in general ill-posed, meaning that a solution does not depend continuously on the datum , see [wachsmuth2, p. 1130]. The numerical treatment of a discretized version of () is also challenging, e.g., due to the absence of formula (10) in the case , which corresponds to problem ().
Therefore we use Tikhonov regularization to overcome these difficulties. The regularized problem is given by
For the numerical treatment of the regularized problem, we then use variational discretization introduced by Hinze in [Hinze2005], see also [hpuu, Chapter 3.2.5]. The state equation is treated with a Petrov-Galerkin scheme in time using a piecewise constant ansatz for the state and piecewise linear, continuous test functions. This results in variants of the Crank-Nicolson scheme for the discretization of the state and the adjoint state, which were proposed recently in [DanielsHinzeVierling]. In space, usual conforming finite elements are taken. See [dissnvd] for the fully discrete case and [SpringerVexler2013] for an alternative discontinuous Galerkin approach.
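The schemes used here are the Petrov-Galerkin variants of [DanielsHinzeVierling]; as a purely illustrative sketch of the underlying Crank-Nicolson idea (not the scheme of that paper itself), consider a method-of-lines heat equation y' = Ay + f, with 1D finite differences standing in for the finite element discretization. All names below are hypothetical.

```python
import numpy as np

def crank_nicolson_step(y, A, f_mid, dt):
    """One Crank-Nicolson step for y' = A y + f:
    solves (I - dt/2 A) y_next = (I + dt/2 A) y + dt * f_mid,
    where f_mid approximates f at the interval midpoint."""
    n = len(y)
    I = np.eye(n)
    rhs = (I + 0.5 * dt * A) @ y + dt * f_mid
    return np.linalg.solve(I - 0.5 * dt * A, rhs)

# 1D heat equation y_t = y_xx on (0,1) with homogeneous Dirichlet
# boundary conditions, central differences on the interior grid.
n, dt = 50, 1e-3
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
A = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2
y = np.sin(np.pi * x)            # initial value
for _ in range(100):             # integrate up to t = 0.1
    y = crank_nicolson_step(y, A, np.zeros(n), dt)
# exact solution for comparison: exp(-pi^2 t) * sin(pi x)
```

The scheme is second order in both the time step and the mesh width, which is the accuracy the Petrov-Galerkin construction is designed to retain for the optimal control problem.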
The purpose of this paper is to prove a-priori bounds for the error between the discretized regularized problem and the limit problem, i.e. the continuous unregularized problem.
We first derive error estimates between the discretized regularized problem and its continuous counterpart. Together with Tikhonov error estimates recently obtained in [daniels], see also [dissnvd], one can establish estimates for the total error between the discretized regularized solution and the solution of the continuous limit problem, i.e. . Here, second order convergence in space is not achievable and (without coupling) the estimates are not robust if tends to zero. Using refined arguments, we overcome both drawbacks. In the special case of bang-bang controls, we further improve those estimates.
The obtained estimates suggest a coupling rule for the parameters (regularization parameter), , and (time and space discretization parameters, respectively) to obtain optimal convergence rates, which we observe numerically.
The paper is organized as follows.
In the next section, we introduce the functional analytic description of the regularized problem. We recall several of its properties, such as existence of a unique solution for all (thus especially in the limit case we are interested in), an explicit characterization of the solution structure, and the function space regularity of the solution. We then introduce the Tikhonov regularization and recall some error estimates under suitable assumptions. In the special case of bang-bang controls, we recall a smoothness-decay lemma which later helps to improve the error estimates for the discretized problem.
The third section is devoted to the discretization of the optimal control problem. At first, the discretization of the state and adjoint equation is introduced and several error estimates needed in the later analysis are recalled. Then, the analysis of variational discretization of the optimal control problem is conducted.
The last section discusses a numerical example where we observe the predicted orders of convergence.
2 The continuous optimal control problem
2.1 Problem setting and basic properties
Let , , be a spatial domain which is assumed to be bounded and convex with a polygonal boundary . Furthermore, a fixed time interval , , a desired state , a non-negative real constant , and an initial value are prescribed. With the Gelfand triple we consider the following optimal control problem
where is the control space, the (closed and convex) set of admissible controls is defined by
with fixed control bounds , fulfilling almost everywhere in ,
is the state space, and the control operator as well as the control region are defined below.
Note that we use the notation and for weak time derivatives and for “for almost all”.
denotes the weak solution operator associated with the heat equation, i.e., the linear parabolic problem
The weak solution is defined as follows. For the function with satisfies the two equations
Note that by the embedding
, see, e.g., [evans, Theorem 5.9.3],
the first relation is meaningful.
In the preceding equation, the bilinear form is given by
For the control region and the control operator we consider two situations.
(Distributed controls) We set , and define the control operator by , i.e., the identity mapping induced by the standard Sobolev embedding .
(Located controls) We set the control region . With a fixed functional the linear and continuous control operator is given by
The case of fixed functionals with controls and a control operator , is a possible generalization. To streamline the presentation we restrict ourselves to the case here and refer to [dissnvd] for the case .
For later use we observe that the adjoint operator is given by
If furthermore holds, we can consider as an operator and get the adjoint operator
Note that the adjoint operator (and also the operator itself) preserves time regularity, i.e., for where is a subspace of depending on the regularity of the (as noticed just before), e.g., or .
Lemma 1 (Properties of the solution operator ).
This can be derived using standard results, see [dissnvd, Lemma 1]. ∎
An advantage of the formulation (7) in comparison to (3) is that the weak time derivative of is not part of the equation. In later discretizations of this equation, this offers the possibility to consider states that do not possess a weak time derivative.
Lemma 2 (Unique solution of the o.c.p.).
where denotes the adjoint operator of , and the so-called optimal adjoint state is the unique weak solution defined and uniquely determined by the equation
This follows from standard results, see, e.g., [dissnvd, Lemma 2]. ∎
As a consequence of the fact that is a closed and convex set in a Hilbert space we have the following lemma.
In the case the variational inequality (8) is equivalent to
where is the orthogonal projection.
See [hpuu, Corollary 1.2, p. 70] with . ∎
The orthogonal projection in (10) can be made explicit in our setting.
Let us for with consider the projection of a real number into the interval , i.e., .
There holds for
See [dissnvd, Lemma 4] for a proof of this standard result in our setting. ∎
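The interval projection appearing in the formula above is simply a pointwise clamp; a minimal sketch (hypothetical function name) reads:

```python
def project(x, a, b):
    """Orthogonal projection of a real number x onto the interval
    [a, b], i.e. P_{[a,b]}(x) = min(b, max(a, x))."""
    return min(b, max(a, x))
```

Applied pointwise to a function, this realizes the orthogonal projection onto the admissible set given by box constraints.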
We now derive an explicit characterization of the optimal control.
If , then for almost all there holds for the optimal control
Suppose is given. Then the optimal control fulfills a.e.
We refer to [dissnvd, Lemma 5] for a proof of this standard result in our setting. ∎
As a consequence of (12) we have: If vanishes only on a subset of with Lebesgue measure zero, the optimal control only takes values on the bounds of the admissible set . In this case is called a bang-bang solution.
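The projection formula makes this bang-bang structure concrete: for positive regularization parameter the control is a clamped, scaled adjoint quantity, and in the limit it is pushed onto the bounds wherever that quantity does not vanish. A hypothetical pointwise sketch (the sign convention here is illustrative and may differ from formula (12)):

```python
def regularized_control(p, a, b, alpha):
    """Pointwise evaluation of u = P_{[a,b]}(-p / alpha), alpha > 0."""
    return min(b, max(a, -p / alpha))

def bang_bang_control(p, a, b):
    """Limit alpha -> 0: the control takes the bound values
    wherever the adjoint quantity p does not vanish."""
    if p > 0:
        return a
    if p < 0:
        return b
    return None  # undetermined where p vanishes
```

For small alpha the regularized control already agrees with the bang-bang limit away from the zero set of p, which mirrors the structure described above.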
Assuming more regularity on the data than stated above, we get regularity for the optimal state and the adjoint state needed for the convergence rates in the numerical realization of the problem.
We use here and in what follows the notation
Let with and . Furthermore, we assume . In the case of distributed controls, we assume , . In the case of located controls, we assume , and .
Let Assumption 7 hold and let . For the unique solution of () and the corresponding adjoint state there holds
in the case of located controls or
in the case of distributed controls.
With some constant independent of , we have the a priori estimates
See [dissnvd, Lemma 12]. ∎
Remark 9 (Regularity in the case ).
In the case , we have less regularity:
Since (10) does not hold, we cannot derive regularity for from that of as above. We only know from the definition of that , but might be discontinuous, as we will see later.
2.2 Tikhonov regularization
in a pointwise almost everywhere sense where and are the bounds of the admissible set . For the limit problem (), which we finally want to solve, this assumption can always be met by a simple transformation of the variables.
To prove rates of convergence with respect to , we rely on the following assumption.
There exist a set , a function with , and constants and , such that there holds the inclusion for the complement of and in addition
with the convention that if the left-hand side of (16) is zero for some .
For a discussion of this assumption we refer to the texts subsequent to [daniels, Assumption 7] or [dissnvd, Assumption 15].
Key ingredient in the analysis of the regularization error and also of the discretization error considered later is the following lemma, see [daniels, Lemma 8] or [dissnvd, Lemma 16] for a proof.
Let Assumption 10.2 hold. For the solution of (), there holds with some constant independent of and
Using this lemma, we can now state regularization error estimates.
For the regularization error there holds with positive constants and independent of the following.
For a proof of this recent result, we refer to [daniels, Theorem 11] and [dissnvd, Theorem 19], where also a discussion can be found. We only recall two points for convenience here:
The assumption of the first case of the above Theorem implies
which induces bang-bang controls, compare Remark 6.
2.3 Bang-bang controls
We now introduce a second measure condition which leads to an improved bound on the decay of smoothness in the derivative of the optimal control when tends to zero. This bound will be useful later to derive improved convergence rates for the discretization errors.
Definition 13 (-measure condition).
If for the set
holds true (with the convention that if the measure in (29) is zero for all ), we say that the -measure condition is fulfilled.
Let us assume the -condition
See [daniels, Theorem 15] or [dissnvd, Theorem 24]. ∎
If the limit problem is of certain regularity, both measure conditions coincide:
See [daniels, Corollary 18] or [dissnvd, Corollary 27]. ∎
with a constant independent of due to the definition of . Recall that , by Assumption 7. With this estimate, the projection formula (10) and the stability of the projection (see [dissnvd, Lemma 11]) we obtain the bound
if is sufficiently small.
If the -measure condition (29) is valid, this decay of smoothness in terms of can be relaxed in weaker norms, as the following lemma shows.
Lemma 16 (Smoothness decay in the derivative).
Let the -measure condition (29) be fulfilled and located controls be given. Then for sufficiently small there holds
with a constant independent of . Here, and . Note that in the case of constant control bounds and .
See [daniels, Lemma 19] or [dissnvd, Lemma 28]. ∎
The question of necessity of Assumption 10 and the -measure condition (28) to obtain the convergence rates of Theorem 12.1 is discussed in [daniels, sections 4 and 5] and [dissnvd, sections 1.4.3 and 1.4.4]. The results there show that in several cases the conditions are in fact necessary to obtain the convergence rates from above.
3 The discretized problem
3.1 Discretization of the optimal control problem
Consider a partition of the time interval . With we have . Furthermore, let for denote the interval midpoints. By we get a second partition of , the so-called dual partition, namely , with . The grid width of the first mentioned (primal) partition is given by the parameters and
Here and in what follows we assume . We also denote by (in a slight abuse of notation) the grid itself.
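The dual partition can be generated directly from the primal node vector; a small sketch under the notation above (interval midpoints, hypothetical function name):

```python
import numpy as np

def dual_partition(t):
    """Given primal time nodes 0 = t_0 < ... < t_M = T, return the
    nodes of the dual partition built from the interval midpoints
    t_{k-1/2}, together with the maximal primal step size."""
    t = np.asarray(t, dtype=float)
    mid = 0.5 * (t[:-1] + t[1:])              # midpoints t_{k-1/2}
    dual = np.concatenate(([t[0]], mid, [t[-1]]))
    k_max = np.max(np.diff(t))                # grid width parameter
    return dual, k_max
```

Note that the first and last dual intervals are half-length, as they run from an endpoint of the time interval to the nearest midpoint.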
We need the following conditions on sequences of time grids.
There exist constants and independent of such that there holds
On these partitions of the time interval, we define the ansatz and test spaces of the Petrov–Galerkin schemes. These schemes will replace the continuous-in-time weak formulations of the state equation and the adjoint equation, i.e., (7) and (9), respectively. To this end, we define at first for an arbitrary Banach space the semidiscrete function spaces
Here, , , , is the set of polynomial functions in time with degree of at most on the interval with values in . We note that functions in can be uniquely determined by elements from . The same holds true for functions but with only uniquely determined in by definition of the space. The reason for this is given in the discussion below [dissnvd, (2.16), p. 41]. Furthermore, for each function we have where denotes the equivalence class with respect to the almost-everywhere relation.
In the sequel, we will frequently use the following interpolation operators.
(Piecewise linear interpolation on the dual grid)
The interpolation operators are obviously linear mappings. Furthermore, they are bounded, and we have error estimates, as [dissnvd, Lemma 31] shows.
In addition to the notation introduced after Remark 6, adding a subscript to a norm will indicate an norm in the following. Inner products are treated in the same way.
Note that in all of the following results denotes a generic, strictly positive real constant that does not depend on quantities appearing to the right of it or below it.
Note that we can extend the bilinear form of (6) in its first argument to , thus consider the operator
Using continuous piecewise linear functions in space, we can formulate fully discretized variants of the state and adjoint equation.
We consider a regular triangulation of with mesh size
see, e.g., [brenner-scott, Definition (4.4.13)], and triangles. We assume that . We also denote by (in a slight abuse of notation) the grid itself.
With the space
we define to discretize .
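For illustration only, the assembly of such piecewise linear conforming elements can be sketched in one space dimension (the paper works with triangles in 2D; all names here are hypothetical):

```python
import numpy as np

def p1_matrices(nodes):
    """Assemble 1D piecewise linear (P1) stiffness and mass matrices
    on a given node vector by summing local element contributions."""
    n = len(nodes)
    K = np.zeros((n, n))   # stiffness matrix
    M = np.zeros((n, n))   # mass matrix
    for e in range(n - 1):
        h = nodes[e + 1] - nodes[e]
        # local element matrices for linear hat functions
        Ke = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
        Me = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        K[e:e + 2, e:e + 2] += Ke
        M[e:e + 2, e:e + 2] += Me
    return K, M
```

The stiffness matrix has zero row sums (constants lie in its kernel before boundary conditions are imposed), and the entries of the mass matrix sum to the length of the domain; both are standard sanity checks for the assembly.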
For the space grid we make use of a standard grid assumption, as we did for the time grid, sometimes called quasi-uniformity.
There exists a constant independent of such that
We fix fully discrete ansatz and test spaces, derived from their semidiscrete counterparts from (33), namely
With these spaces, we introduce fully discrete state and adjoint equations as follows.
Definition 19 (Fully discrete adjoint equation).
For find such that
Definition 20 (Fully discrete state equation).
For find , such that
Existence and uniqueness of these two schemes follow as in the semidiscrete case discussed in [DanielsHinzeVierling] or [dissnvd, section 2.1.2].
Let us recall some stability results and error estimates for these schemes. The first result is [dissnvd, Lemma 56].
Let solve (39) with . Then there exists a constant independent of and such that
For stability of a fully discrete state and an error estimate, we recall [dissnvd, Lemma 59].
Let us now consider the error of the fully discrete adjoint state. We begin with an norm result, which is [dissnvd, Lemma 62].
Let solve (9) for some such that has the regularity . Let furthermore