A-posteriori snapshot location for POD in optimal control of linear parabolic equations

# A-posteriori snapshot location for POD in optimal control of linear parabolic equations

Alessandro Alla Florida State University, Department of Scientific Computing, FL-32306 Tallahassee, USA (aalla@fsu.edu)    Carmen Grässle University of Hamburg, Department of Mathematics, D-20146 Hamburg, Germany (carmen.graessle@uni-hamburg.de)    Michael Hinze University of Hamburg, Department of Mathematics, D-20146 Hamburg, Germany (michael.hinze@uni-hamburg.de)
30 August 2016
###### Abstract

In this paper we study the approximation of an optimal control problem for linear parabolic PDEs with model order reduction based on Proper Orthogonal Decomposition (POD-MOR). POD-MOR is a Galerkin approach where the basis functions are obtained upon information contained in time snapshots of the parabolic PDE related to given input data. In the present work we show that for POD-MOR in optimal control of parabolic equations it is important to have knowledge about the controlled system at the right time instances. We propose to determine the time instances (snapshot locations) by an a-posteriori error control concept. This method is based on a reformulation of the optimality system of the underlying optimal control problem as a second order in time and fourth order in space elliptic system which is approximated by a space-time finite element method. Finally, we present numerical tests to illustrate our approach and to show the effectiveness of the method in comparison to existing approaches.

Optimal Control, Model Order Reduction, Proper Orthogonal Decomposition, Optimal Snapshot Location
thanks: We would like to thank Z J. Zhou from Shandong Normal University, China for providing the data and code of the space-time approximation in [7]. First author also acknowledges the support of US Department of Energy (grant number DE-SC0009324). \subjclass

49J20, 65N12, 78M34

## 1 Introduction

Optimization with PDE constraints is nowadays a well-studied topic motivated by its relevance in industrial applications. We are interested in the numerical approximation of such optimization problems in an efficient and reliable way using surrogate models obtained with POD-MOR. The surrogate models are built upon snapshots of the system to provide information about the underlying problem. This stage is usually called the offline stage. For the snapshot POD approach we refer the reader to [28].
Several works focus their attention on the choice of the snapshots, in order to approximate either dynamical systems or optimal control problems by suitable surrogate models. In [19], it is proposed to optimize the choice of the time instances such that the error between POD and the trajectory of the dynamical system is minimized. A recent approach proposes to choose the snapshots by an a-posteriori error estimator in order to equidistribute the state error on the time grid related to the snapshot locations (see [13]). We also mention an adaptive method, proposed in [25], where the aim is to reduce expensive offline costs selecting the snapshots according to an adaptive time-stepping algorithm using time error-control. For further references we refer the interested reader to [25].
In optimal control problems the reduced model is usually built upon a forecast on the control. This approach does not guarantee a proper construction of the surrogate model since we do not know how far away the optimal solution is from the reference control. More sophisticated approaches select snapshots by solving an optimization problem in order to improve the selection of the snapshots according to the desired controlled dynamics. For this purpose optimality system for POD (OS-POD) is introduced in [18]. In OS-POD, the computation of the basis functions is performed by means of the solution of an enlarged optimal control problem which involves the full problem, the reduced equation and the eigenvalue problem for the POD modes.
The reduction of optimal control problems with particular focus on adaptive adjustments of the surrogate models can be found in [1, 4]. We should also mention another adaptive method for feedback control problems by means of the Hamilton-Jacobi-Bellman equation, introduced in [2].
Recently, an a-posteriori error estimator was introduced in [30, 14] for optimal control problems. In these works the error between the unknown optimal and the computed POD suboptimal control is estimated for linear and nonlinear problems, and it is shown that increasing the number of basis functions leads to the desired convergence. OS-POD and a-posteriori error estimation is combined in [32].
All these works have in common that they compute basis functions for optimal control problems. In our paper we address the question of an efficient and suitable selection of snapshot locations by means of an a-posteriori error control approach proposed in [7]. We rewrite the optimality conditions as a second order in time and fourth order in space elliptic equation for the adjoint variable and we generalize this approach to control constraints. In particular, a time adaptive concept is used to build the snapshot grid which should be used to construct the POD surrogate model for the approximate solution of the optimal control problem. Here the novelty for the reduced control problem is twofold: we directly obtain snapshots related to an approximation of the optimal control and, at the same time, we get information about the time grid.

We have proposed a similar approach based on a reformulation of the optimality system with respect to the state variable in [3]. Now, we focus our approach on the adjoint variable and generalize the idea presented in [7] to time dependent control intensities with control shape functions including control constraints. Furthermore, we certify our approach by means of several error bounds for the state, adjoint state and control variable.
The outline of this paper is as follows. In Section 2 we present the optimal control problem together with the optimality conditions. In Section 3 we recall the main results of [7]. Proper Orthogonal Decomposition and its application to optimal control problems is presented in Section 4. The focus of Section 5 lies in investigating our snapshot location strategy. Finally, numerical tests are discussed in Section 6 and conclusions are driven in Section 7.

## 2 Optimal Control Problem

In this section we describe the optimal control problem. The governing equation is given by a linear parabolic PDE:

 yt−Δy=f+Bu in ΩT,y(⋅,0)=y0 in Ω,y=0 on ΣT,⎫⎪⎬⎪⎭ (1)

where is an open bounded domain with smooth boundary, , is the space-time cylinder, , and the state is denoted by . As control space we use , where , and define the control operator as , , where denote specified control actions. Thus is linear and bounded. For the control variable we require

 u∈Uad:={u∈U|ua(t)≤u(t)≤ub(t) in Rm a.e. in [0,T]}⊂L∞(0,T;Rm)

with almost everywhere in . It is well-known (see [20], for example) that for a given initial condition and a forcing term the equation (1) admits a unique solution , where

 W(0,T):={v∈L2(0,T;H10(Ω)),∂v∂t∈L2(0,T;H−1(Ω))}.

If , higher regularity results can be derived according to [5]. We also note that the unconstrained case is related to .

The weak formulation of (1) is given by: find with and

 ∫Ωyt(t)vdx+∫Ω∇y(t)⋅∇vdx=∫Ω(f+Bu)(t)vdx∀v∈H10(Ω). (2)

The cost functional we want to minimize is given by

 J(y,u):=12∥y−yd∥2L2(ΩT)+α2∥u∥2U, (3)

where is the desired state and the regularization parameter is a real positive constant. The optimal control problem then reads

 minu∈Uad^J(u):=J(y(u),u), where y(u) satisfies (???). (4)

Note that is a non-empty, bounded, convex and closed subset of . Hence, it is easy to argue that (4) admits a unique solution with associated state , see e.g. [20].

The first order optimality system of the optimal control problem (4) is given by the state equation (1), together with the adjoint equation

 −pt−Δp=y−yd in ΩT,p(⋅,T)=0 in Ω,p=0 on ΣT,⎫⎪⎬⎪⎭ (5)

and the variational inequality

 ⟨αu+B∗p,v−u⟩U≥0 for all v∈Uad, (6)

where is the dual operator of . In (6) we have identified with and with , where we use that Hilbert spaces are reflexive. The variational inequality (6) is equivalent to the projection formula

 (7)

where denotes the orthogonal projection onto . It follows from the reflexivity of the involved spaces that the action of the adjoint operator is given as

 (B∗v)(t)=(⟨χ1,v⟩H−1,H10,…,⟨χm,v⟩H−1,H10)

and

Since our domain is smooth, the regularities of the optimal state, the optimal control and the associated adjoint state are limited through the regularities of the initial state , the right hand side , the control and the desired state
The numerical approximation of the optimality system (1)-(5)-(6) with a standard Finite Element Method (FEM) in the spatial variable leads to a high-dimensional system of ordinary differential equations:

 M˙yN−AyN=fN+BNu,yN(0)=yN0,−M˙pN−ApN=yN−yNd,pN(T)=0,⟨αu+(B∗)NpN,v−u⟩U≥0.⎫⎪ ⎪⎬⎪ ⎪⎭ (8)

Here are the semi-discrete state and adjoint, respectively, are the time derivatives, denotes the mass matrix and the stiffness matrix. Note that the dimension of each equation in the semi-discrete system (8) is related to the number of element nodes chosen in the FEM approach.

## 3 Space-Time approximation

In this section, we consider the reformulation of the optimality system (1)-(5)-(6) as an elliptic equation of fourth order in space and second order in time for the adjoint variable . This is carried out for the unconstrained control problem in [7] and generalized to control constrained optimal control problems in [21]. Following these works, we include control constraints. Here, we aim to derive an a-posteriori error estimate for the time discretization as suggested in [7], which then turns out to be the basis for our model reduction approach to solve (4).
We define

 H2,10(ΩT):={v∈H2,1(ΩT):v(T)=0 in Ω},

where

 H2,1(ΩT)=L2(0,T;H2(Ω)∩H10(Ω))∩H1(0,T;L2(Ω))

is equipped with the norm

 ∥w∥2H2,1(ΩT):=(∥w∥2L2(0,T;H2(Ω))+∥w∥2H1(0,T;L2(Ω))).

Under the assumptions , for and , the regularity of is ensured, see [5] for the details. Then, the first order optimality conditions (1)-(5)-(6) can be transformed into an initial boundary value problem for in space-time:

 −ptt+Δ2p−BPUad(−1αB∗p)=−(yd)t+Δyd in ΩT,p(⋅,T)=0 in Ω,p=0 on ΣT,Δp=yd on ΣT,(pt+Δp)(0)=yd(0)−y0 in Ω,⎫⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎬⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎭ (9)

where, without loss of generality, we have set . We note that the quantity

is nondifferentiable and nonlinear in and thus (9) becomes a semilinear second order in time and fourth order in space elliptic problem with a monotone nonlinearity. Existence of a unique weak solution for (9) can be proved analogously to [21] and follows from the fact that the optimal control problem (4) in the case of control constraints with closed and convex admits a unique solution.
In order to provide the weak formulation of (9), we define the operator and the linear form as

 A0:H2,10(ΩT)×H2,10(ΩT)→R,L0:H2,10(ΩT)→R,
 L0(v):=∫ΩT⟨−∂yd∂t+Δyd,v⟩H−1(Ω)×H10(Ω)−∫Ω(yd(0)−y0)v(0)+∫ΣTyd∇v⋅^n,

where denotes the outer normal to the boundary . The weak formulation of equation (9) for given reads:

 find p∈H2,10(ΩT) with A0(p,v)=L0(v)∀v∈H2,10(ΩT). (10)

It follows from the monotonicity of the orthogonal projection that (10) admits a unique solution , compare e.g. [10, Th. 1.25]. We put our attention on the semi-discrete approximation of (9) and investigate a-priori and a-posteriori error estimates for the time discrete problem, where the space is kept continuous. Let us consider the time discretization with and . Let . We define the time discrete space

 Vkt:={v∈H2,1(ΩT):v(⋅)|Ij∈P1(Ij)},¯Vkt:=Vkt∩H2,10(ΩT),

where the notation stands for the polynomials of first order on the interval . Then, we consider the semi-discrete problem:

 find pk∈¯Vkt with A0(pk,vk)=L0(vk),∀vk∈¯Vkt. (11)

Using the arguments of e.g. [10, Th. 1.25] one can show that problem (11) admits a unique solution .
We note that with (10) and (11) we have the Galerkin orthogonality

 A0(p,vk)−A0(pk,vk)=0∀vk∈¯Vkt. (12)

Thus, for it holds true

 A0(p,v)−A0(pk,v)=A0(p,v−vk)−A0(pk,v−vk)∀vk∈¯Vkt.

The following Theorem states a temporal residual type a-posteriori error estimate for , which transfers the estimation of [7, Theorem 3.5] to the control constrained optimal control problem (4):

###### Theorem 3.1

Let and denote the solutions to (10) and (11), respectively. Then we obtain

 ∥p−pk∥2H2,1(ΩT)≤C1η2, (13)

where and

Proof. We start the proof showing a consequence of the monotonicity of the projector operator . We find that

and hence

For easier notation, we set .
Let and let denote the standard Lagrange type temporal interpolation of . Using the inequality

 ∥v∥2H2,1(ΩT)≤C(∥∂v∂t∥2L2(ΩT)+∥Δv∥2L2(ΩT))

for and from [7, Lemma 2.5], the monotonicity (14) and the Galerkin orthogonality (12), we can estimate:

 c∥p−pk∥2H2,1(ΩT)≤∥∥∥∂(p−pk)∂t∥∥∥2L2(ΩT)+∥Δ(p−pk)∥2L2(ΩT)≤∥∥∥∂(p−pk)∂t∥∥∥2L2(ΩT)+∥Δ(p−pk)∥2L2(ΩT)+∫ΩT(N(p)−N(pk))(p−pk)=∫ΩT∂(p−pk)∂t∂ep∂t+∫ΩTΔ(p−pk)Δep+∫ΩT(N(p)−N(pk))ep=∫ΩT∂(p−pk)∂t∂(ep−πkep)∂t+∫ΩTΔ(p−pk)Δ(ep−πkep)+∫ΩT(N(p)−N(pk))(ep−πkep)=∫ΩT(−∂yd∂t+Δyd)(ep−πkep)+∫ΣTyd∇(ep−πkep)⋅^n−∫ΩT∂pk∂t∂(ep−πkep)∂t−∫ΩTΔpkΔ(ep−πkep)−∫ΩTN(pk)(ep−πkep)

Integration by parts on each time interval and Green’s formula lead to

 c∥p−pk∥2H2,1(ΩT)≤∑j∫Ij∫Ω(−∂yd∂t+Δyd+∂2pk∂t2−Δ2pk−N(pk))(ep−πkep)+∑j∫Ij∫∂Ω(yd−Δpk)∇(ep−πkep)⋅^n.

Utilizing error estimates of the Lagrange interpolation , the trace inequality and Young’s inequality, we find

Theorem 3.1 provides a tool to refine the time grid by means of the residual of the system (9). Due to (7), the time instances of this grid may be regarded as ideal snapshot locations for POD-MOR applied to problem (4).

## 4 POD for optimal control problems

In this section, we recall the POD method which we use in order to replace the original problem (4) by a surrogate model. The main interest when applying the POD method is to reduce computation times and storage capacity while retaining a satisfying approximation quality. This is possible due to the key fact that POD basis functions (unlike typical finite element ansatz functions) contain information about the underlying model, since the POD modes are derived from snapshots of a solution data set. For this reason it is important to use rich snapshot ensembles reflecting the dynamics of the modeled system. Usually, we are able to improve the accuracy of a POD suboptimal solution by enlarging the number of utilized POD basis functions or enriching the snapshot ensemble, for instance. The snapshot form of POD proposed by Sirovich in [28] works in the continuous version as follows.
Let us suppose that the continuous solution of (1) and of (5) belongs to a real separable Hilbert space , where or , equipped with its inner product and associated norm . We set , where , , . Note that the initial condition is included in . The aim is to determine a POD basis of rank with , by solving the following constrained minimization problem:

 (15)

where denotes the Kronecker symbol, i.e. for and .
It is well-known (see [8]) that a solution to problem (15) is given by the first eigenvectors corresponding to the largest eigenvalues of the self-adjoint linear operator i.e. , , where is defined as follows:

 Rψ=3∑k=1∫T0⟨zk(t),ψ⟩zk(t)dtfor ψ∈V.

Moreover, we can quantify the POD approximation error by the neglected eigenvalues (more details in [8]) as follows:

 3∑k=1∫T0∥∥ ∥∥zk(t)−ℓ∑i=1⟨zk(t),ψi⟩ψi∥∥ ∥∥2dt=d∑i=ℓ+1λi. (16)

Let us assume that we have computed POD basis functions . Then, we define the POD Galerkin ansatz of order for the state as:

 yℓ(t)=ℓ∑i=1wi(t)ψi, (17)

where and the unknown coefficients are denoted by . If we plug this ansatz into the weak formulation of the state equation (2) and use as the test space, we get the following reduced order model for (2) of low dimension:

 ∫Ωyℓt(t)ψdx+∫Ω∇yℓ(t)⋅∇ψdx=∫Ω(f+Bu)(t)ψdx∀ψ∈Vℓ and t∈(0,T]% a.e.,∫Ωyℓ(0)ψdx=∫Ωy0ψdx (18)

Choosing for and utilizing (17), we infer from (18) that the coefficients satisfy

 Mℓ˙w(t)+Aℓw(t)=Fℓ(t)a.e. in(0,T],Mℓw(0)=yℓ0,

where , , and . Note that is the identity matrix, if we choose as inner product . The reduced order model surrogate (ROM) for the optimal control problem is given by

 minu∈Uad^Jℓ(u) s.t. yℓ(u) satisfies (???), (19)

where is the reduced cost functional, i.e. . We recall that the discretization of the optimal solution to (19) is determined by the relation between the adjoint state and control and refer to [9] for more details about the variational discretization concept.
In order to solve the reduced optimal control problem (19), we consider the well-known first order optimality condition given by the variational inequality

which is sufficient since the underlying problem is convex.
The first order optimality conditions of (19) also deliver that the adjoint POD scheme for the approximation of is given by: find with satisfying

 −∫Ωpℓt(t)ψdx+∫Ω∇pℓ(t)⋅∇ψdx=∫Ω(yℓ−yd)(t)ψdx∀ψ∈Vℓ and t∈(0,T) a.e. (20)

## 5 The snapshot location strategy

In Section 4, the POD method in the continuous framework is recalled, where the POD basis functions are computed in such a way that the error between the trajectories of (1) and of (5) and its POD Galerkin approximation is minimized in (15). In practice, we do not have the whole solution trajectories at hand. But we have snapshots available, which are the solutions to (1) and the solutions to (5) at times . This motivates to replace the time integration in (15) by an appropriate quadrature rule based on , i.e. for with quadrature weights We later choose the weights for the trapezoidal rule, compare (25). In the present work, we neglect the error introduced by quadrature weights.

The minimization problem related to (15) then becomes

 min3∑k=1n∑j=0βj∥∥ ∥∥zk(tj)−ℓ∑i=1⟨zk(tj),ψi⟩ψi∥∥ ∥∥2, s.t. ⟨ψj,ψi⟩=δij for 1≤i,j≤ℓ

and obviously constitutes a strong dependence of the POD basis functions on the chosen snapshot locations . The related snapshots shall have the property to capture the main features of the dynamics of the truth solution as good as possible. Here it is important to select suitable time instances at which characteristic dynamical properties of the optimal state are located. A natural question is:

How to pick time instances that represent good locations for snapshots in POD-MOR for (19)?

Moreover, we face some difficulties since the reduction of optimal control problems is usually initialized with snapshots computed from a given input control . This problem is usually addressed in the offline stage for POD, which is the phase needed for snapshot generation, POD basis computation and building the reduced order model. Mostly, we do not have any information about the optimal control, such that in POD-MOR the input control is often chosen as . This circumstance raises the question about the quality of the POD basis and the quality of the POD suboptimal solution. The a-posteriori error estimator (13) in Section 3 motivates a suitable location of time instances for the POD adjoint state and at the same time we get an approximation of the optimal control which can be used as an input control in order to generate the snapshots.

The use, in the offline-stage, of a time adaptive mesh refinement process allows to overcome the choice of an input control and the choice of the snapshot locations by solving equation (9). Then, we take advantage of the a-posteriori error estimation presented in Theorem 3.1. Equation (9) provides the optimal adjoint state associated with (4), which does not require the explicit knowledge of a control input . We note that the ellipticity of equation (9) play a crucial role in this approach. The same approach would not work, if one solves the optimality conditions directly. The numerical approximation of provides important information about the control input. In fact, thanks to the variational inequality (6) we are first able to build an approximate control and finally compute the associated state . In this way our snapshot set will contain information about the state corresponding to an approximation of the optimal control. Thanks to this numerical approximation of the optimal control problem we can build the snapshot matrix and compute the POD basis functions where the number is chosen such that .
The approximation of equation (9) is very useful in model order reduction since we overcome the choice of the initial input control to generate the snapshot set. Moreover, we also gain information about a temporal grid, which allows us to better resolve with respect to time. The a-posteriori error estimation (13) guarantees that the finite element approximation of (9) in the time variable is below a certain tolerance. Therefore, the reduced optimal control problem (19) is set up and solved on the resulting adaptive time grid. Now the question is:

How good is the quality of the computed time grid in terms of the error between

the optimal solution and the POD surrogate solution?

### 5.1 Error Analysis for the adjoint variable

Let us motivate our approach by analyzing the error between the optimal adjoint solution of (5) associated with the optimal control for (4), i.e. and the POD reduced approximation , which is the time discrete solution to the POD-ROM for (5) associated with the time discrete optimal control for (19), i.e. in (5). We denote by the space and by the space . By the triangular inequality we get the following estimates for the -norm:

 ∥p(u)−pℓk(uℓk)∥ ≤∥p(u)−pk(uk)∥(???.1)+∥pk(uk)−Pℓpk(uk)∥(???.2)+∥Pℓpk(uk)−Pℓpk(uℓk)∥(???.3)+∥Pℓpk(uℓk)−pℓk(uℓk)∥(???.4) (21)

where is the time discrete adjoint solution of (11) associated with the control and is the time discrete adjoint solution to (5) with respect to the suboptimal control , i.e. in (5). By we denote the orthogonal POD projection operator as follows:

 Pℓy:=ℓ∑i=1⟨y,ψi⟩Vψi for y∈V.

The term (21.1) can be estimated by (13) and concerns the snapshot generation. Thus, we can decide on a certain tolerance in order to have a prescribed error. The second term (21.2) in (21) is the POD projection error and can be estimated by the sum of the neglected eigenvalues. Then, we note that the third term (21.3) can be estimated as follows:

 ∥Pℓpk(uk)−Pℓpk(uℓk)∥≤∥Pℓ∥∥pk(uk)−pk(uℓk)∥≤C2∥uk−uℓk∥U, (22)

where and is the constant referring to the Lipschitz continuity of independent of as in [22].

In order to control the quantity we make use of the a-posteriori error estimation of [30], which provides an upper bound for the error between the (unknown) optimal control and any arbitrary control (here and ) by

 ∥u−up∥U≤1α∥ζp∥U,

where is the regularization parameter in the cost functional and is chosen such that

is satisfied. Finally, last term (21.4) can be estimated according to [12] and involves the sum of the eigenvalues not considered, the first derivative of the time discrete adjoint variable and the difference between the state and the POD state:

 ∥Pℓpk(uℓk)−pℓk(uℓk)∥2≤C3(d∑i=ℓ+1λki+∥˙pk(uℓk)−Pℓ˙pk(uℓk)∥2L2(0,T,V′)+∥yk(uℓk)−yℓk(uℓk)∥2L2(0,T,H)), (23)

for a constant . We note that the sum of the neglected eigenvalues is sufficiently small provided that is large enough. Furthermore, error estimation (23) depends on the time derivative . To avoid this dependence, we include time derivative information concerning the adjoint variable into the snapshot set, see [17].

To summarize, the error estimation reads:

 ∥p(u)−pℓk(uℓk)∥L2(0,T,V)≤√C1η+C2α(∥ζk∥U+∥ζℓk∥U)+ ⎷C3(d∑i=ℓ+1λki+∥yk−yℓk∥2L2(0,T,H)). (24)

Finally, we note that estimation (23) involves the state variable which is estimated in the following Section 5.2.

### 5.2 Error Analysis for the state variable

In this section we address the problem of the certification of the quality for POD approximation for the state variable. It may happen that the time grid selected for the adjoint will not be accurate enough for the state variable . Therefore a further refinement of the time grid might be useful in order to reduce the error between the POD state and the true state below a given threshold. This is not guarenteed if we use the time grid, which results from the use of the estimate (13). Here, we consider the error between the full solution corresponding to the suboptimal control and the time discrete POD solution , where we assume to have the same temporal grid for snapshots and the solution of our POD reduced order problem. In this situation, the following estimate is proved in [17]:

 n∑j=0βj∥y(tj;uℓk)−yℓj(uℓk)∥2H ≤n∑j=1(Δt2jCy((1+c