A numerical procedure and unified formulation for the adjoint approach in hyperbolic PDE-constrained optimal control problems

A numerical procedure and unified formulation for the adjoint approach in hyperbolic PDE-constrained optimal control problems

Gino I. Montecinos
Centro de Modelamiento Matemático (CMM)
Beauchef 851
Torre Norte    Piso 7    Santiago    Chile
Juan C. López-Ríos
Escuela de Ciencias Matemáticas y Tecnología Informática
YACHAY TECH
San Miguel de Urcuquí
Jaime H. Ortega
Centro de Modelamiento Matemático (CMM) and Departamento de Ingeniería Matemática
Beauchef 851
Torre Norte    Piso 5    Santiago    Chile
Rodrigo Lecaros
Departamento de Matemática
Casilla 110-V
Valparaíso    Chile
Abstract

The present paper aims at providing a numerical strategy to deal with PDE-constrained optimization problems solved with the adjoint method. It is done through out a unified formulation of the constraint PDE and the adjoint model. The resulting model is a non-conservative hyperbolic system and thus a finite volume scheme is proposed to solve it. In this form, the scheme sets in a single frame both constraint PDE and adjoint model. The forward and backward evolutions are controlled by a single parameter and a stable time step is obtained only once at each optimization iteration. The methodology requires the complete eigenstructure of the system as well as the gradient of the cost functional. Numerical tests evidence the applicability of the present technique. The adjoint method, PDE-constrained optimal control, hyperbolic conservation laws.

\gridframe

N \jnodri017

1 Introduction

The adjoint method consists of the use of a linearized equation derived from the constraint PDE, it provides a new set of adjoint states, and the gradient of the cost functional can be completely determined in terms of the state variables, design parameters and adjoint states. The equation for the state variables, as well as for the adjoint parameter, is solved only once. It makes the adjoint method to be more efficient than the sensitivity method, see Hinze:2008a () for a detailed discussion about the performance of both approaches. In this work, we are going to consider the adjoint approach for solving optimization problems related to hyperbolic type PDE’s. Additionally, for hyperbolic equations, the design parameters, which are parameters of models, can be incorporated into the governing equation as a new variable, and then the problem of finding parameters can be cast into a problem of finding the initial condition.

The constraint PDE normally is written in a conservation form and thus suitable schemes are those corresponding to the family of conservative methods. On the other hand, the adjoint models are linearization of the constraint PDE, but this is a quasilinear type model for adjoint variables and thus they are also of hyperbolic type, but these models are non-conservative in a mathematical sense. It is a very important issue from a numerical point of view, the right description of the weave propagation has to be dealt in the frame of the path conservative methods for partial differential equations involving non-conservative products, see Pares:2006a (); CastroM:2006a () for a detailed study of non-conservative partial differential equations with a particular emphasis on hyperbolic types.

As the constraint PDE and the adjoint model are both hyperbolic, they can be written as a unified model incorporating both the constraint PDE and the adjoint model in a single system of balance laws. In this paper, we back up on the unified formulation to set a numerical methodology able to simultaneously deal with the conservative form of constraint PDE and the adjoint model. The aim is to provide a general frame to deal with optimization problems through the adjoint method. Moreover, the present approach works even if the constraint PDE is non-conservative. In the literature, there exist several works dealing with minimization considering hyperbolic conservation laws, see James:1999a (); James:2008a (); Castro:2008a-1 (); Lecaros:2014a (); Ulbrich:2002a () to mention but a few. However, to best of our knowledge, the present approach has not been reported previously and this incorporates in a natural way the non-conservative structure of the adjoint model.

On the other hand, regarding the solution of hyperbolic conservation laws, one of the successful methods are those based on the finite volume approach, which makes use of numerical fluxes at the interfaces of computational cells, providing a compact one-step evolutionary formula. The accuracy of the approximation via this approach depends on the degree of resolution of numerical fluxes. One approach for achieving high resolution is by mean of the use of the so called Generalized Riemann Problem (GRP), see Castro:2008a (); Montecinos:2012a () for further details and comparison among existing GRP solvers. These are building blocks of a class of high-order numerical methods known as ADER schemes Toro:2001c (); Toro:2002a (). The numerical scheme presented in this work belong to the class of ADER methods. The scheme uses a modified version of the Osher-Solomon Riemann solver Osher:1982a () presented by Dumber and Toro, Dumbser:2010b (), which is able to account for the non-conservative terms, using the so-called path conservative approach, it is a second order method for hyperbolic problems in conventional finite volume setting and it inherits the properties of stability, well-balance, consistency and robustness. However, the implementation has to be adapted for accounting the correct wave propagation for the evolution of both, the PDE constraint and the adjoint model.

In order to assess the performance of the present method, we carry out comparisons between the present scheme and a conventional method based on Lellouche:1994a () for solving PDE-constrained optimization problems. We adapt the scheme in Lellouche:1994a () implementing an upwind type scheme for the solution of the constraint and for the adjoint model, the same finite difference approach is kept but it is carried out in an explicit time evolution to be consistent with the present method which is globally explicit in time.

The rest of the paper is organized as follows. In Section 2 the unified formulation of the primal dual strategy for a general system of conservation laws is presented, after the formal derivation of the adjoint system. We also state a general theorem providing the existence of classical solutions for the unified system and study a related optimal control problem. In Section 3, a numerical scheme for solving conservative and non-conservative hyperbolic laws is presented and, as a consequence, a scheme for solving the unified primal-adjoint system. In Section 4 some specific numerical examples are presented using the numerical scheme from Section 3. Finally in Section 5 some conclusions are drawn.

2 PDE-constrained optimization problem

The purpose of this section is to provide a unified formulation of the primal dual strategy, raised from the study of certain optimal control problems concerning the finding of a parameter on a general conservation law.

First, we define the hyperbolic PDE to be considered and we prove that the finding of a parameter on this system is reduced to find an initial condition. Next we compute the adjoint system and set the unified formulation. Then we provide some conditions to establish the existence and uniqueness of classical solutions of the unified system. Finally we set the control problem and prove the existence of an optimal solution.

We consider the system

 ∂tU+∂xR(U,b)=L(U,b)+~B(U,b)∂xb,t∈[0,T],U(x,0)=H(x),} (2.1)

where is a finite time, is a prescribed initial condition, is a vector of states, a flux function, is a source function, is a vector accounting for model parameters and the expression represents source terms which may also include the influence of parameter derivatives. Here, parameters only depend on space. Since one main goal is to provide a strategy for finding as the minimum of a given functional , let us write as

 (2.2)

where is some scalar field, representing some measurement operator, similarly is some available measurement, is a non-negative constant and is an exponent to be determined.

In an optimization setting we identify the state variables , design parameters , objectives or cost functional given by (2.2) and constraints consisting of (2.1) that candidate state and design parameters are required to satisfy.

Notice that, parameters only depend on space, then . So, we can include as a new variable of system (2.1), as follows

 ∂tU+∂xR(U,b)=L(U,b)+~B(U,b)∂xb,∂tb=0.⎫⎪⎬⎪⎭ (2.3)

So, written in the vector form, (2.3) becomes

 \par∂t~U+∂x\par~R(~U)+~M∂x~U=\par~L(~U)\par, (2.4)

where

 \par~U=[Ub],~R(~U)=[R(U,b)0],\par~L(~U)=[L(U,b)0],~M(~U)=[0−~B(U,b)00].\par (2.5)

In this new system, is completely fixed once an initial condition is provided, so the task of finding parameters in a model system as (2.1) is equivalent to find the initial condition of the enlarged model system (2.3), henceforth referred to as constraint PDE. From now on, instead of system (2.1) we are going to consider system

 ∂t~U+∂x~R(~U)+~M∂x~U=~L(~U),t∈[0,T],~U(x,0)=~H(x),} (2.6)

where . The following result provides the conditions on system (2.1) which turns system (2.6) to be hyperbolic.

Proposition 2.1

Let (2.1) be a hyperbolic system in the variable , with the corresponding eigenvalues of , where . Then (2.6) is also a hyperbolic system in the variables . The set of eigenvalues is given by

 {λ1,...,λm,0,...,0ntimes}

and the corresponding eigenvectors are given by

 {[v1,0n]T,...,[vm,0n]T,[~V1,e1]T,...,[~Vn,en]T},

with where is the column vector of the matrix Here, is an eigenvector associated with , is the zero vector in , is the h canonical vector of .

Proof.

The Jacobian matrix of system (2.6) is given by

 JR:=∂~R/∂~U+~M=[∂R/∂U(∂R/∂b−~B)00]. (2.7)

Therefore

 det(JR−λIn+m)=−λndet(∂R/∂U−Im)=−λnp(λ), (2.8)

where and are the identity matrix in and , respectively. Here, is the characteristic polynomial of which has roots, because (2.1) is hyperbolic. This proves

 {λ1,...,λm,0,...,0ntimes}

are the eigenvalues of .

To complete the proof, we must obtain a set of linearly independent eigenvectors. That means, for each we look for such that

 JRW=μW. (2.9)

Here, we obtain eigenvectors of the form , so in a block-wise fashion we look for vectors and such that

 (∂R/∂U)~V+(∂R/∂b−~B)~W=μ~V,0=μ~W.\par} (2.10)

Notice that, setting and , then (2.10) is satisfied for . It provides independent vectors . On the other hand, if we obtain

 (∂R/∂U)~V+(∂R/∂b−~B)~W=0.\par (2.11)

If we set , then

 (∂R/∂U)~V=−(∂R/∂b−~B)j, (2.12)

where is the column vector of the matrix Since is invertible, there exists such that

 ~Vj=−(∂R/∂U)−1(∂R/∂b−~B)j.

Then the proof holds. ∎

Remark 2.1

Notice that the previous result ensure that extended system (2.6) is hyperbolic, whenever system (2.1) is. However, it requires (2.1) to have an invertible Jacobian matrix. It is not guaranteed in the general case. However, the cases of interest in this study do so. Otherwise, the verification has to be done case by case.

As a consequence, the problem of finding a parameter vector for system becomes an inverse problem where the aim is to find an initial condition for which is a conventional variable in a system of hyperbolic balance laws.

2.1 The adjoint method for solving PDE-constrained optimization problems

In this section we derive a system of first order optimality conditions for system (2.6), subject to the functional (2.2) with . Even though these calculations are formal, we will provide the corresponding analysis in sections 2.3 and 2.4. At this point our main interest is to present the unified primal-dual formulation and the main issues related with. Since it is well known for this strategy Hinze:2008a (), the so-called adjoint state is, in some sense, a linearization of the constraint PDE. The dependence of the sought parameter in (2.6) is implicit but, as was mentioned above, can be seen as the tracking of an initial condition.

As we did in Proposition (2.1), for , let us write (2.6) as

 ∂t~Ui+∑m+nj=1JRi,j∂x~Uj=~Li,t∈[0,T],~Ui(x,0)=~Hi(x).} (2.13)

Then

 ∂tδ~Ui+∑m+nj=1JRi,j∂xδ~Uj\par+∑m+nj=1∑m+nk=1\par∂JRi,j∂x~Ukδ~Uk∂x~Uj\par=∑m+nk=1∂~Lk∂~Ukδ~Uk.\par (2.14)

If we denotes the inner product in by and multiply last equation by functions , such that , , after integrating in space and time, we obtain

 ∫T0⟨∂tδ~Ui,Pi⟩\par+m+n∑j=1∫T0⟨∂xδ~Uj,JRi,jPi⟩\par\par+∑m+nj=1∑m+nk=1\par∫T0⟨δ~Uk,∂JRi,j∂~Uk∂x~UjPi⟩\par=∑m+nk=1∫T0⟨δ~Uk,∂~Lk∂~UkPi⟩.\par (2.15)

Summing on indices , yields

 ∫T0m+n∑i=1⟨∂tδ~Ui,Pi⟩\par−m+n∑j=1∫T0⟨δ~Uj,m+n∑i=1∂x(JRi,jPi)⟩\par\par+∑m+nj=1∑m+nk=1\par∫T0⟨δ~Uk,m+n∑i=1∂JRi,j∂~Uk∂x~UjPi⟩\par=∑m+nk=1∫T0⟨δ~Uk,m+n∑i=1∂~Lk∂~UkPi⟩.\par (2.16)

Then integrating by parts, we obtain

 \parm+n∑k=1⟨δ~Uk,Pk⟩|T0−\par∫T0m+n∑k=1⟨δ~Uk,∂tPk⟩\par−m+n∑k=1∫T0⟨δ~Uk,m∑i=1JRi,k∂xPi⟩\par\par−∑m+nk=1∫T0⟨δ~Uk,m∑i=1Dk,j∂x~Uj⟩\par\par=m+n∑k=1∫T0⟨δ~Uk,m+n∑i=1∂~Lk∂~UkPi⟩,\par (2.17)

where

 \parDk,j=∑m+ni=1(∂JRi,k∂~Uj−∂JRi,j∂~Uk)Pi.\par (2.18)

On the other hand, or

 δJ(b)=m+n∑k=1∫T0⟨δ~Uk,(ψ−¯ψ)∂ψ∂~Uk⟩.

Moreover

 δJ=∑nj=1⟨∇J,δb⟩, (2.19)

so, by identifying terms we obtain

 ∇Jj(x)=Pm+j(x,0), (2.20)

for all . Imposing for and for we obtain the adjoint problem

 ∂tPk+∑m+ni=1JRi,k∂xPi+∑m+ni=1Dk,j∂x~Uj=∑m+ni=1∂~Li∂~UkPi+(ψ−¯ψ)∂ψ∂~Uk. (2.21)

In a matrix form

 ∂tP+JRT∂xP=~S(~U,P),P(x,T)=0, (2.22)

where and . The source term in this case has the form

 ~S(~U,P)=∂~L∂~UTP+12∇~U((ψ−¯ψ)2)−D∂x~U.

In the sequel, we shall refer to as the adjoint state. Notice that system (2.22) is also hyperbolic, with the same eigenvalues that system (2.13). It is because the Jacobian matrix of the system (2.22) is Moreover, notice that system (2.22), henceforth referred to as the adjoint model, evolves back in time from to .

2.2 The unified formulation

In this section, we construct a model system which accounts for both the PDE constraint and the adjoint model, aimed to construct a numerical method, able to solve simultaneously both systems including the treatment of non-conservative products. So, let us consider the following unified primal-dual system

 ∂tQ+∂xF(Q)+B(Q)∂xQ=S(Q), (2.23)

where

 (2.24)

Notice that the state contains the state variable , the model parameter and the adjoint state . Moreover, it is expected that not only system (2.23) reproduces the evolution of the constraint PDE for marching forward in time, but also it approximates the evolution backward in time of the adjoint model. The aim is to design a numerical solver for the model (2.23) in order to solve constraint PDE and adjoint model with the same methodology and able to deal with non-conservative products.

We are going to prove that in most of the cases the unified system is hyperbolic if constraint PDE as well as the adjoint model, are hyperbolic. In such a case a finite volume scheme is a suitable method for solving this class of problems. Indeed, if (2.1) admits a conservative formulation, there exists such that

 JR(~U)=∂~K(~U)∂~U. (2.25)

Then and (2.23) is hyperbolic with the same eigenvalues that (2.4). This is stated in the following

Proposition 2.2

If there exists a differentiable function such that then and (2.23) is hyperbolic with eigenvalues

 {λ1,....,λm+n},

each one with multiplicity and eigenvectors

 {[v1,0m+n]T,[0m+n,vt1]T,....,[vm+n,0m+n]T,[0m+n,vtm+n]T},

where is the eigenvector of associated with and is the eigenvector of associated with . Here is the zero vector of

Proof.

Let us write system (2.23) in the quasilinear form

 ∂tQ+A(Q)∂xQ=S(Q), (2.26)

where

 \parA(Q)=(JR0DJRT). (2.27)

Notice that where is the characteristic polynomial of , from which we obtain that eigenvalues of this system are the same of , each one with at lest an algebraic multiplicity . It is because it may exists indices and such that and .

On the other hand, from the existence of , we note that

 \par∂JRi,k∂~Uj=∂2~K(~U)∂~Uk∂~Uj=∂2~K(~U)∂~Uj∂~Uk=∂JRi,j∂~Uk, (2.28)

so from (2.18) we have Thus the Jacobian matrix of the system now takes the form

 A(Q)=[JR00JRT]. (2.29)

In order to find the set of eigenvectors, let us denote by the eigenvector of associated with and by the eigenvector of associated with . Then is straightforward to verify that

 (2.30)

and

 (2.31)

In this form, a set of eigenvectors is obtained. The verification of the linear independence comes from the fact that as well as are two sets of linearly independent vectors. ∎

Remark 2.2

The previous proposition shows that unified systems coming from conservative hyperbolic equations are always hyperbolic. This does not mean that non-conservative systems are not hyperbolic. So in the case of systems where is not the null matrix, the analysis of hyperbolicity has to be done case by case.

2.3 Existence of solutions, the symetrizable case

In this section we discuss the setting where a well-posedness theorem for system (2.23) can be formulated. Recall from Proposition 2.2, system (2.23) can be written in the quasilinear form

 ∂tQ+A(Q)∂xQ=S(Q), (2.32)

with

 (2.33)

In taylor2010partial (), has been made the hypothesis of strict hyperbolicity: that the Jacobian matrix has real and distinct eigenvalues. Then, under this assumption, (2.32) is called a symetrizable hyperbolic system provided there exists positive-definite, such that is symmetric. See Proposition 2.2, Chapter 16 in taylor2010partial () for further details.

The assumption of the symmetry of is natural in this context of conservation laws. In lax1954weak (), Lax introduced a general notion of symmetrizer in the context of pseudodifferential operators to prove that any strictly hyperbolic system is symmetrizable. Also, as was mentioned in serre2015relative () under the context of systems of conservation laws having an entropy solution, Godunov Godunov:1961a () and Friedrichs and Lax Friedrichs:1971a () observed that systems of conservation laws admitting a strongly convex entropy can be symmetrized. It is then under this context of strictly hyperbolic matrices that we state the following theorem for the system (2.32). This theorem is the key point to set a descent method to solve optimal control problems related to conservation laws. Even though its proof is classical, see taylor2010partial (), we outline some key steps for the sake of completeness. Let be an interval with .

Theorem 2.3

Let , . If , in and , , in , then the initial value problem (2.32) has a unique solution in .

Proof.

Let a Friedrichs mollifier. The idea is to obtain a solution to (2.32) as the limit of unique solutions to the systems of ODEs

 {A0(t,x,JϵQϵ)∂tQϵ+JϵA(t,x,JϵQϵ)∂xJϵQϵ=JϵS(t,x,JϵQϵ),Qϵ(0)=f, (2.34)

for close to . We denote .

Let us consider the -inner product

 (w,A0ϵw),A0ϵ=A0(t,x,JϵQ),

which, by the positivity of , defines an equivalent -norm. Therefore by the symmetry of and (2.34)

 ddt(DαQϵ,A0ϵ(t)DαQϵ) =−2(Dα(JϵAϵ∂xJϵQϵ),DαQϵ)+2(DαJϵS,DαQϵ) =−2([Dα,A0ϵ]∂tQϵ,DαQϵ)+(DαQϵ,A′0ϵ(t)DαQϵ).

Then, using Moser–type moser1966rapidly () estimates and Sobolev embeddings, for , one has

 ddt(DαQϵ,A0ϵ(t)DαQϵ)≤Ck(∥Qϵ(t)∥C1)(1+∥JϵQϵ(t)∥2Hk). (2.35)

Since , by (2.35), satisfies

Thus by Gronwall’s inequality, exists for in an interval independent of and

 ∥Qϵ(t)∥Hk≤K(t)<∞. (2.36)

Moreover, by the time-reversibility, exists and is bounded on .

By (2.36) and (2.34), is bounded and therefore will have a weak limit point . Moreover, by Ascoli’s Theorem there is a sequence, still denoted , such that in and since

 Qϵ→Qin C(I,C1(Rd)).

Consequently, (2.32) follows in the limit from (2.34). Uniqueness is treated in a similar way to the estimates above. This conclude the proof of the Theorem. ∎

Corollary 2.1

Let , . If is the solution of (2.32) with initial condition then there exists an open neighborhood of in such that system (2.32) with initial condition and right hand side have a solution . Moreover the mapping , defined by , is of class . Finally, if , for some and some , then is the unique solution of the problem

 ∂tV+JR(Qβ)∂xV=gV,V(0)=b.
Proof.

Proceeding as in the proof of Theorem 2.3, we can consider

 F:C(Hk(Rd))×Hk(Rd)→C(Hk(Rd))×Hk(Rd)

the mapping given by

 F(Q,b)=(∂tQ+A(Q)∂xQ−S(Q),Q(0)−b).

Then is of class and

 ∂F∂Q(Q,b)⋅V=(∂tV+JR(Qβ)∂xV−gV,V(0)).

Applying Theorem 2.3 we have that is an isomorphism from onto . Now, if is a solution of (2.32), then . Therefore, by the implicit function theorem, there exists an open neighborhood of in and such that for every . Moreover, is also of class and

 ∂F∂Q(Qβ,β)∘DG(β)⋅h+∂F∂β(Qβ,β)⋅h=(0,0)∀h∈Hk(Rd).

Then, if we set , we get

 ∂tV+JR(Qβ)∂xV=gV,V(0)=b.

Note that we have established the results above for and . This is to keep the generality of this results. In this work will be enough to consider and as we did in the formulation of system (2.1).

2.4 The optimal control problem

One of our main motivations applying an optimal control strategy to general conservation laws is the possibility of study numerical inverse problems related to this type of systems. Example of them are the bottom detection in water-waves type system, see Zuazua:2015a (), or the tracking of initial conditions in the presence of shocks Castro:2008a-1 (); Lecaros:2014a (), to mention but a few.

We are considering the functional

with , , a given function and . We want to find such that is small. The nature of is not specified and it depends on the physical problem under consideration, as well as the itself election of . The admissible control set is a nonempty convex closed subset of .

Then the optimal control problem is formulated in the following way

Now we study the existence of a solution of (2.4).

Theorem 2.4

Let . If there exists a feasible pair satisfying (2.32), then there exists at least one optimal solution of (2.4).

Proof.

Since there is one feasible pair for the problem and is bounded below by zero we may take a minimizing sequence . Then , which implies that is a bounded sequence in . Thus, we can take a subsequence, denoted in the same way, such that weakly in . Since is closed and convex, . On the other hand, following the proof of Theorem 2.3, we have that is bounded in . Therefore we can assume, by taking a subsequence if necessary, that weakly in . Using the continuity of the inclusion we can pass to the limit in system (2.32) satisfied by and to conclude that also satisfies (2.32). Namely, is a feasible pair for problem (2.4). Taking into consideration that is weakly lower semi-continuous, the result follows. ∎

Let us remark that the assumption of Theorem 2.4 can be checked by taking with . This uniquely defines from the partial differential equation (2.32) as . If then the assumption is satisfied. This is the case if .

Finally we state the following proposition giving the optimality conditions satisfied by the solutions of (2.4). Its proof is based on Corollary 2.1 and the results obtained in Section 2.1.

Proposition 2.5

Let and assume that is a solution of (2.4). Then there exist a unique element such that the following system is satisfied

 ∂tQ0+JR∂xQ0=L(Q0),Q0(x,T)=b0,
 ∂tP0+JTR∂xP0=∂L∂Q0TP0+12∇Q0((ψ−¯ψ)2)−D∂xQ0,P0(x,T)=0,
 P0|t=0.

3 The numerical scheme

In this section, we present the scheme for solving simultaneously conservative as well as non-conservative hyperbolic balance laws of the form (2.23). Cell averages are evolved by following the one-step finite volume formula

 (3.1)

where controls the evolution of the system. This is because in the context of the adjoint method we need the constraint PDE evolves forward in time up to and then the adjoint model evolves backward in time from to . So, makes the forward in time and makes the backward in time. This formula is a conventional one, available in the literature for solving non-conservative schemes, see for instance Dumbser:2014a (). However, we are going to modify the usual implementation of these type of schemes in order to account for the correct wave propagation. By the sake of completeness, we provide a detailed description of the scheme. We have

 (B∂Q)i=1ΔtΔx∫tn+1t∫xi+12xi−12B(Qi(ξ,τ))∂xQi(ξ,τ)dξdτ,\par\parSi=1ΔtΔx∫tn+1t∫xi+12xi−12S(Qi(ξ,τ))dξdτ (3.2)

and