An FEdABCD algorithm for elliptic optimal control problems with constraints on the gradient of the state and control.

† This study was funded by the National Natural Science Foundation of China (Nos. 11571061, 11401075, 11701065, 11501079), the China Postdoctoral Science Foundation (No. 2018M632018) and the Fundamental Research Funds for the Central Universities (No. DUT16LK36).
Abstract
In this paper, elliptic optimal control problems with an integral constraint on the gradient of the state and box constraints on the control are considered. The optimality conditions of the problem are proved. To solve the problem numerically, we use the "First discretize, then optimize" approach; specifically, we discretize both the state and the control by piecewise linear functions. To solve the discretized problem efficiently, we first transform it into a multiblock unconstrained convex optimization problem via its dual, and then extend the inexact majorized accelerated block coordinate descent (imABCD) algorithm to solve it. The entire algorithmic framework is called the finite element duality-based inexact majorized accelerated block coordinate descent (FEdABCD) algorithm. Thanks to the inexactness of the FEdABCD algorithm, each subproblem is allowed to be solved inexactly. For the smooth subproblem, we use the generalized minimal residual (GMRES) method with a preconditioner. Of the nonsmooth subproblems, one has a closed-form solution obtained by introducing an appropriate proximal term, and the other is solved by the semismooth Newton (SSN) method. Based on these efficient strategies, we prove that the proposed FEdABCD algorithm enjoys an $O(1/k^2)$ iteration complexity. Numerical experiments are reported, and the results show the efficiency of the FEdABCD algorithm.
Keywords: optimal control, integral state constraint, optimality conditions, FEdABCD
MSC: 49J20, 49N05, 68W01
1 Introduction
Optimal control problems with constraints on the gradient of the state have a wide range of important applications, for instance in cooling processes or in structural optimization, where high stresses have to be avoided. In cooling processes, constraints on the gradient of the state play an important role in practical applications where the solidification of melts forms a critical step. In order to accelerate production, it is highly desirable to speed up the cooling process while avoiding damage to the products caused by large material stresses. Cooling is frequently described by systems of partial differential equations involving the temperature as a system variable, so that large (von Mises) stresses can be kept small in the optimization by imposing bounds on the gradient of the temperature (see [Wollner2013A, Nther2011Elliptic]).
Optimal control problems with constraints on the gradient of the state have attracted much attention. For problems with pointwise constraints on the gradient of the state, there are existing works on the optimality conditions [Casas1993Optimal], discretization [Deckelnick2014A] and error analysis [Wollner2013A, Nther2011Elliptic, Deckelnick2014A]. Since the Lagrange multiplier corresponding to the pointwise constraint on the gradient of the state in general only represents a regular Borel measure (see [Casas1993Optimal]), the complementarity condition in the optimality conditions cannot be written in a pointwise form, which causes difficulties and reduces the flexibility of the numerical realization. Although this difficulty can be addressed by common regularization approaches such as Lavrentiev regularization [Meyer2006Optimal, Pr2007On, Chen2016A] and Moreau-Yosida regularization [hintermuller2006feasible, hintermuller2006path, Itoa2003Semi, Kunisch2010Optimal], the efficiency of these approaches usually depends on the choice of the regularization parameters, which brings difficulties of its own. To relax the pointwise constraint, in this paper we consider, as a model problem, the following elliptic PDE-constrained optimal control problem with box constraints on the control and an integral constraint on the gradient of the state.
() 
where , , is a convex, open and bounded domain with  or polygonal boundary ; the desired state and are given; and are given parameters; is a closed convex subset of with nonempty interior, is a nonempty convex closed subset of ; is a uniformly elliptic operator
(1) 
where , , , and there is a constant such that
We use to denote the Euclidean norm, use to denote the inner product in and use to denote the corresponding norm.
Remark 1
Although we assume that the operator is uniformly elliptic and that a Dirichlet boundary condition holds, we would like to point out that our considerations also carry over to parabolic operators and to more general boundary conditions of Robin type
where is given and is a nonnegative coefficient.
To solve problem () numerically, there are two possible approaches: "First discretize, then optimize" and "First optimize, then discretize" [Collis2002Analysis]. Regardless of where the discretization takes place, the resulting finite-dimensional systems are quite large. Hence, both approaches require an efficient algorithm that exploits the structure of the problem. In this paper, we use the "First discretize, then optimize" approach. With respect to the discretization, we use a full discretization, in which both the state and the control are discretized by piecewise linear functions.
As is well known, there are many first-order algorithms for quickly solving large-scale finite-dimensional optimization problems, such as the iterative soft thresholding algorithm (ISTA) [Blumensath2008Iterative], the fast iterative soft thresholding algorithm (FISTA) [Beck2009A], accelerated proximal gradient (APG)-based methods [jiang2012inexact, toh2010accelerated] and the alternating direction method of multipliers (ADMM) [li2015qsdpnal, li2016schur, Chen2016A]. Motivated by the success of these first-order algorithms, an APG method in function space (called the Fast Inexact Proximal (FIP) method) was proposed to solve the elliptic optimal control problem involving control cost in [schindele2016proximal]. The efficiency of the APG method depends heavily on how closely the step length approximates the reciprocal of the Lipschitz constant of the gradient; this constant, however, is usually not easy to estimate, which largely limits the efficiency of the APG method. Recently, an inexact heterogeneous ADMM (ihADMM) algorithm was proposed and used to solve optimal control problems with control cost in [Song2017A]. The authors also extended it to optimal control problems with control cost in [Song2016A] and to a Lavrentiev-regularized state-constrained optimal control problem in [Chen2016A]. Its iteration scheme is simple and each subproblem can be solved efficiently; however, the ihADMM algorithm only has an $O(1/k)$ iteration complexity.
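As a concrete illustration of the step-length issue just mentioned, the following sketch (Python; the toy box-constrained least-squares problem and all names such as `fista`, `grad_f` and `prox_g` are our own, not taken from the cited works) runs FISTA with the fixed step 1/L, where L is the Lipschitz constant of the gradient:

```python
import numpy as np

def fista(grad_f, prox_g, L, x0, iters=500):
    """FISTA with the fixed step 1/L; L must upper-bound the Lipschitz
    constant of grad_f, otherwise the iteration may diverge."""
    x = x_prev = x0.copy()
    t = 1.0
    for _ in range(iters):
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x + ((t - 1.0) / t_next) * (x - x_prev)   # Nesterov extrapolation
        x_prev, x = x, prox_g(y - grad_f(y) / L)      # forward-backward step
        t = t_next
    return x

# Toy problem: min 0.5*||A x - b||^2 subject to 0 <= x <= 1.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
L = np.linalg.norm(A, 2) ** 2                 # Lipschitz constant of the gradient
grad_f = lambda x: A.T @ (A @ x - b)
prox_g = lambda y: np.clip(y, 0.0, 1.0)       # projection onto the box
x_opt = fista(grad_f, prox_g, L, np.zeros(5))
```

If L underestimates the true Lipschitz constant, the iteration can diverge, which is precisely why step-length estimation matters in practice.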
Most of the papers mentioned above are devoted to solving the primal problem. However, Song et al. [Song2018An] proposed a duality-based approach for PDE-constrained sparse optimization, which reformulated the problem as a multiblock unconstrained convex composite minimization problem and proposed an sGS-imABCD algorithm to solve it. Motivated by this, we consider solving () via its dual. We find that we can also reformulate our problem as a multiblock unconstrained convex optimization problem and exploit its structure to construct an efficient algorithm. Specifically, it is shown in Section 4.1 that the dual of the discretized version of () can be written as
(2)  
where , , and for any given nonempty, closed convex subset of , denotes the indicator function of . That is to say
(3) 
Based on inner product, the conjugate of is defined as follows
(4)  
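To make the indicator function (3) concrete, a minimal Python sketch for a box set is given below (the function names are ours and purely illustrative); its proximal mapping is simply the Euclidean projection, which is what makes indicator terms easy to handle in proximal-type methods:

```python
import numpy as np

def indicator_box(v, lo, hi):
    """Indicator function of the box [lo, hi]^n: 0 inside, +infinity outside."""
    return 0.0 if np.all((v >= lo) & (v <= hi)) else np.inf

def proj_box(v, lo, hi):
    """Proximal mapping of the box indicator: the Euclidean projection,
    computed componentwise by clipping."""
    return np.clip(v, lo, hi)

v = np.array([-0.3, 0.5, 2.0])
p = proj_box(v, 0.0, 1.0)   # the projected point lies in the box
```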
It is easy to see that (2) is a multiblock unconstrained minimization problem with two nonsmooth terms, which are decoupled. In this case, the block coordinate descent (BCD) method (see [Grippo2000On, Sardy2000Block, Tseng1993Dual, Tseng2001Convergence]) is a natural and appropriate choice. Combining the BCD method with the acceleration technique of the APG method results in the accelerated block coordinate descent (ABCD) method. It is natural to use the ABCD method to solve (2); however, the convergence of the ABCD method cannot be guaranteed in general (see [Chambolle2015A]). Fortunately, in our problem, and can be treated as one block owing to the important fact that and are decoupled. Then (2) can be viewed as an unconstrained convex optimization problem with coupled objective functions of the following form
(5) 
For unconstrained convex optimization problems with coupled objective functions of the form (5), an accelerated alternating descent (AAD) algorithm was proposed in [Chambolle2015A] for the situation in which the joint objective function is quadratic. However, the AAD method does not take the inexactness of the solutions of the associated subproblems into account. In some cases, computing the exact solution of each subproblem is either impossible or extremely expensive. Sun et al. [Sun2015An] proposed an inexact accelerated block coordinate descent (iABCD) method to solve least squares semidefinite programming (LSSDP) via its dual, whose basic idea is to first reduce the two nonsmooth blocks to one by applying a Danskin-type theorem and then use the APG method to solve the reduced problem. However, when the subproblem with respect to cannot be solved exactly, the Danskin-type theorem can no longer be used to reduce the two nonsmooth blocks to one.
To overcome the bottlenecks above, an inexact majorized accelerated block coordinate descent (imABCD) method was proposed in [cui2016, Chapter 3]. Under suitable assumptions and certain inexactness criteria, the author proved that the imABCD method enjoys an $O(1/k^2)$ iteration complexity. Motivated by its success, in this paper we propose a finite element duality-based inexact majorized accelerated block coordinate descent (FEdABCD) algorithm. One distinctive feature of the proposed method is that it employs a majorization technique, which gives us considerable freedom to choose different proximal terms for different subproblems. Moreover, thanks to the inexactness of the method, we have the essential flexibility that the inner subproblems are allowed to be solved only approximately. We would like to emphasize that these flexibilities are essential. First, the flexibility in choosing proximal terms allows each subproblem to retain a good structure and be solved efficiently. In addition, with some simple and implementable error tolerance criteria, the cost of inexactly solving the subproblems can be greatly reduced, which further contributes to the efficiency of the proposed method. Specifically, as shown in Section 4.3, the smooth subproblem, i.e. the subproblem, is a block saddle-point system, which can be solved efficiently by Krylov-based methods with a preconditioner, such as the preconditioned generalized minimal residual (GMRES) method. As for the nonsmooth subproblems, i.e. the subproblems with respect to and , one has a closed-form solution obtained by introducing an appropriate proximal term, and the other is solved efficiently by the semismooth Newton (SSN) method. Moreover, the $O(1/k^2)$ iteration complexity of the FEdABCD algorithm is also proved.
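To give a flavor of how a saddle-point system of the kind mentioned above can be treated by preconditioned GMRES, the sketch below (Python/SciPy) assembles a toy block system from 1D mass-type and stiffness-type matrices; these matrices are our own stand-ins and not the actual system of Section 4.3, and the incomplete-LU preconditioner is just one generic choice:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Toy stand-in for a block saddle-point system: Mh is a mass-type SPD matrix,
# Kh a stiffness-type matrix (1D P1 matrices on a uniform grid, n interior nodes).
n = 50
main = 2.0 * np.ones(n)
off = -np.ones(n - 1)
Kh = sp.diags([off, main, off], [-1, 0, 1], format="csr") * (n + 1)
Mh = sp.diags([-off, 2.0 * main, -off], [-1, 0, 1], format="csr") / (6.0 * (n + 1))
S = sp.bmat([[Mh, Kh.T], [Kh, None]], format="csc")  # symmetric indefinite saddle-point matrix
rhs = np.ones(2 * n)

# Incomplete-LU preconditioner; drop_tol and fill_factor are tuning knobs.
ilu = spla.spilu(S, drop_tol=1e-5, fill_factor=20)
P = spla.LinearOperator(S.shape, ilu.solve)
x, info = spla.gmres(S, rhs, M=P, restart=50, maxiter=200)  # info == 0 signals convergence
```

With a good preconditioner the iteration count is essentially independent of the mesh size, which is the point of using a preconditioned Krylov method for the inner solves.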
The rest of this article is structured as follows. In Section 2, we discuss the optimality conditions of problem (). In Section 3, we consider its discretization. Section 4 gives a duality-based approach that transforms the discretized problem into a multiblock unconstrained convex optimization problem; a brief sketch of the imABCD method is then given, and our proposed FEdABCD algorithm is described in detail. Numerical experiments verifying the efficiency of the proposed algorithm are presented in Section 5. Finally, we give a brief summary in Section 6.
2 Optimality Conditions
For the existence and uniqueness of the solution of the PDE in problem ()
(6) 
where is defined by (1), the following theorem holds.
Theorem 2.1
[Hinze2009Optimization, Theorem 1.23] For each , there exists a unique weak solution of (6), which satisfies
where depends only on , , .
The differentiability of the relation between the control and the state can be readily deduced from the implicit function theorem.
Theorem 2.2
[Casas1993Optimal, Theorem 2] The mapping defined by is of class and for every , the element is the unique solution of the Dirichlet problem
(9) 
Taking a minimizing sequence and arguing in the standard way, we obtain the existence of a solution for the optimal control problem ():
Theorem 2.3
[Hinze2009Optimization, Theorem 1.43] Assume the existence of a feasible control (i.e. a control such that ). Then problem () has a unique solution.
To prove the optimality conditions for (), we first introduce the following theorem on the existence of a Lagrange multiplier.
Theorem 2.4
[Casas1993BOUNDARY, Theorem 5.2] Let and be two Banach spaces, and let and be two convex subsets having nonempty interiors. Let be a solution of the optimization problem:
where and are Gâteaux differentiable at . Then there exist a real number and an element such that
(10a)  
(10b)  
(10c) 
Moreover, can be taken equal to if the following condition of Slater type is satisfied:
(11) 
Theorem 2.5
(13) 
(14) 
(15) 
Proof
We apply Theorem 2.4 with as the control space, as the functional to minimize, which is differentiable (Theorem 2.2), as the convex set of , and as the convex set of .
Then from (10b) and (10c), we deduce the existence of and satisfying (12) and (15). Now we take and the unique solution of (14). It remains to prove inequality (16), which follows from the corresponding inequality (10c). Let us take as a solution of
Then we derive that
(17)  
Then from (10c), we get
(18) 
Moreover, if there exists such that , where is the solution of the following Dirichlet problem
We know from Theorem 2.2 that , then
which means that the Slater-type condition (11) holds, and thus can be taken equal to 1.
3 Finite Element Discretization
In order to tackle () numerically, we discretize both the state and the control by continuous piecewise linear functions. Let us introduce a family of regular triangulations of , i.e. . With each element , we associate two parameters and , where denotes the diameter of the set and is the diameter of the largest ball contained in . The mesh size of is defined by . We suppose that the following standard assumption holds (see [hinze2010variational, Hinze2009Optimization]).
Assumption 1
(Regular and quasi-uniform triangulations) The domain is an open, bounded and convex subset of , and its boundary is a polygon (n=2) or a polyhedron (n=3). Moreover, there exist two positive constants and such that
hold for all and all . Let us define , and let and denote its interior and its boundary, respectively. In the case that has a boundary , we assume that is convex and that all boundary vertices of are contained in , so that
where denotes the measure of the set and is a constant.
Let be the finite dimensional subspace, then
(19) 
(20) 
(21) 
where , , , . We define the following matrices
(22) 
where and denote the finite element stiffness matrix and mass matrix respectively. Let
(23) 
be the nodal projection of , onto , where , . Then the discretized problem can be rewritten in the following matrix-vector form
() 
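To illustrate the stiffness and mass matrices defined in (22), the following Python sketch assembles the P1 stiffness and mass matrices on a uniform 1D grid. The paper works on triangulations in two or three dimensions, so this 1D routine with our own naming is only a simplified stand-in showing the element-by-element assembly pattern:

```python
import numpy as np

def assemble_1d_p1(n_cells):
    """Assemble the P1 stiffness matrix K and mass matrix M on a uniform grid
    of [0, 1] with homogeneous Dirichlet conditions (interior nodes only)."""
    h = 1.0 / n_cells
    n = n_cells - 1                                        # number of interior nodes
    K = np.zeros((n, n))
    M = np.zeros((n, n))
    Ke = (1.0 / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])  # element stiffness
    Me = (h / 6.0) * np.array([[2.0, 1.0], [1.0, 2.0]])    # element mass
    for e in range(n_cells):
        dofs = (e - 1, e)                                  # interior indices of the cell's nodes
        for a in range(2):
            for b in range(2):
                i, j = dofs[a], dofs[b]
                if 0 <= i < n and 0 <= j < n:              # skip boundary nodes
                    K[i, j] += Ke[a, b]
                    M[i, j] += Me[a, b]
    return K, M

K, M = assemble_1d_p1(4)   # mesh size h = 1/4, three interior nodes
```

On a 2D or 3D triangulation the loop structure is the same; only the element matrices and the local-to-global index map change.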
4 An FEdABCD algorithm
In this section, we show that the discretized version of the original problem can be reformulated as a multiblock unconstrained convex optimization problem via its dual. We then recall the inexact majorized accelerated block coordinate descent (imABCD) method, which was proposed by Cui in [cui2016, Chapter 3] for a general class of problems, and explain how we extend it to our problem with several strategies tailored to the structure of the problem.
4.1 A duality-based approach
We introduce two artificial variables and , then () can be transformed into the following problem
() 
whose Lagrangian function is
(24)  
where , and are Lagrangian multipliers associated with the three equality constraints respectively. Then the dual problem of () is
(25) 
Let us focus on first; we have
(26) 
Above we used the notion of the conjugate function. The conjugate of is defined by
(27) 
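As a worked illustration (using bounds $a \le b$, which are our own illustrative symbols rather than notation fixed above), the conjugate of the indicator function of a box is its support function, computable componentwise:

```latex
\delta_{[a,b]}^{*}(w)
  \;=\; \sup_{a \le v \le b} \langle w, v \rangle
  \;=\; \sum_{i} \bigl( b_i \max(w_i, 0) + a_i \min(w_i, 0) \bigr).
```

In particular, the conjugate of the indicator of a bounded set is finite everywhere, and by the Moreau decomposition $\operatorname{prox}_{\delta^{*}}(v) = v - \operatorname{prox}_{\delta}(v) = v - \Pi_{[a,b]}(v)$, its proximal mapping is available in closed form from the projection onto the box.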
Inserting (26) into , we get
Then (25) is transformed into
(28)  
which is equivalent to
()  
This is a multiblock unconstrained optimization problem, for which the accelerated block coordinate descent (ABCD) method is a natural and appropriate choice. We will extend the imABCD algorithm, which was proposed in [cui2016, Chapter 3], to our problem and employ some strategies based on the structure of our problem to solve it efficiently. We give the details of the algorithm in the following subsections.
4.2 Inexact majorized ABCD algorithm
It is well known that taking the inexactness of the solutions of the associated subproblems into account is important for the numerical implementation. Thus, let us give a brief sketch of the imABCD method for a general class of unconstrained multiblock convex optimization problems with a coupled objective function
(29) 
where and are two convex functions (possibly nonsmooth), is a smooth convex function, and and are real finite-dimensional Hilbert spaces. To tackle the general model (29), some further conditions and assumptions on are required.
Assumption 2
The convex function is continuously differentiable with Lipschitz continuous gradient.
Let us denote . Hiriart-Urruty and Nguyen provided a second-order mean-value theorem (see [Hiriart1984Generalized, Theorem 2.3]) for , which states that for any and in , there exist and a self-adjoint positive semidefinite operator such that
(30) 
where denotes Clarke's generalized Hessian at the given and denotes the line segment connecting and . Under Assumption 2, it is obvious that there exist two self-adjoint positive semidefinite linear operators and such that for any ,
(31) 
Thus, for any , , it holds
(32) 
and
(33) 
Furthermore, we decompose the operators and into the following block structures
(34) 
and assume and satisfy the following conditions.
Assumption 3
[cui2016, Assumption 3.1] There exist two self-adjoint positive semidefinite linear operators and such that
(35) 
Furthermore, satisfies that and .
Now we present the inexact majorized ABCD algorithm for the general problem (29) as follows.
Input: . Let be a summable sequence of nonnegative numbers, and set , .
Output:
Iterate until convergence:
 Step 1

Choose error tolerance , such that
Compute
 Step 2

Set and , compute
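For concreteness, Steps 1 and 2 above can be sketched on a toy strongly convex quadratic with two coupled blocks (Python; all names are our own, the majorization is taken to be the exact block Hessians, and the error tolerances are set to zero, so both block subproblems are solved exactly):

```python
import numpy as np

def two_block_abcd(A, B, c, iters=2000):
    """Accelerated two-block coordinate descent sketch for
    min_{x,y} 0.5*||x||^2 + 0.5*||y||^2 + 0.5*||A x + B y - c||^2,
    following the Step 1 / Step 2 pattern above with exact block solves."""
    n, m = A.shape[1], B.shape[1]
    Hx = np.eye(n) + A.T @ A                 # Hessian of the x-block subproblem
    Hy = np.eye(m) + B.T @ B                 # Hessian of the y-block subproblem
    x_prev, y_prev = np.zeros(n), np.zeros(m)
    xt, yt = x_prev.copy(), y_prev.copy()    # extrapolated points
    t = 1.0
    for _ in range(iters):
        # Step 1: exact block minimizations in Gauss-Seidel order
        x = np.linalg.solve(Hx, A.T @ (c - B @ yt))
        y = np.linalg.solve(Hy, B.T @ (c - A @ x))
        # Step 2: update the momentum parameter and extrapolate both blocks
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_next
        xt = x + beta * (x - x_prev)
        yt = y + beta * (y - y_prev)
        x_prev, y_prev, t = x, y, t_next
    return x_prev, y_prev

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 4))
B = rng.standard_normal((10, 3))
c = rng.standard_normal(10)
x, y = two_block_abcd(A, B, c)
```

Here each block subproblem is a small linear system solved exactly; in the FEdABCD setting the analogous subproblems are solved only inexactly, with the tolerances controlled by the summable error sequence of Step 1.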
As for the convergence result of the imABCD algorithm, we refer to the following theorem.