An FE-dABCD algorithm for elliptic optimal control problems with constraints on the gradient of the state and control. (This study was funded by the National Natural Science Foundation of China (Nos. 11571061, 11401075, 11701065, 11501079), the China Postdoctoral Science Foundation (No. 2018M632018) and the Fundamental Research Funds for the Central Universities (No. DUT16LK36).)
In this paper, elliptic optimal control problems with an integral constraint on the gradient of the state and box constraints on the control are considered. The optimality conditions of the problem are proved. To solve the problem numerically, we use the "First discretize, then optimize" approach: both the state and the control are discretized by piecewise linear functions. To solve the discretized problem efficiently, we first transform it into a multi-block unconstrained convex optimization problem via its dual, and then extend the inexact majorized accelerated block coordinate descent (imABCD) algorithm to solve it. The resulting algorithmic framework is called the finite element duality-based inexact majorized accelerated block coordinate descent (FE-dABCD) algorithm. Thanks to the inexactness of the FE-dABCD algorithm, each subproblem is allowed to be solved inexactly. The smooth subproblem is solved by the generalized minimal residual (GMRES) method with a preconditioner. Of the nonsmooth subproblems, one admits a closed-form solution through the introduction of an appropriate proximal term, and the other is solved by the semi-smooth Newton (SSN) method. Based on these efficient strategies, we establish the iteration complexity of the proposed FE-dABCD algorithm. Numerical experiments are carried out, and the results show the efficiency of the FE-dABCD algorithm.
Keywords: Optimal control, integral state constraint, optimality conditions, FE-dABCD
MSC: 49J20, 49N05, 68W01
Optimal control problems with constraints on the gradient of the state have a wide range of important applications, for instance in cooling processes or in structural optimization when high stresses have to be avoided. In cooling processes, constraints on the gradient of the state play an important role in practical applications where the solidification of melts forms a critical process. In order to accelerate production, it is highly desirable to speed up the cooling process while avoiding damage to the products caused by large material stresses. Cooling is frequently described by systems of partial differential equations involving the temperature as a system variable, so that large (von Mises) stresses can be kept small in the optimization by imposing bounds on the gradient of the temperature (see Wollner2013A (); Nther2011Elliptic ()).
Optimal control problems with constraints on the gradient of the state have attracted much attention. For optimal control problems with pointwise constraints on the gradient of the state, there are existing works on the optimality conditions Casas1993Optimal (), discretization Deckelnick2014A () and error analysis Wollner2013A (); Nther2011Elliptic (); Deckelnick2014A (). Since the Lagrange multiplier corresponding to a pointwise constraint on the gradient of the state in general only represents a regular Borel measure (see Casas1993Optimal ()), the complementarity condition in the optimality conditions cannot be written in a pointwise form, which causes difficulties and reduces the flexibility of the numerical realization. Although this difficulty can be addressed by common regularization approaches such as Lavrentiev regularization Meyer2006Optimal (); Pr2007On (); Chen2016A () and Moreau-Yosida regularization hintermuller2006feasible (); hintermuller2006path (); Itoa2003Semi (); Kunisch2010Optimal (), the efficiency of these approaches usually depends on the choice of the regularization parameters, which itself is delicate. To relax the pointwise constraint, in this paper we consider as a model problem the following elliptic PDE-constrained optimal control problem with box constraints on the control and an integral constraint on the gradient of the state.
where , , is a convex, open and bounded domain with - or polygonal boundary ; the desired state and are given; and are given parameters; is a closed convex subset of with nonempty interior, is a nonempty convex closed subset of ; is a uniformly elliptic operator
where , , , and there is a constant such that
We use to denote the Euclidean norm, use to denote the inner product in and use to denote the corresponding norm.
Although we assume that the operator is a uniformly elliptic operator and Dirichlet boundary condition holds, we would like to point out that our considerations can also carry over to parabolic operators and more general boundary conditions of Robin type
where is given and is a nonnegative coefficient.
To solve problem () numerically, there are two possible approaches: "First discretize, then optimize" and "First optimize, then discretize" Collis2002Analysis (). Regardless of where the discretization takes place, the resulting finite-dimensional systems are quite large. Hence, in both cases an efficient algorithm that exploits the structure of the problem is required. In this paper, we use the "First discretize, then optimize" approach. With respect to the discretization, we use the full discretization method, in which both the state and the control are discretized by piecewise linear functions.
As is well known, many first-order algorithms are used to solve large-scale finite-dimensional optimization problems efficiently, such as iterative soft thresholding algorithms (ISTA) Blumensath2008Iterative (), fast iterative soft thresholding algorithms (FISTA) Beck2009A (), accelerated proximal gradient (APG)-based methods jiang2012inexact (); toh2010accelerated () and the alternating direction method of multipliers (ADMM) li2015qsdpnal (); li2016schur (); Chen2016A (). Motivated by the success of these first-order algorithms, an APG method in function space (called the Fast Inexact Proximal (FIP) method) was proposed to solve the elliptic optimal control problem involving -control cost in schindele2016proximal (). The efficiency of the APG method depends closely on whether the step length is close enough to the reciprocal of the Lipschitz constant; however, the Lipschitz constant is usually not easy to estimate, which largely limits the efficiency of the APG method. Recently, an inexact heterogeneous ADMM (ihADMM) algorithm was proposed and used to solve optimal control problems with -control cost in Song2017A (). The authors also extended it to optimal control problems with -control cost in Song2016A () and to the Lavrentiev-regularized state-constrained optimal control problem in Chen2016A (). Its iteration scheme is simple and each subproblem can be solved efficiently; however, the ihADMM algorithm only achieves a slower convergence rate in terms of iteration complexity.
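As a point of reference for the APG-type methods discussed above, the following is a minimal FISTA sketch (not the FIP method of schindele2016proximal ()); the LASSO test instance and all names are illustrative assumptions. It also shows where the Lipschitz constant enters: the step length is 1/L, which is why a poor estimate of L degrades performance.

```python
import numpy as np

def fista(grad_f, prox_g, x0, L, n_iter=100):
    """Minimal FISTA sketch: minimize f(x) + g(x), with f smooth
    (L-Lipschitz gradient) and g nonsmooth with an easy proximal map.
    The step length 1/L must match the Lipschitz constant of grad f."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(n_iter):
        x_new = prox_g(y - grad_f(y) / L, 1.0 / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)  # Nesterov extrapolation
        x, t = x_new, t_new
    return x

# Illustrative instance: LASSO  min 0.5*||A x - b||^2 + alpha*||x||_1
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
b = rng.standard_normal(30)
alpha = 0.1
L = np.linalg.norm(A.T @ A, 2)  # Lipschitz constant of the smooth gradient
soft = lambda z, s: np.sign(z) * np.maximum(np.abs(z) - alpha * s, 0.0)
x = fista(lambda z: A.T @ (A @ z - b), soft, np.zeros(10), L, n_iter=500)
```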
Most of the papers mentioned above are devoted to solving the primal problem. However, Song et al. Song2018An () proposed a duality-based approach for PDE-constrained sparse optimization, which reformulates the problem as a multi-block unconstrained convex composite minimization problem and solves it by an sGS-imABCD algorithm. Motivated by this, we consider solving () via its dual. We find that our problem can also be reformulated as a multi-block unconstrained convex optimization problem, whose structure can be exploited to construct an efficient algorithm. Specifically, it is shown in Section 4.1 that the dual of the discretized version of () can be written as
where , , and for any given nonempty, closed convex subset of , denotes the indicator function of . That is to say
Based on inner product, the conjugate of is defined as follows
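For completeness, recall the standard definitions of the indicator function of a nonempty closed convex set $K$ and of the Fenchel conjugate of a proper convex function $f$ (the symbols here are generic and not the specific notation of the problem):

```latex
\delta_K(x) :=
\begin{cases}
0, & x \in K, \\
+\infty, & x \notin K,
\end{cases}
\qquad
f^{*}(y) := \sup_{x}\,\bigl\{ \langle x, y \rangle - f(x) \bigr\}.
```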
It is easy to see that (2) is a multi-block unconstrained minimization problem including two nonsmooth terms, which are decoupled. In this case, the block coordinate descent (BCD) method (see Grippo2000On (); Sardy2000Block (); Tseng1993Dual (); Tseng2001Convergence ()) is a natural and appropriate choice. Combining the BCD method with the acceleration technique of APG results in the accelerated block coordinate descent (ABCD) method. It is therefore natural to use the ABCD method to solve (2); however, the convergence of the block ABCD method cannot be guaranteed in general (see Chambolle2015A ()). Fortunately, in our problem, and can be treated as one block thanks to the important fact that and are decoupled. Then (2) can be seen as an unconstrained convex optimization problem with coupled objective function of the following form
For unconstrained convex optimization problems with coupled objective functions of the form (5), an accelerated alternating descent (AAD) algorithm was proposed in Chambolle2015A () for the case in which the joint objective function is quadratic. However, the AAD method does not take the inexactness of the solutions of the associated subproblems into account. In some cases, exactly computing the solution of each subproblem is either impossible or extremely expensive. Sun et al. Sun2015An () proposed an inexact accelerated block coordinate descent (iABCD) method to solve least squares semidefinite programming (LSSDP) via its dual; its basic idea is to first reduce the two nonsmooth blocks to one by applying a Danskin-type theorem, and then to use the APG method to solve the reduced problem. However, when the subproblem with respect to cannot be solved exactly, the Danskin-type theorem can no longer be used to reduce the two nonsmooth blocks to one.
To overcome the bottlenecks above, an inexact majorized accelerated block coordinate descent (imABCD) method was proposed in (cui2016, , Chapter 3). Under suitable assumptions and certain inexactness criteria, the author proved the iteration complexity of the imABCD method. Motivated by its success, in this paper we propose a finite element duality-based inexact majorized accelerated block coordinate descent (FE-dABCD) algorithm. One distinctive feature of the proposed method is that it employs a majorization technique, which gives us a great deal of freedom to flexibly choose different proximal terms for different subproblems. Moreover, thanks to the inexactness of the proposed method, we have the essential flexibility that the inner subproblems are allowed to be solved only approximately. These flexibilities are essential. First, the freedom in choosing proximal terms allows each subproblem to retain a good structure so that it can be solved efficiently. In addition, with some simple and implementable error tolerance criteria, the cost of inexactly solving the subproblems can be greatly reduced, which further contributes to the efficiency of the proposed method. Specifically, as shown in Section 4.3, the smooth subproblem, i.e. the -subproblem, is a block saddle point system, which can be solved efficiently by Krylov-based methods with a preconditioner, such as the preconditioned generalized minimal residual (GMRES) method. As for the nonsmooth subproblems, i.e. the subproblems with respect to and , the -subproblem has a closed-form solution through the introduction of an appropriate proximal term, and the -subproblem is solved efficiently by the semi-smooth Newton (SSN) method. Moreover, the iteration complexity of the FE-dABCD algorithm is also established.
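To illustrate the preconditioned GMRES strategy for a block saddle point system, the sketch below solves a hypothetical system of the same block form with SciPy. The blocks `M` and `K` are toy stand-ins (not the paper's actual finite element matrices), and the block-diagonal preconditioner with a diagonal Schur complement approximation is one common choice, not necessarily the one used in Section 4.3.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 20
M = 2.0 * sp.eye(n, format="csr")                 # SPD "mass-like" block (toy)
K = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1],
             shape=(n, n), format="csr")          # SPD "stiffness-like" block (toy)

# Block saddle point system [[M, K^T], [K, 0]] [y; p] = rhs
A = sp.bmat([[M, K.T], [K, None]], format="csr")
rhs = np.ones(2 * n)

# Block-diagonal preconditioner diag(M, S), with the Schur complement
# S = K M^{-1} K^T approximated by its diagonal
S_diag = (K @ sp.diags(1.0 / M.diagonal()) @ K.T).diagonal()
P_diag = np.concatenate([M.diagonal(), S_diag])
P = spla.LinearOperator(A.shape, matvec=lambda v: v / P_diag)

x, info = spla.gmres(A, rhs, M=P, restart=2 * n, maxiter=200)
```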
The rest of this article is structured as follows. In Section 2, we discuss the optimality conditions of problem (). In Section 3, we consider its discretization. Section 4 presents a duality-based approach that transforms the discretized problem into a multi-block unconstrained convex optimization problem; a brief sketch of the imABCD method is then given and the proposed FE-dABCD algorithm is described in detail. Numerical experiments verifying the efficiency of the proposed algorithm are reported in Section 5. Finally, a brief summary is given in Section 6.
2 Optimality Conditions
where is defined by (1), the following theorem holds.
The weak formulation of (6) is given by
with the bilinear form
The differentiability of the relation between the control and the state can be readily deduced from the implicit function theorem.
(Casas1993Optimal, , Theorem 2) The mapping defined by is of class and for every , the element is the unique solution of the Dirichlet problem
(Hinze2009Optimization, , Theorem 1.43) Assuming the existence of a feasible control (i.e. a control such that ), then problem () has a unique solution.
(Casas1993BOUNDARY, , Theorem 5.2) Let and be two Banach spaces and let and be two convex subsets, having a nonempty interior. Let be a solution of the optimization problem:
where and are Gâteaux differentiable at . Then there exist a real number and an element such that
Moreover, can be taken equal to if the following condition of Slater type is satisfied:
Then from (10b) and (10c), we deduce the existence of and satisfying (12) and (15). Now we take and the unique solution of (14). It remains to prove inequality (16), which is done by using the corresponding inequality (10c). To this end, let us take as a solution of
Then we derive that
Then from (10c), we get
Moreover, if there exists such that , where is the solution of the following Dirichlet problem
We know from Theorem 2.2 that , then
which means that the Slater-type condition (11) holds, and then can be taken equal to 1.
3 Finite Element Discretization
In order to tackle () numerically, we discretize both the state and the control by continuous piecewise linear functions. Let us introduce a family of regular triangulations of , i.e. . With each element , we associate two parameters and , where denotes the diameter of the set and is the diameter of the largest ball contained in . The mesh size of is defined by . We suppose the following standard assumption holds (see hinze2010variational (), Hinze2009Optimization ()).
(Regular and quasi-uniform triangulations) The domain is an open, bounded and convex subset of , and its boundary is a polygon (n=2) or a polyhedron (n=3). Moreover, there exist two positive constants and such that
hold for all and all . Let us define , and let and denote its interior and its boundary, respectively. In the case that has a -boundary , we assume that is convex and that all boundary vertices of are contained in , such that
where denotes the measure of the set and is a constant.
Let be the finite dimensional subspace, then
where , , , . We define the following matrices
where and denote the finite element stiffness matrix and mass matrix respectively. Let
be the nodal projection of , onto , where , . Then the discretized problem can be rewritten into the following matrix-vector form
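To make the matrix-vector form concrete, the following sketch assembles the P1 stiffness and mass matrices on a uniform 1D mesh. This is a one-dimensional analogue chosen for brevity (the paper works on a 2D/3D triangulation); the function name and mesh are illustrative, but the element-loop assembly pattern carries over.

```python
import numpy as np

def assemble_p1_1d(n_el, length=1.0):
    """Assemble the stiffness matrix K and mass matrix M for P1
    (piecewise linear) elements on a uniform 1D mesh of n_el elements,
    a toy analogue of the finite element matrices in the text."""
    h = length / n_el
    n = n_el + 1  # number of nodes
    K = np.zeros((n, n))
    M = np.zeros((n, n))
    Ke = (1.0 / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])  # local stiffness
    Me = (h / 6.0) * np.array([[2.0, 1.0], [1.0, 2.0]])    # local mass
    for e in range(n_el):
        idx = [e, e + 1]                 # global indices of the element's nodes
        K[np.ix_(idx, idx)] += Ke
        M[np.ix_(idx, idx)] += Me
    return K, M

K, M = assemble_p1_1d(8)
```

Two standard sanity checks: the entries of the mass matrix sum to the length of the interval, and constant functions lie in the kernel of the stiffness matrix.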
4 An FE-dABCD algorithm
In this section, we show that the discretized version of the original problem can be reformulated as a multi-block unconstrained convex optimization problem via its dual. We first review the inexact majorized accelerated block coordinate descent (imABCD) method, which was proposed by Cui in (cui2016, , Chapter 3) for a general class of problems, and then explain how we extend it to our problem with strategies tailored to the structure of the problem.
4.1 A duality-based approach
whose Lagrangian function is
Let us focus on first; we have
Above we use the concept of conjugate function. The conjugate function of is defined by
Inserting (26) into , we get
Then (25) is transformed into
which is equivalent to
which is a multi-block unconstrained optimization problem. Thus, the accelerated block coordinate descent (ABCD) method is preferred and appropriate. We will extend the imABCD algorithm, proposed in (cui2016, , Chapter 3), to our problem and employ strategies tailored to its structure to solve it efficiently. Details of the algorithm are given in the following part.
4.2 Inexact majorized ABCD algorithm
It is well known that taking the inexactness of the solutions of associated subproblems into account is important for the numerical implementation. Thus, let us give a brief sketch of the imABCD method for a general class of unconstrained, multi-block convex optimization problems with coupled objective function
where and are two convex functions (possibly nonsmooth), is a smooth convex function, and , are real finite-dimensional Hilbert spaces. To tackle the general model (29), some further conditions and assumptions on are required.
The convex function is continuously differentiable with Lipschitz continuous gradient.
Let us denote . Hiriart-Urruty and Nguyen provided a second order Mean-Value Theorem (see (Hiriart1984Generalized, , Theorem 2.3)) for , which states that for any and in , there exists and a self-adjoint positive semidefinite operator such that
where denotes Clarke's generalized Hessian at the given and denotes the line segment connecting and . Under Assumption 2, it is obvious that there exist two self-adjoint positive semidefinite linear operators and such that for any ,
Thus, for any , , it holds
Furthermore, we decompose the operators and into the following block structures
and assume and satisfy the following conditions.
(cui2016, , Assumption 3.1) There exist two self-adjoint positive semidefinite linear operators and such that
Furthermore, satisfies that and .
Now we present the inexact majorized ABCD algorithm for the general problem (29) as follows.
Input: . Let be a summable sequence of nonnegative numbers, and set , .
Iterative until convergence:
- Step 1
Choose error tolerance , such that
- Step 2
Set and , compute
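The two steps above can be sketched in simplified form as follows. This is a plain accelerated two-block iteration applied to a toy coupled quadratic; the majorization operators, the error tolerance criteria of Step 1, and all problem data here are illustrative assumptions rather than the actual imABCD specification.

```python
import numpy as np

def abcd_two_block(prox_u, prox_v, grad_u, grad_v, u0, v0,
                   sigma, tau, n_iter=1000):
    """Simplified accelerated two-block coordinate descent for
    min f(u) + g(v) + phi(u, v): each block is updated by a proximal
    gradient step at an extrapolated point, followed by a Nesterov-type
    update of t_k (the inexactness criteria of imABCD are omitted)."""
    u, v = u0.copy(), v0.copy()
    u_bar, v_bar, t = u0.copy(), v0.copy(), 1.0
    for _ in range(n_iter):
        u_new = prox_u(u_bar - sigma * grad_u(u_bar, v_bar), sigma)
        v_new = prox_v(v_bar - tau * grad_v(u_new, v_bar), tau)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_new                  # extrapolation weight
        u_bar = u_new + beta * (u_new - u)
        v_bar = v_new + beta * (v_new - v)
        u, v, t = u_new, v_new, t_new
    return u, v

# Toy instance: phi(u, v) = 0.5*||u - d||^2 + 0.5*||u + v||^2,
# f = indicator of {u >= 0}, g = indicator of the box [-0.5, 0.5]
d = np.array([2.0, -1.0])
u, v = abcd_two_block(
    lambda z, s: np.maximum(z, 0.0),        # prox of f: projection onto {u >= 0}
    lambda z, s: np.clip(z, -0.5, 0.5),     # prox of g: projection onto the box
    lambda u, v: (u - d) + (u + v),         # gradient of phi in u
    lambda u, v: u + v,                     # gradient of phi in v
    np.zeros(2), np.zeros(2), sigma=0.5, tau=1.0)
```

For this instance the minimizer can be computed by hand (u = (1.25, 0), v = (-0.5, 0)), which gives a quick check that the iteration behaves as intended.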
As for the convergence result of the imABCD algorithm, we can refer to the following theorem.