A robust optimal control problem with moment constraints on distribution: theoretical analysis and an algorithm
We study an optimal control problem in which both the objective function and the dynamic constraint contain an uncertain parameter. Since the distribution of this uncertain parameter is not exactly known, the objective function is taken as the worst-case expectation over a set of possible distributions of the uncertain parameter. This ambiguity set of distributions is, in turn, defined by the first two moments of the random variables involved. The optimal control is found by minimizing the worst-case expectation over all possible distributions in this set. If the distributions are discrete, the stochastic min-max optimal control problem can be converted into a convensional optimal control problem via duality, which is then approximated as a finite-dimensional optimization problem via the control parametrization. We derive necessary conditions of optimality and propose an algorithm to solve the approximation optimization problem. The results of discrete probability distribution are then extended to the case with one dimensional continuous stochastic variable by applying the control parametrization methodology on the continuous stochastic variable, and the convergence results are derived. A numerical example is present to illustrate the potential application of the proposed model and the effectiveness of the algorithm.
Ideas to immunize optimization problems against perturbations in model parameters arose as early as in 1970s. A worst-case model for linear optimization such that constraints are satisfied under all possible perturbations of the model parameters was proposed in . A common approach to solving this type of models is to transform the original uncertain optimization problem into a deterministic convex program. As a result, each feasible solution of the new program is feasible for all allowable realizations of the model parameters, therefore the corresponding solution tends to be rather conservative and in many cases even infeasible. For a detailed survey, see the recent monograph .
For traditional stochastic programming approaches, uncertainties are modeled as random variables with known distributions. In very few cases, analytic solutions are obtained (see, e.g. Birge and Louveaux , Ruszczynski and Shapiro ). These approaches may not be always applicable in practice, as the exact distributions of the random variables are usually unknown.
In the framework of robust optimization, uncertainties are usually modeled by uncertainty sets, which specify certain ranges for the random variables. The worst-case approach is used to handle the uncertainty. It is often computationally advantageous to use the “robust” formulation of the problem. However, the use of uncertainty sets as the possible supporting sets for the random variables is restrictive in practice; it leads to relatively conservative solutions.
The recently developed “distributionally robust” optimization approach combines the philosophies of traditional stochastic and robust optimization – this approach does not assume uncertainty sets, but keep using the worst-case methodology. Instead of requiring the shape and size of the support sets for the random variables, it assumes that the distributions of the random variables satisfying a set of constraints, often defined in terms of moments and supports. Since the first two moments can usually be estimated via statistical tools, the distributionally robust model appears to be more applicable in practice. Furthermore, since it takes the worst-case expected cost, it inherits computational advantages from robust optimization. Due to these advantages, distributionally robust optimization has attracted more and more attention in operations research community .
In this paper, we propose a novel optimal control model with an uncertain parameter for which its exact distribution is unknown. However, it is assumed that the mean and the standard deviation of the uncertain parameter are known. The optimal control is found by minimizing the worst-case expectation with respect to all distributions in an “ambiguity set”. Both the problems with discrete probability distribution and with continuous probability distribution will be discussed. We first consider the case of discrete probability distribution, in which the min-max optimal control problem is transformed into an equivalent finite dimensional minimization problem via duality. Then the necessary conditions of optimality are derived. The results for the case of discrete probability distribution are then extended to the case with one dimensional continuous stochastic variable. The control parametrization methodology is applied to parameterise the continuous stochastic variable. Finally, an example is solved showing the potential application of the proposed optimal control framework and the effectiveness of the algorithm.
For simplicity, we only discuss optimal control of dynamical systems with a single uncertain parameter. However, the results can be directly extended to cases involving multiple independent uncertain parameters.
To begin with, consider a system of ordinary differential equations with an uncertain parameter as follows:
where is the state vector, is the control vector function, and is an uncertain parameter. In general, the parameter is regarded as uniquely determined. In reality, however, this hypothesis often does not hold, since parameter is uncertain subject to variability. The only reliable information is that the value of the parameter falls within a certain range and that its potential values follow some statistical distribution. Our interest focuses on the following distributionally robust optimal control problem.
Here, is a compact and convex subset of . The difference between Problem (DROCP) and the standard optimal control problem is that the parameter herein is considered as a stochastic variable with distribution . The distribution , however, is not exactly known. The only knowledge, which is available, is the mean and the standard deviation of the distribution ; they are denoted by and , respectively. The set of all such distributions is denoted by . Any measurable function defined in with values in is called an admissible control. Let be the class of all such admissible controls.
Throughout this paper, we make the following assumptions.
The functions and are at least continuously differentiable with respect to all their arguments.
For each fixed , there exist positive constants and such that the following inequality holds
From the classical differential equation theory (See, for example, Proposition 5.6.5 in ), we recall that the system (Equation 1) admits a unique solution, , corresponding to each and . Problem (DROCP) can be roughly stated as: Find a control such that the worst-case expectation from all feasible distributions is minimized over . Obviously, Problem (DROCP) is a min-max optimal control problem.
3Distributionally robust optimal control problem with discrete distribution
In this section, we focus on the case of discrete distributions. In this case, we will reformulate the distributionally robust optimal control as an equivalent combined optimal control and optimal parameter selection problem by using a dual transformation. We will then develop an algorithm to solve the resulting problem based on the parametric sensitivity functions and the control parametrization method.
3.1Problem reformulation and optimality conditions
Let be a possible value of the parameter , and let be the corresponding probability, i.e., , . We first investigate the inner -optimization problem, in which the value of is fixed. In this context, there are possible system trajectories due to different values of the parameter . Let be the trajectory of system (Equation 1) with . When there is no confusion, is written as . Each possible trajectory yields a corresponding system cost . The inner subproblem is to evaluate the worst-case expectation from all possible distributions, which is given as follows:
Note that the only variables to be optimized in the above inner subproblem are , . Hence, the constraints (Equation 2) and ( ?) are not present in ISP. In addition, Problem (ISP) is a linear programming, and its dual is given as follows:
There is no duality gap between the inner subproblem and its dual problem, since the feasible set of Problem (ISP) is nonempty and bounded. Thus, the original Problem (DROCP) is equivalent to the following problem:
Let and . Combining system ( ?) and the scalar inequality constraints ( ?) to the cost function with multiplier functions and multiplier vector yields the Lagrangian of Problem (Dual-DROCP) as given below.
where . Let and . Then
Based on the fundamental variational principle , we have the necessary optimality conditions of Problem (Dual-DROCP) given in the following as a theorem.
3.2The optimization algorithm
Assume that is a solution of Problem (Dual-DROCP). Clearly, the optimal control function, , for Problem (Dual-DROCP) is also the optimal solution of Problem (DROCP). Then, the optimal distribution, , can be obtained by solving Problem (ISP) with . The algorithm framework for the solution of Problem (DROCP) is presented as follows.
Step 1. Solve Problem (Dual-DROCP), denote the solution by ;
Step 2. For each possible parameter , , compute the optimal trajectories, , and the corresponding cost ;
Step 3. Compute the optimal solution of Problem (ISP) by using linear programming solver.
For the above algorithm framework, Step 2 and Step 3 can be computed readily. Thus, the remaining problem is on how to solve Problem (Dual-DROCP).
Let be the partition grids of the time horizon . On the control parametrization framework, the control function is approximated by a piecewise constant function or a piecewise linear function, where the heights of these approximate functions are decision variables. In fact, the control function can be approximated by a linear combination of any appropriate set of basis functions. Thus, Problem (Dual-DROCP) is approximated as a finite-dimensional optimization problem, where the coefficients of the basis functions are regarded as decision variables. In this paper, the control is approximated as a piecewise constant function in the form as given below:
where , , , and denotes the characteristic function of . Define and . Clearly, the control defined in the form of (Equation 4) is one to one corresponding with the control parameter vector . Let be the solution of system (Equation 1) corresponding to . With some abuse of notation, is abbreviated as or when no confusion can arise. Then, the parameterized problem for Problem (Dual-DROCP) can be stated as given below:
Problem (Discre-Dual-DROCP) is essentially a finite-dimensional optimization problem, which can be solved readily by various optimization techniques. In general, the values of the objective function and the constraint functions and their respective gradients are required to be computed at each iteration of the optimization procedure. The gradient of the objective function is obvious since it is only a linear function of . The gradients of the constraint functions can be evaluated by solving either the adjoint equations (see, for example, ) or the sensitivity function (see, for example, ). In this paper, the method based on the sensitivity function is used. The parametric sensitivity system and the gradient formulas are given in the following as a theorem.
Problem (Discre-Dual-DROCP) differs from the standard mathematical programming problems in the sense that it involves the dynamic system ( ?) and the end-point constraints ( ?). The dynamic constraint ( ?) as well as the systems of differential equations of the parametric sensitivity functions are solved by an ordinary differential equation (ODE) solver in each iteration of the optimization procedure. The end-point constraints ( ?) are handled as follows. Define
and . Constraints ( ?) are equivalent to the following equality constraint:
However, is nonsmooth in . Standard optimization routines would have difficulties in handling this type of equality constraints. A widely used smoothing technique  is to approximate by
By using the quadratic penalty function, Problem (Discre-Dual-DROCP) is finally approximated by
where and is the penalty parameter. The algorithm framework for the solution of Problem (QP-Dual-DROCP), which is constructed based on Algorithm 17.4 in , is stated as follows.
Choose an initial point . Choose convergence tolerances and . Choose positive constants , , and . Set , ;
For , integrate system ( ?)-( ?) together with the parametric sensitivity systems ( ?) forward in time from 0 to .
Evaluate the value of the merit function and its gradients, denoted by and , respectively.
If , where is the partial projection of the vector onto the rectangular box at the current point , defined by
Then goto (S4-1). Otherwise, goto (S4-2).
If , goto (S5-1); otherwise, goto (S5-3)
Using a line search method to find the next point , replace by , and goto (S1).
If and , stop. Record the approximate solution obtained. Otherwise, goto (S5-3).
Set and goto (S5-3).
—Increase penalty parameter
Set , , , and goto (S1).
4The case of continuous distributions
For the case of continuous distributions, the cost function can be considered as a function of and , because the state is only an intermediate variable depending on and . For a fixed , the inner sub-problem is given as follows.
To extent the results obtained for the case of discrete distributions detailed in the previous section to the case of continuous distributions, we propose a scheme for the discretization of the continuous stochastic variable based on the control parametrization method.
Suppose that the uncertain parameter is disturbed in an interval . Let be an element of , i.e., is a potential probability density function of satisfying ( ?) and ( ?). Let be a set of time points on the interval . Denote by , , and by . Let , and let
Let be an arbitrarily but fixed element chosen from . It is referred to as a characteristic element of this subinterval. When the uncertain parameter takes values in , it is approximated as in the system. As a result, the uncertain parameter interval is approximated as a finite set . Moreover, the probability is defined as
On the discretization of the continuous distribution, the cost function is approximated as follows:
The same idea can be used for the constraints. Thus, we can approximate the inner sub-problem (CISP) by the following discrete-distribution problem
Note that Problem (DISP) is the same as Problem (ISP) detailed in the previous section. That is, it is a distributionally robust optimal control problem with discrete distribution. If the above discretization method is convergent, the solution of Problem (DROCP) with continuous distribution can be approximately obtained through solving a sequence of problems with discrete distributions. Therefore, we only need to verify the convergence of the above discretization scheme, which will be proved by investigating the relationships of the cost function and constraint functions between Problem (CISP) and Problem (DISP).
From (Equation 10), it is obvious that constraints (Equation 8) and (Equation 12) are consistent. Besides, inequality ( ?) in Problem (CISP) also implies inequality ( ?) in Problem (DISP). Therefore, we only need to evaluate the differences of the cost functions and the two constraints between Problem (CISP) and Problem (DISP). Details are given in the following two theorems.
Given and any , the solution of system (Equation 1) can be expressed as
It follows that
Since is at least continuously differentiable with respect to , it also satisfies Lipschitz condition in on , that is, there exists a constant such that
Therefore, we have
where is defined in (Equation 9). This complete the proof.
(a) The equality holds directly from the definition. Thus, it remains to prove the validity of the inequality. From (Equation 10), we have
(b) Similar to the derivation given in (a), we obtain
(c) Similarly, we have
Since is continuously differentiable in , there exists, for any , a such that
From Theorem ?, it follows that the following inequality
Let . Then, we conclude that inequalities (a), (b), (c) hold if . Hence, the proof is completed.
In this section, we choose an example to illustrate the application of distributionally robust optimal control model and to test the performance of the proposed algorithm. The illustration example is a distributionally robust optimal control of a microbial fed-batch process , which is stated as follows.
Let be the concentration of biomass (g/L), be the concentration of substrate (g/L) and be the volume of the solution (L). The control system of the fed-batch process is described by
where is the input control of the substrate. is the specific decay rate of cells, and is the concentration of substrate in feed medium. is the specific growth rate of biomass, and is the specific consumption rate of substrate, which are, respectively, expressed as
In (Equation 16), is the maximum specific growth rate, is the saturation constant, and is the critical concentration of the substrate above which cells cease to grow. In (Equation 16), and are, respectively, the maintenance requirement of substrate and the maximum growth yield. The above system is a typical kinetic model used in microbial fermentation process, see, e.g., Ye et al.  and Zeng et al. . In general, is regarded to be constant during the whole fermentation process. However, it is well-known that the maintenance consumption of substrate would vary during different fermentation stages. Thus, we consider as an uncertain parameter in this work. Assume that the mean and the standard deviation of the uncertain parameter are and , respectively.
The problem is to control the input such that the biomass at the terminal time is maximized. For convenience of presentation, let . Define the admissible set of controls as with and being the minimum and maximum input rates. Define
By transforming the time interval [0,] into , the optimal control problem can be stated as follows:
In this numerical example, we set , , , , , , , , , , , h.
In Algorithm 3.1, even if Problem (Dual-DROCP) is linear with respect to , we still optimize together with by using the nonlinear optimization techniques. In the numerical experiments, the time horizon is equidistantly divided into 25 subintervals for the parameterization of the control . Since the microbial fermentation is a relatively slow time-varying process, the partition is adequate. In the discretization of the continuous distribution of the uncertain parameter , we choose ten characteristic elements over , where , .
A good initial guess of the decision variables is important to help ensure the convergence of the algorithm. We use the following procedure to generate an initial guess: randomly generate a control , and for the fixed , the variable is optimized by a linear programming solver and the performance of (QP-Dual-DROCP) is computed; the process is repeated (=200) times and the pair with the best performance is set to be an initial guess for a run of Algorithm 3.1. It is worth mentioning that the generated initial guess, if exists, is a feasible solution of Problem (Dual-DROCP). Starting from the initial guess, the proposed algorithm, which is encoded in Matlab 7.0, is implemented on an Intel dual-core i5 with 2450GHz. The height of the optimal control at each subinterval is listed in Table 1. Under the optimal input strategy and , the trajectories of the biomass and the substrate are plotted in Figs. 1 and 2.
After obtaining the optimal solution from Algorithm 3.1, we fix the control at and optimize again by solving Problem (Dual-ISP) with linear programming solver. The optimal solution is denoted by . The values of the cost function at and are denoted as and , respectively. We have and . The difference between and is only 0.0985, which reflects the effectiveness of the proposed algorithm in some degree. On the other hand, we fix the control at and optimize directly by solving Problem (ISP) with linear programming solver. The optimal solution of is . The terminal concentrations of biomass under the characteristic elements are
To illustrate the superiority of the optimal control strategy obtained from the proposed model, we simulate the system under a constant input . The trajectories of biomass and substrate with varied under this input strategy are shown in Figs. 3 and 4. A comparison of Fig. 1 and Fig. 3 reveals that, not only the the terminal concentration of biomass under the optimal strategy is significantly higher than that under the constant control input, but also the variation of biomass concentration is much smaller than the constant one. This shows that the system under the optimal control strategy could maintain a good performance even in the “worst” case.
This paper introduced an optimal control problem in which both the objective function and the dynamic constraint contain an uncertain parameter. Since the distribution of this uncertain parameter is not exactly known, the objective function is taken as the worst-case expectation over a set of possible distributions of the uncertain parameter. To minimize the worst-case expectation over all possible distributions in an ambiguity set, the stochastic optimal control problem is converted into a finite-dimensional optimization problem via duality and discretization. Necessary conditions of optimality was derived and numerical results for an illustration example are reported. Numerical results in Section 5 show the success of the proposed model in producing an optimal control strategy under which a good performance is achieved. It also ensures that the variation of the performance is small subject to the changes in the value of the uncertain parameter. That is, the system is robust under the optimal control strategy obtained from the proposed model.
The continuation of this work can be divided into two aspects: model aspect and algorithm aspect. The model should take more factors into account. For example, a further study could be on how to introduce proper terminal constraints or path constraints into the model. In the algorithm aspect, the current work transforms the proposed model into a combined optimal control and optimal parameter selection problem and solve it by using nonlinear optimization techniques. However, the special structure of the problem was not taken into detailed investigation. Problem (Dual-DROCP) is linear with respect to the optimization vector but nonlinear with respect the control . An alternative direction optimization technique could be used to handle these two kinds of optimization variables separately. For example, the control can be fixed first, and the optimal solution is easily obtained by solving a linear programming problem. Then, the control is regulated by some nonlinear optimization methods. The procedure is repeated until a satisfactory pair of is found. Some stochastic techniques, such as PSO method, could also be combined with the alternative direction optimization technique to regulate the control in the outer level of the optimization process.
This work was supported by the National Natural Science Foundation for the Youth of China (Grants 11301081, 11401073), China Postdoctoral Science Foundation (Grant No. 2014M552027), the Fundamental Research Funds for Central Universities in China (Grant DUT15LK25), and Provincial National Science Foundation of Fujian (Grant No. 2014J05001).
- A. L. Soyster. Convex programming with set-inclusive constraints and applications to inexact linear programming. Oper. Res. 1973, 21(5): 1154-1157.
- A. Ben-Tal, L. El Ghaoui, A. Nemirovski. Robust Optimization. Princeton University Press, Princeton, NJ, 2009.
- John R. Birge and F. Louveaux. Introduction to Stochastic Programming. Springer, 2011.
- A. Ruszczynski and A. Shapiro (eds.). Stochastic Programming: Handbook in Operations Research and Management Science. Elsevier Science, Amsterdam, 2003.
- J. Goh, M. Sim. Distributionally robust optimization and its tractable approximations. Oper. Res. 2010, 58(4-part-1): 902-917.
- M. Sim. Distributionally robust optimization: A marriage of robust optimization and stochastic programming. 3rd Nordic Optimization Symposium, March 13-14, 2009, Stockholm, Sweden.
- X. Chen, M. Sim, P. Sun. A robust optimization perspective on stochastic programming. Oper. Res. 2007, 55(6): 1058-1071.
- W. Q. Chen, M. Sim. Goal-driven optimization. Oper. Res. 2009, 57(2): 342-357.
- L El Ghaoui, H. Lebret. Robust solutions to least-squares problems with uncertain data. SIAM J. Matrix Anal. A. 1997, 18(4): 1035-1064.
- D. Erick, and Y. Ye. Distributionally robust optimization under moment uncertainty with application to data-driven problems. Oper. Res. 2010, 58(3): 595-612.
- S Zymler, D Kuhn, B Rustem. Distributionally robust joint chance constraints with second-order moment information. Math. Program. 2013, 137(1-2): 167-198.
- S. Mehrotra, H. Zhang. Models and algorithms for distributionally robust least squares problems. Math. Program. Ser. A. 2013, 1-19.
- Vladimir G. Boltyanski and Alexander S. Poznyak. The Robust Maximum Principle-Theory and Applications. Springer Science & Business Media, 2011.
- E. Polak. Optimization algorithms and consisitent approximations. Springer-Verlag, New York, Inc., 1997.
- A.E. Bryson, Applied optimal control: optimization, estimation and control. CRC Press, 1975.
- K. L. Teo, C. J. Goh. A computational method for combined optimal parameter selection and optimal control problems with general constraints. J. Austral. Math. Soc. Ser. B. 1989(30): 350-364.
- R. Loxton, K. L. Teo, V. Rehbock. Robust Suboptimal Control of Nonlinear Systems. Appl. Math. Comput. 2011(217): 6566-6576
- W.F. Feehery, P.I. Barton. Dynamic optimization with state variable path constraints. Computer Chem. Enging. 1998(22): 1241-1256.
- E. Rosenwasser and R. Yusupov. Sensitivity of Automatic Control Systems. Tom Kurfess(eds). CRC Press, 2000.
- R.C. Loxton, K.L. Teo, V. Rehbock, et al, Optimal control problems with a continuous inequality constraint on the state and the control. Automatica, 2009, 45(10): 2250-2257.
- Jorge Nocedal and Stephen J. Wright. Numerical Optimization. Springer, 2006.
- J. Ye, H. Xu, E. Feng, Z. Xiu. Optimization of a fed-batch bioreactor for 1,3-propanediol production using hybrid nonlinear optimal control. J. Process Contr. 24(2014): 1556-1569.
- A.P. Zeng, W. D. Deckwer. A kinetic model for substrate and energy consumption of microbial growth under substrate-sufficient conditions. Biotechnol. Prog. 11(1995): 71-79.