The Scenario Approach for Stochastic Model Predictive Control with Bounds on Closed-Loop Constraint Violations††thanks: This manuscript is the preprint of a paper submitted to Automatica and it is subject to Elsevier copyright. Elsevier maintains the sole rights of distribution or publication of the work in all forms and media. If accepted, the copy of record will be available at http://www.journals.elsevier.com/automatica/.
Many practical applications of control require that constraints on the inputs and states of the system be respected, while optimizing some performance criterion. In the presence of model uncertainties or disturbances, for many control applications it suffices to keep the state constraints at least for a prescribed share of the time, as e.g. in building climate control or load mitigation for wind turbines. For such systems, a new control method of Scenario-Based Model Predictive Control (SCMPC) is presented in this paper. It optimizes the control inputs over a finite horizon, subject to robust constraint satisfaction under a finite number of random scenarios of the uncertainty and/or disturbances. While previous approaches have shown to be conservative (i.e. to stay far below the specified rate of constraint violations), the new method is the first to account for the special structure of the MPC problem in order to significantly reduce the number of scenarios. In combination with a new framework for interpreting the probabilistic constraints as average-in-time, rather than pointwise-in-time, the conservatism is eliminated. The presented method retains the essential advantages of SCMPC, namely the reduced computational complexity and the handling of arbitrary probability distributions. It also allows for adopting sample-and-remove strategies, in order to trade performance against computational complexity.
Model Predictive Control (MPC) is a powerful approach for handling multi-variable control problems with constraints on the states and inputs. Its feedback control law can also incorporate feedforward information, e.g. about the future course of references and/or disturbances, and the optimization of a performance criterion of interest.
Over the past two decades, the theory of linear and robust MPC has matured considerably . There are also widespread practical applications in diverse fields . Yet many potentials of MPC are still not fully uncovered.
One active line of research is Stochastic MPC (SMPC), where the system dynamics are of a stochastic nature. They may be affected by additive disturbances [3, 10, 13, 14, 18, 19], by random uncertainty in the system matrices , or both [12, 15, 25, 30]. In this framework, a common objective is to minimize a cost function, while the system state is subject to chance constraints, i.e. constraints that have to be satisfied only with a given probability.
Stochastic systems with chance constraints arise naturally in some applications, such as building climate control , wind turbine control , or network traffic control . Alternatively, they can be considered as relaxations of robust control problems, in which the robust satisfaction of state constraints can be traded for an improved cost performance.
A major challenge in SMPC is the solution to chance-constrained finite-horizon optimal control problems (FHOCPs) in each sample time step. These correspond to non-convex stochastic programs, for which finding an exact solution is computationally intractable, except for very special cases [17, 31]. Moreover, due to the multi-stage nature of these problems, it generally involves the computation of multi-variate convolution integrals .
In order to obtain a tractable solution, various sample-based approximation approaches have been considered, e.g. [2, 4, 32]. They share the significant advantage of coping with generic probability distributions, as long as a sufficient number of random samples (or ‘scenarios’) can be obtained. The open-loop control laws can be approximated by sums of basis functions, as in the Q-design procedure proposed by . However, these early approaches of Scenario-Based MPC (SCMPC) remain computationally demanding  and/or of a heuristic nature, i.e. without specific guarantees on the satisfaction of the chance constraints [4, 32].
More recent approaches [6, 7, 21, 24, 28, 33] are based on advances in the field of scenario-based optimization. However, these approaches share the drawback of being conservative when applied in a receding horizon fashion, i.e. the focus is either on obtaining a robust solution [6, 7, 33] or the chance constraints are over-satisfied by the closed loop system [21, 24, 28].
This conservatism of SCMPC represents a major practical issue, that is resolved by the contributions of this paper. In contrast to the previous results, the novel approach interprets the chance constraints as a time average, rather than pointwise-in-time with a high confidence, which is much less restrictive. Furthermore, the sample size is reduced by exploiting the structural properties of the finite-horizon optimal control problem . The approach also allows for the presence of multiple simultaneous chance constraints on the state, and an a-posteriori removal of adverse samples for improving the controller performance .
In the most general setting, this paper considers linear systems with stochastic additive disturbances and uncertainty in the system matrices, which may only be known through a sufficient number of random samples. The computational complexity can be traded against performance of the controller by removing samples a-posteriori, starting from a simple convex linear or quadratic program and converging to the optimal SMPC solution in the limit.
The paper is organized as follows: Section 2 presents a rigorous formulation of the optimal control problem that one would like to solve; Section 3 describes how an approximated solution is obtained by SCMPC; Section 4 develops the theoretical details, including the technical background and closed-loop properties; Section 5 demonstrates the application of the method to a numerical example; and Section 6 presents the main conclusions.
2 Optimal Control Problem
Consider a discrete-time control system with a linear stochastic transition map
for some fixed initial condition . The system matrix and the input matrix as well as the additive disturbance are random, as they are (known) functions of a primal uncertainty . For notational simplicity, comprises all uncertain influences on the system at time .
Assumption 1 (Uncertainty)
(a) The uncertainties , are independent and identically distributed (i.i.d.) random variables on a probability space . (b) A ‘sufficient number’ of i.i.d. samples from can be obtained, either empirically or by a random number generator.
The support set of and the probability measure on are entirely generic. In fact, and need not be known explicitly. The ‘sufficient number’ of samples, which is required instead, will become concrete in later sections of the paper. Note that any issues arising from the definition of a -algebra on are glossed over in this paper, as they are unnecessarily technical. Instead, every relevant subset of is assumed to be measurable.
The system (1) can be controlled by inputs , to be chosen from a set of feasible inputs . Since the future evolution of the system (1) is uncertain, it is generally impractical to indicate all future inputs explicitly. Instead, each should be determined by a static feedback law
based only on the current state of the system.
The optimal state feedback law should be determined in order to minimize the time-average of expected stage costs ,
Each stage cost is taken in expectation , since its arguments and are random variables, being functions of . The time horizon is considered to be very large, yet it may not be precisely known at the point of the controller design.
The minimization of the cost is subject to keeping the state inside a state constraint set for a given fraction of all time steps. For many applications, the robust satisfaction of the state constraint (i.e. at all times ) is too restrictive for the choice of , and results in a poor performance in terms of the cost function. This is especially true in cases where the lowest values of the cost function are achieved close to the boundary of . Moreover, it may be impossible to enforce if the support of is unknown and possibly unbounded.
In order to make this more precise, let denote the random variable indicating that , i.e. is the indicator function on the complement of . The expected time-average of constraint violations should be upper bounded by some ,
Assumption 2 (Control Problem)
(a) The state of the system can be measured at each time step . (b) The set of feasible inputs is bounded and convex. (c) The state constrained set is convex. (d) The stage cost is a convex function.
Assumption 2(b) holds for most practical applications, and very large artificial bounds can always be introduced for input channels without natural bounds. Typical choices for the stage cost include
where and are positive semi-definite weighting matrices. Typical choices for the constraints and are polytopic or ellipsoidal sets.
Combining the previous discussions, the optimal control problem (OCP) can be stated as follows:
Remark 3 (Alternative Formulations)
(a) Instead of the sum of expected values, the cost function (5a) can also be defined as a desired quantile of the sum of discounted stage costs. Then the problem formulation corresponds to a minimization of the ‘value-at-risk’, see e.g. . (b) Multiple chance constraints on the state , each with an individual probability level , can be included without further complications. A single chance constraint is considered here for notational simplicity.
Many practical control problems can be cast in the general form of (5). For example in building climate control , the energy consumption of a building should be minimized, while its internal climate is subject to uncertain weather conditions and the occupancy of the building. The comfort range for the room temperatures may occasionally be violated without major harm to the system. Another example is wind turbine control , where the power efficiency of a wind turbine should be maximized, while its dynamics are subject to uncertain wind conditions. High stress levels in the blades must not occur too often, in order to achieve a desired fatigue life of the turbine.
3 Scenario-Based Model Predictive Control
The OCP is generally intractable, as it involves an infinite-dimensional decision variable (the state feedback law) and a large number of constraints (growing with ). Therefore it is common to approximate it by various approaches, such as Model Predictive Control (MPC).
3.1 Stochastic Model Predictive Control (SMPC)
The basic concept of MPC is to solve a tractable counterpart of (5) over a small horizon repeatedly at each time step. Only the first input of this solution is applied to the system (1). In Stochastic MPC (SMPC), a Finite Horizon Optimal Control Problem (FHOCP) is formulated by introducing chance constraints on the state:
Here and denote predictions and plans of the state and input variables made at time , for steps into the future. The current measured state is introduced as an initial condition for the dynamics. The predicted states are understood to be eliminated by recursive substitution of (6b). Note that the predicted states are random by the influence of the uncertainties .
The probability levels in the chance constraints (6c) usually coincide with from the OCP [14, 23, 30], but they may generally differ . Some formulations also involve chance constraints over the entire horizon [19, 12], or as a combination with robust constraints [18, 10]. Other alternatives of SMPC consider integrated chance constraints , or constraints on the expectation of the state .
Remark 4 (Terminal Cost)
The state feedback law provided by SMPC is given by a receding horizon policy: the current state is substituted into (6b), then the FHOCP is solved for an input sequence , and the current input is set to . This means that the FHOCP must be solved online at each time step , using the current measurement of the state .
However, the FHOCP is a stochastic program that remains difficult to solve, except for very special cases. In particular, the feasible set described by chance constraints is generally non-convex, despite of the convexity of , and hard to determine explicitly. Hence a further approximation shall be made by scenario-based optimization.
3.2 Scenario-Based Model Predictive Control (SCMPC)
The basic idea of Scenario-Based MPC (SCMPC) is to compute an optimal finite-horizon input trajectory that is feasible under of sampled ‘scenarios’ of the uncertainty. Clearly, the scenario number has to be selected carefully in order to attain the desired properties of the controller. In this section, the basic setup of SCMPC is discussed, while the selection of a value for is deferred until Section 4.
More concretely, let be i.i.d. samples of , drawn at time for the prediction steps . For convenience, they are combined into full-horizon samples , also called scenarios. The Finite-Horizon Scenario Program (FHSCP) then reads as follows:
The dynamics (7b) provide different state trajectories over the prediction horizon, each corresponding to one sequence of affine transition maps defined by a particular scenario . Note that these state trajectories are not fixed, as they are still subject to the inputs . The cost function (7a) approximates (6a) as an average over all scenarios. The state constraints (7c) are required to hold for sampled state trajectories over the prediction horizon.
Applying a receding horizon policy, the SCMPC feedback law is defined as follows (see also Figure 1, for ). At each time step the current state measurement is substituted into (7b), and the current input is set to the first of the optimal FHSCP solution , which is called the scenario solution.
Unlike many MPC approaches, SCMPC does not have an inherent guarantee of recursive feasibility, in the sense of [22, Sec. 4]. Hence for a proper analysis of the closed-loop system, the following is assumed.
Assumption 5 (Resolvability)
Under the SCMPC regime, each FHSCP admits a feasible solution at every time step almost surely.
While Assumption 5 appears to be restrictive from a theoretical point of view, it is often reasonable from a practical point of view. For some applications, such as buildings , recursive feasibility may hold by intuition, or it may be ensured by the use of soft constraints [26, Sec. 2]. All in all, MPC remains a useful tool in practice, even for difficult stochastic systems (1) without the possibility of an explicit guarantee of recursive feasibility.
Remark 6 (Alternative Formulations)
(a) Instead of the average cost in (7a), the minimization may concern the cost of a nominal trajectory, as e.g. in [28, 24]; or the average may be taken over any sample size other than . (b) The inclusion of additional chance constraints into (7), as mentioned in Remark 3(b), is straightforward. The number of scenarios may generally differ between multiple chance constraints. (c) In case of a value-at-risk formulation, as in Remark 3(a), the average cost in (7a) is replaced by the maximum:
where the sample size must be selected according to the desired risk level.
Remark 7 (Control Parameterization)
In the FHSCP, the predicted control inputs may also be parameterized as a weighted sum of basis functions of the uncertainty, as proposed in [32, 33]. In particular, let be the unit vectors in , and for each time step let be a finite set of pre-selected basis functions. Then
can be substituted into problem (7), so that the weights for and become the new decision variables.
A control parameterization with an increasing number of basis functions generally improves the quality of the SCMPC feedback, while increasing the number of decision variables and hence the computational complexity; see [32, 33] for more details.
Given the sampled scenarios, (7) is a convex optimization program for which efficient solution algorithms exist, depending on its structure . In particular, if and are polytopic (respectively ellipsoidal) sets, then the FHSCP has linear (second-order cone) constraints. If the stage cost is either (4a,b), then the FHSCP has a reformulation with a linear objective function, using auxiliary variables. If the stage cost is (4c), then the FHSCP can be expressed as a quadratic program. More details on these formulation procedures are found in [20, pp. 154 f.].
3.3 A-Posteriori Scenario Removal
A key merit of SCMPC is that it renders the uncertain control system (6b) into multiple deterministic affine systems (7b) by substituting particular scenarios. This significantly simplifies the solution to the FHSCP, as compared to the FHOCP. However, by introducing these random scenarios, a randomizing element is added to the SCMPC feedback law. In particular, the closed-loop system may occasionally show an erratic behavior due to highly unlikely outliers in the sampled scenarios.
This effect can be mitigated by a-posteriori scenario removal, see . This allows for the state constraints (7c) corresponding to scenarios to be removed after the outcomes of all samples have been observed. In exchange, the original sample size must be (appropriately) increased over its value for . Any appropriate combination is called a sample-removal pair. The choice of appropriate values for and is deferred to Section 4. The selection of removed scenarios is performed by a (scenario) removal algorithm [9, Def. 2.1].
Definition 8 (Removal Algorithm)
(a) For each , the (scenario) removal algorithm is a deterministic function selecting out of scenarios . (b) The selected scenarios at time step shall be denoted by
Definition 8 is very general, in the sense that it covers a great variety of possible scenario removal algorithms. However, the most common and practical algorithms are described below:
- Optimal Removal:
The FHSCP is solved for all possible combinations of choosing out of scenarios. Then the combination that yields the lowest cost function value of all the solutions is selected. This requires the solution to choose instances of the FHSCP, a complexity that is usually prohibitive for larger values of .
- Greedy Removal:
The FHSCP is first solved with all scenarios. Then, in each of consecutive steps, the state constraints of a single scenario are removed that yields the biggest improvement, either in the total cost or in the first stage cost. Thus the procedure terminates after solving instances of FHSCP.
- Marginal Removal:
The FHSCP is first solved with the state constraints of all scenarios. Then, in each of consecutive steps, the state constraints of a single scenario are removed based on the highest Lagrange multiplier. Hence the procedure requires the solution to instances of FHSCP.
Figure 1 depicts an algorithmic overview of SCMPC, for the general case with scenario removal . For the case without scenario removal, consider and the selected scenarios .
4 Problem Structure and Sample Complexity
For the SCMPC algorithm described in Section 3, the sample-removal pair remains to be specified. Appropriate values for and are theoretically derived in this section. Their values generally depend on the control system and the constraints, and is referred to as the sample complexity of the SCMPC problem.
For some intuition about this problem, suppose that is fixed and the sample size is increased. This means that the solution to the FHSCP becomes robust to more scenarios, with the following consequences. First, the average-in-time state constraint violations (3) decrease, in general. Therefore the state constraint will translate into a lower bound on . Second, the computational complexity increases as well as the average-in-time closed-loop cost (2), in general. Therefore the objective is to choose as small as possible, and ideally equal to its lower bound.
The higher the number of removed constraints , the higher will be the lower bound on , in order for the state constraints (3) to be satisfied. Now consider pairs of removed constraints together with their corresponding lower bounds , which equally satisfy the state constraints (3). For the intuition, suppose is increased, so increases as well. Then the computational complexity grows, due to more constraints in the FHSCP and the removal algorithm. At the same time, the solution quality of the FHSCP improves, in general, and hence the average-in-time closed-loop cost (2) decreases. Therefore is usually fixed to a value that is as high as admitted by the available computational resources.
4.1 Support Rank
According to the classic scenario approach [8, 9], the relevant quantity for determining the sample size for a single chance constraint (with a fixed ) is the number of support constraints [8, Def. 2.1]. In fact, grows with the (unknown) number of support constraints, so the goal is to obtain a tight upper bound. For the classic scenario approach, this upper bound is given by the dimension of the decision space [8, Prop. 2.2], i.e. in the case of the FHSCP.
The FHSCP is a multi-stage stochastic program, with multiple chance constraints (namely , one per stage). This requires an extension to the classic scenario approach; the reader is referred to  for more details. Now each chance constraint contributes an individual number of support constraints, to which an upper bound must be obtained. These individual upper bounds are provided by the support rank of each chance constraint [29, Def. 3.6].
Definition 9 (Support Rank)
(a) The unconstrained subspace of a constraint in (7c) is the largest (in the set inclusion sense) linear subspace of the search space that remains unconstrained by all sampled instances of , almost surely. (b) The support rank of a constraint in (7c) is
where represents the dimension of the unconstrained subspace .
Note that the support rank is an inherent property of a particular chance constraint and it is not affected by the simultaneous presence of other constraints. Hence the set of constraints of the FHSCP may change, for instance, due to the reformulations of Remark 3.
Besides the extension to multiple chance constraints, the support rank has the merit of a significant reduction of the upper bound on the number of support constraints. Indeed, the following two lemmas replace the classic upper bound with much lower values, such as or , depending on the problem structure.
For systems affected by additive disturbances only, the support rank of any state constraint in the FHSCP is given by the support rank of in (i.e. the co-dimension of the largest linear subspace that is unconstrained by ).
Lemma 10 (Pure Additive Disturbances)
For systems affected by additive and multiplicative disturbances, Lemma 10 no longer holds. However, it will be seen that for the desired closed-loop properties, the relevant quantity for selecting the sample size is the support rank of the state constraint on only. For this first predicted step, the support rank is restricted to at most , under both additive and multiplicative disturbances.
Lemma 11 (Additive and Multiplicative Disturbances)
The support rank of constraint in (7c) is at most .
4.2 Sample Complexity
This section describes the selection of the sample-removal pair , based on a bound of the support rank . Throughout this subsection, the initial state is considered to be fixed to an arbitrary value.
Let denote the (first step) violation probability, i.e. the probability with which the first predicted state falls outside of :
Recall that denotes the first input of the scenario solution . Clearly, and depend on the scenarios that are substituted into the FHSCP at time . The notation and shall be used occasionally to emphasize this fact.
The violation probability can be considered as a random variable on the probability space , with support in . Here and denote the -th product of the set and the measure , respectively. For distinction, the expectation operator on is denoted , and that on is denoted .
The distribution of is unknown, being a complicated function of the entire control problem (6) and the removal algorithm . However, it is possible to derive the following upper bound on this distribution.
Lemma 12 (Upper Bound on Distribution)
Proof. The proof is a straightforward extension of [29, Thm. 6.7], where the bound on is saturated at .
This paper exploits the result of Lemma 12 to obtain an upper bound on the expectation
A reformulation via the indicator function yields that
Definition 13 (Admissible Sample-Removal Pair)
A sample-removal pair is admissible if its substitution into (4.2) yields .
Whether a given sample-removal pair is admissible can be tested by performing the one-dimensional numerical integration (4.2). It can easily be seen that the integral value (4.2) monotonically decreases with and monotonically increases with . Hence, if either or is fixed, an admissible sample-removal pair can be determined e.g. by a bisection method. Moreover, if is fixed, there always exist large enough to generate an admissible pair .
Remark 14 (No Scenario Removal)
If , the integration (4.2) can be replaced by the exact analytic formula
Figure 2 illustrates the monotonic relationship of the upper bound (4.2) in and . Supposing that is fixed, the corresponding admissible pair can be found by moving along the graphs until the desired violation level is reached. The solid and the dashed line correspond to different support dimensions and .
4.3 Closed-Loop Properties
This section analyzes the closed-loop properties of the control system under the SCMPC law for an admissible sample-removal pair . To this end, the underlying stochastic process is first described. Recall that
is the closed-loop trajectory, where depends on all past uncertainties as well as all past scenarios ;
are the violation probabilities, where depends on and , and hence on and ;
indicate the actual violation of the constraints, where depends on , and hence on and .
At each time step , there are a total of random variables, namely the scenarios together with the disturbance . In order to simplify notations, define
for any . These auxiliary variables allow for the random variables , , to be expressed in terms of their elementary uncertainties. Moreover, let denote the probability measure and the expectation operator on , for any .
Observe that is a Bernoulli random variable with (random) parameter , because
for any values of .
Proof. By linearity of the expectation operator,
by virtue of (4.3). Moreover, for any ,
where the integrand is pointwise upper bounded by because is an admissible sample-removal pair.
Theorem 15 shows that the chance constraints of the OCP can be expected to be satisfied over any finite time horizon . The next Lemma 16 sets the stage for an even stronger result, Theorem 17, showing that the chance constraint are satisfied almost surely as .
Proof. For any , define and observe that
by virtue of (4.3). In probabilistic terms, this says that is a sequence of martingale differences. Moreover,
almost surely, because is bounded for . Therefore [16, Thm. 2.17] can be applied, which yields that
Note that Lemma 16 does not imply that
almost surely, because it is not clear that the right-hand side converges almost surely. However, if it converges almost surely, then (20) holds.
5 Numerical Example
5.1 System Data
Consider the stochastic linear system
where . Here is uniformly distributed on the interval and are normally distributed with mean and variance . The inputs are confined to
and two state constraints are considered:
either individually or in combination . The stage cost function is chosen to be of the quadratic form (4c), with the weights and . The MPC horizon is set to .
5.2 Joint Chance Constraint
The support rank of the joint chance constraint is bounded by . Figure 3 depicts a phase plot of the closed-loop system trajectory, for two admissible sample-removal pairs (a) and (b) , corresponding to . Instances in which the state trajectory leaves are indicated in red. Note that the distributions are centered around a similar mean in both cases, however the case features stronger outliers than .
Table 1 shows the empirical results of a simulation of the closed-loop system over time steps. Note that there is essentially no conservatism in the case of no removals (). Some minor conservatism is present for small removal sizes, disappearing asymptotically as . At the same time, the reduction of the average closed-loop cost is minor for this example, while the standard deviation is affected significantly.
To highlight the impact of the presented SCMPC approach, the results of Table 1 can be compared to those of previous SCMPC approaches [28, 6]. The sample size is (compared to about ), and the empirical share of constraint violations in closed-loop is (compared to about ). These figures become even worse when longer horizons are considered; e.g. for , previous approaches require about 900 samples and yield about violations.
5.3 Individual Chance Constraints
For the same example, the two chance constraints and are now considered separately, with the individual probability levels and . Each support rank is bounded by . Figure 4 depicts a phase plot of the closed-loop system trajectory, for the admissible sample-removal pairs (a) , and (b) , .
Table 2 shows the empirical results of a simulation of the closed-loop system over time steps. Note that there is very little conservatism in all cases. As in the previous example, the reduction of the average closed-loop cost is minor, while the standard deviation is affected significantly.