Comments on Truncation Errors
for Polynomial Chaos Expansions
Methods based on polynomial chaos expansion allow to approximate the behavior of systems with uncertain parameters by deterministic dynamics. These methods are used in a wide range of applications, spanning from simulation of uncertain systems to estimation and control. For practical purposes the exploited spectral series expansion is typically truncated to allow for efficient computation, which leads to approximation errors. Despite the Hilbert space nature of polynomial chaos, there are only a few results in the literature that explicitly discuss and quantify these approximation errors. This work derives error bounds for polynomial chaos approximations of polynomial and non-polynomial mappings. Sufficient conditions are established, which allow investigating the question whether zero truncation errors can be achieved and which series order is required to achieve this. Furthermore, convex quadratic programs, whose argmin operator is a special case of a piecewise polynomial mapping, are studied due to their relevance in predictive control. Several simulation examples illustrate our findings.
Uncertainty is inherent to many applications. Considering and counteracting disturbances is becoming ever more important as systems are pushed to the boundaries of operation, for economic reasons or for increased interoperability. By now, many strategies have been developed to predict and counteract disturbances and uncertainties. With respect to systems and control they span robust control [Zhou1996], stochastic, and robust model predictive control (mpc) approaches [Mesbah16, Rawlings2009MPC, limon2009input].
Various methods for uncertainty description, prediction and decision making under uncertainties exist [Xiu10book, jaulin2001applied]. Besides stochastic uncertainty descriptions, deterministic uncertainty descriptions and bounds are often used. In the deterministic setting, uncertainties are typically described by bounded sets, leading to worst-case assumptions and worst-case predictions of the future system behavior.
Instead, stochastic approaches treat the uncertainty as a realization of a random variable (often continuous, second-order) that has to be propagated through given mappings, e.g. system dynamics, to obtain insights on the influence on the variable of interest, or for control. Recently, polynomial chaos expansion (pce) has gained popularity in the field of systems and control to propagate stochastic uncertainty descriptions and to quantify their influence [Mesbah14, Mesbah16, Paulson17a]. pce originates in the works of Norbert Wiener [Wiener38]. In pce the stochastic variables are replaced by an (infinite) sum of weighted orthogonal polynomials [Xiu10book]. Using Galerkin projection, the approximated system is deterministic but of larger dimension than the original system. This expanded system has been used, for example, to design linear controllers [fisher09aut, Du17a, Shen17a], and has been exploited in model predictive control [Fagiano12a, Mesbah14, paulson2014, Lucia15, Muehlpfordt16a].
For the sake of computational tractability, it is necessary to truncate the infinite polynomial chaos expansion to finite order. While the use of pce in the field of systems and control is steadily increasing, it is commonly and frequently assumed that: (i) the input uncertainty can be exactly described using finitely many pce coefficients; (ii) the nonlinear function—denoted in the following by —that maps to the desired output is known analytically;111Here, represents a generic mapping, e.g. the state transition map of a system of ordinary differential equations or of an lti system in discrete or continuous time, a system of nonlinear algebraic equations, or the argmin-operator of a suitable convex optimization problem. and (iii) the output can be exactly realized by a finite number of pce coefficients. Moreover, whereas under these conditions pce is exact (in the L-sense) in the limit, its truncation is often a trade-off between approximation accuracy and computational tractability. For stability and performance guarantees, however, bounding the approximation error is important. As such, stability and performance guarantees derived for the approximated system do not necessarily apply to the original stochastic system.
To the best of the authors’ knowledge, there is only a limited number of results that consider pce truncation errors directly: In [FieldJr04] illustrative examples evaluate the accuracy of pce. Yet, the errors are not computed rigorously. Rather, they are studied via extensive simulations. Similarly, [Debusschere04] list several numerical challenges when using pce, including the potential need for large pce dimensions.
For mpc-specific applications of pce, error bounds on the first- and second-order moments, which are polynomial functions of pce coefficients, are established in [Lucia17]. These results provide a deep insight, however, no bounds on the error of the underlying projections in Hilbert spaces are given. The authors of [Augustin08] provide an upper bound on the truncation error using a univariate Hermitian basis based on differentiability assumptions of . Yet, these results do not easily carry over to other bases.
The main contribution of the present paper is to leverage well-established Hilbert space theory [Luenberger69book] to the end of quantifying truncation errors for in the -sense for multivariate uncertainties considering applications in the field of systems and control. We provide exact error descriptions instead of error bounds. Considering polynomial and non-polynomial mappings , we tackle the question of how to choose the output pce dimension such that the truncation error vanishes. Moreover, we establish bounds that allow the computation of minimum pce dimensions such that a user-specified error tolerance is met. Furthermore, we study truncation errors for convex quadratic programs due to their relevance in (stochastic) model predictive control. Illustrative examples accompany the findings.
The remainder is organized as follows: Section II introduces pce and the tackled research questions. Section III establishes results on pce truncation errors for polynomial mappings. Also, truncation errors for convex quadratic programs are derived as they typically appear in model predictive control. Section IV derives error bounds for the non-polynomial case.
Ii Problem Formulation
We consider random variables that are the image of random variables under the square-integrable mapping
We assume that and are real-valued second-order random variables with multivariate -valued stochastic germ , cf. [Sullivan15book, Xiu10book]. The mapping can, for example, describe the state transition map for continuous-time or discrete-time systems subject to uncertainties in the system matrix. Also, it can describe the influence of a set of uncertain parameters/initial conditions on the output. Describing the dependence of the random variable on the properties of the random variable is in general challenging. One way to do this is via pce, which allows any random variable with finite second-order moment to be represented as a (possibly infinite) series of weighted orthogonal polynomials [Xiu10book, Mesbah14, Sullivan15book]. The main question of interest in the present paper is to quantify the approximation error made if the series expansion is terminated early.
Ii-a Polynomial Chaos Expansion
We first focus on describing real-valued random variables from the Hilbert space of equivalence classes of univariate real-valued second-order random variables given the probability space .222With slight abuse of terminology, we will denote the set of equivalence classes of random variables simply as the set of random variables. First note that the Hilbert space over the product probability space is (under mild technical assumptions) equivalent to the Hilbert space tensor product and given by
where , and denotes the tensor product, cf. [Sullivan15book].333In the following stands for the stochastic germ spanning . Assume that the set of -variate polynomials spans the space and satisfies the orthogonality relation for all
with Kronecker-delta , and the induced norm . This allows the pce of to be defined as:
Definition 1 (Polynomial chaos expansion)
The polynomial chaos expansion of a real-valued random variable is
where is called the th pce coefficient [Sullivan15book, Xiu10book].
In practice, the truncated pce of is often considered to allow for more efficient calculations:
Definition 2 (Truncated polynomial chaos expansion)
The truncated pce of is
where is the dimension of the subspace spanned by ; pce dimension in short. The basis is chosen to contain all -variate polynomials of degree at most (in lexicographical order), yielding
Often, the stochastic germ is chosen to follow a Gaussian, Beta, Gamma, or uniform distribution (or a tensorized combination thereof) [Mesbah14, Fagiano12a, Kim13a, Muehlpfordt17a, Mesbah16, Lucia15]. Note that no specific assumption w.r.t. the character of is made in the context of this work.
Ii-B Truncation Error and Mappings
The truncation error can be shown to be orthogonal to the pce , i.e. . The error also satisfies [Sullivan15book, Xiu10book, Cameron1947]. Furthermore, if the weight to which the polynomials are orthogonal matches the (product) measure , convergence of the above limit is known to be exponential [Cameron1947, Wiener38, Xiu02]. Besides exponential convergence several questions with respect to the truncation error are immediate: First, is it possible to describe a random variable and its mapping precisely by a finite pce? Second, if the pce is truncated early, is it possible to establish an error bound on ? To answer the above questions we define the minimum degree of a pce as follows:
Definition 3 (Minimum expansion degree)
The minimum expansion degree of is the number such that all pce coefficients associated with higher-degree basis polynomials are zero, i.e. for all with .
Assumption 1 (Exact pce input)
For a given orthogonal polynomial basis , the pce of the real-valued random variable has the known and finite minimum degree , and pce coefficients, cf. (6).
In other words, Assumption 1 implies that a finite number of pce coefficients yields a vanishing truncation error for the input uncertainty; i.e.,
Indeed, for many applications Assumption 1 is assumed to hold for a minimum degree of ; in other words, Gaussian, Beta, Gamma, or uniform distributions are employed to model uncertainties [Mesbah14, Fagiano12a, Kim13a, Muehlpfordt17a, Mesbah16, Lucia15].
We now turn back to the main question of quantifying the approximation error when describing the random variable as a function of as given by (1). For the case of univariate Gaussian stochastic germ the following result is known.
Lemma 1 (Bound in univariate Hermite basis [Augustin08])
Let , and let the stochastic germ be a standard Gaussian random variable. Consider , , where with square-integrable. Furthermore, let be times continuously differentiable with . Then, for it holds that
where is the th probabilists’ Hermite polynomial.
Lemma 1 provides an error bound specifically tailored to univariate Gaussian uncertainties. However, obtaining sharp bounds for the general case is especially important for many applications in systems and control, specifically w.r.t. performance and stability results. Two questions arise:
Choosing the pce dimension equal to the pce input dimension, i.e. , what truncation error is made given a square-integrable nonlinear mapping ?
What is the minimum pce dimension such that a zero truncation error is attained for ?
Iii Truncation Errors for Polynomial Maps
Let with be real-valued random variables. Moreover, let be a complete subspace of dimension generated by the orthogonal polynomial basis functions . For ease of presentation we assume that all have the same pce dimension.
Theorem 1 (Error under polynomial mapping)
Suppose that all satisfy Assumption 1 with minimum degree and a respective orthogonal basis, and let be a square-integrable polynomial mapping of degree such that . Then, the magnitude of the truncation error is
and are the pce coefficients of .
Proof: The pce for is given by
where . Substituting this into the polynomial mapping , one obtains
The highest-degree polynomial term of has degree , thus enlarging the basis by elements. The number of basis elements is given by (6) with . The orthogonal projection of onto yields . Consequently, the truncation error becomes , which is zero in case of . For , apply Parseval’s identity to obtain , cf. [Bronstein13].
Corollary 1 (Error for polynomial )
Given a polynomial mapping such that with , and choosing the pce dimension equal to the pce input dimension , the truncation error is given by from (9).444In order to evaluate (9), the terms have to be computed. This raises the question of the attached computational complexity. The terms can, for example, be computed using Gauss quadrature, which has a manageable computation complexity, e.g. with the approach from [Townsend16].
Furthermore, the minimum dimension is .
It is fair to ask for a comparison w.r.t. the error bound from Lemma 1. To provide insight into this question, consider the setting from Lemma 1 (univariate Gaussian stochastic germ) in combination with a polynomial mapping .
Let , , and let be a Gaussian random variable with mean and standard deviation . Consider the mapping . If , i.e. the subspace is spanned by the first two Hermite polynomials, Assumption 1 is satisfied with . In other words, the pce coefficients of are and . Direct inspection shows that . The error becomes with norm . For derivatives the respective error (8) becomes . The minimum exact pce degree for is . Adding another basis function , the projection error becomes zero. Table I shows the squared norm of for ascending input degree and symbolic pce input coefficients . Exactness of the error can only be ensured in the case of .555 It is worth asking for general conditions such that the error bound (8) is tight. Specifically for , and , the errors (9) and (8) with are identical [Augustin08, Corollary 2.2], namely . General conditions are, however, beyond the scope of this paper. In the other cases shown, holds.
Uncertainty Quantification for Quadratic Programs
It is evident that Theorem 1 can be applied to discrete-time lti systems subject to uncertainties, whenever the state transition map is polynomial in the uncertainty. In the following, we focus on uncertain convex quadratic programs (qps) due to their important role in systems and control. For example, model predictive control for discrete-time lti systems with convex polytopic constraints and a convex quadratic cost function is well-known to be equivalent to solving a qp repeatedly online at each time instant [maciejowski2002predictive, Rawlings2009MPC]. Also, qps are the basis for sequential quadratic programming methods for solving nonlinear programs that are encountered in nonlinear mpc. In many cases, however, the problem data of the qp is uncertain—in these cases pce is of advantage.
Problem 1 (qp with uncertain data)
Let be an -valued random vector with elements . Also, let be an -valued random vector with elements . Set , where denotes the matrix transpose. All random variables satisfy Assumption 1, each with known and exact finite pce dimension .666For the sake of brevity of notation, we demand the same dimension . Let be a complete subspace of dimension generated by the orthogonal polynomial basis functions . Consider
for positive definite and a non-empty feasible set . The entries of the vectors and are realizations of the vector-valued random variables and , respectively. Then, the problem is to find the -valued random variable and quantify the element-wise truncation error for all .
Remark 1 (qps and mpc)
In case of linear-quadratic mpc, Problem 1 is equivalent to considering uncertainty with respect to the initial condition at every time instant [maciejowski2002predictive]. This uncertainty may be due to state estimation, or a lack of measurement precision/availability.
pce allows the influence of on to be specified as follows.
Theorem 2 (Uncertainty quantification for convex qps)
For all realizations of , let the active constraints in Problem 1 satisfy the linear inequality constraint qualification (licq) at the optimal solution .
If the pce dimension of is chosen according to , then its element-wise truncation error becomes zero, i.e.
If the pce dimension of is chosen as , and if the set of active constraints is the same for all realizations of , then the element-wise truncation error becomes
where , are the th rows with of the matrices , that satisfy
The active constraint selection matrix is constructed from the active set and has elements for , zero elsewhere.
Proof: Part (i)—Regardless of the realizations of there always exists a (possibly empty) set of active constraints for the optimal solution . Rewrite (13) as
where selects the active constraints. The kkt conditions become
Due to licq the coefficient matrix is invertible, yielding exactly one solution. The argmin-operator maps the realizations linearly to . Consequently, it maps the random variable linearly to the random variable . Thus, Theorem 1 is applicable with .
Part (ii)—Because the set of active constraints is supposed to be for all realizations, the kkt conditions (17) hold in terms of a function of random variables
where (15) and are used. Invertibility follows again from licq. Consequently for all ,
and the result follows from Theorem 1 with .
Remark 2 (Extension to changes in the active set)
Note that even if the active set changes, part i) of Theorem 2 still holds. Furthermore, the error description from part ii) can be turned into an upper bound by considering the worst case active set, which maximizes . Due to space limitations, we leave the details to future work.
Note that the computation of the pce coefficients (18) is effectively instantaneous, i.e. requires no significant computation time, whilst being exact. Arguably, this is not true for Monte Carlo or other sampling-based methods. To illustrate Theorem 2 in use, we consider the following example.
Consider linear-quadratic mpc for an lti discrete-time model of an aircraft. The open-loop optimal control problem can be cast as a qp [maciejowski2002predictive]. The numerical values for the nominal system and weights are taken from [Lucia15]; the horizon length is . The input is the rate of change of the elevator angle, which introduces discrete-time integral action and an additional state. Uncertainty is introduced via the initial condition for the altitude: it is modeled by the random variable that follows a Beta distribution on with shape parameters , , yielding the uncertain initial condition . Assumption 1 is satisfied with and the pce coefficients are , for a Jacobi polynomial basis. Following Theorem 2, a Jacobi polynomial basis with allows a zero pce truncation error in the decision variable . Figure 1 shows the evolution of the -interval of the optimal input over time—note that the realization of the optimal random variable resembles the input to the system. For all realizations of the initial condition the constraints for the second state are active on the interval s. The corresponding optimal input trajectory over s is deterministic, as shown in Figure (a)a. In terms of pce coefficients, this is equivalent to all pce coefficients of order greater than zero being zero, yielding a Dirac--distribution. It is after the constraints become inactive that uncertainty plays a role; depicted in Figure (b)b for s.777The histogram rather than the pdf is shown for sake of readability, because the peak of the pdfs for are orders of magnitude larger. Because the closed-loop system is asymptotically stable, the input uncertainty eventually fades out, resulting again in Dirac--distributions.
Iv Truncation Errors for Non-Polynomial Maps
Theorem 3 (Error for non-polynomial mapping)Let all satisfy Assumption 1 with minimum degree and a respective orthogonal basis, and let be a square-integrable mapping such that . Then, the magnitude of the truncation error is (19) where is positive definite, and with for all .
Proof: The pce coefficients of satisfy
which follows from orthogonality of the basis spanning . The vector of pce coefficients contains all pce coefficients . The truncation error satisfies
Corollary 2 (Error for non-polynomial )
Given a non-polynomial mapping such that with , and choosing the pce dimension equal to the pce input dimension , the truncation error is given by from (19).
No general statement is possible.888Note that in the univariate case a zero truncation error in general requires an infinite pce dimension, because a non-polynomial function cannot be represented exactly by a linear combination of a finite polynomial basis. However, for a user-specified error threshold the according minimum pce dimension is obtained from Theorem 3.
We illustrate Theorem 3 for a continuous-time lti example with lqr and an uncertain system matrix.
Consider the continuous-time lti dynamics for a modified aircraft model from [maciejowski2002predictive]. The initial condition is , and
where . The realization corresponds to the nominal system matrix . The control is computed via lqr using the weights , for the nominal system . Now apply the above feedback to the uncertain system matrix. The closed-loop altitude trajectories are given in Figure 2 (left) for best case and worst case realizations, clearly showing the performance degradation under uncertainty. The uncertainty is mapped to the state via the state transition map . Figure 2 shows the altitude truncation error from (19) over time for increasing highest-degree . The basis consists of Legendre polynomials. The closed-loop system is asymptotically stable for all realizations of , hence the truncation error decays to zero. However, it is clearly non-monotonic over time. Note how over- and undershooting of the deterministic solution, Figure 2 left, carry over to the pce error, Figure 2 right.
As to be expected, in case of a polynomial , result (9) is recovered and computationally cheaper.
Proof: For polynomial , the exact pce for is given from (12). Consequently, for we have
For all with , the numerators become