Chance-Constrained Optimization With Tight Confidence Bounds

Mark Cannon, Engineering Science Dept., University of Oxford, OX1 3PJ, UK. mark.cannon@eng.ox.ac.uk
Abstract

This paper considers convex sample approximations of chance-constrained optimization problems, in which the chance constraints are replaced by sets of sampled constraints. We show that, if a subset of sampled constraints is discarded, then the use of a randomized sample selection strategy allows tight bounds to be derived on the probability that the solution of the sample approximation is feasible for the original chance constraints. These confidence bounds are shown to be tighter than the bounds that apply if constraints are discarded according to optimal or greedy discarding strategies. We further show that the same confidence bounds apply to solutions that are obtained from a two-stage process in which a sample approximation of a chance-constrained problem is solved, then an empirical measure of the violation probability of the solution is obtained by counting the number of violations of an additional set of sampled constraints. We use this result to design a repetitive scenario approach that meets required tolerances on violation probability given any specified a priori and a posteriori probabilities. These bounds are necessarily tighter than confidence bounds available for previously proposed repetitive scenario approaches, and we show that the posterior bounds are exact for a particular problem subclass. The approach is illustrated through numerical examples, and extensions to problems involving multiple chance constraints are discussed.

Keywords: Chance Constraints; Randomized Methods; Stochastic Programming.

1 Introduction

An important class of optimization problems involves chance constraints, namely constraints dependent on stochastic parameters, which are required to hold with specified probabilities. Solution methods and applications for optimization under chance constraints were first considered in the context of problems in economics and management [13, 14]. More recently, chance-constrained optimization has been applied to diverse problems in finance [17, 4], process design [26, 8], model predictive control [11, 24] and building control [22]. Further applications and references are discussed in [2].

For problems with constraints that may be violated up to prescribed limits on violation probability, chance constraints are less stringent than their robust counterparts, which impose constraints for all realizations of uncertainty. However, methods of handling chance constraints using explicit probability distributions can lead to intractable optimization problems. This motivates the use of scenario or sample-based methods in which constraints are imposed for finite sets of independent samples of the uncertain parameters. These approaches have the advantages that convexity is preserved, assuming that the constraints are convex in the decision variables for all uncertainty realizations, and that probabilistic bounds can be determined on the confidence with which the solution satisfies constraints [7, 8, 9]. In order to keep computation within practicable limits, it is important to understand how the sample size affects the accuracy with which the solution of the sampled problem approximates the solution of the chance-constrained problem.
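As an illustration of the basic sample-based approach, the following minimal Python sketch (using cvxpy) forms a scenario program in which one convex constraint is imposed per sampled parameter. The problem data here, a linear objective, affine uncertain constraints, a box domain and a Gaussian sampling distribution, are hypothetical placeholders and are not specified in this paper.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, N = 2, 200                        # decision dimension and number of samples

deltas = rng.normal(size=(N, n))     # i.i.d. parameter samples (placeholder distribution)
theta = cp.Variable(n)
c = np.ones(n)                       # hypothetical linear objective

# Impose one convex constraint per sampled parameter, plus a compact domain.
constraints = [deltas[i] @ theta <= 1.0 for i in range(N)]
constraints += [cp.norm(theta, "inf") <= 10.0]

cp.Problem(cp.Minimize(c @ theta), constraints).solve()
print(theta.value)
```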

The seminal papers [9, 5] provide bounds on the confidence that a decision variable satisfies a chance constraint, conditioned on the fact that the decision variable is optimal for a sampled problem in which the chance constraint is replaced by a randomly extracted set of sampled constraints. These bounds are tight in the sense that they cannot be improved without additional information about the sampled problem, and they are exact (i.e. they coincide with the actual distribution of violation probabilities) for a particular problem subclass. However, since this approach assumes that the sampled problem invokes the entire set of sampled constraints, a high level of confidence necessitates a large number of samples and hence a low probability of constraint violation. Consequently the approach cannot generally deliver a high degree of accuracy of approximation for problems involving chance constraints with violation probabilities that are not close to zero.

Alternative formulations, which are more suitable for approximating chance constraints with arbitrary violation probabilities, use a certain proportion of parameter samples to define a set of sampled constraints and discard the rest. Bounds on the confidence with which a solution of the resulting sampled problem satisfies a given chance constraint are derived in [5, 10]. However, these bounds are obtained under the assumption that sampled constraints are discarded optimally with respect to the objective function, or that constraint selection heuristics are used to approximate an optimal sample discarding procedure. The solutions of the sampled problem may therefore have poor generalization properties, and we show in this paper that, relative to a randomized sample discarding procedure, this leads to a lower confidence of satisfying the underlying chance constraints.

A third type of scenario approach for chance-constrained programming selects a solution from among the solutions of a set of sampled problems, with constraints defined in each case by an independently extracted set of parameter samples [12, 6]. By incorporating a posteriori empirical constraint violation tests based on additional parameter samples that are not used in the definition of the sampled problem, this approach can potentially provide tighter bounds on the confidence of satisfying an associated chance constraint. However, [12] does not provide a priori confidence bounds, and the posterior confidence bounds (based on [3]) are necessarily conservative for the convex setup employed here; on the other hand, the bounds given in [6] are non-conservative only for a particular problem subclass, as we show in this paper.

This paper explores and develops the connection between the confidence of chance constraint satisfaction for single-shot scenario approaches with and without sample discarding [9, 5, 10], and repetitive scenario approaches [12, 6]. Considering the properties of problems with sampled constraints that are discarded at random rather than according to a deterministic algorithm, we derive new bounds on the confidence that the solution is feasible for an associated chance-constrained problem. These bounds are tight (in the sense that they cannot be improved without additional information on the problem), and, since a combinatorial factor appearing in the bounds of [5, 10] is not required, they demonstrate that a considerable improvement in approximation accuracy can be achieved using a randomized sample discarding approach. We discuss a procedure for implementing randomized sample discarding based on the repetitive scenario approach. We describe how to determine the number of sampled problems that must be solved, and their respective sample sizes, in order to ensure specified tolerances on the violation probability of the solution for any given prior and posterior probabilities. The resulting posterior confidence bound coincides with that of [6] for a specific problem subclass; however, we show that it is in fact exact for this case and we give tight bounds in all other cases.

The rest of the paper is organised as follows. Section 2 gives the problem definition. Section 3 gives the main results, then discusses related results on the optimal value of the objective function and comparisons with existing results for deterministic sample discarding procedures. Section 4 describes an algorithm for approximating the solution of a chance-constrained problem with specified prior and posterior confidence bounds on constraint violation probability. Two examples are given to illustrate the algorithm: an application to the problem of determining the smallest hypersphere containing a given probability mass, and an application to a finite horizon optimal control problem considered in [6], generalized to the case of multiple chance constraints. Section 5 draws conclusions.

2 Problem definition and assumptions

Consider the chance-constrained optimization problem with decision variable :

(2.1)
subject to

Here is a specified probability, is a vector of random parameters and a probability measure defined on . Note that can take any value in the interval , and in particular is not assumed to be close to . The domain of the decision variable and the function satisfy the following assumption.

Assumption

For all , is convex and lower-semicontinuous, and is compact and convex.

Despite Assumption 2, the chance constraint is not necessarily convex in (except for certain special cases, see e.g. [23]), and problem (2.1) is therefore nonconvex in general. To avoid the computational difficulties associated with the chance constraint in (2.1), we consider an approximate problem formulation using samples of the uncertain parameter vector . Let denote a collection of independent identically distributed (i.i.d.) samples of the random variable . The sample indices are assumed to be statistically independent of the sample values, so that denotes a randomly selected subset of , for any .

To motivate a discussion of sampled convex programming, we define (following [5, 10]) the sample approximation with optimal sample discarding for problem (2.1) as

(2.2)
subject to

for a given integer , with . The optimal values of and in (2.2) are denoted and respectively.

Since is compact and convex, and since is a convex constraint on , the sampled problem (2.2) can be expressed as a mixed integer program with a convex continuous relaxation. This implies that (2.2) can be solved exactly using a branch and bound approach (see e.g. [19]). However, unless is large, the solution can have poor generalization properties since, as discussed in Section 3 of this paper, bounds on the confidence that satisfies the chance constraint of (2.1) are not tight for general uncertainty distributions. Moreover the computation required to solve for grows rapidly with and .
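For concreteness, the sketch below indicates how a sampled problem with a fixed number of discarded constraints can be posed as a mixed-integer convex program using binary indicator variables and a big-M relaxation. The symbols m, q and M and all problem data (linear objective, affine uncertain constraints, box domain) are introduced for illustration only and are not the paper's notation; a mixed-integer-capable solver is assumed to be available to cvxpy.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
n, m, q = 2, 50, 5                  # decision dimension, samples, constraints to discard
M = 1e3                             # big-M constant (assumed to bound the constraint functions)

deltas = rng.normal(size=(m, n))    # hypothetical i.i.d. parameter samples
theta = cp.Variable(n)
z = cp.Variable(m, boolean=True)    # z[i] = 1 means sample i is discarded

constraints = [deltas[i] @ theta <= 1.0 + M * z[i] for i in range(m)]
constraints += [cp.sum(z) <= q, cp.norm(theta, "inf") <= 10.0]

cp.Problem(cp.Minimize(np.ones(n) @ theta), constraints).solve()  # MIP-capable solver required
```

A branch-and-bound solver must enumerate the binary indicators, which is one way to see why the computation grows rapidly with the number of samples and the number of discarded constraints.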

In this paper we therefore consider problems with sampled constraints that are defined by randomly selecting subsets of a random multisample . Let

(2.3)

and define for any given as the collection of all -element subsets of such that violates the constraint for all , i.e.

(2.4)

We make the following assumptions on (2.3).

Assumption

The optimization (2.3) is almost surely feasible for .

Assumption

The solution of (2.3) for any satisfies for all , with probability .

The feasibility requirement of Assumption 2 is trivially satisfied if the robust optimization corresponding to (2.1) with is feasible. Clearly Assumption 2 could be restrictive since it is equivalent to the requirement that, with probability , an exists satisfying the constraints of (2.3) with , for any . We note, however, that the results of this paper could be extended to situations in which (2.3) has a non-zero probability of infeasibility by using a framework for analysis such as the one described in [5]. On the other hand, by convexity Assumption 2 holds if and only if, for any , problem (2.3) with is non-degenerate with probability  (i.e. the dual of problem (2.3) almost surely has a unique solution [16]).

In general may contain more than one subset of . In the sequel we refer to each distinct as a level- subset of and to the corresponding solutions of (2.3) for as level- solutions.

We define the essential (constraint) set, , of (2.3) for given as follows (see also [5, Def. 2.9]).

Definition (Essential set)

is an essential set of problem (2.3) if

  1. , and

  2. for all .

An essential set consists of samples that are associated with active constraints at the solution of (2.3). If Assumptions 2 and 2 hold, then is necessarily uniquely determined by conditions (i) and (ii). We define the maximum support dimension of (2.3), denoted , as the least upper bound that holds almost surely on the number of elements in the essential set of (2.3) for any size of multisample , namely the maximum value of over all finite integers . It is easy to show that (which is equivalent to Helly’s dimension for (2.2) [5, Def. 3.1]) cannot be greater than if Assumptions 2 and 2 hold. Similarly we define the minimum support dimension of (2.3), denoted , as the minimum value of for all finite . Clearly must hold for all problems.

Assumption

The maximum and minimum support dimensions of (2.3) satisfy for all finite and for all finite respectively, for some and such that .

3 Main results

The results presented in this section enable a randomized procedure to be constructed that ensures tight a priori and a posteriori bounds on the confidence of finding a solution of (2.3) that satisfies the chance constraint in (2.1). For given , denotes the violation probability

We first derive bounds on the conditional probability that satisfies given that is a randomly selected level- subset of . We then give bounds on the conditional probability that a level- solution satisfies given that is a randomly selected level- solution, for any given . These provide the basis for a posteriori bounds on the confidence that satisfies . Finally we provide bounds on the probability of generating a level- subset of using a randomized sample selection procedure; these bounds make it possible to determine a priori bounds on the probability of obtaining a level- solution.

The solution of a randomized optimization problem based on a finite multisample cannot in general satisfy with certainty the chance constraint in (2.1). Instead we seek a solution such that the constraint violation probability lies in a given interval, , with a specified level of confidence, . Two-sided confidence bounds are important in this context because the violation probability should be close to with a high degree of confidence in order that approximates the solution of the chance-constrained problem (2.1) for any given .

In order to define the probability of an event that depends on the multisample , we use to denote the product measure on . The binomial distribution function is denoted

so that is the probability of or fewer events occurring in independent trials, each of which has probability .
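The binomial distribution function appearing in the bounds below can be evaluated directly, for example with scipy. This is a minimal sketch; the function name binom_cdf is introduced here purely for illustration.

```python
from scipy.stats import binom

def binom_cdf(k, N, eps):
    """Probability of k or fewer events in N independent trials,
    each of which occurs with probability eps."""
    return binom.cdf(k, N, eps)

print(binom_cdf(5, 100, 0.1))   # example evaluation
```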

Theorem 1 (Confidence bounds for level- solutions)

For any and integers and such that we have

(3.1)

and, for ,

(3.2)

Theorem 1 (which is proved in Section 3.1) provides upper and lower bounds on the probability of the event , conditioned on the event that contains as a subset. Since is statistically identical to a randomly selected subset of of cardinality , a direct consequence of Theorem 1 is that the probability of the event , given that is a randomly selected subset of , satisfies the upper and lower bounds in (3.1) and (3.2).

Whenever and , the lower confidence bound in (3.1) is greater than the bound derived in [5, 10] on the probability that the solution of (2.2) (in which samples are discarded optimally) satisfies ; this is discussed in detail in Section 3.4. The resulting improvement in the confidence bound for a randomly selected level- solution is significant because a combinatorial factor that appears in the bounds of [5, 10] is not required in (3.1). Furthermore, the confidence bounds of Theorem 1 hold with equality if the support dimension of (2.3) is unique, i.e. if . An example of this is the class of fully supported problems, for which  [5, 10].

For general values of , and , it is clearly not computationally tractable to identify all level- subsets of the multisample and then select one at random in order to take advantage of the confidence bounds in Theorem 1. Clearly the optimal solution of (2.2), if available, could be used to identify a level- subset (namely ), and likewise greedy constraint selection algorithms are able to identify suboptimal level- subsets (see e.g. [5, Sec. 5.1]). However the deterministic constraint discarding strategies employed by these methods cannot be used to select an element of at random.

Instead we consider a randomized constraint selection strategy. This is based on the observation that the essential set of (2.3) for a randomly chosen subset of , such as for given , is almost surely the essential set of a randomly selected level- subset, where is the number of elements of the multisample that satisfy the constraint . Thus we can determine for given by first solving (2.3) for , then counting the number of samples that satisfy and setting where

(3.3)

Since is statistically independent of the samples contained in , it can be shown that confidence bounds analogous to those provided by Theorem 1 apply to . These bounds, which are stated in Theorem 2 (and proved in Section 3.2), provide a posteriori bounds on the confidence of constructing a solution of (2.2) with a specified constraint violation probability.
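A minimal sketch of this randomized selection and counting procedure is given below, again with hypothetical problem data (linear objective, affine uncertain constraints, box domain, Gaussian sampling distribution). The sketch assumes that the empirical level is taken to be the number of samples in the full multisample that violate the constraint at the computed solution; the precise definition is given by (3.3), which is not reproduced here.

```python
import numpy as np
import cvxpy as cp

def solve_sampled_problem(delta_subset):
    """Solve the sampled problem (2.3) for the given subset of parameter samples
    (hypothetical data: linear objective, affine constraints, box domain)."""
    n = delta_subset.shape[1]
    theta = cp.Variable(n)
    cons = [d @ theta <= 1.0 for d in delta_subset]
    cons += [cp.norm(theta, "inf") <= 10.0]
    cp.Problem(cp.Minimize(np.ones(n) @ theta), cons).solve()
    return theta.value

rng = np.random.default_rng(2)
n, m, N = 2, 5000, 100
deltas = rng.normal(size=(m, n))            # full multisample (placeholder distribution)

theta_N = solve_sampled_problem(deltas[:N])        # solve (2.3) on an N-sample subset
k = int(np.sum(deltas @ theta_N > 1.0))            # violations counted over the whole multisample
# k is the empirical level used in the posterior bounds of Theorem 2
```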

Theorem 2 (A posteriori confidence bounds conditioned on )

For any and integers , and such that we have

(3.4)

and, for ,

(3.5)

Theorem 2 provides tight bounds on the conditional distribution of given the value of . But depends on the random sample , and, for arbitrary , there may be only a small probability that lies in the required range so that has, with a sufficiently high level of confidence, a constraint violation probability in the desired range. However, as we discuss in Section 4, it is possible to choose the value of so as to maximize the probability that lies in the required range. For this we make use of the following result (the proof of which is given in Section 3.2).

Theorem 3 (Probability of selecting a level- subset of )

For any integers , and such that we have

(3.6)

and

(3.7)

where denotes the beta function for integers and .

From Theorem 3 it is possible to compute the value of that maximizes the probability with which lies in any specified range. In conjunction with Theorem 2, this allows a priori bounds to be determined on the confidence that the violation probability lies in the desired interval, . These confidence bounds make it possible to compute an upper bound on the number of times the procedure of solving (2.3) for and determining must be repeated in order to ensure that a solution is obtained that satisfies with a probability exceeding any given a priori confidence level . The proposed solution procedure, which is described in Section 4, therefore meets a priori bounds on the confidence of determining a solution with a violation probability in the required range.
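One elementary way to obtain such a repetition bound is as follows: if each independent repetition yields a value of the empirical level in the target range with probability at least p (a quantity that would in practice be bounded using Theorem 3, whose expressions are not reproduced here), then the number of repetitions T required to succeed at least once with probability β satisfies 1 - (1 - p)^T >= β. The short sketch below illustrates this calculation; the function name and arguments are illustrative only.

```python
import math

def repetitions_needed(p_trial, beta):
    """Smallest T with 1 - (1 - p_trial)**T >= beta, i.e. the number of
    independent repetitions needed so that at least one of them yields an
    empirical level in the target range with probability at least beta."""
    return math.ceil(math.log(1.0 - beta) / math.log(1.0 - p_trial))

print(repetitions_needed(p_trial=0.3, beta=0.99))   # -> 13
```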

The computational cost of determining in (3.3) for given is typically small compared to the cost of solving (2.3) for . For problems in which large numbers of samples are easy to obtain, this makes the use of large values of computationally attractive. In particular, by using a large value of it is possible to obtain very sharp a posteriori confidence bounds. In such cases the main factor limiting a priori confidence bounds is the structural uncertainty in solutions of (2.2), namely the difference between the maximum and minimum support dimensions of (2.3).

3.1 Confidence bounds for randomly selected level- solutions

The proof of Theorem 1 is derived from the properties of a randomly selected level- subset of the multisample . In order to simplify the analysis of problems with non-unique support dimensions (i.e. with ), we introduce a regularized version of the essential set that has a fixed cardinality. To define this set we assign (similarly to [5]) a random label to each sample , where is uniformly distributed on and independent of for all . Furthermore, for any multisample with associated labels we define as the subset of containing the samples with smallest labels, so that and if and only if for all such that . The regularized essential set is defined for a given integer as follows.

Definition (Regularized essential set)

For a multisample and integer such that , the regularized essential set is given by

where and is the essential set of (2.3).

Definition 3.1 implies that almost surely, and that

Using the regularized essential set we define a regularized version of violation probability:

Thus is equal to the probability that, for given , the regularized essential set associated with problem (2.3) changes when the multisample is extended to include a newly extracted sample . The regularized essential set also allows a regularized version of the set of level- subsets of to be defined for by

(3.8)

From Definition 3.1 it can be seen that whenever , and that whenever . These properties imply that , and similarly .

This section determines (in Lemma 2) the probability that is equal to the regularized essential set of (2.3) for some , for any given and with . This enables the conditional probability that given that is an element of to be determined (in Lemma 3), and the bounds in Theorem 1 on the conditional probability of given that are subsequently derived using this result. The approach is based on a fundamental result, stated in Lemma 1, on the distribution of the regularized violation probability . Related results are available in the literature (see e.g. [9, Eq. 3.2] and [5, Eq. 3.11]). However, for completeness and to account for differences in basic assumptions and notation, we provide here a proof of Lemma 1.

Lemma 1

For any and integers and such that we have

Proof

For all let denote the probability distribution of for given , i.e.

Now suppose that is equal to for some such that . This event is equivalent to the event that for , and its probability, conditioned on the assumption that is equal to , is given by

Using the definition of conditional probability we therefore have

(3.9)

and from the continuous version of the law of total probability it follows that

But is statistically identical to a randomly selected -element subset of and, from Assumption 2 and Definition 3.1, is almost surely unique. Therefore the probability that is equal to is given by the reciprocal of the number of distinct -element subsets of , and hence

(3.10)

necessarily holds for all . A solution for is given by , and moreover it can be shown that this solution is unique (since (3.10) is equivalent to a Hausdorff moment problem [15, Sec. VII.3]).

The probability that is equal to the regularized essential set, for some , where , is given by the following result.

Lemma 2

For any integers , and such that we have

Proof

From the definition of in (3.8), it follows that is equal to the regularized essential set for some if and only if of the samples contained in satisfy , and the remaining samples satisfy . The probability of this event, conditioned on the regularized violation probability being equal to , is

Therefore, from the definition of conditional probability, we obtain

(3.11)

where by Lemma 1. Hence the continuous version of the total probability law implies

and the result follows from the definition of the beta function, (see e.g. [1]).
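The step invoked here is the standard identity that the integral of v^a (1 - v)^b over [0, 1] equals B(a+1, b+1) = a! b! / (a+b+1)! for non-negative integers a and b. The short check below verifies this numerically; the symbols a and b are illustrative and are not the paper's notation.

```python
from math import factorial
from scipy.integrate import quad
from scipy.special import beta

a, b = 3, 7   # illustrative integer exponents

integral, _ = quad(lambda v: v**a * (1 - v)**b, 0.0, 1.0)
closed_form = beta(a + 1, b + 1)                           # B(a+1, b+1)
via_factorials = factorial(a) * factorial(b) / factorial(a + b + 1)

print(integral, closed_form, via_factorials)   # all three agree
```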

The confidence bounds of Theorem 1 can be established using the following lemma, which provides a subsidiary result on the regularized version of violation probability. The proof of this lemma is based on Lemmas 1 and 2, and on the properties of a subset of selected at random from .

Lemma 3 (Confidence bounds for regularized violation probabilities)

For any and integers , and such that we have

(3.12)

Proof

Proceeding as in the proof of Lemma 2, we first consider the probability that is the regularized essential set for some with . From (3.11) and Lemma 1 we obtain

where is the incomplete beta function [1]. Using Lemma 2 and the definition of conditional probability we obtain

(3.13)

But the statistical independence of and the index imply that the probability of conditioned on is identical to the probability of the same event conditioned on , so that

Furthermore, if , then almost surely, and hence

From (3.13) it therefore follows that

The conditional distribution derived in Lemma 3 for the regularized violation probability given that is similar in form to the confidence bounds of Theorem 1. However the condition employed in Theorem 1 is in general much easier to check than membership of because it requires only a function evaluation to check constraint violation rather than recomputation of the essential set of (2.3), which is needed to determine whether when . In order to link Lemma 3 to Theorem 1 we make use of the independence of the regularized violation probability and the cardinality of the essential set, . A consequence of this independence property is that the arguments used to prove Lemmas 1, 2 and 3 can be used to demonstrate that the same set of results holds when all probabilities are conditioned on the event that takes any given value between and . For convenience we summarize the independence properties that are relevant to the proof of Theorem 1 as follows.

Lemma 4

For any integers and such that and we have

(3.14)

and

(3.15)

Proof

The sample labels are, by assumption, independent of . Therefore the probability of the event for a randomly extracted sample can only depend on the values of and , and does not depend on . Hence the events that and are necessarily independent of the event that has any given value.

Proof (Proof of Theorem 1)

From the observation that the events are mutually exclusive and exhaustive for , we have

However the definitions of regularized and non-regularized violation probabilities and essential sets imply that and if , and it follows that the event conditioned on and is identical to the event conditioned on and . Therefore

and the bounds in Theorem 1 are derived from this expression using Lemmas 3 and 4. Specifically, from (3.14) and (3.15) it follows that

and using (3.12) we obtain

and hence the bounds (which hold for all , and ) imply (3.1) and (3.2) since we must have .

3.2 The probability of selecting a level- subset and a posteriori confidence bounds

This section provides proofs for Theorems 2 and 3. These results are used in Section 4 to determine a randomized constraint selection strategy that is the basis of a method of determining solutions of (2.1) with specified a priori and a posteriori confidence bounds. The posterior confidence bounds in Theorem 2 are derived by an extension of the argument used in the proof of Theorem 1. We then consider Theorem 3, which provides bounds on the probability that the number of samples that satisfy is equal to a given value . The proof of this is based on a subsidiary result (given in Lemma 5) and Lemma 4.

Proof (Proof of Theorem 2)

The definition of in (3.3) implies that if and only if, for some , the set belongs to . But the samples contained in are statistically identical to, and independent of, the samples in , and hence any event conditioned on is identical to the same event conditioned on . In particular we have

Furthermore, whenever and , we must have , and hence

The bounds (3.4) and (3.5) then follow from Theorem 1.

To demonstrate the bounds of Theorem 3 we define (analogously to in (3.3)) as the number of samples, with the property that the regularized essential set is identical to , or equivalently

(3.16)

Lemma 5 (Probability of selecting a regularized level- subset of )

For any integers , , , satisfying we have

(3.17)