Convex duality in stochastic programming and mathematical finance


This paper proposes a general duality framework for the problem of minimizing a convex integral functional over a space of stochastic processes adapted to a given filtration. The framework unifies many well-known duality frameworks from operations research and mathematical finance. The unification allows the extension of some useful techniques from these two fields to a much wider class of problems. In particular, combining certain finite-dimensional techniques from convex analysis with measure theoretic techniques from mathematical finance, we are able to close the duality gap in some situations where traditional topological arguments fail.

1 Introduction

Let $(\Omega,\mathcal{F},P)$ be a probability space with a filtration $(\mathcal{F}_t)_{t=0}^T$ (an increasing sequence of sub-sigma-algebras of $\mathcal{F}$) and consider the problem

$$\text{minimize}\quad Ef(x(\omega),u(\omega),\omega)\quad\text{over } x\in\mathcal{N}, \tag{1}$$

where $f$ is an extended real-valued function, $\mathcal{N}$ is a space of $(\mathcal{F}_t)_{t=0}^T$-adapted decision strategies and $u$ is a measurable function (exact definitions will be given below). The variable $u$ represents parameters or perturbations of a dynamic decision making problem where the objective is to minimize the expectation over decision strategies adapted to the information available to the decision maker over time. This paper derives dual expressions for the optimal value of (1) by incorporating some measure theoretic techniques from mathematical finance into the general conjugate duality framework of Rockafellar [36].

Problem (1) covers many important optimization models in operations research and mathematical finance. Specific instances of stochastic optimization problems can often be put in the above format by appropriately specifying the integrand $f$. Allowing the integrand to take on the value $+\infty$, we can represent various pointwise (almost sure) constraints by infinite penalties. Some of the earliest examples can be found in Dantzig [10] and Beale [4]. Problem (1) provides a very general framework also for various optimization and pricing problems in mathematical finance. Certain classes of stochastic control problems can also be put in the above form; see [40, Section 6]. In some applications, the parameter $u$ is introduced into a given problem in order to derive information (such as optimality conditions or bounds on the optimal value) about it. This is the point of view taken e.g. in [36]. In other applications, the parameter $u$ has a natural interpretation in the original formulation itself. Examples include financial applications where $u$ may represent the payouts of a financial instrument such as an option and one is trying to minimize the initial cost of a hedging portfolio.

Convex duality has widespread applications in operations research, calculus of variations and mechanics. Besides in deriving optimality conditions, duality is used in numerical optimization and bounding techniques. The essence of convex duality is beautifully summarized by the conjugate duality framework of [36] which subsumes more special duality frameworks such as Lagrangian (and in particular LP) and Fenchel duality; see also Ekeland and Temam [16]. Several duality results, including optimality conditions for certain instances of (1) have been derived from the conjugate duality framework in Rockafellar and Wets [38, 39, 40, 41].

Convex duality has long been an integral part also of mathematical finance but there, duality results are often derived ad hoc instead of embedding a given problem in a general optimization framework. Attempts to derive financial duality results from known optimization frameworks are often hindered by two features. First, general duality frameworks are often formulated in locally convex topological vector spaces while in financial problems the decision strategies are usually chosen from a space that lacks an appropriate locally convex topology. Second, general duality results are often geared towards attainment of the dual optimum which requires conditions that often fail to hold in financial applications. The main contribution of this paper is to propose a general enough duality framework for (1) that covers several problems both in operations research as well as in mathematical finance. Our framework, to be rigorously specified in Section 2, is an extension of the stochastic programming duality frameworks proposed in [38, 40]. In our framework the parameters enter the model in a more general manner and we do not restrict the decision strategies to be bounded or integrable a priori.

Allowing strategies to be general adapted processes has turned out to be useful in deriving various duality results for financial models; see e.g. Schachermayer and Delbaen [14], Kabanov and Safarian [22] and their references. This paper extends such techniques to a much more general class of models. We obtain dual representations for the optimal value of (1) but not necessarily the dual attainment as opposed to the strong duality results in [38, 39, 40, 41]. Consequently, we cannot claim the necessity of various optimality conditions involving dual variables. Nevertheless, the mere absence of duality gap is useful in many situations e.g. in mathematical finance where the “constraint qualifications” required for classical duality results often fail to hold. For example, various dual representations of hedging costs correspond to the absence of the duality gap while the dual optimum might not be attained. As an application, we extend certain results on superhedging and optimal consumption to a general market model with nonlinear illiquidity effects and convex portfolio constraints. This will be done by extending the elegant (currency) market model of Kabanov [23] where all assets are treated symmetrically. More traditional market models are then covered as special cases. The absence of duality gap is useful also in deriving certain simulation-based numerical techniques for bounding the optimum value of (1) as e.g. those proposed in Rogers [43] and Haugh and Kogan [19] in the case of optimal stopping problems. We extend such techniques to a more general class of problems.

The rest of this paper is organized as follows. Section 2 presents the general duality framework for problem (1) based on the conjugate duality framework of [36]. Sections 3 and 4 give some well-known examples and extensions of duality frameworks from operations research and mathematical finance, respectively. Section 5 extends some classical closedness criteria from finite-dimensional spaces to the present infinite-dimensional stochastic setting.

2 Conjugate duality

We study (1) in the conjugate duality framework of Rockafellar [36]. However, we deviate from [36] in that the space of decision variables need not be a locally convex topological vector space paired with another one. This precludes the completely symmetric duality in [36] but in some situations it yields more regularity for the optimal value than what can be obtained e.g. with integrable strategies.

For given integers $n_t$, we set

$$\mathcal{N}=\{(x_t)_{t=0}^T \mid x_t\in L^0(\Omega,\mathcal{F}_t,P;\mathbb{R}^{n_t})\},$$

where $L^0(\Omega,\mathcal{F}_t,P;\mathbb{R}^{n_t})$ denotes the space of equivalence classes of $\mathcal{F}_t$-measurable $\mathbb{R}^{n_t}$-valued functions that coincide $P$-almost surely. Each $x_t$ is interpreted as a decision that is made after observing all available information at time $t$. In applications, the filtration is often generated by a finite-dimensional stochastic process whose values are observed at discrete points in time. If $\mathcal{F}_0$ is the trivial sigma algebra then the first component $x_0$ is deterministic.

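The function $f$ is assumed to be an extended real-valued convex normal integrand on $\mathbb{R}^n\times\mathbb{R}^m\times\Omega$ where $n=n_0+\dots+n_T$ and $m$ is a given integer. This means that the set-valued mapping $\omega\mapsto\operatorname{epi}f(\cdot,\cdot,\omega)$ is $\mathcal{F}$-measurable and it has closed and convex values (so $f(\cdot,\cdot,\omega)$ is convex and lower semicontinuous for every $\omega$); see e.g. [42, Chapter 14]. This implies that $f$ is $\mathcal{B}(\mathbb{R}^n\times\mathbb{R}^m)\otimes\mathcal{F}$-measurable and that the function $f(\cdot,\cdot,\omega)$ is lower semicontinuous and convex for every $\omega$. It follows that $\omega\mapsto f(x(\omega),u(\omega),\omega)$ is $\mathcal{F}$-measurable for every $x\in\mathcal{N}$ and $u\in L^0(\Omega,\mathcal{F},P;\mathbb{R}^m)$. Throughout this paper, the expectation of an extended real-valued measurable function is defined as $+\infty$ unless the positive part is integrable. The integral functional

$$I_f(x,u)=Ef(x(\omega),u(\omega),\omega)$$

in the objective of (1) is then a well-defined extended real-valued convex function on $\mathcal{N}\times L^0(\Omega,\mathcal{F},P;\mathbb{R}^m)$. Normal integrands possess many useful properties and they arise quite naturally in many optimization problems in practice. Examples will be given in the following sections. We refer the reader to [37] or [42, Chapter 14] for general treatment of normal integrands on $\mathbb{R}^n\times\Omega$ for finite $n$.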

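For each $u\in L^0(\Omega,\mathcal{F},P;\mathbb{R}^m)$, the optimal value of (1) is given by the value function

$$\varphi(u)=\inf_{x\in\mathcal{N}}I_f(x,u).$$

By [36, Theorem 1], $\varphi$ is convex. We will derive dual expressions for $\varphi$ on the space $L^p:=L^p(\Omega,\mathcal{F},P;\mathbb{R}^m)$, $p\in[1,\infty]$, using the conjugate duality framework of Rockafellar [36]. To this end, we pair $L^p$ with $L^q$, where $q\in[1,\infty]$ is such that $1/p+1/q=1$. The bilinear form

$$\langle u,y\rangle=E[u(\omega)\cdot y(\omega)]$$

puts $L^p$ and $L^q$ in separating duality. The weakest and the strongest locally convex topologies on $L^p$ compatible with the pairing will be denoted by $\sigma(L^p,L^q)$ and $\tau(L^p,L^q)$, respectively (similarly for $L^q$). By the classical separation argument, a convex function is lower semicontinuous with respect to $\sigma(L^p,L^q)$ if it is merely lower semicontinuous with respect to $\tau(L^p,L^q)$.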

Remark 1.

For $p\in[1,\infty)$, $\tau(L^p,L^q)$ is the norm topology and $\sigma(L^p,L^q)$ is the weak topology that $L^p$ has as a Banach space with the usual $L^p$-norm. For $p=\infty$, $\sigma(L^\infty,L^1)$ is the weak*-topology that $L^\infty$ has as the Banach dual of $L^1$ while $\tau(L^\infty,L^1)$ is, in general, weaker than the norm topology. It follows from the Mackey-Arens and Dunford-Pettis theorems that a sequence in $L^\infty$ converges with respect to $\tau(L^\infty,L^1)$ if and only if it is norm-bounded and converges in measure; see Grothendieck [18, Part 4] for the case of locally compact measure spaces. In mathematical finance, a convex function on $L^\infty$ is sometimes said to have the “Fatou property” if it is sequentially lower semicontinuous with respect to $\tau(L^\infty,L^1)$.

Remark 2.

Instead of $L^p$ and $L^q$, we could take an arbitrary pair of spaces of measurable $\mathbb{R}^m$-valued functions which are in separating duality under the bilinear form $\langle u,y\rangle=E[u(\omega)\cdot y(\omega)]$. Examples include Orlicz spaces which have recently been used in a financial context by Biagini and Frittelli [7].

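The conjugate of a function $\varphi$ on $L^p$ is the convex function on $L^q$ defined by

$$\varphi^*(y)=\sup_{u\in L^p}\{\langle u,y\rangle-\varphi(u)\}.$$

The conjugate of a function on $L^q$ is defined similarly. It is a fundamental result in convex duality that $\varphi^{**}=\operatorname{cl}\varphi$ where

$$\operatorname{cl}\varphi=\begin{cases}\operatorname{lsc}\varphi & \text{if } (\operatorname{lsc}\varphi)(u)>-\infty \text{ for all } u\in L^p,\\ -\infty & \text{otherwise}\end{cases}$$

is the closure of $\varphi$; see e.g. [36, Theorem 5]. Here $\operatorname{lsc}\varphi$ denotes the lower semicontinuous hull of $\varphi$. If $\operatorname{cl}\varphi$ has a finite value at some point then $\operatorname{cl}\varphi$ is proper and $\operatorname{cl}\varphi=\operatorname{lsc}\varphi$; see [36, Theorem 4].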
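The biconjugate relation can be checked numerically in one dimension. The following is a minimal sketch, assuming a finite grid in place of the function spaces used in this paper; the helper `conjugate`, the grids and the two test functions are hypothetical choices made only for illustration. It compares a closed convex function with its discrete biconjugate, and it also evaluates a nonconvex function, for which the biconjugate returns the closed convex hull rather than the function itself.

```python
def conjugate(points, values, slopes):
    """Discrete conjugate: max over grid points x of x*y - f(x), for each y."""
    return [max(x * y - fx for x, fx in zip(points, values)) for y in slopes]

# Hypothetical grids: primal points on [-2, 2] and slopes on [-4, 4].
xs = [i / 100 for i in range(-200, 201)]
ys = [j / 100 for j in range(-400, 401)]

# Closed convex function f(x) = x^2: the biconjugate reproduces f on the grid.
sq = [x * x for x in xs]
sq_bi = conjugate(ys, conjugate(xs, sq, ys), xs)

# Nonconvex double well: the biconjugate is the closed convex hull,
# which lies below the function between the two wells.
well = [min((x - 1) ** 2, (x + 1) ** 2) for x in xs]
well_bi = conjugate(ys, conjugate(xs, well, ys), xs)

mid = xs.index(0.0)
print(max(abs(a - b) for a, b in zip(sq_bi, sq)))   # discrepancy between f and f**
print(well[mid], well_bi[mid])                      # values of f and f** at the origin
```

The slope grid is chosen wide enough to contain the derivative of $x^2$ over the whole primal grid; with a narrower slope grid the discrete biconjugate would underestimate the function near the endpoints.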

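The Lagrangian associated with (1) is the extended real-valued function on $\mathcal{N}\times L^q$ defined by

$$L(x,y)=\inf_{u\in L^p}\{I_f(x,u)-\langle u,y\rangle\}.$$

The Lagrangian is convex in $x$ and concave in $y$. The dual objective is the extended real-valued function on $L^q$ defined by

$$g(y)=\inf_{x\in\mathcal{N}}L(x,y).$$

Since $g$ is the pointwise infimum of concave functions, it is concave. The basic duality result [36, Theorem 7] says, in particular, that

$$g=-\varphi^*.$$

This follows directly from the above definitions and does not rely on topological properties of $\mathcal{N}$. The biconjugate theorem then gives the dual representation

$$\operatorname{cl}\varphi(u)=\sup_{y\in L^q}\{\langle u,y\rangle+g(y)\}. \tag{2}$$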

In many applications, the parameter $u$ has practical significance, and the dual representation (2) may yield valuable information about the function $\varphi$. On the other hand, in some situations, one is faced with a fixed optimization problem and the parameter $u$ is introduced in order to derive information about the original problem. This is the perspective taken in [36], where the minimization problem

$$\text{minimize}\quad I_f(x,0)\quad\text{over } x\in\mathcal{N} \tag{3}$$

would be called the primal problem and

$$\text{maximize}\quad g(y)\quad\text{over } y\in L^q \tag{4}$$

the dual problem. By (2), the optimum values of (3) and (4) are equal exactly when $\operatorname{cl}\varphi(0)=\varphi(0)$. An important topic which is studied in [36] but not in the present paper is derivatives of the value function and the associated optimality conditions. In this paper, we concentrate on the more general property of lower semicontinuity of $\varphi$; see Section 5. The lower semicontinuity already yields many interesting results in operations research and mathematical finance. Moreover, lower semicontinuity is useful for proving the continuity of $\varphi$ for $p\in[1,\infty)$ since a lower semicontinuous convex function on a barreled space is continuous throughout the interior of its domain; see e.g. [36, Corollary 8B].
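To make the scheme concrete, the following worked example traces the definitions through a one-dimensional deterministic instance. It is only a sketch under stated assumptions: the probability space is taken to be a singleton, so that all function spaces reduce to $\mathbb{R}$, and the symbols $\varphi$, $L$ and $g$ for the value function, the Lagrangian and the dual objective, as well as the particular integrand, are choices made here for illustration. The instance is the problem of minimizing $\tfrac12x^2$ subject to $x\ge 1+u$.

```latex
\begin{align*}
f(x,u) &= \tfrac12 x^2 + \delta_{(-\infty,0]}(1+u-x), \\
\varphi(u) &= \inf_{x\in\mathbb{R}} f(x,u) = \tfrac12\big((1+u)^+\big)^2, \\
L(x,y) &= \inf_{u\in\mathbb{R}}\{f(x,u)-uy\}
        = \begin{cases} \tfrac12 x^2 - y(x-1) & \text{if } y\ge 0,\\ -\infty & \text{if } y<0, \end{cases} \\
g(y) &= \inf_{x\in\mathbb{R}} L(x,y)
        = \begin{cases} y-\tfrac12 y^2 & \text{if } y\ge 0,\\ -\infty & \text{if } y<0, \end{cases} \\
\sup_{y\in\mathbb{R}}\{uy+g(y)\} &= \sup_{y\ge 0}\{(1+u)y-\tfrac12 y^2\}
        = \tfrac12\big((1+u)^+\big)^2 = \varphi(u).
\end{align*}
```

At $u=0$ the primal optimum $\tfrac12$ is attained at $x=1$ and the dual optimum $\tfrac12$ at $y=1$, so there is no duality gap in this instance; here $\varphi$ is finite and continuous, hence closed.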

Remark 3.

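As long as the integral functional $I_f$ is closed in $u$ (which holds under quite general conditions given e.g. in Rockafellar [33]), the biconjugate theorem gives

$$I_f(x,u)=\sup_{y\in L^q}\{L(x,y)+\langle u,y\rangle\}$$

and, in particular, $I_f(x,0)=\sup_{y}L(x,y)$ so that $\varphi(0)=\inf_{x}\sup_{y}L(x,y)$. On the other hand, (2) gives $\operatorname{cl}\varphi(0)=\sup_{y}\inf_{x}L(x,y)$ so that the condition $\operatorname{cl}\varphi(0)=\varphi(0)$ can be expressed as

$$\inf_{x\in\mathcal{N}}\sup_{y\in L^q}L(x,y)=\sup_{y\in L^q}\inf_{x\in\mathcal{N}}L(x,y).$$

In other words, the function $L$ has a saddle-value iff $\varphi$ is closed at the origin. Along with the general duality theory for convex minimization, the conjugate duality framework of [36] addresses general convex-concave minimax problems.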

The following interchange rule will be useful in deriving more explicit expressions for the dual objective $g$. It is a special case of [42, Theorem 14.60] and it uses the fact that for an $\mathcal{F}$-measurable normal integrand $h$, the function $\omega\mapsto\inf_{x}h(x,\omega)$ is $\mathcal{F}$-measurable; see [42, Theorem 14.37].

Theorem 1 (Interchange rule).

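Given an $\mathcal{F}$-measurable normal integrand $h$ on $\mathbb{R}^n\times\Omega$, we have

$$\inf_{x\in L^0(\Omega,\mathcal{F},P;\mathbb{R}^n)}Eh(x(\omega),\omega)=E\inf_{x\in\mathbb{R}^n}h(x,\omega)$$

as long as the left side is less than $+\infty$.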
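The interchange rule is easy to verify by brute force on a finite probability space. The sketch below is an illustration under simplifying assumptions, not part of the theory above: three equally likely scenarios, a finite grid standing in for the decision space, and a hypothetical integrand `h`. It compares the minimum of the expectation over all scenario-dependent decisions with the expectation of the scenariowise minimum, and it also computes the minimum over constant decisions, which corresponds to measurability with respect to the trivial sigma-algebra.

```python
from itertools import product

# Hypothetical finite setting: three equally likely scenarios.
probs = [1 / 3, 1 / 3, 1 / 3]
targets = [-1.0, 0.5, 2.0]                 # scenario-dependent data
grid = [k / 2 for k in range(-4, 7)]       # candidate decisions -2.0, -1.5, ..., 3.0

def h(x, omega):
    """Illustrative integrand h(x, omega) = (x - target(omega))^2 + |x|."""
    return (x - targets[omega]) ** 2 + abs(x)

# Left side: minimize E h(x(omega), omega) over all functions omega -> grid.
lhs = min(
    sum(p * h(x_w, w) for w, (p, x_w) in enumerate(zip(probs, x)))
    for x in product(grid, repeat=len(probs))
)

# Right side: expectation of the scenariowise minimum.
rhs = sum(p * min(h(x, w) for x in grid) for w, p in enumerate(probs))

# Restricting to constant decisions (trivial information) can only increase the value.
constant_value = min(
    sum(p * h(x, w) for w, p in enumerate(probs)) for x in grid
)

print(lhs, rhs, constant_value)
```

The first two numbers coincide because every function on a finite scenario set is measurable; the third is larger, which illustrates why the interchange applies to decisions measurable with respect to the same sigma-algebra as the integrand and not to adapted strategies in general.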

Theorem 1 yields a simple proof of Jensen’s inequality. Throughout this paper, the conditional expectation of a random variable with respect to will be denoted by ; see e.g. Shiryaev [47, II.7].

Corollary 2 (Jensen’s inequality).

Let be an -measurable convex normal integrand on such that for some . Then

for every .


Proof. Applying Theorem 1 twice, we get

where the third equality comes from the law of iterated expectations; see e.g. [47, Section II.7]. ∎

Going back to (1), we define

This is an extended real-valued function on , convex in and concave in . Various dual expressions in stochastic optimization and in mathematical finance can be derived from the following result which expresses the dual objective in terms of . In many situations, the expression can be written concretely in terms of problem data; see Sections 3 and 4. Given an , we let

Theorem 3.

The function is measurable for any and so the integral functional is well-defined on . As long as , we have

If, in addition, is of the form

for some -measurable extended real-valued functions on then

as long as the right side is less than .


Proof. We have , where . To prove the measurability it suffices to show that is a normal integrand on . This follows from Proposition 14.45(c) and Theorem 14.50 of [42].

If , then there exists an such that for every . We can thus assume that in the expression for in which case

by Theorem 1. Here we apply the interchange rule to the function which is a normal integrand, by [42, Proposition 14.45(c)].

Fix a and let be such that . Let be arbitrary and let be such that . Defining , where , we have that the strategy is in and that almost surely for every as . Since the functions are dominated by the integrable function

Fatou’s lemma (applied in the product measure space obtained by equipping with the counting measure) gives

Since was arbitrary and , the claim follows. ∎

The main content of the first part of Theorem 3 is that the infimum in the definition of the Lagrangian can be reduced to scenariowise minimization. This can sometimes be done even analytically. The last part of the above result shows that, while integrability of may be restrictive in the original problem, it may be harmless in the expression for the dual objective . A simple example will be given at the end of Example 1 below. In some applications, the integrability can be used to derive more convenient expressions for .

3 Examples from operations research

This section reviews some well-known duality frameworks from operations research and shows how they can be derived from the abstract framework above. Many of the examples are from Rockafellar and Wets [38, 40] where they were formulated for bounded strategies. We will also point out some connections with more recent developments in finance and stochastics. A recent account of techniques and models of stochastic programming can be found in Shapiro, Dentcheva and Ruszczyński [46].

The best known duality frameworks involve functional constraints and Lagrange multipliers. The most classical example is linear programming duality. These frameworks are deterministic special cases of the following stochastic programming framework from [40], where sufficient conditions were given for the attainment of the dual optimum.

Example 1 (Inequality constraints).


where are convex normal integrands. To verify that is a normal integrand, we write it as , where

and . By [42, Proposition 14.33], the sets are measurable so the functions are normal integrands by [42, Example 14.32] and then is a normal integrand by [42, Proposition 14.44(c)]. The integral functional is thus well-defined and equals

The primal problem (3) can be written as

This is the classical formulation of a nonlinear stochastic optimization problem. It is a stochastic extension of classical mathematical programming models such as linear programming.

The Lagrangian integrand becomes

where . The expression holds under the general condition of Theorem 3, but to get more explicit expressions for the dual objective one needs more structure on ; see the examples below.

To illustrate how the choice of the strategy space may affect the lower semicontinuity of , consider the case , and

for some strictly positive such that . We get for every but there is no which satisfies the pointwise constraint when . However,

so, by the second part of Theorem 3, the strategies can be taken even bounded when calculating .

It was observed in [40, Section 3A] that the dual objective in Example 1 can be written in a more concrete form when the functions have a time-separable form.

Example 2.

Consider Example 1 in the case

where each is an -measurable normal integrand. Defining and using the convention , we can write


Assume now that for every and and that there is a and a -integrable random variable such that . It follows that for every ; see e.g. [37, Theorem 3K]. If there is an such that are integrable then, by the second part of Theorem 3,

Using the properties of conditional expectation (see e.g. [47, Section II.7]), we get

Applying Theorem 1 for , we can express the dual objective as


The dual problem can thus be written as

where is the set of -valued -integrable martingales.

In the linear case, considered already in Dantzig [10], the dual problem in Example 2 can be written as another linear optimization problem.

Example 3 (Linear programming).

Consider Example 2 in the case where

and for -measurable -integrable -dimensional vectors and -measurable integrable scalars . The primal problem can then be written as

where is the matrix with rows and . We get

where is the transpose of . It follows that

and the dual problem can be written as

where is the set of nonnegative -integrable martingales. When , we recover the classical linear programming duality framework.

The famous problem of optimal stopping is a one-dimensional special case of Example 3.

Example 4 (Optimal stopping).

The optimal stopping problem with an integrable nonnegative scalar process can be formulated as

The feasible strategies are related to stopping times through . The optimal value is not affected if we relax the constraint (see below). The relaxed problem fits the framework of Example 3 with , , , and . The dual problem becomes

To justify the convex relaxation, we first note that the feasible set of the relaxed problem is contained in the space of bounded strategies. Since by assumption, it suffices (by the Krein-Milman theorem) to show that the feasible set of the relaxed problem equals the -closed convex hull of the feasible set of the original problem. Let be feasible in the relaxed problem. For , define the stopping times

where . The strategies

are feasible in the original problem. It suffices to show that the convex combinations

converge to in the weak topology. By construction,

so that and thus almost surely. Since and are all contained in the unit ball of , we have

by the dominated convergence theorem.

Remark 4.

The above duality frameworks suggest computational techniques for estimating the optimal value of the primal problem. The dual objective in Example 2 is dominated for every by


If is feasible in the primal problem, we get for every

Minimizing over all feasible strategies shows that (5) lies between and the optimum primal value . When is closed, we thus get that . The problem of finding the infimum in (5) can be seen as a deterministic version of the primal problem augmented by a penalty term in the objective.

In the case of Example 4, (5) can be written for every as

This is the dual representation for optimal stopping obtained by Davis and Karatzas [13]. This was used by Rogers [43] (see also Haugh and Kogan [19]) in a simulation based technique for computing upper bounds for the value of American options in complete market models. The technique is readily extended to the more general problem class of Example 2. The technique can be further extended using the following.
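As a brief numerical aside on the martingale bound just described, the sketch below checks the textbook form of the dual representation for optimal stopping, namely that the optimal stopping value equals the infimum over martingales $M$ with $M_0=0$ of $E\max_t(R_t-M_t)$, on a small binomial tree. Everything concrete here is an assumption made for illustration: the three-period tree, the fair coin, the discounted put-type reward and all function names. The optimal martingale is taken to be the martingale part of the Doob decomposition of the Snell envelope; the zero martingale gives a valid but cruder upper bound.

```python
from functools import lru_cache
from itertools import product

# Hypothetical toy model: three periods, fair coin, discounted put-type reward.
T = 3
S0, UP, DOWN, K, BETA = 100.0, 1.2, 0.8, 100.0, 0.9

def reward(prefix):
    """Reward R_t on a path prefix, given as a tuple of +1/-1 moves."""
    s = S0
    for move in prefix:
        s *= UP if move == 1 else DOWN
    return BETA ** len(prefix) * max(K - s, 0.0)

@lru_cache(maxsize=None)
def snell(prefix):
    """Snell envelope V_t = max(R_t, E[V_{t+1} | F_t]) by backward induction."""
    if len(prefix) == T:
        return reward(prefix)
    return max(reward(prefix), continuation(prefix))

def continuation(prefix):
    """Conditional expectation E[V_{t+1} | F_t] at the given prefix."""
    return 0.5 * (snell(prefix + (1,)) + snell(prefix + (-1,)))

def pathwise_max(path, use_martingale):
    """max_t (R_t - M_t) along one path, with M the Doob martingale of V or M = 0."""
    m, best = 0.0, reward(())
    for t in range(1, T + 1):
        prefix = path[:t]
        if use_martingale:
            # Martingale increment: V_t - E[V_t | F_{t-1}].
            m += snell(prefix) - continuation(prefix[:-1])
        best = max(best, reward(prefix) - m)
    return best

paths = list(product((1, -1), repeat=T))
primal = snell(())                                                   # optimal stopping value
dual_opt = sum(pathwise_max(p, True) for p in paths) / len(paths)    # bound with Doob martingale
dual_zero = sum(pathwise_max(p, False) for p in paths) / len(paths)  # bound with M = 0
print(primal, dual_opt, dual_zero)
```

In a simulation setting the Snell envelope is not available, so one would substitute an approximate martingale; any martingale still yields an upper bound, and its tightness depends on how close the approximation is to the optimal one.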

The cost of the nonanticipativity constraint on the strategies has been studied in a number of papers; see e.g. Rockafellar and Wets [38] for a general discrete finite time framework as well as Wets [48], Back and Pliska [2], Davis [11] and Davis and Burstein [12] on continuous-time models. The cost can be described in terms of dual variables representing the value of information. The following derives a dual representation in the framework of Section 2.

Example 5 (Shadow price of information).

Let be a convex normal integrand and consider the problem


This can be seen as the primal problem associated with the normal integrand

The value function corresponds to adding a general -measurable vector to each in (6). We get

As long as there is a such that , this satisfies the conditions of Theorem 3 with so that

By Theorem 3,

The infimum in the last expression differs from the original problem in that the information constraints have been replaced by a linear term. This can be used to compute lower bounds for the optimal value using simulation much like in Rogers [43] and Haugh and Kogan [19] in the case of optimal stopping problems; see Remark 4. Rockafellar and Wets [38] gave sufficient conditions for the existence of a such that in the case of bounded strategies; see also Back and Pliska [2] for a continuous-time framework with a special class of objective functions.

The following problem format is adapted from Rockafellar and Wets [41]. It has its roots in calculus of variations and optimal control; see Rockafellar [35].

Example 6 (Problems of Bolza type).

Let and consider the problem


where , and each is an -measurable normal integrand on . This fits our general framework with

where and with . Indeed, (7) is (1) with . We get