# Duality and optimality conditions in stochastic optimization and mathematical finance

## Abstract

This article studies convex duality in stochastic optimization in finite discrete time. The first part of the paper gives general conditions that yield explicit expressions for the dual objective in many applications in operations research and mathematical finance. The second part derives optimality conditions by combining general saddle-point conditions from convex duality with the dual representations obtained in the first part of the paper. Several applications to stochastic optimization and mathematical finance are given.

## 1 Introduction

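Let $(\Omega,\mathcal F,P)$ be a complete probability space with a filtration $(\mathcal F_t)_{t=0}^T$ of complete sub-$\sigma$-algebras of $\mathcal F$ and consider the dynamic stochastic optimization problem

 $$\text{minimize}\quad Ef(x,u):=\int f(x(\omega),u(\omega),\omega)\,dP(\omega)\quad\text{over } x\in\mathcal N \tag{$P_u$}$$

parameterized by a measurable function $u\in L^0(\Omega,\mathcal F,P;\mathbb R^m)$. Here and in what follows,

 $$\mathcal N:=\{(x_t)_{t=0}^T \mid x_t\in L^0(\Omega,\mathcal F_t,P;\mathbb R^{n_t})\}$$

for given integers $n_t$ and $f$ is an extended real-valued $\mathcal B(\mathbb R^n\times\mathbb R^m)\otimes\mathcal F$-measurable function on $\mathbb R^n\times\mathbb R^m\times\Omega$, where $n:=n_0+\dots+n_T$. The variable $x\in\mathcal N$ is interpreted as a decision strategy where $x_t$ is the decision taken at time $t$. Throughout this paper, we define the expectation of a measurable function $\phi$ as $+\infty$ unless the positive part $\phi^+$ is integrable (in particular, the sum of extended real numbers is defined as $+\infty$ if any of the terms equals $+\infty$). The function $Ef$ is thus a well-defined extended real-valued function on $\mathcal N\times L^0(\Omega,\mathcal F,P;\mathbb R^m)$. We will assume throughout that the function $f(\cdot,\cdot,\omega)$ is proper, lower semicontinuous and convex for every $\omega\in\Omega$.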

It was shown in [6] that, when applied to ($P_u$), the conjugate duality framework of Rockafellar [14] allows for a unified treatment of many well-known duality frameworks in stochastic optimization and mathematical finance. An important step in the analysis is to derive dual expressions for the optimal value function

 $$\varphi(u):=\inf_{x\in\mathcal N}Ef(x,u)$$

over an appropriate subspace $\mathcal U$ of $L^0$. In this context, the absence of a duality gap is equivalent to the closedness of the value function. Pennanen and Perkkiö [9] and more recently Perkkiö [10] gave conditions that guarantee that $\varphi$ is closed and that the optimum in ($P_u$) is attained for every $u\in\mathcal U$. The given conditions provide far-reaching generalizations of well-known no-arbitrage conditions used in financial mathematics.

The present paper makes two contributions to the duality theory for ($P_u$). First, we extend the general duality framework of [6] by allowing more general dualizing parameters and by relaxing the time-separability property of the Lagrangian. We show that, under suitable conditions, the expression in [6, Theorem 2.2] is still valid in this extended setting. This also provides a correction to [6, Theorem 2.2] which omitted certain integrability conditions that are needed in general; see [7]. Second, we give optimality conditions for the optimal solutions of ($P_u$). Again, we follow the general conjugate duality framework of [14] by specializing the saddle-point conditions to the present setting. The main difficulty here is that, in general, the space $\mathcal N$ does not have a proper topological dual so we cannot write the generalized Karush-Kuhn-Tucker condition in terms of subgradients. Nevertheless, the dual representations obtained in the first part of the paper allow us to write the saddle-point conditions more explicitly in many interesting applications.

In the case of perfectly liquid financial markets, we recover well-known optimality conditions in terms of martingale measures. For Kabanov’s currency market model with transaction costs [3], we obtain optimality conditions in terms of dual variables that extend the notion of a “consistent price system” to possibly nonconical market models. We treat problems of convex optimal control under the generalized framework of Bolza much as in [17]. Our formulation and its embedding in the conjugate duality framework of [14] is slightly different from that in [17], however, so direct comparisons are not possible. Our formulation is motivated by applications in mathematical finance. In particular, the optimality conditions for the currency market model are derived by specializing those obtained for the problem of Bolza.

## 2 Duality

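From now on, we will assume that the parameter $u$ belongs to a decomposable space $\mathcal U\subset L^0$ which is in separating duality with another decomposable space $\mathcal Y\subset L^0$ under the bilinear form

 $$\langle u,y\rangle=E(u\cdot y).$$

Recall that $\mathcal U$ is decomposable if

 $$\mathbbm{1}_Au+\mathbbm{1}_{\Omega\setminus A}u'\in\mathcal U$$

whenever $A\in\mathcal F$, $u\in\mathcal U$ and $u'\in L^\infty$; see e.g. [15]. Examples of such dual pairs include the Lebesgue spaces $\mathcal U=L^p$ and $\mathcal Y=L^q$ and decomposable pairs of Orlicz spaces; see Section 3. The conjugate of $\varphi$ is the extended real-valued convex function on $\mathcal Y$ defined by

 $$\varphi^*(y)=\sup_{u\in\mathcal U}\{\langle u,y\rangle-\varphi(u)\}.$$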

Here and in what follows, . If $\varphi$ is closed with respect to the weak topology induced on $\mathcal U$ by $\mathcal Y$, the biconjugate theorem (see e.g. [14, Theorem 5]) gives the dual representation

 $$\varphi(u)=\sup_{y\in\mathcal Y}\{\langle u,y\rangle-\varphi^*(y)\}.$$

This simple identity is behind many well-known duality relations in operations research and mathematical finance. It was shown in [9] that appropriate generalizations of the no-arbitrage condition from mathematical finance guarantee the closedness of $\varphi$. Recently, the conditions were extended in [10] to allow for more general objectives.
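The biconjugate identity can be illustrated numerically in one dimension. The following is a minimal sketch, not part of the original analysis: the function `phi(u) = u^2`, the grids `us` and `ys`, and the helper `conjugate` are all hypothetical choices made for illustration, and the supremum is approximated by a maximum over a finite grid.

```python
# Illustrative only: discrete Legendre-Fenchel transform on a grid.
def conjugate(points, values, slopes):
    """Return [max over sampled points of (p*s - value(p)) for s in slopes]."""
    return [max(p * s - v for p, v in zip(points, values)) for s in slopes]

# Hypothetical closed convex "value function": phi(u) = u^2 on [-2, 2].
us = [i / 10 for i in range(-20, 21)]   # grid for the parameter u
ys = [i / 10 for i in range(-40, 41)]   # grid for the dual variable y (covers 2u)
phi = [u * u for u in us]

phi_star = conjugate(us, phi, ys)         # approximates y^2 / 4
phi_bistar = conjugate(ys, phi_star, us)  # should recover phi on the grid

# Largest deviation between phi and its discrete biconjugate.
max_gap = max(abs(a - b) for a, b in zip(phi, phi_bistar))
print(max_gap)
```

Because the dual grid contains every slope `2u` of `phi` at the sampled points, the discrete biconjugate reproduces `phi` up to rounding. For a function that is not closed and convex, the same computation would return its closed convex hull instead.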

In general, it may be difficult to derive more explicit expressions for $\varphi^*$. Following [14], one can always write the conjugate as

 $$\varphi^*(y)=-\inf_{x\in\mathcal N}L(x,y),$$

where the Lagrangian $L$ is defined by

 $$L(x,y)=\inf_{u\in\mathcal U}\{Ef(x,u)-\langle u,y\rangle\}.$$

We will show that, under appropriate conditions, the second infimum in

 $$\varphi^*(y)=-\inf_{x\in\mathcal N}\inf_{u\in\mathcal U}\{Ef(x,u)-\langle u,y\rangle\}$$

can be taken scenariowise while the first infimum may be restricted to the space $\mathcal N^\infty:=\mathcal N\cap L^\infty$ of essentially bounded strategies. Both infima are easily calculated in many interesting applications; see [6] and the examples below.

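Accordingly, we define the Lagrangian integrand on $\mathbb R^n\times\mathbb R^m\times\Omega$ by

 $$l(x,y,\omega):=\inf_{u\in\mathbb R^m}\{f(x,u,\omega)-u\cdot y\}.$$

We will also need the pointwise conjugate of $f$:

 $$\begin{aligned} f^*(v,y,\omega) &:=\sup_{x\in\mathbb R^n,\,u\in\mathbb R^m}\{x\cdot v+u\cdot y-f(x,u,\omega)\}\\ &=\sup_{x\in\mathbb R^n}\{x\cdot v-l(x,y,\omega)\}. \end{aligned}$$

By [18, Theorem 14.50], the pointwise conjugate of a normal integrand is also a normal integrand. Clearly, $l$ is upper semicontinuous in the second argument since it is the pointwise infimum of continuous functions of $y$. Similarly,

 $$\underline{l}(x,y,\omega):=\sup_{v\in\mathbb R^n}\{x\cdot v-f^*(v,y,\omega)\}$$

is lower semicontinuous in the first argument. In fact, $\underline{l}(\cdot,y,\omega)$ is the biconjugate of $l(\cdot,y,\omega)$ while $-l(x,\cdot,\omega)$ is the biconjugate of $-\underline{l}(x,\cdot,\omega)$; see [13, Theorem 34.2]. The function $(x,\omega)\mapsto\underline{l}(x,y(\omega),\omega)$ is a normal integrand for any $y\in\mathcal Y$ while $(y,\omega)\mapsto-l(x(\omega),y,\omega)$ is a normal integrand for any $x\in\mathcal N$ and thus, the integral functionals $El$ and $E\underline{l}$ are well-defined on $\mathcal N\times\mathcal Y$ (recall our convention of defining an integral as $+\infty$ unless the positive part of the integrand is integrable). Indeed, we have $\underline{l}(x,y(\omega),\omega)=h^*(x,\omega)$ for $h(v,\omega):=f^*(v,y(\omega),\omega)$, where $h$ is a normal integrand, by [18, Proposition 14.45(c)]. Similarly for $-l$.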
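To make the definitions of the Lagrangian integrand and the pointwise conjugate concrete, the sketch below evaluates both on a grid for a single hypothetical deterministic integrand, `f(x, u) = x^2/2 + (x + u)^2/2`, which is not taken from the paper. For this choice, elementary calculus gives `l(x, y) = x^2/2 + x*y - y^2/2` (convex in `x`, concave in `y`) and `f*(v, y) = (v - y)^2/2 + y^2/2`; the grid bounds and step are likewise assumptions.

```python
# Hypothetical integrand (an assumption for illustration, with no dependence on omega).
def f(x, u):
    return 0.5 * x * x + 0.5 * (x + u) ** 2

def grid(lo, hi, step):
    """Evenly spaced points from lo to hi inclusive."""
    n = round((hi - lo) / step)
    return [lo + i * step for i in range(n + 1)]

U = grid(-10.0, 10.0, 0.01)   # search grid for the parameter u
X = grid(-10.0, 10.0, 0.01)   # search grid for the decision x

def l_closed(x, y):
    # Closed form of the Lagrangian integrand for this particular f.
    return 0.5 * x * x + x * y - 0.5 * y * y

def f_star_closed(v, y):
    # Closed form of the pointwise conjugate for this particular f.
    return 0.5 * (v - y) ** 2 + 0.5 * y * y

def lagrangian(x, y):
    """l(x, y) = inf_u { f(x, u) - u*y }, approximated on the grid U."""
    return min(f(x, u) - u * y for u in U)

def conjugate(v, y):
    """f*(v, y) = sup_x { x*v - l(x, y) }, approximated on the grid X."""
    return max(x * v - l_closed(x, y) for x in X)

print(lagrangian(1.0, 0.5), l_closed(1.0, 0.5))
print(conjugate(2.0, 0.5), f_star_closed(2.0, 0.5))
```

The grid values agree with the closed forms up to discretization error, which illustrates the second expression for $f^*$ as the conjugate of $l$ in its first argument.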

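Restricting strategies to the space $\mathcal N^\infty$ of essentially bounded strategies gives rise to the auxiliary value function

 $$\tilde\varphi(u)=\inf_{x\in\mathcal N^\infty}Ef(x,u).$$

Under the conditions of Theorem 2 below, the conjugates of $\varphi$ and $\tilde\varphi$ coincide, or in other words, the closures of $\varphi$ and $\tilde\varphi$ are equal. The following lemma from [10] will play an important role. We denote

 $$\mathcal N^\perp:=\{v\in L^1(\Omega,\mathcal F,P;\mathbb R^n)\mid E(x\cdot v)=0\ \ \forall x\in\mathcal N^\infty\}.$$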
###### Lemma 1.

Let $x\in\mathcal N$ and $v\in\mathcal N^\perp$. If $E[x\cdot v]^+<\infty$, then $E(x\cdot v)=0$.

We will use the notation

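 $$\operatorname{dom}_1Ef:=\{x\in\mathcal N\mid\exists u\in\mathcal U:\ Ef(x,u)<+\infty\}.$$

Recall that the algebraic closure, $\operatorname{acl}C$, of a set $C$ is the set of points $x$ such that $(x,z]\subset C$ for some $z\in C$. Clearly, $C\subseteq\operatorname{acl}C$ while in a topological vector space, $\operatorname{acl}C\subseteq\operatorname{cl}C$.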

###### Theorem 2.

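If $\operatorname{dom}El(\cdot,y)\cap\mathcal N^\infty\subseteq\operatorname{acl}(\operatorname{dom}_1Ef\cap\mathcal N^\infty)$, then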

 $$\tilde\varphi^*(y)=-\inf_{x\in\mathcal N^\infty}El(x,y).$$

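If, for every $x\in\mathcal N^\infty$ with $x\in\operatorname{cl}\operatorname{dom}_1f$ almost surely, there exists $\bar x\in\mathcal N^\infty$ with $\bar x\in\operatorname{rint}\operatorname{dom}_1f$ almost surely and $(\bar x,x)\subset\operatorname{dom}_1Ef$, then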

 $$\tilde\varphi^*(y)=-\inf_{x\in\mathcal N^\infty}E\underline{l}(x,y).$$

We always have

 $$\tilde\varphi^*(y)\le\varphi^*(y)\le\inf_{v\in\mathcal N^\perp}Ef^*(v,y).$$

In particular, if there exists $v\in\mathcal N^\perp$ such that $\tilde\varphi^*(y)=Ef^*(v,y)$, then

 $$\tilde\varphi^*(y)=\varphi^*(y).$$
###### Proof.

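By the interchange rule ([18, Theorem 14.60]), $L(x,y)=El(x,y)$ for $x\in\operatorname{dom}_1Ef$ and thus,

 $$-\tilde\varphi^*(y)=\inf_{x\in\mathcal N^\infty}L(x,y)=\inf_{x\in\mathcal N^\infty\cap\operatorname{dom}_1Ef}L(x,y)=\inf_{x\in\mathcal N^\infty\cap\operatorname{dom}_1Ef}El(x,y)\ge\inf_{x\in\mathcal N^\infty}El(x,y).$$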

The converse holds trivially if . Otherwise, let and be such that . By the first assumption, there exists an such that for all . By convexity, for small enough and thus

 $$-\tilde\varphi^*(y)=\inf_{x\in\mathcal N^\infty}El(x,y).$$

To prove the second claim, let and . By [13, Theorem 34.3], almost surely so, by assumption, there exists with almost surely and . Thus, so the first domain condition is satisfied. By [13, Theorem 6.1], , and thus by the same line segment argument as above, the infimum in

 $$-\tilde\varphi^*(y)=\inf_{x\in\mathcal N^\infty}El(x,y)$$

can be restricted to those for which . Then, by [13, Theorem 34.2], we may replace by without affecting the infimum.

As to the last claim, the Fenchel inequality gives

 $$f(x,u)+f^*(v,y)\ge x\cdot v+u\cdot y.$$

Therefore, for and with , we have by Lemma 1, so we get

 $$\varphi^*(y)=\sup_{x\in\mathcal N,\,u\in\mathcal U}E[u\cdot y-f(x,u)]\le\inf_{v\in\mathcal N^\perp}Ef^*(v,y).$$

Since $\mathcal N^\infty\subseteq\mathcal N$, we have $\tilde\varphi^*\le\varphi^*$. This completes the proof of the inequalities. The last statement concerning the equality clearly follows from the inequalities.

Theorem 2 gives conditions under which the conjugate of the value function of ($P_u$) can be expressed as

 $$\varphi^*(y)=-\inf_{x\in\mathcal N^\infty}El(x,y).$$

The second part gives conditions that allow one to replace $l$ by $\underline{l}$ in the above expression. One can then resort to the theory of normal integrands when calculating the infimum. It was shown in [6] that this yields many well-known dual expressions in operations research and mathematical finance. Unfortunately, the proof of the above expression in [6, Theorem 2.2] was incorrect and the given conditions are not sufficient in general; see [7]. The following example illustrates what can go wrong if the condition on the domains is omitted. Here and in what follows, $\delta_C$ denotes the indicator function of a set $C$: $\delta_C(x)$ equals $0$ or $+\infty$ depending on whether $x\in C$ or not.

###### Example 3.

Let be trivial, , , , be such that and are unbounded, and let

 $$f(x,u,\omega)=\delta_{\mathbb R_-}(\beta(\omega)x_0+u)$$

so that , and

 $$l(x,y,\omega)=y\beta(\omega)x_0-\delta_{\mathbb R_+}(y).$$

For , we get while . Here , so the first condition in Theorem 2 is violated.

The following example shows how the equality may fail to hold even when the first condition of Theorem 2 is satisfied.

###### Example 4.

Let , , is the completed trivial -algebra, and

 $$f(x,u,\omega)=|x_0-1|+\delta_{\mathbb R_-}(\alpha(\omega)|x_0|-x_1)+\tfrac12|u|^2,$$

where is positive and unbounded. It is easily checked that with the first condition of Theorem 2 holds but and .

At the moment, it is an open question whether the algebraic closure in the domain condition could be replaced by a topological closure. This can be done, however, e.g. if is upper semicontinuous on the closure of . Also, in the deterministic case (where is integer-valued on ), the first condition is automatically satisfied since for all , by [13, Theorem 34.3].

The following describes a general situation where the assumptions of Theorem 2 are satisfied.

###### Example 5.

Consider the problem

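 $$\text{minimize}\quad Ef_0(x)\quad\text{over } x\in\mathcal N\quad\text{subject to}\quad f_j(x)\le0\ \ P\text{-a.s.},\ j=1,\dots,m,$$

where $f_0,\dots,f_m$ are normal integrands. The problem fits the general framework with

 $$f(x,u,\omega)=\begin{cases}f_0(x,\omega)&\text{if } f_j(x,\omega)+u_j\le0\text{ for } j=1,\dots,m,\\ +\infty&\text{otherwise.}\end{cases}$$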

This model was studied in Rockafellar and Wets [16] who gave optimality conditions in terms of dual variables. We will return to optimality conditions in the next section. For now, we note that if for all and , then the assumptions of Theorem 2 are satisfied.

Indeed, so the first two conditions in Theorem 2 hold. The Lagrangian integrand can now be written as

 $$l(x,y,\omega)=f_0(x,\omega)+\sum_{j=1}^my_jf_j(x,\omega).$$

By [14, Theorem 22], the function is Mackey-continuous (with respect to the usual pairing of and ) at the origin of for every so, by convexity, the function

 $$\phi_y(\tilde x):=\inf_{x\in\mathcal N^\infty}El(x+\tilde x,y),$$

is Mackey-continuous as well; see e.g. [14, Theorem 8]. By [14, Theorem 11], this implies that has a subgradient at the origin, i.e. a such that

 $$El(x+\tilde x,y)\ge\phi_y(0)+E(\tilde x\cdot v)\quad\forall\tilde x\in L^\infty,\ x\in\mathcal N^\infty$$

or equivalently

 $$El(\tilde x,y)\ge\phi_y(0)+E((\tilde x-x)\cdot v)\quad\forall\tilde x\in L^\infty,\ x\in\mathcal N^\infty.$$

This implies and . By the interchange rule ([18, Theorem 14.60]), this can be written as , so the last condition of Theorem 2 holds.
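The form of the Lagrangian integrand in Example 5 can be checked on a toy instance. The sketch below is a hedged illustration with hypothetical data that does not come from the paper: a single constraint (`m = 1`), objective `f0(x) = x^2` and constraint function `f1(x) = 1 - x`. The infimum over `u` is approximated on a bounded grid `[-radius, radius]`, which is also an assumption of the sketch.

```python
import math

def f0(x):
    return x * x          # hypothetical objective

def f1(x):
    return 1.0 - x        # hypothetical constraint f1(x) <= 0, i.e. x >= 1

def f(x, u):
    """Integrand of Example 5 with m = 1: +inf when f1(x) + u > 0."""
    return f0(x) if f1(x) + u <= 0 else math.inf

def lagrangian(x, y, radius):
    """Minimum of f(x, u) - u*y over u in [-radius, radius] with step 1/4."""
    us = [k / 4 for k in range(-4 * radius, 4 * radius + 1)]
    return min(f(x, u) - u * y for u in us)

x = 3.0
# For y >= 0 the minimum is attained at u = -f1(x) and equals f0(x) + y*f1(x).
print(lagrangian(x, 1.5, 10), f0(x) + 1.5 * f1(x))
# For y < 0 the value keeps decreasing as the grid grows (the infimum is -inf).
print(lagrangian(x, -1.0, 10), lagrangian(x, -1.0, 100))
```

The second pair of values shows why the expression `f0 + y*f1` is only meaningful for nonnegative multipliers: for `y < 0` the infimum over `u` is unbounded below, so `l(x, y) = -inf` there.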

Section 4 of [5] studied financial models for pricing and hedging of portfolio-valued contingent claims (claims with physical delivery) along the lines of Kabanov [3]; see Example 12 below. The following example is concerned with the more classical financial model with claims with cash-delivery.

###### Example 6.

Consider the problem

 $$\text{minimize}\quad EV\Big(u-\sum_{t=0}^{T-1}x_t\cdot\Delta s_{t+1}\Big)\quad\text{over } x\in\mathcal N, \tag{ALM}$$

where $V$ is a convex normal integrand on $\mathbb R\times\Omega$ such that $V(\cdot,\omega)$ is nondecreasing nonconstant and $V(0,\omega)=0$. This models the optimal investment problem of an agent with financial liabilities $u$ and a “disutility function” $V$. The $\mathcal F_t$-measurable vector $s_t$ gives the unit prices of “risky” assets at time $t$ and the vector $x_t$ the units held over $(t,t+1]$; see e.g. Rásonyi and Stettner [11] and the references therein.

Assume that, for every , there is a such that , then the closure of the value function of ((ALM)) has the dual representation

 $$(\operatorname{cl}\varphi)(u)=\sup_{y\in\mathcal Q}E[uy-V^*(y)],$$

where is the set of positive multiples of martingale densities , i.e. densities of probability measures under which the price process is a martingale.

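Indeed, (ALM) fits the general model ($P_u$) with

 $$\begin{aligned} f(x,u,\omega)&=V\Big(u-\sum_{t=0}^{T-1}x_t\cdot\Delta s_{t+1}(\omega),\omega\Big),\\ l(x,y,\omega)&=-V^*(y,\omega)-y\sum_{t=0}^{T-1}x_t\cdot\Delta s_{t+1}(\omega),\\ f^*(v,y,\omega)&=\begin{cases}V^*(y,\omega)&\text{if } v_t=-y\Delta s_{t+1}(\omega)\text{ for } t<T,\\ +\infty&\text{otherwise.}\end{cases} \end{aligned}$$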

The integrability condition implies so the first two conditions of Theorem 2 hold. Thus,

 $$-\tilde\varphi^*(y)=\inf_{x\in\mathcal N^\infty}El(x,y)=-EV^*(y)+\inf_{x\in\mathcal N^\infty}E\Big[-\sum_{t=0}^{T-1}x_t\cdot(y\Delta s_{t+1})\Big].$$

By Fenchel inequality,

 $$uy-\sum_{t=0}^{T-1}x_t\cdot(y\Delta s_{t+1})\le V\Big(u-\sum_{t=0}^{T-1}x_t\cdot\Delta s_{t+1}\Big)+V^*(y)\quad P\text{-a.s.}$$

where for every and , the right side is integrable for some . Thus, is integrable for every and . Therefore the last infimum equals unless for every , i.e. unless . Moreover, if then holds with for and , so the last condition of Theorem 2 holds.
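The dual representation in Example 6 can be checked numerically in the simplest possible setting. The following is a minimal sketch under assumptions that are not in the original text: a one-period market (`T = 1`) with two scenarios, a single risky asset, the disutility `V(c) = exp(c) - 1` (so that `V*(y) = y*log(y) - y + 1` for `y > 0`), and arbitrary illustrative numbers for the probabilities, price increments and claim. Both the primal and the dual problem are one-dimensional here and are solved by a ternary search; the helper `argmin_convex` is hypothetical.

```python
import math

# Hypothetical one-period, two-scenario market; all numbers are assumptions.
p = [0.5, 0.5]            # reference probabilities P
ds = [1.0, -0.5]          # price increment Delta s_1 in each scenario
u = [0.2, -0.1]           # liability (claim) in each scenario

def V(c):                 # assumed disutility: V(c) = exp(c) - 1
    return math.exp(c) - 1.0

def V_star(y):            # its conjugate for y > 0
    return y * math.log(y) - y + 1.0

def argmin_convex(g, lo, hi, iters=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if g(m1) <= g(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def primal(x):            # E V(u - x * Delta s)
    return sum(pi * V(ui - x * di) for pi, ui, di in zip(p, u, ds))

# The unique martingale measure Q satisfies E_Q[Delta s] = 0.
q_up = -ds[1] / (ds[0] - ds[1])
density = [q_up / p[0], (1.0 - q_up) / p[1]]   # dQ/dP

def dual(lam):            # E[u*y - V*(y)] with y = lam * dQ/dP
    return sum(pi * (ui * lam * d - V_star(lam * d))
               for pi, ui, d in zip(p, u, density))

x_opt = argmin_convex(primal, -10.0, 10.0)
lam_opt = argmin_convex(lambda lam: -dual(lam), 1e-9, 10.0)
primal_value = primal(x_opt)
dual_value = dual(lam_opt)
print(primal_value, dual_value)
```

In this toy model the set of positive multiples of martingale densities is the ray `lam * dQ/dP`, `lam > 0`, so the supremum in the dual representation reduces to a search over the scalar `lam`. The two printed values agree up to numerical tolerance, which is what the absence of a duality gap predicts in this finite setting.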

The following example addresses a parameterized version of the generalized problem of Bolza studied in Rockafellar and Wets [17].

###### Example 7.

Consider the problem

 $$\underset{x\in\mathcal N}{\text{minimize}}\quad E\sum_{t=0}^TK_t(x_t,\Delta x_t+u_t), \tag{1}$$

where , , and each is an -measurable normal integrand on .

We assume that, for every with for all , there exists with for all and , and that and for every and , where . Then the closure of the value function of (1) has the dual representation

 $$(\operatorname{cl}\varphi)(u)=\sup_{y\in\mathcal Y\cap\mathcal N}E\sum_{t=0}^T\big[u_t\cdot y_t-K_t^*(E_t\Delta y_{t+1},y_t)\big].$$

Indeed, the problem fits our general framework with

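 $$\begin{aligned} f(x,u,\omega)&=\sum_{t=0}^TK_t(x_t,\Delta x_t+u_t,\omega),\\ l(x,y,\omega)&=\sum_{t=0}^T\big[-x_t\cdot\Delta y_{t+1}+H_t(x_t,y_t,\omega)\big],\\ f^*(v,y,\omega)&=\sum_{t=0}^TK_t^*(v_t+\Delta y_{t+1},y_t,\omega), \end{aligned}$$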

where with and

 $$H_t(x_t,y_t,\omega):=\inf_{u_t\in\mathbb R^d}\{K_t(x_t,u_t,\omega)-u_t\cdot y_t\}$$

is the associated Hamiltonian.

The domain condition in Theorem 2 is satisfied, so

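 $$-\tilde\varphi^*(y)=\inf_{x\in\mathcal N^\infty}E\underline{l}(x,y)=\inf_{x\in\mathcal N^\infty}E\sum_{t=0}^T\big[-x_t\cdot\Delta y_{t+1}+\underline{H}_t(x_t,y_t)\big],$$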

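where . Thus, by the interchange rule [18, Theorem 14.60],

 $$-\tilde\varphi^*(y)=\inf_{x\in\mathcal N^\infty}E\sum_{t=0}^T\big[-x_t\cdot E_t\Delta y_{t+1}+\underline{H}_t(x_t,y_t,\omega)\big]=-E\sum_{t=0}^TK_t^*(E_t\Delta y_{t+1},y_t)$$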

for adapted . Moreover, with we get , where . The last condition in Theorem 2 thus holds. For any , Jensen’s inequality gives

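 $$-\tilde\varphi^*(y)=\inf_{x\in\mathcal N^\infty}El(x,y)\le\inf_{x\in\mathcal N^\infty}E\sum_{t=0}^T\big[-x_t\cdot E_t\Delta y_{t+1}+H_t(x_t,E_ty_t)\big]=-\tilde\varphi^*({}^ay).$$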

Therefore, for adapted , we get

 $$(\operatorname{cl}\varphi)(u)=\sup_{y\in\mathcal Y\cap\mathcal N}E\sum_{t=0}^T\big[u_t\cdot y_t-K_t^*(E_t\Delta y_{t+1},y_t)\big].$$

The dual representation of the value function in Example 7 was claimed to hold in [5] under the assumption that the Hamiltonian is lsc in . The claim is, however, false in general since it omitted the domain condition in Theorem 2. The integrability condition posed in Example 7 not only provides a sufficient condition for that, but it also makes the lower semicontinuity of the Hamiltonian a redundant assumption.

## 3 Optimality conditions

The previous section as well as the articles [6, 9] were concerned with dual representations of the value function $\varphi$. Continuing in the general conjugate duality framework of Rockafellar [14], this section derives optimality conditions for ($P_u$) by assuming the existence of a subgradient of $\varphi$ at $u$. Besides optimality conditions, this assumption implies the lower semicontinuity of $\varphi$ at $u$ (with respect to the weak topology induced on $\mathcal U$ by $\mathcal Y$) and thus, the absence of a duality gap as well. Whereas in the above references, the topology of convergence in measure in $\mathcal N$ played an important role, below, topologies on $\mathcal N$ are irrelevant.

Recall that a $y\in\mathcal Y$ is a subgradient of $\varphi$ at $u\in\mathcal U$ if

 $$\varphi(u')\ge\varphi(u)+\langle u'-u,y\rangle\quad\forall u'\in\mathcal U.$$

The set of all such $y$ is called the subdifferential of $\varphi$ at $u$ and it is denoted by $\partial\varphi(u)$. If $\partial\varphi(u)\ne\emptyset$, then $\varphi$ is lower semicontinuous at $u$ and

 $$\varphi(u)=\langle u,y\rangle-\varphi^*(y)$$

for every $y\in\partial\varphi(u)$. By [14, Theorem 11], $\partial\varphi(u)\ne\emptyset$, in particular, when $\varphi$ is continuous at $u$.

We assume from now on that is closed in and that is proper.

###### Theorem 8.

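Assume that $\partial\varphi(u)\ne\emptyset$ and that for every $y\in\partial\varphi(u)$ there exists $v\in\mathcal N^\perp$ such that $\varphi^*(y)=Ef^*(v,y)$. Then an $x\in\mathcal N$ solves ($P_u$) if and only if it is feasible and there exist $y\in\partial\varphi(u)$ and $v\in\mathcal N^\perp$ such that

 $$(v,y)\in\partial f(x,u)$$

$P$-almost surely, or equivalently, if

 $$v\in\partial_xl(x,y)\quad\text{and}\quad u\in\partial_y[-l](x,y)$$

$P$-almost surely.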

###### Proof.

Note first that if and such that , then and

 $$\inf_{x\in\mathcal N}Ef(x,u)=\langle u,y\rangle-\varphi^*(y)=E[u\cdot y-f^*(v,y)].$$

Thus, $x$ solves ($P_u$) if and only if

 $$E[f(x,u)+f^*(v,y)]=E[u\cdot y].$$

By Fenchel’s inequality,

 $$f(x,u)+f^*(v,y)\ge x\cdot v+u\cdot y,$$

so, by Lemma 1, $E[f(x,u)+f^*(v,y)]\ge E[u\cdot y]$ for every feasible $x$. Thus, $x$ solves ($P_u$) if and only if $x$ is feasible and the above inequality holds as an equality $P$-almost surely, that is, if

 $$(v,y)\in\partial f(x,u)$$

$P$-almost surely. By [13, Theorem 37.5], this is equivalent to

 $$v\in\partial_xl(x,y)\quad\text{and}\quad u\in\partial_y[-l](x,y)$$

$P$-almost surely.

###### Example 9.

In Example 5, the optimality conditions of Theorem 8 mean that

 $$\begin{aligned} f_j(x)+u_j&\le0,\\ x&\in\operatorname*{argmin}_{z\in\mathbb R^n}\Big\{f_0(z)+\sum_{j=1}^my_jf_j(z)-z\cdot v\Big\},\\ y_j(f_j(x)+u_j)&=0,\qquad j=1,\dots,m,\\ y_j&\ge0 \end{aligned}$$

$P$-almost surely. These are the optimality conditions derived in [16], where sufficient conditions were given for the existence of an optimal $x\in\mathcal N^\infty$ and the corresponding dual variables $y$ and $v$ (in our notation); see [16, Theorem 1]. The conditions of [16] imply the continuity of the optimum value at the origin with respect to the $L^\infty$-norm. This yields the existence of dual variables in the norm dual $(L^\infty)^*$. They then used the condition of “relatively complete recourse” to show that the projections of the dual variables to the subspace $L^1\subset(L^\infty)^*$ satisfy the optimality conditions as well.
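The optimality conditions of Example 9 can be verified by hand on a deterministic toy problem. The sketch below uses hypothetical data that is not from the paper: minimize `x^2` subject to `1 - x + u <= 0`, whose value function is `phi(u) = max(1 + u, 0)^2`. In a single-stage deterministic problem every strategy is adapted, so the only admissible `v` is `0`; that simplification is an assumption of this illustration. The check is written with the complementarity condition in the form `y*(f1(x) + u) = 0`.

```python
def phi(u):
    """Value function of: minimize x^2 subject to 1 - x + u <= 0."""
    return max(1.0 + u, 0.0) ** 2

u = 0.0
x = 1.0 + u          # optimal solution (the constraint is active)
y = 2.0 * (1.0 + u)  # candidate dual variable
v = 0.0              # single-stage deterministic case: v = 0

# Feasibility: f1(x) + u <= 0 with f1(x) = 1 - x.
feasible = (1.0 - x) + u <= 0
# Complementary slackness and sign condition on the multiplier.
complementary = y * ((1.0 - x) + u) == 0 and y >= 0
# Stationarity: x minimizes z^2 + y*(1 - z) - z*v, i.e. 2x - y - v = 0.
stationary = 2.0 * x - y - v == 0
# Subgradient inequality phi(w) >= phi(u) + y*(w - u) on a grid of w.
subgradient_ok = all(phi(w) >= phi(u) + y * (w - u) - 1e-12
                     for w in [k / 10 for k in range(-30, 31)])
print(feasible, complementary, stationary, subgradient_ok)
```

The last check illustrates the link used throughout this section: the multiplier `y` is a subgradient of the value function at `u`.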

We will now describe another general setup which covers many interesting applications and where the subdifferentiability condition in Theorem 8 is satisfied. This framework is motivated by Biagini [1], where similar arguments were applied to optimal investment in the continuous-time setting. The idea is simply to require stronger continuity properties on $\varphi$ in order to get the existence of dual variables directly in $\mathcal Y$ (without going through the more exotic space $(L^\infty)^*$ first as in [16]).

A topological vector space is said to be barreled if every closed convex absorbing set is a neighborhood of the origin. By [14, Corollary 8B], a lower semicontinuous convex function on a barreled space is continuous throughout the algebraic interior (core) of its domain. On the other hand, by [14, Theorem 11], continuity implies subdifferentiability. Fréchet spaces and, in particular, Banach spaces are barreled. In the following, we say that $\mathcal U$ is barreled if it is barreled with respect to a topology compatible with the pairing with $\mathcal Y$.

The following applies Theorem 8 to optimal investment in perfectly liquid financial markets.

###### Example 10.

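Consider Example 6 and assume that $\mathcal U$ is barreled, $EV$ is finite on $\mathcal U$ and $EV^*$ is proper in $\mathcal Y$. Then an $x\in\mathcal N$ solves (ALM) if and only if it is feasible and there exists $y\in\mathcal Q$ such that

 $$y\in\partial V\Big(u-\sum_{t=0}^{T-1}x_t\cdot\Delta s_{t+1}\Big)\quad P\text{-a.s.}$$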
###### Proof.

By [14, Theorem 21], $EV$ is lower semicontinuous so by [14, Corollary 8B], it is continuous. Since

 $$\varphi(u)\le Ef(0,u)=EV(u),$$

[14, Theorem 8] implies that is continuous and thus subdifferentiable throughout , by [14, Theorem 11]. Moreover, for every and given by . The assumptions of Theorem 8 are thus satisfied. The subdifferential conditions for the Lagrangian integrand can now be written as

 vt =−yΔst+1P-a.s. for t

Since , the former means that while the latter can be written in terms of as stated; see [14, Theorem 12]. ∎

The optimality conditions in Example 10 are classical in financial mathematics; for continuous-time models, see e.g. Schachermayer [19] or Biagini and Frittelli [2] and the references therein.
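The optimality condition of Example 10 can be made tangible in a one-period, two-scenario market. The sketch below is an illustration under assumptions not found in the original: the disutility `V(c) = exp(c) - 1`, a single risky asset, and arbitrary numbers for the probabilities, price increments and claim. For this smooth `V` the subdifferential is the single point `V'(c) = exp(c)`, and the optimal position solves the first-order condition in closed form.

```python
import math

# Hypothetical data (assumptions for illustration only).
p = [0.5, 0.5]            # reference probabilities P
ds = [1.0, -0.5]          # price increment Delta s_1 in each scenario
u = [0.2, -0.1]           # liability (claim) in each scenario

# First-order condition E[Delta s * exp(u - x*Delta s)] = 0, solved for x.
a, b = ds[0], -ds[1]
x_opt = (math.log(p[0] * a / (p[1] * b)) + u[0] - u[1]) / (a + b)

# Candidate dual variable: y = V'(u - x*Delta s) with V(c) = exp(c) - 1.
y = [math.exp(ui - x_opt * di) for ui, di in zip(u, ds)]

# y should be a positive multiple of a martingale density: E[y * Delta s] = 0.
martingale_gap = sum(pi * yi * di for pi, yi, di in zip(p, y, ds))

# Normalizing y gives the implied martingale measure Q.
total = sum(pi * yi for pi, yi in zip(p, y))
q = [pi * yi / total for pi, yi in zip(p, y)]
print(x_opt, martingale_gap, q)
```

Here the marginal disutility at the optimum, once normalized, is exactly the density of the measure under which the price increment has zero mean, which is the classical martingale-measure form of the optimality condition.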

The next example applies Theorem 8 to the problem of Bolza from Example 7. It constructs a subgradient using Jensen’s inequality.

###### Example 11.

Consider Example 7 and assume that is barreled and that there exists a normal integrand such that is finite on , is proper on , and, for every ,

 $$\sum_{t=0}^TK_t(x_t,\Delta x_t+u_t)\le\theta(u)\quad P\text{-a.s.}$$

for some . Then an solves

 $$\underset{x\in\mathcal N}{\text{minimize}}\quad E\sum_{t=0}^TK_t(x_t,\Delta x_t+u_t)$$

for a if and only if is feasible and there exists such that

 $$(E_t\Delta y_{t+1},y_t)\in\partial K_t(x_t,\Delta x_t+u_t)$$

$P$-almost surely for all $t$, or equivalently, if

 $$E_t\Delta y_{t+1}\in\partial_xH_t(x_t,y_t),\qquad u_t+\Delta x_t\in\partial_y[-H_t](x_t,y_t)$$

$P$-almost surely for all $t$.

###### Proof.

The space is a barreled space and its continuous dual may be identified with . Indeed, by Hahn–Banach, any may be extended to a for which coincides with on . Moreover, for any closed convex absorbing set in , we have that is a closed convex absorbing set in , so it is a neighborhood of the origin of which implies that is a neighborhood of the origin of .

By assumption, for every there is an such that

 $$\varphi(u)\le E\sum_{t=0}^TK_t(x_t,\Delta x_t+u_t)\le E\theta(u).$$

By [14, Theorem 21], is lower semicontinuous so by [14, Corollary 8B], it is continuous. Thus, is continuous and in particular, subdifferentiable on (see [14, Theorem 11]) so for every there is a such that

 $$\varphi(u')\ge\varphi(u)+\langle u'-u,y\rangle\quad\forall u'\in\mathcal U\cap\mathcal N.$$

Since each is -measurable, Jensen’s inequality gives

 $$\varphi(u')\ge\varphi({}^au')\ge\varphi(u)+\langle{}^au'-u,y\rangle=\varphi(u)+\langle u'-u,y\rangle\quad\forall u'\in\mathcal U,$$

so as well. Moreover, as observed in Example 7, we have for . The subdifferential conditions in Theorem 8 for the Lagrangian integrand become

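 $$\begin{aligned} v_t+\Delta y_{t+1}&\in\partial_xH_t(x_t,y_t)\quad P\text{-a.s.}\ \forall t,\\ u_t+\Delta x_t&\in\partial_y[-H_t](x_t,y_t)\quad P\text{-a.s.}\ \forall t. \end{aligned}$$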

By [13, Theorem 37.5], these are equivalent to the conditions in terms of . ∎

The optimality condition in terms of in Example 11 can be viewed as a stochastic Euler-Lagrange condition in discrete-time much like that in [17, Theorem 4]. There is a difference, however, in that [17] studied the problem of minimizing and, accordingly, the measurability conditions posed on the dual variables were different as well. The condition in terms of can be viewed as a stochastic Hamiltonian system in discrete-time; see [12, Section 9] for deterministic models in continuous-time. Our assumptions on the problem also differ from those made in [17]. Whereas the assumptions of [17] and the line of argument follows that in [16] (see Example 9), our assumption implies the continuity of on the adapted subspace .

It is essential that the growth condition in Example 11 is required only for adapted . Indeed, since is adapted, it would often be too much to ask the upper bound for nonadapted . This is the case e.g. in the following example which is concerned with the financial model studied in [8, 9]. The model is an extension of the currency market model introduced by Kabanov [3]; see also Kabanov and Safarian [4].

###### Example 12.

Consider the optimal investment-consumption problem

 minimize ET∑t=0Vt(−k