
# Capacitary measures for completely monotone kernels via singular control

## Abstract

We give a singular control approach to the problem of minimizing an energy functional for measures with given total mass on a compact real interval, when energy is defined in terms of a completely monotone kernel. This problem occurs both in potential theory and when looking for optimal financial order execution strategies under transient price impact. In our setup, measures or order execution strategies are interpreted as singular controls, and the capacitary measure is the unique optimal control. The minimal energy, or equivalently the capacity of the underlying interval, is characterized by means of a nonstandard infinite-dimensional Riccati differential equation, which is analyzed in some detail. We then show that the capacitary measure has two Dirac components at the endpoints of the interval and a continuous Lebesgue density in between. This density can be obtained as the solution of a certain Volterra integral equation of the second kind.

Keywords: Singular control, verification argument, capacity theory, infinite-dimensional Riccati differential equation, optimal order execution, optimal trade execution, transient price impact

AMS Subject Classification: 49J15, 49K15, 31C15, 49N90, 91G80, 34G20

## 1 Introduction and statement of results

### 1.1 Background

Let be a function. The problem of minimizing the energy functional

$$E(\mu) := \frac{1}{2}\int\!\!\int G(|t-s|)\,\mu(ds)\,\mu(dt)$$

over probability measures supported by a given compact set plays an important role in potential theory. A minimizing measure , when it exists, is called a capacitary measure, and the value is called the capacity of the set ; see, e.g., \citeasnounchoquet, \citeasnounfuglede, and \citeasnounlandkof. See also \citeasnounAikawaEssen or \citeasnounHelms for more recent books on potential theory.

In this paper, we develop a control approach to determining the capacitary distribution when is a compact interval and is a completely monotone function. In this approach, measures on will be regarded as singular controls and is the objective function. Our goal is to obtain qualitative structure theorems for the optimal control and characterize by means of certain differential and integral equations.

The intuition for this control approach, and in fact our original motivation, come from the problem of optimal order execution in mathematical finance. In this problem, one considers an economic agent who wishes to liquidate a certain asset position of shares within the time interval . This asset position can either be a long position () or a short position ). The order execution strategy chosen by the investor is described by the asset position held at time . In particular, one must have . Requiring the condition assures that the initial position has been unwound by time . The left-continuous path will be nonincreasing for a pure sell strategy and nondecreasing for a pure buy strategy. A general strategy can consist of both buy and sell trades and hence can be described as the sum of a nonincreasing and a nondecreasing strategy. That is, is a path of finite variation.

The problem the economic agent is facing is that his or her trades impact the price of the underlying asset. To model price impact, one starts by informally defining as the immediate price impact generated by the (possibly infinitesimal) trade executed at time . Next, it is an empirically well-established fact that price impact is transient and decays over time; see, e.g., \citeasnounMoroEtAl. This decay of price impact can be described informally by requiring that is the remaining impact at time of the impact generated by the trade . Here, is a nonincreasing function with , the decay kernel. Thus, is the price impact of the strategy , cumulated until time . This price impact creates liquidation costs for the economic agent, and one can derive that, under the common martingale assumption for unaffected asset prices, these costs are given by

$$C(X) := \frac{1}{2}\int_{[0,T]}\int_{[0,T]} G(|t-s|)\,dX_s\,dX_t \tag{1}$$

plus a stochastic error term with expectation independent of the specific strategy ; see \citeasnounGSS. Indeed, let us assume that asset prices are given by where is a continuous martingale and models the price impact of the trading strategy at time . Then, we assume that the order is made at the average price and costs , which corresponds to a block-shaped limit order book; see \citeasnounAFS2. Accumulating these costs over , integrating by parts twice, and taking expectations yields

$$\mathbb{E}\Big[\int_{[0,T]} \frac{1}{2}\big(S^X_{t-}+S^X_t\big)\,dX_t\Big] = -S^0_0X_0 + \mathbb{E}[C(X)],$$

where we have used the fact that , due to the martingale assumption on . Further details can be found in \citeasnounGSS.

Thus, minimizing the expected costs amounts to minimizing the functional over all left-continuous strategies that are of bounded variation and satisfy and . This problem was formulated and solved in the special case of exponential decay, , by \citeasnounow. The general case was analyzed by \citeasnounASS in discrete time and by \citeasnounGSS in the continuous-time setup we have used above. We refer to \citeasnounAFS2, \citeasnounAS, \citeasnounGSS2, \citeasnounPredoiuShaikhetShreve, \citeasnounSchiedSlynko, and \citeasnounGatheralSchiedSurvey for further discussions and additional references in the context of mathematical finance.

Clearly, the cost functional coincides with the energy functional of the measure . So finding an optimal order execution strategy is basically equivalent to determining a capacitary measure for . There is one important difference, however: capacitary measures are determined as minimizers of with respect to all nonnegative measures on with total mass , while may be a signed measure with given total mass . This difference can become significant if is only required to be positive definite in the sense of Bochner (which is essentially equivalent to for all ), because then minimizers of the unconstrained problem need not exist. It was first shown by \citeasnounASS, and later extended to continuous time by \citeasnounGSS, that a unique optimal order execution strategy exists and that is a monotone function of when is convex and nonincreasing. This result has the important consequence that the constrained problem of finding a capacitary measure is equivalent to the unconstrained problem of determining an optimal order execution strategy.

In this paper, we aim at describing the structure of capacitary measures/optimal order execution strategies. To this end, it is instructive to first look at two specific examples in which the optimizer is known in explicit form. \citeasnounow find that for exponential decay, , the capacitary measure has two singular components at and and a constant Lebesgue density on :

$$\mu^*(dt) = \frac{1}{2+\rho T}\,\delta_0(dt) + \frac{\rho}{2+\rho T}\,dt + \frac{1}{2+\rho T}\,\delta_T(dt). \tag{2}$$
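As a quick sanity check on (2), one can verify numerically that this measure has total mass one and that its potential $t\mapsto\int G(|t-s|)\,\mu^*(ds)$ is constant on $[0,T]$, which is the usual first-order condition for minimizing a quadratic energy under a mass constraint. The following Python sketch does this for the exponential kernel $G(t)=e^{-\rho t}$; the parameter values and function names are illustrative assumptions and do not come from the paper.

```python
import math

def ow_weights(rho, T):
    """Atoms at 0 and T and the constant density of the measure in (2)."""
    atom = 1.0 / (2.0 + rho * T)
    density = rho / (2.0 + rho * T)
    return atom, density

def potential(t, rho, T):
    """Potential of the measure in (2) at time t for G(s) = exp(-rho*s)."""
    atom, density = ow_weights(rho, T)
    e_left = math.exp(-rho * t)          # G(|t - 0|)
    e_right = math.exp(-rho * (T - t))   # G(|t - T|)
    # Closed form of the integral of exp(-rho*|t-s|) over s in [0, T].
    lebesgue_part = (2.0 - e_left - e_right) / rho
    return atom * (e_left + e_right) + density * lebesgue_part

rho, T = 0.7, 3.0                        # illustrative values (assumption)
atom, density = ow_weights(rho, T)
total_mass = 2.0 * atom + density * T
values = [potential(k * T / 10, rho, T) for k in range(11)]
# E(mu) = (1/2) * integral of the potential against mu; the potential is constant.
energy = 0.5 * values[0] * total_mass
```

Under these assumptions the potential equals $2/(2+\rho T)$ at every grid point, so the energy of (2) is $1/(2+\rho T)$.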

Numerical experiments show that it is a common pattern that capacitary measures for nonincreasing convex kernels have two singular components at and and a Lebesgue density on . However, the capacitary measure for is the purely discrete measure

$$\mu^* = \frac{1}{2+N}\sum_{i=0}^{N}\Big(1-\frac{i}{N+1}\Big)\big(\delta_{i/\rho}+\delta_{T-i/\rho}\big),$$

where [?, Proposition 2.14].

So it is an interesting question for which nonincreasing, convex kernels the capacitary measure has singular components only at and and is (absolutely) continuous on . It turns out that a sufficient condition is the complete monotonicity of , i.e., belongs to and is nonnegative in for . More precisely, we have the following result, which is in fact an immediate corollary of the main results in this paper.

###### Corollary 1.

Suppose that is completely monotone with . Then the capacitary measure has two Dirac components at and and has a continuous Lebesgue density on .

### 1.2 Statement of main results

Our main results not only give the preceding qualitative statement on the form of but also provide quantitative descriptions of the Dirac components of and of its Lebesgue density on . To prepare for the statement of these results, let us first assume that , which we can do without loss of generality. Then we recall that by the celebrated Hausdorff–Bernstein–Widder theorem [?, Theorem IV.12a], is completely monotone if and only if it is the Laplace transform of a Borel probability measure on :

$$G(t) = \int e^{-\rho t}\,\lambda(d\rho),\qquad t\ge 0.$$

In particular, every exponential polynomial,

$$G(t) = \sum_{i=0}^{d}\lambda_i e^{-\rho_i t}, \tag{3}$$

with and is completely monotone. Another example is power-law decay,

$$G(t) = \frac{1}{(1+t)^{\gamma}}\quad\text{for some } \gamma>0,$$

which is a popular choice for the decay of price impact in the econophysics literature; see \citeasnounGatheral and the references therein. We assume henceforth that , which is equivalent to

$$\bar\rho := \int \rho\,\lambda(d\rho) < \infty \quad\text{and}\quad \overline{\rho^2} := \int \rho^2\,\lambda(d\rho) < \infty. \tag{4}$$
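For instance, power-law decay is the Laplace transform of a Gamma distribution with shape parameter $\gamma$ and unit rate, so the two moments in (4) can be read off directly. This is a standard computation, included here only as an illustration:

```latex
\frac{1}{(1+t)^{\gamma}}
  = \int_0^\infty e^{-\rho t}\,\frac{\rho^{\gamma-1}e^{-\rho}}{\Gamma(\gamma)}\,d\rho,
\qquad
\bar\rho = \gamma = -G'(0),
\qquad
\overline{\rho^2} = \gamma(\gamma+1) = G''(0).
```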

A crucial role will be played by the following infinite-dimensional Riccati equation for functions ,

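$$\varphi'(t,\rho_1,\rho_2) + (\rho_1+\rho_2)\,\varphi(t,\rho_1,\rho_2) = \frac{1}{2\bar\rho}\Big(\rho_1+\int x\,\varphi(t,\rho_1,x)\,\lambda(dx)\Big)\Big(\rho_2+\int x\,\varphi(t,x,\rho_2)\,\lambda(dx)\Big), \tag{5}$$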

where denotes the time derivative of , and the function satisfies the initial condition

$$\varphi(0,\rho_1,\rho_2) = 1\quad\text{for all } \rho_1,\rho_2\ge 0. \tag{6}$$

###### Remark 1.

When writing (5) in the form one sees that the functional is not a continuous map from some reasonable function space into itself, unless is concentrated on a compact interval. For instance, it involves the typically unbounded linear operator . Therefore, existence and uniqueness of solutions to (5), (6) do not follow by an immediate application of standard results such as the Cauchy–Lipschitz/Picard–Lindelöf theorem in Banach spaces [?, Theorem 3.4.1] or more recent ones such as those in \citeasnounTeixeira and the references therein. In fact, even in the simplest case in which reduces to a Dirac measure, the existence of a global solution hinges on the initial condition; it is easy to see that solutions blow up when is not chosen in a suitable manner.

We now state a result on the global existence and uniqueness of solutions to (5), (6). It states that the solution takes values in the locally convex space endowed with the topology of locally uniform convergence. For integers , the space will consist of all continuous functions which, when considered as functions for some compact subset of , belong to .

###### Theorem 1.

When the initial value problem (5), (6) admits a unique solution in the class of functions in that satisfy an inequality of the form

$$0 \le \tilde\varphi(t,\rho_1,\rho_2) \le c\,(1+\rho_1)(1+\rho_2), \tag{7}$$

where is a constant that may depend on and locally uniformly on . Moreover, has the following properties.

1. is strictly positive.

2. is symmetric: for all .

3. for all .

4. .

5. For every , the kernel is nonnegative definite on , i.e.,

$$\int\!\!\int f(x)f(y)\,\varphi(t,x,y)\,\lambda(dx)\,\lambda(dy) \ge 0\quad\text{for } f\in L^2(\lambda). \tag{8}$$

6. The functions and satisfy local Lipschitz conditions in , locally uniformly in .

In Section 1.3 we will discuss computational aspects of the initial value problem (5), (6). In particular, we will discuss its solution when is an exponential polynomial of the form (3) and we will provide closed-form solutions in the cases and .

We can now explain how to use singular control in approaching the minimization of or . To this end, using order execution strategies will be more convenient than using the formalism of the associated measures because of the natural dynamic interpretation of . Henceforth, a -admissible strategy will be a left-continuous function of bounded variation such that . Our goal is to minimize the cost functional defined in (1) over all -admissible strategies with fixed initial value . Clearly, this problem is not yet suitable for the application of control techniques since depends on the entire path of . We therefore introduce the auxiliary functions

$$E^X_t(\rho) := \int_{[0,t)} e^{-\rho(t-s)}\,dX_s,\qquad\text{for } \rho\ge 0. \tag{9}$$

These functions will play the role of state variables that are controlled by the strategy .

###### Lemma 1.

For any -admissible strategy , the function is uniformly bounded in and . Moreover,

$$C(X) = \int_{[0,T)}\int E^X_t(\rho)\,\lambda(d\rho)\,dX_t + \frac{1}{2}\sum_{t\le T}(\Delta X_t)^2, \tag{10}$$

where denotes the jump of at .

Proof. Clearly, , where denotes the total variation of over . To obtain (10), we integrate by parts to get

$$C(X) = \int_{[0,T)}\int_{[0,t)} G(t-s)\,dX_s\,dX_t + \frac{G(0)}{2}\sum_{t\le T}(\Delta X_t)^2.$$

Now we write as and apply Fubini’s theorem. ∎.
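For a strategy consisting of finitely many block trades, the decomposition behind Lemma 1 can be checked directly. The sketch below compares the quadratic form (1) with a recursive evaluation that only tracks the exponentially discounted past trades, one state per decay rate. The two-term exponential kernel, the trade times, and the trade sizes are all illustrative assumptions.

```python
import math

# Illustrative kernel G(t) = sum_l lam[l] * exp(-rho[l] * t) (assumption).
lam = [0.4, 0.6]
rho = [1.0, 3.0]

def G(t):
    return sum(l * math.exp(-r * t) for l, r in zip(lam, rho))

times = [0.0, 0.5, 1.2, 2.0]      # trade times, increasing
trades = [-0.3, 0.1, -0.5, -0.3]  # jumps of X at those times (they sum to -1)

# Direct evaluation of the quadratic form (1).
cost_direct = 0.5 * sum(
    G(abs(t - s)) * x * y
    for t, x in zip(times, trades)
    for s, y in zip(times, trades)
)

# State-variable evaluation: E[l] holds sum_{s < t} exp(-rho[l]*(t-s)) * dX_s.
E = [0.0 for _ in rho]
cost_state = 0.0
prev_t = times[0]
for t, x in zip(times, trades):
    decay = [math.exp(-r * (t - prev_t)) for r in rho]
    E = [e * d for e, d in zip(E, decay)]   # let past impact decay up to time t
    # Cross term with earlier trades plus the diagonal term G(0)/2 * x^2.
    cost_state += x * sum(l * e for l, e in zip(lam, E)) + 0.5 * G(0.0) * x * x
    E = [e + x for e in E]                  # absorb the trade made at time t
    prev_t = t
```

The point of the recursion is that the cost accrued at each trade depends on the past only through the finite vector `E`, which is what makes the problem amenable to control techniques.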

The form (10) of our cost functional is now suitable for the application of control techniques. To state our main result, we let be the solution of our infinite-dimensional Riccati equation as provided by Theorem 1 and we define

$$\varphi_0(t) := \varphi(t,0,0)\quad\text{and}\quad \psi(t,\rho) := \int x\,\varphi(t,x,\rho)\,\lambda(dx). \tag{11}$$

###### Theorem 2.

Let be the unique optimal strategy in the class of -admissible strategies with initial value . Then

$$C(X^*) = \frac{x^2}{2\varphi_0(T)}. \tag{12}$$

Moreover, has jumps at and of size

$$\Delta X^*_0 = \Delta X^*_T = -\frac{\psi(T,0)}{2\bar\rho\,\varphi_0(T)}\,x$$

and is continuously differentiable on . The derivative is the unique continuous solution of the Volterra integral equation

$$\xi(t) = f(t) + \int_0^t K(t,s)\,\xi(s)\,ds, \tag{13}$$

where, for

$$\Theta(t,\rho) := \frac{\rho+\psi(t,\rho)}{\psi(t,0)}\int x^2\varphi(t,x,0)\,\lambda(dx) - \int x^2\varphi(t,x,\rho)\,\lambda(dx) + \rho^2, \tag{14}$$

the function and the kernel are given by

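$$f(t) = \frac{\Delta X^*_0}{2\bar\rho}\int e^{-\rho t}\,\Theta(T-t,\rho)\,\lambda(d\rho),\qquad K(t,s) = \frac{1}{2\bar\rho}\int e^{-\rho(t-s)}\,\Theta(T-t,\rho)\,\lambda(d\rho). \tag{15}$$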
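Volterra equations of the second kind such as (13) are straightforward to solve numerically once $f$ and $K$ are available. The following sketch uses a trapezoidal rule, which is one common choice and not a method prescribed by the paper. The function name `solve_volterra` and the test kernel are hypothetical. The test case is picked by hand so that the exact solution is a known constant.

```python
import math

def solve_volterra(f, K, T, n):
    """Trapezoidal solution of xi(t) = f(t) + int_0^t K(t, s) xi(s) ds on [0, T]."""
    h = T / n
    t = [i * h for i in range(n + 1)]
    xi = [f(t[0])]                      # at t = 0 the integral vanishes
    for i in range(1, n + 1):
        acc = 0.5 * K(t[i], t[0]) * xi[0]
        acc += sum(K(t[i], t[j]) * xi[j] for j in range(1, i))
        # The unknown xi[i] enters with weight (h/2) * K(t_i, t_i); solve for it.
        xi.append((f(t[i]) + h * acc) / (1.0 - 0.5 * h * K(t[i], t[i])))
    return t, xi

# Hand-picked test case (an assumption for illustration): with
# f(t) = a*rho*exp(-rho*t) and K(t, s) = rho*exp(-rho*(t - s)),
# the constant function xi = a*rho solves the equation exactly.
rho, a, T = 0.7, -0.25, 3.0
t, xi = solve_volterra(
    lambda u: a * rho * math.exp(-rho * u),
    lambda u, s: rho * math.exp(-rho * (u - s)),
    T,
    600,
)
max_dev = max(abs(v - a * rho) for v in xi)
```

The scheme costs $O(n^2)$ kernel evaluations, so a finer grid is cheap for smooth kernels; a higher-order rule could be substituted if more accuracy is needed.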

Let us recall that we know in addition from Theorem 2.20 in \citeasnounGSS that is monotone. The identity (12) immediately yields the following formula for the capacity of a compact interval.

###### Corollary 2.

If , the capacity of a compact interval is given by

$$\mathrm{Cap}([a,b]) = 2\varphi_0(b-a).$$

### 1.3 Computational aspects

In general, the Riccati equation (5), (6) cannot be solved explicitly. A closed-form solution exists, however, when is an exponential polynomial as in (3), i.e., when has discrete support. Let us assume that , with , , and . The only input needed in Theorem 2 is the set of values , for . By Theorem 1, is a symmetric matrix that solves the following matrix Riccati equation:

$$\varphi' = -\varphi M^{(3)}\varphi - \varphi M^{(4)} + M^{(1)}\varphi + M^{(2)}, \tag{16}$$

with , , and . According to \citeasnounLevin, the solution of this equation is given by

$$\varphi(t) = \big(R^{(1)}(t)\mathbf{1} + R^{(2)}(t)\big)\big(R^{(3)}(t)\mathbf{1} + R^{(4)}(t)\big)^{-1},$$

where and

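$$R(t) = \begin{bmatrix} R^{(1)}(t) & R^{(2)}(t)\\ R^{(3)}(t) & R^{(4)}(t)\end{bmatrix} = \exp\left(t\begin{bmatrix} M^{(1)} & M^{(2)}\\ M^{(3)} & M^{(4)}\end{bmatrix}\right).$$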
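The reason a formula of this type solves (16) is the standard linearization of matrix Riccati equations. As a reminder, with $U$ and $V$ as auxiliary names that are not used elsewhere in the text:

```latex
\begin{aligned}
&U' = M^{(1)}U + M^{(2)}V,\qquad V' = M^{(3)}U + M^{(4)}V,\qquad \varphi := UV^{-1},\\
&\varphi' = U'V^{-1} - UV^{-1}V'V^{-1}
          = M^{(1)}\varphi + M^{(2)} - \varphi M^{(3)}\varphi - \varphi M^{(4)}.
\end{aligned}
```

The pair $(U,V)$ solves a linear system, so it is obtained by applying the block matrix exponential to the initial data $U(0)=\varphi(0)$, $V(0)=I$; the representation is valid as long as $V(t)$ remains invertible.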

In the special cases and , the solution of the Riccati equation (5), (6) becomes even easier and, to some extent, explicit. We demonstrate this first for and then for :

###### Example 1.

In the case , is of the form for some and some . Clearly, we can set without changing the optimization problem. Then , and (5) becomes

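$$\varphi_{00}' = \frac{\rho}{2}\varphi_{01}^2,\qquad \varphi_{01}' + \rho\varphi_{01} = \frac{\rho}{2}(1+\varphi_{11})\varphi_{01},\qquad \varphi_{11}' + 2\rho\varphi_{11} = \frac{\rho}{2}(1+\varphi_{11})^2.$$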

For the initial condition , the preceding equation has the unique solution and . The condition (59) thus reduces to , which easily yields (2) as unique solution.
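The three ODEs of this example can also be integrated numerically and compared with (2) via formula (12). The sketch below uses a classical Runge–Kutta scheme started from the initial condition (6); the parameter values are illustrative assumptions. The comparison value $1/(2+\rho T)$ is the energy of the measure (2), obtained by a direct computation that is not part of the paper's text.

```python
def rhs(y, rho):
    """Right-hand side of the three ODEs for (phi00, phi01, phi11)."""
    p00, p01, p11 = y
    return (
        0.5 * rho * p01 ** 2,
        0.5 * rho * (1.0 + p11) * p01 - rho * p01,
        0.5 * rho * (1.0 + p11) ** 2 - 2.0 * rho * p11,
    )

def rk4(y, rho, T, n):
    """Classical fourth-order Runge-Kutta from time 0 to T with n steps."""
    h = T / n
    for _ in range(n):
        k1 = rhs(y, rho)
        k2 = rhs(tuple(a + 0.5 * h * b for a, b in zip(y, k1)), rho)
        k3 = rhs(tuple(a + 0.5 * h * b for a, b in zip(y, k2)), rho)
        k4 = rhs(tuple(a + h * b for a, b in zip(y, k3)), rho)
        y = tuple(
            a + h * (b1 + 2 * b2 + 2 * b3 + b4) / 6.0
            for a, b1, b2, b3, b4 in zip(y, k1, k2, k3, k4)
        )
    return y

rho, T = 0.7, 3.0                      # illustrative values (assumption)
phi00, phi01, phi11 = rk4((1.0, 1.0, 1.0), rho, T, 300)
cost_riccati = 1.0 / (2.0 * phi00)     # formula (12) with x = 1
cost_measure = 1.0 / (2.0 + rho * T)   # energy of the measure in (2)
```

Both routes give the same minimal cost, which is a useful consistency check between the Riccati characterization and the explicit solution for exponential decay.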

###### Example 2.

In the case , we can assume that is of the form , where . Consider a solution of the matrix Riccati equation (16) with . We can simplify (16) by using the relation

$$\lambda_1\varphi_{i1} + \lambda_2\varphi_{i2} = 1,\qquad i=0,\dots,2. \tag{17}$$

Indeed, the equation for then becomes

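$$\varphi_{11}' + 2\rho_1\varphi_{11} = \frac{1}{2\bar\rho}\big(\rho_1 + \lambda_1\rho_1\varphi_{11} + \lambda_2\rho_2\varphi_{12}\big)^2 = \frac{1}{2\bar\rho}\big(\rho_1+\rho_2+\lambda_1(\rho_1-\rho_2)\varphi_{11}\big)^2.$$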

This is an autonomous ODE that, for the initial condition , is solved by

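$$\varphi_{11}(t) = c_1 + \left[\left(\frac{1}{1-c_1} - \frac{1}{4}\frac{\lambda_1^2(\rho_1-\rho_2)^2}{\sqrt{\rho_1\rho_2\bar\rho\check\rho}}\right)\exp\!\left(\frac{2\sqrt{\rho_1\rho_2\bar\rho\check\rho}}{\bar\rho}\cdot t\right) + \frac{1}{4}\frac{\lambda_1^2(\rho_1-\rho_2)^2}{\sqrt{\rho_1\rho_2\bar\rho\check\rho}}\right]^{-1}, \tag{18}$$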

where

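$$\check\rho := \lambda_1\rho_2 + \lambda_2\rho_1\quad\text{and}\quad c_1 = \left(\frac{\sqrt{\rho_1\bar\rho}-\sqrt{\rho_2\check\rho}}{\lambda_1(\rho_1-\rho_2)}\right)^2.$$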

We can notice that and .
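A closed-form expression such as (18) is easy to mistype, so it is worth checking it against a direct numerical integration of the autonomous ODE for $\varphi_{11}$ with initial value $\varphi_{11}(0)=1$ from (6). In the sketch below the weights and decay rates are arbitrary illustrative choices, and the helper names are hypothetical.

```python
import math

lam1, lam2 = 0.4, 0.6          # illustrative weights with lam1 + lam2 = 1
rho1, rho2 = 1.0, 3.0          # illustrative decay rates

rho_bar = lam1 * rho1 + lam2 * rho2
rho_check = lam1 * rho2 + lam2 * rho1
root = math.sqrt(rho1 * rho2 * rho_bar * rho_check)
b = lam1 * (rho1 - rho2)
c1 = ((math.sqrt(rho1 * rho_bar) - math.sqrt(rho2 * rho_check)) / b) ** 2
kappa = b * b / (4.0 * root)   # the constant (1/4) lam1^2 (rho1 - rho2)^2 / root

def phi11_closed(t):
    """Closed-form expression (18)."""
    u = (1.0 / (1.0 - c1) - kappa) * math.exp(2.0 * root / rho_bar * t) + kappa
    return c1 + 1.0 / u

def rhs(p):
    """phi11' from the autonomous ODE preceding (18)."""
    return (rho1 + rho2 + b * p) ** 2 / (2.0 * rho_bar) - 2.0 * rho1 * p

def rk4(p, T, n):
    """Classical fourth-order Runge-Kutta for the scalar ODE p' = rhs(p)."""
    h = T / n
    for _ in range(n):
        k1 = rhs(p)
        k2 = rhs(p + 0.5 * h * k1)
        k3 = rhs(p + 0.5 * h * k2)
        k4 = rhs(p + h * k3)
        p += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return p

T = 2.0
gap = abs(rk4(1.0, T, 2000) - phi11_closed(T))
```

For these parameter values the closed form starts at 1 and agrees with the numerical solution up to integration error.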

Similarly,

$$\varphi_{22}' + 2\rho_2\varphi_{22} = \frac{1}{2\bar\rho}\big(\rho_1+\rho_2+\lambda_2(\rho_2-\rho_1)\varphi_{22}\big)^2,$$

which for the initial condition is solved by

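$$\varphi_{22}(t) = c_2 + \left[\left(\frac{1}{1-c_2} - \frac{1}{4}\frac{\lambda_2^2(\rho_1-\rho_2)^2}{\sqrt{\rho_1\rho_2\bar\rho\check\rho}}\right)\exp\!\left(\frac{2\sqrt{\rho_1\rho_2\bar\rho\check\rho}}{\bar\rho}\cdot t\right) + \frac{1}{4}\frac{\lambda_2^2(\rho_1-\rho_2)^2}{\sqrt{\rho_1\rho_2\bar\rho\check\rho}}\right]^{-1}, \tag{19}$$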

where

$$c_2 = \left(\frac{\sqrt{\rho_2\bar\rho}-\sqrt{\rho_1\check\rho}}{\lambda_2(\rho_1-\rho_2)}\right)^2.$$

From (17) we can now easily compute .

Next, using once again (17), we find that solves

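$$\varphi_{01}' + \rho_1\varphi_{01} = \frac{1}{2\bar\rho}\big(\rho_2+\lambda_1(\rho_1-\rho_2)\varphi_{01}\big)\big(\rho_1+\rho_2+\lambda_1(\rho_1-\rho_2)\varphi_{11}\big).$$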

That is,

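$$\varphi_{01}' + \left[\frac{\rho_1\bar\rho+\rho_2\check\rho}{2\bar\rho} - \frac{\lambda_1^2}{2\bar\rho}(\rho_1-\rho_2)^2\varphi_{11}\right]\varphi_{01} = \frac{\rho_2}{2\bar\rho}\big[\rho_1+\rho_2+\lambda_1(\rho_1-\rho_2)\varphi_{11}\big]. \tag{20}$$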

We set , , and , so that . Then, we can check that is a solution of the fundamental system. By using a variation of parameters, we get that the solution of (20) satisfying is given by

$$\varphi_{01}(t) = \frac{A_1\varphi_{01}(+\infty)e^{kt} + B_1\varphi_{01}(-\infty)e^{-kt} + C_{01}}{A_1e^{kt}+B_1e^{-kt}},$$

with

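$$\varphi_{01}(\pm\infty) = \frac{\pm\sqrt{\rho_1\rho_2\bar\rho\check\rho}-\rho_2\check\rho}{\lambda_1\check\rho(\rho_1-\rho_2)}\quad\text{and}\quad C_{01} = A_1\big(1-\varphi_{01}(+\infty)\big) + B_1\big(1-\varphi_{01}(-\infty)\big).$$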

Then, can be easily deduced from (17).

It remains to compute , which solves

$$\varphi_{00}' = \frac{1}{2\bar\rho}\big[\rho_2+\lambda_1(\rho_1-\rho_2)\varphi_{01}\big]^2.$$

We set and get after some calculations:

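$$\begin{aligned}
\varphi_{00}'(t) &= \frac{1}{2\bar\rho}\left[\frac{A_1\frac{k\bar\rho}{\check\rho}e^{kt} - B_1\frac{k\bar\rho}{\check\rho}e^{-kt} + \tilde C_{01}}{A_1e^{kt}+B_1e^{-kt}}\right]^2\\
&= \frac{\rho_1\rho_2}{2\check\rho} + \frac{\tilde C_{01}}{\check\rho}\,\frac{A_1ke^{kt}-B_1ke^{-kt}}{(A_1e^{kt}+B_1e^{-kt})^2} + \frac{\tilde C_{01}^2\check\rho-4A_1B_1\rho_1\rho_2\bar\rho}{2\bar\rho\check\rho}\,\frac{1}{(A_1e^{kt}+B_1e^{-kt})^2}.
\end{aligned}$$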

Thus, we finally get:

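$$\begin{aligned}
\varphi_{00}(t) ={}& 1 + \frac{\rho_1\rho_2}{2\check\rho}\,t - \frac{\tilde C_{01}}{\check\rho}\left(\frac{1}{A_1e^{kt}+B_1e^{-kt}} - \frac{1}{A_1+B_1}\right)\\
&+ \frac{\tilde C_{01}^2\check\rho-4A_1B_1\rho_1\rho_2\bar\rho}{4B_1k\bar\rho\check\rho}\left(\frac{e^{kt}}{A_1e^{kt}+B_1e^{-kt}} - \frac{1}{A_1+B_1}\right).
\end{aligned} \tag{21}$$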

This completes this example.

Given the solution of the Riccati equation, we can approximate the continuous time strategy by a discrete one as follows ( will denote the trading size at time ).

• We first set and , .

• Suppose that and that and have been computed. Then, we set thanks to (59):

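$$x_i = 1 - \sum_{j=0}^{i-1}x_j - \int E_{i-1}(\rho)\,e^{-\rho T/N}\,\theta(T-iT/N,\rho)\,\lambda(d\rho),\qquad E_i(\rho_\ell) = E_{i-1}(\rho_\ell)\,e^{-\rho_\ell T/N} + x_i.$$

• Set .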

Alternatively, we could have approximated the minimization of the cost (1) by the following discrete problem. Let , , and consider

$$\text{minimize } \frac{1}{2}x^{\top}Mx \text{ over } x\in\mathbb{R}^{N+1} \quad\text{s.t.}\quad \sum_{i=0}^{N}x_i = 1. \tag{22}$$

The solution of this problem is obviously given by , where for . From a financial point of view, the minimization problem (22) gives the optimal strategy when it is only possible to trade at the times , while the original problem (1) allows continuous trading. In potential theory, it corresponds to computing the capacitary distribution of the set . It was shown in the proof of Theorem 2.20 in \citeasnounGSS that for these capacitary distributions converge in the weak topology of probability measures to the capacitary distribution constructed in Theorem 2. Explicit solutions of (22) for the choices and were given in \citeasnounAFS1 and \citeasnounASS (note, however, that is not completely monotone).
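The discrete problem (22) can be sketched in a few lines. The code below assumes that $M_{ij}=G(|i-j|T/N)$ on an equidistant grid and that $M$ is positive definite, in which case a Lagrange multiplier argument gives the minimizer as $M^{-1}\mathbf{1}$ normalized to total mass one. These assumptions, the helper names, and the parameter values are illustrative and not taken verbatim from the paper.

```python
import math

def solve_linear(A, b):
    """Gaussian elimination with partial pivoting; returns x with A x = b."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def discrete_optimum(G, T, N):
    """Minimizer of (1/2) x^T M x subject to sum(x) = 1, with M_ij = G(|i-j| T/N)."""
    M = [[G(abs(i - j) * T / N) for j in range(N + 1)] for i in range(N + 1)]
    y = solve_linear(M, [1.0] * (N + 1))     # y = M^{-1} 1
    total = sum(y)
    return [v / total for v in y]            # normalize to total mass 1

rho, T, N = 0.7, 3.0, 50                     # illustrative values (assumption)
x = discrete_optimum(lambda t: math.exp(-rho * t), T, N)
```

For the exponential kernel the resulting weights are symmetric, with two larger trades at the endpoints and equal trades in between, which mirrors the structure of (2). The dense solve costs $O(N^3)$ operations, which is the bottleneck for large $N$.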

We have computed and plotted the solutions given by both methods in Figure 2 for , , and . They are already rather close together for , and they merge when . Let us briefly discuss the time complexity of the two methods. The one given by (22) gets very slow when gets large since it involves the inversion of a matrix. Instead, when has discrete support, the matrix Riccati equation can be solved quickly and the algorithm above has a time complexity, which is much faster. However, this is no longer true when does not have discrete support. In that case, we have to approximate by a discrete measure, which means that we have to increase . Doing so will slow down the algorithm based on the Riccati equation. A rigorous treatment of the convergence rate and time complexity of both algorithms is beyond the scope of this paper and is left for future research.

## 2 Proofs

### 2.1 Proof of Theorem 1

Let us write (5) in the form , where

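$$F(f)(\rho_1,\rho_2) := -(\rho_1+\rho_2)\,f(\rho_1,\rho_2) + \frac{1}{2\bar\rho}\Big(\rho_1+\int x\,f(\rho_1,x)\,\lambda(dx)\Big)\Big(\rho_2+\int x\,f(x,\rho_2)\,\lambda(dx)\Big). \tag{23}$$

###### Lemma 2.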

Suppose that is supported by the compact interval . Then (5), (6) admits a unique solution . Moreover, has the properties (a), (b), and (c) in the statement of Theorem 1.

Proof. Let be any compact interval containing . Then defined in (23) maps into itself. Moreover, is Lipschitz continuous with respect to the sup-norm on every bounded subset of . Hence, the Cauchy–Lipschitz/Picard–Lindelöf theorem in Banach spaces implies the existence of a unique local solution for some maximal time [?, Theorem 3.4.1]. We will show below that . Then, if is another compact interval, the restriction of to must coincide with due to the uniqueness of solutions. This consistency then implies the existence and uniqueness of solutions . Moreover, the uniqueness of solutions and the fact that both (5) and (6) are symmetric in and imply that for all , which is property (b) in Theorem 1.

We now fix an interval . Before proving that , we will show that

$$\int \varphi_J(t,\rho,x)\,\lambda(dx) = 1\quad\text{for } \rho\in J \text{ and } t<t_J. \tag{24}$$

This will then establish property (c) in the statement of Theorem 1 for . Then we will use (24) to derive some estimates on that will yield and .

To prove (24), we let and . We have

$$I'(t,\rho) + \rho I(t,\rho) + \psi_J(t,\rho) = \frac{1}{2\bar\rho}\big(\rho+\psi_J(t,\rho)\big)\Big(\bar\rho + \int x\,I(t,x)\,\lambda(dx)\Big). \tag{25}$$

This is a (non-homogeneous) affine ODE of the form , where the operator

$$(A(t)f)(\rho) = -\rho f(\rho) + \frac{1}{2\bar\rho}\big(\rho+\psi_J(t,\rho)\big)\int x f(x)\,\lambda(dx)$$

is a continuous map from into the space of bounded linear operators on for each . Hence this ODE admits a unique solution in with initial condition . But (25) is solved by , which establishes (24).

For the next step, we let

$$t_0 := \inf\Big\{t\in[0,t_J)\,\Big|\,\min_{\rho_1,\rho_2\in J}\varphi_J(t,\rho_1,\rho_2) < 0\Big\}.$$

Since is a continuous map from into and , we must have . Due to (24) we have on that

 ρ1ρ22¯¯¯ρ≤φ′J(t,ρ1,ρ2)+(ρ1+ρ