Drift dependence of optimal trade execution strategies under transient price impact
Abstract
We give a complete solution to the problem of minimizing the expected liquidity costs in the presence of a general drift when the underlying market impact model has linear transient price impact with exponential resilience. It turns out that this problem is well-posed only if the drift is absolutely continuous. Optimal strategies often do not exist, and when they do, they depend strongly on the derivative of the drift. Our approach uses elements from singular stochastic control, even though the problem is essentially non-Markovian due to the transience of price impact and the lack of Markovian structure of the underlying price process. As a corollary, we give a complete solution to the minimization of a certain cost-risk criterion in our setting.
1 Introduction
Standard asset pricing models like the Black–Scholes model assume that asset prices are given exogenously and are unaffected by the trading behavior of economic agents. In reality, however, many trades are large enough to feed back on asset prices so that price impact and the resulting liquidity costs cannot be ignored. In such a situation, one aims at minimizing the liquidity costs from trade execution by constructing suitable trading strategies. The problem of computing such trading strategies is called the optimal trade execution problem.
To deal with price impact quantitatively, several stochastic market impact models have been proposed in recent years. In the first model class, which goes back to Bertsimas and Lo (1998) and Almgren and Chriss (1999, 2000), price impact is modeled by combining convex transaction costs with a linear permanent price impact term. While these models make computations feasible and lead to relatively nice and robust trading strategies, they do not adequately model the empirically observed transience of price impact. Transience means that price impact is strongest immediately after being triggered and that it subsequently decays in time. This effect is well-established empirically, it can be measured, and it is widely believed that the decay of price impact follows some general laws; see, e.g., Gatheral (2010), Lehalle and Dang (2010), Moro et al. (2009), and the references therein. Therefore, several models for transient price impact have been proposed in recent years. To our knowledge, the first models were proposed by Bouchaud et al. (2004) and Obizhaeva and Wang (2013). The latter is a linear price impact model with exponential decay of price impact and seems to be the first transient price impact model used for computing optimal trade execution strategies. Two different extensions were given to the case of nonlinear transient price impact. The first was proposed by Alfonsi et al. (2010) and further developed by Alfonsi and Schied (2010) and Predoiu et al. (2011). The second extension is due to Gatheral (2010) and, besides nonlinearity, also allows for more general decay patterns than exponential decay. Let us also mention related research by Bayraktar and Ludkovski (2011), Bouchard et al. (2011), Kharroubi and Pham (2010), and Guéant et al. (2012).
Since transience of price impact is more realistic than the combination of transaction costs with linear permanent impact, one might guess that market impact models with transient price impact perform better in practice than those of Bertsimas and Lo (1998) and Almgren and Chriss (1999, 2000). But what can be said about their mathematical stability and robustness in comparison to these older models? This is an important question because of the high degree of uncertainty in the estimation of market microstructure parameters. Gatheral (2010) addressed this question by analyzing the possible nonexistence of optimal trade execution strategies for certain parameters. As shown by Alfonsi and Schied (2010) and further discussed in Gatheral et al. (2011), these results depend strongly on the way in which nonlinearity of price impact is modeled. Therefore stability investigations with respect to other model features have been carried out in the case of linear price impact. Moreover, for liquid stocks linear price impact can also be a very good approximation to reality as shown empirically by Blais and Protter (2010). Alfonsi et al. (2012) investigate the dependence of optimal trade execution strategies on the decay kernel that models the temporal decay of price impact. They find that discrete-time strategies react in a very sensitive manner to the choice of this decay kernel and that price impact must decay as a convex nonincreasing function of time so as to exclude certain irregularities of optimal strategies. This observation implies in particular that in practice the decay of price impact cannot be estimated in a nonparametric way.
An extension of the results in Alfonsi et al. (2012) to continuous time was given by Gatheral et al. (2012). Finally, assuming exponential decay of price impact, Fruth et al. (2011) analyze the specific form and regularity of optimal trade execution strategies when liquidity can be time-dependent or even stochastic. An analysis pertaining specifically to regularity issues arising in this context has recently been given by Klöck (2012).
When investigating a particular model aspect, it is important to keep the remaining features of the model simple. For instance, to analyze the existence or nonexistence of price manipulation strategies as in Gatheral (2010) or Alfonsi et al. (2012), it is necessary to assume that the underlying price process is a martingale. There are additional reasons why it may be natural to make this martingale assumption; see, e.g., the discussion in Alfonsi et al. (2012). But there are also good reasons to allow for a nonvanishing drift in unaffected asset prices. For instance, an economic agent may be aware of the trading activities of another market participant. These trading activities will create price impact, which from the point of view of our economic agent will be perceived as a drift in asset prices. Moreover, for several reasons, the economic agent may have a rather accurate estimate of this drift. For instance, some trade execution algorithms create characteristic order patterns and therefore allow for an inference of their future trading trajectory. We refer to Schöneborn and Schied (2009) for a study of a multi-agent situation in the Almgren–Chriss framework.
In this paper, we aim at continuing the investigation of the stability of models for transient price impact by focusing on the dependence of optimal trade execution strategies on a possible drift of the underlying unaffected price process. In doing this, we will allow for rather general dynamics of the drift and in particular allow for jumps and a non-Markovian structure. This is important because the price impact patterns of optimal trade execution strategies with transient price impact have precisely these features and, as mentioned above, the price impact of another market participant is perhaps the most common source for the presence of a drift. On the other hand, we will keep the remaining features of the model simple. This makes the mathematics tractable but also helps to isolate the effects of the drift from the effects created by other model features. We therefore use the linear continuous-time model of Obizhaeva and Wang (2013) (in the version of Gatheral et al. (2012)) with exponential decay of price impact, and the problem we consider is the minimization of the expected costs.
Theorem 1, our main result, shows that this optimal trade execution problem is very sensitive with respect to the drift. The expected costs will be equal to negative infinity as soon as the drift is not absolutely continuous, a fact that will have strong impact when market impact is generated by several market participants. Moreover, even when the drift is absolutely continuous, optimal strategies will typically not exist if strategies are understood in the sense of Gatheral et al. (2012). We therefore extend the class of admissible strategies by allowing strategies to be semimartingales. We show that unique optimal trade execution strategies may exist in this class of strategies, but the number of shares to be held depends directly on the derivative of the drift at each time and thus may fluctuate strongly. This sensitivity of strategies is particularly striking when compared to the relatively robust drift dependence of optimal trade execution strategies in the Almgren–Chriss framework, which was found by Schied (2011).
Our problem of minimizing the expected costs in the presence of a drift turns out to be also of interest from a purely mathematical point of view. Our approach uses elements from singular stochastic control, although the problem is basically non-Markovian due to both the transience of price impact and the lack of Markovian structure of the underlying price process. We deal with the first type of non-Markovianity by using an auxiliary ‘impact process’ that, under the specific assumption of exponential decay of price impact, leads to a Markovian structure for the dynamics of transient price impact. We then guess a formula for the optimal expected costs conditional at time where an arbitrary impact is given as initial condition. With this formula at hand, we can then use a verification argument. The control problem is ‘singular’ since our controls are semimartingale strategies, which enter the value function as integrators of stochastic integrals. A similar technique was recently used in Alfonsi and Schied (2012) to compute optimal strategies for general, completely monotone decay kernels but without drift in the unaffected price process. As an application of our results, we also obtain a complete solution for the minimization of a cost-risk criterion that was recently proposed in Gatheral and Schied (2011).
2 Statement of results
2.1 Model setup
A market impact model is a model for an economic agent who can move asset prices. As long as this agent is not active, asset prices are determined by the actions of the other market participants and are described by the unaffected price process . We assume that is a square-integrable càdlàg semimartingale defined on a given filtered probability space satisfying the usual conditions. We also assume that is trivial, i.e., every measurable random variable is a.s. constant. We will use the linear market impact model with exponential decay of price impact proposed by Obizhaeva and Wang (2013). More precisely, we will use the zero-spread version of this model that was suggested in Gatheral et al. (2012); we refer to Alfonsi and Schied (2010) for a discussion of the possible reintroduction of a bid-ask spread.
The actual asset price will depend on the strategy chosen by the trader. Such a strategy will be an adapted stochastic process that describes the number of shares held by the trader at each time. Following Gatheral et al. (2012), we call admissible if the following conditions are satisfied:

(a) the function is right-continuous and adapted;
(b) the function has finite and a.s. bounded total variation;
(c) there exists a liquidation time such that a.s. for all .
Such a strategy has the interpretation that the value stands for an initially given amount of shares that needs to be liquidated by time . When is nonincreasing, it is a pure sell strategy. When it is nondecreasing, it is a pure buy strategy. A general admissible strategy is the sum of a sell and a buy strategy and therefore is of bounded variation. This shows that condition (b) is economically meaningful. With we will denote the class of all strategies that are admissible in this sense for a fixed liquidation time and that satisfy .
When the admissible strategy is used, the price will be
(1) 
where , the function describes the temporal decay of price impact, and the parameter describes its magnitude. Clearly we can set without loss of generality. Following Gatheral et al. (2012), we define the liquidation costs of as
(2) 
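For orientation, the price process (1) and the cost functional (2) can be sketched in the form that is standard for the zero-spread Obizhaeva–Wang model. The symbols used here, $S^0$ for the unaffected price, $X$ for the strategy, $\rho > 0$ for the resilience rate, $\gamma > 0$ for the impact magnitude, and $T$ for the liquidation time, are notational assumptions made for this sketch and need not coincide with the notation of the cited papers.

```latex
% Sketch under assumed notation: S^0 unaffected price, X strategy,
% rho resilience rate, gamma impact magnitude (gamma = 1 w.l.o.g.).
\begin{align*}
S^X_t &= S^0_t + \gamma \int_{[0,t)} e^{-\rho (t-s)} \, dX_s , \\
\mathcal{C}(X) &= \int_{[0,T]} S^X_t \, dX_t
  + \frac{\gamma}{2} \sum_{t \in [0,T]} (\Delta X_t)^2 .
\end{align*}
```

In this form, the integral term charges every trade at the price prevailing just before it, while the sum accounts for the additional price shift that a block trade causes during its own execution.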
Remark 1 (Economic motivation of the cost functional ).
Let us follow Alfonsi et al. (2012) and Gatheral et al. (2012) in motivating the cost functional (2). For a continuous strategy , equals and can thus be easily understood as the accumulated costs of buying shares at price at each time . For general , a nonzero jump can be interpreted as a large market order which shifts the asset price by eating into a block-shaped limit order book. Its execution therefore incurs the following costs:
We assume here that the order is executed immediately after a jump of in case both jumps nominally occur at the same time, an assumption that is economically natural since it precludes arbitrage-like exploitation of price jumps. Decomposing a general strategy into its continuous part and its jumps thus leads to the definition (2). An alternative derivation of (2), based on a continuous-time limit of discrete-time cost functionals, will be provided by Lemma 1 in the more general framework of semimartingale strategies.
The problem of minimizing the expected costs, , over is called the optimal trade execution problem. When is a square-integrable martingale, this problem admits the unique solution
(3) 
That is, has an initial jump at of size , continuous trading at rate in , and a terminal jump of size . This formula was found by Obizhaeva and Wang (2013) (see also Example 2.12 in Gatheral et al. (2012) for a short proof).
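The three-part structure of this solution can be checked numerically. The following Python sketch rests on several assumptions: the impact magnitude is set to 1, the martingale part of the price is dropped because it does not affect expected costs, and the block size `x / (rho * T + 2)` and the minimal impact cost `x**2 / (rho * T + 2)` are taken from the standard Obizhaeva–Wang result. All function names are hypothetical. The sketch discretizes the strategy on a fine grid and compares its impact costs with those of two naive alternatives.

```python
import math

def impact_cost(trades, times, rho):
    """Impact part of the costs of a discrete strategy (impact magnitude 1).

    A trade xi at time t pays impact * xi + xi**2 / 2, where impact is the
    exponentially decayed sum of all earlier trades.
    """
    cost, impact, last_t = 0.0, 0.0, times[0]
    for xi, t in zip(trades, times):
        impact *= math.exp(-rho * (t - last_t))  # decay since previous trade
        cost += impact * xi + 0.5 * xi * xi
        impact += xi
        last_t = t
    return cost

def discretized_ow_trades(x, rho, T, n):
    """Trades on the grid k*T/n approximating the block/rate/block strategy."""
    block = -x / (rho * T + 2.0)            # assumed size of the block trades
    dt = T / n
    trades = [block] + [rho * block * dt] * (n - 1)
    trades.append(-x - sum(trades))          # terminal trade closes the position
    times = [k * dt for k in range(n + 1)]
    return trades, times

x, rho, T, n = 1.0, 1.0, 1.0, 2000
trades, times = discretized_ow_trades(x, rho, T, n)
ow_cost = impact_cost(trades, times, rho)

# Naive alternatives: equal trades on the same grid, and one single block.
twap_cost = impact_cost([-x / (n + 1)] * (n + 1), times, rho)
block_cost = impact_cost([-x], [0.0], rho)

continuous_optimum = x * x / (rho * T + 2.0)  # assumed continuous-time minimum
print(ow_cost, twap_cost, block_cost, continuous_optimum)
```

Under these assumptions the discretized strategy should come close to the continuous-time minimum from above, while both naive alternatives are more expensive.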
Remark 2.
Gatheral et al. (2012) consider the left-continuous modification of admissible strategies. Since the respective formulas (1) and (2) for the price process and the costs of a strategy depend only on the measure , it is just a matter of notational convention whether to choose the right- or left-continuous modification of . In particular, our formulas for the price process and the costs are the same as those in Gatheral et al. (2012). Later on, however, we will consider a larger class of semimartingale strategies, and since semimartingales are right-continuous by default and for good reason, we must adopt the convention of right continuity so as to be consistent between our two classes of strategies.
As can be seen from the formula (3), optimal strategies will typically have jumps at times and . For right-continuous strategies, we need to include the possibility of an initial jump by allowing for an initial value that can be different from . Similarly, for the left-continuous modification of strategies used in Gatheral et al. (2012), the terminal jump must be accommodated by allowing for a nonzero value of and by requiring the modified liquidation constraint . So both conventions require us to impose conditions on the limits of when approaches a boundary point of the actual trading interval from outside this interval.
Here, our goal is to study the minimization of the expected costs when has an additional drift. This topic is of intrinsic mathematical interest, and we refer to the introduction of this paper for an account of our economic motivation to study this problem. We assume henceforth that is a càdlàg semimartingale with decomposition
(4) 
where is a constant, is a square-integrable càdlàg martingale with , and is an adapted process with and locally square-integrable total variation, i.e., for every we have when denotes the total variation of over the interval . There is in fact no loss of generality in assuming that is predictable (see Proposition I.4.23 in Jacod and Shiryaev (2003)).
It will turn out that the presence of increases the complexity of the optimal trade execution problem significantly. In particular, optimal execution strategies in will exist only under very restrictive assumptions on . For instance, they will not exist even in the simple case in which is a diffusion model,
with nonconstant drift coefficient . We therefore need to extend our class of admissible trading strategies.
Definition 1.
An admissible semimartingale strategy is a bounded
Note that is a subset of . While semimartingale strategies are standard in frictionless asset pricing models, their application in a high-frequency market impact model is economically less natural than strategies of bounded variation, because they can no longer be written as the superposition of buying and selling strategies.
Given a semimartingale strategy , we need to extend the definitions (1) and (2) for the corresponding price process and the resulting liquidation costs. These formulas and our further analysis will involve stochastic integrals in which appears both as integrand and as integrator. Therefore, we first need to clarify how stochastic integrals must be understood in view of our requirement .
Remark 3 (On the definition of stochastic integrals).
It is a common assumption in the literature on stochastic integration that semimartingales may jump at , but a typical convention is to assume . With this convention, a stochastic integral , as defined, e.g., in Protter (2004), will not depend on the initial jump of the integrator at time , and so there is no ambiguity in writing . When the value is nonzero, as is the case for the semimartingale strategies defined above, one must carefully distinguish whether an initial jump of the integrator is or is not part of a stochastic integral. This has been done, e.g., by Meyer (1976), from where we adopt the convention of writing or , respectively, when the initial jump is or is not part of the stochastic integral. We then have
(5) 
The integration by parts formula for stochastic integrals becomes
(6) 
see (Meyer, 1976, p. 303). When is a stochastic integral, we set by default.
Given a semimartingale strategy , the price at time can be defined just as in (1) when denotes the left-hand limit, , of the generalized Ornstein–Uhlenbeck process
(7) 
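The role of exponential decay can be illustrated for a discrete sequence of trades: the impact computed as a kernel sum over the entire trade history coincides with a one-step recursion that needs only the current impact value, which is the Markovian structure exploited in the paper. The following Python sketch uses hypothetical function names, an assumed resilience rate `rho`, and impact magnitude 1; it illustrates this property and is not taken from the paper.

```python
import math

def impact_by_kernel_sum(trades, times, rho, t):
    """Impact at time t as the decayed sum over all trades up to and including t."""
    return sum(
        xi * math.exp(-rho * (t - s))
        for xi, s in zip(trades, times)
        if s <= t
    )

def impact_by_recursion(trades, times, rho):
    """Impact right after each trade, updated from the previous value only."""
    values, impact, last_t = [], 0.0, times[0]
    for xi, t in zip(trades, times):
        # Decay the old impact over the elapsed time, then add the new trade.
        impact = impact * math.exp(-rho * (t - last_t)) + xi
        values.append(impact)
        last_t = t
    return values

# Illustrative trade sizes and times (arbitrary values).
trades = [-0.3, 0.1, -0.5, 0.2]
times = [0.0, 0.4, 0.7, 1.0]
rho = 2.0

recursive = impact_by_recursion(trades, times, rho)
direct = [impact_by_kernel_sum(trades, times, rho, t) for t in times]
print(recursive)
print(direct)
```

Both functions return the right-continuous value of the impact, that is, including the trade made at the evaluation time; the price formula uses the left-hand limit, which excludes that trade.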
We now turn to the definition of the liquidation costs of the semimartingale strategy . We will motivate our definition by an approximation from the discrete-time case. To this end, we take , let for and define the following sequence of discrete trades:
Then, is an admissible trading strategy in the sense of Alfonsi et al. (2012). In Proposition 1 of Alfonsi et al. (2012) and its proof, the costs incurred by the discrete-time strategy were derived as
The economic motivation of this formula is analogous to the one given in Remark 1. In fact coincides with , when denotes the step function with jumps described by . We have the following asymptotics of these costs when our time grid becomes finer.
Lemma 1 (Liquidation costs of a semimartingale strategy).
As , we have
in probability, where is independent of the (arbitrary) choice of the value , and is the generalized Ornstein–Uhlenbeck process from (7).
2.2 Minimizing the expected costs
The optimization problem we are interested in is the minimization of the expected costs,
(8) 
over all strategies that belong to or to . To state its solution, let be a càdlàg version of the martingale
which exists due to our assumption that satisfies the usual conditions. We also define the semimartingale as
Theorem 1.
When is a.s. absolutely continuous on with square-integrable derivative , i.e., when for and , then
(9)  
and
(10) 
otherwise.
When in addition is bounded, the second infimum in (9) can be attained only if is a (right-continuous) semimartingale, and the unique optimal strategy is then given by
(11) 
where . In particular, the first infimum in (9) can only be attained when is a.s. right-continuous and of finite variation on .
Remark 4.
From an economic point of view, the fact that it is possible to generate arbitrarily negative expected costs for drift processes that are not absolutely continuous might indicate a market inefficiency that arises when trading takes place on a much shorter time scale than the resilience of price impact. The market then becomes inefficient, because its resilient reaction to a price shock is delayed in comparison to the trading activities of the economic agent; see also Remarks 2 and 3 in Alfonsi et al. (2012). This becomes particularly apparent when the drift is generated by the trading behavior of a large fundamental seller, who is subject to predatory trading by a high-frequency trader; see Remark 6 below.
The situation in Theorem 1 simplifies significantly when is a martingale:
Corollary 1.
Suppose that is of the form for a bounded càdlàg martingale . Then the optimal strategy (11) becomes
(12) 
Note that the strategy (12) can be computed in a pathwise manner without reference to the particular distribution of ; see Figure 1. This special case highlights the ambiguous and seemingly contradictory nature of the robustness of the optimal strategy: this strategy reacts very sensitively to structural features of the price process, i.e., to the martingale property of , but once this structural requirement is satisfied, the strategy is completely independent of the law of . When vanishes, this strategy reduces to the Obizhaeva–Wang solution (3).
Remark 5 (Comparison with Almgren–Chriss model).
It is interesting to compare the optimal strategy (11) with the one for the corresponding Almgren–Chriss model. In the latter model, strategies must be absolutely continuous. Given such a strategy , the price process takes the form
where and are two nonnegative constants. When and , the corresponding liquidation costs are
In our setting, there is always a unique strategy that minimizes the expected liquidation costs and it is given by
see Corollary 2 in Schied (2011). Here the drift enters the optimal strategy basically in integrated form, and so one can expect that possible misspecifications of the drift may average out to some extent. This relatively stable behavior should be compared to the direct dependence of the strategy (11) on the derivative of the drift.
Remark 6 (A two-player situation).
As discussed in the Introduction, an important source for a drift in the asset price process can be the trading activity of another large market participant (“the seller”). There are various reasons why another economic agent (“the predator”) may get good estimates for the resulting drift. For instance, some trade execution algorithms create characteristic order patterns and therefore allow for an inference of their future trading trajectory. But there are also other possibilities as discussed in Schöneborn and Schied (2009).
Suppose that the seller aims at liquidating a position of shares by time . Suppose moreover, for simplicity, that the unaffected asset price is a square-integrable martingale so that the seller will use the liquidation strategy from (3). The predator will then perceive the unaffected price process , which is no longer a martingale but has the drift . Since has a terminal jump, the resulting ‘drift’ will also jump by the same amount at time . So if the predator faces a more relaxed time constraint than the seller, which is a natural assumption, the predator will perceive a drift that is not absolutely continuous and, by Theorem 1, will have the possibility of making arbitrarily large expected profits. Similar results will also hold when has a nonvanishing drift.
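The mechanism behind such unbounded profits can be made concrete with a simple two-trade round trip. The following Python sketch is an illustration under strong simplifying assumptions and is not the argument used in the proof of Theorem 1: the predator knows that the unaffected price will move by a deterministic amount `jump` at a known time, sells `y` shares a short time `eps` before the jump, and buys them back right after it. The impact magnitude is set to 1, and all names are hypothetical. A direct computation gives round-trip costs of `y * jump + y**2 * (1 - exp(-rho * eps))`, so the impact penalty vanishes as `eps` tends to zero.

```python
import math

def round_trip_cost(y, jump, rho, eps):
    """Cost of selling y shares at time tau - eps and buying them back at tau.

    The unaffected price moves by `jump` between the two trades. Its level
    before the jump cancels out of a round trip and is normalized to zero.
    """
    sell, buy = -y, y
    cost_sell = 0.5 * sell * sell                      # no prior impact
    impact_before_buy = sell * math.exp(-rho * eps)    # decayed impact of the sale
    cost_buy = (jump + impact_before_buy) * buy + 0.5 * buy * buy
    return cost_sell + cost_buy

def best_round_trip(jump, rho, eps):
    """Trade size and cost minimizing y*jump + y**2 * (1 - exp(-rho*eps)) over y."""
    curvature = 1.0 - math.exp(-rho * eps)
    y = -jump / (2.0 * curvature)
    return y, round_trip_cost(y, jump, rho, eps)

jump, rho = -0.5, 1.0   # assumed downward price jump and resilience rate
gaps = (1e-1, 1e-2, 1e-3, 1e-4)
costs = [best_round_trip(jump, rho, eps)[1] for eps in gaps]
for eps, cost in zip(gaps, costs):
    print(eps, cost)
```

In this sketch the optimal costs equal `-jump**2 / (4 * (1 - exp(-rho * eps)))` and therefore decrease without bound as the two trades move closer together, in line with the statement that expected costs can be made arbitrarily negative when the drift has a jump.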
2.3 Minimization of a cost-risk criterion
As a corollary to Theorem 1, we can also find optimal strategies for the linear risk criterion that was proposed in Gatheral and Schied (2011) for the Almgren–Chriss framework with a risk-neutral geometric Brownian motion as unaffected price process. When in our model is a risk-neutral geometric Brownian motion, i.e.,
(13) 
the same reasoning as in Gatheral and Schied (2011) motivates the minimization of a cost-risk functional of the form
(14) 
where has the same sign as . The parameter is typically derived from the Value at Risk of a unit asset position under the assumption of lognormal future returns. As argued in Remark 2.2 of Gatheral and Schied (2011), one could obtain the same cost-risk functional (but perhaps with a different value for ) if Value at Risk is replaced by a coherent risk measure or by any other positively homogeneous risk measure.
Optimal strategies for the cost-risk functional (14) in the Almgren–Chriss framework have the advantages of being sensitive to changes in the asset price, being easily computable in closed form, and possessing completely transparent reactions to parameter changes. In addition, they have a striking robustness property: they are independent of the actual law of as long as is a martingale. Thus they may be optimal even when the law of is not of the particular form (13). A disadvantage is that optimal strategies can switch sign, in which case the interpretation of the cost-risk functional (14) breaks down. But, as discussed in Section 4 of Gatheral and Schied (2011), the probability that strategies become negative will be small with reasonable parameter choices.
Corollary 2.
When is a bounded martingale, then is also a martingale. Thus, by Corollary 1 the optimal strategy that minimizes the cost-risk criterion (14) simply becomes
This strategy can be computed in a pathwise manner and is completely independent of the particular law of the martingale . It thus minimizes the cost-risk criterion (14) whenever is a martingale measure for . When is not bounded but just a square-integrable martingale, then will not be an admissible semimartingale strategy in the sense of Definition 1. Nevertheless, in this special case, one can show that attains the optimum of the cost-risk criterion and thus can still be regarded as an optimal strategy. We leave the details to the reader.
3 Proofs
To simplify the notation, we will drop the superscript in throughout the proofs when there is no ambiguity about the strategy used in the definition of .
Proof of Lemma 1. We first note that
By Theorems II.5.21 and II.5.23 in Protter (2004), this expression converges in probability to
Similarly,
in probability.
When defining
then is the Riemann approximation of a stochastic integral with a deterministic and continuous integrand, and hence uniformly on compacts in probability (ucp) (Jacod and Shiryaev, 2003, Proposition I.4.44). It follows that
Moreover,
Therefore,
where, using the notation from Section II.5 of Protter (2004), for a process we let
Now
(15) 
The first integral on the right converges to in probability by Theorem II.5.21 of Protter (2004). To deal with the second integral on the right, we note that implies that also . Thus, ucp. The continuity of the stochastic integral with respect to ucp convergence (Protter, 2004, p. 59) therefore implies that the rightmost integral in (15) tends to zero in probability for . We thus obtain that
in probability (here we have used the fact that by our convention on stochastic integrals made at the end of Remark 3). Putting everything together yields the assertion. ∎
Now we start preparing for the proof of Theorem 1, which will rely on a series of lemmas. The basic idea underlying the proof is the verification argument appearing in the next lemma. The nature of the verification argument becomes apparent when taking in Lemma 2. The key to the argument is the following formula for the remaining costs of optimally liquidating the asset position over , taking into account a given volume impact . This volume impact can be thought of as the volume impact generated by using a strategy throughout that leads to the asset position at time . The formula is
(16) 
This formula needs to be guessed; we are not aware of a method by which it can be derived analytically. Once this formula has been guessed, we can proceed by the following standard verification argument, which is also used, e.g., in Section 6.6.1 of Pham (2009): We show that the costs (16) plus the costs generated by using over is a submartingale for any strategy and a true martingale if is an optimal strategy.
Let us recall the definition
Lemma 2.
Fix , and let be any progressively measurable process with . We furthermore let be a càdlàg version of the martingale
and we define
Then
(17) 
Proof. We note first that Jensen’s inequality implies . Hence,
(18) 
and in turn . So all expressions in (17) are well-defined. We now define for
Then describes the costs incurred by using the strategy throughout the time interval . Next, we use our guess (16) for the costs of optimally liquidating the amount by trading over when an initial volume impact of size is given at time . It leads to defining the function
which describes these optimal costs less the integral term in (16), which does not depend on or . By adding we get the process
(19) 
We will now compute the Itô differential . Our computation will mainly rely on Itô’s product rule in the form (6). For the computation, it will be helpful to collect a few auxiliary formulas in advance. For instance, it follows from the definition of that
(20) 
Using from (7), the fact that (which follows from our corresponding convention for stochastic integrals), and integration by parts yields
(21) 
It follows in particular that the process does not jump throughout and that on this interval . We also note that .
Recalling the fact that , we now choose values , , and in that satisfy
(22) 
but can otherwise be arbitrary. We also choose an arbitrary value . We then have on
(23) 
Hence, a lengthy but straightforward calculation gives
(24)  