High-frequency market-making for multi-dimensional Markov processes


In this paper we complete and extend our previous work on stochastic control applied to high-frequency market-making with inventory constraints and directional bets. Our new model admits several state variables (e.g. market spread, stochastic volatility and intensities of market orders) provided the full system is Markov. The solution of the corresponding HJB equation is exact in the case of zero inventory risk. The inventory risk enters in two ways: a path-dependent penalty based on the volatility and a penalty at expiry based on the market spread. We apply perturbation methods on the inventory risk parameter and obtain explicitly the solution and its controls up to first order. We also include transaction costs; we show that the spread of the market-maker is widened to compensate for the transaction costs, but the expected gain per traded spread remains constant. We perform several numerical simulations to assess the effect of the parameters on the PNL, showing in particular how the directional bet and the inventory risk change the shape of the PNL density. Finally, we extend our results to the case of multi-asset market-making strategies; we show that the correct notion of inventory risk is the L2-norm of the (multi-dimensional) inventory with respect to the inventory penalties.

Keywords: Quantitative Finance, High-Frequency Trading, Market-Making, Inventory Risk, Markov Processes, Hamilton-Jacobi-Bellman, Stochastic Control, Optimal Control.

1 Introduction

1.1 Variables

We will work with time and two controls: the half ask spread and the half bid spread of the market-maker, measured as the distance between the mid-price and her ask quote and her bid quote, respectively. Our model admits several state variables, which can be any process provided the whole system is Markov. This framework admits a large class of price models, e.g. jump processes with stochastic volatility. Without loss of generality, we will restrict our analysis to the following ones:

  1. The mid-price , assumed to be an Itô diffusion.

  2. The half market spread , which is assumed non-negative and somewhat mean-reverting.

  3. The volatility of the mid-price process, which is strictly positive.

  4. The inventory , which is modelled as

    where and are two independent Poisson processes.

  5. The intensities of the previous Poisson processes are exponentially decreasing in the distance to the quote on the other side, and their speed of decay is random:

    Figure 1: Market order intensities . They are extrapolated when (dotted lines).
  6. The cash , which is simply the money earned by the market-maker by selling and buying the stock, i.e.

To keep things simple, we will use the notation

The approach we propose is very general because it admits any number of state variables. In that spirit, one could add to other state variables, e.g. a random intensity of the arrival of market orders or a statistical indicator coming from an alternative market.
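The dynamics of the inventory and the cash described above can be illustrated with a small simulation. The following is only a minimal sketch under simplifying assumptions that are not part of the model: the half-spreads of the market-maker are held constant, the volatility and the half market spread are constants, the mid-price is an arithmetic Brownian motion, and each intensity has the form A * exp(-k * distance), where the distance is that between the quote and the best quote on the other side of the book. All names and parameter values (`simulate_market_maker`, `A`, `k`, etc.) are hypothetical.

```python
import math
import random


def simulate_market_maker(T=1.0, n_steps=10_000, s0=100.0, sigma=0.5,
                          half_spread_mkt=0.01, delta_ask=0.02, delta_bid=0.02,
                          A=500.0, k=50.0, seed=0):
    """Simulate one path of mid-price, inventory and cash on a regular grid."""
    rng = random.Random(seed)
    dt = T / n_steps
    mid, inventory, cash = s0, 0, 0.0
    n_buys = n_sells = 0
    # Assumed intensities: exponential decay in the distance to the
    # opposite best quote (own half-spread plus half market spread).
    lam_ask = A * math.exp(-k * (delta_ask + half_spread_mkt))
    lam_bid = A * math.exp(-k * (delta_bid + half_spread_mkt))
    for _ in range(n_steps):
        # Over a short step, a Poisson fill occurs with probability ~ lambda * dt.
        if rng.random() < lam_ask * dt:   # her ask is lifted: she sells one share
            inventory -= 1
            cash += mid + delta_ask
            n_sells += 1
        if rng.random() < lam_bid * dt:   # her bid is hit: she buys one share
            inventory += 1
            cash -= mid - delta_bid
            n_buys += 1
        # Brownian increment of the mid-price.
        mid += sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return {"mid": mid, "inventory": inventory, "cash": cash,
            "n_buys": n_buys, "n_sells": n_sells}


result = simulate_market_maker()
# Mark-to-market PNL: cash plus inventory valued at the final mid-price.
pnl = result["cash"] + result["inventory"] * result["mid"]
```

The inventory is the number of executed buys minus the number of executed sells, and the cash changes by the quoted price at each fill, which mirrors items 4 and 6 of the list above.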

1.2 Hypotheses on the processes and controls

We assume our controls lie in an admissible space . In order to have a well-posed stochastic control problem, we have to assume that the system of all state variables is Markov, since otherwise our mathematical techniques do not apply. It is worth mentioning that this assumption does not necessarily imply that each variable is Markovian: as in the case of stochastic volatility, it is not the mid-price which needs to be Markov but the couple (mid-price, volatility).

Another hypothesis we need is that all value functions are finite. In the case of the value function (6), we accept any mid-price process such that the function in (9) is finite. This condition holds for any process such that its conditional expectation is affine in . For example, a Brownian motion and an Ornstein-Uhlenbeck process are acceptable mid-price processes.
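The affine property can be checked directly for the Ornstein-Uhlenbeck example. The sketch below assumes the standard parametrisation dS = theta * (mu - S) dt + sigma dW, for which the conditional mean is E[S_T | S_t = s] = mu + (s - mu) * exp(-theta * (T - t)); the parameter names `theta` and `mu` are ours and do not come from the text.

```python
import math


def ou_conditional_mean(s, t, T, theta, mu):
    """E[S_T | S_t = s] for dS = theta * (mu - S) dt + sigma dW."""
    return mu + (s - mu) * math.exp(-theta * (T - t))


def affine_coefficients(t, T, theta, mu):
    """Slope and intercept such that E[S_T | S_t = s] = slope * s + intercept."""
    slope = math.exp(-theta * (T - t))
    intercept = mu * (1.0 - slope)
    return slope, intercept


slope, intercept = affine_coefficients(t=0.0, T=1.0, theta=2.0, mu=100.0)
# The conditional mean coincides with the affine map for every starting point.
for s in (95.0, 100.0, 105.0):
    assert abs(ou_conditional_mean(s, 0.0, 1.0, 2.0, 100.0)
               - (slope * s + intercept)) < 1e-9
```

Setting `theta = 0` gives slope 1 and intercept 0, i.e. the driftless Brownian case, in which the conditional mean is the current value.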

1.3 General HJB and optimal controls

Let us start with the general HJB equation, i.e.


where is the infinitesimal generator for the state variables . The steps to solve (1) are as follows:

  1. Based on the utility function we make an ansatz, i.e. we guess the general form of the solution of (1):

  2. We substitute the ansatz (2) in (1) in order to find an easier HJB equation for . We use this new HJB equation to find the optimal controls that maximize the jumps. Indeed, after elementary calculus, where the jump part of (1) is considered a function of , one finds that the optimal controls are

  3. We substitute the optimal controls in the HJB equation: the resulting equation is called the verification equation. In our case it is


    which is highly nonlinear because .

  4. We cannot solve the verification equation (4) directly via the Feynman-Kac representation formula because the former is nonlinear and the latter works only for linear equations. However, the idea is to decompose our nonlinear problem into several linear problems, and apply Feynman-Kac to each one of them. Recall that for a linear equation of the form

    the (unique) solution is given by the Feynman-Kac formula (see e.g. Pham [10])


    where is the conditional expectation given , and . As will be evident throughout this work, our approach relies entirely on the Feynman-Kac formula (5). Therefore, as long as our general Markov processes are such that the expectation in (5) is finite, we are on solid ground.

  5. Alternatively, we could express the optimal quotes in terms of the market-maker’s bid-ask spread

    and the centre of her spread

    Notice that if and only if . Therefore, measures the level of asymmetry of the quotes with respect to the mid-price .
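The Feynman-Kac representation used in step 4 can be illustrated numerically. The sketch below does not reproduce the exact equation of the text; it assumes the standard linear form du/dt + (sigma^2 / 2) d2u/dx2 + f(x) = 0 with terminal condition u(T, x) = g(x), for a driftless Brownian state variable, whose solution is the expectation of the integral of f along the path plus g at expiry. With the hypothetical choices f(x) = x and g(x) = x^2 the solution is u(t, x) = x (T - t) + x^2 + sigma^2 (T - t), which the Monte Carlo estimate should approach.

```python
import math
import random


def feynman_kac_mc(x, t, T, sigma, f, g, n_paths=20_000, n_steps=50, seed=1):
    """Monte Carlo estimate of E[ int_t^T f(X_s) ds + g(X_T) | X_t = x ]."""
    rng = random.Random(seed)
    dt = (T - t) / n_steps
    vol_step = sigma * math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        xs, running = x, 0.0
        for _ in range(n_steps):
            running += f(xs) * dt              # left-point rule for the integral
            xs += vol_step * rng.gauss(0.0, 1.0)  # Brownian increment
        total += running + g(xs)
    return total / n_paths


def closed_form(x, t, T, sigma):
    """Exact solution for f(x) = x and g(x) = x**2 (assumed toy problem)."""
    return x * (T - t) + x * x + sigma * sigma * (T - t)


estimate = feynman_kac_mc(1.0, 0.0, 1.0, 0.5, f=lambda y: y, g=lambda y: y * y)
exact = closed_form(1.0, 0.0, 1.0, 0.5)   # equals 2.25
```

The estimate differs from the exact value only by Monte Carlo noise, which shrinks as the number of paths grows.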

There is an important remark on the shape of the optimal controls (3). If the only state variable is the mid-price then . Moreover, if the mid-price is a martingale then we can assume that because, by definition, there is no directional bet on the price. Under these assumptions, plugging the explicit optimal controls (3) into the verification equation (4) leads to a system of ODEs for , indexed by . This system can be solved numerically, but it is nearly impossible to obtain an explicit formula (see e.g. Cartea-Jaimungal [4] and Guéant-Lehalle-Fernández [7]). In our case, with a general price process and several other state variables, there is no hope for a system of ODEs; in fact, the corresponding system is PDE-based. Therefore, either we solve the problem explicitly (as we will do in the linear case), or we perform some asymptotics for the optimal controls (as we will do in the case of inventory penalty).

2 Linear utility function

2.1 Hamilton-Jacobi-Bellman (HJB) equation

Here we will try to find the optimal controls that maximise the PNL (profit and loss) of a market-maker, i.e. a linear value function:


where is the conditional expectation given the values of all variables at time , i.e.

The probabilistic representation (6) is the unique solution of the HJB equation


where is the infinitesimal generator for all the continuous state variables . Therefore, we will use (7) to find the optimal controls .

Our linear utility function here is simply the PNL of the market maker, i.e.

Its corresponding value function is thus

In this case, the optimal controls are the bid and ask quotes that maximise the PNL of the market-maker throughout the trading day.

2.2 Ansatz and HJB

We will look for a solution of the form

With this ansatz, the HJB equation takes the form

2.3 Computing the optimal controls


Using elementary calculus we find that its maximum is attained at

Analogously, for

the maximum is attained at

With the optimal half-spreads

we can easily compute the optimal spread for the market maker, along with the centre of her spread:

Notice that is necessary only for the solution of the HJB equation, not for the controls. Indeed, we only need to find in order to have our controls explicit.
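The elementary-calculus step can be checked on a toy version of the jump term. The sketch below assumes a jump term of the form A * exp(-k * (delta + s)) * (delta + c), where `c` is a hypothetical stand-in for the change in the value function when a quote is executed and `s` is the half market spread; setting the derivative in delta to zero gives delta = 1/k - c. It also assumes the convention that the spread is the sum of the two half-spreads and the centre is half their difference. None of these names or values come from the text.

```python
import math


def jump_term(delta, c, A=1.0, k=2.0, s=0.1):
    """Toy jump term: intensity times gain per execution."""
    return A * math.exp(-k * (delta + s)) * (delta + c)


def optimal_half_spread(c, k=2.0):
    """First-order condition of jump_term in delta: 1 - k * (delta + c) = 0."""
    return 1.0 / k - c


def grid_argmax(c, lo=-2.0, hi=5.0, n=70_001):
    """Brute-force maximiser of jump_term on a regular grid, for comparison."""
    step = (hi - lo) / (n - 1)
    best = max(range(n), key=lambda i: jump_term(lo + i * step, c))
    return lo + best * step


def spread_and_centre(delta_ask, delta_bid):
    """Assumed convention: spread = sum, centre = half the difference."""
    return delta_ask + delta_bid, 0.5 * (delta_ask - delta_bid)


delta_ask = optimal_half_spread(c=0.3)   # closed form
delta_bid = optimal_half_spread(c=0.1)
spread, centre = spread_and_centre(delta_ask, delta_bid)
```

In this toy setting the market spread `s` and the scale `A` multiply the whole jump term and therefore do not move the maximiser, and the centre vanishes exactly when the two half-spreads coincide.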

2.4 Solving the verification equation

With the optimal controls, the HJB equation reduces to

From the explicit form of we have

Therefore, the verification equation can be rewritten as


We separate (8) into two equations, one for each one of our unknowns:


Let us define

which measures the difference between the expected value of the mid-price at maturity and its current value. With this notation, and using the Feynman-Kac formula twice, first for and then for , we find

where all capital letters inside the integral are evaluated at , i.e. , , etc. This leads to explicit expressions for the (unique) solution and its controls:


2.5 Remarks

Let us explain the decomposition of the solution given in (9) in terms of and . On the one hand, the function is the expectation at expiry of the current portfolio ; this corresponds to a buy-and-hold strategy. On the other hand, the function (as the integral from to suggests) is the profit for playing a dynamic (i.e. high-frequency) market-making strategy. Since , the addition of the market-making mechanism to the strategy is more profitable than the buy-and-hold strategy alone. In consequence, it makes sense to play market-maker dynamically instead of simply applying a buy-and-hold strategy.

Since in principle can be very big, in order to compare strategies we have to compute the value function per time unit, i.e. . Using the integral version of the mean-value theorem, it follows that there exists in such that

In particular, when we have , and . Therefore, at the beginning of the day the expected gain of the market-maker is

The first observation is that the worst mid-price dynamic is the martingale. Indeed, has a strict minimum when , and a non-martingale process has times where . This seems counter-intuitive because, the market-making being lightning fast, it should not be influenced by the long-range behaviour of the mid-price. However, the inventory turnover is slower because it takes several trades to build up and come back to zero, and that is where the directional bet enters into the game.

Another feature is the effect of a non-constant decay rate . Notice that the optimal spread is decreasing in and the value function is decreasing in . If the decay rate of the order-arrival intensity increases, which can be interpreted as either fewer market orders or a more populated limit order book (LOB), then the market-maker has to reduce her spread to keep her order flow constant. But this makes her PNL smaller because she is selling liquidity more cheaply.

The effect of the market spread is very interesting. On the one hand, does not appear at all in the optimal controls, which means that the market-making strategy is independent of . On the other hand, the value function is decreasing in . In consequence, the PNL of the strategy decreases as the spread increases. This is a direct consequence of the hypothesis that the intensity of the arrival of market orders depends on the distance to the other side of the book, not on the distance to the mid-price. Indeed, the bigger the spread, the less market flow captured by the market-maker, even if her position relative to the mid-price does not change.

We have put the intensity into the expectation operator. This is because our formulation allows a stochastic intensity . In fact, our formulation also allows asymmetric intensities and decays ; the only difference is that the symmetry via is lost and we would have two different, complicated exponentials instead.

3 Linear utility function with inventory penalty

3.1 Two inventory penalties

We will penalise the inventory in two ways:

  1. A penalty at expiry, depending on the spread. This models the fact that the market-maker will have to clear her inventory at the market, and as such she will pay the spread for each share:

    For example, if we recover the Stoll model for a quadratic penalty on the inventory (see [11]).

  2. An integral penalty of the (squared) inventory during the remainder of the trading session, weighted by the volatility of the mid-price . This is a sort of tracking error with respect to a flat-inventory position, and is a very standard choice (see e.g. Guilbaud-Pham [8] and Cartea-Jaimungal [4]):

In this section, we will consider the following value function:


Notice that if we recover the previous linear case. Concerning the parameters, the important one is since it gives the penalty as a perturbation of the value function we already computed. The other parameters and can be considered as booleans, so that we can assess a posteriori the effect of the two penalties on the resulting controls.

3.2 The HJB equation and the ansatz

The resulting HJB equation is thus


Recall that we have an explicit formula for the (unique) solution when . Therefore, we will use perturbation methods on . More precisely, we propose the following ansatz:


3.3 Verification equation and its linearisation

With the ansatz (12), the optimal controls take the form


Under these conditions, the verification equation is


Observe that we can rewrite the jump term in as

In consequence, its first-order expansion in is

Analogously, for the term in we have

Therefore, it follows that the linearisation in of the verification equation (14) is


3.4 Solution for

At zero-th order in , which is equivalent to setting , we recover the verification equation of the linear case without inventory penalty (8), i.e.

Therefore, is exactly the solution we found before, i.e.


where (as before) . This fact is our main motivation for the application of perturbation methods on the inventory constraints.
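The accuracy of a first-order expansion of this kind can be illustrated on a toy problem. The sketch below assumes a jump term A * exp(-k * delta) * (delta + c0 + eps * c1), where `c0`, `c1` and `eps` are hypothetical stand-ins for the unperturbed term, the first-order correction and the inventory-risk parameter. Maximising over delta gives the exact value (A / k) * exp(-1 + k * (c0 + eps * c1)), whose expansion to first order in eps is v0 * (1 + eps * k * c1), with v0 the value at eps = 0.

```python
import math


def exact_value(eps, A=1.0, k=2.0, c0=0.1, c1=-0.5):
    """Maximum over delta of A * exp(-k * delta) * (delta + c0 + eps * c1)."""
    return (A / k) * math.exp(-1.0 + k * (c0 + eps * c1))


def first_order_value(eps, A=1.0, k=2.0, c0=0.1, c1=-0.5):
    """Expansion of exact_value around eps = 0, truncated after the linear term."""
    v0 = exact_value(0.0, A, k, c0, c1)
    return v0 * (1.0 + eps * k * c1)


def truncation_error(eps):
    return abs(exact_value(eps) - first_order_value(eps))


# Halving eps should divide the error by roughly four (second-order remainder).
ratio = truncation_error(0.1) / truncation_error(0.05)
```

The quadratic decay of the remainder is what one expects from an expansion that is exact at zero-th order and correct up to first order; it says nothing about the size of the error when the perturbation parameter is not small.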

3.5 Solution for

Keeping only the first-order terms in (15), i.e. those with , we obtain


We decompose (17) into three equations, and as before we solve them one by one via the Feynman-Kac formula. The first equation, which regroups the terms in , is linear:

Its (unique) solution is


Since and are already known, the equation for the terms becomes linear in , i.e.

Therefore, its (unique) solution is


Finally, for we get

whose (unique) solution is


Putting together (12), (16), (18), (19) and (20) we finally obtain the explicit expansion of up to first order in .

3.6 Optimal controls

We are now in a position to write down explicitly the optimal controls up to first order in . From the ansatz (12) and the control equations (13) it follows that


This implies that the optimal spread for the market-maker is


whilst the centre of her spread is


In the light of all the former computations, we can now write down in full splendour the optimal controls for the market-maker:


where is the degree of non-martingality of the mid-price (i.e. the directional bet),

is the (marginal) profit of the market-making as a function of the directional bet , which is a function of , and

is the unitary inventory-risk penalty.

3.7 Remarks

The unitary inventory-risk penalty has two components. The term in (the integral one) penalises a non-flat inventory via the volatility throughout the day. The term in is a penalty that triggers only near the end of the day, but is strong enough to force the market-maker to leave the trading floor with zero inventory. Both penalties are complementary: one keeps the inventory within range during the trading day whilst the other forces the inventory to finish the day near zero.

Notice that the optimal spread is increasing in the unitary inventory-risk penalty . This feature can be understood in a very intuitive way: if a market-maker is more sensitive to an inventory risk then she will be more conservative in her quotes, fearing that even a small price jump would put her on the wrong side of the trend. Her inventory-risk aversion thus translates into a wider spread. However, the inventory does not appear at all in the expression for . Indeed, it is the unitary inventory-risk aversion which determines the width of the spread, not the current inventory level.

Observe that the centre of the spread is decreasing in the inventory . This is also very intuitive: the market-maker will tilt her quotes, rendering them asymmetrical, in order to favour execution against incoming market orders which help her reduce her (absolute) inventory. For example, if she is long inventory (positive inventory) then the centre of her spread moves down: she will post aggressive ask quotes to attract buying market orders, whilst her bid quotes are more conservative because she does not want to increase her inventory via selling market orders. Conversely, if she is short inventory then the market-maker will post conservative sell orders and aggressive buy orders, hoping to reduce her (absolute) inventory via an asymmetrical market-flow.
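The effect of a centre that decreases with the inventory can be made concrete with a small sketch. It does not use the optimal controls of the text: the linear rule `centre = -tilt * inventory` and the parameter `tilt` are purely hypothetical, chosen only because they make the centre decreasing in the inventory, and the quotes are assumed to sit at the mid-price plus the centre, plus or minus half the spread.

```python
def quotes(mid, spread, centre):
    """Bid and ask quotes from the spread and the centre of the spread."""
    ask = mid + centre + 0.5 * spread
    bid = mid + centre - 0.5 * spread
    return bid, ask


def tilted_quotes(mid, spread, inventory, tilt=0.01):
    """Hypothetical linear tilt: the centre decreases with the inventory."""
    centre = -tilt * inventory
    return quotes(mid, spread, centre)


# Long position: both quotes shift down, so the ask is closer to the mid-price.
bid_long, ask_long = tilted_quotes(mid=100.0, spread=0.4, inventory=5)
# Short position: both quotes shift up, so the bid is closer to the mid-price.
bid_short, ask_short = tilted_quotes(mid=100.0, spread=0.4, inventory=-5)
```

In both cases the width of the spread is unchanged; only its position relative to the mid-price moves, so the side that reduces the absolute inventory is the one quoted more aggressively.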

The directional bet is present in the centre . For example, if then the market-maker expects a final mid-price higher than the current one. Therefore, if she is willing to carry some inventory-risk, she will post aggressive bid quotes and less aggressive ask quotes. As a result, on average the inventory will be positive, reflecting the long bet in the mid-price. As we can see, the dynamic of the centre is governed by two opposite effects: tries to build up the inventory to profit from the difference between the current mid-price and the estimate of the final mid-price, whilst the term in aims to keep a flat-inventory position.

If the mid-price is a martingale then and , i.e. we get rid of the integral term in . This implies that the integral term in measures the profit of the market-making strategy with respect to the directional bet . Now assume that in the non-martingale case we want simpler formulas, namely we totally discard the integral terms in and in the volatility ; in fact, this is equivalent to a zero-th order expansion in of the optimal controls. It turns out that we recover the optimal quotes in Fodra and Labadie [6], which were obtained by linearising the verification equation without much care for the accuracy of the approximation. This means that the integral terms, which are path-dependent, offer a correction based on the time to maturity and the degree of non-martingality .

3.8 The effect of transaction costs

Suppose that the market-maker pays a fixed fee of for each traded asset. In most venues , but there are some trading platforms where a liquidity provider receives a rebate, i.e. . Since the transaction cost affects the cash process each time a share is traded, we have to modify (1) as


Let be the optimal controls without transaction costs, and the optimal controls under transaction costs. If we define

then repeating the previous computations we obtain