
# Endogenous Formation of Limit Order Books: Dynamics Between Trades[^1]

[^1]: Partial support from the NSF grant DMS-1411824 is acknowledged by both authors.

Roman Gayduk and Sergey Nadtochiy[^2][^3]

Current version: June 19, 2017

Original version: May 26, 2016

[^2]: Address the correspondence to: Mathematics Department, University of Michigan, 530 Church St, Ann Arbor, MI 48104; sergeyn@umich.edu.

[^3]: We would like to thank the anonymous referees whose constructive remarks helped us improve the paper.

###### Abstract

In this work, we present a continuous-time large-population game for modeling market microstructure between two consecutive trades. The proposed modeling framework is inspired by our previous work [23]. In this framework, the Limit Order Book (LOB) arises as an outcome of an equilibrium between multiple agents who have different beliefs about the future demand for the asset. The agents’ beliefs may change according to the information they observe, triggering changes in their behavior. We present an example illustrating how the proposed models can be used to quantify the consequences of changes in relevant information signals. If these signals, themselves, depend on the LOB, then, our approach allows one to model the “indirect” market impact (as opposed to the “direct” impact that a market order makes on the LOB, by eliminating certain limit orders). On the mathematical side, we formulate the proposed modeling framework as a continuum-player control-stopping game. We manage to split the equilibrium problem into two parts. The first one is described by a two-dimensional system of Reflected Backward Stochastic Differential Equations (RBSDEs), whose solution components reflect against each other. The second one leads to an infinite-dimensional fixed-point problem for a discontinuous mapping. Both problems are non-standard, and we prove the existence of their solutions in the paper.

## 1 Introduction

In this paper, we continue the development of an equilibrium-based modeling framework for market microstructure, initiated in [23]. As in [23], we analyze the market microstructure in the context of an auction-style exchange (as most modern exchanges are), in which the participating agents can post limit or market orders. A crucial component of such a market is the Limit Order Book (LOB), which contains all outstanding limit buy and sell orders (time and price prioritized), and whose shape and dynamics represent the liquidity of the market. We are interested in developing a modeling framework in which the shape of the LOB, and its dynamics, arise endogenously from the interactions between the agents. This is in contrast to many of the existing results on market microstructure, which assume that the shape and dynamics of the LOB are given exogenously. Among the many advantages of our approach is the possibility of modeling the reaction of the LOB to the changes in a relevant market indicator or in the rules of the exchange.[^4]

[^4]: We refer the reader to [23], whose introduction contains a more detailed explanation of the problems of market microstructure and a motivation for our study.

Herein, we extend the discrete time modeling framework proposed in [23] to continuous time, and restrict our analysis to the dynamics of the market between two consecutive trades. The latter simplifies the problem and is justified by the well known empirical fact that most changes in LOB are not due to trades. We manage to establish the existence, and obtain a numerically tractable representation, of an equilibrium in a general continuous time framework, in which the competing agents have different beliefs about the future demand for the asset. These beliefs determine the future distribution of the demand, given the (common) information observed thus far. The latter may, e.g., be generated by a relevant signal (or, market indicator). One can view such conditional distributions as the “models” that the market participants use to predict future demand, and which are based on the (commonly observed) relevant market indicators. Given the beliefs, the agents choose their optimal trading strategies (i.e. limit and market orders), aiming to maximize their expected profits, and reach an equilibrium. The modeling framework proposed herein can be used for predicting the reaction of a market to various changes in the relevant indicators. In particular, if the relevant market indicator depends on the LOB, our framework allows one to model the indirect market impact: i.e. how an initial change to the market may cause further changes to it, due to the information revealed by the initial change (as opposed to the direct impact, e.g., made by a market order eliminating a part of the LOB). An extreme example of such indirect impact is called “spoofing”, and it is an illegal activity aimed at manipulating the market. Our model can be used to quantify such indirect market impact, and it can be, ultimately, used to improve the optimal execution algorithms or to test the consequences of “spoofing” activity. 
We provide a simplistic example illustrating the potential applications of our model in Section 5, although an empirical investigation (including a more careful model specification, and its estimation), which is needed to make any specific conclusions about the actual market behavior, is left for future research.

On the mathematical side, the problem we analyze is the construction of an equilibrium in a control-stopping game with a continuum of players (cf. [3], [47], [7], for more on the general theory of continuum-player games). The main mathematical challenges stem from three sources: the complicated dependence between the individual payoffs and the controls of other players (which lacks the standard convexity and continuity properties), the presence of multiple participants (as compared to a two-player game), and the control-stopping nature of the game. Equilibria in the games with any number of players can often be constructed directly, by means of a system of Partial Differential Equations or a system of (Forward-) Backward Stochastic Differential Equations (BSDEs). However, in the case of multiple players, solving such systems numerically becomes very challenging. In such cases, the description of an equilibrium is, typically, limited to the proof of its existence, which, in turn, is obtained by an abstract fixed-point argument. However, even the latter method presents a challenge in the game considered herein. Namely, the complicated dependence structure between the players’ controls and individual payoffs, along with the control-stopping nature of the game, make it very challenging (or even impossible) to (a) find a compact set of individual controls, which is sufficiently large to include any maximizer of the objective function, and (b) establish the continuity of the objective.[^5] In order to overcome these challenges, we assume the existence of agents with “extremal beliefs” to split the problem into two parts: a control-stopping game with two players, and a pure control game (without stopping) with a continuum of players.
Such a split simplifies our task dramatically, but both resulting problems remain challenging. The first one, concerned with the construction of an equilibrium in a two-player game, leads to a non-standard system of Reflected BSDEs (RBSDEs), whose components reflect against each other, and whose generator lacks the desired regularity. In Subsection 3.2, we prove the existence of a solution to this system, and, in Section 5, we show how it can be computed in a simple example. The second problem, concerned with the equilibrium in a continuum-player game (without stopping), is formulated as a fixed-point problem, and is solved in Subsection 4.1. This auxiliary game is complicated by the fact that it has a discontinuous objective function and does not possess the desired monotonicity properties. Nevertheless, an appropriate “mollification” technique is designed in Subsection 4.1 to construct a solution to the associated fixed-point problem, and, in turn, to describe an equilibrium in the original market microstructure game. One of the computational benefits of the solution method proposed herein is that the aforementioned fixed-point problem can be solved separately for each $(t,\omega)$. In particular, it is not necessary to solve a forward-backward system at each step of the iteration, as it is, for example, done in a typical mean field game (see, e.g., [38], [8]). On the other hand, the local nature of the fixed-point problem causes additional measurability issues in the proof of the existence result. All these issues are addressed in Subsection 4.1, and the main existence result is stated in Theorem 1, in Section 4.

[^5]: Alternatively, one can exploit the monotonicity properties of the objective, to apply a different type of fixed-point theorem. Nevertheless, such monotonicity is also lacking in the present setting.

The literature on market microstructure is vast. Most of the theoretical work is concerned with the problem of a single agent choosing an optimal trading strategy, consisting of limit and/or market orders. The trading environment (e.g. market impact) for this agent is either specified exogenously, or is determined by the agent herself, if she is the designated market-maker. The relevant publications include, among others, [2], [40], [15], [5], [43], [28], [12], [29], [48], [4], [45], [27], [10], and references therein. Nevertheless, none of these works attempt to explain how the key market characteristics (e.g. the shape and dynamics of LOB) arise from the interaction between multiple market participants. Finally, several recent papers have applied an equilibrium-based approach to the problem of optimal execution (cf. [46], [31]). These papers describe an equilibrium between several agents solving an optimal execution problem, with the LOB (or, the market) against which these agents trade being specified exogenously, rather than being modeled as an output of the equilibrium. The endogenous formation of LOB in an auction-style exchange (i.e. without a designated market-maker) is investigated, e.g., in [42], [21], [25], [9], [37], [44], [18]. However, the models proposed in the aforementioned papers do not aim to represent the mechanics of an auction-style exchange with sufficient precision, which is needed to address the questions we investigate herein.

The paper is organized as follows. Section 2 describes the proposed continuum-player game and defines the associated equilibrium. Section 3 introduces an auxiliary two-player game. The latter is interesting in its own right, but its main purpose is to facilitate the construction of an equilibrium in the continuum-player game. The equilibria in the two-player game are described by a system of RBSDEs, whose solution components reflect against each other and whose generator does not satisfy the global Lipschitz and monotonicity properties. Proposition 1, in Subsection 3.2, provides the existence result for this system, which, to the best of our knowledge, has not been available before. Section 4 completes the construction of an equilibrium in the continuum-player game, stating the main result of the paper, Theorem 1. This section, in particular, describes the mollification technique for solving a fixed-point problem with discontinuity, which appears in the auxiliary continuum-player game. We believe that this method can be applied to other relevant fixed-point problems, with a similar type of discontinuity. Finally, in Section 5, we consider a numerical example, in which we compute the equilibrium strategies and show how our results can be used to study the indirect market impact (illustrated by the particular case of “spoofing”).

## 2 Modeling framework in continuous time

### 2.1 Preliminary constructions

We consider an auction-style exchange in which the trades may occur, and the limit orders may be posted, at any time $t\in[0,T]$. The market participants are split into two groups: the external investors, who are “impatient”, in the sense that they only submit market orders and need to execute immediately, and the strategic players, who can submit both market and limit orders, and who are willing to spend time doing so, in order to get a better execution price. In our model, we focus on the strategic players, whom we refer to as agents, and we model the behavior of the external investors exogenously, via the external demand. The external demand for the asset is modeled using three components: the arrival times of the potential external market orders, the value of the potential fundamental price at these times, and the elasticity of the demand. In our previous investigation [23], we have considered a general family of discrete time games for an auction-style exchange, with the exogenous demand process given by a discretization of a (very general) continuous time demand process, over a chosen partition of $[0,T]$. One of the main conclusions of [23] can be, roughly, interpreted as follows: in order for a non-degenerate equilibrium[^6] to exist in a high-frequency limit (i.e. as the diameter of the partition vanishes), the agents have to be market-neutral, i.e. they should not expect the future fundamental price of the asset to increase or decrease. In other words, the results of [23] seem to imply that it is hopeless to search for an equilibrium in a continuous time game (i.e. with unlimited trading frequency) in which the agents have non-trivial trading signals about the direction of the future moves of the asset price.
This may sound very discouraging; however, there is a subtle feature hidden in the setting considered in [23]. Namely, the assumptions of [23] imply that, in the limiting high-frequency regime, the (potential) external market orders arrive with an infinite frequency, while the beliefs of the agents (i.e. their trading signals) satisfy certain continuity properties. In other words, the agents’ signals are assumed to be persistent relative to the trades: they cannot change on the same time scale on which the market orders arrive. It turns out that this assumption is crucial, and, allowing the (potential) external market orders to arrive at a finite frequency, and making the agents’ beliefs be short-lived (i.e. only lasting until the next market order is executed), we can obtain a non-degenerate equilibrium in the continuous time (i.e. unlimited trading frequency) regime. Thus, herein, we model the arrival of the (potential) external market orders via a (rather general) point process, and we assume that the game ends after the first trade occurs.

[^6]: Degeneracy of an equilibrium is defined formally in [23]. For the discussion presented herein, it suffices to know that degeneracy is an extremal state of the market, and the present work is concerned with the description of the typical (or, normal) states.

Let be a stochastic basis, satisfying the usual conditions, and supporting a (multidimensional) Brownian motion and a Poisson random measure . We assume that the compensator of is finite on (i.e. is the jump measure of a compound Poisson process) and that it is absolutely continuous w.r.t. Lebesgue measure in time and space. We denote by the usual augmented filtration generated by . We assume that and are independent under . The arrival times of the potential external market orders and the values of the potential fundamental price at these times are described by a counting random measure on , defined as

$$M(A)=\int_0^T\int_{\mathbb{R}}\mathbf{1}_A\bigl(t,J_t(x)\bigr)\,N(dt,dx),$$

where is a predictable random function (as defined in [30]). We assume that is adapted to (in particular, it is independent of ). It is clear that the compensator of is finite on , it is absolutely continuous w.r.t. Lebesgue measure in time and space, and it is adapted to . Then, it can be represented as , with an -valued process and a random function , progressively measurable and adapted to , and s.t. . Notice that, conditional on , is a Poisson random measure with the compensator . The -components of the atoms of are the arrival times of the potential external market orders, and their -components represent the values of the potential fundamental price at these times. A positive value of corresponds to the arrival time of a potential external buy order, and a negative value corresponds to the arrival time of a potential external sell order. More precisely, we define the fundamental price process (or, the reservation price process of external investors) as the jump process of :

$$X_t=\int_{\mathbb{R}}x\,M(\{t\}\times dx). \tag{1}$$

Note that the is the jump process of , but it is not a cumulative jump process: it stays at at all times except the jump times (thus, can also be interpreted as changes in the fundamental price). We choose to simplify the notation. In general, any is possible, but the only effect it would have on the game is shifting all prices and values by . To develop a better understanding of the proposed framework, from the economic point of view, it may be useful to think of as the last transaction price, which occurred right before the current game started (although this is not important for the mathematical constructions). The process describes the intensity of arrival of the potential external market orders (both buy and sell). The function is the probability density of the value of the potential fundamental price at time . We refer to as the density process of the jump sizes. When the jump size of the fundamental price (along with the demand elasticity, defined below) is not enough to trigger a trade, the jump remains “unregistered” by the agents, and the fundamental price returns to zero. The elasticity of the external demand for the asset is described by the progressively measurable random field , adapted to . We assume that, a.s., is a strictly decreasing continuous function taking value zero at zero. Then, the total external demand to buy and sell the asset at time , at the price level and at all more favorable prices, is equal to

$$D^+_t(p)=\max\bigl(0,\,D_t(p-X_t)\mathbf{1}_{\{X_t>0\}}\bigr),\qquad D^-_t(p)=-\min\bigl(0,\,D_t(p-X_t)\mathbf{1}_{\{X_t<0\}}\bigr), \tag{2}$$

respectively.
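To make (2) concrete, the following minimal sketch evaluates the external demand on either side of the book. The linear elasticity curve `elasticity` (with slope `k`) and all function names are hypothetical choices made only for this illustration; the text only requires the curve to be continuous, strictly decreasing, and zero at zero.

```python
def elasticity(p, k=2.0):
    """Hypothetical elasticity curve D_t(p) = -k * p (an assumption of this
    sketch): continuous, strictly decreasing, and equal to zero at zero."""
    return -k * p


def demand_to_buy(p, x, D=elasticity):
    """D^+_t(p) = max(0, D_t(p - X_t) * 1{X_t > 0}), with x playing the role of X_t."""
    return max(0.0, D(p - x)) if x > 0 else 0.0


def demand_to_sell(p, x, D=elasticity):
    """D^-_t(p) = -min(0, D_t(p - X_t) * 1{X_t < 0}), with x playing the role of X_t."""
    return -min(0.0, D(p - x)) if x < 0 else 0.0


# An upward jump of the fundamental price to 0.5 creates demand to buy
# only at price levels below 0.5, and no demand to sell at all.
buy_at_02 = demand_to_buy(0.2, 0.5)
buy_at_07 = demand_to_buy(0.7, 0.5)
```

With these choices, a jump generates one-sided demand only, and the demand vanishes at price levels on the wrong side of the jump, which is the mechanism behind the "unregistered" jumps mentioned above.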

At any time $t\in[0,T]$, every agent (i.e. strategic player) is allowed to submit a market order or a limit order. The assumptions made further in the paper make it possible to submit a limit order at such a level that it may never get executed; this, effectively, allows the agents to wait (i.e. do nothing). We do not allow for any time-priority in the limit orders. Instead, we assume that the tick size is zero (the set of possible price levels is $\mathbb{R}$), and, hence, an agent can achieve a priority by posting her order slightly above or below the competing ones (and arbitrarily close to them). The game stops at the terminal time $T$ or at the time when the first trade occurs, whichever is the earliest. The mechanics of order execution are explained in the next subsection. There is an infinite number of agents, and the inventory of an agent is measured in “shares per unit mass of agents” (see a discussion of this assumption in [23]). We assume that the agents are split into two groups: the ones whose initial inventory is positive (the long agents, typically, indicated with a superscript “$a$”), and those whose initial inventory is negative (the short agents, indicated with a superscript “$b$”). We assume that the absolute size of each agent’s inventory is the same, , and that an agent with inventory posts orders of size . These assumptions are motivated by the results of our previous investigation [23], which demonstrate that, in equilibrium, the absolute value of an agent’s inventory only scales the size of her orders proportionally, but does not change their type and location.[^7]
We also assume that we are given a pair of measurable spaces of beliefs, and , and, for each , there exists a subjective probability measure on , which is dominated by . An agent with beliefs models the external demand under measure . The empirical distribution of the agents across beliefs is given by a pair of countably additive finite measures , on and , respectively. Note that, because the game stops right after the first market order is executed, the empirical distribution remains constant throughout the game. We make the following assumption on the measures .

[^7]: Note that the precise setting and the main questions of [23] are not the same as in the present paper. Nevertheless, the two modeling frameworks have many common features. In particular, in both cases, each agent is risk-neutral and infinitesimally small (hence, has no individual impact), which, ultimately, causes their equilibrium strategies to simply scale with the size of initial inventory.

###### Assumption 1.

Under every , remains a Brownian motion, and the jump process of is a process with conditionally independent increments w.r.t. (in the sense of [30]).

The above assumption holds throughout the paper. It implies that, under every , is a process with conditionally independent increments w.r.t. . Using this observation and the absolute continuity of w.r.t. , it is easy to deduce that, under every , the compensator of the jump measure of , i.e. of the measure , is given by

$$\lambda^{\alpha}_t f^{\alpha}_t(x)\,dt\,dx, \tag{3}$$

with some nonnegative -adapted and -progressively measurable , s.t. . The interpretation of and is the same as the interpretation of and , but under the measure . Note that we choose not to change the distribution of under different measures for a technical reason – in order to avoid -dependence in the generator of the associated RBSDE system (43).

It is clear that Assumption 1 is satisfied if is given by a stochastic exponential of a process that is an integral of -adapted random function w.r.t. compensated . Namely,

$$dZ^{\alpha}_t=Z^{\alpha}_{t-}\int_{\mathbb{R}}\Gamma^{\alpha}_t(x)\,\bigl[N(dt,dx)-\lambda_t f_t(x)\,dt\,dx\bigr],$$

where is -progressively measurable. The compensator of under is obtained by multiplying its compensator under by , hence, Assumption 1 is clearly satisfied in this case (cf. [30]). In Section 5, we provide an example of a family of probability measures in the above form.
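As a worked illustration of the last statement, the signal in (3) can be read off by matching the two compensators. This sketch assumes the standard Girsanov-type form for jump measures, in which the multiplicative factor is $1+\Gamma^{\alpha}_t(x)$ with $\Gamma^{\alpha}>-1$, and it assumes that the stochastic exponential above is driven by the measure whose compensator is $\lambda_t f_t(x)\,dt\,dx$; both are assumptions of the illustration, not statements taken from the text.

$$
\lambda^{\alpha}_t f^{\alpha}_t(x)=\bigl(1+\Gamma^{\alpha}_t(x)\bigr)\lambda_t f_t(x)
\quad\Longrightarrow\quad
\lambda^{\alpha}_t=\lambda_t\int_{\mathbb{R}}\bigl(1+\Gamma^{\alpha}_t(y)\bigr)f_t(y)\,dy,
\qquad
f^{\alpha}_t(x)=\frac{\bigl(1+\Gamma^{\alpha}_t(x)\bigr)f_t(x)}{\int_{\mathbb{R}}\bigl(1+\Gamma^{\alpha}_t(y)\bigr)f_t(y)\,dy}.
$$

The expression for $\lambda^{\alpha}_t$ follows by integrating the first identity over $x$ and using that $f^{\alpha}_t$ integrates to one; dividing the first identity by $\lambda^{\alpha}_t$ then gives $f^{\alpha}_t$.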

In the proposed setting, the compensator of under , given by (3), represents the supply/demand signal used by the agents with beliefs : in particular, it determines the arrival intensities of external buy and sell orders. Indeed, the value of is determined uniquely by a path of and a realization of the random measure . As the compensator of may be different under each , the resulting compensator of may also vary, however, it always remains adapted to . Thus, the distribution of under is uniquely determined by the choice of .[^8] As a result, the agents’ beliefs can be viewed as the “models” they use to map the observed information, given by , into the predictive signal, given by (3).

[^8]: To have a complete model for the external demand, one also needs to know its elasticity , but the latter is -adapted, hence, its distribution is the same under each .

### 2.2 The continuum-player game

Throughout the rest of this paper we, mostly, work with the filtration , hence, we denote . The state of an agent is $(s,\alpha)$. Let us now discuss the controls of the agents and the order execution rules. First, we assume that $\alpha$, representing the agent’s beliefs, does not change over time.[^9] Therefore, the state process of an agent represents only her inventory, which can only change once (because the game ends after the first trade). The control of every agent is given by a pair of processes $(p,v)$, progressively measurable with respect to .[^10] The process $p$ takes values in , the space of probability measures on $\mathbb{R}$, equipped with the weak topology, while $v$ takes values in . The second coordinate, $v$, determines the time at which the agent decides to submit a market order, and its formal definition is given below. The first coordinate, $p$, indicates the time-$t$ distribution of the agent’s limit orders across the price levels. For example, if $p_t$ is a Dirac measure located at , then, at time $t$, the agent posts all her limit orders at the price level . The collection of all limit orders is described by the Limit Order Book (LOB), which is a pair of processes $\nu=(\nu^a,\nu^b)$, with values in the finite sigma-additive measures on $\mathbb{R}$, adapted to .
Herein, $\nu^a_t$ corresponds to the cumulative limit sell orders, and $\nu^b_t$ corresponds to the cumulative limit buy orders, posted at time $t$.[^11] The bid and ask prices at any time are given by the random variables

[^9]: Note that the conditional distribution of the future demand can change dynamically, according to the new information revealed.

[^10]: It may seem natural to assume that the agents’ filtration is enlarged by the information generated by the external trades, i.e. by the jumps of that lead to a trade. Note that, since the game ends after the first trade, there may only be one such jump. Then, it is easy to see that the predictable filtration of the enlarged filtration, restricted to the time interval until the first trade, is itself. Naturally, we require the controls to be predictable.

[^11]: For convenience, we sometimes refer to $\nu$ as a “measure”, rather than a “pair of measures”.

$$p^b_t=Q^+(\nu^b_t),\qquad p^a_t=Q^-(\nu^a_t),$$

respectively, where the functions and act on sigma-additive measures on via

$$Q^+(\kappa)=\sup\operatorname{supp}(\kappa),\qquad Q^-(\kappa)=\inf\operatorname{supp}(\kappa). \tag{4}$$

Notice that and are always well defined as extended random variables, but may take infinite values.
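For a book with finitely many occupied price levels, the maps in (4) reduce to a maximum and a minimum over the support. The sketch below is only an illustration under that discreteness assumption; the dictionary representation `{price: mass}` and the names `q_plus`, `q_minus`, `nu_a`, `nu_b` are hypothetical. The usual convention that the supremum of the empty set is minus infinity (and the infimum is plus infinity) accounts for the infinite values mentioned above.

```python
import math


def q_plus(book):
    """Q^+(kappa) = sup supp(kappa) for a discrete measure {price: mass}."""
    support = [price for price, mass in book.items() if mass > 0]
    return max(support) if support else -math.inf


def q_minus(book):
    """Q^-(kappa) = inf supp(kappa) for a discrete measure {price: mass}."""
    support = [price for price, mass in book.items() if mass > 0]
    return min(support) if support else math.inf


# Hypothetical book: limit sell orders (ask side) and limit buy orders (bid side).
nu_a = {0.3: 1.0, 0.5: 2.0}
nu_b = {-0.2: 1.5, -0.4: 0.5}

bid = q_plus(nu_b)   # highest price carrying limit buy orders
ask = q_minus(nu_a)  # lowest price carrying limit sell orders
```

An empty side of the book yields an infinite bid or ask, so downstream comparisons with these prices should be written so that they remain meaningful for infinite values.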

Assume that, at time , an agent posts a limit sell order at the price level . If the demand to buy the asset at or below the price level , , exceeds the amount of all limit sell orders posted below at time , i.e. , then the limit sell order of the agent is executed. Analogous execution rules hold for the limit buy orders. Thus, if an agent follows the limit order strategy , her limit order is (partially) executed by an external market order at the time

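$$T^{p,a}=\inf\bigl\{t\in[0,T]:\ D^+_t\bigl(Q^-(p_t)\bigr)>\nu^a_t\bigl((-\infty,Q^-(p_t))\bigr)\bigr\},$$

$$T^{p,b}=\inf\bigl\{t\in[0,T]:\ D^-_t\bigl(Q^+(p_t)\bigr)>\nu^b_t\bigl((Q^+(p_t),\infty)\bigr)\bigr\},$$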

for the long and short agents, respectively. Let us clarify the meaning of the above formulas. Assume, for simplicity (and only for the sake of this example), that the demand elasticity curve, , is deterministic. Note that unless jumps at . Thus, the above formulas say that a non-zero fraction of agent’s limit orders is executed at time , by an external order, if and only if jumps at time , and its jump is sufficiently large, so that the demand at the agent’s “best limit order” is higher than the size of all limit orders with higher price priority. The latter, along with continuity of , ensures that a non-zero fraction of agent’s limit orders is executed at this time.
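The execution rule for a long agent can be checked numerically at a single time instant. The sketch below assumes a discrete ask side of the book and a hypothetical linear elasticity with slope `k`; neither is prescribed by the model, and the function names are illustrative only.

```python
def demand_to_buy(p, x, k=2.0):
    """D^+_t(p) under the hypothetical linear elasticity D_t(p) = -k * p."""
    return max(0.0, -k * (p - x)) if x > 0 else 0.0


def mass_strictly_below(book, p):
    """nu^a_t((-inf, p)): limit sell orders with strictly higher price priority."""
    return sum(mass for price, mass in book.items() if price < p)


def long_limit_order_executed(best_order, nu_a, x, k=2.0):
    """True if the jump x triggers (partial) execution of a long agent's limit
    sell order posted at best_order, i.e. at Q^-(p_t)."""
    return demand_to_buy(best_order, x, k) > mass_strictly_below(nu_a, best_order)


nu_a = {0.3: 1.0, 0.5: 2.0}  # hypothetical ask side of the book

large_jump = long_limit_order_executed(0.5, nu_a, 1.2)  # demand 1.4 exceeds queue 1.0
small_jump = long_limit_order_executed(0.5, nu_a, 0.9)  # demand 0.8 does not
```

The strict inequality matters: when the demand exactly equals the mass of orders ahead of the agent, her order is not reached.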

The value of indicates the critical level of the bid or ask price (i.e. a threshold), at which the agent decides to submit a market order. We assume that the size of the agent’s market order is equal to her inventory, and it is executed at the bid or ask price available at the time when the order is submitted. Thus, the agent will submit her own market order at the time

$$\tau^{v,a}=\inf\{t\in[0,T]:\ v_t\le p^b_t\},\qquad \tau^{v,b}=\inf\{t\in[0,T]:\ v_t\ge p^a_t\},$$

for the long and short agents, respectively.[^12] The collection of all thresholds is described by the pair of processes $\theta=(\theta^a,\theta^b)$, with values in the finite sigma-additive measures on $\mathbb{R}$, adapted to .

[^12]: It is clear that, for every stopping time with respect to , there exists a process , adapted to , such that has the above representation.

###### Remark 1.

The above definitions of the execution times make use of the assumption that each agent is infinitesimally small, and, hence, her order is necessarily executed once the demand reaches it. They also use the following two implicit assumptions: each agent believes that her limit order will be executed first among all orders at the same price level, and her market order will be executed at the best price available. These assumptions and their connection to a finite-player game are discussed in [23].

Recall that each agent is infinitesimal, hence, even if she executes a non-zero fraction of her inventory, this may not constitute a trade of non-zero size. We, therefore, define the first “significant” execution time as the first time when a non-zero mass of agents execute a non-zero fraction of their inventory (i.e. when a non-zero total inventory mass is traded). Consider the first significant execution times of external market orders:

$$T^a=\inf\{t\in[0,T]:\ D^+_t(p^a_t)>0\},\qquad T^b=\inf\{t\in[0,T]:\ D^-_t(p^b_t)>0\}. \tag{5}$$

Similarly, we define the first significant execution times of internal market orders:

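$$\tau^a=\inf\bigl\{t\in[0,T]:\ \theta^a_t\bigl((-\infty,p^b_t]\bigr)>0\bigr\},\qquad \tau^b=\inf\bigl\{t\in[0,T]:\ \theta^b_t\bigl([p^a_t,\infty)\bigr)>0\bigr\}. \tag{6}$$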

Finally, given , we define the clearing prices:

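$$\tilde p^{c,a}_t=\sup\bigl\{p<Q^+(\nu^a_t):\ D^+_t(p)>\nu^a_t\bigl((-\infty,p)\bigr)\bigr\},\qquad p^{c,a}_t=\tilde p^{c,a}_t\,\mathbf{1}_{\{\tilde p^{c,a}_t\ge p^a_t\}},$$

$$\tilde p^{c,b}_t=\inf\bigl\{p>Q^-(\nu^b_t):\ D^-_t(p)>\nu^b_t\bigl((p,\infty)\bigr)\bigr\},\qquad p^{c,b}_t=\tilde p^{c,b}_t\,\mathbf{1}_{\{\tilde p^{c,b}_t\le p^b_t\}}.$$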
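The ask-side clearing price can be computed explicitly when the book is discrete. The sketch below reads the definition as "the largest price at which the demand to buy still exceeds the limit sell orders posted strictly below it", and it assumes a hypothetical linear elasticity with slope `k`; the function name and the `(p_tilde, p_clear)` return convention are illustrative choices, not part of the model.

```python
def clearing_price_ask(nu_a, x, k=2.0):
    """Return (p_tilde, p_clear) for a discrete ask side nu_a = {price: mass}
    after a jump x of the fundamental price, with demand D^+(p) = k * max(0, x - p).
    Returns (None, 0.0) when there is no demand to buy or the book is empty."""
    levels = sorted(price for price, mass in nu_a.items() if mass > 0)
    if x <= 0 or not levels:
        return None, 0.0
    # Below the best ask nothing is posted, so the demand only needs to be positive.
    p_tilde = min(levels[0], x)
    cumulative = 0.0
    for level, next_level in zip(levels, levels[1:]):
        cumulative += nu_a[level]
        # Demand exceeds the cumulative mass exactly for prices below this boundary.
        boundary = x - cumulative / k
        if boundary <= level:
            break
        p_tilde = min(next_level, boundary)
    # p_clear is p_tilde on the event {p_tilde >= ask}, and zero otherwise.
    p_clear = p_tilde if p_tilde >= levels[0] else 0.0
    return p_tilde, p_clear


nu_a = {0.3: 1.0, 0.5: 2.0, 0.8: 1.0}  # hypothetical ask side of the book
p_tilde, p_clear = clearing_price_ask(nu_a, 1.2)
```

In this example the jump generates a demand of 1.4 at the level 0.5, which absorbs the unit mass at 0.3 and part of the orders at 0.5, so the clearing price is 0.5. A jump that stays below the best ask leaves the clearing price at zero, which corresponds to an "unregistered" jump.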

For a long agent with strategy , the game ends at the time (and similarly for the short agents). If an agent has any inventory left at the end of the game, then it is marked to market.[^13] The precise rules for computing the payoff of a long agent, using strategy , are described below.

[^13]: There is no canonical way to choose the marking-to-market rules in a setting where agents have no exogenously given valuation of the asset (and we insist on using such a setting, because we think of the agents as “pure speculators”). In particular, other marking rules are possible. Herein, we merely make a choice of marking rules which is economically meaningful.

If the game is terminated by an external market order: (note that equality is impossible, as the right hand side is predictable and the left hand side is totally inaccessible).

• If (equality is impossible), then the payoff is

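$$\int_{-\infty}^{\tilde p^{c,a}_t} z\,p_t(dz)+\int_{\tilde p^{c,a}_t}^{\infty}\bigl(p^{c,a}_t+p^b_t\bigr)\,p_t(dz),\qquad\text{with } t=T^{p,a}\wedge T^a. \tag{7}$$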
• If , then the payoff is .

Notice that the remaining inventory of an agent is marked to the bid price shifted by the clearing price. This choice can be (heuristically) interpreted as follows. Assume that, after the trade, a new game starts, with the agents having the same distribution of inventory and the same beliefs about the distribution of future jumps of (i.e. the same ). Then, the only parameter that is different in the new game, as compared to the original one, is the value of , which, in the new game, becomes equal to the clearing price. As mentioned in the discussion following (1), the new value of will simply shift all prices and values in the new game by , hence, the bid price is shifted by the value of clearing price. Finally, it is easy to deduce (and will be shown later in the paper) that it is suboptimal for an agent to post a limit buy order at positive levels. Thus, if an external sell order is executed, the clearing price is non-positive, and, hence, the remaining inventory is marked to the current bid price shifted downwards (the opposite holds if an external buy order is executed).
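The payoff (7) is straightforward to evaluate when the agent's limit orders sit on finitely many price levels. The sketch below assumes such a discrete distribution, given as a hypothetical dictionary `{price: weight}` with weights summing to one, and takes the clearing prices and the bid as given inputs; the function name is illustrative.

```python
def long_payoff_external_buy(orders, p_tilde, p_clear, bid):
    """Payoff (7) of a long agent for a discrete limit order distribution:
    orders at or below p_tilde are sold at their own price levels, and the
    remaining inventory is marked to the bid shifted by the clearing price."""
    executed = sum(z * w for z, w in orders.items() if z <= p_tilde)
    leftover_weight = sum(w for z, w in orders.items() if z > p_tilde)
    return executed + (p_clear + bid) * leftover_weight


# Half of the inventory is posted at 0.3 and half at 0.8 (hypothetical numbers).
orders = {0.3: 0.5, 0.8: 0.5}
payoff = long_payoff_external_buy(orders, p_tilde=0.5, p_clear=0.5, bid=-0.2)
```

Here the half posted at 0.3 is executed, contributing 0.15, while the half posted at 0.8 is marked to 0.5 + (-0.2) = 0.3, contributing another 0.15.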

If the game is terminated by an internal market order: .

• If then the payoff is .

• If then the payoff is .

To explain the above, assume, e.g., that an internal buy order occurs: i.e. . Note that the internal orders are different, because they are predictable. Hence, the long agents can act exactly at the time and “flock” their limit orders to the best ask price, , to match the market orders from short agents (who initiated the internal buy order). On the other hand, if any of the agents (long or short) do not trade at , they will mark their inventory to the bid or ask price shifted by , and, since , it is easy to see that it is beneficial for all of them to trade at .[^14]

[^14]: Of course, in practice, not all agents will act at the same time: only a fraction of them will submit the internal market orders at the end of the game, the others will move on to the next game, with updated . However, such “flocking” of agents at the end of the game (provided the game ends with an internal market order) is consistent with the empirical observation of “clustering trades”.

The following diagram (containing a reference to equation (7)) describes the payoff of a long agent:

Similar rules apply to short agents. Formally, given , the individual objective of an agent starting at the initial state and using the control is given by:

 J(ν,θ),(p,v)(1,α)=Eα[∫R(z1{z≤~pc,a^Tp,a}+(pb^Tp,a+pc,a^Tp,a)1{z>~pc,a^Tp,a})p^Tp,a(dz)1{^Tp,a
 +(pbTb+pc,bTb)1{Tb<^Tp,a∧^τv,a∧τb}+(paτb1{τb<^τv,a}+pb^τv,a1{τb≥^τv,a})1{^Tp,a∧Tb>^τv,a∧τb}]

where , , and we assume that . Similarly,

 J(ν,θ),(p,v)(−1,α)=Eα[−∫R(z1{z≥~pc,b^Tp,b}+(pa^Tp,b+pc,a^Tp,b)1{z<~pc,b^Tp,b})p^Tp,b(dz)1{^Tp,b
 −(paTb+pc,aTa)1{Ta<^Tp,b∧^τv,b∧τa}−(pbτa1{τa<^τv,b}+pa^τv,b1{τa≥^τv,b})1{^Tp,b∧Ta>^τv,b∧τa}]

where , . Every agent aims to maximize her objective. The above objectives may seem convoluted – this is because they are meant to provide a close approximation of the real-world execution rules and marking to market. In the next subsection, we establish a more transparent representation of the objectives.

In the following definitions, we assume that a stochastic basis, a Brownian motion , a random measure , a random field , spaces and , an associated set of measures , and the empirical distribution , are fixed and satisfy the assumptions made earlier in this section. (Nevertheless, it is shown in Subsection 2.3 that the input can be replaced by the agents’ signals .)

###### Definition 1.

For a given market and a state , a pair of -progressively measurable processes is an admissible control, if the positive part of the expression inside the expectation in (8) (if ) or (9) (if ) has a finite expectation under .

###### Definition 2.

For a given market and state , we call an admissible control optimal if

$$J^{(\nu,\theta),(p,v)}(s,\alpha) \geq J^{(\nu,\theta),(p',v')}(s,\alpha)$$

-a.s., for any admissible control .

In the above, we make the standard assumption of games with a continuum of players: each agent is too small to affect the distribution of cumulative controls (described by ) when she changes her control. Next, we define Nash equilibrium in the proposed game.

###### Definition 3.

A given market and a pair of -progressively measurable random fields form an equilibrium, if

1. for -a.e. , is an optimal control for and ,

2. and the following holds -a.s., for any and any :

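$$\nu^a_t((-\infty,x]) = \int_{A} p_t(1,\alpha;(-\infty,x])\,\mu^a(d\alpha), \qquad \nu^b_t((-\infty,x]) = \int_{B} p_t(-1,\alpha;(-\infty,x])\,\mu^b(d\alpha), \tag{10}$$

$$\theta^a_t((-\infty,x]) = \int_{A} \mathbf{1}_{\{v_t(1,\alpha)\leq x\}}\,\mu^a(d\alpha), \qquad \theta^b_t((-\infty,x]) = \int_{B} \mathbf{1}_{\{v_t(-1,\alpha)\leq x\}}\,\mu^b(d\alpha). \tag{11}$$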
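The consistency condition (10) can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that there are finitely many belief types, so that the integral against the empirical distribution reduces to a weighted sum, and that each agent's limit orders form a discrete measure. The names `agent_orders`, `weights`, `optimist` and `pessimist` are hypothetical and do not appear in the paper.

```python
def individual_cdf(atoms, x):
    """Mass that a single agent's order measure places on (-inf, x].

    `atoms` is a list of (price_level, mass) pairs (a discrete measure).
    """
    return sum(mass for level, mass in atoms if level <= x)


def aggregate_cdf(agent_orders, weights, x):
    """Discrete analogue of (10): nu((-inf, x]) as a weighted sum over belief types.

    `weights[alpha]` plays the role of the empirical distribution of beliefs
    (an assumption of this sketch: finitely many types).
    """
    return sum(w * individual_cdf(agent_orders[alpha], x)
               for alpha, w in weights.items())


# Hypothetical long agents posting limit sell orders.
agent_orders = {
    "optimist": [(2.0, 1.0)],               # all inventory at level 2.0
    "pessimist": [(1.0, 0.5), (1.5, 0.5)],  # split between 1.0 and 1.5
}
weights = {"optimist": 0.6, "pessimist": 0.4}

book_at_1_5 = aggregate_cdf(agent_orders, weights, 1.5)
```

In this toy setting, the ask side of the book is fully determined by the individual controls and the distribution of types, which is the content of the fixed-point requirement in the definition.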

Note that a trivial equilibrium, in which all agents stop immediately, is always possible. However, such an equilibrium, clearly, is not sufficient for modeling purposes, and the existence of other, non-trivial, equilibria is far from obvious. In the remainder of this paper, we use an auxiliary two-player game (cf. Section 3) to identify a class of more realistic potential equilibria, in which the end time of the game is determined uniquely by the solution of an associated RBSDE system (cf. (44)), and we prove the existence of equilibrium in this class, in Theorem 1. Even though it is possible to construct models in which the resulting equilibrium is still trivial (i.e. the end time of the game is zero), this is not the case in general, as confirmed by the example in Section 5.

###### Remark 2.

In the above definition, it is implicitly assumed that the empirical measure of the agents’ states remains constant in time until the game is over for all players. This is, indeed, the case, if the equilibrium is such that, -a.s., for all , we have:

$$\mu\circ\big((s,\alpha)\mapsto S_t(s,\alpha)\big)^{-1} = \mu, \tag{12}$$

with

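$$S_t(1,\alpha) = \mathbf{1}_{[0,\,T^{p(1,\alpha),a}\wedge\tau^{v(1,\alpha),a})}(t), \quad\text{and}\quad S_t(-1,\alpha) = -\mathbf{1}_{[0,\,T^{p(-1,\alpha),b}\wedge\tau^{v(-1,\alpha),b})}(t).$$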

The condition (12) may fail if a non-zero mass of agents manages to execute their orders strictly before : i.e. if for a set of with positive -measure, or for a set of with positive -measure. The latter cannot occur due to external market orders, because they only arrive at a finite number of times and, before , only a zero mass of agents can execute their limit orders against any such market order (cf. (5)). It is also true that, at any time , before , only a zero mass of agents can execute their internal market orders (cf. (6)). However, the set of such times may be uncountable. Therefore, to ensure that remains constant and, hence, (12) holds, it suffices to consider only the equilibria satisfying, -a.s., for all , except, possibly, a countable set:

$$v_t(1,\alpha) \geq v^a_t := Q^-(\theta^a_t), \qquad v_t(-1,\alpha) \leq v^b_t := Q^+(\theta^b_t), \qquad \forall\,\alpha\in A\cup B.$$

In the subsequent sections, we construct such an equilibrium.

### 2.3 Representation of the objective

In this section, we provide an equivalent representation of the objective of the agents, which makes it more tractable and more convenient for the analysis that follows. In addition, it shows that the main input parameters for the proposed equilibrium problem are the signals , forming the compensators of under , and the demand elasticity (the latter is independent of and, in many realistic models, can be deterministic). In particular, there is no need to keep track of the random measure and the probability measures – they are only needed to show that the present setting fits within the standard framework for games with heterogeneous beliefs. The desired representation is derived following standard arguments, making use of the independence of the driving Poisson measure and the Brownian motion . First, we introduce new notation that will be used throughout the paper. For any , , and , we define the instantaneous filling rates for limit orders at levels and :

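$$F^{+,\alpha}_t(x) = \int_{x\vee 0}^{\infty} f^{\alpha}_t(u)\,du, \qquad F^{-,\alpha}_t(y) = \int_{-\infty}^{y\wedge 0} f^{\alpha}_t(u)\,du, \qquad c^{\alpha}_t(x,y) = \lambda^{\alpha}_t\big(F^{-,\alpha}_t(y) + F^{+,\alpha}_t(x)\big). \tag{13}$$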
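To make (13) concrete, the following minimal Python sketch evaluates the filling rates in closed form under an illustrative assumption: the jump-size density is replaced by a centered Gaussian density with standard deviation `sigma`, and the jump intensity is a constant `lam`. Neither choice is taken from the paper; they only serve to show how the truncations at zero enter.

```python
import math


def normal_cdf(u, sigma=1.0):
    """CDF of a centered Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(u / (sigma * math.sqrt(2.0))))


def filling_rates(x, y, lam=1.0, sigma=1.0):
    """Return (F_plus(x), F_minus(y), c(x, y)) as in (13).

    Assumption of this sketch: the density f is Gaussian, so that
    F_plus(x)  = integral of f over [max(x, 0), inf),
    F_minus(y) = integral of f over (-inf, min(y, 0)].
    """
    f_plus = 1.0 - normal_cdf(max(x, 0.0), sigma)
    f_minus = normal_cdf(min(y, 0.0), sigma)
    return f_plus, f_minus, lam * (f_minus + f_plus)


# Ask at level 1, bid at level -1, jump intensity 2 (hypothetical numbers).
f_plus, f_minus, total_rate = filling_rates(1.0, -1.0, lam=2.0)
```

Because of the truncation at zero, an ask posted at a negative level is filled at the same rate as an ask posted at zero: only positive jumps can execute sell orders, and symmetrically for bids.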

Next, we define the clearing price as a function of the fundamental price :

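$$l^{c,a}_t(x) = \sup\big\{p < Q^+(\nu^a_t) \,:\, D_t(p-x) > \nu^a_t((-\infty,p))\big\}, \tag{14}$$

$$l^{c,b}_t(x) = \inf\big\{p > Q^-(\nu^b_t) \,:\, -D_t(p-x) > \nu^b_t((p,\infty))\big\}. \tag{15}$$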
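As an illustration of (15), the sketch below approximates the bid-side clearing price on a price grid. It rests on several assumptions that are not taken from the paper: the bid side of the book is a discrete measure, the lower quantile of the book is read as the lowest occupied price level, and the demand curve is the hypothetical linear function D(z) = -k z. The function and parameter names are likewise hypothetical.

```python
def mass_above(atoms, p):
    """Mass that the discrete measure `atoms` places on (p, inf)."""
    return sum(m for level, m in atoms if level > p)


def clearing_price_bid(atoms, x, k=1.0, step=1e-3, p_max=10.0):
    """Grid approximation of inf{p > q_minus : -D(p - x) > nu((p, inf))}.

    Assumptions of this sketch: D(z) = -k * z with k > 0, and q_minus is
    the lowest occupied level of the bid book. Since -D(p - x) is increasing
    in p and nu((p, inf)) is non-increasing, the first grid point that
    satisfies the inequality lies within `step` of the infimum.
    """
    q_minus = min(level for level, _ in atoms)
    n = 1
    while q_minus + n * step <= p_max:
        p = q_minus + n * step
        if k * (p - x) > mass_above(atoms, p):
            return p
        n += 1
    return None  # no clearing price found below p_max


# Hypothetical bid book: half of the mass at -3, half at -1.
bids = [(-3.0, 0.5), (-1.0, 0.5)]
price = clearing_price_bid(bids, x=-2.0)
```

With these numbers, the inequality first holds just above -1.5, where the external supply k(p - x) starts to exceed the mass 0.5 of bids posted above p.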

Notice that, if has a positive jump at time , then the clearing price at time is given by . Similarly, if has a negative jump at time , then . Finally, we introduce the instantaneous reward rates from executed limit orders, distributed according to , with the bid and ask prices and :

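$$\begin{aligned} h^{\alpha,a}_t(\kappa,x,y) = {} & \lambda^{\alpha}_t \int_{(Q^-(\kappa)\wedge x)\vee 0}^{\infty} f^{\alpha}_t(u)\Big[\int_{-\infty}^{l^{c,a}_t(u)} z\,\kappa(dz) + \big(y + l^{c,a}_t(u)\mathbf{1}_{\{l^{c,a}_t(u)\geq x\}}\big)\,\kappa\big((l^{c,a}_t(u),\infty)\big)\Big]du \\ & + \lambda^{\alpha}_t \int_{-\infty}^{y\wedge 0} f^{\alpha}_t(u)\big(y + l^{c,b}_t(u)\big)\,du, \end{aligned} \tag{16}$$

$$\begin{aligned} h^{\alpha,b}_t(\kappa,x,y) = {} & \lambda^{\alpha}_t \int_{-\infty}^{(Q^+(\kappa)\vee y)\wedge 0} f^{\alpha}_t(u)\Big[\int_{l^{c,b}_t(u)}^{\infty} z\,\kappa(dz) + \big(x + l^{c,b}_t(u)\mathbf{1}_{\{l^{c,b}_t(u)\leq y\}}\big)\,\kappa\big((-\infty,l^{c,b}_t(u))\big)\Big]du \\ & + \lambda^{\alpha}_t \int_{x\vee 0}^{\infty} f^{\alpha}_t(u)\big(x + l^{c,a}_t(u)\big)\,du. \end{aligned} \tag{17}$$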

Using the above notation, we can obtain a simplified expression for the objective, given in the following lemma. Note that the expectation in this representation is taken under the reference measure, and the objective depends only on the cumulative actions and on (as the expressions in (13)–(17) depend only on ).

###### Lemma 1.

Let Assumption 1 hold. Given a market , for any and any admissible strategy , we have:

 J(ν,θ),(p,v)(1,α)=E[∫^τv,a∧τb0exp(−∫s0cαu(pau∧Q−(pu),pbu)du)hα,as(ps,pas,pbs)ds (18)
 J(ν,θ),(p,v)(−1,α)=−E[∫^τv,b∧τa0exp(−∫s0cαu(pau,pbu∨Q+(pu))du)hα,bs(ps,pas,pbs)ds (19)
 +exp(−∫^τv,a∧τb0cαu(pau,pbu∨Q+(pu))du)(pbτa1{τa<^τv,b}+pa^τv,b1{τa≥^τv,b})],

where , and the expectations are taken under .

###### Proof.

The proof follows easily by conditioning on . Notice that, conditional on , is a Poisson random measure, with the deterministic compensator , which is finite on . Recall also that , , , , , , , , , , , and all the random functions defined above the lemma, are adapted to . Conditional on , they become deterministic functions of time. Recall the fundamental price process, , and introduce

 Yt=Xt(1{Xt>(pat∧Q−(pt))∨0}+1{Xt

Notice that is the time of the first positive jump of , and is the time of its first negative jump. Notice also that, conditional on , the clearing price becomes a deterministic function of and : . Thus, conditional on , the expression inside the expectation in (8) becomes a function of the time and size of the first jump of . Conditional on , is the jump process of a Poisson random measure with the compensator . It is also clear that, conditional on , is the jump process of a non-homogeneous compound Poisson process with intensity , and with the distribution of jump sizes at time given by

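$$\frac{\lambda^{\alpha}_t f^{\alpha}(x)}{c^{\alpha}_t(p^a_t\wedge Q^-(p_t),\,p^b_t)}\Big(\mathbf{1}_{\{x\leq p^b_t\wedge 0\}} + \mathbf{1}_{\{x\geq (p^a_t\wedge Q^-(p_t))\vee 0\}}\Big)\,dx.$$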

A standard computation, then, yields (18). The equation (19) is derived similarly. The expectations in (18) and (19) are taken under , because the expressions inside the expectations are adapted to , and has the same distribution under and .
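The structure of the representation in Lemma 1 (a running reward discounted by the cumulative filling rate, plus a discounted terminal payoff) can be checked numerically in a heavily simplified setting. The sketch below assumes constant coefficients, which is not the setting of the paper: a constant filling rate `c`, a deterministic reward `r` received at the first relevant jump (so that the reward rate is `h = c * r`), a deterministic stopping time `tau`, and a terminal payoff `g`. All names and numbers are hypothetical.

```python
import math
import random


def closed_form(c, r, g, tau):
    """Integral of exp(-c*s) * h over [0, tau], plus exp(-c*tau) * g, with h = c * r."""
    h = c * r
    return h * (1.0 - math.exp(-c * tau)) / c + math.exp(-c * tau) * g


def monte_carlo(c, r, g, tau, n=100_000, seed=0):
    """Simulate the game directly: the first jump time is exponential with rate c.

    The agent receives r if the jump occurs before tau, and g otherwise.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        first_jump = rng.expovariate(c)
        total += r if first_jump < tau else g
    return total / n


exact = closed_form(c=2.0, r=1.5, g=0.7, tau=1.0)
estimate = monte_carlo(c=2.0, r=1.5, g=0.7, tau=1.0)
```

The two values agree up to Monte Carlo error, reflecting the fact that the probability of no relevant jump before time s equals the exponential of minus the integrated filling rate. In the paper, the same computation is carried out conditionally, with time-dependent and random coefficients.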

## 3 A two-player game

In this section, we consider an auxiliary non-zero-sum two-player control-stopping game. It is related to the continuum-player game, but the precise connection will be established in the subsequent sections. We refer the reader to [33], [34], and the references therein, for more on non-zero-sum two-player control-stopping games. (See, e.g., [19], [36], [17], [6], and the references therein, for the related classical Dynkin games, which are zero-sum and stopping-only.) It is worth mentioning, however, that the present game does not fall within any of the classes considered before. A more detailed description of this class of games is carried out in our forthcoming work [22].

Assume that all the probabilistic constructions made in Subsection 2.1 are in place. Namely, we are given a stochastic basis, with a Brownian motion , a Poisson measure , a counting random measure , a family of probability measures , and with the demand elasticity process , as described in Section 2. We assume that Assumption 1 holds. Assume, in addition, that and . Consider a two-player game, in which the first (long) player starts with the initial inventory and has beliefs , and the second (short) player starts with the initial inventory and has beliefs . The game proceeds according to the rules similar to those described in the previous section: each agent can post limit orders on the respective side of the book, or can terminate the game by submitting a market order. The execution of limit orders against the external market orders occurs in exactly the same way as described in the previous section. However, herein, at any given time, each agent is only allowed to post limit orders at a single location (i.e. the control is a Dirac measure). In addition, the main difference between the present game and the one defined in the previous section is that, herein, each player has a non-zero mass and, hence, can affect the LOB. In fact, since there is only one player on each side of the book, the LOB is given by a combination of two Dirac measures: , , controlled by the locations of the players’ limit orders: for the long agent, and for the short one. Clearly, also coincides with the ask price, and is the bid price. Note that each of these prices is now controlled by a single agent, which is not the case in the original game described in the previous section. The same is true for the stopping thresholds: and are given by Dirac measures, and the locations of these measures correspond to the thresholds and used by the long and short agents, respectively. 
In this new game (due to its simplicity), it turns out to be more convenient to work with the associated stopping times and . In fact, we will further constrain the agents’ controls, so that and . The meaning behind these constraints is clear: every agent assumes that the counterparty will execute a market order at exactly the same time as she does, and that these orders are executed at the same price. Taking into account the above considerations, we transform (8) into the objective of a long player:

 ~Ja,(pb,¯p),(p,τ)=Eα0[pTp,a1{Tp,aτ}], (20)

where , and are -valued -adapted processes, is a stopping time with values in , and

 Tb=inf{t∈[0,T]:Xtpt},Xt=M({t}×R).

Similarly, for the short agents,

 ~Jb,(pa,¯p),(p,τ)=−Eα0[pTp,b1{Tp,bτ}], (21)

where , and are -valued -adapted processes, is a stopping time with values in , and

 Ta=inf{t∈[0,T]:Xt>pat},Tp,b=inf{t∈[0,T]:Xt

Using Lemma 1, we deduce the following form of the objective functions

 ~Ja,(pb,¯p)