Mean-Field Games with Differing Beliefs for Algorithmic Trading†
†SJ would like to acknowledge the support of the Natural Sciences and Engineering Research Council of Canada (NSERC) [funding reference numbers RGPIN-2018-05705 and RGPAS-2018-522715].
Work in Progress
Abstract
Even when confronted with the same data, agents often disagree on a model of the real world. Here, we address the question of how interacting heterogeneous agents, who disagree on what model the real world follows, optimize their trading actions. The market has latent factors that drive prices, and agents account for the permanent impact they have on prices. This leads to a large stochastic game, where each agent’s performance criterion is computed under a different probability measure. We analyse the mean-field game (MFG) limit of the stochastic game and show that the Nash equilibrium is given by the solution to a non-standard vector-valued forward-backward stochastic differential equation. Under some mild assumptions, we construct the solution in terms of expectations of the filtered states. We prove the MFG strategy forms an ε-Nash equilibrium for the finite player game. Lastly, we present a least-squares Monte Carlo based algorithm for computing the optimal control and illustrate the results through simulation in a market where agents disagree on the model.
Casgrain, P. and Jaimungal, S.
1 Introduction
Financial markets are immensely complicated dynamic systems which incorporate the interactions of millions of individuals on a daily basis. Market participants vary immensely, both in terms of their trading objectives and in their beliefs on the assets they are trading. All of these participants compete with one another in an attempt to achieve their own personal objectives in the most efficient way possible. Traded assets may also be driven by latent factors, and agents must dynamically incorporate data into their trading decisions.
In this paper, we propose a game theoretic model in which a large population of heterogeneous agents all trade the same asset. This model considers heterogeneity not only from the point of view of an individual’s trading objectives and risk appetite, but also from the point of view of each agent’s beliefs regarding the performance of the asset they are trading. We pay particular attention to the information each agent is privy to, in an attempt to render the framework as realistic as possible, while maintaining some semblance of tractability.
We study the equilibrium of these markets by using the theory of mean-field games (MFGs), which serves to describe game-theoretic models as the number of participating agents becomes extremely large. The general theory of mean-field games already has a large body of research associated with it. The original works stem from [16], [15], and [19]. Among the many extensions and generalizations which explore the broad theory of MFGs as well as their applications, we highlight the following works: [14], [13], [21], and [3]. This theory has seen applications in various financial contexts, such as [4] and [17], who use it to model systemic risk, [18], who use it for algorithmic trading in the presence of a major agent and a population of minor agents, [2], who investigate MFGs in the context of optimal execution, and [9], who look at mean-field games in algorithmic trading with partial information on states.
In contrast to other work on MFGs, as well as its specific application to algorithmic trading, here, motivated by [6], we include latent states so that agents do not have full information about the system dynamics. In contrast to [7], who also study a stochastic game with latent factors, we study how varying beliefs among the agents affect the optimal trading behaviour. In our model, we express the belief of agents as a probability measure on the dynamics of the asset price process and of any latent processes that may be driving them. As far as the authors are aware, this is the first time that MFGs with varying beliefs have been treated in the literature. This generalization is quite non-trivial; nonetheless, we succeed in characterizing the model equilibrium as the solution to a non-standard forward-backward stochastic differential equation (FBSDE) defined across the collection of belief measures. We are able to present a closed-form representation for the solution of the MFG, and it incorporates all of the market’s differing beliefs into the decisions of the individual agents.
We structure the remainder of the paper as follows. Section 2 introduces the market model and the stochastic game that agents participate in. Section 3 begins by introducing the MFG limit of the stochastic game and then proves that the collection of optimal strategies in the MFG may be represented as the solution to a system of coupled FBSDEs. Next, the system of FBSDEs is solved, and the solution to the mean field as well as each individual’s strategy is provided. Section 4 provides a specific example of a model where the assumptions in the key results are satisfied. In Section 5 we show that the solution to the MFG satisfies the ε-Nash equilibrium property in the finite population game. Lastly, Section 7 provides a least-squares Monte Carlo approach to computing certain expectations, as well as simulated examples of a market model with agents having differing beliefs.
2 The Model
In this section, we provide the market model and the participating agents’ performance criteria. The model presented in the remainder of this section closely resembles the model for the stochastic game used in [7]. The stochastic game presented here aims to characterize a population of agents with several sources of heterogeneity. As in [7], here, agents have varying trading objectives. In addition, however, agents are also characterized by their beliefs regarding the model driving the asset price process. In the remainder of this section, we present the trading mechanics which each of the agents use to interact with the market, as well as the objectives each of the agents seek to achieve with their actions.
2.1 The Population of Agents
The market consists of a population of rational heterogeneous agents trading a single asset. Agents are indexed with an integer . The total population of agents is divided into disjoint subpopulations, which are indexed by . is assumed to be constant and independent of . All agents within a fixed subpopulation behave in a homogeneous manner. The set
(1) 
denotes the set of agents within subpopulation , and the superscript indicates the explicit dependence on the total number of agents. We also define to be the total number of agents within subpopulation . We further assume the number of agents contained in each of the subpopulations remains stable as we take the population limit to infinity. More specifically, we require that the proportion of agents contained within population satisfies
(2) 
2.2 The Agent’s State Processes
We work on the filtered probability space completed by the null sets of and where is some fixed time horizon. All of the processes defined in the remainder of this section are adapted, unless otherwise specified, and the notation represents expectation with respect to the measure .
All agents have the ability to buy and sell the asset over the fixed trading period , after which all trading activity comes to a halt. Each agent controls the amount they wish to purchase or sell at a continuous rate denoted , where () indicates the rate of buy (sell) orders the agent sends to the market. At the start of the trading period, each agent is assumed to hold a random amount of the asset. Agents keep track of their holdings in the traded asset with the inventory process , where the superscript indicates the explicit dependence on the agent’s controlled rate of trade. The relationship between an agent’s trading rate and their inventory process is
(3) 
This can be interpreted as each agent buying or selling an amount in each small time interval .
Assumption
We make the technical assumption that the initial inventory holdings of all agents have a bounded variance, so that for which . Moreover, we assume that the mean of the starting inventory levels is the same within a given subpopulation, so that for each .
Buying and selling actions of agents impact the price of the traded asset in a manner to be specified below. As well, agents believe the asset midprice follows (potentially) different models. We incorporate differing beliefs into our model by assigning a probability measure to each subpopulation . The various measures correspond to the model that agents in a particular subpopulation believe to represent the true dynamics of the asset price.
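The interpretation above can be sketched numerically. The snippet below is a minimal illustration, assuming the inventory evolves as dQ_t = ν_t dt, where Q (inventory) and ν (trading rate) are notation adopted here for the sketch; the function name and the input values are hypothetical.

```python
def inventory_path(q0, rates, dt):
    """Euler discretisation of dQ_t = nu_t dt, starting from Q_0 = q0.

    Assumed notation: q0 is the starting inventory, rates[i] is the trading
    rate over the i-th interval (positive = buying, negative = selling).
    """
    path = [q0]
    for nu in rates:
        # Over a small interval of length dt the agent trades nu * dt units.
        path.append(path[-1] + nu * dt)
    return path


# Hypothetical inputs: buy at rate 2 for two steps, then sell at rate 1 for two steps.
path = inventory_path(q0=10.0, rates=[2.0, 2.0, -1.0, -1.0], dt=0.5)
print(path)
```

Each entry of `path` is the inventory held at the start of the corresponding interval, with the final entry giving the terminal inventory.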
We define the asset price process , where the superscript indicates the dependence of the price on the actions of all agents in the market. It is useful to define the average trading rate of all agents within subpopulation as
(4) 
Each agent in subpopulation then believes the asset price process follows the dynamics
(5) 
where for each , is a predictable process, is a adapted martingale, and are constants. We also assume here that the initial inventory holdings of each agent are all independent of both and in each measure .
The measure effectively specifies the subpopulation’s asset price model through the processes and , as well as the scale of the market impact of each subpopulation, through the set of constants .
Assumption
Here, we make the technical assumptions that and , where
(6)  
(7) 
for each and where represents the Euclidean norm.
Assumption
We also make the assumption that for all and the law under each measure is the same as that under the measure . Lastly, we assume that for each , and are uncontrolled – i.e., are unaffected by the agents’ actions.
Each agent tracks their total accumulated cash process throughout the trading period. When buying and selling the asset, each agent pays an instantaneous cost that is linearly proportional to the amount of shares transacted. This cost is expressed through the controlled dynamics of the cash process. For an agent , their corresponding cash process is
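To make the role of permanent impact concrete, the following is a minimal simulation sketch. It assumes price dynamics of the form dS = (A_t + Σ_k λ_k ν̄^k_t) dt + σ dW, in which the martingale component is taken to be a scaled Brownian motion; this choice, together with the symbols A, λ, ν̄, σ and all function and parameter names, is an assumption made for illustration rather than the specification of any particular belief measure.

```python
import math
import random


def simulate_price(s0, drift, lambdas, mean_rates, sigma, dt, seed=0):
    """Euler scheme for dS = (A_t + sum_k lambda_k * nubar^k_t) dt + sigma dW.

    drift      : list of A_t values, one per time step (assumed given)
    lambdas    : permanent-impact constant for each subpopulation
    mean_rates : mean_rates[i][k] is the average trading rate of
                 subpopulation k during step i
    sigma      : volatility of the assumed Brownian martingale part
    """
    rng = random.Random(seed)
    path = [s0]
    for a_t, rates_t in zip(drift, mean_rates):
        # Aggregate permanent impact of all subpopulations on the drift.
        impact = sum(lam * nu for lam, nu in zip(lambdas, rates_t))
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(path[-1] + (a_t + impact) * dt + noise)
    return path


# Hypothetical two-subpopulation example with the noise switched off, so that
# only the drift and the order-flow impact move the price.
prices = simulate_price(
    s0=100.0,
    drift=[0.0] * 4,
    lambdas=[0.1, 0.2],
    mean_rates=[[2.0, -0.5]] * 4,
    sigma=0.0,
    dt=0.25,
)
print(prices[-1])
```

Two agents who disagree on the model would, in this sketch, simply call `simulate_price` with different `drift` (and possibly `sigma`) inputs while facing the same order flow.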
(8) 
where is a parameter that is unique to a subpopulation and sets the scale of the instantaneous cost.
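The effect of the instantaneous cost can be seen in a short sketch. We assume here cash dynamics of the form dX = -(S_t + a ν_t) ν_t dt, so that the price paid per share is the midprice plus a premium linear in the trading rate; the symbols X, S, ν, a and the function name are notation chosen for this illustration.

```python
def cash_path(x0, prices, rates, a, dt):
    """Euler scheme for dX = -(S_t + a * nu_t) * nu_t dt.

    Assumed notation: prices[i] is the midprice and rates[i] the trading rate
    over step i; a > 0 scales the instantaneous (temporary) cost.
    """
    path = [x0]
    for s, nu in zip(prices, rates):
        # Buying (nu > 0) pays above the midprice, selling (nu < 0) receives below it.
        path.append(path[-1] - (s + a * nu) * nu * dt)
    return path


# Hypothetical round trip at a constant midprice: buy one unit, then sell it.
cash = cash_path(x0=0.0, prices=[100.0, 100.0], rates=[1.0, -1.0], a=0.5, dt=1.0)
print(cash)
```

In this example the agent ends with the same inventory but one unit less cash, which is the total instantaneous cost 2 · a · ν² · dt of the round trip.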
2.3 Information Restriction
In this market model, agents have restricted information over the course of the trading period. More specifically, agents have access only to the information generated by the paths of the asset price process , their own inventory process , and the average order flow of each subpopulation, . We express this information restriction in our model by restricting the sigma-algebra to which an agent’s strategy may be adapted. For each , we only allow agent to choose strategies contained within the set of admissible strategies,
(9) 
where we define , and
(10) 
which is the sigma-algebra generated by the paths of the asset price process, the total order-flow process, and the starting inventory level for agent . In definition (9), we deliberately restrict ourselves to processes in , to guarantee that for all .
2.4 The Agent’s Optimization Problem
Each agent chooses their trading strategy to maximize an objective functional that measures their performance over the course of the trading period . For each let . Each agent within a subpopulation , chooses a control to maximize a functional defined as follows
(11) 
where and are parameters that vary across, but are constant within, subpopulations. In definition (11), we use the notation to indicate the dependence of the objective functional on the actions of all other agents in the population.
The objective functional corresponds to the agent trying to maximize a weighted average of three separate quantities. The first term corresponds to the total amount of cash the agent has accumulated up until time . The second term corresponds to the cost of liquidating all of the agent’s leftover inventory at time , minus a liquidation penalty controlled by the parameter . The last term is a running risk-aversion penalty that is controlled by the parameter , which incentivizes the agent to keep their market exposure low during the trading period. It may also be interpreted as stemming from model uncertainty as shown in [5].
Each agent within subpopulation has an objective functional that is computed by taking expectations under the measure . Hence, agents incorporate their own beliefs on the asset price dynamics. Furthermore, each functional depends on the actions of all other players () through the dynamics of the asset price , which implicitly appear in the definition (11).
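The three terms described above can be evaluated on a single simulated path as follows. This is a sketch under the assumption that both penalties are quadratic in inventory, i.e. that the pathwise quantity is X_T + Q_T (S_T - Ψ Q_T) - φ ∫ Q_t² dt; the quadratic form, the symbols Ψ and φ, and the function name are assumptions made for illustration.

```python
def realized_performance(x_T, q_T, s_T, inventory, psi, phi, dt):
    """Pathwise value X_T + Q_T * (S_T - psi * Q_T) - phi * sum(Q_t^2) * dt.

    x_T, q_T, s_T : terminal cash, inventory and price on one path
    inventory     : inventory at the start of each time step
    psi, phi      : assumed terminal-liquidation and running-penalty parameters
    """
    running_penalty = phi * sum(q * q for q in inventory) * dt
    liquidation_value = q_T * (s_T - psi * q_T)
    return x_T + liquidation_value - running_penalty


# Hypothetical path values.
value = realized_performance(
    x_T=50.0, q_T=2.0, s_T=100.0, inventory=[4.0, 3.0, 2.0], psi=0.5, phi=0.1, dt=1.0
)
print(value)
```

An agent's objective would then be approximated by averaging `realized_performance` over many paths simulated under that agent's own belief measure, which is where differing beliefs enter.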
By expanding the dynamics of each of the state processes present in (11), and by using integration by parts, we may rewrite the agent’s objective functional as
(12) 
where is a term that is constant with respect to and . Each agent’s behaviour is characterized entirely by the objective functional they are trying to maximize. From (12), it is clear that the objective functional is parametric so that the agent’s preferences can be entirely described by the tuple and their starting inventory .
The market model defined above forms a stochastic game in which all participating agents are competing to maximize each of their own objectives. We wish to find and study this market at its Nash equilibrium – i.e., where agents are simultaneously at their optimum. This equilibrium can be described more formally as the collection of admissible strategies which satisfies the condition
(13) 
Obtaining this collection of strategies for the stochastic game with a finite number of players proves to be a difficult task. One of the main obstacles in finding a solution to this problem is that each agent’s strategy is adapted to a different filtration . Furthermore, each of the objective functionals defined in equation (12) is expressed under one of the different measures from the collection of measures , each representing the beliefs of a particular individual. These two features make the finite-population stochastic game difficult to solve directly. It is, however, possible to solve the stochastic game in the infinite population limit, and use the result as an approximation for the finite population game.
3 Solving the Mean-Field Stochastic Game
As the stochastic game presented in Section 2.4 presents obstacles when aiming to solve it directly, we now take a different avenue. In this section, we study the stochastic game as the population limit tends towards infinity. The resulting limit is that of a stochastic Mean Field Game (MFG) that we can solve. Although we do not explicitly solve the finite player game presented in Section 2, by establishing an ε-Nash equilibrium property in Section 5, we show that the equilibrium solution obtained for the MFG provides an approximation to the finite population game, provided that the population size is large enough.
This section begins by taking the population limit as , to obtain new objective functionals for the agents resulting in a stochastic MFG. Next, using convex analysis methods, we characterize the Nash equilibrium as the solution to a coupled system of FBSDEs. We then conclude by presenting a solution to this FBSDE problem, and thus an exact representation of each agent’s optimal control at the Nash equilibrium.
3.1 The Limiting Mean-Field Game
Agent’s objective functional (11) only depends on the population size through the dynamics of the midprice process , which is given by the dynamics in equation (5).
Assumption
To proceed, we assume that the limiting trading rate exists; in particular, there exist processes for such that and
(14) 
where is the Lebesgue measure on the Borel sigma-algebra , and where is the canonical product measure of and . As each individual is predictable, must be predictable. Moreover, by our assumption that for each , the limit (14) also holds almost everywhere. From now on, we refer to each of the processes as the mean-field trading rate for subpopulation .
Using the assumption that for all along with (14), we find that in the infinite population limit, from the perspective of agent from subpopulation , the dynamics of the asset price process is
(15) 
In this limit, a single individual’s impact on the price becomes negligible, thus the resulting mean-field trading rate is unaffected by a single agent’s trading rate . Therefore, in the limit, each agent’s objective no longer depends on the whole collection of trading rates , but instead only depends on the collection of mean-field processes . By using the objective functional representation in (12), expanding from (15), and noticing that the martingale components vanish under expectation, we may write the agent’s objective functional in the infinite population limit as
(16) 
where for each we define as and where we define as . In the expression for in (16) we suppress the argument as, in this infinite population limit, their effect is felt through the mean fields for each subpopulation. We use the superscript in the notation for to indicate the dependence on the set of mean fields.
Our new objective is to obtain the Nash equilibrium in this newly defined mean-field game. The Nash equilibrium for the MFG consists of finding the infinite collection of controls that satisfies the optimality condition
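The claim that a single agent's influence on the average order flow vanishes in the limit can be checked with a small numerical sketch. It assumes the subpopulation average is the plain arithmetic mean of the individual trading rates; the function names and numerical values are hypothetical.

```python
def subpopulation_mean(rates):
    """Arithmetic mean of the trading rates within one subpopulation."""
    return sum(rates) / len(rates)


def shift_from_one_agent(n_agents, base_rate=1.0, deviation=5.0):
    """Change in the subpopulation mean when a single agent deviates."""
    rates = [base_rate] * n_agents
    before = subpopulation_mean(rates)
    rates[0] += deviation  # one agent unilaterally changes their trading rate
    after = subpopulation_mean(rates)
    return after - before


# The shift equals deviation / n_agents and therefore shrinks as the population grows.
shifts = {n: shift_from_one_agent(n) for n in (10, 100, 1000, 10000)}
for n, shift in shifts.items():
    print(n, shift)
```

This is why, in the limiting game, each agent can treat the mean-field trading rates as given when optimizing.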
(17) 
as well as the consistency condition
(18) 
for all .
In the limit, the explicit dependence of an agent’s actions in another agent’s objective functional is replaced with an implicit dependence through the consistency condition.
3.2 The Agent’s Optimality Condition
To solve the optimization problem described in Section 3.1, we must determine which strategy maximizes the right-hand side of equation (17) for all agents. This can be achieved in our particular case by using tools from infinite-dimensional convex analysis or variational calculus. First, we demonstrate that each function is a strictly concave functional of . Next, as is a functional with an infinite-dimensional argument, we show that each functional is Gâteaux differentiable within the space and compute the Gâteaux derivative explicitly. General results in convex optimization then state that if the derivative vanishes at a point within the space , it must be the point at which attains its supremum. The lemmas that follow give us the required properties for .
Lemma 1
The functional defined in equation (16) is strictly concave in up to null sets.
Proof
See A.1.
Lemma 2
For an agent in subpopulation , the functional defined in equation (16) is everywhere Gâteaux differentiable in . The Gâteaux derivative at a point in a direction can be expressed as
(19) 
Proof
See A.2.
Therefore, since is concave, the supremum of is attained at a point if and only if the expression (19) vanishes for all . Moreover, the strict concavity of guarantees that such a point is unique up to null sets. Indeed, as the following theorem shows, the collection of points that ensures (19) vanishes for all , and for all , coincides with the solution of an infinite-dimensional system of FBSDEs.
Theorem 1
We have that
(20) 
for all if and only if for each agent in subpopulation , and is the unique strong solution to the FBSDE
(21) 
where is an adapted martingale and where
(22) 
for all .
Proof
See A.3.
Theorem 1 reduces the convex optimization problem (16), (17), and (18) into an infinite system of FBSDEs. The forward component comes from the latent drift processes and inventory processes , while the backwards component comes from the trading rates . The coupling in this system appears through the mean-field processes , which averages out all of the actions of other agents within the game. A few difficulties are immediately apparent in the FBSDE (21). Firstly, each individual FBSDE, corresponding to a particular agent’s trading rate, is written in terms of a martingale that is specific to the agent’s subpopulation, and the measure under which the process is a martingale corresponds to the agent’s belief about the drift process . Secondly, the conditional expected value appears in the driver of the FBSDE. This is a projection of the mean fields onto the agent’s filtration, and appears because the agent cannot directly observe the strategies of other individuals. This projection of the mean fields adds another layer of difficulty.
Recall that a solution to the FBSDE (21) for agent consists of a pair of processes that satisfies the SDE and terminal condition in (21) almost everywhere. For the requirements of Theorem 1 to be met, a solution must simultaneously meet the consistency condition (22) almost everywhere. If we can find a set of solutions, we can guarantee it is unique up to null sets due to the strict concavity of the objective functional and the ‘if and only if’ nature of the statement.
3.3 Solving the Optimality FBSDE
In this section, we solve the FBSDE (21), and hence provide an exact form for the Nash equilibrium for the infinite population mean-field game. The key to obtaining a solution lies in first postulating a structure for the solution of (21). This form then suggests a vector-valued FBSDE that the mean-field processes must satisfy, which are independent of any individual agent’s strategy. The resulting non-standard FBSDE system is defined across the set of measures and introduces an obstacle in solving it directly. The key step in obtaining a solution lies in representing the FBSDE in terms of a single measure, and solving it there.
Due to the linear form of the FBSDE (21), it is natural to assume that the solution is affine. As such, for an agent within a subpopulation , we seek optimal controls of the form
(23) 
where is an unknown deterministic, continuously differentiable, function of time, and where we define the mean-field inventory process for subpopulation as
Plugging this ansatz into (21) and simplifying, we find that
(24) 
along with the boundary condition that
(25) 
which must both hold almost everywhere. Therefore, to solve the FBSDE (21), it is sufficient for us to make the terms in the curly brackets of equation (24) and in the boundary condition (25) vanish independently of one another. Collecting these equations, we obtain a first-order Riccati-type ODE for ,
(26) 
as well as a linear FBSDE for the mean-field process
(27) 
where is an adapted martingale.
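To illustrate how a scalar Riccati-type equation of this kind can be handled, the sketch below solves g'(t) = 2φ - g(t)²/(2a) with terminal condition g(T) = -2Ψ, which is the form that arises in the single-agent version of a linear-quadratic trading problem. This specific equation, its coefficients, and the symbols g, a, φ, Ψ are assumptions made for illustration and are not claimed to coincide with (26). The numerical solution from a fourth-order Runge-Kutta scheme is compared against the known closed form for this assumed equation.

```python
import math


def riccati_rk4(a, phi, psi, T, n_steps=2000):
    """Integrate g'(t) = 2*phi - g(t)**2 / (2*a) backward from g(T) = -2*psi.

    Works in time-to-maturity tau = T - t, where G(tau) = g(T - tau) solves
    dG/dtau = G**2 / (2*a) - 2*phi with G(0) = -2*psi.  Returns G(T) = g(0).
    """
    def rhs(G):
        return G * G / (2.0 * a) - 2.0 * phi

    h = T / n_steps
    G = -2.0 * psi
    for _ in range(n_steps):
        k1 = rhs(G)
        k2 = rhs(G + 0.5 * h * k1)
        k3 = rhs(G + 0.5 * h * k2)
        k4 = rhs(G + h * k3)
        G += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return G


def riccati_closed_form(a, phi, psi, tau):
    """Closed-form G(tau) for the same assumed equation (needs psi != sqrt(a*phi))."""
    c = math.sqrt(a * phi)
    gamma = math.sqrt(phi / a)
    zeta = (psi + c) / (psi - c)
    up = zeta * math.exp(gamma * tau)
    down = math.exp(-gamma * tau)
    return -2.0 * c * (up + down) / (up - down)


# Hypothetical parameter values.
g0_numeric = riccati_rk4(a=0.1, phi=0.01, psi=1.0, T=1.0)
g0_exact = riccati_closed_form(a=0.1, phi=0.01, psi=1.0, tau=1.0)
print(g0_numeric, g0_exact)
```

Working in time-to-maturity turns the terminal-value problem into an initial-value problem, so any standard forward ODE integrator can be used.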
Let us point out here that the ansatz for found in equation (23) satisfies the consistency condition as long as there exist solutions to the equations (26) and (27). This can be most easily seen by taking the average of (23) over and taking the limit as .
The FBSDE (27) implies that the solution should be an adapted process. However, Equation (27) holds for any agent for which , i.e., any agent within the same subpopulation, therefore for each , should be adapted for any . Consequently, letting , each is in fact an adapted process which solves the FBSDE
(28) 
where is an adapted martingale, and the expectation appearing in the drift is conditional on not .
By stacking the FBSDEs (28) over all values of , we may obtain a vector-valued FBSDE for the process . To this end, define the column vector of filtered drift processes where . Next, as is measurable, stacking the FBSDEs (28) over all values of , we have
(29) 
where and are all real-valued matrices defined as
where , and is a column vector of the adapted processes, where as a reminder, , and the th element is a martingale.
From the linear structure of the FBSDE (29), we can further simplify the problem by seeking affine solutions of the form
(30) 
where is a deterministic and continuously differentiable function of time, and is an valued stochastic process. Plugging the ansatz into (29), and following through with the same logical steps as before, we find that the ansatz holds true so long as is the solution to the Riccati-type matrix ODE
(31) 
and when solves the BSDE,
(32) 
where is the same vector of processes present in FBSDE (29).
At this point, we have succeeded in reducing the search for a Nash equilibrium to solving (i) two deterministic ordinary differential equations (ODEs) (26) and (31), and (ii) a non-standard linear BSDE (32). The ODEs are straightforward to solve; however, the BSDE poses some further challenges.
One of the primary obstacles in solving the BSDE (32) is that each component of incorporates a process that is a martingale under a different probability measure. Recall that the components of are required to be martingales with respect to the different measures . Each measure is what agents within subpopulation use to compute expectations, and agents within that subpopulation assume the asset has drift in excess of the order flow from all agents. The key step in solving the BSDE is to recast it in terms of martingales under a single probability measure. This introduces non-trivial drift adjustments; however, we find that it is indeed possible to solve the modified BSDE explicitly.
Consider the th dimension of the BSDE (32)
(33) 
where is a martingale, and where is defined as the th row of the deterministic matrix-valued function . The solution of BSDE (33) can be expressed implicitly as follows
(34) 
Next, we aim to represent (34) in terms of expectation under another measure such that for all . By the assumption that for all , there always exists such a measure. For example, for some . Given this measure, define the adapted Radon-Nikodym derivative processes
(35) 
Using this process, we find that we may write equation (34) as an expected value under the measure as,
(36) 
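The mechanism behind this step is the standard change-of-measure identity E^P[X] = E^Q[Z X] with Z = dP/dQ. The sketch below illustrates it in the simplest static, unconditional setting by Monte Carlo, assuming a reference measure under which X is standard normal and a target measure under which X has mean μ; the Gaussian setting, the symbols, and the function name are assumptions made for illustration and do not reproduce the conditional, process-level version used in the text.

```python
import math
import random


def expectation_by_reweighting(mu, n_samples=200_000, seed=1):
    """Estimate E^P[X] for X ~ N(mu, 1) using samples drawn under Q: X ~ N(0, 1).

    The weight z = exp(mu * x - mu**2 / 2) is the Radon-Nikodym derivative
    dP/dQ evaluated at the sample, so that E^P[X] = E^Q[z * X].
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)               # sample under the reference measure
        z = math.exp(mu * x - 0.5 * mu * mu)  # density of N(mu, 1) relative to N(0, 1)
        total += z * x
    return total / n_samples


# Hypothetical belief: the quantity has mean 0.5 under the agent's measure.
estimate = expectation_by_reweighting(mu=0.5)
print(estimate)
```

The estimate should be close to μ up to Monte Carlo error, even though no sample was drawn under the target measure. The same reweighting idea allows expectations under every subpopulation's belief to be computed from a single set of simulated paths.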
Defining the diagonal valued process , where , allows us to write a linear BSDE for using a single measure . More specifically, from (36), we have that
(37) 
where is an valued martingale. The BSDE (37) is linear and its solution can be expressed in closed form. The following theorem provides a representation for the solution of as well as , and .
Theorem 2 (Solutions to the Mean-Field BSDEs)

Let be any probability measure such that . Then the BSDE (32) admits a closed form solution,
(38)
where is the solution to the forward matrix-valued SDE