## Abstract

As an organic combination of mean field theory in statistical physics and (non-zero sum) stochastic differential games, Mean Field Games (MFGs) have become a very popular research topic in fields ranging from the physical and social sciences to engineering applications; see, for example, the earlier studies by Huang, Caines and Malhamé (2003) and by Lasry and Lions (2006a, b and 2007). In this paper, we provide a comprehensive study of a general class of mean field games in the linear quadratic framework. We adopt the adjoint equation approach to investigate the existence and uniqueness of equilibrium strategies of these Linear-Quadratic Mean Field Games (LQMFGs). Due to the linearity of the adjoint equations, the optimal mean field term satisfies a forward-backward ordinary differential equation. For the one-dimensional case, we show that the equilibrium strategy always exists and is unique. For dimensions greater than one, by choosing a suitable norm and then applying the Banach Fixed Point Theorem, we provide a sufficient condition for the unique existence of the equilibrium strategy; this condition is independent of the coefficients of the controls and is always satisfied whenever the coefficients of the mean-field term vanish (and therefore includes the classical Linear Quadratic Stochastic Control (LQSC) problems as special cases). As a by-product, we also establish a neat and instructive sufficient condition, which is apparently absent from the literature (see Freiling (2002)) and depends only on the coefficients, for the unique existence of the solution for a class of non-trivial nonsymmetric Riccati equations. Numerical examples of non-existence of the equilibrium strategy are also provided. The uniform agent case of Huang, Caines and Malhamé (2007a) serves as an interesting comparison with our LQMFGs: we give an example (see Appendix) whose existence is covered by our theory although it need not satisfy the sufficient condition provided in their work; in general, however, the two approaches cover different feasible ranges. Finally, a similar approach is adopted to study the Linear-Quadratic Mean Field Type Stochastic Control Problems (see Andersson and Djehiche (2010)) and to compare them with their MFG counterparts.

Keywords: Mean Field Games; Mean Field Type Stochastic Control Problems; Adjoint Equations; Linear-Quadratic

## 1 Introduction

Modeling the collective behavior of individuals, taking into account their mutual interactions in various physical or sociological dynamical systems, has long been one of the major problems in science. For instance, physicists used to apply the traditional variational methods from Lagrangian or Hamiltonian mechanics to study interacting particle systems; the drawback is an extremely high computational cost, which makes this microscopic approach almost mathematically intractable. To resolve this matter, a completely different macroscopic approach from statistical physics was gradually developed, which eventually led to the primitive notion of mean field theory. The novelty of this approach is that particles interact through a medium, namely the mean field term, which aggregates the action of and reaction on each particle. Moreover, by letting the number of particles tend to infinity in these macroscopic models, the mean field term becomes a functional of the density function representing the whole population of particles, which leads to much lower computational complexity. In the biological literature, similar tools have been applied to connect human interactive motion with herding models for insects and animals. For example, the behavior by which ants secrete chemical substrates to lead their mates to valuable food resources, resulting in a lane, can be described by a mean-field model (see Kirman [26] for more details).

On the economics side, due to dramatic population growth and rapid urbanization, an in-depth understanding of the collective strategic interactive behavior of a huge group of investors is crucial to maintaining sustainable economic growth. Since the vector of goods prices is determined by both demand and supply, it is natural to use the aggregate effect of the investors’ states as a canonical candidate for the mean-field term, and then to employ the corresponding mean-field models in place of the classical equilibrium models in economics; moreover, as investors are usually smart in decision making (i.e. not zero-intelligent), it is also necessary to incorporate the theory of stochastic differential games (SDGs) into these mean-field models. Over the past few decades, SDGs have been a major research topic in control theory and financial economics, especially in studying continuous-time decision making problems between non-cooperative investors; in the one-dimensional setting, the theory of two-person zero-sum games is quite well developed via the notion of viscosity solutions, see for example Elliott (1976), and Fleming and Souganidis (1989). Unfortunately, most interesting SDGs are $N$-player non-zero sum SDGs. In this direction we mention the works of Bensoussan and Frehse [6, 7] and Bensoussan et al. [8], but there are still relatively few results in the literature.

As a macroscopic equilibrium model, Huang et al. [19, 25] investigated stochastic differential game problems involving infinitely many players under the name “Large Population Stochastic Dynamic Games”. Independently, Lasry and Lions [30, 31, 32] introduced and studied similar problems from the viewpoint of mean-field theory and termed them “Mean-Field Games (MFGs)”. As an organic combination of mean field theory and the theory of stochastic differential games, MFGs provide a more realistic interpretation of individual dynamics at the microscopic level, in that each player is not zero-intelligent and is able to strategically optimize his prescribed objectives, while retaining mathematical tractability in a macroscopic framework. To be more precise, the general theory of MFGs has been built by combining consistent assumptions on the following modeling aspects: (1) a continuum of players; (2) homogeneity in the strategic performance of players; and (3) social interactions through the impact of the mean field term. The first aspect describes the approximation of a game model with a huge number of players by a continuum model with sufficient mathematical tractability. The second aspect assumes that all players obey the same set of rules of the interactive game, which provide guidance on their own behavior that potentially leads them to ultimate success. Finally, due to the intrinsic complexity of the society in which the players participate, the third aspect reflects the fact that each player is so negligible that he can only affect others marginally through his own infinitesimal contribution to the society. In an MFG, each player bases his decision making purely on his own criteria and certain summary statistics (that is, the mean field term) about the community; in other words, to explain their interactions, the pair of personal and mean-field characteristics of the whole population is already sufficient and exhaustive.

Mathematically, each MFG possesses the following forward-backward structure: (1) a forward dynamic describes the individual strategic behavior; (2) a backward equation describes the evolution of the individual optimal strategy, for instance in terms of the individual value function via the usual backward recursive techniques. For the details of the derivation of this system of equations with the forward-backward feature, one can consult the works of Huang et al. [25] and Lasry and Lions [30, 31, 32].

Before introducing our proposed model, we first list some relevant recent theoretical results on MFGs. (I) For problems over an infinite-time horizon: Bardi [4], and Li and Zhang [33] studied ergodic MFGs with different quadratic cost functionals and linear dynamics; Guéant [16] studied a specific MFG with a quadratic Hamiltonian and showed that the density function of the population is Gaussian; Huang et al. [19, 22, 23, 24] considered MFGs with quadratic cost functionals; Nourian et al. [35] extended the study of MFGs with ergodic cost functionals to the Cucker-Smale flocking model; and Yin et al. [39] gave a bifurcation analysis of an ergodic MFG with nonlinear dynamics. (II) For problems over a finite-time horizon: Guéant [17] applied a change-of-variable technique leading to separation of variables to consider MFGs with a quadratic Hamiltonian; Lachapelle [27] and Lachapelle and Wolfram [29] extended MFGs to the setting involving 2-population dynamics; Lachapelle et al. [28] investigated MFGs with reflection parts and quadratic cost functionals; Tembine et al. [37] considered risk-sensitive MFGs; and Yang et al. [38] adopted the MFG approach to construct a non-linear filter. (III) Various numerical approximation schemes can be found in Achdou et al. [1], and Achdou and Capuzzo-Dolcetta [2]. Because of the discretization in the numerics, Gomes et al. [14] directly studied discrete-time mean field games with finite state space. For more recent developments and applications, please also refer to the lecture notes of Cardaliaguet [11], the surveys of Guéant [15] and Guéant et al. [18], and the references therein.

In this paper, we study a subclass of Mean Field Games in which the cost functional is quadratic in the state variables, the control variables and the mean field terms, while the controlled dynamics are linear and also contain mean field terms. These Linear-Quadratic mean field games (LQMFGs) were previously considered in Huang et al. [21] using the common Riccati equation approach; in contrast, this paper adopts the stochastic maximum principle. Essentially, the equilibrium problem can be converted into finding a fixed point of a transformation defined by the solution of a control problem. For the control problem, the two approaches are certainly equivalent. However, this is not the case for the fixed point problem, for which a condition that is easier to verify can be given. Indeed, in our approach, thanks to the linearity of the adjoint equations, the optimal mean-field term can be expressed as the solution of a forward-backward ordinary differential equation. This method spares us from solving for the optimal trajectory as in the Riccati equation approach and allows generalization to higher dimensions, which is a crucial setting in understanding the 2-population MFGs proposed in Lachapelle [27] and Lachapelle and Wolfram [29]. More precisely, by choosing a suitable norm and applying the Banach Fixed Point Theorem, we provide a more relaxed sufficient condition for the existence and uniqueness of the equilibrium strategy, which is also independent of the coefficient of the control and always holds when the mean-field term is zero. Under the one-dimensional setting and a certain convexity assumption, we also prove that the equilibrium strategy always exists and is unique. As a by-product, we also establish a neat and instructive sufficient condition, which is apparently absent from the literature (see Freiling [13]) and depends only on the coefficients, for the unique existence of the solution for a class of non-trivial nonsymmetric Riccati equations. Numerical examples showing the non-existence of any equilibrium strategy are also provided. Furthermore, in the Appendix, we compare our conditions explicitly with those of Huang et al. [21]. In summary, the present work gives a novel and entirely different approach with several advantages, in particular for the generalization of the classical Linear-Quadratic Stochastic Control Problem to the MFG setting.

In general, the computational complexity of calibrating a Nash equilibrium of an $N$-player SDG (if it exists) is very high, especially for large values of $N$, so it would be more convenient to find a computable approximation of this Nash Equilibrium strategy. Since MFGs are obtained by letting $N \to \infty$, the equilibrium strategy serves as a natural candidate, as it can be shown to be an “approximation”, or $\epsilon$-Nash Equilibrium strategy, of the corresponding equilibrium for the $N$-player SDG. The computability of this equilibrium strategy is justifiable as it depends only on the state of the player and the mean-field term, which dramatically reduces the problem dimension of the Nash Equilibrium strategy of the $N$-player SDG. For a more detailed elaboration on the notion of $\epsilon$-Nash Equilibrium, one can refer to, for example, Cardaliaguet [11] and Huang et al. [19, 20, 25].

If one considers a centrally controlled interacting particle system, instead of every particle having the free will to choose its own control as formulated in MFGs, a stochastic control problem of mean field type results (see Andersson and Djehiche [3]). This mean-field type optimization problem shares a similar mathematical form with the MFG problem, but the mean-field term is now uniformly controlled by a centralizing system instead of being affected by the collective optimal trajectory. For more details about the existence and convergence rate of the related mean-field backward stochastic differential equations, one can refer to Buckdahn et al. [9] and Buckdahn et al. [10]. By using the adjoint equation approach again, we characterize the optimal control, which exists and is unique by virtue of the convex coercive property of the underlying cost functional. Finally, we also find that, in general, this optimal control is different from the equilibrium strategy obtained in its corresponding MFG counterpart.

In Section 2, we formulate a Linear-Quadratic $N$-player nonzero-sum stochastic differential game and demonstrate how to obtain the corresponding mean field game formally. In Section 3, we employ the adjoint equation approach to provide a thorough study of the existence and uniqueness of equilibrium strategies of LQMFGs. By choosing a suitable norm and applying the Banach Fixed Point Theorem, an illuminating sufficient condition for the existence and uniqueness of the equilibrium strategy is provided; note that this new condition is independent of the coefficients of the controls, and is always satisfied whenever the coefficients of the mean-field terms vanish. The relationship with nonsymmetric Riccati equations and illustrative numerical examples are also provided. We remark that these nonsymmetric Riccati equations, which appear in the resolution of the fixed point problem, could not be found in the literature, including Huang et al. [21]; most importantly, they are substantially different from the symmetric Riccati equations commonly arising in Control Theory. In Section 4, we show that the equilibrium strategy is an $\epsilon$-Nash Equilibrium of the $N$-player SDG. In Sections 5 and 6, we adopt a similar adjoint equation approach to solve the Linear-Quadratic Mean-Field Type Stochastic Control Problem and compare its optimal control with the equilibrium strategy of the corresponding MFG counterpart. In the Appendix, an example is given whose unique existence is covered by our theory but which fails to satisfy the sufficient condition provided in Huang et al. [21]. In some other cases, the condition of Huang et al. [21] may cover possibilities different from ours.

## 2 Problem Formulation

The present formulation of the Linear-Quadratic Mean Field Games will follow closely the classical Linear-Quadratic Stochastic Control Problems, see for example Bensoussan [5]. Following Lasry and Lions [30, 31, 32], in order to formulate the Linear-Quadratic Mean Field Game, we first state the corresponding -player game for .

Let be a complete probability space and be the time horizon. Suppose that are independent -dimensional standard Wiener processes defined on and are independent, identically distributed (i.i.d.) -dimensional random vectors. We also assume that is independent to for each , . The dynamics of the player is modeled by

$$dx^i_t = \left(A_t x^i_t + B_t v^i_t + \bar{A}_t \cdot \frac{1}{N-1}\sum_{j=1,\, j\neq i}^{N} x^j_t\right)dt + \sigma_t\, dW^i_t, \qquad x^i(0) = x^i_0,$$

where $A_t$, $B_t$, $\bar{A}_t$ are bounded deterministic matrix-valued functions in time of suitable sizes, $\sigma_t$ is a -function in time of suitable size, and the control $v^i$ in , which is the -space of stochastic processes adapted to the filtration , with values in . The proposed (additive) model extends the classical linear stochastic dynamical one: as expected, the coefficient $A_t$ measures the effect of the state variable and $B_t$ measures the impact of the control, while the new ingredient $\bar{A}_t \cdot \frac{1}{N-1}\sum_{j \neq i} x^j_t$ summarizes the symmetric influence of the rest of the players. Even though this additional term is natural from the modeling perspective when one attempts to investigate an interactive real-time multi-player game, it causes substantial mathematical difficulty in the discussion of the existence and calibration of the corresponding Nash Equilibrium, especially for large values of $N$. That the coefficients are the same for all players reflects that each player obeys the same set of game rules and behaves according to the same collection of rationales. The actual realized performances of individual players could nevertheless differ considerably from each other; in particular, their individual dynamics are driven by independent Wiener processes.
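The dynamics above can be illustrated numerically. The following is a minimal sketch, assuming scalar states, constant coefficients and an Euler-Maruyama discretization (none of which is prescribed by the text); the function name `simulate_players` and all parameter names are hypothetical. With zero control, averaging the dynamics over the players shows that the population mean has drift $(A + \bar{A})$ times the mean, which gives a simple sanity check.

```python
import math
import random


def simulate_players(n_players, a, b, a_bar, sigma, horizon, n_steps,
                     x0_mean, x0_std, control=lambda t, x: 0.0, seed=0):
    """Euler-Maruyama sketch of the N-player dynamics (scalar, constant coefficients).

    dx^i = (a x^i + b v^i + a_bar * mean of the other players) dt + sigma dW^i.
    `control` is a hypothetical feedback rule v^i = control(t, x^i).
    Returns the list of terminal states x^i_T.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    # i.i.d. initial states (Gaussian here purely for illustration)
    x = [rng.gauss(x0_mean, x0_std) for _ in range(n_players)]
    for k in range(n_steps):
        t = k * dt
        total = sum(x)
        new_x = []
        for xi in x:
            # average over the N - 1 other players, as in the coupling term
            others_mean = (total - xi) / (n_players - 1)
            drift = a * xi + b * control(t, xi) + a_bar * others_mean
            noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            new_x.append(xi + drift * dt + noise)
        x = new_x
    return x


if __name__ == "__main__":
    terminal = simulate_players(1000, a=-0.5, b=1.0, a_bar=0.3, sigma=0.3,
                                horizon=1.0, n_steps=200, x0_mean=1.0, x0_std=0.2)
    empirical_mean = sum(terminal) / len(terminal)
    # With zero control the mean solves m' = (a + a_bar) m, so m_T = m_0 exp((a + a_bar) T).
    print(empirical_mean, math.exp(-0.2))
```

For a large number of players the empirical mean stays close to the deterministic curve, which is the behavior the mean field approximation relies on.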

The cost functional for each player is assumed to be:

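$$\begin{aligned} J^i(v^1,\ldots,v^N) \triangleq{} & E\left[\frac{1}{2}\int_0^T (x^i_t)^* Q_t x^i_t + (v^i_t)^* R_t v^i_t \, dt + \frac{1}{2}(x^i_T)^* Q_T x^i_T\right] \\ & + E\left[\frac{1}{2}\int_0^T \left(x^i_t - S_t \cdot \frac{1}{N-1}\sum_{j=1,\, j\neq i}^{N} x^j_t\right)^{*} \bar{Q}_t \left(x^i_t - S_t \cdot \frac{1}{N-1}\sum_{j=1,\, j\neq i}^{N} x^j_t\right) dt\right] \\ & + E\left[\frac{1}{2}\left(x^i_T - S_T \cdot \frac{1}{N-1}\sum_{j=1,\, j\neq i}^{N} x^j_T\right)^{*} \bar{Q}_T \left(x^i_T - S_T \cdot \frac{1}{N-1}\sum_{j=1,\, j\neq i}^{N} x^j_T\right)\right], \end{aligned}$$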

where denotes the transpose of a matrix , (respectively, , and ) are bounded, deterministic (respectively, non-negative and positive definite) matrix-valued functions in time of suitable sizes. We also suppose that , for some .

The reasons for using the same form of objective functional for all players are the same as in the previous discussion on the modeling of individual dynamics. The first expectation agrees with the corresponding cost functional in the classical Linear-Quadratic Stochastic Control Problem; namely, it describes the sum of the running expenses and the terminal cost of each player himself. The other two expectations are specific to our present model setting: they describe the extra costs incurred if a player's performance deviates from the average behavior of the community. These two terms reflect the coalescence phenomena commonly observed in the literature of socio-economics and finance, in the sense that every agent has to pay an additional transaction cost of collecting extra profitable information if he aims to outperform his peers. Along every direction of the deviation, the additional cost has to be non-negative; this model constraint is ensured by the assumption of the non-negative definiteness of $\bar{Q}_t$ and $\bar{Q}_T$.

The principal objective of each player is to minimize his own cost functional by properly controlling his own dynamics. In this classical non-zero sum stochastic differential game framework, we aim to establish a Nash equilibrium (see for example, Bensoussan and Frehse [6]):

###### Problem 2.1.

Find a Nash Equilibrium which satisfies the following comparison inequalities:

$$J^i(u^1,\ldots,u^{i-1},v^i,u^{i+1},\ldots,u^N) \geq J^i(u^1,\ldots,u^N),$$

for and any admissible control in .

In accordance with the permutation symmetry of the index $i$, it suffices to consider the case $i = 1$. In general, the computational complexity of calibrating a Nash equilibrium (if it exists) is high, especially for large values of $N$. Due to the large number of participants in most game theoretical models in practice, a conveniently computable approximation of the Nash Equilibrium strategy is usually in demand. By formally letting $N \to \infty$, Lasry and Lions [30, 31, 32] introduced the notion of Mean Field Games. The Mean Field Game associated with Problem 2.1 can be obtained as follows:

###### Problem 2.2.

Find an equilibrium strategy in , with , and , which minimizes the cost functional

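$$\begin{aligned} J(v) \triangleq{} & E\left[\frac{1}{2}\int_0^T x_t^* Q_t x_t + v_t^* R_t v_t + (x_t - S_t E[y_t])^* \bar{Q}_t (x_t - S_t E[y_t]) \, dt\right] \\ & + E\left[\frac{1}{2} x_T^* Q_T x_T + \frac{1}{2}(x_T - S_T E[y_T])^* \bar{Q}_T (x_T - S_T E[y_T])\right], \end{aligned}$$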

where the dynamics is given by

$$dx_t = \left(A_t x_t + B_t v_t + \bar{A}_t E[y_t]\right)dt + \sigma_t\, dW_t, \qquad x(0) = x_0,$$

$v_t$ is an admissible control in and $y_t$ is the trajectory corresponding to the equilibrium strategy (if it exists).

###### Remark 2.3.

An interesting example of our proposed one-dimensional LQMFGs was considered earlier in Huang et al. [21]. In this paper, we provide a complete picture of the resolution of the problem by using the adjoint equation approach, which results in a different sufficient condition for the unique existence of the underlying equilibrium strategy.

In comparison with the -player game, in order to avoid confusion, the notations for the dynamics and stated in Problem 2.2 will be changed to and respectively. For if Problem 2.2 were solvable, then for each , , we could obtain a strategy in , where . Since is a deterministic process, it is clear that are i.i.d.. As the Mean Field Game is obtained from the -player game, it is expected that is an -Nash Equilibrium when , see for example Cardaliaguet [11] and Huang et al. [19, 20, 25] for more detail. We shall first present an informal description here, rigorous arguments will be provided later in Section 4. Again, it suffices to consider Player 1.

For any admissible control , let (respectively, ) denote the dynamics in Problem 2.1 controlled by (respectively, ). By the definition of -Nash Equilibrium, will “approximately” minimize . As , for , we have

$$\frac{1}{N-1}\sum_{j=1,\, j\neq i}^{N} x^j_t - \frac{1}{N-1}\sum_{j=2,\, j\neq i}^{N} x^j_t \to 0.$$

By the McKean-Vlasov argument, . As an application of the Strong Law of Large Numbers (SLLN), as are i.i.d., which is a consequence of the i.i.d. nature of . By applying SLLN again to the cost functional, we deduce that

$$J^1(v^1,u^2,\ldots,u^N) \to J^1(v^1).$$

Similarly, we also have

$$J^1(u^1,u^2,\ldots,u^N) \to J^1(u^1),$$

and it shows heuristically that is an -Nash Equilibrium.

## 3 Solution of the Mean Field Game

To motivate the solution of Problem 2.2, we first recall some classical results from Linear-Quadratic Stochastic Control Theory, viewed from a new perspective that aids the development of our methodology for tackling Problem 2.2.

###### Problem 3.1.

Given a continuous deterministic process with values in . Find an optimal control in which minimizes

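$$\begin{aligned} J(v) \triangleq{} & E\left[\frac{1}{2}\int_0^T x_t^* Q_t x_t + v_t^* R_t v_t + (x_t - S_t z_t)^* \bar{Q}_t (x_t - S_t z_t) \, dt\right] \\ & + E\left[\frac{1}{2} x_T^* Q_T x_T + \frac{1}{2}(x_T - S_T z_T)^* \bar{Q}_T (x_T - S_T z_T)\right], \end{aligned}$$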

where the dynamics is given by

$$dx_t = \left(A_t x_t + B_t v_t + \bar{A}_t z_t\right)dt + \sigma_t\, dW_t, \qquad x(0) = x_0,$$

and is an admissible control in .

###### Theorem 3.2.

Problem 3.1 is uniquely solvable and the optimal control is , where satisfy the stochastic maximum principle relation

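$$\left\{\begin{aligned} dy_t &= \left(A_t y_t - B_t R_t^{-1} B_t^* p_t + \bar{A}_t z_t\right)dt + \sigma_t\, dW_t, & y_0 &= x_0, \\ -\frac{d\omega_t}{dt} &= A_t^* \omega_t + (Q_t + \bar{Q}_t) y_t - (\bar{Q}_t S_t) z_t, & \omega_T &= (Q_T + \bar{Q}_T) y_T - (\bar{Q}_T S_T) z_T, \end{aligned}\right. \tag{1}$$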

such that .

###### Proof.

It is clear that Problem 3.1 is a strictly convex coercive optimization problem in the sense that as . In order to derive the stochastic maximum principle relation, we first consider the Euler Equation:

$$\left.\frac{d}{d\theta} J\bigl(u(\cdot) + \theta v(\cdot)\bigr)\right|_{\theta = 0} = 0.$$

To explicitly express the dependence of the state on (recall that , are fixed), we adopt the notation . Note that, by linearity,

$$x\bigl(t; u(\cdot) + \theta v(\cdot)\bigr) = y(t) + \theta\, \tilde{x}\bigl(t; v(\cdot)\bigr),$$

where

$$\frac{d\tilde{x}}{dt} = A_t \tilde{x}_t + B_t v_t, \qquad \tilde{x}(0) = 0.$$

On the other hand, we can write as

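$$\begin{aligned} J(v(\cdot)) ={} & E\left[\int_0^T \frac{1}{2}\left(x_t^* (Q_t + \bar{Q}_t) x_t + v_t^* R_t v_t\right) - x_t^* (\bar{Q}_t S_t) z_t + \frac{1}{2} z_t^* (S_t^* \bar{Q}_t S_t) z_t \, dt\right] \\ & + E\left[\frac{1}{2} x_T^* (Q_T + \bar{Q}_T) x_T - x_T^* (\bar{Q}_T S_T) z_T + \frac{1}{2} z_T^* (S_T^* \bar{Q}_T S_T) z_T\right], \end{aligned}$$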

and therefore the Euler Equation becomes

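$$E\left[\int_0^T \tilde{x}_t^* (Q_t + \bar{Q}_t) y_t - \tilde{x}_t^* (\bar{Q}_t S_t) z_t + v_t^* R_t u_t \, dt\right] + E\left[\tilde{x}_T^* (Q_T + \bar{Q}_T) y_T - \tilde{x}_T^* (\bar{Q}_T S_T) z_T\right] = 0. \tag{2}$$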

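Using the adjoint process $\omega_t$ defined by

$$\left\{\begin{aligned} -\frac{d\omega_t}{dt} &= A_t^* \omega_t + (Q_t + \bar{Q}_t) y_t - (\bar{Q}_t S_t) z_t, \\ \omega_T &= (Q_T + \bar{Q}_T) y_T - (\bar{Q}_T S_T) z_T, \end{aligned}\right.$$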

we obtain

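$$\begin{aligned} \frac{d}{dt}\left(\tilde{x}_t^* \omega_t\right) &= \left(\tilde{x}_t^* A_t^* + v_t^* B_t^*\right)\omega_t - \tilde{x}_t^*\left(A_t^* \omega_t + (Q_t + \bar{Q}_t) y_t - (\bar{Q}_t S_t) z_t\right) \\ &= v_t^* B_t^* \omega_t - \tilde{x}_t^*\left((Q_t + \bar{Q}_t) y_t - (\bar{Q}_t S_t) z_t\right), \end{aligned}$$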

and hence

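$$E\left[\int_0^T v_t^* B_t^* \omega_t \, dt\right] = E\left[\tilde{x}_T^*\left((Q_T + \bar{Q}_T) y_T - (\bar{Q}_T S_T) z_T\right)\right] + E\left[\int_0^T \tilde{x}_t^*\left((Q_t + \bar{Q}_t) y_t - (\bar{Q}_t S_t) z_t\right) dt\right],$$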

and from (2) we obtain

$$E\left[\int_0^T v_t^*\left(B_t^* \omega_t + R_t u_t\right) dt\right] = 0. \tag{3}$$

Set then (3) becomes . Since is arbitrary in , we deduce that . ∎

###### Remark 3.3.

The optimal control can be written as , where and satisfies

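$$\left\{\begin{aligned} &\frac{d\Xi_t}{dt} + \Xi_t A_t + A_t^* \Xi_t - \Xi_t (B_t R_t^{-1} B_t^*) \Xi_t + Q_t + \bar{Q}_t = 0, & \Xi_T &= Q_T + \bar{Q}_T, \\ &\frac{d\zeta_t}{dt} = -A_t^* \zeta_t + \Xi_t (B_t R_t^{-1} B_t^*) \zeta_t + (\bar{Q}_t S_t - \Xi_t \bar{A}_t) z_t, & \zeta_T &= -\bar{Q}_T S_T z_T. \end{aligned}\right.$$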
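The two differential equations above can be recovered by direct substitution. The following is only a sketch, under two assumptions that are not spelled out in the text as extracted: the adjoint variable is sought in the affine form $p_t = \Xi_t y_t + \zeta_t$ with $\Xi_t$ and $\zeta_t$ deterministic, and $p_t$ is identified with $\omega_t$ at the level of drift terms in relation (1), the martingale part being ignored.

```latex
% Assumed ansatz (not stated in the source): p_t = \Xi_t y_t + \zeta_t.
% Drift of p_t obtained from the forward equation in (1):
\frac{d\Xi_t}{dt}\, y_t
  + \Xi_t \bigl( A_t y_t - B_t R_t^{-1} B_t^{*} (\Xi_t y_t + \zeta_t) + \bar{A}_t z_t \bigr)
  + \frac{d\zeta_t}{dt}
% must agree with the backward drift in (1):
  = -A_t^{*} (\Xi_t y_t + \zeta_t) - (Q_t + \bar{Q}_t)\, y_t + (\bar{Q}_t S_t)\, z_t .

% Collecting the terms that multiply y_t gives the symmetric Riccati equation:
\frac{d\Xi_t}{dt} + \Xi_t A_t + A_t^{*} \Xi_t
  - \Xi_t B_t R_t^{-1} B_t^{*} \Xi_t + Q_t + \bar{Q}_t = 0 .

% The remaining terms give the linear equation for \zeta_t:
\frac{d\zeta_t}{dt} = -A_t^{*} \zeta_t + \Xi_t B_t R_t^{-1} B_t^{*} \zeta_t
  + (\bar{Q}_t S_t - \Xi_t \bar{A}_t)\, z_t .
```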

According to Theorem 3.2, for fixed , we shall obtain the pair and the optimal control is . Hence is an equilibrium strategy of Problem 2.2 if and only if there is a continuous function such that . We denote the expected values , , the stochastic maximum principle relation (1) implies that

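$$\left\{\begin{aligned} &\frac{d}{dt}\begin{pmatrix} \bar{y}_t \\ -\bar{p}_t \end{pmatrix} = \begin{pmatrix} A_t & -B_t R_t^{-1} B_t^* \\ Q_t + \bar{Q}_t & A_t^* \end{pmatrix} \begin{pmatrix} \bar{y}_t \\ \bar{p}_t \end{pmatrix} + \begin{pmatrix} \bar{A}_t z_t \\ -(\bar{Q}_t S_t) z_t \end{pmatrix}, \\ &\bar{y}_0 = E[x_0], \qquad \bar{p}_T = (Q_T + \bar{Q}_T)\bar{y}_T - (\bar{Q}_T S_T) z_T. \end{aligned}\right.$$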

Hence, Problem 2.2 is solvable if and only if solves the following system of ordinary differential equations:

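$$\left\{\begin{aligned} \frac{d\xi_t}{dt} &= (A_t + \bar{A}_t)\xi_t - B_t R_t^{-1} B_t^* \eta_t, & \xi_0 &= E[x_0], \\ -\frac{d\eta_t}{dt} &= A_t^* \eta_t + \bigl(Q_t + \bar{Q}_t (I - S_t)\bigr)\xi_t, & \eta_T &= \bigl(Q_T + \bar{Q}_T (I - S_T)\bigr)\xi_T, \end{aligned}\right. \tag{4}$$

where $I$ is the identity matrix.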

We next address the uniqueness issue of the Nash Equilibrium. According to the previous discussion, if Equation (4) has at most one solution, at most one possible could be found. Together with the uniqueness result in Theorem 3.2, there is at most one equilibrium strategy . Conversely, suppose that there is at most one equilibrium strategy in Problem 2.2. For each solution of Equation (4), we can associate the corresponding and with . By construction, we have and and hence is an equilibrium strategy. Due to the uniqueness of the equilibrium strategy, there is at most one possible choice for . Hence, the solution is uniquely determined by

$$\frac{d\xi_t}{dt} = (A_t + \bar{A}_t)\xi_t - B_t R_t^{-1} B_t^* E[p_t], \qquad \xi_0 = E[x_0].$$

Similarly, can also be uniquely determined and the uniqueness result follows. The following theorem summarizes the previous discussion.

###### Theorem 3.4.

There is a (unique) equilibrium strategy of Problem 2.2 if and only if there is a (unique) solution pair $(\xi_t, \eta_t)$ of the system of ordinary differential equations (4).

Moreover, this equilibrium condition depends on and only through .
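The equivalence in Theorem 3.4 suggests a simple numerical experiment. The sketch below is not the authors' method: it assumes scalar, constant coefficients, and it assumes that system (4) takes the form obtained by setting $z = \xi$ in the mean equations, namely $\xi' = (A + \bar{A})\xi - B^2 R^{-1}\eta$ and $-\eta' = A\eta + q\,\xi$ with $q = Q + \bar{Q}(1 - S)$ and terminal condition $\eta_T = q_T\,\xi_T$. Because the system is linear, the terminal residual is affine in the unknown $\eta_0$, so two forward integrations suffice (linear shooting). All function and parameter names are hypothetical.

```python
def rk4_path(xi0, eta0, a, a_bar, b2_over_r, q_eff, horizon, n_steps):
    """Integrate xi' = (a + a_bar) xi - b2_over_r * eta, eta' = -a eta - q_eff xi forward by RK4."""
    def rhs(xi, eta):
        return (a + a_bar) * xi - b2_over_r * eta, -a * eta - q_eff * xi

    h = horizon / n_steps
    xi, eta = xi0, eta0
    path = [(xi, eta)]
    for _ in range(n_steps):
        k1 = rhs(xi, eta)
        k2 = rhs(xi + 0.5 * h * k1[0], eta + 0.5 * h * k1[1])
        k3 = rhs(xi + 0.5 * h * k2[0], eta + 0.5 * h * k2[1])
        k4 = rhs(xi + h * k3[0], eta + h * k3[1])
        xi += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        eta += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        path.append((xi, eta))
    return path


def solve_by_shooting(xi0, a, a_bar, b2_over_r, q_eff, q_eff_T, horizon, n_steps=2000):
    """Find eta_0 such that eta_T = q_eff_T * xi_T, then return the whole path."""
    def residual(eta0):
        xi_T, eta_T = rk4_path(xi0, eta0, a, a_bar, b2_over_r, q_eff, horizon, n_steps)[-1]
        return eta_T - q_eff_T * xi_T

    r0, r1 = residual(0.0), residual(1.0)
    slope = r1 - r0  # the residual is affine in eta0 for a linear system
    if abs(slope) < 1e-12:
        # degenerate case: either no solution or infinitely many solutions
        raise ValueError("terminal condition does not determine eta_0 uniquely")
    eta0 = -r0 / slope
    return rk4_path(xi0, eta0, a, a_bar, b2_over_r, q_eff, horizon, n_steps)


if __name__ == "__main__":
    # Illustrative numbers only: Q = 1, Q_bar = 0.5, S = 0.4, so q = 1 + 0.5 * (1 - 0.4) = 1.3.
    path = solve_by_shooting(xi0=1.0, a=-0.5, a_bar=0.3, b2_over_r=1.0,
                             q_eff=1.3, q_eff_T=1.3, horizon=1.0)
    xi_T, eta_T = path[-1]
    print(xi_T, eta_T, eta_T - 1.3 * xi_T)
```

A vanishing `slope` corresponds to the situation in which the two-point boundary value problem fails to have a unique solution, which is the kind of degeneracy the existence conditions below are designed to exclude.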

Since could be quite arbitrary, , may not often be of opposite sign and hence the sinusoidal functions serve as the standard example that Equation (4) does not admit a solution if is sufficiently large. Even under the assumption of the convexity of , Equation (4) is still not in a canonical form commonly found in the literature of the classical optimal control theory and the existence of solution is not guaranteed in our present context. To overcome this hurdle, we first define

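$$L \triangleq T\left(\|Q_T + S_T\|^2 + \|Q + S\|_T\right)\|B R^{-1} B^*\|_T \cdot \exp\Bigl(\bigl(2\|A + \bar{A}\|_T + 2\|A^*\|_T + \|B R^{-1} B^*\|_T + \|Q + S\|_T\bigr)T\Bigr),$$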

where denotes the supremum norm of the deterministic matrix-valued function on . By applying Gronwall’s inequality and Banach Fixed Point Theorem, we first have the following standard existence result when is sufficiently small, see for example Ma and Yong [34]:

###### Proposition 3.5.

If , then there exists a unique solution of (4).

However, in general, the supremum norm of could be large and the condition that is too restrictive. By the specific form of Equation (4), a more relaxed condition can be provided as follows:

###### Theorem 3.6.

Assume that the matrix-valued function is invertible. Let be the fundamental solution associated with and

$$|||\phi|||_T \triangleq \sup_{0 \le t \le T} \sqrt{\left\|\phi^*(T,t)\, Q_T^{1/2}\right\|^2 + \int_t^T \left\|\phi^*(s,t)\, Q_s^{1/2}\right\|^2 ds}.$$

Also, we define that and . Suppose that , and

$$\sqrt{T}\, |||\phi|||_T\, |||\bar{A}|||_T \left(1 + |||S|||_T\right) + |||S|||_T < 1.$$

Then there exists a unique solution of (4).

###### Proof.

Let be the Hilbert space of functions endowed with the inner product

$$\langle z, z' \rangle_Q \triangleq z_T^* Q_T z'_T + \int_0^T z_t^* Q_t z'_t \, dt.$$

Given a function in and there is a pair satisfying

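$$\left\{\begin{aligned} \frac{d\xi_t}{dt} &= A_t \xi_t - B_t R_t^{-1} B_t^* \eta_t + \bar{A}_t z_t, & \xi_0 &= E[x_0], \\ -\frac{d\eta_t}{dt} &= A_t^* \eta_t + Q_t \xi_t + \bar{Q}_t (I - S_t) z_t, & \eta_T &= Q_T \xi_T + \bar{Q}_T (I - S_T) z_T, \end{aligned}\right. \tag{5}$$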

since it corresponds to a well-defined control problem (by referring to the deterministic analog of Theorem 3.2). We remark that Equation (5) is different from the deterministic counterpart of Equation (1). The mapping defined in this way is affine and maps into itself. Our objective is to show that it admits a fixed point. To apply Banach Fixed Point Theorem, it suffices to show that the mapping is a contraction if . By considering the dynamics of , we have the following equality:

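$$\begin{aligned} & \xi_T^* Q_T \xi_T + \int_0^T \xi_t^* Q_t \xi_t \, dt + \int_0^T \eta_t^* B_t R_t^{-1} B_t^* \eta_t \, dt \\ & \qquad = \int_0^T \eta_t^* \bar{A}_t z_t \, dt - \xi_T^* \bar{Q}_T (I - S_T) z_T - \int_0^T \xi_t^* \bar{Q}_t (I - S_t) z_t \, dt. \end{aligned} \tag{6}$$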

Moreover, let be the fundamental solution associated with , and we get

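$$\eta_t = \phi^*(T,t)\bigl(Q_T \xi_T + \bar{Q}_T (I - S_T) z_T\bigr) + \int_t^T \phi^*(\tau,t)\bigl(Q_\tau \xi_\tau + \bar{Q}_\tau (I - S_\tau) z_\tau\bigr) d\tau.$$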

By the Cauchy-Schwarz inequality, we have

$$\|\eta_t\| \le |||\phi|||_T \left(\|\xi\|_Q + \left\|Q^{-1/2} \bar{Q} (I - S) z\right\|\right) \le |||\phi|||_T \left(\|\xi\|_Q + |||S|||_T \|z\|_Q\right).$$

Therefore, from (6),

$$\|\xi\|_Q^2 \le \sqrt{T}\, |||\phi|||_T \left(\|\xi\|_Q + |||S|||_T \|z\|_Q\right) |||\bar{A}|||_T \|z\|_Q + \|\xi\|_Q\, |||S|||_T \|z\|_Q,$$

which shows that is a contraction if . ∎
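The passage from the last inequality to the contraction property can be spelled out as follows. This is a sketch using shorthand that does not appear in the original: $a$ stands for $\sqrt{T}\,|||\phi|||_T\,|||\bar{A}|||_T$, $s$ for $|||S|||_T$, and $r$ for the ratio $\|\xi\|_Q / \|z\|_Q$ with $z \neq 0$. Since the mapping is affine, the same estimate applies to the difference of two images, for which the initial value of $\xi$ is zero.

```latex
% Divide the final inequality by \|z\|_Q^2 (shorthand a, s, r as in the lead-in):
r^2 \le a\,(r + s) + r\,s
\quad\Longleftrightarrow\quad
f(r) := r^2 - (a + s)\,r - a\,s \le 0 .

% f is convex, with f(0) = -a s \le 0 and
f(1) = 1 - \bigl( a\,(1 + s) + s \bigr) > 0
\quad\text{whenever}\quad a\,(1 + s) + s < 1 .

% Hence the larger root r_+ of f lies in [0, 1), and f(r) \le 0 forces
r \le r_+ = \tfrac{1}{2}\Bigl( (a + s) + \sqrt{(a + s)^2 + 4\,a\,s} \Bigr) < 1 .
```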

###### Corollary 3.7.

If , then and the previous condition reduces to

$$\sqrt{T}\, |||\phi|||_T\, |||\bar{A}|||_T < 1.$$

###### Remark 3.8.

For a single person optimization problem (i.e. the classical Linear-Quadratic Stochastic Control Problem), that is , we recover the standard existence and uniqueness result in the literature.

###### Remark 3.9.

Assume that . Then the non-singularity of is not necessary and the norm can be weakened to in applying Theorem 3.6.

###### Remark 3.10.

Suppose that can be written as , where (that is, is positive definite) is chosen to satisfy suitable conditions stated in Theorem 3.6. Replacing , by and respectively in the iterative scheme (5) in the proof, a different sufficient condition for the unique existence of the equilibrium strategy is obtained:

$$\sqrt{T}\, |||\phi|||_{Q,T}\, |||\bar{A}|||_{Q,T} \left(1 + |||S|||_{Q,T}\right) + |||S|||_{Q,T} < 1,$$

where

 \interleaveϕ\interleaveQ,T \interleave¯A\interleaveQ,T ≜sup0

For example, if all the coefficients are constants and , then the condition provides the desired unique existence by setting and .

###### Remark 3.11.

In the Appendix, an example is constructed whose unique existence is covered by our theory but which fails to satisfy the sufficient condition stated in Huang et al. [21].

### 3.1 Relationship with Nonsymmetric Riccati Equation

We can look for a solution of Equation (4) in the form $\eta_t = \Gamma_t \xi_t$. Hence we get the following Nonsymmetric Riccati Equation:

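$$\frac{d\Gamma_t}{dt} + \Gamma_t (A_t + \bar{A}_t) + A_t^* \Gamma_t - \Gamma_t B_t R_t^{-1} B_t^* \Gamma_t + Q_t + S_t = 0, \qquad \Gamma_T = Q_T + S_T. \tag{7}$$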

If it is solvable, using Remark 3.3, we have . Therefore, the optimal control is and the optimal trajectory satisfies

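$$\left\{\begin{aligned} dy_t &= \Bigl[\bigl(A_t - B_t R_t^{-1} B_t^* \Xi_t\bigr) y_t + \bigl(\bar{A}_t - B_t R_t^{-1} B_t^* (\Gamma_t - \Xi_t)\bigr)\bar{y}_t\Bigr] dt + \sigma_t\, dW_t, \\ y_0 &= x_0. \end{aligned}\right.$$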

However, because of the non-zero term and , Equation (7) is not the standard Riccati Equation. Hence, it is not always solvable and no natural sufficient condition for the existence of the solution is known (see Freiling [13]). Moreover, is not necessarily symmetric. Nevertheless, when and , the Nonsymmetric Riccati Equation becomes

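$$\left\{\begin{aligned} &\frac{d\Gamma_t}{dt} + \Gamma_t \left(A_t + \tfrac{1}{2}\bar{A}_t\right) + \left(A_t + \tfrac{1}{2}\bar{A}_t\right)^{*} \Gamma_t - \Gamma_t B_t R_t^{-1} B_t^* \Gamma_t + Q_t + S_t = 0, \\ &\Gamma_T = Q_T + S_T, \end{aligned}\right.$$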

which is of the standard form and the existence result holds. The explicit form of the solution can be established in this special case as follows. For the sake of simplicity, assume that all the coefficients are time-independent and our Riccati Equation can be simplified as:

$$\frac{d\Gamma_t}{dt} + (2A + \bar{A})\Gamma_t - B^2 R^{-1} \Gamma_t^2 + Q + S = 0, \qquad \Gamma_T = Q_T + S_T.$$

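1. For $B = 0$, we have

$$\Gamma_t = \left(Q_T + S_T + \frac{Q + S}{2A + \bar{A}}\right)\exp\bigl((2A + \bar{A})(T - t)\bigr) - \frac{Q + S}{2A + \bar{A}},$$

when $2A + \bar{A} \neq 0$, and

$$\Gamma_t = (Q + S)(T - t) + Q_T + S_T,$$

when $2A + \bar{A} = 0$.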

2. For , let and be the two distinct roots of the quadratic equation

 Q+S+(2A+¯A)γ−B2R−1γ2=0,

and the solution can be explicitly written as

 Γt−α=(QT+ST−α)(α+β)(QT+ST+β)