# Mean Field Production Output Control with Sticky Prices: Nash and Social Solutions

###### Abstract

This paper presents an application of mean field control to dynamic production optimization. Both noncooperative and cooperative solutions are considered. We first introduce a market of a large number of agents (firms) with sticky prices and adjustment costs. By solving auxiliary limiting optimal control problems subject to consistent mean field approximations, two sets of decentralized strategies are obtained and further shown to asymptotically attain Nash equilibria and social optima, respectively. The performance estimate of the social optimum strategies exploits a passivity property of the underlying model. A numerical example is given to compare market prices, firms' outputs and costs under the two solution frameworks.

Bingchang Wang, Minyi Huang

School of Control Science and Engineering, Shandong University, Jinan 250061, P. R. China

School of Mathematics and Statistics, Carleton University, Ottawa, ON K1S 5B6, Canada

Key words:  Mean field game; social optimum; Nash equilibrium; production output adjustment; sticky price.

This paper was not presented at any IFAC meeting. Corresponding author: Bingchang Wang.

## 1 Introduction

Mean field game theory provides an effective methodology for designing decentralized strategies in a system of many players which are individually negligible but collectively affect a particular player (see e.g., [18], [19], [22], [31]). By identifying a consistency relationship between the individual's best response and the mass (population macroscopic) behavior, one may obtain a fixed-point equation to specify the mean field. This procedure leads to a set of decentralized strategies as an $\varepsilon$-Nash equilibrium for the actual model with a large but finite population. By now, mean field games have been intensively studied in the LQG (linear-quadratic-Gaussian) framework [18], [19], [11], [39], [12]; there is also a large body of work on nonlinear models [22], [32], [3], [14]. For further literature, readers are referred to [17], [43], [44] for mean field models with a major player, [46] for oblivious equilibria proposed for large-scale Markov decision processes of industry dynamics, and [42] for mean field games with Markov jump parameters. For a survey on mean field game theory, see [3], [14], and [4]. Besides noncooperative games, social optima in mean field control have been investigated in [20], [45]. Mean field games and control have found wide applications, including smart grids [34], [7], [26], finance, economics [46], [15], [6], [23], operations research [28], [1], [33], [36], and social sciences [29], [21], [2].

This paper aims to present an application of mean field control to production output adjustment in a large market with many firms and sticky prices. Under the stickiness assumption, the price of the underlying product does not adjust instantaneously according to its demand function, but evolves slowly and smoothly. Dynamic game models for duopolistic competition with sticky prices were initially proposed by Simaan and Takayama [38], and then extended to investigate asymptotically stable steady-state equilibrium prices in [13]. In [5], [47], the authors considered open-loop and closed-loop Nash equilibria for dynamic oligopoly with $N$ firms and compared the behavior of prices in and outside the steady-state levels, respectively. Adjustment costs in production models have been addressed in the economic literature (see e.g. [35]) and they have been taken into account in the study of dynamic oligopoly [10], [37], [24]. The work [10] introduces a duopoly where each firm has an output level subject to control according to first-order integrator dynamics. However, when the number of firms is large (e.g. in a perfectly competitive market) and the adjustment cost is considered, the computational complexity of output adjustment is high. In the mean field control framework, one can effectively address the complexity issue.

Within our model, a large number of producers supply a certain product with sticky prices, and the output adjustment incurs a cost. The cost function of a firm is based on product cost, price, and adjustment cost. In [41], we combined the price and firm’s output as a 2-dimensional system. Thus, the cost function has indefinite state weights, which differs from many existing LQG models of mean field games in the literature [19], [42]. In this paper, the price in the mean field limit model is taken as an exogenous signal without the need of state space augmentation. This contributes to deriving a simple condition that ensures the solvability of the resulting equation system.

The Nash equilibrium and the social optimum are two fundamental solution notions for competitive markets with many firms, where the former applies to the noncooperative model, and the latter is for the cooperative model. In this paper, we design Nash and social optimum strategies for the production output control model based on the mean field control methodology, and further compare the two solutions numerically. The Nash solution of our model starts by solving a limiting optimal control problem and next applies the consistency requirement for the mean field approximation. We then obtain a set of decentralized strategies and show that the set of strategies is an $\varepsilon$-Nash equilibrium. For the social optimum solution, we first provide an auxiliary optimal control problem by a person-by-person optimality approach, and then design a set of decentralized strategies by solving the limiting auxiliary problem subject to consistent mean field approximations. The set of strategies is shown to be asymptotically socially optimal by exploiting a passivity property of the underlying model.

An illustrative numerical example is given to compare market prices, firms' outputs and optimal costs under the game and social optimum frameworks. It is numerically shown that the social optimum has a lower average output level than that in the noncooperative case. This is similar to the behavior in a duopoly model [48] where cooperation of the two players results in a lower total output than in the Cournot equilibrium.

The paper is organized as follows. Section 2 introduces the game and social optimum problems with $N$ players. In Section 3, we first design a set of decentralized strategies by the mean field control methodology and then show its asymptotic Nash equilibrium property. In Section 4, we construct a set of decentralized strategies, which is shown to be asymptotically socially optimal. In Section 5, a comparison of the two solutions is demonstrated by a numerical example. Section 6 concludes the paper.

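Notation: $\|\cdot\|$ denotes the Euclidean vector norm or matrix spectral norm. For a matrix $M$, $|M|$ denotes the determinant of $M$. $C([0,\infty),\mathbb{R}^n)$ denotes the class of $n$-dimensional continuous functions on $[0,\infty)$; $C_b([0,\infty),\mathbb{R}^n)$ is the class of bounded and continuous functions. For a family of $\mathbb{R}^n$-valued random variables $\{x(\lambda),\lambda\geq 0\}$, $\sigma(x(\lambda),\lambda\leq t)$ is the $\sigma$-algebra generated by the collection of random variables. For two sequences $\{a_n\}$ and $\{b_n\}$, $a_n=O(b_n)$ denotes $\limsup_{n\to\infty}|a_n/b_n|\leq C$, and $a_n=o(b_n)$ denotes $\limsup_{n\to\infty}|a_n/b_n|=0$. For convenience of presentation, we use $C, C_0, C_1,\ldots$ to denote generic positive constants, which may vary from place to place.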

## 2 Problem Description

### 2.1 Dynamic oligopoly with sticky prices

Dynamic game models for oligopolistic competition with sticky prices were initially proposed by Simaan and Takayama [38], and then further investigated in [13], [5], [47]. According to the model in [5], [47], the sticky price evolves by

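$$\frac{dp}{dt}=\alpha\Big(\beta-\delta\sum_{j=1}^{N}q_j-p\Big),\quad p(0)\ \text{given},$$

where $q_j$ is the output of firm $j$, $j=1,\ldots,N$, and has the role of control. The payoff function of firm $i$ is described by

$$K_i(q_1,\cdots,q_N)=\int_0^\infty e^{-\rho t}\Big(pq_i-cq_i-\frac{1}{2}q_i^2\Big)dt.$$

The constants $\alpha$, $\beta$, $\delta$ and $\rho$ are positive, and $c$ is the cost of unit output.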

### 2.2 Output adjustment in a mean field framework

The paper considers a large market of many firms. Based on the formulation of sticky prices in [13], [47], we assume that the price evolves by

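$$\begin{aligned}
\frac{dp(t)}{dt}&=\alpha[\beta-p(t)-q^{(N)}(t)] &&(1)\\
&=-\alpha p(t)-\alpha q^{(N)}(t)+\alpha\beta, &&(2)
\end{aligned}$$

where $\alpha$ denotes the speed of adjustment to the level on the demand function, and $q^{(N)}(t)=\frac{1}{N}\sum_{j=1}^{N}q_j(t)$ is the average of firms' outputs. The output of each firm is described by the stochastic differential equation (SDE)

$$dq_i(t)=-\mu q_i(t)dt+b_iu_i(t)dt+\sigma dw_i(t), \tag{3}$$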

where $\{w_i, 1\leq i\leq N\}$ are independent standard Brownian motions, which are also independent of the initial outputs of all firms $\{q_i(0), 1\leq i\leq N\}$. The constants $\alpha$, $\mu$ and $\sigma$ are positive.

Adjustment costs in production models have been addressed in the economic literature (see e.g. [35]) and they have been taken into account in the study of dynamic oligopoly [10], [9], [37], [24]. The work [10] introduces a duopoly where each firm has an output level subject to control according to first-order integrator dynamics. In the resulting differential game, the instantaneous payoff of each firm is determined from its net profit minus quadratic penalty terms of the output $q_i$ and the control $u_i$.

###### Remark 1

As in [13], $\beta-q^{(N)}$ is the price on the demand function for the given level of firms' outputs. In the static case, the inverse demand function has a linear version $p=\beta-\delta q^{(N)}$; here for simplicity we set $\delta$ as 1. The scaling factor $1/N$ for $\sum_{j=1}^{N}q_j$ is standard in modelling and analysis of large markets, and some closely related price modelling in a large dynamic market can be found in [30], [40]. The term $-\mu q_i$ is used to indicate friction in adjusting the output, and $\sigma dw_i$ represents random shocks in output.
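To make the dynamics (1)-(3) concrete, the following is a minimal simulation sketch using the Euler-Maruyama scheme. The function name `simulate_market`, the control interface `u(i, t, p, q_i)` and all parameter values are assumptions made for illustration only; they are not part of the model specification.

```python
import math
import random


def simulate_market(b, u, p0, q0, alpha, beta, mu, sigma, dt, steps, seed=0):
    """Euler-Maruyama sketch of the price ODE (1) and the output SDEs (3).

    b  : list of firm gains b_i
    u  : callable u(i, t, p, q_i) giving firm i's control (hypothetical interface)
    q0 : list of initial outputs q_i(0)
    Returns the price path and the final outputs.
    """
    rng = random.Random(seed)
    n = len(b)
    p, q = p0, list(q0)
    prices = [p]
    for k in range(steps):
        t = k * dt
        q_avg = sum(q) / n  # average output q^(N)(t)
        p_next = p + alpha * (beta - p - q_avg) * dt
        for i in range(n):
            noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            q[i] += (-mu * q[i] + b[i] * u(i, t, p, q[i])) * dt + noise
        p = p_next
        prices.append(p)
    return prices, q
```

With zero control and no noise, the outputs decay at rate $\mu$ and the price relaxes towards $\beta$, which gives a quick sanity check of the sign conventions.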

The cost function of each firm is given by

$$J_i(u)=E\int_0^\infty e^{-\rho t}L(p,q_i,u_i)dt, \tag{4}$$

where

$$L=-p(t)q_i(t)+cq_i(t)+ru_i^2(t),\qquad u=(u_1,\cdots,u_i,\cdots,u_N),$$

and $r>0$. Here, $cq_i$ denotes the production cost, and $ru_i^2$ denotes the adjustment cost. The minimization of $J_i$ is equivalent to maximizing the payoff

$$K_i(u)=E\int_0^\infty e^{-\rho t}\big[q_i(t)(p(t)-c)-ru_i^2(t)\big]dt.$$

We only consider the case $\beta>c$ so that the subsequent optimization problems are of practical interest. Otherwise, given a positive $q^{(N)}$, the production cost already exceeds the price, and the optimization problem is not meaningful.

The social cost is defined as

$$J^{(N)}_{\rm soc}(u)=\frac{1}{N}\sum_{j=1}^{N}J_j(u). \tag{5}$$

Based on costs (4) and (5), one may formulate a standard LQG game and an optimal control problem, respectively. A limitation of this approach is that the control strategy will be centralized. Our goal is to look for decentralized strategies for the corresponding optimization criterion.

The basic objective of this paper is to seek Nash solutions and social solutions to mean field production output control with sticky prices. Specifically, we study the following two problems:

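Problem I: Find $\varepsilon$-Nash equilibrium strategies for $N$ agents to minimize the individual cost $J_i$ over the set of decentralized strategies

$$\mathcal{U}_{d,i}=\Big\{u_i: u_i(t)\ \text{is adapted to}\ \mathcal{F}_t^i,\ E\int_0^\infty e^{-\rho t}u_i^2(t)dt<\infty\Big\},$$

where $\mathcal{F}_t^i=\sigma(q_i(0),w_i(s),s\leq t)$, $i=1,\cdots,N$.

Problem II: Find asymptotic social optimum strategies for $N$ agents to minimize $J^{(N)}_{\rm soc}$ over the set of decentralized strategies $\mathcal{U}_{d,i}$, $i=1,\cdots,N$.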

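For a large market, a natural way of modeling the sequence of parameters $\{b_i, i\geq 1\}$ is to view them as being sampled from a space such that this sequence exhibits certain statistical properties when $N\to\infty$. Define the associated empirical distribution function $F_N(\theta)=\frac{1}{N}\sum_{i=1}^{N}1_{[b_i\leq\theta]}$, where $1_{[b_i\leq\theta]}=1$ if $b_i\leq\theta$ and $1_{[b_i\leq\theta]}=0$ otherwise.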
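As a small illustration, the empirical distribution function of a finite sample of firm parameters can be evaluated as follows. This is only a sketch; the helper name `empirical_cdf` is hypothetical, and the sample values used in practice would come from the market being modelled.

```python
import bisect


def empirical_cdf(b_samples):
    """Return F_N, where F_N(theta) is the fraction of samples b_i <= theta."""
    n = len(b_samples)
    sorted_b = sorted(b_samples)

    def F_N(theta):
        # bisect_right counts how many sorted samples are <= theta
        return bisect.bisect_right(sorted_b, theta) / n

    return F_N
```

For example, with samples `[1.0, 2.0, 2.0, 3.0]` the function jumps by $1/4$ at $1$, by $1/2$ at $2$, and by $1/4$ at $3$.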

We introduce the assumptions.

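A1) The initial price $p(0)=p_0$ is a constant. The initial outputs of all firms $\{q_i(0), 1\leq i\leq N\}$ are independent. $Eq_i(0)=q_0$ for all $i$; there exists $C_0$ independent of $N$ such that $\sup_{1\leq i\leq N}E|q_i(0)|^2\leq C_0$.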

A2) There exists a distribution function $F$ such that the empirical distribution $F_N$ converges weakly to $F$, where . Furthermore, each and .

A3) For all $i$, $b_i$ is contained in a fixed compact set $\Theta$, and $\int_\Theta dF(\theta)=1$.

## 3 Nash Solutions to Output Adjustment

### 3.1 Optimal control for the limiting problem

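Assume that $\bar q$ is given for approximation of $q^{(N)}$. Replacing $q^{(N)}$ in (1) by $\bar q$, we introduce

$$\frac{d\bar p(t)}{dt}=\alpha[\beta-\bar p(t)-\bar q(t)],\quad \bar p(0)=p_0. \tag{6}$$

Accordingly, by replacing $p$ in (4) with $\bar p$ we define the cost function:

$$\bar J_i(u_i)=E\int_0^\infty e^{-\rho t}\big[(c-\bar p)q_i+ru_i^2\big]dt. \tag{7}$$

The corresponding admissible control set is $\mathcal{U}_{d,i}$.

We first take $\bar p$ as an exogenous signal and solve the problem in (3), (6) and (7). For a general initial condition $q_i(t)=q_i$ at time $t$, define the value function

$$V_i(t,q_i)=\inf_{u_i\in\mathcal{U}_{d,i}}E\Big[\int_t^\infty e^{-\rho(\tau-t)}L(q_i,u_i)d\tau\,\Big|\,q_i(t)=q_i\Big].$$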

We introduce the HJB equation:

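$$\begin{aligned}
\rho V_i=\inf_{u_i\in\mathbb{R}}\Big\{&\frac{\partial V_i}{\partial t}+\frac{\partial V_i}{\partial q_i}(-\mu q_i+b_iu_i) &&(8)\\
&+\frac{\sigma^2}{2}\frac{\partial^2V_i}{\partial q_i^2}+(c-\bar p)q_i+ru_i^2\Big\}. &&(9)
\end{aligned}$$

Let $V_i(t,q_i)=k_iq_i^2+2s_i(t)q_i+g_i(t)$. Then the optimal control law is

$$\bar u_i=-\frac{b_i}{2r}\frac{\partial V_i}{\partial q_i}=-\frac{b_i}{r}(k_iq_i+s_i). \tag{10}$$

Substituting the control (10) into (8), we obtain

$$\begin{aligned}
\rho(k_iq_i^2+2s_iq_i+g_i)={}&\Big(-2\mu k_i-\frac{b_i^2}{r}k_i^2\Big)q_i^2\\
&+2\Big[\frac{ds_i}{dt}+\Big(-\mu-\frac{b_i^2}{r}k_i\Big)s_i+\frac{c-\bar p}{2}\Big]q_i\\
&+\frac{dg_i}{dt}-\frac{b_i^2}{r}s_i^2+\sigma^2k_i.
\end{aligned}$$

This yields

$$\begin{aligned}
\rho k_i&=-2\mu k_i-\frac{b_i^2}{r}k_i^2, &&(11)\\
\rho s_i&=\frac{ds_i}{dt}+\Big(-\mu-\frac{b_i^2}{r}k_i\Big)s_i+\frac{c-\bar p}{2}, &&(12)\\
\rho g_i&=\frac{dg_i}{dt}-\frac{b_i^2}{r}s_i^2+\sigma^2k_i. &&(13)
\end{aligned}$$
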
###### Lemma 1

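$k_i=0$ is the unique solution to the algebraic Riccati equation (11) such that $-\mu-\frac{b_i^2}{r}k_i<\frac{\rho}{2}$.

Proof. By solving (11), we have $k_i=0$ or $k_i=-\frac{r(\rho+2\mu)}{b_i^2}$. If $k_i=0$, $-\mu-\frac{b_i^2}{r}k_i=-\mu<\frac{\rho}{2}$. Otherwise, when $k_i=-\frac{r(\rho+2\mu)}{b_i^2}$, $-\mu-\frac{b_i^2}{r}k_i=\rho+\mu>\frac{\rho}{2}$.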
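The root selection in Lemma 1 can be checked numerically. The sketch below assumes that the stability test compares the closed-loop rate $-\mu-\frac{b_i^2}{r}k_i$ with $\rho/2$, which is the usual condition for discounted LQ problems; the function names are hypothetical.

```python
def riccati_roots(mu, rho, b_i, r):
    """Both roots of rho*k = -2*mu*k - (b_i**2 / r) * k**2, i.e. equation (11)."""
    return [0.0, -r * (rho + 2.0 * mu) / b_i**2]


def closed_loop_rate(k, mu, b_i, r):
    """Drift coefficient -mu - (b_i**2 / r) * k of the closed-loop output."""
    return -mu - (b_i**2 / r) * k


def admissible_roots(mu, rho, b_i, r):
    """Roots meeting the assumed stability test: closed-loop rate < rho / 2."""
    return [k for k in riccati_roots(mu, rho, b_i, r)
            if closed_loop_rate(k, mu, b_i, r) < rho / 2.0]
```

For any positive $\mu$, $\rho$, $r$ and nonzero $b_i$, the nonzero root gives the rate $\rho+\mu$, so only $k_i=0$ passes the test.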

###### Remark 2

The inequality in Lemma 1 specifies a stability condition for the closed-loop system which must be satisfied by the solution $k_i$ of (11).

###### Theorem 1

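For the optimal control problem in (3), (6) and (7), assume that $\bar q\in C_b([0,\infty),\mathbb{R})$ is given. Then we have
1) there exists a unique solution $s_i\in C_b([0,\infty),\mathbb{R})$ to (12);
2) the optimal control law is uniquely given by $\bar u_i=-\frac{b_i}{r}s_i$;
3) there exists a unique solution $g_i\in C_b([0,\infty),\mathbb{R})$ to (13), and the optimal cost is given by

$$V_i(0,q_i(0))=2s_i(0)q_0+g_i(0).$$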

Proof. Note that by (6), $\bar q\in C_b([0,\infty),\mathbb{R})$ implies $\bar p\in C_b([0,\infty),\mathbb{R})$. We can prove parts 1) and 3) by showing that $s_i(0)$ and $g_i(0)$ are uniquely determined from the boundedness of $s_i$ and $g_i$, respectively (see e.g., [17], [19]). To show part 2) we first obtain an a priori integral estimate of $q_i$ (see (14)) and then use the completion of squares technique (see e.g., [20], [45]). By Lemma A.1, implies , which further gives that is well defined to be finite since . By Proposition A.1, leads to which further implies

$$E\int_0^\infty e^{-\rho t}q_i^2dt<\infty. \tag{14}$$

### 3.2 Control synthesis and analysis

Following the standard approach in mean field games [18], [19], we construct the equation system as follows:

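$$\begin{aligned}
\frac{d\bar p}{dt}&=\alpha[\beta-\bar p-\bar q], &&(15)\\
\rho s&=\frac{ds}{dt}-\mu s+\frac{c-\bar p}{2}, &&(16)\\
\frac{d\bar q_\theta}{dt}&=-\mu\bar q_\theta-\frac{\theta^2}{r}s, &&(17)\\
\bar q&=\int_{\mathbb{R}}\bar q_\theta dF(\theta). &&(18)
\end{aligned}$$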

In the above, $\theta$ is a continuum parameter. $\bar q_\theta$ is regarded as the expectation of the state given the parameter $\theta$ in the individual dynamics. The last equation is due to the consistency requirement for the mean field approximation. The initial conditions $\bar p(0)=p_0$ and $\bar q_\theta(0)=q_0$ are given, and $s(0)$ is to be determined. For further analysis, we make the following assumption.

A4) There exists a solution to (15)-(18) such that for each $\theta\in\Theta$, both $\bar q_\theta$ and $s$ are within $C_b([0,\infty),\mathbb{R})$.

Some sufficient conditions for ensuring A4) may be obtained by using the fixed-point methods similar to those in [18], [19].

###### Proposition 1

If

$$\frac{1}{2r\mu(\rho+\mu)}\int_\Theta\theta^2dF(\theta)<1,$$

then A4) holds.
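The condition of Proposition 1 is easy to evaluate when the limiting distribution $F$ is approximated by finitely many atoms. The following sketch assumes such a discrete approximation; the function name `contraction_gain` and the weights are illustrative.

```python
def contraction_gain(thetas, weights, r, mu, rho):
    """Left-hand side of the condition in Proposition 1 for a discrete F.

    thetas  : support points of the distribution of the parameter theta
    weights : probabilities of the support points (assumed to sum to one)
    """
    second_moment = sum(w * th**2 for th, w in zip(thetas, weights))
    return second_moment / (2.0 * r * mu * (rho + mu))
```

A value strictly below one indicates that the sufficient condition is met; a value of one or more is inconclusive, since the condition is not necessary.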

Proof. By (15)-(17), we have

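$$\begin{aligned}
\bar p(t)&=\bar p(0)e^{-\alpha t}+\int_0^te^{-\alpha(t-\tau)}[\alpha\beta-\alpha\bar q(\tau)]d\tau,\\
s(t)&=\int_t^\infty e^{(\rho+\mu)(t-\tau)}\Big[\frac{c-\bar p(\tau)}{2}\Big]d\tau,\\
\bar q_\theta(t)&=e^{-\mu t}\bar q_\theta(0)+\int_0^te^{-\mu(t-\tau)}\Big[-\frac{\theta^2}{r}s(\tau)\Big]d\tau.
\end{aligned}$$

Thus,

$$\begin{aligned}
\bar q(t)={}&e^{-\mu t}\int_\Theta\bar q_\theta(0)dF(\theta)+\int_\Theta dF(\theta)\int_0^te^{-\mu(t-\tau_1)}\\
&\cdot\Big\{-\frac{\theta^2}{r}\int_{\tau_1}^\infty e^{(\rho+\mu)(\tau_1-\tau_2)}(A\bar q)(\tau_2)d\tau_2\Big\}d\tau_1\stackrel{\Delta}{=}(T\bar q)(t),
\end{aligned}$$

where $(A\bar q)(t)=\frac{1}{2}\big[c-\bar p(0)e^{-\alpha t}-\int_0^te^{-\alpha(t-\tau)}(\alpha\beta-\alpha\bar q(\tau))d\tau\big]$. It can be verified that $T$ is a map from the Banach space $C_b([0,\infty),\mathbb{R})$ to itself. For any $\bar q_1,\bar q_2\in C_b([0,\infty),\mathbb{R})$,

$$\begin{aligned}
|(T\bar q_1-T\bar q_2)(t)|\leq{}&\|\bar q_1-\bar q_2\|_\infty\int_\Theta\int_0^te^{-\mu(t-\tau_1)}\Big\{\frac{\theta^2}{r}\int_{\tau_1}^\infty e^{(\rho+\mu)(\tau_1-\tau_2)}\\
&\cdot\Big[\frac{1}{2}\int_0^{\tau_2}e^{-\alpha(\tau_2-\tau_3)}\alpha d\tau_3\Big]d\tau_2\Big\}d\tau_1dF(\theta)\\
\leq{}&\|\bar q_1-\bar q_2\|_\infty\int_\Theta\frac{\theta^2}{2r\mu(\rho+\mu)}dF(\theta).
\end{aligned}$$

It follows that $T$ is a contraction and hence has a unique fixed point $\bar q\in C_b([0,\infty),\mathbb{R})$.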
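The contraction argument suggests computing the mean field by successive approximation. The sketch below does this for uniform agents ($\theta=b$) on a truncated horizon with an explicit Euler discretization. The truncation, the terminal value used for $s$ (its stationary value given the final price), the grid sizes and the function name are all assumptions made for illustration; this is not the procedure used in the paper's numerical example.

```python
def fixed_point_mean_field(p0, q0, alpha, beta, c, mu, rho, r, b,
                           horizon=10.0, dt=0.01, iterations=20):
    """Iterate q_bar -> T(q_bar) on a time grid for the system (15)-(18)."""
    n = int(round(horizon / dt))
    q_bar = [0.0] * (n + 1)  # initial guess for the mean output
    p = [p0] * (n + 1)
    s = [0.0] * (n + 1)
    for _ in range(iterations):
        # Forward pass for the price, equation (15).
        p = [p0] * (n + 1)
        for k in range(n):
            p[k + 1] = p[k] + dt * alpha * (beta - p[k] - q_bar[k])
        # Backward pass for s, equation (16); the terminal value is an
        # assumption replacing the boundedness requirement on [0, infinity).
        s = [0.0] * (n + 1)
        s[n] = (c - p[n]) / (2.0 * (rho + mu))
        for k in range(n - 1, -1, -1):
            s[k] = s[k + 1] - dt * ((rho + mu) * s[k + 1] + 0.5 * (p[k + 1] - c))
        # Forward pass for the mean output, equation (17) with theta = b.
        new_q = [q0] * (n + 1)
        for k in range(n):
            new_q[k + 1] = new_q[k] + dt * (-mu * new_q[k] - (b * b / r) * s[k])
        q_bar = new_q
    return p, s, q_bar
```

When the gain $\frac{b^2}{2r\mu(\rho+\mu)}$ is below one, each sweep shrinks the sup-norm error by at least that factor, so a moderate number of iterations suffices.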

#### 3.2.1 The case of uniform agents

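We now consider the case of uniform agents, i.e., $b_i\equiv b$. In this case, (15)-(18) reduce to the following equation:

$$\frac{d}{dt}\begin{bmatrix}\bar p\\ \bar q\\ s\end{bmatrix}=\begin{bmatrix}-\alpha&-\alpha&0\\ 0&-\mu&-\frac{b^2}{r}\\ \frac{1}{2}&0&\rho+\mu\end{bmatrix}\begin{bmatrix}\bar p\\ \bar q\\ s\end{bmatrix}+\begin{bmatrix}\alpha\beta\\ 0\\ -\frac{c}{2}\end{bmatrix}.$$

Let

$$M=\begin{bmatrix}-\alpha&-\alpha&0\\ 0&-\mu&-\frac{b^2}{r}\\ \frac{1}{2}&0&\rho+\mu\end{bmatrix},\qquad \bar b=\begin{bmatrix}\alpha\beta\\ 0\\ -\frac{c}{2}\end{bmatrix}.$$

Then

$$\frac{d}{dt}\begin{bmatrix}\bar p\\ \bar q\\ s\end{bmatrix}=M\begin{bmatrix}\bar p\\ \bar q\\ s\end{bmatrix}+\bar b. \tag{19}$$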

By direct computations, we have

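$$|M|=\alpha\mu(\rho+\mu)+\frac{\alpha b^2}{2r}$$

and the equation $Mz+\bar b=0$ has the solution

$$z=\Big[\frac{2r\beta\mu(\rho+\mu)+b^2c}{2r\mu(\rho+\mu)+b^2},\ \frac{b^2(\beta-c)}{2r\mu(\rho+\mu)+b^2},\ -\frac{r\mu(\beta-c)}{2r\mu(\rho+\mu)+b^2}\Big]^T.$$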
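The steady state can be verified by substituting it back into the right-hand side of (19). In the sketch below, the third component (the steady-state value of $s$) is derived here from $Mz+\bar b=0$; the function names and the test parameters are illustrative assumptions.

```python
def steady_state(alpha, beta, mu, rho, b, r, c):
    """Equilibrium (p, q, s) of system (19) for uniform agents."""
    d = 2.0 * r * mu * (rho + mu) + b**2
    p = (2.0 * r * beta * mu * (rho + mu) + b**2 * c) / d
    q = b**2 * (beta - c) / d
    s = -r * mu * (beta - c) / d  # derived from the second row of M z + b_bar = 0
    return p, q, s


def residual(z, alpha, beta, mu, rho, b, r, c):
    """Right-hand side M z + b_bar of (19); it vanishes at the steady state."""
    p, q, s = z
    return (-alpha * p - alpha * q + alpha * beta,
            -mu * q - (b**2 / r) * s,
            0.5 * p + (rho + mu) * s - 0.5 * c)
```

Note that $\beta>c$ gives a positive steady-state output and a negative $s$, so the corresponding steady-state control $-\frac{b}{r}s$ is positive and offsets the friction term $-\mu q$.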

Note that

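$$\begin{aligned}
|\lambda I-M|&=(\lambda+\alpha)(\lambda+\mu)[\lambda-(\rho+\mu)]-\frac{\alpha b^2}{2r} &&(20)\\
&=\lambda^3+(\alpha-\rho)\lambda^2-(\mu^2+\rho\mu+\alpha\rho)\lambda &&(21)\\
&\quad-\alpha\mu(\rho+\mu)-\frac{\alpha b^2}{2r}. &&(22)
\end{aligned}$$

In what follows, we use Routh's stability criterion [8] to determine the number of roots of $|\lambda I-M|=0$ with negative real parts. The first column of the Routh array for (20) is

$$1,\quad \alpha-\rho,\quad -(\mu^2+\rho\mu+\alpha\rho)+\frac{\alpha\mu(\rho+\mu)+\frac{\alpha b^2}{2r}}{\alpha-\rho},\quad -\alpha\mu(\rho+\mu)-\frac{\alpha b^2}{2r}.$$

It can be verified that the first column of the Routh array always has exactly one sign change. By Routh's stability criterion, (20) has a root with a positive real part, and two roots with negative real parts.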
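The sign pattern of the Routh array can be checked numerically for given parameters. The sketch below uses the standard first column $1,\ a_2,\ (a_2a_1-a_0)/a_2,\ a_0$ for a cubic $\lambda^3+a_2\lambda^2+a_1\lambda+a_0$, with the coefficients read from (21)-(22); the function names are hypothetical, and the degenerate case $\alpha=\rho$ is not handled.

```python
def routh_first_column(alpha, mu, rho, b, r):
    """First column of the Routh array for the cubic in (20)-(22)."""
    a2 = alpha - rho
    a1 = -(mu**2 + rho * mu + alpha * rho)
    a0 = -alpha * mu * (rho + mu) - alpha * b**2 / (2.0 * r)
    if a2 == 0.0:
        raise ValueError("degenerate Routh array: alpha equals rho")
    return [1.0, a2, (a2 * a1 - a0) / a2, a0]


def sign_changes(column):
    """Number of sign changes between consecutive nonzero entries."""
    signs = [x > 0 for x in column if x != 0.0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)
```

One sign change corresponds to one root in the open right half-plane, in agreement with the statement that two roots have negative real parts.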

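Let $\lambda_1,\lambda_2$ be two roots of (20) with negative real parts, and $\xi_1,\xi_2$ be the corresponding (generalized) complex eigenvectors. Let $[\tilde{\bar p}(0),\tilde{\bar q}(0),\tilde s(0)]^T=[\bar p(0),\bar q(0),s(0)]^T-z$. The solution to equation (19) given by $z+e^{Mt}[\tilde{\bar p}(0),\tilde{\bar q}(0),\tilde s(0)]^T$ is in $C_b([0,\infty),\mathbb{R}^3)$ if and only if there exist constants $a_1,a_2$ such that $[\tilde{\bar p}(0),\tilde{\bar q}(0),\tilde s(0)]^T=a_1\xi_1+a_2\xi_2$. Indeed, suppose

$$[\tilde{\bar p}(0),\tilde{\bar q}(0),\tilde s(0)]^T=a_1\xi_1+a_2\xi_2+a_3\xi_3,$$

where $\lambda_3$ is a root of (20) with a positive real part and $\xi_3$ is the corresponding complex eigenvector. The solution

$$z+e^{Mt}[\tilde{\bar p}(0),\tilde{\bar q}(0),\tilde s(0)]^T=z+\sum_{i=1}^{3}h_i(t)e^{\lambda_it}\xi_i$$

is in $C_b([0,\infty),\mathbb{R}^3)$ if and only if $a_3=0$, where $h_i(t)$ are polynomials of $t$.

Denote by $\xi_1^\dagger$ and $\xi_2^\dagger$ the vectors formed by the first two components of $\xi_1$ and $\xi_2$, respectively. Then we have

$$a_1\xi_1^\dagger+a_2\xi_2^\dagger=[\tilde{\bar p}(0),\tilde{\bar q}(0)]^T. \tag{23}$$

Note that $[\tilde{\bar p}(0),\tilde{\bar q}(0)]^T$ is given. There exists a unique solution $(a_1,a_2)$ to (23) if and only if $\xi_1^\dagger$ and $\xi_2^\dagger$ are linearly independent.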

From the analysis above, we have the following result.

###### Proposition 2

(19) admits a unique solution such that $\bar q$ and $s$ are in $C_b([0,\infty),\mathbb{R})$ if and only if $\xi_1^\dagger$ and $\xi_2^\dagger$ are linearly independent. In this case, A4) holds.

###### Example 1

Take parameters as . Let . In this case, has only two eigenvalues with negative real parts and . The corresponding eigenvectors are and , respectively. By (23), we have and . Then (19) admits a unique solution in . However, . The parameters in this example satisfy the condition of Proposition 2, but not of Proposition 1.

### 3.3 ε-Nash equilibrium

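Consider the system of $N$ firms. Let the control strategy of firm $i$ be given by

$$\hat u_i=-\frac{b_i}{r}s,\quad i=1,\cdots,N, \tag{24}$$

where $s$ is determined by the equation system (15)-(18). After the strategy (24) is applied, the closed-loop dynamics for firm $i$ may be written as follows:

$$\begin{aligned}
\frac{d\hat p(t)}{dt}&=-\alpha\hat p(t)-\alpha\hat q^{(N)}(t)+\alpha\beta, &&(25)\\
d\hat q_i(t)&=-\mu\hat q_i(t)dt-\frac{b_i^2}{r}s(t)dt+\sigma dw_i,\quad i=1,\cdots,N. &&(26)
\end{aligned}$$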

Denote $\varepsilon_N=\big|\int_\Theta\theta^2dF_N(\theta)-\int_\Theta\theta^2dF(\theta)\big|$.
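For intuition about the size of the approximation error, the sketch below evaluates the gap between the empirical and the limiting second moment of the parameters $b_i$. It assumes that $\varepsilon_N$ measures exactly this gap, which is what the term $\Delta s$ in the proof of Theorem 2 suggests; the function names and the uniform limiting distribution are illustrative assumptions.

```python
def epsilon_N(b_samples, second_moment_F):
    """Gap |(1/N) * sum b_i^2 - integral of theta^2 dF| (assumed meaning of epsilon_N)."""
    n = len(b_samples)
    return abs(sum(b * b for b in b_samples) / n - second_moment_F)


def uniform_second_moment(lo, hi):
    """Second moment of a uniform distribution on [lo, hi], used as an example F."""
    return (hi**3 - lo**3) / (3.0 * (hi - lo))
```

If the parameters $b_i$ are drawn independently from $F$, this gap is typically of order $1/\sqrt{N}$, so both terms in the error estimate decay at comparable rates.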

###### Theorem 2

For the system (1)-(3), if assumptions A1)-A4) hold, then the closed-loop system (25)-(26) satisfies

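$$\begin{aligned}
&\sup_{t\geq0,N\geq1}\big\{E|\hat p(t)|^2+E|\hat q^{(N)}(t)|^2\big\}\leq C_0, &&(27)\\
&\sup_{t\geq0}E\big\{|\hat p(t)-\bar p(t)|^2+|\hat q^{(N)}(t)-\bar q(t)|^2\big\} &&(28)\\
&\qquad=O(\varepsilon_N^2+1/N). &&(29)
\end{aligned}$$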

Proof. By (26), it follows that

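$$d\hat q^{(N)}(t)=\Big[-\mu\hat q^{(N)}(t)-\frac{1}{N}\sum_{i=1}^{N}\frac{b_i^2}{r}s(t)\Big]dt+\frac{1}{N}\sum_{i=1}^{N}\sigma dw_i(t).$$

From this together with (25), we have

$$\begin{aligned}
\begin{bmatrix}d\hat p\\ d\hat q^{(N)}\end{bmatrix}={}&\begin{bmatrix}-\alpha&-\alpha\\ 0&-\mu\end{bmatrix}\begin{bmatrix}\hat p\\ \hat q^{(N)}\end{bmatrix}dt &&(30)\\
&+\begin{bmatrix}\alpha\beta\\ -\frac{1}{N}\sum_{i=1}^{N}\frac{b_i^2}{r}s\end{bmatrix}dt+\begin{bmatrix}0\\ \frac{1}{N}\sum_{i=1}^{N}\sigma dw_i\end{bmatrix}. &&(31)
\end{aligned}$$

By the boundedness of $s$ under A4) and elementary linear SDE estimates, we obtain (27).

Denote $\eta=[\hat p-\bar p,\ \hat q^{(N)}-\bar q]^T$. By (15), (17) and (30), we have

$$d\eta=G\eta dt+\begin{bmatrix}0\\ \Delta_s\end{bmatrix}dt+\begin{bmatrix}0\\ \frac{1}{N}\sum_{i=1}^{N}\sigma dw_i\end{bmatrix},$$

where

$$G=\begin{bmatrix}-\alpha&-\alpha\\ 0&-\mu\end{bmatrix}, \tag{32}$$

$$\Delta_s=\int_\Theta\frac{\theta^2}{r}s\,dF(\theta)-\frac{1}{N}\sum_{i=1}^{N}\frac{b_i^2}{r}s.$$

By solving this linear SDE and using the fact that $G$ is Hurwitz, we can show

$$\sup_{t\geq0}E\|\eta(t)\|^2=O\Big(\varepsilon_N^2+\frac{1}{N}\Big),$$

which yields (28).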

By the above theorem, we can obtain the next corollary.

###### Corollary 1

For the system (1)-(3), if assumptions A1)-A4) hold, then the closed-loop system (25)-(26) satisfies

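$$\begin{aligned}
&\sup_{t\geq0,N\geq1}E\int_0^\infty e^{-\rho t}\big\{|\hat p(t)|^2+|\hat q^{(N)}(t)|^2\big\}dt\leq C_0,\\
&E\int_0^\infty e^{-\rho t}\big\{|\hat p(t)-\bar p(t)|^2+|\hat q^{(N)}(t)-\bar q(t)|^2\big\}dt=O(\varepsilon_N^2+1/N).
\end{aligned}$$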

We are now in a position to show an asymptotic Nash equilibrium property. Denote

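$$\begin{aligned}
\mathcal{U}_c&=\Big\{u_i:u_i(t)\ \text{is adapted to}\ \sigma\{\cup_{j=1}^{N}\mathcal{F}_t^j\},\ E\int_0^\infty e^{-\rho t}u_i^2(t)dt<\infty\Big\},\\
\hat u_{-i}&=(\hat u_1,\cdots,\hat u_{i-1},\hat u_{i+1},\cdots,\hat u_N).
\end{aligned}$$
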
###### Theorem 3

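For the problem (1)-(3), assume that A1)-A4) hold. Then the set of strategies $(\hat u_1,\cdots,\hat u_N)$ given by (24) is an $\varepsilon$-Nash equilibrium, i.e.,

$$J_i(\hat u_i,\hat u_{-i})-\varepsilon\leq\inf_{u_i\in\mathcal{U}_c}J_i(u_i,\hat u_{-i})\leq J_i(\hat u_i,\hat u_{-i}), \tag{33}$$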

where $\varepsilon=O(\varepsilon_N+\frac{1}{\sqrt{N}})$.

Proof. See Appendix A.

## 4 Social Solutions to Output Adjustment

We first construct an auxiliary optimal control problem by examining the social cost variation due to the control perturbation of a single agent. Then, by mean field approximations we design a set of decentralized strategies which is shown to have asymptotic social optimality.