# Dynamic Control of Agents playing Aggregative Games with Coupling Constraints

Sergio Grammatico

S. Grammatico is with the Control Systems group, Department of Electrical Engineering, Eindhoven University of Technology, The Netherlands. E-mail address: s.grammatico@tue.nl.

###### Abstract

We address the problem of controlling a population of noncooperative heterogeneous agents, each with a convex cost function depending on the average population state, and all sharing a convex constraint, towards an aggregative equilibrium. We assume an information structure through which a central coordinator has access to the average population state and can broadcast control signals for steering the decentralized optimal responses of the agents. We design a dynamic control law that, based on operator theoretic arguments, ensures global convergence to an equilibrium independently of the problem data, that is, the cost functions and the constraints, local and global, of the agents. We illustrate the proposed method in two application domains: network congestion control and demand side management.

## I Introduction

#### Motivation

The problem of coordinating a population of competitive agents arises in several application domains, such as demand side management in the smart grid [1, 2, 3, 4, 5], e.g. for thermostatically controlled loads [6, 7, 8] and plug-in electric vehicles [9, 10, 11, 12], demand response in competitive markets [13], and congestion control for networks with shared resources [14].

The typical challenge in such coordination problems is that the agents are noncooperative, self-interested, yet coupled together, and have local decision authority that if left uncontrolled can lead to undesired emerging population behavior. From the control-theoretic perspective, the objective is to design a coordination law for steering the strategies of the agents towards a noncooperative game equilibrium.

#### Related literature

Whenever the behavior of each agent is affected by some aggregate effect of all the agents, which is a typical feature of the mentioned application domains, rather than by agent-specific one-to-one effects, aggregative games [15, 16, 17, 18] offer the fundamentals to analyze the strategic interactions between each individual agent and the entire population, although in the classic literature the analysis is limited to agents with scalar decision variable.

For large population size, in fact in the limit of infinite population size, aggregative game setups have been considered as deterministic mean field games among agents with strongly convex quadratic cost functions [19, 20].

In this paper, we are interested in generalized aggregative games for a population of agents with general convex functions, constrained vector decision variable, and in addition with convex coupling (i.e., shared) constraints.

Generalized games, that is, games among agents with coupling constraints, have been intensively studied in the last decade within the operations research community [21, 22] and the control systems community [23, 24, 25, 26], in relation to duality theory and variational inequalities.

Assessing the convergence of the dynamic interactions among the noncooperative agents towards an equilibrium is one main challenge that arises in (generalized) games. With this aim, best response dynamics and fictitious play with inertia, i.e., gradient update dynamics, have been analyzed and designed, respectively, both in discrete [27, 28] and continuous time setups [29, 30]. In particular, fictitious play with inertia has been introduced to overcome the non-convergence issue of the best response dynamics [29]. The common feature of these methods is that the agents implement sufficiently small gradient-type steps, each along the direction of optimality for their local problem. Thus, the noncooperative agents shall agree on the sequence of step sizes and exchange truthful information, e.g. with neighboring players, to update their local descent directions. Several distributed algorithms have been proposed for computing the game equilibria, see [31, 32, 33, 34, 35] and the references therein.

#### Originality

In this paper, we consider aggregative games among noncooperative agents that do not exchange information, nor agree on variables affecting their local behavior, with the other (competing) agents.

Instead, we assume the presence of a central coordinator that controls the decentralized optimal responses of the competitive agents, via the broadcast of incentive signals common to all of them. Specifically, we design a dynamic control law computing incentives that affect linearly the cost functions of all the agents, simply based on the average among their decentralized optimal responses. The resulting information structure determines the semi-decentralized control architecture illustrated in Figure 1.

Technically, we wish to control the decentralized optimal responses of the agents towards an aggregative equilibrium, that is, a set of agent strategies that are feasible for both the local and the shared constraints, and individually optimal for each agent, given the strategies of all other agents and the control vector associated with the potential violation of the shared constraints.

#### Contribution

The main contributions and novelties of the paper with respect to the literature are summarized next.

• We address the general problem of controlling a population of competitive agents with convex cost functions and constraints coupled together in aggregative form.

• We discover a nontrivial multivariable mapping with the following two fundamental properties:

1. its unique zero is the incentive signal that generates, via the agents’ decentralized optimal responses, the desired equilibrium;

2. there exists a Hilbert space in which the mapping reads as the sum of two monotone operators.

Therefore, splitting methods are applicable for computing the zero of such mapping in a semi-decentralized fashion.

• We design a dynamic control law with global convergence guarantee for steering the agents’ decentralized optimal responses to the desired equilibrium, with minimal information structure, and with no assumption on the problem data, other than convexity.

• We establish global logarithmic convergence rate under an appropriate selection of the control parameters.

• We show that our approach is applicable to network congestion control and demand side management.

To establish global convergence with minimal information structure, we build upon mathematical tools from variational and convex analysis [36], and monotone operator theory [37].

Equilibrium seeking in aggregative games with convex cost functions, convex local constraints and convex coupling constraints was first studied in [38], with a static control law. In this paper, we enrich the technical setup and study dynamic control laws. Preliminary versions of some technical results in this paper appear in [39], where no coupling constraint is considered, and in [40], where the cost functions are assumed to be strongly convex quadratic.

#### Paper organization

Section II defines the aggregative game setup. Section III presents the novel dynamic control law. The main technical results are shown in Section IV and the designed algorithm is discussed in Section V. Section VI illustrates our approach via numerical simulations. Section VII concludes the paper and points at several research avenues. Some proofs are provided in the Appendix.

### Notation

, , respectively denote the set of real, positive, and non-negative real numbers; ; denotes the set of natural numbers; for , , . denotes the transpose of a matrix . Given vectors , denotes . Given matrices , denotes the block diagonal matrix with in block diagonal positions; given scalars , we use the notation . With we denote the set of symmetric matrices; for a given , the notations () and () denote that is symmetric and has positive (non-negative) eigenvalues. denotes the identity matrix; () denotes a matrix/vector with all elements equal to (); to improve clarity, we may add the dimension of these matrices/vectors as subscript. denotes the Kronecker product between matrices and . Every mentioned set is meant to be nonempty. Given , and , denotes the set ; hence . The notation denotes the distance of a vector from a set .

#### Operator theoretic notations and definitions

, with , denotes the Hilbert space with inner product and induced norm , for all ; we refer to the Hilbert space whenever not specified otherwise. Given a function , . is -strongly convex, where , if is convex. denotes the identity operator. A mapping is -Lipschitz continuous relative to , where , if for all ; is a contraction (nonexpansive) mapping in if it is -Lipschitz relative to with (). Given a function , denotes its subdifferential set-valued mapping [36], defined as . A mapping is (strictly) monotone in if for all ; it is -strongly monotone, where , if for all ; it is -averaged, with , if for all ; it is firmly nonexpansive (hence strictly monotone and nonexpansive) if it is -averaged; it is -cocoercive (hence strictly monotone), with , if the mapping is firmly nonexpansive.

## II Aggregative games with coupling constraints

We consider a population of agents, where each agent has strategy (i.e., decision variable) , and all share the constraint

$$\frac{1}{N}\sum_{i=1}^{N} x^i \in \mathcal{S}, \tag{1}$$

for some set .

We assume that each agent aims at minimizing its local cost function , which depends on the average among the strategies of all other agents, and in particular at seeking a strategy such that

$$x^i \in \operatorname*{argmin}_{y \in \mathcal{X}^i} \; J^i\!\left(y, \frac{1}{N}\sum_{j=1}^{N} x^j, \lambda\right), \tag{2}$$

where the argument represents a control vector that the coordinator agent, introduced later on, can impose on the agents to avoid the violation of the coupling constraint in (1). Equations (1)–(2) define a competitive aggregative game. We have an aggregative game since the optimal strategy of each agent depends on the average among the strategies of all agents; the game is competitive aggregative since the cost functions of the agents all depend on a common vector associated with the coupling constraint in aggregative form.

Throughout the paper, we assume compactness, convexity and Slater’s qualification [41, §5.2.3] of both the individual and the shared constraints, and strong convexity of the cost functions, with linear dependence on the global coupling variable. Such basic assumptions ensure existence of an equilibrium, and that the agents’ optimal responses, defined formally in Section III-A, are single-valued and continuous.

###### Standing Assumption 1

Compactness, convexity, constraint qualification. The sets and are compact and convex subsets of , and satisfy the Slater’s constraint qualification.

###### Standing Assumption 2

Strongly convex cost functions. For all , the cost function in (2) is defined as

$$J^i(y,\sigma,\lambda) := f^i(y) + (C\sigma + K\lambda)^\top y, \tag{3}$$

for some function continuous and -strongly convex, with , , and invertible .

In (3), the matrix $C$ weights the influence of the average among the agents’ strategies on each cost function $J^i$, whereas the matrix $K$ weights the effect of the vector $\lambda$. In the remainder of the paper, we consider $C$ as part of the given problem data, and $K$ as a design choice for the coordinator of the game.

Our goal is to control the strategies of the agents to an aggregative equilibrium, that is, a set of strategies and control vector such that: the coupling constraint in (1) is satisfied, and each agent’s strategy is optimal given the strategies of all other agents and the control vector.

###### Definition 1

Aggregative equilibrium. A pair is an aggregative equilibrium for the game in (2) with coupling constraint in (1) if , for all ,

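$$\bar{x}^i \in \operatorname*{argmin}_{y \in \mathcal{X}^i} \; J^i\!\left(y, \frac{1}{N}\sum_{j=1}^{N} \bar{x}^j, \bar{\lambda}\right).$$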

We formalize next that an aggregative equilibrium exists under the postulated standing assumptions.

###### Proposition 1

Existence of an aggregative equilibrium. There exists an aggregative equilibrium for the game in (2) with coupling constraint in (1).

Proof. See Appendix A.

###### Remark 1

Non-uniqueness of aggregative equilibria. Uniqueness of the aggregative equilibrium does not necessarily hold. For instance, consider the game with the following problem data: , , , , . The pairs and are aggregative equilibria, independently of the choice of in (3). Selecting the best aggregative equilibrium from a global optimization perspective is beyond the scope of this paper.

To conclude the section, we note that in the limit of infinite population size, an aggregative equilibrium is a Nash equilibrium with fixed control vector.

###### Theorem 1

Aggregative equilibrium versus Nash equilibrium. Let the pair be an aggregative equilibrium, and define

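$$\varepsilon_N := \max_{i \in \mathbb{N}[1,N]} \operatorname{dist}\left(\bar{x}^i, \; \operatorname*{argmin}_{y \in \mathcal{X}^i} J^i\!\left(y, \frac{1}{N}\Big(y + \sum_{j \neq i}^{N} \bar{x}^j\Big), \bar{\lambda}\right)\right).$$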

Assume that there exists a compact set such that for all and . Then, there exists such that for all .

Proof. See Appendix B.

## III Dynamic control of the agents’ decentralized optimal responses

### III-A Fixed points of the aggregation mapping

For seeking an aggregative equilibrium, we assume that an agent cannot exchange information, nor has prior knowledge, on the strategies of all other (competing) agents. Instead, we assume that each individual agent responds optimally to incentive signals according to the information structure in Figure 1. Formally, for all , we define the agent optimal response mapping as

$$x^{i\star}(u) := \operatorname*{argmin}_{y \in \mathcal{X}^i} \; f^i(y) + u^\top y, \tag{4}$$

and the aggregation mapping as the average among the optimal responses of agents to the incentive signal , i.e.,

$$\mathcal{A}(\sigma,\lambda) := \frac{1}{N}\sum_{i=1}^{N} x^{i\star}(C\sigma + K\lambda). \tag{5}$$

Note that if for some , then , with shorthand notation . It follows immediately from Proposition 1 that such a pair exists; uniqueness depends however on the choice of as established later in Proposition 2, Section IV.

Therefore, if , then the pair is in fact an aggregative equilibrium. It follows that we can control the agents’ optimal responses, e.g. via dynamic updates of their argument, to a set of strategies whose average is a fixed point of the aggregation mapping (with respect to the first argument) within the coupling constraint set.
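As a minimal sketch of (4) and (5), the snippet below assumes a specific cost and constraint that are not part of the paper: each agent has $f^i(y) = \tfrac{1}{2} q_i \|y - r_i\|^2$ with $q_i > 0$ and a box constraint, so the optimal response has a closed form (clipping). The names `optimal_response`, `aggregation`, `agents`, and all numerical values are hypothetical.

```python
import numpy as np

def optimal_response(u, q, r, lo, hi):
    """Agent response (4) for the assumed cost f(y) = 0.5*q*||y - r||^2 on the box [lo, hi]^n.

    The objective is separable with scalar curvature q > 0, so the constrained
    minimizer is the unconstrained one, r - u/q, clipped to the box.
    """
    return np.clip(r - u / q, lo, hi)

def aggregation(sigma, lam, C, K, agents, lo, hi):
    """Aggregation mapping (5): average of the responses to the signal C*sigma + K*lam."""
    u = C @ sigma + K @ lam
    return np.mean([optimal_response(u, q, r, lo, hi) for q, r in agents], axis=0)

# Hypothetical data: three agents with scalar strategies (n = 1).
agents = [(1.0, np.array([1.0])), (2.0, np.array([2.0])), (4.0, np.array([3.0]))]
C = np.eye(1)
K = np.eye(1)
avg = aggregation(np.array([0.8]), np.array([8.0 / 3.0]), C, K, agents, 0.0, 5.0)
print(avg)
```

For this data the average of the responses is approximately 0.8, which coincides with the first argument, so 0.8 is a fixed point of the aggregation mapping with respect to its first argument for this particular control vector.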

### III-B From fixed points to zeros

Informally speaking, the objective is to find a pair such that , for some . Since depends on two arguments, it follows naturally that is designed as a mapping that depends on the same arguments. With this aim, we translate the problem into that of finding a zero of an appropriate multivariable mapping via semi-decentralized iterations.

Among all possible design choices, let us define the mapping as

$$x^{0\star}(\sigma,\lambda) := \operatorname*{argmin}_{y \in \mathcal{S}} \; \frac{1}{2} y^\top y + (K(\sigma-\lambda))^\top y. \tag{6}$$
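Completing the square gives a closed form for the mapping in (6). This is a standard step, worked out here only for convenience; $\operatorname{proj}_{\mathcal{S}}$ denotes the Euclidean projection onto the coupling constraint set, a notation introduced for this remark.

```latex
\begin{aligned}
\tfrac{1}{2}\, y^\top y + (K(\sigma-\lambda))^\top y
  &= \tfrac{1}{2}\,\big\| y + K(\sigma-\lambda) \big\|^2 - \tfrac{1}{2}\,\big\| K(\sigma-\lambda) \big\|^2, \\
x^{0\star}(\sigma,\lambda)
  &= \operatorname*{argmin}_{y \in \mathcal{S}} \; \tfrac{1}{2}\,\big\| y - K(\lambda-\sigma) \big\|^2
   = \operatorname{proj}_{\mathcal{S}}\big( K(\lambda-\sigma) \big).
\end{aligned}
```

The second term in the first line does not depend on $y$, so it does not affect the minimizer.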

Remarkably, we notice that a pair satisfies if is a zero of the mapping defined as

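$$\Theta\!\left(\begin{bmatrix}\sigma\\ \lambda\end{bmatrix}\right) := \begin{bmatrix}\sigma - \mathcal{A}(\sigma,K\lambda)\\ \sigma - 2\mathcal{A}(\sigma,K\lambda) + x^{0\star}(\sigma,\lambda)\end{bmatrix} = \begin{bmatrix} I_n & 0\\ I_n & 0\end{bmatrix}\begin{bmatrix}\sigma\\ \lambda\end{bmatrix} - \begin{bmatrix}\mathcal{A}(\sigma,K\lambda)\\ 2\mathcal{A}(\sigma,K\lambda) - x^{0\star}(\sigma,\lambda)\end{bmatrix} =: (M+\Gamma)\!\left(\begin{bmatrix}\sigma\\ \lambda\end{bmatrix}\right), \tag{7}$$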

where the matrix gain is to be chosen, and we defined the matrix and the mapping as

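$$M := \begin{bmatrix} I_n & 0\\ I_n & 0\end{bmatrix}, \tag{8}$$

$$\Gamma\!\left(\begin{bmatrix}\sigma\\ \lambda\end{bmatrix}\right) := -\begin{bmatrix}\mathcal{A}(\sigma,K\lambda)\\ 2\mathcal{A}(\sigma,K\lambda) - x^{0\star}(\sigma,\lambda)\end{bmatrix}. \tag{9}$$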

### III-C Dynamic control as zero finding algorithm

In general, computing a zero of a multivariable nonlinear mapping such as in (7) is a challenging task. However, for the sum of monotone mappings there exist iterative algorithms with global convergence guarantee [37, Chapter 25]. Inspired by the forward-backward algorithm [37, Equation 25.26], we propose the dynamic control law in (III-A) for computing a zero of in (7), where is sufficiently small and the averaging step sizes are chosen as follows.

###### Design choice 1

The sequence in (III-A) is such that for all and .

Suitable choices for the sequence that satisfy the design condition stated above are and , for all . The proposed dynamic control scheme is summarized in Algorithm 1.

## IV Global convergence

### IV-A Statement of the main results

The mapping in (7) reads as the sum of the linear, hence continuous, mapping and the mapping in (9). With the aim of applying [37, Theorem 25.8], in the following we show that by choosing the matrix gain in (4) appropriately, is monotone and is -cocoercive, that is, is firmly nonexpansive, in some Hilbert space. Consequently, we derive a dynamic control law that ensures global convergence of the controlled decentralized optimal responses to a set of strategies whose average is a fixed point of the aggregation mapping with respect to its first argument.

###### Design choice 2

The matrix in (7) is chosen such that and .

###### Theorem 2

Monotonicity. Under design choice 2, the linear mapping in (8) is monotone in , and the mapping in (9) is -cocoercive, hence strictly monotone, in , where

$$P := \begin{bmatrix} C+2K & -K\\ -K & K\end{bmatrix} \succ 0, \qquad \beta := \frac{\ell}{6\|P\|} > 0. \tag{10}$$

Proof.

See Section IV-B.
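The conditions in Theorem 2 can be checked numerically for given problem data. The sketch below builds the matrix in (10) and verifies that it is positive definite and that the linear mapping in (8) is monotone in the weighted space, which amounts to $M^\top P + P M \succcurlyeq 0$. The matrices `C`, `K` and the constant `ell` are hypothetical example values, not data from the paper.

```python
import numpy as np

def weight_matrix(C, K):
    """Block matrix P from (10)."""
    return np.block([[C + 2 * K, -K], [-K, K]])

# Hypothetical problem data (n = 2): symmetric C and a positive definite gain K.
C = np.array([[1.0, 0.2], [0.2, 0.5]])
K = 0.5 * np.eye(2)
ell = 1.0  # assumed strong-convexity constant entering beta in (10)

P = weight_matrix(C, K)
n = C.shape[0]
M = np.block([[np.eye(n), np.zeros((n, n))], [np.eye(n), np.zeros((n, n))]])  # (8)

# P must be symmetric positive definite to induce an inner product.
min_eig_P = np.linalg.eigvalsh(P).min()

# <Mx, x>_P = x^T M^T P x, so M is monotone in the P-weighted space
# exactly when the symmetric matrix M^T P + P M is positive semidefinite.
min_eig_sym = np.linalg.eigvalsh(M.T @ P + P @ M).min()

beta = ell / (6 * np.linalg.norm(P, 2))  # constant beta from (10), spectral norm
print(min_eig_P > 0, min_eig_sym >= -1e-12, beta)
```

A small negative tolerance is used in the semidefiniteness test because $M^\top P + P M$ has exact zero eigenvalues, which floating point arithmetic may return as tiny negative numbers.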

###### Proposition 2

Existence and uniqueness. Under design choice 2, , with as in (7).

Proof.

Existence follows immediately from Proposition 1. The mapping is the sum of monotone and strictly monotone mappings by Theorem 2, hence it is strictly monotone in [36, Exercise 12.4 (c)], with in (10), hence uniqueness holds [37, Proposition 23.35].

###### Design choice 3

The constant in (III-A) is such that , with in (10).

###### Theorem 3

Global convergence. Under design choices 1–3, the sequence defined in (III-A) converges, for any initial condition, to the zero of in (7), with as in (5) and as in (4) for all .

Proof.

The assumptions of [37, Theorem 25.8 ()] are verified as follows. is continuous and monotone in due to Theorem 2, hence maximally monotone [37, Corollary 20.25]. in (9) is -cocoercive in according to Theorem 2. The sequence is chosen as in [37, Theorem 25.8 ()], and the existence of a zero of holds by Proposition 2.

We conclude the subsection by quantifying the global convergence rate. Since in general this might depend on the chosen sequence , let us focus on the case for all , for which we establish global logarithmic convergence rate.

###### Design choice 4

The sequence in (III-A) is such that for all .

###### Theorem 4

Global logarithmic convergence rate. Under design choices 2–4, the sequence defined in (III-A) is such that, for all and any initial condition,

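$$\left\| \begin{bmatrix}\sigma(t+1)\\ \lambda(t+1)\end{bmatrix} - \begin{bmatrix}\sigma(t)\\ \lambda(t)\end{bmatrix} \right\|_P^2 \le \frac{3\bar{\alpha}^{-1}}{t+1} \left\| \begin{bmatrix}\sigma(0)\\ \lambda(0)\end{bmatrix} - \operatorname{zer}(\Theta) \right\|_P^2, \tag{11}$$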

where is as in (7), as in (10), as in (5) and as in (4) for all .

Proof.

See Section IV-C.

###### Corollary 1

Global convergence to an aggregative equilibrium. Under design choices 2–4, for any initial condition, the sequence defined from (4) and (III-A) converges with the logarithmic rate in (11) to an aggregative equilibrium for the game in (2) with coupling constraint in (1).

Proof. It follows immediately from Theorems 3 and 4.

### IV-B Proof of Theorem 2 (Monotonicity)

First, in (8) is monotone in as [20, Lemma 3]

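$$\begin{bmatrix} I & 0\\ I & 0\end{bmatrix}^{\top} \begin{bmatrix} C+2K & -K\\ -K & K\end{bmatrix} + \begin{bmatrix} C+2K & -K\\ -K & K\end{bmatrix} \begin{bmatrix} I & 0\\ I & 0\end{bmatrix} \succcurlyeq 0.$$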

We proceed with two statements that are exploited later on.

###### Lemma 1

If a function is -strongly convex, , then: is -strongly monotone, and is everywhere single-valued, globally -Lipschitz continuous, -cocoercive, and strictly monotone.

Proof.

is -strongly monotone by [36, Exercise 12.59], and equivalently is -cocoercive [42, p. 1021, Equation (18)]. is everywhere single-valued, globally -Lipschitz continuous by [36, Proposition 12.54]. Finally, we show that is strictly monotone. For all such that , we have for all . In particular, since is everywhere single-valued, for all there exist , , such that , , and hence .

###### Lemma 2

Let the function be -strongly convex, . Then for any , the mapping

$$x^{\star}(\cdot) := \operatorname*{argmin}_{y \in \mathbb{R}^n} \; f(y) + (A\,\cdot)^\top y = (\partial f)^{-1}(-A\,\cdot) \tag{12}$$

is -Lipschitz continuous.

Proof.

By Lemma 1 the mapping is -Lipschitz continuous. The affine mapping is -Lipschitz continuous, hence the composed mapping is -Lipschitz continuous. Equation (12) follows from Fermat’s rule [37, Theorem 16.2, Proposition 26.1], i.e., , hence for all . The second equation in (12) follows by applying to both sides of the last inclusion.
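To make (12) concrete, the sketch below takes an assumed quadratic $f(y) = \tfrac{1}{2} y^\top Q y$ with $Q \succ 0$, for which $(\partial f)^{-1}(v) = Q^{-1} v$ and the mapping in (12) is linear. It checks the first-order optimality condition and compares the exact Lipschitz constant $\|Q^{-1}A\|$ with the standard bound $\|A\|/\ell$, where $\ell$ is the smallest eigenvalue of $Q$. The matrices are random placeholders, and `x_star` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
R = rng.standard_normal((n, n))
Q = R @ R.T + np.eye(n)          # Hessian of the assumed quadratic f, positive definite
A = rng.standard_normal((n, n))  # arbitrary matrix playing the role of A in (12)

ell = np.linalg.eigvalsh(Q).min()  # strong-convexity constant of the quadratic f

def x_star(v):
    """Minimizer of 0.5*y^T Q y + (A v)^T y, i.e. (grad f)^{-1}(-A v) = -Q^{-1} A v."""
    return -np.linalg.solve(Q, A @ v)

v = rng.standard_normal(n)
gradient_at_minimizer = Q @ x_star(v) + A @ v          # Fermat's rule: should vanish

lipschitz = np.linalg.norm(np.linalg.solve(Q, A), 2)   # exact constant of the linear map
bound = np.linalg.norm(A, 2) / ell                     # bound ||A|| / ell
print(np.allclose(gradient_at_minimizer, 0.0), lipschitz <= bound)
```

The bound is generally not tight: it is attained only when the directions that maximize $\|A\|$ and $\|Q^{-1}\|$ align.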

It follows from Lemma 2 that, for all , the optimal response from (4) reads as

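$$x^{i\star}(C\sigma + K\lambda) = (\partial f^i)^{-1}\!\left( \begin{bmatrix} -C, & -K \end{bmatrix} \begin{bmatrix}\sigma\\ \lambda\end{bmatrix} \right),$$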

and analogously, the mapping in (6) reads as

where .

In view of in (9), for all , let us define the mapping as

 (13)

so that . Note that the mapping is -strongly monotone with , thus the mapping in (13) is -cocoercive and -Lipschitz continuous due to Lemma 1 and [37, Proposition 20.23]. In the rest of the proof, we exploit the following result, which is a variant of [37, Proposition 4.5].

###### Lemma 3

Let be a -cocoercive mapping, , and be invertible matrices. If , then the mapping is -cocoercive in with .

Proof.

Since is -cocoercive, for all :

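$$\begin{aligned}
\big(A\mathcal{M}(Bx) - A\mathcal{M}(By)\big)^{\top} A^{-\top} B\,(x-y)
 &= \big(\mathcal{M}(Bx) - \mathcal{M}(By)\big)^{\top} B\,(x-y) \\
 &\ge \gamma \left\| \mathcal{M}(Bx) - \mathcal{M}(By) \right\|^2 \\
 &\ge \frac{\gamma}{\|A\|^2} \left\| A\mathcal{M}(Bx) - A\mathcal{M}(By) \right\|^2 \\
 &\ge \frac{\gamma}{\|A\|^2 \, \|A^{-\top}B\|} \left\| A\mathcal{M}(Bx) - A\mathcal{M}(By) \right\|_{A^{-\top}B}^2 .
\end{aligned}$$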

We now apply Lemma 3 to the mapping in (13). Namely, we consider and the matrices

$$A := -\begin{bmatrix} I_n & 0_{n\times n}\\ 2I_n & -I_n\end{bmatrix}, \qquad B := \begin{bmatrix} -C & -K\\ -K & K\end{bmatrix},$$

and derive

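$$\begin{aligned}
P := A^{-\top}B
 &= -\begin{bmatrix} I & 0\\ 2I & -I\end{bmatrix}^{-\top} \begin{bmatrix} -C & -K\\ -K & K\end{bmatrix}
  = \begin{bmatrix} I & 2I\\ 0 & -I\end{bmatrix} \begin{bmatrix} C & K\\ K & -K\end{bmatrix} \\
 &= \begin{bmatrix} C+2K & -K\\ -K & K\end{bmatrix}
  = \begin{bmatrix} C+(1-\epsilon)K & 0\\ 0 & 0\end{bmatrix} + \begin{bmatrix} 1+\epsilon & -1\\ -1 & 1\end{bmatrix} \otimes K,
\end{aligned}$$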

where is chosen such that .

Since and , and ensure that is invertible and . By Lemma 3, this implies that, for all , is -cocoercive in , where and in (10). In turn, is also -cocoercive [37, Example 4.31].
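The matrix identities used above are easy to confirm numerically. The following sketch checks, for randomly generated placeholder matrices `C` and `K` (hypothetical data), that $A^{-\top}B$ equals the block matrix in (10) and that the Kronecker splitting holds for an arbitrary value of the scalar (here `eps = 0.5`, chosen only for illustration).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2
S1 = rng.standard_normal((n, n))
S2 = rng.standard_normal((n, n))
C = S1 + S1.T                      # hypothetical symmetric C
K = S2 @ S2.T + np.eye(n)          # hypothetical positive definite gain K

I, Z = np.eye(n), np.zeros((n, n))
A = -np.block([[I, Z], [2 * I, -I]])
B = np.block([[-C, -K], [-K, K]])
P = np.block([[C + 2 * K, -K], [-K, K]])

lhs = np.linalg.inv(A).T @ B       # A^{-T} B

eps = 0.5                          # illustrative value of the splitting scalar
split = np.block([[C + (1 - eps) * K, Z], [Z, Z]]) + np.kron(
    np.array([[1 + eps, -1.0], [-1.0, 1.0]]), K
)
print(np.allclose(lhs, P), np.allclose(split, P))
```

Both identities are purely algebraic and hold for any `C` and `K`; positive definiteness of the resulting matrix, in contrast, depends on the design conditions on the gain.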

Since all the mappings are strictly monotone in , it follows that is strictly monotone as well [36, Exercise 12.4 (c)]; in fact, for all , in (13) is strictly monotone by Lemma 1 and [37, Proposition 20.23]. Finally, strict monotonicity of follows from the next result, which is a variant of [37, Proposition 28.2].

###### Lemma 4

Let be a (strictly) monotone mapping, and . If is invertible and , then the mapping is (strictly) monotone in .

Proof.

Since is (strictly) monotone, for all , we have:

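$$\begin{aligned}
0 \le (<) \; \big(\mathcal{M}(Bx) - \mathcal{M}(By)\big)^{\top} (Bx - By)
 &= \big(\mathcal{M}(Bx) - \mathcal{M}(By)\big)^{\top} A^{\top} A^{-\top} B\,(x-y) \\
 &= \big(A\mathcal{M}(Bx) - A\mathcal{M}(By)\big)^{\top} A^{-\top} B\,(x-y).
\end{aligned}$$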

### IV-C Dynamic control as fixed point iteration: Proof of Theorem 4 (Global logarithmic convergence rate)

The iteration in (III-A) can be written as the fixed point iteration

where the mapping is defined as

$$\mathcal{T}(\cdot) := (I + \epsilon M)^{-1} (\mathrm{Id} - \epsilon \Gamma)(\cdot). \tag{14}$$

In fact, a vector is a fixed point of if and only if, for all , it is a fixed point of the mapping , and if and only if it is a zero of [38, Lemma ]: if , then , which is equivalent to .

To establish the convergence rate of the iteration in (III-A) with for all , we show that the mappings in (14) and

$$\mathcal{K}(\cdot) := (1-\bar{\alpha})\,\mathrm{Id}(\cdot) + \bar{\alpha}\,\mathcal{T}(\cdot) \tag{15}$$

are averaged operators.

###### Lemma 5

The mapping in (14) is -averaged in , and the mapping in (15) is -averaged in , with as in (10).

Proof.

It follows from the proof of Theorem 3 that in (8) is monotone in , thus is firmly nonexpansive [37, Corollary 23.10 (i)]. According to Theorem 2, is firmly nonexpansive, hence also the mapping is firmly nonexpansive [37, Proposition 4.2]. Therefore, is the composition of two firmly nonexpansive mappings, or equivalently the composition of two -averaged operators. In particular, is -averaged in [43, Proposition 2.4]. Finally, is the convex combination of two averaged operators, hence it is averaged with parameter [43, Proposition 2.2].

We can now prove Theorem 4.

Proof. By the definition of averaged operator, we have that for all , where in view of Lemma 5. Let us take , and . By substituting, we obtain , and equivalently . In particular, note that . Now, we sum up over and derive

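$$\begin{aligned}
(t+1) \left\| z(t+1) - z(t) \right\|_P^2
 &\le \frac{\beta}{1-\beta} \sum_{\tau=0}^{t} \left( \left\| z(\tau) - \bar{z} \right\|_P^2 - \left\| \mathcal{K}(z(\tau)) - \bar{z} \right\|_P^2 \right) \\
 &\le \frac{\beta}{1-\beta} \left\| z(0) - \bar{z} \right\|_P^2 .
\end{aligned}$$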

Since , we have that , which completes the proof.

Finally, we note that the mapping is nonexpansive according to Lemma 5, hence several fixed point iterations have global convergence guarantee. The design choice for all is known as Krasnoselskij iteration [20, Equation (18)]. Among other iterations, we mention the Mann iteration [20, Equation (20)], which corresponds to choosing the sequence such that for all , , and , e.g. for all .
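The fixed point iteration built from (14) and (15) can be sketched on a small example. Everything specific below is an assumption made for illustration: a scalar game ($n = 1$) with three agents, quadratic costs $f^i(y) = \tfrac{1}{2} q_i (y - r_i)^2$, box constraints, $C = K = 1$, a constant averaging step `alpha = 0.5`, and a step size `eps` set equal to the constant in (10) on the assumption that this lies in the admissible range. The closed form used for the mapping in (6) is the projection onto the coupling set.

```python
import numpy as np

# Hypothetical scalar game: f_i(y) = 0.5*q_i*(y - r_i)^2, X_i = [0, 5], S = [0, 0.8].
q = np.array([1.0, 2.0, 4.0])
r = np.array([1.0, 2.0, 3.0])
x_lo, x_hi = 0.0, 5.0
s_lo, s_hi = 0.0, 0.8
C, K = 1.0, 1.0

def responses(sigma, lam):
    """Decentralized optimal responses (4) to the broadcast signal C*sigma + K*lam."""
    return np.clip(r - (C * sigma + K * lam) / q, x_lo, x_hi)

def Gamma(z):
    """Mapping (9); x0 is the closed form of (6), a projection onto S."""
    sigma, lam = z
    a = responses(sigma, lam).mean()
    x0 = np.clip(K * (lam - sigma), s_lo, s_hi)
    return -np.array([a, 2.0 * a - x0])

M = np.array([[1.0, 0.0], [1.0, 0.0]])          # (8)
P = np.array([[C + 2 * K, -K], [-K, K]])        # (10)
ell = min(q.min(), 1.0)                         # assumed: accounts for the unit curvature in (6)
beta = ell / (6.0 * np.linalg.norm(P, 2))
eps = beta                                      # assumed admissible step size
alpha = 0.5                                     # constant averaging step

resolvent = np.linalg.inv(np.eye(2) + eps * M)  # (I + eps*M)^{-1}

def T(z):
    """Forward-backward operator (14)."""
    return resolvent @ (z - eps * Gamma(z))

z = np.zeros(2)
for _ in range(20000):
    z = (1 - alpha) * z + alpha * T(z)          # averaged update, as in (15)

residual = np.linalg.norm(M @ z + Gamma(z))     # norm of Theta(z) from (7)
print(z, residual)
```

For this data one can verify by hand that $(\sigma, \lambda) = (0.8, 8/3)$ is a zero of the mapping in (7): the average response equals 0.8, which sits on the boundary of the coupling set, so the shared constraint is active at the equilibrium. In each pass of the loop, the coordinator only needs the average of the responses, in line with the information structure described in the paper.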

## V Discussion

### V-A Features of the dynamic control scheme

One computational feature of the iteration in (III-A) is that it only requires one-to-all coordination between a central computer and the decentralized, hence parallelizable, optimal responses in (4) of the agents, as summarized in Algorithm 1. Each decentralized computation consists of solving a finite dimensional strongly convex optimization problem, for which efficient algorithms are available. Note that at each iteration only one vector in is broadcast, independently of the population size , which can be arbitrarily large. Also note that the coordinator needs access to the aggregate information only, not necessarily to the entire set of optimal responses.

The main distinctive feature of the proposed semi-decentralized architecture is that the central coordinator can decide on the step sizes and , on the gain , and on the stopping criterion for the iteration in (III-A). Therefore, the agents can simply behave as fully noncooperative, in the sense that in addition to being self-interested, they do not have to exchange information with each other, nor to agree on the step sizes associated with the control signals, nor on the stopping criterion. Each of these agreement points would be in fact exposed to malicious agent behavior. Also, for the convergence of the dynamic iterations, neither the central coordinator nor the agents need to know the population size .

In summary, Algorithm 1 is such that:

• the central coordinator keeps the information on the chosen incentive mechanism and on the global coupling constraint private;

• the noncooperative agents keep information on their cost functions and local constraints private.

### V-B Generalized Nash aggregative games

The game setup in (1)–(2) can be related to the generalized Nash equilibrium problem (GNEP) with best responses

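$$x^i \in \operatorname*{argmin}_{y^i \in \mathbb{R}^n} \; \varphi^i(y^i, x^{-i}) \quad \text{s.t. } \; y^i \in \mathcal{X}^i \cap \mathcal{S}^i(x^{-i}), \tag{16}$$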

for all , where

and the shared constraint set for agent reads as

 S