Distributed Nash Equilibrium Seeking via the Alternating Direction Method of Multipliers
In this paper, the problem of finding a Nash equilibrium of a multi-player game is considered. The players are only aware of their own cost functions as well as the action space of all players. We develop a relatively fast algorithm within the framework of inexact-ADMM. It requires a communication graph for the information exchange between the players as well as a few mild assumptions on cost functions. The convergence proof of the algorithm to a Nash equilibrium of the game is then provided. Moreover, the convergence rate is investigated via simulations.
Farzad Salehisadaghiani, Lacra Pavel
Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON M5S 3G4, Canada (e-mails: firstname.lastname@example.org, email@example.com).
There is a close connection between the problem of finding a Nash equilibrium (NE) of a distributed game and a distributed optimization problem. In a distributed optimization problem with $N$ agents that communicate over a connected graph, it is desired to minimize a global objective as follows:
$$\min_{x} \; f(x) := \sum_{i=1}^{N} f_i(x).$$
In this problem the agents cooperatively minimize the global objective over a common optimization variable $x$. In other words, all the agents serve the public interest by reducing the global loss. However, there are many real-world applications that involve selfishness of the players (agents), such as congestion control for ad hoc wireless networks and optical signal-to-noise ratio (OSNR) maximization in an optical network. In these applications, players selfishly desire to optimize their own performance even though the global objective may not be minimized; hence they play a game. In this regard, we are interested in studying the (Nash) equilibrium of this game.
Considering the difference between distributed optimization and distributed Nash equilibrium (NE) seeking, we aim to employ an optimization technique referred to as the alternating direction method of multipliers (ADMM) to find an equilibrium point of a multi-player game. ADMM takes advantage of two different approaches used in solving optimization problems: 1) Dual Decomposition, and 2) Augmented Lagrangian Methods.
Dual decomposition is a special case of a dual ascent method for solving an optimization problem when the objective function is separable w.r.t. the variable $x$, i.e., $f(x) := \sum_{i=1}^{N} f_i(x_i)$ where $x = [x_1, \ldots, x_N]^T$. This decomposition leads to $N$ parallel dual ascent problems whereby each is to be solved for $x_i$, $i \in \{1, \ldots, N\}$. This parallelism makes the convergence faster.
The augmented Lagrangian method is more robust and relaxes the assumptions in the dual ascent method. This method involves a penalty term added to the normal Lagrangian.
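To make the combination of decomposition and augmented-Lagrangian penalty concrete, the following is a minimal sketch of standard consensus ADMM on a toy cooperative problem. It is not the algorithm of this paper: the quadratic local costs $f_i(x) = \frac{1}{2}(x - a_i)^2$ and the names `a`, `rho` and `n_iter` are assumptions made for illustration only.

```python
# Consensus ADMM on a toy problem (illustrative; not the paper's algorithm).
# Each agent i holds f_i(x) = 0.5 * (x - a[i])**2, so the minimizer of
# sum_i f_i is the mean of a. The names a, rho, n_iter are assumptions.

def consensus_admm(a, rho=1.0, n_iter=200):
    n = len(a)
    x = [0.0] * n      # local copies of the decision variable
    y = [0.0] * n      # dual variables (Lagrange multipliers)
    z = 0.0            # shared consensus variable
    for _ in range(n_iter):
        # x-update: minimize f_i(x_i) + y_i*(x_i - z) + (rho/2)*(x_i - z)**2,
        # in closed form and independently (in parallel) for every agent.
        x = [(a[i] - y[i] + rho * z) / (1.0 + rho) for i in range(n)]
        # z-update: average of the dual-shifted local copies.
        z = sum(x[i] + y[i] / rho for i in range(n)) / n
        # Dual ascent step on the constraint x_i = z.
        y = [y[i] + rho * (x[i] - z) for i in range(n)]
    return x, z


local_copies, consensus_value = consensus_admm([1.0, 4.0, 7.0])
```

The x-update is where the separability pays off: each agent solves a small problem on its own, while the quadratic penalty weighted by `rho` is the augmented-Lagrangian term.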
In this work, we aim to exploit the benefits of ADMM in the context of finding an NE of a game. Here are the difficulties that we need to overcome:
A Nash game can be seen as a set of parallel optimization problems, each of them associated with the minimization of a player’s own cost function w.r.t. his variable. However, each optimization problem is dependent on the solution of the other parallel problems. This leads to $N$ Lagrangians, each of which depends on the other players’ variables.
Each player $i$ updates only his own variable $x_i$; however, he also requires an estimate of all the other variables, which he must update in order to solve his optimization problem. This demands an extra step in the algorithm based on communications between players.
Each optimization problem is not in the proper format of sum of separable functions to allow direct application of ADMM.
Related Works. Our work is related to the literature on distributed Nash games such as Yin et al. (2011); Alpcan and Başar (2005) and distributed optimization problems such as Nedic (2011); Johansson (2008). Finding an NE in distributed games has recently drawn attention due to many real-world applications; to name only a few, see Salehisadaghiani and Pavel (2014); Salehisadaghiani and Pavel (2016a); Frihauf et al. (2012); Gharesifard and Cortes (2013); Salehisadaghiani and Pavel (2016b); Pavel (2007); Pan and Pavel (2009). In Koshal et al. (2012) an algorithm is designed based on a gossiping protocol to compute an NE in aggregative games. Zhu and Frazzoli (2016) study the problem of finding an NE in more general games by a gradient-based method over a complete communication graph. This problem is extended to the case with partially coupled cost functions (the functions which are not necessarily dependent on all the players’ actions) in Bramoullé et al. (2014). Recently, Ye and Hu (2015) investigate distributed seeking of a time-varying NE with non-model-based costs for players. Computation of a time-varying NE is considered in Lou et al. (2016) in networked games consisting of two subnetworks with shared objectives. Parise et al. (2015) propose two different algorithms to solve for an NE in a large population aggregative game which is subject to heterogeneous convex constraints.
ADMM algorithms, which are in the scope of this paper, were developed in the 1970s to find an optimal point of distributed optimization problems. The method has become widely used after its re-introduction in Boyd et al. (2011); see, e.g., He and Yuan (2012); Goldstein et al. (2014); Wei and Ozdaglar (2012). Shi et al. (2014) investigate the linear convergence rate of an ADMM algorithm to solve a distributed optimization problem. ADMM algorithms are extended by Makhdoumi and Ozdaglar (2014) to the case when agents broadcast their outcomes to their neighbors. The problem of distributed consensus optimization is considered in Chang et al. (2015), which exploits inexact-ADMM to reduce the computational costs of a classical ADMM. Recently, an ADMM-like algorithm was proposed by Shi and Pavel (submitted) in order to find an NE of a game. It is shown that the algorithm converges faster than gradient-based methods. However, the algorithm requires individual cocoercivity and is not derived systematically, but rather postulated by mimicking ADMM in distributed optimization according to the NE condition.
Contributions. In this paper, first, we reformulate the problem of finding an NE of a convex game as a set of distributed consensus optimization problems. Then we take advantage of a dummy variable to make the problem separable in the optimization variable. This technique can be used for any convex game which satisfies a set of relatively mild assumptions.
Second, we design a synchronous inexact-ADMM algorithm by which every player updates his action as well as his estimates of the other players’ actions. This algorithm takes advantage of the speed and robustness of the classical ADMM and reduces the computational costs by using a linear approximation in players’ action update rule (inexact-ADMM). Compared with gradient-based algorithms such as Koshal et al. (2012); Zhu and Frazzoli (2016); Li and Marden (2013), our ADMM algorithm has an extra penalty term (could be seen as an extra state) which is updated through the iterations and improves the convergence rate.
Third, we prove the convergence of the proposed algorithm toward the NE of the game and compare its convergence rate with a gradient-based method via simulation.
The paper is organized as follows. The problem statement and assumptions are provided in Section 2. In Section 3, an inexact-ADMM-like algorithm is proposed. Convergence of the algorithm to a Nash equilibrium of the game is discussed in Section 4 while in Section 5 a simplified representation of the algorithm for implementation is provided. Simulation results are given in Section 6 and conclusions in Section 7.
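Consider $V = \{1, \ldots, N\}$ as a set of $N$ players in a networked multi-player game. The game is denoted by $\mathcal{G}(V, \Omega_i, J_i)$ and defined as follows:
$\Omega_i \subset \mathbb{R}$: Action set of player $i$, $\forall i \in V$,
$\Omega = \prod_{i \in V} \Omega_i \subset \mathbb{R}^N$: Action set of all players,
$J_i : \Omega \to \mathbb{R}$: Cost function of player $i$, $\forall i \in V$.
The game $\mathcal{G}(V, \Omega_i, J_i)$ is defined over the set of players, $V$, the action set of player $i$, $\Omega_i$, and the cost function of player $i$, $J_i$.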
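The players’ actions are denoted as follows:
$x = (x_i, x_{-i}) \in \Omega$: All players’ actions,
$x_i \in \Omega_i$: Player $i$’s action, $\forall i \in V$,
$x_{-i} \in \Omega_{-i} := \prod_{j \in V \setminus \{i\}} \Omega_j$: All players’ actions except player $i$’s.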
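The game is played such that for a given $x_{-i} \in \Omega_{-i}$, each player $i \in V$ aims to minimize his own cost function selfishly w.r.t. $x_i$ to find an optimal action,
$$\min_{x_i} \; J_i(x_i, x_{-i}) \quad \text{subject to} \quad x_i \in \Omega_i.$$
Each of these optimization problems is solved by one player $i$, simultaneously with those of the other players.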
An NE of a game is defined as follows:
Consider an $N$-player game $\mathcal{G}(V, \Omega_i, J_i)$, each player $i$ minimizing the cost function $J_i : \Omega \to \mathbb{R}$. A vector $x^* = (x_i^*, x_{-i}^*) \in \Omega$ is called a Nash equilibrium of this game if
$$J_i(x_i^*, x_{-i}^*) \le J_i(x_i, x_{-i}^*) \quad \forall x_i \in \Omega_i, \; \forall i \in V.$$
An NE lies at the intersection of all solutions of the players’ optimization problems. The challenge is that each of these optimization problems is dependent on the solution of the other simultaneous problems. Moreover, since this game is distributed, no player is aware of the actions (solutions) of the other players (problems).
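The equilibrium condition can be checked numerically on a small example. The sketch below uses a hypothetical two-player quadratic game (the costs, the interval $[-5, 5]$ used as action set, and all function names are assumptions for illustration, not taken from the paper) and tests on a grid whether any unilateral deviation lowers a player's cost.

```python
# Toy two-player game (hypothetical costs chosen for this sketch):
#   J1(x1, x2) = (x1 - 1)**2 + x1 * x2
#   J2(x1, x2) = (x2 - 2)**2 + x1 * x2
# Setting each player's own partial derivative to zero gives
#   2*x1 + x2 = 2  and  x1 + 2*x2 = 4,  i.e. the candidate point (0, 2).

def J1(x1, x2):
    return (x1 - 1.0) ** 2 + x1 * x2


def J2(x1, x2):
    return (x2 - 2.0) ** 2 + x1 * x2


def is_nash(x1, x2, lo=-5.0, hi=5.0, steps=1001, tol=1e-12):
    """Check on a grid that no unilateral deviation in [lo, hi] lowers a cost."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    ok1 = all(J1(d, x2) >= J1(x1, x2) - tol for d in grid)  # player 1 deviates
    ok2 = all(J2(x1, d) >= J2(x1, x2) - tol for d in grid)  # player 2 deviates
    return ok1 and ok2


candidate_is_ne = is_nash(0.0, 2.0)
other_point_is_ne = is_nash(1.0, 1.0)
```

Note that each check holds the other player's action fixed, which is exactly why a player who does not know that action cannot evaluate his own optimality condition.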
We assume that each player maintains an estimate of the other players’ actions. In the following, we define a few notations for players’ estimates.
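$x^i = (x^i_i, x^i_{-i}) \in \Omega$: Player $i$’s estimate of all players’ actions,
$x^i_i \in \Omega_i$: Player $i$’s estimate of his own action, $\forall i \in V$,
$x^i_{-i} \in \Omega_{-i}$: Player $i$’s estimate of all other players’ actions except his action,
$\underline{x} = [x^{1T}, \ldots, x^{NT}]^T \in \Omega^N$: Augmented vector of estimates of all players’ actions.
Note that player $i$’s estimate of his action is indeed his action, i.e., $x^i_i = x_i$ for $i \in V$. Note also that all players’ actions can be interchangeably represented as $x = [x^1_1, \ldots, x^N_N]^T$.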
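We assume that the cost function $J_i$ and the action set $\Omega$ are the only information available to player $i$. Thus, the players need to exchange some information in order to update their estimates. An undirected communication graph $G_C(V, E)$ is defined where $E \subseteq V \times V$ denotes the set of communication links between the players. $(i, j) \in E$ if and only if players $i$ and $j$ exchange information. In the following, we have a few definitions for $G_C$:
$N_i := \{j \in V \mid (i, j) \in E\}$: Set of neighbors of player $i$ in $G_C$,
$A := [a_{ij}]_{i,j \in V}$: Adjacency matrix of $G_C$ where $a_{ij} = 1$ if $(i, j) \in E$ and $a_{ij} = 0$ otherwise,
$D := \mathrm{diag}\{|N_1|, \ldots, |N_N|\}$: Degree matrix of $G_C$.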
The following assumption is used.
Assumption 1. $G_C$ is a connected graph.
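The graph objects defined above are straightforward to build and the connectivity assumption is easy to test. The following sketch does both; it assumes 0-based player indices and an edge-list input, and the function names are illustrative rather than from the paper.

```python
from collections import deque


def graph_matrices(n, edges):
    """Return neighbor sets, adjacency matrix A and degree matrix D for an
    undirected graph on players 0..n-1 (0-based indices for convenience)."""
    neighbors = [set() for _ in range(n)]
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)   # undirected: the link is shared by both players
    A = [[1 if j in neighbors[i] else 0 for j in range(n)] for i in range(n)]
    D = [[len(neighbors[i]) if i == j else 0 for j in range(n)] for i in range(n)]
    return neighbors, A, D


def is_connected(neighbors):
    """Breadth-first search from player 0; connected iff every player is reached."""
    if not neighbors:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in neighbors[i]:
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return len(seen) == len(neighbors)


# A path graph 0 - 1 - 2 - 3 is sparse but still connected.
path_neighbors, path_A, path_D = graph_matrices(4, [(0, 1), (1, 2), (2, 3)])
```

Connectivity is what lets information about any player's action eventually reach every other player, even though each player only talks to his neighbors.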
We aim to relate the game to the following problem, whose solution can be based on the alternating direction method of multipliers (Bertsekas and Tsitsiklis (1997), page 255).
To this end, we reformulate the game so that the objective function is separable by employing estimates of the actions for each player $i \in V$ as $x^i$ (the estimates are also interpreted as the local copies of $x$). Particularly, consider that for a given $x^i_{-i} \in \Omega_{-i}$, each player $i$ minimizes his cost function selfishly w.r.t. his own action subject to an equality constraint, i.e., for all $i \in V$,
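For orientation, the two-block problem to which ADMM is usually applied can be written in the generic form below. The symbols $G_1$, $G_2$, $M$, $C_1$ and $C_2$ are assumptions chosen for this sketch (convex functions, a constraint matrix, and closed convex sets, respectively) and need not match the notation of the cited reference; $M$ is used to avoid a clash with the adjacency matrix $A$.

```latex
\begin{aligned}
\min_{x,\, z} \quad & G_1(x) + G_2(z) \\
\text{subject to} \quad & M x = z, \qquad x \in C_1, \; z \in C_2 .
\end{aligned}
```

The objective is a sum of two terms in separate variables coupled only through a linear constraint, which is the structure the reformulation of the game aims to reach.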
where for . Note that, in order to update all elements in we need to augment the constraint space to an vector form . Moreover, we replace the constraints with which includes . Note that augmenting the constraints in this way does not affect the solutions of the problem. Then for a given and for all , we obtain,
The equality constraint along with Assumption 1 ensures that all the local copies of $x$ are identical, i.e., $x^1 = x^2 = \cdots = x^N$. Hence (4) recovers the original game.
where is an indicator function of the feasibility constraint and is an intermediary variable to separate the equality constraints.
Note that one can regard the set of problems (5) as being the same as the original set of players’ problems, but considering estimates (local copies) of the players’ actions for each player $i \in V$.
A characterization of the NE for the game could be obtained by finding KKT conditions on the set of problems (5). Let with be the Lagrange multipliers associated with the two constraints in (5), respectively. The corresponding Lagrange function for player , is as follows:
Let and be a pair of optimal primal and dual solutions to (5). The KKT conditions are summarized as follows:
We state a few assumptions for the existence and the uniqueness of an NE.
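Assumption 2. For every $i \in V$, the action set $\Omega_i$ is a non-empty, compact and convex subset of $\mathbb{R}$. $J_i(x_i, x_{-i})$ is a continuously differentiable function in $x_i$, jointly continuous in $x$ and convex in $x_i$ for every $x_{-i}$.
The convexity of $\Omega_i$ implies that the indicator function $\mathcal{I}_{\Omega_i}$ is a convex function. This yields that there exists at least one bounded subgradient in $\partial \mathcal{I}_{\Omega_i}$.
Assumption 3. Let $F$ be the pseudo-gradient vector (game map), obtained by stacking each player’s partial gradient $\nabla_{x_i} J_i$ of his own cost w.r.t. his own action. $F$ is cocoercive with constant $\sigma_F > 0$, i.e., for all $x$ and $y$ in its domain,
$$(F(x) - F(y))^T (x - y) \ge \sigma_F \, \|F(x) - F(y)\|^2.$$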
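Cocoercivity can be probed numerically on a simple map. The sketch below uses an affine map $F(x) = Mx + q$ with $M$ symmetric positive semidefinite, for which the standard cocoercivity constant is $1/\lambda_{\max}(M)$. The particular $M$ and $q$ are assumptions: they are the pseudo-gradient of a hypothetical two-player game with costs $(x_1 - 1)^2 + x_1 x_2$ and $(x_2 - 2)^2 + x_1 x_2$, not a game from the paper.

```python
import random

# Affine map F(x) = M x + q with M symmetric positive semidefinite.
# Such a map is cocoercive with sigma = 1 / lambda_max(M).
# Here M = [[2, 1], [1, 2]] has eigenvalues 1 and 3, so sigma = 1/3.
M = [[2.0, 1.0], [1.0, 2.0]]
q = [-2.0, -4.0]
SIGMA = 1.0 / 3.0


def F(x):
    return [M[0][0] * x[0] + M[0][1] * x[1] + q[0],
            M[1][0] * x[0] + M[1][1] * x[1] + q[1]]


def cocoercivity_gap(x, y, sigma=SIGMA):
    """(F(x)-F(y))^T (x-y) - sigma*||F(x)-F(y)||^2; nonnegative if cocoercive."""
    fx, fy = F(x), F(y)
    dF = [fx[0] - fy[0], fx[1] - fy[1]]
    dx = [x[0] - y[0], x[1] - y[1]]
    inner = dF[0] * dx[0] + dF[1] * dx[1]
    return inner - sigma * (dF[0] ** 2 + dF[1] ** 2)


def worst_gap(n_samples=1000, seed=0):
    """Smallest gap over random pairs of points in [-5, 5]^2."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(n_samples):
        x = [rng.uniform(-5, 5), rng.uniform(-5, 5)]
        y = [rng.uniform(-5, 5), rng.uniform(-5, 5)]
        worst = min(worst, cocoercivity_gap(x, y))
    return worst
```

Sampling cannot prove the inequality, but a negative gap for a trial value of `sigma` is enough to show that the value is too large; for instance, `sigma=0.5` fails for the pair `[1, 1]` and `[0, 0]`.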
Assumption 2 is a standard assumption in the literature of NE seeking. Assumption 3 is relatively stronger than the (strong) monotonicity of the game map (pseudo-gradient vector) (see Zhu and Frazzoli (2016); Koshal et al. (2012)). However, as we will show, this leads to an algorithm with the benefits of ADMM algorithms (speed).
Our objective is to find an ADMM-like algorithm for computing an NE of $\mathcal{G}$ using only imperfect information over the communication graph $G_C$. (ADMM, or the alternating direction method of multipliers, also known as Douglas-Rachford splitting, is a method of solving an optimization problem where the objective function is a summation of two convex (possibly non-smooth) functions. For a detailed explanation see Parikh et al. (2014).)
We propose a distributed algorithm, using an inexact consensus ADMM. (In an inexact consensus ADMM, instead of solving an optimization sub-problem exactly, a method of approximation is employed to reduce the complexity of the sub-problem.) We obtain an NE of $\mathcal{G}$ by solving the set of problems (5) by an ADMM-like approach.
The mechanism of the algorithm can be briefly explained as follows: Each player maintains an estimate of the actions of all players and locally communicates with his neighbors over $G_C$. Then, he takes the average of his neighbors’ information and uses it to update his estimates.
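The averaging mechanism alone can be illustrated as follows. This is a plain neighbor-averaging round, deliberately simpler than the paper's estimate update: it omits the multiplier terms and also averages each player's own component, which the actual algorithm updates separately. The function names and the list-of-lists data layout are assumptions.

```python
def averaging_round(estimates, neighbors):
    """One synchronous round: every player replaces his estimate vector by the
    average of his own estimate and the estimates received from his neighbors."""
    n = len(estimates)
    new = []
    for i in range(n):
        group = [i] + sorted(neighbors[i])      # player i and his neighbors
        dim = len(estimates[i])
        new.append([sum(estimates[j][k] for j in group) / len(group)
                    for k in range(dim)])
    return new


def disagreement(estimates):
    """Largest componentwise spread between any two players' estimates."""
    dim = len(estimates[0])
    return max(max(e[k] for e in estimates) - min(e[k] for e in estimates)
               for k in range(dim))


# Path graph 0 - 1 - 2 - 3; every player starts from a different estimate.
neighbors = [{1}, {0, 2}, {1, 3}, {2}]
estimates = [[float(i + k) for k in range(4)] for i in range(4)]
for _ in range(100):
    estimates = averaging_round(estimates, neighbors)
```

On a connected graph, repeated rounds drive the players' estimates toward agreement, which is the role the equality constraints between neighbors play in the reformulated problems.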
The algorithm is elaborated in the following steps:
1- Initialization Step: Each player maintains an initial estimate for all players, . The initial values of and are set to be zero for all , .
2- Communication Step: At iteration $k$, each player $i \in V$ exchanges his estimate of the other players’ actions with his neighbors $j \in N_i$. Then, he takes the average of the received information with his estimate and updates his estimate as follows:
where is a scalar coefficient, and ,
Equations (13), (14) are the dual Lagrange multipliers update rules. Note that in (12), a penalty factor is subtracted, which is associated with the difference between the estimates of the neighboring players (Equations (13), (14)).
Unlike distributed optimization algorithms where the minimization is w.r.t. $x$, here each player minimizes his cost function w.r.t. $x_i$. To update $x_i$, each player requires the estimate of the other players’ actions, $x^i_{-i}$, at each iteration. Thus, the communication step is necessary in order to update $x^i_{-i}$ for the next iteration.
3- Action Update Step
At this moment all the players update their actions via an ADMM-like approach developed as follows. For each player , let the augmented Lagrange function associated to problem (5) be as follows:
where is a scalar coefficient which is also used in (12), (13) and (14). Consider the ADMM algorithm associated to problem (5) based on this augmented Lagrangian:
The update rule for the auxiliary variable follows from the ADMM iteration above,
We simplify the action update in the ADMM iteration by using a proximal first-order approximation of player $i$’s cost function around the previous iterate; thus, using inexact ADMM, it follows:
where is a penalty factor for the proximal first-order approximation of each player ’s cost function.
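The effect of the proximal first-order approximation can be seen in isolation. The sketch below replaces a scalar cost by its linearization plus a quadratic proximal term with penalty `beta` and minimizes over an interval, which reduces to a projected gradient step with step size `1/beta`. It omits the multiplier and consensus terms of the paper's update (20); the cost $(x - 3)^2$, the intervals, and all names are assumptions for illustration.

```python
def project(x, lo, hi):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return min(max(x, lo), hi)


def linearized_prox_step(x_prev, grad, beta, lo, hi):
    """Minimize grad(x_prev)*(x - x_prev) + (beta/2)*(x - x_prev)**2 over
    [lo, hi]. The minimizer is a projected gradient step of size 1/beta."""
    return project(x_prev - grad(x_prev) / beta, lo, hi)


def run(grad, x0, beta, lo, hi, n_iter=60):
    x = x0
    for _ in range(n_iter):
        x = linearized_prox_step(x, grad, beta, lo, hi)
    return x


def grad_J(x):
    """Gradient of the hypothetical scalar cost J(x) = (x - 3)**2."""
    return 2.0 * (x - 3.0)


x_free = run(grad_J, x0=0.0, beta=4.0, lo=0.0, hi=10.0)  # constraint inactive
x_clip = run(grad_J, x0=0.0, beta=4.0, lo=0.0, hi=2.0)   # constraint active
```

Each step needs only one gradient evaluation and a projection instead of an exact inner minimization, which is the source of the computational savings; a larger `beta` gives smaller, more conservative steps.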
At this point, the players are ready to begin a new iteration from step 2. To sum up, the algorithm consists of (12), (13), (14) and (20) which are the update rule for the players’ estimates except their own actions, the update rules for the Lagrange multipliers and the update rule for player’s action, respectively.
Theorem 1. Let $\beta_{\min}$ be the minimum penalty factor of the approximation in the inexact ADMM algorithm which satisfies
where $\sigma_F$ is a positive constant for the cocoercive property of $F$, and $D$ and $A$ are the degree and adjacency matrices of $G_C$, respectively. Under Assumptions 1-3, the sequence $\{x^i(k)\}$, $\forall i \in V$, generated by the algorithm (12), (13), (14) and (20), converges to the NE of the game. (In order to have a fully distributed algorithm, one can consider a network-wide known lower bound on the penalty factors and use it instead of $\beta_{\min}$.)
Proof. The optimality condition of (20) yields:
We combine this optimality condition with (10), which represents the equations associated with the solutions of the set of problems (5) (the NE of the game). Then we obtain,
We multiply both sides by and then add and subtract as follows:
As discussed in Remark 3, in addition to updating their own actions, the players need to update their estimates as well. In the following, we explain how to bring the update rule of the estimates into the relation above.
Note that by (12), one can obtain,
Multiplying (25) by , one can arrive at,
Adding the two relations obtained above and using (13), (14) yields,
The second and the third terms are bounded as follows:
for any . By the convexity of (Assumption 2), we have for the fourth term,
Using (28) and (29) in the relation above and summing over $i \in V$, we obtain,
where and . We bound the first term using Assumption 3,