FPTAS for Mixed-Strategy Nash Equilibria in Tree Graphical Games and Their Generalizations
We provide the first fully polynomial time approximation scheme (FPTAS) for computing an approximate mixed-strategy Nash equilibrium in tree-structured graphical multi-hypermatrix games (GMhGs). GMhGs are generalizations of normal-form games, graphical games, graphical polymatrix games, and hypergraphical games. Computing an exact mixed-strategy Nash equilibrium in graphical polymatrix games is PPAD-complete and thus generally believed to be intractable. In contrast, to the best of our knowledge, we are the first to establish an FPTAS for tree polymatrix games as well as tree graphical games when the number of actions is bounded by a constant. As a corollary, we give a quasi-polynomial time approximation scheme (quasi-PTAS) when the number of actions is bounded by the logarithm of the number of players.
For over a decade, graphical games have been at the forefront of computational game theory. In a graphical game, a player’s payoff is directly affected by her own action and those of her neighbors. This large class of games has played a critical role in establishing the hardness of computing a Nash equilibrium in general games (?). It has also generated a great deal of interest in the AI community since ? (?) drew a parallel with probabilistic graphical models in terms of succinct representation by exploiting the network structure. As a result, this is one of the select topics in computer science that has triggered a confluence of ideas from the theoretical computer science and AI communities.
This paper contributes to this development by providing the first fully polynomial-time approximation scheme (FPTAS) for approximate Nash equilibrium computation in a generalized class of tree graphical games. Tree-structured interactions are natural in hierarchical settings. As often visualized in the ubiquitous organizational chart of bureaucratic structures (?), hierarchical organizations are arguably the most common managerial structures still found around the world, particularly in large corporations and governmental institutions (e.g., military), as well as in many social and religious institutions. Supply chains are also commonplace, such as in agriculture (see, e.g., ?). Even within the context of energy grids, the traditional electric power generation, transmission, and distribution systems are tree-structured, and are commonly modeled mathematically and computationally as such (see ?, for a recent example).
Our algorithm eliminates the exponential dependency on the maximum degree of a node, a problem that has plagued research for 15 years since the inception of graphical games (?).
More generally, we consider the problem of computing approximate MSNE in GMhGs, as defined by ? (?). We refer the reader to Table 1 for a list of acronyms used throughout this paper. Roughly speaking, in a GMhG, each player’s payoff is the sum of several local payoff hypermatrices defined with respect to each individual player’s local hypergraph. GMhGs generalize normal-form games, graphical games (?, ?), graphical polymatrix games, and hypergraphical games (?). For approximate MSNE, we adopt the standard notion of $\epsilon$-MSNE (also known as $\epsilon$-approximate MSNE), an additive (as opposed to relative) approximation scheme widely used in algorithmic game theory (?, ?, ?, ?).
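To make the additive notion concrete, the following minimal Python sketch checks whether a joint mixed strategy is an $\epsilon$-MSNE of a small polymatrix game, under the usual reading that no player can gain more than $\epsilon$ in expected payoff by a unilateral deviation. The dictionary encoding of the game and the function names (`action_values`, `regret`, `is_eps_msne`) are illustrative assumptions, not notation from this paper.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (an assumption for illustration):
# payoff[(i, j)][a_i][a_j] is what player i receives from neighbor j
# when i plays a_i and j plays a_j; strategy[i] is i's mixed strategy.
Payoff = Dict[Tuple[int, int], List[List[float]]]
Strategy = Dict[int, List[float]]


def action_values(i: int, payoff: Payoff, strategy: Strategy) -> List[float]:
    """Expected payoff of each pure action of player i against the others."""
    values = [0.0] * len(strategy[i])
    for (owner, other), matrix in payoff.items():
        if owner != i:
            continue
        for a in range(len(values)):
            values[a] += sum(matrix[a][b] * q for b, q in enumerate(strategy[other]))
    return values


def regret(i: int, payoff: Payoff, strategy: Strategy) -> float:
    """Largest additive gain player i can obtain by deviating unilaterally."""
    values = action_values(i, payoff, strategy)
    current = sum(p * v for p, v in zip(strategy[i], values))
    return max(values) - current


def is_eps_msne(payoff: Payoff, strategy: Strategy, eps: float) -> bool:
    """True if every player's regret is at most eps (small float tolerance)."""
    return all(regret(i, payoff, strategy) <= eps + 1e-12 for i in strategy)
```

Because a best response can always be found among the pure actions, comparing against the best pure action is enough to bound the gain from any mixed deviation.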
In this paper, we provide an FPTAS and a quasi-PTAS for GMhGs in which the individual player’s number of actions and the hypertree-width of the underlying game hypergraph are bounded. The key to our solution is the formulation of a CSP such that any solution to this CSP is an $\epsilon$-MSNE of the game. This raises two challenging questions: Will the CSP have any solution at all? If it has a solution, how can we compute it efficiently? Regarding the first question, we discretize both the probability space and the payoff space of the game to guarantee that for any MSNE of the game (which always exists), the nearest grid point is a solution to the CSP. For the second question, we give a DP algorithm that is an FPTAS when the number of actions and the hypertree-width are bounded by constants. Most remarkably, this algorithm eliminates the exponential dependency on the largest neighborhood size of a node, which has plagued previous research on this problem.
2 Related Work
In this section, we provide a brief overview of the previous computational complexity and algorithmic results for the problem of $\epsilon$-MSNE computation (additive approximation scheme as most commonly defined in game theory) in general. A full account of all specific sub-classes of GMhGs such as normal-form games and (standard) graphical games is beyond the scope of this paper, just as is the discussion on (a) other types of approximations such as the less common relative approximation; (b) other popular equilibrium-solution concepts such as pure-strategy Nash equilibria and correlated equilibria (?, ?); and (c) other quality guarantees of solutions, including exact MSNE and “well-supported” approximate MSNE.
Table 1: Acronyms used throughout this paper.
|Acronym|Meaning|
|---|---|
|CSP|Constraint Satisfaction Problem|
|FPTAS|Fully Polynomial Time Approx. Scheme|
|GMhG|Graphical Multi-hypermatrix Game|
|MSNE|Mixed-Strategy Nash Equilibrium|
|Quasi-PTAS|Quasi-Polynomial Time Approx. Scheme|
The complexity status of normal-form games is well understood today, thanks to a series of seminal works (?, ?) that culminated in the PPAD-completeness of 2-player multi-action normal-form games, also known as bimatrix games (?). Once the complexity of exact MSNE computation was established, the spotlight naturally fell on approximate MSNE, especially in succinctly representable games such as graphical games. ? (?) showed that bimatrix games do not admit an FPTAS unless PPAD $\subseteq$ P. This result left open the question of whether a PTAS exists.
There has been a series of results on constant-factor approximations. The current best polynomial-time guarantee is a 0.3393-approximation for bimatrix games (?), which can be extended to three- and four-player games with approximation guarantees of 0.6022 and 0.7153, respectively. Note that sub-exponential algorithms for computing $\epsilon$-MSNE in games with a constant number of players were known prior to all of these results (?). As a result, it is unlikely that the case of a constant number of players is PPAD-complete. Along that line, ? (?) considered the hardness of computing $\epsilon$-MSNE in $n$-player succinctly representable games such as general graphical games and graphical polymatrix games. He showed that there exists a constant $\epsilon > 0$ such that finding an $\epsilon$-MSNE in a 2-action graphical polymatrix game with a bipartite structure and a maximum degree of 3 is PPAD-complete. ? (?) showed the hardness of bimatrix games for a polynomially small $\epsilon$, and ? (?) showed the hardness (in this case, PPAD-completeness) of $n$-player polymatrix games for a constant $\epsilon$.
On a positive note, ? (?) presented an algorithm for computing a $(0.5+\delta)$-MSNE of an $n$-player polymatrix game. Their algorithm runs in time polynomial in the input size and $1/\delta$. Very recently, ? (?) gave a quasi-polynomial time randomized algorithm for computing an $\epsilon$-MSNE in tree-structured polymatrix games. They assumed that the payoffs are normalized so that the local payoff of any player $i$ from any other player $j$ lies in $[0, 1/d_i]$, where $d_i$ is the degree of $i$. This guarantees, in a strong way, that the total payoff of any player is in $[0,1]$. In comparison, we do not make the assumption of local payoffs lying in $[0, 1/d_i]$. Also, our algorithm is a deterministic FPTAS when the number of actions is bounded by a constant.
Closely related to our work, ? (?) gave a framework for sparsely discretizing probability spaces in order to compute $\epsilon$-MSNE in tree-structured GMhGs. The time complexity of the resulting algorithm depends on when is bounded by a constant. Ortiz’s result is a significant step forward compared to ? (?)’s algorithm in the foundational paper on graphical games. In the latter work, the time complexity depends on when is bounded by a constant. Both of these algorithms are exponential in the representation size of succinctly representable games such as graphical polymatrix games. Compared to these works, our algorithm eliminates the exponential dependency on the maximum neighborhood size. Furthermore, compared to Ortiz’s work, we discretize both probability and payoff spaces in order to achieve an FPTAS. This joint discretization technique is novel for this large class of games and has great potential for other types of games.
Hardness of Relaxing Key Restrictions.
We use two restrictions: (1) our focus is on GMhGs (e.g., graphical polymatrix games) with tree structure, and (2) our FPTAS for $\epsilon$-MSNE computation hinges on the assumption that the number of actions is bounded by a constant. We next discuss what happens if we relax either of these two restrictions.
Tree-structured polymatrix games with unrestricted number of actions: A bimatrix game is basically a tree-structured polymatrix game with two players. ? (?) showed that there exists no FPTAS for bimatrix games with an unrestricted number of actions unless all problems in PPAD are polynomial-time solvable. In this paper, we bound the number of actions by a constant. We should also note the main motivation behind graphical games, as originally introduced by ? (?): compact/succinct representations where the representation sizes do not depend exponentially in , but are instead exponential in and linear in . As ? (?) stated, if , we obtain exponential gains in representation size. Thus, it is and the parameters of main interest in standard graphical games; the parameter is of secondary interest. Indeed, even ? (?) concentrate on the case of .
Graphical (not necessarily tree-structured) polymatrix games with bounded number of actions: ? (?) showed that for and , computing an -MSNE for an -player game is PPAD-hard. This hardness proof involves the construction of graphical (non-tree) polymatrix games. Therefore, the result carries over to -player graphical polymatrix games. This lower bound result shows that graph structures that are more complex than trees are intractable (under standard assumptions) even for constant and small but constant .
3 Preliminaries, Background, and Notation
Denote by an -dimensional vector and by the same vector without the -th component. Similarly, for every set , denote by the (sub-)vector formed from using exactly the components of . denotes the complement of , and for every . If are sets, denote by , and . To simplify the presentation, whenever we have a difference of a set with a singleton set , we often abuse notation and denote by .
3.1 GMhG Representation
A graphical multi-hypermatrix game (GMhG) is defined by a set of players and the following for each player :
a set of actions or pure strategies ;
a set of local cliques or local hyperedges such that if then , and two additional sets defined based on :
’s neighborhood (the set of players, including , that affect ’s payoff) and
(the set of players, not including , affected by );
a set of local-clique payoff matrices; and
the local and global payoff matrices and of defined as and , respectively.
We denote by and the number of hyperedges of player and the maximum number of hyperedges over all players, respectively. Similarly, we denote by and the size of the biggest hyperedge of player and the size of the biggest hyperedge over all players, respectively. Also, for consistency with the graphical games literature, we denote by and the size of the neighborhood of the primal graph induced by the local hyperedges of and the maximum neighborhood size over all players, respectively.
Fig. 1 illustrates some of the above terminology. The GMhG shown there (without the actual payoff matrices) is not a graphical game, because in a graphical game each must be a singleton (i.e., only one local hyperedge for each node , which corresponds to ). This GMhG is not a polymatrix game either, because not all local hyperedges consist of only 2 nodes. Furthermore, the GMhG is not a hypergraphical game (?), because the local hyperedges are not symmetric (player 1’s local hyperedge has 2 in it, but 2’s local hyperedge does not have 1).
The representation sizes of GMhGs, polymatrix games, and graphical games are , , and , respectively.
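As an illustration of the representation described above, the sketch below stores, for each player, a list of local cliques with one payoff table per clique, and evaluates the local payoff as the sum of the clique payoffs. The encoding (`cliques`, tuples of member ids, dictionaries keyed by action tuples) is a hypothetical choice made for this example only.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (an assumption for illustration):
# cliques[i] is a list of (members, table) pairs. `members` is a tuple of
# player ids that contains i, and `table` maps a tuple of the members'
# actions (in the same order) to the local-clique payoff of player i.
Clique = Tuple[Tuple[int, ...], Dict[Tuple[int, ...], float]]


def local_payoff(i: int, cliques: Dict[int, List[Clique]],
                 joint_action: Dict[int, int]) -> float:
    """Local payoff of player i: the sum of its local-clique payoffs."""
    return sum(table[tuple(joint_action[j] for j in members)]
               for members, table in cliques[i])


def expected_local_payoff(i: int, cliques: Dict[int, List[Clique]],
                          strategy: Dict[int, List[float]]) -> float:
    """Expected local payoff under independent mixed strategies.

    By linearity of expectation, each clique is handled separately, so the
    work is exponential only in the size of the largest hyperedge.
    """
    total = 0.0
    for members, table in cliques[i]:
        for actions, value in table.items():
            prob = 1.0
            for j, a in zip(members, actions):
                prob *= strategy[j][a]
            total += prob * value
    return total
```

In this encoding a polymatrix game is the special case in which every clique has exactly two members, and a graphical game is the case in which each player has a single clique covering its whole neighborhood.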
Normalizing the Payoff Scale.
The dominant mode of approximation in game theory is additive approximation (?, ?, ?, ?). For $\epsilon$ to be truly meaningful as a global additive approximation parameter, the payoffs of all players must be brought to the same scale. The convention in the literature (see, e.g., ?) is to assume that (1) each player’s local payoffs are spread between 0 and 1, with the local payoff being exactly 0 for some joint action and exactly 1 for another; and (2) the local-clique payoffs (i.e., entries in the payoff matrices) are between 0 and 1. Here, we relax the second assumption; that is, we can handle matrix entries that are negative or larger than 1. Indeed, because of the additive nature of the local payoffs in GMhGs, the “$[0,1]$ assumption” on those payoffs may require that some of the local-clique payoffs contain values $< 0$ or $> 1$. This is a key aspect of payoff scaling, and in turn of the approximation problem, that often does not get proper attention. We make the much milder assumption that the maximum spread of local-clique payoffs (or matrix entries) of each player is bounded by a constant. We allow this constant to be different for different players.
Note that the equilibrium conditions are invariant to positive affine transformations of each player’s payoffs. In the case of graphical games with local payoff matrices represented in tabular/matrix/normal form, it is conventional to assume that the maximum and minimum local payoff values of each player are $1$ and $0$, respectively. This assumption is without loss of generality, because for any general graphical game, we can find the minimum and maximum local payoffs of each player efficiently and thereby make these $0$ and $1$, respectively, through affine transformations.
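For a payoff table given explicitly, the normalization just described amounts to a single pass over the entries. The sketch below is a minimal illustration under the assumption that one player's local payoffs are stored in a dictionary keyed by joint actions; the name `normalize_local_payoffs` is hypothetical.

```python
from typing import Dict, Hashable


def normalize_local_payoffs(table: Dict[Hashable, float]) -> Dict[Hashable, float]:
    """Rescale one player's tabular local payoffs so that min = 0 and max = 1.

    The map v -> (v - lo) / (hi - lo) is a positive affine transformation, so
    it preserves the ordering of expected payoffs and hence best responses.
    """
    lo, hi = min(table.values()), max(table.values())
    if hi == lo:
        # Degenerate case: the player is indifferent among all joint actions.
        return {key: 0.0 for key in table}
    scale = hi - lo
    return {key: (value - lo) / scale for key, value in table.items()}
```

Note that the rescaling changes the meaning of a fixed additive error: an additive gain of $\epsilon$ after normalization corresponds to a gain of $\epsilon$ times the original payoff spread before it, which is why a common scale matters for a global $\epsilon$.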
While doing this for GMhGs in general is intractable in the worst case, it is computationally efficient for GMhGs whose local hypergraphs have bounded hypertree-widths. For instance, the payoffs of a graphical polymatrix game can be normalized in polynomial time to achieve the first assumption above. To do that, we define the following terms.
It is evident from the last expression that we can efficiently compute each of those values for each via dynamic programming (DP) in time , and compute all the values for all in time .
Despite such exceptions, in general, we do not have much of a choice but to assume that the payoffs of all players are on the same scale, so that using a global $\epsilon$ is meaningful. For any local-clique payoff hypermatrix , we define the following notation on the maximum payoff, minimum payoff, and the largest spread of payoffs in that hypermatrix, respectively.
The following example shows that restricting the values of the local-clique hypermatrices to $[0,1]$ while keeping the maximum and minimum values of the local payoff functions of each player to be $1$ and $0$, respectively, loses generality (e.g., some local-clique payoffs may be negative). The reason is that for some games there is no affine transformation that would satisfy both of these conditions while maintaining exactly the same equilibrium conditions. Let , and .
4 Discretization Scheme: Simple Version
In contrast with earlier discretization schemes (?), we allow different discretization sizes for different players. Also, in contrast with recent schemes (?), we discretize both the probability space (Definition 2) and the payoff space (Definition 3).
(Individually-uniform mixed-strategy discretization scheme) Let be the uncountable set of the possible values of the probability of each action of each player . Discretize by a finite grid defined by the set with interval for some integer . Thus the mixed-strategy-discretization size is . We only consider mixed strategies such that for all , and . The induced mixed-strategy discretized space of joint mixed strategies is , subject to the individual normalization constraints.
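The following sketch enumerates such a discretized set of mixed strategies for a single player. It assumes, as an illustration, that the grid interval is $1/k$ for an integer $k$, so that every probability is a multiple of $1/k$ and the entries sum to one; the parameter name `k` and the function names are not taken from the paper.

```python
from fractions import Fraction
from math import comb
from typing import Iterator, Tuple


def grid_mixed_strategies(num_actions: int, k: int) -> Iterator[Tuple[Fraction, ...]]:
    """Yield every mixed strategy whose probabilities are multiples of 1/k."""

    def compositions(remaining: int, slots: int) -> Iterator[Tuple[int, ...]]:
        # Ways to write `remaining` as an ordered sum of `slots` non-negative ints.
        if slots == 1:
            yield (remaining,)
            return
        for first in range(remaining + 1):
            for rest in compositions(remaining - first, slots - 1):
                yield (first,) + rest

    for counts in compositions(k, num_actions):
        yield tuple(Fraction(c, k) for c in counts)


def grid_size(num_actions: int, k: int) -> int:
    """Number of grid strategies (stars and bars): C(k + m - 1, m - 1)."""
    return comb(k + num_actions - 1, num_actions - 1)
```

Exact fractions are used so that the normalization constraint holds without floating-point error. The count grows polynomially in `k` only when the number of actions is fixed, which is consistent with the role that a bounded number of actions plays in the results.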
(Individually-uniform expected-payoff discretization scheme) Let . Define the following two terms.
(The last inequality above considers the cases of negative and non-negative .)
Let denote an interval containing every possible expected payoff values that each player can receive from each local-clique payoff matrix , where (i.e., is in the grid). Discretize by a finite grid defined by the set with interval for some integer , where . Thus the expected-payoff-discretization size is . Then, for any , we would only consider an expected-payoff in the discretized grid that is closest to the exact local-clique expected payoff . More formally, . The induced expected-payoff discretized space over all local-cliques of all players is .
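A small sketch of the expected-payoff grid may help fix ideas. It assumes a uniform grid over a closed interval `[lower, upper]` split into `num_intervals` equal steps and rounds an exact expected payoff to the closest grid point; the names and the choice to clamp out-of-range values are assumptions made for illustration.

```python
from typing import List


def make_payoff_grid(lower: float, upper: float, num_intervals: int) -> List[float]:
    """Uniform grid with num_intervals + 1 points covering [lower, upper]."""
    step = (upper - lower) / num_intervals
    return [lower + t * step for t in range(num_intervals + 1)]


def snap_to_grid(value: float, lower: float, upper: float, num_intervals: int) -> float:
    """Return the grid point closest to `value`.

    Values outside [lower, upper] are clamped to the nearest end point. For
    values inside the interval the rounding error is at most half a step.
    """
    step = (upper - lower) / num_intervals
    index = round((value - lower) / step)
    index = max(0, min(num_intervals, index))
    return lower + index * step
```

The half-step bound on the rounding error is what allows the total error over all local cliques of a player to be controlled by choosing the step small enough relative to $\epsilon$.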
? (?) use a similar idea in the setting of interdependent defense (IDD) games, where each of sites has a binary pure-strategy set, and a specific instance of the general setting in which the attacker has pure strategies. The attacker has pure strategies because, in the particular instance of IDD games that ? (?) consider, the attacker can attack at most one site at a time. In contrast, the potential multiplicity of actions of all players poses one of the main challenges in our case, particularly because of the non-tabular/non-normal-form representation of general GMhGs, which is exponential in the size of the largest hyperedge over all players’ neighborhood hypergraphs.
5 A GMhG-Induced CSP: Simple Version
Consider the following CSP induced by a GMhG:
Variables: for all and , a variable corresponding to the mixed-strategy/probability that player plays pure strategy and, for all , a variable corresponding to some partial sum of the expected payoff of player based on an ordering of the local hyperedge elements of . Formally, if and , then the set of all variables is .
Domains: the domain of each variable is , while that of each partial-sum variable is .
Constraints: for each :
Best-response and partial-sum expected local-clique payoff: We first compute a hyper-tree decomposition of the local hypergraph induced by hyperedges . We then order the set of local-cliques of each player such that . The superscript denotes the corresponding order of the local-cliques of player . We make sure that the order is consistent with the hypertree decomposition of the local hypergraph, in the standard (non-serial) DP-sense used in constraint and probabilistic graphical models (?, ?). For any :
, and for ,
We call (a) the best-response constraint and (b) the partial-sum expected local-clique payoff constraint.
The number of variables of the CSP is . The size of each domain is , where . The size of each domain is , where . The computation of each in 1(b) above, which takes time , dominates the running time to build the constraint set. The total number of constraints is . The maximum number of variables in any constraint is . Given a hyper-tree decomposition, the amount of time to build the constraint set using a tabular representation is , which is the representation size of the GMhG-induced CSP.
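The partial-sum variables can be pictured as a running total over a player's ordered local cliques, with each local-clique expected payoff first rounded to the payoff grid. The sketch below assumes, for simplicity, a grid made of integer multiples of `step` and works with integer grid indices; this anchoring of the grid at zero and the function names are assumptions for illustration.

```python
from typing import List, Sequence


def discretized_partial_sums(clique_payoffs: Sequence[float], step: float) -> List[int]:
    """Running sums of rounded local-clique payoffs, in units of `step`.

    Entry l is the grid index of the partial sum over the first l + 1 cliques,
    so each entry depends only on the previous one and on a single clique.
    """
    sums: List[int] = []
    running = 0
    for value in clique_payoffs:
        running += round(value / step)  # nearest grid index for this clique
        sums.append(running)
    return sums


def rounding_error_bound(num_terms: int, step: float) -> float:
    """Each rounded term is off by at most step / 2, so errors add up linearly."""
    return num_terms * step / 2
```

Chaining the sum in this way is what keeps every constraint small: a constraint links one partial-sum variable to its predecessor and to the strategies of one clique, instead of linking the strategies of the entire neighborhood at once.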
5.1 Correctness of the GMhG-Induced CSP
We use the following Lemma of ? (?). Note that our results do not follow directly from this Lemma, since we also discretize the payoff space. Furthermore, for tree-structured polymatrix games, ? (?)’s running time depends on when is bounded by a constant, whereas ours is polynomial in the maximum neighborhood size .
(Sparse MSNE Representation Theorem) For any GMhG and any such that
a (uniform) discretization with
for each player is sufficient to guarantee that for every MSNE of the game, its closest (in distance) joint mixed strategy in the induced discretized space is also an -MSNE.
We next present our sparse-representation theorem, where we discretize the partial sums of expected local-clique payoffs.
(Sparse Joint MSNE and Expected-Payoff Representation Theorem) Consider any GMhG and any ,
Setting, for all players , the pair defining the joint (individually-uniform) mixed-strategy and expected-payoff discretization of player such that
so that the discretization sizes
for each mixed-strategy probability and expected payoff value, respectively, is sufficient to guarantee that for every MSNE of the game, its closest (in distance) joint mixed strategy in the induced discretized space is a solution of the GMhG-induced CSP, and that any solution to the GMhG-induced CSP (in discretized probability and payoff space) is an -MSNE of the game.
For the first part of the theorem, let be an MSNE of the GMhG. Let be the mixed strategy closest, in , to in the grid induced by the combination of the discretizations that each generates. For all and , set ; and for all and , first set , and then recursively for , set . The resulting assignment satisfies the normalization constraint of the CSP, by the definition of a mixed strategy. The assignment also satisfies the partial-sum expected local-clique payoffs by construction. Thus, we are left to prove that the best-response constraint is satisfied. By the setting of and Lemma 1, we have that is an -MSNE, and thus also an -MSNE. In addition, for all and , we have the following sequence of inequalities:
By the definition of , for all and , we have that for all and ,
Applying the last inequality to (1) and by unraveling the construction of the CSP assignment, we have
Rearranging the terms, and plugging in we get
Hence, the assignment also satisfies the best-response constraints (1(a) of CSP) and is a solution to the GMhG-induced CSP.
Now, for the second part of the theorem, suppose is a solution of the GMhG-induced CSP. Then, by the combination of the best-response and partial-sum expected local-clique payoff constraints, we have that, for all and ,
This in turn implies that for all and , we can obtain the following sequence of inequalities:
Hence, the corresponding joint mixed-strategy is an $\epsilon$-MSNE of the GMhG. ∎
Within the context of Theorem 1, we have
where and . If all the ranges ’s are bounded by a constant, then
First, when all the ranges ’s are bounded by a constant, we have . Furthermore, . When , and hence . For the other case of , . Since is bounded by a constant and , must also be bounded by a constant and hence . Therefore, . Since both and are , we obtain the bounds on and . ∎
Note that if is bounded by a constant, then .
6 CSP-Based Computational Results
The CSP formulation in the previous section leads us to the following computational results based on well-known algorithms for solving CSPs (?, Ch. 5), and the application of equally well-known computational results for them (?, ?, ?).
There exists an algorithm that, given as input a number and an -player GMhG with maximum local-hyperedge-set size and maximum number of actions , and whose corresponding CSP has a hypergraph with hypertree-width , computes an -MSNE of the GMhG in time .
For GMhGs with bounded hypertree width , the following corollary establishes our main CSP-based result.
There exists an algorithm that, given as input a GMhG with bounded , outputs an $\epsilon$-MSNE in time polynomial in the size of the input and $1/\epsilon$, for any $\epsilon > 0$; hence, the algorithm is an FPTAS. If, instead, we have , then the algorithm is a quasi-PTAS.
Theorem 2 also implies that we can compute an -MSNE of a tree-structured polymatrix game in . Note that the running time is polynomial in the maximum neighborhood size .
The following results are in terms of the primal-graph representation of the GMhG-induced CSP.
There exists an algorithm that, given as input a number and an -player GMhG with maximum number of actions , primal-graph treewidth of the corresponding CSP, maximum local-hyperedge-set size , and maximum local-hyperedge size , computes an -MSNE of the game in time .
There exists an FPTAS for computing an approximate MSNE in -player GMhGs with corresponding , , and all bounded by constants, independent of , and primal-graph treewidth .
There exists an algorithm that, given as input an -player polymatrix GG with a tree graph, maximum neighborhood size , and maximum number of actions , computes an -MSNE of the polymatrix GG in time . If is bounded by a constant, then the algorithm is an FPTAS. If, instead, , then the algorithm is a quasi-PTAS.
7 DP for $\epsilon$-MSNE Computation
We present a DP algorithm in the context of the special, but still important, class of tree-structured polymatrix games. This is for simplicity and clarity and, as we later discuss, is without loss of generality. We first designate an arbitrary node as the root of the tree and define the notion of parent and child nodes as follows. For any node/player , we denote by the single parent of any non-root node in the tree and by the children of node in the root-designated-induced directed tree. If is the root, then is undefined. If is a leaf, then .
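The rooting step can be sketched as a breadth-first traversal of the undirected tree. The adjacency-list input and the function name `root_tree` are assumptions for illustration; the returned order lists every node after all of its children, which is the order a leaves-to-root pass needs.

```python
from collections import deque
from typing import Dict, Hashable, List, Optional, Tuple


def root_tree(adjacency: Dict[Hashable, List[Hashable]], root: Hashable
              ) -> Tuple[Dict[Hashable, Optional[Hashable]],
                         Dict[Hashable, List[Hashable]],
                         List[Hashable]]:
    """Orient an undirected tree away from `root`.

    Returns (parent, children, bottom_up_order). The root has parent None,
    leaves have an empty children list, and bottom_up_order visits each
    node only after all of its children.
    """
    parent: Dict[Hashable, Optional[Hashable]] = {root: None}
    children: Dict[Hashable, List[Hashable]] = {v: [] for v in adjacency}
    visited: List[Hashable] = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        visited.append(node)
        for neighbor in adjacency[node]:
            if neighbor not in parent:  # not discovered yet
                parent[neighbor] = node
                children[node].append(neighbor)
                queue.append(neighbor)
    # Reversing a breadth-first order places children before their parents.
    return parent, children, visited[::-1]
```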
The two-pass algorithm is similar in spirit to TreeNash (?), except that (1) here the messages are , instead of bits ; and (2) more distinctly, our algorithm implicitly passes messages about the partial-sum of expected payoffs across the siblings.
Collection Pass. For each non-root node , we denote by . We order as . We then apply the following DP bottom-up (i.e., from leaves to root). We give an intuition before giving the formal specification. The message is 0 iff it is “OK” for to play when ’s parent plays (the notion of OK recursively makes sure that ’s children are also OK). The message is 0 iff ’s best response to playing is , given that gets a combined payoff of from its children. The message can be thought of as being implicitly passed from ’s child to the next (and back to from the last child ). is 0 iff is the maximum payoff that can get from its first children when plays and those children are OK with that. Fig. 2 illustrates the message passing.
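The role of the messages passed across siblings can be illustrated with a much-simplified sketch. It is not the algorithm specified in this section: it ignores the best-response test and the "OK" messages, and it assumes (hypothetically) that, for a fixed pair of strategies of a node and its parent, each child can contribute one of a few payoff values that are already rounded to the grid and expressed as integer grid indices. The sketch only tracks which discretized partial sums are reachable after each child.

```python
from typing import Iterable, List, Set


def reachable_partial_sums(child_options: Iterable[Iterable[int]]) -> List[Set[int]]:
    """Propagate reachable discretized partial sums from child to child.

    child_options[k] holds the grid indices of the payoffs that the (k+1)-th
    child can contribute. The returned list has one set per child: entry k
    contains every partial sum reachable using the first k + 1 children.
    """
    reachable: Set[int] = {0}  # empty sum before any child is processed
    history: List[Set[int]] = []
    for options in child_options:
        reachable = {s + o for s in reachable for o in options}
        history.append(reachable)
    return history
```

Each set is contained in the range of possible partial sums, so its size is bounded by the number of grid points in that range. The work per child is therefore at most the grid size times the number of options, rather than a product over all children, which is the intuition behind avoiding an exponential dependence on the neighborhood size.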