Proof-theoretic Analysis of Rationality for Strategic Games with Arbitrary Strategy Sets
In the context of strategic games, we provide an axiomatic proof of the statement:
Common knowledge of rationality implies that the players will choose only strategies that survive the iterated elimination of strictly dominated strategies.
Rationality here means playing only strategies one believes to be best responses. This involves looking at two formal languages. One is first-order, and is used to formalise optimality conditions, like avoiding strictly dominated strategies, or playing a best response. The other is a modal fixpoint language with expressions for optimality, rationality and belief. Fixpoints are used to form expressions for common belief and for iterated elimination of non-optimal strategies.
There are two main sorts of solution concepts for strategic games: equilibrium concepts and what might be called “effective” concepts. One interpretation of the equilibrium concepts, for example Nash equilibrium, tacitly presupposes that a game is played repeatedly (see, e.g. [13, page 14]). Thus the standard condition for Nash equilibrium in terms of the knowledge or beliefs of the players  – the so-called “epistemic analysis” of Nash equilibrium – includes a requirement that players know the other players’ strategy choices.
Consider the left-hand game in Figure 1, in which each player has two choices and both players get a higher payoff if they coordinate than otherwise. Then there are two Nash equilibria, namely the two profiles in which the players coordinate.
In contrast, effective solution concepts, for example the iterated elimination of strictly dominated strategies, are compatible with such a “one-shot” interpretation of the game. Thus the epistemic analysis of the iterated elimination of strictly dominated strategies does not require that the players know each other’s strategy choice.
A strategy is strictly dominated if there is an alternative strategy such that no matter what the opponent does, is (strictly) better for than . Say that a player is -rational if he never plays a strategy that he believes to be strictly dominated. What the iterated elimination of strictly dominated strategies does in general require, see , is then that players have common true belief that each other is rational, that is: they are rational, believe that all are rational, believe that all believe that all are rational, etc.
In the right-hand game in Figure 1, the column player, on first looking at her choices or , is superficially in the same situation as before: choose and risk the opponent playing , or choose and risk the opponent playing . However, this time the row player can immediately dismiss playing on the grounds that will always be better, no matter what the column player does. So if the column player knows (or believes) this, then she cannot rationally play , and so must play .
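For the finite two-player case, the reasoning above can be sketched in Python as an iterated removal of strictly dominated pure strategies. The payoff table `PAYOFFS`, the strategy names and all function names are hypothetical and chosen only for illustration; they are not the payoffs of Figure 1. The sketch assumes dominance by pure strategies only, checked against the current restriction.

```python
from typing import Dict, Set, Tuple

Profile = Tuple[str, str]                    # (row strategy, column strategy)
Payoffs = Dict[Profile, Tuple[int, int]]     # profile -> (row payoff, column payoff)

# Hypothetical game: for the row player, U is strictly better than D.
PAYOFFS: Payoffs = {
    ("U", "L"): (2, 1), ("U", "R"): (1, 0),
    ("D", "L"): (1, 0), ("D", "R"): (0, 1),
}

def strictly_dominated(player: int, s: str, own: Set[str], opp: Set[str],
                       payoffs: Payoffs) -> bool:
    """True if some alternative in `own` beats `s` against every strategy in `opp`."""
    def u(mine: str, theirs: str) -> int:
        profile = (mine, theirs) if player == 0 else (theirs, mine)
        return payoffs[profile][player]
    return any(all(u(t, o) > u(s, o) for o in opp) for t in own if t != s)

def iterated_elimination(rows: Set[str], cols: Set[str],
                         payoffs: Payoffs) -> Tuple[Set[str], Set[str]]:
    """Remove strictly dominated strategies of both players until nothing changes."""
    rows, cols = set(rows), set(cols)
    while True:
        new_rows = {s for s in rows if not strictly_dominated(0, s, rows, cols, payoffs)}
        new_cols = {s for s in cols if not strictly_dominated(1, s, cols, rows, payoffs)}
        if (new_rows, new_cols) == (rows, cols):
            return rows, cols
        rows, cols = new_rows, new_cols

surviving = iterated_elimination({"U", "D"}, {"L", "R"}, PAYOFFS)
print(surviving)  # ({'U'}, {'L'})
```

In this hypothetical game the column player's strategy R only becomes dominated after the row player's D has been removed, which mirrors the two-step argument in the text.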
In this paper we study the logical form of epistemic characterisation results of this second kind, so we give formal proof-theoretic principles to justify some given effective or algorithmic process in terms of common belief of some form of rationality. We will introduce two formal languages. One, , is a first-order language, that can be used to define ‘optimality conditions’. Avoiding playing a strictly dominated strategy is an example of an ‘optimality condition’. Another one is choosing a best response.
However, as observed in  for all such notions there are two versions: ‘local’ and ‘global’. Notice that in our informal description of when is strictly dominated by we did not specify where is allowed to choose alternative strategies from. In particular, since we are thinking of an iterated procedure, if has been eliminated already then it would seem unreasonable to say that should consider it. That intuition yields the local definition; the global definition states the opposite: that player should always consider his original strategy set from the full game when looking to see if a strategy is dominated.
A motivation for looking at global versions of optimality notions is that they are often mathematically better behaved. On finite games the iterations for various local and global versions coincide , but on infinite games they can differ. In a nutshell: an optimality condition for player is global if does not ‘forget’, during the iterated elimination process, what strategies he has available in the whole game. The distinction is clarified in the respective definitions in .
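The difference between the two readings can be illustrated with a small Python sketch. It assumes a finite game and dominance by pure strategies; the payoff table `ROW_PAYOFF` and the function names are hypothetical. The only difference between the two checks is the set from which alternative strategies are drawn: the current restriction (local) or the strategy set of the full game (global).

```python
from typing import Callable, Dict, Iterable, Set, Tuple

Utility = Callable[[str, str], int]   # (own strategy, opponent strategy) -> payoff

def undominated(s: str, alternatives: Iterable[str], opponents: Iterable[str],
                u: Utility) -> bool:
    """True if no alternative is strictly better than `s` against all `opponents`."""
    opponents = list(opponents)
    return not any(all(u(t, o) > u(s, o) for o in opponents) for t in alternatives)

def sd_local(s: str, full_own: Set[str], restricted_own: Set[str],
             restricted_opp: Set[str], u: Utility) -> bool:
    # Local reading: alternatives come from the current restriction only.
    return undominated(s, restricted_own, restricted_opp, u)

def sd_global(s: str, full_own: Set[str], restricted_own: Set[str],
              restricted_opp: Set[str], u: Utility) -> bool:
    # Global reading: the player does not 'forget' the strategies of the full game.
    return undominated(s, full_own, restricted_opp, u)

# Hypothetical payoffs of the row player against a single column strategy L.
ROW_PAYOFF: Dict[Tuple[str, str], int] = {("A", "L"): 2, ("B", "L"): 1}

def u_row(own: str, opp: str) -> int:
    return ROW_PAYOFF[(own, opp)]

full, restricted_own, restricted_opp = {"A", "B"}, {"B"}, {"L"}
print(sd_local("B", full, restricted_own, restricted_opp, u_row))   # True
print(sd_global("B", full, restricted_own, restricted_opp, u_row))  # False
```

Here B is the only strategy left in the restriction, so it is trivially locally optimal, while globally it is still strictly dominated by the already eliminated A.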
An optimality condition induces an optimality operator on the complete lattice of restrictions (roughly: the subgames) of a given game. Eliminating non--optimal strategies can be seen as the calculation of a fixpoint of the corresponding operator . Furthermore, common belief is characterised as a fixpoint (cf. Note 3 below). Viewed from the appropriate level of abstraction, in terms of fixpoints of operators, this connection between common belief of rationality and the iterated elimination of non-optimal strategies becomes clear.
We define a language that describes things from this higher level of abstraction. Each optimality condition defines a corresponding notion of rationality, which means playing a strategy that one believes to be -optimal. is a modal fixpoint language with modalities for belief and optimality, and so can express connections between optimality, rationality and (common) belief.
We say that an operator $T$ on an arbitrary lattice $(D, \subseteq)$ is monotonic when for all $G, G' \in D$, if $G \subseteq G'$ then $T(G) \subseteq T(G')$. The global versions of relevant optimality operators, in particular of the operators corresponding to the best response and strict dominance, are monotonic. This can be verified immediately by observing that the relevant definition is positive.
Our first result is a syntactic proof of the following result, where $\phi$ is a monotonic optimality condition:
Common true belief of $\phi$-rationality entails that all played strategies survive the iterated elimination of non-$\phi$-optimal strategies.
Although this theorem relies on a rule for fixpoint calculi that is only sound for monotonic operators, the semantics of the language also allows for arbitrary contracting operators, i.e. operators $T$ such that $T(G) \subseteq G$ for all $G$. We are therefore able to look at what more is needed in order to justify the following statement (cf. [4, Proposition 3.10]), where rationality means avoiding strategies one believes to be never best responses in the global sense:
(Imp) Common true belief of rationality implies that the players will choose only strategies that survive the iterated elimination of strictly dominated strategies.
This theorem connects a global notion of rationality with a local one, referred to in the iterated elimination operator. Our language allows arbitrary contracting operators and their fixpoints to be formed, and we exhibit one sound rule connecting the resulting fixpoints with monotonic fixpoints.
Our theorems hold for arbitrary games, and the resulting potentially transfinite iterations of the elimination process. The syntactic approach clarifies the logical underpinnings of the epistemic analysis. It shows that the use of transfinite iterations can be naturally captured in the modal fixpoint language, at least when the relevant operators are monotonic, by a single inference rule that involves greatest fixpoints.
The relevance of monotonicity in the context of epistemic analysis of finite strategic games has already been pointed out in , where the connection is also noted between the iterated elimination of non-optimal strategies and the calculation of the fixpoint of the corresponding operator.
To our knowledge, although several languages have been suggested for reasoning about strategic games (e.g. ), none use explicit fixpoints (except, as we mentioned, for some suggestions in ) and none use arbitrary optimality operators.
Therefore they are not appropriate for reasoning at the level of abstraction that we suggest when studying the epistemic foundations of these “effective” solution concepts. For example, while [7, Section 13] does provide some analysis of the logical form of the argument that common knowledge of one kind of rationality implies not playing strategies that are strictly dominated, the fixpoint reasoning is done at the meta-level. What [7] provides is a proof schema that shows how, for any finite game and any natural number $n$, to give a proof that common knowledge of rationality entails not playing strategies that are eliminated in $n$ rounds of elimination of non-optimal strategies.
The more general and elegant reasoning principle is captured by using fixpoint operators and optimality operators. Another important advantage to our approach is that we are not restricted in our analysis to finite games. This means in particular that our logical analysis covers the mixed extension of any finite game.
2 Games and the language
A strategic game is a tuple , where are the players and each is player ’s set of strategies, and is player ’s preference relation, which is a total linear order over the set of strategy profiles . Note that we assume arbitrary games, rather than restricting to games in which is finite. To depict games it is sometimes easier, as we did in Figure 1, to write down a number for the players’ “payoffs”, rather than just a preference ordering. We use some standard notation from game theory, writing for and for the strategy profile , as well as for . A restriction of the game is a sequence with for all players , i.e. a (possibly empty) subgame in which the payoff information is left out.
The language we use for specifying optimality conditions is a first-order language, with variables , a monadic predicate , a constant and a family of ternary relation symbols , where . So is given by the following inductive definition:
where and .
We use the standard abbreviations and , further abbreviate to , to , to , and to .
An optimality model is a triple consisting of a strategic game , a restriction of , and a strategy profile . will be used to interpret the predicate , and will be the interpretation of . An assignment for is a function assigning a strategy profile in to each variable, and to . The ternary satisfaction relation between optimality models, assignments and formulas of is defined inductively as follows, where is an assignment for , and the complement of :
If for any assignment for we have then we write . A variable occurs free in if it is not under the scope of a quantifier ; a formula is closed if it has no free variables.
An optimality condition for player is a closed -formula in which all the occurrences of the atomic formulas are with equal to . Intuitively, an optimality condition for player is a way of specifying what it means for ’s strategy in to be an ‘OK’ choice for given that ’s opponents will play according to and that ’s alternatives are .
In particular, we are interested in the following optimality conditions:
The optimality conditions listed define some fundamental notions from game theory: says that is not locally strictly dominated in the context of ; says that is not globally strictly dominated in the context of ; and says that is globally a best response in the context of .
The distinction between local and global properties, studied further in , is clarified below. It is important for us here because the global versions, in contrast to the local ones, satisfy a syntactic property to be defined shortly.
First, as an illustration of the difference between and , consider the game in Figure 2.
Call that game , with the row player and the column player . Then we have
The local notions are such that when the ‘context’ restriction consists of a singleton strategy for a player , then that strategy is locally optimal. So for example
We say that an optimality condition is positive when every atomic sub-formula built from the monadic predicate occurs under the scope of an even number of negation signs ($\neg$). Note that both global conditions (global strict dominance and global best response) are positive, while the local strict dominance condition is not. As we will see in a moment, positive optimality conditions induce monotonic optimality operators, and monotonicity will be the condition required of optimality operators in Theorem 1, which relates common knowledge of rationality with the iterated elimination of non-optimal strategies.
3 Optimality operators
Henceforth let be a fixed strategic game. Recall that a restriction of the game is a sequence with for all players . We will interpret optimality conditions as operators on the lattice of the restrictions of a game ordered by component-wise set inclusion:
Given a sequence giving an optimality condition for each player , we introduce an optimality operator defined by
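Consider now an operator $T$ on an arbitrary complete lattice $(D, \subseteq)$ with largest element $\top$. We say that an element $G \in D$ is a fixpoint of $T$ if $G = T(G)$ and a post-fixpoint of $T$ if $G \subseteq T(G)$.
We define by transfinite induction a sequence of elements $T^{\alpha}$ of $D$, for all ordinals $\alpha$:
$T^{0} := \top$,
$T^{\alpha+1} := T(T^{\alpha})$,
for limit ordinals $\beta$, $T^{\beta} := \bigcap_{\alpha < \beta} T^{\alpha}$.
We call the least $\alpha$ such that $T^{\alpha+1} = T^{\alpha}$ the closure ordinal of $T$ and denote it by $\alpha_T$. We call then $T^{\alpha_T}$ the outcome of (iterating) $T$ and write it alternatively as $T^{\infty}$.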
Not all operators have fixpoints, but the monotonic and contracting ones (already defined in the introduction) do:
Note 1. Consider an operator $T$ on $(D, \subseteq)$.
(i) If $T$ is contracting or monotonic, then it has an outcome, i.e., $T^{\infty}$ is well-defined.
(ii) The operator $\overline{T}$ defined by $\overline{T}(G) := T(G) \cap G$ is contracting.
(iii) If $T$ is monotonic, then the outcomes of $T$ and $\overline{T}$ coincide.
For (i), it is enough to know that for every set $A$ there is an ordinal $\alpha$ such that there is no injective function from $\alpha$ to $A$.
Note that the operators are by definition contracting, and hence all have outcomes. Furthermore, it is straightforward to verify that if is positive for all players , then is monotonic.
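On a finite lattice the transfinite iteration reduces to an ordinary loop. The following Python sketch works on the powerset of a five-element set; the operator `T`, the universe and the helper names are hypothetical. It computes the outcome of a monotonic operator and of its contracting version (intersecting each result with the argument) and shows that the two outcomes coincide in this example. The loop is only guaranteed to terminate for contracting or monotonic operators on a finite lattice.

```python
from typing import Callable, FrozenSet

Element = FrozenSet[int]
Operator = Callable[[Element], Element]

UNIVERSE: Element = frozenset(range(5))   # largest element of the powerset lattice

def outcome(op: Operator, top: Element) -> Element:
    """Iterate `op` starting from `top` until a fixpoint is reached."""
    current = top
    while True:
        nxt = op(current)
        if nxt == current:
            return current
        current = nxt

def contracting_version(op: Operator) -> Operator:
    """Return the operator G -> op(G) & G, which is contracting by construction."""
    return lambda g: op(g) & g

def T(g: Element) -> Element:
    # Hypothetical monotonic operator: 0 always stays, 3 never does,
    # and any other x stays only if its predecessor x - 1 is in g.
    return frozenset(x for x in UNIVERSE if x == 0 or (x != 3 and x - 1 in g))

print(sorted(outcome(T, UNIVERSE)))                       # [0, 1, 2]
print(sorted(outcome(contracting_version(T), UNIVERSE)))  # [0, 1, 2]
```

The operator `T` is monotonic but not contracting, since it maps the empty set to `{0}`, so the agreement of the two outcomes is not trivial here.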
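The following classic result due to Tarski [14] also forms the basis of the soundness of some part of the proof systems we consider.
Tarski’s Fixpoint Theorem. For every monotonic operator $T$ on $(D, \subseteq)$
$T^{\infty} = \nu T = \bigcup \{G \in D \mid G \subseteq T(G)\}$,
where $\nu T$ is the largest fixpoint of $T$.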
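The characterisation of the largest fixpoint as the union of all post-fixpoints can be checked by brute force on a small finite example. The Python sketch below is only an illustration: the graph `SUCCESSORS`, the operator `T` and the function names are hypothetical. The operator keeps a state exactly when one of its successors is in the argument set, which makes it monotonic.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Set

Element = FrozenSet[int]
Operator = Callable[[Element], Element]

# Hypothetical directed graph: 0 and 1 form a cycle, 3 is a dead end.
SUCCESSORS: Dict[int, Set[int]] = {0: {1}, 1: {0}, 2: {3}, 3: set(), 4: {2}}
STATES: Element = frozenset(SUCCESSORS)

def T(g: Element) -> Element:
    """Keep the states that have at least one successor in g (monotonic in g)."""
    return frozenset(x for x in STATES if SUCCESSORS[x] & g)

def iterate_from_top(op: Operator, top: Element) -> Element:
    """Outcome of `op`: iterate from the largest element until nothing changes."""
    current = top
    while True:
        nxt = op(current)
        if nxt == current:
            return current
        current = nxt

def union_of_post_fixpoints(op: Operator, universe: Element) -> Element:
    """Union of all subsets g of `universe` with g <= op(g), by enumeration."""
    elements = sorted(universe)
    result: Element = frozenset()
    for size in range(len(elements) + 1):
        for subset in combinations(elements, size):
            g = frozenset(subset)
            if g <= op(g):
                result |= g
    return result

print(sorted(iterate_from_top(T, STATES)))         # [0, 1]
print(sorted(union_of_post_fixpoints(T, STATES)))  # [0, 1]
```

The enumeration is exponential in the number of states and is meant purely as a check of the statement, not as a way of computing fixpoints.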
We shall need the following lemma, which is crucial in connecting iterations of arbitrary contracting operators with those of monotonic operators. It also ensures the soundness of one of the proof rules we will introduce.
Consider two operators and on such that
for all , ,
By Note 1 the outcomes of and exist.
We prove now by transfinite induction that for all
from which the claim follows, since by Note 1 we have .
By the definition of the iterations we only need to consider the induction step for a successor ordinal. So suppose the claim holds for some .
The second assumption implies that is monotonic. We have the following string of inclusions and equalities, where the first inclusion holds by the induction hypothesis and monotonicity of and the second one by the first assumption
4 Beliefs and the modal fixpoint language
Recall that is a game . A belief model for is a tuple , with a non-empty set of ‘states’, and for each player , and . The ’s are possibility correspondences cf. . The idea of a possibility correspondence is that if the actual state is then is the set of states that considers possible: those that considers might be the actual state.
Subsets of are called events. A player believes an event if that event holds in every state that considers possible. Thus at the state , player believes iff .
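For a finite set of states the belief condition is a simple subset test. The following Python sketch represents a possibility correspondence as a dictionary from states to sets of states; the dictionary `POSSIBILITY` and the function name `believes` are hypothetical and serve only as an illustration.

```python
from typing import Dict, Set

State = int
Possibility = Dict[State, Set[State]]   # state -> states the player considers possible

def believes(possibility: Possibility, event: Set[State]) -> Set[State]:
    """States at which the player believes `event`: every state considered possible lies in it."""
    return {w for w, considered in possibility.items() if considered <= event}

# Hypothetical possibility correspondence of one player over three states.
POSSIBILITY: Possibility = {1: {1, 2}, 2: {2}, 3: {2, 3}}

print(believes(POSSIBILITY, {1, 2}))  # {1, 2}
print(believes(POSSIBILITY, {2}))     # {2}
```

Nothing in this representation forces a state to be among the states considered possible at it, so beliefs computed this way may be false.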
Given some event we write to denote the restriction of determined by :
In the rest of this section we present a formal language that will be interpreted over belief models. To begin, we consider the simpler language , the formulas of which are defined inductively as follows, where :
with an optimality condition for player . We abbreviate the formula to , to and to .
Formulas of are interpreted as events in (i.e. as subsets of the domain of) belief models. Given a belief model for , we define the interpretation function as follows:
gives the set of states that considers possible at , so is the event that player is -rational, since it means that ’s strategy is optimal according to in the context that the player considers it possible that he is in. The semantic clause for was mentioned at the beginning of this section and is familiar from epistemic logic: is the event that player believes the event . is the event that player ’s strategy is optimal according to the optimality condition , in the context of the restriction .
Then clearly is the event that every player is -rational; is the event that every player’s strategy is -optimal in the context of the restriction ; and is the event that every player believes the event to hold.
Although can express some connections between our formal definitions of optimality, rationality and beliefs, it could be made more expressive. The language could be extended with, for example, atoms expressing the event that the strategy is chosen. This choice is made for example in , where modal languages for reasoning about games are defined. The language we introduce is not parametrised by the game, and consequently can unproblematically be used to reason about games with arbitrary strategy sets.
We will use our language to talk about fixpoint notions: common belief and iterated elimination of non-optimal strategies. Let us therefore explain what is meant by common belief. Common belief of an event is the event that all players believe , all players believe that they believe , all players believe that they believe that…, and so on. Formally, we define , the event that is commonly believed, inductively:
Notice that is the event that everybody believes that (indeed, we have ), is the event that everybody believes that everybody believes that , etc.
‘Common belief’ is called ‘common knowledge’ when for all players and all states , we have . In such a case the players have never ruled out the current state, and so it is legitimate to interpret as ’ knows that ’.
Both common knowledge and common belief are known to have equivalent characterisations as fixpoints, and we will exploit this below in defining them in the modal fixpoint language which we now specify.
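For a finite belief model, the fixpoint characterisation of common belief can be sketched in Python as follows. The sketch assumes the standard characterisation in which common belief of an event E is the largest X such that X equals the event that everybody believes E and X; the two possibility correspondences `P_A` and `P_B` and the function names are hypothetical.

```python
from typing import Dict, List, Set

State = int
Possibility = Dict[State, Set[State]]

def believes(possibility: Possibility, event: Set[State]) -> Set[State]:
    """States at which one player believes `event`."""
    return {w for w, considered in possibility.items() if considered <= event}

def everyone_believes(possibilities: List[Possibility], event: Set[State]) -> Set[State]:
    """States at which every player believes `event` (assumes at least one player)."""
    return set.intersection(*[believes(p, event) for p in possibilities])

def common_belief(possibilities: List[Possibility], states: Set[State],
                  event: Set[State]) -> Set[State]:
    """Largest fixpoint of X -> everyone_believes(event & X), iterated from all states."""
    current = set(states)
    while True:
        nxt = everyone_believes(possibilities, event & current)
        if nxt == current:
            return current
        current = nxt

# Hypothetical two-player belief model over four states.
STATES = {1, 2, 3, 4}
P_A: Possibility = {1: {1, 2}, 2: {1, 2}, 3: {3}, 4: {3, 4}}
P_B: Possibility = {1: {1}, 2: {2, 3}, 3: {2, 3}, 4: {4}}

E = {1, 2}
print(everyone_believes([P_A, P_B], E))      # {1}
print(common_belief([P_A, P_B], STATES, E))  # set()
```

In this model both players believe E at state 1, but the first player does not believe that the second one does, so E is mutually believed there without being commonly believed.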
We extend the vocabulary of with a single set variable denoted by and the contracting fixpoint operator . (The corresponding extension of first-order logic by the dual, inflationary fixpoint operator was first studied in .) Modulo one caveat the resulting language is defined as follows:
The caveat is the following:
must be -free, which means that it does not contain any occurrences of the operator.
This restriction is not necessary but simplifies matters and is sufficient for our considerations.
To extend the interpretation function to , we must keep track of the variable . Therefore we first extend the function to a function by padding it with a dummy argument. We give one clause as an example:
We use this extra argument in the semantic clause for the variable :
Those formulas whose semantics we have so far given define operators. More specifically, for each of them is an operator on the powerset of . We use this to define the clause for :
When does not occur free in , we have for any events and , so in these cases we can write simply . Note that is well-defined since for all we have , so the operator is contracting.
We say that a formula of is positive in when each occurrence of in is under the scope of an even number of negation signs (), and under the scope of an optimality operator only if is positive.
When is positive, the operator is monotonic.
Then by Tarski’s Fixpoint Theorem and Note 1 we can use the following alternative definition of in terms of post-fixpoints:
Let us mention some properties the language can express. First notice that common belief is definable in using the operator. An analogous characterization of common knowledge is in [9, Section 11.5].
Let be a formula of . Then is the event that the event is common belief.
From now on we abbreviate the formula with a formula of to . So can define common belief. Moreover, as the following observation shows, it can also define the iterated elimination of non-optimal strategies.
In the game determined by the event , every player selects a strategy which survives the iterated elimination of non--optimal strategies.
It follows immediately from the following equivalence, which is obtained by unpacking the relevant definitions:
5 Proof Systems
Consider the following formula:
By Notes 3 and 4, we see that (1) states that: true common belief that the players are -rational entails that each player selects a strategy that survives the iterated elimination of non--optimal strategies.
In the rest of this section we will discuss a simple proof system in which we can derive (1). We will use an axiom and rule of inference for the fixpoint operator taken from  and one axiom for rationality analogous to the one called in  an “implicit definition” of rationality. We give these in Figure 3, where, crucially, is positive in , and all the ’s are positive. We denote here by the formula obtained from by substituting each occurrence of the variable with the formula . Assuming given some standard proof rules for propositional reasoning, we add the axioms and rule given in Figure 3 to obtain the system P.
A formula is a theorem of a proof system if it is derivable from the axioms and rules of inference. An -formula is valid if for every belief model for we have . We now establish the soundness of the proof system P, that is, that its theorems are valid.
The proof system P is sound.
We show the validity of the axiom :
Let be a belief model for . We must show that . That is, that for any the inclusion holds. So take some . Then for every , , and . So by monotonicity of , , i.e. as required.
The axioms and the rule were introduced in ; they formalise, respectively, the following two consequences of Tarski’s Fixpoint Theorem concerning a monotonic operator :
is a post-fixpoint of , i.e., holds,
if is a post-fixpoint of , i.e., , then .
Next, we establish the already announced claim.
The formula (1) is a theorem of the proof system P.
The following formulas
are instances of the axioms (with ) and (with ) respectively:
Putting these two together via some propositional logic, we obtain
which is of the right shape to apply the rule (with and ). We then obtain
which is precisely the formula (1).
The formula (1) is valid.
It is interesting to note that no axioms or rules for the modalities or were needed in order to derive (1), other than those connecting them with rationality. In particular, no introspection is required on the part of the players, nor indeed is the axiom needed.
In the language , the are in effect propositional constants. We might instead define them in terms of the and modalities but to this end we would need to extend the language . One way to do this is to use a quantified modal language, allowing quantifiers over set variables, so extending by allowing formulas of the form . Such quantified modal logics are studied in . It is straightforward to extend the semantics to this larger class of formulas:
In the resulting language each constant is definable by a formula of this second-order language:
The following observation then shows correctness of this definition.
For all the formula (4) is valid in the semantics sketched.
To complete our proof-theoretic analysis we augment the proof system P with the following proof rule where we assume that is positive in , but where is an arbitrary -free -formula:
The soundness of this rule is a direct consequence of Lemma 1.
To formalize the statement Imp we need two optimality conditions, and .
To link the proof systems for the languages and we add the following proof rule, where each and is an optimality condition in , and is a formula of .
The soundness of this rule is a direct consequence of the semantics of the formulas and .
We denote the system obtained from P by adding to it the above two proof rules and standard first-order logic rules concerning the formulas in the language , like
by R. We can now formalize the statement Imp as follows:
The following result then shows that this formula can be formally derived in the considered proof system.
The formula (5) is a theorem of the proof system R.
The properties are monotonic, so the following implication is an instance of (1):
Further, since the implication holds, we get by the Link rule
from which (5) follows.
The formula (5) is valid.
We have studied the logical form of epistemic characterisation results, for arbitrary (including infinite) strategic games, of the form “common knowledge of -rationality entails playing according to the iterated elimination of non- properties”. A main contribution of this work is in revealing, by giving syntactic proofs, the reasoning principles involved in two cases: firstly when (Theorem 1), and secondly when entails (Theorem 2). In each case the result holds when is monotonic. The language that we used to formalise this reasoning is to our knowledge novel in combining optimality operators with fixpoint notions. Such a combination is natural when studying such characterisation results, since common knowledge and iterated elimination are both fixpoint notions.
The language is parametric in the optimality conditions used by players. It is therefore built on the top of a first-order language used to define syntactically optimality conditions relevant for our analysis.
- A Nash equilibrium in a two-player game is a pair of strategies, one for each player such that is a best response to and vice-versa.
- By “common true belief” we mean a common belief that is correct. In particular, common knowledge entails common true belief.
- We use here its ‘dual’ version in which the iterations start at the largest and not at the least element of a complete lattice.
[1] Apt, K.R.: Relative strength of strategy elimination procedures. Economics Bulletin 3(21), 1–9 (2007), available from http://economicsbulletin.vanderbilt.edu/Abstract.asp?PaperID=EB-07C70015
[2] Apt, K.R.: The many faces of rationalizability. Berkeley Electronic Journal of Theoretical Economics 7(1) (2007), 38 pages
[3] Aumann, R.J., Brandenburger, A.: Epistemic conditions for Nash equilibrium. Econometrica 63(5), 1161–1180 (1995)
[4] Battigalli, P., Bonanno, G.: Recent results on belief, knowledge and the epistemic foundations of game theory. Research in Economics 53, 149–225 (1999)
[5] Benthem, J.v.: Rational dynamics and epistemic logic in games. International Game Theory Review 9(1), 13–45 (2007), (Erratum reprint, 9(2), 377–409)
[6] Bernheim, B.D.: Rationalizable strategic behavior. Econometrica 52, 1007–1028 (1984)
[7] Bruin, B.d.: Explaining Games: On the logic of game theoretic explanations. Ph.D. thesis, ILLC, Amsterdam (2004)
[8] Dawar, A., Grädel, E., Kreutzer, S.: Inflationary fixed points in modal logics. ACM Transactions on Computational Logic (TOCL) 5(2), 282–315 (2004)
[9] Fagin, R., Halpern, J.Y., Vardi, M., Moses, Y.: Reasoning about knowledge. MIT Press, Cambridge, MA (1995)
[10] Fine, K.: Propositional quantifiers in modal logic. Theoria 36, 336–346 (1970)
[11] Kozen, D.: Results on the propositional mu-calculus. Theoretical Computer Science 27(3), 333–354 (1983)
[12] Lipman, B.L.: A note on the implications of common knowledge of rationality. Games and Economic Behavior 6, 114–129 (1994)
[13] Osborne, M.J., Rubinstein, A.: A Course in Game Theory. MIT Press, Cambridge, MA (1994)
[14] Tarski, A.: A lattice-theoretical fixpoint theorem and its applications. Pacific Journal of Mathematics 5, 285–309 (1955)