Coevolution of strategies and update rules in the Prisoner’s Dilemma game on complex networks
Abstract
In this work we study a weak Prisoner’s Dilemma game in which both strategies and update rules are subjected to evolutionary pressure. Interactions among agents are specified by complex topologies, and we consider both homogeneous and heterogeneous situations. We consider deterministic and stochastic update rules for the strategies, which in turn may consider single links or the full context when selecting agents to copy from. Our results indicate that the coevolutionary process preserves heterogeneous networks as a suitable framework for the emergence of cooperation. Furthermore, on those networks, the update rule leading to a larger fraction of cooperators, which we call replicator dynamics, is selected during coevolution. On homogeneous networks we observe that even if replicator dynamics turns out again to be the selected update rule, the cooperation level is larger than in a fixed update rule framework. We conclude that for a variety of topologies, the fact that the dynamics coevolves with the strategies leads in general to more cooperation in the weak Prisoner’s Dilemma game.
pacs:
05.45.Xt, 89.75.Fb
1 Introduction
Evolutionary game theory on graphs or networks has attracted a lot of interest among physicists in the last decade [1, 2], both because of the new phenomena that such a non-Hamiltonian dynamics [3] gives rise to and because of its many important applications [4, 5]. Prominent among these applications is the understanding of the emergence of cooperation among bacteria, animals and humans [6]. In this context, evolutionary game theory on graphs is relevant because the fact that individuals interact only with a reduced, fixed subset of the population has been proposed as one of the mechanisms leading to cooperation among unrelated individuals [7]. Indeed, the existence of a network of interactions may allow the assortment of cooperators in clusters [8, 9] that can outperform the defectors left on their boundaries. Unfortunately, extensive research has shown that this proposal is not free from difficulties, in particular due to the lack of universality of the evolutionary outcomes when changing the network or the dynamics [2, 10]. In addition, recent experimental tests of the effect of networks on the evolution of cooperation [11, 12] have led to new questions insofar as the experimental results do not match any of the models available in the literature.
In this work, we address the lack of universality, and the problems it poses for applications to real-world problems, through the idea of coevolution. The rationale behind this approach is simple: if there are many possibilities regarding networks or dynamics and no a priori reasons to favor one over another, one can take evolutionary thinking a step further by letting a selection process act on those features. The types of networks or dynamics that are not selected along the process should then not be considered as ingredients of applicable models. This coevolutionary perspective has led to a number of interesting results in the last few years (see, e.g., [13, 14] for reviews), starting from the pioneering work by Ebel, Mielsch and Bornholdt [15] on coevolutionary games on networks. Subsequently, a fruitful line of work has developed by allowing players to rewire their links (e.g., [16, 17, 18, 19, 20, 21, 22]) or even by forming the network through the addition of new players with attachment criteria depending on the payoffs [23, 24, 25]. We here take a much less explored avenue, namely the simultaneous evolution of the players’ strategies and the way they update them. This idea, first introduced in an economic context by Kirchkamp [26], has recently been considered in biological and physical contexts [27, 28, 29], proving itself promising as an alternative way to understand the emergence of cooperation.
Here we largely expand on the work by Moyano and Sánchez [27], where only the case of a square lattice was considered. For that specific example, it was shown that evolutionary competition between update rules gives rise to higher cooperation levels than in a well-mixed context (complete graph) or when the update rules are fixed and cannot evolve. We now aim to address this issue in a much more general setup as regards the network of interactions, by considering a family of complex networks proposed in [30] as the structure of the population. This will allow us to go from complex but homogeneous networks to scale-free ones, thus providing insight into which effects arise because of the specific degree distribution and which because of the complexity of the network. As we will see below, a wide variety of effects and results emerge as a product of the coevolutionary process in this class of networks. In order to present our findings, Section 2 introduces our model of the coevolutionary process and recalls the main features of the one-parameter family of networks we will use as substrate. Section 3 collects our results, organized in terms of the competition between pairs of update rules. Finally, Section 4 summarizes our conclusions and discusses their implications.
2 The model
Following the scheme first introduced in [27], our model is formulated as follows: We consider a set of agents with two associated features: a strategy and an update rule. The strategy can take two possible values, namely cooperation (C) and defection (D), whereas for the update rule agents can take one of the following three options: Replicator (REP), Moran-like (MOR) and Unconditional Imitation (UI) rules, which we will define precisely below. These two features of individuals coevolve simultaneously according to the evolutionary dynamics specified below, thus yielding an interesting feedback loop between the game strategies and the learning rules of the population. The coevolution process is specified by stipulating that when an individual, applying her update rule, decides to copy the strategy of a neighboring agent, she copies this agent’s update rule as well, i.e., the update process involves simultaneously the two features of the agent. While this is clearly a more biologically inspired approach, it is also possible to think of economic contexts in which not only decisions are imitated but also the way to arrive at those decisions is copied.
As usual in studies of evolutionary dynamics on complex networks, every player is represented by a node of the network. Agents thus arranged interact through a game only with those players that are directly connected to them, as dictated by the underlying network. Mathematically, the network is encoded by its adjacency matrix $A$, whose elements are $A_{ij}=1$ if nodes $i$ and $j$ are connected and $A_{ij}=0$ otherwise. Therefore, given a network substrate, the dynamics of each individual is driven by the local constraints imposed by the topology. In order to study the effects of topology on the evolution of the population, we will study a family of different network topologies generated by the model introduced by Gómez-Gardeñes and Moreno in [30]. This family of networks depends on one parameter of the model, $\alpha$, that measures the degree of heterogeneity of the degree distribution, $P(k)$, i.e. the probability of finding nodes with degree $k$. Without going into the full details of the model, which can be found in the original reference [30], we find it convenient for the reader to briefly summarize its definition. Networks are generated starting from a fully connected graph with $m_0$ nodes and a set $U$ of $N-m_0$ unconnected nodes. At each iteration of the growing process a new node in the set $U$ is chosen to create $m$ new links. Each of these links is formed as follows: With probability $\alpha$ the link connects to any of the other $N-1$ nodes in the network with uniform probability; with probability $1-\alpha$ it links to a node following a preferential attachment strategy à la Barabási and Albert (BA) [31]. This procedure is repeated until all the nodes in $U$ have been chosen. Obviously, when $\alpha=0$ the model generates BA scale-free networks with $P(k)\sim k^{-3}$. On the other hand, when $\alpha=1$ the networks generated have a Poisson degree distribution as in Erdős-Rényi (ER) graphs [32]. For intermediate values of $\alpha$ one goes from the large heterogeneity of scale-free networks for $\alpha=0$ to the homogeneous distributions found for $\alpha=1$.
It is worth stressing that the model preserves the total number of links and nodes, $N$, for a proper comparison between different values of $\alpha$. In the following we will focus on three cases: $\alpha=0$ (BA), an intermediate value of $\alpha$ (intermediate heterogeneity) and $\alpha=1$ (ER).
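For illustration, the growth procedure summarized above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code used to obtain the results of this paper: the function name `build_network`, the default core size `m0 = m + 1`, the rule of redrawing a target when it would create a self-loop or a duplicate link, and the symbol `alpha` for the interpolation parameter are all choices made here for the example.

```python
import random

def build_network(n, m, alpha, m0=None, seed=None):
    """Grow a network that interpolates between preferential attachment
    (alpha = 0) and uniform attachment (alpha = 1).

    Returns a dict mapping each node to the set of its neighbours.
    """
    rng = random.Random(seed)
    m0 = m + 1 if m0 is None else m0  # assumption: smallest core offering m targets
    # Fully connected core of m0 nodes; the other n - m0 nodes start unconnected.
    adj = {i: set(range(m0)) - {i} for i in range(m0)}
    for i in range(m0, n):
        adj[i] = set()
    # One entry per link end, so a uniform draw from this list is
    # proportional to the current degree (preferential attachment).
    stubs = [i for i in range(m0) for _ in adj[i]]
    for new in range(m0, n):
        added = 0
        while added < m:
            if rng.random() < alpha:
                target = rng.randrange(n)   # uniform over all nodes
            else:
                target = rng.choice(stubs)  # proportional to degree
            if target == new or target in adj[new]:
                continue                    # assumption: redraw invalid targets
            adj[new].add(target)
            adj[target].add(new)
            stubs.extend((new, target))
            added += 1
    return adj
```

With these choices the number of links is $m_0(m_0-1)/2 + m(N-m_0)$ for every value of `alpha`, which reflects the property that the model keeps the number of links and nodes fixed.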
Having defined the network that specifies which agents a given one will interact with, we now need to define how that interaction takes place. Connected pairs of nodes play the so-called weak Prisoner’s Dilemma, introduced in the pioneering paper on games on graphs by Nowak and May [33]:
$$\begin{array}{c|cc} & C & D \\ \hline C & 1 & 0 \\ D & b & 0 \end{array} \qquad (1)$$
where the entries are the payoffs obtained by the row player. With this choice for the payoff matrix, when two agents cooperate with each other they both receive a payoff $1$. On the contrary, if one is a defector and the other plays as cooperator, the former receives $b>1$ and the latter receives nothing. Finally, if both nodes are defectors they do not earn anything. This setting is referred to as a weak Prisoner’s Dilemma game because there is no risk in cooperating, unlike in the standard Prisoner’s Dilemma, in which a cooperator obtains a negative payoff (i.e., she is punished) when facing a defector. Therefore, in the standard Prisoner’s Dilemma cooperating involves the risk of losing payoff, whereas in its weak version that is not the case, the temptation to defect, $b$, being the only obstacle to cooperation.
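The payoff structure just described can be written down directly. The following Python sketch is only illustrative: the function names `payoff` and `total_payoffs`, the encoding of strategies as the strings "C" and "D", and the representation of the network as a dictionary of neighbour sets are assumptions made for the example. The second function sums, for each agent, the payoffs of the games played with all her neighbours.

```python
def payoff(my_strategy, other_strategy, b):
    """Weak Prisoner's Dilemma payoff of the focal player:
    C vs C -> 1, D vs C -> b, anything vs D -> 0."""
    if other_strategy != "C":
        return 0.0
    return 1.0 if my_strategy == "C" else b

def total_payoffs(adj, strategy, b):
    """Total payoff of every agent after playing once with each neighbour.

    adj: dict node -> iterable of neighbours; strategy: dict node -> "C"/"D".
    """
    return {i: sum(payoff(strategy[i], strategy[j], b) for j in adj[i])
            for i in adj}
```

For example, in a triangle with two cooperators and one defector and $b=1.5$, each cooperator collects 1 (from the other cooperator) while the defector collects $2b=3$.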
The last ingredient needed to specify our model, and the main focus of this study, is the evolutionary dynamics. At a given time step, every player plays the game with all her neighbors using the same strategy in all pairings, as usual in evolutionary game theory on graphs. Once all the games are played, each agent $i$ collects the total payoff obtained from playing with her neighbors, $P_i$. Subsequently, agents have to decide what strategy they will use in the next round, which they do by means of some update rule. For the purposes of the present paper, we will consider only update rules based on the payoff, those being more relevant to biological applications (where “payoff” is interpreted as “fitness” or reproductive capacity). Among those, we will consider three of the most often used rules [1, 2] (see Fig. 1 for a sketch of the different rules to help grasp the differences between them):

Replicator Dynamics (REP) [34, 35]: Agent $i$ chooses one neighbor at random, say agent $j$, and compares their respective payoffs, $P_i$ and $P_j$. If $P_j > P_i$, agent $i$ will adopt the state (i.e., the strategy and the update rule) of agent $j$ with probability:
$$\Pi_{i \to j} = \frac{P_j - P_i}{b\,\max(k_i, k_j)} \qquad (2)$$
where $k_i$ and $k_j$ are the degrees of agents $i$ and $j$, $b$ is the temptation to defect, and the denominator is chosen to ensure proper normalization of the probability. In case $P_j \le P_i$, agent $i$ will keep the same dynamical state in the following generation. A remark is in order here: REP is one of an infinite number of rules that, when the number of agents is taken to infinity on a complete graph, converge to the famous replicator equation [3], hence the name we have used. Other authors prefer to call it “proportional imitation” to emphasize the fact that other rules have the same limit, but we have kept the name REP because it is the one found in most textbooks (e.g., [5]).

Unconditional Imitation (UI) [33]: Agent $i$ compares her payoff with that of her neighbor with the largest payoff, say agent $j$. If $P_j > P_i$, agent $i$ will copy the two features of agent $j$ for the next round of the game. Otherwise, agent $i$ will remain unchanged.

Moran-like rule (MOR) [36]: Agent $i$ adopts the state of one agent in her neighborhood, chosen with probability proportional to that agent’s payoff. Every neighbor with a positive payoff can therefore be copied, whether or not she does better than agent $i$.
We stress that, as stated above, and at variance with traditional evolutionary dynamics, the above three update mechanisms refer to the dynamical state of the nodes, here defined by both the strategy and the update rule. The three rules we have chosen allow us to consider different aspects of the dynamics. Indeed, REP is stochastic and link-focused (by this we mean that the update is carried out after looking at just one of the links of the agent); UI is deterministic and context-focused (the update depends simultaneously on all the neighbors of the considered agent), and MOR is stochastic and context-focused. A deterministic, link-focused rule would proceed to choose a link at random and copy the agent at its other end in case her payoff is larger. However, one can see that this is basically the same as REP with a different probability normalization, and that would simply amount to changing the time scale of the simulation. On the other hand, only REP and UI are monotonous, in the sense that evolution always proceeds in the direction of increasing fitness, and players do not make mistakes. This is not the case for MOR, where the agent to copy is chosen among all neighbors with positive probability, irrespective of whether they do better or not. This is not the only non-monotonous rule, and in fact one can introduce the possibility of such errors in several manners, the most popular of them being the so-called Fermi rule [37] (see also [38] for a generalization that interpolates between the standard Fermi rule and UI). We have not included this rule here to avoid a proliferation of parameters (the Fermi rule depends on a temperature-like parameter that controls the number of mistakes) and of possible pairs of rules to compete, but it is clear that this is another interesting example of a non-monotonous, payoff-dependent update rule, and we will comment on it in the conclusions.
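To make the distinction between link-focused and context-focused, and between stochastic and deterministic, rules concrete, the three update rules can be sketched as Python functions that return the agent whose state will be copied (the agent herself if nothing changes). This is a hedged sketch rather than the implementation behind our results. In particular, the REP probability is taken as $(P_j-P_i)/[b\,\max(k_i,k_j)]$, which is bounded by 1 because no payoff exceeds $b$ times the degree of the agent earning it, and the Moran-like rule is assumed here to select among the neighbours and the focal agent herself with probability proportional to payoff. The function names and data layout (`adj` as a dictionary of neighbour sets, `pay` as a dictionary of payoffs) are likewise assumptions.

```python
import random

def rep_choice(i, adj, pay, b, rng):
    """REP: look at ONE random neighbour j; copy her with probability
    (P_j - P_i) / (b * max(k_i, k_j)) if she earned strictly more."""
    if not adj[i]:
        return i
    j = rng.choice(sorted(adj[i]))
    gain = pay[j] - pay[i]
    if gain > 0 and rng.random() < gain / (b * max(len(adj[i]), len(adj[j]))):
        return j
    return i

def ui_choice(i, adj, pay, rng):
    """UI: look at ALL neighbours; copy the best one only if she earned
    strictly more than agent i (deterministic, rng is unused)."""
    if not adj[i]:
        return i
    best = max(sorted(adj[i]), key=lambda j: pay[j])
    return best if pay[best] > pay[i] else i

def mor_choice(i, adj, pay, rng):
    """Moran-like: pick among neighbours and self (assumption) with
    probability proportional to payoff, better or worse alike."""
    candidates = sorted(adj[i]) + [i]
    weights = [pay[j] for j in candidates]
    if sum(weights) == 0:
        return i  # nobody earned anything: keep the current state
    return rng.choices(candidates, weights=weights, k=1)[0]
```

Only `mor_choice` can return a neighbour whose payoff is lower than that of the focal agent, which is what makes this rule non-monotonous.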
In all cases, simulations proceed as follows: We begin with an initial configuration composed equally of cooperators and defectors, randomly assigning the strategy of each individual with equal probability for both. The update rule, the second dynamical feature, is also randomly distributed. In order to understand the mechanisms at work in the competition of rules, we consider only the coexistence of two update rules in the population, distributed uniformly among cooperators and defectors. Thus, if $A$ and $B$ are the two update rules chosen for the simulation, and $\rho_A$ is the initial fraction of individuals having rule $A$, then the initial setting is composed of $N\rho_A/2$ individuals with initial state $(C, A)$, $N\rho_A/2$ with $(D, A)$, $N(1-\rho_A)/2$ with $(C, B)$ and $N(1-\rho_A)/2$ with $(D, B)$, where $N$ is the number of agents. From this initial configuration, the simulation proceeds round by round, with the payoffs being reset to zero after every round in agreement with the general procedure (i.e., we do not accumulate payoffs). Simulations are run until a stationary state is reached, which we verified is achieved after 4000 rounds. We then measured the magnitudes of interest averaging over another 100 rounds, and the whole process was repeated for 100 realizations for every set of parameters. We varied the value of the temptation to defect, $b$, from 1 to 2 in steps of 0.025, and the frequency of one of the rules in the initial condition from 0 to 1, also in steps of 0.025. We used networks with 5000 nodes and the same mean degree for the three types of network considered (BA, intermediate and ER). Finally, it is worth mentioning that individuals update their dynamical states in parallel. This is the commonly adopted choice in simulations of evolutionary game theory and, although some differences have been reported in specific cases with sequential dynamics [39], changes are in general limited to a narrow range of parameters [40, 2, 10], at least for homogeneous networks (but see Fig. 4b in [41] for a specific example of large differences in scale-free networks under UI dynamics).
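The protocol above (random initial states, payoffs recomputed from scratch every round, parallel update of strategy and rule together, and averaging after a transient) can be sketched as follows. The code is a self-contained illustration under assumptions of our own: states are `(strategy, rule)` tuples, the network is a list of neighbour lists, each rule is a function returning the agent to copy, and only UI and REP are included as examples. All names (`initial_states`, `synchronous_round`, `run`, `make_rep`, `ui`) are hypothetical.

```python
import random

def initial_states(n, rule_a, rule_b, rho_a, rng):
    """Strategy C or D with probability 1/2; rule A with probability rho_a."""
    return [(rng.choice("CD"), rule_a if rng.random() < rho_a else rule_b)
            for _ in range(n)]

def payoffs(adj, states, b):
    """Weak PD payoffs, recomputed every round (no accumulation)."""
    pay = []
    for i, (s_i, _) in enumerate(states):
        gain = 1.0 if s_i == "C" else b
        pay.append(sum(gain for j in adj[i] if states[j][0] == "C"))
    return pay

def ui(i, adj, pay, rng):
    best = max(adj[i], key=lambda j: pay[j])
    return best if pay[best] > pay[i] else i

def make_rep(b):
    def rep(i, adj, pay, rng):
        j = rng.choice(adj[i])
        gain = pay[j] - pay[i]
        if gain > 0 and rng.random() < gain / (b * max(len(adj[i]), len(adj[j]))):
            return j
        return i
    return rep

def synchronous_round(adj, states, b, choosers, rng):
    """All agents choose a model with their OWN rule, then all copy at once;
    strategy and update rule are copied together."""
    pay = payoffs(adj, states, b)
    models = [choosers[states[i][1]](i, adj, pay, rng) for i in range(len(states))]
    return [states[j] for j in models]

def run(adj, rule_a, rule_b, rho_a, b, choosers, transient, window, seed=None):
    """Return (average cooperation, average fraction of rule A) over the
    last `window` rounds, after discarding `transient` rounds."""
    rng = random.Random(seed)
    states = initial_states(len(adj), rule_a, rule_b, rho_a, rng)
    coop = frac_a = 0.0
    for t in range(transient + window):
        states = synchronous_round(adj, states, b, choosers, rng)
        if t >= transient:
            coop += sum(s == "C" for s, _ in states) / len(states)
            frac_a += sum(r == rule_a for _, r in states) / len(states)
    return coop / window, frac_a / window
```

A call such as `run(adj, "UI", "REP", 0.5, 1.5, {"UI": ui, "REP": make_rep(1.5)}, transient=4000, window=100)` would mirror the numbers of rounds quoted above for a single realization; averaging over realizations and scanning $b$ and the initial rule frequency are left to the caller.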
3 Results
In this section, we present the results obtained when pairs of different update rules are subject to the selection pressure arising from the coevolutionary process. We have chosen to focus on two main quantities: the average level of cooperation in the asymptotic regime, and the final composition of the population in terms of the update rule. As we will see, these two magnitudes are already enough to understand all the mechanisms at work during the coevolution. We have also looked at the dynamics of individual realizations to ensure that our interpretation of the results is correct. In what follows, we discuss our findings for the three possible competitions between pairs of update rules: Moran versus Replicators, Moran versus Imitators and Replicators versus Imitators.
3.1 Moran versus Replicators
To begin with, we deal with the two stochastic update rules, MOR and REP. The results are shown in the panels of Figure 2. The top panels show the degree of cooperation in the asymptotic regime, while the bottom panels show the final composition of the population in terms of the fraction of replicator players. Both quantities are shown as a function of the temptation to defect, $b$, and the initial fraction of replicators, for BA networks (left panels), networks with moderate heterogeneity, corresponding to an intermediate value of the parameter of the model introduced in [30] (middle panels), and ER graphs (right panels).
The bottom panels give the clue to interpret the processes that are taking place in this first case. REP players completely replace MOR players regardless of the topology of the substrate network and of the values of the temptation $b$ and the initial fraction of replicators (provided the latter is nonzero). Consequently, the level of cooperation (top panels) only depends on the ability of REP agents to cooperate. This ability, which has been extensively studied (see [1, 2] for detailed reviews), depends on both the temptation and the substrate topology. Generally speaking, homogeneous networks with the replicator rule support nonzero cooperation levels only for small $b$, up to a threshold value, whereas heterogeneous networks are much better suited for cooperation under this dynamics [42, 8]. The results for the cooperation level, shown in the top panels, make it clear that we recover the results of those previous studies, as cooperation is strongly enhanced when the heterogeneity of the network increases. Thus, the MOR rule is basically irrelevant in the presence of REP, a conclusion that agrees with that found in [27] for the special case of square lattices.
3.2 Moran versus Unconditional Imitation
Having seen that MOR cannot survive in the presence of REP in any of the networks considered, it is important to consider the case of UI vs MOR, a deterministic rule versus a stochastic, non-monotonous one. If MOR is also suppressed by UI, we can then conclude that it is a very fragile rule and its relevance will be very limited. In Fig. 3 we plot the final density of cooperators (top panels) and the final fraction of imitators (bottom panels) for the three types of networks used in this study: BA networks (left panels), moderately heterogeneous networks (center panels), and ER graphs (right panels). We observe that the competition between these two rules is closely associated with the emergence of cooperation: UI suppresses MOR only when cooperation is the asymptotic result, and both quantities, the cooperation level and the fraction of imitators, show very similar values. In particular, for ER networks UI players turn out to be responsible for all the cooperative behavior observed, and the evolutionary dynamics always ends up in a partition of the population into MOR defectors and UI cooperators. It is interesting to note that in this scenario MOR defectors are more disruptive of the global cooperation level than in the case of a regular square lattice, where cooperation survived until larger values of the temptation [27]. We note that, for this weak PD, when UI is the only update rule it leads to a higher level of cooperative behavior over the entire range of the temptation $b$. This is very likely due to the fact that the ER networks we are studying, given their average degree, are already close to a well-mixed population, in which UI leads to full defection in one step. Therefore, the net effect of having interaction between MOR and UI players is a reduced cooperation level together with the already mentioned separation of the population into two kinds of strategists.
When we move to more heterogeneous networks, the situation begins to change. As shown in the panels for both BA and intermediate networks, some imitator defectors start to appear when the underlying topology of interactions becomes more heterogeneous: Note that the level of cooperation (top panels) is lower than the fraction of imitators, in particular for large temptation $b$ and for a large initial fraction of imitator players. For the specific case of the BA network, the invasion of imitators is complete in this limit, and in this regime MOR players are completely removed from the population. In the intermediate case there is a noticeable fraction of imitator defectors, as we already mentioned, but the invasion of imitators is not complete. In the other regime, the final coexistence of UI and MOR players (for smaller $b$ in BA networks) recovers the partition found in ER networks into Moran defectors and imitator cooperators.
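Statements such as the partition into MOR defectors and UI cooperators, or the appearance of imitator defectors, refer to the joint distribution of strategy and update rule rather than to the two marginal fractions plotted in the figures. As a minimal sketch of how such a check can be done on a final configuration, assuming states stored as `(strategy, rule)` tuples (the helper names below are hypothetical):

```python
from collections import Counter

def joint_composition(states):
    """Fraction of agents in each (strategy, rule) class."""
    counts = Counter(states)
    n = len(states)
    return {state: counts[state] / n for state in counts}

def is_segregated(states, coop_rule, defect_rule):
    """True if every cooperator holds coop_rule and every defector
    holds defect_rule, e.g. UI cooperators and MOR defectors."""
    return all(rule == (coop_rule if strategy == "C" else defect_rule)
               for strategy, rule in states)
```

When the population is segregated in this sense, the cooperation level and the fraction of imitators coincide; a cooperation level below the fraction of imitators signals the presence of imitator defectors.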
The most striking result of the Moran-Imitator competition is found in a particular region of the temptation $b$ on the BA network. Remarkably, in this region the coexistence between MOR and UI (which is only a transient in the upper half of the region) enhances the average level of cooperation with respect to the limit in which only imitators are present, i.e., the combination of rules is optimal in terms of the survival of cooperation. As stated above, in the lower half of the region coexistence is established between two types of agents, MOR defectors and UI cooperators, which indicates that the localization of defective behavior in Moran players enhances the ability of imitators to cooperate. On the other hand, the fact that the phenomenon persists when the coexistence is only temporary (upper half of the region) makes it clear that the survival of MOR defectors in the asymptotic regime is not a necessary condition for the improvement of cooperation. This phenomenon is reminiscent of the enhancement of cooperation due to errors observed in [41], but we want to stress that its mechanism must be different because the case studied in that paper refers to a situation in which update rules do not coevolve with the strategies.
3.3 Replicator Dynamics versus Unconditional Imitation
Given that in the previous two scenarios both imitators and replicators have clearly outcompeted MOR players (although MOR is only completely suppressed by UI in heterogeneous networks for some range of parameters), the competition between replicators and imitators will unveil which is, on average, the most effective update mechanism.
We start by analyzing the BA network. As shown in the bottom-left panel of Fig. 4, REP players invade the whole final population for large values of the temptation $b$. However, when the temptation to defect decreases the final fraction of imitators within the population increases. In particular, the final fraction of imitators increases with its initial frequency. This latter behavior has deep implications regarding the final degree of cooperation. As shown in the top-left panel of Fig. 4, the cooperation level increases with the initial fraction of imitators (with the exception of the case in which the whole population starts as imitators). This result points out that the existence of any small fraction of imitators coexisting with a majority of replicators promotes cooperation in the system. It is very interesting to realize that the cooperation level on BA networks is lower when the update rule is UI than when REP is the rule [2]. Therefore, what we are observing here is that, when both types of dynamics are mixed, a nontrivial coupling between them arises such that the final cooperation level is higher when some UI players are left in the population. Again, this points to a diversity effect similar to the one reported in [41], but the same caveat we expressed above applies.
When the heterogeneity of the network decreases (middle and right panels) we observe from the top panels that, as usual, the degree of cooperation decreases. However, this decrease in the total fraction of cooperators is accompanied by the survival of imitators in some regions of the parameter space, as shown by the bottom panels. In particular, for intermediate values of both the temptation $b$ and the initial fraction of imitators (in a range that differs between the intermediate and the ER networks) we find a significant fraction of the population playing as UI. These UI players constitute the main source of cooperation in the system, since we have checked that the remaining part of the population, composed of replicator players, are defectors. Therefore, the natural trend of REP players to defect in this region (as observed from the corresponding panels in Fig. 2) is opposed by the ability of imitators to cooperate for larger values of $b$. Outside these two regions, we observe again the prevalence of replicators, which play as cooperators and defectors for small and large values of $b$, respectively.
4 Conclusions
In summary, in this paper we have largely extended the work on coevolution of strategies and update rules on square lattices reported in [27]. We have focused on the role of the degree of heterogeneity of the network by considering a model that allows one to go from BA to ER networks. The main conclusion of this work is that in all the cases considered, the coevolution process leads to the survival of cooperation, even when the temptation to defect is relatively high. This allows us to conclude that, generically, coevolutionary dynamics on networks is very different from the mean-field problem. In addition, the study we have carried out sheds more light on the picture that emerged from the preliminary work [27]: While in that paper it was reported that on square lattices REP displaced UI, which in turn displaced MOR, we have learned here that this statement requires a number of qualifications. Thus, it is actually the case that REP displaces MOR in all networks studied, but the other two competitions do not show such a clear-cut outcome. As we have described in detail above, UI displaces MOR completely only when the coevolution process takes place on BA networks and the temptation parameter is large, whereas a fraction of UI survives when competing with REP in all the networks studied, at least for a range of temptation values and initial condition frequencies.
The asymptotic coexistence of update rules leads to interesting phenomena such as the complete segregation between UI cooperators and MOR defectors, or, in a number of cases, the enhancement of cooperation as compared to the case when only one rule is present. The first feature, the segregation phenomenon, is likely to arise due to the fact that UI cooperators survive through clustering, whereas UI defectors cannot imitate their MOR neighbors (the rule applies only if the neighbor’s payoff is strictly larger than the player’s) but the converse is possible. Therefore, compact groups of cooperators are UI, and the ‘outside sea’ of defection necessarily ends up being MOR. Concerning the second observation, the enhancement of cooperation by the coevolution of update rules, we have already mentioned its resemblance to the improvement of cooperation by errors reported in [41]. However, it is clear that the relation between the two mechanisms can only be established through the existence of different time scales involved in the global dynamics, as in [41] the update rules are fixed. In the preliminary report [27], the relevance of the time scales was already noted, but it was also pointed out that the mechanism behind it is far from clear. Time scales are relevant because rules that are more prone to change, and therefore lead to more frequent strategy changes, carry the seed of their own extinction, because in our setup they will copy the update rules that lead to less frequent changes. However, the tests carried out in [27] with two versions of UI, one acting only every 10 time steps, showed that while this was part of the ingredients of the coevolution, it could not explain by itself the whole set of observations. Our work here gives far more support to the idea that time scales play a key role, by showing that, even for the same rule, the heterogeneity of the network induces different rates of invasion of different nodes. All this makes it clear that it is necessary to design a specific setup in which the time scales are controlled.
From a more general viewpoint, our work supports the claim that, if update rules grounded on an evolutionary framework are the ones that should be used, REP is by far the most likely candidate among the three we have studied. Of course, we do not claim that we have completely solved this issue. Given the nontriviality of the coevolutionary process, a three-rule competition should be analyzed to confirm this statement, as the dynamics could be different when the three rules are present. Furthermore, other rules, such as the Fermi rule [37] or the best-response rule [43, 44], should also be included in the competition. Another important point would be to consider other games. We can reasonably expect our results to be valid in a parameter region close to the weak PD, reaching into the strict PD and the Snowdrift game. However, we cannot predict how far this region extends and whether games with a different equilibrium structure, such as the Stag-Hunt, would give rise to the same results we report here. Clearly, all this calls for a more ambitious study for which the present report is paving the way. Finally, we find it particularly interesting that the observation of coexistence in the asymptotic state of some rules agrees with experimental findings of diversity in the behavior of humans playing PD games on lattices [11, 12]. Further work is needed to compare in detail the outcome of the coevolutionary process with the available experimental data.
References
 [1] Szabó G and Fáth G 2007 Phys. Rep. 446 97–216
 [2] Roca C P, Cuesta J A and Sánchez A 2009 Phys. Life Rev. 6 208–249
 [3] Hofbauer J and Sigmund K 1998 Evolutionary Games and Population Dynamics (Cambridge: Cambridge University Press)
 [4] Nowak M A 2006 Evolutionary dynamics: exploring the equations of life (Cambridge: The Belknap Press of Harvard University Press)
 [5] Gintis H 2009 Game theory evolving, 2nd edition (Princeton: Princeton University Press)
 [6] Pennisi E 2009 Science 325 1196
 [7] Nowak M A 2006 Science 314 1560–1563
 [8] Gómez-Gardeñes J, Campillo M, Floría L M and Moreno Y 2007 Phys. Rev. Lett. 98 108103
 [9] Fletcher J A and Doebeli M 2009 Proc. Roy. Soc. B. 276 13–19
 [10] Roca C P, Cuesta J A and Sánchez A 2009 Phys. Rev. E 80 046106
 [11] Traulsen A, Semmann D, Sommerfeld R D, Krambeck H J and Milinski M 2010 Proc. Natl. Acad. Sci. USA 107 2962–2966
 [12] Grujić J, Fosco C, Araújo L, Cuesta J A and Sánchez A 2010 unpublished
 [13] Gross T and Blasius B 2008 J. Roy. Soc. Interface 5 259–271
 [14] Perc M and Szolnoki A 2010 Biosystems 99 109–125
 [15] Ebel H, Mielsch L I and Bornholdt S 2002 Phys. Rev. E 66 056118
 [16] Zimmermann M G, Eguíluz V M and San Miguel M 2004 Phys. Rev. E 69 065102(R)
 [17] Pacheco J M, Traulsen A and Nowak M A 2006 Phys. Rev. Lett. 97 258103
 [18] Pestelacci E, Tomassini M and Luthi L 2008 Biol. Theor. 3 139–153
 [19] Szolnoki A and Perc M 2009 Eur. Phys. J. B 67 337–342
 [20] Szolnoki A and Perc M 2009 EPL 86 30007
 [21] Szolnoki A and Perc M 2009 New J. Phys. 11 093033
 [22] Wu B, Zhou D, Fu F, Luo Q, Wang L and Traulsen A 2010 PLoS ONE 5 e11187
 [23] Poncela J, Gómez-Gardeñes J, Floría L M, Sánchez A and Moreno Y 2008 PLoS ONE 3 e2449
 [24] Poncela J, Gómez-Gardeñes J, Traulsen A and Moreno Y 2009 New J. Phys. 11 083031
 [25] Poncela J, Gómez-Gardeñes J, Floría L M, Sánchez A and Moreno Y 2009 EPL 88 38003
 [26] Kirchkamp O 1999 J. Econ. Behav. Org. 40 295–312
 [27] Moyano L G and Sánchez A 2009 J. Theor. Biol. 259 84–95
 [28] Szabó G, Szolnoki A and Vukov J 2009 EPL 87 18007
 [29] Szolnoki A, Vukov J and Szabó G 2009 Phys. Rev. E 80 056112
 [30] Gómez-Gardeñes J and Moreno Y 2006 Phys. Rev. E 73 056124
 [31] Barabási A L and Albert R 1999 Science 286 509–512
 [32] Erdős P and Rényi A 1959 Publicationes Mathematicae 6 290–297
 [33] Nowak M A and May R M 1992 Nature 359 826–829
 [34] Helbing D 1992 Physica A 181 29–52
 [35] Schlag K H 1998 J. Econ. Theory 78 130–156
 [36] Moran P A P 1962 The Statistical Processes of Evolutionary Theory (Oxford: Clarendon Press)
 [37] Szabó G and Tőke C 1998 Phys. Rev. E 58 69–73
 [38] Altrock P M and Traulsen A 2009 Phys. Rev. E 80 011909
 [39] Huberman B A and Glance N S 1993 Proc. Natl. Acad. Sci. USA 90 7716–7718
 [40] Nowak M A, Bonhoeffer S and May R M 1994 Proc. Natl. Acad. Sci. USA 91 4877–4881
 [41] Roca C P, Cuesta J A and Sánchez A 2009 EPL 87 48005
 [42] Santos F C and Pacheco J M 2005 Phys. Rev. Lett. 95 098104
 [43] Ellison G 1993 Econometrica 61 1047–1071
 [44] Roca C P, Cuesta J A and Sánchez A 2009 Eur. Phys. J. B 71 587–595