Mean extinction times in cyclic coevolutionary rock-paper-scissors dynamics
Dynamical mechanisms that stabilize coexistence or diversity in biology are of fundamental interest. In contrast to many two-strategy evolutionary games, games with three strategies and cyclic dominance, like the rock-paper-scissors game (RPS), stabilize coexistence and thus preserve biodiversity. In the limit of infinite populations, resembling the traditional picture of evolutionary game theory, replicator equations predict the existence of a fixed point in the interior of the phase space. In finite populations, however, stochastic fluctuations drive the strategy frequencies away from the fixed point, and strategies can even go extinct. For three different processes, and for the zero-sum as well as the non-zero-sum RPS, we present results of extensive simulations of the mean extinction time (MET), depending on the number of agents $N$, and we introduce two analytical approaches for the derivation of the MET.
There are several examples of cyclic dominance in biology: Males of the Californian lizard Uta stansburiana are known to inherit three different mating strategies which cyclically dominate each other [22, 24]. Further examples are given by three strains of Escherichia coli, by tropical marine ecosystems, or by high arctic vertebrates. Cyclic dominance is also important in some theoretical models like the susceptible-infected-recovered-susceptible (SIRS) model in epidemiology, in cyclic extensions of the Lotka-Volterra model [31, 32], or the Public Goods Game [13, 11]. Here, cyclic dominance is a way to preserve the coexistence of strategies (often called 'biodiversity' in a biological context) in these models.
A straightforward model system for cyclic dominance is the rock-paper-scissors game (RPS) well known from schoolyards. It contains the three strategies 'rock', 'paper', and 'scissors' with a simple domination rule: Paper wraps rock, scissors cut paper, rock crushes scissors. The payoff a dominating strategy gets from the dominated one is set to $1$ for normalization, the payoff for a tie is $0$, and a dominated strategy gets a payoff $-s$, where we assume $s > 0$. Hence the payoff matrix of this game reads
$$ A = \begin{pmatrix} 0 & -s & 1 \\ 1 & 0 & -s \\ -s & 1 & 0 \end{pmatrix}. $$
For the standard choice $s = 1$ we have a zero-sum RPS; for all other values the game is non-zero-sum. One can develop a simple intuition for the impact of $s$: For large $s$ it is important for a player to avoid losing, so it is successful to play the same strategy as the majority; hence an equilibrium that includes all three strategies is unstable. For small $s$, on the other hand, it is more important to win occasionally, so that the mixed equilibrium becomes stable. Experiments provide estimates of $s$ for the lizard example and for E. coli.
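The payoff conventions above can be written down directly. A minimal sketch, assuming the win $= 1$, tie $= 0$, loss $= -s$ convention with strategies ordered (rock, paper, scissors); the function name is illustrative:

```python
import numpy as np

def rps_payoff_matrix(s):
    """RPS payoff matrix: entry [i, j] is the payoff of strategy i
    against strategy j, with the (assumed) convention
    win = 1, tie = 0, loss = -s.  s = 1 gives the zero-sum game."""
    return np.array([
        [0.0, -s, 1.0],   # rock:     ties rock, loses to paper, beats scissors
        [1.0, 0.0, -s],   # paper:    beats rock, ties paper, loses to scissors
        [-s, 1.0, 0.0],   # scissors: loses to rock, beats paper, ties scissors
    ])

A = rps_payoff_matrix(1.0)
print(np.allclose(A + A.T, 0))  # zero-sum check for s = 1 -> prints True
```

For any $s \neq 1$ the antisymmetry is broken, which is exactly what distinguishes the non-zero-sum game.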
For a long time, the resulting dynamics have been described by a mean-field approximation in the limit of infinite populations. The traditional way to describe such evolutionary games was the standard replicator equation, an equation of motion for the density $x_i$ of an arbitrary strategy $i$,
$$ \dot{x}_i = \Gamma\, x_i \left( \pi_i - \langle \pi \rangle \right), $$
where $\Gamma$ is a constant prefactor which can be absorbed into the time scale by a constant rescaling of time. $\pi_i$ and $\langle \pi \rangle$ are the payoff of strategy $i$ and the average payoff of the whole population, respectively. The adjusted replicator equation
$$ \dot{x}_i = x_i\, \frac{\pi_i - \langle \pi \rangle}{\Gamma + \langle \pi \rangle} $$
has been used less frequently. Its dynamical prefactor can be absorbed into the time scale by a dynamical rescaling of time for symmetric conflicts like the RPS, preserving stability properties. For asymmetric conflicts, like Dawkins' Battle of the Sexes, this is not possible, as the average payoffs in the two populations do not coincide; hence the adjusted replicator equation can lead to qualitatively modified dynamics [26, 6]. The derivation of the replicator equations and of Fokker-Planck equations (comprising the first-order corrections for the intrinsic noise) is commonly obtained from the master equation of the underlying discrete stochastic process [27, 28, 26, 31, 29].
Both replicator equations predict the existence of a fixed point at $\vec{x}^* = (1/3, 1/3, 1/3)$. The fixed point is neutrally stable for $s = 1$, asymptotically stable for $s < 1$, and unstable for $s > 1$. In finite populations of size $N$, the population can drift away from the fixed point because of stochastic fluctuations, and after some time one of the strategies will go extinct. Once this has happened, a second strategy will soon die out because of the lack of cyclic dominance, and the third strategy will therefore survive forever. Several efforts have been made to overcome this shortcoming, especially of the zero-sum RPS, for example spatial discretization of populations (for a review see ), mobility, the introduction of intelligent update rules (best response), the introduction of the parameter $s$ as mentioned above, or the computation of a critical system size above which coexistence of strategies is likely, but it is still an open question how long it takes on average until the first strategy has gone extinct. In this paper we investigate the scaling of the mean time to the extinction of the first strategy (mean extinction time, MET), depending on the system size $N$ and the parameter $s$, for well-mixed populations and three evolutionary processes (two of which have the standard and the adjusted replicator equation, respectively, as their infinite-population limits), and present two analytical approaches which give theoretical insight into the scaling of the MET.
2 Evolutionary processes
Contrary to replicator equations, which describe the dynamics of relative abundance densities, real populations are finite (and discrete); appropriate modeling is thus based on discrete stochastic processes (of birth and death). In the classical Moran process, an individual is chosen with probability proportional to its fitness. It reproduces, and the offspring replaces another randomly chosen individual. In the frequency-dependent Moran process, which extends the classical Moran process by considering coevolution, the fitness is not static but depends on the frequencies of the strategies. For better comparison with the processes discussed below, in each time step we choose an individual at random, which reproduces with probability
$$ p_\alpha = \frac{1}{2}\, \frac{1 - w + w\, \pi_\alpha}{1 - w + w\, \langle \pi \rangle}, $$
and the offspring replaces another randomly chosen individual. Greek indices $\alpha, \beta$ can denote each of the strategies rock, paper, and scissors. Here, $w$ is the selection strength, the $N_\alpha$ are the numbers of agents playing strategy $\alpha$, $\pi_\alpha$ is the payoff for an individual of strategy $\alpha$, and $\langle \pi \rangle$ is the average payoff of the whole population. In the limit $N \to \infty$, the Moran process leads to the so-called adjusted replicator equation. Note that the factor $1/2$ is introduced here for better comparability with the two processes discussed below.
The Moran process is a well-established stochastic process for evolutionary birth-death dynamics with overlapping generations (for growing population sizes, see e.g. ) and therefore (in its frequency-dependent extension) serves as a standard model of evolutionary game theory. However, the Moran process requires perfect global information about the whole population, an assumption that can be unrealistic and undesirable. For this reason, two local processes have been introduced. Again we choose an individual $a$ at random for reproduction. Another randomly chosen individual $b$ changes its strategy to the strategy of $a$ with probability
$$ p = \frac{1}{2} + \frac{w}{2}\, \frac{\pi_a - \pi_b}{\Delta\pi_{\max}} $$
for the local update and
$$ p = \frac{1}{1 + \exp\!\left[ -w \left( \pi_a - \pi_b \right) \right]} $$
for the Fermi process, respectively, where $\Delta\pi_{\max}$ is the maximum possible payoff difference. In the limit $N \to \infty$, these processes lead to the standard replicator equation (local update) and to a nonlinear replicator equation with similar properties (Fermi process), respectively. The common approach for deriving the equations of motion and the first-order corrections for demographic stochasticity is to derive a Fokker-Planck equation from the master equation of the respective stochastic process [27, 28, 26, 6, 29].
Here we focus on the mean extinction time for the Moran and local update processes. For each process, the probability that in a single time step the number of agents of strategy $\beta$ increases by one while the number of agents of strategy $\alpha$ decreases by one is given by
$$ T_{\alpha \to \beta} = \frac{N_\beta}{N}\, \frac{N_\alpha}{N}\; p_{\beta\alpha}, $$
where $p_{\beta\alpha}$ is the respective reproduction (or adoption) probability defined above.
This quantity is known as a hopping rate. Note that the sum over all hopping rates is smaller than one, because reactions are possible that do not lead to changes in the strategy frequencies. The time scale is chosen such that one reaction takes place in every unit time step, including empty steps with no strategy change.
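The hopping rates for the local update can be sketched as follows. The concrete form $T_{\alpha\beta} = x_\alpha x_\beta \cdot \tfrac{1}{2}(1 + w\, \Delta\pi / \Delta\pi_{\max})$ with mean-field payoffs $\pi = A x$ (self-interaction included) is an assumption consistent with the process definition above, not a verbatim reproduction of the text's formula:

```python
import numpy as np

def hopping_rates(n, A, w):
    """T[a, b]: probability that a randomly chosen a-player reproduces
    and its offspring replaces a randomly chosen b-player (local update).
    Assumed conventions: pi = A @ x (mean-field payoffs) and
    T[a, b] = x[a] * x[b] * 0.5 * (1 + w * (pi[a] - pi[b]) / dpi_max)."""
    N = n.sum()
    x = n / N
    pi = A @ x
    dpi_max = A.max() - A.min()                   # largest payoff difference
    T = np.outer(x, x) * 0.5 * (1.0 + w * np.subtract.outer(pi, pi) / dpi_max)
    np.fill_diagonal(T, 0.0)                      # a == b changes nothing
    return T

A = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])  # s = 1
T = hopping_rates(np.array([40, 30, 30]), A, w=0.5)
print(T.sum() <= 1.0)  # empty steps make up the rest -> prints True
```

The antisymmetric payoff-difference terms cancel pairwise in the sum, so the total hopping probability is $(1 - \sum_\alpha x_\alpha^2)/2 < 1$, consistent with the empty steps mentioned above.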
3 Mean extinction times
Compared to the time scale on which mutants occur, re-introducing extinct strategies, the time scale characterizing the survival properties of a genetic strain is defined by the mean extinction time. For two strategies and fixed $N$, one has a one-dimensional Markov process for which a closed expression allows one to proceed analytically, as has been demonstrated by Antal and Scheuring. For higher-dimensional cases, unfortunately, exact general solutions of the mean extinction time (or mean first-passage time) problem are not known; here the situation is further hampered by the location-dependent dynamics along the simplex boundary. So in most cases one has to rely on systematic numerical investigations. We have carried out extensive simulations to quantify the mean times until one of the three strategies has gone extinct. We have analyzed the mean extinction times for all three described processes depending on the number of agents $N$ and the parameters $s$ and $w$. In general, we observe the following properties (see Fig. 1): For all three processes the MET becomes independent of the selection strength $w$ for weak selection. Furthermore, the MET for $s = 1$ is identical for all three processes and proportional to $N$,
As expected, for $s < 1$ the MET is larger and for $s > 1$ smaller than for $s = 1$, but for weak selection the difference is small. In a logarithmic plot of the MET divided by $N$ one can easily see that for $s < 1$ the MET grows exponentially with $N$, with a growth coefficient that increases with $w$ and $1 - s$ but is independent of $N$. For the Moran process the stabilizing effect for $s < 1$ is larger than for the other processes.
At first view, for $s > 1$ the MET seems to decay exponentially with $N$. But such a behavior would imply a disappearance of the MET for large $N$. This is unrealistic, because even the shortest path to the boundary would require $N/3$ steps. Hence a dependence of the form $\mathrm{MET} \propto N \left( c_1 + c_2\, e^{-cN} \right)$ is more likely (with $c$ having the same properties as in the $s < 1$ case), as it is predicted by one of our analytical approaches.
In summary, it can be said that the MET switches from polynomial to exponential scaling in $N$ at $s = 1$. This agrees with the drift reversal picture from [26, 5]. The (global) Moran process and the (local) local update and Fermi processes thus behave similarly, but the impact of the parameters $s$ and $w$ is much stronger for the Moran process.
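A minimal Monte Carlo sketch of such an extinction-time measurement for the local update process (payoff conventions and the adoption probability are assumed as above; all names and parameter values are illustrative):

```python
import numpy as np

def extinction_time(N, s, w, rng):
    """Elementary update steps until the first strategy dies out, for the
    local update process.  Assumed conventions: win = 1, tie = 0,
    loss = -s; mean-field payoffs pi = A @ x; adoption probability
    0.5 * (1 + w * (pi[a] - pi[b]) / dpi_max)."""
    A = np.array([[0.0, -s, 1.0], [1.0, 0.0, -s], [-s, 1.0, 0.0]])
    dpi_max = 1.0 + s                   # largest possible payoff difference
    n = np.full(3, N // 3)
    n[0] += N - n.sum()                 # remainder goes to strategy 0
    t = 0
    while np.all(n > 0):
        x = n / N
        a = rng.choice(3, p=x)          # strategy of the reproducing individual
        b = rng.choice(3, p=x)          # strategy of the individual that may adopt
        pi = A @ x
        if a != b and rng.random() < 0.5 * (1.0 + w * (pi[a] - pi[b]) / dpi_max):
            n[a] += 1
            n[b] -= 1
        t += 1
    return t

rng = np.random.default_rng(2)
mets = [extinction_time(30, 1.0, 0.25, rng) for _ in range(5)]
print(np.mean(mets))
```

Averaging over many such runs, and sweeping $N$, $s$, and $w$, reproduces the kind of scaling data discussed above.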
4 Analytical approaches
An exact analytical solution for two- or higher-dimensional Markov chains is in general not available, and a direct computation of the MET of such a three-strategy evolutionary game is not feasible. Some efforts have been made nevertheless, for example by Reichenbach et al., who have analyzed mean extinction properties in an urn model of a three-strategy game, employing the usual approach of applying van Kampen's linear noise approximation and deriving a Fokker-Planck equation [35, 27, 28, 26].
Adapting this approach to the zero-sum RPS for the local update and the Moran process, we find a predicted MET proportional to $N$ and independent of $w$, which is in accordance with the simulation data. Unfortunately this approach has some shortcomings: Although the overall scaling is correct, the numerical value (prefactor) of the predicted MET is too large by a constant factor, and it is not possible to use this approach for the non-zero-sum RPS or for the Fermi process. We have therefore developed two analytical approaches which do better. We will present both approaches for the local update. Following the same schemes for the other two processes is possible but more involved, because not all integrals can be evaluated analytically there in general. As we have seen from the numerical investigations, the payoffs override differences between the underlying processes, so it is warranted to concentrate on one analytically more feasible process in the remainder.
4.1 First approach - Expected changes in the distance to the fixed point
To compute the MET for the general case, with the help of approximations, we need an appropriate coordinate system. For the standard replicator equation and $s = 1$,
$$ \rho = x_r\, x_p\, x_s $$
is a constant of motion which assumes the value $1/27$ in the fixed point and $0$ on the edge of the phase space, the simplex. Here, $x_r$, $x_p$ and $x_s$ are the frequencies of the strategies rock, paper and scissors, respectively. For $s < 1$, $\rho$ is a Lyapunov function of the replicator equations with $\dot{\rho} \geq 0$, and so the inner fixed point is asymptotically stable. For $s > 1$ the fixed point is unstable, and the attractor of the system moves to the edge of the simplex. Via a transformation of the fixed point to the origin of the phase space, $y_1 = x_r - 1/3$, $y_2 = x_p - 1/3$, and by inserting $x_s = 1 - x_r - x_p$, which guarantees the conservation of the total density, we find for $\rho$:
$$ \rho = \left( \tfrac{1}{3} + y_1 \right) \left( \tfrac{1}{3} + y_2 \right) \left( \tfrac{1}{3} - y_1 - y_2 \right). $$
For values of $\rho$ close to the fixed-point value, the curves with $\rho = \mathrm{const}$ resemble slightly deformed circles, but for $\rho \to 0$ they approach the triangular form of the simplex. In the following we will use $\rho$ as an observable to measure the effective distance to the fixed point. But obviously we need a second coordinate to describe a point in the phase space, so we will use a system with the variables $\rho$ and $y_1$ (the first of the shifted frequencies) by eliminating the $y_2$-coordinate. This gives us two solutions for $y_2$, an upper branch above the fixed point, and a corresponding lower branch below it. We need both because it is not possible to describe all points of the phase space by only one solution (see figure 2). So we have
Here the indices 'u' and 'd' stand for 'up' and 'down', respectively, because re-inserting $\rho$ and $y_1$ into the first (second) equation yields only values of $y_2$ that lie above (below) a separator. In two points, $y_1^{\min}$ and $y_1^{\max}$, both solutions are identical, so that the curve is closed. We can compute these points from the condition $y_2^{\mathrm{u}} = y_2^{\mathrm{d}}$ by solving for $y_1$,
and a third (formal) solution, which is of no use here because for typical values of $\rho$ it produces values outside the simplex. $\arg(z)$ denotes the argument of the complex number $z$.
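That the product of the three frequencies is indeed conserved by the zero-sum replicator dynamics, which is what qualifies it as the observable $\rho$, can be verified numerically. A sketch assuming $\rho = x_r x_p x_s$ and the standard replicator equation with the win $= 1$, tie $= 0$, loss $= -1$ payoff matrix:

```python
import numpy as np

A = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])  # zero-sum, s = 1

def replicator_rhs(x):
    """Standard replicator equation xdot_i = x_i * (pi_i - <pi>)."""
    pi = A @ x
    return x * (pi - x @ pi)

# integrate one trajectory with classical RK4 and watch rho = x_r*x_p*x_s
x = np.array([0.5, 0.3, 0.2])
rho0 = x.prod()
dt = 0.01
for _ in range(2000):                 # integrate up to t = 20
    k1 = replicator_rhs(x)
    k2 = replicator_rhs(x + 0.5 * dt * k1)
    k3 = replicator_rhs(x + 0.5 * dt * k2)
    k4 = replicator_rhs(x + dt * k3)
    x = x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

print(abs(x.prod() - rho0) < 1e-5)  # rho is conserved -> prints True
```

For $s \neq 1$ the same product is no longer constant but drifts monotonically, which is the Lyapunov-function property used above.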
Now we compute the expectation value of the change of $\rho$ in a single time step of the local update process. At time $t$, the system may be in an arbitrary allowed point of the phase space. For this we need the hopping rates in the new coordinates. For example, for a transition that increases $N_r$ by one at the expense of $N_p$, the hopping rate reads, after inserting the frequencies $x_\alpha = N_\alpha / N$,
Note that we treat all variables as if they were continuous, although in fact they are not, and we use $x_s$ as an abbreviation for $1 - x_r - x_p$ due to the conservation of the total density (where $\alpha$ and $\beta$ can denote each of the strategies as long as they are not identical).
Furthermore, we also need the change of $\rho$ that emerges from a change of the $N_\alpha$-values, expressed in the new coordinates. To derive this, we write down $\rho$ for the changed values of $N_r$ and $N_p$, subtract the initial value of $\rho$, and transform the result into the new coordinates. For the same change as in the example above we get
By multiplying all hopping rates with the associated changes of $\rho$ and summing over all terms, we get a relatively simple equation for the expectation value of the change of $\rho$ in a single time step,
Averaging over the position on the curve, by integrating over $y_1$ from $y_1^{\min}$ to $y_1^{\max}$ and normalizing by the length of the whole interval, we derive
The simplest way to approximate the mean extinction time for a system starting in the fixed point of the replicator equations is then to define a map
$$ \rho_{t+1} = \rho_t + \langle \Delta\rho \rangle (\rho_t) $$
and to iterate it until the simplex boundary, $\rho = 0$, is reached. The MET is then simply the number of necessary steps.
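The iterated-map idea can be sketched generically. The drift function below is a hypothetical stand-in for the averaged $\langle \Delta\rho \rangle(\rho)$, not the expression derived above, and the boundary cutoff is a numerical stand-in for exact extinction:

```python
def met_iterated_map(rho_fp, rho_edge, drift):
    """Iterate rho -> rho + drift(rho) from the fixed-point value down to
    the boundary value; the number of iterations approximates the MET.
    `drift` stands for the averaged expected one-step change
    <Delta rho>(rho); it must be negative on (rho_edge, rho_fp] for the
    iteration to terminate."""
    rho, steps = rho_fp, 0
    while rho > rho_edge:
        rho += drift(rho)
        steps += 1
    return steps

# rho = 1/27 at the interior fixed point, rho -> 0 at the simplex boundary;
# toy drift: weak outward stochastic pressure proportional to rho
met = met_iterated_map(rho_fp=1.0 / 27.0, rho_edge=1e-4,
                       drift=lambda r: -1e-4 * r)
print(met > 0)  # prints True
```

With the actual $\langle \Delta\rho \rangle$ the step count directly yields the MET in units of elementary time steps.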
As one can see in the right-hand side of Figure 2, this result is in good agreement with the simulation data, for the dependence on $N$ and $w$ as well as for the numerical values. For $s = 1$ the MET is predicted to be proportional to $N$ and independent of $w$, because $\langle \Delta\rho \rangle$ is independent of $w$ for $s = 1$. It is in good agreement with the simulations that the MET for $s \neq 1$ and weak selection differs only slightly from the MET for $s = 1$, while for strong selection and $s > 1$ it is clearly smaller than for $s = 1$.
Unfortunately, this approach works only for weak selection and not too small $s$, because for $s < 1$ non-negative expectation values of the change of $\rho$ are possible; if we start the mentioned iterated map in the fixed point and repeat the iteration again and again, we would at some point arrive at a value of $\rho$ where the expected change vanishes, and the predicted MET would be infinite. The part of the interval where the expectation value is non-negative is bigger for smaller $s$ than for larger $s$, and it grows with increasing $w$, so in these cases this approach is not sufficient.
4.2 Second approach - Fokker-Planck equation
For the local update process, a Fokker-Planck equation (FPE) for the time evolution of the probability density $P(y_1, y_2; t)$ is given by (summation over double indices)
$$ \partial_t P = -\partial_i \left( a_i\, P \right) + \frac{1}{2}\, \partial_i \partial_j \left( b_{ij}\, P \right), $$
where the coefficients are
which gives the following results
Transforming this into polar coordinates $y_1 = r \cos\varphi$, $y_2 = r \sin\varphi$, we find for the coefficients of the FPE
where the transformed coefficients are functions of $r$ and $\varphi$, and hence for the FPE itself
If we now assume the probability density to be independent of the angle $\varphi$ (which is a good approximation at least for small $r$), all terms which contain derivatives with respect to $\varphi$ drop out. By averaging over $\varphi$ afterwards, integrating from $0$ to $2\pi$ and dividing the result by $2\pi$, we get a probability density which depends only on $r$ and $t$. Hence the FPE reads
Let us now approximate the diffusion term by its value at the fixed point, $r = 0$; this gives us
Theoretically, $r$ can vary beyond the distance to the nearest boundary for some values of $\varphi$, but most extinction events will happen around those points of the edge of the simplex that are closest to the inner fixed point. These points have the distance $R$ from the inner fixed point. Because of that, we search for the solution of the one-dimensional random walk described by eq. 37 in the interval $[0, R]$. Next, let us approximate the convection velocity by its value at an intermediate $r$, and we get
(We cannot approximate it by its value at $r = 0$, as we have done for the diffusion term, because $r = 0$ corresponds to the fixed point, where the convection vanishes.) As a last simplification, we neglect the term that includes the probability density itself. Hence, the FPE reads
The computation of the MET for such an equation, with one reflecting boundary at $r = 0$ and one absorbing boundary at $r = R$, is a standard problem (see, for example, ). One can find the MET by introducing a time-integrated probability density,
With this, the time-integrated FPE reads:
with a reflecting boundary condition at $r = 0$ and an absorbing one at $r = R$. There, $j$ is the time-integrated probability current. The constraint corresponds to the injection of a unit probability current at $r = 0$; this means that the system starts in the origin. The solution of the differential equation is given by
Then the MET is simply
which leads to
or in general
The predicted dependence of the MET on $N$ and the other parameters agrees very well with the simulations, as one can see in Fig. 4. This approach predicts the MET to be proportional to $N$ and independent of the selection strength for $s = 1$, while for $s < 1$ it forecasts an essentially exponential growth of the MET with $N$, and for $s > 1$ a MET in good approximation proportional to $N$ up to an exponentially decaying part of the prefactor.
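The one-dimensional mean first-passage time with a reflecting boundary at $r = 0$ and an absorbing one at $r = R$ can also be evaluated by direct quadrature of the standard double-integral solution. A sketch with illustrative drift and diffusion coefficients (not the ones derived above):

```python
import numpy as np

def mfpt(v, D, R, n=2000):
    """Mean first-passage time from r = 0 (reflecting) to r = R (absorbing)
    for dr = v(r) dt + sqrt(2 D(r)) dW, via the standard result
        T = int_0^R dy e^{-phi(y)} int_0^y e^{phi(z)} / D(z) dz,
    with phi(y) = int_0^y v(z)/D(z) dz, evaluated with midpoint sums."""
    h = R / n
    r = (np.arange(n) + 0.5) * h          # midpoints of the grid cells
    phi = np.cumsum(v(r) / D(r)) * h      # running integral of v/D
    inner = np.cumsum(np.exp(phi) / D(r)) * h
    return float(np.sum(np.exp(-phi) * inner) * h)

# sanity check: v = 0 and constant D give the textbook result T = R^2 / (2 D)
T = mfpt(v=lambda r: 0.0 * r, D=lambda r: 0.5 + 0.0 * r, R=1.0)
print(round(T, 2))  # prints 1.0
```

Plugging in the approximated convection and diffusion coefficients from this section reproduces the parameter dependence of the MET discussed above.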
Evolutionary games with three cyclically dominating strategies can stabilize the coexistence of strategies more effectively than games with only two strategies, one of which dominates. In finite populations, it is still likely that one of these strategies goes extinct, though on average this takes considerably longer than in two-strategy games. We have numerically analyzed the mean times to the extinction of one of the three strategies in the RPS game for three different evolutionary processes: the local update, the frequency-dependent Moran process and the Fermi process. These mean extinction times (MET) show the same fundamental dependencies on the number of agents $N$, the parameter $s$ and the selection strength $w$.
For the zero-sum RPS game ($s = 1$), the MET is proportional to $N$ and independent of $w$ for all three processes; even the proportionality constants are the same. For $s < 1$, we find a stabilization of biodiversity, as the MET grows exponentially with $N$. In contrast to that, for $s > 1$, there is a destabilization of biodiversity, and the MET now grows more slowly than in the zero-sum RPS (where it grows $\propto N$): a part of the proportionality factor of the zero-sum RPS now decays exponentially with $N$. In both non-zero-sum cases, the coefficient that multiplies $N$ in the exponent grows with both $w$ and $|1 - s|$.
Solving such two-dimensional Markov chains exactly is not possible, but we have developed two analytical approaches for approximating the dependencies of the MET on $N$ and the parameters $s$ and $w$. The first approach can be used for weak selection only, but serves for all three processes. In contrast, the second approach gives a good approximation for all parameter values, but it can be carried out fully analytically only for the local update process, as for the other two processes not all integrals can be evaluated analytically.
In summary, the fixation time, as a long-time property of the process, shows a crossover from exponential to polynomial scaling with the population size, fully consistent with the critical population size derived from the "drift reversal" picture, which is based on the short-time dependence of the average drift in phase space (repelling from versus attracting towards the fixed point). Thus both approaches to describing stochastic stability in finite populations lead here to the same conclusions: coexistence is preserved (stabilized) for a positive-sum cyclic game, whereas negative-sum games as well as small populations destabilize coexistence.
-  Antal, T., and Scheuring, I., Fixation of Strategies for an Evolutionary Game in Finite Populations, Bulletin of Mathematical Biology 68,1923-1944 (2006)
-  Anderson, R.M., and May, R.M., Infectious Diseases of Humans: Dynamics and Control, (Oxford University Press, Oxford, 1991)
-  Blume, L.E., The Statistical Mechanics of Best-Response Strategy Revision, Games Econ. Behav. 11, 111-145, (1995)
-  Buss, L.W., Competitive intransitivity and size-frequency distributions of interacting populations, Proceedings of the National Academy of Science, USA, 77, 5355 (1980)
-  Claussen, J.C., and Traulsen, A., Cyclic dominance and biodiversity in well mixed populations, Phys. Rev. Letters 100, 058104, (2008)
-  Claussen, J.C., Drift reversal in asymmetric coevolutionary conflicts: Influence of microscopic processes and population size, The European Physical Journal B 60, 391-399(2007)
-  Dawkins, R., The Selfish Gene, Oxford University Press, (New York, 1976)
-  Durrett R., and Levin, S., Allelopathy in spatially distributed populations, Journal of Theoretical Biology 185, 165-174 (1997)
-  Durrett, R., and Levin, S., Spatial aspects of interspecific competition, Theoretical Population Biology 53, 30-43, (1998)
-  Gilg, O., Hanski, I., and Sittler, B., Cyclic dynamics in a simple vertebrate predator-prey community, Science 302, 866 (2003)
-  Hauert, C., de Monte, S., Hofbauer, J., and Sigmund, K., Volunteering as red queen mechanism for cooperation in public goods games, Science 296, 1129 (2002)
-  Hofbauer, J., and Sigmund, K., Evolutionary Games and Population Dynamics, Cambridge University Press, Cambridge, England, (1998)
-  Kagel J.H., and Roth, A.E., (eds.), The Handbook of Experimental Economics, Princeton University Press, Princeton, (1995)
-  Kerr, B., Riley, M.A., Feldman M.W., and Bohannan, B.J.M., Local dispersal promotes biodiversity in a real-life game of rock-paper-scissors, Nature 418, 171-174 (2002)
-  Kirkup B.C., and Riley, M.A., Antibiotic-mediated antagonism leads to a bacterial game of rock-paper-scissors in vivo, Nature 428, 412-414 (2004)
-  Moran, P.A.P., The statistical processes of evolutionary theory, Clarendon Press (1962)
-  Nowak, M.A., Evolutionary Dynamics: Exploring the Equations of Life, Harvard University Press, Cambridge, MA (2006)
-  Nowak, M.A., Sasaki, A., Taylor, C., and Fudenberg, D., Emergence of cooperation and evolutionary stability in finite populations, Nature (London) 428, 646 (2004)
-  Redner, S., A Guide to First-Passage Processes, Cambridge University Press, Cambridge (2001)
-  Reichenbach, T., Mobilia, M., and Frey, E., Coexistence versus extinction in the stochastic cyclic Lotka-Volterra model, Phys Rev E 74, 051907 (2006)
-  Reichenbach, T., Mobilia, M., and Frey, E., Mobility promotes and jeopardizes biodiversity in rock-paper-scissors games, Nature 448, 1046-49 (2007)
-  Sinervo, B, and Lively, C., The rock-paper-scissors game and the evolution of alternative male strategies, Nature 380, 240, (1996)
-  Maynard Smith, J., Evolution and the Theory of Games, (Cambridge University Press, Cambridge, England, 1982)
-  Maynard Smith, J., The games lizards play, Nature 380, 198 (1996)
-  Szabó, G., and Fáth, G., Evolutionary games on graphs, Physics Reports, 446, 97-216 (2007)
-  Traulsen, A., Claussen, J.C., and Hauert, C., Coevolutionary dynamics: from finite to infinite populations, Physical Review Letters 95, 238701 (2005)
-  D. Helbing, Interrelations between stochastic equations for systems with pair interactions, Physica A 181, 29 (1992)
-  D. Helbing, Stochastic and Boltzmann-like models for behavioral changes, and their relation to game theory, Physica A 193, 241 (1993)
-  Chalub F.A., Souza M.O., From discrete to continuous evolution models: A unifying approach to drift-diffusion and replicator dynamics, Theor. Pop. Biol 76, 268 (2009)
-  Traulsen, A., Pacheco, J. M., and Nowak, M.A., Pairwise comparison and selection temperature in evolutionary game dynamics, J. Theor. Biol. 246, 522 (2007)
-  McKane, A. J., and Newman, T. J., Predator-Prey Cycles from Resonant Amplification of Demographic Stochasticity, Phys. Rev. Lett. 94, 218102 (2005)
-  Butler, T., and Goldenfeld, N., Robust ecological pattern formation induced by demographic noise, Phys. Rev. E 80, 030902 (2009)
-  Poncela, J., Gómez-Gardeñes, J., Traulsen, A., and Moreno, Y., Evolutionary game dynamics in a growing structured population, New J. Phys. 11 083031 (2009)
-  Traulsen, A. Nowak, M.A., and Pacheco, J.M., Stochastic dynamics of invasion and fixation, Physical Review E 74, 011909 (2006)
-  van Kampen, N., Stochastic Processes in Physics and Chemistry, (North Holland, Amsterdam, 1981)
-  Zamudio, K.R., and Sinervo, B., Polygyny, mate-guarding, and posthumous fertilization as alternative male mating strategies, Proceedings of the National Academy of Science, USA, 97, 14427 (2000)