Fixation in finite populations evolving in fluctuating environments

# Fixation in finite populations evolving in fluctuating environments

Peter Ashcroft peter.ashcroft@postgrad.manchester.ac.uk Theoretical Physics, School of Physics and Astronomy, The University of Manchester, Manchester M13 9PL, United Kingdom Philipp M. Altrock paltrock@jimmy.harvard.edu Program for Evolutionary Dynamics, Harvard University, Cambridge, MA 02138, USA Tobias Galla tobias.galla@manchester.ac.ukCorresponding author Theoretical Physics, School of Physics and Astronomy, The University of Manchester, Manchester M13 9PL, United Kingdom
###### Abstract

The environment in which a population evolves can have a crucial impact on selection. We study evolutionary dynamics in finite populations of fixed size in a changing environment. The population dynamics are driven by birth and death events. The rates of these events may vary in time depending on the state of the environment, which follows an independent Markov process. We develop a general theory for the fixation probability of a mutant in a population of wild-types, and for mean unconditional and conditional fixation times. We apply our theory to evolutionary games for which the payoff structure varies in time. The mutant can exploit the environmental noise; a dynamic environment that switches between two states can lead to a probability of fixation that is higher than in any of the individual environmental states. We provide an intuitive interpretation of this surprising effect. We also investigate stationary distributions when mutations are present in the dynamics. In this regime, we find two approximations of the stationary measure. One works well for rapid switching, the other for slowly fluctuating environments.

Keywords: evolutionary dynamics, fluctuating environments, stochastic processes

## 1 Introduction

Evolutionary dynamics describes the change of populations over time subject to spontaneous mutation, selection, and other random events [1, 2]. Different phenotypes in the population can emerge spontaneously by mutation, i.e. through errors during reproduction of wild-types. In many cases, wild-type and mutant individuals are characterised by heritable differences in behavioural traits or strategies [2]. Selection acts on different phenotypes and changes the population composition. Changes in the state of the environment can alter these selective pressures over time.

Time-varying environments are relevant in the evolution of bacterial populations subject to environment modulations by a host [3, 4], or varying antibiotic stress. An illustrative example is the evolution of normal (wild-type) cells and resistant ‘persister’ cells (mutant). This was examined by Kussell et al. [5], where periods of antibiosis were turned on and off. During times of antibiotic stress the growth rate of normal cells was reduced, but the resistant cells sustain population levels. In addition, Acar et al. [6] provided further experimental evidence supporting the deterministic model used in [5]. More complicated studies of dynamics in switching environments rely on cells ‘sensing’ the environment [7] and on the history or information of the environment during a cell’s life [8, 9]. These examples illustrate that the assumption of an interaction structure independent of time is not always realistic. At the same time it is largely an open question how complex interactions between phenotypes together with spontaneous changes in the environment influence the evolutionary dynamics.

The interactions of phenotypes in a population can be formalised in an evolutionary game [10, 11]. Such games can be used to describe conflict over food or territory, cheating in resource allocation, as well as interactions between variants of a gene [12, 13, 14, 15, 16]. In an evolutionary normal-form game each individual can be associated with one out of a finite set of strategies. A payoff matrix quantifies the reward received by a given individual when it interacts with another individual [11].

The dynamics of populations interacting in such a game are often described by deterministic replicator equations or similar differential equations [10, 17, 18]. While deterministic dynamics are useful to understand the action of selection per se, more interesting phenomena arise when stochastic effects are taken into account. A stochastic approach is appropriate – often even strictly required – to understand the impact of fluctuations in finite populations [19, 20]. Deterministic approaches fail to capture effects such as fixation and extinction, or the convergence to a stationary distribution in systems with mutation [21, 22, 23, 24, 25].

There is an increasing body of literature on stochastic evolutionary games. For example, analytical results for the probabilities of a single mutant to reach fixation have been obtained [26, 27, 28, 29]. However, most of this existing work focusses on games played in a fixed environment; the underlying payoff matrix itself remains unchanged in time. This assumption may not be appropriate in cases where external factors influence the environment.

External fluctuations in evolutionary games have been previously introduced by adding extrinsic noise to continuous model parameters [30], or by letting strategy space itself vary in time [31]. Environmental variability has also been the subject of investigation in predator-prey models [32, 33].

In this article we explore different theoretical approaches that allow calculations of fixation probabilities and mean fixation times of a rare mutation, under fluctuating environmental conditions. We use a generic birth-death framework, as described in section 2, and our results thus apply to a wide class of population dynamics. In section 3 the theory is developed for an environment which can transition between an arbitrary number of discrete states, and we expand on the two-environment scenario in section 4. To illustrate our theoretical results we study the fixation properties in an evolutionary game that stochastically switches between a coexistence game and a coordination game in section 5. We determine environmental conditions under which the success of a rare invading mutant is maximal. This is seen to occur at a non-trivial combination of switching rates. For the case in which mutations occur during the dynamics, as described in section 6, we explore how the stationary distribution of the population changes in fluctuating environments. We derive approximations for the stationary distribution, valid for a large range of switching rates. We summarise our findings in section 7 and put them into context.

## 2 Mathematical Model

We seek to model evolutionary dynamics in finite populations of two species that are subject to environmental changes. The changes in the environment are such that at any given point in time, the system can be in one of a finite set of environmental states. The state that the environment is in determines the details of the birth and death dynamics. We focus on two cases: first, in the absence of mutations in the dynamics, we derive laws to predict the probability and mean time of the fixation of a mutant. Fixation describes the event in which mutants take over the population as opposed to going extinct. Fixation or extinction are the only two outcomes of dynamics of a rare mutant in a finite population [34]. In figure 1 we show a basic sketch of this scenario in which the two monomorphic states of the population are absorbing. Second, we study the case when mutations occur in the dynamics. There are then no absorbing states. Instead, the dynamics converges to a stationary distribution.

### 2.1 Birth-death dynamics

We consider populations consisting of a fixed number of individuals. Each individual can be of one of two types, or , which we refer to as ‘mutant’ and ‘wild-type’, respectively. The population is well mixed; every individual can interact with any other individual. The state of the population is fully characterised by the number, , of individuals of type . The remaining individuals are of type . We furthermore assume that at any one time the environment can be in one of discrete states, labelled , where is the space of states of the environment (). Hence the state of the entire system at any time is given by the pair .

The discrete-time birth-death dynamics of the population for a given environment, , is then specified by the transition probabilities and of a one-step process. Specifically, if the system is in state the population transitions to state in the next time step with probability . Similarly the state of the population in the next time step is with probability . These transitions are shown as black arrows in figure 1. With probability the population remains in state . We always assume that and for all .

With the exception of section 6 we will always assume that the states (all-) and (all-) are absorbing ( and for all ). In the absence of further mutation events a type, once absent, can never be re-introduced. If mutations are present in the dynamics, then the states and are no longer absorbing and the system converges to a unique, non-trivial stationary state. We consider this case in section 6.

### 2.2 Fluctuating environment

In our approach the environment evolves from one state to another independently of the state of the population. This simplification still captures a wide array of natural scenarios. In the discrete-time setup we take the dynamics of the environment as a simple Markov chain, described by the transition matrix of size . The entry represents the probability that the environment changes to state in the next time-step, if it is currently in state , as shown in figure 1. The matrix is a stochastic matrix, for all .

A switch of the environment effectively modifies the birth-death dynamics in the population. We do not specify the exact type of interaction at this point, but keep the rates general.

## 3 Fixation probability and time for a general birth-death process in a fluctuating environment

### 3.1 Fixation probability

Let us first consider a discrete-time evolutionary process. If the system is in state at a given time, it may transition to possible states in any one time-step. These are given by , and , where can be any of the states of the environment. If we write for the probability of a transition from to , we have

 R(i,σ)→(i+1,σ′) = μσ→σ′ω+i,σ R(i,σ)→(i−1,σ′) = μσ→σ′ω−i,σ R(i,σ)→(i,σ′) = μσ→σ′(1−ω+i,σ−ω−i,σ). (3.1)

No transitions from to can occur when . In this setup the birth-death probabilities are determined by the state of the environment at the beginning of the discrete time-step.

The fixation probability, , is the probability that the system ends up in the absorbing state with individuals of type , conditioned on initial state . The probability of fixation of a single mutant, , is of particular interest [21]. It is briefly illustrated in figure 1. In our scenario with switching between environmental states, following the lines of Refs. [2, 28, 35], the following balance equation for the fixation probabilities can be found

 ϕi,σ=∑σ′∈Λμσ→σ′[ω+i,σϕi+1,σ′+ω−i,σϕi−1,σ′+(1−ω+i,σ−ω−i,σ)ϕi,σ′]. (3.2)

This is to be solved along with the boundary conditions and for all .

To obtain a formal solution, we introduce , or in matrix form . The vectors and each have components. The boundary conditions and translate into and for all . With this notation we have

 ϕi,σ=ω+i,σ(ψi+1,σ−ψi,σ)−ω−i,σ(ψi,σ−ψi−1,σ)+ψi,σ. (3.3)

Using , we obtain

 ψi+1,σ−ψi,σ=γi,σ(ψi,σ−ψi−1,σ)+1ω+i,σ[(μ––––−1⋅ψ––i)σ−ψi,σ], (3.4)

where . We stress that the calculation of fixation probabilities and mean fixation times using this formalism requires the matrix to be invertible. We comment on this further in the context of a specific example in Sec. 4.

To keep the notation compact we define the variable . Using , we have . With this notation we can write equation (3.4) in the following form

 υi+1,σ=γi,συi,σ+1ω+i,σ[(μ––––−1−I)⋅i∑j=1υ––j]σ, (3.5)

where is the identity matrix. This relation expresses the vector in terms of the vectors . We can therefore express all vectors (i=) in terms of . The constraint then determines self-consistently. We note that the resulting set of equations is linear in the set . Hence a solution can be obtained in closed form, in principle. In practice one inverts the linear system using one of the standard algebraic manipulation packages. Once has been found, the other components , with , can be computed via equation (3.5). One then uses to find the fixation probabilities starting with individuals of type in environment , .

We note here that algebraically inverting the linear system (3.5) when is large is difficult due to the very large number of terms in the corresponding expressions. Thus, at present, this theory is limited computationally to relatively small .

In the case of a single environment, , the matrix is simply the identity matrix, and equation (3.5) simplifies to the well-known result for discrete-time birth-death processes [2, 28, 35].

### 3.2 Unconditional fixation time

We write for the expected number of time-steps taken to reach any one of the two absorbing states, given that the system is started in state . These fulfil the boundary conditions . With these definitions we find the following relation

 ti,σ=∑σ′∈Λμσ→σ′[ω+i,σti+1,σ′+ω−i,σti−1,σ′+(1−ω+i,σ−ω−i,σ)ti,σ′]+1. (3.6)

Introducing the variable , we have

 ti,σ=ω+i,σ(ξi+1,σ−ξi,σ)−ω−i,σ(ξi,σ−ξi−1,σ)+ξi,σ+1, (3.7)

and with the notation we arrive at

 νi+1,σ=γi,σνi,σ+1ω+i,σ[(μ––––−1−I)⋅i∑j=1ν––j]σ−1ω+i,σ. (3.8)

This relation allows one to express all vectors () in terms of . The constraint then determines , and the mean unconditional fixation times are computed using .

### 3.3 Conditional fixation time

We write for the mean fixation time conditioned on absorption in the all- state, given that the system is initially in state . To find this conditional fixation time, we proceed along similar lines as before. Introducing the variable , which has boundary conditions , the following balance equation can be found,

 θi,σ=∑σ′∈Λμσ→σ′[ω+i,σθi+1,σ′+ω−i,σθi−1,σ′+(1−ω+i,σ−ω−i,σ)θi,σ′]+ϕi,σ. (3.9)

We note that equation (3.6) and equation (3.9) appear to be very similar, but the difference is more than just a global pre-factor ; each term in the expression has different indices and .

Introducing the variable , we have

 θi,σ=ω+i,σ(ζi+1,σ−ζi,σ)−ω−i,σ(ζi,σ−ζi−1,σ)+ζi,σ+ϕi,σ, (3.10)

and introducing we arrive at

 ηi+1,σ=γi,σηi,σ+1ω+i,σ[(μ––––−1−I)⋅i∑j=1η–j]σ−1ω+i,σϕi,σ. (3.11)

The set can then be found using an approach similar to the one described above. Results for the mean conditional fixation time can then be obtained using .

### 3.4 Continuous-time model

In any of the elementary time steps of the above discrete-time model, both the state of the population () and the state of the environment () can change. We next consider a continuous-time setup. There are two types of discrete events that may occur at any time: (i) the state of the environment may change, or (ii) a birth-death event may occur. The rate (per unit time) with which a transition from state to state occurs is denoted by . The occurrence of these events is independent of the state of the population. The rate with which a birth-death event of the type occurs is , if the environment is in state . The rate with which occurs is . We write for the probability to find the system in state at time , and find the master equation

 ∂tPi,σ(t) = W+i−1,σPi−1,σ(t)−W+i,σPi,σ(t)+W−i+1,σPi+1,σ(t)−W−i,σPi,σ(t) (3.12) +∑σ′[mσ′→σPi,σ′(t)−mσ→σ′Pi,σ(t)].

#### Fixation probability

We write for the probability to find the system in state a period of time after it has been started in state . The corresponding backward master equation [36, 37] reads

 ∂tQj,σ′;i,σ(t) = W+i,σ[Qj,σ′;i+1,σ(t)−Qj,σ′;i,σ(t)]+W−i,σ[Qj,σ′;i−1,σ(t)−Qj,σ′;i,σ(t)] (3.13) +∑σ′′mσ→σ′′[Qj,σ′;i,σ′′(t)−Qj,σ′;i,σ(t)].

We define as the probability that the system has reached fixation in the all- state a period of time after the dynamics has been started in state . This includes fixation before time . By setting and summing over in equation (3.13) we obtain

 ∂tφi,σ(t) = W+i,σ[φi+1,σ(t)−φi,σ(t)]+W−i,σ[φi−1,σ(t)−φi,σ(t)] (3.14) +∑σ′′mσ→σ′′[φi,σ′′(t)−φi,σ(t)].

The fixation probabilities are found as , and they can be obtained by setting the time derivative in equation (3.14) to zero. Introducing , one finds

 yi+1,σ=gi,σyi,σ+1W+i,σ∑σ′mσ→σ′[i∑j=1(yj,σ−yj,σ′)], (3.15)

with and where we have used to write . For the case of a single environment the second term on the right-hand side vanishes and one recovers again the well known results in single environments [2, 28, 38]. Equation (3.15) has the same general structure as equation (3.5). Keeping in mind that , the fixation probabilities can hence be found by applying the approach outlined in section 3.1.

#### Fixation times

Calculating the mean fixation time using a diffusion approximation [34, 36, 37, 39] is not appropriate for our model. The environmental switching process has no continuum limit. Instead we work with the backward master equation (3.13) and adapt the calculation outlined by Antal and Scheuring [21].

We introduce , the probability that the system has reached fixation in either of the two absorbing states a period of time after being started in . Again this includes fixation before . We then have for the probability density to reach fixation exactly at time . From the backward master equation (3.13) we find

 ∂tρi,σ(t) = W+i,σ[ρi+1,σ(t)−ρi,σ(t)]+W−i,σ[ρi−1,σ(t)−ρi,σ(t)] (3.16) +∑σ′mσ→σ′[ρi,σ′(t)−ρi,σ(t)].

The mean unconditional fixation time is then found via , from which we find

 −1 = W+i,σ[ti+1,σ−ti,σ]+W−i,σ[ti−1,σ−ti,σ] (3.17) +∑σ′mσ→σ′[ti,σ′−ti,σ].

A similar iterative equation can be found for the mean fixation time conditioned on absorption in the all- state. The only difference is the integral of is given by the fixation probability , and that . The mean conditional fixation times, , therefore fulfil the relation

 −ϕi,σ = W+i,σ[ϕi+1,σtAi+1,σ−ϕi,σtAi,σ]+W−i,σ[ϕi−1,σtAi−1,σ−ϕi,σtAi,σ] (3.18) +∑σ′mσ→σ′[ϕi,σ′tAi,σ′−ϕi,σtAi,σ].

Structurally, equations (3.17) and (3.18) are of the same form as the corresponding equations for the discrete-time model, and so they can be solved using an analogous procedure. In continuous time, however, the solution procedure no longer relies on the invertibility of the matrix of switching rates between the states of the environment. This is because we have split up the birth-death dynamics and the changes of the environment into separate events that occur successively.

## 4 Switching between two environments

We now focus on the case of environments which can be in one of two possible states, i.e. . We label the two states as (). We focus on the discrete-time scenario. The matrix can then be written as

 μ––––=(1−p+p+p−1−p−), (4.1)

where the quantity () is the probability that the state of the environment switches to in a given time-step if it is in state at the beginning of this step. We recall that our theoretical results require the inversion of . Excluding the case when vanishes, this inversion can be carried out straightforwardly,

 μ––––=1Δ(1−p−−p+−p−1−p+). (4.2)

For the case , we have verified that there is no anomalous behaviour of simulation results.

### 4.1 Fixation probability and times

The general result of equation (3.5) now reduces to the recursion

 υi+1,σ=γi,συi,σ+1ω+i,σpσΔi∑j=1(υj,σ−υj,−σ). (4.3)

The fixation probability is obtained from the set via

 ϕi,σ=i∑j=1[1−p−σΔυj,σ−pσΔυj,−σ]. (4.4)

Similarly, equations (3.8) and (3.11) reduce to

 νi+1,σ=γi,σνi,σ+1ω+i,σpσΔi∑j=1(νj,σ−νj,−σ)−1ω+i,σ (4.5)

and

 ηi+1,σ=γi,σηi,σ+1ω+i,σpσΔi∑j=1(ηj,σ−ηj,−σ)−1ω+i,σϕi,σ (4.6)

respectively. The mean unconditional and conditional fixation times are then found as

 ti,σ = i∑j=1[1−p−σΔνj,σ−pσΔνj,−σ], (4.7) tAi,σ = 1ϕi,σi∑j=1[1−p−σΔηj,σ−pσΔηj,−σ]. (4.8)

### 4.2 Effective description for fast switching

The environmental change is fast if the environmental states are short-lived, i.e. much shorter than the mean fixation time in either environment. Then we expect the population dynamics to be controlled by a set of effective transition probabilities, i.e. weighted averages of the original transition probabilities in the different environmental states. The weights are given by the fraction of time spent in each environmental state. As the dynamics of follows a simple a telegraph process [36], the asymptotic fraction of time spent in the state is for . Using this, the effective transition probabilities are given by

 ω±i,eff=p−p++p−ω±i,++p+p++p−ω±i,−. (4.9)

We note that is the probability that in a given time-step the environment switches from state to . Hence the time spent in state decreases with increasing if is held fixed.

We anticipate that expression (4.9) can formally be derived by introducing a relative scaling parameter between the switching probabilities and the birth-death probabilities, and by then taking a suitable limit in which the time scales of both processes are widely separated. We do not explore this route further here.

In this approximation the dynamics of the population are mapped to a simple birth-death process on the set with absorbing states and . For such processes explicit expressions for the fixation probabilities and mean fixation times are known [2, 28, 35]. In the fast-switching limit we propose the following approximation for the fixation probability,

 ϕi,eff=1+∑i−1k=1∏kj=1γj,eff1+∑N−1k=1∏kj=1γj,eff. (4.10)

We have here written . The corresponding approximations for the mean unconditional and conditional fixation times of a single mutant are

 t1,eff = ϕeff1N−1∑k=1k∑l=11ω+l,effk∏m=l+1γm,eff, (4.11) tA1,eff = N−1∑k=1k∑l=1ϕl,effω+l,effk∏m=l+1γm,eff, (4.12)

respectively. These expressions exactly describe the fixation properties of a birth-death system with the effective transition probabilities; the nature of our approximation is to assume that the birth-death process in quickly changing environments can be described by the effective transition probabilities in equation (4.9).

Finally we note that this theory is independent of the invertibility of the switching matrix .

## 5 Fixation in fluctuating two-player two-strategy games

### 5.1 Evolutionary games

As a direct application of the general theory we have developed, we now consider evolutionary game dynamics in well-mixed, finite populations. Any of the individuals can be of one of two types, or . We limit the discussion to two-player games, but the extension to multi-player games (e.g. [25, 40, 41]) is straightforward.

At any point in time the environment is in one of two discrete states (). This state fluctuates in time as specified above. The interaction between individuals is characterised by the payoff matrix

 ABAaσbσBcσdσ. (5.1)

The subscript indicates the dependence on the environment. The matrix focuses on the column player: A type- individual encountering another of its kind receives , and it receives when interacting with a type- individual. In turn, an individual of type interacting with an individual of type obtains , and is the payoff for each individual if they are both of type .

If the environment is in state , and if there are individuals of type in the population and individuals of type , the expected payoffs for each type of player are

 πσA(i) = i−1N−1aσ+N−iN−1bσ, πσB(i) = iN−1cσ+N−i−1N−1dσ. (5.2)

The reproductive fitness of any individual is a function of the individual’s payoff in the evolutionary game. We use an exponential mapping [42, 43],

 fσA(i) = eβπσA(i), fσB(i) = eβπσB(i). (5.3)

This is a common natural choice in which fitness is never negative and monotonically increasing with payoff. Any other functional form of the payoff-to-fitness mapping with these two properties would be equally appropriate. The constant parameter is the so-called intensity of selection. Based on this definition of fitness we model the evolutionary dynamics by the update rules of the Moran process [34, 44], which has been widely used in evolutionary game theory [2, 28, 45]. The Moran process represents a simple birth-death process in which the population size remains constant, and by construction it has absorbing states at and . In a discrete-time setting, the frequency-dependent Moran process is specified by the transition probabilities [46]

 ω+i,σ=i(N−i)N2fσA(i)¯¯¯fσ(i),  ω−i,σ=i(N−i)N2fσB(i)¯¯¯fσ(i), (5.4)

where is the average fitness in the population. We note that the framework of the previous section can be applied to microscopic evolutionary dynamics other than the Moran process. This includes, for example, pairwise comparison processes [47, 48], or cases with constant selection in any one environment.

### 5.2 Switching between coexistence and coordination games

Rare mutations can introduce a previously absent strategy into the population. Typically there is only one individual of this novel type initially. We say that is the resident type, and that is the invading mutant type. All results in this section are based on the initial condition . We chose for the payoff matrix. The type of game is then determined by the off-diagonal terms. We chose and , where are real-valued parameters. Thus we have the payoff matrix

 ABA11+σbB1+σc1. (5.5)

Our parametrisation does not span the entire space of all games, but it covers some of the most common types (see below).

There exist three general types of two player-two strategy evolutionary games. First, for the coexistence game and , selection drives the population away from the absorbing boundaries. Second, for and , the population dynamics exhibits bi-stability. This is also known as a coordination game; selection drives the population towards the monomorphic states. In both cases there exists an internal point in frequency space for which the direction of selection changes its sign, i.e. at which the gradient of selection is zero. This point can be calculated by solving (or equivalently ) for , and broadly speaking it is determined by the relative magnitudes of and . Third, for (or ) type (or type ) always has the higher fitness irrespective of the composition of the population. This type is then always favoured by selection, which never changes direction.

For the remainder of this article we focus on switching between coexistence and coordination games. More precisely we choose and in (5.5). The coexistence game corresponds to and the coordination game to .

### 5.3 Results

#### Sample trajectory of the dynamics

In figure 2(a) we show a sample trajectory of a simulation in which a single mutant reaches fixation. The gradient of selection, , for the two games is shown in figures 2(b) and 2(c). During periods when the environment is in the coexistence state (light background) the population fluctuates about the selection-balance point (dashed line), and during periods when the environment is in the coordination state (shaded background) the population is driven away from the selection-balance point. In the final period in the coordination state the mutant is driven to fixation.

#### Fixation probability and conditional fixation time

In figure 3 we show the effect of switching the environment on the fixation dynamics. We choose . By equating the reaction probabilities (i.e., setting ), or equivalently equating the expected payoffs in equation (5.2), and looking at leading order terms in , the gradient of selection is seen to change sign at . This point is closer to the extinction state of the mutant () than to the fixation state (). We next describe the key observations we make from these results, before we turn to their interpretation.

Fixation probability (figure 3(a)):
The fixation probability in this example depends non-trivially on the rates with which the environment switches states; we find an optimal combination of switching rates, , for which fixation of a single mutant is most likely. The fixation probability is dependent on the initial state of the environment for .

Fixation time (figure 3(b)):
Mean conditional fixation times show very little dependence on the initial state of the environment. The fixation time is small for when the environment is found mostly in the coordination game, and large when the environment is mostly in the coexistence state.

Validity of the theoretical approach:
As seen in both panels of figure 3 the theoretical predictions of equations (4.4) and (4.8), indicated by solid lines, are in convincing agreement with simulation data. Theoretical results from the model with effective transition rates (section 4.2) reproduce the simulation data qualitatively. Quantitative agreement is obtained in the limit of large switching rates, but unsurprisingly, there are systematic deviations when switching is slow.

#### Interpretation

We now proceed and give an intuitive explanation for the observed effects.

Mean conditional fixation time is reduced as more time is spent in coordination environment:
The behaviour of the fixation time can intuitively be understood from the deterministic gradient of selection of the two games (figure 2(b) and 2(c)). If fixation happens, it will generally be quicker in the coordination game than in the coexistence game [21, 49]. This is due to the adverse selection bias in the coordination game at low mutant numbers (see figure 2(c)). The more time the system spends in this region of adverse selection the less likely it is for the mutant to reach fixation. Thus if fixation happens in a coordination game then it happens fast. In the coexistence game on the other hand the direction of selection is towards the balance point, as shown in figure 2(b). The system can ‘afford’ to spend significant time in the region of small mutant numbers and still reach fixation eventually even after repeated excursions throughout frequency space. There is thus no need for fixation to occur quickly, and conditional fixation times can be long.

These observations make it plausible that the mean conditional fixation time will generally decrease when less time is spent in the coexistence game, which is exactly what we find in figure 3(b). We have tested other choices of the parameters and for which the two games are a coexistence game and a coordination game, and we find that the behaviour of the mean conditional fixation times is robust under these changes.

Mean conditional fixation time is largely independent of the initial state of the environment:
Systems started in the coordination environment will tend to reach extinction relatively quickly due to initial adverse selection, unless the environment switches to the coexistence state early on. Thus the sample of runs that reach fixation started from the coordination-game environment will be dominated by runs in which the environment switches soon after the start of the run. Then we expect that the value of the mean conditional fixation time is close to the one obtained when starting in the coexistence game.

Dependence of fixation probability on initial state of the environment:
The data in figure 3(a) show that initiating the dynamics in the coexistence game favours fixation of the mutants for (when is fixed), however above this threshold the initial state of the environment has relatively little effect. The reason for this is as follows. When starting in the coordination game, selection pushes the mutant towards extinction. Hence fixation is more likely if the initial state is the coexistence fitness landscape. Above , the switching process of the environment is too fast for the initial condition to have any significant effect on the population dynamics. It is this regime in which we expect the effective description (section 4.2) to approximate the system well. This is indeed confirmed in figure 3, the theoretical prediction of the effective theory, equation (4.10), agrees well with our simulation results in this fast-switching region.

Behaviour of fixation probability depends on location of the selection-balance point:
If the environment is fixed to the coexistence-game state, fixation is more likely the closer the point of selection balance is to the fixated state, see figure 4(a). The location of this balance point is approximated by , and so the fixation probability increases as is decreased. In a fixed coordination-game environment the reverse is the case. The range of adverse selection is to the left of the balance point, and so fixation is less likely the closer the point of selection balance is to the fixated state.

For , i.e. a selection-bias point close to , we therefore expect that the fixation probability will increase the more time is spent in the coordination-game environment, i.e. is an increasing function of the probability with which the system leaves the state (coexistence game). This is indeed what we find in simulations (data not shown). For , i.e. close to , the reverse is the case. Fixation is more likely in the coexistence game (), and the fixation probability is hence a decreasing function of at fixed . Although we do not show the data here, this is again confirmed in simulations.

For the situation is less clear. The fixation probability will be comparable in both games if the environment is frozen. Two effects here conspire to produce a non-trivial outcome:

• Consider the case in which the system is mostly in the coordination-game state, i.e. . It is plausible that an occasional switch to a coexistence game will make fixation more likely than in a constant coordination game. This is because the coexistence-game environment pushes the system away from extinction at low mutant numbers. In the regime of we thus expect the fixation probability to increase as is lowered. In other words, is a decreasing function at large .

• Similarly, if the system is mostly in the coexistence-game environment (), short periods of time in the coordination game can make fixation more likely. This is because selection at large mutant numbers is directed towards fixation in the coordination game. At we expect to be an increasing function of .

These two effects taken together generate a maximum of the fixation probability at intermediate values of , which is exactly what we find in figure 3(a). We would like to stress that the effect (i) is only present provided the selection-balance point is not too close to the extinct state. The phenomenon discussed under (ii) is only present if the selection-balance point is not too close to the fixated state. If the balance point is located too close to either boundary the corresponding effect will be suppressed and the remaining effect dominates. One then finds monotonically increasing or decreasing dependences .

To confirm our picture we varied the payoff parameters and , and find the value of that maximises fixation probability for a given , as a function of and in figure 4(b). The point of selection balance is approximately (up to system-size corrections). The presence of diagonal structures in figure 4(b) shows that the behaviour of the fixation probability is dependent on the location of the selection-balance point. If this point is close to the fixation state (, bottom-right in figure 4(b)), then the fixation probability is maximal for vanishing . If this point is close to the extinction state (, top-left in figure 4(b)), then the fixation probability is maximal for large . For intermediate locations of the selection balance point () fixation is maximised at a non-trivial combination of environment states.

The features observed in figure 3, i.e. the peak in the fixation probability and shape of the mean conditional fixation time as a function of , are found to be robust when the system size is increased. Fixation probabilities generally decrease with system size but the observed peak becomes sharper. The mean conditional fixation times in a frozen coexistence environment scale exponentially with , whereas they scale as the logarithm of in the coordination environment [50]. We observe these scalings in our system, with the mean conditional fixation time increasing exponentially with for small , and increasing sub-linearly with for large .

## 6 Mutation-selection equilibria under fluctuating environments

### 6.1 Mutations and stationary distributions

We now consider systems with mutations occurring during the dynamics. This removes the possibility of fixation and extinction. The combination of mutation, selection, and noise can lead to non-trivial stationary states. We introduce mutation by modifying the discrete-time transition probabilities of equation (5.4) and now use

 ^ω+i,σ=(1−u)i(N−i)N2fσA(i)¯¯¯fσ(i)+u(N−i)2N2,  ^ω−i,σ=(1−u)i(N−i)N2fσB(i)¯¯¯fσ(i)+ui2N2, (6.1)

where is the mutation rate. The transition probabilities are now non-zero, and so the states and are no longer absorbing.

The stationary probability of finding the system in state (, ) is obtained as the solution of the balance equation

 ρi,σ=∑σ′∈Λμσ′→σ[^ω+i−1,σ′ρi−1,σ′+^ω