# Collective Fluctuations in models of adaptation

###### Abstract

The dynamics of adaptation is difficult to predict because it is highly stochastic even in large populations. The uncertainty emerges from number fluctuations, called genetic drift, arising in the small number of particularly fit individuals of the population. Random genetic drift in this evolutionary vanguard also limits the speed of adaptation, which diverges in deterministic models that ignore these chance effects. Several approaches have been developed to analyze the crucial role of noise on the expected dynamics of adaptation, including the mean fitness of the entire population, or the fate of newly arising beneficial deleterious mutations. However, very little is known about how genetic drift causes fluctuations to emerge on the population level, including fitness distribution variations and speed variations. Yet, these phenomena control the replicability of experimental evolution experiments and are key to a truly predictive understanding of evolutionary processes. Here, we develop an exact approach to these emergent fluctuations by a combination of computational and analytical methods. We show, analytically, that the infinite hierarchy of moment equations can be closed at any arbitrary order by a suitable choice of a dynamical constraint. This constraint regulates (rather than fixes) the population size, accounting for resource limitations. The resulting linear equations, which can be accurately solved numerically, exhibit fluctuation-induced terms that amplify short-distance correlations and suppress long-distance ones. Importantly, by accounting for the dynamics of sub-populations, we provide a systematic route to key population genetic quantities, such as fixation probabilities and decay rates of the genetic diversity. We demonstrate that, for some key quantities, asymptotic formulae can be derived. While it is natural to consider the process of adaptation as a branching random walk (in fitness space) subject to a constraint (due to finite resources), we show that other noisy traveling waves likewise fall into this class of constrained branching random walks. Our methods, therefore, provide a systematic approach towards analyzing fluctuations in a wide range of population biological processes, such as adaptation, genetic meltdown, species invasions or epidemics.

rsfscriptOMSrsfsmn \DeclareSymbolFontAlphabet\mathrsfsrsfscript

Many important evolutionary and ecological processes rely on the behavior of a small number of individuals that have a large dynamical influence on the population as a whole. This is, perhaps, most obvious in the case of biological adaptation: Future generations descend from a small number of currently well-adapted individuals. The genetic footprint of the large majority of the population is wiped out over time by the fixation of more fit genotypes. These dynamics can be visualized as a traveling wave in fitness space, see Fig. 1A. At any time, the currently most fit “pioneer” individuals reside in the small tip of the wave. As time elapses, the wave moves towards higher fitness and the formerly rare most fit individuals dominate the population. By that time, however, a new wave tip of even more fit mutants has formed and the cycle of transient dominance continues.

The principle of “a few guiding the way for many” also characterizes the motion of flocks of birds, which can be controlled by just a few leaders, or the expansion of an invasive species, which depends on pioneers most advanced into the virgin territory. The overall dynamics of these processes can become highly erratic even in large populations because the behavior of the entire population is influenced by strong number fluctuations, called genetic drift, occurring in the small subset of “pioneer” individuals.

Such propagation processes with an extreme sensitivity of noise have
also been called “pulled” waves, because they are pulled along by
the action of the most advanced individuals ^{1}^{1}1In “pushed”
waves, by contrast, most of the growth occurs behind the front at
higher population densities. While these “pushed” waves allow for
simple mean-field approximation that neglect noise, “pulled” waves
break down when noise is neglected. The reason is that noise is a
singular perturbation and neglecting it can lead to qualitatively
wrong predictions or even divergences.. If one ignores the
fluctuations at the population level and is interested only in the
expected dynamics of the population, one might be tempted to simply
ignore genetic drift in models of pulled waves. However, it turns out
that mean-field models ignoring genetic drift drastically
overestimate the speed of traveling waves, to the point that they
predict an ever accelerating rather than a finite speed of
adaptation. It took 70 years since the first formulation of traveling
wave models by Fisher and Kolomogorov, to realize that genetic drift
influences both the expectation and the variation in singular
ways tsimring1996rna (); brunet1997shift ().

The expected behavior of pulled waves has since been analyzed at great length. Many results were first obtained for waves of invasion, noisy versions of the classical “FKPP” model by Fisher, Kolmogorov, Petrovskii, Piskunov van2003front (). In recent years, however, there has been a particularly strong research focus on models of adaptation. These models aroused widespread interest because they can be applied to several types of data, including genomic data derived from experimental evolution experiments and from natural populations that undergo rampant adaptation, such as bacteria and viruses neher2013genetic (). We now have analytical predictions for a number of valuable analytical or semi-analytical results for observables such as the mean speed, probability of fixation, distribution of fixed mutations in the asymptotic regime of large populations neher2013genetic (). The great value of these results is that they show and rationalize which parameter combination chiefly influence the dynamics, and through which functional form. Importantly, it has been generally found that the overall dynamics depends logarithmically on population size and mutation rate. The weak functional dependence is in fact at the root of universality observed in many such wave models: These predictions are independent of the precise details of the models, including the form of the non-linear population size regulation.

The basic challenge in analyzing noisy traveling arise from an essential non-linearity that is required to control population growth. Ignoring such a dynamical control of the population size leads to long-term exponential growth or population extinction. Progress in describing the mean behavior of front-sensitive models has been achieved by at least three different approaches: One can either heuristically improve the mean-field dynamics by setting the net growth rate equal to zero in regions where the population densities are too small tsimring1996rna (). Such an ad-hoc approach, based on a growth rate cut-off, correctly reproduces the wave speed to the leading order but does not reveal other universal next-to-leading order corrections or the wave diffusion constant. One can also invoke a branching-process approximation for the tip of the wave, thereby neglecting effects of the non-linear population size control, and then match this linearized description with a deterministic description of the bulk of the population rouzine2003solitary (); desai2007beneficial (); rouzine2008traveling (); schiffels2011emergent (); good2012distribution (). Finally, there is also the possibility to invoke a particular dynamical constraint with the property that the dynamics exhibits a closed linear equation for the first moment. Importantly, this method, which has been called “model tuning” hallatschek2011noisy (); good2012distribution (); geyrhofer2013stochastic (), reproduces the universal features of noisy traveling waves, which are independent of the chosen population control, ultimately because of the weakness of the population size dependence.

While understanding the mean behavior of noisy traveling waves has been an important achievement, the actual stochastic dynamics is characterized by pronounced fluctuations at the population level. No two realizations of an evolution experiment, for instance, will exhibit the same time-dependent fitness distribution because of the chance effects involved in reproduction and mutations. Measuring the mean behavior requires many replicates in which the entire environment is accurately reproduced. Even if one has access to many replicates, as is possible in highly parallelized well-mixed evolution experiments, one can potentially learn a lot from the variability between replicates. Thus, a predictive understanding of the variability in evolutionary trajectories would greatly improve our quantitative understanding of how evolution works.

Some exact results on fluctuations at the population level are available for a special model of FKPP waves brunet2006phenomenological (). Still, we currently lack a systematic approach that can be applied to a wide range of models. Here, we fill this gap by extending the method of “model tuning” hallatschek2011noisy () to the analysis of higher correlation functions: We show that it is possible to choose a constraint in such a way that the hierarchy of moments is closed at any desired level. The resulting linear equations can be solved numerically and are amenable to asymptotic analytical techniques. As an important application of this approach, we show how the coalescence time can be computed within traveling wave models.

Although our main results are applicable to a wide range of models, we focus our attention on simple models of adaptation. Beyond simply grounding our discussion, there are two reasons to focus on these models. On the one hand, models of adaptation are simply important and have become an indispensable tool as a null model for evolutionary dynamics in microbial population. On the other hand, models of adaptation manifestly exhibit a particular mathematical structure, which we call constrained branching random walks. As we will argue below, this mathematical structure, to which all our formal results apply, can be identified as the essence of a wide range of models arising in physics, ecology and evolution.

## I Models of adaptation as Constrained Branching Random Walks

Darwinian adaptation spontaneously emerges from the processes of mutation, reproduction and competition, and these features need to be mirrored in any model of adaptation. In models, spontaneous mutations can be represented by a stochastic jump process in a “fitness space”. Reproduction is naturally described by a branching process by which individuals give birth at certain fitness-dependent rates allen2003introduction (); haccou2005branching (). In combination, reproduction and mutations thus generate a branching random walk allen2003introduction (); haccou2005branching (), which by itself would lead to diverging population sizes. To avoid this unrealistic outcome, models of adaptation also encode a constraint on population sizes to account for the competition for finite resources. The resulting process is a branching random walk subject to a global constraint, which we now frame mathematically.

The state of the population at time is described by a function representing the number density of individuals with fitness . In this context, fitness refers to an individual’s net-growth rate in the absence of competition for resources. The population is assumed to evolve in discrete timesteps of size , which is eventually sent to zero in order to obtain a continuous-time Markov process. Each timestep consists of two sub-steps. The first substep realizes reproduction and mutations and the second substep implements competition.

### i.1 First substep: Reproduction and mutations

The combined effect of reproductions and mutations can be described by the stochastic equation

(1) |

which takes the number density to an intermediate value . The term represents the expected change in density due to reproduction and mutations. This term is linear in the number density because the number of offspring and mutants per timestep is proportional to the current population density. The term represents all sources of noise arising in this setup. We will now discuss separately the precise meaning of both terms, and give natural alternatives for their form.

The Liouville operator depends on how the mutational process is modeled, and various examples are discussed in the following. A particularly simple example is provided by , which has been used to model asexual evolution on a continuous fitness landscape tsimring1996rna (); hallatschek2011noisy (). Here, the diffusion constant quantifies the fitness variance per generation, generated by an influx of novel mutations. To account for the notorious observation that most mutations are deleterious, a drift term is often included. The linear “reaction” term in simply accounts for the fact that individuals with higher growth rate grow faster. The term refers to the mean fitness of the population, which separates the population with a positive net growth rate from the less fit part of the population with .

One cannot generally assume that (biased) diffusion is a good model for discrete mutational events because of the presence of the reaction term favoring highly fit individuals. The diffusion approximation requires that mutation rates are higher than the typical fitness effects of novel mutations. This may apply to rapidly mutating organisms, such as viruses, or close to a dynamic mutation-selection balance goyal2012dynamic (); neher2013coalescence (). It may also effectively apply in island models with low migration rates, where the fitness effect of a mutation is reduced by potentially low migration rates. But, in well-mixed populations, the diffusion approach breaks down when beneficial mutation rates are much smaller than their typical effect, which has been confirmed for a number of microbial species when they adapt to new environments perfeito2007adaptive (); gordo2011fitness (); levy2015quantitative (). More generally, asexual adaptation may, therefore, be cast into the form

(2) |

where the mutational process is described by the operator , which conserves particle numbers, i.e., describes a pure jump process. For instance, one may have one of the time-independent kernels

(3) |

Here is a mutation rate and is a characteristic scale for the mutational effect. The diffusion kernel is the simplest of these kernels because it is characterized by only one compound parameter, the diffusion constant , rather than two in the Staircase Kernel or an entire function in the general case.

The stochastic term in equation (1) accounts for all random factors that influence the reproduction process. The function represents standard white noise, i.e. a set of delta correlated random numbers,

(4) |

where denotes the ensemble average of a random variable , and and are the Dirac delta function and the Kronecker delta, respectively. The amplitude of the noise term in Eq. (1) is typical for number fluctuations: Due to the law of large number, the expected variance in population numbers from one timestep to the next is proportional to the number of expected births or deaths during one timestep. The numerical coefficient is the variance in offspring number per individual. For instance, when we assume that offspring have nearly matching division and death rates one finds . The variance in offspring number is typically assumed to be of order one, but could become much larger if offspring distributions are highly-skewed, as it is the case when few individuals produce most of the offspring hedgecock2011sweepstakes ().

### i.2 Second substep: Population size constraint

Because the reproduction step generally changes population numbers,
another sub-step, following the branching process, is required to
enforce a constant population size ^{2}^{2}2Note that if
denotes the mean fitness of the population, the action of the
Liouvillean does not change the expected number
of individuals in the population. However, fluctuations in the
reproduction implemented by genetic drift will result in slight
deviations from this expected outcome. These deviations accumulate
over time and either lead to extinction or an ever increasing
population size.,

(5) |

In most models and experiments gerrish1998fate (); desai2007speed (), this step is realized by a random culling of the population: individuals are eliminated at random from the population until the population size constraint is restored. Mathematically, the population control step can be cast into the form

(6) |

where represents the fraction of the population that has to be removed to comply with the population size constraint. The second sub-step completes the computational timestep, and takes the concentration field from the intermediate state to the properly constrained state .

The above standard model of adaptation with fixed population size represents a branching random walk subject to the constraint that the total population size is fixed. Enforcing this constraint leads to the non-linearity that makes the associated model difficult to solve.

Note that, in the above formulation, it is assumed that all noise comes from birth-death processes. We have ignored, for simplicity, additional sources of noise due to, e.g. the mutational jump processes, which are sub-dominant in large populations.

### i.3 Generalization to arbitrary linear constraints

While it is necessary to constrain the population dynamics to avoid an exponential run-away, there is no particular biological reason to strictly fix the population size - in fact, most population sizes fluctuate over time frankham1995effective (). As we will see, there are, however, mathematical reasons to consider constraints of particular form, which greatly simplify the analysis.

As a key step towards these tuned models, we note the fixed population size constraint Eq. (5) can be viewed as one member of a whole class of linear constraints,

(7) |

that one could formulate with the help of a suitable weighting function . Any such constraint will be able to limit the population size, and thus defines, together with the Liouvillean , a particular model of adaptation. Our main result will be the observation that there are an entire set of weighting functions for which the dynamics of the model becomes simple (cf. Sec. III).

Note that one recovers the fixed population size constraint of one chooses the weighting function to be a constant, . For any other choice, the population size will not be fixed. At best, one obtains a steady state with a population size fluctuating around its mean value, , which may change depending on the time-dependence imposed on the weighting function . Note that culling is does not discriminate among individuals of different fitness. It is only the amount of culling that depends on the distribution and type of individuals if is –dependent.

## Ii Invasion waves as constrained branching random walks

One advantage of using the general form Eq. (7) for a global constraint is that many types of traveling waves arising in ecology and evolution can be cast in the same mathematical form, if an appropriate Liouville operator and weighting function are used.

We would like to give an example of this assertion. Fig. 1b illustrates the expansion of a population in a real landscape, which may describe an advantageous gene spreading through a population distributed in space, or the invasion of virgin territory by an introduced species. Models of such real-space waves require the following features: i) populations reproduce and die “freely” in the tip of the wave, where population densities are small, ii) individuals move in real-space according to some jump process and iii) the net-growth vanishes in the bulk of the wave, where resources are sparse.

Features i) and ii) again generate a branching-random walk in the tip of the wave, however, according to different position-dependent growth and jump rates than in the case of evolution. Feature iii) requires a finite population size in the tip of the wave. This non-linearity keeps the branching-random walk away from proliferating to infinite densities, where mean-field models apply.

To generate an adequate branching-random walk, one has to use an appropriate Liouville operator. In one dimensions, one can choose, for instance,

(8) |

Now, refers to the location in one-dimensional real space. refers to the position of the cross-over to the bulk of the wave. The operator generates a jump process. In the simplest case, again, the jump process may be approximated by a diffusive process, . The growth term does not need to be modeled by a strict step-function, any sigmoidal function works in the limit of large population sizes brunet1997shift (). Importantly, the net reproduction rates monotonously increase in and saturates at some finite value at . Finally, to limit the population size in the front of the wave, we need to use a non-constant weighting function in the global constraint, for instance, , which ensures that the growth region contains precisely individuals.

The qualitative difference between the traveling waves in real and fitness space turns out to rely on the growth rates in the nose of the wave: While growth rates are saturating in the case of real-space waves, it is increasing without bound in the case of waves of adaptation. This makes models of adaptation even more sensitive to the effects of noise to the extent that a mean-field limit (neglecting noise) does not even yield finite velocity waves. Noise serves a crucial function in these models as it is required to regularize the wave dynamics. These waves have therefore been called “front-regularized waves” cohen2005fluctuation ().

## Iii Summary of Main Formal Results

In the same way as exemplified above for models of adaptation and invasion, one can frame many other eco-evolutionary scenarios, in their essence, as constrained branching random walks. These models are, ultimately, defined by an operator generating the branching-random walk and a weighting function defining the global constraint. In this paper, we show that, in fact, for any given there are natural ways of choosing the weighting function. The associated models, which we call tuned models, have desirable properties, including closed and linear moment equations, greatly facilitating their analysis.

We first state our main results on how to construct tuned models at any desired level of moments. We will then provide an interpretation of these tuned models and provide simulation results, which we compare to fixed population size models. Detailed analytical derivations are given in later sections.

To characterize fluctuations in the makeup of the population it is convenient to consider the so-called -point correlation function , which is the noise-average of the product

(9) |

of number density fields, , evaluated at the same time at various locations . Note that, here and henceforth, we use the notations and for a space and time-dependent function interchangeably.

From many studies over the last 15 years, we know a lot about the first moment, , of several models of noisy traveling waves. This provides access to the expected shape and velocity of the traveling wave, as well as the expected fate of individual mutations, or sub-populations. However, wave shape and velocity fluctuations, as well, as the decay of genetic diversity requires access to higher moments, , for which there is no systematic approach so far.

Our main result is that the dynamics of the moment (and of all lower moments) becomes analytically accessible if one chooses in the global constraint Eq. (7) the weighting function to have the form

(10) |

for any positive integer , where satisfies

(11) |

with the adjoint operator of . The arbitrary function controls the mean total population size as a function of time.

For the special weighting function Eq. (10), the equation of motion for the moment becomes closed and linear,

(12) |

The resulting models may be called tuned, because for all other choices of the weighting function, the equation of motion for involves resulting in an infinite hierarchy of moments. The moment closure of the tuned models is exact and not due to a truncation or approximation of higher moments. The two terms are fluctuation-induced terms. The last, positive term generates correlations at equal space arguments, which are then dissipated by the first, strictly negative, term over longer time scales. The positive term exists only for and indicates that the effect of genetic drift on higher moments is more complicated than a cut-off in the growth term.

Moreover, examining the behavior of differently labeled, but otherwise identical, subpopulations shows that tuned models can be interpreted naturally in the framework of population genetics. First, the weighting function of all tuned models is a fixation probability function: The probability that descendants of individuals at location and time will take over the population on long times is given precisely by in the model tuned for the moment. It is remarkable, in this context, that equation Eq. (11) for is precisely the equation governing the survival probability of an unconstrained branching random walk fisher2013asexual (), with .

Secondly, the higher moments provide access to the statistics of the genetic makeup of the population. According to the principle tenet of population genetics that, without mutations, the genetic diversity of a population decreases with time: fixation and extinction of subtypes needs to be maintained by an appropriate influx of mutations. The decay of genetic diversity fundamentally depends on higher moments – the first moment captures fixation probabilities but not the time to fixation.

The decay of genetic diversity in the absence of mutations can be quantified by the cross-correlation function

(13) |

between subtypes . Here, the number densities is the number density field of type . The sum of all sub-types makes up the total population, .

If all are different and , it is clear that must continuously decay in the absence of mutations because one of the subtypes will take over in the presence of random genetic drift. The case is the most prominent one: is proportional to the so-called heterozygosity at location and , which is the probability that two individuals sampled with replacement are of different type wakeley2009coalescent ().

Now it turns out that, in the -tuned model, the cross-correlation function can be expressed in the simple form

(14) |

with satisfying a one-dimensional linear equation

(15) |

subject to the initial conditions .

Notice that the decay of genetic diversity in the -tuned model is described by the spectrum of the operator appearing on the right-hand side of Eq. (15), which also occurs in Eq. (12) describing the fluctuations of the total population. Consistent with our population genetic interpretation, one can show quite generally that this operator only has decaying relaxation modes (see Fig. 8).

We consider Eqs. (14) and (15) to be our results with the most immediate applicability because they provide a feasible approach to resolving a key question in population genetics (maintenance and decay of genetic diversity) that relies on having access to higher moments.

### iii.1 Remarks

The above formalism includes the case (closed first moment) presented in Ref. hallatschek2011noisy () (in which the tuned weighting function was denoted by ).

In constrained random walk models, one can generally retrieve lower moments from higher moments by contraction with the weighting function ,

(16) | |||||

where we invoked the global constraint Eq. (7) in going from the second to the third line. To simplify the notation, we have here introduced the short hand to denote , i.e., that the th space variable should be omitted in the arguments.

The contraction rule Eq. (16) also shows that contraction in the last term in the dynamical equation for simply generates the next lower moment . Moreover, if one obtains the moment by solving Eq. (12), one generally has access to all lower-rank moments as well (but not to the larger moments). Thus, the model tuned to be linear at the moment gives access to all moments up to and including the .

It is remarkable that the governing equations for and are independent of the offspring number variation . The only effect of is to reduce the fixation probability and scale up the population sizes . This means in particular that noise-induced terms in the moment equation are not scaled by , and thus do not become small in the small noise limit. The lack of a potentially small parameter has important consequences for the use of perturbation theory in this context.

## Iv Numerical Results

We now illustrate our main results by explicit numerical solutions and stochastic simulations for models tuned to be closed at the first and second moment. The two biological phenomena we consider, adaptation and invasion, both give rise to compact traveling waves in real space and fitness space, respectively. They fundamentally differ in their location-dependent growth rates: While in adaptation models, growth rates are linear increasing towards the tip of the waves, they saturate in invasion models. As a consequence, invasion waves have a well-defined infinite population size limit in contrast to adaptation waves.

### iv.1 Models of adaptation

The detailed behavior of simple models of asexual adaptation models varies with the assumptions made about how the mutation process influences the growth rates that define the branching process. In our numerical work, we focus on the diffusion kernel in Eq. (3), which assumes that a growth rate variance is acquired per generation due to mutations. The advantage of the diffusion kernel is that it only contains one parameter, the diffusion constant , and matters when mutation rates are large compared to the (effective) rate of selection tsimring1996rna (); goyal2012dynamic ().

##### Numerical Approach.

To solve our framework of tuned models, we used a multi-dimensional Newton-Raphson method to determine traveling wave solutions corresponding to tuned models of degree . To this end, we first determined a traveling steady state solution of Eq. (11) defining the set of tuned weighting function (Eq. (10)). This weighting function enforces a traveling steady-state for the moment of the number densities, which satisfies the linear time-independent equation Eq. (12). As a result, we obtain for a given , a weighting function and the moment. Details of the numerical scheme and the explicit equations solved for and are described in Appendices A and B.

##### Model tuned to the first moment ().

To set the stage and to reproduce earlier results from Ref. hallatschek2011noisy (), we first present numerical results for the model closed at the first moment. Fig. 2 shows, for two wave speeds, the weighting function and the mean population density in a stationary comoving frame. While the population distribution is, except for an exponential decay in the wave-tip, close to a Gaussian for large speeds, it is markedly skewed for lower wave speeds. Note that the exact numerical results are in near perfect agreement with stochastic simulations, confirming our approach.

##### Model tuned to the second moment ().

Fig. 2 characterizes the behavior of models in comparison with tuned models. Since models provide access to the second moment, one can of course also obtain the first, by contraction with the weighting function (cf. Eq. (12)). Fig. 2B shows the mean population densities for both and models for the same velocity and weighting function which is identical to their respective weighting functions and up to an –dependent factor, cf. Eq. (10). The mean population densities can hardly be distinguished in the figures - generally, the agreement of the two different models increases with increasing wave velocity or, equivalently, population size. Moreover, the stochastic simulations track the predicted mean in near perfect agreement, providing numerical support for our analysis of the tuned closure at higher moments.

Fig. 2B shows, in 3d plot, the second moment in a semi-logarithmic plot. While this plot is mostly Gaussian, it reveals a distinctly exponential decay in the front and back of the wave. This indicates the importance of fluctuations in these low-density regions.

Higher order correlation functions can also be used to directly investigate fluctuation properties of the noisy adaptation wave. The Pearson-product-moment-correlation, defined as shows a clear anticorrelation signal in the nose of the wave: Usually, a (stochastic) rise in population number in the nose of the wave leads to an overall decrease in population size. Such a very fit population has a much larger weight in the tuned constraint, Eq. (7), forcing the bulk population to be culled. Thus, bulk population size and nose population size are anticorrelated geyrhofer2015oscillations ().

##### Fixation probabilities.

Note from Figs. 2 and 3 that the weighting functions strongly increase towards the tip of the wave where it crosses over to a linear increase. The functional form is consistent with the interpretation of a fixation probability: The success probability of an individual should be much larger in the tip of the wave rather than the bulk because it has to compete with less and less equally or more fit individuals. The fixation probability approaches a linear branch beyond some cross-over fitness because individuals there are so exceptionally fit that they merely have to survive random death to fix. The competition with conspecific is minimal - there is only competition with their own offspring.

To test our prediction that can be interpreted, exactly, as an fixation probability, we have carried out the following test: We ran simulations for and tuned models, in which we labeled the tip of the wave such that the predicted fixation probability is exactly , cf. Fig. 4. The measured fixation probability is shown in the insets of Fig. 4. Within the statistical error, the agreement is very good.

We would like to point out that the fact that is strongly increasing towards the wave tip means that highly fit individuals have a large impact on what fraction of the population is culled per time step. For instance, if a single individual arises far out in the tail of the wave, it may have a large fixation probability. To keep the total fixation probability at one, this means that a significant fraction of the individuals need to be cleared from the population. Importantly, however, the culling itself is independent of the identity of individuals, i.e., a poorly fit individual is equally likely to die as a highly fit individual. This distinguishes our model from models that regularize the population size by removing cells preferentially at the wave tip. Such a procedure strongly modifies the wave dynamics and, in particular, reduces the amount of fluctuations.

##### Velocity-Population-Size Relationships

Fig. 5 shows the relation between velocity and mean population size in various tuned models. As comparison, we also show the corresponding relationships between mean velocity and population size for fixed population size models. All these models correspond to different statistical ensembles, yet for large population sizes the curves approach each other very well.

The differences between models at finite population sizes is explained mainly by population size fluctuations: If we force the fixed population size model to fluctuate precisely as a given realization of a tuned model, we obtain very accurate agreement. Conversely, if we plot velocity vs. shows excellent agreement between all models. This agreement is due to a timescale separation, discussed below.

##### Decay of Genetic Diversity

Next, we have solved numerically for the mode spectrum that governs the decay of heterozygosity. Fig. 6 shows the behavior of the lowest two eigenvalues as a function of the velocity of the wave and the population size, respectively. It can be clearly seen that a time scale separation arises: The frequency of the first mode decays slowly to , following to a good approximation. The frequency of the next higher mode, on the other hand, approaches a constant value of order .

This means that coalescence takes much longer than the time until a subpopulation has forgotten its the initial condition of its spatial distribution. This time-scale separation not only helps in analytically finding the coalescence time in Sec. VI.2.2. But it also underlies the ensuing Bolthausen-Sznitman coalescence in many models of adaptation and invasion waves of the Fisher-Kolmogorov type desai2013genetic (); neher2013genealogies ().

##### Invasion waves

After changing the Liouville operator the one in Eq. (8), and following the same numerical pipeline as described above, we obtain analogous results for invasion waves, see Fig. (7). Note that the spatial co-ordinate now corresponds to real space rather than a fitness landscape. Our data reproduce the universal velocity-population size relationship and the coalescence time scaling that have been established for FKPP waves over the last 20 years van2003front (); brunet2006phenomenological (). An explicit form of the equations we solved numerically can be found in Appendix A.

## V The Stochastic Dynamics of Branching Random Walks

In the following we proceed with the derivation of our main results quoted above. To this end, we will first recapitulate the derivation of the stochastic differential equation governing the dynamics of branching random walks, as was done in Ref. hallatschek2011noisy (). We will then discuss the consequences of this stochastic dynamics for -point correlation functions, which will allow us to identify the natural choice of the weighting function . Subsequently, we will discuss how to modify our basic model to account for different subtypes within the population.

### v.1 Stochastic dynamics of Constrained Branching Random Walks

We will now determine the stochastic dynamics obtained in the limit , which is in general non-linear and hence not solvable. We then use the ensuing stochastic differential equation to identify special models with closed moment hierarchies. We will find that these tuned models are not only solvable, but also allow for a natural interpretation of the constraint in terms of fixation probabilities.

The stochastic dynamics of a constrained branching random walk (CBRW) was derived in Ref. hallatschek2011noisy () and may be summarized as follows. The state of the system is described by the number density of random walkers at position and time . At any time, the distribution of random walkers has to satisfy a global constraint defined by Eq. (7).

The combination of Eqs. (1) and (7) can be written as a fraction,

(17) |

in the continuous-time limit (small enough is required to ensure that the denominator of the fraction is never far from ). Note that the expression in Eq. (17) evidently satisfies the constraint,

Moreover, in the continuous-time limit, we only need to retain terms up to order van1992stochastic (). Thus, expanding Eq. (17), we obtain

(18) | |||||

In Eq. (18), we required to change only deterministically, , i.e. it has no stochastic component.

Finally, we replace products of order in the
deterministic term of expansion Eq. (18)
with their averages, Eq. (4) ^{3}^{3}3We note
that Eq. (V.2) is a manifestation of
Ito’s rule for non-linear variable substitutions in stochastic
differential equations., to arrive at the following stochastic
differential equation: The temporal change of the
concentration field from time to can be
written as

(19) |

which consists of a deterministic change of order and a stochastic change of order . These are given by

(20) |

Thus, we have arrived at the continuous-time stochastic process for a constrained branching random walk hallatschek2011noisy (), which summarizes the combined effect of the original two-step algorithm, (i) “branching random walk” and “enforce constraint”. The related concept of forcing the solution of a SDE onto a manifold has been analyzed in katzenberger1991solutions ().

### v.2 Moment equations

This section introduces the hierarchy of moment equations of CBRWs that characterize the mean and the fluctuations of the concentration field of the random walkers.

Consider the dynamics of the products of . Our goal is to determine how the function changes as time marches forward. To this end, we express the time increments of the –point products in terms of the changes of the single fields ,

(21) |

using the deterministic and stochastic time increments computed in (V.1). Next, we expand the product up to order ,

Note that the last term arises only for . Inserting the time increments Eq. (V.1) of the single fields into Eq. (V.2) yields

Within the deterministic terms, we again replace
products of noises in terms of their averages, as given by
Eq. (4)^{4}^{4}4Note that we can always write plus a stochastic component. If such
terms quadratic in the noise arise in the deterministic part
of any stochastic equation, one may simply ignore their stochastic
component, as it would lead to (in the limit ) negligible
contributions. van1992stochastic (). We then obtain

(24) | |||||

Upon averaging and sending to zero, we obtain an equation of moment for the th moment,

(25) |

From (25) we observe that the coupling to higher moments is mediated via the dynamics of the weighting function in the last term.

### v.3 Exact closure

The key result of Ref. hallatschek2011noisy () was that a particular choice of the weighting function exists, for which the first moment equation is closed. We now show that exact closures can be found for higher moments as well.

Suppose, the solution of

(26) |

exists, and we choose as the weighting function. For this particular model, the dependence on the th moment in Eq. (25) disappears identically:

(27) |

Thus, the hierarchy of the first moments is closed. In fact, this closed set of differential equations can be summarized by a single integro-differential equation, Eq. (12), because contracting with reduces the order of the moments by virtue of the constraint, cf. Eq. (16).

The final form of our results Eq.s (11), (12) are obtained upon substituting

(28) |

which is the initially quoted Eq. (10).

In summary, starting from a linear operator , we have identified an algorithm to construct a constrained branching walk model solvable up to the th moment: First, identify the weighting function for which the hierarchy of moment equation closes at the th moment. To this end, solve equation (26), which is deterministic nonlinear equation for the weighting function depending on a space and a time variable. Second, solve the corresponding moment equation (12), which is a linear equation for the function that depends on space variables and a time variable. Once the function has been obtained, any lower-order moment follows by contraction with , as described in Eq. (16).

## Vi Accounting for different subtypes

We now extend our model to account for different types of individuals. This enables studying questions such as how does mutator strain take over in an evolving population of bacteria even if it confers a direct fitness detriment, or, how does a faster dispersing mutant spreads during a growing tumor even if it might be slower growing? Moreover, we can discuss the decay of genetic diversity and arrive at the very important conclusion that is always the fixation probability of a neutral mutation arising at position and time by considering exchangeable subtypes.

To this end, we define the dynamics of the subtypes analogously to our original constrained branching walk model for the total population in Sec. V: The number density of individuals of type at position at time shall be given by a density field . Hence, the entire state of the system is described by the vector . In each time step, a given subtype undergoes a step of branching random walk subject to their own linear dynamics encoded by an operator and their own fluctuations generated by a noise field ,

(29) |

The –dependence of the linear operators encode the phenotypic differences between types. E.g. if type would refer to a mutator type, would include a particular mutational operator characterizing the mutator phenotype. For instance, if mutations are modeled by diffusion, the corresponding diffusion constant of a mutator would be larger than that of the wild type.

The different subtypes are coupled only by the second computational substep

(30) |

which ensures a global constrained defined by a weighting function vector ,

(31) |

Notice that we have merely added another (discrete) dimension to the problem - the type degree of freedoms. It may be checked that our arguments to arrive at an effective stochastic differential equation and for closing the moment hierarchy in Sec. generalize to any number of dimensions. Thus, we can immediately restate our central results for the extended model accounting for sub-types.

In particular, if we choose the weighting function vector to be with

(32) |

then equation of motion for the moment will be closed. If we choose the notation

(33) |

with being the type of the number density field in the product on the right-hand-side. The equation of motion for the moment is given by

(34) |

Notice that the correlations indicated by the last term only arise for a subpopulation with itself, .

### vi.1 The neutral case and interpretation of the weighting function

Now, let us focus on the special case where types follow the same dynamics in the statistical sense,

(35) |

For such “exchangeable” subtypes, it is easy to see that the equation of motion Eq. (12) for the moment of the total population, , is obtained upon summing left and right-hand side of Eq. (34) over all type indices, i.e. by carrying out .

We can single out one particular subpopulation, say ( for labeled), by summing the equation of motion over all other type indices, . This yields

(36) | |||||

(37) |

Note that the correlation function satisfies the same linear equation as does Eq. (12), which we abbreviate as

(38) | |||||

(39) |

defining a linear (integro-differential) operator . Imagine solving for the left eigenvector of corresponding to eigenvalue 0,

(40) |

where is a function of variables just like . Then, contracting Eq. (36) with this new function one obtains

(41) |

The second equality is a key step. It holds because on long times if fixation occurs and , otherwise.

Hence, the fixation probability of the labeled subpopulation with initial density at time is given by

(42) |

Fortunately, the left eigenvector of is easily constructed: If we fully contract Eq. (39) with factors of the weighting function , we have to get on the LHS because of the constraint Eq. (7). Hence, the sought-after left eigenvector can be written as

(43) |

Since this eigenvectors contracts to 1 with the total population, Eq. (42) becomes

(44) |

Thus, as announced in Section III, a single labeled mutant at a certain location has probability that its descendants will take over the population on long times.

### vi.2 Decay of heterozygosity

The diversity of labels in a population will inevitably decline, because of the rise and ultimate fixation of one of the labels initially present. One can capture the gradual loss of genetic diversity by studying the expectation of the product of density fields that correspond to different types. For instance, if there are just two types, , the correlation function satisfies

(45) |

Notice that the right-hand side of Eq. (45) misses the positive –function term of Eq. (12), which characterized the moment of the total population density. Assuming that (as well as and ) have a stable stationary solution, we can conclude that will decay to at long times because of the lacking source term. This is to be expected because the gradual fixation of one of the two types implies that has to approach on long times. Discerning relaxation times of the time-evolution (45) is related to a key question in population genetics: How fast do lineages of two individuals coalesce?

The function is closely related to the so-called heterozygosity in population genetics. The heterozygosity in the population is the probability that two randomly chosen individuals are of different type, . For our purposes, it is much more convenient and natural to consider a variant of that quantity,

(46) |

with and . These quantities have nice properties. At any time, both and represent the probability of fixation of the respective subpopulations. Accordingly, we have , ensured by the global constraint. Thus, is similar to a heterozygosity in neutral populations and, under certain conditions discussed below, even converges against the actual heterozygosity.