The Evolution of Dispersal in Random Environments and The Principle of Partial Control

Lee Altenberg

McNamara and Dall (2011) identified novel relationships between the abundance of a species in different environments, the temporal properties of environmental change, and selection for or against dispersal. Here, the mathematics underlying these relationships in their two-environment model are investigated for arbitrary numbers of environments. The effect they described is quantified as the fitness-abundance covariance. The phase in the life cycle where the population is censused is crucial for the implications of the fitness-abundance covariance. These relationships are shown to connect to the population genetics literature on the Reduction Principle for the evolution of genetic systems and migration. Conditions that produce selection for increased unconditional dispersal are found to be new instances of departures from reduction described by the “Principle of Partial Control” proposed for the evolution of modifier genes. According to this principle, variation that only partially controls the processes that transform the transmitted information of organisms may be selected to increase these processes. Mathematical methods of Karlin, of Friedland, and of Elsner, Johnson, and Neumann are central in generalizing the analysis. Analysis of the adaptive landscape of the model shows that the evolution of conditional dispersal is very sensitive to the spectrum of genetic variation the population is capable of producing, and suggests that empirical study of particular species will require an evaluation of their variational properties.

Dedicated to the memory of Professor Michael Neumann, one of whose many elegant theorems provides for a result presented here.

1 Introduction

In analyzing a model of a population that disperses in a patchy environment subject to random environmental change, McNamara and Dall (2011) describe “how an underappreciated evolutionary process, which we term ‘The Multiplier Effect’, can limit the evolutionary value of responding adaptively to environmental cues, and thus favour the evolutionary persistence of otherwise paradoxical unconditional strategies.” By “multiplier effect”, McNamara and Dall mean,

If a genotype is distributed in space and its fitness varies with location, then selection will change the spatial distribution of the genotype through its effect on population demography. This process can accumulate genotype members in locations to which they are well suited. This accumulation by selection is the multiplier effect.

It is possible, they discover, for the ‘multiplier effect’ to reverse, so that there is an excess of the population in the worst habitats, when there is very rapid environmental change. The environmental change they model is a Markov process in which a large number of patches switch independently between two environments that produce different growth rates for a population of organisms. They find that for moderate rates of environmental change, populations will have higher asymptotic growth rates if they reduce their rate of unconditional dispersal between patches, which produces effective selection for lower dispersal.

Their key finding is that the reversal of the ‘multiplier effect’ due to rapid environmental change corresponds exactly with a reversal in the direction in which dispersal evolves: when abundance is greater on better habitats, lower dispersal is selected for; when abundance is greater on worse habitats because the environment changes so fast, there is selection for higher dispersal.

McNamara and Dall conclude their paper saying, “the multiplier effect may underpin the evolution and maintenance of unconditional strategies in many biological systems.” This is indeed the case. Their results are in fact part of the phenomenon already known as the “reduction principle”, which was first described as such in models for the evolution of linkage (Feldman, 1972), and subsequently extended to models for the evolution of mutation rates, gene conversion, dispersal, sexual reproduction, and even cultural transmission of traditionalism (Altenberg, 1984). The reduction principle also underlies other phenomena: the ‘error catastrophe’ in quasispecies dynamics, and the effect of population subdivision on the maintenance of genetic diversity.

The Reduction Principle can be stated, in a rather general form, as the widely exhibited phenomenon that mixing reduces growth, and differential growth selects for reduced mixing.

While the reduction phenomenon studied in McNamara and Dall (2011) is not a new concept, three particular aspects of their study are novel:

  1. their discovery of conditions that cause mixing to increase growth — which addresses the open problem posed in Altenberg (2004, Open Question 3.1) as to the conditions that produce departures from the reduction principle;

  2. that these departures from reduction emerge from very rapidly changing environments; and

  3. that these departures from reduction correspond to reversals in the association between fitness and abundance in different environments.

McNamara and Dall produce these results from a two-environment model. A principal goal here is to generalize each of these findings to arbitrary numbers of environments. Insight on how to generalize them is provided by clues in their results. Some of these clues point to the main tool used to achieve the generalization, a theorem of the late Sam Karlin, to be described.

The property described by McNamara and Dall as ‘the multiplier effect’ is here made mathematically precise, as a positive covariance between fitness and the excess of the stationary distribution of the population above what it would be in the absence of differential growth rates, as censused just after dispersal. I refer to this quantity as the fitness-abundance covariance, which is a bit more descriptive and specific than the term ‘multiplier effect’, which already has long use as a concept in economics.
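The verbal definition above can be sketched numerically. The following is a minimal illustration, not the paper's own computation: it assumes a column-stochastic dispersal matrix `M`, a vector of environment-specific growth rates `d`, and an unweighted covariance taken across environments. The names `perron` and `fitness_abundance_covariance` and all numerical values are hypothetical.

```python
import numpy as np

def perron(A):
    """Spectral radius and sum-normalized Perron vector of a
    non-negative irreducible matrix A."""
    vals, vecs = np.linalg.eig(A)
    k = np.argmax(vals.real)          # the Perron root has the largest real part
    v = np.abs(vecs[:, k].real)
    return vals[k].real, v / v.sum()

def fitness_abundance_covariance(M, d):
    """Unweighted covariance, across environments, between growth rates d
    and the excess of the post-dispersal stationary distribution over the
    distribution that dispersal alone would produce (assumed definition)."""
    rho, x = perron(M @ np.diag(d))   # census just after dispersal
    _, pi = perron(M)                 # stationary distribution without differential growth
    excess = x - pi
    cov = np.mean((d - d.mean()) * (excess - excess.mean()))
    return cov, rho, pi

# Hypothetical two-environment example.
M = np.array([[0.9, 0.1],
              [0.1, 0.9]])            # columns sum to 1
d = np.array([2.0, 1.0])              # growth rate in each environment
cov, rho, pi = fitness_abundance_covariance(M, d)
print(cov, rho - pi @ d)
```

Under these assumptions the covariance times the number of environments equals the asymptotic growth rate minus the growth rate averaged over the dispersal-only distribution, because the columns of `M` sum to one. A positive value therefore says the population grows faster than it would if growth did not reshape its distribution.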

A critical aspect of using the fitness-abundance covariance is the phase in the life cycle at which the census is taken. When McNamara and Dall say that “individuals are likely to find themselves in circumstances to which they are well-adapted,” it matters where in its life cycle the individual finds itself: whether it is on its natal site or has already dispersed. McNamara and Dall do not explicitly address the phase at which they take their census, but their model shows it to be just after dispersal, before reproduction.

The issue of census phase is explicitly addressed here, and is shown to critically affect properties of the fitness-abundance covariance. For populations censused just after dispersal, one cannot say in general that “individuals are already likely to be on the better site.” As a consequence of this phase dependence, a novel result found here is that by taking a census of the populations before and after reproduction, one can in certain situations infer a bound on the duration of changing environments.

A result in McNamara and Dall (2011) that garnered considerable attention is that “ ‘stupid strategies’ could be best for the genes” (University of Exeter, 2011):

One underappreciated consequence of the multiplier effect is that because individuals tend to be in locations to which they are well suited, its mere existence informs an organism that it is liable to be in favourable circumstances. This information can outweigh environmental cues to the contrary, so that an individual should place more weight on the fact it exists than on any additional cues of location quality. McNamara and Dall (2011)

The general analysis provided here produces a result that seems to contradict the above: philopatry is never an evolutionarily stable strategy when there is any level of environmental change; it can always be invaded by organisms that disperse from the correct environments.

In an attempt to resolve the apparent contradiction, I examine more closely the adaptive landscape, that is, the gradient of fitness over the space of conditional dispersal probabilities. What is found is that the evolutionarily stable state is highly sensitive to constraints on the organismal variability for dispersal probabilities. Slight changes in the constraints can shift the evolutionarily stable state from complete philopatry to complete dispersal from some environments. This sensitivity means that conditional dispersal may be a highly volatile trait evolutionarily. Moreover, to understand the evolution of any particular species requires an analysis of the constraints on the phenotype, and the probabilities of generating heritable variation in any phenotypic direction: in short, an evolvability analysis (Wagner and Altenberg, 1996).

While it is relatively straightforward to determine the long-term growth rates of different dispersal phenotypes, determining the likelihood that such phenotypes will be produced by the population plunges one into issues of the organism’s perceptual and cognitive limits, ecological correlates, and the genotype-phenotype map, and addressing these issues requires specific empirical knowledge of the organism and its variability. This is perhaps why, as Levinton (1988, p. 494) insightfully writes, “Evolutionary biologists have been mainly concerned with the fate of variability in populations, not the generation of variability. …Whatever the reason, the time has come to reemphasize the study of the origin of variation.” A principal finding here is that the evolutionary outcome is not determined by the adaptive landscape studied here, and we are pointed instead to examine the variational properties of each particular species in question.

1.1 The Reduction Principle and Fisher’s Fundamental Theorem of Natural Selection

The intuition as to why there should be selection for lower dispersal in a population at a stationary balance between dispersal and selection is well expressed in the following explanation:

Even in the absence of genetic variability for local adaptation in a spatially heterogeneous environment, migration will be selected against because on the average an individual will disperse to an environment worse than the one it was born in, since better environments harbor more individuals. (Olivieri et al., 1995).

This is a description of populations that have equilibrated to a balance between dispersal and differential growth. Fisher’s Fundamental Theorem is that differential growth rates increase the mean fitness of the population by an amount equal to the variance in the growth rates. When the population is at a stationary distribution, however, this requires that dispersal decrease the mean fitness by exactly the same amount. Fisher uses the phrase “deterioration of the environment” (Fisher 1958; discussed in Price 1972) to describe this exact counterbalance to the variance in fitness that increases the mean fitness. But he includes mutation in this concept:

…an equilibrium must be established in which the rate of elimination is equal to the rate of mutation. To put the matter in another way we may say that each mutation of this kind is allowed to contribute exactly as much to the genetic variance of fitness in the species as will provide a rate of improvement equivalent to the rate of deterioration caused by the continual occurrence of the mutation. (Fisher, 1958, p. 41)

Fisher was thinking of mutation, not dispersal, in the above. But as we shall see later, the same mathematics underlies both. As in the mutation/selection balance Fisher describes, dispersal will generally be to lesser quality environments when the population has reached a growth/dispersal balance.
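The counterbalance described above can be checked on a small example. This is a sketch under stated assumptions: a discrete-generation life cycle of reproduction followed by dispersal, a column-stochastic dispersal matrix `M`, and growth rates `d`, all with hypothetical values. In discrete generations the gain in mean fitness from differential growth is the variance in growth rates divided by the mean, rather than the variance itself.

```python
import numpy as np

M = np.array([[0.9, 0.1],
              [0.1, 0.9]])           # hypothetical dispersal matrix, columns sum to 1
d = np.array([2.0, 1.0])             # hypothetical growth rates

# Stationary distribution censused just after dispersal:
# the Perron vector of M @ diag(d).
vals, vecs = np.linalg.eig(M @ np.diag(d))
k = np.argmax(vals.real)
x = np.abs(vecs[:, k].real)
x /= x.sum()

mean_before = x @ d                  # mean fitness before reproduction
variance = x @ (d - mean_before) ** 2
y = d * x / mean_before              # distribution after differential growth
mean_after_growth = y @ d
z = M @ y                            # distribution after dispersal
mean_after_dispersal = z @ d

gain = mean_after_growth - mean_before            # increase due to differential growth
loss = mean_after_growth - mean_after_dispersal   # decrease due to dispersal
print(gain, loss, variance / mean_before)
```

At the stationary distribution dispersal returns the population to `x`, so `loss` matches `gain`: dispersal moves individuals, on average, to environments worse than the ones they were born in.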

The interchangeability of many results in population genetics between mutation and dispersal reflects the fact that an organism’s location, like its genotype, is transmissible information about its state, and its degree of preservation during transmission is itself an organismal phenotype and subject to evolution (Cavalli-Sforza and Feldman 1973; Karlin and McGregor 1974; Altenberg 1984, pp. 15–16, p. 178; Schauber et al. 2007; Odling-Smee 2007). The issue of the faithfulness of transmission brings us to the reduction principle.

2 A Review of the Reduction Principle

McNamara and Dall are more correct than perhaps even they realized in noting that their subject is an “underappreciated evolutionary process”. It is clear that awareness of the body of population genetics literature on the reduction principle has not fully percolated between disciplines. Karlin’s (1982) key theorem on the reduction phenomenon, and its application to the evolution of dispersal (Altenberg, 1984), were independently duplicated recently by Kirkland et al. (2006). And McNamara and Dall (2011) were evidently unaware of the paper by Kirkland et al. (2006), published in a mathematics journal.

One main goal of this paper, therefore, is to provide a ‘portal’ to the reduction principle, its historical development, and methods of analysis for a broader audience. Here, I tie in the work of McNamara and Dall (2011) to the larger stream of work on the reduction principle, and show that their work contributes toward answering one of the main open problems in the field: how departures from the reduction phenomenon are produced.

It may be appropriate to apologize for the density of equations in this paper, as equations nowadays are often being relegated to online-only supplements. But the subject of this paper is in fact mathematical methodology. It is the mathematics that creates a single conceptual and analytical framework for dispersal, recombination, mutation, random environments, and multiple genetic processes. To show how they all share in a single body of results requires we delve into the mathematics.

It should be noted that many theoretical studies constrain their analysis to models having only two patches or genotypes, to allow explicit calculation of the eigenvalues and eigenvectors (e.g. McNamara and Dall 2011, Steinmeyer and Wilke 2009). There are mathematical tools from the reduction principle literature, however, in particular the aforementioned theorems of Karlin, that make analytical results tractable for arbitrary numbers of patches or genotypes. Dissemination of these tools to a larger audience is another principal goal of this paper. They are laid out in Methods.

2.1 Development of the Reduction Principle

In the first analyses of genetic modifiers of mutation, recombination, and migration by Marc Feldman and coworkers in the 1970s, a common result kept appearing, which was that reduced levels of mutation, recombination, or migration would evolve when populations were near equilibrium under a balance between the forces of selection and transmission. The earliest appearance of the reduction phenomenon in the literature is perhaps Fisher’s (1930, p. 130) assertion that “the presence of pairs of factors in the same chromosome, the selective advantage of each of which reverses that of the other, will always tend to diminish recombination, and therefore to increase the intensity of linkage in the chromosomes of that species.” This claim was mathematically verified by Kimura (1956). Nei (1967; 1969) posed the first three-locus model for the evolution of recombination, with a modifier locus controlling the recombination between two loci under selection, and found that only reduced recombination would evolve. The first fixed-point stability analysis of modifiers of recombination between two loci under viability selection was by Feldman (1972), who found that recombination would be reduced by evolution. Subsequent studies extended the reduction result to larger and larger spaces of models, including modifiers of:


  • migration: Karlin and McGregor (1972); Balkau and Feldman (1973); Karlin and McGregor (1974); Teague (1977); Asmussen (1983); Hastings (1983); Feldman and Liberman (1986); Liberman and Feldman (1989); Wiener and Feldman (1991, 1993);

  • recombination: Feldman (1972); Karlin and McGregor (1972); Feldman and Balkau (1972, 1973); Feldman and Krakauer (1976); Feldman et al. (1980); Liberman and Feldman (1986a); Feldman and Liberman (1986); and

  • mutation: Karlin and McGregor (1972); Liberman and Feldman (1986b); Feldman and Liberman (1986).

(Note that this literature prefers the term ‘migration’, while ‘dispersal’ is preferred in the ecology literature. Literature searches need to include both.)

These studies also extended the generality of the reduction results to include arbitrary large modified rates, arbitrary viability selection regimes, and multiple modifier alleles. They could only analyze the case of two patches or two alleles per selected locus, however, due to their use of closed-form solutions for the determinants or eigenvalues. Hastings (1983) is notable in extending the phenomenon to continuous spatial variation.

Feldman (1972) proposed that the essential direction of evolution for the recombination modifiers was reduction in the recombination rates. Shortly thereafter, Karlin and McGregor (1972, 1974) proposed an alternative idea, that the underlying governor for the direction of modifier evolution was the “Mean Fitness Principle”. The Mean Fitness Principle proposed that a modifier allele increases when rare if and only if it changes the parameter it controls to a value that would increase the mean fitness of the population at equilibrium. Both the reduction and mean fitness principles explained the known results at that time. However, Karlin and Carmelli (1975, Fig. 1) found an example where reducing recombination would decrease the mean fitness of the population, while Feldman et al. (1980) showed that, even for this example, an allele reducing recombination would grow in the population. Therefore, only the reduction principle remained unfalsified. Subsequent modifier gene studies have found other counterexamples to the mean fitness principle (Uyenoyama and Waller, 1991a, b; Wiener and Feldman, 1993). Feldman et al. (1980) is where reduction was first referred to as a “principle”.

2.2 Karlin’s Theorems

During the time period of these developments, Karlin had, ironically, elucidated the mathematical foundations for the reduction principle himself — without realizing it.

Karlin was investigating a seemingly distant topic — how population subdivision would affect the maintenance of genetic variation. To understand how the protection of alleles against extinction depended on migration patterns and rates, Karlin (1976, 1982) developed two general theorems on the spectral radius of perturbations of migration-selection systems. The spectral radius is the growth rate for the whole group of genotypes that comprise the perturbation as they approach a stationary distribution among themselves.

These theorems show how, for two different kinds of variation in migration, a greater level of ‘mixing’ reduces the spectral radius of the stability matrix for the system, and thus may cause some alleles to lose their protection against extinction. Hence, greater levels of mixing would lead to fewer polymorphic alleles. Preparatory to this work was the paper by Friedland and Karlin (1975). The theorems first appear, without proof, in Karlin (1976, pp. 642–647), and with proof as Theorems 5.1 and 5.2 in Karlin (1982), restated as follows:

Theorem 1 (Karlin 1982, Theorem 5.1, pp. 114–116, 197–198).

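Consider a family $\mathcal{F}$ of stochastic matrices that commute and are symmetrizable to positive definite matrices:

$$\mathcal{F} = \{ M_h = L S_h R \;:\; M_h M_k = M_k M_h \},$$

where $L$ and $R$ are positive diagonal matrices, and each $S_h$ is a positive definite symmetric real matrix. Let $D$ be a positive diagonal matrix. Then for each $M_h, M_k \in \mathcal{F}$, the spectral radius, $\rho$, satisfies:

$$\rho(M_h M_k D) \le \rho(M_k D).$$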

Theorem 2 (Karlin 1982, Theorem 5.2, pp. 117–118, 194–196).

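Let $M$ be a non-negative irreducible stochastic matrix. Consider the family of matrices

$$M(\alpha) = (1-\alpha) I + \alpha M.$$

Then for any positive diagonal matrix $D$, the spectral radius

$$\rho(\alpha) = \rho(M(\alpha) D)$$

is decreasing as $\alpha$ increases (strictly provided $D \neq d\,I$ for every scalar $d$).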

In Theorem 5.1, ‘more mixing’ is produced by the application of a second mixing operator; in Theorem 5.2, more mixing is produced by the equal scalar multiplication of all the transition probabilities between states. In both cases, greater mixing reduces the spectral radius, which represents the asymptotic growth rate of a rare allele in Karlin’s analysis.
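The monotonicity asserted by Theorem 5.2 is easy to observe numerically. The sketch below is only an illustration, not part of Karlin's proof: it assumes the family of matrices $(1-\alpha) I + \alpha P$ for $\alpha$ between 0 and 1, with a randomly generated positive column-stochastic matrix `P` and a random positive diagonal matrix `D`. The names and the random seed are arbitrary choices.

```python
import numpy as np

def spectral_radius(A):
    """Largest modulus among the eigenvalues of A."""
    return np.max(np.abs(np.linalg.eigvals(A)))

rng = np.random.default_rng(0)
n = 4
P = rng.random((n, n)) + 0.01
P /= P.sum(axis=0)                       # column-stochastic and positive, hence irreducible
D = np.diag(rng.uniform(0.5, 2.0, n))    # positive diagonal matrix of growth rates

alphas = np.linspace(0.0, 1.0, 11)
radii = np.array([
    spectral_radius(((1 - a) * np.eye(n) + a * P) @ D)
    for a in alphas
])

for a, r in zip(alphas, radii):
    print(f"alpha = {a:.1f}   spectral radius = {r:.6f}")
```

With no mixing at all the spectral radius is the largest diagonal entry of `D`, and each increase in the scaling lowers it, as long as the diagonal entries are not all equal.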

Theorems 5.1 and 5.2 display certain tradeoffs in generality. In Theorem 5.2, $M$ may be any irreducible stochastic matrix, but the variation in the matrix family is restricted to a single parameter: the scaling of the transition probabilities. In Theorem 5.1, on the other hand, the variation in the matrix family is more general in that it has up to $n-1$ degrees of freedom to vary for $n \times n$ matrices (see Remark for Lemma 22), but the matrix class itself is narrower with the constraint that they be symmetrizable.
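To illustrate the kind of family that Theorem 5.1 covers, one can take symmetric circulant stochastic matrices, which commute with each other and are positive definite when all their eigenvalues are positive. This is a hypothetical instance chosen for convenience (with the two diagonal symmetrizing matrices equal to the identity), not an example from Karlin (1982). The check is that applying a second mixing operator does not increase the spectral radius.

```python
import numpy as np

def spectral_radius(A):
    """Largest modulus among the eigenvalues of A."""
    return np.max(np.abs(np.linalg.eigvals(A)))

def symmetric_circulant(c0, c1, c2):
    """4 x 4 symmetric circulant matrix with first row (c0, c1, c2, c1)."""
    row = np.array([c0, c1, c2, c1])
    return np.array([np.roll(row, i) for i in range(4)])

# Two mixing matrices from the same commuting family; each row sums to 1.
M_h = symmetric_circulant(0.70, 0.10, 0.10)
M_k = symmetric_circulant(0.60, 0.05, 0.30)
D = np.diag([2.0, 1.0, 0.5, 1.5])       # hypothetical growth rates

r_k = spectral_radius(M_k @ D)          # one mixing operator
r_h = spectral_radius(M_h @ D)
r_hk = spectral_radius(M_h @ M_k @ D)   # a second mixing operator applied

print(r_hk, r_k, r_h)
```

The two matrices differ in more than a single scaling parameter, which is the extra generality of Theorem 5.1, while their symmetry and shared eigenvectors are the price paid for it.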

Karlin’s proof of Theorem 5.2 relied upon the recently minted variational formula for the spectral radius of Donsker and Varadhan (1975). These results on the reduction principle, and their means of generalization, all came into being in the same time period.

3 Application of Karlin’s theorems to the Evolution of Dispersal and Genetic Systems

My own contribution to the reduction principle began with a conjecture by Marcus Feldman (1980, personal communication). The existence of polymorphisms for genes controlling recombination and mutation rates had been discovered theoretically by Feldman and Balkau (1973) and Feldman and Krakauer (1976). Generalizing from these examples, Feldman conjectured that whenever a parameter controlled by a gene enters linearly into the recursion on the frequency dynamics, then a polymorphism for that gene would exist in which:

  1. the population, when fixed on an allele producing a particular value of the linear parameter, is at an equilibrium;

  2. each allele’s average value of the parameter is equal to that particular value; and

  3. the gene is in linkage equilibrium with the rest of the genome.

Because condition 2. was analogous to the condition for alleles under viability selection that their marginal fitnesses be equal at equilibrium, these polymorphisms were called ‘viability-analogous, Hardy-Weinberg’ (VAHW) modifier polymorphisms.

The repeated appearance of the VAHW polymorphisms, and the repeated occurrence of the reduction principle in models of different phenomena (recombination, mutation, and dispersal) prompted me to investigate the possible unification of these phenomena, which is provided in Altenberg (1984).

It turns out that the only way a parameter can enter linearly in the recursion is if it modifies transmission probabilities rather than fitnesses. The approach to unification was to represent all of the models in one general expression, in which the specifics of the transmission probabilities (the probabilities that a given pair of parents produces a given offspring type) are ignored, while the variation produced by the modifier locus is made explicit.

The form of variation studied was where the modifier gene produced an equal scaling, $\alpha$, of all transmission probabilities between states, that is, of every probability that the offspring’s state differs from a parental state. The principal models that exhibited the reduction principle all incorporated this form of variation. Equal scaling of transmission probabilities occurs when a single transformative event acts on the transmitted information, and the modifier gene controls the rate of this event (Altenberg, 2011).

With this explicit representation of variation, the models that had exhibited the reduction principle had stability matrices of the form $M(\alpha) D$ for newly introduced modifier alleles, where $M(\alpha) = (1-\alpha) I + \alpha M$ as in Karlin’s theorem. Once this structure is made evident, application of Karlin’s Theorem 5.2 immediately yields the result that the growth rate of a new modifier allele is a decreasing function of the scaling parameter $\alpha$, so if the allele reduces $\alpha$ below the current value in the population, it invades, and if it increases $\alpha$ above the current level, it goes extinct.

Thus evolution would reduce the rates of all of these various processes, or others that had never been modeled before but which were covered by the general formulation. Prior studies needed to assume only two alleles under selection, or two patches subdividing the population, because they relied on closed-form solutions to determinants or eigenvalues. Karlin’s theorem allowed the result to be generalized to arbitrary numbers of alleles and patches, arbitrary patterns of transformation, and arbitrary selection regimes.
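The invasion criterion can be sketched in a deliberately stripped-down setting. The following assumes an asexual caricature with no genetics: a resident type and a rare new type share the same growth rates `d` and dispersal pattern `P`, and differ only in the scaling `alpha` applied to dispersal. It is not the model of Altenberg (1984); the three-patch matrix, the growth rates, and the function names are hypothetical.

```python
import numpy as np

# Hypothetical asymmetric dispersal pattern among three patches;
# entry [i, j] is the probability of moving from patch j to patch i.
P = np.array([[0.0, 0.5, 0.2],
              [0.7, 0.0, 0.8],
              [0.3, 0.5, 0.0]])
d = np.array([1.5, 1.0, 0.6])      # hypothetical patch-specific growth rates

def growth_rate(alpha):
    """Asymptotic growth rate of a type whose dispersal is scaled by alpha."""
    n = len(d)
    M = (1 - alpha) * np.eye(n) + alpha * P
    return np.max(np.abs(np.linalg.eigvals(M @ np.diag(d))))

def invades(alpha_new, alpha_resident):
    """A rare type invades if it outgrows the resident type."""
    return growth_rate(alpha_new) > growth_rate(alpha_resident)

print(invades(0.1, 0.3))   # lower dispersal than the resident
print(invades(0.5, 0.3))   # higher dispersal than the resident
print(growth_rate(0.0))    # no dispersal at all
```

Because no closed-form eigenvalue is needed, the same few lines apply to any number of patches and any irreducible dispersal pattern, which is the practical content of the generalization.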

It should be noted that modifiers of segregation distortion have altogether different dynamics that merit a separate classification (Altenberg, 1984, pp. 170–178).

Slight variation among different models led to separate treatments for modifiers of mutation and recombination (Altenberg 1984, pp. 106–169, Altenberg and Feldman 1987), modifiers of dispersal (Altenberg, 1984, pp. 77–81, 178–199), modifiers of rates of asexual vs. sexual reproduction (ibid. pp. 199–203), and culturally transmitted modifiers of cultural transmission — i.e. ‘traditionalism’ (ibid. pp. 203–206). All of these phenotypes manifest the reduction principle for the same underlying reason, the spectral radius property shown in Karlin’s Theorem 5.2.

3.1 The Dispersal Modifier Results of Altenberg (1984)

The results on the evolution of dispersal modifiers in Altenberg (1984, pp. 77–81, 178–199) will be briefly reviewed, so that the work need not be duplicated, as has recently occurred (Kirkland et al., 2006).

The model is of an organism that has a multiple-stage life cycle, consisting of random mating, semelparous reproduction, selection on gametes, zygotes, and adults, and lastly, dispersal. The reproductive output of an organism depends on its patch and its diploid genotype for a gene under selection. The probability of dispersing between any two patches is scaled by a modifier gene. The model includes several generalizations of prior work:

  • arbitrary numbers of patches;

  • arbitrary dispersal patterns between patches, which may include cycles and asymmetry;

  • dispersal of either adults or gametes (but not dispersal of zygotes, which breaks the Hardy-Weinberg frequencies of diploids and complicates the analysis);

  • arbitrary hard or soft selection patterns on diploids and gametes;

  • arbitrary numbers of alleles at a dispersal-modifying locus; and

  • arbitrary number of alleles for the locus with patch-specific fitnesses.

Analysis is made of the evolutionary stability of populations near equilibrium. In order for any new modifier allele to grow or decline at a geometric rate, the equilibrium must possess variation in the reproductive rates among patches and/or genotypes. This variation was first identified as a property of equilibrium populations at mutation-selection balance by Haldane (1937), and was later called the ‘genetic load’ by Muller (1950).

The term “fitness load” was used in Altenberg (1984) to generalize the genetic load concept to circumstances where there may be no genes involved, in particular to patches with different growth rates where the stationary distribution leaves some patches as sinks and others as sources, as they were later to be called (Pulliam, 1988). The term ‘selection potential’ was adopted in Altenberg and Feldman (1987) because of the analogy to potentials in physical systems, and because it was the actual maximum potential selective advantage that a modifier allele could accrue. A positive selection potential is necessary for any geometric growth in the modifier allele. A selection potential of zero corresponds to a population at an ‘ideal free distribution’ (Fretwell and Lucas, 1969; Fretwell, 1972).

For the dispersal modifier model in Altenberg (1984), a positive selection potential requires some differences at equilibrium among the terms


over the environments , and genotypes , where

is the population size in environment after selection, and after dispersal,

is the mean fitness in environment ,

is the mean fitness of the allele under selection in environment ,

under hard selection, and is constant under soft selection.

One can see the two sources for a selection potential in (2): ecological, i.e. variation in (mentioned in the earlier quote of Olivieri et al. 1995), and genetic, i.e. variation in .

Ideal free distributions, which have zero selection potential, may be produced by the “balanced mixture polymorphisms” discussed in Altenberg (1984, pp. 101–104, 129, 189–190, 218–222), which are synonymous with the Nash equilibria studied in Schreiber and Li (2011).

The main results obtained are the manifestation of the Reduction Principle for dispersal rates. First, we have this result for a modifier allele with extreme effect:


Result 3.27, Altenberg (1984, p. 195): A modifier allele which stops all migration will always increase when introduced to a population with an equilibrium selection potential, for any linkage to the locus under selection.

For modifier alleles with intermediate effects on dispersal, tractability requires the assumption of tight linkage between the modifier locus and the selected locus. Under tight linkage, the stability matrix for the new modifier allele becomes a direct sum of blocks for each allele under selection:


where is the matrix of average dispersal probabilities produced by modifier alleles in the equilibrium population, and

where is the number of patches. Then the following is obtained:


Result 3.28, Altenberg (1984, p. 199):

  1. The new modifier allele, , can change frequency at a geometric rate, that is, , only if there is an equilibrium selection potential in the population, so that .

  2. The spectral radius for the new modifier allele, , depends only on how its marginal migration matrix is related to the equilibrium marginal migration matrix . The results of Theorem 3.14 for linear variation …therefore apply directly:


3.14, Altenberg (1984, p. 137): For a tightly linked modifier locus, when a new modifier allele, , is introduced to a population at a stable viability-analogous, tensor product equilibrium (VAHW), where there is a variance in the marginal fitnesses of the selected types present, then for as defined in (3), the new modifier allele frequency will increase if , and it will be excluded if .

Theorem 3.14 derives directly from Karlin’s Theorem 5.2, which shows in addition that the asymptotic growth rate of the new modifier allele increases as the scaling parameter decreases throughout its range.

Karlin’s Theorem 5.2, and the dispersal modifier results above, have recently been duplicated by Kirkland et al. (2006, Theorem 3.1). They use a novel, structure-based proof for their version of Theorem 5.2, while Karlin used the Donsker and Varadhan formula for the spectral radius. They apply it to the evolution of unconditional dispersal, and prove a special case of Altenberg (1984, Result 3.28 and Theorem 3.14) where the genetics and life history stages are absent. Their results are extended to continuous time models by Schreiber and Lloyd-Smith (2009, online Appendix B), while Altenberg (2010) uses the Donsker and Varadhan formula to extend Theorem 5.2 to the continuous time case.

The results in Kirkland et al. (2006), while being special cases of Altenberg (1984) as far as the genetics are concerned, offer generalizations of the reduction principle in other new directions, namely, they generalize the work on density-dependent population regulation first addressed for dispersal modifiers by Asmussen (1983), and cover the general case where growth rates decrease with population size (Kirkland et al., 2006, Assumptions A1-A3). They also cover the case of reducible dispersal matrices (Theorem 4.4), the case of lossy dispersal (Assumption A4), and the fate of the modifier allele far from perturbation (Theorem 3.3). They examine conditional dispersers, and analyze the evolutionarily stable state in which dispersal has been conditioned to the point where the population reaches an ideal free distribution.

It should be noted that the ideal free distribution was proposed as the ultimate evolutionarily stable state by Kimura (1967) in his ‘principle of minimum genetic load’. Kimura was thinking about the evolution of mutation rates, not dispersal. But the driving force in each case, the genetic load for mutation and the presence of sink and source populations for dispersal (Pulliam, 1988), is mathematically the same phenomenon.

4 Departures from Reduction

While the reduction phenomenon occurs throughout a diverse class of evolutionary models, there are two principal classes in which departures from reduction are found. The first class, which will not be further addressed here, comprises situations in which the population is continually kept far from equilibrium, due to genetic drift (e.g. the Hill-Robertson effect (Barton and Otto, 2005; Roze and Barton, 2006; Keightley and Otto, 2006), also Gillespie (1981a)), varying selection regimes (Charlesworth, 1976; Gillespie, 1981b; Ishii et al., 1989; Sasaki and Iwasa, 1987; Bergman and Feldman, 1990; Wiener and Tuljapurkar, 1994; Schreiber and Li, 2011; Blanquart and Gandon, 2010), or flux of beneficial mutations (Eshel, 1973a, b; Kessler and Levine, 1998).

The second class comprises cases of populations near equilibrium where multiple transformation processes act on the transmissible information of the organism. Studies of multiple transformation processes where departures from reduction are found include the evolution of:

  • recombination in the presence of mutation (Feldman et al., 1980; Charlesworth, 1990; Otto and Feldman, 1997; Pylkov et al., 1998). The greatest attention has been given to this combination. The departures from the reduction result in this case are the basis of the ‘deterministic mutation hypothesis’ for the evolution of sex (Kondrashov, 1982, 1984; Kouyos et al., 2007).

  • recombination in the presence of dispersal (Charlesworth and Charlesworth, 1979; Pylkov et al., 1998);

  • multiple mutation processes (Altenberg, 1984, pp. 137–151);

  • recombination in the presence of segregation and syngamy (which self-fertilization exposes in the recursion) (Charlesworth et al., 1979; Holsinger and Feldman, 1983a);

  • mutation in the presence of segregation and syngamy (exposed in the recursion by self-fertilization (Holsinger and Feldman, 1983b), or fertility selection (Holsinger et al., 1986; Twomey and Feldman, 1990)).

It is notable that in their studies of dispersal in the presence of mutation, Wiener and Feldman (1991, 1993) found no departures from the reduction principle.

The pattern of departures from the reduction principle caused by multiple transformation processes was summarized in Altenberg (1984, pp. 149, 225–228) by a simple heuristic:

The principle of partial control:

When the modifier gene has only partial control over the transformations occurring at loci under selection, then it may be possible for the part which it controls to evolve an increase in rates.

In several cases where multiple transformation processes produce departures from reduction, the stability matrix on the modifier gene has the form


Matrices of the form (4) also appear when the modifier gene is not tightly linked to the loci under selection (Feldman 1972; Altenberg 1984, p. 135; Altenberg and Feldman 1987; Altenberg 2009b). Karlin’s Theorem 5.2 does not apply to such matrices, leaving an entire class of models as an open problem. In a survey of open problems in the spectral analysis of evolutionary dynamics (Altenberg, 2004), the following problem was posed:

Open Question (3.1 in Altenberg 2004).

Let and be irreducible stochastic matrices, and let be a positive diagonal matrix. Define


For what conditions on , , and is the spectral radius strictly decreasing in , for , or

This open problem brings us back to the paper by McNamara and Dall (2011).

5 The Model of McNamara and Dall (2011)

The model of McNamara and Dall (2011, eq. (D.10)) is an example of ‘partial control’ (4), where we set and , being the stationary distribution for stochastic matrix , i.e. . A major point of interest is that McNamara and Dall find conditions on that produce departures from the reduction phenomenon, providing another example of the principle of partial control, and contributing towards answering the open problem posed in Altenberg (2004), above. The recursion for their model is




The control exerted by over the transformations occurring in the system in (7) is only partial because the environment itself undergoes transformation, represented by , and the organism cannot eliminate , but only shift between and .

The McNamara and Dall model represents the following. Let be the number of individuals in environment at time , and be the number after one iteration of reproduction and dispersal.

  1. An individual is born into a site with environment type ;

  2. The individual reproduces on the site, and produces an average of offspring when in environment ;

  3. Each offspring disperses independently with probability to a random site;

  4. In one generation, sites of environment type change randomly and independently to type with probability ;

  5. The sites have settled down to a stationary distribution, so the probability that the site will be in environment state is .
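The five steps above can be read as a single matrix recursion. The sketch below assumes the reading z(t+1) = [(1 − m)P + m π eᵀ] D z(t): reproduction (D), after which non-dispersers remain on a site whose environment changes according to P, while dispersers land on a random site whose type is distributed as π. The function names and the two-environment values are hypothetical.

```python
def stationary(P, iters=5000):
    """Stationary distribution pi of a column-stochastic matrix P (P pi = pi)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(P[i][j] * pi[j] for j in range(n)) for i in range(n)]
    return pi


def step(z, P, pi, d, m):
    """One generation of reproduction, dispersal, and environmental change."""
    n = len(z)
    born = [d[j] * z[j] for j in range(n)]                 # step 2: reproduction
    total = sum(born)
    stay = [sum(P[i][j] * born[j] for j in range(n))       # step 4: sites change type
            for i in range(n)]
    # steps 3 and 5: a fraction m disperses to random sites distributed as pi
    return [(1.0 - m) * stay[i] + m * pi[i] * total for i in range(n)]


def growth_rate(P, d, m, iters=5000):
    """Asymptotic growth rate and stationary proportions, by iterating the recursion."""
    pi = stationary(P)
    z = [1.0 / len(d)] * len(d)
    rate = 1.0
    for _ in range(iters):
        z_next = step(z, P, pi, d, m)
        rate = sum(z_next)                                 # z sums to 1
        z = [x / rate for x in z_next]
    return rate, z


# Hypothetical slowly changing environments; each column of P sums to one.
P = [[0.8, 0.3],
     [0.2, 0.7]]
d = [1.5, 0.5]
```

With these values, `growth_rate(P, d, m)` can be compared across dispersal rates m. At m = 1 the stationary proportions coincide with π, since every offspring lands on a random site.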

Recursion (6) in summation form is:

McNamara and Dall (2011) obtain analytical results for the case of two types of environment:



The model is notable for how it represents environmental randomness. The common way to model randomly changing environments would be to let represent the population size in each patch, represent the dispersal between patches, and let the matrix of environment-specific growth rates, , be a random or time-dependent variable on each patch (e.g. Karlin 1982, pp. 90–92, 103–104, 140–145), yielding a system


The analysis of such models can be challenging, requiring a resort to approximations and numerical analysis (see Gillespie 1981b, Tuljapurkar 1990, Wiener and Tuljapurkar 1994). Progress is being made in this area, however; one example is the analysis of a two-cycle model for the evolution of dispersal by Schreiber and Li (2011).

When the random process of changing environments is independent among all the patches, as the number of patches becomes large, the system becomes deterministic in the same way that the Wright-Fisher model becomes deterministic for large populations. This allows one to stop keeping track of each patch, and just keep track of the number of individuals in each environment type, which is what McNamara and Dall do in (6). This tremendously simplifies the analysis.

5.1 Clues to the Generalization of the Results

The original motivation for this paper was to generalize the results of McNamara and Dall (2011) to an arbitrary number of environments, and to gain insight into why their model produces departures from the reduction phenomenon. Their results reveal four clues needed to solve this generalization:

  1. The harmonic mean: McNamara and Dall (2011) find that departures from the reduction phenomenon are determined by the critical condition , where is the expected duration of environment . This expression is part of the harmonic mean, . Could the harmonic mean of figure into a generalization of their results?

  2. The limiting distribution: The two matrices in (7), , and , are not arbitrary, but have the relation . This means, notably, that they commute: , and thus satisfy one key condition of Karlin’s Theorem 5.1.

  3. The second eigenvalue: The terms and derive from the probabilities in : and . The condition translates to . Is it a coincidence that appears in the second eigenvalue of , ? Because we see that the critical condition becomes , which is precisely when no longer meets the condition of Karlin’s Theorem 5.1 that it be symmetrizable to a positive definite matrix. By extrapolation, if all the eigenvalues of besides are negative, could this be a general condition for departures from reduction?

  4. Symmetrizability: Since clues 2. and 3. show the relationship between the results of McNamara and Dall and Karlin’s Theorem 5.2, and we note that irreducible 2 × 2 matrices are always symmetrizable, might we want to retain symmetrizability as we try to generalize the results to n × n matrices?
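The first and third clues can be explored numerically for two environments. This sketch assumes a column stochastic P = [[1 − a, b], [a, 1 − b]], so that the expected durations are 1/a and 1/b, their reciprocals sum to a + b, and the second eigenvalue is 1 − a − b. It also assumes the recursion matrix [(1 − m)P + m π eᵀ]D. All names and numerical values are hypothetical.

```python
import math


def spectral_radius_2x2(A):
    """Largest eigenvalue of a nonnegative 2 x 2 matrix, from its trace and determinant."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return (tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0


def growth_rate(a, b, d, m):
    """Spectral radius of [(1 - m) P + m pi e^T] D for P = [[1 - a, b], [a, 1 - b]]."""
    pi = [b / (a + b), a / (a + b)]                     # stationary distribution of P
    P = [[1.0 - a, b], [a, 1.0 - b]]
    M = [[(1.0 - m) * P[i][j] + m * pi[i] for j in range(2)] for i in range(2)]
    A = [[M[i][j] * d[j] for j in range(2)] for i in range(2)]
    return spectral_radius_2x2(A)


def summary(a, b, d):
    """Return (1/tau_1 + 1/tau_2, second eigenvalue, whether a little dispersal raises growth)."""
    inverse_durations = a + b
    lambda2 = 1.0 - a - b
    dispersal_favored = growth_rate(a, b, d, 0.1) > growth_rate(a, b, d, 0.0)
    return inverse_durations, lambda2, dispersal_favored


slow = summary(0.1, 0.1, [2.0, 1.0])   # long-lasting environments
fast = summary(0.9, 0.9, [2.0, 1.0])   # environments that usually switch each generation
```

In this example the slowly changing case has a positive second eigenvalue and dispersal lowers the growth rate, while the rapidly changing case has a negative second eigenvalue and dispersal raises it.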

By following the last clue and constraining to be symmetrizable as in (1), we shall find it tractable to generalize the results of McNamara and Dall (2011), and we shall see that the conjectures prompted by the first and third clues are true.

Symmetrizable stochastic matrices are equivalent to the transition matrices of ergodic reversible Markov chains (Altenberg, 2011, Lemma 2). A Markov chain is reversible when the probability of cycles in one direction equals the probability of cycles in the opposite direction (Ross, 1983, Theorem 4.7.1, p. 127). In nature, directional cycles of environmental change may be more the rule than the exception, however. Whether cyclical environments would produce different results remains an open question.
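Reversibility can be tested numerically through detailed balance, which for an ergodic chain is equivalent to the cycle condition above. The following sketch uses the column stochastic convention, with P[i][j] the probability of a change from state j to state i. The two example matrices are hypothetical: a chain that only moves between neighbouring states, and a chain biased to rotate through its three states.

```python
def stationary(P, iters=10000):
    """Stationary distribution of an ergodic column-stochastic matrix, by iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(P[i][j] * pi[j] for j in range(n)) for i in range(n)]
    return pi


def is_reversible(P, tol=1e-9):
    """Detailed balance: the stationary flow from j to i equals the flow from i to j."""
    pi = stationary(P)
    n = len(P)
    return all(abs(P[i][j] * pi[j] - P[j][i] * pi[i]) < tol
               for i in range(n) for j in range(n))


# Hypothetical examples; each column sums to one.
P_neighbours = [[0.7, 0.2, 0.0],
                [0.3, 0.5, 0.4],
                [0.0, 0.3, 0.6]]
P_rotating = [[0.2, 0.1, 0.7],
              [0.7, 0.2, 0.1],
              [0.1, 0.7, 0.2]]
```

Here `is_reversible(P_neighbours)` holds, while `is_reversible(P_rotating)` fails because the flow around the cycle is stronger in one direction than the other.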

We can step beyond the McNamara and Dall model and obtain a more general theorem for departures from reduction for the form , where and satisfy (1). This is provided in Theorem 16 in Results. The theorem in Altenberg (2009a, 2011) that generalizes the reduction principle to the evolution of mutation rates among multiple loci turns out to be a special case of Theorem 16. This again illustrates the fact that genetic, spatial, cultural, and other transmissible information all belong to a single mathematical framework, and that results from one domain can often translate easily into results in other domains.

6 Results

McNamara and Dall (2011) describe their concept of a “multiplier effect” without ever giving it a precise mathematical definition. But it is clear from their usage in McNamara and Dall (2011, online Appendix A, Theorem A) that what they are thinking about can be summarized as the covariance between 1) the growth rates in each environment, and 2) the excess abundance of the population in that environment over what it would be without differential growth rates. This is defined explicitly below as the fitness-abundance covariance.

When organisms are semelparous, and generations are discrete and non-overlapping, there are two phases in the life cycle at which one can census the population: before and after dispersal, or equivalently, after and before reproduction. Thus, the fitness-abundance covariance must be defined for both census phases.

The fitness-abundance covariance is an object of interest in its own right. Section 6.1 ventures beyond the specifics of the McNamara and Dall model and explores various properties of the fitness-abundance covariance for the completely general case of , where is a stochastic matrix representing any process of change between states, and represents the state-specific growth rates. The generality of results in Section 6.1 not only includes the McNamara and Dall model as a special case, but goes beyond models of dispersal since can just as well represent a mutation matrix between genotypes whose fitnesses are . The results can also apply to a rare genotype in a sexual population where represents the linear stability matrix on its growth.

Section 6.3 returns to the specific model of McNamara and Dall with the chief goal of generalizing the results to any number of environmental states. Here is where we pursue the clues described in the previous section.

For clarity, terminology and conventions are provided in Table 1.

6.1 The Fitness-Abundance Covariance

A precise definition needs to be given for the degree to which “individuals tend to be in locations to which they are well suited.” While it may sound reasonable that an organism’s “mere existence informs an organism that it is liable to be in favourable circumstances” (McNamara and Dall, 2011, p. 237), this is not generally true.

The following is an example where an organism is more likely to find itself in a sink habitat than a source habitat. The situation is where there is a small source patch within a large habitat of sink patches. We get a simple result if we assume the extremes: that sink habitats are lethal, and dispersing organisms recruit with fixed probabilities to each patch, , and the dispersal rate is . Then is the stationary proportion of the population in the source patch after dispersal. The stationary portion of the population in the source patch, , can be made as small as one wishes by large dispersal rate , and small .

Clearly, before dispersal, all organisms in this example are in the source patch. So one must be clear about when in the life cycle one is speaking. An organism ‘deciding’ on whether to disperse or not is obviously at the pre-dispersal phase. But the post-dispersal phase is the phase that McNamara and Dall use to measure the ‘multiplier effect’.
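The source-sink example can be made concrete with a short calculation. The sketch below is a derivation under the stated extremes, not a formula quoted from the source: with lethal sinks, all offspring are born in the source patch, a fraction 1 − m stay there, and a fraction m recruit to patch i with probability c[i]. The names `m`, `c`, and `d_source` and the numerical values are hypothetical.

```python
def post_dispersal_distribution(m, c, d_source, iters=50):
    """Iterate z' = [(1 - m) I + m c e^T] D z with D = diag(d_source, 0, ..., 0)."""
    n = len(c)
    z = [1.0 / n] * n
    for _ in range(iters):
        born = [d_source * z[0]] + [0.0] * (n - 1)      # sink patches are lethal
        total = sum(born)
        z_next = [(1.0 - m) * born[i] + m * c[i] * total for i in range(n)]
        s = sum(z_next)
        z = [x / s for x in z_next]
    return z


def source_share(m, c_source):
    """Post-dispersal proportion in the source patch under the assumptions above."""
    return (1.0 - m) + m * c_source


# Hypothetical values: one small source patch (index 0) among nine sink patches.
c = [0.01] + [0.11] * 9
z = post_dispersal_distribution(0.9, c, 3.0)
```

With m = 0.9 and a recruitment probability of 0.01 to the source, only about a ninth of the post-dispersal population is in the source patch, even though every individual was born there.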

Examination of the results of McNamara and Dall also reveals that when they speak of an organism being “liable to be in favourable circumstances,” what they actually mean is that an organism is more likely to be in a favorable habitat than it would be if there were no growth advantage there, not that the organism is actually liable to be there. This is the concept that I make precise as the fitness-abundance covariance. Even in this relative value of abundance, however, we will see that the fitness-abundance covariance is not always positive.

The fitness-abundance covariance relates three different sets of values: the environment-specific growth rates , the stationary distribution in the presence of differential growth rates, referred to as , and the stationary distribution in the absence of differential growth rates, referred to as .

The stationary distribution for recursion satisfies

where is the eigenvector of associated with the largest eigenvalue of , . This is called the right Perron vector. Throughout, will represent the right Perron vector of a matrix (see Table 1).

or other boldface capital letters represent matrices, and , or other bold face lower case characters represent -vectors; the identity matrix is ; a scalar matrix is for ; represents the elements of , , and represents the elements of ; represents the diagonal elements of diagonal matrix ; a positive diagonal matrix has , ; represents the th row of matrix , and represents the th column. represents the unit vector, where all elements are ; represents the th basis vector, which has at position and elsewhere; is a diagonal matrix of the vector ; , , etc., represent the transpose; represent the eigenvalues of ; symmetrizable to means that an matrix can be represented as a product , where is a symmetric real matrix, and and are positive diagonal matrices; stochastic means an matrix with nonnegative elements and whose columns (by convention here) sum to one (column stochastic); positive definite means a matrix that is symmetric and has only positive eigenvalues; irreducible means an nonnegative matrix where for every there is some such that ; represents the spectral radius, the largest modulus of any eigenvalue of . by convention will refer to the Perron root of a nonnegative irreducible matrix , which is the positive eigenvalue guaranteed by Perron-Frobenius theory (Seneta, 2006, Theorems 1.1, 1.5) to exist, to be the spectral radius, and to be as large as the modulus (i.e. magnitude) of any other eigenvalue. So for . and represent the right and left Perron vectors of nonnegative irreducible , the eigenvectors associated with the Perron root, guaranteed by Perron-Frobenius theory to be strictly positive. So , and . By convention and . , and , throughout, where is obvious from context. traditionally represents the stationary distribution of irreducible (column) stochastic matrix . The harmonic mean of a set of numbers is

Table 1: Definitions and Symbols

The magnitude of determines whether the population grows () or declines () or is stationary (), and ecological models typically impose some kind of negative density dependence so that as population size gets large enough, decreases with , and a stationary state of can be attained. The problems addressed here do not concern the absolute value of , but only the relative changes to and under changes in and . For a general treatment of negative density dependence, Kirkland et al. (2006) provide a thorough analysis.

The stationary distribution in the presence of differential growth rates depends on the phase in the life cycle at which the census is taken. The life cycle consists of alternation between differential growth and dispersal, . When censused just after dispersal, the stationary distribution is . Censused just before dispersal it is .

We see that and have a simple relationship from the cyclical structure. is the Perron vector of up to scaling, since

When scaled to satisfy , one gets the relationship:


It should be noted that in continuous-time models such as quasispecies (Eigen and Schuster, 1977), selection and transformation happen simultaneously so there are no separate life cycle phases, hence no distinction between pre- and post-dispersal stationary states.
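For the discrete-time case, the relationship between the two census phases can be checked numerically. The sketch below assumes that the post-dispersal census corresponds to the Perron vector of MD (growth, then dispersal) and the pre-dispersal census to the Perron vector of DM. It then compares the pre-dispersal vector with the growth-weighted post-dispersal vector divided by the Perron root. The matrix and growth rates are hypothetical.

```python
def perron(A, iters=5000):
    """Perron root and Perron vector (normalized to sum to 1) of a primitive matrix."""
    n = len(A)
    v = [1.0 / n] * n
    rho = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        rho = sum(w)
        v = [x / rho for x in w]
    return rho, v


# Hypothetical example; each column of M sums to one.
M = [[0.5, 0.2, 0.3],
     [0.3, 0.6, 0.1],
     [0.2, 0.2, 0.6]]
d = [2.0, 1.0, 0.5]
n = len(d)

MD = [[M[i][j] * d[j] for j in range(n)] for i in range(n)]   # census after dispersal
DM = [[d[i] * M[i][j] for j in range(n)] for i in range(n)]   # census before dispersal

rho_post, v_post = perron(MD)
rho_pre, v_pre = perron(DM)

# Growth-weighted post-dispersal proportions, scaled by the Perron root.
predicted_pre = [d[i] * v_post[i] / rho_post for i in range(n)]
```

Both products share the same Perron root, and `predicted_pre` should agree with `v_pre` to numerical precision. Because the columns of M sum to one, `predicted_pre` already sums to one without further normalization.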

For semelparous organisms with discrete, non-overlapping generations, the fitness-abundance covariance is now defined for both phases of the life cycle.

Definition (Fitness-Abundance Covariance).

The fitness-abundance covariance is defined as the unweighted covariance between the environment-specific growth rates and the excess of the stationary distribution above the distribution that the population would attain in the absence of differential growth rates:

  1. Post-dispersal:

  2. Pre-dispersal:
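A minimal numerical sketch of the two covariances follows. It assumes the ordinary unweighted covariance with a 1/n normalization, applied to the growth rates and to the excess of each stationary distribution over the stationary distribution without differential growth. The example uses a hypothetical two-environment matrix in which the environment usually switches each generation.

```python
def perron(A, iters=5000):
    """Perron root and Perron vector (normalized to sum to 1) of a primitive matrix."""
    n = len(A)
    v = [1.0 / n] * n
    rho = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        rho = sum(w)
        v = [x / rho for x in w]
    return rho, v


def cov(x, y):
    """Unweighted covariance of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((x[i] - mx) * (y[i] - my) for i in range(n)) / n


def fitness_abundance_covariances(M, d):
    """Return (post-dispersal covariance, pre-dispersal covariance, Perron root, v_post)."""
    n = len(d)
    _, pi = perron(M)                                              # no differential growth
    rho, v = perron([[M[i][j] * d[j] for j in range(n)] for i in range(n)])
    _, u = perron([[d[i] * M[i][j] for j in range(n)] for i in range(n)])
    post = cov(d, [v[i] - pi[i] for i in range(n)])
    pre = cov(d, [u[i] - pi[i] for i in range(n)])
    return post, pre, rho, v


# Hypothetical rapidly switching environments; each column sums to one.
M = [[0.1, 0.9],
     [0.9, 0.1]]
d = [2.0, 1.0]
post, pre, rho, v = fitness_abundance_covariances(M, d)
```

In this example the post-dispersal covariance is negative and the pre-dispersal covariance is positive. Under the assumed normalization, n times their difference equals the variance of the growth rates weighted by `v`, divided by `rho`.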

Several elementary results are described. The first shows that the relationship between the pre- and post-dispersal fitness-abundance covariances is Fisher’s Fundamental Theorem of Natural Selection in a slightly new context.

Theorem 3 (Fitness-Abundance Covariance and Census Phases).

Let be an irreducible column stochastic matrix and a positive diagonal matrix.


where is the -weighted variance of ,


Here, , , and . We first note that


Substitution with (10) and (11) gives

Corollary 4 (Derivatives of Fitness-Abundance Covariances and ).

Let be a family of irreducible stochastic matrices, differentiable in , and assume for all . Let be a positive diagonal matrix. Set .



Differentiation of (11) and (13) directly gives (14) and (15). ∎

By plain intuition we would expect the fitness-abundance covariance to be positive. But McNamara and Dall found circumstances in which the fitness-abundance covariance is negative just after dispersal. The question then remains, what about just before dispersal, when individuals are still in the environment under whose growth rate they have just reproduced? Here is where intuition suggests the fitness-abundance covariance should be positive. This is proven to be the case when is the transition matrix of a reversible Markov chain with positive eigenvalues. But a counterexample is provided when represents a periodic chain that cycles through the states, which has complex eigenvalues.

Theorem 5 (Positivity of the Pre-Dispersal Fitness-Abundance Covariance).

Let be the transition matrix of an ergodic reversible Markov chain, with only nonnegative eigenvalues. Let be a positive diagonal matrix.



Here, , , and . From (12),

Since , and ,


The condition that be the transition matrix of an ergodic reversible Markov chain is equivalent to it being diagonally similar to a symmetric matrix (Keilson 1979, Proposition 1.3B; Altenberg 2011, Lemma 2). Since has all nonnegative eigenvalues, that matrix is positive semidefinite. This allows application of the inequality in Friedland and Karlin (1975, Theorem 4.1): . In (16) this gives

For a counterexample to the positivity of the pre-dispersal fitness-abundance covariance, we try a transition matrix that is as far from Theorem 5 as possible, so the states are periodic and the eigenvalues other than 1 are complex roots of unity. This represents the situation of pelagic organisms along a gyre (e.g. Cowen et al. 2006).

Theorem 6.

Let be an -cyclic matrix,

Then for ,


For a matrix cyclic in this direction, given any , . Thus . So . Hence . Substitution in (11) gives (17). is negative because it is times the difference between the geometric and arithmetic means of , which is always negative if not all are equal (Steele, 2004, pp. 20–26). ∎
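The cyclic case can be checked directly. The sketch below assumes a cycle in which state i always changes to state i + 1 (mod n), and a post-dispersal covariance defined with a 1/n normalization. It builds the Perron vector explicitly rather than by power iteration, which does not converge for a periodic matrix. The growth rates are hypothetical.

```python
import math


def cyclic_perron(d):
    """Perron root and vector of P D, where P sends state i to state i + 1 (mod n)."""
    n = len(d)
    rho = math.prod(d) ** (1.0 / n)          # geometric mean of the growth rates
    v = [1.0]
    for i in range(n - 1):
        v.append(d[i] * v[i] / rho)          # (P D v)[i + 1] = d[i] v[i] = rho v[i + 1]
    s = sum(v)
    return rho, [x / s for x in v]


def post_dispersal_covariance(d):
    """Unweighted covariance of d with v - pi, where pi is uniform for a cyclic P."""
    n = len(d)
    _, v = cyclic_perron(d)
    x = [v[i] - 1.0 / n for i in range(n)]
    mean_d, mean_x = sum(d) / n, sum(x) / n
    return sum((d[i] - mean_d) * (x[i] - mean_x) for i in range(n)) / n


d = [3.0, 1.0, 0.5]                          # hypothetical growth rates
rho, v = cyclic_perron(d)
covariance = post_dispersal_covariance(d)
```

Under these assumptions the covariance equals 1/n times the difference between the geometric and arithmetic means of the growth rates, so it is negative whenever the growth rates are not all equal.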

Theorem 7.

When the states are transformed in a cycle, it is possible for the pre-dispersal fitness-abundance covariance to be negative.


An example is constructed. Let represent the period-3 cycle of states

and let .

The spectral radius is . By symmetry, , . Symbolic computation with Mathematica™ shows that

A numerical survey shows that is positive over most values of except for a very narrow range of near the boundary where becomes negative. One such value is , which yields , , , and . ∎

The point of this odd counterexample is not that it represents something we might find in nature, but rather to say that we cannot entirely trust our intuition about the fitness-abundance covariance, and that something more subtle is going on mathematically than we might suppose.

6.2 Individual Stationary State Frequencies

Let us now examine the relationships between individual values of , and .

McNamara and Dall show for the case of two environments that the post-dispersal fitness-abundance covariance is positive or negative depending on the durations of the environments, restated in their terms here:

Theorem 8 (McNamara and Dall (2011, online Appendix A, Theorem A)).

Let in (7) to give (8).

  1. If then and

    1. .

  2. If then , , and .

  3. If then and

    1. .

The third case exhibits the very counterintuitive behavior that increasing the reproductive output of an environment will lower the stationary proportion in that environment. We can compare this result to the following general theorem on how changes to a matrix affect its Perron vector:

Theorem 9 (Elsner et al. (1982, Theorem 2.1)).

Let be an nonnegative irreducible matrix. Then for any nonnegative -vector , ,


It is more useful for us to put it in the following form:

Corollary 10 (Change in the Perron Vector).

When normalized to frequencies, , then .


The result follows immediately from rearrangement of (18) and summation using :

In this case, the behavior of the Perron vector follows our intuition that increasing the th row of should increase the stationary proportion of .

Something must be very different, therefore, between Theorems 8 and 9, since they both deal with changes in the Perron vector when elements of the matrix are changed. Theorem 8 produces counterintuitive results that depend on , while Theorem 9 has no conditions on details of the matrix. How can this discrepancy be reconciled?

We must write in terms of and to compare the two results. Let be the th row of . We can write

where and , . Corollary 10 shows that increasing the reproductive output of environment from to increases the stationary proportion in environment . In the limit , this gives:

Corollary 11.

For irreducible column stochastic matrix and positive diagonal matrix :


In the case , we have , , , and

So Theorem 9 gives regardless of any details of .

The discrepancy is resolved by noticing that the order of and is reversed between Theorem 8 and Theorem 9. The difference between the two is essentially in the phase of the life cycle at which the population is censused.

We can contrast (19) with the following:

Corollary 12.

For irreducible column stochastic matrix and positive diagonal matrix ,


Substitution of (10) in (19) and differentiation gives:

Comparing (19) and (20) we see that the stationary distributions at different census phases behave differently.

To summarize:

  • the portion in environment censused before dispersal always increases with growth rate ;

  • the portion in environment censused after dispersal can, under the right environment transition matrix, decrease with increasing growth rate .
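The two behaviours summarized above can be seen in a two-environment example. The sketch assumes the post-dispersal census is the Perron vector of PD and the pre-dispersal census is the Perron vector of DP, and uses the closed form for a 2 × 2 nonnegative matrix. The matrices and growth rates are hypothetical.

```python
import math


def perron_2x2(A):
    """Perron root and normalized Perron vector of an irreducible nonnegative 2 x 2 matrix."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    rho = (tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0
    v = [A[0][1], rho - A[0][0]]             # solves the first row of (A - rho I) v = 0
    s = sum(v)
    return rho, [x / s for x in v]


def shares_in_env1(P, d1, d2=1.0):
    """Return (post-dispersal, pre-dispersal) stationary proportions in environment 1."""
    d = [d1, d2]
    PD = [[P[i][j] * d[j] for j in range(2)] for i in range(2)]
    DP = [[d[i] * P[i][j] for j in range(2)] for i in range(2)]
    _, v = perron_2x2(PD)
    _, u = perron_2x2(DP)
    return v[0], u[0]


# Hypothetical environment transition matrices; each column sums to one.
P_fast = [[0.1, 0.9], [0.9, 0.1]]            # environments usually switch
P_slow = [[0.9, 0.1], [0.1, 0.9]]            # environments usually persist

post_fast_low, pre_fast_low = shares_in_env1(P_fast, 1.0)
post_fast_high, pre_fast_high = shares_in_env1(P_fast, 2.0)
post_slow_low, pre_slow_low = shares_in_env1(P_slow, 1.0)
post_slow_high, pre_slow_high = shares_in_env1(P_slow, 2.0)
```

Raising the growth rate in environment 1 increases the pre-dispersal proportion there under both matrices. The post-dispersal proportion rises under `P_slow` but falls under `P_fast`.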

This allows us to make inferences on the duration of the environments based on changes in the proportions of the population in each environment before and after reproduction:

Corollary 13 (Census Inference on Durations of Environments).

Consider the model of McNamara and Dall, , with (7), where . At a stationary distribution, let be the vector of proportions of individuals in each environment before reproduction, and be the proportions after reproduction. For environments,

  1. If , we know that .

  2. If in addition, , then we know ; or if , then .

6.3 Generalization of The McNamara and Dall Model

We now shift from the general case to the specific model of dispersal in randomly changing environments of McNamara and Dall (2011). First, we see how the direction of selection on unconditional dispersal corresponds to the sign of the post-dispersal fitness-abundance covariance.

Corollary 14 (McNamara and Dall Model with general ).

Let , where is an irreducible stochastic matrix, and . Let