# A sampling theory for asymmetric communities

###### Abstract

We introduce the first analytical model of asymmetric community dynamics to yield Hubbell’s neutral theory in the limit of functional equivalence among all species. Our focus centers on an asymmetric extension of Hubbell’s local community dynamics, while an analogous extension of Hubbell’s metacommunity dynamics is deferred to an appendix. We find that mass-effects may facilitate coexistence in asymmetric local communities and generate unimodal species abundance distributions indistinguishable from those of symmetric communities. Multiple modes, however, only arise from asymmetric processes and provide a strong indication of non-neutral dynamics. Although the exact stationary distributions of fully asymmetric communities must be calculated numerically, we derive approximate sampling distributions for the general case and for nearly neutral communities where symmetry is broken by a single species distinct from all others in ecological fitness and dispersal ability. In the latter case, our approximate distributions are fully normalized, and novel asymptotic expansions of the required hypergeometric functions are provided to make evaluations tractable for large communities. Employing these results in a Bayesian analysis may provide a novel statistical test to assess the consistency of species abundance data with the neutral hypothesis.

###### Keywords

biodiversity, neutral theory, nearly neutral theory, coexistence, mass-effects

Journal: Journal of Theoretical Biology

## 1 Introduction

The ecological symmetry of trophically similar species forms the central assumption in Hubbell’s unified neutral theory of biodiversity and biogeography (Hubbell, 2001). In the absence of stable coexistence mechanisms, local communities evolve under zero-sum ecological drift – a stochastic process of density-dependent birth, death, and migration that maintains a fixed community size (Hubbell, 2001). Despite a homogeneous environment, migration inhibits the dominance of any single species and fosters high levels of diversity. The symmetry assumption has allowed for considerable analytical developments that draw on the mathematics of neutral population genetics (Fisher, 1930; Wright, 1931) to derive exact predictions for emergent, macro-ecological patterns (Chave, 2004; Etienne and Alonso, 2007; McKane et al., 2000; Vallade and Houchmandzadeh, 2003; Volkov et al., 2003; Etienne and Olff, 2004; McKane et al., 2004; Pigolotti et al., 2004; He, 2005; Volkov et al., 2005; Hu et al., 2007; Volkov et al., 2007; Babak and He, 2008, 2009). Among the most significant contributions are calculations of multivariate sampling distributions that relate local abundances to those in the regional metacommunity (Alonso and McKane, 2004; Etienne and Alonso, 2005; Etienne, 2005, 2007). Hubbell first emphasized the utility of sampling theories for testing neutral theory against observed species abundance distributions (SADs) (Hubbell, 2001). Since then, Etienne and Olff have incorporated sampling distributions as conditional likelihoods in Bayesian analyses (Etienne and Olff, 2004, 2005; Etienne, 2007, 2009).
Recent work has shown that the sampling distributions of neutral theory remain invariant when the restriction of zero-sum dynamics is lifted (Etienne et al., 2007; Haegeman and Etienne, 2008; Conlisk et al., 2010) and when the assumption of strict symmetry is relaxed to a requirement of ecological equivalence (Etienne et al., 2007; Haegeman and Etienne, 2008; Allouche and Kadmon, 2009a, b; Lin et al., 2009).

The success of neutral theory in fitting empirical patterns of biodiversity (Hubbell, 2001; Volkov et al., 2003, 2005; He, 2005; Chave et al., 2006) has generated a heated debate among ecologists, as there is strong evidence for species asymmetry in the field (Harper, 1977; Goldberg and Barton, 1992; Chase and Leibold, 2003; Wootton, 2009; Levine and HilleRisLambers, 2009). Echoing previous work on the difficulty of resolving competitive dynamics from the essentially static observations of co-occurrence data (Hastings, 1987), recent studies indicate that interspecific tradeoffs may generate unimodal SADs indistinguishable from the expectations of neutral theory (Chave et al., 2002; Mouquet and Loreau, 2003; Chase, 2005; He, 2005; Purves and Pacala, 2005; Walker, 2007; Doncaster, 2009). These results underscore the compatibility of asymmetries and coexistence. The pioneering work of Hutchinson (1951) has inspired a large literature on asymmetries in dispersal ability that permit the coexistence of “fugitive species” with dominant competitors. In particular, Shmida and Wilson (1985) extended the work of Brown and Kodric-Brown (1977) by introducing the paradigm of “mass-effects”, where immigration facilitates the establishment of species in sites where they would otherwise be competitively excluded. Numerous attempts have been made to reconcile such deterministic approaches to the coexistence of asymmetric species with the stochastic model of ecological drift in symmetric neutral theory (Zhang and Lin, 1997; Tilman, 2004; Chase, 2005; Alonso et al., 2006; Gravel et al., 2006; Pueyo et al., 2007; Walker, 2007; Alonso et al., 2008; Ernest et al., 2008; Zhou and Zhang, 2008). Many of these attempts build on insights from the concluding chapter of Hubbell’s book (Hubbell, 2001).

Nevertheless, the need remains for a fully asymmetric, analytical, sampling theory that contains Hubbell’s model as a limiting case (Alonso et al., 2006). In this article, we develop such a theory for local, dispersal-limited communities in the main text and defer an analogous treatment of metacommunities to Appendix A. Hubbell’s assumption of zero-sum dynamics is preserved, but the requirement of per capita ecological equivalence among all species is eliminated. Asymmetries are introduced by allowing for the variations in ecological fitness and dispersal ability that may arise in a heterogeneous environment (Leibold et al., 2004; Holyoak et al., 2005). Our work expands on the numerical simulations of Zhou and Zhang (2008), where variations in ecological fitness alone were considered. Coexistence emerges from mass-effects as well as ecological equivalence, and both mechanisms generate unimodal SADs that may be indistinguishable. For local communities and metacommunities, we derive approximate sampling distributions for both the general case and the nearly neutral case, where symmetry is broken by a single species unique in ecological function. These approximations yield the sampling distributions of Hubbell’s neutral model in the limit of functional equivalence among all species.

## 2 A general sampling theory for local communities

For a local community of individuals and possible species, we model community dynamics as a stochastic process, , over the labelled community abundance vectors . Consistent with zero-sum dynamics, we require all accessible states to contain total individuals: and . The number of accessible states is .

Allowed transitions first remove an individual from species and then add an individual to species . Removals are due to death or emigration and occur with the density-dependent probability . Additions are due either to an immigration event, with probability , or a birth event, with probability . We will refer to the as dispersal abilities. If immigration occurs, we assume that metacommunity relative abundance, , determines the proportional representation of species in the propagule rain and that the probability of establishment is weighted by ecological fitness, , where high values correspond to a local competitive advantage or a superior adaptation to the local environment. Therefore, species recruits with probability

(1) |

where , , and . If immigration does not occur, we assume that local relative abundance, , governs propagule rain composition such that species recruits with probability

(2) |

In numerical simulations of an asymmetric community, Zhou and Zhang (2008) employed a similar probability for recruitment in the absence of immigration. Here, a factor of is subtracted in the denominator because species loses an individual prior to the birth event for species . An analogous subtraction is absent from Eq. 1 because we assume an infinite metacommunity where the are invariant to fluctuations in the finite, local community populations.
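The recruitment rules described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the symbols `w` (ecological fitness), `x` (metacommunity relative abundance), `m` (dispersal ability), and `n` (local abundances) are hypothetical names, and the functional forms are inferred from the prose (fitness-weighted proportional recruitment, with the fitness of the removed individual subtracted from the local denominator).

```python
def immigrant_recruitment(j, w, x):
    """Probability that an immigrant recruit belongs to species j.

    Assumed form: fitness-weighted share of the propagule rain,
    w_j x_j / sum_k w_k x_k.
    """
    return w[j] * x[j] / sum(wk * xk for wk, xk in zip(w, x))


def local_recruitment(i, j, n, w):
    """Probability that a local birth recruits species j (j != i)
    after species i has lost one individual.

    Assumed form: w_j n_j / (sum_k w_k n_k - w_i).
    """
    total = sum(wk * nk for wk, nk in zip(w, n))
    return w[j] * n[j] / (total - w[i])


def transition_probability(i, j, n, w, m, x):
    """Probability of the move n -> n - e_i + e_j for i != j:
    removal from species i, then recruitment of species j by
    birth (prob. 1 - m_j) or immigration (prob. m_j)."""
    J = sum(n)
    birth = (1 - m[j]) * local_recruitment(i, j, n, w)
    immigration = m[j] * immigrant_recruitment(j, w, x)
    return n[i] / J * (birth + immigration)
```

With equal fitnesses the local term reduces to `n[j] / (J - 1)`, the familiar neutral recruitment probability after one individual has been removed.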

In sum, the nonzero transition probabilities are stationary and given by

(3) | |||||

where is an –dimensional unit vector along the th–direction, the must be sufficiently large such that , and the time, , is dimensionless with a scale set by the overall transition rate. The probability of state occupancy, , evolves according to the master equation

(4) |

where

(5) |

and we define the step-function to be zero for and one otherwise. Eq. 4 can be recast in terms of a transition probability matrix

(6) |

where enumerate accessible states with components , . The left eigenvector of with zero eigenvalue yields the stationary distribution for community composition, . Marginal distributions yield the equilibrium abundance probabilities for each species

(7) |

From here, we calculate the stationary SAD by following the general treatment of asymmetric communities in Alonso et al. (2008)

(8) |

The expected species richness is

(9) |

Given that the local community, with abundances , is defined as a sample of the metacommunity, with relative abundances , we have established the framework for a general sampling theory of local communities.
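For very small communities, the framework can be explored numerically without sparse eigen-solvers. The sketch below is an illustration only: it assumes the transition form described in the text (symbol names `w`, `m`, `x` are hypothetical), enumerates all zero-sum states, and finds the stationary distribution by power iteration rather than by a direct eigenvector calculation. The SAD and richness helpers assume the usual definitions, namely sums of marginal probabilities over species.

```python
from itertools import product


def states(J, S):
    """All labelled abundance vectors of S species summing to J."""
    return [n for n in product(range(J + 1), repeat=S) if sum(n) == J]


def transitions(n, J, w, m, x):
    """Off-diagonal moves n -> n - e_i + e_j (assumed transition form)."""
    wx = sum(wk * xk for wk, xk in zip(w, x))
    wn = sum(wk * nk for wk, nk in zip(w, n))
    out = {}
    for i, ni in enumerate(n):
        if ni == 0:
            continue
        for j in range(len(n)):
            if j == i:
                continue
            birth = (1 - m[j]) * w[j] * n[j] / (wn - w[i])
            immig = m[j] * w[j] * x[j] / wx
            target = list(n)
            target[i] -= 1
            target[j] += 1
            out[tuple(target)] = ni / J * (birth + immig)
    return out


def stationary(J, S, w, m, x, tol=1e-13, max_iter=200_000):
    """Stationary distribution over states by power iteration (J >= 2)."""
    space = states(J, S)
    moves = {n: transitions(n, J, w, m, x) for n in space}
    for n, mv in moves.items():
        stay = 1.0 - sum(mv.values())
        if stay < -1e-12:
            raise ValueError("transition probabilities exceed one")
        mv[n] = max(stay, 0.0)  # probability of no change
    p = {n: 1.0 / len(space) for n in space}
    for _ in range(max_iter):
        q = dict.fromkeys(space, 0.0)
        for n, mv in moves.items():
            for target, prob in mv.items():
                q[target] += p[n] * prob
        if max(abs(q[n] - p[n]) for n in space) < tol:
            return q
        p = q
    return p


def marginals(p, J, S):
    """P[i][k]: stationary probability that species i has abundance k."""
    P = [[0.0] * (J + 1) for _ in range(S)]
    for n, prob in p.items():
        for i, ni in enumerate(n):
            P[i][ni] += prob
    return P


def sad(P, J):
    """Expected number of species at each abundance k = 1..J."""
    return [sum(Pi[k] for Pi in P) for k in range(1, J + 1)]


def richness(P):
    """Expected number of species present (abundance > 0)."""
    return sum(1.0 - Pi[0] for Pi in P)
```

The state space grows combinatorially with community size, so this brute-force approach is only practical for communities far smaller than those of ecological interest, which is the motivation for the approximations that follow.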

This sampling theory incorporates aspects of the mass-effects paradigm (Brown and Kodric-Brown, 1977; Shmida and Wilson, 1985; Holt, 1993; Leibold et al., 2004; Holyoak et al., 2005). Local asymmetries in ecological fitness imply environmental heterogeneity across the metacommunity such that competitive ability peaks in the local communities where biotic and abiotic factors most closely match niche requirements (Tilman, 1982; Leibold, 1998; Chase and Leibold, 2003). Where species experience a competitive disadvantage, the mass-effects of immigration allow for persistence. Indeed, the master equation given by Eq. 4, when applied to open communities where for all , admits no absorbing states and ensures that every species has a nonzero probability of being present under equilibrium conditions. By contrast, when Eq. 4 is applied to closed communities where for all , the eventual dominance of a single species is guaranteed.

Mass-effects allow for a soft breaking of the symmetry of neutral theory and provide a mechanism for multi-species coexistence. In Fig. 1, we present numerical results for the marginal equilibrium distributions of an asymmetric local community subsidized by a potentially neutral metacommunity, where the five species share a common relative abundance, =0.2. Although a single species may dominate due to a locally superior competitive ability (see Fig. 1a), multi-species coexistence may arise, despite significant competitive asymmetries, due to high levels of immigration that tend to align local relative abundances with those in the metacommunity (see Fig. 1c). Despite the underlying asymmetric process, coexistence via mass-effects generates unimodal SADs that, given sampling errors in field data, may be indistinguishable from SADs due to neutral dynamics, as shown in Fig. 1d. This reinforces previous conclusions that the static, aggregate data in unimodal SADs cannot resolve the individual-level rules of engagement governing the origin and maintenance of biodiversity (Chave et al., 2002; Mouquet and Loreau, 2003; Purves and Pacala, 2005; He, 2005; Chase, 2005; Walker, 2007; Doncaster, 2009). However, SADs with multiple modes are not uncommon in nature (Dornelas and Connolly, 2008; Gray et al., 2005) and provide a strong indicator of non-neutral dynamics (Alonso et al., 2008). Fig. 1b presents a bimodal SAD for an asymmetric local community with low levels of immigration.

Each plot in Fig. 1 displays results for a relatively small community of individuals and possible species. Sparse matrix methods were used to calculate the left eigenvector with zero eigenvalue for transition matrices of rank . Obtaining stationary distributions for larger, more realistic communities poses a formidable numerical challenge. This motivates a search for analytically tractable approximations to sampling distributions of the general theory.

## 3 An approximation to the sampling distribution

The distribution is stationary under Eq. 4 if it satisfies the condition of detailed balance

(10) |

for all and such that and . For general (g) large– communities where for all , we will show that detailed balance is approximately satisfied by

(11) |

where

(12) |

where is the Pochhammer symbol, is a normalization constant, and is a generalization of the “fundamental dispersal number” (Etienne and Alonso, 2005). From the definition of in Eq. 3, we have

(13) |

and assuming the form of in Eq. 11, we find

(14) |

Now, for large– communities where for all , the ratio is a small number. Given , we expand the right-hand-side of Eq. 14 to obtain

(15) |

which validates our assertion that Eq. 11 is an approximate sampling distribution of the general theory when . For communities of species that are symmetric (s) in ecological fitness but asymmetric in dispersal ability, Eq. 11 reduces to an exact sampling distribution

(16) |

that satisfies detailed balance without approximation. Analogous distributions for general and fitness-symmetric metacommunities are provided in Appendix A. However, in all of these results, the normalization constants must be calculated numerically. This limits the utility of our sampling distributions in statistical analyses. Can we find a non-neutral scenario that admits an approximate sampling distribution with an analytical expression for the normalization?

## 4 Sampling nearly neutral communities

As the species abundance vector evolves under Eq. 4, consider the dynamics of marginal abundance probabilities for a single focal species that deviates in ecological function from the surrounding, otherwise symmetric, community. In particular, let the first element of be the marginal process, , over states , for the abundance of an asymmetric focal species with dispersal ability , ecological fitness , and relative metacommunity abundance . If all other species share a common dispersal ability and ecological fitness , then the focal species gains an individual with probability

and loses an individual with probability

where we have used . These marginal transition probabilities do not depend separately on and , but only on their ratio. Without loss of generality, we redefine to be the focal species’ local advantage in ecological fitness. Eqs. 17 and 18, which are independent of the abundances , suggest a univariate birth-death process for the marginal dynamics of the asymmetric species governed by the master equation

(19) | |||||

and we formally derive this result from Eq. 4 in Appendix B. Given the well-known stationary distribution of Eq. 19

(20) |

we find an exact result for the stationary abundance probabilities of the focal species in a nearly neutral (nn) community

(21) |

where is the beta-function

(22) |

and

(23) |

For the asymmetric focal species, this is an exact result of the general model, Eq. 4, that holds for nearly neutral local communities with any number of additional species. Eq. 21 may be classified broadly as a generalized hypergeometric distribution or more specifically as an exponentially weighted Pólya distribution (Kemp, 1968; Johnson et al., 1992).
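The stationary distribution of a univariate birth-death chain can also be evaluated directly from the product of gain-to-loss ratios, which offers a numerical cross-check on closed-form results for moderate community sizes. The sketch below is an illustration under assumptions: the gain and loss probabilities are inferred from the verbal description of the model, and the names `J` (community size), `x` (focal metacommunity relative abundance), `w` (focal fitness advantage), `m` (focal dispersal ability), and `m_o` (dispersal ability of all other species) are hypothetical. It requires `J >= 2`, `0 < x < 1`, and positive `m` and `m_o`.

```python
import math


def focal_rates(n, J, x, w, m, m_o):
    """Gain and loss probabilities of the focal species at abundance n
    (assumed forms; all other species have fitness 1)."""
    mix = w * x + (1 - x)  # fitness-weighted propagule rain
    gain = 0.0
    if n < J:
        gain = (J - n) / J * (
            (1 - m) * w * n / (w * n + J - n - 1) + m * w * x / mix
        )
    loss = 0.0
    if n > 0:
        loss = n / J * (
            (1 - m_o) * (J - n) / (w * (n - 1) + J - n)
            + m_o * (1 - x) / mix
        )
    return gain, loss


def focal_stationary(J, x, w, m, m_o):
    """Stationary abundance probabilities P[0..J] from the standard
    birth-death product formula, accumulated in log space."""
    log_weight = [0.0]
    for k in range(1, J + 1):
        gain_prev, _ = focal_rates(k - 1, J, x, w, m, m_o)
        _, loss_k = focal_rates(k, J, x, w, m, m_o)
        log_weight.append(
            log_weight[-1] + math.log(gain_prev) - math.log(loss_k)
        )
    top = max(log_weight)
    weights = [math.exp(v - top) for v in log_weight]
    total = sum(weights)
    return [v / total for v in weights]


def moments(P):
    """Mean and variance of an abundance distribution P[0..J]."""
    mean = sum(k * p for k, p in enumerate(P))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(P))
    return mean, var
```

Working in log space avoids overflow and underflow in the running product, but the cost still grows linearly with community size, so asymptotic expansions remain preferable for very large communities.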

In the absence of dispersal limitation, Eq. 21 becomes

(24) |

where the identity has been used. This is a weighted binomial distribution with expected abundance and variance . In the neutral, or symmetric, limit where , Eq. 24 reduces to a binomial sampling of the metacommunity, sensu Etienne and Alonso (2005).
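The binomial form can be checked with a short worked calculation. The notation here is an assumption made for illustration ($J$ for local community size, $x$ for the focal species' metacommunity relative abundance, $w$ for its fitness advantage, $g_n$ and $r_n$ for the gain and loss probabilities at abundance $n$), and the gain and loss forms are inferred from the verbal description of the model with every dispersal ability set to one.

```latex
\begin{align*}
g_n &= \frac{J-n}{J}\,\frac{w x}{1+(w-1)x}, \qquad
r_n = \frac{n}{J}\,\frac{1-x}{1+(w-1)x}, \\
\frac{P_n}{P_0} &= \prod_{k=1}^{n}\frac{g_{k-1}}{r_k}
  = \prod_{k=1}^{n}\frac{J-k+1}{k}\,\frac{w x}{1-x}
  = \binom{J}{n}\left(\frac{w x}{1-x}\right)^{n}, \\
\sum_{n=0}^{J}\binom{J}{n}(w x)^{n}(1-x)^{J-n} &= \bigl(1+(w-1)x\bigr)^{J}, \\
P_n &= \binom{J}{n}\,p^{\,n}(1-p)^{J-n}, \qquad
p = \frac{w x}{1+(w-1)x}.
\end{align*}
```

Under these assumptions the mean is $Jp$ and the variance is $Jp(1-p)$, and setting $w=1$ gives $p=x$, the binomial sampling of the metacommunity.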

In the presence of dispersal limitation, we evaluate to obtain the expected abundance

(25) | |||||

where . The variance of the stationary distribution is given by

(26) | |||||

and we evaluate to obtain

(27) |

In Eqs. 25 and 26, the normalization of Eq. 22 generates central moments for the abundance distribution and plays a role analogous to the grand partition function of statistical physics. Recent studies have demonstrated the utility of partition functions in extensions of Hubbell’s neutral theory (O’Dwyer et al., 2009; O’Dwyer and Green, 2010).

For large– communities, evaluation of the hypergeometric functions in Eqs. 21, 25, and 27 is computationally expensive. To remove this barrier, one of us (N.M.T.) has derived novel asymptotic expansions (see Appendix C). We use these expansions to plot the stationary abundance probabilities for M. In Fig. 2a, small local advantages in ecological fitness generate substantial increases in expected abundance over the neutral prediction. Hubbell found evidence for these discrepancies in Manu forest data and referred to them as “ecological dominance deviations” (Hubbell, 2001). Hubbell also anticipated that dispersal effects would mitigate advantages in ecological fitness (Hubbell, 2001). The right panel of Fig. 2 demonstrates, once again, that enhanced mass-effects due to increased dispersal ability may inhibit the dominance of a locally superior competitor by compelling relative local abundance to align with relative metacommunity abundance.

An approximation to the multivariate sampling distribution of nearly neutral local communities is constructed in Appendix B

(28) |

where

(29) |

A related approximation for the sampling distribution of nearly neutral metacommunities is derived in Appendix A. In the absence of dispersal limitation, Eq. 28 becomes

(30) |

where we have used for large . Finally, in the symmetric limit, Eq. 30 reduces to a simple multinomial sampling of the metacommunity, as expected.

To illustrate the impacts of an asymmetric species on the diversity of an otherwise symmetric local community, Fig. 3 plots Shannon’s Index of diversity

(31) |

for various values of the ecological fitness advantage, , and dispersal ability, , in a nearly neutral community of species and individuals. All five species share a common relative metacommunity abundance, , so given the exact result for in Eq. 25, we know immediately that for the remaining symmetric species. Note that is maximized where all abundances are equivalent, such that . As can be seen from the next section, this relation holds in the neutral limit where and , but small asymmetries in dispersal ability have a negligible impact on diversity when all species are symmetric in ecological fitness. Therefore, each curve in Fig. 3 peaks near at approximately the same value of . Away from , the declines in diversity are regulated by mass-effects, with more gradual declines at higher values of .
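The diversity calculation described above can be sketched as follows. This is an illustration under assumptions: Shannon's index is taken in its usual form, computed from expected relative abundances, and the names `mean_focal`, `J`, and `S` are hypothetical stand-ins for the focal species' expected abundance, the community size, and the number of species.

```python
import math


def shannon_index(mean_focal, J, S):
    """Shannon's index for one focal species and S - 1 equivalent
    species that share the remaining J - mean_focal individuals."""
    mean_other = (J - mean_focal) / (S - 1)
    proportions = [mean_focal / J] + [mean_other / J] * (S - 1)
    return -sum(p * math.log(p) for p in proportions if p > 0)
```

For five species, the index reaches its maximum of `math.log(5)` when the focal species' expected abundance equals `J / 5`, and it declines as the focal species becomes either dominant or rare.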

## 5 Recovering the sampling distribution of neutral theory

In a perfectly symmetric local community, the stochastic dynamics for each species differ solely due to variations in relative metacommunity abundances, the . In particular, if and for all in Eq. 3, we recover the multivariate transition probabilities for a neutral sampling theory of local communities, as suggested on p. 287 of Hubbell’s book (Hubbell, 2001). Similarly, in the symmetric limit of Eq. 19 where and , we recover the marginal dynamics for neutral (n) local communities with stationary distribution (McKane et al., 2000)

(32) |

where . This result follows from the symmetric limit of Eq. 21 after applying the identity . The expected abundance and variance are obtained from the symmetric limits of Eqs. 25 and 26, respectively, after applying the identities in Eqs. C.1.0.1 and C.2.2.2

(33) | |||||

(34) |

Finally, the symmetric limits of Eqs. 11, 16, and 28 all yield the stationary sampling distribution for a neutral local community (Etienne and Alonso, 2005; Etienne et al., 2007)

(35) |

In the special case of complete neutrality, Eq. 35 is an exact result of the general model, Eq. 4. This sampling distribution continues to hold when the assumptions of zero-sum dynamics and stationarity are relaxed (Etienne et al., 2007; Haegeman and Etienne, 2008).
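The exactness of the neutral result can be verified numerically for a small community. The sketch below assumes the neutral sampling formula in the form commonly attributed to Etienne and Alonso (2005), a multinomial coefficient times a ratio of Pochhammer symbols with fundamental dispersal number `m * (J - 1) / (1 - m)`, together with the neutral zero-sum transition probabilities. All function and variable names are hypothetical.

```python
from itertools import product
from math import factorial, prod


def pochhammer(a, k):
    """Rising factorial a (a + 1) ... (a + k - 1); equals 1 for k = 0."""
    return prod(a + r for r in range(k))


def neutral_sampling_prob(n, x, m):
    """Assumed neutral sampling probability of abundance vector n, given
    metacommunity relative abundances x and immigration probability m < 1."""
    J = sum(n)
    dispersal_number = m * (J - 1) / (1 - m)
    multinomial = factorial(J) / prod(factorial(k) for k in n)
    numerator = prod(
        pochhammer(dispersal_number * xi, ni) for xi, ni in zip(x, n)
    )
    return multinomial * numerator / pochhammer(dispersal_number, J)


def neutral_transition(n, i, j, x, m):
    """Neutral zero-sum probability of the move n -> n - e_i + e_j."""
    J = sum(n)
    return n[i] / J * ((1 - m) * n[j] / (J - 1) + m * x[j])


def detailed_balance_gap(J, x, m):
    """Largest violation of detailed balance over all allowed moves."""
    S = len(x)
    gap = 0.0
    for n in product(range(J + 1), repeat=S):
        if sum(n) != J:
            continue
        for i in range(S):
            for j in range(S):
                if i == j or n[i] == 0:
                    continue
                t = list(n)
                t[i] -= 1
                t[j] += 1
                forward = neutral_sampling_prob(n, x, m) * neutral_transition(n, i, j, x, m)
                backward = neutral_sampling_prob(t, x, m) * neutral_transition(t, j, i, x, m)
                gap = max(gap, abs(forward - backward))
    return gap
```

A gap at the level of floating-point round-off indicates that detailed balance holds for every pair of neighbouring states, consistent with the statement that the neutral distribution is exact rather than approximate.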

## 6 Discussion

We have developed a general sampling theory that extends Hubbell’s neutral theory of local communities and metacommunities to include asymmetries in ecological fitness and dispersal ability. We anticipate that a parameterization of additional biological complexity, such as asymmetries in survivorship probabilities or differences between the establishment probabilities of local reproduction and immigration, may be incorporated without significant changes to the structure of our analytical results. Although the machinery is significantly more complicated for asymmetric theories than their symmetric counterparts, some analytical calculations remain tractable. We find approximate sampling distributions for general and nearly neutral communities that yield Hubbell’s theory in the symmetric limit. Our fully normalized approximation in the nearly neutral case may provide a valuable statistical tool for determining the degree to which an observed SAD is consistent with the assumption of complete neutrality. To facilitate a Bayesian analysis, we have enabled rapid computation of the required hypergeometric functions by deriving previously unknown asymptotic expansions.

## Acknowledgments

We gratefully acknowledge the insights of two anonymous reviewers. This work is partially supported by the James S. McDonnell Foundation through their Studying Complex Systems grant (220020138) to W.F.F. N.M.T. acknowledges financial support from Gobierno of Navarra, Res. and Ministerio de Ciencia e Innovación, project MTM2009-11686.

## Appendix A Sampling asymmetric metacommunities

The analytical insights of Etienne et al. (2007) suggest a clear prescription for translating local community dynamics into metacommunity dynamics in the context of Hubbell’s unified neutral theory of biodiversity and biogeography (Hubbell, 2001): replace probabilities of immigration, , with probabilities of speciation, ; assume for all , where is the total number of species that could possibly appear through speciation events; and consider asymptotics as becomes large.

Following this recipe, we translate the transition probabilities for asymmetric local communities, Eq. 3, into the transition probabilities for asymmetric metacommunities ()

(A.1) |

where is the number of individuals in the metacommunity, is the probability that an individual of species establishes following a speciation event, and

(A.2) |

Metacommunity dynamics are governed by the master equation

(A.3) |

If for all , there are no absorbing states, so for large–, there is a nonzero probability that any given species exists. Developments analogous to those in Section 3 show that detailed balance in the general theory is approximated by

(A.4) |

where

(A.5) |

and is the generalization of Hubbell’s “fundamental biodiversity number” (Hubbell, 2001). The fitness-symmetric (s) distribution

(A.6) |

satisfies detailed balance up to .

For the special case of nearly neutral metacommunities, we translate the marginal dynamics for an asymmetric species in an otherwise symmetric local community into the marginal dynamics for an asymmetric species in an otherwise symmetric metacommunity. The transition probabilities are

where the asymmetric focal species has speciation probability and enjoys an ecological fitness advantage, , over all other species, which share a common probability of speciation . If is an accessible state, then as becomes large and remains finite, the equilibrium probability of observing the asymmetric species approaches zero. However, if we assume that the asymmetric species is identified and known to exist at nonzero abundance levels, the stationary distribution is

(A.8) |

with

(A.9) |

and

(A.10) |

where and are Hubbell’s “fundamental biodiversity numbers” for the asymmetric species and all other species, respectively.

An approximate multivariate stationary distribution is obtained in an identical manner to the derivation of Eq. 28

(A.11) | |||||

where

(A.12) |

We now propose a modest extension to the prescription in Etienne et al. (2007) for converting multivariate distributions over labelled abundance vectors to distributions over unlabelled abundance vectors. Because the asymmetric focal species has been identified and is known to exist with abundance , this species must be labelled, while all other species are equivalent and may be unlabelled. Therefore, we aim to transform Eq. A.11 into a multivariate distribution over the “mostly unlabelled” states , where is the number of species observed in a sample and each is an integer partition of . (To provide an example, if , four distinct states are accessible: with , with , with , and with .) The conversion is given by

(A.13) |

where is the number of elements in equal to . Note that . Taking the leading behavior for large–, we obtain a modification of the Ewens (1972) sampling distribution appropriate to nearly neutral metacommunities

where allows us to take a product over the observed species, , rather than the total number of possible species, , in the first expression; as approaches 0 for has been used to obtain the second expression; l’Hôpital’s rule along with has been used to obtain the third expression; and

(A.15) |

with asymptotics of the hypergeometric function provided in §C.3 and §C.4. In the neutral limit, we obtain a modification to the Ewens sampling distribution for the scenario where a single species is labelled and guaranteed to exist

(A.16) |

Converting this result to a distribution over the “fully unlabelled” states