
# Relatedness and synergies of kind and scale in the evolution of helping

## Abstract

Relatedness and synergy affect the selection pressure on cooperation and altruism. Although early work investigated the effect of these factors independently of each other, recent efforts have been aimed at exploring their interplay. Here, we contribute to this ongoing synthesis in two distinct but complementary ways. First, we integrate models of $n$-player matrix games into the direct fitness approach of inclusive fitness theory, hence providing a framework to consider synergistic social interactions between relatives in family and spatially structured populations. Second, we illustrate the usefulness of this framework by delineating three distinct types of helping traits (“whole-group”, “nonexpresser-only” and “expresser-only”), which are characterized by different synergies of kind (arising from differential fitness effects on individuals expressing or not expressing helping) and can be subjected to different synergies of scale (arising from economies or diseconomies of scale). We find that relatedness and synergies of kind and scale can interact to generate nontrivial evolutionary dynamics, such as cases of bistable coexistence featuring both a stable equilibrium with a positive level of helping and an unstable helping threshold. This broadens the qualitative effects of relatedness (or spatial structure) on the evolution of helping.

Keywords. evolution of helping, relatedness, synergy, inclusive fitness, evolutionary games

• Department of Evolutionary Theory
Max Planck Institute for Evolutionary Biology
August-Thienemann-Str. 2, 24306 Plön, Germany
e-mail: pena@evolbio.mpg.de

• Faculty of Business and Economics
University of Basel
Peter Merian-Weg 6, CH-4002 Basel, Switzerland
e-mail: georg.noeldeke@unibas.ch

• Department of Ecology and Evolution
University of Lausanne
Le Biophore, CH-1015 Lausanne, Switzerland
e-mail: laurent.lehmann@unil.ch

• Corresponding author.

## 1 Introduction

Explaining the evolution of helping (cooperation and altruism) has been a main focus of research in evolutionary biology over the last fifty years (e.g., Sachs et al., 2004; West et al., 2007). In this context, Hamilton’s seminal papers established the importance of relatedness (genetic assortment between individuals) by showing that an allele for helping can be favored by natural selection as long as $-c + rb > 0$ is satisfied, where $c$ is the fitness cost to an average carrier from expressing the allele, $b$ is the fitness benefit to such a carrier stemming from a social partner expressing the allele, and $r$ is the relatedness between social partners (Hamilton, 1964a, b, 1970). Additional factors, including different forms of reciprocity (i.e., conditional behaviors and responsiveness under multimove interactions, e.g., Trivers, 1971; Axelrod and Hamilton, 1981) and synergy (i.e., nonadditive effects of social behaviors on material payoffs, either positive or negative, e.g., Queller, 1985; Sumpter, 2010), modify the fitness costs and benefits in Hamilton’s rule (Axelrod and Hamilton, 1981; Day and Taylor, 1997; Lehmann and Keller, 2006a; Gardner et al., 2011; Van Cleve and Akçay, 2014) and hence fundamentally influence the evolutionary dynamics of helping.

Because of their ubiquity, relatedness and synergy occupy a central role among the factors affecting the selection pressure on helping. Both are clearly present in the cooperative enterprises of most organisms. First, real populations are characterized by limited gene flow at least until the stage of offspring dispersal (Clobert et al., 2001), with the consequence that most social interactions necessarily occur between relatives of varying degree. Second, social exchanges often feature at least one of two different forms of synergy, which we call in this article “synergies of kind” and “synergies of scale”.

Synergies of kind (implicit in what Queller, 2011 calls “kind selection”) arise when the expression of a social trait benefits recipients in different ways, depending on whether or not (or more generally, to what extent) recipients express the social trait themselves. A classical example of a positive synergy of kind is collective hunting (Packer and Ruttan, 1988), where the benefits of a successful hunt go to cooperators (hunters) but not to defectors (solitary individuals). Examples of negative synergies of kind are eusociality in Hymenoptera, by which sterile workers help queens to reproduce (Bourke and Franks, 1995), and self-destructive cooperation in bacteria, where expressers lyse while releasing virulence factors that benefit nonexpressers (Fröhlich and Madeo, 2000; Ackermann et al., 2008).

Synergies of scale (Corning, 2002) result from economies or diseconomies of scale in the production of a social good, so that the net effect of several individuals behaving socially can be more or less than the sum of individual effects. For instance, enzyme production in microbial cooperation is likely to be nonlinear, as in the cases of invertase hydrolyzing disaccharides into glucose in the budding yeast Saccharomyces cerevisiae (Gore et al., 2009) or virulence factors triggering gut inflammation (and hence removal of competitors) in the pathogen Salmonella typhimurium (Ackermann et al., 2008). In the former case, the relationship between growth rate and glucose concentration in yeast has been reported to be sublinear, i.e., invertase production has diminishing returns or negative synergies of scale (Gore et al., 2009, fig. 3.c); in the latter case, the relationship between the level of expression of virulence factors and inflammation intensity appears to be superlinear, i.e., it exhibits increasing returns or positive synergies of scale (Ackermann et al., 2008, fig. 2.d).

Previous theoretical work has investigated the effects of relatedness and synergy on the evolution of helping either independently of each other or by means of simplified models that neglect crucial interactions between the two factors. For instance, the effects of demography on relatedness and the scale of competition in family and spatially structured populations have often been explored under the assumption of additive payoff effects (e.g., Taylor, 1992; Taylor and Irwin, 2000; Lehmann et al., 2006; Gardner and West, 2006), while synergistic interactions have usually been investigated under the assumption that individuals are unrelated (e.g., Motro, 1991; Leimar and Tuomi, 1998; Hauert et al., 2006). In the cases where relatedness and synergy have been considered to operate in conjunction, it has been customary to model social interactions by means of a two-player Prisoner’s Dilemma, modified by adding a synergy parameter $D$ to the payoff of mutual cooperation (Grafen, 1979; Queller, 1984, 1985, 1992; Fletcher and Zwick, 2006; Lehmann and Keller, 2006a, b; Ohtsuki, 2010; Gardner et al., 2011; Ohtsuki, 2012; Taylor and Maciejewski, 2012; Van Cleve and Akçay, 2014). In this framework, $D > 0$ (positive synergy) implies positive frequency-dependent selection, while $D < 0$ (negative synergy) implies negative frequency-dependent selection. The value of relatedness only matters in determining whether or not positive synergy leads to bistability, or, respectively, whether or not negative synergy leads to coexistence.

Although illuminating in some aspects, such two-player models cannot capture patterns of synergy and resulting frequency dependence where positive (resp. negative) synergies of kind and negative (resp. positive) synergies of scale do not combine additively. Such situations are, however, likely to be common in nature. For example, collective hunting often features both positive synergies of kind and negative synergies of scale (Packer and Ruttan, 1988), while the production of virulence factors in S. typhimurium features both negative synergies of kind and positive synergies of scale (Ackermann et al., 2008). Models of two-player matrix games between relatives miss these patterns of synergy (and possible interactions between relatedness and synergy) because such games are linear, and only nonlinear games (which necessarily involve at least three-party interactions) can accommodate both negative and positive synergies without conflating them into a single parameter. Although previous work has explored instances of $n$-player games between relatives (e.g., Boyd and Richerson, 1988; Eshel and Motro, 1988; Archetti, 2009; Van Cleve and Lehmann, 2013; Marshall, 2014), this has been done only for specific population or payoff structures, and hence not in a comprehensive manner.

In this article, we study the interplay between relatedness and synergies of kind and scale in models of $n$-player social interactions between relatives. In order to do so, we first present a general framework that integrates $n$-player matrix games (e.g., Kurokawa and Ihara, 2009; Gokhale and Traulsen, 2010) into the “direct fitness” approach (Taylor and Frank, 1996; Rousset, 2004) of social evolution theory. This framework allows us to deliver a tractable expression for the selection gradient (or gain function) determining the evolutionary dynamics, which differs from the corresponding expression for $n$-player games between unrelated individuals only in that “inclusive gains from switching” rather than solely “direct gains from switching” must be taken into account.

We then use the theoretical framework to investigate the interaction between relatedness, synergies of kind, and synergies of scale in the evolution of helping. We show the importance of distinguishing between three different kinds of helping traits (which we call “whole-group”, “nonexpresser-only” and “expresser-only”), that are characterized by different types of synergies of kind (none for “whole-group”, negative for “nonexpresser-only”, positive for “expresser-only”), and can be subjected to different synergies of scale. Our analysis demonstrates that the interplay between relatedness and synergy can lead to patterns of frequency dependence, evolutionary dynamics, and bifurcations that cannot arise when considering synergistic interactions between unrelated individuals. Thereby, our approach illustrates how relatedness and synergy combine nontrivially to affect the evolution of social behaviors.

## 2 Modeling framework

### 2.1 Population structure (demography)

We consider a homogeneous haploid population subdivided into a finite and constant number of groups, each with a constant number $N$ of adult individuals (see table 1 for a list of symbols). The following events occur cyclically and span a demographic time period. Each adult individual gives birth to offspring and then survives with a constant probability, so that individuals can be semelparous (die after reproduction) or iteroparous (survive for a number of demographic time periods). After reproduction, offspring dispersal occurs. Then, offspring in each group compete for breeding spots vacated by the death of adults so that exactly $N$ individuals reach adulthood in each group.

Dispersal between groups may follow a variety of schemes, including the island model of dispersal (Wright, 1931; Taylor, 1992), isolation by distance (Malécot, 1975; Rousset, 2004), hierarchical migration (Sawyer and Felsenstein, 1983; Lehmann and Rousset, 2012), a model where groups split into daughter groups and compete against each other (Gardner and West, 2006; Lehmann et al., 2006; Traulsen and Nowak, 2006), and several variants of the haystack model (e.g., Matessi and Jayakar, 1976; Godfrey-Smith and Kerr, 2009). We leave the exact details of the life history unspecified, but assume that they fall within the scope of models of spatially homogeneous populations with constant population size (see Rousset, 2004, ch. 6).

### 2.2 Social interactions (games and payoffs)

Each demographic time period, individuals interact socially by participating in a game between $n$ players. Interactions can occur among all adults in a group ($n = N$), among a subset of such individuals ($n < N$) or among offspring before dispersal ($n > N$). Individuals may either express a social behavior (e.g., cooperate in a Prisoner’s Dilemma) or not (e.g., defect in a Prisoner’s Dilemma). We denote these two possible actions by A (“cooperation”) and B (“defection”) and also refer to A-players as “expressers” and to B-players as “nonexpressers”. The game is symmetric so that, from the point of view of a focal individual, any two co-players playing the same action are exchangeable. We denote by $a_k$ the material payoff to an A-player when $k = 0, 1, \ldots, n-1$ co-players choose A (and hence $n-1-k$ co-players choose B). Likewise, we denote by $b_k$ the material payoff to a B-player when $k$ co-players choose A.

We assume that individuals implement mixed strategies, i.e., they play A with probability $z$ (and hence play B with probability $1-z$). The set of available strategies is then the interval $[0,1]$. At any given time only two strategies are present in the population: residents who play A with probability $z$ and mutants who play A with probability $z+\delta$. Let us denote by $z_\bullet$ the strategy (either $z$ or $z+\delta$) of a focal individual, and by $z_{\ell(\bullet)}$ the strategy of the $\ell$-th co-player of such focal. The expected payoff to the focal is then

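$$ \pi\left(z_\bullet, z_{1(\bullet)}, z_{2(\bullet)}, \ldots, z_{n-1(\bullet)}\right) = \sum_{k=0}^{n-1} \phi_k\left(z_{1(\bullet)}, z_{2(\bullet)}, \ldots, z_{n-1(\bullet)}\right) \left[z_\bullet a_k + (1-z_\bullet) b_k\right], \tag{1} $$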

where $\phi_k$ is the probability that exactly $k$ co-players play action A. A first-order Taylor-series expansion about the average strategy $z_\circ = \sum_{\ell=1}^{n-1} z_{\ell(\bullet)}/(n-1)$ of co-players shows that, to first order in $\delta$, the probability $\phi_k$ is given by a binomial distribution with parameters $n-1$ and $z_\circ$, i.e.,

$$ \phi_k\left(z_{1(\bullet)}, z_{2(\bullet)}, \ldots, z_{n-1(\bullet)}\right) = \binom{n-1}{k} z_\circ^k (1-z_\circ)^{n-1-k} + O(\delta^2). \tag{2} $$

Substituting (2) into (1) and discarding second and higher order terms, we obtain

$$ \pi(z_\bullet, z_\circ) = \sum_{k=0}^{n-1} \binom{n-1}{k} z_\circ^k (1-z_\circ)^{n-1-k} \left[z_\bullet a_k + (1-z_\bullet) b_k\right] \tag{3} $$

for the payoff of a focal individual as a function of the focal’s strategy and the average strategy of the focal’s co-players (see also Rousset, 2004, p. 95 and Van Cleve and Lehmann, 2013, p. 85).
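As an illustration, equation (3) can be evaluated numerically. The following Python sketch is not part of the original analysis: the function name `expected_payoff` and the three-player payoff values are hypothetical and chosen only to make the computation concrete.

```python
from math import comb

def expected_payoff(z_focal, z_co, a, b):
    """Eq. (3): expected payoff to a focal playing A with probability z_focal
    when each of its n-1 co-players plays A with average probability z_co.
    a[k] (resp. b[k]) is the payoff to an A-player (resp. B-player)
    when k co-players choose A, for k = 0, ..., n-1."""
    n = len(a)
    total = 0.0
    for k in range(n):
        # Binomial probability that exactly k of the n-1 co-players play A.
        prob_k = comb(n - 1, k) * z_co**k * (1 - z_co)**(n - 1 - k)
        total += prob_k * (z_focal * a[k] + (1 - z_focal) * b[k])
    return total

# Hypothetical payoffs for n = 3 (illustration only).
a = [-1.0, 1.0, 3.0]
b = [0.0, 2.0, 4.0]
payoff = expected_payoff(0.5, 0.5, a, b)
```

At the pure-strategy corners the function returns single payoff entries; for instance, `expected_payoff(1.0, 0.0, a, b)` reduces to `a[0]`.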

### 2.3 Gain function and convergence stability

Consider a population of residents playing $z$ in which a single mutant $z+\delta$ appears due to mutation, and denote by $\rho$ the fixation probability of the mutant. We take the phenotypic selection gradient $\mathcal{S} = (\mathrm{d}\rho/\mathrm{d}\delta)_{\delta=0}$ as a measure of evolutionary success of the mutant (Rousset and Billiard, 2000, p. 819; Van Cleve, 2014, p. 17); indeed, $\mathcal{S} > 0$ entails that the mutant has a fixation probability greater than neutral under weak selection ($|\delta| \ll 1$). In order to evaluate the fixation probability, we assume that each demographic time period the material payoff to an individual determines its own fecundity (number of offspring produced before competition) or that of its parent (if interactions occur among offspring) by letting the average fecundity of an adult relative to a baseline be equal to the average payoff of a focal actor (i.e., payoffs from the game have “fecundity effects” as opposed to “survival effects”, e.g., Taylor and Irwin, 2000). With this and our demographic assumptions, $\mathcal{S}$ is proportional to the “gain function” given by

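$$ G(z) = \underbrace{\left.\frac{\partial \pi(z_\bullet, z_\circ)}{\partial z_\bullet}\right|_{z_\bullet = z_\circ = z}}_{\text{“direct” effect, } -C(z)} + \kappa \underbrace{\left.\frac{\partial \pi(z_\bullet, z_\circ)}{\partial z_\circ}\right|_{z_\bullet = z_\circ = z}}_{\text{“indirect” effect, } B(z)} = -C(z) + \kappa B(z) \tag{4} $$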

(see, e.g., Van Cleve and Lehmann, 2013, eq. 7).

Equation (4) shows that the gain function is determined by three components. First, the “direct” effect $-C(z)$, which describes the change in average payoff of the focal resulting from the focal infinitesimally changing its own strategy. Second, the “indirect” effect $B(z)$, which describes the change in average payoff of the focal resulting from the focal’s co-players changing their strategy infinitesimally. Third, the indirect effect is weighted by the “scaled relatedness coefficient” $\kappa$, which is a measure of relatedness between the focal individual and its neighbors, demographically scaled so as to capture the effects of local competition on selection (Queller, 1994; Lehmann and Rousset, 2010; Akçay and Van Cleve, 2012). We discuss these three components of the gain function in more detail in the following section.

Knowledge of equation (4) is sufficient to characterize convergent stable strategies (Eshel and Motro, 1981; Eshel, 1983; Taylor, 1989; Christiansen, 1991; Geritz et al., 1998; Rousset, 2004). In our context, candidate convergent stable strategies are either “singular points” (i.e., values $z^* \in (0,1)$ such that $G(z^*) = 0$), or the two pure strategies $z = 1$ (always play A) and $z = 0$ (always play B). In particular, a singular point $z^*$ is convergent stable if $\mathrm{d}G(z)/\mathrm{d}z|_{z=z^*} < 0$. Regarding the endpoints, $z = 0$ (resp. $z = 1$) is convergent stable if $G(0) < 0$ (resp. $G(1) > 0$). In this article we focus on convergence stability, and thus do not consider the possibility of disruptive selection, which can be ruled out by assuming that the evolutionary dynamics proceeds strictly through a sequence of mutant invasions and fixation events (e.g., the “substitution process”; Gillespie, 1991; the “trait substitution process”; Metz et al., 1996; or the “trait substitution sequence”; Champagnat et al., 2006).

### 2.4 Inclusive gains from switching

From equation (4), the condition for a mutant to be favored by selection can be written as $-C(z) + \kappa B(z) > 0$. This can be understood as a scaled form of the marginal version of Hamilton’s rule (Lehmann and Rousset, 2010) with $C(z)$ corresponding to the marginal direct costs and $B(z)$ to the marginal indirect benefits of expressing an increased probability of playing action A. These marginal costs and benefits are not measured in terms of actual fitness (number of adult offspring, which are the units of measurement of $c$ and $b$ in Hamilton’s rule as given in the introduction, see e.g., Rousset, 2004, p. 113), but in terms of fecundity via payoffs in a game. The scaled relatedness coefficient $\kappa$ is also not equal to the regression definition of relatedness present in the standard Hamilton’s rule, except for special cases where competition is completely global (Queller, 1994).

The coefficient $\kappa$ is a function of demographic parameters such as migration rate, group size, and vital rates of individuals or groups, but is independent of the evolving trait $z$ (Van Cleve and Lehmann, 2013). For instance, in the island model with overlapping generations, $\kappa$ depends on the group size, the migration rate, and the probability of surviving to the next generation (Taylor and Irwin, 2000, eq. A10; Akçay and Van Cleve, 2012, app. A2). In broad terms, we have (i) $\kappa > 0$ for population structures characterized by positive assortment and relatively global competition, (ii) $\kappa = 0$ for infinitely large panmictic populations or for viscous populations with local competition exactly compensating for increased assortment of strategies (Taylor, 1992), and (iii) $\kappa < 0$ for population structures characterized by negative assortment and/or very strong local competition (e.g., density-dependent competition occurs before dispersal). Scaled relatedness coefficients have been evaluated for many life cycle conditions (see table 1 of Lehmann and Rousset, 2010, table 1 of Van Cleve and Lehmann, 2013, and references therein; see also app. A for values of $\kappa$ under different variants of the haystack model).

In contrast to $\kappa$, which depends only on population structure, the other two components of the gain function are solely determined by the payoff structure of the social interaction. In the following, we show how $-C(z)$ and $B(z)$ can be expressed in terms of the payoffs $a_k$ and $b_k$ of the game. Doing so delivers an expression for $G(z)$ that can be analyzed with the same techniques applicable for games between unrelated individuals. This expression provides the foundation for our subsequent analysis.

Imagine a focal individual playing B in a group where $k$ of its co-players play A. Suppose that this focal individual unilaterally switches its action to A while its co-players hold fixed their actions, thus changing its payoff from $b_k$ to $a_k$. As a consequence, the focal experiences a “direct gain from switching” given by

$$ d_k = a_k - b_k, \quad k = 0, 1, \ldots, n-1. \tag{5} $$

At the same time, each of the focal’s co-players playing A experiences a change in payoff given by $\Delta a_{k-1} = a_k - a_{k-1}$ and each of the focal’s co-players playing B experiences a change in payoff given by $\Delta b_k = b_{k+1} - b_k$. Hence, taken as a block, the co-players of the focal experience a change in payoff given by

$$ e_k = k \Delta a_{k-1} + (n-1-k) \Delta b_k, \quad k = 0, 1, \ldots, n-1, \tag{6} $$

where we let $a_{-1} = b_n = 0$ for mathematical convenience. From the perspective of the focal, this change in payoffs represents an “indirect gain from switching” that the focal obtains if co-players are related.

In appendix B, we show that the partial derivatives appearing in (4) can be expressed as expected values of the direct and indirect gains from switching, so that the direct and indirect effects are respectively given by

$$ -C(z) = \sum_{k=0}^{n-1} \binom{n-1}{k} z^k (1-z)^{n-1-k} d_k, \tag{7} $$

and

$$ B(z) = \sum_{k=0}^{n-1} \binom{n-1}{k} z^k (1-z)^{n-1-k} e_k. \tag{8} $$

Hence, defining the “inclusive gains from switching” as

$$ f_k = d_k + \kappa e_k, \quad k = 0, 1, \ldots, n-1, \tag{9} $$

the gain function can be written as the expected value of the inclusive gains from switching:

$$ G(z) = \sum_{k=0}^{n-1} \binom{n-1}{k} z^k (1-z)^{n-1-k} f_k. \tag{10} $$
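To make equations (5)–(10) concrete, the gains from switching and the gain function can be computed directly from the payoff sequences. The sketch below is only an illustration under stated assumptions: the function names `gains_from_switching` and `gain_function`, the payoff values, and the value of the scaled relatedness coefficient are all hypothetical.

```python
from math import comb

def gains_from_switching(a, b, kappa):
    """Return the lists (d, e, f) defined in eqs. (5), (6) and (9)."""
    n = len(a)
    d, e, f = [], [], []
    for k in range(n):
        a_prev = a[k - 1] if k > 0 else 0.0      # convention a_{-1} = 0
        b_next = b[k + 1] if k < n - 1 else 0.0  # convention b_n = 0
        d_k = a[k] - b[k]                                           # eq. (5)
        e_k = k * (a[k] - a_prev) + (n - 1 - k) * (b_next - b[k])   # eq. (6)
        d.append(d_k)
        e.append(e_k)
        f.append(d_k + kappa * e_k)                                 # eq. (9)
    return d, e, f

def gain_function(z, f):
    """Eq. (10): expected inclusive gain from switching at strategy z."""
    n = len(f)
    return sum(comb(n - 1, k) * z**k * (1 - z)**(n - 1 - k) * f[k]
               for k in range(n))

# Hypothetical three-player payoffs and scaled relatedness (illustration only).
a = [-1.0, 1.0, 3.0]
b = [0.0, 2.0, 4.0]
d, e, f = gains_from_switching(a, b, kappa=0.5)
```

In this example the inclusive gains are the same for every $k$, so the gain function is constant in $z$; payoff sequences with nonconstant $f_k$ yield a genuinely frequency-dependent gain function.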

An immediate consequence of equation (10) is that matrix games between relatives are mathematically equivalent to “transformed” games between unrelated individuals, where “inclusive payoffs” take the place of standard, or personal, payoffs. Indeed, consider a game in which a focal playing A (resp. B) obtains payoffs

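$$ a'_k = a_k + \kappa \left[k a_k + (n-1-k) b_{k+1}\right], \quad k = 0, 1, \ldots, n-1 \tag{11} $$

$$ b'_k = b_k + \kappa \left[k a_{k-1} + (n-1-k) b_k\right], \quad k = 0, 1, \ldots, n-1 \tag{12} $$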

when $k$ of its co-players play A. Using equations (5)–(6) we can rewrite equation (9) as $f_k = a'_k - b'_k$, so that the inclusive gains from switching are identical to the direct gains from switching in a game with payoff structure given by equations (11)–(12). The payoffs $a'_k$ (resp. $b'_k$) can be understood as inclusive payoffs consisting of the payoff obtained by a focal playing A (resp. B) plus $\kappa$ times the sum of the payoffs obtained by its co-players.

This observation has two relevant consequences. First, the results developed in Peña et al. (2014) for nonlinear $n$-player matrix games between unrelated individuals, which are based on the observation that the right side of (10) is a polynomial in Bernstein form (Farouki, 2012), also apply here, provided that (i) the inclusive gains from switching $f_k$ are used instead of the standard (direct) gains from switching $d_k$ in the formula for the gain function, and (ii) the concept of evolutionary stability is read as meaning convergence stability. For a large class of games, these results make it possible to identify convergence stable points from a direct inspection of the sign pattern of the inclusive gains from switching $f_k$. Second, we may interpret the effect of relatedness on selection as inducing the payoff transformation $a_k \to a'_k$, $b_k \to b'_k$. For $n = 2$, this payoff transformation is the one hinted at by Hamilton (1971) and later often discussed in the theoretical literature (Grafen, 1979; Hines and Maynard Smith, 1979; Day and Taylor, 1998), namely

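$$ \begin{pmatrix} a'_0 & a'_1 \\ b'_0 & b'_1 \end{pmatrix} = \begin{pmatrix} a_0 + \kappa b_1 & (1+\kappa) a_1 \\ (1+\kappa) b_0 & b_1 + \kappa a_0 \end{pmatrix}, $$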

where the payoff of the focal is augmented by adding $\kappa$ times the payoff of the co-player.
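The payoff transformation of equations (11)–(12) is straightforward to apply numerically. The following sketch assumes a hypothetical two-player Prisoner’s Dilemma and a hypothetical value of the scaled relatedness coefficient; the function name `inclusive_payoffs` is likewise introduced only for illustration.

```python
def inclusive_payoffs(a, b, kappa):
    """Eqs. (11)-(12): inclusive payoffs a'_k and b'_k for k = 0, ..., n-1."""
    n = len(a)
    a_inc, b_inc = [], []
    for k in range(n):
        a_prev = a[k - 1] if k > 0 else 0.0      # convention a_{-1} = 0
        b_next = b[k + 1] if k < n - 1 else 0.0  # convention b_n = 0
        a_inc.append(a[k] + kappa * (k * a[k] + (n - 1 - k) * b_next))
        b_inc.append(b[k] + kappa * (k * a_prev + (n - 1 - k) * b[k]))
    return a_inc, b_inc

# Hypothetical two-player Prisoner's Dilemma (illustration only):
# a = [a_0, a_1] are payoffs to A against B and against A;
# b = [b_0, b_1] are payoffs to B against B and against A.
a = [-1.0, 2.0]
b = [0.0, 3.0]
kappa = 0.5
a_inc, b_inc = inclusive_payoffs(a, b, kappa)

# Inclusive gains from switching as direct gains in the transformed game.
f = [x - y for x, y in zip(a_inc, b_inc)]
```

For $n = 2$ the output reproduces the entries of the matrix above, e.g., `a_inc[0]` equals `a[0] + kappa * b[1]`.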

## 3 Evolutionary dynamics of three kinds of helping traits

Throughout the following we assume that each A-player incurs a payoff cost $\gamma > 0$ in order for a social good to be produced (e.g., harvested food, nest defense, or help directed to others). The benefits of the social good are accrued by a subset of individuals in the group that we call “recipients”. Each recipient obtains a benefit $\beta_j \geq 0$ when there are $j > 0$ expressers in the group, and no benefit is produced if no individual expresses the social trait ($\beta_0 = 0$). The benefit is increasing in the number of expressers, that is, the “incremental benefit” $\Delta\beta_j = \beta_{j+1} - \beta_j$ is positive ($\Delta\beta_j > 0$).

Synergies of scale are characterized by the properties of the incremental benefits. In the absence of synergies of scale, each additional expresser increases the benefit by the same amount so that $\Delta\beta_j$ is constant, implying that $\beta_j$ is linear in $j$. With negative synergies of scale, $\Delta\beta_j$ is decreasing in $j$, whereas positive synergies of scale arise when $\Delta\beta_j$ is increasing in $j$. To illustrate the effects of synergies of scale on the evolutionary dynamics of the social trait, we will consider the special case in which incremental benefits are given by the geometric sequence $\Delta\beta_j = \beta\lambda^j$ for some $\beta > 0$ and $\lambda > 0$, so that benefits are given by

$$ \beta_j = \beta \sum_{\ell=0}^{j-1} \lambda^\ell. \tag{13} $$

With geometric benefits, synergies of scale are absent when $\lambda = 1$, negative when $\lambda < 1$, and positive when $\lambda > 1$.
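The geometric benefit sequence of equation (13) can be tabulated in a few lines. The sketch below is illustrative only; the function name `geometric_benefits` and the parameter values are hypothetical.

```python
def geometric_benefits(beta, lam, n):
    """Eq. (13): return [beta_0, beta_1, ..., beta_n] for geometric benefits."""
    return [beta * sum(lam**l for l in range(j)) for j in range(n + 1)]

def increments(benefits):
    """Incremental benefits: differences between consecutive benefits."""
    return [benefits[j + 1] - benefits[j] for j in range(len(benefits) - 1)]

# Hypothetical parameter values (illustration only).
negative_scale = geometric_benefits(beta=1.0, lam=0.5, n=3)  # diminishing returns
no_scale = geometric_benefits(beta=1.0, lam=1.0, n=3)        # linear benefits
positive_scale = geometric_benefits(beta=1.0, lam=2.0, n=3)  # increasing returns
```

The incremental benefits returned by `increments` are decreasing, constant, and increasing in the three cases, respectively.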

We distinguish three kinds of social traits according to which individuals are recipients and thus benefit from the expression of the social behavior: (i) “whole-group” (benefits accrue to all individuals in the group, fig. 1.a), (ii) “nonexpresser-only” (benefits accrue only to nonexpressers, fig. 1.b), and (iii) “expresser-only” (benefits accrue only to expressers, fig. 1.c). For whole-group traits there are no synergies of kind: benefits accrue to all individuals irrespective of their kind, i.e., whether they are expressers or nonexpressers. In contrast, nonexpresser-only traits feature negative synergies of kind, whereas expresser-only traits feature positive synergies of kind. These differences are reflected in different payoff structures for the corresponding $n$-player games, resulting in different direct, indirect, and inclusive gains from switching (see table 2).

A classical example of a whole-group trait is the voluntary provision of public goods (Samuelson, 1954). In this case, the expressed social behavior consists in the production of a good available to others and hence exploitable by nonproducing cheats (nonexpressers). Well-known instances of public-goods cooperation are sentinel behavior in animals (Maynard Smith, 1965; Clutton-Brock et al., 1999), and the secretion of extracellular products (Velicer, 2003; West et al., 2007), such as sucrose-digestive enzymes (Greig and Travisano, 2004; Gore et al., 2009), in social bacteria.

The most prominent social behavior matching our definition of a nonexpresser-only trait is altruistic self-sacrifice, which happens when individuals expressing the social behavior sacrifice themselves (or their reproduction) to benefit nonexpressers (Frank, 2006; West et al., 2006). Sterile castes in eusocial insects (Bourke and Franks, 1995), and bacteria lysing while releasing toxins (Fröhlich and Madeo, 2000) or virulence factors (Ackermann et al., 2008) that benefit other bacteria provide some examples of altruistic self-sacrifice in nature.

Expresser-only traits have been discussed under the rubrics of “synergistic” (Queller, 1984, 1985; Leimar and Tuomi, 1998) and “greenbeard” (Guilford, 1985; Gardner and West, 2010; Queller, 2011) effects, and conceptualized as involving “rowing” (Maynard Smith and Szathmáry, 1995, p. 261-262) or “stag hunt” (Skyrms, 2004) games. Often cited examples include collective hunting (Packer and Ruttan, 1988), foundresses cooperating in colony establishment (Bernasconi and Strassmann, 1999), aposematic (warning) coloration (Queller, 1984, 1985; Guilford, 1988), and the Ti plasmid in the bacterial pathogen Agrobacterium tumefaciens, which induces its plant host to produce opines, a food source that can be exploited only by bacteria bearing the plasmid (Dawkins, 1999, p. 218; White and Winans, 2007). In each of these examples, the social good accrues only to partners expressing the trait, either because of a greater tendency to group and interact or because of the action of an emergent recognition system discriminating expressers from nonexpressers.

For all three kinds of social traits, the indirect gains from switching are always nonnegative ($e_k \geq 0$ for all $k$) and hence the indirect effect $B(z)$ is nonnegative for all $z$. This implies that we deal with helping traits at the level of payoffs and that increasing $\kappa$ never leads to less selection for expressing the social behavior. Due to their different ways of defining recipients, however, each social trait is characterized by a social dilemma with structurally different payoffs, direct gains, and indirect gains from switching. For nonexpresser-only traits, the direct gains from switching are always negative ($d_k < 0$ for all $k$) and thus expressing the social behavior is also payoff altruistic ($-C(z) < 0$ and $B(z) \geq 0$ for all $z$). For whole-group and expresser-only traits, expressing the social behavior is not necessarily altruistic, depending on how the cost $\gamma$ compares to benefits ($\beta_j$, expresser-only traits) or incremental benefits ($\Delta\beta_j$, whole-group traits).
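These sign properties can be checked numerically. The sketch below assumes payoff sequences read off the verbal definitions of the three traits (table 2 is not reproduced in this section, so the payoffs coded here are our reading rather than a transcription): expressers pay a cost and receive the benefit only if they are recipients, and nonexpressers receive the benefit only if they are recipients. All names and parameter values are hypothetical.

```python
def trait_payoffs(kind, benefits, gamma):
    """Assumed payoffs (a, b); benefits = [beta_0, ..., beta_n], cost gamma."""
    n = len(benefits) - 1
    if kind == "whole-group":
        a = [-gamma + benefits[k + 1] for k in range(n)]
        b = [benefits[k] for k in range(n)]
    elif kind == "nonexpresser-only":
        a = [-gamma for _ in range(n)]
        b = [benefits[k] for k in range(n)]
    elif kind == "expresser-only":
        a = [-gamma + benefits[k + 1] for k in range(n)]
        b = [0.0 for _ in range(n)]
    else:
        raise ValueError(f"unknown kind: {kind}")
    return a, b

def direct_and_indirect_gains(a, b):
    """Direct gains d_k (eq. 5) and indirect gains e_k (eq. 6)."""
    n = len(a)
    d = [a[k] - b[k] for k in range(n)]
    e = []
    for k in range(n):
        a_prev = a[k - 1] if k > 0 else 0.0      # convention a_{-1} = 0
        b_next = b[k + 1] if k < n - 1 else 0.0  # convention b_n = 0
        e.append(k * (a[k] - a_prev) + (n - 1 - k) * (b_next - b[k]))
    return d, e

# Hypothetical example: n = 3, geometric benefits with beta = 1, lambda = 0.5.
benefits = [0.0, 1.0, 1.5, 1.75]
gamma = 0.5
kinds = ("whole-group", "nonexpresser-only", "expresser-only")
gains = {kind: direct_and_indirect_gains(*trait_payoffs(kind, benefits, gamma))
         for kind in kinds}
```

Each entry of `gains` holds the pair of lists `(d, e)` for one kind of trait, so the sign patterns can be inspected directly.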

Before turning to the analysis, we note that a fourth class of social traits is sometimes also distinguished in the literature, namely “other-only” traits where the benefits accrue to all other individuals in the group, but not to the focal expresser itself (Pepper, 2000). Other-only traits, as whole-group traits, lack synergies of kind, and hence the effects of relatedness on the evolutionary dynamics are qualitatively similar for whole-group and other-only traits. We discuss this latter case in more detail in section 3.5 and relegate the formal analysis, which is similar to the one for whole-group traits, to appendix D.

### 3.1 No synergies of scale

To isolate the effects of synergies of kind, we begin our analysis with the case in which synergies of scale are absent, that is, benefits take the linear form $\beta_j = \beta j$ ($\lambda = 1$ in eq. (13)). The resulting expressions for the inclusive gains from switching and the gain functions for the three different social traits are shown in table 3. In each case, the gain function can be written as

$$ G(z) = (n-1)\left[-C + \kappa B + (1+\kappa) D z\right], $$

where the parameter $C$ may be thought of as the “effective cost” per co-player of expressing the social trait when none of the co-players expresses the social trait. We have $C = \gamma/(n-1)$ when a focal expresser is not among the recipients (nonexpresser-only traits) and $C = (\gamma - \beta)/(n-1)$ otherwise (whole-group and expresser-only traits). The parameter $B$ measures the incremental benefit accruing to each co-player of a focal expresser when none of the co-players expresses the social trait. We thus have $B = 0$ for expresser-only traits and $B = \beta$ otherwise. Finally, $D$ measures synergies of kind and is thus null for whole-group traits ($D = 0$), negative for nonexpresser-only traits ($D = -\beta$) and positive for expresser-only traits ($D = \beta$).

In the absence of synergies of kind ($D = 0$, whole-group traits) selection is frequency independent and defection dominates cooperation ($z = 0$ is the only convergence stable strategy) if $-C + \kappa B < 0$ holds, whereas cooperation dominates defection ($z = 1$ is the only convergence stable strategy) if $-C + \kappa B > 0$ holds.

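With negative synergies of kind ($D < 0$), there is negative frequency-dependent selection. Defection dominates cooperation if $-C + \kappa B \leq 0$ holds, whereas cooperation dominates defection if $-C + \kappa B \geq -(1+\kappa) D$ holds. If $0 < -C + \kappa B < -(1+\kappa) D$ holds, both $z = 0$ and $z = 1$ are unstable and the singular point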

$$ z^* = \frac{C - \kappa B}{(1+\kappa) D} \tag{14} $$

is stable.

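With positive synergies of kind ($D > 0$), there is positive frequency-dependent selection. Defection dominates cooperation if $-C + \kappa B \leq -(1+\kappa) D$ holds, whereas cooperation dominates defection if $-C + \kappa B \geq 0$ holds. When $-(1+\kappa) D < -C + \kappa B < 0$, there is bistability: both $z = 0$ and $z = 1$ are stable and $z^*$ is unstable.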

This analysis reveals three important points. First, in the absence of synergies of scale the gain function is linear in $z$, which allows for a straightforward analysis of the evolutionary dynamics for all three kinds of social traits. Second, because of the linearity of the gain function, the evolutionary dynamics of such games fall into one of the four classical dynamical regimes arising from $2 \times 2$ games, namely (i) A dominates B, (ii) B dominates A, (iii) coexistence, and (iv) bistability (see, e.g., Cressman, 2003, section 2.2). Third, which of these dynamical regimes arises is determined by the interaction of relatedness with synergies of kind in a straightforward fashion. For all traits, defection dominates cooperation when relatedness is low. For whole-group traits, high values of relatedness imply that cooperation dominates defection. For nonexpresser-only and expresser-only traits, high relatedness also promotes cooperation, leading to either the coexistence of expressers and nonexpressers (nonexpresser-only traits) or to bistability (expresser-only traits).
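Because the gain function is linear in the absence of synergies of scale, the dynamical regime follows from the signs of the gain function at the two endpoints. The sketch below classifies the regime from the effective cost, the incremental benefit, the synergy-of-kind parameter, and the scaled relatedness coefficient; it assumes that the scaled relatedness coefficient is larger than $-1$, and the function name and parameter values are hypothetical.

```python
def classify_linear_game(C, B, D, kappa):
    """Classify the regime of G(z) = (n-1)[-C + kappa*B + (1+kappa)*D*z].
    Returns (regime, z_star), with z_star = None if no interior singular point."""
    g0 = -C + kappa * B          # G(0) / (n - 1)
    g1 = g0 + (1 + kappa) * D    # G(1) / (n - 1)
    if g0 > 0 > g1:
        return "coexistence", g0 / (g0 - g1)   # equals eq. (14)
    if g0 < 0 < g1:
        return "bistability", g0 / (g0 - g1)   # equals eq. (14)
    if g0 >= 0 and g1 >= 0 and (g0 > 0 or g1 > 0):
        return "A dominates B", None
    if g0 <= 0 and g1 <= 0 and (g0 < 0 or g1 < 0):
        return "B dominates A", None
    return "neutral", None

# Hypothetical values (illustration only): n = 5, beta = 1, kappa = 0.5.
# Nonexpresser-only, gamma = 1: C = gamma/(n-1), B = beta, D = -beta.
nonexpresser = classify_linear_game(C=0.25, B=1.0, D=-1.0, kappa=0.5)
# Expresser-only, gamma = 3: C = (gamma-beta)/(n-1), B = 0, D = beta.
expresser = classify_linear_game(C=0.5, B=0.0, D=1.0, kappa=0.5)
# Whole-group, gamma = 2: C = (gamma-beta)/(n-1), B = beta, D = 0.
whole_group = classify_linear_game(C=0.25, B=1.0, D=0.0, kappa=0.5)
```

Lowering the scaled relatedness coefficient in any of these examples (e.g., to zero) moves the dynamics toward the regime in which B dominates A, in line with the third point above.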

### 3.2 Whole-group traits with synergies of scale

For whole-group traits there are no synergies of kind, but either positive or negative synergies of scale may arise. How do such synergies of scale change the evolutionary dynamics of whole-group helping? Substituting the inclusive gains from switching given in table 2 into equation (10) shows that the gain function for whole-group traits is given by

$$G(z)=\sum_{k=0}^{n-1}\binom{n-1}{k}z^{k}(1-z)^{n-1-k}\left\{-\gamma+[1+\kappa(n-1)]\Delta\beta_k\right\}. \tag{15}$$

Since the incremental benefit satisfies for all , the gain function (15) is negative for . In this case, defection dominates cooperation and is the only stable point. Hence, we consider the case throughout the following.

If synergies of scale are negative ( decreasing in ), the direct gains (), indirect gains () and inclusive gains () from switching are all decreasing in . This implies that , and are all decreasing in (cf. Peña et al., 2014, remark 3). Similarly, if synergies of scale are positive ( increasing in ), , and are all increasing in and hence , and are all increasing in . In both cases the evolutionary dynamics are easily characterized by applying the results for public goods games with constant costs from Peña et al. (2014, section 4.3): with negative synergies of scale, defection dominates cooperation (so that is the only convergence stable strategy) if , whereas cooperation dominates defection if holds. If holds, there is coexistence: both and are unstable and there is a unique stable interior point . With positive synergies of scale, defection dominates cooperation if , whereas cooperation dominates defection if . If holds, there is bistability: both and are stable and there is a unique, unstable interior point separating the basins of attraction of these two stable strategies. These results resemble those for the cases in which there are no synergies of scale (section 3.1) but negative (resp. positive) synergies of kind are present. In particular, it is again the case that the evolutionary dynamics fall into one of the four classical dynamical regimes arising from games.

The effect of relatedness on the evolution of whole-group traits can be better grasped by noting that multiplying and dividing (15) by , we obtain

$$G(z)=[1+\kappa(n-1)]\sum_{k=0}^{n-1}\binom{n-1}{k}z^{k}(1-z)^{n-1-k}\left(-\tilde{\gamma}+\Delta\beta_k\right), \tag{16}$$

where . Equation (16) is (up to multiplication by a positive constant) equivalent to the gain function of a public goods game between unrelated individuals with payoff cost for producing the public good, which has been analyzed under different assumptions on the shape of the benefit sequence (Motro, 1991; Bach et al., 2006; Hauert et al., 2006; Peña et al., 2014). Hence, relatedness can be conceptualized as affecting only the cost of cooperation, while leaving synergies of scale and patterns of frequency dependence unchanged.
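As a rough numerical illustration of the rescaling argument, the sketch below evaluates the gain function in Bernstein form and checks that equations (15) and (16) agree. It assumes that the rescaled cost is γ̃ = γ/[1 + κ(n − 1)], which is what multiplying and dividing (15) by 1 + κ(n − 1) yields. The function names and the example benefit sequence are hypothetical.

```python
from math import comb

def gain_function(z, f):
    """Evaluate G(z) = sum_k C(n-1, k) z^k (1-z)^(n-1-k) f[k] (Bernstein form).
    The list f holds the inclusive gains f_0, ..., f_{n-1}."""
    m = len(f) - 1  # m = n - 1 co-players
    return sum(comb(m, k) * z**k * (1 - z)**(m - k) * f[k] for k in range(m + 1))

def whole_group_gains(gamma, delta_beta, kappa):
    """Inclusive gains -gamma + [1 + kappa(n-1)] * delta_beta[k], as in eq. (15)."""
    n = len(delta_beta)
    scale = 1 + kappa * (n - 1)
    return [-gamma + scale * db for db in delta_beta]

# Hypothetical example: n = 4, decreasing incremental benefits
delta_beta = [1.0, 0.8, 0.64, 0.512]
gamma, kappa, z = 1.5, 0.2, 0.3
scale = 1 + kappa * (len(delta_beta) - 1)

g15 = gain_function(z, whole_group_gains(gamma, delta_beta, kappa))
# Eq. (16): game between unrelated individuals with rescaled cost gamma / scale
g16 = scale * gain_function(z, [-gamma / scale + db for db in delta_beta])
assert abs(g15 - g16) < 1e-12
```

Because the two forms differ only by a positive factor and a rescaled cost, the sign pattern of G(z), and hence the dynamical regime, can be read off from the game between unrelated individuals.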

As a concrete example, consider the case of geometric benefits (13) with (see table 4 for a summary of the results and app. C for a derivation). We find that there are two critical cost-to-benefit ratios

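$$\varepsilon=\min\left(1+\kappa(n-1),\ \lambda^{n-1}[1+\kappa(n-1)]\right)\quad\text{and}\quad\vartheta=\max\left(1+\kappa(n-1),\ \lambda^{n-1}[1+\kappa(n-1)]\right), \tag{17}$$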

such that for small costs () cooperation dominates defection ( is the only stable point) and for large costs () defection dominates cooperation ( is the only stable point). For intermediate costs (), there is a singular point given by

$$z^{*}=\frac{1}{1-\lambda}\left[1-\left(\frac{\gamma}{\beta[1+\kappa(n-1)]}\right)^{\frac{1}{n-1}}\right], \tag{18}$$

such that the evolutionary dynamics are characterized by coexistence if synergies of scale are negative () and by bistability if synergies of scale are positive (). It is clear from equation (17) that, for a given cost-to-benefit ratio , increasing relatedness makes larger (resp. smaller) the region in the parameter space where cooperation (resp. defection) dominates. Moreover, and from equation (18), is an increasing (resp. decreasing) function of when (resp. ), meaning that the proportion of individuals cooperating at a stable interior point (resp. the size of the basin of attraction of the fully cooperative equilibrium) increases as a function of (see fig. 2.a and 2.d).
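The following sketch shows how equations (17) and (18) could be used together. It assumes geometric incremental benefits of the form Δβ_k = βλ^k with λ ≠ 1 (the form implied by equation (C.1)) and 1 + κ(n − 1) > 0. The function names and the handling of the boundary cases are illustrative choices, not taken from the paper.

```python
def whole_group_geometric(gamma, beta, lam, kappa, n):
    """Classify whole-group traits with geometric benefits via eqs. (17)-(18)."""
    scale = 1 + kappa * (n - 1)
    eps = min(scale, lam ** (n - 1) * scale)     # lower critical ratio, eq. (17)
    theta = max(scale, lam ** (n - 1) * scale)   # upper critical ratio, eq. (17)
    ratio = gamma / beta                         # cost-to-benefit ratio
    if ratio <= eps:
        return "cooperation dominates", None
    if ratio >= theta:
        return "defection dominates", None
    z_star = (1 - (ratio / scale) ** (1 / (n - 1))) / (1 - lam)   # eq. (18)
    return ("coexistence" if lam < 1 else "bistability"), z_star

def gain_c1(z, gamma, beta, lam, kappa, n):
    """Closed-form gain function of eq. (C.1)."""
    return -gamma + (1 + kappa * (n - 1)) * beta * (1 - z + lam * z) ** (n - 1)

regime, z_star = whole_group_geometric(gamma=1.0, beta=1.0, lam=0.5, kappa=0.1, n=5)
# The singular point returned by eq. (18) should be a root of eq. (C.1)
assert regime == "coexistence"
assert abs(gain_c1(z_star, 1.0, 1.0, 0.5, 0.1, 5)) < 1e-12
```

Raising `kappa` in this sketch increases both critical ratios, which matches the statement that relatedness enlarges the region where cooperation dominates.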

### 3.3 Nonexpresser-only traits with synergies of scale

For nonexpresser-only traits, synergies of kind are negative. In the absence of synergies of scale, and as discussed in section 3.1, this implies negative frequency dependence. To investigate how positive or negative synergies of scale change this baseline scenario, we focus on the case in which relatedness is nonnegative ().

From the formulas for and given in table 2, it is clear that, independently of any synergies of scale, the direct gains from switching are decreasing in . Hence, the direct effect is negative frequency-dependent. When synergies of scale are negative, the indirect gains from switching are also decreasing in , implying that the indirect effect is also negative frequency-dependent and that the same is true for the gain function . Hence, negative synergies of scale lead to evolutionary dynamics that are qualitatively identical to those arising when synergies of scale are absent: for low relatedness, defection dominates cooperation, and for sufficiently high relatedness, a unique interior stable equilibrium appears (see app. E.1 and fig. 2.b).

When synergies of scale are positive, the indirect gains from switching may still be decreasing in because the incremental gain accrues to a smaller number of recipients () as increases. In such a scenario, always applicable when , the evolutionary dynamics are again qualitatively identical to those arising when synergies of scale are absent. A different picture can emerge if holds and synergies of scale are not only positive, but also sufficiently strong. Then, the indirect gains from switching may be unimodal (first increasing, then decreasing) in , implying (Peña et al., 2014) that the indirect benefit is similarly unimodal, featuring positive frequency dependence for small and negative frequency dependence for large . Depending on the value of relatedness, which modulates how the frequency dependence of interacts with that of , this can give rise to evolutionary dynamics different from those possible without synergies of scale, discussed in section 3.1.

For a concrete example of such evolutionary dynamics, consider the case of geometric benefits (13) with (see table 4 for a summary of results, app. E.2 for their derivation and fig. 2.e for an illustration). In this case, the evolutionary dynamics for and depend on the critical value

$$\varrho=\frac{1+\kappa(n-1)}{\kappa(n-2)}, \tag{19}$$

and on the two critical cost-to-benefit ratios

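$$\zeta=\kappa(n-1),\quad\text{and}\quad\eta=\frac{1}{\lambda-1}\left[1+\lambda\kappa\left(\frac{(n-2)\lambda\kappa}{1+\kappa(n-1)}\right)^{n-2}\right], \tag{20}$$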

which satisfy and .

With these definitions our results can be stated as follows. For the dynamical outcome depends on how the cost-to-benefit ratio compares to . If (high costs), defection dominates cooperation, while if (low costs), there is coexistence. For , the dynamical outcome also depends on how the cost-to-benefit ratio compares to . If (high costs), defection dominates cooperation. If (low costs), we have coexistence, with the stable singular point satisfying where

$$\hat{z}=\frac{\kappa[(n-2)\lambda-(n-1)]-1}{[1+\kappa(n-1)](\lambda-1)}. \tag{21}$$

In the remaining case (, intermediate costs) the dynamics are characterized by bistable coexistence, with stable, unstable, and two singular points (unstable) and (stable) satisfying . Numerical values for (resp. ) can be obtained by searching for roots of in the interval (resp. ), as we illustrate in figure 2.e.
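The root search mentioned above can be sketched as follows. Several ingredients are assumptions, since they come from parts of the paper not reproduced in this section: the inclusive gains are taken to be f_k = −γ − β_k + κ(n − 1 − k)Δβ_k with β_k = β(λ^k − 1)/(λ − 1) and Δβ_k = βλ^k, and the two search intervals are taken to be (0, ẑ) and (ẑ, 1). Under these assumptions the maximum of G(z)/β is attained at ẑ and equals η − γ/β, which is consistent with equations (20) and (21). The parameter values are hypothetical.

```python
from math import comb

def gain_nonexpresser(z, ratio, lam, kappa, n):
    """G(z)/beta in Bernstein form under the assumed inclusive gains
    f_k/beta = -ratio - (lam^k - 1)/(lam - 1) + kappa*(n-1-k)*lam^k."""
    m = n - 1
    total = 0.0
    for k in range(m + 1):
        f_k = -ratio - (lam**k - 1) / (lam - 1) + kappa * (m - k) * lam**k
        total += comb(m, k) * z**k * (1 - z)**(m - k) * f_k
    return total

def critical_values(lam, kappa, n):
    """Critical values of eqs. (19)-(21)."""
    scale = 1 + kappa * (n - 1)
    rho = scale / (kappa * (n - 2))                                        # eq. (19)
    zeta = kappa * (n - 1)                                                 # eq. (20)
    eta = (1 + lam * kappa * ((n - 2) * lam * kappa / scale) ** (n - 2)) / (lam - 1)
    z_hat = (kappa * ((n - 2) * lam - (n - 1)) - 1) / (scale * (lam - 1))  # eq. (21)
    return rho, zeta, eta, z_hat

def bisect_root(fn, lo, hi, tol=1e-12):
    """Plain bisection; assumes fn(lo) and fn(hi) have opposite signs."""
    f_lo = fn(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = fn(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameters with lam > rho and zeta < ratio < eta
lam, kappa, n, ratio = 3.0, 0.5, 5, 2.5
rho, zeta, eta, z_hat = critical_values(lam, kappa, n)
G = lambda z: gain_nonexpresser(z, ratio, lam, kappa, n)
z_left = bisect_root(G, 0.0, z_hat)    # unstable singular point
z_right = bisect_root(G, z_hat, 1.0)   # stable singular point
```

Bisection is used here only because it needs nothing beyond a sign change at the interval ends; any bracketing root finder would serve equally well.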

It is evident from the dependence of , , and on that relatedness plays an important role in determining the stable level(s) of expression of helping. As increases, the regions of the parameter space where some non-zero level of expression of helping is stable expand at the expense of the region of dominant non-expression. This is so because and are increasing functions of and is a decreasing function of . Moreover, inside these regions the stable non-zero probability of expressing helping increases with (see fig. 2.b and 2.e). Three cases can, however, be distinguished regarding the effects of increasing when starting from a point in the parameter space where is the only stable point. First, can remain stable irrespective of the value of relatedness, which characterizes high cost-to-benefit ratios. Second, the system can undergo a transcritical bifurcation as increases, destabilizing and leading to the appearance of a unique stable interior point (fig. 2.b). This happens when and are relatively small. Third, there is a range of intermediate cost-to-benefit ratios such that, for sufficiently large values of , the system undergoes a saddle-node bifurcation, whereby two singular points (, unstable, and , stable) appear (fig. 2.e). In this latter case, positive synergies of scale are strong enough to interact with negative synergies of kind and relatedness in a nontrivial way.

### 3.4 Expresser-only traits with synergies of scale

For expresser-only traits, and independently of any synergies of scale, the direct gains from switching (cf. table 2) are increasing in , implying that the direct effect is positive frequency-dependent. When synergies of scale are positive, the indirect gains from switching are also increasing in , so that the indirect effect is also positive frequency-dependent. Focusing on the case of nonnegative relatedness ($\kappa \geq 0$), this ensures that, just as when synergies of scale are absent, the gain function is positive frequency-dependent. Hence, the evolutionary dynamics are qualitatively identical to those arising from linear benefits: for low relatedness, defection dominates cooperation, and for high relatedness, there is bistability, with the basins of attraction of the two pure equilibria and being separated by a unique interior unstable point (see app. F.1 and fig. 2.f).

When synergies of scale are negative, the indirect gains from switching may still be increasing in because the incremental gain accrues to a larger number of recipients as increases. In such a scenario, always applicable when , the evolutionary dynamics are again qualitatively identical to those arising when synergies of scale are absent. A different picture can emerge if holds and synergies of scale are not only negative, but also sufficiently strong. In this case, can be negative frequency-dependent for some , and hence (for sufficiently high values of ) also . Similarly to the case of nonexpresser-only traits with positive synergies of scale, this can give rise to patterns of frequency dependence that go beyond the scope of helping without synergies of scale.

To illustrate this, consider the case of geometric benefits (13) with , , and (see table 4 for a summary of results, app. F.2 for proofs and fig. 2.c for an illustration). Defining the critical value

$$\xi=\frac{\kappa(n-2)}{1+\kappa(n-1)}, \tag{22}$$

and the two critical cost-to-benefit ratios

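$$\varsigma=\frac{1-\lambda^{n}}{1-\lambda}+\kappa(n-1)\lambda^{n-1},\quad\text{and}\quad\tau=\frac{1}{1-\lambda}\left[1+\lambda\kappa\left(\frac{(n-2)\kappa}{1+\kappa(n-1)}\right)^{n-2}\right], \tag{23}$$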

which satisfy and , our result can be stated as follows. For the evolutionary dynamics depends on how the cost-to-benefit ratio compares to and to . If (low costs), cooperation dominates defection, while if (high costs), defection dominates cooperation. If (intermediate costs), the dynamics are bistable. For , the classification of possible evolutionary dynamics is as in the case , except that, if , the dynamics are characterized by bistable coexistence, with stable, unstable, stable, and unstable, where

$$\hat{z}=\frac{1+\kappa}{[1+\kappa(n-1)](1-\lambda)}. \tag{24}$$

For , the critical values , , and are all increasing functions of . Hence, as relatedness increases, the regions of the parameter space where some level of expression of helping is stable expand at the expense of the region of dominant nonexpression. Moreover, inside these regions the stable positive probability of expressing helping increases with (fig. 2.c). When synergies of scale are “sufficiently” negative () and for intermediate cost-to-benefit ratios () relatedness and synergies interact in a nontrivial way, leading to saddle-node bifurcations as increases (fig. 2.c).
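A compact way to explore this classification numerically is sketched below. It rests on assumptions not stated in this section: inclusive gains f_k = −γ + β_{k+1} + κkΔβ_k with β_{k+1} = β(1 − λ^{k+1})/(1 − λ) and Δβ_k = βλ^k, with 0 < λ < 1 and κ ≥ 0, and a lower critical cost-to-benefit ratio equal to 1 (the value at which G(0) changes sign under these gains). Function names and boundary conventions are illustrative only.

```python
from math import comb

def gain_expresser(z, ratio, lam, kappa, n):
    """G(z)/beta in Bernstein form under the assumed inclusive gains
    f_k/beta = -ratio + (1 - lam^(k+1))/(1 - lam) + kappa*k*lam^k."""
    m = n - 1
    return sum(
        comb(m, k) * z**k * (1 - z)**(m - k)
        * (-ratio + (1 - lam**(k + 1)) / (1 - lam) + kappa * k * lam**k)
        for k in range(m + 1)
    )

def classify_expresser_only(ratio, lam, kappa, n):
    """Dynamical regime from the critical values of eqs. (22)-(23)."""
    scale = 1 + kappa * (n - 1)
    xi = kappa * (n - 2) / scale                                          # eq. (22)
    varsigma = (1 - lam**n) / (1 - lam) + kappa * (n - 1) * lam**(n - 1)  # eq. (23)
    tau = (1 + lam * kappa * xi**(n - 2)) / (1 - lam)                     # eq. (23)
    if ratio <= 1:               # assumed threshold: G(0) >= 0
        return "cooperation dominates"
    if ratio < varsigma:         # G(0) < 0 < G(1)
        return "bistability"
    if lam < xi and ratio < tau: # G(0) < 0, G(1) <= 0, but the maximum of G is positive
        return "bistable coexistence"
    return "defection dominates"
```

With the hypothetical values λ = 0.25, κ = 0.5 and n = 5, equation (24) gives ẑ = 2/3, and a cost-to-benefit ratio of 1.345 lies between the two critical ratios of equation (23), so the sketch reports bistable coexistence.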

### 3.5 Connections with previous models

Our model without synergies of scale, for which the gain function is linear in (section 3.1), extends classical two-player matrix games between relatives (e.g., Grafen, 1979; Frank, 1998, ch. 5–6) to the more general case of -player linear games between relatives. Indeed, for , identifying scaled relatedness with relatedness , and up to normalization of the payoff matrices, equation (14) recovers Grafen (1979, eq. 9) and Frank (1998, eq. 5.6). Interestingly, Frank (1998, p. 98) considers a two-player model of helping with two pure strategies (“nesting” or expressing a queen phenotype, and “helping” or expressing a sterile worker phenotype), which is a particular case of our model of nonexpresser-only traits.

Our results on whole-group traits with geometric returns (section 3.2 and app. C) extend the model studied by Hauert et al. (2006, p. 198) from the particular case of interactions between unrelated individuals () to the more general case of interactions between relatives () and recover the result by Archetti (2009, p. 476) in the limit , in which the game is also called a “volunteer’s dilemma” (Diekmann, 1985). Although we restricted our attention to the cases of constant, decreasing, and increasing incremental benefits, it is clear that equation (16) applies to benefits of any shape. Hence, general results about the stability of equilibria in public goods games (Peña et al., 2014) with sigmoid benefits (Bach et al., 2006; Archetti and Scheuring, 2011) carry over to games between relatives.

For their model of “self-destructive cooperation” in bacteria, Ackermann et al. (2008) assumed a nonexpresser-only trait with no synergies of scale, and a haystack model of population structure implying , where is the number of offspring among which the game is played (see eq. (A.4)). Identifying our and with (respectively) their with , the main result of Ackermann et al. (2008) (eq. 7 in their supplementary material) is recovered as a particular case of our result that the unique convergent stable strategy for this case is given by (eq. (14)). The fact that in this example is a probability of coalescence within groups shows that social interactions effectively occur between family members, and hence that kin selection is crucial to the understanding of self-destructive cooperation (Gardner and Kümmerli, 2008).

As mentioned before, the analysis of other-only traits follows closely that of whole-group traits (see app. D). The model of altruistic helping in Eshel and Motro (1988) considers such an other-only trait. In their model, one individual in the group needs help, which can be provided (action A) or denied (action B) by its neighbors: a situation Eshel and Motro call the “three brothers’ problem” when . Suppose that the cost for each helper is a constant independent of the number of expressers (Eshel and Motro’s (1988) “risk for each volunteer”, denoted by in their paper) and that the benefit for the individual in need when co-players offer help is given by (Eshel and Motro’s (1988) “gain function”, denoted by in their paper). Then, if individuals need help at random, the payoffs for helping (A) and not helping (B) are given by and . Defining and , we have and . Comparing these with the payoffs for whole-group traits in table 2, it is apparent that the key difference between other-only traits and whole-group traits is that an expresser is not among the recipients of its own helping behavior. As we show in appendix D, our results for whole-group traits carry over to such other-only traits. In particular, our results for whole-group traits with geometric benefits can be used to recover results 1, 2, and 3 of Eshel and Motro (1988) and to extend them from family-structured to spatially-structured populations.

Finally, Van Cleve and Lehmann (2013) discuss an -player coordination game. They assume payoffs given by and , for positive , and , satisfying , and . It is easy to see that both the direct effect and the indirect effect are strictly increasing functions of having exactly one sign change. This implies that, for , the evolutionary dynamics are characterized by bistability, with the basins of attraction of the two equilibria and being divided by the interior unstable equilibrium . Importantly, and in contrast to the social traits analyzed in this article, expressing the payoff dominant action A does not always qualify as a helping trait, as is negative for some interval . As a result, increasing scaled relatedness can have mixed effects on the location of . Both of these predictions are well supported by the numerical results reported by Van Cleve and Lehmann (2013), where increasing leads to a steady increase in for , , , , and a steady decrease in for , , , , see their figure 5. This illustrates that relatedness (and thus spatial structure) plays an important role not only in the specific context of helping games but also in the more general context of nonlinear multiplayer games.

## 4 Discussion

We have shown that, when phenotypic differences are small, the selection gradient on a mixed strategy of a symmetric two-strategy -player matrix game is proportional to the average inclusive payoff gain to an individual switching strategies, and that this can be written as a polynomial in Bernstein form (eq. (10)). As a result, convergence stability of strategies in spatially structured populations can be determined from the shape of the inclusive gain sequence (eq. (9)) and the mathematical properties of polynomials in Bernstein form (Farouki, 2012; Peña et al., 2014). We applied these results to the evolution of helping under synergies of scale and kind, and unified and extended previous analyses. The most important conclusion we reach is that, although an increase in (scaled) relatedness always tempers the social dilemma faced by cooperative individuals in a helping game, how the social dilemma is relaxed crucially depends on the synergies of kind and scale involved.

The simplest case is the one of whole-group traits (fig. 1a). Since there are no synergies of kind, only synergies of scale can introduce frequency dependent selection. For , negative (resp. positive) synergies of scale induce negative (resp. positive) frequency-dependent selection. Moreover, increasing relatedness can transform a game in which defection is dominant (Prisoner’s Dilemma) into a game in which cooperation and defection coexist (Snowdrift or anti-coordination game) when synergies of scale are negative (fig. 2.a), or into a game in which both cooperation and defection are stable (Stag Hunt or coordination game) when synergies of scale are positive (fig. 2.d).

More complex interactions between relatedness and frequency dependence can arise when there are both synergies of kind and scale. For nonexpresser-only traits (fig. 1.b), synergies of kind are negative and helping is altruistic, so that in the absence of relatedness defection dominates cooperation (as in a Prisoner’s Dilemma). When synergies of scale are absent (linear benefits) or negative (diminishing incremental benefits), both the direct and the indirect effect are decreasing in and selection is negative frequency-dependent. In this case, increasing relatedness might turn the game into a Snowdrift or an anti-coordination game, where the probability of cooperating is an increasing function of relatedness (fig. 2.b). Contrastingly, when synergies of scale are positive (increasing incremental benefits) the indirect effect may become unimodal in . This paves the way for new patterns of evolutionary dynamics and bifurcations. For the particular case of geometric benefits, we find that there is a range of cost-to-benefit ratios such that, for sufficiently strong positive synergies of scale, increasing relatedness induces a saddle-node bifurcation whereby two internal equilibria appear, the leftmost unstable and the rightmost stable (fig. 2.e). After the bifurcation occurs, the evolutionary dynamics are characterized by bistable coexistence, where the first stable equilibrium is pure defection () and the second is a mixed equilibrium in which individuals help with a positive probability ().

For expresser-only traits (where synergies of kind are positive, fig. 1c) a similar interaction between relatedness and synergies occurs. When synergies of scale are absent or positive, is increasing in for . In this case, increasing relatedness might turn a scenario reminiscent of the Prisoner’s Dilemma into a Stag Hunt or coordination game, where the size of the basin of attraction of the cooperative equilibrium is an increasing function of relatedness (fig. 2.f). Contrastingly, if synergies of scale are negative, relatedness may interact nontrivially with synergies to produce a dynamical outcome which is qualitatively identical to that arising from nonexpresser-only traits with positive synergies of scale, namely, bistable coexistence (fig. 2.c).

The three kinds of helping traits we considered are also different in the conditions they impose on the origin and the maintenance of helping. To see this, consider a payoff cost so large that the direct sequence is negative. For the case of unrelated individuals () this implies that B dominates A so that is the only stable strategy. We ask what happens when is increased and focus on the stability of the end-points and .

For whole-group traits, the indirect gains from switching when co-players are all defectors () and when co-players are all helpers () are both positive. This opens up the opportunity for both (i) to be destabilized if , and (ii) to be stabilized if , which underlies the classical effect that increasing relatedness can destabilize defection and stabilize helping.

In contrast, one of these two scenarios is missing for nonexpresser-only and expresser-only traits. For nonexpresser-only traits, we have and irrespectively of the shape of the benefit sequence. Hence, although can be destabilized by increasing (allowing for some level of helping to be evolutionarily accessible from ), can never be stabilized and so full helping is never an evolutionary (convergent) stable point. Exactly the opposite happens for expresser-only traits, where but . As a result, can become stable (if ) but can never be destabilized by increasing . This implies that, under our assumptions, an expresser-only trait with high costs () can never evolve from a monomorphic population of nonexpressers (), and this for any value of .

The kind of social trait also has a big impact on the amount of (scaled) relatedness required to stabilize some level of helping. This quantitative effect is illustrated in figure 2. When synergies of scale are negative () and the cost-to-benefit ratio is relatively low (), for whole-group and nonexpresser-only traits, moderate amounts of relatedness ( for whole-group traits, for nonexpresser-only traits) are sufficient for a non-zero level of expression of helping to be stable (fig. 2.a and 2.b). In contrast, a comparatively large amount of relatedness () is required for some non-zero level of helping to be stable if the trait is expresser-only (fig. 2.c). In the case of positive synergies of scale () and relatively high cost-to-benefit ratio (), full expression of helping is stable already with for whole-group and expresser-only traits (fig. 2.d and 2.f). Contrastingly, for nonexpresser-only traits, a positive probability of expressing helping is stable only for large values of relatedness (, fig. 2.e).

We modeled social interactions by assuming that actions implemented by players are discrete. This is in contrast to many kin-selection models of games between relatives, which assume a continuum of pure actions in the form of continuous amounts of effort devoted to some social activity (e.g., Frank 1994; Johnstone et al. 1999; Reuter and Keller 2001; Wenseleers et al. 2010). Such continuous-action models have the advantage that the “fitness function” or “payoff function” (the counterpart to our eq. (3)) usually takes a simple form that facilitates mathematical analysis. On the other hand, there are situations where individuals can express only a few behavioral alternatives or morphs, such as worker and queen in the eusocial Hymenoptera (Wheeler, 1986), different behavioral tactics in foraging (e.g., “producers” and “scroungers” in house sparrows Passer domesticus; Barnard and Sibly, 1981) and hunting (e.g., lionesses positioned as “wings” and others positioned as “centres” in collective hunts; Stander, 1992), or distinct phenotypic states (e.g., capsulated and non-capsulated cells in Pseudomonas fluorescens; Beaumont et al., 2009). These situations are more conveniently modeled by means of a discrete-action model like the one presented here, but we expect that our qualitative results about the interaction between synergy and relatedness carry over to continuous-action models.

Synergistic interactions are likely to be much more common in nature than additive interactions where both synergies of scale and kind are absent. Given the local demographic structure of biological populations, interactions between relatives are also likely to be the rule rather than the exception. Empirical work should thus aim at measuring not only the genetic relatedness of interactants and the fitness costs and benefits of particular actions, but also at identifying the occurrences of positive and negative synergies of kind and scale, as it is the interaction between synergies and relatedness which determines the qualitative outcomes of the evolutionary dynamics of helping (fig. 2).

## 5 Acknowledgements

This work was partly supported by Swiss NSF Grants PBLAP3-145860 (to JP) and PP00P3-123344 (to LL).

## Appendix A The haystack model

Many models of social interactions have assumed different versions of the haystack model (e.g., Matessi and Jayakar, 1976; Ackermann et al., 2008), where several rounds of unregulated reproduction can occur within groups before a round of complete dispersal (Maynard Smith, 1964) so that competition is effectively global. In these cases, as we will see below, takes the simpler interpretation of the coalescence probability of the gene lineage of two interacting individuals in their group. Here, we calculate for different variants of the haystack model.

The haystack model can be seen as a special case of the island model where dispersal is complete and where dispersing progeny compete globally. In this context, the fecundity of an adult is the number of its offspring reaching the stage of global density-dependent competition. The conception of offspring may occur in a single or over multiple rounds of reproduction, so that a growth phase within patches is possible. In this context, the number of “adults” is better thought of as the number of founding individuals (or lineages, or seeds) on a patch.

Two cases need to be distinguished when it comes to social interactions. First, the game can be played between the adult individuals (founders) in which case

$$\kappa=0, \tag{A.1}$$

since relatedness is zero among founders on a patch and there is no local competition. Alternatively, the game is played between offspring after reproduction and right before their dispersal. In this case two individuals can be related since they can descend from the same founder. Since there is no local competition, is directly the relatedness between two interacting offspring and is obtained as the probability that the two ancestral lineages of two randomly sampled offspring coalesce in the same founding individual (relatedness in the island model is defined as the cumulative coalescence probability over several generations, see e.g., Rousset, 2004, but owing to complete dispersal gene lineages can only coalesce in founders).

In order to evaluate for the second case, we assume that, after growth, exactly offspring are produced and that the game is played between them (). Founding individuals, however, may contribute a variable number of offspring. Let us denote by the random number of offspring descending from the “adult” individual on a representative patch after reproduction, i.e., is the size of lineage . Owing to our assumption that the total number of offspring is fixed, we have , where the ’s are exchangeable random variables (i.e., neutral process, ). The coalescence probability can then be computed as the expectation of the ratio of the total number of ways of sampling two offspring from the same founding parent to the total number of ways of sampling two offspring:

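$$\kappa=\mathrm{E}\left[\sum_{i=1}^{N}\frac{O_i(O_i-1)}{N_{\mathrm{o}}(N_{\mathrm{o}}-1)}\right]=N\left(\frac{\sigma^{2}+\mu^{2}-\mu}{N_{\mathrm{o}}(N_{\mathrm{o}}-1)}\right), \tag{A.2}$$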

where the second equality follows from exchangeability, is the expected number of offspring descending from any individual , and is the corresponding variance. Due to the fact that the total number of offspring is fixed, we also necessarily have (i.e., ), whereby

$$\kappa=\frac{N_{\mathrm{o}}-N}{N(N_{\mathrm{o}}-1)}+\frac{\sigma^{2}N}{N_{\mathrm{o}}(N_{\mathrm{o}}-1)}, \tag{A.3}$$

which holds for any neutral growth process.

We now consider different cases:

(i) Suppose that there is no variation in offspring production between founding individuals, as in the life cycle described by Ackermann et al. (2008). Then , and equation (A.3) simplifies to

$$\kappa=\frac{N_{\mathrm{o}}-N}{N(N_{\mathrm{o}}-1)}. \tag{A.4}$$

(ii) Suppose that each of the offspring has an equal chance of descending from any founding individual, so that each offspring is the result of a sampling event (with replacement) from a parent among the founding individuals. Then, the offspring number distribution is binomial with parameters and , whereby . Substituting into equation (A.3) produces

$$\kappa=\frac{1}{N}. \tag{A.5}$$

In more biological terms, this case results from a situation where individuals produce offspring according to a Poisson process and where exactly individuals are kept for interactions (i.e., the conditional branching process of population genetics; Ewens, 2004).

(iii) Suppose that the offspring distribution follows a beta-binomial distribution, with number of trials and shape parameters and . Then, and

$$\sigma^{2}=\frac{N_{\mathrm{o}}(N-1)(\alpha N+N_{\mathrm{o}})}{N^{2}(1+\alpha N)},$$

which yields

$$\kappa=\frac{1+\alpha}{1+\alpha N}. \tag{A.6}$$

In more biological terms, this reproductive scheme results from a situation where individuals produce offspring according to a negative binomial distribution (larger variance than Poisson, which is recovered when ), and where exactly individuals are kept for interactions.
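The three special cases can be checked against equation (A.3) with exact rational arithmetic. The sketch below is illustrative: the function names are hypothetical, the binomial variance is computed for N_o trials with success probability 1/N, and the beta-binomial variance is taken from the expression given above.

```python
from fractions import Fraction

def kappa_haystack(N, No, var):
    """Coalescence probability from eq. (A.3), given the offspring variance."""
    N, No, var = Fraction(N), Fraction(No), Fraction(var)
    return (No - N) / (N * (No - 1)) + var * N / (No * (No - 1))

def var_binomial(N, No):
    """Variance of a binomial with No trials and success probability 1/N."""
    return Fraction(No * (N - 1), N**2)

def var_beta_binomial(N, No, alpha):
    """Variance of the beta-binomial offspring distribution of case (iii)."""
    alpha = Fraction(alpha)
    return No * (N - 1) * (alpha * N + No) / (N**2 * (1 + alpha * N))

N, No, alpha = 4, 20, 2
assert kappa_haystack(N, No, 0) == Fraction(No - N, N * (No - 1))          # (A.4)
assert kappa_haystack(N, No, var_binomial(N, No)) == Fraction(1, N)         # (A.5)
assert kappa_haystack(N, No, var_beta_binomial(N, No, alpha)) == \
    Fraction(1 + alpha, 1 + alpha * N)                                      # (A.6)
```

Using `Fraction` rather than floating point makes the comparisons exact, so the assertions test the algebra rather than numerical tolerance.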

## Appendix B Gains from switching and the gain function

In the following we establish the expressions for and given in equations (7)–(8); equation (10) is then immediate from the definition of (9) and the identity .

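Recalling the definitions of and from equation (4) as well as the definitions of and from equations (5)–(6) we need to show

$$\left.\frac{\partial \pi(z_\bullet,z_\circ)}{\partial z_\bullet}\right|_{z_\bullet=z_\circ=z}=\sum_{k=0}^{n-1}\binom{n-1}{k}z^{k}(1-z)^{n-1-k}\left[a_k-b_k\right], \tag{B.1}$$

$$\left.\frac{\partial \pi(z_\bullet,z_\circ)}{\partial z_\circ}\right|_{z_\bullet=z_\circ=z}=\sum_{k=0}^{n-1}\binom{n-1}{k}z^{k}(1-z)^{n-1-k}\left[k\,\Delta a_{k-1}+(n-1-k)\,\Delta b_k\right], \tag{B.2}$$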

where the function has been defined in equation (3). Equation (B.1) follows directly by taking the partial derivative of with respect to and evaluating at , so it remains to establish equation (B.2).

Our derivation of equation (B.2) uses properties of polynomials in Bernstein form (Farouki, 2012). Such polynomials, which in general can be written as , where , satisfy

$$\frac{d}{dx}\sum_{k=0}^{m}\binom{m}{k}x^{k}(1-x)^{m-k}c_k=m\sum_{k=0}^{m-1}\binom{m-1}{k}x^{k}(1-x)^{m-1-k}\Delta c_k.$$

Applying this property to equation (3) and evaluating the resulting partial derivative at , yields

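$$\left.\frac{\partial \pi(z_\bullet,z_\circ)}{\partial z_\circ}\right|_{z_\bullet=z_\circ=z}=(n-1)z\sum_{k=0}^{n-2}\binom{n-2}{k}z^{k}(1-z)^{n-2-k}\Delta a_k+(n-1)(1-z)\sum_{k=0}^{n-2}\binom{n-2}{k}z^{k}(1-z)^{n-2-k}\Delta b_k. \tag{B.3}$$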

In order to obtain equation (B.2) from equation (B.3) it then suffices to establish

$$x\sum_{k=0}^{m-1}\binom{m-1}{k}x^{k}(1-x)^{m-1-k}c_k=\sum_{k=0}^{m}\binom{m}{k}x^{k}(1-x)^{m-k}\frac{k\,c_{k-1}}{m} \tag{B.4}$$

and

$$(1-x)\sum_{k=0}^{m-1}\binom{m-1}{k}x^{k}(1-x)^{m-1-k}c_k=\sum_{k=0}^{m}\binom{m}{k}x^{k}(1-x)^{m-k}\frac{(m-k)\,c_k}{m}, \tag{B.5}$$

as applying these identities to the terms on the right side of equation (B.3) yields the right side of equation (B.2).

Let us prove equation (B.4) (eq. (B.5) is proven in a similar way). Starting from the left side of equation (B.4), we multiply and divide by and distribute to obtain

$$x\sum_{k=0}^{m-1}\binom{m-1}{k}x^{k}(1-x)^{m-1-k}c_k=\sum_{k=0}^{m-1}\frac{m}{k+1}\binom{m-1}{k}x^{k+1}(1-x)^{m-(k+1)}c_k\,\frac{k+1}{m}.$$

Applying the identity and changing the index of summation to , we get

$$x\sum_{k=0}^{m-1}\binom{m-1}{k}x^{k}(1-x)^{m-1-k}c_k=\sum_{k=1}^{m}\binom{m}{k}x^{k}(1-x)^{m-k}\frac{k\,c_{k-1}}{m}.$$

Finally, changing the lower index of the sum by noting that the summand is zero when gives equation (B.4).
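Identities (B.4) and (B.5) are easy to spot-check numerically. The sketch below does so with exact rational arithmetic for an arbitrary coefficient sequence. The helper names and the test values are hypothetical, and a check at one point is of course an illustration, not a proof.

```python
from fractions import Fraction
from math import comb

def bernstein(x, c):
    """Evaluate sum_k C(m, k) x^k (1-x)^(m-k) c[k], with m = len(c) - 1."""
    m = len(c) - 1
    return sum(comb(m, k) * x**k * (1 - x)**(m - k) * c[k] for k in range(m + 1))

def lift_b4(c):
    """Coefficients k*c[k-1]/m (k = 0..m) on the right side of (B.4); m = len(c)."""
    m = len(c)
    return [Fraction(0)] + [Fraction(k) * c[k - 1] / m for k in range(1, m + 1)]

def lift_b5(c):
    """Coefficients (m-k)*c[k]/m (k = 0..m) on the right side of (B.5); m = len(c)."""
    m = len(c)
    return [Fraction(m - k) * c[k] / m for k in range(m)] + [Fraction(0)]

# Arbitrary coefficients c_0, ..., c_{m-1} with m = 4, and an arbitrary point x
c = [Fraction(2), Fraction(-1), Fraction(1, 2), Fraction(3)]
x = Fraction(3, 8)
assert x * bernstein(x, c) == bernstein(x, lift_b4(c))          # (B.4)
assert (1 - x) * bernstein(x, c) == bernstein(x, lift_b5(c))    # (B.5)
```

Both identities raise the degree of the Bernstein representation by one, which is why the lifted coefficient lists have one more entry than `c`.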

## Appendix C Whole-group traits with geometric benefits

With geometric benefits, we have , so that the inclusive gains from switching for whole-group traits are given by . Using the formula for the probability generating function of a binomial random variable, equation (10) can be written as

$$G(z)=-\gamma+[1+\kappa(n-1)]\beta(1-z+\lambda z)^{n-1}. \tag{C.1}$$

As is either decreasing () or increasing () in , A (resp. B) is a dominant strategy if and only if (resp. if and only if ). Using equation (C.1) to calculate and then yields the critical cost-to-benefit ratios and given in equation (17). The value of given in equation (18) is obtained by solving .
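For completeness, the algebra behind these statements can be written out as follows. This is a worked sketch that assumes γ > 0, β > 0, λ ≠ 1 and 1 + κ(n − 1) > 0, so that all quantities are well defined. Evaluating (C.1) at the end-points gives

$$G(0)=-\gamma+[1+\kappa(n-1)]\beta,\qquad G(1)=-\gamma+[1+\kappa(n-1)]\beta\lambda^{n-1},$$

so that the signs of G(0) and G(1) change exactly when γ/β crosses the two values appearing in equation (17). Solving for an interior root then gives

$$\begin{aligned}
G(z^{*})=0\;&\Longleftrightarrow\;(1-z^{*}+\lambda z^{*})^{n-1}=\frac{\gamma}{\beta[1+\kappa(n-1)]}\\
&\Longleftrightarrow\;1-(1-\lambda)z^{*}=\left(\frac{\gamma}{\beta[1+\kappa(n-1)]}\right)^{\frac{1}{n-1}}\\
&\Longleftrightarrow\;z^{*}=\frac{1}{1-\lambda}\left[1-\left(\frac{\gamma}{\beta[1+\kappa(n-1)]}\right)^{\frac{1}{n-1}}\right],
\end{aligned}$$

which is equation (18).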

## Appendix D Other-only traits

In contrast to what happens in whole-group traits, individuals expressing an other-only trait are automatically excluded from the consumption of the good they create, although they can still reap the benefits of goods created by other expressers in their group. Payoffs for such other-only traits are given by and , so that the inclusive gains from switching are given by . For this payoff constellation, it is straightforward to obtain the indirect benefits from equation (B.3) in appendix B. Observing that holds for all , we have

 B(z)=∂π(z∙,z∘