Relatedness and synergies of kind and scale in the evolution of helping
Abstract
Relatedness and synergy affect the selection pressure on cooperation and altruism. Although early work investigated the effect of these factors independently of each other, recent efforts have been aimed at exploring their interplay. Here, we contribute to this ongoing synthesis in two distinct but complementary ways. First, we integrate models of $n$-player matrix games into the direct fitness approach of inclusive fitness theory, hence providing a framework to consider synergistic social interactions between relatives in family and spatially structured populations. Second, we illustrate the usefulness of this framework by delineating three distinct types of helping traits (“whole-group”, “nonexpresser-only” and “expresser-only”), which are characterized by different synergies of kind (arising from differential fitness effects on individuals expressing or not expressing helping) and can be subjected to different synergies of scale (arising from economies or diseconomies of scale). We find that relatedness and synergies of kind and scale can interact to generate nontrivial evolutionary dynamics, such as cases of bistable coexistence featuring both a stable equilibrium with a positive level of helping and an unstable helping threshold. This broadens the qualitative effects of relatedness (or spatial structure) on the evolution of helping.
Keywords. evolution of helping, relatedness, synergy, inclusive fitness, evolutionary games

Department of Evolutionary Theory
Max Planck Institute for Evolutionary Biology
August-Thienemann-Str. 2, 24306 Plön, Germany
email: pena@evolbio.mpg.de 
Faculty of Business and Economics
University of Basel
Peter Merian-Weg 6, CH-4002 Basel, Switzerland
email: georg.noeldeke@unibas.ch 
Department of Ecology and Evolution
University of Lausanne
Le Biophore, CH-1015 Lausanne, Switzerland
email: laurent.lehmann@unil.ch 
Corresponding author.
1 Introduction
Explaining the evolution of helping (cooperation and altruism) has been a main focus of research in evolutionary biology over the last fifty years (e.g., Sachs et al., 2004; West et al., 2007). In this context, Hamilton’s seminal papers established the importance of relatedness (genetic assortment between individuals) by showing that an allele for helping can be favored by natural selection as long as $-c + rb > 0$ is satisfied, where $c$ is the fitness cost to an average carrier from expressing the allele, $b$ is the fitness benefit to such a carrier stemming from a social partner expressing the allele, and $r$ is the relatedness between social partners (Hamilton, 1964a, b, 1970). Additional factors, including different forms of reciprocity (i.e., conditional behaviors and responsiveness under multi-move interactions, e.g., Trivers, 1971; Axelrod and Hamilton, 1981) and synergy (i.e., non-additive effects of social behaviors on material payoffs, either positive or negative, e.g., Queller, 1985; Sumpter, 2010), modify the fitness costs and benefits in Hamilton’s rule (Axelrod and Hamilton, 1981; Day and Taylor, 1997; Lehmann and Keller, 2006a; Gardner et al., 2011; Van Cleve and Akçay, 2014) and hence fundamentally influence the evolutionary dynamics of helping.
Because of their ubiquity, relatedness and synergy occupy a central role among the factors affecting the selection pressure on helping. Both are clearly present in the cooperative enterprises of most organisms. First, real populations are characterized by limited gene flow at least until the stage of offspring dispersal (Clobert et al., 2001), with the consequence that most social interactions necessarily occur between relatives of varying degree. Second, social exchanges often feature at least one of two different forms of synergy, which we call in this article “synergies of kind” and “synergies of scale”.
Synergies of kind (implicit in what Queller, 2011 calls “kind selection”) arise when the expression of a social trait benefits recipients in different ways, depending on whether or not (or more generally, to what extent) recipients express the social trait themselves. A classical example of a positive synergy of kind is collective hunting (Packer and Ruttan, 1988), where the benefits of a successful hunt go to cooperators (hunters) but not to defectors (solitary individuals). Examples of negative synergies of kind are eusociality in Hymenoptera, by which sterile workers help queens to reproduce (Bourke and Franks, 1995), and self-destructive cooperation in bacteria, where expressers lyse while releasing virulence factors that benefit nonexpressers (Fröhlich and Madeo, 2000; Ackermann et al., 2008).
Synergies of scale (Corning, 2002) result from economies or diseconomies of scale in the production of a social good, so that the net effect of several individuals behaving socially can be more or less than the sum of individual effects. For instance, enzyme production in microbial cooperation is likely to be nonlinear, as in the cases of invertase hydrolyzing disaccharides into glucose in the budding yeast Saccharomyces cerevisiae (Gore et al., 2009) or virulence factors triggering gut inflammation (and hence removal of competitors) in the pathogen Salmonella typhimurium (Ackermann et al., 2008). In the former case, the relationship between growth rate and glucose concentration in yeast has been reported to be sublinear, i.e., invertase production has diminishing returns or negative synergies of scale (Gore et al., 2009, fig. 3.c); in the latter case, the relationship between the level of expression of virulence factors and inflammation intensity appears to be superlinear, i.e., it exhibits increasing returns or positive synergies of scale (Ackermann et al., 2008, fig. 2.d).
Previous theoretical work has investigated the effects of relatedness and synergy on the evolution of helping either independently of each other or by means of simplified models that neglect crucial interactions between the two factors. For instance, the effects of demography on relatedness and the scale of competition in family and spatially structured populations have often been explored under the assumption of additive payoff effects (e.g., Taylor, 1992; Taylor and Irwin, 2000; Lehmann et al., 2006; Gardner and West, 2006), while synergistic interactions have usually been investigated under the assumption that individuals are unrelated (e.g., Motro, 1991; Leimar and Tuomi, 1998; Hauert et al., 2006). In the cases where relatedness and synergy have been considered to operate in conjunction, it has been customary to model social interactions by means of a two-player Prisoner’s Dilemma, modified by adding a synergy parameter $D$ to the payoff of mutual cooperation (Grafen, 1979; Queller, 1984, 1985, 1992; Fletcher and Zwick, 2006; Lehmann and Keller, 2006a, b; Ohtsuki, 2010; Gardner et al., 2011; Ohtsuki, 2012; Taylor and Maciejewski, 2012; Van Cleve and Akçay, 2014). In this framework, $D > 0$ (positive synergy) implies positive frequency-dependent selection, while $D < 0$ (negative synergy) implies negative frequency-dependent selection. The value of relatedness matters only in determining whether or not positive synergy leads to bistability or, respectively, whether or not negative synergy leads to coexistence.
Although illuminating in some aspects, such two-player models cannot capture patterns of synergy and resulting frequency dependence where positive (resp. negative) synergies of kind and negative (resp. positive) synergies of scale do not combine additively. Such situations are, however, likely to be common in nature. For example, collective hunting often features both positive synergies of kind and negative synergies of scale (Packer and Ruttan, 1988), while the production of virulence factors in S. typhimurium features both negative synergies of kind and positive synergies of scale (Ackermann et al., 2008). Models of two-player matrix games between relatives miss these patterns of synergy (and possible interactions between relatedness and synergy) because such games are linear, and only nonlinear games (which necessarily involve at least three-party interactions) can accommodate both negative and positive synergies without conflating them into a single parameter. Although previous work has explored instances of $n$-player games between relatives (e.g., Boyd and Richerson, 1988; Eshel and Motro, 1988; Archetti, 2009; Van Cleve and Lehmann, 2013; Marshall, 2014), this has been done only for specific population or payoff structures, and hence not in a comprehensive manner.
In this article, we study the interplay between relatedness and synergies of kind and scale in models of $n$-player social interactions between relatives. In order to do so, we first present a general framework that integrates $n$-player matrix games (e.g., Kurokawa and Ihara, 2009; Gokhale and Traulsen, 2010) into the “direct fitness” approach (Taylor and Frank, 1996; Rousset, 2004) of social evolution theory. This framework allows us to deliver a tractable expression for the selection gradient (or gain function) determining the evolutionary dynamics, which differs from the corresponding expression for $n$-player games between unrelated individuals only in that “inclusive gains from switching” rather than solely “direct gains from switching” must be taken into account.
We then use the theoretical framework to investigate the interaction between relatedness, synergies of kind, and synergies of scale in the evolution of helping. We show the importance of distinguishing between three different kinds of helping traits (which we call “whole-group”, “nonexpresser-only” and “expresser-only”) that are characterized by different types of synergies of kind (none for “whole-group”, negative for “nonexpresser-only”, positive for “expresser-only”) and can be subjected to different synergies of scale. Our analysis demonstrates that the interplay between relatedness and synergy can lead to patterns of frequency dependence, evolutionary dynamics, and bifurcations that cannot arise when considering synergistic interactions between unrelated individuals. Thereby, our approach illustrates how relatedness and synergy combine nontrivially to affect the evolution of social behaviors.
2 Modeling framework
2.1 Population structure (demography)
We consider a homogeneous haploid population subdivided into a finite and constant number of groups, each with a constant number $N$ of adult individuals (see table 1 for a list of symbols). The following events occur cyclically and span a demographic time period. Each adult individual gives birth to offspring and then survives with a constant probability, so that individuals can be semelparous (die after reproduction) or iteroparous (survive for a number of demographic time periods). After reproduction, offspring dispersal occurs. Then, offspring in each group compete for breeding spots vacated by the death of adults so that exactly $N$ individuals reach adulthood in each group.
Dispersal between groups may follow a variety of schemes, including the island model of dispersal (Wright, 1931; Taylor, 1992), isolation by distance (Malécot, 1975; Rousset, 2004), hierarchical migration (Sawyer and Felsenstein, 1983; Lehmann and Rousset, 2012), a model where groups split into daughter groups and compete against each other (Gardner and West, 2006; Lehmann et al., 2006; Traulsen and Nowak, 2006), and several variants of the haystack model (e.g., Matessi and Jayakar, 1976; Godfrey-Smith and Kerr, 2009). We leave the exact details of the life history unspecified, but assume that they fall within the scope of models of spatially homogeneous populations with constant population size (see Rousset, 2004, ch. 6).
2.2 Social interactions (games and payoffs)
Each demographic time period, individuals interact socially by participating in a game between $n$ players. Interactions can occur among all adults in a group ($n = N$), among a subset of such individuals ($n < N$), or among offspring before dispersal. Individuals may either express a social behavior (e.g., cooperate in a Prisoner’s Dilemma) or not (e.g., defect in a Prisoner’s Dilemma). We denote these two possible actions by A (“cooperation”) and B (“defection”) and also refer to A-players as “expressers” and to B-players as “nonexpressers”. The game is symmetric so that, from the point of view of a focal individual, any two co-players playing the same action are exchangeable. We denote by $a_k$ the material payoff to an A-player when $k = 0, 1, \ldots, n-1$ co-players choose A (and hence $n-1-k$ co-players choose B). Likewise, we denote by $b_k$ the material payoff to a B-player when $k$ co-players choose A.
We assume that individuals implement mixed strategies, i.e., they play A with probability $z$ (and hence play B with probability $1-z$). The set of available strategies is then the interval $[0, 1]$. At any given time only two strategies are present in the population: residents who play A with probability $z$ and mutants who play A with probability $z + \delta$. Let us denote by $z_\bullet$ the strategy (either $z$ or $z + \delta$) of a focal individual, and by $z_{\ell(\bullet)}$ the strategy of the $\ell$-th co-player of such focal. The expected payoff $\pi$ to the focal is then
$$\pi\left(z_\bullet, z_{1(\bullet)}, \ldots, z_{n-1(\bullet)}\right) = \sum_{k=0}^{n-1} \phi_k\left(z_{1(\bullet)}, \ldots, z_{n-1(\bullet)}\right) \left[ z_\bullet a_k + (1 - z_\bullet) b_k \right], \tag{1}$$
where $\phi_k$ is the probability that exactly $k$ co-players play action A. A first-order Taylor-series expansion about the average strategy $z_\circ = \frac{1}{n-1} \sum_{\ell=1}^{n-1} z_{\ell(\bullet)}$ of co-players shows that, to first order in $\delta$, the probability $\phi_k$ is given by a binomial distribution with parameters $n-1$ and $z_\circ$, i.e.,
$$\phi_k = \binom{n-1}{k} z_\circ^k (1 - z_\circ)^{n-1-k} + O(\delta^2). \tag{2}$$
Substituting (2) into (1) and discarding second and higher order terms, we obtain
$$\pi(z_\bullet, z_\circ) = \sum_{k=0}^{n-1} \binom{n-1}{k} z_\circ^k (1 - z_\circ)^{n-1-k} \left[ z_\bullet a_k + (1 - z_\bullet) b_k \right] \tag{3}$$
for the payoff of a focal individual as a function of the focal’s strategy $z_\bullet$ and the average strategy $z_\circ$ of the focal’s co-players (see also Rousset, 2004, p. 95 and Van Cleve and Lehmann, 2013, p. 85).
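As an illustration of equation (3), the following minimal Python sketch evaluates the expected payoff to a focal individual. The function name `expected_payoff` and the list-based encoding of the payoffs (`a[k]` and `b[k]` for the payoffs to an A-player and a B-player facing `k` co-players who choose A) are assumptions made here for illustration and are not part of the original text.

```python
from math import comb

def expected_payoff(z_focal, z_co, a, b):
    """Expected payoff of eq. (3).

    z_focal: probability that the focal plays A.
    z_co:    average probability that a co-player plays A.
    a, b:    lists of length n; a[k] (b[k]) is the payoff to an A-player
             (B-player) when k of the n - 1 co-players choose A.
    """
    n = len(a)
    total = 0.0
    for k in range(n):
        # Binomial probability that exactly k of the n - 1 co-players play A.
        prob_k = comb(n - 1, k) * z_co**k * (1 - z_co)**(n - 1 - k)
        total += prob_k * (z_focal * a[k] + (1 - z_focal) * b[k])
    return total

# Hypothetical two-player example (n = 2) with arbitrary payoffs.
a = [-1.0, 2.0]
b = [0.0, 3.0]
print(expected_payoff(0.5, 0.5, a, b))
```

With `z_co = 0` the sum collapses to the single term `k = 0`, so a pure A-player obtains `a[0]`, as expected from the definition of the payoffs.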
2.3 Gain function and convergence stability
Consider a population of residents playing $z$ in which a single mutant $z + \delta$ appears due to mutation, and denote by $\rho$ the fixation probability of the mutant. We take the phenotypic selection gradient $S(z) = (\mathrm{d}\rho / \mathrm{d}\delta)_{\delta = 0}$ as a measure of evolutionary success of the mutant (Rousset and Billiard, 2000, p. 819; Van Cleve, 2014, p. 17); indeed, $S(z) > 0$ entails that the mutant has a fixation probability greater than neutral under weak selection ($|\delta| \ll 1$). In order to evaluate the fixation probability, we assume that each demographic time period the material payoff to an individual determines its own fecundity (number of offspring produced before competition) or that of its parent (if interactions occur among offspring) by letting the average fecundity of an adult relative to a baseline be equal to the average payoff of a focal actor (i.e., payoffs from the game have “fecundity effects” as opposed to “survival effects”, e.g., Taylor and Irwin, 2000). With this and our demographic assumptions, $S(z)$ is proportional to the “gain function” given by
$$G(z) = \underbrace{\left. \frac{\partial \pi(z_\bullet, z_\circ)}{\partial z_\bullet} \right|_{z_\bullet = z_\circ = z}}_{-C(z)} + \kappa \underbrace{\left. \frac{\partial \pi(z_\bullet, z_\circ)}{\partial z_\circ} \right|_{z_\bullet = z_\circ = z}}_{B(z)} \tag{4}$$
(see, e.g., Van Cleve and Lehmann, 2013, eq. 7).
Equation (4) shows that the gain function is determined by three components. First, the “direct” effect $-C(z)$, which describes the change in average payoff of the focal resulting from the focal infinitesimally changing its own strategy. Second, the “indirect” effect $B(z)$, which describes the change in average payoff of the focal resulting from the focal’s co-players changing their strategy infinitesimally. Third, the indirect effect is weighted by the “scaled relatedness coefficient” $\kappa$, which is a measure of relatedness between the focal individual and its neighbors, demographically scaled so as to capture the effects of local competition on selection (Queller, 1994; Lehmann and Rousset, 2010; Akçay and Van Cleve, 2012). We discuss these three components of the gain function in more detail in the following section.
Knowledge of equation (4) is sufficient to characterize convergent stable strategies (Eshel and Motro, 1981; Eshel, 1983; Taylor, 1989; Christiansen, 1991; Geritz et al., 1998; Rousset, 2004). In our context, candidate convergent stable strategies are either “singular points” (i.e., values $z^* \in (0, 1)$ such that $G(z^*) = 0$), or the two pure strategies $z = 1$ (always play A) and $z = 0$ (always play B). In particular, a singular point $z^*$ is convergent stable if $\left. \mathrm{d}G(z)/\mathrm{d}z \right|_{z = z^*} < 0$. Regarding the endpoints, $z = 0$ (resp. $z = 1$) is convergent stable if $G(0) < 0$ (resp. $G(1) > 0$). In this article we focus on convergence stability, and thus do not consider the possibility of disruptive selection, which can be ruled out by assuming that the evolutionary dynamics proceeds strictly through a sequence of mutant invasions and fixation events (e.g., the “substitution process”; Gillespie, 1991; the “trait substitution process”; Metz et al., 1996; or the “trait substitution sequence”; Champagnat et al., 2006).
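The stability conditions above can be checked numerically for any gain function. The sketch below is only an illustration under stated assumptions: the helper `stable_strategies` is hypothetical, it takes the gain function as a Python callable `G`, and it uses a downward sign change of `G` on a grid as a numerical proxy for the condition that the derivative of the gain function is negative at a singular point. Roots that do not produce a sign change between neighboring grid points are missed.

```python
def stable_strategies(G, grid=1000, tol=1e-12):
    """Return candidate convergence stable strategies of G on [0, 1]."""
    stable = []
    if G(0.0) < 0:      # z = 0 is stable if G(0) < 0
        stable.append(0.0)
    if G(1.0) > 0:      # z = 1 is stable if G(1) > 0
        stable.append(1.0)
    zs = [i / grid for i in range(grid + 1)]
    for lo, hi in zip(zs, zs[1:]):
        # G crossing zero from above indicates a stable singular point.
        if G(lo) > 0 >= G(hi):
            while hi - lo > tol:        # bisection on the bracketing interval
                mid = (lo + hi) / 2
                if G(mid) > 0:
                    lo = mid
                else:
                    hi = mid
            stable.append((lo + hi) / 2)
    return sorted(stable)

# Hypothetical linear gain functions used only as examples.
print(stable_strategies(lambda z: 0.3 - z))   # one stable interior point
print(stable_strategies(lambda z: z - 0.5))   # both endpoints stable
```

The first example corresponds to negative frequency dependence (a single stable interior point), the second to positive frequency dependence (both pure strategies stable).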
2.4 Inclusive gains from switching
From equation (4), the condition for a mutant to be favored by selection can be written as $-C(z) + \kappa B(z) > 0$. This can be understood as a scaled form of the marginal version of Hamilton’s rule (Lehmann and Rousset, 2010) with $C(z)$ corresponding to the marginal direct costs and $B(z)$ to the marginal indirect benefits of expressing an increased probability of playing action A. These marginal costs and benefits are not measured in terms of actual fitness (number of adult offspring, which are the units of measurement of $c$ and $b$ in Hamilton’s rule as given in the introduction, see e.g., Rousset, 2004, p. 113), but in terms of fecundity via payoffs in a game. The scaled relatedness coefficient $\kappa$ is also not equal to the regression definition of relatedness present in the standard Hamilton’s rule, except for special cases where competition is completely global (Queller, 1994).
The coefficient $\kappa$ is a function of demographic parameters such as migration rate, group size, and vital rates of individuals or groups, but is independent of the evolving trait $z$ (Van Cleve and Lehmann, 2013). For instance, in the island model with overlapping generations, $\kappa$ can be expressed in closed form in terms of demographic parameters that include the migration rate $m$ and the probability $s$ of surviving to the next generation (Taylor and Irwin, 2000, eq. A10; Akçay and Van Cleve, 2012, app. A2). In broad terms, we have (i) $\kappa > 0$ for population structures characterized by positive assortment and relatively global competition, (ii) $\kappa = 0$ for infinitely large panmictic populations or for viscous populations with local competition exactly compensating for increased assortment of strategies (Taylor, 1992), and (iii) $\kappa < 0$ for population structures characterized by negative assortment and/or very strong local competition (e.g., density-dependent competition occurs before dispersal). Scaled relatedness coefficients have been evaluated for many life cycle conditions (see table 1 of Lehmann and Rousset, 2010, table 1 of Van Cleve and Lehmann, 2013, and references therein; see also app. A for values of $\kappa$ under different variants of the haystack model).
In contrast to $\kappa$, which depends only on population structure, the other two components of the gain function are solely determined by the payoff structure of the social interaction. In the following, we show how $-C(z)$ and $B(z)$ can be expressed in terms of the payoffs $a_k$ and $b_k$ of the game. Doing so delivers an expression for $G(z)$ that can be analyzed with the same techniques applicable for games between unrelated individuals. This expression provides the foundation for our subsequent analysis.
Imagine a focal individual playing B in a group where $k$ of its co-players play A. Suppose that this focal individual unilaterally switches its action to A while its co-players hold fixed their actions, thus changing its payoff from $b_k$ to $a_k$. As a consequence, the focal experiences a “direct gain from switching” given by
$$d_k = a_k - b_k. \tag{5}$$
At the same time, each of the focal’s co-players playing A experiences a change in payoff given by $\Delta a_{k-1} = a_k - a_{k-1}$ and each of the focal’s co-players playing B experiences a change in payoff given by $\Delta b_k = b_{k+1} - b_k$. Hence, taken as a block, the co-players of the focal experience a change in payoff given by
$$e_k = k \, \Delta a_{k-1} + (n-1-k) \, \Delta b_k, \tag{6}$$
where we let $a_{-1} = b_n = 0$ for mathematical convenience. From the perspective of the focal, this change in payoffs represents an “indirect gain from switching” the focal obtains if co-players are related.
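In appendix B, we show that the partial derivatives appearing in (4) can be expressed as expected values of the direct and indirect gains from switching, so that the direct and indirect effects are respectively given by
$$-C(z) = \sum_{k=0}^{n-1} \binom{n-1}{k} z^k (1-z)^{n-1-k} d_k \tag{7}$$
and
$$B(z) = \sum_{k=0}^{n-1} \binom{n-1}{k} z^k (1-z)^{n-1-k} e_k. \tag{8}$$
Hence, defining the “inclusive gains from switching” as
$$f_k = d_k + \kappa e_k, \tag{9}$$
the gain function can be written as the expected value of the inclusive gains from switching:
$$G(z) = \sum_{k=0}^{n-1} \binom{n-1}{k} z^k (1-z)^{n-1-k} f_k. \tag{10}$$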
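Equations (5)–(10) translate directly into a short computation. The following Python sketch is an illustration only: the function names `inclusive_gains` and `gain_function`, the argument `kappa` for the scaled relatedness coefficient, and the list encoding of the payoffs (`a[k]`, `b[k]` for `k = 0, ..., n-1` co-players choosing A) are assumptions introduced here.

```python
from math import comb

def inclusive_gains(a, b, kappa):
    """Inclusive gains from switching f_k = d_k + kappa * e_k, eqs. (5), (6), (9)."""
    n = len(a)
    f = []
    for k in range(n):
        d_k = a[k] - b[k]                                   # eq. (5)
        # The out-of-range payoffs a_{-1} and b_n are always multiplied by
        # zero in eq. (6), so the corresponding increments are set to zero.
        delta_a = a[k] - a[k - 1] if k > 0 else 0.0
        delta_b = b[k + 1] - b[k] if k < n - 1 else 0.0
        e_k = k * delta_a + (n - 1 - k) * delta_b           # eq. (6)
        f.append(d_k + kappa * e_k)                         # eq. (9)
    return f

def gain_function(z, a, b, kappa):
    """Gain function of eq. (10): binomial expectation of f_k."""
    n = len(a)
    f = inclusive_gains(a, b, kappa)
    return sum(comb(n - 1, k) * z**k * (1 - z)**(n - 1 - k) * f[k]
               for k in range(n))

# Hypothetical three-player example in which every expresser adds a benefit
# of 2 to all group members at a cost of 1 to itself.
a = [1.0, 3.0, 5.0]
b = [0.0, 2.0, 4.0]
print(inclusive_gains(a, b, kappa=0.5))
print(gain_function(0.37, a, b, kappa=0.5))
```

In this example the direct gains are all equal to 1 and the indirect gains are all equal to 4, so the inclusive gains are constant and the gain function does not depend on `z`.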
An immediate consequence of equation (10) is that matrix games between relatives are mathematically equivalent to “transformed” games between unrelated individuals, where “inclusive payoffs” take the place of standard, or personal, payoffs. Indeed, consider a game in which a focal playing A (resp. B) obtains payoffs
$$a'_k = a_k + \kappa \left[ k a_k + (n-1-k) b_{k+1} \right], \tag{11}$$
$$b'_k = b_k + \kappa \left[ k a_{k-1} + (n-1-k) b_k \right], \tag{12}$$
when $k$ of its co-players play A. Using equations (5)–(6) we can rewrite equation (9) as $f_k = a'_k - b'_k$, so that the inclusive gains from switching are identical to the direct gains from switching in a game with payoff structure given by equations (11)–(12). The payoffs $a'_k$ (resp. $b'_k$) can be understood as inclusive payoffs consisting of the payoff obtained by a focal playing A (resp. B) plus $\kappa$ times the sum of the payoffs obtained by its co-players.
This observation has two relevant consequences. First, the results developed in Peña et al. (2014) for nonlinear $n$-player matrix games between unrelated individuals, which are based on the observation that the right side of (10) is a polynomial in Bernstein form (Farouki, 2012), also apply here, provided that (i) the inclusive gains from switching $f_k$ are used instead of the standard (direct) gains from switching $d_k$ in the formula for the gain function, and (ii) the concept of evolutionary stability is read as meaning convergence stability. For a large class of games, these results allow one to identify convergence stable points from a direct inspection of the sign pattern of the inclusive gains from switching $f_k$. Second, we may interpret the effect of relatedness on selection as inducing the payoff transformation $a_k \to a'_k$, $b_k \to b'_k$. For $n = 2$, this payoff transformation is the one hinted at by Hamilton (1971) and later often discussed in the theoretical literature (Grafen, 1979; Hines and Maynard Smith, 1979; Day and Taylor, 1998), namely
$$\begin{pmatrix} a_1 & a_0 \\ b_1 & b_0 \end{pmatrix} \to \begin{pmatrix} (1+\kappa) a_1 & a_0 + \kappa b_1 \\ b_1 + \kappa a_0 & (1+\kappa) b_0 \end{pmatrix},$$
where the payoff of the focal is augmented by adding $\kappa$ times the payoff of the co-player.
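The payoff transformation of equations (11)–(12) can be sketched in a few lines of Python. The function name `inclusive_payoffs` and the list encoding of the payoffs are assumptions made for illustration; the out-of-range payoffs are set to zero, which is harmless because they are always multiplied by zero.

```python
def inclusive_payoffs(a, b, kappa):
    """Transformed payoffs of eqs. (11)-(12).

    a[k], b[k]: payoffs to an A-player and a B-player when k of the n - 1
    co-players choose A; kappa: scaled relatedness coefficient.
    """
    n = len(a)
    a_incl, b_incl = [], []
    for k in range(n):
        a_prev = a[k - 1] if k > 0 else 0.0       # multiplied by k = 0
        b_next = b[k + 1] if k < n - 1 else 0.0   # multiplied by n - 1 - k = 0
        a_incl.append(a[k] + kappa * (k * a[k] + (n - 1 - k) * b_next))
        b_incl.append(b[k] + kappa * (k * a_prev + (n - 1 - k) * b[k]))
    return a_incl, b_incl

# Hypothetical two-player game with arbitrary payoffs.
a = [-1.0, 2.0]   # a_0, a_1
b = [0.0, 3.0]    # b_0, b_1
a_incl, b_incl = inclusive_payoffs(a, b, kappa=0.5)
print(a_incl, b_incl)
```

For two players this gives the inclusive payoffs $(1+\kappa) a_1$, $a_0 + \kappa b_1$, $b_1 + \kappa a_0$ and $(1+\kappa) b_0$, in agreement with the two-player transformation described in the text.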
3 Evolutionary dynamics of three kinds of helping traits
Throughout the following we assume that each A-player incurs a payoff cost $\gamma > 0$ in order for a social good to be produced (e.g., harvested food, nest defense, or help directed to others). The benefits of the social good are accrued by a subset of individuals in the group that we call “recipients”. Each recipient obtains a benefit $\beta_j \geq 0$ when there are $j > 0$ expressers in the group, and no benefit is produced if no individual expresses the social trait ($\beta_0 = 0$). The benefit is increasing in the number of expressers, that is, the “incremental benefit” $\Delta\beta_j = \beta_{j+1} - \beta_j$ is positive ($\Delta\beta_j > 0$).
Synergies of scale are characterized by the properties of the incremental benefits. In the absence of synergies of scale, each additional expresser increases the benefit by the same amount so that $\Delta\beta_j$ is constant, implying that $\beta_j$ is linear in $j$. With negative synergies of scale, $\Delta\beta_j$ is decreasing in $j$, whereas positive synergies of scale arise when $\Delta\beta_j$ is increasing in $j$. To illustrate the effects of synergies of scale on the evolutionary dynamics of the social trait, we will consider the special case in which incremental benefits are given by the geometric sequence $\Delta\beta_j = \beta \lambda^j$ for some $\beta > 0$ and $\lambda > 0$, so that benefits are given by
$$\beta_j = \beta \sum_{\ell=0}^{j-1} \lambda^\ell. \tag{13}$$
With geometric benefits, synergies of scale are absent when $\lambda = 1$, negative when $\lambda < 1$, and positive when $\lambda > 1$.
We distinguish three kinds of social traits according to which individuals are recipients and thus benefit from the expression of the social behavior: (i) “whole-group” (benefits accrue to all individuals in the group, fig. 1.a), (ii) “nonexpresser-only” (benefits accrue only to nonexpressers, fig. 1.b), and (iii) “expresser-only” (benefits accrue only to expressers, fig. 1.c). For whole-group traits there are no synergies of kind: benefits accrue to all individuals irrespective of their kind, i.e., whether they are expressers or nonexpressers. In contrast, nonexpresser-only traits feature negative synergies of kind, whereas expresser-only traits feature positive synergies of kind. These differences are reflected in different payoff structures for the corresponding $n$-player games, resulting in different direct, indirect, and inclusive gains from switching (see table 2).
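The payoff structures of the three kinds of traits can be written down from their verbal definitions. The sketch below is an assumption-laden illustration, not a reproduction of table 2: it assumes geometric benefits, a cost `gamma` paid by every expresser, and that a recipient in a group with `j` expressers receives the benefit for `j` expressers. The function names, the parameter names `beta` and `lam` (scale and ratio of the geometric incremental benefits), and the trait labels used as strings are hypothetical.

```python
def geometric_benefits(n, beta, lam):
    """Benefits for j = 0, ..., n expressers under geometric incremental benefits."""
    return [beta * sum(lam**l for l in range(j)) for j in range(n + 1)]

def trait_payoffs(trait, n, gamma, beta, lam):
    """Payoffs a[k], b[k] to an A-player and a B-player with k co-players choosing A."""
    ben = geometric_benefits(n, beta, lam)
    if trait == "whole-group":
        # Everybody receives the benefit; an expresser counts itself.
        a = [-gamma + ben[k + 1] for k in range(n)]
        b = [ben[k] for k in range(n)]
    elif trait == "nonexpresser-only":
        # Only nonexpressers receive the benefit.
        a = [-gamma for _ in range(n)]
        b = [ben[k] for k in range(n)]
    elif trait == "expresser-only":
        # Only expressers receive the benefit.
        a = [-gamma + ben[k + 1] for k in range(n)]
        b = [0.0 for _ in range(n)]
    else:
        raise ValueError(f"unknown trait: {trait}")
    return a, b

# Hypothetical parameter values; lam = 1 means no synergies of scale.
print(geometric_benefits(3, beta=2.0, lam=1.0))
print(trait_payoffs("whole-group", 3, gamma=1.0, beta=2.0, lam=1.0))
```

Setting `lam` below or above 1 produces diminishing or increasing incremental benefits, respectively, without changing which individuals count as recipients.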
A classical example of a whole-group trait is the voluntary provision of public goods (Samuelson, 1954). In this case, the expressed social behavior consists in the production of a good available to others and hence exploitable by nonproducing cheats (nonexpressers). Well-known instances of public-goods cooperation are sentinel behavior in animals (Maynard Smith, 1965; Clutton-Brock et al., 1999), and the secretion of extracellular products (Velicer, 2003; West et al., 2007), such as sucrose-digestive enzymes (Greig and Travisano, 2004; Gore et al., 2009), in social bacteria.
The most prominent social behavior matching our definition of a nonexpresser-only trait is altruistic self-sacrifice, which happens when individuals expressing the social behavior sacrifice themselves (or their reproduction) to benefit nonexpressers (Frank, 2006; West et al., 2006). Sterile castes in eusocial insects (Bourke and Franks, 1995), and bacteria lysing while releasing toxins (Fröhlich and Madeo, 2000) or virulence factors (Ackermann et al., 2008) that benefit other bacteria provide some examples of altruistic self-sacrifice in nature.
Expresser-only traits have been discussed under the rubrics of “synergistic” (Queller, 1984, 1985; Leimar and Tuomi, 1998) and “greenbeard” (Guilford, 1985; Gardner and West, 2010; Queller, 2011) effects, and conceptualized as involving “rowing” (Maynard Smith and Szathmáry, 1995, p. 261–262) or “stag hunt” (Skyrms, 2004) games. Often cited examples include collective hunting (Packer and Ruttan, 1988), foundresses cooperating in colony establishment (Bernasconi and Strassmann, 1999), aposematic (warning) coloration (Queller, 1984, 1985; Guilford, 1988), and the Ti plasmid in the bacterial pathogen Agrobacterium tumefaciens, which induces its plant host to produce opines, a food source that can be exploited only by bacteria bearing the plasmid (Dawkins, 1999, p. 218; White and Winans, 2007). In each of these examples, the social good accrues only to partners expressing the trait, either because of a greater tendency to group and interact or because of the action of an emergent recognition system discriminating expressers from nonexpressers.
For all three kinds of social traits, the indirect gains from switching are always nonnegative ($e_k \geq 0$ for all $k$) and hence the indirect effect $B(z)$ is nonnegative for all $z$. This implies that we deal with helping traits at the level of payoffs and that increasing $\kappa$ never leads to less selection for expressing the social behavior. Due to their different ways of defining recipients, however, each social trait is characterized by a social dilemma with structurally different payoffs, direct gains, and indirect gains from switching. For nonexpresser-only traits, the direct gains from switching are always negative ($d_k < 0$ for all $k$) and thus expressing the social behavior is also payoff altruistic ($-C(z) < 0$ and $B(z) \geq 0$ for all $z$). For whole-group and expresser-only traits, expressing the social behavior is not necessarily altruistic, depending on how the cost $\gamma$ compares to benefits ($\beta_k$, expresser-only traits) or incremental benefits ($\Delta\beta_k$, whole-group traits).
Before turning to the analysis, we note that a fourth class of social traits is sometimes also distinguished in the literature, namely “other-only” traits where the benefits accrue to all other individuals in the group, but not to the focal expresser itself (Pepper, 2000). Other-only traits, like whole-group traits, lack synergies of kind, and hence the effects of relatedness on the evolutionary dynamics are qualitatively similar for whole-group and other-only traits. We discuss this latter case in more detail in section 3.5 and relegate the formal analysis, which is similar to the one for whole-group traits, to appendix D.
3.1 No synergies of scale
To isolate the effects of synergies of kind, we begin our analysis with the case in which synergies of scale are absent, that is, benefits take the linear form $\beta_j = \beta j$ ($\lambda = 1$ in eq. (13)). The resulting expressions for the inclusive gains from switching and the gain functions for the three different social traits are shown in table 3. In each case, the gain function can be written as
$$G(z) = (n-1) \left[ -\mathcal{C} + \kappa \mathcal{B} + (1+\kappa) \mathcal{D} z \right],$$
where the parameter $\mathcal{C}$ may be thought of as the “effective cost” per co-player of expressing the social trait when none of the co-players expresses the social trait. We have $\mathcal{C} = \gamma/(n-1)$ when a focal expresser is not among the recipients (nonexpresser-only traits) and $\mathcal{C} = (\gamma - \beta)/(n-1)$ otherwise (whole-group and expresser-only traits). The parameter $\mathcal{B}$ measures the incremental benefit accruing to each co-player of a focal expresser when none of the co-players expresses the social trait. We thus have $\mathcal{B} = 0$ for expresser-only traits and $\mathcal{B} = \beta$ otherwise. Finally, $\mathcal{D}$ measures synergies of kind and is thus null for whole-group traits ($\mathcal{D} = 0$), negative for nonexpresser-only traits ($\mathcal{D} = -\beta$) and positive for expresser-only traits ($\mathcal{D} = \beta$).
In the absence of synergies of kind ($\mathcal{D} = 0$, whole-group traits) selection is frequency independent and defection dominates cooperation ($z = 0$ is the only convergence stable strategy) if $-\mathcal{C} + \kappa \mathcal{B} < 0$ holds, whereas cooperation dominates defection ($z = 1$ is the only convergence stable strategy) if $-\mathcal{C} + \kappa \mathcal{B} > 0$ holds.
With negative synergies of kind ($\mathcal{D} < 0$), there is negative frequency-dependent selection. Defection dominates cooperation if $-\mathcal{C} + \kappa \mathcal{B} \leq 0$ holds, whereas cooperation dominates defection if $-\mathcal{C} + \kappa \mathcal{B} + (1+\kappa) \mathcal{D} \geq 0$ holds. If $-\mathcal{C} + \kappa \mathcal{B} + (1+\kappa) \mathcal{D} < 0 < -\mathcal{C} + \kappa \mathcal{B}$ holds, both $z = 0$ and $z = 1$ are unstable and the singular point
$$z^* = \frac{\mathcal{C} - \kappa \mathcal{B}}{(1+\kappa) \mathcal{D}} \tag{14}$$
is stable.
With positive synergies of kind ($\mathcal{D} > 0$), there is positive frequency-dependent selection. Defection dominates cooperation if $-\mathcal{C} + \kappa \mathcal{B} + (1+\kappa) \mathcal{D} \leq 0$ holds, whereas cooperation dominates defection if $-\mathcal{C} + \kappa \mathcal{B} \geq 0$ holds. When $-\mathcal{C} + \kappa \mathcal{B} < 0 < -\mathcal{C} + \kappa \mathcal{B} + (1+\kappa) \mathcal{D}$, there is bistability: both $z = 0$ and $z = 1$ are stable and $z^*$ is unstable.
This analysis reveals three important points. First, in the absence of synergies of scale the gain function is linear in $z$, which allows for a straightforward analysis of the evolutionary dynamics for all three kinds of social traits. Second, because of the linearity of the gain function, the evolutionary dynamics of such games fall into one of the four classical dynamical regimes arising from $2 \times 2$ games, namely (i) A dominates B, (ii) B dominates A, (iii) coexistence, and (iv) bistability (see, e.g., Cressman, 2003, section 2.2). Third, which of these dynamical regimes arises is determined by the interaction of relatedness with synergies of kind in a straightforward fashion. For all traits, defection dominates cooperation when relatedness is low. For whole-group traits, high values of relatedness imply that cooperation dominates defection. For nonexpresser-only and expresser-only traits, high relatedness also promotes cooperation, leading to either the coexistence of expressers and nonexpressers (nonexpresser-only traits) or to bistability (expresser-only traits).
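The case distinction for linear benefits can be sketched as a small classifier. The code below is an illustration under assumptions: it takes the gain function to be linear in the probability of playing A, with intercept and slope built from an effective cost, an incremental benefit, and a synergy-of-kind term as described in the text. The function name `linear_regime`, the trait labels, the variable names, and the "neutral" label for the degenerate case of an identically zero gain function are all hypothetical.

```python
def linear_regime(trait, n, gamma, beta, kappa):
    """Classify the dynamics for linear benefits; returns (regime, z_star or None)."""
    if trait == "whole-group":
        eff_cost, incr_benefit, synergy = (gamma - beta) / (n - 1), beta, 0.0
    elif trait == "nonexpresser-only":
        eff_cost, incr_benefit, synergy = gamma / (n - 1), beta, -beta
    elif trait == "expresser-only":
        eff_cost, incr_benefit, synergy = (gamma - beta) / (n - 1), 0.0, beta
    else:
        raise ValueError(f"unknown trait: {trait}")
    g0 = -eff_cost + kappa * incr_benefit      # has the sign of the gain at z = 0
    g1 = g0 + (1 + kappa) * synergy            # has the sign of the gain at z = 1
    if g0 > 0 > g1:
        return "coexistence", -g0 / ((1 + kappa) * synergy)
    if g0 < 0 < g1:
        return "bistability", -g0 / ((1 + kappa) * synergy)
    if g0 == 0 and g1 == 0:
        return "neutral", None
    if g0 >= 0 and g1 >= 0:
        return "cooperation dominates", None
    return "defection dominates", None

# Hypothetical parameter values.
print(linear_regime("nonexpresser-only", 5, gamma=1.0, beta=1.0, kappa=0.5))
print(linear_regime("expresser-only", 5, gamma=3.0, beta=1.0, kappa=0.5))
print(linear_regime("whole-group", 5, gamma=2.0, beta=1.0, kappa=0.0))
```

In the first two examples the interior point lies at 1/6 and 1/3, respectively; raising `kappa` in the third example from 0 to 0.5 switches the outcome from defection dominating to cooperation dominating.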
3.2 Wholegroup traits with synergies of scale
For whole-group traits there are no synergies of kind, but either positive or negative synergies of scale may arise. How do such synergies of scale change the evolutionary dynamics of whole-group helping? Substituting the inclusive gains from switching given in table 2 into equation (10) shows that the gain function for whole-group traits is given by
$$G(z) = -\gamma + \left[ 1 + \kappa(n-1) \right] \sum_{k=0}^{n-1} \binom{n-1}{k} z^k (1-z)^{n-1-k} \Delta\beta_k. \tag{15}$$
Since the incremental benefit satisfies $\Delta\beta_k > 0$ for all $k$, the gain function (15) is negative for $\kappa \leq -1/(n-1)$. In this case, defection dominates cooperation and $z = 0$ is the only stable point. Hence, we consider the case $\kappa > -1/(n-1)$ throughout the following.
If synergies of scale are negative ($\Delta\beta_k$ decreasing in $k$), the direct gains ($d_k$), indirect gains ($e_k$), and inclusive gains ($f_k$) from switching are all decreasing in $k$. This implies that $-C(z)$, $B(z)$ and $G(z)$ are all decreasing in $z$ (cf. Peña et al., 2014, remark 3). Similarly, if synergies of scale are positive ($\Delta\beta_k$ increasing in $k$), $d_k$, $e_k$ and $f_k$ are all increasing in $k$ and hence $-C(z)$, $B(z)$ and $G(z)$ are all increasing in $z$. In both cases the evolutionary dynamics are easily characterized by applying the results for public goods games with constant costs from Peña et al. (2014, section 4.3): with negative synergies of scale, defection dominates cooperation (so that $z = 0$ is the only convergent stable strategy) if $\gamma \geq [1 + \kappa(n-1)] \Delta\beta_0$, whereas cooperation dominates defection if $\gamma \leq [1 + \kappa(n-1)] \Delta\beta_{n-1}$ holds. If $[1 + \kappa(n-1)] \Delta\beta_{n-1} < \gamma < [1 + \kappa(n-1)] \Delta\beta_0$ holds, there is coexistence: both $z = 0$ and $z = 1$ are unstable and there is a unique stable interior point $z^*$. With positive synergies of scale, defection dominates cooperation if $\gamma \geq [1 + \kappa(n-1)] \Delta\beta_{n-1}$, whereas cooperation dominates defection if $\gamma \leq [1 + \kappa(n-1)] \Delta\beta_0$. If $[1 + \kappa(n-1)] \Delta\beta_0 < \gamma < [1 + \kappa(n-1)] \Delta\beta_{n-1}$ holds, there is bistability: both $z = 0$ and $z = 1$ are stable and there is a unique, unstable interior point $z^*$ separating the basins of attraction of these two stable strategies. These results resemble those for the cases in which there are no synergies of scale (section 3.1) but negative (resp. positive) synergies of kind are present. In particular, it is again the case that the evolutionary dynamics fall into one of the four classical dynamical regimes arising from $2 \times 2$ games.
The effect of relatedness on the evolution of whole-group traits can be better grasped by noting that, multiplying and dividing (15) by $1 + \kappa(n-1)$, we obtain
$$G(z) = \left[1 + \kappa(n-1)\right] \left[ -\tilde{\gamma} + \sum_{k=0}^{n-1} \binom{n-1}{k} z^k (1-z)^{n-1-k} \Delta\beta_k \right], \tag{16}$$
where $\tilde{\gamma} = \gamma / [1 + \kappa(n-1)]$. Equation (16) is (up to multiplication by a positive constant) equivalent to the gain function of a public goods game between unrelated individuals with payoff cost $\tilde{\gamma}$ for producing the public good, which has been analyzed under different assumptions on the shape of the benefit sequence (Motro, 1991; Bach et al., 2006; Hauert et al., 2006; Peña et al., 2014). Hence, relatedness can be conceptualized as affecting only the cost of cooperation, while leaving synergies of scale and patterns of frequency dependence unchanged.
As a concrete example, consider the case of geometric benefits (13) with $\lambda \neq 1$ (see table 4 for a summary of the results and app. C for a derivation). We find that there are two critical cost-to-benefit ratios
$$\varepsilon = \min\left\{1 + \kappa(n-1),\; \lambda^{n-1}\left[1 + \kappa(n-1)\right]\right\}, \qquad \vartheta = \max\left\{1 + \kappa(n-1),\; \lambda^{n-1}\left[1 + \kappa(n-1)\right]\right\}, \tag{17}$$
such that for small costs ($\gamma/\beta \le \varepsilon$) cooperation dominates defection ($z = 1$ is the only stable point) and for large costs ($\gamma/\beta \ge \vartheta$) defection dominates cooperation ($z = 0$ is the only stable point). For intermediate costs ($\varepsilon < \gamma/\beta < \vartheta$), there is a singular point given by
$$z^* = \frac{1}{1-\lambda} \left[ 1 - \left( \frac{\gamma}{\beta\left[1 + \kappa(n-1)\right]} \right)^{\frac{1}{n-1}} \right], \tag{18}$$
such that the evolutionary dynamics are characterized by coexistence if synergies of scale are negative ($\lambda < 1$) and by bistability if synergies of scale are positive ($\lambda > 1$). It is clear from equation (17) that, for a given cost-to-benefit ratio $\gamma/\beta$, increasing relatedness enlarges (resp. shrinks) the region of the parameter space where cooperation (resp. defection) dominates. Moreover, from equation (18), $z^*$ is an increasing (resp. decreasing) function of $\kappa$ when $\lambda < 1$ (resp. $\lambda > 1$), meaning that the proportion of individuals cooperating at a stable interior point (resp. the size of the basin of attraction of the fully cooperative equilibrium) increases as a function of $\kappa$ (see fig. 2.a and 2.d).
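The closed-form interior point can be checked numerically. The following is a minimal sketch, assuming geometric incremental benefits $\Delta\beta_k = \beta\lambda^k$ and inclusive gains $f_k = -\gamma + [1+\kappa(n-1)]\beta\lambda^k$ for whole-group traits. The function names and the parameter values are illustrative choices, not taken from the paper.

```python
from math import comb

def gain_whole_group(z, n, kappa, gamma, beta, lam):
    """Gain function in Bernstein form, assuming
    f_k = -gamma + [1 + kappa*(n-1)] * beta * lam**k (whole-group trait)."""
    scale = 1 + kappa * (n - 1)
    total = 0.0
    for k in range(n):
        f_k = -gamma + scale * beta * lam**k
        total += comb(n - 1, k) * z**k * (1 - z) ** (n - 1 - k) * f_k
    return total

def interior_point(n, kappa, gamma, beta, lam):
    """Closed-form root of the gain function.
    Assumes lam != 1 and an intermediate cost-to-benefit ratio."""
    ratio = gamma / (beta * (1 + kappa * (n - 1)))
    return (1 - ratio ** (1 / (n - 1))) / (1 - lam)

# Illustrative (hypothetical) parameter values with negative synergies of scale.
n, gamma, beta, lam = 5, 1.0, 1.0, 0.5
for kappa in (0.25, 0.5):
    z_star = interior_point(n, kappa, gamma, beta, lam)
    residual = gain_whole_group(z_star, n, kappa, gamma, beta, lam)
    print(f"kappa={kappa}: z*={z_star:.4f}, G(z*)={residual:.2e}")
```

With $\lambda < 1$ the gain function is positive at $z = 0$ and negative at $z = 1$ for these values, so the computed point is the stable interior point, and it moves to the right as scaled relatedness increases.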
3.3 Nonexpresser-only traits with synergies of scale
For nonexpresser-only traits, synergies of kind are negative. In the absence of synergies of scale, and as discussed in section 3.1, this implies negative frequency dependence. To investigate how positive or negative synergies of scale change this baseline scenario, we focus on the case in which relatedness is nonnegative ($\kappa \ge 0$).
From the formulas for $d_k$ and $e_k$ given in table 2, it is clear that, independently of any synergies of scale, the direct gains from switching $d_k$ are decreasing in $k$. Hence, the direct effect $-C(z)$ is negative frequency-dependent. When synergies of scale are negative, the indirect gains from switching $e_k$ are also decreasing in $k$, implying that the indirect effect $B(z)$ is also negative frequency-dependent and that the same is true for the gain function $G(z)$. Hence, negative synergies of scale lead to evolutionary dynamics that are qualitatively identical to those arising when synergies of scale are absent: for low relatedness, defection dominates cooperation, and for sufficiently high relatedness, a unique interior stable equilibrium appears (see app. E.1 and fig. 2.b).
When synergies of scale are positive, the indirect gains from switching may still be decreasing in because the incremental gain accrues to a smaller number of recipients () as increases. In such a scenario, always applicable when , the evolutionary dynamics are again qualitatively identical to those arising when synergies of scale are absent. A different picture can emerge if holds and synergies of scale are not only positive, but also sufficiently strong. Then, the indirect gains from switching may be unimodal (first increasing, then decreasing) in , implying (Peña et al., 2014) that the indirect benefit is similarly unimodal, featuring positive frequency dependence for small and negative frequency dependence for large . Depending on the value of relatedness, which modulates how the frequency dependence of interacts with that of , this can give rise to evolutionary dynamics different from those possible without synergies of scale, discussed in section 3.1.
For a concrete example of such evolutionary dynamics, consider the case of geometric benefits (13) with (see table 4 for a summary of results, app. E.2 for their derivation and fig. 2.e for an illustration). In this case, the evolutionary dynamics for and depend on the critical value
(19) 
and on the two critical costtobenefit ratios
(20) 
which satisfy and .
With these definitions our results can be stated as follows. For the dynamical outcome depends on how the costtobenefit ratio compares to . If (high costs), defection dominates cooperation, while if (low costs), there is coexistence. For , the dynamical outcome also depends on how the costtobenefit ratio compares to . If (high costs), defection dominates cooperation. If (low costs), we have coexistence, with the stable singular point satisfying where
(21) 
In the remaining case (, intermediate costs) the dynamics are characterized by bistable coexistence, with stable, unstable, and two singular points (unstable) and (stable) satisfying . Numerical values for (resp. ) can be obtained by searching for roots of in the interval (resp. ), as we illustrate in figure 2.e.
It is evident from the dependence of , , and on that relatedness plays an important role in determining the stable level(s) of expression of helping. As increases, the regions of the parameter space where some nonzero level of expression of helping is stable expand at the expense of the region of dominant nonexpression. This is so because and are increasing functions of and is a decreasing function of . Moreover, inside these regions the stable nonzero probability of expressing helping increases with (see fig. 2.b and 2.e). Three cases can be however distinguished as for the effects of increasing when starting from a point in the parameter space where is the only stable point. First, can remain stable irrespective of the value of relatedness, which characterizes high costtobenefit ratios. Second, the system can undergo a transcritical bifurcation as increases, destabilizing and leading to the appearance of a unique stable interior point (fig. 2.b). This happens when and are relatively small. Third, there is a range of intermediate costtobenefit ratios such that, for sufficiently large values of , the system undergoes a saddlenode bifurcation, whereby two singular points (, unstable, and , stable) appear (fig. 2.e). In this latter case, positive synergies of scale are strong enough to interact with negative synergies of kind and relatedness in a nontrivial way.
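The bistable coexistence regime can be explored numerically. The sketch below assumes nonexpresser-only payoffs with inclusive gains $f_k = -\gamma - \beta_k + \kappa(n-1-k)\Delta\beta_k$ and geometric benefits $\Delta\beta_k = \beta\lambda^k$. It locates interior singular points by bracketing sign changes on a grid and bisecting. The parameter values are hypothetical and chosen only so that two interior roots exist.

```python
from math import comb

def gain_nonexpresser_only(z, n, kappa, gamma, beta, lam):
    """Gain function in Bernstein form, assuming
    f_k = -gamma - beta_k + kappa*(n-1-k)*Delta beta_k with Delta beta_k = beta*lam**k."""
    total = 0.0
    benefit = 0.0  # beta_k = beta * (1 + lam + ... + lam**(k-1)), with beta_0 = 0
    for k in range(n):
        increment = beta * lam**k  # Delta beta_k
        f_k = -gamma - benefit + kappa * (n - 1 - k) * increment
        total += comb(n - 1, k) * z**k * (1 - z) ** (n - 1 - k) * f_k
        benefit += increment
    return total

def interior_roots(g, grid=2000, tol=1e-12):
    """Bracket sign changes of g on a uniform grid over [0, 1], then bisect."""
    roots = []
    xs = [i / grid for i in range(grid + 1)]
    for lo, hi in zip(xs, xs[1:]):
        g_lo = g(lo)
        if g_lo * g(hi) < 0:
            while hi - lo > tol:
                mid = (lo + hi) / 2
                g_mid = g(mid)
                if g_lo * g_mid <= 0:
                    hi = mid
                else:
                    lo, g_lo = mid, g_mid
            roots.append((lo + hi) / 2)
    return roots

# Hypothetical values: strong positive synergies of scale and high relatedness.
params = dict(n=10, kappa=0.9, gamma=20.0, beta=1.0, lam=2.0)
roots = interior_roots(lambda z: gain_nonexpresser_only(z, **params))
print(roots)  # the smaller root is unstable, the larger one stable
```

For these values the gain function is negative at both endpoints and positive in between, so pure nonexpression is stable, full expression is unstable, and the two interior roots play the roles of the unstable and the stable singular point.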
3.4 Expresser-only traits with synergies of scale
For expresser-only traits, and independently of any synergies of scale, the direct gains from switching $d_k$ (cf. table 2) are increasing in $k$, implying that the direct effect $-C(z)$ is positive frequency-dependent. When synergies of scale are positive, the indirect gains from switching $e_k$ are also increasing in $k$, so that the indirect effect $B(z)$ is also positive frequency-dependent. Focusing on the case of nonnegative relatedness ($\kappa \ge 0$), this ensures that, just as when synergies of scale are absent, the gain function $G(z)$ is positive frequency-dependent. Hence, the evolutionary dynamics are qualitatively identical to those arising from linear benefits: for low relatedness, defection dominates cooperation, and for high relatedness, there is bistability, with the basins of attraction of the two pure equilibria $z = 0$ and $z = 1$ being separated by a unique interior unstable point $z^*$ (see app. F.1 and fig. 2.f).
When synergies of scale are negative, the indirect gains from switching may still be increasing in because the incremental gain accrues to a larger number of recipients as increases. In such a scenario, always applicable when , the evolutionary dynamics are again qualitatively identical to those arising when synergies of scale are absent. A different picture can emerge if holds and synergies of scale are not only negative, but also sufficiently strong. In this case, can be negative frequencydependent for some , and hence (for sufficiently high values of ) also . Similarly to the case of nonexpresseronly traits with positive synergies of scale, this can give rise to patterns of frequency dependence that go beyond the scope of helping without synergies of scale.
To illustrate this, consider the case of geometric benefits (13) with , , and (see table 4 for a summary of results, app. F.2 for proofs and fig. 2.c for an illustration). Defining the critical value
(22) 
and the two critical costtobenefit ratios
(23) 
which satisfy and , our result can be stated as follows. For the evolutionary dynamics depends on how the costtobenefit ratio compares to and to . If (low costs), cooperation dominates defection, while if (high costs), defection dominates cooperation. If (intermediate costs), the dynamics are bistable. For , the classification of possible evolutionary dynamics is as in the case , except that, if , the dynamics are characterized by bistable coexistence, with stable, unstable, stable, and unstable, where
(24) 
For , the critical values , , and are all increasing functions of . Hence, as relatedness increases, the regions of the parameter space where some level of expression of helping is stable expand at the expense of the region of dominant nonexpression. Moreover, inside these regions the stable positive probability of expressing helping increases with (fig. 2.c). When synergies of scale are “sufficiently” negative () and for intermediate costtobenefit ratios () relatedness and synergies interact in a nontrivial way, leading to saddlenode bifurcations as increases (fig. 2.c).
3.5 Connections with previous models
Our model without synergies of scale, for which the gain function is linear in $z$ (section 3.1), extends classical two-player matrix games between relatives (e.g., Grafen, 1979; Frank, 1998, ch. 56) to the more general case of $n$-player linear games between relatives. Indeed, for $n = 2$, identifying scaled relatedness $\kappa$ with relatedness $r$, and up to normalization of the payoff matrices, equation (14) recovers Grafen (1979, eq. 9) and Frank (1998, eq. 5.6). Interestingly, Frank (1998, p. 98) considers a two-player model of helping with two pure strategies (“nesting” or expressing a queen phenotype, and “helping” or expressing a sterile worker phenotype), which is a particular case of our model of nonexpresser-only traits.
Our results on wholegroup traits with geometric returns (section 3.2 and app. C) extend the model studied by Hauert et al. (2006, p. 198) from the particular case of interactions between unrelated individuals () to the more general case of interactions between relatives () and recover the result by Archetti (2009, p. 476) in the limit , in which the game is also called a “volunteer’s dilemma” (Diekmann, 1985). Although we restricted our attention to the cases of constant, decreasing, and increasing incremental benefits, it is clear that equation (16) applies to benefits of any shape. Hence, general results about the stability of equilibria in public goods games (Peña et al., 2014) with sigmoid benefits (Bach et al., 2006; Archetti and Scheuring, 2011) carry over to games between relatives.
For their model of “selfdestructive cooperation” in bacteria, Ackermann et al. (2008) assumed a nonexpresseronly trait with no synergies of scale, and a haystack model of population structure implying , where is the number of offspring among which the game is played (see eq. (A.4)). Identifying our and with (respectively) their with , the main result of Ackermann et al. (2008) (eq. 7 in their supplementary material) is recovered as a particular case of our result that the unique convergent stable strategy for this case is given by (eq. (14)). The fact that in this example is a probability of coalescence within groups shows that social interactions effectively occur between family members, and hence that kin selection is crucial to the understanding of selfdestructive cooperation (Gardner and Kümmerli, 2008).
As mentioned before, the analysis of other-only traits follows closely that of whole-group traits (see app. D). The model of altruistic helping in Eshel and Motro (1988) considers such an other-only trait. In their model, one individual in the group needs help, which can be provided (action A) or denied (action B) by its neighbors: a situation Eshel and Motro call the “three brothers’ problem” when $n = 3$. Suppose that the cost for each helper is a constant independent of the number of expressers (the “risk for each volunteer” of Eshel and Motro (1988), denoted by in their paper) and that the benefit for the individual in need when co-players offer help is given by (the “gain function” of Eshel and Motro (1988), denoted by in their paper). Then, if individuals need help at random, the payoffs for helping (A) and not helping (B) are given by and . Defining and , we have and . Comparing these with the payoffs for whole-group traits in table 2, it is apparent that the key difference between other-only traits and whole-group traits is that an expresser is not among the recipients of its own helping behavior. As we show in appendix D, our results for whole-group traits carry over to such other-only traits. In particular, our results for whole-group traits with geometric benefits can be used to recover results 1, 2, and 3 of Eshel and Motro (1988) and to extend them from family-structured to spatially structured populations.
Finally, Van Cleve and Lehmann (2013) discuss an player coordination game. They assume payoffs given by and , for positive , and , satisfying , and . It is easy to see that both the direct effect and the indirect effect are strictly increasing functions of having exactly one sign change. This implies that, for , the evolutionary dynamics are characterized by bistability, with the basins of attraction of the two equilibria and being divided by the interior unstable equilibrium . Importantly, and in contrast to the social traits analyzed in this article, expressing the payoff dominant action A does not always qualify as a helping trait, as is negative for some interval . As a result, increasing scaled relatedness can have mixed effects on the location of . Both of these predictions are well supported by the numerical results reported by Van Cleve and Lehmann (2013), where increasing leads to a steady increase in for , , , , and a steady decrease in for , , , , see their figure 5. This illustrates that relatedness (and thus spatial structure) plays an important role not only in the specific context of helping games but also in the more general context of nonlinear multiplayer games.
4 Discussion
We have shown that, when phenotypic differences are small, the selection gradient on a mixed strategy of a symmetric two-strategy $n$-player matrix game is proportional to the average inclusive payoff gain to an individual switching strategies, and that this can be written as a polynomial in Bernstein form (eq. (10)). As a result, convergence stability of strategies in spatially structured populations can be determined from the shape of the inclusive gain sequence (eq. (9)) and the mathematical properties of polynomials in Bernstein form (Farouki, 2012; Peña et al., 2014). We applied these results to the evolution of helping under synergies of scale and kind, and unified and extended previous analyses. The most important conclusion we reach is that, although an increase in (scaled) relatedness always tempers the social dilemma faced by cooperative individuals in a helping game, how the social dilemma is relaxed crucially depends on the synergies of kind and scale involved.
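One property of polynomials in Bernstein form that makes this approach tractable is variation diminishing: the number of interior roots is at most the number of sign changes in the coefficient sequence, and the two numbers differ by an even amount. The sketch below illustrates this with two hypothetical gain sequences, which are not taken from the models above. The grid-based root count is a rough numerical check, not a proof.

```python
from math import comb

def bernstein(z, coeffs):
    """Evaluate sum_k C(m, k) z^k (1-z)^(m-k) coeffs[k], with m = len(coeffs) - 1."""
    m = len(coeffs) - 1
    return sum(comb(m, k) * z**k * (1 - z) ** (m - k) * c
               for k, c in enumerate(coeffs))

def sign_changes(values):
    """Number of sign changes in a sequence, ignoring exact zeros."""
    signs = [v > 0 for v in values if v != 0]
    return sum(a != b for a, b in zip(signs, signs[1:]))

def grid_root_count(coeffs, grid=1000):
    """Count sign changes of the polynomial on an interior grid of (0, 1)."""
    return sign_changes([bernstein(i / grid, coeffs) for i in range(1, grid)])

for f in ([-1.0, 2.0, -1.0], [-1.0, 0.5, -1.0]):
    print(f, "coefficient sign changes:", sign_changes(f),
          "interior roots found:", grid_root_count(f))
```

Both sequences have two sign changes, but only the first polynomial has two interior roots. The second has none, which shows that the sign-change count is an upper bound rather than an exact count.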
The simplest case is the one of wholegroup traits (fig. 1a). Since there are no synergies of kind, only synergies of scale can introduce frequency dependent selection. For , negative (resp. positive) synergies of scale induce negative (resp. positive) frequencydependent selection. Moreover, increasing relatedness can transform a game in which defection is dominant (Prisoner’s Dilemma) into a game in which cooperation and defection coexist (Snowdrift or anticoordination game) when synergies of scale are negative (fig. 2.a), or into a game in which both cooperation and defection are stable (Stag Hunt or coordination game) when synergies of scale are positive (fig. 2.d).
More complex interactions between relatedness and frequency dependence can arise when there are both synergies of kind and scale. For nonexpresseronly traits (fig. 1.b), synergies of kind are negative and helping is altruistic, so that in the absence of relatedness defection dominates cooperation (as in a Prisoner’s Dilemma). When synergies of scale are absent (linear benefits) or negative (diminishing incremental benefits), both the direct and the indirect effect are decreasing in and selection is negative frequencydependent. In this case, increasing relatedness might turn the game into a Snowdrift or an anticoordination game, where the probability of cooperating is an increasing function of relatedness (fig. 2.b). Contrastingly, when synergies of scale are positive (increasing incremental benefits) the indirect effect may become unimodal in . This paves the way for new patterns of evolutionary dynamics and bifurcations. For the particular case of geometric benefits, we find that there is a range of costtobenefit ratios such that, for sufficiently strong positive synergies of scale, increasing relatedness induces a saddlenode bifurcation whereby two internal equilibria appear, the leftmost unstable and the rightmost stable (fig. 2.e). After the bifurcation occurs, the evolutionary dynamics are characterized by bistable coexistence, where the first stable equilibrium is pure defection () and the second is a mixed equilibrium in which individuals help with a positive probability ().
For expresseronly traits (where synergies of kind are positive, fig. 1c) a similar interaction between relatedness and synergies occurs. When synergies of scale are absent or positive, is increasing in for . In this case, increasing relatedness might turn a scenario reminiscent of the Prisoner’s Dilemma into a Stag Hunt or coordination game, where the size of the basin of attraction of the cooperative equilibrium is an increasing function of relatedness (fig. 2.f). Contrastingly, if synergies of scale are negative, relatedness may interact nontrivially with synergies to produce a dynamical outcome which is qualitatively identical to that arising from nonexpresseronly traits with positive synergies of scale, namely, bistable coexistence (fig. 2.c).
The three kinds of helping traits we considered are also different in the conditions they impose on the origin and the maintenance of helping. To see this, consider a payoff cost so large that the direct sequence is negative. For the case of unrelated individuals () this implies that B dominates A so that is the only stable strategy. We ask what happens when is increased and focus on the stability of the endpoints and .
For wholegroup traits, the indirect gains from switching when coplayers are all defectors () and when coplayers are all helpers () are both positive. This opens up the opportunity for both (i) to be destabilized if , and (ii) to be stabilized if , which underlies the classical effect that increasing relatedness can destabilize defection and stabilize helping.
In contrast, one of these two scenarios is missing for nonexpresseronly and expresseronly traits. For nonexpresseronly traits, we have and irrespectively of the shape of the benefit sequence. Hence, although can be destabilized by increasing (allowing for some level of helping to be evolutionarily accessible from ), can never be stabilized and so full helping is never an evolutionary (convergent) stable point. Exactly the opposite happens for expresseronly traits, where but . As a result, can become stable (if ) but can never be destabilized by increasing . This implies that, under our assumptions, an expresseronly trait with high costs () can never evolve from a monomorphic population of nonexpressers (), and this for any value of .
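The endpoint argument can be made concrete with a short computation. The sketch below assumes the following reading of the gains from switching: for whole-group traits $d_k = -\gamma + \Delta\beta_k$ and $e_k = (n-1)\Delta\beta_k$; for nonexpresser-only traits $d_k = -\gamma - \beta_k$ and $e_k = (n-1-k)\Delta\beta_k$; for expresser-only traits $d_k = -\gamma + \beta_{k+1}$ and $e_k = k\Delta\beta_k$. The function name, the linear benefit sequence and the cost value are illustrative assumptions.

```python
def gains_from_switching(trait, n, gamma, benefits):
    """Return (d, e) for k = 0..n-1 under the assumed payoff structures.
    benefits = [beta_0, ..., beta_n] with beta_0 = 0."""
    delta = [benefits[k + 1] - benefits[k] for k in range(n)]  # Delta beta_k
    if trait == "whole-group":
        d = [-gamma + delta[k] for k in range(n)]
        e = [(n - 1) * delta[k] for k in range(n)]
    elif trait == "nonexpresser-only":
        d = [-gamma - benefits[k] for k in range(n)]
        e = [(n - 1 - k) * delta[k] for k in range(n)]
    elif trait == "expresser-only":
        d = [-gamma + benefits[k + 1] for k in range(n)]
        e = [k * delta[k] for k in range(n)]
    else:
        raise ValueError(f"unknown trait: {trait}")
    return d, e

# Hypothetical example: linear benefits and a cost large enough that all d_k < 0.
n, gamma = 5, 10.0
benefits = [float(k) for k in range(n + 1)]
for trait in ("whole-group", "nonexpresser-only", "expresser-only"):
    d, e = gains_from_switching(trait, n, gamma, benefits)
    print(f"{trait}: e_0={e[0]}, e_(n-1)={e[-1]}, all d_k<0: {all(x < 0 for x in d)}")
```

Since the gain function at the endpoints equals $d_0 + \kappa e_0$ and $d_{n-1} + \kappa e_{n-1}$, an indirect gain of zero at an endpoint means that no amount of relatedness can change the sign of the gain function there when the direct gain is negative.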
The kind of social trait also has a big impact on the amount of (scaled) relatedness required to make some level of helping stable. This quantitative effect is illustrated in figure 2. When synergies of scale are negative () and the cost-to-benefit ratio is relatively low (), for whole-group and nonexpresser-only traits, moderate amounts of relatedness ( for whole-group traits, for nonexpresser-only traits) are sufficient for a nonzero level of expression of helping to be stable (fig. 2.a and 2.b). In contrast, a comparatively large amount of relatedness () is required for some nonzero level of helping to be stable if the trait is expresser-only (fig. 2.c). In the case of positive synergies of scale () and relatively high cost-to-benefit ratio (), full expression of helping is stable already with for whole-group and expresser-only traits (fig. 2.d and 2.f). Contrastingly, for nonexpresser-only traits, a positive probability of expressing helping is stable only for large values of relatedness (, fig. 2.e).
We modeled social interactions by assuming that actions implemented by players are discrete. This is in contrast to many kin-selection models of games between relatives, which assume a continuum of pure actions in the form of continuous amounts of effort devoted to some social activity (e.g., Frank 1994; Johnstone et al. 1999; Reuter and Keller 2001; Wenseleers et al. 2010). Such continuous-action models have the advantage that the “fitness function” or “payoff function” (the counterpart to our eq. (3)) usually takes a simple form that facilitates mathematical analysis. On the other hand, there are situations where individuals can express only a few behavioral alternatives or morphs, such as worker and queen in the eusocial Hymenoptera (Wheeler, 1986), different behavioral tactics in foraging (e.g., “producers” and “scroungers” in house sparrows Passer domesticus; Barnard and Sibly, 1981) and hunting (e.g., lionesses positioned as “wings” and others positioned as “centres” in collective hunts; Stander, 1992), or distinct phenotypic states (e.g., capsulated and non-capsulated cells in Pseudomonas fluorescens; Beaumont et al., 2009). These situations are more conveniently modeled by means of a discrete-action model like the one presented here, but we expect that our qualitative results about the interaction between synergy and relatedness carry over to continuous-action models.
Synergistic interactions are likely to be much more common in nature than additive interactions where both synergies of scale and kind are absent. Given the local demographic structure of biological populations, interactions between relatives are also likely to be the rule rather than the exception. Empirical work should thus aim at measuring not only the genetic relatedness of interactants and the fitness costs and benefits of particular actions, but also at identifying the occurrences of positive and negative synergies of kind and scale, as it is the interaction between synergies and relatedness which determines the qualitative outcomes of the evolutionary dynamics of helping (fig. 2).
5 Acknowledgements
This work was partly supported by Swiss NSF Grants PBLAP3145860 (to JP) and PP00P3123344 (to LL).
Appendix A The haystack model
Many models of social interactions have assumed different versions of the haystack model (e.g., Matessi and Jayakar, 1976; Ackermann et al., 2008), where several rounds of unregulated reproduction can occur within groups before a round of complete dispersal (Maynard Smith, 1964) so that competition is effectively global. In these cases, as we will see below, takes the simpler interpretation of the coalescence probability of the gene lineage of two interacting individuals in their group. Here, we calculate for different variants of the haystack model.
The haystack model can be seen as a special case of the island model where dispersal is complete and where dispersing progeny compete globally. In this context, the fecundity of an adult is the number of its offspring reaching the stage of global density-dependent competition. The conception of offspring may occur in a single round or over multiple rounds of reproduction, so that a growth phase within patches is possible. In this context, the number of “adults” is better thought of as the number of founding individuals (or lineages, or seeds) on a patch.
Two cases need to be distinguished when it comes to social interactions. First, the game can be played between the adult individuals (founders) in which case
(A.1) 
since relatedness is zero among founders on a patch and there is no local competition. Alternatively, the game is played between offspring after reproduction and right before their dispersal. In this case two individuals can be related since they can descend from the same founder. Since there is no local competition, is directly the relatedness between two interacting offspring and is obtained as the probability that the two ancestral lineages of two randomly sampled offspring coalesce in the same founding individual (relatedness in the island model is defined as the cumulative coalescence probability over several generations, see e.g., Rousset, 2004, but owing to complete dispersal gene lineages can only coalesce in founders).
In order to evaluate for the second case, we assume that, after growth, exactly offspring are produced and that the game is played between them (). Founding individuals, however, may contribute a variable number of offspring. Let us denote by the random number of offspring descending from the “adult” individual on a representative patch after reproduction, i.e., is the size of lineage . Owing to our assumption that the total number of offspring is fixed, we have , where the ’s are exchangeable random variables (i.e., neutral process, ). The coalescence probability can then be computed as the expectation of the ratio of the total number of ways of sampling two offspring from the same founding parent to the total number of ways of sampling two offspring:
(A.2) 
where the second equality follows from exchangeability, is the expected number of offspring descending from any individual , and is the corresponding variance. Due to the fact that the total number of offspring is fixed, we also necessarily have (i.e., ), whereby
(A.3) 
which holds for any neutral growth process.
We now consider different cases:
(i) Suppose that there is no variation in offspring production between founding individuals, as in the life cycle described by Ackermann et al. (2008). Then , and equation (A.3) simplifies to
(A.4) 
(ii) Suppose that each of the offspring has an equal chance of descending from any founding individual, so that each offspring is the result of a sampling event (with replacement) from a parent among the founding individuals. Then, the offspring number distribution is binomial with parameters and , whereby . Substituting into equation (A.3) produces
(A.5) 
In more biological terms, this case results from a situation where individuals produce offspring according to a Poisson process and where exactly individuals are kept for interactions (i.e., the conditional branching process of population genetics; Ewens, 2004).
(iii) Suppose that the offspring distribution follows a betabinomial distribution, with number of trials and shape parameters and . Then, and
which yields
(A.6) 
In more biological terms, this reproductive scheme results from a situation where individuals produce offspring according to a negative binomial distribution (larger variance than Poisson, which is recovered when ), and where exactly individuals are kept for interactions.
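The coalescence probability described above can be computed directly from the mean and variance of lineage size. The following is a minimal sketch, assuming that the probability equals the expected number of ordered offspring pairs sharing a founder divided by the total number of ordered offspring pairs. The names `n_founders`, `n_offspring` and `variance` are hypothetical labels for the number of founders, the fixed total number of offspring and the variance of lineage size.

```python
from fractions import Fraction

def coalescence_probability(n_founders, n_offspring, variance):
    """Probability that two sampled offspring descend from the same founder.

    Uses E[O(O-1)] = variance + mean**2 - mean for one lineage of size O,
    summed over exchangeable founders, and divides by the number of ordered
    pairs of offspring. Assumes a fixed total number of offspring."""
    mean = Fraction(n_offspring, n_founders)
    pairs_same_founder = n_founders * (Fraction(variance) + mean**2 - mean)
    return pairs_same_founder / (n_offspring * (n_offspring - 1))

n_founders, n_offspring = 4, 12

# Case (i): every founder contributes the same number of offspring.
print(coalescence_probability(n_founders, n_offspring, 0))

# Case (ii): each offspring picks a founder uniformly at random (binomial variance).
p = Fraction(1, n_founders)
print(coalescence_probability(n_founders, n_offspring, n_offspring * p * (1 - p)))
```

Under these assumptions the binomial case reduces to the reciprocal of the number of founders, whatever the number of offspring, while equal contributions give a smaller value because lineage sizes do not vary.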
Appendix B Gains from switching and the gain function
In the following we establish the expressions for and given in equations (7)–(8); equation (10) is then immediate from the definition of (9) and the identity .
Recalling the definitions of and from equation (4) as well as the definitions of and from equations (5)–(6) we need to show
(B.1)  
(B.2) 
where the function has been defined in equation (3). Equation (B.1) follows directly by taking the partial derivative of with respect to and evaluating at , so it remains to establish equation (B.2).
Our derivation of equation (B.2) uses properties of polynomials in Bernstein form (Farouki, 2012). Such polynomials, which in general can be written as , where , satisfy
Applying this property to equation (3) and evaluating the resulting partial derivative at , yields
(B.3) 
In order to obtain equation (B.2) from equation (B.3) it then suffices to establish
(B.4) 
and
(B.5) 
as applying these identities to the terms on the right side of equation (B.3) yields the right side of equation (B.2).
Let us prove equation (B.4) (eq. (B.5) is proven in a similar way). Starting from the left side of equation (B.4), we multiply and divide by and distribute to obtain
Applying the identity and changing the index of summation to , we get
Finally, changing the lower index of the sum by noting that the summand is zero when gives equation (B.4).
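The derivation relies on the derivative rule for polynomials in Bernstein form, which we take here to be the standard one: the derivative of a degree-$m$ polynomial with coefficients $c_k$ is $m$ times the degree-$(m-1)$ Bernstein polynomial with coefficients $c_{k+1} - c_k$. The sketch below checks this rule against a central finite difference. The coefficients and the evaluation point are arbitrary illustrative choices.

```python
from math import comb

def bernstein(x, coeffs):
    """Evaluate sum_k C(m, k) x^k (1-x)^(m-k) coeffs[k], with m = len(coeffs) - 1."""
    m = len(coeffs) - 1
    return sum(comb(m, k) * x**k * (1 - x) ** (m - k) * c
               for k, c in enumerate(coeffs))

def bernstein_derivative(x, coeffs):
    """Derivative via forward differences of the coefficients:
    m * sum_k C(m-1, k) x^k (1-x)^(m-1-k) (c_{k+1} - c_k)."""
    m = len(coeffs) - 1
    differences = [coeffs[k + 1] - coeffs[k] for k in range(m)]
    return m * bernstein(x, differences)

coeffs = [1.0, -2.0, 0.5, 3.0]  # arbitrary coefficients, degree 3
x, h = 0.3, 1e-6
numeric = (bernstein(x + h, coeffs) - bernstein(x - h, coeffs)) / (2 * h)
print(bernstein_derivative(x, coeffs), numeric)
```

The two printed values should agree up to finite-difference error, which is what allows derivatives of the payoff function to be expressed again as sums over binomial weights.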
Appendix C Whole-group traits with geometric benefits
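With geometric benefits, we have $\Delta\beta_k = \beta\lambda^k$, so that the inclusive gains from switching for whole-group traits are given by $f_k = -\gamma + [1 + \kappa(n-1)]\beta\lambda^k$. Using the formula for the probability generating function of a binomial random variable, equation (10) can be written as
$$G(z) = -\gamma + \left[1 + \kappa(n-1)\right] \beta \left(1 - z + \lambda z\right)^{n-1}. \tag{C.1}$$
As $G(z)$ is either decreasing ($\lambda < 1$) or increasing ($\lambda > 1$) in $z$, A (resp. B) is a dominant strategy if and only if $G(0) \ge 0$ and $G(1) \ge 0$ (resp. if and only if $G(0) \le 0$ and $G(1) \le 0$). Using equation (C.1) to calculate $G(0)$ and $G(1)$ then yields the critical cost-to-benefit ratios $\varepsilon$ and $\vartheta$ given in equation (17). The value of $z^*$ given in equation (18) is obtained by solving $G(z^*) = 0$.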
Appendix D Other-only traits
In contrast to what happens in wholegroup traits, individuals expressing an otheronly trait are automatically excluded from the consumption of the good they create, although they can still reap the benefits of goods created by other expressers in their group. Payoffs for such otheronly traits are given by and , so that the inclusive gains from switching are given by . For this payoff constellation, it is straightforward to obtain the indirect benefits from equation (B.3) in appendix B. Observing that holds for all , we have