Simulation of an Optional Strategy in the Prisoner’s Dilemma in Spatial and Nonspatial Environments ^{1}
Abstract
This paper presents research comparing the effects of different environments on the outcome of an extended Prisoner’s Dilemma, in which agents have the option to abstain from playing the game. We consider three different pure strategies: cooperation, defection and abstinence. We adopt an evolutionary game theoretic approach and consider two different environments: the first which imposes no spatial constraints and the second in which agents are placed on a lattice grid. We analyse the performance of the three strategies as we vary the loner’s payoff in both structured and unstructured environments. Furthermore we also present the results of simulations which identify scenarios in which cooperative clusters of agents emerge and persist in both environments.
Keywords:
Artificial Life; Game Theory; Evolutionary Computation.1 Introduction
Within the areas of artificial life and agentbased simulations, evolutionary games such as the classical Prisoner’s Dilemma [2, 15], and its extensions in the iterated form, have garnered much attention and have provided many useful insights with respect to adaptive behaviours. The Prisoner’s Dilemma game has attained this attention due to its succinct representation of the conflict between individually rational choices and choices that are for the better good. However, in many social scenarios that we may wish to model, agents are often afforded a third option — that of abstaining from the interaction. Incorporating this concept of abstinence extends the Prisoner’s Dilemma to a threestrategy game where agents can not only cooperate or defect but can also choose to abstain from a game interaction. There have been a number of recent studies exploring this type of game [16, 5, 9, 7, 8].
In addition to analysing the evolution of different strategies and different outcomes, previous work has also explored the effect of imposing spatial constraints on agent interactions. Traditionally, these studies assume no such constraints and agents are free to interact with all other agents in wellmixed populations [2]. However, many models consider restricting interactions to neighbourhoods of agents on some predefined topology. These more expressive models include lattices [13, 6], cycles and complete graphs [9], scalefree graphs [16] and graphs exhibiting certain properties, such as clustering coefficients [11].
In this paper we adopt an evolutionary approach to evolve populations of agents participating in the extended Prisoner’s Dilemma [12]. We consider two different environmental settings: one with no enforced structure where agents may interact with all other agents; and another in which agents are placed on a lattice grid with spatial constraints enforced, where agents can play with their immediate eight neighbours (Moore neighbourhood). In both environmental settings, an agent’s fitness is calculated as the sum of the payoffs obtained through the extended Prisoner’s Dilemma game interactions. We investigate the evolution of different strategies (cooperate, defect and abstain) in both spatial and nonspatial environments. We are particularly interested in the effect of different starting conditions (number of different strategies and placement of different strategies) and the different values for the loner’s payoff () on the emergence of cooperation. We identify situations where the simulations converge to an equilibrium, where no further changes occur. These equilibria can be fully stable (no change) or quasistable (with a small cycle length).
The paper outline is as follows: In Section 2 an overview of work in the extended game and of spatial evolutionary game theory is presented. Section 3 gives an overview of the methodology employed. In Section 4, we discuss the nonspatial environment. We firstly present an analysis of pairwise interactions between the three pure strategies. Secondly, evolutionary experiments using all three strategies are presented. Thirdly, we explore the robustness of a population of cooperative and abstaining strategies when a defecting strategy is added to the population. In Section 5, we discuss the environment where agents are placed on a lattice grid, in which their interactions are constrained by their local neighbourhood. Again an analysis of pairwise interactions is first undertaken followed by an exploration of the outcomes when all three strategies are randomly placed on the grid. Based on these findings, we explore different starting groupings of the three strategies, i.e. placed in a nonrandom manner on the grid. This will allow identification of starting configurations that lead to stable cooperation.
2 Related Work
Abstinence has been studied in the context of the Prisoner’s Dilemma (PD) since Batali and Kitcher, in their seminal work [3], first introduced the optional variant of the game. They proposed the optout or “loner’s” strategy, in which agents could choose to abstain from playing the game, as a third option, in order to avoid cooperating with known defectors. Using a combination of mathematical analysis and simulations, they found that populations who played the optional games could find routes from states of low cooperation to high states of cooperation. Subsequently, as this extension has grown in popularity and renown, optional participation has been successfully incorporated into models alongside other cooperation enhancing mechanisms such as punishment [7] and reputation [14, 5], and has been applied to probabilistic models [16].
The study of optional participation can be broadly separated into approaches: one that directly incorporates abstinence into the traditional PD game (the loner’s strategy), and another known as conditional cooperation. Models that incorporate the loner’s strategy treat the option to abstain as an alternative strategy for agents to employ [3, 9], separate to the option to cooperate or defect. These models tend to be more grounded in mathematical models with less of an emphasis on experimental simulations, which oftentimes have been shown to produce unexpected results [6]. On the other hand, conditional cooperation models [1, 10, 8], also known as conditional disassociation, incorporate abstinence into cooperation strategies. These models lend themselves more easily to Axelrodstyle tournaments [2]. They tend to focus on exit options or partnerleaving mechanisms, and often lack a spatial aspect, which has since been shown to increase the number of abstainer strategies thus increasing the chances of cooperation evolving [9].
The work that most closely resembles our own is that of Hauert and Szabó [6]. They consider a spatially extended PD and public goods game (PGG), where a population of agents are arranged and interact on a variety of different geometries, including a regular lattice. Three pure strategies (cooperate, defect and abstain) are investigated using an evolutionary approach. Results showed that the spatial organisation of strategies affected the evolution of cooperation, and in addition, they found that the existence of abstainers was advantageous to cooperators, because they were protected against exploitation. However, there exists some major differences between their model and the one proposed here. Hauert and Szabó focus on a simplified PGG as their primary model for group interactions, and separately use the PD only for pairwise interactions. In our model, agents interact by playing a single round of the PD with each of their neighbours. Additionally, Hauert and Szabó focused on one set of initial conditions for their simulations, using a fixed ratio of strategies. Our work explores a wider range of initialization settings from which we gleam more significant insights, and identify favourable configurations for the emergence of cooperation.
3 Methodology
In order to explore these strategies and, in particular, the effect of introducing abstinence, we propose a set of experiments in which each agent randomly plays a number of oneshot, twoperson extended Prisoner’s Dilemma game. An evolutionary approach is adopted with a fixedsize population where each agent in the population is initially assigned a fixed strategy. Fitness is calculated and assigned based on the payoffs obtained by the agents from playing the game. Simulations are run until the population converges on a single strategy, or configuration of strategies.
In the traditional Prisoner’s Dilemma game there are four payoffs corresponding to the pairwise interaction between two agents. The payoffs are: reward for mutual cooperation (), punishment for mutual defection (), sucker’s payoff () and temptation to defect (). The dilemma arises due to the following ordering of payoff values: . When extending the game to include abstinence, a fifth payoff is introduced, the loner’s payoff () is awarded to both participants if one or both abstain from the interaction.
The value of should be set such that: (1) it is not greater than , otherwise the advantage of not playing will be sufficiently large to ensure that players will always abstain and (2) it is greater than , otherwise there are no benefits to abstaining. This enables us to investigate the values of in the range , which in turn contrasts with the definition used by Hauert and Szabó [6] who define abstainers as strategies who perform better than groups of defectors but worse than groups of mutually cooperating strategies. In their model, abstainers receive a payoff less than and greater than . We choose to explore a more exhaustive range of values. The payoffs for the extended Prisoner’s Dilemma game are illustrated in Tab. 1 and are based on the standard values used by Axelrod [2].


As we aim to study the behaviour of agents in different scenarios, our first model allows all agents to potentially interact (Sect. 4). Our second model places topological constraints on the agent population which restricts the potential interactions that can take place (Sect. 5). This allows for the comparison between spatial and nonspatial environments and allows us to identify similarities and differences in conditions that promote cooperation. For both environments, two common sets of experiments are considered:

Pairwise comparisons: The abstainer strategies compete with one of the other strategies; firstly, an equal number of cooperators () and abstainers (); and secondly, an equal number of defectors () and abstainers ().

All three strategies present: We adopt an unbiased environment in which initially each agent is designated as a cooperator (), defector () or abstainer () with equal probability.
Moreover, to further explore the effect of adding the option of abstinence, a third experiment is undertaken in the nonspatial environment, where we seed the population with a majority of one type of strategy (abstainers) and explore if the population is robust to invasion from (1) a cooperator and (2) a cooperator and a defector (Sect. 4.3). In order to explore the effect of different initial spatial configurations, we also undertake a third set of experiments in the spatial environment, which provide an insight in to the necessary spatial conditions that may lead to robust cooperation (Sect. 5.3).
4 Nonspatial Environment
In this section, we present results of the experiments in the nonspatial environment and settings as described previously in Section 3. We use a tournament selection with size 2.
4.1 Pairwise Comparisons
The simulations involving cooperators, and abstainers , verified the expected outcomes where the cooperators quickly spread throughout the population resulting in complete cooperation. This can be shown to be correct by calculating the difference in the payoffs each strategy receives:
As , and thus the cooperators always dominate. Our simulations confirm this result.
When comparing and strategies and their payoffs, we see:
If , then either defectors or abstainers may dominate at any stage. If , then abstainers dominate. If then defectors dominate. Figure 0(a) illustrates this behaviour in simulations for different values of with an initial equal population of defectors and abstainers. For each simulation, 100 separate runs are undertaken and the average of the numbers of each strategy present per run are averaged per generation and plotted. It can be seen that when , the defectors have a selective advantage and dominate. At , neither the defectors nor the abstainers have a clear advantage. When , the abstainers have the selective advantage and they now dominate in the majority of cases. The above calculations assume all players play all other players; our simulations approximate this result.
4.2 All Three Strategies
In this experiment, an unbiased environment, with an initial population consisting of the same number of cooperators (C), defectors (D) and abstainers (A), is created. Figure 0(b) illustrates the behaviour at generation 50 across 100 individual runs. For defectors have already dominated the population. For , defectors still dominate but on a minority of runs abstainers dominate. For this dominance of the abstainers becomes more pronounced as the payoff for abstainers increases. In fact, in some runs given the selective advantage of abstainers over defectors, some cooperators outperform the defectors resulting in a fully cooperative run.
4.3 Robustness
The previous experiments show the outcome for a range of starting conditions. In this section, we explore the robustness of states to the introduction of a defector. Initially, a population is created comprising one cooperator strategy and the remainder of the strategies are all abstainers. In this situation, in the first generation all strategies receive the same payoffs, . Via tournament selection, subsequent generations may comprise more than one cooperative strategy. If this is the case, and these cooperative strategies are chosen to play against each other, they receive a higher payoff than abstainers, and cooperation will flourish. On the other hand, if the cooperative strategy is not selected for subsequent generations, then the population will consist only of abstainer strategies. This is illustrated in Fig. 0(c), which shows the average of 100 runs. In any of these runs, the evolutionary outcome is either a population comprising fully of cooperators or a population comprising fully of abstainers. The value of does not affect this outcome as is always true.
In the second robustness experiment, the initial population consists of 98 abstainers, 1 cooperator and 1 defector. Figure 0(d) shows the outcomes after 50 generations. When , as seen previously, the defectors will have an advantage over abstainers. However, due to tournament selection, there is a possibility that a defector will not be chosen for subsequent generations. When , the abstainers have the advantage over the defectors given the possibility of mutual defection among defectors. The defectors may continue to survive in the population given the presence of cooperators whom they can exploit. We witness that the cooperators can still do well given the benefits of mutual cooperation. However, the number of runs in which cooperation flourishes is reduced due to the presence of defectors. When , defectors and abstainers achieve the same payoff in their pairwise interactions. However, defectors may do better in that they will exploit any cooperators. As the cooperators die out, there is no selective advantage for defectors but a level of robustness to invasion is observed.
In summary, these results show, when introducing one cooperator, abstainers and cooperators can coexist; but when adding one cooperator and one defector more complex outcomes are possible.
5 Spatial Environments
In this section, we are interested in exploring the larger range of outcomes that result from the introduction of the spatial constraints. For the following experiments, we replace the tournament selection used in the nonspatial experiments with a mechanism whereby an agent adopts the strategy of the best performing neighbour strategy. This is in line with standard approaches in spatial simulations [13, 6].
5.1 Pairwise Comparison between Agents
When placing cooperator and defector agents randomly on the lattice grid, the defecting agents will spread amongst the cooperators echoing previous findings. When cooperator and abstainer agents are randomly placed on the grid, we find that if there are at least two cooperators beside each other, cooperation will spread, irrespective of the value of as cooperative agents playing with each other will obtain a higher payoff than any adjacent abstainer agents. Thus, neighbours will copy the cooperating strategy. Finally, when defector and abstainer agents are randomly placed on the grid, we see from Fig. 2 that different outcomes occur depending on the value of . This is similar to the results observed in the nonspatial pairwise comparison.
5.2 All three strategies
In this experiment, equal numbers of the three strategies are placed randomly on the grid. The outcome for is as expected with defectors quickly dominating the population. However, in 65% of simulations small clusters of cooperators survive thanks to the presence of the abstainers. The abstainers give the cooperators a foothold, allowing them to ward off invasion from the defectors.
For , defectors once again dominate, despite the tie, as they are able to exploit cooperators in the population. Once again, some small groups of cooperators survive with the same probability.
A number of simulations are run varying from 1.1 to 2.0 where results show similar emergent evolutionary stable patterns across all values of in this range. There are two distinct outcomes; abstainers dominate; and abstainers dominate with some sustained cooperation. Some level of cooperation is achieved on average in 51.5% of simulations for values of in the range . In these runs, a cooperative cluster (of minimum size 9), surrounded by defectors, forms in the early generations and remains a stable feature in subsequent generations. The presence of defectors, surrounding the cooperative cluster, prevents the abstainers from being invaded by the cooperators. Similarly the defector strategies remain robust to the spread of abstainers given their ability to exploit the cooperators. In essence, a symbiotic relationship is formed between cooperators and defectors. Figure 2(b) shows a screenshot of a cooperator and defector cluster in a simulation where abstainers have dominated. This configuration, once reached, is stable in these settings.
As the value of increases we also witness newer phenomena. For and , we see cycles between two states where some of the surrounding defectors fluctuate from defector to abstainer and back again. We also see an increase in the size and amount of clusters when they are formed. For , we see “gliders” [4] where a group of defectors flanked by a row of cooperators seemingly move across the grid, as shown in Fig. 2(a). In reality, the cooperators invade the abstainers, the defectors invade the cooperators, and the abstainers in turn invade the defectors.
5.3 Exploration of the Effect of Different Initial Spatial Configurations
The aim of this experiment is to investigate different initial spatial settings of cooperators, defectors and abstainers to further explain the results witnessed in the previous experiment (Sect. 5.2). One interesting outcome from the previous simulations involved a stable situation where one strategy (inner) could survive in a cluster of the same strategies due to being surrounded fully by another strategy (middle) which, in turn, is itself surrounded fully by the third (outer) strategy (see Fig. 2(b)). In this case, it appeared that the inner strategy needs the protection of the middle strategy to avoid invasion by the outer strategy and that the middle strategy in turn needs the inner strategy to avoid invasion by the outer strategy. It was noted that for cooperators surrounded by defectors, a minimum inner cluster size of 9 was needed in order for this outcome to emerge.
Given three strategies, we consider all six permutations with respect to the placement of strategies in the three different positions of inner, middle and outer with an inner cluster of size 9, a middle cluster comprising 3 layers around the inner cluster, and the remaining outer portion of the grid containing only the third strategy. We label these six spatial configurations according to the first letter of the strategy (C, D, A) and their initial position (inner, middle, outer). Figure 2(c) is an illustration of the initial conditions for the “CAD” spatial configuration. We note that given any initial configuration the outcome will not vary. This means that there is no reason, other than for verification, to run a configuration multiple times. Two values of are explored: and for each configuration. For , simulations reveal no selective pressure for interactions between defectors and abstainers. These results involve a level of stochasticity which do not give any meaningful insights, and thus are not further discussed in this paper.
In every permutation of A, C, and D when the defectors dominate. Both defectors and cooperators invade the abstainers and then the defectors begin to invade the cooperators. We again observe that many clusters of cooperators, of different sizes but of minimum size 9, remain robust to this invasion, as a result to the presence of the abstainers. The initial placement of the strategies dictates how many cooperative clusters are likely to remain robust to invasion by defection. Table 2 provides an overview of the results from each scenario when . The existence of the abstainer strategies, in addition to the initial placement of the strategies, ensures that defection will not dominate in all of the scenarios. In fact, in one scenario (CAD), it results in a fully cooperative population.
Shape  Outcome  Description 

DCA  Defection spreads  Abstainers are invaded. Clusters of cooperators survive amongst dominant defectors. 
DAC  Defection spreads  Similar outcome as above. 
CDA  Structurally stable  A symbiotic cluster of cooperators and defectors persist among the abstainers. 
CAD  Cooperation spreads  Abstainers buffer cooperators against defectors, allowing them to dominate. 
ACD  Abstainers invaded  Cooperators invade the inner abstainers to create a cluster resistant to defector invasion. 
ADC  Abstinence spreads  Clusters of cooperators, surrounded by defectors survive within the abstainer majority (see Fig. 2(b)). 
We have seen in comparison in the nonspatial experiments that all strategies may influence each other’s payoffs and we observe a smaller set of outcomes. When all the strategies are placed together, either defectors or abstainers dominate. In the spatial scenarios, there are outcomes with robust clusters of cooperators. In the robustness experiments, in the nonspatial scenarios, a largely cooperative population is easily invaded by defectors; this is not the case in the spatial scenario where we have shown that cooperators can be robust to invasion for specific initial settings. In the nonspatial scenarios, with the existence of abstainers, the population is largely robust and results in a mixed equilibrium.
6 Conclusions and Future Work
In this paper, two different environments in which populations of agents played an extended version of the Prisoner’s Dilemma were considered: nonspatial where all agents were potential partners for each other, and a population organised on a lattice grid where agents can only play with their 8 immediate neighbours. For both scenarios, three sets of experiments were performed: a pairwise comparison of two strategies; experiments involving all three strategies and an exploration of the conditions leading to cooperative outcomes.
In the nonspatial environment, for the pairwise comparison, with agents initially having equal number of cooperators and abstainers, cooperation spreads throughout the population. The outcome when agents initially have an equal number of defectors and abstainers is dependent on the loner’s payoff (). When all three strategies are initially equally present in the population the value of the loner’s payoff is again crucial. When the value of is less than or equal to the punishment for mutual defection, the dominant strategy is defection; in other cases abstinence spreads as a strategy and this in turn can lead to cooperation spreading. In the robustness experiments, we consider populations comprising of agents with abstainer strategies and explore the effects of perturbing the population by the addition of firstly, a cooperative agent and secondly agents with strategies of cooperation and defection. Results show that only in the second scenario does the value of influence the outcome.
In the spatial experiments, similar outcomes arise for the pairwise comparisons. When considering equal numbers of agents with all three strategies some similarities between the spatial and nonspatial results are noticed, but the spatial organisation allows for the clustering of cooperative agents. For all values of the loner’s payoff, defection dominates in addition to the presence of some clusters of cooperators where these clusters are protected by abstainers . As the loner’s payoff increases above 1.5, the size of these clusters of cooperative agents increases. In the experiments considering different initial spatial configurations interesting behaviour was noted for the six different possible starting initialisations. In all cases, irrespective of the position of the cooperative strategies initially, and the value of , cooperative clusters persisted.
In previous work in the spatial Prisoner’s Dilemma, it has been shown that cooperation can be robust to invasion if a sufficiently large cluster of cooperators form. However, given a random initialisation, this rarely happens and defectors can dominate in most scenarios. With the introduction of abstainers, we see new phenomena and a larger range of scenarios where cooperators can be robust to invasion by defectors and can dominate.
Future work will involve extending our abstract model to more realistic scenarios. There are many documented scenarios of symbiosis between entities (individuals, species, plants and companies). In our simulations, we model symbiosis between three distinct entities. We are interested in identifying scenarios where insights obtained in our spatial configurations may apply; for example, the planting of specific plants (abstainers) to prevent the invasion one plant species (defector) into another native species (cooperator).
Future work will also involve performing a more detailed investigation into emergent evolutionary stable patterns witnessed at different values of and the exploration of other topologies with the goal of identifying structures that allow robust cooperation.
Acknowledgments.
This work is funded by CNPq–Brazil and the Hardiman Scholarship, NUI Galway. The final publication is available at Springer via http://dx.doi.org/10.1007/9783319434889_14.
Footnotes
 thanks: The final publication is available at Springer via http://dx.doi.org/10.1007/9783319434889_14
 email: marcos.cardinot@nuigalway.ie
 email: marcos.cardinot@nuigalway.ie
 email: marcos.cardinot@nuigalway.ie
 email: marcos.cardinot@nuigalway.ie
References
 Arend, R.J., Seale, D.A.: Modeling alliance activity: An iterated prisoner’s dilemma with exit option. Strategic Management Journal 26(11), 1057–1074 (2005)
 Axelrod, R.M.: The evolution of cooperation. Basic Books, New York (1984)
 Batali, J., Kitcher, P.: Evolution of altruism in optional and compulsory games. J. Theor. Biol. 175(2), 161–171 (1995)
 Gardner, M.: Mathematical Games: The Fantastic Combinations of John Conway’s New Solitaire Game Life. Scientific American 223(4), 120–123 (1970)
 Ghang, W., Nowak, M.A.: Indirect reciprocity with optional interactions. J. Theor. Biol. 365, 1–11 (2015)
 Hauert, C., Szabö, G.: Prisoner’s dilemma and public goods games in different geometries: Compulsory versus voluntary interactions. Complexity 8(4), 31–38 (2003)
 Hauert, C., Traulsen, A., Brandt, H., Nowak, M.A.: Public goods with punishment and abstaining in finite and infinite populations. Biol. Theory 3(2), 114–122 (2008)
 Izquierdo, S.S., Izquierdo, L.R., VegaRedondo, F.: The option to leave: Conditional dissociation in the evolution of cooperation. J. Theor. Biol. 267(1), 76–84 (2010)
 Jeong, H.C., Oh, S.Y., Allen, B., Nowak, M.A.: Optional games on cycles and complete graphs. J. Theor. Biol. 356, 98–112 (2014)
 Joyce, D., Kennison, J., Densmore, O., Guerin, S., Barr, S., Charles, E., Thompson, N.S.: My way or the highway: a more naturalistic model of altruism tested in an iterative prisoners’ dilemma. J. Artif. Soc. Soc. Simulat. 9(2), 4 (2006)
 Li, M., O’Riordan, C.: The effect of clustering coefficient and node degree on the robustness of cooperation. In: Evolutionary Computation (CEC), 2013 IEEE Congress on. pp. 2833–2839. IEEE (2013)
 Nowak, M.A.: Evolutionary Dynamics: Exploring the Equations of Life. Harvard University Press, Cambridge (2006)
 Nowak, M.A., May, R.M.: Evolutionary games and spatial chaos. Nature 359(6398), 826–829 (1992)
 Olejarz, J., Ghang, W., Nowak, M.A.: Indirect Reciprocity with Optional Interactions and Private Information. Games 6(4), 438–457 (2015)
 Smith, J.M.: Evolution and the theory of games. Cambridge University Press, Cambridge (1982)
 Xia, C.Y., Meloni, S., Perc, M., Moreno, Y.: Dynamic instability of cooperation due to diverse activity patterns in evolutionary social dilemmas. EPL 109(5), 58002 (2015)