Emergent naming of resources in a foraging robot swarm
We investigate the emergence of language conventions within a swarm of robots foraging in an open environment from two identical resources. While foraging, the swarm needs to explore and decide which resource to exploit, moving through complex transitory dynamics towards different possible equilibria, such as the selection of a single resource or a spread across the two. Our point of interest is the understanding of possible correlations between the emergent, evolving, task-induced interaction network and the language dynamics. In particular, our goal is to determine whether the dynamics of the interaction network are sufficient to determine emergent naming conventions that represent features of the task execution (e.g., choice of one or the other resource) and of the environment. In other words, we look for an emergent vocabulary that is both complete (a word for each resource) and correct (no misnomer) for as long as each resource is relevant to the swarm. In this study, robots play two variants of the minimal naming game: the classic one, where words are created when needed, and a new variant we introduce in this article, the spatial minimal naming game, where the creation of words is linked with the discovery of resources by exploring robots. We end the article by proposing a proof-of-concept extension of the spatial minimal naming game that ensures the completeness and correctness of the swarm’s vocabulary.
In swarm robotics, coordination and self-organisation allow a group of robots to reach an efficiency at the collective level that would not be achievable by isolated robots (Trianni:2015if; Dorigo:ew). The resulting collaborative processes are often inspired by social insects and other group-living animals, which provide solutions to complex engineering problems that can be gainfully exploited for the design of distributed robotic systems (Brambilla:2013ja). In swarm robotics, robots can only rely on local sensors and actuators, resulting in local knowledge of their environment and local decision-making processes. To increase their effectiveness as a group, robots can share information and decide conjointly by communicating with each other. Communication can be performed through different modalities. Essentially, it can be either indirect (i.e., stigmergy) or direct. Both types are encountered in social insects, such as the pheromone trails used by ants (beekman2001phase) or the waggle dance used by honey bees (biesmeijer2001explo). These communication mechanisms have been implemented with success in swarm robotics, through indirect stigmergic interactions (Holland:1999uc; beckers2000fom), pheromones (fujisawa2014designing) and direct communication (Gutierrez:2010ea; miletitch2018balancing). While efficient, these communication mechanisms are usually designed for a specific task/environment (e.g., application in warehouses, see vivaldini2010automatic; stiefel2004dist) and convey specific pieces of information, hence limiting the system flexibility to conditions not foreseen at design time.
In order to gain more flexibility, researchers have aimed to add more plasticity to the communication process, for instance by exploiting an evolutionary process to design signals and adapted responses at the same time (marocco2007emergence; floreano2007evolutionary). The resulting communication mechanisms are very well adapted to the tasks and environmental conditions encountered during training, and also show some generalisation abilities. However, the obtained communication mechanisms remain very simple, with few signals and with responses to signals that cannot easily scale up to more complex environments and/or tasks. Another possibility to provide communication abilities to a robotic system comes from models of natural language interactions (wang2005invasion; sole2010diversity; steels1997grounding). The study of language dynamics has attracted the attention of theoretical biologists, physicists and computer scientists, and we believe it can provide several advantages in support of self-organisation among robot swarms (CambierMiletitch2019).
A popular approach to the study of language dynamics is given by language games. Language games are games played between agents/robots, with the purpose of mimicking real-world linguistic interactions leading to the emergence of a structured language. Various kinds of language games have been proposed to date, from imitation games (billard1997learning) to guessing games (steels2001language) and category games (puglisi2008cultural; baronchelli2010modeling). One in particular has received strong attention: the naming game (Steels:1995fn; Steels:2003wj). In this game, two or more robots interact to assign a unique name to each object in a set. At each interaction, one robot is chosen as a speaker and another as a listener. The speaker chooses a referring object and an associated word from its vocabulary (or invents one when no word is available) and then transmits it to the listener. If the listener knows the word, then the game is a success, and both agents remove all other words associated with the chosen object from their vocabulary, keeping only the shared word. If instead the listener does not know the received word, then the game fails, and the listener adds this new word to its vocabulary. We use in our study a specific version of the naming game, the minimal naming game (MNG, see baronchelli2006sharp; baronchelli2016gentle). In this version, focus is given only to reaching consensus on a single word within a population of communicating agents. Specifically, we consider an implementation in which the speaker broadcasts its word to all agents in its neighbourhood, while the listener is the only agent that updates its vocabulary upon success or failure of a game (baronchelli2011role).
The time to achieve consensus and the underlying dynamics are directly linked with the topology of interaction among agents. In non-embodied implementations, the link between topology and language dynamics has been extensively studied (e.g., fully-connected, regular, small-world or random geometric networks, see baronchelli2007role; lu2008naming; Loreto:2011jn). Embodied implementations can be divided into two cases. On the one hand, a population of virtual agents can use a small number of robots (sometimes reduced to two, as in steels1999situated; Spranger:2013iq) to play the naming game, so that at each iteration, agents are selected and assigned to robots in order to record physical interactions among them. On the other hand, the naming game can be played among a population of embodied mobile agents (Baronchelli:2012dq; trianni2016emergence) that interact locally with each other according to a topology of interactions that is the direct result of the mobility pattern of the agents. This leads to more diversity in the agents’ local environment, paving the way for the study of language dynamics in a more realistic context, for instance when agents/robots are in the process of tackling a desired task.
As a matter of fact, the task that robots execute requires a specific pattern of interactions among robots, often leading to a complex interaction network whose properties can vary largely over time. If a language game is played over such a dynamic network, the language dynamics may be severely affected. Our point of interest in this study is the understanding of possible correlations between the emergent, task-induced interaction network and the language dynamics resulting from an embodied version of the MNG. These correlations can be found both in the intrinsic dynamics and outcome of the language game, and in the potential relationship between the words resulting from the MNG and the environment in which the game is played. Some experiments have explored semantic connections between the meaning of words and the physical surroundings of the robots that use them (steels1995self; Spranger:2013iq; steels2006perspective), but, to the best of our knowledge, only a few have attempted to link the MNG with the execution of a specific task (i.e., aggregation as proposed in cambier2017group; cambier2018embodied).
In this study, the MNG is played on top of a self-organised foraging task, which is a behaviour commonly observed in social insects (bailis2010positional; saleh2007traplining) and used in swarm robotics as a metaphor for several concrete application scenarios, such as search and rescue or resource exploitation (mining, fishing, harvesting). When foraging, a swarm needs to explore the environment, identify and evaluate the available resources and make decisions on which resource to exploit, going through different transitory states before reaching an equilibrium (e.g., convergence on one single resource to exploit or split/load-balance among many, as in miletitch2018balancing). Such behaviour provides a complex and time-varying interaction network among robots, which can be exploited to support linguistic interactions among agents. Our main goal is to study whether the dynamics of the interaction network are sufficient to determine language dynamics that represent features of the task execution (e.g., choice of one or the other resource), of the environment (e.g., the presence of more than one resource, each associated with a different word), or both. To this end, we run experiments with two versions of the MNG. Besides the classic MNG, we play a version where the creation of words is linked with the discovery of resources by exploring robots. In this setup, we study how well the robots manage to maintain an accurate description of their surroundings, one that is both complete (a word for each resource) and correct (no misnomer) for as long as each resource is relevant (with respect to the number of robots on each path).
The paper is organised as follows. We begin by laying out our problem description and experimental setup. Then we discuss experimental results, looking at the completeness and correctness of the emerging vocabulary and at how the swarm interaction topology can explain them. Last, we propose a proof of concept in which the swarm evolves a complete dictionary (i.e., a number of different vocabularies to be associated with different resources), leading to a stable identification of the available resources.
2 Experimental setup
2.1 Problem description
In this study, the goal of the swarm is to play a minimal naming game while identifying and exploiting either of two resources (referred to as resource A and resource B) placed on opposite sides of a home area (referred to as the nest, see Figure 1). The environment is a 2D infinite plane without obstacles, and all areas have a circular shape with radius . Each resource is located at the same distance from the nest.
2.2 Robots and simulations
Experiments are run in simulation using ARGoS (Pinciroli:2012dc). In our study, we use this simulator to model a swarm of 50 e-puck robots (Mondada:2009tw). E-pucks have a differential drive motion, and their speed is measured by an encoder. Avoidance of other robots is done at short range () using infrared proximity sensors and at longer range () using the infrared range and bearing system (GutierrezRAB). The obstacle avoidance behaviour has been optimised to minimise the effects of robot density and congestion and the ability to navigate back and forth between resources, as detailed in a previous study (Miletitch:2013bk). Robots perceive nest and resources only when within the corresponding areas, by means of infrared ground sensors, which they use to differentiate between the white colour of the floor, the grey colour of the resources and the black colour of the nest. We assume here that robots have perfect knowledge of the nest location, while resources need to be located through exploration. Robots move at a maximum speed of and can locally broadcast short messages through the infrared range and bearing system within a distance of (indicated by the dotted circle around the robot in Figure 1). Robots broadcast at regular intervals of 0.1 s with no re-broadcast of information received (no multi-hop communication). They keep track of the positions of known areas through odometry. The positioning error produced by this tracking method can be efficiently compensated through social odometry (Gutierrez:2010ea; Miletitch:2013bk). Owing to this, in this study we neglect odometry errors and focus on the interplay between motion and language dynamics.
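As an illustration of how a differential-drive robot can keep track of a known location through odometry, the following minimal sketch integrates wheel speeds into a pose estimate and derives the range and bearing to a remembered point. It is not the controller used in the study: the function names, the `wheel_base` parameter and the simple Euler integration are assumptions made for this example, and odometry noise is ignored, as in the text.

```python
import math

def odometry_step(x, y, theta, v_left, v_right, wheel_base, dt):
    """Advance a differential-drive pose estimate by one control step.

    All parameter names are hypothetical; wheel_base is the distance
    between the two wheels, in the same length unit as the speeds.
    """
    v = 0.5 * (v_left + v_right)             # forward speed of the robot centre
    omega = (v_right - v_left) / wheel_base  # turning rate in rad/s
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    # Wrap the heading to [-pi, pi)
    theta = (theta + omega * dt + math.pi) % (2.0 * math.pi) - math.pi
    return x, y, theta

def vector_to(target_xy, x, y, theta):
    """Range and relative bearing from the estimated pose to a remembered location."""
    dx, dy = target_xy[0] - x, target_xy[1] - y
    bearing = math.atan2(dy, dx) - theta
    bearing = (bearing + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(dx, dy), bearing
```

A committed robot could, for instance, call `vector_to` with the stored resource position to steer back towards it after visiting the nest.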
The robots start from the nest at the beginning of each experiment and perform a blind random walk for the first 200 s during which they do not communicate or search for resources. This allows us to neglect the initial transitory phase and study the system dynamics after the robots spread out within the environment. In the following experiments, unless mentioned otherwise, we perform runs lasting until language convergence for each experimental setup. Depending on internal parameters, it can last up to .
2.3 Individual and collective behaviour
2.3.1 Resource exploitation
The desired swarm behaviour (localisation and exploitation of resources) takes inspiration from the decision-making process displayed by house-hunting honeybees (Pais:2013ek; Seeley108), whose resulting spatial dynamics have been studied in miletitch2018balancing. In our study, we make use of the individual robot behaviour from Reina:2015gs, which was synthesised for the e-puck robots following a design pattern based on the above-mentioned nest-site selection (NSS) behaviour of honeybees (Reina:2015hu). Following this design pattern, a robot is considered to be committed to a resource when it knows its location, and hence moves back and forth between the resource and the nest. Otherwise, a robot is considered uncommitted and explores the arena in search of a resource. Robots committed to resource A (B) are considered to belong to the population (), while uncommitted robots belong to the population , all summing up to robots such that .
Four concurrent processes determine the individual behaviour: discovery of a resource by an uncommitted robot; abandonment of a resource by a committed robot, which turns uncommitted with probability ; recruitment to a discovered resource of an uncommitted robot, which becomes committed with probability ; cross-inhibition between two committed robots, whereby one turns uncommitted with probability (see Reina:2015gs; Reina:2015hu, for more details). The latter introduces a negative feedback loop that helps the system break the symmetry and leads to a collective decision between resources. In our study, discovery of a resource happens when a robot stumbles upon it, while recruitment and cross-inhibition happen only upon communication with other robots when located in the nest. Here, we set the probability of abandonment to zero, so that the only way for robots to change commitment state is through cross-inhibition. This favours the attainment of a consensus state in which all robots within the swarm are committed to the one or the other resource (Reina:2015hu).
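The four processes above can be read as a per-step update of a robot's commitment state. The sketch below is one possible reading, not the controller of Reina:2015gs: the state encoding, the function and parameter names, and the assumption that cross-inhibition is triggered only by a robot committed to the other resource are all choices made for this example.

```python
import random

UNCOMMITTED = None  # hypothetical encoding: None, "A" or "B"

def update_commitment(state, found_resource, heard_in_nest,
                      p_abandon, p_recruit, p_inhibit, rng=random):
    """Return the commitment state after one control step.

    state          -- None (uncommitted), "A" or "B"
    found_resource -- label of the resource the robot is standing on, or None
    heard_in_nest  -- commitment advertised by a robot heard while in the nest, or None
    """
    if state is UNCOMMITTED:
        if found_resource is not None:
            return found_resource              # discovery
        if heard_in_nest is not None and rng.random() < p_recruit:
            return heard_in_nest               # recruitment
        return UNCOMMITTED
    if rng.random() < p_abandon:
        return UNCOMMITTED                     # abandonment (probability zero in this study)
    if (heard_in_nest is not None and heard_in_nest != state
            and rng.random() < p_inhibit):
        return UNCOMMITTED                     # cross-inhibition (assumed: opposite commitment only)
    return state
```

With `p_abandon` set to zero, as in the study, a committed robot in this sketch can only lose its commitment through the cross-inhibition branch.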
The actual movements of the robot are governed by the following basic behaviours. When uncommitted, the robots explore the arena, performing a correlated random walk (Dimidov:2016gp), and have a fixed and small probability at every control step to return to the nest. When committed, the robots enter an exploitation loop where they move back and forth between the known resource and the nest (see Reina:2015gs, for a detailed description).
Depending on the value of and , the swarm displays different dynamics and different final distributions of robots among the populations , and . In this study, we focus on two specific cases: strong cross-inhibition and weak cross-inhibition. In the strong case (, Figure 2 top row) the swarm rapidly converges to the one or the other resource, whereas the weak case (, see Figure 2 bottom row) leads to slower dynamics. While the swarm would end up converging given enough time, over the time span of our experiments this results in the swarm splitting between the two resources (see Figure 2, bottom row). At any time, with or without convergence, we define the resource with the highest number of committed robots as the “selected” resource. We define as the selected resource, and as the non-selected resource.
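The definition of the selected resource amounts to a comparison of the two committed populations. A minimal sketch follows; the function name is hypothetical, and the handling of an exact tie, which the text does not specify, is an assumption.

```python
def selected_resource(n_a, n_b):
    """Return (selected, non_selected) labels from the committed-robot counts.

    Exact ties are not covered by the text; returning (None, None)
    in that case is an assumption of this sketch.
    """
    if n_a == n_b:
        return None, None
    return ("A", "B") if n_a > n_b else ("B", "A")
```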
2.3.2 Minimal naming game
The language game played by the robots in our study is an implementation of the minimal naming game (MNG) for mobile agents/robots (baronchelli2006sharp; Baronchelli:2012dq; Trianni:2016fp). Each robot starts with an empty inventory. At each time step, each robot has a probability of becoming a speaker (here, ). The language game is played as follows: the speaker robot selects a word from its inventory (in our study, a one-dimensional inventory) and broadcasts it to its neighbours. At each time step, if a robot receives at least one message, it becomes a hearer robot. The hearer selects one (and only one) word at random among those received and checks it against its own inventory. If the hearer finds the selected word in its inventory, it keeps only that word and deletes all the others. If instead the hearer does not find the selected word in its inventory, it updates its inventory by adding the word (see Trianni:2016fp, for more details).
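The speaker and hearer rules described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: inventories are modelled as Python sets, the function names are hypothetical, and the uniform choice of the spoken word is an assumption (the text only says that the speaker selects a word). Word creation is deliberately left out of this sketch.

```python
import random

def speaker_word(inventory, p_speak, rng=random):
    """Return the word to broadcast this step, or None if the robot stays silent."""
    if inventory and rng.random() < p_speak:
        return rng.choice(list(inventory))  # assumed: uniform choice among known words
    return None

def hearer_update(inventory, received, rng=random):
    """Apply the hearer rule in place to `inventory` (a set of words).

    `received` is the list of words heard during the current time step.
    """
    if not received:
        return inventory                    # nothing heard: not a hearer this step
    word = rng.choice(list(received))       # one, and only one, received word
    if word in inventory:
        inventory.intersection_update({word})  # success: keep only that word
    else:
        inventory.add(word)                    # failure: add the new word
    return inventory
```

Only the hearer's inventory changes here, which mirrors the broadcast variant adopted in the text, where the speaker does not update its vocabulary.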
In this study, we consider two variants of the MNG, which differ in the way in which words are generated. In one case (referred to as the classic game), the robots create a new word when becoming speaker with an empty vocabulary. In the other (referred to as the spatial game), the robots create a new word when entering a resource with an empty vocabulary. In both cases, we associate each word with the closest resource to the robot at the time of the word creation, and we define () the set of words associated with resource (). Note that, by construction, . Robots having in their inventory any word () constitute population (). Robots with no words constitute population . In Figure 3, we depict the possible partition of robots among different populations, both with respect to the commitment state and to their vocabulary. Since a robot can have at a given time an inventory with words originating from both resource and , the property does not always hold. Similarly, through exchanges of words and robots between the different populations, at a given time the inventory of robots committed to one resource might contain a word associated with the other resource (resulting in ). At any time, we can look at the population of robots that know words associated with the resource they are committed to, that is:
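To make the difference between the two variants concrete, the sketch below creates a word when the variant's trigger fires on a robot with an empty inventory, and records the closest resource as the word's provenance. The names (`provenance`, `maybe_create_word`, the `"classic"`/`"spatial"` strings) and the use of integer word identifiers are assumptions for this example only.

```python
import itertools
import math

_word_ids = itertools.count()  # hypothetical source of unique word identifiers
provenance = {}                # word -> "A" or "B": closest resource at creation time

def closest_resource(position, resources):
    """Label of the resource whose centre is nearest to `position`."""
    return min(resources, key=lambda label: math.dist(position, resources[label]))

def maybe_create_word(inventory, position, resources, variant,
                      is_speaker, on_resource):
    """Create a word if the variant's trigger fires and the inventory is empty.

    variant -- "classic": trigger is becoming a speaker;
               "spatial": trigger is entering a resource.
    Returns the new word, or None if no word was created.
    """
    trigger = is_speaker if variant == "classic" else on_resource
    if inventory or not trigger:
        return None
    word = next(_word_ids)
    provenance[word] = closest_resource(position, resources)
    inventory.add(word)
    return word
```

In the spatial variant the recorded provenance is necessarily the resource the robot is standing on, whereas in the classic variant it depends on where the robot happens to be when it first speaks.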
Conversely, we can define the population of committed robots that know words from a non-matching resource:
Corresponding to the selected resource , we define the set of matching words and non-matching words as follows:
polarisation: the condition in which committed robots know only words associated with the resource they are committed to, that is, when ;
vocabulary matching: the condition in which only words associated with the selected resource are retained within the swarm vocabulary, that is and
vocabulary completeness: the condition in which at least one word associated with each resource is retained within the swarm vocabulary, that is and
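The three conditions can be expressed as predicates over the swarm state. The sketch below follows the verbal definitions only, since it does not rely on the set notation of the text: each robot is a hypothetical `(commitment, inventory)` pair, `provenance` maps each word to its resource of origin, and requiring a non-empty vocabulary for matching is an assumption.

```python
def swarm_vocabulary(robots):
    """Union of all inventories; each robot is a (commitment, inventory) pair."""
    return set().union(*(inv for _, inv in robots))

def is_polarised(robots, provenance):
    """Committed robots know only words from the resource they are committed to."""
    return all(provenance[w] == commitment
               for commitment, inv in robots if commitment is not None
               for w in inv)

def is_matching(robots, provenance, selected):
    """Only words associated with the selected resource remain in the swarm."""
    vocab = swarm_vocabulary(robots)
    # Assumption: an empty vocabulary does not count as matching.
    return bool(vocab) and all(provenance[w] == selected for w in vocab)

def is_complete(robots, provenance, labels=("A", "B")):
    """At least one word associated with each resource remains in the swarm."""
    origins = {provenance[w] for w in swarm_vocabulary(robots)}
    return all(label in origins for label in labels)
```

Note that, under these definitions, matching and completeness are mutually exclusive whenever both resources have produced words.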
Given a sufficiently connected swarm, the MNG dynamics ensure that the swarm will eventually converge to a final single-word vocabulary, albeit after a very long time (baronchelli2006sharp; Baronchelli:2012dq; Trianni:2016fp). According to the previous definitions, this final vocabulary may or may not match the selected resource.
3 Matching and completion of vocabulary
In this section, we focus on the evolution of the swarm’s vocabulary, looking in particular at the provenance of the last words and their relation to the selected resource. As already discussed (see Figure 2), the foraging dynamics lead either to the quick selection of a single resource, or to the swarm being split between the two, possibly for a long time. This means that, apart from a few cases and random fluctuations, there will always be a resource that is selected, albeit temporarily, by the swarm. In any case, interactions among different populations of robots are frequent, ensuring that the language dynamics always converge to a single-word vocabulary.
The complex interplay between the foraging and the language dynamics makes it difficult to observe a clear emergence of vocabulary matching or completion during a run. It is possible that matching is achieved at some point, but the frequent interactions among sub-populations through the exchange of robots and words (as depicted in Figure 3) make the analysis of the transitory phases complex. Considering that the MNG guarantees convergence to a single-word vocabulary, we analyse the provenance of the final word to determine if this word matches the selected resource or not (i.e., ). As the distribution of robots among sub-populations may sometimes change even after convergence to a single-word dictionary (e.g., if the language dynamics are much faster than the resource selection dynamics), the final selected resource may also change, hence we consider as matching with the resource selected at the time of convergence, no matter what happens later to the population distribution. Similarly, we also consider the second-last word , to determine whether it also matched the selected resource at the time at which only two words remained within the whole swarm. Given such definitions, every run can end up in one of the following four possibilities:
In case or is observed, the swarm has identified a final word that matches the currently-selected resource, although in the case the second-last word was associated with the non-selected resource. The case represents a missed opportunity of matching, as a matching word still existed in the vocabulary and could have been chosen. The case instead suggests that the association of words to resources does not reflect the current state of the resource selection. Both middle cases ( and ) indicate a complete vocabulary up until convergence on one word.
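The classification of a run into one of the four end states can be sketched as a small function. The two-letter labels use 'O' for a matching word and 'X' for a non-matching one; the letter order used here (second-last word first, final word second) and all names are assumptions of this example, not notation confirmed by the text.

```python
def end_state(last_word, second_last_word, provenance,
              selected_at_convergence, selected_at_two_words):
    """Return a two-letter end-state label for one run.

    Each word is compared with the resource that was selected at the
    relevant time: convergence for the final word, and the moment only
    two words remained for the second-last word.
    Assumed letter order: second-last word, then final word.
    """
    second = "O" if provenance[second_last_word] == selected_at_two_words else "X"
    last = "O" if provenance[last_word] == selected_at_convergence else "X"
    return second + last
```

Counting these labels over many runs would give a histogram over the four end states.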
The histograms of the vocabulary’s end state shown in Figure 4 highlight the impact of the spatial link between resources and the creation of their associated words on the provenance of the final two words. When playing the classic game (top row of Figure 4), the swarm shows no tendency to favor a specific provenance for the final two words. This results in a histogram displaying the characteristics of a uniform distribution. On the other hand, when playing the spatial game, the swarm favors matching words, for both the last and the second-to-last word. In particular, the OO end state is strongly favored for both cross-inhibition values, and the XX end state is especially disfavored in the slower dynamics of weak cross-inhibition.
The notion of provenance of a word in Figure 4 loses its relevance as the swarm’s distribution grows closer to a 50/50 split. In those cases, the swarm does not clearly favor the exploitation of one resource, to the point of possibly changing its selected resource over time. Such distributions are present in this study for weak cross-inhibition () (as seen in the previous section in Figure 2), or even for strong cross-inhibition when high values of assure a quick convergence of the vocabulary.
The influence of the variations in the distribution of the swarm on the provenance of the last two words of its vocabulary is displayed in Figure 5. Different variations have little to no impact when the swarm is playing the classic game (top row). The results with the other values of reinforce this point (cf. Figure **?** in supplementary material). On the other hand, in the spatial game case, the more the swarm converges on one resource, the more likely the last two words are to match this selected resource. If we focus on the and cases, representing vocabulary completion, we see that given slow enough physical dynamics (in the case of weak cross-inhibition), the spatial game retains a complete vocabulary for longer than the classic game, especially when the swarm is close to a split state.
Interestingly, when the swarm mostly converges (the 0%-10% bar), resulting in a matching distribution for both weak and strong cross-inhibition, the spreads of word provenance are dissimilar. A weaker cross-inhibition results in slower spatial dynamics and in more matching end words. This can be explained by the earlier exchange of words between sub-populations in the case of , discussed in the next section.
There are two ways for the swarm to reach convergence on a final word. Either the swarm converges as a whole, homogeneously, on this final word; or each sub-population first converges toward a word, followed by a competition between these two words resulting in the final word.
3.1 Impact of polarisation on completion of the vocabulary
4 A study of the swarm’s spatial characteristics
The explanation of the previous sections’ results lies in the spatial characteristics of the swarm, a direct consequence of the exploitation task. How robots exchange branches and how their communication is impacted by the swarm’s distribution influence both the classic and spatial games, while only the latter is influenced by the link between resource position and word creation.
4.1 Impact of spatial word creation
At the beginning of the experiment, the robots start in an uncommitted state and create words by themselves. In both classic and spatial games, this specific rate of discovery (red saturated curve in Figure 6) falls over time as robots get committed or start receiving messages from other robots. While constant in the spatial game, its starting peak grows with the probability of speaking when robots play the classic game. This rate (especially for higher values of ) prevents any robot from getting committed, which results in a rate of creation of words in uncommitted robots through reception that is shifted in time with respect to the previous rate. Committed robots only come into play for low values of .
In the case of the spatial game, slow language dynamics give robots enough time to become recruited before receiving any messages, ensuring a strong rate of self-creation of words by committed robots (red faded curve). As the value of rises, robots are quicker to communicate and less likely to be already committed. This leads to a fall of the previously mentioned rate, replaced by the rate of creation of words by uncommitted robots through communication (green saturated curve). While higher values of ensure overall quicker dynamics, lower values ensure that robots on each path start with a word that corresponds to the resource they are exploiting.
4.2 Topology of the swarm
Exchanges of words between the two sub-populations happen either when (i) a word is broadcast by a robot from one sub-population to a robot from the other sub-population, or when (ii) a word is carried by a robot switching from one sub-population to the other.
When broadcasting, robots communicate within their neighborhood, defined as the group of robots in their surroundings and denoted . How populated this neighborhood is depends on the position of the speaker robot in the arena, which sub-population it is part of, and the overall distribution of the swarm.
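A robot's neighborhood, and its split between the robot's own sub-population and the rest of the swarm, can be computed from positions and commitments as in the sketch below. The names and the `comm_range` parameter are hypothetical, the communication range is taken as a plain Euclidean threshold, and counting uncommitted robots as "other" for a committed speaker is an assumption.

```python
import math

def neighbours(index, positions, comm_range):
    """Indices of the robots within `comm_range` of robot `index` (itself excluded)."""
    here = positions[index]
    return [j for j, p in enumerate(positions)
            if j != index and math.dist(here, p) <= comm_range]

def neighbour_counts(index, positions, commitments, comm_range):
    """Neighborhood size split into (same sub-population, other) counts."""
    mine = commitments[index]
    same = other = 0
    for j in neighbours(index, positions, comm_range):
        if commitments[j] == mine:
            same += 1
        else:
            other += 1  # assumed: any different commitment state counts as "other"
    return same, other
```

Accumulating these pairs over all robots and time steps would give the kind of neighborhood-size statistics discussed in the remainder of this section.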
In order to understand how the distribution of the swarm impacts the size of each robot’s neighborhood, we ran experiments with locked-size sub-populations. The experimental setup is detailed in the supplementary materials (Algorithm 1) and results are displayed in Figure 7.
While most robots have no contact with robots from another sub-population (bottom left), the more split the swarm is, the more those contacts happen. It is interesting to note that even with these few contacts, one sub-population can still impose its vocabulary on the other. On the other hand, communications within sub-populations grow with their size (top right). Two clouds are apparent on these heatmaps, each corresponding to one sub-population, one growing in size, the other shrinking, both finally merging when the swarm is split equally. Last, as robots mostly interact within their own sub-population, the heatmaps for interactions over the whole swarm are similar to the ones describing interactions within sub-populations (especially when the swarm is close to a converging state).
In order to understand how these results compare to a more classical network, we ran the same experiment with an exploring swarm of randomly walking robots. The bottom right of Figure 7 compares such results with cross sections of the previous heatmaps. While most of the robots have no neighbours in the context of the random walk (because of our short communication radius), our experiments result in an average number of neighbours between 1 and 3, most of them within the same sub-population.
Following these theoretical results, we measured the effective load of the exchange of words between robots and of the exchange of robots between sub-populations (Figure 8). Overall, it confirms that robots speak more frequently with robots from the same sub-population, and shows how this frequency evolves with the size of the sub-population they are connected to. These frequencies of speaking scale with the different values of the probability of speaking while not changing in overall dynamics (confirmed for the remaining values of in the supplementary material Figure **?**).
On the other hand, exchanges of robots are independent of this probability of speaking, and hence are negligible for all but low values of . In order to understand how different an exchange of robots and an exchange of messages would be, we ran the same experiment with , forcing robots to commit to a sub-population for the whole experiment. Similar results were found (cf. supplementary Figure **?**), implying that the exchanges of robots between sub-populations do not introduce a fundamentally different social process in this specific language setup.
Besides this exchange of robots, weak and strong cross-inhibition (respectively bottom and top of Figure 8) mainly differ in their transitory dynamics. When the cross-inhibition is weak, we first have the recruitment into the two sub-populations, and then the slow convergence toward one. These two sub-dynamics introduce an angle in the curve at . When the cross-inhibition is stronger, both dynamics take effect concurrently.