Intrinsically Motivated Discovery of Diverse Patterns in Self-Organizing Systems
Abstract
In many complex dynamical systems, artificial or natural, one can observe self-organization of patterns emerging from local rules. Cellular automata, like the Game of Life (GOL), have been widely used as abstract models enabling the study of various aspects of self-organization and morphogenesis, such as the emergence of spatially localized patterns. However, findings of self-organized patterns in such models have so far relied on manual tuning of parameters and initial states, and on the human eye to identify “interesting” patterns. In this paper, we formulate the problem of automated discovery of diverse self-organized patterns in such high-dimensional complex dynamical systems, as well as a framework for experimentation and evaluation. Using a continuous GOL as a testbed, we show that recent intrinsically-motivated machine learning algorithms (POP-IMGEPs), initially developed for learning of inverse models in robotics, can be transposed and used in this novel application area. These algorithms combine intrinsically-motivated goal exploration and unsupervised learning of goal space representations. Goal space representations describe the “interesting” features of patterns for which diverse variations should be discovered. In particular, we compare various approaches to define and learn goal space representations from the perspective of discovering diverse spatially localized patterns. Moreover, we introduce an extension of a state-of-the-art POP-IMGEP algorithm which incrementally learns a goal representation using a deep auto-encoder, and the use of CPPN primitives for generating initialization parameters. We show that it is more efficient than several baselines and as efficient as a system pre-trained on a hand-made database of patterns identified by human experts. Source code and videos at https://automateddiscovery.github.io/
1 Introduction
Self-organization of patterns that emerge from local rules is a pervasive phenomenon in natural and artificial dynamical systems (ball1999self). It ranges from the formation of snowflakes and of spots and rays on animals’ skin to spiral galaxies. Understanding these processes has boosted progress in many fields, ranging from physics to biology (camazine2003self). This progress has relied heavily on the use of powerful and rich abstract computational models of self-organization (kauffman1993origins). For example, cellular automata like Conway’s Game of Life (GOL) have been used to study the emergence of spatially localized patterns (SLPs) (gardener1970mathematical), informing theories of the origins of life (gardener1970mathematical; beer2004autopoiesis). SLPs, such as the famous glider in GOL (gardner1983wheels), are self-organizing patterns that have a local extension and can exist independently of other patterns. However, finding such novel self-organized patterns, and mapping the space of possible emergent patterns, has so far relied heavily on manual tuning of parameters and initial states. Moreover, the dependence of this exploration process on the human eye to identify “interesting” patterns strongly limits further advances.
We formulate here the problem of automated discovery of a diverse set of self-organized patterns in such high-dimensional, complex dynamical systems. This involves several challenges. A first challenge consists in determining a representation of patterns, possibly through learning, that makes it possible to incentivize the discovery of diverse “interesting” patterns. Such a representation guides exploration by providing a measure of (dis)similarity between patterns. This problem is particularly challenging in domains where patterns are high-dimensional, as in GOL. In such cases, scientists have limited intuition about what the useful features are and how to represent them. Moreover, low-dimensional representations of patterns are needed for human browsing and visualization of the discoveries. Representation learning should both guide exploration and be fed by self-collected data.
A second challenge consists in automating the exploration of high-dimensional, continuous initialization parameters so that “interesting” patterns, such as SLPs, are discovered efficiently with a limited budget of experiments. Sample efficiency is important to enable the later use of such discovery algorithms for physical systems (grizou2019exploration), where experimental time and costs are strongly bounded. For example, in the continuous GOL used in this paper as a testbed, initialization consists in determining the values of a real-valued, high-dimensional matrix in addition to 7 dynamics parameters. The space of possible variations of this matrix is too large for simple random sampling to be efficient. More structured methods are needed.
To address these challenges, we propose to leverage and transpose recent intrinsically motivated learning algorithms, within the family of population-based Intrinsically Motivated Goal Exploration Processes (POP-IMGEPs, denoted simply as IMGEPs below; baranes2013active; pere2018unsupervised). They were initially designed to enable autonomous robots to explore and learn what effects can be produced by their actions, and how to control these effects. IMGEPs self-define goals in a goal space that represents important features of the outcomes of actions, such as the position reached by an arm. This allows the discovery of diverse novel effects within their goal representations. It was recently shown how deep neuronal auto-encoders enable unsupervised learning of goal representations in IMGEPs from raw pixel perception of a robot’s visual scene (laversanne2018curiosity). We propose to use a similar mechanism for automated discovery of patterns by unsupervised learning of a low-dimensional representation of features of self-organized patterns. This removes the need for human expert knowledge to define such representations.
Moreover, a key ingredient for sample-efficient exploration of IMGEPs for robotics has been the use of structured motion primitives to encode the space of body motions (pastor2013dynamic). We propose to use a similar mechanism to handle the generation of structured initial states in GOL-like complex systems, based on specialized recurrent neural networks (CPPNs) (stanley2006exploiting).
In summary, we provide in this paper the following contributions:

We formulate the problem of automated discovery of diverse self-organized patterns in high-dimensional and complex game-of-life types of dynamical systems.

We show how to transpose POP-IMGEP algorithms to address the associated joint challenge of (learning to) represent interesting patterns and discovering them in a sample-efficient manner.

We compare various approaches to define or learn goal space representations for the sample-efficient discovery of diverse SLPs in a continuous GOL testbed.

We show that an extension of a state-of-the-art POP-IMGEP algorithm, with incremental learning of a goal space using a deep auto-encoder, is as efficient as a system pre-trained on a hand-made database of patterns.
2 Related Work
Automated Discovery in Complex Systems
Automated processes have been widely used to explore complex dynamical systems. For example, evolutionary algorithms have been applied to search for specific patterns or rules of cellular automata (mitchell1996evolving; sapin2003research). However, their objective is to optimize a specific goal instead of discovering a diversity of patterns. Another line of work uses active inquiry-based learning strategies, which select the experiments to perform in order to improve a system model, i.e. a mapping from parameters to the system outcome. Such strategies have been used in biology (king2004functional; king2009automation), chemistry (raccuglia2016machine; reizman2016suzuki; duros2017human) and astrophysics (richards2011active). However, these approaches have relied on expert knowledge and focused on automated optimization of a predefined target property. Here, we are interested in automatically discovering and mapping a diversity of unseen patterns without prior knowledge of the system. An exception is the concurrent work of grizou2019exploration, which showed how a simple POP-IMGEP algorithm could be used to automate discovery of diverse patterns in oil-droplet systems. However, it used a low-dimensional input space and a hand-defined low-dimensional representation of goal spaces, which was identified as a major limit of the system.
Intrinsically motivated learning
Intrinsically-motivated learning algorithms (baldassarre2013intrinsically; baranes2013active) autonomously organize an agent’s exploration curriculum in order to efficiently discover a maximally diverse set of outcomes the agent can produce in an unknown environment. They are inspired by the way children self-develop open repertoires of skills and learn world models. Intrinsically Motivated Goal Exploration Processes (IMGEPs) (baranes2013active; forestier2017intrinsically) are a family of curiosity-driven algorithms developed to allow efficient exploration of high-dimensional complex real-world systems. Population-based versions of these algorithms, which leverage episodic memory, hindsight learning, and structured dynamic motion primitives to parameterize policies, enable sample-efficient acquisition of high-dimensional skills in real-world robots (forestier2017intrinsically; rolf2010goal). Recent work (laversanne2018curiosity; pere2018unsupervised) studied how to automatically learn the goal representations with the use of deep variational auto-encoders. However, training was done passively and at an early stage on a precollected set of available observations. Recent approaches (nair2018visual; pong2019skew) introduced online training of VAEs to learn the important features of a goal space, similar to the methods in this paper. However, these approaches focused on the problem of sequential decisions in MDPs, incurring a cost on sample efficiency. This problem is observed in various intrinsically motivated RL approaches (bellemare2016unifying; burda2018exploration). These approaches are orthogonal to the automated discovery framework considered here, in which experiments are independent, allowing the use of memory-based sample-efficient methods. A related family of algorithms in evolutionary computation is novelty search (lehman2008exploiting) and quality-diversity algorithms (pugh2016quality), which can be formalized as special kinds of population-based IMGEPs.
Representation learning
We use representation learning methods to autonomously learn goal spaces for IMGEPs. Representation learning aims at finding low-dimensional explanatory factors representing high-dimensional input data (bengio2013representation). It is a key problem in many areas in order to understand the underlying structure of complex observations. Many state-of-the-art methods (tschannen2018recent) have built on top of deep variational auto-encoders (VAEs) (kingma2013auto), using varying objectives and network architectures. However, studies of the interplay between representation learning and autonomous data collection through exploration of an environment have been limited so far.
3 Algorithmic Methods for Automated Discovery
3.1 Population-based Intrinsically Motivated Goal Exploration Processes
An IMGEP is an algorithmic process generating a sequence of experiments to explore the parameters of a system by targeting self-generated goals (Fig. 1). It aims to maximize the diversity of observations from that system within a budget of experiments. In population-based IMGEPs, an explicit memory of the history of experiments and observations is used to guide the process.
The systems are defined by three components. A parameter space corresponding to the controllable system parameters . An observation space where an observation is a vector representing all the signals captured from the system. For this paper, the observations are a time series of images which depict the morphogenesis of activity patterns. Finally, an unknown environment dynamic : which maps parameters to observations.
To explore a system, an IMGEP uses a goal space that represents relevant features of its observations, computed using an encoding function . For the exploration of patterns, such features may describe their form or extension. The exploration process iterates times through: 1) sample a goal from a goal sampling distribution ; 2) infer corresponding parameter using a parameter sampling policy ; 3) roll out an experiment with , observe the outcome , compute encoding ; 4) store in history . Because the sampling of goals and parameters depends on a history of explored parameters, an initial set of parameters is randomly sampled and explored before the intrinsically motivated goal exploration process starts.
Different goal and parameter sampling mechanisms can be used within this architecture (baranes2013active; forestier2016modular). In the experiments below, parameters are sampled by 1) given a goal, selecting the parameter from the history whose corresponding outcome is most similar in the goal space; 2) then mutating it by a random process. The goal sampling policy is a uniform distribution over a hypercube in chosen to be large enough to bias exploration towards the frontiers of known goals to incentivize diversity.
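The loop described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function name `run_imgep`, its arguments, and the toy system in the usage example are all hypothetical, and goals are assumed to be plain lists of floats compared with the Euclidean distance.

```python
import math
import random


def run_imgep(system, encode, sample_params, mutate, goal_low, goal_high,
              n_init, n_total, rng):
    """Population-based goal exploration; returns the history of experiments."""
    history = []  # entries: (parameters, observation, reached goal)

    def explore(params):
        observation = system(params)
        history.append((params, observation, encode(observation)))

    # Bootstrap the history with randomly sampled parameters.
    for _ in range(n_init):
        explore(sample_params(rng))

    for _ in range(n_total - n_init):
        # 1) Sample a goal uniformly from a hypercube in goal space.
        goal = [rng.uniform(lo, hi) for lo, hi in zip(goal_low, goal_high)]
        # 2) Select the stored parameters whose reached goal is closest
        #    to the sampled goal, then mutate them.
        nearest = min(history, key=lambda entry: math.dist(entry[2], goal))
        # 3) and 4) Run the experiment and store the result.
        explore(mutate(nearest[0], rng))
    return history


# Toy usage with a hypothetical 2-D system (an assumption for illustration).
rng = random.Random(0)
history = run_imgep(
    system=lambda p: [p[0] ** 2, p[0] * p[1]],
    encode=lambda o: o,
    sample_params=lambda r: [r.uniform(-1, 1), r.uniform(-1, 1)],
    mutate=lambda p, r: [v + r.gauss(0, 0.1) for v in p],
    goal_low=[-2, -2], goal_high=[2, 2],
    n_init=10, n_total=50, rng=rng,
)
```

Sampling goals from a hypercube larger than the region reached so far means that many goals lie outside it, so the nearest stored experiment is often one at the frontier, which is what pushes exploration outward.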
3.2 Online Learning of Goal Spaces with Deep Auto-Encoders
For IMGEPs, the definition of the goal space and its corresponding encoder is critical, because it biases exploration of the target system. One approach is to define a goal space by selecting features manually, for example by using computer vision algorithms to detect the positions of a pattern and its form. The diversity found by the IMGEPs will then be biased along these predefined features. A limit of this approach is its requirement of expert knowledge to select helpful features, which is particularly problematic in environments where experts do not know in advance what features are important, or how to formulate them.
Another approach is to learn goal space features by unsupervised representation learning. The aim is to learn a mapping from the raw sensor observations to a compact latent vector . This latent mapping can be used as a goal space where a latent vector is interpreted as a goal.
Previous IMGEP approaches have already successfully learned their goal spaces with variational auto-encoders (VAEs) (laversanne2018curiosity; pere2018unsupervised). However, the goal spaces were learned before the start of the exploration from a pre-recorded dataset of observations from the target environment. During the exploration, the learned representations were kept fixed. A problem with this pre-training approach is that it limits the possibilities to discover novel patterns beyond the distribution of pre-training examples, and requires expert knowledge.
In this paper we attempt to address this problem by continuously adapting the learned representation to the novel observations encountered during the exploration process. For this purpose, we propose an online goal space learning IMGEP (IMGEP-OGL), which learns the goal space incrementally during the exploration process (Algorithm 1). The training procedure of the VAE is integrated in the goal sampling exploration process by first initializing the VAE with random weights (Appendix E). The VAE network is then trained every explorations for epochs on the observations collected in the history . Importance sampling is used to give more weight to recently discovered patterns.
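One simple way to give more weight to recent discoveries when assembling a training set is sketched below. The exact weighting scheme is not given in this section, so the split used here (a fixed fraction of samples drawn from the most recent observations, the rest from older ones) is an assumption, as are the function name and all argument names.

```python
import random


def build_training_set(history, n_recent, n_samples, recent_fraction, rng):
    """Sample observations, over-representing the most recent ones."""
    recent = history[-n_recent:]
    older = history[:-n_recent] or recent  # fall back if history is short
    n_from_recent = round(n_samples * recent_fraction)
    # Draw with replacement from each pool (hypothetical weighting scheme).
    batch = rng.choices(recent, k=n_from_recent)
    batch += rng.choices(older, k=n_samples - n_from_recent)
    rng.shuffle(batch)
    return batch


# Usage: integers stand in for stored observations, newest last.
observations = list(range(100))
batch = build_training_set(observations, n_recent=10, n_samples=40,
                           recent_fraction=0.5, rng=random.Random(0))
```

In an online setting, a function like this would be called each time the auto-encoder is retrained, so that the representation keeps adapting to newly found patterns without forgetting older ones.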
3.3 Structuring the parameter space in IMGEPs: from DMPs to CPPNs
The initial state plays a key role in the generation of patterns in dynamical systems. IMGEPs sample these initial states and apply random perturbations to them during the exploration. For the experiments in this paper this state is a two-dimensional grid with cells. Directly performing a random sampling of the grid cells results in initial patterns that resemble white noise. Such random states result mainly in the emergence of global patterns that spread over the whole state space, complicating the search for spatially localized patterns. This effect is analogous to a similar problem in the exploration of robot controllers: direct sampling of actions for individual actuators at a microscopic time scale is usually inefficient. A key ingredient for sample-efficient exploration has been the use of structured primitives (dynamic motion primitives, DMPs) to encode the space of possible body motions (pastor2013dynamic).
We solved the sampling problem for the initial states by transposing the idea of structured primitives. Indeed, “actions” consist here in deciding the parameters of an experiment, including the initial state. We propose to use compositional pattern producing networks (CPPNs) (stanley2006exploiting) to produce structured initial patterns, similar to DMPs. CPPNs are recurrent neural networks that allow the generation of structured initial states (Appendix B, Fig. 9). The CPPNs are used as part of the parameters . They are defined by their network structure (number of neurons, connections between neurons) and their connection weights. They include a mechanism for random mutation of the weights and structure. The number of parameters in is therefore not fixed (yet starts small) and open-ended.
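To illustrate why such networks produce structured rather than noise-like initial states, the following sketch builds a tiny CPPN-style function that maps grid coordinates to an activity value. It is a simplification under stated assumptions: the topology is fixed and feed-forward, there is no structural mutation, and the names `make_cppn` and `render` as well as the choice of inputs and activation functions are hypothetical.

```python
import math
import random


def make_cppn(rng, n_hidden=4):
    """Return a random function mapping (x, y) to an activity in (0, 1)."""
    activations = [math.sin, math.tanh, lambda z: math.exp(-z * z)]
    hidden = [
        (rng.choice(activations), [rng.gauss(0, 2) for _ in range(4)])
        for _ in range(n_hidden)
    ]
    out_weights = [rng.gauss(0, 2) for _ in range(n_hidden)]

    def cppn(x, y):
        inputs = [x, y, math.hypot(x, y), 1.0]  # coordinates, radius, bias
        units = [f(sum(w * v for w, v in zip(ws, inputs))) for f, ws in hidden]
        z = sum(w * u for w, u in zip(out_weights, units))
        return 1.0 / (1.0 + math.exp(-z))  # squash to (0, 1)

    return cppn


def render(cppn, size):
    """Evaluate the network on a size x size grid over [-1, 1]^2."""
    coords = [-1 + 2 * i / (size - 1) for i in range(size)]
    return [[cppn(x, y) for x in coords] for y in coords]


grid = render(make_cppn(random.Random(1)), size=8)
```

Because each cell value is a smooth function of its coordinates, neighbouring cells receive similar values, and a small change to one weight alters the whole pattern coherently instead of flipping isolated cells.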
4 Experimental methods
We describe here the continuous Game of Life (Lenia) that we use as a testbed representing a large class of high-dimensional dynamical systems, as well as the experimental procedures, the evaluation methods used to measure diversity and detect SLPs, and the algorithmic baselines and ablations used.
4.1 Continuous Game of Life as a testbed
[Figure 2, time-step panels: t=1, t=50, t=100, t=200.]
Lenia (chan2018lenia) is a continuous cellular automaton (wolfram1983statistical) similar to Conway’s Game of Life (gardener1970mathematical). Lenia, in particular, represents a high-dimensional complex dynamical system where diverse visual structures can self-organize and yet are hard to find by manual exploration. It features the richness of Turing-complete game-of-life models. It is therefore well suited to test the performance of pattern exploration algorithms for unknown and complex systems. The fact that GOL models have been used widely to study self-organization in various disciplines, ranging from physics to biology and economics (bak1989self), also supports their generality and the potential for reuse of our approach for discovery in other computational or wet high-dimensional systems.
Lenia consists of a two-dimensional grid of cells where the state of each cell is a real-valued scalar activity . The state of cells evolves over discrete time steps (Fig. 2, a). The activity change is computed by integrating the activity of neighbouring cells. Lenia’s behavior is controlled by its initial pattern and several settings that control the dynamics of the activity change (). Appendix A describes Lenia and its parameters in detail.
Lenia can be understood as a self-organizing morphogenetic system. Its parameters for the initial pattern and dynamics control determine the development of morphological patterns. Lenia can produce diverse patterns with different dynamics (stable, non-stable or chaotic). Most interestingly, spatially localized coherent patterns whose shapes resemble microscopic animals can emerge (Fig. 2, b, c). These pattern types, which we will denote “animals” as a short name, are a key reason scientists have used GOL models to study theories of the origins of life (gardener1970mathematical; beer2004autopoiesis). Therefore, in our evaluation method based on measures of diversity (see below), we will in particular study the performance of IMGEPs, and the impact of using various approaches for goal space representation, on finding a diversity of animal patterns. We implemented for this purpose different pattern classifiers to analyze the exploration results (Appendix A.2). Initially we differentiate between dead and alive patterns. A pattern is dead if the activity of all cells is either or . Alive patterns are separated into animals and non-animals. Animals are connected areas of positive activity which are finite, i.e. which do not infinitely cross several borders. All other patterns are non-animals, whose activity usually spreads over the whole state space.
4.2 Evaluation based on the diversity of Patterns
The algorithms are evaluated based on the diversity of the patterns they discover. Diversity is measured by the spread of the exploration in an analytic behavior space. This space is externally defined by the experimenter, as in previous evaluation approaches in the IMGEP literature. For example, in pere2018unsupervised the diversity of discovered effects of a robot that manipulates objects is measured by binning the space of object positions and counting the number of bins discovered. A difference here is that the experimenter does not have access to an easily interpretable hand-defined low-dimensional representation of possible patterns, equivalent to the Cartesian coordinates of rigid objects. The space of raw observations , i.e. the final Lenia patterns , is also too high-dimensional for a meaningful measure of spread in it. We therefore constructed an external evaluation space. First, a latent representation space was built by training a VAE to learn the important features over a very large dataset of Lenia patterns identified during the many experiments over all evaluated algorithms. This large dataset covers a diversity of patterns orders of magnitude larger than what could be found in any single algorithm experiment, whose experimental budget was orders of magnitude smaller. We then augmented that space by concatenating hand-defined features (the same as for the HGS algorithm). See Appendix C for more information.
For each experiment, all explored patterns were projected into the analytic behavior space. The diversity of the patterns is then measured by discretizing the space into bins of equal size by splitting each dimension into sections (results were found to be robust to the number of bins per dimension, see Appendix C). This results in bins. The number of bins in which at least one explored entity falls is used as a measure of diversity.
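The bin-counting measure can be sketched as follows. The function and argument names are hypothetical, and the handling of values outside the given range (clamping them into the border bins) is an assumption, since the section does not say how such values are treated.

```python
def count_occupied_bins(points, lows, highs, bins_per_dim):
    """Number of distinct grid cells that contain at least one point."""
    occupied = set()
    for point in points:
        cell = []
        for value, lo, hi in zip(point, lows, highs):
            index = int((value - lo) / (hi - lo) * bins_per_dim)
            # Clamp so that out-of-range values fall into the border bins.
            cell.append(min(max(index, 0), bins_per_dim - 1))
        occupied.add(tuple(cell))
    return len(occupied)


# Usage: four points in a 2-D space split into 7 sections per dimension.
points = [[0.05, 0.05], [0.06, 0.07], [0.95, 0.5], [1.0, 1.0]]
diversity = count_occupied_bins(points, lows=[0, 0], highs=[1, 1],
                                bins_per_dim=7)
```

Storing only the occupied cells in a set avoids allocating the full grid, whose size grows exponentially with the number of dimensions.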
We also measured the diversity in the space of parameters by constructing an analytic parameter space. The 15 features of this space consisted of Lenia’s 7 dynamics parameters and the latent representation of a VAE. The VAE was trained on a large dataset of initial Lenia states () used over the experimental campaign. This diversity measure also used 7 bins per dimension.
4.3 Algorithms
The exploration behaviors of different IMGEP algorithms were evaluated and compared to a random exploration. The IMGEP variants differ in how the goal space is defined or learned. Appendices D and E provide details and hyperparameters.
Random exploration: The IMGEP variants were compared to a random exploration that, for each of the exploration iterations, randomly sampled the parameters including the initial state .
IMGEP-HGS (goal exploration with a hand-defined goal space): The first IMGEP uses a hand-defined goal space that is composed of 5 features used in chan2018lenia. Each feature measures a certain property of the final pattern that emerged in Lenia: 1) the sum over the activity of all cells, 2) the number of activated cells, 3) the density of the activity center, 4) an asymmetry measure of the pattern and 5) a distribution measure of the pattern.
IMGEP-PGL (goal exploration with a pre-trained goal space): For this IMGEP variant the goal space was learned with a VAE approach on training data before the exploration process started. The training set consisted of 558 Lenia patterns: half were animals that have been manually identified by chan2018lenia; the other half were randomly generated with CPPNs, see Section 4.4.
IMGEP-OGL (goal exploration with online learning of the goal space): Algorithm 1.
IMGEP-RGS (goal exploration with a random goal space): An ablated IMGEP using a goal space based on the encoder of a VAE with random weights.
4.4 Experimental Procedure and hyperparameters
For each algorithm, 10 repetitions of the exploration experiment were conducted. Each experiment consisted of exploration iterations. This number was chosen to be compatible with the application of the algorithms in physical experimental setups similar to grizou2019exploration, planned in future work. For IMGEP variants the first iterations used random parameter sampling to initialize their histories . For the following iterations each IMGEP approach sampled a goal via a uniform distribution over its goal space. The ranges for sampling in the hand-defined goal space (HGS) are defined in Table 5 (Appendix D). The ranges for the VAE-based goal spaces (PGL, OGL) were set to for each of their latent variables. Then, the parameter from a previous exploration in was selected whose reached goal had the minimum Euclidean distance to the current goal within the goal space. This parameter was then mutated to generate the parameter that was explored.
The parameters consisted of a CPPN (Section 3.3) that generates the initial state for Lenia and the settings defining Lenia’s dynamics: . The CPPNs were initialized and mutated by a random process that defines their structure and connection weights as done by stanley2006exploiting. The other Lenia settings were randomly initialized from a uniform distribution and mutated by a Gaussian distribution around the original values. The meta-parameters to initialize and mutate the parameters were the same for all algorithms (Appendix B). They were manually chosen without optimizing them for a specific algorithm.
5 Results
We address several questions evaluating the ability of IMGEP algorithms to identify a diverse set of patterns, and in particular diverse “animal” patterns (i.e. spatially localized patterns).
[Figure 3. Panels: (a) Diversity in Parameter Space; (b) Diversity in Behavior Space; (c) Behavior Space Diversity for Animals; (d) Behavior Space Diversity for Non-Animals.]
Does goal exploration outperform random parameter exploration?
In robotics/agents contexts where scenes are populated with rigid objects, various forms of goal exploration algorithms outperform random parameter exploration (laversanne2018curiosity). We checked whether this still holds in continuous GOL models, which have very different properties. Measures of the diversity in the analytic behavior space confirmed the advantage of IMGEPs with hand-designed (HGS) or learned goal spaces (PGL/OGL) over random explorations (Fig. 3, b). The organization resulting from goal exploration is also visible through the diversity in the space of parameters. IMGEPs focus their exploration on subspaces that are useful for targeting new goals. In contrast, random parameter exploration is unguided, resulting in a higher diversity in the parameter space (Fig. 3, a).
What is the impact of learning a goal space vs. using random or hand-defined features?
We also compared the performance of random VAE goal spaces (RGS) to learned goal spaces (PGL/OGL). For reinforcement learning problems, using intrinsic reward functions based on random features of the observations can result in a high or boosted performance (burda2018large; burda2018exploration). In our context, however, using random features (RGS) collapsed the performance of goal exploration, and did not even outperform random parameter exploration for all kinds of behavioural diversity (Fig. 3). Results also show that hand-defined features (HGS) produced significantly less global diversity and less “animal” diversity than learned features (PGL/OGL). However, HGS found an equal diversity of “non-animals”. These results show that in this domain, the goal space has a critical influence on the type and diversity of patterns discovered. Furthermore, unsupervised learning is an efficient approach to discover a diversity of patterns, i.e. it is efficient at finding both diverse animals and diverse non-animals.
Is pre-training on a database of expert patterns necessary for efficient discovery of diverse animals?
One way to bias exploration towards patterns of interest, such as “animals”, is to pre-train a goal space with a pattern dataset hand-made by an expert. Here PGL is pre-trained with a dataset containing 50% animals. This leads PGL to discover a high diversity of animals. However, the new online approach (IMGEP-OGL) is as efficient as PGL at discovering diverse patterns (Fig. 3, b, c, d). Taken together, these results uncover an interesting bias of using learned features with a VAE architecture, which strongly incentivizes efficient discovery of diverse spatially localized patterns.
[Figure 4. Panels: (a) IMGEP-HGS Goal Space; (b) IMGEP-OGL Goal Space.]
How do goal space representations differ?
We analyzed the goal spaces of the different IMGEP variants to understand their behavior by visualizing their reached goals in a two-dimensional space. t-SNE (maaten2008visualizing) was used to reduce the high-dimensional goal spaces. It places points that were nearby in the high-dimensional space close to each other in the two-dimensional visualization.
The hand-defined (HGS) and learned (OGL) goal spaces differ strongly from each other (Fig. 4). We believe this explains their different abilities to find either a high diversity of non-animals or animals (Fig. 3, c, d). The goal space of IMGEP-HGS shows large areas and several clusters for non-animal patterns (Fig. 4, a). Animals form only a few clusters that lie close together. Thus, the hand-defined features seem poorly suited to discriminate and describe animal patterns in Lenia. As a consequence, when goals are uniformly sampled within this goal space during the exploration process, more goals are generated in regions that describe non-animals. This can explain why IMGEP-HGS explored a higher diversity of non-animal patterns but only a low diversity of animal patterns. In contrast, features learned by IMGEP-OGL better capture factors that differentiate animal patterns. This is indicated by the several clusters of animals that span a wide area in its goal space (Fig. 4, b).
We attribute this effect to the difficulty VAEs have in capturing sharp details (zhao2017towards). They therefore represent mainly the general form of the patterns but not their fine-grained structures. Animals often differ in their form, whereas non-animals often occupy the whole cell grid and differ in their fine-grained details. The goal spaces learned by VAEs therefore seem better suited for exploring diverse sets of animal patterns.
6 Conclusion
We formulated a novel application area for machine learning: the problem of automatically discovering self-organized patterns in complex dynamical systems with high dimensionality both in the action space and in the observation space. We showed that this problem calls for advanced methods requiring the dynamic interaction between sample-efficient autonomous exploration and unsupervised representation learning. We demonstrated that population-based IMGEPs are a promising algorithmic framework to address this challenge, by showing how they can discover diverse self-organized patterns in a continuous GOL. In particular, we introduced a new approach of learning a goal space representation online via data collected during the exploration process. It enables sample-efficient discovery of diverse sets of animal-like patterns, similar to those identified by human experts and yet without relying on such prior expert knowledge (Fig. 2). We also analyzed the impact of goal space representations on the diversity and types of discovered patterns.
The continuous game of life shares many properties with other artificial or natural complex systems, explaining why GOL models have been used in many disciplines to study self-organization, see bak1989self. We therefore believe this study shows the potential of IMGEPs for automated discovery in other systems encountered in physics, chemistry or even computer animation. In further work, we aim to apply this approach in roboticized wet experiments such as the one presented in grizou2019exploration, to address the fundamental understanding of how proto-cells can self-organize.
Acknowledgments
We thank Bert Wang-Chak Chan for his helpful discussions about the Lenia system and Jonathan Grizou for his comments on the visualization of our results. Furthermore, we thank Cédric Colas for his useful comments on the script.
References
Appendix A Target System: continuous Game of Life (Lenia)
The Lenia model is a particular implementation of continuous Game of Life models (chan2018lenia). It was used as the target system for all exploration experiments. The following section describes Lenia and the parameters to control its behavior in detail. It is followed by a description of the classifiers used to categorize dead, animal and nonanimal Lenia patterns. Finally, statistical measures about the patterns are introduced which were used to define goal and analytic spaces.
A.1 Implementation Details and Parameters
Lenia (chan2018lenia) is a cellular automaton (wolfram1983statistical). It consists of a two-dimensional grid of cells $A$ whose size was the same for all experiments. Opposite borders of the cell grid are connected, so the grid has the topology of a torus: cells on the north border are neighbors of the south border cells, and the east and west borders are connected in the same way. The state of each cell $x$ is a real-valued scalar activity $A^t(x) \in [0, 1]$. The states of cells evolve over discrete time steps $t$, with the same number of steps for all experiments. The activity change of a cell is computed by integrating the previous activity of its neighbouring cells:
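$$A^{t+1}(x) = \left[ A^t(x) + \Delta t \, G\left(K * A^t(x)\right) \right]_0^1$$
where $G$ is the growth mapping, $K$ is the kernel, $\Delta t = \frac{1}{T}$ is the time step and $[\cdot]_0^1$ is the clip function. For all experiments an exponential growth mapping was used:
$$G(u; \mu, \sigma) = 2 \exp\left(-\frac{(u - \mu)^2}{2\sigma^2}\right) - 1$$
with $\mu$ and $\sigma$ being parameters that control its shape.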
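The kernel integrates the activity of the current cell and its neighbours by a convolution with a kernel function $K$:
$$K * A^t(x) = \sum_{n \in \mathcal{N}} K(n) \, A^t(x + n) \qquad (1)$$
where $\mathcal{N}$ is the neighborhood around the cell $x$ and $n \in \mathcal{N}$ is the site distance. The neighborhood is defined by a circle around $x$ with radius $R$: $\mathcal{N} = \{n : \|n\|_2 \le R\}$. The kernel is constructed by a kernel core function $K_C$ and a kernel shell function $K_S$. The kernel core creates a ring around the center coordinate and is defined by an exponential:
$$K_C(r) = \exp\left(\alpha - \frac{\alpha}{4 r (1 - r)}\right), \quad \alpha = 4$$
The kernel shell takes a vector parameter $\beta = (\beta_1, \dots, \beta_B)$ and copies the kernel core into $B$ concentric rings. The rings are of equal thickness with peak heights $\beta_i$:
$$K_S(r; \beta) = \beta_{\lfloor B r \rfloor + 1} \, K_C(B r \bmod 1)$$
Finally, the kernel is normalized:
$$K(n) = \frac{K_S(\|n\|_2 / R; \beta)}{\sum_{m \in \mathcal{N}} K_S(\|m\|_2 / R; \beta)}$$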
In total 8 parameters controlled the behavior of Lenia for all experiments. $A^{t=1}$ is the starting pattern of the system. $R$ is the radius of the circle around a cell $x$ whose enclosed cells influence the activity of $x$. $T$ controls the growth strength update per time step. The growth mapping is controlled by $\mu$ and $\sigma$. The form of the kernel function is controlled by $\beta = (\beta_1, \beta_2, \beta_3)$.
We based our Python implementation of Lenia on the code provided by https://github.com/Chakazul/Lenia.
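The update rule described in this section can be sketched in plain Python on a small grid. This is a minimal illustration and not the authors' implementation: it assumes the standard Lenia formulation from chan2018lenia (exponential growth mapping, kernel core with $\alpha = 4$, time step $1/T$), and the function names, the toy grid and the values chosen for `R`, `T`, `mu`, `sigma` and `beta` are hypothetical.

```python
import math

def growth(u, mu, sigma):
    """Exponential growth mapping with values in [-1, 1]."""
    return 2.0 * math.exp(-((u - mu) ** 2) / (2.0 * sigma ** 2)) - 1.0

def kernel_core(r, alpha=4.0):
    """Smooth bump on (0, 1) that peaks at r = 0.5 (alpha = 4 is assumed)."""
    if r <= 0.0 or r >= 1.0:
        return 0.0
    return math.exp(alpha - alpha / (4.0 * r * (1.0 - r)))

def build_kernel(R, beta):
    """Map each offset (dy, dx) inside radius R to a normalized weight."""
    B = len(beta)
    weights = {}
    for dy in range(-R, R + 1):
        for dx in range(-R, R + 1):
            r = math.hypot(dy, dx) / R  # distance in units of R
            if r >= 1.0:
                continue
            w = beta[int(B * r)] * kernel_core((B * r) % 1.0)
            if w > 0.0:
                weights[(dy, dx)] = w
    total = sum(weights.values())
    return {offset: w / total for offset, w in weights.items()}

def lenia_step(A, kernel, T, mu, sigma):
    """One update of a grid A (list of rows) with wrap-around borders."""
    H, W = len(A), len(A[0])
    new_A = []
    for y in range(H):
        row = []
        for x in range(W):
            # Convolution with the kernel; modulo indexing connects opposite borders.
            u = sum(w * A[(y + dy) % H][(x + dx) % W]
                    for (dy, dx), w in kernel.items())
            # Growth scaled by the time step 1/T, then clipped to [0, 1].
            row.append(min(1.0, max(0.0, A[y][x] + growth(u, mu, sigma) / T)))
        new_A.append(row)
    return new_A

# Toy usage: a 16 x 16 grid with a small square of activity (illustrative values).
A = [[1.0 if 6 <= y < 10 and 6 <= x < 10 else 0.0 for x in range(16)]
     for y in range(16)]
kernel = build_kernel(R=4, beta=[1.0])
A_next = lenia_step(A, kernel, T=10, mu=0.15, sigma=0.015)
```

The dictionary-based kernel keeps the sketch dependency-free; a practical implementation would use an FFT-based convolution on arrays.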
A.2 Classifier
We distinguished 3 categories of patterns that are observed in Lenia: dead, animals and non-animals. The categories were used to analyze whether the exploration algorithms differ in their exploration behavior by identifying different types of patterns. A classifier is defined for each category. The classifiers only consider the final pattern into which the Lenia system has evolved at the last time step.
Dead Classifier: A pattern is dead if the activity of all cells is either $0$ or $1$ in the last time step.
Animal Classifier: The final Lenia pattern is classified as an animal if it is a finite and connected pattern of activity. Two cells are connected as a pattern if both are active and if they influence each other. Cells influence each other when they lie within each other's kernel radius, as defined by the parameter $R$ (Eq. 1).
Furthermore, the connected pattern must be finite. In Lenia, finite and infinite patterns can be differentiated because the opposite borders of Lenia's cell grid are connected, so that a pattern can loop around the grid, making it infinite. We identify infinite patterns by the following approach. First, all connected patterns are identified under the assumption of an infinite cell grid, i.e. opposite grid borders are connected. Second, all connected patterns are identified under the assumption of a finite cell grid, i.e. opposite grid borders are not connected. Third, for each border pair (north-south and east-west) it is tested whether there are cells within a fixed distance of both borders that are part of a connected pattern in both the infinite and the finite grid case. If such a pattern exists, then it is assumed to be infinite, because it loops around the cell grid of Lenia (Fig. 6, a). All other patterns are considered to be finite (Fig. 6, b). Please note that this method has a drawback: it cannot identify certain infinite patterns that loop over several borders, for example a pattern that connects the north to the east border and then the west to the south border (Fig. 7).
Figure 6: (a) Infinite pattern. (b) Finite pattern.
Moreover, there are two additional constraints that an animal pattern must fulfill. First, the cells of the connected pattern must hold at least 80% of all activation. Second, the pattern must exist in the last two time steps. Both constraints prevent patterns that are too small, or chaotic entities that change drastically between time steps, from being classified as animals. See Fig. 5, 27, 29, 30 and 31 for examples of animal patterns.
A.3 Statistical Measures for Lenia Patterns
We defined five statistical measures for the final patterns that emerge in Lenia. The measures were used as features for hand-defined goal spaces of IMGEPs and to partly define the analytic behavior space in which the results of the exploration experiments were compared.
Activation mass $M_A$: Measures the sum over the total activation of the final pattern and normalizes it according to the size of the Lenia grid:
$$M_A = \frac{1}{N} \sum_{x} A(x)$$
where $N$ is the number of cells of the Lenia system.
Activation volume $V_A$: Measures the number of active cells and normalizes it according to the size of the Lenia grid:
$$V_A = \frac{1}{N} \left| \{ x : A(x) \text{ is active} \} \right|$$
Activation density $D_A$: Measures how densely the activation is distributed on average over all active cells:
$$D_A = \frac{M_A}{V_A}$$
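The three measures above can be computed in a few lines. The following is a minimal sketch under stated assumptions: a cell counts as active when its activity exceeds a small threshold `eps` (the value used here is hypothetical), and density is taken as mass divided by volume, which follows the verbal definition but is not copied from the authors' code.

```python
def activation_stats(A, eps=1e-4):
    """Return (mass, volume, density) for a pattern A given as a list of rows.

    eps is an assumed activity threshold, not a value taken from the paper.
    """
    cells = [a for row in A for a in row]
    n = len(cells)
    mass = sum(cells) / n                            # mean activation over the grid
    volume = sum(1 for a in cells if a > eps) / n    # fraction of active cells
    density = mass / volume if volume > 0 else 0.0   # mean activation of active cells
    return mass, volume, density

# Toy usage on a 2 x 2 pattern with two active cells.
mass, volume, density = activation_stats([[0.0, 0.5], [1.0, 0.0]])
```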
Activation asymmetry : Measures how symmetrically the activation is distributed with respect to an axis that starts in the center of the pattern's activation mass and follows the last movement direction of this center. This measure was introduced especially to characterize animal patterns such as those shown in Fig. 5. The center of the activity mass is usually also the center of the animal, and analyzing the activity along its movement axis measures how symmetrical it is.
As a first step, the center of the activation mass is computed for every time step of the Lenia simulation and the Lenia pattern is recentered to this location. This ensures that the center is computed correctly at all times, even when the animal moves across one border and reappears on the opposite border in the uncentered pattern. The center $(\bar{x}^t, \bar{y}^t)$ for time step $t$ is calculated by:
$$\bar{x}^t = \frac{M_{10}}{M_{00}}, \qquad \bar{y}^t = \frac{M_{01}}{M_{00}}$$
where $M_{pq} = \sum_x \sum_y x^p y^q A^t(x, y)$ measures the image moment (or raw moment) of order $(p + q)$ for $p, q \in \{0, 1\}$.
Based on the center the pattern is recentered to by shifting the and indexes according to the center:
(2) 
where is width and length of the Lenia grid and the indexing is . After each time step the center is recomputed and the pattern recentered:
Please note, the simulations and all figures of patterns in the paper are done with the uncentered pattern. The centered version is only computed for the purpose of statistical measurements.
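The centering procedure can be illustrated with a short sketch. It assumes the center is obtained from the raw image moments ($M_{10}/M_{00}$, $M_{01}/M_{00}$) and that the pattern is shifted with wrap-around so that this center lands on the grid midpoint; the choice of midpoint (`W // 2`, `H // 2`) and the function names are hypothetical.

```python
def center_of_mass(A):
    """Return (x_bar, y_bar) from the raw image moments M10/M00 and M01/M00."""
    m00 = sum(a for row in A for a in row)
    if m00 == 0:
        return None  # no activity, the center is undefined
    m10 = sum(x * a for row in A for x, a in enumerate(row))
    m01 = sum(y * a for y, row in enumerate(A) for a in row)
    return m10 / m00, m01 / m00

def recenter(A):
    """Shift A with wrap-around so that its activity center sits at the midpoint."""
    H, W = len(A), len(A[0])
    center = center_of_mass(A)
    if center is None:
        return [row[:] for row in A]
    shift_x = W // 2 - round(center[0])
    shift_y = H // 2 - round(center[1])
    # Modulo indexing moves activity across the connected borders.
    return [[A[(y - shift_y) % H][(x - shift_x) % W] for x in range(W)]
            for y in range(H)]

# Toy usage: a single active cell in the top-right corner of a 5 x 5 grid.
A = [[0.0] * 5 for _ in range(5)]
A[0][4] = 1.0
A_centered = recenter(A)
```

Applying `recenter` after every time step keeps the pattern away from the borders, so the plain moment-based center remains meaningful even for animals that would otherwise wrap around the grid.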
The recenter step by defines also the movement direction of the activity center:
where are the coordinates for the middle point of the grid. A line can be defined that starts in the midpoint of the final centered pattern and goes in and opposite to the final movement direction of the activity mass center . This line separates the grid in two equal areas. The asymmetry is computed by comparing the amount of activity in the grid right and left of the line. The normalized difference between both sides is the final asymmetry measure:
Activation centeredness : Measures how strong the activation is distributed around the activity mass center:
where the weights depend on the distance of each point to the center point, and the activation is the centered activation that is updated every time step as for the asymmetry measure (Eq. 2). The weights decrease the farther a point is from the center. Thus, patterns that are concentrated around the center have a high value, whereas patterns whose activity is distributed throughout the whole grid have a smaller value. For patterns whose activity is equally distributed, a fixed value is defined as the centeredness measure.
Appendix B Sampling of Parameters for Lenia
All exploration algorithms explore Lenia patterns by sampling the parameters $\theta$ that control Lenia. The parameters comprise the initial pattern $A^{t=1}$ and the parameters that control the dynamic behavior ($R, T, \mu, \sigma, \beta_1, \beta_2, \beta_3$). There are two operations to sample parameters: 1) random initialization and 2) mutation of an existing parameter $\theta$. CPPNs are used for the random initialization and mutation of the initial pattern $A^{t=1}$. The details of this process are described in the next section. Afterwards, the initialization and mutation of Lenia's parameters that control its dynamics are described.
B.1 Sampling of Start Patterns for Lenia via CPPNs
Compositional Pattern Producing Networks (CPPNs) are recurrent neural networks that were developed for the generation and evolution of gray-scale 2D images (stanley2006exploiting). We used CPPNs to generate and mutate the initial state of Lenia, which resembles an image. CPPNs generate images pixel by pixel by taking as input a bias value, the $x$ and $y$ coordinates of the pixel in the image and its distance to the image center (Fig. 8). Their output is the gray-scale value of the pixel at the given coordinate. For the generation of initial Lenia patterns, the $x$ and $y$ coordinates of the grid cells are used as input after being rescaled. The distance to the grid center is computed from the rescaled coordinates. The final activity of a cell is the output of the CPPN remapped to the activity range of Lenia.
CPPNs consist of several hidden neurons (typically between 4 and 6 in our experiments) that can have recurrent connections and self-connections. Each CPPN has one output neuron. Two activation functions were used for the hidden neurons and the output neuron. The first is Gaussian and the second is sigmoidal:
(3) 
(4) 
To randomly initialize a Lenia initial pattern, a CPPN is randomly sampled by sampling the number of hidden neurons, the connections between inputs and neurons and between neurons, their connection weights and the activation functions of the neurons. Afterwards, the initial pattern is generated by it. The CPPN is then added to the history of the IMGEP as part of the parameter $\theta$. If the parameter is mutated, the weights, connections and activation functions of the CPPN are mutated and the new initial pattern is generated by it. A CPPN is defined by its network structure (number of nodes, connections between nodes) and its connection weights. The number of parameters in $\theta$ is therefore variable and not fixed.
We used the neat-python package (https://github.com/CodeReclaimers/neat-python) for the random generation and mutation of CPPNs. It is based on the NeuroEvolution of Augmenting Topologies (NEAT) algorithm for the evolution of neural networks (stanley2002efficient). The meta-parameters for the initialization and mutation of CPPNs are listed in Table 1. The random sampling and mutation of CPPNs allows the generation of complex patterns, as illustrated in Fig. 9.
Parameter  Value 
Initial number of hidden neurons  
Initial activation functions  gauss, sigm 
Initial connections  random connections with probability 
Initial synapse weight  Gaussian distribution with , 
Synapse weight range  
Mutation neuron add probability  
Mutation neuron delete probability  
Mutation connection add probability  
Mutation connection delete probability  
Mutation rate of activation functions  
Mutation rate of synapse weights  
Mutation replace rate of synapse weights  
Mutation power of synapse weights  
Mutation enable/disable rate of synapse weights 
The random sampling of a new CPPN is done by the following steps. All CPPNs are initialized with 4 hidden neurons and 1 output neuron. Their activation functions are randomly assigned. Each input-hidden, hidden-hidden and hidden-output neuron pair is connected with the initial connection probability listed in Table 1. The weight of each connection is sampled from the Gaussian distribution given in Table 1 and is bounded by the synapse weight range.
An existing CPPN is mutated by the following procedure. At first, structural mutations are performed. With the neuron add probability (Table 1), a new neuron with a random activation function is added. The neuron is connected to the network by randomly choosing an existing connection, which is deleted. A connection from the source of the deleted connection to the new neuron is added with weight $1$. Additionally, a new connection from the new neuron to the target of the deleted connection is added with the old connection weight, finishing the addition of a new neuron. With the neuron delete probability, one of the hidden neurons is deleted. With the connection add probability, a new connection is added between a random input-hidden, hidden-hidden or hidden-output neuron pair. The connection weight is sampled by the same method as for the sampling of new CPPNs. With the connection delete probability, one random existing connection is removed. After the structural mutations, the activation functions and weights are mutated. For each neuron, the activation function is changed with the activation mutation rate by randomly assigning a new activation function (either gauss or sigm). For each connection, the weight is mutated by the following steps. With the weight mutation rate, the weight of the connection is changed according to:
where is the mutation power and is the clip function. With probability the connection weight is completely replaced by sampling a new one as done for the sampling of weights of new CPPNs.
Please note, the neat-python package also allows the setting and mutation of response and bias weights for each neuron. Those settings were not used for the experiments. Moreover, we adjusted the sigmoid and Gaussian functions in the neat-python package to the ones defined in Eq. 3 and Eq. 4 to be able to generate images similar to those in stanley2006exploiting.
Figure 9: Panels, from left to right: Initialization, 1st Mutation, 2nd Mutation, 3rd Mutation, 4th Mutation, 5th Mutation.
B.2 Sampling of Lenia's Dynamic Parameters
The parameters that control the dynamics of Lenia ($R, T, \mu, \sigma, \beta_1, \beta_2, \beta_3$) are initialized and mutated via uniform and Gaussian distributions. Table 2 lists for each parameter the meta-parameters for its initialization and mutation. Each parameter is initialized by uniform sampling between a lower border $a$ and an upper border $b$. An existing parameter $\theta$ is mutated by the following equation:
$$\theta' = \left[ \theta + \mathcal{N}(0, \sigma_M) \right]_a^b$$
where $\sigma_M$ is the mutation power and $[\cdot]_a^b$ is the clip function with $a$ and $b$ as lower and upper border. For natural numbers the resulting value is rounded to the nearest natural number.
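The initialization and mutation of a single dynamic parameter can be sketched as follows. The sketch assumes that mutation adds zero-mean Gaussian noise whose standard deviation is the mutation power, followed by clipping and, for integer-valued parameters, rounding; the function names and the example ranges are hypothetical and do not reproduce the values of Table 2.

```python
import random

def init_param(low, high, is_integer=False, rng=random):
    """Uniform initialization between the lower and upper border."""
    value = rng.uniform(low, high)
    return int(round(value)) if is_integer else value

def mutate_param(value, power, low, high, is_integer=False, rng=random):
    """Gaussian mutation with the given mutation power, clipped to [low, high]."""
    mutated = value + rng.gauss(0.0, power)
    mutated = min(high, max(low, mutated))  # clip function
    return int(round(mutated)) if is_integer else mutated

# Toy usage with illustrative ranges (not the values from Table 2).
rng = random.Random(0)
radius = init_param(2, 20, is_integer=True, rng=rng)
radius_mutated = mutate_param(radius, power=0.5, low=2, high=20,
                              is_integer=True, rng=rng)
```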
Parameter  Type  Value Range  Mutation 
Appendix C Measurement of Diversity in the Analytic Parameter and Behavior Space
The algorithms are compared on their ability to explore a diverse set of patterns. The next section introduces the diversity measure, followed by sections that introduce the spaces in which the algorithms are compared.
C.1 Diversity Measure
Diversity is measured by the area that explored parameters cover in the parameter space of Lenia or that the identified patterns cover in the observation space. For the experiments the parameter space consisted of the initial start state of Lenia ($A^{t=1}$) and the settings for Lenia's dynamics ($R, T, \mu, \sigma, \beta_1, \beta_2, \beta_3$). The space therefore has one dimension for each grid cell of the initial pattern, plus 7 dimensions for the dynamic settings. The observation space consists of the final patterns, with one dimension per grid cell. Each single exploration results in a new point in those spaces.
The diversity measures how much area the algorithms explored in those spaces (Fig. 10). The measurement is done by discretizing the space with a spatial grid and counting the number of discretized areas in which at least one point falls. For the discretization each dimension of the space is given a range, i.e. a minimum and maximum border. Each dimension is then split in a certain number of equally sized bins between those borders. The areas with values falling below the minimum or above the maximum border are counted as two additional bins.
The number of dimensions of the original parameter and observation space is too large to measure diversity in a meaningful manner, because the initial pattern and the final pattern each have one dimension per grid cell. We therefore constructed an analytic parameter space and an analytic behavior space in which the latent representations of a VAE were used to reduce the high-dimensional patterns to 8 dimensions. The diversity in those spaces was compared between the algorithms. 5 bins (7 with the out-of-range values) per dimension were used for the discretization of those spaces for all experiments in the paper.
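The binning-based diversity measure can be written down compactly. The following sketch discretizes each dimension into `n_bins` equally sized bins between its borders, adds one bin below and one above the range, and counts the distinct occupied cells; the function names and the toy points are hypothetical.

```python
def bin_index(value, low, high, n_bins):
    """Index 0 is below the range, 1..n_bins are inside, n_bins + 1 is above."""
    if value < low:
        return 0
    if value > high:
        return n_bins + 1
    i = int((value - low) / (high - low) * n_bins)
    return 1 + min(i, n_bins - 1)  # value == high falls into the last inner bin

def diversity(points, ranges, n_bins=5):
    """Number of discretized areas that contain at least one point."""
    occupied = {
        tuple(bin_index(v, lo, hi, n_bins) for v, (lo, hi) in zip(point, ranges))
        for point in points
    }
    return len(occupied)

# Toy usage in a 2-dimensional space with range [0, 1] per dimension.
points = [(0.1, 0.1), (0.15, 0.12), (0.9, 0.9), (2.0, 0.5)]
n_reached = diversity(points, ranges=[(0.0, 1.0), (0.0, 1.0)])
```

In this toy example the first two points share a bin, so three distinct bins are reached.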
C.2 Analytic Parameter Space
The analytic parameter space was constructed by the 7 Lenia parameters that control its dynamics and 8 latent representation dimensions of a VAE (Table 3). The VAE was trained on initial patterns used during the experiments. The dataset was constructed by randomly selecting 42500 patterns (37500 as training set, 5000 as validation set) from the experiments of all algorithms and each of their 10 repetitions. The VAE uses the same structure, hyperparameters, loss function and learning algorithm as described in Section E. It was trained for more than 1400 epochs with (Fig. 11). The encoder which resulted in the minimal validation set error during the training was used. According to its reconstructed patterns it can represent the general form of patterns but often not individual details such as their texture (Fig. 12).
Analytic Parameter Space Definition
Parameter  min  max  Parameter  min  max 
R  VAE latent  5  5  
T  VAE latent  5  5  
VAE latent  5  5  
VAE latent  5  5  
VAE latent  5  5  
VAE latent  5  5  
VAE latent  5  5  
VAE latent  5  5 
C.3 Analytic Behavior Space
The analytic behavior space was constructed by combining the 5 statistical measures for final Lenia patterns (Section A.3) and 8 latent representation dimensions of a VAE (Table 4). The VAE was trained on final patterns observed during experiments. The dataset was constructed by randomly selecting 42500 patterns (37500 as training set, 5000 as validation set) from the experiments of all algorithms and each of their 10 repetitions. The dataset consists of 50% animal and 50% nonanimal patterns. The VAE uses the same structure, hyperparameters, loss function and learning algorithm as described in Section E. It was trained for more than 1400 epochs with (Fig. 13). The encoder which resulted in the minimal validation set error during the training was used. Its reconstructed patterns show that it is able to represent the general form of patterns but often not individual details such as their texture (Fig. 14).
Appendix D Random Exploration and IMGEPs with Hand-Defined Goal Spaces
Two random explorations and several IMGEPs with different hand-defined goal spaces were evaluated and compared. The main paper and the additional results in Section F only report the results for the best random exploration and one IMGEP variant with a hand-defined goal space. This section introduces the implementation details and diversity results of all evaluated random explorations and IMGEPs with hand-defined goal spaces.
D.1 Random Explorations
We evaluated two random exploration strategies: Random Initialization and Random Mutation. The main paper and the additional results in Section F only discuss the Random Initialization approach.
Random Initialization: This approach sampled for each of the 5000 explorations a random parameter $\theta$, including a random CPPN to generate the initial state $A^{t=1}$. The approach can be replicated by using Algorithm 2 with all explorations performed as initial random explorations.
Random Mutation: This approach is closer to the principle of IMGEPs. It first performs an initial set of random explorations and adds each explored parameter to a history. Afterwards, it randomly samples a parameter from the history and mutates it. The new parameter is also added to the history. The approach can be replicated by using Algorithm 2 where line 6 is skipped and the parameter sampling distribution selects a random parameter from the history and mutates it.
D.2 IMGEPs with Hand-Defined Goal Spaces
We evaluated several IMGEP variants with goal spaces that were hand-defined (IMGEP-HGS). Each space was constructed from a different combination of statistical measures of the final Lenia patterns (Tables 5 and 6), which are described in Section A.3. The main paper and the additional results in Section F only discuss the IMGEP-HGS 9 approach. Algorithm 2 lists the steps of the IMGEP-HGS variants. They begin with 1000 random explorations, followed by 4000 explorations based on randomly generated goals. Each goal was sampled from a uniform distribution within the ranges defined in Table 5. Then the parameter from a previous exploration that resulted in the closest outcome to the current goal was mutated and explored.
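The exploration loop described above (random initial explorations, then goal sampling, nearest-outcome selection and mutation) can be sketched generically. This is an illustration and not Algorithm 2 itself: the toy system, the toy sampling and mutation functions, the squared Euclidean distance in goal space and all names are hypothetical stand-ins for Lenia and its parameter operators.

```python
import random

def imgep_explore(run_system, sample_params, mutate, goal_ranges,
                  n_init, n_total, rng):
    """Goal exploration loop that returns the history of (parameters, outcome)."""
    history = []
    for i in range(n_total):
        if i < n_init:
            # Initial phase: random parameter sampling.
            params = sample_params(rng)
        else:
            # Sample a goal uniformly within the goal space ranges.
            goal = [rng.uniform(lo, hi) for lo, hi in goal_ranges]
            # Pick the previous exploration whose outcome is closest to the goal.
            source, _ = min(
                history,
                key=lambda entry: sum((o - g) ** 2
                                      for o, g in zip(entry[1], goal)),
            )
            params = mutate(source, rng)
        history.append((params, run_system(params)))
    return history

# Toy stand-ins for the target system and the parameter operators.
def toy_system(params):
    return (params[0] ** 2, params[0] * params[1])

def toy_sample(rng):
    return [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]

def toy_mutate(params, rng):
    return [min(1.0, max(-1.0, p + rng.gauss(0.0, 0.1))) for p in params]

history = imgep_explore(toy_system, toy_sample, toy_mutate,
                        goal_ranges=[(0.0, 1.0), (-1.0, 1.0)],
                        n_init=10, n_total=50, rng=random.Random(0))
```

For an IMGEP-HGS variant, `run_system` would run Lenia and return the selected statistical measures of the final pattern as the outcome.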
D.3 Results
The random explorations and IMGEP-HGS variants are compared by their resulting diversity in the analytic parameter and behavior space (Fig. 15). The diversity is measured by the number of reached bins in each space using 7 bins per dimension.
The Random Initialization approach reached a higher diversity than the Random Mutation approach for all diversity measures. Therefore, the Random Initialization approach is used for the comparison to IMGEP approaches in the main paper and the additional results in Section F.
Most IMGEP-HGS variants had a higher diversity in the analytic behavior space than the random explorations, although their diversity in the analytic parameter space is lower. This shows the advantage of IMGEPs over random searches in discovering a wider range of patterns in the target system. IMGEP-HGS 3, 4 and 9 had the best overall diversity. We chose IMGEP-HGS 9 for the comparison with learned goal spaces in the main paper and for the additional results in Section F. Of the three variants (3, 4, 9), it identified the highest diversity of non-animals, reaching a higher diversity for non-animals than any IMGEP with a learned goal space. It was therefore selected to show that the choice of the goal space has an influence on the patterns that IMGEPs identify.
The diversity varied between the IMGEP-HGS variants depending on the statistical measures used to define the goal space. IMGEPs that use the volume measure (HGS 1-4) generally reach a higher overall diversity than goal spaces with the density measure (HGS 5-8), which can be attributed to their higher diversity of animal patterns (Fig. 15, b, c). In terms of diversity of identified animals, the inclusion of several measures showed the best performance (HGS 4 and HGS 8 in Fig. 15, c). In terms of diversity of identified non-animals, the inclusion of several measures besides the centeredness measure showed the best performance (HGS 3 and HGS 7 in Fig. 15, d). The results show that the choice of the goal space has an important influence on the diversity of identified patterns and their type (animal or non-animal).
Feature  min  max 
mass  
volume  
density  
asymmetry  
centeredness 
HGSVariants  
Feature  1  2  3  4  5  6  7  8  9 
mass  
volume  
density  
centeredness  
asymmetry 
Appendix E IMGEPs with Random and Learned Goal Spaces using Deep Variational Autoencoders
We considered three random initializations for the VAE representation used in IMGEP-RGS as well as three different training objectives for learning the VAE goal space used in IMGEP-PGL and IMGEP-OGL. Variational Autoencoders (VAEs) (kingma2013auto; rezende2014stochastic) are commonly used deep generative models that can learn a latent representation of the data in an unsupervised manner. The latent representation has a reduced number of dimensions and should capture the important features of the input data. We use VAEs to learn the important features that describe Lenia patterns. The features are then used to define goal spaces for IMGEPs. This section details the different variants that were implemented for IMGEPs with random and learned VAE goal spaces, gives the implementation details, and compares the diversity results as well as the VAEs' reconstruction accuracy.
E.1 IMGEP with Random VAE Goal Spaces
To study the impact of learning representations, we implemented IMGEP-RGS as an ablated version of IMGEPs where the goal space is based on the encoder of a VAE with random weights. We evaluated three variants to randomly set the weights of the VAE encoder: PyTorch (paszke2017automatic), Xavier (glorot2010understanding) and Kaiming (he2015delving). The encoder is composed of four 2D convolutional layers (with ReLU activations) followed by three fully-connected layers. Table 7 shows the different sampling distributions from which the encoder parameters are initialized. We used uniform distributions for both the Xavier and Kaiming variants and set all layers' bias parameters to zero.
Convolutional Layers  Linear Layers  
RGS Variants  bound  weight  bias  weight  bias 
Pytorch  
Xavier  0  0  
Kaiming  0  0 
E.2 IMGEP with Learned VAE Goal Spaces
E.2.1 VAE framework
VAEs have two components: a neural encoder and a decoder. The encoder represents a given data point $x$ by a latent representation $z$. In variational approaches the encoder describes a data point by a representative distribution $q(z|x)$ in a latent space of reduced dimension $d$. A standard Gaussian prior $p(z) = \mathcal{N}(0, I)$ and a diagonal Gaussian posterior $q(z|x) = \mathcal{N}(\mu, \mathrm{diag}(\sigma^2))$ are used for this purpose. Given a data point $x$, the encoder outputs the mean $\mu$ and variance $\sigma^2$ of the representative distribution in the latent space. The decoder tries to reconstruct the original data from a latent representation $z$ sampled from the distribution given by the encoder.
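Under these assumptions, training is done by maximizing the computationally tractable evidence lower bound (with $\beta = 1$ for the classical VAE):
$$\mathcal{L}(x) = \mathbb{E}_{q(z|x)}\left[\log p(x|z)\right] - \beta \, D_{KL}\left(q(z|x) \,\|\, p(z)\right) \qquad (5)$$
The first term represents the expected reconstruction accuracy, while the second is the KL divergence of the approximate posterior from the prior. For the Gaussian prior and posterior used here, the KL divergence is a sum of contributions from the individual latent variables:
$$D_{KL}\left(q(z|x) \,\|\, p(z)\right) = \frac{1}{2} \sum_{i=1}^{d} \left(\mu_i^2 + \sigma_i^2 - \log \sigma_i^2 - 1\right) \qquad (6)$$
where $\mu_i$ and $\sigma_i^2$ are the posterior mean and variance of latent variable $i$ and $d$ is the number of latent dimensions.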
E.2.2 VAE variants
The recent growing interest in unsupervised representation learning, and therefore in VAEs, resulted in a plethora of proposed losses, network designs and choices of family for the encoder, decoder and prior distributions (tschannen2018recent). In order to enhance desired properties such as interpretability and disentanglement of the latent variables, many current state-of-the-art approaches build on the VAE framework and augment the VAE objective (higgins2017beta; burgess2018understanding; kim2018disentangling; chen2018isolating; kumar2017variational).
In this paper, we couple the VAE architecture with three different objectives: the classical VAE objective (kingma2013auto) (equation 5 with $\beta = 1$), the $\beta$-VAE objective (higgins2017beta) (equation 5 with $\beta > 1$) and an augmented VAE objective (equation 7).
The $\beta$-VAE objective reweights the KL divergence term by a factor $\beta$, aiming to enhance the disentangling properties of the learned latent factors. We are interested in such properties as it has been shown that they can benefit exploration (laversanne2018curiosity). However, heavily penalizing the KL divergence term can result in the network learning to "sacrifice" one or more of the learned latent variables in order to nullify their contribution (equation 6). Those dimensions become completely uninformative and useless for further exploration in the learned latent space. This phenomenon is known as posterior collapse and is a common problem when training VAEs (bowman2015generating; chen2016variational; he2019lagging; kingma2016improved).
To prevent this phenomenon, we considered an augmented VAE objective with a new term that encourages the network to jointly decrease the individual contributions of the different latent variables. This augmented loss term minimizes not only the averaged contribution (sum) but also the variance of the individual contributions:
(7) 
Similar modifications of the training objective that avoid posterior collapse can be found in the literature (tolstikhin2017wasserstein; zhao2017infovae).
By writing the VAE training objective as stated in equation 8, the three different variants outlined above correspond to the following set of hyperparameters , and .
(8) 
E.3 Implementation Details
This section describes the IMGEP approaches (RGS, PGL and OGL) and the network architecture, training procedure, hyperparameters and datasets for the training of their VAEs.
All VAEs use the same architecture (Table 8). The encoder network takes the Lenia pattern $A$ as input and outputs, for each latent variable $z_i$, the mean $\mu_i$ and the log-variance $\log \sigma_i^2$. During training, the decoder takes as input, for each latent variable, a sampled value $z_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$. For validation runs and the generation of all reconstructed patterns shown in figures, the decoder takes the mean $\mu_i$ as input. Its output is the reconstructed pattern.
The training objectives of all three variants are given in section E.2.1. The resulting loss function (Eq. 8) of all VAE variants for a batch is:
where are the input patterns, are the reconstructed patterns, are the outputs of the decoder network and is the number of latent dimensions. The reconstruction accuracy part of the loss is given by a binary cross entropy with logits:
where the index is for the single cells (pixel) of the pattern and for the datapoint in the current batch, is the batch size and . The KL divergence terms are given by:
All VAEs were trained for 2000 epochs and initialized with the PyTorch default initialization. We used the Adam optimizer (kingma2014adam) (, , , , weight decay=) with a batch size of 64.
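The two loss components described in this section, a binary cross entropy with logits and the KL divergence of a diagonal Gaussian posterior from a standard Gaussian prior, can be illustrated for a single data point. This is a sketch of the standard $\beta$-weighted objective only: it does not reproduce the paper's augmented variant or its batch normalization, and the function names are hypothetical.

```python
import math

def bce_with_logits(logit, target):
    """Numerically stable binary cross entropy for one cell, given a logit."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

def kl_per_dim(mu, logvar):
    """KL divergence from N(0, 1) for each latent dimension separately."""
    return [0.5 * (m * m + math.exp(lv) - lv - 1.0) for m, lv in zip(mu, logvar)]

def beta_vae_loss(logits, targets, mu, logvar, beta=1.0):
    """Negative ELBO for one pattern: reconstruction term plus beta * KL term."""
    reconstruction = sum(bce_with_logits(l, t) for l, t in zip(logits, targets))
    return reconstruction + beta * sum(kl_per_dim(mu, logvar))

# Toy usage: a flattened pattern with 3 cells and 2 latent dimensions.
loss = beta_vae_loss(logits=[2.0, -1.0, 0.0], targets=[1.0, 0.0, 0.5],
                     mu=[0.3, -0.2], logvar=[0.0, -0.5], beta=1.0)
```

Returning the KL contribution per dimension makes it easy to inspect which latent variables have collapsed, i.e. contribute almost nothing to the KL term.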
The patterns from the datasets were augmented by random x and y translations (up to half the pattern size and with probability 0.3), rotation (up to 40 degrees and with probability 0.3), horizontal and vertical flipping (with probability 0.2). The translations and rotations were preceded by spherical padding to preserve Lenia spherical continuity.
Encoder  Decoder 
Input pattern A:  Input latent vector z: 
Conv layer: 32 kernels , stride , padding + ReLU  FC layers : 256 + ReLU, + ReLU 
Conv layer: 32 kernels , stride , padding + ReLU  TransposeConv layer: 32 kernels , stride , padding + ReLU 
Conv layer: 32 kernels , stride , padding + ReLU  TransposeConv layer: 32 kernels , stride , padding + ReLU 
Conv layer: 32 kernels , stride , padding + ReLU  TransposeConv layer: 32 kernels , stride , padding + ReLU 
FC layers : 256 + ReLU, 256 + ReLU, FC:  TransposeConv layer: 32 kernels , stride , padding 
Three types of IMGEPs were evaluated:
IMGEP-RGS (random goal space):
IMGEP with a goal space defined by an encoder network with random weights (Algorithm 3). The network architecture of the encoder is the same as that of the VAEs used for IMGEPs with learned goal spaces. In the other IMGEP algorithms (HGS/PGL/OGL), the goals are sampled uniformly within fixed-range boundaries that are chosen in advance. However, in the case of random goal spaces, we do not know in advance in which region of the space goals will be encoded. Therefore, we set the range to for each of the latent variables, to also bias exploration towards the boundaries of the discovered goal space.
IMGEP-PGL (pre-learned goal space):
IMGEP (Algorithm 4) with a goal space defined by a VAE that was trained before the exploration starts. The VAE is trained on a dataset of pre-collected Lenia patterns. The best VAE model obtained during the training phase, i.e. the one with the highest accuracy on the validation data, is used for the exploration.
The dataset used to train the VAE has 558 patterns which are distributed into a training (75%), validation (10%) and testing (15%) datasets. Half of the patterns (279) were manually identified animal patterns by chan2018lenia (Fig. 5). The other half (279) are randomly initialized CPPN patterns as described in Section B.1 (Fig. 9).
During the intrinsically motivated iterations, goals are uniformly sampled in the hypercube . This values were chosen because the encoder of the VAE is trained to match a prior standard normal distribution (through the KL divergence term), therefore we can assume that most area of the covered goal space will fall into that hypercube.
IMGEP-OGL (online learned goal space):
IMGEP (Algorithm 1 in the main paper) that trains the VAE which defines the goal space during the exploration. The VAE is trained on Lenia patterns discovered by the algorithm. Every 100 explorations the VAE model is trained for 40 epochs, resulting in 2000 epochs in total (fewer if there is not enough data after the first runs to start the training).
Importance sampling is used to give the patterns in the training dataset different weights during training. A weighted random sampler draws newly discovered patterns from the training dataset half of the time. Each pattern that has been added to the training dataset during the last period of 100 explorations has a probability of to be sampled (N is the total number of new patterns in the dataset). Older patterns are also sampled half of the time, each one with probability . As a result, newly discovered patterns have a higher weight and a stronger influence on the training of the VAE model.
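The weighting scheme can be illustrated with a minimal sketch. It is not the implementation used for the experiments: the function name `sample_training_batch` is hypothetical, and the per-pattern weights are an assumption derived from the description above, namely that the group of new patterns and the group of older patterns each receive a total probability of one half, spread uniformly within the group.

```python
import random


def sample_training_batch(old_patterns, new_patterns, batch_size, rng):
    """Draw a batch in which about half of the samples are new patterns.

    Assumption: each group (old, new) gets a total probability mass of 0.5,
    shared equally among the members of that group.
    """
    pool = list(old_patterns) + list(new_patterns)
    if not old_patterns or not new_patterns:
        # Only one group exists, so fall back to uniform sampling.
        return rng.choices(pool, k=batch_size)
    weights = ([0.5 / len(old_patterns)] * len(old_patterns)
               + [0.5 / len(new_patterns)] * len(new_patterns))
    # Sampling is done with replacement, as in a weighted random sampler.
    return rng.choices(pool, weights=weights, k=batch_size)


if __name__ == "__main__":
    rng = random.Random(0)
    old = [("old", i) for i in range(900)]
    new = [("new", i) for i in range(100)]
    batch = sample_training_batch(old, new, batch_size=1000, rng=rng)
    share_new = sum(1 for kind, _ in batch if kind == "new") / len(batch)
    print(f"share of new patterns in the batch: {share_new:.2f}")
```

In this toy setting the new patterns make up only 10% of the dataset but about half of each batch, which is the intended effect of the weighting.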
The datasets were constructed incrementally during the exploration by gathering non-dead patterns. One pattern in every ten is added to the validation set (10%) and the rest are added to the training set. In the initial period of training, the training dataset contains approximately 50 patterns, and in the last period of training it contains approximately 3425 patterns (Fig. 16). The validation dataset only serves for checking purposes and has no influence on the learned goal space.
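The incremental construction of the two datasets can be sketched as follows. The helper `build_datasets` and the `is_dead` predicate are hypothetical names introduced for illustration, and the choice of which pattern within each group of ten goes to the validation set is an assumption.

```python
def build_datasets(patterns, is_dead):
    """Split non-dead patterns into training and validation lists.

    Every tenth non-dead pattern goes to the validation list (assumption:
    the tenth, twentieth, ... pattern in order of discovery).
    """
    training, validation = [], []
    kept = 0  # number of non-dead patterns seen so far
    for pattern in patterns:
        if is_dead(pattern):
            continue  # dead patterns are not used for training the VAE
        kept += 1
        if kept % 10 == 0:
            validation.append(pattern)
        else:
            training.append(pattern)
    return training, validation


if __name__ == "__main__":
    # Toy data: integers stand in for patterns, even numbers count as dead.
    train, val = build_datasets(range(100), is_dead=lambda p: p % 2 == 0)
    print(len(train), len(val))
```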
During the intrinsically motivated iterations, goals are uniformly sampled in the hypercube .
E.4 Results
We compared the different IMGEP-RGS variants, as well as the different objective variants for IMGEPs with learned goal spaces (PGL and OGL), on the basis of the diversity of their identified patterns. Furthermore, we analyzed the pattern reconstruction ability of the VAEs.
E.4.1 Diversity
The algorithms are compared by the diversity they reach in the analytic parameter and behavior spaces (Section C). Diversity is measured by the number of discretized bins that an algorithm explored in each space when each dimension of the space is separated into 7 bins.
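As an illustration of this measure, the following sketch counts the occupied bins for a set of points. The function names and the per-dimension bounds are assumptions made for the example. Following the remark in Section F.2 that the 7 bins include the out-of-border bins, the first and last bin of each dimension are used here for values outside the given bounds.

```python
def bin_index(value, low, high, n_bins=7):
    """Map a value to a bin index in [0, n_bins - 1].

    Bin 0 and bin n_bins - 1 collect values below `low` and above `high`;
    the remaining n_bins - 2 bins divide [low, high] into equal parts.
    """
    if value < low:
        return 0
    if value > high:
        return n_bins - 1
    inner = n_bins - 2
    width = (high - low) / inner
    # min() keeps value == high inside the last inner bin.
    return 1 + min(int((value - low) / width), inner - 1)


def diversity(points, bounds, n_bins=7):
    """Number of distinct bins that contain at least one point."""
    occupied = {
        tuple(bin_index(v, lo, hi, n_bins) for v, (lo, hi) in zip(point, bounds))
        for point in points
    }
    return len(occupied)


if __name__ == "__main__":
    bounds = [(0.0, 1.0), (0.0, 1.0)]  # hypothetical range per dimension
    points = [(0.05, 0.05), (0.06, 0.04), (0.95, 0.95), (-1.0, 0.5), (2.0, 0.5)]
    print(diversity(points, bounds))
```

The first two points fall into the same bin, so they contribute only once to the diversity.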
All the IMGEPs with learned goal spaces reached a higher diversity in the analytic behavior space compared to random explorations (Fig. 17, b), although random explorations have a higher diversity in the analytic parameter space (Fig. 17, a). This result confirms further the advantage of IMGEPs over random explorations in terms of identifying diverse patterns.
Furthermore, all the IMGEPs with learned goal spaces outperformed the IMGEP with a random goal space. This result shows the importance of learning relevant pattern features, which, combined with an effective exploration process, is key to discovering a high diversity of patterns.
There are no significant differences between the Xavier and Kaiming IMGEP-RGS variants. Both seem to have a higher variance than the PyTorch variant and therefore reach a higher average performance, but it is unclear why. Because the Xavier initialization performed best by a slight margin for IMGEP-RGS, it was used for the results in the main paper and in Section F.
The differences between the PGL and OGL variants were small for all diversity measures. OGL showed a slight advantage over PGL in all diversity measures. Thus, an online version of the IMGEP can learn an appropriate goal space during the exploration. A pre-collected dataset, as used for PGL, is not necessary to successfully use IMGEPs.
The difference between the VAE objective variants (VAE, VAE and augmented VAE) was very small. The VAE was slightly better than the other two variants for the diversity in the analytic parameter space and for both IMGEP variants. All VAEs seemed to learn similar features for our datasets. The different VAE variants might behave differently if their parameters, such as the parameter, were fine-tuned, but this was out of the scope of this paper. Because the VAE objective performed best by a slight margin for IMGEP-PGL and IMGEP-OGL, it was used for the results in the main paper and in Section F.
E.4.2 VAE Pattern Reconstruction
All learned VAE variants showed similar learning curves on the pre-collected dataset and the online collected dataset (Fig. 18 and 20). Their ability to reconstruct patterns from the encoded latent representation is also qualitatively similar. For both datasets the VAEs are able to learn the general form of the activity pattern (Fig. 19 and 21). Nonetheless, compressing the images to an 8-dimensional vector results in a general blurriness of the reconstructed patterns. As a result, the VAEs are not able to encode finer details and textures of patterns (Fig. 22). We believe this is the reason for their ability to identify more animals than the random exploration, IMGEP-RGS and IMGEP-HGS. Different animals often have different forms, whereas non-animals often span the whole area of Lenia’s grid and differ mainly in their textures and small details. Because the VAEs seem to mostly encode the general form, a goal space based on them is better suited to finding patterns with different forms, such as animals, than patterns with different textures, which are what distinguishes non-animals.
[Figure: Reconstruction Examples of Non-Animals with Textures. Panels: PGL and OGL, each with the three VAE objective variants.]
Appendix F Additional Results
This section lists additional results. The results are only for a subset of all algorithm variants that have been evaluated. The labels correspond to the following algorithm variants: Random refers to Random Initialization (Section D), IMGEP-RGS to IMGEP-RGS Xavier (Section E), IMGEP-HGS to IMGEP-HGS 9 (Section D), IMGEP-PGL to IMGEP-PGL with a VAE (Section E) and IMGEP-OGL to IMGEP-OGL with a VAE (Section E).
F.1 Number of Identified Patterns
The main paper compares the algorithms by the diversity of the patterns they found. Another measure is the number of patterns each algorithm identified for each of the three pattern classes: dead, animals and non-animals (Fig. 23).
The results deviate slightly from the diversity measures. In terms of identified non-dead patterns, all IMGEP approaches outperform a random exploration by finding between 10 and 20% more patterns. Although IMGEP-RGS and IMGEP-HGS find more non-dead patterns than the IMGEPs with learned goal spaces (OGL, PGL), their overall diversity in the analytic behavior space is smaller (Fig. 3, b of the main paper).
In the case of animal patterns, all IMGEP approaches outperform the random exploration (8%). Within the IMGEP approaches, the online learned goal space approach (IMGEP-OGL: 34%) and the pre-trained goal space approach (IMGEP-PGL: 35%) find a similar number. The hand-defined goal space approach identified fewer animal patterns (IMGEP-HGS: 19%). The random goal space approach finds the fewest (IMGEP-RGS: 10%). For non-animal patterns, the random goal space approach identifies the most patterns (IMGEP-RGS: 79%), followed by the hand-defined approach (IMGEP-HGS: 67%), the random exploration (56%) and both learned goal space approaches (IMGEP-OGL: 45%, IMGEP-PGL: 43%). Although the number of identified non-animal patterns for the learned goal space approaches is low, their diversity is higher than for a random exploration and the random goal space approach, and only slightly lower than for the hand-defined goal space approach (Fig. 3, d of the main paper).
F.2 Dependence of the Diversity Measure on the Number of Bins per Dimension
The diversity of identified patterns measures the spread of the area which the identified patterns cover in the analytic behavior space (Fig. 3 in the main paper). The measure is defined by dividing the space into a number of discrete areas or bins (Section C.1). The diversity is then measured by how many bins are covered during an exploration. The bins are created by dividing each dimension of the space into a number of equally sized bins. We analyzed how the number of bins per dimension influences the diversity measure (Fig. 24).
Although the size of the diversity difference between the algorithms depends on the number of bins per dimension for each space, the order of the algorithms, i.e. which algorithm has a higher diversity, is generally constant. Only if the number of bins per dimension grows large (>10) does the order of the algorithms change for some spaces and pattern subsets. The order then starts to follow the order seen for the number of identified patterns (compare the diversity with 25 bins per dimension in Fig. 24 with the number of identified patterns in Fig. 23). In this case the discretization of the space becomes too fine and each pattern falls into its own bin, so that the measure merely counts patterns. We therefore chose a smaller number of 7 bins per dimension (including the out-of-border bins) for all other diversity plots in the main paper and the Supplementary Material to compare the algorithms in a meaningful way.
[Figure panels: (a) Diversity in Parameter Space, (b) Diversity in Behavior Space, (c) Behavior Space Diversity for Animals, (d) Behavior Space Diversity for Non-Animals.]
F.3 Dimension Reduction of the Analytic Parameter and Behavior Space
A two-dimensional reduction of the identified patterns in the analytic parameter and behavior space (Section C) visualizes the diversity of the parameters and identified patterns. The dimension reduction of the parameter space is based on all explored parameters encoded in the analytic parameter space from the first repetition experiment of all 4 algorithms. All encoded points were normalized so that the overall minimum value became 0 and the maximum value 1 for each dimension. Afterwards a principal component analysis (PCA) was performed to detect the 2 principal components (Jolliffe1986). The found patterns for each algorithm are plotted according to those components.
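The two processing steps, per-dimension min-max normalization followed by a projection onto the first two principal components, can be sketched with NumPy as shown below. This is an illustrative sketch on random toy data, not the analysis code of the paper; the function names are hypothetical, and the PCA is computed here through a singular value decomposition of the centered data.

```python
import numpy as np


def normalize_min_max(points):
    """Rescale each dimension so that its minimum is 0 and its maximum is 1."""
    points = np.asarray(points, dtype=float)
    low = points.min(axis=0)
    span = points.max(axis=0) - low
    span[span == 0] = 1.0  # constant dimensions are mapped to 0
    return (points - low) / span


def pca_2d(points):
    """Project points onto their first two principal components."""
    centered = points - points.mean(axis=0)
    # Rows of vt are the principal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: 50 points in a 4-dimensional space with very different scales.
    data = rng.normal(size=(50, 4)) * np.array([10.0, 1.0, 0.1, 0.01])
    projected = pca_2d(normalize_min_max(data))
    print(projected.shape)
```

Normalizing before the PCA prevents dimensions with large numeric ranges from dominating the principal components.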
[Two figures, each with one panel per algorithm: Random, IMGEP-RGS, IMGEP-HGS, IMGEP-PGL, IMGEP-OGL.]
The results show that the random exploration is more uniformly distributed than any of the IMGEP algorithms in the analytic parameter space (Fig. 25). The IMGEP algorithms show concentrations of explorations in specific regions of the parameter space. The visualization also shows that it is not possible to define distinct regions in the parameter space that differentiate between dead, animal and non-animal patterns.
The same analysis was performed for the identified patterns of each algorithm encoded in the analytic behavior space (Fig. 26). It is visible that the random exploration and the random goal space (RGS) approach are more concentrated than the hand-defined (HGS) or learned goal space approaches (PGL and OGL), especially in a region with many non-animal patterns (northwest). IMGEP-RGS has a spread similar to that of the random exploration, with an even higher concentration of non-animals (northwest). IMGEP-HGS has a wider spread in the non-animal area. The IMGEPs with a learned goal space (PGL and OGL) are more strongly distributed in an area that encodes mostly animals (south).
F.4 Identified Patterns
Fig. 27, 28, 29, 30 and 31 illustrate examples of identified patterns per class (animal, non-animal, dead) and their ratio for the random exploration, IMGEP-RGS, IMGEP-HGS, IMGEP-PGL and IMGEP-OGL. The patterns have been randomly sampled from the results of the first exploration repetition experiment of each algorithm.
F.5 Visualization of Goal Spaces
The goal space of an IMGEP is its most important element because it defines which types of patterns are set as goals for the exploration. This section provides complementary material for the analysis made in Section 5.2 of the main paper and shows further visualizations of goal spaces. The goal spaces of all IMGEP algorithms are visualized via a two-dimensional reduction of each goal space. Two techniques for dimensionality reduction were applied: PCA (Jolliffe1986) and t-Distributed Stochastic Neighbor Embedding (t-SNE) (maaten2008visualizing).
The visualization was constructed by using, for each exploration algorithm, its goal space representations of all patterns it explored in a single repetition experiment. All goal representations were normalized so that the overall minimum value became 0 and the maximum value 1 for each goal space dimension. Afterwards the PCA was performed to detect the 2 principal components. t-SNE was executed with the default Euclidean distance metric and default hyperparameters (perplexity set to 50).
The resulting two-dimensional visualizations of the goal spaces make the differences between the algorithms visible (Fig. 32). For both approaches (PCA, t-SNE), the random goal space (RGS) and hand-defined goal space (HGS) have only a small area and a few clusters for animal patterns. In contrast, the learned goal spaces based on VAEs (PGL and OGL) have larger areas and more clusters for animal patterns. As a result, the learned goal spaces explore more animal patterns and find a higher diversity of them (Fig. 3, c) compared to the hand-defined goal space and the random goal space. The reason for this effect seems to be that the VAE which defines the goal space for PGL and OGL learns to represent the shape of patterns. The shape is an important feature of animals, whereas non-animals often cover the whole Lenia grid and differ mainly in their textures, which the VAE does not represent well (Section E).
The visualization supports a qualitative evaluation and comparison of how efficiently each algorithm extracts a diversity of patterns from the data. Integrated into an interactive interface, these graphs are also useful for a potential human end user to easily explore and visualize the different types of found patterns during the exploration phase. Videos and demonstrations of the interface can be found on the website https://automateddiscovery.github.io/.
[Figure panels: columns PCA and t-SNE; rows RGS, HGS, PGL and OGL.]