# Transient Compartmentalization dynamics in the presence of mutations and noise

###### Abstract

We extend a recently introduced framework for transient compartmentalization of replicators with selection dynamics, by including the effect of mutations and noise in such systems. In the presence of mutations, functional replicators (ribozymes) are turned into non-functional ones (parasites). We evaluate the phase diagram of a system undergoing transient compartmentalization with selection. The system can exhibit either coexistence of ribozymes and parasites, or a pure parasite phase. If the mutation rate exceeds a certain level called the error threshold, the only stable phase is the pure parasite one. Transient compartmentalization with selection can relax this error threshold with respect to a bulk quasispecies case, and even allow ribozymes to coexist with faster growing parasites.

In order to analyze the role of noise, we also introduce a model for the replication of a template by an enzyme. This model admits two regimes: a diffusion-limited regime, which generates high noise, and a replication-limited regime, which generates low noise at the population level. Based on this model, we find that, since the ribozyme dynamics belongs to the replication-limited regime, the effects of noise on the phase diagram of the system are mostly negligible. Our results underline the importance of transient compartmentalization for prebiotic scenarios, and may have implications for directed evolution experiments.

###### keywords:

Error Catastrophe, Mutation, Parasites, Growth Noise, RNA World

## 1 Introduction

Compartments play a central role in many biological processes in cells, in particular in organelles such as the ER or the Golgi apparatus Jaiman2018 (). Cells use compartments to organize chemical reactions in space: compartments eliminate the risk of losing costly catalysts that are essential for biochemical reactions, and they accelerate chemical reactions while reducing the risk of cross-talk with side reactions.

In the early 20th century, Oparin suggested that membrane-less compartments, which he called coacervates, could have played a central role in the origin of life Oparin1952 (). Many compartments in cells are bounded by membranes, but following the recent discovery of the so-called P-granules in C. elegans embryos Brangwynne2009 (), biologists noticed that membrane-less compartments abound in living organisms. These membrane-less compartments are of interest to physicists who want to better understand the active non-equilibrium phase separation that creates them Zwicker2017 (), and also to chemists who are trying to synthesize and control them in vitro Brinke2018 (); Nakashima2018 ().

After the discovery of the structure of DNA, the coacervates scenario for the origin of life got less popular, and was replaced by replication scenarios Higgs2015 (); Dyson1985 (). In the sixties, Spiegelman showed that RNA could be replicated by an enzyme called RNA replicase, in the presence of free nucleotides and salt SPIEGELMAN1965 (). After a series of serial transfers, he observed the appearance of shorter RNA polymers, which he called parasites. Typically, these parasites are non-functional molecules which replicate faster than the RNA polymers introduced at the beginning of the experiment. In 1971, Eigen conceptualized this observation by proving theoretically that for a given accuracy of replication and a relative fitness of parasites, there is a maximal genome length that can be maintained without errors Eigen1971 (). This result led to the following paradox: to be a functional replicator, a molecule must be long enough. However, if it is long, it cannot be maintained since it will quickly be overtaken by parasites. This puzzle eventually played a central role in origin of life studies Smith1995 (); Takeuchi2012 ().

In the eighties, a theoretical solution emerged, the Stochastic corrector model Szathmary1987 () inspired by ideas of group selection Wilson1975 (). The idea is to use compartmentalization and selection to maintain functional replicators (ribozymes). The compartments in that model undergo cell division, which is a sophisticated feature that strongly constrains the allowed prebiotic scenarios.

In order to address this point, and also to assess the role of transient compartmentalization using a quantitative theoretical model, we introduced a general class of multilevel selection with transient compartmentalization Blokhuis2018 (). This class includes several scenarios for the origin of life based on various types of compartments (lipid vesicles vesicles (), pores kreysing2015heat (); Baaske2007 (), inorganic compartments hydrovent (), ...) or various protocols of transient compartmentalization Damer2015 (); Furubayashi2018 (), as well as a recent experiment in which small droplets containing RNA in a microfluidic device were used as compartments Matsumura2016 ().

The related issue of cooperation between producers and non-producers has been discussed before Chuang2009 (). When compartments are not well defined, their role can be played by spatial clustering, which can favor the survival of cooperating replicators Tupper2017 (); Kim2016 (). These ideas were combined in a recent study of a population of individuals growing in a large number of compartmentalized habitats, called demes Geyrhofer2018 (). Another recent related study on transient compartmentalization quantifies co-encapsulation effects in the context of directed evolution experiments Zadorin2017 ().

In this paper, we go beyond the analysis carried out in our previous work Blokhuis2018 () by including the effect of mutations and noise. The motivation for including mutations comes from experiments, since mutations play a role in the RNA droplet experiment Matsumura2016 () which inspired us. The motivation for discussing noise in more detail comes from the realization that replication is inherently stochastic when a small number of replicators are present in compartments. Therefore, the deterministic approach used in our previous work Blokhuis2018 () may seem like a severe approximation. In fact, it is important to appreciate that although our deterministic model neglects some sources of noise such as fluctuations in the growth rates, it includes fluctuations due to the smallness of the number of replicators present in the initial condition. This is a major source of noise because at the end of the exponential phase, the compartments no longer contain small numbers of molecules and therefore fluctuations become negligible again. A similar effect occurs when considering protein aggregation kinetics in small volumes: the fluctuations in the mass concentration of proteins at some later time are mainly due to the smallness of the number of proteins present in the initial condition Michaels2018 ().

In Sec. 2, we recapitulate the essence of the mutation-free model which we have introduced in Blokhuis2018 (), then in Sec. 3, we introduce an extension of that model to include deterministic mutations. The effect of noise is then covered in Sec. 4, which contains, in particular, in Sec. 4.1 a simple model for the replication of a single template by an enzyme, and in Sec. 4.4 an analysis of the growth noise in a population of such replicating molecules. The latter model is finally used to analyze the effect of noise on the transient compartmentalization dynamics introduced in the first section.

## 2 The mutation-free model

### 2.1 Definition of the model

We start from a large pool of molecules, which contains functional molecules called ribozymes and non-functional ones called parasites. Let the fraction of ribozymes in this pool be $x$. These molecules then seed a large number of compartments, which is treated in the infinite limit. A given compartment will contain $n$ molecules, out of which $m$ will be ribozymes and the remaining $n-m$ parasites. It follows that $n$ is a random variable drawn from a Poisson distribution of parameter $\lambda$, while the number $m$ follows a binomial distribution $B_m(n,x)$. The resulting probability distribution for seeded compartments is then

$$P_\lambda(n,x,m) = \frac{\lambda^n e^{-\lambda}}{n!}\,\binom{n}{m}\,x^m (1-x)^{n-m}. \tag{1}$$
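As an illustration of the seeding step, the following minimal Python sketch evaluates the product of a Poisson weight for the compartment size and a binomial weight for the number of ribozymes. The function names and the truncation `n_max` are assumptions made for this example; they do not come from the original work.

```python
import math


def seeding_probability(n, m, lam, x):
    """Probability that a compartment is seeded with n molecules, m of them ribozymes.

    n follows a Poisson law of parameter lam; given n, m is binomial with
    success probability x (the ribozyme fraction of the pool).
    """
    if m < 0 or m > n:
        return 0.0
    poisson = math.exp(-lam) * lam**n / math.factorial(n)
    binomial = math.comb(n, m) * x**m * (1.0 - x) ** (n - m)
    return poisson * binomial


def total_probability(lam, x, n_max=60):
    """Sum of the seeding distribution over all (n, m) with n <= n_max (hypothetical cutoff)."""
    return sum(
        seeding_probability(n, m, lam, x)
        for n in range(n_max + 1)
        for m in range(n + 1)
    )


def mean_ribozymes(lam, x, n_max=60):
    """Average number of ribozymes per compartment, which should equal lam * x."""
    return sum(
        m * seeding_probability(n, m, lam, x)
        for n in range(n_max + 1)
        for m in range(n + 1)
    )
```

The cutoff `n_max` only needs to be large compared with the Poisson parameter for the truncated sums to be accurate.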

In addition to the replicating molecules, a large amount of replication enzymes and activated nucleotides is supplied. We assume that all compartments contain the same amount of replication enzymes and activated nucleotides.

After seeding, the numbers of ribozymes and parasites grow exponentially, which in a deterministic model leads to

$$\bar m = m\, e^{\alpha T}, \tag{2}$$

$$\bar y = (n-m)\, e^{\gamma T}, \tag{3}$$

with $T$ the time at the end of the exponential growth phase, $\alpha$ and $\gamma$ the growth rates of ribozymes and parasites, $\bar m$ the number of ribozymes and $\bar y$ the number of parasites at time $T$. At the end of this growth phase, we have $N = \bar m + \bar y$, at which point further growth will be limited by the number of replication enzymes. As a result, after time $T$, the growth will be linear instead of exponential, but in any case, the system composition, defined here by the relative fraction of ribozymes, will not change. For this reason, we focus on the final composition at time $T$, which is controlled by the ratio $\Lambda = e^{(\gamma-\alpha)T}$. Here, we do not describe precisely the crossover between the exponential and the linear regime, which could be done using the notion of carrying capacity Houchmandzadeh2018 (). In that case, the growth would be described by logistic equations and the carrying capacity would be comparable to $N$. Note also that the exact time $T$ may depend on the seeded composition $(n,m)$, but in practice this dependence has a small effect on the results of the model, as we have checked in the Suppl. Mat. of Ref. Blokhuis2018 ().

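In any case, the ribozyme fraction at the end of the growth phase can be well approximated as

$$\bar x(n,m) = \frac{m}{n\Lambda - (\Lambda-1)\,m}, \tag{4}$$

where $\Lambda$ is the ratio $e^{(\gamma-\alpha)T}$.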
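The final composition can be checked against the exponential growth laws directly. The sketch below assumes the standard reading of the model, in which ribozymes grow at rate `alpha`, parasites at rate `gamma`, and the relative growth factor is `exp((gamma - alpha) * T)`; the function names are illustrative.

```python
import math


def final_ribozyme_fraction(n, m, Lambda):
    """Ribozyme fraction at the end of growth for a compartment seeded with
    n molecules, m of them ribozymes, and relative parasite growth Lambda."""
    if n <= 0:
        raise ValueError("the compartment must contain at least one molecule")
    return m / (n * Lambda - (Lambda - 1.0) * m)


def fraction_from_growth(n, m, alpha, gamma, T):
    """Same quantity computed from the two exponential growth laws."""
    ribozymes = m * math.exp(alpha * T)
    parasites = (n - m) * math.exp(gamma * T)
    return ribozymes / (ribozymes + parasites)
```

Pure compartments keep their composition (fraction 0 or 1) whatever the value of `Lambda`, while mixed compartments are driven toward parasites when `Lambda > 1`.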

If parasites grow faster, we have $\gamma > \alpha$, and thus $\Lambda > 1$, which is the regime considered in Ref. Blokhuis2018 (). In Sec. 3, we also consider regimes in which $\Lambda < 1$.

We now implement selection at the compartment level. In practice, selection could be autonomous or non-autonomous. For instance, in the experiment of Ref. Matsumura2016 (), the selection was non-autonomous: a measurement of the synthesis of a dye molecule by photodetection was used to promote or reject compartments according to the outcome of that measurement. Selection can in general be described by a selection function $f$. In our work, we have assumed that the selection function only depends on the final composition $\bar x$ of the compartment.

For the ribozyme-parasite scenario, a natural choice for $f$ is a monotonically increasing function of $\bar x$. As an example, we will use the sigmoidal function

$$f(\bar x) = \frac{1}{2}\left[1 + \tanh\left(\frac{\bar x - x_{th}}{x_w}\right)\right], \tag{5}$$

where $x_{th}$ and $x_w$ are dimensionless parameters, which describe respectively a threshold in the composition and the steepness of the function.

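The compartments which have passed the selection step are then pooled together, forming a new pool of molecules from which future compartments can be seeded. The ribozyme fraction $x'$ of this new ensemble is the average of $\bar x$ among the selected compartments,

$$x' = \frac{\langle \bar x\, f(\bar x) \rangle}{\langle f(\bar x) \rangle}, \tag{6}$$

where the brackets denote an average over the seeding distribution. This is equivalent to

$$x'(\lambda, x) = \frac{\sum_{n,m} \bar x(n,m)\, f(\bar x(n,m))\, P_\lambda(n,x,m)}{\sum_{n,m} f(\bar x(n,m))\, P_\lambda(n,x,m)}. \tag{7}$$

The transient compartmentalization cycle is then repeated, starting with the seeding of new compartments from that pool of composition $x'$.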

Upon repetition of this protocol, the pool composition typically converges to a fixed point $x^*$, which is a solution of

$$x^* = x'(\lambda, x^*). \tag{8}$$
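The full cycle of seeding, growth, selection and pooling can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the Poisson sum is truncated at a hypothetical `n_max`, empty compartments are skipped because they contribute no molecules, and the sigmoid parameters `x_th=0.25` and `x_w=0.1` are illustrative defaults rather than values taken from this text.

```python
import math


def tanh_selection(xbar, x_th=0.25, x_w=0.1):
    """Sigmoidal selection function; x_th and x_w are illustrative values."""
    return 0.5 * (1.0 + math.tanh((xbar - x_th) / x_w))


def next_pool_fraction(x, lam, Lambda, f=tanh_selection, n_max=60):
    """One round of transient compartmentalization: returns the ribozyme
    fraction of the pool formed by the selected compartments."""
    numerator = 0.0
    denominator = 0.0
    for n in range(1, n_max + 1):  # n = 0 skipped: empty compartments hold no molecules
        poisson = math.exp(-lam) * lam**n / math.factorial(n)
        for m in range(n + 1):
            weight = poisson * math.comb(n, m) * x**m * (1.0 - x) ** (n - m)
            xbar = m / (n * Lambda - (Lambda - 1.0) * m)
            numerator += xbar * f(xbar) * weight
            denominator += f(xbar) * weight
    return numerator / denominator


def iterate_rounds(x0, lam, Lambda, rounds=200, f=tanh_selection):
    """Repeat the cycle and return the pool composition after the last round."""
    x = x0
    for _ in range(rounds):
        x = next_pool_fraction(x, lam, Lambda, f=f)
    return x
```

Passing `f=lambda xbar: 1.0` switches selection off, in which case the ribozyme fraction can only decrease from round to round when `Lambda > 1`.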

The stability of the fixed point changes when

$$\left.\frac{d x'}{d x}\right|_{x=x^*} = 1. \tag{9}$$

It is implicitly assumed that $x'$ is a sufficiently smooth function of $x$ for this derivative to be defined.

### 2.2 Main dynamical regimes

Although finding a fixed point is generally difficult, our ribozyme-parasite model contains two simple fixed points: $x=0$ and $x=1$. By evaluating the stability of these two fixed points, four regimes can be distinguished, which are shown in the phase diagram in Fig. 1. If $x=1$ is stable and $x=0$ unstable, ribozymes are stabilized and parasites are purged. If $x=0$ is stable and $x=1$ unstable, parasites deterministically invade the pool and purge ribozymes. If both $x=0$ and $x=1$ are unstable, trajectories from either side are attracted to a stable third fixed point $x^*$, leading to stable coexistence between parasites and ribozymes. Finally, if both $x=0$ and $x=1$ are stable, their basins of attraction are separated by a third fixed point $x^*$, which is unstable, and in this case we have a bistable regime in which the initial composition determines the fate of the system.
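The four regimes can be identified numerically by estimating the slope of the round-to-round map at the two trivial fixed points. The sketch below uses one-sided finite differences; the step `eps`, the cutoff `n_max`, the regime labels and the sigmoid parameters are assumptions made for this example, and the classification presumes that no extra fixed points are present.

```python
import math


def tanh_selection(xbar, x_th=0.25, x_w=0.1):
    """Sigmoidal selection function with illustrative parameters."""
    return 0.5 * (1.0 + math.tanh((xbar - x_th) / x_w))


def next_fraction(x, lam, Lambda, f, n_max=60):
    """Pool composition after one round (empty compartments are skipped)."""
    num = den = 0.0
    for n in range(1, n_max + 1):
        poisson = math.exp(-lam) * lam**n / math.factorial(n)
        for m in range(n + 1):
            w = poisson * math.comb(n, m) * x**m * (1.0 - x) ** (n - m)
            xbar = m / (n * Lambda - (Lambda - 1.0) * m)
            num += xbar * f(xbar) * w
            den += f(xbar) * w
    return num / den


def classify_regime(lam, Lambda, f=tanh_selection, eps=1e-6):
    """Label the regime from the stability of the fixed points at 0 and 1."""
    # A fixed point is stable when the slope of the map there is below one.
    slope_at_0 = next_fraction(eps, lam, Lambda, f) / eps
    slope_at_1 = (1.0 - next_fraction(1.0 - eps, lam, Lambda, f)) / eps
    stable_0 = slope_at_0 < 1.0
    stable_1 = slope_at_1 < 1.0
    if stable_0 and stable_1:
        return "bistable"
    if stable_1:
        return "ribozyme"
    if stable_0:
        return "parasite"
    return "coexistence"
```

Scanning `classify_regime` over a grid of `lam` and `Lambda` values gives a numerical picture of the phase boundaries for a chosen selection function.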

These conclusions can only be drawn provided there are no other fixed points besides $x=0$, $x=1$ and $x^*$. Extra fixed points come in pairs (one stable, one unstable) and matter only if they are situated within $[0,1]$, in which case a stable coexistence and a bistable phase would be added to the behavior inferred from the other fixed points. For simple monotonically increasing selection functions, we find that extra fixed points are a rare occurrence. Nevertheless, a case where this occurs has been discussed in the Suppl. Mat. of Ref. Blokhuis2018 ().

### 2.3 Comparison to experiments

In addition to predicting the phase diagram associated with the long-time compositions reached by this transient compartmentalization dynamics, our theoretical model also makes predictions regarding the evolution of the ribozyme fraction as a function of the round number, i.e. the number of completed cycles of compartmentalization. The model correctly reproduces the observation that this fraction quickly goes to zero as a function of the round number in bulk, less quickly with compartmentalization and no selection, and even less quickly with compartmentalization and selection. In the latter case, a finite fraction can be maintained for an infinite number of rounds provided $\lambda$ is sufficiently small, corresponding to the coexistence region of the phase diagram.

In order to compare precisely the predictions of the model to the experiments of Ref. Matsumura2016 (), it is important to know the value of key parameters such as $\Lambda$. Table 1 reports the experimental parameters measured in Ref. Matsumura2016 () for the ribozyme and three different parasites: the nucleotide length, the doubling time, the replication rate relative to that of the ribozyme, and, in the final column, the value of $\Lambda$ that we infer from it. The doubling time of the ribozyme is related to its growth rate $\alpha$ by $\ln 2/\alpha$, and similarly the doubling time of a parasite is $\ln 2/\gamma$.

| Type | Length (nt) | Doubling time | Relative replication rate | $\Lambda$ |
|---|---|---|---|---|
| Ribozyme | 362 | 25.0 | 1.00 | 1 |
| Parasite 1 | 245 | 20.7 | 1.21 | 13 |
| Parasite 2 | 223 | 17.1 | 1.46 | 107 |
| Parasite 3 | 129 | 14.6 | 1.71 | 473 |
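Since a doubling time equals $\ln 2$ divided by the growth rate, the relative replication rate of a parasite is simply the ribozyme doubling time divided by the parasite doubling time. The short sketch below recomputes that column from the doubling times listed in the table; the dictionary and function names are chosen for this example only.

```python
# Doubling times as listed in the table (same units for all species).
DOUBLING_TIMES = {
    "Ribozyme": 25.0,
    "Parasite 1": 20.7,
    "Parasite 2": 17.1,
    "Parasite 3": 14.6,
}


def relative_replication_rate(doubling_time, reference_doubling_time=25.0):
    """Growth rate relative to the reference species.

    Growth rate = ln(2) / doubling time, so the ln(2) factors cancel in the ratio.
    """
    return reference_doubling_time / doubling_time


def relative_rates(doubling_times=DOUBLING_TIMES):
    """Relative replication rate of every species, rounded to two decimals."""
    reference = doubling_times["Ribozyme"]
    return {
        name: round(relative_replication_rate(t, reference), 2)
        for name, t in doubling_times.items()
    }
```

The rounded ratios agree with the relative replication rates reported in the table.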

In the experiment, a typical compartment contains RNA molecules that can be ribozymes or parasites, molecules of Qβ replicase, and molecules of each NTP. Replication takes place by complexation of RNA with Qβ replicase, which uses NTPs to make a complementary copy. This copy is then itself replicated to reproduce the original. There is a large amount of nucleotides, so that exponential growth of the target RNA proceeds until . This large quantity of enzymes also means that in practice, the noise due to fluctuations in the number of enzymes should be very small. Starting from a single molecule, it takes doubling times to reach this regime. In a parasite-ribozyme mixture, we can estimate using the relative :

(10) |

## 3 A modified model with deterministic mutations

In the deterministic model, we assume that a fraction of replicated ribozyme strands mutate into parasites. Thus, the equations describing the evolution of and in the growth phase assume the form

(11) | |||||

which yields for the first equation

(12) |

where is again the number of ribozymes at the end of the growth phase and the value at the initial time. Now substituting Eq. (12) into the equation for , one finds

(13) |

The ratio between the number of daughters of one parasite molecule and the number of daughters of a ribozyme molecule is now renormalized by the rate : , where is the relative growth of parasites introduced previously in the mutation-free model.

The fraction of ribozymes at the end of the exponential phase is now given by

(14) |

where . We call the mutation ratio, which is a dimensionless measure of mutation versus relative growth (competition). When , we recover the mutation-free model, if mutations become dominant.
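The growth phase with mutations can be explored numerically. The sketch below is built on an assumption that is not confirmed by the text as extracted here: it takes the rate equations to be `dm/dt = (alpha - mu) * m` for ribozymes and `dy/dt = gamma * y + mu * m` for parasites, with `mu` a mutation rate. Under that assumption the closed-form solution, the renormalized ratio `exp((gamma - alpha + mu) * T)` and the dimensionless ratio `mu / (gamma - alpha + mu)` follow by direct integration. All symbol and function names are illustrative.

```python
import math


def closed_form(n, m, alpha, gamma, mu, T):
    """Final ribozyme and parasite numbers under the ASSUMED linear rate equations."""
    gap = gamma - alpha + mu
    if gap == 0.0:
        raise ValueError("degenerate case gamma - alpha + mu = 0 is not handled")
    m_bar = m * math.exp((alpha - mu) * T)
    y_bar = (n - m) * math.exp(gamma * T) + mu * m * (
        math.exp(gamma * T) - math.exp((alpha - mu) * T)
    ) / gap
    return m_bar, y_bar


def composition(n, m, alpha, gamma, mu, T):
    """Ribozyme fraction written with the renormalized ratio and the mutation ratio."""
    Lambda_mut = math.exp((gamma - alpha + mu) * T)  # renormalized relative growth
    delta = mu / (gamma - alpha + mu)                # assumed form of the mutation ratio
    return m / (n * Lambda_mut - (1.0 - delta) * (Lambda_mut - 1.0) * m)


def integrate_growth(n, m, alpha, gamma, mu, T, steps=2000):
    """Fourth-order Runge-Kutta integration of the same assumed equations."""
    def deriv(r, p):
        return (alpha - mu) * r, gamma * p + mu * r

    r, p = float(m), float(n - m)
    h = T / steps
    for _ in range(steps):
        k1 = deriv(r, p)
        k2 = deriv(r + 0.5 * h * k1[0], p + 0.5 * h * k1[1])
        k3 = deriv(r + 0.5 * h * k2[0], p + 0.5 * h * k2[1])
        k4 = deriv(r + h * k3[0], p + h * k3[1])
        r += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        p += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return r, p
```

Setting `mu = 0` in `composition` recovers the mutation-free expression for the final ribozyme fraction, which is a useful sanity check on the assumed equations.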

Selected compartments are then pooled together, and the new average fraction of ribozymes becomes . Note that for nonzero mutation rate (), ceases to be a fixed point in this deterministic approach, since parasites will always appear at sufficiently long times. Therefore, the pure ribozyme (R) phase is no longer present in the phase diagram of fig. 2.

The fixed point however is still present. If this fixed point is stable, we have a pure parasite phase. If it is unstable, there is stable coexistence at a fixed composition. If more fixed points appear, multiple stable compositions are in principle possible.

### 3.1 The prolific parasites regime ()

Prolific parasites have a better bulk reproductive success than ribozymes, when , which is equivalent to and . In a mutation-free model, this would imply necessarily a faster growth of parasites (), but in the present case, we could also allow for slower parasites as compared to ribozymes (i.e. ), provided parasites are aided by a sufficiently high mutation rate .

The phase diagram is evaluated by testing the stability of the fixed point . We find an asymptote behaving like for large , and plateaus for small . The ends of these plateaus locate in the limit at the position of the vertical line separating the ribozyme and bistable phase in the original phase diagram.

Let us first derive the right asymptote in the limit. In this limit, we evaluate by considering compartments of size

(15) |

The fixed point stability condition leads to

(16) |

Upon substituting Eq. (14) evaluated at and approximating , (for ) we find a quadratic equation for , whose only physical solution () is

(17) |

Since we consider monotonically increasing selection functions, . For , we find

(18) |

which is the same expression as the one found in the mutation-free phase diagram Blokhuis2018 (). This explains why there is a single asymptote as is varied in the limit.

The plateaus extend to very low values of . We can find their location by considering only compartments of size . In that case, the final compositions can be or

(19) |

We then have for the composition recursion

(20) |

Evaluating the derivative of , we find

(21) |

Substituting (19), we find that the location of plateaus obeys the implicit equation

(22) |

### 3.2 The prolific ribozymes regime ()

We now consider the opposite case where parasites are less prolific than ribozymes. This means and is equivalent to . This implies that (less aggressive parasites) and is reminiscent of a quasispecies scenario in which a fit ribozyme successfully outcompetes its parasites in bulk Eigen1971 (). Since this can already happen in the absence of selection, we consider here the case where there is no selection, i.e. .

To analyze this regime we again assess the fixed point stability of . We locate numerically the separatrix as shown in Fig 3. We obtain separatrices that for tend to a fixed value of .

Let us start by observing that when , there are only two final compartment compositions for nonempty compartments: or for . We can now distinguish between three initial compartment compositions: (i) only parasites, (ii) no parasites, no ribozymes, and (iii) containing at least one ribozyme. Their associated seeding probabilities are:

(23) | |||||

In that case, we can write the composition recursion equation as

(24) |

The condition yields the expression

(25) |

for the asymptote. For , we obtain using (25)

(26) |

which agrees very well with Fig 3.

Notice that here the coexistence phase is located to the right of the asymptotes, and the parasite phase to the left, whereas in Fig 2 it is the other way around. An intuitive way to understand this is to consider the limit . In this limit, nonempty compartments start with either a parasite or a ribozyme. The former will grow to a fully parasitic compartment, whereas the latter will contain ribozymes plus some parasites acquired by mutations. Therefore, at low , the ribozyme’s capacity to outgrow parasites (competition) cannot be exploited, leading to ribozyme extinction. It is only when ribozymes and parasites are seeded together that the differential growth rate becomes important, which becomes increasingly likely for higher . The phase boundaries in Fig. 3 mark the point where enough compartments engage in competition to allow for ribozyme survival. The mutation strength compares mutation rate to competition. When , there is enough competition to ensure coexistence for all .

### 3.3 Error catastrophe

An error catastrophe corresponds to a situation where the accumulation of replication errors eventually causes the disappearance of ribozymes. Since there are only a parasite (P) and a coexistence phase (C) in the model with mutations, the error catastrophe means that the coexistence region shrinks to the benefit of the parasite phase as the mutation rate increases. One sees this effect in Fig. 2, which corresponds to the prolific parasites regime () discussed above. In this figure, we see a larger coexistence region in the small region, because there the compartmentalization is efficient at purging parasites. As the mutation rate increases, however, this region shrinks because the compartmentalization fails to purge the more numerous parasites.

In Fig. 5, a particular example is provided where and are fixed, such that is fixed, and is varied. Since competition is fixed, we have . The resulting steady-state value then decreases monotonically with , and reaches when crossing the phase boundary in Fig. 5. For small values of , this boundary corresponds to the plateau region; for larger values, it corresponds to the asymptote. As can be seen in Fig. 5, coexistence is stable for much higher values of the mutation rate when the compartment size is small. This means that compartmentalization with selection leads to a relaxed error threshold with respect to the bulk.

The error catastrophe was also studied in the absence of selection, and was shown to be in the prolific ribozymes regime (). In Fig. 7, an example of this case is shown, and there too, we see that the steady-state value of the ribozyme fraction decreases as is increased, until it reaches the phase boundary in Fig. 7. In contrast to Fig. 5, where the error threshold decreases as the size of compartments increases, the trend is just the opposite in Fig. 7, which is expected since the roles of ribozymes and parasites are exchanged here as compared to the prolific parasites regime.

In the prolific parasites regime, with selection, it is interesting to recast the error threshold as a constraint on the length of a polymer to be copied accurately, as done in the original formulation of the error threshold Eigen1971 (). Let us introduce the error rate per nucleotide, . Then, for a sequence of length , we have . Since , it follows from this that . When , we have . Using Eq. (22), we find that the condition to copy the polymer accurately is

(27) |

where and is the number of generations. This criterion has a form similar to the original error threshold Eigen1971 (), namely

(28) |

where represents the selective superiority of the ribozyme. In our model, the equivalent of is which characterizes the compartment selection.

## 4 Noise in growth

For deterministic growth, given by Eqs. (2)-(3), fluctuations in the growth rates, denoted for the ribozymes and for the parasites, have been neglected. In order to estimate whether fluctuations in the growth rates of ribozymes and parasites could have a strong effect, we introduce in the next section a model for the replication process of such molecules by a replicating enzyme. This model includes noise due to the stochastic binding of the replicating enzyme to template molecules and the noise due to the stochasticity of monomer addition once the enzyme is bound to a template. Importantly, this model assumes that the replicase once bound stays always active until completion of the copy of the template, therefore the possibility that the replicase falls off the template before completion of the copy is neglected. Similarly, any effects associated with the interaction of multiple replicases on the same template are neglected. In fact, when the replicase falls off of its template, the copying process is aborted and the shorter chain which has been produced in this way becomes a parasite. We can therefore describe such a process as a mutation using the framework of the previous section. To separate the effects due to mutations and noise clearly, we disregard from now on the possibility of mutations, and we focus in the following on the description of the noise associated with replication. Such a noise can stabilize the ribozyme phase at the expense of coexistence, and the coexistence phase at the expense of the parasite phase. The noise of replication becomes very small when the rate-limiting step is nucleotide incorporation, in which case one can use a deterministic approach.

### 4.1 A minimal model for the replication process

The replication of an RNA strand by a replicase can be considered to proceed through two stages. In the first stage, an RNA molecule binds to a polymerase , to form a complex

(29) |

with the rate .

Subsequently, activated nucleotides are incorporated in a stepwise fashion to the complementary strand. A complex of and with a complementary strand of length will be denoted by , and the strand grows until the final length is achieved, such that

(30) | |||||

(31) |

where for simplicity we have assumed the same rate for both reactions. Let us denote by the total time to yield from , which is the sum of the time associated with the step of complex formation, and with the step of nucleotide incorporations . We thus have

(32) |

with and the time for adding one monomer, which we assumed is distributed according to

(33) |

For simplicity, we choose a single value for all monomer additions. The time for the formation of the complex, is similarly distributed according to

(34) |

where .

Let us denote the moment generating function of by and similarly for by with :

(35) | |||||

(36) | |||||

From one obtains the distribution of replication time by performing an inverse Laplace transform:

(37) |

where represents the inverse Laplace transform. This equation shows that the replication time distribution of one strand of length follows a Gamma distribution Floyd2010 (). For , Eq. (37) becomes a simple exponential distribution, which is a memoryless distribution. For , this distribution has memory and the growth in the number of RNA strands can no longer be described as a simple Markov process. Note that the Gamma distribution is peaked around the mean value of , namely for . In this limit, the replication time has very small fluctuations. This feature has recently been exploited to construct a single-molecule clock, in which the dissociation of a molecular complex occurs after a well-controlled replication time Johnson-Buck2017 ().
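The Gamma form of the replication time can be checked numerically. The sketch below assumes that a strand of `L` nucleotides is copied by `L` independent steps, each exponentially distributed with the same rate `k`, so that the total time follows a Gamma (Erlang) law of shape `L` and rate `k`. The symbols `L` and `k`, the function names and the integration grid are assumptions of this example.

```python
import math


def replication_time_pdf(t, L, k):
    """Density of the sum of L independent exponential steps of rate k."""
    if t < 0.0:
        return 0.0
    if t == 0.0:
        return k if L == 1 else 0.0
    # Evaluate in log space to avoid overflow for long strands.
    log_pdf = L * math.log(k) + (L - 1) * math.log(t) - k * t - math.lgamma(L)
    return math.exp(log_pdf)


def trapezoid(func, a, b, steps):
    """Composite trapezoidal rule on [a, b] with the given number of intervals."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h


def replication_time_moments(L, k, t_max, steps=40000):
    """Numerical normalization, mean and standard deviation of the density."""
    norm = trapezoid(lambda t: replication_time_pdf(t, L, k), 0.0, t_max, steps)
    mean = trapezoid(lambda t: t * replication_time_pdf(t, L, k), 0.0, t_max, steps)
    second = trapezoid(lambda t: t * t * replication_time_pdf(t, L, k), 0.0, t_max, steps)
    return norm, mean, math.sqrt(second - mean * mean)
```

For `L = 1` the density reduces to a plain exponential, while for large `L` it becomes sharply peaked around `L / k`, with a relative width that shrinks as the inverse square root of `L`.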

### 4.2 Coefficient of variation of the replication time

Let us now study the coefficient of variation of the full time , which includes the diffusion of the replicase and the replication step. The generating function of is clearly . Thus, the cumulant-generating function defined as , yields the two moments of the distribution of , namely the mean and the variance . We have

(38) | |||

(39) |

Thus the coefficient of variation of the replication time, namely is given by

(40) |

Fig 8 shows this quantity as function of the length and of the ratio of the rates ().

There are two regimes: on one hand, when , the time taken by the replication step dominates over the time for the replicase to diffuse to its target. If in addition , the coefficient of variation of the time scales as and therefore becomes very small for long strands. This power-law regime is indeed visible as plateaus in Fig. 8.

On the other hand, when , the time to form a complex between the replicase and its template dominates over the replication time. This regime has a large coefficient of variation since as also seen in Fig 8.
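The two regimes follow from adding the means and variances of independent exponential steps. The sketch below assumes one binding step of rate `k_a` followed by `L` incorporation steps of rate `k`; both names are hypothetical labels chosen for this example.

```python
import math


def replication_time_cv(L, k, k_a):
    """Coefficient of variation of the total time to copy one strand.

    The binding time is exponential with rate k_a (mean 1/k_a, variance 1/k_a**2)
    and each of the L incorporation steps is exponential with rate k.
    Independent steps add both their means and their variances.
    """
    mean = 1.0 / k_a + L / k
    variance = 1.0 / k_a**2 + L / k**2
    return math.sqrt(variance) / mean


def regime(L, k, k_a):
    """Crude label: compare the mean binding time with the mean copying time."""
    return "diffusion-limited" if 1.0 / k_a > L / k else "replication-limited"
```

When binding is fast the result approaches `1 / sqrt(L)`, and when binding is slow it approaches one, which matches the low-noise and high-noise regimes described above.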

### 4.3 The generations representation

Let us look at these two growth regimes in a generations representation, where by generation we mean an event in which the template is copied by the replicase. The diffusion-limited regime corresponds to Fig. 9, while the replication-limited regime corresponds to Fig. 10. In this representation, the differences between the two growth regimes become very clear. In the replication-limited regime, generations remain synchronized, until enough noise has accumulated over multiple generations. For two independent strains, generations become desynchronized after about generations. In contrast, in the diffusion-limited regime, fluctuations are very large due to lack of synchronicity in the growth.

These figures have been obtained by simulating the growth of a replicating mixture starting from a single strand. The simulation follows RNA-enzyme complexes, and for each the variable measures the length of the growing complementary strand. For every nucleotide incorporation event, a strand is chosen with probability , after which its number of nucleotides is updated from to . When , we set , we update to , and then we introduce an extra strand variable for the new strand. Both the replication-limited regime and the diffusion-limited regime can be modeled using this simulation. In the latter case, we need to choose , which corresponds to exponentially distributed replication times.
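The simulation procedure described above can be sketched as an event-driven loop. This is a minimal illustration rather than the code used for the figures: the function name, the stopping rule based on a target number of strands, and the seeded random generator are assumptions of this example, and physical time is not tracked.

```python
import random


def simulate_growth(L, target_strands, seed=0):
    """Grow a population of RNA-enzyme complexes from a single strand.

    Each complex carries the current length of its growing complementary strand.
    At every nucleotide incorporation event one complex is chosen uniformly at
    random and extended by one nucleotide. When the copy reaches length L, the
    counter is reset to zero and a new strand with counter zero is added.
    Returns the list of counters and the number of incorporation events.
    """
    rng = random.Random(seed)
    lengths = [0]  # a single template at the start
    events = 0
    while len(lengths) < target_strands:
        i = rng.randrange(len(lengths))  # each complex has probability 1/N
        lengths[i] += 1
        events += 1
        if lengths[i] == L:
            lengths[i] = 0      # the template starts a new copy
            lengths.append(0)   # the finished copy joins the population
    return lengths, events
```

With `L = 1` every event produces a new strand, which corresponds to exponentially distributed replication times, whereas larger `L` keeps the copies of a given generation roughly synchronized.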

### 4.4 Population-level noise

In Sec. 4.2, we have analyzed the noise associated with the replication of a single strand. Ultimately, we wish to quantify the compositional variation of the final population. In order to do so, we turn to the theory of branching processes with variable lifetimes taken randomly from a fixed distribution Karlin1975 (). As explained in Appendix A, this framework describes theoretically a population that grows exponentially starting from a single individual. In our molecular system, this single individual plays the role of the single molecule present in the initial condition before the replication starts, while the distribution of the lifetimes is the replication time distribution obtained in Eq. (37).

For , we find that the average population (starting from a single individual) scales as , with a growth rate . The coefficient of variation of the population size is

(41) |

The renewal theory on which these results are based can be generalized to the case where there are individuals in the initial condition, as shown in Appendix B. The full solution is rather complicated due to correlations between the subpopulations generated by the different molecules present in the initial condition. In the following, we neglect these correlations: therefore the initial molecules generate independent subpopulations, which all start at size and follow the branching process described above and in Appendix A. In that case, each subpopulation now has a mean and a standard deviation . This allows us to write

(42) |
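When the subpopulations founded by different initial molecules are treated as independent and identically distributed, their means and variances add. The sketch below spells out the consequence for the coefficient of variation; the function and argument names are chosen for this example.

```python
import math


def pooled_statistics(n, mean_one, std_one):
    """Mean and standard deviation of the sum of n independent subpopulations.

    mean_one and std_one describe the population grown from a single founder.
    Independence means that variances (not standard deviations) add.
    """
    mean_n = n * mean_one
    std_n = math.sqrt(n) * std_one
    return mean_n, std_n


def pooled_cv(n, mean_one, std_one):
    """Coefficient of variation of the pooled population."""
    mean_n, std_n = pooled_statistics(n, mean_one, std_one)
    return std_n / mean_n
```

The pooled coefficient of variation is the single-founder value divided by the square root of the number of founders, so compartments seeded with more molecules are less noisy.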

We show in Fig. 11 that the corresponding coefficient of variation, , agrees well with simulations of the branching process. The 2000 simulation runs were stopped after a time such that .

### 4.5 Fluctuations in logistic growth

The problem of two species competing for the same resources has been studied in the literature and offers a complementary perspective on the role of noise in a growing population, which was studied in the previous section. Let us consider two such species, which typically start with a few individuals and then grow according to logistic growth. As shown in Ref. Houchmandzadeh2018 (), when the carrying capacity is reached, the number of each species is subject to giant fluctuations (the coefficient of variation is of the order of unity) when the two species have similar growth rates. In the terminology introduced in the previous section, this model applies to the diffusion-limited regime (), where a Markov description of the population dynamics is applicable.

Keeping the notation of the first section, we denote by the initial number of molecules, which splits into ribozymes and parasites, and by the final number of molecules in the compartment. In the neutral case (), the moments of the number of ribozymes are found to be Houchmandzadeh2018 ():

(43) | |||||

(44) |

with again . Since remains fixed, . This means that

(45) |

for , which means that the noise in the composition depends primarily on the number of individuals in the initial condition. Let us denote , with and . In Ref. Houchmandzadeh2018 (), it was shown that

(46) |

In general, the dynamics of the composition has a large variability for: (i) small compartments (), (ii) mixed compartments (), and for , (iii) comparable growth rates ().

Such a coefficient of variation is asymptotically constant at long times, and the constant depends only on the initial number of molecules. A similar scaling for the coefficient of variation holds in a number of other physical situations, such as the fluctuations in the number of protein filaments formed in small volumes Michaels2018 ().
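The dependence of the compositional noise on the initial number of molecules can be illustrated with a minimal sketch of the neutral case. It assumes a pure-birth model in which each new molecule copies a randomly chosen existing one (a Pólya urn), which is our simplification and not necessarily the exact model of Ref. Houchmandzadeh2018 (). The names `final_fraction` and `predicted_variance` are illustrative, and the predicted variance is the standard beta-binomial result for this urn.

```python
import numpy as np

def final_fraction(m, n, n_final, rng):
    """Grow a neutral population from n molecules (m of them ribozymes) to
    n_final molecules. Assumption: each birth copies a uniformly chosen
    existing molecule, so both species have the same growth rate."""
    ribo, total = m, n
    while total < n_final:
        if rng.random() < ribo / total:
            ribo += 1
        total += 1
    return ribo / n_final

def predicted_variance(m, n, n_final):
    """Variance of the final ribozyme fraction for the urn model above."""
    x0 = m / n
    return x0 * (1.0 - x0) * (n_final - n) / ((n + 1) * n_final)

rng = np.random.default_rng(1)
N_FINAL, RUNS = 200, 4000        # illustrative values
results = {}
for m, n in [(1, 2), (10, 20)]:  # same initial fraction, different sizes
    x = np.array([final_fraction(m, n, N_FINAL, rng) for _ in range(RUNS)])
    results[(m, n)] = (x.var(), predicted_variance(m, n, N_FINAL))
    print(f"m={m}, n={n}: simulated {results[(m, n)][0]:.4f}, "
          f"predicted {results[(m, n)][1]:.4f}")
```

In this sketch the variance of the final composition saturates as the final size grows and is controlled by the initial number of molecules, so that small compartments are much noisier than large ones with the same initial composition.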

### 4.6 Noise in transient compartmentalization

Let us now apply the results of Sec. 4.4 to analyze the effect of the growth noise on our transient compartmentalization dynamics. Let us assume that the length of the ribozymes is and that of the parasites . For experimental values of these parameters we refer the reader to Table 1. In Sec. 2, we have defined to be the initial number of ribozymes and parasites and to be the final mean number of ribozymes and parasites at the end of the growth phase in a given compartment. Using Eqs. (41)-(42), we obtain

(47) |

Since the ribozyme fraction at the end of the exponential phase is given by and , the noise on takes the following form:

where we have used Eq. (42) with . The factor is largest for and vanishes for pure parasite and pure ribozyme compartments, which means that this noise can be neglected when or . Note that if we choose (and thus ), and , Eq. (48) becomes

(49) |

which is consistent with Eq. (45), which was found using a different formalism Houchmandzadeh2018 ().

Using the parameters of Table 1 and Eq. (41), we can quantify the level of noise in the number of ribozymes or parasites. We find from this table that the ribozyme size was , and that the experiment should be in the replication-limited regime because the diffusion time scale should be approximately over times smaller than replication times of the order of s. The noise in composition should be maximal when we start with one ribozyme and one parasite of equal length, and with , which on average gives . Consequently, the noise in composition is at most .

### 4.7 A weak noise approach

The growth equations given by Eqs. (2) and (3) are deterministic, which means that a given initial condition yields a unique final composition . In a stochastic approach, by contrast, a given and lead to many different trajectories, so that is a random variable with a probability distribution . Consequently, the ribozyme fraction after one round is

(50) |

This expression is computationally demanding to evaluate for , but it can be simplified significantly in the weak noise limit.

To construct a phase diagram in this limit, we simplify Eq. (50) by considering , where denotes a normal distribution with mean and standard deviation defined by Eq. (48). From Eq. (48) we expect the effect of noise to be largest when and are close to 1 (if ). In Fig. 13, the original phase diagram from Ref. Blokhuis2018 () is shown together with the modified phase boundaries (dotted lines) due to the presence of Gaussian noise using Eq. (50) for the case that .

Given that the amplitude of this type of noise should rapidly diminish for larger , and that in the experiment, we expect our ribozyme-parasite scenario to be well described by deterministic dynamics. We also see that noise stabilizes the pure ribozyme phase (R) with respect to the coexistence phase (C): in the presence of noise, the R region grows at the expense of the C region. Similarly, noise stabilizes the coexistence region (C) against the parasite region (P).

## 5 Conclusion

In this paper, we have carried out two important extensions of our previous work on transient compartmentalization Blokhuis2018 (), by including the effect of mutations and noise in such systems. This new study confirms one result of our previous work, namely that transient compartmentalization alone can stabilize functional replicators in the absence of a division of compartments of the kind considered in the stochastic corrector model Szathmary1987 (). We can now add that this property is robust with respect to mutations and noise, an important aspect for origin-of-life studies.

In the presence of mutations, we have found that the phase diagram of long-time composition of this system only contains the parasite and the coexistence phases. The case where ribozymes grow faster than the parasites can be analyzed in terms of a modified error threshold, which interestingly now depends on the dynamics of compartmentalization and selection.

To analyze the role of noise in this system, we have introduced a simple model for the replication of a template by an enzyme. In the replication-limited regime of that model, which should correspond to our experimental conditions, the noise at the population level should be low, and we have quantified it using tools borrowed from the theory of branching processes. Finally, we have studied the modified phase diagram of our model in this weak noise limit.

Of course, the two effects that we have studied here separately, namely mutations and noise, could be present simultaneously. We also cannot exclude that a more detailed modeling of the molecular replication or a different form of compartmentalization dynamics could lead to features not captured by the present treatment. Nevertheless, we think that the present framework represents a basis on which further studies could be built. We hope that our work may contribute not only to studies on the origin of life but also to future developments of related experimental techniques such as digital quantitative PCR Hindson2011 () or directed evolution A.Drame-Maigne2018 ().

## Acknowledgements

A.B. was supported by the Agence Nationale de la Recherche (ANR-10-IDEX-0001-02, IRIS OCAV). L.P. acknowledges support from a chair of the Labex CelTisPhysBio (ANR-10-LBX-0038). We would like to thank Y. Rondelez for many important and insightful suggestions. We acknowledge stimulating discussions with B. Houchmandzadeh.

## Appendix A Population-level noise generated by a single individual in the initial condition

Let us consider an age-dependent renewal process, in which the probability density of branching at age is given by , and upon branching, the probability of having offspring is given by (assumed to be age-independent for simplicity). We would like to evaluate the behavior of the number of individuals at time . Let us define the function by

(51) |

We also define the generating function for the process by

(52) |

where is the probability that . We assume that , i.e., that we start from a single object. We can then evaluate by adding the probability that no branching has occurred between 0 and , which is given by , with the effect of the first branching at time , such that . We obtain

(53) |

Multiplying by and summing, we obtain

(54) |

Taking the derivative with respect to at , we obtain the following equation for the average :

(55) |

where is the average number of daughters upon branching.

To solve this equation in the limit , let us multiply both sides by and take the limit. Since , we obtain

(56) |

This equation allows for a solution different from 0 and if is chosen to satisfy

(57) |

Then, making use of a result by Smith Smith1954 (), we obtain

(58) |

As a consequence, we have

(59) |

In the case we are considering we have

(60) |

and , which yields

(61) |

giving, as long as ,

(62) |
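As a numerical illustration of how a growth rate follows from a lifetime distribution, the sketch below solves the standard Malthusian condition of age-dependent branching processes, namely that the mean offspring number times the Laplace transform of the lifetime density, evaluated at the growth rate, equals one. It assumes Erlang lifetimes (shape `k`, rate `rate`) and binary branching, for which the growth rate also has a closed form. These choices and all names are ours, made for illustration only.

```python
import math

def laplace_erlang(alpha, k, rate):
    """Laplace transform at alpha of an Erlang(k, rate) lifetime density
    (assumed form of the lifetime distribution)."""
    return (rate / (rate + alpha)) ** k

def growth_rate(k, rate, m=2.0, iterations=200):
    """Solve m * laplace_erlang(alpha, k, rate) = 1 for alpha by bisection.
    The left-hand side decreases with alpha, exceeds 1 at alpha = 0 and
    is below 1 at alpha = m * rate, so the root is bracketed."""
    lo, hi = 0.0, m * rate
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if m * laplace_erlang(mid, k, rate) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def growth_rate_closed_form(k, rate, m=2.0):
    """Closed-form root for Erlang lifetimes: rate * (m**(1/k) - 1)."""
    return rate * (m ** (1.0 / k) - 1.0)

# Keep the mean lifetime equal to 1 (rate = k) and sharpen the distribution.
for k in (1, 5, 50):
    print(k, growth_rate(k, float(k)), growth_rate_closed_form(k, float(k)))
print("ln 2 =", math.log(2.0))
```

For a single exponential step the growth rate equals the rate itself, while for a sharply peaked lifetime distribution with unit mean it approaches ln 2, the value expected for synchronous doubling.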

We can use this framework to also evaluate higher moments of the population size, and from that obtain the coefficient of variation of the population size which characterizes the amplitude of the noise. Let us denote the second derivative of the generating function with respect to by

(63) |

At large times, . The variance of the population size follows from the standard relation:

(64) |

For the specific case we are considering, we find

(65) |

After extracting the leading contribution in the large limit, we find:

(66) |

which is numerically close to since .

## Appendix B Population-level noise generated from individuals in the initial condition

If we start from individuals rather than just one, we can write the probability to have individuals at time , , in terms of the subpopulations generated by single individuals,

(67) |

Here, denotes the probability of having a population size of at time , starting from one individual, which was considered in Appendix A. Note that we have added an additional superscript to the notation used in Appendix A to emphasize the initial condition. From this equation, the new generating function follows:

(68) |

From this equation, we obtain the average,

(69) |

which expresses the average with initial strands in terms of the average with one initial strand. For the second moment, we obtain

(70) |

We can then extract by using Eq. (64), which yields

Together with Eq. (69), this leads to

(72) |

which is the coefficient of variation found previously for a single individual in the initial condition, divided by as expected for the growth from independent individuals. This confirms the scaling found in Eq. (42).
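The chain of steps in this appendix can be summarized in the following sketch, written in notation that we introduce here for illustration and that may differ from the symbols used above: $F_1(s,t)$ is the generating function for one founder, $M_1$ and $\sigma_1$ are the corresponding mean and standard deviation, and $n$ is the number of independent founders.

```latex
\begin{align}
F_n(s,t) &= \left[ F_1(s,t) \right]^n , \\
\langle N \rangle_n &= \partial_s F_n \big|_{s=1} = n\, M_1 , \\
\langle N(N-1) \rangle_n &= \partial_s^2 F_n \big|_{s=1}
  = n\, \partial_s^2 F_1 \big|_{s=1} + n(n-1)\, M_1^2 , \\
\sigma_n^2 &= \langle N(N-1) \rangle_n + \langle N \rangle_n - \langle N \rangle_n^2
  = n \left( \partial_s^2 F_1 \big|_{s=1} + M_1 - M_1^2 \right) = n\, \sigma_1^2 , \\
\frac{\sigma_n}{\langle N \rangle_n} &= \frac{1}{\sqrt{n}}\, \frac{\sigma_1}{M_1} .
\end{align}
```

The cross term $n(n-1) M_1^2$ cancels against part of $\langle N \rangle_n^2 = n^2 M_1^2$, which is why the variances of independent subpopulations simply add.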

## References

- (1) A. Jaiman, M. Thattai, Algorithmic biosynthesis of eukaryotic glycans, bioRxiv doi:10.1101/440792.
- (2) A. I. Oparin, Origin of Life, Dover, 1952.
- (3) C. P. Brangwynne, C. R. Eckmann, D. S. Courson, A. Rybarska, C. Hoege, J. Gharakhani, F. Jülicher, A. A. Hyman, Germline P granules are liquid droplets that localize by controlled dissolution/condensation, Science 324 (5935) (2009) 1729–1732. doi:10.1126/science.1172046.
- (4) D. Zwicker, R. Seyboldt, C. A. Weber, A. A. Hyman, F. Jülicher, Growth and division of active droplets provides a model for protocells, Nat. Phys. 13 (4) (2017) 408–413. doi:10.1038/nphys3984.
- (5) E. Brinke, J. Groen, A. Herrmann, H. A. Heus, G. Rivas, E. Spruijt, W. T. S. Huck, Dissipative adaptation in driven self-assembly leading to self-dividing fibrils, Nature Nanotechnology 13 (9) (2018) 849–855. doi:10.1038/s41565-018-0192-1.
- (6) K. K. Nakashima, J. F. Baaij, E. Spruijt, Reversible generation of coacervate droplets in an enzymatic network, Soft Matter 14 (2018) 361–367. doi:10.1039/c7sm01897e.
- (7) P. G. Higgs, N. Lehman, The RNA world: molecular cooperation at the origins of life, Nat. Rev. Genet. 16 (1) (2015) 7–17. doi:10.1038/nrg3841.
- (8) F. Dyson, Origins of life, Cambridge University Press, 1985.
- (9) S. Spiegelman, I. Haruna, I. B. Holland, G. Beaudreau, D. Mills, The synthesis of a self-propagating infectious nucleic acid with a purified enzyme, Proc. Natl. Acad. Sci. USA 54 (3) (1965) 919–927. doi:10.1073/pnas.54.3.919.
- (10) M. Eigen, Self-organization of matter and the evolution of biological macromolecules, Naturwissenschaften 58 (10) (1971) 465–523. doi:10.1007/BF00623322.
- (11) J. Maynard Smith, E. Szathmáry, The Major Transitions in Evolution, Freeman, Oxford, 1995.
- (12) N. Takeuchi, P. Hogeweg, Evolutionary dynamics of RNA-like replicator systems: A bioinformatic approach to the origin of life, Physics of Life Reviews 9 (3) (2012) 219–263. doi:10.1016/j.plrev.2012.06.001.
- (13) E. Szathmáry, L. Demeter, Group selection of early replicators and the origin of life, J. Theor. Biol. 128 (4) (1987) 463–486. doi:10.1016/S0022-5193(87)80191-1.
- (14) D. S. Wilson, A theory of group selection, Proc. Natl. Acad. Sci. USA 72 (1) (1975) 143–146. doi:10.1073/pnas.72.1.143.
- (15) A. Blokhuis, D. Lacoste, P. Nghe, L. Peliti, Selection dynamics in transient compartmentalization, Phys. Rev. Lett. 120 (2018) 158101. doi:10.1103/PhysRevLett.120.158101.
- (16) P. L. Luisi, P. Walde, T. Oberholzer, Lipid vesicles as possible intermediates in the origin of life, Curr. Op. Coll. Int. Sci. 4 (1) (1999) 33–39. doi:10.1016/S1359-0294(99)00012-6.
- (17) M. Kreysing, L. Keil, S. Lanzmich, D. Braun, Heat flux across an open pore enables the continuous replication and selection of oligonucleotides towards increasing length, Nature Chemistry 7 (2015) 203–208. doi:10.1038/nchem.2155.
- (18) P. Baaske, F. M. Weinert, S. Duhr, K. H. Lemke, M. J. Russell, D. Braun, Extreme accumulation of nucleotides in simulated hydrothermal pore systems, Proc. Natl. Acad. Sci. USA 104 (22) (2007) 9346–9351. doi:10.1073/pnas.0609592104.
- (19) E. V. Koonin, W. Martin, On the origin of genomes and cells within inorganic compartments, Trends Genet. 21 (12) (2005) 647–654. doi:10.1016/j.tig.2005.09.006.
- (20) B. Damer, D. Deamer, Coupled phases and combinatorial selection in fluctuating hydrothermal pools: A scenario to guide experimental approaches to the origin of cellular life, Life 5 (1) (2015) 872–887. doi:10.3390/life5010872.
- (21) T. Furubayashi, N. Ichihashi, Sustainability of a compartmentalized host-parasite replicator system under periodic washout-mixing cycles, Life 8 (3) (2018) 10. doi:10.3390/life8010003.
- (22) S. Matsumura, A. Kun, M. Ryckelynck, F. Coldren, A. Szilágyi, F. Jossinet, C. Rick, P. Nghe, E. Szathmáry, A. D. Griffiths, Transient compartmentalization of RNA replicators prevents extinction due to parasites, Science 354 (6317) (2016) 1293–1296. doi:10.1126/science.aag1582.
- (23) J. S. Chuang, O. Rivoire, S. Leibler, Simpson’s paradox in a synthetic microbial system, Science 323 (5911) (2009) 272–275. doi:10.1126/science.1166739.
- (24) A. S. Tupper, P. G. Higgs, Error thresholds for RNA replication in the presence of both point mutations and premature termination errors, J. Theor. Biol. 428 (2017) 34–42. doi:10.1016/j.jtbi.2017.05.037.
- (25) Y. E. Kim, P. G. Higgs, Co-operation between polymerases and nucleotide synthetases in the RNA world, PLoS Computational Biology 12 (11) (2016) e1005161. doi:10.1371/journal.pcbi.1005161.
- (26) L. Geyrhofer, N. Brenner, Coexistence and cooperation in structured habitats, bioRxiv doi:10.1101/429605.
- (27) A. Zadorin, Y. Rondelez, Natural selection in compartmentalized environment with reshuffling, arXiv:1707.07461.
- (28) T. C. T. Michaels, A. J. Dear, T. P. J. Knowles, Stochastic calculus of protein filament formation under spatial confinement, New Journal of Physics 20 (5) (2018) 055007. doi:10.1088/1367-2630/aac0bc.
- (29) B. Houchmandzadeh, Giant fluctuations in logistic growth of two species competing for limited resources, Phys. Rev. E 98 (2018) 042118. doi:10.1103/PhysRevE.98.042118.
- (30) D. L. Floyd, S. C. Harrison, A. M. van Oijen, Analysis of kinetic intermediates in single-particle dwell-time distributions, Biophys. J. 99 (2) (2010) 360–366. doi:10.1016/j.bpj.2010.04.049.
- (31) A. Johnson-Buck, W. M. Shih, Single-molecule clocks controlled by serial chemical reactions, Nano Letters 17 (12) (2017) 7940–7944. doi:10.1021/acs.nanolett.7b04336.
- (32) S. Karlin, H. M. Taylor, A first course in stochastic processes, Academic Press, 1975. doi:10.1016/C2009-1-28569-8.
- (33) B. J. Hindson, K. D. Ness, D. A. Masquelier, P. Belgrader, N. J. Heredia, A. J. Makarewicz, I. J. Bright, M. Y. Lucero, A. L. Hiddessen, T. C. Legler, et al., High-throughput droplet digital PCR system for absolute quantitation of DNA copy number, Analytical Chemistry 83 (22) (2011) 8604–8610. doi:10.1021/ac202028g.
- (34) A. Dramé-Maigné, I. Golovkova, A. Zadorin, Y. Rondelez, Quantifying the performance of high-throughput directed evolution protocols, arXiv:1811.05288v1.