# Large Deviations of the Activity in the Contact Process: Finite-Time and Finite-$N_c$ Scalings in the Large-$L$ Limit

###### Abstract

In a recent paper, we studied the finite-time and finite-$N_c$ (population size) scalings of a large deviation function (LDF) estimator by means of the cloning algorithm. Its convergence speed was used to extract its asymptotic behavior in the infinite-$t$ and infinite-$N_c$ limits. For the cases analyzed, this limit yielded a better LDF estimation than the standard estimator. The approach was based on the fact that the systematic errors of the estimator scale as $1/t$ and $1/N_c$ in the large-$t$ and large-$N_c$ asymptotics. However, the validity of these scalings and the efficiency of the proposed method were verified only in cases for which the number of sites $L$ (where the dynamics occurs) was small. In this paper, the analysis is extended to a wider range of system sizes $L$. In order to characterize the large-$L$ behavior, we introduce the $t^{-\alpha_t}$- and $N_c^{-\alpha_N}$-scalings for the LDF. We show the dependence of the exponents $\alpha_t$ and $\alpha_N$ on $L$ and how the configuration where $\alpha_t \approx 1$ and $\alpha_N \approx 1$ is restricted to specific regions of the $s$-$L$ plane.

###### pacs:

05.40.-a, 05.10.-a, 05.70.Ln

## I Introduction

In order to study the properties of rare events and rare trajectories in stochastic dynamics, a large variety of methods have been developed Touchette (2009); Giardinà et al. (2011); Bucklew (2013). The numerical approaches range from importance sampling Kahn and Harris (1951), to “go with the winner” algorithms Aldous and Vazirani (1994); Grassberger (2002), adaptive multilevel splitting Cérou and Guyader (2007) and transition path sampling Bolhuis et al. (2002). Throughout this paper we give particular attention to the population dynamics algorithms Giardinà et al. (2006); Lecomte and Tailleur (2007); Tailleur and Lecomte (2009). Under this approach, rare trajectories of a system are studied by exponentially biasing their probability. The resulting modified dynamics consists of the coupled evolution of a large number of copies of the original process, supplemented with a selection rule according to which a copy of the system is multiplied if it is rare and killed if it is not. This selection-mutation mechanism favors the occurrence of the atypical trajectories of interest, which become typical under the biased dynamics.

The distribution of the class of rare trajectories in the original dynamics is related to the exponential growth of the population of clones of the system, and the LDF can be estimated from its growth rate. The numerical determination of this estimator is systematized in a method known as the cloning algorithm, which can be performed in a number of ways Giardinà et al. (2006); Lecomte and Tailleur (2007); Tailleur and Lecomte (2009); Giardinà et al. (2011); Guevara Hidalgo and Lecomte (2016); Nemoto et al. (2017); Guevara Hidalgo et al. (2017). However, this method introduces two additional parameters into consideration: the population size $N_c$ and the (simulation) time $t$. Both considerably affect the accuracy of the LDF estimation, which is expected to be high in the infinite-$t$ and infinite-$N_c$ limit. Given that this limit is not achievable in practice, what is generally done is to choose these parameters large enough that the average estimator (over several realizations) does not depend on them.

The finite-$t$ and finite-$N_c$ scalings of the LDF were analyzed recently following two different approaches: an analytical one, in Ref. Nemoto et al. (2017), using a discrete-time version of the population dynamics algorithm Giardinà et al. (2006), and a numerical one, in Ref. Guevara Hidalgo et al. (2017), using a continuous-time version Lecomte and Tailleur (2007); Tailleur and Lecomte (2009). The systematic errors of these scalings were found to behave as $1/t$ and $1/N_c$ in the large-$t$ and large-$N_c$ asymptotics, respectively. Moreover, these scaling properties can be used to improve the LDF estimation (as shown in Ref. Guevara Hidalgo et al. (2017)). This is done by considering that the asymptotic behavior of the estimator in the $t \to \infty$ and $N_c \to \infty$ limits may be extrapolated from the data obtained from simulations at finite (and relatively small) $t$ and $N_c$. The improvement in the LDF estimation was illustrated on a simple two-state annihilation-creation dynamics (in one site) and on a more complex system, a contact process Harris (1974) (with a small number of sites). However, the validity of these scalings, as well as the efficacy of the method as the number of sites increases, was left pending. This is precisely the purpose of this paper, where we complement the problem introduced in Ref. Guevara Hidalgo et al. (2017) by extending the analysis of the scaling of the LDF to a large-$L$ contact process.

The paper is organized as follows. In Sec. II we introduce the method used to estimate large deviations of additive observables. The finite-time and finite-$N_c$ scalings of the LDF are summarized in Sec. III.1 and generalized to large-$L$ systems in Sec. IV. We make use of these results in Sec. V, where we check the validity of the $t^{-1}$- and $N_c^{-1}$-scalings (Sec. V.1), their behavior in the $s$-modified dynamics (Sec. V.2), as well as the applicability of the scaling method (Sec. V.3) for a contact process at fixed $L$. This analysis is generalized in Sec. VI, where we characterize the finite-$t$ and finite-$N_c$ scalings of the LDF in the $s$-$L$ plane. Finally, we present our conclusions in Sec. VII.

## II Large Deviations of Additive Observables

In order to analyze the large deviations of the activity in the contact process we make use of the continuous-time version of the “cloning algorithm” Lecomte and Tailleur (2007); Tailleur and Lecomte (2009); Giardinà et al. (2011). This approach allows one to obtain the LDF from the exponential growth (or decay) rate of a set of copies of a system which evolves following an “$s$-modified dynamics”. The procedure followed to obtain this estimator is summarized below.

### II.1 Biased Dynamics and the Cloning Algorithm

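We consider a general Markov dynamics which evolves continuously in time. The system jumps from configuration $C$ to $C'$ with transition rate $W(C \to C')$. The probability $P(C,t)$ to find the system at time $t$ in configuration $C$ verifies the master equation

$$\partial_t P(C,t) = \sum_{C'} W(C' \to C)\, P(C',t) - r(C)\, P(C,t), \tag{1}$$

where $r(C) = \sum_{C'} W(C \to C')$ is the escape rate from configuration $C$.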

A trajectory of configurations jumps, , can be characterized by some additive observable (extensive in time) which is defined as

(2) |

where and describe elementary increments. In that case, the joint distribution describes the probability of finding the system in the configuration , with a value of the observable , and at time .

The procedure followed to analyze the large deviations of these observables consists in biasing the statistical weight of histories of the system by a parameter $s$ Lecomte and Tailleur (2007); Tailleur and Lecomte (2009). A value of $s$ different from $0$ favors the non-typical values of the observable whose average value has been fixed. The large-time limit of the cumulants of the observable can be recovered from the growth rate $\psi(s)$ of the dynamical partition function at large times, as derivatives of $\psi(s)$ at $s = 0$ Touchette (2009). This exponent is called the scaled cumulant generating function (CGF) and its Legendre transform, the large deviation function.

The CGF can be estimated numerically taking into consideration the relation between the dynamical partition function

(3) |

and the Laplace transform of the distribution

(4) |

Thus, the original dynamics has been transformed into an “$s$-modified” one, which verifies the time-evolution equation Garrahan et al. (2009)

(5) |

where , and . The expression

(6) |

represents an $s$-modified transition rate, whereas

(7) |

is the corresponding escape rate. Equation (5) can be interpreted as a population dynamics on a large number of copies of the system which evolves in a coupled way with transition rates and with a selection mechanism of rates Giardinà et al. (2006); Tailleur and Kurchan (2007). Depending on a copy of the system is multiplied or killed, so that under this -biased dynamics an atypical class of histories of the original process become typical. The CGF estimator, that we will denote as , is then obtained from the exponential growth (or decay) rate of a population of copies of the system evolving with these rules. The method which systematize the numerical determination of this estimator is know as cloning algorithm which can be performed in a number of ways Giardinà et al. (2006); Nemoto et al. (2017); Tailleur and Lecomte (2009); Lecomte and Tailleur (2007); Giardinà et al. (2011); Guevara Hidalgo and Lecomte (2016). A detailed description of the version used through this paper can be found in Sec. II.C.1. of Ref. Guevara Hidalgo et al. (2017).

### II.2 CGF Numerical Estimator

Using the constant-population approach of the continuous-time cloning algorithm on a -biased Markov dynamics, the average over realizations of the CGF estimator for clones (or copies of the system) and a final simulation time , is defined as

(8) |

where is the total number of configuration changes in the full population up to time which is the actual final simulation time Guevara Hidalgo and Lecomte (2016). However, as for , it is possible to set in Eq. (8). At each configuration change, the population of clones is increased by a factor where .

It is expected that in the infinite-$t$ and infinite-$N_c$ limits, Eq. (8) provides an accurate estimation of the CGF, i.e.,

(9) |

However, as these limits are not achievable in practice, the best estimation of the large deviation function is obtained by considering a sufficiently large simulation time and number of clones. The dependence of the estimator on these parameters is summarized in Sec. III.1.

### II.3 Contact Process

The process of interest throughout this paper consists of a one-dimensional lattice with $L$ sites and periodic boundary conditions. Each site $i$ is occupied by a spin $n_i$ which can be in two possible states, $n_i = 0$ or $n_i = 1$. The transition rates between states are given by

$$W(n_i = 1 \to n_i = 0) = 1, \tag{10}$$

$$W(n_i = 0 \to n_i = 1) = \lambda\,(n_{i-1} + n_{i+1}) + h, \tag{11}$$

where $\lambda$ and $h$ are positive constants Harris (1974). We will use this system in order to extend the analysis of the large deviations of the activity (see footnote 1) introduced in Ref. Guevara Hidalgo et al. (2017) to the large-$L$ limit.

Footnote 1: The activity is defined as the number of configuration changes on the time interval $[0, t]$. For this observable, each configuration change contributes a unit increment in Eq. (2).
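The local rules of this model translate directly into code. The sketch below assumes the usual contact-process rates with spontaneous creation: an occupied site empties at rate $1$, and an empty site becomes occupied at rate $\lambda (n_{i-1} + n_{i+1}) + h$. The names `lam`, `h`, `flip_rate`, `escape_rate` and `successors` are illustrative choices, not taken from the reference implementation.

```python
def flip_rate(config, i, lam, h):
    """Rate at which site i changes state in a periodic 1D contact process.

    config is a sequence of 0/1 occupations; lam and h are the two positive
    constants of the model (names chosen for this sketch).
    """
    L = len(config)
    if config[i] == 1:
        return 1.0  # annihilation 1 -> 0
    left = config[(i - 1) % L]
    right = config[(i + 1) % L]
    return lam * (left + right) + h  # creation 0 -> 1

def escape_rate(config, lam, h):
    """Total rate of leaving the configuration (sum over single-site flips)."""
    return sum(flip_rate(config, i, lam, h) for i in range(len(config)))

def successors(config, lam, h):
    """All (new_config, rate) pairs reachable by a single spin flip."""
    moves = []
    for i in range(len(config)):
        new = list(config)
        new[i] = 1 - new[i]
        moves.append((tuple(new), flip_rate(config, i, lam, h)))
    return moves

if __name__ == "__main__":
    config = (1, 0, 0, 1, 0)
    print(escape_rate(config, lam=1.75, h=0.1))
    for new_config, rate in successors(config, lam=1.75, h=0.1):
        print(new_config, rate)
```

Because every configuration change increments the activity by one, the escape rate is also the instantaneous rate at which the activity grows along a trajectory.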

## III Scalings of the Large Deviation Function Estimator

The approach described in Sec. II was followed in Ref. Guevara Hidalgo et al. (2017) in order to analyze the finite-time and finite-$N_c$ behavior of the large deviations of the activity for two specific models: a simple one-site annihilation-creation dynamics and a contact process (as described in Sec. II.3) with a small number of sites. The resulting scaling behavior was verified to hold in both cases. However, an analysis of the dependence of this behavior on the number of sites $L$ remained pending. Below, we summarize the finite-time and finite-$N_c$ scalings of the CGF estimator. We will make use of these results in further sections in order to check their validity in the large-$L$ limit.

### III.1 Large-Time and Large-$N_c$ Behavior

The time evolution of the CGF estimator is well described by a curve whose behavior indicates the existence of a $1/t$-convergence to its infinite-time limit ($t^{-1}$-scaling). Similarly, this infinite-time limit exhibits $1/N_c$ corrections for large but finite $N_c$ ($N_c^{-1}$-scaling). These $t^{-1}$- and $N_c^{-1}$-scalings are explicitly described by

(12) | |||

(13) |

where the constant term of Eq. (13) is the infinite-$N_c$ limit of the constant term of Eq. (12). The remaining fit parameters can be interpreted as characteristic times and sizes, respectively (see footnote 2).

Footnote 2: Additionally, the behavior of the CGF estimator at final time (the standard estimator) as a function of the population size $N_c$ is well described by a curve of the same form, indicating that it also converges to its infinite-$N_c$ limit with an error proportional to $1/N_c$.

The curve of Eq. (12) is determined from a fit in time over the CGF estimator up to the final simulation time. On the other hand, the curve of Eq. (13) is obtained from a fit in $N_c$ over the infinite-time limits computed for several population sizes $N_c$. These equations imply that the estimator converges to its infinite-$t$ and infinite-$N_c$ limit proportionally to $1/t$ and $1/N_c$. Moreover, this limit can be extracted from finite-$t$ and finite-$N_c$ data by making use of the scaling method also proposed in Ref. Guevara Hidalgo et al. (2017). The limit obtained in this way rendered a better estimation of the CGF than the standard estimator evaluated at the final simulation time and at the largest number of clones.
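The two-step extrapolation described above can be sketched as follows, assuming that the estimator has already been measured on a grid of times and population sizes. Both fits are linear least-squares problems in the variables $1/t$ and $1/N_c$. The function names, the array layout `psi[k, j]` and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def fit_inverse_scaling(x, y):
    """Least-squares fit of y = y_inf + b / x; returns (y_inf, b)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones_like(x), 1.0 / x])
    (y_inf, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(y_inf), float(b)

def scaling_method(times, populations, psi):
    """Two-step extrapolation of an estimator sampled as psi[k, j].

    psi[k, j] is the estimator for populations[k] at times[j].
    Step 1: for each population size, extrapolate in time (1/t fit).
    Step 2: extrapolate those infinite-time limits in population size (1/N_c fit).
    """
    psi = np.asarray(psi, dtype=float)
    f_inf = [fit_inverse_scaling(times, row)[0] for row in psi]
    limit, _ = fit_inverse_scaling(populations, f_inf)
    return limit, f_inf

if __name__ == "__main__":
    # Synthetic, noise-free data built to obey both scalings exactly.
    times = np.linspace(10.0, 100.0, 10)
    populations = np.array([20, 40, 80, 160])
    true_limit = -0.25
    psi = np.array([[true_limit + 0.8 / n + 1.5 / t for t in times]
                    for n in populations])
    limit, f_inf = scaling_method(times, populations, psi)
    print(limit, f_inf)
```

With real simulation data the fits would be applied to averages over realizations, and the quality of each fit should be inspected before trusting the extrapolated value.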

## IV Scalings in the Large-$L$ Limit

In order to test whether the finite-time and finite-$N_c$ scalings observed in small-$L$ systems are also valid in the large-$L$ limit, we assume that the CGF estimator (and its infinite-time limit) can be described by equations of the form

(14) | ||||

(15) |

redefining in a more general way the scalings (12) and (13) (see footnote 3). We will refer to Eq. (14) as the $t^{-\alpha_t}$-scaling and to Eq. (15) as the $N_c^{-\alpha_N}$-scaling.

Footnote 3: Similarly, the $N_c$-behavior of the standard estimator can be described by the equation
(16)
Here it is important to remark that both the infinite-time limit and the standard estimator scale in the same way in $N_c$.

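The problem reduces to determining the exponents $\alpha_t$ and $\alpha_N$ in order to verify whether effectively $\alpha_t \approx 1$ and $\alpha_N \approx 1$, and whether the constant terms of Eqs. (14) and (15) represent the limits in $t$ and $N_c$ of the CGF estimator. Thus, a value $\alpha_t \approx 1$ recovers the scaling (12), and $\alpha_N \approx 1$ recovers the scaling (13). This is done in the section below on a contact process at fixed $L$.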
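Determining an exponent of this kind amounts to fitting a power-law correction with a free exponent. A minimal sketch, assuming a model of the form $y = y_\infty + b\,x^{-\alpha}$ with $x$ standing for either $t$ or $N_c$, is given below. It scans $\alpha$ on a grid and solves a linear least-squares problem for $(y_\infty, b)$ at each trial value. The grid, the function name and the synthetic data are assumptions of this sketch; a nonlinear optimizer could be used instead.

```python
import numpy as np

def fit_power_law(x, y, alphas=None):
    """Fit y = y_inf + b * x**(-alpha) by scanning alpha.

    For each trial alpha the model is linear in (y_inf, b), so these two are
    obtained by linear least squares; the alpha with the smallest squared
    residual is returned together with its (y_inf, b).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if alphas is None:
        alphas = np.linspace(0.05, 3.0, 296)  # illustrative grid, step 0.01
    best = None
    for alpha in alphas:
        A = np.column_stack([np.ones_like(x), x ** (-alpha)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        sse = float(np.sum((A @ coef - y) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(alpha), float(coef[0]), float(coef[1]))
    _, alpha, y_inf, b = best
    return alpha, y_inf, b

if __name__ == "__main__":
    # Synthetic, noise-free data with a known exponent.
    x = np.linspace(10.0, 200.0, 40)
    y = 0.3 + 2.0 * x ** (-0.7)
    print(fit_power_law(x, y))
```

The default grid only covers positive exponents. To probe regimes where the exponent becomes small or negative, a wider grid can be passed through `alphas`, keeping in mind that the extrapolated constant no longer represents a limit when the exponent is not positive.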

## V Contact Process ( Sites)

### V.1 Finite-Time and Finite-$N_c$ Scalings

In Fig. 1, we have considered the behavior of as function of and , for two representative values of the parameter , (left) and (right). Each point of these surfaces was obtained using the cloning algorithm (Eq. (8)) up to time , for and for realizations. The best possible CGF estimation (i.e., evaluated at largest and number of clones) in both cases is shown with solid circles. According to Ref. Guevara Hidalgo et al. (2017), these estimations could be improved by using the and -scalings of the CGF estimator (i.e., the scaling method).

Although the exponents and can be in principle computed for any value of and for any (as could be intuited from Fig. 1), from now on, we will consider these exponents defined at the highest number of clones and at final simulation time, i.e.,

(17) | ||||

(18) |

Thus, the exponent $\alpha_t$ is obtained by fitting Eq. (14) to the time evolution of the estimator at the largest number of clones, and $\alpha_N$ by using the scaling (15) at the final simulation time. Alternatively, the exponent $\alpha_N$ can be determined from a fit in $N_c$ over the standard estimator (Eq. (16)). In simple words, these exponents can be obtained from an adequate fit over the thick curves in Fig. 1. They characterize the finite-$t$ and finite-$N_c$ scalings of the large deviations of the activity.

Following this approach, we found that the -scaling (12) is satisfied only for . This means that the exponent was found to be . As a consequence, the parameter obtained from Eq. (14) effectively represents the limit in of the CGF estimator, i.e., . This is not the case for for which . Similarly, a -scaling is observed for , whereas for , the -scaling (13) holds. It is important to remark that a value of exponent could still guaranty the convergence of the CGF estimator in the infinite- limit. However, even though at initial times, at final time , the exponent is negative (), which would imply that does not represent the infinite- infinite- limit of the CGF estimator. Below, we present how the change in the scalings is produced depending on .

### V.2 Dependence on $s$: $\alpha_t$ & $\alpha_N$

The exponents and were determined (as described above) for values of ranging in the interval . Two values of system size, (circles) and (squares), are compared in Fig. 2(a) for . As can be seen, independently of , for , the exponent varies around . However for , this is true only for (for which the amplitudes of the fluctuations are considerably smaller). For , we observe that deviates slightly from decreasing with up to at .

In order to describe the behavior of this exponent, results convenient to define the quantity as the value of the parameter such that , i.e., until which the -scaling holds. If the scaling holds (given some ), then . Thus,

(19) |

By other hand, the value of which signals the validity of the -scaling is denoted by . From this point, decreases until eventually it becomes negative, as can be seen in Fig. 2(b). Here, we introduce such that . Thus, for . This behavior was not observed for for which the -scaling was valid independently of Guevara Hidalgo et al. (2017). In those cases, and . Instead of confirming for the -scalings of the CGF estimator presented in Ref. Guevara Hidalgo et al. (2017), here we have been able to distinguish clearly three stages for the exponent :

(20) |

The possibility of extracting the infinite-$t$ and infinite-$N_c$ limit of the CGF estimator relies on the validity of the $t^{-1}$- and $N_c^{-1}$-scalings. How the results obtained from the application of the scaling method are affected by $\alpha_t$ and $\alpha_N$ is shown below.

### V.3 The Scaling Method

The scaling method allows one to determine the asymptotic limit to which the CGF estimator (8) converges in the infinite-$t$ and infinite-$N_c$ limits. This limit (Eq. (13)) was shown to render a better estimation of the analytical LDF than the standard estimator evaluated at the final simulation time and at the largest number of clones, at least for the cases analyzed in Ref. Guevara Hidalgo et al. (2017). However, the results we just presented suggest that the determination of this limit could be affected depending on whether $\alpha_t \approx 1$ and $\alpha_N \approx 1$ or not. If this is true, the scaling method could render valid results in our example only in the range of $s$ where both scalings hold. Solely in this region would the extracted limit (obtained from Eq. (15)) represent the infinite-$t$ and infinite-$N_c$ limit of the CGF estimator. Indeed, this can be observed in Fig. 3, where we have applied the scaling method to our example.

The method can be performed following two different approaches: (i) first imposing a $t^{-1}$-scaling on the estimator (setting $\alpha_t = 1$ in Eq. (14)) and then considering the $N_c$-scaling (15) for the extracted infinite-time limit, or alternatively (ii) leaving $\alpha_t$ and $\alpha_N$ as free parameters in Eqs. (14) and (15). The two resulting estimators are shown in Fig. 3 with squares and circles, respectively. Additionally, the infinite-$N_c$ limit (16) is also presented with diamonds. The standard CGF estimator (dots) serves as a reference for the effectiveness of the method.

As can be seen in Fig. 3, the different estimators correspond to each others up to . From this point, their distance with respect to increases rapidly with up to where a discontinuity occurs. In fact, the behavior observed in Fig. 3 keeps correspondence with the -scaling of the CGF estimator. Specifically, with the stages of the exponent that were presented in Sect. V.2 and Fig. 2(b). Thus, the discontinuity in is related precisely with the change in sign of in in the same way as the divergence of the estimators from the standard one at is related with the fact that from this point, .

The example presented throughout this section related the effectiveness of the scaling method as proposed in Ref. Guevara Hidalgo et al. (2017) to the actual scaling of the CGF estimator in large systems. Depending on the value of the exponents, it may or may not be possible to extract the infinite-$t$ and infinite-$N_c$ limit of the CGF estimator. Moreover, we also showed the dependence of these exponents on the parameter $s$. Below we extend our analysis by considering the scaling behavior on a wider range of values of $L$. This will provide a complete overview of how the CGF estimator behaves and how the change in scaling takes place.

## VI Behavior in the $s$-$L$ Plane

In this section we detail the behavior of the finite- and - scalings of the CGF estimator in the plane . The exponents and were computed for values of the parameters and number of sites ranging in the interval .

### VI.1

The contour plot in Fig. 4 shows the value of the exponent as it changes depending on the parameters and . We have focused in the region for as for , and thus, the -scaling (12) holds. The values closest to are presented with the darkest tone while smaller values are shown with clearer tones. As can be seen, the exponent decreases gradually as and increase.

Given a particular value of , qualitatively speaking, the behavior of with respect to is similar to what we described for in Sec. V.2 (except for small values of , for which the -scaling holds even for Guevara Hidalgo et al. (2017)). In order to extend that description into the plane , we introduce a number of sites dependency of the bound . We denote by the value of until which the -scaling is valid given a particular . Similarly, is the lower bound of . Thus, the exponent which characterizes the -scaling (14) of the CGF estimator is given by

(21) |

where , and is large. In fact, for this case, , for all .

### VI.2

As above, in Fig. 5 we present the exponent $\alpha_N$ as it changes depending on the choice of the parameters $s$ and $L$ within the intervals considered. The surface in Fig. 5(a) illustrates clearly the change in the $N_c$-scaling of the CGF estimator. For every value of $L$ considered, the exponent is approximately $1$ up to some value of $s$ (Sec. V.2) which is different for each $L$. However, from this point, its value decreases as $s$ and $L$ increase, becoming, in some cases, negative. This change in the $N_c$-scaling is also shown in the contour plot in Fig. 5(b), where we have focused on a restricted region of the plane. The values of $\alpha_N$ closer to $1$ are shown in dark tones.

In Sec. V.2, we also defined the value of $s$ at which $\alpha_N = 0$. This value of course depends on $L$ and in some cases it does not even exist. However, for some particular (large) values of $L$, the exponent changes sign twice (as can be seen in Fig. 5(b)). We will use this fact in order to characterize the $N_c$-scaling depending on the number of zeros of the exponent $\alpha_N$ for a given $L$.

We define as the set of values of , for which the exponent has no zeros, : if has two zeros ( and , with ) and : if has one zero (). These regions are bounded by and/or by , where is the smallest value of such that the curve is tangent to in one single point. By other hand, is the largest such that the curve cuts in two points. Thus, the region groups the values of such that , the values of within the interval and , the values of such that . Thus, the exponent which characterizes the -scaling (15) of the CGF estimator is given by

(22) |
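Once the exponent has been sampled on a grid of $s$ for each system size, the grouping of system sizes by the number of zeros of the exponent can be automated by counting sign changes along each curve. The following sketch illustrates the idea. The function names and the sample curves are hypothetical, and a curve that only touches zero without crossing it (the tangent boundary case) is counted as having no sign change.

```python
import numpy as np

def count_sign_changes(values):
    """Number of sign changes along a sampled curve (exact zeros are skipped)."""
    signs = np.sign(np.asarray(values, dtype=float))
    signs = signs[signs != 0]
    return int(np.sum(signs[1:] != signs[:-1]))

def classify_by_zeros(exponent_vs_s):
    """Map each system size to the number of sign changes of its exponent curve.

    exponent_vs_s is a dict {L: sequence of exponent values ordered by s}.
    """
    return {L: count_sign_changes(curve) for L, curve in exponent_vs_s.items()}

if __name__ == "__main__":
    # Hypothetical exponent curves, for illustration only.
    curves = {
        10: [1.0, 0.9, 0.6, 0.4],          # never crosses zero
        50: [1.0, 0.4, -0.2, -0.1, 0.3],   # crosses zero twice
        90: [1.0, 0.2, -0.3, -0.6],        # crosses zero once
    }
    print(classify_by_zeros(curves))
```

The count is only as reliable as the sampling in $s$: two zeros lying between consecutive grid points would be missed, so the grid should be refined near candidate crossings.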

## VII Conclusion

In this paper we analyzed the large deviations of the activity in the contact process. We used the continuous-time version of the cloning algorithm, in which the LDF is estimated from the exponential growth (or decay) rate of a set of $N_c$ copies of the system which evolve following an $s$-modified dynamics up to a time $t$. It is expected that in the infinite-$t$ and infinite-$N_c$ limits, this method provides an accurate LDF estimation. However, in practice, the best estimation is obtained by considering a large enough simulation time and number of clones.

The dependence of this estimator (and of its accuracy) on these two parameters was studied in Refs. Nemoto et al. (2017); Guevara Hidalgo et al. (2017). The finite-$t$ and finite-$N_c$ scalings of the systematic errors of the LDF were found to behave as $1/t$ and $1/N_c$ in the large-$t$ and large-$N_c$ asymptotics, respectively. By making use of these convergence speeds, a (scaling) method was proposed which allowed one to extract the asymptotic behavior of the CGF estimator in the $t \to \infty$ and $N_c \to \infty$ limits. At least for the cases analyzed in Refs. Nemoto et al. (2017); Guevara Hidalgo et al. (2017), this infinite-time and infinite-$N_c$ limit rendered a better LDF estimation than the standard estimator. However, the validity of the method and of these scalings was verified only for a simple one-site annihilation-creation dynamics and for a contact process with a small number of sites, leaving an analysis of the dependence on the number of sites $L$ pending.

In order to do so, in this paper we redefined these scalings in a more general way. We assumed the behavior of the CGF estimator to be described by a $t^{-\alpha_t}$-scaling (Eq. (14)) and an $N_c^{-\alpha_N}$-scaling (Eq. (15)). This redefinition allowed us to verify in large-$L$ systems whether effectively $\alpha_t \approx 1$ and $\alpha_N \approx 1$, and whether the constant terms of Eqs. (14) and (15) represent the limits in $t$ and $N_c$ of the CGF estimator.

This was done at first in Sec. V.1 where we considered a contact process with sites and two representative values of the parameter . Although the -scaling and -scaling were proved to hold for , this was not the case for . How this change in the scaling is produced depending on the parameter was presented in detail in Sect. V.2 where the exponents and were characterized. Particularly, for , we were able to distinguish three stages in its behavior, where, the -scaling was valid up to , then decreases to at and finally, it becomes negative for .

In Sec. V.3 we showed how these scalings affect the determination of the infinite- and infinite- limit of the CGF estimator. This occurs because the scaling method relied on the validity of the - and -scalings. As for this was not the case, it was possible to see how the different estimators corresponded to each others up to from which they diverge up to where there is a discontinuity.

This analysis was extended to the $s$-$L$ plane in Sec. VI, where the exponents $\alpha_t$ and $\alpha_N$ were computed for a grid of values of the parameters $(s, L)$. Their characterization was done by introducing a number-of-sites dependency of the bounds previously defined in Sec. V, as well as by using the number of zeros of the exponent $\alpha_N$ in order to characterize the different groups of $L$. Whether the results presented throughout this paper are restricted to the contact process or not is left as an open problem and a possible direction for future research.

###### Acknowledgements.

E. G. thanks Khashayar Pakdaman for his support and discussions. Special thanks to the Ecuadorian Government and the Secretaría Nacional de Educación Superior, Ciencia, Tecnología e Innovación, SENESCYT.

## References

- Touchette (2009) H. Touchette, Physics Reports 478, 1 (2009).
- Giardinà et al. (2011) C. Giardinà, J. Kurchan, V. Lecomte, and J. Tailleur, J. Stat. Phys. 145, 787 (2011).
- Bucklew (2013) J. Bucklew, Introduction to Rare Event Simulation (Springer Science & Business Media, 2013).
- Kahn and Harris (1951) H. Kahn and T. E. Harris, National Bureau of Standards applied mathematics series 12, 27 (1951).
- Aldous and Vazirani (1994) D. Aldous and U. Vazirani, in Foundations of Computer Science, 1994 Proceedings., 35th Annual Symposium on (IEEE, 1994) pp. 492–501.
- Grassberger (2002) P. Grassberger, Computer Physics Communications 147, 64 (2002), proceedings of the Europhysics Conference on Computational Physics Computational Modeling and Simulation of Complex Systems.
- Cérou and Guyader (2007) F. Cérou and A. Guyader, Stochastic Analysis and Applications 25, 417 (2007).
- Bolhuis et al. (2002) P. G. Bolhuis, D. Chandler, C. Dellago, and P. L. Geissler, Annual Review of Physical Chemistry 53, 291 (2002).
- Giardinà et al. (2006) C. Giardinà, J. Kurchan, and L. Peliti, Phys. Rev. Lett. 96, 120603 (2006).
- Lecomte and Tailleur (2007) V. Lecomte and J. Tailleur, J. Stat. Mech. 2007, P03004 (2007).
- Tailleur and Lecomte (2009) J. Tailleur and V. Lecomte, AIP Conf. Proc. 1091, 212 (2009).
- Guevara Hidalgo and Lecomte (2016) E. Guevara Hidalgo and V. Lecomte, J. Phys. A: Math. Theor. 49, 205002 (2016).
- Nemoto et al. (2017) T. Nemoto, E. Guevara Hidalgo, and V. Lecomte, Phys. Rev. E 95, 012102 (2017).
- Guevara Hidalgo et al. (2017) E. Guevara Hidalgo, T. Nemoto, and V. Lecomte, Phys. Rev. E 95, 062134 (2017).
- Harris (1974) T. E. Harris, Ann. Probability 2, 969 (1974).
- Garrahan et al. (2009) J. P. Garrahan, R. L. Jack, V. Lecomte, E. Pitard, K. van Duijvendijk, and F. van Wijland, J. Phys. A 42, 075007 (2009).
- Tailleur and Kurchan (2007) J. Tailleur and J. Kurchan, Nat Phys 3, 203 (2007).