
Synchronization and random long time dynamics
for mean-field plane rotators

Lorenzo Bertini
Dipartimento di Matematica, Università di Roma La Sapienza, P.le A. Moro 2, 00185 Roma, Italy

Giambattista Giacomin
Université Paris Diderot, Sorbonne Paris Cité, Laboratoire de Probabilités et Modèles Aléatoires, UMR 7599, F-75205 Paris, France

Christophe Poquet
Université Paris Diderot, Sorbonne Paris Cité, Laboratoire de Probabilités et Modèles Aléatoires, UMR 7599, F-75205 Paris, France
July 5, 2019
Abstract.

We consider the natural Langevin dynamics which is reversible with respect to the mean-field plane rotator (or classical spin XY) measure. It is well known that this model exhibits a phase transition at a critical value of the interaction strength parameter $K$, in the limit of the number $N$ of rotators going to infinity. A Fokker-Planck PDE captures the evolution of the empirical measure of the system as $N \to \infty$, at least for finite times and when the empirical measure of the system at time zero satisfies a law of large numbers. The phase transition is reflected in the fact that the PDE for $K$ above the critical value has several stationary solutions, notably a stable manifold – in fact, a circle – of stationary solutions that are equivalent up to rotations. These stationary solutions are actually unimodal densities parametrized by the position of their maximum (the synchronization phase or center). We characterize the dynamics on times of order $N$ and we show substantial deviations from the behavior of the solutions of the PDE. In fact, if the empirical measure at time zero converges as $N \to \infty$ to a probability measure (which is away from a thin set that we characterize) and if time is sped up by $N$, the empirical measure reaches almost instantaneously a small neighborhood of the stable manifold, to which it then sticks and on which a non-trivial random dynamics takes place. In fact the synchronization center performs a Brownian motion with a diffusion coefficient that we compute. Our approach therefore provides, for one of the basic statistical mechanics systems with continuum symmetry, a detailed characterization of the macroscopic deviations from the large scale limit – or law of large numbers – due to finite size effects. But the interest of this model goes beyond statistical mechanics, since it plays a central role in a variety of scientific domains in which one aims at understanding synchronization phenomena.
2010 Mathematics Subject Classification: 60K35, 37N25, 82C26, 82C31, 92B20
Keywords: Coupled rotators, Fokker-Planck PDE, Kuramoto synchronization model, Finite size corrections to scaling limits, Long time dynamics, Diffusion on stable invariant manifold

1. Introduction

1.1. Overview

In a variety of instances partial differential equations are a faithful approximation – in fact, a law of large numbers – for particle systems in suitable limits. This is notably the case for stochastic interacting particle systems, for which the mathematical theory has gone very far [24]. The closeness between the particle system and PDE is typically proven in the limit of systems with a large number of particles or for infinite systems under a space rescaling involving a large parameter – for example a spin or particle system on and the lattice spacing scaled down to – and up to a time horizon which may depend on . Of course the question of capturing the finite corrections has been taken up too, and the related central limit theorems as well as large deviations principles have been established (see [24] and references therein). Sizable deviations from the law of large numbers, not just small fluctuations or rare events, can be observed beyond the time horizon for which the PDE behavior has been established and these phenomena can be very relevant.

The first examples that come to mind are the ones in which the PDE has multiple isolated stable stationary points: metastability phenomena happen on exponentially long time scales [29]. Deviations on substantially shorter time scales can also take place; this is the case, for example, of the noise induced escape from stationary unstable solutions, which is relevant in plenty of situations: for the model in [30, Ch. 5], phase segregation originates from homogeneous initial data via this mechanism, on times proportional to the logarithm of the size of the system. The logarithmic factor is directly tied to the exponential instability of the stationary solution (see [30] for more literature on this phenomenon). Of course, this type of phenomena happens also in finite dimensional random dynamical systems, in the limit of small noise, but we restrict this quick discussion to infinite dimensional models and PDEs.

In the case on which we focus the deviations also happen on time scales substantially shorter than the exponential ones, but the mechanism of the phenomenon does not involve exponential instabilities. In the system we consider there are multiple stationary solutions, but they are not (or, at least, not all) isolated, and hence they are not stable in the standard sense. Deviations from the PDE behavior happen as a direct result of the cumulative effect of the fluctuations. More precisely, this phenomenon is due to the presence of a whole stable manifold of stationary solutions: the deterministic limit dynamics has no damping effect along the tangential direction to the manifold so, for the finite size system, the weak noise does have a macroscopic effect on a suitable time scale that depends on how large the system is. We review the mathematical literature on this type of phenomena in § 1.6, after stating our results.

Apart from the general interest in deviations from the PDE behavior, the model we consider – mean-field plane rotators – is a fundamental one in mathematical physics and, more generally, it is the basic model for synchronization phenomena. Our results provide a sharp description of the long time dynamics of this model for general initial data.

1.2. The model

Consider the set of ordinary stochastic differential equations

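\[
d\varphi^{j,N}_t \,=\, \frac{1}{N}\sum_{i=1}^{N} J\big(\varphi^{j,N}_t-\varphi^{i,N}_t\big)\,dt \,+\, dW^j_t \tag{1.1}
\]

with $j=1,2,\ldots,N$, $\{W^j\}_{j=1,2,\ldots}$ is an IID collection of standard Brownian motions and $J(\cdot)=-K\sin(\cdot)$. With abuse of notation, when writing $\varphi^{j,N}_t$ we will actually mean $\varphi^{j,N}_t \bmod 2\pi$ and for us (1.1), supplemented with an (arbitrary) initial condition, will give origin to a diffusion process on $\mathbb{S}^N$, where $\mathbb{S}$ is the circle $\mathbb{R}/(2\pi\mathbb{Z})$.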
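As an illustration, the system of rotators can be simulated with an Euler-Maruyama scheme. The following is a minimal sketch, assuming that the drift in (1.1) is the mean-field sine interaction $-\frac{K}{N}\sum_i \sin(\varphi^{j}-\varphi^{i})$ and that the noise is a standard Brownian motion; the function name `simulate_rotators`, the step size `dt` and the seed handling are illustrative choices and are not taken from the paper.

```python
import numpy as np

def simulate_rotators(phi0, K, t_final, dt=1e-2, seed=0):
    """Euler-Maruyama sketch for N mean-field plane rotators.

    Assumption: drift_j = -(K/N) * sum_i sin(phi_j - phi_i), unit noise.
    """
    rng = np.random.default_rng(seed)
    phi = np.mod(np.asarray(phi0, dtype=float), 2.0 * np.pi)
    n_steps = int(round(t_final / dt))
    for _ in range(n_steps):
        # (1/N) sum_i sin(phi_j - phi_i) = sin(phi_j) * C - cos(phi_j) * S,
        # where C and S are the empirical means of cos and sin.
        c, s = np.cos(phi).mean(), np.sin(phi).mean()
        drift = -K * (np.sin(phi) * c - np.cos(phi) * s)
        noise = np.sqrt(dt) * rng.standard_normal(phi.shape)
        # Phases are kept on the circle, i.e. modulo 2*pi.
        phi = np.mod(phi + drift * dt + noise, 2.0 * np.pi)
    return phi
```

For instance, `simulate_rotators(np.zeros(200), K=2.0, t_final=1.0)` returns the 200 phases at time 1 started from a fully synchronized configuration. The drift is evaluated through the empirical means of cosine and sine, so each step costs $O(N)$ rather than $O(N^2)$.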

The choice of the interaction potential is such that the (unique) invariant probability of the system is

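\[
\pi_{N,K}(d\varphi) \,\propto\, \exp\Big(\frac{K}{N}\sum_{i,j=1}^{N}\cos(\varphi_i-\varphi_j)\Big)\,\lambda_N(d\varphi) \tag{1.2}
\]

where $\lambda_N$ is the uniform probability measure on $\mathbb{S}^N$. Moreover, the evolution is reversible with respect to $\pi_{N,K}$, which is the well known Gibbs measure associated to mean-field plane rotators (or classical $XY$ model).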

We are therefore considering the simplest Langevin dynamics of mean-field plane rotators and it is well known that such a model exhibits a phase transition, for , that breaks the continuum symmetry of the model (for a detailed mathematical physics literature we refer to [6]). The continuum symmetry of the model is evident both in the dynamics (1.1) and in the equilibrium measure (1.2): if solves (1.1), so does , an arbitrary constant, and , where is the rotation by an angle , that is for every .

1.3. The dynamics and the stationary states

The phase transition can be understood also taking a dynamical standpoint. Given the mean-field set up it turns out to be particularly convenient to consider the empirical measure

\[
\mu_{N,t}(d\theta) \,:=\, \frac{1}{N}\sum_{j=1}^{N}\delta_{\varphi^{j,N}_t}(d\theta) \tag{1.3}
\]

which is a probability on (the Borel subsets of) . It is well known, see [6] (for detailed treatment and original references), that if converges weakly for , then so does for every . Actually, the process itself , seen as an element of , where and is the space of probability measures on equipped with the weak topology, converges to a non-random limit which is the process that concentrates on the unique solution of the non-local PDE ( denotes the convolution)

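\[
\partial_t p_t(\theta) \,=\, \frac12\,\partial_\theta^2 p_t(\theta) \,-\, \partial_\theta\big[p_t(\theta)\,(J*p_t)(\theta)\big] \tag{1.4}
\]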

with initial condition prescribed by the limit of . If such a limit probability does not have a () density (with respect to the uniform measure), one has to interpret (1.4) in a weak sense, but actually, even if the initial datum is just in , that is if it does not admit a density or if such a density is not smooth, the probability measure that solves (1.4) has a density for every , see [21]. We insist on the fact that is a probability density: . We will often commit the abuse of notation of writing when and has a density. Much in the same way, if is a probability density, , or , is the probability measure.

It is worthwhile to point out that with . This is to say that the nonlinearity enters only through the first Fourier coefficient of the solution, a peculiarity that allows to go rather far in the analysis of the model. Notably, starting from this observation one can easily (once again details and references are given in [6]) see that all the stationary solutions to (1.4), in the class of probability densities, can be written, up to a rotation, as

\[
q(\theta) \,:=\, \frac{\exp(2Kr\cos\theta)}{2\pi I_0(2Kr)} \tag{1.5}
\]

where $2\pi I_0(2Kr)$ is the normalization constant written in terms of the modified Bessel function of order zero ($I_j(x) = (2\pi)^{-1}\int_0^{2\pi}(\cos\theta)^j\exp(x\cos\theta)\,d\theta$, for $j=0,1$) and $r$ is a non-negative solution of the fixed point equation $r=\Psi(2Kr)$, with $\Psi(x) := I_1(x)/I_0(x)$. Since $\Psi$ is increasing, concave, $\Psi(0)=0$ and $\Psi'(0)=1/2$ we readily see that if (and only if) $K>1$ there exists a non-trivial (i.e. non-constant) solution to (1.4). Let us not forget however that $\Psi(0)=0$ implies that $r=0$ is a solution and therefore the constant density $1/(2\pi)$ is a solution no matter what the value of $K$ is. From now on we set $K>1$ and choose $r=r(K)$, the unique positive solution of the fixed point equation, so that the probability density $q$ in (1.5) is non trivial and it achieves the unique maximum at $0$ and the minimum at $\pi$. Note that the rotation invariance of the system immediately yields that there is a whole family of stationary solutions:

(1.6)

and, when , of course means . , which is more practically viewed as a manifold (in a suitable function space, see § 2.2 below), is invariant and stable for the evolution. The proper notion of stability is given in the context of normally hyperbolic manifolds (see [32] and references therein), but the full power of such a concept is not needed for the remainder. Nevertheless let us stress that in [21] one can find a complete analysis of the global dynamic phase diagram, notably the fact that unless belongs to the stable manifold of the unstable solution – the solution corresponding to in (1.5) – converges (also in strong norms, controlling all the derivatives) to one of the points in , see Figure 1. There is actually an explicit characterization of :

(1.7)

As a matter of fact, it is easy to realize that if then (1.4) reduces to the heat equation which of course relaxes to .


Figure 1. The limit evolution (1.4) instantaneously smoothens an arbitrary initial probability and, unless the Fourier decomposition of such an initial condition has zero coefficients corresponding to the first harmonics (the hyperplane $U$), it drives it to a point – a synchronized profile – on the invariant manifold and of course it stays there for all times. This has been proven in [21]; here we are interested in what happens for the finite size – finite $N$ – system and we show that the PDE approximation is faithful up to times much shorter than $N$: on times proportional to $N$ synchronization is kept and the center of synchronization performs a Brownian motion on the manifold.
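The deterministic picture described in the caption can be explored numerically in Fourier space. The sketch below assumes that (1.4) reads $\partial_t p = \frac12 \partial_\theta^2 p - \partial_\theta[p\,(J*p)]$ with $J=-K\sin$; with the convention $c_k=\int e^{-ik\theta}p(d\theta)$ this gives $\dot c_k = -\frac{k^2}{2}c_k + \frac{kK}{2}\big(c_1 c_{k-1}-\bar c_1 c_{k+1}\big)$. The truncation level `kmax`, the splitting scheme and the function name `evolve_fourier` are illustrative choices, not the method of the paper.

```python
import numpy as np

def evolve_fourier(c1_init, K, T=30.0, dt=1e-3, kmax=16):
    """Integrate the truncated Fourier ODEs of the assumed PDE; return c_0..c_kmax.

    The initial datum has c_0 = 1, c_1 = c1_init and all higher modes equal to 0.
    """
    k = np.arange(kmax + 1)
    c = np.zeros(kmax + 2, dtype=complex)  # c[kmax + 1] stays 0 (truncation)
    c[0] = 1.0
    c[1] = c1_init
    decay = np.exp(-0.5 * k**2 * dt)       # exact solution of the heat part
    for _ in range(int(round(T / dt))):
        c1 = c[1]
        lower = np.empty(kmax + 1, dtype=complex)
        lower[0] = np.conj(c1)             # c_{-1}; multiplied by k = 0 anyway
        lower[1:] = c[:kmax]               # c_{k-1} for k = 1..kmax
        upper = c[1:kmax + 2]              # c_{k+1} for k = 0..kmax
        nonlin = 0.5 * K * k * (c1 * lower - np.conj(c1) * upper)
        # Explicit Euler step for the interaction, then exact heat decay.
        c[:kmax + 1] = decay * (c[:kmax + 1] + dt * nonlin)
    return c[:kmax + 1]
```

Under these assumptions, when the first coefficient vanishes initially the interaction term is identically zero and only the heat decay remains, while for $K>1$ a small non-zero first coefficient grows and $|c_1|$ settles near a positive solution of the fixed point equation $r=\Psi(2Kr)$.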

1.4. Random dynamics on : the main result

In spite of the stability of , itself is not stable, simply because if we start nearby, say from , the solution of does not converge to . The important point here is that the linearized evolution operator around ( is an arbitrary element of , not necessarily the one in (1.5): the phase of is explicit only when its absence may be misleading)

\[
L_q u \,:=\, \frac12\,u'' \,-\, \big[u\,J*q + q\,J*u\big]' \tag{1.8}
\]

with domain is symmetric in – a weighted Hilbert space that we introduce in detail in Section 2.1 – and it has compact resolvent. Moreover the spectrum of , which is of course discrete, lies in and the eigenvalue has a one dimensional eigenspace, generated by . So is the only neutral direction and it corresponds precisely to the tangent space of at : all other directions, in function space, are contracted by the linear evolution and the nonlinear part of the evolution does not alter substantially this fact [21, 25].

Let us now step back and recall that our main concern is with the behavior of (1.1), with large but finite, and not (1.4). In a sense the finite size, i.e. finite , system is close to a suitable stochastic perturbation of (1.4): the type of stochastic PDE, with noise vanishing as , needs to be carefully guessed [19], keeping in particular in mind that we are dealing with a system with one conservation law. We will tackle directly (1.1), but the heuristic picture that one obtains by thinking of an SPDE with vanishing noise is of help. In fact the considerations we have just made on suggest that if one starts the SPDE on , the solution keeps very close to , since the deterministic part of the dynamics is contractive in the orthogonal directions to , but a (slow, since the noise is small) random motion on arises because in the tangential direction the deterministic part of the dynamics is neutral. This is indeed what happens for the model we consider for large. The difficulty that arises in dealing with the interacting diffusion system (1.1) is that one has to work with (1.3), which is not a function. Of course one can mollify it, but the evolution is naturally written and, to a certain extent, closed in terms of the empirical measure, and we do not believe that any significative simplification arises in proving our main statement for a mollified version. Working with the empirical measure imposes a clarification from now: as we explain in Section 2.1 and Appendix A, if and , then can be seen as an element of (or, as a matter of fact, also as an element of a weighted space).

Here is the main result that we prove (recall that ):

Theorem 1.1.

Choose a positive constant and a probability . If for every

(1.9)

then there exist a constant that depends only on and, for every , a continuous process , adapted to the natural filtration of , such that converges weakly to a standard Brownian motion and for every

(1.10)

where , , and

(1.11)

The result is saying that, unless one starts on the stable manifold of the unstable solution (see Remark 2.5 for what one expects if ), the empirical measure reaches very quickly a small neighborhood of the manifold : this happens on a time scale of order one, as a consequence of the properties of the deterministic evolution law (1.4) (Figure 1), and, since we are looking at times of order , this happens almost instantaneously. Actually, in spite of the fact that the result just addresses the limit of the empirical measure, the drift along is due to fluctuations: the noise pushes the empirical measure away from but the deterministic part of the dynamics projects back the trajectory to and the net effect of the noise is a random shift – in fact, a rotation – along the manifold (this is taken up in more detail in the next section, where we give a complete heuristic version of the proof of Theorem 1.1).

Remark 1.2.

Without much effort, one can upgrade this result to much longer times: if we set with an arbitrary , there exists an adapted process converging to a standard Brownian motion such that

(1.12)

This is due to the fact that our estimates ultimately rely on moment estimates, cf. Section 3. These estimates are obtained for arbitrary moments and we choose the moment sufficiently large to get uniformity for times , but working for times would just require choosing larger moments. We have preferred to focus on the case this is the natural scale, that is the scale in which the center of the probability density converges to a Brownian motion and not to an “accelerated” Brownian motion (this is really due to the fact that we work on and marks a difference with [9, 4] where one can rescale the space variable).

1.5. The synchronization phenomena viewpoint

The model (1.1) we consider is actually a particular case of the Kuramoto synchronization model (the full Kuramoto model includes quenched disorder in terms of random constant speeds for the rotators, see [1, 6] and references therein). The mathematical physics literature and the more bio-physically oriented literature use somewhat different notations reflecting a slightly different viewpoint. In the synchronization literature one introduces the synchronization degree and the synchronization center via

(1.13)

which clearly correspond to the parameters and that appear in the definition of , but and are defined for finite and also far from . Note that if (1.9) holds, then both and converge in probability as to the limits and , with and the assumption that just means . Here is a straightforward consequence of Theorem 1.1:

Corollary 1.3.

Under the same hypotheses and definitions as in Theorem 1.1 we have that the stochastic process converges weakly, for every , to .

It is tempting to prove such a result by looking directly at the evolution of :

(1.14)

But this clearly requires a control of the evolution of the empirical measure, so it does not seem that (1.14) could provide an alternative way to many of the estimates that we develop, namely convergence to a neighborhood of and persistence of the proximity to (see Section 3 and Section 5). On the other hand, it seems plausible that one could use (1.14) to develop an alternative approach to the dynamics on , that is an alternative to Section 4. While this can be interesting in its own right, since the notion of synchronization center that we use in the proof and are almost identical (where they are both defined, that is close to ) we do not expect substantial simplifications.
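For a finite configuration of phases, the synchronization degree and center are straightforward to compute. The sketch below assumes that (1.13) is the usual Kuramoto order parameter, $r\,e^{i\Psi} = \frac1N\sum_j e^{i\varphi^{j}}$; the function name `order_parameter` is a hypothetical helper.

```python
import numpy as np

def order_parameter(phi):
    """Return (r, psi) with r * exp(i * psi) = mean_j exp(i * phi_j).

    Assumed form of (1.13); psi is reported in [0, 2*pi) and carries
    no information when r is (numerically) zero.
    """
    z = np.exp(1j * np.asarray(phi, dtype=float)).mean()
    r = float(np.abs(z))
    psi = float(np.mod(np.angle(z), 2.0 * np.pi))
    return r, psi
```

A fully synchronized configuration gives a degree equal to one and a center equal to the common phase, while equally spaced phases give a degree that is numerically zero, in which case the center is not meaningful.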

1.6. A look at the literature and perspectives

Results related to our work have been obtained in the context of SPDE models with vanishing noise. In [8, 15] one dimensional stochastic reaction diffusion equations with bistable potential (also called stochastic Cahn-Allen or model A) are analyzed for initial data that are close to profiles that connect the two phases. It is shown that the location of the phase boundary performs a Brownian motion. These results have been improved in a number of ways, notably to include small asymmetries that result in a drift for the arising diffusion process [7] and to deal with macroscopically finite volumes [4] (which introduce a repulsive effect approaching the boundary). The case of stochastic phase field equations has also been considered [5].

For interacting particle systems results have been obtained for the zero temperature limit of $d$-dimensional Brownian particles interacting via local pair potentials in [16]: in this case the frozen clusters perform a Brownian motion and, in one dimension, also the merging of clusters is analyzed [17]. In this case the very small temperature is the small noise from which cluster diffusion originates. With respect to [16, 17], our results hold for any super-critical interaction, but of course our system is of mean field type. It is also interesting to observe that for the model in [16, 17] establishing the stability of the frozen clusters is the crucial issue, because the motion of the center of mass is a martingale, i.e. there is no drift. A substantial part of our work is in controlling that the drift of the center of synchronization vanishes (and controlling the drift is a substantial part also of [8, 15, 7, 4, 5]). This is directly related to the content of § 1.5.

As a matter of fact, in spite of the fact that our work deals directly with an interacting system, and not with an SPDE model, our approach is closer to the one in the SPDE literature. However, as we have already pointed out, a non negligible point is that we are forced to perform an analysis in distribution spaces, in fact Sobolev spaces with negative exponent, in contrast to the approach in the space of continuous functions in [8, 15, 7, 4, 5]. We point out that approaches to dynamical mean field type systems via Hilbert spaces of distributions have already been taken up in [14], but in our case the specific use of weighted Sobolev spaces is not only a technical tool: it is intimately related to the geometry of the contractive invariant manifold of stationary solutions. In this sense and because of the iterative procedure we apply – originally introduced in [8] – our work is a natural development of [8, 4].

An important issue about our model that we have not stressed at all is that propagation of chaos holds (see e.g. [18]), in the sense that if the initial condition is given by a product measure, then this property is approximately preserved, at least for finite times. Recently much work has been done toward establishing quantitative estimates of chaos propagation (see for example the references in [11]). On the other hand, like for the model in [11], we know that, for our model, chaos propagation eventually breaks down: this is just because one can show by Large Deviations arguments that the empirical measure at equilibrium converges in law as to the random probability density , with a uniform random variable on . But using Theorem 1.1 one can go much farther and show that chaos propagation breaks down at times proportional to . From Theorem 1.1 one can actually extract also an accurate description of how the correlations build up due to the random motion on .

It is natural to ask whether the type of results we have proven extend to the case in which random natural frequencies are present, that is to the disordered version of the model we consider that goes under the name of Kuramoto model. The question is natural because for the limit PDE [13, 26] there is a contractive manifold similar to [20]. However the results in [27] suggest that a nontrivial dynamics on the contractive manifold is observed rather on times proportional to and one expects a dynamics with a nontrivial random drift. The role of disorder in this type of models is not fully elucidated (see however [12] on the critical case) and the global long time dynamics represents a challenging issue.

The paper is organized as follows: we start off (Section 2) by introducing the precise mathematical set-up and a number of technical results. This will allow us to present quantitative heuristic arguments and sketches of proofs. In Section 3 we prove that if the system is close to the invariant manifold, it stays so for a long time. We then move on to analyzing the dynamics on the manifold (Section 4) and it is here that we show that the drift is negligible. Section 5 provides the estimates that guarantee that we do approach the manifold, and in Section 6 we collect all these estimates and complete the proof of our main result (Theorem 1.1).

2. More on the mathematical set-up and sketch of proofs

2.1. On the linearized evolution

We introduce the Hilbert space or, more generally, the space for a general weight by using the rigged Hilbert space structure [10] with pivot space . In this way given an Hilbert space , dense in , for which the canonical injection of into is continuous, one automatically obtains a representation of – the dual space – in terms of a third Hilbert space into which is canonically and densely injected. If is the closure of under the squared norm , that is , the third Hilbert space is precisely . The duality between and is denoted in principle by , but less cumbersome notations will be introduced when the duality is needed (for example, below we drop the subscripts).

It is not difficult to see that for

(2.1)

where , respectively , is the primitive of (resp. ) such that (resp. ), see [6, § 2.2]. More precisely, if there exists such that and for every . One sees directly also that by changing one produces equivalent norms [22, §2.1] so, when the geometry of the Hilbert space is not crucial, one can simply replace the weight by , and in this case we simply write . Occasionally we will need also which is introduced in an absolutely analogous way.

Remark 2.1.

One observation that is of help in estimating weighted norms is that computing the norm of requires access to : in practice if one identifies a primitive of , then . This is just because for some and .

The reason for introducing weighted spaces is because, as one can readily verify, , given in (1.8), is symmetric in . A deeper analysis (cf. [6]) shows that is essentially self-adjoint, with compact resolvent. The spectrum of lies in , there is an eigenvalue with one dimensional eigenspace generated by . We therefore denote the set of eigenvalues of as , with and for . The set of eigenfunctions is denoted by and let us point out that it is straightforward to see that . Moreover, if is even (respectively, odd), then is even (respectively, odd): the notion of parity is of course the one obtained by observing that can be extended to a periodic function in . This implies that one can choose with that is either even or odd, and we will do so.

Remark 2.2.

By rotation symmetry the eigenvalues do not depend on the choice of , but the eigenfunctions do depend on it, even if in a rather trivial way: the eigenfunction of and just differ by a rotation of . We will often need to be precise about the choice of and for this it is worthwhile to introduce the notations

(2.2)

The eigenfunctions are normalized in .

Remark 2.3.

Some expressions involving weighted norms can be worked out explicitly. For example a recurrent expression in what follows is , for and . If is the primitive of such that , then we have , where is uniquely defined by , but of course the explicit value of is not used in the final expression. In practice however it may be more straightforward to use an arbitrary primitive of (i.e. is not necessarily zero) for which we have

(2.3)

Since now appears, let us make it explicit:

(2.4)

2.2. About the manifold

As we have anticipated, we look at the set of stationary solutions , defined in (2.2), as a manifold. For this we introduce

(2.5)

which is a metric space equipped with the distance inherited from , that is . We have and can be viewed as a smooth one dimensional manifold in . The tangent space at is and for every we define the projection on this tangent space as . The following result is proven in [32, p. 501] (see also [22, Lemma 5.1]):

Lemma 2.4.

There exists such that for all with

(2.6)

there is one and only one such that . Furthermore, the mapping is in , and (with the Fréchet derivative)

(2.7)

Note that the empirical (probability) measure that describes our system at time is in (see Appendix A) and Lemma 2.4 guarantees in particular that as soon as it is sufficiently close to there is a well defined projection on the manifold. Since the manifold is isomorphic to it is practical to introduce, for , also , uniquely defined by . It is immediate to see that the projection is .

2.3. A quantitative heuristic analysis: the diffusion coefficient

The proof of Theorem 1.1 is naturally split into two parts: the approach to and the motion on . The approach to is based on the properties of the PDE (1.4): in [21] it is shown, using the gradient flow structure of (1.4), that if the initial condition is not on the stable manifold (see (1.7)) of the unstable stationary solution , then the solution converges for time going to infinity to one of the probability densities (of course is a function of the initial condition), so given a neighborhood of after a finite time (how large it depends only on the initial condition), it gets to the chosen neighborhood: due to the regularizing properties of the PDE, such a neighborhood can be even in a topology that controls all the derivatives [21], but here there is no point to use a strong topology, since at the level of interacting diffusions we deal with a measure (that we inject into ). And in fact we have to estimate the distance between the empirical measure and the solution to (1.4) – controlling thus the effect of the noise – but this type of estimates on finite time intervals is standard. However here there is a subtle point: the result we are after is a matter of fluctuations and it will not come as a surprise that the empirical measure approaches but does not reach it (of course: just contains smooth functions, and is not a function), but it will stay in a -neighborhood (measured in the norm). How long will it take to reach such a neighborhood? The approach to is actually exponential and driven by the spectral gap () of the linearized evolution operator (at least close to ). Therefore in order to enter such a -neighborhood a time proportional to appears to be needed, as the quick observation that for suggests. The proofs on this stage of the evolution are in Section 5: here we just stress that

  1. controlling the effect of the noise on the system on times is in any case sensibly easier than controlling it on times of order , which is our final aim;

  2. on times of order it is no longer a matter of showing that the empirical measure stays close to the solution of the PDE: on such a time scale the noise takes over and the finite system, which has a non-trivial (random) dynamics, substantially deviates from the behavior of the solution to the PDE, which just converges to one of the stationary profiles.

Let us therefore assume that the empirical measure is in a -neighborhood of a given . It is reasonable to assume that the dominating part of the dynamics close to is captured by the operator and we want to understand the action of the semigroup generated by on the noise that stirs the system, on long times. Note that we cannot choose arbitrarily long times, in particular not times proportional to $N$ right away, because in view of the result we are after the stationary profile around which we linearize changes by an amount of order one. We will actually choose some intermediate time scale as we will see in § 2.4 and Remark 2.6, that guarantees that working with makes sense, i.e. that the projection of the empirical measure on is still sufficiently close to . The point is that the effect of the noise on intermediate times is very different in the tangential direction and the orthogonal directions to , simply because in the orthogonal directions there is a damping that is absent in the tangential direction. So on intermediate times the leading term in the evolution of the empirical measure turns out to be the projection of the evolution on the tangential direction, that is . One can now use Remark 2.3 to obtain

(2.8)

with a primitive of ( given in Remark 2.3). By applying Itô’s formula we see that the term in (2.8) can be written as the sum of a drift term and of a martingale term. It is not difficult to see that to leading order the drift term is zero (a more attentive analysis shows that one has to show that the next order correction does not give a contribution, but we come back to this below). The quadratic variation of the martingale term instead turns out to be equal to times

(2.9)

Since (note that is not normalized), (2.9) suggests that the diffusion coefficient in Theorem 1.1 is , which coincides with (1.11).

To make this procedure work one has to carefully put together the analysis on the intermediate time scale, by setting up an adequate iterative scheme. Several delicate issues arise and one of the challenging points is precisely to control that the drift can be neglected. In fact the first order expansion of the projection that we have used

(2.10)

is not accurate enough and one has to go to the next order, see Lemma A.5. This is due to the fact that the random contribution, which in principle appears as first order, fluctuates and generates a cancellation, so in the end the term is of second order.

Remark 2.5.

It is natural to expect that Theorem 1.1 also holds when , simply because the evolution is attracted to and the noise then causes an escape from this unstable profile after a time , since the exponential instability makes the fluctuations grow exponentially with a rate given by the linearized dynamics (linearized around of course). Arguments in this spirit can be found for example in [30, Ch. 5]; see [3] and references therein for the finite dimensional counterpart. However

  1. this is not so straightforward because it requires a good control on the dynamics on and around the heteroclinic orbits linking to [21, Section 5];

  2. the statement would require more details about the initial condition: the simple convergence to a point on is far from sufficient (the fluctuations of the initial conditions now matter!);

  3. in general the initial phase on is certainly going to be random: if the initial condition is rotation invariant (at least in law), for instance if are IID variables uniformly distributed on or if , one expects to be uniformly distributed on . Note however that uniform distribution of is definitely not expected in the general case and asymmetries in the initial condition should affect the distribution of .
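The rotation-invariant initial condition of item 3 can be illustrated with a small numerical sketch. It only concerns the initial datum, not the escape from the unstable profile: for IID uniform angles the empirical order parameter (1/N) sum_j exp(i phi_j) has modulus of order N^{-1/2}, so its phase is entirely determined by fluctuations and, by rotation invariance, is uniformly distributed. The function names, sample sizes and the use of the order parameter as a proxy are assumptions made for this illustration.

```python
import cmath
import math
import random


def initial_order_parameter(n, rng):
    """Modulus and phase of (1/n) * sum_j exp(i * phi_j) for IID uniform angles."""
    z = sum(cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n)) / n
    return abs(z), cmath.phase(z)


def quadrant_counts(n, samples, seed=0):
    """Count in which quarter of the circle the initial phase falls, over independent draws."""
    rng = random.Random(seed)
    counts = [0, 0, 0, 0]
    for _ in range(samples):
        _, psi = initial_order_parameter(n, rng)
        # cmath.phase returns a value in [-pi, pi]; shift it to [0, 2*pi) and bin into 4 arcs.
        # The final "% 4" guards against a rounding edge case at the upper endpoint.
        counts[int((psi % (2.0 * math.pi)) // (math.pi / 2.0)) % 4] += 1
    return counts
```

With a non-symmetric initial law the four counts would in general be unbalanced, in line with the last sentence of item 3.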

2.4. The iterative scheme

As we have explained in § 2.3, the analysis close to requires an iterative procedure, which we introduce here. We assume that at the system is already close to , while in practice this will happen after some time: in Section 6 we explain how to put together the results on the early stage of the evolution and the analysis close to , that we start here. So, for such that (here and below is the distance built with the norm of ), by Lemma 2.4 we can define . Applying the Itô formula to , we see that

(2.11)

where

(2.12)

and is the kernel of in . The evolution equation (2.11) and the noise term (2.12) have a meaning in , as well as the recentered empirical measures , and it is in this sense that we will use them: we detail this in Appendix A, where one also finds an explicit expression and some basic facts about the kernel . We have introduced here an abuse of notation that persists throughout the text: stands for .

Equations (2.11)–(2.12) are useful tools as long as we can properly define the phase associated to the empirical measure of the system and this phase is close to : in view of the result we want to prove, this is expected to be true for a long time, but it is certainly expected to fail for times of the order of , since on this timescale the phase changes by an amount that does not vanish as becomes large.

The idea is therefore to divide the evolution of the particle system up to a final time proportional to into time intervals , where and is chosen close to a fractional power of (see Remark 2.6). Moreover runs from up to so that and is equal to a positive constant (the of Theorem 1.1). If the empirical measure stays close to the manifold , we can define the projections of and successively update the recentering phase at all times . The point then will be essentially to show that the process given by these phases, on the time scale , converges to a Brownian motion.
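The bookkeeping of this time partition can be sketched as follows. This is only an illustration under stated assumptions: the block length is taken close to N**alpha with a hypothetical exponent `alpha` (the actual choice is the one of Remark 2.6), the final time is tau_f * N, and the number of blocks is rounded so that the blocks tile the horizon exactly. The function name `time_blocks` and its parameters are not from the paper.

```python
def time_blocks(n, tau_f, alpha):
    """Split [0, tau_f * n] into equal blocks of length close to n**alpha.

    alpha is a hypothetical placeholder for the exponent of the intermediate time scale.
    Returns the number of blocks, the block length and the grid of block endpoints.
    """
    horizon = tau_f * n
    # Round the number of blocks, then adjust the block length so that
    # n_f * block equals the horizon exactly (up to floating point).
    n_f = max(1, round(horizon / n ** alpha))
    block = horizon / n_f
    return n_f, block, [i * block for i in range(n_f + 1)]
```

For instance, `time_blocks(10**4, 2.0, 0.1)` returns a grid of endpoints at which the phase would be updated in a simulation; the number of blocks grows like N**(1 - alpha), which is why the errors made on each block have to be controlled uniformly.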

More formally, we construct the following iterative scheme: we choose

(2.13)

(see Remark 2.6), we set and for we define

(2.14)

if and

(2.15)

Then we set

(2.16)

for and , and otherwise for every (of course can be smaller than and, in this case, the definition becomes redundant). Therefore the process we have just defined solves for

(2.17)

where

(2.18)

Once again, we refer to Appendix A for the precise meaning of (2.17) and (2.18).

Remark 2.6.

For the remainder of the paper we choose and . The two exponents do not have any particular meaning: a look at the argument shows that the exponent for has in any case to be chosen smaller than , but then a number of technical estimates enter the game and we have settled for a value without trying to get the optimal value that comes out of the method we use.

3. A priori estimates: persistence of proximity to

The aim of this section is to prove that, if we are (say, at time zero) sufficiently close to , we stay close to for times . The arguments in this section justify the choice of the proximity parameter that we have made in the iterative scheme. We first prove some estimates on the size of the noise term and then we will give the estimates on the empirical measure.

3.1. Noise estimates

We define the event

(3.1)

where is defined precisely like , see (2.18), except for the replacement of with .

Lemma 3.1.

.

Proof.

In order to perform the estimates we introduce and work with approximated versions of and (see Lemma A.4). Define for

(3.2)

The kernel in this case is (cf. Appendix A)

(3.3)

where are the ordered eigenvalues of , are the associated eigenfunctions of unit norm in , cf. Remark 2.2, and are the eigenfunctions of , the adjoint in (see Appendix A).

Very much in the same way we define

(3.4)

with

(3.5)

We decompose for and

(3.6)

and an absolutely analogous formula holds for : in fact the bounds for