Mean field limit for disordered diffusions with singular interactions\thanksrefT2


Technische Universität Berlin and Bernstein Center for Computational Neuroscience
Institut für Mathematik
Technische Universität Berlin
Straße des 17. Juni 136
D-10623 Berlin
Germany
and
Bernstein Center for
 Computational Neuroscience
Philippstr. 13
D-10115 Berlin
Germany
Current address:
Laboratoire MAP5
Université Paris Descartes
45 rue des Saints Pères
75270 Paris Cedex 06
France
\printeade1
Institut für Mathematik
Technische Universität Berlin
Straße des 17. Juni 136
D-10623 Berlin
Germany
and
Bernstein Center for
 Computational Neuroscience
Philippstr. 13
D-10115 Berlin
Germany
\printeade2
Received January 2013; revised June 2013.
Abstract

Motivated by considerations from neuroscience (macroscopic behavior of large ensembles of interacting neurons), we consider a population of mean field interacting diffusions in the presence of a random environment and with spatial extension: each diffusion is attached to one site of the lattice , and the interaction between two diffusions is attenuated by a spatial weight that depends on their positions. For a general class of singular weights (including the case already considered in the physical literature when interactions obey a power law of parameter ), we address the convergence as of the empirical measure of the diffusions to the solution of a deterministic McKean–Vlasov equation and prove well-posedness of this equation, even in the degenerate case without noise. We also provide precise estimates of the speed of this convergence, in terms of an appropriate weighted Wasserstein distance, exhibiting in particular nontrivial fluctuations in the power-law case when . Our framework covers the case of polynomially bounded monotone dynamics that are especially encountered in the main models of neural oscillators.

DOI: 10.1214/13-AAP968. Volume 24, issue 5 (2014), pages 1946–1993.

\runtitle

Diffusions with singular interactions

{aug}

Eric Luçon (eric.lucon@parisdescartes.fr) and Wilhelm Stannat (stannat@math.tu-berlin.de)

Supported by the BMBF, FKZ 01GQ 1001B.

AMS subject classifications: Primary 60K35, 60G57; secondary 60F15, 35Q92, 82C44.

Keywords: Disordered models, weakly interacting diffusions, Wasserstein distance, spatially extended particle systems, dissipative systems, Kuramoto model, FitzHugh–Nagumo model.

1 Introduction

The purpose of this paper is to provide a general convergence result for the empirical distribution of spatially extended networks of mean field coupled diffusions in a random environment. The main novelty of the paper is to consider a family of interacting diffusions indexed by the box of volume in the -dimensional lattice () where the interaction between two diffusions in depends on their relative positions. We are in particular interested in diffusions modeling the spiking activity of neurons in a noisy environment. To motivate the mathematical model we want to work with, let us consider, as a particular example, a family of stochastic FitzHugh–Nagumo neurons (see 22657695 (), MR2674516 () and references therein for further neurophysiological insights on the model)

(1)

for , with exterior input current . The variable denotes the voltage activity of the neuron, and plays the role of a recovery variable. are independent Brownian motions modeling exterior stochastic forces. Depending on the parameters , the neurons exhibit an oscillatory, excitable or inhibitory behavior. Suppose that the precise values of are unknown, which will always be the case in real-world applications, but rather are given as independent and identically distributed random variables. From the point of view of statistical physics, this additional randomness in (1) may be considered as a disorder. For simplicity we suppose that the are independent of the time . Equation (1) can be written as

(2)

using the shorthand notation , , , and . We suppose that the individual neurons are coupled with the help of a possibly nonlinear and random coupling term () modeling electrical synapses between the neurons. The coupling intensity between neurons and will depend additionally on some weight ( may be thought of as a function of the distance, but not necessarily), so that the resulting system takes the following form:

(3)
(4)

The purpose of the paper is to address the behavior of system (3) in large populations (), under general assumptions on the dynamics , the coupling and the spatial constraint .
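To fix ideas about the FitzHugh–Nagumo example (1), the following display sketches one common parametrization of a stochastic FitzHugh–Nagumo neuron with disordered parameters. The symbols $V_i$, $w_i$, $a_i$, $b_i$, $I$, $\sigma_V$, $\sigma_w$, $B_i^V$ and $B_i^w$ are notation assumed here for illustration only; the exact form of (1) may differ.

% Illustrative sketch only: one standard form of the stochastic FitzHugh-Nagumo system.
% All symbols below are assumed notation, not taken from equation (1).
\begin{equation*}
\begin{aligned}
dV_i(t) &= \Bigl( V_i(t) - \tfrac{1}{3} V_i(t)^3 - w_i(t) + I \Bigr)\, dt + \sigma_V \, dB_i^V(t),\\
dw_i(t) &= a_i \bigl( b_i V_i(t) - w_i(t) \bigr)\, dt + \sigma_w \, dB_i^w(t).
\end{aligned}
\end{equation*}

In this sketch the disorder is the pair $(a_i, b_i)$, and the cubic term is what makes the drift only locally Lipschitz while keeping polynomial growth.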

1.1 Empirical measure and mean-field limit

All the statistical information of the neural ensemble is contained in the empirical distribution of the diffusions (with disorder and with renormalized position)

(5)

that can be seen as a random probability measure. {remark} The renormalization of the positions by maps to a discrete subset of . The necessity of this renormalization will become clear in the discussion of the spatial constraints later in this Introduction. Since we are interested in the collective behavior of a large number of neurons, as is the case for neural ensembles in the brain, understanding the asymptotic behavior of as is important.

Under the assumption that

(6)

for a general class of functions defined on , we prove, as part of our main results in this paper (see Theorems 2.2 and 2.3), that converges to a deterministic measure where is a weak solution of the McKean–Vlasov equation

For a formal derivation of this equation, we refer to the end of Section 2.4 below. The measure is called the mean field limit of the system (3). Through Theorems 2.2 and 2.3, we not only prove the convergence toward , but we also provide some explicit estimates on the speed of convergence in terms of an appropriate weighted Wasserstein distance.

1.2 Existing literature and motivations

1.2.1 The nonspatial case:

Of course, since there is no spatial interaction in this case, indexing the diffusions by a subset of is not relevant. Systems of type (3) are called mean field models (or weakly interacting diffusions) in statistical physics and have attracted much attention in the past years (see, e.g., McKean1967 (), Gartner (), Oelsch1984 (), Sznit84 (), daiPra96 ()), since they are capable of modeling complex dynamical behavior of various types of real-world models from physics to biology, like, for example, synchronization of large populations of individuals, collective behavior of social insects, emergence of synchrony in neural networks 22657695 (), 1108.2414 (), MR2998591 (), 1211.0299 () and providing particle approximations for various nonlinear PDEs appearing in physics MR1410117 (), MR2834721 (), MR2731396 (), Malrieu2003 (), MR2280433 ().

The most prominent example of such models is the Kuramoto model, which has been widely considered in the literature as the main prototype for synchronization phenomena (see, e.g., Acebron2005 (), Lucon2011 (), 1209.4537 (), GPP2012 (), Strogatz1991 ()),

(8)
(9)

where is the intensity of interaction and .
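For reference, a standard way of writing the Kuramoto model with disorder and noise is sketched below. The symbols $\theta_i$ (phases), $\omega_i$ (i.i.d. natural frequencies playing the role of the disorder), $K$ (intensity of interaction), $\sigma$ and $B_i$ are assumed notation; the displays (8) and (9) may use a different normalization.

% Illustrative sketch only: textbook form of the noisy Kuramoto model with disorder.
% The notation is assumed and need not match (8)-(9).
\begin{equation*}
d\theta_i(t) = \omega_i \, dt + \frac{K}{N} \sum_{j=1}^{N} \sin\bigl( \theta_j(t) - \theta_i(t) \bigr)\, dt + \sigma \, dB_i(t),
\qquad i = 1, \ldots, N.
\end{equation*}

The drift $\omega_i$ plays the role of the local dynamics and $\sin(\cdot)$ that of a bounded, globally Lipschitz interaction, with no spatial weight.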

In the context of weighted interactions, a notable attempt to go beyond pure mean field interactions has been to consider moderately interacting diffusions; see MR779460 (), Meleard1987 (), Jourdain1998 ().

1.2.2 The spatial case

The motivation for going beyond pure mean-field interaction comes from the biological observation that neurons do not interact in a mean-field way (see, e.g., PhysRevLett.110.118101 () and references therein), and a vast literature exists in physics about synchronization on general networks. In particular, several papers have already considered model (3) (in dimension ) for different choices of spatial weight defined in (6). In this paper, we will be particularly interested in two classes of spatial weights: {longlist}[(1)]

The -nearest-neighbor model: this model (see PhysRevLett.106.234102 (), PhysRevE.85.026212 ()) concerns the case where each diffusion only interacts with its neighbors within a box , where is smaller than ,

(10)
(11)

We are concerned in this work with the case where is proportional to , that is,

(12)

for a fixed proportion . {remark} The case of corresponds to the mean field case. Understanding the behavior of system (1.2.2) in the case of a pure local interaction (i.e., when ) does not enter into the scope of this work. In particular, we will not address the question of of order smaller than (e.g., for some ), whose behavior as seems to be quite different.

Under assumption (12), the -nearest-neighbor model (1.2.2) enters into the framework of (3) for the following choice of in (6):

(13)

The power-law model: this model, also considered in the physical literature (see PhysRevE.82.016205 (), PhysRevE.85.066201 (), PhysRevE.66.011109 (), PhysRevE.54.R2193 ()), corresponds to the case where in (6) is given by

(14)

for some parameter , that is,

(15)
(16)

Note that the pure mean field case corresponds again to . As observed in the articles mentioned above on the basis of numerical simulations, it appears that the behavior of the system is strongly dependent on the value of the parameter . The situation which is considered in this paper corresponds to the subcritical case where the parameter is smaller than the dimension

(17)

The case of is much more delicate and will be the object of future work. We refer to Remark 2.2 below for further explanations on this case.

It is easy to see that in the case of (17) the renormalization of the positions by a factor in (15) is necessary: by standard arguments, the diverging series is of order . Consequently, is of order , so that we should expect a nontrivial limit in (15), as .
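The scaling argument above can be made explicit by a Riemann-sum heuristic. The sketch below assumes notation that is not visible in this passage: a box $\Lambda_N$ of side length $2N$ in $\mathbb{Z}^d$, so that $|\Lambda_N| = (2N+1)^d$, rescaled positions $x_i = i/(2N)$, and a power-law exponent $\alpha < d$.

% Heuristic sketch under assumed notation (Lambda_N, x_i = i/(2N), alpha < d).
% Step 1: the unnormalized series diverges like N^{d - alpha}.
\begin{equation*}
\sum_{j \in \Lambda_N,\, j \neq i} \frac{1}{|i - j|^{\alpha}} \asymp N^{d - \alpha}
\qquad (\alpha < d).
\end{equation*}
% Step 2: after rescaling the positions, the normalized sum is a Riemann sum.
\begin{equation*}
\frac{1}{|\Lambda_N|} \sum_{j \in \Lambda_N,\, j \neq i} \frac{1}{|x_i - x_j|^{\alpha}}
= \frac{(2N)^{\alpha}}{(2N+1)^{d}} \sum_{j \neq i} \frac{1}{|i - j|^{\alpha}}
\approx \int_{[-1/2,\,1/2]^d} \frac{dy}{|x_i - y|^{\alpha}} .
\end{equation*}
% Step 3: the limiting integral is finite exactly when alpha < d (polar coordinates).
\begin{equation*}
\int_{|z| \leq R} \frac{dz}{|z|^{\alpha}}
= |S^{d-1}| \int_0^R r^{d - 1 - \alpha}\, dr
= |S^{d-1}| \, \frac{R^{d - \alpha}}{d - \alpha} < \infty .
\end{equation*}

The same polar-coordinate computation shows why the integrability breaks down as soon as the exponent reaches the dimension.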

1.3 Main lines of proof and organization of the paper

The strategy usually used in the literature on mean-field models (see Gartner (), Jourdain1998 (), Lucon2011 (), Oelsch1984 ()) for the convergence of the empirical measure (5) is the following: first prove tightness of in the set of measure-valued continuous processes and second, prove uniqueness of any possible limit points, that is, uniqueness in the McKean–Vlasov equation (1.1).

In our context, a priori uniqueness in (1.1) appears unclear, due to the fact that our model includes singular spatial weights [discontinuous in (13) and singular in (14)] and also a class of dynamics with polynomial growth and no global Lipschitz continuity; recall the FitzHugh–Nagumo case (1). Note that we are also concerned with the case where is degenerate (even identically zero), for which uniqueness in (1.1) is also not clear.

To bypass this difficulty, we adopt a converse strategy: we first prove existence of a solution to the mean-field limit (1.1) (through an ad-hoc fixed point argument, using ideas from Sznitman SznitSflour ()). Second, via a propagator method (see MR1741805 () for related ideas), we prove the convergence (with respect to a Wasserstein-like distance adapted to the singularities of the interaction) of the empirical measure to any solution to (1.1). In particular, easy byproducts of this method are uniqueness of any solution to (1.1) as well as explicit rates of convergence to the McKean–Vlasov limit. In that sense, one of the main conclusions of the paper is to exhibit a phase transition in the size of the fluctuations in the power-law case; see Theorem 2.3. An actual central limit theorem in this case is of course a natural perspective and is currently under investigation.

The paper is organized as follows: we give in Section 2 the main assumptions on the model and we state the main results (Theorems 2.2 and 2.3). Section 3 contains the proof of Proposition 2.1 concerning the existence of a solution to the McKean–Vlasov equation (1.1). Section 4 summarizes the main ideas and results concerning the propagator method. The proofs of the laws of large numbers are provided in Section 5 for the -nearest-neighbor case and in Section 6 for the power-law case. An additional regularity assumption is made in Sections 4 to 6, which is discarded in Section 7.

2 Mathematical set-up and main results

2.1 The model

Fix , , and let be the hypercube and be its volume. We consider diffusions on with values in the state space\footnote{Note that it is also possible to choose as the circle in the case of the Kuramoto model, but we will stick to for simplicity.} for a certain .

Each diffusion is attached to the site of . The local dynamics of is governed by the following stochastic differential equation which is perturbed by a random environment represented by a vector ():

(18)

where is the covariance matrix, is a function from to , and is a given sequence of independent Brownian motions in .

The vectors are supposed to be i.i.d. realizations of a law and are hence seen as a random environment for the diffusions.

When connected to the others, the diffusions interact in a mean field way with spatial extension,

(19)
(20)

where is a function from to , and is a function from to . The required assumptions for the function will be made precise in Assumption 2.2 below. One should notice at this point that does not need to depend on the difference .

We suppose that, at time , the variables are independent and identically distributed according to a probability distribution on .

{remark}

Instead of considering diffusions on , we can also suppose periodic boundary conditions, that is, when is replaced by , where is the discrete -torus, that is, with and identified. The only thing that changes in what follows in the continuous model is that one should replace by where . Since the corresponding changes in the proofs of this paper remain marginal, we will restrict ourselves to the nonperiodic case and leave it to the interested reader to make the appropriate modifications in the periodic case.

2.2 Notation and assumptions

From now on, we will suppose that the following assumptions (Assumptions 2.2, 2.2 and 2.2) are satisfied throughout the paper. In particular, saying that Assumption 2.2 is true means that we are either in the -nearest-neighbor case or in the power-law case; see hypotheses (H1) and (H2) below. {assmp}[(Hypothesis on and )] We make the following assumptions:

  • The function is supposed to be locally Lipschitz-continuous in  (for fixed ) and satisfy a one-sided Lipschitz condition w.r.t. the two variables ,

    (21)

    for some constant (not necessarily positive). We also assume a polynomial bound on the function ,

    (22)

    for some constant and where and .

  • The interaction term is supposed to be bounded by and globally Lipschitz-continuous on , with a Lipschitz constant .

We also assume that for fixed , the functions and are twice differentiable with continuous derivatives.

{remark}

Assumption 2.2 is in particular satisfied for the FitzHugh–Nagumo case. One technical difficulty is that the dynamics is not globally Lipschitz continuous, which entails some technical complications in the following. Note also that the constant mentioned in (21) plays no role in the estimates of Sections 4 to 6; it only enters in Section 3.
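As a quick sanity check of the one-sided Lipschitz condition, consider the scalar cubic nonlinearity $f(v) = v - v^3/3$ that is typical of the FitzHugh–Nagumo voltage equation. This form of $f$ and the variables $u, v \in \mathbb{R}$ are assumptions made for illustration; the computation is a sketch, not the verification of (21) for the full system.

% Illustrative check (assumed scalar nonlinearity f(v) = v - v^3/3).
\begin{equation*}
\bigl( f(v) - f(u) \bigr)(v - u)
= (v - u)^2 - \tfrac{1}{3}\,(v^3 - u^3)(v - u)
= (v - u)^2 \Bigl( 1 - \tfrac{1}{3}\,(v^2 + uv + u^2) \Bigr)
\leq (v - u)^2 ,
\end{equation*}
% The last inequality uses that the quadratic form below is nonnegative.
\begin{equation*}
v^2 + uv + u^2 = \Bigl( v + \tfrac{u}{2} \Bigr)^2 + \tfrac{3}{4}\, u^2 \geq 0 .
\end{equation*}

So this $f$ satisfies a one-sided Lipschitz bound with constant $1$, although $f'(v) = 1 - v^2$ is unbounded and $f$ is therefore not globally Lipschitz.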

{assmp}

[(Assumptions on and )] We suppose that the initial distribution of satisfies the following moment condition:

(23)

and that the law of the disorder satisfies the moment condition

(24)

where the constants and are given by (22) in Assumption 2.2. {assmp}[(Assumptions on the weight )] In order to cover both the -nearest-neighbor model and the power-law interaction introduced in Section 1.2.2, we suppose that either hypothesis (H1) or hypothesis (H2) holds: {longlist}[(H1)]

-nearest-neighbor:

(25)

where is defined in (13).

Power-law: the function is supposed to be a nonnegative function on such that the following properties are satisfied:

(26)
(27)

for some parameters and chosen to be

(29)
{remark}

Note that we could have chosen simply in any case, but this would have led to worse convergence rates than the ones that we obtain below in Theorem 2.3. Of course, the main prototype for hypothesis (H2) is when , for [recall (14)]. However, the assumptions made in (H2) cover a larger class of examples: the reader may think of the general case of , for a bounded Lipschitz-continuous function . Note also that the case of bounded Lipschitz interactions is also captured (take ).

{remark}

[(About the supercritical case)] The case of a power-law interaction with is more delicate and requires more attention. To our knowledge, no continuous limit has been proposed in the literature in this case. We are only aware of PhysRevE.82.016205 (), where system (30) below is considered for finite .

One trivial observation is that the series is in this case already convergent. Consequently, an interaction term of the form simply vanishes to as . Hence, the correct model in this case is where the factor is absent,

(30)
(31)

The main difficulty for the derivation of the correct continuous limit in the case of (30) lies in the fact that the interaction term is not sufficiently mixing: if it exists, the McKean–Vlasov limit in this case should be random. We believe that the correct continuous limit should be governed by a stochastic partial differential equation instead of a deterministic PDE. This case is currently under investigation and will be the subject of future work.

2.3 The empirical measure

Let us consider for fixed horizon and time , the empirical measure [introduced in (5)],

(32)

as a probability measure on . Here

(33)

2.4 The McKean–Vlasov equation

The convergence of the empirical measure at is clear: since are i.i.d. random variables sampled according to , the initial empirical measure converges, as , to

(34)

An application of Itô’s formula to (19) [for any bounded function of class w.r.t. with bounded derivatives] leads to the following martingale representation for :

where is a martingale. Note that we use here the usual duality notation for the integral of a test function against a measure .
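The reason the martingale term disappears in the limit can be sketched by a second-moment estimate. The notation below is assumed for illustration: $\theta_i$ for the diffusions, $\omega_i$ for the disorder, $x_i$ for the rescaled positions, $\sigma$ for a constant diffusion matrix, $B_i$ for the independent Brownian motions and $M_t^{N,f}$ for the martingale associated with a test function $f$ with bounded gradient in $\theta$.

% Heuristic sketch under assumed notation; sigma is taken constant here.
\begin{equation*}
M_t^{N,f} = \frac{1}{|\Lambda_N|} \sum_{i \in \Lambda_N} \int_0^t \nabla_\theta f\bigl( \theta_i(s), \omega_i, x_i \bigr) \cdot \sigma \, dB_i(s) .
\end{equation*}
% Independence of the Brownian motions kills the cross terms; Ito's isometry handles the diagonal ones.
\begin{equation*}
\mathbf{E}\Bigl[ \bigl( M_t^{N,f} \bigr)^2 \Bigr]
= \frac{1}{|\Lambda_N|^2} \sum_{i \in \Lambda_N} \mathbf{E} \int_0^t \bigl| \sigma^{T} \nabla_\theta f\bigl( \theta_i(s), \omega_i, x_i \bigr) \bigr|^2 ds
\leq \frac{t \, |\sigma|^2 \, \| \nabla_\theta f \|_\infty^2}{|\Lambda_N|} .
\end{equation*}

Under these assumptions the martingale is of order $|\Lambda_N|^{-1/2}$ in $L^2$, which is the usual mean field fluctuation scale.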

Taking formally in (2.4) shows that any limit point of should satisfy the following nonlinear McKean–Vlasov equation:

where is the weight function introduced either in hypotheses (H1) or in (H2). {remark} An important remark about a priori properties of (2.4) is the following: taking a test function in (2.4) that does not depend on implies

In particular, the marginal distribution of w.r.t. the measure is independent of and equal to . This implies that, for the class of singular weights we consider here, is always integrable against , for all , since the function is integrable w.r.t. the Lebesgue measure on .

Moreover, since the function is supposed to have polynomial growth [recall (22)], one has to justify in particular the term in (2.4) (the others are easily integrable). Thus, one should look for solutions with finite moments: for all , .

In particular, well-posedness in (2.4) will be addressed within the class of all measure-valued processes satisfying the properties mentioned above.

Formally integrating by parts in equation (2.4) and assuming the existence of a density , satisfies

(37)
(38)

In the case where is nondegenerate, one can make this integration by parts rigorous: using the same arguments as in GLP2011 (), Appendix A, one can show that for any measure-valued initial condition in (2.4), by the regularizing properties of the heat kernel, the solution of (2.4) has a regular density for all positive time that solves (37). We refer to GLP2011 (), Proposition A.1, for further details. But of course, if is degenerate, the strong formulation (37) does not necessarily make sense, and one has to restrict to the weak formulation (2.4) in that case.

2.5 Results

The first result of this paper, whose proof is given in Section 3, concerns the existence of a weak solution to the McKean–Vlasov equation (2.4):

Proposition 2.1

Under Assumptions 2.2, 2.2 and 2.2, for any initial condition , there exists a solution to (2.4).

Having proven the existence of at least one such solution in the general case, we turn to the issue of the convergence of the empirical measure to any such solution. From now on, we specialize the problem to the case of hypothesis (H1) (Section 2.5.1) and of hypothesis (H2) (Section 2.5.2). For each case, in order to state the convergence result, one needs to define an appropriate distance between two random measures that is basically the supremum over evaluations against a set of test functions. Such a space of test functions must incorporate the kind of singularities that are present in hypothesis (H1) or (H2).

2.5.1 The -nearest-neighbor case

Suppose that the weight function satisfies hypothesis (H1) of Assumption 2.2. {definition}[(Test functions for -nearest-neighbor)] For fixed and , let be the set of functions on of the form

where is given in (13) and is globally Lipschitz-continuous w.r.t.

(39)
(40)

Let

be the corresponding seminorm. {remark} Note that for any that is in the variable , the following estimate holds:

(41)

We now turn to the appropriate distance between two random measures: {definition}[(Distance for -nearest-neighbor)] For random probability measures and on , let

where the supremum is taken over all functions , such that , .

Our convergence result is given in the following:

Theorem 2.2 ((Law of large numbers))

Under Assumptions 2.2, 2.2 and hypothesis (H1) of Assumption 2.2, for all , for any arbitrary solution to the mean-field equation (2.4), we have

(42)

where the constant only depends on , , and .

2.5.2 The case of the power-law interaction

Assume that the weight function satisfies hypothesis (H2). In view of the form of in this case (recall Assumption 2.2), the main idea is to consider test functions that become regular when renormalized by . The seminorm introduced in (2.5.2) below should therefore be thought of as a weighted Hölder seminorm. {definition}[(Test functions for power-law interaction)] For fixed and as in Assumption 2.2 and for fixed , let be the set of functions on satisfying:

  • regularity w.r.t. : is globally Lipschitz-continuous on , uniformly in , that is,

    (43)
    (44)
  • regularity w.r.t. : is uniformly bounded

    (45)

    and is globally -Hölder, uniformly in

    (46)
    (47)

Denote by

the corresponding seminorm.

{remark}

Note that for any that is in the variable , the following holds:

(49)

The corresponding definition of the distance between two random measures is similar to Definition 2.5.1 given in the -nearest-neighbor case. The main difference here is that one needs to take care of test functions with singularities. Since those singularities happen at points of the form (for some and ) that are regularly distributed on , we first need to introduce some further notation: for all integers , we denote by the regular discretization of with mesh of length

The appropriate distance between two random measures is then: {definition}[(Distance for power-law interaction)] Let and be defined by

(51)

where stands for the smallest integer strictly larger than . On the set of random probability measures on , let us define a sequence of distances indexed by , between two elements and by

where the supremum is taken over all the functions , such that . Let us then define the distance by

(52)

for a sufficiently large constant (that depends on the parameters of our model) and where is the conjugate of : . For a precise estimate on , we refer to Proposition 6.4 below. Apart from the weight (which is there precisely to compensate for the estimate that we find in Proposition 6.4 below), the definition of exactly follows the usual Fréchet construction; see, for example, Gelfand1964 (). {remark} The choice of the integer in (51) is made for integrability reasons that will become clear in the proof of Theorem 2.3. One only has to notice here that has been precisely defined so that its conjugate always satisfies . The main result of this work is the following:

Theorem 2.3 ((Law of large numbers in the power-law case))

Under Assumptions 2.2, 2.2 and hypothesis (H2) of Assumption 2.2, for any arbitrary solution to the mean-field equation (2.4), we have

(53)

where the constant only depends on , , , and .

Note that the speed of convergence found in Theorem 2.3 is never smaller than which is the optimal speed for the case without spatial extension; recall the CLT results in the mean field case in Lucon2011 (). Note also that, in the case where , we have obtained a speed of convergence which is arbitrarily close to (since in that case is arbitrarily close to ). We believe that the optimal speed in this case should be exactly , but the proof we propose in this work does not seem to reach this optimal result.

Nevertheless, in the case where we only consider a bounded Lipschitz-continuous weight function (i.e., with no singularity at all), the proof of Theorem 2.3 can be considerably simplified and one obtains a speed that is .

Note also that the fluctuations when appear to be nontrivial. A natural perspective of this work would be to prove a precise central limit theorem in this case and to study the limiting fluctuation process in detail.

2.6 Well-posedness of the McKean–Vlasov equation

A straightforward corollary of Theorems 2.2 and 2.3 is that uniqueness holds for the McKean–Vlasov equation (2.4):

Proposition 2.4 ((Well-posedness of the McKean–Vlasov equation))

Under Assumptions 2.2, 2.2 and 2.2, for every initial condition , there exists a unique solution to the McKean–Vlasov equation (2.4).
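The way uniqueness follows from the laws of large numbers can be sketched by a triangle inequality. The notation is assumed for illustration: $\nu^{(N)}$ denotes the empirical measure, $\nu^1$ and $\nu^2$ two solutions of (2.4) with the same initial condition, and $d$ the distance used in Theorem 2.2 or Theorem 2.3.

% Heuristic sketch under assumed notation: both solutions are limits of the same empirical measure.
\begin{equation*}
d\bigl( \nu^1, \nu^2 \bigr)
\leq d\bigl( \nu^1, \nu^{(N)} \bigr) + d\bigl( \nu^{(N)}, \nu^2 \bigr)
\xrightarrow[N \to \infty]{} 0 .
\end{equation*}

Since the left-hand side does not depend on $N$, it vanishes, and $\nu^1 = \nu^2$ follows provided the class of test functions defining $d$ is rich enough to separate deterministic measures, an assumption of this sketch.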

3 The nonlinear process and the existence of a continuous limit

The purpose of this section is to prove Proposition 2.1 concerning the existence of a solution to the McKean–Vlasov equation (2.4). This part is reminiscent of the techniques used by Sznitman SznitSflour () in order to prove propagation of chaos for nondisordered models.

3.1 Distance on probability measures

Let us first consider the set of probability measures on with finite moments of order [where is given in (22)] and endow this set with the Wasserstein metric

(54)

where the infimum in (54) is considered over all couplings with respective marginals and . Here, the are understood as random variables on a certain probability space . Note, however, that the definition of (54) does not depend on its particular choice. Equation (54) defines a complete metric on