
Noisy heteroclinic networks

Abstract.

We consider a white noise perturbation of dynamics in the neighborhood of a heteroclinic network. We show that under the logarithmic time rescaling the diffusion converges in distribution in a special topology to a piecewise constant process that jumps between saddle points along the heteroclinic orbits of the network. We also obtain precise asymptotics for the exit measure for a domain containing the starting point of the diffusion.

1. Introduction

In this note, we study small noise perturbations of a smooth continuous time dynamical system in the neighborhood of its heteroclinic network.

The deterministic dynamics is defined on $\mathbb{R}^d$ as the flow $(S^t)_{t\in\mathbb{R}}$ generated by a smooth vector field $b:\mathbb{R}^d\to\mathbb{R}^d$, i.e. $X(t)=S^t x$ is the solution of the initial-value problem

(1.1)
\[
\dot X(t)=b(X(t)),\qquad X(0)=x.
\]

We assume that the vector field $b$ generates a heteroclinic network, that is, a set of isolated critical points connected by heteroclinic orbits of the flow. Heteroclinic orbits arise naturally in systems with symmetries. Moreover, they are often robust under perturbations of the system preserving the symmetries; see the survey [9] and references therein for numerous examples and a discussion of mechanisms of robustness.

We consider the system (1.1) perturbed by uniformly elliptic noise:

(1.2)
\[
dX_\varepsilon(t)=b(X_\varepsilon(t))\,dt+\varepsilon\,\sigma(X_\varepsilon(t))\,dW(t),\qquad X_\varepsilon(0)=x_0,
\]

where $W$ is a standard $d$-dimensional Wiener process, $\sigma(x)$ is a nondegenerate $d\times d$ matrix of diffusion coefficients for every $x\in\mathbb{R}^d$, and $\varepsilon>0$ is a small number. The initial point $x_0$ is assumed to belong to one of the heteroclinic orbits.

Our principal result on the vanishing noise intensity asymptotics can be informally stated as follows.

Theorem. Under some technical nondegeneracy assumptions, as $\varepsilon\to 0$, the process $X_\varepsilon$, under the logarithmic time rescaling, converges in distribution in an appropriate topology to a process that spends all the time on the set of critical points and jumps instantaneously between them along the heteroclinic trajectories.

In fact, we shall provide much more detailed information on the limiting process and describe its distribution precisely. Thus, our result provides a unified and mathematically rigorous background for the existing phenomenological studies, see e.g. [1]. In particular, we shall see that in many cases the limiting process is not Markov. The precise description of the limiting process allows us to obtain asymptotics of the exit distribution for domains containing the starting point. This asymptotic result is of a different kind than the one provided by the classical Freidlin–Wentzell (FW) theory, see [5]. In fact, it allows us to compute precisely the limiting probabilities of specific exit points that are indistinguishable from the point of view of the FW quasi-potential.

To prove the main result, we have to trace the evolution of the process along the heteroclinic orbits and in the neighborhood of hyperbolic critical points. The latter was studied in [8] and [2], where it was demonstrated that, in the vanishing diffusion limit, out of all possible directions in the unstable manifold the system chooses to evolve along the invariant curve associated with the highest eigenvalue of the linearization of the system at the critical point; the asymptotics for the exit time were also obtained there. However, these results are not sufficient for the derivation of our main theorem, and a more detailed analysis is required.
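As a minimal illustration of this selection mechanism (an elementary linear computation included here for convenience, not a statement taken from [8] or [2]), consider a linear flow on a two-dimensional unstable manifold with simple eigenvalues $\lambda_1>\lambda_2>0$ and eigenvectors $v_1,v_2$:

\[
\dot x=Ax,\qquad x(t)=c_1e^{\lambda_1 t}v_1+c_2e^{\lambda_2 t}v_2,\qquad
\frac{x(t)}{|x(t)|}\ \longrightarrow\ \pm\frac{v_1}{|v_1|}\quad\text{as }t\to\infty,\ \text{whenever }c_1\neq 0.
\]

Thus any trajectory with a nonzero component along $v_1$ leaves a neighborhood of the critical point essentially along the $v_1$-direction, and the small noise supplies such a component with overwhelming probability.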

The paper is organized as follows. In Section 2, we study a simple example of a heteroclinic network. In Section 3, we give a non-rigorous analysis of the behavior of the diffusion in the neighborhood of one saddle point. The general setting and the main result on weak convergence are given in Section 5. To state the result we need to define in what sense the convergence is understood. Therefore, we begin our exposition in Section 4 with a brief description of the relevant metric space, postponing the proofs of all technical statements concerning the metric space until Section 12. In Section 6, we state a result on the exit asymptotics and derive it from the main result of Section 5. In Section 7, we give the statement of the central technical lemma and use it to prove the main result. The proof of the central lemma is split into two parts. They are given in Section 8 and Section 9, respectively. Proofs of some auxiliary statements used in Section 8 are given in Section 11. Section 10 is devoted to an informal discussion of implications of our results and their extensions.

Acknowledgments. I started to think about heteroclinic networks in the Fall of 2006, after I attended a talk on their applications to neural dynamics by Valentin Afraimovich (see his paper [10]). I am grateful to him for the several stimulating discussions with which this work began, as well as for his words of encouragement, without which it could not have been finished. I am also grateful to NSF for their support via a CAREER grant.

2. A simple example

Here we recall a simple example of a heteroclinic network described in [9]. If , the deterministic cubic system defined by the drift of the following stochastic system

(2.1)

has 6 critical points

The matrix of the linearization of the system at is given by

and the linearizations at all other critical points can be obtained from it using the symmetries of the system. We see that if

(2.2)

then all the critical points are saddles with one unstable direction corresponding to the eigenvalue . It is shown in [9] that system (2.1) admits 12 orbits connecting to , to , and to . Each of these orbits lies entirely in one of the coordinate planes.

Let us equip (2.1) with an initial condition on one of the 12 heteroclinic connections, say, on the one connecting to denoted by .

The theory developed in this paper allows us to describe precisely the limiting behavior of the process as $\varepsilon\to 0$. Namely, the process will stay close to the heteroclinic network, moving mostly along the heteroclinic connections between the saddle points. At each saddle point it spends a time of order $\ln\varepsilon^{-1}$. More precisely, the logarithmically time-rescaled process converges to a process that jumps instantaneously along the initial heteroclinic connection to the saddle point at its end, sits at that saddle point for some time, then chooses one of the two outgoing orbits and jumps along it instantaneously, spends some time at the endpoint of that orbit, then chooses a new outgoing orbit to follow, etc. However, the details of the limiting process depend crucially on the eigenvalues of the linearization at the critical points.

If at each saddle point the contraction is stronger than expansion, i.e., if

which is equivalent to , then the system exhibits loss of memory, and the sequence of saddle points visited by the limiting process is a standard random walk on the directed graph formed by the network, i.e. a Markov chain on the saddle points that at each point chooses to jump along one of the two possible outgoing connections with probability 1/2.

If the expansion is stronger than the contraction, i.e., , then at , the first saddle point visited, the process still chooses each of the two possible next heteroclinic connections with probability 1/2. However, if it chooses , then it will cycle through and never visit any other critical points. If it chooses , then it will cycle through and never visit any other critical points. This situation is strongly non-Markovian, because the choice the system makes at that saddle point at any later time is determined by its choice during the first visit to it.

The intermediate case combines certain features of the two situations described above. The limiting random saddle point sequence explores all heteroclinic connections, but it makes asymmetric choices determined by its history, also producing non-Markov dynamics.

Besides the probability structure of limiting saddle point sequences, our main result also provides a description of the random times spent by the process at each of the saddles it visits.
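Since the explicit form of (2.1) is not displayed above, the following self-contained sketch assumes a Guckenheimer–Holmes-type cubic drift with the symmetries and the six coordinate-axis saddle points described in this section; the parameter values and function names are illustrative. It integrates the perturbed system by the Euler–Maruyama scheme and records the saddle points that the trajectory passes near, which makes the cycling along the network visible for small noise.

```python
import numpy as np

def drift(x, a=1.2, b=0.8):
    # Assumed Guckenheimer-Holmes-type cubic field: critical points at
    # (+-1,0,0), (0,+-1,0), (0,0,+-1), each a saddle with one unstable direction.
    x1, x2, x3 = x
    r2 = x1**2 + x2**2 + x3**2
    return np.array([
        x1 * (1 - r2 - a * x2**2 + b * x3**2),
        x2 * (1 - r2 - a * x3**2 + b * x1**2),
        x3 * (1 - r2 - a * x1**2 + b * x2**2),
    ])

def visited_saddles(eps=1e-4, dt=1e-3, T=200.0, x0=(0.7, 0.7, 0.0), seed=0):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)      # roughly on a connection in the plane x3 = 0
    labels = []
    for _ in range(int(T / dt)):
        # Euler-Maruyama step for dX = b(X) dt + eps dW
        x = x + drift(x) * dt + eps * np.sqrt(dt) * rng.standard_normal(3)
        k = int(np.argmax(np.abs(x)))
        if abs(x[k]) > 0.9 and np.linalg.norm(np.delete(x, k)) < 0.1:
            label = ('+' if x[k] > 0 else '-') + 'e%d' % (k + 1)
            if not labels or labels[-1] != label:
                labels.append(label)
    return labels

if __name__ == "__main__":
    print(visited_saddles())   # the sequence of saddle labels visited in turn
```

With the contraction chosen stronger than the expansion (a > b above), the recorded sequence keeps cycling through the whole network, in line with the loss-of-memory regime discussed above.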

3. Nonrigorous analysis of a linear system

In this section we consider the diffusion near a saddle point in the simplest possible case, a two-dimensional linear system

\[
dX(t)=\lambda X(t)\,dt+\varepsilon\,dW_1(t),\qquad
dY(t)=-\mu Y(t)\,dt+\varepsilon\,dW_2(t),
\]

with initial conditions $X(0)=0$, $Y(0)=y_0>0$. Here $\lambda,\mu>0$, and $W_1$, $W_2$ are i.i.d. standard Wiener processes.

Let us study the exit distribution of $Y$ for the strip $\{(x,y):\ |x|\le 1\}$. In other words, we are interested in the distribution of $Y(\tau_\varepsilon)$, where $\tau_\varepsilon=\inf\{t\ge 0:\ |X(t)|=1\}$. Duhamel’s principle implies

\[
X(t)=\varepsilon\,e^{\lambda t}\int_0^t e^{-\lambda s}\,dW_1(s).
\]

The integral in the r.h.s. converges a.s., as $t\to\infty$, to a centered Gaussian r.v. $N$, so that for large $t$,

\[
X(t)\approx\varepsilon\,e^{\lambda t}N.
\]

Therefore, for the exit time $\tau_\varepsilon$, we have to solve

\[
\varepsilon\,e^{\lambda\tau_\varepsilon}|N|\approx 1,
\]

so that

\[
\tau_\varepsilon\approx\frac{1}{\lambda}\ln\frac{1}{\varepsilon|N|}.
\]

So, the time spent by the diffusion in the neighborhood of the saddle point is about $\lambda^{-1}\ln\varepsilon^{-1}$.

On the other hand,

\[
Y(t)=e^{-\mu t}y_0+\varepsilon\,e^{-\mu t}\int_0^t e^{\mu s}\,dW_2(s),
\]

and the integral in the r.h.s., together with the prefactor $e^{-\mu t}$, converges in distribution, as $t\to\infty$, to $N'$, a centered Gaussian r.v. Plugging the expression for $\tau_\varepsilon$ into this relation, we get

\[
Y(\tau_\varepsilon)\approx e^{-\mu\tau_\varepsilon}y_0+\varepsilon N'\approx(\varepsilon|N|)^{\mu/\lambda}y_0+\varepsilon N'.
\]

Therefore, when the contraction is stronger than the expansion ($\mu>\lambda$), the limiting exit distribution, on the scale $\varepsilon$, is centered Gaussian. In the opposite case ($\mu<\lambda$), the limiting exit distribution is strongly asymmetric and concentrated on the positive semiline, reflecting the fact that the initial condition $y_0$ was positive and producing a strong memory effect. In the intermediate case ($\mu=\lambda$), the limit is the distribution of the sum of a symmetric r.v. $\varepsilon N'$ and a positive r.v. $\varepsilon|N|y_0$, and the resulting asymmetry also serves as a basis for a certain memory effect.
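In summary, keeping only the dominant term in each regime, the approximate identity above gives

\[
Y(\tau_\varepsilon)\approx(\varepsilon|N|)^{\mu/\lambda}y_0+\varepsilon N'\sim
\begin{cases}
\varepsilon N', & \mu>\lambda\quad\text{(symmetric limit, memory lost)},\\
(\varepsilon|N|)^{\mu/\lambda}y_0, & \mu<\lambda\quad\text{(one-sided limit, strong memory)},\\
\varepsilon\bigl(|N|y_0+N'\bigr), & \mu=\lambda\quad\text{(skewed limit, partial memory)}.
\end{cases}
\]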

In general, the asymmetry in the exit distribution means that at the next visited saddle point the choices of the exit direction will not be symmetric, thus leading to non-Markovian dynamics. Notice that the three types of behavior for the linear system that we have just derived correspond to the three types of cycling through the saddle points in the example of Section 2.
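The heuristics above can be checked by a quick Monte Carlo sketch (with the notation of this section and illustrative parameter values): it samples the exit value $Y(\tau_\varepsilon)$ of the linear system by the Euler–Maruyama scheme and reports the fraction of positive exit values in the three regimes.

```python
import numpy as np

def exit_sample(lam, mu, eps, y0=1.0, dt=1e-3, rng=None):
    # One realization of dX = lam*X dt + eps dW1, dY = -mu*Y dt + eps dW2,
    # started at X(0) = 0, Y(0) = y0; returns Y at the first time |X| = 1.
    rng = rng or np.random.default_rng()
    x, y = 0.0, y0
    sdt = np.sqrt(dt)
    while abs(x) < 1.0:
        dw1, dw2 = rng.standard_normal(2)
        x += lam * x * dt + eps * sdt * dw1
        y += -mu * y * dt + eps * sdt * dw2
    return y

if __name__ == "__main__":
    eps, rng = 1e-3, np.random.default_rng(1)
    for lam, mu in [(1.0, 2.0), (2.0, 1.0), (1.0, 1.0)]:
        ys = np.array([exit_sample(lam, mu, eps, rng=rng) for _ in range(200)])
        # expected fraction of positive exits: about 1/2 for mu > lam,
        # about 1 for mu < lam, and strictly in between for mu = lam
        print(lam, mu, round(float(np.mean(ys > 0)), 2))
```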

One of the main goals of this paper is to give precise mathematical meaning to the approximate identities of this section and to generalize them to multiplicative white noise perturbations of nonlinear dynamics in higher dimensions.

4. Convergence of graphs in space-time

The main result of this paper states the convergence of a family of continuous processes to a process with jumps. This kind of convergence is impossible in the traditionally used Skorokhod topology on the space of processes with left and right limits, since the set of all continuous functions is closed in the Skorokhod topology, see [3].

In this section we replace the Skorokhod space by another extension of the space of continuous functions. This extension allows us to describe not only the fact of an instantaneous jump from one point to another, but also the curve along which the jump is made. We shall also introduce an appropriate topology to characterize the convergence in this new space.

It is interesting that in his classical paper [11] Skorokhod introduced several topologies for trajectories with jumps, only one of which is widely known as the Skorokhod topology now. However, to the best of our knowledge, the construction that we are going to describe here appeared neither in [11] nor anywhere else, at least in the literature on stochastic processes.

We consider all continuous functions (“paths”)

\[
\gamma:[0,1]\to\mathbb{R}^{d+1}
\]

such that $\gamma^0$ is nondecreasing in the parameter. Here (and often in this paper) we use superscripts to denote coordinates: $\gamma=(\gamma^0,\gamma^1,\ldots,\gamma^d)$.

We say that two paths $\gamma_1$ and $\gamma_2$ are equivalent, and write $\gamma_1\sim\gamma_2$, if there is a path $\gamma$ and nondecreasing surjective functions $g_1,g_2:[0,1]\to[0,1]$ with $\gamma_1=\gamma\circ g_1$ and $\gamma_2=\gamma\circ g_2$, where $\circ$ means the composition of two functions. (These are essentially reparametrizations of the path $\gamma$, except that we allow $g_1,g_2$ to be not strictly monotone.) In Section 12 we shall prove the following statement:

Lemma 4.1.

The relation $\sim$ on paths is a well-defined equivalence relation.

Any non-empty class of equivalent paths will be called a curve. We denote the set of all curves by $\Gamma$. Our choice of the equivalence relation ensures that each curve is a closed set in sup-norm (see Section 12), and we shall be able to introduce a metric on $\Gamma$ induced by the sup-norm.

Since each curve in $\Gamma$ is nondecreasing in the zeroth coordinate, which plays the role of time, it can be thought of as the graph of a function from $[0,T]$ to $\mathbb{R}^d$ for some nonnegative $T$. However, a value $t$ of the path’s “time” coordinate can be attained on a whole interval of values of the variable parametrizing the curve, thus defining a curve in $\mathbb{R}^d$, which is interpreted as the curve along which the jump at time $t$ is made.
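For example (an illustration in the notation introduced above), a particle sitting at a point $x_-\in\mathbb{R}^d$ up to time $1$, jumping to $x_+$ along the straight segment, and then sitting at $x_+$ up to time $2$ can be represented by the path

\[
\gamma(u)=
\begin{cases}
\bigl(3u,\;x_-\bigr), & u\in[0,\tfrac13],\\
\bigl(1,\;x_-+(3u-1)(x_+-x_-)\bigr), & u\in[\tfrac13,\tfrac23],\\
\bigl(3u-1,\;x_+\bigr), & u\in[\tfrac23,1],
\end{cases}
\]

whose time coordinate $\gamma^0$ is constant on the middle third of the parameter interval; the spatial segment traversed there records the curve along which the jump at time $1$ is made.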

We would like to introduce a distance in $\Gamma$ that would be sensitive to the geometry of jump curves, but not to their parametrization. So, for two curves $\gamma_1,\gamma_2\in\Gamma$, we denote

(4.1)
\[
\rho(\gamma_1,\gamma_2)=\inf_{g_1\in\gamma_1,\;g_2\in\gamma_2}\;\sup_{u\in[0,1]}|g_1(u)-g_2(u)|,
\]

where $|\cdot|$ denotes the Euclidean norm in $\mathbb{R}^{d+1}$.

Theorem 4.1.
  1. The function $\rho$ defined above is a metric on $\Gamma$.

  2. The space $(\Gamma,\rho)$ is Polish (i.e. complete and separable).

We postpone the proof of this statement to Section 12.
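As a toy numerical illustration of the distance (4.1), the following sketch (our own illustrative code) computes the discrete Fréchet distance between two polygonal paths sampled at finitely many parameter values; up to discretization error, this approximates the inf-over-reparametrizations sup-distance between the corresponding curves.

```python
import numpy as np
from functools import lru_cache

def discrete_frechet(P, Q):
    # Discrete Frechet distance between polygonal paths P and Q, given as
    # arrays of sampled points (time coordinate first, then space); this is
    # the standard discretization of an inf-over-reparametrizations sup-distance.
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    n, m = len(P), len(Q)

    @lru_cache(maxsize=None)
    def c(i, j):
        d = float(np.linalg.norm(P[i] - Q[j]))
        if i == 0 and j == 0:
            return d
        if i == 0:
            return max(c(0, j - 1), d)
        if j == 0:
            return max(c(i - 1, 0), d)
        return max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)), d)

    return c(n - 1, m - 1)

# A steep continuous ramp versus an instantaneous unit jump at time 1:
ramp = [(t, min(max((t - 1.0) * 50.0, 0.0), 1.0)) for t in np.linspace(0.0, 2.0, 200)]
jump = ([(t, 0.0) for t in np.linspace(0.0, 1.0, 60)]
        + [(1.0, s) for s in np.linspace(0.0, 1.0, 60)]
        + [(t, 1.0) for t in np.linspace(1.0, 2.0, 60)])
print(discrete_frechet(ramp, jump))  # small: as graphs, the two are close
```

The example mimics the situation of the next lemma and of Theorem 5.1: the graph of a continuous function with a steep transition is close, as a curve, to a piecewise constant curve with a genuine jump.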

Naturally, any continuous function $f:[0,T]\to\mathbb{R}^d$ defines a path $\gamma_f$ by

(4.2)
\[
\gamma_f(u)=\bigl(uT,\,f(uT)\bigr),\qquad u\in[0,1],
\]

and a curve $\bar\gamma_f$ that is the equivalence class of $\gamma_f$.

The following result shows that the convergence of continuous functions in sup-norm is consistent with the convergence of the associated curves in the metric $\rho$.

Lemma 4.2.

Let $f_n$, $n\in\mathbb{N}$, and $f$ be continuous functions on $[0,T]$ for some $T>0$. Then

\[
\lim_{n\to\infty}\sup_{t\in[0,T]}|f_n(t)-f(t)|=0
\]

is necessary and sufficient for

\[
\lim_{n\to\infty}\rho(\bar\gamma_{f_n},\bar\gamma_f)=0.
\]

The proof of this lemma is also given in Section 12. In fact, it can be extended to describe the convergence of graphs of functions with varying domains.

We shall need the following notion in the statement of our main result.

An element of is called piecewise constant if there is a path , a number and families of numbers

and points

such that for , , and for , . A piecewise constant describes a particle that sits at point between times and , and at time jumps along the path . It is natural to identify with a sequence of points and jumps, and we write

(4.3)

where denotes the time spent by the particle at point .

5. The setting and the main weak convergence result

In this section we describe the setting and state the main result. The conditions of the setting and possible generalizations are discussed in Section 10.3.

We assume that the vector field $b$ is sufficiently smooth, and the matrix-valued function $\sigma$ of diffusion coefficients is as well. We assume that for each $x\in\mathbb{R}^d$, the flow $S^t x$ associated to the system (1.1) is well-defined for all $t$ (including negative values of $t$). We assume that $b$ admits a heteroclinic network of a special kind that we proceed to describe.

We suppose that there is a finite or countable set of points $\{x_i\}_{i\in I}$, where $I=\mathbb{N}$ or $I=\{1,\ldots,n\}$ for some $n\in\mathbb{N}$, with the following properties.

  1. For each , there is a neighborhood of and a matrix such that

    where , for a constant and every . In particular, is a critical point for , since . Moreover, we require that is conjugated on to a linear dynamical system by a -diffeomorphism satisfying . This means that for any , there is such that for all ,

  2. For each , the eigenvalues of are real and simple; we also assume that there is an integer with such that

    (5.1)

These requirements mean, in particular, that each of these points is a hyperbolic fixed point (saddle) for the dynamics. The Hartman–Grobman theorem (see Theorem 6.3.1 in [7]) guarantees the existence of a homeomorphism conjugating the flow generated by the vector field $b$ to linear dynamics. Our condition (i) imposes the stronger requirement that this conjugation be smooth. This requirement is still often satisfied, as follows from the Sternberg linearization theorem for hyperbolic fixed points with no resonances, see Theorem 6.6.6 in [7]. In particular, the cubic system of Section 2 is smoothly conjugated to a linear system at each saddle point for typical values of its parameters.
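In formulas (stated here in generic notation as a sketch), the smooth-linearization requirement amounts to the existence of a neighborhood $U_i$ of the saddle point $x_i$ and a diffeomorphism $f_i:U_i\to f_i(U_i)\subset\mathbb{R}^d$ with $f_i(x_i)=0$ such that

\[
f_i\bigl(S^t x\bigr)=e^{A_i t}f_i(x)\qquad\text{whenever }S^s x\in U_i\ \text{for all }s\in[0,t],
\]

where $A_i$ denotes the linearization of $b$ at $x_i$; in these coordinates the dynamics near $x_i$ is exactly linear.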

We also want to make several assumptions on orbits of the flow connecting these saddle points to each other. First, we denote by the unit eigenvectors associated with the eigenvalues . The hyperbolicity (and, even more straightforwardly, the conjugation to a linear flow) implies that for every there is a -dimensional -manifold containing such that for every (i.e. is the stable manifold associated to .) For each the unstable manifold of is also well-defined.

However, it is known, see [8] and [2], that if the initial data for the stochastic flow are close to the stable manifold then after passing the saddle the solution evolves mostly along the invariant manifold associated to the highest eigenvalue of . So, what we need is the curve containing , tangent to at , and invariant under the flow. Of course, the intersection of with is well-defined and coincides with .

For each we denote and set

where the numbers are chosen so that

and these sets are mutually disjoint. Here, for any , the number is defined by

and denotes the -th coordinate of in the coordinate system defined by .

We denote the orbits of by , and assume that for each , there are numbers  such that

This means that the curves are heteroclinic orbits connecting the saddle point to saddle points . We do not prohibit these orbits from being homoclinic and connecting to itself, i.e. the situations where are allowed.

For any we define

(5.2)
(5.3)

so that is the time it takes to travel from to the neighborhood of the next saddle, and is the point of entrance to that neighborhood.

Our first nondegeneracy assumption is that for all ,

(5.4)

which means that each heteroclinic orbit has a nontrivial component in the direction of the as it approaches . Although this condition holds true for all systems of interest (e.g., the system considered in Section 2), it is easy to adapt our reasoning to the situations where other components of the projection of on the stable directions dominate.

We shall also need a nondegeneracy condition on the linearization of (1.1) along . For each we consider the fundamental matrix solving the equation in variations along the orbit :

For all we denote

The technical nondegeneracy assumption on that we need is:

(5.5)

Again, we work with this condition since it holds true for any system of interest, but it is easy to adapt our reasoning to the situations where this condition is violated.

To formulate our main theorem we need a notion of an entrance-exit map describing the limiting behavior of the diffusion in the neighborhood of a saddle point, namely, the asymptotics of the random entrance-exit Poincaré map as $\varepsilon\to 0$.

We denote the set of all Borel probability measures on by .

Let us denote by the set of all triples where

  1. satisfies ;

  2. ;

  3. with

This set will be used to describe the initial condition for equation (1.2): where the distribution of weakly converges to as .

We also define

and

Here, the numbers define the limiting probabilities of choosing each of the two branches of the invariant curve associated with the highest eigenvalue of the linearization at the saddle point; are points on these orbits that serve as entrance points to neighborhoods of the next saddle points; are the times it takes to reach these points under the proper (logarithmic) renormalization; is the scaling exponent so that the exit distribution (serving as the entrance distribution to the next saddle’s neighborhood) takes the form , where the distribution of converges to  or  depending on which of the two branches was chosen.

It is possible (see Lemma 7.1) to give a precise description of the asymptotic behavior of the diffusion in the neighborhood of each saddle point in terms of an appropriate entrance-exit map, i.e. a map that for each saddle point computes a description of the exit parameters in terms of the entrance parameters:

For an entrance-exit map we shall denote its components by , , , , .

Suppose $x_0$ belongs to one of the heteroclinic orbits of the network. A sequence is called admissible for $x_0$ if

  1. is the positive orbit of with ;

  2. for each , or ;

  3. for each

The number is called the length of .

Our main limit theorem uses entrance-exit maps to assign limiting probabilities to admissible sequences. Let us proceed to describe this procedure.

With each admissible sequence we associate the following sequence:

(5.6)

Here , , and

(5.7)

where

(5.8)

We add 1 in the r.h.s. so that the distribution is nondegenerate (and the maps are well-defined) even if . All other entries in (5.6) are obtained according to the following recursive procedure. For each ,

(5.9)

The numbers defined above play the role of time, and the admissible sequence can be identified with a piecewise constant trajectory :

The numbers defined in (5.6) play the roles of conditional probabilities, and we denote

(5.10)

The set of all admissible sequences for $x_0$ has the structure of a binary tree. The natural partial order on it is determined by inclusion. We say that a set of admissible sequences for $x_0$ is free if no two sequences in it are comparable with respect to this partial order. If, additionally, every sequence not belonging to the set is comparable to one of the sequences from it, the set is called complete. In the language of graph theory, a complete set is a section of the binary tree.

It is clear that for any free set , , where . A free set is called conservative if . Every finite complete set is conservative.
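To make the tree structure concrete, admissible sequences can be encoded by binary strings recording the choice made at each saddle point; the following toy sketch (our own illustration) checks the “free” and “complete” properties for such encodings, with completeness tested against all sequences up to a finite depth as a proxy.

```python
import itertools

def comparable(a: str, b: str) -> bool:
    # Two admissible sequences are comparable iff one extends the other.
    return a.startswith(b) or b.startswith(a)

def is_free(family: set[str]) -> bool:
    # Free set: an antichain with respect to the prefix partial order.
    return all(not comparable(a, b) for a in family for b in family if a != b)

def is_complete(family: set[str], depth: int = 6) -> bool:
    # Complete set: every other sequence (here, up to the given depth)
    # is comparable to some member of the family.
    words = ("".join(w) for n in range(1, depth + 1)
             for w in itertools.product("LR", repeat=n))
    return all(any(comparable(s, a) for a in family) for s in words)

level_two = {"LL", "LR", "RL", "RR"}   # the full second level of the binary tree
print(is_free(level_two), is_complete(level_two))   # True True
```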

We are ready to state our main result now.

Theorem 5.1.

Under the conditions stated above there is an entrance-exit map with the following property.

Let $x_0$ belong to one of the heteroclinic orbits of the network. For each $\varepsilon>0$, define a stochastic process by

(5.11)

where $X_\varepsilon$ is the strong solution of (1.2) with initial condition $x_0$.

For any conservative set of $x_0$-admissible sequences, there is a family of stopping times such that the distribution of the graph converges weakly in $(\Gamma,\rho)$ to the measure concentrated on the set

and satisfying

(5.12)

where is defined via in (5.10).

For any entrance-exit map the family of conservative sets includes all finite complete sets, so that the content of Theorem 5.1 is nontrivial.

Importantly, we actually construct the desired entrance-exit map in the proof. This allows us to study the details of the limiting process in Section 10. At this point let us just mention that the sequence of saddles visited by the limiting process can be Markov or non-Markov depending on the eigenvalues of the linearizations at the saddle points. We discuss this memory effect and some other implications and possible extensions of Theorem 5.1 in Section 10.

6. Exit measure asymptotics

We shall now apply Theorem 5.1 to the exit problem along a heteroclinic network, and formulate a theorem that, in a sense, gives more precise information on the exit distribution than the FW theory. We assume that there is a domain with piecewise smooth boundary such that . The FW theory implies that, as $\varepsilon\to 0$, the exit measure for the process started at the initial point concentrates at boundary points that provide the minimum value of the so-called quasi-potential. Since the quasi-potential equals 0 for all the points that are reachable from the starting point along the heteroclinic network, we conclude that in the case of heteroclinic networks, the exit measure asymptotically concentrates at the boundary points that can be reached from the starting point along the heteroclinic network. However, this approach does not allow one to distinguish between these exit points, while ours allows us to determine an exact limiting probability for each exit point.
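For convenience, recall the quasi-potential in its standard FW form (a sketch of the classical definition from [5], in our notation): for a starting point $x_0$ and a target point $y$,

\[
V(x_0,y)=\inf\Bigl\{\tfrac12\int_0^T\bigl(\dot\varphi(t)-b(\varphi(t))\bigr)^{*}\bigl(\sigma\sigma^{*}\bigr)^{-1}(\varphi(t))\,\bigl(\dot\varphi(t)-b(\varphi(t))\bigr)\,dt:\ \varphi(0)=x_0,\ \varphi(T)=y,\ T>0\Bigr\}.
\]

Any point $y$ that can be reached from $x_0$ by following the deterministic flow along the heteroclinic network (possibly passing arbitrarily close to several saddles) admits paths with $\dot\varphi\approx b(\varphi)$ and hence arbitrarily small cost, so $V(x_0,y)=0$ for all such points, which is why the quasi-potential cannot distinguish between them.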

We take a point on one of the heteroclinic orbits of the network and denote by the set of all the admissible sequences (of any length ) for such that the last curve of the sequence intersects transversally at a point (if there are several points of intersection we take the first one with respect to the natural order on ), and does not intersect for all . Let

The distribution of is concentrated on .

For each , one can define via (5.10).

Theorem 6.1.

For the setting described above, if the set is conservative, then the distribution of converges weakly, as $\varepsilon\to 0$, to

Proof.

Let us use Theorem 5.1 to choose the times providing convergence of the distribution of to . The theorem follows since is a functional of , continuous on the support of . ∎

Remark 6.1.

Notice that for different sequences and it is possible to have so that an exit point can accumulate its limiting probability from a variety of admissible sequences.

Remark 6.2.

The behavior of the system up to is entirely determined by the drift and diffusion coefficients inside . Therefore, there is an obvious generalization of this theorem for heteroclinic networks in a domain, where one requires the invariant manifolds associated to the highest eigenvalue at a critical point to connect that critical point either to another critical point, or to a point on . An advantage of that theorem is that one does not have to specify the (irrelevant) coefficients of (1.2) outside of . We omit the precise formulation for brevity.

Remark 6.3.

In the case of nonconservative , the limit theorem is harder to formulate. In the limit, the exit happens along the sequences belonging to with positive probability . With probability , the exit happens in a more complicated way (and in a longer than logarithmic time) depending on the details of the driving vector field.

7. Proof of Theorem 5.1

We begin with the central lemma that we need in the proof. It has a lengthy statement, and after formulating it, we also give a brief informal explanation.

For each we introduce and via

(7.1)
Lemma 7.1.

For each , there is a map

with the following property.

Take any and any family of distributions in with as . For each , consider the solution of (1.2) with initial condition

(7.2)

where

(7.3)

and define a stopping time

and two events

Then

  1. As ,

  2. As ,

  3. There is a family of random vectors such that

    1. for every , on

      where was defined in (5.2);

    2. as ,

  4. For any , there is such that, as ,