arclength parametrized Hamilton’s equations
calculation of instantons
A method is presented to compute minimizers (instantons) of action functionals using arclength parametrization of Hamilton’s equations. This method can be interpreted as a local variant of the geometric minimum action method (gMAM) introduced to compute minimizers of the Freidlin-Wentzell action functional that arises in the context of large deviation theory for stochastic differential equations. The method is particularly well-suited to calculate expectations dominated by noise-induced excursions from deterministically stable fixpoints. Its simplicity and computational efficiency are illustrated here using several examples: a finite-dimensional stochastic dynamical system (an Ornstein-Uhlenbeck model) and two models based on stochastic partial differential equations: the -model and the stochastically driven Burgers equation.
Finding the minimizer of action functionals is a fundamental problem that appears in a variety of areas in mathematics and science. For example, it arises in the context of large deviation theory (LDT), where probabilities and expectations over the solutions of stochastic differential equations (SDEs) can be estimated in the small noise limit via minimization of the Freidlin-Wentzell action . This has led to the development of numerical tools specifically designed to tackle this task: the string method (for gradient fields) [5, 7], the minimum action method [6, 9], the adaptive minimum action method (aMAM)  and the geometric minimum action method (gMAM) [13, 14, 19] are a few examples of techniques to calculate the minimizer of the Freidlin-Wentzell action, also called instanton. One difficulty one is faced in these calculations is that the speed of the instanton can be very inhomogeneous: for example it can vanish at critical locations making its total duration infinite. In these situations, it is computationally advantageous to replace the time-parametrization of the instanton by some other, geometrically motivated parametrizations, such as e.g. arclength parametrization. This strategy is a the core of the string method as well as aMAM and gMAM. The aim of the present paper is to revisit these methods and propose alternative ways to calculate instantons by focusing on the Hamiltonian counterpart of the Euler-Lagrange equation of gMAM.
To be specific, we will consider a system of SDEs in with a drift vector and a diffusion matrix weighted by a small parameter characterizing the strength of the noise:
For simplicity, we will assume that the diffusion matrix is constant (but not necessarily diagonal) – the generalization of the ideas below to non-constant is straightforward. From the theory of large deviations [20, 10] it is known that in the limit as , the solutions to (1) that contribute most to the probability of an event or the value of an expectation are likely to be close to the minimizer of the Freidlin-Wentzell action functional subject to appropriate boundary conditions. This action functional is given by
with the Lagrangian
Here denotes the Euclidean scalar product between the vectors and and we introduced the norm induced by , – if is the identity, this is simply the Euclidean length.
The instanton, i.e. the path that minimizes the action (2), can be found by solving the corresponding Euler-Lagrange equations
with the appropriate boundary conditions. Thus, in the context of large deviation theory the estimation of probabilities or expectations can be reduced to the solution of the deterministic system (4). The question then becomes how to do this efficiently, for example in situations where one is searching for solutions to (4) with one end-point fixed at a stable critical point of – this case is relevant e.g. for the estimation of expectations with respect to the stationary distribution of (1). This is the main topic of this paper, which is organized as follows. In section 2 we derive the arclength parametrized Hamilton’s equations that are equivalent to (4). We also show how these equations can be used to estimate the stationary probability distribution of the SDE (1) via calculation of the quasipotential from LDT (section 2.1), as well as expectations with respect to this distribution (section 2.2). Finally we give a few special solutions of these equations (section 2.3). In section 3 we propose a simple and efficient algorithm for the numerical integration of the arclength parametrized Hamilton’s equations. This algorithm is then tested on several illustrative examples: a finite dimensional Ornstein-Uhlenbeck process that is amenable to analytical solution (section 4.1), the -model (section 4.2) and the randomly forced Burgers equation (section 4.3). Some concluding remarks are given in section 5.
2 Hamilton’s equations with arclength parametrization
Instead of solving (4), we can also compute the instanton in the corresponding Hamiltonian framework. The Hamiltonian is the Legendre transform of the Lagrangian ,
and the Hamilton’s equations for the instanton are
As mentioned before, the main idea of gMAM is that, in a wide range of situations, the instanton can be computed much more efficiently by using a parametrization that is different than the time . Let us consider a reparametrization of the form
and assume that . By chain rule, we have then
and we can write
One possibility to choose the parametrization is to require that , where is a suitable norm and is a constant giving the length of in this norm. This fixes the value of the function in (8), which can be thought of as a Lagrange multiplier introduced to enforce the constraint . Once is known, and its inverse can then be obtained from the last equation in (8). It is often natural to use the norm induced by the correlation matrix , hence to define . For simplicity, we call this arclength parametrization even if is not the identity. Note that, in principle, norm (and parametrization) can be chosen freely and, indeed, other choices are possible. In the present work, however, we concentrate on arclength parametrization in contrast to the original time parametrization. Details of the implication of other parametrizations (and norms) will be studied elsewhere.
For diffusion-driven systems, the Hamiltonian is given by
and (6) read explicitly
where is the tensor with entries . Correspondingly the first two equations in (8) become
In this case, the parametrization by arclength directly leads to and we can write (11) as
The norm appearing in the denominator can be expressed as
where we used the property that since the Hamiltonian does not depend explicitly on time, energy is conserved and the points on the graph need to satisfy the constraint everywhere. This leads to yet another equivalent form of (12):
These equations are valid in particular in the case when , which, as we will see in sections 2.1 and 2.2 is the relevant one to compute probabilities and expectations with respect to the stationary distribution of (1). In the sequel, we will focus on this case, that is, we will study
where we dropped the bar to simplify notations - we will stick to this convention in the sequel. Note that these equations seem singular at the critical points where . This is not the case, however, as at these points, too. But it will complicate the numerical integration of (15) since, typically, minimizing paths start from and may go through critical points - this issue will be dealt with in section 3. Also note that the constant giving the length of the path is, usually, not known a priori. Finally, note that if were not constant, an additional term should be added to the equation for -component of .
2.1 Derivation of (15) from the geometric action and link with the quasipotential
A key quantity in Freidlin-Wentzell theory is the quasipotential, defined as
where and are two arbitrary points in and the infimum over is taken over all the paths that satisfy , . The quasipotential permits to estimate various long-time properties of (1) in the limit as . For example, if is the unique stable fixpoint of , and under suitable conditions such that the solutions to (1) are ergodic with respect to the stationary probability density function , then
where the infimum is taken over the paths that satisfy , and we introduced the geometric action
This quantity bears his name from the property that it is left invariant by any reparametrization of the path.
We show now by explicit calculation that arclength parametrized Hamilton’s equations (15) are equivalent to the Euler-Lagrange equations associated with (19). These Euler-Lagrange equations are given by
which we can write in a more explicit form as
Then (20) can be written as
2.2 Use of (15) in the computation of expectations
Assume again that has a single stable fixpoint at and suppose that we want to estimate the following expectation with respect to the stationary density of the process.
where is some function. Then as
where the infimum is taken over all paths such that . It is easy to see that the minimizer of this variational problem solves (15) with the boundary conditions
Note that these boundary conditions differ from the boundary conditions that are commonly used in applications of the original gMAM. In those applications, two points, e.g. two stable fixpoints and are prescribed. In contrast, when computing expectations as shown above, one of these boundary conditions is replaced by . It is possible to adapt gMAM to handle such boundary conditions, for example by making the end point variable. However, as shown in section 3, working with (15) offers a much simpler way to handle boundary conditions of the type (26).
2.3 Special solutions of (15)
If , then (15) reduces to
meaning that and point in the same direction. This solution corresponds to a reparametrized version of a deterministic trajectory, which is the solution of . Another that solves with given by (9) is , for which the equation for (15) reduces to
meaning that and points in the opposite direction. This special solution, however, is not always consistent with the second equation in (15). Indeed, differentiating gives
where we used (28) to get the second equality. This equation is consistent with the second equation in (15) iff , i.e. iff the drift term is gradient, for some potential . Thus, solutions of (15) that follow deterministic paths in reverse can only be observed in gradient systems.
where is such that and . It can easily be checked that in this case the solutions to (1) are ergodic with respect to the following stationary probability density
where is a normalization factor. In that case a special class of solutions to (15) are those such that and satisfies
as can be checked by direct substitution.
3 Algorithmic aspects
Using the arclength parametrized Hamilton’s equations (15) gives us a direct and numerically efficient iterative method to find the instanton that fits the boundary conditions (26): Assume that we are given the -th iteration together with an approximation of the path length as approximation to the solution of (15) with the boundary conditions (26). To compute the next iteration and we proceed in the following way:
Use the known approximate solution and , together with the known final condition , to solve the equation for backwards in the arclength-parameter to obtain the next iteration .
Use this computed approximate solution and , together with the initial condition , in the equation for to obtain the next iteration .
Compute the length of to obtain the next iteration of .
If this iteration scheme converges to a fixpoint for and , then the constraint will be automatically satisfied by this fixpoint. It may not be satisfied during the iterations, however, in which case in step 3 it may be useful to also reparametrize to enforce the constraint . At convergence, the solution will also satisfy , since the first equation in (15) implies that and hence . A similar method (with a parametrization using an exponentially scaled time) was suggested by Chernykh and Stepanov  for the computation of instantons in stochastic Burgers equation. We will discuss this example in more detail in section 4.
It is worth stressing that the implementation of the method is extremely simple: The solution of (15) requires only forward (or backward) propagation of the initial (or final) value, hence existing codes can be easily adapted to be used in this new method. In contrast, gMAM usually requires the design of entirely new code. The locality in memory of the method presented here is particularly well-suited for parallelization: For large-scale problems, it was successfully implemented in a massively parallel setup, using graphics processing units (GPUs) to speed up the computation (this will be presented elsewhere). In contrast, global minimization techniques such as Newton or Quasi-Newton methods are more difficult to implement in such a setting. These aspects additionally simplify the scaling to a large number of degrees of freedom. The largest computations for stochastic partial differential equations presented here, with grid-points in space and in time take less than minutes in Matlab (on an Intel Xeon E5-1620 with 3.60GHz) to successfully converge to the minimizer.
4 Applications and Examples
Let us now present several examples and applications of the new method. In these examples, we focus on the exit from a stable fixed point and apply the iterative method described section 3 in two ways: First, using Hamilton’s equations in the original time-parametrization given by (10):
and second using the arclength parametrized Hamilton’s equations (15):
In all cases, we consider the escape from a stable fixpoint that can be used as initial condition for in the above equations and we designate a final condition for either by fixing it or through the second boundary condition in (26) for some specific . Note that, in the time-parametrized case, since is a stable fixpoint, it takes infinite time to escape it. This makes the iteration numerically difficult as, in practice, one needs to choose a and then decrease it towards until convergence of the computed minimum of the action. This problem does not occur when one works with arclength-parametrization, which is an immediate numerical advantage of this method.
4.1 Application to a linear Ornstein-Uhlenbeck model
As a first example, consider a linear Ornstein-Uhlenbeck model for which the drift is of the form
with being the diffusion tensor from above. Then is the covariance of the equilibrium distribution for the SDE, which is zero-mean Gaussian and exists as long as is a stable fixpoint of . The path of maximum likelihood of escape from is the time-reversed of the solution to
This result is consistent with the fact that the time-reversed form of the SDE (1) with is
and it corresponds to picking
as solution to . It can be checked that, if we insert (36) in the first equation in (10), this equation reduces to (35), and that the second equation in (10) is also satisfied. Since we can compute the minimizer analytically, this model is a good start to compare the two methods (Hamilton’s equations with time-parametrization vs Hamilton’s equations with arclength-parametrization) and to study convergence. As a concrete example, we chose
and , as a test case. Note that, in this case, is normal and, therefore, the analytical solution is even simpler: The minimizing path is the time-reversed of the trajectory that spirals into the fixpoint at the origin and is given by
For this system we implemented the algorithm described in section 3 to solve both the equations of motion of the original Hamiltonian (10) and of the geometric parametrization (15). Propagation of the initial conditions is realized with a simple first order explicit Euler scheme.
As shown in figure 1, with identical resolution the proposed method is closer to the analytical solution by a factor (for a given accuracy, the proposed method needs roughly a factor 10 grid-points less). Due to the choice of the integration scheme, both methods converge with first order. The difference to the analytical solution is measured by comparing the endpoints of the paths, (due to different parametrizations, differences between other points than the endpoint are harder to define and calculate).
The action density over the curve parameter is concentrated on times close to for the time parametrization, while it gets stretched out to cover a significant part of the temporal domain in the geometric case (see figure 2). This behavior is desired, because it offers more grid points in the regions with critical dynamics.
4.2 Application to the model
Consider the following SPDE
with Dirichlet boundary conditions , and where is a spatio-temporal white-noise. It can be shown  that this equation is well-posed, and that its equilibrium distribution is formally given by
The right way to interpret this measure is via its Radon-Nikodym derivative with respect to the measure of the Brownian bridge (i.e. by using the measure of the Brownian bridge as reference and writing the density of the equilibrium distribution of (37) with respect to this measure – see e.g. ):
where denotes the measure of the Brownian bridge on , i.e. of the Gaussian process with mean zero and covariance
In the large deviation regime (i.e. in the limit when ), we can ask what is the most likely configuration on the invariant measure such that . This is the solution of
In the limit as , this solution is simply
We can also compute
where is a parameter and the expectation is taken on the invariant measure. The configuration on the invariant measure that contributes the most to this expectation is the solution to (41) with the boundary condition and :
A similar story holds if we consider (37) with periodic boundary conditions, which are easier to implement numerically. In this case the solution to (41) with the boundary condition and gives the relevant configuration for the periodic case.
In this example, we implemented the iterative scheme, both for the equations of motion in the time-parametrized and the geometric variant by using a second order Heun integration in the curve parameter. The -dimension is resolved with grid-points, using Fourier transforms to calculate the spatial derivatives.
Figure 3 compares, for different resolutions, the numerical solutions of the final configuration of the minimizer to the model (37) to the analytic solution for the periodic case. At a resolution of , the geometric method already outperforms the time-parametrized variant with a considerably higher resolution of in terms of accuracy.
4.3 Application to the stochastically driven Burgers equation
Consider the stochastically driven Burger’s equation with periodic boundary conditions:
In this example, we consider a noise term that has a finite correlation length in space, but is still -correlated in time. Hence we can write
and for the correlation function in space, we assume the following form in Fourier space
where denotes the Heaviside-function. This form implies that the stochastic driving occurs only at frequencies up to a cut-off frequency . In this way, (45) is a simple model for turbulence : The system is stochastically driven at larger scales by the noise term . This energy is transported via the non-linearity from large scales to small scales where energy is dissipated due to the presence of the diffusive term .
We consider the velocity field starting at rest for and we are interested in noise realizations that lead to strong negative gradients at . This setup is similar to the analytic treatment in . We choose as our observable
For this case, our Hamiltonian equations for the reparametrized fields , can be written as
The initial condition for the velocity field is for the final condition for the auxiliary field is as can be seen from (48) using integration by parts. The norm is given by the inverse of on its support (which is compact in Fourier domain), hence
where, denotes the inverse Fourier transform operator and is the Fourier transform of .
Note that the same equations can also be obtained using the Martin-Siggia-Rose/Janssen/de Dominicis formalism [16, 15, 4, 17]. A main problem in solving (51) with the boundary conditions at and consists in the fact that the fixpoint of the system is only reached in the limit and not in finite time. When Chernykh and Stepanov computed a numerical solution to the system above, they used a combination of two clever tricks to mitigate this difficulty: First, they used self-similar properties of the heat-kernel to design a coordinate transform that leads to an exponential rescaling in time, second they used the linearization of the system around zero in order to replace the boundary condition by (where the constant can be computed from the correlation function ). In the following we show that the geometric reformulation of the instanton equations leads to a natural rescaling of time that allows for direct efficient numerical solution, which is furthermore transferable to similar situations without any modification.
We implemented the iterative algorithm outlined in section 3, employing a second order Heun integration of the geometric equations of motion (49) and compare the results to a similar iterative algorithm, but integrating the time-parametrized equations of motion (51), instead. The space variable is resolved with grid-points, and all derivatives in space are calculated via Fourier transforms. For the calculations presented here we took and in (48)
Figure 4 shows the convergence of the maximum of the Hamiltonian for the proposed scheme in comparison to the time-parametrized variant. For both methods the Hamiltonian converges to in second order. The reparametrized variant is a factor more accurate: For the smallest tested resolution of the error of the geometric integration is still smaller than the error of the time-parametrized integration at the highest tested resolution of . The reason is simple:
When using physical time as discretization, it is necessary to prescribe a finite starting-time which is restricted in its magnitude to ensure stability and accuracy of the integration scheme for a given time resolution. Yet, when choosing too close to , the resulting finite-time minimizer does not necessarily approximate the global minimizer. For the geometric equations of motion this issue is absent, and the arbitrary choice of is avoided in a natural way. Figure 5 depicts the final configurations of the minimizer for different choices of , demonstrating the necessity to choose a sufficiently small starting time: A choice of and, to a much lesser extend, produce secondary extrema, which are absent for both and the geometric version.
A plot of the complete minimizer in space-time, as shown in figure 6, reveals the superior parametrization of the geometric scheme. Even for a somewhat optimistic choice of , most of the dynamics are squeezed into very few grid-points for the time-parametrization (left), while being far better resolved in the geometric case (right), which nevertheless covers a more than 10 times larger interval of physical time in total.
We presented a new method to compute minimizers of action functionals using a geometric reparametrization of Hamilton’s equations. The new method is particularly well-suited to compute expectations related to stochastically driven exits from attracting domains, though it may be useful in other contexts as well. The new method was illustrated using a simple Ornstein-Uhlenbeck system and then numerically implemented and tested in the context of stochastic partial differential equations with additive noise for the -model and the stochastic Burgers equation. In all cases, it was shown that the new parametrization has computational advantages over the parametrization using the original physical time. Applications to more complicated systems, in particular higher-dimensional stochastic partial differential equations (e.g. stochastically driven Navier-Stokes or MHD equations) are therefore within reach and will be the topic of future research.
The work of T.G. was partially supported through the grants ISF-7101800401 and Minerva â Coop 7114170101. T.S. acknowledges support through the following NSF grants: DMS-0807396, DMS-1108780, and CNS-0855217. The work of E.V.-E. was supported in part by NSF Grant DMS07-08140 and ONR Grant N00014- 11-1-0345. Furthermore, T.G. would like to thank Gregory Falkovich for interesting discussions.
-  E. Balkovsky, G. Falkovich, I. Kolokolov, and V. Lebedev. Intermittency of Burgers’ turbulence. Phys. Rev. Lett., 78:1452, 1997.
-  J. Cardy, G. Falkovich, and K. Gawedzki. Non-equilibrium Statistical Mechanics and Turbulence. Cambridge University Press, 2008.
-  A.I. Chernykh and M.G. Stepanov. Large negative velocity gradients in Burgers turbulence. Phys. Rev. E, 64:026306, 2001.
-  C. de Dominicis. Techniques de renormalisation de la théorie des champs et dynamique des phénomènes critiques. J. Phys. C, 1:247, 1976.
-  W. E, W. Ren, and E. Vanden-Eijnden. String method for the study of rare events. Phys. Rev. B, 66:052301, 2002.
-  W. E, W. Ren, and E. Vanden-Eijnden. Minimum action method for the study of rare events. Comm. Pure Appl. Math., 57:1–20, 2004.
-  W. E, W. Ren, and E. Vanden-Eijnden. Simplified and improved string method for computing the minimum energy paths in barrier-crossing events. J. Chem. Phys., 126:164103, 2007.
-  W. G. Faris and G. Jona-Lasinio. Large fluctuations for a nonlinear heat equation with noise. J. of Phys. A, 15(10):3025, 1982.
-  H. C. Fogedby and W. Ren. Minimum action method for the Kardar-Parisi-Zhang equation. Phys. Rev. E, 80:041116, 2009.
-  M. I. Freidlin and A. D. Wentzell. Random perturbations of dynamical systems. Springer, 1998.
-  C. Gardiner. Stochastic Methods: A Handbook for the Natural and Social Sciences. Springer, 2009.
-  T. Grafke, R. Grauer, and T. Schäfer. Instanton filtering for the stochastic Burgers equation. J. Phys. A: Mathematical and Theoretical, 46(6):62002, 2013.
-  M. Heymann and E. Vanden-Eijnden. The geometric minimum action method: A least action principle on the space of curves. Comm. Pure and Appl. Math., 61:1053, 2008.
-  M. Heymann and E. Vanden-Eijnden. Pathways of maximum likelihood for rare events in nonequilibrium systems: application to nucleation in the presence of shear. Phys. Rev. Lett., 100(14):140601, 2008.
-  H. K. Janssen. On a Lagrangian for classical field dynamics and renormalization group calculations of dynamical critical properties. Z. Physik B, 23:377, 1976.
-  P. C. Martin, E. D. Siggia, and H. A. Rose. Statistical dynamics of classical systems. Phys. Rev. A, 8:423, 1973.
-  R. Phythian. The functional formalism of classical statistical dynamics. J. Phys. A, 10, 1977.
-  M. G. Reznikoff and E. Vanden-Eijnden. Invariant measures of stochastic partial differential equations and conditioned diffusions. Comptes Rendus Mathematique, 340(4):305–308, 2005.
-  E. Vanden-Eijnden and M. Heymann. The geometric minimum action method for computing minimum energy paths. J. Chem. Phys., 128:061103, 2008.
-  S. R. S. Varadhan. Large deviations. Ann. Prob., 36(2):397–419, 2008.
-  X. Zhou, W. Ren, and W. E. Adaptive minimum action method for the study of rare events. J. Chem. Phys., 128:104111, 2008.