Arclength parametrized Hamilton's equations for the calculation of instantons
Abstract
A method is presented to compute minimizers (instantons) of action functionals using arclength parametrization of Hamilton's equations. This method can be interpreted as a local variant of the geometric minimum action method (gMAM) introduced to compute minimizers of the Freidlin-Wentzell action functional that arises in the context of large deviation theory for stochastic differential equations. The method is particularly well-suited to calculate expectations dominated by noise-induced excursions from deterministically stable fixpoints. Its simplicity and computational efficiency are illustrated here using several examples: a finite-dimensional stochastic dynamical system (an Ornstein-Uhlenbeck model) and two models based on stochastic partial differential equations: the $\phi^4$ model and the stochastically driven Burgers equation.
1 Introduction
Finding the minimizer of action functionals is a fundamental problem that appears in a variety of areas in mathematics and science. For example, it arises in the context of large deviation theory (LDT), where probabilities and expectations over the solutions of stochastic differential equations (SDEs) can be estimated in the small noise limit via minimization of the Freidlin-Wentzell action [10]. This has led to the development of numerical tools specifically designed to tackle this task: the string method (for gradient fields) [5, 7], the minimum action method [6, 9], the adaptive minimum action method (aMAM) [21] and the geometric minimum action method (gMAM) [13, 14, 19] are a few examples of techniques to calculate the minimizer of the Freidlin-Wentzell action, also called instanton. One difficulty encountered in these calculations is that the speed of the instanton can be very inhomogeneous: for example, it can vanish at critical locations, making its total duration infinite. In these situations, it is computationally advantageous to replace the time parametrization of the instanton by some other, geometrically motivated parametrization, such as e.g. arclength parametrization. This strategy is at the core of the string method as well as aMAM and gMAM. The aim of the present paper is to revisit these methods and propose alternative ways to calculate instantons by focusing on the Hamiltonian counterpart of the Euler-Lagrange equation of gMAM.
To be specific, we will consider a system of SDEs in $\mathbb{R}^n$ with a drift vector $b$ and a diffusion matrix $\sigma$ weighted by a small parameter $\sqrt{\varepsilon}$ characterizing the strength of the noise:
\[ dX = b(X)\,dt + \sqrt{\varepsilon}\,\sigma\,dW \tag{1} \]
For simplicity, we will assume that the diffusion matrix $\sigma$ is constant (but not necessarily diagonal); the generalization of the ideas below to nonconstant $\sigma$ is straightforward. From the theory of large deviations [20, 10] it is known that in the limit as $\varepsilon \to 0$, the solutions to (1) that contribute most to the probability of an event or the value of an expectation are likely to be close to the minimizer of the Freidlin-Wentzell action functional subject to appropriate boundary conditions. This action functional is given by
\[ S_T(\phi) = \int_0^T L(\phi, \dot\phi)\,dt \tag{2} \]
with the Lagrangian
\[ L(\phi, \dot\phi) = \tfrac{1}{2}\,\bigl|\dot\phi - b(\phi)\bigr|_a^2, \qquad a = \sigma\sigma^T \tag{3} \]
Here $\langle u, v\rangle$ denotes the Euclidean scalar product between the vectors $u$ and $v$, and we introduced the norm induced by $a$, $|v|_a = \langle v, a^{-1}v\rangle^{1/2}$; if $a$ is the identity, this is simply the Euclidean length.
The instanton, i.e. the path that minimizes the action (2), can be found by solving the corresponding Euler-Lagrange equations
\[ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot\phi}\right) = \frac{\partial L}{\partial \phi} \tag{4} \]
with the appropriate boundary conditions. Thus, in the context of large deviation theory the estimation of probabilities or expectations can be reduced to the solution of the deterministic system (4). The question then becomes how to do this efficiently, for example in situations where one is searching for solutions to (4) with one endpoint fixed at a stable critical point of the drift; this case is relevant e.g. for the estimation of expectations with respect to the stationary distribution of (1). This is the main topic of this paper, which is organized as follows. In section 2 we derive the arclength parametrized Hamilton's equations that are equivalent to (4). We also show how these equations can be used to estimate the stationary probability distribution of the SDE (1) via calculation of the quasipotential from LDT (section 2.1), as well as expectations with respect to this distribution (section 2.2). Finally we give a few special solutions of these equations (section 2.3). In section 3 we propose a simple and efficient algorithm for the numerical integration of the arclength parametrized Hamilton's equations. This algorithm is then tested on several illustrative examples: a finite-dimensional Ornstein-Uhlenbeck process that is amenable to analytical solution (section 4.1), the $\phi^4$ model (section 4.2) and the randomly forced Burgers equation (section 4.3). Some concluding remarks are given in section 5.
2 Hamilton’s equations with arclength parametrization
Instead of solving (4), we can also compute the instanton in the corresponding Hamiltonian framework. The Hamiltonian $H(\phi, \theta)$ is the Legendre transform of the Lagrangian $L$,
\[ H(\phi, \theta) = \sup_{y}\bigl(\langle y, \theta\rangle - L(\phi, y)\bigr) \tag{5} \]
and Hamilton's equations for the instanton are
\[ \dot\phi = \partial_\theta H(\phi, \theta), \qquad \dot\theta = -\partial_\phi H(\phi, \theta) \tag{6} \]
As mentioned before, the main idea of gMAM is that, in a wide range of situations, the instanton can be computed much more efficiently by using a parametrization that is different from the physical time $t$. Let us consider a reparametrization of the form
\[ \phi(t) = \hat\phi(s(t)), \qquad \theta(t) = \hat\theta(s(t)) \tag{7} \]
and assume that $\dot s(t) = \lambda(s(t)) > 0$ for some function $\lambda$. By the chain rule, we then have $\dot\phi = \lambda\,\hat\phi'$ and $\dot\theta = \lambda\,\hat\theta'$, where the prime denotes differentiation with respect to $s$,
and we can write
\[ \lambda\,\hat\phi' = \partial_\theta H(\hat\phi, \hat\theta), \qquad \lambda\,\hat\theta' = -\partial_\phi H(\hat\phi, \hat\theta), \qquad t(s) = \int_0^s \frac{ds'}{\lambda(s')} \tag{8} \]
One possibility to choose the parametrization is to require that $|\hat\phi'| = \ell$, where $|\cdot|$ is a suitable norm and $\ell$ is a constant giving the length of $\hat\phi$ in this norm. This fixes the value of the function $\lambda$ in (8), which can be thought of as a Lagrange multiplier introduced to enforce the constraint $|\hat\phi'| = \ell$. Once $\lambda$ is known, $t(s)$ and its inverse can then be obtained from the last equation in (8). It is often natural to use the norm induced by the correlation matrix $a$, hence to define $|\hat\phi'|_a = \langle\hat\phi', a^{-1}\hat\phi'\rangle^{1/2}$. For simplicity, we call this arclength parametrization even if $a$ is not the identity. Note that, in principle, the norm (and parametrization) can be chosen freely and, indeed, other choices are possible. In the present work, however, we concentrate on arclength parametrization in contrast to the original time parametrization. Details of the implications of other parametrizations (and norms) will be studied elsewhere.
For diffusion-driven systems, the Hamiltonian is given by
\[ H(\phi, \theta) = \langle b(\phi), \theta\rangle + \tfrac{1}{2}\langle\theta, a\theta\rangle \tag{9} \]
and (6) read explicitly
\[ \dot\phi = b(\phi) + a\theta, \qquad \dot\theta = -(\nabla b(\phi))^T\theta \tag{10} \]
where $\nabla b$ is the tensor with entries $(\nabla b)_{ij} = \partial b_i/\partial x_j$. Correspondingly, the first two equations in (8) become
\[ \lambda\,\hat\phi' = b(\hat\phi) + a\hat\theta, \qquad \lambda\,\hat\theta' = -(\nabla b(\hat\phi))^T\hat\theta \tag{11} \]
In this case, the parametrization by arclength directly leads to $\lambda = |b(\hat\phi) + a\hat\theta|_a/\ell$ and we can write (11) as
\[ \hat\phi' = \ell\,\frac{b(\hat\phi) + a\hat\theta}{|b(\hat\phi) + a\hat\theta|_a}, \qquad \hat\theta' = -\ell\,\frac{(\nabla b(\hat\phi))^T\hat\theta}{|b(\hat\phi) + a\hat\theta|_a} \tag{12} \]
The norm appearing in the denominator can be expressed as
\[ |b(\hat\phi) + a\hat\theta|_a = \bigl(|b(\hat\phi)|_a^2 + 2E\bigr)^{1/2} \tag{13} \]
where we used the property that, since the Hamiltonian does not depend explicitly on time, energy is conserved and the points on the graph of $(\hat\phi, \hat\theta)$ need to satisfy the constraint $H(\hat\phi, \hat\theta) = E$ everywhere. This leads to yet another equivalent form of (12):
\[ \hat\phi' = \ell\,\frac{b(\hat\phi) + a\hat\theta}{\sqrt{|b(\hat\phi)|_a^2 + 2E}}, \qquad \hat\theta' = -\ell\,\frac{(\nabla b(\hat\phi))^T\hat\theta}{\sqrt{|b(\hat\phi)|_a^2 + 2E}} \tag{14} \]
These equations are valid in particular in the case when $E = 0$, which, as we will see in sections 2.1 and 2.2, is the relevant one to compute probabilities and expectations with respect to the stationary distribution of (1). In the sequel, we will focus on this case, that is, we will study
\[ \phi' = \ell\,\frac{b(\phi) + a\theta}{|b(\phi)|_a}, \qquad \theta' = -\ell\,\frac{(\nabla b(\phi))^T\theta}{|b(\phi)|_a} \tag{15} \]
where we dropped the hats to simplify notations; we will stick to this convention in the sequel. Note that these equations seem singular at the critical points where $b(\phi) = 0$. This is not the case, however, as $\theta \to 0$ at these points, too. But it will complicate the numerical integration of (15) since, typically, minimizing paths start from critical points and may pass through others; this issue will be dealt with in section 3. Also note that the constant $\ell$ giving the length of the path is, usually, not known a priori. Finally, note that if $\sigma$ were not constant, an additional term would have to be added to the equation for $\theta$.
2.1 Derivation of (15) from the geometric action and link with the quasipotential
A key quantity in Freidlin-Wentzell theory is the quasipotential, defined as
\[ V(x_1, x_2) = \inf_{T>0}\,\inf_{\phi}\,S_T(\phi) \tag{16} \]
where $x_1$ and $x_2$ are two arbitrary points in $\mathbb{R}^n$ and the infimum over $\phi$ is taken over all the paths that satisfy $\phi(0) = x_1$, $\phi(T) = x_2$. The quasipotential permits one to estimate various long-time properties of (1) in the limit as $\varepsilon \to 0$. For example, if $\bar x$ is the unique stable fixpoint of $\dot x = b(x)$, and under suitable conditions such that the solutions to (1) are ergodic with respect to the stationary probability density function $\rho$, then
\[ \rho(x) \asymp \exp\bigl(-V(\bar x, x)/\varepsilon\bigr) \tag{17} \]
where $\asymp$ indicates that the ratio of the logarithms of both sides tends to 1 as $\varepsilon \to 0$. In [13], it was shown that the quasipotential (16) can be represented equivalently as
\[ V(x_1, x_2) = \inf_{\phi}\,\hat S(\phi) \tag{18} \]
where the infimum is taken over the paths $\phi: [0,1] \to \mathbb{R}^n$ that satisfy $\phi(0) = x_1$, $\phi(1) = x_2$, and we introduced the geometric action
\[ \hat S(\phi) = \int_0^1 \Bigl(|\phi'|_a\,|b(\phi)|_a - \langle\phi', a^{-1}b(\phi)\rangle\Bigr)\,ds \tag{19} \]
This quantity bears its name from the property that it is left invariant by any reparametrization of the path.
We now show by explicit calculation that the arclength parametrized Hamilton's equations (15) are equivalent to the Euler-Lagrange equations associated with (19). These Euler-Lagrange equations are given by
\[ \frac{d}{ds}\left(\frac{|b(\phi)|_a}{|\phi'|_a}\,a^{-1}\phi' - a^{-1}b(\phi)\right) = |\phi'|_a\,\nabla|b(\phi)|_a - (\nabla b(\phi))^T a^{-1}\phi' \tag{20} \]
which we can write in a more explicit form as
\[ \frac{d}{ds}\Bigl(a^{-1}\bigl(\lambda\phi' - b(\phi)\bigr)\Bigr) = -(\nabla b(\phi))^T a^{-1}\bigl(\phi' - \lambda^{-1}b(\phi)\bigr), \qquad \lambda = \frac{|b(\phi)|_a}{|\phi'|_a} \tag{21} \]
Let
\[ \theta = a^{-1}\bigl(\lambda\phi' - b(\phi)\bigr) \tag{22} \]
Then (20) can be written as
\[ \lambda\,\theta' = -(\nabla b(\phi))^T\theta \tag{23} \]
If we solve (22) for $\phi'$, i.e. $\phi' = \lambda^{-1}(b(\phi) + a\theta)$, and use this expression in (23) together with the constraint $|\phi'|_a = \ell$ (which gives $\lambda = |b(\phi)|_a/\ell$), the resulting system of equations is exactly (15).
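Note, in addition, that the momentum defined in (22) automatically satisfies the zero-energy constraint used in (15): with $\lambda = |b(\phi)|_a/|\phi'|_a$ we have
\[ H(\phi,\theta) = \langle b(\phi),\theta\rangle + \tfrac12\langle\theta, a\theta\rangle = \lambda\langle a^{-1}b(\phi), \phi'\rangle - |b(\phi)|_a^2 + \tfrac12\Bigl(\lambda^2|\phi'|_a^2 - 2\lambda\langle a^{-1}b(\phi), \phi'\rangle + |b(\phi)|_a^2\Bigr) = 0, \]
since $\lambda^2|\phi'|_a^2 = |b(\phi)|_a^2$. This is consistent with the choice $E = 0$ made above.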
2.2 Use of (15) in the computation of expectations
Assume again that $b$ has a single stable fixpoint at $\bar x$ and suppose that we want to estimate the following expectation with respect to the stationary density $\rho$ of the process:
\[ A_\varepsilon = \mathbb{E}^\rho\bigl[e^{f(X)/\varepsilon}\bigr] \tag{24} \]
where $f$ is some function. Then, as $\varepsilon \to 0$,
\[ \lim_{\varepsilon\to 0}\varepsilon\log A_\varepsilon = -\inf_{\phi}\bigl(\hat S(\phi) - f(\phi(1))\bigr) \tag{25} \]
where the infimum is taken over all paths $\phi$ such that $\phi(0) = \bar x$. It is easy to see that the minimizer of this variational problem solves (15) with the boundary conditions
\[ \phi(0) = \bar x, \qquad \theta(1) = \nabla f(\phi(1)) \tag{26} \]
Note that these boundary conditions differ from the boundary conditions that are commonly used in applications of the original gMAM. In those applications, two points, e.g. two stable fixpoints $x_1$ and $x_2$, are prescribed. In contrast, when computing expectations as shown above, one of these boundary conditions is replaced by the condition $\theta(1) = \nabla f(\phi(1))$ on the momentum. It is possible to adapt gMAM to handle such boundary conditions, for example by making the end point variable. However, as shown in section 3, working with (15) offers a much simpler way to handle boundary conditions of the type (26).
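The second condition in (26) can be understood from the first variation of the functional in (25) when the endpoint $\phi(1)$ is left free: writing $\hat L$ for the integrand of (19), stationarity requires, in addition to the Euler-Lagrange equations, that the endpoint term vanishes,
\[ \Bigl\langle \frac{\partial\hat L}{\partial\phi'}\Big|_{s=1} - \nabla f(\phi(1)),\ \delta\phi(1)\Bigr\rangle = 0 \qquad\text{for all variations } \delta\phi(1), \]
and since $\partial\hat L/\partial\phi' = a^{-1}(\lambda\phi' - b(\phi)) = \theta$ by (22), this yields $\theta(1) = \nabla f(\phi(1))$.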
2.3 Special solutions of (15)
If $\theta = 0$, then (15) reduces to
\[ \phi' = \ell\,\frac{b(\phi)}{|b(\phi)|_a}, \qquad \theta = 0 \tag{27} \]
meaning that $\phi'$ and $b(\phi)$ point in the same direction. This solution corresponds to a reparametrized version of a deterministic trajectory, which is a solution of $\dot x = b(x)$. Another $\theta$ that solves $H(\phi,\theta) = 0$ with $H$ given by (9) is $\theta = -2a^{-1}b(\phi)$, for which the equation for $\phi$ in (15) reduces to
\[ \phi' = -\ell\,\frac{b(\phi)}{|b(\phi)|_a} \tag{28} \]
meaning that $\phi'$ and $b(\phi)$ point in opposite directions. This special solution, however, is not always consistent with the second equation in (15). Indeed, differentiating $\theta = -2a^{-1}b(\phi)$ gives
\[ \theta' = -2a^{-1}\nabla b(\phi)\,\phi' = 2\ell\,\frac{a^{-1}\nabla b(\phi)\,b(\phi)}{|b(\phi)|_a} \tag{29} \]
where we used (28) to get the second equality. This equation is consistent with the second equation in (15) iff $a^{-1}\nabla b = (\nabla b)^T a^{-1}$, i.e. iff the drift term is gradient, $b = -a\nabla U$ for some potential $U$. Thus, solutions of (15) that follow deterministic paths in reverse can only be observed in gradient systems.
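This can be verified numerically. The sketch below checks, for an illustrative 1D gradient system $b(x) = -U'(x)$ with $U(x) = x^4/4$ and $a = 1$ (our choice, not taken from the text), that the reversed deterministic path with $\theta = -2b(\phi) = 2U'(\phi)$ satisfies the second equation in (15) as well as the zero-energy constraint:

```python
import numpy as np

# Check, for an illustrative 1D gradient system b(x) = -U'(x) with
# U(x) = x^4/4 and a = 1 (our choice, not from the text), that the reversed
# deterministic path with theta = -2 b(phi) = 2 U'(phi) solves (15).
U_prime = lambda x: x**3
b = lambda x: -U_prime(x)
db = lambda x: -3.0 * x**2          # b'(x)

ell = 1.0                           # path length, so that |phi'| = ell
s = np.linspace(0.0, 1.0, 2001)
phi = 0.5 + ell * s                 # reversed path: phi' = ell = -ell*b/|b| for phi > 0
theta = -2.0 * b(phi)               # special solution theta = -2 a^{-1} b(phi)

# residual of the second equation in (15): theta' = -ell * b'(phi) * theta / |b(phi)|
ds = s[1] - s[0]
theta_s = np.gradient(theta, ds)    # second-order finite differences
residual = np.max(np.abs(theta_s + ell * db(phi) * theta / np.abs(b(phi)))[2:-2])

# the zero-energy constraint H = theta*b(phi) + theta^2/2 = 0 holds exactly
H = theta * b(phi) + 0.5 * theta**2
print(residual, np.max(np.abs(H)))
```

Since $U$ is a genuine potential here, the consistency condition of the text is met and the residual vanishes up to the finite-difference error.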
Another class of systems for which we can find special solutions of (15) that are not of the form (27) are those for which [10]
\[ b(x) = -a\nabla U(x) + c(x) \tag{30} \]
where $c$ is a vector field such that $\langle c, \nabla U\rangle = 0$ and $\nabla\cdot c = 0$. It can easily be checked that in this case the solutions to (1) are ergodic with respect to the following stationary probability density
\[ \rho(x) = Z^{-1}\exp\bigl(-2U(x)/\varepsilon\bigr) \tag{31} \]
where is a normalization factor. In that case a special class of solutions to (15) are those such that and satisfies
\[ \phi' = \ell\,\frac{a\nabla U(\phi) + c(\phi)}{|b(\phi)|_a} \tag{32} \]
as can be checked by direct substitution, using the properties of $c$.
3 Algorithmic aspects
Using the arclength parametrized Hamilton's equations (15) gives us a direct and numerically efficient iterative method to find the instanton that fits the boundary conditions (26). Assume that we are given the $n$th iteration $(\phi^{(n)}, \theta^{(n)})$, together with an approximation $\ell^{(n)}$ of the path length, as approximation to the solution of (15) with the boundary conditions (26). To compute the next iteration $(\phi^{(n+1)}, \theta^{(n+1)})$ and $\ell^{(n+1)}$ we proceed in the following way:

1. Use the known approximate solution $\phi^{(n)}$ and $\ell^{(n)}$, together with the known final condition $\theta(1) = \nabla f(\phi^{(n)}(1))$, to solve the equation for $\theta$ backwards in the arclength parameter to obtain the next iteration $\theta^{(n+1)}$.
2. Use this computed approximate solution $\theta^{(n+1)}$ and $\ell^{(n)}$, together with the initial condition $\phi(0) = \bar x$, in the equation for $\phi$ to obtain the next iteration $\phi^{(n+1)}$.
3. Compute the length of $\phi^{(n+1)}$ to obtain the next iteration $\ell^{(n+1)}$ of $\ell$.
If this iteration scheme converges to a fixpoint for $\phi$, $\theta$ and $\ell$, then the constraint $|\phi'|_a = \ell$ will be automatically satisfied by this fixpoint. It may not be satisfied during the iterations, however, in which case in step 3 it may be useful to also reparametrize $\phi^{(n+1)}$ to enforce the constraint $|\phi'|_a = \ell$. At convergence, the solution will also satisfy $H(\phi,\theta) = 0$, since the first equation in (15) together with the constraint implies that $|b(\phi) + a\theta|_a = |b(\phi)|_a$, and hence, by (13), $E = 0$. A similar method (with a parametrization using an exponentially scaled time) was suggested by Chernykh and Stepanov [3] for the computation of instantons in the stochastic Burgers equation. We will discuss this example in more detail in section 4.
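A minimal sketch of this iteration for a one-dimensional toy model (drift $b(x) = -x$, $a = 1$, observable $f(x) = cx$, all values illustrative and not taken from the paper) might look as follows; step 3 also reparametrizes the path by arclength, as suggested above:

```python
import numpy as np

# Sketch of the iterative scheme of section 3 for a 1D toy model with drift
# b(x) = -x, a = 1, fixpoint at 0, and observable f(x) = c*x, so that the
# final condition (26) is theta(1) = c. All values are illustrative.
b = lambda x: -x
db = lambda x: -1.0          # derivative of the drift
c = 1.0                      # f(x) = c*x, hence grad f = c
N = 200                      # number of arclength intervals
ds = 1.0 / N
delta = 1e-3                 # regularization of |b| near the fixpoint
s = np.linspace(0.0, 1.0, N + 1)

phi = s.copy()               # initial guess for the path, phi(0) = 0
theta = np.zeros(N + 1)
ell = 1.0                    # initial guess for the path length

for n in range(500):
    # step 1: integrate theta' = -ell*b'(phi)*theta/|b(phi)| backwards in s
    theta[N] = c
    for i in range(N - 1, -1, -1):
        dn = max(abs(b(phi[i + 1])), delta)
        theta[i] = theta[i + 1] + ds * ell * db(phi[i + 1]) * theta[i + 1] / dn
    # step 2: integrate phi' = ell*(b(phi) + theta)/|b(phi)| forwards in s
    new = np.empty_like(phi)
    new[0] = 0.0
    for i in range(N):
        dn = max(abs(b(new[i])), delta)
        new[i + 1] = new[i] + ds * ell * (b(new[i]) + theta[i]) / dn
    # step 3: update the length and reparametrize the path by arclength
    seg = np.abs(np.diff(new))
    ell = max(seg.sum(), delta)
    alpha = np.concatenate(([0.0], np.cumsum(seg))) / ell
    phi = np.interp(s, alpha, new)

# along the instanton of this model theta = 2*phi, so the path ends near c/2
print(phi[-1], ell)
```

For this linear model the iteration settles near $\phi(1) = c/2$, the maximizer of $f(x) - V(\bar x, x)$; for SPDE discretizations the same three steps apply field-wise.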
It is worth stressing that the implementation of the method is extremely simple: the solution of (15) requires only forward (or backward) propagation of the initial (or final) value, hence existing codes can easily be adapted to the new method. In contrast, gMAM usually requires the design of entirely new code. The locality in memory of the method presented here is particularly well-suited for parallelization: for large-scale problems, it was successfully implemented in a massively parallel setup, using graphics processing units (GPUs) to speed up the computation (this will be presented elsewhere). In contrast, global minimization techniques such as Newton or quasi-Newton methods are more difficult to implement in such a setting. These aspects additionally simplify the scaling to a large number of degrees of freedom. The largest computations for stochastic partial differential equations presented here take on the order of minutes in Matlab (on an Intel Xeon E5-1620 at 3.60GHz) to converge to the minimizer.
4 Applications and Examples
Let us now present several examples and applications of the new method. In these examples, we focus on the exit from a stable fixed point and apply the iterative method described in section 3 in two ways: first, using Hamilton's equations in the original time parametrization, given by (10), and second, using the arclength parametrized Hamilton's equations (15).
In all cases, we consider the escape from a stable fixpoint $\bar x$ that can be used as initial condition for $\phi$ in the above equations, and we designate a final condition for $\theta$, either by fixing it or through the second boundary condition in (26) for some specific $f$. Note that, in the time-parametrized case, since $\bar x$ is a stable fixpoint, it takes infinite time to escape it. This makes the iteration numerically difficult as, in practice, one needs to choose a finite starting time and then push it further back until convergence of the computed minimum of the action. This problem does not occur when one works with arclength parametrization, which is an immediate numerical advantage of this method.
4.1 Application to a linear Ornstein-Uhlenbeck model
As a first example, consider a linear OrnsteinUhlenbeck model for which the drift is of the form
\[ b(x) = -Bx \tag{33} \]
with $x \in \mathbb{R}^n$ and the matrix $B$ not necessarily normal. In this case, a lot of analytical information about the minimizer can be obtained in the following way [10, 11]: Let $C$ be the solution of
\[ BC + CB^T = a \tag{34} \]
with $a = \sigma\sigma^T$ being the diffusion tensor from above. Then $C$ is the covariance of the equilibrium distribution for the SDE, which is zero-mean Gaussian and exists as long as the origin is a stable fixpoint of $\dot x = -Bx$. The path of maximum likelihood of escape from the origin is the time-reversed of the solution to
\[ \dot x = -CB^TC^{-1}x \tag{35} \]
This result is consistent with the fact that the time-reversed form of the SDE (1) with $b(x) = -Bx$ is
\[ dX^R = -CB^TC^{-1}X^R\,dt + \sqrt{\varepsilon}\,\sigma\,dW \]
and it corresponds to picking
\[ \theta = C^{-1}\phi \tag{36} \]
as solution to $H(\phi,\theta) = 0$. It can be checked that, if we insert (36) in the first equation in (10), this equation reduces to the time reverse of (35), and that the second equation in (10) is also satisfied. Since we can compute the minimizer analytically, this model is a good start to compare the two methods (Hamilton's equations with time parametrization vs. Hamilton's equations with arclength parametrization) and to study convergence. As a concrete example, we chose a normal matrix $B$ as a test case. Note that, in this case, the analytical solution is even simpler: the minimizing path is the time-reversed of the trajectory that spirals into the fixpoint at the origin.
For this system we implemented the algorithm described in section 3 to solve both the equations of motion of the original Hamiltonian formulation (10) and of the geometric parametrization (15). Propagation of the initial conditions is realized with a simple first-order explicit Euler scheme.
As shown in figure 1, at identical resolution the proposed method is closer to the analytical solution by roughly an order of magnitude (for a given accuracy, the proposed method needs roughly a factor of 10 fewer gridpoints). Due to the choice of the integration scheme, both methods converge at first order. The difference to the analytical solution is measured by comparing the endpoints of the paths (due to the different parametrizations, differences between points other than the endpoint are harder to define and calculate).
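The algebra behind (34)-(36) can be checked numerically; the sketch below uses an illustrative normal matrix $B$ (our choice, not necessarily the test case of figure 1):

```python
import numpy as np

# Numerical check of the analytical structure (34)-(36) of section 4.1 for
# an illustrative normal matrix B (these values are ours, not necessarily
# the test case used for figure 1); drift b(x) = -B x, diffusion a = Id.
B = np.array([[1.0, -1.0], [1.0, 1.0]])
a = np.eye(2)

# equilibrium covariance: B C + C B^T = a; since B + B^T = 2 Id, C = a/2
C = 0.5 * a

rng = np.random.default_rng(0)
phi = rng.standard_normal(2)
theta = np.linalg.solve(C, phi)     # candidate momentum (36): theta = C^{-1} phi

# (36) lies on the zero level set of the Hamiltonian (9) ...
H = np.dot(-B @ phi, theta) + 0.5 * np.dot(theta, a @ theta)

# ... and turns the first equation of (10) into the time reverse of (35):
# b(phi) + a*theta = (-B + a C^{-1}) phi = C B^T C^{-1} phi
lhs = -B @ phi + a @ theta
rhs = C @ B.T @ np.linalg.solve(C, phi)
print(H, np.max(np.abs(lhs - rhs)))
```

Both quantities vanish to machine precision, confirming that the escape path follows the time-reversed relaxation dynamics.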
The action density over the curve parameter is concentrated near one end of the domain for the time parametrization, while it gets stretched out to cover a significant part of the domain in the geometric case (see figure 2). This behavior is desirable, because it offers more grid points in the regions with critical dynamics.
4.2 Application to the $\phi^4$ model
Consider the following SPDE
\[ \partial_t u = \partial_z^2 u + u - u^3 + \sqrt{\varepsilon}\,\eta(z,t) \tag{37} \]
with Dirichlet boundary conditions $u(0,t) = u(1,t) = 0$, and where $\eta$ is a spatio-temporal white noise. It can be shown [8] that this equation is well-posed, and that its equilibrium distribution is formally given by
\[ d\mu(u) \propto \exp\bigl(-\varepsilon^{-1}E(u)\bigr)\,du, \qquad E(u) = \int_0^1\Bigl(\tfrac12 u_z^2 - \tfrac12 u^2 + \tfrac14 u^4\Bigr)\,dz \tag{38} \]
The right way to interpret this measure is via its Radon-Nikodym derivative with respect to the measure of the Brownian bridge (i.e. by using the measure of the Brownian bridge as reference and writing the density of the equilibrium distribution of (37) with respect to this measure; see e.g. [18]):
\[ \frac{d\mu}{d\mu_0}(u) \propto \exp\left(-\varepsilon^{-1}\int_0^1\Bigl(-\tfrac12 u^2 + \tfrac14 u^4\Bigr)\,dz\right) \tag{39} \]
where $\mu_0$ denotes the measure of the Brownian bridge on $[0,1]$, i.e. of the Gaussian process with mean zero and covariance
\[ C(z, z') = \varepsilon\,\min(z, z')\,\bigl(1 - \max(z, z')\bigr) \tag{40} \]
In the large deviation regime (i.e. in the limit when $\varepsilon \to 0$), we can ask what the most likely configuration on the invariant measure is that satisfies a given pointwise condition. This is the solution of
\[ u_{zz} + u - u^3 = 0 \tag{41} \]
supplemented with the pointwise condition and the boundary conditions.
In the appropriate limit, this solution is simply
(42) 
We can also compute
(43) 
where $\lambda$ is a parameter and the expectation is taken on the invariant measure. The configuration on the invariant measure that contributes the most to this expectation is the solution to (41) with the appropriate boundary conditions:
(44) 
The path of maximum likelihood to reach (42) or (44) dynamically is then the time reverse of the solution to (37) with $\varepsilon = 0$ (deterministic flow) and (42) or (44) as initial condition.
A similar story holds if we consider (37) with periodic boundary conditions, which are easier to implement numerically. In this case the solution to (41) with periodic boundary conditions gives the relevant configuration.
In this example, we implemented the iterative scheme for the equations of motion both in the time-parametrized and in the geometric variant, using a second-order Heun integration in the curve parameter. The spatial dimension is resolved on a uniform grid, using Fourier transforms to calculate the spatial derivatives.
Figure 3 compares, for different resolutions, the numerical solutions for the final configuration of the minimizer of the $\phi^4$ model (37) to the analytical solution for the periodic case. Already at moderate resolution, the geometric method outperforms the time-parametrized variant run at considerably higher resolution in terms of accuracy.
4.3 Application to the stochastically driven Burgers equation
Consider the stochastically driven Burgers equation with periodic boundary conditions:
\[ \partial_t u + u\,\partial_x u - \nu\,\partial_x^2 u = \sqrt{\varepsilon}\,\eta(x,t) \tag{45} \]
In this example, we consider a noise term $\eta$ that has a finite correlation length in space, but is white in time. Hence we can write
\[ \langle\eta(x,t)\,\eta(x',t')\rangle = \chi(x - x')\,\delta(t - t') \tag{46} \]
and for the correlation function $\chi$ in space, we assume the following form in Fourier space:
\[ \hat\chi(k) = \chi_0\,\Theta(k_c - |k|) \tag{47} \]
where $\Theta$ denotes the Heaviside function. This form implies that the stochastic driving occurs only at wavenumbers up to a cutoff $k_c$. In this way, (45) is a simple model for turbulence [2]: the system is stochastically driven at large scales by the noise term. This energy is transported via the nonlinearity from large scales to small scales, where energy is dissipated due to the presence of the diffusive term.
We consider the velocity field starting at rest in the distant past, and we are interested in noise realizations that lead to strong negative gradients at the origin at the final time $t = 0$. This setup is similar to the analytic treatment in [1]. We choose as our observable
\[ f(u) = \lambda\,\partial_x u(x = 0, t = 0) \tag{48} \]
For this case, Hamilton's equations for the reparametrized fields $u(x,s)$ and $p(x,s)$ can be written as
\[ \partial_s u = \ell\,\frac{-u\,\partial_x u + \nu\,\partial_x^2 u + \chi * p}{\bigl\| -u\,\partial_x u + \nu\,\partial_x^2 u \bigr\|_\chi}, \qquad \partial_s p = -\ell\,\frac{u\,\partial_x p + \nu\,\partial_x^2 p}{\bigl\| -u\,\partial_x u + \nu\,\partial_x^2 u \bigr\|_\chi} \tag{49} \]
where $*$ denotes convolution in space.
The initial condition for the velocity field is $u(x, s=0) = 0$; the final condition for the auxiliary field is $p(x, s=1) = -\lambda\,\delta'(x)$, as can be seen from (48) using integration by parts. The norm $\|\cdot\|_\chi$ is given by the inverse of $\chi$ on its support (which is compact in the Fourier domain), hence
\[ \|v\|_\chi^2 = \int v\,\mathcal{F}^{-1}\bigl[\hat v/\hat\chi\bigr]\,dx \tag{50} \]
where $\mathcal{F}^{-1}$ denotes the inverse Fourier transform operator and $\hat v$ is the Fourier transform of $v$.
For comparison with previous work [3, 12], it is useful to contrast (49) with the corresponding instanton equations in the original parametrization (using the physical time $t$ as parameter):
\[ \partial_t u + u\,\partial_x u - \nu\,\partial_x^2 u = \chi * p, \qquad \partial_t p + u\,\partial_x p + \nu\,\partial_x^2 p = 0 \tag{51} \]
Note that the same equations can also be obtained using the Martin-Siggia-Rose/Janssen/de Dominicis formalism [16, 15, 4, 17]. A main problem in solving (51) with the boundary conditions at $t \to -\infty$ and $t = 0$ consists in the fact that the fixpoint of the system is only reached in the limit $t \to -\infty$ and not in finite time. When Chernykh and Stepanov computed a numerical solution to the system above, they used a combination of two clever tricks to mitigate this difficulty: first, they used self-similar properties of the heat kernel to design a coordinate transform that leads to an exponential rescaling in time; second, they used the linearization of the system around zero in order to replace the boundary condition at $t \to -\infty$ by an approximate condition at a finite starting time (where the constant entering this condition can be computed from the correlation function $\chi$). In the following we show that the geometric reformulation of the instanton equations leads to a natural rescaling of time that allows for a direct, efficient numerical solution, which is furthermore transferable to similar situations without any modification.
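For illustration, a single Heun step of the geometric equations (49) with spectral derivatives might be sketched as follows (grid size, viscosity, cutoff and the discrete normalization of the $\chi$-norm are illustrative choices, not the values used in the paper):

```python
import numpy as np

# Sketch of a single Heun step for the geometric equations (49), with all
# spatial derivatives and the convolution chi * p evaluated spectrally.
# Grid size, viscosity, cutoff and the discrete normalization of the
# chi-norm are illustrative choices, not the values used in the paper.
Nx, Lx, nu, ell, ds = 256, 2 * np.pi, 0.05, 1.0, 1e-3
x = np.arange(Nx) * Lx / Nx
k = 2 * np.pi * np.fft.fftfreq(Nx, d=Lx / Nx)
chi_hat = (np.abs(k) <= 8.0).astype(float)     # hat(chi): sharp cutoff as in (47)

def dx(v):  return np.real(np.fft.ifft(1j * k * np.fft.fft(v)))
def dxx(v): return np.real(np.fft.ifft(-k**2 * np.fft.fft(v)))
def conv_chi(v):                               # chi * v via the Fourier domain
    return np.real(np.fft.ifft(chi_hat * np.fft.fft(v)))

def chi_norm(v):                               # discrete version of (50) on supp(chi_hat)
    vh = np.fft.fft(v)
    m = chi_hat > 0
    return np.sqrt(np.sum(np.abs(vh[m]) ** 2 / chi_hat[m]) * Lx) / Nx

def rhs(u, p):                                 # right-hand sides of (49)
    bu = -u * dx(u) + nu * dxx(u)
    den = max(chi_norm(bu), 1e-8)              # regularized near the fixpoint u = 0
    return ell * (bu + conv_chi(p)) / den, -ell * (u * dx(p) + nu * dxx(p)) / den

u, p = 0.1 * np.sin(x), 0.01 * np.cos(x)       # sample fields for one step
fu, fp = rhs(u, p)
gu, gp = rhs(u + ds * fu, p + ds * fp)
u_new, p_new = u + 0.5 * ds * (fu + gu), p + 0.5 * ds * (fp + gp)
```

Embedding this step in the iteration of section 3 then only requires the backward/forward sweeps in $s$ and the update of $\ell$.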
We implemented the iterative algorithm outlined in section 3, employing a second-order Heun integration of the geometric equations of motion (49), and compare the results to a similar iterative algorithm that instead integrates the time-parametrized equations of motion (51). The space variable is resolved on a uniform grid, and all derivatives in space are calculated via Fourier transforms. The viscosity $\nu$ and the parameter $\lambda$ in (48) were held fixed for the calculations presented here.
Figure 4 shows the convergence of the maximum of the Hamiltonian for the proposed scheme in comparison to the time-parametrized variant. For both methods the Hamiltonian converges to zero at second order. The reparametrized variant is more accurate by a sizable factor: at the smallest tested resolution, the error of the geometric integration is still smaller than the error of the time-parametrized integration at the highest tested resolution. The reason is simple:
When using physical time as discretization variable, it is necessary to prescribe a finite starting time, whose magnitude is restricted to ensure stability and accuracy of the integration scheme for a given time resolution. Yet, when choosing the starting time too close to the final time, the resulting finite-time minimizer does not necessarily approximate the global minimizer. For the geometric equations of motion this issue is absent, and the arbitrary choice of the starting time is avoided in a natural way. Figure 5 depicts the final configurations of the minimizer for different choices of the starting time, demonstrating the necessity to choose a sufficiently early starting time: starting times too close to the final time produce, to varying extent, spurious secondary extrema, which are absent both for sufficiently early starting times and for the geometric version.
A plot of the complete minimizer in space-time, as shown in figure 6, reveals the superior parametrization of the geometric scheme. Even for a somewhat optimistic choice of the starting time, most of the dynamics are squeezed into very few gridpoints in the time parametrization (left), while being far better resolved in the geometric case (right), which nevertheless covers a more than 10 times larger interval of physical time in total.
5 Conclusion
We presented a new method to compute minimizers of action functionals using a geometric reparametrization of Hamilton's equations. The new method is particularly well-suited to compute expectations related to stochastically driven exits from attracting domains, though it may be useful in other contexts as well. The new method was illustrated using a simple Ornstein-Uhlenbeck system and then numerically implemented and tested in the context of stochastic partial differential equations with additive noise for the $\phi^4$ model and the stochastic Burgers equation. In all cases, it was shown that the new parametrization has computational advantages over the parametrization using the original physical time. Applications to more complicated systems, in particular higher-dimensional stochastic partial differential equations (e.g. the stochastically driven Navier-Stokes or MHD equations), are therefore within reach and will be the topic of future research.
Acknowledgments
The work of T.G. was partially supported through the grants ISF-7101800401 and Minerva Coop 7114170101. T.S. acknowledges support through the following NSF grants: DMS-0807396, DMS-1108780, and CNS-0855217. The work of E.V.E. was supported in part by NSF Grant DMS-0708140 and ONR Grant N00014 1110345. Furthermore, T.G. would like to thank Gregory Falkovich for interesting discussions.
References
 [1] E. Balkovsky, G. Falkovich, I. Kolokolov, and V. Lebedev. Intermittency of Burgers’ turbulence. Phys. Rev. Lett., 78:1452, 1997.
 [2] J. Cardy, G. Falkovich, and K. Gawedzki. Nonequilibrium Statistical Mechanics and Turbulence. Cambridge University Press, 2008.
 [3] A.I. Chernykh and M.G. Stepanov. Large negative velocity gradients in Burgers turbulence. Phys. Rev. E, 64:026306, 2001.
 [4] C. de Dominicis. Techniques de renormalisation de la théorie des champs et dynamique des phénomènes critiques. J. Phys. C, 1:247, 1976.
 [5] W. E, W. Ren, and E. Vanden-Eijnden. String method for the study of rare events. Phys. Rev. B, 66:052301, 2002.
 [6] W. E, W. Ren, and E. Vanden-Eijnden. Minimum action method for the study of rare events. Comm. Pure Appl. Math., 57:1–20, 2004.
 [7] W. E, W. Ren, and E. Vanden-Eijnden. Simplified and improved string method for computing the minimum energy paths in barrier-crossing events. J. Chem. Phys., 126:164103, 2007.
 [8] W. G. Faris and G. Jona-Lasinio. Large fluctuations for a nonlinear heat equation with noise. J. Phys. A, 15(10):3025, 1982.
 [9] H. C. Fogedby and W. Ren. Minimum action method for the Kardar-Parisi-Zhang equation. Phys. Rev. E, 80:041116, 2009.
 [10] M. I. Freidlin and A. D. Wentzell. Random perturbations of dynamical systems. Springer, 1998.
 [11] C. Gardiner. Stochastic Methods: A Handbook for the Natural and Social Sciences. Springer, 2009.
 [12] T. Grafke, R. Grauer, and T. Schäfer. Instanton filtering for the stochastic Burgers equation. J. Phys. A: Mathematical and Theoretical, 46(6):62002, 2013.
 [13] M. Heymann and E. Vanden-Eijnden. The geometric minimum action method: A least action principle on the space of curves. Comm. Pure Appl. Math., 61:1053, 2008.
 [14] M. Heymann and E. Vanden-Eijnden. Pathways of maximum likelihood for rare events in nonequilibrium systems: application to nucleation in the presence of shear. Phys. Rev. Lett., 100(14):140601, 2008.
 [15] H. K. Janssen. On a Lagrangian for classical field dynamics and renormalization group calculations of dynamical critical properties. Z. Physik B, 23:377, 1976.
 [16] P. C. Martin, E. D. Siggia, and H. A. Rose. Statistical dynamics of classical systems. Phys. Rev. A, 8:423, 1973.
 [17] R. Phythian. The functional formalism of classical statistical dynamics. J. Phys. A, 10, 1977.
 [18] M. G. Reznikoff and E. Vanden-Eijnden. Invariant measures of stochastic partial differential equations and conditioned diffusions. Comptes Rendus Mathematique, 340(4):305–308, 2005.
 [19] E. Vanden-Eijnden and M. Heymann. The geometric minimum action method for computing minimum energy paths. J. Chem. Phys., 128:061103, 2008.
 [20] S. R. S. Varadhan. Large deviations. Ann. Prob., 36(2):397–419, 2008.
 [21] X. Zhou, W. Ren, and W. E. Adaptive minimum action method for the study of rare events. J. Chem. Phys., 128:104111, 2008.