Measuring the Irreversibility of Numerical Schemes for Reversible Stochastic Differential Equations
For a stationary Markov process the detailed balance condition is equivalent to the time-reversibility of the process. For stochastic differential equations (SDE’s), the time discretization of numerical schemes usually destroys the time-reversibility property. Despite an extensive literature on the numerical analysis for SDE’s, their stability properties, strong and/or weak error estimates, large deviations and infinite-time estimates, no quantitative results are known on the lack of reversibility of discrete-time approximation processes. In this paper we provide such quantitative estimates by using the concept of entropy production rate, inspired by ideas from non-equilibrium statistical mechanics. The entropy production rate for a stochastic process is defined as the relative entropy (per unit time) of the path measure of the process with respect to the path measure of the time-reversed process. By construction the entropy production rate is nonnegative and it vanishes if and only if the process is reversible. Crucially, from a numerical point of view, the entropy production rate is an a posteriori quantity, hence it can be computed in the course of a simulation as the ergodic average of a certain functional of the process (the so-called Gallavotti-Cohen (GC) action functional). We compute the entropy production for various numerical schemes such as explicit Euler-Maruyama and explicit Milstein’s for reversible SDEs with additive or multiplicative noise. In addition we analyze the entropy production for the BBK integrator for the Langevin equation. The order (in the time-discretization step ) of the entropy production rate provides a tool to classify numerical schemes in terms of their (discretization-induced) irreversibility. Our results show that the type of the noise critically affects the behavior of the entropy production rate. As an example of our results we show that the Euler scheme for multiplicative noise is not an adequate scheme from a reversibility point of view.
Key words and phrases:Stochastic differential equations, Detailed Balance, Reversibility, Relative Entropy, Entropy production, Numerical integration, (overdamped) Langevin process.
1991 Mathematics Subject Classification:65C30, 82C3, 60H10
Pour un processus de Markov la condition de balance détaillée est équivalente à la reversibilité du processus par rapport au renversement du temps. Pour des équations différentielles stochastiques, les schémas de discrétisation détruisent en général cette proprieté de reversibilité. En dépit d’une vaste littérature sur l’analyse numérique des équations differentielles stochastiques, leur proprieté de stabilité, les erreurs fortes et/ou faibles, les proprietés de grandes déviations et à long temps, il n’y a pas eu jusqu’à maintenant de résultats quantitatifs sur l’irréversibilité introduite par les approximation numériques. Dans cet article nous fournissons de telles estimations, en nous basant sur le taux de production d’entropie, inspirés par des idées de mécanique statistique hors-équilibre. Le taux de production d’entropie est, par définition, l’entropie relative (par unité de temps) du processus par rapport au processus renversé en temps. Par construction, le taux de production d’entropie est non-négatif et il est zéro si et seulement si le procesus est réversible. Crucialement, d’un point de vue numérique, le taux de production d’entropie peut être evalué directement comme la moyenne ergodique d’une certaine fonctionnelle du processus (la fonctionelle de Gallavotti-Cohen), sous des conditions d’ergodicité adéquates. Nous calculons la production d’entropie pour le schéma explicite d’Euler-Maruyama et le schéma explicite de Milstein pour des equations différentielles stochastiques reversibles avec des bruit additifs ou multiplicatifs. Nos résultats démontrent que le type de bruit change le comportement la production d’entropie de manière critique. Finalement nous analysons la production d’entropie pour le schéma BBK pour l’équation de Langevin.
In molecular dynamics algorithms arising in the simulation of systems in materials science, chemical engineering, evolutionary games, computational statistical mechanics, etc. the steady- state statistics obtained from numerical simulations is of great importance [Gardiner:85, vanKampen:06, Schlick:02]. For instance, the free energy of the system or free energy differences as well dynamical transitions between metastable states are quantities which are sampled in the stationary regime. In addition, physical processes are often modeled at a microscopic level as interactions between particles which obey a system of stochastic differential equations (SDE’s) [Lelievre:10, Gardiner:85]. To perform steady-state simulations for the sampling of desirable observables, the solution of the system of SDE’s must possess a (unique) ergodic invariant measure. The uniqueness of the invariant measure follows from the ellipticity or hypoellipticity of the generator of the process together with irreducibility, which means that the process can reach at some positive time any open subset of the state space with positive probability [Meyn:93, Rey-Bellet:06]. Under such conditions the distribution process converges to the invariant measure (ergodicity) which has a smooth density and the process started in the invariant measure is stationary, i.e. the distribution of the paths of the processes, is invariant under time-shift. Many processes of physical origin, such as diffusion and adsoprtion/desoprtion of interacting particles, satisfy the condition of detailed balance (DB), or equivalently, time-reversibility, i.e., the distribution of the path of the processes are invariant under time-reversal. It is easy to see that time-reversibility implies stationarity but this a strictly stronger condition in general. The condition of detailed balance often arises from a gradient-like behavior of the dynamics or from Hamiltonian dynamics if the time-reversal includes reversal of the velocities.
However, the numerical simulation of SDE’s necessitates the use of numerical discretization schemes. Discretization procedures, except in very special cases, results in the destruction of the DB condition. This affects the approximation process in at least two ways. First, the invariant measure of the approximation process, if it exists at all, is not known explicitly and, second, the time reversibility of the process is lost. Several recent results prove the existence of the invariant measure for the discrete-time approximation and provide error estimates [Bally:96a, Bally:96b, Mattingly:02, Mattingly:10] but, to the best of our knowledge, there is no quantitative assessment of the irreversibility of the approximation process. Of course there exist Metropolized numerical schemes such as MALA [Roberts:96] and variations thereof which do satisfy the DB condition but they are numerically more expensive, especially in high-dimensional systems, as they require an accept/reject step. Thus, a quantitative understanding of the lack of reversibility for simpler discretization schemes can provide new insights for selecting which schemes are closer to satisfying the DB condition.
The implications of irreversibility are only partially understood, both from the physical and mathematical point of view. These issues have emerged as a main theme in non-equilibrium statistical mechanics and it is well-known that irreversibility introduces a stationary current (net flow) to the system [Nicolis:77, Schnakenberg:76, Maes:00, Rey-Bellet:11] but it is unclear how this current affects the long-time properties (i.e., the dynamics and large deviations) of the process such as exit times, correlation times and phase transitions of metastable states. Reversibility is a natural and fundamental property of physical systems and thus, if numerical simulation results in the destruction of reversibility, one should carefully quantify the irreversibility of the approximation process and we do in this paper using the entropy production rate. The entropy production rate which is defined as the relative entropy (per unit time) of the path measure of the process with respect to the path measure of the reversed process is widely used in statistical mechanics for the study of non-equilibrium steady states of irreversible systems [Gallavotti:95, Lebowitz:99, Maes:00, Rey-Bellet:11]. A fundamental result on the structure of non-equilibrium steady states is the Gallavotti-Cohen fluctuation theorem that describes the fluctuations (of large deviations type) of the entropy production [Gallavotti:95, Lebowitz:99, Maes:00, Rey-Bellet:11] and this result can be viewed as a generalization of the Kubo-formula and Onsager relations far from equilibrium. For our purpose, it is important to note that the entropy production rate is zero when the process is reversible and positive otherwise making entropy production rate a sensible quantitative measure of irreversibility. Furthermore, if we assume ergodicity of the approximation process, the entropy production rate equals the time-average of the Gallavotti-Cohen (GC) action functional which is defined as the logarithm of the Radon-Nikodym derivative between the path measure of the process and the path measure of the reversed process. A key observation of this paper is that GC action functional is an a posteriori quantity, hence, it is easily computable during the simulation making the numerical computation of entropy production rate tractable. We show that entropy production is a computable observable that distinguishes between different numerical schemes in terms of their discretization-induced irreversibility and as such could allow us to adjust the discretization in the course of the simulation.
We use entropy production to assess the irreversibility of various numerical schemes for reversible continuous-time processes. A simple class of reversible processes, yet of great interest, is the overdamped Langevin process with gradient-type drift [Gardiner:85, Gillespie:92, Lelievre:10]. The discretization of the process is performed using the explicit Euler-Maruyama (EM) scheme and we distinguish between two different cases depending on the kind of the noise. In the case of additive noise, under the assumption of ergodicity of the approximation process [Bally:96a, Bally:96b, Mattingly:02, Mattingly:10] we prove that the entropy production rate is of order where is the time step of the numerical scheme. In the case of multiplicative noise, the results are strikingly different. Indeed, under ergodicity assumption, the entropy production rate for the explicit EM scheme is proved to have a lower positive bound which is independent of . Thus irreversibility is not reduced by adjusting , as the approximation process converges to the continuous-time process. The different behavior of entropy production depending on the kind of noise is one of the prominent findings of this paper. As a further step in our study, we analyze the explicit Milstein’s scheme with multiplicative noise (it is the next higher-order numerical scheme). We prove that the entropy production rate of Milstein’s scheme decreases as time step decreases with order .
Finally, we compute both analytically and numerically the entropy production rate for a discretization scheme for Langevin systems which is another important and widely-used class of reversible models [Gardiner:85, Lelievre:10]. The Langevin equation is time-reversible if addition to reversing time, one reverses the sign of the velocity of all particles. The noise is degenerate but the process is hypo-elliptic and under mild conditions the Langevin equation is ergodic [Talay:02, Mattingly:02, Rey-Bellet:02]. Our discretization scheme is a quasi-symplectic splitting scheme also known as BBK integrator [Brunger:84, Lelievre:10]. We rigorously prove, under ergodicity assumption of the approximation process, that the entropy rate produced by the numerical scheme for the Langevin process with additive noise is of order , hence, in terms of irreversibility it is an acceptable integration scheme.
The paper is organized in four sections. In Section 1 we recall some basic facts about reversible processes and define the entropy production. Moreover we give the basic assumptions necessary for our proofs, namely, the ergodicity of both continuous-time and discrete-time approximation process. In Section 2 we compute the entropy production rate for reversible overdamped Langevin processes. The section is split into three subsections for the additive and multiplicative noise for the Euler and Milstein schemes. In Section 3 we compute the entropy production rate for the reversible (up to momenta flip) Langevin process using the BBK integrator. Conclusions and future extensions of the current work are summarized in the fourth and final Section.
1. Reversibility, Gallavotti-Cohen Action Functional, and Entropy Production
Let us consider a -dimensional system of SDE’s written as
where is a diffusion Markov process, is the drift vector, is the diffusion matrix, and is a standard -dimensional Brownian motion. We will always assume that and are sufficiently smooth and satisfy suitable growth conditions and/or dissipativity conditions at infinity to ensure the existence of global solutions. The generator of the diffusion process is defined by
for smooth test functions . We assume that the process has a (unique) invariant measure , and that it satisfies the Detailed Balance (DB) condition, i.e., its generator is symmetric in the Hilbert space :
for suitable smooth test functions .
A Markov process is said to be time-reversible if for any and sequence of times the finite dimensional distributions of and of are identical. More formally, let denote the path measure of the process on the time-interval with . Let denote the time reversal, i.e. acts on a path has
Then reversibility is equivalent to and it is well-known that a stationary111Stationarity is equivalent to starting the process from its invariant measure, i.e., . process which satisfies the DB condition is time-reversible.
The condition of reversibility can be also expressed in terms of relative entropy as follows. Recall that for two probability measure on some measurable space, the relative entropy of with respect to is given by if is absolutely continuous with respect to and otherwise. The relative entropy is nonnegative, and if and only if . The entropy production rate of a Markov process is defined by
If satisfies DB and then is identically for all and the entropy production rate is . Note that if then is a boundary term, in the sense that it is and so the entropy rate vanishes in this case in the large time limit (under suitable ergodicity assumptions). Conversely when the process is truly irreversible. The entropy production rate for Markov processes and stochastic differential equations is discussed in more detail in [Lebowitz:99, Maes:00].
Let us consider a numerical integration scheme for the SDE (1) which has the general form
Here is a discrete-time continuous state-space Markov process, is the time-step and are i.i.d. Gaussian random variables with mean and variance . We will assume that the Markov process has transition probabilities which are absolutely continuous with respect to Lebesgue measure with everywhere positive densities and we also assume that has a invariant measure which we denote and which is then unique and has a density with respect to Lebesgue. In general the invariant measure for and differ, and does not satisfy a DB condition. Note also that the very existence of is not guaranteed in general. Results on the existence of do exist however and typically require that the SDE is elliptic or hypoellitptic and that the state space of is compact or that a global Lipschitz condition on the drift holds [Bally:96a, Bally:96b, Mattingly:02, Mattingly:10].
Proceeding as in the continuous case we introduce an entropy production rate for the Markov process . Let us assume that the process starts from some distribution , then the finite dimensional distribution on the time window where is given by
For the time reversed path we have then
and the Radon-Nikodym derivative takes the form
where is the Gallavotti-Cohen (GC) action functional given by
Note that is an additive functional of the paths and thus if is ergodic, by the ergodic theorem the following limit exists
We call the quantity the entropy production rate associated to the numerical scheme. Note that we have, almost surely,
and for concrete numerical schemes we will compute fairly explicitly the entropy production in the next sections. Since we are interested in the ergodic average we will systematically omit boundary terms which do not contribute to ergodic averages and we will use the notation
For example we have
In the following sections we investigate the behavior of the entropy production rate for different discretization schemes of various reversible processes in the stationary regime. However, before proceeding with our analysis, let us state formally the basic assumptions necessary for our results to apply.
The generator is elliptic or hypo-elliptic, in particular the transition probabilities and the invariant measure (if it exists) are absolutely continuous with respect to Lebesgue with smooth densities. We assume that is ergodic, i.e. every open set can be reached with positive probability starting from any point. For the discretized scheme we assume that has smooth everywhere positive transition probabilities.
Both the continuous-time process and discrete-time process are ergodic with unique invariant measures and , respectively. Furthermore for sufficiently small we have
for functions which are with at most polynomial growth at infinity.
Notice that inequality (15) is an error estimate for the invariant measures of the processes and . The rate of convergence in terms of depends on the particular numerical scheme [Talay:90a, Mattingly:10]. Ergodicity results for (numerical) SDEs can be found in [Roberts:96, Mattingly:02, Talay:02, Talay:90a, Talay:90b, Bally:96a, Bally:96b, Mattingly:10, Khasminskii:10]. For instance, if both drift term and diffusion term have bounded derivatives of any order, the covariance matrix is elliptic for all and there is a compact set outside of which holds for all (Lyapunov exponent) then it was shown in [Talay:90a] that the continuous-time process as well both Euler and Milstein numerical schemes are ergodic and error estimate (15) holds. Another less restrictive example where ergodicity properties were proved is for SDE systems with degenerate noise and particularly for Langevin processes [Mattingly:02, Talay:02]. Again, a Lyapunov functional is the key assumption in order to handle the stochastic process at the infinity. More recently, Mattingly et al. [Mattingly:10] showed ergodicity for SDE-driven processes restricted on a torus as well their discretizations utilizing only the assumptions of ellipticity or hypoellipticity and the assumption of local Lipschitz continuity for both drift and diffusion terms.
2. Entropy Production for Overdamped Langevin Processes
The overdamped Langevin process, , is the solution of the following system of SDE’s
where is a smooth potential function, is the diffusion matrix, is the covariance matrix and is a standard -dimensional Brownian motion. We assume from now on that is invertible for any so that the process is elliptic. It is straightforward to show that the generator of the process satisfies the DB condition (3) with invariant measure
where is the normalization constant and thus if then the Markov process is reversible.
The explicit Euler-Maruyama (EM) scheme for numerical integration of (16) is given by
with , are -dimensional iid Gaussian random variables. The process is a discrete-time Markov process with transition probability density given by
where and is the normalization constant for the multidimensional Gaussian distribution. The following lemma provides the GC action functional for the explicit EM time-discretization scheme of the overdamped Langevin process.
Assume that . Then the GC action functional of the process solving (18) is
where means equality up to boundary terms, as defined in (13).
The assumption for non-zero determinant is imposed so that the transition probabilities and hence the GC action functional are non-singular. The proof is then a straightforward computation using (19) and (10).
where all the terms of the general form in the sums were cancelled out since they form telescopic sums which become boundary terms. ∎
Three important remarks can readily be made from the above computation.
The numerical computation of entropy production rate as the time-average of the GC action functional on the path space (i.e., based on (9)) at first sight seems computationally intractable due to the large dimension of the path space. However, due to ergodicity, the numerical computation of the entropy production can be performed as a time-average based on (11) and (20) for large . Additionally, this computation can be done for free and “on-the-fly” since the quantities involved are already computed in the simulation of the process. The numerical entropy production rate shown in the following figures is computed using this approach.
It was shown in [Maes:00] that the GC action functional of the continuous-time process driven by (16) equals the Stratonovich integral
which reduces to a boundary term as expected. This functional has the discretization
and this is exactly the first term in the GC action functional for the explicit EM approximation process (see (20)). However, the discretization scheme introduces two additional terms to the GC action functional which may greatly affect the asymptotic behavior of entropy production as goes to zero, as we demonstrate in Section 2.2. Notice that when the noise is additive, i.e., when the diffusion matrix is constant, then these two additional terms vanish and taking the limit , the GC action functional , if exists, becomes the Stratonovich integral which is a boundary term.
The GC action functional consists of three terms (see (20)), each of which stems from a particular term in the SDE. Thus, each term in the SDE contributes to the entropy production functional a component which is totally decoupled to the other terms. The reason for this decomposition lies in the particular form of the transition probabilities for the explicit EM scheme which are exponentials with quadratic argument. This feature can be exploited for the study of entropy production of numerical schemes for processes with irreversible dynamics. Indeed, if a non-gradient term of the form is added to the drift of (16), the process is irreversible and its GC action functional is not anymore a boundary term and is given by [Maes:00]
On the other hand, due to the separation property of the explicit EM scheme, the GC action functional of the discrete-time approximation process has the additional term
Evidently, the discretization of equals the additional term of the GC functional . Thus, GC action functional is decomposed into two components, one stemming from the irreversibility of the continuous-time process and another one stemming from the irreversibility of the discretization procedure.
2.1. Entropy Production for the Additive Noise Case
An important special case of (16) is the case of additive noise, i.e., when the covariance matrix does not depend in the process, . In this case, the SDE system becomes
and the GC action functional is simply given by
In this section we prove an upper bound for the entropy production of the explicit EM scheme. The proof uses several lemmas stated and proved in Appendix A.
Let Assumption 1 hold. Assume also that the potential function has bounded fifth-order derivative and that the covariance matrix is invertible. Then, for sufficiently small , there exists such that
Utilizing the generalized trapezoidal rule (84) for , the GC action function is rewritten as
Applying, once again, Taylor series expansion to , the GC action functional becomes
where . Moreover, expanding using the multi-binomial formula
Then, the GC action functional becomes
From (11), the entropy production rate is the time-averaged GC action functional as . Thus,
The ergodicity of as well the Gaussianity of guarantees that the first two limits in the entropy production formula exist. Additionally, the residual terms, , are bounded due to the assumption on bounded fifth-order derivative of , hence, the third limit also exists. Note here that this assumption could be changed by assuming boundedness of a higher order derivative and performing a higher-order Taylor expansion. Appendix A gives rigorous proofs of these ergodicity statements. Hence,
where is the equilibrium measure for while is the Gaussian measure of . Using the Isserlis-Wick formula we can compute the higher moments of multivariate Gaussian random variable from the second-order moments. Indeed, we have
where means summing over all distinct ways of partitioning into pairs. Moreover, , hence, applying (34) into (33) and changing the multi-index notation to the usual notation, the entropy production rate becomes
Using that , entropy production is rewritten as