Pseudo generators for under-resolved molecular dynamics

# Pseudo generators for under-resolved molecular dynamics

Andreas Bittracher bittrach@ma.tum.de Fakultät für Mathematik, Technische Universität München, D-87548 Garching Carsten Hartmann chartman@mi.fu-berlin.de Institut für Mathematik, Freie Universität Berlin, D-14195 Berlin Oliver Junge oj@tum.de Fakultät für Mathematik, Technische Universität München, D-87548 Garching Péter Koltai peter.koltai@fu-berlin.de Institut für Mathematik, Freie Universität Berlin, D-14195 Berlin
July 13, 2019
###### Abstract

Many features of a molecule which are of physical interest (e.g. molecular conformations, reaction rates) are described in terms of its dynamics in configuration space. This article deals with the projection of molecular dynamics in phase space onto configuration space. Specifically, we study the situation that the phase space dynamics is governed by a stochastic Langevin equation and study its relation with the configurational Smoluchowski equation in the three different scaling regimes: Firstly, the Smoluchowski equations in non-Cartesian geometries are derived from the overdamped limit of the Langevin equation. Secondly, transfer operator methods are used to describe the metastable behaviour of the system at hand, and an explicit small-time asymptotics is derived on which the Smoluchowski equation turns out to govern the dynamics of the position coordinate (without any assumptions on the damping). By using an adequate reduction technique, these considerations are then extended to one-dimensional reaction coordinates. Thirdly, we sketch three different approaches to approximate the metastable dynamics based on time-local information only.

\subtitle

What does the Smoluchowski equation tell us about the spatial dynamics of molecular systems?

## 1 Introduction

Langevin and Hamiltonian dynamics constitute established models for the analysis of biomolecular processes by classical molecular dynamics. While they describe the system at hand through the evolution of configuration and momentum coordinates, many properties of interest, such as metastable conformations, conformational transition rates or folding pathways, are merely characterized by the configurational dynamics or the dynamics of few collective variables, called reaction coordinates, that span a low-dimensional submanifold of the configuration space (see, e.g., [1, 11, 12]).

Both from a computational and modeling point of view it is very appealing to describe a molecular system just by its position (or reaction) coordinates, since this drastically reduces the dimensionality of the problem. Over decades, it has been of major interest to derive equations which govern the evolution of these coordinates either exactly [52, 31], or approximately with the smallest possible error [4, 29]. One popular model for molecular dynamics in position space that comes under various names like overdamped Langevin dynamics, Brownian dynamics, Kramers equation or Smoluchwski equation is obtained by the so-called Smoluchowski-Kramers approximation of the Langevin equation [46, 27]; see also [20, 17] and the references therein. Yet it is unclear whether there are conditions beyond the asymptotic regime of the Kramers-Smoluchowski approximation (i.e. the high-friction limit), under which the Smoluchowski equation accurately captures e.g., the folding dynamics of a protein in terms of a one-dimensional reaction coordinate. For a general account of this topic we refer to [19].

In this article, we discuss the accuracy of the Smoluchowski equation for the spatial dynamics of a molecular system under various parameter regimes where, in each case, our analysis departs from the Langevin equation in phase space.111We use the terms spatial, position(al), and configuration(al) interchangeably, when referring to coordinates or dynamics. Our presentation of the topic is not claimed to be exhaustive; it rather reflects the authors’ interests, and their wish to understand how the hierarchies of models used in molecular dynamics relate to each other. Parts of this article are based on the recent work [2] by some of the authors, however, their analysis in the context of long time scales and in non-Cartesian geometries is new. The main contribution of this article is that it sketches solutions to answer the following questions:

1. What is the appropriate generalization of the Smoluchowski equation in generalized (non-Cartesian) coordinates, to be used, for example, in reduced-order models of protein folding or polymers, and how is it related to a phase space description of a molecular system?

2. Is there a closed equation for the spatial dynamics on small time intervals if the underlying phase space dynamics is governed by a Langevin equation?

3. How well (and in which sense) is a system’s metastable behaviour approximated by the Smoluchowski dynamics when the phase space dynamics is generated by Langevin dynamics? What are the time scale regimes on which the approximation of the metastable dynamics by the Smoluchowski equation may be used?

The manuscript is organized as follows: Section 2 introduces the basic model of molecular dynamics in terms of deterministic and stochastic differential equations and describes an operator-based framework for the evolution of probability densities under these dynamics. This section also introduces the formulations of the stochastic equations in generalized coordinates in a non-Euclidean space. Section 3 reviews the concept of metastability based on density fluctuations in position space and establishes a connection between Langevin and Smoluchowski dynamics on short time scales. A numerically exploitable scheme which replaces the complicated position space density transport by a rescaled Smoluchowski transport is described, along with asymptotic error estimates. Section 4 reviews the approximation quality of these methods, gives improved error estimates and discusses the extension to longer times scales. A summary and possible future directions are given in Section 5.

## 2 Trajectory- and ensemble-based views

We consider a dynamical system described by positional degrees of freedom that represent a system of particles. Let denote the corresponding configuration space and the potential energy of a given particle configuration where we assume that the function is at least twice continuously differentiable, polynomially growing at infinity and bounded from below.

### 2.1 Models for molecular dynamics

We introduce three typical models for molecular dynamics. The simplest model to describe the motion of the particles in vacuum, i.e. without external influences like a solvent is given by Hamilton’s equations

 dqdt =∇pH(q,p) (1) dpdt =−∇qH(q,p),

where denotes the vector of conjugate particle momenta, and

 H=12p⋅M−1p+V(q)

is the Hamiltonian (total energy) of the system, with denoting the mass matrix. Depending on the type of system or when transformed to generalized coordinates, the mass matrix can be a general symmetric positive-definite, possibly position-dependent matrix.

In the presence of a heat bath or solvent, one typically adds a drift-diffusion term, arriving at the Langevin equation,

 dqdt =∇pH(q,p) (2) dpdt =−∇qH(q,p)−γ∇pH(q,p)+σζt .

The term , with being symmetric positive definite, models the drag through the solvent, the term accounts for random collisions with the solvent particles [35, Chap. 4]. Here, is an uncorrelated, zero-mean white noise process that can be formally interpreted as the (generalized) derivative of a standard -dimensional Brownian motion, and is the noise covariance matrix. In order to keep the system at a constant average kinetic energy, damping and excitation have to be balanced, which is ensured by assuming that noise and drag coefficients satisfy the fluctuation-dissipation relation

 2γ=βσσT,

where is the inverse temperature in the system. Choosing or is a modelling issue and thus depends on the particular problem at hand. As we will see later on, both and may even be position dependent.

If the friction in the system is uniformly large, i.e. for all , the Langevin equation can be replaced by the Smoluchowski equation

 γdqdt=−∇V(q)+σζt, (3)

or, using the common notation for Itô stochastic differential equations (see [36]),

 γdqt=−∇V(qt)dt+σdwt,

where is a standard Brownian motion in .222 In the following, we will interpret stochastic differential equations such as (2) or (3) in the sense of Itô, which implies that, for stochastic differentials such as , a generalized chain rule known as Itô’s formula or Itô’s Lemma applies [36, Theorem 4.2.1]. The Smoluchowski equation is also termed overdamped Langevin equation and can be derived from (2) by letting under a suitable rescaling of time; for a precise statement, see [35, Theorem 10.1] or the derivation given below on pp. 914.

In some cases a description of the stochastic dynamics in a different coordinate system is needed. The aim of this subsection therefore is to derive the Smoluchowski equation in generalized coordinates. This is most conveniently done by resorting to the canonical form of the Langevin equation (2) that we will state first.

#### Langevin equation in generalized coordinates.

To state the Langevin equation (2) in canonical form, we consider a diffeomorphism between configuration spaces that has a cotangent lift given by

 (q,p)↦(Φ−1(q),((∇Φ∘Φ−1)(q))Tp).

Using Itô’s formula [36, Theorem 4.2.1], the Langevin equation (2) can be written in the new configuration variables and their conjugate momenta : introducing the new (possibly position-dependent) drag and noise coefficients by

 ~γ=∇ΦTγ∇Φ,~σ=∇ΦTσ, (4)

the Langevin equation can be recast as [21, 23]

 dudt =∇v~H(u,v) (5) dvdt =−∇u~H(u,v)−~γ(u)∇v~H(u,v)+~σ(u)ζt.

Here denotes the push-forward of the Hamiltonian to the new coordinate system,

 ~H(u,v)=12v⋅(G(u))−1v+~V(u),

with and being the mass metric tensor induced by the transformation .

It can be readily seen that, when the original drag and noise coefficients satisfy the fluctuation-dissipation relation, then so do the transformed coefficients:

 2~γ=β~σ~σT. (6)

#### Derivation of the Smoluchowski equation in generalized coordinates.

It is now possible to derive the Smoluchowski equation in generalized coordinates from the canonical Langevin dynamics (5) using formal asymptotics. To this end, let us scale the original drag and noise coefficients according to and where is a small parameter. Clearly, the scaling preserves the fluctuation-dissipation relation (6), and it leads to the Langevin equation

 dudt =∇vH(u,v) (7) dvdt =−∇uH(u,v)−1εγ(u)∇vH(u,v)+1√εσ(u)ζt.

For notational convenience, we have dropped the twiddle signs on the transformed Hamiltonian and the coefficients and . To study the limit of (7) we seek a perturbative expansion of the associated backward Kolmogorov equation333The Kolmogorov backward equation is a partial differential equation governing the evolution of observables. Specifically, if is the solution of an Itô stochastic differential equation, such as (7), then, for any integrable function , satisfies with initial condition . Here is the infinitesimal generator associated with the process , and denotes the conditional expectation over all realizations of the process starting at . We introduce the dual concept of forward equation and the corresponding generator in Section 2.2; cf. also [36, Section 7.3].

 ∂tϕε(u,v,t)=ALanϕε(u,v,t),ϕε(u,v,0)=ϕ0(u,v) (8)

following the approach described in [37, 39]. To begin with, we notice that the backward operator in (8) admits the decomposition (see also p. 24 below)

 ALan=AHam+1εAOU,

with

 AHam=∇vH⋅∇u−∇uH⋅∇v

and

 AOU=12σσT:∇2v−(γG−1v)⋅∇v

We consider a perturbative solution of (8) that is of the form

 ϕε=ϕ0+εϕ1+ε2ϕ2+…

with . Inserting the ansatz into the backward equation and equating powers of we obtain a hierarchy of equations, the first three of which read

 AOUϕ0 = 0 (9) AOUϕ1 = ∂tϕ0−AHamϕ0 (10) AOUϕ2 = ∂tϕ1−AHamϕ1. (11)

Note that is a second-order differential operator in with appearing only as a parameter. By the assumption that is symmetric positive definite with uniformly bounded inverse, (9) implies that does not depend on . By a standard closure argument (a.k.a. centering condition), it thus follows that .

Closely inspecting the resulting equations (10)–(11), the next nontrivial term, , is found to be the solution of the backward equation

 ∂tψ=−∫Rd(AHamA−1OUAHamψ)ϱu(v)dv, (12)

where is independent of , and is the solution to , with being the formal adjoint of . Equation (12) must be equipped with a suitable initial condition .

Before we evaluate the right hand side of (12), a few remarks are in order:

1. The function in (12) is the unique invariant probability density with respect to the Lebesgue measure on the momentum space (that we can identify with ) of the Ornstein–Uhlenbeck process generated by . It is given by

 ϱu(v)=(β2π)n/2(det(G(u)))−1/2exp(−β2v⋅(G(u))−1v) (13)

and satisfies .

2. The inverse of the operator in (12) is only unambiguously defined when it is acting on functions that are in the range of . By the Fredholm alternative (see [51, Sections 3.12 & 4.3]), the range of a linear continuous operator on some Banach space is the orthogonal complement of the kernel of its adjoint:

 ran(AOU)=(ker(A∗OU))⊥=V0,

where

 V0:={f∈L2μ(X′):∫Pf(v,u)ϱu(v)dv=0}⊂L2μ(X′).

That is to say that the linear equation has a solution, if and only if averages to zero under . Note that is linear in the momenta , hence . As a consequence, in (12) is well defined.

The formal expansion (9)–(12) now suggests that the solution of the Langevin-based backward equation (8) and the solution to the limiting system (12) satisfy

 ∥ϕε(⋅,t)−ψ(⋅,t/ε)∥V→0,ε→0. (14)

for some suitable norm on . Indeed, standard results from homogenization theory for parabolic partial differential equations (e.g. [37, 38]) imply that, under certain regularity assumptions on the coefficients, the convergence is uniform in for any finite (cf. also [35, Theorem 10.1]).

As we show in the appendix, the operator on the right hand side of (12) reads

 ¯A=β−1~Δ−∇V⋅~∇, (15)

where

 ~∇=γ−1∇ and ~Δ=1√detγ∇⋅(√detγγ−1∇),

denote gradient and Laplace-Beltrami operator with respect to . The differential operator has a straighforward interpretation as the infinitesimal (backward) generator of the Smoluchowski dynamics on the configuration space , with the position dependent drag matrix acting as metric tensor on . Alternatively, one may regard as the generator of the Smoluchowski dynamics on a Riemannian manifold endowed with the metric tensor and a position dependent friction matrix . Our findings are summarized in the next lemma.

###### Lemma 1.

The Smoluchowski equation in generalized coordinates reads

 γ(u)dudt=−∇V(u)+g(u)+σ(u)ζt,u0=u, (16)

where has the entries

 gi=1β∑j,kγij1√detγ∂∂uk(√detγγkj). (17)
###### Remark 1.

The additional drift term in the Smoluchowski equation is due to the geometry of the configuration manifold and the interpretation of the Smoluchowski equation in the sense of Itô. Formally, it can be seen to be related with the first order derivative in the expression for the Laplace Beltrami operator in (15).

###### Remark 2.

The Smoluchowski equation (16) in generalized coordinates likewise follows by transforming the original Smoluchowski equation (3) in Cartesian coordinates to the new coordinate system using Itô’s Lemma [36, Theorem 4.2.1]. As a consequence, the stochastic convergence of the spatial component of the high friction Langevin equation to the solution of the (time-rescaled) Smoluchowski equation that has been proved in [35, Theorem 10.1] should be inherited by its non-Cartesian counterpart.

### 2.2 The transfer operator

We shall now examine how probability densities evolve under the Langevin or the Smoluchowski dynamics. To this end, call or the state vector, depending on which type of dynamics is considered, and or the phase space or state space, respectively. For any given , we seek the probability density of for some with respect to the natural (Liouville or Lebesgue) measure on . Now let be any measurable subset of , and define the stochastic transition function444It is common to denote the transition function and transition probabilities by . We hope that the clash in the notation with the conjugate momenta is not going to confuse the reader, since the transition function and -probabilities are always going to be functions of three variables. of the dynamics by

 p(t,x,B)=Prob[xt∈B|x0=x], (18)

Further call

 pμ(t,A,B):=Probμ[xt∈B|x0∈A] (19)

the transition probability between the measurable sets and , where indicates that the initial condition is distributed according to a probability measure . For the long term macroscopic behaviour of the system, sets play an important role for which  for some physically relevant measure  and some characteristic times .

The transition probability can be derived from a stochastic transition kernel or transition density via

 p(t,x,B)=∫BΨ(t,x,y)dy, (20)

where is the fundamental solution of the Fokker-Planck equation (26). Existence and uniqueness of the transition density follow relatively under mild conditions (see e.g. [28, Chap. 11.7]).

Now let an initial density be given. The density describing the distribution of the system at time is then given by

 ft(y)=∫Xf0(x)Ψ(t,x,y)dμ(x) (21)

Equation (21) can be seen as the definition of the transfer operator with lag time :

 Ptf0(x):=ft(x).

By linearity we can extend the definition of  from probability densities to arbitrary integrable functions, and it will be convenient in what follows to consider the transfer operator as a family of linear maps . This family of linear operators has the Chapman-Kolmogorov (or semigroup) property:

1. ,

2. for all .

We also have non-expansiveness in the induced operator norm, , and positivity, for .

The transition probabilities (19) can be conveniently expressed in terms of the transfer operator. If we define the scalar product on the space of square integrable functions by

 ⟨f,g⟩μ=∫Xf(x)g(x)dμ(x),

then, for any measurable set with ,

with  being the indicator function.

#### The forward generator.

The semigroup property means that  is “memoryless”, i.e. that (1)–(3) generate a Markov process. Noting that

 Pt=(Pt/n)n,

we may conclude all relevant information about the density transport is already contained in for arbitrarily small . This is formalized by looking at the operator

 Lf=limτ→0Pτf−fτ (22)

that is defined for all , for which the limit exists. is called the forward generator or infinitesimal generator of .

From its general form for arbitrary Itô diffusions [26, p. 282], we can derive the generator for our dynamics. For the Hamiltonian dynamics (1) and functions , where is equipped with the supremum norm, the operator is given by

 LHam=∇qH⋅∇p−∇pH⋅∇q, (23)

where the dot denotes the Euclidean inner product, and , are the gradients with respect to  or . In case of Langevin dynamics (2) and , we have

 LLan=LHam+12σσT:∇2p+γ∇pH⋅∇p+γ:∇2pH, (24)

where the notation denotes the inner product between matrices , and denotes the matrix of second derivatives with respect to . Finally, the generator of the Smoluchowski dynamics (3) reads

 LSmol=β−1γ−1:∇2q+(γ−1∇qV)⋅∇q+γ−1:∇2qV, (25)

with being the matrix of second derivatives in , and we have used that .

#### Fokker-Planck equations and invariant measures.

By definition of the forward generator, the evolution of probability densities associated with any of the stochastic dynamics (2)–(3) is described by a parabolic transport equation of the form

 ∂tft=Lft,ft=0(x)=g(x), (26)

that are called Kolmogorov forward equations or Fokker–Plack equations [26], with being either or . When , then the Fokker-Planck equation with turns into the Liouville equation that describes the transport of probability densities under the Hamiltonian dynamics (1).

Probability measures that are invariant under the dynamics play a prominent role. The corresponding densities are fixed points of for any , and equation (22) implies that they lie in the kernel of . For the stochastic processes considered here, the invariant density can be shown to be unique (cf. [33]). For the Langevin dynamics (2), the unique invariant probability density is the canonical density

 fcan(q,p) =1Zexp(−βH(q,p)) (27) =1ZPexp(−β2p⋅M−1p)=:fP(p) 1ZQexp(−βV(q))=:fQ(q),

with and

For the Smoluchowski dynamics (3), the unique invariant measure has the density , which is called the Boltzmann density or Gibbs-Boltzmann density. (We assume throughout that is integrable).

Under fairly mild assumptions on the potential , the invariant densities can be shown to be the unique asymptotically stable fixed point of , which implies that converges to the stationary distribution for any initial density [34]. The Liouville equation associated with the Hamiltonian dynamics (1) is known to have infinitely many stationary solutions, among which is .

###### Remark 3.

It can be readily seen that the Smoluchowski dynamics (16) in generalized coordinates has the unique invariant probability measure

 dρ(u)=(fQ∘Φ)(u)dΣ(u), (28)

with

 dΣ(u)=√deth(u)du

being the Riemannian volume element on where is the corresponding metric tensor. Note that (28) is simply the pullback of the Boltzmann distribution in Cartesian coordinates by the coordinate transformation . As a consequence, is the -marginal of the canonical density . To see this, replace the metric tensor on by the generalized mass matrix or the corresponding expression for the friction coefficient . This does not change the invariant measure as the constant mass or drag matrices cancel out.

### 2.3 More on semigroups and their generators

Before we proceed, let us recall two results relating the transfer operator semigroup and its generator. For our purposes the main connection between them is given by the following

###### Theorem 1 (Spectral mapping theorem [40]).

Let be a Banach space, , , a semigroup of bounded linear operators (i.e.  as for every , and  bounded for every ), and let be its infinitesimal generator. Then

 etσ∙(A)⊂σ∙(Tt)⊂etσ∙(A)∪{0},

with denoting the point spectrum. The corresponding eigenvectors are identical.

Evidently, a function is an invariant density of for all , if and only if . Further, since , the eigenvalues of lie in the left complex half-plane. The family can be approximated by a truncated “Taylor series”:

###### Example 1 ([2]).

If is times continuously differentiable and , , is square-integrable with respect to , then

 ∥∥Ptf−N∑n=0tnn!Lnf∥∥L2μ=O(tN+1)  for t→0,

where denotes the -norm with respect to .

## 3 Spatial dynamics and metastability

Consider an infinite number of systems modeled by (2) in thermodynamic equilibrium, i.e. identically and independently distributed according to (this collection of systems is called an ensemble in statistical mechanics). We are now interested in the portion of these systems which undergo a certain configurational change, i.e. leave a subset and enter another subset . For this, we track the evolution of , which is given by . This will be the starting point of the subsequent analysis.

#### The spatial transfer operator.

Since we are only interested in the distribution of their configurations at time , we compute the marginal with respect to . The resulting spatial transfer operator for some  is [42, 50]

 Stu(q):=1fQ(q)∫PPtLan(u(q)fcan(q,p)) dp. (29)

#### Metastability on configuration space.

Using the scalar product

 ⟨u,v⟩fQ:=∫Qu(q)v(q)fQ(q)dq

(which gives rise to the norm ), and the “slice” in state space, we define transition probabilities between slices via

 p(t,Γ(A),Γ(B))=⟨StχA,χB⟩fQ⟨χA,χA⟩fQ. (30)

We call a disjoint union of position space metastable or almost invariant if

 p(t,Γ(Aj),Γ(Aj))≈1, j=1,…,n

for the time scales  of interest. Other, more sophisticated, notions of metastability can be found in [43].

The link between almost invariant/metastable sets and eigenvalues close to one and the corresponding eigenvectors of some transfer operator was first established in [8] and used for conformation dynamics in [9]. We here cite an extension to a broader class of transfer operators from [25].

###### Theorem 2 (Application of [25], Theorem 2).

Let with and be the largest eigenvalues of , with eigenvectors . Let be a measurable partition of and be the orthogonal projection onto . Then

 1+ρ2λ2+⋯+ρnλn+c≤p(t,Γ(A1),Γ(A1))+⋯+p(t,Γ(An),Γ(An))≤1+λ2+⋯+λn,

where and .

###### Remark 4.

We briefly discuss some specific properties of the spatial dynamics that are useful to understand the concept of pseudo generators outlined below.

1. The spatial transfer operator  from (29) satisfies the assumptions in Theorem 2; see [2, Appendix B]. Unfortunately, lacks the semi-group property and so cannot be the solution operator of an autonomous transport equation like the Fokker–Planck equation. Equivalently, spatial dynamics is not induced by an Itô diffusion process and thus has no infinitesimal generator in the sense of (26).

2. The closer the eigenvalues are to 1, the more metastable can a partition potentially be (upper bound in the theorem). How metastable a given partition is, is controlled by the  (lower bound), which measure the constancy of the eigenfunctions on the partition elements. The better the eigenfunctions can be approximated by piecewise constant functions over the partition, the closer the  are to 1, and the more metastability is guaranteed by the lower bound. Also, note that since  is not a semigroup, the eigenfunctions  depend on  (cf. Figure 1 below). As a consequence, metastability of a partition must be understood with respect to the characteristic time scale ;

3. It is not necessary that the sets  form a full partition, i.e. . Similar results to the above have been obtained for non-complete partitions, where the  are considered to be cores of the metastable sets; cf. [43, Section 5].

### 3.1 Pseudo generators

Even though the spatial dynamics is lacking the semigroup property, it is possible—at least formally and in analogy with (22)—to differentiate at . We will see in the following that the resulting operators can play the role of the infinitesimal generator in the context of metastability analysis.

###### Definition 1.

Let be a Banach space, be a time-parametrized family of bounded linear operators. The operator

 ddtTtf=limh→0Tt+hf−Ttfh

is called the time-derivative of . Iteratively, we define by the -th time-derivative. The operator

 Gn:=dndtnTt∣∣t=0

is called the -th pseudo generator of .

For , the transfer operator of an Itô process, the pseudo generators are simply , where is the infinitesimal (forward) generator.

The pseudo generators of the spatial transfer operator can be expressed by the generator of the full Langevin transfer operator:

###### Lemma 2 ([2]).

The -th pseudo generator of takes the form

Explicitly, we have

1. ,

2. . In particular, is independent of .

Surprisingly, one has

###### Corollary 1 ([2]).

The pseudogenerator (of the spatial transfer operator) is the infinitesimal generator of the Smoluchowski dynamics:

 G2=GSmol.
###### Remark 5.

Note that  has the form of the backward Smoluchowski generator  (cf. Section 4). Still, is also the forward generator of the Smoluchowski process, if distributions are thought of as distributions with respect to the Gibbs–Boltzmann density . This is in accordance with the definition of the spatial transfer operator (29), which also describes redistribution of mass with respect to . The formal coincidence “” is not accidentally, but rather it reflects the reversibility of the Smoluchowski process.

#### Taylor reconstruction of the spatial transfer operator.

It is natural to ask whether there is an analogue of Proposition 1 for and its pseudo generators. We have the following result:

###### Theorem 3 ([2]).

If is sufficiently regular, then

 ∥∥ ∥∥Stu−K∑k=0tkk!Gku∥∥ ∥∥L2fQ=O(tK+1),(t→0).

Unfortunately, for , higher derivatives of the potential  appear in the expressions for , which are therefore impractical to work with, while the gradient  is typically available. We call

 Rtu :=(id+t22G2)u=u+t22(1βΔu−∇u⋅∇V) (31)

the (2nd order) Taylor approximation of  such that if  is sufficiently regular,

 ∥∥Stu−Rtu∥∥L2fQ=O(t3),(t→0).

#### Exponential reconstruction.

Unfortunately, unlike , is not norm-preserving for densities with respect to . Therefore, when transporting , we lose the interpretation of as a physical density. Moreover, is not even bounded in [2]. This quickly (i.e. for small ) destroys the interpretation of the eigenvalues of as a measure of metastability.

An alternative approximation to preserves those properties. The Taylor approximation (31) already suggests that  behaves similarly to a -scaled Smoluchowski dynamics. Hence, we define

 Etf:=Pt2/2Smolf,

where is the semigroup of transfer operators generated by . This operator is integral-preserving with respect to the weight  [2], and we get the following analogue for Proposition 1:

###### Lemma 3 ([2, Lemma 4.10]).

If is sufficiently regular, then for ,

 ∥∥∥Etu−N∑n=0(t22G2)nn!u∥∥∥L2fQ=O(t2N+1).

In particular,

 ∥∥Etu−Stu∥∥L2fQ=O(t3)(t→0).

#### Reconstruction of eigenspaces.

The error asymptotics carries over to the spectrum and eigenvectors of , and in the following way:

###### Corollary 2 ([2]).

Let be a sufficiently regular eigenvector of  or of  to eigenvalue . Then

 ∥Stu−λu∥L2fQ=O(t3).

Thus, for small  we may interpret dominant eigenpairs  of  and  as good approximations to dominant eigenpairs of . Hence, they can be used to define metastable sets following the spatial decomposition approach in [10]. The eigenfunctions of interest, those of , , and , can be shown to be sufficiently regular under fairly general conditions, cf. [2, Appendix C].

###### Remark 6.

Corollary 2 is in accordance with functional analytical results by Nier and co-workers (e.g. [24]) that show that the dominant spectrum of the non-reversible Langevin dynamics is real-valued and close to the spectrum of the reversible Smoluchowski dynamics, even for moderate values of the friction coefficient. In [24], this somewhat surprising result is obtained by large deviations arguments for the small-noise limit using the Witten Laplacian representation of the (hypoelliptic) Langevin generator, whereas the considerations here and in [2] are based on small-time asymptotics of the spatial Langevin dynamics. We believe that a connection between these results is that the small-noise limit can be understood as an exponential rescaling of time as is suggested by large deviations theory; cf. [41]. We refrain from going into details here and leave the analysis of this interesting connection to future work.

### 3.2 Towards spatial generators in essential coordinates

We have discussed the concept of the spatial transfer operator that is obtained by projecting the phase space dynamics onto the spatial components. We shall now consider the restriction of the dynamics to a given collective variable, also termed essential coordinate. To this end, let be a smooth map with the property that, for every regular value of , the level sets

 Mz={q∈Q:ξ(q)=z}⊂Q

are smooth submanifolds of with codimension 1 (i.e. hypersurfaces). We suppose that is a physically relevant observable of the dynamics, such as a reaction coordinate or some collective variable that monitors a conformational transition, and call the essential coordinate; the unessential coordinates are then implicitly defined through the foliation of by the map , in other words: the unessential coordinates parameterize the leaves of the foliation for every (regular) value of .

To define the analogue of the spatial transfer operator (29) for the essential coordinate, firstly note that [15, Section 3.2]

 ∫Qg(q)dq=∫Z(∫Mzg|∇ξ|−1dσz)dz (32)

for any integrable function where denotes the Riemannian volume element on . Equation (32) is called the coarea formula and can be considered a nonlinear variant of Fubini’s theorem.

Together with the law of total expectation, the coarea formula thus entails that the canonical probability measure conditional on has the form

 dμz=1N(z)fcan|∇ξ|−1dσzdp, (33)

with the normalization constant

 N(z)=∫Mz×Pfcan|∇ξ|−1dσzdp. (34)

The spatial transfer operator for essential coordinates can now be defined as

 Stessw(z):=1N(z)∫Mz×PPtLan(w(ξ(q))fcan(q,p))|∇ξ(q)|−1dσzdp, (35)

#### Projected pseudo-generators.

To compute the corresponding pseudo-generators, let be the configurational marginal probability measure that is obtained by projecting onto the configurations by integrating out the momenta, i.e., . Let us further introduce a projection operator by

 (Πzu)(z)=1NQ(z)∫Qu(q)fQ(q)|∇ξ(q)|−1dσz(q) (36)

where is the -marginal of and is the corresponding normalization constant for the conditional density. It can be readily seen that, is an orthogonal projection with respect to the natural (weighted) scalar product in the space and amounts to the expectation of functions with respect to conditional on .

Thus, for functions , the reduced spatial transfer operator and the spatial transfer operator (29) are related by (cf. [42])

 Stessw(z)=(ΠzStu)(z). (37)

The last identity is helpful in computing the corresponding pseudo generators . Here we are interested only in the second pseudo generator , for which we have the following analogue of Lemma 2:

###### Lemma 4.

For sufficiently smooth functions , the -th pseudo generator of reads

 Gessnw(z)=(ΠzGnu)(z)

Specifically, we have

 Gess2=β−1a(z)∂2∂z2+b(z)∂∂z,

with the noise and drift coefficients

 a(z)=(Πz|∇ξ|2)(z),b(z)=(Πz(β−1Δξ−∇ξ⋅∇V))(z).
###### Proof.

The first part of the assertion is a straight consequence of Lemma 2 and the coarea formula. As for the second part, observe that the second pseudo generator is given by which by chain rule implies:

 G2w(ξ(q))=β−1|∇ξ|2w′′(z)|z=ξ(q)+(β−1Δξ−∇ξ⋅∇V)w′(z)|z=ξ(q).

Letting the projection act from the left using that and likewise gives the desired result. ∎

A few remarks are in order:

1. In accordance with Corollary 1, the second projected pseudo generator is the infinitesimal generator of the diffusion

 dzdt=b(z)+√2β−1σ(z)ζt, (38)

with and being a one-dimensional uncorrelated Gaussian white noise process. Equation (38) has been derived by Legoll and Lelièvre [29] using first-order (Markovian) optimal prediction.

2. In [29], the authors prove an error bound for the projected dynamics (38) under the assumption that the conditional probability satisfies a logarithmic Sobolev inequality. We refrain from transferring the analysis to our situation as logarithmic Sobolev constants are difficult to estimate (beyond the case of strictly convex potentials or in the zero-noise limit), hence the approach is of limited practical use. Nonetheless, we believe that the projected pseudo generator will provide a good approximation of the dominant spectrum of whenever is a slow coordinate relative to the unessential configuration variables and the momenta.

3. If is bounded above and away from zero, it can be shown (see [29]) that the process generated by has the unique invariant measure

 dν(z)=exp(−βF(z))dz

with

 F(z)=−β−1log∫fQ|∇ξ|−1dσz

being the thermodynamic free energy in the essential coordinate. Note that is the push-forward of the canonical distribution by (i.e. the -marginal). Naively, one might expect the projected Smoluchwski equation to be of the form

 dydt=−F′(y)+√2β−1ζt, (39)

and it can be shown that (38) can be transformed into (39) according to using Itô’s Lemma with being the volatility transform

 φ(z)=∫z0(σ(s))−1ds

that leads to a Smoluchowski equation with unit noise coefficient [12]. As a consequence, (38) can be equivalently expressed as

 dzdt=−a(z)F′(z)+β−1a′(z)+√2β−1σ(z)ζt, (40)

which is exactly the one-dimensional analogue of (16)–(17) with .

4. In order to use in metastability analysis (analogous to in section 3.1), it has to be discretized. The method of choice is spectral collocation due to the regularity of the objects of interest [2] (i.e. eigenfunctions of ). Here, collocation requires the evaluation of for ansatz functions at collocations points (see Section 5 for details). This in turn requires the evaluation of the noise and drift-coefficients in Lemma 4, which involve (potentially high-dimensional) integrals that represent averages over the non-essential degrees of freedom; see, e.g., [6, 22, 30] for Monte-Carlo methods to efficiently compute these high-dimensional integrals.

5. We should stress that Lemma 4 can be readily generalized to multidimensional reaction coordinates, however, in general (expect for the case of pairwise orthogonal reaction coordinates) it is unclear whether the physical interpretation of the projected equation as a reversible diffusion in the free energy landscape is retained.

## 4 Approximation quality for larger time scales

We have seen in Section 3.1 that approximates well (pointwise) for small times . However, for metastability analysis, spectral properties of the spatial operator for larger time scales are of interest. In this section we make use of two well-known techniques—perturbation expansion, already seen in Section 2.1, and the Mori–Zwanzig formalism—with the aim of explaining the approximation quality of pseudo generator reconstructions of , and extending them to larger time scales. Then, we discuss how to utilize the ergodicity of the Langevin process to show an almost Markovian behaviour of the spatial dynamics on long time scales. This eventually leads to a bound on the time scale on which the spatial dynamics is well approximated.

### 4.1 Perturbation expansion

The idea of perturbation expansion rests on the assumption that there exists a small problem parameter in which one can expand the objects of interest in a (formal) power series. As in Section 2.1, here this small parameter is the inverse of the damping in the Langevin dynamics, i.e.  where, for simplicity, we assume that the friction coefficient is scalar. For ease of presentation, we set the inverse temperature .

It turns out to be advantageous to work with the propagators (Koopman operators) instead of the transfer operators themselves. The difference is only of technical nature, since the propagators are the adjoints of the corresponding transfer operators. Denoting the propagators of the Langevin, Smoluchowski, and spatial dynamics by , , and , respectively, we have the explicit representations

 TtLanu(q,p) = E[u(qLant,pLant)∣∣qLan0=q, pLan