Efficiency of a micro-macro acceleration method for scale-separated stochastic differential equations
We discuss through multiple numerical examples the accuracy and efficiency of a micro-macro acceleration method for stiff stochastic differential equations (SDEs) with a time-scale separation between the fast microscopic dynamics and the evolution of some slow macroscopic state variables. The algorithm interleaves a short simulation of the stiff SDE with extrapolation of the macroscopic state variables over a longer time interval. After extrapolation, we obtain the reconstructed microscopic state via a matching procedure: we compute the probability distribution that is consistent with the extrapolated state variables, while minimally altering the microscopic distribution that was available just before the extrapolation. In this work, we numerically study the accuracy and efficiency of micro-macro acceleration as a function of the extrapolation time step and as a function of the chosen macroscopic state variables. Additionally, we compare the effect of different hierarchies of macroscopic state variables. We illustrate that the method can take significantly larger time steps than the inner microscopic integrator, while simultaneously being more accurate than approximate macroscopic models.
Keywords and phrases: Monte Carlo methods, Micro-macro acceleration, stochastic differential equations, stiff differential equations
Many applications in science and engineering are modeled with stochastic differential equations (SDEs)
where is a diffusion process on a domain . The drift term has the same dimension as , the diffusion term and represents an -dimensional Brownian motion. In practice however, we are often only concerned with the evolution of a few macroscopic state variables , which are defined as the expectation of some functions of interest over the distribution of , i.e.,
In many situations, it is difficult or impossible to derive a closed model for the dynamics of the macroscopic states . Instead, we need to approximate the expectation in (2) at different points in time via Monte Carlo methods and simulate the SDE (1) for each of the generated samples.
In this work, we are concerned with SDE systems that are stiff, in the sense that individual microscopic paths vary over fast time scales, while the evolution of the macroscopic state variables takes place over much longer time scales. Due to the discrepancy in time scales, explicit time discretization methods, such as the Euler-Maruyama scheme, are forced to take small time steps as they have a restricted stability domain. Hence, explicit methods are inadequate to simulate the macroscopic state variables up to a large end time .
Over the years, many multiscale methods have been proposed to overcome the problem of stiffness. We mention implicit methods [tian2001implicit, amiri2015class], S-ROCK [abdulle2008s], the equation-free framework [kevrekidis2009equation, kevrekidis2003equation] and the heterogeneous multiscale method (HMM) [weinan2003multiscale, abdulle2012heterogeneous]. Implicit methods have proven to be very successful for stiff ODEs since they allow for much larger time steps than their explicit counterparts, due to a much larger stability domain, [byrne1987stiff]. The cost of implicit time steppers is usually higher, since at every time step a (non-)linear system needs to be solved. Nevertheless, implicit schemes are a major improvement over explicit time integrators on stiff ODEs because they allow taking much larger time steps. However, a problem arises when applying implicit techniques to SDEs [burrage2004numerical]. The authors of [li2008effectiveness] showed that these methods are unable to capture the probability distribution of the fast modes when taking large time steps. The idea of S-ROCK is to extend the stability domain maximally along the negative real axis by bounding the stability domain by a Chebyshev polynomial [abdulle2008s, abdulle2007stabilized, komori2012weak]. One can then construct a Runge-Kutta scheme that has this stability domain. A drawback is that S-ROCK schemes do not attain a high convergence order [komori2013strong].
Both the equation-free framework [kevrekidis2009equation] and the heterogeneous multiscale method [abdulle2012heterogeneous] can yield considerable simulation speed-ups. However, the convergence analysis is usually restricted to showing converge in the limit when the time-scale separation becomes infinite. In that limit, the microscopic dynamics itself converges to a limiting macroscopic model, and both the equation-free method and HMM recover that limiting macroscopic model when the time-scale separation becomes infinite[abdulle2012heterogeneous]. An alternative approach is to average out the fast dynamics, either analytically or numerically, to obtain a the evolution of the slow dynamics only [legoll2010effective, pavliotis2008multiscale]. All of these methods introduce a modeling error when the time-scale separation is finite. However, in cases of large time-scale separation, the limiting macroscopic model becomes accurate and this modeling error becomes small, compared to the time discretization error.
Recently, a new micro-macro acceleration algorithm [debrabant2017micro] was introduced as an alternative to the above techniques. Micro-macro acceleration exploits the stiffness of the SDE by introducing a second time step, besides the microscopic time step , which is used to simulate the stiff SDE (1). This second, much larger time step , is introduced to extrapolate the evolution of the macroscopic state variables . One time step of micro-macro acceleration consists of four steps: (i) Monte Carlo simulation of the SDE (1) with a small time step ; (ii) restriction to approximate the macroscopic state variables (2) at every microscopic time step; (iii) extrapolation of the macroscopic state variables over a time interval of size ; and (iv) matching to generate a new microscopic probability distribution, consistent with the extrapolated macroscopic states, and with minimal deviation from a prior distribution (which we choose to be the final microscopic distribution obtained during the simulation stage). There are many ways of expressing the deviation of one probability distribution from another. One possibility is minimizing the -norm between the matched and prior distribution, but this choice does not guarantee positivity of the matched distribution [debrabant2017micro]. In this text, we consider matching in Kullback-Leibler divergence, which is based on notions from information theory [lelievre2018analysis]. We provide a more detailed mathematical description of the algorithm in Section 2.
It is shown that micro-macro acceleration can converge to the exact dynamics of the SDE, even when fixing a finite time-scale separation, for both matching in the -norm [debrabant2017micro] and in Kullback-Leibler divergence [lelievre2018analysis]. The conditions under which convergence has been demonstrated are the following: (i) the number of state variables must increase to infinity; and (ii) the time steps present (the microscopic time step and the extrapolation time step ), must tend to zero [lelievre2018analysis, debrabant2017micro]. Additionally, it was shown that the method is stable for the extrapolation step sizes independent of the fast time scales in the system [debrabant2018study].
The convergence analysis in [debrabant2017micro, lelievre2018analysis] reveals that the errors in the micro-macro acceleration method that are caused by the combination of extrapolation and matching, can be viewed as a time discretization error, with a size that depends on the extrapolation time step. In contrast, equation-free [kevrekidis2009equation, kevrekidis2003equation] and heterogeneous multiscale methods [weinan2003multiscale, abdulle2012heterogeneous] introduce a modeling error, because they compute the time evolution of an approximate macroscopic model in which a numerical closure relation is introduced. Since micro-macro acceleration is stable for time steps that do not depend on the fast time scales, the extrapolation time step can be chosen based on accuracy considerations only. Hence, we need to study accuracy as a function of extrapolation time step to assess the efficiency of the method. This study is exactly the aim of the current work.
The main parameters that determine the accuracy of micro-macro acceleration are the following:
The number of macroscopic state variables . We may expect that for a larger value of , the matched distribution lies closer to the exact microscopic distribution, since the method converges as tends to infinity. However, matching is computationally more expensive when increases.
The choice of the functions of interest in equation (2) and their potential to capture the essential features of the underlying microscopic distribution. Two different hierarchies of macroscopic state variables may reach a different accuracy using the same number of state variables, due to the different nature of both hierarchies.
The extrapolation step size . When decreases, we may expect that micro-macro acceleration is more accurate, but the computational cost increases.
Based on the arguments above, we choose to study the accuracy that can be reached for a given number of state variables and extrapolation step size . We investigate the choice of and on a set of representative toy examples. In particular, we investigate for which parameters and micro-macro acceleration can be more accurate than the approximate macroscopic model (or numerical closure), while simultaneously being faster than a full microscopic simulation.
The text is organized as follows. In Section 2, we describe the micro-macro acceleration algorithm in mathematical detail. We discuss some implementation issues in Section 3. Section 4 is devoted to the choice of the hierarchy and the number of macroscopic state variables to extrapolate. We focus on two problems from molecular dynamics: FENE dumbbells and a simple three-atom molecule. In Section 5, we study the accuracy of micro-macro acceleration as a function of the extrapolation step , when fixing the number of macroscopic state variables. More specifically, we discuss two stochastic models with a time-scale separation for which an approximate macroscopic model can be derived for the slow component of the system, in the limit of infinite time-scale separation. The first example is an overdamped Langevin equation with a double-well potential; the second is a linear system with an external time-periodic force. We investigate for which extrapolation step size micro-macro acceleration is more accurate than the corresponding approximate macroscopic models. Finally, Section 6 presents a concluding discussion.
2 A micro-macro acceleration algorithm
In this Section, we introduce the micro-macro acceleration algorithm and delineate each of its four steps. We define the microscopic time-stepper that integrates the stochastic process (1) over small time steps using a weighted Monte Carlo ensemble in Section 2.1. In Section 2.2, we present the restriction operator to retrieve the macroscopic state variables from the microscopic Monte Carlo particle ensemble. Then, in Section 2.3, we introduce the extrapolation of the macroscopic state variables over a larger time step . Finally, Section 2.4 contains the description of the matching operator with which we return from the macroscopic to the microscopic level of description. Matching is performed by reweighting the Monte Carlo particle ensemble. We collect the complete algorithm in Section 2.5.
2.1 Monte Carlo simulation
Suppose at time we have a weighted particle ensemble , where the state variable denotes the -th realization of the SDE (1) and is its associated weight. The particle ensemble determines an empirical probability distribution
that approximates the exact continuous probability distribution of the diffusion process (1) at time . In the first stage of micro-macro acceleration, we perform microscopic time steps of size with a Monte Carlo time integrator. The propagation of the particle ensemble through the SDE (1) at time is given by a (possibly time-dependent) transition operator
where every particle ensemble determines the empirical probability distribution that approximates the exact probability distribution at time . For completeness, we define . For instance, the Euler-Maruyama scheme propagates each particle as
with an -dimensional standard normally distributed random variable. For the remainder of the manuscript, we will use the Euler-Maruayama method as the inner time integrator.
To transit from the microscopic to the macroscopic description of the diffusion process, we introduce the restriction operator that acts on a probability distribution as follows
where is the vector of state functions that we introduced in equation (2). To ensure that the matched density in Section 2.4 has unit mass, we add an additional macroscopic state function to the vector of state functions from (2) without changing notation. In the context of Monte Carlo simulations, we approximate the continuous probability distribution by a discrete particle ensemble with an associated empirical probability distribution , and define a discrete version of the restriction operator as
The macroscopic state variables at are then given by , .
To extrapolate the macroscopic state variables over a time interval of size , we approximate time derivative using a finite difference approximation. Throughout this text, all experiments use linear extrapolation, for which the macroscopic state variables at time read
Linear extrapolation of the macroscopic state variables mimics a forward Euler step of the unavailable macroscopic model for the macroscopic state variables. Extensions to higher order extrapolation methods are also possible and we refer to [lafitte2017high] for a treatment of higher order projective integration techniques for hyperbolic conservation laws.
To recover the microscopic level of description from the macroscopic level, we need to find a probability distribution that is consistent with the extrapolated states . A priori, however, many possible distributions can be consistent with . To resolve this ill-posedness, we introduce a matching procedure that finds a unique probability distribution that is consistent with the states, while minimizing a dissimilarity functional compared to some prior distribution. We choose the prior distribution to be the final distribution at time during the simulation stage in Section 2.1. We first introduce the matching operator acting on continuous probability distributions in Section 2.4.1. In Section 2.4.2 we explain how matching can be implemented on probability distributions represented by weighted particle ensembles.
2.4.1 The continuous matching operator
In our approach, matching is formulated as an optimization problem that reads
with the matched distribution that serves as the initial distribution for the simulation stage in Section 2.1. Note that the functional does not need to be a distance metric. There are many possible matching strategies that come with a different choice for , see [debrabant2017micro] for more details. In this work, we will use matching in Kullback-Leibler divergence, or relative entropy, which is based on information-theoretic considerations [lelievre2018analysis]. The objective is to minimize
which has an analytic solution of the form
where the Lagrange multipliers solve the non-linear system
In this text, we will employ the Newton-Raphson method to solve the system above to find the Lagrange multipliers [debrabant2017micro]. We give more details in Section 3. Minimizing the Kullback-Leibler divergence (7) amounts to solving a dual system.
2.4.2 Matching with weighted particle ensembles
In the context of Monte Carlo simulations, we replace the prior distribution by the particle ensemble and the matched distribution by a new particle ensemble at time . We rewrite the matching formula on the ensemble level as
where is the matching operator that minimizes the Kullback-Leibler divergence (7).
By the multiplicative nature of matching in Kullback-Leibler divergence (8), sampling from the matched distribution can be performed efficiently by reweighting the particle ensemble to represent the matched ensemble . The new weights read
2.5 The complete micro-macro acceleration algorithm
Assume we have a microscopic ensemble at time , a microscopic step size , the extrapolation step size and a number of microscopic steps such that . Let be the end time of the simulation and the number of macroscopic state variables. The algorithm produces the microscopic ensemble in four steps:
(i) Monte Carlo simulation: simulate the microscopic ensemble over small inner steps of size
(ii) Restriction: compute the macroscopic states corresponding to these microscopic ensembles
(iii) Extrapolation: approximate the macroscopic state variables over a larger time interval with linear extrapolation
(iv) Matching: compute the new particle ensemble consistent with by reweighting the prior ensemble using the matching operator
and advance time with until the end time is reached.
3 Practical implementation details
To implement the micro-macro acceleration algorithm in practice, several issues need to be addressed. In this Section, we focus on three aspects. First, in Section 3.1, we formulate an efficient implementation of the Newton-Raphson scheme to solve the non-linear system (9) using the weighted particle ensemble available at the end of the simulation stage.
Second, due to the particle reweighting (11), some particles can end up with very large or very small weights. Such a variation in the weights can render the approximation of the macroscopic state variables in (5) inaccurate as some particles with small weights are essentially ignored. In Section 3.2, we go deeper into the problem of detecting a large variation in the weights. To alleviate this variation, we use a resampling strategy that randomly duplicates some particles and removes others in such way that, after resampling, all resulting particles have equal weights [hol2006resampling].
Finally, in practice it can happen that the extrapolated macroscopic states fall outside the domain of the matching operator , i.e., there is no probability distribution that is consistent with . We call such a phenomenon a matching failure. An efficient way of dealing with matching failures is decreasing the extrapolation step size such that the extrapolated states deviate less from the already available states in the simulation stage. The detection of matching failures and the adaptive time stepping strategy are discussed in Section 3.3.
3.1 An efficient formulation of the Newton-Raphson scheme
To compute the matched distribution , we need to find the Lagrange multipliers that solve the dual system of non-linear equations (9), where we define
Suppose we have the Lagrange multipliers at the -th iteration of the Newton-Raphson scheme, . One Newton-Raphson iteration for the next set of Lagrange multipliers for the dual problem (9) reads
where the gradient of is given by
In our setting, the prior probability distribution is only available as an empirical measure . We can approximate the function and its Jacobian using the prior particle ensemble as
Note that since the prior particle ensemble is fixed, the solution to the system is also deterministic.
The cost of one Newton-Raphson iteration increases linearly with the number of particles , making matching the most expensive part of Algorithm 1. There exist other numerical techniques, such as a sparse approximation of the Jacobian [lucia1983explicit] or the BFGS algorithm [martinez2000practical] that could speedup matching. These methods usually require more iterations to converge to the correct multipliers, but the iterations require fewer computations. We use the Newton-Raphson procedure in all the numerical experiments in this manuscript.
As initial value for (13), we take . When there is no extrapolation, i.e. , the extrapolated macroscopic state variables are consistent with the prior distribution , i.e., . In this case, the optimal Lagrange multipliers equal 0 so that equations (13) and (9) are already solved by the initial guess. As stopping criterion we require that extrapolated moments are met, within a tolerance
where the tolerance is usually . In practice, we also impose a maximum number of Newton-Raphson iterations due to the possibility of matching failures, which we will explain further in Section 3.3.
3.2 Particle resampling
Besides the problem of matching failures, there is also the issue of increasing variance between particle weights. By the multiplicative form of the matching procedure (8) and particle reweighting (11), some Monte Carlo replicas may end up with a high weight after a few steps, while some may have a low weight. Particles with a small weight will be neglected in the discrete approximation of the macroscopic state variables (5), while a few particles with large weights will dominate the approximation. Hence, the discrete approximation to the macroscopic state variables becomes prone to statistical errors. This is in fact a well-known issue in many particle filter methods [arulampalam2002tutorial].
To resolve the large variation in particle weights, we resample the particles from time to time. The technique we use in this paper is stratified resampling [hol2006resampling, debrabant2017micro]. The idea is to randomly discard and duplicate each of the particles in such way that the distribution of the particles is unchanged, and all particles have an equal weight after resampling. In practice, one divides the unit interval in equal pieces and generates a uniformly distributed random number in each sub-interval
One then duplicates each particle , times where is chosen by counting how many bins of equal size in the weight covers [douc2005comparison]. Mathematically, we thus take . If we discard particle and if we duplicate the particle into more particles. When then the particle remains and we simply adjust its weight to . The above procedure ensures that the particle distribution remains unchanged, in expectation [douc2005comparison].
To determine whether the variation between the weights is too large, we propose to compute the discrete relative entropy of the weights compared to uniform weights ,
which is zero when all weights are equal. We propose, as a heuristic, to resample when this relative entropy exceeds a fixed value of . We choose this value to ensure that the variation in the weights is not too high and prevent inaccurate computations of the macroscopic states. In practice, we compute the relative entropy of the weights every five time steps of the micro-macro acceleration algorithm, and we resample the weights if the entropy exceeds the chosen maximal value.
3.3 An adaptive time-stepping strategy for matching failures
As we mentioned in the introduction of this Section, a problem arises when the extrapolated macroscopic state variables fall outside the domain of the matching operator . In this case, there is no probability distribution that is consistent with the extrapolated states, and the micro-macro acceleration algorithm fails. We call such behavior a matching failure [lelievre2018analysis, debrabant2017micro]. When the extrapolation step is smaller, the extrapolated macroscopic state variables will lie closer to the macroscopic state variables of the prior distribution, for which there exists a probability distribution consistent with the prior state variables, namely the prior distribution itself. We can thus expect that, the smaller the extrapolation step, the more likely we will find a probability distribution consistent with . In practice, we thus perform micro-macro acceleration with adaptive time stepping based on matching failures. If we detect a matching failure, we drastically decrease by half, and if matching succeeds we cautiously increase again by a factor .
The question remains how we can detect a matching failure in practice. When there is no probability distribution consistent with the extrapolated state variables , the Newton-Raphson solver will also fail to find Lagrange multipliers that solve the non-linear system (9). We thus only need to monitor the convergence of the Newton-Raphson solver to determine whether there is a matching failure or not. We impose a maximum number of iterations for the non-linear solver. If this maximum number is reached before convergence, we consider matching to have failed.
Imposing a maximum number of iterations before lowering the extrapolation time step size also has another beneficial effect. For larger , the extrapolated state variables will lie far from the prior macroscopic state variables so that the Newton-Raphson method will need more iterations to convergence compared to lower extrapolation step sizes. Taking two time steps with a lower could speed up the algorithm compared to one step with twice the extrapolation time step. In the light of performance, we can hence take the maximum number of Newton-Raphson iterations to be small. In this text, we allow 6 Newton-Raphson iterations before lowering the time step.
4 The influence of the choice of state variables
In this Section, we present two examples from molecular dynamics: FENE-dumbbells (Section 4.1) and a three-atom molecule (Section 4.2). In both examples, we investigate the accuracy of micro-macro simulation for different choices of macroscopic state variables. In Section 4.1, we introduce the model of FENE-dumbbells and look at three different hierarchies of macroscopic state variables to approximate the quantities of interest. Section 4.2 introduces and discusses the three-atom molecule, where we compare the accuracy of micro-macro acceleration to existing approximate macroscopic models for slow quantities of interest. We will show that even with some macroscopic state variables that are not entirely slow, the micro-macro acceleration method can generate more accurate approximations to the exact dynamics than these approximate models. Finally, we present a short overarching discussion on the effect of the macroscopic state variables in Section 4.3.
4.1 Three state hierarchies for FENE-dumbbells
FENE stands for ‘Finitely Extensible Non-linear Elastic’, and FENE-dumbbells model represents a dilute polymer solution in a solvent[laso1993calculation]. The state variable denotes the end-to-end vector between two beads of the polymer chain, which are connected by a spring. As the polymers move through the solvent, they experience Stokes drag due to the velocity gradient of the ambient solvent, Brownian motion due to collisions with solvent molecules and a spring force
due to intramolecular interactions. The set is the open ball of radius around the origin. The diffusion process for FENE-dumbbells reads
where is the Weissenberg number, the ratio of the characteristic relaxation time of the polymers in the solvent to the characteristic time of the solvent. We refer to [bird1987dynamics] for a derivation of SDE (16) and to [jourdain2003mathematical, jourdain2004existence] for the existence of a global solution to the SDE.
In practice, the FENE-dumbbells process is usually coupled to the macroscopic Navier-Stokes equations that models the evolution of the solvent. This coupling happens through the Stokes drag and by adding a non-Newtonian term in the form of the stress tensor [masmoudi2008well]
In the following experiments, we will use a 1-dimensional FENE-dumbbells process, with parameters . We also define the ambient velocity gradient as in [keunings1997peterlin]
For this choice of the velocity gradient, one can show that hysteresis occurs between the stress tensor and the first even moment, as shown on the phase diagram in Figure 1. The hysteric effect is due to the interplay between the non-linearities in the FENE model (16) and the dispersity between the elongations of the individual polymers. We refer to [sizaire1999hysteretic] for more details on this hysteric effect.
The objective of the numerical experiments in this Section is to study the accuracy of micro-macro acceleration in approximating this hysteretic curve, and the effect of the choice of the macroscopic state variables on this accuracy. We propose three different hierarchies of macroscopic state variables and study the accuracy of the resulting approximations by micro-macro acceleration. The micro-macro acceleration algorithm has already been successfully applied to the FENE-dumbbells diffusion process [debrabant2017micro], but a study of the accuracy of different hierarchies of macroscopic state variables has not yet been performed. The authors of [samaey2011numerical] have, however, already carried out a similar study in the equation-free context.
In the following experiments, we will use the same three hierarchies of macroscopic state variables as [samaey2011numerical], and investigate the accuracy of micro-macro acceleration on the hysteretic curve and the stress tensor, as a function of the hierarchy and the number of macroscopic state variables per hierarchy. The families of macroscopic state variables are:
Hierarchy 1: Use the first even moments of the diffusion process. By symmetry of the spring force, the odd moments of the FENE process vanish, so there is no need to use these states for matching. The macroscopic state variables read
Hierarchy 2: Replace the final state variable from strategy 1 with the stress tensor itself. Since we want to approximate the evolution stress tensor, adding the latter as a state variable could improve the accuracy of micro-macro acceleration. The second hierarchy of state variables hence reads
Hierarchy 3: Start with and add terms that pop up in the Taylor expansion of using Itô’s lemma. For each of those new terms that pop up in the evolution equation for , write down their evolution equation as well and keep adding new terms as macroscopic state variables. This way, the first four state variables that pop up are
We refer to Appendix A for the derivation of the above macroscopic state variables.
For the numerical experiments, we compute the evolution of the stress tensor and the first even moment and also plot the hysteric curve in the phase space. We will use up to macroscopic state variables for the first two hierarchies and up to state variables for the third hierarchy (17). We take a maximal time step for the extrapolation time step with adaptive time stepping, since the expected gain is quite small due to a small relative time-scale separation in the FENE model. Furthermore, we choose for all experiments and we use particles during the Monte Carlo simulations. The numerical results are summarized in Figures 2 to 4, for each of the three hierarchies of state variables, respectively.
With the first hierarchy of macroscopic state variables in Figure 2, micro-macro acceleration does not capture the hysteresis curve well for . For in brown, the approximation to the exact hysteresis curve is somewhat better, but there is still a lot of room for improvement.
The second hierarchy of state variables, which includes the stress tensor, improves the approximation of micro-macro acceleration in Figure 3. The green and purple lines enclose a larger area and the brown line approximates the hysteresis curve slightly better. There is however still no reasonable fit of the exact hysteresis curve when .
The third hierarchy of macroscopic states yields the best results. For , micro-macro acceleration (in purple) follows the exact hysteresis curve closely. With the third hierarchy of state variables, we hence perform less computational work while obtaining a better approximation to the exact stress tensor and hysteresis curve. A priori selecting the set of macroscopic state variables has a profound impact on the accuracy and efficiency of micro-macro acceleration. This conclusion is very similar to the one in the equation-free context [samaey2011numerical].
4.2 Reaction coordinates of a three-atom molecule
The three-atom molecule is a simple three-dimensional molecular system, where the bonds between the individual atoms vibrate at a high frequency, compared to the frequency of global conformational changes of the molecule. Figure 5 depicts the simple molecule. To remove some degrees of freedom, atom A is restricted to move on the -axis with coordinates and atom B is fixed at the origin. We define the coordinates for atom C as . The individual atoms are influenced by a potential that describes the interactions between the atoms, and also by collisions with the ambient solvent, modeled via Brownian motion. The magnitude of these collisions is proportional to the inverse temperature of the solvent.
The evolution of the atoms is given by an overdamped Langevin dynamics of the form[legoll2010effective]
where is the inverse temperature. The potential energy that governs the drift of the individual atoms is given as
where is the equilibrium length for the bonds and is the angle between atoms A and C. As we can see from the expression of the potential energy, the atom has two stable conformations where . Due to the time-scale separation in the potential energy, expressed by the small scale parameter , the bonds between atoms A and B and atoms B and C vibrate quickly relative to the bimodal behavior of the angle , which is the variable of interest in this system. For the experiments in this section, we will use the same parameter values as in [legoll2010effective], i.e., and .
In Section 4.2.1 we describe how an approximate model, or an effective dynamics model can be constructed for a quantity of interest, or reaction coordinate, when the time-scale separation is large. In Section 4.2.2 we choose two reaction coordinates and discuss the accuracy of the effective dynamics. In Section 4.2.3, we apply micro-macro acceleration to the three-atom molecule with macroscopic state variables closely related to the quantity of interest (the angle ) and look at its efficiency gain. Finally, in Section 4.2.4, we look at the accuracy of micro-macro acceleration with macroscopic state variables that are loosely coupled to .
4.2.1 Effective dynamics
When the parameter is small, a direct simulation of (18) becomes prohibitively costly. In many situations in molecular dynamics, however, we are usually not interest in the evolution of the complete stochastic system, but rather in the evolution of a low-dimensional set of reaction coordinates. For the three-atom molecule (18), we consider a scalar reaction coordinate . In [legoll2010effective], the authors propose to use a so-called effective dynamics model, or an approximate macroscopic model for the reaction coordinate . The effective dynamics is based on the conditional expectation of the state variable given a value of the reaction coordinate , and reads
The drift and diffusion term are defined by the conditional expectations
Analytic expressions for the functions and are usually hard or impossible to derive. Therefore, we pre-compute the functions on a grid of -values and interpolate between these points when an intermediate value is required [legoll2010effective].
4.2.2 Choice of reaction coordinates
As we mentioned in the introduction of this Section, we are mostly interested in the possible conformations that the molecule can reside in. To this end, we define a first reaction coordinate to be the angle . Alternatively, we define a reaction coordinate as the -distance squared between the two outer atoms A and C, i.e.,
The effective dynamics evolution equation for is readily given by the potential energy function (18), while for the second reaction coordinate, we compute the drift and diffusion terms, and on uniform grid of -values between 0 and 5. The reaction coordinates seem very similar, as they both can capture the bimodal nature of the dynamics of . However, the accuracy of the resulting effective dynamics differ considerably. The numerical results for and , as reproduced from [legoll2010effective], are summarized in Figure 6. The microscopic time step is for all simulations and we use Monte Carlo replicas.
The effective dynamics based on follows the exact dynamics of the angle very well. On the other hand, the effective dynamics with does not follow the exact dynamics of at all. There is a large steady-state error between the effective dynamics and the exact microscopic dynamics. The authors of [legoll2010effective] attribute this error to the fact that is not orthogonal to the fast dynamics, while is. The a priori choice of which reaction coordinate to use, is thus of great importance.
4.2.3 Efficiency gain by micro-macro acceleration
The choice of a good set of reaction coordinates to use can be very hard in more complex applications [legoll2010effective]. Micro-macro acceleration can reduce this difficulty. We now perform the same numerical experiments as above with the micro-macro acceleration method, and study when micro-macro acceleration can improve on the results of the effective dynamics. In the following experiments, we extrapolate the mean of either or as a macroscopic state variable, for different extrapolation step sizes. The simulation results are summarized in Figure 7, with reaction coordinate on the left and on the right.
On the left plot of Figure 7, we see that micro-macro acceleration closely follows the exact dynamics of the first reaction coordinate, for different extrapolation step sizes. Only a small transient error appears as increases. This is already an efficiency gain compared to the microscopic simulation: we obtain the same steady-state value while taking larger time steps. The right plot of Figure 7 contains a surprising result. Micro-macro acceleration with extrapolating only the mean of completely eliminates the modeling error made by the effective dynamics of , even though the mean of contains some fast dynamics. This is the case for multiple extrapolation step sizes. One reason why micro-macro acceleration is more accurate than the effective dynamics is that the microscopic dynamics is taken into account as well during the simulation stage. A complete explanation of the high level of accuracy for large is however not yet available.
We can conclude, numerically, that micro-macro acceleration yields accurate simulations results when extrapolating either one of the two reaction coordinates. Micro-macro acceleration attains the same level of accuracy or is more accurate than the effective dynamics. At the same time, we are able to bridge the gap in the time-scale separation by taking extrapolation time steps that are about ten times larger than the microscopic time step. The time-scale separation is approximately a factor of ten between the slow and ten fast dynamics, due to the particular choice of values for and . We thus gain in efficiency compared to both the effective dynamics and the microscopic time integrator.
4.2.4 A hierarchy of moments of and
Reaction coordinates and are very closely connected (or identical to) the angle , and micro-macro acceleration yields very accurate results when extrapolating the mean of either reaction coordinates. We now consider a set of pure moments of the three-atom molecule (18) that also determines the angle uniquely and investigate the numerical results. Suppose we take the first and second moments of and as macroscopic state variables
Figure 8 depicts the numerical approximations by micro-macro acceleration with these moments as macroscopic state variables, with the same large extrapolation time steps and particles.
The numerical results indicate that micro-macro acceleration follows the exact microscopic dynamics of and well, for small extrapolation steps, i.e., . However, when the extrapolation step grows, micro-macro acceleration remains unbiased but the statistical noise increases. The noise on in the right panel of Figure 8 is higher than the noise on . This effect is mainly due to the fact that we are not extrapolating any states of , such that can move freely. In the experiments in previous section, both and depend on and the noise is a lot lower. We leave the quantification of statistical error through micro-macro acceleration for future work.
4.3 Conclusion on the number of macroscopic state variables
For both FENE-dumbbells and the three-atom molecule, we can conclude that choosing a set of macroscopic state variables that is in close connection to some quantity of interest (the angle for the three-atom molecule and the stress tensor for FENE-dumbbells) yields accurate simulation results. In case of FENE-dumbbells, the third hierarchy of macroscopic states approximates the stress very well, with fewer moments than the other hierarchies of macroscopic state variables. Also, in case of the three-atom molecule, micro-macro acceleration can improve on the effective dynamics for reaction coordinates that are related to the quantity of interest.
As a second observation, we need to ensure to extrapolate enough macroscopic state variables related to the quantity of interest to keep the statistical error low enough. When extrapolating only moments of atom in the three-atom molecule, the noise on the angle is smaller than on the distance since the variable can move around freely.
5 Impact of the extrapolation step size on accuracy
In this Section, we look in more detail into the effect of the extrapolation time step on accuracy. We assume that the macroscopic state variables are given, for example through an analysis similar to those in Section 4. In the following examples, we are interested in the maximal extrapolation step we can employ, such that the micro-macro acceleration error remains within a given tolerance, specifically as a function of the time-scale separation present in the system. For systems in which the time-scale separation can be expressed in terms of a small-scale parameter , an approximate macroscopic model usually exists for the slow dynamics [pavliotis2008multiscale], that converges to the microscopic dynamics when to 0. The goal of this Section is to determine the maximal such that micro-macro acceleration is more accurate than the approximate macroscopic model. In Section 5.1, we first investigate the efficiency gain on a slow-fast system with a double-well potential, and in Section 5.2, we discuss a simple linear SDE with a periodic external force.
5.1 A slow-fast bimodal system
As the first example, we consider a slow-fast system, where the fast component is governed by a double-well potential
The potential energy function is depicted in Figure 9.
With this double-well potential, the fast component will reside in one well for some time and then switch at a random instant in time to the other well due to the Brownian motion term in the evolution equation for . The mean switching time between the two wells depends on and is derived in detail in [bruna2014model]. Figure 10 depicts the switching process of one particle for a small (left) and a large (right). For large time-scale separations (small ), the fast component switches frequently between the wells, so that the slow component (in red) is not heavily influenced by the switching. The slow component only feels the invariant distribution of the position of the fast component. For low time-scale separations, the fast component resides longer in one well before switching to the other. In this case, the slow component feels in which well the fast component resides and evolves accordingly.
The slow-fast bimodal system in this section differs from the bimodal behavior of the three-atom molecule. In the current example, the bimodal motion is fast while in the previous example it is slow. Also, the effective dynamics from the three-atom molecule (20) is not applicable here since there is an present in the diffusion term of (23).
In Section 5.1.1, we introduce an approximate macroscopic model for the slow component of general slow-fast systems, based on averaging out the fast component. We show the accuracy of the approximate macroscopic model for small and larger values of . Finally, in Section 5.1.2, we discuss the accuracy and efficiency gain of micro-macro acceleration compared to the approximate macroscopic model, applied to the bimodal system (23).
5.1.1 An approximate macroscopic model for general slow-fast systems
For general slow-fast systems, one can derive an approximate macroscopic model for the slow component by averaging out the mean of the fast dynamics , given value of the slow component . We then substitute the averaged value of into the evolution equation for the slow component. In case of the bimodal system, the fast dynamics is autonomous and the invariant measure is [pavliotis2008multiscale]
The mean of the , given a value of equals zero
so that the approximate macroscopic model for the bimodal system (23) reads
Equation (24) converges weakly to (23) when decreases to 0 [pavliotis2008multiscale], implying that the approximate macroscopic model makes a modeling error whenever . The reason for a modeling error can easily be identified from the left plot of Figure 10. Indeed, as we mentioned, for larger the fast mode heavily influences the slow mode, so that we cannot just replace the fast behavior by its mean.
There is a large variance in induced by the switching of that we ignore by substituting the mean of for itself. When is lower, the error also lowers as we identified in the right plot of Figure 10. This induced variance is illustrated in Figure 11, which shows the variance of the slow component of (23) as a function of time, for (left) and (right). The step size is for both models is and the initial condition is far from the equilibrium distribution of the system.
The left plot of Figure 11 indicates a large steady-state error on the slow variance for moderate ; this error is also present for but a lot smaller. With the micro-macro acceleration method, we aim to reduce this steady-state error, especially for moderate time-scale separations.
5.1.2 Efficiency gain by micro-macro acceleration
In the following experiments, we run the micro-macro acceleration method on the bimodal system (23), and extrapolate the first and second moment of , with particles. Both moments are required, since we need to approximate the variance of the slow component as closely as possible. We consider values for the time-scale parameter: and . In the former case, we expect a large efficiency gain due to a high time-scale separation. In the latter case, the time-scale separation is only moderate and the expected efficiency gain is rather small. The numerical results are summarized in Figure 12. On the one hand, for small , micro-macro acceleration can take extrapolation steps that are a hundred times larger than the microscopic step size, while only creating a small transient error. On the other hand, when is large, micro-macro acceleration is able to eliminate the error of the approximate macroscopic model and reach the same steady state value for the variance as the microscopic integrator. At the same time, micro-macro acceleration also gains in efficiency compared to the microscopic integrator by taking larger time steps. There, however, remains a transient error on the variance of the slow mode, that enlarges as increases. There is currently no error control available for micro-macro acceleration and we leave this topic for further research.
To conclude, with micro-macro acceleration, we can almost bridge the time-scale separation gap by taking extrapolation steps on the order of the small dynamics then is small, and when is large, we can eliminate the steady-state error of the approximate macroscopic model with time steps that are slightly larger than those of the microscopic time integrator. Thus, on this double-well system, we gain in efficiency for all values of .
5.2 Accuracy analysis of extrapolation on a periodically driven linear system
For the final numerical example of this manuscript, we consider a slow-fast linear system where we add a periodic external force to evolution equation of the the slow component [li2008effectiveness],
Hence, the mean of has a periodic invariant solution. The periodic external force makes it easier to compare the numerical solution of different methods to the exact solution. The errors between different models are present in every period of the invariant solution and do not damp out to zero as time increases. We can thus compute errors of different methods by simply computing -norm of their difference with the exact solution, over one period of the solution.
with the normalization constant. The mean of , computed against its invariant distribution, for a given value of , thus reads
Therefore, the approximate macroscopic model for the slow component is
Let us first compare the numerical solution of the approximate macroscopic model (26) to the exact microscopic dynamics (25), for different values of . For all the numerics in this Section, the initial condition is taken on the invariant curve of the exact continuous solution, the microscopic time step is and we use Monte Carlo particles. Figure 13 depicts the evolution of the mean of for on the left, and on the right. The numerical results show that the approximate macroscopic model (in blue) makes an error, compared to the exact microscopic model (in orange). The error is mostly visible in the amplitude of the periodic solution, and the discrepancy decreases when decreases to zero. We show in Appendix B that the error between the microscopic and approximate macroscopic model decreases linearly with , as is also visible on Figure 14.
5.2.1 Tuning the error of micro-macro acceleration
Let us now investigate the accuracy of micro-macro acceleration for the periodically driven linear system (25). For every value of , we want to examine the maximal extrapolation step micro-macro acceleration can take, while keeping the error smaller than the corresponding error of the approximate macroscopic model.
To determine the maximal extrapolation step, we first need to know how the micro-macro acceleration error decreases as decreases to . As of now, there exist no theoretical result on the convergence rate of micro-macro acceleration with relative entropy matching [lelievre2018analysis], so we can only investigate the rate of convergence numerically. In Figure 15, we show the error of micro-macro acceleration as a function of the extrapolation time step , for the same values of as in Figure 13. For large values of , the micro-macro acceleration error decreases linearly with as is shown on the left plot in Figure 15. We expect first-order convergence since the linear extrapolation of macroscopic state variables mimics the forward Euler method. However, for small , we observe that the error decreases quadratically with .
Having the convergence order of both the approximate macroscopic model and micro-macro acceleration for small in place, we will derive an expression for the maximal extrapolation step we can take, before the approximate macroscopic model becomes more accurate than micro-macro acceleration. Using the linear convergence of the approximate macroscopic model, and the quadratic convergence of micro-macro acceleration, both methods attain the same accuracy when
with the maximal extrapolation step possible before the approximate macroscopic model is more accurate than micro-macro acceleration. We define as the constant of proportionality for the approximate macroscopic model and the function as the proportionality constant for micro-macro acceleration, which can in principle depend on . For a more natural comparison of the maximal extrapolation step for different values of , we define the dimensionless ‘extrapolation factor’ as . Furthermore, due to the stability requirement for the inner integrator, we choose the inner microscopic step proportional to , . Putting all the parameters together, equation (27) yields
such that a direct expression for follows. Unfortunately, we lack a direct expression for , so we cannot use (28) directly. We must resort to numerical experiments to deduce the maximal extrapolation step and the maximal extrapolation factor .
In Figure 16, we plot in solid lines the micro-macro acceleration error for multiple values of as a function of the extrapolation factor . The error of the approximate macroscopic model is also given by the dashed lines, and the crosses indicate where both errors are equal. Denote the associated maximal extrapolation factor by . Whenever , the micro-macro acceleration method is more accurate than the approximate macroscopic model for that value of , and vice versa when . The numerical results indicate that the maximal extrapolation factor increases as decreases.
An important consequence of the the fact that increases, is that the maximal extrapolation step decreases slower than linearly with , since the microscopic inner step decreases linearly with . Indeed, Figure 17 depicts that decreases approximately as . As a result, micro-macro acceleration can be more accurate than the approximate macroscopic model if we choose . We can also tune the micro-macro acceleration error. Given a tolerance, or maximal value for the error, we can choose the extrapolation step such the error is beneath the tolerance, while still taking larger time steps than the microscopic time integrator. We conclude that micro-macro acceleration gains in efficiency compared to the approximate macroscopic model by lowering the error, but also compared to the inner microscopic time integrator by taking larger time steps. The gain compared to the microscopic time integrator even becomes infinite when decreases to zero.
We studied the efficiency of a micro-macro acceleration procedure. We introduced the micro-macro acceleration algorithm in mathematical detail in Section 2 and discussed some implementation issues in Section 3.
We investigated the effect of the number and nature of the macroscopic state variables in detail in Section 4. The state variables have a big impact on the efficiency of micro-macro acceleration. Increasing the number of state variables yields a higher accuracy of the matched distributions but increases the cost of matching.
The numerical results of Section 4 show that a hierarchy of macroscopic state variables that is closely connected to some quantity of interest, gives accurate simulation results for that quantity. In case of FENE-dumbbells, the hierarchy of state variables that includes terms in the Itô expansion of the stress tensor, gives better approximations to the stress tensor than other hierarchies. In fact, this hierarchy of states requires only four macroscopic state variables, while the others do not approximate the stress tensor, even with five macroscopic state variables.
A similar conclusion holds for the three-atom molecule, where we worked with two reaction coordinates: the angle and the distance between the two outer atoms. Micro-macro acceleration with extrapolating the mean of the two reaction coordinates, follows the exact microscopic dynamics very well, even for quite large extrapolation step sizes. Moreover, the second reaction coordinates eliminates the steady-state error made by the approximate model based on the effective dynamics for that reaction coordinate. The reaction coordinate does not necessarily need to be slow, as we have seen with the second reaction coordinate, it just needs to be linked to the quantity of interest, the angle in this example. A hierarchy based only on moments also produced an unbiased results but the statistical error is larger. Thus, micro-macro acceleration achieves the same or a better accuracy than the effective dynamics, while allowing for larger time steps than the stiff microscopic solver, gaining efficiency compared to both models.
Besides the choice of macroscopic state variables, the extrapolation step size greatly influences the accuracy of micro-macro acceleration, as we studied in Section 5. For the slow-fast bimodal model, the approximate macroscopic model makes a modeling error on the variance of the slow component, especially visible when there is almost no time-scale separation. Micro-macro acceleration attains the correct steady-state value for different extrapolation step sizes, at the expense of a controllable transient error. We thus gain in accuracy compared to the approximate macroscopic model while gaining in efficiency against the microscopic integrator by taking larger time steps. For the slow-fast periodic model, we proved numerically that the efficiency gain by micro-macro acceleration becomes infinite as the time-scale separation increases to infinity. The maximal extrapolation step micro-macro acceleration can take before the approximate macroscopic model becomes more accurate decreases approximately as , while the microscopic time step must decrease linearly due to stability. Furthermore, we can also tune the extrapolation step such that the error stays below a given tolerance, while still taking larger time steps than the microscopic time integrator. The efficiency gain is thus also present in the periodic linear system.
Appendix A Derivation of the macroscopic state variables for FENE-dumbbells
In this Section, we give a short derivation of the macroscopic state variables of the third hierarchy of the FENE-dumbbells process (17). We start by writing out the evolution equation of the first even moment and consequently add all terms that appear in the evolution equation. For all these new terms we also write the evolution equations and select extra terms and so forth.
First note that for any twice-differentiable function , the evolution of the expectation is given by Itô’s law
Starting from the first even moment of the diffusion process , we find that
Since the state variable appears in its evolution equation, we do not need to add it to the list of macroscopic state variables. We also do not need to add the constant term, so we add as the second macroscopic state variable for the FENE-process.
Let us now consider the second function . The first and second derivative are
Using equation (29), the evolution equation of hence becomes
To complete our choice of four macroscopic state variables, we choose the first two moments popping up in the evolution equation of : and .
Appendix B Derivation of first order convergence of approximate model (26)
In this Appendix, we show that the -error between the slow means of a general system of linear SDEs with an external period driving force, and the corresponding approximate macroscopic model for the slow component, decreases linearly to zero when the small-scale parameter decreases to zero. For the derivation, we consider a more general form of the system in Section 5.2
where the associated approximate macroscopic model is given by
One can show using elementary techniques from calculus that the mean vector propagates as
where and is the initial condition to the equation. The constants and are the solution of the linear system
Since we are only interested in the evolution of , we only need to know the explicit expressions for and . Using the symbolic engine MUPAD [fuchssteiner1996mupad], the constants and are