Gradient flow formulation and longtime behaviour of a constrained Fokker-Planck equation

Simon Eberle (Fakultät für Mathematik, Universität Duisburg-Essen), Barbara Niethammer and André Schlichting (Institut für Angewandte Mathematik, Universität Bonn).

We consider a Fokker-Planck equation which is coupled to an externally given time-dependent constraint on its first moment. This constraint introduces a Lagrange-multiplier which renders the equation nonlocal and nonlinear.

In this paper we exploit an interpretation of this equation as a Wasserstein gradient flow of a free energy $\mathcal{F}$ on a time-constrained manifold. First, we prove existence of solutions by passing to the limit in an implicit Euler scheme obtained by minimizing $h\,\mathcal{F}(\rho) + W_2^2(\rho^0,\rho)$ among all $\rho$ satisfying the constraint for some $\rho^0$ and time-step $h>0$.

Second, we provide quantitative estimates for the rate of convergence to equilibrium when the constraint converges to a constant. The proof is based on the investigation of a suitable relative entropy with respect to minimizers of the free energy chosen according to the constraint. The rate of convergence can be explicitly expressed in terms of constants in suitable logarithmic Sobolev inequalities.

Key words and phrases:
constrained gradient flow, entropy method, energy-dissipation relation, Fokker-Planck equation
2010 Mathematics Subject Classification:

1. Introduction

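We consider a nonlocal Fokker-Planck equation

$$\tau\,\partial_t\rho(x,t) = \partial_x\Big(\nu^2\,\partial_x\rho(x,t) + \big(H'(x)-\sigma(t)\big)\rho(x,t)\Big), \tag{1.1}$$

which describes the evolution of an ensemble of identical particles in a potential well subject to stochastic fluctuations. Here, a single particle is characterized by its thermodynamic state $x\in\mathbb{R}$, $H$ is its free energy and $\rho(\cdot,t)$ denotes the probability density of the whole system at time $t$. Equation (1.1) contains the small parameters $\nu$ and $\tau$, where $\nu$ accounts for entropic effects and $\tau$ is the typical relaxation time of a single particle. Furthermore, $\sigma$ is a Lagrange multiplier which is such that the dynamical constraint

$$\int_{\mathbb{R}} x\,\rho(x,t)\,dx = \ell(t) \tag{1.2}$$

is satisfied, where $\ell$ is an externally given constraint. A direct calculation shows that the Lagrange multiplier is obtained as a nonlocal interaction term

$$\sigma(t) = \int_{\mathbb{R}} H'(x)\,\rho(x,t)\,dx + \tau\,\dot\ell(t). \tag{1.3}$$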

Equation (1.1) together with (1.2) was introduced in [9, 7] to model hysteretic behaviour in many-particle storage systems, such as lithium-ion batteries subject to externally imposed charging and discharging. In this context, $H$ is typically nonconvex, which gives rise to nontrivial dynamics that are studied in various scaling regimes in [13, 14].
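
The formula (1.3) for the Lagrange multiplier can be checked by a short formal computation. The sketch below assumes the notation $\rho$ for the density, $H$ for the single-particle free energy, $\sigma$ for the multiplier and $\ell$ for the constraint. It also assumes that (1.1) has the form stated in the first comment line, and that $\rho$ decays fast enough at infinity for all boundary terms to vanish.

```latex
% Assumed form of (1.1): \tau \partial_t \rho = \partial_x( \nu^2 \partial_x \rho + (H' - \sigma) \rho )
\begin{align*}
\tau\,\dot\ell(t)
  &= \tau\,\frac{d}{dt}\int_{\mathbb{R}} x\,\rho\,dx
   = \int_{\mathbb{R}} x\,\partial_x\big(\nu^2\partial_x\rho + (H'-\sigma)\rho\big)\,dx \\
  &= -\int_{\mathbb{R}} \big(\nu^2\partial_x\rho + (H'-\sigma)\rho\big)\,dx
   = -\int_{\mathbb{R}} H'\rho\,dx + \sigma(t).
\end{align*}
% The last step uses \int \partial_x\rho\,dx = 0 and \int \rho\,dx = 1.
```

Rearranging gives $\sigma(t) = \int_{\mathbb{R}} H'\rho\,dx + \tau\dot\ell(t)$, so the multiplier depends on the solution only through a weighted average, which is the source of the nonlocality.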

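System (1.1) has a free energy, which is also essential in the modeling and its thermodynamic derivation [9, 7], consisting of an entropy and a potential energy

$$\mathcal{F}(\rho) = \nu^2\int_{\mathbb{R}} \rho\log\rho\,dx + \int_{\mathbb{R}} H\rho\,dx + C_H. \tag{1.4}$$

Here we added a constant $C_H$ to make $\mathcal{F}$ nonnegative. By differentiating the free energy along the solution to (1.1), (1.2), one obtains a second law of thermodynamics for open systems, that is

$$\frac{d}{dt}\mathcal{F}(\rho(\cdot,t)) = -\frac{1}{\tau}\,\mathcal{D}(\rho(\cdot,t),\sigma(t)) + \sigma(t)\,\dot\ell(t), \tag{1.5}$$

where the nonnegative dissipation is defined by

$$\mathcal{D}(\rho,\sigma) = \int_{\mathbb{R}} \frac{1}{\rho}\,\big|\nu^2\partial_x\rho + (H'-\sigma)\rho\big|^2\,dx. \tag{1.6}$$

If $\dot\ell \equiv 0$, the identity (1.5) is characteristic for systems possessing a gradient flow structure and very useful in the investigation of the long-time behaviour of solutions, since it shows that $\mathcal{F}$ is a Lyapunov function. Motivated by this feature, the goal of this paper is two-fold. First, we will develop an existence theory for (1.1), (1.2) based on an underlying gradient flow formulation. Second, we will provide quantitative estimates of the rate of convergence of solutions to equilibrium in the case that $\ell(t)\to\ell^*$ as $t\to\infty$.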
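
The energy-dissipation identity (1.5) can be verified formally along the following lines. The sketch assumes that (1.1) has the form $\tau\partial_t\rho = \partial_x J$ with flux $J := \nu^2\partial_x\rho + (H'-\sigma)\rho$, that the free energy is the sum of $\nu^2\int\rho\log\rho\,dx$ and $\int H\rho\,dx$ up to a constant, and that all boundary terms vanish. The symbols $J$, $\rho$, $H$, $\sigma$, $\ell$ are notation chosen here.

```latex
\begin{align*}
\frac{d}{dt}\mathcal{F}(\rho)
  &= \int_{\mathbb{R}} \big(\nu^2(\log\rho + 1) + H\big)\,\partial_t\rho\,dx
   = -\frac{1}{\tau}\int_{\mathbb{R}} \Big(\nu^2\frac{\partial_x\rho}{\rho} + H'\Big)\,J\,dx \\
  &= -\frac{1}{\tau}\int_{\mathbb{R}} \Big(\frac{J}{\rho} + \sigma\Big)\,J\,dx
   = -\frac{1}{\tau}\int_{\mathbb{R}} \frac{|J|^2}{\rho}\,dx - \frac{\sigma}{\tau}\int_{\mathbb{R}} J\,dx .
\end{align*}
% Differentiating the constraint \int x \rho\,dx = \ell(t) gives
% \tau\dot\ell = \int x\,\partial_x J\,dx = -\int J\,dx, so the last term equals \sigma\dot\ell.
```

The first term is the nonpositive dissipation contribution, while the second term $\sigma\dot\ell$ is the work exerted by the external constraint and has no sign.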

We now give a brief overview of the corresponding results.

The first part of this paper on the gradient flow formulation is inspired by the seminal paper [15], which presents an interpretation of a linear Fokker-Planck equation as a gradient flow of the physically accurate free energy functional $\mathcal{F}$ from (1.4) in the space of probability densities with finite second moment endowed with the Wasserstein metric (cf. also [2]). This was the starting point of interpretations of more general nonlinear, nonlocal Fokker-Planck equations as Wasserstein gradient flows with respect to time-dependent energies, on constrained manifolds and even of non-dissipative equations. A selection of works introducing certain constraints into gradient flows is [11, 6, 21].

In our case, the equation has a time-independent energy functional $\mathcal{F}$, which is, however, subject to a possibly time-dependent constraint (1.2). Let us point out that such a setting raises the problem that the constrained gradient cannot simply be defined as the projection onto the constrained manifold, since this could lead to a violation of the dynamical constraint. In Section 2, it is shown that a gradient flow with respect to a dynamical constraint needs a further restriction of the space of admissible tangential directions in order to match the dynamical constraint at all times. Then, among these admissible tangential directions, the one of steepest descent of the free energy is chosen.

This formal definition is complemented by proving rigorously in Section 3 that (1.1) can be obtained as a Wasserstein gradient flow with dynamical constraint. To this end, we introduce time-discrete solutions obtained from an implicit time-discrete Euler scheme. In the setting of geometric flows, the scheme was first introduced in [17, 18], and in the setting of Fokker-Planck equations it goes back to [15]. Then, equation (1.1) is obtained by passing to the limit in the time step of the discrete scheme. In addition, this provides an alternative well-posedness result to [14, Lemma 1], which is based on a fixed point argument. Let us also note that well-posedness in the case of compact state space is also obtained in [8].

In the second part of this paper, Section 4, we prove a quantitative long-time result under the assumption that $\ell(t)\to\ell^*$ for $t\to\infty$. In order to identify the internal time-scale of the system we set from now on $\tau = 1$. The main difficulty in the investigation of the long-time behaviour is that, due to the external constraint (1.2), the system is not thermodynamically closed, that is, the free energy (1.4) is not strictly decreasing but satisfies the energy-dissipation identity (1.5). The key idea in the analysis is to introduce a suitable comparison state parametrized by the constraint with the help of some convenient reparametrization . This state is characterized by the constrained minimization of the free energy (1.4) among all states satisfying (1.2) (cf. Proposition 4.2). Then, we are able to establish for the relative entropy with respect to this state, defined by


a differential inequality which implies a quantitative convergence to the equilibrium state. The constant in this differential estimate is characterized by the constant in a suitable logarithmic Sobolev inequality.

Since the relative entropy dominates the $L^1$-norm, we are able to show, provided $\ell(t)\to\ell^*$ sufficiently fast, that there exists depending on the initial value and the convergence assumption on as well as a depending on the constant in the logarithmic Sobolev inequalities such that (cf. Theorem 4.8 and Corollaries 4.10 and 4.13)


Here, we can identify three possible internal time scales: In the strictly convex case, that is for all , then in (1.8) can be chosen as . In the unimodal case where has only one global minimum, then for some that is independent of . In the so-called Kramers case, where has a multi-well structure, we obtain that is exponentially small in . Here, is a characteristic energy barrier of the system (cf. Section 4.3). Moreover we show that for outside of a certain regime and for sufficiently well-prepared initial data, the multi-well structure does not play a role in the dynamics and for some .

2. Constrained gradient flows

2.1. Setting

Let be the state manifold and a smooth free energy function. Furthermore, shall possess in each point a tangent space and on  a positive definite symmetric bilinear form . Then is the gradient flow with respect to if it solves


where denotes the first variation of at in direction .

Another formulation (cf. Mielke [20]) uses the inverse of the metric denoted by the Onsager operator , which is assumed to be a positive semidefinite linear operator. Then, the gradient flow of with respect to is defined by


By the definition of the Onsager operator, the cotangent space is given as the preimage of the Onsager operator


Then, any covector field gives rise to a curve on  by solving in a suitable sense


We call this the continuity equation on , since it respects possible conservation laws. For instance for the space of absolutely continuous probability measures with bounded second moment, we formally have and . Hence, (2.4) becomes the classical continuity equation on : with driving potential field .

In the following, we often write the identities (2.1) or (2.2) as and do not make the underlying metric respectively Onsager operator apparent in the notation. Moreover, we let .

A crucial consequence of the gradient flow formulation is the so-called energy-dissipation estimate


which corresponds to the second law of thermodynamics for closed systems. In this context the term is called dissipation.
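
Formally, the energy-dissipation estimate is a direct consequence of the chain rule. In the sketch below, $x(t)$ for the curve, $\mathcal{F}$ for the free energy and $\mathcal{K}$ for the Onsager operator are assumed notation, and the flow is taken in the Onsager form $\dot x = -\mathcal{K}(x)\,D\mathcal{F}(x)$ of (2.2).

```latex
\begin{align*}
\frac{d}{dt}\,\mathcal{F}(x(t))
  = \big\langle D\mathcal{F}(x(t)),\,\dot x(t)\big\rangle
  = -\big\langle D\mathcal{F}(x(t)),\,\mathcal{K}(x(t))\,D\mathcal{F}(x(t))\big\rangle
  \le 0 .
\end{align*}
% The sign follows from the positive semidefiniteness of the Onsager operator.
```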

2.2. Formalism

In this section we introduce our notion of a gradient flow subject to a time-dependent constraint. The solution no longer lives on the manifold , but for each time the gradient flow has to be an element of a constrained manifold .

Therefore, let be an a-priori given differentiable functional such that


We call a constraint with such a property a nondegenerate constraint and we set the time-dependent constrained state space. The constraint is called stationary if for all .

To define a constrained gradient flow for a stationary constraint, note first of all that due to the nondegeneracy of from (2.6), the gradient is orthogonal to and is a linear subspace of with co-dimension 1. Let be the orthogonal projection from onto . Since is an orthogonal projection, it is also self-adjoint. The constrained gradient is then the unique element in such that for all :

Employing that and that is self-adjoint, we find

and hence, since was arbitrary,

Therefore, a curve is called constrained gradient flow with respect to the stationary, nondegenerate constraint , if for all :

The additional difficulty in the case of a dynamical constraint is that the orthogonal projection of onto does not necessarily keep the flow congruent to the dynamical constraint. To keep them synchronized, we introduce an extended state space incorporating the time as an additional coordinate. This approach resembles the basic transformation of a nonautonomous ordinary differential equation to an autonomous one by adding the time as additional coordinate. Hence, let us define the extended state manifold and for two elements the formal metric

Then is given by and the tangent space of is given by

In order to lift onto we define

and hence

Therefore, a constrained gradient should satisfy:

  1. For all that is such that holds


Under these premises and the nondegeneracy assumption (2.6) on the constraint, the only possible definition of the constrained gradient flow is the projection of along onto . Doing so, we get



Hence, we arrive at the following definition for the gradient flow in the case of a dynamical constraint:

Definition 2.1 (Dynamically constrained gradient flow).

A curve is called constrained gradient flow with respect to the nondegenerate dynamical constraint , if for all :


where the Lagrange multiplier is given by (2.8).
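
A finite-dimensional analogue may help to see how the Lagrange multiplier keeps the flow on the moving constraint. The sketch below is an illustration under assumptions that are not taken from the text: the state space is $\mathbb{R}^n$ with the Euclidean metric, the constraint is linear, $a\cdot x(t) = \ell(t)$, and the names `constrained_gradient_flow`, `grad_E`, `a` and `ell` are hypothetical. Differentiating the constraint along $\dot x = -\nabla E(x) + \lambda a$ gives $\lambda = (a\cdot\nabla E(x) + \dot\ell)/|a|^2$, which is the formula used in the loop.

```python
import numpy as np

def constrained_gradient_flow(grad_E, a, ell, x0, T=2.0, n_steps=2000):
    """Explicit Euler for x' = -grad E(x) + lam * a subject to a.x(t) = ell(t).

    Finite-dimensional Euclidean toy model (an assumption of this sketch),
    not the Wasserstein setting of the paper.
    """
    dt = T / n_steps
    x = np.array(x0, dtype=float)
    for k in range(n_steps):
        t = k * dt
        # Discrete time derivative of the constraint over the current step.
        ell_dot = (ell(t + dt) - ell(t)) / dt
        g = grad_E(x)
        # Multiplier obtained by differentiating a.x(t) = ell(t) in time.
        lam = (a @ g + ell_dot) / (a @ a)
        x = x + dt * (-g + lam * a)
    return x

a = np.array([1.0, 2.0, -1.0])
b = np.array([0.5, -1.0, 2.0])
ell = lambda t: 1.0 + np.sin(t)
x0 = np.array([1.0, 0.0, 0.0])          # satisfies a.x0 = ell(0) = 1
# Quadratic energy E(x) = |x - b|^2 / 2, so grad E(x) = x - b.
x_T = constrained_gradient_flow(lambda x: x - b, a, ell, x0)
```

Because the multiplier uses the discrete increment of `ell`, the constraint is reproduced up to rounding at every step. Dropping the `ell_dot` term would reduce the update to the plain orthogonal projection of the gradient and the flow would stay on the initial constraint level instead of following `ell`.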

2.3. Formal derivation as constrained gradient flow in

In this section, we formally show that (1.1) can be seen as a gradient flow with respect to the free energy functional as defined in (1.4) satisfying the constraint (1.2). In the following discussion the parameter is set to one. The constraint given in terms of a functional such that reads


The metric is induced by the Wasserstein distance on the space of absolutely continuous probability measures with finite second moment, defined by


where is the set of couplings between and , i.e. probability measures on with marginals and , respectively. Furthermore, the following dynamical representation holds [4]:

where the Onsager operator is defined by in the weak sense. The differential of the free energy is given by . Moreover, we evaluate


which satisfies the nondegeneracy assumption (2.6). Hence, in (2.8) becomes


which is as defined in (1.3). Hence, we obtain from Definition 2.1


which is nothing else than (1.1).
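
The interplay between the equation and the nonlocal multiplier can also be observed numerically. The following is a minimal finite-volume sketch, not a scheme from the paper. It assumes that (1.1) reads $\tau\partial_t\rho = \partial_x(\nu^2\partial_x\rho + (H'-\sigma)\rho)$, uses a truncated domain with no-flux boundaries, and picks a quadratic $H$ and moderate parameters purely for illustration. The names `simulate`, `H_prime` and `ell` are hypothetical. The multiplier is computed from the discrete counterpart of $\sigma = \int H'\rho\,dx + \tau\dot\ell$, chosen such that the discrete first moment follows `ell`.

```python
import numpy as np

def simulate(ell, H_prime, nu=0.5, tau=1.0, L=4.0, N=200, dt=1e-3, T=1.0):
    """Explicit finite-volume scheme for the constrained Fokker-Planck equation."""
    dx = 2 * L / N
    x = -L + (np.arange(N) + 0.5) * dx       # cell centres
    xf = 0.5 * (x[1:] + x[:-1])              # interior cell faces
    rho = np.exp(-x**2 / (2 * nu**2))
    rho /= rho.sum() * dx                    # unit mass; first moment 0 by symmetry
    n_steps = int(round(T / dt))
    for k in range(n_steps):
        t = k * dt
        rho_f = 0.5 * (rho[1:] + rho[:-1])   # density averaged onto the faces
        # Flux without the multiplier: nu^2 * d_x rho + H' * rho.
        A = nu**2 * np.diff(rho) / dx + H_prime(xf) * rho_f
        ell_dot = (ell(t + dt) - ell(t)) / dt
        # Discrete analogue of sigma = int H' rho dx + tau * ell_dot.
        sigma = (A.sum() * dx + tau * ell_dot) / (rho_f.sum() * dx)
        # Full flux with zero flux through the two boundary faces.
        J = np.concatenate(([0.0], A - sigma * rho_f, [0.0]))
        rho = rho + dt / tau * np.diff(J) / dx
    return x, dx, rho

ell = lambda t: 0.5 * np.sin(t)              # prescribed first moment, ell(0) = 0
x, dx, rho = simulate(ell, H_prime=lambda y: y)
mass = rho.sum() * dx
moment = (x * rho).sum() * dx
```

In this discretization the mass is conserved by the flux form, and a summation by parts shows that the first moment changes by exactly `ell(t + dt) - ell(t)` in each step, mirroring the role of the Lagrange multiplier in the continuous equation. The explicit time stepping requires `dt` of order `dx**2 / nu**2` or smaller, so the sketch is not suited to the small-$\nu$ regime.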

3. The time discrete scheme and existence of weak solutions

In this section, we make the discussion of Section 2.3 rigorous by showing existence of weak solutions of (1.1) with constraint (1.2) by using a variational implicit Euler scheme based on the constrained gradient flow. For the existence, the parameter is set to one.

First, let us fix the assumptions throughout this section and define weak solutions for the constrained Fokker-Planck equations.

Assumption 3.1.

The function has at most quadratic growth at infinity such that for some and all


The dynamical constraint is Lipschitz, i.e. , and the initial data satisfies and .

Definition 3.2 (weak solutions).

We say that is a weak solution of (1.1) and (1.2) on for , if and for a.e. holds with


as well as is a distributional solution of (1.1), that is for all holds


For the proof of existence of a weak solution of (1.1) with constraint (1.2) we mainly follow the ideas in [15], which use a Wasserstein gradient flow with respect to the free energy functional  (1.4). This gradient flow is carried out in a time-discrete manner for an arbitrary but fixed time-step length, thereby leading to a sequence of piecewise constant approximations of the solution. Finally, the limit is taken and it is proven that the limit actually is a weak solution of (1.1) with constraint (1.2). The main additional difficulty in comparison to [15] is the need for additional estimates on the Lagrange multiplier and the second moment.

3.1. The Euler scheme

Since the metric and Onsager operator are induced by the Wasserstein distance (cf. also [15, 1]), we use the following time-discrete variational approximation. Let be a fixed time step and consider the following constrained implicit Euler scheme


In the following, we often investigate the entropy and potential energy inside of the free energy (1.4) separately and write with


First of all we show the well-posedness of the scheme.

Proposition 3.3 (Well-posedness of the scheme).

Given , there is a unique sequence satisfying the scheme (3.4).

The proof mainly follows the respective proof in [15, Proposition 4.1], which uses the direct method to show existence and exploits the strict convexity of the functional and the convexity of for the uniqueness. The only additional step is to show that the first moment is preserved along the minimizing sequence in the direct method (cf. [10, Proposition 4.2]).
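
The structure of one step of a constrained implicit Euler scheme can be illustrated in finite dimensions. The sketch below is a toy analogue under assumptions not taken from the text: the Wasserstein distance is replaced by the Euclidean distance on $\mathbb{R}^n$, the free energy by a quadratic energy $E(x) = \tfrac12 x^\top Q x - b^\top x$ with symmetric positive definite $Q$, and the constraint by $a\cdot x = \ell(kh)$. Each step minimizes $|x - x^{k-1}|^2/(2h) + E(x)$ over the constraint set, and the first-order optimality condition produces a Lagrange multiplier `lam`. All names are hypothetical.

```python
import numpy as np

def constrained_implicit_euler(Q, b, a, ell, x0, h, n_steps):
    """Minimizing-movement steps x_k = argmin |x - x_{k-1}|^2/(2h) + E(x), a.x = ell(k h)."""
    n = len(x0)
    M = np.eye(n) / h + Q                 # Hessian of the step functional
    w = np.linalg.solve(M, a)             # response of the minimizer to the multiplier
    xs = [np.array(x0, dtype=float)]
    for k in range(1, n_steps + 1):
        # Unconstrained minimizer of the step functional.
        u = np.linalg.solve(M, xs[-1] / h + b)
        # Multiplier chosen such that a.(u + lam * w) = ell(k h).
        lam = (ell(k * h) - a @ u) / (a @ w)
        xs.append(u + lam * w)
    return np.array(xs)

Q = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 3.0]])
b = np.array([1.0, 0.0, -1.0])
a = np.array([1.0, 1.0, 1.0])
ell = lambda t: 1.0 + 0.5 * t
h, n_steps = 0.1, 20
traj = constrained_implicit_euler(Q, b, a, ell, np.array([1.0, 0.0, 0.0]), h, n_steps)
```

Since the step functional is strictly convex and the constraint set is affine, each step has a unique minimizer, which parallels the role of convexity in the uniqueness part of Proposition 3.3.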

3.2. Passing to the Limit

In this section we show that a constant in time interpolation of the solution of the discrete scheme (3.4) leads to a weak solution of (1.1) with constraint (1.2). Again we follow the structure of the respective proof outlined in [15, Theorem 5.1]. However, the additional constraint (1.2) on the first moment leads to the rise of a Lagrange multiplier (1.3) which needs to be extracted from the discrete scheme (3.4).

Theorem 3.4 (Existence of a weak solution).

Suppose Assumption 3.1 holds. For fixed , let be the solution of the scheme (3.4). Define the constant interpolation by


Then for any and


and is a weak solution of (1.1) with constraint (1.2). Moreover it holds

Remark 3.5 (Regularity, energy dissipation and uniqueness).

The regularity of the solutions constructed in Theorem 3.4 can be improved. The only difference to the unconstrained case is the Lagrange multiplier . However, the uniform bounds for and already contained in [14, Appendix A Proposition 2] (cf. also Lemma 4.4) ensure that we are able to apply standard regularity results for the Fokker-Planck equation (cf. [15] and [10, Chapter 5]) to obtain


The regularity can be further improved under stronger assumptions on the potential and external constraint (cf. [10, Theorem 5.2] for a detailed statement).

Using the improved regularity properties (3.9), a chain rule is established, which rigorously shows the energy-dissipation identity (1.5). Similarly, the improved regularity is sufficient to establish uniqueness by a comparison argument. Due to the nonlocal nature of the equation, the strategy for the uniqueness proof in [15] has to be modified. Instead of proving uniqueness for the solutions themselves, following the idea of [14] one considers the distribution function, which allows for a comparison principle (cf. [10, Chapter 6]).

The proof of Theorem 3.4 is based on the following three Lemmas, which are proved separately in the next section. The first one provides an approximate weak formulation of the Fokker-Planck equation (1.1) with constraint (1.2).

Lemma 3.6 (Time-discrete approximation of the weak formulation).

The solution to the discrete scheme (3.4) satisfies for all and all


where the discrete Lagrange multiplier is given by


In the next Lemma, we establish a priori bounds, which allow us to pass to the limit.

Lemma 3.7 (A priori estimates for the discrete scheme).

Let be the solution of the scheme (3.4). Then for any there exists such that for all and all with the following a priori estimates hold true


Based on the a priori estimates of Lemma 3.7, the only additional difficulty in the passage to the limit in the approximate weak formulation (3.10) is the convergence of the Lagrange multiplier . Hence, we prove its uniform convergence separately in the next Lemma.

Lemma 3.8 (Convergence of the Lagrange multiplier).

Let be the solution of the scheme (3.4) and define


with given by (3.11). Then, there exists such that


With the help of the above three Lemmas, we can prove Theorem 3.4.

Proof of Theorem 3.4.

The a priori estimates of Lemma 3.7 allow us to pass in the piecewise constant interpolation from (3.6) to the limit . Indeed, from (3.12) we derive tightness of and from (3.13), since has superlinear growth, we deduce that for any holds up to a subsequence


In addition the tightness implies that is a probability density for a.e. , which shows (3.7). To prove for a.e. , we note that the a priori estimate (3.12) passes to the limit and we have for all . Similarly, for showing that , we use from the construction the identity


The second moment bound (3.12) implies enough tightness to pass to the limit in the identity. By the growth assumption (3.1) on , the statement follows along the same argument as the proof for , which completes the proof of (3.8).

It remains to show that solves (1.1). Therefore, we sum (3.10) from , use the a priori estimate (3.15) and obtain for any by using the definition of from (3.16) the estimate


To arrive at the weak formulation of (1.1), it is left to pass to the limit on both sides. The right-hand side goes to zero for , which follows directly from (3.15). For passing to the limit on the left hand side, we use (3.17) from Lemma 3.8 and finally obtain


which by Definition 3.2 is a weak solution to the constrained Fokker-Planck equation. ∎

3.3. Proof of auxiliary Lemmas 3.6 to 3.8

Proof of Lemma 3.6.

We can choose arbitrary but fixed, hence we neglect it in the notation of the proof. Since minimizes (3.4) among all admissible probability densities , the Euler-Lagrange equation has to ensure that perturbations of are still in . The perturbations are realized as a push-forward with respect to the flow of a smooth vector field.

In contrast to [15], we have to use a second push-forward as a correction to ensure that the constraint on the first moment is met. This second push-forward gives rise to the Lagrange multiplier. Note that the idea is the same as in Section 2.2 and we would like to choose the constant vector field corresponding to (cf. (2.12)). However, this vector field is not in , so we work with an admissible one and use an approximation argument at the end of the proof.

Let and let be the flow with respect to , that is the solution of


Then the push-forward of with respect to denoted by is given by


For the correction take another vector field , that satisfies the nondegeneracy property ensuring that the respective push-forward is able to change the first moment of . We define the flow with respect to :


Let the joint push-forward be given as


To make sure that , it needs to be shown that . First of all, observe that on and for some . Hence, we can always approximate by using a spatial cut-off function to arrive at the estimate


Another approximation argument with a spatial cut-off function ensures the identity


The smoothness of the flows allows us to differentiate the above functional


Now, the constraint reads . Hence, by the implicit function theorem, due to the nondegeneracy property of , there is some and a function such that . To identify the Lagrange multiplier, we note that is given by


For the Euler-Lagrange equation, we proceed to calculate

Ad a) By using monotone convergence, we can use as a test function in the push-forward and have . Therefore, it holds