A higher-order gradient flow scheme for a singular one-dimensional diffusion equation
A nonlinear diffusion equation, interpreted as a Wasserstein gradient flow, is numerically solved in one space dimension using a higher-order minimizing movement scheme based on the BDF (backward differentiation formula) discretization. In each time step, the approximation is obtained as the solution of a constrained quadratic minimization problem on a finite-dimensional space consisting of piecewise quadratic basis functions. The numerical scheme conserves the mass and dissipates the -norm of the two-step BDF time approximation. Numerically, also the discrete entropy and variance are decaying. The decay turns out to be exponential in all cases. The corresponding decay rates are computed numerically for various grid numbers.
Key words and phrases:Super-fast diffusion equation, Wasserstein gradient flow, minimizing movements, higher-order scheme, G-norm.
2000 Mathematics Subject Classification:65M99, 35K20.
The aim of this paper is to propose and study a fully discrete higher-order variant of the minimizing movement scheme in one space dimension for the nonlinear diffusion equation
with negative exponent , where is the -dimensional torus. Equation (1) can be written as the gradient flow of the entropy with respect to the Wasserstein distance. The gradient flow formulation gives rise to a natural semi-discretization in time of the evolution by means of the minimizing movement scheme, leading to a minimization problem for the sum of kinetic and potential energies.
Minimizing movement schemes for evolution equations with an underlying gradient flow structure were first suggested by De Giorgi  in an abstract framework. Jordan, Kinderlehrer, and Otto  have shown that the solution to the linear Fokker-Planck equation can be obtained by minimizing the logarithmic entropy in the Wasserstein space. Since then, many nonlinear evolution equations have been shown to constitute Wasserstein gradient flows, for instance the porous-medium equation , the Keller-Segel model , equations for interacting gases , and a nonlinear fourth-order equation for quantum fluids .
The minimizing movement scheme can be interpreted as an implicit Euler semi-discretization with respect to the Wasserstein gradient flow structure. While for some time this has been mainly used as an analytical tool, more recently numerical approximations of evolution equations using this scheme have been proposed. In one space dimension, the optimal transport metric becomes flat when re-parametrized by means of inverse cumulative functions, which simplifies the numerical solution; see e.g. [1, 4, 20, 22]. For multi-dimensional situations, one approach is based on the Eulerian representation of the discrete solution on a fixed grid. The resulting problem can be solved by using interior point methods , finite elements , or finite volumes . Another approach employs the Lagrangian representation, which is well adapted to optimal transport. Examples are explicit marching schemes , moving meshes , linear finite elements for a fourth-order equation , reformulations in terms of evolving diffeomorphisms , and entropic smoothing using the Kullback-Leibler divergence . The connection between Lagrangian schemes and the gradient flow structure was investigated in . In this paper, we will use the Lagrangian viewpoint.
The minimizing movement scheme of De Giorgi is of first order in time only since it is based on the implicit Euler method. Concerning higher-order schemes, we are only aware of the paper . There, second-order gradient flow schemes were suggested for the Euler equations, with finite differences in space and the two-step BDF (Backward Differentiation Formula) method or diagonally implicit Runge-Kutta (DIRK) schemes in time.
In this paper, we propose a fully discrete second-order minimizing movement scheme using quadratic finite elements in space and the two-step BDF method in time. We consider periodic point-symmetric solutions. The finite-dimensional minimization problem, constrained by the mass conservation, is solved by the method of Lagrange multipliers which leads to a sequential quadratic programming problem.
By construction, our numerical scheme is of second order both in time and space, it conserves the mass and dissipates the -norm of the approximation of the quadratic ansatz functions at time step ,
where the matrix is defined in the approximation of the Wasserstein metric on the space of the quadratic ansatz functions. We refer to Section 3 for details. It turns out that numerically, the relative -norm decays exponentially fast to zero. Although we cannot expect for the multistep scheme that the discrete entropy decays exponentially fast, this holds true for the numerical experiments performed in this paper. Furthermore, also the discrete variance of the original variable and the Lagrangian variable decay exponentially fast.
The numerical tests also indicate that the decay rate of the entropy is increasing in the number of grid points, i.e., the discrete decay rates are smaller than the (expected) value for the continuous equation. This result is in accordance with the findings of  for a finite-volume approximation of a one-dimensional linear Fokker-Planck equation.
Let us briefly review the literature for the diffusion equation (1) with . Equation (1) with appears in the modeling of heat conduction in solid Helium, where the solution corresponds to the inverse temperature . When this equation is considered in the whole space or in a bounded domain with homogeneous Dirichlet boundary conditions, it is sometimes called the super-fast diffusion equation [28, Chap. 9]. The critical exponent of this equation in one space dimension is . For and , we have a smoothing property, namely for any [28, Section 9.1]. For , no solutions exist with data in . The non-existence range in dimensions contains even all negative exponents, . However, if and , there is a weak smoothing effect. Indeed, given for , the solution exists and is locally bounded in . Furthermore, if then there is instantaneous extinction, i.e. in for all [28, Theorem 9.3].
Clearly, such results cannot be expected when the super-fast diffusion equation is considered on the torus. We expect that global-in-time weak solutions exist, which converge to the constant steady state as . Since we could not find any results on the existence and large-time asymptotics in the literature in that situation and since the use of negative exponents is less standard, we provide a (short) proof for completeness in the appendix; see Section 2.1 for details.
The paper is organized as follows. The global existence result and the exponential decay of the solutions to the constant steady state as well as some basic properties of the Wasserstein distance for periodic functions are stated in Section 2. Section 3 is devoted to the description of the numerical scheme, and some numerical experiments are presented in Section 4. We conclude in Section 5. The appendix contains the proofs of the existence and large-time asymptotics theorems and the calculations of the coefficients of the matrix and the Hessian of the discrete entropy.
2.1. Existence of solutions and large-time asymptotics
Equation (1) on the torus does not possess the non-existence or instantaneous extinction properties of the super-fast diffusion equation in the whole space since mass cannot get lost. In fact, we expect that for any , there exists a global weak solution. If the initial datum is nonnegative only, (1) is still a singular diffusion equation. However, because of the fast diffusion, the solution becomes positive for all positive times and, by parabolic regularity theory, also smooth.
Theorem 1 (Existence of weak solutions).
Let and let satisfy . If , we assume additionally that . Then there exists a unique weak solution to (1) satisfying , for all , and for , .
The proof of this theorem is based on a standard regularization procedure but we need to distinguish carefully the cases , , and . We present the (short) proof for completeness in Appendix A. For , the (smooth) solution converges to the constant steady state. Since this constant is positive, diffusion slows down when time increases. Therefore, we cannot expect instantaneous extinction phenomena. Still, we are able to prove that the convergence is exponentially fast with respect to the -norm. We introduce the entropy
The steady state of (1) is given by if .
Theorem 2 (Exponential decay).
In the first result, the decay rate depends on , which seems to be not optimal. The decay rate in the second result depends on the -norm of only but we need a particular value of . The proof is based on the entropy method; see Appendix A. Stronger decay results have been derived for the fast-diffusion equation in the whole space or in bounded domains; see, e.g., [5, 28]. However, our proof is very elementary and just an illustration for the qualitative behavior of the solutions to (1).
2.2. The Wasserstein distance for periodic functions
We recall some basic facts about mass transportation and Wasserstein distances following . For more details, we refer to [2, 30]. Let be a Riemannian manifold with distance and let be the convex set of measures with fixed mass on . The -Wasserstein distance of two measures , is defined by
where is the set of transport plans connecting with , i.e. the set of all measures on with respective marginals and ,
for all measurable sets , . If the measures and possess densities and , respectively, with respect to a fixed measure on , then we write, slightly abusing the notation, instead of .
In one space dimension, there exists an explicit formula to compute . Let be a (possibly infinite) interval, and let , be two measures. We define their distribution functions
As these functions are right-continuous and monotonically increasing, they possess right-continuous increasing pseudo-inverse functions , given by
This formula does not extend to because of the topology induced by the periodic boundary conditions, . The reason is that mass can be transported either clock- or counter-clockwise (see  for details). However, if the densities and are point-symmetric, (4) still holds. More precisely, let for and . Then (4) holds, where is the inverse function of [13, Lemma 2.2].
3. Time discretization and Lagrangian coordinates
3.1. The semi-discrete BDF scheme
We introduce the second-order minimizing movement scheme. First, we explain the underlying idea for the finite-dimensional gradient flow
where is a smooth potential. This equation can be approximated by the following minimization problem:
where is the time step size and is an approximation of . The minimizer is a critical point and thus,
which corresponds to the implicit Euler scheme.
Instead of the Euler scheme, we wish to discretize (1) by a multistep method. As an example, consider the two-step BDF (or BDF-2) method,
where and are given. Writing this scheme as
we see that is a critical point of the functional
More generally, the BDF- approximation of (5),
for given can be formulated as
since , or as the minimization problem
In a similar way, we may formulate general multistep methods. We recall that BDF- schemes are consistent if , and zero-stable if and only if [25, Section 11.5].
3.2. Lagrangian coordinates
Before introducing the spatial discretization, we rewrite the scheme (7) in a more explicit manner, using the inverse distribution functions and of and , respectively, which were introduced in Section 2.2. The numerical procedure is similar to that in  but our higher-order scheme introduces some changes. We call the Lagrangian coordinate, which is conjugate to the Eulerian coordinate , and we refer to the inverse distribution function as the associated Lagrangian map. For a consistent change of variables, we need to express the entropy in terms of the Lagrangian coordinates. With the formula for the differential of an inverse function,
and the change of unknowns under the integral in , we obtain
This expression is simply a quadratic form in and thus easy to implement in the numerical scheme. We summarize our results which slightly generalize Lemma 2.3 in .
Let the initial datum be point-symmetric. Then the solution to the BDF- scheme (7) is in one-to-one correspondence to the solution obtained from the inductive scheme
with initial condition and given obtained from a lower-order scheme. The argmin has to be taken over all measurable functions satisfying the mass constraint , and the function is given by
Moreover, and the functions and are related by
In Lagrangian coordinates, the problem has become a minimization problem in which is the sum of a quadratic form and a convex functional, hence it is convex. In the special case , the second integral in (9) is quadratic too which simplifies the numerical discretization. Therefore, we will consider mainly numerical examples with in Section 4.
3.3. Spatial discretization
We approximate the infinite-dimensional variational problem (8) by a finite-dimensional one. Minimization in (8) is performed over the finite-dimensional space of quadratic ansatz functions. This generalizes the approach in , where only linear ansatz functions were used. We define the ansatz space as follows.
Let and a mesh on be given with and . Using (10), we construct the mesh of . Then , , , and (point-symmmetry), where . Since we wish to introduce quadratic ansatz functions, we add the grid points for .
The basis functions are defined by
This set of piecewise linear functions is supplemented by the following piecewise quadratic basis functions:
The ansatz space is the set of all positive, piecewise quadratic functions of the form
We call the associated weight vector. By definition of , we have for . Moreover, as is point-symmetric, .
Now, for given mass and grid , we define the set as the set of weight vectors for which the associated interpolation from (11) satisfies the mass constraint,
The Wasserstein metric for functions approximated in this way becomes
The coefficients , , and can be computed explicitly. The explicit expressions are given in Appendix B.1. Setting , , , and defining the matrix
we can formulate the above sum as
As the matrices and are symmetric, is symmetric, too. The matrix corresponds to the linear approximation considered in .
3.5. Fully discrete Euler-Lagrange equations
The minimizer in (15) is subject to the mass constraint (12), by definition of the set . Therefore, instead of working on the set , it is more convenient to consider (15) as a constrained minimization problem for on the larger set , which is solved by the method of Lagrange multipliers using the Lagrange functional
A critical point of satisfies the conditions
The precise values for are given in Appendix B.2 for . The condition for the constraint is recovered from
The vector is the gradient of with respect to . We approximate a critical point numerically by applying the Newton method to the first-order optimality condition . This leads to a sequential quadratic programming method, since at every Newton iteration step a quadratic subproblem has to be solved.
Let the solution at the th time step be given and let , . The iteration is as follows:
where is the solution to the linear system
where denotes the Hessian of , whose entries are given in Appendix B.2 for . The iteration is stopped if the norm of is smaller than a certain threshold (see Section 4 for details). In this case, we define and at the th time step. For the BDF- scheme, the values are computed from a lower-order scheme. In the numerical section below, we employ the BDF-2 scheme only such that is calculated by the implicit Euler method.
Note that the constrained minimization problem is exactly mass conserving by construction, but the Newton iteration introduces a small error which depends on the tolerance imposed in the Newton method.
In each iteration step, we need to invert the dense matrix which is the sum of and the Hessian of . This is not a numerical challenge in the one-dimensional case we consider but it may become critical in multi-dimensional discretizations on fine grids. For , however, and the Hessian of are constant matrices which significantly simplifies the Newton scheme.
3.7. Choice of the initial condition
In order to compute the initial condition in Lagrangian coordinates, we need to make precise the values of the vector and the points of the spatial lattice which is moving as the solution evolves. Let the mesh be given and set . Approximating the initial function by a linear ansatz function, we obtain
This is a system of linear equations in . Choosing the uniform grid , we can solve this system explicitly. Indeed, since
which can be solved for :
As is assumed to be point-symmetric, this is true for too. Finally, we approximate by the arithmetic mean , . Consequently, the weights vanish for in the expansion , which is consistent with our approximation (17). We also refer to the discussion in [13, Section 2.8].
4. Numerical experiments
In this section, we present some numerical results for (1) with , by employing the BDF-2 method with quadratic ansatz functions. We choose a uniform grid for with grid points, and the time step size . The Newton iterations are stopped if both the relative error in the -variables and the norm of are smaller than .
Figure 1 illustrates the temporal evolution of the solution with the initial conditions (left figure) and (right figure). We observe that, as expected, the solutions converge to the constant steady state. Because of the negative exponent , very small initial values increase quickly in time.
A nice feature of the Wasserstein gradient flow scheme is that we may interpret the evolution as a process of redistribution of particles with spatio-temporal density on under the influence of a nonlinear particle interaction, which is described by . The way in which the initial density is “deformed” during the time evolution is illustrated in Figure 2. We have chosen 50 “test particles” for the solutions to (1) for the initial conditions chosen above. We stress the fact that the density of trajectories can generally be not identified with the density of the solution.
We verify that the discretization is indeed of second order. Figure 3 shows the -error for various numbers of grid points . We have chosen the initial datum , the end time , and the time step size . The reference solution is computed by using , and . The differences and in the norm feature the expected second-order dependence on .
Next, we fix the number of grid points and compute the error for varying time step sizes ; see Figure 4. Because of the approximation of the initial datum as detailed in Section 3.7, the error will be not of second order initially. Therefore, we compute the error in the interval with . The errors are of second order, as expected.
The time decay of the discrete version of the relative entropy is presented in Figure 5 (left) for various grid numbers. We observe that the decay is exponential until saturation. The saturation comes from the spatial error and the error from the Newton iteration. The decay rate is estimated in the linear regime from the difference quotient
The numerical decay rates for are shown in Figure 6. The rate for (right) is much larger than the corresponding one for (left), since a smaller exponent yields a larger diffusion coefficient (if ) and thus, diffusion becomes faster. We also see that the decay rates become larger on a finer spatial grid. This behavior seems to confirm recent analytical results for spatial discretizations of Fokker-Planck equations; see [21, Section 5]. One may ask if a similar behavior can be observed for the decay rate as a function of the time step size. However, our numerical experiments do not show a monotonic dependence (figures not presented); rather the decay rates vary in a small range which seems to be determined by the other numerical error parts.
In Figure 7, the decay of the square of the relative -norm is presented. The -norm is calculated according to (2), where the argument is given by and is the weight vector corresponding to the constant steady state. Again, the decay is much faster for because of the faster diffusion.
Finally, we present some results on the time decay of the discrete variance of and at time , defined by
where is the expectation value of (which equals the mass and is therefore constant in time). The discrete variance of is defined in a similar way. Interestingly, the variances are exponentially decaying (Figure 8), although it is not clear how to prove this property analytically.
We have proposed an extension of the standard minimizing movement scheme to higher-order BDF time discretizations. Quadratic finite elements have been used to discretise in (mass) space. As an example, we have considered a (singular) diffusion equation, but it should be possible to adapt the scheme to other partial differential equations which constitute Wasserstein gradient flows. The implicit numerical scheme is based on successive solution of constrained minimization problems. The quadratic convergence in space and time of our method has been confirmed numerically. It turns out that the relative entropy, -norm, and variance converge exponentially to zero. An interesting observation is that the decay rate depends monotonically on the number of grid numbers, but it seems that there is no monotonic behavior with respect to the time step size. Future work may be concerned with an analytical derivation of numerical decay rates using a discrete variant of the Bakry-Emery approach. First steps in this direction were presented in  but only for a semi-discretization. Possibly this approach has to be adapted to -stable schemes in the spirit of .
Proof of Theorem 1.
The existence proof is based on a regularization of the initial datum and a fixed-point argument. Let and . Let and , Set
where will be determined later. The set is convex and, by Aubin’s lemma, compact in . Let and let be the weak solution to
This defines the fixed-point operator , . Standard arguments show that is continuous. We verify that . By the maximum principle, . Using as a test function in the weak formulation of (18) shows that , where is some constant depending on . Moreover, . Thus, setting, , we infer that . By the fixed-point theorem of Schauder, there exists a fixed point of .
In order to perform the limit , we need to derive -independent estimates for . To this end, we need to distinguish several cases. First, let . Employing the test function in the weak formulation of (18) with , we find that
The right-hand side is uniformly bounded as we assumed that . Since