From diffusion to reaction via convergence
Abstract
We study the limit of high activation energy of a special FokkerPlanck equation, known as KramersSmoluchowski (KS) equation. This equation governs the time evolution of the probability density of a particle performing a Brownian motion under the influence of a chemical potential . We choose having two wells corresponding to two chemical states and . We prove that after a suitable rescaling the solution to (KS) converges, in the limit of high activation energy (), to the solution of a simple system modeling the diffusion of and , and the reaction .
The aim of this paper is to give a rigorous proof of Kramer’s formal derivation and to embed chemical reactions and diffusion processes in a common variational framework which allows to derive the former as a singular limit of the latter, thus establishing a connection between two worlds often regarded as separate.
The singular limit is analysed by means of Gammaconvergence in the space of finite Borel measures endowed with the weak topology.
Key words and phrases: unification, scalebridging, upscaling, highenergy limit, activation energy, Dirichlet forms, Moscoconvergence, variational evolution equations
AMS subject classification: 35K57, 35Q84, (49J45, 49S05, 80A30)
1 Introduction
1.1 Chemical reaction as a diffusion process
In a seminal paper in 1940, Hendrik Anthony Kramers described a number of approaches to the problem of calculating chemical reaction rates [12]. One of the limit cases in this paper is equivalent to the motion of a Brownian particle in a (chemical) potential landscape. In this description a reaction event is the escape of the particle from one energy well into another.
This description is interesting for a number of reasons. It provides a connection between two processes, diffusion and reaction, which are often—especially at the macroscopic level—viewed as completely separate. It also provides a link between a macroscopic effect—chemical reaction—and a more microscopic, underlying motion, and in doing so, it highlights the fact that diffusion and reaction ultimately spring from the same underlying motion. It finally also allows for explicit calculation of reaction rates in terms of properties of the energy landscape.
In this paper we contribute to this discussion by studying the limit process of high activation energy in the unimolecular reaction . As a first contribution, this provides a rigorous proof of the result that Kramers had derived formally. At the same time we extend his result to a Brownian motion in the product space spanned by both the chemical variable of Kramers and the variables corresponding to position in space, resulting in a limit system that models not only chemical reaction but also spatial diffusion—a simple reactiondiffusion system.
With this paper we have two aims. The first is to clarify the mathematical—rigorous—aspects of the formal results of [12], and extend them to include spatial diffusion, and in this way to contribute to the upscaling of microscopic systems. The second is to make a first step in the construction of a variational framework that can describe the combination of general diffusive and chemically reactive processes. From this point of view it would be interesting, for example, to place the limit system in the context of Wasserstein gradient flows (see also Section 1.10). Initiated by the work of Otto [11, 15] and extended into many directions since, this framework provides an appealing variational structure for very general diffusion processes, but chemical reactions have so far resisted representation in the Wasserstein framework.
In this paper we only treat the simple equation , but we plan to extend the approach to other systems in the future (see also [14]).
1.2 The setup: enthalpy
We consider the unimolecular reaction . In chemical terms the and particles are two forms of the same molecule, such that the molecule can change from one form into the other. A typical example is a molecule with spatial asymmetry, which might exist in two distinct, mirrorimage spatial configurations; another example is that of enzymes, for which the various spatial configurations also have different biological functions.
Remark.
Classical, continuumlevel modelling of the system of and particles that diffuse and react (see e.g. [9, 3]) leads to the set of differential equations, where we write and for the concentrations of and particles:
(1a)  
(1b) 
(See Section 1.10 for the equal reaction rates). This system will arise as the upscaling limit (see Theorem 1) of the system that we now develop in detail.
We next assume that the observed forms and correspond to the wells of an appropriate energy function. Since it is common in the chemical literature to denote by ‘enthalpy difference’ the release or uptake of heat as a particle is converted into a particle , we shall adopt the same language and consider the and states to correspond to the wells of an enthalpy function .
While the domain of definition of should be highdimensional, corresponding to the many degrees of freedom of the atoms of the molecule, we will here make the standard reduction to a onedimensional dependence. The variable is assumed to parametrize an imaginary ‘optimal path’ connecting the states to , such that corresponds to and to . Such a path should pass through the ‘mountain pass’, the point which separates the basins of attraction of and , and we arbitrarily choose that mountain pass to be at , with . We also restrict to the interval , and we assume for simplicity that the wells are at equal depth, which we choose to be zero. A typical example of the function is showed in Figure 1.
Specifically we make the following assumptions about : , is even in , maximal at with value , and minimal at with value ; for any ; . The assumption of equal depth for the two wells corresponds to an assumption about the rate constants of the two reactions; we comment on this in Section 1.10.
1.3 Diffusion in the chemical landscape
This newly introduced ‘chemical variable’ should be interpreted as an internal degree of freedom of the particle, associated with internal changes in configuration. In the case of two alternative states of a molecule, parametrizes all the intermediate states along a connecting path.
In this view the total state of a particle consists of this chemical state together with the spatial position of the particle, represented by a dimensional spatial variable in a Lipschitz, bounded, and open domain , so that the full state space for the particle is the closure of
Taking a probabilistic point of view, and following Kramers, the motion of the particle will be described in terms of its probability density , in the sense that for Borel sets and the number is the probability of finding the particle at a position and with a ‘chemical state’ .
The particle is assumed to perform a Brownian motion in , under the influence of the potential landscape described by . This assumption corresponds to the ‘largefriction limit’ discussed by Kramers. The time evolution of the probability distribution then is given by the KramersSmoluchowski equation
(2) 
with Neumann boundary conditions on the lateral boundary . The coefficient is introduced to parametrize the difference in scales for and : since is a rescaled physical distance, and is a rescaled ‘chemical’ distance, the units of length in the two variables are different, and the parameter can be interpreted as the factor that converts between the two scales. Below we shall make an explicit choice for .
1.4 The limit of high activation energy
In the setup as described above, there is a continuum of states (i.e. ) connecting the state to the state, and a statement of the type ‘the particle is in the state’ is therefore not well defined. In order to make a connection with the macroscopic description ‘’, which presupposes a clear distinction between the two states, we take the limit of high activation energy, as follows.
We rescale the enthalpy with a small parameter , to make it . (This is called ‘high activation energy’ since is the height of the mountain that a particle has to climb in order to change states).
This rescaling has various effects on the behaviour of solutions of (2). To illustrate one effect, let us consider the invariant measure , the unique stationary solution in of (2):
(3) 
(where are the  and dimensional Lebesgue measures). The constant is fixed by the requirement that .
Since is strictly positive at any , the exponential vanishes at all except for ; therefore the measure concentrates on the lines and , and converges weakly as to the limit measure given by
(4) 
Here weak convergence is to be interpreted in the duality with continuous functions in (thus considering as a weakly closed convex subset of the space of signed Borel measures with finite total variation) i.e.
We should interpret the behaviour of as follows. In the limit , the deep wells at force particles to stay increasingly close to the bottom of the wells. However, at any given , there is a positive probability that a particle switches from one well to the other in any given period of time. The rate at which this happens is governed by the local structure of near and near , and becomes very small—of order , as we shall see below.
In the limit , the behaviour of particles in the direction is no longer recognizable as diffusional in nature. In the direction a particle can only be in one of two states , which we therefore interpret as the and states. Of the diffusional movement in the direction only a jump process remains, in which a particle at jumps with a certain rate to position , or vice versa.
1.5 Spatiochemical rescaling
Since the jumping (chemical reaction) rate at finite is of order , the limiting reaction rate will be zero unless we rescale the system appropriately. This requires us to speed up time by a factor of . At the same time, the diffusion rate in the direction remains of order as , and the rescaling should preserve this. In order to obtain a limit in which both diffusion in and chemical reaction in enter at rates that are of order , we use the freedom of choosing the parameter that we introduced above.
We therefore choose equal to
(5) 
and we then find the differential equation
(6) 
which clearly highlights the different treatment of and : the diffusion in is independent of while the diffusion and convection in the variable are accelerated by a factor .
1.6 Switching to the density variable
As is already suggested by the behaviour of the invariant measure , the solution will become strongly concentrated at the extremities of the domain . This is the reason why it is useful to interpret as a family of timedependent measures, instead of functions. It turns out that the densities
of with respect to also play a crucial role and it is often convenient to have both representations at our disposal, freely switching between them. In terms of the variable equation (6) becomes
(7) 
supplemented with the boundary conditions
(8) 
We choose an initial condition
(9) 
Let us briefly say something about the functionalanalytic setting. It is well known (see e.g. [7]) that the operator with Neumann boundary conditions (8) has a selfadjoint realization in the space . Therefore the weak form of equation (7) can be written as
(10) 
where the bilinear forms and are defined by
and
Since is densely and continuously imbedded in , standard results on variational evolution equations in an Hilbert triplet (see e.g. [13, 6]) and their regularizing effects show that a unique solution exists in for every initial datum .
1.7 Main result I: weak convergence of and
The following theorem is the first main result of this paper. It states that for every time the measures solutions of (6) weakly converge to a limiting measure in , whose density is the solution of the limit system (1). Note that for a function the traces are well defined (in fact, the map is an isomorphism between and
We state our result in a general form, which holds even for signed measures in .
Theorem 1.
Let be the solution of (6–9) with initial datum . If
(11) 
and
weakly converges to as ,  (12) 
then , , and for every the solution weakly converge to
(13) 
whose densities belong to and solve the system
(14a)  
(14b)  
(14c) 
The positive constant in (14a,b) can be characterized as the asymptotic minimal transition cost
(15) 
Remark (The variational structure of the limit problem).
The “” limit problem (14a14c) admits the same variational formulation of the “” problem we introduced in Section 1.6. Recall that is the measure defined in (4) as the weak limit of ; we set , and for every with we set . We define
(16) 
Similarly, we set , which is continuously and densely imbedded in , and
(17) 
Then the system (14a,b,c) can be formulated as
(18) 
which has the same structure as (10).
1.8 Main result II: a stronger convergence of
Weak convergence in the sense of measures is a natural choice in order to describe the limit of , since the densities and the limit density are defined on different domains with respect to different reference measures. Nonetheless it is possible to consider a stronger convergence which better characterizes the limit, and to prove that it is satisfied by the solutions of our problem.
This stronger notion is modeled on Hilbert spaces (or, more generally, on Banach spaces with a locally uniformly convex norm), where strong convergence is equivalent to weak convergence together with the convergence of the norms:
(19) 
In this spirit, the next result states that under the additional request of “strong” convergence of the initial data , we have “strong” convergence of the densities ; we refer to [17, 10] (see also [2, Sec. 5.4]) for further references in a measuretheoretic setting.
Theorem 2.
Applying, e.g., [2, Theorem 5.4.4] we can immediately deduce the following result, which clarifies the strengthened form of convergence that we are considering here. This convergence is strong enough to allow us to pass to the limit in nonlinear functions of :
Corollary 3.
Under the same assumptions as in Theorem 2 we have
(23)  
where is an arbitrary continuous function satisfying the quadratic growth condition
for suitable nonnegative constants .
1.9 Structure of the proof
Let us briefly explain the structure of the proof of Theorems 1 and 2. This will also clarify the term convergence in the title, and highlight the potential of the method for wider application.
The analogy between (10) and (18) suggests to pass to the limit in these weak formulations, or even better, in their equivalent integrated forms
(24) 
Applying standard regularization estimates for the solutions to (10) and a weak coercivity property of , it is not difficult to prove that “weakly” converges to for every , i.e.
The concept of weak convergence of densities that we are using here is thus the same as in Theorem 1, i.e. weak convergence of the corresponding measures in .
In order to pass to the limit in (24) the central property is the following weakstrong convergence principle:
For every there exists with as such that for every
Note that the previous property implies in particular that recovery family converges “strongly” to , according to the notion considered by Theorem 2, i.e. iff with both and . Corollary 6 shows that this weakstrong convergence property can be derived from convergence in the “weak” topology of the family of quadratic forms
(25) 
In order to formulate this property in the standard framework of convergence we will extend and to lower semicontinuous quadratic functionals (possibly assuming the value ) in the space , following the approach of [8, Chap. 1113]. While the convergence of is a direct consequence of the weak convergence of to , the convergence of is more subtle. The convergence of and the structure of the limit depends critically on the choice of (defined in (5)): as we show in Section 3.2, the scaling of in terms of is chosen exactly such that the strength of the ‘connection’ between and is of order as .
The link between convergence and stability of evolution problems of parabolic type is well known when is a fixed and coercive bilinear form (see, e.g., [4, Chap. 3.9.2]) and can therefore be considered as the scalar product of the Hilbert space . In this case the embedding of the problems in a bigger topological vector space (the role played by in our situation) is no more needed, and one can deal with the weak and strong topology of , obtaining the following equivalent characterizations (see e.g. [5, Th. 3.16] and [8, Th. 13.6]):

Pointwise (strong) convergence in of the solutions of the evolution problems;

Pointwise convergence in of the resolvents of the linear operators associated to the bilinear forms ;

Moscoconvergence in of the quadratic forms associated to ;

convergence in the weak topology of of the quadratic forms to for every .
In the present case, where does depend on , convergence of the extended quadratic forms with respect to the weak topology of is thus a natural extension of the latter condition; Theorem 4 can be interpreted as essentially proving a slightly stronger version of this property. Starting from this convergence result, we will derive the convergence of the evolution problems by a simple and general argument, which we will present in Section 4.
1.10 Discussion
The result of Theorem 1 is amongst other things a rigorous version of the result of Kramers [12] that was mentioned in the introduction. It shows that the simple reactiondiffusion system (14) can indeed be viewed as an upscaled version of a diffusion problem in an augmented phase space; or, equivalently, as an upscaled version of the movement of a Brownian particle in the same augmented phase space.
At the same time it generalizes the work of Kramers by adding the spatial dimension, resulting in a limit system which—for this choice of , see below for more on this choice—captures both reaction and diffusion effects.
Measures versus densities. It is interesting to note the roles of the measures and their densities with respect to . The variational formulation of the equations are done in terms of the densities but the limit procedure is better understood in terms of the measures , since a weak convergence is involved. This also allows for a unification of two problems with a different structure (a FokkerPlanck equation for and a reactiondiffusion system for the couple .)
Gradient flows. The weak formulation (10) shows also that a solution can be interpreted as a gradient flow of the quadratic energy with respect to the distance. Another gradient flow structure for the solutions of the same problem could be obtained by a different choice of energy functional and distance: for example, as proved in [11], FokkerPlanck equations like (6) can be interpreted also as the gradient flow of the relative entropy functional
(26) 
in the space of probability measures endowed with the socalled Wasserstein distance (see e.g. [2]). Other recent work [1] suggests that the Wasserstein setting can be the most natural for understanding diffusion as a limit of the motion of Brownian particles, but in this case it is not obvious how to interpret the limit system in the framework of gradient flows on probability measures, and how to obtain it in the limit as .
In a forthcoming paper we investigate a new distance for the limit problem, modeled on the reactiondiffusion term, and we study how the limit couple of energy and dissipation can be obtained as a limit.
The choice of . In this paper the time scale is chosen to be equal to , and it is a natural question to ask about the limit behaviour for different choices of . If the scaling is chosen differently—i.e. if converges to or —then completely different limit systems are obtained:

If , then the reaction is not accelerated sufficiently as , and the limit system will contain only diffusion (i.e. in (14));

If , on the other hand, then the reaction is made faster and faster as , resulting in a limit system in which the chemical reaction is in continuous equilibrium. Because of this, both and have the same concentration , and solves the diffusion problem
Note the instantaneous equilibration of the initial data in this system.
While the scaling in terms of of can not be chosen differently without obtaining structurally different limit systems, there is still a choice in the prefactor. For with fixed, the prefactor will appear in the definition (15) of .
There is a also a modelling aspect to the choice of . In this paper we use no knowledge about the value of in the diffusion system at finite ; the choice is motivated by the wish to have a limit system that contains both diffusive and reactive terms. If one has additional information about the mobility of the system in the  and directions, then the value of will follow from this.
Equal rate constants. The assumption of equal depth of the two minima of corresponds to the assumption (or, depending on one’s point of view, the result) that the rate constant in (14) is the same for the two reactions and . The general case requires a slightly different choice for , as follows.
Let the original macroscopic equations for the evolution of and (in terms of densities that we also denote and ) be
(27a)  
(27b) 
Choose a fixed function such that and . We then construct the enthalpy by setting
where is the same enthalpy function as above. The same proof as for the equalwell case then gives convergence of the finite problems to (27).
Equal diffusion constants. It is possible to change the setup such that the limiting system has different diffusion rate in and . We first write equation (6) as
where the mobility matrix and the flux are given by
By replacing the identity matrix block in by a block of the form the directional diffusion can be modified as a function of . This translates into two different diffusion coefficients for and .
The function . The limit result of Theorem 1 shows that only a small amount of information about the function propagates into the limit problem: specifically, the local secondorder structure of around the wells and around the mountainpass point.
One other aspect of the structure of is hidden: the fact that we rescaled the variable by a factor of can also be interpreted as a property of , since the effective distance between the two wells, as measured against the intrinsic distance associated with the Brownian motion, is equal to after rescaling.
We also assumed in this paper that has only ‘half’ wells, in the sense that is defined on instead of . This was for practical convenience, and one can do essentially the same analysis for a function that is defined on . In this case one will regain a slightly different value of , namely . (For this reason this is also the value found by Kramers [12, equation (17)]).
Single particles versus multiple particles, and concentrations versus probabilities. The description of this paper of the system in terms of a probability measure on is the description of the probability of a single particle. This implies that the limit object should be interpreted as the density (with respect to ) of a limiting probability measure, again describing a single particle.
This is at odds with common continuum modelling philosophy, where the main objects are concentrations (mass or volume) that represent a large number of particles; in this philosophy the solution of (14) should be viewed as such a concentration, which is to say as the projection onto space of a joint probability distribution of a large number of particles.
For the simple reaction these two interpretations are actually equivalent. This arises from the fact that reaction events in each of the particles are independent of each other; therefore the joint distribution of a large number of particles factorizes into a product of copies of the distribution of a single particle. For the case of this paper, therefore, the distinction between these two views is not important.
More general reactions. The remark above implies that the situation will be different for systems where reaction events cause differences in distributions between the particles, such as the reaction . This can be recognized as follows: a particle that has just separated from a particle (in a reaction event of the form ) has a position that is highly correlated with the corresponding particle, while this is not the case for all the other particles. Therefore the particles will not have the same distribution. The best one can hope for is that in the limit of a large number of particles the distribution becomes the same in some weak way. This is one of the major obstacles in developing a similar connection as in this paper for more complex reaction equations.
1.11 Plan of the paper
One of the main difficulties in the proof of Theorem 1, namely the singular behaviour given by the concentration of the invariant measure onto the two lines at , can be overcome by working in the underlying space of (signed or probability) measures in . This point of view is introduced in Section 2. Section 3 contains the basic convergence results (Theorem 4) and the proof of Theorem 1 and of Theorem 2. The argument showing the link between convergence of the quadratic forms and the convergence of the solutions to the evolution problems (see the comments in section 1.9) is presented in Section 4 in a general form, which can can be easily applied to other situations.
2 Formulation of the evolution equations in measure spaces
The KramersSmoluchowski equation
We first summarize the functional framework introduced above. Let us denote by the scalar product in defined by
(28) 
with the corresponding norm . We introduced two Hilbert spaces
and the bilinear forms
(29)  
(30) 
with which (7) has the variational formulation
(31) 
The main technical difficulty in studying the limit behaviour of (31) as consists of the dependence of the functional spaces . Since for our approach it is crucial to work in a fixed ambient space, we embed the solutions of (31) in the space of finite Borel measures by associating to the measure . We thus introduce the quadratic forms
if and  (32)  
if and  (33) 
trivially extended to when is not absolutely continuous with respect to or its density does not belong to or respectively. Denoting by and their proper domains, we still denote by and the corresponding bilinear forms defined on and respectively. Setting , , (31) is equivalent to the integrated form
(34) 
We also recall the standard estimates
(35)  
(36)  
(37) 
Although versions of these expressions appear in various places, we were unable to find a reference that completely suits our purposes. We therefore briefly describe their proof, and we use the more conventional formulation in terms of the bilinear forms and and spaces and ; note that is an inner product for , and is an inner product for .
When is sufficiently smooth, standard results (e.g. [6, Chapter VII]) provide the existence of a solution , such that the functions and are nonincreasing; in addition, the solution operator (semigroup) is a contraction in . For this case all three expressions can be proved by differentiation.
In order to extend them to all , we note that for fixed the two norms on given by (the square roots of)
(38) 
are identical by (35) on a dense subset. If we approximate a general by smooth , then the sequence is a Cauchy sequence with respect to both norms; by copying the proof of completeness of the space (see e.g. [6, Th. IV.8]) it follows that the integral in (38) converges. This allows us to pass to the limit in (35). The argument is similar for (37), when one writes the sum of (35) and (36) as
(39) 
The reactiondiffusion limit
We now adopt the same point of view to formulate the limit reactiondiffusion system in the setting of measures. Recall that for we set , and thus we defined the function space
and the bilinear forms
(40)  
(41) 
As before we now extend these definitions to arbitrary measures by
(42)  
(43) 
with corresponding bilinear forms and ; problem (14a,b,c) can be reformulated as
or in the integral form
(44) 
3 convergence result for the quadratic forms
The aim of this section is to prove the following convergence result:
Theorem 4.
If as in then
(45) 
For every such that there exists a family weakly converging to such that
(46) 
Note that endowed with the weak topology is the dual of a separable Banach space, and therefore the sequential definition of convergence coincides with the topological definition [8, Proposition 8.1 and Theorem 8.10]; consequently Theorem 4 implies the convergence of the families and . Theorem 4 actually states a stronger result, since the recovery sequence can be chosen to be the same for and . This joint convergence of the families and is nearly equivalent with convergence of combined quadratic forms:
Lemma 5.
Proof.
The first part of the Lemma is immediate. For the second part, suppose that and satisfies (48); the liminf inequality for yields