Data assimilation and parameter estimation for a multiscale stochastic system with stable Lévy noise
Abstract
This work concerns low-dimensional reduction of a slow-fast data assimilation system with non-Gaussian stable Lévy noise via stochastic averaging. When observations are available only for the slow component, we show that the averaged, low-dimensional filter approximates the original filter, by examining the corresponding Zakai stochastic partial differential equations. Furthermore, we demonstrate that the low-dimensional slow system approximates the slow dynamics of the original system, by examining parameter estimation and most probable paths.
Keywords: Multiscale systems, non-Gaussian Lévy noise, averaging principle, Zakai equation, parameter estimation, most probable paths
1 Introduction
Data assimilation is a procedure for extracting system state information with the help of observations [18]. The state evolution and the observations are usually subject to random fluctuations. The general idea is to obtain the best estimate for the true system state, in terms of the probability distribution of the system state, given only some noisy observations of the system. It provides a recursive algorithm for estimating a signal or state of a random dynamical system based on noisy measurements. It is also very important in many practical applications, from inertial guidance of aircraft and spacecraft to weather and climate prediction. Most of the existing works on data assimilation are conducted in the context of Gaussian random fluctuations. The effects of multiscale signal and observation processes in the Gaussian context have been considered by Park et al. (see [1]), who showed that the probability density of the original system converges to that of the reduced system, by a Fourier analysis method. Imkeller et al. [2] further proved convergence in distribution for the optimal filter, via backward stochastic differential equations and asymptotic techniques.
However, random fluctuations are often non-Gaussian (in particular, of Lévy type) in nonlinear systems, for example, in geosciences [21] and biosciences [3, 4, 5, 14, 15, 16, 32]. There are experimental demonstrations of Lévy fluctuations in optimal foraging theory, in the rapid geographical spread of emergent infectious diseases, and in switching currents. Humphries et al. [25] used GPS to track black-browed albatrosses around an island in the Southern Indian Ocean in order to study their food-searching movement patterns. They found, by fitting the data of the movement steps, that the movement pattern obeys a power law. La Cognata et al. [15] considered a Lotka–Volterra system of two competing species subject to multiplicative stable Lévy noise, and analyzed the role of the Lévy noise sources. Lisowski et al. [14] studied a model of a stepping molecular motor consisting of two connected heads, and examined its dynamics as noise parameters varied; speed and direction appear to depend very sensitively on the characteristics of the noise. Other authors explored the effect of noise on ballistic graphene-based small Josephson junctions in the framework of the resistively and capacitively shunted model, and found that analysis of the switching current distribution makes it possible to efficiently detect a non-Gaussian noise component in a Gaussian background.
Lévy motions are appropriate models for a class of important non-Gaussian processes with jumps or bursts [3, 4, 22]. It is thus desirable to consider data assimilation when the system evolution is driven by Lévy motions. This has recently been considered by one of us and other authors, but not in the multiscale context (see [20, 28, 29, 35]).
Multiscale stochastic dynamical systems arise widely in finance and biology. For example, there are two kinds of mutual securities in financial markets: low-risk bonds, which can be characterized by ordinary differential equations, and high-risk stocks, whose prices undergo two different kinds of change. On the one hand, the small, frequent changes caused by normal supply and demand can be characterized by Gaussian noise. On the other hand, upon the arrival of important information about a stock, there will be a finite jump in the stock price; such perturbations can be characterized by non-Gaussian noise. In general, stock prices change at all times, while bond prices change over months or even years. Thus, the prices of these two securities can be characterized by a two-scale system with non-Gaussian noise (see [9]). Moreover, a large number of observations from biological experiments have shown that the production of mRNA and proteins occurs in a bursty, unpredictable, and intermittent manner, which creates variation or noise in individual cells or cell-to-cell interactions. Such burst-like events appear to be appropriately modeled by non-Gaussian noise. Since the mRNA synthesis process is faster than the protein dynamics, this leads to a two-time-scale system (see [23]), where a small parameter represents the ratio between the natural time scales of the protein and mRNA.
Parameter estimation for continuous-time stochastic models is an increasingly important part of the overall modeling strategy in a wide variety of applications. It is quite often the case that the data to be fitted to a diffusion process have a multiscale character with Gaussian noise. One example is molecular dynamics, where it is desirable to find effective models for low-dimensional phenomena (such as conformational dynamics, vacancy diffusion and so forth) which are embedded within higher-dimensional time series. We are often interested in a parameter (see [23]) representing the degradation or production rates of protein and mRNA. In this paper, we develop a parameter estimation method for multiscale diffusions with non-Gaussian noise. The results established here may be used to examine the change rate for low-risk bonds (see [9]).
In the present paper, we consider a slow-fast data assimilation system under Lévy noise, in which only the slow component is observable. By averaging out the fast component via an invariant measure, we reduce the system dimension by focusing on the slow system evolution. We prove that the reduced, lower-dimensional filter effectively approximates (in probability distribution) the original filter. We demonstrate that a system parameter may be estimated via the low-dimensional slow system, utilizing only observations on the slow component; the accuracy of this estimation is quantified by a moment estimate. We apply the stochastic Nelder–Mead method [31] for optimization in the search for the estimated parameter value. Furthermore, we illustrate the low-dimensional approximation by comparing the most probable paths for the slow system and the original system. Finally, we make some remarks in Section 6.
This paper is organized as follows. After recalling some basic facts about Lévy motions and generalized solutions in the next section, we address the effects of the multiscale signal and observation processes in the context of Lévy random fluctuations in Sections 3 to 5. We illustrate the low-dimensional slow approximation by examining the Zakai equation, parameter estimation, and most probable paths. Finally, we give some discussions and comments in a more biological context.
2 Preliminaries
We recall some basic definitions for Lévy motions (or Lévy processes).
Definition 1.
A stochastic process L = (L(t), t ≥ 0) is a Lévy process if

(i) L(0) = 0 (a.s.);

(ii) L has independent and stationary increments; and

(iii) L has stochastically continuous sample paths, i.e., for every s ≥ 0, L(t) → L(s) in probability as t → s.
A Lévy process L taking values in ℝⁿ is characterized by a drift vector b, an n×n nonnegative-definite, symmetric covariance matrix A, and a Borel measure ν defined on ℝⁿ∖{0}. We call (b, A, ν) the generating triplet of the Lévy motion L. Moreover, we have the Lévy–Itô decomposition for L:
(2.1) $L(t) = bt + B_A(t) + \int_{|x|<1} x\,\widetilde{N}(t,\mathrm{d}x) + \int_{|x|\geq 1} x\,N(t,\mathrm{d}x),$
where N(t, dx) is the Poisson random measure, Ñ(t, dx) = N(t, dx) − tν(dx) is the compensated Poisson random measure, ν is the jump measure, and B_A is an independent n-dimensional Brownian motion with covariance matrix A. The characteristic function of L(t) is given by
(2.2) $\mathbb{E}\big[e^{i\langle u, L(t)\rangle}\big] = \exp\big(t\,\psi(u)\big), \qquad u \in \mathbb{R}^n,$
where the function ψ is the characteristic exponent
(2.3) $\psi(u) = i\langle u, b\rangle - \tfrac{1}{2}\langle u, Au\rangle + \int_{\mathbb{R}^n\setminus\{0\}} \big(e^{i\langle u, x\rangle} - 1 - i\langle u, x\rangle \mathbf{1}_{\{|x|<1\}}(x)\big)\,\nu(\mathrm{d}x).$
The Borel measure ν is called the jump measure. Here ⟨·,·⟩ denotes the scalar product in ℝⁿ.
Definition 2.
For α ∈ (0, 2), an n-dimensional symmetric α-stable process is a Lévy process with characteristic exponent
(2.4) $\psi(u) = -C_\alpha |u|^\alpha, \qquad u \in \mathbb{R}^n,$
with a positive constant C_α.
For an n-dimensional symmetric α-stable Lévy process, the diffusion matrix A = 0, the drift vector b = 0, and the Lévy measure ν is given by
(2.5) $\nu(\mathrm{d}x) = \frac{c(n,\alpha)}{|x|^{n+\alpha}}\,\mathrm{d}x,$
where c(n, α) is a positive normalizing constant depending only on n and α.
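As a numerical aside (our own illustration, not part of the paper's analysis), increments of a symmetric α-stable process can be sampled by the Chambers–Mallows–Stuck method, and a path assembled with the self-similarity scaling Δt^{1/α}. Function names here are ours:

```python
import numpy as np

def stable_increments(alpha, n, dt, rng):
    """Symmetric alpha-stable increments over time steps of size dt,
    via the Chambers-Mallows-Stuck transformation (alpha != 1)."""
    U = rng.uniform(-np.pi / 2, np.pi / 2, n)   # uniform angle
    W = rng.exponential(1.0, n)                 # unit exponential
    X = (np.sin(alpha * U) / np.cos(U) ** (1.0 / alpha)
         * (np.cos(U - alpha * U) / W) ** ((1.0 - alpha) / alpha))
    return dt ** (1.0 / alpha) * X              # self-similarity scaling

def stable_path(alpha, T, n, rng):
    """Sample path of an alpha-stable Levy motion on [0, T], L(0) = 0."""
    dL = stable_increments(alpha, n, T / n, rng)
    return np.concatenate([[0.0], np.cumsum(dL)])

rng = np.random.default_rng(0)
path = stable_path(1.5, 1.0, 1000, rng)
```

For α close to 2 the sampled paths look diffusive; for smaller α the heavy tail of the increments produces the jumps and bursts described above.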
For every function u in its domain, the generator 𝒜 of this symmetric α-stable Lévy process in ℝⁿ is
(2.6) $\mathcal{A}u(x) = \int_{\mathbb{R}^n\setminus\{0\}} \big[u(x+y) - u(x) - \mathbf{1}_{\{|y|<1\}}(y)\,\langle y, \nabla u(x)\rangle\big]\,\nu(\mathrm{d}y).$
It is known [6] that 𝒜 extends uniquely to a self-adjoint operator on its natural domain. By the Fourier inverse transform,
(2.7) $\mathcal{A}u(x) = -(2\pi)^{-n}\, C_\alpha \int_{\mathbb{R}^n} |\xi|^\alpha\, \hat{u}(\xi)\, e^{i\langle \xi, x\rangle}\,\mathrm{d}\xi,$
where
(2.8) $C_\alpha = \int_{\mathbb{R}^n\setminus\{0\}} \big(1 - \cos\langle \theta, y\rangle\big)\,\nu(\mathrm{d}y),$
with θ being a unit vector in ℝⁿ (by symmetry, C_α does not depend on the choice of θ).
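Since the generator acts in Fourier space as multiplication by −|ξ|^α (up to a positive constant, here normalized to one), it is easy to apply numerically on a periodic grid with the FFT. The following sketch is our own illustration, exploiting the fact that sin(kx) is an eigenfunction with eigenvalue −|k|^α:

```python
import numpy as np

def frac_laplacian_periodic(u, alpha, L=2 * np.pi):
    """Apply the generator of a symmetric alpha-stable process,
    normalized so that its Fourier symbol is -|xi|^alpha,
    to samples u of a periodic function on [0, L)."""
    n = len(u)
    xi = 2 * np.pi * np.fft.fftfreq(n, d=L / n)  # angular wavenumbers
    return np.fft.ifft(-np.abs(xi) ** alpha * np.fft.fft(u)).real

# sin(3x) is an eigenfunction: the generator maps it to -3^alpha sin(3x)
x = np.linspace(0.0, 2 * np.pi, 256, endpoint=False)
u = np.sin(3 * x)
Au = frac_laplacian_periodic(u, 1.5)
```

This spectral form is exact for band-limited periodic functions, which makes it a convenient check on any discretization of (2.6).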
Fix a weight parameter and define the weight function by
(2.9) 
Let the weighted L² space be equipped with the norm
(2.10) 
For each nonnegative integer k, let the k-th order weighted Sobolev space be equipped with the norm
(2.11) 
where ∇^k denotes the k-th order gradient.
Definition 3.
A backward predictable stochastic process is called a generalized solution of the equation
(2.12) 
if for every test function it satisfies the following equation
(2.13) 
with 𝒜 denoting the generator of some Lévy process.
3 A slow-fast filtering system
Let us consider the following slow-fast signal-observation system:
(3.14)  
(3.15)  
(3.16) 
Here the signal process, which consists of the slow and fast components, takes values in a Euclidean space, and the constant σ represents the noise intensity of the slow variable. The observation process is vector valued. The standard Brownian motions are mutually independent. The non-Gaussian driving processes are independent symmetric stable Lévy processes with generating triplets (0, 0, ν₁) and (0, 0, ν₂), respectively. The parameter ε is the ratio of the slow time scale to the fast time scale.
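To make the two-time-scale structure concrete, the following Euler–Maruyama sketch simulates a toy scalar slow-fast pair. The linear drifts, the noise intensities 0.5 and 0.1, and the choice of placing stable Lévy noise only in the slow equation (with a Brownian fast equation under the ε^{-1/2} scaling) are our own simplifying assumptions, not the system studied above:

```python
import numpy as np

def cms(alpha, n, rng):
    """Unit symmetric alpha-stable samples (Chambers-Mallows-Stuck)."""
    U = rng.uniform(-np.pi / 2, np.pi / 2, n)
    W = rng.exponential(1.0, n)
    return (np.sin(alpha * U) / np.cos(U) ** (1.0 / alpha)
            * (np.cos(U - alpha * U) / W) ** ((1.0 - alpha) / alpha))

def simulate_slow_fast(eps, alpha=1.5, T=1.0, n=4000, seed=0):
    """Euler-Maruyama for the toy pair
       dX = (Y - X) dt + 0.5 dB1 + 0.1 dL^alpha      (slow)
       dY = (X - Y) dt / eps + dB2 / sqrt(eps)       (fast)."""
    rng = np.random.default_rng(seed)
    dt = T / n
    dL = dt ** (1.0 / alpha) * cms(alpha, n, rng)
    dB1 = np.sqrt(dt) * rng.standard_normal(n)
    dB2 = np.sqrt(dt) * rng.standard_normal(n)
    X = np.empty(n + 1); Y = np.empty(n + 1)
    X[0], Y[0] = 1.0, 0.0
    for k in range(n):
        X[k + 1] = X[k] + (Y[k] - X[k]) * dt + 0.5 * dB1[k] + 0.1 * dL[k]
        Y[k + 1] = Y[k] + (X[k] - Y[k]) * dt / eps + dB2[k] / np.sqrt(eps)
    return X, Y

X, Y = simulate_slow_fast(eps=0.01)
```

Note that the explicit scheme requires dt/ε < 1 for stability of the fast equation; resolving the fast motion directly is exactly the cost that the averaging principle below avoids.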
We make the following assumptions on this filtering system.
Hypothesis H.1.
The coefficient functions satisfy global Lipschitz conditions, i.e., there exists a positive constant such that
for all admissible arguments.
Remark 1.
Note that, with the help of the global Lipschitz condition, there is a positive constant such that the linear growth bound
holds for all admissible arguments.
Hypothesis H.2. The coefficients are of class C², with bounded first- and second-order derivatives.
Hypothesis H.3. The sensor function belongs to the class of bounded continuous functions on the state space.
Hypothesis H.4. There exists a positive constant such that
(3.17) 
This hypothesis (H.4) ensures the existence of an invariant measure (see [12]) for the fast component. With the special scaling exponent for the fast component, this invariant measure is independent of ε (see [11, 12]).
The infinitesimal generator of the slow-fast stochastic system is
(3.18) 
where
(3.19) 
Here ∇ denotes the gradient and ∇² the Hessian matrix (with respect to the slow and fast variables, respectively). Let
(3.20) 
where 𝒩 is the collection of all negligible sets of the underlying probability space. Define
(3.21) 
where ∨ denotes the σ-algebra generated by the union. That is,
(3.22) 
By a version of Girsanov's change-of-measure theorem, we obtain a new probability measure under which the observation becomes independent of the signal variables. This can be done through
(3.23) 
Denote
(3.24) 
Then, by the Kallianpur–Striebel formula, for every bounded differentiable function, we have the following representation:
(3.25) 
The unnormalized conditional distribution of the signal, given the observation σ-algebra, is defined accordingly. Thus, we have the following Zakai equation:
(3.26) 
The marginal of the unnormalized conditional distribution on the slow component is defined as
(3.27) 
Now we define a reduced, low-dimensional signal-observation system
(3.28) 
Here
(3.29) 
The unnormalized conditional distribution corresponding to the filter for the reduced system (3.28) satisfies the following (reduced) Zakai equation
(3.30) 
where
(3.31) 
However, we are more interested in the reduced filtering problem with the actual observation process. This leads us to rewrite the reduced Zakai equation (3.30) as follows:
(3.32) 
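A standard way to realize such a reduced filter driven by the actual observation numerically is a bootstrap particle filter. The sketch below is our own toy illustration under assumed scalar dynamics dX = -X dt + dB with observation dZ = X dt + dV (Gaussian noise only, so the Girsanov-type weights are explicit); it is not the construction used in the proofs:

```python
import numpy as np

def bootstrap_filter(obs, dt, n_particles=500, seed=0):
    """Bootstrap particle filter for the toy reduced model
       dX = -X dt + dB,  dZ = X dt + dV.
    `obs` holds the observed path Z on the time grid; returns the
    filtered posterior mean of X after each observation increment."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_particles)        # prior particles
    means = []
    for dz in np.diff(obs):
        # propagate each particle one Euler step of the signal model
        x = x - x * dt + np.sqrt(dt) * rng.standard_normal(n_particles)
        # Girsanov-type weight: exp(x dz - x^2 dt / 2), log-normalized
        logw = x * dz - 0.5 * x ** 2 * dt
        w = np.exp(logw - logw.max())
        w /= w.sum()
        means.append(np.dot(w, x))
        # multinomial resampling to combat weight degeneracy
        x = x[rng.choice(n_particles, n_particles, p=w)]
    return np.array(means)

# synthetic data: simulate signal and observation on a coarse grid
rng = np.random.default_rng(1)
dt, n = 0.01, 200
x_true = np.zeros(n + 1)
z = np.zeros(n + 1)
for k in range(n):
    x_true[k + 1] = x_true[k] - x_true[k] * dt + np.sqrt(dt) * rng.standard_normal()
    z[k + 1] = z[k] + x_true[k] * dt + np.sqrt(dt) * rng.standard_normal()
est = bootstrap_filter(z, dt)
```

The weight update mirrors the unnormalized (Zakai) density: particles are reweighted by the likelihood of the observation increment and renormalized, exactly as the Kallianpur–Striebel formula normalizes the unnormalized conditional distribution.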
Lemma 1.
Assume that the following conditions are satisfied:

The coefficient functions, together with their first-order and second-order derivatives (in x), are uniformly bounded by a constant. The remaining functions are locally uniformly bounded.

.
Let u be a generalized solution of the problem
(3.33) 
where
(3.34) 
and let v be a generalized solution of the problem
(3.35) 
Then the following formulas hold
(3.36) 
and
(3.37) 
where the forward process solves a stochastic differential equation with the generator above, and the backward process solves a backward stochastic differential equation with the corresponding generator. These processes satisfy the following equations
(3.38) 
and
(3.39) 
Proof.
Before we proceed to the proof of the lemma, let us consider the problem
(3.40) 
By the definition of a backward predictable stochastic process, the coefficients in equation (3.40) are predictable relative to the corresponding family of σ-algebras. The driving process is a Wiener martingale with respect to the same family, and the initial condition is measurable with respect to the minimal σ-algebra of this family.
Let v be a generalized solution. Then, by the definition of a generalized solution, for every test function, the following equality holds:
(3.41) 
With a suitable change of notation in equation (3.41), we find that the transformed process satisfies the following equation, for every test function:
(3.42) 
Thus it is a generalized solution of problem (3.35).
On the other hand, changing the variables in equality (3.42), we obtain a generalized solution of problem (3.40). Thus we have proved that problems (3.35) and (3.40) are equivalent, so all the results obtained for problem (3.35) carry over naturally to problem (3.40). For problem (3.40), a similar method yields the existence and uniqueness of solutions to stochastic fractal equations by a purely probabilistic argument (see [19, 30]). This completes the proof of this lemma. ∎
Now we show that the reduced system approximates the original system, by examining the corresponding Zakai equations. Before stating the theorem, we describe the probabilistic representation for semilinear stochastic fractal equations. We proceed under the new probability measure, with respect to which the observation process is a Brownian motion. For convenience, we relabel the corresponding unnormalized densities. By Lemma 1, the density for the original filter solves a stochastic partial differential equation (SPDE):
(3.43) 
Here the stochastic integral is understood as Itô's backward integral. Likewise, we introduce the analogous quantity for the reduced filter, which then solves the following SPDE
(3.44) 
By a version of Girsanov's change-of-measure theorem and the Markov property of the signal, the corresponding expectations agree at every time. In particular, we have
(3.45) 
Similarly, we have
(3.46) 
We now state our main result on the comparison between the original filter and the reduced filter; such a comparison is desirable when only the slow component is observable.
Theorem 1.
Under hypotheses (H.1)–(H.4), there exists a positive constant such that, for the scale parameter ε sufficiently small, the following estimate holds
(3.47) 
with the positive constant independent of ε. This implies that the reduced filter approximates the original filter as the scale parameter ε tends to zero.
Proof.
(3.48) 
Using Jensen's inequality, we have
(3.49) 
Applying a similar argument from [26], we obtain
(3.50) 
Finally, we conclude that
(3.51) 
This completes the proof of Theorem 1. ∎
4 Parameter estimation
In this section, we consider parameter estimation in the following slow-fast dynamical system
(4.52) 
where the drift coefficient is a matrix, the diffusion coefficient is a positive-definite matrix with positive eigenvalues, and the unknown parameter θ takes values in a compact subset of Euclidean space.
Hypothesis H.5. For all θ, the drift function is uniformly bounded and Lipschitz with respect to x and y. It is also smooth in θ, with bounded first-order derivatives.
As in the previous section, define an averaged, low-dimensional slow stochastic dynamical system (see [27]):
(4.53) 
where
We will estimate the unknown parameter θ based on this low-dimensional slow system (4.53), using observations of the slow component only. Denote the solution of the original slow-fast system (4.52) with the actual parameter value, and the solution of the slow system (4.53) with a trial parameter value, accordingly. We recall the following lemma.
Lemma 2.
Under hypothesis (H.5), the following strong convergence holds
(4.54) 
Proof.
The proof of the averaging principle is similar to the infinite-dimensional case that has been studied in [26]; thus we omit it here. ∎
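The averaging-based estimation idea is to fit the unknown parameter by minimizing a discrepancy objective computed from slow observations alone. As a toy illustration (our own assumptions, not the paper's setup), take the averaged model to be dX = -θ X dt + dB with true θ = 1, and minimize the quadratic one-step prediction error over a parameter grid, a simple stand-in for the stochastic Nelder–Mead search of [31]:

```python
import numpy as np

# simulate a long path of the toy averaged slow model dX = -theta X dt + dB
rng = np.random.default_rng(0)
theta_true, dt, n = 1.0, 0.01, 20000
X = np.zeros(n + 1)
for k in range(n):
    X[k + 1] = X[k] - theta_true * X[k] * dt + np.sqrt(dt) * rng.standard_normal()

dX, Xk = np.diff(X), X[:-1]

def objective(theta):
    """Quadratic one-step prediction error of the averaged model."""
    return np.sum((dX + theta * Xk * dt) ** 2)

# minimize over a parameter grid (stand-in for stochastic Nelder-Mead)
grid = np.linspace(0.0, 3.0, 601)
theta_hat = grid[np.argmin([objective(t) for t in grid])]
```

With a long enough observation window, the minimizer concentrates near the true parameter, which is the content of the convergence theorem below for the genuine multiscale setting.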
We take this discrepancy as the objective function and assume there is a unique minimizer, which is our estimated parameter value. Then we can state our main result as follows.
Theorem 2.
Under hypothesis (H.5), the estimated parameter value converges to the true parameter value as the scale parameter ε approaches zero. That is,
(4.55) 
Proof.
Note that
(4.56) 
for some positive constant. Integrating both sides with respect to time, we get
(4.57) 
We calculate the difference between the two solutions to obtain
By the variation of constant formula, we have
Using the mean value theorem, we obtain
(4.58) 
for some intermediate parameter value. Denote