Feynman-Kac particle integration with geometric interacting jumps

# Feynman-Kac particle integration with geometric interacting jumps

Pierre Del Moral INRIA Bordeaux Sud-Ouest and University of Bordeaux, France Pierre E. Jacob National University of Singapore, Singapore Anthony Lee University of Warwick, UK Lawrence Murray CSIRO Mathematics, Informatics and Statistics, Perth, Australia Gareth W. Peters University College London, UK
###### Abstract

This article is concerned with the design and analysis of discrete time Feynman-Kac particle integration models with geometric interacting jump processes. We analyze two general types of model, corresponding to whether the reference process is in continuous or discrete time. For the former, we consider discrete generation particle models defined by arbitrarily fine time mesh approximations of the Feynman-Kac models with continuous time path integrals. For the latter, we assume that the discrete process is observed at integer times and we design new approximation models with geometric interacting jumps in terms of a sequence of intermediate time steps between the integers. In both situations, we provide non asymptotic bias and variance theorems w.r.t. the time step and the size of the system, yielding what appear to be the first results of this type for this class of Feynman-Kac particle integration models. We also discuss uniform convergence estimates w.r.t. the time horizon. Our approach is based on an original semigroup analysis with first order decompositions of the fluctuation errors.

Keywords : Feynman-Kac formulae, interacting jump particle systems, measure valued processes, non asymptotic bias and variance estimates.

Mathematics Subject Classification :
Primary: 62L20; 65C05; 60G35. ; Secondary: 60G57, 81Q05; 82C22.

## 1 Introduction

Feynman-Kac formulae are central path integration mathematical models in physics and probability theory. More precisely, these models and their interacting particle interpretations have come to play a significant role in applied probability, numerical physics, Bayesian statistics, probabilistic machine learning, and engineering sciences. Applications of these particle integration techniques are increasingly used to solve a variety of complex problems in nonlinear filtering, data assimilation, rare event sampling, hidden Markov chain parameter estimation, stochastic control and financial mathematics. A detailed account of these functional models and their application domains can be found in the series of research books [3, 10, 21, 27, 34] and, more recently, in [4, 16, 26].

In computational physics, these techniques are used for free energy computations, specifically in estimating ground states of Schrödinger operators. In this context, these particle models are often referred as quantum or diffusion Monte Carlo methods [1, 2, 7, 39]. We also refer the reader to the series of articles [11, 24, 40, 41, 46].

In advanced signal processing, they are known as particle filters or sequential Monte Carlo methods, and were introduced in three independent works in the 90’s [9, 32, 37]. These stochastic particle algorithms are now routinely used to compute sequentially the flow of conditional distributions of the random states of a signal process given some noisy and partial observations [3, 10, 12, 21, 27, 28, 35, 38]. Feynman-Kac formulae and their particle interpretations are also commonly used in financial mathematics to model option prices, futures prices and sensitivity measures, and in insurance and risk models [4, 5, 33, 42, 44, 43]. They are used in rare event analysis to model conditional distributions of stochastic processes evolving in a rare event regime [6, 5, 20].

This article presents geometric interacting jump particle approximations of Feynman-Kac path integrals. It also contains theoretical results related to the practical implementation of these particle algorithms for both discrete and continuous time integration problems. A key result is the presentation of connections between the interacting jump particle interpretations of the continuous time models and their discrete time generation versions. This is motivated by the fact that while the continuous time nature of these models is fundamental to describing certain phenomena, the practical implementation of these models on a computer requires a judicious choice of time discretization. Conversely, as shown in section 2.1 in [25], a discrete time Feynman-Kac model can be encapsulated within a continuous time framework by considering stochastic processes only varying on integer times. Continuous time Feynman-Kac particle models are based on exponential interacting jumps [15, 24, 21, 30, 31, 29, 46], while their discrete time versions are based on geometric type jumps [10, 16, 19]. From a computational perspective, the exponential type interacting jumps thus need to be approximated by geometric type jumps. Incidentally, some of these geometric type interacting jump particle algorithms are better suited to implementation in a parallel computing environment (see section 5.3).

Surprisingly, little attention has been paid to analyze the connections between exponential and geometric type jump particle models. There are references dealing with these two models separately [8, 17, 18, 22, 25, 24, 46], but none provide a convergence analysis between the two. In this paper we initiate this study with a non asymptotic bias and variance analysis w.r.t. the time step parameter and the size of the particle population scheme. Special attention is paid to the stochastic modeling of these interacting jump processes, and to a stochastic perturbation analysis of these particle models w.r.t. local sampling random fields.

We conclude this section with basic notation used in the article. We let be the Banach space of all bounded Borel functions on some Polish111i.e. homeomorphic to a complete separable metric space state space equipped with a Borel -field , equipped with the uniform norm . We denote by the oscillation of a function . We let be the Lebesgue integral of a function with respect to a finite signed measure on . We also equip the set of finite signed measures with the total variation norm , where the supremum is taken over all functions with . We let be the subset of all probability measures. We recall that any bounded integral operator on is an operator from into itself defined by , for some measure , indexed by , and we set . These operators generate a dual operator on the set of finite signed measures defined by . A Markov kernel is a positive and bounded integral operator s.t. . The Dobrushin contraction coefficient of a Markov kernel is defined by , where the supremum is taken over all functions s.t. . Given some positive potential function on , we denote by the Boltzmann-Gibbs transformation defined by .

## 2 Description of the models

### 2.1 Feynman-Kac models

We consider an -valued Markov process , defined on a standard filtered probability space . The set represents the space of càdlàg paths equipped with the Skorokhod topology which turn it into a Polish space. A point represents a sample path of the canonical process . We also let and be the sigma-field and probability measure of the process . Finally, we also consider the -augmentation of so that the resulting filtration satisfies the usual conditions of right continuity and completion by -negligible sets (see for instance [36, 45], and the references therein). We also consider a time inhomogeneous bounded Borel function on .

We let and be the Feynman-Kac measures on defined for any bounded measurable function on , by the following formulae

 Qt(f):=Λt(f)/Λt(1) {\rm with}Λt(f)=E(f(X[0,t]) exp(∫t0Vs(Xs)ds)) (1)

and we let and , respectively, be the -marginals of and .

We consider the mesh sequence , , with time step associated with some integer , and we let and be the Feynman-Kac measures on defined for any bounded measurable function on , by the following formulae

 Q(m)tn(f):=Λ(m)tn(f)/Λ(m)tn(1){\rm with}Λ(m)tn(f)=E(f(X[0,tn]) ∏0≤p

We also denote by and , respectively, the -marginal of and .

• Case (D) : We have and , where , is an -valued Markov chain, and are Borel positive functions s.t. is bounded.

In this case, the marginal and of the Feynman-Kac measures of and on integer times are given for any by the formula

 ηn(f)=γn(f)/γn(1){\rm with}γn(f):=E(f(Xn) ∏0≤p
• Case (C) : The process is a continuous time Markov process with infinitesimal generators defined on some common domain of functions , and . The set is a sub-algebra of the Banach space generating the Borel -field , and for any measurable function the Feynman-Kac semigroup , , defined by

 Qs,t(f)(x)=E(f(Xt) exp(∫tsUr(Xr)dr) ∣∣ Xs=x)

leaves invariant; that is we have that . For any , the mappings and , and their norm as well the norm of the first order derivatives only depend on and on the norms of the functions and their derivatives.

The regularity conditions stated in (C) correspond to time inhomogeneous versions of those introduced in [24]. They hold for pure jump processes with bounded jump rates with , or for Euclidean diffusions on with regular and Lipschitz coefficients by taking as the set of -functions with derivatives decreasing at infinity faster that any polynomial function. These regularity conditions allow the use of most of the principal theorems of stochastic differential calculus, e.g. the “carré du champ”, or square field, operator that characterizes the predictable quadratic variations of the martingales that appear in Ito’s formulae. These regularity conditions can probably be relaxed using the extended setup developed in [21].

We have already mentioned that the particle interpretations associated with the continuous time models (1) are defined in terms of interacting jump particle systems [21, 22, 24, 25]. The implementation of these continuous time particle algorithms is clearly impractical and we therefore resort to the geometric interacting processes associated with the -approximation models defined in (2). These discrete generation interacting jumps models provide new and different types of adaptive resampling procedures, which differ from those discussed in the articles [12, 13], and the references therein.

### 2.2 Mean field particle models

In this section, we provide a brief description of the geometric type interacting jump particle models associated with the -approximation Feynman-Kac model defined in (2). First, if we define

then it is well known that satisfies the following evolution equation

 μ(m)tn+1=ΨGtn(μ(m)tn)Mtn,tn+1. (4)

Further details on the derivation of these evolution equations can be found in [10, 21, 16]. The particle interpretation of this model depends on the interpretation of the Boltzmann-Gibbs transformation in terms of a Markov transport equation

 ΨGtn(μ)=μStn,μ (5)

for some Markov transitions , that depend on the time parameter and on the measure . The choice of these Markov operators is not unique; we refer to [10] for a more thorough discussion of these models. In this article, we consider an abstract general model, and illustrate our study with the following three classes of models.

• Case 1 : We have , for some non negative and bounded function . In this situation, (5) is satisfied by the Markov transition

 Stn,μ(x,dy):=e−Utn(x)/m δx(dy)+(1−e−Utn(x)/m) Ψe−Utn/m(μ)(dy).
• Case 2 : The function is non negative. In this situation, (5) is satisfied by the Markov transition

• Case 3: The Markov transport equation (5) is satisfied by the Markov transition

 Stn,μ(x,dy):=(1−atn,μ(x)) δx(dy)+atn,μ(x) Ψ(eVtn/m−eVtn(x)/m)+(μ)(dy)

with the rejection rate

In these three cases we have the following first order expansion

 Stn,μ=Id+1m ˆLtn,μ+ 1m2 ˆRtn,μ (6)

with some jump type generator and some integral operator s.t. , where the supremum is taken over all and . The jump generators corresponding to the three cases presented above are described respectively in (12), (13), and (14). The proofs of these expansions is rather elementary, and they are housed in the appendix, on page 7.3.

In addition, whenever (5) is satisfied, we have the evolution equation

 μ(m)tn+1=μ(m)tnKn+1,μ(m)tn{\rm with the Markov kernels}Ktn,tn+1,μ=Stn,μMtn,tn+1. (7)

The mean field -particle model associated with the evolution equation (7) is a Markov process in with elementary transitions given by

 P(ξtn+1∈dx | ξtn)=∏1≤i≤NKtn,tn+1,μNtn(ξitn,dxi){\rm with}μNtn=1N∑1≤i≤Nδξitn, (8)

where stands for an infinitesimal neighborhood of the point .

## 3 Statement of the main results

Our first main result relates the Feynman-Kac models (1) and their -approximation measures (2) in case (D) and (C).

###### Theorem 3.1

In case (D), we have

 ν(m)n=νn=γnandμ(m)n=μn=ηn

with the Feynman-Kac measures and defined in (3).

In case (C), we have the first order decomposition

 Λ(m)tn=Λtn+1m rm,tnandQ(m)tn=Qtn+1m ¯¯¯rm,tn

with some remainder signed measures s.t. .

The proof of the theorem is rather technical and it is postponed to the appendix.

The first assertion of theorem 3.1 allows us to turn a discrete time Feynman-Kac model (3) into a continuous time model (1). To be more precise, we have that with the Feynman-Kac semigroup

 Q(m)tp,tn(f)(x):=E(f(Xtn) ∏p≤q

in case (D), for integer times , with , we also have that with the Feynman-Kac semigroup

 Qk,n(f)(x):=E(f(Xn) ∏k≤l

Thus, the normalized Markov kernels also coincide with the Markov kernels . In addition, for any and , we also have the semigroup formulae

 Q(m)k,n(f)(x)=Gk(x)r/m Q(m)k+r/m,n(f)(x){\rm and% }P(m)k,n=P(m)k+r/m,n=Pk,n. (9)

We prove the l.h.s. assertion using the fact that for any and any , with and , we have and

 Q(m)k,n(f)(x) = Gk(x)r/m×E⎛⎝f(Xtnm) ∏k+r/m≤tq

For a Feynman-Kac measure (1) associated with a continuous diffusion style process , it is important to observe that the l.h.s. measure in the -approximation model (2), as defined on a time mesh sequence, can be thought of as a time discretization of the exponential path integrals in the continuous time model (1). Nevertheless, the elementary Markov transitions of the Markov chain are generally unknown. To get some feasible Monte Carlo approximation scheme, we need a dedicated technique to sample the transitions of this chain. One natural strategy is to replace in (2), the reference Markov chain by the Markov chain associated with some Euler type discretization model with time step . The stochastic analysis of these models is discussed in some detail in the articles [17, 18, 15], including first order expansions in terms of the size of the time mesh sequence.

Our second main result is the following non asymptotic bias and variance theorem for the -approximation mean field model introduced in (8).

###### Theorem 3.2

We assume that the Markov transport equation (5) is satisfied for Markov transitions also satisfying the first order decomposition (6).

In case (C), for any function , and any we have the non asymptotic bias and variance estimates

 ∣∣E(μNtn(f))−μtn(f)∣∣≤ctn(f) [1N+1m]

and

 E([μNtn(f)−μtn(f)]2)≤ctn(f) [1N+1m2]

for some finite constant that only depends on and on .

In case (D), for any s.t. , and for any we have the non asymptotic bias and variance estimates

 N ∣∣E(μNn(f))−ηn(f)∣∣≤a(n)andN E([μNtn(f)−μtn(f)]2)≤a(n) (1+1N a(n))

for some some constant

 a(n)≤c ∑0≤k

Under appropriate regularity conditions on the Feynman-Kac model, we can prove that the constant is uniformly bounded w.r.t. the time parameter; that is we have that . For a detailed discussion of these uniform convergence properties w.r.t. the time parameter, we refer the reader to the book [10], and the more recent article [16]. To be more precise, we let be the Feynman-Kac semigroup associated with the flow of measures . In this notation, by proposition 2.3 in [23] we have that the Dobrushin contraction coefficient of the Markov kernel is given by

 β(Pk,n)=supμ1,μ2∈P(E)∥Φk,n(μ1)−Φk,n(μ2)∥tv.

On the other hand, we also have that

 Qk,n(1)(x)=∏k≤l

Using the fact that , we find that

 logQk,n(1)(x)Qk,n(1)(y)=∑k≤l

Assuming that for any and , and

 c1≤Gl(x)≤c2{\rm and}supμ1,μ2∈P(E)∥Φk,l(μ1)−Φk,l(μ2)∥tv≤c3 e−c4(k−l) (10)

for some positive and bounded constants , , we find that

 β(Pk,n)≤c3 e−c4(k−n){\rm and}loggk,n≤2(c2c3/c1)(∑k≤l

This clearly implies that .

For instance, it was proven in [21, 14] that condition (10) is met for time homogeneous models as soon as the Markov transition of the Markov chain satisfies the following mixing condition

 ∃m≥1, ∃ρ>0 : ∀x,y∈EMm(x,% \LARGE.)≥ρ Mm(y,\LARGE.).

It is well known that this condition is satisfied for any aperiodic and irreducible Markov chains on finite state spaces, as well as for bi-Laplace exponential transitions associated with a bounded drift function, and for Gaussian transitions with a mean drift function that is constant outside some compact domain.

The remainder of the article is organized as follows:

Section 4 is concerned with continuous time particle interpretations of the Feynman-Kac models (1). By the representation theorem 3.1, these schemes also provide a continuous time particle interpretation of the discrete time models (3) without further work. In section 4.2, we present the McKean interpretation of the Feynman-Kac models in terms of a time inhomogeneous Markov process whose generator depends on the distribution of the random states. The choice of these McKean models is not unique. We discuss the three interpretation models corresponding to the three selection type transitions presented on page 2.2. The mean field particle interpretation of these McKean models are discussed in section 4.3.

Of course, even for discrete time models (3) these continuous time particle interpretations are based on continuous time interacting jump models and they cannot be used in practice without an additional level of approximation. In this context, when using an Euler type approximation these exponential interacting jumps are replaced by geometric type recycling clocks. These interacting geometric jumps particle models are discussed in section 5, which is dedicated to the discrete time particle interpretations of the Feynman-Kac models presented in (2). In section 5.1, we discuss the McKean interpretation of the Feynman-Kac models in terms of a time inhomogeneous Markov chain model whose elementary transitions depends on the distribution of the random states. Again, the choice of these McKean models is not unique. We discuss the three interpretation models corresponding to the three cases presented on page 2.2. The mean field particle interpretation of these McKean models are discussed on page 29.

Once again, using the representation formulae (2) we emphasize that these schemes also provide a discrete generation particle interpretation of the discrete time models (3). In contrast to standard discrete generation particle models associated with (3), these particle schemes are defined on a refined time mesh sequence between integers. This time mesh sequence can be interpreted as a time dilation. Between two integers, the particle evolution undergoes an additional series of intermediate time evolution steps. In each of these time steps, a dedicated Bernoulli acceptance-rejection trial coupled with a recycling scheme is performed. As the time step decreases to , the resulting geometric interacting jump processes converge to the exponential interacting jump processes associated with the continuous time particle model. The final section, section 6, is mainly concerned with the proof of theorem 3.2.

## 4 Continuous time models

### 4.1 Feynman-Kac semigroups

In case (C) the semigroup of the flow of non negative measures is given for any by the following formulae , with the Feynman-Kac semigroup defined for any by

 Qs,t(f)(x)=E(f(Xt) exp{∫tsVs(Xs) ds} | Xs=x).

This yields , with the nonlinear transformation on the set of probability measures defined for any by

 Φs,t(μs)(f):=μs(Qs,t(f))/μs(Qs,t(1)).

Using some stochastic calculus manipulations, we readily prove that satisfies the following integro-differential equation

 ddtμt(f)=μt(Lt(f))+μt(Vtf)−μt(Vt)μt(f) (11)

for any function . Further details on the derivation of these evolution equations can be found in the articles [24, 22]. The particle interpretation of this model depends on the interpretation of the correlation term in the r.h.s. of (11) in terms of a jump type generator. The choice of these generators is not unique. Next, we discuss three important classes of models. These three situations are the continuous time versions of the three cases discussed on page 2.2.

• Case 1 : We assume that , for some non negative function . In this situation, we have the formula

 μt(Vtf)−μt(Vt)μt(f)=μt(Ut[μt(f)−f])=μt(ˆLt,μt(f))

with the interacting jump generator

 ˆLt,μt(f)(x)=Ut(x) ∫ [f(y)−f(x)] μt(dy). (12)
• Case 2 : When is a positive function, then we have the formula

 μt(Vtf)−μt(Vt)μt(f)=μt(ˆLt,μt(f))

with the interacting jump generator

 ˆLt,μt(f)(x)=∫ [f(y)−f(x)] Vt(y) μt(dy)=μt(Vt) ∫ [f(y)−f(x)] ΨVt(μt)(dy). (13)
• Case 3 : For any bounded potential function we have

 μt(Vtf)−μt(Vt)μt(f)=∫(f(y)−f(x)) (Vt(y)−Vt(x))+ μt(dx) μt(dy)=μt(ˆLt,μt(f))

with , and with the interacting jump generator

 ˆLt,μt(f)(x)=∫ [f(y)−f(x)] (Vt(y)−Vt(x))+ μt(dy). (14)

### 4.2 McKean interpretation models

In the three cases discussed above, for any test functions we have the evolution equation

 ddtμt(f)=μt(Lt,μt(f)){\rm with}Lt,μt:=Lt+ˆLt,μt. (15)

These integro-differential equations can be interpreted as the evolution of the laws, given by , of a time inhomogeneous Markov process with infinitesimal generators that depend on the distribution of the random states at the previous time increment. This probabilistic model is called the McKean interpretation of the evolution equation (15) in terms of a time inhomogeneous Markov process. In this framework, using Ito’s formula for any test function , we have

 dft(¯¯¯¯Xt)=(∂∂t+Lt,μt)(ft)(¯¯¯¯Xt)+d¯¯¯¯¯¯Mt(f) (16)

with a martingale term with predictable angle bracket

 d⟨¯¯¯¯¯¯M(f)⟩t=ΓLt,μt(ft,ft)(¯¯¯¯Xt)dt.

Using the r.h.s. description of in (15), for any we notice that

 ΓLt,μt(f,f)=ΓLt(f,f)+ΓˆLt,μt(f,f).

Next, we provide a description of this Markov process in the three cases discussed above.

• Case 1: In this situation, between the jump times the process evolves as the process . The rate of the jumps is given by the function . In other words, the jump times are given by the following recursive formulae

 Tn+1=inf{t≥Tn : ∫tTnUs(¯¯¯¯Xs) ds≥en}

where , and stands for a sequence of i.i.d. exponential random variables with unit parameter. At the jump time the process jumps to new site randomly chosen with the distribution .

For any we also have that

 ΓˆLt,μt(f,f)(x)=ˆLt,μt((f−f(x))2)(x)=Ut(x) ∫ [f(y)−f(x)]2 μt(dy).

In this situation, an explicit expression of the time inhomogeneous semigroup , , of the process is provided by the following formula

 ¯¯¯¯Ps,t,μs(f)(x) = E(f(¯¯¯¯¯Xt) ∣∣ ¯¯¯¯¯Xs=x) = Qs,t(1)(x) Φs,t(δx)(f)+(1−Qs,t(1)(x)) Φs,t(μs)(f) = Qs,t(f)(x)+(1−Qs,t(1)(x)) Φs,t(μs)(f).

We let and be the Markov transition and the transformation of probability measures defined as and replacing by the integral operator

 Q(m)tn,tn+1(f)(x) = e−Utn(x)/m E(f(Xtn+1) | Xtn=x).

Under the assumptions of theorem 3.1, using elementary calculations we prove that

 ¯¯¯¯P(m)tn,tn+1,μtn=¯¯¯¯Ptn,tn+1,μtn+1m R(m)tn,tn+1,μtn (17)

with some remainder signed measures such that , for some finite constant whose values only depend on the potential function .

• Case 2: In this situation, between jump times the process evolves as the process . The rate of the jumps is given by the parameter . In other words, the jump times are given by the following recursive formulae

 Tn+1=inf{t≥Tn : ∫tTnμs(Vs) ds≥en}

where , and stands for a sequence of i.i.d. exponential random variables with unit parameter. At the jump time the process jumps to new site randomly chosen with the distribution .

For any we also have that

 ΓˆLt,μt(f,f)(x)=ˆLt,μt((f−f(x))2)(x)=∫ [f(y)−f(x)]2 Vt(y) μt(dy).
• Case 3: In this case, between jump times the process evolves as the process . The rate of the jumps is given by the function

 Wt,μt(x) := μt((Vt−Vt(x))+) = μt(Vt 1Vt≥Vt(x))−μt(Vt≥Vt(x)) Vt(x).

In other words, the jump times are given by the following recursive formulae

 Tn+1=inf{t≥Tn : ∫tTnWt,μt(¯¯¯¯Xs) ds≥en}

where , and stands for a sequence of i.i.d. exponential random variables with unit parameter. At the jump time the process jumps to new site randomly chosen with the distribution .

For any we also have that

 ΓˆLt,μt(f,f)(x)=ˆLt,μt((f−f(x))2)(x)=∫ [f(y)−f(x)]2 (Vt(y)−Vt(x))+ μ