# On ergodic least-squares estimators of the generalized diffusion coefficient for fractional Brownian motion

###### Abstract

We analyse a class of estimators of the generalized diffusion coefficient for fractional Brownian motion of known Hurst index , based on weighted functionals of the single time square displacement. We show that for a certain choice of the weight function these functionals possess an ergodic property and thus provide the true, ensemble-averaged, generalized diffusion coefficient to any necessary precision from a single trajectory data, but at expense of a progressively higher experimental resolution. Convergence is fastest around , a value in the subdiffusive regime.

###### pacs:

02.50.-r, 05.10.Gg, 82.37.-j, 87.80.NjSingle molecule spectroscopy techniques allow the tracking of single particles over a wide range of time scales bra (); saxton (); mason (). In complex media such as living cells, a number of recent studies have reported evidence for subdiffusive transport of particles like proteins weber (), viruses seisenberger (), chromosome monomers bronstein (), mRNA golding () or lipid granules ralf (). Subdiffusion is typically characterized by a sublinear growth with time of the mean square displacement (MSD), with , where is the particle position at time , denotes the ensemble average and is a generalized diffusivity.

A growing body of single trajectory studies suggest that fractional Brownian motion (fBm), among the variety of stochastic processes that produce subdiffusion, may be a model particularly relevant to subcellular transport. FBm is a Gaussian continuous-time random process with stationary increments and is characterized by a so-called Hurst index . If , trajectories are subdiffusive with increments that are negatively and long range correlated mandelbrot (). Such correlations were observed in subdiffusing mRNA molecules polaco (), RNA-proteins or chromosomal loci weber () within E. coli cells. Similarly, fBm can be used to describe the dispersion of apoferritin proteins in crowded dextran solutions weissprl () and of lipid molecules in lipid bilayers metzlerlipids ().

Whereas the determination of an anomalous exponent from data has been extensively studied, as it demonstrates deviation from standard Brownian motion (BM), the problem of estimating the generalized diffusion constant has received much less attention. It appears that is much more sensitive than to many biological factors and its precise determination can potentially yield valuable information about the kinetics of transcription, translation and other physico-biological processes. The generalized diffusivity of RNA molecules in bacteria is greatly affected (either positively or negatively) by perturbations, for instance treatment with antibiotic drugs, which have however a negligible effect on weber (). Likewise, the coefficient of lipids in membranes is strongly reduced by small cholesterol concentrations, whereas remains unchanged metzlerlipids (). In the context of search problems, a particle following a subdiffusive fBm actually explores the space more compactly than a BM and can have a higher probability of eventually encountering a nearby target weissbiophysj (). The larger the value of , the faster this local exploration.

In this paper, generalizing our previous results for standard BM dean1 (), we present a method to estimate the ensemble averaged diffusivity from the analysis of single fBm trajectories of a priori known anomalous exponent. Estimating diffusion constants from data is not an easy task when trajectories are few and ensemble averages cannot be performed. BM and fBm are ergodic processes and time averages tend to ensemble averages, but convergence can be slow dengbarkai (). For finite trajectories of finite resolution, variations by orders of magnitude have been observed for estimators of the normal diffusion coefficient obtained from single particles moving along DNA austin (), in the plasma membrane saxton () or in the cytoplasm of mammalian cells goulian (). Large fluctuations are also manifest in subdiffusive cases weber (); metzlerlipids ().

A broad dispersion in the measures of the diffusion coefficient raises important questions about optimal fitting methodologies. A reliable estimator must possess an ergodic property, so that its most probable value should converge to the true ensemble average independently of the trajectory considered and its variance should vanish as the observation time increases. Recently, much effort has been invested in the analysis of this challenging problem and several different estimators have been analyzed, based, e.g., on the sliding time-averaged square displacement greb1 (); greb3 (), mean length of a maximal excursion tej (), the maximum likelihood approximation berglund (); michalet (); mb (); boyer (); boyer1 () and optimal weighted least-squares functionals dean1 ().

Our aim here is to determine an ergodic least-square estimator for the generalized diffusion coefficient when the underlying stochastic motion is given by a fBm. The estimators considered here are single time quantities, unlike others based on fits of two-time quantities such as the time averaged MSD.

Let us consider a fractional Brownian motion in one dimension with and zero expectation value for all , where is the total observation time. The covariance function of the process is given by mandelbrot ():

(1) | |||||

where ) is the generalized diffusion coefficient and the Hurst exponent . The Hurst index describes the raggedness of the resulting motion, with a higher value leading to a smoother motion. Standard Brownian motion is a particular case of the fBm corresponding to . As already mentioned, for the increments of the process are negatively correlated so that the fBm is subdiffusive. On the other hand, for the increments of the process are positively correlated and superdiffusive behavior is observed.

We consider a single trajectory , that is, a particular realization of an fBm process with a known , and write down the following weighted least-squares functional:

(2) |

where is some weighting function to be determined afterwards and is a trial parameter. We call an estimate of the generalized diffusion coefficient from the single trajectory , if it minimizes . Calculating the partial derivative , setting it to zero and solving the resulting equation for , we find the following least-squares estimator of the generalized diffusion coefficient :

(3) |

where we have introduced the notation

(4) |

Note that the estimator measures the ratio of the observed generalized diffusion coefficient for a single given trajectory relative to the ensemble-averaged value. Moreover, holds for any arbitrary , making it possible to compare the effectiveness of different choices of . It is worthwhile remarking that is given by a single time integration (a local functional) and thus differs from other estimates used in the literature which involve two-time integrals (see e.g., dengbarkai ()).

Further on, from a straightforward calculation the variance of the estimator is, for arbitrary weight function ,

(5) |

where is the covariance function of a squared fBm trajectory

(6) |

This function can be calculated exactly using Eq. (1) to give

(7) | |||||

Inserting the latter expression into Eq. (5) and noticing that the kernel is a symmetric function of and , we have

(8) |

Following Ref.dean1 (), we choose

(9) |

where is a lag time and a tunable exponent. In a discrete time description, can be set equal to the interval between successive measurements dean1 (). We thus identify as a resolution parameter in the present continuous description. We also note that in dean1 (), it was proven that a power law weight function of the type in Eq. (9) was optimal among all weight functions. Fixing and scanning over different values of , we seek the value for which the variance of is smallest. Hopefully, for such value, the variance should vanishes in the limit of infinite resolution or infinite data size, i.e. when the parameter tends to zero. To check the latter point, we consider first the limit of an infinitely long observation time, . For the integrals in Eq. (8) can be performed exactly yielding

(10) | |||||

where is the gamma-function. On the other hand, for and , the result in Eq. (8) can be conveniently represented as a single integral

(11) |

where is the confluent hypergeometric function. The integral in Eq. (On ergodic least-squares estimators of the generalized diffusion coefficient for fractional Brownian motion) can be also performed exactly by using the series representation of the confluent hypergeometric function and then resumming the resulting series. However, the expression obtained is rather lengthy as it contains several hypergeometric functions . On the other hand, the result in the form of Eq. (On ergodic least-squares estimators of the generalized diffusion coefficient for fractional Brownian motion) can be tackled by Mathematica; in addition the asymptotic behavior can be easily extracted from it, so that we prefer to work with the compact expression (On ergodic least-squares estimators of the generalized diffusion coefficient for fractional Brownian motion) rather than with an exact but cumbersome expression.

In Fig.1 we show the dependence of the variance of the estimator on the exponent , for different values of the Hurst index . We notice that for any fixed , the variance vanishes as approaches and is non-zero for any other value. This means that for a fractional Brownian motion with Hurst index the estimators in Eq. (3) with power-law weight functions possess an ergodic property only when .

The last issue we discuss is that of the decay rate of the variance when is small but finite in the ergodic case . It is straightforward to show from Eq. (8) that in the limit the variance is given to leading order by:

(12) |

where is a constant defined by:

(13) |

which exists for any . This result generalizes that of Ref. dean1 () for ordinary Brownian motion. We conclude that the variance of the estimator vanishes logarithmically with the total observation time. In other words, the diffusion constant estimated from one trajectory by this method tends toward the correct value logarithmically slowly. The prefactor , which is displayed in Fig.2, reaches a minimum at . From Fig.2, we notice that, keeping the resolution fixed, the variance of will be small for processes with , typically. This interval encompasses almost all the anomalous exponent values reported in single particle studies. Conversely, the function diverges as or . Therefore, we can expect that, even with the ergodic choice of , the estimates of the diffusion constant should become highly inaccurate for nearly localized or nearly ballistic fBm processes.

In conclusion, we have shown that the true, ensemble-average generalized diffusion coefficient of a fractional Brownian motion of known Hurst index can be obtained from single trajectory data using the weighted least-squares estimator in Eq. (3) with the weight function . Such an estimator possesses an ergodic property so that can be evaluated with any necessary precision but at the expense of increasing the observation time (or decreasing ). A limitation of the present class of estimators, which are based on single-time functionals of , is admittedly their slow convergence toward the ensemble average. Two-time functionals, based on the time averaged MSD, for instance, exhibit faster convergence: for fBm with the relative variance of the time averaged MSD vanishes as dengbarkai (). Nevertheless these other estimators might be more sensitive to measurement errors and may not be accurate when diffusion is no longer a pure process but a mixture of processes with different characteristic times. A quantitative comparison between estimators beyond the ideal cases considered here is a necessary future step.

###### Acknowledgements.

GO acknowledges helpful discussions with M. Kleptsyna. DSD, CMM and GO are partially supported by the ESF Research Network ”Exploring the Physics of Small Devices”. CMM is supported by the European Research Council and the Academy of Finland.## References

- (1) C. Bräuchle, D. C. Lamb, and J. Michaelis, Eds., Single particle tracking and single molecule energy transfer (Wiley-VCH, Weinheim, 2010).
- (2) M. J. Saxton and K. Jacobson, Ann. Rev. Biophys. Biomol. Struct. 26, 373 (1997).
- (3) T. G. Mason and D. A. Weitz, Phys. Rev. Lett. 74, 1250 (1995).
- (4) S. C. Weber, A. J. Spakowitz, and J. A. Theriot, Phys. Rev. Lett. 104, 238102 (2010).
- (5) G. Seisenberger et al., Science 294, 1929 (2001).
- (6) I. Bronstein et al., Phys. Rev. Lett. 103, 018102 (2009).
- (7) I. Golding and E. C. Cox, Phys. Rev. Lett. 96, 098102 (2006).
- (8) J. -H. Jeon et al., Phys. Rev. Lett. 106, 048103 (2011).
- (9) B. Mandelbrot and J. W. van Ness, SIAM Review 10, 422 (1968).
- (10) M. Magdziarz, A. Weron, and K. Burnecki, Phys. Rev. Lett. 103, 180602 (2009).
- (11) J. Szymanski and M. Weiss, Phys. Rev. Lett. 103, 038102 (2009).
- (12) J. H. Jeon, H. Martinez-Seara Monne, M. Javanainen, and R. Metzler, Phys. Rev. Lett. 109, 188103 (2012).
- (13) G. Guigas and M. Weiss, Biophys. J. 94, 90 (2008).
- (14) D. Boyer, D. S. Dean, C. Mejía-Monasterio and G. Oshanin, Preprint. arXiv:1211.1151 (2012).
- (15) W. Deng and E. Barkai, Phys. Rev. E 79, 011112 (2009).
- (16) Y. M. Wang, R. H. Austin and E. C. Cox, Phys. Rev. Lett. 97, 048302 (2006).
- (17) M. Goulian and S. M. Simon, Biophys. J. 79, 2188 (2000).
- (18) D. S. Grebenkov, Phys. Rev. E 83, 061117 (2011); Phys. Rev. E 84, 031124 (2011).
- (19) A. Andreanov and D. S. Grebenkov, J. Stat. Mech. P07001 (2012)
- (20) V. Tejedor et al., Biophys J 98, 1364 (2010).
- (21) A. J. Berglund, Phys. Rev. E 82, 011917 (2010).
- (22) X. Michalet, Phys. Rev. E 82, 041914 (2010); 83, 059904 (2011)
- (23) X. Michalet and A. J. Berglund, Phys. Rev. E 85, 061916 (2012)
- (24) D. Boyer and D. S. Dean, J. Phys. A: Math. Gen. 44, 335003 (2011).
- (25) D. Boyer, D. S. Dean, C. Mejía-Monasterio, and G. Oshanin, Phys. Rev. E 85, 031136 (2012).