Thermodynamic inference based on coarse-grained data or noisy measurements



Fluctuation theorems have become an important tool in single-molecule biophysics to measure free energy differences from non-equilibrium experiments. When significant coarse-graining or noise affects the measurements, the determination of the free energies becomes challenging. In order to address this thermodynamic inference problem, we propose improved estimators of free energy differences based on fluctuation theorems, which we test on a number of examples. The effect of the noise can be described by an effective temperature, which only depends on the signal-to-noise ratio, when the work is Gaussian distributed and uncorrelated with the error made on the work. The notion of effective temperature appears less useful for non-Gaussian work distributions or when the error is correlated with the work, but nevertheless, as we show, improved estimators can still be constructed for such cases. As an example of non-trivial correlations between the error and the work, we also consider measurements with delay, as described by linear Langevin equations.

05.40.-a, 05.70.-a, 05.70.Ln

I Introduction

Fluctuation theorems are symmetry relations which constrain the probability distributions of thermodynamic quantities arbitrarily far from equilibrium Jarzynski [1997a], ?, Crooks [1998], ?, Seifert [2012]. Their discovery represented major progress in our understanding of the second law of thermodynamics and has also accompanied many advances in the observation and manipulation of various experimental non-equilibrium systems, such as biopolymers Ribezzi-Crivellari and Ritort [2014], Gupta et al. [2011], manipulated colloids Wang et al. [2002], Carberry et al. [2007], mechanical oscillators or electronic circuits Ciliberto et al. [2010] or quantum devices Küng et al. [2012].

One major field of application of fluctuation theorems lies in the determination of free energies, through proper averaging of the work within well-defined non-equilibrium ensembles. In practice, in order to determine free energies using the Jarzynski relation Jarzynski [1997a], ? for instance, a large number of experiments is required in order to ensure that the rare trajectories which contribute the most are sampled correctly Jarzynski [2006].

In addition to this sampling problem, other sources of error in the determination of the free energy can arise from the measurement process itself. For instance, the experiment may involve some degrees of freedom which evolve on a much faster time scale than the response time of the measurement device; the experiment may not allow one to measure all the degrees of freedom needed to evaluate the work; or, for some other reason, the work may not be properly evaluated from the measurements. Clearly, a difference can easily arise between the true trajectories of the system and the coarse-grained or noisy trajectories which are in fact recorded. This uncertainty in the trajectories leads to a difference between the true work and the measured work, which we call the error and which limits our ability to determine free energy differences using fluctuation theorems.

In order to address this issue, a proper understanding of the way coarse-graining or measurement noise affects fluctuation relations is needed. The modifications of fluctuation relations due to coarse-graining have been studied by a number of authors following the original theoretical work of Rahav and Jarzynski [2007] and motivated by various experimental systems such as manipulated colloids Tusch et al. [2014], Mehl et al. [2012], granular systems Naert [2012], quantum dot devices Bulnes Cuetara et al. [2011], Küng et al. [2012], molecular motors Lacoste and Mallick [2009], Pietzonka et al. [2014], and single biopolymer molecules Dieterich et al. [2015], Alemany et al. [2015], Ribezzi-Crivellari and Ritort [2014]. For instance, for molecular motors, the issue of coarse-graining is central, since only their position is typically available as a function of time experimentally. The chemical consumption of ATP by these molecules is hidden and this limits our ability to use fluctuation theorems for molecular motors. Naturally, for other systems, the precise modifications of the fluctuation relations will take various forms depending on the original dynamics and the way coarse-graining is performed Esposito [2012], Bo and Celani [2014], Michel and Searles [2013].

The present paper addresses the effect of coarse-graining or noise on fluctuation theorems of the Jarzynski and Crooks type. It is closely related to two recent studies, the first one on the error associated with finite time step integration in Langevin equations Sivak et al. [2013] and the second one on thermodynamic inference of free energy differences in single-molecule experiments Alemany et al. [2015], Ribezzi-Crivellari and Ritort [2014]. Building mainly on these two works, we revisit this issue at a general level. We think that such an approach is pertinent since the question we are interested in is not bound to a specific experimental setup or dynamics: at some level, it originates from a fundamental property of entropy, namely its dependence on coarse-graining.

The remainder of the paper is organized as follows. In section II, we present general properties of the correction factors to the Jarzynski and Crooks relations. Then in section III, we first consider the simple case in which the work and the error are Gaussian distributed and the error is uncorrelated with the work. This example is then extended in two ways: first by considering non-Gaussian work distributions and then by considering the specific case in which the error is linearly correlated with the work. We end in section IV with a numerical verification of our results based on specific choices of dynamics. This section also includes an analytical and numerical study of a model based on Langevin equations for which correlations in the error arise due to measurement delays.

II General properties of fluctuation theorems with coarse-graining or noise

The Jarzynski relation Jarzynski [1997a], ? allows one to determine equilibrium free-energy differences from an average of non-equilibrium measurements:

$$\langle e^{-\beta W} \rangle = e^{-\beta \Delta F}, \tag{1}$$

where $W$ is the work done on a system, $\beta = 1/(k_B T)$ is the inverse temperature of the heat bath, and $\lambda_t$ denotes a protocol of variation of a control parameter between time $0$ and time $\tau$, which starts initially in an equilibrium state A corresponding to the value $\lambda_A$, and ends when the control parameter has reached $\lambda_B$ at time $\tau$. Although the state reached by the system at time $\tau$ is not in general an equilibrium one, $\Delta F$ represents the equilibrium free energy difference between the states corresponding to $\lambda_A$ and $\lambda_B$. The average in Eq. (1), denoted by $\langle \cdot \rangle$, is taken over all non-equilibrium trajectories which are realized in this process.

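Closely related to the Jarzynski relation, the Crooks fluctuation theorem constrains the ratio of the probability distribution of the work associated with an arbitrary protocol which starts in an equilibrium state, $P(W)$, to that of its time-reversed twin, $\tilde{P}(-W)$, associated with the reversed protocol $\tilde{\lambda}_t = \lambda_{\tau - t}$ Crooks [1998], ?:

$$\frac{P(W)}{\tilde{P}(-W)} = e^{\beta (W - \Delta F)}. \tag{2}$$

Both Eqs. (1) and (2) have been used experimentally to determine free-energy differences. From Eq. (1) it follows straightforwardly that $\Delta F = -\beta^{-1} \ln \langle e^{-\beta W} \rangle$, while from Eq. (2) one obtains $\Delta F = W^*$, where $W^*$ solves $P(W^*) = \tilde{P}(-W^*)$.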
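As a minimal numerical sketch of these two estimators, the snippet below evaluates $-\beta^{-1} \ln \langle e^{-\beta W} \rangle$ and locates the crossing of $P(W)$ and $\tilde{P}(-W)$ on synthetic data. It is an illustration under stated assumptions, not the procedure of any cited experiment: the function names, the Gaussian work samples and the parameter values ($\beta = 1$, forward mean 2, unit variance, hence $\Delta F = 1.5$) are hypothetical, and the crossing point is obtained with a shortcut valid only for Gaussian distributions of equal variance.

```python
import math
import random


def jarzynski_estimate(works, beta=1.0):
    """Return -(1/beta) * ln <exp(-beta * W)>, shifting by min(W) for numerical stability."""
    w_min = min(works)
    mean_exp = sum(math.exp(-beta * (w - w_min)) for w in works) / len(works)
    return w_min - math.log(mean_exp) / beta


def crooks_crossing_gaussian(forward_works, reverse_works):
    """Crossing point of P(W) and P~(-W), assuming both are Gaussian with equal variance.

    Under that assumption the crossing lies midway between the forward mean
    and minus the reverse mean.
    """
    mean_forward = sum(forward_works) / len(forward_works)
    mean_reverse = sum(reverse_works) / len(reverse_works)
    return 0.5 * (mean_forward - mean_reverse)


# Synthetic data (assumed for illustration): beta = 1, forward work ~ N(2, 1).
# The Crooks relation then requires reverse work ~ N(-1, 1), and Delta F = 2 - 1/2 = 1.5.
rng = random.Random(0)
forward = [rng.gauss(2.0, 1.0) for _ in range(200_000)]
reverse = [rng.gauss(-1.0, 1.0) for _ in range(200_000)]

delta_f_jarzynski = jarzynski_estimate(forward)
delta_f_crooks = crooks_crossing_gaussian(forward, reverse)
print(delta_f_jarzynski, delta_f_crooks)
```

Both numbers should lie close to 1.5; the Jarzynski average converges more slowly because it is dominated by rare low-work trajectories.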

As mentioned in the introduction, we are interested in situations in which the true work is not accessible due to coarse-graining or noise present in the measured variables or due to an incorrect evaluation of the work. To describe the first source of error, due to the trajectories, we distinguish the true trajectory of the system $[x]$, which will typically be inaccessible, from the measured (or coarse-grained) one, which is accessible and which we shall denote by $[y]$. Unless we specify otherwise, the distribution of the initial condition of the true trajectory, namely $p(x_0)$, is assumed to be at equilibrium. In contrast, the distribution of the initial condition of the measured trajectory, namely $p(y_0)$, does not need to be at equilibrium, and $y_0$ is typically correlated with $x_0$.

In order to describe the second source of error, at the level of the work itself, we assume that both works are evaluated from a Hamiltonian, but that two different Hamiltonians, $H$ or $H_m$, may be involved. More precisely, we define

$$W[x] = \int_0^\tau dt\, \dot{\lambda}_t\, \frac{\partial H(x_t, \lambda_t)}{\partial \lambda}, \tag{3}$$

$$W_m[y] = \int_0^\tau dt\, \dot{\lambda}_t\, \frac{\partial H_m(y_t, \lambda_t)}{\partial \lambda}. \tag{4}$$

With these notations, we write generally:

$$W_m[y] = W[x] + \epsilon[x, y], \tag{5}$$

where $W[x]$ denotes the true value of the work defined for the true trajectory $[x]$, $W_m[y]$ is similarly the measured work associated with the measured (or coarse-grained) trajectory, and $\epsilon$ is the corresponding error. For simplicity, we choose not to indicate explicitly the dependence on the driving in $W$ and $W_m$. This error can frequently be modeled as a Gaussian random variable with non-zero mean and variance. Furthermore, it may in general depend on the duration of the experiment and on the rate of change of the driving protocol, although we cannot exclude other contributions independent of the driving.

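Let us also introduce two correction factors, $\Delta_J$ and $\Delta_C(W_m)$, which capture respectively the modifications of Eq. (1) and Eq. (2) due to measurement errors or coarse-graining. The modified Jarzynski relation becomes

$$\langle e^{-\beta W_m} \rangle = e^{-\beta (\Delta F + \Delta_J)}, \tag{6}$$

and the modified Crooks relation becomes

$$\frac{P_m(W_m)}{\tilde{P}_m(-W_m)} = e^{\beta [W_m - \Delta F - \Delta_C(W_m)]}, \tag{7}$$

where $P_m(W_m)$ denotes the probability distribution of the measured work values, which equals $\langle \delta(W_m - W_m[y]) \rangle$. From these equations, it is apparent that both estimators of free energy are biased. Indeed, the first one leads to the estimate of free energy $\Delta F + \Delta_J$, while the second one leads to $W_m^*$, where $W_m^*$ solves $W_m^* = \Delta F + \Delta_C(W_m^*)$.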

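To shorten the notation, we shall denote the symmetry functions as

$$Y(W) = \ln \frac{P(W)}{\tilde{P}(-W)}, \tag{8}$$

and similarly

$$Y_m(W_m) = \ln \frac{P_m(W_m)}{\tilde{P}_m(-W_m)}. \tag{9}$$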

II.1 A joint distribution function based formulation

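In order to evaluate the correction factors $\Delta_J$ and $\Delta_C$, we rely on a symmetry relation for joint distributions García-García et al. [2010, 2012]. To understand how it is derived, it is useful to recall that at the heart of the Crooks relation, Eq. (2), there is a deeper statement on the path probability density of true trajectories, which is

$$\frac{\mathcal{P}[x]}{\tilde{\mathcal{P}}[\tilde{x}]} = e^{\beta (W[x] - \Delta F)}, \tag{10}$$

where it has been assumed that the system’s initial condition at $t = 0$ corresponds to equilibrium. The starting point of this derivation is the ratio of the joint probabilities of true and measured trajectories in the forward process to that in the reverse process:

$$\frac{\mathcal{P}[x, y]}{\tilde{\mathcal{P}}[\tilde{x}, \tilde{y}]} = \frac{\mathcal{P}[y|x]}{\tilde{\mathcal{P}}[\tilde{y}|\tilde{x}]}\, \frac{\mathcal{P}[x]}{\tilde{\mathcal{P}}[\tilde{x}]} = e^{R[x, y]}\, e^{\beta (W_m[y] - \epsilon[x, y] - \Delta F)}, \tag{11}$$

with $R[x, y] = \ln \left( \mathcal{P}[y|x] / \tilde{\mathcal{P}}[\tilde{y}|\tilde{x}] \right)$ probing the time reversal symmetry of the conditional probability $\mathcal{P}[y|x]$. In the last step, we have used Eq. (5) and Eq. (10). We can then write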


It is simple to show using Eq. (3) that the true work is antisymmetric under time reversal in the following sense:


where the tilde operation on or indicates that dynamics occurs in the presence of a reversed protocol. Naturally, given the similarity of definitions between the true and the measured works, the same property holds for the measured work:


As a result of these two relations, the error, defined in Eq. (5), is also antisymmetric under time reversal, . Then, using these relations and Eq. (11) we get


Integrating over , we have


Therefore one finally arrives at the relation




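In the following, we restrict ourselves to the case where the contribution of the conditional probability to this relation vanishes, which holds when $\mathcal{P}[y|x] = \tilde{\mathcal{P}}[\tilde{y}|\tilde{x}]$. As we shall see, this assumption is not too restrictive and already allows one to derive some interesting results. Under this assumption, Eq. (17) simplifies to

$$\frac{P(W_m, \epsilon)}{\tilde{P}(-W_m, -\epsilon)} = e^{\beta (W_m - \epsilon - \Delta F)}, \tag{19}$$

which is precisely the fluctuation theorem for the joint distribution of the measured work and the error García-García et al. [2012]. From Eq. (19) we can immediately derive Eq. (6):

$$\langle e^{-\beta W_m} \rangle = \int dW_m\, d\epsilon\, P(W_m, \epsilon)\, e^{-\beta W_m} = e^{-\beta \Delta F} \int d\epsilon\, \tilde{P}_\epsilon(-\epsilon)\, e^{-\beta \epsilon}, \tag{20}$$

leading to the explicit form of the correction to the Jarzynski estimator:

$$e^{-\beta \Delta_J} = \int d\epsilon\, \tilde{P}_\epsilon(-\epsilon)\, e^{-\beta \epsilon}, \tag{21}$$

in terms of the marginal time-reversed distribution of the error

$$\tilde{P}_\epsilon(\epsilon) = \int dW_m\, \tilde{P}(W_m, \epsilon). \tag{22}$$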

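We now proceed with Eq. (7), which can be easily deduced from Eq. (19). We have:

$$P_m(W_m) = \int d\epsilon\, P(W_m, \epsilon) = e^{\beta (W_m - \Delta F)}\, \tilde{P}_m(-W_m) \int d\epsilon\, \tilde{P}(-\epsilon | -W_m)\, e^{-\beta \epsilon}. \tag{23}$$

From Eq. (23) we immediately obtain Eq. (7) with the identification

$$e^{-\beta \Delta_C(W_m)} = \int d\epsilon\, \tilde{P}(-\epsilon | -W_m)\, e^{-\beta \epsilon}. \tag{24}$$

A link between $\Delta_J$ and $\Delta_C$ can be simply derived from the fact that the detailed theorem Eq. (7) must lead to the integral theorem Eq. (6):

$$\langle e^{-\beta W_m} \rangle = \int dW_m\, P_m(W_m)\, e^{-\beta W_m} = e^{-\beta \Delta F} \int dW_m\, \tilde{P}_m(-W_m)\, e^{-\beta \Delta_C(W_m)}, \tag{25}$$

which implies after comparing with Eq. (6):

$$e^{-\beta \Delta_J} = \int dW_m\, \tilde{P}_m(-W_m)\, e^{-\beta \Delta_C(W_m)}. \tag{26}$$

Notice that $\Delta_J$ only depends on the error distribution function in Eq. (21) or on the correlations between the measured work and the error in the equivalent formulation of Eq. (26). In both cases, the true work does not explicitly appear Sivak et al. [2013]. The same property holds for the correction $\Delta_C$.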

II.2 Explicit corrections for uncorrelated error

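In practice, the evaluation of the functions $\Delta_J$ and $\Delta_C$ is rather difficult since this requires a knowledge of the joint distribution of the error and the measured work. In order to progress, we introduce further assumptions in this section.

We can generally write the joint probability distribution of the measured work and the error as

$$\begin{aligned} P(W_m, \epsilon) &= P_m(W_m | \epsilon)\, P_\epsilon(\epsilon) \\ &= P(W | \epsilon) \left| \frac{\partial W}{\partial W_m} \right| P_\epsilon(\epsilon) \\ &= P(W_m - \epsilon | \epsilon)\, P_\epsilon(\epsilon), \end{aligned} \tag{27}$$

where in the second line, we have changed variables from $W_m$ to $W = W_m - \epsilon$ using Eq. (5); this change of variable has a unit Jacobian since $\epsilon$ is fixed, hence the third line. When the error is uncorrelated with the true work, $P(W | \epsilon) = P(W)$, and we obtain the following factorization relation:

$$P(W_m, \epsilon) = P(W_m - \epsilon)\, P_\epsilon(\epsilon). \tag{28}$$

Thanks to the factorization property of Eq. (28), the experimental work distribution becomes a simple convolution:

$$P_m(W_m) = \int d\epsilon\, P(W_m - \epsilon)\, P_\epsilon(\epsilon). \tag{29}$$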

Furthermore, the conditional probability of the measured work given the error is just the true work distribution, but shifted, $P_m(W_m | \epsilon) = P(W_m - \epsilon)$. By Bayes' formula, the conditional probability of the error given the measured work reads

$$P(\epsilon | W_m) = \frac{P(W_m - \epsilon)\, P_\epsilon(\epsilon)}{P_m(W_m)}. \tag{30}$$

From the last equation and Eq. (24), we obtain the form of $\Delta_C$ in terms of the true work and the error distributions:

$$e^{-\beta \Delta_C(W_m)} = \frac{1}{\tilde{P}_m(-W_m)} \int d\epsilon\, \tilde{P}(\epsilon - W_m)\, \tilde{P}_\epsilon(-\epsilon)\, e^{-\beta \epsilon}. \tag{31}$$

Eqs. (29) and (31) constitute the first main result of the present paper. These explicit expressions of the correction factors can be derived when it is possible to integrate out the contribution of the error independently of the other degrees of freedom of the system. More precisely, we have used two main assumptions: the first one is the invariance under time reversal of $\mathcal{P}[y|x]$ and the second one is the statistical independence of $W$ and $\epsilon$. As shown in Appendix A, taken together these assumptions also imply the invariance of the error distribution under time reversal, namely:

$$P_\epsilon(\epsilon) = \tilde{P}_\epsilon(-\epsilon). \tag{32}$$
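The statement that the Jarzynski correction depends only on the error distribution can be checked on synthetic data. The sketch below assumes an error that is statistically independent of the true work and symmetric under time reversal, so that the correction reduces to $\Delta_J = -\beta^{-1} \ln \langle e^{-\beta \epsilon} \rangle$; the two-valued error, the Gaussian true work and all names are hypothetical choices made for illustration. In a real experiment the error samples are not directly observable, so this is a consistency check rather than a calibration recipe.

```python
import math
import random


def jarzynski_correction(errors, beta=1.0):
    """Return Delta_J = -(1/beta) * ln <exp(-beta * eps)> from samples of the error."""
    mean_exp = sum(math.exp(-beta * e) for e in errors) / len(errors)
    return -math.log(mean_exp) / beta


def jarzynski_estimate(works, beta=1.0):
    """Return -(1/beta) * ln <exp(-beta * W)> from work samples."""
    mean_exp = sum(math.exp(-beta * w) for w in works) / len(works)
    return -math.log(mean_exp) / beta


rng = random.Random(1)
n_samples = 200_000
# Assumed true work: Gaussian N(2, 1) with beta = 1, so that Delta F = 1.5.
true_works = [rng.gauss(2.0, 1.0) for _ in range(n_samples)]
# Assumed error: non-Gaussian, zero mean, independent of the true work.
errors = [rng.choice((-0.5, 0.5)) for _ in range(n_samples)]
measured_works = [w + e for w, e in zip(true_works, errors)]

exact_correction = -math.log(math.cosh(0.5))   # Delta_J for the two-valued error
raw_estimate = jarzynski_estimate(measured_works)
corrected_estimate = raw_estimate - jarzynski_correction(errors)
print(raw_estimate, corrected_estimate)
```

The raw estimate is biased by $\Delta_J = -\ln \cosh(0.5) \approx -0.12$ even though the error has zero mean, and subtracting the correction restores a value close to 1.5.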


In the following, we present various applications of this framework to specific work and error distributions.

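III Consequences for specific work and error distributions

III.1 Uncorrelated Gaussian error and Gaussian work distribution

Before addressing more complex situations, it is instructive to consider a simple case where the true work and error distributions are Gaussian, and the error, of mean $\mu_\epsilon$ and variance $\sigma_\epsilon^2$, is assumed to be uncorrelated with the true work. In this case, the experimental work distribution will also be Gaussian, and the correction to the Crooks fluctuation theorem, $\Delta_C(W_m)$, will be a linear function of $W_m$. To be explicit, let us take the work and noise probability distributions of the form

$$P(W) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp \left[ -\frac{(W - \mu)^2}{2 \sigma^2} \right], \tag{33}$$

$$P_\epsilon(\epsilon) = \frac{1}{\sqrt{2 \pi \sigma_\epsilon^2}} \exp \left[ -\frac{(\epsilon - \mu_\epsilon)^2}{2 \sigma_\epsilon^2} \right]. \tag{34}$$

Naturally, since $W$ and $\epsilon$ are assumed to be uncorrelated, the variance of the measured work is simply the sum of the variances of the work and of the error: $\sigma_m^2 = \sigma^2 + \sigma_\epsilon^2$. Now, the bias in the Jarzynski estimator, $\Delta_J$, defined through $\langle e^{-\beta W_m} \rangle = e^{-\beta (\Delta F + \Delta_J)}$, can be evaluated using Eqs. (21), (32) and (34), with the result

$$\Delta_J = \mu_\epsilon - \frac{\beta \sigma_\epsilon^2}{2}, \tag{35}$$

which depends on temperature, the variance of the noise and its mean.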

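Let us now calculate the bias in the Crooks estimator, $\Delta_C(W_m)$, defined through $\ln [P_m(W_m) / \tilde{P}_m(-W_m)] = \beta [W_m - \Delta F - \Delta_C(W_m)]$, from Eqs. (31), (33) and (34). We find:

$$\Delta_C(W_m) = \frac{W_m - \mu + \beta \sigma^2 / 2 + r \mu_\epsilon}{1 + r}, \tag{36}$$

where $r = \sigma^2 / \sigma_\epsilon^2$ is the signal-to-noise ratio.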

Figure 1: Sketch of the effect of Gaussian uncorrelated noise on symmetry functions, or equivalently on the Crooks fluctuation theorem, for a Gaussian work distribution. If $\mu_\epsilon = 0$, the measurement noise produces a decrease of the slope of the symmetry function $Y_m(W_m)$ (green dotted line) as compared to $Y(W)$ (red dashed line). This change of slope (a rotation of the line) does not affect the intersection point with the work axis, which corresponds to the free-energy difference, $\Delta F$. When $\mu_\epsilon \neq 0$ however, the symmetry function is in addition translated by $\mu_\epsilon$ (black solid line). All energies are measured in units of $k_B T$.

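This result can be further simplified using the fluctuation theorem of the true work, namely $P(W) / \tilde{P}(-W) = e^{\beta (W - \Delta F)}$, which is equivalent in this case to $\Delta F = \mu - \beta \sigma^2 / 2$. Thus, we obtain

$$\Delta_C(W_m) = [1 - f(r)]\, (W_m - \Delta F) + f(r)\, \mu_\epsilon, \tag{37}$$

in terms of the signal-to-noise ratio $r = \sigma^2 / \sigma_\epsilon^2$ and the function $f(r)$. In this simple case, the Crooks theorem for the distribution of the measured work reads

$$Y_m(W_m) = \beta f(r)\, (W_m - \Delta F - \mu_\epsilon), \tag{38}$$

where $Y_m$ is the symmetry function defined in Eq. (9) and $f$ is the function:

$$f(r) = \frac{r}{1 + r}. \tag{39}$$

As expected, the Crooks fluctuation theorem is recovered in the absence of noise, i.e. when $r \to \infty$ and $\mu_\epsilon = 0$.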

It is apparent from Eq. (38) that the mean of the error shifts the estimation of the free energy by a constant, while the variance of the error affects the slope of the symmetry function. When the mean of the error is zero ($\mu_\epsilon = 0$), only the change of slope occurs. In that case, the Crooks estimator for the free energy is not biased, while the Jarzynski estimator is. As the amount of noise or coarse-graining increases, the signal-to-noise ratio decreases, and the slope of the symmetry function decreases. Since the intersection point of this straight line with the work axis always remains equal to the free energy difference, the line undergoes a rotation about the point $W_m = \Delta F$ on the work axis. When the mean of the error is non-zero, this straight line undergoes in addition a horizontal translation by the amount $\mu_\epsilon$, as shown in Fig. 1.

Notice that the change of slope can be equivalently described by a change of temperature. One can thus introduce an effective temperature, equal to the temperature of the heat bath $T$ divided by $f(r)$, and therefore larger than $T$ since $f(r) \le 1$ according to Eq. (39). In the linear response regime, the same effective temperature will appear in the ratio of the response and correlation functions Verley et al. [2011]. It is important to appreciate however that this notion of effective temperature only applies to situations like the present one where the correction in the Crooks relation, namely $\Delta_C(W_m)$, is linear. In general, this function is not linear, as will become clear in the next examples and in the section reporting numerical results. In such cases, this effective temperature is less meaningful.

To summarize the results of this section, we have shown that an additive correction to the work due to an instrument error or noise leads, in the case that the work and the error are Gaussian distributed, with uncorrelated error, to a multiplicative factor for the temperature, in other words, to an effective temperature. In addition, if the error has nonzero mean, the free-energy estimator is shifted by an amount precisely equal to the mean value of the error.
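The Gaussian results of this section can be verified directly from the densities. The sketch below assumes a Gaussian true work that satisfies the Crooks relation (so that the reverse work has mean $-(\mu - \beta \sigma^2)$) and an independent Gaussian error whose time-reversed distribution is mirrored (mean $-\mu_\epsilon$ in the reverse process). The function names and parameter values are hypothetical. It evaluates the measured symmetry function and compares it with a straight line of slope $\beta r / (1 + r)$ that crosses zero at $\Delta F + \mu_\epsilon$.

```python
import math


def log_normal_pdf(x, mean, var):
    """Logarithm of the normal density with the given mean and variance."""
    return -0.5 * (x - mean) ** 2 / var - 0.5 * math.log(2.0 * math.pi * var)


def measured_symmetry(w, beta, mu, var, mu_err, var_err):
    """ln[P_m(w) / P~_m(-w)] for Gaussian true work N(mu, var) and independent Gaussian error."""
    var_m = var + var_err  # variances add for uncorrelated work and error
    forward = log_normal_pdf(w, mu + mu_err, var_m)
    # Reverse true work has mean -(mu - beta * var); reverse error has mean -mu_err.
    reverse = log_normal_pdf(-w, -(mu - beta * var) - mu_err, var_m)
    return forward - reverse


def slope_factor(var, var_err):
    """f(r) = r / (1 + r) with signal-to-noise ratio r = var / var_err."""
    r = var / var_err
    return r / (1.0 + r)


# Assumed parameter values, in units where k_B T = 1.
beta, mu, var = 1.0, 2.0, 1.0
mu_err, var_err = 0.3, 0.5

delta_f = mu - 0.5 * beta * var
f = slope_factor(var, var_err)
effective_temperature_ratio = 1.0 / f                       # T_eff / T
crooks_estimate = delta_f + mu_err                           # zero of the symmetry function
jarzynski_estimate = delta_f + mu_err - 0.5 * beta * var_err
print(f, effective_temperature_ratio, crooks_estimate, jarzynski_estimate)
```

With these values the slope is reduced by $f = 2/3$, which corresponds to an effective temperature of $1.5\,T$, while the two estimators are biased by different amounts.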

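III.2 Uncorrelated Gaussian error with arbitrary work distribution

We now show how to correct for measurement errors when the true work distribution is arbitrary, keeping the same assumptions for the error (uncorrelated and Gaussian distributed). We use Eq. (29) in order to relate the probability distribution of the measured work to the probability distribution of the true work. Let us implement a shift by an arbitrary quantity $\Delta$ in the argument of this distribution:

$$P_m(W_m + \Delta) = \int d\epsilon\, P(W_m + \Delta - \epsilon)\, P_\epsilon(\epsilon), \tag{40}$$

$$\tilde{P}_m(-W_m + \Delta) = \int d\epsilon\, \tilde{P}(-W_m + \Delta - \epsilon)\, \tilde{P}_\epsilon(\epsilon) = \int d\epsilon\, \tilde{P}(-W_m + \Delta - \epsilon)\, P_\epsilon(-\epsilon), \tag{41}$$

where we have used Eq. (32) in the last step of Eq. (41).

After the changes of variables $\epsilon \to \epsilon + \Delta$ in Eq. (40) and $\epsilon \to \Delta - \epsilon$ in Eq. (41), we get:

$$P_m(W_m + \Delta) = \int d\epsilon\, P(W_m - \epsilon)\, P_\epsilon(\epsilon + \Delta), \tag{42}$$

$$\begin{aligned} \tilde{P}_m(-W_m + \Delta) &= \int d\epsilon\, \tilde{P}(-W_m + \epsilon)\, P_\epsilon(\epsilon - \Delta) \\ &= e^{-\beta (W_m - \Delta F)}\, e^{-2 \Delta \mu_\epsilon / \sigma_\epsilon^2} \int d\epsilon\, P(W_m - \epsilon)\, P_\epsilon(\epsilon + \Delta)\, e^{(\beta + 2 \Delta / \sigma_\epsilon^2)\, \epsilon}, \end{aligned} \tag{43}$$

where we have used, in the last step of Eq. (43), the Crooks relation for the true work, Eq. (2), and the explicit form of the error distribution, Eq. (34). It is now clear that choosing $\Delta = -\beta \sigma_\epsilon^2 / 2$ leads to:

$$\frac{P_m(W_m - \beta \sigma_\epsilon^2 / 2)}{\tilde{P}_m(-W_m - \beta \sigma_\epsilon^2 / 2)} = e^{\beta (W_m - \Delta F - \mu_\epsilon)}. \tag{44}$$

Let us first analyze the case of unbiased error, $\mu_\epsilon = 0$. We observe that, remarkably, the shift in Eq. (44) removes the bias that was present in the Crooks estimator for measured work and at the same time provides the correct slope for the fluctuation theorem. Thus, the transformation of Eq. (44) solves in a simple way two problems at once: the need to calibrate the experiment against noise and the problem of the bias in the estimator. We shall illustrate this method using simulations in Sec. IV.4.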
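A deterministic check of the shifted relation for a non-Gaussian work distribution is sketched below. The construction is an assumption made for illustration and is not taken from the simulations of Sec. IV: the true work is a two-component Gaussian mixture with common variance, the reverse distribution is fixed by imposing the Crooks relation, the error is Gaussian, independent and unbiased, and the shift is taken as $-\beta \sigma_\epsilon^2 / 2$ applied to both arguments, i.e. the ratio $P_m(W_m + \Delta) / \tilde{P}_m(-W_m + \Delta)$. All names and parameter values are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Normal density with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


BETA = 1.0
WEIGHTS, MEANS, VAR_W = (0.7, 0.3), (1.0, 3.0), 0.5   # bimodal true work (assumed)
MU_ERR, VAR_ERR = 0.0, 0.8                             # unbiased Gaussian error (assumed)

# Imposing P~(-W) = P(W) exp(-beta (W - Delta F)) on a Gaussian mixture gives another
# mixture with reweighted components and means shifted by -beta * VAR_W.
boltzmann = [w * math.exp(-BETA * m + 0.5 * BETA ** 2 * VAR_W) for w, m in zip(WEIGHTS, MEANS)]
DELTA_F = -math.log(sum(boltzmann)) / BETA
REV_WEIGHTS = [b / sum(boltzmann) for b in boltzmann]

VAR_M = VAR_W + VAR_ERR  # convolution with the Gaussian error adds the variances


def p_forward_measured(w):
    """Density of the measured work in the forward process."""
    return sum(wt * normal_pdf(w, m + MU_ERR, VAR_M) for wt, m in zip(WEIGHTS, MEANS))


def p_reverse_measured(w):
    """Density of the measured work in the reverse process (error mirrored)."""
    return sum(wt * normal_pdf(w, -(m - BETA * VAR_W) - MU_ERR, VAR_M)
               for wt, m in zip(REV_WEIGHTS, MEANS))


def shifted_symmetry(w, shift):
    """ln[P_m(w + shift) / P~_m(-w + shift)]."""
    return math.log(p_forward_measured(w + shift) / p_reverse_measured(-w + shift))


SHIFT = -0.5 * BETA * VAR_ERR
grid = [-1.0, 0.5, 2.0, 4.0]
shifted = [shifted_symmetry(w, SHIFT) for w in grid]
expected = [BETA * (w - DELTA_F - MU_ERR) for w in grid]
unshifted_at_delta_f = shifted_symmetry(DELTA_F, 0.0)
print(shifted, expected, unshifted_at_delta_f)
```

With the shift, the symmetry function coincides with the straight line of slope $\beta$ through $\Delta F$; without it, the symmetry function does not vanish at $\Delta F$, which is the bias of the unshifted Crooks estimator for this non-Gaussian example.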

This result fully agrees with the results of Ref. Ribezzi-Crivellari and Ritort [2014], which is concerned with the inference of free energies from partial work measurements in the context of single-molecule experiments. The authors of this work showed that a shift of the type of Eq. (44) can be used to exploit measurements of the “wrong” work in a symmetric dual trap system, in which one of the traps is fixed, while the other one is moved. Such a transformation allows one to recover the correct work distribution when the work distribution is Gaussian and to eliminate the biases in the Jarzynski and Crooks estimators. However, as recognized by the authors, in the case of an asymmetric setup of the traps, a shift of this kind does not allow one to recover the correct work distribution (see Ref. Ribezzi-Crivellari and Ritort [2014] for details). This corresponds to our biased case, when $\mu_\epsilon \neq 0$. In such a case, the elimination of the bias in the Crooks estimator is in principle not possible, at least not in the absence of additional information on the error distribution Alemany et al. [2015].

III.3 Correlated Non-Gaussian error distribution

Before moving to more complicated cases where the error is correlated with the true work and is non-Gaussian, let us consider a simple extension of the previous example. Let us assume that the error is of the form


so that the error is now the sum of a part which is proportional to the measured work, and another part , which is still uncorrelated with the true work . By construction, the previous case is recovered for . Note that when is non-Gaussian, this total error will also be non-Gaussian and correlated with .

Let us introduce the probability distribution of the uncorrelated part of the error, . As before with Eq. (II.2), we consider the joint distribution


Now, using the property that the variable is uncorrelated with the true work , we obtain


An important point is that Eqs. (21) and (24) do not hold in terms of and respectively, since is not the total error, but only its uncorrelated part. For instance, using Eq. (24) and recalling that , we will now have:


where we have used the subscript to make explicit the dependence on this parameter, and we have noticed, given that is uncorrelated from , that the second term in the second line of Eq. (III.3) is given exactly by Eq. (31) with the substitution . Note that this result could also be derived by directly computing the joint probability of and , which can be easily done as follows:


where we have used Eq. (47) to get the last line. Thus, we have for


Introducing , and using directly Eq. (24) together with Eqs. (III.3) and (50), we again obtain (III.3).

Notice that, in particular, when the distributions of the true work and are Gaussian distributed, one obtains


with , , and defined as before, but now in terms of the variance of the uncorrelated part of the error, .

It is worth noting, as we see from Eq. (III.3), that this type of correlation only introduces a stretching of the original via a rescaling of , plus an additional correction which is linear in . In particular, in the Gaussian case the stretching can be reabsorbed in the linear correction because is linear in for .

For the case of non-Gaussian work distributions but with a Gaussian distribution of , it is interesting to seek a relation of the type of Eq. (44) as improved estimators of free energy. Proceeding in the same way as before, the expressions for the forward and reverse probability distributions of the measured work shifted by an amount are:




where we have used the relation


which holds under the same assumptions leading to Eq. (19), as shown in Appendix B.

Let us now assume has a mean and a variance , and for any arbitrary , let us introduce the shifted symmetry function


It can be shown that when , this shifted symmetry function has a simple form:


It is important at this point to contrast this result with that obtained in Eq. (44) for . Although one obtains again a linear relation for the shifted symmetry function, the slope is not one (in units of ) but . Since a priori neither nor are known, one should vary the shift parameter in a plot of versus , until the data points collapse on a straight line. From the value of the slope of that line, the value of can be inferred, and from the actual value of , the value of can then be deduced. To apply this method, it is important to be sure that there is a unique value of the optimal shift . We address this point in Appendix C by proving that there is indeed a unique optimal shift and furthermore that for no other value of , the symmetry function is a linear function of . Naturally, this proof includes the case considered previously.

When , this transformation of the symmetry function leads to a complete calibration since no other parameter needs to be fixed, and the correct estimate of the free-energy difference can be recovered, as we shall illustrate numerically in Sec. IV.5. However, when , the estimator is biased by the mean of the error in a way which can not be fixed in the absence of additional information, as also found in the previous case.

IV Applications to specific choices of dynamics for the measured variable

In this section, we shall apply the theoretical framework developed in previous sections to some specific dynamics for the measured variable. Before we do so, we discuss the choice of measured variables in single molecule experiments (typically position or force). Then, assuming the position is the measured variable, we discuss the consequences of the particular choice of the relation between the dynamics of the measured position and that of the true position. Here, we shall restrict ourselves to two separate cases:

(a) Simple additive noise: the measured position $y_t$ and the true position $x_t$ are related by

$$y_t = x_t + \eta_t, \tag{57}$$

(b) Additive noise with delay: the measured position and the true position are related by


From an experimental point of view, case (a) describes purely random measurement errors, which corresponds to the assumption that the true position $x_t$ and the noise $\eta_t = y_t - x_t$ are uncorrelated. In contrast, case (b) describes a situation where these variables are correlated because the measurement device introduces a delay between $x_t$ and its measured value, $y_t$. Clearly, both cases are relevant experimentally.

Furthermore, for both dynamics (a) and (b), we assume the distribution of $x_0$ to be an equilibrium one, while that of $y_0$ is not, but corresponds to a stationary non-equilibrium distribution. The system can be prepared in such a state at $t = 0$ by starting the evolution at a much earlier time in the absence of driving, so that the distributions of $x_0$ and $y_0$ are both stationary. Naturally, both variables $x_0$ and $y_0$ may still be correlated with each other.

IV.1 Choice of measured variable: position vs. force

Before implementing the above dynamics, let us discuss a practical question regarding the choice of measured variables in single-molecule experiments. In a first setup, where the position is measured, the Hamiltonian which is typically used has the form $H = H_0(z) + U(x, \lambda) + H_{\rm int}(x, z)$, where $H_0$ describes the macromolecule under study (a DNA filament or RNA hairpin, for instance), with $z$ labeling the relevant degrees of freedom of that system. This molecule is attached to a bead which is held in an optical trap, and the energy of the bead is given by $U(x, \lambda)$, where $x$ is the position of the bead and $\lambda$ the position of the trap center. Finally, $H_{\rm int}$ accounts for the coupling between the molecule and the bead.

Usually, the calibration of optical tweezers relies on a harmonic approximation for the trapping potential, $U(x, \lambda) = k (x - \lambda)^2 / 2$, where $k$ denotes the stiffness of the trap and $\lambda$ the position of its center. In this case, the work is

$$W = \int_0^\tau dt\, \dot{\lambda}_t\, \frac{\partial U}{\partial \lambda} = -k \int_0^\tau dt\, \dot{\lambda}_t\, (x_t - \lambda_t),$$

which does not depend explicitly on the degrees of freedom of the molecule under study, characterized by $z$. The work on the system is thus exactly equal to the work on the bead, since the trap is the only term of the Hamiltonian which depends on $\lambda$. The structure of the error in this situation is very simple:

$$\epsilon = W_m - W = -k \int_0^\tau dt\, \dot{\lambda}_t\, (y_t - x_t), \tag{59}$$

which shows that the error increases with the driving speed $\dot{\lambda}_t$ and accumulates with the duration of the experiment $\tau$.
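To make the dependence on driving speed and duration concrete, the following sketch discretizes the error for a trap moved at constant speed, assuming it takes the form $\epsilon = -k \int_0^\tau dt\, \dot{\lambda}_t (y_t - x_t)$ that follows from the harmonic trap when the same Hamiltonian is used for both works. The sampling interval, the independent Gaussian position noise at each sample and all names are hypothetical modeling choices; a continuous-time treatment would require specifying the correlation time of the noise.

```python
import random


def work_error(noise, stiffness, speed, dt):
    """Discretized eps = -k * sum_i speed * (y_i - x_i) * dt for a constant pulling speed."""
    return -stiffness * speed * dt * sum(noise)


def error_variance(stiffness, speed, dt, n_steps, noise_var):
    """Variance of eps when the position errors y_i - x_i are independent with variance noise_var."""
    return (stiffness * speed * dt) ** 2 * n_steps * noise_var


# Assumed parameter values (arbitrary units).
rng = random.Random(2)
k, v, dt, n_steps, noise_std = 1.0, 0.5, 0.01, 100, 2.0

samples = [
    work_error([rng.gauss(0.0, noise_std) for _ in range(n_steps)], k, v, dt)
    for _ in range(4000)
]
mean_error = sum(samples) / len(samples)
empirical_var = sum((s - mean_error) ** 2 for s in samples) / (len(samples) - 1)
predicted_var = error_variance(k, v, dt, n_steps, noise_std ** 2)
print(empirical_var, predicted_var)
```

In this discretized model the variance of the error grows quadratically with the pulling speed and linearly with the number of samples, that is, with the duration of the experiment at fixed sampling interval.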

One limitation of such a setup where the position is measured lies in the harmonic approximation used for the trapping potential, an approximation which is expected to fail at large distances from the bead to the center of the trap. Furthermore, recent studies have found great variability in the trap stiffness as a function of the position, even in the region where a constant stiffness was expected Jahnel et al. [2011]. To overcome such issues, a different setup is often preferred, where no assumption on the form of the trapping potential is needed.

In this alternative setup, the force, rather than the position, is directly measured from the change in the momentum flux of the light beam impinging on the optical trap Smith et al. [2003]. There is no need to assume a particular form of the trapping potential: one rather measures the force signal, $f_m$, which also has some noise (i.e., $f_m \neq f$, the true force exerted by the optical trap). The position of the center of the trap is the control parameter, which we assume to be error free as we did so far. In some setups one does not have direct access to the position of the trap and one also has to infer it with some error, but we dismiss that possibility here and assume that this is our control parameter. For this setup the work reads:


with . Note that the trapping potential , thus, , and the definition (60) coincides with the Jarzynski work Jarzynski [1997a], ?, satisfying the nonequilibrium work theorem in the form given by (1). In this case the structure of the error is also very simple


It is worth noting that Eq. (59) and Eq. (61) have the same structure. In addition, note that the assumption that $\lambda$ is error free is not very restrictive. This can be seen as follows. In the first setup, one can redefine the distances and consider the error in measuring $x - \lambda$ instead of $x$ alone. In the second case, one does not need to know the value of $\lambda$ in order to calculate the work because the force is directly recorded. In both cases what remains free is the pulling velocity, $\dot{\lambda}$, which is very well controlled even if $\lambda$ itself is not.

Since it is a rather simple matter to switch between the notation for the force setup and that for the position setup, we limit ourselves in the rest of the paper to only one case, which we choose to be the position setup.

IV.2 Corrected Jarzynski estimator

Let us derive the correction to the Jarzynski estimator in the presence of measurement error within dynamics (a) defined in Eq. (57).


In this case one has , where is the path probability density of the error trajectory . Thus, since the error in Eq. (59) is a linear functional of , it can be integrated explicitly. We thus have


where we have used the Jarzynski equality, Eq. (1), and we have introduced the generating functional of the cumulants of , . From this, we obtain the following estimate of the free energy,