# Entropy production and coarse-graining in Markov processes

A. Puglisi, CNR-ISC c/o Dipartimento di Fisica, Università Sapienza, p.le A. Moro 2, 00185 Roma, Italy and Istituto Sistemi Complessi (ISC), CNR, via dei Taurini 19, 00185 Roma

S. Pigolotti, The Niels Bohr International Academy, The Niels Bohr Institute, Blegdamsvej 17, DK-2100 Copenhagen, Denmark

L. Rondoni, Dipartimento di Matematica and INFN, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy

A. Vulpiani, Dipartimento di Fisica, CNR-ISC and INFN, Università Sapienza, p.le A. Moro 2, 00185 Roma, Italy

###### Abstract

We study the large time fluctuations of entropy production in Markov processes. In particular, we consider the effect of a coarse-graining procedure which decimates fast states with respect to a given time threshold. Our results provide strong evidence that entropy production is not directly affected by this decimation, provided that it does not entirely remove loops carrying a net probability current. After the study of some examples of random walks on simple graphs, we apply our analysis to a network model for the kinesin cycle, which is an important biomolecular motor. A tentative general theory of these facts, based on Schnakenberg’s network theory, is proposed.

To our friend and colleague Massimo Falcioni, on his 60th birthday

## I Introduction

The coarse-graining procedure is a fundamental ingredient of the statistical description of physical systems Ma (1985); Kadanoff (2000); Castiglione et al. (2008). By coarse-graining we mean a procedure which reduces the number of observables in order to simplify the physical description. For instance, it is used to describe the behaviour of the physically relevant quantities, or slow variables, which depends on the coupling among all variables characterizing the system of interest, including the so-called fast variables. The archetype of such a procedure is the treatment of Brownian colloidal particles, immersed in a fluid, in terms of the Langevin equation. In this sense any model meant to represent a real phenomenon may be thought of as a coarse-grained, i.e. reduced, description. The purpose of a model is, indeed, to advance our understanding of the object under investigation, by highlighting its interesting features and discarding the irrelevant ones. In turn, which characteristics count as relevant or irrelevant depends on the purpose of the analysis to be performed. Furthermore, it is not always obvious which quantities should be listed as interesting, and which ones should be neglected, especially if a new problem is to be tackled Ma (1985); Kadanoff (2000); Castiglione et al. (2008); Bonaldi et al. (2009); Gregorio et al. (2009). Therefore, it is critical to understand how specific physical observables depend on the coarse-graining procedure.

Examples of coarse-grained descriptions at different resolution levels include the steps meant to connect the microscopic descriptions of systems of physical interest to the macroscopic ones, for instance the passage from the deterministic $\Gamma$-space description (positions and momenta of the particles) to the stochastic $\mu$-space description (position and momentum of one particle), up to macroscopic descriptions such as hydrodynamics, Fourier law, Navier-Stokes equations, etc.

Other methods use the coarse-graining procedure in order to reduce the number of variables, e.g. by a decimation method which suppresses the fast variables, or perform a spatial coarse-graining, as in the renormalization group approach. In these methods the coarse-graining is parametrized by some threshold, here denoted as coarse-graining level (CGL). This paper is devoted to the investigation of the impact of variations of CGL on the entropy production of non-equilibrium systems.

In the last decades, the introduction of the so-called Fluctuation Relations (FR) for deterministic dynamics by Evans, Cohen, Morriss, Gallavotti, Jarzynski and other authors brought about important developments in the physics of far-from-equilibrium systems Evans et al. (1993); Gallavotti and Cohen (1995); Jarzynski (1997). In the specific context of Markov processes, discussed here, Lebowitz and Spohn Lebowitz and Spohn (1999) showed that the “entropy production” per unit time $W_t$, measured on a time interval $t$, is described by a large deviation theory whose Cramer function, $C(W_t)$, enjoys the following symmetry property:

$$C(W_t) - C(-W_t) = -W_t .$$

This relation is the stochastic counterpart of the deterministic steady state FR, and we call it Lebowitz-Spohn FR, or simply FR.

We remark that the FR does not provide any specific information about the shape of $C$. Therefore, a coarse-graining procedure which preserves the Markovian character of the model should preserve the validity of the FR as well, although it may change the shape of the Cramer function. As a matter of fact, Rahav and Jarzynski Rahav and Jarzynski (2007) argue that the validity of the FR is little affected by the coarse-graining procedure, even in nontrivial cases, such as those in which the decimation (or blocking) of variables results in the loss of the Markovian property.

In the present paper, at variance with Ref. Rahav and Jarzynski (2007), we do not address the question of validity of the FR (which is always satisfied by our models), but focus our attention on the effects of the decimation procedures on the behaviour of the Cramer function.

Understanding how $C$ changes under variations of the CGL is relevant, e.g. to interpret experimental results, since they are always obtained at finite resolution (for instance in frequency, Bonaldi et al. (2009)). Likewise, any model describing a real system is necessarily affected by some degree of approximation or of idealization. For instance, the entropy production defined by Lebowitz and Spohn appears to be a rather abstract quantity, depending on the direct and inverse trajectories in the state space, as well as on their probabilities in the stationary state, cf. Section II. Furthermore, such a quantity cannot be measured in a direct way. Therefore, it needs to be connected to directly measurable quantities, for its properties to be assessed.

Naively, one may expect the entropy production computed through a model which encompasses lots of details of the system of interest to be higher than that computed through less detailed models. Consider for example the Markov chain depicted in Fig. 1a, with transition probabilities and much larger than the remaining ones. One may impose that a net current flows from to and from to , by choosing transition probabilities and and by tuning the other parameters so that all states have the same stationary probability.

In this system, detailed balance does not hold, hence the mean entropy production is positive. However, the mean entropy production vanishes if the fast states and are decimated, and the Markov chain is reduced to the one represented in Fig. 1b, where corresponds to the old state and to the old state . Indeed, detailed balance, which implies a vanishing entropy production, holds in any Markov chain with two states.
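The closing statement about two-state chains can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names and the numerical rates are hypothetical, and the same algebra applies whether `a` and `b` are read as rates or as one-step transition probabilities.

```python
def two_state_stationary(a, b):
    """Stationary probabilities of a two-state chain.

    a: transition weight 1 -> 2, b: transition weight 2 -> 1 (hypothetical values).
    """
    total = a + b
    return b / total, a / total


def net_current(a, b):
    """Net stationary probability current from state 1 to state 2."""
    p1, p2 = two_state_stationary(a, b)
    return p1 * a - p2 * b


# Whatever the asymmetry between a and b, the net current vanishes,
# which is the detailed balance condition for the two-state chain.
current = net_current(3.0, 0.5)
```

With only two states there is no loop along which a net current could circulate, so the stationary flux in one direction is always compensated by the flux in the other.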

In the following we consider Markovian systems described by a Master equation:

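$$\frac{dP_n}{dt} = \sum_{l \neq n} P_l W_{l \to n} - W^0_n P_n \tag{1}$$

where $W_{l \to n}$ is the transition rate from state $l$ to state $n$, with $l, n \in \{1, \dots, N\}$, $N$ being the number of possible states of the process, $P_n(t)$ is the probability to be in the state $n$ at time $t$, and

$$W^0_n = \sum_{l \neq n} W_{n \to l}. \tag{2}$$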
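A minimal numerical sketch of Eqs. (1) and (2) may help fix the notation. It is not the authors' code: the three-state rate matrix is a hypothetical example, states are indexed from 0, and the explicit Euler relaxation is chosen only for brevity.

```python
def exit_rates(W):
    """Eq. (2): total exit rate of each state; W[n][l] is the rate n -> l."""
    N = len(W)
    return [sum(W[n][l] for l in range(N) if l != n) for n in range(N)]


def master_rhs(P, W):
    """Right-hand side of Eq. (1): gain from the other states minus loss."""
    N = len(W)
    W0 = exit_rates(W)
    return [sum(P[l] * W[l][n] for l in range(N) if l != n) - W0[n] * P[n]
            for n in range(N)]


def stationary(W, dt=0.01, steps=20000):
    """Relax a uniform initial condition with explicit Euler steps."""
    N = len(W)
    P = [1.0 / N] * N
    for _ in range(steps):
        rhs = master_rhs(P, W)
        P = [p + dt * r for p, r in zip(P, rhs)]
    return P


# Hypothetical three-state process with asymmetric rates.
W = [[0.0, 1.0, 0.5],
     [0.5, 0.0, 2.0],
     [4.0, 0.5, 0.0]]
P_inv = stationary(W)
```

The time step must stay well below the inverse of the largest exit rate for the Euler scheme to be stable; for stiff rate matrices a linear solve for the null vector of the generator would be a better choice.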

We adopt the coarse-graining procedure introduced in Ref. Pigolotti and Vulpiani (2008) and described in Appendix A, which amounts to a decimation of all states whose characteristic times are smaller than a given threshold time. The resulting master equation for the surviving states may be written as

$$\frac{d\tilde{P}_j}{dt} = \sum_{i \neq j} \tilde{P}_i \tilde{W}_{i \to j} - \tilde{W}^0_j \tilde{P}_j \tag{3}$$

with transition rates as prescribed by Pigolotti and Vulpiani (2008), , , and for all .

As the threshold time is varied, the number of slow states and the shape of the Cramer function may in principle change.

In Section II, we present some numerical results for differently decimated Markov processes. In Section III, we discuss the possibility of constructing a general theory, not yet available, of the decimation effects. In Section IV, we draw some conclusions and discuss open problems. Appendix A illustrates the decimation procedure; Appendix B reports analytical results about the effect of decimation on the current in a single loop; Appendix C recalls the graph analysis of currents in a Markov system, based on Schnakenberg’s theory; Appendix D describes in detail the kinesin model mentioned at the end of Section II; Appendix E lists the main symbols used in the text.

## II Some numerical results

### II.1 Entropy production on a trajectory

While the concept of entropy production, or energy dissipation, dates a long time back de Groot and Mazur (1984), only recently have the fluctuations of entropy production attracted particular interest, thanks to important theoretical and numerical results, supported by some experimental evidence. For a trajectory of duration $t$ of a continuous time Markov process, in which $m$ transitions are observed, $\omega_i$ being the $i$-th visited state, the following definition of entropy production has been given by Lebowitz and Spohn Lebowitz and Spohn (1999):

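$$W_t = \frac{1}{t} \ln \frac{W_{\omega_0 \to \omega_1} W_{\omega_1 \to \omega_2} \cdots W_{\omega_{m-1} \to \omega_m}}{W_{\omega_1 \to \omega_0} W_{\omega_2 \to \omega_1} \cdots W_{\omega_m \to \omega_{m-1}}}. \tag{4}$$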

It can be shown that the times at which the transitions occur affect the numerical value of $W_t$ only with corrections of order $1/t$, negligible in the $t \to \infty$ limit. In the present paper, we consider this entropy production, introduced by Lebowitz and Spohn. Clearly, this quantity does not need to represent any real thermodynamic observable, since it can be defined independently of the physical relevance of the Markov process at hand. Nevertheless, as commonly done in the literature, we will refer to it simply as “entropy production”.
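Definition (4) translates almost literally into code. The following sketch assumes the rates are stored in a nested mapping `W[a][b]` (rate of the transition from `a` to `b`); this data layout and the three-state example are illustrative choices, not taken from the paper.

```python
import math


def entropy_production(path, W, t):
    """Eq. (4): (1/t) * log of the forward over backward rate products.

    path: sequence of visited states, W[a][b]: rate a -> b, t: trajectory duration.
    """
    log_ratio = 0.0
    for a, b in zip(path, path[1:]):
        log_ratio += math.log(W[a][b]) - math.log(W[b][a])
    return log_ratio / t


# Hypothetical three-state ring: rate 2 clockwise, rate 1 counter-clockwise.
W = {0: {1: 2.0, 2: 1.0}, 1: {2: 2.0, 0: 1.0}, 2: {0: 2.0, 1: 1.0}}

one_turn = entropy_production([0, 1, 2, 0], W, t=2.0)   # one full clockwise turn
back_and_forth = entropy_production([0, 1, 0], W, t=2.0)  # a jump and its reverse
```

A trajectory that retraces its steps contributes nothing, while each completed turn around the loop contributes the same fixed amount, which is why only circulating currents matter at large times.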

Even if in this paper we consider the case of continuous time, in the discrete time case (Markov chains) one can use the same definition (4), by replacing with the probability of a transition in a time step , . The relation between continuous and discrete time quantities is incorporated in the equalities for , and .

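The connections of $W_t$ with other definitions of entropy production rate are discussed in Appendix C. Here, it suffices to recall that the mean entropy production vanishes, $\langle W_t \rangle = 0$, in the steady state, if the invariant probability $P^{\mathrm{inv}}$ of the process satisfies the detailed balance condition

$$\frac{W_{\omega \to \omega'}}{W_{\omega' \to \omega}} = \frac{P^{\mathrm{inv}}_{\omega'}}{P^{\mathrm{inv}}_{\omega}}. \tag{5}$$

The system is in equilibrium if eq. (5) holds. If detailed balance does not hold, one has $\langle W_t \rangle > 0$. Let $C(W_t)$ be the Cramer function of the probability density function (pdf) $f(W_t)$ of $W_t$ in the steady state, i.e. let $C$ be defined by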

$$C(W_t) = -\lim_{t \to \infty} \frac{1}{t} \log[f(W_t)]. \tag{6}$$

In numerical calculations, the Cramer function must be approximated by its finite-time counterpart. Therefore, in our calculations, we have chosen times large enough that further increases of the averaging time have practically no effect on our results.
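As a rough sketch of how a finite-time approximation can be obtained from sampled values of the entropy production, one may histogram the samples and apply the definition (6) at fixed $t$. The binning scheme, the function name and the toy input are assumptions made for illustration; the paper does not specify its estimator.

```python
import math
from collections import Counter


def cramer_estimate(samples, t, bin_width):
    """Histogram estimate of -(1/t) * log f(w) from samples of W_t.

    Returns a dict mapping bin centres to the estimated finite-time Cramer function.
    """
    counts = Counter(math.floor(w / bin_width) for w in samples)
    n = len(samples)
    estimate = {}
    for k, c in sorted(counts.items()):
        density = c / (n * bin_width)       # normalised histogram height
        center = (k + 0.5) * bin_width
        estimate[center] = -math.log(density) / t
    return estimate


# Toy input: four hypothetical samples of W_t measured over t = 10.
toy = cramer_estimate([0.1, 0.1, 0.3, 0.1], t=10.0, bin_width=0.2)
```

In practice the tails of the histogram are poorly sampled, so the estimate is reliable only in a window around the most probable value; repeating the estimate at increasing $t$ shows whether it has converged.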

Lebowitz and Spohn have shown that the condition

$$C(W_t) - C(-W_t) = -W_t, \tag{7}$$

is better and better approximated as the time $t$ grows Lebowitz and Spohn (1999). The $t \to \infty$ limit of relation (7) is known as a Steady State Fluctuation Relation (SSFR). It does not provide the shape of $C$, but only a symmetry property of $C$. Remarkably, $C$ is system-dependent Marini Bettolo Marconi et al. (2008), while (7) holds quite in general.
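The symmetry (7) can be verified explicitly in a special case that is not discussed in the text and is used here only as an illustration: a homogeneous ring in which every state has the same forward rate `a` and backward rate `b`. Under this assumption the numbers of forward and backward jumps in a time $t$ are independent Poisson variables, and a trajectory with net displacement $n$ has $W_t = (n/t)\log(a/b)$.

```python
import math


def log_poisson(k, mean):
    """Logarithm of the Poisson probability of k events with the given mean."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)


def net_jump_probability(n, a, b, t, kmax=400):
    """P(forward jumps - backward jumps = n) for Poisson jump counts.

    k runs over the number of backward jumps; the sum is truncated at kmax.
    """
    total = 0.0
    for k in range(max(0, -n), kmax):
        total += math.exp(log_poisson(k + n, a * t) + log_poisson(k, b * t))
    return total


a, b, t, n = 2.0, 1.0, 5.0, 3
w = n * math.log(a / b) / t                       # value of W_t for net displacement n
log_ratio = math.log(net_jump_probability(n, a, b, t) /
                     net_jump_probability(-n, a, b, t))
# With C_t(w) = -(1/t) log f(w), the difference C_t(w) - C_t(-w) equals -log_ratio / t.
fr_left_hand_side = -log_ratio / t
```

In this simple case `fr_left_hand_side` coincides with `-w` at any finite time, while for generic rates the relation is recovered only asymptotically, as stated above.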

In the following, we address the question of the dependence of on the CGL which, in the decimation procedure of Pigolotti and Vulpiani (2008), is parametrized by the threshold time . The protocol of Ref.Pigolotti and Vulpiani (2008), eliminates all states with average exit time , and requires the surviving states to have re-normalized transition rates . Denote by and by the pdf of the entropy production and its Cramer function, for the decimated process with threshold time .

The present investigation suggests the conjecture that the entropy production does not depend appreciably on the precise properties of fast and slow variables of the Markov process: it only depends on the currents flowing in the system.

### II.2 Results on 1d and 2d regular lattices

Let us begin focusing on continuous time random walks on simple topologies, i.e. on regular lattices with periodic boundary conditions, with random transition rates restricted to nearest neighbours. To simplify the procedure, we require every state , , to have characteristic (exit) time , belonging to a set such that . This condition corresponds to a separation of time-scales which represents a mild requirement for the decimation protocol of Pigolotti and Vulpiani (2008) to apply. Transition rates may then be chosen to have, or not to have, a preferential direction, in order to allow, or to prevent, a positive entropy production. For instance, entropy production can be positive in lattices, only if some rates obey (cf. Appendix C, for a more precise condition, based on the notion of affinities).

Simulations for regular lattices show a striking robustness of the entropy production Cramer function with respect to decimation. The numerically computed Cramer function for 1D chains is plotted in Figure 2 for different CGL. The figure shows that even when the decimation leaves only the slowest states, the fluctuations of entropy production remain substantially the same. The result does not seem to depend on the details of the transition rates, but only on the separation of time-scales.

In the following sections, we show that the entropy production is not directly related to the properties of fast and slow states per se either, while it seems reasonable to conjecture that it is closely related to the currents flowing in the system. These are global quantities, rather than local ones, which depend on the topology and on the interplay among all transition rates, an idea that may be understood in simple terms, as follows. In the case of a random walk on a ring (a 1-dimensional lattice with periodic boundary conditions), one may write

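$$\frac{W_{\omega_0 \to \omega_1} W_{\omega_1 \to \omega_2} \cdots W_{\omega_{n-1} \to \omega_n}}{W_{\omega_1 \to \omega_0} W_{\omega_2 \to \omega_1} \cdots W_{\omega_n \to \omega_{n-1}}} = \left( \frac{W_{forw}}{W_{back}} \right)^m \times R \tag{8}$$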

where , , is an integer and is a correction term. In presence of a mean current , one has where is the current computed in the time window (i.e. ) and indicates the integer part of . This leads to the relation

$$W_t \approx G(t) \log \frac{W_{forw}}{W_{back}} + O(1/t). \tag{9}$$

Since the decimation protocol eliminates fast states and modifies transition rates so as to leave the currents connecting the surviving states basically unaltered, the fluctuations of entropy production should not be appreciably affected by the decimation procedure. Observe, however, that this current conservation is only approximate, not exact. The modification of the current produced by the decimation can indeed be estimated analytically in simple cases, as in systems whose states form a single loop. In Appendix B, we argue that the correction should be generally small and related to the ratio of the times spent in the fast states and in the slow ones.

Figure 2: Numerically computed Cramer function for the entropy production distribution in 1D continuous time random walks with periodic boundary conditions (p.b.c.). In each plot a comparison among different coarse-graining levels is shown. In both panels, each state of the non-decimated system may take one out of three possible average exit times: 1 (10% of states), 0.1 (20% of states) and 0.01 (70% of states). In the left panel, the probability of jumping to the right is 0.4 (60% of states) or 0.6 (40% of states). In the right panel, the probability of jumping to the right is 0.4. In the left frame we have $N=100$ and $t=2\cdot 10^3$. In the right frame we have $N=300$ and $t=10^3$. The numerical computation has been performed with the Gillespie algorithm Gillespie (1977), where the actual probability of a transition is the product of the transition rate by the characteristic time.
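The relation between entropy production and current on a ring, Eq. (9), can be explored with a small Gillespie-type simulation. The sketch below is not the code used for Figure 2: the rates, the ring size and the identification of the current with the number of windings per unit time are assumptions made for illustration.

```python
import math
import random


def gillespie_ring(right, left, t_max, rng):
    """Continuous time random walk on a ring.

    right[i], left[i]: rates of the jumps i -> i+1 and i -> i-1 (mod N).
    Returns (W_t, windings per unit time).
    """
    N = len(right)
    state, time, log_ratio, net_steps = 0, 0.0, 0.0, 0
    while True:
        exit_rate = right[state] + left[state]
        time += rng.expovariate(exit_rate)        # exponential waiting time
        if time > t_max:
            break
        if rng.random() < right[state] / exit_rate:
            nxt = (state + 1) % N
            log_ratio += math.log(right[state] / left[nxt])
            net_steps += 1
        else:
            nxt = (state - 1) % N
            log_ratio += math.log(left[state] / right[nxt])
            net_steps -= 1
        state = nxt
    return log_ratio / t_max, net_steps / (N * t_max)


# Hypothetical five-state ring with well separated exit times.
right = [6.0, 0.4, 60.0, 0.6, 4.0]
left = [4.0, 0.6, 40.0, 0.4, 6.0]
affinity = sum(math.log(r / l) for r, l in zip(right, left))  # log(W_forw / W_back)
W_t, G_t = gillespie_ring(right, left, t_max=200.0, rng=random.Random(0))
remainder = W_t - G_t * affinity                  # the O(1/t) term of Eq. (9)
```

The remainder comes only from the incomplete turn at the end of the trajectory, so it is bounded by a constant divided by `t_max`, whatever the individual exit times are.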

Similar results are reported in Figure 3 for 2D regular square lattices, where jumps occur among nearest neighbours: even with this topology, the fluctuations of entropy production appear not to be affected by the CGL, although the result is not as robust as in the 1D case. Indeed, a substantial change in entropy production can be observed if the system contains a very large number of fast states to be decimated. Nevertheless, it is still interesting to realize that the Cramer function has not changed substantially, even after a sizeable fraction of the original system has been decimated. Note that the square lattice topology is drastically altered by decimation: states which have not been decimated remain connected by chains of transitions, but the system is neither planar nor a regular lattice anymore. Currents in this case may still be defined within a more general graph theory Schnakenberg (1976), such as that discussed in Appendix C.

Figure 3: Approximated Cramer function for the entropy production distribution in continuous time random walks on a 2D square lattice (nearest neighbours) with p.b.c. In each plot a comparison among different coarse-graining levels is shown. In both cases the probabilities of jumping to one of the four nearest neighbours are biased to give a net current in one direction. The left and the right panels differ by the values of the exit times. Left: the states have exit times 1 (50% of states), 0.1 (20% of states) and 0.01 (30% of states). Right: the states have exit times 1 (70% of states), 0.1 (20% of states) and 0.01 (10% of states). In both cases $N=100$ and $t=1000$.

### II.3 Results on graphs with fast and slow loops

Guided by the conjecture that the fundamental ingredient for entropy production is the current flowing in a circuit, we construct Markov processes composed of independent loops joined by a single interchange state. The general structure of this graph is illustrated in Fig. 4. The main slow loop is decorated by fast loops (first level), which are in turn decorated by faster loops (second level), etc. After decimation, one may encounter different situations:

1. the new and the old structures have the same topology, i.e. only pieces of loops have been suppressed but the number and position of loops is the same;

2. all loops of the faster (outer) level are suppressed;

3. all loops of the two fastest levels are suppressed;

4. and so on;

Loops at the same level have similar properties and, in particular, are chosen to have, or not to have, a positive entropy production, i.e. to have or not to have a preferential direction in their transition rates.

Figure 4: Sketch of a graph made of three levels of nested loops: in this example, the main level loop has no preferential direction, while the second and third levels are made of smaller loops with faster states and preferential directions.

The computed Cramer function for the entropy production of the case of Figure 4 is reported in Figures 5 and 6. In Figure 5 the states of the fast loop have slightly different characteristic times, allowing a progressive decimation of the fast loop. At a decimation threshold such that the fast loop is still alive, even if made of only three states (blue curve), we are in situation 1 and the Cramer function of the entropy production is very close to that of the non-decimated system (black curve). A further increase of the decimation threshold makes the fast loop disappear: it is left with only two states and the system falls in situation 2, where the Cramer function of the entropy production has a sudden macroscopic change (red curve). The inset of Figure 5 shows the mean entropy production $\langle W_t \rangle$ as a function of the decimated percentage of the fast loop: neglecting a very weak growth, $\langle W_t \rangle$ appears practically constant until the fast loop is reduced to a two-state branch. If the main loop is configured to have a non-zero current, this is what remains at that point; otherwise the fluctuations of $W_t$ are reduced to a very narrow and symmetric peak around zero.

Figure 6 shows other cases with three levels of loops. The Cramer function of the entropy production always changes when a level of current-carrying loops is entirely removed.

Figure 5: Approximate Cramer function for the entropy production distribution in continuous time random walks on a graph similar to that of Figure 4, with two hierarchical levels. A comparison among different coarse-graining levels is shown. The main loop is made of 100 states with average exit time 1 and preferential direction given by balanced (left) or unbalanced (right) transition rates. In the case of unbalanced rates, they are 0.6 toward left and 0.4 toward right. The second level loops have 30 states and a bias in the transition probabilities (0.8 vs. 0.2) chosen to give a preferential direction, while their characteristic times range from 0.1 to 0.7. In all simulations $t=1000$.

Figure 6: Approximate Cramer function for the entropy production distribution in continuous time random walks on the topologies of Figure 4, with three hierarchical levels of nested loops and $t=10^4$. In each plot a comparison among different coarse-grained levels is shown. The main loop is made of 100 states with average exit time 1 and no preferred direction. The second level loops (each with 10 states) have average exit time 0.1 and a bias in the transition probability chosen to give a preferential direction. The third level loops (each with 5 states) have average exit time 0.01. The difference between the left and right frames is in the transition rates of this third level. Left: third level loops have a preferential direction. Right: third level loops without preferential direction.

### II.4 A model from molecular biology: coarse graining of the Kinesin’s network

The examples in the previous sections suggest that the entropy production is weakly affected by the coarse graining procedure, apart from the cases in which loops contributing significantly to the entropy are destroyed and the entropy production undergoes an abrupt decrease. A natural question is whether this phenomenology is a peculiarity of the model introduced here, or similar behaviors pertain to other realistic models of non-equilibrium systems. Biochemical reactions are often characterized by non-equilibrium processes acting over different timescales and thus afford an ideal benchmark for the ideas proposed here. In this subsection, we study the effect of coarse graining on a recent network model of the kinesin motor cycle Liepelt and Lipowski (2007).

Kinesins are a common category of motor proteins Howard (2001) that are used for transport on microtubules in eukaryotic cells. Like many other non-equilibrium reactions inside cells, kinesin is powered by ATP. A number of experiments during the last decades elucidated many structural and dynamical details of this system. In particular, it is now understood that kinesins are made of two identical heads, that walk on microtubules with a “hand-over-hand” mechanism Yildiz et al. (2004), alternating their position at the front.

The model proposed in Liepelt and Lipowski (2007) describes both the ATP-driven chemical reactions and the mechanical step in which the two heads swap. The multiple cycle structure of the reaction is given by this chemomechanical nature and by considering the fact that ATP may be burned by both heads. The scheme of the reaction and the possible transitions are illustrated in Fig. 7.

Figure 7: Scheme of the transition network of the kinesin model Liepelt and Lipowski (2007). a) Reaction network. The states, numbered from 1 to 6, are characterized by the two heads bound to ATP (A), ADP (D) or free. The molecules bound (or released) during the chemical transitions are shown as connected to the arrows. The dashed arrow represents the mechanical transition, in which kinesin makes its step on the microtubule. b) Kinesin network after coarse graining the fastest states, first state 2, then state 5. Dot-dashed arrows represent the new transitions appearing as a consequence of the coarse-graining procedure.

Each of the two heads may be in three different configurations: free, bound to ATP and bound to ADP, resulting in $3 \times 3 = 9$ possible states. However, the motor is believed to work “out of phase”, i.e. states in which the two heads are in the same configuration are unlikely to be observed. This reduces the model to the $6$ states represented in the scheme of Fig. 7. Of all the possible transitions among the states, only those which are consistent with experimental observations are considered in the model and shown in the diagram. Clearly, the assumptions above (in particular that of considering only $6$ states) already imply some level of coarse graining with respect to the complete problem. However, the effect of these assumptions on the entropy production is hard to determine, since it would be difficult to construct a more detailed model from the available experimental results. We then take the model of Fig. 7 as our starting point, and study the effect of decimating the states of the system.
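The counting of states can be made explicit by enumeration. The configuration labels below are illustrative strings, and the mapping of the resulting pairs onto the state numbers of Fig. 7 is not attempted here.

```python
from itertools import product

# Each head is free, bound to ATP, or bound to ADP.
configurations = ("free", "ATP", "ADP")

# All ordered pairs (head 1, head 2).
all_states = list(product(configurations, repeat=2))

# "Out of phase" assumption: the two heads are never in the same configuration.
out_of_phase = [s for s in all_states if s[0] != s[1]]

n_all, n_kept = len(all_states), len(out_of_phase)
```

Dropping the three pairs in which both heads share the same configuration is itself a first coarse-graining step applied before any decimation.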

The transition rates of the model depend on the ADP, ATP and P concentration and on the load force of the molecular motor. Moreover, the parameters determining these rates have a slight dependence on the kind of the experiment one wants to reproduce, since different experiments work in different conditions and may use different kinds of proteins in the kinesin family. We determined the rates by choosing the parameters fitting the experiment of Ref.Carter and Cross (2005) and assumed fixed concentrations of (micromoles) for simplicity. We then consider two different cases: one without work load and one with a work force equal to (piconewton). Details on the derivation of the rates and numerical values are given in Appendix (D).

In both cases (with and without load), state 2 is the fastest and state 5 is the second fastest. We then compare the entropy production of the complete model, of the model in which state 2 has been adiabatically eliminated, and of the model in which both states 2 and 5 have been eliminated. The pdf of the finite-time-averaged entropy production $W_t$, obtained from an ensemble of trajectories of fixed length, is shown in Figure 8, without load in the left panel, with load in the right panel. Decimation of state 2 leaves the pdf unchanged. In the case without load, further decimation of state 5 abruptly changes the pdf into one sharply peaked close to zero.

Figure 8: Pdf’s of the entropy production per unit of time. Black curve is for the original model with 6 states. Red is after decimation of state 2. Green is after decimation of states 2 and 5. Left panel: the case without load. Right panel: the case with load.

This model reproduces the same scenario of the “fast loop-slow loop” model of the previous section. The first coarse graining strongly alters the structure of the network, but its effect on the entropy production and its fluctuations is barely noticeable. Conversely, decimating one more state drastically reduces the entropy production. Notice also that the most irreversible transition in the original model is the “mechanical” transition between state and , since for the load-free case and for the loaded case. In the sense specified in the next section, the information about the irreversibility of this transition is lost, when the level of coarse graining is too large.

## III Tentative theory

Consider a continuous time Markov process: each state can be seen as a vertex of a graph, and transitions with a positive rate correspond to edges (also called links) between the two states involved. As in Lebowitz and Spohn (1999), we assume that a transition has a positive rate whenever its inverse transition does. As illustrated in Section II.2, the entropy production of random walks on 1D rings is closely related to the current flowing in the ring. In this Section, we attempt to generalize this observation to generic graphs Burioni and Cassi (2005). The main tool for this purpose is a decomposition in fundamental cycles, which is illustrated in Appendix C. In the example of Figure 4, the fundamental cycles are nothing but the loops.

Let us introduce a different functional $Q_t$:

$$Q_t = \sum_{\alpha} A(\vec{C}_\alpha) G_\alpha(t) \tag{10}$$

which depends only on a few “structural properties” of the process: the fundamental cycles $\vec{C}_\alpha$ (i.e. a property of the graph), their affinities $A(\vec{C}_\alpha)$ and their fluctuating currents $G_\alpha(t)$, averaged over a time interval $t$, which depend also on the transition rates.

Figure 9: Equivalence of the Cramer function for $W_t$ and $Q_t$ at $t=10^4$. Left: random walk on a 2D lattice with same parameters as in Figure 3 (right). Right: random walk on a nested loop graph with same parameters as that in Figure 6 (right).

In various numerical simulations, we have verified that the fluctuations of $W_t$ and those of $Q_t$ are practically indistinguishable at large times: cf. Figure 9 for two examples. It is also known that for large times Andrieux and Gaspard (2007).
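To make the ingredients of Eq. (10) concrete, the sketch below computes the affinity of an oriented cycle in the sense of Schnakenberg (logarithm of the product of the rates along the cycle divided by the product of the rates along the reversed cycle) and combines affinities and measured currents into $Q_t$. The data layout `W[a][b]`, the function names and the numerical values are assumptions chosen for illustration.

```python
import math


def cycle_affinity(cycle, W):
    """Affinity of an oriented cycle given as a list of states (closed implicitly)."""
    closed = list(cycle) + [cycle[0]]
    return sum(math.log(W[a][b] / W[b][a]) for a, b in zip(closed, closed[1:]))


def Q_t(affinities, currents):
    """Eq. (10): sum over fundamental cycles of affinity times fluctuating current."""
    return sum(A * G for A, G in zip(affinities, currents))


# Hypothetical three-state ring: rate 2 clockwise, rate 1 counter-clockwise.
W = {0: {1: 2.0, 2: 1.0}, 1: {2: 2.0, 0: 1.0}, 2: {0: 2.0, 1: 1.0}}
A_ring = cycle_affinity([0, 1, 2], W)

# A made-up measured current of 0.5 turns per unit time around the single cycle.
q_value = Q_t([A_ring], [0.5])
```

Reversing the orientation of a cycle changes the sign of both its affinity and its current, so each term of the sum is independent of the orientation chosen.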

To obtain a complete theory, it remains to show that the “structural properties” are not affected by the decimation; this task can be subdivided in three steps:

1. examine the fate of fundamental cycles after decimation of one fast state; there are three possibilities: i. cycles may be destroyed, ii. transformed into different ones, iii. new cycles may be created;

2. derive the corresponding variation of affinities;

3. obtain the values of currents after decimation.

Concerning task , it is easy to realize that affinities do not change in transformed cycles, while new cycles have zero affinity, so they do not contribute to the entropy production. Disappearing cycles pose, instead, a difficult question: numerical simulations show that they are usually small and that the affinity lost with their removal is equally small. At the moment, however, we do not have an analytical estimate of this quantity. Task 3 is a hard problem too: the stationary value of currents must satisfy many coupled Kirchhoff equations and depends on the properties of the whole graph. Numerical simulations suggest that average currents are not drastically influenced by our decimation procedure. One rough explanation of this fact can be given for a system with small entropy production, obtained from a perturbation of an equilibrium system . Indeed, one may assume a linear relation between the affinities and the average currents of , of the form , with coefficients determined only by the properties of . If the decimation procedure, which replaces with a new system , leaves substantially unaltered the invariant probability of the surviving states, it is reasonable to assume that the decimated system is another small perturbation of . Then, the linear relation between affinities and currents of retains the same coefficients , leading to the conclusion that the currents are conserved under decimation, if affinities are.

We now discuss the consequences of decimation on cycles and their affinities. A maximal (also called “spanning”) tree is found on the original graph. This tree includes all vertices (states) and only a part of the original edges: $N-1$ of them, for a connected graph with $N$ vertices. All pairs of vertices are connected by a unique path on this tree. All edges left out from the tree (as many as the edges of the graph minus $N-1$) are called “chords”. A chord connecting two vertices, joined to the unique path connecting the same two vertices along the tree, forms a closed loop. All loops generated in this way constitute the set of fundamental loops, which become “cycles” when orientation is taken into account. These fundamental cycles determine the statistics of $Q_t$ and therefore of $W_t$, cf. Appendix C for the details. In the cases discussed below, the removal of a vertex using our decimation procedure preserves almost exactly the fundamental cycles and their affinities; the small variations observed are due to the possible reduction of whole $3$-loops to $2$-loops (i.e. simple links), corresponding to a total loss of the affinity of the original $3$-loop. The impact of this unfortunate event is difficult to estimate, because it depends on the topology of the graph: the removal of a vertex may lead to the collapse of a number of loops smaller than or equal to the degree of the removed vertex. The amount of lost affinity for each reduced loop is expected to be small, since it is associated with a small loop, and correspondingly small should be the loss in current and in entropy production. It is remarkable that the exit times of states do not affect the affinities, although they can affect the currents.
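The construction of fundamental loops from a spanning tree and its chords can be sketched as follows. A breadth-first search is used to build the tree; this choice, the function names and the example graph (two triangles sharing one vertex) are illustrative assumptions, since any spanning tree would serve.

```python
from collections import deque


def spanning_tree_and_chords(n_vertices, edges):
    """BFS spanning tree of a connected undirected graph; leftover edges are chords."""
    adjacency = {v: [] for v in range(n_vertices)}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    parent = {0: None}                      # vertex 0 is taken as the root
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    tree = {frozenset((v, p)) for v, p in parent.items() if p is not None}
    chords = [e for e in edges if frozenset(e) not in tree]
    return parent, chords


def fundamental_loop(chord, parent):
    """Vertices of the loop closed by a chord (u, v) and the tree path from u to v."""
    def path_to_root(x):
        path = [x]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        return path

    up, vp = path_to_root(chord[0]), path_to_root(chord[1])
    # Drop the common ancestors above the point where the two paths meet.
    while len(up) > 1 and len(vp) > 1 and up[-2] == vp[-2]:
        up.pop()
        vp.pop()
    return up + vp[-2::-1]


# Hypothetical graph: two triangles joined by the interchange vertex 0.
edges = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 4), (4, 0)]
parent, chords = spanning_tree_and_chords(5, edges)
loops = [fundamental_loop(c, parent) for c in chords]
```

For this example the number of chords equals the number of edges minus the number of tree edges, and each chord closes exactly one of the two triangles.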

Nevertheless, decimation may affect the large loops as well: a progressive and repeated removal of vertices may eventually reduce a large loop to a -loop. Unfortunately, controlling these events goes beyond our mathematical ability; the size of the error in the conservation of fundamental cycles under decimation therefore remains an open question.

In the figures, all black objects (vertices and links) are related to the original graph, red objects are the new ones formed after decimation. Solid links are part of the maximal tree, dashed links are chords. When the state labelled by is removed, it is linked to some other states collectively denoted as : -states (linked to ) are in number of . These links are broken and all pairs of states and (previously connected to ) are connected among each other with a new transition rate (or, if the link already exists, its transition rate is updated adding that amount). We consider the simplest case where one chord at most is involved in the decimation procedure. With this assumption, three possibilities can be encountered:

Figure 10: An example of state-removal where no chords belong to the subgraph involving the removed state. Black objects (vertices and links) are relative to the original graph, red objects are formed by the decimation protocol. Solid links are part of the maximal tree, dashed links are chords.
1. new loops with no entropy production:

The simplest case is realized when no chords connect any to and no chords connect any to any , cf. Fig. 10 for one example. In this case, all original links (, , and in Fig. 10) are on the spanning tree and no links join any to any . After the links and the central state have been removed, the red links are created (a, b, c, d, e, f in the example): they are in a number . A number (for any ) are new chords (a, d and e in the example), while the remaining are links of the new spanning tree (b, c and f). Therefore new loops have been created (in the example they are with chord a, with chord e and with chord d). It is immediate to verify that the affinity of the new loops vanishes: for instance the loop has forward transition rate given by and backward transition rate given by and they exactly cancel out (the exit rates are omitted, but they cancel out trivially): these new loops do not contribute to the entropy production.

Figure 11: An example of state-removal where a chord is removed by decimation. Black objects (vertices and links) concern the original graph, before decimation, red objects are the new ones formed after decimation. Solid links are part of the maximal tree, dashed links are chords.
2. loop-shortening:

Another possibility (see Fig. 11) is that some link is a chord in the original graph, which means that it is not in the spanning tree: e.g. link A in the example, with . Then, state is connected to 0 through some other unique path on the tree, possibly passing through a state (the unique path on the tree is also represented in the figure, terminating with the link ), forming a loop . In this case, the decimation of state creates the link (link “a” in the example) as a chord of the loop , which is two steps shorter than the original loop. It is immediate to see that the affinity of the new loop is the same as that of the old one. In this case, all new links starting from , or from , and ending in another , must be chords, since and are joined by a unique path on the tree which has not been touched by decimation: the number of new chords is larger than in the previous case, but all their loops have zero affinities.

There is also the possibility that the loop passing through chord is simply given by , i.e. that it is a -loop, originally belonging to the graph: this case can be put in the last category, simply exchanging the roles of the chords A and a. We call this case loop-crunching.

Figure 12: An example of state-removal where a chord connects two states which are directly linked to the removed state. Black objects (vertices and links) pertain to the original graph, before decimation, red objects are formed by decimation. Solid links are part of the maximal tree, dashed links are chords.
3. loop-crunching:

The last possibility is that some link already existed in the original graph, which means that it is a chord of the loop , cf. Fig. 12, where chord a connects states 4 and 3. In this case the removal of state leads us to crunch the -loop, making it a simple link with a new transition rate. The original loop and its contribution to entropy production are then lost.

We stress that a mathematical proof of the above considerations is still lacking, although our arguments are strongly supported by numerical results.
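The three cases can be checked numerically on toy graphs. The following is a minimal sketch, not the authors' code: the helper names `decimate` and `affinity`, the graphs and all rate values are hypothetical, and `decimate` simply applies the renormalization of eq. (11) in Appendix A to a dictionary of rates.

```python
import math

def decimate(W, n):
    """Remove state n from the rate table W (W[i][j] = rate i -> j) via eq. (11)."""
    w0 = sum(W[n].values())  # exit rate of the removed state
    new = {i: {j: r for j, r in out.items() if j != n}
           for i, out in W.items() if i != n}
    for i in new:
        if n not in W[i]:
            continue
        for j, w_out in W[n].items():
            if j != i:  # a jump i -> n -> i leaves the state unchanged
                new[i][j] = new[i].get(j, 0.0) + W[i][n] * w_out / w0
    return new

def affinity(W, cycle):
    """Log-ratio of forward to backward rate products along a closed path."""
    steps = zip(cycle, cycle[1:] + cycle[:1])
    return sum(math.log(W[a][b] / W[b][a]) for a, b in steps)

# Loop shortening: state 0 sits on the ring 0-1-2-3-0 (hypothetical rates).
ring = {0: {1: 3.0, 3: 1.0}, 1: {0: 2.0, 2: 5.0},
        2: {1: 1.0, 3: 4.0}, 3: {2: 2.0, 0: 6.0}}
a_before = affinity(ring, [0, 1, 2, 3])
a_after = affinity(decimate(ring, 0), [1, 2, 3])

# New loop: state 0 is the centre of a star with leaves 1, 2, 3.
star = {0: {1: 3.0, 2: 1.0, 3: 2.0}, 1: {0: 4.0}, 2: {0: 5.0}, 3: {0: 0.5}}
a_new = affinity(decimate(star, 0), [1, 2, 3])

# Loop crunching: removing 0 from a triangle leaves the single link 1-2.
triangle = {0: {1: 3.0, 2: 1.0}, 1: {0: 1.0, 2: 2.0}, 2: {0: 4.0, 1: 1.0}}
a_lost = affinity(triangle, [0, 1, 2])
crunched = decimate(triangle, 0)

print(a_before, a_after, a_new, a_lost, crunched)
```

In this sketch the shortened loop keeps the affinity of the original ring, the loop created around the removed star centre has zero affinity up to rounding, and the triangle collapses to one link, so its affinity `a_lost` has no counterpart in the reduced graph.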

## IV Concluding remarks and open problems

In this paper we support, both numerically and theoretically, the idea that fluctuations of the entropy production are essentially insensitive to a coarse-graining based on decimation of fast states, provided that decimation does not remove fundamental loops carrying net currents. The coarse-graining threshold that triggers the removal of such loops is not fully understood, but our investigation suggests that entropy production fluctuations are generally quite robust with respect to decimation. Moreover, this robustness does not appear to be directly related to the characteristic times of the removed states. Robustness or fragility of loops appears to be mostly related to the global structure of the network at hand.

We applied this analysis to the network model of the biomolecule known as kinesin, discovering that no entropy production is lost if a coarse-graining from six to five states is performed. This observation is potentially interesting in biophysics, since entropy production is a fundamental property of irreversible chemical reactions, such as those fueling the kinesin motor protein. On the contrary, the decimation of the model from six to four states is catastrophic, making the model unsuitable to produce work.

More detailed studies are necessary to quantify the entropy production variations induced by coarse graining: the main missing ingredient is the evaluation of the effect of decimation on the currents of the surviving loops. This will lead to a better understanding of the role and meaning of as a definition of entropy production. In particular, understanding the relation between and macroscopic observable properties of the system may help in modelling non-equilibrium systems.

###### Acknowledgements.
A. P. acknowledges the support of the “Granular-Chaos” project, funded by the Italian MIUR under the FIRB-IDEAS grant number RBID08Z9JE. S. P. wishes to thank M. Mueller for suggesting the kinesin example. L. R. acknowledges the contribution of the European Research Council within the 7th Framework Programme (FP7) of the European Community (EC), ERC Grant Agreement n. 202680. The EC is not responsible for any use that might be made of the data appearing herein.

## Appendix A The decimation procedure

In this Appendix, we summarize the coarse graining method introduced in Pigolotti and Vulpiani (2008). Consider a master equation of the form of eq. (1). Due to the Markovian nature of the process, the time spent in a generic state is exponentially distributed with average . One may wish to decimate all states having an average permanence time smaller than a prescribed threshold . To do that, Ref. Pigolotti and Vulpiani (2008) sets to 0 the time spent in these states. In this way, the fast states disappear from the description and transitions to them are redirected to other states with proper statistical weights. In formulae, if a state $i$ is linked to a state $j$ via a fast state $n$ that must be eliminated, the transition rate from $i$ to $j$ is renormalized to yield the rate:

$$\tilde{W}_{i\to j} = W_{i\to j} + \frac{W_{i\to n}\,W_{n\to j}}{W^0_n} \tag{11}$$

If , the decimation creates a direct connection between the surviving states, which is reminiscent of the states that disappeared from the model under consideration.

This procedure corresponds to an adiabatic approximation and is commutative, if the prescription of Pigolotti and Vulpiani (2008) is followed. Once the set of states to be decimated is determined by the threshold, they can be decimated in any order without affecting the final result, as long as the set itself is not modified during the decimation procedure. It may happen, indeed, that the permanence time of some of the states selected for decimation becomes larger than , while other states are decimated. The recipe of Pigolotti and Vulpiani (2008) requires that this state be eventually decimated nonetheless.
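The renormalization of eq. (11) and the order-independence discussed above can be illustrated with a short sketch. It is not the reference implementation of Pigolotti and Vulpiani (2008): the function name `decimate`, the state labels and the rate values are hypothetical, and the exit rate of each removed state is recomputed from the current rate table at every step, which is an assumption of this example.

```python
def decimate(W, n):
    """Remove state n from the rate table W (W[i][j] = rate i -> j) via eq. (11)."""
    w0 = sum(W[n].values())  # exit rate W^0_n of the removed state
    new = {i: {j: r for j, r in out.items() if j != n}
           for i, out in W.items() if i != n}
    for i in new:
        if n not in W[i]:
            continue
        for j, w_out in W[n].items():
            if j != i:  # a jump i -> n -> i leaves the state unchanged
                new[i][j] = new[i].get(j, 0.0) + W[i][n] * w_out / w0
    return new

# Hypothetical 4-state network; "n" and "m" play the role of fast states.
rates = {
    "a": {"n": 2.0, "m": 1.0},
    "n": {"a": 1.0, "b": 3.0, "m": 2.0},
    "m": {"a": 0.5, "n": 1.5, "b": 1.0},
    "b": {"n": 5.0, "m": 4.0},
}

n_then_m = decimate(decimate(rates, "n"), "m")
m_then_n = decimate(decimate(rates, "m"), "n")
print(n_then_m)
print(m_then_n)
```

For these rates both orders give the same two-state process, with rates 13/6 from `a` to `b` and 38/15 from `b` to `a` up to rounding. Storing the rates as nested dictionaries keeps missing links implicit, so a new link is created only when eq. (11) produces one.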

## Appendix B Effect of decimation on the current in a single loop

In this Appendix, we investigate the effect of decimation on the current of a single loop consisting of $N$ states. For convenience, let us rewrite the master equation:

$$\frac{d}{dt}P_n(t) = W_{n-1\to n}P_{n-1} + W_{n+1\to n}P_{n+1} - P_n\left(W_{n\to n+1} + W_{n\to n-1}\right),$$

with , as:

$$\frac{d}{dt}P_n(t) = J_n - J_{n+1} \tag{12}$$

where the local current is given by:

$$J_n = P_{n-1}W_{n-1\to n} - P_n W_{n\to n-1}. \tag{13}$$

In a stationary state, the current is site-independent and one may write $J_n = J$. In particular, detailed balance and equilibrium hold if $J = 0$. The set of equations , together with the normalization condition $\sum_{n=1}^{N} P_n = 1$, can be solved for both $J$ and the invariant measure $P_n$. For instance, let us proceed iteratively, as follows:

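$$\begin{aligned}
P_n &= P_{n-1}\frac{W_{n-1\to n}}{W_{n\to n-1}} - \frac{J}{W_{n\to n-1}} \\
&= P_{n-2}\frac{W_{n-2\to n-1}\,W_{n-1\to n}}{W_{n\to n-1}\,W_{n-1\to n-2}} - J\left(\frac{1}{W_{n\to n-1}} + \frac{1}{W_{n-1\to n-2}}\frac{W_{n-1\to n}}{W_{n\to n-1}}\right) = \dots \\
&= P_n\prod_{k=1}^{N}\frac{W_{k-1\to k}}{W_{k\to k-1}} - J\left(\sum_{j=0}^{N-1}\frac{1}{W_{n-j\to n-j-1}}\prod_{k=0}^{j}\frac{W_{n-j+k-1\to n-j+k}}{W_{n-j+k\to n-j+k-1}}\right).
\end{aligned} \tag{14}$$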

We obtain from the last expression

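$$P_n = \frac{-J\left(\displaystyle\sum_{j=0}^{N-1}\frac{1}{W_{n-j\to n-j-1}}\prod_{k=0}^{j}\frac{W_{n-j+k-1\to n-j+k}}{W_{n-j+k\to n-j+k-1}}\right)}{1 - \displaystyle\prod_{k=1}^{N}\frac{W_{k-1\to k}}{W_{k\to k-1}}} \tag{15}$$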

and by means of the normalization condition $\sum_{n=1}^{N} P_n = 1$, we reach the following closed expression for $J$:

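$$J = \frac{\displaystyle\prod_{k=1}^{N}\frac{W_{k-1\to k}}{W_{k\to k-1}} - 1}{\displaystyle\sum_{n=1}^{N}\sum_{j=0}^{N-1}\frac{1}{W_{n-j\to n-j-1}}\prod_{k=0}^{j}\frac{W_{n-j+k-1\to n-j+k}}{W_{n-j+k\to n-j+k-1}}}. \tag{16}$$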

Let us now decimate one fast state, say $n^*$, and consider the current. It is easy to show that the numerator of (16) is not affected by the decimation protocol defined by eq. (11). Conversely, the denominator decreases by an amount which can be expressed as follows:

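$$\Delta D = D_o - D_d = \frac{1}{W_{n^*\to n^*-1}} + \sum_{j=0}^{N-1}\frac{1}{W_{n^*+1\to n^*}}\prod_{k=0}^{j}\frac{W_{n^*+k\to n^*+k+1}}{W_{n^*+k+1\to n^*+k}} \tag{17}$$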

where $D_o$ and $D_d$ are the denominators in (16) for the original and the decimated system, respectively. As $\Delta D$ is positive, the current in the decimated system is larger than in the original one, the difference being

$$\Delta J = J_d - J = J\,\frac{\Delta D}{D_o - \Delta D}. \tag{18}$$

This allows us to check what happens in simple cases. For instance, eq. (17) leads to , if all states have the same left and right jump rates (the two must be different to have a non-trivial current). If the rate of the decimated state is much faster than the others, eq. (17) also shows that the correction decreases linearly with the separation of time scales, i.e. with the ratio of the average rates of the fast states and that of the other states. This is consistent with the picture of the current correction being essentially due to a rescaling of the times related to the elimination of the fast state. In other words, the magnitude of the correction seems to be always related to the ratio of the time spent in the fast state(s) and the time spent in the slow ones.
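The increase of the loop current under decimation can be checked exactly on a small ring. The sketch below is only an illustration under stated assumptions: the helper names (`ring`, `decimate`, `stationary`, `current`), the choice of a 4-state ring and the rates 2 (clockwise) and 1 (anticlockwise) are all hypothetical, and the stationary state is obtained by plain Gauss-Jordan elimination with exact fractions rather than from eq. (16).

```python
from fractions import Fraction

def ring(N, right, left):
    """Ring of N >= 3 states with uniform clockwise and anticlockwise rates."""
    return {k: {(k + 1) % N: Fraction(right), (k - 1) % N: Fraction(left)}
            for k in range(N)}

def decimate(W, n):
    """Remove state n from the rate table W via eq. (11)."""
    w0 = sum(W[n].values())
    new = {i: {j: r for j, r in out.items() if j != n}
           for i, out in W.items() if i != n}
    for i in new:
        if n not in W[i]:
            continue
        for j, w_out in W[n].items():
            if j != i:
                new[i][j] = new[i].get(j, 0) + W[i][n] * w_out / w0
    return new

def stationary(W):
    """Exact stationary probabilities of the master equation with rates W."""
    states = sorted(W)
    n = len(states)
    idx = {s: k for k, s in enumerate(states)}
    # Row for state t: inflow minus outflow = 0; last column is the right-hand side.
    A = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for s, out in W.items():
        for t, r in out.items():
            A[idx[t]][idx[s]] += Fraction(r)
            A[idx[s]][idx[s]] -= Fraction(r)
    A[n - 1] = [Fraction(1)] * n + [Fraction(1)]  # normalization replaces one row
    for c in range(n):  # Gauss-Jordan elimination
        p = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [x / pivot for x in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                f = A[r][c]
                A[r] = [x - f * y for x, y in zip(A[r], A[c])]
    return {s: A[idx[s]][n] for s in states}

def current(W, a, b):
    """Stationary net current on the link a -> b."""
    P = stationary(W)
    return P[a] * W[a][b] - P[b] * W[b][a]

original = ring(4, 2, 1)
reduced = decimate(original, 0)
J = current(original, 1, 2)
J_d = current(reduced, 1, 2)
print(J, J_d, J_d - J)
```

For these particular rates the current grows from 1/4 to 1/3 when one of the four states is removed, in line with the statement that the decimated system carries a larger current.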

## Appendix C Graphs and currents

We consider a Markovian (continuous time) process on states. The states are considered as nodes of a graph. The transitions between different states are considered as links (edges) between nodes.

### C.1 Fundamental cycles

Graph theory simplifies the classification of closed loops on a graph Schnakenberg (1976), identifying a set of fundamental “cycles”. Given a graph with vertices (nodes) and edges (links between nodes), the strategy, exemplified in Figure 13, is the following:

Figure 13: An example of a graph with 5 states and three fundamental loops: all transitions (links) have a given orientation. A possible maximal tree is the one made of only solid links, with the three dashed links representing the remaining chords, which identify three fundamental loops. Any other possible loop, e.g. $\vec{C} = 1 \to 2 \to 4 \to 5 \to 1$, can be decomposed into a sum of fundamental loops, e.g. $\vec{C} = \vec{C}_1 + \vec{C}_3$.
• identify a maximal tree , i.e. a set containing all vertices and part of the edges, which is connected and does not contain circuits. It is easy to show that has edges. Many maximal trees can be identified, but one suffices;

• given an arbitrary maximal tree , the edges of which do not belong to are called chords of , they are in a number ;

• if only one chord , , is added to , the new graph contains only one circuit, , obtained by removing all edges which are not part of the circuit; therefore, from a maximal tree, circuits can be generated adding the chords;

• the set of circuits obtained from the chords of a maximal graph is called a fundamental set of circuits, denoted by

• orientation of edges must be introduced: each edge is assumed to be oriented in an arbitrary direction, giving the oriented version of , denoted as ; then one can take a subgraph with oriented edges , which may have different orientations with respect to the original orientations of the edges of . The function is introduced for these cases: it returns if the edge is in and has the original orientation, if it is in and has opposite orientation, and if is not in .

• a cycle is an oriented circuit, e.g. ; a fundamental cycle is denoted by : for simplicity we always choose the orientation of a fundamental circuit to be parallel to the orientation of its chord , i.e. ;

• the scalar product among cycles is defined as

$$(\vec{C}, \vec{C}_\alpha) = S_\alpha(\vec{C})\,S_\alpha(\vec{C}_\alpha) \equiv S_\alpha(\vec{C}) \tag{19}$$

where is the chord which generates the circuit ; this scalar product can only take three values: , or .

• a decomposition of cycles is finally achieved: any cycle (oriented circuit) of the graph can be linearly decomposed using the fundamental set as a basis:

$$\vec{C} = \sum_{\alpha=1}^{\nu} (\vec{C}, \vec{C}_\alpha)\,\vec{C}_\alpha \tag{20}$$
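The construction listed above (maximal tree, chords, one fundamental cycle per chord) can be sketched in a few lines. This is a hedged illustration: the function name `fundamental_cycles`, the breadth-first choice of the maximal tree and the 5-vertex example graph are assumptions of this sketch, and the graph is not claimed to be the one of Figure 13. For a connected graph with $N$ vertices and $E$ edges any maximal tree has $N-1$ edges, so $E-N+1$ chords remain.

```python
from collections import deque

def fundamental_cycles(vertices, edges):
    """Return (tree, chords, cycles) for a connected undirected graph.

    Each cycle is a list of vertices, oriented parallel to its chord (a, b):
    it starts with a -> b and returns to a along the tree.
    """
    adj = {v: [] for v in vertices}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    root = vertices[0]
    parent = {root: None}
    queue = deque([root])
    while queue:  # breadth-first search builds one possible maximal tree
        v = queue.popleft()
        for u in adj[v]:
            if u not in parent:
                parent[u] = v
                queue.append(u)
    tree = {frozenset((v, p)) for v, p in parent.items() if p is not None}
    chords = [e for e in edges if frozenset(e) not in tree]

    def path_to_root(v):
        path = [v]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        return path

    cycles = []
    for a, b in chords:
        pa, pb = path_to_root(a), path_to_root(b)
        # Drop shared ancestors so both paths end at the lowest common one.
        while len(pa) > 1 and len(pb) > 1 and pa[-2] == pb[-2]:
            pa.pop()
            pb.pop()
        cycles.append([a] + pb[:-1] + pa[::-1][:-1])
    return tree, chords, cycles

vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1), (2, 4), (1, 3)]
tree, chords, cycles = fundamental_cycles(vertices, edges)
print(chords)
print(cycles)
```

A different maximal tree would give a different but equally valid fundamental set; the number of chords is the same for every choice.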

### C.2 Currents

The current for the transition is

$$J(\omega\to\omega', t) = P_\omega(t)\,W_{\omega\to\omega'} - P_{\omega'}(t)\,W_{\omega'\to\omega}. \tag{21}$$

The stationary state value is denoted by . The stationarity condition is equivalent to

$$\sum_{\omega'} J(\omega'\to\omega) = 0 \quad \forall\,\omega \tag{22}$$

which is known as Kirchhoff current law. If the transition corresponds to the oriented edge , its steady state current is also denoted as .

The current (or flux) on a fundamental circuit is defined as the steady state transition current flowing in the chord in the original direction and is denoted by . For instance if is the oriented edge corresponding to the transition , then and the flux of the associated cycle is equal to .

The Kirchhoff law for the steady state guarantees that a current on any edge is the sum of the currents going through the cycles which intersect the edge, i.e.

$$J_e = \sum_{\alpha=1}^{\nu} S_e(\vec{C}_\alpha)\,J_\alpha. \tag{23}$$

An edge of the graph can be oriented in a different direction with respect to the edges of the cycles, therefore the sign function is used.

The fluctuating instantaneous current depends on the particular realization of the Markov process; it is measured on a chord as:

$$j_\alpha(t) = \sum_{n=-\infty}^{+\infty} S_\alpha(e_n)\,\delta(t - t_n) \tag{24}$$

where is the time of the random transition (an oriented edge of the graph) during a trajectory of the stochastic process. In brief, is the instantaneous and oriented rate of the transitions in the chord , for a particular realization of the process. It is a stochastic variable. Its time-average (in a finite time ) is denoted by

$$G_\alpha(t) = \frac{1}{t}\int_0^t dt'\, j_\alpha(t'), \tag{25}$$

which is still a stochastic variable. Some properties of and have been studied in Andrieux and Gaspard (2007).
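As an illustration of eqs. (24) and (25), the time-averaged chord current can be measured along a single simulated trajectory by counting signed jumps across the chord. The sketch below uses a standard continuous-time (Gillespie-type) simulation; the function name `empirical_current`, the 3-state ring and its rates are hypothetical choices made for this example only.

```python
import random

def empirical_current(W, chord, t_max, start, seed=0):
    """Signed number of jumps across `chord` per unit time, cf. eqs. (24)-(25)."""
    rng = random.Random(seed)
    a, b = chord
    state, t, net = start, 0.0, 0
    while True:
        targets = list(W[state])
        rates = [W[state][s] for s in targets]
        t += rng.expovariate(sum(rates))  # exponential waiting time in `state`
        if t > t_max:
            break
        nxt = rng.choices(targets, weights=rates)[0]
        if (state, nxt) == (a, b):
            net += 1  # chord crossed along its orientation
        elif (state, nxt) == (b, a):
            net -= 1  # chord crossed against its orientation
        state = nxt
    return net / t_max

# Hypothetical 3-state ring with clockwise rate 2 and anticlockwise rate 1.
W = {0: {1: 2.0, 2: 1.0}, 1: {2: 2.0, 0: 1.0}, 2: {0: 2.0, 1: 1.0}}
t_max = 2000.0
g_chord = empirical_current(W, (2, 0), t_max, start=0, seed=1)
g_edge = empirical_current(W, (0, 1), t_max, start=0, seed=1)
print(g_chord, g_edge)
```

On a single ring the same trajectory crosses every link almost the same net number of times, so the two estimates differ by at most $1/t$; both fluctuate around the stationary current, which is 1/3 for these rates.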

### C.3 Affinities

The affinity of a transition is defined as

$$A(\omega\to\omega', t) = \ln\frac{P_\omega(t)\,W_{\omega\to\omega'}}{P_{\omega'}(t)\,W_{\omega'\to\omega}} \tag{26}$$

The affinity of a cycle is defined as , but it can also be defined as . where . The equivalence of these two forms is due to the fact that all cancel out, in a cycle. For this reason the affinity of a cycle does not depend upon time, but only on the transition rates, which come from the “external physical constraints”, e.g. mechanical, chemical and thermodynamical forces.

Thanks to the decomposition of cycles described above, one can linearly decompose the affinity of any cycle in terms of affinities of a “fundamental set of cycles” :

$$A(\vec{C}) = \sum_\alpha (\vec{C}, \vec{C}_\alpha)\,A(\vec{C}_\alpha) \tag{27}$$

where is the previously defined scalar product between cycles.

### C.4 Entropy production

Having defined the Gibbs entropy as

$$S(t) = -\sum_\omega P_\omega(t)\ln P_\omega(t), \tag{28}$$

its time derivative can be decomposed in two parts , where the bilinear form

$$\frac{d_i S}{dt} = \frac{1}{2}\sum_{\omega,\omega'} J(\omega\to\omega', t)\,A(\omega\to\omega', t) \ge 0 \tag{29}$$

is considered as the internal entropy production, and the rest is the entropy flux through the boundaries of the system of interest. In the steady state one has .

A definition of entropy production per trajectory is given by Lebowitz and Spohn Lebowitz and Spohn (1999), see Eq. (4). It depends on a particular realization, i.e. it is a stochastic variable. It can also be written as:

$$W_t = \frac{1}{t}\sum_e B(e)\int_0^t dt'\, j_e(t'). \tag{30}$$

Lebowitz and Spohn have noticed that

$$\lim_{t\to\infty}\langle W_t\rangle = \left.\frac{d_i S}{dt}\right|_{st} \tag{31}$$

in the stationary state. The following relation has instead been noticed in Ref. Andrieux and Gaspard (2007):

$$W_t = Q_t + R_t \tag{32}$$

with

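$$Q_t = \sum_\alpha A(\vec{C}_\alpha)\,G_\alpha(t) \tag{33}$$

$$R_t = \frac{1}{t}\sum_{e\neq\alpha} B(e)\left[\int_0^t dt'\left(j_e(t') - \sum_\alpha S_e(\vec{C}_\alpha)\,j_\alpha(t')\right)\right]. \tag{34}$$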

The term is the contribution due to the fundamental set of cycles. The “remainder” has zero average (thanks to the Kirchhoff law Eq. (23)). This implies that

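$$\lim_{t\to\infty}\langle W_t\rangle = \lim_{t\to\infty}\langle Q_t\rangle = \sum_\alpha A(\vec{C}_\alpha)\,J_\alpha \tag{35}$$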

since

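$$\lim_{t\to\infty}\langle G_\alpha(t)\rangle = \lim_{t\to\infty}\frac{1}{t}\left\langle\int_0^t j_\alpha(t')\,dt'\right\rangle = J_\alpha. \tag{36}$$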

Numerical comparison of the fluctuations of and those of shows that they have identical Cramer functions (see Figure 9), in many examples of continuous time Markov processes.
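At the level of averages, the steady-state entropy production of eq. (29) can be compared with the sum over fundamental cycles of affinity times chord current, which is the long-time mean of $Q_t$ according to eqs. (33) and (36). The following sketch does this on a hypothetical 4-state network; the rate values, the choice of maximal tree and the helper names are assumptions of the example, not taken from the paper.

```python
import math
from fractions import Fraction

def stationary(W):
    """Exact stationary probabilities of the master equation with rates W."""
    states = sorted(W)
    n = len(states)
    idx = {s: k for k, s in enumerate(states)}
    A = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for s, out in W.items():
        for t, r in out.items():
            A[idx[t]][idx[s]] += Fraction(r)
            A[idx[s]][idx[s]] -= Fraction(r)
    A[n - 1] = [Fraction(1)] * n + [Fraction(1)]  # normalization replaces one row
    for c in range(n):  # Gauss-Jordan elimination
        p = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [x / pivot for x in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                f = A[r][c]
                A[r] = [x - f * y for x, y in zip(A[r], A[c])]
    return {s: A[idx[s]][n] for s in states}

# Hypothetical network: ring 0-1-2-3-0 plus the diagonal link 0-2.
W = {0: {1: 3, 3: 1, 2: 2}, 1: {0: 1, 2: 4},
     2: {1: 1, 3: 2, 0: 1}, 3: {2: 1, 0: 5}}
P = {s: float(p) for s, p in stationary(W).items()}

def current(a, b):
    """Stationary net current on the link a -> b, eq. (21)."""
    return P[a] * W[a][b] - P[b] * W[b][a]

def cycle_affinity(cycle):
    """Affinity of an oriented cycle computed from the rates alone."""
    steps = zip(cycle, cycle[1:] + cycle[:1])
    return sum(math.log(W[a][b] / W[b][a]) for a, b in steps)

# Eq. (29) in the steady state, each undirected link counted once.
links = [(a, b) for a in W for b in W[a] if a < b]
ep_links = sum(current(a, b) * math.log(P[a] * W[a][b] / (P[b] * W[b][a]))
               for a, b in links)

# Maximal tree 0-1, 1-2, 2-3; chords 3->0 and 2->0 with their cycles.
fundamental = {(3, 0): [3, 0, 1, 2], (2, 0): [2, 0, 1]}
ep_cycles = sum(cycle_affinity(c) * current(*chord)
                for chord, c in fundamental.items())
print(ep_links, ep_cycles)
```

The two numbers agree up to rounding because the stationary probabilities cancel around every cycle and the link currents obey the Kirchhoff law of eq. (23).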

From eq. (21), detailed balance, with respect to the invariant measure, is equivalent to

$$J(\omega\to\omega') = 0 \quad \forall\,(\omega\to\omega'), \tag{37}$$

which implies that the probability of any trajectory is equal to the probability of its time reversal. Detailed balance also implies that the flux on any cycle vanishes, , and that affinities vanish on a single edge as well as on any cycle, e.g. . As an immediate consequence, the internal entropy production vanishes:

$$\left.\frac{d_i S}{dt}\right|_{st} = 0. \tag{38}$$

## Appendix D Parameters in the kinesin model

As sketched in Section II.4, the rates in the kinesin network model of Ref. Liepelt and Lipowski (2007) are adjusted to the parameters obtained by specific experiments. Moreover, they depend on the concentrations of the chemical species entering the reaction (ADP, ATP and P), as well as on the load force . More formally, one has:

$$W_{i\to j} = k_{ij}\, I_{ij}([X])\, \Phi_{ij}(F) \tag{39}$$

where the ’s are the experiment-specific parameters. The functions and express the dependence of the reaction rates on a generic chemical species and/or on the load force . If the transition from to does not involve chemical binding, we define .

Assuming diluted solutions, all reactions are diffusion-limited, so that we can assume . The ’s are adimensional functions, with the convention . Theoretical considerations lead to for the chemical transitions, i.e. all but those between states and . Mechanical transitions are parametrized by and . The ’s and are additional parameters obtained by experiments, while is the adimensional force ( being the average kinesin step length and the Boltzmann constant). With these choices, the ’s are dimensionally different depending on whether they multiply a concentration (dimensions of rate divided by concentration, ) or not (dimension of a rate, ).

The parameters we used in the simulations are derived from those reproducing the results of the experiment Carter and Cross (2005):

• The values of ’s describing the experiment Carter and Cross (2005) according to Liepelt and Lipowski (2007) are , , , , , , . The upper and lower cycle in Fig. 7 are assumed to have same parameters, apart from the transition from to , which is determined from theoretical considerations as .

• Typical concentrations in the experiment are . For simplicity, we assume all of them to be kept constant and equal to .

• The mechanical parameters reproducing the results of experiment Carter and Cross (2005) are: , , , . In all cases, we have .

In section II.4, we considered two instances of the model. The first one is without load, . In this case and with the assumptions above, it is easy to obtain the transition rates: all the ’s and concentrations are equal to , so from Eq. (39) we obtain : the rates are just the ’s listed above.

About the load case, the unit of the adimensional force is equal to . Experiments are performed with forces of the order of piconewton. We took a value : substituting this value in the expression for the ’s leads to the following values of the transition rates, which are those used in the simulations of the model with load: , , , , , , , .

## Appendix E List of the main symbols

• is the transition rate from to

• is the exit rate from state

• is the characteristic time of state

• is the probability of being in at time

• is the invariant probability of being in

• is the time threshold for decimation

• are the new transition rates in the decimated process

• is the Lebowitz-Spohn entropy production integrated on time and divided by

• is the Cramer function of the entropy production

• is the probability density of

• is the Cramer function in the decimated process with a time threshold .

• is an oriented cycle of the graph

• is the current on cycle averaged on a finite time

• is the affinity associated to the oriented cycle .

## References

• Ma (1985) S.-K. Ma, Statistical Mechanics (World Scientific, 1985).
• Kadanoff (2000) L. Kadanoff, Statistical Physics: Statics, Dynamics and Renormalization (World Scientific, 2000).
• Castiglione et al. (2008) P. Castiglione, M. Falcioni, A. Lesne, and A. Vulpiani, Chaos and coarse graining in Statistical Mechanics (Cambridge University Press, 2008).
• Bonaldi et al. (2009) M. Bonaldi, L. Conti, P. D. Gregorio, L. Rondoni, G. Vedovato, A. Vinante, M. Bignotto, M. Cerdonio, P. Falferi, N. Liguori, et al., Phys. Rev. Lett. 103, 010601 (2009).
• Gregorio et al. (2009) P. D. Gregorio, L. Rondoni, M. Bonaldi, and L. Conti, J. Stat. Mech. p. P10016 (2009).
• Evans et al. (1993) D. J. Evans, E. G. D. Cohen, and G. P. Morriss, Phys. Rev. Lett. 71, 2401 (1993).
• Gallavotti and Cohen (1995) G. Gallavotti and E. G. D. Cohen, J. Stat. Phys. 80, 931 (1995).
• Jarzynski (1997) C. Jarzynski, Phys. Rev. Lett. 78, 2690 (1997).
• Lebowitz and Spohn (1999) J. L. Lebowitz and H. Spohn, J. Stat. Phys. 95, 333 (1999).
• Rahav and Jarzynski (2007) S. Rahav and C. Jarzynski, J. Stat. Mech. p. P09012 (2007).
• Pigolotti and Vulpiani (2008) S. Pigolotti and A. Vulpiani, J. Chem. Phys. 128, 154114 (2008).
• de Groot and Mazur (1984) S. R. de Groot and P. Mazur, Non-equilibrium thermodynamics (Dover Publications, New York, 1984).
• Marini Bettolo Marconi et al. (2008) U. Marini Bettolo Marconi, A. Puglisi, L. Rondoni, and A. Vulpiani, Phys. Rep. 461, 111 (2008).
• Gillespie (1977) D. T. Gillespie, J. Phys. Chem. 81, 2340 (1977).
• Schnakenberg (1976) J. Schnakenberg, Rev. Mod. Phys. 48, 571 (1976).
• Liepelt and Lipowski (2007) S. Liepelt and R. Lipowski, Phys. Rev. Lett. 98, 258102 (2007).
• Howard (2001) J. Howard, Mechanics of Motor Proteins and the Cytoskeleton (Sinauer, 2001).
• Yildiz et al. (2004) A. Yildiz, M. Tomishige, R. D. Vale, and P. R. Selvin, Science 303, 676 (2004).
• Carter and Cross (2005) N. J. Carter and R. A. Cross, Nature (London) 435, 308 (2005).
• Burioni and Cassi (2005) R. Burioni and D. Cassi, J. Phys. A 38, R45 (2005).
• Andrieux and Gaspard (2007) D. Andrieux and P. Gaspard, J. Stat. Phys. 127, 107 (2007).