Nonequilibrium and information: the role of cross-correlations
Abstract
We discuss the relevance of the information contained in cross-correlations among different degrees of freedom, which is crucial in nonequilibrium systems. In particular we consider a stochastic system where two degrees of freedom, each in contact with a different thermostat, are coupled together. The production of entropy and the violation of the equilibrium fluctuation-dissipation theorem (FDT) are both related to the cross-correlation between the two variables. Information about such cross-correlation may be lost when single-variable reduced models are considered. Two different procedures are typically applied: (a) one totally ignores the coupling with the second variable; (b) one models the effect of the second variable as an average memory effect, obtaining a generalized Langevin equation. In case (a) discrepancies between the system and the model appear both in the entropy production and in the linear response; the latter can be exploited to define effective temperatures, but these are meaningful only when the timescales are well separated. In case (b) the linear response of the model reproduces well that of the system; however, the loss of information is reflected in a loss of entropy production. When only linear forces are present, such a reduction is dramatic and makes the average entropy production vanish, posing problems in interpreting FDT violations.
1 Introduction
Energy and information are well known to be related: the conceptual Maxwell's demon experiment is a popular representation of this empirical fact. Extracting energy from a cold reservoir toward a hot one requires a device able to discern fast molecules from slow ones, i.e. it requires the processing of information. Proofs in simplified models and overwhelming experimental evidence lead to the conclusion that this information costs, in energy, more than what is gained in the extraction [1, 2, 3]. This issue has recently received renewed interest, both theoretical [4, 5] and experimental [6], in the context of small systems and nonequilibrium thermodynamics.
What is important in this debate is a proper evaluation of the information needed to perform the physical process under examination. This in turn amounts to having a good model of the system: in particular it is crucial that the model correctly reproduces the information fluxes involved in its dynamics. This issue is delicate, since modelling implies some level of coarse-graining and, as a consequence, a loss of information [7, 8, 9].
Remaining in the framework, mentioned above, of energy flowing between two different, and disconnected, reservoirs, an interesting example is provided by the information associated with the energy flux, which is measured in the following way. We consider two probes, e.g. two colloids or big molecules, which are coupled by a linear spring. Each probe is in contact with one of the two reservoirs, so that the spring between the probes also connects the thermostats. The interaction between a probe and its own thermostat, described in detail in the text below, is characterized by a typical time which is in principle different for each probe [10, 11, 12]. At equilibrium (identical thermostats) such typical times are not relevant, but they become important in the more general nonequilibrium case.
In such a system it is straightforward to compute the so-called entropy production rate, which is a measure of how fast information is created in the ensemble of pairs of probes or, equivalently, of how fast this ensemble would relax to equilibrium if allowed to. The relaxation to equilibrium is forbidden by some (undetailed) external constraint which prevents the two thermostats from equilibrating and which continuously dissipates the information so far created, allowing the system to achieve a nonequilibrium steady state [13] (see also [14, 15]). Of course, if the reservoirs have the same temperature, this entropy production rate vanishes and the steady state satisfies detailed balance [16].
In this paper we discuss the effect of modelling the system by removing one of the two probes from the description. Two cases are interesting: (a) one simply ignores the existence of the coupling with the second degree of freedom; (b) one keeps some information about such a coupling but, for the purpose of making things simpler, replaces it with proper memory terms and an effective colored noise, resulting in a generalized Langevin equation [17] in which the fluctuation-dissipation relation of the second kind is not satisfied. In case (a), one expects equilibrium; therefore even simple observations (for instance linear response) do not agree with expectations, and the departure from such agreement may be used to define nonequilibrium effective temperatures [18, 19]. However, such a procedure is really meaningful only in the presence of a strong separation of timescales, otherwise unphysical effective temperatures appear [11]. In case (b) the statistics of the dynamics of the remaining probe is properly reproduced, including the linear response. However, the measured rate of information creation (entropy production) is underestimated. This discrepancy becomes dramatic in the case of linear couplings: in that case the entropy production completely vanishes, suggesting, wrongly, that the system is at equilibrium.
The plan of the paper is the following. In Section 2 we present the system with two variables, justifying it from a physical point of view, and we offer a review of its statistical dynamics properties. The system is taken fully linear from the beginning but, as also detailed in the Appendixes, most of the results are more general. The importance of the cross-correlation between the two variables, and the effect of removing (in different ways) the second one from the description of the system, are discussed in Section 3. Finally, in Section 4 we put our results in a more general perspective, discussing the role of channels for the transport of energy and information and how they depend on the chosen level of description.
The Appendixes contain not only lengthy calculations accompanying the main results of the paper, but also deeper insights into the problem: Appendix A also discusses a partially nonlinear case, as well as formulations in (time) Fourier space; Appendix B discusses the case of the same system with inertia, such that one of the degrees of freedom has different parity under time-reversal; Appendix C explains the subtle conditions necessary to reduce the system with two variables to the model with one variable and memory; finally, in Appendix D we offer an explicit example where the entropy production in the full description (two Markovian variables) has an additional contribution, with respect to the reduced description (one variable with memory), which carries crucial information about the difference of temperature between the two thermostats.
2 A system with two temperatures
Most of the ideas in this paper are illustrated by using a simple stochastic nonequilibrium system with two coupled degrees of freedom. The purpose of this section is to describe it and recall the main known properties of its dynamics. Our system is described by two coupled Langevin equations:
(1) 
where the noises are uncorrelated and white, with zero mean and unit variance.
The above stochastic equations can be thought of as modelling the system portrayed in Fig. 1. The system includes two particles (for simplicity in one dimension), with positions and momenta whose Hamiltonian is given by
(2) 
Each particle moves in a dilute fluid which exerts a viscous drag and which is coupled to a thermostat at a given temperature; a natural way of modelling the dynamics of this system is the following:
(3) 
Now, by taking the overdamped limit we get:
(4) 
which corresponds to model (2) with the identifications
(5) 
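Since the display equations are not rendered in this text, the following sketch simulates one assumed concrete realization of the overdamped pair of Fig. 1: two harmonically confined particles coupled by a spring, each attached to its own bath. All parameter names and values (K1, K2, K, G1, G2, the temperatures) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Assumed concrete form of the overdamped pair (illustrative, not from the paper):
#   gamma1 dx1/dt = -(K1 + K) x1 + K x2 + sqrt(2 gamma1 T1) xi1
#   gamma2 dx2/dt = -(K2 + K) x2 + K x1 + sqrt(2 gamma2 T2) xi2
K1, K2, K, G1, G2 = 1.0, 1.0, 0.5, 1.0, 1.0

def simulate(T1, T2, dt=0.01, steps=6000, n=4000, seed=0):
    """Euler-Maruyama integration of an ensemble of n independent pairs."""
    rng = np.random.default_rng(seed)
    x1 = np.zeros(n)
    x2 = np.zeros(n)
    s1 = np.sqrt(2.0 * T1 * dt / G1)
    s2 = np.sqrt(2.0 * T2 * dt / G2)
    for _ in range(steps):
        f1 = (-(K1 + K) * x1 + K * x2) / G1
        f2 = (-(K2 + K) * x2 + K * x1) / G2
        x1 = x1 + f1 * dt + s1 * rng.standard_normal(n)
        x2 = x2 + f2 * dt + s2 * rng.standard_normal(n)
    return x1, x2

# Equal temperatures: the stationary covariance must be the Gibbs one, T K^-1.
T = 1.0
x1, x2 = simulate(T, T)
C_sim = np.cov(np.vstack([x1, x2]))
K_mat = np.array([[K1 + K, -K], [-K, K2 + K]])   # stiffness matrix
C_eq = T * np.linalg.inv(K_mat)
```

At equal temperatures the sampled covariance reproduces the Gibbs prediction; rerunning with two different temperatures yields a stationary covariance that no longer has the Gibbs form.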
2.1 Steady state properties
System (2), in a more compact form, reads
(6) 
where the variable and the noise are 2-dimensional vectors and the drift is a real matrix, in general not symmetric; the noise is a Gaussian process with covariance matrix:
(7) 
and
(8) 
In order to reach a steady state, the real parts of the eigenvalues of the drift matrix must be positive. The extension to a generic dimension and to non-diagonal matrices (which however must remain symmetric) is straightforward.
The steady state is characterized by a bivariate Gaussian distribution [20]:
(9) 
where the prefactor is a normalization coefficient and the covariance matrix satisfies
(10) 
Solving this equation gives
(11) 
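Eq. (10) is a Lyapunov equation for the covariance matrix. As the matrices themselves are not rendered here, the sketch below assumes the same illustrative drift as in the previous snippet and solves the equation by vectorization with Kronecker products; all parameter values are assumptions.

```python
import numpy as np

def stationary_covariance(A, D):
    """Solve A S + S A^T = 2 D for S, using row-major vectorization:
    (kron(A, I) + kron(I, A)) vec(S) = vec(2 D)."""
    n = A.shape[0]
    I = np.eye(n)
    M = np.kron(A, I) + np.kron(I, A)
    return np.linalg.solve(M, (2.0 * D).reshape(-1)).reshape(n, n)

# Illustrative overdamped pair (gamma1 = gamma2 = 1, assumed stiffnesses)
K_mat = np.array([[1.5, -0.5], [-0.5, 1.5]])
A = K_mat.copy()

# Equal temperatures: the covariance reduces to the Gibbs form T K^-1.
S_eq = stationary_covariance(A, np.diag([1.0, 1.0]))

# Unequal temperatures: sigma stays symmetric, but A sigma does not,
# which is precisely the signature of broken detailed balance.
S_neq = stationary_covariance(A, np.diag([2.0, 1.0]))
asym = A @ S_neq - (A @ S_neq).T
```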
Moreover, for this system it is also possible to calculate the path probabilities. The probability of a trajectory in phase space can be written in the following form:
(12) 
where the integral runs over all possible realizations of the noise with the corresponding weight. By introducing auxiliary variables, via the integral representation of the delta function, one obtains [21]:
(13) 
where a compact notation for the exponent has been introduced. In the following we will also use the Onsager-Machlup expression for the path probabilities, which is obtained by integrating expression (13) over the hat variables [22]
(14) 
Expression (14) has the advantage of not requiring the presence of auxiliary fields.
Equilibrium is defined as the regime where paths and their time-reversals have the same probability, i.e.
(15) 
where the time-reversed phase point enters, and the stationary distribution defined in (9) represents the probability of the initial condition. It is easy to verify that this condition leads to
(16)  
(17) 
where we have defined, for positive time lags, the time-delayed cross-correlation and the parity (even or odd) under time-reversal of each variable. Considering that the matrix of time-delayed correlations satisfies [20]
(18) 
by evaluating the above conditions at vanishing time lag, it is seen that the equilibrium definition (16) leads to two important conditions: (i) the first holds automatically when the variables have equal parity, because the equal-time covariance matrix is symmetric by construction; (ii) the second yields the so-called Onsager reciprocal relations for the Onsager matrix (indeed, at equilibrium, it relates currents to thermodynamic forces).
Note that Eq. (10) can also be written in terms of the symmetrized product where, for a generic matrix, we define its symmetrized part in the standard way. Therefore, if all variables have the same parity, the equilibrium condition stated above reads as a symmetry condition. This happens, for instance, for overdamped Langevin equations, such as the one considered here, with physical interpretation (4).
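The equilibrium condition just derived (symmetry of the time-delayed correlation matrix, i.e. the Onsager reciprocal relations) can be checked directly on the assumed linear model of the earlier snippets: exp(-At) sigma is symmetric at equal temperatures and loses this symmetry when the baths differ. The matrix exponential below uses a plain eigendecomposition (assumed diagonalizable drift); all parameters are illustrative.

```python
import numpy as np

def expm(M):
    """Matrix exponential via eigendecomposition (assumes M diagonalizable)."""
    w, V = np.linalg.eig(M)
    return ((V * np.exp(w)) @ np.linalg.inv(V)).real

def stationary_covariance(A, D):
    n = A.shape[0]
    M = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
    return np.linalg.solve(M, (2.0 * D).reshape(-1)).reshape(n, n)

A = np.array([[1.5, -0.5], [-0.5, 1.5]])   # illustrative drift
t = 0.7

# Equal temperatures: C(t) = exp(-A t) sigma is symmetric (reciprocal relations)
S_eq = stationary_covariance(A, np.diag([1.0, 1.0]))
C_eq_t = expm(-A * t) @ S_eq

# Unequal temperatures: <x1(0) x2(t)> != <x2(0) x1(t)>, a nonequilibrium signature
S_neq = stationary_covariance(A, np.diag([2.0, 1.0]))
C_neq_t = expm(-A * t) @ S_neq
```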
2.2 Response analysis
Thanks to the linearity of equations (2), the response properties of the system can be easily calculated:
(19) 
where the response matrix has been defined.
Moreover, a direct comparison between Eqs. (18) and (19) gives:
(20) 
where the matrix entering here is the inverse of the covariance matrix (11).
Differentiating equation (19) with respect to time and substituting into (20), one has
(21) 
Note that this is a particular case of a generalized response equation, also called the Generalized Fluctuation-Dissipation Relation (GFDR) [23, 24, 25, 26, 11, 27]. For instance, within the physical interpretation given in (5), the Onsager matrix reads
(22) 
where and, as usual . Note that is diagonal if or .
For a correct comparison with the standard literature, one must slightly change the definition of response used up to this point. Let us suppose we perturb the Hamiltonian (2) with an additional term. From the equations of motion (2) one has
(23) 
With such a mapping, one may write the linear response formula (21) for the degree of freedom as
(24) 
where the relative weight of the two contributions on the right-hand side depends on the timescale of observation.
When the temperatures coincide or the coupling vanishes, the cross term vanishes and one recovers the equilibrium condition (equivalent to the reciprocal relations for overdamped variables), together with the well-known equilibrium fluctuation-dissipation relation.
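The splitting (24) can be illustrated numerically on the assumed linear model of the previous snippets: at equal temperatures the response of the observed variable is exactly minus the time derivative of its autocorrelation divided by the temperature (equilibrium FDT), while with two different baths the cross term produces a measurable gap. Matrix forms and parameters are assumptions, consistent with the earlier sketches.

```python
import numpy as np

def expm(M):
    w, V = np.linalg.eig(M)
    return ((V * np.exp(w)) @ np.linalg.inv(V)).real

def stationary_covariance(A, D):
    n = A.shape[0]
    M = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
    return np.linalg.solve(M, (2.0 * D).reshape(-1)).reshape(n, n)

A = np.array([[1.5, -0.5], [-0.5, 1.5]])   # illustrative drift (gamma = 1)
t = 0.5

def fdt_gap(T1, T2):
    """|R_xx(t) + (1/T1) dC_xx/dt|: zero iff the equilibrium FDT holds."""
    S = stationary_covariance(A, np.diag([T1, T2]))
    E = expm(-A * t)
    R_xx = E[0, 0]                 # impulse response of x1 to a kick on x1
    dC = -(A @ E @ S)[0, 0]        # dC_xx/dt = -[A exp(-A t) sigma]_xx
    return abs(R_xx + dC / T1)

gap_eq = fdt_gap(1.0, 1.0)    # equilibrium: FDT satisfied exactly
gap_neq = fdt_gap(2.0, 1.0)   # two temperatures: the cross term breaks FDT
```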
2.3 Entropy production
For this system it is easy to compute the entropy production functional of a single trajectory. Let us consider a general trajectory and its time-reversal. Lebowitz and Spohn defined the fluctuating entropy production functional as follows [28]:
(25) 
with
(26) 
where the stationary distribution is the bivariate Gaussian with covariance given by Eq. (11), and the trajectory probability is the one introduced in equation (14). Lebowitz and Spohn have shown that the average (over the steady ensemble) of the functional, if detailed balance is not satisfied, increases with time, while the remaining term, known as the "border term", is usually negligible for large times, unless particular conditions of "singularity" occur [29, 30, 31].
For simplicity of notation, let us introduce a shorthand for the trajectory. In order to write down an explicit expression, it is necessary to establish the behavior of the variables under time reversal (e.g. positions are even and velocities are odd under the time-inversion transformation). Let us assume that under time reversal each variable picks up a definite sign. Then one can define
(27)  
(28) 
Given this notation [20] it is possible to write down a compact form for the entropy production
Formula (29) is valid also in the presence of nonlinear terms and with several variables.
From now on, in order to carry on the calculations, it is necessary to decide the parity of the variables under the time-reversal transformation. We will discuss explicitly the overdamped dynamics case (4), in which the two variables, being positions, are both even under time reversal. Overdamped cases are usually simpler because some of the terms vanish. The non-overdamped case is discussed in Appendix B and has the same technical level, with the difference that the velocity variable is odd under time-reversal. The exact expression of the entropy production also includes border terms, which are not extensive in time. We do not include those terms in the calculations, since we are interested in the asymptotic expression.
Using (29), the entropy production is calculated to be
(30) 
Note that some of the terms above are not extensive in time. Therefore, for large times, the entropy production (30) can be recast as
(31) 
It is possible to calculate the mean value of the entropy production rate (the large-time limit is understood)
(32)  
Equation (32) can be closed by substituting the equations of motion (2) and the values of the static correlations (11), obtaining
(33) 
Applied to the physical interpretation (5), the formula gives:
(34) 
It is immediate to recognize from formula (34) that the mean rate is always non-negative, as expected. Moreover, it is zero at equilibrium and in other, more trivial, cases, namely when the dynamical coupling term goes to zero. It can also approach zero in the limit of timescale separation, but we will return to this point in Section 3.1.
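The positivity of (34) and its vanishing at equilibrium can be checked by direct simulation of the assumed overdamped pair: the entropy production rate is estimated as the Stratonovich heat flux through the coupling spring times the difference of inverse bath temperatures. Parameters, as before, are illustrative assumptions.

```python
import numpy as np

K1, K2, K, G1, G2 = 1.0, 1.0, 0.5, 1.0, 1.0

def heat_flux(T1, T2, dt=0.01, steps=8000, burn=3000, n=4000, seed=1):
    """Mean Stratonovich power delivered by the spring to particle 2,
    i.e. the heat current flowing from bath 1 to bath 2."""
    rng = np.random.default_rng(seed)
    x1 = np.zeros(n)
    x2 = np.zeros(n)
    s1 = np.sqrt(2.0 * T1 * dt / G1)
    s2 = np.sqrt(2.0 * T2 * dt / G2)
    q = 0.0
    for step in range(steps):
        f1 = (-(K1 + K) * x1 + K * x2) / G1
        f2 = (-(K2 + K) * x2 + K * x1) / G2
        n1 = x1 + f1 * dt + s1 * rng.standard_normal(n)
        n2 = x2 + f2 * dt + s2 * rng.standard_normal(n)
        if step >= burn:
            # Stratonovich (midpoint) discretization of K (x1 - x2) dx2
            q += np.mean(K * (0.5 * (x1 + n1) - 0.5 * (x2 + n2)) * (n2 - x2))
        x1, x2 = n1, n2
    return q / ((steps - burn) * dt)

J = heat_flux(2.0, 1.0)                  # hot bath 1, cold bath 2
ep_rate = J * (1.0 / 1.0 - 1.0 / 2.0)    # J (1/T2 - 1/T1) >= 0
J_eq = heat_flux(1.0, 1.0)               # equal temperatures: no net current
```

For these assumed parameter values the exact stationary current can also be computed from the covariances and is close to 0.083; at equal temperatures the measured current is compatible with zero.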
3 Outofequilibrium information and crosscorrelations
In order to predict the response of an equilibrium system it is sufficient to know its autocorrelation, as stated by the fluctuation-dissipation theorem. In a broad sense, autocorrelation and response have the same information content. On the contrary, we have shown that the cross-correlations between different degrees of freedom play a crucial role in the nonequilibrium response. The same is true for the average entropy production rate: it is zero at equilibrium because the cross-correlation between the two variables vanishes.
In experiments or numerical simulations, however, if only one variable is observed, one is tempted to describe it by some effective stochastic process which relegates the role of the other degrees of freedom to some kind of noise. The crudest way of doing this is to neglect any time-delayed coupling with the other variables: of course such a model is, in the absence of other external forces, necessarily an equilibrium model and cannot agree with the observations; nevertheless, the comparison with the equilibrium expectation can, in some cases, lead to interesting interpretations. In the following we review the case of effective temperatures, which are deduced by forcing a comparison between the nonequilibrium and the equilibrium response-correlation relation and which, in the case of extreme timescale separation, carry useful information about the two nonequilibrium thermostats. After that, we also discuss a more informed way of modelling the system, which takes into account the time-delayed effects of the other degrees of freedom in terms of memory and colored noise. The predictions of such a model are much closer to observations, but we show that crucial pieces of the puzzle are still missing.
3.1 Comparison with a single variable, equilibrium, model
Extending what is certainly true at equilibrium, one may insist on comparing response and correlation, by defining [18, 19]
(35) 
where the two quantities involved are different observables of the system. The use of two times allows one to include cases where time-translational invariance is not satisfied and observables depend in a nontrivial way on the waiting time (for instance in aging systems). Equation (35) represents an attempt to generalize the temperature to systems out of equilibrium, where ergodicity is broken. The validity of a thermodynamic interpretation of this quantity is clear in some limits, namely for well-separated timescales [18, 32, 10, 33].
At first sight, equation (35) appears to be in sharp contrast with the "cross-correlation" description given in Section 2.2, mainly because only the perturbed variable is involved [11]. Nevertheless, in some cases even a partial view of the correlation-response plot is meaningful, in particular in the case of timescale separation. For instance, in the physical interpretation (4), the model reveals an interesting and nontrivial interplay of timescales. For simplicity, let us consider equal drag coefficients. A typical time for the first variable, corresponding to its relaxation time when decoupled from the second, can be defined, and analogously a characteristic time for the second variable. An interesting limit is the following:
where the additional second condition guarantees that the interactions have the same order of magnitude, so that the limit is nontrivial and the system remains genuinely out of equilibrium. In this case it can be shown that the two timescales correspond to those obtained by inverting the two eigenvalues of the drift matrix. Most importantly, only in this limit does the analysis of integrated response versus correlation produce a two-slope curve, where the two temperatures are recognized as the inverses of the measured slopes. However, this is a limiting case, and more general conditions can be considered.
In particular, we consider the time-integrated response and its two contributions appearing in the splitting formula (24), such that
(36)  
(37) 
case  

a  2  0.6  200  1  200  1  1  1  400  0.5 
b  5  0.2  20  40  30  20  2  2/3  47.3  12.7 
Our choices of parameters are summarized in Table 1: a case (a) where the timescales are mixed, and a case (b) where the scales are well separated. Of course we do not intend to exhaust all the possibilities of this rich model (treated in more detail in [11]), but to offer a few examples which may shed light on the role of cross-correlations in the linear response.
The parametric plots for the cases of Table 1 are shown in Figure 2, top frames. In the same figure, bottom frames, we present the corresponding contributions as functions of time. We briefly discuss the two cases:

If the timescales are not separated, the general form of the parametric plot, see Fig. 2b, is a curve. In fact, as shown in Fig. 2d, the cross term is relevant on all timescales. The slopes at the extremes of the parametric plot, which can be hard to measure in an experiment, are reached at early times (high values of the correlation) and at large times (low values of the correlation). Apart from that, the main information provided by the parametric plot is the relevance of the coupling of the observed variable with the "hidden" one.
Note also that, if the relative coupling is changed, the information on the two temperatures may disappear from the plot [11]. In summary, the correct formula for the response is always the GFDR. However, the definition of an effective temperature through such a relation can be useful in those limits which are relevant for glassy systems [19], where the additional term is negligible in certain ranges of timescales.
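The parametric analysis of this section can be reproduced deterministically for the assumed linear model: the integrated response is plotted against the correlation decay, and the slope at short times equals the inverse temperature of the bath attached to the observed variable. All matrices below are assumptions consistent with the earlier snippets.

```python
import numpy as np

def expm(M):
    w, V = np.linalg.eig(M)
    return ((V * np.exp(w)) @ np.linalg.inv(V)).real

def stationary_covariance(A, D):
    n = A.shape[0]
    M = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
    return np.linalg.solve(M, (2.0 * D).reshape(-1)).reshape(n, n)

A = np.array([[1.5, -0.5], [-0.5, 1.5]])   # illustrative drift (gamma = 1)
T1, T2 = 2.0, 1.0
S = stationary_covariance(A, np.diag([T1, T2]))
Ainv = np.linalg.inv(A)

def chi(t):
    """Integrated response of x1 to a force on x1: int_0^t [exp(-A s)]_11 ds."""
    return (Ainv @ (np.eye(2) - expm(-A * t)))[0, 0]

def dC(t):
    """Correlation decay C_xx(0) - C_xx(t)."""
    return (S - expm(-A * t) @ S)[0, 0]

# Early-time slope of the parametric plot chi versus C(0) - C(t): equals 1/T1
t1, t2 = 0.001, 0.002
early_slope = (chi(t2) - chi(t1)) / (dC(t2) - dC(t1))
```

Sweeping the time over several decades traces the full parametric curve; with well-separated timescales it develops a second, distinct slope related to the second bath.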
3.2 Comparing with a singlevariable nonequilibrium model with memory
Another classical approach to reducing the description of a many-body system, e.g. to focus on a (possibly slow) single degree of freedom without losing the information of the reciprocal feedback between the original variables, is to use a non-Markovian description, with memory and colored noise. In order to fix ideas, let us consider again the linear model (2). By formally integrating the second equation one has
(38) 
Inserting (38) into the first equation, a closed equation for the remaining variable is obtained:
(39) 
with
(40) 
It is worth noting that, with this mapping, the detailed balance condition of the Markovian description is "translated" into
(41) 
which is the fluctuation-dissipation relation of the second kind, derived by Kubo for generalized Langevin equations [17].
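The exponential memory kernel and its associated colored noise (Eqs. (39)-(40)) can be generated with an auxiliary Ornstein-Uhlenbeck process. The sketch below (assumed exponential kernel, illustrative amplitude and correlation time) verifies that the generated noise autocorrelation has the exponential form that the second-kind relation (41) requires it to match.

```python
import numpy as np

tau = 1.0          # assumed memory/correlation time
amp = 0.7          # assumed noise amplitude
dt = 0.01
n_steps = 400_000

# Exact update for an OU process z: dz = -(z / tau) dt + amp dW.
# Stationary autocorrelation: <z(t) z(t+s)> = (amp^2 tau / 2) exp(-|s| / tau)
rng = np.random.default_rng(2)
a = np.exp(-dt / tau)
var_inf = amp * amp * tau / 2.0
kicks = rng.standard_normal(n_steps) * np.sqrt(var_inf * (1.0 - a * a))
z = np.empty(n_steps)
z[0] = rng.standard_normal() * np.sqrt(var_inf)   # start in the stationary state
for i in range(1, n_steps):
    z[i] = a * z[i - 1] + kicks[i]

var_est = np.mean(z * z)
lag = int(tau / dt)                               # one memory time
corr_ratio = np.mean(z[:-lag] * z[lag:]) / var_est   # should be close to e^-1
```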
This mapping appears to be a harmless mathematical trick, and one is tempted to consider the original system and the reduced model as equivalent. Actually, it hides a loss of relevant information, detected for instance by the entropy production, as we discuss in the following.
The previous section shows that, if one takes the point of view of a single variable, a different interpretation of the "violations" of the fluctuation-response theorem can be given with respect to the two-variable case. These interpretations are not in contrast with each other: in both cases the vanishing of the cross term is the "equilibrium fingerprint" which guarantees the FDT. The scenario is different, however, if one compares the entropy production of the non-Markovian system with that found in Section 2.3.
The average entropy production for this non-Markovian model (originally described in [12]) is better studied in frequency space, and can be approached for a more general model. This is done in detail in Appendix A, while here we mention the main results. We start by considering the following one-dimensional Langevin equation
(42) 
where the noise is Gaussian, with zero mean and correlation
(43) 
In this model one can calculate the path probability and that of the reversed path. The mean value of the Lebowitz-Spohn functional, ignoring all contributions non-extensive in time, reads
(44) 
where the average is performed over the space of trajectories, and the functions appearing here are Fourier transforms (see Appendix A.2 for the details of the calculation).
From equation (44) it is easy to see that in the linear case one has:
(45) 
Remarkably, it predicts a vanishing entropy production also in the case of the linear model, in sharp contrast with what was found in (31), or in (125) for the case of underdamped dynamics.
From this result it emerges that the two approaches represent the same physical situation but with different levels of detail: moreover, the choice of the level of description does not affect most of the observables; for instance, correlations and responses of the main variable are unaffected, leading to the same FDT analysis for both models. In order to pinpoint the reason for this difference, let us consider the model with exponential memory (39), which we rewrite here in a lightened notation for clarity:
(46) 
The path probability of this process, starting from a given position at the initial time, can be expressed in the following form (see Appendix C for details)
(47) 
where we have used a simplified notation and introduced the needed definitions; the remaining factors are the Gaussian measures of the noises and a Gaussian distribution with zero mean and appropriate variance.
After introducing an auxiliary process , equation (47) can be recast into:
(49)  
After integrating over the noises, one obtains the following expression for the probability
(50) 
where
(51) 
It is straightforward to recognize that equation (51) is the action of the corresponding two-variable stochastic process:
(52) 
for a particular choice of the initial condition, following the Gaussian distribution introduced above. This result shows how the path probability distribution of the model (46) is essentially given by a marginalization of the corresponding Markovian one. From such an identification it is straightforward to explain the results shown in the previous sections.
3.3 General consequences of projections on entropy production
Denoting by the corresponding brackets the average over the paths in the model (46) and the average in the equivalent model including the auxiliary variable, one has
(53) 
Relation (53) shows that every observable of the retained variable has the same value when computed in the two models.
On the contrary
(54) 
where we have denoted the probability of the time-reversed trajectory accordingly.
As a consequence, the two functionals differ. This fact explains the observed difference. Moreover, it is simple to observe that
(55) 
where the last inequality is a straightforward application of the properties of the Kullback-Leibler relative entropy, which is always non-negative [34]. This projection mechanism, therefore, has in general the effect of reducing the entropy production. The equality is satisfied if
(56) 
The physical meaning of (56) is clear: it represents a sort of "reduced" detailed balance condition, which must hold for the variables one wants to remove from the description. If one removes from the description variables which are in equilibrium with respect to those which remain, the procedure will not affect the entropy production. It is simple to see that this condition is not valid, in general, for the model (4), once one decides to project away the second variable.
From this point of view it is also possible to understand why the projection mechanism is not dangerous when the timescales are well separated. Let us consider, for instance, the system in Figure 1. In the limit of a very slow second particle, that particle can be seen as blocked. Therefore the first particle is in equilibrium with respect to the system "thermostat + blocked particle", and Eq. (56) is valid for every value of the coupling.
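The inequality (55) is an instance of the data-processing property of the Kullback-Leibler divergence: marginalizing over the removed variable cannot increase the divergence between the forward and backward path measures. A toy check with two-dimensional Gaussians (standing in for the joint law of the kept and removed variables; all covariance values are illustrative assumptions):

```python
import numpy as np

def kl_gauss(S0, S1):
    """KL divergence between zero-mean Gaussians N(0, S0) and N(0, S1)."""
    S0 = np.atleast_2d(S0)
    S1 = np.atleast_2d(S1)
    d = S0.shape[0]
    S1inv = np.linalg.inv(S1)
    return 0.5 * (np.trace(S1inv @ S0) - d
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

# "Forward" and "backward" joint laws of (kept variable, removed variable)
P = np.array([[1.0, 0.3], [0.3, 0.8]])
Q = np.array([[1.1, -0.2], [-0.2, 0.9]])

kl_joint = kl_gauss(P, Q)
# Marginal over the kept variable only: the top-left 1x1 block
kl_marg = kl_gauss(P[:1, :1], Q[:1, :1])
```

Both divergences are non-negative, and the marginal one never exceeds the joint one: the projected description can only lose irreversibility, exactly the mechanism discussed in this section.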
4 Conclusions and perspectives
The linear equations (2) constitute a simplified model of a more complex, and perhaps more realistic, system with many degrees of freedom: such a system is made of two subsystems, each containing a large number of degrees of freedom. The degrees of freedom of each subsystem are coupled to a thermostat at a given temperature and are immersed in an external confining potential, assumed harmonic for simplicity. Furthermore, the degrees of freedom of each subsystem interact among themselves through intermolecular potentials which are, in general, not harmonic. In each subsystem there is also a probe, with mass much larger than all the other particles in the same subsystem: such a condition on the masses of the probes is sufficient to expect a linear Langevin-like dynamics for this degree of freedom, where the (nonlinear) interaction with all the other molecules is represented by an uncorrelated noise, while a linear velocity drag is due to collisional relaxation; of course the external harmonic potential is still present, reproducing the situation of Figure 1 and Eq. (2). Finally, these two "slow" degrees of freedom (slow with respect to the faster and lighter molecules) are coupled to each other by some potential. This coupling is the only connection between the two subsystems.
In the absence of the coupling between the probes, the two systems remain separated and each one thermalizes to its own thermostat. When the coupling is present, the whole system has the possibility to relax toward an overall equilibrium, but this is prevented by the presence of the two thermostats, which are ideally infinite and never change their own temperatures. The result is a nonequilibrium steady state where energy is continuously transferred, on average, from the hot to the cold reservoir. Such a situation is quite simple, but the nature of the coupling may pose some ambiguities when the system is represented by the simplified two-variable model. Indeed, the above picture holds even if the coupling potential is harmonic: in the harmonic case, however, the modes at different frequencies are decoupled. So, what is driving the system toward equilibrium, i.e. exchanging heat or producing entropy? In the harmonic case, the only channel for heat to flow is the one connecting the two components of the same mode, which are at different temperatures and can exchange heat. In summary, each mode has its own channel, which is separated from the others. When the two-variable model is reduced to the one-variable model with memory, the information about this channel is completely lost, because the two thermostats are reduced to only one. Each "cycle" at a given frequency, which behaves as a loop carrying a given current, is flattened into a harmonic oscillator with zero net current. The only remaining entropy production belongs to the exchange between different modes. In this sense the single-variable model does not faithfully reproduce the full entropy production of the whole system.
On the other side, if some nonlinearities are present, there are other "channels" of thermalization, due to the coupling between different modes, even of the same variable: such channels are still active after the projection onto the single variable, and they continue to contribute (maybe not exactly with the same average value) to a nonzero entropy production. In Appendix D we discuss an example where two "channels" for entropy production are present (an unbalance of temperatures and an external force), and their different fates after a reduction of the description are examined.
This energy-passing mechanism is evidently encoded in the correlations between different degrees of freedom. Such a role is crucial in two respects: (i) the response of the system to an impulsive perturbation is a combination of correlation functions, including cross-correlations; as expected, in the equilibrium limit the usual fluctuation-response relation holds, while, when more than one thermostat is present, a coupling between different degrees of freedom emerges, "breaking" the usual form of the response relation; (ii) the entropy production rate, calculated using the Onsager-Machlup formalism, is also proportional to the cross-correlations, with a prefactor depending on the two temperatures and vanishing in the equilibrium limit.
These conclusions are not specific to the "two-variable" model (2). As mentioned before, other variables and some nonlinearities can be included, and the same picture remains valid.
Appendix A Generalized Langevin equations and nonequilibrium issues
In this Appendix we study the linear response and the entropy production for a particular generalized Langevin equation. Some of the results presented here were obtained, in similar or different ways, in [12].
A.1 Set up
Consider the following simple onedimensional Langevin equation
(57) 
where the noise is Gaussian, with zero mean and correlation
(58) 
The force term contains a local-in-time part and a linear memory term,
(59) 
Both the local part and the memory kernel are left unspecified.
We are also interested in the stationary regime, so we send the initial time to the infinite past and the final time to the infinite future. Under this assumption the probability of a trajectory generated by the Langevin equation (57) is
(60) 
where the kernel appearing here is the inverse of the noise correlation, defined as
(61) 
A.2 Entropy production
Consider now the reversed trajectory. Its probability follows from (63) by exploiting the symmetry properties of the kernel. To compute the ratio between the probability of a trajectory and that of its reverse, we then have to separate, in (63), the terms which are even and odd under time reversal. To this end we have to look more closely at the kernel.
From its definition we have
(65) 
Now
(66)  
so that
(67) 
with
(68)  
where
(69)  
(70) 
are real even functions of the frequency. Collecting all terms we have
(71) 
and (63) takes the form
(72)  