
# Non-equilibrium and information: the role of cross-correlations

## Abstract

We discuss the relevance of the information contained in cross-correlations among different degrees of freedom, which is crucial in non-equilibrium systems. In particular we consider a stochastic system where two degrees of freedom, $X_1$ and $X_2$ - each in contact with a different thermostat - are coupled together. The production of entropy and the violation of the equilibrium fluctuation-dissipation theorem (FDT) are both related to the cross-correlation between $X_1$ and $X_2$. Information about such cross-correlation may be lost when single-variable reduced models, for $X_1$, are considered. Two different procedures are typically applied: (a) one totally ignores the coupling with $X_2$; (b) one models the effect of $X_2$ as an average memory effect, obtaining a generalized Langevin equation. In case (a) discrepancies between the system and the model appear both in entropy production and in linear response; the latter can be exploited to define effective temperatures, but those are meaningful only when time-scales are well separated. In case (b) the linear response of the model well reproduces that of the system; however the loss of information is reflected in a loss of entropy production. When only linear forces are present, such a reduction is dramatic and makes the average entropy production vanish, posing problems in interpreting FDT violations.

## 1 Introduction

Energy and information are well known to be related: the conceptual Maxwell's demon experiment is a popular representation of such an empirical fact. Extracting energy from a cold reservoir and delivering it to a hot one requires a device able to discern fast molecules from slow ones, i.e. it requires the processing of information. Proofs in simplified models and overwhelming experimental evidence lead to the conclusion that this information costs, in energy, more than what is gained in the extraction [1, 2, 3]. Such an issue has recently received renewed interest, both theoretical [4, 5] and experimental [6], in the context of small systems and non-equilibrium thermodynamics.

What is important, in this debate, is a proper evaluation of the information needed to perform the physical process under examination. This in turn amounts to having a good model of the system: in particular, it is crucial that the model correctly reproduces the information fluxes involved in its dynamics. Such an issue appears to be delicate, since modelling implies some level of coarse-graining and, as a consequence, a loss of information [7, 8, 9].

Remaining in the framework, mentioned above, of energy flowing between two different - and disconnected - reservoirs, an interesting example is provided by the information associated with the energy flux, which is measured in the following way. We consider two probes, e.g. two colloids or big molecules, which are coupled by a linear spring. Each probe is in contact with one of the two reservoirs, so that the spring between the probes also connects the thermostats. The interaction between a probe and its own thermostat - described in detail in the text below - is characterized by a typical time which is in principle different for each probe [10, 11, 12]. At equilibrium (identical thermostats) such typical times are not relevant, but they become important in the more general non-equilibrium case.

In such a system it is straightforward to compute the so-called entropy production rate, which is a measure of how fast information is created in the ensemble of probes’ pairs or, equivalently, of how fast this ensemble would relax to equilibrium if allowed to. The relaxation to equilibrium is forbidden by some (undetailed) external constraint which prevents the two thermostats from equilibrating and which continuously dissipates the information so far created, allowing the system to achieve a non-equilibrium steady state [13] (see also [14, 15]). Of course, if the reservoirs have the same temperature, this entropy production rate vanishes and the steady state satisfies detailed balance [16].

In this paper we discuss the effect of modelling the system by removing from the description one of the two probes. Two cases are interesting: (a) one simply ignores the existence of the coupling with the second degree of freedom; (b) one keeps some information about such a coupling, but - for the purpose of making things simpler - replaces it with proper memory terms and an effective colored noise, resulting in a generalized Langevin equation [17] in which the fluctuation-dissipation relation of the second kind is not satisfied. In case (a) the model is an equilibrium one, therefore even simple observations (for instance of the linear response) do not agree with its expectations, and the departure from such agreement may be interpreted so as to define non-equilibrium effective temperatures [18, 19]. However such a procedure is really meaningful only in the presence of a strong separation of time-scales, otherwise unphysical effective temperatures appear [11]. In case (b) the statistics of the dynamics of the remaining probe is properly reproduced, including the linear response. However the measure of the rate of information creation (entropy production) is underestimated. This discrepancy becomes dramatic in the case of linear couplings: in that case the entropy production completely vanishes, wrongly suggesting that the system is at equilibrium.

The plan of the paper is the following. In Section 2 we present the system with two variables, justifying it from a physical point of view, and we offer a review of its statistical dynamics properties. The system is taken fully linear from the beginning, but - as also detailed in the Appendixes - most of the results are more general. The importance of the cross-correlation between the two variables, and the effect of removing (in different ways) the second one from the description of the system, is discussed in Section 3. Finally, in Section 4 we put our results in a more general perspective, discussing the role of channels for the transport of energy and information and how they depend on the chosen level of description.

The Appendixes contain not only lengthy calculations accompanying the main results of the paper, but also deeper insights into the problem: A also discusses a partially non-linear case, as well as formulations in (time) Fourier space; B discusses the case of the same system with inertia, such that one of the degrees of freedom has different parity under time-reversal; C explains the subtle conditions necessary to reduce the system with two variables to the model with one variable and memory; finally, in D we offer an explicit example where the entropy production in the full description (two Markovian variables) has an additional contribution, with respect to the reduced description (one variable with memory), which carries crucial information about the difference of temperature between the two thermostats.

## 2 A system with two temperatures

Most of the ideas in this paper are illustrated by using a simple stochastic non-equilibrium system with two coupled degrees of freedom. The purpose of this section is to describe it and to recall the main known properties of its dynamics. Our system is described by two coupled Langevin equations:

$$\dot X_1 = -\alpha X_1 + \lambda X_2 + \sqrt{2D_1}\,\phi_1, \qquad \dot X_2 = -\gamma X_2 + \mu X_1 + \sqrt{2D_2}\,\phi_2, \tag{1}$$

where $\phi_1$ and $\phi_2$ are uncorrelated white noises, with zero mean and unitary variance.
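As a concrete numerical sketch (the parameter values below are ours, chosen arbitrarily for illustration), equations (1) can be integrated with a simple Euler-Maruyama scheme; the empirical covariance of a long trajectory can then be compared with the stationary covariance obtained from the Lyapunov condition (10), solved here as a small linear system:

```python
import numpy as np

# Euler-Maruyama integration of Eq. (1). Parameters are illustrative,
# not taken from the paper.
rng = np.random.default_rng(0)
alpha, gam, lam, mu = 2.0, 2.0, 1.0, 1.0
D1, D2 = 1.0, 0.5
dt, nsteps = 1e-2, 200_000

noise = rng.standard_normal((nsteps, 2))
a1, a2 = np.sqrt(2 * D1 * dt), np.sqrt(2 * D2 * dt)
X = np.empty((nsteps, 2))
x1 = x2 = 0.0
for n in range(nsteps):
    dx1 = (-alpha * x1 + lam * x2) * dt + a1 * noise[n, 0]
    dx2 = (-gam * x2 + mu * x1) * dt + a2 * noise[n, 1]
    x1, x2 = x1 + dx1, x2 + dx2
    X[n] = x1, x2

emp = np.cov(X[20_000:].T)   # empirical covariance, transient discarded

# Stationary condition D = (A s + s A^T)/2, written out for the three
# independent entries (s11, s12, s22) of the symmetric covariance s:
A = np.array([[alpha, -lam], [-mu, gam]])
M = np.array([[2 * A[0, 0], 2 * A[0, 1], 0],
              [A[1, 0], A[0, 0] + A[1, 1], A[0, 1]],
              [0, 2 * A[1, 0], 2 * A[1, 1]]])
s11, s12, s22 = np.linalg.solve(M, [2 * D1, 0.0, 2 * D2])
```

Note the non-zero empirical cross-correlation between the two variables, which plays a central role in what follows.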

The above stochastic equations can be thought of as modelling the system portrayed in Fig. 1. The system includes two particles (for simplicity in one dimension), with positions $x_1$ and $x_2$ and momenta $p_1$ and $p_2$, whose Hamiltonian is given by

$$H_{tot} = \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} + \frac{1}{2}k_1 x_1^2 + \frac{1}{2}k_2 x_2^2 + \frac{1}{2}k(x_1 - x_2)^2. \tag{2}$$

Each particle moves in a dilute fluid which exerts a viscous drag with coefficient $\gamma_i$, and which is coupled to a thermostat at temperature $T_i$; a natural way of modelling the dynamics of this system is the following:

$$\dot p_1 = -\frac{\partial H}{\partial x_1} - \gamma_1\dot x_1 + \sqrt{2\gamma_1 T_1}\,\phi_1, \qquad \dot p_2 = -\frac{\partial H}{\partial x_2} - \gamma_2\dot x_2 + \sqrt{2\gamma_2 T_2}\,\phi_2. \tag{3}$$

Now, by taking the overdamped limit we get:

$$\gamma_1\dot x_1 = -(k_1+k)x_1 + k x_2 + \sqrt{2\gamma_1 T_1}\,\phi_1, \qquad \gamma_2\dot x_2 = k x_1 - (k+k_2)x_2 + \sqrt{2\gamma_2 T_2}\,\phi_2, \tag{4}$$

which corresponds to model (1) by identifying $X_1 \to x_1$, $X_2 \to x_2$, and

$$\alpha \to \frac{k_1+k}{\gamma_1}, \quad \lambda \to \frac{k}{\gamma_1}, \quad \gamma \to \frac{k+k_2}{\gamma_2}, \quad \mu \to \frac{k}{\gamma_2}, \quad D_1 \to \frac{T_1}{\gamma_1}, \quad D_2 \to \frac{T_2}{\gamma_2}. \tag{5}$$

System (1), in a more compact form, reads

$$\frac{dX}{dt} = -AX + \boldsymbol{\phi}, \tag{6}$$

where $X$ and $\boldsymbol{\phi}$ are 2-dimensional vectors, $A$ is a real matrix, in general not symmetric, and $\boldsymbol{\phi}$ is a Gaussian process with covariance matrix:

$$\langle\phi_i(t')\phi_j(t)\rangle = 2D_{ij}\,\delta(t-t'), \tag{7}$$

and

$$A = \begin{pmatrix} \alpha & -\lambda \\ -\mu & \gamma \end{pmatrix}, \qquad D = \begin{pmatrix} D_1 & 0 \\ 0 & D_2 \end{pmatrix}. \tag{8}$$

In order to reach a steady state, the real parts of $A$'s eigenvalues must be positive. This condition is verified if $\alpha + \gamma > 0$ and $\alpha\gamma > \lambda\mu$. The extension to a generic dimension and to non-diagonal $D$ matrices (which however must remain symmetric) is straightforward.

The steady state is characterized by a bivariate Gaussian distribution [20]:

$$\rho(X) = \mathcal{N}\exp\left(-\frac{1}{2}X^T\sigma^{-1}X\right), \tag{9}$$

where $\mathcal{N}$ is a normalization coefficient and the matrix of covariances $\sigma$ satisfies

$$D = \frac{A\sigma + \sigma A^T}{2}. \tag{10}$$

Solving this equation gives

$$\sigma = \begin{pmatrix} \dfrac{D_2\lambda^2 - D_1\mu\lambda + D_1\gamma(\alpha+\gamma)}{(\alpha+\gamma)(\alpha\gamma-\lambda\mu)} & \dfrac{D_2\alpha\lambda + D_1\gamma\mu}{(\alpha+\gamma)(\alpha\gamma-\lambda\mu)} \\[2.5ex] \dfrac{D_2\alpha\lambda + D_1\gamma\mu}{(\alpha+\gamma)(\alpha\gamma-\lambda\mu)} & \dfrac{D_1\mu^2 - D_2\lambda\mu + D_2\alpha(\alpha+\gamma)}{(\alpha+\gamma)(\alpha\gamma-\lambda\mu)} \end{pmatrix}. \tag{11}$$
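A quick numerical sanity check (with arbitrary parameters of our choosing) that the closed form (11) indeed solves the Lyapunov condition (10):

```python
import numpy as np

# Verify that the closed-form covariance (11) satisfies D = (A s + s A^T)/2.
# Parameters are arbitrary but chosen to satisfy the stability conditions.
alpha, gam, lam, mu = 3.0, 1.5, 0.7, 1.1
D1, D2 = 0.8, 1.3
assert alpha * gam > lam * mu and alpha + gam > 0   # stability of A

den = (alpha + gam) * (alpha * gam - lam * mu)
s11 = (D2 * lam**2 - D1 * mu * lam + D1 * gam * (alpha + gam)) / den
s12 = (D2 * alpha * lam + D1 * gam * mu) / den
s22 = (D1 * mu**2 - D2 * lam * mu + D2 * alpha * (alpha + gam)) / den

A = np.array([[alpha, -lam], [-mu, gam]])
sigma = np.array([[s11, s12], [s12, s22]])
D = np.array([[D1, 0.0], [0.0, D2]])

residual = (A @ sigma + sigma @ A.T) / 2 - D   # should vanish
```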

Moreover, in this system it is also possible to calculate the path probabilities. The probability of a trajectory in the phase space can be written in the following form:

$$P(\{X(s)\}_0^t) = \int\mathcal{D}\phi\; P(\phi)\,\delta(\dot X + AX - \phi), \tag{12}$$

where the integral involves all the possible realizations of the noise with the corresponding weight. By introducing auxiliary variables, using the integral representation of the delta function, one obtains [21]:

$$P(\{X(s)\}_0^t) \sim \int\mathcal{D}\hat X\; e^{S(X,\hat X)}, \tag{13}$$

where $S(X,\hat X)$ is the resulting action in the auxiliary ("hat") variables.
In the following, we will also use the Onsager-Machlup expression for the path probabilities, which is obtained by integrating expression (13) over the hat variables [22]:

$$P(\{X(s)\}_0^t) \sim \exp\left(-\frac{1}{4}\int_0^t ds\,(\dot X + AX)^T D^{-1}(\dot X + AX)\right). \tag{14}$$

Expression (14) has the advantage of not needing the presence of auxiliary fields.

Equilibrium is defined as the regime where paths and their time-reversals have the same probability, i.e.

$$\rho[X(0)]\,P(\{X(s)\}_0^t) = \rho[\mathbb{I}X(t)]\,P(\{\mathbb{I}X(s)\}_0^t), \tag{15}$$

where $\mathbb{I}X$ is the time-reversed phase point and $\rho$, defined in (9), represents the probability of the initial condition. It is easy to verify that such a condition leads to

$$C_{ij}(t) = \epsilon_i\epsilon_j C_{ji}(t), \tag{16}$$
$$\dot C_{ij}(t) = \epsilon_i\epsilon_j \dot C_{ji}(t), \tag{17}$$

where we have defined, for $t \ge 0$, the time-delayed cross-correlation $C_{ij}(t) = \langle X_i(t)X_j(0)\rangle$ and the parity $\epsilon_i$ ($+1$ or $-1$) under time-reversal of the $i$-th variable. Considering that one has, for the matrix of time-delayed correlations [20],

$$C(t) = e^{-At}\sigma, \qquad \dot C(t) = -e^{-At}A\sigma, \tag{18}$$

by evaluating the above conditions at $t = 0$, it is seen that the equilibrium definition (16) leads to two important conditions:

1. $\sigma_{ij} = 0$ if $\epsilon_i\epsilon_j = -1$ (because $\sigma$ is symmetric by construction);

2. $(A\sigma)_{ij} = \epsilon_i\epsilon_j(A\sigma)_{ji}$, which are the so-called Onsager reciprocal relations, $A\sigma$ being the Onsager matrix (indeed - at equilibrium - it relates currents to thermodynamic forces).

Note that Eq. (10) also means $D = (A\sigma)^{sym}$ where, for a generic matrix $B$, we define its symmetrized $B^{sym} = (B+B^T)/2$. Therefore, if all variables have the same parity, the equilibrium condition stated above reads $A\sigma = D$. This happens, for instance, for overdamped Langevin equations, such as the one considered here, with physical interpretation (4).

### 2.2 Response analysis

Thanks to the linearity of equations (1), the response properties of the system can be easily calculated:

$$R(t) = e^{-At}, \tag{19}$$

where we have defined $R_{ij}(t) = \overline{\delta X_i(t)/\delta X_j(0)}$.

Moreover, a direct comparison between Eqs. (18) and (19) gives:

$$R(t) = C(t)\sigma^{-1}, \tag{20}$$

where $\sigma^{-1}$ is the inverse of matrix (11).

By differentiating equation (18) with respect to time and substituting into (20), one has

$$R(t) = -\dot C(t)(A\sigma)^{-1}; \tag{21}$$

note that this is a particular case of a generalized response equation, also called Generalized Fluctuation-Dissipation Relation (GFDR) [23, 24, 25, 26, 11, 27]. For instance, within the physical interpretation given in (5), the Onsager matrix reads

$$A\sigma = \begin{pmatrix} \dfrac{T_1}{\gamma_1} & \Sigma_{\Delta T} \\[1.5ex] -\Sigma_{\Delta T} & \dfrac{T_2}{\gamma_2} \end{pmatrix}, \tag{22}$$

where $\Sigma_{\Delta T} = \dfrac{k\,\Delta T}{(k+k_1)\gamma_2 + (k+k_2)\gamma_1}$ and, as usual, $\Delta T = T_1 - T_2$. Note that $A\sigma$ is diagonal if $\Delta T = 0$ or $k = 0$.
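The structure of (22) can be checked numerically; the physical parameters below are illustrative choices of ours, and the expression used for $\Sigma_{\Delta T}$ is the one given above:

```python
import numpy as np

# Build A and D from the physical parameters via the mapping (5), obtain the
# stationary covariance from the Lyapunov condition (10), and inspect the
# Onsager matrix A @ sigma, Eq. (22).  Parameters are arbitrary.
k, k1, k2 = 1.0, 2.0, 0.5
g1, g2 = 1.0, 3.0
T1, T2 = 2.0, 1.0

alpha, lam = (k1 + k) / g1, k / g1
gam, mu = (k + k2) / g2, k / g2
D1, D2 = T1 / g1, T2 / g2

A = np.array([[alpha, -lam], [-mu, gam]])
M = np.array([[2 * A[0, 0], 2 * A[0, 1], 0],
              [A[1, 0], A[0, 0] + A[1, 1], A[0, 1]],
              [0, 2 * A[1, 0], 2 * A[1, 1]]])
s11, s12, s22 = np.linalg.solve(M, [2 * D1, 0.0, 2 * D2])
sigma = np.array([[s11, s12], [s12, s22]])

onsager = A @ sigma
# expected off-diagonal element: antisymmetric, proportional to T1 - T2
sigma_dT = k * (T1 - T2) / ((k + k1) * g2 + (k + k2) * g1)
```

The off-diagonal part vanishes when `T1 == T2` or `k == 0`, the equilibrium fingerprint discussed above.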

For a correct comparison with the standard literature, one must slightly change the definition of response used up to this point. Let us suppose to perturb the Hamiltonian (2) with a term $-h x_1$. From the equations of motion (4) one has

$$\overline{\frac{\delta x_1(t)}{\delta h(0)}} = \frac{1}{\gamma_1}\,\overline{\frac{\delta x_1(t)}{\delta x_1(0)}}. \tag{23}$$

With such a mapping, one may write the linear response formula (21) for the degree of freedom $x_1$ as

$$\overline{\frac{\delta x_1(t)}{\delta h(0)}} = -\frac{[(A\sigma)^{-1}]_{11}}{\gamma_1}\frac{d}{dt}\langle x_1(t)x_1(0)\rangle - \frac{[(A\sigma)^{-1}]_{21}}{\gamma_1}\frac{d}{dt}\langle x_1(t)x_2(0)\rangle, \tag{24}$$

where the weight of the two contributions on the right-hand side depends on the time-scale of observation.

When $\Delta T = 0$ or $k = 0$, $\Sigma_{\Delta T}$ vanishes and one recovers the equilibrium condition $A\sigma = D$ (equivalent to reciprocal relations for overdamped variables), together with the known equilibrium fluctuation-dissipation relation, $\overline{\delta x_1(t)/\delta h(0)} = -\frac{1}{T_1}\frac{d}{dt}\langle x_1(t)x_1(0)\rangle$.
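Both relations can be verified numerically: with $C(t) = e^{-At}\sigma$ one has $C(t)\sigma^{-1} = e^{-At} = R(t)$ and $-\dot C(t)(A\sigma)^{-1} = R(t)$. A sketch with arbitrary parameters of ours:

```python
import numpy as np

# Numerical check of R(t) = C(t) sigma^{-1} and R(t) = -dC/dt (A sigma)^{-1}.
alpha, gam, lam, mu = 2.0, 1.0, 0.5, 0.8
D1, D2 = 1.0, 0.4

A = np.array([[alpha, -lam], [-mu, gam]])
M = np.array([[2 * A[0, 0], 2 * A[0, 1], 0],
              [A[1, 0], A[0, 0] + A[1, 1], A[0, 1]],
              [0, 2 * A[1, 0], 2 * A[1, 1]]])
s11, s12, s22 = np.linalg.solve(M, [2 * D1, 0.0, 2 * D2])
sigma = np.array([[s11, s12], [s12, s22]])

def expmA(t):
    """exp(-A t) via eigendecomposition (A is diagonalizable here)."""
    w, V = np.linalg.eig(A)
    return (V @ np.diag(np.exp(-w * t)) @ np.linalg.inv(V)).real

t = 0.7
R = expmA(t)                               # response matrix, Eq. (19)
C = lambda s: expmA(s) @ sigma             # correlation matrix, Eq. (18)
fdr = C(t) @ np.linalg.inv(sigma)          # right-hand side of Eq. (20)
dC = (C(t + 1e-6) - C(t - 1e-6)) / 2e-6    # numerical time derivative
gfdr = -dC @ np.linalg.inv(A @ sigma)      # right-hand side of Eq. (21)
```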

### 2.3 Entropy production

In this system it is easy to compute the entropy production functional of a single trajectory. Let us consider a general trajectory $\{X(s)\}_0^t$ and its time-reversal. Lebowitz and Spohn defined the fluctuating entropy production functional as follows [28]:

$$W'_t = \log\frac{\rho[X(0)]\,P(\{X(s)\}_0^t)}{\rho[X(t)]\,P(\{\mathbb{I}X(s)\}_0^t)} = W_t + b_t, \tag{25}$$

with

$$b_t = \log\{\rho[X(0)]\} - \log\{\rho[X(t)]\}, \tag{26}$$

where $\rho$ is the stationary distribution, i.e. the bivariate Gaussian with covariance given by Eq. (11), and $P$ is the probability of the trajectory introduced in equation (14). Lebowitz and Spohn have shown that the average (over the steady ensemble) of $W_t$, if detailed balance is not satisfied, increases with time, while the term $b_t$, usually known as "border term", is usually negligible for large times, unless particular conditions of "singularity" occur [29, 30, 31].

For simplicity of notation, let us define the drift $F(X) = -AX$. In order to write down an explicit expression, it is necessary to establish the behavior of the variables under time reversal (e.g. positions are even and velocities are odd under the time-inversion transformation). Let us assume that under time reversal $X_i \to \epsilon_i X_i$, where $\epsilon_i$ can be $+1$ or $-1$, and let us also use the notation $(\epsilon X)_i = \epsilon_i X_i$. Then one can define

$$F^{rev}_i(X) = \frac{1}{2}\left[F_i(X) - \epsilon_i F_i(\epsilon X)\right] = -\epsilon_i F^{rev}_i(\epsilon X), \tag{27}$$
$$F^{ir}_i(X) = \frac{1}{2}\left[F_i(X) + \epsilon_i F_i(\epsilon X)\right] = \epsilon_i F^{ir}_i(\epsilon X). \tag{28}$$

Given this notation [20], it is possible to write down a compact form for the entropy production simply by substituting equation (14) into (25), obtaining:

$$W_t = \sum_k D^{-1}_{kk}\int_0^t ds\; F^{ir}_k\left[\dot X_k - F^{rev}_k\right]. \tag{29}$$

Formula (29) is valid also in the presence of non-linear terms and with several variables.
From now on, in order to carry on the calculations, it is necessary to fix the parity of the variables under the time-reversal transformation. We will discuss explicitly the overdamped dynamics case (4), in which the variables $X_1$ and $X_2$, being positions, are both even under time reversal. Overdamped cases are usually simpler because the terms $F^{rev}$ vanish. The non-overdamped case is discussed in B and has the same technical level, with the difference that the velocity variable is odd under time-reversal. The exact expression of the entropy production also includes border terms, which are not extensive in time. We do not include those terms in the calculations, since we are interested in the asymptotic expression.
Using (29), the entropy production is calculated to be

$$W_t = \frac{1}{D_1}\int_0^t dt'\left[\lambda X_2\dot X_1 - \alpha X_1\dot X_1\right] + \frac{1}{D_2}\int_0^t dt'\left[\mu X_1\dot X_2 - \gamma X_2\dot X_2\right]. \tag{30}$$

Note that the terms $\int_0^t X_1\dot X_1\,dt'$ and $\int_0^t X_2\dot X_2\,dt'$ are not extensive in time; moreover, integrating by parts, $\int_0^t X_1\dot X_2\,dt' = [X_1X_2]_0^t - \int_0^t X_2\dot X_1\,dt'$. Therefore the entropy production, for large times, Eq. (30), can be recast into

$$W_t \simeq \left[\frac{\lambda}{D_1} - \frac{\mu}{D_2}\right]\int_0^t X_2\dot X_1\,dt'. \tag{31}$$

It is possible to calculate the mean value of the entropy production rate (a limit for large times is meant):

$$\frac{1}{t}\langle W_t\rangle \simeq \left[\frac{\lambda}{D_1} - \frac{\mu}{D_2}\right]\frac{1}{t}\left\langle\int_0^t X_2\dot X_1\,dt'\right\rangle = \left[\frac{\lambda}{D_1} - \frac{\mu}{D_2}\right]\langle X_2\dot X_1\rangle. \tag{32}$$

Equation (32) can be closed by substituting the equation of motion (1) and the values of the static correlations (11), obtaining

$$\frac{1}{t}\langle W_t\rangle = \frac{(D_2\lambda - D_1\mu)^2}{D_1 D_2(\alpha+\gamma)}. \tag{33}$$

The formula, applied to the physical interpretation (5), gives:

$$\frac{1}{t}\langle W_t\rangle = \frac{k^2}{(k+k_1)\gamma_2 + (k+k_2)\gamma_1}\,\frac{\Delta T^2}{T_1 T_2}. \tag{34}$$

It is immediate to recognize in formula (34) that the mean rate is always positive, as expected. Moreover, it is zero at equilibrium and in other more trivial cases, namely when the dynamical coupling term $k$ goes to zero. It can also approach zero in the limit of time-scale separation, but we will return to this point in Section 3.1.
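As a sketch (the parameters are ours, chosen arbitrarily), one can verify numerically that (33) and (34) coincide under the mapping (5), and estimate the rate directly from a simulated trajectory through the Stratonovich integral in (31):

```python
import numpy as np

# Deterministic check that rate (33) equals rate (34) under the mapping (5),
# plus a direct estimate of W_t / t from a simulated trajectory via Eq. (31).
k, k1, k2 = 1.0, 1.0, 1.0
g1, g2 = 1.0, 1.0
T1, T2 = 2.0, 1.0

alpha, lam = (k1 + k) / g1, k / g1
gam, mu = (k + k2) / g2, k / g2
D1, D2 = T1 / g1, T2 / g2

rate_33 = (D2 * lam - D1 * mu) ** 2 / (D1 * D2 * (alpha + gam))
rate_34 = k**2 / ((k + k1) * g2 + (k + k2) * g1) * (T1 - T2) ** 2 / (T1 * T2)

rng = np.random.default_rng(1)
dt, nsteps = 1e-2, 600_000
noise = rng.standard_normal((nsteps, 2))
a1, a2 = np.sqrt(2 * D1 * dt), np.sqrt(2 * D2 * dt)
x1 = x2 = 0.0
W = 0.0
for n in range(nsteps):
    dx1 = (-alpha * x1 + lam * x2) * dt + a1 * noise[n, 0]
    dx2 = (-gam * x2 + mu * x1) * dt + a2 * noise[n, 1]
    # Stratonovich rule: evaluate X2 at the mid-point of the step
    W += (lam / D1 - mu / D2) * (x2 + 0.5 * dx2) * dx1
    x1 += dx1
    x2 += dx2
rate_sim = W / (nsteps * dt)
```

With these (assumed) values both closed-form rates equal $1/8$, and the trajectory estimate fluctuates around that value.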

## 3 Out-of-equilibrium information and cross-correlations

In order to predict the response of an equilibrium system it is sufficient to know its autocorrelation, as stated by the fluctuation-dissipation theorem. In a broad sense, autocorrelation and response have the same information content. On the contrary, we have shown that the cross-correlations between different degrees of freedom play a crucial role in non-equilibrium response. The same is true for the average entropy production rate: it is zero at equilibrium because the cross-correlation $\langle X_2\dot X_1\rangle$ vanishes.

In experiments or numerical simulations, however, if only $X_1$ is observed, one is tempted to describe it by some effective stochastic process which relegates the role of the other degrees of freedom to some kind of noise. The crudest way of doing so is neglecting any time-delayed coupling of $X_1$ with other variables: of course such a model is - in the absence of other external forces - necessarily an equilibrium model, and cannot agree with observations; nevertheless, the comparison with the equilibrium expectation can - in some cases - lead to interesting interpretations. In the following we review the case of effective temperatures, which are deduced by forcing a comparison between non-equilibrium and equilibrium response (autocorrelation), and which, in the case of extreme time-scale separation, carry useful information about the two non-equilibrium thermostats. After that, we also discuss a more informed way of modelling the system, by considering time-delayed effects of the other degrees of freedom in terms of memory and colored noise. The predictions of such a model are much closer to observations, but we show that crucial pieces of the puzzle are still missing.

### 3.1 Comparison with a single-variable, equilibrium model

Extending what is certainly true at equilibrium, one may insist on comparing response and correlation, by defining [18, 19]

$$T^{(AB)}_{eff}(t,t_w) \equiv \frac{\dot C_{AB}(t,t_w)}{R_{AB}(t,t_w)}, \tag{35}$$

with $\dot C_{AB}(t,t_w) = \partial C_{AB}(t,t_w)/\partial t_w$, where $A$ and $B$ are two different observables of the system. The use of two times $t$ and $t_w$ allows one to include also cases where time-translational invariance is not satisfied and observables do depend in a non-trivial way on the waiting time $t_w$ (for instance in aging systems). Equation (35) represents an attempt to generalize the temperature to systems out of equilibrium, where ergodicity is broken. The validity of a thermodynamic interpretation of this quantity is clear in some limits, namely well-separated time-scales [18, 32, 10, 33].

At first sight, equation (35) appears in sharp contrast with the "cross-correlation" description given in Section 2.2, mainly because only the perturbed variable is involved [11]. Nevertheless, in some cases also a partial view of the correlation-response plot is meaningful, in particular in the case of time-scale separation. For instance, in the physical interpretation (4), the model reveals an interesting and non-trivial interplay of time-scales. For simplicity let us consider the case $\gamma_1 = \gamma_2$. A typical time for variable $x_1$, corresponding to its relaxation time when decoupled from $x_2$ (i.e. $\lambda = 0$), is $\tau_1 = 1/\alpha$. Analogously it is possible to define a characteristic time for $x_2$: $\tau_2 = 1/\gamma$. An interesting limit is the following:

$$\tau_1 \ll \tau_2, \qquad \frac{k}{k_1} \sim \frac{T_1}{T_2},$$

where the additional second condition guarantees that the interactions have the same order of magnitude, so that the limit is non-trivial and remains genuinely out of equilibrium. In this case it can be shown that the two timescales $\tau_1$ and $\tau_2$ correspond to those obtained by inverting the two eigenvalues of the matrix $A$. Most importantly, only in this limit does the analysis of integrated response versus correlation produce a two-slope curve, where $T_1$ and $T_2$ are recognized as the inverses of the measured slopes. However this is a limit case, and more general conditions can be considered.
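A minimal numerical illustration (with parameters of our choosing, not those of Table 1): when the scales are strongly separated, the eigenvalues of the matrix $A$ built from the mapping (5) indeed approach $1/\tau_1$ and $1/\tau_2$:

```python
import numpy as np

# With well-separated scales, the eigenvalues of A approach 1/tau_1 and
# 1/tau_2, with tau_i the relaxation times of the decoupled dynamics.
# Parameter values are illustrative.
k, k1, k2 = 1.0, 100.0, 1.0
g1, g2 = 1.0, 1.0

A = np.array([[(k1 + k) / g1, -k / g1],
              [-k / g2, (k2 + k) / g2]])
ev = np.sort(np.linalg.eigvals(A).real)

tau1 = g1 / (k1 + k)   # fast relaxation of x1 (= 1/alpha)
tau2 = g2 / (k2 + k)   # slow relaxation of x2 (= 1/gamma)
```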

In particular we consider the time-integrated response $Q(t) = \int_0^t dt'\,\overline{\delta x_1(t')/\delta h(0)}$, and its two contributions appearing in the splitting formula (24), such that $Q(t) = Q_{11}(t) + Q_{12}(t)$ and

$$Q_{11}(t) = \frac{[(A\sigma)^{-1}]_{11}}{\gamma_1}\left[C_{11}(0) - C_{11}(t)\right], \tag{36}$$
$$Q_{12}(t) = \frac{[(A\sigma)^{-1}]_{21}}{\gamma_1}\left[C_{12}(0) - C_{12}(t)\right]. \tag{37}$$

Our choices of parameters are summarized in Table 1: a case (a) where the scales are well separated, and a case (b) where the time-scales are mixed. Of course we do not intend to exhaust all the possibilities of this rich model (given in more detail in [11]), but to offer a few examples which may shed light on the role of cross-correlations for linear response.

The parametric plots for the cases of Table 1 are shown in Figure 2, top frames. In the same figure, bottom frames, we present the corresponding contributions $Q_{11}$ and $Q_{12}$ as functions of time. We briefly discuss the two cases:

1. In the "glassy" limit $\tau_1 \ll \tau_2$, with the constraint $k/k_1 \sim T_1/T_2$, the well-known broken line is found, see Fig. 2a. Figure 2c shows that $Q_{12}$ is negligible during the first transient, up to the first plateau of the response, while it becomes relevant during the second rise of the response toward the final plateau.

2. If the timescales are not separated, the general form of the parametric plot, see Fig. 2b, is a smooth curve. In fact, as shown in Fig. 2d, the cross term $Q_{12}$ is relevant at all time-scales. The slopes at the extremes of the parametric plot, which can be hard to measure in an experiment, are $1/T_1$ (at early times, high values of the correlation) and some slope close to $1/T_2$ (at large times, low values of the correlation). Apart from that, the main information of the parametric plot is to point out the relevance of the coupling of $x_1$ with the "hidden" variable $x_2$.

Note also that, if the relative coupling is changed, the information on $T_1$ and $T_2$ may disappear from the plot [11]. In summary, the correct formula for the response is always the GFDR (24). However, the definition of an effective temperature through the relation (35) can be useful in those limits which are relevant for glassy systems [19], where the behavior of the additional term $Q_{12}$ is such that it can be neglected in certain ranges of time-scales.

### 3.2 Comparing with a single-variable non-equilibrium model with memory

Another classical approach to reduce the description of a many-body system, e.g. to focus on a (possibly slow) single degree of freedom without losing the information of the reciprocal feedback between the original variables, is to use a non-Markovian description, with memory and colored noise. In order to fix ideas, let us consider again the linear model (1). By formally integrating the second equation one has

$$X_2(t) = \int_{-\infty}^t ds\; e^{-\gamma(t-s)}\left[\mu X_1(s) + \sqrt{2D_2}\,\phi_2(s)\right]. \tag{38}$$

Inserting (38) into the first of equations (1), a closed equation for $X_1$ is obtained:

$$\dot X_1 = -\alpha X_1 + \lambda\mu\int_{-\infty}^t ds\; e^{-\gamma(t-s)}X_1(s) + \eta(t), \tag{39}$$

with

$$\langle\eta(t)\eta(s)\rangle = 2D_1\delta(t-s) + \frac{D_2\lambda^2}{\gamma}e^{-\gamma|t-s|}. \tag{40}$$

It is worth noting that, with this mapping, the detailed balance condition, given in the Markovian description by $\lambda D_2 = \mu D_1$, is "translated" into

$$\langle\eta(t)\eta(s)\rangle \propto e^{-\gamma|t-s|}, \tag{41}$$

which is the Fluctuation Dissipation Relation of the second kind, derived by Kubo for generalized Langevin equations [17].

This mapping appears to be a harmless mathematical trick, and one is tempted to consider the original system and the reduced model as equivalent. Actually it hides a loss of relevant information, detected for instance by the entropy production, as we discuss in the following.
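The colored part of (40) is nothing but the autocorrelation of the Ornstein-Uhlenbeck contribution appearing in (38): an OU process with relaxation rate $\gamma$ and noise intensity $D_2$ has stationary variance $D_2/\gamma$ and exponential autocorrelation. A quick simulation check, with illustrative parameters of our choosing:

```python
import numpy as np

# Simulate an Ornstein-Uhlenbeck process and compare its stationary
# variance and autocorrelation with (D2/gamma) * exp(-gamma*|tau|),
# the colored part of Eq. (40) (up to the factor lambda^2).
rng = np.random.default_rng(2)
gam, D2 = 2.0, 0.5
dt, nsteps = 1e-2, 300_000

xi = rng.standard_normal(nsteps)
amp = np.sqrt(2 * D2 * dt)
y = np.empty(nsteps)
yv = 0.0
for n in range(nsteps):
    yv += -gam * yv * dt + amp * xi[n]
    y[n] = yv

y = y[20_000:]                       # discard the transient
var_emp = y.var()
lag = int(round(1.0 / gam / dt))     # one correlation time, tau = 1/gamma
ac_emp = np.mean(y[:-lag] * y[lag:])

var_th = D2 / gam                    # predicted stationary variance
ac_th = var_th * np.exp(-1.0)        # predicted value at tau = 1/gamma
```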

The previous Section shows that, if one takes the point of view of a single variable, an interpretation of the "violations" of the fluctuation-response theorem different from that of the two-variable case can be given. These interpretations are not in contrast with each other: the condition $\lambda D_2 = \mu D_1$ is always the "equilibrium fingerprint" under which FDT is satisfied. On the contrary, the scenario is different if one compares the entropy production in the non-Markovian system to what was found in Section 2.3.

The average entropy production for this non-Markovian model (originally described in [12]) is better studied in frequency space, and can be approached for a more general model. This is done in detail in A, while here we mention the main results. We start by considering the following one-dimensional Langevin equation

$$m\ddot x = -\gamma\dot x - h_x[x(t)] - \int_{-\infty}^t dt'\, g(t-t')x(t') + \eta, \tag{42}$$

where $\eta$ is a Gaussian noise of zero mean and correlation

$$\langle\eta(t)\eta(t')\rangle = \nu(t-t'), \tag{43}$$

with $\nu(t) = \nu(-t)$. In this model one can calculate the path probability and its reversed counterpart. The mean value of the Lebowitz-Spohn functional, ignoring all the contributions non-extensive in time, reads

$$\frac{1}{t}\langle W_t\rangle = \int_{-\infty}^{+\infty}\frac{d\omega}{2\pi}\;\frac{2\,\omega\,\psi(\omega)}{\nu(\omega)}\;\langle\mathrm{Im}[x(\omega)h_x(-\omega)]\rangle, \tag{44}$$

where the average is performed on the space of trajectories, and the functions appearing here are Fourier transforms (see A.2 for the details of the calculation).

From equation (44) it is easy to see that in the linear case, namely for $h_x[x] \propto x$, one has:

$$\langle\mathrm{Im}[x(\omega)h_x(-\omega)]\rangle = 0. \tag{45}$$

Remarkably, this predicts a vanishing entropy production also in the case of the linear model with inertia ($m > 0$), in sharp contrast with what is found in (31), or in (125) for the case of underdamped dynamics.

From this result it emerges that the two approaches represent the same physical situation but with different levels of detail: moreover, the choice of the level of description leaves almost all observables unaffected; for instance, correlations and responses of the main variable are unchanged, leading to the same FDT analysis in the two models. In order to focus on the reason for this difference, let us consider the model with exponential memory (39), which we rewrite here in a lightened notation, for clarity:

$$\dot x = -h(x) + \lambda\mu\int_{t_0}^t ds\; e^{-\gamma(t-s)}x(s) + \eta(t) \equiv f_x + \eta(t). \tag{46}$$

The path probability of this process, starting from the position $x_0$ at time $t_0$, can be expressed in the following form (see C for details):

$$P[x|x_0] = \int\mathcal{D}\phi_x\,\mathcal{D}\phi_y\;\delta\!\left[\dot x - f_x - s(t) - \lambda\sqrt{2D_y}\int_{t_0}^t ds\, g(t-s)\phi_y(s) - \sqrt{2D_x}\,\phi_x(t)\right], \tag{47}$$

where we have used the simplified notation $g(t) = e^{-\gamma t}$, and $s(t) = \lambda y_0\,g(t-t_0)$ accounts for the initial condition $y_0$ of the removed degree of freedom. Moreover, the Gaussian measures of the noises $\phi_x$ and $\phi_y$ are understood in the functional integrals, and $P_0(y_0)$ is a Gaussian distribution with zero mean and variance $D_y/\gamma$.

After introducing an auxiliary process $y(t)$, equation (47) can be recast into:

$$P[x|x_0] = \int\mathcal{D}\phi_x\,\mathcal{D}\phi_y\,\mathcal{D}y\;\delta\!\left[\dot x + h_x - \lambda y - \sqrt{2D_x}\,\phi_x(t)\right]\delta\!\left[y - y_0\,g(t-t_0) - \int_{t_0}^t ds\, g(t-s)\left[\mu x(s) + \sqrt{2D_y}\,\phi_y(s)\right]\right]. \tag{49}$$

After integrating over the noises, one obtains the following expression for the probability:

$$P[x|x_0] = \int dy_0\, P_0(y_0)\int_{y(t_0)=y_0}\mathcal{D}y\; e^{S(x,y)}, \tag{50}$$

where

$$S(x,y) = -\frac{1}{4D_x}\int_{t_0}^{t_1}dt\left[\dot x + h_x - \lambda y\right]^2 - \frac{1}{4D_y}\int_{t_0}^{t_1}dt\left[\dot y + \gamma y - \mu x\right]^2. \tag{51}$$

It is straightforward to recognize that equation (51) is the action of the corresponding two-variable stochastic process:

$$\dot x = -h_x + \lambda y + \sqrt{2D_x}\,\phi_x, \qquad \dot y = -\gamma y + \mu x + \sqrt{2D_y}\,\phi_y, \tag{52}$$

for the particular choice of the initial condition $y(t_0) = y_0$, with $y_0$ following the Gaussian distribution $P_0$. This result shows how the path probability distribution of the model (46) is essentially given by a marginalization of the corresponding Markovian one. From such an identification it is straightforward to explain the results shown in the previous sections.

### 3.3 General consequences of projections on entropy production

If we denote by $\langle\cdot\rangle_x$ the average over the paths of the model (46) and by $\langle\cdot\rangle_{x,y}$ the average in the equivalent model including the auxiliary variable, one has, for an observable $O$ which depends only on $x$,

$$\langle O\rangle_x = \int\mathcal{D}x\, P(x)\,O(x) = \int\mathcal{D}x\,\mathcal{D}y\; e^{S(x,y)}\,O(x) = \langle O\rangle_{x,y}. \tag{53}$$

Relation (53) shows how each observable of the variable $x$ takes the same value when computed in the two models.

On the contrary

$$\int\mathcal{D}x\,\mathcal{D}y\,P(x,y)\log\left[\frac{\int\mathcal{D}y\,P(x,y)}{\int\mathcal{D}y\,P_1(x,y)}\right] \neq \int\mathcal{D}x\,\mathcal{D}y\,P(x,y)\log\left[\frac{P(x,y)}{P_1(x,y)}\right], \tag{54}$$

where we have denoted by $P_1$ the probability of the inverted trajectory.

As a consequence, $\langle W\rangle_x \neq \langle W\rangle_{x,y}$. This fact explains the difference observed. Moreover, it is simple to observe that

$$\langle W\rangle_{x,y} - \langle W\rangle_x = \int\mathcal{D}x\,\mathcal{D}y\,P(x,y)\log\frac{P(x,y)}{P_1(y|x)P(x)} \geq 0, \tag{55}$$

where the last inequality is a straightforward application of the properties of the Kullback-Leibler relative entropy, which is always non-negative [34]. Then this projection mechanism, in general, has the effect of reducing the entropy production. The equality is satisfied if

$$P_1(y|x) = P(y|x). \tag{56}$$

The physical meaning of (56) is clear: it represents a sort of "reduced" detailed balance condition, which must hold for the variables one wants to remove from the description.
If one removes from the description variables which are in equilibrium with respect to those which remain, the procedure will not affect the entropy production. It is simple to note that this condition is not valid, in general, for the model (4), once one decides to project away the variable $x_2$. From this point of view it is also possible to understand why the projection mechanism is not dangerous when the time-scales are well separated. Let us consider, for instance, the system in figure 1. In the limit $\tau_2/\tau_1 \to \infty$, the particle $x_2$ can be seen as blocked. Therefore the particle $x_1$ is in equilibrium with respect to the system "thermostat 1 + blocked particle $x_2$", and Eq. (56) is valid for every value of the remaining parameters.

## 4 Conclusions and perspectives

The linear equations (1) constitute a simplified model of a more complex, and perhaps more realistic, system with many degrees of freedom: such a system is made of two sub-systems, say $S_1$ and $S_2$, made of, respectively, $N_1$ and $N_2$ degrees of freedom, with $N_1, N_2 \gg 1$. The degrees of freedom of sub-system $S_i$ are coupled to a thermostat at temperature $T_i$ and are immersed in an external confining potential, assumed harmonic for simplicity. Furthermore, the degrees of freedom of each sub-system interact among themselves by intermolecular potentials which are, in general, not harmonic. In each sub-system there is also a probe with position $x_i$ and momentum $p_i$, with mass much larger than all the others in the same sub-system: such a condition on the masses of the probes is sufficient to expect a linear Langevin-like dynamics for this degree of freedom, where the (non-linear) interaction with all other molecules is represented by an uncorrelated noise, while a linear velocity drag is due to collisional relaxation, and of course the external harmonic potential is still present, reproducing the situation of Figure 1 and Eq. (2). Finally, these two "slow" degrees of freedom (slow with respect to the faster and lighter molecules) are coupled one to the other by some potential. This coupling is the only connection between systems $S_1$ and $S_2$.

In the absence of the coupling between the probes, the two systems remain separated and each one thermalizes to its own thermostat. When the coupling is present, the whole system has the possibility to relax toward an overall equilibrium, but this is prevented by the presence of the two thermostats, which are ideally infinite and never change their own temperature. The result is a non-equilibrium steady state where energy is continuously transferred, on average, from the hot to the cold reservoir. Such a situation is quite simple, but the nature of the coupling may pose some ambiguities when the system is represented by the simplified two-variable model. Indeed, the above picture holds even if the coupling potential is harmonic: however, in the harmonic case the modes at different frequencies $\omega$ will be decoupled. So, what is driving the system toward equilibrium, i.e. exchanging heat or producing entropy? In the harmonic case, the only channel for heat to flow is the one connecting $x_1(\omega)$ to $x_2(\omega)$ with the same $\omega$: the two components of the same mode are at different temperatures and can exchange heat. In summary, each mode has its own channel, which is separated from the others. When the two-variable model is reduced to the one-variable model with memory, the information about this channel is completely lost, because the two thermostats are reduced to only one. Each "cycle" at frequency $\omega$, which behaves as a loop with a given current, is flattened to a harmonic oscillator with zero net current. The only remaining entropy production belongs to the exchange between different modes. In this sense the single-variable model does not faithfully reproduce the full entropy production of the whole system.
On the other side, if some non-linearities are present, there are other "channels" of thermalization, due to the coupling between different modes, even of the same variable: such channels are still active after the projection onto the single variable, and they continue to contribute (maybe not exactly with the same average value) to a non-zero entropy production. In D we discuss an example where two "channels" for entropy production are present (unbalance of temperatures and an external force), and their different fates after a reduction of the description are discussed.

This energy-passing mechanism is evidently provided by the correlations between different degrees of freedom. Such a role is crucial in two respects:

• The response of the system to an impulsive perturbation is a combination of the form $a\,\dot C_{11}(t) + b\,\dot C_{12}(t)$, where $a$ and $b$ are some constants. As expected, in the equilibrium limit $T_1 \to T_2$, $b \to 0$ and the usual fluctuation-response relation holds. On the contrary, when more than one thermostat is present, a coupling between different degrees of freedom emerges, "breaking" the usual form of the response relation.

• The entropy production rate can be calculated by using the Onsager-Machlup formalism. Also in this case, the rate is proportional to the cross-correlations, with a pre-factor depending on the two temperatures $T_1$ and $T_2$, and vanishing in the limit $T_1 \to T_2$.

These conclusions are not specific to the "two variables" model (1). As mentioned before, other variables and some non-linearities can be inserted, and the same description remains valid.

## Appendix A Generalized Langevin equations and non-equilibrium issues

In this Appendix we study linear response and entropy production for a particular generalized Langevin equation. Part of the results presented here have been obtained, in similar or different ways, in [12].

### A.1 Set up

Consider the following simple one-dimensional Langevin equation

$$\dot x = -h[x] + \eta, \tag{57}$$

where $\eta$ is a Gaussian noise of zero mean and correlation

$$\langle\eta(t)\eta(t')\rangle = \nu(t-t'), \tag{58}$$

with $\nu(t) = \nu(-t)$. The force term contains a part local in time, denoted $h_x$, and a linear memory term,

$$h[x(t)] = h_x[x(t)] - \int_{-\infty}^t dt'\, g(t-t')x(t'). \tag{59}$$

Both $h_x$ and $g$ are left unspecified.

We are also interested in the stationary regime, so we send the initial time to $-\infty$ and the final one to $+\infty$. Under this assumption the probability of a trajectory generated by the Langevin equation (57) is

$$P\{x\} \propto \exp\left\{-\frac{1}{2}\int_{-\infty}^{+\infty}dt\,dt'\left[\dot x(t) + h[x(t)]\right]\nu^{-1}(t-t')\left[\dot x(t') + h[x(t')]\right]\right\}, \tag{60}$$

where $\nu^{-1}$ is the inverse kernel of $\nu$, defined by

$$\int_{-\infty}^{+\infty}ds\,\nu(t-s)\,\nu^{-1}(s-t') = \int_{-\infty}^{+\infty}ds\,\nu^{-1}(t-s)\,\nu(s-t') = \delta(t-t'). \tag{61}$$

By going to Fourier space,

$$x(t) = \int_{-\infty}^{+\infty}\frac{d\omega}{2\pi}\,e^{-i\omega t}x(\omega) \quad\longleftrightarrow\quad x(\omega) = \int_{-\infty}^{+\infty}dt\; e^{i\omega t}x(t), \tag{62}$$

the probability (60) becomes

$$P\{x\} \propto \exp\left\{-\frac{1}{2}\int_{-\infty}^{+\infty}\frac{d\omega}{2\pi}\left[-i\omega x(\omega) + h(\omega)\right]\nu(\omega)^{-1}\left[i\omega x(-\omega) + h(-\omega)\right]\right\}, \tag{63}$$

where $\nu(\omega)$ is the Fourier transform of $\nu(t)$, with $\nu(\omega) = \nu(-\omega)$, and

$$h(\omega) = \int_{-\infty}^{+\infty}dt\; e^{i\omega t}h[x(t)]. \tag{64}$$

### A.2 Entropy production

Consider now the reversed trajectory $\bar x(t) = x(-t)$. Its probability follows from (63) by noticing that $\bar x(\omega) = x(-\omega)$. To compute the ratio between the probability of a trajectory and that of its reversed, we then have to separate the terms even and odd under the replacement $\omega \to -\omega$ in (63). To this end we have to look closer at $h(\omega)$.

From its definition we have

$$h(\omega) = h_x(\omega) - \int_{-\infty}^{+\infty}dt\; e^{i\omega t}\int_{-\infty}^t dt'\, g(t-t')x(t'). \tag{65}$$

Now

$$\int_{-\infty}^t dt'\, g(t-t')x(t') = \int_{-\infty}^{+\infty}\frac{d\omega}{2\pi}\,x(\omega)\int_{-\infty}^t dt'\, e^{-i\omega t'}g(t-t') = \int_{-\infty}^{+\infty}\frac{d\omega}{2\pi}\,e^{-i\omega t}x(\omega)\int_0^{\infty}dt'\, e^{i\omega t'}g(t'), \tag{66}$$

so that

$$h(\omega) = h_x(\omega) - g(\omega)x(\omega), \tag{67}$$

with

$$g(\omega) = \int_0^{\infty}dt\; e^{i\omega t}g(t) = \int_0^{\infty}dt\,\cos(\omega t)\,g(t) + i\int_0^{\infty}dt\,\sin(\omega t)\,g(t) = \phi(\omega) + i\omega\psi(\omega), \tag{68}$$

where

$$\phi(\omega) = \int_0^{\infty}dt\,\cos(\omega t)\,g(t), \tag{69}$$
$$\psi(\omega) = \int_0^{\infty}dt\,\frac{\sin(\omega t)}{\omega}\,g(t) \tag{70}$$

are real, even functions of $\omega$. Collecting all terms we have

$$h(\omega) = h_x(\omega) - \phi(\omega)x(\omega) - i\omega\psi(\omega)x(\omega), \tag{71}$$

and (63) takes the form

$$P\{x\} \propto \exp\left\{-\frac{1}{2}\int_{-\infty}^{+\infty}\frac{d\omega}{2\pi}\left[-i\omega\tilde x(\omega) + \tilde h_x(\omega)\right]\nu(\omega)^{-1}\left[i\omega\tilde x(-\omega) + \tilde h_x(-\omega)\right]\right\} \tag{72}$$
$$\phantom{P\{x\}} \propto \exp\left\{-\frac{1}{2}\int_{-\infty}^{+\infty}\frac{d\omega}{2\pi}\left[\omega^2\tilde x(\omega)\tilde x(-\omega) + \tilde h_x(\omega)\tilde h_x(-\omega)\right]\nu(\omega)^{-1} + \cdots\right.$$