# Thermodynamics of Maximum Transition Entropy for Quantum Assemblies

## Abstract

This work presents a general unifying theoretical framework for quantum non-equilibrium systems. It is based on a re-statement of the dynamical problem as one of inferring the distribution of collision events that move a system toward thermal equilibrium from an arbitrary starting distribution. Using a form based on maximum entropy for this transition distribution leads to a statistical description of open quantum systems with strong parallels to the conventional, maximum-entropy, equilibrium thermostatics. A precise form of the second law of thermodynamics can be stated for this dynamics at every time-point in a trajectory. Numerical results are presented for low-dimensional systems interacting with cavity fields. The dynamics and stationary state are compared to a reference model of a weakly coupled oscillator plus cavity supersystem thermostatted by periodic partial measurements. Despite the absence of an explicit cavity in the present model of open quantum dynamics, both the relaxation rates and stationary state properties closely match the reference. Additionally, the time-course of energy exchange and entropy increase is given throughout an entire measurement process for a single spin system. The results show the process to be capable of initially absorbing heat when starting from a superposition state, but not from an isotropic distribution. Based on these results, it is argued that logical inference in the presence of environmental noise is sufficient to resolve the paradox of wavefunction collapse.

## I Introduction

Few topics draw more interest and debate than the relation between quantum mechanics, the measurement problem, and information theory. The problem of sewing these together into a coherent description of a causal or probabilistic universe lies at the foundations of physical law.

These long-standing topics are receiving renewed attention due to the increasing precision of experiments on trapped quantum systems(1); (2); (3); (4); (5); (6) and the corresponding ability to probe the nature of decoherence processes.(7); (8); (9) Parallel developments in nonequilibrium statistical mechanics of open systems are building an increasingly powerful toolset for predicting their often non-intuitive behavior.(10); (11); (12) Important studied applications include spectroscopy(13); (14), quantum computing,(15) reaction rate theory,(16) mixed quantum-classical molecular dynamics(17), and superconducting nanocircuits.(18)

Recently, a general unifying theoretical framework for non-equilibrium thermodynamics was proposed for classical systems.(19) Such a framework has been a long sought goal for the nonequilibrium community.(20); (21) It takes the form of a maximum entropy distribution over each transition step between the current and future positions and momenta,

(1) |

The two constraints in the exponent are the average deviation of the square of the action functional, and the average change in total energy. Their Lagrange multipliers are , one fourth the inverse diffusion constant, and , half the inverse thermal energy. When the Hamiltonian is in the form, (), Eq. 1 delivers the usual Langevin picture of molecular kinetics, where momentum updates are given by . The kinetic partition function, , is a moment generating function that contains all the information about force/flux relationships. It makes an appearance in both the nonlinear fluctuation-dissipation and entropy production relations.(22); (19) It also provides a physical interpretation of the Lagrange multipliers (collision rate, temperature, etc.) as constant forces responsible for bringing about equilibrium through noisy energy exchange with external systems. Equilibrium properties like temperature, pressure, or chemical potential are only defined for the central system through the action of an external thermostat able to exchange energy, volume, or molecules – a consistent interpretation of temperature out of equilibrium.(23); (24)

Importantly, this work established that dynamic processes can serve as the foundation for statistical mechanics. In contrast to the fluctuation theorems, it does not require Boltzmann-Gibbs equilibrium distributions (or even stationary states) for initial and final densities. Rather, the dynamics is determined directly from maximum transition entropy. Apart from being the stationary state of the dynamics, the Boltzmann distribution plays no special role. The log-ratio of forward and backward (inference) probabilities,

(2) |

determines the ‘amount of irreversibility’ for the entire process. It consists of changes in the information entropy of the system plus heat rejected to the environment. The reverse process is determined by attempting to infer the starting distribution from the final one – a process subtly different from time reversal. Most notably, since is a Lagrange multiplier favoring transitions with smaller energy exchange, , this changes sign during inference to find starting states with larger energy.

Based on the success of this work, it is reasonable to ask whether these results can be extended to describe quantum dynamics. The result would be a theory that directly models the statistics of transitions between quantum states, deriving thermodynamics as a consequence. One major obstacle stands in the way of such an extension. The quantum measurement problem tangles up position and momenta, obscuring the classical picture of phase space. This means that it is not possible to define a single path or trajectory which a wavefunction undergoes to transition from one time point to the next. This problem manifests in a variety of ways, including very high frequency oscillations in the propagator for the Wigner distribution and wavefunction collapse in matrix mechanics.

This paper presents a novel statement of a transition probability distribution over quantum states. It retains much of the structure of the classical nonequilibrium theory referred to above. The distribution reduces to ordinary quantum dynamics when a collision rate parameter is set to zero. At nonzero rate parameter, the transition probability contains an exponential bias toward trajectories representing exchange of energy with an external thermostatic reservoir. Additionally, a normalization constant for each step of the dynamics acts as a moment generating function for the energy fluxes, and a measure of irreversibility over a single timestep can be directly interpreted in terms of heat and work.

In what follows, we will prove the unique properties of this approach, and analyze several of its consequences. Section II provides logical consistency arguments that eliminate alternative possible transition probability assignments. Section II.3 then introduces a novel quantum Andersen thermostat acting on an explicit model of the environment in an optical cavity experiment. Section III goes over non-standard mathematical notation and properties of transition superoperators used here. Results in section IV compare the simulation of our physically motivated reference system (containing an explicit environment) with the maximum transition entropy distribution (which contains no explicit environment). Section IV.2.3 provides a detailed picture of energy and entropy balance during a wavefunction collapse process which could occur in a noisy interferometry experiment. The conclusion discusses interpretation questions and future extensions and applications of the results. The appendix contains a detailed proof that the maximum transition entropy approach reduces to the density-matrix master equation of Caldeira and Leggett in the high-temperature limit.

## Ii Quantum Transition Processes

Exact quantum dynamics follows unitary evolution,

or, for the density matrix ,

The unitarity of means that both angles and distances between alternate state vectors, , remain unaltered during time evolution. This also leads the eigenvalues of to remain constant, so that entropy is a constant of the dynamics.

A modification introduced by Jaynes(25) was to consider multiple alternative, unitary time evolution operations,

(3) |

The final state density matrix then becomes a mixture of outcomes due to the unknown Hamiltonian that was applied to evolve the system in time. Unfortunately, this mixing picture was left incomplete, since it could not account for the temperature of the external system, and thus to lead to the thermal distribution.

For defining , we can represent the effect of the external system as a collection of random impulsive forces. Each will cause the central system to undergo a transition (in the position representation)

In the continuous limit, the index goes over into . These transitions are the short-time limit for any coupling Hamiltonian linear in the coordinate, . Of course, other forms of environmental noise (e.g. periodic or spatially localized functions) are possible.

Allowing for a quantum source of random impulsive forces, the most general transition operator will be the superposition,

Measuring the amount of momentum actually exchanged will collapse the combined system onto a definite value of , so that we need only consider noisy transformations of the form (Eq. 3),

(4) |

Jaynes analyzed this propagator in detail for the case when (equivalently ) does not depend on or .(25) We will term this case the “unweighted” distribution. One central result is the subjective theorem,

(5) |

For the increase in the von-Neumann entropy,(26); (12) , of the density matrix, , propagated from time to time using Eq. 3. The physical content of the theory is that the classical uncertainty in the random noise applied during time propagation serves as an upper bound to the increase in the quantum entropy of the system, . In addition, Jaynes showed that the unweighted process always contains the uniform (infinite temperature) distribution as a stationary state.

The first central question in the theory of open quantum systems is, “How do we assign probabilities to the collision events?” Fermi’s golden rule is the traditional answer, but depends on assuming a directionality and time-dependence for the external energy. In a real interaction process, however, more collisions will occur when the central system is moving with larger velocity. In addition, the energy of the external system will decrease as the collision proceeds. The system’s energy change due to the impulse will then depend on the system’s initial momentum and the external force correlation time.

The classical nonequilibrium theory(24) modeled these random collisions by a maximum entropy distribution for , conditional on the average impulse size, and the energy change, . Correlation (not considered in this work for clarity) was also added by constraining the autocorrelation function, .

This lead to the unnormalized distribution (Eq. 1 in an explicit form, where ),

This result is easily shown to be equivalent to the Langevin equation:

In the quantum case, the dynamics are realized in terms of density matrices rather than operators. To find an analogous result, the next subsections develop this analysis for several possible exponential biasing forms.

Once a probability scheme is chosen, one encounters the second central question of open quantum systems, “Should time evolution be described by statistics of multiple individual wavefunctions or by the statistics of the time-evolution operator?” This work refrains from considering statistics among alternate density matrices or among non-orthogonal sets of wavefunctions.

In principle, there are two different types of prior information for assigning probabilities to alternate transitions – creating weights that rely on complete knowledge of the system wavefunction, , or incorporation of a weighting into the transition operators themselves. These are dual to one another in the same sense that the Schrödinger picture is dual to the Heisenberg one. Both lead to a scheme for moving from a starting pure wavefunction to a density matrix a short instant later. The first approach leads to transition weights, , that depend exponentially on . Sec. II.1 shows this choice to be inadmissible based on the known properties of the measurement process. The second method fixes this defect by working in superoperator space. Here, it is possible to average over all propagators, , treating them as members of a linear set of dimension (Eq. 15). Weighting the result by performing a similarity transformation is equivalent to weighting each propagator, and results in a propagator that is linear in except for normalization.

### ii.1 Maximum Entropy conditional on

Since each possible time evolution operation () influences the energy of the isolated system, we can choose in the quantum evolution operation (Eq. 3) to be consistent with energy conservation. For a prescribed average energy change, the maximum entropy form is,

(6) |

The probability distribution over is therefore conditional on the starting state, . This conditioning reduces the likelihood of interactions with the external environment that raise the energy of the system.

Using random impulsive with this probability assignment, the wavefunction continually diffuses through Hilbert space. Despite the similarity to quantum state diffusion,(27); (28); (29) it is not clear whether any set of evolution operators and constraints will bring the two into the same form.

The analysis of Eq. 6 and the associated “kinetic” partition function,

are relatively straightforward, except that the probability distribution is over the random perturbation rather than over the physical system of interest.

Since the propagator will preferentially weight certain outcomes over others, it implicitly gives a model for wavefunction collapse. Measurement happens when one of the set of possible outcomes at time is selected by the environmental noise process.

To be consistent with observation, however, the collapse should be completely described by the density matrix at time . Extra information is not allowed because individual wavefunctions do not represent mutually exclusive events.(30) Rather, any two wavefunctions, say with and with , span a 2-dimensional subspace where the probability of any measurement can be represented by arbitrary linear combinations(25)

However, the choice of Eq. 6 leads to a change in the final density matrix () when this reparameterization is done. This dependence would make it experimentally possible to distinguish different combinations of wavefunctions making up the density matrix (i.e. ). However, this is not observed in practice. Therefore is unsuitable for use as prior information to directly determine the distribution of .

We note that quantum state diffusion does not seem to have this drawback, since the form of the dynamics is such that the density matrix found by averaging at each timestep conforms to the master equation in Lindblad form.(28); (31)

In addition to this problem, note that there is also a loss of compactness in describing a weighting over transition processes, . For example, two distinct propagators, and , could lead to the same final . It would then be most economical to collapse these two propagators into a single logical possibility, reducing the upper bound of the subjective H theorem (5). This can be done using some linear algebra to state the transition probability in a way that does not depend on the (arbitrary) space of random propagators, . The bare minimum description of the transition probability for a wavefunction is given by the 4-index superoperator(25); (12) (Sec. III),

(7) |

Expressions built in terms of this operator will then depend only on the actual action of the propagators on the system, rather than the abstract space of .

### ii.2 Energy-Weighted Propagator

This section examines a probability assignment based directly on a weighting of the superoperator, , toward transitions representing larger amounts of energy loss from the system. First, re-express the transformation (Eq. 7) in the energy basis as . Next, apply an exponential bias toward propagators with larger amounts of net energy loss. This leads to a re-weighting,

(8) |

The weights connecting transitions between energy eigenstates are of maximum entropy form. It is difficult to prove a more general statement about the entropy over transition superoperators. Since a more general extremum principle has not yet been found, this approach will be termed the energy-weighted propagator in the analysis below.

When written in an arbitrary basis, the bias on the energy change (Eq. 8) has the effect of transforming each of the unitary operators of Eq. 3 into . This is equivalent to transforming the canonical propagators derived by an eigenvalue decomposition of . For proofs and related notation, see Sec. III. Finally, we arrive at a prescription for adding a bias accounting for energy conservation to an initial stochastic transition superoperator, ,

(9) |

The description of the dynamics then becomes something like

However, is not necessarily normalized. Instead, the overall normalization can be expressed as a matrix operator,

(10) |

In principle, there are three options for normalizing . The simplest is applying a single scaling constant to the whole input density matrix, . Failing that, the density matrix could be scaled before applying to give a linear superoperator,

(11) |

As a final alternative, the starting density matrix can be normalized by decomposition into a set of pure states, , each of which is given an individual normalization constant,

(12) |

The first choice has nice mathematical properties, but is physically inadmissible. It shows non-causal behavior by combining otherwise independent parts of a statistical mixture of states. Specifically, the single normalization constant takes contributions from all starting states in a mixture,

This operator does not make a suitable transition distribution, since summing over final states does not give back the starting probabilities, . Consequently, it cannot reproduce the starting density matrix, and will not be considered further.

The second choice defines a linear superoperator that makes a more suitable transition distribution, but does not act like a moment generating function as we would expect from maximum-entropy statistical mechanics. In particular, the first derivative of the matrix logarithm does not give the energy change over a time-step unless has a special form (commutes either with the energy, , or the density matrix, ).

This equation is proved by noting the derivative, is zero and using . Also, the starting density matrix is not recovered using the naïve definition,

That leads instead to . Although it will not be considered in this work, numerical experiments (not shown) find that this operator behaves much like , when the noise level in is much smaller.

The third choice propagates orthogonal wavefunctions belonging to the initial density matrix separately. This leads to a proper causal picture(25) where a single pure state chosen from a statistical mixture at each point in time is responsible for the system’s future behavior. It reduces to the second choice, , when commutes with .

It is also simple to check that propagating the mixture of Eq. II.1 gives an answer independent from ,

Interestingly, this means that, although must be specialized for , the choice of propagating only the eigenstates of in Eq. 12 is not unique. The same final density matrix, , would result from propagating any non-orthogonal weighted set of starting states which combine to give (an array in the terminology of Ref. (25)).

#### Transition Free Energy and Entropy

The remainder of the paper introduces a reference process and proceeds to analyze the qualitative and analytical mathematical properties of the third choice, . In particular, we find that a kinetic free energy functional,

(13) |

(where was defined in Eq. 10) has properties consistent with maximum entropy thermodynamics.

A nonnegative relative information entropy can be defined for each transition,

(14) |

By the Klein inequality, this is necessarily positive when is any normalized, completely positive 4-index tensor.

Conventionally, is chosen to be a “reverse” transformation from the final density to an inferred joint density matrix – so that is a measure of the irreversibility of the process during time .(32); (33) This choice appeals to information theory on the grounds that, after discarding in favor of the updated , presents a “best guess” as to the 2-time probability density of the system.(19) Since is the information lost during the update step, it must obviously go to zero for deterministic evolution and depend on the ratio of forward to reverse probabilities.

Physically, Eq. 14 should be derived by considering the trade-off between spontaneous decrease in the system’s information entropy and heat released into the environment.(24) In this case, the reason for using a reverse transformation is so that depends on temperature like . The difference, , then contains a heat release term (negative of the system’s energy gain ) that counter-balances increase or decrease of the system’s instantaneous information entropy, . This should be compared to the classical result (Eq. 2), where, for quantum systems,

Re-stated in terms of the system itself, the system cannot store thermal energy without increasing its information entropy.

A straightforward choice based on these considerations is,

and consequently

Note that the direction of time plays no special role other than swapping the application order of – matching the final indices so it encounters . The results in Sec. IV.2.3 show that this definition is most closely related to the information loss interpretation, since it decomposes approximately as,

### ii.3 Reference Process

To illustrate the features of the energy-weighted propagator, it is helpful to compare against a reference process. Many recent experiments on molecular beams and trapped atomic systems have been carried out to probe the dynamic properties of wavefunction collapse.(34); (3); (7); (8) Although experiment makes the best reference in principle, direct comparison is limited by uncertainties in the source and strength of environmental noise. These differences, reflected in alternative choices for , make noticeable differences as the noise strength increases.(7) In addition, the Hamiltonians for most experimental setups introduce additional complexities which are not essential for understanding the theory.

We therefore consider a physically motivated theoretical reference process for environment-induced decoherence and dissipation. This provides a strong intuition for the role of energy-weighting in the stochastic propagator of the density matrix.

The simplified reference system consists of two resonant harmonic oscillators.(35) Oscillator represents the system of interest (e.g. the atom in a Paul trap). Oscillator represents a single electromagnetic mode of an enclosing cavity. Spin is ignored, and the two are simply coupled by an energy term, . The Hamiltonian is thus:

where are the lowering operators for the uncoupled harmonic oscillators. Similar models using a harmonic oscillator as the environment and a 2-level atom as the system of interest have also been investigated in the context of deriving fluctuation theorems for exact dynamics.(36); (37) Further elaborations on this model to include the atomic spin and the direction and polarization of the field(38); (39); (40) could be used for comparison to experiment.(7)

The environment in our reference model has the responsibility for introducing both decoherence and energy dissipation. To model this, we propose a physically motivated thermostatting scheme. It is based on the Andersen thermostat for molecular dynamics. There, the velocity of a particle is periodically replaced with a random sample from the Boltzmann distribution.(41) By analogy, this scheme periodically replaces the cavity state with a pure energy level sampled from the Boltzmann distribution.

The process can be visualized as in Fig. 1. The joint wavefunction is represented in the uncoupled energy space as

The panels of Fig. 1 show probability densities for these joint states, . Starting from a state where both energy levels are known in 1a, the joint system undergoes deterministic evolution according to

which introduces a system-field superposition (1b-c). Measurement of the cavity mode (1c) collapses the system into one of the pure cavity states, , chosen in the usual way with probability . Finally, the cavity mode is replaced (1d) with a new state sampled from the thermal distribution,

There is a distinction in this process between decoherence and dissipation. Decoherence occurs when the environmental state is measured, since a state containing all the information about the the system-cavity superposition is projected onto a single cavity mode. This step usually introduces small increases in the energy of the joint system because the coupling, has only off-diagonal terms in – which are removed by the projection. However, replacement of the cavity mode with a thermal sample accomplishes energy exchange – heating or cooling the system until thermal equilibrium is obtained.

#### Deterministic Limit

The measurement process described above incessantly projects the environment into a pure state. This type of measurement traces over the environmental degrees of freedom, so that at each measurement point the environment is seen to follow a deterministic path without superpositions. Starting with random off-diagonal matrix elements for the environment will have the same effect – canceling paths that begin in a superposition state.

If the trace (measurement) process is made continuous, the environment follows a definite trajectory, and is only felt as a deterministic external force – replacing with . This is the quasi-classical model used for many time-dependent quantum processes, where the Hamiltonian depends predictably on time through .

## Iii Methods - Superoperator Notation

For analyzing the re-weighting of Eq. 8, we introduce the following notation for density matrix superoperators. The canonical form for a superoperator is(26)

(15) |

It is a 4-index quantity. Explicitly, . Unless otherwise noted, each index (i,j,etc.) ranges over integer energy levels, . Superoperators appropriate for the density matrix can all be written in the form of Eq. 15 by virtue of the (super-Hermitian) symmetry property . This property also guarantees a spectral decomposition into matrices, , which are mutually orthogonal – as defined by .

We define the inner and outer trace operations,

Superoperators are applied to matrices by a substitution pattern,

and are composed following the same pattern,

We also define a simple algebraic notation for multiplication,

so that

Note the difference in application order between composition and multiplication. The multiplication operation makes certain manipulations clearer. In this notation, the Choi-Jamiolkowski isomorphism is simply obtained by writing the identity for a complete basis,

Writing an initial transition tensor explicitly in the energy basis, ,

We define the direct product of a superoperator and a density matrix, , as the projection onto the eigenspace of the density matrix (),

so that . This operation effectively specializes a superoperator to be consistent with its operand, . The choice to project or not does not affect the discussion in Sec. II leading to .

The numerical results presented here made use of tabulated tensors calculated by numerically integrating Eq. 4 for using a 20-point Gauss-Hermite quadrature rule and an exact matrix exponential computed through eigenvalue decomposition.

## Iv Results

The first role of the environment is to provide a mechanism for decoherence by introducing random noise into the time evolution. This section presents results from the reference process, which show the importance of the total energy in determining intermediate states just before collapse.

### iv.1 Dynamics of the Reference Process

The reference process makes quantum jumps through periodically projecting the cavity into a Fock state. Although this always leaves the central system in a superposition state, Fig. 2 shows that the probabilities for each system state still appear to follow a jump process. The resulting trajectory is very reminiscent of fluorescence measurements of single-atom excited state probabilities.(42)

We find numerically that the Andersen-type thermostat scheme employed for the reference process leaves the system very nearly in the equilibrium state of the coupled system. However, when the coupling strength is large, projecting out the cavity from the combined equilibrium density matrix gives a system density that can deviate significantly from the Boltzmann distribution (Fig. 4). Practically, this means that (for physically correct reasons), the thermostat does not work at strong coupling and only yields the exact Boltzmann distribution in the zero-coupling limit. We shall find similar behavior for the maximum-entropy jump process.

We first determine the dependence of the decorrelation time and energy dissipation in the reference process as a function of the coupling strength, , and the time between interactions, . Fig. 2 shows the average system energy level vs. time over 1000 stochastic trajectories, each starting from the ground state. The index labels the system energy level, projected from the complete system plus cavity state. Each parameter set was run for a length of 1500, using energy levels for each system and a numerical matrix exponent between each reset.(43) The time-course of fits well to an exponential with relaxation rate that increases approximately as .

We have verified numerically that the relaxation rate depends on the square of the coupling strength, , as predicted by perturbation theory.(10) Finally, a dependence on temperature will come from the dependence of the jump probability on energy level. Since the jump probability between and states by perturbation theory is proportional to , we might expect quadratic dependence on . However, at high temperature the jump probability empirically shows a linear, rather than quadratic dependence on . Fitting the observed relaxation rates, , using this scaling resulted in reasonable agreement for the constants,

(16) |

### iv.2 Properties of the Energy-Weighted Propagator

This section begins by showing that the qualitative dynamics and approach to equilibrium of the energy weighted propagator is very similar to the reference process above. Next, simple examples of the kinetic partition function are given. The section concludes by showing the relation between work, heat and entropy production.

Fig. 3 shows the time-dependence of a single-harmonic oscillator simulated with the maximum-entropy matrix propagator derived in Sec. II.2. The variance of the random impulsive forces was chosen to give the relaxation rate corresponding to the reference process (using Eq. 16). The variance of momentum jumps is related to the diffusion constant of Langevin dynamics as .(24) As should be expected, the energy relaxation rate was found to be linear in the diffusion constant, at per step.

The dynamics of the weighted propagator appears qualitatively similar to the reference process. Fig. 3a shows jump behavior that indicates the propagator induces localization of the wavefunction in energy space. The transitions are slightly more random, as might be expected from a maximum uncertainty model for interaction with the environment. The average probabilities in panel 3b (dashed lines) fit the reference process (solid lines) noticeably better than a simple exponential – compare to the green line in Fig. 2. We stress that the only fitting parameter is the diffusion constant, which was chosen so that the relaxation times agree between the two processes.

#### Stationary Density Matrix

The propagator admits the Boltzmann-Gibbs distribution with as a stationary state of the dynamics. This is shown since , and . The unweighted propagator always preserves the infinite temperature distribution, since it is an average over unitary operations (Eq. 3).

The propagator (Eq. 12) would not normally be expected to exhibit a Boltzmann-Gibbs stationary distribution. However, empirically the stationary state of at small is very close to the Boltzmann-Gibbs distribution with (Fig. 4). At large , the stationary state becomes more like the Boltzmann-Gibbs distribution with for .

The stationary distributions in Fig. 4 were found by repeated eigenvalue analysis on the re-normalized propagator, . Convergence was achieved when the eigenvectors of rotated with . The stationary state shows two regimes. The high-temperature, low diffusion constant limit gives a density matrix approximately proportional to . The low-temperature, high diffusion constant limit gives instead the density matrix . This behavior is evident at the break in the slope of probability vs. .

This behavior can be justified when there are enough energy levels in the system such that , with decaying with increasing energy difference, . In this case, the moment generating functions,

are nearly independent of the starting state, , for small . If they are independent, the Boltzmann-Gibbs distribution obeys the detailed balance condition,

so the equilibrium probability of energy level is .

As the coupling strength () is increased, the moment generating functions tend to their extreme value, . Detailed balance then gives . The probability of jumping to the lowest energy level, , depends on the collision rate parameter. As the rate parameter decreases, the cross-over point where the effective temperature of the stationary state doubles (from ) is delayed to ever lower temperatures. This behavior is seen clearly in the numerical results.

The upper left panel of Fig. 4 shows that an analogous effect happens in the reference model. There, the thermal state with the coupled system-cavity Hamiltonian depends on the temperature and coupling strength. When the coupling strength is small, tracing over the environment gives very nearly the Boltzmann-Gibbs density matrix for the system. As the coupling strength increases, the system’s projected density matrix plateaus to a constant, minimum, temperature at smaller . This may explain the requirement for a rescaled effective temperature in interpreting electron transport experiments probing the fluctuation theorems.(44)

The lower panels of Fig. 4 show the total magnitude of all off-diagonal elements in the density matrix as a function of temperature. Although these are on the order of (), both the reference process and the energy-weighted propagator show a maximum at intermediate temperatures. In the reference system, the off-diagonal elements are mainly two above the diagonal, and can be traced to the double-excitation process, , in the linear coupling, .

#### Moments of the Flux Distribution

The function (Eq. 13) acts as a cumulant generating function for the distribution of the energy flux. This is easy to show using successive differentiation of . From the definition of (Eq. 9),

The summation set, , contains nonnegative integers, that sum to .

The cumulants of the energy flux (derivatives of ) therefore break into sums over cumulants from each starting state, ,

(17) |

While each cumulant, is defined in the usual way in terms of order polynomial expressions in the moments,

These are a suitable definitions for the order cumulants of the distribution of .(11) When and are both energy eigenstates, it reduces to the classical statistical distribution (45).

#### Wavefunction Collapse

To illustrate the properties of the entropy, we give details on the process of wavefunction collapse with the energy-weighted propagator. This will be illustrated with a simple molecular beam experiment (Fig. 6).(46) Here a stream of atoms in a maximally entangled spin state are subjected to a random perturbing magnetic field during flight from a source to a detector. The final density matrix can be measured by counting the number of spin-up particles observed along each direction.

A uniform external magnetic field provides the energy function needed to induce wavefunction collapse. Conventionally, the field is said to ‘measure’ the particle’s spin state. Because of the field, the particles arriving in the detector are only found in one of two states - up or down along the -axis. This investigation of thermalization therefore demonstrates the hallmark of wavefunction collapse – that measurement forces the system onto the diagonal in the measurement basis.

The mathematical properties of this system are similar to recent experimental investigations on decoherence rate of quantum systems achieved using Ramsey interferometry on trapped atoms in superpositions of coherent harmonic oscillator states.(7) There, each of two states is tagged with a spin. A random displacement shifts the relative phase of the two states, resulting in decoherence that can be probed by moving the states back together spatially and then measuring the probability of up-spin along each detector angle, .

The propagator corresponding to evolution of a spin in an external magnetic field is , where represents a vector of Pauli spin matrices, and the direction is proportional to the magnetic field, . The constants stand for the appropriate gyromagnetic ratio and magneton.(47) Its operation can be computed exactly using a general formula for the exponential of a trace-free matrix,(43)

(18) |

where and is the magnitude of both (positive and negative) eigenvalues of . The action of the unitary operator can be calculated on a Hermitian matrix by expanding as , where ,

(19) | ||||

(20) |

Its average effect on the density matrix is determined by the statistics of the direction. Integration over is difficult in general, but we can derive useful results by choosing with fixed , . Averaging over a uniform distribution for gives an unnormalized propagator,

(21) | ||||

When , , so there are no random forces. In this case, the density matrix above reduces to the exact equation of motion for a particle precessing in a magnetic field along the -direction.

The temperature-dependence of the stationary state agrees with the qualitative results on the harmonic oscillator from Sec. IV.2.1. It resembles , while deviating toward larger effective temperatures for large when the noise is strong (, not shown).

Extracting the matrix elements from Eq. IV.2.3 leads to the transition tensor,

(22) | ||||

The outer indices are for , with each sub-block indexed by for .

The expectation of the energy change per unit time can be found from . For a state with , random collisions cause the energy to decrease on average,

The process of collapse happens after even a single time-step because the modeler’s state of knowledge is now described by a weighted set of possible wavefunctions, whereas the actual ensuing dynamics will be associated with only one of those choices.

Figure 7 shows the time-course of work and system entropy for the dynamics of a spin system starting from the superposition state (left panel), , and from a statistical mixture of up and down states (right panel), . Energy and information entropy differences, and at each time step are differences between thermostatic state functions, and . The total entropy increase (, Eq. 14) is a measure of irreversibility, not a state function. Due to its definition, it contains an exact contribution, . Subtracting this gives a quantity which we term , to represent the heat given off to the environment.

From the perspective of as the unique axis, both starting states look identical – giving a 50% probability of detecting the starting state as ‘up.’ However, the information entropy of the first matrix is the minimum possible (zero), while the second is the maximum possible (). As time progresses in the left panel, the entropy of the state first increases due to random collisions before finally decreasing to settle down into an equilibrium state. However, the energy continually decreases as collisions tend to rotate the spin into the direction on average. This gives an initially negative ‘heat,’ since knowledge of the pure starting state could have been used to cool a properly constructed external environment. after the cross-over region, the density matrix is compressed toward the up-state, decreasing the information entropy and requiring a release of heat into the environment.

The initial heating region is completely absent when starting from the uniform distribution in the right panel of Fig. 7. Since the system starts in the maximally uncertain state, it continually gives off heat while lowering its energy. Initially, the compression (measured by decrease in information entropy) is slow with large decreases in energy. Eventually, the system settles down into a final compression phase similar to the left panel.

The dotted lines are plotted using the path relation for the total entropy derived in Eq. 14. It can be seen that the total entropy increase per unit time is always positive, reflecting the trade-off between work and entropy. Work has to be done to compress the distribution toward the up-state. In the absence of environmental noise, superposition makes it impossible for this to happen. Here, it occurs because our probability assignment for collisions accounts for the fact that the energy of the system plus its environment is conserved. Because of this, the environment is not allowed to impart arbitrarily large amounts of energy to the system. The total entropy has to increase when work is done in this ‘thermal’ way, since the number of allowable collision events decreases with time.

## V Discussion

It is at first surprising how closely the dynamics of the energy-weighted propagator matches the model with an explicit cavity and measurement process. Both show a nearly exponential decay toward an eventual steady state. The steady-state is very nearly a Boltzmann distribution at high temperatures or weak coupling. This result should be expected to hold independently from the specifics of the environmental noise process. The nonequilibrium dynamical method here thus fulfills its goal of providing a first-principles model for the dynamic energy exchange processes which derives the equilibrium distribution. Both models also show deviations toward higher effective temperatures due to environmental coupling – an effect observed in experiments.(44)

The low-temperature behavior of the stationary state is a physically real effect. It occurs here because of the continual random influence of the environment, removing the possibility of approaching absolute zero temperature. It will have an imperceptible effect on the energy-temperature equation of state, since the behavior of the system is dominated by the ground state at these low temperatures. Unless the interaction is unreasonably large, this method still gives the probability of being in the ground state at 1 to 7 decimal places.

### v.1 Relation to Entropy and Work Fluctuation Theorems

Both the definition (Eq. 14) and its interpretation differ from the usual fluctuation theorems.(33); (11) Conventional fluctuation theorems are defined with respect to a Boltzmann-like stationary distribution and an exact, reversible, dynamical process (or a stochastic process obeying detailed balance for that stationary distribution). The transition entropy approach pursued here begins by postulating a stochastic transition process and only then determines its equilibrium and nonequilibriumn properties.

Because of this difference in origin, the entropy definition also differs from other expressions related to the second law.(32); (33); (48); (12) The origin of the usual fluctuation theorem in exact dynamics leads to vanishing irreversible heat dissipation in the log-average,

where is the Helmholtz free energy of a thermostatic equilibrium state at temperature allowed by the dynamics. Of course, the equilibrium relation between and sets , and there is no irreversible entropy production. In order to make statements about irreversibility, exact fluctuation relations have to be supplemented by a model for the thermalization of the environment. It is the entropy increase of the environment that is ultimately responsible for irreversibility derived in this way.

Several work fluctuation theorems have been adapted to the quantum mechanical setting using moment generating functions,

for the energy difference between initial and final states.(49); (11) These are closely related to the energy-weighted propagator derived here (Eq. 12). They are the natural analogues to the definition of work distributions used for deriving the classical fluctuation theorems. The twist of perspective in this work shows that their success can also be viewed as justification for postulating a statistical law of random transitions. Viewed in this way, the exponential averages are not just a calculational device for comparing to some ‘instantaneous reference distribution’ but are also useful statistical likelihoods for observing transitions in energy space. The results of this work show that these likelihoods can be relied upon to generate a physical steady-state distribution for an open quantum system – without any reference to the Boltzmann-Gibbs equilibrium.

## Vi Conclusions

This work provides a new perspective on the dynamics of open quantum systems. Three complete examples were analyzed. Understanding this perspective motivates a new statement of the central question for nonequilibrium systems: “What is the most logically consistent way to represent an experimenter’s subjective probability of the system-environment interaction when only the system’s trajectory is known?” Utilizing a model of the maximum entropy type for this interaction gives an implicit representation of the environment based only on its role as a source of noise and dissipation. In addition, the dynamics is qualitatively very similar to results with a more explicit representation of the environment.

A second question was also considered, related to the most complete description of the system in-between interactions with the environment. The firmly established measurement hypothesis of quantum theory was used to answer it. A single wavefunction is a complete description of an isolated system, but a density matrix represents the maximal combination of physical and statistical information available for measurement. Agreement with the peculiar property that individual wavefunctions do not represent mutually exclusive events was used to select an admissible form for the propagator. The conclusion was to constrain the energy change induced by the operator rather than the energy change of the wavefunction itself. Subtleties with normalization of nonlinear models based on the density matrix lead to the final form for Eq. 12.

A completely linear propagator, (Eq. 11) was mentioned as a possibility, but rejected because it did not clearly connect initial and final wavefunction states between time-steps. Although its stationary state is simpler to find numerically, preliminary results (not shown) display a minimum effective temperature. At very low temperature (high ), the stationary state probabilities start to distribute uniformly across energy levels.

The dynamics of the favored propagator, (Eq. 12), were compared to a reduced model for thermalization of a trapped atom in an optical cavity. In addition to showing similar relaxation rates and dynamical behavior, the energy-weighted propagator showed mathematical properties consistent with maximum entropy thermodynamics. Specifically, the transition free energy (Eq. 13) acts as a generating functional for the distribution of energy flux (Fig. 5).

In addition, the transition entropy (Eq. 14) describes the trade-off between cooling of the system (lowering its information entropy) and heat release into the environment. Running a collapse experiment on a spin in a pure, superposition state initially absorbs heat from the environment, whereas the same experiment on a completely random spin continually decreases entropy, requiring a continuous heat release into the environment (Fig. 7). The physical manifestation of this difference is because incoming random light will be scattered in a preferential set of directions by the known spin. The transition entropy serves as a clue to this opportunity. However, actually using this initial absorption of heat requires a way to make use of the fact that the particle begins in a known state, e.g. by capturing the scattered light. Otherwise, the known force exerted by the particle is silently converted into heat. Again, we find that entropy is a consequence of the interplay between the known and unknown.

This work presents a consistent picture of the role of the observer and the environment throughout the process of wavefunction collapse. Collapse is not possible without a noisy environment. Interacting with that environment, however, will cause further entanglement rather than collapse. Nevertheless, the interaction will remove coherences present in the system, while favoring ending states that are lower in energy. At this point, the experimenter’s state of knowledge about the system is described by a density matrix that accounts for these differences in energy. However, the actual dynamics of the system will be due only to a single wavefunction belonging to that density matrix. The most logically consistent way to guess the system’s state after some amount of interaction has taken place is to choose a wavefunction from the density matrix using the usual quantum measurement rule. Put another way, wavefunction collapse never occurs, but the experimenter’s state of knowledge becomes heavily weighted toward individual wavefunctions representing a collapsed state. In this interpretation, Schrödinger cat states are incomplete until some environmental interaction takes place.

The reduction of the environment to a statistical noise source with a bare minimum number of properties is critical to achieving this picture. Rather than starting from an exact dynamical process for the environment, it is possible to present a set of interaction Hamiltonians and derive a weighted propagator. It is immediately clear how to add time- and spatially-dependent models for the environment (e.g. correlated noise sources). Other maximum entropy-type constraints, e.g. the change in the total momentum, , can also be included in the biasing exponential along with the energy weight. The result is a greatly simplified model of the essential elements for nonequilibrium quantum systems.

## Appendix

## Appendix A Comparison with the Density Matrix Master Equation

This section shows that another well-known formalism for simulating the dynamics of open quantum systems can be derived as a low-order approximation of the energy-weighted propagator (Eq. 12).

Master equations for the density matrix are the most popular means for describing the role of environmental noise in localizing particles on a microscopic scale.(50) Caldeira and Leggett began with a description of collapse of a density toward the diagonal induced by coupling to a bath of harmonic oscillators. However, this coupling alone does not provide a suitable mechanism for energy dissipation. Later, these decoherence models were extended to add dissipation by imposing an energy distribution on the bath variables.(51); (52)

The particular high-temperature limit derived by Caldeira and Leggett for external forces linear in is,(53); (10)

(23) |

where the damping rate should be of dimension .

It will be shown that the terms of Eq. 23 are in 1:1 correspondence with those appearing in the high-temperature limit of . In this limit, and are identical.

The first step is approximating the energy weighting on . Since the random forces, , are of order , terms up to contribute,