Pure State Quantum Statistical Mechanics
The capabilities of a new approach towards the foundations of Statistical Mechanics are explored. The approach is genuinely quantum in the sense that statistical behavior is a consequence of objective quantum uncertainties due to entanglement and uncertainty relations. No additional randomness is added by hand and no assumptions about a priori probabilities are made; instead, measure concentration results are used to justify the methods of Statistical Physics. The approach explains the applicability of the microcanonical and canonical ensemble and the tendency to equilibrate in a natural way.
This work contains a pedagogical review of the existing literature and some new results, the most important of which are: i) A measure theoretic justification for the microcanonical ensemble. ii) Bounds on the subsystem equilibration time. iii) A proof that a generic weak interaction causes decoherence in the energy eigenbasis. iv) A proof of a quantum H-Theorem. v) New estimates of the average effective dimension for initial product states and states from the mean energy ensemble. vi) A proof that time and ensemble averages of observables are typically close to each other. vii) A bound on the fluctuations of the purity of a system coupled to a bath.
Master’s Thesis in Theoretical Physics. Author: Christian Gogolin (publications@cgogolin.de). Supervisors: Prof. Dr. Haye Hinrichsen, Prof. Dr. Andreas Winter. Institute: Julius-Maximilians-Universität Würzburg, Theoretische Physik III.
This work is dedicated to Kathrin and Meggy,
the two most important persons in my life.
A philosopher once said “It is necessary for the very existence of science that the same conditions always produce the same results.” Well, they do not.
Richard Feynman, The Character of Physical Law
- Introduction
- Quantum Statistical Mechanics
- 1 Setup
- 2 Ensemble averages and pure state quantum Statistical Mechanics
- 3 Average effective dimension of random pure states
- 4 Equilibration
- 5 Ergodicity
- 6 Dynamics of the state of the subsystem
- 7 Equilibration and einselection
- 8 Initial state independence and the Second Law
- Conclusions
- Appendix: Distance measures for quantum states
- Appendix: The Haar Measure
- Appendix: Levy’s lemma and its application in Quantum Mechanics
Notation guide and definitions
- Hilbert spaces
- observables and projectors
observables projectors all projectors
- quantum states
pure states mixed states reduced states/marginals time averaged/dephased states
- trace norm
- trace distance
- Hilbert space norm
- Hilbert-Schmidt norm
- operator norm of a hermitian operator
- Von Neumann entropy
- quantum mutual information between and
- effective dimension
Introduction
Despite being very well confirmed by experiments, Thermodynamics and classical Statistical Physics still lack a commonly accepted and conceptually clear foundation.
The reason for this unsatisfactory situation is that physicists have not yet succeeded in finding concise and convincing justifications for the fundamental axioms of Statistical Physics. An overview of the attempts to axiomatize Statistical Physics and Thermodynamics and to justify the axioms from classical Newtonian Mechanics and the conceptual problems with these approaches can be found for example in  and  and the references therein.
Quantum Mechanics claims to be a fundamental theory. As such it should be capable of providing us with a microscopic explanation for all phenomena we observe in macroscopic systems, including irreversible processes like thermalization. But its unitary time evolution seems to be incompatible with irreversibility, leading to an apparent contradiction between Quantum Mechanics and Thermodynamics. This apparent contradiction is part of the long standing problem of the emergence of classicality from Quantum Mechanics.
To overcome this problem many authors have suggested modifying Quantum Theory, either by adding nonlinear terms to the von Neumann equation or by postulating a periodic spontaneous collapse of the wave function. Others have considered effective, Markovian time evolutions for open quantum systems, and it has been shown that system bath models that evolve under a special form of Hamiltonian tend to evolve into states that are classical superpositions of so called pointer states, a phenomenon called environmentally induced superselection, a term due to Zurek. Depending on the author, subsets of these approaches are subsumed under the term decoherence theory [7, 5, 8, 9].
In the face of the enormous success of standard Quantum Mechanics in explaining microscopic phenomena, the additional difficulties that arise when the von Neumann equation is modified, and the existence of macroscopic quantum systems on the one hand, and the broad applicability of Statistical Mechanics and Thermodynamics on the other, we feel that neither a modification of Quantum Theory nor considerations restricted to special situations can provide a satisfactory explanation of the statistical and thermodynamic behavior of our macroscopic world. Consequently, we will seek to derive general statements independent of particular models and we will not use the Markov assumption. Furthermore, we believe that neither the assumption of ergodicity nor classical or quantum chaos are good starting points for constructing a convincing and consistent foundation for Statistical Mechanics and Thermodynamics (see for example footnotes 1 and 2 in ).
The struggle for a quantum mechanical explanation of behavior usually described by Statistical Physics dates back to the founding fathers of Quantum Theory, most notably von Neumann  and Schrödinger . Recently work on this subject was resumed and there has been remarkable success:
In [13, 10, 14, 15, 16, 17, 18] a justification for the applicability of the canonical ensemble is given that does not rely on subjective, added randomness or ensemble averages. While [10, 14, 17] make particular assumptions on the Hamiltonian and introduce the concept of temperature, and are thereby able to derive the Boltzmann distribution explicitly, the aim of [13, 15, 16] is rather to show that the reduced states of random states of large quantum systems typically look like the reduced state of the microcanonical state,  in addition uses time dependent perturbation theory. All these works are based on typicality arguments and the phenomenon of measure concentration. (Footnote 2: It is very interesting to compare these articles with the works of Jaynes [20, 21]. Although there are huge differences concerning the interpretation, the aforementioned works are methodologically very close to certain aspects of the approach of Jaynes, especially with respect to the way they make use of measure concentration arguments. It is thus surprising and unfortunate that Jaynes’ works have been completely ignored in the recent literature.)
There are some works that investigate equilibration and thermalization in particular models [28, 29, 30, 31, 32]. Due to the additional structure in the less general situations considered in these works a more detailed analysis is possible and the authors can make assertions about the time scales on which equilibration happens.
In  it is shown how the concepts of work and heat can be defined on purely microscopical grounds without using classical external driving and in  the limits of purely quantum microscopic thermal machines are investigated. See also the references in [33, 34] for works discussing and applying definitions of work and heat based on time dependent Hamiltonians and external driving.
There have been attempts to derive the Second Law of Thermodynamics [39, 40] or a statistical H-Theorem  for the von Neumann entropy from Quantum Mechanics and in  (see also the older references 4 and 5 in ) a different entropy measure, “microscopic diagonal entropy”, was proposed to overcome the contradiction between microscopic time reversal invariance and the Second Law.
Unfortunately, the often mathematically rigorous and far reaching results of these works are almost completely ignored by textbooks on Statistical Mechanics and Thermodynamics. This is true even for the results obtained by von Neumann in 1930  (an exception is ). This situation is unfortunate since some of the results mentioned above address long standing conceptual issues at the very heart of Statistical Mechanics and Thermodynamics.
Quantum Statistical Mechanics
In particular, [13, 45, 22, 15, 16, 8] argue for a new interpretation of the foundations of Statistical Mechanics. Following Seth Lloyd  we call this approach pure state quantum Statistical Mechanics. In what follows we give a concise and self contained review of the results of these and other related works in a unified and consistent notation. In the first section we introduce the general setup and fix the notation. We then review the recent progress in the field and present additional new results concerning the justification of the applicability of the microcanonical and canonical ensemble, equilibration, ergodicity and initial state independence. Finally we show that these results imply a statistical quantum Second Law of Thermodynamics.
We consider arbitrary quantum systems that can be described using a Hilbert space $\mathcal{H}$ of finite dimension $d$. (Footnote 3: If the Hilbert space of a real system is infinite dimensional it should always be possible to find an effective description in a finite dimensional Hilbert space by introducing a high energy cut-off. If eigenstates with extremely high energy had a crucial influence on the behavior of realistic systems physicists would be in a desperate position. Without the ability to prepare and thus study these states in detail it would be very difficult to make reliable predictions. The author therefore believes that whenever the behavior of some model is crucially changed by introducing such a cut-off this is due to the very fact that it is a model. Moreover, it was demonstrated in  that many of the phenomena that can be rigorously proven in the finite dimensional case also occur in infinite dimensional systems. We thus believe that the restriction to finite dimensions is mainly a technicality.) We assume that all observables, including energy, are bounded linear operators, i.e. have a finite operator norm.
We will often talk about systems that can be divided into two parts, which we will call the bath $B$ and the subsystem $S$, such that $\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_B$, where $\mathcal{H}_S$ and $\mathcal{H}_B$ are the Hilbert spaces of the subsystem and the bath respectively. It shall be emphasized that we will not make any special a priori assumptions about the size and structure of the bath and system. All results will be completely general. The only reason why we call one part the bath and the other the subsystem is that in the end we will be interested in situations where the dimension of the Hilbert space of the bath is much larger than the dimension of the Hilbert space of the system.
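We denote by $\mathcal{P}(\mathcal{H})$ the set of all projectors on $\mathcal{H}$ and by $\mathcal{P}_n(\mathcal{H})$ the set of all rank $n$ projectors on $\mathcal{H}$. We write $|\psi\rangle$ and $|\varphi\rangle$ for normalized pure state vectors and use $\psi = |\psi\rangle\langle\psi|$ and $\varphi = |\varphi\rangle\langle\varphi|$ to denote their associated pure density matrices in $\mathcal{P}_1(\mathcal{H})$. The set of all, possibly mixed, normalized density matrices on $\mathcal{H}$, i.e. the set of all positive-semidefinite hermitian matrices with trace one, will be denoted by $\mathcal{M}(\mathcal{H})$ and we will use the symbols $\rho$ and $\sigma$ for, possibly mixed, states from $\mathcal{M}(\mathcal{H})$. Their reduced states, or marginals, on the subsystem and bath are indicated by superscript letters like in $\rho^S = \mathrm{Tr}_B[\rho]$ and $\psi^B = \mathrm{Tr}_S[\psi]$.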
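The Hamiltonian $H$ of the joint system has energy eigenstates $|E_k\rangle$ with corresponding energy eigenvalues $E_k$ that we will assume to be given in units of $\hbar$. The Hamiltonian governs the time evolution of the joint system. If the initial state of the system was $\psi_0 = |\psi_0\rangle\langle\psi_0|$ we will denote the state at time $t$ by $\psi_t = |\psi_t\rangle\langle\psi_t|$ with $|\psi_t\rangle = e^{-\mathrm{i} H t}\,|\psi_0\rangle$.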
The Hamiltonians considered herein are completely general except for one extremely weak constraint, namely that they have non-degenerate energy gaps or, equivalently, are non-resonant. (Footnote 4: This assumption already appears in the work of von Neumann  and later in [25, 24, 36].) This assumption imposes a restriction on the equality of the gaps between energy eigenvalues, namely
$$E_k - E_l = E_m - E_n \implies (k = l \,\wedge\, m = n) \,\vee\, (k = m \,\wedge\, l = n).$$
Note that there are two slightly different versions of this assumption: In the first, stronger version the indices run over all eigenstates of the Hamiltonian, i.e. $k, l, m, n \in \{1, \dots, d\}$. This version implies that the spectrum of the Hamiltonian is non-degenerate. In the weaker version the indices run only over all distinct eigenvalues, so that degeneracies in the energy spectrum are allowed as long as the gaps between the degenerate subspaces are non-degenerate.
It shall be emphasized that even the stronger version is an extremely weak restriction: generic Hamiltonians have non-degenerate energy gaps, and every Hamiltonian can be made non-resonant by adding an arbitrarily small random perturbation. The Hamiltonians of macroscopic systems can therefore be expected to satisfy this constraint.
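As a small illustration of the assumption, the following Python sketch checks whether a list of energy eigenvalues has non-degenerate gaps in the stronger sense. The function name `has_nondegenerate_gaps` and the tolerance `tol` are illustrative choices that do not appear in the original text; the finite tolerance is an assumption needed when working with floating point numbers.

```python
import numpy as np

def has_nondegenerate_gaps(energies, tol=1e-9):
    """Return True if all gaps E_k - E_l with k != l are non-zero and pairwise distinct.

    This is the stronger version of the assumption, so degenerate energy
    levels are rejected as well. `tol` is an illustrative numerical tolerance.
    """
    E = np.asarray(energies, dtype=float)
    gaps = E[:, None] - E[None, :]              # gaps[k, l] = E_k - E_l
    off_diag = ~np.eye(len(E), dtype=bool)      # ordered pairs with k != l
    g = np.sort(gaps[off_diag])
    if g.size == 0:
        return True                             # a single level has no gaps
    if np.any(np.abs(g) <= tol):
        return False                            # degenerate energy level
    return bool(np.all(np.diff(g) > tol))       # no two gaps coincide

print(has_nondegenerate_gaps([0.0, 1.0, 3.0, 7.0]))  # all gaps differ
print(has_nondegenerate_gaps([0.0, 1.0, 2.0]))       # equally spaced levels are resonant
```

The equally spaced spectrum fails the check because the gap between the first and second level equals the gap between the second and third.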
The physical implication of this assumption is that the Hamiltonian is fully interactive in the sense that there exists no partition of the composite system into a subsystem and bath such that the Hamiltonian can be written as a sum $H = H_S \otimes \mathbb{1}_B + \mathbb{1}_S \otimes H_B$, where $H_S$ and $H_B$ act on the subsystem and bath alone.
In the following we will use the stronger version of the non-degenerate energy gaps assumption for the sake of simplicity. However, results similar to the ones presented herein hold under the second, weaker version. Basically, what one has to do is replace projectors onto energy eigenstates by projectors onto degenerate subspaces and refine some of the quantities appearing in the theorems, in particular the effective dimension (see the discussion in ).
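The consequence of the non-degenerate energy gaps assumption that we exploit in the present work is that time averaging a state that evolves under such a Hamiltonian,
$$\omega = \langle \psi_t \rangle_t = \lim_{\tau \to \infty} \frac{1}{\tau} \int_0^\tau \psi_t \, \mathrm{d}t ,$$
gives the same result as dephasing the initial state with respect to the energy eigenbasis of $H$,
$$\omega = \sum_k |E_k\rangle\langle E_k| \, \psi_0 \, |E_k\rangle\langle E_k| .$$
We will therefore use the letter $\omega$ to refer to both time averaged and dephased states.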
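The equivalence of time averaging and dephasing can be checked numerically. The sketch below is only an illustration under stated assumptions: the spectrum `[0, 1, 3, 7]` (chosen by hand so that all gaps differ), the dimension 4, the step size `dt` and the number of `steps` are arbitrary, and the helper names `dephase` and `time_average` are hypothetical. The infinite time average is approximated by a finite average over equally spaced times.

```python
import numpy as np

def dephase(rho, H):
    """Drop all off-diagonal elements of rho in the energy eigenbasis of H."""
    _, U = np.linalg.eigh(H)
    rho_e = U.conj().T @ rho @ U
    return U @ np.diag(np.diag(rho_e)) @ U.conj().T

def time_average(rho, H, dt=0.1, steps=20000):
    """Finite-time average of exp(-iHt) rho exp(iHt) over t = 0, dt, 2dt, ..."""
    E, U = np.linalg.eigh(H)
    rho_e = U.conj().T @ rho @ U
    t = dt * np.arange(steps)
    gaps = E[:, None] - E[None, :]
    # phases[n, j, k] = exp(-i (E_j - E_k) t_n), the evolution of element (j, k)
    phases = np.exp(-1j * gaps[None, :, :] * t[:, None, None])
    return U @ (rho_e * phases.mean(axis=0)) @ U.conj().T

rng = np.random.default_rng(0)
# Random eigenbasis with a hand-picked spectrum that has non-degenerate gaps.
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
H = Q @ np.diag([0.0, 1.0, 3.0, 7.0]) @ Q.conj().T
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)
rho0 = np.outer(psi, psi.conj())

deviation = np.max(np.abs(time_average(rho0, H) - dephase(rho0, H)))
print(deviation)  # shrinks as the averaging time dt * steps grows
```

The off-diagonal elements in the energy eigenbasis rotate with frequencies given by the energy gaps and therefore average out, while the diagonal elements are constant in time.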
In what follows we will often talk about random pure states drawn from some subspace $\mathcal{H}_R \subseteq \mathcal{H}$. Unless explicitly stated otherwise, by a random pure state we mean a state that was chosen according to the Haar measure on $\mathcal{H}_R$, which is the unique unitarily left and right invariant measure (see the appendix on the Haar measure for more information).
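In numerical experiments, random pure states of this kind are commonly sampled by normalizing a vector of independent complex Gaussian coefficients, which yields a unitarily invariant distribution on the unit sphere of the subspace. The following sketch assumes the subspace is given by an orthonormal basis stored as the columns of `basis`; the function name `random_pure_state` and the dimensions 5 and 3 are illustrative only.

```python
import numpy as np

def random_pure_state(basis, rng):
    """Sample a unitarily invariant random unit vector from the span of `basis`.

    basis: (d, d_R) array whose orthonormal columns span the subspace.
    """
    d_R = basis.shape[1]
    c = rng.normal(size=d_R) + 1j * rng.normal(size=d_R)
    return basis @ (c / np.linalg.norm(c))

rng = np.random.default_rng(2)
# Illustrative 3-dimensional subspace of a 5-dimensional Hilbert space.
basis, _ = np.linalg.qr(rng.normal(size=(5, 3)) + 1j * rng.normal(size=(5, 3)))
projector = basis @ basis.conj().T

samples = [random_pure_state(basis, rng) for _ in range(4000)]
average = sum(np.outer(s, s.conj()) for s in samples) / len(samples)
max_dev = np.max(np.abs(average - projector / 3))
print(max_dev)  # the ensemble average approaches projector / d_R
```

Averaging the sampled pure states reproduces the normalized projector onto the subspace up to statistical fluctuations, as expected for an invariant measure.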
2 Ensemble averages and pure state quantum Statistical Mechanics
In conventional Statistical Mechanics, probabilities, expectation values, variances and higher moments of observables are computed via ensemble averages. Depending on the situation under consideration one must employ the microcanonical, canonical or the appropriate grand canonical ensemble. The validity of this approach is beyond all doubt and the results obtained using it have been confirmed by innumerable experiments.
On the other hand, the role of probability [48, 21] in Physics, the problem of ergodicity and especially the microscopic justification of the Second Law of Thermodynamics are very subtle issues and many fundamental questions concerning them are still open despite many decades of research.
The starting point of our discussion will be to show how the applicability of ensemble averages can be justified using Quantum Mechanics and measure concentration techniques without any extra assumptions.
2.1 The microcanonical ensemble
The microcanonical ensemble is in some sense the most fundamental ensemble. In classical Statistical Physics it is applied to closed systems in equilibrium. The other ensembles, canonical and grand canonical can be derived from it .
In the quantum setting the microcanonical ensemble is used in situations where all one knows about a closed physical system is that the value of some observable $C$, which corresponds to a conserved quantity, i.e. $[H, C] = 0$, lies in some interval $[c, c + \Delta]$. (Footnote 5: Note that thermodynamically closed does not necessarily mean completely isolated. In this section we will however talk only about completely isolated systems.) Let $|c_k\rangle$ be the eigenvectors of $C$ and $\mathcal{H}_R$ the restricted subspace, of dimension $d_R$, spanned by those eigenvectors that have eigenvalues in the interval. The microcanonical expectation value of any observable $A$ with respect to $\mathcal{H}_R$ is then defined to be
$$\langle A \rangle_{\mathrm{mc}} = \frac{\mathrm{Tr}[A \, \Pi_R]}{d_R}, \tag{2.1}$$
where $\Pi_R$ is the projector onto the subspace of eigenstates of $C$ with eigenvalues in $[c, c + \Delta]$. Knowing only that measuring $C$ would give a value in $[c, c + \Delta]$ we ascribe to the system the mixed state
$$\frac{\Pi_R}{d_R}. \tag{2.2}$$
(Footnote 6: Note that there are other possible generalizations of the microcanonical ensemble to the quantum setting that are discussed in the literature, see [49, 50, 51].)
Equations (2.1) and (2.2) are the quantum version of the equal a priori probability postulate, which is the fundamental postulate of conventional Statistical Mechanics. All compatible states are assigned the same a priori probability.
It is beyond all doubt that this approach to calculating expectation values has proven to be extremely useful and yields results in good agreement with experiments. However, it remains puzzling why dynamically evolving and intrinsically quantum mechanical systems may be described by the static, highly mixed state (2.2).
2.1.1 Typicality of general observables
The recent results suggest that the equal a priori probability postulate is dispensable. Instead of assuming that the state (2.2) yields a good description of the system it is possible to prove that for almost all pure states of large systems all subsystems behave as if the system were in the state (2.2), a statement the authors of  called the General Canonical Principle.
The idea to reproduce the results obtained using the microcanonical ensemble average without added randomness, from nothing but pure Quantum Mechanics, and thereby to justify its use, was already discussed in 1991 by J.M. Deutsch . A mathematically more precise statement about the equivalence of ensemble averages and expectation values of random pure states can be found in the Ph.D. thesis of Seth Lloyd, which appeared in the same year :
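Theorem 2.1. Let $\mathcal{H}_R$ be a subspace of dimension $d_R$ of the Hilbert space of some physical system. Let $\Pi_R$ be the projector onto $\mathcal{H}_R$ and let $\langle \cdot \rangle_\psi$ be the average over random pure states $|\psi\rangle \in \mathcal{H}_R$. Then for every observable $A$ with $A = \Pi_R A \Pi_R$ (Footnote 7: The additional constraint is not discussed in the main text of , but it is stated and used in the proof of the theorem.):
$$\left\langle \left( \langle\psi|A|\psi\rangle - \frac{\mathrm{Tr}[A \, \Pi_R]}{d_R} \right)^2 \right\rangle_\psi = \frac{1}{d_R + 1} \left( \frac{\mathrm{Tr}[A^2 \, \Pi_R]}{d_R} - \frac{\mathrm{Tr}[A \, \Pi_R]^2}{d_R^2} \right).$$
The interpretation of theorem 2.1 is straightforward: If the dimension of the subspace is large, it tells us that the mean square deviation of the expectation value of the observable, computed over random pure states, from the microcanonical expectation value is small, which implies that the two expectation values will be similar with high probability.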
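The shrinking mean square deviation can be illustrated with a small Monte Carlo experiment. The sketch below makes several assumptions that are not part of the original text: the restricted subspace is taken to be the whole Hilbert space of dimension `d`, the observable is diagonal with alternating eigenvalues +1 and -1 (so its microcanonical expectation value is zero and its microcanonical variance is one), and random states are sampled by normalizing complex Gaussian vectors. For this particular observable the variance over random pure states is 1/(d + 1). The function name `sample_expectations` is hypothetical.

```python
import numpy as np

def sample_expectations(d, samples, rng):
    """Expectation values <psi|A|psi> for random pure states in dimension d.

    A is diagonal with eigenvalues +1, -1, +1, ... (d is assumed even).
    """
    a = np.where(np.arange(d) % 2 == 0, 1.0, -1.0)
    c = rng.normal(size=(samples, d)) + 1j * rng.normal(size=(samples, d))
    p = np.abs(c) ** 2
    p /= p.sum(axis=1, keepdims=True)     # p[n, k] = |<k|psi_n>|^2
    return p @ a

rng = np.random.default_rng(4)
for d in (16, 256):
    vals = sample_expectations(d, 2000, rng)
    # sample mean, sample variance and the value 1 / (d + 1) for comparison
    print(d, vals.mean(), vals.var(), 1.0 / (d + 1))
```

Increasing the dimension from 16 to 256 reduces the spread of the sampled expectation values around the microcanonical value accordingly.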
The methods used in  to prove the General Canonical Principle, namely Levy’s lemma (see the appendix on Levy’s lemma), can be used to prove a stronger, exponential bound on the probability to observe a deviation from the predictions of the microcanonical ensemble when measuring an observable acting on the full Hilbert space:
Let be a subspace of dimension of the Hilbert space of some physical system. The probability that the expectation value of an arbitrary observable in a randomly chosen pure state differs from its microcanonical expectation value with respect to is exponentially small in the sense that for every
where is a constant with .
The expectation value of this function with respect to a randomly chosen pure state clearly is
Its Lipschitz constant with respect to the Hilbert space norm is upper bounded by , as :
Applying Levy’s lemma (see the appendix on Levy’s lemma) to this function gives the desired result. ∎
Theorem 2.2 tells us that as the dimension of the subspace becomes large, the measure of the set of states whose expectation value deviates from the microcanonical one by more than a given amount becomes exponentially small. Typical states will give expectation values that agree very well with the predictions of the microcanonical ensemble.
Of course, typicality of expectation values is not sufficient to justify the microcanonical ensemble from measure theoretic considerations. Variances and higher moments also need to be considered.
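In  it is claimed that theorem 2.1 implies that not only the expectation values, but in addition all higher moments are likely to be close to the microcanonical ones for typical states. But what is actually proved is that the variance in state $\psi$ computed with respect to the microcanonical expectation value $\langle A \rangle_{\mathrm{mc}}$,
$$\mathrm{Tr}\big[ (A - \langle A \rangle_{\mathrm{mc}})^2 \, \psi \big], \tag{2.8}$$
is close to the microcanonical variance
$$\langle A^2 \rangle_{\mathrm{mc}} - \langle A \rangle_{\mathrm{mc}}^2$$
with high probability given that the dimension $d_R$ of the subspace is large. The additional deviation caused by the fact that (2.8) differs from the variance in state $\psi$,
$$\mathrm{Tr}\big[ (A - \mathrm{Tr}[A \psi])^2 \, \psi \big],$$
is not taken into account.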
But, as one may already anticipate, the additional error typically is very small, so that it is not surprising that theorem 2.2 can be used to prove that not only the expectation values, but in addition the variances of almost all states are compatible with the variance of the microcanonical ensemble. We expect that similar statements hold for all higher moments.
In particular we can prove that:
Let be a subspace of dimension of the Hilbert space of some physical system. The probability that the variance of some observable in a random pure state
differs from the variance that follows from the microcanonical ensemble
is exponentially small, in the sense that for every
where is a constant with .
Let and be the -th moment of the probability distribution of the observable with respect to the state and the microcanonical ensemble respectively, so that in particular and . To simplify the notation we define
For all we have:
The second term in the last line can be bounded. Applying theorem 2.2 to gives
This is an exponential version of the bound found in .
Bounding the first term is in general more complicated except for the variances where we can use the following argument: Assume that the deviation between and is
and therefore we have by theorem 2.2 for all
Combining the two estimates we arrive at:
Now, every observable can be renormalized such that and rescaled such that its operator norm is one. Doing this one changes the variance by a factor of so that we get
Substituting gives the second bound. ∎
Note that all important steps in the above discussion are valid also for higher moments except for the bound on , which is especially simple for the special case . We expect however that slightly more complicated arguments can be made for all higher moments.
Measuring the same typical pure state of a large enough quantum system we can therefore expect to get not only expectation values that are close to the microcanonical ones; in addition the observed variances will be almost identical to the ones predicted by conventional Statistical Mechanics. Note that these variances are caused by objective quantum uncertainties and not by ensemble averages due to a subjective lack of knowledge of the micro state. (Footnote 8: The interpretation of the word objective depends on the preferred interpretation of Quantum Mechanics. A discussion of this point, which comes to the conclusion that Quantum Mechanical probabilities are not objective in a certain sense, can for example be found in . However, they are certainly in some sense more objective than probabilities that result from the voluntary dismissal of information due to coarse graining. We shall not elaborate on this point here as it would lead us too far away from the subject of this work.)
Concluding, we may say that, given an ensemble of large quantum mechanical systems, we are, by measuring only a reasonably small number of observables, with very high probability unable to decide whether all systems of the ensemble are in the same random pure state chosen from some subspace, or are representatives of the corresponding microcanonical ensemble. We call this property of large quantum systems microcanonical typicality. However, there are combinations of initial states and observables that give a measurement statistic that deviates radically from the predictions of the microcanonical ensemble. This happens for example when the state is an eigenstate of the observable. These measurements are the ones that best characterize the system under consideration and an experimentalist will always seek such a characterization. Thus the physical significance of the above results is questionable.
In the following sections we will elaborate more on this point and present arguments similar to theorem 2.2 for coarse grained observables and for situations where only a subsystem of a larger quantum system is experimentally accessible and we will see that in these situations the criticism expressed above does not apply.
2.1.2 Typicality of coarse grained observables
We have seen that when all observables are experimentally accessible there always exist measurements, in particular measurements in the eigenbasis, which give a measurement statistic for a random pure state that deviates radically from the one predicted by the microcanonical ensemble.
However, on macroscopic systems most observables are not accessible. This is not only a consequence of experimental limitations but mainly due to the vast number of dimensions of the Hilbert spaces of macroscopic systems [11, 24]. As an example consider the spin degrees of freedom of a macroscopic magnet. The typical Hilbert space of such a system has a dimension of the order of . Trying to measure an observable that can distinguish that many states, or even worse, doing state tomography on such a system, certainly is a completely futile task.
Obviously we need to find a way to take our limited capabilities into account when seeking a realistic description of macroscopic systems. The way we will do that here is the simplest and most straightforward one can possibly think of and similar considerations date back to the work of von Neumann .
Let be the set of experimentally accessible macro observables , where, without loss of generality we can assume that the are positive-semidefinite and have trace one . We think of the as macroscopic observables, so that, due to the limited resolution of our measurement apparatuses, the will be highly degenerate. Furthermore we want the to be classical in the sense that . Such a set of commuting observables induces a pseudo norm and an associated pseudo trace distance
which measures how well two states can be distinguished from one another by the restricted set of observables. (Footnote 9: Note that the pseudo trace distance reduces to the normal trace distance if all observables are accessible. See the appendix on distance measures for quantum states for more information.) The set of accessible measurements partitions the total Hilbert space of the system into a complete set of orthogonal subspaces of macroscopically distinguishable states, or macro states, such that states from one subspace cannot be distinguished by any of the accessible observables and two states are distinguishable by at least one of them whenever they are in different subspaces.
Every macroscopic observable that we can measure by using all our measurement capabilities is of the form
$$A = \sum_i a_i \, \Pi_i ,$$
where the $\Pi_i$ are the projectors onto the corresponding subspaces and the $a_i$ real parameters.
In realistic situations we can expect that and the following theorem tells us that we are unlikely to have any chance of distinguishing a random pure state from the microcanonical state under these conditions:
Let be a restricted subspace of dimension of the Hilbert space of some physical system. Assume that the physically feasible, macroscopic measurements allow one to distinguish a total number of macro states. Then the probability that a random pure state gives an expectation value for any of the accessible macroscopic observables that differs from that of the microcanonical one with respect to is exponentially small, namely
where is a constant with .
The proof is inspired by the considerations in appendix VI of . As explained above defines a set of mutually orthogonal projectors onto subspaces of indistinguishable states and consequently every accessible observable is of the form
so that . Obviously for all such observables it holds that
Inserting into theorem 2.2 we find that for random pure states
where . Using the union bound we see that this implies that
so that for all accessible observables
The important quantity in the above theorem is the quotient in the exponent of (2.32) which quantifies how good our abilities to prepare and measure a state are. Assuming that the dimensions of each of the subspaces of indistinguishable states are approximately identical one can expect that and grows exponentially with the number of constituents of the system. In contrast is basically given by the spread of the spectra of the physically accessible observables divided by the resolution of the measurement apparatuses. The spread of the spectra can be expected to grow at most polynomial with the system size and the resolution of the measurement apparatuses will be roughly independent of the system size. One can therefore expect that for large enough systems one enters the regime where and where the above theorem becomes meaningful.
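The behavior of coarse grained observables can be illustrated numerically. The sketch below is a toy example under assumptions that are not taken from the original text: the restricted subspace is the whole Hilbert space, there are three macro states whose subspaces have the illustrative dimensions 200, 400 and 600, and the random state is sampled by normalizing a complex Gaussian vector. It compares the weight of a single random pure state on each macro subspace with the microcanonical weight, i.e. the relative dimension of that subspace.

```python
import numpy as np

rng = np.random.default_rng(1)
dims = np.array([200, 400, 600])           # illustrative macro-subspace dimensions
d = dims.sum()

c = rng.normal(size=d) + 1j * rng.normal(size=d)
psi = c / np.linalg.norm(c)                # one random pure state

# Weight of psi on each block of basis vectors, i.e. Tr[Pi_i psi].
edges = np.concatenate(([0], np.cumsum(dims)))
weights = np.array([np.sum(np.abs(psi[a:b]) ** 2)
                    for a, b in zip(edges[:-1], edges[1:])])
microcanonical = dims / d                  # Tr[Pi_i] / d

print(weights)
print(microcanonical)
```

Since every accessible observable is a real linear combination of the projectors onto the macro subspaces, small deviations of these weights translate into small deviations of all accessible expectation values.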
2.2 The canonical ensemble
The canonical ensemble is usually applied to subsystems of weakly interacting composite systems whose total energy is known to lie in some narrow interval. A slightly more general situation is that of a composite system subject to the constraint that the value of some observable corresponding to an extensive and conserved quantity is known to lie within some interval. This understanding of the canonical ensemble includes what is sometimes called the grand canonical ensemble. For the sake of simplicity we restrict ourselves to the canonical case, in which the conserved quantity is the energy. The generalization to the grand canonical case is almost trivial.
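Using the canonical ensemble to calculate expectation values is equivalent to assuming that the state of the system of interest is given by the so called canonical state
$$\rho_{\mathrm{canonical}} = \frac{1}{Z} \sum_k e^{-\beta E_k^S} \, |E_k^S\rangle\langle E_k^S| , \tag{2.39}$$
where $\beta$ is the inverse temperature, $|E_k^S\rangle$ are the eigenstates of the system Hamiltonian with energies $E_k^S$, and
$$Z = \sum_k e^{-\beta E_k^S} \tag{2.40}$$
the partition sum, which ensures normalization.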
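For a concrete finite dimensional system the canonical state can be computed directly from the spectral decomposition of the system Hamiltonian. The following is a minimal sketch; the function name `canonical_state`, the two level example and the value of `beta` are illustrative assumptions. Energies are shifted by the ground state energy before exponentiating, which leaves the state unchanged but avoids numerical overflow.

```python
import numpy as np

def canonical_state(H_S, beta):
    """Return exp(-beta H_S) / Z for a hermitian matrix H_S."""
    E, V = np.linalg.eigh(H_S)
    w = np.exp(-beta * (E - E.min()))      # Boltzmann weights, shifted for stability
    p = w / w.sum()                        # division by the (shifted) partition sum
    return (V * p) @ V.conj().T            # sum_k p_k |E_k><E_k|

# Illustrative two level system with energies 0 and 1.
H_S = np.diag([0.0, 1.0])
rho = canonical_state(H_S, beta=2.0)
print(np.trace(rho).real)                  # normalization
print((rho[1, 1] / rho[0, 0]).real)        # population ratio, equals exp(-beta * 1)
```

The ratio of the two populations is the Boltzmann factor of the energy difference, and the trace is one by construction.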
Taking (2.39) as the system state is usually justified by regarding the system as a subsystem of a larger, closed composite system to which the microcanonical ensemble can be applied [17, 52, 53]. (Footnote 10: Alternatively one can appeal to Bayesian probability and the principle of maximum entropy [21, 20].) The following is a sketch of how this justification works.
The argument presented herein follows closely the discussion in . Note that the argument is solely based on combinatorics and the identification of the thermodynamic entropy with the entropy defined via the number of compatible micro states. There is nothing specifically quantum to it. Very similar arguments can be found in nearly every textbook on Statistical Mechanics.
The Hamiltonian of the composite system
$$H = H_S \otimes \mathbb{1}_B + \mathbb{1}_S \otimes H_B + H_{SB} \tag{2.41}$$
consists of a system Hamiltonian $H_S$, a bath Hamiltonian $H_B$ and an interaction term $H_{SB}$. The interaction term is assumed to be small in the sense that the total energy of the system is approximately the sum of the system energy and the bath energy, i.e. that energy is extensive, and that the energy eigenstates are close to product states.
The energy of the composite system is assumed to be known to lie in some interval $[E, E + \Delta]$ that is assumed to be small on a macroscopic energy scale, but still large enough such that the subspace spanned by the energy eigenstates with eigenvalues in the interval is large.
Assuming that the composite system is in the microcanonical state and using that the energy eigenstates of $H$ are approximately product states we find for the reduced state of the system
$$\rho^S \propto \sum_k N_B(E - E_k^S) \, |E_k^S\rangle\langle E_k^S| , \tag{2.42}$$
where the $|E_k^S\rangle$ are the eigenstates of $H_S$ with energy $E_k^S$ and the $N_B(E - E_k^S)$ are the number of eigenstates of $H_B$ with eigenvalues in the interval $[E - E_k^S, E - E_k^S + \Delta]$.
The last step is to introduce the concept of temperature. The inverse temperature of the bath is defined via where is the entropy of the bath when it is held at energy . Assuming that the energy levels of the bath become exponentially dense with increasing energy, which seems to be a reasonable assumption for most thermodynamic systems, one can expect that .111111This is probably the most critical step in the argument. The assumption of exponentially dense energy gaps conflicts with the assumption that does not significantly influence the eigenstates of the uncoupled Hamiltonian , as this can be guarantied only when the coupling is smaller than the energy gaps of the uncoupled Hamiltonian. Such that, if the bath is much larger than the system we have:
One thus finally reaches the conclusion that under the given conditions. (Footnote 12: Using a similar argument, but under additional assumptions on the interaction Hamiltonian, namely that it only couples adjacent energy eigenstates, the canonical ensemble is also derived in .)
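The counting argument above can be illustrated numerically. The following is a minimal sketch under assumptions that are not part of the text: the bath is a hypothetical collection of `N_BATH` two-level units with unit energy spacing, so that the number of bath states at energy E is the binomial coefficient C(N, E); the system has four assumed levels; entropy is measured in units where Boltzmann's constant is 1. The probability of a system level is taken proportional to the number of bath states carrying the remaining energy and is compared with Boltzmann weights at the inverse temperature obtained from the derivative of the bath entropy.

```python
import math


def log_binomial(n, k):
    """Natural log of the binomial coefficient C(n, k), via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


# Hypothetical toy model (an assumption, not taken from the text):
# a bath of N_BATH two-level units, each contributing energy 0 or 1.
N_BATH = 10_000
E_TOTAL = 2_000                # total energy of system plus bath
SYSTEM_LEVELS = [0, 1, 2, 3]   # assumed system energies, same units


def microcanonical_weights(levels, n_bath, e_total):
    """Probability of each system level, proportional to the number of
    bath states that carry the remaining energy e_total - E_k."""
    logs = [log_binomial(n_bath, e_total - e) for e in levels]
    shift = max(logs)                      # avoid overflow in exp
    weights = [math.exp(x - shift) for x in logs]
    total = sum(weights)
    return [w / total for w in weights]


def bath_beta(n_bath, energy):
    """Inverse temperature as the derivative of the bath entropy
    ln C(n_bath, E) with respect to E (central finite difference)."""
    return (log_binomial(n_bath, energy + 1)
            - log_binomial(n_bath, energy - 1)) / 2.0


def canonical_weights(levels, beta):
    """Normalized Boltzmann weights exp(-beta * E_k)."""
    weights = [math.exp(-beta * e) for e in levels]
    total = sum(weights)
    return [w / total for w in weights]


beta = bath_beta(N_BATH, E_TOTAL)
p_micro = microcanonical_weights(SYSTEM_LEVELS, N_BATH, E_TOTAL)
p_canon = canonical_weights(SYSTEM_LEVELS, beta)
max_deviation = max(abs(a - b) for a, b in zip(p_micro, p_canon))
print(beta, p_micro, p_canon, max_deviation)
```

In this toy model the two sets of probabilities agree closely because the bath is much larger than the system, so the bath entropy is almost linear in energy over the range of system energies.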
Now the question is: Is it possible to come to the same conclusion without using the ad hoc assumption of the microcanonical state for the composite system? In  consequences of theorem 2.1 on the equivalence of expectation values obtained using the canonical ensemble and expectation values of typical quantum states have already been discussed. Using similar arguments it is shown in [37, 10, 17] that the reduced state of a typical random state from the subspace compatible with the imposed energy constraint will, with high probability, be close to . Herein we focus on the more rigorous exponential bounds provided by theorem 2.2 and the results obtained in .
Of course theorem 2.2 is also applicable to observables that act only locally on the subsystem, and our considerations concerning variances and higher moments also remain valid. Consequently theorems 2.2 and 2.3 already tell us that the measurement statistics of local observables do not differ much whether we assume that the composite system is in the microcanonical state corresponding to or in one particular random pure state from .
For reduced states of random pure states an even more powerful statement can be proved. This is the main result of :
(Theorem 1 in ) (Footnote 13: In many situations theorem 2.5 can be further improved. See  for details.) Let be a subspace of dimension of the Hilbert space of some physical system. The probability that the reduced state of a randomly chosen pure state is more than away from the reduced microcanonical state is given by
Whenever , which is exactly the situation we are interested in, this theorem gives a full replacement for the assumption made in (2.42). If one trusts the argument presented above that , this theorem, together with the usual assumption of weak interaction, proves that almost every pure state drawn from a sufficiently large subspace is locally equivalent to the canonical state. That is, there exists no measurement at all by which they can be distinguished. This is a measure theoretic justification for the applicability of the canonical ensemble that does not rely on the microcanonical ensemble or the equal a priori probability postulate. The authors of  call it the General Canonical Principle.
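The concentration behind this statement can be observed in a small numerical experiment. The sketch below makes simplifying assumptions that are not part of the theorem: the subsystem is a single qubit, and the subspace is taken to be the whole Hilbert space, so that the reduced microcanonical state is the maximally mixed qubit state. All function names are illustrative. Haar-random pure states are sampled as normalized complex Gaussian vectors, and the trace norm distance of the reduced state from the maximally mixed state is averaged for several bath dimensions.

```python
import math
import random


def haar_random_state(dim, rng):
    """Haar-random pure state: normalized vector of i.i.d. complex Gaussians."""
    v = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in v))
    return [a / norm for a in v]


def reduced_qubit_state(psi, d_bath):
    """Partial trace over the bath for a state on (qubit) x (bath).

    Index convention (an assumption of this sketch): psi[s * d_bath + b].
    Returns (rho00, rho01); rho11 = 1 - rho00 and rho10 = conj(rho01).
    """
    up, down = psi[:d_bath], psi[d_bath:]
    rho00 = sum(abs(a) ** 2 for a in up)
    rho01 = sum(a * b.conjugate() for a, b in zip(up, down))
    return rho00, rho01


def trace_norm_distance_to_mixed(rho00, rho01):
    """|| rho_S - 1/2 ||_1 for a qubit. The difference is traceless and
    Hermitian, so its eigenvalues are +lam and -lam."""
    lam = math.sqrt((rho00 - 0.5) ** 2 + abs(rho01) ** 2)
    return 2.0 * lam


def mean_distance(d_bath, samples, rng):
    """Average distance over `samples` Haar-random composite states."""
    total = 0.0
    for _ in range(samples):
        psi = haar_random_state(2 * d_bath, rng)
        total += trace_norm_distance_to_mixed(*reduced_qubit_state(psi, d_bath))
    return total / samples


rng = random.Random(0)
results = {d_bath: mean_distance(d_bath, 200, rng) for d_bath in (4, 32, 256)}
for d_bath, dist in results.items():
    print(d_bath, dist)
```

The average distance decreases as the bath dimension grows, in line with the qualitative content of the theorem. The sketch does not test the exponential tail bound itself.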
3 Average effective dimension of random pure states
In this section we will discuss the effective dimension
where , of random pure initial states drawn according to different distributions. This quantity will be important in the following discussion. Roughly speaking, we will find that a high effective dimension causes thermodynamic behavior, while a small effective dimension makes quantum effects observable.
Before we go on it is useful to develop an intuitive understanding of the effective dimension. Obviously we have if is pure and the completely mixed state has an effective dimension of . Expanding an arbitrary pure initial state in the energy eigenbasis as follows
we find that, under the assumption of non-degenerate energy gaps, its effective dimension is
From this we see that the effective dimension can be interpreted as a measure of the number of energy eigenstates that contribute significantly to the given initial state . This intuition can already serve as a justification for the assumption that for macroscopic objects will typically be very large.
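The interpretation as a count of contributing eigenstates can be made concrete with a few lines of code. The sketch assumes the standard form of the quantity for a Hamiltonian with non-degenerate spectrum: if the initial state has expansion coefficients c_k in the energy eigenbasis, the time-averaged state is diagonal with entries |c_k|^2, and the effective dimension is 1 / sum_k |c_k|^4. The function name and the example amplitudes are illustrative.

```python
def effective_dimension(amplitudes):
    """Inverse purity of the dephased state: 1 / sum_k |c_k|^4.

    `amplitudes` are the expansion coefficients of a pure state in the
    energy eigenbasis; they are normalized here for convenience.
    """
    norm_sq = sum(abs(c) ** 2 for c in amplitudes)
    probs = [abs(c) ** 2 / norm_sq for c in amplitudes]
    return 1.0 / sum(p * p for p in probs)


# A single energy eigenstate: only one level contributes.
single = effective_dimension([1.0, 0.0, 0.0, 0.0])

# Equal weight on four eigenstates (phases do not matter).
uniform = effective_dimension([0.5, 0.5j, -0.5, 0.5])

# Weight 0.9 on one level and 0.1 on another: between one and two levels.
skewed = effective_dimension([3.0, 1.0, 0.0, 0.0])

print(single, uniform, skewed)
```

An energy eigenstate gives an effective dimension of one, an equal-weight superposition of n eigenstates gives n, and unevenly weighted superpositions fall in between.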
In the remainder of this section we will establish a number of rigorous measure theoretic statements supporting this intuition. The considerations will necessarily be quite technical. In particular, we will consider states drawn according to the Haar measure from subspaces of the total Hilbert space, product states, where both tensor components are drawn from subspaces according to the Haar measure, and states from the mean energy ensemble. When first reading this work it may be better to settle for the intuitive argument given above, skip the rest of this section, and continue reading in section 4.
3.1 States drawn from subspaces
One of the central results derived in  is that almost all pure states drawn according to the unitary invariant Haar measure from a high dimensional subspace have a high effective dimension:
(Theorem 2 in ) i) The average effective dimension with respect to a Hamiltonian with non-degenerate energy gaps , where the average is computed over uniformly random pure initial states drawn from some subspace of dimension , is such that
ii) For a random pure initial state , the probability that is smaller than is exponentially small, namely
with a constant .
The above theorem states that whenever one draws a state according to the Haar measure from a high dimensional subspace one will almost certainly get a state with a high effective dimension. Note that theorem 3.1 is a very strong statement. It is actually much stronger than what we will need in the following, namely that is much larger than some low, fixed power of the dimension of the Hilbert space of the subsystem .
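A quick Monte Carlo experiment illustrates the first part of the theorem. The sketch assumes, for simplicity, that the subspace is spanned by `D_R` energy eigenstates of a Hamiltonian with non-degenerate spectrum, so that the expansion coefficients of a Haar-random state in the energy eigenbasis are themselves a normalized complex Gaussian vector. In that special case the average of sum_k |c_k|^4 equals 2 / (D_R + 1), so the average effective dimension is expected to be slightly above half the subspace dimension. All names and parameter values are illustrative.

```python
import math
import random


def haar_random_amplitudes(dim, rng):
    """Coefficients of a Haar-random pure state in a fixed orthonormal basis."""
    v = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in v))
    return [a / norm for a in v]


def effective_dimension(amplitudes):
    """1 / sum_k |c_k|^4 for normalized amplitudes c_k."""
    return 1.0 / sum(abs(c) ** 4 for c in amplitudes)


def average_effective_dimension(d_r, samples, rng):
    """Sample mean of the effective dimension over Haar-random states."""
    total = 0.0
    for _ in range(samples):
        total += effective_dimension(haar_random_amplitudes(d_r, rng))
    return total / samples


D_R = 200                      # assumed subspace dimension
rng = random.Random(1)
average = average_effective_dimension(D_R, 300, rng)
ratio = average / D_R
print(average, ratio)
```

For a subspace that is not aligned with the energy eigenbasis, the coefficients are no longer uniformly distributed and this simple simulation does not apply directly.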
3.2 Product states
A particularly interesting class of initial states are product states. Theorem 3.1 shows that almost all states chosen from sufficiently large subspaces have a high effective dimension. The set of product states however is not a subspace.
The applicability of theorem 3.1 to product states is therefore limited to the case where either the system or the bath states are fixed and the other is chosen from a subspace or of the Hilbert space of the bath or system respectively, such that or .
Here we show that a slightly modified version of the first part of theorem 3.1 holds for product states where both the system and the bath part are chosen from subspaces and respectively:
The average effective dimension with respect to a Hamiltonian with non-degenerate energy gaps where the average is computed over product states consisting of uniformly random pure initial states chosen from subspaces of dimension respectively is such that
The proof uses some of the ideas from the proof of theorem 2 in . The first step is to see that the average effective dimension is bounded by the inverse of the average purity of the time averaged state as follows.
To bound the average purity we first use the simple identity
where is the swap operator of the two tensor components. Equation (3.8) can easily be proved by expanding it in a basis.
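The basis expansion mentioned above can be written out explicitly. The following derivation is a sketch in assumed notation, since the displayed identity is not reproduced here: A and B are operators on a space with orthonormal basis {|i>}, and S is the swap operator with S|i>|j> = |j>|i>. Setting both operators equal to the time averaged state gives an expression for its purity.

```latex
% Assumed notation: A, B operators on a d-dimensional space,
% S the swap operator, S |i>|j> = |j>|i>.
\begin{align}
\operatorname{Tr}\big[(A \otimes B)\, S\big]
  &= \sum_{i,j} \langle i | \langle j |\, (A \otimes B)\, S\, | i \rangle | j \rangle \\
  &= \sum_{i,j} \langle i | \langle j |\, (A \otimes B)\, | j \rangle | i \rangle \\
  &= \sum_{i,j} \langle i | A | j \rangle \, \langle j | B | i \rangle \\
  &= \operatorname{Tr}[A B] .
\end{align}
% Special case A = B = \omega:
% \operatorname{Tr}[\omega^2] = \operatorname{Tr}[(\omega \otimes \omega)\, S].
```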
Second, we need the following lemma, which follows from the representation theory of the unitary group:
 Let be the average over random pure states drawn from some subspace of dimension . Then
where and is the projector onto the subspace .
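The content of the lemma can be checked numerically in a small example. The sketch below assumes the standard form of this result, namely that the average of the two-copy state over Haar-random pure states equals the projector onto the two-copy subspace times (identity + swap), divided by d (d + 1), where d is the subspace dimension. It further assumes that the subspace is the whole space with d = 3. In a fixed basis this predicts that the average of |c_i|^2 |c_j|^2 is 2 / (d (d + 1)) for i = j and 1 / (d (d + 1)) for i ≠ j. The function names are illustrative.

```python
import math
import random


def haar_random_state(dim, rng):
    """Haar-random pure state: normalized vector of i.i.d. complex Gaussians."""
    v = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in v))
    return [a / norm for a in v]


def second_moments(dim, samples, rng):
    """Monte Carlo estimates of E|c_0|^4 and E|c_0|^2 |c_1|^2."""
    same = 0.0
    different = 0.0
    for _ in range(samples):
        c = haar_random_state(dim, rng)
        p0 = abs(c[0]) ** 2
        p1 = abs(c[1]) ** 2
        same += p0 * p0
        different += p0 * p1
    return same / samples, different / samples


DIM = 3                        # assumed dimension for the check
rng = random.Random(2)
same, different = second_moments(DIM, 20_000, rng)

# Diagonal matrix elements of (1 + S) / (d (d + 1)) in the product basis.
predicted_same = 2.0 / (DIM * (DIM + 1))
predicted_different = 1.0 / (DIM * (DIM + 1))
print(same, predicted_same, different, predicted_different)
```

Only the diagonal matrix elements in the product basis are compared here, which is enough to see the factor of two contributed by the swap term.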
Third, we need the assumption of non-degenerate energy gaps to identify the time average with the dephasing map introduced in (1.3). In addition we need another linear swap operator that is defined via its action on product states,
where and . Note that is unitary, and .
Writing instead of for the eigenstates to simplify the notation, the average purity can be written as follows:
Here and are the identity and the swap operator on the product spaces , and the are the projectors onto the symmetric product of subspaces