Quantum Probability Theory and the Foundations of Quantum Mechanics
- 1 A glimpse of Quantum Probability Theory and of a Quantum Theory of Experiments
- 2 Models of Physical Systems
- 3 Classical ("realistic") models of physical systems
- 4 Physical systems in quantum mechanics
- 5 Removing the veil: Empirical properties of physical systems in quantum mechanics
- 5.1 Information loss and entanglement
- 5.2 Preliminaries towards a notion of "empirical properties" of quantum mechanical systems
- 5.3 So, what are "empirical properties" of a quantum-mechanical system?
- 5.4 When does an observation or measurement of a physical quantity take place?
- 5.5 Generalizations and summary
- 5.6 Non-demolition measurements
- 5.7 Appendix to Section 5
1. A glimpse of Quantum Probability Theory and of a Quantum Theory of Experiments
By and large, people are better at coining expressions than at filling them with interesting, concrete contents. Thus, it may not be very surprising that there are many professional probabilists who may have heard the expression but do not appear to be aware of the need to develop "quantum probability theory" into a thriving, rich, useful field featured at meetings and conferences on probability theory. Although our aim, in this essay, is not to contribute new results on quantum probability theory, we hope to be able to let the reader feel the enormous potential and richness of this field. What we intend to do, in the following, is to contribute some novel points of view to the "foundations of quantum mechanics", using mathematical tools from "quantum probability theory" (such as the theory of operator algebras).
The "foundations of quantum mechanics" represent a notoriously thorny and enigmatic subject. Asking twenty-five grown-up physicists to present their views on the foundations of quantum mechanics, one can expect to get the following spectrum of reactions (this story is purely fictional, but quite plausible): Three will refuse to talk – alluding to the slogan "shut up and calculate" – three will say that the problems encountered in this subject are so difficult that it might take another 100 years before they are solved; five will claim that the "Copenhagen Interpretation" has settled all problems, but they are unable to say, in clear terms, what they mean; three will refer us to Bell’s book (but admit they have not understood it completely); three confess to be "Bohmians" (but do not claim to have had an encounter with Bohmian trajectories); two claim that all problems disappear in the Dirac-Feynman path-integral formalism [22, 28]; another two believe in "many worlds" but make their income in ours, and two advocate "consistent histories"; two swear by QBism (but have never seen "les demoiselles d’Avignon"); two are convinced that the collapse of the wave function – spontaneous or not – is fundamental; and one thinks that one must appeal to quantum gravity to arrive at a coherent picture.
Almost all of them are convinced that theirs is the only sane point of view (and that Heisenberg’s 1925 paper cannot be understood). Many workers in the field have lost the ability to do technically demanding work or never had it. Many of them are knowingly or unknowingly envisaging an extension of quantum mechanics – but do not know what it will look like. But some claim that "quantum mechanics cannot be extended" (perhaps unaware of the notorious danger of "no-go theorems").
At least fifteen of the views those twenty-five physicists present logically contradict one another. Most colleagues are convinced that somewhat advanced mathematical methods are superfluous in addressing the problems related to the foundations of quantum mechanics, and they turn off when they hear an expression such as "$C^*$-algebra" or "type-III factor". Well, it might just turn out that they are wrong! What appears certain is that the situation is somewhat desperate, and this may explain why people tend to become quite emotional when they discuss the foundations of quantum mechanics.
When the senior author had to start teaching quantum mechanics to students, many years ago, he followed the slogan "shut up and calculate" – until he decided that the situation described above, namely the fact that we do not really understand, in a coherent and conceptual way, what that most successful theory of physics called "quantum mechanics" tells us about Nature, represents an intellectual scandal.
Our essay will, of course, not remove this scandal. But we hope that, with some of our writings, we may be able to contribute some kind of intellectual "screw driver" useful in helping to unscrew ("dévisser les problèmes", in reference to A. Grothendieck) the enigmas at the root of the scandal, before very long. We won’t attempt to extend or "complete" quantum mechanics (although we bear people no grudge who try to do so, and we wish them well). We are convinced that starting from simple, intuitive, general principles ("information loss" and "entanglement generation") and then elucidating the mathematical structure inherent in quantum mechanics will lead to a better understanding of its deep message. (Of course, we realize that our hope is lost on people who are convinced that the mysteries surrounding the interpretation of quantum mechanics can be unravelled without any use of somewhat advanced mathematical concepts.)
Just to be clear about one point: We are not claiming to present any "revolutionary" new ideas; and we do not claim or expect to get much credit for our attempts.
But, by all means, let’s get started! Quantum mechanics is "quantum", and it is intrinsically "probabilistic" [26, 10]. We should therefore expect that it is intimately connected to quantum probability theory, hence to "non-commutative measure theory", etc. However, in the end, "quantum mechanics is quantum mechanics and everything else is everything else!" (Compare Ad Reinhardt: "The one thing to say about art is that it is one thing. Art is art-as-art and everything else is everything else.")
1.1. Might quantum probability theory be a subfield of (classical) probability theory?
And – if not – what’s different about it? These questions are related to one concerning the existence of hidden variables. The first convincing results on hidden variables were brought forward by Kochen and Specker and (independently) by Bell. These matters are so well known, by now, that we do not repeat them here. The upshot is that, loosely speaking, quantum probability theory cannot be imbedded in classical probability theory (except in the case of a two-level system).
The deeper problems of quantum mechanics can probably only be understood if we admit a notion of time, introduce time-evolution, proceed to consider repeated measurements, i.e., time-ordered sequences of observations or measurements resulting in a time-ordered sequence of events, and understand in which way information gets lost for ever, in the course of time evolution. (We believe that this will lead to an acceptable "ontology" of quantum mechanics [2, 24] not involving any fundamental role of the "observer".)
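In both worlds, the classical and the quantum world, physical quantities or (potential) properties are represented by self-adjoint operators, $a = a^*$, and possible events by spectral projections, $\Pi$, or certain products thereof (POVM’s; see Appendix A to Section 4, and Subsection 5.4). A successful measurement or observation of a physical quantity or property represented by an operator $a = a^*$ results in one of several possible events, $\Pi_1, \dots, \Pi_k$ (spectral projections of $a$), with the properties that
$$\text{(i)}\ \ \Pi_\alpha = \Pi_\alpha^* = \Pi_\alpha^2, \qquad \text{(ii)}\ \ \Pi_\alpha \Pi_\beta = \delta_{\alpha\beta}\,\Pi_\alpha, \qquad \text{(iii)}\ \ \sum_{\alpha=1}^{k} \Pi_\alpha = 1. \tag{1.1}$$
Suppose we carry out a sequence of mutually "independent" measurements or observations of physical quantities, $a_1, \dots, a_n$, ordered in time, i.e., $a_1$ before $a_2$ before … before $a_n$ ($a_1 \prec a_2 \prec \dots \prec a_n$). A physical theory should enable us to predict the probabilities for all possible "histories",
$$\{\Pi^1_{\alpha_1}, \dots, \Pi^n_{\alpha_n}\},$$
of events, where $\Pi^i_1, \dots, \Pi^i_{k_i}$ are the possible events resulting from a successful measurement of $a_i$, $i = 1, \dots, n$. – On the basis of what prior knowledge? Well, we must know the time evolution of physical quantities and the "state", $\omega$, of the system, $S$, we observe. That means that, given a state $\omega$, there should exist a functional, $\mathrm{Prob}_\omega$, that associates with each history – but for what family of histories, i.e., for which properties $a_1, \dots, a_n$? – a probability
$$\mathrm{Prob}_\omega\{\Pi^1_{\alpha_1}, \dots, \Pi^n_{\alpha_n}\} \in [0, 1].$$
By property (iii) in Eq. (1.1),
$$\sum_{\alpha_1, \dots, \alpha_n} \mathrm{Prob}_\omega\{\Pi^1_{\alpha_1}, \dots, \Pi^n_{\alpha_n}\} = 1,$$
because $\omega$ is normalized such that $\omega(1) = 1$. In a classical theory, the projections $\Pi^i_\alpha$, $\alpha = 1, \dots, k_i$, are characteristic functions on a measure space, $M_S$, and a state, $\omega$, is a probability measure on $M_S$. It then follows from property (iii) that
$$\sum_{\alpha_i} \mathrm{Prob}_\omega\{\Pi^1_{\alpha_1}, \dots, \Pi^i_{\alpha_i}, \dots, \Pi^n_{\alpha_n}\} = \mathrm{Prob}_\omega\{\Pi^1_{\alpha_1}, \dots, \Pi^{i-1}_{\alpha_{i-1}}, \Pi^{i+1}_{\alpha_{i+1}}, \dots, \Pi^n_{\alpha_n}\} \tag{1.4}$$
for arbitrary $i$.
If we consider a quantum mechanical system with finitely many degrees of freedom then the projections $\Pi^i_\alpha$ are orthogonal projections on a separable Hilbert space, $\mathcal{H}_S$, and, by Gleason’s theorem, $\omega$ is given by a density matrix, $\rho_\omega$, on $\mathcal{H}_S$. Moreover, according to [53, 62, 74, 49],
$$\mathrm{Prob}_\omega\{\Pi^1_{\alpha_1}, \dots, \Pi^n_{\alpha_n}\} = \mathrm{tr}\big(\Pi^n_{\alpha_n} \cdots \Pi^1_{\alpha_1}\, \rho_\omega\, \Pi^1_{\alpha_1} \cdots \Pi^n_{\alpha_n}\big). \tag{1.5}$$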
The problem with Eq. (1.5) is that, most often, it represents physical and probability-theoretical nonsense. For example, it is usually left totally unclear what physical quantities or properties of $S$ will be measurable (i.e., which family of histories will become observable), given a time evolution and a state $\omega$. But such problems do not stop people from studying Eq. (1.5) again and again – and we are no exception. To address one of the key problems with Eq. (1.5), we study an example.
We consider a monochromatic beam of light, which, according to Einstein, consists of individual photons of fixed frequency. We then bring three filters into the beam that produce linearly polarized light. The direction of polarization is given by an angle $\theta$ that can be varied by rotating the filter around the axis defined by the beam; see Figure 1.1.
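With the filter $i$, $i = 1, 2, 3$, rotated to the angle $\theta_i$, we associate two possible events
$$\Pi^i_+ \ \leftrightarrow\ \text{a photon passes filter } i, \qquad \Pi^i_- = 1 - \Pi^i_+ \ \leftrightarrow\ \text{a photon does not pass filter } i.$$
Experimentally, one finds that, for any initially unpolarized beam of light (meaning that the photons are all prepared in a state $\omega_0$ with density matrix $\rho_0 = \tfrac{1}{2}\,1$),
$$\mathrm{Prob}_{\omega_0}\{\Pi^i_+, \Pi^j_+\} = \tfrac{1}{2} \cos^2(\theta_i - \theta_j) \tag{1.6}$$
if only filters $i$ and $j$ are present, with $i < j$. It follows from Eq. (1.6) that
$$\mathrm{Prob}_{\omega_0}\{\Pi^1_+, \Pi^2_+, \Pi^3_+\} = \tfrac{1}{2} \cos^2(\theta_1 - \theta_2)\, \cos^2(\theta_2 - \theta_3), \tag{1.7}$$
the probability that a photon passes the first filter, 1, being $\tfrac{1}{2}$, because the initial beam is unpolarized (or circularly polarized). Formulae (1.6) and (1.7) can be tested experimentally by intensity measurements before and after each filter. If the projections $\Pi^i_\pm$ were characteristic functions on a measure space, $M_S$, then we would have that
$$\mathrm{Prob}_{\omega_0}\{\Pi^1_+, \Pi^3_+\} = \sum_{\epsilon = \pm} \mathrm{Prob}_{\omega_0}\{\Pi^1_+, \Pi^2_\epsilon, \Pi^3_+\}, \tag{1.9}$$
and hence
$$\tfrac{1}{2} \cos^2(\theta_1 - \theta_3) \ \geq\ \tfrac{1}{2} \cos^2(\theta_1 - \theta_2)\, \cos^2(\theta_2 - \theta_3). \tag{1.10}$$
Setting $\theta_1 = 0$, $\theta_2 = \pi/4$, and $\theta_3 = \pi/2$, Eq. (1.10) would imply that $0 \geq \tfrac{1}{8}$, which is obviously wrong! What is going on? It turns out that the sum rule (1.9) is violated. The reason is that the projections associated with different filters do not commute. This fact is closely related to non-vanishing interference between $\Pi^2_+$ and $\Pi^2_-$ analogous to the interference encountered in the double-slit experiment. Interference between $\Pi^2_+$ and $\Pi^2_-$ is measured by
$$2\, \mathrm{Re}\ \mathrm{tr}\big(\Pi^3_+ \Pi^2_+ \Pi^1_+\, \rho_0\, \Pi^1_+ \Pi^2_- \Pi^3_+\big).$$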
Choosing and (for example), we find a non-vanishing interference term, which explains why the sum rule (1.9) is violated. What is the message? The first filter, 1, may be interpreted as "preparing" the photons in the beam hitting the filter 2 to be linearly polarized as prescribed by the angle . In our experimental set-up there is no instrument measuring whether a photon has passed filter 2, or not. The only measurement is made after filter 3, where either a photon triggers a Geiger counter to click, or there is no photon triggering the Geiger counter. Let us denote the probability for the first event (Geiger counter clicks) by , the second by . The histories contributing to are
The unique history contributing to appears to be
These findings can be accounted for by associating with the event the operator
and with the event the operators
It should however be noted that
For this reason, some people may prefer to replace by the pair , , and to set , . Then,
corresponds to the "virtual history"
which cannot be interpreted classically. This should not bother us, because no measurement is carried out between filters 2 and 3.
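As a numerical illustration of the three-filter discussion, the following sketch evaluates history probabilities of the form tr(P_n … P_1 ρ P_1 … P_n) and compares the two alternatives at filter 2 with the situation in which filter 2 is removed. It is a minimal example under assumptions that are ours, not the text's: polarization is modelled by real two-component vectors, a filter at angle θ by the rank-one projection onto (cos θ, sin θ), the unpolarized beam by ρ = ½·1, and the angles 0, π/4, π/2 are picked for illustration. All function names are hypothetical.

```python
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
RHO_UNPOLARIZED = [[0.5, 0.0], [0.0, 0.5]]  # assumption: maximally mixed polarization


def projector(theta):
    """Projection onto linear polarization along the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]


def complement(p):
    """Event 'photon does not pass the filter': 1 - p."""
    return [[IDENTITY[i][j] - p[i][j] for j in range(2)] for i in range(2)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def ordered_product(events):
    """P_n ... P_1 for a time-ordered list [P_1, ..., P_n]."""
    product = IDENTITY
    for p in events:
        product = matmul(p, product)
    return product


def overlap(events_a, events_b, rho=RHO_UNPOLARIZED):
    """tr(A rho B^T), with A and B the ordered products (all matrices are real)."""
    a, b = ordered_product(events_a), ordered_product(events_b)
    return trace(matmul(matmul(a, rho), transpose(b)))


def history_probability(events, rho=RHO_UNPOLARIZED):
    return overlap(events, events, rho)


p1, p2, p3 = (projector(t) for t in (0.0, math.pi / 4, math.pi / 2))

passes_2 = history_probability([p1, p2, p3])               # photon passes filter 2
blocked_2 = history_probability([p1, complement(p2), p3])  # the other alternative
filter_2_removed = history_probability([p1, p3])
interference = 2 * overlap([p1, p2, p3], [p1, complement(p2), p3])

print(passes_2, blocked_2, filter_2_removed, interference)
```

With these assumed angles the two alternatives at filter 2 do not add up to the probability found when filter 2 is absent; the mismatch is exactly the interference term, so a classical sum rule over the intermediate event fails.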
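There is a more drastic way to present these findings: Consider $N$ filters in series, the $j$-th filter being rotated through an angle $j\pi/2N$. The probability for an initially vertically polarized photon to be transmitted through all the filters is then given by
$$\Big(\cos^2 \frac{\pi}{2N}\Big)^{N} \ \longrightarrow\ 1, \qquad \text{as } N \to \infty.$$
If, however, all filters, except for the $N$-th one, are removed, then the transmission probability is $\cos^2(\pi/2) = 0$.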
Actually, the discussion presented above, although often repeated, is somewhat misleading. The only measurement takes place after the last filter and is supposed to determine whether a photon has passed all the filters, or not. The corresponding physical quantity corresponds to the operators $\Pi^N_\pm$, where $N$ is the label of the last filter, and the measurement consists in verifying whether a Geiger counter placed after the last filter has clicked, or not. The filters have nothing to do with measurements, but determine (or, at least, affect) the form of the time evolution of the photons. The use of POVM’s in discussing experiments like the ones above is not justified at a fundamental, conceptual level. It merely substitutes for a more precise understanding of time-evolution that involves including the filters in a quantum-mechanical description. It appears that, often, POVM’s are used to cover up a lack of understanding of the time-evolution of large quantum systems. The role they play in a quantum theory of experiments is briefly described in Subsect. 5.4.
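The arrangement of many filters in series mentioned above can be explored numerically. The sketch below assumes, as an illustration of ours, that the j-th of N filters is rotated by jπ/(2N) away from the initial (vertical) polarization and that each filter transmits with probability cos² of the angle between successive polarization directions (Malus's law); the function names are hypothetical.

```python
import math


def transmission_through_all(n_filters):
    """Probability to pass n_filters filters, each rotated by pi/(2 n_filters)
    relative to its predecessor (the first one relative to the photon)."""
    step = math.pi / (2 * n_filters)
    return math.cos(step) ** (2 * n_filters)


def transmission_last_filter_only():
    """Only the last filter, rotated by pi/2 relative to the photon, is kept."""
    return math.cos(math.pi / 2) ** 2


for n in (1, 2, 10, 100, 1000):
    print(n, transmission_through_all(n))
print("last filter only:", transmission_last_filter_only())
```

The transmission probability grows towards 1 as the number of intermediate filters increases, whereas the last filter on its own blocks the photon.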
A more compelling way of convincing oneself that quantum probability cannot be imbedded in classical probability theory than the one sketched above consists in studying correlation matrices of families of (non-commuting) possible events in two independent systems. One then finds that the numerical range of possible values of the matrix elements of such correlation matrices is strictly larger in quantum probability theory than in classical probability theory, as discovered by Bell [8, 69]. See  for an alternative approach.
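One concrete instance of such a comparison is the CHSH combination of four correlations, which the text does not name and which is swapped in here for illustration. The sketch below assumes two parties with two ±1-valued quantities each; classically, it enumerates all deterministic value assignments (probabilistic mixtures of them cannot exceed the extremal value), and quantum mechanically it uses the spin-singlet correlation −cos(α − β) for measurement directions at angles α and β. The function names and the choice of angles are ours.

```python
import itertools
import math


def chsh(e11, e12, e21, e22):
    """CHSH combination of the four correlations E(a_i, b_j)."""
    return e11 + e12 + e21 - e22


def classical_chsh_maximum():
    """Largest |CHSH| over all deterministic assignments of values +1 or -1."""
    best = 0
    for a1, a2, b1, b2 in itertools.product((-1, 1), repeat=4):
        best = max(best, abs(chsh(a1 * b1, a1 * b2, a2 * b1, a2 * b2)))
    return best


def singlet_correlation(alpha, beta):
    """Correlation of spin measurements along directions alpha, beta in the singlet state."""
    return -math.cos(alpha - beta)


def quantum_chsh(alpha1, alpha2, beta1, beta2):
    e = singlet_correlation
    return chsh(e(alpha1, beta1), e(alpha1, beta2), e(alpha2, beta1), e(alpha2, beta2))


classical_bound = classical_chsh_maximum()
quantum_value = abs(quantum_chsh(0.0, math.pi / 2, math.pi / 4, -math.pi / 4))
print(classical_bound, quantum_value)
```

For these angles the quantum value equals 2√2, strictly above the classical maximum of 2.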
1.2. The quantum theory of experiments
We return to considering a system, $S$, and suppose that $n$ consecutive measurements have been carried out successfully, with the $i$-th measurement described by spectral projections $\Pi^i_\alpha$, $\alpha = 1, \dots, k_i$, of a physical quantity $a_i$, with
$$\sum_{\alpha=1}^{k_i} \Pi^i_\alpha = 1$$
for all $i$. (We could also use POVM’s, instead of projections, but let’s not!) The probability of a history $\{\Pi^1_{\alpha_1}, \dots, \Pi^n_{\alpha_n}\}$ in a state $\omega$ of $S$ given by a density matrix $\rho_\omega$ is then given by formula (1.5), above. The measurements can be considered to be successful only if the sum rules (1.4) are very nearly satisfied, for all $i$. Whether this is true, or not, can be determined by studying the interference between different histories. Given a state $\omega$, we define matrices, $P^\omega = (P^\omega_{\underline{\alpha}, \underline{\alpha}'})$, $\underline{\alpha} = (\alpha_1, \dots, \alpha_n)$, by
$$P^\omega_{\underline{\alpha}, \underline{\alpha}'} := \omega\big(\Pi^1_{\alpha_1} \cdots \Pi^n_{\alpha_n}\, \Pi^n_{\alpha'_n} \cdots \Pi^1_{\alpha'_1}\big),$$
where $\omega(a)$ is the expectation of the operator $a$ in the state $\omega$. Measurements of the quantities $a_1, \dots, a_n$ can be considered to be successful only if $P^\omega$ is approximately diagonal, i.e.,
$$P^\omega_{\underline{\alpha}, \underline{\alpha}'} \approx \delta_{\underline{\alpha}\, \underline{\alpha}'}\ \mathrm{Prob}_\omega\{\Pi^1_{\alpha_1}, \dots, \Pi^n_{\alpha_n}\},$$
which is customarily called "decoherence"; see, e.g., [46, 36, 9, 48]. All this is discussed in much detail in Sections 4.3 and 5. In particular, we will show that decoherence is a consequence of "entanglement generation" between the system $S$ and its environment $E$ and of "information loss", meaning that the original state of $S \vee E$ cannot be fully reconstructed from the results of arbitrary measurements carried out after some time $T$, long after the interactions between $S$ and $E$ have set in; see Sect. 5, and [29, 30, 16]. In local relativistic quantum theory with massless particles (photons), the kind of information loss alluded to here is a general consequence of Huyghens’ principle and of "Einstein causality". It appears already in classical field theory. In local relativistic quantum theory it becomes manifest in the circumstance that the algebra of operators representing physical quantities measurable by a localized observer after some time $T$ does not admit any pure states.
The key problem in a quantum theory of experiments (or measurements/observations) is, however, to find out which physical quantities will be measured (i.e., what potential properties of a system will become "empirical" properties, or what families of histories of events can be expected to be observed) in the course of time, given the choice of a system, $S$, coupled to an environment, $E$, of a specific time evolution of $S \vee E$, and of a fixed state, $\omega$, of $S \vee E$. This is sometimes referred to as the problem of eliminating the mysterious role of the "observer" from quantum mechanics (making many worlds superfluous), and of determining the "primitive ontology" of quantum mechanics. This problem will be reckoned with in Subsects. 5.3 and 5.4.
One customarily distinguishes between "direct (or von Neumann) measurements" and (indirect, or) "non-demolition measurements" carried out on a physical system $S$. It may be assumed that it is clear what is meant by a direct measurement. A non-demolition measurement is carried out by having a sequence of "probes" $E_1, E_2, E_3, \dots$ interact with the system $S$, one after another, with the purpose of measuring a physical quantity, $a = a^*$, of $S$ with (for simplicity) finite point spectrum, $\{\lambda_1, \dots, \lambda_k\}$. If $S$ is in an eigenstate, $\psi_j$, of $a$ corresponding to the eigenvalue $\lambda_j$ right before it starts to interact with the $m$-th probe, $E_m$, the time-evolution of the composed system, $S \vee E_m$, is assumed to leave $\psi_j$ invariant but changes the state of $E_m$ in a manner that depends non-trivially on $j$, for each $m$. This leads to entanglement between $S$ and $E_m$. If, for simplicity, it is assumed that the probes are all independent of one another and that $E_m$ interacts with $S$ strictly after $E_{m-1}$ and strictly before $E_{m+1}$, then the state of $S$ decoheres exponentially rapidly with respect to the basis $\{\psi_j\}_{j=1}^{k}$, as $m \to \infty$. More precisely, if $\rho^{(m)}$ denotes the state of $S$ after its interaction with $E_m$ and before its interaction with $E_{m+1}$, with
$$\rho^{(m)} = \sum_{i,j=1}^{k} \rho^{(m)}_{ij}\, |\psi_i\rangle\langle\psi_j|, \qquad \text{then} \qquad \rho^{(m)}_{ij} \ \longrightarrow\ \delta_{ij}\, \rho^{(0)}_{ii}, \quad \text{as } m \to \infty,$$
exponentially rapidly. This is easily verified; (see Subsect. 5.6). A more subtle result on decoherence involving correlated probes that lead to memory effects has also been established.
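The exponential decay of off-diagonal matrix elements under independent probes can be checked in a toy model. The sketch below assumes a two-level system whose eigenstate j leaves every probe in a normalized state φ_j, so that tracing out one probe multiplies the off-diagonal element of the reduced state by the overlap of φ_1 and φ_2, while the diagonal elements stay fixed. The probe states, angles, and function names are illustrative assumptions of ours.

```python
import math


def probe_state(angle):
    """Hypothetical final state of a probe, parametrized by an angle."""
    return (math.cos(angle), math.sin(angle))


def probe_overlap(angle_a, angle_b):
    a, b = probe_state(angle_a), probe_state(angle_b)
    return a[0] * b[0] + a[1] * b[1]


def reduced_state_after(rho, n_probes, probe_angles):
    """Reduced 2x2 state of the system after n_probes independent probes.

    rho is written in the eigenbasis of the measured quantity; the overlap
    is real here, so both off-diagonal entries pick up the same factor.
    """
    factor = probe_overlap(probe_angles[0], probe_angles[1]) ** n_probes
    return [[rho[0][0], rho[0][1] * factor],
            [rho[1][0] * factor, rho[1][1]]]


rho_initial = [[0.5, 0.5], [0.5, 0.5]]  # equal superposition of the two eigenstates
angles = (0.0, 0.6)                     # assumed probe responses to eigenstates 1 and 2

for m in (0, 10, 50):
    print(m, reduced_state_after(rho_initial, m, angles)[0][1])
```

The off-diagonal element shrinks geometrically in the number of probes as long as the two probe states are distinguishable, i.e., their overlap has modulus strictly less than one.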
One might ask what happens if a direct measurement is carried out on every probe $E_m$ after it has interacted with $S$, $m = 1, 2, 3, \dots$. (We assume, for simplicity, that all probes are identical, independent and identically prepared, and that they are all subject to the same direct measurement). Then one can show that, under natural non-degeneracy conditions, the state, $\rho^{(m)}$, of $S$, after the passage of $m$ probes $E_1, \dots, E_m$, converges to an eigenstate of $a$, i.e.,
$$\rho^{(m)} \ \longrightarrow\ |\psi_j\rangle\langle\psi_j|, \tag{1.21}$$
as $m \to \infty$, for some $j$, and the probability of approach of $\rho^{(m)}$ to $|\psi_j\rangle\langle\psi_j|$ is given by $\rho^{(0)}_{jj}$. This important result has been derived by M. Bauer and D. Bernard as a corollary of the Martingale Convergence Theorem. The convergence claimed in Eq. (1.21) is remarkable, because it says that, asymptotically as $m \to \infty$, a pure state (some eigenstate of $a$) is approached; i.e., a very long sequence of indirect (non-demolition) measurements carried out on $S$ always results in a "fact" (namely, the state of $S$ approaches an eigenvector of the quantity $a$ that one intends to measure). Somewhat related results ("approach to a groundstate") for more realistic models have been proven in [31, 21, 33]. (A result of the form of Eq. (1.21) was conjectured by J.F. in the 90’s, but the proof remained elusive.)
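The convergence described above can be observed in a simple simulation. The sketch below assumes a two-level system and probes with two possible measurement outcomes; under the non-demolition assumption only the populations (diagonal elements in the eigenbasis of the measured quantity) matter, and each probe outcome updates them by Bayes' rule. The outcome probabilities, the initial populations, the seed, and all names are illustrative choices of ours, not taken from the text.

```python
import random


def run_indirect_measurement(populations, likelihoods, n_probes, rng):
    """Update the populations of the system after n_probes measured probes.

    likelihoods[j][x] is the assumed probability of probe outcome x
    when the system is in eigenstate j.
    """
    q = list(populations)
    n_outcomes = len(likelihoods[0])
    for _ in range(n_probes):
        outcome_probs = [sum(q[j] * likelihoods[j][x] for j in range(len(q)))
                         for x in range(n_outcomes)]
        x = rng.choices(range(n_outcomes), weights=outcome_probs)[0]
        # Bayes' rule: condition the populations on the observed outcome
        q = [q[j] * likelihoods[j][x] / outcome_probs[x] for j in range(len(q))]
    return q


rng = random.Random(0)
initial_populations = [0.3, 0.7]
likelihoods = [[0.7, 0.3],   # eigenstate 0: outcome probabilities of a probe
               [0.4, 0.6]]   # eigenstate 1
n_runs, n_probes = 400, 400

finals = [run_indirect_measurement(initial_populations, likelihoods, n_probes, rng)
          for _ in range(n_runs)]
fraction_to_state_0 = sum(q[0] > 0.5 for q in finals) / n_runs
print(fraction_to_state_0)
```

In each run the populations end up concentrated on a single eigenstate, and the fraction of runs ending in a given eigenstate is close to its initial population, in line with the statement about the probability of approach.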
1.3. Organization of the paper
In Section 2, we introduce an abstract algebraic framework for the formulation of mathematical models of physical systems that is general enough to encompass classical and quantum mechanical models. We attempt to clarify what kind of predictions a model of a physical system ought to enable us to come up with. Furthermore, we summarize some important facts about operator algebras needed in subsequent sections.
In Section 3, we describe classical models of physical systems within our algebraic framework and explain in which sense, and why, they are "realistic" and "deterministic".
In Section 4, we study a general class of quantum-mechanical models of physical systems within our general framework. We explain what some of the key problems in a quantum theory of observations and measurements are.
The most important section of this essay is Section 5. We attempt to elucidate the roles played by entanglement between a system and its environment and of information loss in understanding "decoherence" and "dephasing", which are key mechanisms in a quantum theory of measurements and experiments; see also [46, 8, 36, 48]. In particular, we point out that the state of the composition of a system with its environment can usually not be reconstructed from measurements long after interactions between the system and its environment have set in; ("information loss"). We also discuss the problem of "time in quantum mechanics" and sketch an answer to the question when an experiment can be considered to have been completed successfully; ("when does a detector click?"). Put differently, the "primitive ontology" of quantum mechanics is developed in Subsects. 5.3 and 5.4. Finally, in Subsection 5.6, we briefly develop the theory of indirect non-demolition measurements, following  .
An outline of relativistic quantum theory and of the role of space-time in relativistic quantum theory has been sketched in lectures and will be presented elsewhere; (see also ).
A rough first draft of this paper has been written during J.F.’s stay at the School of Mathematics of the Institute for Advanced Study (Princeton), 2012/2013. His stay has been supported by the ’Fund for Math’ and the ’Monell Foundation’. He is deeply grateful to Thomas C. Spencer for his most generous hospitality. He acknowledges useful discussions with Ph. Blanchard, P. Deift, S. Kochen and S. Lomonaco. He thanks D.Bernard for drawing his attention to  and W. Faris for correspondence. He is grateful to D. Buchholz, D. Dürr, S. Goldstein, J. Yngvason and N. Zanghi for numerous friendly and instructive discussions, encouragement and for the privilege to occasionally disagree in mutual respect and friendship.
2. Models of Physical Systems
In this section, we sketch a somewhat abstract algebraic framework suitable to formulate mathematical models of physical systems. Our framework is general enough to encompass classical and quantum-mechanical models.
Throughout most of this essay, we consider non-relativistic models of physical systems, so that, in principle, all "observers" have access to the same observational data. For this reason, reference to "observers" is superfluous in the framework to be exposed here. This is radically different in causal relativistic models.
In every model of a physical system, $S$, one specifies $S$ in terms of (all) its "potential properties", i.e., in terms of "physical quantities" or "observables" characteristic of $S$. No matter whether we consider classical or quantum-mechanical systems, "physical quantities" are represented, mathematically, by bounded, self-adjoint, linear operators. Thus, a system $S$ is specified by a list
$$\mathcal{P}_S = \{a_i\}_{i \in I_S}$$
of physical quantities, $a_i = a_i^*$, characteristic of $S$ that can be observed or measured in experiments.
In classical physics, a physical quantity, $a$, is given by a real-valued (measurable or continuous) function on a topological space, $M_S$, which is the "state space" of $S$ (the phase space if $S$ is Hamiltonian). Quantum-mechanically, more general linear operators are encountered, and, as is well known, the operators in $\mathcal{P}_S$ need not all commute with one another. It is natural to assume that if $a$ is a physical quantity of $S$ then so is any polynomial, $p(a)$, in $a$ with real coefficients. It is, however, not very plausible that arbitrary real-linear combinations and/or symmetrized products of distinct elements in $\mathcal{P}_S$ would belong to $\mathcal{P}_S$. But, in non-relativistic physics, it has turned out to be reasonable to view $\mathcal{P}_S$ as a self-adjoint subset of an operator algebra, $\mathcal{A}_S$, usually taken to be a $C^*$- or a von Neumann algebra, in terms of which a model of $S$ can be formulated. Physicists tend to be scared when they hear expressions like ’$C^*$-’ or ’von Neumann algebra’. Well, they shouldn’t!
2.1. Some basic notions from the theory of operator algebras
In order to render this paper comprehensible to the non-expert, we summarize some basic definitions and notions from the theory of operator algebras; for further details see .
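An algebra, $\mathcal{A}$, over the complex numbers is a complex vector space equipped with a multiplication: If $a$ and $b$ belong to $\mathcal{A}$, then
$$\lambda a + \mu b \in \mathcal{A} \quad (\lambda, \mu \in \mathbb{C}), \qquad a \cdot b \in \mathcal{A},$$
where "$\cdot$" denotes multiplication in $\mathcal{A}$. One says that $\mathcal{A}$ is a $^*$-algebra iff there exists an anti-linear involution, $^*$, on $\mathcal{A}$, i.e., $^* : \mathcal{A} \to \mathcal{A}$, with $(a^*)^* = a$, for all $a \in \mathcal{A}$, such that
$$(\lambda a + \mu b)^* = \bar{\lambda}\, a^* + \bar{\mu}\, b^*,$$
where $\bar{\lambda}$ is the complex conjugate of $\lambda$, and
$$(a \cdot b)^* = b^* \cdot a^*.$$
The algebra $\mathcal{A}$ is a normed algebra (Banach algebra) if it comes with a norm $\|\cdot\|$ satisfying
$$\|a \cdot b\| \leq \|a\|\, \|b\|$$
($\mathcal{A}$ is complete in $\|\cdot\|$, i.e., every Cauchy sequence in $\mathcal{A}$ converges to an element of $\mathcal{A}$).
A Banach algebra, $\mathcal{A}$, with involution $^*$ is a $C^*$-algebra iff
$$\|a^* \cdot a\| = \|a\|^2, \qquad \text{for all } a \in \mathcal{A}.$$
We define the centre, $\mathcal{Z}_{\mathcal{A}}$, of $\mathcal{A}$ to be the subset of $\mathcal{A}$ given by
$$\mathcal{Z}_{\mathcal{A}} := \{a \in \mathcal{A} \mid a \cdot b = b \cdot a, \ \text{for all } b \in \mathcal{A}\}.$$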
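A state, $\omega$, on a $C^*$-algebra $\mathcal{A}$ with identity $1$ is a linear functional $\omega : \mathcal{A} \to \mathbb{C}$ with the properties that
$$\omega(a^*) = \overline{\omega(a)}, \qquad \omega(a^* \cdot a) \geq 0, \tag{2.5}$$
for all $a \in \mathcal{A}$, and
$$\omega(1) = 1. \tag{2.6}$$
A representation, $\pi$, of a $C^*$-algebra $\mathcal{A}$ on a complex Hilbert space, $\mathcal{H}$, is a $^*$-homomorphism from $\mathcal{A}$ to the algebra, $B(\mathcal{H})$, of all bounded linear operators on $\mathcal{H}$; i.e., $\pi$ is linear, $\pi(a \cdot b) = \pi(a) \cdot \pi(b)$, $\pi(a^*) = \pi(a)^*$, and $\|\pi(a)\| \leq \|a\|$, (where $\|A\|$ is the operator norm of a bounded linear operator $A$ on $\mathcal{H}$).
With a $C^*$-algebra $\mathcal{A}$ and a state $\omega$ on $\mathcal{A}$ we can associate a Hilbert space, $\mathcal{H}_\omega$, a unit vector $\Omega \in \mathcal{H}_\omega$, and a representation, $\pi_\omega$, of $\mathcal{A}$ on $\mathcal{H}_\omega$ such that $\pi_\omega(\mathcal{A})\Omega$ is dense in $\mathcal{H}_\omega$ (i.e. $\Omega$ is cyclic for $\pi_\omega(\mathcal{A})$), and
$$\omega(a) = \langle \Omega, \pi_\omega(a)\, \Omega \rangle, \qquad a \in \mathcal{A},$$
where $\langle \cdot, \cdot \rangle$ is the scalar product on $\mathcal{H}_\omega$. This is the so-called Gel’fand-Naimark-Segal (GNS) construction.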
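A theorem due to Gel’fand and Naimark says that every $C^*$-algebra, $\mathcal{A}$, can be viewed as a norm-closed subalgebra of $B(\mathcal{H})$ closed under $^*$, for some Hilbert space $\mathcal{H}$.
Thus, consider a $C^*$-algebra $\mathcal{A} \subseteq B(\mathcal{H})$, for some Hilbert space $\mathcal{H}$. We define the commuting algebra, or commutant, $\mathcal{A}'$, of $\mathcal{A}$ by
$$\mathcal{A}' := \{a \in B(\mathcal{H}) \mid a \cdot b = b \cdot a, \ \text{for all } b \in \mathcal{A}\}.$$
The double commutant of $\mathcal{A}$, $\mathcal{A}''$, is defined by
$$\mathcal{A}'' := (\mathcal{A}')'.$$
It turns out that $\mathcal{A}'$ and $\mathcal{A}''$ are closed in the so-called weak topology of $B(\mathcal{H})$; i.e., if $\{a_i\}_{i \in I}$ is a sequence (net) of operators in $\mathcal{A}'$ (or in $\mathcal{A}''$), with
$$\langle \varphi, a_i \psi \rangle \ \longrightarrow\ \langle \varphi, a\, \psi \rangle,$$
for all $\varphi, \psi \in \mathcal{H}$, where $a \in B(\mathcal{H})$, then $a \in \mathcal{A}'$ (or $a \in \mathcal{A}''$, respectively). Subalgebras of $B(\mathcal{H})$ that are closed in the weak topology are called von Neumann algebras (or $W^*$-algebras).
Thus, if $\mathcal{A}$ is a $C^*$-algebra contained in $B(\mathcal{H})$, for some Hilbert space $\mathcal{H}$, then $\mathcal{A}'$ and $\mathcal{A}''$ are von Neumann algebras. A von Neumann algebra $\mathcal{M}$ is called a factor iff its centre, $\mathcal{Z}_{\mathcal{M}}$, consists of multiples of the identity operator $1$.
A von Neumann factor $\mathcal{M}$ is said to be of type I iff $\mathcal{M}$ is isomorphic to $B(\mathcal{H})$, for some Hilbert space $\mathcal{H}$. A general von Neumann algebra, $\mathcal{M}$, is said to be of type I iff $\mathcal{M}$ is a direct sum (or integral) over its centre, $\mathcal{Z}_{\mathcal{M}}$, of factors of type I. A $C^*$-algebra $\mathcal{A}$ is called a type-I $C^*$-algebra, iff, for every representation $\pi$, of $\mathcal{A}$ on a Hilbert space $\mathcal{H}_\pi$,
$$\pi(\mathcal{A})'' \ \text{is a von Neumann algebra of type I.}$$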
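A $^*$-automorphism, $\alpha$, of a $C^*$-algebra $\mathcal{A}$ is a linear isomorphism from $\mathcal{A}$ onto $\mathcal{A}$ with the properties
$$\alpha(a \cdot b) = \alpha(a) \cdot \alpha(b), \qquad \alpha(a^*) = \alpha(a)^*,$$
for all $a, b \in \mathcal{A}$.
It is clear what is meant by $\mathcal{B} \subseteq \mathcal{A}$, where $\mathcal{A}$ and $\mathcal{B}$ are $C^*$- or von Neumann algebras. We define
$$\mathcal{B}' \cap \mathcal{A} := \{a \in \mathcal{A} \mid a \cdot b = b \cdot a, \ \text{for all } b \in \mathcal{B}\},$$
the "relative commutant" of $\mathcal{B}$ in $\mathcal{A}$.
Given a set $\{a_i\}_{i \in I}$ of operators in a $C^*$-algebra $\mathcal{A}$, we define $\langle \{a_i\}_{i \in I} \rangle$ to be the $C^*$-subalgebra of $\mathcal{A}$ generated by $\{a_i\}_{i \in I}$, i.e., the norm-closure of arbitrary finite complex-linear combinations of arbitrary finite products of elements in the set $\{a_i, a_i^*\}_{i \in I}$, where $^*$ is the $^*$-operation on $\mathcal{A}$.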
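A trace $\tau : \mathcal{M}_+ \to [0, \infty]$ on a von Neumann algebra $\mathcal{M}$ is a function defined on the positive cone, $\mathcal{M}_+$, of positive elements of $\mathcal{M}$ (i.e., elements of the form $a^* a$, $a \in \mathcal{M}$) that satisfies the properties
$$\text{(i)}\ \ \tau(a + b) = \tau(a) + \tau(b), \ \ a, b \in \mathcal{M}_+; \qquad \text{(ii)}\ \ \tau(\lambda a) = \lambda\, \tau(a), \ \ \lambda \geq 0,\ a \in \mathcal{M}_+; \qquad \text{(iii)}\ \ \tau(a^* a) = \tau(a\, a^*), \ \ a \in \mathcal{M}.$$
A trace $\tau$ is said to be finite if $\tau(1) < \infty$. It can then be uniquely extended by linearity to a positive linear functional on $\mathcal{M}$, which is a state if $\tau(1) = 1$. Conversely, any state $\omega$ on $\mathcal{M}$ enjoying the property
$$\omega(a \cdot b) = \omega(b \cdot a), \qquad \text{for all } a, b \in \mathcal{M},$$
defines a finite trace on $\mathcal{M}$. We say that $\tau$ is faithful if $\tau(a) > 0$ for any non-zero element $a \in \mathcal{M}_+$. A trace $\tau$ is said to be normal if $\tau(\sup_i a_i) = \sup_i \tau(a_i)$ for every bounded increasing net $(a_i)$ of positive elements in $\mathcal{M}$, and semifinite, if, for any $a \in \mathcal{M}_+$, $a \neq 0$, there exists $b \in \mathcal{M}_+$, $0 < b \leq a$, such that $\tau(b) < \infty$. Traces play an important role in the classification of von Neumann algebras. It can be shown that a von Neumann algebra $\mathcal{M}$ is a direct sum (or direct integral) of factors of type $\mathrm{I}_n$ ($n < \infty$) and type $\mathrm{II}_1$ if and only if it admits a faithful finite normal trace. Similarly, $\mathcal{M}$ is a direct sum (or direct integral) of type I, type $\mathrm{II}_1$ and type $\mathrm{II}_\infty$ factors iff it admits a faithful semifinite normal trace. We use these results in Section 5 to characterize the centralizer of a state $\omega$.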
For the time being, we do not have to know more about operator algebras than what has just been reviewed here. We can test our understanding of the notions introduced above on the example of direct sums of full finite-dimensional matrix algebras (block-diagonal matrices) and by doing some exercises, e.g., reproducing a proof of the GNS construction, or applying this material to group theory.
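As one such exercise, the following sketch checks the $C^*$ condition, relating the norm of a*a to the square of the norm of a, on 2×2 complex matrices equipped with the operator norm, and shows that the Frobenius (Hilbert-Schmidt) norm does not satisfy it. It also evaluates a state of the form ω(a) = tr(ρa). The sample matrix, the density matrix ρ, and all function names are illustrative choices of ours.

```python
import math


def adjoint(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def operator_norm(a):
    """Square root of the largest eigenvalue of a* a (closed form for 2x2)."""
    h = matmul(adjoint(a), a)
    t = (h[0][0] + h[1][1]).real
    d = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    discriminant = max(t * t - 4.0 * d, 0.0)
    return math.sqrt((t + math.sqrt(discriminant)) / 2.0)


def frobenius_norm(a):
    return math.sqrt(sum(abs(x) ** 2 for row in a for x in row))


def state(rho, a):
    """omega(a) = tr(rho a) for a density matrix rho."""
    product = matmul(rho, a)
    return product[0][0] + product[1][1]


a = [[1 + 0j, 2j], [0j, 1 + 0j]]
a_star_a = matmul(adjoint(a), a)
identity = [[1 + 0j, 0j], [0j, 1 + 0j]]
rho = [[0.25 + 0j, 0j], [0j, 0.75 + 0j]]

print(operator_norm(a_star_a), operator_norm(a) ** 2)    # equal: C* condition holds
print(frobenius_norm(a_star_a), frobenius_norm(a) ** 2)  # differ: condition fails
print(state(rho, identity), state(rho, a_star_a))        # normalization, positivity
```

The operator norm passes the check while the Frobenius norm fails it for this matrix, which illustrates that the $C^*$ condition singles out a particular norm on the matrix algebra.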
2.2. The operator algebras used to describe a physical system
We have said that (a model of) a physical system, $S$, is specified by a list
$$\mathcal{P}_S = \{a_i\}_{i \in I_S}$$
of physical quantities or potential properties, ($a_i = a_i^*$), characteristic of $S$ that can be observed or measured in experiments. (What is meant by this will hopefully become clear later, in Sections 4 and 5). We assume that $\mathcal{P}_S$ is a self-adjoint subset of a $C^*$-algebra. As explained in Sect. 2.1, we may then consider
$$\mathcal{A}_S := \langle \mathcal{P}_S \rangle,$$
the smallest $C^*$-algebra containing $\mathcal{P}_S$. The algebra $\mathcal{A}_S$ is called the "algebra of observables" defining $S$; (possibly a misnomer, because, a priori, only the elements of $\mathcal{P}_S$ correspond to observable physical quantities - but let’s not worry about this). For physical systems with finitely many degrees of freedom, $\mathcal{A}_S$ is usually a type-I $C^*$-algebra.
We would like to have some natural notions of symmetries of a system $S$, including time evolution. Here we encounter, for the first but not the last time, the complication that $S$ is usually in contact with some environment, $E$, which may also include experimental equipment used to measure some observables of $S$. The environment is a physical system, too, and there usually are interactions between $S$ and $E$; in fact, only thanks to such interactions is it possible to retrieve information from $S$, i.e., measure a potential property $a_i$, $i \in I_S$, of $S$ in a certain interval of time. One typically chooses $E$ to be the smallest system with the property that the composed system, $S \vee E$, characterized by
$$\mathcal{P}_{S \vee E} = \{a_i\}_{i \in I_{S \vee E}} \supseteq \mathcal{P}_S,$$
can be viewed as a "closed physical system".
What is a "closed physical system"? Let $\overline{S} := S \vee E$, and let $\mathcal{A}_{\overline{S}}$ denote the $C^*$-algebra generated by $\mathcal{P}_{\overline{S}}$; i.e., $\mathcal{A}_{\overline{S}} = \langle \mathcal{P}_{\overline{S}} \rangle$. We say that $\overline{S}$ is a closed (physical) system if the time evolution of physical quantities characteristic of $\overline{S}$ is given in terms of $^*$-automorphisms of $\mathcal{A}_{\overline{S}}$; i.e., given two times, $s$ and $t$, $\tau_{t,s}$ is a $^*$-automorphism of $\mathcal{A}_{\overline{S}}$ that associates with every physical quantity in $\mathcal{A}_{\overline{S}}$ specified at time $s$ an operator in $\mathcal{A}_{\overline{S}}$ representing the same physical quantity at time $t$. We must require that
$$\tau_{t,s} \circ \tau_{s,u} = \tau_{t,u},$$
for any triple of times $(t, s, u)$.
Given a physical system, $S$, we choose its environment $E$ such that, within a prescribed precision, $S \vee E$ can be considered to be a closed physical system. "For all practical purposes" (FAPP), i.e., within usually astounding precision, $S \vee E$ is much … much smaller than the entire universe; it does usually not include the experimentalist in the laboratory observing $S$ or the laptop of her theorist colleague next door, etc. To say that $S \vee E$ is a closed physical system does, however, not exclude that $S \vee E$ is entangled with another physical system, $S'$.
Given $S$ and $E$, as above, we call $\mathcal{A}_{\overline{S}}$ the "dynamical $C^*$-algebra" of $S$.
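Let $G_S$ denote a group of symmetries of $S$. We will assume that every element $g \in G_S$ can be represented by a $^*$-automorphism, $\sigma_g$, of $\mathcal{A}_{\overline{S}}$, with the property that
$$\sigma_{g_1} \circ \sigma_{g_2} = \sigma_{g_1 g_2}, \qquad g_1, g_2 \in G_S,$$
i.e., $\sigma$ is a representation of $G_S$ in the group, $^*\mathrm{Aut}(\mathcal{A}_{\overline{S}})$, of $^*$-automorphisms of $\mathcal{A}_{\overline{S}}$. We say that $G_S$ is a group of dynamical symmetries of $S$ iff $\sigma_g$ and time evolution $\tau_{t,s}$ commute, for all $g \in G_S$ and arbitrary pairs of times $(t, s)$.
By a "state of a physical system $S$" we mean a state on the $C^*$-algebra $\mathcal{A}_{\overline{S}}$, in the sense of Eqs. (2.5) and (2.6) in Subsect. 2.1. (This will turn out to be a misnomer when we deal with quantum systems. But the expression appears to be here to stay.) The set of all states of $S$ is denoted by $\mathcal{S}_S$.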
To summarize, a (model of a) physical system, , is specified by the following data.
Definition 2.1 (Algebraic data specifying a model of a physical system).
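A list of physical quantities, or observables, $\mathcal{P}_S = \{a_i\}_{i \in I_S}$, generating a $C^*$-algebra, $\mathcal{A}_S$, of "observables", that is contained in the $C^*$-algebra $\mathcal{A}_{\overline{S}}$ (the "dynamical $C^*$-algebra" of $S$) of a closed system, $\overline{S} = S \vee E$, containing $S$.
The convex set, $\mathcal{S}_S$, of states of $S$, interpreted as states on the $C^*$-algebra $\mathcal{A}_{\overline{S}}$.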
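We should explain what is meant by "time translations": For each time $t \in \mathbb{R}$, we have copies $\mathcal{P}_S(t)$ and $\mathcal{A}_S(t)$ isomorphic to $\mathcal{P}_S$ and $\mathcal{A}_S$, respectively, which are contained in $\mathcal{A}_{\overline{S}}$. If $a(t)$ and $a(s)$ are the operators in $\mathcal{A}_{\overline{S}}$ representing an arbitrary potential property, or observable, $a \in \mathcal{P}_S$, of $S$ at times $t$ and $s$, respectively, then
$$a(t) = \tau_{t,s}\big(a(s)\big),$$
with $\tau_{t,s} \in {}^*\mathrm{Aut}(\mathcal{A}_{\overline{S}})$, for arbitrary times $t$ and $s$ in $\mathbb{R}$.
We say that the system $S$ is autonomous iff
$$\tau_{t,s} = \tau_{t-s},$$
where $\{\tau_t\}_{t \in \mathbb{R}}$ is a one-parameter group of $^*$-automorphisms of $\mathcal{A}_{\overline{S}}$.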
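For an autonomous system, the algebraic properties required of time translations can be checked explicitly in a finite-dimensional toy model. The sketch below assumes, as an illustration of ours, a two-level system with a diagonal Hamiltonian H and Heisenberg-picture evolution τ_t(a) = e^{itH} a e^{-itH} (units with ħ = 1); the energies, sample matrices, and function names are hypothetical.

```python
import cmath

ENERGIES = (0.3, 1.1)  # assumed eigenvalues of a diagonal Hamiltonian H


def adjoint(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def evolve(a, t):
    """tau_t(a) = exp(itH) a exp(-itH), written out for diagonal H."""
    return [[cmath.exp(1j * t * (ENERGIES[i] - ENERGIES[j])) * a[i][j]
             for j in range(2)] for i in range(2)]


def two_time_evolution(a, t, s):
    """tau_{t,s} = tau_{t-s} for an autonomous system."""
    return evolve(a, t - s)


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


a = [[1 + 0j, 2 - 1j], [0.5j, -1 + 0j]]
b = [[0j, 1 + 0j], [1 + 0j, 3 + 0j]]
t, s, u = 2.0, 0.7, -1.3

composition_law = close(two_time_evolution(two_time_evolution(a, s, u), t, s),
                        two_time_evolution(a, t, u))
respects_products = close(evolve(matmul(a, b), t), matmul(evolve(a, t), evolve(b, t)))
respects_adjoints = close(evolve(adjoint(a), t), adjoint(evolve(a, t)))
print(composition_law, respects_products, respects_adjoints)
```

The three checks correspond to the composition law for triples of times and to the two defining properties of a $^*$-automorphism; a time-dependent Hamiltonian would keep the composition law but lose the dependence on the difference of times only.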
We say that a system is a subsystem of a system iff
The composition, , of two systems, and , can be defined by choosing
and to contain the algebra generated by and . (A more precise discussion would lead us into the theory of tensor categories.)
2.3. Potential properties, information loss and possible events
Let be a physical system coupled to an environment and described, mathematically, by data
A "potential property" of $S$ is represented by an element $a \in \mathcal{P}_S$ or, more generally, by a self-adjoint operator in the algebra $\mathcal{A}_S$. An observation of a potential property, $a$, of $S$ at time $t$ will be described in terms of the operator $a(t) = \tau_{t,t_0}(a)$, where $t_0$ is a fiducial time at which the state of $S$ is specified. Next, we have to clarify in which sense information is lost, as time increases. In local, relativistic q