Foundations of Quantum Decoherence

John Gamble
July 17, 2019

The conventional interpretation of quantum mechanics, though it permits a correspondence to classical physics, leaves the exact mechanism of transition unclear. Though this was only of philosophical importance throughout the twentieth century, over the past decade new technological developments, such as quantum computing, require a more thorough understanding of not just the result of quantum emergence, but also its mechanism. Quantum decoherence theory is the model that developed out of necessity to deal with the quantum-classical transition explicitly, and without external observers. In this thesis, we present a self-contained and rigorously argued full derivation of the master equation for quantum Brownian motion, one of the key results in quantum decoherence theory. We accomplish this from a foundational perspective, only assuming a few basic axioms of quantum mechanics and deriving their consequences. We then consider a physical example of the master equation and show that quantum decoherence successfully represents the transition from a quantum to classical system.


Physics and Mathematics Independent Study Thesis
Degree: B.A. in Physics and Mathematics
School: The College of Wooster
Academic program: Departments of Mathematics and Physics
Graduation year: 2008
Advisor: Dr. John Lindner, The Moore Professor of Astronomy & Professor of Physics
Second advisor: Dr. Derek Newland, Assistant Professor of Mathematics


I gratefully acknowledge the loving help and support of my parents, John and Clare Gamble, and of my fiancée, Katherine Kelley. I extend sincere thanks to my advisors, John Lindner and Derek Newland, for their long hours and dedication to this project. I also thank Jon Breitenbucher for painstakingly assembling and maintaining this LaTeX template, which made the writing process significantly more enjoyable than it would have been otherwise. Finally, I am grateful to The Gallows program for providing me an environment in which I could grow, learn, and succeed.

Chapter: Preface


This thesis is designed to serve a dual purpose. First, it is a stand-alone treatment of contemporary decoherence theory, accomplishing this mostly within a rigorous framework more detailed than is used in typical undergraduate quantum mechanics courses. It assumes no prior knowledge of quantum mechanics, although a basic understanding obtained through a standard introductory quantum mechanics or modern physics course would be helpful for depth of meaning. Although the mathematics used is introduced thoroughly in chapter Foundations of Quantum Decoherence, the linear algebra can get quite complicated. Readers who have not had a formal course in linear algebra would benefit from having ref. Poole (2006) on hand during some components, especially chapters Foundations of Quantum Decoherence and Foundations of Quantum Decoherence. The bulk of the work specifically related to decoherence is found in the last three chapters, and readers familiar with quantum mechanics desiring a better grasp of decoherence theory should proceed to the discussion of quantum mechanics in phase-space, found in chapter Foundations of Quantum Decoherence.

Second, this thesis is an introduction to the rigorous study of the foundations of quantum mechanics, and is again stand-alone in this respect. It develops the bulk of quantum mechanics from several standard postulates and the invariance of physics under the Galilei group of transformations, outlined in sections 7 and 13, respectively. Readers interested in this part of the thesis should study the first three chapters, where many fundamental results of quantum mechanics are developed. We now begin with a motivating discussion of quantum decoherence.

One of the fundamental issues in physics today is the emergence of the familiar macroscopic physics that governs everyday objects from the strange, underlying microscopic laws for the motion of atoms and molecules. This collection of laws governing small bodies is called quantum mechanics, and operates entirely differently from classical Newtonian physics. However, since all macroscopic objects are made from microscopic particles, which obey quantum mechanics, there should be some way to link the two worlds: the macro and the micro. The conventional interpretation of quantum mechanics answers questions about the transition from quantum to classical mechanics, known as quantum emergence, through a special measurement process, which is distinct from the other rules of quantum mechanics Griffiths (2005). [Footnote 1: In fact, the motion of a system not being measured is considered unitary, and hence reversible, while the measurement process is conventionally considered discontinuous, and hence irreversible. So, not only are they treated separately, but they are considered fundamentally different processes!]

However, when this measurement concept is used, problems arise. The most famous of these problems is known as Schrödinger’s cat, which asks about the nature of measurement through a paradox Omnès (1999). The problem creates ambiguity about

  1. when a measurement occurs, and

  2. who (or what) performs it.

When all is said and done, the conventional interpretation leaves a bitter taste in the mouths of many physicists; what they want is a theory of quantum measurement that does not function due to subjectively defined observation. If no external observers are permitted, how can classical mechanics ever emerge from quantum mechanics? The answer is that complex systems, in essence, measure themselves, which leads us to decoherence.

1 Decoherence and the Measurement Problem

Figure 1: A graphical representation of decoherence. Here, the environment, which is treated statistically, can be thought of as an information reservoir. It serves to absorb the quantum interference properties of the system, making the system appear as a classical, statistically prepared state.

Quantum decoherence theory is a quantitative model of how this transition from quantum to classical mechanics occurs, which involves systems performing local measurements on themselves. More precisely, we divide our universe into two pieces: a simple system component, which is treated quantum mechanically, and a complex environmental component, which is treated statistically. [Footnote 2: The words statistical and classical are being tossed around here a bit. What we mean is statistical in the thermodynamic sense, for example probability distributions prepared by random coin-tosses. These random, statistical distributions are contrasted against quantum states, which may appear to be random when observed, but actually carry quantum interference information.] Since the environment is treated statistically, it obeys the rules of classical (statistical) mechanics, and we call it a mixture Ballentine (1998). When the environment is coupled to the system, any quantum mechanical information that the system transfers to the environment is effectively lost, hence the system becomes a mixture over time, as indicated in figure 1.

In the macroscopic world, ordinary forces are huge compared to the subtle effects of quantum mechanics, and thus large systems are very difficult to isolate from their environments. Hence, the time it takes large objects to turn to mixtures, called the decoherence time, is very short. It is important to keep in mind that decoherence is inherently local. That is, if we consider our entire universe, the system plus the environment, quantum mechanically, classical effects do not emerge. Rather, we need to “focus” on a particular component, and throw away the quantum mechanical information having to do with the environment Omnès (1999).

In order to clarify this notion of decoherence, we examine the following unpublished example originally devised by Herbert J. Bernstein Greenstein and Zajonc (2006). To start, consider an electron gun, as shown in figure 2. Electrons are an example of a two-state system, and as such they possess a quantum-mechanical property called spin Nielsen and Chuang (2000). As we develop in detail later in section 10, the spin of a two-state system can be represented as a vector pointing on the unit two-sphere. Further, any possible spin can be formed as a linear combination of a spin pointing up in the $+z$ direction, and a spin pointing down in the $-z$ direction. [Footnote 3: In linear algebra terminology, we call the spin vectors pointing in $+z$ and $-z$ a basis for the linear vector space of all possible states. We deal with bases precisely in section 3.]

We suppose that our electron gun fires electrons of random spin, and then we use some angular control device to fix the electron’s spin to some angle (that we set) in the $xy$-plane. Then, we use a Stern-Gerlach analyzer adjusted to some angle to measure the resulting electron. The Stern-Gerlach analyzer measures how close its control angle is to the spins of the electrons in the beam passing through it Greenstein and Zajonc (2006). It reads out a number on a digital display, with $1$ corresponding to perfect alignment and $0$ corresponding to anti-alignment.

Figure 2: A sketch of Bernstein’s thought experiment. The electrons with initial random spin are set to a certain angle in the $xy$-plane at the first angular control. A switch determines whether or not an additional phase factor is added using a roulette wheel. Then, a Stern-Gerlach analyzer is used to measure the angle of electron spin.

So far, we can always use the analyzer to measure the quantum-mechanical spin of each electron in our beam. We simply turn the analyzer’s angular control until its digital display reads one, and then read the value of the angular control. Similarly, if we were to turn the analyzer’s control to the angle opposite from the beam’s angle, the display would read zero. The fact that these two special angles always exist is fundamental to quantum mechanics, resulting from a purely non-classical phenomenon called superposition. [Footnote 4: The precise nature of quantum superposition is rather subtle, and we discuss it at length in section 10.] We next insert another component into the path of the electron beam. By turning on a switch, we activate a second device that adjusts the angle of our beam in the $xy$-plane by adding an extra angle. The trick is that this device is actually attached to a modified roulette wheel, which we spin every time an electron passes. The roulette wheel is labeled in radians, and determines the value of the added angle Greenstein and Zajonc (2006).

We now frantically spin the angular control attached to our analyzer, attempting to find the initial angle of our electron beam. However, much to our surprise, the display appears to be stuck on $1/2$ Greenstein and Zajonc (2006). This reading turns out to be no mistake, since the angles of the electrons that the analyzer is measuring are now randomly distributed (thanks to the randomness of the roulette wheel) throughout the $xy$-plane. No matter how steadfastly we attempt to measure the spin of the electrons in our beam, we cannot while the roulette wheel is active. Essentially, the roulette wheel is absorbing the spin information of the electrons, as we apparently no longer have access to it.
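The behavior of the analyzer display can be illustrated numerically. The following is a minimal sketch under stated assumptions, not part of Bernstein's original description: it assumes a spin at angle $\theta$ in the $xy$-plane is the equal-weight superposition of spin up and spin down with relative phase $e^{i\theta}$, that the display shows the squared overlap between the analyzer direction and the spin, and that the roulette wheel adds an angle drawn uniformly from $[0, 2\pi)$. All function names and the beam angle are hypothetical.

```python
import cmath
import math
import random


def spin_state(angle):
    """Spin pointing at `angle` in the xy-plane, written in the z (up, down) basis."""
    amp = 1.0 / math.sqrt(2.0)
    return (complex(amp), amp * cmath.exp(1j * angle))


def analyzer_reading(analyzer_angle, spin_angle):
    """Probability of finding the spin aligned with the analyzer: |<analyzer|spin>|^2."""
    a = spin_state(analyzer_angle)
    s = spin_state(spin_angle)
    overlap = a[0].conjugate() * s[0] + a[1].conjugate() * s[1]
    return abs(overlap) ** 2


def mean_reading(analyzer_angle, beam_angle, wheel_on, shots=20000, seed=0):
    """Average display value over many electrons, with or without the roulette wheel."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(shots):
        # Assumption: the wheel adds a uniformly random angle to each electron.
        extra = rng.uniform(0.0, 2.0 * math.pi) if wheel_on else 0.0
        total += analyzer_reading(analyzer_angle, beam_angle + extra)
    return total / shots


beam = 0.7  # hypothetical beam angle in radians
for analyzer in (0.0, beam, beam + math.pi):
    print(analyzer, mean_reading(analyzer, beam, False), mean_reading(analyzer, beam, True))
```

With the wheel off, the average reading depends on the analyzer angle and reaches its extremes at the aligned and anti-aligned settings. With the wheel on, the average no longer depends on the analyzer angle, which is the sense in which the spin information has become inaccessible.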

This absorption of quantum information is the exact process that the environment performs in quantum decoherence theory. In both cases, the information is lost due to statistical randomness, and forces a quantum system to be classically random as well. The roulette wheel in this simplified example, just like the environment in reality, is blocking our access to quantum properties of a system. In chapter Foundations of Quantum Decoherence, we return to a more physical example of decoherence using the quantitative tools we develop in this thesis. First, we need to discuss the mathematical underpinnings of quantum mechanics.

2 Notational Conventions

Throughout this thesis, we adopt a variety of notational conventions, some more common than others. Here, we list them for clarity.

  • The symbol $\equiv$ will always be used in the case of a definition. It indicates that the equality does not follow from previous work. The $=$ sign indicates equality that logically follows from previous work.

  • An integral symbol without bounds, $\int$, is a definite integral from $-\infty$ to $\infty$, rather than the antiderivative, unless otherwise noted.

  • Usually, the differential term in an integrand will be grouped with the integral symbol and separated by $\cdot$. This is standard multiplication, and is only included for notational clarity.

  • Vectors are always given in Dirac kets, $|\psi\rangle$, operators on abstract vector or Hilbert spaces are always given with hats, $\hat{A}$, linear functionals over vector spaces are given in Dirac bras, $\langle\psi|$, and operators on function spaces are given with checks, $\check{A}$.

  • Both partial and total derivatives are given using either standard Leibniz notation or a contracted form $\partial_x$, where $\partial_x \equiv \partial / \partial x$.

  • The symbol is used to denote a special representation of a particular structure. Its precise definition is made clear by context.

  • The symbol $*$ is used to denote the complex conjugate of a complex number.


Chapter: Mathematical background


Before we begin our discussion of quantum mechanics, we take this chapter to review the mathematical concepts that might be unfamiliar to the average undergraduate physics major wishing for a more detailed understanding of quantum mechanics. We begin with a discussion of linear vector spaces and linear operators. We next generalize these basic concepts to product spaces, and finally consider spaces of infinite dimension. Quantum mechanics is much more abstract than other areas of physics, such as classical mechanics, and so the immediate utility of the techniques introduced here is not evident. However, for the treatment in this thesis to be mostly self-contained, we proceed slowly and carefully.

3 Linear Vector Spaces

In this section, we introduce linear vector spaces, which will be the stages for all of our subsequent work. [Footnote 5: Well, actually we will work in a triplet of abstract spaces called a rigged Hilbert space, which is a special type of linear vector space. However, most textbooks on quantum mechanics, and even most physicists, do not bother much with the distinction. We will look at this issue in more detail in section 6.] We begin with the elementary topic of vector spaces Poole (2006).

If a set $V$ over a field $F$ satisfies the criteria for a vector space, the members of $V$ are called vectors, and the members of $F$ are called scalars. For the purposes of quantum mechanics, the field we are concerned with is almost always $\mathbb{C}$, the field of complex numbers, and $V$ has the usual (Euclidean) topology. [Footnote 6: The fields we refer to here are those from abstract algebra, and should not be confused with force fields (such as the electric and magnetic fields) used in physics. Loosely speaking, most of the sets of numbers we deal with in physics are algebraic fields, such as the real and complex numbers. For more details, see ref. Anderson and Feil (2005).] Since scalar multiplication is by definition interchangeable with multiplication in the field, it is conventional to use the same symbol for both, and we do so henceforth Anderson and Feil (2005). [Footnote 7: In the definition of linear dependence, we use the notation $i \in I$, which might be foreign to some readers. Here $I$ is considered an index set, or a set of all possible allowed values for $i$. Then, by $i \in I$, we are letting $i$ run over the entire index set. Using this powerful notation, we can treat almost any type of general sum or integral. For more information, see ref. Gamelin and Greene (1999).]

This means that, if a set of vectors is linearly dependent, we can express one of the member vectors in terms of the others. If a set of vectors is not linearly dependent, we call it linearly independent, in which case we would not be able to express one of the member vectors in terms of the others Poole (2006).

It follows directly from this definition that, in any vector space with finite dimension $n$, any basis set will have precisely $n$ members. Because quantum mechanics deals with a Euclidean vector space over the complex numbers, it is advantageous to precisely define the inner product of two vectors within that special case Ballentine (1998).

Although it is not immediately clear, the inner product is closely related to the space of linear functionals on $V$, called the dual space of $V$ and denoted $V^*$. Below, we define these concepts precisely and then show their connection through the Riesz representation theorem Ballentine (1998).

We connect the inner product with the dual space using the Riesz representation theorem Ballentine (1998).

The proof of this theorem is straightforward, but too lengthy for our present discussion, so we will reference a simple proof for the interested reader Ballentine (1998). The consequences of this theorem are quite drastic. It is obviously true that the inner product of two vectors, which maps them to a scalar, is a linear functional. However, the Riesz theorem asserts that any linear functional can be represented as an inner product. This means that every linear functional has precisely one object in the dual space, corresponding to a vector in the vector space. For this reason, we call the linear functional associated with $|\phi\rangle$ a dual vector and write it as

\[ \langle\phi|, \]

and we contract our notation for the inner product of two vectors $|\phi\rangle$ and $|\psi\rangle$ to

\[ \langle\phi|\psi\rangle, \]

a notational convention first established by P. A. M. Dirac. The vectors in $V$ are called kets and the dual vectors, or linear functionals associated with vectors in $V$, are called bras. Hence, when we adjoin a bra and a ket, we get a bra-ket or bracket, which is an inner product. Note that by the definition of the inner product, we have

\[ \langle\phi|\psi\rangle = \langle\psi|\phi\rangle^{*}, \]

so if we multiply some vector $|\psi\rangle$ by a (complex) scalar $a$, the corresponding dual vector is $a^{*}\langle\psi|$. When we form dual vectors from vectors, we must always remember to conjugate such scalars. As another note, when choosing a basis, we frequently pick it as orthonormal, which we define below Poole (2006).

For any vector space, we can always find such a basis, so we do not lose any generality by always choosing to use one. [Footnote 8: The process for finding an orthonormal basis is called the Gram-Schmidt algorithm, and allows us to construct an orthonormal basis from any basis. For details, see ref. Poole (2006).]

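A useful example that illustrates the use of vectors and dual vectors can be found by constraining our vector space to a finite number of dimensions. Working in such a space, we represent vectors as column matrices and dual vectors as row matrices Nielsen and Chuang (2000). For example, in three dimensions we might have

\[ |x\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad |y\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \]

where $|x\rangle$ and $|y\rangle$ are the unit vectors from basic physics Halliday et al. (2004). Then, the linear functional corresponding to $|x\rangle$ is [Footnote 9: Here, notice that to generate the representation for the dual vector from the vector, we must take the complex conjugate. This is necessary due to the complex symmetry of the inner product established in eqn. 6.]

\[ \langle x| = \begin{pmatrix} 1 & 0 & 0 \end{pmatrix}. \]

We represent the inner product as matrix multiplication, so we write

\[ \langle x|y\rangle = \begin{pmatrix} 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = 0, \]

which indicates that $|x\rangle$ and $|y\rangle$ are orthogonal, as we expect.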
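The column and row picture of kets and bras can be checked directly. The following is a small sketch in plain Python, with illustrative vectors and helper names (`bra`, `inner`, `scale`) that are assumptions introduced here rather than notation from the thesis. It shows that forming a bra conjugates every component, that the inner product is conjugate-symmetric, and that a scalar multiplying a ket appears conjugated on the corresponding bra.

```python
def bra(ket):
    """Row representation of the dual vector: conjugate every component of the ket."""
    return [complex(c).conjugate() for c in ket]


def inner(phi, psi):
    """Inner product <phi|psi> computed as row-times-column multiplication."""
    return sum(b * k for b, k in zip(bra(phi), psi))


def scale(a, vec):
    """Multiply every component of a vector by the scalar a."""
    return [a * c for c in vec]


# Illustrative vectors (assumptions, not taken from the thesis).
e1 = [1, 0, 0]
e2 = [0, 1, 0]
phi = [1 + 2j, 0, 1j]
psi = [2, 1 - 1j, 3j]
a = 2 - 1j

print(inner(e1, e2))                             # orthogonal unit vectors
print(inner(phi, psi), inner(psi, phi))          # complex conjugates of each other
print(bra(scale(a, psi)), scale(a.conjugate(), bra(psi)))  # same row vector
```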

4 Linear Operators

So far, we have looked at two main types of objects in a vector space: vectors and linear functionals. In this section, we focus on a third: the linear operator. Recall that linear functionals take vectors to numbers. Similarly, linear operators are objects that take vectors to other vectors. Formally, this is the following definition Riley et al. (1998).

Throughout the rest of this thesis, whenever we discuss an operator on a vector space, we will always use a hat to avoid confusion with a scalar. In a finite dimensional vector space, as indicated previously, we often represent vectors by column matrices and dual vectors by row matrices. Similarly, we represent operators by square matrices Nielsen and Chuang (2000). For example, if




We can also use our formalism to access individual elements of an operator in its matrix representation. Working in the three-dimensional standard, orthonormal basis from the example above, we specify as








which is just the matrix equation Ballentine (1998)


where we made the definition


We call this quantity the matrix element corresponding to the operator. Note that the matrix elements of an operator depend on our choice of basis set. Using this expression for a matrix element, we define the trace of an operator. This definition is very similar to the elementary notion of the trace of a matrix as the sum of the elements in the main diagonal. [Footnote 10: Since the individual matrix elements of an operator depend on the basis chosen, it might seem as if the trace would vary with basis, as well. However, the trace turns out to be independent of basis choice Ballentine (1998).]
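The claim that matrix elements depend on the basis while the trace does not can be checked numerically. The following is a minimal sketch; the operator, the rotation angle, and the helper names are illustrative assumptions. It computes matrix elements by sandwiching the operator between basis vectors and sums the diagonal ones in two different orthonormal bases.

```python
import math


def matrix_element(op, bra_vec, ket_vec):
    """<bra|op|ket> for vectors given as lists of components."""
    op_ket = [sum(row[j] * ket_vec[j] for j in range(len(ket_vec))) for row in op]
    return sum(complex(b).conjugate() * c for b, c in zip(bra_vec, op_ket))


def trace(op, basis):
    """Sum of the diagonal matrix elements <e_i|op|e_i> over an orthonormal basis."""
    return sum(matrix_element(op, e, e) for e in basis)


# Illustrative operator in the standard basis (an assumption for this sketch).
A = [[1, 2j, 0],
     [-2j, 3, 1],
     [0, 1, 5]]

standard = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# A second orthonormal basis: the standard one rotated by 0.3 rad about the third axis.
c, s = math.cos(0.3), math.sin(0.3)
rotated = [[c, s, 0], [-s, c, 0], [0, 0, 1]]

print(matrix_element(A, standard[0], standard[0]), matrix_element(A, rotated[0], rotated[0]))
print(trace(A, standard), trace(A, rotated))
```

The first printed pair differs between the two bases, while the two traces agree up to floating-point rounding.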

So far, we have defined operators as acting to the right on vectors. However, since the Riesz theorem guarantees a bijection between vectors and dual vectors (linear functionals in the dual space), we expect operators to also act to the left on dual vectors. To make this concept precise, we write a definition.

From this definition, it follows that

\[ \langle\phi|\hat{A}^{\dagger}|\psi\rangle = \langle\psi|\hat{A}|\phi\rangle^{*}, \]

which is an important result involving the adjoint, and is sometimes even used as its definition. This correctly suggests that the adjoint for operators is very similar to the conjugate transpose for square matrices, with the two operations equivalent for the matrix representations of finite vector spaces. [Footnote 11: Many physicists, seeing that linear functionals are represented as row matrices and vectors are represented as column matrices, will write $|\psi\rangle^{\dagger} = \langle\psi|$. This is not technically correct, as the formal definition of the adjoint only defined the adjoint operation for an operator, not a functional. However, though it is an abuse of notation, it turns out that nothing breaks as a result Ballentine (1998). For clarity, we will be careful not to use the adjoint in this way.]
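For finite matrix representations, the relation between the adjoint and the conjugate transpose can be verified on a concrete case. The sketch below uses an arbitrary illustrative matrix and vectors (assumptions, not examples from the thesis) and checks that sandwiching the conjugate transpose between $\langle\phi|$ and $|\psi\rangle$ gives the complex conjugate of sandwiching the original matrix between $\langle\psi|$ and $|\phi\rangle$.

```python
def dagger(m):
    """Conjugate transpose of a matrix given as nested lists."""
    rows, cols = len(m), len(m[0])
    return [[complex(m[i][j]).conjugate() for i in range(rows)] for j in range(cols)]


def apply(m, ket):
    """Matrix acting to the right on a column vector."""
    return [sum(m_ij * k_j for m_ij, k_j in zip(row, ket)) for row in m]


def inner(phi, psi):
    """Inner product <phi|psi>, conjugating the components of phi."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))


# Illustrative operator and vectors (assumptions for this sketch).
A = [[1 + 1j, 2],
     [3j, 4 - 2j]]
phi = [1, 1j]
psi = [2 - 1j, 3]

lhs = inner(phi, apply(dagger(A), psi))        # <phi| A^dagger |psi>
rhs = inner(psi, apply(A, phi)).conjugate()    # (<psi| A |phi>)^*
print(lhs, rhs)
```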

Although the matrix representation of an operator is useful, we need to express operators using Dirac’s bra-ket notation. To do this, we define the outer product Nielsen and Chuang (2000).

Note that this is clearly linear, and is an operator, as

\[ \left(|\phi\rangle\langle\psi|\right)|\chi\rangle = |\phi\rangle\langle\psi|\chi\rangle = \langle\psi|\chi\rangle\,|\phi\rangle \]

for $|\chi\rangle \in V$, a vector space. Further, if an operator is constructed in such a way, eqn. 25 tells us that its adjoint is

\[ \left(|\phi\rangle\langle\psi|\right)^{\dagger} = |\psi\rangle\langle\phi|. \]

Self-adjoint operators, i.e. operators such that

\[ \hat{A}^{\dagger} = \hat{A}, \]

are especially important in quantum mechanics. The main properties that make self-adjoint operators useful concern their eigenvectors and eigenvalues. [Footnote 12: We assume that the reader has seen eigenvalues and eigenvectors. However, if not, see ref. Poole (2006) or any other linear algebra text for a thorough introduction.] We summarize them formally in the following theorem Ballentine (1998).


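Let $\hat{A}|\phi_1\rangle = a_1|\phi_1\rangle$ and $\hat{A}|\phi_2\rangle = a_2|\phi_2\rangle$ so that $|\phi_1\rangle$ and $|\phi_2\rangle$ are arbitrary (nonzero) eigenvectors of $\hat{A}$ corresponding to the eigenvalues $a_1$ and $a_2$. Then, using eqn. 25, we deduce Ballentine (1998)

\[ a_1\langle\phi_1|\phi_1\rangle = \langle\phi_1|\hat{A}|\phi_1\rangle = \langle\phi_1|\hat{A}^{\dagger}|\phi_1\rangle^{*} = \langle\phi_1|\hat{A}|\phi_1\rangle^{*} = a_1^{*}\langle\phi_1|\phi_1\rangle. \]

Since $\langle\phi_1|\phi_1\rangle \neq 0$, we get $a_1 = a_1^{*}$, so $a_1$ is real. Hence, any arbitrary eigenvalue of a self-adjoint operator is real. Next, we consider combinations of two eigenvectors. That is,

\[ 0 = \langle\phi_1|\hat{A}|\phi_2\rangle - \langle\phi_2|\hat{A}|\phi_1\rangle^{*} = a_2\langle\phi_1|\phi_2\rangle - a_1^{*}\langle\phi_2|\phi_1\rangle^{*} = (a_2 - a_1)\langle\phi_1|\phi_2\rangle. \]

Thus, if $a_1 \neq a_2$, $\langle\phi_1|\phi_2\rangle = 0$, so $|\phi_1\rangle$ and $|\phi_2\rangle$ are orthogonal as claimed. ∎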
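The two properties just proved can be seen concretely for a two-dimensional self-adjoint matrix, where the eigenvalue problem has a closed-form solution. The sketch below is an illustration only: the function name `hermitian_eigen_2x2`, the example matrix, and the use of the quadratic formula for the characteristic polynomial are choices made here, not a method described in the thesis.

```python
import math


def hermitian_eigen_2x2(h):
    """Eigenpairs of a 2x2 self-adjoint matrix [[a, b], [conj(b), d]] with b != 0."""
    a = complex(h[0][0]).real
    d = complex(h[1][1]).real
    b = complex(h[0][1])
    if b == 0:
        raise ValueError("off-diagonal element must be nonzero for this formula")
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + abs(b) ** 2)
    pairs = []
    for lam in (mean + radius, mean - radius):
        vec = [b, lam - a]  # solves the first row of (H - lam) v = 0
        norm = math.sqrt(sum(abs(c) ** 2 for c in vec))
        pairs.append((lam, [c / norm for c in vec]))
    return pairs


def inner(phi, psi):
    """Inner product <phi|psi>, conjugating the components of phi."""
    return sum(complex(x).conjugate() * y for x, y in zip(phi, psi))


# Illustrative self-adjoint matrix (an assumption for this sketch).
H = [[2, 1 - 1j],
     [1 + 1j, 3]]

(lam1, v1), (lam2, v2) = hermitian_eigen_2x2(H)
print(lam1, lam2)          # both eigenvalues are real floats
print(abs(inner(v1, v2)))  # overlap of eigenvectors with distinct eigenvalues
```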

Now that we have shown this orthogonality of distinct eigenvectors of an operator, we would like to claim that these eigenvectors form a basis for the vector space in which the operator works. For finite dimensional spaces, this turns out to be the case, although the proof is quite technical, so we omit it with reference Ballentine (1998). However, infinite dimensional cases produce problems mathematically, hence the eigenvectors of an operator in such a space need not form a basis for that space Ballentine (1998). For the moment, we will proceed anyway, returning to this issue in section 6.

Suppose that $\{|\phi_i\rangle\}$ is the set of all eigenvectors of the self-adjoint operator $\hat{A}$. Since eigenvectors are only determinable up to a scaling factor, as long as our vectors are of finite magnitude, we may rescale all of these vectors to be an orthonormal set of basis vectors Poole (2006). By our assumption, this set forms a basis for our vector space, $V$. Thus, for any $|\psi\rangle \in V$, we can write

\[ |\psi\rangle = \sum_i c_i|\phi_i\rangle. \]

Noting that, since the basis vectors are orthonormal,

\[ \langle\phi_j|\psi\rangle = \sum_i c_i\langle\phi_j|\phi_i\rangle = c_j, \]

we get

\[ |\psi\rangle = \sum_i |\phi_i\rangle\langle\phi_i|\psi\rangle. \]

It follows immediately that

\[ \sum_i |\phi_i\rangle\langle\phi_i| = \hat{1}, \]

which is called the resolution of the identity. This leads us to a result that allows us to represent self-adjoint operators in terms of their eigenvector bases, the spectral theorem Ballentine (1998).


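Let $|\psi\rangle \in V$ be an arbitrary vector. Then, since $\{|\phi_i\rangle\}$ is a basis for $V$, we can write

\[ |\psi\rangle = \sum_i c_i|\phi_i\rangle, \]

\[ \hat{A}|\psi\rangle = \sum_i c_i\hat{A}|\phi_i\rangle = \sum_i c_i a_i|\phi_i\rangle. \]

Now, we consider the other side of the equation. We get Ballentine (1998)

\[ \sum_j a_j|\phi_j\rangle\langle\phi_j|\psi\rangle = \sum_j\sum_i a_j c_i|\phi_j\rangle\langle\phi_j|\phi_i\rangle = \sum_i c_i a_i|\phi_i\rangle, \]

where we used the orthonormality of our basis vectors. This holds for arbitrary $|\psi\rangle$, so Ballentine (1998)

\[ \hat{A} = \sum_i a_i|\phi_i\rangle\langle\phi_i|, \]

as desired. ∎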

Since we assumed that the eigenvectors for any self-adjoint operator formed a basis for the operator’s space, we may use the spectral theorem to decompose self-adjoint operators into basis elements, which we make use of later.
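The resolution of the identity and the spectral decomposition can both be verified on a small matrix. The following sketch is illustrative: the self-adjoint matrix and its eigenpairs (worked out by hand for this example) are assumptions, as are the helper names. It sums the outer products of the normalized eigenvectors with themselves, once unweighted and once weighted by the eigenvalues.

```python
import math


def outer(ket, bra_source):
    """Matrix of the outer product |ket><bra_source|."""
    return [[k * complex(b).conjugate() for b in bra_source] for k in ket]


def add(m1, m2):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


def scale(a, m):
    """Multiply every entry of a matrix by the scalar a."""
    return [[a * x for x in row] for row in m]


def normalize(vec):
    """Rescale a vector to unit length."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in vec))
    return [c / norm for c in vec]


# Illustrative self-adjoint matrix and its eigenpairs (assumptions for this sketch).
H = [[2, 1 - 1j],
     [1 + 1j, 3]]
eigenpairs = [(4.0, normalize([1 - 1j, 2])),
              (1.0, normalize([1 - 1j, -1]))]

identity = [[0, 0], [0, 0]]
rebuilt = [[0, 0], [0, 0]]
for lam, vec in eigenpairs:
    projector = outer(vec, vec)
    identity = add(identity, projector)            # sum_i |phi_i><phi_i|
    rebuilt = add(rebuilt, scale(lam, projector))  # sum_i a_i |phi_i><phi_i|

print(identity)
print(rebuilt)
```

Up to rounding, the first sum reproduces the identity matrix and the second reproduces `H`.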

5 The Tensor Product

So far, we have discussed two types of products in vector spaces: inner and outer. The tensor product falls into the same category as the outer product in that it involves arraying all possible combinations of two sets, and is sometimes referred to as the cartesian or direct product Anderson and Feil (2005). We formally define the tensor product operation below Nielsen and Chuang (2000).

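The tensor product is linear in the normal sense, in that it is distributive and can absorb scalar constants Nielsen and Chuang (2000). Further, we define linear operators on a product space by

\[ \left(\hat{A}\otimes\hat{B}\right)\left(|\psi\rangle\otimes|\phi\rangle\right) = \hat{A}|\psi\rangle\otimes\hat{B}|\phi\rangle. \]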

The definition for the tensor product is quite abstract, so we now consider a special case in a matrix representation for clarity. Consider a two-dimensional vector space and a three-dimensional vector space. We let the operator


act over the two-dimensional space, and the operator


act over the three-dimensional space. Then, operating on arbitrary vectors, we find




The representation of the tensor product as a matrix operation is called the Kronecker product, and is formed by nesting matrices from right to left and distributing via standard multiplication Nielsen and Chuang (2000). We now illustrate it by working our example.


But by eqn. 43, we should be able to first construct the tensor product of the two operators and apply the resulting operator to the tensor product of the two vectors. Working this out using the Kronecker product, we have





and we confirm that this example follows


when we use the Kronecker product representation for the tensor product. Since the matrix representation is very convenient for finite dimensional vector spaces, we frequently use the Kronecker product to calculate the tensor product and then shift back to the abstract Dirac notation.
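The consistency check described above can be reproduced for any choice of matrices. The sketch below uses illustrative integer matrices and vectors (assumptions, since they are not the ones worked in the text) for a two-dimensional and a three-dimensional space, and compares applying the Kronecker product of the operators to the Kronecker product of the vectors against taking the Kronecker product of the separately transformed vectors.

```python
def kron_vec(u, v):
    """Kronecker product of two column vectors: every component of u times all of v."""
    return [a * b for a in u for b in v]


def kron_mat(a, b):
    """Kronecker product of two matrices: each entry of a is replaced by that entry times b."""
    rows = []
    for row_a in a:
        for row_b in b:
            rows.append([x * y for x in row_a for y in row_b])
    return rows


def apply(m, vec):
    """Matrix acting on a column vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vec)) for row in m]


# Illustrative operators and vectors (assumptions for this sketch).
A = [[1, 2],
     [3, 4]]
B = [[0, 1, 0],
     [1, 0, 2],
     [0, 3, 1]]
u = [1, -1]
v = [2, 0, 1]

combined_first = apply(kron_mat(A, B), kron_vec(u, v))  # (A x B)(u x v)
separate_first = kron_vec(apply(A, u), apply(B, v))     # (A u) x (B v)
print(combined_first)
print(separate_first)
```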

6 Infinite Dimensional Spaces

So far, we have largely ignored the main complication that arises when we move from a finite dimensional space to an infinite one: the spectrum of eigenvectors for a self-adjoint operator is no longer guaranteed to form a basis for the space. To deal with this problem, we will have to work in a slightly more specific kind of vector space, called a Hilbert space, denoted $\mathcal{H}$. A Hilbert space is defined below Ballentine (1998).

Note that for the vector spaces described in the above definition, the Hilbert space associated with them always follows $V \subseteq \mathcal{H}$, and that $V = \mathcal{H}$ holds if (but not only if) $V$ has finite dimension. Without spending too much time on the technicalities, there is a generalized spectral theorem that applies to spaces very closely related to, but larger than, Hilbert spaces Ballentine (1998). To determine precisely what this space should be, we must first develop a certain subspace of a Hilbert space, which we define by including all vectors $|\psi\rangle = \sum_n c_n|\phi_n\rangle$ subject to

\[ \sum_n |c_n|^{2}\, n^{m} \]

converging for all $m \in \mathbb{N}$. For a Hilbert space, we require a much weaker condition, as we do not have the rapidly increasing $n^{m}$ in each term of the summand. We define this space as $\Omega$, and note that $\Omega \subseteq \mathcal{H}$ always Ballentine (1998). The ramifications of the extra normalization requirement for a vector to be in $\Omega$ can be thought of as a requirement for an extremely fast decay as $n \to \infty$. We now define the space of interest, called the conjugate space of $\Omega$, and written as $\Omega^{\times}$ in terms of its member vectors Gamelin and Greene (1999). Any vector $|\chi\rangle = \sum_n b_n|\phi_n\rangle$ belongs to $\Omega^{\times}$ if

\[ \langle\chi|\psi\rangle = \sum_n b_n^{*} c_n \]

converges for all $|\psi\rangle \in \Omega$ and $\langle\chi|$ is continuous on $\Omega$. Since we noted that for a vector to be in $\Omega$, it must vanish very quickly at infinity, $|\chi\rangle$ is not nearly as restricted as a vector in $\mathcal{H}$. Thus, we have the triplet

\[ \Omega \subseteq \mathcal{H} \subseteq \Omega^{\times}, \]

which is called a rigged Hilbert space triplet, and is shown in figure 3 Ballentine (1998). [Footnote 13: The argument used here is rather subtle. If the reader is not clear on the details, it will not impair the comprehension of later sections. To thoroughly understand this material, we recommend first reading the treatment of normed linear spaces in ref. Gamelin and Greene (1999), and then the discussion of rigged Hilbert spaces in refs. Ballentine (1998) and Sudbery (1986).] We noted earlier that the set of eigenvectors of a self-adjoint operator need not form a basis for that operator’s space if the space has infinite dimension. This means that the spectral theorem would break down, which is what we wish to avoid. Fortunately, a generalized spectral theorem has been proven for rigged Hilbert space triplets, which states that any self-adjoint operator in $\mathcal{H}$ has eigenvectors in $\Omega^{\times}$ that form a basis for $\mathcal{H}$ Ballentine (1998). Due to this, we will work in a rigged Hilbert space triplet, which we will normally denote by the corresponding Hilbert space, $\mathcal{H}$. We do this with the understanding that to be completely rigorous, it might be necessary to switch between the component sets of the triplet on a case-by-case basis.

Figure 3: The spaces $\Omega$, $\mathcal{H}$, and $\Omega^{\times}$. The area shaded blue is the rigged Hilbert space triplet.

Now that we have outlined the space in which we will be working, there is an important special case of an infinite dimensional basis that we need to examine. If our basis is continuous, then we can convert all of our abstract summation formulas into integral forms, which are used very frequently in quantum mechanics, since the two most popular bases (position and momentum) are usually continuous. [Footnote 14: A common form of confusion when first studying quantum mechanics is the abstract notion of vectors. In classical mechanics, a vector might point to a particular spot in a physical space. However, in quantum mechanics, a vector can have infinite dimensionality, and so can effectively point to every point in a configuration space simultaneously, with varying magnitude. For this reason, a very clear distinction must be drawn between the vectors used in the formalism of quantum mechanics and the everyday vectors used in classical mechanics.] Specifically, suppose we have a continuous, orthonormal basis for a rigged Hilbert space given by $\{|x\rangle\}_{x \in I}$, where $I$ is a real interval. Then, if we have Ballentine (1998)

\[ |\psi\rangle = \int_I dx\, \psi(x)\,|x\rangle, \qquad |\phi\rangle = \int_I dx\, \phi(x)\,|x\rangle, \]

we find a special case of eqn. 6. This is

\[ \langle\phi|\psi\rangle = \int_I dx\, \phi^{*}(x)\,\psi(x), \]

where the integral is taken over the real interval $I$. Similarly, for an operator $\hat{A}$, the definition of the trace becomes Ballentine (1998)

\[ \mathrm{Tr}\,\hat{A} = \int_I dx\, \langle x|\hat{A}|x\rangle, \]

and for self-adjoint $\hat{A}$, with eigenvectors $|a\rangle$ labeled by their eigenvalues $a$, the spectral theorem is

\[ \hat{A} = \int da\, a\,|a\rangle\langle a|. \]

When working in a continuous basis, these integral forms of the inner product, trace, and spectral theorem will often be more useful in calculations than their abstract sum counterparts, and we make extensive use of them in chapter Foundations of Quantum Decoherence.
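The integral form of the inner product can be evaluated numerically for concrete wavefunctions. The sketch below is an illustration under assumptions: it uses Gaussian-type functions (the two lowest harmonic-oscillator eigenfunctions in natural units, plus a copy of the first multiplied by a plane-wave phase), truncates the real line to a finite interval, and approximates the integral with a midpoint Riemann sum. None of these choices come from the thesis.

```python
import cmath
import math


def inner(phi, psi, lower=-10.0, upper=10.0, steps=4000):
    """Approximate <phi|psi> = integral of conj(phi(x)) * psi(x) dx by a midpoint sum."""
    dx = (upper - lower) / steps
    total = 0j
    for k in range(steps):
        x = lower + (k + 0.5) * dx
        total += complex(phi(x)).conjugate() * psi(x) * dx
    return total


def ground(x):
    """Normalized Gaussian, pi^(-1/4) * exp(-x^2 / 2)."""
    return math.pi ** -0.25 * math.exp(-x * x / 2.0)


def excited(x):
    """Normalized odd function sqrt(2) * x * ground(x), orthogonal to ground."""
    return math.sqrt(2.0) * x * ground(x)


def boosted(x, k=1.5):
    """ground(x) multiplied by the phase exp(i k x); complex-valued."""
    return cmath.exp(1j * k * x) * ground(x)


print(inner(ground, ground))    # normalization
print(inner(excited, ground))   # orthogonality
print(inner(ground, boosted), inner(boosted, ground))  # conjugate pair
```

The last line illustrates why the conjugate on the first function matters: swapping the two arguments conjugates the result, exactly as for the discrete inner product.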

Chapter: Formal Structure of Quantum Mechanics


We now use the mathematical tools developed in the last chapter to set the stage for quantum mechanics. We begin by listing the correspondence rules that tell us how to represent physical objects mathematically. Then, we develop the fundamental quantum mechanical concept of the state and its associated operator. Next, we investigate the treatment of composite quantum mechanical systems. Throughout this chapter, we work in discrete bases to simplify our calculations and improve clarity. However, following the rigged Hilbert space formalism developed in section 6, translating the definitions in this section to an infinite-dimensional space is straightforward both mathematically and physically.

7 Fundamental Correspondence Rules of Quantum Mechanics

At the core of the foundation of quantum mechanics are three rules. The first two tell us how to represent a physical object and describe its physical properties mathematically, and the third tells us how the object and properties are connected. These three rules permit us to state a physical problem mathematically, work the problem mathematically, and then interpret the mathematical result physically Ballentine (1998).

The first physical object of concern is the state, which completely describes the physical aspects of some system Ballentine (1998). For instance, we might speak of the state of a hydrogen atom, the state of a photon, or a state of thermal equilibrium between two thermal baths.

Now that we have introduced the state, we can discuss the physical concepts used to describe states. These concepts include momentum, energy, and position, and are collectively known as dynamical variables Ballentine (1998).

We now link the first two axioms by means of a third Ballentine (1998).
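In compact form, the three rules can be sketched as below. This is a paraphrase in the style of Ballentine's postulates rather than the exact wording of the axioms, and the symbols are assumptions: $\rho$ for the state operator, $A$ for the operator representing a dynamical variable, and $|u\rangle$ for an arbitrary vector in the state space.

```latex
% Assumed symbols: \rho (state operator), A (operator of a dynamical variable),
% |u> (arbitrary vector), a_n (eigenvalues of A). Paraphrase, not the source wording.
\begin{align*}
  &\text{State:} && \rho = \rho^\dagger, \quad \langle u|\rho|u\rangle \ge 0 \ \text{for all } |u\rangle, \quad \mathrm{Tr}(\rho) = 1, \\
  &\text{Dynamical variable:} && A = A^\dagger, \quad A|a_n\rangle = a_n|a_n\rangle \ \text{(possible values } a_n\text{)}, \\
  &\text{Expectation value:} && \langle A\rangle = \mathrm{Tr}(\rho A).
\end{align*}
```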

Though we claimed that these three axioms form the fundamental framework of modern quantum mechanics, they most likely seem foreign to the reader who has seen undergraduate material. In the next section, we work with the state operator and show that, in a special case, the formalism following from the correspondence rules outlined above is identical to that used in introductory quantum mechanics courses.

8 The State Operator

In the state axiom, we defined $\rho$, the state operator. However, the formal definition is very abstract, so in this section we investigate some of the properties of the state operator in an attempt to solidify its meaning. Physicists divide quantum mechanical states, and thus state operators, into two broad categories. Any given state is either called pure or impure. Sometimes, impure states are also referred to as mixtures or mixed states. We now precisely define a pure state Ballentine (1998).

Although the importance of pure and impure states is not yet evident, we will eventually need an efficient method of distinguishing between them. The definition, which is phrased as an existence argument, is not well-suited to this purpose. To generate a more useful relationship, consider a pure state $\rho = |\psi\rangle\langle\psi|$, where $|\psi\rangle$ has unit norm. We have

\[ \rho^2 = |\psi\rangle\langle\psi|\psi\rangle\langle\psi| = |\psi\rangle\langle\psi| = \rho. \]

Thus, if a state is pure, it necessarily follows Ballentine (1998)

\[ \rho^2 = \rho. \]

Although seemingly a weaker condition, this result turns out to also be sufficient to describe a pure state. To show this, we suppose that our state space is discrete and has dimension $D$. [Footnote 15: This is mainly for our convenience. The argument for an infinite-dimensional space is similar, but involves the generalized spectral theorem on our rigged Hilbert space.] Invoking the spectral theorem, we write

\[ \rho = \sum_{n=1}^{D} \rho_n\,|\phi_n\rangle\langle\phi_n|, \tag{67} \]

where $\{\rho_n\}$ is the spectrum of eigenvalues for $\rho$, corresponding to the unit-normed eigenvectors of $\rho$, $\{|\phi_n\rangle\}$. If we consider some $|\phi_j\rangle$ with $1 \le j \le D$ and let $\rho^2 = \rho$, we have

\[ \rho^2|\phi_j\rangle = \rho\,|\phi_j\rangle, \]

which is

\[ \rho_j^2\,|\phi_j\rangle = \rho_j\,|\phi_j\rangle, \qquad \text{so} \qquad \rho_j^2 = \rho_j. \]

Since all of the eigenvalues of $\rho$ must also follow this relationship, they must all either be one or zero. But by the state axiom, $\mathrm{Tr}(\rho) = 1$, so exactly one of the eigenvalues must be one, while all the others are zero. Thus, eqn. 67 becomes

\[ \rho = |\phi_j\rangle\langle\phi_j|, \]

where we have taken $\rho_j = 1$. Evidently, $\rho$ is a pure state, and we have shown sufficiency Ballentine (1998).
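As a quick check of this criterion, consider two state operators on a two-dimensional space. The orthonormal basis $\{|0\rangle, |1\rangle\}$ and both example states are illustrative choices, not taken from the source.

```latex
% Illustrative example; matrices are written in an assumed orthonormal basis {|0>, |1>}.
% \rho_pure is the projector onto (|0> + |1>)/\sqrt{2}; \rho_mix is a diagonal mixture.
\begin{align*}
  \rho_{\text{pure}} &= \tfrac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, &
  \rho_{\text{pure}}^2 &= \tfrac{1}{4}\begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix} = \rho_{\text{pure}}, \\
  \rho_{\text{mix}} &= \begin{pmatrix} \tfrac{3}{4} & 0 \\ 0 & \tfrac{1}{4} \end{pmatrix}, &
  \rho_{\text{mix}}^2 &= \begin{pmatrix} \tfrac{9}{16} & 0 \\ 0 & \tfrac{1}{16} \end{pmatrix} \neq \rho_{\text{mix}}.
\end{align*}
```

The first operator has eigenvalues one and zero, as the argument above requires of a pure state. The second has unit trace but eigenvalues 3/4 and 1/4, so it fails the criterion and is a mixture.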

At this point, it is logical to inquire about the necessity of the state operator, as opposed to a state vector alone. After all, most states treated in introductory quantum mechanics are readily represented as state vectors. However, there are many states that are prepared statistically, and so cannot be represented as a state vector. An example of one of these cases is found in section 11. These impure states or mixtures turn out to be of the utmost importance when we begin to discuss quantum decoherence, the main focus of this thesis Zurek (2003).

We now turn our attention to the properties of pure states, and illustrate that the state vectors defining pure state operators behave as expected under our correspondence rules. By the expectation-value axiom, we know that the expectation value of the dynamical variable (observable) $A$ of a state $\rho$ is

\[ \langle A\rangle = \mathrm{Tr}(\rho A). \]

If $\rho$ is a pure state, then we can write

\[ \rho = |\psi\rangle\langle\psi|. \]

Hence, $\langle A\rangle$ becomes

\[ \langle A\rangle = \mathrm{Tr}\bigl(|\psi\rangle\langle\psi|\,A\bigr), \]

which, by the definition of the trace, is Ballentine (1998)

\[ \langle A\rangle = \sum_n \langle u_n|\psi\rangle\langle\psi|A|u_n\rangle = \langle\psi|A|\psi\rangle, \]

where we have used the definition of an orthonormal basis to pick the basis $\{|u_n\rangle\}$ to be orthonormal and contain the vector $|\psi\rangle$. [Footnote 16: This works since $|\psi\rangle$ is guaranteed to have unit magnitude by the definition of a pure state.] This is the standard definition for an expectation value in introductory quantum mechanics, which we recover by letting $\rho$ be pure Griffiths (2005); Cohen-Tannoudji et al. (1977).
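A short worked case may make the equivalence concrete. The sketch below assumes an orthonormal basis $\{|0\rangle, |1\rangle\}$, the pure state $|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$, and the standard Pauli matrices $\sigma_x$ and $\sigma_z$ as observables. All of these are illustrative choices rather than examples from the source.

```latex
% Illustrative example in an assumed basis {|0>, |1>}, with |+> = (|0> + |1>)/\sqrt{2}.
% \sigma_x and \sigma_z are the standard Pauli matrices.
\begin{align*}
  \rho &= |+\rangle\langle +| = \tfrac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \\
  \mathrm{Tr}(\rho\,\sigma_x) &= \mathrm{Tr}\,\tfrac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} = 1 = \langle +|\sigma_x|+\rangle, \\
  \mathrm{Tr}(\rho\,\sigma_z) &= \mathrm{Tr}\,\tfrac{1}{2}\begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix} = 0 = \langle +|\sigma_z|+\rangle .
\end{align*}
```

In both cases the trace formula and the familiar state-vector formula agree, as expected for a pure state.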

9 Composite Systems

In order to model complex physical situations, we will often have to consider multiple, non-isolated states. To facilitate this, we need to develop a method for calculating the state operator of a composite, or combined, quantum system Ballentine (1998).

Note that if $\rho$ is pure, there exists some characteristic state vector $|\psi\rangle$ of $\rho$ where

\[ |\psi\rangle = |\psi_1\rangle \otimes |\psi_2\rangle \otimes \cdots \otimes |\psi_n\rangle, \tag{78} \]

and each $|\psi_j\rangle$ corresponds to $\rho_j$. As an important notational aside, eqn. 78 is frequently shortened to Nielsen and Chuang (2000)

\[ |\psi\rangle = |\psi_1 \psi_2 \cdots \psi_n\rangle, \]

where the tensor products are taken as implicit in the notation. Just as we discussed dynamical variables associated with certain states, so can we associate dynamical variables with composite systems. In general, an observable of a composite system with $n$ substates is formed by Nielsen and Chuang (2000)

\[ A = A_1 \otimes A_2 \otimes \cdots \otimes A_n, \]

where each $A_j$ is an observable of the $j$th substate. We have now extended the concepts of state and dynamical variable to composite systems, so it is logical to treat an expectation value of a composite system. Of course, since a composite system is a state, the expectation-value axiom applies, so we have

\[ \langle A\rangle = \mathrm{Tr}(\rho A). \]

However, composite systems afford us opportunities that single systems do not. Namely, just as we trace over the degrees of freedom of a system to calculate expectation values on that system, we can trace over some of the degrees of freedom of a composite state to focus on a specific subsystem. [Footnote 17: Here, a degree of freedom of a state can be thought of as its dimensionality. It is used analogously with the notion in a general system in classical mechanics, where the dimensionality of a system's configuration space corresponds to the number of degrees of freedom it possesses. For more on this, see ref. Thornton and Marion (2004).] We call this operation the partial trace over a composite system, and we define it precisely below Nielsen and Chuang (2000).

If the partial trace is applied to a composite system repeatedly such that all but one of the subsystem state operators are traced out, the remaining operator is called a reduced state operator Nielsen and Chuang (2000).

The partial trace and reduced state operator turn out to be essential in the analysis of composite systems, although that fact is not immediately obvious. To illustrate this, we consider some observable that acts only on the th subsystem of a composite system. We choose a basis , where each element is formed by the Kronecker product of the basis elements of the corresponding subsystems. That is, each basis vector has the form , where each is one of the orthonormal basis vectors of the th substate space. Then, from the expectation-value axiom, we have


We use the resolution of the identity to write our expectation value as


where corresponds to a basis vector. This becomes


If the observable acts as identity on all but the th subsystem, by eqn. 43, we have


Since our chosen basis is orthonormal, for any non-zero term in the sum, we must have (except for and ), in which case the final inner product is unity. Hence, we get


If we apply eqn. 43, letting , we have




Since each trace is just a scalar, we can write


Recognizing the definition of the reduced state operator and the resolution of the identity, we find Ballentine (1998)

Due to this remarkable result, we know that the reduced state operator for a particular subsystem is enough to tell us about any observable that only depends on the subsystem. Further, we end up with a formula for the expectation value of a component observable very similar to the expectation-value axiom for observables of the full system.
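The result can be illustrated for the simplest case of two subsystems. The sketch below uses assumed notation (subsystem labels 1 and 2, an observable $A_1$ acting only on subsystem 1, orthonormal bases $\{|m\rangle\}$ and $\{|n\rangle\}$ for the two factors) and a standard two-qubit entangled state $|\Phi\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$ chosen purely for illustration.

```latex
% Assumed notation: \rho acts on H_1 \otimes H_2; \rho_1 = Tr_2(\rho) is the reduced state operator.
% {|m>} and {|n>} are orthonormal bases of H_1 and H_2; A_1 acts on H_1 only.
\begin{align*}
  \rho_1 &= \mathrm{Tr}_2(\rho) = \sum_n (I \otimes \langle n|)\,\rho\,(I \otimes |n\rangle), \\
  \langle A_1 \otimes I\rangle &= \sum_{m,n} \langle m|\langle n|\,\rho\,(A_1 \otimes I)\,|m\rangle|n\rangle
    = \sum_m \langle m|\rho_1 A_1|m\rangle = \mathrm{Tr}(\rho_1 A_1), \\[4pt]
  % Example: |\Phi> = (|00> + |11>)/\sqrt{2}, an illustrative entangled state.
  \rho &= |\Phi\rangle\langle\Phi|, \qquad
  \rho_1 = \tfrac{1}{2}\bigl(|0\rangle\langle 0| + |1\rangle\langle 1|\bigr) = \tfrac{1}{2} I, \qquad
  \langle \sigma_z \otimes I\rangle = \mathrm{Tr}\bigl(\tfrac{1}{2}\sigma_z\bigr) = 0 .
\end{align*}
```

In this example the composite state is pure, yet $\rho_1^2 = \tfrac{1}{4} I \neq \rho_1$, so the reduced state operator of the subsystem is a mixture.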

10 Quantum Superposition

Though we have introduced some of the basic formalism of the state, we are still missing one of the key facets of quantum mechanics. This piece is the superposition principle, which, at the time of this writing, is one of the core aspects of quantum mechanics that no one fully understands. However, due to repeated experimental evidence, we take it as an axiom.

The superposition principle allows us to create new and intriguing states that we would not have access to otherwise. In fact, if we have $n$ linearly independent states of a system, any point on the unit $n$-sphere corresponds to a valid state of the system. [Footnote 18: The reader might wonder why the superposition principle is necessary; after all, we know that state vectors exist in a Hilbert space, and Hilbert spaces act linearly. However, we were not guaranteed until now that any vector of unit norm in Hilbert space represents a valid physical situation. The superposition principle gives us this, which allows us great freedom in constructing states.] If we consider a two-state system with an orthonormal basis, the 2-sphere of possible states guaranteed by the superposition principle is conveniently visualized embedded in 3-space. This visualization of a two-state system is called the Bloch sphere representation, and is pictured in figure 4 Nielsen and Chuang (2000). To calculate the position of a system in Bloch space, we use the formula

Figure 4: Two-state systems can be visualized as vectors on a two-sphere, known in quantum physics as the Bloch sphere. The polar and azimuthal angles are defined below for pure states, and the axes x, y, and z are defined in eqn. 97 for all states.

where is the 3-vector,


and is the vector of Pauli spin matrices,


The Pauli matrices are




Writing eqn. 95 explicitly, we find


This is trivially a basis for all two by two matrices, so we can indeed represent any by eqn. 95. Further, if we use the fact that , we know


so . With this constraint in mind, it is conventional to write eqn. 95 as Nielsen and Chuang (2000)

Also, since is self-adjoint, the diagonal entries must all be real, so . By the same reasoning,


Since and are arbitrary, we can choose either of them to be zero, and the resulting equation must hold for all values of the other. Hence, and , so both and are real, and is a real-valued vector. Since is real, we use it as a position vector that tells us the location of the system in Bloch space and call it the Bloch vector. If we have a pure state


we can express the location of the state in terms of the familiar polar and azimuthal angles of polar-spherical coordinates. Taking into account our redefined Bloch vector, the conventional form of eqn. 95 is


We use the polar-spherical coordinate identities for unit vectors


to determine