# (Never) Mind your p’s and q’s: Von Neumann versus Jordan on the Foundations of Quantum Theory.

## Abstract

In two papers entitled “On a new foundation [Neue Begründung] of quantum mechanics,” Pascual Jordan (1927b, g) presented his version of what came to be known as the Dirac-Jordan statistical transformation theory. As an alternative that avoids the mathematical difficulties facing the approach of Jordan and Paul A. M. Dirac (1927), John von Neumann (1927a) developed the modern Hilbert space formalism of quantum mechanics. In this paper, we focus on Jordan and von Neumann. Central to the formalisms of both are expressions for conditional probabilities of finding some value for one quantity given the value of another. Beyond that, Jordan and von Neumann had very different views about the appropriate formulation of problems in quantum mechanics. For Jordan, unable to let go of the analogy to classical mechanics, the solution of such problems required the identification of sets of canonically conjugate variables, i.e., p’s and q’s. For von Neumann, not constrained by the analogy to classical mechanics, it required only the identification of a maximal set of commuting operators with simultaneous eigenstates. He had no need for p’s and q’s. Jordan and von Neumann also stated the characteristic new rules for probabilities in quantum mechanics somewhat differently. Jordan (1927b) was the first to state those rules in full generality. Von Neumann (1927a) rephrased them and, in a subsequent paper (von Neumann, 1927b), sought to derive them from more basic considerations. In this paper we reconstruct the central arguments of these 1927 papers by Jordan and von Neumann and of a paper on Jordan’s approach by Hilbert, von Neumann, and Nordheim (1928). We highlight those elements in these papers that bring out the gradual loosening of the ties between the new quantum formalism and classical mechanics.

###### keywords:
Pascual Jordan, John von Neumann, transformation theory, probability amplitudes, canonical transformations, Hilbert space, spectral theorem

Corresponding author. Address: Tate Laboratory of Physics, 116 Church St. SE, Minneapolis, MN 55455, USA. Email: janss011@umn.edu

## 1 Introduction

### 1.1 The Dirac-Jordan statistical transformation theory

On Christmas Eve 1926, Paul A. M. Dirac, on an extended visit to Niels Bohr’s institute in Copenhagen, wrote to Pascual Jordan, assistant to Max Born in Göttingen:2

Dr. Heisenberg has shown me the work you sent him, and as far as I can see it is equivalent to my own work in all essential points. The way of obtaining the results may be rather different though … I hope you do not mind the fact that I have obtained the same results as you, at (I believe) the same time as you. Also, the Royal Society publishes papers more quickly than the Zeits. f. Phys., and I think my paper will appear in their January issue. I am expecting to go to Göttingen at the beginning of February, and I am looking forward very much to meeting you and Prof. Born there (Dirac to Jordan, December 24, 1926, AHQP).3

Dirac’s paper, “The physical interpretation of the quantum dynamics” (Dirac, 1927), had been received by the Royal Society on December 2. Zeitschrift für Physik had received Jordan’s paper, “On a new foundation [Neue Begründung] of quantum mechanics” (Jordan, 1927b), on December 18.4 In both cases it took a month for the paper to get published: Dirac’s appeared January 1, Jordan’s January 18.5 What Dirac and Jordan, independently of one another, had worked out and presented in these papers has come to be known as the Dirac-Jordan (statistical) transformation theory.6 As Jordan wrote in a volume in honor of Dirac’s 70th birthday:

After Schrödinger’s beautiful papers [Schrödinger, 1926a], I formulated what I like to call the statistical transformation theory of quantum mechanical systems, answering generally the question concerning the probability of finding by measurement of the observable the eigenvalue , if a former measurement of another observable had given the eigenvalue . The same answer in the same generality was developed in a wonderful manner by Dirac (Jordan, 1973, p. 296; emphasis in original).

In our paper, we focus on Jordan’s version of the theory and discuss Dirac’s version only to indicate how it differs from Jordan’s.7

Exactly one month before Dirac’s letter to Jordan, Werner Heisenberg, Bohr’s assistant in the years 1926–1927, had already warned Jordan that he was about to be scooped. In reply to a letter, apparently no longer extant, in which Jordan must have given a preview of Neue Begründung I, he wrote:

I hope that what’s in your paper isn’t exactly the same as what’s in a paper Dirac did here. Dirac’s basic idea is that the physical meaning of , given in my note on fluctuations, can greatly be generalized, so much so that it covers all physical applications of quantum mechanics there have been so far, and, according to Dirac, all there ever will be (Heisenberg to Jordan, November 24, 1926, AHQP).8

The “note on fluctuations” is a short paper received by Zeitschrift für Physik on November 6 but not published until December 20. In this note, Heisenberg (1926b), drawing on the technical apparatus of an earlier paper on resonance phenomena (Heisenberg, 1926a), analyzed a simple system of a pair of identical two-state atoms perturbed by a small interaction to allow for the flowing back and forth of the amount of energy separating the two states of the atoms.9 The ’s are the elements of a matrix implementing a canonical transformation, , from a quantum-mechanical quantity for the unperturbed atom to the corresponding quantity for the perturbed one. The discrete two-valued indices and label the perturbed and unperturbed states, respectively. Heisenberg proposed the following interpretation of these matrix elements:

If the perturbed system is in a state , then gives the probability that (because of collision processes, because the perturbation suddenly stops, etc.) the system is found to be in state (Heisenberg, 1926b, p. 505; emphasis in the original).

Heisenberg (1926b, pp. 504–505) emphasized that the same analysis applies to fluctuations of other quantities (e.g., angular momentum) and to other quantum systems with two or more not necessarily identical components as long as their spectra all share the same energy gap.

In the introduction of his paper on transformation theory, Dirac (1927, p. 622) briefly described Heisenberg’s proposal, thanked him for sharing it before publication, and announced that it is “capable of wide extensions.” Dirac, in fact, extended Heisenberg’s interpretation of to any matrix implementing a transformation of some quantum-mechanical quantity from one matrix representation, written as , to another, written as . The primes on the indices labeling rows and columns distinguish the numerical values of the quantities and from those quantities themselves (ibid., p. 625). Depending on the spectrum of the associated quantities, these indices thus take on purely discrete values, purely continuous ones, or a mix of both. Dirac wrote all equations in his paper as if the indices only take on continuous values (ibid.). He introduced the compact notation for the transformation matrix from the -basis to the -basis (ibid., p. 630). In the spirit of Heisenberg’s proposal, is interpreted as the probability that has a value between and given that has the value . Although the notation is Dirac’s, this formulation of the interpretation is Jordan’s (1927b, p. 813).10

It is easy to understand why Heisenberg would have been excited about what he saw as Dirac’s generalization of his own work. He felt strongly that the interpretation of the quantum formalism should naturally emerge from matrix mechanics without any appeal to wave mechanics. For this reason, he initially disliked Born’s statistical interpretation of the wave function as well as Bohr’s concept of complementarity, with its emphasis on wave-particle duality.11 Transformation theory reconciled him with both ideas. It showed that wave-particle duality was just one example of a much broader plurality of equivalent forms in which quantum mechanics can be expressed.12 And it showed that the Schrödinger energy eigenfunctions can be seen as elements of the transformation matrix diagonalizing the Hamiltonian. As Dirac put it:

The eigenfunctions of Schrödinger’s wave equation are just the transformation functions … that enable one to transform from the ($q$) scheme of matrix representation to a scheme in which the Hamiltonian is a diagonal matrix (Dirac, 1927, p. 635; emphasis in the original).13

Jordan (1927b, p. 810, p. 822) clearly recognized this too. The probability interpretation could thus either be given in terms of Schrödinger wave functions or in terms of transformation matrices.14

Heisenberg explicitly made the connection between wave functions and transformation matrices in the letter to Jordan from which we quoted above:

is the solution of a transformation to principal axes and also of a differential equation à la Schrödinger, though by no means always in -space. One can introduce matrices of a very general kind, e.g., with indices . The that solves Born’s transformation to principal axes in Qu.M. II (Ch. 3, Eq. (13)) is Schrödinger’s (Heisenberg to Jordan, November 24, 1926, AHQP).15

Eq. (13) in Ch. 3 of “Qu.M. II,” the famous Dreimännerarbeit, is (Born, Heisenberg, and Jordan, 1926, p. 351). This equation summarizes how one solves problems in matrix mechanics: one has to find the matrix for a “transformation to principal axes,” i.e., a canonical transformation from initial coordinates to new coordinates in which the Hamiltonian becomes the diagonal matrix . In Ch. 3 of the Dreimännerarbeit, i.e., prior to Schrödinger’s work, Born, who wrote the chapter, had already come close to making the connection Heisenberg makes here between transformation matrices and solutions of the time-independent Schrödinger equation.16
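A toy numerical sketch may help fix ideas about what a “transformation to principal axes” does. It is our own illustration, not a reconstruction of any calculation in the papers discussed here. It assumes a real symmetric 2×2 matrix standing in for the Hamiltonian of a two-state system, with equal diagonal entries (a resonance-like case) and a small off-diagonal perturbation. The function names and the numerical values are hypothetical.

```python
import math


def principal_axes_2x2(h11, h12, h22):
    """Rotation S that diagonalizes the real symmetric matrix [[h11, h12], [h12, h22]].

    The off-diagonal element of S^T H S vanishes when tan(2*theta) = 2*h12 / (h11 - h22).
    """
    theta = 0.5 * math.atan2(2.0 * h12, h11 - h22)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def transform(S, H):
    """Return S^T H S for 2x2 real matrices (S^T equals S^-1 for a rotation)."""
    St = [[S[0][0], S[1][0]], [S[0][1], S[1][1]]]
    HS = [[sum(H[i][k] * S[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(St[i][k] * HS[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Assumed toy values: two degenerate unperturbed levels coupled by a small interaction.
H = [[1.0, 0.1], [0.1, 1.0]]
S = principal_axes_2x2(H[0][0], H[0][1], H[1][1])
W = transform(S, H)  # approximately diagonal, with entries 1.1 and 0.9

# Squared elements of S, read as probabilities in the spirit of Heisenberg's proposal.
probabilities = [[S[i][j] ** 2 for j in range(2)] for i in range(2)]
```

In this degenerate toy case every squared element of the transformation matrix comes out as 1/2, and each row and column sums to one, which is what makes a probability reading of these elements possible at all.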

Expanding on his comments in the letter to Jordan, Heisenberg began his next paper, the sequel to the paper on resonance phenomena (Heisenberg, 1926a) that he had used for his note on fluctuations (Heisenberg, 1926b), with a three-page synopsis of the remarkable new formalism that subsumed both wave and matrix mechanics (Heisenberg, 1927a, sec. 1, pp. 240–242). This paper was received by Zeitschrift für Physik on December 22, 1926. The new formalism was of such recent vintage at that point that Heisenberg (1927a, p. 240, note) had to list the three main sources he cited for it (Dirac, 1927; Jordan, 1927b; Heisenberg, 1926b) as “forthcoming” [im Erscheinen]. He made much of the third item, his own note on fluctuations. Jordan, he wrote, “has independently found results that are equivalent to those of the Dirac paper and those of a preceding paper by the author” (Heisenberg, 1927a, p. 240, note; our emphasis). He referred to the latter again when he introduced the probability interpretation of arbitrary transformation matrices (ibid., p. 242), Dirac’s far-reaching generalization of his own proposal in his note on fluctuations. Heisenberg cited Jordan’s paper in addition to his own note but not Dirac’s paper. Moreover, he did not mention Born’s (1926a,b) statistical interpretation of the wave function anywhere in this brief exposition of the new general formulation of quantum mechanics. This undoubtedly reflects Heisenberg’s initial hostility toward Born’s seminal contribution (see note 10).17

In his later reminiscences, Heisenberg did give Born and Dirac their due (cf. note 10). Discussing the statistical interpretation of the quantum formalism in his contribution to the Pauli memorial volume, for instance, he mentioned Born, Dirac, and Pauli along with his own modest contribution (see the italicized sentence below). On this occasion, however, he omitted Jordan, perhaps because, for reasons we will discuss below, he much preferred Dirac’s version of transformation theory over Jordan’s. Heisenberg wrote:

In the summer of 1926, Born developed his theory of collisions and, building on an earlier idea of Bohr, Kramers, and Slater, correctly interpreted the wave [function] in multi-dimensional configuration space as a probability wave.18 Pauli thereupon explained to me in a letter that Born’s interpretation is only a special case of a much more general interpretation prescription. He pointed out that one could, for instance, interpret as the probability that the particle has a momentum between and .19 This fit well with my own considerations about fluctuation phenomena. In the fall of 1926, Dirac developed his transformation theory, in which then in all generality the absolute squares of matrix elements of unitary transformation matrices were interpreted as probabilities (Heisenberg, 1960, p. 44; our emphasis).

Born’s work was undoubtedly more important for the development of the Dirac-Jordan statistical transformation theory than Heisenberg’s. Before acknowledging Heisenberg’s note on fluctuations, Dirac (1927, p. 621), in fact, had already acknowledged both the preliminary announcement and the full exposition of Born’s (1926a,b) theory of quantum collisions, published in July and September of 1926, respectively. These papers contained the statistical interpretation of the wave function for which Born was awarded part of the 1954 Nobel Prize in physics.

Concretely, Born (1926b, p. 805) suggested that, given a large number of systems in a superposition of energy eigenfunctions , the fraction of systems with eigenfunctions is given by the absolute square of the complex expansion coefficients .20 In the preliminary version of the paper, Born (1926a, p. 865) introduced his probability interpretation examining the case of inelastic scattering of an electron by an atom. He wrote the wave function of the electron long after and far away from the point of interaction as a superposition of wave functions for free electrons flying off in different directions. As in the case of the expansion in terms of energy eigenstates mentioned above, Born interpreted the absolute square of the coefficients in this expansion in terms of free electron wave functions as the probability that the electron flies off in a particular direction. He famously only added in a footnote that this probability is not given by these coefficients themselves but by their absolute square (Born, 1926a, p. 865).
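Born’s rule for expansion coefficients can be illustrated with a short numerical sketch. It is only an illustration in modern terms: the function name and the two coefficients below are our own assumptions, and the normalization step is added for convenience rather than taken from Born’s papers.

```python
def born_fractions(coefficients):
    """Fractions of systems found in each eigenstate, given complex expansion coefficients.

    Each fraction is the absolute square of the coefficient, divided by the sum of
    all absolute squares so that the fractions add up to one.
    """
    norm = sum(abs(c) ** 2 for c in coefficients)
    return [abs(c) ** 2 / norm for c in coefficients]


# Hypothetical superposition of two energy eigenfunctions (values chosen for illustration).
coefficients = [0.6, 0.8j]
fractions = born_fractions(coefficients)  # approximately [0.36, 0.64]
```

The second coefficient is purely imaginary, which makes the point of Born’s footnote concrete: the coefficient itself cannot serve as a probability, but its absolute square can.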

In Neue Begründung I, Jordan (1927b, p. 811) cited Born’s second, longer paper on quantum collisions and a more recent one in which Born elaborated on his statistical interpretation of the wave function (Born, 1926b, c). In the latter, Born (1926c, p. 174) noted that his probability interpretation typically leads to the occurrence of interference terms, although he did not yet use that term (see note 37 for a simple example).2122 These two papers by Born (1926b, c) are mentioned much more prominently in Neue Begründung I than Heisenberg’s (1926b) note on fluctuations, which is cited only as providing an example of a special case in which there are no interference terms (Jordan, 1927b, p. 812, note).

In a two-part overview of recent developments in quantum theory that appeared in Die Naturwissenschaften in July and August 1927, Jordan (1927i, Pt. 2, p. 645) accorded Heisenberg’s note a more prominent role, recognizing it, along with one of his own papers, as an important step toward statistical transformation theory. Jordan’s (1927a) paper was submitted about three weeks after Heisenberg’s (1926c) but was independent of it.23 The two papers are remarkably similar. Both are concerned with the reconciliation of two descriptions of the energy exchange between two quantum systems, a continuous description in terms of a mechanism similar to beats between two waves and a description in terms of quantum jumps (cf. note 8). Both argued, against Schrödinger,24 that despite appearances to the contrary quantum jumps are unavoidable. As Jordan (1927i, Pt. 2, pp. 645–646) concluded in his Naturwissenschaften article, the probabilistic nature of the laws of quantum mechanics is key to reconciling the continuous and discontinuous descriptions. In the letter that Heisenberg mentioned in the passage quoted above (see note 18), Pauli had drawn the same conclusion from his analysis of an ingenious example of his own devising that he told Heisenberg was “a pure culture of your favorite resonance phenomenon” (Pauli, 1979, p. 344). Instead of two quantum systems exchanging energy, Pauli considered one quantum system, a particle constrained to move on a closed ring, which periodically encounters a small obstacle. He considered the case in which the particle is in a state in which it should alternate between rotating clockwise and rotating counterclockwise. That was only possible, Pauli explained, if we accept the conclusion of Born’s analysis of quantum collisions that there is a definite probability that the system reverses course upon hitting the obstacle, which classically would not be large enough to change the direction of the system’s rotation.
Unlike Pauli in this letter or Heisenberg (1926b) in his note on fluctuations, Jordan (1927a) did not elaborate on exactly how the statistical element in his example of a resonance phenomenon should formally be introduced, via wave functions à la Born or via transformation matrices à la Heisenberg. In the Naturwissenschaften article, he did not resolve this issue either. Instead, Jordan (1927i, Pt. 2, p. 646) wrote in the paragraph immediately following his discussion of resonance phenomena that the statistical nature of the quantum laws “manifests itself in many ways even more impressively and intuitively” in Born’s (1926b) analysis of quantum collisions. This strongly suggests that Jordan’s statistical interpretation of quantum mechanics in Neue Begründung I owed more to Born’s statistical interpretation of wave functions than to Heisenberg’s statistical interpretation of transformation matrices.

Jordan wanted to use what he had learned from Pauli to provide a new unified foundation for the laws of quantum mechanics by showing that they can be derived as “consequences of a few simple statistical assumptions” (Jordan, 1927b, pp. 810–811). The central quantities in Jordan’s formalism are what he called “probability amplitudes.” This echoes Born’s (1926b, p. 804) term “probability waves” for Schrödinger wave functions but is an extremely broad generalization of Born’s concept. Moreover, Jordan (1927b, p. 811) credited Pauli rather than Born with suggesting the term. Jordan defined a complex probability amplitude, , for two arbitrary quantum-mechanical quantities and with fully continuous spectra.28 At this point, he clearly labored under the illusion that it would be relatively straightforward to generalize his formalism to cover quantities with wholly or partly discrete spectra as well. In Neue Begründung II, Jordan (1927g) would discover that such a generalization is highly problematic. Unaware of these complications and following Pauli’s lead, Jordan interpreted as the conditional probability for finding a value between and for given that the system under consideration has been found to have the value for the quantity .29

Eigenfunctions $\psi_E(q)$ with eigenvalues $E$ for some Hamiltonian $H$ of a one-dimensional system in configuration space are examples of Jordan’s probability amplitudes. The two quantities in this case are the position $q$ and the Hamiltonian $H$. Hence $|\psi_E(q)|^2 \, dq$ gives the conditional probability that the position has a value between $q$ and $q + dq$ given that $H$ has the value $E$. This is the special case of Jordan’s interpretation given in Pauli’s paper on gas degeneracy (in dimensions).
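As a concrete, if anachronistic, illustration of reading an energy eigenfunction as a conditional probability amplitude, consider the following sketch. The choice of system is our assumption and does not come from the sources discussed here: a particle in a one-dimensional box of unit length, with the standard eigenfunctions $\sqrt{2/L}\,\sin(n\pi q/L)$. The function names and the simple midpoint-rule integration are likewise illustrative.

```python
import math


def box_eigenfunction(n, q, length=1.0):
    """n-th energy eigenfunction of a particle in a box [0, length] (assumed toy system)."""
    return math.sqrt(2.0 / length) * math.sin(n * math.pi * q / length)


def probability_in_interval(n, q_low, q_high, length=1.0, steps=10000):
    """Midpoint-rule integral of |psi_n(q)|^2 over [q_low, q_high].

    Read as: probability of finding the position in the interval, given that
    the energy has its n-th eigenvalue.
    """
    dq = (q_high - q_low) / steps
    total = sum(box_eigenfunction(n, q_low + (k + 0.5) * dq, length) ** 2
                for k in range(steps))
    return total * dq


p_left_half = probability_in_interval(1, 0.0, 0.5)  # close to 0.5 by symmetry
p_total = probability_in_interval(1, 0.0, 1.0)      # close to 1.0 (normalization)
```

The integrand is the absolute square of the eigenfunction, so the infinitesimal version of this calculation is just the conditional probability for a position between $q$ and $q + dq$ given the energy.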

As indicated above, Jordan took an axiomatic approach in his Neue Begründung papers. Here we clearly see the influence of the mathematical tradition in Göttingen (Lacki, 2000). In fact, Jordan had been Richard Courant’s assistant before becoming Born’s. In his Neue Begründung papers, Jordan began with a series of postulates about his probability amplitudes and the rules they ought to obey (the formulation and even the number of these postulates varied) and then developed a formalism realizing these postulates.

A clear description of the task at hand can be found in a paper by David Hilbert, Lothar Nordheim, and the other main protagonist of our story, John von Neumann, who had come to Göttingen in the fall of 1926 on a fellowship of the International Education Board (Mehra and Rechenberg, 2000–2001, pp. 401–402). Born in 1903, von Neumann was a year younger than Dirac and Jordan, who, in turn, were a year younger than Heisenberg and two years younger than Pauli. The paper by Hilbert, von Neumann, and Nordheim grew out of Hilbert’s course on quantum mechanics in 1926/1927 for which Nordheim prepared most of the notes.30 The course concluded with an exposition of Neue Begründung I (Sauer and Majer, 2009, pp. 698–706). The notes for this part of the course formed the basis for a paper, which was submitted in April 1927 but not published, for whatever reason, until the beginning of 1928.31 In the introduction, the authors described the strategy for formulating the theory:

One imposes certain physical requirements on these probabilities, which are suggested by earlier experience and developments, and the satisfaction of which calls for certain relations between the probabilities. Then, secondly, one searches for a simple analytical apparatus in which quantities occur that satisfy these relations exactly (Hilbert, von Neumann, and Nordheim, 1928, pp. 2–3; cf. Lacki, 2000, p. 296).

After everything that has been said so far, it will not come as a surprise that the quantities satisfying the relations that Jordan postulated for his probability amplitudes are essentially32 the transformation matrices central to Dirac’s (1927) presentation of the statistical transformation theory. An example, based on lecture notes by Dirac,33 will give a rough illustration of how this works.

One of the features that Jordan saw as characteristic of quantum mechanics and that he therefore included among his postulates was that in quantum mechanics the simple addition and multiplication rules of ordinary probability theory for mutually exclusive and independent outcomes, respectively, apply to probability amplitudes rather than to the probabilities themselves. Jordan (1927b, p. 812) used the phrase “interference of probabilities” for this feature. He once again credited Pauli with the name for this phenomenon, even though Born (1926b, p. 804) had already talked about the “interference of … ‘probability waves’” in his paper on quantum collisions. As we saw above, Born (1926c, p. 174) had discussed the feature itself in a subsequent paper, albeit only in the special case of Schrödinger wave functions. Jordan (1927b) was the first, at least in print, to recognize this feature explicitly in full generality. It is implicit in Dirac’s (1927) version of transformation theory, but Dirac may well have shared the skepticism about the interference of probabilities that Heisenberg expressed in a letter to Jordan (we will quote and discuss the relevant passage below).

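As we will see in Section 2.1, Jordan’s postulate about the addition and multiplication of probability amplitudes basically boils down to the requirement that the amplitudes for any three quantities $a$, $b$, and $c$ with purely continuous spectra compose by integration over the intermediate variable: the amplitude relating $a$ and $c$ is the integral over $b$ of the product of the amplitude relating $a$ and $b$ and the amplitude relating $b$ and $c$. It is easy to see intuitively, though much harder to prove rigorously (see Section 1.2), that this relation is indeed satisfied if these three amplitudes are equated with the transformation matrices relating the corresponding pairs of quantities, which in the notation of Dirac (1927) explained above are $(a/b)$, $(b/c)$, and $(a/c)$. These matrices relate wave functions in $a$-, $b$-, and $c$-space to one another. From $\psi(a) = \int db \, (a/b) \, \psi(b)$ and $\psi(b) = \int dc \, (b/c) \, \psi(c)$, it follows that

$$\psi(a) = \iint db \, dc \, (a/b) \, (b/c) \, \psi(c).$$

Comparing this expression to $\psi(a) = \int dc \, (a/c) \, \psi(c)$, we see that $(a/c) = \int db \, (a/b) \, (b/c)$, in accordance with Jordan’s postulates.34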
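A finite-dimensional analogue makes both the composition rule and the “interference of probabilities” tangible. The sketch below is our own illustration under stated assumptions: the integrals over continuous spectra are replaced by sums over two discrete values, and both transformation matrices are taken to be the same real orthogonal 2×2 matrix. None of these choices comes from Jordan’s or Dirac’s papers, and the function names are hypothetical.

```python
import math


def compose(m_ab, m_bc):
    """Discrete analogue of the composition rule: (a/c) = sum over b of (a/b)(b/c)."""
    return [[sum(m_ab[i][k] * m_bc[k][j] for k in range(len(m_bc)))
             for j in range(len(m_bc[0]))]
            for i in range(len(m_ab))]


def apply(matrix, vector):
    """Transform the components of a wave function from one basis to another."""
    return [sum(matrix[i][k] * vector[k] for k in range(len(vector)))
            for i in range(len(matrix))]


r = 1.0 / math.sqrt(2.0)
a_b = [[r, r], [r, -r]]  # assumed toy transformation matrix (a/b)
b_c = [[r, r], [r, -r]]  # assumed toy transformation matrix (b/c)
a_c = compose(a_b, b_c)  # comes out as the identity matrix, up to rounding

# Transforming in two steps agrees with transforming in one step with the composed matrix.
psi_c = [1.0, 0.0]
psi_a_two_steps = apply(a_b, apply(b_c, psi_c))
psi_a_one_step = apply(a_c, psi_c)

# Probabilities from composed amplitudes versus the ordinary rule for probabilities.
with_interference = [[abs(x) ** 2 for x in row] for row in a_c]
without_interference = [[sum(abs(a_b[i][k]) ** 2 * abs(b_c[k][j]) ** 2 for k in range(2))
                         for j in range(2)]
                        for i in range(2)]
```

Multiplying and adding the amplitudes first and squaring afterwards gives probabilities of one and zero in this toy case, whereas multiplying and adding the probabilities themselves would give 1/2 throughout. The difference between the two results consists entirely of interference terms.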

This brings us to an important difference between Jordan’s and Dirac’s versions of their statistical transformation theory. For Dirac, the transformation element was primary; for Jordan, the statistical element was. Most of Dirac’s (1927) paper is devoted to the development of the formalism that allowed him to represent the laws of quantum mechanics in different yet equivalent ways (secs. 2–5, pp. 624–637). The probability interpretation of the transformation matrices is then grafted onto this formalism in the last two sections (ibid., secs. 6–7, pp. 637–641). Jordan’s (1927b) paper begins with the axioms about probability (Pt. I, secs. 1–2, pp. 809–816). It is then shown that those can be implemented by equating probability amplitudes with transformation matrices, or, to be more precise, with integral kernels of canonical transformations (Pt. II, secs. 1–6, pp. 816–835).

Heisenberg strongly preferred Dirac’s version of statistical transformation theory over Jordan’s. For one thing, he disliked Jordan’s axiomatic approach. As he told Kuhn in his interview for the AHQP project:

Jordan used this transformation theory for deriving what he called the axiomatics of quantum theory … This I disliked intensely … Dirac kept within the spirit of quantum theory while Jordan, together with Born, went into the spirit of the mathematicians (AHQP interview with Heisenberg, session 11, pp. 7–8; quoted by Duncan and Janssen, 2009, pp. 360–361).

Presumably, the axiomatic approach in and of itself would not have presented too much of an obstacle for Heisenberg. As Paul Ehrenfest told Jordan, who still remembered it with amusement when interviewed decades later: “Since you wrote the paper axiomatically, that only means that one has to read it back to front.”35 In that case, one would encounter probability amplitudes in the guise of transformation matrices first, as in Dirac’s version of the theory.

Heisenberg had more serious reservations about Jordan’s version of the theory, which initially made it difficult for him even to understand Neue Begründung I. He expressed his frustration in a letter to Pauli a few weeks after the paper was published. After praising Jordan’s (1927d) habilitation lecture, which had just appeared in Die Naturwissenschaften, Heisenberg wrote:

I could not understand Jordan’s big paper in [Zeitschrift für Physik]. The “postulates” are so intangible and undefined, I cannot make heads or tails of them (Heisenberg to Pauli, February 5, 1927; Pauli, 1979, p. 374).36

About a month later, Heisenberg wrote to Jordan himself, telling him that he was working

on a fat paper [Heisenberg, 1927b, on the uncertainty principle] that one might characterize as physical commentary on your paper and Dirac’s. You should not hold it against me that I consider this necessary. The essence from a mathematical point of view is roughly that it is possible with your mathematics to give an exact formulation of the case in which $p$ and $q$ are both given with a certain accuracy (Heisenberg to Jordan, March 7, 1927, AHQP; emphasis in the original).37

Heisenberg now had a clearer picture of Jordan’s approach and of how it differed from the approach of Dirac, who had meanwhile left Copenhagen and had joined Born and Jordan in Göttingen. After registering some disagreements with Dirac, Heisenberg turned to his disagreements with Jordan:

With you I don’t quite agree in that, in my opinion, the relation has nothing to do with the laws of probability. In all cases in which one can talk about probabilities the usual addition and multiplication of probabilities is valid, without “interference.” With Dirac I believe that it is more accurate to say: all statistics is brought in only through our experiments (Heisenberg to Jordan, March 7, 1927, AHQP; emphasis in the original).

The reservation expressed in the first two sentences is largely a matter of semantics. Heisenberg did not dispute that the relation given by Jordan, which we wrote as above, holds in quantum mechanics (as we will see in Section 1.2, it corresponds to the familiar completeness and orthogonality relations). Heisenberg also did not dispute that this relation describes interference phenomena.38 What Heisenberg objected to was Jordan’s way of looking upon this relation as a consequence of his addition and multiplication rules for probability amplitudes. This did not materially affect Jordan’s theory as Jordan (1927b, p. 813) only ever used those rules to derive this particular relation.39

The reference to Dirac for Heisenberg’s second reservation is to the concluding sentence of Dirac’s paper on transformation theory:

The notion of probabilities does not enter into the ultimate description of mechanical processes: only when one is given some information that involves a probability … can one deduce results that involve probabilities (Dirac, 1927, p. 641).40

Instead of elaborating on this second reservation, Heisenberg told Jordan that he had just written a 14-page letter about these matters to Pauli and suggested that Jordan have Pauli send him this letter for further details. In this letter, the blueprint for his uncertainty paper, Heisenberg told Pauli:

One can, like Jordan, say that the laws of nature are statistical. But one can also, and that to me seems considerably more profound, say with Dirac that all statistics are brought in only through our experiments. That we do not know at which position the electron will be the moment of our experiment, is, in a manner of speaking, only because we do not know the phases, if we do know the energy … and in this respect the classical theory would be no different. That we cannot come to know the phases without … destroying the atom is characteristic of quantum mechanics (Heisenberg to Pauli, February 23, 1927; Pauli, 1979, Doc. 154, p. 377; emphasis in the original; see also Heisenberg, 1927b, p. 177).

So Heisenberg—and with him Dirac—held on to the idea that nature itself is deterministic and that all the indeterminism that quantum mechanics tells us we will encounter is the result of unavoidable disturbances of nature in our experiments. Jordan, by contrast, wanted at least to keep open the possibility that nature is intrinsically indeterministic (as did Born [1926a, p. 866; 1926b, pp. 826–827]). Jordan (1927d) made this clear in his habilitation lecture.41 In his Neue Begründung papers, Jordan did not discuss the nature of the probabilities he introduced. He did not even properly define these probabilities. This was done only by von Neumann (1927b) with the help of the notion of ensembles of systems from which one randomly selects members, a notion developed by Richard von Mises and published in book form the following year (von Mises, 1928, see note 134).

As far as the issue of determinism versus indeterminism is concerned, Dirac thus stayed closer to classical theory than Jordan. In other respects, however, Jordan stayed closer. Most importantly, Jordan’s use of canonical transformations is much closer to their use in classical mechanics than Dirac’s.42 In the letter to Jordan from which we quoted right at the beginning of our paper, Dirac clearly identified part of the difference in their use of canonical transformations:

In your work I believe you considered transformations from one set of dynamical variables to another, instead of a transformation from one scheme of matrices representing the dynamical variables to another scheme representing the same dynamical variables, which is the point of view adopted throughout my paper. The mathematics appears to be the same in the two cases, however (Dirac to Jordan, December 24, 1926, AHQP).43

Traditionally, canonical transformations had been used the way Jordan used them, as transformations to new variables, and not the way Dirac used them, as transformations to new representations of the same variables. Canonical transformations had been central to the development of matrix mechanics (Born and Jordan, 1925; Born, Heisenberg, and Jordan, 1926; see Section 2.2 below). Prior to Neue Begründung I, Jordan (1926b, c) had actually published two important papers on the implementation of canonical transformations in matrix mechanics (Lacki, 2004; Duncan and Janssen, 2009). Jordan’s use of canonical transformations in his Neue Begründung papers was twofold. First, as already mentioned above, Jordan tried to show that integral kernels in canonical transformations have all the properties that probability amplitudes must satisfy according to his postulates. Second, he tried to use canonical transformations to derive differential equations for probability amplitudes for arbitrary quantities (such as the time-independent Schrödinger equation for energy eigenfunctions) from the trivial differential equations satisfied by the probability amplitude for some generalized coordinate $q$ and its conjugate momentum $p$ that he started from. The ‘mind your p’s and q’s’ part of the title of our paper refers to the crucial role of canonical transformations and conjugate variables in Jordan’s formalism.

Both ways in which Jordan relied on canonical transformations in his Neue Begründung papers turned out to be problematic and resulted in serious mathematical problems for his version of statistical transformation theory. These problems do not affect Dirac’s version since Dirac (1927) only relied on canonical transformations in a very loose sense. However, as we will see once we have introduced some modern tools to analyze the Dirac-Jordan theory in Section 1.2, the two versions of the theory do share a number of other mathematical problems. The ‘never mind your $p$’s and $q$’s’ part of our title thus does not refer to Dirac’s version of statistical transformation theory, but to the fundamentally different Hilbert space formalism that von Neumann (1927a) introduced as an alternative to the Dirac-Jordan theory (Section 1.3).

To conclude this subsection, we briefly indicate how Jordan ran into problems with his twofold use of canonical transformations in Neue Begründung and explain why those problems do not plague Dirac’s version of the theory. We return to these problems in Section 1.2, relying more heavily on modern concepts and notation, and analyze them in greater detail in Sections 2 and 4, again helping ourselves to modern tools. As Heisenberg (1960, p. 44) wrote in the passage from his contribution to the Pauli memorial volume quoted above, Dirac only considered unitary transformations in his transformation theory. Unitary transformations of Hermitian operators preserve their Hermiticity. Unfortunately, there are many canonical transformations that are not unitary (Duncan and Janssen, 2009, p. 356). In Neue Begründung I, Jordan therefore tried to develop a formalism that would allow quantities that, from a modern point of view, correspond to non-Hermitian operators. As long as the quantities $\hat{a}$ and $\hat{b}$ correspond to Hermitian operators, the probability amplitude $\varphi(a,b)$ is the integral kernel of a unitary transformation and the conditional probability $\Pr(a|b)$ is equal to $|\varphi(a,b)|^2$. In such cases, the integral kernel $\varphi(a,b)$ is simply identical to Dirac’s unitary transformation matrix $(a/b)$. As soon as $\hat{a}$ and/or $\hat{b}$ are non-Hermitian, however, $\varphi(a,b)$ is the integral kernel of a non-unitary transformation and the expression for $\Pr(a|b)$ becomes more complicated, involving both the probability amplitude itself and what Jordan called a “supplementary amplitude” (Ergänzungsamplitude). Following the lead of Hilbert, von Neumann, and Nordheim (1928), Jordan dropped the Ergänzungsamplitude in Neue Begründung II. The price he paid for this was what, from the point of view of classical mechanics, amounted to the rather arbitrary restriction of the canonical transformations allowed in his theory to unitary ones.

The other main problem that Jordan ran into with his use of canonical transformations is directly related to his axiomatic approach. Unlike Dirac, Jordan wanted to derive the entire theory from his statistical postulates. This meant, for instance, that Jordan did not include the canonical commutation relation, $[\hat{p},\hat{q}] = h/2\pi i$ (where $h$ is Planck’s constant), among his postulates. Nor did he assume the usual association of the momentum $\hat{p}$ conjugate to $\hat{q}$ with the differential operator $(\hbar/i)\,\partial/\partial q$ (with $\hbar \equiv h/2\pi$) acting on wave functions in $q$-space. Dirac assumed both these elements. Instead of the commutation relation for $\hat{p}$ and $\hat{q}$, Jordan (1927b, p. 814) assumed that the probability amplitude $\rho(p,q)$ for these two quantities had a particularly simple form, in our notation: $\rho(p,q) = e^{-ipq/\hbar}$. This then is how Planck’s constant enters Jordan’s formalism. Normally, the canonical commutation relation defines what it means for $\hat{p}$ and $\hat{q}$ to be canonically conjugate variables. Jordan, however, defined $\hat{p}$ and $\hat{q}$ to be canonically conjugate if and only if $\rho(p,q) = e^{-ipq/\hbar}$.44 This probability amplitude tells us that “[f]or a given value of $\hat{q}$ all possible values of $\hat{p}$ are equally probable” (Jordan, 1927b, p. 814; emphasis in the original; hats added).45 The amplitude $\rho(p,q) = e^{-ipq/\hbar}$ trivially satisfies the equations $\left(p + (\hbar/i)\,\partial/\partial q\right)\rho = 0$ and $\left((\hbar/i)\,\partial/\partial p + q\right)\rho = 0$ (ibid.). Subjecting these basic equations to canonical transformations, Jordan argued for the usual association of quantum-mechanical quantities with differential operators acting on wave functions (e.g., $\hat{p}$ with $(\hbar/i)\,\partial/\partial q$). Once those associations have been made, his assumption about the form of $\rho(p,q)$ entails the usual commutation relation for $\hat{p}$ and $\hat{q}$. Performing more canonical transformations on the basic equations for $\rho(p,q)$, Jordan also tried to derive differential equations for probability amplitudes involving quantities related to the initial $\hat{p}$ and $\hat{q}$ via canonical transformations. In this manner, for instance, he tried to recover the time-independent Schrödinger equation.
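The two claims about the basic amplitude can be checked in a few lines. The following is only a sketch in modern notation, assuming the basic amplitude has the form $\rho(p,q) = e^{-ipq/\hbar}$ with $\hbar \equiv h/2\pi$ and letting $f(q)$ stand for an arbitrary differentiable wave function (a symbol introduced here for illustration only):

$$
\begin{aligned}
\frac{\partial \rho}{\partial q} &= -\frac{ip}{\hbar}\,\rho
&&\Longrightarrow&
\left(p + \frac{\hbar}{i}\frac{\partial}{\partial q}\right)\rho &= p\,\rho - p\,\rho = 0, \\
\frac{\partial \rho}{\partial p} &= -\frac{iq}{\hbar}\,\rho
&&\Longrightarrow&
\left(\frac{\hbar}{i}\frac{\partial}{\partial p} + q\right)\rho &= -q\,\rho + q\,\rho = 0, \\
&&&&
\left[\frac{\hbar}{i}\frac{\partial}{\partial q},\, q\right] f(q)
&= \frac{\hbar}{i}\left(\frac{\partial (q f)}{\partial q} - q\,\frac{\partial f}{\partial q}\right)
= \frac{\hbar}{i}\, f(q).
\end{aligned}
$$

The first two lines are the two basic equations. The third shows that, once $\hat{p}$ is associated with $(\hbar/i)\,\partial/\partial q$, the commutator $[\hat{p},\hat{q}] = \hbar/i = h/2\pi i$ follows. Since $|\rho(p,q)|^2 = 1$ for all $p$ and $q$, no value of $p$ is favored for a given value of $q$, which is the content of the statement that all values are equally probable.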

It was not until Neue Begründung II, written in Copenhagen on an International Education Board fellowship and received by Zeitschrift für Physik on June 3, 1927, that Jordan realized that this strategy for deriving Schrödinger-type differential equations for his probability amplitudes was severely limited. He came to this realization as he was trying to extend his formalism, which in Neue Begründung I was restricted to quantities with continuous spectra, to cover quantities with partly or wholly discrete spectra as well. The key problem (as we will show using modern tools at the end of Section 2.3) is that quantities related by a canonical transformation have the same spectrum. It follows that Jordan’s procedure to get from the differential equations for probability amplitudes for one pair of quantities to those for another pair fails as soon as one pair has purely continuous spectra (such as $\hat{p}$ and $\hat{q}$) and the other pair has partly or wholly discrete spectra (such as, typically, the Hamiltonian). In response to this problem, Jordan changed the way he used canonical transformations to the way Dirac had been using them all along, where the transformations are no longer to new quantities but rather to new representations of the same quantities (Jordan, 1927g, pp. 16–17; we will quote the relevant passage in Section 4).

The treatment of quantities with partly or wholly discrete spectra in Neue Begründung II necessitated further departures from the classical formalism of canonical transformations and conjugate variables. For quantities with fully continuous spectra, Jordan could show that his definition of conjugate variables in terms of a probability amplitude reduces to the standard definition in terms of a commutation relation. This is not true for quantities with partly or wholly discrete spectra. In Neue Begründung II, Jordan gave a simple proof that such quantities can never satisfy the standard commutation relation (see Section 4, Eqs. (103)–(108)). Jordan presented it as a point in favor of his formalism that his alternative definition of canonically conjugate variables works for quantities with partly or wholly discrete spectra as well. Jordan’s definition, however, led to such counter-intuitive propositions as different components of spin qualifying as canonically conjugate to one another.46

All in all, the state of affairs by the end of Neue Begründung II can fairly be characterized as follows. Jordan was still trying to rely on the classical formalism of canonical transformations and conjugate variables to build up the formalism realizing the postulates for his quantum-mechanical probability amplitudes. However, these classical concepts had to be stretched almost beyond the breaking point to arrive at a satisfactory formulation of his quantum theory.

### 1.2 Mathematical challenges facing the Dirac-Jordan theory

From a modern point of view, the realization of the axioms of Neue Begründung is supplied by the Hilbert space formalism.47 Jordan’s probability amplitudes $\varphi(a,b)$ are then identified with ‘inner products’ $\langle a|b\rangle$ of ‘eigenvectors’ $|a\rangle$ and $|b\rangle$ of operators $\hat{a}$ and $\hat{b}$.48 The reason we used scare quotes in the preceding sentence is that for quantities with completely continuous spectra, to which Jordan restricted himself in Neue Begründung I, the ‘eigenvectors’ of the corresponding operators are not elements of Hilbert space. That in modern quantum mechanics they are nonetheless routinely treated as if they are vectors in Hilbert space with inner products such as $\langle a|b\rangle$ is justified by the spectral theorem for the relevant operators.

Of course, neither the spectral theorem nor the notions of an abstract Hilbert space and of operators acting in it were available when Jordan and Dirac published their respective versions of transformation theory in 1927. The Hilbert space formalism and the spectral theorem were only introduced later that year, by von Neumann (1927a), and two more years passed before von Neumann (1929) published a rigorous proof of the spectral theorem. So even though Dirac (1927) introduced the notation $(a/b)$ for what Jordan wrote as $\varphi(a,b)$, Dirac, like Jordan, did not at that time conceive of these quantities as ‘inner products’ of two more elementary quantities (see note 46). Although Dirac later did accept the split (once again see note 46), probability amplitudes remained the fundamental quantities for Jordan (Duncan and Janssen, 2009, p. 361).

Once the ‘inner-product’ structure of probability amplitudes is recognized and justified with the help of the spectral theorem, Jordan’s basic axioms about the addition and multiplication of probability amplitudes are seen to reduce to statements about orthogonality and completeness familiar from elementary quantum mechanics. For instance, as we mentioned in Section 1.1, Jordan’s postulates demand that the probability amplitudes $\varphi(a,c)$, $\psi(a,b)$, and $\chi(b,c)$ for quantities $\hat{a}$, $\hat{b}$, and $\hat{c}$ with purely continuous spectra satisfy the relation $\varphi(a,c) = \int db\, \psi(a,b)\, \chi(b,c)$. Once probability amplitudes are identified with ‘inner products’ of ‘eigenvectors’ (appropriately normalized, such that, e.g., $\langle a|a'\rangle = \delta(a-a')$, where $\delta(x)$ is the Dirac delta function), the familiar completeness relation, $\langle a|c\rangle = \int db\, \langle a|b\rangle \langle b|c\rangle$ (which holds on account of the resolution of unity, $\hat{1} = \int db\, |b\rangle\langle b|$, corresponding to the spectral decomposition of the operator $\hat{b}$), guarantees that $\varphi(a,c) = \int db\, \psi(a,b)\, \chi(b,c)$. In this sense, the Hilbert space formalism provides a realization of Jordan’s postulates.
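A finite-dimensional analog of this composition rule can be checked numerically. The sketch below is an illustration only: it replaces the continuous spectra of the text by three orthonormal bases of $\mathbb{C}^2$, so that the integral over $b$ becomes a sum, and it sidesteps exactly the delta-function issues that the spectral theorem is needed for. The helper names `inner`, `basis`, and `composition_error` are hypothetical and do not come from any of the papers discussed.

```python
import cmath
import math

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def basis(theta, phi):
    """An orthonormal basis of C^2 labelled by two angles (illustrative choice)."""
    c, s = math.cos(theta), math.sin(theta)
    e = cmath.exp(1j * phi)
    return [(c, s * e), (-s * e.conjugate(), c)]

def composition_error(A, B, C):
    """Largest deviation of <a|c> from sum_b <a|b><b|c> over all pairs (a, c)."""
    return max(
        abs(inner(a, c) - sum(inner(a, b) * inner(b, c) for b in B))
        for a in A
        for c in C
    )

# Three unrelated orthonormal bases play the roles of the 'eigenvectors'
# of the quantities a, b, and c in the text.
A, B, C = basis(0.3, 0.0), basis(1.1, 0.7), basis(2.0, -1.3)
print(composition_error(A, B, C) < 1e-12)
```

The check succeeds for any choice of the intermediate basis `B`, which is the finite-dimensional content of the resolution of unity.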

In the absence of the Hilbert space formalism and the spectral theorem, Jordan relied on the formalism of canonical transformations to develop the analytical apparatus realizing his axiomatic scheme. As we saw in Section 1.1, his starting point was the probability amplitude $\rho(p,q)$ for some generalized coordinate $\hat{q}$ and its conjugate momentum $\hat{p}$.49 This special probability amplitude, to reiterate, trivially satisfies two simple differential equations. Jordan then considered canonical transformations to other canonically conjugate variables $\hat{P}$ and $\hat{Q}$ and derived differential equations for arbitrary probability amplitudes starting from the ones for $\rho(p,q)$. In this way, he claimed, one could recover both the time-independent and the time-dependent Schrödinger equations as examples of such equations.

Both claims are problematic. The recovery of the time-dependent Schrödinger equation requires that we look upon the time not as a parameter as we would nowadays but as an operator to be expressed in terms of the operators $\hat{p}$ and $\hat{q}$. More importantly, Jordan’s construction only gets us to the time-independent Schrödinger equation for Hamiltonians with fully continuous spectra. In Neue Begründung I, Jordan deliberately restricted himself to quantities with completely continuous spectra, confident at that point that his approach could easily be generalized to quantities with wholly or partly discrete spectra. He eventually had to accept that this generalization fails. The problem, as he himself recognized in Neue Begründung II, is that two quantities $\hat{\alpha}$ and $\hat{\beta}$ related to each other via a canonical transformation (implemented by the similarity transformation $\hat{\beta} = T\hat{\alpha}T^{-1}$) always have the same spectrum.50 Hence, no canonical transformation that can be implemented in this way can take us from quantities such as $\hat{p}$ and $\hat{q}$ with a completely continuous spectrum to a Hamiltonian with a wholly or partly discrete spectrum.
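The fact that a similarity transformation leaves the spectrum untouched is easy to see in a toy example. The sketch below uses $2\times 2$ matrices in place of the operators of the text; the matrices `alpha` and `T` and the helper names are illustrative assumptions, chosen only so that the arithmetic stays exact.

```python
import cmath

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eigenvalues(M):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return sorted([(tr - root) / 2, (tr + root) / 2], key=lambda z: z.real)

def is_hermitian(M):
    """True if M equals its own conjugate transpose."""
    return all(M[i][j] == M[j][i].conjugate() for i in range(2) for j in range(2))

alpha = [[1, 2], [2, -1]]    # Hermitian, eigenvalues -sqrt(5) and +sqrt(5)
T = [[1, 1], [0, 1]]         # invertible, but deliberately not unitary
T_inv = [[1, -1], [0, 1]]    # inverse of T

beta = matmul(matmul(T, alpha), T_inv)   # beta = T alpha T^{-1}

print(eigenvalues(alpha) == eigenvalues(beta))   # same spectrum
print(is_hermitian(alpha), is_hermitian(beta))   # Hermiticity is not guaranteed
```

Because `T` is not unitary, `beta` has the same eigenvalues as `alpha` but is no longer Hermitian. A finite matrix cannot exhibit a continuous spectrum, so this only illustrates the algebraic point that the spectrum is an invariant of the transformation.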

The quantities $\varphi(a,b)$ in Jordan’s formalism do double duty as probability amplitudes and as integral kernels of canonical transformations. Even if we accept the restriction to quantities with fully continuous spectra for the moment, Jordan could not quite get his formalism to work, at least not at the level of generality he had hoped for. In hindsight, we can see that another major hurdle was that canonical transformations from one set of conjugate variables to another, although they do preserve the spectra, do not always preserve the Hermitian character of the operators associated with these variables in quantum mechanics (Duncan and Janssen, 2009, secs. 5–6). Initially Jordan tried to get around this problem through the introduction of the Ergänzungsamplitude. Once he dropped that notion, he had to restrict the allowed canonical transformations to those associated with unitary operators. In the modern Hilbert space formalism, the integral kernels of canonical transformations in Jordan’s formalism are replaced by unitary operators. There is no need anymore for considering canonical transformations nor, for that matter, for sorting quantities into pairs of conjugate variables. Jordan’s reliance on canonical transformations and conjugate variables became even more strained in Neue Begründung II, when he tried to extend his approach to quantities with partly or wholly discrete spectra. He had a particularly hard time dealing with the purely discrete spectrum of the recently introduced spin observable. Minding his $p$’s and $q$’s, Jordan ended up putting himself in a straitjacket.

### 1.3 Von Neumann’s alternative to the Dirac-Jordan theory

At the end of their exposition of Jordan’s theory in Neue Begründung I, Hilbert, von Neumann, and Nordheim (1928, p. 30) emphasized the mathematical difficulties with Jordan’s approach (some of which they had caught, some of which they too had missed), announced that they might return to these on another occasion, and made a tantalizing reference to the first of three papers on quantum mechanics that von Neumann would publish in 1927 in the Proceedings of the Göttingen Academy (von Neumann, 1927a, b, c). This trilogy formed the basis for his famous book (von Neumann, 1932). The first of these papers, “Mathematical foundations (Mathematische Begründung) of quantum mechanics,” is the one in which von Neumann (1927a) introduced the Hilbert space formalism and the spectral theorem (he only published a rigorous proof of the latter two years later). One might therefore expect at this juncture that von Neumann would simply make the observations that we made in Section 1.2, namely that the Hilbert space formalism provides the natural implementation of Jordan’s axiomatic scheme and that the spectral theorem can be used to address the most glaring mathematical problems with this implementation. Von Neumann, however, did nothing of the sort.51

Von Neumann was sharply critical of the Dirac-Jordan transformation theory. As he put it in the introduction of his 1932 book: “Dirac’s method does not meet the demands of mathematical rigor in any way—not even when it is reduced in the natural and cheap way to the level that is common in theoretical physics” (von Neumann, 1932, p. 2; our emphasis). He went on to say that “the correct formulation is not just a matter of making Dirac’s method mathematically precise and explicit but right from the start calls for a different approach related to Hilbert’s spectral theory of operators” (ibid., our emphasis).52 Von Neumann only referred to Dirac in this passage, but as co-author of the paper with Hilbert and Nordheim mentioned above, he was thoroughly familiar with Jordan’s closely related work as well. He also clearly appreciated the difference in emphasis between Dirac and Jordan. Talking about the Schrödinger wave function in the introduction of the second paper of his 1927 trilogy, he wrote: “Dirac interprets it as a row of a certain transformation matrix, Jordan calls it a probability amplitude” (von Neumann, 1927b, p. 246).53 In the opening paragraph of this article, von Neumann contrasted wave mechanics with “transformation theory” or “statistical theory,” once again reflecting the difference in emphasis between Dirac and Jordan. Yet, despite his thorough understanding of it, von Neumann did not care for the Dirac-Jordan approach.

Von Neumann’s best-known objection concerns the inevitable use of delta functions in the Dirac-Jordan approach. However, von Neumann also objected to the use of probability amplitudes. Jordan’s basic amplitude, $\rho(p,q) = e^{-ipq/\hbar}$, is not in the space $L^2$ of square-integrable functions that forms one instantiation of abstract Hilbert space. Moreover, probability amplitudes are only determined up to a phase factor, which von Neumann thought particularly unsatisfactory. “It is true that the probabilities appearing as end results are invariant,” he granted in the introduction of his paper, “but it is unsatisfactory and unclear why this detour through the unobservable and non-invariant is necessary” (von Neumann, 1927a, p. 3). So, rather than following the Dirac-Jordan approach and looking for ways to mend its mathematical shortcomings, von Neumann, as indicated in the passage from his 1932 book quoted above, adopted an entirely new approach. He generalized Hilbert’s spectral theory of operators54 to provide a formalism for quantum mechanics that is very different from the one proposed by Jordan and Dirac.

The only elements that von Neumann took from transformation theory (more specifically, Jordan’s version of it) were, first, Jordan’s basic idea that quantum mechanics is ultimately a set of rules for conditional probabilities $\Pr(a|b)$, and second, the fundamental assumption that such probabilities are given by the absolute square of the corresponding probability amplitudes, which essentially boils down to the Born rule.55 Von Neumann derived a new expression for conditional probabilities in quantum mechanics that avoids probability amplitudes altogether and instead sets them equal to the trace of products of projection operators, as they are now called. Instead of the term ‘projection operator’, von Neumann used the term Einzeloperator (or E.Op. for short; cf. note 116). The probability $\Pr(a|b)$, e.g., is given by $\mathrm{Tr}(\hat{E}(a)\hat{F}(b))$, where $\hat{E}(a)$ and $\hat{F}(b)$ are projection operators onto, in Dirac notation, the ‘eigenvectors’ $|a\rangle$ and $|b\rangle$ of the operators $\hat{a}$ and $\hat{b}$, respectively. Unlike probability amplitudes, these projection operators do not have any phase ambiguity. This is easily seen in Dirac notation. The projection operator $\hat{E}(a) = |a\rangle\langle a|$ does not change if the ket $|a\rangle$ is replaced by $e^{i\vartheta}|a\rangle$ and the bra $\langle a|$ accordingly by $e^{-i\vartheta}\langle a|$. We should emphasize, however, that, just as Jordan and Dirac with their probability amplitudes/transformation functions $\langle a|b\rangle$, von Neumann did not think of his projection operators as constructed out of bras and kets, thus avoiding the problem that many of these bras and kets are not in Hilbert space.
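Both points, the agreement of the trace formula with the absolute square of the amplitude and the absence of a phase ambiguity in the projection operators, can be illustrated in a two-dimensional toy model. This is a sketch under simplifying assumptions: the vectors `a` and `b` are arbitrary normalized elements of $\mathbb{C}^2$ standing in for the ‘eigenvectors’ of the text, and all function and variable names are hypothetical.

```python
import cmath
import math

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def projector(v):
    """Matrix of |v><v| for a normalized vector v."""
    return [[vi * vj.conjugate() for vj in v] for vi in v]

def trace_of_product(E, F):
    """Tr(E F), computed without forming the product matrix."""
    n = len(E)
    return sum(E[i][k] * F[k][i] for i in range(n) for k in range(n))

a = [1 / math.sqrt(2), 1j / math.sqrt(2)]
b = [math.cos(0.4), math.sin(0.4) * cmath.exp(0.9j)]

born = abs(inner(a, b)) ** 2                              # |<a|b>|^2
trace_value = trace_of_product(projector(a), projector(b))  # Tr(E F)

# Multiply the ket by a phase: the amplitude changes, the projector does not.
phase = cmath.exp(1.234j)
a_rephased = [phase * x for x in a]
amplitude_shift = abs(inner(a_rephased, b) - inner(a, b))
projector_shift = max(
    abs(p - q)
    for row_p, row_q in zip(projector(a_rephased), projector(a))
    for p, q in zip(row_p, row_q)
)

print(abs(trace_value - born) < 1e-12)
print(amplitude_shift > 0.1, projector_shift < 1e-12)
```

The amplitude picks up the phase while the projector, and with it the trace, stays the same to within rounding error.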

Toward the end of his paper, von Neumann (1927a, pp. 46–47) noted that his trace expression for conditional probabilities is invariant under “canonical transformations.” What von Neumann called canonical transformations, however, are not Jordan’s canonical transformations but simply, in modern terms, unitary transformations. Such transformations automatically preserve Hermiticity and the need for something like Jordan’s Ergänzungsamplitude simply never arises. Von Neumann noted that his trace expression for conditional probabilities does not change if the projection operators $\hat{E}(a)$ and $\hat{F}(b)$ are replaced by $\hat{U}\hat{E}(a)\hat{U}^\dagger$ and $\hat{U}\hat{F}(b)\hat{U}^\dagger$, where $\hat{U}$ is an arbitrary unitary operator ($\hat{U}^\dagger = \hat{U}^{-1}$). In von Neumann’s approach, as becomes particularly clear in his second paper of 1927 (see below), one also does not have to worry about sorting variables into sets of mutually conjugate ones. This then is what the ‘never mind your $p$’s and $q$’s’ part of the title of our paper refers to. By avoiding conjugate variables and canonical transformations, von Neumann completely steered clear of the problem that ultimately defeated Jordan’s attempt to derive all of quantum mechanics from his set of axioms, namely that canonical transformations can never get us from $p$’s and $q$’s with fully continuous spectra to quantities with wholly or partly discrete spectra, such as the Hamiltonian.
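The invariance of the trace expression under unitary transformations follows from the cyclic property of the trace together with $\hat{U}^\dagger\hat{U} = \hat{1}$. A minimal numerical sketch, again with $2\times 2$ matrices and with illustrative choices for the two projectors and the unitary matrix (none of which come from von Neumann’s paper):

```python
import cmath
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Conjugate transpose."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def conjugate_by(U, M):
    """Return U M U^dagger."""
    return matmul(matmul(U, M), dagger(U))

E = [[0.5, -0.5j], [0.5j, 0.5]]   # projector onto (1, i)/sqrt(2)
F = [[1, 0], [0, 0]]              # projector onto (1, 0)

c, s, e = math.cos(0.8), math.sin(0.8), cmath.exp(0.5j)
U = [[c, -s * e.conjugate()], [s * e, c]]   # an arbitrary unitary 2x2 matrix

before = trace(matmul(E, F))
after = trace(matmul(conjugate_by(U, E), conjugate_by(U, F)))
print(abs(after - before) < 1e-12)
```

Replacing `U` by a non-unitary invertible matrix and `dagger(U)` by its inverse would still preserve the trace, but the transformed matrices would in general no longer be Hermitian projectors, which is why the restriction to unitary transformations matters.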

In Mathematische Begründung, von Neumann not only provided an alternative to Jordan’s analysis of probabilities in quantum mechanics, he also provided an alternative to the Dirac-Jordan transformation-theory approach to proving the equivalence of matrix mechanics and wave mechanics (von Neumann, 1927a, p. 14). This is where von Neumann put the abstract notion of Hilbert space that he introduced in his paper to good use. He showed that matrix mechanics and wave mechanics correspond to two instantiations of abstract Hilbert space, the space $l^2$ of square-summable sequences and the space $L^2$ of square-integrable functions, respectively (Dieudonné, 1981, p. 172).56 As von Neumann reminded his readers, well-known theorems due to Parseval and Riesz and Fischer had established that $l^2$ and $L^2$ are isomorphic.57

In his second 1927 paper, “Probability-theoretical construction (Wahrscheinlichkeitstheoretischer Aufbau) of quantum mechanics,” von Neumann (1927b) freed himself even further from relying on the Dirac-Jordan approach. In Mathematische Begründung he had accepted the Born rule and recast it in the form of his trace formula. In Wahrscheinlichkeitstheoretischer Aufbau he sought to derive this trace formula, and thereby the Born rule, from more fundamental assumptions about probability. Drawing on ideas of von Mises (see note 134), von Neumann started by introducing probabilities in terms of selecting members from large ensembles of systems. He then made two very general and prima facie perfectly plausible assumptions about expectation values of quantities defined on such ensembles (von Neumann, 1927b, pp. 246–250). From those assumptions, some assumptions about the repeatability of measurements (von Neumann, 1927b, p. 271, p. 262, cf. note 137), and key features of his Hilbert space formalism (especially some assumptions about the association of observables with Hermitian operators), von Neumann did indeed manage to recover the Born rule. Admittedly, the assumptions needed for this result are not as innocuous as they look at first sight. They are essentially the same as those that go into von Neumann’s infamous no-hidden-variable proof (Bell, 1966; Bacciagaluppi and Crull, 2009; Bub, 2010).

Along the way von Neumann (1927b, p. 253) introduced what we now call a density operator to characterize the ensemble of systems he considered. He found that the expectation value of an observable represented by some operator $\hat{a}$ in an ensemble characterized by $\hat{\rho}$ is given by $\mathrm{Tr}(\hat{\rho}\,\hat{a})$, where we used the modern notation $\hat{\rho}$ for the density operator (von Neumann used the letter $U$). This result holds both for what von Neumann (1927b) called a “pure” (rein) or “uniform” (einheitlich) ensemble (p. 255), consisting of identical systems in identical states, and for what he called a “mixture” (Gemisch) (p. 265). So the result is more general than the Born rule, which obtains only in the former case. Von Neumann went on to show that the density operator for a uniform ensemble is just the projection operator onto the ray in Hilbert space corresponding to the state of all systems in this ensemble. However, he found it unsatisfactory to characterize the state of a physical system by specifying a ray in Hilbert space. “Our knowledge of a system,” von Neumann (1927b, p. 260) wrote, “is never described by the specification of a state … but, as a rule, by the results of experiments performed on the system.” In this spirit, he considered the simultaneous measurement of a maximal set of commuting operators and constructed the density operator for an ensemble where what is known is that the corresponding quantities have values in certain intervals. He showed that such measurements can fully determine the state and that the density operator in that case is once again the projection operator onto the corresponding ray in Hilbert space.
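The difference between a uniform ensemble and a mixture can be made concrete in a two-level toy model. The sketch below assumes a diagonal observable `sigma_z` with eigenvalues $\pm 1$ and a 25/75 mixture of its two eigenstates; these choices, like the helper names, are illustrative assumptions and are not taken from von Neumann’s paper.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def projector(v):
    """Matrix of |v><v| for a normalized vector v."""
    return [[vi * vj.conjugate() for vj in v] for vi in v]

def expectation(rho, A):
    """Expectation value Tr(rho A)."""
    return trace(matmul(rho, A))

def is_projector(rho, tol=1e-12):
    """True if rho squared equals rho, as for a uniform (pure) ensemble."""
    sq = matmul(rho, rho)
    return all(abs(sq[i][j] - rho[i][j]) < tol
               for i in range(len(rho)) for j in range(len(rho)))

sigma_z = [[1, 0], [0, -1]]            # observable with eigenvalues +1 and -1

t = 0.6
rho_pure = projector([math.cos(t), math.sin(t)])       # uniform ensemble
up, down = projector([1, 0]), projector([0, 1])
rho_mixed = [[0.25 * up[i][j] + 0.75 * down[i][j] for j in range(2)]
             for i in range(2)]                        # mixture

print(expectation(rho_pure, sigma_z), expectation(rho_mixed, sigma_z))
print(is_projector(rho_pure), is_projector(rho_mixed))
```

For the uniform ensemble the trace formula reproduces the Born-rule value $\cos^2 t - \sin^2 t$, and the density matrix is a projector. For the mixture the same formula gives the weighted average of the two eigenvalues, but the density matrix is no longer a projector.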

Von Neumann thus arrived at the typical quantum-mechanical way of conceiving of a physical problem nowadays, which is very different from the classical way to which Jordan was still wedded in Neue Begründung. In classical mechanics, as well as in Jordan’s version of transformation theory, the full description of a physical system requires the specification of a complete set of $p$’s and $q$’s. In quantum mechanics, as was first made clear in von Neumann’s Wahrscheinlichkeitstheoretischer Aufbau, it requires the specification of the eigenvalues of all the operators in a maximal set of commuting operators for the system. In other words, the ‘never mind your $p$’s and $q$’s’ part of the title of our paper carried the day.

### 1.4 Outline of our paper

In the balance of this paper we cover the contributions of Jordan and von Neumann (initially with Hilbert and Nordheim) to the developments sketched above in greater detail. We give largely self-contained reconstructions of the central arguments and derivations in five key papers written in Göttingen and, in one case (Neue Begründung II), in Copenhagen over the span of just one year, from late 1926 to late 1927 (Jordan, 1927b, g; Hilbert, von Neumann, and Nordheim, 1928; von Neumann, 1927a, b). To make the arguments and derivations in these papers easier to follow for a modern reader, we translate them all into the kind of modern notation introduced in Sections 1.2 and 1.3. To make it easier for the reader to check our claims against the primary sources, we provide detailed references to the latter, including where necessary equation numbers and legends for the original notation. We will not cover Dirac (cf. note 6), although we will occasionally refer to his work, both his original paper on transformation theory (Dirac, 1927) and the book based on it (Dirac, 1930). We will also freely avail ourselves of his bra and ket notation. Our focus, however, will be on Jordan and von Neumann.

We begin, in Section 2, with Neue Begründung I (Jordan, 1927b). In this paper Jordan only dealt with quantities with completely continuous spectra, suggesting that the generalization to ones with partly or wholly discrete spectra would be straightforward (Jordan, 1927b, p. 811, p. 816). We cover Jordan’s postulates for his probability amplitudes (Section 2.1) and his construction of a realization of these postulates, especially his use of canonical transformations between pairs of conjugate variables to derive the differential equations for these amplitudes (Section 2.3). In Section 2.2, drawing on an earlier paper (Duncan and Janssen, 2009), we remind the reader of the role of canonical transformations in matrix mechanics. In Section 2.4, we take a closer look at Jordan’s notion of a supplementary amplitude [Ergänzungsamplitude].

In Section 3, we discuss the paper by Hilbert, von Neumann, and Nordheim (1928), submitted in April 1927, that grew out of the exposition of Jordan’s approach in Hilbert’s 1926/1927 course on quantum mechanics (Sauer and Majer, 2009, pp. 698–706). Hilbert and his co-authors had the advantage of having read the paper in which Dirac (1927) presented his version of transformation theory. Jordan only read Dirac’s paper when he was correcting the page proofs of Neue Begründung I (Jordan, 1927b, p. 809; note added in proof).

In Section 4, we consider Neue Begründung II (Jordan, 1927g), received by Zeitschrift für Physik in early June 1927 and written in part in response to criticism of Neue Begründung I by Hilbert, von Neumann, and Nordheim (1928) and by von Neumann (1927a) in Mathematische Begründung. Since von Neumann introduced an entirely new approach, we deviate slightly from the chronological order of these papers, and discuss Mathematische Begründung after Neue Begründung II. In the abstract of the latter, Jordan (1927g, p. 1) promised “a simplified and generalized presentation of the theory developed in [Neue Begründung] I.” Drawing on Dirac (1927), Jordan simplified his notation somewhat, although he also added some new and redundant elements to it. Most importantly, however, the crucial generalization to quantities with partly or wholly discrete spectra turned out to be far more problematic than he had suggested in Neue Begründung I. Rather than covering Neue Begründung II in detail, we highlight the problems Jordan ran into, especially in his attempt to deal with spin in his new formalism.

In Sections 5 and 6, we turn to the first two papers of von Neumann’s trilogy on quantum mechanics of 1927. In Section 5, on Mathematische Begründung (von Neumann, 1927a), we focus on von Neumann’s criticism of the Dirac-Jordan transformation theory, his proof of the equivalence of wave mechanics and matrix mechanics based on the isomorphism between $l^2$ and $L^2$, and his derivation of the trace formula for probabilities in quantum mechanics. We do not cover the introduction of his Hilbert space formalism, which takes up a large portion of his paper. This material is covered in any number of modern books on functional analysis.58 In Section 6, on Wahrscheinlichkeitstheoretischer Aufbau (von Neumann, 1927b), we likewise focus on the overall argument of the paper, covering the derivation of the trace formula from some basic assumptions about the expectation value of observables in an ensemble of identical systems, the introduction of density operators, and the specification of pure states through the values of a maximal set of commuting operators.

In Section 7, we summarize the transition from Jordan’s quantum-mechanical formalism rooted in classical mechanics (mind your $p$’s and $q$’s) to von Neumann’s quantum-mechanical formalism which no longer depends on classical mechanics for its formulation (never mind your $p$’s and $q$’s).

As a coda to our story, we draw attention to the reemergence of the canonical formalism, with its generalized coordinates and conjugate momenta, even for spin-$\frac{1}{2}$ particles, in quantum field theory.

## 2 Jordan’s Neue Begründung I (December 1926)

Neue Begründung I was received by Zeitschrift für Physik on December 18, 1926 and published January 18, 1927 (Jordan, 1927b). It consists of two parts. In Part One (I. Teil), consisting of secs. 1–2 (pp. 809–816), Jordan laid down the postulates of his theory. In Part Two (II. Teil), consisting of secs. 3–7 (pp. 816–838), he presented the formalism realizing these postulates. In the abstract of the paper, Jordan announced that his new theory would unify all earlier formulations of quantum theory:

The four forms of quantum mechanics that have been developed so far (matrix theory, the theory of Born and Wiener, wave mechanics, and $q$-number theory) are contained in a more general formal theory. Following one of Pauli’s ideas, one can base this new theory on a few simple fundamental postulates (Grundpostulate) of a statistical nature (Jordan, 1927b, p. 809).

As we mentioned in Section 1.2, Jordan claimed that he could recover both the time-dependent and the time-independent Schrödinger equation as special cases of the differential equations he derived for the probability amplitudes central to his formalism. This is the basis for his claim that wave mechanics can be subsumed under his new formalism. Nowhere in the paper did he show explicitly how matrix mechanics is to be subsumed under the new formalism. Perhaps Jordan felt that this did not require a special argument as the new formalism had grown naturally out of matrix mechanics and his own contributions to it (Jordan, 1926a, b). However, as emphasized repeatedly already, Jordan (1927b) restricted himself to quantities with purely continuous spectra in Neue Begründung I, so the formalism as it stands is not applicable to matrix mechanics. Like Dirac’s (1927) own version of statistical transformation theory, Jordan’s version can be seen as a natural extension of Dirac’s (1925) $q$-number theory. It is only toward the end of his paper (sec. 6) that Jordan turned to the operator theory of Born and Wiener (1926). In our discussion of Neue Begründung I, we omit this section along with some mathematically intricate parts of secs. 3 and 5 that are not necessary for understanding the paper’s overall argument. We do not cover the concluding sec. 7 of Jordan’s paper either, which deals with quantum jumps (recall his earlier paper on this topic [Jordan, 1927a], which we briefly discussed in Section 1.1).

Although we will not cover Jordan’s unification of the various forms of quantum theory in any detail, we will cover (in Section 5) von Neumann’s criticism of the Dirac-Jordan way of proving the equivalence of matrix mechanics and wave mechanics as a prelude to his own proof based on the isomorphism of $l^2$ and $L^2$ (von Neumann, 1927a). In our discussion of Neue Begründung I in this section, we focus on the portion of Jordan’s paper that corresponds to the last sentence of the abstract, which promises a statistical foundation of quantum mechanics. Laying this foundation actually takes up most of the paper (secs. 1–2, 4–5).

### 2.1 Jordan’s postulates for probability amplitudes

The central quantities in Neue Begründung I are generalizations of Schrödinger energy eigenfunctions, which Jordan called “probability amplitudes.” He attributed both the generalization and the term to Pauli. Jordan referred to a footnote in a forthcoming paper by Pauli (1927a, p. 83, note) proposing, in Jordan’s terms, the following interpretation of the energy eigenfunctions $\varphi_n(q)$ (where $n$ labels the different energy eigenvalues) of a system (in one dimension): “If $\varphi_n(q)$ is normalized, then $|\varphi_n(q)|^2\,dq$ gives the probability that, if the system is in the state $n$, the coordinate [$\hat{q}$] has a value between $q$ and $q+dq$” (Jordan, 1927b, p. 811). A probability amplitude such as this one for position and energy can be introduced for any two quantities.

In Neue Begründung I, Jordan focused on quantities with completely continuous spectra. He only tried to extend his approach, with severely limited success, to partly or wholly discrete spectra in Neue Begründung II (see Section 4). For two quantities $\hat{q}$ and $\hat{\beta}$ that can take on a continuous range of values $q$ and $\beta$, respectively,59 there is a complex probability amplitude $\varphi(q,\beta)$ such that $|\varphi(q,\beta)|^2\,dq$ gives the probability that $\hat{q}$ has a value between $q$ and $q+dq$ given that $\hat{\beta}$ has the value $\beta$.

In modern Dirac notation, $\varphi(q,\beta)$ would be written as $\langle q|\beta\rangle$ (cf. our discussion in Section 1.2). Upon translation into this modern notation, many of Jordan’s expressions turn into instantly recognizable expressions in modern quantum mechanics, and we will frequently provide such translations to make Jordan’s text easier to read. We must be careful, however, not to read too much into it. First of all, von Neumann had not yet introduced the abstract notion of Hilbert space when Jordan and Dirac published their theories in early 1927, so neither one thought of probability amplitudes as ‘inner products’ of ‘vectors’ in Hilbert space at the time. More importantly, for quantities $\hat{q}$ and $\hat{\beta}$ with purely continuous spectra (e.g., position or momentum of a particle in an infinitely extended region), the ‘vectors’ $|q\rangle$ and $|\beta\rangle$ are not elements of Hilbert space, although an inner product $\langle q|\beta\rangle$ can be defined in a generalized sense (as a distribution) as an integral of products of continuum-normalized wave functions, as is routinely done in elementary quantum mechanics. Continuum eigenstates can nevertheless be treated as though they were states in a linear space, satisfying completeness and orthogonality relations that are continuum analogs of the discrete ones holding rigorously in a Hilbert space. As we will see later, this is just the content of von Neumann’s spectral theorem for self-adjoint operators with a (partly or wholly) continuous spectrum.

In the introductory section of Neue Begründung I, Jordan (1927b, p. 811) listed two postulates, labeled I and II. Only two pages later, in sec. 2, entitled “Statistical foundation of quantum mechanics,” these two postulates are superseded by a new set of four postulates, labeled A through D.60 In Neue Begründung II, Jordan (1927g, p. 6) presented yet another set of postulates, three this time, labeled I through III (see Section 4).61 The exposition of Jordan’s theory by Hilbert, von Neumann, and Nordheim (1928), written in between Neue Begründung I and II, starts from six “physical axioms” (pp. 4–5), labeled I through VI (see Section 3). We will start from Jordan’s four postulates of Neue Begründung I, which we paraphrase and comment on below, staying close to Jordan’s own text but using the notation introduced above to distinguish between quantities and their numerical values.

Postulate A. For two mechanical quantities $\hat{q}$ and $\hat{\beta}$ that stand in a definite kinematical relation to one another there are two complex-valued functions, $\varphi(q,\beta)$ and $\psi(q,\beta)$, such that $\varphi(q,\beta)\,\psi^*(q,\beta)\,dq$ gives the probability of finding a value between $q$ and $q+dq$ for $\hat{q}$ given that $\hat{\beta}$ has the value $\beta$. The function $\varphi(q,\beta)$ is called the probability amplitude; the function $\psi(q,\beta)$ is called the “supplementary amplitude” (Ergänzungsamplitude).

Comments: As becomes clear later on in the paper, “mechanical quantities that stand in a definite kinematical relation to one another” are quantities that can be written as functions of some set of generalized coordinates and their conjugate momenta. In his original postulate I, Jordan (1927b, p. 162) wrote that “ is independent of the mechanical nature (the Hamiltonian) of the system and is determined only by the kinematical relation between and ” (hats added). Hilbert et al. made this into a separate postulate, their axiom V: “A further physical requirement is that the probabilities only depend on the functional nature of the quantities and , i.e., on their kinematical connection [Verknüpfung], and not for instance on additional special properties of the mechanical system under consideration, such as, for example, its Hamiltonian” (Hilbert, von Neumann, and Nordheim, 1928, p. 5). With , the statement about the kinematical nature of probability amplitudes translates into the observation that they depend only on the inner-product structure of Hilbert space and not on the Hamiltonian governing the time evolution of the system under consideration.62

It turns out that for all quantities represented, in modern terms, by Hermitian operators, the amplitudes $\psi(q,\beta)$ and $\varphi(q,\beta)$ are equal to one another. At this point, however, Jordan wanted to leave room for quantities represented by non-Hermitian operators. This is directly related to the central role of canonical transformations in his formalism. As Jordan (1926a,b) had found in a pair of papers published in 1926, canonical transformations need not be unitary and therefore do not always preserve the Hermiticity of the conjugate variables one starts from (Duncan and Janssen, 2009). The Ergänzungsamplitude does not appear in the presentation of Jordan’s formalism by Hilbert, von Neumann, and Nordheim (1928).63 In Neue Begründung II, Jordan (1927g, p. 3) restricted himself to Hermitian quantities and silently dropped the Ergänzungsamplitude. We return to the Ergänzungsamplitude in Section 2.4 below, but until then we will simply set $\psi=\varphi$ everywhere.

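Postulate B. The probability amplitude $\varphi(\beta,q)$ is the complex conjugate of the probability amplitude $\varphi(q,\beta)$. In other words, $\varphi(\beta,q)=\varphi^*(q,\beta)$. This implies a symmetry property of the probabilities themselves: the probability density $|\varphi(q,\beta)|^2$ for finding the value $q$ for $\hat{q}$ given the value $\beta$ for $\hat{\beta}$ is equal to the probability density $|\varphi(\beta,q)|^2$ for finding the value $\beta$ for $\hat{\beta}$ given the value $q$ for $\hat{q}$.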

Comment. This property is immediately obvious once we write $\varphi(q,\beta)$ as $\langle q|\beta\rangle$ with the interpretation of $\langle q|\beta\rangle$ as an ‘inner product’ in Hilbert space (but recall that one has to be cautious when dealing with quantities with continuous spectra).

Postulate C. The probabilities combine through interference. In sec. 1, Jordan (1927b, p. 812) already introduced the phrase “interference of probabilities” to capture the striking feature in his quantum formalism that the probability amplitudes rather than the probabilities themselves follow the usual composition rules for probabilities.64 Let $F_1$ and $F_2$ be two outcomes [Tatsachen] for which the amplitudes are $\varphi_1$ and $\varphi_2$. If $F_1$ and $F_2$ are mutually exclusive, $\varphi_1+\varphi_2$ is the amplitude for the outcome ‘$F_1$ or $F_2$’. If $F_1$ and $F_2$ are independent, $\varphi_1\varphi_2$ is the amplitude for the outcome ‘$F_1$ and $F_2$’.

Consequence. Let $\varphi(q,\beta)$ be the probability amplitude for the outcome $F_1$ of finding the value $q$ for $\hat{q}$ given the value $\beta$ for $\hat{\beta}$. Let $\chi(Q,q)$ be the probability amplitude for the outcome $F_2$ of finding the value $Q$ for $\hat{Q}$ given the value $q$ for $\hat{q}$. Since $F_1$ and $F_2$ are independent, Jordan’s multiplication rule tells us that the probability amplitude for ‘$F_1$ and $F_2$’ is given by the product $\chi(Q,q)\,\varphi(q,\beta)$. Now let $\Phi(Q,\beta)$ be the probability amplitude for the outcome $F_3$ of finding the value $Q$ for $\hat{Q}$ given the value $\beta$ for $\hat{\beta}$. According to Jordan’s addition rule, this amplitude is equal to the ‘sum’ of the amplitudes for ‘$F_1$ and $F_2$’ for all different values of $q$. Since $\hat{q}$ has a continuous spectrum, this ‘sum’ is actually an integral. The probability amplitude for $F_3$ is thus given by65

$$\Phi(Q,\beta)=\int\chi(Q,q)\,\varphi(q,\beta)\,dq. \qquad \text{[NB1, sec. 2, Eq. 14]} \tag{1}$$

Special case. If $\hat{Q}=\hat{\beta}$, the amplitude $\Phi(\beta',\beta'')$ becomes the Dirac delta function. Jordan (1927b, p. 814) introduced the notation $\delta_{\beta'\beta''}$ even though $\beta'$ and $\beta''$ are continuous rather than discrete variables [NB1, sec. 2, Eq. 16]. In a footnote he conceded that this is mathematically dubious. In Neue Begründung II, Jordan (1927g, p. 5) used the delta function that Dirac (1927, pp. 625–627) had meanwhile introduced in his paper on transformation theory. Here and in what follows we will give Jordan the benefit of the doubt and assume the normal properties of the delta function.66

Using that the amplitude $\varphi(\beta',q)$ is just the complex conjugate of the amplitude $\varphi(q,\beta')$, we arrive at the following expression for $\Phi(\beta',\beta'')$:

$$\Phi(\beta',\beta'')=\int\varphi^*(q,\beta')\,\varphi(q,\beta'')\,dq=\delta_{\beta'\beta''}. \qquad \text{[NB1, sec. 2, Eqs. 15, 16, 17]} \tag{2}$$

Comment. Translating Eqs. (1)–(2) above into Dirac notation, we recognize them as familiar completeness and orthogonality relations:67

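$$\langle Q|\beta\rangle=\int\langle Q|q\rangle\langle q|\beta\rangle\,dq, \qquad \langle\beta'|\beta''\rangle=\int\langle\beta'|q\rangle\langle q|\beta''\rangle\,dq=\delta(\beta'-\beta''). \tag{3}$$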

Since the eigenvectors $|q\rangle$ of the operator $\hat{q}$ are not in Hilbert space, the spectral theorem, first proven by von Neumann (1927a), is required to justify the use of the resolution of the unit operator $\hat{1}=\int dq\,|q\rangle\langle q|$.

Postulate D. For every $\hat{q}$ there is a conjugate momentum $\hat{p}$. Before stating this postulate, Jordan offered a new definition of what it means for $\hat{p}$ to be the conjugate momentum of $\hat{q}$. If the amplitude $\rho(p,q)$ of finding the value $p$ for $\hat{p}$ given the value $q$ for $\hat{q}$ is given by

$$\rho(p,q)=e^{-ipq/\hbar}, \qquad \text{[NB1, sec. 2, Eq. 18]} \tag{4}$$

then $\hat{p}$ is the conjugate momentum of $\hat{q}$.

Anticipating a special case of the uncertainty principle (cf. notes 36 and 44), Jordan (1927b, p. 814) noted that Eq. (4) implies that “[f]or a given value of $\hat{q}$ all possible values of $\hat{p}$ are equally probable.”
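Jordan's remark can be checked in one line. As a minimal sketch, assuming $p$, $q$, and $\hbar$ are all real (which covers the Hermitian case considered here), the squared modulus of the amplitude in Eq. (4) is:

```latex
\[
|\rho(p,q)|^{2} \;=\; e^{-ipq/\hbar}\,e^{+ipq/\hbar} \;=\; 1
\qquad \text{for all real } p \text{ and } q .
\]
```

The probability density is thus independent of $p$ for any fixed $q$. Read literally, such a flat density over the whole real line cannot be normalized, which is one symptom of the continuous-spectrum difficulties discussed in the text.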

For $\hat{p}$’s and $\hat{q}$’s with completely continuous spectra, Jordan’s definition of when $\hat{p}$ is conjugate to $\hat{q}$ is equivalent to the standard one that the operators $\hat{p}$ and $\hat{q}$ satisfy the commutation relation $[\hat{p},\hat{q}]=\hbar/i$. This equivalence, however, presupposes the usual association of the differential operator $(\hbar/i)\,\partial/\partial q$ and ‘multiplication by $q$’ with the quantities $\hat{p}$ and $\hat{q}$, respectively. As we emphasized in Section 1.2, Jordan did not think of these quantities as operators acting in an abstract Hilbert space, but he did associate them (as well as any other quantity obtained through adding and multiplying $\hat{p}$’s and $\hat{q}$’s) with the differential operators $(\hbar/i)\,\partial/\partial q$ and $q$ (and combinations of them). The manipulations in Eqs. (19ab)–(24) of Neue Begründung I, presented under the subheading “Consequences” (Folgerungen) immediately following postulate D, are meant to show that this association follows from his postulates (Jordan, 1927b, pp. 814–815). Using modern notation, we reconstruct Jordan’s rather convoluted argument. As we will see, the argument as it stands does not work, but a slightly amended version of it does.
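Before turning to that argument, the standard association can be checked directly against the commutation relation. The following is a minimal sketch in modern notation, where $f(q)$ is an arbitrary differentiable test function (our notation, not Jordan's), $\hat{p}$ acts as $(\hbar/i)\,\partial/\partial q$, and $\hat{q}$ acts as multiplication by $q$:

```latex
\[
[\hat{p},\hat{q}]\,f(q)
 \;=\; \frac{\hbar}{i}\frac{\partial}{\partial q}\bigl(q\,f(q)\bigr)
 \;-\; q\,\frac{\hbar}{i}\frac{\partial f(q)}{\partial q}
 \;=\; \frac{\hbar}{i}\,f(q) .
\]
```

Since $f$ is arbitrary, this gives $[\hat{p},\hat{q}]=\hbar/i$, the standard relation referred to above.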

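The probability amplitude $\langle p|q\rangle=e^{-ipq/\hbar}$, Jordan’s $\rho(p,q)$, trivially satisfies the following pair of equations:

$$\left(p+\frac{\hbar}{i}\frac{\partial}{\partial q}\right)\langle p|q\rangle=0, \qquad \text{[NB1, sec. 2, Eq. 19a]} \tag{5}$$
$$\left(\frac{\hbar}{i}\frac{\partial}{\partial p}+q\right)\langle p|q\rangle=0. \qquad \text{[NB1, sec. 2, Eq. 19b]} \tag{6}$$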

Unless we explicitly say otherwise, expressions such as $\langle p|q\rangle$ are to be interpreted as our notation for Jordan’s probability amplitudes and not as inner products of vectors $|p\rangle$ and $|q\rangle$ in Hilbert space.
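The word “trivially” can be unpacked by differentiating the exponential of Eq. (4). This is a short check in our own notation, not a step that Jordan wrote out:

```latex
\begin{align*}
\Bigl(p+\frac{\hbar}{i}\frac{\partial}{\partial q}\Bigr)e^{-ipq/\hbar}
  &= \Bigl(p+\frac{\hbar}{i}\cdot\frac{-ip}{\hbar}\Bigr)e^{-ipq/\hbar}
   = (p-p)\,e^{-ipq/\hbar} = 0 , \\
\Bigl(\frac{\hbar}{i}\frac{\partial}{\partial p}+q\Bigr)e^{-ipq/\hbar}
  &= \Bigl(\frac{\hbar}{i}\cdot\frac{-iq}{\hbar}+q\Bigr)e^{-ipq/\hbar}
   = (-q+q)\,e^{-ipq/\hbar} = 0 .
\end{align*}
```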

Following Jordan (NB1, sec. 2, Eqs. 20–22), we define the map $T$, which takes functions $f$ of $p$ and turns them into functions $Tf$ of $Q$ (the value of a new quantity $\hat{Q}$ with a fully continuous spectrum):

$$T: f(p)\;\rightarrow\;(Tf)(Q)\equiv\int\langle Q|p\rangle f(p)\,dp. \tag{7}$$

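In other words, $T\ldots=\int dp\,\langle Q|p\rangle\ldots$ [NB1, Eq. 21]. For the special case that $f(p)=\langle p|q\rangle$, we get:

$$(T\langle p|q\rangle)(Q)=\int\langle Q|p\rangle\langle p|q\rangle\,dp=\langle Q|q\rangle, \tag{8}$$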

where we used completeness, one of the consequences of Jordan’s postulate C (cf. Eqs. (1)–(3)). In other words, $T$ maps $\langle p|q\rangle$ onto $\langle Q|q\rangle$:68

$$\langle Q|q\rangle=T\langle p|q\rangle. \qquad \text{[NB1, sec. 2, Eq. 22]} \tag{9}$$

Likewise, we define the inverse map $T^{-1}$, which takes functions $F$ of $Q$ and turns them into functions $T^{-1}F$ of $p$:69

$$T^{-1}: F(Q)\;\rightarrow\;(T^{-1}F)(p)\equiv\int\langle p|Q\rangle F(Q)\,dQ. \tag{10}$$

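In other words, $T^{-1}\ldots=\int dQ\,\langle p|Q\rangle\ldots$.70 For the special case that $F(Q)=\langle Q|q\rangle$, we get (again, by completeness):

$$(T^{-1}\langle Q|q\rangle)(p)=\int\langle p|Q\rangle\langle Q|q\rangle\,dQ=\langle p|q\rangle, \tag{11}$$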

or, more succinctly,

$$\langle p|q\rangle=T^{-1}\langle Q|q\rangle. \tag{12}$$

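Applying $T$ to the left-hand side of Eq. (5) [NB1, Eq. 19a], we find:

$$T\left(\left(p+\frac{\hbar}{i}\frac{\partial}{\partial q}\right)\langle p|q\rangle\right)=Tp\,\langle p|q\rangle+\frac{\hbar}{i}\frac{\partial}{\partial q}T\langle p|q\rangle=0, \tag{13}$$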

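where we used that differentiation with respect to $q$ commutes with applying $T$ (which only affects the functional dependence on $p$). Using that $\langle p|q\rangle=T^{-1}\langle Q|q\rangle$ (Eq. (12)) and $T\langle p|q\rangle=\langle Q|q\rangle$ (Eq. (9)), we can rewrite Eq. (13) as:71

$$\left(TpT^{-1}+\frac{\hbar}{i}\frac{\partial}{\partial q}\right)\langle Q|q\rangle=0. \qquad \text{[NB1, sec. 2, Eq. 23a]} \tag{14}$$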

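Similarly, applying $T$ to the left-hand side of Eq. (6) [NB1, Eq. 19b], we find:

$$T\left(\left(\frac{\hbar}{i}\frac{\partial}{\partial p}+q\right)\langle p|q\rangle\right)=T\frac{\hbar}{i}\frac{\partial}{\partial p}\langle p|q\rangle+q\,T\langle p|q\rangle=0, \tag{15}$$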

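where we used that multiplication by $q$ commutes with applying $T$. Once again using that $\langle p|q\rangle=T^{-1}\langle Q|q\rangle$ and $T\langle p|q\rangle=\langle Q|q\rangle$, we can rewrite this as:72

$$\left(T\frac{\hbar}{i}\frac{\partial}{\partial p}T^{-1}+q\right)\langle Q|q\rangle=0. \qquad \text{[NB1, sec. 2, Eq. 23b]} \tag{16}$$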

Eqs. (14) and (16) [NB1, Eqs. 23ab] gave Jordan a representation of the quantities $\hat{p}$ and $\hat{q}$ in the $Q$-basis. The identification of $\hat{p}$ in the $Q$-basis is straightforward. The quantity $p$ in Eq. (5) [NB1, Eq. 19a] turns into the quantity $TpT^{-1}$ in Eq. (14) [NB1, Eq. 23a]. This is just what Jordan had come to expect on the basis of his earlier use of canonical transformations (see Section 2.2 below). The identification of $\hat{q}$ in the $Q$-basis is a little trickier. Eq. (6) [NB1, Eq. 19b] told Jordan that the position operator in the original $p$-basis is $-(\hbar/i)\,\partial/\partial p$ (note the minus sign). This quantity turns into $-T(\hbar/i)(\partial/\partial p)T^{-1}$ in Eq. (16) [NB1, Eq. 23b]. This then should be the representation of $\hat{q}$ in the new $Q$-basis, as Jordan stated right below this last equation: “With respect to (in Bezug auf) the fixed chosen quantity [$\hat{Q}$] every other quantity [$\hat{q}$] corresponds to an operator [$-T(\hbar/i)(\partial/\partial p)T^{-1}$]” (Jordan, 1927b, p. 815).73

With these representations of his quantum-mechanical quantities $\hat{p}$ and $\hat{q}$, Jordan could now define their addition and multiplication through the corresponding addition and multiplication of the differential operators representing these quantities.

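Jordan’s next step was to work out what the differential operators $TpT^{-1}$ and $-T(\hbar/i)(\partial/\partial p)T^{-1}$, representing $\hat{p}$ and $\hat{q}$ in the $Q$-basis, are in the special case that $\hat{Q}=\hat{q}$. In that case, Eqs. (14) and (16) [NB1, Eqs. 23ab] turn into:

$$\left(TpT^{-1}+\frac{\hbar}{i}\frac{\partial}{\partial q}\right)\langle q'|q\rangle=0, \tag{17}$$
$$\left(T\frac{\hbar}{i}\frac{\partial}{\partial p}T^{-1}+q\right)\langle q'|q\rangle=0. \tag{18}$$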

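On the other hand, $\langle q'|q\rangle=\delta(q'-q)$. So $\langle q'|q\rangle$ trivially satisfies:

$$\left(\frac{\hbar}{i}\frac{\partial}{\partial q'}+\frac{\hbar}{i}\frac{\partial}{\partial q}\right)\langle q'|q\rangle=0, \qquad \text{[NB1, sec. 2, Eq. 24a]} \tag{19}$$
$$(-q'+q)\,\langle q'|q\rangle=0. \qquad \text{[NB1, sec. 2, Eq. 24b]} \tag{20}$$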

Comparing Eqs. (19)–(20) with Eqs. (17)–(18), we arrive at

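$$TpT^{-1}\langle q'|q\rangle=\frac{\hbar}{i}\frac{\partial}{\partial q'}\langle q'|q\rangle, \tag{21}$$
$$-T\frac{\hbar}{i}\frac{\partial}{\partial p}T^{-1}\langle q'|q\rangle=q'\,\langle q'|q\rangle. \tag{22}$$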
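The comparison behind Eqs. (21)–(22) can be spelled out term by term. The following is our own bookkeeping of the step, using only Eqs. (17)–(20):

```latex
\begin{align*}
TpT^{-1}\langle q'|q\rangle
  &= -\frac{\hbar}{i}\frac{\partial}{\partial q}\langle q'|q\rangle
  && \text{by Eq. (17)} \\
  &= \frac{\hbar}{i}\frac{\partial}{\partial q'}\langle q'|q\rangle
  && \text{by Eq. (19)}, \\[4pt]
-T\frac{\hbar}{i}\frac{\partial}{\partial p}T^{-1}\langle q'|q\rangle
  &= q\,\langle q'|q\rangle
  && \text{by Eq. (18)} \\
  &= q'\,\langle q'|q\rangle
  && \text{by Eq. (20)}.
\end{align*}
```

Both chains hold only for the particular function $\langle q'|q\rangle$, which is the limitation taken up later in this section.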

Eq. (21) suggests that $TpT^{-1}$, the momentum $\hat{p}$ in the $q$-basis acting on the $q'$ variable, is just $(\hbar/i)\,\partial/\partial q'$. Likewise, Eq. (22) suggests that $-T(\hbar/i)(\partial/\partial p)T^{-1}$, the position $\hat{q}$ in the $q$-basis acting on the $q'$ variable, is just multiplication by $q'$. As Jordan put it in a passage that is hard to follow because of his confusing notation:

Therefore, as a consequence of (24) [our Eqs. (19)–(20)], the operator [multiplying by in our notation] is assigned (zugeordnet) to the quantity [Grösse] itself [ in our notation]. One sees furthermore that the operator [ in our notation] corresponds to the momentum [] belonging to [] (Jordan, 1927b, p. 815).74

It is by this circuitous route that Jordan arrived at the usual functional interpretation of coordinate and momentum operators in the Schrödinger formalism. Jordan (1927b, pp. 815–816) emphasized that the association of $\hat{p}$ and $\hat{q}$ with $(\hbar/i)\,\partial/\partial q$ and $q$ can easily be generalized. Any quantity (Grösse) obtained through multiplication and addition of $\hat{q}$ and $\hat{p}$ is associated with the corresponding combination of the differential operators $q$ and $(\hbar/i)\,\partial/\partial q$.

Jordan’s argument as it stands fails. We cannot conclude that two operations are identical from noting that they give the same result when applied to one special case, here the delta function (cf. Eqs. (21)–(22)). We need to show that they give identical results when applied to an arbitrary function. We can easily remedy this flaw in Jordan’s argument, using only the kind of manipulations he himself used at this point (though we will do so in modern notation). We contrast this proof in the spirit of Jordan with a modern proof showing that Eqs. (14) and (16) imply that $\hat{p}$ and $\hat{q}$, now understood in the spirit of von Neumann as operators acting in an abstract Hilbert space, are represented by $(\hbar/i)\,\partial/\partial q$ and $q$, respectively, in the $q$-basis. The inputs for the proof à la Jordan are his postulates and the identification of the differential operators representing momentum and position in the $Q$-basis as $TpT^{-1}$ and $-T(\hbar/i)(\partial/\partial p)T^{-1}$, respectively (cf. our comments following Eq. (16)). The inputs for the proof à la von Neumann are the inner-product structure of Hilbert space and the spectral decomposition of the operator $\hat{p}$. Of course, von Neumann (1927a) only introduced these elements after Jordan’s Neue Begründung I.

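Closely following Jordan’s approach, we can show that Eqs. (14) and (16) [NB1, Eqs. 23ab] imply that, for arbitrary functions $F(Q)$, if $\hat{Q}$ is set equal to $\hat{q}$,

$$(TpT^{-1}F)(q)=\frac{\hbar}{i}\frac{\partial}{\partial q}F(q), \tag{23}$$
$$\left(-T\frac{\hbar}{i}\frac{\partial}{\partial p}T^{-1}F\right)(q)=q\,F(q). \tag{24}$$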

Since $F$ is an arbitrary function, the problem we noted with Eqs. (21)–(22) is solved. Jordan’s identification of the differential operators representing momentum and position in the $q$-basis does follow from Eqs. (23)–(24).

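To derive Eq. (23), we apply $T$, defined in Eq. (7), to $p\,(T^{-1}F)(p)$. We then use the definition of $T^{-1}$ in Eq. (10) to write $(TpT^{-1}F)(Q)$ as:

$$\begin{aligned}
(TpT^{-1}F)(Q) &= \int\langle Q|p\rangle\,p\,(T^{-1}F)(p)\,dp \\
&= \int\langle Q|p\rangle\,p\left[\int\langle p|Q'\rangle F(Q')\,dQ'\right]dp \\
&= \iint\langle Q|p\rangle\,p\,\langle p|Q'\rangle F(Q')\,dp\,dQ'.
\end{aligned} \tag{25}$$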

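We now set $\hat{Q}=\hat{q}$, use Eq. (5) to substitute $-(\hbar/i)\,\partial/\partial q'\,\langle p|q'\rangle$ for $p\,\langle p|q'\rangle$, and perform a partial integration:

$$\begin{aligned}
(TpT^{-1}F)(q) &= \iint\langle q|p\rangle\,p\,\langle p|q'\rangle F(q')\,dp\,dq' \\
&= \iint\langle q|p\rangle\left(-\frac{\hbar}{i}\frac{\partial}{\partial q'}\langle p|q'\rangle\right)F(q')\,dp\,dq' \\
&= \iint\langle q|p\rangle\langle p|q'\rangle\,\frac{\hbar}{i}\frac{dF(q')}{dq'}\,dp\,dq'.
\end{aligned} \tag{26}$$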

On account of completeness and orthogonality (see Eq. (3) [NB1, Eqs. 14–17]), the right-hand side reduces to $(\hbar/i)\,dF(q)/dq$. This concludes the proof of Eq. (23).
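The two steps that carry this derivation, the partial integration and the collapse of the double integral, can be written out as follows. This is a sketch under the assumption (ours, not stated in the text) that $F(q')$ vanishes as $q'\to\pm\infty$, so that the boundary term drops out:

```latex
\begin{align*}
\int\Bigl(-\frac{\hbar}{i}\frac{\partial}{\partial q'}\langle p|q'\rangle\Bigr)F(q')\,dq'
  &= -\frac{\hbar}{i}\Bigl[\langle p|q'\rangle F(q')\Bigr]_{q'=-\infty}^{q'=+\infty}
     + \int\langle p|q'\rangle\,\frac{\hbar}{i}\frac{dF(q')}{dq'}\,dq' , \\
\iint\langle q|p\rangle\langle p|q'\rangle\,\frac{\hbar}{i}\frac{dF(q')}{dq'}\,dp\,dq'
  &= \int\delta(q-q')\,\frac{\hbar}{i}\frac{dF(q')}{dq'}\,dq'
   = \frac{\hbar}{i}\frac{dF(q)}{dq} .
\end{align*}
```

The second line uses the completeness and orthogonality relations of Eq. (3) in the form $\int\langle q|p\rangle\langle p|q'\rangle\,dp=\delta(q-q')$.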

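To derive Eq. (24), we similarly apply $T$ to $-(\hbar/i)\,\partial/\partial p\,(T^{-1}F)(p)$:

$$\begin{aligned}
\left(-T\frac{\hbar}{i}\frac{\partial}{\partial p}T^{-1}F\right)(Q) &= -\int\langle Q|p\rangle\,\frac{\hbar}{i}\frac{\partial}{\partial p}(T^{-1}F)(p)\,dp \\
&= -\int\langle Q|p\rangle\,\frac{\hbar}{i}\frac{\partial}{\partial p}\left[\int\langle p|Q'\rangle F(Q')\,dQ'\right]dp \\
&= -\iint\langle Q|p\rangle\,\frac{\hbar}{i}\frac{\partial}{\partial p}\langle p|Q'\rangle\,F(Q')\,dp\,dQ'.
\end{aligned} \tag{27}$$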

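We now set $\hat{Q}=\hat{q}$ and use Eq. (6) to substitute $-q'\,\langle p|q'\rangle$ for $(\hbar/i)\,\partial/\partial p\,\langle p|q'\rangle$:

$$\left(-T\frac{\hbar}{i}\frac{\partial}{\partial p}T^{-1}F\right)(q)=\iint\langle q|p\rangle\,q'\,\langle p|q'\rangle F(q')\,dp\,dq'=q\,F(q), \tag{28}$$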

where in the last step we once again used completeness and orthogonality. This concludes the proof of Eq. (24).
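Eqs. (23) and (24) can also be checked numerically on a grid. The sketch below is only an illustration and is not part of Jordan's or our argument. It assumes units with $\hbar=1$, a Gaussian test function $F(q)=e^{-q^2/2}$, and a normalization factor $1/\sqrt{2\pi\hbar}$ in $\langle p|q\rangle$ so that completeness yields a delta function with unit weight. All function names (`amp_pq`, `t_map`, `t_inverse`, `max_errors`) are hypothetical, and the integrals of Eqs. (7) and (10) are replaced by Riemann sums.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def grid(half_width, n):
    """n equally spaced points on [-half_width, half_width] plus the spacing."""
    step = 2 * half_width / (n - 1)
    return [-half_width + k * step for k in range(n)], step


def amp_pq(p, q):
    """<p|q> = exp(-i p q / hbar), with an assumed 1/sqrt(2 pi hbar) factor."""
    return cmath.exp(-1j * p * q / HBAR) / math.sqrt(2 * math.pi * HBAR)


def t_inverse(F_vals, qs, dq, ps):
    """(T^-1 F)(p) = int <p|q'> F(q') dq' as a Riemann sum (Eq. (10), Q = q)."""
    return [sum(amp_pq(p, q) * F for q, F in zip(qs, F_vals)) * dq for p in ps]


def t_map(f_vals, ps, dp, qs):
    """(T f)(q) = int <q|p> f(p) dp, using <q|p> = conj(<p|q>) (Eq. (7), Q = q)."""
    return [sum(amp_pq(p, q).conjugate() * f for p, f in zip(ps, f_vals)) * dp
            for q in qs]


def central_diff(vals, step):
    """Central finite difference; one-sided at the two end points."""
    n = len(vals)
    out = [(vals[1] - vals[0]) / step]
    out += [(vals[k + 1] - vals[k - 1]) / (2 * step) for k in range(1, n - 1)]
    out.append((vals[n - 1] - vals[n - 2]) / step)
    return out


def max_errors(n=161, half_width=8.0):
    """Largest deviations from Eqs. (23) and (24) for F(q) = exp(-q^2 / 2)."""
    qs, dq = grid(half_width, n)
    ps, dp = grid(half_width, n)
    F = [math.exp(-q * q / 2) for q in qs]
    dF = [-q * f for q, f in zip(qs, F)]      # exact derivative of F
    g = t_inverse(F, qs, dq, ps)              # (T^-1 F)(p)

    # Eq. (23): T p T^-1 F should equal (hbar/i) dF/dq.
    lhs23 = t_map([p * gp for p, gp in zip(ps, g)], ps, dp, qs)
    err23 = max(abs(a - (HBAR / 1j) * d) for a, d in zip(lhs23, dF))

    # Eq. (24): -T (hbar/i) d/dp T^-1 F should equal q F(q).
    dg = central_diff(g, dp)                  # finite-difference d/dp
    lhs24 = t_map([-(HBAR / 1j) * d for d in dg], ps, dp, qs)
    err24 = max(abs(a - q * f) for a, q, f in zip(lhs24, qs, F))
    return err23, err24
```

Calling `max_errors()` returns the largest pointwise deviation for each equation. The deviation for Eq. (24) is expected to be noticeably larger than that for Eq. (23), because the derivative with respect to $p$ is approximated by finite differences rather than taken exactly.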

We now turn to the modern proofs. It is trivial to show that the representation of the position operator $\hat{q}$ in the $q$-basis is simply multiplication by the eigenvalues $q$. Consider an arbitrary eigenstate $|q\rangle$ of position with eigenvalue $q$, i.e., $\hat{q}\,|q\rangle=q\,|q\rangle$. It follows that $\langle Q|\hat{q}|q\rangle=q\,\langle Q|q\rangle$, where $|Q\rangle$ is an arbitrary eigenvector of an arbitrary Hermitian operator $\hat{Q}$ with eigenvalue $Q$. The complex conjugate of this last relation,

$$\langle q|\hat{q}|Q\rangle=q\,\langle q|Q\rangle, \tag{29}$$

is just the result we wanted to prove.

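It takes a little more work to show that Eq. (14) [NB1, Eq. 23a] implies that the representation of the momentum operator $\hat{p}$ in the $q$-basis is $(\hbar/i)\,\partial/\partial q$. Consider Eq. (25) for the special case $F(Q)=\langle Q|q\rangle$:

$$TpT^{-1}\langle Q|q\rangle=\iint\langle Q|p\rangle\,p\,\langle p|Q'\rangle\langle Q'|q\rangle\,dp\,dQ'. \tag{30}$$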

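Recognizing the spectral decomposition $\int dp\,p\,|p\rangle\langle p|$ of $\hat{p}$ in this equation, we can rewrite it as:

$$TpT^{-1}\langle Q|q\rangle=\int\langle Q|\hat{p}|Q'\rangle\langle Q'|q\rangle\,dQ'=\langle Q|\hat{p}|q\rangle, \tag{31}$$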

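where in the last step we used the resolution of unity, $\int dQ'\,|Q'\rangle\langle Q'|=\hat{1}$. Eq. (14) tells us that

$$TpT^{-1}\langle Q|q\rangle=-\frac{\hbar}{i}\frac{\partial}{\partial q}\langle Q|q\rangle. \tag{32}$$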

Setting the complex conjugates of the right-hand sides of these last two equations equal to one another, we arrive at:

 ⟨q|^p|Q