Aspects Of Perturbative Unitarity

Damiano Anselmi

Dipartimento di Fisica “Enrico Fermi”, Università di Pisa,

Largo B. Pontecorvo 3, 56127 Pisa, Italy

and INFN, Sezione di Pisa,

Largo B. Pontecorvo 3, 56127 Pisa, Italy

damiano.anselmi@unipi.it

Abstract

We reconsider perturbative unitarity in quantum field theory and upgrade several arguments and results. The minimum assumptions that lead to the largest time equation, the cutting equations and the unitarity equation are identified. Using this knowledge and a special gauge, we give a new, simpler proof of perturbative unitarity in gauge theories and generalize it to quantum gravity, in four and higher dimensions. The special gauge interpolates between the Feynman gauge and the Coulomb gauge without double poles. When the Coulomb limit is approached, the unphysical particles drop out of the cuts and the cutting equations are consistently projected onto the physical subspace. The proof does not extend to nonlocal quantum field theories of gauge fields and gravity, whose unitarity remains uncertain.

1 Introduction

The problem of quantum gravity is the apparent incompatibility between unitarity and renormalizability. For example, the quantization of Einstein gravity gives a theory that is unitary, but not renormalizable [1, 2]. If the counterterms generated by renormalization are included, the theory becomes renormalizable with infinitely many independent couplings (for this reason, “nonrenormalizable” and “renormalizable with infinitely many independent couplings” are often used interchangeably) and predictive at low energies (see, for example, [3]). It is also possible to build theories of quantum gravity that are renormalizable (with finitely many independent couplings), but not unitary. One way to achieve this goal is by including higher-derivative terms that make the propagators fall off more rapidly at high energies [4]. It is not known how to build a theory that is renormalizable and unitary at the same time.

Perturbative unitarity is thus a key issue in quantum field theory. In scalar and fermion theories it can be proved by means of the cutting equations [5, 6]. In gauge theories, additional aspects need to be addressed, such as the compensation between the Faddeev-Popov ghosts and the temporal and longitudinal components of the gauge fields. This compensation can be proved diagrammatically [7] by means of the Ward identities or more formally at the level of the Fock space [8]. For a variety of reasons, we believe that the last word has not been said on this topic and that an attempt to reorganize and generalize the proof is most welcome. First, a treatment of perturbative unitarity in quantum gravity is still missing. Second, the existing proofs in gauge theories are involved, which suggests that they are not optimized.

In this paper we offer a more economical approach and a new, exhaustive proof that works not only in Abelian and non-Abelian gauge theories, but also in quantum gravity and a variety of other nonrenormalizable local theories, in arbitrary dimensions greater than 3.

First we prove a number of basic tools, such as the largest time equation and the cutting equations, paying attention to the minimum assumptions that they require. Then, we show that the unphysical degrees of freedom can be consistently dropped from the cuts. To achieve this goal, we identify a special gauge that leads to the unitarity equation in a straightforward way.

In gauge theories and gravity, several common gauges have inconvenient features. The Lorenz gauge, for example, gives propagators that have double poles, which prevents the derivation of the cutting equations. The Coulomb gauge has a nice feature, because it just propagates the physical degrees of freedom. However, it introduces unwanted singularities in the Feynman diagrams, which are under control only in QED.

The special gauge is a new gauge that has no double poles, interpolates between the Feynman gauge and the Coulomb gauge and satisfies all the assumptions that are required to derive the cutting equations. Moreover, when the gauge fields are given a mass to regulate their on shell infrared divergences and the Coulomb limit is approached, the threshold for the production of unphysical particles grows enough to drop them out from the cuts. After that, a few technical tricks allow us to complete the proof. The special gauge is unique in both Yang-Mills theory and gravity.

An even simpler proof of perturbative unitarity is available in QED, where it is possible to work directly in the Coulomb gauge.

We pay attention to details such as the regularization and the renormalization of the cutting equations, the presence of contact terms, the double poles of the gauge field propagators, the orders of the limits with which various parameters are removed, the infrared divergences and other singularities that disappear by summing up the cut diagrams. Some of these problems are not treated carefully (or are not even mentioned) in the existing literature.

Because of the key role played by the largest time equation, the proofs we give in this paper do not generalize to nonlocal quantum field theories of gauge fields and gravity, including those whose propagators have no poles on the complex plane besides the graviton one [9]. For this reason, the consistency of those theories remains unclear.

The paper is organized as follows. In section 2 we derive the cutting equations under the minimum assumptions. In section 3 we derive the unitarity equation. In section 4 we introduce the special gauge in Abelian and non-Abelian gauge theories. In section 5 we give the simplest proof of perturbative unitarity, using the Coulomb gauge in QED. In section 6 we use the special gauge to prove unitarity in all gauge theories. In section 7 we generalize the special gauge and the proof of unitarity to quantum gravity. Section 8 contains the conclusions.

2 The cutting equations

We investigate perturbative unitarity following the guidelines of refs. [6, 10, 11], which consist of proving, in the order:

— the largest time equation,

— the cutting equations,

— the pseudounitarity equation,

— the unitarity equation.

The pseudounitarity equation is a more general version of the unitarity equation, where the cuts may propagate both physical and unphysical particles. In gauge theories and gravity it is helpful to first derive the pseudounitarity equation and then prove that it implies the unitarity equation, by showing that the external legs and the cuts can be consistently projected onto the physical subspace.

In this section we reconsider the cutting equations and search for the minimum assumptions that are necessary to derive them. We assume invariance under translations and spatial rotations, but we do not assume Lorentz invariance. Indeed, we need results that can be applied to gauge choices that violate Lorentz invariance, such as the Coulomb gauge and the special gauge. We do not assume from the beginning that the theory is local. Nevertheless, along with the derivation it emerges that the vertices must be local.

2.1 Regularization

We use the dimensional regularization, or one of its variants [12], directly in Minkowski spacetime. Several assumptions and arguments of our derivations make no sense unless there are exactly one time component and one energy component, so we dimensionally continue the space coordinates, but do not continue the time coordinate.

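Let $d$ denote the physical spacetime dimension and $D=d-\varepsilon$ the continued one, where $\varepsilon$ is a complex number. Split the continued spacetime $\mathbb{R}^{D}$ into the product $\mathbb{R}\times\mathbb{R}^{D-1}$ of the time line $\mathbb{R}$ times the continued space $\mathbb{R}^{D-1}$. Denote the metric of flat spacetime with $\eta_{\mu\nu}=\mathrm{diag}(1,-1,\cdots,-1)$.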

Typically, we work with theories whose vertices are local and whose gauge fixed propagators are equal to ratios of polynomials $u$ and $v$ of the energy $E$ and the space momentum $\mathbf{p}$,

$$\tilde f(E,\mathbf{p})=\frac{u(E,\mathbf{p})}{v(E,\mathbf{p})}, \tag{2.1}$$

with denominators $v$ equal to products of polynomials $aE^{2}-b\mathbf{p}^{2}-m^{2}+i\epsilon$, where $a$ and $b$ are positive constants and $m$ is real. The symbol $\epsilon$ is used to specify the contour prescription.

These theories are well regularized by the prescription of first integrating on the space momenta $\mathbf{p}$, then on the energies $E$. Indeed, after the integration on the space momenta the energy integrals behave as

$$\sim\int^{E=\pm\infty}dE\,\frac{E^{m}}{(E^{2})^{n\varepsilon/2}} \tag{2.2}$$

for large $|E|$, where $m$ and $n$ are nonnegative integers and $n>0$. The analytic continuation in $\varepsilon$ makes these integrals well defined.

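Various manipulations can simplify the propagators and generate local integrands. Then the result is zero, because the dimensionally regularized integral of a polynomial vanishes, as in

$$\int_{-\infty}^{+\infty}\frac{dE}{2\pi}\int\frac{d^{d-1-\varepsilon}\mathbf{p}}{(2\pi)^{d-1-\varepsilon}}\,E^{m}p^{i_{1}}\cdots p^{i_{n}}=0, \tag{2.3}$$

where, again, $m$ and $n$ are nonnegative integers.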

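Note that we may not be able to perform the usual contour integrations on the energy. Nonetheless, each step of the calculation is consistent. For example, in $d=4$, we have

$$\int\frac{d^{D}p}{(2\pi)^{D}}\frac{i}{p^{2}-m^{2}+i\epsilon}=-\frac{i\,\Gamma\!\left(\frac{\varepsilon-1}{2}\right)}{(4\pi)^{(3-\varepsilon)/2}}\int_{-\infty}^{+\infty}\frac{dE}{2\pi}\,(m^{2}-E^{2}-i\epsilon)^{(1-\varepsilon)/2}=\frac{\Gamma\!\left(\frac{\varepsilon}{2}-1\right)}{(4\pi)^{(4-\varepsilon)/2}}\,(m^{2}-i\epsilon)^{1-\frac{\varepsilon}{2}}. \tag{2.4}$$

Interchanging the energy and momentum integrals does not make sense, in general, as in (2.3), but in specific cases it may be allowed, as in (2.4).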

The propagators have the structure (2.1) in all the cases we consider, with two exceptions: the Coulomb gauge in QED and the mass terms introduced to regulate the (on shell) infrared divergences in non-Abelian gauge theories and nonrenormalizable theories. In both cases some denominators have $a=0$, but the regularization can be proved to work well by means of ad hoc methods and/or appropriate truncations.

Equipped with this regularization technique, we are ready to begin our investigation. The algorithm to renormalize the divergences is described along the way.

2.2 The largest time equation

The largest time equation is implied by the following minimum assumptions:

(a) the vertices are localized in time;

(b) the propagators $f(x)$ in coordinate space can be decomposed as

$$f(x)=\theta(x^{0})g_{+}(x)+\theta(-x^{0})g_{-}(x) \tag{2.5}$$

in the sense of distributions.

For the moment, we do not make further assumptions about the distributions $g_{\pm}$. Formula (2.5) and similar formulas written below are exact identities among distributions. In particular, they imply that $f$ contains no contributions proportional to $\delta(x^{0})$ or its derivatives.

By assumption (a), each vertex is associated with a definite time $x^{0}$, but need not be associated with a unique space coordinate $\mathbf{x}$. By translational invariance, a propagator is described by a time difference $x^{0}-y^{0}$ and a space difference $\mathbf{x}-\mathbf{y}$, as usual.

Consider a raw Feynman diagram in coordinate space. By this we mean the plain product of the vertices and the propagators, with no integrations over the space and time coordinates. We denote the raw diagram by $F(x_{1}^{0},\cdots,x_{n}^{0})$, where $x_{i}^{0}$ are the locations of the vertices in time, while the dependences on the space coordinates are omitted.

Next, build variants $F_{M}$ of the diagram as follows. Mark any subset of vertices by putting hats on their times, $x_{i}^{0}\rightarrow\hat{x}_{i}^{0}$. Multiply by an overall factor $(-1)^{m}$, where $m$ is the number of marked vertices. Replace the propagators connecting two unmarked vertices, two marked vertices and a marked vertex with an unmarked one, respectively, as specified by the following scheme:

$$\begin{aligned}
x\longrightarrow y&:\ \theta(x^{0}-y^{0})g_{+}(x-y)+\theta(y^{0}-x^{0})g_{-}(x-y),\\
\hat{x}\longrightarrow\hat{y}&:\ \theta(x^{0}-y^{0})g_{-}(x-y)+\theta(y^{0}-x^{0})g_{+}(x-y),\\
\hat{x}\longrightarrow y&:\ g_{+}(x-y),\qquad x\longrightarrow\hat{y}:\ g_{-}(x-y).
\end{aligned} \tag{2.6}$$

Finally, do not modify the values of the vertices. For the sake of generality we assume that the propagators are oriented. The orientation is specified by the arrows.

Now, assume that the vertices have distinct times. Then, we have the identity

$$\sum_{\text{markings }M}F_{M}(x_{1}^{0},\cdots,\hat{x}_{i}^{0},\cdots,\hat{x}_{j}^{0},\cdots,x_{n}^{0})=0, \tag{2.7}$$

which is known as the largest time equation. The sum is over all the ways to mark the vertices, including the cases where the vertices are all marked and all unmarked.

Here is the proof of (2.7). Since the vertices have distinct times, one vertex must have the largest time. Denote that time by $z^{0}$. Pick any diagram $F_{M}$ of the sum (2.7). The time $z^{0}$ may be marked or not in $F_{M}$. If it is marked (unmarked), the sum (2.7) contains another diagram $F_{M^{\prime}}$ that is identical to $F_{M}$ except for the fact that $z^{0}$ is unmarked (marked). The sum $F_{M}+F_{M^{\prime}}$ vanishes, because the propagators between a point $z$ with time $z^{0}$ (there may be more than one such point, if the vertex is nonlocal in space) and any other points $x$, $y$ are, in the various cases,

$$\begin{aligned}
z\longrightarrow y&:\ g_{+}(z-y), & \hat{z}\longrightarrow y&:\ g_{+}(z-y), & z\longrightarrow\hat{y}&:\ g_{-}(z-y), & \hat{z}\longrightarrow\hat{y}&:\ g_{-}(z-y),\\
x\longrightarrow z&:\ g_{-}(x-z), & x\longrightarrow\hat{z}&:\ g_{-}(x-z), & \hat{x}\longrightarrow z&:\ g_{+}(x-z), & \hat{x}\longrightarrow\hat{z}&:\ g_{+}(x-z).
\end{aligned}$$

In each case the propagator does not depend on whether $z$ is marked. In the end, the diagrams $F_{M}$ and $F_{M^{\prime}}$ are equal except for an overall minus sign due to the marking/unmarking of $z^{0}$. This implies (2.7).
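The pairwise cancellation used in the proof can be checked numerically. The following is a minimal sketch, not part of the original derivation: the helper `largest_time_sum` and the test functions `g_plus` and `g_minus` are hypothetical, the vertex values are set to one and the space dependence is suppressed, so only the dependence on the time differences is kept.

```python
import cmath
import itertools

def largest_time_sum(times, edges, g_plus, g_minus):
    """Sum over all markings of a raw diagram, following scheme (2.6).

    times : vertex times x_i^0 (assumed distinct)
    edges : oriented propagators (a, b), meaning x_a -> x_b
    g_plus, g_minus : hypothetical functions of the time difference only
    """
    total = 0j
    for marks in itertools.product([False, True], repeat=len(times)):
        term = (-1) ** sum(marks)              # one minus sign per marked vertex
        for a, b in edges:
            dt = times[a] - times[b]
            if marks[a] and not marks[b]:      # marked -> unmarked
                term *= g_plus(dt)
            elif not marks[a] and marks[b]:    # unmarked -> marked
                term *= g_minus(dt)
            elif not marks[a]:                 # both unmarked
                term *= g_plus(dt) if dt > 0 else g_minus(dt)
            else:                              # both marked
                term *= g_minus(dt) if dt > 0 else g_plus(dt)
        total += term
    return total

# Arbitrary test functions: the identity does not rely on their form.
g_plus = lambda t: cmath.exp(-1.3j * t) * (1 + 0.2 * t * t)
g_minus = lambda t: cmath.exp(0.7j * t) / (2 + t * t)

triangle = [(0, 1), (1, 2), (2, 0)]
residual = abs(largest_time_sum([0.3, -1.1, 2.0], triangle, g_plus, g_minus))
print(residual)   # compatible with zero up to rounding errors
```

The sum vanishes for any choice of the two functions, which reflects the fact that the proof only uses the decomposition (2.5) and the existence of a vertex of largest time.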

2.3 Contact terms

To derive the cutting equations, we must calculate the Fourier transforms of the largest time equations, which requires integrating over the coordinates. However, in the derivation of (2.7) we have assumed that the vertices had different times. We want to make sure that this assumption can be dropped, because only in that case the result of the Fourier transform has a straightforward diagrammatic interpretation.

More precisely, we need to show that when we take any (one-sided) limits of coinciding times on the functions $F_{M}$ of equation (2.7), we do not miss terms that give nontrivial contributions to the integrals on the coordinates.

Call two vertices nearest neighbors if they are connected by a propagator. Observe that, to prove (2.7), the point of largest time just needs to be compared with its nearest neighbors. For this reason, equation (2.7) trivially extends to the case where there are vertices with coinciding times, as long as no pairs of them are made of nearest neighbors. Precisely, denote the vertices with coinciding times by $x_{1},\cdots,x_{k}$ and call their time $t$. When $t$ is not the largest time, we can proceed exactly as above, which leads to (2.7). When $t$ is the largest time, we can pick any of the $x_{i}$ as the vertex $z$ of largest time and, again, proceed as above to obtain (2.7). Thus, the only situation that deserves attention is when some nearest neighbors have coinciding times. Nontrivial contributions to the integrals on the coordinates can only appear when contact terms are present.

Consider that the vertices may carry time derivatives. For example, in quantum gravity the Einstein-Hilbert action is corrected by terms built with the Riemann tensor and its derivatives, which may contain an arbitrary number of time derivatives acting on the metric tensor. By means of partial integrations, the derivatives can be moved to the propagators (2.6). Then, they may generate contact terms proportional to $\delta(x^{0})$ or its derivatives:

$$\partial_{0}^{n}f(x)=\theta(x^{0})\partial_{0}^{n}g_{+}(x)+\theta(-x^{0})\partial_{0}^{n}g_{-}(x)+\sum_{k=1}^{n}\delta^{(k-1)}(x^{0})\left[\lim_{x^{0}\rightarrow0^{+}}\partial_{0}^{n-k}g_{+}(x)-\lim_{x^{0}\rightarrow0^{-}}\partial_{0}^{n-k}g_{-}(x)\right]. \tag{2.8}$$

However, the largest time equation (2.7) is only sensitive to the first two terms that appear on the right-hand side of this equation, since the vertices must have distinct times.

In specific cases, such as when the vertices cannot provide enough time derivatives to create nontrivial contact terms, assumption (a) is sufficient for our purposes. However, in general it is necessary to replace it with the stronger assumption that

(a′) the vertices are local

and further assume that

(c) the contact terms are local, i.e. the time derivatives of the propagators satisfy the property

$$\partial_{0}^{n}f(x)=\theta(x^{0})\partial_{0}^{n}g_{+}(x)+\theta(-x^{0})\partial_{0}^{n}g_{-}(x)+\text{local terms}; \tag{2.9}$$

(d) $g_{\pm}$ and their derivatives have well-defined limits for $x^{0}\rightarrow0^{\pm}$.

When the propagators have the structure (2.1), property (c) follows as a consequence. Observe that a nontrivial contact term arises when the numerator contains a power of $E$ greater than or equal to the maximum power of $E$ appearing in the denominator. Let $r$ denote the degree of $v$ as a polynomial in $E$. Assumption (b) implies that the degree of $u$ in $E$ must be smaller than $r$. Write

$$\tilde f(E,\mathbf{p})=\frac{u(E,\mathbf{p})}{E^{r}+w(E,\mathbf{p})},$$

where $w$ also has degree smaller than $r$ in $E$. When $\tilde f$ is multiplied by a sufficient power of $E$, the numerator may contain a power $E^{r}$ that simplifies the power of the denominator. Write $u(E,\mathbf{p})=E^{r-1}u^{\prime}(\mathbf{p})+u^{\prime\prime}(E,\mathbf{p})$, where $u^{\prime}$ is a polynomial of $\mathbf{p}$ and $u^{\prime\prime}$ is a polynomial of degree smaller than $r-1$ in $E$. Then,

$$E\tilde f(E,\mathbf{p})=\frac{Eu^{\prime\prime}(E,\mathbf{p})-w(E,\mathbf{p})u^{\prime}(\mathbf{p})}{E^{r}+w(E,\mathbf{p})}+u^{\prime}(\mathbf{p}).$$

The ratio on the right-hand side does not contain contact terms, because the numerator contains at most $r-1$ powers of $E$. Thus, (2.9) holds for $n=1$. The argument can be easily iterated for $n=2,3,\cdots$, which proves (2.9) for every $n$.
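The iteration amounts to a polynomial division in $E$: the quotient is polynomial in $E$ and collects the local contact terms, while the remainder over the denominator carries no contact terms. A minimal sketch at fixed space momentum follows. The helper `poly_divmod` and the sample denominator $E^{2}-\omega^{2}$ with $\omega^{2}=2$ are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Divide polynomials in E; coefficients are listed from the highest power down."""
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    while len(num) >= len(den):
        c = num[0] / den[0]
        quotient.append(c)
        for i, d in enumerate(den):
            num[i] -= c * d
        num.pop(0)              # the leading coefficient is now zero
    return quotient, num        # num holds the remainder

# Hypothetical sample: u = 1, v = E^2 - omega^2 with omega^2 = 2.
# Three powers of E in the numerator: E^3 / (E^2 - 2) = E + 2E / (E^2 - 2).
local_part, remainder = poly_divmod([1, 0, 0, 0], [1, 0, -2])
print(local_part, remainder)
```

In this example the quotient $E$ is the local piece, and the remainder $2E$ has degree smaller than that of the denominator, as required for the absence of further contact terms.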

It is easy to show that property (d) follows from (2.1), as long as (2.1) has only simple poles and $g_{\pm}$ are regular distributions.

We are ready to describe the procedure to deal with the contact terms. Consider a diagram $F$ where some differentiated propagators carry contact terms. Separate them from the rest of $F$ as in formula (2.9) and write $F$ as a sum of contributions $F_{i}$, such that each internal line of $F_{i}$ is either a contact term or does not carry contact terms. The contact terms of $F_{i}$ draw a subdiagram $S_{i}$ (which may be disconnected).

[Figure: a contribution $F_{i}$ and the subdiagram $S_{i}$ drawn by its contact terms.]

It is easy to prove that if $S_{i}$ contains loops, it vanishes. Indeed, by assumptions (a′) and (c), the vertices and the contact terms are both local, so each loop of contact terms is a linear combination of integrals (2.3) in momentum space. Thus, we can assume that $S_{i}$ is a tree subdiagram. Each connected component of $S_{i}$ is equal to a product of (derivatives of) delta functions times a new local vertex $V$ that can be obtained by gluing the vertices of that component together. In turn, $F_{i}$ is a product of (derivatives of) delta functions times a reduced diagram $F_{i}^{\prime}$, built with the ordinary vertices and the vertices $V$.

Now, $F_{i}^{\prime}$ has no contact terms and thus satisfies the largest time equation (2.7). It is connected if the original diagram is connected.

By property (d), a line that connects a marked vertex with an unmarked one is not affected by contact terms. By the same property, $g_{\pm}$ have well-defined limits for $x^{0}\rightarrow0^{\pm}$. Then, formula (2.6) shows that the contact terms carried by the lines connecting pairs of marked vertices are equal to minus the contact terms of $f$. We can easily show that, thanks to this fact, a minus sign is associated with each marked vertex of type $V$, as expected. Indeed, a marked vertex $V$ originates from the markings of all the vertices of the corresponding component of $S_{i}$. Each such vertex provides a minus sign, but other minus signs come from the contact terms of the component, because they are associated with pairs of marked vertices. Since the component is a tree diagram, the sum of the number of its vertices plus the number of its lines is odd, so a marked vertex $V$ always carries a minus sign.

If we sum the largest time equation (2.7), derived under the condition that all the nearest neighbors have distinct times, to the largest time equations satisfied by the diagrams $F_{i}^{\prime}$, multiplied by the appropriate products of (derivatives of) delta functions, the right-hand side of (2.8) is fully reconstructed, for each propagator. Observe that assumption (a′) plays an important role here, because it ensures that each diagram involves a finite number of time derivatives.

The conclusion is that if we add assumptions (a′), (c) and (d), the largest time equation (2.7) holds even if we drop the assumption that the vertices are located at distinct times. Then formula (2.7) can be interpreted as an identity of distributions and we can safely compute its Fourier transform.

The same conclusion holds when the vertices do not provide enough time derivatives to generate contact terms, in which case assumptions (a′), (c) and (d) need not be satisfied.

2.4 The cutting equations

Once the contact terms are dealt with as explained above, the Fourier transform of the largest time equation (2.7) is an analogous equation in momentum space, where the propagators and the vertices are replaced by their Fourier transforms. Denoting the Fourier transform of $F_{M}$ with $G_{M}$, we get

$$\sum_{\text{markings }M}G_{M}(p_{1},\cdots,p_{n})=0. \tag{2.10}$$

Now we simplify this identity by converting it into a set of cutting equations. The cutting equations are consequences of the assumptions made so far and the following additional one:

(e) the Fourier transforms $\tilde g_{\pm}$ of $g_{\pm}$ have the form

$$\tilde g_{\pm}(p)=\theta(\pm p^{0})h_{\pm}(p). \tag{2.11}$$

For the moment, we make no further assumptions about the distributions $h_{\pm}$. We interpret formulas (2.11) by saying that the energy flows from an unmarked vertex to a marked vertex, that is to say from the past to the future.

Consider a connected, amputated diagram of formula (2.10). Call the external legs whose energies flow into (out of) the diagram incoming (outgoing). Mark the end points of the outgoing external legs and leave the end points of the incoming external legs unmarked.

We refer to the vertices and the end points of the external legs by simply calling them “points”. Thus, the energy flows from an unmarked point to a marked point. Between two marked points or two unmarked points it can flow in both directions.

We want to show that every diagram $G_{M}$ of (2.10) vanishes, unless it can be cut into two pieces, leaving the marked and unmarked points on opposite sides of the cut. If that is the case, we denote the diagram by $G_{C}$.

Consider a marked vertex. Its nearest points cannot be all unmarked, because then the orientations of the energy flows would imply the violation of energy conservation. Thus, at least one of its nearest neighbors is a marked point. Next, consider a connected subdiagram made of some marked vertices and the legs attached to them. Again, energy conservation implies that the nearest points of the subdiagram must include at least another marked point. Extending the subdiagram point by point, we find that each connected subdiagram of marked points must include the end point of an outgoing line. Similarly, a connected subdiagram of unmarked points must include the end point of an incoming line.

Because of this, the diagram is cut into two (not necessarily connected) subdiagrams. The cut crosses the propagators that connect a marked vertex to an unmarked vertex, as well as the external lines that connect a marked point to an unmarked point. For example, we have

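[Figure (2.12): left, a diagram with the marked points circled and the cut drawn as a solid line; right, the same diagram with the marked side of the cut shadowed.]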

In the left figure, the marked points are circled and the solid line denotes the cut. From now on, instead of marking the vertices, we just shadow the marked side of the cut, as shown in the right figure of (2.12). Normally, the incoming legs are drawn on the left-hand side and the outgoing legs are drawn on the right-hand side.

Since the external legs are amputated, the cutting of an external leg does not have any particular meaning besides the graphical one: all the marked points must lie on one side of the cut and all the unmarked points must lie on the other side.

We conclude that the Fourier transform (2.10) of the largest time equation (2.7) simplifies into the cutting equation

$$\sum_{\text{cuttings }C}G_{C}(p_{1},\cdots,p_{n})=0. \tag{2.13}$$

We stress that equations (2.7), (2.10) and (2.13) do not assume that the external legs are on shell.
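The selection of the surviving diagrams can be made explicit on a small example. The sketch below is only an illustration of the energy-flow criterion stated above (every connected set of marked points must contain the end point of an outgoing leg, every connected set of unmarked points the end point of an incoming leg). The function `surviving_markings` and the labels `"in"` and `"out"` are hypothetical names.

```python
import itertools

def surviving_markings(n_vertices, lines, incoming, outgoing):
    """List the vertex markings allowed by the energy-flow argument.

    Points 0..n_vertices-1 are vertices; `incoming` and `outgoing` are the
    labels of external end points (kept unmarked and marked, respectively).
    `lines` lists pairs of connected points (propagators and external legs).
    """
    points = list(range(n_vertices)) + list(incoming) + list(outgoing)
    neighbors = {p: set() for p in points}
    for a, b in lines:
        neighbors[a].add(b)
        neighbors[b].add(a)

    def component(start, mark):
        """Connected set of points sharing the same marking as `start`."""
        seen, stack = {start}, [start]
        while stack:
            p = stack.pop()
            for q in neighbors[p]:
                if mark[q] == mark[p] and q not in seen:
                    seen.add(q)
                    stack.append(q)
        return seen

    allowed = []
    for choice in itertools.product([False, True], repeat=n_vertices):
        mark = dict(enumerate(choice))
        mark.update({p: False for p in incoming})
        mark.update({p: True for p in outgoing})
        ok = all(
            component(v, mark) & set(outgoing if mark[v] else incoming)
            for v in range(n_vertices)
        )
        if ok:
            allowed.append(choice)
    return allowed

# One-loop bubble: in -> vertex 0 == vertex 1 -> out (two internal propagators).
bubble = [("in", 0), (0, 1), (0, 1), (1, "out")]
print(surviving_markings(2, bubble, ["in"], ["out"]))
# -> [(False, False), (False, True), (True, True)]
```

For the bubble, three of the four markings survive: the two where all the vertices are unmarked or all marked, and a single proper cutting. The marking with only the first vertex marked is discarded, because the marked vertex would be surrounded by unmarked points.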

The sum of formula (2.13) contains two special contributions that it is convenient to single out. They are the contributions $G$ and $\bar{G}$ of the diagrams where all the vertices are unmarked or marked, respectively. We have

$$G(p_{1},\cdots,p_{n})+\bar{G}(p_{1},\cdots,p_{n})=-\sum_{\text{proper cuttings }C}G_{C}(p_{1},\cdots,p_{n}), \tag{2.14}$$

where the sum is restricted to the “proper” cuttings, which are those where at least one vertex is marked and at least one vertex is unmarked.

Everything we have said so far is valid at the regularized level. If the locality of counterterms holds, the diagrams built with the counterterms satisfy analogous properties. Combining the cutting equation of one diagram with the cutting equations satisfied by the diagrams that subtract its subdivergences and overall divergence, we obtain the renormalized cutting equation.

Note that in the renormalized cutting equation every side of the cut is appropriately renormalized. On the other hand, no counterterms are associated with subdiagrams containing the cut or part of it. The consistency of this fact is proved by the renormalized cutting equation itself. Indeed, after the inclusion of the counterterms the left-hand side of formula (2.14) is convergent, so the right-hand side must also be convergent.

So far, the assumptions we have made are more general than the usual ones. However, we anticipate that we cannot obtain the pseudounitarity equation unless we impose further restrictions.

2.5 Examples

Now we give some simple examples concentrating on scalar fields $\varphi$. Examples with fermions and gauge fields are given later on.

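If we interpret the decomposition (2.5) as the usual T-ordered one, where

$$f(x)=\langle0|T\varphi(x)\varphi(0)|0\rangle,\qquad g_{+}(x)=\langle0|\varphi(x)\varphi(0)|0\rangle,\qquad g_{-}(x)=\langle0|\varphi(0)\varphi(x)|0\rangle=g_{+}(-x),$$

and further assume Lorentz invariance, then we obtain the standard Källén-Lehmann (KL) representation. Indeed, now $f$ and $g_{\pm}$ are Lorentz invariant and so are $\tilde g_{\pm}$. However, the sign of $p^{0}$ depends on the reference frame, unless $p^{2}\geqslant0$. Thus, $h_{\pm}(p)$ vanish for $p^{2}<0$ and depend only on $p^{2}$ for $p^{2}\geqslant0$. Then, $g_{-}(x)=g_{+}(-x)$ implies that $h_{+}$ and $h_{-}$ must be the same function, which we denote by $2\pi\rho(p^{2})$. Inserting $g_{\pm}$ inside (2.5) and working out the Fourier transform of $f$, we find the KL decomposition

$$\tilde f(p)=\int_{0}^{+\infty}\frac{i\rho(s)\,ds}{p^{2}-s+i\epsilon}. \tag{2.15}$$

We have used the identity

$$\theta(x^{0})=\frac{i}{2\pi}\int_{-\infty}^{+\infty}\frac{e^{-i\tau x^{0}}\,d\tau}{\tau+i\epsilon}, \tag{2.16}$$

then changed the integration variable from $\tau$ to $s$ and used $\rho(s)=0$ for $s<0$. Note that $\rho$ is not assumed to be nonnegative.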

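In the case of ordinary (i.e. non-higher-derivative) free scalar fields, we have

$$\tilde f(p)=\frac{i}{p^{2}-m^{2}+i\epsilon},\qquad\tilde g_{\pm}(p)=2\pi\theta(\pm p^{0})\rho(p^{2}),\qquad\rho(s)=\delta(s-m^{2}).$$

The simplest cutting equation is the one satisfied by the propagator:

[Figure (2.17): the cutting equation of the propagator, relating the unshadowed and shadowed propagators to the two cut propagators.]

In deriving this equation, the end points must be imagined as vertices, so each shadowed end point gives a factor $-1$. This explains the signs of (2.17). In formulas, we have

$$\frac{i}{p^{2}-m^{2}+i\epsilon}+\frac{-i}{p^{2}-m^{2}-i\epsilon}=2\pi\theta(p^{0})\delta(p^{2}-m^{2})+2\pi\theta(-p^{0})\delta(p^{2}-m^{2}). \tag{2.18}$$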
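Formula (2.18) can be checked at finite width. Writing $x=p^{2}-m^{2}$, the left-hand side equals $2\epsilon/(x^{2}+\epsilon^{2})$, which is $2\pi$ times a nascent delta function, while the two theta functions on the right-hand side add up to one. The sketch below, with the hypothetical helpers `propagator_sum` and `lorentzian`, verifies the pointwise identity and the normalization by a simple midpoint rule; the numerical values are illustrative.

```python
import math

def propagator_sum(x, eps):
    """Left-hand side of (2.18) with x = p^2 - m^2 and a finite width eps."""
    return 1j / (x + 1j * eps) - 1j / (x - 1j * eps)

def lorentzian(x, eps):
    """2*pi times a nascent delta function of width eps."""
    return 2 * eps / (x * x + eps * eps)

# Pointwise, the two expressions coincide for any finite width.
print(abs(propagator_sum(0.7, 0.1) - lorentzian(0.7, 0.1)))

# Midpoint-rule integral over x: it approaches 2*pi as eps -> 0.
eps, half_range, n = 0.05, 50.0, 200_000
step = 2 * half_range / n
integral = sum(lorentzian(-half_range + (k + 0.5) * step, eps) for k in range(n)) * step
print(integral, 2 * math.pi)
```

The small difference between the integral and $2\pi$ is due to the finite integration range and shrinks when the ratio between the range and the width grows.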

At one loop we have

[Figure (2.19): the cutting equation of the one-loop bubble diagram.]

which can be checked easily (see for example [11]).

Nonlocal quantum field theories do not satisfy the assumptions that lead to the largest time equation, unless their vertices are local in time. Then, however, either Lorentz invariance or gauge invariance is violated.

3 The pseudounitarity and unitarity equations

In this section we derive the pseudounitarity equation and explain when it implies the unitarity equation. As said, we must make additional assumptions, which eventually lead to a general Källén-Lehmann spectral representation, even if we do not assume it from the start.

First, the shadowed regions of the cutting equations should correspond to the complex conjugate diagrams, that is to say we must assume that

(f) the action is Hermitian.

In particular, this implies that the shadowed propagator is the Hermitian conjugate of the unshadowed one and that the cut propagators (2.11) are Hermitian, i.e. $\tilde g_{\pm}^{\dagger}=\tilde g_{\pm}$. The minus sign associated with each marked vertex is then justified by the fact that the vertices are anti-Hermitian.

Second, the cut propagators must project onto the on shell states of the free field limit. This means that we must replace (e) by the more restrictive assumption that

(e′) the Fourier transforms of $g_{\pm}$ have the form

$$\tilde g_{\pm}(p)=\pi\sum_{i=1}^{N}a_{i}^{\pm}(p)\,\delta(p^{0}\mp\omega_{i}), \tag{3.1}$$

where $\omega_{i}$ are positive functions of $\mathbf{p}$, while $a_{i}^{\pm}$ are Hermitian matrices whose entries are functions of $p$.

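Using the identity (2.16) it is easy to check that the Fourier transform of formula (2.5) is

$$\tilde f(p)=\frac{i}{2}\sum_{i=1}^{N}\frac{a_{i}^{+}(\omega_{i},\mathbf{p})(p^{0}+\omega_{i})-a_{i}^{-}(-\omega_{i},\mathbf{p})(p^{0}-\omega_{i})}{(p^{0})^{2}-\omega_{i}^{2}+i\epsilon}. \tag{3.2}$$

With suitable assumptions on $a_{i}^{\pm}$ and $\omega_{i}$, this formula matches (2.1).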

At this point, we diagonalize the matrices $a_{i}^{+}$ and $a_{i}^{-}$ and normalize their eigenvalues to $1$, $-1$ and $0$. The diagonalized expressions involve a sign factor that distinguishes bosons from fermions and diagonal matrices with eigenvalues $1$, $-1$ and $0$. The diagonalizing matrices collect the external particle and antiparticle states.

Writing the $S$ matrix as $S=1+iT$, the cutting equations (2.14) can be collected into the pseudounitarity equation

$$-iT+iT^{\dagger}=THT^{\dagger}, \tag{3.3}$$

where $H$ is the diagonal matrix whose diagonal blocks are the diagonal matrices with eigenvalues $1$, $-1$ and $0$ obtained from the diagonalization.

If there exists a subspace $V$ of states of the free field theory such that equation (3.3) holds with $H=1$ when the external legs and the cut legs are projected onto $V$, then the pseudounitarity equation implies perturbative unitarity, which is expressed by the equation

$$-iT+iT^{\dagger}=TT^{\dagger} \tag{3.4}$$

in $V$.

Summarizing, the assumption that turns the pseudounitarity equation into the unitarity equation is that

(g) there exists a subspace $V$ of states on which the diagonal blocks of $H$ reduce to the identity, such that the cutting equations still hold after the external legs and the cut legs are projected onto $V$.

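3.1 The Källén-Lehmann spectral representation

We have found that, in general, the propagator must have the form (3.2), which means, in particular, that there can only be simple poles on the real axis, but no double poles and no poles away from the real axis. We can recast formula (3.2) in the form of the general Källén-Lehmann representation

$$\tilde f(p)=i\int_{0}^{+\infty}ds\,\frac{\rho(s,\mathbf{p})+p^{0}\sigma(s,\mathbf{p})}{(p^{0})^{2}-s+i\epsilon}, \tag{3.5}$$

where the densities $\rho$ and $\sigma$ are the Hermitian matrices given by

$$\begin{aligned}
\rho(s,\mathbf{p})&=\sum_{i=1}^{N}\frac{\omega_{i}}{2}\left(a_{i}^{+}(\omega_{i},\mathbf{p})+a_{i}^{-}(-\omega_{i},\mathbf{p})\right)\delta(s-\omega_{i}^{2}),\\
\sigma(s,\mathbf{p})&=\sum_{i=1}^{N}\frac{1}{2}\left(a_{i}^{+}(\omega_{i},\mathbf{p})-a_{i}^{-}(-\omega_{i},\mathbf{p})\right)\delta(s-\omega_{i}^{2}).
\end{aligned}$$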

It is easy to check formula (2.17) in this general case.
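The equivalence between (3.2) and the spectral form can be verified term by term. The sketch below assumes that the representation reads $\tilde f=i\int ds\,(\rho+p^{0}\sigma)/((p^{0})^{2}-s+i\epsilon)$, which is the form implied by (3.2) and by the densities written above, and treats $a_{i}^{\pm}$ as plain numbers instead of matrices. The function names and the numerical values are illustrative.

```python
def propagator_32(p0, omega, a_plus, a_minus, eps):
    """One term of formula (3.2); a_plus and a_minus stand for
    a_i^+(omega_i, p) and a_i^-(-omega_i, p), taken here as plain numbers."""
    numerator = a_plus * (p0 + omega) - a_minus * (p0 - omega)
    return 0.5j * numerator / (p0**2 - omega**2 + 1j * eps)

def propagator_kl(p0, omega, a_plus, a_minus, eps):
    """Same term written with the weights of rho and sigma at s = omega^2."""
    rho = 0.5 * omega * (a_plus + a_minus)
    sigma = 0.5 * (a_plus - a_minus)
    return 1j * (rho + p0 * sigma) / (p0**2 - omega**2 + 1j * eps)

# Illustrative numbers only.
p0, omega, eps = 0.9, 1.4, 1e-3
generic = abs(propagator_32(p0, omega, 0.8, 0.3, eps)
              - propagator_kl(p0, omega, 0.8, 0.3, eps))

# Scalar-like case a_plus = a_minus = 1/omega:
# the term reduces to i / ((p0)^2 - omega^2 + i eps).
scalar = abs(propagator_32(p0, omega, 1 / omega, 1 / omega, eps)
             - 1j / (p0**2 - omega**2 + 1j * eps))
print(generic, scalar)
```

Both differences vanish up to rounding. In the second case the coefficient of $p^{0}$ drops out and the term has the structure of an ordinary scalar propagator.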

The representation (3.5) has a form similar to the one known from Lorentz violating theories [13]. When Lorentz symmetry holds, the densities $\rho$ and $\sigma$ vanish for $s<\mathbf{p}^{2}$, so after a translation $s\rightarrow s+\mathbf{p}^{2}$ the representation acquires a more common form, that is to say (2.15) with $\rho(s)$ replaced by the sum $\rho+p^{0}\sigma$. Lorentz invariance also constrains the form of this sum and further relates the functions $\rho$ and $\sigma$.

3.2 Examples

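Most bosons have $a^{+}(\omega,\mathbf{p})=a^{-}(-\omega,\mathbf{p})$, so the coefficient of $p^{0}$ in the numerator of (3.5) vanishes. This gives

$$\tilde f(p)=\frac{i\omega\,a^{+}(\omega,\mathbf{p})}{(p^{0})^{2}-\omega^{2}+i\epsilon}. \tag{3.6}$$

Lorentz invariant scalars have $\omega=\sqrt{\mathbf{p}^{2}+m^{2}}$, $a^{+}=1/\omega$.

Examples where the coefficient of $p^{0}$ does not vanish are the Chern-Simons gauge fields and the fermions. In particular, for free Dirac fermions one finds

$$\tilde f(p)\gamma^{0}=i\frac{p_{\mu}\gamma^{\mu}+m}{p^{2}-m^{2}+i\epsilon}.$$

Interacting fermions in Lorentz invariant theories have

$$\tilde f(p)\gamma^{0}=i\int_{0}^{+\infty}\frac{\rho(s)+p_{\mu}\gamma^{\mu}\sigma(s)}{p^{2}-s+i\epsilon}\,ds.$$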

Let us now consider gauge fields. If we choose a covariant gauge the pseudounitarity equation exists only when the propagators have the form

$$\tilde f_{\mu_{1}\cdots\mu_{n},\nu_{1}\cdots\nu_{n}}(p)=\frac{i\,I_{\mu_{1}\cdots\mu_{n},\nu_{1}\cdots\nu_{n}}}{p^{2}+i\epsilon}, \tag{3.7}$$

where $I$ is a constant tensor built with the metric $\eta_{\mu\nu}$. In other words, no covariant gauges besides the Feynman ones satisfy the assumptions. The common Lorenz gauge for vector fields, which gives the propagator

$$-\frac{i}{p^{2}}\left(\eta_{\mu\nu}-(1-\lambda)\frac{p_{\mu}p_{\nu}}{p^{2}}\right), \tag{3.8}$$

does not lead to the cutting equations (2.14) when the gauge-fixing parameter $\lambda$ is different from $1$, because of the double pole. It is possible to deform (3.8) by introducing fictitious masses that split the double pole into simple poles, but it is not easy to study the limit where the fictitious masses are removed in the cutting equations.
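To see why the removal of the fictitious masses is delicate, one can look at a generic partial-fraction splitting of a double pole. This is only an illustration of the mechanism under our own assumptions, not the specific deformation of (3.8): with $x=p^{2}$ and a hypothetical squared mass `mu2`, the product $1/(x(x-\mu^{2}))$ is a difference of two simple poles whose residues $\pm1/\mu^{2}$ diverge when $\mu^{2}\rightarrow0$.

```python
from fractions import Fraction

def double_pole(x):
    """The 1/(p^2)^2 structure of the Lorenz-gauge propagator, with x = p^2."""
    return 1 / (x * x)

def split_poles(x, mu2):
    """Two simple poles at x = 0 and x = mu2, with residues -1/mu2 and +1/mu2."""
    return (1 / (x - mu2) - 1 / x) / mu2

x = Fraction(3)
for mu2 in (Fraction(1, 10), Fraction(1, 1000)):
    # Exact identity: the split form equals 1 / (x * (x - mu2)).
    assert split_poles(x, mu2) == 1 / (x * (x - mu2))
    # Distance from the double pole and size of the residues.
    print(mu2, float(split_poles(x, mu2) - double_pole(x)), float(1 / mu2))
```

Away from the poles the split form approaches the double pole, but each simple pole, taken separately, has a residue that grows without bound and the two residues have opposite signs. The individual cut contributions are therefore large and cancel only in the sum.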

We see that, in the end, the conditions imposed by the very existence of the cutting equations and the requirement that they lead to the pseudounitarity and unitarity equations are very restrictive.

The largest time equation (2.7), the cutting equations (2.14), the pseudounitarity equation (3.3) and the unitarity equation (3.4) also hold when the external legs of the diagrams correspond to the insertions of local composite fields. Indeed, it is easy to check that the arguments that lead to those equations remain valid. More generally, the equations still hold when the external legs include both elementary fields and local composite fields.

3.3 Infrared divergences and other singularities

The uncut diagrams that appear on the left-hand side of the cutting equations (2.14) are regular off shell. However, the individual diagrams that appear on the right-hand side have cuts, which are necessarily on shell. In the presence of massless particles there can be infrared divergences. For example, consider the sum

[Figure (3.9): the sum of three cut diagrams in QED discussed below.]

in QED. The first two diagrams contain the infrared divergences of the one-loop radiative corrections to the vertex. However, the third diagram is also infrared divergent and compensates the divergences of the other two.

The cancellation of the infrared divergences on the right-hand side of equation (2.14) (when the external legs are off shell) is a well-known fact [14], so we do not need to spend more words on it. At the same time, for various arguments of the next sections we need to deal with cut diagrams that are individually infrared convergent. This can be achieved by inserting fictitious masses in the propagators. It is possible to do so without violating the assumptions we have made so far. However, the fictitious masses violate gauge invariance and it is necessary to remove them with care to successfully prove the perturbative unitarity of gauge theories.

Other singularities occur when self-energy subdiagrams are present. For example, the product of a cut propagator times an unshadowed propagator with the same momentum is equal to

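\begin{equation}
\frac{i\,(2\pi)\theta(p^0)\delta(p^2-m^2)}{p^2-m^2+i\epsilon}=\frac{2\pi}{\epsilon}\,\theta(p^0)\delta(p^2-m^2), \tag{3.10}
\end{equation}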

in the case of ordinary scalar fields. On the other hand, the product of an unshadowed propagator times a shadowed one with the same momentum is

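\begin{equation}
\frac{i}{p^2-m^2+i\epsilon}\,\frac{-i}{p^2-m^2-i\epsilon}=\frac{1}{(p^2-m^2+i\epsilon)^2}+\frac{2\pi}{\epsilon}\,\delta(p^2-m^2), \tag{3.11}
\end{equation}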

where we have used (2.18).
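The manipulation behind (3.11) can be sketched as follows. The sketch assumes that (2.18), which is not reproduced in this section, has the form $\frac{1}{x-i\epsilon}=\frac{1}{x+i\epsilon}+2\pi i\,\delta(x)$ with the shorthand $x\equiv p^2-m^2$ (introduced here only for brevity), and that the delta function is treated as sharply supported at $x=0$ while $\epsilon$ is kept different from zero elsewhere, as in (3.10):

```latex
\begin{align*}
\frac{i}{x+i\epsilon}\,\frac{-i}{x-i\epsilon}
 &= \frac{1}{x+i\epsilon}\left[\frac{1}{x+i\epsilon}+2\pi i\,\delta(x)\right]
 && \text{(assumed form of (2.18))}\\
 &= \frac{1}{(x+i\epsilon)^2}+\frac{2\pi i\,\delta(x)}{x+i\epsilon}\\
 &= \frac{1}{(x+i\epsilon)^2}+\frac{2\pi}{\epsilon}\,\delta(x)
 && \text{(setting } x=0 \text{ in the factor multiplying } \delta(x)\text{)}.
\end{align*}
```

The same last step, applied to a single unshadowed propagator, reproduces (3.10). In both formulas the term proportional to $1/\epsilon$ is the one that has to cancel in the sum of the cut diagrams.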

Again, the left-hand sides of the cutting equations are smooth, while the individual diagrams on the right-hand side may have singularities for $\epsilon\to 0$ that cancel out in the sum. The cancellation can be seen by keeping the width $\epsilon$ different from zero and taking the limit $\epsilon\to 0$ only at the very end.

For example, consider the bubble diagram where one propagator is replaced by the one-loop self-energy. The right-hand side of the cutting equation is equal to minus the sum

 (3.12)

where we have assumed, for definiteness, that the energy flows in from the left. Using (2.19), (3.10) and (3.11), it is easy to check that the sum

 (3.13)

(with propagators on the external legs) is equal to

 [diagram c5.eps] $\times\,\dfrac{1}{(p^2-m^2+i\epsilon)^2}$,

which is regular when $\epsilon\to 0$. Thus, (3.12) is also regular.

4 The special gauge

We have seen that the only covariant gauge that leads to the pseudounitarity equation is the Feynman gauge, which corresponds to formula (3.8) with $\lambda=1$. However, the Feynman gauge has ghosts, that is to say the matrix of formula (3.3) has negative entries.

The Feynman gauges do not make unitarity manifest. Actually, all the propagators (3.7) for have ghosts. It is hard to prove that the ghosts compensate each other in the Feynman gauge, although not impossible [7, 8]. Here we prefer to follow a different strategy, which amounts to proving perturbative unitarity in gauge theories and gravity by working in a new, noncovariant gauge that satisfies all the requirements we have outlined and interpolates between the Feynman gauge and the Coulomb gauge.

We call the new gauge “special”, because of its properties. In this section we build the special gauge in Abelian and non-Abelian gauge theories, while in section 7 we build it in quantum gravity. We work in arbitrary dimensions $d$ greater than 3. The gauge group indices are understood in the formulas written below.

[Footnote 3: The case $d=3$, which we do not treat here in detail, can be studied by including the Chern-Simons term, to avoid the infrared problems due to the superrenormalizability of the gauge coupling.]

Consider the gauge-fixed Lagrangian

\begin{equation}
\mathcal{L}_{\rm YM}=-\frac{1}{4}F_{\mu\nu}F^{\mu\nu}-\frac{1}{2\lambda}G^{2}(A)-\bar{C}\,G(DC), \tag{4.1}
\end{equation}

where $G(A)$ is the gauge choice, which is assumed to be linear in $A$, while $D$ denotes the covariant derivative. In $d>4$ the theory is nonrenormalizable, so we should include infinitely many corrections of higher dimensions, which are optional in $d=4$. We do not write them explicitly, because, for our purposes, it is sufficient to assume that they are perturbative, local and Hermitian. We also omit the matter contributions, which are not important for the moment.

Now, take

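\begin{equation}
G(A)=\zeta\,\partial_{0}A^{0}+\nabla\cdot\mathbf{A}, \tag{4.2}
\end{equation}

where $\zeta$ is another gauge-fixing parameter. For $\zeta=1,0$ we have the Lorenz and Coulomb gauges, respectively. The gauge-field propagators are

\begin{align}
\langle A^{0}(k)A^{0}(-k)\rangle_{0} &= -\frac{i(\lambda E^{2}-\mathbf{k}^{2})}{(\zeta E^{2}-\mathbf{k}^{2})^{2}},\qquad
\langle A^{i}(k)A^{0}(-k)\rangle_{0}=\frac{i(\zeta-\lambda)k^{i}E}{(\zeta E^{2}-\mathbf{k}^{2})^{2}}, \nonumber\\
\langle A^{i}(k)A^{j}(-k)\rangle_{0} &= \frac{i\Pi^{ij}}{E^{2}-\mathbf{k}^{2}}+\frac{i(\zeta^{2}E^{2}-\lambda\mathbf{k}^{2})}{(\zeta E^{2}-\mathbf{k}^{2})^{2}}\,\frac{k^{i}k^{j}}{\mathbf{k}^{2}}, \tag{4.3}
\end{align}

where the subscript $0$ denotes the free-field limit of the average and

\begin{equation}
\Pi^{ij}=\delta^{ij}-\frac{k^{i}k^{j}}{\mathbf{k}^{2}} \tag{4.4}
\end{equation}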

is the projector onto the transversal components of the gauge field. The ghost propagator is

\begin{equation}
\langle C(k)\bar{C}(-k)\rangle_{0}=\frac{i}{\zeta E^{2}-\mathbf{k}^{2}}. \tag{4.5}
\end{equation}

The propagators just listed do not satisfy the assumptions required by the pseudounitarity equation for generic values of $\lambda$ and $\zeta$, because they have double poles. The special gauge is defined as the one with $\zeta=\lambda$, where the double poles disappear. We obtain

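\begin{align}
\langle A^{0}(k)A^{0}(-k)\rangle_{0} &= -\frac{i}{\lambda E^{2}-\mathbf{k}^{2}+i\epsilon}=-\langle C(k)\bar{C}(-k)\rangle_{0},\qquad
\langle A^{i}(k)A^{0}(-k)\rangle_{0}=0, \nonumber\\
\langle A^{i}(k)A^{j}(-k)\rangle_{0} &= \frac{i\Pi^{ij}}{E^{2}-\mathbf{k}^{2}+i\epsilon}+\frac{i\lambda}{\lambda E^{2}-\mathbf{k}^{2}+i\epsilon}\,\frac{k^{i}k^{j}}{\mathbf{k}^{2}}, \tag{4.6}
\end{align}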

where we have inserted the contour prescriptions, which are now straightforward. Note that formulas (4.6) have good power counting behaviors. In particular, the denominators $\mathbf{k}^{2}$ cancel out in the sum. The KL spectral representation (3.5) is satisfied, although the densities are not positive definite.
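The two statements above (the reduction of (4.3) to (4.6) at $\zeta=\lambda$ and the cancellation of the $1/\mathbf{k}^2$ denominators) can be spot-checked with exact rational arithmetic. The following is only a minimal sketch under stated assumptions: three space dimensions, an arbitrary off-shell sample point, the overall factor $i$ and the $i\epsilon$ prescriptions dropped. All function names are hypothetical and introduced for this illustration.

```python
from fractions import Fraction as F

def delta(i, j):
    return F(1) if i == j else F(0)

def general_gauge(E, k, lam, zeta):
    """Propagators (4.3) without the overall factor i (off shell, no i*epsilon)."""
    k2 = sum(c * c for c in k)
    den = (zeta * E**2 - k2) ** 2
    a00 = -(lam * E**2 - k2) / den
    ai0 = [(zeta - lam) * k[i] * E / den for i in range(3)]
    aij = [[(delta(i, j) - k[i] * k[j] / k2) / (E**2 - k2)
            + (zeta**2 * E**2 - lam * k2) / den * k[i] * k[j] / k2
            for j in range(3)] for i in range(3)]
    return a00, ai0, aij

def special_gauge(E, k, lam):
    """Propagators (4.6), same conventions as above."""
    k2 = sum(c * c for c in k)
    a00 = -1 / (lam * E**2 - k2)
    ai0 = [F(0)] * 3
    aij = [[(delta(i, j) - k[i] * k[j] / k2) / (E**2 - k2)
            + lam / (lam * E**2 - k2) * k[i] * k[j] / k2
            for j in range(3)] for i in range(3)]
    return a00, ai0, aij

def special_gauge_no_projector(E, k, lam):
    """<A^i A^j> of (4.6) rewritten so that no 1/k^2 denominator appears."""
    k2 = sum(c * c for c in k)
    d1 = E**2 - k2
    dlam = lam * E**2 - k2
    return [[delta(i, j) / d1 + (1 - lam) * k[i] * k[j] / (d1 * dlam)
             for j in range(3)] for i in range(3)]

# Arbitrary off-shell sample point (hypothetical values, exact rationals).
E, lam = F(3, 2), F(7, 4)
k = [F(1, 3), F(2), F(-1, 5)]

same_as_46 = general_gauge(E, k, lam, lam) == special_gauge(E, k, lam)
k2_cancels = special_gauge(E, k, lam)[2] == special_gauge_no_projector(E, k, lam)
print(same_as_46, k2_cancels)
```

The last helper makes the cancellation explicit: the spatial propagator of (4.6) equals $i\delta^{ij}/(E^2-\mathbf{k}^2)+i(1-\lambda)k^ik^j/[(E^2-\mathbf{k}^2)(\lambda E^2-\mathbf{k}^2)]$ up to the contour prescriptions.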

The limit $\lambda\to 0$ takes us to Coulomb gauge, actually the Landau limit of the Coulomb gauge. There, the assumptions we have made do not hold, because, for example, $\langle A^{0}(k)A^{0}(-k)\rangle_{0}$ becomes independent of the energy, that is to say proportional to $\delta(t)$ in coordinate space, which violates (2.5). In the next section we show that in QED there is a way to circumvent this difficulty and work directly at $\lambda=0$. Instead, in non-Abelian gauge theories (and a fortiori gravity) it is necessary to work at $\lambda\neq 0$.

To deal with the infrared divergences, we need to introduce an infrared cutoff and remove it later. This can be done in various ways. We describe two methods that are good for our purposes, a simpler one and a more involved one. The simpler method works well in renormalizable theories, the more involved one is designed to work in nonrenormalizable theories.

4.1 Renormalizable theories

For the moment, we concentrate on renormalizable Abelian and non-Abelian gauge theories, possibly coupled to matter. There, it is sufficient to replace the propagators (4.6) with

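\begin{align}
\langle A^{0}(k)A^{0}(-k)\rangle_{0} &= -\frac{i}{\lambda E^{2}-\mathbf{k}^{2}-m_{\gamma}^{2}+i\epsilon}=-\langle C(k)\bar{C}(-k)\rangle_{0},\qquad
\langle A^{i}(k)A^{0}(-k)\rangle_{0}=0, \nonumber\\
\langle A^{i}(k)A^{j}(-k)\rangle_{0} &= \frac{i\Pi^{ij}}{E^{2}-\mathbf{k}^{2}-m_{\gamma}^{2}+i\epsilon}+\frac{i\lambda}{\lambda E^{2}-\mathbf{k}^{2}-m_{\gamma}^{2}+i\epsilon}\,\frac{k^{i}k^{j}}{\mathbf{k}^{2}}. \tag{4.7}
\end{align}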

The Lagrangian that leads to the propagators (4.7) is equal to (4.1) plus the mass terms

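\begin{equation}
\mathcal{L}_{m_{\gamma}}=\frac{m_{\gamma}^{2}}{2}A_{\mu}A^{\mu}+\frac{m_{\gamma}^{2}}{2\lambda}(1-\lambda)(\nabla\cdot\mathbf{A})\frac{1}{\Delta}(\nabla\cdot\mathbf{A})-m_{\gamma}^{2}\bar{C}C, \tag{4.8}
\end{equation}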

and is nonlocal in space.

We must show that the regularization defined in subsection 2.1 is well defined in the special gauge, because the denominator of the projector does not have the form specified in formula (2.1). We must also pay attention to the contact terms, because the procedure of subsection 2.3 to deal with them does depend on (2.1).

For definiteness, we call an expression regular if it just involves denominators equal to products of polynomials with and . We call any other expression irregular. For example, the projector is irregular, although it has good infrared and ultraviolet behaviors.

Let us point out a few obvious facts. The propagators of the bosonic fields and those of the Faddeev-Popov ghosts decrease like $1/E^{2}$ at large energies. Instead, the fermionic propagators decrease like $1/E$. Call the vertices that carry at least three legs “proper”. In renormalizable Abelian and non-Abelian gauge theories coupled to matter the proper vertices with no fermionic legs contain at most one time derivative, while the proper vertices involving fermionic legs have no time derivatives. Finally, the irregular contributions to $\langle A^{i}(k)A^{j}(-k)\rangle_{0}$ read

\begin{equation}
\frac{i(1-\lambda)m_{\gamma}^{2}}{(\lambda E^{2}-\mathbf{k}^{2}-m_{\gamma}^{2}+i\epsilon)(E^{2}-\mathbf{k}^{2}-m_{\gamma}^{2}+i\epsilon)}\,\frac{k^{i}k^{j}}{\mathbf{k}^{2}} \tag{4.9}
\end{equation}

and behave like $1/E^{4}$ at large energies.
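As a cross-check of (4.9), the spatial propagator of (4.7) can be split into regular and irregular parts by elementary algebra. The shorthands $D_1$ and $D_\lambda$ are introduced here only for this sketch, and the $i\epsilon$ prescriptions are left implicit:

```latex
\begin{align*}
D_1 &\equiv E^2-\mathbf{k}^2-m_\gamma^2, \qquad
D_\lambda \equiv \lambda E^2-\mathbf{k}^2-m_\gamma^2, \qquad
\lambda D_1-D_\lambda=(1-\lambda)(\mathbf{k}^2+m_\gamma^2),\\
\langle A^i(k)A^j(-k)\rangle_0
 &= \frac{i\delta^{ij}}{D_1}
   +i\left(\frac{\lambda}{D_\lambda}-\frac{1}{D_1}\right)\frac{k^ik^j}{\mathbf{k}^2}\\
 &= \frac{i\delta^{ij}}{D_1}
   +\frac{i(1-\lambda)\,k^ik^j}{D_1D_\lambda}
   +\frac{i(1-\lambda)\,m_\gamma^2}{D_\lambda D_1}\,\frac{k^ik^j}{\mathbf{k}^2}.
\end{align*}
```

The first two terms only involve polynomial denominators, while the last term is (4.9). It vanishes at $m_\gamma=0$, in agreement with the cancellation of the denominators $\mathbf{k}^2$ in (4.6).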

We treat the quadratic counterterms as two-leg vertices. To minimize the number of time derivatives acting on the same propagator in coordinate space, the kinetic counterterms are assumed to have the forms for bosons and for fermions.

Now, consider a Feynman diagram. The power of the energy brought by the vertices of each loop is at most equal to the number of proper vertices with no fermionic legs, plus twice the number of bosonic quadratic counterterms, plus the number of fermionic quadratic counterterms. Then, if we ignore the tadpoles for a moment, every energy integral is convergent (before integrating on the space momenta) and the multiple energy integrals are overall convergent.

The tadpoles can be treated apart. The fermionic tadpole is straightforward, because it does not involve the projector $\Pi^{ij}$. The bosonic tadpole involves it by means of (4.9), which contributes to a convergent energy integral. Moreover, adopting the prescription of symmetric integration, the energy integrals are convergent in both types of tadpoles.

Thus, the regularization of subsection 2.1 is well defined. The integrals on the energy and those on the space momenta can be freely interchanged.

The same arguments prove that the irregular contributions (4.9) to the propagators cannot generate contact terms. Indeed, the vertices cannot provide enough powers of the energy to compensate the $E^{4}$ appearing in the denominator of (4.9). Thus, the contact terms can only come from the regular terms and can be treated as explained in subsection 2.3.

The locality of counterterms is usually proved by differentiating the Feynman diagrams with respect to the external energies and space momenta, then showing that a sufficient number of such derivatives makes the integrals overall convergent [15]. This strategy works when the integrands are regular. Instead, the derivatives of $\Pi^{ij}$ with respect to the components of $\mathbf{k}$ just improve the ultraviolet behavior of the integral on $\mathbf{k}$, but do not improve the behavior of the integral on the energy $E$. Nevertheless, we have shown that all the integrals on the energies are convergent by themselves, so their ultraviolet behaviors do not need to be improved. For this reason, a sufficient number of derivatives with respect to the external energies and space momenta does make a diagram overall convergent. Once the diagram is equipped with the counterterms that subtract its subdivergences, the same operation makes the sum fully convergent. It is easy to check that all the regions of integration are properly subtracted. This proves the locality of counterterms. For similar reasons, it is straightforward to prove that the counterterms are polynomial in $m_{\gamma}$.

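In the end, the renormalization in the special gauge is straightforward. The renormalized Lagrangian coincides with the one at $m_{\gamma}=0$ plus the counterterms

\begin{equation*}
\Delta\mathcal{L}_{m_{\gamma}}=\frac{m_{\gamma}^{2}}{2}\Delta Z_{0}\,(A^{0})^{2}-\frac{m_{\gamma}^{2}}{2}\Delta Z_{s}\,(A^{i})^{2}-m_{\gamma}^{2}\Delta Z_{g}\,\bar{C}C,
\end{equation*}

where $\Delta Z_{0}$, $\Delta Z_{s}$ and $\Delta Z_{g}$ are divergent constants.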

The other requirements of the previous sections are fulfilled, before and after renormalization. This ensures that the largest time equation, the cutting equations and the pseudounitarity equation hold in the special gauge for arbitrary in , if the theory is renormalizable by power counting.

4.2 Nonrenormalizable theories

The construction just given is sufficient for renormalizable gauge theories, such as the standard model in flat space. In view of the generalization to quantum gravity, we explain how to adapt the special gauge to nonrenormalizable gauge theories in arbitrary dimensions $d>3$.

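We introduce two fictitious masses, $m_{\gamma}$ and $\mu$, which play different roles. Define

\begin{equation*}
P_{\lambda,\theta,\eta}\equiv\frac{1}{\lambda E^{2}-\theta\mathbf{k}^{2}-\eta\mu^{2}-m_{\gamma}^{2}+i\epsilon}
\end{equation*}

and replace the propagator $\langle A^{i}(k)A^{j}(-k)\rangle_{0}$ of (4.7) by

\begin{equation}
\langle A^{i}(k)A^{j}(-k)\rangle_{0}=iP_{1,1,1}\,\delta^{ij}+i\left(Q_{N}(\lambda,r)-P_{1,1,1}\right)\frac{k^{i}k^{j}}{\mathbf{k}^{2}+\mu^{2}}, \tag{4.10}
\end{equation}

where

\begin{equation}
Q_{N}(\lambda,r)\equiv\lambda\sum_{n=0}^{N}m_{\gamma}^{2n}(\lambda-1)^{n}\prod_{q=0}^{n}P_{\lambda,r_{q},r_{q}}, \tag{4.11}
\end{equation}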

and are positive constants such that for , with . For example, we can choose .

Before explaining where the idea for the replacement (4.10) comes from, we give its key properties, in connection with unitarity and renormalization.

It is easy to prove the identity

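\begin{equation}
Q_{N}(\lambda,r)=\sum_{n=0}^{N}P_{\lambda,r_{n},r_{n}}\sum_{q=n}^{N}\left(\frac{m_{\gamma}^{2}}{\mathbf{k}^{2}+\mu^{2}}\right)^{q}c_{nq}(\lambda), \tag{4.12}
\end{equation}

where $c_{nq}(\lambda)$ are polynomials of $\lambda$. We see that $\mu$ regulates the infrared divergences that would appear in the individual terms on the right-hand side of this formula at $\mu=0$.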
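For illustration, the case $N=1$ of (4.12) can be worked out explicitly. This is only a sketch: it assumes $r_0\neq r_1$, leaves the $i\epsilon$ implicit, and reads off the coefficients $c_{00}$, $c_{01}$, $c_{11}$ by matching with (4.12). The only ingredient is a partial fraction of the product of two propagators:

```latex
\begin{align*}
\frac{1}{P_{\lambda,r_1,r_1}}-\frac{1}{P_{\lambda,r_0,r_0}} &= (r_0-r_1)(\mathbf{k}^2+\mu^2)
\quad\Longrightarrow\quad
P_{\lambda,r_0,r_0}P_{\lambda,r_1,r_1}
 =\frac{P_{\lambda,r_0,r_0}-P_{\lambda,r_1,r_1}}{(r_0-r_1)(\mathbf{k}^2+\mu^2)},\\
Q_1(\lambda,r) &= \lambda P_{\lambda,r_0,r_0}
 +\lambda(\lambda-1)\,m_\gamma^2\,P_{\lambda,r_0,r_0}P_{\lambda,r_1,r_1}\\
 &= P_{\lambda,r_0,r_0}\left[\lambda
   +\frac{\lambda(\lambda-1)}{r_0-r_1}\,\frac{m_\gamma^2}{\mathbf{k}^2+\mu^2}\right]
   -P_{\lambda,r_1,r_1}\,\frac{\lambda(\lambda-1)}{r_0-r_1}\,\frac{m_\gamma^2}{\mathbf{k}^2+\mu^2},\\
c_{00} &= \lambda,\qquad c_{01}=-c_{11}=\frac{\lambda(\lambda-1)}{r_0-r_1}.
\end{align*}
```

In this example each term of the last expression contains a factor $1/(\mathbf{k}^2+\mu^2)$, which would be infrared singular at $\mu=0$, although the product form of $Q_1$ has no such denominator. The partial fraction requires $r_0\neq r_1$, which is also the condition that avoids a double pole in the product.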

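The irregular term inside $\langle A^{i}(k)A^{j}(-k)\rangle_{0}$ is multiplied by the difference $Q_{N}(\lambda,r)-P_{1,1,1}$, which satisfies the property

\begin{equation}
\frac{ik^{i}k^{j}}{\mathbf{k}^{2}+\mu^{2}}\left(Q_{N}(\lambda,r)-P_{1,1,1}\right)=-\frac{ik^{i}k^{j}}{\mathbf{k}^{2}+\mu^{2}}\,\frac{(\lambda-1)^{N+1}m_{\gamma}^{2N+2}}{E^{2}-\mathbf{k}^{2}-\mu^{2}-m_{\gamma}^{2}+i\epsilon}\prod_{n=0}^{N}P_{\lambda,r_{n},r_{n}}+\text{regular terms}. \tag{4.13}
\end{equation}

To derive this formula, note that if we set $r_{q}=1$ for every $q$ and $N=\infty$, the function $Q_{N}(\lambda,r)$ resums into

\begin{equation}
Q_{\infty}(\lambda)\equiv\lambda\sum_{n=0}^{\infty}m_{\gamma}^{2n}(\lambda-1)^{n}\prod_{q=0}^{n}P_{\lambda,1,1}=\frac{\lambda}{\lambda E^{2}-\mathbf{k}^{2}-\mu^{2}-\lambda m_{\gamma}^{2}+i\epsilon}. \tag{4.14}
\end{equation}

Replacing $Q_{N}(\lambda,r)$ by $Q_{\infty}(\lambda)$ in (4.10), we obtain

\begin{equation}
\langle A^{i}(k)A^{j}(-k)\rangle_{0}=iP_{1,1,1}\,\delta^{ij}+\frac{i(1-\lambda)\,k^{i}k^{j}}{(E^{2}-\mathbf{k}^{2}-\mu^{2}-m_{\gamma}^{2}+i\epsilon)(\lambda E^{2}-\mathbf{k}^{2}-\mu^{2}-\lambda m_{\gamma}^{2}+i\epsilon)}. \tag{4.15}
\end{equation}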

Then, there are no irregular terms, the regularization of subsection 2.1 is well defined, the contact terms are under control by means of the procedure of subsection 2.3 and the locality of counterterms is obvious. The point is that the arguments of section 6 about unitarity do not work well with the choice (4.15), because (4.14) shows that the squared mass gets multiplied by in some cuts, which invalidates the inequality (6.3) at .

Nonetheless, the resummation (4.14) gives us the inspiration for the replacement (4.10). Indeed, truncate the sum of (4.14) to $n\leq N$ and replace the coefficients of $\mathbf{k}^{2}+\mu^{2}$ in the denominators with arbitrary numbers $r_{q}$, so as to obtain (4.11). It is clear that these operations lead to the behavior (4.13). In particular, the variations of the coefficients just affect the regular terms. The role of those coefficients is to make sure that there are no double poles.
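The algebra behind (4.13), (4.14) and (4.15) can be spot-checked with exact rational arithmetic in the simplest case where all the coefficients are equal to one. The following is a minimal sketch, not part of the proof: the symbol `K` is a shorthand for $\mathbf{k}^2+\mu^2$, the $i\epsilon$ is dropped, the sample point is arbitrary and off shell, and all names are hypothetical. With equal coefficients the products have multiple poles, so the check illustrates only the algebraic structure, not the final choice of (4.11).

```python
from fractions import Fraction as F

def P(E2, K, m2, lam, r):
    """P_{lam,r,r} with K = k^2 + mu^2 (off shell, i*epsilon dropped)."""
    return 1 / (lam * E2 - r * K - m2)

def Q_N(E2, K, m2, lam, rs):
    """Q_N(lam, r) of (4.11), with rs = [r_0, ..., r_N]."""
    total, prod = F(0), F(1)
    for n, r in enumerate(rs):
        prod *= P(E2, K, m2, lam, r)            # product over q = 0..n
        total += (m2 * (lam - 1)) ** n * prod
    return lam * total

def Q_inf(E2, K, m2, lam):
    """Closed form on the right-hand side of (4.14)."""
    return lam / (lam * E2 - K - lam * m2)

# Arbitrary sample point (hypothetical values; Fractions keep the check exact).
E2, K, m2, lam, N = F(5), F(7, 3), F(1, 7), F(3, 2), 4

p_lam = P(E2, K, m2, lam, F(1))                 # P_{lam,1,1}
p_one = P(E2, K, m2, F(1), F(1))                # P_{1,1,1}
y = (m2 * (lam - 1) * p_lam) ** (N + 1)         # (lam-1)^{N+1} m^{2N+2} P^{N+1}
q_n = Q_N(E2, K, m2, lam, [F(1)] * (N + 1))
q_inf = Q_inf(E2, K, m2, lam)

# Piece of Q_inf - P_{1,1,1}: proportional to K, hence regular in (4.10).
regular_part = (1 - lam) * K * p_one * q_inf / lam

truncation_ok = q_n == q_inf * (1 - y)          # truncated geometric sum
eq_415_ok = q_inf - p_one == regular_part       # leads to (4.15)
eq_413_ok = q_n - p_one == regular_part * (1 - y) - y * p_one
print(truncation_ok, eq_415_ok, eq_413_ok)
```

In the last identity the term `- y * p_one` is the one that keeps the factor $1/(\mathbf{k}^2+\mu^2)$ when inserted in (4.10). It corresponds to the irregular term written explicitly in (4.13), specialized to $r_q=1$.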

Now we use (4.13) to prove that the modification (4.10) has the properties we need, that is to say the regularization of subsection 2.1 is well defined, the contact terms are under control and the locality of counterterms holds. Such properties are obviously satisfied by the regular contributions to the Feynman diagrams, so we can concentrate on the contributions that involve irregular terms.

We recall that we are considering a nonrenormalizable theory, whose Lagrangian contains infinitely many vertices. It is helpful to expand the interaction Lagrangian in powers of the energy and focus on some finite truncation. If, at the same time, we truncate the loop expansion to a finite order, only a finite number of amplitudes, vertices and diagrams are involved in every calculation. So doing, we are able to prove perturbative unitarity within any finite truncation, which is enough to prove perturbative unitarity for the whole theory.

By formula (4.13), there exists an $N$ such that all the irregular contributions to the Feynman diagrams are overall convergent within the truncation. At one loop, the integrals that contain irregular terms are convergent by themselves. At higher orders, they are convergent once the counterterms that subtract the subdivergences (associated with the regular contributions to the subdiagrams) are included. Thus, the regularization of subsection 2.1 is well defined. Since the irregular terms do not contribute to the renormalization of the theory, the locality of counterterms obviously holds. Moreover, formula (4.13) shows that for $N$ large enough the vertices cannot provide enough powers of the energy to match the total powers appearing in the denominators of the irregular terms. This means that the contact terms are local, within the truncation, because they can only be generated by the regular terms. This fact, together with the locality of the vertices, ensures that the procedure of subsection 2.3 to deal with the contact terms is still valid.

The construction also works in the case of renormalizable theories, where it is sufficient to choose .

Formulas (4.12) and (4.10) show that the assumptions that lead to the pseudounitarity equation are satisfied at , , for arbitrary .

Some remarks are in order, about the recovery of gauge invariance and gauge independence when is sent back to zero. Using the Batalin-Vilkovisky formalism [16], gauge invariance is encoded into the antiparentheses , where collects the mass terms, while gauge independence is encoded in the expression

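\begin{equation*}
\frac{\partial S}{\partial\lambda}-(S,\Psi_{\lambda})=\frac{\partial S_{m_{\gamma}}}{\partial\lambda},
\end{equation*}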

where is the derivative of the gauge fermion , which is the local functional that performs the gauge fixing (for a recent reference, with details and the notation, see [17]). The right-hand sides of both equations should vanish, at least when , but they do not if . Their effects on the generating functional of the one-particle irreducible correlation functions are encoded into the averages and , which contain insertions of new vertices besides those of the standard Feynman rules. We want to make sure that the Feynman diagrams that contain such insertions are also well regularized and satisfy the locality of counterterms, and check that their contact terms are still under control.

Write the free massive Lagrangian in compact notation as

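\begin{equation*}
\mathcal{L}_{\rm free}+\mathcal{L}_{m_{\gamma}}=\frac{1}{2}\Phi^{\alpha}Q_{\alpha\beta}\Phi^{\beta},
\end{equation*}

where $\Phi^{\alpha}$ are all the fields (including the ghosts, the antighosts and the Lagrange multipliers for the gauge fixing). We have

\begin{equation*}
\frac{\partial S_{m_{\gamma}}}{\partial\lambda}=\frac{1}{2}\int\Phi^{\alpha}\frac{\partial Q_{\alpha\beta}}{\partial\lambda}\Phi^{\beta}-\frac{\partial}{\partial\lambda}\int\mathcal{L}_{\rm free}.
\end{equation*}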

The last contribution is local. The other term gives , if we include the propagators attached to its legs. The irregular part can then be easily derived from formula (4.13). Again, if is large enough this irregular insertion cannot generate contact terms and every subintegral that contains it is overall convergent.

Similarly,

\begin{equation*}
2(S,S_{m_{\gamma}})=(S,\Phi^{\alpha})Q_{\alpha\beta}\Phi^{\beta}-2(S,S_{\rm free}).
\end{equation*}

The last contribution is local, while the other term becomes local, precisely equal to , once it is inserted in a Feynman diagram and the propagator attached to the right field is included.

5 Proof of unitarity in QED

We are now ready to give the simplest proof of perturbative unitarity in gauge theories, which applies to QED (in ).

First we show that the Feynman diagrams can be calculated directly in the Coulomb gauge, which can be reached as the limit of the special gauge defined in the previous section. The quadratic part of the Lagrangian is singular at