Stabilization of Linear Higher Derivative Gravity with Constraints

Tai-jun Chen, DAMTP, University of Cambridge, Wilberforce Road, CB3 0WA, Cambridge
Eugene A. Lim, Department of Physics, King’s College London, Strand, London, WC2R 2LS
July 26, 2019

We show that the instabilities of higher derivative gravity models with quadratic curvature invariants can be removed by a judicious addition of constraints at the quadratic level of metric fluctuations around a Minkowski or de Sitter background. With a suitable parameter choice, we find that the instabilities of the helicity-0, 1, 2 modes can be removed while reducing the dimensionality of the original phase space. To retain the renormalization properties of higher derivative gravity, Lorentz symmetry in the constrained theory is explicitly broken.

I Introduction

It is well known that non-degenerate higher derivative theories suffer from Ostrogradski’s instability Ostrogradski (1850); Simon (1990); de Urries and Julve (1998); Woodard (2007); Chen et al. (2013). For example, consider an action with a quadratic 2nd order time derivative term


The equation of motion is 4th order, and hence its phase space is 4 dimensional. We can define the two canonical coordinates and their conjugate momenta to be , , and the Hamiltonian is hence


while only appears linearly in the Hamiltonian as . The linearity of the in this term renders the Hamiltonian unbounded from below and the theory is thus unstable. Because of this undesirable property, non-degenerate higher derivative theories are often viewed as taboo and avoided in the literature.
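As a worked illustration of the construction just described, the following sketch assumes the simplest toy Lagrangian $L = \tfrac12 \ddot q^2$. The symbols $q$, $Q_i$, $P_i$ and the normalization are assumptions made for this example and need not match the action used in the text.

```latex
\begin{align}
  L &= \tfrac{1}{2}\,\ddot q^{\,2}
      \quad\Longrightarrow\quad \frac{d^4 q}{dt^4} = 0 , \\
  Q_1 &= q , \qquad Q_2 = \dot q , \\
  P_1 &= \frac{\partial L}{\partial \dot q}
        - \frac{d}{dt}\frac{\partial L}{\partial \ddot q} = -\dddot q ,
  \qquad
  P_2 = \frac{\partial L}{\partial \ddot q} = \ddot q , \\
  H &= P_1 \dot Q_1 + P_2 \dot Q_2 - L
     = P_1 Q_2 + \tfrac{1}{2} P_2^{\,2} .
\end{align}
```

In this toy case $P_1$ enters $H$ only through the term $P_1 Q_2$, so for fixed $Q_2 \neq 0$ the Hamiltonian can be made arbitrarily negative by choosing $P_1$ with the opposite sign.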

There are several classes of higher derivative theories on the market that evade this instability. A higher derivative theory may be degenerate, which means that the theory is constrained. For example, in $f(R)$ gravity Starobinsky (1980); De Felice and Tsujikawa (2010); Sotiriou and Faraoni (2010), the naively unstable degree of freedom is rendered harmless by a gauge constraint. Furthermore, some theories are secretly 2nd order despite the appearance of higher derivative terms in the action, owing to a cancellation of the higher derivative terms in the equation of motion, as seen in the Galileon theory Nicolis et al. (2009); Deffayet et al. (2009); Nicolis et al. (2010).

On the other hand, generic non-degenerate higher derivative theories are inevitably unstable. Theories with curvature invariants such as , Stelle (1977); Whitt (1984); Hindawi et al. (1996a, b); Carroll et al. (2005); Nunez and Solganik (2005); Chiba (2005); Calcagni et al. (2005); Navarro and Van Acoleyen (2006); De Felice et al. (2006), or the Weyl invariant Maldacena (2011); Lu et al. (2012) [Footnote 1: In 4D, the Weyl invariant can be written as because the Gauss-Bonnet term is a total divergence and so does not contribute to the classical equations of motion.] suffer from Ostrogradski’s instability.

One way to deal with the instability is to impose boundary conditions in such a way that the unstable modes vanish. For example, in Maldacena (2011); Park and Sorbo (2013) the modes with the wrong sign of the kinetic terms are “turned off” by imposing suitable boundary conditions. However, this is only valid at the quadratic level. In the presence of higher order interaction terms beyond the quadratic power of the field, the vacuum states will rapidly decay (even classically) into states with positive energy modes and negative energy modes by the entropic argument Kallosh et al. (2008); Eliezer and Woodard (1989); Woodard (2007); Chen et al. (2013). The “removed” instability is thus revived.

We will consider the following action, first investigated by Stelle Stelle (1977) [Footnote 2: Here we have turned on the bare cosmological constant since the theory admits a constant curvature background solution with .]


This action with mass dimension parameters and in general contains eight degrees of freedom Stelle (1978), two of them corresponding to the massless graviton in general relativity, five corresponding to the massive graviton, and the last one is a massive scalar. Among them, the helicity-2 sector is a non-degenerate higher derivative theory and thus suffers from Ostrogradski’s instability.

Nevertheless, this action is interesting as it is power-counting renormalizable Stelle (1977) – the presence of higher derivative terms in the action means that there exist higher spatial derivatives in the propagator of the graviton modes. These spatial derivatives suppress the UV divergences in the loops, rendering the theory naively renormalizable. The price we pay for this is the presence of the higher time derivative terms which leads to Ostrogradski’s instability.
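The trade-off described here can be made explicit with a schematic fourth-order propagator. The sketch below assumes a single scalar-like mode with a hypothetical mass scale $m$ and Euclidean momentum $k$; tensor structure, gauge fixing, and overall normalization are ignored, so it is an illustration rather than the propagator of the theory under study.

```latex
\begin{equation}
  \frac{1}{k^{2}\,(k^{2} + m^{2})}
  \;=\;
  \frac{1}{m^{2}}\left[\,\frac{1}{k^{2}} \;-\; \frac{1}{k^{2} + m^{2}}\,\right] .
\end{equation}
```

The left-hand side falls off as $1/k^4$ at large momentum, which is the origin of the improved UV behaviour. The right-hand side shows the cost: the massive pole enters with the opposite sign to the massless one, which is the propagator-level signature of a ghost.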

One way to take advantage of this insight is to impose different scaling dimensions on the time and space coordinates, a stratagem utilized by Hořava Horava (2009a, b, c); Mukohyama (2010); Sotiriou (2011). The low energy limit of this theory is then a generic 1st order time derivative graviton action with higher order Lorentz violating spatial derivative terms, which is both stable and power-counting renormalizable.

In this paper, we pursue a different tack. We ask whether we can selectively remove the linear instability by imposing constraints on the theory. This idea is motivated by our recent proof Chen et al. (2013) that the linearly unstable phase space can be excised from the theory by a judicious choice of additional constraints (i.e. the final dimensionality of the phase space will be smaller). We will show that, at least in the linear theory, we can add by hand additional constraint terms which render the theory stable while simultaneously preserving its improved renormalization properties. Roughly speaking, we add a constraint where the higher time-like derivative terms in the equation of motion are constrained to some lower time-like derivative or higher order spatial derivative term, i.e.


We will show that the final form of this constrained theory is, at least linearly, that of a second order equation of motion with higher order spatial derivatives, very similar in spirit to the Hořava model. Of course, such addition of constraints changes the general theory. However, as we have worked only in linear theory, we do not know what the non-linear completion of the theory is. We will leave this for future work.

Our strategy is as follows. In section II we show how to perturb the action up to second order in the metric perturbation around a general background, a result which will be used in Minkowski and de Sitter backgrounds. In section III we obtain the action quadratic in the metric fluctuation by parameterizing the metric fluctuation in a Minkowski background. Since the quadratic action separates into helicity-0, 1, 2 sectors, we demonstrate how the instabilities appear in each sector. In section IV, we show how the helicity-0, 1, 2 instabilities can be removed by introducing suitable constraints. We study the instabilities and their removal in a de Sitter background in sections V and VI. We conclude in section VII.

II Higher Derivative Gravity: Quadratic Action

In order to study how the instabilities appear in action (3) at quadratic order in the metric fluctuation, we need to expand every curvature invariant up to second order in the metric perturbation , which is defined by


where at this stage can be a general background metric and Gullu et al. (2010). The inverse metric up to second order in can be written as


Assuming a constant curvature background of either Minkowski ($\Lambda = 0$), de Sitter ($\Lambda > 0$), or anti-de Sitter ($\Lambda < 0$) type, we compute the second order action


where is the d’Alembert operator and the linearized Ricci tensor, Ricci scalar, and Einstein tensor are defined by [Footnote 3: See, for example, Deser and Tekin (2003).]


Note that the indices are raised and lowered by the background metric .
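The second-order expansion of the inverse metric used in this section can be checked numerically. The sketch below assumes the standard form $g^{\mu\nu} = \bar g^{\mu\nu} - h^{\mu\nu} + h^{\mu\rho} h_{\rho}{}^{\nu} + O(h^3)$ with indices raised by the background metric, and it takes a Minkowski background with signature $(-,+,+,+)$ purely for illustration. The notation $\bar g$, $h$ and the entries of `H_SHAPE` are assumptions of this example, not values from the text.

```python
# Numerical check of the second-order inverse-metric expansion,
#   g^{-1} = gbar^{-1} - h^{up} + h^{up} h gbar^{-1} + O(h^3),
# assuming a Minkowski background (illustrative choice, signature -+++).

ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]
ETA_INV = ETA  # the Minkowski metric is its own inverse

# Arbitrary symmetric "shape" of the perturbation (hypothetical numbers).
H_SHAPE = [[1.0, 0.5, 0.0, 0.2],
           [0.5, -0.3, 0.1, 0.0],
           [0.0, 0.1, 0.7, -0.4],
           [0.2, 0.0, -0.4, 0.5]]


def matmul(a, b):
    """Plain matrix product of two square nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def truncation_error(eps):
    """Largest entry of g . g_inv_approx - identity for h = eps * H_SHAPE."""
    h = [[eps * x for x in row] for row in H_SHAPE]
    g = [[ETA[i][j] + h[i][j] for j in range(4)] for i in range(4)]
    h_up = matmul(matmul(ETA_INV, h), ETA_INV)   # h^{mu nu}
    h_up_sq = matmul(matmul(h_up, h), ETA_INV)   # h^{mu rho} h_rho^nu
    g_inv = [[ETA_INV[i][j] - h_up[i][j] + h_up_sq[i][j] for j in range(4)]
             for i in range(4)]
    product = matmul(g, g_inv)
    return max(abs(product[i][j] - (1.0 if i == j else 0.0))
               for i in range(4) for j in range(4))


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, truncation_error(eps))
```

Doubling `eps` should multiply the error by about eight, which is the cubic scaling expected when the expansion is correct through second order.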

III Quadratic action around Minkowski background

In this section we want to study how the instabilities appear in the action at the quadratic level of perturbation around Minkowski background . We parameterize the metric fluctuation by


where is a symmetric, traceless tensor and the indices are raised and lowered by . We can further decompose and into helicity-0, 1, 2 modes,


where and are longitudinal and transverse parts of vector , is transverse, and is symmetric, trace-free and transverse, and the angled bracket indices component


is trace-free. By this decomposition, we can separate the action into helicity-0, 1, 2 sectors, since at the quadratic level there is no mixing between different helicities.

III.1 Helicity-2 sector

The second order action of helicity-2 modes is


which describes two massless helicity-2 degrees of freedom originating from the massless graviton and two massive helicity-2 degrees of freedom coming from the quadratic invariant term . Since there is no first class (i.e. gauge) constraint in the helicity-2 modes, there are four helicity-2 degrees of freedom in the theory. Notice that only the term enters in this expression.

Ostrogradski’s choice of canonical coordinates is the pair of canonical variables and defined by


One might notice that in Ostrogradski’s formalism the two canonical variables , have different dimensions: the field is dimensionless while has mass dimension 1, and thus the dimensions of the canonical momenta also differ. The dimensionality is not particularly important; in principle one can rescale to put the two canonical variables on the same footing.

Using the Legendre transform, we construct the Hamiltonian as usual


It is easy to check that the Hamiltonian (III.1) generates the equations of motion for the 4 canonical variables via the Poisson Bracket . The important point here is that the Hamiltonian is linearly dependent on in the second term and hence the Hamiltonian is unbounded from below – the term can be arbitrarily negative when , or vice versa.

This instability is often called a “ghost”, i.e. a dynamical degree of freedom with the wrong sign kinetic term. To see this, we can explicitly diagonalize the Hamiltonian by the following canonical transformation


and the Hamiltonian thus becomes


where the pair is ghostlike. In the classical theory, the unboundedness of the Hamiltonian leads to instabilities as the phase space for negative energy higher frequency modes becomes unbounded below [Footnote 4: This is to be contrasted with a tachyonic instability, which is the “exponential blowing up” of each individual mode.]. In the quantum theory, while this instability does not prevent us from identifying a vacuum state and then constructing the Fock space of many particle states, imposing positivity on the energy of all particle states will lead to some states possessing negative norms, i.e. ghosts. One can further excise these unphysical negative norm states from the Fock space, but this generically leads to violations of unitarity. For a review of the quantization issues with such theories, see Appendix A.

III.2 Helicity-1 sector

The second order action of the helicity-1 modes can be written in terms of the gauge invariant variable [Footnote 5: One should not be unduly worried by the appearance of the non-local square root of the Laplacian operator. Recall that the Laplacian operator has zero or positive eigenvalues , e.g. with . Formally, (as long as both and vanish at the boundary), i.e. .]


The action describes a vector with mass and the sign of also decides the overall sign of the action, i.e. if , the helicity-1 modes are ghostlike. The Euler-Lagrange equation of action (18) is


which can be solved by Fourier transform, and the solutions are harmonic oscillators with frequency . The canonical momentum conjugate to is as usual defined by


and since we use the gauge invariant variable to write the action, there is no constraint in helicity-1 sector and the Hamiltonian is


If we choose , then , which means the theory is tachyonic. On the other hand, if we choose in eq.(21), the Hamiltonian will be negative definite and thus ghostlike. One can see that if , we can perform a canonical transformation of the variables into “canonically normalized” form , with the Hamiltonian


where the mass of the helicity-1 ghost is . In other words, the helicity-1 modes are either tachyonic or ghostlike .

III.3 Helicity-0 sector

The second order action for helicity-0 modes is more complicated. With the help of two gauge invariant variables


the action can be written as


There are two scalar functions in the action, and because of the second order time derivatives on , there are three naive degrees of freedom [Footnote 6: As in eq.(1), an extra time derivative in the action adds two dimensions to the phase space, i.e., one more degree of freedom.]. One degree of freedom will eventually be removed by a gauge constraint, so the helicity-0 sector in general consists of two degrees of freedom. Notice that all the second order time derivatives appear on the second line with the coefficient , which reflects the well-known fact Stelle (1977) that if we choose , the massive scalar will be frozen and removed from the theory because of its infinite mass. In that case the only degree of freedom in this sector is the helicity-0 mode of the massive graviton.

On the other hand, we know that is simply an $f(R)$ type theory which is degenerate and hence is also ghost-free. This fact is not manifest in eq.(24) above if we simply set . However, when , the action can be rearranged as


where we have suggestively written the second line in the action (III.3) as a complete square. By varying , we obtain


Inserting eq.(26) back into the action (III.3) we obtain the action of a single non-ghostlike massive scalar field


as we would expect for $f(R)$ type theories. It is clear that since the action is only dependent on and , there is only one ghost-free d.o.f. Notice that if this scalar is a tachyonic unstable d.o.f., which is consistent with general $f(R)$ gravity, where we require $f''(R) > 0$ to avoid tachyonic instability Starobinsky (2007); De Felice and Tsujikawa (2010). Setting means that the mass term blows up, rendering this d.o.f. non-dynamical, i.e. the theory reduces to General Relativity.

Harking back to the action for general and , eq.(24), Ostrogradski’s choice of canonical coordinates is


where the choice means that becomes a primary constraint instead of an additional d.o.f.

The Hamiltonian can be expressed by the canonical coordinates


The primary constraint is and all the constraints can be generated by the consistency relation


where means “weak equality” (i.e. only satisfied when the variables are on-shell) – see Henneaux and Teitelboim (1992) for a discussion on this point.
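In generic form, the consistency algorithm referred to here can be sketched as follows. The symbols $\phi_a$, $\lambda^b$ and $H$ are assumptions of this sketch and are not taken from the authors' notation.

```latex
\begin{align}
  \phi_1 &\approx 0 , \\
  \dot\phi_a &= \{\phi_a , H\} + \lambda^{b}\,\{\phi_a , \phi_b\} \approx 0 ,
  \qquad a = 1, 2, \dots
\end{align}
```

Each step either produces a new constraint $\phi_{a+1}$, fixes a Lagrange multiplier $\lambda^b$, or is satisfied identically on the constraint surface, at which point the chain terminates.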

Since , are second class [Footnote 7: Note that in the case of , the constraints and are not second class and the theory will contain two more constraints. The reduced phase space is then two dimensional and the Hamiltonian is bounded below if , the same conclusion as in the full theory.], we can use them to reduce the phase space , and the reduced Hamiltonian is


The linear dependence of again renders the Hamiltonian unbounded from below.

In order to see the mass content of helicity-0 modes, we will need to further diagonalize the Hamiltonian by the following canonical transformation:


The diagonalized Hamiltonian is then


The reduced Hamiltonian of the helicity-0 sector contains two massive degrees of freedom. One is a massive ghost coming from the massive graviton, with mass , and the other is a massive scalar with positive definite kinetic energy, with mass .

Let us combine the results from all sectors. In III.1, we saw that there are four helicity-2 degrees of freedom and two of them suffer from ghost-like instabilities. In III.2, the two helicity-1 degrees of freedom are either ghostlike or tachyonic, depending on the sign of . In III.3, one of the two scalar degrees of freedom is ghostlike. With , one can see that the unstable modes in the helicity-0, 1, 2 sectors are massive with mass , which corresponds to the massive graviton. This result was derived by Stelle in his seminal work on higher derivative gravity Stelle (1978) using an auxiliary field methodology. Here we have rederived the results using the usual Hamiltonian formalism.

There are two special choices of parameters in the linearized theory. With , the massive graviton sector gains an infinite mass and hence becomes non-dynamical. In this case the theory consists of one massless graviton with one massive scalar field (i.e. an theory). On the other hand, by taking the limit , the massive scalar field becomes infinitely massive and hence non-dynamical. In this case, the theory’s particle content reduces to one massive and one massless graviton. With the latter choice and a total minus sign, at the linear level one can have a theory with a healthy massive graviton Park and Sorbo (2013), since this choice is consistent with the Fierz-Pauli tuning. However, one should expect that the Boulware-Deser ghost Boulware and Deser (1972) will enter at the nonlinear level.

IV Stabilization by constraints in Minkowski background

In this section, we will demonstrate how to remove the unstable degrees of freedom by introducing constraints via auxiliary fields. As shown in Chen et al. (2013), this will result in the effective dimensionality of phase space being reduced. Roughly speaking, we impose the constraints such that the auxiliary fields are related to second order time derivative of the unstable fields, resulting in the final equations of motion being second order in time derivatives yet up to fourth order in spatial derivatives. The advantage of preserving spatial part of the “higher derivative” component is that we retain the improved renormalization properties of such theories, at the price of giving up Lorentz invariance.

One might ask whether we can remove the instabilities without explicitly breaking Lorentz invariance. Here we emphasize that we can equally insert constraints to remove the higher spatial derivatives, with the end result being a stable 2nd order theory both in space and time derivatives. For example, the unconstrained helicity-2 action (III.1) can be written as

Without the full theory, we do not know how to introduce into the action without breaking Lorentz invariance while removing the highest time derivative in the equations of motion. The best thing we can do is to couple with , and the Lorentz invariance is not explicitly broken by extra terms. We can modify the action as

If , we force coupling to every and if we only force coupling to those where the cannot be removed by an integration by parts. The equations of motion of the theory are

which can be written as a single equation of

The equation is either trivial if or a Klein-Gordon equation with mass if . In both cases, the equations of motion will have the same order of time derivatives and spatial derivatives, and the improved renormalization properties will not be retained.

For notational simplicity, from now on we drop the traceless notation , , which should be clear from the context.

IV.1 Helicity-2 sector

We begin by introducing a helicity-2 auxiliary tensor field into the action (III.1)


where is transverse traceless, which also explicitly breaks Lorentz invariance. The canonical coordinates are


and the Hamiltonian is


The Poisson bracket of a pair of transverse traceless canonical coordinates can be found as


where is the transverse traceless projection operator defined by , while is the transverse projection operator. Since the equations of motion in the Hamiltonian picture are generated by Poisson bracket, the projection operator will preserve the transverse traceless characteristic.

It is clear that is a primary constraint as it is an auxiliary field. Via the consistency relation, we can generate further (traceless and transverse) secondary constraints as follows


We can use the constraints , to eliminate the degree of freedom , and use , to eliminate . The coefficients in the action (IV.1) are chosen such that there are at least four constraints in the theory and there is no in which will generate nonlocal terms in the reduced Hamiltonian.

Using the constraints, can be written as follows


and the reduced Hamiltonian becomes


To check whether the reduced Hamiltonian is bounded from below, we will explicitly quantize the theory. Similar to QED in the Coulomb gauge, one can follow Dirac’s method to quantize the constrained system. We first write down the generalized version of Poisson bracket (i.e. Dirac bracket), which generates time evolution of any fields in constrained theory while preserving all the constraints. We then promote all the fields to operators and the commutators of two fields now become times the Dirac bracket of them.
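The procedure just described can be illustrated on a finite-dimensional toy system. The sketch below is not the field-theory calculation of this section: it assumes a four-dimensional phase space $(q_1, q_2, p_1, p_2)$, two hypothetical linear second-class constraints $\phi_1 = q_2$ and $\phi_2 = p_2 - q_1$, and the standard definition $\{A,B\}_D = \{A,B\} - \{A,\phi_a\}\,(C^{-1})^{ab}\,\{\phi_b,B\}$ with $C_{ab} = \{\phi_a,\phi_b\}$. Because every function is linear, each bracket reduces to a number built from coefficient vectors.

```python
# Toy illustration of the Dirac bracket for two linear second-class constraints.
# Phase space z = (q1, q2, p1, p2); a linear function f(z) = a . z is stored
# as its coefficient vector a. The constraints below are hypothetical choices.

# Canonical symplectic matrix: {z_i, z_j} = J[i][j], so {q_i, p_j} = delta_ij.
J = [
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [-1, 0, 0, 0],
    [0, -1, 0, 0],
]

Q1, Q2 = [1, 0, 0, 0], [0, 1, 0, 0]
P1, P2 = [0, 0, 1, 0], [0, 0, 0, 1]

PHI1 = [0, 1, 0, 0]    # phi_1 = q2
PHI2 = [-1, 0, 0, 1]   # phi_2 = p2 - q1


def poisson(a, b):
    """Poisson bracket of two linear functions, a^T J b."""
    return sum(a[i] * J[i][j] * b[j] for i in range(4) for j in range(4))


def dirac(a, b, phi1=PHI1, phi2=PHI2):
    """Dirac bracket {a, b}_D for the pair of constraints (phi1, phi2)."""
    c12 = poisson(phi1, phi2)
    if c12 == 0:
        raise ValueError("constraints are not second class")
    # C = [[0, c12], [-c12, 0]]  =>  C^{-1} = [[0, -1/c12], [1/c12, 0]]
    correction = (poisson(a, phi1) * (-1.0 / c12) * poisson(phi2, b)
                  + poisson(a, phi2) * (1.0 / c12) * poisson(phi1, b))
    return poisson(a, b) - correction


if __name__ == "__main__":
    print("{q1, p1}_D =", dirac(Q1, P1))
    print("{p1, p2}_D =", dirac(P1, P2))
    print("{q2, p2}_D =", dirac(Q2, P2))
```

On the constraint surface $p_2 = q_1$, so the Dirac bracket of $p_1$ with $p_2$ reproduces the Poisson bracket of $p_1$ with $q_1$, while the constrained variable $q_2$ has vanishing Dirac bracket with everything. This is the property that lets the constraints be imposed as operator identities after quantization.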

To write down the Dirac bracket, we first define a matrix ,

where and are two operators and . The inverse of is

and the Dirac bracket of two fields , is defined by

Equipped with the Dirac bracket, one can use the reduced Hamiltonian to write down the equations of motion of this system


Using eq.(41) and eq.(42), we find


which is the Euler-Lagrange equation of the action (IV.1). We can solve eq.(43) by taking the Fourier transform


where satisfies


For any , is a harmonic oscillator with frequency , where is positive definite if .

In order to quantize the theory, we write , as linear combinations of creation and annihilation operators , ,


where the coefficients are chosen in such a way that they solve the equations of motion eqs.(41), (42), and the superscript labels the polarizations. The symmetric transverse traceless tensor satisfies , and is normalized as


with the completeness relation


The operator is defined by replacing all the in the transverse traceless projection operator by . One can calculate the Dirac bracket of (, ) and the commutator of the two operators is thus


With the normalization eq.(47) and the completeness relation eq.(48), the commutation relation eq.(49) is equivalent to


One can thus rewrite the reduced Hamiltonian (IV.1) in terms of creation and annihilation operators


The energy spectrum is real and bounded from below if is positive definite as long as .

IV.2 Helicity-1 sector

We now turn our attention to the unstable helicity-1 modes. As shown in III.2, this sector is tachyonic if and ghostlike if . As before, we will remove the instability by modifying the action (18) with the introduction of a helicity-1 field


Ostrogradski’s choice of canonical coordinates is


and the Hamiltonian is


There are four constraints in the theory, which can be found as