Biasing and the search for primordial non-Gaussianity beyond the local type

Jérôme Gleyzes, Roland de Putter, Daniel Green, Olivier Doré

California Institute of Technology, Pasadena, CA 91125; Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91109; Department of Physics, University of California, Berkeley, CA 94720, USA

Abstract
Primordial non-Gaussianity encodes valuable information about the physics of inflation, including the spectrum of particles and interactions. Significant improvements in our understanding of non-Gaussianity beyond Planck require information from large-scale structure. The most promising approach to utilize this information comes from the scale-dependent bias of halos. For local non-Gaussianity, the improvements available are well studied, but the potential for non-Gaussianity beyond the local type, including equilateral and quasi-single field inflation, is much less well understood. In this paper, we forecast the capabilities of large-scale structure surveys to detect general non-Gaussianity through galaxy/halo power spectra. We study how non-Gaussianity can be distinguished from a general biasing model and where the information is encoded. For quasi-single field inflation, significant improvements over Planck are possible in some regions of parameter space. We also show that the multi-tracer technique can significantly improve the sensitivity for all non-Gaussianity types, providing up to an order of magnitude improvement for equilateral non-Gaussianity over the single-tracer measurement.

## 1 Introduction

Understanding the physics of the very early universe is one of the central goals of modern cosmology. Of particular interest is the physics responsible for the creation of the initial fluctuations that seeded structure formation. Primordial non-Gaussianity (PNG) is a powerful probe as it is sensitive to the non-linear evolution of the fluctuations back to the time when the modes were created. Different models of the early universe can produce qualitatively different predictions for the deviations from Gaussianity, meaning that a detection or new upper limits will help determine the mechanism responsible. Constraints on primordial non-Gaussianity from the CMB [1] currently provide the best limits on all types of non-Gaussianity, but fall short of some well-motivated targets. Significant improvements are possible with large-scale structure (LSS) and can potentially reach some of these thresholds [2].

Scale-dependent halo bias is a powerful probe of primordial non-Gaussianity [3, 4, 5, 6]. In particular, scale-dependent bias is sensitive to the squeezed-limit bispectrum of primordial fluctuations. The single-field consistency relations [7, 8] state that in any single-field model of inflation, the dependence of this bispectrum on the wave vector of the long mode must be of the form (see also [9, 10, 11, 12, 13] for more detailed discussions on consistency relations in the large-scale structure)

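$$\langle \varphi(\mathbf{k}_L)\,\varphi(\mathbf{q}_1)\,\varphi(\mathbf{q}_2)\rangle' \propto k_L^2\, P_p(k_L)\, P_p(|\mathbf{q}_1+\mathbf{q}_2|/2)\cdots + \mathcal{O}(k_L^4), \tag{1.1}$$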

where $\varphi$ is the primordial Newtonian potential with power spectrum $P_p$. This primordial potential is related in linear theory to the Newtonian potential during matter domination through the linear growth factor of matter fluctuations, $D(z)$, normalized such that $D(z) = 1/(1+z)$ during matter domination, and the transfer function, $T(q)$, which goes to unity as $q \to 0$. The important feature is that the small-scale power is modulated by the primordial long-wavelength Newtonian potential. As a result, any scale-dependent bias for the halo density field due to primordial non-Gaussianity, $b_{\rm NG}(q)$, will be inversely proportional to the transfer function, $b_{\rm NG}(q) \propto 1/T(q)$. This is realized for equilateral-type non-Gaussianity, since (see [14])

$$b_{\rm NG}(q) = 9\,(b_\delta - 1)\, f_{\rm NL}^{\rm Eq}\, \frac{\Omega_m \delta_c H_0^2 R_*^2}{D(z)\, T(q)}, \tag{1.2}$$

where $b_\delta$ is the linear bias, $\Omega_m$ is the density parameter of matter, $\delta_c$ is a critical density that typically appears in peak-background split calculations (see for example [15]), $H_0$ is the Hubble parameter today and $R_*$ is the Lagrangian radius of the objects considered. On the other hand, this consistency relation can be violated if multiple fields contributed to inflation [16, 17, 18, 19]. For example, local-type non-Gaussianity, which arises when there are multiple light fields, gives

$$b_{\rm NG}(q) = 3\,(b_\delta - 1)\, f_{\rm NL}^{\rm Loc}\, \frac{\Omega_m \delta_c H_0^2}{D(z)\, T(q)\, q^2} \tag{1.3}$$

(in the single-field case, the consistency relation implies that any bias contribution with this scale dependence has to be exactly zero modulo projection effects [20, 21]). More generally, models where one of the fields is not light also violate the consistency relation (see for example quasi-single field inflation [22]), and give

$$b_{\rm NG}(q) \propto f_{\rm NL}^{\Delta}\, \frac{\Omega_m \delta_c H_0^2}{D(z)\, T(q)\, q^2}\, (q R_*)^{\Delta} \tag{1.4}$$

with $\Delta \in [0, 2]$.

Constraints from scale-dependent halo bias on these types of primordial non-Gaussianity have been extensively studied in the literature, especially in the case of local non-Gaussianity (e.g. [23, 24]), but also for equilateral [25, 26, 27] and quasi-single field non-Gaussianity [28, 29, 30]. One potential concern is that the scale-dependent bias effect may in principle be degenerate with contributions to halo bias from sources other than primordial non-Gaussianity, specifically non-linear bias and/or non-local bias, by which we mean that $\delta_h$ can depend on spatial derivatives of $\delta$ (evidence for both has been observed in simulations [31, 32]). While, based on fundamental principles, it is hard to mimic the scale dependence of local-type scale-dependent bias with such contributions, they may be important in the other cases mentioned above. This can potentially severely weaken the constraining power of halo clustering on primordial non-Gaussianity. In particular, for equilateral non-Gaussianity, the scale-dependent bias can be expanded in even powers of $q$, $b_{\rm NG}(q) \propto 1 + c_2 (q/k_{\rm eq})^2 + \dots$, which at first sight is fully degenerate with the gradient bias expansion $b_{q^2} q^2 R_*^2 + b_{q^4} q^4 R_*^4 + \dots$. In fact, considering only the $q^2$ terms, and assuming $b_{q^2}$ to be of order unity, suggests an error floor on $f_{\rm NL}^{\rm Eq}$ of order 1000 [33, 25]. On the other hand, the effects of equilateral non-Gaussianity and non-local bias should not be exactly the same, as they come in with a different characteristic scale. For primordial non-Gaussianity, the typical scale is the matter-radiation equality scale, $k_{\rm eq}^{-1}$, while for the gradient bias it is the Lagrangian size of the halos of interest, $R_*$. In other words, one expects the degeneracy to not be exact when taking into account higher order terms.

In this paper, we study in more detail the constraints that can be obtained on primordial non-Gaussianity from scale-dependent bias in the presence of non-linear biasing and non-local (gradient) bias. We follow the approach of [33, 34, 35] to write the most general bias expansion based on principles of symmetry and the equivalence principle. We then forecast constraints on primordial non-Gaussianity, marginalizing over the effects of the various bias contributions, where for non-linearities, we include all terms up to 1-loop order.

Of particular interest is how information about equilateral and quasi-single field-type non-Gaussianity is encoded in galaxy power spectra. We study the dependence on survey volume, number density and the use of multiple tracers (as proposed in [36]) to understand the reach and limitations of scale-dependent bias for non-local non-Gaussianity. After marginalizing over bias, only quasi-single field inflation in part of its parameter space and local non-Gaussianity show the potential for significant improvements over Planck [1] with realistic configurations. Nevertheless, we show that the multi-tracer technique allows for significant improvements over single tracers and could extend the potential reach of large-scale structure and scale-dependent bias.

The paper is outlined as follows: In Section 2, we detail the model for the halo power spectra in terms of a biasing model. In Section 3, we explain our forecasting methodology. In Section 4, we focus on equilateral non-Gaussianity, because there one might expect degeneracy with other bias terms to be the strongest. We then turn to non-Gaussianity of the types that violate the consistency relations in Section 5 (i.e. require multiple fields in the early universe) and quantify the degradation to parameter constraints due to non-linear and non-local biasing. In Section 6, we discuss the multi-tracer technique and demonstrate the potential for improving constraints on equilateral non-Gaussianity. We conclude in Section 7.

Readers familiar with non-local, non-linear bias and/or MCMC forecasting may want to go directly to the results, starting in Section 4.

## 2 Modeling the halo power spectrum

In this section, we describe the model of halo clustering. Our discussion here follows the systematic, general approach of [37, 38, 33, 39, 34, 40, 41, 42] (see also the review [15] and references therein). We will consider the halo power spectrum in real space, i.e. we will not include redshift space distortions, which would make the power spectrum anisotropic and would significantly complicate the analysis.

### 2.1 Scale-dependent bias from primordial non-Gaussianity

We start with the leading order, local contributions to the halo overdensity $\delta_h$, including the scale-dependent bias from primordial non-Gaussianity, which is our signal,

$$\delta_h = b_\delta\, \delta + f_{\rm NL}\, b_\Psi\, \Psi + \epsilon_0. \tag{2.1}$$

Here, $\delta$ is the matter overdensity and $b_\delta$ the Gaussian, linear bias. For non-Gaussian initial conditions, there can be a modulation of the small-scale variance of initial perturbations by a long-wavelength fluctuation,

$$\Psi(\mathbf{q}) \equiv \left(\frac{q}{\mu}\right)^{\Delta} \varphi(\mathbf{q}), \tag{2.2}$$

where $\varphi$ is the primordial metric perturbation, related to the primordial curvature perturbation, and $\mu$ an arbitrary energy scale. The term above describes the simplest form of bias due to this primordial non-Gaussianity (see [33, 43] for additional refinements) and has been tested in simulations [44, 45, 46]. Finally, $\epsilon_0$ is a stochastic white noise contribution, which is assumed to be uncorrelated with the other terms.

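The primordial metric fluctuation can be related to the present-day matter fluctuation by,

$$\delta(\mathbf{q}) = M(q)\, \varphi(\mathbf{q}), \qquad M(q) \equiv \frac{2}{3}\, \frac{q^2\, T(q)\, D(z)}{\Omega_m H_0^2}. \tag{2.3}$$

We can then write the non-Gaussian contribution as a scale-dependent bias,

$$\delta_h(\mathbf{q}) = \left(b_\delta + b_{\rm NG}(q)\right)\delta(\mathbf{q}) + \epsilon_0, \qquad b_{\rm NG}(q) = f_{\rm NL}\, b_\Psi\, M^{-1}(q) \left(\frac{q}{\mu}\right)^{\Delta}. \tag{2.4}$$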

Parametrizing the primordial non-Gaussianity (specifically, the squeezed-limit bispectrum) in the conventional way, and applying a simple halo model, the scale-dependent bias can be written in terms of a non-Gaussianity parameter $f_{\rm NL}$ (interestingly, these parameters in a given volume may be biased relative to their true statistical values [47], but the parameter $\Delta$ is robust to such effects [48]) [4, 49, 50, 51, 26],

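$$\begin{aligned}
b_{\rm NG}(q) &= 2 f_{\rm NL}^{\rm Loc}\, (b_\delta - 1)\, \delta_c\, M^{-1}(q) && \text{(local)}, && (2.5)\\
b_{\rm NG}(q) &= 6 f_{\rm NL}^{\rm Eq}\, (b_\delta - 1)\, \delta_c\, (q R_*)^2\, M^{-1}(q) && \text{(equilateral)}, && (2.6)\\
b_{\rm NG}(q) &= 6 f_{\rm NL}^{(\Delta)}\, (b_\delta - 1)\, \delta_c\, (q R_*)^{\Delta}\, M^{-1}(q) && \text{(generic exponent } \Delta \in [0, 2]), && (2.7)
\end{aligned}$$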

where we recall that $\delta_c$ is the critical overdensity for spherical collapse, and $R_*$ is the Lagrangian radius of the halos of interest. (For the quasi-single field case, the conventional prefactor is not 6, but a more complicated function of $\Delta$, see [52]. When comparing our results to the existing QSF constraints in Section 5.2, we will use the full normalization. However, when drawing a link from local to equilateral configurations, we will use the normalization (2.7).) The amplitudes of non-Gaussianity, $f_{\rm NL}$, defined in eqs. (2.5)–(2.7) are determined by the amplitudes of the squeezed limit of the bispectrum. Constraints from the CMB often normalize $f_{\rm NL}$ by the amplitude of the bispectrum in the equilateral configuration, which can be non-trivially related to the amplitude in the squeezed limit. This can be important for comparing limits from scale-dependent bias with limits from Planck, as we will see in the case of quasi-single field inflation (Section 5.3).
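The scalings in eqs. (2.5)–(2.7) can be made concrete with a minimal Python sketch. Everything numerical here is an assumption made for illustration only: the function `M` assumes the form $M(q) = \frac{2}{3} q^2 T(q) D(z) / (\Omega_m H_0^2)$ (the choice for which eqs. (2.5) and (2.6) reproduce eqs. (1.3) and (1.2)), `transfer_toy` is a hypothetical stand-in for a real transfer function, and the values of `k_eq`, `omega_m` and `delta_c` are illustrative defaults rather than the values used in the forecasts.

```python
import math

def transfer_toy(q, k_eq=0.015):
    """Toy transfer function (assumption): T -> 1 as q -> 0, T ~ (k_eq/q)^2 for q >> k_eq."""
    return 1.0 / (1.0 + (q / k_eq) ** 2)

def M(q, D=1.0, omega_m=0.31, H0=1.0 / 2997.9):
    """Assumed map delta = M * phi. q and H0 are in h/Mpc (H0/c = 1/2997.9 h/Mpc)."""
    return (2.0 / 3.0) * q ** 2 * transfer_toy(q) * D / (omega_m * H0 ** 2)

def b_ng(q, f_nl, b_delta, Delta, R_star, prefactor=6.0, delta_c=1.686):
    """Generic scale-dependent bias, eq. (2.7): prefactor * f_nl (b-1) delta_c (q R*)^Delta / M(q)."""
    return prefactor * f_nl * (b_delta - 1.0) * delta_c * (q * R_star) ** Delta / M(q)

def b_ng_local(q, f_nl, b_delta):
    """Local type, eq. (2.5): Delta = 0 with prefactor 2 (R_star drops out)."""
    return b_ng(q, f_nl, b_delta, 0.0, 1.0, prefactor=2.0)

def b_ng_equilateral(q, f_nl, b_delta, R_star):
    """Equilateral type, eq. (2.6): Delta = 2 with prefactor 6."""
    return b_ng(q, f_nl, b_delta, 2.0, R_star)

if __name__ == "__main__":
    for q in (0.005, 0.02, 0.1):
        print(q, b_ng_local(q, 1.0, 3.6), b_ng_equilateral(q, 1.0, 3.6, 3.7))
```

With these assumptions the local signal grows as $1/q^2$ on large scales, while the equilateral signal multiplied by $T(q)$ is scale independent, which is the $1/T(q)$ behaviour quoted in the introduction.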

The above expressions for scale-dependent bias are based on the squeezed-limit behavior of the primordial bispectrum. This means their range of validity is technically limited to long modes whose wave vector is much smaller than the wave vector of the short modes. For a given halo type, this places a requirement on the maximum wave vector included in the analysis, $q_{\rm max}$. In order to maximize the information from scale-dependent bias, we will push $q_{\rm max}$ to relatively large values, but one has to keep in mind that for modes that do not satisfy the squeezed-limit condition, the actual form of the scale-dependent bias may be modified. We will come back to the optimization of $q_{\rm max}$ in Section 4.2.

### 2.2 Non-linear and non-local halo bias

In this section, we will go step by step through the calculation of the halo-halo power spectrum in the presence of PNG. The reader not interested in those technical details is encouraged to skip to either the forecasting method in Section 3, or even directly to the results in Section 4.

Specifically, we want to compute

$$\langle \delta_h(\mathbf{k})\, \delta_h(\mathbf{q}) \rangle = (2\pi)^3\, \delta^{(D)}(\mathbf{k}+\mathbf{q})\, P_{hh}(k), \tag{2.8}$$

using the methodology of standard perturbation theory (SPT). We refer the reader to [53] for definitions and notations.

On top of the SPT contributions up to 1-loop, we will also include the most general bias model for $\delta_h$. To do so, we will allow $\delta_h$ to depend on any term that is permitted by the symmetries of the system, e.g. rotational symmetry and the equivalence principle. This means that we allow $\delta_h$ to depend on derivatives of the matter overdensity $\delta$, leading to terms like $R_*^2 \nabla^2 \delta$, known as non-local bias. We will also include non-linear bias, i.e. we will allow $\delta_h$ to depend on terms like $\delta^2$. Our derivation will follow closely the work of [34] (and also [39]), which computed $P_{hh}$ for Gaussian initial conditions, to which we will add the results of [33] regarding the PNG.

In the expansion of $\delta_h$, the only term involving the primordial non-Gaussianity that we will keep is the linear one (i.e. the first line of eq. (2.34) in [33]), the other (loop) terms being much smaller for the scales of interest. Our expansion is going to be in the Gaussian part of the linear matter overdensity, which is a purely Gaussian variable. (In principle, the non-linear relation between the matter overdensity and this Gaussian variable due to the PNG will show up as a modification to the kernels of standard perturbation theory. However, we have checked that those modifications are negligible compared to the effect of the scale-dependent bias.) Let us first discuss the structure of the terms that we will get. As in [33], we will conduct the expansion with diagrams for an easier representation.

#### 2.2.1 The ingredients

The goal of this section is to present the terms that appear in the bias once it is allowed to be non-local and non-linear. We will go through them order by order in the Gaussian field.

• Linear terms

Those terms will be represented by a line. External lines should be weighted by a linear $\delta_h$, given by

$$\delta_h(\mathbf{q}) = \left[ b_\delta + \frac{f_{\rm NL}\, b_\Psi}{M(q)} \left(\frac{q}{\mu}\right)^{\Delta} + b_{q^2}\, q^2 R_*^2 + b_{q^4}\, q^4 R_*^4 + \dots \right] \delta(\mathbf{q}). \tag{2.9}$$

The $b_{q^n}$ terms come from Fourier transforming the dependence of $\delta_h$ on spatial derivatives of $\delta$. The dots signify that in principle there are other terms in this gradient expansion. Internal lines, on the other hand, are necessarily $\delta$. Contracting two lines gives a power spectrum (represented by a black dot), with the appropriate bias weights. There is also a stochastic term, $\epsilon_0$, that is allowed by the symmetries. It does not correlate with $\delta$ and gives a noise contribution to the power spectrum.

• Quadratic terms

These will be represented by a striped circle with one line going in and two going out. The circle will have a weighting function $G$, which depends on the momenta of the lines and on the exact nature of the quadratic term. Since our operators are expressed in terms of the full $\delta$ and not its first-order part, internal lines can only carry the standard perturbation theory kernel of second order for $\delta$, $F_2$. The restriction to the kernel $F_2$ will be shown by a circle with a grid. If the line is an external line, then two additional types of terms are possible: the quadratic overdensity $\delta^2$ and the square of the tidal tensor, $s_{ij}^2$, where $s_{ij} \equiv \nabla_i \nabla_j \Phi_\delta - \frac{1}{3} \delta_{ij}\, \delta$ with $\nabla^2 \Phi_\delta = \delta$. Similarly to the linear case, there is a stochastic term that appears. We will see that this term can be absorbed in the definition of the linear bias (see App. A).

• Cubic terms

These will be represented by a striped square with one line going in and three going out. The square will have a weighting function $R$, which depends on the momenta of the lines and on the exact nature of the cubic term. At one loop, on top of the standard perturbation theory kernel of third order for $\delta$, $F_3$, we have four additional contributions: $\delta^3$, $\delta\, s_{ij}^2$, $\nabla_i \nabla_j \Phi_\delta\, \nabla_j \nabla_k \Phi_\delta\, \nabla_k \nabla_i \Phi_\delta$ and $\mathrm{Tr}[\Pi^{(1)} \Pi^{(2)}]$. The last one follows the notation of [33] (it is related to a variable used in [34], although it is not exactly the same) and is more subtle. At first order, $\delta$ and the velocity divergence are the same up to a normalization factor, which means that only a single variable is needed to describe them. At second order they are not the same, but their difference (also discussed in [33]) is not a new variable, i.e. it can be expressed in terms of the quadratic operators listed above (see [33]). Only when multiplied with a first-order quantity does it become a new, independent variable, which is why it only starts appearing at cubic order. We will give more explicit expressions when looking directly at the kernels. There are also mixed stochastic terms, which give divergent contributions and therefore will not appear in the final (renormalized) result.

Now that we have the ingredients, it is a matter of getting all the possible terms in the expansion of $P_{hh}$ up to 1-loop. These will be represented by diagrams, similarly to standard perturbation theory [53]. The terms in $P_{hh}$ are obtained by contracting the “outgoing” lines from two ingredients in the list above. The contractions, which we will denote with a black dot, come with a linear power spectrum $P_G$ and impose that the sum of the momenta of the lines is zero. We will illustrate this below. Note that in principle, we should be working with renormalized quantities [39] from the start. Instead, we will work with the bare quantities, and keep only the non-divergent parts in our final result.

#### 2.2.2 Tree level

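Only the terms that are at most linear in the Gaussian field can appear here. Thus, we have to contract two external lines as in eq. (2.9), as well as two stochastic terms $\epsilon_0$. This gives

$$P^{\rm Tree}_{hh}(q) \equiv b_{\rm full}^2(q)\, P_G(q) + P_{\epsilon_0}, \tag{2.10}$$

where we have defined

$$b_{\rm full}(q) \equiv b_\delta + \frac{f_{\rm NL}\, b_\Psi}{M(q)} \left(\frac{q}{\mu}\right)^{\Delta} + b_{q^2}\, q^2 R_*^2 + b_{q^4}\, q^4 R_*^4 + \dots \tag{2.11}$$

and we denoted the variance of $\epsilon_0$ by $P_{\epsilon_0}$.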
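As a small illustration of eqs. (2.10) and (2.11), the following Python sketch evaluates the tree-level halo power spectrum at a single wave number. It is only a sketch: the function names are hypothetical, the non-Gaussian bias `b_ng` is passed in as a number already evaluated at `q`, the gradient coefficients are given as a tuple `(b_q2, b_q4, ...)`, and the input values in the usage lines are placeholders rather than the fiducial values of the forecast.

```python
def b_full(q, b_delta, b_ng=0.0, grad_coeffs=(), R_star=3.7):
    """Eq. (2.11): b_delta + b_NG(q) + sum over even n of b_{q^n} (q R_*)^n."""
    total = b_delta + b_ng
    for i, coeff in enumerate(grad_coeffs):
        power = 2 * (i + 1)  # n = 2, 4, 6, ...
        total += coeff * (q * R_star) ** power
    return total

def p_tree(q, p_lin, b_delta, p_eps, b_ng=0.0, grad_coeffs=(), R_star=3.7):
    """Eq. (2.10): b_full(q)^2 P_G(q) + P_eps0, with p_lin = P_G(q) supplied by the caller."""
    return b_full(q, b_delta, b_ng, grad_coeffs, R_star) ** 2 * p_lin + p_eps

if __name__ == "__main__":
    # Placeholder inputs: P_G(q) = 1000 (Mpc/h)^3 at q = 0.1 h/Mpc, noise 8000 (Mpc/h)^3.
    print(p_tree(0.1, 1000.0, 3.6, 8000.0))
    print(p_tree(0.1, 1000.0, 3.6, 8000.0, grad_coeffs=(1.0,)))
```

Passing `b_ng` as a precomputed number keeps the sketch independent of any particular model for the transfer function.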

#### 2.2.3 1-loop

Here, there are three types of diagrams that contribute: 1) a second-order vertex contracted with another second-order vertex (which is similar to the usual $P_{22}$ of SPT); 2) a third-order vertex with one loop and one contraction with a linear $\delta$ (the equivalent of $P_{13}$). There is a third type, which does not appear in SPT. This is because our quadratic operators are constructed from the full $\delta$, and not from its first-order part as in SPT. Therefore, we can have a second-order vertex contracted with the SPT second-order vertex and a linear $\delta$, which we call type 3. The structures of the diagrams are shown in Fig. 1.

Let us now compute the contribution of the different types of diagrams.

• Type 1)

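The contractions impose momentum conservation, so that there is only one integral over momentum (which is expected because we are looking at 1-loop). The diagram and its symmetrical partner therefore give

$$P_{hh}(q) \supset \int_{\mathbf{p}} G(\mathbf{q}-\mathbf{p}, \mathbf{p}) \left[ G'(\mathbf{p}-\mathbf{q}, -\mathbf{p}) + G'(-\mathbf{p}, \mathbf{p}-\mathbf{q}) \right] P_G(|\mathbf{q}-\mathbf{p}|)\, P_G(p). \tag{2.12}$$

We use the shorthand notation $\int_{\mathbf{p}} \equiv \int \frac{d^3 p}{(2\pi)^3}$. If $G = G' = F_2$, we recover $P_{22}$, the 1-loop term in SPT obtained when contracting two second-order $\delta$.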

• Type 2)

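The contractions again impose momentum conservation. The diagram and its symmetrical partners (2 others, depending on which leg of the cubic vertex is contracted with the second external line) therefore give

$$P_{hh}(q) \supset P_G(q) \int_{\mathbf{p}} P_G(p) \left[ R(-\mathbf{p}, \mathbf{p}, -\mathbf{q}) + R(-\mathbf{p}, -\mathbf{q}, \mathbf{p}) + R(-\mathbf{q}, -\mathbf{p}, \mathbf{p}) \right]. \tag{2.13}$$

Again, if $R = F_3$, this is $P_{13}$, the 1-loop term in SPT obtained when contracting a first-order $\delta$ with a third-order $\delta$.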

• Type 3)

This type is slightly more involved. However, following the basic rules above, the diagram and its symmetrical partners (3 others, depending on where the second vertex is connected and where the second external line connects to the second vertex) give

$$P_{hh}(q) \supset 2 P_G(q) \int_{\mathbf{p}} P_G(p)\, F_2(-\mathbf{p}, \mathbf{q}) \left[ G(\mathbf{q}-\mathbf{p}, \mathbf{p}) + G(\mathbf{p}, \mathbf{q}-\mathbf{p}) \right], \tag{2.14}$$

where we have used that $F_2$ is symmetric in its arguments. (In principle, there is another diagram where the external $\delta$ is connected to one branch of the first kernel while the two branches of the second kernel are connected together. However, this leads to a vertex $F_2(\mathbf{p}, -\mathbf{p})$, which is zero.)

We therefore have the ingredients as well as the structure of the bias expansion. The only thing left is to assign the proper kernel to each ingredient, and then we will be able to compute $P_{hh}$ at 1-loop.

#### 2.2.4 The kernels

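Let us now translate the non-linear bias terms to quadratic and cubic kernels. In order to compactly write the expressions, it is convenient to introduce the cosine between $\mathbf{k}_i$ and $\mathbf{k}_j$,

$$\mu_{ij} \equiv \frac{\mathbf{k}_i \cdot \mathbf{k}_j}{k_i k_j}, \tag{2.15}$$

as well as another cosine, between $\mathbf{q}-\mathbf{p}$ and $\mathbf{p}$,

$$\mu_- \equiv \frac{(\mathbf{q}-\mathbf{p}) \cdot \mathbf{p}}{|\mathbf{q}-\mathbf{p}|\, p}. \tag{2.16}$$

• Quadratic operators: We have two operators, $\delta^2$ and $s_{ij}^2$. We do not consider the analogous operators involving $\Psi$, as they are much smaller (suppressed by small non-Gaussianity and loop order).

We can therefore associate the kernels $\delta^2 \to 1$ and $s_{ij}^2 \to \mu_-^2 - \frac{1}{3}$, where $\mu_-$ is defined in eq. (2.16).

On top of these, there is the standard SPT kernel, $F_2$ (see App. A for details).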

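• Cubic operators: We have the following kernels (they are symmetrized):

$$\begin{aligned}
\delta^3 &\to 1,\\
\delta\, s_{ij}^2 &\to R(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3) = \frac{1}{3}\left( \mu_{12}^2 + \mu_{23}^2 + \mu_{13}^2 - 1 \right),\\
\nabla_i \nabla_j \Phi_\delta\, \nabla_j \nabla_k \Phi_\delta\, \nabla_k \nabla_i \Phi_\delta &\to \mu_{12}\, \mu_{23}\, \mu_{13},
\end{aligned} \tag{2.17}$$

with $\mu_{ij}$ defined in eq. (2.15). The kernel for $\mathrm{Tr}[\Pi^{(1)} \Pi^{(2)}]$ reads

$$\mathrm{Tr}[\Pi^{(1)} \Pi^{(2)}] \to R(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3) = \frac{1}{3}\left[ \left( \frac{\mathbf{k}_1 \cdot (\mathbf{k}_2 + \mathbf{k}_3)}{k_1 |\mathbf{k}_2 + \mathbf{k}_3|} \right)^2 \left[ G_2(\mathbf{k}_2, \mathbf{k}_3) - F_2(\mathbf{k}_2, \mathbf{k}_3) \right] + \text{perms} \right], \tag{2.18}$$

with $G_2$ the second-order kernel for the velocity (see [53]). Plugging in the expressions for $F_2$ and $G_2$, one finds

$$\mathrm{Tr}[\Pi^{(1)} \Pi^{(2)}] \to R(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3) = \frac{2}{21}\left[ \left( \frac{\mathbf{k}_1 \cdot (\mathbf{k}_2 + \mathbf{k}_3)}{k_1 |\mathbf{k}_2 + \mathbf{k}_3|} \right)^2 \left[ \mu_{23}^2 - 1 \right] + \text{perms} \right]. \tag{2.19}$$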
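The step from eq. (2.18) to eq. (2.19) relies on the difference of the second-order SPT velocity and density kernels. As a quick numerical check, the sketch below uses the standard textbook expressions for the symmetrized kernels $F_2$ and $G_2$ (they are not written out in this text, so their explicit form here is an assumption taken from standard perturbation theory) and verifies that $G_2 - F_2 = \frac{2}{7}(\mu^2 - 1)$, which gives the prefactor $\frac{1}{3} \times \frac{2}{7} = \frac{2}{21}$.

```python
import math

def f2(k1, k2, mu):
    """Standard symmetrized second-order SPT density kernel; mu is the cosine between k1 and k2."""
    return 5.0 / 7.0 + 0.5 * mu * (k1 / k2 + k2 / k1) + 2.0 / 7.0 * mu ** 2

def g2(k1, k2, mu):
    """Standard symmetrized second-order SPT velocity-divergence kernel."""
    return 3.0 / 7.0 + 0.5 * mu * (k1 / k2 + k2 / k1) + 4.0 / 7.0 * mu ** 2

def kernel_difference(mu):
    """Closed form of G2 - F2, which carries the mu^2 - 1 factor of eq. (2.19)."""
    return 2.0 / 7.0 * (mu ** 2 - 1.0)

if __name__ == "__main__":
    for k1, k2, mu in [(0.1, 0.3, 0.2), (1.0, 2.0, -0.7), (0.05, 0.05, 0.9)]:
        print(g2(k1, k2, mu) - f2(k1, k2, mu), kernel_difference(mu))
```

The same kernel also satisfies $F_2(\mathbf{k}, -\mathbf{k}) = 0$, which corresponds to `f2(k, k, -1.0)` in the sketch.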

#### 2.2.5 The halo-halo power spectrum

In this section, we will combine everything to compute $P_{hh}$ at first order in loops and in the non-Gaussian terms. We have already computed the tree level contribution

$$P^{\rm Tree}_{hh}(q) = b_{\rm full}^2(q)\, P_G(q) + P_{\epsilon_0}, \tag{2.20}$$

with $b_{\rm full}$ defined in eq. (2.11).

The next step is to remove the divergent quantities in the loop contributions (that would not have been there if we had worked with renormalized quantities from the beginning). Once this is taken into account (see App. A), the expression for the halo power spectrum reads

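$$\begin{aligned}
P_{hh}(q) ={}& P^{\rm Tree}_{hh}(q) + b_{\rm full}(q)^2\, P_{\rm loop}(q)\\
&+ 4\, b_{\rm full}(q) \left\{ \int_{\mathbf{p}} \left[ b_{\delta^2} + b_{s^2}\left( \mu_-^2 - \tfrac{1}{3} \right) \right] \left[ P_G(|\mathbf{q}-\mathbf{p}|) - P_G(p) \right] P_G(p) \right\}\\
&+ 2 \int_{\mathbf{p}} \left[ b_{\delta^2} + b_{s^2}\left( \mu_-^2 - \tfrac{1}{3} \right) \right]^2 \left[ P_G(|\mathbf{q}-\mathbf{p}|) - P_G(p) \right] P_G(p)\\
&+ 6\, b_{\rm full}(q)\, b_{\Pi\Pi^{(2)}}\, P_G(q) \int_{\mathbf{p}} \left( \frac{2}{7} \left[ \left( \frac{\mathbf{q} \cdot \mathbf{p}}{q p} \right)^2 - 1 \right] \frac{2 \mu_-^2}{3} + \frac{8}{63} \right) P_G(p),
\end{aligned} \tag{2.21}$$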

where $P_{\rm loop}$ is the standard one-loop contribution to the power spectrum [53]. This expression is similar to the results in [34], with the addition of the PNG. Notice that out of the type 2) and 3) diagrams, only one remains after renormalization (last line). Indeed, when the divergent parts are removed, all the non-zero cubic terms are proportional to each other, as already noted in [34]. This is why we regroup them into a single term, $b_{\Pi\Pi^{(2)}}$ (cf. Appendix A). However, we cannot do the same for the terms in the second and third lines of eq. (2.21), because $\mu_-$ depends on $\mathbf{q}$ and $\mathbf{p}$.
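One way to see the effect of the renormalization in the last line of eq. (2.21) is to check that the constant $8/63$ cancels the angular average of the first term in the limit $q \to 0$. The sketch below does this numerically with a simple midpoint rule. The variable names (`r` for the ratio $q/p$, `c` for the cosine between $\mathbf{q}$ and $\mathbf{p}$) and the integration scheme are choices made here for illustration only.

```python
def mu_minus_sq(r, c):
    """Squared cosine between q - p and p, eq. (2.16), with r = q/p and c = cos(angle(q, p))."""
    return (r * c - 1.0) ** 2 / (1.0 + r * r - 2.0 * r * c)

def pi_pi_integrand(r, c):
    """Bracket multiplying P_G(p) in the last line of eq. (2.21)."""
    return (2.0 / 7.0) * (c * c - 1.0) * (2.0 / 3.0) * mu_minus_sq(r, c) + 8.0 / 63.0

def angular_average(r, n=2000):
    """Average of the integrand over c in [-1, 1] using a midpoint rule with n cells."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        c = -1.0 + (i + 0.5) * h
        total += pi_pi_integrand(r, c)
    return 0.5 * total * h

if __name__ == "__main__":
    for r in (1e-3, 0.1, 0.5):
        print(r, angular_average(r))
```

For small `r` the average tends to zero, so the term carries no scale-independent piece that could be confused with a shift of the linear bias.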

To better gauge the size of those different terms, we plot them in Fig. 2 at , choosing , , and all the other bias parameters equal to one. The cosmology is set to that of Planck 2015 [54], and the linear power spectrum is obtained from CAMB [55].

## 3 Forecasting methodology: $f_{\rm NL}$ constraints from the halo power spectrum

In order to estimate the constraining power of future surveys, we will resort to Markov chain Monte Carlo (MCMC) sampling, using the Python package emcee [56]. To this end, we need the likelihood function, i.e. the probability to observe a halo power spectrum given a set of parameters $\theta$. Since we want to focus on the bias expansion of eqs. (2.11) and (2.21), we will fix all the cosmological parameters. Therefore, our set of parameters is given by

$$\theta = \{ b_\delta, b_{q^2}, b_{q^4}, \dots, b_{s^2}, b_{\delta^2}, b_{\Pi\Pi}, \Delta, P_{\epsilon_0} \}. \tag{3.1}$$

We approximate the likelihood by that of the power spectrum of a Gaussian galaxy density field, but using the full non-linear expression for the theory power spectrum, i.e. the power spectrum given in eq. (2.21). Given an observed (or in our case fiducial) spectrum $\hat{P}(k)$, this leads to the likelihood,

$$\log \mathcal{L}(\hat{P}(k) | \theta) = -\frac{N_k}{2} \left[ \frac{\hat{P}(k)}{P_{\rm th}(k, \theta)} - \log\left( \frac{\hat{P}(k)}{P_{\rm th}(k, \theta)} \right) \right], \qquad N_k \equiv \frac{k^2 V}{2\pi^2}\, \Delta k, \tag{3.2}$$

where $N_k$ is the number of modes in a bin for a survey of volume $V$. We have ignored terms that do not vary with $\theta$ and therefore will not modify our exploration of the parameter space. The power spectrum thus is not a Gaussian variable, but follows a $\chi^2$ distribution. This is particularly important for large scales, which are crucial for local non-Gaussianity, as the number of modes is low so that the central limit theorem does not apply, cf. [57].
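A minimal Python sketch of the likelihood in eq. (3.2) is given below. It assumes that the $k$-bins are independent, so that the total log-likelihood is the sum of the per-bin expression over bins; this summation, the function names and the numbers in the usage lines are assumptions made for illustration. A function of this form could serve as the log-probability passed to an MCMC sampler, once combined with the theory power spectrum and the priors.

```python
import math

def n_modes(k, volume, dk):
    """Number of modes in a bin, N_k = k^2 V dk / (2 pi^2), as in eq. (3.2)."""
    return k ** 2 * volume * dk / (2.0 * math.pi ** 2)

def log_like(p_hat, p_th, ks, volume, dk):
    """Sum over bins of -N_k/2 [P_hat/P_th - log(P_hat/P_th)] (bins assumed independent)."""
    total = 0.0
    for k, ph, pt in zip(ks, p_hat, p_th):
        ratio = ph / pt
        total += -0.5 * n_modes(k, volume, dk) * (ratio - math.log(ratio))
    return total

if __name__ == "__main__":
    ks = [0.01, 0.05, 0.1]                   # h/Mpc, placeholder bins
    fiducial = [30000.0, 20000.0, 8000.0]    # (Mpc/h)^3, placeholder spectrum
    shifted = [1.1 * p for p in fiducial]
    print(log_like(fiducial, fiducial, ks, 1e9, 0.01))
    print(log_like(fiducial, shifted, ks, 1e9, 0.01))
```

Because $x - \log x$ is minimized at $x = 1$, the log-likelihood peaks when the theory spectrum equals the fiducial one, and the penalty for a given mismatch grows with the number of modes in the bin.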

The cosmology is given by a $\Lambda$CDM model with parameters given by the latest Planck release [54], and we will not vary them as we want to focus only on the potential degeneracy within the bias expansion. Our parameter estimations are for a survey of volume $V$ at a fixed mean redshift. We will consider modes up to a maximum wave vector $q_{\rm max}$, chosen so that the truncation of the gradient expansion is sensible, with bin width $\Delta k$. The fiducial linear bias $b_{\delta,{\rm fid}}$, the Lagrangian radius $R_*$ and the galaxy density $\bar{n}$ (which gives the shot noise and fixes the fiducial value of $P_{\epsilon_0}$) are determined using the halo mass function and bias at that redshift [58, 59]. We will use a fixed minimum halo mass (see Section 6, and in particular Fig. 8, for a discussion of this choice), which implies,

$$b_{\delta,{\rm fid}} = 3.6, \qquad R_* = 3.7\ {\rm Mpc}/h, \qquad \bar{n} = 1.2 \times 10^{-4}\ (h/{\rm Mpc})^3. \tag{3.3}$$

This gives a , well within the linear regime at redshift .

We will take flat priors for all the free parameters. For the bias expansion, in particular the gradient terms, we will impose a flat prior of order unity (see Table 1). Indeed, because we explicitly put the scaling with $R_*$ in the spatial derivatives, the coefficients in front of them should be of order one. This is why the gradient expansion is not completely degenerate with the primordial non-Gaussianity contribution to the scale-dependent bias, which evolves on a different spatial scale, set by matter-radiation equality, $k_{\rm eq}^{-1}$. We will explore this in more detail in the next section.

The fiducial parameter values and prior ranges are listed in Table 1. Although for realistic halos, the non-linear and non-local biases are expected to deviate from zero (see [60, 61, 62, 32] for assessment of a subset of these from simulations), we do not expect constraints on primordial non-Gaussianity to be very sensitive to the exact choice of fiducial values, so we set them to zero for simplicity.

The truncation order for the gradient expansion will be determined as the order when adding new gradient terms does not change the constraint on . As we will see, for , this corresponds to including terms up to .

## 4 Equilateral non-Gaussianity

We first consider constraints on scale-dependent bias due to equilateral non-Gaussianity, characterized by $\Delta = 2$. According to eq. (2.6), this type of non-Gaussianity leads to a scale dependence

$$b_{\rm NG}(q) \propto \frac{1}{T(q)}, \tag{4.1}$$

where $T(q)$ is the transfer function of matter perturbations, normalized to one when $q \to 0$. Since the scale-independent part of $b_{\rm NG}$ is unobservable (because it is completely degenerate with the linear bias), this means that the signal is dominated by small scales, $q \gtrsim k_{\rm eq}$, where $k_{\rm eq}$ is the matter-radiation equality scale. This stands in strong contrast with the more commonly studied halo bias due to local non-Gaussianity, which is dominated by large scales. This can readily be seen in Fig. 2.

Since there is a large number of modes available on these small scales, in principle we expect scale-dependent bias to strongly constrain $f_{\rm NL}^{\rm Eq}$. On the other hand, if we expand the scale-dependent bias signal around $q = 0$,

$$b_{\rm NG}(q) \propto 1 + c_2 \left( \frac{q}{k_{\rm eq}} \right)^2 + \mathcal{O}\!\left( \left( \frac{q}{k_{\rm eq}} \right)^4 \right), \tag{4.2}$$

we find a scale dependence that will to a certain extent be degenerate with the non-local and even non-linear bias terms discussed in Section 2.2. However, the characteristic scale for the primordial non-Gaussianity signal is different from that of the non-primordial terms, which is determined by the size of the halos. Thus, the degeneracy may not be exact, and below we investigate quantitatively what the expected constraints on $f_{\rm NL}^{\rm Eq}$ are with and without the inclusion of non-linear and non-local biasing.

### 4.1 Optimistic forecast, $b_\delta$ and $f_{\rm NL}$ only

As a starting point, we consider constraints on $f_{\rm NL}^{\rm Eq}$ marginalized only over the linear, Gaussian bias $b_\delta$. This corresponds to the case of the other bias parameters being known to vanish. It is also the standard approach to forecasting constraints on local non-Gaussianity, which relies on very large scales, where additional bias corrections are small. For the fiducial galaxy sample discussed in Section 3, Figure 3 shows in green the uncertainty in $f_{\rm NL}^{\rm Eq}$ as a function of the survey volume. The solid curves assume our default choice of $q_{\rm max}$ and, for comparison, the dashed curves show the constraints for a different cutoff (we will discuss what is an appropriate cutoff choice in Section 4.2). Since there is a strong signal on small scales, the results are very sensitive to the choice of $q_{\rm max}$.

We also show in dashed red the current value of from the bispectra of CMB fluctuations, measured by Planck [1]. For our preferred choice , we see that a survey with volume Gpc, comparable with next-generation galaxy surveys, appears to improve the constraint significantly compared to the CMB measurement: vs . Thus, scale-dependent bias in theory appears as a promising way of improving our knowledge of equilateral non-Gaussianity.

### 4.2 Including non-local bias

We now add the non-local (gradient) bias terms,

$$\delta_h(\mathbf{q}) \supset \sum_{n\ {\rm even}}^{n_{\rm max}} b_{q^n}\, (R_* q)^n\, \delta^{(1)}(\mathbf{q}), \tag{4.3}$$

see Section 2.2, and marginalize over the bias parameters $b_{q^n}$. As discussed in the beginning of this section, these gradient bias terms come with a typically smaller characteristic length scale, $R_*$, than the characteristic scale for the onset of equilateral scale-dependent bias, $k_{\rm eq}^{-1}$. However, without restrictions on the coefficients in eq. (4.3), if we simply expand both the signal, eq. (4.2), and the gradient bias in powers of $q$, it is clear that they have exactly the same form, i.e. we can mimic the effect of $f_{\rm NL}^{\rm Eq}$ exactly with the gradient bias expansion by absorbing the difference in characteristic scales in the coefficients $b_{q^n}$. This is a fine-tuning that does not reflect the physical difference between the two contributions. This physical difference is therefore imposed by our prior of order unity on the parameters, see Table 1.
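The role of the order-unity prior can be illustrated with a short piece of arithmetic. Matching the $q^2$ term of the signal, $A\, c_2\, (q/k_{\rm eq})^2$, with the gradient term $b_{q^2} (q R_*)^2$ requires $b_{q^2} = A\, c_2 / (k_{\rm eq} R_*)^2$. The sketch below evaluates this coefficient. The amplitude `amplitude` (standing for $A$), the coefficient `c2`, the value of `k_eq` and the prior bound are all hypothetical inputs chosen for illustration; only $R_* = 3.7\ {\rm Mpc}/h$ is taken from eq. (3.3).

```python
def required_gradient_coefficient(amplitude, c2, k_eq, R_star):
    """b_{q^2} for which b_{q^2} (q R_*)^2 equals amplitude * c2 * (q / k_eq)^2."""
    return amplitude * c2 / (k_eq * R_star) ** 2

def within_prior(value, bound):
    """True if |value| lies inside a flat prior [-bound, bound]."""
    return abs(value) <= bound

if __name__ == "__main__":
    # Illustrative numbers: k_eq in h/Mpc is an assumed value, R_star is from eq. (3.3).
    b_q2 = required_gradient_coefficient(amplitude=1.0, c2=1.0, k_eq=0.015, R_star=3.7)
    print(b_q2, within_prior(b_q2, bound=5.0))
```

With these illustrative inputs, $k_{\rm eq} R_* \approx 0.06$, so an order-one signal amplitude would need a gradient coefficient of a few hundred, far outside an order-unity prior. Only signal amplitudes suppressed by roughly $(k_{\rm eq} R_*)^2$ can be absorbed, which is why the degeneracy is not exact once the prior is imposed.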

In general, the gradient bias expansion is expected to converge if the non-locality scale associated with halo formation is small compared to the wavelength of the modes of interest, $q R_* \ll 1$. Assuming this is satisfied, there must be some finite truncation of the expansion, $n_{\rm max}$, such that including terms beyond $n_{\rm max}$ negligibly affects the results, and in particular $\sigma(f_{\rm NL}^{\rm Eq})$. We study this convergence for several choices of the cutoff $q_{\rm max}$ in Figure 4. The dots show the constraints using the model with $b_\delta$, $f_{\rm NL}^{\rm Eq}$ and the gradient bias expansion as a function of $n_{\rm max}$, for a fixed survey volume.

We see that for , the results have converged by . For , while the bound appears to converge by , the cutoff is clearly outside the regime where the gradient expansion can be expected to make sense. For smaller cutoff, , convergence is reached early, at , but since this cutoff means using a much smaller number of modes, the constraint is significantly weakened. As a compromise between the gradient expansion being well behaved, and using as many modes as possible, we choose as our default value for the rest of this work the cutoff . As discussed in Section 2.1, our expressions for the scale-dependent bias assume the squeezed-limit bispectrum behavior, which technically is only valid for . We thus note that for the smallest modes we include, there may be non-negligible corrections to the scale-dependent bias that we do not take into account.

Assuming the default cutoff motivated above, the orange curve in Figure 3 shows the constraint on as a function of survey volume when including the gradient expansion. We see that marginalization over non-local bias severely weakens the bound on non-Gaussianity, by about a factor of .

### 4.3 Including non-linearities

Finally, we include the loop terms from non-linear bias and evolution, discussed in Section 2.2, marginalizing over the corresponding parameters. The orange crosses in Figure 4 confirm that even in this more general scenario, our truncation of the gradient expansion at is appropriate. The purple curve in Figure 3 shows the final constraints as a function of survey volume. The degradation factor relative to the simple case without non-linear or non-local bias is about , slightly worse than with non-local biasing only. We note that it would be wrong to conclude from this that non-local bias is necessarily more degenerate with scale-dependent bias than the non-linear terms. If we add the non-linear terms first, and then include the non-local terms, we would again find that the first step gives the biggest deterioration (see the magenta curve in Fig. 3).

### 4.4 Summary

While a “naive” forecast suggests scale-dependent bias in galaxy surveys may improve the constraint on beyond the current CMB error bar to , implementing a general treatment of bias and non-linear evolution leads to strong degeneracies, weakening the constraints by roughly a factor leading to for a survey volume Gpc. On the bright side, we do find that the degeneracy is not exact, and that even after marginalization, information on remains in scale-dependent bias. Unfortunately, the degraded uncertainties are at least an order of magnitude above the current Planck constraint, even assuming very large survey volumes of order a few Gpc.

The forecasts in this section assume the use of a single halo sample. When multiple samples with different biases and with high number density are available, it is possible to evade the cosmic variance bound and to strongly improve the constraints [36, 63]. We will study in Section 6 if this method can make constraints on from scale-dependent bias more competitive with the CMB. We also point out that for equilateral non-Gaussianity the halo bispectrum [64, 65, 66] is expected to do better than the halo power spectrum because it makes use of primordial bispectrum configurations beyond the squeezed limit, and the equilateral signal is dominated by those triangles (unlike local non-Gaussianity). Even for the halo bispectrum, however, degeneracies with non-linear evolution and biasing place limits on what can be achieved [25].

## 5 Beyond single-field inflation

### 5.1 Particle physics and the squeezed limit

Inflation is thought to have occurred at energies such that GeV. During inflation, any particles with masses are excited from the vacuum and can potentially impact the evolution of the fluctuations we ultimately observe. These extra fields are particularly important in the squeezed limit as they can lead to violations of the single-field consistency relations [22, 67]. There are good reasons to think a plethora of new particles could appear at these energies and we should take this possibility seriously [22, 67, 68, 69, 70, 71].

The most well-studied possibility is local non-Gaussianity, which arises most commonly from scalar fields with . In this case, the primordial perturbation of the gravitational potential can be written as

$$\varphi = \varphi_G + f_{\rm NL}\left(\varphi_G^2 - \langle \varphi_G^2 \rangle\right), \tag{5.1}$$

where $\varphi_G$ is a Gaussian variable with power spectrum $P_\varphi$. In the squeezed limit, local non-Gaussianity takes the form

$$\lim_{k_1 \to 0} B(k_1, k_2, k_3) = \langle \varphi(k_1)\,\varphi(k_2)\,\varphi(k_3) \rangle \to 4 f_{\rm NL}\, P_\varphi(k_1)\, P_\varphi(k_2) \tag{5.2}$$

where forces . For (known as Quasi-Single Field, or QSF), the extra fields do not directly contribute to the dynamics at the end of inflation (often responsible for local non-Gaussianity) but lead to non-trivial mode coupling at horizon crossing. This leads to a squeezed bispectrum [22, 52] (footnote 7: this expression is only valid for ; the expression for can be found in e.g. [52])

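$$\lim_{k_1 \to 0} B(k_1, k_2, k_3) \to -\frac{18\sqrt{3}}{\pi}\, \frac{\Gamma(3/2 - \Delta)}{2^{\Delta}\, N_{3/2-\Delta}(8/27)}\, f_{\rm NL}^{(\Delta)} \left(\frac{k_1}{k_2}\right)^{\Delta} P_\varphi(k_1)\, P_\varphi(k_2) \tag{5.3}$$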

where and is the second-kind Bessel function of order . For , we find weaker power law scaling in the squeezed limit, with . For this behavior becomes oscillatory and the amplitude is exponentially suppressed [72, 70]. In both cases, the overall power law is determined by the time evolution of the wave-function for the massive fields outside the horizon. The suppression in the squeezed limit is ultimately due to the decay of the massive field between the time of horizon crossing of the short and long wavelength modes [67]. In these simple models, the power law never reaches the single-field limit because we must reproduce the dilution expected for very massive particles [70]. However, by using interactions, we can modify the time evolution to achieve any  [73].
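As a consistency check, the prefactor in eq. (5.3) can be recovered from the interpolating template of eq. (5.7) in Section 5.3. The sketch below rests on assumptions that are not spelled out at this point in the text: the identification $\nu = 3/2 - \Delta$ (read off from the Bessel-function indices in eqs. (5.3) and (5.7)), a scale-invariant spectrum $P_\varphi(k) = C_\varphi / k^3$, the identification of $f_{\rm NL}$ in eq. (5.7) with $f_{\rm NL}^{(\Delta)}$, and the standard small-argument form of the second-kind Bessel function for $\nu > 0$.

```latex
% Assumptions (ours): \nu = 3/2 - \Delta,  P_\varphi(k) = C_\varphi / k^3,
% and N_\nu(x) \simeq -\frac{\Gamma(\nu)}{\pi} (2/x)^{\nu}  for  \nu > 0,  x \to 0.
% Squeezed configuration: k_1 \to 0 with k_2 = k_3 = k.
\begin{align}
\frac{8 k_1 k_2 k_3}{(k_1 + k_2 + k_3)^3}
  &\to \frac{8 k_1 k^2}{(2k)^3} = \frac{k_1}{k}, \\
N_\nu\!\left(\frac{k_1}{k}\right)
  &\simeq -\frac{\Gamma(\nu)}{\pi}\, 2^{\nu} \left(\frac{k}{k_1}\right)^{\nu}, \\
(k_1 k_2 k_3)^{3/2} (k_1 + k_2 + k_3)^{3/2}
  &\to 2^{3/2}\, k_1^{3/2}\, k^{9/2}, \\
F(k_1, k, k)
  &\to -\frac{3^{3/2}}{\pi}\, \frac{\Gamma(\nu)\, 2^{\nu - 3/2}}{N_\nu(8/27)}\,
      f_{\rm NL}\, \frac{(k/k_1)^{\nu}}{k_1^{3/2}\, k^{9/2}}
   = -\frac{3\sqrt{3}}{\pi}\, \frac{\Gamma(3/2 - \Delta)}{2^{\Delta}\, N_{3/2-\Delta}(8/27)}\,
      f_{\rm NL} \left(\frac{k_1}{k}\right)^{\Delta} \frac{1}{k_1^3\, k^3}, \\
B_\varphi = 6\, C_\varphi^2\, F
  &\to -\frac{18\sqrt{3}}{\pi}\, \frac{\Gamma(3/2 - \Delta)}{2^{\Delta}\, N_{3/2-\Delta}(8/27)}\,
      f_{\rm NL} \left(\frac{k_1}{k}\right)^{\Delta} P_\varphi(k_1)\, P_\varphi(k).
\end{align}
```

The last line reproduces eq. (5.3) with $k_2 = k$. Because the small-argument form used in the second line requires $\nu > 0$, this check does not extend to the endpoint $\nu = 0$.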

### 5.2 Constraints on $f_{\rm NL}^{(\Delta)}$ ($\Delta = 0$–$2$)

Here we study constraints on scale-dependent bias with arbitrary scale-dependence, , where, in summary, $\Delta = 0$ corresponds to local non-Gaussianity, $\Delta = 2$ to equilateral non-Gaussianity, and the range to quasi-single field inflation (QSFI) with a range of masses.

In order to understand the statistical power of scale-dependent bias to constrain these models, it is useful to consider the constraints where the amplitude in the squeezed limit is fixed as a function of . Specifically, in this section we will assume

$$b_{\rm NG}(q) = 6\, f_{\rm NL}^{(\Delta)}\, (b_\delta - 1)\, \delta_c\, (q R_*)^{\Delta}\, M^{-1}(q), \tag{5.4}$$

where we will take to vary independently of . The case corresponds to local non-Gaussianity, although with a slightly unusual normalization . In this limit, the signal-to-noise is dominated by the largest scales, which are robust to non-linear evolution and biasing. However, because of the non-trivial -dependence of there is also potentially information available at large that is not degenerate with the bias parameters. As we increase , the amount of information available at decreases and (naively) increases at .
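The normalization of eq. (5.4) at $\Delta = 0$ can be related to the usual local amplitude by a short sketch. It assumes (our assumption, not a statement from the text) that the scale-dependent bias is linear in the squeezed-limit amplitude, with the same factor of $1/2$ that relates the prefactor of eq. (5.3) to that of eq. (5.8). The symbols $A_{\rm sq}$ and $f_{\rm NL}^{\rm loc}$ are introduced here only for illustration.

```latex
% Assumption (ours): if  B \to A_{\rm sq}\,(k_1/k_2)^{\Delta} P_\varphi(k_1) P_\varphi(k_2)  in the squeezed limit,
% then  b_{\rm NG}(q) = \tfrac{1}{2} A_{\rm sq}\,(b_\delta - 1)\,\delta_c\,(q R_*)^{\Delta} M^{-1}(q),
% as read off by comparing the prefactors 18\sqrt{3}/\pi in (5.3) and 9\sqrt{3}/\pi in (5.8).
\begin{align}
\text{eq. (5.2):}\quad A_{\rm sq} &= 4 f_{\rm NL}^{\rm loc}
  \;\Longrightarrow\;
  b_{\rm NG}(q) = 2 f_{\rm NL}^{\rm loc}\, (b_\delta - 1)\, \delta_c\, M^{-1}(q), \\
\text{eq. (5.4) at } \Delta = 0:\quad
  b_{\rm NG}(q) &= 6 f_{\rm NL}^{(0)}\, (b_\delta - 1)\, \delta_c\, M^{-1}(q), \\
\Longrightarrow\quad f_{\rm NL}^{(0)} &= \tfrac{1}{3}\, f_{\rm NL}^{\rm loc}.
\end{align}
```

Under this assumption, a bound on $f_{\rm NL}^{(\Delta)}$ at $\Delta = 0$ should be multiplied by 3 before comparing it with bounds quoted in the conventional local normalization.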

The constraint on for varying is shown in Figure 5. We see that when , the results are unaffected by marginalization over bias parameters, which means most of the constraining power is coming from large scales (for example, one can see in Fig. 2 that the other contributions to are very different from that of ). However, for , we find much weaker constraints when marginalizing over higher order biases.

We can understand the forecasts qualitatively from the likelihood function in eq. (3.2). For a small deviation from the hypothetical model, we have

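$$\log L(\hat{P}(k)\,|\,\theta) \approx -\frac{k^3 V}{4\pi^2}\, \Delta \ln k\; \frac{1}{2} \left(\frac{\delta P(k)}{P_{\rm th}(k, \theta)}\right)^2 . \tag{5.5}$$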

If we consider only the change from the primordial non-Gaussianity, we have and therefore the likelihood scales like

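$$\log L(\hat{P}(k)\,|\,\theta) \approx -\frac{18\, k^3 V}{\pi^2}\, \Delta \ln k\; \frac{(b_\delta - 1)^2}{b_\delta^2} \left(f_{\rm NL}^{(\Delta)}\right)^2 \delta_c^2\, (k R_*)^{2\Delta}\, M^{-2}(k), \tag{5.6}$$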

where we have assumed a sample-variance limited measurement. As , and we see therefore that for , the signal-to-noise at low- is dominated by the smallest (largest scales). However, is not simply a power law and for the grows more slowly than and therefore the signal-to-noise increases with increasing . As a result, we expect there to be contributions to the signal both at largest and smallest scales in the survey for . Increasing makes the contribution to the signal increasingly dominated by the smallest scales, especially for .
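The step from eq. (5.5) to eq. (5.6) can be sketched as follows. The sketch assumes a tracer power spectrum of the form $P_{\rm th}(k) = [b_\delta + b_{\rm NG}(k)]^2 P_m(k)$, with $P_m$ the matter power spectrum, linear bias only and negligible shot noise. This form is our assumption for illustration and is not quoted from the text. The non-Gaussian bias $b_{\rm NG}$ is taken from eq. (5.4) and treated as a small perturbation.

```latex
% Assumption (ours): P_{\rm th}(k) = [b_\delta + b_{\rm NG}(k)]^2 P_m(k), shot noise neglected,
% and b_{\rm NG} \ll b_\delta so that only the term linear in b_{\rm NG} is kept.
\begin{align}
\delta P(k) &\simeq 2\, b_\delta\, b_{\rm NG}(k)\, P_m(k)
  \;\Longrightarrow\;
  \frac{\delta P(k)}{P_{\rm th}(k)} \simeq \frac{2\, b_{\rm NG}(k)}{b_\delta}
  = 12\, f_{\rm NL}^{(\Delta)}\, \frac{b_\delta - 1}{b_\delta}\, \delta_c\, (k R_*)^{\Delta}\, M^{-1}(k), \\
\frac{1}{2} \left(\frac{\delta P(k)}{P_{\rm th}(k)}\right)^2
  &\simeq 72\, \frac{(b_\delta - 1)^2}{b_\delta^2} \left(f_{\rm NL}^{(\Delta)}\right)^2
     \delta_c^2\, (k R_*)^{2\Delta}\, M^{-2}(k), \\
\log L &\approx -\frac{k^3 V}{4\pi^2}\, \Delta \ln k \times 72\, \frac{(b_\delta - 1)^2}{b_\delta^2}
     \left(f_{\rm NL}^{(\Delta)}\right)^2 \delta_c^2\, (k R_*)^{2\Delta}\, M^{-2}(k)
  = -\frac{18\, k^3 V}{\pi^2}\, \Delta \ln k\, \frac{(b_\delta - 1)^2}{b_\delta^2}
     \left(f_{\rm NL}^{(\Delta)}\right)^2 \delta_c^2\, (k R_*)^{2\Delta}\, M^{-2}(k).
\end{align}
```

The numerical prefactor follows from $72/4 = 18$, and the per-mode signal-to-noise carries the scale dependence $k^{3 + 2\Delta} M^{-2}(k)$ discussed above.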

Thus, the best constraints are for close to zero, where adding other bias terms does not significantly alter . This is because for low , the signal peaks at large scales which are linear and do not get large contributions from the rest of the bias expansion (see Fig. 2). By contrast, small scales are the most non-linear, and degeneracies with biasing weaken the constraints there. Therefore, when approaching the equilateral case, the constraints notably worsen for the full bias expansion. The peak in the curve when marginalizing solely over the linear bias comes from the fact that for close to (where most of the signal is for these values of ), the transfer function roughly behaves as , so that . Thus, for , there is enhanced degeneracy with the linear bias. As we increase , it will become less degenerate with but more degenerate with the higher order biases, which is reflected in the differences in the forecasts for different biasing models.

The scaling properties for these forecasts follow from eq. (5.5) and, in particular, the noise level set by the cosmic variance of the tracer power spectrum. However, mode-coupling occurs on a mode-by-mode basis and can, in principle, be measured without cosmic variance. We will discuss the multi-tracer approach to cosmic variance cancellation in Section 6, but we should anticipate that the forecasts with behave qualitatively differently because of the change of the noise properties.

### 5.3 Constraints on quasi-single field inflation

Having understood the scaling behavior of the signal-to-noise for generic scale-dependent bias, we are now in a position to understand the corresponding constraints on QSFI. We now must take into account the change to the normalization of the squeezed limit, given by eq. (5.3).

Planck has put constraints on non-Gaussianity coming from QSF inflation [1], using an expression for the bispectrum that interpolates between the local and equilateral shapes. The expression is given by [22]

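$$B_\varphi(k_1, k_2, k_3) \equiv 6\, C_\varphi^2\, F(k_1, k_2, k_3), \qquad F(k_1, k_2, k_3) \equiv \frac{3^{3/2}}{N_\nu(8/27)}\, f_{\rm NL}\, \frac{N_\nu\!\left[8 k_1 k_2 k_3 / (k_1 + k_2 + k_3)^3\right]}{(k_1 k_2 k_3)^{3/2}\, (k_1 + k_2 + k_3)^{3/2}}, \tag{5.7}$$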

where . When looking at the squeezed limit, one recovers eq. (5.3), which leads to a scale-dependent bias similar to eq. (2.7) (footnote 8: large-scale structure probes of quasi-single field inflation need not use the squeezed limit to place constraints on ; see [74, 75, 76] for some examples)

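$$\lim_{k \to 0} b_{\rm NG}^{\rm QSF}(k) = -\frac{9\sqrt{3}}{\pi}\, \frac{\Gamma(3/2 - \Delta)}{2^{\Delta}\, N_{3/2-\Delta}(8/27)}\, f_{\rm NL}^{(\Delta)}\, (b_\delta - 1)\, \delta_c\, (k R_*)^{\Delta}\, M^{-1}(k), \tag{5.8}$$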

which, as before, is only valid for . It diverges as , because the limit does not commute with the limit . There is a different expression in this case, given in [52]. Eq. (5.8) is shown mainly for comparison with eq. (2.7); we will not use it in our analysis, since we expect that for large values of , the signal peaks at scales where this approximation is not valid. Instead, we use the expression computed by integrating over the bispectrum (5.7) without taking the squeezed limit, which is given by (see [52])

$$b_{\rm NG}(k) = \frac{\delta_c\, [b_\delta - 1]}{2\, M(k)}\, \frac{I_{21}(k)}{\sigma_m^2}$$