# How to generate a significant effective temperature for cold dark matter, from first principles

## Abstract

I show how to reintroduce velocity dispersion into perturbation theory (PT) calculations of structure in the Universe, i.e., how to go beyond the pressureless fluid approximation, starting from first principles. This addresses a possible deficiency in uses of PT to compute clustering on the weakly non-linear scales that will be critical for probing dark energy. Specifically, I show how to derive a non-negligible value for the (initially tiny) velocity dispersion of dark matter particles, , where is the deviation of particle velocities from the local bulk flow. The calculation is essentially a renormalization of the homogeneous (zero order) dispersion by fluctuations 1st order in the initial power spectrum. For power law power spectra with , the small-scale fluctuations diverge and significant dispersion can be generated from an arbitrarily small starting value – the dispersion level is set by an equilibrium between fluctuations generating more dispersion and dispersion suppressing fluctuations. For an power law normalized to match the present non-linear scale, the dispersion would be . This corresponds roughly to the slope on the non-linear scale in the real CDM Universe, but CDM contains much less initial small-scale power – not enough to bootstrap the small starting dispersion up to a significant value within linear theory (viewed very broadly, structure formation has actually taken place rather suddenly and recently, in spite of the usual “hierarchical” description). The next order PT calculation, which I carry out only at an order of magnitude level, should drive the dispersion up into balance with the growing structure, accounting for small dispersion effects seen recently in simulations.

## I Introduction

Observing the large-scale density fluctuations in the Universe is one of the best ways we have to approach many fundamental questions about the Universe, e.g., understanding inflation, dark matter, dark energy, the curvature of the Universe, neutrino masses, possible extra dimensions, modifications of gravity, etc. (e.g., Komatsu et al., 2009; Seljak et al., 2005a, 2006a, 2006b; Sefusatti and Komatsu, 2007; Afshordi et al., 2003; Desjacques and Seljak, 2009; Lam et al., 2009; Grossi et al., 2009; Barnaby and Huang, 2009; Taruya et al., 2008; McDonald and Seljak, 2008; McDonald, 2008; Scoccimarro, 2009; Chan and Scoccimarro, 2009; McDonald, 2009). Statistics of the current density fluctuations can be used to infer statistics of the small initial perturbations from which they grew, and in turn to understand the physics of the very early Universe. Measuring the evolution of large-scale structure (LSS) over time tells us about the present matter content of the Universe and the dynamical rules its evolution follows. Before we can learn anything about fundamental properties of the Universe, however, we must be able to compute the directly observable astrophysical quantities (e.g., the CMB Nolta et al. (2009), galaxy clustering statistics (Tegmark et al., 2006), Ly forest absorption (McDonald et al., 2005; McDonald et al., 2006a; Viel and Haehnelt, 2006; McDonald and Eisenstein, 2007; McDonald, 2003; McDonald et al., 2002), galaxy ellipticity correlations used to measure weak lensing (Hoekstra et al., 2006), galaxy cluster/Sunyaev-Zel’dovich effect (SZ) measurements (Aghanim et al., 2008), and possibly future 21cm surveys (Chang et al., 2008)) given a hypothetical underlying model.

As observational statistics become more and more precise, with the potential to measure more and more subtle differences between models, the requirements on the phenomenological theory needed for their interpretation become correspondingly more stringent. Currently, linear-order perturbation theory (Peebles, 1980; Seljak and Zaldarriaga, 1996; Kaiser, 1984, 1987) provides our primary means of calculating LSS observables for cosmological parameter estimation and model testing, with only ad hoc, and recently demonstrably inadequate, attempts to correct for non-linearity once it becomes non-negligible (Sánchez and Cole, 2008; Tegmark et al., 2006; Seljak et al., 2006a; Hamann et al., 2008; Komatsu et al., 2009) (a vast number of papers have been written on beyond-linear calculations, but most of these are never used in parameter estimation papers). Linear theory can robustly describe observations on very large scales or at very early times, but breaks down when the perturbations become too large on a given scale. When considering gravitational evolution only (i.e., dark matter only), numerical simulations can be used to compute the fully non-linear evolution of the density field to high accuracy (with a lot of care and computer power Heitmann et al. (2008a, b); Heitmann et al. (2009)), however, as discussed extensively in McDonald and Roy (2009), numerical simulations are a less than ideal tool for interpreting future precision observations, once one considers real observables which are inevitably influenced by baryonic effects (e.g., star formation, and the accompanying complication of general gas dynamics). To summarize the argument in McDonald and Roy (2009): Beyond linear order perturbation theory (PT) (Peebles, 1980; Juszkiewicz, 1981; Vishniac, 1983; Fry, 1984; Goroff et al., 1986; Scoccimarro and Frieman, 1996; Bernardeau et al., 2002; Jeong and Komatsu, 2006) should provide the primary means of interpreting very high precision future LSS data, just like linear theory has provided the primary means in the past. The range of scales over which higher order PT will be necessary (i.e., linear theory is insufficient) and sufficient (i.e., it will be accurate enough after computing a modest number of terms) will become larger as the precision of observations increases, while the chances of robustly, completely, convincingly describing the full precision of the observations using inevitably somewhat ad hoc prescriptions for star formation and related things in simulations becomes more remote. The key difference between PT and simulations is the fact that perturbative clustering can be completely described by a finite set of well-defined parameters, no matter how complicated the small-scale physics is (at least as conjectured in McDonald and Roy (2009)), while the need for fully non-linear calculations implies that there is generally no bound on the number or form of free parameters (more or less by definition). The idea that the importance of PT relative to simulations is increasing with time may seem backwards relative to conventional wisdom, however, my argument is that this conventional wisdom was developed for the era of not very precise observations, when corrections to linear theory were already too large to be described by PT by the time they were large enough to be statistically significant, i.e., past PT work was premature. To be clear, I am not saying that simulations will not be extremely useful, only that they are unlikely to be the leading tool for extracting fundamental cosmological information from very high precision observations (much like the situation in high energy physics, where lattice QCD simulations provide much qualitative, and recently even quite precise quantitative, insight Dürr et al. (2008), but high precision constraints on models are made primarily in the regime accessible to perturbation theory Yao and et al. (2006)).

Following the above line of reasoning, this paper is aimed at building up our ability to do calculations based on perturbation theory. It deals exclusively with the gravity-only part of the calculation, however, this should be viewed only as an intermediate goal. PT is most necessary for practical computations of observables which can not realistically be done from first principles in a simulation – understanding how to do computations for dark matter-only is simply a prerequisite for computing these observables. This continues a recent line of work related to renormalization/resummation methods Crocce and Scoccimarro (2006a, b); Matarrese and Pietroni (2007); Izumi and Soda (2007); Taruya and Hiramatsu (2008); McDonald (2007); Valageas (2008); Matsubara (2008a); Matarrese and Pietroni (2008); Crocce and Scoccimarro (2008); Pietroni (2008); Bernardeau et al. (2008); Bernardeau and Valageas (2008); Lesgourgues et al. (2009); Widrow et al. (2009); Carlson et al. (2009); Nishimichi et al. (2009). In my opinion, the key to maximizing the usefulness of all of this work is eventually connecting it to galaxy biasing and related complications Seljak et al. (2005b); McDonald (2006); Matsubara (2008b); McDonald and Roy (2009).

One traditional potential limitation of LSS perturbation theory, which I focus on addressing in this paper, is that it assumes the particles at a point in space all have exactly the same velocity Pueblas and Scoccimarro (2009). The equations used are those of a pressureless fluid. This is often referred to as the “single-stream approximation”, however, I will avoid this language as it assumes a certain picture of large-scale structure formation that may or may not have anything to do with reality. I say this because, for typically observed times, stream crossing is actually ubiquitous – in fact, there are many orders of magnitude in scale of deeply non-linear structure, to the point where the initial conditions on very small scales may be effectively forgotten. Of course, the standard language implicitly means no stream crossing when the field is in some sense smoothed on the typical scale where the perturbation theory is supposed to apply. When you look at it this way, however, it is clear that, if coldness is a good effective theory, it is not simply because the particles were initially cold, there is an additional requirement that the velocity field remains effectively cold on scales just smaller than the ones of interest.

To give a qualitative preview of the paper: The exact equation for the evolution of collisionless particles is the Vlasov equation for the full distribution function in 6-dimensional phase space (particle density in position and momentum space). The standard hydrodynamic equations are derived by taking moments of the Vlasov equation with respect to momentum – the zeroth moment gives an equation for the evolution of density, the first moment gives an equation for the evolution of bulk velocity, and higher moments are normally dropped, e.g., the second moment which would describe velocity dispersion. At first glance, one might think that standard PT could be extended by simply adding the evolution equation for the 2nd and possibly higher moments, however, that turns out to be less straightforward than it sounds. Viewed conventionally, dropping these higher moments is not really an added approximation in standard Eulerian perturbation theory. The lowest order evolution equations contain only a Hubble drag term, so any small initial velocity dispersion will rapidly become even smaller. Higher order equations contain no source terms, meaning that the higher order results are always proportional to the tiny starting value. Valageas (2001) showed that even perturbation theory using the full distribution function directly leads to the same result. I will show, however, that this argument for dropping dispersion from the perturbation theory is faulty, in that, while the lowest order terms are very small, the series is very rapidly diverging, in the sense that higher order terms are increasing in size instead of decreasing. This implies that some rearrangement of the series is needed, as in a resummation or renormalization group calculation. In case this mathematical reasoning is not sufficient motivation, recently Pueblas and Scoccimarro (2009) computed the velocity dispersion generated in N-body simulations, finding it to be small but not completely dynamically negligible.

There has been much discussion in the past about different ways of adding velocity dispersion, or more general changes to the small-scale effective theory used for PT calculations (e.g., Tatekawa, 2004; Ribeiro and Peixoto de Faria, 2005; Sotani and Tatekawa, 2006; Menci, 2002; Buchert and Domínguez, 2005; Gaite and Domínguez, 2007; Shoji and Komatsu, 2009), however, these papers all lacked a first principles derivation, starting from the exact equations, of the model they use. This made their usefulness for precision calculations questionable, as there were always added assumptions and/or free parameters. The point of this paper is to show how to do a straightforward computation that takes us directly from the initial homogeneous, cold, starting theory to a theory with highly developed, potentially hot, small-scale structure.

The rest of the paper is as follows: In §II I show how to use renormalization group-inspired ideas to reintroduce velocity dispersion from first principles. In §III I give a short discussion of the implications of these velocity dispersion calculations for redshift-space distortions. Finally, in §IV I discuss the results.

## Ii Generation of velocity dispersion

In this section I present the calculation that generates velocity dispersion. In §II.1 I discuss the time evolution equations for cold dark matter, which are the starting point for perturbation theory. In §II.2 I compute the velocity dispersion taking the standard PT approach, which leads to negligible dispersion. In §II.3 I apply an approximate renormalization group approach to models with power law initial power spectra, which demonstrates how significant dispersion can be generated. Finally, in §II.4 I take a related but more exact approach, which should be easily extendible to more general, higher order, calculations, including CDM power spectra.

### ii.1 Evolution equations

The exact evolution of collisionless particles is is described by the Vlasov equation (Peebles, 1980)

(1) |

with

(2) |

where is the particle density at phase-space position , is the particle mass (which plays no role in the final results), and ( here is a particle’s peculiar velocity, not to be confused with the mean peculiar velocity used everywhere else). is the comoving position and is the conformal time, with the expansion factor. Except when otherwise indicated . The density field is obtained by averaging the distribution function over momentum:

(3) |

and the bulk (mean) velocity and higher moments of the velocity distribution e.g., the dispersion of particle velocities around their bulk velocity, can be similarly obtained by multiplying the distribution function by any number of ’s (e.g., one to obtain bulk velocity) before integrating over . The mean velocity of the particles at is

(4) |

and the velocity dispersion tensor is

(5) |

i.e., with the deviation of a particle’s velocity from the local mean velocity, and the average is over all particles at .

As discussed by Peebles (1980), taking moments of the Vlasov equation with respect to momentum leads to a hierarchy of evolution equations for these quantities. The density evolution equation is the usual continuity equation,

(6) |

where , . The bulk velocity evolution equation acquires a velocity dispersion term that is usually dropped to give the Euler equation for standard perturbation theory

(7) |

where with the usual Hubble parameter. Multiplying the Vlasov equation by and integrating over gives (after some substitution to remove bulk velocity terms)

(8) |

where . Similar equations can be derived for and higher moments.

### ii.2 Standard perturbation theory

Perturbation theory consists of writing the fields as a series of terms of at least formally increasing order of smallness, i.e., . The evolution equations are solved order-by-order, with lower order solutions appearing as sources in the higher order equations so that is of order (Bernardeau et al., 2002).

For simplicity, I will often assume an Einstein-de Sitter (EdS) Universe. This assumption makes analytic calculations easier without qualitatively changing the results. We then have useful relations:

(9) |

Symmetry allows for a zero order (assumed to be homogeneous and isotropic) component of , , and in fact we expect there to be primordial velocity dispersion for realistic WIMPs, albeit very small. The background dispersion evolves following:

(10) |

where the normalization must be fixed by the initial conditions.

The linearized equations are:

(11) | |||||

(12) | |||||

(13) |

where the Poisson equation holds order by order relating to . Here is where the story ends for velocity dispersion in standard PT. The CDM velocity dispersion is supposed to be initially small, and only gets smaller. At 1st order the evolution equation for contains only the Hubble drag term and terms proportional to the tiny , so the perturbations will remain small. The last term from Eq. (7) can be dropped and the hierarchy is closed. I will show that this treatment is justified in the approach of standard PT, but retaining the apparently small dispersion terms allows for the renormalization in the next subsection. I will drop for simplicity (and because symmetry prevents it from acquiring a non-zero background value).

In this paper my only goal is to renormalize the zero-order velocity dispersion, , so I am going to henceforth assume the velocity field is curl-free and only solve for and . The techniques of this paper could probably be used to reactivate the vorticity variables, but symmetry guarantees that they will not have any homogeneous (zero order, i.e., background) component (see Pueblas and Scoccimarro (2009) for a measurement of the vorticity power spectrum in simulations, and an argument that vorticity is completely irrelevant for large-scale clustering). The linearized equations, assuming EdS, can be re-written as:

(14) |

(15) |

(16) |

where

(17) | |||||

(18) | |||||

(19) | |||||

(20) |

I have moved to Fourier space, e.g., , where .

These equations can’t be solved exactly. We expect that the density and velocity modes will be as usual linear theory at low until suppressed below some Jeans-like filtering scale. The basic behavior can be found more quantitatively from a small expansion, where I first drop the terms and obtain, using , the usual growing mode solutions, . I then substitute these solutions into the terms to obtain, to order ,

(21) |

Here I have eliminated the new integration constants at order by requiring that the corrections are zero at some very early initial time . With the assumption that the fields are curl free, we can invert the solution to obtain

(22) |

If I assume the Jeans-like suppression kernel is Gaussian, i.e., , we have so . Note that I have in this derivation only retained the fasted growing parts of , , and , which is consistent in standard perturbation theory where the initial time can be taken to be arbitrarily small, but, as we will see, is dangerous when we start renormalizing. Finally, note that as long as is very small, which it will be for CDM on observable scales (basically by definition), we have changed nothing by retaining the velocity dispersion terms, i.e., the standard perturbation theory with zero velocity dispersion is self-consistent at this order.

The 2nd order equations are:

(23) | |||||

(24) | |||||

(25) |

For now I am only interested in , which will renormalize .

(26) |

Evaluating this in Fourier space, using the same variable redefinitions as above, including , gives

(27) |

where deriving the term involving only requires that and completely describe the velocities, while the term involving requires the above approximate relation between , , and .

Using the right-most side of Eq. (27), and assuming standard linear theory for , the solution is:

(28) |

where is a constant. If I choose to make , and use the zero order solution (remembering that ), I finally obtain

(29) |

The standard PT approach would be to drop the term on the right in the brackets, assuming . From the point of view of standard PT, we could conclude that the result for is small, for CDM, so dropping velocity dispersion is justified (while is large for CDM, it is not large enough to overcome the small dispersion and make large); however, we have a clear breakdown in the premise of the perturbation theory, because .

### ii.3 Renormalization group approach

The problem with the standard calculation outlined in §II.2, which leads me to renormalization, is that the 2nd term in Eq. (29) () diverges relative to the first (), increasingly as time progresses. One might argue that the correction is still small, in an absolute sense, but this is nonsense because there is no reason not to expect the higher order terms in the series to be even larger, i.e., the truth could be arbitrarily large. Fortunately, we have tools to deal with this kind of breakdown in perturbative solutions of differential equations Chen et al. (1994); Kunihiro (1995, 1997); Shirkov and Kovalev (2001); Kovalev and Shirkov (2006).

A key observation about perturbation theory is that there is inevitable ambiguity in the solution that comes from solving a set of differential equations at each order. We obtain a new set of integration constants at each order, while the initial conditions only determine one set. This is apparent in Eq. (29), where I fixed the 2nd order integration constant by the arbitrary requirement that . I can instead fix it to set , where is some arbitrary time, so the solution is

(30) |

where must depend on , in order to match the initial conditions. The final result should not depend on the arbitrary , so the RG equation is obtained by taking the derivative of with respect to , evaluated at , and setting it equal to zero:

(31) |

or

(32) |

Now, generally depends on through the filtering scale , as discussed above, but supposed for the moment that the power spectrum was not truncated so is independent of . The solution to the RG equation would be, long after the initial time,

(33) |

where is the initial . For a power law with , is infinite, so we have infinite growth of the velocity dispersion, but even in the case of CDM with asymptotic high- slope slightly less than -3 (because the primordial slope, from inflation, appears to be less than one Seljak et al. (2006a)), will become enormous. Furthermore, as discussed in McDonald (2007), higher order corrections to the power spectrum generically push the slope to larger values, so the result, without filtering, will be even larger when calculated to higher order, to the point of being practically infinite. Infinite velocity dispersion is of course too much – the point here is simply to demonstrate how the ridiculously small initial seed velocity dispersion in CDM can grow into substantial velocity dispersion later.

Including the Jeans filtering resulting from the velocity dispersion itself will regulate the growth of the dispersion. Recall that the approximate filtering scale is . To be precise, quantifies the frozen-in power suppression due to some velocity dispersion present at initial time , long after this dispersion has redshifted away. When we in effect add new dispersion at time , through the RG equation, the smoothing will not be instantaneous; however, it takes place quite quickly, so I will assume I can use in the RG equation, where is a fudge factor to account for the lag between the introduction of dispersion and the smoothing of the power spectrum. As with the original definition for , where I assumed that the smoothing is Gaussian, this approximation means the results should only be trusted at the order-of-magnitude level. I will do a more exact calculation below. Assuming Gaussian filtering, and power spectrum (with ), the variance is

(34) |

Using Eq. (34) in Eq. (32) I find, well after the initial time,

(35) |

Recall that . This result may not be very illuminating, but it can be re-written in a way that makes it very clear:

(36) |

i.e., the velocity dispersion simply grows to the point where the Jeans-like filtering reduces the variance to of order unity, with the exact relation dependent on the slope of the power spectrum.

#### RG method, 2nd iteration

The calculation leading to Eq. (36) does not look entirely self-consistent. At various stages in it, I used the fact that , however, in the end, the renormalized is , growing quickly rather than decaying. The brute force way to solve this problem would have been to compute RG equations for the pieces of the calculation that depended on this assumption (e.g., the amplitude of ), and solve them jointly. It is simpler, and will lead to self-consistent and enlightening results, to redo the calculation starting with , with . The large-scale solutions for and are unchanged, but we now have . The smoothing estimated from the expansion changes for all three fields:

(37) |

(38) |

(39) |

Note that there is a qualitative difference in these smoothing kernels relative to the case in that appears instead of , i.e., because the comoving Jeans scale is increasing when , the smoothing continues to grow with time rather than freezing out. The numerical factors are also different, with being generally significantly smoother than , with between them. For example, for , , , and . Now,

(40) |

where and and are the smoothing scales implied by Eq. (38) and (39). The important thing to recognize about here is that it is time independent, i.e., the growth of the power spectrum is canceled by the change in .

Now, I have recomputed the linear equations given , so it remains only to recompute the perturbative corrections . Eq. (27) for is changed, because is no longer a solution to the zero order equation for . This is a key fact – at the heart of all renormalization is the concept that the naive lowest order result is not always the best one to perturb around when doing computations to higher order. Sometimes it is better to perturb around something else, with the criteria for “better” being simply that the corrections remain small. The usual renormalization procedure of doing the calculation to lowest order, using the results to compute higher order corrections, then absorbing large corrections back into the lowest order, is an algorithmic way of accomplishing what one could accomplish by simply perturbing around the better starting point from the beginning (if one has a way to determine/guess what it is). Here, I am perturbing around , i.e., is defined by , and the evolution equation I derive for is:

(41) |

The 2nd term on the right hand side is new. Note that I have not yet fixed . Normally, would be set by the initial conditions, but there is no need for that because the homogeneous part of the solution for can be used to match the initial conditions (then, since the homogeneous solution is just , the memory of initial conditions will fade away). If I choose to make , which is possible because depends on through the smoothing kernels, the equation for becomes trivial, just the homogeneous equation, i.e., there are no perturbative corrections to ! (That is, for the correct amplitude, and the special value .) This result should not be viewed as some kind of fortuitous coincidence, or artificial tuning. It is what we aim for in a renormalization group calculation, and reflects the physical correctness of the idea that the velocity dispersion should be rapidly growing, tracking the non-linear scale, with the the unstable cold start completely irrelevant (at least after enough time has passed, and for this kind of power law power spectrum).

We immediately have an equation like Eq. (36), except coming from a much more accurate calculation:

(42) |

The variance of is larger, because of the small coefficient of , and lesser smoothing of , i.e., the more accurate version of Eq. (36) is:

(43) |

e.g., this evaluates to at , with . Keep in mind that this is still essentially the linear theory variance. There will be non-linear corrections, however, especially for relatively large , it seems like a very good thing to be starting with finite linear variance rather than the infinite variance of the bare power law. For example, for the standard 1-loop PT correction diverges Scoccimarro and Frieman (1996), while here that problem is clearly solved, i.e., we have a natural high- cutoff. Choosing the pivot point at , defined by , these equations can be solved to give

(44) |

or

(45) |

e.g., for . As one might guess, the velocity dispersion scale increases relative to the non-linear scale as the power law increases to include more small-scale power. Note that this calculation was self-consistent, without any very ugly approximations, although the use of Gaussian smoothing kernels based on the expansion makes the results still only approximate.

### ii.4 A more exact, general approach

A simpler way approach the split between homogeneous background and perturbations, which I could have just started with (except I think the above derivation has some pedagogical value), is to define , with . Then it is clear that

(46) |

and

(47) |

where and I have dropped the term. These are the fully non-linear equations, with the need to include the right hand side of Eq. (46) as the source of homogeneous velocity dispersion, while subtracting the same thing from the perturbations, being a simple result of the definition of the mean and perturbations, rather than some kind of renormalization. Either way, however, we have the same basic idea that the perturbations feed back on the background, new relative to standard PT with just density and velocity.

#### RG approach at the level of equations rather than solutions

We still need some perturbative approach to solving the coupled equations for and the fluctuations in density, velocity, and velocity dispersion. The RG approach used above and in McDonald (2007) is not ideal. First, one needs to solve the perturbative equations analytically, which isn’t generally possible without approximations that can lead to uncontrolled errors. Second, the RG equation that is derived may not be analytically solvable, as I found in McDonald (2007). We can solve the first problem (needing analytic solutions for the perturbation evolution), without necessarily making the second problem any worse, by considering applying the ideas behind the RG method at the level of equations rather than solutions. Suppose we want to solve

(48) |

The standard perturbative equations are

(49) |

etc., where , etc.. Our solution up to 2nd order is , but note that we could solve a different pair of equations like this:

(50) |

and get the same solution . The idea, analogous to the above use of the RG method, is to choose to be 2nd order and absorb any undesirable parts of . If we choose we have

(51) |

where I have dropped 3rd order terms. Now, if desired, we can set the initial value of to zero and forget about it. All we have left is the equation for . This looks like nothing more than a very long-winded way of saying “if you want to solve to 2nd order, solve the truncated equation ”, and at the level that I will use it here, that is really all it is. However, there was potential for more than that within the argument, in that could have been chosen to be something else if this was convenient to, say, remove a particularly divergent part of while still producing an easy to solve equation for . Also, could in principle have been given some part of the initial conditions. I present this mainly as an explanation of the connection between the RG method and the method of simply numerically solving truncated equations. The difference between solving these truncated non-linear equations and the original perturbation equations is that all of the feedback between 2nd and 1st order is included, rather than leaving the first order solution fixed while computing the 2nd order solution.

As suggested in McDonald (2007), and implemented in Pietroni (2008), one approach to computing the power spectrum of the fields is to use the time evolution equations for the fields to derive time evolution equations for the power spectrum, and to solve those numerically. Unfortunately, these equations involve the bispectrum, which must then be solved for, and the bispectrum evolution equations involve the trispectrum, etc., i.e., one has an infinite hierarchy of equations. The method of truncating this hierarchy by setting the connected trispectrum to zero is equivalent to applying the above reasoning about truncating the series of perturbative equations, i.e., at linear order in the initial power spectrum the perturbative equations only contain the power spectrum, at 2nd order they only contain the power spectrum and bispectrum, while the trispectrum is needed at 3rd order. This type of approach seems like the only option for including the velocity dispersion as discussed in this paper in high precision calculations, because one cannot solve the evolution equations analytically when including the velocity dispersion. One has to recognize that standard PT could not even produce the analytic results that it does (for time evolution – there are generally numerical integrals over ) without the special fact that, to a very good approximation, one can use the solution to the evolution equations in an Einstein-de Sitter Universe in a realistic Universe as well – this is a very fragile situation and any added complication tends to produce un-solvable equations (e.g., non-negligible massive neutrinos are a good example of this, where, additionally, equations including velocity dispersion would naturally appear explicitly Lesgourgues et al. (2009); Saito et al. (2009)), and this approximation can never describe the kind of dependence on the equation of state of dark energy found by McDonald et al. (2006b).

In anticipation of eventually incorporating the velocity dispersion into a scheme for evolving the power spectrum and higher order statistical equations (the alternative method of closing the hierarchy in Taruya and Hiramatsu (2008) may also be a useful way to do this), I write here the evolution equations at 1st order in the power spectrum. Without the new velocity dispersion parts, these would just be the equations for the standard linear theory power spectrum. (I am still assuming an EdS Universe, although that is not necessary here.)

(52) |

(53) |

(54) |

(55) |

(56) |

(57) |

(58) |

Intuitive understanding of this last term is that it measures the tendency for velocities to converge on overheated places, and vice versa (remember that, for the definitions we’re using, positive and mean, respectively, convergence and, in a one-dimensional sense at least, negative 2nd derivative of the dispersion). Note that, for this paper, writing all of these equations involves some redundancy, in that one could just solve Eqs. (14-16) to get growth factors as functions of , using them to evaluate Eq. (58). When one goes to higher order, however, where , , and are no longer perfectly correlated, one will need all the equations.

#### Power law power spectra

Before I solve these equations completely numerically, it is informative to assume a power law power spectrum, for which we can derive some scalings. Defining by , and using time evolution , I find . The velocity dispersion must follow

(59) |

Inevitably, this is the same evolution that the RG calculation above predicted. For this to be consistent with Eq. (58), we must have

(60) |

i.e., as structure grows on large scales, the Jeans-like effect of velocity dispersion must erase enough small-scale structure to maintain fixed , with the fixed value being larger for power spectra where there is less small-scale power. Again, this result is identical to the RG result. This applies only for of course – for this discussion breaks down right at the beginning, with the definition of . We can also solve analytically for the growing mode amplitude of in the limit, which is suppressed relative to and because of the factor (this came from the fact that fluctuations are measured relative to the changing background dispersion). The result, again the same as in the RG calculation, is

(61) |

i.e., the dispersion fluctuations are smaller for models with less small-scale power, although the best way of looking at this is probably to think of these models as the ones where the non-linear scale is changing more quickly, and therefor the homogeneous dispersion is increasing more quickly, diluting the perturbations. All that remains to be calculated (for a power law power spectrum) are the high- suppression kernels that must be applied to each field, which generally must be obtained numerically.

Figure 1 shows these kernels, i.e., the effect of the Jeans-like smoothing due to the mean velocity dispersion, for power law power spectra with , corresponding to roughly the present non-linear scale, and , appropriate to the non-linear scale at a much earlier time.

To be clear, Fig. 1 was made by evolving Eqs. (52-58) numerically, with initial conditions for a scaling solution determined by first evolving assuming the scaling solution for , then restarting using the results for the initial kernels and normalization (at that point, the system will evolve stably on the scaling solution). Generally, the initial conditions would break the scaling, e.g., if one starts with the bare un-smoothed power law, that is not consistent with scaling (one would think that non-linear effects would erase the memory of a break in scaling in the distant past, but that does not happen at linear order). In each case Fig. 1 shows the power spectrum from the numerical calculation divided by the power spectrum that would be obtained by taking the limit (specifically, taking the overall normalization of to zero, which does not affect which sets the normalization of ). The kernels for the different power laws are dramatically different when plotted as functions of (). Additionally, the relation between the non-linear scale and is much different: for , while for , i.e., the dispersion scale lags much farther behind the non-linear scale for the spectrum with less small-scale power, not surprisingly. Note that the non-linear and dispersion scales are increasing extremely rapidly for , like , so that the dispersion scale only lags the non-linear scale by a factor of in expansion factor. Similarly, even though the extreme-looking spike in the kernel covers a factor in , a given mode only spends of an expansion factor within this feature. In spite of the apparently large differences, there is a remarkable relation between the two power law cases: the expansion factor by which the dispersion scale lags the non-linear scale, i.e., , is essentially identical between the two cases – 0.262 vs. 0.265 from my numerical solution, for and . To be clear, the dispersion in each case is determined by the requirement that Eq. (60) for is satisfied after the power is filtered by the complex-looking kernels in Fig. 1. Eq. (45) actually does a reasonable job anticipating this relation, in spite of its approximations, predicting 0.18 and 0.12 for and , respectively, but it is not clear if there is any deep reason for the near perfect agreement in the exact calculation (the under-prediction by Eq. 45 is understandable, as the Gaussian approximation I used there for the kernels works essentially perfectly for the initial suppression, but misses the wiggles, which produce extra dispersion).

#### Cdm

The situation for a CDM power spectrum is less straightforward than for power law power spectra, where we could imagine the initial dispersion was arbitrarily small, yet still obtain an interesting dispersion tied to the non-linear scale. In the real Universe, we have specific physically motivated initial conditions, and we cannot ignore them (at least not within the lowest order calculation in this paper – as I discuss below, I actually expect that the influence of the smallest scale initial conditions will be erased when the calculation is taken to the next order). A typical WIMP with mass GeV decouples thermally from the relativistic electrons, positrons, and neutrinos at MeV Bertschinger (2006). The coupling to the relativistic plasma actually produces acoustic oscillations in the transfer function, which I will ignore. The 1D velocity dispersion at kinetic decoupling is . For a typical WIMP, the scale below which power is erased is of order the scale of free-streaming after decoupling, but would actually be similar even for much more massive (i.e., slow moving) particles, because a substantial fraction of the effect is due to Silk-damping-like friction between the WIMPs and leptons during decoupling Bertschinger (2006). Using Eq. (48) of Bertschinger (2006), I will assume the power entering the matter dominated era is suppressed by with where is the conformal time at decoupling and is the conformal time at matter-radiation equality (after which the free streaming has frozen). For the parameters I am using, the primordial velocity dispersion, in Hubble flow distance units, extrapolated to the present time assuming no enhancement by structure formation, would be , and the smoothing scale is .

The total density variance in a realistic CDM model, in linear theory, without any enhancement of velocity dispersion by structure formation, is , where is the linear theory growth factor normalized to be 1 at the present time. If we could assume that followed the same form, as it does before there is any feedback (and assume an EdS Universe), the solution to Eq. (58) would be