MAMPOSSt: Modelling Anisotropy and Mass Profiles of Observed Spherical Systems. I. Gaussian 3D velocities
Abstract
Mass modelling of spherical systems through internal kinematics is hampered by the mass / velocity anisotropy degeneracy inherent in the Jeans equation, as well as the lack of techniques that are both fast and adaptable to realistic systems. A new fast method, called MAMPOSSt, is developed and thoroughly tested. MAMPOSSt performs a maximum likelihood fit of the distribution of observed tracers in projected phase space (projected radius and lineofsight velocity). As in other methods, MAMPOSSt assumes a shape for the gravitational potential (or equivalently the total mass profile). However, instead of postulating a shape for the distribution function in terms of energy and angular momentum, or supposing Gaussian lineofsight velocity distributions, MAMPOSSt assumes a velocity anisotropy profile and a shape for the threedimensional velocity distribution. The formalism is presented for the case of a Gaussian 3D velocity distribution. In contrast to most methods based on moments, MAMPOSSt requires no binning, differentiation, nor extrapolation of the observables. Tests on clustermass haloes from CDM dissipationless cosmological simulations indicate that, with 500 tracers, MAMPOSSt is able to jointly recover the virial radius, tracer scale radius, dark matter scale radius and outer or constant velocity anisotropy with small bias (10% on scale radii and 2% on the two other quantities) and inefficiencies of 10%, 27%, 48% and 20%, respectively. MAMPOSSt does not perform better when some parameters are frozen, and even particularly worse when the virial radius is set to its true value, which appears to be the consequence of halo triaxiality. The accuracy of MAMPOSSt depends weakly on the adopted interloper removal scheme, including an efficient iterative Bayesian scheme that we introduce here, which can directly obtain the virial radius with as good precision as MAMPOSSt. Additional tests are made on the number of tracers, the stacking of haloes, the chosen aperture, and the density and velocity anisotropy models. Our tests show that MAMPOSSt with Gaussian 3D velocities is very competitive with other methods that are either currently restricted to constant velocity anisotropy or 3 orders of magnitude slower. These tests suggest that MAMPOSSt can be a very powerful and rapid method for the mass and anisotropy modeling of systems such as clusters and groups of galaxies, elliptical and dwarf spheroidal galaxies.
keywords:
methods: analytical – galaxies: kinematics and dynamics – galaxies: haloes – galaxies: clusters: general – dark matter1 Introduction
The determination of mass profiles is one of the fundamental issues of astronomy. Subtracting the mass density profile of the visible component, one deduces the dark matter (hereafter, DM) density profile, which can be confronted to the predictions from cosmological body simulations. This is especially relevant given the differences between the total NFW (Navarro, Frenk, & White, 1996) or better Einasto (Navarro et al., 2004) density profiles derived in dissipationless simulations of a single dark matter component on one hand, and the density profiles found for the DM in hydrodynamical cosmological simulations (Gnedin et al., 2004). Moreover, the knowledge of the total density profile serves as a fundamental reference, relative to which one can scale various astronomical tracers such as the mass density profiles of the stellar, gas and dust components, as well as the luminosity in different wavebands. These studies can be performed as a function of system mass and other attributes such as galaxy colour (e.g., Wojtak & Mamon, 2013 and references therein).
Mass profiles can be derived from internal motions, or alternatively from Xray or lensing observations. This paper focuses on mass profiles from internal kinematics. In this class of mass modeling, one has to deal with a degeneracy between the unknown radial profiles of total mass and of the velocity anisotropy (hereafter ‘anisotropy’)
(1) 
(note that, in spherical symmetry, one must have ). While radial outer orbits are expected for structures in an expanding universe (e.g. Ascasibar & Gottlöber, 2008 for dark matter particles and Ludlow et al., 2009 for subhaloes), the dissipative nature of the gas dynamics is expected to produce tangential orbits in the inner regions of systems formed from gasrich mergers or collapse. Therefore, lifting the Mass  Anisotropy Degeneracy can provide useful constraints on the formation of the structure under study.
A common method to extract the mass profile is to assume that the lineofsight (hereafter, LOS) velocity distribution, at given projected radius, is Gaussian (Strigari et al., 2008; Battaglia et al., 2008; Wolf et al., 2010). These methods perform adequately on the mass profile, but provide weak constraints on the anisotropy (Walker et al., 2009). Merritt (1987) pointed out that anisotropic models have nonGaussian LOS velocity distributions. Therefore, the observed kurtosis of the distribution of LOS velocities serves as a powerful constraint to the anisotropy (Gerhard, 1993; van der Marel & Franx, 1993; Zabludoff, Franx, & Geller, 1993). In fact, if one assumes that the anisotropy is constant throughout the system, the fourth order Jeans equation can be used to express the LOS velocity kurtosis as an integral of the tracer density, anisotropy and total mass profiles (Łokas, 2002). Moreover, Richardson & Fairbairn (2012) were able to generalize the expression for the LOS velocity kurtosis for radially varying anisotropy in the framework of separable augmented density and 4th order anisotropy equal to the standard anisotropy (the latter appears to be an excellent approximation for CDM haloes, see Fig. 10 of Wojtak et al., 2008). One can then perform a joint fit of the observed LOS velocity dispersion and kurtosis profiles. This was found to (partially) lift the massanisotropy degeneracy when applied to dwarf spheroidal galaxies (Łokas, 2002) and the Coma cluster, (Łokas & Mamon, 2003): the joint constraint of LOS velocity dispersion and kurtosis profiles allows the estimation of both the mass profile (i.e., normalization and concentration) and the anisotropy of the cluster, contrary to the case when the LOS kurtosis profile is ignored.
An interesting route is to perform nonparametric inversions of the data assuming either the mass profile to obtain the anisotropy profile (anisotropy inversion, pioneered by Binney & Mamon, 1982) or the anisotropy profile to obtain the mass profile (mass inversion, independently developed by Mamon & Boué, 2010 and Wolf et al., 2010). These inversion methods are powerful in that they are nonparametric, but they suffer from their requiring the user to bin the data, smooth it, and extrapolate it beyond the range of data.
Hence, one would like to go one step further and constrain the full information contained in the observed projected phase space (projected radii and LOS velocities, hereafter, PPS) of LOS velocities as a function of projected radii. In other words, rather than using the 0th, 2nd and possibly 4th moments of the LOS velocity distribution, we wish to use the full set of even moments.
The traditional way to analyze the distribution of particles in PPS is to assume a form for the sixdimensional distribution function (DF) in terms of energy () and angular momentum () and fit the triple integral of equation (5) below, using this DF for , to the distribution of particles in PPS. The worry is that we have no good a priori knowledge of the shape of the DF, . One clever idea is to throw orbits in a gravitational potential, since each orbit is a Dirac delta function in energy and angular momentum. One then seeks a linear combination of these orbits, with positive coefficients, to match the data. This orbit model (Schwarzschild, 1979; Richstone & Tremaine, 1984; Syer & Tremaine, 1996) is very powerful (and can handle nonspherical gravitational potentials), but too slow to obtain meaningful errors on the parameters. A similar, and in principle faster, technique is to assume that the DF is the linear combination (again with positive coefficients) of elementary DFs (Dejonghe, 1989; Merritt & Saha, 1993; Gerhard et al., 1998), but only one such study has been made (Kronawitter et al., 2000), and it is not clear whether the elementary DFs, although numerous, constitute a basis set.
An important step forward has been performed by Wojtak et al. (2008), who analyzed the haloes in CDM cosmological simulations, to show that the DF can be approximated to be separable in energy and angular momentum, with a simple analytical approximation for the angular momentum term. In a sequel, Wojtak et al. (2009) have shown that it is feasible to fit the distribution of particles in PPS with equation (5), using the approximation of the DF found by Wojtak et al. (2008).
However, it is not yet clear whether selfgravitating quasispherical astrophysical systems have the DF of CDM haloes: In particular, if the dynamical evolution of these systems is influenced by the dissipation of their gaseous component, the DF may not be separable in terms of energy and angular momentum. Dissipation is not expected to affect much the internal kinematics of large systems such as galaxy clusters^{1}^{1}1However, the joint Xray and lensing analysis of a cluster by Newman et al. (2009) reveals a shallower inner density profile than NFW, suggesting that dissipation is also important in clusters., but is expected to be increasingly important in smaller systems such as galaxy groups, and especially galaxies themselves. For this reason, it is useful to consider a massmodeling method that is independent of the dependence of the DF on energy and angular momentum.
In this work, we present an alternative method, in which we fit the distribution of particles in PPS making assumptions on the radial profiles of mass and anisotropy as well as the radial variations of the distribution of spacevelocities. We call this method Modelling of Anisotropy and Mass Profiles of Observed Spherical Systems, or MAMPOSSt for short.^{2}^{2}2MAMPOSSt should evoke the mass analog of a lamppost, and mampostería in Spanish means masonry, hence the building blocks of structures. The MAMPOSSt method is described in Sect. 2.1, its Gaussian approximation is described in Sect. 2.2. Tests on haloes derived from a cosmological body simulation are presented in Sect. 3. A discussion follows in Sect. 4.
2 Method
2.1 General method
The observed tracer population of a spherical system has a DF
(2) 
where is the tracer number density profile.
MAMPOSSt fits the distribution of objects in PPS (projected radius and LOS velocity ), assuming parametrized forms for

the gravitational potential (or equivalently a total mass density profile, through the Poisson equation),

the anisotropy profile (eq. [1])

the distribution of 3D velocities, .
Consider a point P at distance from the centre, O, of the spherical system, with projected radius and consider the spherical coordinates where the unit vectors and are in the plane containing OP and the LOS, while is perpendicular to this plane. Consider also the cylindrical coordinate system , where is the axis along the LOS and is the axis perpendicular to the LOS, but in the plane containing O and P and the LOS. The Jacobian of the transformation from the spherical coordinate system to the new one is unity, hence one can write
The distribution of LOS velocities at P is then obtained by integrating velocities over the two perpendicular axes (dropping from for clarity):
(3) 
Note that dynamical systems have maximum velocities set by the escape velocity, (where is the gravitational potential), on one hand, and by the maximum allowed (observable) absolute LOS velocity on the other hand. In what follows, we will neglect both limits, unless explicitly mentioned otherwise.
The surface density of observed objects (the tracer) in PPS is then obtained by integrating along the LOS
(4)  
(5) 
where
(6) 
is the tracer surface density at projected radius . Equation (5) is equivalent to equation (2) of Dejonghe & Merritt (1992).
If the tracer number density profile , appearing in equation (4), is not known and if the incompleteness of the data is independent of the projected radius, then one can estimate by Abel inversion of of equation (6):
(7) 
But this is not necessary, as we shall see below.
In MAMPOSSt, rather than replace the velocities by energy and angular momentum and numerically solve the triple integral of equation (5) (as first proposed by Dejonghe & Merritt, see also Wojtak et al., 2009), we analytically derive from equation (3) for known 3D velocity distributions. With the analytical form of , equation (4) provides the surface density distribution of tracers in PPS through a single integral. Note, however, that another single integral is required because the expression for will involve (see eqs. [25] and [26], below, for the Gaussian case), which is obtained by solving the spherical Jeans equation
(8) 
where is the anisotropy of (eq. [1]) for our given choices of total mass and anisotropy profiles. We thus need to insert the solution (van der Marel, 1994; Mamon & Łokas, 2005)
(9) 
in the expression for (eq. [3]) to derive , via equation (4), where is given, while is obtained with equation (7). In equation (9), is the radial profile of the total mass (this is the only instance where the gravitational potential enters MAMPOSSt). For a given choice of parameters, the single integral of equation (4) must be evaluated for every data point , whereas the other integral (eq. [9]) for need only be evaluated once, on an adequate grid of .
Note that for projected radii extending from to and absolute LOS velocities extending from 0 to a maximum velocity, which for projected radius is theoretically equal to , and in practice is possibly specified by a cut of obvious velocity interlopers, , one can write
(10)  
where we used equation (4) for , assumed that is normalised, reversed the order of the integrals in and , and where is the predicted number of objects within projected radius , while . Equation (10) then implies that the probability density of observing an object at position of PPS is
where equation (2.1) arises from equation (4), while equation (LABEL:qdef2) is obtained by writing . Here, is the number of tracers in a cylinder of projected radius , the terms and are given by
(13)  
(14) 
where is the cumulative tracer number density profile, while is the characteristic radius of the tracer. One easily verifies that . The values of and appearing in (eq. [10]) can be hard limits, or alternatively the respective minimum and maximum projected radii of the observed tracers if no hard limits are specified.
We fit the parameters (mass scale or concentration and possibly normalization, anisotropy level or radius, as well as the tracer scale – if not previously known) that enter the determination of to the observed surface density, using maximum likelihood estimation (MLE), i.e. by minimizing
(15) 
for the parameter vector , where is the number of data points, with given by equation (2.1).
Writing , where is the vector of the parameters other than , one has
(16) 
where
(17) 
and
(18) 
Combining the last equality of equation (18) with equation (5), integrating over LOS velocities, reversing the order of the two outer integrals of the resulting quadruple integral, and using equation (6) yields . So, if the scale of the tracer distribution is already known, then, according to equations (15) and (16), maximizing the likelihood amounts to minimizing
(19) 
Now, if is not known, then one may be tempted to solve for it by minimizing , and then proceed with equation (19) to minimize for the remaining parameters, . However, since (from eqs. [15] and [16]), the most likely solution for that minimizes will not in general be that which minimizes at the same time and . Moreover, if one seeks to obtain the distributions of parameters and consistent with the MLE solution (for example with Markov Chain MonteCarlo techniques), the joint analysis of equations (2.1) and (15) is required. On the other hand, if is known from other data, while the current dataset is known to have a completeness, , that is a function of projected radius, then one could indeed minimize of equation (19). The proper solution is then to minimize weighting the data points by the inverse completeness, i.e. minimizing
(20) 
For computational efficiency, we perform the following tasks:

For each run of parameters, we first compute from equation (9) on a logarithmic grid of , and compute cubicspline coefficients at these radii. Then, when we compute the LOS integral of equation (4) for each , we evaluate with cubic spline interpolation (in loglog space, using the cubic spline coefficients determined at the start).

We terminate the LOS integration in equation (2.1) at roughly 15 virial radii,^{3}^{3}3The virial radii are loosely defined here as the radius where the mean density of the halo is 200 times the critical density of the Universe. , instead of infinity, as the Hubble flow pushes the velocities of the material beyond this distance to values over above the mean of the system (see Mamon, Biviano, & Murante, 2010, hereafter MBM10). The LOS integration varies only very slightly with the number of virial radii, so as long as the virial radius is correct to a factor of two, this choice of integration limit is not an issue.
We now need to choose a model for the shape of the 3D velocity distribution. While MAMPOSSt, can, in principle, be run with any model, the simplest one is the (possibly anisotropic) Gaussian distribution, which we describe in Sect. 2.2 below.
2.2 Gaussian 3D velocity distributions
The simplest assumption for the 3D velocity distribution is that it is Gaussian:
(21) 
where the velocity dispersions are functions of . This Gaussian distribution assumes no streaming motions: e.g. no rotation, and no mean radial streaming, which is adequate for in highmass haloes (i.e. groups and clusters) and in galaxymass haloes (Cuesta et al., 2008). Inserting equation (21) into equation (3) and integrating over leads to
Calling the angle between the lineofsight (direction ) and the radial vector , one has
(23)  
(24) 
with which the integral over in equation (LABEL:hofvz2) yields a Gaussian distribution of LOS velocities at point P:
(25) 
of squared dispersion
(26) 
The integral of along the LOS is obtained from equations (4) and (25):
(27)  
According to equations (18) and (27), the probability of measuring a velocity at given projected radius is
(28)  
We remind the reader that is a chosen function of , is a function of given by equation (7), while is a function of given by equation (9). For isotropic systems (), equation (27) leads to a Gaussian distribution of LOS velocities. However, for anisotropic velocity tensors, the distribution of LOS velocities will generally not be Gaussian (as Merritt, 1987 found when starting from distribution functions instead of Gaussian 3D velocities). Hence, the Gaussian nature of is not equivalent to the popular assumption that is Gaussian on : even if is a Gaussian at point P, its integral along the LOS is not Gaussian, unless and is constant.
If one of the parameters to determine with MAMPOSSt is the normalization of the mass profile, one should not be tempted in expressing the radii in terms of the virial radius , the velocities in terms of the virial velocity , the tracer densities in terms of what we wish (as they appear in both the numerator and denominator of eq. [28]). Doing so, equation (28) becomes
(29)  
where the quantities with tildes are in virial units. Equation (29) indicates that when one varies the (and the virial velocity in proportion as ), the highest probabilities are reached for the highest normalizations: becomes very small, while is unaffected to first order. This unphysical result is the consequence of using a parameter (the virial radius) as part of the data variable. On the other hand, using equation (28), one sees that the highest probabilities are reached at intermediate values of the normalization.
Taking the second moment of the velocity distribution of equation (28) leads to the equation of anisotropic projection yielding the LOS velocity dispersion, :
(30)  
Equation (30) recovers the equation of anisotropic kinematic projection, first derived by Binney & Mamon (1982).
If interlopers are removed with a velocity cut , then the expression for becomes
(31) 
3 Tests
3.1 Simulated haloes
rank  ID  

1  18667  0.789  0.179  0.401  0.117  1.14  0.276  1.33 
2  21926  0.842  0.123  0.342  0.085  1.34  0.050  1.73 
3  30579  0.890  0.189  0.443  0.120  1.33  0.053  1.69 
4  25174  0.956  0.144  0.377  0.099  1.23  0.162  1.36 
5  3106  1.010  0.297  0.661  0.166  1.05  2.384  1.09 
6  8366  1.076  0.434  0.819  0.249  1.11  0.689  1.29 
7  13647  1.151  0.227  0.536  0.151  1.19  0.265  1.41 
8  1131  1.174  0.197  0.499  0.133  1.18  0.352  1.34 
9  17283  1.298  0.505  1.009  0.277  1.04  0.727  1.05 
10  434  1.374  0.317  0.699  0.210  1.30  0.165  1.70 
11  5726  1.660  0.407  0.921  0.249  1.42  0.050  2.20 
Stack  1.090.08  0.260.04  0.600.08  0.170.02  1.210.04  0.260.08  1.450.10 
Notes: Properties obtained from fits to the particle data of 11 haloes. Cols. 1 and 2: cluster identification; col. 3: virial radius ; col. 4: scale radius () of the NFW mass density profile (eq. [32]); col. 5: scale radius () of the Hernquist mass density profile (eq. [33]); col. 6: scale radius () of the Burkert mass density profile (eq. [34]); col. 7: mean anisotropy () within ; col. 8: anisotropy radius with the ML anisotropy model; col. 9: asymptotic anisotropy () at infinite radius with the T anisotropy model. Radii are in units of . The measured anisotropies do not incorporate streaming motions.
To test MAMPOSSt, we use clustermass haloes extracted by Borgani et al. (2004) from their large cosmological hydrodynamical simulation performed using the parallel Tree+SPH GADGET–2 code of Springel et al. (2005). The simulation assumes a cosmological model with , , , , and . The box size is Mpc. The simulation used DM particles and (initially) as many gas particles, for a DM particle mass of . The softening length was set to comoving kpc until and fixed afterwards (i.e., kpc). The simulation code includes explicit energy and entropy conservation, radiative cooling, a uniform timedependent UV background (Haardt & Madau, 1996), the selfregulated hybrid multiphase model for star formation (Springel & Hernquist, 2003), and a phenomenological model for galactic winds powered by TypeII supernovae.
DM haloes were identified by Borgani et al. (2004) at redshift with a standard Friendsoffriends (FoF) analysis applied to the DM particle set, with linking length 0.15 times the mean interparticle distance. After the FoF identification, the centre of the halo was set to the position of its most bound particle. A spherical overdensity criterion was then applied to determine, for each halo, our proxy for the virial radius, , where the mean density is 200 times the critical density of the Universe.
To save computing time, we worked on a random subsample of roughly two million particles among the . We have extracted 11 clustermass haloes from these simulations, among which, ten are about logarithmically spaced in virial radius, , while the halo is the most massive in the entire simulation. Their properties are listed in Table 1. We made no effort to omit irregular haloes, but among the list of 12 irregular haloes out of 105 extracted by MBM10 from the same simulation, 2 are in our sample (haloes 17283 and 434). We list the characteristic radii of three models fitted by MLE to the mass density profiles of the particle data (from 0.03 to ), namely:

the NFW density profile
(32) where is the radius of slope in the mass density profile, related to the concentration ;
Denoting the scales , and by the generic , the mass profiles of these models (required for eq. [9]) can be written
(35) 
where
(36) 
The NFW model has long been known to fit well the density profiles of CDM haloes (Navarro et al., 1996), and while Navarro et al. (2004) found that Einasto models fit them even better, MBM10 found that the NFW model describes the outer LOS velocity dispersion profile of the DM component of their stacked clustermass halo in Borgani et al.’s hydrodynamical cosmological simulation even (slightly) better than the Einasto model. The Hernquist model differs from the NFW one because it has a steeper logarithmic slope at large radii, rather than . The Burkert model, on the other hand, has the same asymptotic as the NFW model, but a core at the centre, , rather than a cusp ( in both the NFW and Hernquist models).
In Table 1, we also list the values of the parameters characterizing different velocityanisotropy models, namely:

the constant anisotropy model (‘Cst’ model hereafter), where we assume spherical symmetry and therefore ;

the model (‘ML’ model hereafter) of Mamon & Łokas (2005);
(37) characterized by the anisotropy radius ;

a generalization of the ML model, which is also a simplified version of the model of Tiret et al. (2007), isotropic at and with anisotropy radius identical to (hereafter called ‘T’ model):
(38) characterized by the anisotropy value at large radii, . In our T model, the anisotropy radius is set to the scale radius of the mass density profile. Note also that in the following we provide the values of , rather than .
Note that the ML and the T models used here are identical for and . With these values, the ML and T models provide a good fit to the average anisotropy profile of a set of clustermass cosmological haloes (MBM10).
In Fig. 1, we show the individual halo velocity anisotropy profiles and the ML anisotropy model with (or, equivalently, the T anisotropy model with ). There is a huge scatter in the of the individual haloes, as already observed by, e.g., Wojtak et al. (2008), especially at , while 8 out of 11 haloes have .
3.2 Observing cones and interloper removal
To test MAMPOSSt, we select 500 DM particles around each halo, out to a maximum projected distance from the halo centre, for which we consider three values: , , and . We analyze three orthogonal projections for each halo – these are in fact cones with an observer at away, but the opening angle being very small has no noticeable effect on our results. The particles in these cones are used by MAMPOSSt as tracers of the halo gravitational potential.
However, these 500particle samples include interlopers, i.e. DM particles that are located in projection at , but are effectively outside in real (3D) space, i.e. with . It is impossible to remove all these interlopers in the observed redshift space, where only 3 of the 6 phasespace coordinates of the tracers are known (e.g., MBM10). Moreover, since the LOS velocity distribution of interlopers in mock cones around CDM haloes is the sum of a Gaussian component and a uniform one (see MBM10 for a quantified view), and since the Gaussian one resembles that of the particles in the virial sphere, it is important to remove the flat LOS velocity component, at least at high absolute LOS velocity, where it dominates. It is possible to remove these high objects with suitable interloper removal algorithms.
To see how MAMPOSSt depends on the choice of the interloper removal algorithm, we here consider three different algorithms.
The first one is a new, iterative algorithm, that we name “Clean”, which is fully described in Appendix B. Clean first looks for gaps in the LOS velocities, then estimates the virial radius from the aperture velocity dispersion, assuming an NFW model with ML anisotropy with an anisotropy radius and a concentration depending on the estimate of via the relation of Macciò, Dutton, & van den Bosch (2008), then only considers the galaxies within from the median LOS velocity, and finally iterates. Our assumed anisotropy profile fits reasonably well the anisotropy profiles of DM haloes (MBM10), as is clear for our 11 haloes (see Fig. 1). The factor 2.7 was found by MBM10 to best preserve the local LOS velocity dispersion for the assumed density and anisotropy models.
We also consider two other interloper removal algorithms, namely:

the method (hereafter, KBM) of Katgert, Biviano, & Mazure (2004, see their Appendix A), in which a galaxy is flagged as an interloper under the condition , with derived from using eq. (8) of Carlberg, Yee, & Ellingson (1997). This method was invented as a poorman’s proxy for the dHK method when the observational sampling of the halo projected phasespace is poor.
3.3 The general 4parameter case
There is no a priori limitation on the number of free parameters that can be used in MAMPOSSt to characterise the mass and velocity anisotropy profiles. With samples of tracers (assumed massless throughout these tests) it is appropriate to consider free parameters, two for the mass distribution, one for the velocity anisotropy distribution, and one for the spatial distribution of the tracers. All these models are characterised by the two free parameters, the ‘virial’ radius and a characteristic scaleradius (, , and for the NFW, Hernquist, and Burkert models, respectively). Herafter, we generically use to refer to this characteristic scaleradius of the mass density profile.
We use the NFW model, in projection (Bartelmann, 1996; Łokas & Mamon, 2001), to fit the projected number density profile of the tracer. Note that the normalization of this profile does not enter the MAMPOSSt equations, so the only free parameter is . Herafter we call this parameter , to avoid confusion with the characteristic radius of the NFW mass density profile. We only consider one model for the number density profile of the tracer, because this is a direct observable, unlike the mass density profile. While one should not be too restrictive in the model choice for the mass density profile, the observer is generally able to choose the bestsuited model for the tracer number density profile by direct examination of the data before running MAMPOSSt. We choose the NFW model because it provides a reasonable description of the number density profiles of the DM particles in our simulated haloes.
For the velocity anisotropy profile, we consider the three models described above, Cst, ML, and T, each characterised by a single anisotropy parameter, and , respectively. In equation (38), we use .
To search for the bestfit solution, we run the MAMPOSSt algorithm in combination with the NEWUOA^{4}^{4}4NEWUOA is available at http://plato.asu.edu/ftp/other_software/newuoa.zip minimization routine of Powell (2006). For estimating error bars on the best fit parameters, as well as confidence contours on pairs of parameters, we fit our model parameters using the Markov Chain Monte Carlo (MCMC) technique (e.g., Lewis & Bridle, 2002). In MCMC, the dimensional parameter space is populated with proposals, for each of which the likelihood is computed. The new proposal is accepted if the ratio of new to previous likelihood is either greater than unity or else greater than a uniform random number. The proposal is found by assuming a dimensional Gaussian probability distribution around the previous proposal. We adopt the publicly available CosmoMC code by A. Lewis.^{5}^{5}5CosmoMC is available at http://cosmologist.info/cosmomc/. We run 6 chains in parallel using Message Parsing Interface (MPI), and the covariance matrix is used to update the parameters of the Gaussian proposal density to ensure faster convergence.
Fig. 2 illustrates the MAMPOSSt analysis via MCMC for the general case with four free parameters, using the NFW model for the mass density profile, and the constant (free parameter) anisotropy model. In particular, it shows that the different parameters are not correlated, except for a positive correlation between and .
Our FORTRAN code takes roughly 1 second to find the MAMPOSSt 4parameter solution for a 500 particle sample run in scalar on a decent desktop or laptop computer, and 4 minutes to produce confidence limits for this solution with the CosmoMC (Lewis & Bridle, 2002) MCMC code, with 6 chains of elements run in parallel (MPI) on a PC equipped with a 4core 8thread Intel CoreI7 2600 processor.
N  Membership  anisotropy  

bias  ineff.  corr.  bias  ineff.  corr.  bias  ineff.  corr.  bias  ineff.  corr.  
500  Clean  NFW  Cst  0.004  0.040  0.909  0.027  0.102  0.835  0.032  0.217  0.578  0.007  0.073  –0.255  
500  Clean  NFW  ML  –0.003  0.040  0.904  0.024  0.104  0.832  0.057  0.229  0.601  –0.221  0.887  –0.172  
500  Clean  NFW  T  –0.006  0.040  0.903  0.026  0.103  0.838  0.039  0.169  0.709  0.007  0.085  0.621  
500  dHK  NFW  Cst  0.018  0.042  0.885  0.027  0.099  0.838  0.051  0.319  0.406  0.004  0.147  –0.215  
500  dHK  NFW  ML  0.012  0.041  0.909  0.028  0.100  0.840  0.059  0.286  0.611  –0.161  0.904  –0.118  
500  dHK  NFW  T  0.012  0.044  0.902  0.027  0.100  0.844  0.022  0.199  0.636  –0.045  0.264  0.464  
500  KBM  NFW  Cst  0.005  0.038  0.906  0.018  0.100  0.850  –0.006  0.218  0.535  0.020  0.078  –0.198  
500  KBM  NFW  ML  –0.003  0.039  0.908  0.020  0.100  0.851  –0.005  0.232  0.557  –0.191  0.795  0.101  
500  KBM  NFW  T  –0.006  0.038  0.911  0.020  0.099  0.856  –0.010  0.184  0.689  0.018  0.094  0.595  
500  Clean  Her  T  0.002  0.039  0.909  0.026  0.102  0.835  0.039  0.132  0.755  0.014  0.086  0.546  
500  Clean  Bur  T  0.000  0.039  0.910  0.047  0.196  0.377  0.048  0.145  0.704  –0.019  0.071  0.603  
500  Clean  NFW  T  –0.004  0.048  0.877  0.089  0.115  0.902  0.016  0.143  0.744  0.009  0.108  0.232  
500  Clean  NFW  T  –0.014  0.035  0.905  0.039  0.179  0.420  0.093  0.210  0.538  –0.001  0.090  0.436  
100  Clean  NFW  Cst  –0.001  0.058  0.844  0.033  0.201  0.537  –0.133  0.341  0.342  0.003  0.119  –0.053  
100  Clean  NFW  ML  –0.011  0.061  0.834  0.034  0.199  0.539  –0.087  0.336  0.484  –0.137  0.925  –0.245  
100  Clean  NFW  T  –0.008  0.058  0.850  0.032  0.200  0.532  –0.108  0.277  0.436  0.014  0.143  0.249 
Notes: These results are for 11 haloes each observed along 3 axes, general 4 freeparameter case. Col. 1: Number of initially selected particles (before interloper removal); col. 2: Maximum projected radius for the selection, where and ; col. 3: Interloperremoval method (dHK: den Hartog & Katgert, 1996; KBM: Katgert et al., 2004; Clean: App. B); col. 4: mass density model (NFW: Navarro et al., 1996; Her: Hernquist, 1990; Bur: Burkert, 1995); col. 5: anisotropy model (Cst: ; ML: eq. [37], Mamon & Łokas, 2005; T: eq. [38], adapted from Tiret et al., 2007); cols. 6–8: virial radius; cols. 9–11: tracer scale radius; cols. 12–14: dark matter scale radius; cols. 15–17: velocity anisotropy (i.e., for the Cst model, for the ML model, and for the T model). The columns ‘bias’ and ‘ineff.’ respectively provide the mean and standard deviation (both computed with the biweight technique) of , while columns ‘corr.’ list the Spearman rank correlation coefficients between the true values and MAMPOSStrecovered ones (values in boldface indicate significant correlations between and values at the confidence level).
The results for the different interloper rejection methods, mass density and velocity anisotropy models, and for the different maximum projected radii used in the selection of the 500 particles are listed in Table 2 and displayed in Fig. 3.
We list and show the biweight measures (see, e.g., Beers, Flynn, & Gebhardt, 1990) of mean and dispersion of where is the recovered value of the parameter and its true value, because, according to our tests, they perform better than standard statistical estimators of location and scale when the parent distributions are not pure Gaussians. We call ‘bias’ and ‘inefficiency’ the mean and dispersion of . If the dispersion in true values of a given parameter is small, one can spuriously obtain low values of the dispersion when the MAMPOSSt and true values show no correlation. We therefore also list the Spearman rank correlation coefficient between and , marking in boldface those correlations that are significant at the 99% confidence level. We list the results for all interloper rejection methods and all anisotropy models only for the NFW mass density model and for the radial selection. For simplicity, we only show a limited set of results for the other mass density models and for the other radial selections.
Remarkably, as seen in Table 2, the results for the four parameters are almost independent of the interloper removal algorithm, the Clean and KBM algorithm performing slightly better than dHK. The results for and also depend very little on the chosen anisotropy model. On average, the values of the parameter are recovered with almost no bias (from to %) and with only % inefficiency. The parameter estimates are always slightly positively biased (4–7%), and are recovered with % efficiency. Also, the parameter estimates generally display a slight positive bias, except for the KBM interloper removal method, and overall the bias ranges from to %, while the efficiency ranges from to %.
As far as the anisotropy parameter is concerned, the ML model behaves very differently from the Cst and T models, in that it is virtually impossible to constrain the anisotropy radius of the former, , while it is possible to obtain quite good constraints on the anisotropy parameters of the other two models, and . More precisely, the estimates are always negatively biased (by %) and are affected by a huge dispersion (almost one order of magnitude). On the other hand, and are almost unbiased (the bias ranges from to %) and they are affected by dispersions of, typically, %, if we consider the Clean and KBM interloper removal algorithms. So, apparently, it is much easier to constrain the ‘normalisation’ of a given anisotropy profile, than to constrain the characteristic radius at which the anisotropy changes, particularly so if this change is mild, as in the ML model. Note, however, that the difficulty of MAMPOSSt in constraining the anisotropy parameter of the ML model does not mean that the ML model is a poor representation of reality, and in fact Fig. 1 suggests the opposite. Moreover, constraints obtained on the and parameters are equally good with the Cst and T anisotropy models, as with the ML one.
As seen in Table 2, correlations between recovered and observed values of the parameters and are almost always signficant. This is also true for the anisotropy parameter, but not for and .
In Fig. 4, we show the correlations existing between the true and recovered values of the different parameters, using the T anisotropy model, for the 11 haloes along the 3 different orthogonal projections. Projections effects render the determination of the mass and anisotropy profile of a single 500particle halo very uncertain. However, Fig. 4 shows that tracers are sufficient to rank haloes for each of the different parameters considered here, i.e. by mass (), scale radius of the tracer distribution () and of the total mass distribution (), and outer velocity anisotropy .
The importance of projection effects is also very clear from Fig. 5, where we display the ratio of the recovered to true values of the parameters for each halo along the three different projection axes. This figure also shows that there is no trend of under or overpredicting the parameter values with halo mass.
All the above considerations apply for the NFW mass profile. Our tests with the Hernquist and Burkert mass profiles give similar results, as can be seen in Table 2, where for simplicity, we only list the results for the Clean interloperremoval algorithm and for the Tiret anisotropy model. The results are very similar for the NFW and Hernquist models. Results are similar also for the Burkert model, except for the scale of the number density profile, for which the bias and inefficiency are both higher than those obtained using the NFW and Hernquist mass models. Since the model we use for the number density profile has not changed (a projected NFW), this result suggests that it is difficult to accomodate a tracer with a central cuspy spatial distribution in a potential with a central core.
All the results described so far were obtained for a selection of 500 particles within . Changing the value of is not without effects on the results. The inefficiency on decreases when is gradually increased from to . The inefficiency on anisotropy is largest for the smallest , and statistically similar for the two larger values. Increasing the aperture to the virial radius or above increases the number of tracers near the virial radius where is estimated. Moreover, increasingly larger apertures will capture better the asymptotic value of the anisotropy profile . In contrast, is less efficiently determined when is increased from to , while has its worst inefficiency for , with statistically equivalent values for the two smaller apertures. This might be due to the increasing fraction of unidentified interlopers, and/or to the presence of (sub)structures at larger radii.
To assess the sensitivity of the MAMPOSSt technique to the number of tracers, besides the 500particle selection, we have also considered samples of 100 particle tracers, randomly extracted from the same projections of the same 11 cosmological haloes. Results of the MAMPOSSt analysis are listed at the bottom of Table 2 and displayed in Fig. 3. Also in this case, for simplicity, we only display a limited set of results. When compared to the results for the 500particle samples, there is no significant change in the average values of the bias with which the different parameters are recovered, except for , where the bias becomes negative, while it was mostly positive for the 500particle samples. The parameter value underestimation is not very severe, however, %. The efficiencies with which the different parameters are estimated are significantly affected by the reduction in number of tracers. The dispersion increases from to 15% for , from to 60% for , from to 100% for , and from to 33% for and . There is no significant change in the dispersion for , but this was already extremely large for the 500particle samples.
3.4 Cases with constraints on parameters
What is the effect of reducing the number of free parameters on the performance of the MAMPOSSt algorithm? To assess this point we consider several cases that reproduce what observers do in practice when faced with the problem of determining the internal dynamics of cosmological haloes. In all cases, we consider 500 particles selected within in each halo. We only apply the Clean interloper removal algorithm, and we only consider the NFW mass density model, for simplicity.
Specifically, we consider the following Cases:

General [Gen]: , , , and the anisotropy parameter (one among the following: , , , depending on the anisotropy model considered) are all free MAMPOSSt parameters. This is the case considered so far.

General with fitted outside MAMPOSSt [Split]: the free parameters are the same as in the Gen case, but is fitted outside MAMPOSSt, via MLE. We thus split the minimization of the parameters into two parts.

Known virial mass or radius [KVir]: is fixed and assumed to be exactly known, and the anisotropy parameter are free parameters in MAMPOSSt, is an external free parameter, as in the Split case.

Estimated virial mass or radius [EVir]: similar to KVir, except that is not the true value, but the value estimated from the LOS aperture velocity dispersion (after interloper removal, see Appendix B).

CDM: is estimated from using the theoretical relation between these two quantities provided by Macciò et al. (2008); and the anisotropy parameter are free parameters in MAMPOSSt, is an external free parameter, as in the Split case.

Mass follows Light [MfL]: and the anisotropy parameter are free parameters in MAMPOSSt, is an external free parameter, as in the Split case, and is assumed to be identical to .

Tied Light and Mass [TLM]: and the anisotropy parameter are free parameters in MAMPOSSt, while and are assumed to be an identical free parameter.

Isotropic [iso]: is assumed, are free parameters in MAMPOSSt, is an external free parameter, as in the Split case.

Anisotropy model à la MBM10 [MBM]: ML anisotropy model with forced to be identical to ; are free parameters in MAMPOSSt, is an external free parameter, as in the Split case.

Anisotropy linked to the mass density profile, using the anisotropy  slope relation of Hansen & Moore (2006) [HM].
Case  

Gen  
Split  
KVir  known  
EVir  from  
CDM  from  
MfL  
TLM  
iso  0  
MBM  =  
HM 