Self-consistent flattened isochrones
We present a family of self-consistent axisymmetric stellar systems that have analytic distribution functions (dfs) of the form , so they depend on three integrals of motion and have triaxial velocity ellipsoids. The models, which are generalisations of Hénon’s isochrone sphere, have four dimensionless parameters, two determining the part of df that is even in , and two determining the odd part of the df (which determines the azimuthal velocity distribution). Outside their cores, the velocity ellipsoids of all models tend to point to the model’s centre, and we argue that this behaviour is generic, so near the symmetry axis of a flattened model, the long axis of the velocity ellipsoid is naturally aligned with the symmetry axis and not perpendicular to it as in many published dynamical models of well-studied galaxies. By varying one of the df’s parameters, the intensity of rotation can be increased from zero up to a maximum value set by the requirement that the df be non-negative. Since angle-action coordinates are easily computed for these models, they are ideally suited for perturbative treatments and stability analysis. They can also be used to choose initial conditions for an N-body model that starts in perfect equilibrium and to model observations of early-type galaxies. The modelling technique introduced here is readily extended to different radial density profiles, more complex kinematics, and multi-component systems. A number of important technical issues surrounding the determination of the models’ observable properties are explained in two appendices.
keywords:galaxies: kinematics and dynamics
Although real galaxies are by no means in states of dynamical equilibrium, equilibrium models have a fundamental role to play in the study of galaxies and the Universe. One reason for this is that most of a galaxy’s mass is thought to reside in dark matter that can only be traced by its gravitational field, and we can map this field through the kinematics of field stars only to the extent that the stars are in dynamical equilibrium. A further reason is that modest deviations of galaxies from dynamical equilibrium are best modelled by perturbing equilibrium models – most successful branches of physics, from celestial mechanics to high-energy physics through plasma physics and quantum condensed-matter physics, comprise applications of perturbation theory.
Currently the range of equilibrium galaxy models is extraordinarily limited. The theory of globular clusters rests to a great extent on the equilibrium models introduced by Michie (1963) and popularised by King (1966). Our still very limited understanding of spiral structure owes much to the two-dimensional equilibrium models of Kalnajs (1976), Zang & Toomre (Toomre, 1981) and Evans (Evans, 1994; Evans & Read, 1998; Sellwood & Evans, 2001).
On account of the paucity of fully three-dimensional equilibrium models, several recent analyses of our Galaxy’s halo have relied on models which have distribution functions (dfs) of the form
where is a star’s energy and is the magnitude of its angular momentum (Deason et al., 2011). Models with a df of this form only make dynamical sense to the extent that the gravitational potential can be considered to be spherically symmetric, which it probably cannot be in the solar neighbourhood. Moreover, in the case of radial bias, the df (1) implies infinite phase-space density in the limit of radial orbits, which is inherently implausible and potentially compromises the model’s stability (Fridman & Polyachenko, 1984; Palmer & Papaloizou, 1987), while in the case of tangential bias, the df implies a distribution of velocities that is physically implausible because it is bimodal in (Fermani & Schönrich, 2013). Hence dfs of the form (1) do not constitute a satisfactory basis for dynamical modelling.
Although galaxies are probably never precisely axisymmetric, they are often sufficiently nearly so for axisymmetric models to be valuable starting points from which better-fitting models may be derived by perturbation theory. In this paper we present a new class of axisymmetric models. In a subsequent paper it will be shown how the technique we introduce here can be used to produce a wide range of axisymmetric models.
Already in the paper in which Jeans introduced his theorem, it was clear that the equilibrium df of our Galaxy cannot be a function of the form that depends only on the classical integrals in an axisymmetric potential. Dependence on a third integral is essential, and the field has been held back by the want of an analytic formula for . In the 1970s it became evident that the dfs of elliptical galaxies also involve in an essential way (Bertola & Capaccioli, 1975; Binney, 1976; Davies et al., 1983), although recent systematic surveys have cast a new light on this conclusion (Cappellari et al., 2007; Emsellem et al., 2007). Nevertheless, it remains true that an abundance of observational material indicates that realistic equilibrium models of real galaxies must depend on in addition to and .
Numerical orbit integrations in the 1960s showed that generic orbits in flattened axisymmetric potentials respect a third integral (Henon & Heiles, 1964; Ollongren, 1965). But these experiments did not lead to useful analytic expressions for . A promising attack on this problem through Hamiltonian perturbation theory was pursued by Contopoulos and his collaborators (Contopoulos, 1960), but this line of attack was frustrated by the fact that the vertical oscillations of most stars are far from harmonic, so their orbits are not readily treated as perturbed harmonic oscillators. Moreover the coupling between a star’s radial and vertical oscillations is fundamental to their dynamics, so neither motion should be considered in isolation.
Eddington (1915) and Stäckel (1893) showed that analytic expressions for can be obtained for a certain class of potentials that are now known as Stäckel potentials. de Zeeuw (1985) showed that many of these potentials are generated by remarkably galaxy-like density distributions, and he clarified the nature of orbits in these potentials. The present paper relies on a technique, the “Stäckel Fudge” (Binney, 2012a, hereafter B12a), which is an extension of this classic work. This approximation consists of applying to an arbitrary gravitational potential formulae that would be valid if the potential were of Stäckel’s form even though the potential does not have this form.
Since any function of a star’s isolating integrals is itself an isolating integral, there is in principle enormous freedom in the choice of arguments of a galaxy’s df. However, certain integrals stand out for special consideration: the action integrals. These alone can be embedded in a canonical coordinate system, and their conjugate (angle) variables have the remarkable property of increasing linearly in time:
Action integrals are unique up to a set of discrete canonical transformations that map between rational linear combinations of any given set of actions, and map the angle variables into integer linear combinations of the given angles. Hence when combined with reasonably physical requirements such as “the radial action should quantify the extent of radial oscillations” and “the vertical action should quantify the extent of vertical oscillations”, actions are uniquely defined. This uniqueness greatly facilitates the comparison of models by making it possible to compare the density with which orbits in slightly different potentials are populated. Therefore it is natural to require the df to be a function of the actions.
To derive from the observable properties of a model, we need expressions for the actions as function of the ordinary phase-space coordinates. Here we use the Stäckel Fudge from B12a. Numerical experiments presented in B12a show that the actions and angles that one obtains from the Fudge are significantly more accurate than those obtained from the adiabatic approximation (Binney, 2010), and they are also valid for orbits that move far from the equatorial plane and thus lie outside the range of validity of the adiabatic approximation. Whereas Binney (2012b) and Binney et al. (2014) only required actions for orbits that pass within of the solar neighbourhood, we require actions throughout our models. This change has prompted us to undertake significant revisions of the scheme described by B12a to obtain them with increased accuracy and reduced numerical effort. These revisions are described in Appendix I.
In Section 2 we define our dfs, which are derived from the df of the isochrone sphere. In Section 3 we explain how a model is relaxed to its self-consistent configuration and detail checks of the numerical accuracy. In Section 4 we describe the observable properties of some specific models and refine our choice of df. In Section 5 we discuss some potential applications of these models, including choice of initial conditions for N-body models, studies of symmetry breaking in stellar systems and related perturbation analyses, and modelling observations of early-type galaxies. Section 6 sums up. Two Appendices give numerical details that are likely to be useful when building the models.
2 The distribution functions
The isochrone potential (Hénon, 1959)
where is the model’s total mass and is its scale length, is highly unusual in that it admits analytic expressions for the both angles and actions as functions of (e.g. Binney & Tremaine, 2008, §3.5). Moreover, the associated Hamiltonian
can be written as a simple function of the actions
where is the total angular momentum. The df that self-consistently generates can be derived from Eddington’s inversion formula. It proves to be (Hénon, 1960)
generates a spherical model because and appear in it on an equal footing. We can flatten the model by causing the df to decrease with increasing faster than with increasing or . A df that achieves this goal is
where ﬀ is a triple of constants with .
The radial density profile of a spherical model is largely determined by , the number of stars per unit energy at (Binney & Tremaine, 2008, Fig. 4.5 and §4.4). That is, the radial density profile of a model changes only modestly when stars are shifted over a surface of constant . A tangentially anisotropic model is created if stars are shifted over the surface towards the line , where is the energy of a circular orbit of energy (Fig. 1). Conversely, a radially anisotropic model is created if stars are moved over surfaces of constant towards the axis. Given that the df is invariably a decreasing function of all three actions, when we replace in with , the value of is decreased for a given value of if , and increased otherwise. We would like these changes to average to zero over a surface of constant so we can interpret them as the results of moving stars over these surfaces while keeping unchanged. For , the change in caused by the substitution is
In order to preserve the radial density profile, we want the integral of over a surface of constant to vanish. This will be approximately the case if
where is the barycentre of the surface of constant , i.e., the action of the form that lies in the surface. For the isochrone potential one finds
So long as we require equation (10) to be satisfied, only two of the scaling factors should be considered independent for the third can be obtained from this equation. Below we explore models in which and are taken to be constants and becomes through equation (10) a function of energy:
where and are the angular and radial frequencies of the specified orbit in the isochrone sphere defined by equation (2).
2.1 Rotating models
The df of the isochrone is an even function of the angular momentum and this property is preserved by the transformation (8) proposed above. When the df is an even function of , the model does not rotate. It can be set rotating by adding to the given even df, , a df that is an odd function of . Then the complete df becomes
where is a free parameter that allows one to vary the rotation speed from zero at up to a maximum value that is set by the requirement that is never negative.
Given an even df , a natural definition of an odd df is
where is an odd function of . A maximally rotating model is obtained by choosing and , but with this choice of , is discontinuous on the plane , where its absolute value is typically large. To avoid such a discontinuity we adopt
Here is a dimensionless parameter that specifies the steepness of the rotation curve near the origin: the larger the value of , the more steeply the curve rises. With these choices we have
Naturally the azimuthal velocity dispersion is
3 Finding the self-consistent potential
Given values for and to specify a df, one has to recover the corresponding model’s density and self-consistent potential from iterations. One adopts some trial gravitational potential and computes the density implied by and the df on a grid in the plane. Then one computes the potential implied by this density distribution, and computes the extrapolated potential
Now one repeats this cycle with replaced by . A positive value of speeds convergence of these iterations; if is above a threshold, numerical instability sets in. For the dfs explored here, works well.
Figs 2 and 3 show results obtained with , when the trial potential is that of an isochrone sphere flattened to axis ratio . In Fig. 2 a black dotted curve shows the density on the minor axis yielded by the df in the trial potential divided by the density of the corresponding isochrone sphere. The red curve shows the corresponding ratios after one adjustment to the potential, followed by the magenta, green and blue curves for the second through the fourth adjustments of the potential. The near coincidence of the green and blue curves demonstrates that the iterations have essentially reached a stationary point. The full curves in Fig. 2 show the corresponding results for the major axis. Where the blue curves run nearly horizontally, the density profile is essentially proportional to that of the underlying isochrone sphere, as planned. Within the full and dotted blue curves necessarily converge on a point, which indicates the ratio of the central density in the final model to that of the isochrone sphere.
Finding the density and potential that correspond to a given df using five iterations of the density takes CPU hours on a desk-top machine. If the key loop of the code is parallelised a model can be constructed in less than half an hour.
3.1 Checks of accuracy
Many checks on the accuracy of the computations are possible. First, one can compute orbits in a model’s potential and evaluate the actions at different points along the orbit. The fluctuations in the computed actions are then typically of . Errors in the evaluation of forces and the potential by interpolation on the radial grid (see Appendix II) could alone account for errors of this magnitude. Interpolation of the grid in action space (Appendix I) introduces errors of order in the actions used for the evaluation of moments.
When the apparatus is used to evaluate the density of a model that is essentially the isochrone sphere by setting in the df and adopting the potential of an isochrone sphere that has been squashed to axis ratio (the case gives rise to a singularity in the equations employed), the density of the isochrone sphere is recovered to parts in .
The integral yields the mass of the model divided by and from the definition (8) of the df it follows that the model’s mass is the same as that of the isochrone, . Numerically we can compute the mass of a model that lies within radius by computing times the monopole component of the radial component of the gravitational force . In the typical case of the flattened model plotted in Fig. 2, with the outer edge of the grid set to , the radial force implies that lies inside , while in the isochrone sphere lies inside . The discrepancy between these two masses is a small fraction of the mass that lies outside the grid in the spherical case. This finding is consistent with there being no error in the computed mass and potential. Note too that we expect to obtain the same mass by integrating over as we do by integrating over only because the system is canonical. Hence the mass check tests the validity of the Stäckel Fudge.
The virial theorem provides three useful checks on both the validity of the calculations and the convergence of the potential. For an axisymmetric system the tensor virial theorem has two non-trivial components, one, associated with the cylindrical radius and a vertical component, . The sum of these two components constitutes the scalar virial theorem, . After five iterations one finds in a typical case that while and .
A further check on accuracy is provided by the Jeans equations. As discussed below, these are satisfied to within the precision with which we are evaluating spatial derivatives within the final model.
We consider first non-rotating models. The half-mass radius of the isochrone sphere is .
4.1 Ellipticity profiles
The top panel of Fig. 3 shows contours of constant density of the model with , in the plane, with four contours per decade. The flattening of the model is evident. The black curve in the lower panel of Fig. 3 shows the ellipticity of isodensity surfaces as a function of the length of the semi-major axis. From to is nearly constant, falling from to . Inside , the ellipticity rises steeply. In fact in this model the peak density is not reached at the centre but in the equatorial plane at . The density at the centre is, however, times the peak density, so the upward lurch of the ellipticity curve in Fig. 3 is caused by a very minor depression in the central density.
The black curve in Fig. 4 shows the ellipticity as a function of the logarithm of radius in the model obtained by setting and . The choice implies that the dependence of the df on is precisely that of the isochrone sphere, so equation (10) now causes to be materially smaller than unity, so the df becomes radially biased. The ellipticity peaks at just outside the core and further out slowly declines, reaching at . Inside the core plunges to negative values, passing zero at and reaching . Thus the innermost part of the model is prolate rather than oblate. It is instructive to understand why.
The figure shows that the dominant terms in this equation are the first and last terms, which are plotted in red and blue, respectively. The other two terms are comparable outside but further in the anisotropy term (that involving ), shown in magenta, first dominates the term involving and then dominates the radial-force term, which has to vanish at the origin. It acts in the same sense as the radial-force term, i.e., inwards, and the pressure term (plotted in red) has to grow in magnitude to counteract it. It is this term that makes the model’s centre prolate. Fig. 6 shows the radial dependencies of , and along this model’s principal axes. We see that at the centre is approached through the equatorial plane, , plotted in blue, does approach (black) from below, but Fig. 5 shows that it does not do so quite fast enough to prevent the anisotropy term in the radial Jeans equation growing as the centre is approached.
An analogous analysis of the Jeans equations for the oblate model plotted in Fig. 3 shows that in this model the anisotropy term is slightly negative, so it pushes material away from the centre and thus gives rise to the slight central depression in the density that is responsible to the central spike in the model’s ellipticity curve.
Physically the steep rise in the ellipticity of the tangentially biased model and the central prolateness of the radially biased model are not very significant because in the nearly homogeneous core a small difference between the densities distance from the origin along the minor and major axes can translate into a large ellipticity of the isodensity surfaces. Nonetheless, these central ellipticity changes detract from the models’ elegance, and we seek a modification of the df that will moderate or eliminate them.
From the Jeans equations it is clear that the key is to ensure that fast enough as . Since the df of the isochrone sphere has everywhere, our goal should be attainable by ensuring that as . Setting in equation (10) we find that
with given by equation (11). We can now require that both and tend to as by writing
where is a given constant, is computed from equation with replaced by , and
is a function that vanishes at the origin and is essentially unity for values of the argument bigger than .
The broken red curve in the lower panel of Fig. 3 shows the run of ellipticity we obtain when we use equations (4.1) with and . The steep central rise in of the original model has been replaced by a modest decline to a central value just above 0.2. The dashed red curve in Fig. 4 shows the ellipticity of the radially biased model when equation (4.1) is employed. The model now becomes slightly oblate rather than prolate deep in the core.
4.2 The velocity ellipsoids
Fig. 7 shows the variation within the plane of the velocity dispersions , and within the azimuthally biased, non-rotating model with and . We see that surfaces of constant and are quite flattened, while those of are distinctly prolate. Fig. 8 is the analogous figure for the radially biased model with , . Now the surfaces of constant are decidedly more flattened and those of constant are less flatted, and the surfaces of constant are even more prolate.
Fig. 9 shows the orientation of the velocity ellipsoids of the tangentially (top) and radially biased non-rotating models. The plots are remarkably similar. Beyond the ellipsoids are approximately aligned with spherical polar coordinates, while at smaller radii the ellipsoids swing towards alignment with cylindrical polar coordinates.
Fig. 10 shows how the principal velocity dispersions vary along the major (full curves) and minor axes of the radially biased model , . The most remarkable feature is the extreme flatness of the profile along the minor axis – there is no decrease in between the centre and . Further out it converges on the curve for along the major axis, and is significantly higher than the curve for on the minor axis. The corresponding plot for the azimuthally biased model , shows that the curve for remaining flat until it converges with the curve for along the major axis and follows it down. Thus beyond the core of any model it seems that varies along the minor axis much as varies along the major axis. This reflects the strong connection between these moments and the way the df depends on .
4.3 Projections of rotating models
The left panel of Fig. 11 shows the projected mass density of the azimuthally biased model , when it is viewed from the equatorial plane. The isodensity contours have ellipticity and are clearly boxy: the disciness coefficient (Binney & Merrifield, 1998, eq. 4.10) . Only one of the 48 galaxies in the SAURON survey presented by Emsellem et al. (2007) has a more negative disciness. It seems likely that in real early-type galaxies such a negative disciness from the spheroid is counteracted by a strongly positive contribution to the disciness from an embedded disc.
The centre and right panels of Fig. 11 show the projected velocity dispersion and mean-streaming velocity of the maximally rotating version of the model. The contours of constant velocity dispersion are very boxy. The mean streaming velocity decreases with distance from the major axis, so the galaxy does not rotate on cylinders. Emsellem et al. (2007) defined as a measure of rotation rate the parameter
where angle brackets signify luminosity-weighted averages over the part of the image that lies within the effective radius . The data plotted yield . By reducing the parameter appearing in equation (13) we can produce a model with the same ellipticity and any value of up to 0.19. By decreasing the inclination at which we view the model plotted in Fig. 11, we can construct models in which and move to the origin on a certain curve. In Fig. 12 the squares show the points along this curve for inclinations .
Projection of the radially biased model , yields isodensity contours of ellipticity and disciness . The maximally rotating model () has when viewed edge-on. The triangles in Fig. 12 show points in the plane for inclinations .
5.1 N-body models
One of the commonest applications of self-consistent models that have analytic distribution functions such as Michie–King models (Michie, 1963; King, 1966) and Hernquist models (Hernquist, 1990) is the generation of N-body models that start in an equilibrium configuration rather than experiencing an early period of violent relaxation towards an uncontrolled equilibrium. Hence we now explain how the present models can be used to choose initial conditions for an N-body model.
We start by sampling the isochrone sphere. This is conveniently done by defining
is the escape speed, tabulating the integrals
on a suitable grid in the rectangle , . Then the df of the isochrone sphere can be sampled by picking a number that is uniformly distributed in [0,1] and finding by interpolation on the grid the radius that satisfies and then choosing a new random number and determining the value of that satisfies . Now we evaluate the isochrone’s df at this radius and kinetic energy.
Next we choose random directions for the position and velocity vectors and and evaluate at the chosen phase-space position the actions for the flattened model, and thus evaluate the df . of this model. We accept this point with probability , where is a constant of order unity chosen to ensure that the ratio never exceeds unity.
5.2 Stability of models
For some values of ﬀ the model will be unstable to bar formation. Specifically, if is too small and the model too radially biased, it will suffer from the radial-orbit instability (Fridman & Polyachenko, 1984; Palmer & Papaloizou, 1987). Similarly, if the model is set rotating too fast by adding a large odd df , it will develop a rotating bar. Investigation of the onset and development of these instabilities promises to be a fascinating field of research that would extend to stellar dynamics the classical work of Dedekind, Jacobi and Riemann on rotating fluid bodies (Chandrasekhar, 1969).
It is likely that for values of that are smaller than some critical value, a triaxial equilibrium can be found for the given that has lower energy than the axisymmetric equilibrium constructed here, and that axisymmetric models with are liable to the radial-orbit instability. In the limit of infinitely many stars implicit in our discussion, the transition from an axisymmetric equilibrium to a triaxial one has all the characteristics of a phase transition. It would be fascinating to know the nature of this transition.
Currently we are not in a position to construct triaxial models given because the Stäckel Fudge used here seems not to extend to triaxial potentials. Torus mapping does extend to barred systems, even rotating ones (Kaasalainen & Binney, 1994; Kaasalainen, 1995), so it should be possible to build models with given by this method.
In addition to establishing the relations between axisymmetric and triaxial equilibria, one would want to follow the dynamics of the instability that effects the loss of symmetry. The present models are ideal for such an investigation from two respects. First, the growth of the perturbation can be followed in angle-action coordinates, which were invented to study the stability of the solar system and are thus the natural coordinates of Hamiltonian perturbation theory. Although angle-action coordinates have been used by a number of authors to study the stability of planar, axisymmetric discs (e.g. Kalnajs, 1977) and spherical galaxies (e.g. Weinberg, 1991; Saha, 1992), only a small number of rather special models have been studied in this way for want of a wider range of models for which angle-action coordinates are available.
Another direction of research into the stability of galaxies that is opened up by the present models is the method of perturbation particles (Leeuwin et al., 1993; Leeuwin & Athanassoula, 2000). In this method, invented in unpublished work by G. Rybicki, the initial model is represented by an analytic df and particles are used merely to quantify the difference between the model’s time-evolving state and the initial condition. Because the particles start massless and only gather (positive and negative) mass as the dynamics unfolds, the Poisson fluctuations in the gravitational potential are much smaller than in a conventional N-body simulation with the same number of particles, and physics can be explored to higher precision. To date a major restriction on the use of the method has been the shortage of dynamically interesting equilibria for which is known analytically. Hence our models greatly widen the range of applicability of this promising method.
5.3 Modelling early-type galaxies
The advent of integral-field spectrographs has rejuvenated the study of the internal dynamics of early-type galaxies (Bacon et al., 2001; Emsellem et al., 2007; Cappellari et al., 2011). Now that it is feasible to quantify the line-of-sight velocity distribution (LOSVD) over a large part of the image of an E or S0 galaxy, sophisticated dynamical models can be fitted to the data. In addition to mapping the variation of the mass-to-light ratio within early-type galaxies, these models have revealed internal structures in these systems, such as discs and kinematically decoupled cores.
The models fitted have been of two types: Schwarzschild orbit-superposition models (Schwarzschild, 1979; van de Ven et al., 2008) and models based on the Jeans equations (Satoh, 1980; Binney et al., 1990; Cappellari, 2008). In either case the three-dimensional luminosity distribution is inferred from the photometry under some assumption of symmetry and inclination angle, a matching potential is adopted, and then model parameters are adjusted to optimise the fit between predicted and observed kinematics.
Schwarzschild modelling is very general but cumbersome because the model parameters are the weights of some thousands of orbits that together form a library, and the selection of orbits for the library is an art rather than a science. In view of these objections, models based on the Jeans equations are widely used although they lack either generality or rigour depending on how the modelling is done. The rigorous approach is to assume that , which is equivalent to assuming that the df has the restricted “two-integral” form . The df of our own Galaxy is very different from a two-integral df so it is essential to fit more general models to observations of external galaxies.
Recently the “Jeans Anisotropic Multi-Gaussian Expansion” (JAM) models of Cappellari (2008) have been widely used. These models assume that the principal axes of the velocity ellipsoid are always aligned with the cylindrical coordinate directions, and that , where is a constant. Cappellari recognises that the principal axes really align much more nearly with prolate ellipsoidal coordinates than cylindrical coordinates (as Fig. 9 confirms) but argues that on the minor axis, as in the equatorial plane, the short axis is parallel to the axis, while at intermediate latitudes the ellipsoid is nearly spherical. At points on the minor axis of any of our models the long axis of the velocity ellipsoid points radially rather than tangentially except in the core. It is worth understanding why this is so.
Only orbits with small values of approach the minor axis, and in three dimensions an orbit of this type comprises an elliptical annulus that lies in a plane that is only slightly inclined to the minor axis and precesses around the axis. If the orbit’s radial action is diminished, the annulus may shrinks within the precessing plane to an elliptical curve, and in the limit the orbit becomes a shell orbit. If is increased, the annulus becomes thick on account of the large radial excursions along the orbit. At a fixed energy, the sequence of orbits that starts with the shell orbit and proceeds through orbits of ever higher eccentricity is a sequence in which grows from zero to infinity. Along the model’s minor axis, orbits in the early part of this sequence stretch the model’s velocity ellipsoid in the tangential direction, while orbits in the later part stretch the ellipsoid radially. Hence the tangential bias along the minor axis that is assumed in the JAM models implies dominance by orbits with small . However it is precisely these orbits that we have suppressed in order to make the model oblate. Hence there is an essential connection between the flattening of our models and the radial bias of the velocity ellipsoids along the minor axis beyond the core. (Inside the core orbits with small tend to oscillations along the minor axis rather than thin annuli in a precessing plane and the argument above does not hold.)
To construct a model in which the velocity ellipsoids behave as assumed when building a JAM model one might add to one of our models a distinct population of stars on essentially shell orbits, i.e. orbits with and . These orbits are elongated parallel to the model’s symmetry axis, so adding them will reduce the model’s flattening. But by the same token they will tend to dominate the population of stars on the axis and thus there stretch the velocity ellipsoids tangentially. Fig. 13 shows the structure of the velocity ellipsoids inside a model of this type. Specifically a new component was added to the df of the model with , . The df of the new component is
where is defined by equation (11). With the new component included, the longest axis of the velocity ellipsoids (plotted in black) points tangentially out to along the minor axis rather than only within as in the original model. This change arises because the velocity ellipsoids do not twist around as one moves outwards near the minor axis, as they do in the original model. Inevitably, the addition of the new component diminishes the model’s flattening: declines from at the half-mass radius to zero at , and the model is prolate at smaller radii.
It may be that many fast-rotating early-type galaxies do contain a distinct component that comprises near-polar orbits like the model shown in Fig. 13, but if this is the case, it is a remarkable circumstance. In any case the discussion above makes it physically evident that JAM models are far from generic. If rigorous construction of models with their presumed properties is possible, a prerequisite would seem to be a df that has at least two peaks on each surface of constant energy in action space: most stars must be associated with a peak in the region of low and be responsible for the model’s flattening, while a second peak near the vertex of that surface is responsible for the tangential orientation of the velocity ellipsoids along the model’s minor axis.
We have concentrated on exceptionally simple dfs. Galaxies with different shapes and kinematics could be constructed using dfs that are either linear combinations of the dfs explored here, or involve other simple functional forms for . In particular it would be straightforward to make a model that, like NGC 4550, has counter-rotating discs (Rubin et al., 1992), or a galaxy that has a kinematically decoupled core.
5.4 Multi-component galaxies
All galaxies are thought to contain substantial amounts of dark matter, and most galaxies contain stellar discs in addition to a spheroidal stellar component. It is straightforward to generalise the present models to multi-component systems: one simply chooses a df for each component. A very convenient aspect of this choice is that each component’s mass is specified by the integral , so it is determined before one solves for the model’s spatial form. The latter is done just as in Section 3 with the df given by . Naturally the observables of the relaxed model are obtained by integrating only over the dfs of the stellar components. We hope shortly to present models of our Galaxy that have been constructed in this way by combining a df for of the type described by Binney (2012b) with a df for the dark halo of the type described by Pontzen & Governato (2013).
We have presented a new type of self-consistent model of hot, axisymmetric stellar systems that have specified dfs , where is the triple of action values. The even part of the df is specified by two dimensionless parameters and . When the model becomes oblate. If the model is radially biased, while dropping below unity reduces the radial bias and, for sufficiently small values of , the model becomes azimuthally biased. The odd part of the df, which does not contribute to the model’s density profile , is controlled by two dimensionless parameters, , which controls the steepness of the rotation curve at the centre, and , which determines how fast the model rotates: increasing both speeds up the model’s rotation and diminishes the magnitude of the azimuthal velocity dispersion .
We focused on the observables of just two exemplary models, a tangentially biased model with and a radially biased model with . Both models achieve peak ellipticities at about and are distinctly boxy when seen edge-on. In each model the velocity ellipsoids are aligned with cylindrical coordinates within the core, but beyond the core they quickly align with radial polar coordinates, and along the minor axis the ellipsoids point towards the centre rather than tangentially. We have argued that the only way to make the velocity ellipsoids point tangentially at significant distances down the minor axis is to include a distinct component of stars on nearly polar orbits.
The models could provide initial conditions for N-body models that start from perfect equilibrium, something that is possible only when the df is explicitly known. Since the angle-action coordinates of any point in the phase space of one of these models are readily computed, the models provide perfect testbeds for studies of galactic stability and evolution.
The present models differ from Schwarzschild models in having vastly fewer free parameters: four versus the number of orbits in the orbit library employed. They differ from models based on the Jeans equations in providing full velocity distributions rather than just the first two moments of the distributions. It would be straightforward to extend these models in various directions. For example, it is easy to devise other approaches than simple action scaling to move from the df of an ergodic model to a three-integral df f(J), and we are currently exploring one of these alternatives. Another direction in which the present work is being extended is to multi-component systems, in which stars and dark matter have distinct distribution functions, and, in the case of our Galaxy, the thin disc, thick disc and halo stellar populations all have distinct dfs.
We started from the isochrone sphere because it provides an analytic expression for . In a forthcoming paper we will show how to obtain a good approximation to an analogous expression for any spherical model, and thus extend the present work to other popular spherical systems, such as the Hernquist (1990) sphere.
- Bacon et al. (2001) Bacon R. et al., 2001, MNRAS, 326, 23
- Binney (1976) Binney J., 1976, MNRAS, 177, 19
- Binney (2010) Binney J., 2010, MNRAS, 401, 2318 (B10)
- Binney (2012a) Binney, J., 2012, MNRAS, 426, 1324 (B12a)
- Binney (2012b) Binney, J., 2012b, MNRAS, 426, 1328
- Binney et al. (2014) Binney J., Burnett B., et al., 2014, MNRAS, in press (arXiv1309.4285)
- Binney et al. (1990) Binney J., Davies R.L., Illingworth G., 1990, ApJ, 361, 78
- Binney & Merrifield (1998) Binney J., Merrifield M.R., 1998, “Galactic Astronomy”, Princeton University Press
- Binney & Tremaine (2008) Binney J., Tremaine S., 2008, Galactic Dynamics, Princeton University Press: Princeton
- Bertola & Capaccioli (1975) Bertola F., Capaccioli M., 1975, ApJ, 200, 439
- Cappellari et al. (2011) Cappellari M., et al., 2011, MNRAS, 416, 1680
- Cappellari et al. (2007) Cappellari M., Emsellem E., et al, 2007, MNRAS, 379, 345
- Cappellari (2008) Cappelari, M., 2008, MNRAS, 390, 71
- Chandrasekhar (1969) Chandrasekhar S., 1969, “Ellipsoidal figures of equilibrium”, Yale University Press, New Haven
- Contopoulos (1960) Contopoulos G., 1960, ZA, 49, 273
- Davies et al. (1983) Davies R.L., Efstathiou G., Fall S.M., Illingworth G., Schechter P.L., 1983, ApJ, 266, 41
- Deason et al. (2011) Deason, A.J., Belokurov, V. & Evans, N.E., 2011, MNRAS, 411, 1480
- Delhaye (1965) Delhaye, J., 1965, in Galactic Structure, eds Blaauw, A., Schmidt, M. (Chicago: University of Chicago Press), p. 61
- de Zeeuw (1985) de Zeeuw P.T., 1985, MNRAS, 216, 273
- Eddington (1915) Eddington, A.S., 1915, MNRAS, 76, 37
- Emsellem et al. (2007) Emsellem, E., et al., 2007, MNRAS, 379, 401
- Evans (1994) Evans N.w., 1994, MNRAS, 267, 333
- Fermani & Schönrich (2013) Fermani F., Schönrich R., 2013, MNRAS, 432, 2402
- Fridman & Polyachenko (1984) Fridman A.M., Polyachenko V.L., 1984, “Physics of Gravitating Systems”, Springer, New York
- Hénon (1959) Henon M., 1959, Ann Ap, 22, 126
- Hénon (1960) Hénon M., 1960, Ann Ap, 23, 474
- Henon & Heiles (1964) Hénon M., Heiles K., 1964, AJ, 69, 73
- Hernquist (1990) Hernquist L., 19990, ApJ, 356, 359
- Kaasalainen (1995) Kaasalainen, M., 1995, PhRvE, 52, 1193
- Kaasalainen & Binney (1994) Kaasalainen, M. & Binney, J., 1994, MNRAS, 268, 1033
- Kalnajs (1976) Kalnajs A., 1976, ApJ, 205, 751
- Kalnajs (1977) Kalnajs A., 1977, ApJ, 212, 637
- King (1966) King I.R., 1966, AJ, 71, 64
- Leeuwin et al. (1993) Leeuwin F., Binney J., Combes F., 1993, MNRAS, 262, 1013
- Leeuwin & Athanassoula (2000) Leeuwin F., Athanassoula E., 2000, MNRAS, 317, 79
- Michie (1963) Michie R.W., 1963, MNRAS, 125, 127
- Ollongren (1965) Ollongren A., 1965, in “Stars and stellar systems V”, A. Blaauw & M Schmidt eds., Chicago U.P., p. 501
- Palmer & Papaloizou (1987) Palmer P., Papaloizou J., 1987, MNRAS, 224, 1043
- Pontzen & Governato (2013) Pontzen A., Governato F., 2013, MNRAS, 430, 121
- Evans & Read (1998) Evans N.W., Read J., 1998, MNRAS, 300, 106
- Rubin et al. (1992) Rubin V.C., Graham J.A., Kenney J.D.P., 1992, ApJ, 394, L9
- Saha (1992) Saha P., 1991, MNRAS, 254, 132
- Satoh (1980) Satoh C., 1980, PASJ, 32, 41
- Schwarzschild (1979) Schwarzscild M., 1979, ApJ, 232, 236
- Sellwood & Evans (2001) Sellwood J.A., Evans N.W., 2001, ApJ, 546, 176
- Stäckel (1893) Stäckel P., 1893, Math Ann., 42, 537
- Toomre (1981) Toomre A., 1981, in “The structure and evolution of normal galaxies”, S.M. Fall & D. Lynden-Bell eds., Cambridge U.P., p.111
- van de Ven et al. (2008) van de Ven G., de Zeeuw P.T., van den Bosch R.C.E., 2008, MNRAS, 385, 614
- Weinberg (1991) Weinberg M., 191, ApJ, 368, 66
Appendix I: Improving the evaluation of actions
The scheme for the evaluation of actions is that described in B12a except for modifications described here.
A redefined third integral
A redefinition of the third integral extracted from the Stäckel Fudge proves expedient: in the notation of B12a we now use as the third integral
where is the location of the minimum in the effective potential that governs oscillations in . Thus defined is related to the third integral defined by equation (9) of B12a by
Subtracting ensures that for a shell orbit, and dividing by ensures that is almost invariant under a change in . In fact, has many of the properties one expects of the “radial energy” and is thus a nicely physical third integral.
As discussed by B12a, the integrals over velocity space are significantly accelerated if actions are obtained by interpolation from variables that can be computed from algebraically rather than by evaluation of integrals over and . Interpolation errors prove to dominate the error budget, but they can be minimised by judicious choice of the grid. Three suitable algebraic quantities are the difference between the orbit’s energy and the energy of the circular orbit with its value of , and the radial energy (eq. 27). Together specify the effective potentials in which the star is presumed to oscillate in and , and thus make the actions and available by quadrature. Our approach is to tabulate the values of these integrals on a grid in
It is impracticable for the grid to cover the whole of action space, which is infinite. Good accuracy can be achieved, however, by covering the part of action space on which the df is largest. This part occupies the tetrahedron that lies between the approximately planar surface and the origin (Fig. 1). We take the primary axis to be the axis, which in Fig. 1 runs left to right. At each grid point on this axis we have to consider a triangular domain. At low the triangles are large, and they shrink with increasing . Fig. 14 shows one of these triangles. Running across it is a series of lines of constant .
The grid points with known actions are defined as follows. For each grid value of we choose a grid of values of by taking equal increments in the speed that determines . Next we find the intersection with the equatorial plane of the shell orbit () of the given values by locating the minimum of the effective potential . Once this point has been located, we can compute the speed with which these orbits pass through the point and we compute the values of and the actions for a series of orbits that pass through the point moving at equally spaced angles with respect to the vertical: when , the shell orbit is generated, while when , a planar orbit is generated, which has the largest value of of any orbit with the given . In Fig. 14 the generated orbits form a sequence of points that runs along one of the lines of constant that slope down from the vertical axis to the horizontal axis.
Interpolation within this grid works as follows. Given a triple , we determine between which two planes of constant the triple lies; let be the fractional displacement of the triple from the lower plane. Then by linear interpolation on values stored for each grid value of we estimate for the given angular momentum and thus obtain and thus the speed . We interpolate linearly between the maximum speeds used for the adjacent angular-momentum grid points to produce the renormalised variable .
The adjacent grid planes of constant have grid points along lines of constant , and we find between which two such lines our point lies on both planes. In a given plane, let denote the fractional displacement implied by above the line of smaller .
Now that we have identified the nearest grid lines , we have to identify the points on these lines that most nearly correspond to our point. By linear interpolation on the end points of our lines we estimate the maximum value of along the (non-grid) line of constant through our point, and use this to produce the dimensionless coordinate . Then along each grid line we find the actions associated with this value of , and finally combine these four values of each of and using the weights provided by and .
A tiresome complication is that for the line degenerates into the single point , and these actions have not been explicitly calculated. So, the lowest value of in the stored values for a plane is greater than zero, and our value of may prove to be smaller than this. In this case we have to interpolate between zero actions and the actions on the line associated with the first stored line .
The array structure used to store the actions and values of are non-cubical: the number of values of stored decreases with increasing from to . At fixed the number of values of stored for each value of increases from to . The stored values of increase from a small value to .
The grid has uniform increments between a value as close as possible to zero (there is a coordinate singularity at zero, so we must avoid it) and a largest value of , where is the largest radius at which moments are required.
Choosing the focal distance
B12a shows that for orbits in the extended solar neighbourhood was a reasonable choice for the distance down the axis of the focus of the confocal coordinate system upon which the Stäckel Fudge relies. B12a shows also that results are not sensitive to this parameter. Now that we need to use the Stäckel fudge at all points of a variety of potentials, one needs an algorithm for choosing for any orbit in any potential.
Since a shell orbit should lie on a surface , a natural procedure is to compute shell orbits for relatively large , fit them with ellipses in the plane, and to read off from these ellipses.
There is a shell orbit that reaches to the axis for any pair and for the flattened isochrone one can fit ellipses to these orbits over the entire grid. One finds that when the corresponding curves of at fixed for different values of are plotted together, they lie one on top of another. Consequently, we take to be a function of alone. For a given, small value of we compute the maximal shell orbits on the grid in and fit them with ellipses.
From the orbit integrations we have the point in the middle of the orbit . We start by seeking an ellipse that passes through this point and near to some other point on the orbit. An ellipse through is
and we minimise
with respect to at fixed by seeking the solution to
From this we obtain :
Finally we have
Appendix II: Computing
The grid on which the density is computed is based on Gauss-Legendre integration with respect to colatitude and finite-difference integration in spherical radius . The potential is obtained as a sum
over even-order Legendre polynomials, with the coefficients obtained by interpolation. We take . The radial grid points are
Consequently, the grid points are uniformly spaced in for and uniformly spaced in for .
To evaluate the one has to compute integrals (cf eq. 2.95 of BT08)
where is obtained by integrating with respect to .
At small we know that like , so we estimate the integrals between grid points at small by
Beyond a fiducial radius these integrals are estimated as
with appropriate special treatment of the case . The radial derivatives of are obtained at the grid points by analytic differentiation of the power-series expansion:
Values of the and their first two radial derivatives are obtained at general points by interpolation from the grid-point values. Since the first non-trivial term in the power-series expansion of around the origin is , quadratic interpolation in between the nearest three grid points is used inside the tenth radial grid point. Linear interpolation is used further out.
When the potential is required at a radius that lies outside the grid, it is readily obtained from the potential at the edge of the grid because in the vacuum .
The tangential derivatives of are obtained by analytic differentiation of the Legendre polynomials.