RAMSES-RT: Radiation hydrodynamics in the cosmological context
We present a new implementation of radiation hydrodynamics (RHD) in the adaptive mesh refinement (AMR) code RAMSES. The multi-group radiative transfer (RT) is performed on the AMR grid with a first-order Godunov method using the M1 closure for the Eddington tensor, and is coupled to the hydrodynamics via non-equilibrium thermochemistry of hydrogen and helium. This moment-based approach has the key advantage that the computational cost is independent of the number of radiative sources - it can even deal with continuous regions of emission, such as bound-free emission from gas. As it is built directly into RAMSES, the RT takes natural advantage of the refinement and parallelization strategies already in place. Since we use an explicit advection solver for the radiative transport, the time step is restricted by the speed of light - a severe limitation that can be alleviated using the so-called "reduced speed of light" approximation. We propose a rigorous framework to assess the validity of this approximation in the various conditions encountered in cosmology and galaxy formation. Finally, we perform a complete suite of RHD tests with our newly developed code, comparing our results to other RHD codes. The tests demonstrate that our code performs very well and is ideally suited for exploring the effect of radiation on current scenarios of structure and galaxy formation.
keywords: methods: numerical, radiative transfer
With the surging interest in reionization and the first sources of light in the Universe, and thanks also to steadily increasing computational power, cosmological simulation codes have begun to include ionizing radiative transfer (RT) in the last decade or so. Ionizing radiation is generally a second-order component in most astrophysical processes, but an important one nonetheless, and it is obviously central to simulating reionization. Due to the challenges involved, most implementations have started out with the post-processing of ionizing radiation on simulations including only dark matter, but a few have begun doing coupled radiation hydrodynamics (RHD), which models the interplay of radiation and gas.
It is highly desirable to follow self-consistently, with RHD simulations, the time-evolution and morphology of large-scale intergalactic medium (IGM) reionization and, at the same time, the smaller-scale formation of the presumed sources of reionization: how galaxy formation is regulated by the ionizing radiation being released, how much of the radiation escapes from the galaxies to ionize the IGM, how first-generation stars are formed in a metal-free environment, and how radiative and supernova feedback from those stars affects the intergalactic medium. The galaxies and the IGM are indeed inter-connected via the ionizing radiation: the photons released from the galaxies affect the state of the surrounding gas via ionization and heating, and may even prevent it from falling into or condensing within gravitational potentials, especially small ones (e.g. Wise & Abel, 2008; Ocvirk & Aubert, 2011), which can in turn significantly alter the ionization history.
The importance of RT and RHD is of course not limited to the epoch of reionization. Stars keep emitting ionizing radiation after this epoch and their radiative feedback likely has an effect on the post-reionization regulation of star-formation (e.g. Pawlik & Schaye, 2009; Hopkins, Quataert & Murray, 2011), the mass distribution of stellar populations (Krumholz, Klein & McKee, 2012) and even gas outflows (Hopkins et al., 2012).
Radiation hydrodynamics is complex and costly in simulations. The inclusion of coupled radiative transfer in hydrodynamical codes is challenging mainly because of the high dimensionality of radiative transfer (spatial, angular, and frequency dimensions) and the inherent difference between the typical timescales of radiative transfer and non-relativistic hydrodynamics. Simulating the interaction between small and large scales (so relevant to the epoch of cosmic reionization) makes things even worse: one wants to simulate, in a statistically significant region of the Universe (i.e. of the order of 100 comoving Mpc across), the condensation of matter in galaxy groups on Mpc scales, down to individual galaxies on kpc scales, followed by the formation of stellar nurseries in those galaxies on pc scales, ultimately the formation of stars on sub-pc scales, and then the effect of radiation from those stars back on the large-scale IGM. This cycle involves size differences of some 9 to 10 orders of magnitude, which is too much for even the most advanced codes and computers today, even without the inclusion of radiative transfer.
Due to these challenges, simulations typically focus on only a subset of these scales; either they consider reionization on large scales and apply sub-resolution recipes to determine stellar luminosities and UV escape fractions, or they ignore the cosmological context and focus on star formation and escape fractions in isolated galaxies or even isolated stellar nurseries.
A number of large scale 3D radiative transfer simulations of reionization have been carried out in recent years (e.g. Gnedin & Ostriker, 1997; Miralda-Escudé, Haehnelt & Rees, 2000; Gnedin, 2000; Ciardi, Stoehr & White, 2003; Sokasian et al., 2004; Iliev et al., 2006b; Zahn et al., 2007; Croft & Altay, 2008; Baek et al., 2010; Aubert & Teyssier, 2010; Petkova & Springel, 2011b), though they must all to some degree use subgrid recipes for star formation rates, stellar luminosities and UV escape fractions, none of which are well constrained. The ionization history in these simulations thus largely depends on these input parameters and resolution – some in fact use the observational constraints of the ionization history to derive constraints on these free parameters (e.g. Sokasian et al., 2004; Croft & Altay, 2008; Baek et al., 2010; Aubert & Teyssier, 2010; Petkova & Springel, 2011b). Furthermore, most of these works have used a post-processing RT strategy instead of RHD, which neglects the effect the ionizing radiation has on the formation of luminous sources.
The primary driver behind this work is the desire to understand the birth of galaxies and stars during the dark ages, and how they link with their large scale environment. We have thus implemented a RHD version of the widely used cosmological code RAMSES (Teyssier, 2002), that we call RAMSES-RT, with the goal of running cosmological RHD simulations, optimized for galactic scale radiation hydrodynamics. RAMSES is an adaptive mesh refinement (AMR) code, which greatly cuts costs by adaptively allowing the resolution to follow the formation of structures. The RHD implementation takes full advantage of the AMR strategy, allowing for high resolution simulations that can self consistently model the interplay of the reionizing Universe and the formation of the first galaxies.
Some of the goals we will be able to tackle with this implementation are:
Study radiative feedback effects in primordial galaxies. These galaxies are by definition young and small, and the first stars are thought to be gigantic and very bright due to the lack of metals. The ionizing radiation from these first stars is likely to have a dramatic effect on the galaxy evolution. This is closely associated with the formation of molecules, needed to form the first stars, which is sensitive to the radiation field. Radiative feedback effects also appear to be relevant in lower-redshift galaxies, and likely have a considerable impact on the initial mass function of stellar populations (Krumholz, Klein & McKee, 2012).
Investigate the escape of ionizing photons from early galaxies, how it affects the ionization history and external structure formation, e.g. the formation of satellite galaxies.
Study the emission and absorption properties of galaxies and extended structures. Observable properties of gas are highly dependent on its ionization state, which in turn depends on the local radiation field (e.g. Oppenheimer & Schaye, 2013). To predict it correctly, and to make correct interpretations of existing observations, one thus needs to model the ionization state consistently, for which RHD simulations are needed.
Improve sub-resolution recipes: of course we have not implemented a miracle code, and we are still nowhere near simulating simultaneously the 9 to 10 orders of magnitude in scale needed for fully self-consistent simulations of reionization. Sub-resolution strategies are still needed, and part of the objective is to improve those via small-scale simulations of stellar feedback (SNe, radiation, stellar winds).
It is useful here to make clear the distinction between continuum and line radiative transfer: our goal is to study the interplay of ionizing radiation, e.g. from stellar populations and AGN, and the interstellar/intergalactic gas. We consider continuum radiation, because the spectra of stars (and AGN) are smooth enough that emission and absorption processes are not sensitive to subtle rest-frame frequency shifts, be they due to local gas velocities or cosmological expansion.
On the other side is line transfer, i.e. the propagation of radiation over a narrow frequency range, usually corresponding to a central frequency that resonates with the gas particles. An important example is the propagation of Lyα photons. Here, one is interested in the complex frequency and direction shifts that take place via scattering on the gas particles, and gas velocities and subtle frequency shifts are vital components. Line transfer is mostly done to interpret observational spectra, e.g. from Lyα emitting/absorbing galaxies (e.g. Verhamme, Schaerer & Maselli, 2006), and is usually run in post-processing under the assumption that the line radiation has a negligible effect on the gas dynamics (though this assumption is not necessarily true; see Dijkstra & Loeb, 2009).
There is a bit of a grey line between those two regimes of continuum and line radiation - some codes are even able to do both (e.g. Baek et al., 2009; Pierleoni, Maselli & Ciardi, 2009; Yajima et al., 2012). Our implementation deals strictly with continuum radiation though, as do most RHD implementations, for the sake of speed and memory limitations. We do approximate multi-frequency, but only quite coarsely, such that simulated photons represent an average of photons over a relatively wide frequency range, and any subtle frequency shifts and velocity effects are ignored.
1.1 Radiative transfer schemes and existing implementations
Cosmological hydrodynamics codes have traditionally been divided into two categories: Smoothed Particle Hydrodynamics (SPH) and AMR. The drawbacks and advantages of each method have been thoroughly explored (e.g. Agertz et al., 2007; Wadsley, Veeravalli & Couchman, 2008; Tasker et al., 2008) and we now believe that both code types agree more or less on the final result if they are used carefully with recently developed fixes and improvements, and if applied in their regimes of validity. On the radiation side, it is quite remarkable that we have the same dichotomy between ray-tracing codes and moment-based codes. Comparative evaluations of both methods have been performed in several papers (Iliev et al., 2006a; Altay, Croft & Pelupessy, 2008; Aubert & Teyssier, 2008; Pawlik & Schaye, 2008; Iliev et al., 2009; Maselli, Ciardi & Kanekar, 2009; Petkova & Springel, 2009; Cantalupo & Porciani, 2011; Pawlik & Schaye, 2011; Petkova & Springel, 2011a; Wise & Abel, 2011), and here again, each method has its own specific advantage over the other one. Comparing both methods in the coupled case (RHD) within the more challenging context of galaxy formation, such as in the recent Aquila comparison project (Scannapieco et al., 2012), remains to be done.
1.1.1 Ray-based schemes
Here the approximation is made that the radiation field is dominated by a limited number of sources. This allows one to approximate the local intensity of radiation, $I_\nu$, as a function of the optical depth along rays from each source.
The simplest solution is to cast rays, or long characteristics, from each source to each cell (or volume element) and sum up the optical depth at each endpoint. With the optical depths in hand, $I_\nu$ is known everywhere and the rates of photoionization, heating and cooling can be calculated. While this strategy has the advantage of being simple and easy to parallelize (each source calculation is independent of the others), there is a lot of redundancy, since any cell which is close to a radiative source is traversed by many rays cast to further-lying cells, and is thus queried many times for its contribution to the optical depth. The parallelization is also less advantageous than it appears on multiprocessor machines, since rays that travel over large distances typically need to access cell states across many CPU nodes, calling for a lot of inter-node communication. Furthermore, the method is expensive: the computational cost scales linearly with the number of radiative sources, and each RT timestep requires of the order of $N_{\rm source} \times N_{\rm cells}$ operations, where $N_{\rm source}$ is the number of radiative sources and $N_{\rm cells}$ the number of volume elements. Implementation examples include Abel, Norman & Madau (1999), Cen (2002) and Susa (2006).
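The redundancy and cost scaling of long characteristics are easy to see in a toy one-dimensional sketch. This is our own illustration, not code from any of the implementations cited above; the inverse-square dilution factor and units are purely illustrative, and the `queries` counter simply records how often cells are (re-)visited.

```python
import numpy as np

def long_characteristics(kappa, src_idx, dx=1.0):
    """Toy 1D long-characteristics sweep: for every source and every cell,
    re-integrate the optical depth along the connecting ray."""
    ncell = len(kappa)
    intensity = np.zeros(ncell)
    queries = 0                        # counts every cell (re-)visit
    for s in src_idx:
        for i in range(ncell):
            lo, hi = (s, i) if s <= i else (i, s)
            tau = 0.0
            for j in range(lo, hi):    # each intervening cell queried again
                tau += kappa[j] * dx
                queries += 1
            d = max(abs(i - s) * dx, dx)
            intensity[i] += np.exp(-tau) / d**2   # illustrative dilution
    return intensity, queries
```

With a single source at one end of a 50-cell grid, the inner loop visits cells 0 + 1 + ... + 49 = 1225 times: cells near the source are traversed by nearly every ray, which is exactly the redundancy that short characteristics and adaptive ray splitting are designed to remove.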
Short characteristics schemes overcome the redundancy problem by not casting separate rays for each destination cell. Instead, the calculation of optical depths in cells is propagated outwards from the source, and is in each cell based on the entering optical depths from the inner-lying cells. Calculation of the optical depth in a cell thus requires some sort of interpolation from the inner ones. There is no redundancy, as only a single ray segment is cast through each cell in one time-step. However, there is still a large number of operations and the problem has been made inherently serial, since the optical depths must be calculated in a sequence which follows the radiation ripple away from the source. Some examples are Nakamoto, Umemura & Susa (2001), Mellema et al. (2006), Whalen & Norman (2006) and Alvarez, Bromm & Shapiro (2006).
Adaptive ray tracing (e.g. Abel & Wandelt, 2002; Razoumov & Cardall, 2005; Wise & Abel, 2011) is a variant on long characteristics, where rays of photons are integrated outwards from the source, updating the ray at every step of the way via absorption. To minimize redundancy, only a handful of rays are cast from the source, but they are split into sub-rays to ensure that all cells are covered by them, and they can be merged again if need be.
Cones are a variant on short characteristics, used in conjunction with SPH (Pawlik & Schaye, 2008, 2011) and the moving-mesh AREPO code (Petkova & Springel, 2011a). The angular dimension of the RT equation is discretized into tesselating cones that can collect radiation from multiple sources and thus ease the computational load and even allow for the inclusion of continuous sources, e.g. gas collisional recombination.
A hybrid method proposed by Rijkhorst et al. (2006) combines the long and short characteristics on patch-based grids (like AMR), to get rid of most of the redundancy while keeping the parallel nature. Long characteristics are used inside patches, while short characteristics are used for the inter-patches calculations.
Monte-Carlo schemes do without splitting or merging of rays, but instead reduce the computational cost by sampling the radiation field, typically both in the angular and frequency dimensions, into photon packets that are emitted and traced away from the source. The cost can thus be adjusted with the number of packets emitted, but generally this number must be high in order to minimize the noise inherent to such a statistical method. Examples include Ciardi et al. (2001), Maselli, Ferrara & Ciardi (2003), Altay, Croft & Pelupessy (2008), Baek et al. (2009), and Cantalupo & Porciani (2011). An advantage of the Monte-Carlo approach of tracking individual photon packets is that it naturally allows for keeping track of the scattering of photons. For line radiation transfer, where Doppler/redshift effects in resonant photon scattering are important, Monte-Carlo schemes are the only feasible way to go – though in these cases, post-processing RT is usually sufficient (e.g. Cantalupo et al., 2005; Verhamme, Schaerer & Maselli, 2006; Laursen & Sommer-Larsen, 2007; Pierleoni, Maselli & Ciardi, 2009).
Ray-based schemes in general assume infinite light speed, i.e. rays are cast from source to destination instantaneously. Many authors note that this only affects the initial speed of ionization fronts (I-fronts) around point sources (making them faster than the light speed), but it may also result in an over-estimated I-front speed in underdense regions (see §6.5), and may thus give incorrect results in reionization experiments where voids are re-ionized too quickly. Some ray schemes (e.g. Pawlik & Schaye, 2008; Petkova & Springel, 2011a; Wise & Abel, 2011) allow for a finite light speed, but this adds to the complexity, memory requirements and computational load. With the exception of the cone-based methods (and to some degree the Wise & Abel (2011) implementation), which can combine radiation from many sources into single rays, ray-based schemes share the disadvantage that the computational load increases linearly with the number of radiative sources. Moment methods naturally overcome this problem, though other limitations appear instead.
1.1.2 Moment-based radiative transfer
An alternative to ray-tracing schemes is to reduce the angular dimensions by taking angular moments of the RT equation (Eq. 2). Intuitively this can be thought of as switching from a beam description to that of a field or a fluid, where the individual beams are replaced with a "bulk" direction that represents an average of all the photons crossing a given volume element in space. This brings useful simplifications: two angular dimensions are eliminated from the problem, and the equations take the form of conservation laws, like the Euler equations of hydrodynamics. They are thus rather easily coupled to these equations, and can be solved with numerical methods designed for hydrodynamics. Since radiation is not tracked individually from each source, but rather just added to the radiation field, the computational load is naturally independent of the number of sources.
The main advantage is also the main drawback: the directionality is largely lost in the moment approximation and the radiation becomes somewhat diffusive, which is generally a good description in the optically thick limit, where the radiation scatters a lot, but not in the optically thin regime where the radiation is free-streaming. Radiation has a tendency to creep around corners with moment methods. Shadows are usually only coarsely approximated, if at all, though we will see e.g. in section §6.4 that sharp shadows can be maintained with idealized setups and a specific solver.
The large value of the speed of light is also an issue. Moment methods based on an explicit time marching scheme have to follow a Courant stability condition that basically prevents the radiation from crossing more than one volume element in one time-step. This requires performing many time-steps to simulate a light-crossing time in the free-streaming limit, or, as we will see later, artificially reducing the light speed. Implicit solvers can somewhat alleviate this limitation, at the price of inverting large sparse matrices which are usually ill-conditioned and require expensive, poorly parallelized, relaxation methods.
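The cost of the explicit approach, and the gain from reducing the light speed, can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope illustration; the Courant factor, grid size and simulated time are our own assumed numbers, not values prescribed by the code.

```python
# Back-of-the-envelope cost of explicit moment transport: the 3D Courant
# condition limits the RT time step to dt <= C * dx / (3 * c_light).
kpc = 3.086e21                  # cm
c_full = 2.998e10               # cm/s, the true speed of light
dx = kpc / 256                  # cell size: a 1 kpc box on a 256^3 grid
myr = 3.156e13                  # s, roughly one megayear

def n_rt_steps(c_light, t_sim, cfl=0.8):
    """Number of radiative transport steps needed to advance t_sim."""
    dt = cfl * dx / (3.0 * c_light)
    return int(t_sim / dt)

steps_full = n_rt_steps(c_full, myr)           # hundreds of thousands of steps
steps_reduced = n_rt_steps(c_full / 100, myr)  # 100x fewer at c/100
```

Reducing the light speed by a factor of 100 cuts the number of transport steps by the same factor, which is the motivation for the "reduced speed of light" approximation discussed in §4.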
The frequency dimension is also reduced, via integration over frequency bins: in the grey (single group) approximation the integral is performed over the whole relevant frequency range, typically from the hydrogen ionization frequency and upwards. In the multigroup approximation, the frequency range is split into a handful of bins, or photon groups, (rarely more than a few tens due to memory and computational limitations) and the equations of radiative transfer can be solved separately for each group. Ray-tracing schemes also often discretize into some number of frequency bins, and they are usually more flexible in this regard than moment-based schemes: while the spectrum of each source can be discretized individually in ray-tracing, the discretization is fixed in space in moment-based schemes, i.e. the frequency intervals and resulting averaged photon properties must be the same everywhere, due to the field approximation.
In the simplest form of moment-based RT implementations, so-called flux limited diffusion (FLD), only the zero-th order moment of the radiative transfer equation is used, resulting in an elliptic set of conservation laws. A closure is provided in the form of a local diffusion relation, which lets the radiation flow in the direction of decreasing radiation energy density (i.e. opposite to the energy gradient). This is realistic only if the medium is optically thick, and shadows cannot be modelled. The FLD method has been used by e.g. Krumholz et al. (2007), Reynolds et al. (2009) and Commerçon et al. (2011), mainly for the purpose of studying the momentum feedback of infrared radiation onto dusty and optically thick gas, rather than photoionization of hydrogen and helium.
Gnedin & Abel (2001) and Petkova & Springel (2009) used the optically thin variable Eddington tensor formalism (OTVET), in which the direction of the radiative field is composed on-the-fly at every point in space from all the radiative sources in the simulation, assuming that the medium between source and destination is transparent (hence optically thin). This calculation is fairly fast, provided the number of relevant radiative sources is not overwhelming, since the in-between gas cells can be neglected. Finlator, Özel & Davé (2009) take this further and include in the calculation the optical thickness between source and destination with a long-characteristics method, which makes for an accurate but slow implementation. A clear disadvantage here is that, in using the radiation sources to close the moment equations and compute the flux direction, the scaling of the computational load with the number of sources is re-introduced, negating one of the main advantages of moment-based RT.
González, Audit & Huynh (2007), AT08 and Vaytet, Audit & Dubroca (2010) – and now us – use a different closure formalism, the so-called M1 closure, which can establish and retain bulk directionality of photon flows, and can to some degree model shadows behind opaque obstacles. The M1 closure is very advantageous in the sense that it is purely local, i.e. it requires no information which lies outside the cell, which is not the case for the OTVET approximation.
As shown by Dubroca & Feugeas (1999), the M1 closure has the further advantage that it makes the system of RT equations take locally the form of a hyperbolic system of conservation laws, where the characteristic wave speeds can be calculated explicitly and are usually close to, but always smaller than, the speed of light $c$. Hyperbolic systems of conservation laws are mathematically well understood and thoroughly investigated, and a plethora of numerical methods exist to deal with them (e.g. Toro, 1999). In fact, the Euler equations are also a hyperbolic system of conservation laws, which implies we have the RT equations in a form which is well suited to lie alongside existing hydrodynamical solvers, e.g. in RAMSES.
1.2 From ATON to RAMSES-RT
The ATON code (AT08, Aubert & Teyssier, 2010) uses graphical processing units, or GPUs, to post-process the transfer of monochromatic photons and their interaction with hydrogen gas. GPUs are very fast, and therefore offer the possibility to use the correct (very large) value for the speed of light and perform hundreds to thousands of radiation sub-cycles at a reasonable cost, but only if the data is optimally structured in memory, such that volume elements that are close in space are also close in memory. It is ideal for post-processing RT on simulation outputs that are projected onto a Cartesian grid, but hard to couple directly with an AMR grid in order to play an active part in any complex galaxy formation simulation. Even so, we have in the newest version of the ATON code included the possibility to perform fully coupled RHD simulations using a Cartesian grid only (this usually corresponds to our coarser grid level in the AMR hierarchy), where RT is performed using the ATON module on GPUs.
In our RAMSES-RT implementation we use the same RT method as ATON does – the moment method with the M1 Eddington tensor closure. The biggest difference is that RAMSES-RT is built directly into the RAMSES cosmological hydrodynamics code, allowing us to perform RHD simulations directly on the AMR grid, without any transfer of data between different grid structures. Furthermore, we have expanded the implementation to include multigroup photons to approximate multifrequency, and we have added the interactions between photons and helium. We explicitly store and advect the ionization states of hydrogen and helium, and we have built into RAMSES-RT a new non-equilibrium thermochemistry model that evolves these states along with the temperature and the radiation field through chemical processes, photon absorption and emission. Finally, for realistic radiative feedback from stellar populations, we have enabled RAMSES-RT to read external spectral energy distribution (SED) models and derive from them luminosities and UV “colors” of simulated stellar sources.
We have already listed a number of RT implementations, two of which even function already in the RAMSES code (AT08, Commerçon et al., 2011), and one might ask whether another one is really needed. Regarding the ATON implementation first: it is optimized for a different regime than RAMSES-RT. As discussed, ATON prefers to work with structured grids and cannot deal well with adaptive refinement. This, plus the speed of ATON, makes it very good for studying large-scale cosmological reionization, but not for AMR simulations of individual halos/galaxies, e.g. cosmological zoom simulations, where the subject of interest is the effect of radiative feedback on the formation of structures and galaxy evolution, and escape fractions of ionizing radiation. The Commerçon et al. (2011) implementation is on the opposite side of the spectrum. Being based on the FLD method, it is optimized for RHD simulations of optically thick protostellar gas. It is a monogroup code that does not track the ionization state of the gas. Furthermore, it uses a rather costly implicit solver, which makes it hard to adapt to the adaptive time-stepping usually used in galaxy formation problems.
A few codes have been used for published 3D cosmological RHD simulations with ionizing radiation. As far as we can see, these are Gnedin (2000), Kohler, Gnedin & Hamilton (2007), Shin, Trac & Cen (2008), Petkova & Springel (2009) (in GADGET), Wise & Abel (2011) (in ENZO), Finlator, Davé & Özel (2011) (in GADGET), Hasegawa & Semelin (2013) (START), and Pawlik, Milosavljević & Bromm (2013) (in GADGET). A few others have been used for published astrophysical (ionizing) RHD simulations, but without a co-evolving cosmology: Mellema et al. (2006), Susa (2006), Whalen & Norman (2006), and Baek et al. (2009). The rest apparently only do post-processing RT, are not parallel, or are otherwise not efficient enough. Many of these codes are also optimized for cosmological reionization rather than galaxy-scale feedback.
Thus there are not many cosmological RHD implementations out there, and there should be room for more. The main advantage of our implementation is that the method allows for an unlimited number of radiative sources and can even easily handle continuous sources; it is thus ideal for modelling, e.g., the effects of radiative feedback in highly resolved simulations of galaxy formation, UV escape fractions, and the effects of self-shielding on the emission properties of gas and on structure formation, e.g. in the context of galaxy formation in weak gravitational potentials.
The structure of the paper is as follows: in §2 we present the moment based RT method we use. In §3 we explain how we inject and transport radiation on a grid of gas cells, and how we calculate the thermochemistry in each cell, that incorporates the absorption and emission of radiation. In §4 we present two tricks we use to speed up the RHD code, namely to reduce the speed of light, and to “smooth” out the effect of operator splitting. In §5 we describe how the radiative transfer calculation is placed in the numerical scheme of RAMSES, and demonstrate that the radiation is accurately transported across an AMR grid. In §6, we present our test suite, demonstrating that our code performs very well in coupled radiation hydrodynamics problems and finally, §7 summarizes this work and points towards features that may be added in the future. Details of the thermochemistry and additional code tests are described in the appendix.
2 Moment-based radiative transfer with the M1 closure
Let $I_\nu(\mathbf{x}, \mathbf{n}, t)$ denote the radiation specific intensity at location $\mathbf{x}$ and time $t$, such that
$$I_\nu \, d\nu \, dA \, d\Omega \, dt$$
is the energy of photons with frequency over the range $d\nu$ around $\nu$ propagating through the area $dA$ in a solid angle $d\Omega$ around the direction $\mathbf{n}$.
The equation of radiative transfer (e.g. Mihalas & Mihalas, 1984) describes the local change in $I_\nu$ as a function of propagation, absorption and emission,
$$\frac{1}{c}\frac{\partial I_\nu}{\partial t} + \mathbf{n} \cdot \nabla I_\nu = -\kappa_\nu I_\nu + \eta_\nu, \qquad (2)$$
where $c$ is the speed of light, $\kappa_\nu$ is an absorption coefficient and $\eta_\nu$ a source function.
By taking the zeroth and first angular moments of (2), we can derive the moment-based RT equations that describe the time-evolution of the photon number density $N_\nu$ and flux $\mathbf{F}_\nu$ (see e.g. AT08):
$$\frac{\partial N_\nu}{\partial t} + \nabla \cdot \mathbf{F}_\nu = -\sum_{j} n_j \sigma_{\nu j} \, c \, N_\nu + \dot N_\nu^{\star} + \dot N_\nu^{\rm rec}, \qquad (3)$$
$$\frac{\partial \mathbf{F}_\nu}{\partial t} + c^2 \, \nabla \cdot \mathbb{P}_\nu = -\sum_{j} n_j \sigma_{\nu j} \, c \, \mathbf{F}_\nu, \qquad (4)$$
where $\mathbb{P}_\nu$ is the radiative pressure tensor that remains to be determined to close the set of equations. Here we have split the absorption coefficient into constituent terms, $\kappa_\nu = \sum_j n_j \sigma_{\nu j}$, where $n_j$ is the number density of the photo-absorbing species $j$ ($j =$ Hi, Hei, Heii), and $\sigma_{\nu j}$ is the ionization cross section between $\nu$-frequency photons and species $j$. Furthermore, we have split the source function into (e.g. stellar, quasar) injection sources, $\dot N_\nu^{\star}$, and recombination radiation from gas, $\dot N_\nu^{\rm rec}$. Here we only consider the photo-absorption of hydrogen and helium, which is obviously most relevant in the regime of UV photons. However, other absorbers can straightforwardly be added to the system.
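To make the explicit transport concrete, here is a minimal one-dimensional sketch of a first-order solver for the monogroup, absorption-free moment equations, using a global Lax-Friedrichs intercell flux. This is our own toy illustration, not the RAMSES-RT solver itself: grid size, units ($c = 1$, $\Delta x = 1$) and the initial pulse are all assumed for the demonstration. A pulse injected with unity reduced flux should advect at the speed of light.

```python
import numpy as np

c = 1.0                       # speed of light in code units

def chi(f):
    """M1 Eddington factor as a function of the reduced flux f."""
    return (3.0 + 4.0 * f**2) / (5.0 + 2.0 * np.sqrt(4.0 - 3.0 * f**2))

def step(N, F, dx, dt):
    """One explicit global-Lax-Friedrichs update of the 1D moment equations
    dN/dt + dF/dx = 0,  dF/dt + c^2 dP/dx = 0,  with P = chi(f) N."""
    f = np.clip(np.abs(F) / (c * np.maximum(N, 1e-30)), 0.0, 1.0)
    P = chi(f) * N
    U = np.array([N, F])                  # conserved variables
    W = np.array([F, c**2 * P])           # physical fluxes
    # intercell flux: average minus diffusion scaled by the max wave speed c
    Uflux = 0.5 * (W[:, :-1] + W[:, 1:]) - 0.5 * c * (U[:, 1:] - U[:, :-1])
    Unew = U.copy()
    Unew[:, 1:-1] -= dt / dx * (Uflux[:, 1:] - Uflux[:, :-1])
    return Unew[0], Unew[1]

# advect a beam-like pulse (reduced flux f = 1) across the grid
nx, dx = 100, 1.0
x = np.arange(nx) * dx
N = np.exp(-0.5 * ((x - 20.0) / 3.0) ** 2)   # Gaussian pulse at x = 20
F = c * N                                    # unity reduced flux -> a beam
dt = 0.8 * dx / c                            # Courant-limited time step
for _ in range(50):                          # advance to t = 40
    N, F = step(N, F, dx, dt)
```

After 50 Courant-limited steps ($t = 40$), the pulse peak has moved by roughly $c\,t = 40$ cells, broadened somewhat by the numerical diffusion of the first-order scheme.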
Eqs. (3)-(4) are continuous in $\nu$, and they must be discretized to be usable in a numerical code. AT08 collected all relevant frequencies into one bin, so the equations could be solved for one group of photons whose attributes represent averages over the frequency range. For a rough approximation of multifrequency, we split the relevant frequency range into $M$ photon groups, defined by
$$N_i = \int_{\nu_{i0}}^{\nu_{i1}} N_\nu \, d\nu, \qquad \mathbf{F}_i = \int_{\nu_{i0}}^{\nu_{i1}} \mathbf{F}_\nu \, d\nu, \qquad (5)$$
where $[\nu_{i0}, \nu_{i1}]$ is the frequency interval for group $i$. In the limit of one photon group, the frequency range is $[\nu_{\rm HI}, \infty)$; with $M$ groups, the frequency intervals should typically be mutually exclusive and set up to cover the whole H-ionizing range. Integrating (3)-(4) over each interval yields, for each group,
$$\frac{\partial N_i}{\partial t} + \nabla \cdot \mathbf{F}_i = -\sum_{j} n_j \sigma_{ij} \, c \, N_i + \dot N_i^{\star} + \dot N_i^{\rm rec}, \qquad (6)$$
$$\frac{\partial \mathbf{F}_i}{\partial t} + c^2 \, \nabla \cdot \mathbb{P}_i = -\sum_{j} n_j \sigma_{ij} \, c \, \mathbf{F}_i, \qquad (7)$$
where $\sigma_{ij}$ represent average cross sections between each group $i$ and species $j$, defined by
$$\sigma_{ij} = \frac{\int_{\nu_{i0}}^{\nu_{i1}} \sigma_{\nu j} N_\nu \, d\nu}{\int_{\nu_{i0}}^{\nu_{i1}} N_\nu \, d\nu}$$
(here we assume the spectral shape of $N_\nu$, within each group, to be identical to that of the source spectrum).
We simplify things, however, by defining the group cross sections as global quantities, assuming a frequency distribution of energy $J(\nu)$ for the radiative sources (e.g. a black-body or some more sophisticated model). The cross sections are thus in practice evaluated by
$$\bar\sigma_{ij} = \frac{\int_{\nu_{i0}}^{\nu_{i1}} \frac{J(\nu)}{h\nu}\,\sigma_{\nu j}\, d\nu}{\int_{\nu_{i0}}^{\nu_{i1}} \frac{J(\nu)}{h\nu}\, d\nu},$$
where $h\nu$ is the photon energy (with $h$ the Planck constant). Likewise, average photon energies within each group are evaluated by
$$\bar\epsilon_i = \frac{\int_{\nu_{i0}}^{\nu_{i1}} J(\nu)\, d\nu}{\int_{\nu_{i0}}^{\nu_{i1}} \frac{J(\nu)}{h\nu}\, d\nu},$$
and furthermore, for the calculation of photoionization heating (see Eq. 64), energy-weighted cross sections are stored for each group and absorbing species pair:
$$\bar\sigma^{e}_{ij} = \frac{\int_{\nu_{i0}}^{\nu_{i1}} J(\nu)\,\sigma_{\nu j}\, d\nu}{\int_{\nu_{i0}}^{\nu_{i1}} J(\nu)\, d\nu}.$$
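As an illustration, these group averages can be evaluated numerically. The sketch below assumes a $10^5$ K black-body for $J(\nu)$ and the simple hydrogenic $\nu^{-3}$ approximation for the HI cross section (the code itself uses the Verner et al. 1996 fits); the group boundaries and constants are illustrative, and the integration helper is our own.

```python
import numpy as np

h = 6.626e-27        # Planck constant [erg s]
k_B = 1.381e-16      # Boltzmann constant [erg/K]
c_cgs = 2.998e10     # speed of light [cm/s]
eV = 1.602e-12       # erg per eV
nu0 = 13.6 * eV / h  # HI ionization threshold frequency [Hz]

def planck(nu, T):
    """Black-body spectrum B_nu(T) [erg s^-1 cm^-2 Hz^-1 sr^-1]."""
    return (2.0 * h * nu**3 / c_cgs**2) / np.expm1(h * nu / (k_B * T))

def sigma_HI(nu, sigma0=6.30e-18):
    """Hydrogenic (nu^-3) approximation to the HI cross section [cm^2]."""
    return sigma0 * (nu0 / nu) ** 3

def integrate(f, nu):
    """Trapezoidal integration, kept dependency-free."""
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(nu))

# one photon group, from the HI to (roughly) the HeII ionization threshold
nu = np.linspace(nu0, 4.0 * nu0, 4000)
J = planck(nu, 1e5)              # assumed source spectrum: 10^5 K black body
n_wt = J / (h * nu)              # photon-number weighting J(nu)/h*nu

sigma_bar = integrate(n_wt * sigma_HI(nu), nu) / integrate(n_wt, nu)
eps_bar = integrate(J, nu) / integrate(n_wt, nu)   # mean photon energy [erg]
sigma_e = integrate(J * sigma_HI(nu), nu) / integrate(J, nu)
```

For this setup the mean photon energy lands between 13.6 and 54.4 eV, and the energy-weighted cross section comes out below the number-weighted one, since energy weighting favours higher frequencies where the cross section is smaller.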
In RAMSES-RT, $\bar\sigma_{ij}$, $\bar\epsilon_i$ and $\bar\sigma^{e}_{ij}$ can either be set by hand or evaluated on-the-fly from spectral energy distribution tables as luminosity-weighted averages from in-simulation stellar populations, using the expressions from Verner et al. (1996) for the cross sections $\sigma_{\nu j}$ of Hi, Hei and Heii.
For each photon group, the corresponding set of equations (6)-(7) must be closed with an expression for the pressure tensor $\mathbb{P}_i$. This tensor is usually described as the product of the photon number density and the so-called Eddington tensor $\mathbb{D}_i$ (see Eq. 12), for which some meaningful and physical expression is desired. Several formalisms have been suggested for $\mathbb{D}_i$. Gnedin & Abel (2001), Finlator, Özel & Davé (2009), and Petkova & Springel (2009) have used the optically thin variable Eddington tensor (OTVET) formalism, in which $\mathbb{D}_i$ is composed on-the-fly from all the radiation sources, the main drawback being the computational cost associated with collecting the positions of every radiative source relative to every volume element. Instead, like AT08 (and González, Audit & Huynh 2007 before them), we use the M1 closure relation (Levermore, 1984), which has the great advantages that it is purely local, i.e. evaluating it in a piece of space only requires local quantities, and that it can retain a directionality along the flow of the radiative field. In our frequency-discretized form, the pressure tensor is given in each volume element for each photon group by
where the Eddington tensor is
$$\mathbb{D}_i \;=\; \frac{1-\chi_i}{2}\,\mathbb{I} \;+\; \frac{3\chi_i - 1}{2}\,\mathbf{n}_i \otimes \mathbf{n}_i, \qquad \mathbf{n}_i = \frac{\mathbf{F}_i}{|\mathbf{F}_i|}, \qquad \chi_i = \frac{3 + 4 f_i^2}{5 + 2\sqrt{4 - 3 f_i^2}}, \qquad f_i = \frac{|\mathbf{F}_i|}{c\,N_i}.$$
Here $\mathbf{n}_i$, $\chi_i$ and $f_i$ are the unit vector pointing in the flux direction, the Eddington factor and the reduced flux, respectively. The reduced flux describes the directionality of the group radiation at each point, and must always satisfy $f_i \le 1$. A low value means the radiation is predominantly isotropic, and a high value means it is predominantly flowing in one direction. Photons injected into a point (via an increase in photon density only) initially have zero reduced flux and thus are isotropic. Away from the source, the moment equations and M1 closure develop a preferred outwards direction, i.e. the reduced flux tends towards one. Beams can be injected by imposing unity reduced flux on the injected photons. In this case, the M1 closure correctly maintains unity reduced flux (and $\chi_i = 1$) along the beam (see demonstrations in Fig. 1 and Sections 5.3, 6.4, and 6.8). For the arguments leading to these expressions and a general discussion we point the reader towards Levermore (1984), González, Audit & Huynh (2007) and AT08.
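Because the M1 closure is purely local, it can be sketched in a few lines. The following minimal Python routine (an illustration, not the actual code) builds the Eddington tensor of a cell from its photon density and flux, using the Levermore (1984) Eddington factor:

```python
import numpy as np

C = 1.0  # speed of light in code units

def eddington_tensor(N, F):
    """M1 Eddington tensor D for photon density N and flux vector F (3,).
    The pressure tensor is then P = N * D (Levermore 1984 closure)."""
    Fmag = np.linalg.norm(F)
    f = min(Fmag / (C * N), 1.0)           # reduced flux, 0 <= f <= 1
    chi = (3.0 + 4.0 * f**2) / (5.0 + 2.0 * np.sqrt(4.0 - 3.0 * f**2))
    n = F / Fmag if Fmag > 0 else np.zeros(3)   # unit vector along the flux
    return (0.5 * (1.0 - chi) * np.eye(3)
            + 0.5 * (3.0 * chi - 1.0) * np.outer(n, n))

# Isotropic radiation (f = 0) gives D = I/3; a fully beamed field (f = 1)
# puts all the pressure along the beam, D = n n^T.
D_iso  = eddington_tensor(1.0, np.zeros(3))
D_beam = eddington_tensor(1.0, np.array([C, 0.0, 0.0]))
```

The two limiting cases recover the expected isotropic and free-streaming pressure tensors, and the trace of $\mathbb{D}$ is unity for any reduced flux.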
3 The radiative transfer implementation
We will now describe how pure radiative transfer is solved on a grid – without yet taking into consideration the hydrodynamical coupling. The details here are not very specific to RAMSES-RT and are much like those of AT08.
In addition to the usual hydrodynamical variables stored in every grid cell in RAMSES (gas density $\rho$, momentum density $\rho \mathbf{u}$, energy density $E$, metallicity $Z$), RAMSES-RT stores the following variables. First, we have the photon densities $N_i$ and fluxes $\mathbf{F}_i$ describing the $M$ photon groups. Second, in order to consistently treat the interactions of photons and gas, we track the non-equilibrium evolution of hydrogen and helium ionization in every cell, stored in the form of passive scalars which are advected with the gas, namely the ionization fractions
For each photon group, we solve the set of equations (6)-(7) with an operator-splitting strategy, which involves decomposing the equations into three steps that are executed in sequence over the same time-step $\Delta t$, which has some pre-determined length. The steps are:
Photon injection step, where radiation from stellar and other radiative sources (other than gas recombinations) is injected into the grid. This corresponds to the $\dot{N}^{\star}_i$ term in (6).
Photon transport step, where photons are advected between cells. This corresponds to the divergence terms in (6)-(7).
Thermochemistry step, where the photons and the gas interact via absorption, emission and photo-heating. This corresponds to the remaining source and sink terms in (6)-(7).
3.1 The injection step
The equations to solve in this step are very simple,
where $\dot{N}^{\star}_i$ is the rate of photon injection into photon group $i$ in the given cell. Normally, the injected photons come from stellar sources, but they could also include other point sources such as AGN, pre-defined point sources, or even continuous “volume” sources (footnote: in Rosdahl & Blaizot (2012), we emitted UV background radiation from cosmological void regions, under the assumption that they are transparent to the radiation.).
Given the time $t$ and time-step length $\Delta t$, the discrete update in each cell, done for each photon group, is the following sum over all stellar particles situated in the cell:
where $n$ denotes the time index, $f_{\rm esc}$ is an escape fraction, $V$ is the cell volume, $m_*$, $\tau_*$ and $Z_*$ are the mass, age and metallicity of the stellar particles, respectively, and $\Xi$ is some model for the accumulated number of group photons emitted per solar mass over the lifetime (so far) of a stellar particle. The escape fraction $f_{\rm esc}$ is simply a parameter that can be used to express the suppression (or even boosting) of radiation from processes that are unresolved inside the gas cell.
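The bookkeeping of the injection step can be sketched as follows. The cumulative-emission model `Xi` below (a linear ramp that saturates at 10 Myr) is a hypothetical stand-in for the SED-table lookup that the code actually performs; the numbers are illustrative only.

```python
# Hypothetical emission model Xi(age, Z): cumulative ionizing photons emitted
# per solar mass by a stellar population of the given age and metallicity.
# This linear ramp is purely illustrative; the real model comes from SED tables.
def Xi(age_myr, Z):
    return 1e60 * min(age_myr, 10.0) / 10.0   # photons / Msun

def inject(N_cell, dt_myr, t_myr, stars, f_esc=1.0, V=1.0):
    """Update the cell photon density by summing, over the stellar particles
    in the cell, the photons emitted during [t, t + dt].
    stars: list of (mass_msun, birth_time_myr, metallicity)."""
    for m, t_birth, Z in stars:
        age0 = t_myr - t_birth
        age1 = age0 + dt_myr
        N_cell += f_esc / V * m * (Xi(age1, Z) - Xi(age0, Z))
    return N_cell

stars = [(1.0, 0.0, 0.02)]
N1 = inject(0.0, 1.0, 0.0, stars)   # one 1-Msun particle, its first Myr
```

Using the cumulative emission $\Xi$ rather than an instantaneous luminosity makes the injected photon count independent of the time-step length.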
RAMSES-RT can read spectral energy distribution (SED) model tables to do on-the-fly evaluation of the stellar particle luminosities. Photon cross sections and energies can also be determined on-the-fly from the same tables, to represent luminosity-weighted averages of the stellar populations in a simulation. Details are given in Appendix B.
3.2 The transport step
The equations describing free-flowing photons are
We can write the above equations in vector form
where $\mathcal{U} = (N,\,\mathbf{F})$ and $\mathcal{F}(\mathcal{U}) = (\mathbf{F},\; c^2\,\mathbb{P})$. To solve (20) over the time-step $\Delta t$, we use an explicit conservative formulation, expressed here in 1D for simplicity,
where again $n$ denotes the time index and $l$ the cell index along the x-axis. $\mathcal{F}_{l-1/2}$ and $\mathcal{F}_{l+1/2}$ are intercell fluxes evaluated at the cell interfaces. Simple algebra gives us the updated cell state,
and all we have to do is determine expressions for the intercell fluxes.
Many intercell flux functions are available for differential equations of the form (20) which give stable results in the form of (22) (see e.g. Toro, 1999), as long as the Courant time-step condition is respected (see §4.1). Following AT08 and González, Audit & Huynh (2007) we implement two flux functions which can be used in RAMSES-RT.
One is the Harten-Lax-van Leer (HLL) flux function (Harten, Lax & van Leer, 1983),
where $\lambda^{\rm max}$ and $\lambda^{\rm min}$ are the maximum and minimum eigenvalues of the Jacobian $\partial \mathcal{F}/\partial \mathcal{U}$. These eigenvalues mathematically correspond to wave speeds, which in the case of 3D radiative transfer depend only on the magnitude of the reduced flux (14) and the angle of incidence of the flux vector to the cell interface. This dependence has been calculated and tabulated by González, Audit & Huynh (2007), and we use their table to extract the eigenvalues.
The other flux function we have implemented is the simpler global Lax-Friedrichs (GLF) function,
which corresponds to setting the HLL eigenvalues to the speed of light, i.e. $\lambda^{\rm min} = -c$ and $\lambda^{\rm max} = +c$, and has the effect of making the radiative transport more diffusive. Beams and shadows are therefore better modelled with the HLL flux function than with the GLF one, whereas the inherent directionality in the HLL function results in radiation around isotropic sources (e.g. stars) that is noticeably asymmetric, due to the preference for the axis directions.
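The transport update (22) with the GLF flux is compact enough to sketch in 1D. The example below advects a beamed photon pulse on a periodic grid; the fully-beamed assumption (unit reduced flux, so the 1D pressure equals the photon density) and the grid parameters are choices made for this illustration only.

```python
import numpy as np

c = 1.0                     # (possibly reduced) speed of light, code units
nx, dx = 128, 1.0 / 128
dt = 0.8 * dx / c           # Courant-limited time-step (1D)

# State U = (N, Fx): the physical flux of N is Fx, that of Fx is c^2 * Pxx.
# For this sketch we assume a fully beamed field, so Pxx = N.
x  = np.arange(nx) * dx
N  = np.exp(-0.5 * ((x - 0.25) / 0.03) ** 2)   # Gaussian pulse at x = 0.25
Fx = c * N                                      # unity reduced flux, moving +x

def glf_step(N, Fx):
    U = np.stack([N, Fx])
    flux = np.stack([Fx, c * c * N])            # physical flux F(U), beamed
    Ul, Ur = U, np.roll(U, -1, axis=1)          # periodic left/right states
    Fl, Fr = flux, np.roll(flux, -1, axis=1)
    Fface = 0.5 * (Fl + Fr) - 0.5 * c * (Ur - Ul)   # GLF intercell flux
    Unew = U - dt / dx * (Fface - np.roll(Fface, 1, axis=1))
    return Unew[0], Unew[1]

total0 = N.sum()
for _ in range(64):          # advect the pulse by 64 * c * dt = 0.4
    N, Fx = glf_step(N, Fx)
```

The conservative form guarantees that the photon count is preserved to machine precision on the periodic domain, while the pulse drifts to the right and spreads somewhat due to the diffusivity of the GLF flux.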
Fig. 1 illustrates the difference between the two flux functions in some idealized 2D RAMSES-RT tests, where we shoot off beams and turn on isotropic sources. Here the photon-gas interaction is turned off by setting all photoionization cross sections to zero ($\sigma_{ij} = 0$ for every group $i$ and species $j$). It can be seen that the HLL flux function fails to give isotropic radiation (far left) and that the GLF function gives more diffusive beams (second from left). Note also how the diffusivity of beams with the HLL flux function is direction-dependent: a horizontal or vertical beam is perfectly retained, while a diagonal one “leaks” to the sides almost as much as with the GLF function, which has the advantage of being fairly consistent whether the beam is along-axis or diagonal. The right frames of the figure give an idea of how the radiative transport behaves in the case of multiple sources, i.e. with opposing beams and neighbouring isotropic sources. The two opposing beams are a typical configuration where the M1 closure relation obviously fails, creating a spurious source of radiation perpendicular to the beam direction: since opposing fluxes cannot cross each other at a single point in the moment approximation, the radiation is “squeezed” into those perpendicular directions. It is unclear to us how much of a problem this presents in astrophysical contexts. Beams, which clearly represent the worst-case scenario, are not very relevant, but multiple nearby sources are. We generally prefer to use the GLF flux function, since we mostly deal with isotropic sources in our cosmological/galactic simulations, but the choice of function really depends on the problem. There is no noticeable difference in the computational load, so if shadows are important, one should go for HLL. AT08 have compared the two flux functions in some of the benchmark RT tests of Iliev et al. (2006a) and found that they give very similar results.
We do likewise for the test we describe in §6.7, and come to the same conclusion.
3.3 The thermochemical step
Photon absorption and emission have the effect of heating and cooling the gas, so in order to self-consistently implement these interactions, we evolve along with them the thermal energy density of the gas and the abundances of the species that interact with the photons, here Hi, Hei and Heii via photoionizations and Hii, Heii (again) and Heiii via recombinations. We follow these abundances in the form of the three ionization fractions $x_{\rm HII}$, $x_{\rm HeII}$ and $x_{\rm HeIII}$ that we presented in Eqs. (15). The set of non-equilibrium thermochemistry equations solved in RAMSES-RT consists of:
In the photon density and flux equations, (25) and (26), we have replaced the photon emission rate $\dot{N}^{\rm rec}_i$ with the full expression for recombinative emission from the gas. Here, $\alpha^{\rm A}_j$ and $\alpha^{\rm B}_j$ represent the case A and case B recombination rates for electrons combining with species $j$ ($j=$ Hii, Heii, Heiii). The factor $b_{ij}$ is a boolean (1/0) that states into which photon group $i$ species-$j$ recombinations emit, and $n_{\rm e}$ is the electron number density (a direct function of the H and He ionization states, neglecting the contribution from metals).
The $x_{\rm HII}$ evolution (28) consists of, respectively on the RHS, Hi collisional ionizations, Hi photoionizations, and Hii recombinations. Here, $\beta_j$ is the rate of collisional ionizations of species $j$. The $x_{\rm HeII}$ evolution (29) consists of, from left to right, Hei collisional ionizations, Heiii recombinations, Hei photoionizations, and Heii collisional ionizations, recombinations, and photoionizations. Likewise, the $x_{\rm HeIII}$ evolution (30) consists of Heii collisional ionizations and photoionizations, and Heiii recombinations. The expressions we use for the rates of recombinations and collisional ionizations are given in Appendix E.
The computational approach we use to solve Equations (25)-(30) takes inspiration from Anninos et al. (1997). The basic premise is to solve the equations over a sub-step in a specific order (the order we have given), explicitly for those variables that remain to be solved (including the current one), but implicitly for those that have already been solved over the sub-step. Eqs. (25) and (26) are thus solved purely explicitly, using the backwards-in-time (BW) values for all variables on the RHS. Eq. (27) is partly implicit in the sense that it uses forward-in-time values for the photon densities and fluxes, but BW values for the other variables. And so on, ending with Eq. (30), which is then implicit in every variable except the one solved for ($x_{\rm HeIII}$). We give details of the discretization of these equations in Appendix A.
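The flavour of such a semi-implicit update can be sketched for hydrogen alone. The routine below, in the spirit of Anninos et al. (1997) but not the exact discretization of Appendix A, is implicit in the fraction being updated and explicit in the rates, which makes it unconditionally stable and exactly preserves the ionization equilibrium:

```python
def xHII_update(x, dt, gamma, beta_ne, alpha_ne):
    """One semi-implicit step for dx/dt = C*(1 - x) - D*x, where the creation
    rate C = photoionization rate gamma + collisional ionizations beta*ne,
    and the destruction rate D = recombinations alpha*ne.
    Implicit (forward-in-time) in x itself, explicit in the rates."""
    C = gamma + beta_ne
    D = alpha_ne
    return (x + dt * C) / (1.0 + dt * (C + D))

# The update preserves the equilibrium fraction x_eq = C / (C + D) exactly,
# and remains stable and bounded for arbitrarily large dt.
C, D = 2.0, 1.0
x_eq = C / (C + D)
x1 = xHII_update(x_eq, 100.0, C, 0.0, D)   # start at equilibrium: stays there
x2 = xHII_update(0.0, 100.0, C, 0.0, D)    # huge dt drives x toward x_eq
```

This stability is what allows large thermochemistry sub-steps without the ionization fraction over- or undershooting its equilibrium value.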
3.3.1 The 10% thermochemistry rule
For accuracy, each thermochemistry step is restricted by a local cooling time which prohibits any of the thermochemical quantities from changing by a substantial fraction in one time-step. We therefore sub-cycle the thermochemistry step to fill in the global RT time-step (see next section), using what can be called the 10% rule: in each cell, the thermochemistry step is initially executed with the full RT time-step length, and then the fractional update is considered. If any of the evolved quantities (thermal energy, photon densities, ionization fractions) have changed by more than 10%, we backtrack and do the same calculation with half the time-step length. Conversely, if the greatest fractional change in a sub-step is well below this limit, the time-step length is doubled for the next sub-step (without the backtracking).
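The adaptive sub-cycling can be sketched on a toy cooling law. The 50% doubling threshold and the exponential "cooling" rate below are assumptions for this example, not the exact criteria of the code:

```python
import math

def subcycle(y0, t_end, rate, max_frac=0.1):
    """Integrate dy/dt = rate(y) with explicit sub-steps, halving the sub-step
    whenever a step would change y by more than max_frac (the '10% rule'),
    and doubling it when the change is comfortably below the limit."""
    y, t, dt = y0, 0.0, t_end      # first attempt uses the full step
    while t < t_end:
        dt = min(dt, t_end - t)
        dy = rate(y) * dt
        if abs(dy) > max_frac * abs(y):
            dt *= 0.5              # backtrack: retry with half the step
            continue
        y += dy
        t += dt
        if abs(dy) < 0.5 * max_frac * abs(y):
            dt *= 2.0              # relax: double the next sub-step
    return y

# Toy 'cooling', dT/dt = -T, integrated over one e-folding time:
T = subcycle(1.0, 1.0, lambda y: -y)
```

Starting from the full RT step, the routine quickly settles on sub-steps that each change the solution by a few per cent, recovering the analytic decay to within about 3%.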
Together, the quasi-implicit approach used in solving the thermochemistry and the 10% rule imply that photons are in principle conserved only at the 10% level (footnote: as discussed in §5.3, the photon transport accurately conserves photons, so thermochemistry errors are the sole source of non-conservation). This is because the thermochemistry solver is explicit in the photon density updates (i.e. uses before-timestep values of the ionization fractions), but the following ionization fraction updates are implicit in the photon densities (i.e. they use after-timestep values for the photon densities). Thus, for a cell in the process of being photoionized, the ionization fractions are underestimated at the photon density updates and the photon densities are underestimated at the ionization fraction updates. Conversely, if the cell gas is recombining, the recombination-photon emission is slightly overestimated, since before-timestep values for the ionization fractions are used for the emissivity. However, judging from the performance in the RT tests (§6) and the thermochemistry tests (§C), this does not appear to be cause for concern.
3.3.2 The on-the-spot approximation
The photon-emitting recombinative term, the second RHS sum in (25), is optionally included. Excluding it is usually referred to as the on-the-spot approximation (OTSA), meaning that any recombination-emitted photons are absorbed “on the spot” by a nearby atom (in the same cell), and hence these photon emissions are cancelled out by local photon absorptions. If the OTSA is assumed, the gas is thus not photo-emitting, and the case A recombination rates are replaced with case B recombination rates in (25)-(30), i.e. photon-emitting recombinations straight down to the ground level are not counted. The OTSA is in general a valid approximation in the optically thick regime, but not when the photon mean free path becomes longer than the cell width.
It is a great advantage of our RT implementation that it is not restricted to a limited number of point sources. The computational load does not scale at all with the number of sources, and photon emission from gas (non-OTSA) comes at no added cost, whereas it may become prohibitively expensive in ray-tracing implementations.
4 Timestepping issues
RT is computationally expensive, and we use two basic tricks to speed up the calculation. One is to reduce the speed of light, the other is to modify slightly the traditional operator splitting approach, by increasing the coupling between photon injection and advection on one hand and thermochemistry and photo-heating on the other hand.
4.1 The RT timestep and the reduced speed of light
In each iteration, before the three RT steps of photon injection, advection and thermochemistry are executed, the length of the time-step, $\Delta t$, must be determined.
We use an explicit solver for the radiative transport (21), so the advection time-step, and thus the global RT time-step, is constrained by the Courant condition (here in 3D),
$$\Delta t < \frac{\Delta x}{3\,c},$$
where $\Delta x$ is the cell width. This time-step constraint is severe: it results in an integration step which is typically 300 times shorter than in non-relativistic hydrodynamical simulations, where the speed of light is replaced by a maximum gas velocity ($v_{\rm max}$) in Eq. 45. In a coupled (RHD) simulation, this would imply a comparable increase in CPU time, either because of a global time-step reduction (as we chose to implement, see Sec. 5), or because of many radiative sub-steps (as is implemented e.g. in ATON (footnote: but ATON runs on GPUs, which are about a hundred times faster than CPUs, whereas RAMSES-RT runs on CPUs and thus cannot afford such a huge amount of RT subcycling. NB: ATON also increases the time-step by working on the coarse grid, hence increasing $\Delta x$ in Eq. 45.)). In the case of radiative transfer with the moment equations, there are two well-known solutions to this problem.
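The size of the penalty follows directly from the Courant condition. In the sketch below, the adopted $v_{\rm max}$ of 1000 km/s is an illustrative assumption chosen to reproduce the factor of roughly 300 quoted above:

```python
# Courant time-steps for RT versus non-relativistic hydrodynamics in the
# same cell. The numbers are illustrative assumptions, not fixed code values.
c     = 3.0e5      # speed of light [km/s]
v_max = 1.0e3      # assumed maximum gas speed in a simulation [km/s]
dx    = 0.1        # cell width [kpc], arbitrary

dt_rt    = dx / (3.0 * c)      # 3D Courant condition with the full light speed
dt_hydro = dx / (3.0 * v_max)  # same condition with v_max in place of c

penalty = dt_hydro / dt_rt     # = c / v_max, independent of dx
```

Since the cell width cancels, the penalty depends only on the ratio $c/v_{\rm max}$, which is why reducing the effective speed of light pays off directly in the time-step.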
The first solution is to use an implicit method instead of an explicit one to solve the transport equation, which means using forward-in-time intercell fluxes in (21), i.e. replacing $\mathcal{F}^{n}$ with $\mathcal{F}^{n+1}$. This seemingly simple change ensures that the computation is always stable, no matter how big the time-step, and we can get rid of the Courant condition. However: (i) it does not mean that the computation is accurate, and in fact we still need some time-stepping condition to retain accuracy, e.g. to prevent any quantity from changing by more than some fraction in a single time-step. Furthermore, such a condition usually must be checked by trial and error, i.e. one guesses a time-step, performs a global transport step (over the grid), and then checks whether the accuracy constraint was broken anywhere. Such trial-and-error time-stepping can be very expensive since it is a global process. (ii) Replacing $\mathcal{F}^{n}$ with $\mathcal{F}^{n+1}$ is actually not simple at all: Eqs. 21 become a system of coupled algebraic equations that must be solved via matrix manipulation in an iterative process, which is complicated, computationally expensive, and of limited scope (i.e. cannot easily be applied to any problem). For these two reasons we have opted out of the implicit approach. It is a perfectly valid approach, however, and used by many (e.g. Petkova & Springel, 2009; Commerçon et al., 2011).
The second solution, which we have chosen instead, is to keep our solver explicit and relax the Courant condition by changing the speed of light to a reduced light speed $c_{\rm r} < c$, the payoff being that the time-step (45) becomes longer. This is generally referred to as the reduced speed of light approximation (RSLA), and was introduced by Gnedin & Abel (2001). The idea of the RSLA is that in many applications of interest, the propagation of light is in fact limited by the much slower speed of ionization fronts. In such situations, reducing the speed of light, while keeping it higher than the fastest I-front, will yield the correct solution at a much reduced CPU cost. In the following section, we provide a framework to help judge how accurate the RSLA may be in various astrophysical contexts.
4.2 A framework for setting the reduced light speed value
In the extremely complex framework of galaxy formation simulations, the accuracy of the results obtained using the RSLA can really only be assessed by convergence tests. It is nonetheless useful to consider a simple idealized setup in order to derive a physical intuition of where, when, and by how much one may reduce the speed of light. In this section, we thus discuss the expansion of an ionized region around a central source embedded in a uniform neutral medium.
We consider a source turning on and emitting ionizing photons at a rate $\dot{N}_\gamma$ into a homogeneous hydrogen-only medium of number density $n_{\rm H}$. An expanding sphere of ionized gas forms around the source and halts at the Strömgren radius $r_{\rm S}$, within which the rate of recombinations equals the source luminosity:
where $\alpha_{\rm B}$ is the case-B recombination rate at the temperature of the ionized gas, and where we have assumed that the plasma within $r_{\rm S}$ is fully ionized.
The relativistic expansion of the I-front to its final radius $r_{\rm S}$ is derived in Shapiro et al. (2006), and may be expressed as:
where $t' = t/t_{\rm rec}$ is time in units of the recombination time $t_{\rm rec} = (\alpha_{\rm B}\,n_{\rm H})^{-1}$, $r' = r_{\rm I}/r_{\rm S}$ is the position of the I-front in units of $r_{\rm S}$, and the factor $\ell \equiv r_{\rm S}/(c\,t_{\rm rec})$ describes the light-crossing time across the Strömgren radius in units of the recombination time, and basically encompasses all the free parameters in the setup (source luminosity, gas density, and temperature). Writing $\ell$ out in terms of these parameters, we see that in many astrophysical contexts $\ell$ remains well below unity (see Table 1), simply because we are generally either interested in the effect of bright sources (e.g. a whole galaxy) on relatively low-density gas (e.g. the IGM) or of fainter sources (e.g. an O-star) on high-density gas (at e.g. molecular-cloud densities).
Let us now briefly discuss the evolution of an I-front given by Eq. 33 for illustrative values of $\ell$:
$\ell \to 0$ (blue curve of Fig. 2): this is the limiting non-relativistic case, which assumes an infinite speed of light. In this case, the I-front expands as $r' = (1 - e^{-t'})^{1/3}$ (its speed decreasing with time) almost all the way to $r_{\rm S}$, which it reaches after about a recombination time.
$\ell = 1$ (red curve of Fig. 2): here (and for all $\ell \geq 1$), the I-front basically expands at the speed of light all the way to $r_{\rm S}$, which it thus reaches after a light-crossing time (which is equal to a recombination time in this case).
$\ell \ll 1$ (green curve of Fig. 2): in this typical case, the I-front starts expanding at the speed of light. It then slows down and quickly reaches the limiting $\ell \to 0$ behaviour after a light-crossing time (at $t' \approx \ell$). The I-front then reaches $r_{\rm S}$ after a recombination time (at $t' \approx 1$).
An important feature appearing in the two latter cases is that for any physical setup $\ell$, the I-front is always well described by the $\ell \to 0$ limit after a light-crossing time (i.e. for $t' \gtrsim \ell$). We can use this feature to understand the impact of reducing the speed of light in our code. Say we have a physical setup described by a value $\ell$. Reducing the speed of light by a factor $f_c$ (i.e. $c_{\rm r} = c/f_c$) implies an increase by a factor $f_c$ of the effective crossing time, and the effective value of $\ell$ in our experiment becomes $f_c\,\ell$. The solution we obtain with $c_{\rm r}$ will be accurate only after an effective crossing time, i.e. after $t' \gtrsim f_c\,\ell$. Before that time, the reduced-light-speed solution will lag behind the real one.
How much one may reduce the speed of light in a given numerical experiment then depends on the boundary conditions of the problem and their associated timescales. Call $t_{\rm min}$ the shortest relevant timescale of a simulation. For example, if one is interested in the effect of radiative feedback from massive stars on the ISM, $t_{\rm min}$ can be set to the lifetime of these stars. If one is running a very short experiment (see Sec. 6.5), the duration of the simulation may determine $t_{\rm min}$. Given this timescale constraint, one may reduce the speed of light by a factor $f_c$ such that the I-fronts will be correctly described after a timelapse well shorter than $t_{\rm min}$, i.e. $f_c\,\ell\,t_{\rm rec} \ll t_{\rm min}$. In other words, one may typically use $f_c$ of order $t_{\rm min}/(\ell\,t_{\rm rec})$ at most. We now turn to a couple of concrete examples.
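The quantity $\ell$ is easy to estimate numerically. The sketch below evaluates it for two illustrative regimes; the recombination rate and the source/density values are rough assumptions for this example, not the exact entries of Table 1:

```python
import math

# Illustrative estimate of ell = r_S / (c * t_rec) for two regimes.
# alpha_B (at ~1e4 K) and the source/density values are rough assumptions.
c       = 3.0e10     # speed of light [cm/s]
alpha_B = 2.6e-13    # case-B recombination rate [cm^3/s]

def ell(Ndot, nH):
    """Light-crossing time of the Stromgren sphere in units of t_rec."""
    r_S   = (3.0 * Ndot / (4.0 * math.pi * alpha_B * nH**2)) ** (1.0 / 3.0)
    t_rec = 1.0 / (alpha_B * nH)
    return r_S / (c * t_rec)

ell_igm = ell(5.0e48, 1.0e-3)   # bright source into diffuse, IGM-like gas
ell_ism = ell(1.0e48, 1.0e3)    # single OB-like star in a dense cloud
```

Since $\ell \propto (\dot{N}_\gamma\,n_{\rm H})^{1/3}$, both regimes give $\ell \ll 1$, with the dense-ISM case the larger of the two; the allowed light-speed reduction then follows by comparing $f_c\,\ell\,t_{\rm rec}$ to the timescale of interest.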
[Table 1: example configurations and allowed light-speed reductions, including rows for Iliev tests 1, 2 and 5, and Iliev test 4.]
4.3 Example speed of light calculations
In Table 1 we take some concrete (and of course very approximate) examples to see generally what values of the light-speed reduction factor $f_c$ are feasible. We consider cosmological applications from inter-galactic to inter-stellar scales, and setups from some of the RT code tests described in §6.
Reionization of the inter-galactic medium
Here we are concerned with the expansion of ionization fronts away from galaxies and into the IGM, as for example in the fourth test of Iliev et al. (2006a) (hereafter Il06). In this test, the IGM gas density is very low and the sources are luminous galaxies. In such a configuration, the Strömgren radius is of kpc scale, corresponding to a crossing time of order a Myr. Because of the low density of the gas, the recombination time is very long (of order a Gyr), and we are thus close to the $\ell \ll 1$ case discussed above (the green curve in Fig. 2).
Test 4 of Il06 is analysed at two output times (see Fig. 19). In both cases the front has not yet reached the $\ell \to 0$ limit, and we cannot reduce the speed of light to get an accurate result at these times. Interestingly, we cannot increase the speed of light either, as is done in Il06 with C2-Ray, which assumes an infinite light speed. From Fig. 2, it is clear that this approximation (the blue curve) will over-predict the radius of the front. We can use the analysis above to note that had the results been compared at a later output time, the infinite-light-speed approximation would have provided accurate results. It is only ten times later, however, that reducing the speed of light by a factor of ten would have provided accurate results.
We conclude that propagating an I-front in the IGM at the proper speed requires using a value of the speed of light close to the correct one. This is especially true in Test 4 of the Il06 paper (last row of Table 1). This confirms that for cosmic-reionization studies, using the correct value for the speed of light is very important.
Radiative feedback in the inter-stellar medium
There is admittedly a lot of variety here, but as a rough estimate, we can take typical densities to be low in the large-scale ISM and much higher in star-forming clouds. In the stellar nurseries we consider single OB stars, and in the large-scale ISM we consider groups of OB stars. The constraining timescale is on the order of the lifetime of OB stars (a few Myr), and less for the stellar nurseries. In these two cases, which are representative of the dense ISM inside galactic discs, we see in Table 1 that the allowed reduction factor for the speed of light is much larger. This is due to two effects acting together: the gas density is higher, but the sources are fainter, since we are now resolving individual stellar clusters and not an entire galaxy. Tests 1 and 2 of Il06 and test 5 of its RHD sequel (Iliev et al., 2009) are also representative of such a favourable regime for the reduced speed of light approximation (second-to-last row in Table 1). This analysis of the problem at hand confirms that propagating I-fronts in galaxy formation simulations can be done reliably using our current approach, while cosmic-reionization problems are better handled with GPU acceleration and the correct speed of light.
4.4 Smoothed RT
A problem we had to face, while performing RAMSES-RT galaxy formation runs as well as the various test cases presented here, is that there is often a small number of cells, usually along I-fronts or close to strong radiation sources, that execute a huge number of thermochemistry subcycles in a single RT time-step. This is in part the fault of the operator-splitting approach, in which the RT equations have been partly decoupled. Specifically, the photon density updates happen in three steps in this approach (see Fig. 3, top). The photon injection step always increases the number of photons, usually by a relatively large amount, and the transport step does the same when it feeds photons into cells along these I-fronts. The thermochemistry step in the I-front cells has the exact opposite effect: the photon density decreases again via absorptions. If the photon-depletion time is shorter than the Courant time, we have a curious situation where the cell goes through an inefficient cycle during the thermochemistry subcycles: it starts neutral with a large abundance of photons (that have come in via the transport and/or photon injection steps). It first requires a number of subcycles to evolve to a (partly) ionized state, during which the photon density is gradually decreased. It can then reach a turnaround when the photons are depleted. If the RT time-step is not yet finished, the cell then goes into a reverse process, where it becomes neutral again. This whole cycle may take a large number of thermochemical steps, yet the cell gas ends up in much the same state as it started.
In reality, the ionization state and photon density would not cycle like this but would rather settle into a semi-equilibrium where the rate of ionizations equals that of recombinations.
For the purpose of saving computing time and reducing the number of thermochemistry subcycles, we have implemented an optional strategy we call smoothed RT, which roughly corrects for this non-equilibrium effect of operator splitting (see Fig. 3, bottom). In it, the result ($N_i$, $\mathbf{F}_i$) of the transport and injection steps in each cell is used to infer a rate for the thermochemistry step, rather than being set as an initial condition. We use the pre-transport, pre-injection values of $N_i$ and $\mathbf{F}_i$ as initial conditions for the thermochemistry, but update the thermochemistry equations (25) and (26) to
where the new terms at the far right represent the rates at which the photon densities and fluxes changed in the transport and injection steps, i.e.
where $N_i^{n}$ and $\mathbf{F}_i^{n}$ ($N_i^{n+1}$ and $\mathbf{F}_i^{n+1}$) denote a cell state before (after) cell injection (Eq. 16) and transport (Eqs. 18 and 19) have been solved over $\Delta t$. The injection and transport steps are unchanged from the normal operator-splitting method, except that the cell states are not immediately updated to reflect the end results of those steps. The results of the injection step go only as initial conditions into the transport step, and the end results of the transport step are only used to calculate the photon density and flux rates of change via Eqs. (36) and (37). Only after the thermochemistry step does a cell get a valid state that is the result of all three steps.
The idea is that when the photons are introduced like this into the thermochemistry step, they are introduced gradually, in line with the subcycling; the photon density vs. ionization fraction cycle then disappears and is replaced with a semi-equilibrium, which should reduce the number of subcycles and the computational load. The total photon injection (or depletion) will still equal the summed contributions of the injection and transport steps, so in the limit that there are no photoionizations or photon-emitting recombinations, the end result is exactly the same photon density (and flux) as would be left at the end of the transport and injection steps without smoothed RT.
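This bookkeeping can be illustrated with a toy absorption problem. The sub-step counts and rates below are arbitrary; the point is that with absorption switched off the two schemes end at exactly the same photon density, while with absorption on, the smoothed scheme sees the photons arrive gradually:

```python
def split_then_absorb(N0, dN_transport, kappa, dt, nsub=64):
    """Standard operator splitting: apply the transport/injection result as
    an initial condition, then subcycle absorption dN/dt = -kappa*N."""
    N = N0 + dN_transport
    for _ in range(nsub):
        N -= kappa * N * (dt / nsub)
    return N

def smoothed(N0, dN_transport, kappa, dt, nsub=64):
    """Smoothed RT: feed the transport/injection result in as a constant
    rate Ndot = dN_transport/dt during the absorption subcycles."""
    N, Ndot = N0, dN_transport / dt
    for _ in range(nsub):
        N += (Ndot - kappa * N) * (dt / nsub)
    return N

# With absorption off (kappa = 0), the two schemes agree exactly:
a = split_then_absorb(1.0, 10.0, 0.0, 1.0)
b = smoothed(1.0, 10.0, 0.0, 1.0)
```

With absorption on, the smoothed scheme retains a larger photon reservoir at the end of the step, since the photons are not all exposed to absorption from the start, which is the semi-equilibrium behaviour described above.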
The advantage of the smoothing approach is perhaps best explained with an example: consider a cell with a strong source of radiation and gas dense and neutral enough that the timescales of cooling, ionization and/or recombination are much shorter than the global time-step length, $\Delta t$. This could either be a cell containing a stellar particle or a cell along an ionization front. Without smoothing, the photoionization rate in the cell can change dramatically as a result of photon injection/transport. The thermochemistry step thus starts with a high rate of photoionizations, which gradually goes down in the thermochemistry sub-cycling as the gas becomes more ionized and the photons are absorbed. With smoothing, this dramatic change in the photoionization rate never happens, thus requiring fewer thermochemistry sub-cycles to react. A situation also exists where the smoothing approach slows down the thermochemistry: if a cell contains a strong source of radiation but diffuse gas (i.e. long timescales for cooling, ionization and/or recombination compared to $\Delta t$), the non-smoothed approach would result in little or no thermochemistry sub-cycling, whereas the smoothed approach would take many sub-cycles just to update the radiation field and effectively reach the final result of the injection and transport steps.
The gain in computational speed is thus quite dependent on the problem at hand, and also on the reduced light speed, which determines the size of the RT time-step, $\Delta t$. We have compared the computational speed of smoothed and non-smoothed RT in a cosmological zoom simulation from the NUT simulation suite (e.g. Powell, Slyz & Devriendt, 2011) that includes the transfer of UV photons from stellar sources. Here, smoothed RT substantially reduces both the average number of thermochemistry subcycles and the total computing time, so a lot may indeed be gained by using it.
One could argue that the ionization states in I-fronts are better modelled with smoothed RT, since the cycle of photon density and ionization fraction is a purely numerical effect of operator splitting. We have intentionally drawn a slightly higher end value of $N_i$ for the smoothed RT than the non-smoothed case in Fig. 3: whereas non-smoothed RT can completely deplete the photons in a cell, smoothed RT usually leaves a small reservoir after the thermochemistry, which more accurately represents the “semi-equilibrium value”.
Of course an alternative to smoothed RT, and a more correct solution, is to attack the root of the problem and reduce the global time-step length, i.e. also limit the transport and injection steps to the 10% rule. Reducing the global time-step length is highly impractical, though; the main reason for using operator splitting in the first place is that it allows us to separate the timescales of the different steps.
The same method of smoothing out discreteness that comes with operator splitting (in the case of pure hydrodynamics) has previously been described by Sun (1996), where it is referred to as “pseudo-non-time-splitting”.
5 Radiation hydrodynamics in Ramses
RAMSES (Teyssier, 2002) is a cosmological adaptive mesh refinement (AMR) code that can simulate the evolution and interaction of dark matter, stellar populations and baryonic gas via gravity, hydrodynamics and radiative cooling. It can run on parallel computers using the message passing interface (MPI) standard, and is optimized to run very large numerical experiments. It is used for cosmological simulations in the framework of the expanding Universe, and also smaller scale simulations of more isolated phenomena, such as the formation and evolution of galaxies, clusters, and stars. Dark matter and stars are modelled as collisionless particles that move around the simulation box and interact via gravity. We will focus here on the hydrodynamics of RAMSES though, which is where the RT couples to everything else.
RAMSES employs a second-order Godunov solver on the Euler equations of gravito-hydrodynamics in their conservative form,
\[ \frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{u}) = 0, \]
\[ \frac{\partial(\rho\mathbf{u})}{\partial t} + \nabla\cdot(\rho\mathbf{u}\otimes\mathbf{u}) + \nabla P = -\rho\nabla\phi, \]
\[ \frac{\partial E}{\partial t} + \nabla\cdot\left[(E+P)\,\mathbf{u}\right] = -\rho\,\mathbf{u}\cdot\nabla\phi + \Lambda, \]
where $t$ is time, $\rho$ the gas density, $\mathbf{u}$ the bulk velocity, $\phi$ the gravitational potential, $E$ the gas total energy density, $P$ the pressure, and $\Lambda$ represents radiative cooling and heating via thermochemistry terms (resp. negative and positive), which are functions of the gas density, temperature and ionization state. In RAMSES, collisional ionization equilibrium (CIE) is traditionally assumed, which allows the ionization states to be calculated as functions of the temperature and density, so they need not be explicitly tracked in the code. $E$ is divided into kinetic and thermal energy density ($\varepsilon$) components:
\[ E = \tfrac{1}{2}\rho u^2 + \varepsilon. \]
The system of Euler equations is closed with an equation of state which relates the pressure and thermal energy,
\[ P = (\gamma - 1)\,\varepsilon, \]
where $\gamma$ is the ratio of specific heats. The Euler equations are adapted to super-comoving coordinates, to account for cosmological expansion, by a simple transformation of variables (see §5.4).
The Euler equations are solved across an AMR grid structure. Operator splitting is employed for the thermochemistry source terms, i.e. $\Lambda$ is separated from the rest of the Euler equations in the numerical implementation, which makes it trivial to modify the thermochemistry solver, e.g. to change it from equilibrium to non-equilibrium.
The basic grid element in RAMSES is an oct (Fig. 4), which is a grid composed of eight cubical cells. A conservative state vector is associated with each cell, storing its hydrodynamical properties of gas density $\rho$, momentum density $\rho\mathbf{u}$, total energy density $E$ and metal mass density $\rho Z$. (One can also use the primitive state vector, $(\rho, \mathbf{u}, P, Z)$.) Each cell in the oct can be recursively refined to contain sub-octs, up to a maximum level of refinement. The whole RAMSES simulation box is one oct at level $\ell = 1$, which is homogeneously and recursively refined to a minimum refinement level $\ell_{\min}$, such that the coarse (minimum) box resolution is $2^{\ell_{\min}}$ cells on each side. Octs at or above level $\ell_{\min}$ are then adaptively refined during the simulation run, to follow the formation and evolution of structures, up to a maximum refinement level $\ell_{\max}$, giving the box a maximum effective resolution of $2^{\ell_{\max}}$ cell widths per box width. The cell refinement is gradual: the resolution must never change by more than one level across cell boundaries.
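As a concrete sketch of the hierarchy (hypothetical class and function names; RAMSES itself is written in Fortran), an oct holding eight cells that may each host a child oct, and the effective resolution implied by a maximum level, look like:

```python
class Oct:
    """Minimal sketch of an oct: a grid of eight cubical cells at one
    refinement level, each of which may recursively host a child oct."""
    def __init__(self, level):
        self.level = level             # refinement level of this oct's cells
        self.children = [None] * 8     # one optional sub-oct per cell

    def refine_cell(self, i, level_max):
        """Refine cell i into a sub-oct, unless level_max is reached."""
        if self.level < level_max and self.children[i] is None:
            self.children[i] = Oct(self.level + 1)
        return self.children[i]

def effective_resolution(level_max):
    """Cell widths per box width at the maximum refinement level."""
    return 2 ** level_max
```

For example, a run with a maximum refinement level of 10 has an effective resolution of 1024 cell widths per box width. The sketch omits the gradual-refinement constraint (at most one level jump across cell boundaries), which in the real code is enforced when flagging cells for refinement.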
5.1 Ramses multi-stepping approach
With AMR multi-stepping, the resolution is not only adaptive in terms of volume, but also in time, with different time-step sizes at different refinement levels. A coarse time-step, over the whole AMR grid, is initiated at the coarse level, $\ell_{\min}$, as we show schematically in Fig. 5. First, the coarse time-step length is estimated via (the minimum of) Courant conditions in all $\ell_{\min}$ cells. Before the coarse step is executed, the next finer level, $\ell_{\min}+1$, is made to execute the same time-step, in two substeps, since the finer-level Courant condition should approximately halve the time-step length. This process is recursive: each finer level makes its own time-step estimate (from its Courant condition, but also no longer than what remains of the coarser-level step) and has its next finer level execute two substeps. This recursive call up the level hierarchy continues to the highest available level, $\ell_{\max}$, which contains only leaf cells and no sub-octs. Here the first two substeps are finally executed. When the two substeps are done, the time-step at level $\ell_{\max}-1$ is re-evaluated to be no longer than the sum of the two substeps just executed at $\ell_{\max}$, and then one step is executed there. Then it is back to level $\ell_{\max}$ to execute two steps, and so on. The substepping continues in this fashion across the level hierarchy, ending with one time-step for the coarsest-level cells (with a correspondingly re-evaluated time-step length).
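The recursion above can be sketched as follows (hypothetical function names and a toy Courant estimate; the real scheme also handles flux corrections at level boundaries and uneven substep pairs):

```python
steps_log = []   # (level, dt) pairs, standing in for the actual hydro update

def amr_multistep(level, dt_max, level_max, courant_dt):
    """Advance `level` by one step of at most dt_max, first recursively
    executing two half-steps on the next finer level; return the step
    actually taken (no longer than the sum of the fine substeps)."""
    dt = min(dt_max, courant_dt(level))
    if level < level_max:
        fine = amr_multistep(level + 1, dt / 2, level_max, courant_dt)
        fine += amr_multistep(level + 1, dt / 2, level_max, courant_dt)
        dt = min(dt, fine)      # re-evaluate against the executed substeps
    steps_log.append((level, dt))
    return dt

# Toy Courant condition that halves with each finer level:
dt_coarse = amr_multistep(1, 1.0, 3, lambda l: 1.0 / 2 ** l)
```

With three levels this produces four fine steps, two intermediate steps and one coarse step, i.e. the "refinement in time" that makes multi-stepping attractive.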
At the heart of RAMSES lies a recursive routine which describes a single time-step at level $\ell$, and is initially called from the coarsest level, $\ell_{\min}$. To facilitate our description of how the RT implementation is placed into RAMSES, we illustrate the routine in pseudocode format in Listing 1, where we have excluded details not directly relevant to RHD (e.g. MPI syncing and load-balancing, adaptive refinement and de-refinement, particle propagation, gravity solver, star formation, and stellar feedback).
First, the recursion is made twice, solving the hydrodynamics over two sub-steps at all finer levels. Then the Euler equations are solved over the current coarse time-step, for all cells belonging to the current level. It is important to note here that while the hydrodynamical quantities are fully updated at the current level, there are also intermediate hydro updates in all neighbouring cells at the next coarser level. The coarser-level update is only partial though, because it only reflects the intercell fluxes across inter-level boundaries; fluxes across other boundaries (same level or next coarser level) are only accounted for when the coarser-level time-step is fully advanced. Until then, these coarser-level neighbour cells have gas states that are not well defined, since they only reflect some of their intercell fluxes. It effectively means that at any point between the start and finish of the primary (coarse) call of the step routine, there are some cells in the simulation box (lying next to finer-level cells) that have ill-defined intermediate hydrodynamical states. This point is further illustrated in Appendix D. It is important to keep in mind when considering the coupling of RT with the hydrodynamics of RAMSES.
Having put down the basics of AMR hydrodynamics, we are now in a position to add radiative transfer.
In RAMSES-RT, each cell stores some additional state variables: the hydrogen and helium ionization fractions, which are advected with the gas as passive scalars (in the hydro solver), and the photon density and flux for each of the $M$ photon groups. Note that this represents a hefty increase in the memory requirement compared to the hydrodynamics-only RAMSES: storing the cell states (which is the bulk of the total memory in most simulations) requires three additional variables for the ionization fractions, plus four (one photon density and three flux components) per photon group. Thus, with three photon groups, fifteen variables are added per cell, a substantial increase over a traditional RAMSES simulation.
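The per-cell bookkeeping is simple enough to state directly (a trivial sketch; the exact baseline hydro variable count, and hence the overall memory factor, depends on the run configuration):

```python
def rt_extra_vars(n_photon_groups):
    """Variables added per cell by RAMSES-RT: three ionization fractions,
    plus one photon density and a 3-component photon flux per group."""
    return 3 + 4 * n_photon_groups
```

With three photon groups this gives fifteen extra variables per cell, on top of the usual conservative hydro state.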
Given the time-scale difference between hydrodynamics and radiative transfer, the obvious approach to performing RHD is to sub-cycle the three radiative transfer steps (injection, advection, thermochemistry) within the hydrodynamical step. There is, however, a major drawback to this approach, which is that it is incompatible with AMR multi-stepping: the RT sub-cycling must be done before/after each hydrodynamical AMR step at the finest refinement level only, and since light can in principle cross the whole box within the fine level hydrodynamical timestep, the RT sub-cycling must be done over the whole grid, over all levels. However, the partial hydrodynamical flux between cells at level boundaries always leaves some cells between the fine level steps with an intermediate (i.e. partially updated) gas state. This makes the thermochemistry ill-defined in those cells, since it needs to update the gas temperature in every cell, and for this to work the temperature must have a well defined and unique value everywhere. There are three ways around this:
The first is to perform the RT sub-cycling only after a coarse hydrodynamical step, but then potentially thousands of fine-level hydro steps would be executed without taking the thermochemistry into account.
The second is to prohibit AMR multi-stepping, which makes the whole grid well defined after each step and thus allows for RT sub-cycling over the whole box. Multi-stepping is, however, one of the main advantages of AMR, essentially allowing us to refine in time as well as space, so this is not really an option.
We thus default to the third strategy, which we use in RAMSES-RT. Here we drop the sub-cycling of RT within the hydro step and perform the two on the same time-step length, which is the minimum of the RT and hydro time-steps. Thus, with each hydro step, at any level, the RT steps are performed over the same level only. The basic scheme is illustrated in Fig. 6, and the pseudocode for the updated step routine is shown in Listing 2. Obviously, the main drawback here is the timescale difference between light and gas, which can amount to orders of magnitude, meaning the number of hydrodynamical steps, and hence the run-time, is increased by the same factor (and numerical diffusion likely becomes a problem with such small hydrodynamical time-steps). However, if we also apply a reduced speed of light, we can shrink this factor arbitrarily, down to the limit where the hydro time-step is the limiting factor and the only increase in computational load is the added advection of photons (which is considerably cheaper for one photon group than the hydrodynamical solver) and the non-equilibrium thermochemistry (which typically has a computational cost comparable to the equilibrium solver of RAMSES, provided we use RT smoothing). The question, which we have tried to answer in §4.2, is then how far we are allowed to go in reducing the light speed.
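The shared time-step logic can be sketched as below (hedged: the CFL constants and the exact form of the RT Courant condition are placeholders, not the expressions used in the code):

```python
def rhd_timestep(dx, u_max, c_sound, c_light, f_reduce,
                 cfl_hydro=0.8, cfl_rt=0.8):
    """Single time-step shared by hydro and RT: the minimum of the two
    Courant conditions. Reducing the light speed by a factor f_reduce < 1
    enlarges the RT step until the hydro condition becomes the limiter."""
    dt_hydro = cfl_hydro * dx / (u_max + c_sound)
    dt_rt = cfl_rt * dx / (f_reduce * c_light)
    return min(dt_hydro, dt_rt)

# Full light speed: the RT condition dominates and forces tiny steps...
dt_full = rhd_timestep(dx=1.0, u_max=1.0, c_sound=1.0,
                       c_light=1e3, f_reduce=1.0)
# ...while a strongly reduced light speed restores the hydro-limited step.
dt_reduced = rhd_timestep(dx=1.0, u_max=1.0, c_sound=1.0,
                          c_light=1e3, f_reduce=1e-3)
```

In the toy numbers above the light speed exceeds the signal speed of the gas by a factor of 500, so the full-light-speed step is 500 times smaller than the hydro-limited one; a sufficiently reduced light speed removes that penalty entirely.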
Parallelization is naturally acquired in RAMSES-RT by simply taking advantage of the MPI strategies already in place in RAMSES.
5.3 Radiation transport on an AMR grid
In RAMSES-RT, the radiation variables are fully incorporated into the AMR structure of RAMSES. The ionization fractions and photon densities and fluxes are refined and de-refined along with the usual hydro quantities, with a choice of interpolation schemes for newly refined cells (straight injection or linear interpolation). The radiative transfer, i.e. injection, transport and thermochemistry, is multi-stepped across the level hierarchy, thus giving AMR refinement both in space and time. Inter-level radiation transport is tackled in the same way as the hydrodynamical advection, i.e. transport at level $\ell$ includes partial updates of neighbouring cells at level $\ell-1$. Updating the finer-level cell RT variables over level boundaries involves the RT variables in a coarser cell, which are evaluated with the same choice of interpolation schemes. RAMSES-RT includes optional refinement criteria on photon densities, ion abundances and gradients in those, in addition to the usual refinement criteria available in RAMSES (on mass and gradients in the hydrodynamical quantities).
Of the seven standard RT and RHD tests described in Section 6, five include active or inactive grid refinement, demonstrating that the radiation hydrodynamics performs robustly in conjunction with (on-the-fly) cell refinements/de-refinements. In addition, we demonstrate in Fig. 7 how radiation flux is well retained across changes in grid refinement. The upper left map of the figure shows a beam of radiation in a 2D RAMSES-RT experiment, where we use the HLL flux function and deactivate radiation-gas interactions (with zero photoionization cross sections). The beam is injected into two cells in the bottom left corner by imposing a unity reduced photon flux at an angle from the horizontal. The beam traverses a circular region of two successive levels of increasing refinement, going from refinement level 6 to 8, i.e. effective resolutions of $2^6$ to $2^8$ cells per box width. We use here straight injection (i.e. no interpolation) for inter-level cell fluxes, but linear interpolation gives identical results. The snapshot is taken just before the beam has had time to cross to the right edge of the box. To the right of the map we plot photon flux profiles across the coloured vertical lines in the map. The beam experiences diffusion, as can be seen by the widening of the flux profiles, but this is exclusively due to the intercell flux function and independent of the refinement changes. The far right plot shows the integrals over each flux profile, i.e. the total photon flux across each line. The values are consistent until close to the head of the beam, where they reduce to zero. We have verified that if the test is run for double the time, the total flux is consistent throughout the whole box width, so photons are very well conserved across the changes in refinement.
To further demonstrate flux conservation, the lower panel in the same figure shows an identical experiment, except that the beam is horizontal, such that it can be perfectly maintained with the HLL flux function. To stay just under a light-crossing time, we consider a snapshot taken shortly before the beam reaches the box edge. Here again, the flux is well preserved towards the edge of the beam, and we have verified that over two crossing times, the total flux is retained perfectly to the numerical precision, which here is 7 decimals.
We also consider another beam with the same setup, shown in Fig. 8, where instead of a static refinement region, the grid is actively refined on inter-cell gradients in the photon density $N$. According to the criterion, two adjacent cells are refined wherever the relative difference in $N$ between them exceeds a given threshold.
Straight injection (no interpolation) is used here for inter-level fluxes and cell refinements, but the results are identical when linear interpolation is used for both. The snapshot here is taken within a light-crossing time. The plot on the far right shows the flux conservation across different x-coordinates. (Note that the total flux is slightly different from that in Fig. 7 because of the different geometry of the beam injection.) The total flux is again well maintained towards the beam edge. We verified that within two light-crossing times, the discrepancy of the beam flux at different coordinates levels out.
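A gradient-based refinement criterion of this kind can be sketched as a relative inter-cell difference test (the threshold value here is an assumption for illustration; the actual criterion and threshold are set per run):

```python
def refine_on_photon_gradient(n_left, n_right, threshold=0.2):
    """Flag two adjacent cells for refinement when their photon densities
    differ by more than a relative threshold (assumed value of 0.2)."""
    total = n_left + n_right
    return total > 0.0 and abs(n_left - n_right) / total > threshold
```

A sharp beam edge, where a lit cell borders a dark one, always triggers this criterion, which is why the refined region in Fig. 8 tracks the beam.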
These simple beam experiments demonstrate that the code accurately transports radiation across (even dynamically) changing refinement levels. The main errors are the artificial diffusion of radiation on the grid, which is not caused by refinement, but rather by the inter-cell flux function, and the dipole approximation inherent to the M1 closure, which does not allow opposing streams of radiation to pass through one another. Note though that while the diffusion is artificial, the total flux is well maintained, i.e. energy is conserved.
5.3.1 Speed of light
The AMR transport tests also demonstrate that radiation in RAMSES-RT propagates at the correct speed, i.e. at the speed of light. In each beam map (Figs. 7-8), a black line has been plotted over the beam, starting at the beam injection point and ending at the light-crossing distance, $c\,t_s$, where $t_s$ is the snapshot time. Qualitatively, it can be seen that the beam ends at roughly the same position as the black line, and the flux plots on the far right side of each beam map show that the beam has roughly half the original flux at this end position. The far end of the beam is smooth over a few cell widths rather than discontinuous, because of numerical diffusion.
5.4 Cosmological settings
RAMSES uses super-comoving variables to allow for the impact of the cosmological expansion on the Poisson equation, the equations of hydrodynamics (38-40) and particle propagation (Martel & Shapiro, 1998; Teyssier, 2002): a change is made from the physical variables to super-comoving ones with
where $H_0$ is the Hubble constant, $\Omega_{\rm m}$ the matter density parameter, $L$ the comoving width of the simulation box (its physical width at $a=1$), and $\rho_{\rm c}$ the critical density of the Universe. When these variables are used instead of the physical ones, the cosmological expansion is accounted for, while all relevant equations remain unchanged, Euler equations included.
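As a sketch of the variable change (assuming Martel & Shapiro (1998)-style prefactors for density, velocity and pressure; the exact convention in the code may differ), the conversion and one invariant it preserves look like:

```python
def to_supercomoving(rho, u, P, a, H0, omega_m, rho_c, L):
    """Convert physical density, velocity and pressure to super-comoving
    values (assumed prefactors). a is the expansion factor, L the comoving
    box width, rho_c the critical density of the Universe."""
    rho_sc = a**3 * rho / (omega_m * rho_c)
    u_sc = a * u / (H0 * L)
    P_sc = a**5 * P / (omega_m * rho_c * H0**2 * L**2)
    return rho_sc, u_sc, P_sc
```

Because the Euler equations keep their form under the transformation, dimensionless combinations such as $P/(\rho u^2)$ (a Mach-number proxy) are identical in physical and super-comoving variables, which is a convenient consistency check on any chosen set of prefactors.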
For consistency, and to partly account for the effect of cosmological expansion on the radiative transfer, the additional change is made in RAMSES-RT to super-comoving RT variables for the photon transport: