A new gravitational Nbody simulation algorithm for investigation of cosmological chaotic advection
Abstract
Recently alternative approaches in cosmology seeks to explain the nature of dark matter as a direct result of the nonlinear spacetime curvature due to different types of deformation potentials. In this context, a key test for this hypothesis is to examine the effects of deformation on the evolution of large scales structures. An important requirement for the fine analysis of this pure gravitational signature (without dark matter elements) is to characterize the position of a galaxy during its trajectory to the gravitational collapse of super clusters at low redshifts. In this context, each element in an gravitational Nbody simulation behaves as a tracer of collapse governed by the process known as chaotic advection (or lagrangian turbulence). In order to develop a detailed study of this new approach we develop the COsmic LAgrangian TUrbulence Simulator (COLATUS) to perform gravitational Nbody simulations based on Compute Unified Device Architecture (CUDA) for graphics processing units (GPUs). In this paper we report the first robust results obtained from COLATUS.
:
98.80.k,98.65.At.Cw.Dx,95.35.+d,95.75.Pq6x9
address=Lab for Computing and Applied Math (LAC)  National Institute for Space Research (INPE),
São José dos Campos, SP, Brazil
address=Lab for Computing and Applied Math (LAC)  National Institute for Space Research (INPE),
São José dos Campos, SP, Brazil
address=Computing Institute (IC)  Fluminense Federal University (UFF),
Niteroi, RJ, Brazil
address=Computing Institute (IC)  Fluminense Federal University (UFF),
Niteroi, RJ, Brazil
address=Lab for Computing and Applied Math (LAC)  National Institute for Space Research (INPE),
São José dos Campos, SP, Brazil
address=Lab for Computing and Applied Math (LAC)  National Institute for Space Research (INPE),
São José dos Campos, SP, Brazil
address=Lab for Computing and Applied Math (LAC)  National Institute for Space Research (INPE),
São José dos Campos, SP, Brazil
address=Lab for Computing and Applied Math (LAC)  National Institute for Space Research (INPE),
São José dos Campos, SP, Brazil
address=Lab for Computing and Applied Math (LAC)  National Institute for Space Research (INPE),
São José dos Campos, SP, Brazil
1 I. Introduction
The gravitational Nbody simulations have become a powerful tool for testing the theories of structure formation in astrophysical and cosmological systems [1]. In particular, it has been shown that the statistical characterization of dark matter distribution is an important ingredient in the investigation of largescale structure formation in the Hubble volume simulated from the GADGETVC algorithm [2].
Recently, an established statistical method was used to demonstrate the importance of considering chaotic advection (or Lagrange Turbulence) [3] in combination with gravitational instabilities in the  CDM simulations performed from the Virgo Consortium (VC) [4]. However, the GADGETVC algorithm does not allow in a straightforward approach the computation of the kinematics of a single particle, requirement which is necessary for the investigation of the chaotic advection in alternative cosmological models considering only barionic matter [6]. This limitation appears because the interaction forces are computed by the TreePM scheme [5]. The COsmic LAgrangian TUrbulence Simulator (COLATUS) is a new algorithm to perform gravitational Nbody simulations allowing the computation of the velocity of a single particle at every timestep and then the evaluation of its energy power spectrum. To achieve its objective COLATUS compute the gravitational forces by using a direct summation scheme(the simplest PP scheme). COLATUS is implemented in a Compute Unified Device Architecture (CUDA) by using the Nvidia graphics processing units (GPUs) to reduce the simulation runtime. In the present work we show the preliminary simulations including up to particles using 1536 cores of NVIDIA GTX680.
2 Ii. Gpu/cuda Cosmological NBody Simulation
For Nbodies under the influence of physical gravitational forces in a comoving coordinates() the motion equations take the form:
(1) 
where is the softening factor(collisionless assumption), is the scale factor which is taken to be at the present time and it changes ruled by the Friedmann equation:
(2) 
where is the radiation density, is the matter (dark plus baryonic) density, is spatial curvature density, is the cosmological constant or vacuum density today and is the Hubble parameter for . These equations are valid for nonrelativistic matter at scales that are much smaller than the Hubble radius [17]. The equation of motions can be integrated using the difference equations[24].
(3)  
(4)  
(5) 
However, these equations are not symplectic, but by making a suitable canonical transformation, one can derive an integrator that is symplectic [20, 22].
The Lagrangian for the particle motion in the comoving frame is:
(6) 
Switching to the Hamiltonian formalism, the momentum canonical to is and the Hamiltonian is:
(7) 
then
(8) 
Although this Hamiltonian is timedependent, it is separable, so the “drift” and “kick” operators are easily derived as:
(9) 
where is the step size.
Note that, the gravitational force in the Newtonian limit falls as , hence it is a long range force and we cannot ignore force due to distant particles. This makes calculation of forces the most time consuming task in NBody simulations. As a result, a lot of attention has been focused on this aspect and many algorithms and optimizing schemes have been developed, we refer the reader to [1, 8] for a detailed review. These schemes have been evolved to address this problem and replace direct summation by methods that are less time consuming. However the direct summation implemented in GPU/CUDA approach provide at least three features that help it achieve high efficiency [18]: (i) Straightforward parallelism with sequential memory access patterns; (ii) Data reuse that keeps the arithmetic units busy; and (iii) The fully pipelined arithmetic, including complex operations such as inverse square root, that are much faster clockforclock on a GPU than on a CPU.
The standard model that explains the large scale structure formation (e.g. galaxies, clusters of galaxies) assume that they were formed from the amplification of the small initial fluctuations via gravitational instability [7]. These models attempt to reduce cosmology to an initial value problem [1]. Then, given the initial conditions(universe at high redshift) and a cosmological model, the goal is to compute the evolution of structure until . The initial conditions are a densityvelocity field represented by a set of particles which must be setup in compatibility with of the observed primordial universe on cosmic microwave background radiation [15, 16, 13]. We generate the initial conditions by imposing perturbations on an initially uniform state represented by a “glass” distribution of particles generated by the method of White (1996). Using the algorithm described by [11], which is based on the Zeldovich approximation [15], a Gaussian random field is set up by perturbing the positions of the particles and assigning them velocities according to growing mode linear theory solutions [12].
The cosmological models specifies the universe composition(matter, radiation, spacetime curvature) and attempt to describes its evolution and interaction, they usually include some exotic ingredients as the dark matter and dark energy to explain anomalous galactic rotation curves and the accelerated expansion, respectively. Assuming that the gravitation is the dominant force (e.g.[8]) we compute the structure evolution by integrating the motion equations and computing the Nbody interaction forces in every timestep. The motion equations are in comoving spatial coordinates to consider the universe expansion and boundary conditions are periodic to be consistent with a homogeneous elements distribution [1, 8, 12]. The solutions of the motion equations would be exact if we were able to simulate the motions of all individual bodies. Unfortunately this can not be achieved since the systems of interest (cosmological large scale structures) contain of the order of stars. We approximate collection of a very large number of stars in the universe by one galaxy in an NBody simulation. Therefore the particles in an NBody simulation must interact in a purely collisionless manner [9, 10]. The mass of a galaxy (typically of the order of 0.1 to ) is adjusted in its normalized form (0.11) in the initial condition hypercube.
Based on the formalism described above we have implemented the algorithm COLATUS that is described in a simplified form in Appendix A.
3 III. Preliminary Results
The first main objective to be reached with the COLATUS is to show that simulations with the simplest type of particleparticle (PP) gravitational interaction, containing typical densities of barionic content only, one can get collapsed patterns consistent with gravitational simulations that include dark matter.
Figure 1 depicts three snapshots of the largescale structure formation for a cosmological box containing galaxies in the critical density distributed in a volume of which evolves over units of redshift. The spectrum for the initial distribution of particles follows the EisensteinHu law [23]. For we have computed the relation between the number of filaments and the number of voids, , and the averages for the filament lengths , filament thickness and void lenghts , which are consistent with those obtained by simulations based on more sophisticated supercomputers (see for example [5] Virgo and Millenium).
For the next simulations we intend to begin the study of the deformation potential influence on the large scale structures patterns (filament thickness, filament and void lenghts) obtained for z = 0. In order to implement this study we will modify the equation (1) by introducing a deformation potential as follow:
(10) 
where is a constant and is the position where the potential has it max value. Note that potential force fall as , then it acts like a mass particle located in a fixed position.
4 IV. Concluding Remarks
We show here a new way to apply simulations of gravitational Nbody using CUDA / GPU to test cosmological models that attempt to explain dark matter from effects due only to the curvature of spacetime. Preliminary results indicate that the algorithm is able to simulate the generation of largescale structures whose statistical properties are consistent with simulations that use of more sophisticated numerical schemes for supercomputers based on CPUs.
An important fact in this new algorithm is that it allows to directly follow the trajectory of individual galaxies during the evolution of the universe, thus allowing to test, with high precision, the effects of possible global and local deformations of the Lagrangian particle description in spacetime. In this context, the statistical methods for structural fine analysis should be those commonly used to characterize the process of chaotic advection which tracers moving in turbulent fluids are subject. In our cosmological approach, galaxy clusters are formed only by baryonic matter that also feel the effect of deformation that will intensify the effect of gravitational instabilities inherent in the system.
5 Appendix A
The COLATUS algorithm implement the direct summation scheme by using a code written in CUDA. On the GPUs each thread compute the force, acceleration, position and velocity of each particle. The sequence of the algorithm is as follows:

First, the CPU reads the initial conditions from a datafile and copy the initial positions and velocities to the global memory;

On the CPU, the main program allocate the Memory on the GPU;

On the CPU, the main program copy the initial conditions to the GPU;

On the CPU, the main program initialize the threads on the GPU each of them will perform the calculations;

On the GPU, the forces are calculated by the threads.

On the GPU, we integrate the motion equation to actualize the particles velocities and positions. We are testing the viability of make the numerical integration on the CPU, then we will use the GPU only for calculation of the iteration forces (e.g. [21])).

The process is repeated and concurrently other library shows a particle motion in the box.

At every timestep the GPU copy to the CPU the particle position and velocity.
Below is an excerpt from the main part of the code in CUDA which is responsible for calculating the particleparticle gravitational interaction.
Acknowledgements.
The authors thank the scientific Brazilian support agencies: CAPES, CNPq and FAPESP.References
 [1] Bertschinger,E. Simulation of Structure formation in the universe. Annu. Rev Astron.Astrophys 36 (1996) 5999654.
 [2] Caretta, C.A., R. R. Rosa, H. F. C. Velho, F. M. Ramos, M. M. Evidence of turbulencelike universality in the formation of galaxysized dark matter haloes. Astron. Astrophys 487 (2008) 445451.
 [3] Aref, H., The development of chaotic advection. Physics of Fluids, 14 (2002) 13151325.
 [4] Rosa, R. R. ; Ramos, F. M. ; Caretta, C. A. ; Velho, H. F. C. Extreme event dynamics in the formation of galaxysized dark matter structures. Computer Physics Communications 180 (4) (2009) 621624.
 [5] Springel, V. The cosmological simulation code GADGET2. Computer Physics Communications, Mon. Not. R. Astron. Soc. 364 (2005) 11051134.
 [6] (a)Rosa, R. An alternative coupled expanding universe without cosmological singularity. In: 38th COSPAR Scientific Assembly. [S.l.: s.n.], 2010. v. 38,p. 3663. (b) Rosa et al (in this issue)
 [7] Jeans, J. H. The Stability of a Spherical Nebula. Phil. Trans. Roy. Soc. Lond. 1 (1902) 199 153.
 [8] Bagla, J.S. Cosmological Nbody simulation: Techniques, scope and status. Current Science, 88 (2005) 10881100
 [9] Bagla, J.S. and Padmanabhan, T. Cosmological NBody Simulations, Pramana  J. Phys. 42(1997) 161.
 [10] Mo, H. ; van den Bosch, F. and White, S. Galaxy Formation and Evolution, Cambridge University Press (2010).
 [11] Efstathiou, G.; Davis M., White S. D. M.; Frenk C. S. Numerical techniques for Large Cosmological Nbody Simulations. ApJS 57(1985) 241
 [12] Jenkins, A.; Frenk, C. S.; Pearce, F. R.; Thomas, P. A.; Colberg, J. M.; White, S. D. M.; Couchman, H. M. P.; Peacock, J. A.; Efstathiou, G.; Nelson, A. H. Evolution of Structure in Cold Dark Matter Universes. Astrophysical Journal 499 (1998) 20.
 [13] Jeong, D. Cosmology with high redshift galaxy surveys. The university of texas at austin. PHD these(2010) p.391
 [14] Stoehr, F. Simulations of Galaxy Formation and Large ScaleStructure. Physics Department LudwigMaximilian UniversitAat MAunchen PHD these(2003) p.208.
 [15] Zel’dovich, Y. B. Gravitational Instability: An aproximation theory for large density perturbation. AA 5 (1970) 8489.
 [16] Guth A. H.,Inflationary universe: A possible solution to the horizon and flatness problems Phys. Rev. D. 23 (1981) (2) 347356
 [17] McCrea, W. H. Observable relations in relativistic cosmology. Mit 1 Abbildung. Zeitschrift für Astrophysik. 9 (1934) 290.
 [18] Nyland L., Harris M., and Prins J.. Fast nbody simulation with cuda. GPU Gems 3 (2007) 677695.
 [19] Hamada, T. 190 TFlops Astrophysical N body Simulation on cluster of GPUs. Molecular Simulation. IEEE (2010) 19.
 [20] Quinn, T. Katz, N. Stadel, J. Lake, G. Time stepping NBody simulations. Preprint, astroph/ 9710043.
 [21] Hamada, T. Nitadori, K. 190 TFlops Astrophysical Nbody Simulation on cluster of GPUs. Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis. SC ’10(2010) 19.
 [22] Chambers, J. A Hybrid Sympletic Integrator that Permits Close Encounters between Massive Bodies. Mon.NotR.Astron.Soc.(1998) 18.
 [23] Eisenstein,D. Hu, W. Power Spectra for Cold Dark Matter and its Variants. Astrophys.J. 511 (1997) 5.
 [24] Hockney, R.W., Eastwood, J.W.. Computer Simulation In: Hut, P., McMillan, S. (Eds.). SpringerVerlag, Berlin, Using Particles. (1981) McGrawHill, New Work