Towards an ultra efficient kinetic scheme
Part I: basics on the BGK equation ^{†}^{†}thanks: This work was supported by
the ANR Blanc project BOOST and the ANR JCJC project ALE INC(ubator) 3D
Abstract
In this paper we present a new ultra efficient numerical method for solving kinetic equations. In this preliminary work, we present the scheme in the case of the BGK relaxation operator. The scheme, being based on a splitting technique between transport and collision, can be easily extended to other collisional operators as the Boltzmann collision integral or to other kinetic equations such as the Vlasov equation. The key idea, on which the method relies, is to solve the collision part on a grid and then to solve exactly the transport linear part by following the characteristics backward in time. The main difference between the method proposed and semiLagrangian methods is that here we do not need to reconstruct the distribution function at each time step. This allows to tremendously reduce the computational cost of the method and it permits for the first time, to the author’s knowledge, to compute solutions of full six dimensional kinetic equations on a single processor laptop machine. Numerical examples, up to the full three dimensional case, are presented which validate the method and assess its efficiency in 1D, 2D and 3D.
Keywords: Kinetic equations, discrete velocity models, semi
Lagrangian schemes, BoltzmannBGK equation, 3D simulation.
1 Introduction
The kinetic equations provide a mesoscopic description of gases and more generally of particle systems. In many applications, the correct physical solution for a system far from thermodynamical equilibrium, such as rarefied gases or plasmas, requires the resolution of a kinetic equation [7]. However, the numerical simulation of these equations with deterministic techniques presents several drawbacks due to the large dimension of the problem. The distribution function depends on seven independent variables: three coordinates in physical space, three coordinates in velocity space and the time. As a consequence, probabilistic techniques such as Direct Simulation Monte Carlo (DSMC) methods [1, 5, 6, 29] are extensively used in real situations due to their large flexibility and low computational cost compared to finite volume, finite difference or spectral methods for kinetic equations [17, 18, 28, 31, 35]. On the other hand, DSMC solutions are affected by large fluctuations. Moreover, in non stationary situations it is impossible to use time averages to reduce these fluctuations and this leads to, either poorly accurate solutions, or again to computationally expensive simulations.
For this reason, many different works have been dedicated to reduce some of the disadvantages of Monte Carlo methods. We quote [5] for an overview on efficient and low variance Monte Carlo methods. For applications of variance reduction techniques to kinetic equation let us remind to the works of Homolle and Hadjiconstantinou [21] and [22]. We mention also the work of Boyd and Burt [4] and of Pullin [36] who developed a low diffusion particle method for simulating compressible inviscid flows. We finally quote the works of Dimarco and Pareschi [14, 15] and of Degond, Dimarco and Pareschi [12] who constructed efficient and low variance methods for kinetic equations in transitional and general regimes.
In this work, we consider the development of a new deterministic method to solve kinetic equations. In particular, we focus on the development of efficient techniques for the discretization of the linear transport part of these equations. The proposed method is based on the socalled discrete velocity models (DVM) [28] and on the semi Lagrangian approach [8, 9]. The DVM models are obtained by discretizing the velocity space into a set of fixed discrete velocities [3, 28, 31, 32]. As a result of this discretization, the original kinetic equation is then represented as a set of linear transport equations plus an interaction term which couples all the equations. In order to solve the resulting set of equations, the most common strategy consists in an operator splitting strategy [10]: The solution in one time step is obtained by the sequence of two stages. First one integrates the space homogeneous equations and then, in the second stage, the transport equation using the output of the previous step as initial condition. More sophisticated splitting techniques can be employed, which permits to obtain high order in time discretizations of the kinetic equations as for instance the Strang splitting method [37]. In any case, the resulting method is very simple and robust but the main drawback is again the excessive computational cost. It is a matter of fact that the numerical solution through such microscopic models and deterministic schemes remains nowadays too expensive especially in multidimensions even with the use of supercomputers.
To overcome this problem, we propose to use a Lagrangian technique which exactly solves the transport stage on the entire domain and then to project the solution on a grid to compute the contribution of the collision operator. The resulting scheme shares many analogies with semiLagrangian methods [8, 9, 18] and with Monte Carlo schemes [24], as we will explain, but on the contrary to them, the method is as fast as a particle method while the numerical solution remains fully deterministic, which means that there is no source of statistical error. Thanks to this approach we are able to compute the solution of the full six dimensional kinetic equation on a laptop. This is, up to our knowledge, the first time that the full kinetic equation has been solved with a deterministic scheme on a single processor machine for acceptable mesh sizes and in a reasonable amount of time (around ten hours for space velocity space mesh points).
In this first work, we consider a simple collision operator, i.e. the BGK (BhatnagarGrossKrook) relaxation operator [20]. The extension of the method to other operators like the Boltzmann one [1, 7] or to other kinetic equations like the Vlasov equation [2, 18] will be considered in future works. At the present moment, the method is designed to work on uniform grids, although extensions to other meshes are possible and will be also considered in the next future.
The article is organized as follows. In section 2, we introduce the BoltzmannBGK equations and their properties. In section 3, we present the discrete velocity model (DVM). Then in section 4 we present the numerical scheme. Section 5 is devoted to the illustration of the analogies between such fast kinetic scheme (FKS) and particle methods. Several test problems up to three dimensional test cases which demonstrate the capabilities and the strong efficiency of the method are presented and discussed in section 6. Some final considerations and future developments are finally drawn in the last section.
2 BoltzmannBGK Equation
We consider the following kinetic equation as a prototype model for developing our method:
(1) 
with the initial condition
(2) 
This is the BoltzmannBGK equation where is a non negative function describing the time evolution of the distribution of particles which move with velocity in the position at time . For simplicity we consider the same dimension in space and in velocity space , however it is possible to consider different dimensions in order to obtain different simplified models. In the BGK equation the collisions are modeled by a relaxation towards the local thermodynamical equilibrium defined by the Maxwellian distribution function . The local Maxwellian function is defined by
(3) 
where and are the density and mean velocity while with the temperature of the gas and the gas constant. The macroscopic values , and are related to by:
(4) 
The energy is defined by
(5) 
while the kinetic entropy of by
(6) 
The parameter in (1) is the relaxation time. In this paper, is fixed at the beginning of each numerical test. Considering relaxation frequencies as functions of the macroscopic quantities does not change the numerical scheme we will propose and its behaviors. We refer to section 6 for the numerical values chosen.
If we consider the BGK equation (1) multiplied by , , (the socalled collision invariants), and then integrated with respect to , we obtain the following balance laws:
(7) 
which express the conservation of mass, momentum and total energy, in which is the pressure tensor while is the heat flux. Furthermore the following inequality expresses the dissipation of entropy:
(8) 
System (7) is not closed, since it involves other moments of the distribution function than just , and . Let us describe one way to close the system.
The Maxwellian can be characterized as the unique solution of the following entropy minimization problem
(9) 
where and are the vectors of the collision invariants and of the first three moments of respectively:
(10) 
This is the wellknown local Gibbs principle, and it expresses that the local thermodynamical equilibrium state minimizes the entropy, in the mathematical sense, of all the possible states subject to the constraint that moments are prescribed.
Formally, when the number of collision goes to infinity, which means , the function converges towards the Maxwellian distribution. In this limit, it is possible to compute the moments and of in terms of , and . In this way, one can close the system of balance laws (7) and get the socalled Euler system of compressible gas dynamics equations
(11) 
3 The Discrete Velocity Model (DVM)
The principle of Discrete Velocity Model (DVM) is to set a grid in the velocity space and to transform the kinetic equation in a set of linear hyperbolic equations with source terms. We refer to the work of Mieussens [28] for the description of this model and we remind to it for the details.
Let be a set of multiindices of , defined by , where are some given bounds. We introduce a Cartesian grid of by
(12) 
where is an arbitrary vector of and is a scalar which represents the grid step in the velocity space. We denote the discrete collision invariants on by .
Now, in this setting, the continuous distribution function is replaced by a vector , where each component is assumed to be an approximation of the distribution function at location :
(13) 
The fluid quantities are then obtained from thanks to discrete summations on :
(14) 
The discrete velocity BGK model consists of a set of evolution equations for of the form
(15) 
where is a suitable approximation of defined next. Two strongly connected and important questions arise when dealing with discrete velocity models. The first one is about the truncation and boundedness of the velocity space. The second one concerns the conservation of macroscopic quantities.
Truncation and boundedness of the velocity space.
In DVM methods one needs to truncate the velocity space and to fix some bounds. This gives the number of evolution equations (15). Of course, the number is chosen as a compromise between the desired precision in the discretization of the velocity space and the computational cost, while the bounds are chosen to give a correct representation of the flow. Observe in fact that, the macroscopic velocity and temperature are bounded above by velocity bounds. This implies that the discrete velocity set must be large enough to take into account large variations of the macroscopic quantities which may appear as a result of the time evolution of the equations. Moreover, as a consequence of the velocity discretization, we have that the temperature is bounded from below. We summarize the above remarks by the following statement. Let be a non negative distribution function, then the macroscopic velocity and temperature associated to in by
(16) 
where denotes the summation over the set of multiindices , satisfy the bounds [28]
(17)  
(18) 
Conservation of macroscopic quantities.
Exact conservation of macroscopic quantities is impossible, because in general the support of the distribution function is non compact. Thus, in order to conserve macroscopic variables, different strategies can be adopted, two possibilities are described in [19, 28]. Moreover, the approximation of the equilibrium distribution with must be carefully chosen in order to satisfy the conservation of mass, momentum and energy. In the following section we will discuss our choices in details. Such choices prevent the lack of conservation of physical quantities.
Remark 1
Once DVM model is defined as above, the common choice which permits to solve the kinetic equation is to discretize the evolution equations with the preferred finite volume or finite difference method [28, 31, 32, 35]. Alternatively, one can reconstruct the distribution function in space and then follows the characteristics backward in time to obtain the solution of the linear transport equation [8, 9, 17, 18]. Our choice, described in the next section, which enables to drastically decrease the computational cost, consists of an exact solution of the linear transport equation avoiding the reconstruction of the distribution function.
4 Fast kinetic schemes (FKS)
The main features of the method proposed in this work can be summarized as follows:

The BGK equation is discretized in velocity space by using the DVM method.

The transport part is exactly solved, which means without using a spatial mesh. The initial data of this step is given by the solution of the relaxation operator.

The relaxation part is solved on the grid. The initial data for this step is given by the value of the distribution function in the center of the cells after the transport step.
Before describing the scheme, we explain how we overcome the drawback of the lack of conservation of macroscopic quantities in DVM methods.
4.1 Conservative methods
We introduce the conservative method for the initial data and then we extend it to the scheme. The initialisation is done in two steps. First we fix
(19) 
Observe that, in order to do this operation we do not need to discretize the physical space, in others words, if the initial data is known continuously, this information can be kept. However, for simplicity, we already at this stage introduce a Cartesian uniform grid in the physical space. This is defined by the set of multiindices of , which is , where are some given bounds which represent the boundary points in the physical space. Next, the grid of is given by
(20) 
where represents at the same time the dimension of the physical space and the dimension of the velocity space which are taken equal for simplicity, even if this is not necessary for the setting of the numerical method. Finally, is a vector of which determines the form of the domain and is a scalar which represents the grid step in the physical space. We consider a third discretization which is the time discretization . We will later in the paper introduce the time step limitations.
We denote with the approximation and with the pointwise distribution value which are different, for conservation reasons, as explained next. In this notation, the discrete moments of the distribution are
(21) 
The corresponding discrete equilibrium is denoted , or equivalently by , which is an approximation of and it will be also defined later. When the distribution function is truncated in velocity space, conservation of the macroscopic quantities is no longer possible. Thus, in order to restore the correct conserved variables we make use of a simple constrained Lagrange multiplier method [19], where the constraints are mass, momentum and energy of the solution. Let us recall the technique from [19]: Let be the total number of discretization points of the velocity space of the distribution function. We consider one space cell, the same renormalization of should be considered for all spatial cells. Let
(22) 
be the pointwise distribution vector at and
(23) 
be the unknown corrected distribution vector which fulfills the conservation of moments. Let
(24) 
and be the vector of conserved quantities. Conservation can be imposed using a constrained optimization formulation:
(25)  
To solve this constrain minimization problem, one possibility is to employ the Lagrange multiplier method. Let be the Lagrange multiplier vector. Then the corresponding scalar objective function to be optimized is given by
(26) 
The above equation can be solved explicitly. In fact, taking the derivative of with respect to , for all and , for all , that is to say the gradient of , we obtain
(27) 
and
(28) 
Now, solving for we get
(29) 
and observing that the matrix is symmetric and positive definite, since is the integration matrix, one deduces that the inverse of exists. In particular the value of is uniquely determined by
(30) 
Back substituting into (27) provides
(31) 
Observe that, following the same principle, we can impose the conservation of other macroscopic quantities, in addition to mass, momentum and energy. A key point is that, in practice, we need to solve the above minimization problem only for the initial data because once the conservation is guaranteed for , this is also guaranteed for the entire computation because the exact solution is used for solving the transport step. The only possible source of loss of conservation for the entire scheme is the relaxation step. This means that, for this step, we will need to impose conservation of the macroscopic quantities but only for the equilibrium distribution.
The discretization of the Maxwellian distribution , should satisfy the same properties of conservation of the distribution , i.e. . To this aim, observe that the natural approximation
(32) 
cannot satisfy these requirements, due to the truncation of the velocity space and to the piecewise constant approximation of the distribution function. Thus, the calculation carried out above for the definition of the initial distribution , can be also performed for the equilibrium distribution . This should be done each time we invoke the equilibrium distribution during the computation. The function is therefore given by the solution of the same minimization problem defined in (4.1), and its explicit value is given mimicking (31) by
(33) 
where represents the pointwise values of the Maxwellian distribution . Notice that the computation of the new distributions and only involves a matrixvector multiplication. In fact, matrix only depends on the parameter of the discretization and thus it is constant in time. In other words matrices and can be precomputed and stored in memory during the initialisation step. They are used during the simulation when the solution of system (4.1) is invoked.
Another possibility to approximate the Maxwellian distribution is proposed in [28]. In that work, the authors define as the solution of a discrete entropy minimization problem
(34) 
This discretization (existence, uniqueness, convergence) has been mathematically studied in [28]. However, one drawback of this method, is the need for solving a non linear system of equations in each spatial cell for each time step. As we seek for efficiency, we only consider the first minimization strategy (4.1) to approximate the equilibrium distribution .
4.2 Conservative and fast kinetic schemes FKS
We can now present the full scheme. Let us start with the firstorder splitting scheme and then, define the secondorder in time method based on the Strang splitting strategy [37].
Let be the initial data defined as a piecewise constant function in space and in velocity space, solution of equation (31) with . We recall that the choice of a piecewise constant function in space is not mandatory for the method. Let also be the initial equilibrium distribution solution of equation (33) with . We start describing the first time step of the method starting at , we further generalize the method to the generic time step starting from .
First time step .
Let us describe the transport
and relaxation stages.
Transport stage. We need to solve linear transport
equations of the form:
(35) 
where the initial data for each of the equations is a piecewise constant function in the three dimensional space defined as
(36) 
The exact solution of the equations at time is given by
(37) 
Observe that, here, we do not need to reconstruct our function as for instance in the semiLagrangian schemes [17, 18], the shape of the function in space is in fact known and fixed at the beginning of the computation. Once the solution of the transport step is known, to complete one step in time, we need to compute the solution of the relaxation step. As in finite volume or finite difference methods, we solve the relaxation step only on the grid, thus only the value of the distribution function in the centers of the cells are computed. From the exact solution of the function we can immediately recover these values at the cost of one simple vector multiplication. On the other hand, one notices that for classical finite difference or finite volume methods nested loops for each dimension in space and in velocity space are mandatory to compute the solution of the transport part. This makes the computational cost of these methods extremely demanding in the multidimensional cases. On the contrary, the computational cost of the method we propose is only of the order of the number of points in which the velocity space is discretized (). In particular, for uniform meshes, we only need to compute the new value of in the center of one single cell, to know the solution in the center of all others cells.
Relaxation stage. For this step we need to locally solve on the grid, i.e. in the center of each spatial cell, an ordinary differential equation. Thus, we have to solve:
(38) 
where the initial data is the result of the transport step
(39) 
Any discretization method in time for this term can be chosen, as for instance the preferred Runge Kutta method. However, being the above equation a first order linear ordinary differential equation, we choose to compute the exact solution. The last ingredient needed to perform the computation, is the value of the equilibrium distribution at the center of the cell after the transport stage. To this aim, observe that, the Maxwellian distribution does not change during the relaxation step, which means that during this step the macroscopic quantities remain constants. This implies that only the transport stage possibly modifies the equilibrium distribution. In order to compute the Maxwellian, the macroscopic quantities in the center of the cells, i.e. the density, the mean velocity and the temperature, are given by summing the local value of the discrete distribution over the velocity set , where . Finally, the discrete equilibrium distribution at time is the solution of equation (33) with moments . We can now compute the solution of the relaxation stage by
(40) 
Observe that the above equation furnishes only the new value of the distribution at time in the center of each spatial cell for each velocity . However, what we need, in order to continue the computation, is the value of the distribution in all points of the space. To overcome this problem, in classical discrete velocity methods several authors [28, 31] consider the distribution function constant in the cell as well as the Maxwellian distribution. The result is that they need to solve only an ordinary differential equation in the center of the cell taking the average value of the macroscopic quantities inside one cell. Here, we make a different approximation. We consider that the equilibrium distribution has the same form as the distribution in space. In other words is a piecewise constant function in space for each velocity . The values of this piecewise constant function are the values computed in the center of the spatial cells, i.e. one defines
(41) 
This further implies that the relaxation term writes in term of spacial continuous function as
(42) 
For each velocity this choice permits to keep the
form of the distribution constant in space throughout the computation,
and, as a consequence it drastically reduces the computational cost.
This ends the first time step.
We focus now on the time marching procedure for the first and secondorder splitting schemes which will allow to solve the BolzmannBGK equation.
Generic time step .
We present a firstorder and secondorder Strang splitting technique [37].
Firstorder splitting: Given the value of the distribution function , for all , and all at time , the value of the distribution at time , , is given by
(43) 
(44) 
where is a piecewise constant function, computed considering the solution of the minimization problem (33) relative to the moments value in the center of each spatial cell after the transport stage: . These moments are given by computing where is the value that the distribution function takes after the transport stage in the center of each spatial cell.
Secondorder splitting: Given the value of the distribution function at time , the scheme reads
(45) 
(46) 
where is a piecewise constant function, computed considering the solution of the minimization problem (33) relative to the moments values in the center of each spatial cell after the transport stage of size . We call these moments . They are given by the discrete summation where is the value that the distribution function takes after the transport stage in the center of each spatial cell. The last step consists of a second transport stage of half time step
(47) 
which ends the secondorder splitting scheme.
Remark 2

As already mentioned the choices of uniform meshes and piecewise constant functions in space are not necessary for the construction of the method. These choices have been made because we wanted to analyze the method in its simplest form. We postpone to future works the study of nonuniform meshes and different shapes of the distribution function in space. However, a key point is that, even if the method in its general form is already much more faster than finite volume, finite difference or semi Lagrangian methods for kinetic equations, it can be made extremely fast in the case of uniform meshes as we will explain in the next paragraph.

For finite volume or finite difference methods applied to discrete velocity models of kinetic equations, the secondorder time splitting implies the computation of the transport stage in two steps, from to and from to . Conversely the same operation can be done with the relaxation step to get second order accuracy.

In our method, extending the scheme from first to secondorder time splitting is almost as expensive as the firstorder. In fact, except for the first time step in which we need to compute two times the transport operator with , starting from the second time step we have to solve a sequence of two transport stages. However, being the transport computed exactly, solving the linear transport equations two times with or only one time with the entire provides the same solution. This means that, in order to obtain secondorder accuracy it is sufficient to solve the first time step with and then proceed as for the firstorder method to obtain global secondorder accuracy in time.

However, any time splitting method does degenerate to firstorder accuracy in the fluid limit, that is to say, when .

Due to the fact that the relaxation stage preserves the macroscopic quantities, the scheme is globally conservative. In fact, at each time step, the change of density, momentum and energy is only due to the transport step. This latter, being exact, does preserve the macroscopic quantities as well as the distribution function.

For the same reason, the scheme is also unconditionally positive. In others words, we observe that , for all , and if the initial datum is positive for all . In fact, the transport maintains the shape of unchanged in space while the relaxation towards the Maxwellian distribution is a convex combination of and both being positive.

We expect the scheme to perform very well in collisionless or almost collisionless regimes. In these cases in fact the relaxation stage is neglectible and only the exact transport does play a role. When moving from rarefied to dense regimes the projection over the equilibrium distribution becomes more important. Thus, the accuracy of the scheme is expected to diminish in fluid regimes, because the projection method is only firstorder accurate. One possibility, for such regimes is to increase the order of the projection method towards the equilibrium. This possibility will also be analyzed in future works.

The time step is chosen as the classical CFL condition
(48) Observe that this choice is not mandatory, in fact the scheme is always stable for every choice of the time step, but being based on a time splitting technique the error is of the order of or . This suggests to take the usual CFL condition in order to maintain the error small enough.
5 Analogies with particles methods
In this section we first introduce a Monte Carlo particle method which permits to solve the BoltzmannBGK equation. Next, we introduce its deterministic counterpart, i.e a deterministic particle method. Finally, we show that a slightly modified version of this latter method, in which the positions of the particles, instead of being randomly chosen, are taken initially at the same position in space for all the cells, is equivalent to a FKS method where some specific choices of the discretization parameters are done. This analogy permits to derive a very convenient form of the algorithm which for this choice of the discretization parameters.
The starting point of Monte Carlo methods is again given by a time splitting between free transport
(49) 
and collision, which in the case of the BGK operator is substituted by a relaxation towards the equilibrium
(50) 
In Monte Carlo simulations the distribution function is discretized by a finite set of particles
(51) 
where represents the particle position, the particle velocity and the particle mass which is usually taken constant. During the transport stage the particles move to their next positions according to
(52) 
where is such that an appropriate CFL condition holds. This condition normally implies that one particle does not cross more than one cell in one time step.
The collision step acts only locally, changes the velocity distribution but preserves the macroscopic quantities. In this case, as already explained, the space homogeneous problem admits the following exact solution at time
(53) 
Thus, in a Monte Carlo method, the relaxation step consists in replacing randomly selected particles with Maxwellian particles with probability . This means
(54) 
where in the above expression represents a particle sampled from the Maxwellian distribution with moments . Observe that, secondorder splitting can be used as well in the Monte Carlo methods. As in the case of the FKS, because the transport step is resolved exactly, the change with respect to the firstorder method is only the first time step which has to be computed with a time step of . This will assure secondorder accuracy in time except in the limit in which the method degenerates again to firstorder accuracy.
We introduce now a modified particle method which shares many analogies with our method. Instead of the continuous kinetic equation, this modified particle approach solves the discrete velocity approximation of the kinetic equation. In this method, the distribution function is again represented by a piecewise constant function, defined on a compact support in the velocity space. The distribution function is approximated by a finite set of particles in each spatial cell as in the previous Monte Carlo method. The main difference with respect to the other particle method is that now the particles can attain only a discrete set of velocities and that the mass of each particle is no more a constant, instead it changes in time during the time evolution of the kinetic equation. These types of methods are known in literature as weighted particles methods [11, 26, 27]. Therefore we consider
(55) 
where is the same set of multiindices than the DVM discretization (this means that the number of particle is fixed equal to the number of points in which the velocity space is discretized). The BGK equation is again split into two stages: a transport and a relaxation stage. The transport part, as before, corresponds to the motion of the particles in space caused by their velocities (52). The main difference is in the solution of the relaxation part (50). In order to solve this equation from a particle point of view, we change the mass of each particle using the exact solution of the relaxation equation, i.e.
(56) 
this corresponds to
(57) 
Again in practice to avoid the loss of conservation of macroscopic quantities, once the conserved quantities are computed in one cell, we solve the minimization problem (33) to get the function . Thus, the above procedure requires the knowledge of , which can only be estimated from the sample positions. The simplest method, which produces a piecewise constant reconstruction, is based on evaluating the histogram of the samples on the grid, considering all the samples inside one cell be of the same importance irrespectively of their positions. In practice, the density is given by the number of samples belonging to the cell
(58) 
while the mean velocity in each spatial direction and the energy are given by
(59) 
The method described above deserves some remarks. First, note that as the method becomes a particle scheme for the limiting fluid dynamic equations. This limit method is the analogous of a kinetic particle method for the compressible Euler equations. Second, the simple splitting method described is firstorder in time. Second order Strang splitting can be implemented similarly to the case of the FKS scheme described in the previous section.
Now, we dispose of all the elements which permit to highlight the similarities with the FKS scheme. Observe that the relaxation step (57) is no more solved statistically as for the original Monte Carlo method (54). Thus, the scheme described is in fact a deterministic particle scheme, in which, however, the particle positions are still randomly initialized. Now, if we consider the piecewise reconstruction of the macroscopic quantities introduced before (5859), we take one single particle for each velocity and we fix all particles positions at the beginning of the computation at the center of each cell we obtain the FKS described in the previous section. In fact, first the number of particles in each spatial cell remains constant in time and equal to the number of mesh point in velocity space . This is because for each particle that goes out of one cell, there exists another particle with the same velocity which enters in the cell from another location. This is due to the fact that particles have initially the same position, they never change velocity and the mesh is uniform. Thus, during the time evolution the only quantity that is modified is the mass of the particle. This mass changes according to the solution of the relaxation equation (57). This is exactly what happens in the FKS method in equation (44). Finally, the transport is solved exactly for the particle scheme as well as for the FKS method. However, the weighted particle scheme, can be viewed as a particular case of the FKS method. In fact, to regain the weighted particle method, we have to fix the position of the particles, take only a single particle for a given velocity , the mesh must be uniform and the shape of the distribution function in space must be piecewise constant for the FKS method. This analogy between the two schemes permits, from one side, to derive a very efficient algorithm for the FKS method. From the other side, it opens the way to in deep discussions from the theoretical point of view on the relation between the two methods , like the different convergence properties of the two approaches. We remind to a future work for an analysis of the convergence of the FKS method.
6 Numerical tests
6.1 General setting
In this section, we present several numerical tests to illustrate the main features of the method. First the performance of the scheme is tested in the one dimensional case for solving the Sod problem. In this case, we do comparisons of our method with different finite difference methods which can solve the same problem. In the one dimensional case, the computational speedup is not very relevant being all classical methods sufficiently fast. However, the FKS method is still faster than the other methods. In a second series of tests we solve a two dimensional two dimensional kinetic equation. Finally we solve a full threethree dimensional problem. In this situation, it is a matter of fact that computing the solution of a kinetic equation with finite difference, finite volume or semiLagrangian methods is unreasonable. We will show results from our method running on a monoprocessor laptop machine.
6.2 1D Sod shock tube problem
We consider the 1D/1D Sod test with mesh points in physical and points in velocity spaces. The boundaries in velocity space are set to and . The left and right states are given by a density , mean velocity and temperature if , while , , if . The gas is in thermodynamical equilibrium. We repeat the same test with different values of the Knudsen number, ranging from to . We plot the results for the final time for the density (Figure 1), the mean velocity (Figure 2) and the temperature (figure 3). In each figure we compare the FKS method with a third order WENO method, a secondorder MUSCL method and a firstorder upwind method [23]. These numerical methods used as reference, employ the same discretization parameters, except for the time step which for stability reason is chosen equal to for the WENO and secondorder MUSCL schemes, where is the time step of the fast DVM method given by (48).
From Figures (1) to (3) we can observe that our method gives very similar results to the two high order schemes for , and while for , the scheme is more diffusive than the second and third order scheme but it still performs better than the first order method. The behaviors of the method for different regimes are due to the fact that for collisionless regimes the FKS gives almost the exact solution, this means that it is more precise than the third and secondorder methods. When the gas becomes denser the projection towards the equilibrium, which is only firstorder (second step of the method (44)), does reduce the accuracy of the method. Notice that high order reconstruction of the equilibrium distribution could also be considered to increase the global accuracy in such case. However, a key point of the FKS is its low CPU time consumption in comparison to other existing methods. In the case for which the scheme exhibits diffusive behaviors, a comparison between the third order WENO method and our FKS method is carried out for a fixed CPU time. In other words, we consider for a given total computational time, which method gives better results. Thus, we solve the problem with points in space and in velocity space for the WENO method and we consider an FKS solver which employs points in velocity space. In order to have the same computational time for the two methods, we can afford points for the FKS. The two results are compared in Figure 4. We observe that, in this situation, the FKS method gives more accurate solutions, in particular for the shock wave (see the zooms in the figures). Finally, observe that the gain in term of computational time is not so relevant for the one dimensional case, while it becomes very important for the two and the three dimensional case. In the later case, the difference is about being able to do or not to do the computation in a reasonable amount of time on a single processor machine.
6.3 2D Sod shock tube problem
We consider now the 2D/2D Sod test on a square . The velocity space is also a square with bounds and , i.e. , discretized with points in each direction which gives points. We repeat the same test using different meshes ranging from to . The domain is divided into two parts, a disk centered at point of radius is filled with a gas with density , mean velocity and temperature , whereas the gas in the rest of the domain is initiated with , , . The final time is . The gas is in thermodynamical equilibrium during all the computation which means that we fix . In practice, we are using the kinetic scheme to compute the solution of the compressible Euler equation. We recall that, as seen in the previous section, this is the case in which the FKS scheme gives the worse results, this is due to the first order accurate projection towards the local Maxwellian distribution. However, this choice permits to compare our results with a numerical method for the compressible Euler equations, being as already stated, computationally very demanding to perform simulations of kinetic equations in the two dimensional case and considerably more demanding in the three dimensional case.
In Figure 5 we show the results for respectively the density, the mean velocity in the direction and in the direction and the temperature using a mesh. In Figure 6 we report the profile for of the same macroscopic quantities comparing the results to a first order and to a second order MUSCL scheme for the compressible Euler equations [23]. We clearly see that, as in the 1D case, the accuracy of the FKS method lies between the first and the second order accuracy in the limit . We expect the accuracy to be highly improved when the gas is far from the thermodynamical equilibrium as in the one dimensional case.
In table 1 we report the CPU time of these simulations, the CPU time per time cycle , the CPU time per cycle per cell and the number of cycles needed to perform the computation for different meshes in space and a fixed mesh in velocity. As expected the number of time step linearly scales with the size of the spatial mesh at fixed velocity mesh (factor when the cell number is multiplies by ). The CPU time is very small compared to classical kinetic schemes, in less than minutes the simulation of the Sod shock tube on a mesh is computed. Finally we observe that the CPU time per cycle per cell is almost constant which allows to predict the end of the simulation and its cost beforehand.
6.4 Numerical validation of the 3D/3D fast FKS method
Here we report some simulations of the full 3D/3D problem. We consider only the case in which , which means, we project towards equilibrium at each time step, this is the fluid limit. We recall that, in this regime, the numerical method gives the worst results in terms of precision, on the other hand, exact solution are known and this permits to make fair comparisons. For all the other regimes, the performances of the method are better as shown in the previous section.
The FKS method has been implemented in fortran on a sequential machine. The goal is to numerically show that such a kinetic scheme can reasonably perform on six dimensions on a monoprocessor laptop. All simulations have been carried out on a HP EliteBook 8740W Intel(R) Core(TM) i7 Q840@1.87GHz running under a Ubuntu (oneiric) version 11.10. The code has been compiled with gfortran 4.6 compiler with O3 optimization flags.
Otherwise noticed the velocity space is or and is discretized with or grid points in each velocity direction, leading to or mesh points. The time step is fixed to of the maximum time step allowed, as prescribed by the CFL condition (48), apart from the last time step which is chosen to exactly match the usergiven final time. Symmetric boundary conditions are considered.
The Sod shock tube in 1D is run as a sanity checks in order to validate the implementation of the method and show its ability to reproduce 1D results with a 3D run. Then the Sod problem in 3D is simulated to show the performances of the FKS algorithm and further compared to a reference solution. For each simulation we report the memory consumption, the full CPU time and the CPU time cost per cell per time step. Some extrapolation of these results are also made to measure the efficiency of this method.
6.4.1 1D Sod shock tube problem: A sanity check
Cell v #  Cell x #  Cell #  Cycle  Time  Time/cycle  Time/cell  

Bounds  T (s)  (s)  (s)  
s  

s  
s  
mn  
s  
mn 
The first sanity check consists of running the 1D Sod shock tube in direction on cubes. The initial data are the same as for the 1D problem previously run. The final time is . In our numerical experiments the computational domain is of size in direction leading to . We set . Four successively refined meshes in direction are utilized, , , and , in order to observe the convergence of the numerical method towards the exact solution.
In Figure 7 we display the density, the velocity and the temperature vs the exact solution with solid line (respectively panels (a), (c) and (d)) and a 3D view on the mesh cells colored by density (panel (b) where a mesh is used for figure scaling reasons). The first observation is the perfect symmetry in the ignorable directions and as all cells are plotted (notice that the results for a cells mesh exactly match the results). The second obvious observation is the convergence of the numerical solution towards the exact solution when the mesh is refined. These results assess the ability of the method and the code to reproduce 1D results without alteration.
(a)  (b) 
(c)  (d) 
In table 2 we gather the number of cycles , the CPU time of these simulations and display the CPU time per time cycle and the CPU time per cycle per cell . As expected, the cycle number and the CPU time per time cycle scales with the cell number and, consequently, the CPU time per cycle per cell is almost constant. This allows to almost exactly predict the duration of a simulation knowing the cell number. Moreover we have provided the relative percentage of the cost of the transport and collision stages.
As expected the transport stage does not cost anything, in absolute value, especially when the number of cells increases. In fact, for computing the solution of this stage in all domain, we consider the evolution of the distribution function in one single cell, the same happens in the other cells. This means that the cost of this stage is proportional to the mesh points in the velocity space. On the other hand, in finite volume methods as well as Monte Carlo method the cost to solve this stage is proportional to with and, obviously this scales with . Another satisfactory result is the memory storage in MB (or Mo) of the method which is very low because we never have to store the distribution function values for more than points, leading to store reals, say MB independently of . Conversely the storage of the Monte Carlo method scales with the cell number . Finally as expected the time T scales with a factor for twice the number of cells.
Cell #  Cycle  Time  Time/cycle  Time/cycle/cell  Memory  
(s)  (s)  (s)  Mem(MB)  
Transp.  Coll.  
6.4.2 3D Sod shock tube problem
The 3D Sod shock tube has been run with the 3D/3D FKS method. The left state of the 1D Sod problem is set for any cell with cell center radius , conversely the right state is set for cell radius . The final time is . The domain is the unit cube and the mesh is composed of cells with and . The problem is run with ( cells), ( million cells) and ( millions cells). The velocity space is either discretized with points, or discretized with points. This leads to consider up to milliards cells. In Figure 8 the density is plotted as a function of the radius (left panel) and the colored density on a 3D view (right panel) for (middle panels) and (bottom panels). The two different choices for the bounds and the mesh points in velocity space do not significantly change the results hence only the solution with bounds and with mesh points is reported. The reference solution is obtained with a 2D axisymmetric compatible staggered ArbitraryLagrangianEulerian code [25] with cells in radial and cells in angular directions.