Efficient Fourier Basis Particle Simulation
The standard particle-in-cell algorithm suffers from finite grid errors which break energy conservation, cause numerical dispersion, and create numerical instabilities. There exists a gridless alternative which bypasses the deposition step and calculates each Fourier mode of the charge density directly from the particle positions. We show that a gridless method can be computed efficiently through the use of an Unequally Spaced Fast Fourier Transform (USFFT) algorithm. After a spectral field solve, the forces on the particles are calculated via the inverse USFFT (a rapid solution of an approximate linear system) beylkin_fast_1995 (); dutt_fast_1993 (). We provide a one dimensional implementation of this algorithm with an asymptotic runtime of for each iteration, identical to the standard PIC algorithm (where is the number of particles and is the number of Fourier modes). Higher dimensional formulations would scale as , where is the spatial dimensionality of the problem. We demonstrate superior energy conservation and reduced noise, as well as convergence of the energy conservation at small time steps.
keywords:Numerical, Plasma, Particle-in-cell, Energy conserving, Momentum conserving, Fourier transform
A common approach to numerically solving the Vlasov-Poisson system is to represent the distribution function using particles, with fields solved on a grid and interpolated at the particle positions birdsall_plasma_1985 (); hockney_computer_1988 (). This scheme, known as the particle-in-cell (PIC) method, has been enormously successful for simulating plasmas and is used in a wide variety of applications, but does not conserve total energy langdon_effects_1970 (). Energy-conserving schemes based on variational formulations have been proposed lewis_energy-conserving_1970 (); eastwood_virtual_1991 (), but generally do not conserve momentum because of a lack of translational invariance evstatiev_variational_2013 (), though the momentum error can be kept small in many cases due to the choice of integrator chen_energy-_2011 ().
In addition to the lack of energy conservation, PIC also suffers from a finite grid instability, in which sufficiently high Fourier modes experience exponential growth okuda_nonphysical_1972 (); lindman_dispersion_1970 () due to coupling with lower modes. This phenomenon causes numerical heating, which saturates when the Debye length is on the order of the grid spacing okuda_nonphysical_1972 (); evstatiev_variational_2013 (). The finite grid instability is of particular relevance to problems involving cold plasmas or multiple length scales okuda_nonphysical_1972 (); brackbill_method_1994 (). Various approaches have been proposed to reduce this instability, such as the use of smoother particle shapes jacobs_high-order_2006 (), grid jiggling brackbill_method_1994 (), filtering hockney_quiet_1974 (), and high-order Galerkin methods jacobs_high-order_2006 (), but many of these schemes suffer from issues suck as a lack of charge conservation and high computational costs.
Several energy-conserving particle algorithms lewis_energy-conserving_1970 (); eastwood_virtual_1991 () based on the Lagrangian formulated by Low low_lagrangian_1958 () have been suggested as alternatives to PIC. Charge-conserving approaches chen_energy-_2011 (); markidis_energy_2011 () using implicit time integration have also been proposed. Energy-conserving algorithms have the benefit of eliminating numerical heating, but suffer from several drawbacks. They generally do not conserve momentum langdon_energy-conserving_1973 (), sometimes suffer from increased noise, and heavily restrict the choice of particle shape evstatiev_variational_2013 (). However, these methods have seen widespread use due to their efficiency and simplicity.
It has been demonstrated that exact energy and momentum conservation in a particle code can only be achieved by depositing charge on a truncated Fourier basis evstatiev_variational_2013 (). This method has also been shown to eliminate the finite grid instability and reduce coupling between modes huang_finite_2016 (). However, due to the poor scaling and high computational cost of this approach (, where is the number of Fourier modes and is the number of particles), it has not been seriously considered in practice.
We present a similar algorithm to the one proposed in evstatiev_variational_2013 (), in which we model the charge density as a sum of shape functions in continuous space and perform the field solve with a truncated Fourier series. However, we reduce the computation time to , which is equivalent to that of conventional PIC with a spectral field solve, by making use of an Unequally Spaced FFT (USFFT) beylkin_fast_1995 (); dutt_fast_1993 ().
This paper is organized as follows: we begin by reviewing the standard PIC method in Section 2. We then propose a gridless algorithm based on evstatiev_variational_2013 () and demonstrate that it can be made efficient via the USFFT. In Section 3, we analyze the results of several numerical experiments, comparing our code to an implementation of conventional PIC and providing experimental confirmation of its energy conservation. Formal proofs of energy conservation in the continuous-time limit and momentum conservation are given in Section 4. In Section 5, we present an analysis of the algorithm’s asymptotic time complexity and a possible parallelization scheme along with the results of several scaling experiments. Section 6 contains a summary of our results and a discussion of possible generalizations.
2 Numerical Methods
The behavior of a collisionless plasma is given by the Vlasov-Poisson system:
This system is commonly solved via the particle-in-cell (PIC) algorithm birdsall_plasma_1985 (); hockney_computer_1988 (), which tracks particles in continuous phase space, while fields are tracked on a spatial grid with points. Each timestep begins by calculating the charge density at the grid points from particle positions, via deposition of charge on the grid, such that
where is some shape function corresponding to the weighting scheme. and are then calculated at the grid points from , usually via spectral methods, as follows:
After the field solve, the forces on the particles are calculated by interpolating the E-field from the grid to the particle positions as
In this paper, we assume without loss of generality that the same weighting scheme is used for both deposition and interpolation, but this need not be the case.
Our algorithm, termed Particle-in-Fourier (or PIF), begins by following conventional Fourier basis algorithms evstatiev_variational_2013 (). Instead of depositing the charge on a grid, we treat as a sum of shape functions (usually ) in continuous space and “deposit” on a truncated Fourier basis. This can be accomplished by calculating each mode of directly from the particle positions, as
where , for some integer such that , where is the number of modes (analogous to the number of grid cells in PIC). denotes the Fourier transform of the shape function, given by
A unique feature of the PIF method is the ability to use a physical particle shape given by as an alternative to filtering. In many ways, this is similar to PIC, except without a computational penalty for using higher order shape functions (e.g. Gaussian). Because convolution with the shape function is equivalent to multiplication in Fourier space and only needs to be computed once, the step time is independent of the weighting scheme, permitting the use of arbitrary shape functions with no additional computational cost.
We now perform the field solve, in a manner identical to conventional PIC, as
Using a similar approach to the deposition, we can determine the forces on the particles by summing over all modes of :
2.3 Unequally Spaced Fast Fourier Transform (USFFT)
Our algorithm relies on rapid evaluation of two computationally expensive sums: (done for every mode), and (done for every particle). Naively, computing these sums requires operations for particles and modes. However, we can reduce the computational cost by using an Unequally Spaced FFT (USFFT) beylkin_fast_1995 (); dutt_fast_1993 ().
The USFFT algorithm computes the sum
for all , where and are arbitrary complex coefficients. The dual version of the USFFT rapidly evaluates the sum
where are arbitrary complex coefficients. The computational cost is again . It is important that the accuracy of the USFFT is user-controlled, and since the result is guaranteed by the algorithm, one can use the USFFT in a manner similar to the FFT (tight accuracy estimates can be found in beylkin_fast_1995 ()).
The USFFT algorithm uses the FFT as one of the steps, and for this reason, it is helpful to estimate the computational cost of the USFFT in terms of that of the FFT. The USFFT yielding double precision accuracy in one dimension is times slower than the FFT of the same size. In two dimensions, the factor is (in both cases for a random distribution of points).
The first step of the algorithm can be understood as a projection on a subspace of multiresolution analysis. The second step applies the FFT to the projected data, and the last step corrects the computed values by a predetermined factor. In beylkin_fast_1995 () the projection step uses b-splines, whereas dutt_fast_1993 () uses Gaussians instead. Other choices are possible with a mild effect on the speed of the algorithm. The key to the speed is the splitting of operations between the spatial and the Fourier domains so that the combined effect is an application of a nearly ideal filter in the Fourier domain.
We use the USFFT by setting and for all particles, so that we can efficiently compute the sum in Eq. (11). Similarly, by taking as the amplitude of each mode of the electric field, we can compute the sum in Eq. (16). As a result, PIF has the same computational complexity as standard PIC. We analyze this further in Section 5.
3 Numerical Experiments
In this section, we test the behavior of the two methods on two classic kinetic problems in plasma physics for which PIC has been well verified with theory.
3.1 Two Stream Instability
An implementation of PIF was created, using a serial implementation of the USFFT beylkin_fast_1995 (). The performance and energy conservation of the code were compared to an implementation of conventional PIC with a linear particle shape, first order interpolation, and a spectral field solve. Shared-memory parallelism was implemented in both codes using OpenMP. Source code can be found at https://github.com/matt2718/ftpic.
The superior energy conservation of PIF becomes evident when we examine the the two stream instability. PIF and the reference PIC implementation were used to simulate the mixing of two counter-moving electron beams against a neutralizing background (see panel (i) of Fig. 1). Both codes were run with 10000 particles and 16 modes/grid cells for 20000 time steps, with the effective grid spacing equal to approximately . Fig. 2 plots the normalized field, kinetic, and total energy of each system. It is clear that the total energy is not conserved for conventional PIC. Fig. 3 illustrates the total energy over time for both simulations, demonstrating energy conservation of PIF, but not in PIC.
Aside from the difference in energy conservation, the two codes produced similar results. Fig. 1 illustrates the evolution of phase space over time in the PIF simulation. The two beams are represented by different colors, but all particles are identical.
3.2 Landau Damping
Our implementations of PIF and conventional PIC were used to simulate a Maxwellian plasma of 10000 particles with a sinusoidal density perturbation of the second Fourier mode given by . This was performed for both 32 grid cells in PIC and 32 modes in PIF () and for 128 grid cells in PIC and 128 modes in PIF (). The observed damping rate of the second mode of the E-field was compared to the theoretical rate, illustrated in Fig. 5 and Fig. 5.
At small grid spacings, we see that the results of the PIC simulation converge to those of PIF. PIF, however, accurately reproduces the theoretical damping rate regardless of the number of modes present.
4 Conservation Properties and Numerical Error
The standard PIC algorithm does not conserve energy. It is shown by Langdon langdon_effects_1970 () that, if the field energy is taken to be , then
and the time derivative of kinetic energy is
where is defined by
Therefore, total energy is conserved only when for all values of . However, due to grid aliasing, this condition only holds in the first Brillouin zone (, where is the grid wavenumber).
For two particles at and in a periodic system, the effective force on the second particle in PIF is given by
This is independent of displacement, so grid aliasing is not present in PIF, and in all Brillouin zones. Therefore, energy is conserved in the continuous-time limit.
Because the sum runs from to (ignoring 0) and contains a term, all cosine terms in the expansion will cancel out, so the effective force is odd and the total force on the system is 0. This can be generalized to a system with an arbitrary number of particles.
We have shown energy conservation in the continuous time limit, but an actual simulation involves discretization in the time domain. Experimentally, we observe that the single-timestep error in the total energy of PIF converges as , in agreement with evstatiev_variational_2013 () and huang_finite_2016 (). This is illustrated in Fig. 6.
5 Performance and Scaling
The asymptotic runtime of the USFFT algorithm is beylkin_fast_1995 (). Because the field solve and the particle push take and time respectively, the total asymptotic runtime of the particle in Fourier algorithm is for a single iteration. This is identical to the standard PIC algorithm, assuming the field solve is done spectrally. This scaling is demonstrated experimentally in Fig. 7
Though we only provide a 1D implementation, the same scheme can be extended to higher dimensions, while still scaling well. Higher dimensional versions of the USFFT library exist, requiring time (where is the dimensionality of the system). As an ordinary -dimensional FFT requires time, the scaling is identical to that of PIC in any number of dimensions.
In addition, because convolution with an arbitrary shape function corresponds to multiplication in Fourier space, increasing the width of the shape function beyond a single grid cell requires no additional time. This can provide a huge advantage for PIF as higher order shape functions (even Gaussian) are essentially free, whereas in PIC, higher order shape functions involve many more calculations and difficult-to-parallelize scatter operations.
For reasonable parameters, the term dominates over the term. This suggests the following parallelization scheme for the algorithm: We divide the particles between nodes, with each node performing the field solve for its own particles. Before the push step, the E-fields from every node are summed together.
In order to verify that this scaling holds, the two codes were run for 1000 time steps with 80000 particles and 64 grid cells/mode on a single 68-core Xeon Phi node of the Cori supercomputer at the National Energy Research Scientific Computing Center (NERSC). The number of OpenMP threads was varied from 1 to 64, with the problem size kept constant. Both algorithms exhibited similar strong scaling, as shown in Fig. 8. On average, PIF required 2.9 times as much time as PIC, and the two algorithms demonstrated comparable scaling.
Weak scaling was investigated by increasing the problem size, such that each run had 20000 particles per thread. The number of grid cells was kept constant between runs. Results are shown in Fig. 9. We see that PIF demonstrates better weak scaling than our implementation of PIC for large numbers of particles. We suspect that this is because of the difficulty of parallelizing the deposition step due to the scatter operations involved.
We have demonstrated that the PIF gridless scheme is a feasible approach to plasma simulation, as it can be implemented with comparable performance and identical scaling to the conventional PIC method while conserving both energy and momentum in the continuous-time limit. We have provided a proof of these conservation properties and verified them through several numerical experiments. In addition, we have verified that the results of the PIF model agree with theory for standard test cases such as Landau damping and the two stream instability.
We did find that PIF was slower than a reference implementation of PIC by a factor of approximately 2.9 in the 1D case. However, PIF demonstrated better weak and strong scaling and avoids costly scatter operations during the deposition step. Furthermore, the computational cost of PIF is independent of the particle shape function, permitting the use of higher order or Gaussian weighting schemes without additional time.
We expect that the PIF algorithm can be generalized to higher dimensional energy conserving electrostatic and electromagnetic models, as the field solves can still be performed with spectral methods birdsall_plasma_1985 (). This can be accomplished through the use of two and three dimensional USFFTs, still with a computational cost of beylkin_fast_1995 (). In addition, we expect that our method could be generalized to based codes parker_fully_1993 (), which are of particular interest to fusion plasmas.
Research supported by the United States Department of Energy under grants DE-FG02-08ER54954 and DE-SC000801.
the Fast Fourier Transform of Functions with Singularities,
Applied and Computational Harmonic Analysis 2 (4) (1995) 363–381.
A. Dutt, V. Rokhlin, Fast
Fourier Transforms for Nonequispaced Data, SIAM Journal on
Scientific Computing 14 (6) (1993) 1368–1393.
- C. K. Birdsall, A. B. Langdon, Plasma physics via computer simulation, McGraw-Hill, New York, 1985.
- R. W. Hockney, J. W. Eastwood, Computer simulation using particles, special student ed Edition, A. Hilger, Bristol [England] ; Philadelphia, 1988.
A. B. Langdon,
of the spatial grid in simulation plasmas, Journal of Computational Physics
6 (2) (1970) 247–267.
H. R. Lewis,
numerical approximations for Vlasov plasmas, Journal of Computational
Physics 6 (1) (1970) 136–141.
J. W. Eastwood,
virtual particle electromagnetic particle-mesh method, Computer Physics
Communications 64 (2) (1991) 252–266.
E. G. Evstatiev, B. A. Shadwick,
formulation of particle algorithms for kinetic plasma simulations, Journal
of Computational Physics 245 (2013) 376–398.
G. Chen, L. ChacÃ³n, D. C. Barnes,
energy- and charge-conserving, implicit, electrostatic particle-in-cell
algorithm, Journal of Computational Physics 230 (18) (2011) 7018–7036.
noises and instabilities in plasma simulation due to a spatial grid, Journal
of Computational Physics 10 (3) (1972) 475–486.
E. L. Lindman,
relation for computer-simulated plasmas, Journal of Computational Physics
5 (1) (1970) 13–22.
J. W. Brackbill, G. Lapenta,
Method to Suppress the Finite-Grid Instability in Plasma
Simulations, Journal of Computational Physics 114 (1) (1994) 77–84.
G. B. Jacobs, J. S. Hesthaven,
nodal discontinuous Galerkin particle-in-cell method on unstructured
grids, Journal of Computational Physics 214 (1) (2006) 96–121.
R. W. Hockney, S. P. Goel, J. W. Eastwood,
high-resolution computer models of a plasma, Journal of Computational
Physics 14 (2) (1974) 148–158.
F. E. Low, A
Lagrangian formulation of the Boltzmann-Vlasov equation for plasmas,
Proc. R. Soc. Lond. A 248 (1253) (1958) 282–287.
S. Markidis, G. Lapenta,
energy conserving particle-in-cell method, Journal of Computational Physics
230 (18) (2011) 7037–7052.
A. B. Langdon,
plasma simulation algorithms, Journal of Computational Physics 12 (2) (1973)
C. K. Huang, Y. Zeng, Y. Wang, M. D. Meyers, S. Yi, B. J. Albright,
grid instability and spectral fidelity of the electrostatic
Particle-In-Cell algorithm, Computer Physics Communications 207 (2016)
S. E. Parker, W. W. Lee,
A fully nonlinear
characteristic method for gyrokinetic simulation, Physics of Fluids B:
Plasma Physics 5 (1) (1993) 77–86.