Efficient Fourier Basis Particle Simulation
Abstract
The standard particleincell algorithm suffers from finite grid errors which break energy conservation, cause numerical dispersion, and create numerical instabilities. There exists a gridless alternative which bypasses the deposition step and calculates each Fourier mode of the charge density directly from the particle positions. We show that a gridless method can be computed efficiently through the use of an Unequally Spaced Fast Fourier Transform (USFFT) algorithm. After a spectral field solve, the forces on the particles are calculated via the inverse USFFT (a rapid solution of an approximate linear system) beylkin_fast_1995 (); dutt_fast_1993 (). We provide a one dimensional implementation of this algorithm with an asymptotic runtime of for each iteration, identical to the standard PIC algorithm (where is the number of particles and is the number of Fourier modes). Higher dimensional formulations would scale as , where is the spatial dimensionality of the problem. We demonstrate superior energy conservation and reduced noise, as well as convergence of the energy conservation at small time steps.
keywords:
Numerical, Plasma, Particleincell, Energy conserving, Momentum conserving, Fourier transform1 Introduction
A common approach to numerically solving the VlasovPoisson system is to represent the distribution function using particles, with fields solved on a grid and interpolated at the particle positions birdsall_plasma_1985 (); hockney_computer_1988 (). This scheme, known as the particleincell (PIC) method, has been enormously successful for simulating plasmas and is used in a wide variety of applications, but does not conserve total energy langdon_effects_1970 (). Energyconserving schemes based on variational formulations have been proposed lewis_energyconserving_1970 (); eastwood_virtual_1991 (), but generally do not conserve momentum because of a lack of translational invariance evstatiev_variational_2013 (), though the momentum error can be kept small in many cases due to the choice of integrator chen_energy_2011 ().
In addition to the lack of energy conservation, PIC also suffers from a finite grid instability, in which sufficiently high Fourier modes experience exponential growth okuda_nonphysical_1972 (); lindman_dispersion_1970 () due to coupling with lower modes. This phenomenon causes numerical heating, which saturates when the Debye length is on the order of the grid spacing okuda_nonphysical_1972 (); evstatiev_variational_2013 (). The finite grid instability is of particular relevance to problems involving cold plasmas or multiple length scales okuda_nonphysical_1972 (); brackbill_method_1994 (). Various approaches have been proposed to reduce this instability, such as the use of smoother particle shapes jacobs_highorder_2006 (), grid jiggling brackbill_method_1994 (), filtering hockney_quiet_1974 (), and highorder Galerkin methods jacobs_highorder_2006 (), but many of these schemes suffer from issues suck as a lack of charge conservation and high computational costs.
Several energyconserving particle algorithms lewis_energyconserving_1970 (); eastwood_virtual_1991 () based on the Lagrangian formulated by Low low_lagrangian_1958 () have been suggested as alternatives to PIC. Chargeconserving approaches chen_energy_2011 (); markidis_energy_2011 () using implicit time integration have also been proposed. Energyconserving algorithms have the benefit of eliminating numerical heating, but suffer from several drawbacks. They generally do not conserve momentum langdon_energyconserving_1973 (), sometimes suffer from increased noise, and heavily restrict the choice of particle shape evstatiev_variational_2013 (). However, these methods have seen widespread use due to their efficiency and simplicity.
It has been demonstrated that exact energy and momentum conservation in a particle code can only be achieved by depositing charge on a truncated Fourier basis evstatiev_variational_2013 (). This method has also been shown to eliminate the finite grid instability and reduce coupling between modes huang_finite_2016 (). However, due to the poor scaling and high computational cost of this approach (, where is the number of Fourier modes and is the number of particles), it has not been seriously considered in practice.
We present a similar algorithm to the one proposed in evstatiev_variational_2013 (), in which we model the charge density as a sum of shape functions in continuous space and perform the field solve with a truncated Fourier series. However, we reduce the computation time to , which is equivalent to that of conventional PIC with a spectral field solve, by making use of an Unequally Spaced FFT (USFFT) beylkin_fast_1995 (); dutt_fast_1993 ().
This paper is organized as follows: we begin by reviewing the standard PIC method in Section 2. We then propose a gridless algorithm based on evstatiev_variational_2013 () and demonstrate that it can be made efficient via the USFFT. In Section 3, we analyze the results of several numerical experiments, comparing our code to an implementation of conventional PIC and providing experimental confirmation of its energy conservation. Formal proofs of energy conservation in the continuoustime limit and momentum conservation are given in Section 4. In Section 5, we present an analysis of the algorithm’s asymptotic time complexity and a possible parallelization scheme along with the results of several scaling experiments. Section 6 contains a summary of our results and a discussion of possible generalizations.
2 Numerical Methods
The behavior of a collisionless plasma is given by the VlasovPoisson system:
(1) 
(2) 
2.1 ParticleinCell
This system is commonly solved via the particleincell (PIC) algorithm birdsall_plasma_1985 (); hockney_computer_1988 (), which tracks particles in continuous phase space, while fields are tracked on a spatial grid with points. Each timestep begins by calculating the charge density at the grid points from particle positions, via deposition of charge on the grid, such that
(3) 
where is some shape function corresponding to the weighting scheme. and are then calculated at the grid points from , usually via spectral methods, as follows:
(4)  
(5)  
(6)  
(7) 
After the field solve, the forces on the particles are calculated by interpolating the Efield from the grid to the particle positions as
(8) 
In this paper, we assume without loss of generality that the same weighting scheme is used for both deposition and interpolation, but this need not be the case.
2.2 ParticleinFourier
Our algorithm, termed ParticleinFourier (or PIF), begins by following conventional Fourier basis algorithms evstatiev_variational_2013 (). Instead of depositing the charge on a grid, we treat as a sum of shape functions (usually ) in continuous space and “deposit” on a truncated Fourier basis. This can be accomplished by calculating each mode of directly from the particle positions, as
(9)  
(10)  
(11) 
where , for some integer such that , where is the number of modes (analogous to the number of grid cells in PIC). denotes the Fourier transform of the shape function, given by
(12) 
A unique feature of the PIF method is the ability to use a physical particle shape given by as an alternative to filtering. In many ways, this is similar to PIC, except without a computational penalty for using higher order shape functions (e.g. Gaussian). Because convolution with the shape function is equivalent to multiplication in Fourier space and only needs to be computed once, the step time is independent of the weighting scheme, permitting the use of arbitrary shape functions with no additional computational cost.
We now perform the field solve, in a manner identical to conventional PIC, as
(13)  
(14) 
Using a similar approach to the deposition, we can determine the forces on the particles by summing over all modes of :
(15)  
(16) 
2.3 Unequally Spaced Fast Fourier Transform (USFFT)
Our algorithm relies on rapid evaluation of two computationally expensive sums: (done for every mode), and (done for every particle). Naively, computing these sums requires operations for particles and modes. However, we can reduce the computational cost by using an Unequally Spaced FFT (USFFT) beylkin_fast_1995 (); dutt_fast_1993 ().
The USFFT algorithm computes the sum
(17) 
for all , where and are arbitrary complex coefficients. The dual version of the USFFT rapidly evaluates the sum
(18) 
where are arbitrary complex coefficients. The computational cost is again . It is important that the accuracy of the USFFT is usercontrolled, and since the result is guaranteed by the algorithm, one can use the USFFT in a manner similar to the FFT (tight accuracy estimates can be found in beylkin_fast_1995 ()).
The USFFT algorithm uses the FFT as one of the steps, and for this reason, it is helpful to estimate the computational cost of the USFFT in terms of that of the FFT. The USFFT yielding double precision accuracy in one dimension is times slower than the FFT of the same size. In two dimensions, the factor is (in both cases for a random distribution of points).
The first step of the algorithm can be understood as a projection on a subspace of multiresolution analysis. The second step applies the FFT to the projected data, and the last step corrects the computed values by a predetermined factor. In beylkin_fast_1995 () the projection step uses bsplines, whereas dutt_fast_1993 () uses Gaussians instead. Other choices are possible with a mild effect on the speed of the algorithm. The key to the speed is the splitting of operations between the spatial and the Fourier domains so that the combined effect is an application of a nearly ideal filter in the Fourier domain.
We use the USFFT by setting and for all particles, so that we can efficiently compute the sum in Eq. (11). Similarly, by taking as the amplitude of each mode of the electric field, we can compute the sum in Eq. (16). As a result, PIF has the same computational complexity as standard PIC. We analyze this further in Section 5.
3 Numerical Experiments
In this section, we test the behavior of the two methods on two classic kinetic problems in plasma physics for which PIC has been well verified with theory.
3.1 Two Stream Instability
An implementation of PIF was created, using a serial implementation of the USFFT beylkin_fast_1995 (). The performance and energy conservation of the code were compared to an implementation of conventional PIC with a linear particle shape, first order interpolation, and a spectral field solve. Sharedmemory parallelism was implemented in both codes using OpenMP. Source code can be found at https://github.com/matt2718/ftpic.
The superior energy conservation of PIF becomes evident when we examine the the two stream instability. PIF and the reference PIC implementation were used to simulate the mixing of two countermoving electron beams against a neutralizing background (see panel (i) of Fig. 1). Both codes were run with 10000 particles and 16 modes/grid cells for 20000 time steps, with the effective grid spacing equal to approximately . Fig. 2 plots the normalized field, kinetic, and total energy of each system. It is clear that the total energy is not conserved for conventional PIC. Fig. 3 illustrates the total energy over time for both simulations, demonstrating energy conservation of PIF, but not in PIC.
Aside from the difference in energy conservation, the two codes produced similar results. Fig. 1 illustrates the evolution of phase space over time in the PIF simulation. The two beams are represented by different colors, but all particles are identical.
3.2 Landau Damping
Our implementations of PIF and conventional PIC were used to simulate a Maxwellian plasma of 10000 particles with a sinusoidal density perturbation of the second Fourier mode given by . This was performed for both 32 grid cells in PIC and 32 modes in PIF () and for 128 grid cells in PIC and 128 modes in PIF (). The observed damping rate of the second mode of the Efield was compared to the theoretical rate, illustrated in Fig. 5 and Fig. 5.
At small grid spacings, we see that the results of the PIC simulation converge to those of PIF. PIF, however, accurately reproduces the theoretical damping rate regardless of the number of modes present.
4 Conservation Properties and Numerical Error
The standard PIC algorithm does not conserve energy. It is shown by Langdon langdon_effects_1970 () that, if the field energy is taken to be , then
(19) 
and the time derivative of kinetic energy is
(20) 
where is defined by
(21) 
Therefore, total energy is conserved only when for all values of . However, due to grid aliasing, this condition only holds in the first Brillouin zone (, where is the grid wavenumber).
For two particles at and in a periodic system, the effective force on the second particle in PIF is given by
(22)  
(23)  
(24) 
This is independent of displacement, so grid aliasing is not present in PIF, and in all Brillouin zones. Therefore, energy is conserved in the continuoustime limit.
Because the sum runs from to (ignoring 0) and contains a term, all cosine terms in the expansion will cancel out, so the effective force is odd and the total force on the system is 0. This can be generalized to a system with an arbitrary number of particles.
We have shown energy conservation in the continuous time limit, but an actual simulation involves discretization in the time domain. Experimentally, we observe that the singletimestep error in the total energy of PIF converges as , in agreement with evstatiev_variational_2013 () and huang_finite_2016 (). This is illustrated in Fig. 6.
5 Performance and Scaling
The asymptotic runtime of the USFFT algorithm is beylkin_fast_1995 (). Because the field solve and the particle push take and time respectively, the total asymptotic runtime of the particle in Fourier algorithm is for a single iteration. This is identical to the standard PIC algorithm, assuming the field solve is done spectrally. This scaling is demonstrated experimentally in Fig. 7
Though we only provide a 1D implementation, the same scheme can be extended to higher dimensions, while still scaling well. Higher dimensional versions of the USFFT library exist, requiring time (where is the dimensionality of the system). As an ordinary dimensional FFT requires time, the scaling is identical to that of PIC in any number of dimensions.
In addition, because convolution with an arbitrary shape function corresponds to multiplication in Fourier space, increasing the width of the shape function beyond a single grid cell requires no additional time. This can provide a huge advantage for PIF as higher order shape functions (even Gaussian) are essentially free, whereas in PIC, higher order shape functions involve many more calculations and difficulttoparallelize scatter operations.
For reasonable parameters, the term dominates over the term. This suggests the following parallelization scheme for the algorithm: We divide the particles between nodes, with each node performing the field solve for its own particles. Before the push step, the Efields from every node are summed together.
In order to verify that this scaling holds, the two codes were run for 1000 time steps with 80000 particles and 64 grid cells/mode on a single 68core Xeon Phi node of the Cori supercomputer at the National Energy Research Scientific Computing Center (NERSC). The number of OpenMP threads was varied from 1 to 64, with the problem size kept constant. Both algorithms exhibited similar strong scaling, as shown in Fig. 8. On average, PIF required 2.9 times as much time as PIC, and the two algorithms demonstrated comparable scaling.
Weak scaling was investigated by increasing the problem size, such that each run had 20000 particles per thread. The number of grid cells was kept constant between runs. Results are shown in Fig. 9. We see that PIF demonstrates better weak scaling than our implementation of PIC for large numbers of particles. We suspect that this is because of the difficulty of parallelizing the deposition step due to the scatter operations involved.
6 Discussion
We have demonstrated that the PIF gridless scheme is a feasible approach to plasma simulation, as it can be implemented with comparable performance and identical scaling to the conventional PIC method while conserving both energy and momentum in the continuoustime limit. We have provided a proof of these conservation properties and verified them through several numerical experiments. In addition, we have verified that the results of the PIF model agree with theory for standard test cases such as Landau damping and the two stream instability.
We did find that PIF was slower than a reference implementation of PIC by a factor of approximately 2.9 in the 1D case. However, PIF demonstrated better weak and strong scaling and avoids costly scatter operations during the deposition step. Furthermore, the computational cost of PIF is independent of the particle shape function, permitting the use of higher order or Gaussian weighting schemes without additional time.
We expect that the PIF algorithm can be generalized to higher dimensional energy conserving electrostatic and electromagnetic models, as the field solves can still be performed with spectral methods birdsall_plasma_1985 (). This can be accomplished through the use of two and three dimensional USFFTs, still with a computational cost of beylkin_fast_1995 (). In addition, we expect that our method could be generalized to based codes parker_fully_1993 (), which are of particular interest to fusion plasmas.
Acknowledgments
Research supported by the United States Department of Energy under grants DEFG0208ER54954 and DESC000801.
References
References

G. Beylkin,
On
the Fast Fourier Transform of Functions with Singularities,
Applied and Computational Harmonic Analysis 2 (4) (1995) 363–381.
doi:10.1006/acha.1995.1026.
URL http://www.sciencedirect.com/science/article/pii/S1063520385710263 
A. Dutt, V. Rokhlin, Fast
Fourier Transforms for Nonequispaced Data, SIAM Journal on
Scientific Computing 14 (6) (1993) 1368–1393.
doi:10.1137/0914081.
URL https://epubs.siam.org/doi/abs/10.1137/0914081  C. K. Birdsall, A. B. Langdon, Plasma physics via computer simulation, McGrawHill, New York, 1985.
 R. W. Hockney, J. W. Eastwood, Computer simulation using particles, special student ed Edition, A. Hilger, Bristol [England] ; Philadelphia, 1988.

A. B. Langdon,
Effects
of the spatial grid in simulation plasmas, Journal of Computational Physics
6 (2) (1970) 247–267.
doi:10.1016/00219991(70)900240.
URL http://www.sciencedirect.com/science/article/pii/0021999170900240 
H. R. Lewis,
Energyconserving
numerical approximations for Vlasov plasmas, Journal of Computational
Physics 6 (1) (1970) 136–141.
doi:10.1016/00219991(70)900124.
URL http://www.sciencedirect.com/science/article/pii/0021999170900124 
J. W. Eastwood,
The
virtual particle electromagnetic particlemesh method, Computer Physics
Communications 64 (2) (1991) 252–266.
doi:10.1016/00104655(91)90036K.
URL http://www.sciencedirect.com/science/article/pii/001046559190036K 
E. G. Evstatiev, B. A. Shadwick,
Variational
formulation of particle algorithms for kinetic plasma simulations, Journal
of Computational Physics 245 (2013) 376–398.
doi:10.1016/j.jcp.2013.03.006.
URL http://www.sciencedirect.com/science/article/pii/S0021999113001800 
G. Chen, L. ChacÃ³n, D. C. Barnes,
An
energy and chargeconserving, implicit, electrostatic particleincell
algorithm, Journal of Computational Physics 230 (18) (2011) 7018–7036.
doi:10.1016/j.jcp.2011.05.031.
URL http://www.sciencedirect.com/science/article/pii/S0021999111003421 
H. Okuda,
Nonphysical
noises and instabilities in plasma simulation due to a spatial grid, Journal
of Computational Physics 10 (3) (1972) 475–486.
doi:10.1016/00219991(72)900484.
URL http://www.sciencedirect.com/science/article/pii/0021999172900484 
E. L. Lindman,
Dispersion
relation for computersimulated plasmas, Journal of Computational Physics
5 (1) (1970) 13–22.
doi:10.1016/00219991(70)900495.
URL http://www.sciencedirect.com/science/article/pii/0021999170900495 
J. W. Brackbill, G. Lapenta,
A
Method to Suppress the FiniteGrid Instability in Plasma
Simulations, Journal of Computational Physics 114 (1) (1994) 77–84.
doi:10.1006/jcph.1994.1150.
URL http://www.sciencedirect.com/science/article/pii/S0021999184711508 
G. B. Jacobs, J. S. Hesthaven,
Highorder
nodal discontinuous Galerkin particleincell method on unstructured
grids, Journal of Computational Physics 214 (1) (2006) 96–121.
doi:10.1016/j.jcp.2005.09.008.
URL http://www.sciencedirect.com/science/article/pii/S0021999105004250 
R. W. Hockney, S. P. Goel, J. W. Eastwood,
Quiet
highresolution computer models of a plasma, Journal of Computational
Physics 14 (2) (1974) 148–158.
doi:10.1016/00219991(74)900102.
URL http://www.sciencedirect.com/science/article/pii/0021999174900102 
F. E. Low, A
Lagrangian formulation of the BoltzmannVlasov equation for plasmas,
Proc. R. Soc. Lond. A 248 (1253) (1958) 282–287.
doi:10.1098/rspa.1958.0244.
URL http://rspa.royalsocietypublishing.org/content/248/1253/282 
S. Markidis, G. Lapenta,
The
energy conserving particleincell method, Journal of Computational Physics
230 (18) (2011) 7037–7052.
doi:10.1016/j.jcp.2011.05.033.
URL http://www.sciencedirect.com/science/article/pii/S0021999111003445 
A. B. Langdon,
âEnergyconservingâ
plasma simulation algorithms, Journal of Computational Physics 12 (2) (1973)
247–268.
doi:10.1016/S00219991(73)800142.
URL http://www.sciencedirect.com/science/article/pii/S0021999173800142 
C. K. Huang, Y. Zeng, Y. Wang, M. D. Meyers, S. Yi, B. J. Albright,
Finite
grid instability and spectral fidelity of the electrostatic
ParticleInCell algorithm, Computer Physics Communications 207 (2016)
123–135.
doi:10.1016/j.cpc.2016.05.021.
URL http://www.sciencedirect.com/science/article/pii/S001046551630145X 
S. E. Parker, W. W. Lee,
A fully nonlinear
characteristic method for gyrokinetic simulation, Physics of Fluids B:
Plasma Physics 5 (1) (1993) 77–86.
doi:10.1063/1.860870.
URL https://aip.scitation.org/doi/abs/10.1063/1.860870