Low rank approximations for the DEPOSIT computer code.

Low rank approximations for the DEPOSIT computer code.

Mikhail S. Litsarev m.litsarev@skolkovotech.ru Ivan V. Oseledets Skolkovo Institute of Science and Technology, Novaya St. 100, Skolkovo, Odintsovsky district, 143025 Moscow Region, Russia Institute of Numerical Mathematics, Gubkina St. 8, 119333 Moscow, Russia
Abstract

We present an efficient technique based on low-rank separated approximations for the computation of three-dimensional integrals in the computer code DEPOSIT that describes ion-atomic collision processes. Implementation of this technique decreases the total computational time by a factor of . The general concept can be applied to more complicated models.

keywords:
Low rank approximation, 2D cross, Separated representation, Exponential sums, 3D Integration, Slater wave function, Ion-atom collisions, Electron loss
journal: Computer Physics Communications

July 6, 2019

1 Introduction.

The computer code DEPOSIT litsarev-cpc-2013 () is intended to describe ion-atomic collision processes. It allows to calculate total and multiple electron loss cross sections and ( is the number of ionized electrons), the deposited energies , ( is the impact parameter of the projectile ion) and ionization probabilities . It is based on the energy deposition model introduced by N. Bohr bohr1915 () and developed further by A. Russek and J. Meli RussekMeli1970 (), C.L. Cocke cocke_pra1979 (), and V.P. Shevelko at al. litsarev-jpb-2008 (). Theoretical development of the DEPOSIT is presented in litsarev-jpb-2008 (); litsarev-nimb-2009 (); litsarev-jpb-2009 (); litsarev-jpb-2010 (). Examples of calculations are reported in litsarev-springer-2012 (); litsarev-hci-2012 (); uspekhi2013 (); litsarev-jpb-2014 (). Detailed description of the code and user guide are given in litsarev-cpc-2013 ().

The cross sections and ionization probabilities needed for estimation of the losses and lifetimes of fast ion beams, background pressures and pumping requirements in accelerators and storage rings are, in fact, functionals of the deposited energy , which in turn is a three-dimensional integral over the coordinate space. To calculate any of these parameters one has to compute in all points of the -mesh.

The integral is a bottleneck of the program, and it is required to be done as fast as possible. In the previous work litsarev-cpc-2013 () an advanced quadrature technique was used, and the computational time has appeared to be much faster in comparison with direct usage of uniform meshes. It takes several seconds to compute one point for one atomic shell at fixed . For complex ions, the total computation takes few hours on one processor core and is not enough fast. To overcome this issue a fully scalable parallel variant of the algorithm was proposed. Nevertheless, the computational time is still large.

In this work, we present an entirely different approach for computing in many points of the -mesh, based on low rank approximations of matrices and tensors. The main idea is to approximate the functions to be integrated by a sum of products of univariate functions, effectively decreasing the dimensionality of the problem. This involves active usage of numerical and analytical tools.

The definition of involves a function of two variables (the energy gain during an ion-atomic collision) and a function of three variables (electron density in Slater-type approximation). Details and definitions are given in Section 2.1. The integral is computed in Cartesian coordinates, which are better suited for the construction of separable representation than spherical coordinates used in the original DEPOSIT code.

In Section 2.2 for a function of two variables we use the pseudo-skeleton decomposition of matrices tee-mosaic-1996 (); gtz-psa-1997 (); gtz-maxvol-1997 () computed via a variant of the incomplete cross approximation algorithm tee-cross-2000 (). We show numerically that the function in question can be well-approximated by a separable function in Section 2.5. Thus, the approximation can be computed in time, where is the number of grid points in one dimension.

In Section 2.3 the Slater-type function of three variables is decomposed by the exponential sums approach beylkin-expsum-2005 (); beylkin-expsumrev-2010 (). The integral is immediately reduced to a two-dimensional one of a simpler structure.

Combining these two representations we obtain in Section 2.4 an efficient algorithm with scaling, in comparison with complexity for direct integration over a three-dimensional mesh. The computation of on the whole -mesh takes less then one minute and total speedup of the program is about times. Illustrative examples are given in Section 2.5.

All the equations related to the physical model are written in atomic units.

2 Numerical procedure

2.1 Statement of the problem

The deposited energy is defined as a three-dimensional integral over coordinate space centered in the projectile ion.

(1)

The sum here is over all atomic shells denoted by ,  is the principle quantum number and is the orbital quantum number. The electron density is taken in a Slater-type approximation

(2)

with integer , real positive and normalization condition

(3)

where is the number of electrons in a -shell. The gain of kinetic energy is a smooth finite function of parameter without any singularities. The impact parameter of the ion’s electron is a function of and . In frame of the moving projectile the following equality holds

(4)

For details we refer the reader to the paper litsarev-cpc-2013 (). In Cartesian coordinates as a function of parameter depends only on and as it follows from the equation (4)

(5)

Thus, we need to compute the integral (5). From here and bellow index will be skipped for the sake of simplicity and only one shell will be considered in the following equations.

System -Shell (sec) (sec) (sec)
24 2.41 4096
24 2.40
26 2.58
29 2.64
30 2.71
30 2.70
26 2.17 4096
27 2.58
30 2.69
30 2.77
30 2.75
30 2.71
30 2.69
Table 1: Ranks of the decomposition (32) calculated by the incomplete cross approximation algorithm tee-cross-2000 () for the energy gain . Two cases are considered: collision of ions with the Oxigen atom at a collision energy MeV/u and collision of ions with the Xenon atom at a collision energy MeV/u. Number of the -mesh points is taken equal to , number of the -mesh points is taken equal to in correspondence to the equations (26) and (30), . Accuracy means relative error of the approximation in the Frobenious norm. The calculations were carried out on GHz Intel Core i5 processor. Column corresponds to the time the cross algorithm takes. The numerical results confirm the almost linear scaling of the approach in .

2.2 Low rank approximation.

Let be a function of two variables where point is in a certain rectangle . The function is said to be in the separated form if it can be represented as a sum of products of univariate functions:

(6)

The minimal number such that (6) exists will be called separation rank. Direct generalization of (6) to multivariate functions is referred to as canonical polyadic (CP, also known as CANDECOMP/PARAFAC) kolda-review-2009 ().

If the function is in the separated form, the integration is simplified a lot. Indeed,

(7)

and the problem is reduced to the computation of one-dimensional integrals, which can be computed using fewer quadrature points than the original integral.

The discretization of one-dimensional integrals in (7) by some quadrature formula with nodes , , , and weights , , leads to the approximation

(8)

On the other hand, direct two-dimensional quadrature with separated weights in and can be used for the original integral:

(9)

Comparison of two representations (8) and (9) leads to the following discrete approximation problem

(10)

which is a discrete analogue of (6). Equation (10) can be written in the matrix form:

(11)

where is an matrix with elements , is an matrix with elements , is an matrix with elements and is an diagonal matrix with elements on the diagonal. This is a standard low-rank approximation problem for a given matrix. Provided that a good low-rank approximation exists, there are very efficient cross approximation algorithms tee-cross-2000 (); bebe-2000 () that need only elements of a matrix to be computed.

By using of our implementation of the cross approximation algorithm we decompose the energy gain in the form (10). In Table 1 the ranks and other numerical parameters are given for particular systems. Description of these parameters can be found in Section 2.5.

2.3 Exponential sums.

For a function defined in (2) the separation of variables can be done analytically beylkin-expsum-2005 (); hackbra-expsum-2005 (); GHK-ten_inverse_ellipt-2005 (); beylkin-expsumrev-2010 (). The main idea is to approximate the Slater density function by a sum of Gaussians

(12)

Once the approximation (12) is computed, the separation of variables in Cartesian coordinates comes for free

(13)

The technique for the computation of the nodes and the weights is based on the computation of the inverse Laplace transform.

Let us consider a function such that its Laplace transform is function

(14)

of the following form:

(15)

where and are parameters of the Slater density (2). The inverse Laplace transform can be computed analytically for the known . In A we present explicit expressions for the functions corresponding to the functions (14) for integer and real positive .

Once (14) is given and the function is known, the integral (14) is approximated by a quadrature formula

(16)

where and are quadrature weights and nodes, respectively. The procedure to compute the weights and the nodes was taken from the paper beylkin-expsumrev-2010 (). For the reader’s convenience we give the formula and its derivation in  B.

According to equation (12)

(17)

It appears that only several quadrature points (at fixed ) are required to achieve the accuracy of the expansion of order .

2.4 Fast computation of .

The three-dimensional integral defined in (5) can be reduced to a two-dimensional integral by means of the decomposition (13)

(18)

and analytical evaluation of the one-dimensional Gaussian integral

(19)
(20)

Suppose that has been decomposed as follows

(21)

Then the integration (20) can be reduced to a sequence of one-dimensional integrations.

(22)
(23)
(24)

For the numerical approximation of the integrals (23) and (24) we use the quadrature formula with uniform quadrature nodes (although any suitable quadrature can be used)

(25)
(26)
(27)
(28)

We sample the impact parameter (which can take only positive values) with the same step

(29)

This allows us to introduce a new variable discretized as

(30)

and such that for the boundary conditions (28), (29), (30)

(31)

The approximation problem (21) reduces to a low-rank approximation of the extended matrix

(32)

This should be done only once (using the cross approximation algorithm), and the final approximation of the integral (27) reads

(33)

The calculation of can be summarized in the following algorithm.

1:  for every -shell of the projectile ion do
2:     compute the decomposition (12) for
3:     compute the cross approximation for the matrix defined in (32)
4:     for  do
5:        for  do
6:           compute the integral defined in (25)
7:     for every required do
8:        for  do
9:           for  do
10:              compute the integral defined in (33)
11:        compute , equation (22)
System -Shell ( sec) (sec)
Table 2: Timings to compute at fixed are presented for two cases: the DEPOSIT code (old) and the code based on the separated representations (22) . Collision systems are the same as in Table 1. Number of terms in the expansion (13) is labeled by . The calculations were carried out for accuracy and mesh with points. The last column shows the speedup of the program.

2.5 Numerical experiments.

The most important parameter in (32) is the rank . It determines the complexity of the algorithm (the smaller , the better). In Table 1 we present the ranks (and other numerical parameters) calculated for the energy gain corresponding to different ion-atomic collisions. As it follows from the numerical experiments, the ranks are small. It means that the cross decomposition allows to decrease the size of the problem from elements to elements where .

In Table 2 we present the program speedup for every atomic shell. Details are given in the caption of the table. In sums (25) and (33) the terms less then were thrown out for every and . It is readily seen, that the use of the technique based on the separated representations (22) allows to decrease the total time to compute by a factor of compared to the previous version. In practice the computational time is reduced from several hours to one minute or less on the same hardware.

3 Conclusions and future work

We proposed a new technique for the computation of three-dimensional integrals based on low-rank and separated representation, that significantly reduces the computational time. The general concept can be applied to more complicated models (like ion-molecular collisions with electron loss and charge-changing processes) that lead to multidimensional integrals. For the multidimensional case we plan to use the fast approximation techniques based on the tensor train (TT) format osel-tt-2011 ().

Acknowledgements

This research was partially supported by RFBR grants 12-01-00546-a, 14-01-00804-a, 13-01-12061-ofi-m.

Appendix A Inverse Laplace transform sources

For integer and real positive the inverse Laplace transform of from equation (14) may be calculated analytically and expressed via the Kummer’s confluent hypergeometric function (abramowitz-stegun (), chapter  13) as follows

(34)

where

(35)

and is the Gamma function.

Below we present the most interesting explicitly. Due to the difference of the normalization conditions in spherical and Cartesian coordinates for the Slater density (2)

(36)

the parameter is related to the parameter as follows

(37)

The number of electrons in the shell is labeled as . The parameter is greater or equal to unity. It is an integer or half-integer depending on the principal quantum number and the orbital quantum number of the atomic shell. Details can be found in slater1960 (); shevelko1993 (). For example, , ; , ; , . Finally,

Appendix B Quadrature formula for the Laplace integral

To obtain the decomposition (12) for given and we make a substitution into the equation (15)

(38)

then introduce another variable

(39)

Good news is that the function under the integral (39) has exponential decay both in the spatial and frequency domains, therefore the truncated trapezoidal (or more advanced) rule gives the optimal convergence rate. The final approximation has the form

(40)

where parameters of the formula

(41)

have to be selected in such a way that the resulting quadrature formula approximates the integral for a wide range of parameter . Typically, the choice , , and gives good accuracy (). As an example, in Table 2 the required number of terms in sum (40) is presented. Accurate error analysis can be found in beylkin-expsumrev-2010 ().

References

  • (1) Litsarev, M. S., Computer Physics Communications 184 (2013) 432.
  • (2) Bohr, N., Philosophical Magazine Series 6 30 (1915) 581.
  • (3) Russek, A. and Meli, J., Physica 46 (1970) 222.
  • (4) Cocke, C. L., Phys. Rev. A 20 (1979) 749.
  • (5) Shevelko, V. P., Litsarev, M. S., and Tawara, H., Journal of Physics B: Atomic, Molecular and Optical Physics 41 (2008) 115204.
  • (6) Song, M.-Y., Litsarev, M. S., Shevelko, V. P., Tawara, H., and Yoon, J.-S., Nuclear Instruments and Methods in Physics Research Section B: Beam Interactions with Materials and Atoms 267 (2009) 2369 .
  • (7) Shevelko, V. P., Litsarev, M. S., Song, M.-Y., Tawara, H., and Yoon, J.-S., Journal of Physics B: Atomic, Molecular and Optical Physics 42 (2009) 065202.
  • (8) Shevelko, V. P., Kato, D., Litsarev, M. S., and Tawara, H., Journal of Physics B: Atomic, Molecular and Optical Physics 43 (2010) 215202.
  • (9) Shevelko, V. et al., Electron loss and capture processes in collisions of heavy many-electron ions with neutral atoms, in Atomic Processes in Basic and Applied Physics, edited by Shevelko, V. and Tawara, H., volume 68 of Springer Series on Atomic, Optical, and Plasma Physics, pages 125 – 152, 2012.
  • (10) Litsarev, M. S. and Shevelko, V. P., Physica Scripta 2013 (2013) 014037.
  • (11) Tolstikhina, I. Y. and Shevelko, V. P., Physics-Uspekhi 56 (2013) 213.
  • (12) Tolstikhina, I. Y. et al., Journal of Physics B: Atomic, Molecular and Optical Physics 47 (2014) 035206.
  • (13) Tyrtyshnikov, E. E., Calcolo 33 (1996) 47.
  • (14) Goreinov, S. A., Tyrtyshnikov, E. E., and Zamarashkin, N. L., Linear Algebra Appl. 261 (1997) 1.
  • (15) Goreinov, S. A., Zamarashkin, N. L., and Tyrtyshnikov, E. E., Mathematical Notes 62 (1997) 515.
  • (16) Tyrtyshnikov, E. E., Computing 64 (2000) 367.
  • (17) Beylkin, G. and Monzón, L., Appl. Comput. Harm. Anal. 19 (2005) 17.
  • (18) Beylkin, G. and Monzón, L., Appl. Comput. Harm. Anal. 28 (2010) 131.
  • (19) Kolda, T. G. and Bader, B. W., SIAM Review 51 (2009) 455.
  • (20) Bebendorf, M., Numer. Mathem. 86 (2000) 565.
  • (21) Hackbusch, W. and Braess, D., IMA J. Numer. Anal. 25 (2005) 685.
  • (22) Gavrilyuk, I. P., Hackbusch, W., and Khoromskij, B. N., Computing (2005) 131.
  • (23) Oseledets, I. V., SIAM J. Sci. Comput. 33 (2011) 2295.
  • (24) Abramowitz, M. and Stegun, I. A., Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, Dover, New York, ninth dover printing, tenth gpo printing edition, 1964.
  • (25) Slater, J., Quantum theory of atomic structure, International series in pure and applied physics, McGraw-Hill, New York, 1960.
  • (26) Shevelko, V. P. and Vainshtein, L. A., Atomic physics for hot plasmas, Institute of Physics Pub., 1993.
Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
""
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
   
Add comment
Cancel
Loading ...
60257
This is a comment super asjknd jkasnjk adsnkj
Upvote
Downvote
""
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters
Submit
Cancel

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test
Test description