Abstract
We consider a type of nonnormal approximation of infinitely divisible distributions that combines compound Poisson, Gamma, and normal distributions. The approximation relies on higher orders of cumulant matching to obtain higher rates of approximation error decay. The parameters of the approximation are easy to set. In many cases, the computational complexity of random sampling from the approximating distribution is of the same order as that of normal approximation. Error bounds in terms of total variation distance are derived. Both the univariate and the multivariate cases of the approximation are considered.
Keywords and phrases. Infinitely divisible; normal approximation; compound Poisson approximation; Gamma approximation; cumulant matching; sampling
2000 Mathematics Subject Classifications: Primary 60E07; Secondary 60G51.
Nonnormal small jump approximation of infinitely divisible distributions
Zhiyi Chi
Department of Statistics
University of Connecticut
Storrs, CT 06269, USA
Email: zhiyi.chi@uconn.edu
July 5, 2019 (submitted for journal publication on March 28, 2013)
1 Introduction
Simulation of infinitely divisible (i.d.) random variables has many applications. In most cases, closed-form expressions for i.d. distributions are unavailable, so good approximation methods are desired. Normal approximation of i.d. distributions, which was studied in [26] and later developed in [1, 11] in the framework of small jump approximation, has received much attention in the literature [33, 16, 22, 13, 23, 19, 2].
The idea of small jump normal approximation is as follows. Denote by ν the Lévy measure of an i.d. random variable X. Given ε > 0, decompose ν = ν_ε + ρ_ε, such that ρ_ε has finite mass. Correspondingly, X = X_ε + J_ε, where X_ε and J_ε are independent i.d. random variables with Lévy measures ν_ε and ρ_ε, respectively. Then X_ε is approximated by a Gaussian random variable, while J_ε is sampled using standard methods for compound Poisson random variables. Presumably, in order for the approximation to have a certain degree of precision, the support of ν_ε should lie in a small neighborhood of 0. The size of the neighborhood is controlled by ε. For the univariate case, it is natural to set ε equal to the maximum jump size of X_ε [1]. However, for the multivariate case, such use of ε can be restrictive. Generally speaking, one can use ε to index any tunable quantity, as long as it (indirectly) controls the size of the support of ν_ε [11].
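To make the scheme concrete, the decomposition and the resulting sampler can be sketched in code. Everything below is a hypothetical illustration of ours, not the paper's setting: we use the toy Lévy density ν(dx) = x^{-3/2} dx on (0, 1], split it at a threshold eps, sample the finite-mass large jump part as a compound Poisson sum via inverse-CDF jump sizes, and replace the small jump part by a Gaussian with matched mean and variance.

```python
import numpy as np

def sample_small_jump_normal(eps, rng):
    """Small jump normal approximation for the toy Levy density
    nu(dx) = x**(-3/2) dx on (0, 1] (hypothetical example).

    Jumps larger than eps form a finite-mass compound Poisson part;
    jumps smaller than eps are replaced by a Gaussian with the same
    mean and variance as the small jump component.
    """
    # Mass of the large jump part: lam = int_eps^1 x^{-3/2} dx
    lam = 2.0 * (eps ** -0.5 - 1.0)

    # Inverse CDF of the normalized jump-size density on [eps, 1]
    def inv_cdf(u):
        return (eps ** -0.5 - u * (eps ** -0.5 - 1.0)) ** -2

    n = rng.poisson(lam)
    big = inv_cdf(rng.random(n)).sum() if n > 0 else 0.0

    # Mean and variance of the small jump part:
    # mu  = int_0^eps x   * x^{-3/2} dx = 2 sqrt(eps)
    # var = int_0^eps x^2 * x^{-3/2} dx = (2/3) eps^{3/2}
    mu = 2.0 * np.sqrt(eps)
    var = (2.0 / 3.0) * eps ** 1.5
    small = mu + np.sqrt(var) * rng.standard_normal()
    return big + small
```

For this toy density the overall mean is ∫₀¹ x ν(dx) = 2, independent of eps, which gives a quick sanity check on the sampler.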
Normal approximation relies on second-order moment matching between the small jump component and a Gaussian random variable, by which we mean the matching of their first and second moments. This is equivalent to second-order cumulant matching. Without specifying details, by certain measures, the error of the approximation in the univariate case is bounded by
where is a universal constant, is the second cumulant of , and for , is the “absolute cumulant” of [1]. The best currently available value of is 0.4785 ([28], p. 71). Also, in some symmetric cases, since the third cumulants of and the Gaussian random variable are 0, in the bound can be more or less replaced with , while the power of is raised to 2 [1]. From the pattern of the bound, one could guess that, if and some have the same cumulants of order 1, …, with , then could be approximated by with the error being bounded by
where the leading factor is a near-constant, at least when ε is small, or most ideally, a universal constant which may depend on the dimension but is not very large. Elementary calculations indicate that in many cases, the above bound vanishes at a genuinely higher rate than the bound for normal approximation as ε → 0. Since every cumulant of order greater than 2 of a Gaussian random variable is 0, the above bound, if true, is consistent with the bound for normal approximation.
Even by some rough analysis of characteristic functions, there is good reason to expect that the bound is true for both univariate and multivariate cases. However, before attempting to work out the details, perhaps one should first ask if such an approximation can possibly be implemented easily. The meaning of the question is twofold. First, the distribution of the approximating random variable should be easy to identify; preferably, it is i.d. Second, the approximating random variable should be easy to sample; preferably, the computational complexity of the sampling is of the same order as that of normal approximation. If the answer to the question is positive, then the next question is how high the order of cumulant matching can be. It can be anticipated that the higher the order is, the faster the error of approximation vanishes as ε → 0. Once the answers to the two questions are in place, a wide range of available techniques can potentially be modified to establish the error bound (e.g. [5, 3, 8, 28]).
Clearly, cumulant matching is equivalent to moment matching. Actually, our proof of the above type of bound eventually will be based on moment matching. However, thanks to the Lévy-Khintchine representation, it is more natural and convenient to consider cumulants than moments. We shall show that it is fairly easy to construct approximating i.d. random variables with matching cumulants up to at least the fourth order. In many important cases, we can match cumulants up to the fifth order, and in the symmetric cases, up to the seventh or even ninth order. For the univariate case, the construction is particularly simple. The approximating i.d. random variable is the sum of a compound Poisson random variable and an independent Gaussian random variable, with the former in turn being the sum of a Poisson number of i.i.d. Gamma random variables. Importantly, using algorithms already available [14, 20], the computational complexity of the random sampling for the approximation is universally bounded, so it is of the same order as that of random sampling from a normal distribution.
We shall refer to the approximation as Poisson-Gamma-Normal (PGN) approximation, although a longer name like “compound Poisson-Normal approximation with Gamma summands and higher order of cumulant matching” might be more appropriate. We shall bound its error in terms of total variation distance by combining Fourier analysis, the Lindeberg method (cf. [6] for a modern application of it), and a device in [1]. The results are nonasymptotic and of the aforementioned type. Asymptotically, when applied to the small jump component, the approximation yields a substantially higher rate of precision than normal approximation as ε → 0. Of course, on modern treatments of Poisson, compound Poisson, and normal approximations, there is now an extensive literature, and on Gamma and other types of approximations, there is also a growing literature; see [3, 7, 17, 8, 28, 30, 27] and references therein. However, it appears that there has been little work on using convolutions of different types of simple distributions to improve approximation, in the sense that the error of approximation vanishes at a faster rate asymptotically.
For the multivariate case, the issue of approximation becomes considerably more involved, which is a well-documented phenomenon [32, 5, 17, 11, 8, 28]. For normal approximation of i.d. distributions, several important issues unique to the multivariate case are identified and addressed in [11]. The same issues also arise in the type of approximation considered here and actually become more serious. To address them, we consider a “radial” cumulant matching approach. The idea is to apply the cumulant matching method for the univariate case to each radial direction in the Lévy-Khintchine representation, in such a way that, when the approximating Lévy measures and Gaussian measures along different radial directions are “bundled” together, we get a valid multivariate i.d. distribution with the desired order of cumulant matching and with the covariance of its Gaussian component precisely evaluated. Although the approach does not completely resolve the aforementioned issues, it seems to work well in many important cases.
As in the univariate case, we shall prove a similar type of bound for the error of the proposed PGN approximation in terms of total variation distance. On the other hand, the issue of computational complexity needs to be considered more carefully. Recall that the approximation is applied to the small jump component of the decomposition. As in the univariate case, the approximating random variable has a compound Poisson component. Unfortunately, this component now can only be sampled by summing a large number of Poisson events. As a result, the computational complexity of the approximation of the small jump component is much greater than that of normal approximation. However, one also has to take into account the computational complexity of the sampling of the large jump component. We will argue using an example that, although the proposed PGN approximation as a whole has greater computational complexity than normal approximation, asymptotically, as ε → 0, the two have the same order of complexity. Because the PGN approximation can yield a substantially higher rate of convergence, it is, at least asymptotically, worth the extra computation. Note that whereas in [11] the focus is the approximation of the related Lévy processes, our discussion is restricted to i.d. distributions. An extension of PGN approximation to processes is left to future work.
In Section 2, we shall set up notation and collect useful facts about i.d. distributions. Sections 3 and 4 consider PGN approximation for univariate i.d. distributions and multivariate i.d. distributions, respectively. The proofs of the main results in these two sections are collected in Section 5 and the proofs of related technical results are collected in Section 6.
2 Preliminaries
2.1 Notation
Denote and . If is a function on , where , then by or we mean 1) , and 2) , where denotes the order partial derivative with respect to . The order of is defined to be . Denote and . Denote by the space of rapidly decreasing functions on . It is a basic fact that the Fourier transform is a homeomorphism of onto itself ([18], p. 103).
Denote by the support of a measure . For two random variables and , their total variation distance [3] is denoted by
and, if , , their KolmogorovSmirnov distance is denoted by
For any i.d. random variable , denote by and its characteristic function and characteristic exponent, respectively
Let be the probability density of . If it exists, then . Denote
The quantity is known as the cumulant of . It is well defined provided that for all in a neighborhood of 0. Some properties of cumulants can be found in [28]. We shall refer to as the absolute cumulant of .
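Since cumulants and moments will be converted back and forth, it may help to recall the standard recursion m_n = Σ_{j=1}^{n} C(n−1, j−1) κ_j m_{n−j} linking raw moments m_n and cumulants κ_n. A generic sketch of this conversion (ours, not code from the paper):

```python
from math import comb

def moments_to_cumulants(m):
    """Convert raw moments m[0] = m_1, m[1] = m_2, ... to cumulants via
    kappa_n = m_n - sum_{j=1}^{n-1} C(n-1, j-1) * kappa_j * m_{n-j}."""
    kappa = []
    for n in range(1, len(m) + 1):
        s = sum(comb(n - 1, j - 1) * kappa[j - 1] * m[n - j - 1]
                for j in range(1, n))
        kappa.append(m[n - 1] - s)
    return kappa
```

As a check, all cumulants of a Poisson distribution equal its rate, and all cumulants of order greater than 2 of a normal distribution are 0.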
2.2 Basic assumptions and facts
Let be i.d. with Lévy measure . We will always assume
(1) 
in particular, has no Gaussian component and . The assumption causes no loss of generality since, if necessary, we can decompose as , such that has a Lévy measure satisfying (1) and is a compound Poisson random variable. Then we can take as the new . We will also always assume
Under the assumption, is not compound Poisson and for ([31], Theorem 27.4). It is known that if and then does not admit normal approximation [1]. The assumption excludes the case of lattice-valued i.d. random variables, for which Poisson-Charlier approximation has been studied [26, 3].
Recall that for any , if and only if ([31], p. 159–160). If is a Borel measure on with and , then is the Lévy measure of some i.d. random variable ([31], Theorem 8.1). Also, if is bounded, then for all ([31], Theorem 25.17). By differentiation,
for all . It is easy to see that if each is even or , then . Also,
and hence , where denotes the trace of a square matrix .
Let . For , let be i.d. with . If and is compact, then , which directly follows from the vague convergence of to on given ([4], p. 39); see [21] for details on vague convergence. The next result, which will be used later, concerns the case where is not a compact set in . When and with , the result is established as Lemma 3.1 in [1]. However, as seen from the case , the asserted convergence in general is not true if .
Proposition 1.
Suppose , where is nondecreasing with as . Suppose and one of the following holds, 1) for some , where , with i.i.d. , 2) for some , or 3) provided is symmetric, . Then and as .
3 Univariate PoissonGammaNormal approximation
3.1 Cumulant matching
For simplicity and without loss of generality, we will only consider two cases: 1) and 2) is symmetric. First, suppose . Given , decompose , where and are independent i.d. random variables such that
(2) 
Given , let be an i.d. random variable with
(3) 
where and are constants that need to be determined. Finally, let
(4) 
where is a constant that needs to be determined.
We shall use to approximate , or equivalently, use to approximate . But first, let us point out how easy it is to sample . Clearly, the issue is the sampling of . Since , where is i.d. with Lévy density , and , we only need to consider the computational complexity of the sampling of . If , then , the Gamma distribution with shape parameter and scale parameter . It is known that the sampling of has universally bounded complexity regardless of ([14], p. 407–420). If , then , where with , and are i.i.d. random variables independent of . The sampling of is known to have universally bounded complexity ([15] or [20], p. 228–241). On the other hand, conditional on , . Therefore, the sampling of , and hence that of , has the same order of complexity as the sampling of a normal random variable.
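The sampling route just described can be sketched in a few lines; the parameter names below (lam for the Poisson rate, alpha and beta for the Gamma shape and scale, b for the drift, sigma for the Gaussian standard deviation) are hypothetical choices of ours. The key point is that, conditional on the Poisson count, the sum of the i.i.d. Gamma summands is itself a single Gamma draw, so one sample takes O(1) operations.

```python
import numpy as np

def sample_pgn(lam, alpha, beta, b, sigma, rng):
    """One draw from a PGN-type approximation (hypothetical parameters):
    drift + compound Poisson sum of Gamma(alpha, beta) jumps + Gaussian."""
    n = rng.poisson(lam)
    # Sum of n i.i.d. Gamma(alpha, beta) variables equals Gamma(n * alpha, beta).
    gamma_part = rng.gamma(n * alpha, beta) if n > 0 else 0.0
    return b + gamma_part + sigma * rng.standard_normal()
```

The mean of one draw is b + lam·alpha·beta, which gives a quick check on the sampler.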
Due to the Lévy-Khintchine representation of , we refer to the approximation of by , or by , as Poisson-Gamma-Normal (PGN) approximation.
It is easy to see , and for ,
(5) 
This is the starting point of cumulant matching between and . In the next result, we allow , so it applies to any i.d. random variable with finite fourth cumulant.
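To make the matching conditions concrete, suppose (in notation of our own choosing, not necessarily the paper's) the compound Poisson component has Poisson rate λ and i.i.d. Gamma(α, β) summands, the Gaussian component has variance σ², and the drift is b. Since the n-th cumulant of a compound Poisson sum is λ times the n-th moment of one summand, the cumulants of the approximating variable are

```latex
\kappa_1 = b + \lambda \alpha \beta, \qquad
\kappa_2 = \sigma^2 + \lambda \alpha(\alpha+1)\beta^2, \qquad
\kappa_n = \lambda \beta^n \prod_{j=0}^{n-1} (\alpha + j), \quad n \ge 3.
```

Matching the first four cumulants of the target yields four equations in the five parameters (b, σ², λ, α, β), consistent with fourth-order matching always being achievable (Proposition 2) and the remaining degree of freedom sometimes reaching the fifth cumulant (Proposition 3).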
Proposition 2 (Fourthorder cumulant matching).
Proof.
Proposition 3 (Fifthorder cumulant matching).
Proof.
Now we consider the symmetric case. Suppose , where are i.i.d. with Lévy measure supported in . Let , and approximate it by , where are i.i.d. as defined in (4). Since all the odd-ordered cumulants of and are 0, we only need to match their even-ordered cumulants. The next result states that for the general case, we can match their cumulants up to order 7, and for the i.d. distribution as in Proposition 3, we can match their cumulants up to order 9.
Proposition 4 (Symmetric case).
Proof.
1) By Hölder's inequality, , so for all large , (10) is satisfied. Since for even-valued , , it is easy to see and . On the other hand, for all odd-valued , . Finally, by an argument similar to that for Proposition 2, , leading to for .
2) Following the proof of Proposition 3,
Clearly, is strictly increasing on . On the other hand,
is strictly decreasing on , with . Therefore, there is a unique satisfying (12). We have to show that for this , (10) is satisfied for all small . By continuity, it suffices to show that for ,
By calculation, the equality is equivalent to , while the inequality is equivalent to . Then, by and , the equality indeed implies the inequality. The rest of the proof then follows the one for 1). ∎
Propositions 3 and 4 directly lead to the following result on the truncated stable case. Note that for the nontruncated case, a simple exact sampling method is known [14]. Also, for , the truncated stable distribution can be sampled exactly [9].
3.2 Error bound for approximation
We consider the error of approximation of by . Denote constants
Note that because
and by the Central Limit Theorem, as .
Theorem 1.
Remark.

The bound is on instead of the more commonly used [1, 26]. However, we have not been able to derive a Berry-Esseen type of bound of the form , with a universal constant depending only on . It appears that some key ingredients of the proof of the Berry-Esseen bound for normal approximation are still missing for higher order approximations. Also, it is likely that the constants in the bounds are not optimal.

The bound will be proved by combining Fourier analysis, the Lindeberg method, and a device in [1] (cf. the proof of Theorem 25.18 in [31]). Although a bound on may be established solely by Fourier analysis [10, 26], our proof seems more transparent and more suitable for generalization to the multivariate case.
The constants appearing in the bound look rather technical. We can use the following result to bound them.
Proposition 5.
For and , there is , such that if
(14) 
then for any ,
Since the proof is short, we give it here. By (14), for all small , . Then, from the increasing monotonicity of in , there is a constant , such that for , . Consequently, if , then by (14), for , , and hence for all ,
Since as , the proof is complete.
Example 1.
Let , where , , and . By Corollary 1, given , if , and , and are set according to (7)–(8), then for . To apply (13), we need to get , , and . For , . Since
then , with
Therefore, by Theorem 1,
Since is a constant independent of , and satisfies Orey's condition ([29]; also see [31], Proposition 28.3), the conditions in (14) are satisfied regardless of the value of . Then by Proposition 5, . This may be compared to the normal approximation in [26, 1], where the distance between and its normal approximation is of rate when is asymmetric.
Example 2.
Let , where and . If we directly evaluate for , no closed formulas are available. The following method avoids the problem. Recall that for any odd positive integer , for , where . Let be the smallest odd number greater than and , where for all . Decompose , where . Because as , has finite mass, and hence corresponds to a compound Poisson random variable that can be sampled exactly. Since , for