Exact distribution of selected multivariate test criteria by numerical inversion of their characteristic functions
Application of the exact statistical inference frequently leads to a non-standard probability distributions of the considered estimators or test statistics. The exact distributions of many estimators and test statistics can be specified by their characteristic functions. Typically, distribution of many estimators and test statistics can be structurally expressed as a linear combination or product of independent random variables with known distributions and characteristic functions, as is the case for many standard multivariate test criteria. The characteristic function represents complete characterization of the distribution of the random variable. However, analytical inversion of the characteristic function, if possible, frequently leads to a complicated and computationally rather strange expressions for the corresponding distribution function (CDF/PDF) and the required quantiles. As an efficient alternative, here we advocate to use the well-known method based on numerical inversion of the characteristic functions --- a method which is, however, ignored in popular statistical software packages. The applicability of the approach is illustrated by computing the exact distribution of the Bartlett’s test statistic for testing homogeneity of variances in several normal populations and the Wilks’s -distribution used in multivariate hypothesis testing.
keywords:multivariate test criteria, exact distribution, Bartlett’s test, Wilks’s -distribution, characteristic function, numerical inversion
In 1937, Bartlett proposed a testing procedure to test the hypothesis of equal variances of normal populations. He suggested an ingenious correction of the modified likelihood ratio based test statistic which under the null hypothesis approximately follows the chi-squared distribution with degrees of freedom, even for small sample sizes, see Bartlett1937 (). In fact, the Bartlett-type corrections (multiplying factors) are known to be effective and precise for various approximate tests based on the asymptotic approximations for likelihood ratios for wide range of parameters (as e.g. the number of normal populations and the sample sizes for in testing the homogeneity of variances), Bartlett1954 (); Cribari1996 (). However, detailed comparison with the exact distribution is still desirable.
In general, application of the exact statistical inference leads to a non-standard probability distributions of the considered estimators or test statistics which can be specified by their characteristic functions (CFs). Frequently, distribution of many estimators and test statistics can be structurally expressed as a linear combination or product of independent random variables with known characteristic functions, as is the case for many standard multivariate test criteria, see, e.g., Mathai1973 (); Arnold2013 (). In such cases, analytical expressions for the exact distributions are typically difficult to derive. Hence, such distributions are usually approximated by using results of the asymptotic theory, see Mardia1979 (); Anderson2003 (), or other available small sample approximation/correction methods, and frequently by using computer intensive simulation methods.
In this paper we advocate using numerical inversion of the known characteristic function as an efficient tool to evaluate the required distribution --- the probability density function (PDF), as well as the cumulative distribution function (CDF), and the quantiles of such estimators or test statistics. The method based on numerical inversion is convenient when the characteristic function is known apriori and is such that it can be easily evaluated by the available algorithms, or if the statistic under consideration is a linear combination of independent random variables with known and simple characteristic functions. In order to apply the method to products of independent random variables one has to consider, first, the logarithm of the statistic and, subsequently, to transform the computed values to the original scale (if required). Unfortunately, such numerical algorithms are not available in standard statistical software packages (e.g. SAS, R, MATLAB).
For illustration (and as a gentle introduction to the problem) let us consider first the distribution of a quadratic form with , where denotes the known covariance matrix and denotes the known p.s.d. matrix of the quadratic form, then
where are eigenvalues of and are independent chi-square distributed random variables (RVs) with 1 degree of freedom. Hence, the characteristic function of , say , is given by
where denotes the imaginary unit and , for all , is CF of the chi-square distribution with 1 degree of freedom. The cumulative distribution function of , say , is a non-standard distribution (in general, the closed form expression is unknown), however, it can be evaluated numerically from its CF, as it was suggested in Imhof1961 (); Davies1980 ().
This can be naturally generalized for more complicated applications, e.g., based on Gaussian stochastic processes. For example, let us consider the (asymptotic) distribution of the Cramér-von Mises and the Anderson-Darling statistics. These statistics belong to the class of quadratic goodness-of-fit test statistics based on the empirical distribution function. By using the theory of stochastic processes, the asymptotic distributions are derived from the Karhunen-Loève representation of functionals of the Brownian bridge.
In particular, let denotes the empirical CDF based on i.i.d. random variables from continuous distribution , i.e. . Then, for , the distribution of the Cramér-von Mises statistic converges to the distribution of infinite sum of (weighted) independent chi-square distributed random variables with 1 degree of freedom, i.e.
where represents the Brownian bridge process and are i.i.d. RVs. The exact distribution of is difficult to derive and evaluate, however its characteristic function is rather simple,
Similarly, for , the distribution of the Anderson-Darling statistic converges to the distribution of infinite sum of (weighted) independent chi-square distributed random variables with 1 degree of freedom, i.e.
with its (rather simple) characteristic function given by
For more details see Anderson1952 (); Marsaglia2004 (). The distribution functions (PDF/CDF) of and can be evaluated numerically from their respective CFs by using the proper numerical inversion algorithm.
The rest of the paper is organized as follows: In Section 2 we present the exact CF of the Bartlett’s test statistic for testing homogeneity of variances of normal populations. The exact CF of the Wilks’s -distribution is presented in Section 3 and the exact non-null CFs of selected related multivariate test criteria are presented in Section 4. In Section 5 we introduce the Gil-Pelaez inversion and its implementation based on using the trapezoidal rule. Applicability of the numerical inversion method is illustrated in Section 6, where the exact distribution is compared with some known approximations. Discussion and concluding remarks are presented in Section 7.
2 Characteristic function of the Bartlett’s test statistic
Let () represent independent random samples from normal populations, where are independent normally distributed RVs with unknown means and unknown variances for all and . The Bartlett’s test statistic (the corrected version of the log-likelihood based test statistic) and its approximate null distribution for testing homogeneity of variances of normal populations, i.e. the hypothesis , is given by
where , with , with the sample variances , for , and the pooled sample variance , where .
The closed form expression for the exact distribution of the test statistic (7) is unknown, but its distribution can be approximated by the asymptotic chi-square distribution with degrees of freedom, see Bartlett1937 (), for more precise higher-order asymptotic approximations see Anderson2003 (). However, the exact null distribution of the Bartlett’s test statistic can be evaluated by numerical inversion of its CF.
The exact distribution of the Bartlett’s test statistic was studied (among others) by Glaser and Chao in Glaser1976 (); Glaser1976b (); Chao1978 (). They recognized that the null distribution of the likelihood ratio statistic (which is functionally related to the Bartlett’s test statistic) is related to the distribution of a ratio of the weighted geometric mean and the arithmetic mean of independent gamma distributed random variables. Based on that, they succeeded to derive the characteristic function of the log-likelihood ratio test statistic. However, the subsequently derived expression for PDF of the considered test statistic was expressed in a complicated and intractable form for practical purposes. In fact, they used the asymptotic expansion of the derived cumulant generating function, in order to express the probability density function of the log-likelihood ratio test statistic as an infinite linear combination of chi-square densities (with the coefficients depending on the parameters and on the complicated double sums of Bernoulli polynomials).
Here we briefly recall the basic steps of deriving the exact CF of the Bartlett’s test statistic (7). Let be a ratio of the weighted geometric mean and the arithmetic mean,
where are the weights, such that , and are independent gamma distributed RVs with the shape parameters for for and common scale (resp. rate) parameter . Note that the ratio and the arithmetic mean are mutually independent random variables and, moreover, is scale invariant, i.e. the distribution does not depend on the scale parameter . Hence, the exact th moment of is given by
with . Further, by using our knowledge about the th moment of the gamma distribution, i.e.
for , we directly get the expression for the th moment of ,
In general, for any log-transformed non-negative RV, say , its CF can be derived from the formal expression of the th moment of (if it exists and is well defined also for purely imaginary order, say ) by substituting the order with the complex variable . In particular,
Now, let denote the likelihood ratio based test statistic for testing homogeneity of variances in normal populations with unequal sample sizes, which is the ratio of the weighted geometric mean and the weighted arithmetic mean of the sample variances ,
where is the Bartlett’s correction factor and .
Under the alternative hypothesis , i.e. when for some , the distribution of specified in (14) depends on for (i.e. the independent gamma distributions of have different shape parameters as well as different scale parameters). The exact th moment of and the associated non-null distribution characteristic function of test statistic is more complicated to derive, and hence, it is not presented here. However, the exact non-null distribution moments of the related likelihood ratio test statistic for testing sphericity of the multivariate distribution have been derived by Khatri and Srivastava in Khatri1971 (). For more details on the non-null characteristic functions and distributions of selected multivariate test criteria see Section 4.
3 Characteritic function of the Wilks’s test statistic
The Wilks’s test statistic is frequently used in multivariate hypothesis testing, especially with regard to different likelihood-ratio tests and multivariate analysis of variance (MANOVA). Let and are independent -dimensional Wishart matrices representing the residual errors and the hypothesis model sums of squares and products matrices, with the respective degrees of freedom and , such that , and a common covariance matrix . The Wilks’s statistic and its exact null distribution is given by
where are independent RVs with beta distributions, with specific (different) parameters for , and by we denote the Wilks’s Lambda distribution with the parameters , , and , see e.g. Anderson2003 ().
The exact Wilks’s distribution with the parameters (number of variates), (error degrees of freedom), and (hypothesis degrees of freedom), have been broadly studied in statistical literature for specific parameters , as well as for quite general situation with arbitrary parameters, see e.g. Wald1941 (); Schatzoff1966 (); Pillai1969 (); Mathai1971 (); Davis1979 (). In particular, Wald and Brookner in Wald1941 () gave an expansion of the exact CDF of from its CF by using the method of residues, expressed in general as an infinite series expansion applicable for any grouping, Schatzoff in Schatzoff1966 () considered the representation of as a sum of independently distributed beta random variables and derived the expressions for its distribution (PDF/CDF) by taking successive convolutions. He showed how to compute numerical values of the coefficients in the derived expressions (for both the density and distribution functions) by recursive computational techniques. Pillai and Gupta in Pillai1969 () derived explicit expressions for . Mathai and Rathie in Mathai1971 () derived the exact distribution by using the technique based on the inverse Mellin transform. However, computational problems may arise in tabulating the distributions when the latter are obtained as infinite series. One possibility for overcoming the problem of a slowly convergent series is to expand the series at intermediate points, and so to approach the required percentage points by a process of ’analytic continuation’. If the distribution can be shown to satisfy a differential equation, the latter may provide a convenient tool for this process, as was suggested by Davis in Davis1979 ().
However, the derived closed form expressions of the distribution functions are typically too complicated for practical purposes, especially for higher values of the parameters and , and thus frequently approximated by using the well-known asymptotic approximation, , and/or its improved (corrected) versions, see Bartlett1938 () and Rao1948 (),
In any case, the exact distribution of the log-transformed statistic can be evaluated by numerical inversion of its CF. In particular,
where denotes the CF of the log-transformed random variable for .
One possible application of the test statistic (17) is related to the well-known likelihood ratio test (LRT) for the equality of -dimensional mean vectors of normal distributions, for , when the common covariance matrix is assumed to be just positive-definite, but otherwise unstructured, see Anderson2003 (). The null hypothesis can be tested based on independent random samples with (, ), by the LRT test statistic
where , and , with and .
In Coelho2017 (), Coelho suggested similar test for cases where the common covariance matrix is restricted by some given structure, in particular the compound symmetry structure, i.e. for some (unknown) parameters and such that .
In such situation the null hypothesis is with assuming . Now, we get and . Based on that, the suggested LRT statistic which has similar structure as in (17) resp. (21) and its null distribution is given by
where , , with and denoting the diagonal elements of the matrices and , where and are defined as before and denotes the -dimensional Helmert matrix, and finally, and denote two independent beta distributed random variables.
The exact distribution of the log-transformed statistic can be evaluated by numerical inversion of its CF. For more details on derivation of the test statistic and its characteristic function and alternative methods for evaluating its distribution see Coelho2017 ().
4 Characteristic functions of the non-null distributions
In general, the exact non-null distributions of the multivariate test criteria are unknown or difficult to derive. As noted in Mathai1973 (), a breakthrough in this field was possible to achieve by using special functions with matrix arguments, in particular, by using the hypergeometric functions with matrix argument or the generalized functions, such as the Meijer’s -function or the Fox’s -function, for more details see Mathai2008 (); Mathai2009 (); Muirhead2009 (). In particular, the generalized hypergeometric function with matrix argument is defined by
where and are integers, and is symmetric matrix with eigenvalues , is a partition of , and represent the generalized Pochhammer symbols, and is the Jack function --- a symmetric, homogeneous polynomial of degree in the eigenvalues of . For more details and strategies for efficient computation of the generalized hypergeometric function see Koev2006 (); KoevMHF2008 (). Buttler and Wood in Butler2002 (); Butler2005 () suggested efficient Laplace approximations for two functions of matrix argument: the Type I confluent hypergeometric function, , and the Gauss hypergeometric function, .
In special cases it is possible to evaluate the moments of the statistics under consideration. For example, see Mathai1973 (), the th moment of Wilks’s generalized variance , where is a non-central Wishart distribution with degrees of freedom and the parameters (covariance matrix) and (non-centrality matrix), , is
where denotes the multivariate gamma function,
By using (12), the characteristic function of is
and hence, the required PDF/CDF/QF can be computed straightforwardly through numerical inversion of the CF (28).
In the normal theory of testing hypotheses on regression coefficients the Wilks’s test criterion is specified in (17) with the -matrices and . In general, has a non-central Wishart distribution with degrees of freedom, the covariance matrix , and the matrix of non-centrality parameters , where is the true expectation of (if the null hypothesis is not true), where is such that , i.e. . The matrix has a central Wishart distribution with degrees of freedom and a common covariance matrix , .
Then, the non-null characteristic function of , derived from the th non-null moment of , as specified in Constantine1963 (), is
Similarly the test criterion for testing equality of covariances of two -dimensional multivariate normal populations of size and , based on the test statistic
where , , , and . The non-central distribution of under is determined by the parameters , , and the eigenvalues of the matrix . In particular, the non-null characteristic function of derived from the th non-null moment of , as specified in Constantine1963 (), is
The required PDF/CDF/QF of the non-null distributions can be computed by numerical inversion of their CFs, (28) (29) and (31), by using algorithms for computing the generalized hypergeometric functions with matrix argument, e.g., as suggested in KoevMHF2008 (), or by using suitable approximations, see e.g. Butler2002 (). In fact, evaluation of the generalized hypergeometric functions with matrix argument is still a big challenge and numerical precision and efficiency of the computation strongly depends on the quality of the available algorithms.
In general, once the non-null distribution moments of the considered multivariate test statistic are available the characteristic function of the log-transformed statistic can be derived and the numerical inversion of the CF can be applied to evaluate the exact PDF/CDF and the quantiles. The non-null moments of the multivariate test criteria have been broadly discussed in statistical literature, for more particular cases see e.g. Anderson2003 (); Khatri1971 (); Mathai1973 (); Constantine1963 (). However, for many important test criteria the characteristic functions or the non-null moments are still not available or difficult to compute. These are open problems for further research.
5 Methods and algorithms for numerical inversion of the characteristic functions
Let denotes the continuous univariate RV with its PDF . Recall that the CF of the distribution of , given by the Fourier transform of its PDF, is defined as
Conversely, the PDF of is the inverse Fourier transform of its CF,
where denotes the real part of . Further, CDF of can be computed from the analytic CF via the Hilbert transform,
for more details see, e.g., feng2013inverting ().
Computing the (inverse) Fourier transform numerically is a well-known problem, frequently connected with the problem of computing integrals of highly oscillatory (complex) functions. The problem was studied for a long time in general, but also with focus on specific applications, see, e.g., asheim2013complex (); levin1996fast (); milovanovic1998numerical (); sidi1982numerical (); sidi1988user (); sidi2012user (). In particular, the methods suggested for inverting the characteristic function for obtaining the probability distribution function include abate1992fourier (); shephard1991characteristic (); waller1995obtaining (); zielinski2001high (); strawderman2004computing ().
If CF is absolutely integrable over , Gil-Pelaez in GilPelaez1951 () derived the inversion formula which require integration of a real-valued function, only. In particular,
where denotes the imaginary part of . The Gil-Pelaez inversion formulae can be evaluated by using a simple trapezoidal rule:
is sufficiently large integer,
the optimum discretization step is , where is the domain of ,
if not known explicitly or the distribution limits are infinite, can be approximated by a sufficiently large interval covering large part of the distribution domain, e.g., by using the six-sigma-rule: ,
, , are the quadrature weights (, otherwise ),
, , are the equidistant nodes from , where ,
The total approximation error (i.e. the truncation error and the discretization error) can be controlled by proper selection of used for setting the step and selection of sufficiently large , such that the integrand in (32) is sufficiently small for large arguments , i.e. for all .
However, numerical algorithms for computing the distribution function by numerical inversion from the characteristic function are still missing in the standard statistical packages, like e.g. SAS, R and/or MATLAB.
In order to illustrate the suggested approach, a possible alternative is to use the characteristic functions toolbox developed by the author and available at GitHub, see CharFunTool (). CharFunTool is a (still growing) MATLAB repository of characteristic functions and tools for their combinations and numerical inversion. The toolbox comprises different inversion algorithms, including those based on simple trapezoidal quadrature rule for computing the integrals defined by the Gil-Pelaez formulae, and/or based on using the fast Fourier transform (FFT) algorithm for computing the Fourier transform integrals, see e.g. Carr1999 (); Chourdakis2004 (); Huerlimann2013 (), as well as the algorithm for computing non-negative continuous distributions by using the method suggested by Bakhvalov and Vasileva in Bakhvalov1968 (). The method was suggested for computing the oscillatory Fourier integrals based on approximation of the integrand function by the Fourier-Legendre series expansion, and observation that Fourier transform of the Legendre polynomials is related to the Bessel functions. For more details see also Evans1999 ().
6 Numerical examples
Here we illustrate the suggested approach for evaluation of the distribution (PDF/CDF) of selected test statistic by numerical inversion of their characteristic functions we present two simple examples computed by the tools and algorithms available at the MATLAB toolbox CharFunTool, see CharFunTool (). For more details and examples we recommend to check the CharFunTool web page and taking a look at the algorithm help files and the Examples collection.
6.1 Computing the exact distribution of the Bartlett’s test statistic
Let us consider the exact distribution of the Bartlett’s test statistic for testing homogeneity of variances of normal populations with unequal sample sizes, given by (7) and specified by its characteristic function (16). Let us consider the following specific parameters:
, number of normal populations;
, samples degrees of freedom , , with the total sum of the degrees of freedom .
Hence, the Bartlett’s correction factor is and the coefficient . The MATLAB code to evaluate the characteristic function and the exact distribution functions of the Bartlett’s test statistic is presented in Figure 1, together with the plotted graphs of the computed CF/PDF/CDF.
For specified probabilities , and the exact computed quantiles are , , and , respectively. On the other hand, the approximate quantiles, computed from the approximate distribution with , as specified in (7), are , , and , respectively.
6.2 Computing the exact distribution of the log-transformed Wilks’s test statistic
Here we consider and compare the exact distributions of the log-transformed LRT statistics, , for testing equality of the mean vectors of normal populations, when the common covariance matrix is assumed to be unstructured (just positive-definite) and when the common covariance matrix is assumed to have given structure, in particular the compound symmetry structure. Let us consider the following specific parameters:
, dimension of the normal populations;
, number of normal populations;
, total number of samples.
The MATLAB code to evaluate the characteristic functions and the exact distribution functions (PDF/CDF) of the test statistics under both assumed covariance structures is presented in Figure 2 together with the plotted graphs of the computed CF/PDF/CDF.
Notice the apparent difference of the null distributions of the log-transformed LRT statistics for testing equality of mean vectors of the normal populations, with otherwise equal parameters, when the assumed structure of the common covariance matrix is different (unstructured vs. compound symmetry).
In general, evaluation of the exact distribution function based on numerical inversion of its characteristic function is a convenient method when the characteristic function is known apriori and is such that it can be easily numerically evaluated by the available algorithms or if the statistic under consideration is a linear combination of independent random variables with known and simple characteristic functions.
In this paper we have presented several standard test statistics used in multivariate analysis for which the exact null distribution is difficult to express analytically but its characteristic function is known and can be computed easily in standard software packages. Frequently, such distribution functions are usually approximated by using results of the asymptotic theory, or by using other available small sample approximation/correction methods, and frequently by using computer intensive simulation methods. Here we advocate to use the method based on numerical inversion of the characteristic functions. However, numerical algorithms for computing and combining more complicated characteristic functions and for computing the distribution function by numerical inversion from the characteristic function are still missing in the standard statistical packages, like e.g. SAS, R and/or MATLAB. As a possible alternative and a starting point for further development here we present a MATLAB toolbox developed by the author and freely available at the GitHub, https://github.com/witkovsky/CharFunTool. Further research is necessary for deriving the characteristic functions of the non-null distributions, efficient algorithms for computing the special functions (of matrix and complex argument) required for evaluation of complicated characteristic functions, as well as development of the more advanced algorithms (efficient and precise) for numerical inversion of the characteristic functions.
The work was supported by the Slovak Research and Development Agency, project APVV-15-0295, and by the Scientific Grant Agency VEGA of the Ministry of Education of the Slovak Republic and the Slovak Academy of Sciences, project VEGA 2/0047/15.
- (1) M. S. Bartlett, Properties of sufficiency and statistical tests, Proceedings of the Royal Society of London. Series A, Mathematical and Physical Sciences 160 (901) (1937) 268--282.
- (2) M. S. Bartlett, A note on the multiplying factors fo