Moment-based inference for Pearson’s quadratic q subfamily of distributions
Panepistemiopolis, 157 84 Athens, Greece.
The author uses a Stein-type covariance identity to obtain moment estimators for the parameters of the quadratic polynomial subfamily of Pearson distributions. The asymptotic distribution of the estimators is obtained, and normality and symmetry tests based on it are provided. Simulation is used to compare the performance of the proposed tests with that of other existing tests for symmetry and normality.
Mathematics Subject Classification: 62E01.
Keywords: Quadratic q; Covariance Identity; Moment Estimators; δ-method.
Let X be a continuous random variable (r.v.) with probability density function (p.d.f.) f and finite mean μ. We say that X has a Pearson quadratic q, q(x) = δx² + βx + γ (see ), if it satisfies the identity
Let X be a continuous r.v. which satisfies (1.1). Then the support of X is the interval
where and . Also, it is obvious that .
Similarly, if X is a discrete (integer-valued) r.v. with probability mass function (p.m.f.) p and finite mean μ, we say that X has a Pearson quadratic form q if it satisfies the identity
Under (1.2) it can be shown that the support S of X is an interval of integers, i.e., if x₁ ∈ S and x₂ ∈ S then all integers between x₁ and x₂ belong to S. The cases where S contains only one or two points are trivial, because identity (1.2) is then always satisfied and q is not uniquely defined. We exclude these trivial cases from what follows. Thus, when we say that an integer-valued r.v. has a quadratic polynomial q, we will assume that its support contains at least three points.
where Δ denotes the forward difference operator, Δg(x) = g(x + 1) − g(x).
where , , with .
It should be noted that the quadratic q also appears in variance bounds obtained using Bessel’s inequality (see ).
The purpose of the present paper is to obtain an estimator for the parameters of the quadratic q, i.e., for the vector θ = (δ, β, γ). With the help of (1.3) and (1.4), we generate a system of equations, from which we obtain the moment estimators for θ.
Employing the δ-method, the asymptotic distribution of the estimators is derived. Some applications are also given. Similar work has been done by Pewsey , who found the joint asymptotic distribution of the sample mean, variance, skewness and kurtosis. It is worth mentioning that Pewsey’s results provide, for the first time, the joint asymptotic distribution for these fundamental statistics.
2 Moment Estimators
Here we deal with the estimation of the parameters θ = (δ, β, γ). ML estimation is possible but, as is generally true for all but the simplest distributions, there are no closed-form expressions for the MLEs, and ML estimation reduces to a numerical optimization problem. Instead, in what follows we consider estimators obtained by the method of moments.
Let X be a r.v. with mean μ and finite fourth moment, and let μ_k = E(X − μ)^k denote the k-th central moment. Then
μ₂μ₄ − μ₃² − μ₂³ ≥ 0, (2.1)
and the equality holds only in the trivial case where X takes, with probability 1 (w.p. 1), at most two values [the r.v. (X − μ)² − (μ₃/μ₂)(X − μ) then has variance 0, since on a two-point support (X − μ)² is an a.s. linear function of X − μ]. Next, let X₁, ..., X_n be a random sample from any distribution. If m_k is the k-th sample central moment, then
m₂m₄ − m₃² − m₂³ ≥ 0, (2.2)
and the equality holds if and only if we have observed at most two distinct values in the sample (this follows directly from (2.1)).
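As a quick numerical check of the sample-moment inequality (taken here to be m₂m₄ − m₃² − m₂³ ≥ 0, with equality exactly when at most two distinct values occur), here is a minimal Python sketch; the helper names are illustrative, not from the paper.

```python
import random

def central_moment(xs, k):
    """k-th sample central moment m_k = (1/n) * sum((x_i - xbar)^k)."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** k for x in xs) / n

def moment_gap(xs):
    """m_2*m_4 - m_3^2 - m_2^3: nonnegative, and zero exactly when the
    sample contains at most two distinct values."""
    m2, m3, m4 = (central_moment(xs, k) for k in (2, 3, 4))
    return m2 * m4 - m3 ** 2 - m2 ** 3

random.seed(0)
print(moment_gap([random.gauss(0.0, 1.0) for _ in range(500)]) > 0)  # True
print(abs(moment_gap([0.0, 1.0, 1.0, 0.0, 1.0])) < 1e-12)            # True: two values
```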
Let X be an integer-valued r.v. (or a continuous r.v.) with mean μ, finite fourth moment and p.m.f. satisfying (1.2) (or p.d.f. satisfying (1.1)), with quadratic q(x) = δx² + βx + γ. If X₁, ..., X_n is a random sample from X with at least three different values (or at least three values), then:
(a) for the integer-valued case, the moment estimators for (δ, β, γ) are
(b) for the continuous case, the moment estimators for (δ, β, γ) are
where the quantity m₂m₄ − m₃² − m₂³ appearing in the denominators is positive, X̄ is the sample mean and m_k is the k-th sample central moment. Also, the estimators (δ̂, β̂, γ̂) converge strongly to (δ, β, γ), respectively.
Proof: (a) Since X has finite fourth moment, it follows that it has finite central moments up to the fourth order. Also, its p.m.f. satisfies (1.2), and so the covariance identity (1.4) applies to any suitable g. In particular, the covariance identity is satisfied for . For these functions, and since , , , (1.4) yields the equalities,
Solving this system of equations, we obtain
If in a random sample we have observed at least three different values, then m₂m₄ − m₃² − m₂³ > 0, by (2.2). Replacing μ, μ₂, μ₃, μ₄ by X̄, m₂, m₃, m₄ in the above expressions, we obtain the moment estimators δ̂, β̂ and γ̂.
(b) Using similar arguments we observe that for , , the covariance identity (1.3) is satisfied, which is
From (2.3), with , we generate a system of equations. Solving this system we obtain
where the same quantity appears in the denominators. For the solution of this system we have to ensure that μ₂μ₄ − μ₃² − μ₂³ > 0. This follows directly from (2.1), since the r.v. X is continuous and hence cannot take at most two values.
Since a random sample of at least three values from a continuous distribution function consists of distinct values (w.p. 1), we also have m₂m₄ − m₃² − m₂³ > 0, a.s.
For all k ∈ {1, 2, 3, 4} the r.v. (X − μ)^k has finite mean μ_k, and it is well known that m_k → μ_k, a.s. Finally, the estimators δ̂, β̂, γ̂ can be written as rational functions of X̄, m₂, m₃, m₄, and we conclude, using Slutsky’s Theorem, that these functions converge strongly to δ, β, γ, respectively.
If we carefully examine the expressions for δ̂, β̂, γ̂ in the continuous case, we will see that δ̂ is a number that does not have “measurement units” (m.u.’s), β̂ is measured in the m.u.’s of X, and γ̂ is measured in the m.u.’s of the square of X. Bearing in mind that δ, β, γ are multiplied by x², x, 1 respectively (i.e. q(x) = δx² + βx + γ), we expect the final result q(x) to be measured in m.u.’s of the square of X. This indicates that the above choice of estimators is natural (see (1.1)).
3 Asymptotic Distribution
Next, we study the asymptotic distribution of the estimators using the δ-method:
Let T_n, n = 1, 2, ..., be a sequence of random vectors in ℝ^k, and assume that
√n (T_n − θ) → N_k(0, Σ), in distribution, as n → ∞,
where N_k(0, Σ) is the k-dimensional normal distribution with mean vector 0 and covariance matrix Σ.
If g : ℝ^k → ℝ^m is (totally) differentiable at θ, with total differential (Jacobian matrix) G = Dg(θ), then (see , p. 25)
√n (g(T_n) − g(θ)) → N_m(0, G Σ Gᵀ), in distribution, as n → ∞.
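The δ-method in the scalar case (k = m = 1) says that if √n(X̄ − μ) is asymptotically N(0, σ²), then √n(g(X̄) − g(μ)) is asymptotically N(0, g′(μ)²σ²). The following Monte Carlo sketch (illustrative, not from the paper) checks this for Exponential(1) data and g(x) = x².

```python
import random

random.seed(1)

# Scalar delta-method check: for Exponential(1), mu = 1 and sigma^2 = 1.
# With g(x) = x^2, the delta method gives Var(g(Xbar)) ~ g'(mu)^2 * sigma^2 / n = 4/n.
n, reps = 400, 4000
vals = []
for _ in range(reps):
    xbar = sum(random.expovariate(1.0) for _ in range(n)) / n
    vals.append(xbar * xbar)

mean_g = sum(vals) / reps
var_g = sum((v - mean_g) ** 2 for v in vals) / reps
print(n * var_g)  # should be close to 4
```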
So, we can easily deduce the following result.
Let X₁, ..., X_n be a random sample, with at least three distinct values (or at least three values), from an integer-valued (or continuous) r.v. X, with p.m.f. satisfying (1.2) (or p.d.f. satisfying (1.1)) and finite eighth moment. If θ̂ = (δ̂, β̂, γ̂), with δ̂, β̂, γ̂ as in Theorem 2.1(a) (or (b)), and θ = (δ, β, γ), then
where the matrices appearing in the limit law have the following elements:
(a) for the discrete case, … ;
(b) for the continuous case, … .
Proof: We centralize the X-values as Y_i = X_i − μ, for i = 1, ..., n. Then for the vector , it is well known that
We consider the sample central moments m₂, m₃, m₄, and we seek the asymptotic distribution of the vector (X̄, m₂, m₃, m₄). Observe that
Hence, for , we get , , . Thus, the vector can be written as , where with , , , . Applying the δ-method, it follows that , as n → ∞, where .
Since (and ), we obtain , as n → ∞.
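As a concrete special case of this convergence, the m₂-coordinate satisfies √n(m₂ − μ₂) → N(0, μ₄ − μ₂²); for N(0, 1) data this asymptotic variance equals 2. A small simulation (illustrative, not from the paper) confirms it:

```python
import random

random.seed(2)

def sample_variance_mle(xs):
    """Second sample central moment m_2 (divisor n)."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / n

# For N(0, 1): mu_2 = 1 and mu_4 = 3, so the asymptotic variance of
# sqrt(n) * (m_2 - mu_2) is mu_4 - mu_2^2 = 2, i.e. n * Var(m_2) -> 2.
n, reps = 300, 5000
vals = [sample_variance_mle([random.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)]
mean_v = sum(vals) / reps
n_var = n * sum((v - mean_v) ** 2 for v in vals) / reps
print(n_var)  # should be close to 2
```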
Regarding the asymptotic distribution of the vector θ̂ = (δ̂, β̂, γ̂), we have , where the coordinates are given by:
(a) for the integer-valued case, … ;
(b) for the continuous case, … ;
with . Thus,
where for both cases.
4 Hypothesis Testing
In the subsections that follow we present various hypothesis tests based on the asymptotic distribution of the parameter estimates.
4.1 Continuous Case
4.1.1 Test for Normality
A test of normality is equivalent to testing
H₀: δ = β = 0,
i.e., that the quadratic q in (1.1) is constant.
Theorem 3.1(b) shows that , as n → ∞, where , with . Thus, under the null hypothesis, we have that
where χ² denotes the chi-square distribution with the corresponding degrees of freedom. Since Σ is unknown, we estimate it by Σ̂, replacing θ by θ̂. For testing the above hypothesis, we propose the statistic
and, at significance level α, the asymptotic rejection region is , where is the upper α point of the chi-square distribution.
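For concreteness, the rejection rule can be coded directly. The sketch below assumes, purely for illustration, a limiting chi-square law with 2 degrees of freedom (whose upper-α point has the closed form −2 ln α); the function names are not from the paper.

```python
import math

def chi2_upper_point_2df(alpha):
    """Upper-alpha point of the chi-square distribution with 2 degrees of
    freedom; chi-square(2) is Exponential(mean 2), so the quantile is -2*ln(alpha)."""
    return -2.0 * math.log(alpha)

def reject_normality(statistic, alpha=0.05):
    """Asymptotic alpha-level test: reject H0 when the statistic exceeds
    the chi-square critical value (2 d.f. assumed here for illustration)."""
    return statistic >= chi2_upper_point_2df(alpha)

print(round(chi2_upper_point_2df(0.05), 3))  # 5.991
print(reject_normality(7.2))                 # True
```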
The distribution of the statistic is asymptotically chi-square. Table 1 contains the 90th, 95th, 97.5th and 99th percentiles of the empirical distribution of the statistic, generated by simulation of samples of various sizes n from a normal distribution.
Table 1. Empirical percentiles of the distribution of for random samples of size drawn from a normal distribution.
For a sample of small size n, the proposed α-level normality test is
where is given in Table 1.
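Empirical percentiles like those in Table 1 can be produced by a simulation of the following shape. Because the paper's statistic is given by its displayed formulas, this sketch substitutes the Jarque–Bera statistic, a different normality statistic that is also asymptotically chi-square with 2 degrees of freedom; everything here is illustrative.

```python
import random

random.seed(3)

def jarque_bera(xs):
    """Jarque-Bera statistic n*(skew^2/6 + (kurt - 3)^2/24); asymptotically
    chi-square(2) under normality.  A stand-in for the paper's statistic."""
    n = len(xs)
    xbar = sum(xs) / n
    m2 = sum((x - xbar) ** 2 for x in xs) / n
    m3 = sum((x - xbar) ** 3 for x in xs) / n
    m4 = sum((x - xbar) ** 4 for x in xs) / n
    return n * ((m3 / m2 ** 1.5) ** 2 / 6.0 + (m4 / m2 ** 2 - 3.0) ** 2 / 24.0)

def empirical_percentiles(n, reps, probs):
    """Simulate `reps` normal samples of size n and return the empirical
    upper percentiles of the statistic, as in Table 1."""
    stats = sorted(jarque_bera([random.gauss(0.0, 1.0) for _ in range(n)])
                   for _ in range(reps))
    return [stats[int(p * reps)] for p in probs]

print(empirical_percentiles(50, 2000, [0.90, 0.95, 0.975, 0.99]))
```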
4.1.2 Test for δ = 0
It is of interest to know whether δ = 0, because this simplifies the procedure of inverting the quadratic q and arranging (categorizing) the distribution. Hence we consider testing the null hypothesis H₀: δ = 0. Theorem 3.1(b) shows that √n(δ̂ − δ) converges in distribution to N(0, σ₁₁), as n → ∞, where σ₁₁ is the (1, 1) element of the matrix Σ.
Note that if q in (1.1) is linear (that is, δ = 0) then X follows either a normal or a gamma-type distribution of the form c₁G + c₂, where G is gamma and c₁, c₂ are constants. In both cases, . Thus, under the null hypothesis,
However, σ₁₁ is unknown, and we have to estimate it by σ̂₁₁, replacing θ by θ̂. Thus, we propose the statistic
and, at significance level α, the (asymptotic) rejection region is , where z_{α/2} is the upper α/2 point of the standard normal distribution.
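Assuming the two-sided rule (reject when |Z| ≥ z_{α/2}), the decision can be coded with the standard normal quantile from Python's standard library; the value of the statistic is taken as an input, since its formula is given by the displayed equations.

```python
from statistics import NormalDist

def z_test_two_sided(z_value, alpha=0.05):
    """Asymptotic alpha-level test of H0: delta = 0; reject when |Z| is at
    least the upper alpha/2 point of the standard normal distribution.
    `z_value` is the studentized estimate, computed from the paper's formulas."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return abs(z_value) >= z_crit

print(z_test_two_sided(1.5))   # False (1.5 < 1.96)
print(z_test_two_sided(2.5))   # True  (2.5 > 1.96)
```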
4.1.3 Test for Symmetry
First we prove the following lemma.
Let be a continuous r.v. with mean and p.d.f. satisfying (1.1), with