The exp-G family of probability distributions
Abstract
In this paper we introduce a new method to add a parameter to a family of distributions. The additional parameter is completely studied and a full description of its behaviour in the distribution is given. We obtain several mathematical properties of the new class of distributions, such as the Kullback-Leibler divergence, Shannon entropy, moments, order statistics, estimation of the parameters and large-sample inference. Further, we show that the new distribution has the reference distribution as a special case, and that the usual inference procedures also hold in this case. Furthermore, we apply our method to yield three-parameter extensions of the Weibull and beta distributions. To motivate the use of our class of distributions, we present a successful application to fatigue life data.
Keywords: exp-G distribution; Order statistics; Fisher's information matrix; exp-Weibull distribution; exp-beta distribution.
1 Introduction
The present work is an enhanced and extended version of the pioneering manuscript presented at Estância de São Pedro, São Paulo, Brazil, in the 18º SINAPE, 2008; see Barreto-Souza et al. (2008).
In many practical situations, the usual probability distributions do not provide an adequate fit. For example, if the data are asymmetric, the normal distribution will not be a good choice. For this reason, several methods of introducing a parameter to expand a family of distributions have been studied.
Marshall and Olkin (1997) introduced a new way to expand probability distributions and applied it to obtain a two-parameter extension of the exponential distribution, which can serve as a competitor to such commonly used two-parameter distributions as the Weibull, gamma and lognormal distributions. Furthermore, this method was used to obtain a three-parameter extension of the Weibull distribution. Moreover, Mudholkar et al. (1996) introduced a three-parameter alternative to the Weibull distribution that has the Weibull as a limiting distribution.
Some methods of introducing parameters into symmetric distributions have been studied in order to add skewness. For instance, Azzalini (1985) introduced and studied the well-known skew-normal distribution, which is obtained by adding a shape parameter to the normal distribution. Another symmetric distribution that was extended by adding a skewness parameter is the Student's t distribution, by Jones and Faddy (2003). Finally, Ma and Genton (2004) introduced a general class of skew-symmetric distributions, whereas Ferreira and Steel (2006) provide a general perspective on the introduction of skewness into symmetric distributions.
Recently, Jones (2004) introduced a class of distributions that adds two parameters to a reference distribution. Further, Jones and Pewsey (2009) advanced a four-parameter family that has both symmetric and skewed members and allows for tail weights that are both heavier and lighter than those of the generating distribution.
In this article, we introduce a new method to add a parameter to a reference distribution. The resulting distribution exhibits a remarkable reciprocal property. We study this parameter in detail and give a full description of its behaviour in the distribution. The augmented distribution has several connections with the reference distribution; for instance, the Kullback-Leibler divergence of the augmented distribution with respect to the original distribution is finite and depends only on the new parameter. Several other properties in this direction are also given. The inferential aspects of this distribution are studied in detail, two special cases are discussed, and a successful empirical application shows the flexibility of the new distribution and motivates its usage.
Special attention must be given to the fact that it is not obvious that this new distribution contains the reference distribution as a special case. We show that this is the case if we enlarge the parameter space, and also that this enlargement is well behaved, in the sense that all the standard inferential procedures work if this new value in the parameter space is taken to be the true value of the parameter.
The remainder of the article unfolds as follows. In Section 2 the new class of distributions is introduced, several properties are given, the new parameter is completely characterized, and the inferential aspects are discussed. Sections 3 and 4 deal with two special cases: the exp-Weibull and exp-beta distributions, respectively. In Section 5 an empirical application shows the usefulness of this distribution. Finally, Section 6 ends the article with some concluding remarks. The Appendix contains the proofs of the results presented in the article.
2 The new class of distributions
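The cdf of a random variable with truncated exponential distribution on the interval $(0,1)$ with parameter $\lambda$ is given by
(1) $H_\lambda(x) = \dfrac{1 - e^{-\lambda x}}{1 - e^{-\lambda}},$
where $x \in (0,1)$ and $\lambda > 0$. We now observe that $H_\lambda$ is also a cdf for $\lambda < 0$, and that
$\lim_{\lambda \to 0} H_\lambda(x) = x.$
Therefore, we extend the parameter space of the distribution above to the entire line:
$H_\lambda(x) = \dfrac{1 - e^{-\lambda x}}{1 - e^{-\lambda}}$ for $\lambda \neq 0$, and $H_0(x) = x$.
We now define the new class as follows. Let $G(x;\theta)$ be the cdf of a continuous or discrete random variable, with $\theta$ being the parameters related to $G$. Then the class of distributions exp-G, indexed by $\lambda$, is defined by
(2) $F(x) = \dfrac{1 - e^{-\lambda G(x;\theta)}}{1 - e^{-\lambda}}.$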
From now on, we will denote a random variable with cdf (2) by , where .
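As an illustration of definition (2), the following Python sketch evaluates the exp-G cdf for a user-supplied reference cdf. It assumes the form $F(x) = (1 - e^{-\lambda G(x)})/(1 - e^{-\lambda})$, with $F = G$ taken as the value at $\lambda = 0$. The names `exp_g_cdf` and `G_exponential` are hypothetical, and the standard exponential reference is chosen only for the example.

```python
import math

def exp_g_cdf(x, lam, G):
    """Assumed exp-G cdf: (1 - exp(-lam*G(x))) / (1 - exp(-lam)), with F = G at lam = 0."""
    u = G(x)
    if abs(lam) < 1e-12:
        return u  # limiting case lam -> 0 recovers the reference cdf
    # expm1(t) = exp(t) - 1 keeps precision for small |lam|; the two minus signs cancel
    return math.expm1(-lam * u) / math.expm1(-lam)

def G_exponential(x):
    """Reference cdf used for illustration only: standard exponential."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

if __name__ == "__main__":
    for lam in (-2.0, 0.0, 2.0):
        print(lam, [round(exp_g_cdf(x, lam, G_exponential), 4) for x in (0.5, 1.0, 2.0)])
```

For positive values of the parameter the resulting cdf lies above the reference cdf, and for negative values it lies below, which is one way to see the direction in which the new parameter shifts mass.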
If $G$ is the cdf of a continuous random variable, then the exp-G distribution is absolutely continuous for every $\lambda$, and its probability density function (pdf), which is the derivative of the cdf (2) with respect to $x$, is given by
(3) $f(x) = \dfrac{\lambda\, g(x)\, e^{-\lambda G(x)}}{1 - e^{-\lambda}},$
where is the pdf associated to the cdf . Let be a cdf of a discrete random variable taking values on the set , where , then the corresponding exp distribution is also discrete, takes values on the same set for every , and its probability function is given by
(4) 
where .
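If $G$ is an absolutely continuous cdf, then the hazard function of the exp-G distribution is given by
(5) $h(x) = \dfrac{\lambda\, g(x)}{1 - e^{-\lambda \bar{G}(x)}},$
where $\bar{G}(x) = 1 - G(x)$ is the survival function of a random variable with cdf $G$.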
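The density, hazard and quantile functions implied by the cdf (2) can be sketched as follows. This is a minimal illustration, assuming the cdf $F(x) = (1 - e^{-\lambda G(x)})/(1 - e^{-\lambda})$ with $\lambda \neq 0$ and a standard exponential reference distribution. All function names are hypothetical, and the inverse-transform sampler is a standard technique rather than a procedure taken from the paper.

```python
import math
import random

def G(x):      # reference cdf (illustrative): standard exponential
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def g(x):      # reference pdf
    return math.exp(-x) if x > 0 else 0.0

def G_inv(p):  # reference quantile function, clamped away from p = 1
    return -math.log1p(-min(p, 1.0 - 1e-16))

def cdf(x, lam):
    return math.expm1(-lam * G(x)) / math.expm1(-lam)

def pdf(x, lam):
    # lam * g(x) * exp(-lam*G(x)) / (1 - exp(-lam))
    return lam * g(x) * math.exp(-lam * G(x)) / (-math.expm1(-lam))

def hazard(x, lam):
    # pdf / (1 - cdf) simplifies to lam * g(x) / (1 - exp(-lam * (1 - G(x))))
    return lam * g(x) / (-math.expm1(-lam * (1.0 - G(x))))

def quantile(p, lam):
    # Solve F(x) = p for u = G(x), then invert the reference cdf
    u = -math.log1p(p * math.expm1(-lam)) / lam
    return G_inv(u)

def sample(n, lam, rng):
    # Inverse-transform sampling: apply the quantile function to uniform draws
    return [quantile(rng.random(), lam) for _ in range(n)]

if __name__ == "__main__":
    draws = sample(5, 1.5, random.Random(0))
    print([round(v, 4) for v in draws])
```

Because the quantile function only needs the reference quantile function, simulation from any member of the class is as cheap as simulation from the reference distribution.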
We now state several results regarding the relation between the exp-G and $G$ distributions, whose proofs can be found in the Appendix.
Proposition 2.1
Let and have distribution and exp distribution with parameter , respectively. Let also be the law of , and be the law of . Then,

and have the same support for all ;

if is continuous, singular or discrete, then is continuous, singular or discrete, respectively, for all ;

, that is, is absolutely continuous with respect to . Moreover, the Radon-Nikodym derivative of with respect to is, almost surely,

if is continuous, the relative entropy (Kullback-Leibler divergence) between and is

if , then , and, moreover, if
and if
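The claim that the Kullback-Leibler divergence depends only on the new parameter can be checked numerically. The sketch below assumes the cdf $F(x) = (1 - e^{-\lambda G(x)})/(1 - e^{-\lambda})$, so that the density ratio is $f/g = \lambda e^{-\lambda G(x)}/(1 - e^{-\lambda})$. The closed form $\log\{\lambda/(1 - e^{-\lambda})\} + \lambda/(e^{\lambda} - 1) - 1$ used here is derived from that assumption for this illustration and is not quoted from the paper. All names are hypothetical.

```python
import math

def kl_closed_form(lam):
    """KL(exp-G || G) derived from the assumed cdf; it involves lam only."""
    return math.log(lam / -math.expm1(-lam)) + lam / math.expm1(lam) - 1.0

def kl_numeric(lam, G, g, lo, hi, n=40000):
    """Midpoint-rule approximation of the integral of f * log(f / g) over [lo, hi]."""
    c = lam / -math.expm1(-lam)            # normalising constant lam / (1 - exp(-lam))
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        ratio = c * math.exp(-lam * G(x))  # density ratio f / g
        total += g(x) * ratio * math.log(ratio) * h
    return total

# Two unrelated reference distributions, chosen only for illustration
exp_cdf = lambda x: 1.0 - math.exp(-x)
exp_pdf = lambda x: math.exp(-x)
logis_cdf = lambda x: 1.0 / (1.0 + math.exp(-x))
logis_pdf = lambda x: logis_cdf(x) * (1.0 - logis_cdf(x))

if __name__ == "__main__":
    for lam in (-3.0, 0.5, 4.0):
        print(lam, kl_closed_form(lam),
              kl_numeric(lam, exp_cdf, exp_pdf, 0.0, 40.0),
              kl_numeric(lam, logis_cdf, logis_pdf, -40.0, 40.0))
```

Both reference distributions give the same value for a given parameter, as expected: the substitution $u = G(x)$ removes every trace of $G$ from the integral.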
We now give a characterization for our class of distributions through the Shannon entropy. This entropy was introduced by Shannon (1948) and, for a random variable with density $f$ with respect to a $\sigma$-finite measure $\mu$, usually the Lebesgue or counting measure, is given by
(6) $\mathbb{H}(f) = -\int f(x) \log f(x)\, d\mu(x).$
Jaynes (1957) introduced one of the most powerful techniques employed in the field of probability and statistics, called the maximum entropy method. This method is closely related to the Shannon entropy and considers a class of density functions
(7) 
where , , are absolutely integrable functions with respect to , and . In the continuous case, the maximum entropy principle suggests to derive the unknown density function of the random variable by the model that maximizes the Shannon entropy in (6), subject to the information constraints defined in the class .
The maximum entropy distribution is the density of the class , denoted by , which is obtained as the solution of the optimization problem
Jaynes (1957, p. 623) states that the maximum entropy distribution, obtained by the constrained maximization problem described above, “is the only unbiased assignment we can make; to use any other would amount to arbitrary assumption of information which by hypothesis we do not have.” It is the distribution that incorporates no external information other than that specified by the constraints.
In order to obtain a maximum entropy characterization for our class of distributions, we now derive suitable constraints. For this, the next result plays an important role. We will assume in Propositions 2.2 and 2.3 that the reference measure, $\mu$, is the Lebesgue measure, and that all the random variables involved are continuous.
Proposition 2.2
The next proposition shows that the class exp of distributions has maximum entropy in the class of all probability distributions specified by the constraints stated therein.
Proposition 2.3
2.1 $\lambda$ as a concentration parameter
We provide two asymptotic results for this class, obtained by making the parameter $\lambda$ tend to $\pm\infty$. These results will allow us to give an interpretation for this parameter. First, since $F(x) \to G(x)$ as $\lambda \to 0$, we have, trivially, that if $X_\lambda$ follows the exp-G distribution with parameter $\lambda$ and $Y \sim G$, then
$X_\lambda \xrightarrow{d} Y$ as $\lambda \to 0$, where '$\xrightarrow{d}$' stands for convergence in distribution.
Therefore, the family exp-G is well defined by (2) with $\lambda = 0$ taken as the limiting case. This fact plays an important role in our paper because it makes the family exp-G contain $G$ as a particular case. The following result is very important, since regular distributions in Statistics enjoy many desirable properties.
Proposition 2.4
If is a parametric regular probability distribution, with parametric space , then so is the exp distribution, with respect to the parametric space .
Proof.
The proof follows from a simple verification of the conditions given in Lehmann and Casella (2003).
The exp-G distribution may present very different behaviour for large absolute values of $\lambda$, thus showing that this is a rich class of distributions.
We now discuss further what happens when the absolute value of $\lambda$ is large. We begin by noting that $F(x)$ tends to one as $\lambda$ tends to infinity whenever $x$ is such that $G(x) > 0$, and is zero otherwise. Therefore, if $X_\lambda$ follows an exp-G distribution, where $G$ is any cdf, then
$X_\lambda \xrightarrow{v} \delta_{x_0}$ as $\lambda \to \infty$, where $x_0 = \inf\{x : G(x) > 0\}$, '$\xrightarrow{v}$' stands for vague convergence, and $\delta_{x_0}$ is the Dirac measure concentrated on $x_0$, that is, $\delta_{x_0}(A) = 1$ if $x_0 \in A$ and $\delta_{x_0}(A) = 0$ otherwise. Note that we needed to consider vague convergence instead of convergence in distribution to allow $x_0 = -\infty$. If $x_0 = -\infty$, then
$F \to \mathbf{1}$, where $\mathbf{1}$ is the function identically equal to one, which is not the cdf of a probability measure. However, we may interpret this case as a “probability measure” concentrated at $-\infty$, that is, if a random variable $X$ were to follow it, then $P(X \le x) = 1$ for all $x$.
We now obtain the asymptotic behaviour as $\lambda \to -\infty$. For this case, a simple calculus argument allows us to conclude that $F(x)$ tends to zero whenever $x$ is such that $G(x) < 1$, and is 1 otherwise. Therefore, if $X_\lambda$ follows an exp-G distribution, where $G$ is any cdf, then
$X_\lambda \xrightarrow{v} \delta_{x_1}$ as $\lambda \to -\infty$, where $x_1 = \sup\{x : G(x) < 1\}$. Note that we also needed to use vague convergence to include the case where $x_1 = \infty$. In this case
$F \to \mathbf{0}$, where $\mathbf{0}$ is the function identically equal to zero, which, again, is not the cdf of a probability measure. However, we may, accordingly, interpret this case as a “probability measure” concentrated at $\infty$, that is, if a random variable $X$ were to follow it, then $P(X \le x) = 0$ for all $x$.
We see from this result that the parameter $\lambda$ can be interpreted as a concentration parameter, because it moves the exp-G distribution towards a degenerate distribution at $x_0$ (if $x_0$ is finite) as it varies from zero to infinity, and towards a degenerate distribution at $x_1$ (if $x_1$ is finite) as it varies from zero to minus infinity. Furthermore, if $x_0$ equals minus infinity, the distribution moves towards the left side of the axis until the mass escapes entirely as $\lambda$ tends to infinity. Analogously, when $x_1$ equals infinity, the distribution moves towards the right side of the axis until the mass escapes entirely as $\lambda$ tends to minus infinity.
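The concentration behaviour described above can be observed directly on the cdf. The following sketch assumes the cdf $F(x) = (1 - e^{-\lambda G(x)})/(1 - e^{-\lambda})$ and a standard exponential reference distribution, whose support is bounded below by zero and unbounded above. The evaluation points and the function name are hypothetical choices made for the illustration.

```python
import math

def exp_g_cdf(u, lam):
    """Assumed exp-G cdf written as a function of u = G(x), for lam != 0."""
    return math.expm1(-lam * u) / math.expm1(-lam)

G = lambda x: 1.0 - math.exp(-x)   # standard exponential: support (0, infinity)

x_small, x_large = 0.05, 3.0
for lam in (1.0, 10.0, 100.0, 500.0):
    # lam -> +inf: F(x_small) -> 1, the mass piles up at the lower end of the support
    # lam -> -inf: F(x_large) -> 0, the mass escapes to the right (unbounded support)
    print(lam, exp_g_cdf(G(x_small), lam), exp_g_cdf(G(x_large), -lam))
```

Very large absolute values of the parameter would overflow `math.expm1`, so the sketch stays within a range that double precision can represent.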
2.2 Reciprocal property
This family of distributions enjoys a very interesting reciprocal property. We begin by introducing some notation, let , and , where is continuous. Therefore, we have that if , then . To see this, observe that, for ,
We also would like to remark that the reciprocal of has a corresponding exp distribution with , that is, has cdf and has cdf .
This means that whenever we study a special case of the exp-G distribution, we can easily study the reciprocal case. For instance, in this paper we study the exp-Weibull distribution, and from this result we also obtain several properties of the exp-Fréchet distribution.
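A numerical check of the reciprocal property can be sketched as follows. It assumes the cdf $F(x) = (1 - e^{-\lambda G(x)})/(1 - e^{-\lambda})$ and a positive continuous reference variable $Y$, so that $1/Y$ has cdf $G^*(x) = 1 - G(1/x)$. Under these assumptions the reciprocal of an exp-G variable with parameter $\lambda$ has the exp-$G^*$ cdf with parameter $-\lambda$. The function names and the exponential reference are hypothetical choices.

```python
import math

def exp_g_cdf(u, lam):
    """Assumed exp-G cdf as a function of u = G(x), for lam != 0."""
    return math.expm1(-lam * u) / math.expm1(-lam)

def G(x):       # cdf of Y: standard exponential (illustrative choice)
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def G_star(x):  # cdf of 1/Y: P(1/Y <= x) = 1 - G(1/x) for x > 0
    return math.exp(-1.0 / x) if x > 0 else 0.0

def cdf_of_reciprocal(x, lam):
    """P(1/X <= x) = 1 - F(1/x) when X > 0 is continuous with exp-G cdf F."""
    return 1.0 - exp_g_cdf(G(1.0 / x), lam)

if __name__ == "__main__":
    for lam in (-2.5, 0.7, 4.0):
        for x in (0.2, 1.0, 3.0):
            # The two columns agree: the sign of the parameter flips
            print(lam, x, cdf_of_reciprocal(x, lam), exp_g_cdf(G_star(x), -lam))
```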
2.3 Expansions, order statistics and moments
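We now give a useful expansion for the pdf (3). With this expansion, we can obtain mathematical properties such as ordinary moments, factorial moments and the moment generating function of the exp-G distribution from those of the $G$ distribution. Expanding the term $e^{-\lambda G(x)}$ in (3), it follows that
(9) $f(x) = \dfrac{\lambda}{1 - e^{-\lambda}}\, g(x) \sum_{k=0}^{\infty} \dfrac{(-\lambda)^k}{k!}\, G(x)^k.$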
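The expansion can be verified numerically by truncating the series. The sketch below assumes the pdf $f(x) = \lambda g(x) e^{-\lambda G(x)}/(1 - e^{-\lambda})$ and replaces the exponential by its power series in $G(x)$. The number of terms, the exponential reference distribution and the function names are hypothetical choices for the illustration.

```python
import math

G = lambda x: 1.0 - math.exp(-x)   # reference cdf (illustrative): standard exponential
g = lambda x: math.exp(-x)         # reference pdf

def pdf_exact(x, lam):
    return lam * g(x) * math.exp(-lam * G(x)) / (-math.expm1(-lam))

def pdf_series(x, lam, terms=40):
    """Truncated expansion: exp(-lam*G(x)) replaced by its power series in G(x)."""
    u, term, total = G(x), 1.0, 0.0
    for k in range(terms):
        total += term                # term equals (-lam*u)^k / k!
        term *= -lam * u / (k + 1)
    return lam * g(x) * total / (-math.expm1(-lam))

if __name__ == "__main__":
    print(pdf_exact(1.2, 3.0), pdf_series(1.2, 3.0))
```

Since $G(x)$ lies in $[0,1]$, the series converges quickly for moderate values of the parameter, and only a few dozen terms are needed to reach machine precision.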
If $G$ does not have a closed form, suppose
(10) 
where
is a sequence of real numbers and . Several distributions do not have a closed-form cdf and can be written in the form (10); these include, for instance, the normal, gamma and beta distributions.
For positive integer, we have
(11) 
where and for (see Gradshteyn and Ryzhik, 2000). Using (10) and (11) in (9), we obtain a useful expansion for (3) when $G$ does not have a closed form, given by
(12) 
Let now be a random sample with pdf in the form (3) and define the th order statistic. The pdf of the , say , is given by
(13)  
By using binomial expansion for the terms and in (13), it follows
(14)  
where denotes the pdf of a random variable with exp-G distribution. Therefore, the pdf of the order statistic can be written as a linear combination of pdfs of the form (3) and, hence, the mathematical properties of the order statistics can be obtained from the associated exp-G distribution.
We hardly need to emphasize the necessity and importance of moments in any statistical analysis, especially in applied work. Some of the most important features and characteristics of a distribution can be studied through moments, e.g., central tendency, dispersion, skewness and kurtosis. We now give general expressions for the moments of the exp-G family of distributions.
Let $X$ and $Y$ be random variables with exp-G and $G$ distributions, respectively. When $G$ has a closed form, a useful expression for the $r$th moment of the exp-G distribution follows from (9) and is given in terms of the probability weighted moments of $Y$:
(15) $E(X^r) = \dfrac{\lambda}{1 - e^{-\lambda}} \sum_{k=0}^{\infty} \dfrac{(-\lambda)^k}{k!}\, E\left[Y^r G(Y)^k\right].$
If $G$ does not have a closed form, from (12) we obtain the th moment of in terms of the moments of :
(16) 
In particular, if is a nonnegative integer, the moments of are given in terms of the ordinary moments of . Finally, with the result (14) the th moment of the th order statistic is given by
(17)  
where has an exp-G distribution. The expansions (9), (12) and (14) are the main results of this section and play an important role in this paper.
2.4 Estimation and inference
Let be a random variable with exp-G distribution, with . The log-density of with observed value is given by
and the associated score function is , where
with being the associated score function of the logdensity of a random variable with pdf .
From regularity conditions, we have and .
The information matrix is
where
For a random sample of size from and , the total log-likelihood is
where is the log-likelihood for the th observation () as given before. The total score function is , where for has the form given earlier, and the total information matrix is .
The maximum likelihood estimator (MLE) of is obtained numerically from the solution of the nonlinear system of equations . Under conditions that are fulfilled for the parameter in the interior of the parameter space but not on the boundary, the asymptotic distribution of
where ‘’ stands for the asymptotic distribution. The asymptotic multivariate normal distribution of can be used to construct approximate confidence regions for some parameters and for the hazard and survival functions. In fact, an asymptotic confidence interval for each parameter is given by
where denotes the th diagonal element of for and is the quantile of the standard normal distribution. The asymptotic normality is also useful for testing goodness of fit of the exp-G distribution and for comparing this distribution with some of its special submodels using one of the three well-known asymptotically equivalent test statistics, namely the likelihood ratio (LR), Rao score () and Wald () statistics. Consider the partition of the vector of parameters for the exp-Weibull distribution. The total score function and the total Fisher information matrix and its inverse
are assumed to be partitioned in the same way as . The LR statistic for testing the null hypothesis versus the alternative hypothesis is given by , where and denote the MLEs under the null and the alternative hypotheses, respectively. The statistic is asymptotically (as ) distributed as , where is the dimension of the vector of interest. The score statistic for testing is , where and are the components of and corresponding to evaluated at . The score statistic asymptotically has the distribution and has an advantage over the LR statistic, since it only requires estimation under the null hypothesis, although it does require the inverse Fisher information matrix. The Wald statistic for testing the null hypothesis is given by , where is the component of the inverse information matrix corresponding to evaluated at . Under , the Wald statistic also has an asymptotic distribution. The Wald and score statistics are widely used in practice, and our derivation of the information matrix will be very convenient in modelling with the exp-G distributions.
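The estimation and testing steps can be illustrated in a deliberately simplified setting. The sketch below assumes the cdf $F(x) = (1 - e^{-\lambda G(x)})/(1 - e^{-\lambda})$ and, unlike the general case treated in the text, takes the parameters of $G$ as known, so that only $\lambda$ is estimated from the transformed values $u_i = G(x_i)$. Under these assumptions the score equation for $\lambda$ reduces to $\bar{u} = 1/\lambda - 1/(e^{\lambda} - 1)$, which is solved here by bisection. The simulated data, the sample size and all function names are hypothetical.

```python
import math
import random

def mean_u(lam):
    """E[G(X)] under the assumed model: 1/lam - 1/(exp(lam) - 1); equals 1/2 at lam = 0."""
    if abs(lam) < 1e-4:
        return 0.5 - lam / 12.0
    return 1.0 / lam - 1.0 / math.expm1(lam)

def info(lam):
    """Per-observation Fisher information for lam when theta is known."""
    if abs(lam) < 1e-3:
        return 1.0 / 12.0
    return 1.0 / lam**2 - math.exp(lam) / math.expm1(lam) ** 2

def mle_lambda(us, lo=-500.0, hi=500.0):
    """Solve mean(us) = mean_u(lam) by bisection (mean_u is decreasing in lam)."""
    target = sum(us) / len(us)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_u(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def loglik_gain(lam, us):
    """l(lam) - l(0): the reference log-density cancels because theta is held fixed."""
    if abs(lam) < 1e-12:
        return 0.0
    return len(us) * math.log(lam / -math.expm1(-lam)) - lam * sum(us)

rng = random.Random(1)
true_lam, n = 3.0, 5000
# u = G(x) for x drawn from the model, generated by inverting the assumed cdf
us = [-math.log1p(rng.random() * math.expm1(-true_lam)) / true_lam for _ in range(n)]

lam_hat = mle_lambda(us)
se = 1.0 / math.sqrt(n * info(lam_hat))
ci = (lam_hat - 1.96 * se, lam_hat + 1.96 * se)   # Wald-type 95% interval
lr = max(2.0 * loglik_gain(lam_hat, us), 0.0)     # LR statistic for H0: lam = 0
p_value = math.erfc(math.sqrt(lr / 2.0))          # chi-square(1) upper tail
print(lam_hat, ci, lr, p_value)
```

When the parameters of $G$ are unknown, the same likelihood would have to be maximised jointly over all parameters with a numerical optimiser, as described in the text; the one-dimensional case is shown only because it keeps the score equation and the interval transparent.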
2.5 Modified profile likelihood estimator
Since is a parameter added to some distribution, it can be seen as a nuisance parameter. With this in mind, we will advance a modified profile estimator for . From the last subsection, we have that
with
Therefore, if , that belongs to , does not vanish for all values in some open neighbourhood of the true value of , let
with
where stands for the row vector containing the diagonal elements of the Hessian matrix of , , and stands for the column vector .
We, therefore, obtain the modified profile likelihood function:
The modified profile estimator for can be obtained by maximizing . Let be the estimating equation given by
one may also obtain the profile likelihood estimator by solving the equation .
2.6 Interest case $\lambda = 0$
We now discuss estimation and inference when $\lambda = 0$. It is very important to discuss this case because we are interested in testing the hypotheses $H_0: \lambda = 0$ versus $H_1: \lambda \neq 0$, i.e., in testing whether the exp-G fit is significantly better than the $G$ fit. The next result plays an important role in this paper.
Theorem 2.5
Let and be the cdf and pdf defined by (2) and (3), respectively. The following conditions are true:

If is continuous then uniformly when ;

uniformly when , consequently , where is the loglikelihood associated to ;

and , when ;

, and , when , with being the information matrix with respect to ;

If is regular and , then

If $G$ is regular and $\lambda = 0$, then the likelihood ratio, Wald and score statistics have asymptotic null distribution $\chi^2_q$, where $q$ is the number of parameters estimated under the alternative hypothesis minus the number of parameters estimated under the null hypothesis.
3 The exp-Weibull distribution
We now turn to the exp-G class of distributions when $G$ is the cdf of the Weibull distribution; we call this class exp-Weibull. More precisely, to obtain the exp-Weibull distribution we put in (2) the cdf of the Weibull distribution , where , and . Therefore, the cdf of the exp-Weibull distribution is given by
From the general expressions (3) and (5) we obtain that the pdf and hazard functions are given by
(18) 
and
respectively.
We now illustrate the flexibility of this class of distributions by presenting some plots of both the pdf and hazard functions. Figure 1 shows plots of the pdf of the exp-Weibull distribution for some values of and , and for . We note that as the value of $\lambda$ increases the pdf becomes more ‘peaked’. Figure 2 contains plots of the hazard function of the exp-Weibull distribution for different values of and and . We note that the behaviour of the hazard function of the Weibull distribution is close to that of the curves with , and that, as the value of $\lambda$ increases, the hazard function of the exp-Weibull distribution becomes very different from that of the Weibull distribution, showing that as the value of $\lambda$ gets larger the exp-Weibull distribution “moves away” from the Weibull distribution and gets closer to the Dirac mass at zero, as remarked at the end of the last section.
3.1 Order statistics and moments
The pdf of the th order statistic of a random sample from exp distribution is given by
We will now obtain series representations for the moments of the exp-Weibull distribution and of its order statistics. To this end, let be a random variable following an exp-Weibull distribution with parameters , and . From now on we will use the notation to indicate this fact.
The probability weighted moment of a random variable following the Weibull distribution with parameter vector can be written as . Therefore, from (15) it follows that the th moment of is
(19) 
We now give a simpler alternative expression to (19). The th moment of is
Now, expanding in a Taylor series we get
where follows the Weibull distribution with parameters and , the interchange between the series and the integral being justified by Fubini’s theorem together with the fact that the integrand is positive. Hence, the th moment of an exp-Weibull distribution can be written as
(20) 
Figure 3 shows the skewness and kurtosis of the exp-Weibull distribution, obtained by applying the moment formula above, for and some values of as a function of . We note from (20) that all moments of the exp-Weibull distribution tend to zero as $\lambda$ increases to infinity, which is a remarkable fact. So, as can be seen in Figure 3, as $\lambda$ increases, the skewness tends to zero, as does the kurtosis, once more reflecting the expected behaviour of the limiting distribution as $\lambda \to \infty$.
An expression for the th moment of the th order statistic of the expWeibull distribution, say , follows from (17) and (20):
(21)  
Expressions (19) and (21) show the importance of the expansions given in Subsection 2.3. Furthermore, result (20) shows that alternative expressions to (15) and (16) can be obtained depending on the distribution.
3.2 Order statistics and moments of the exp-Fréchet distribution
In this brief subsection we use the reciprocal property of the exp-G distributions to obtain expressions for the moments and order statistics of the exp-Fréchet distribution.
3.3 Score function and information matrix
Let be the parameter vector and a random variable with distribution. The log-density for the random variable with observed value is given by