# A New Generalized Kumaraswamy Distribution

## Abstract

A new five-parameter continuous distribution which generalizes the Kumaraswamy and the beta distributions as well as some other well-known distributions is proposed and studied. The model has as special cases new four- and three-parameter distributions on the standard unit interval. Moments, mean deviations, Rényi’s entropy and the moments of order statistics are obtained for the new generalized Kumaraswamy distribution. The score function is given and estimation is performed by maximum likelihood. Hypothesis testing is also discussed. A data set is used to illustrate an application of the proposed distribution.

Keywords: Beta distribution; Continuous proportions; Generalized Kumaraswamy distribution; Kumaraswamy distribution; Maximum likelihood; McDonald Distribution; Moments.

## 1 Introduction

We introduce a new five-parameter distribution, the so-called generalized Kumaraswamy (GKw) distribution, which contains some well-known distributions as special sub-models, for example the Kumaraswamy (Kw) and beta distributions. The GKw distribution allows us to define new three- and four-parameter generalizations of such distributions. The new model can be used in a variety of problems for modeling continuous proportions data due to its flexibility in accommodating different forms of density functions.

The GKw distribution comes from the following idea. Wahed (2006) and Ferreira and Steel (2006) demonstrated that any parametric family of distributions can be incorporated into larger families through an application of the probability integral transform. Specifically, let $G_1(x;\boldsymbol{\omega})$ be a cumulative distribution function (cdf) with corresponding probability density function (pdf) $g_1(x;\boldsymbol{\omega})$, and let $g_2(t;\boldsymbol{\tau})$ be a pdf having support on the standard unit interval. Here, $\boldsymbol{\omega}$ and $\boldsymbol{\tau}$ represent scalar or vector parameters. Now let

$$F(x;\boldsymbol{\omega},\boldsymbol{\tau})=\int_0^{G_1(x;\boldsymbol{\omega})} g_2(t;\boldsymbol{\tau})\,dt. \tag{1}$$

Note that $F(x;\boldsymbol{\omega},\boldsymbol{\tau})$ is a cdf and that $F(x;\boldsymbol{\omega},\boldsymbol{\tau})$ and $G_1(x;\boldsymbol{\omega})$ have the same support. The pdf corresponding to (1) is

$$f(x;\boldsymbol{\omega},\boldsymbol{\tau})=g_2\big(G_1(x;\boldsymbol{\omega});\boldsymbol{\tau}\big)\,g_1(x;\boldsymbol{\omega}). \tag{2}$$

This mechanism for defining generalized distributions from a parametric cdf $G_1(x;\boldsymbol{\omega})$ is particularly attractive when $G_1(x;\boldsymbol{\omega})$ has a closed-form expression.

The beta density is often used in place of $g_2(t;\boldsymbol{\tau})$. However, different choices for $G_1(x;\boldsymbol{\omega})$ have been considered in the literature. Eugene et al. (2002) defined the beta normal distribution by taking $G_1(x;\boldsymbol{\omega})$ to be the cdf of the standard normal distribution and derived some of its first moments. More general expressions for these moments were obtained by Gupta and Nadarajah (2004a). Nadarajah and Kotz (2004) introduced the beta Gumbel (BG) distribution by taking $G_1(x;\boldsymbol{\omega})$ to be the cdf of the Gumbel distribution; they provided closed-form expressions for the moments, gave the asymptotic distribution of the extreme order statistics and discussed maximum likelihood estimation. Nadarajah and Gupta (2004) introduced the beta Fréchet (BF) distribution by taking $G_1(x;\boldsymbol{\omega})$ to be the cdf of the Fréchet distribution, derived the analytical shapes of its density and hazard rate functions and calculated the asymptotic distribution of its extreme order statistics. Also, Nadarajah and Kotz (2006) dealt with the beta exponential (BE) distribution and obtained its moment generating function, its first four cumulants and the asymptotic distribution of its extreme order statistics, and discussed maximum likelihood estimation.

The starting point of our proposal is the Kumaraswamy (Kw) distribution (Kumaraswamy, 1980; see also Jones, 2009). It is very similar to the beta distribution but has a closed-form cdf given by

$$G_1(x;\boldsymbol{\omega})=1-(1-x^{\alpha})^{\beta},\quad 0<x<1, \tag{3}$$

where $\boldsymbol{\omega}=(\alpha,\beta)^{\top}$, $\alpha>0$ and $\beta>0$. Its pdf becomes

$$g_1(x;\boldsymbol{\omega})=\alpha\beta\,x^{\alpha-1}(1-x^{\alpha})^{\beta-1},\quad 0<x<1. \tag{4}$$

If $X$ is a random variable with pdf (4), we write $X\sim\mathrm{Kw}(\alpha,\beta)$. The Kw distribution was originally conceived to model hydrological phenomena and has been used for this and also for other purposes. See, for example, Sundar and Subbiah (1989), Fletcher and Ponnambalam (1996), Seifi et al. (2000), Ganji et al. (2006), Sanchez et al. (2007) and Courard-Hauri (2007).

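In the present paper, we propose a generalization of the Kw distribution by taking $G_1(x;\boldsymbol{\omega})$ as cdf (3) and $g_2(t;\boldsymbol{\tau})$ as the standard generalized beta density of first kind (McDonald, 1984), with pdf given by

$$g_2(x;\boldsymbol{\tau})=\frac{\lambda\,x^{\lambda\gamma-1}(1-x^{\lambda})^{\eta-1}}{B(\gamma,\eta)},\quad 0<x<1, \tag{5}$$

where $\boldsymbol{\tau}=(\gamma,\eta,\lambda)^{\top}$, $\gamma>0$, $\eta>0$ and $\lambda>0$, $B(\gamma,\eta)=\Gamma(\gamma)\Gamma(\eta)/\Gamma(\gamma+\eta)$ is the beta function and $\Gamma(\cdot)$ is the gamma function. If $X$ is a random variable with density function (5), we write $X\sim\mathrm{GB1}(\gamma,\eta,\lambda)$. Note that if $X\sim\mathrm{GB1}(\gamma,\eta,\lambda)$ then $X^{\lambda}\sim\mathrm{B}(\gamma,\eta)$, i.e., $X^{\lambda}$ has a beta distribution with parameters $\gamma$ and $\eta$.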

The article is organized as follows. In Section 2, we define the GKw distribution, plot its density function for selected parameter values and provide some of its mathematical properties. In Section 3, we present some special sub-models. In Section 4, we obtain expansions for the distribution and density functions. We demonstrate that the GKw density can be expressed as a mixture of Kw and power densities. In Section 5, we give general formulae for the moments and the moment generating function. Section 6 provides an expansion for the quantile function. Section 7 is devoted to mean deviations about the mean and the median and Bonferroni and Lorenz curves. In Section 8, we derive the density function of the order statistics and their moments. The Rényi entropy is calculated in Section 9. In Section 10, we discuss maximum likelihood estimation and determine the elements of the observed information matrix. Section 11 provides an application to a real data set. Section 12 ends the paper with some conclusions.

## 2 The New Distribution

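We obtain an appropriate generalization of the Kw distribution by taking $G_1(x;\boldsymbol{\omega})$ as the two-parameter Kw cdf (3) with associated pdf (4). For $g_2(t;\boldsymbol{\tau})$, we consider the three-parameter generalized beta density of first kind given by (5). To avoid non-identifiability problems, we allow $\eta$ to vary on $[1,\infty)$ only. We then write $\eta=\delta+1$, so that $\delta$ varies on $[0,\infty)$. Using (1), the cdf of the GKw distribution, with five parameters $\alpha$, $\beta$, $\gamma$, $\delta$ and $\lambda$, is defined by

$$F(x;\boldsymbol{\theta})=\frac{\lambda}{B(\gamma,\delta+1)}\int_0^{1-(1-x^{\alpha})^{\beta}} y^{\gamma\lambda-1}(1-y^{\lambda})^{\delta}\,dy, \tag{6}$$

where $\boldsymbol{\theta}=(\alpha,\beta,\gamma,\delta,\lambda)^{\top}$ is the parameter vector.

The pdf corresponding to (6) is straightforwardly obtained from (2) as

$$f(x;\boldsymbol{\theta})=\frac{\lambda\alpha\beta\,x^{\alpha-1}}{B(\gamma,\delta+1)}(1-x^{\alpha})^{\beta-1}\big[1-(1-x^{\alpha})^{\beta}\big]^{\gamma\lambda-1}\Big\{1-\big[1-(1-x^{\alpha})^{\beta}\big]^{\lambda}\Big\}^{\delta},\quad 0<x<1. \tag{7}$$

Based on the above construction, the new distribution can also be referred to as the McDonald Kumaraswamy (McKw) distribution. If $X$ is a random variable with density function (7), we write $X\sim\mathrm{GKw}(\alpha,\beta,\gamma,\delta,\lambda)$.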
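As an illustration only, density (7) can be coded directly and checked to integrate to one. The sketch below uses only the Python standard library; the function names (`gkw_pdf`, `midpoint_integral`) and the parameter values are assumptions chosen for the example, not part of the paper.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def gkw_pdf(x, alpha, beta, gamma, delta, lam):
    """GKw density (7) for 0 < x < 1; returns 0 outside the unit interval."""
    if not 0.0 < x < 1.0:
        return 0.0
    g = 1.0 - (1.0 - x**alpha) ** beta            # Kw cdf G1(x), eq. (3)
    const = lam * alpha * beta / math.exp(log_beta(gamma, delta + 1.0))
    return (const * x ** (alpha - 1.0) * (1.0 - x**alpha) ** (beta - 1.0)
            * g ** (gamma * lam - 1.0) * (1.0 - g**lam) ** delta)

def midpoint_integral(func, lower, upper, steps=20000):
    """Plain midpoint rule; adequate here because the integrand is bounded."""
    h = (upper - lower) / steps
    return h * sum(func(lower + (k + 0.5) * h) for k in range(steps))

# Hypothetical parameter values, chosen so that the density is bounded on (0, 1).
total = midpoint_integral(lambda x: gkw_pdf(x, 2.0, 3.0, 1.5, 2.0, 1.2), 0.0, 1.0)
```

For parameter values that make the density unbounded near 0 or 1, a quadrature rule that handles endpoint singularities would be a safer choice than the midpoint rule.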

An alternative, but related, motivation for (6) comes through the beta construction (Eugene et al., 2002). We can easily show that

$$F(x;\boldsymbol{\theta})=I_{[1-(1-x^{\alpha})^{\beta}]^{\lambda}}(\gamma,\delta+1), \tag{8}$$

where $I_x(a,b)=B(a,b)^{-1}\int_0^x t^{a-1}(1-t)^{b-1}\,dt$ denotes the incomplete beta function ratio. Thus, the GKw distribution can arise by applying the beta construction to a new distribution, namely the exponentiated Kumaraswamy (EKw) distribution, to yield (7). It can therefore also be called the beta exponentiated Kumaraswamy (BEKw) distribution, i.e., a beta type distribution defined by the baseline cumulative function $G(x)=[1-(1-x^{\alpha})^{\beta}]^{\lambda}$.

Inverting the transformation in (8), we can generate $X$ following the GKw distribution by $X=[1-(1-V^{1/\lambda})^{1/\beta}]^{1/\alpha}$, where $V$ is a beta random variable with parameters $\gamma$ and $\delta+1$. This scheme is useful because of the existence of fast generators for beta random variables. Figure 1 plots some of the possible shapes of the density function (7). The GKw density function can take various forms (bathtub, J, inverted J, monotonically increasing or decreasing, and upside-down bathtub), depending on the parameter values.
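The generation scheme just described can be sketched as follows. Solving $V=[1-(1-X^{\alpha})^{\beta}]^{\lambda}$ for $X$ gives $X=[1-(1-V^{1/\lambda})^{1/\beta}]^{1/\alpha}$ with $V$ beta distributed with parameters $\gamma$ and $\delta+1$. The function names, the seed and the parameter values are assumptions for illustration. The check uses the $\gamma=1$ sub-model, whose cdf has the closed form $1-\{1-[1-(1-x^{\alpha})^{\beta}]^{\lambda}\}^{\delta+1}$.

```python
import random

def gkw_sample(alpha, beta, gamma, delta, lam, rng=random):
    """Draw one GKw variate by inverting (8) with V ~ Beta(gamma, delta + 1)."""
    v = rng.betavariate(gamma, delta + 1.0)
    return (1.0 - (1.0 - v ** (1.0 / lam)) ** (1.0 / beta)) ** (1.0 / alpha)

def kwkw_cdf(x, alpha, beta, delta, lam):
    """Closed-form cdf of the gamma = 1 sub-model, used only as a reference."""
    g = 1.0 - (1.0 - x**alpha) ** beta
    return 1.0 - (1.0 - g**lam) ** (delta + 1.0)

rng = random.Random(12345)                      # fixed seed for reproducibility
params = dict(alpha=2.0, beta=3.0, gamma=1.0, delta=1.5, lam=0.8)
draws = [gkw_sample(rng=rng, **params) for _ in range(20000)]

# Compare the empirical proportion below 0.5 with the exact cdf value.
empirical = sum(d <= 0.5 for d in draws) / len(draws)
exact = kwkw_cdf(0.5, 2.0, 3.0, 1.5, 0.8)
```

With 20000 draws the two values should agree to roughly two decimal places; the comparison is a sanity check on the inversion, not a formal goodness-of-fit test.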

We now provide two properties of the GKw distribution.

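Proposition 1. If $X\sim\mathrm{GKw}(1,\beta,\gamma,\delta,\lambda)$, then $Y=X^{1/\alpha}\sim\mathrm{GKw}(\alpha,\beta,\gamma,\delta,\lambda)$ for $\alpha>0$.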

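Proposition 2. Let $X\sim\mathrm{GKw}(\alpha,\beta,\gamma,\delta,\lambda)$ and $Y=-\log(X)$. Then, the pdf of $Y$ is given by

$$f(y;\boldsymbol{\theta})=\frac{\lambda\alpha\beta}{B(\gamma,\delta+1)}\,e^{-\alpha y}(1-e^{-\alpha y})^{\beta-1}\big[1-(1-e^{-\alpha y})^{\beta}\big]^{\gamma\lambda-1}\Big\{1-\big[1-(1-e^{-\alpha y})^{\beta}\big]^{\lambda}\Big\}^{\delta},\quad y>0. \tag{9}$$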

We call (9) the log-generalized Kumaraswamy (LGKw) distribution.

## 3 Special Sub-Models

The GKw distribution is very flexible and has the following distributions as special sub-models.

### The Kumaraswamy distribution (Kw)

If $\lambda=\gamma=1$ and $\delta=0$, the GKw distribution reduces to the Kw distribution with parameters $\alpha$ and $\beta$, and cdf and pdf given by (3) and (4), respectively.

### The McDonald distribution (Mc)

For $\alpha=\beta=1$, we obtain the Mc distribution (5) with parameters $\gamma$, $\delta+1$ and $\lambda$.

### The beta distribution

If $\alpha=\beta=\lambda=1$, the GKw distribution reduces to the beta distribution with parameters $\gamma$ and $\delta+1$.

### The beta Kumaraswamy distribution (BKw)

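If $\lambda=1$, (7) yields

$$f(x;\alpha,\beta,\gamma,\delta,1)=\frac{\alpha\beta\,x^{\alpha-1}}{B(\gamma,\delta+1)}(1-x^{\alpha})^{\beta(\delta+1)-1}\big[1-(1-x^{\alpha})^{\beta}\big]^{\gamma-1},\quad 0<x<1.$$

This distribution can be viewed as a four-parameter generalization of the Kw distribution. We refer to it as the BKw distribution since its pdf can be obtained from (2) by setting $G_1(x;\boldsymbol{\omega})$ to be the $\mathrm{Kw}(\alpha,\beta)$ cdf and $g_2(t;\boldsymbol{\tau})$ to be the beta density function with parameters $\gamma$ and $\delta+1$.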

### The Kumaraswamy-Kumaraswamy distribution (KwKw)

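For $\gamma=1$, (7) reduces to (for $0<x<1$)

$$f(x;\alpha,\beta,1,\delta,\lambda)=\lambda\alpha\beta(\delta+1)\,x^{\alpha-1}(1-x^{\alpha})^{\beta-1}\big[1-(1-x^{\alpha})^{\beta}\big]^{\lambda-1}\Big\{1-\big[1-(1-x^{\alpha})^{\beta}\big]^{\lambda}\Big\}^{\delta}.$$

Again, this distribution is a four-parameter generalization of the Kw distribution. It can be obtained from (2) by replacing $G_1(x;\boldsymbol{\omega})$ by the cdf of the $\mathrm{Kw}(\alpha,\beta)$ distribution and $g_2(t;\boldsymbol{\tau})$ by the pdf of the $\mathrm{Kw}(\lambda,\delta+1)$ distribution. Its cdf has a closed form given by

$$F(x;\alpha,\beta,1,\delta,\lambda)=1-\Big\{1-\big[1-(1-x^{\alpha})^{\beta}\big]^{\lambda}\Big\}^{\delta+1}.$$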

### The EKw distribution

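If $\gamma=1$ and $\delta=0$, (7) gives

$$f(x;\alpha,\beta,1,0,\lambda)=\lambda\alpha\beta\,x^{\alpha-1}(1-x^{\alpha})^{\beta-1}\big[1-(1-x^{\alpha})^{\beta}\big]^{\lambda-1},\quad 0<x<1.$$

It can be easily seen that the associated cdf can be written as

$$F(x;\alpha,\beta,1,0,\lambda)=G_1(x;\alpha,\beta)^{\lambda},$$

where $G_1(x;\alpha,\beta)$ is the cdf of the $\mathrm{Kw}(\alpha,\beta)$ distribution. This distribution was defined before as the EKw distribution, which is a new three-parameter generalization of the Kw distribution.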

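### The beta power distribution (BP)

For $\alpha=1$ and $\beta=1$, (7) reduces to

$$f(x;1,1,\gamma,\delta,\lambda)=\frac{\lambda}{B(\gamma,\delta+1)}\,x^{\gamma\lambda-1}(1-x^{\lambda})^{\delta},\quad 0<x<1.$$

This density function can be obtained from (2) if $G_1(x;\boldsymbol{\omega})=x^{\lambda}$ and $g_2(t;\boldsymbol{\tau})$ is taken as the beta density with parameters $\gamma$ and $\delta+1$. We call this distribution the BP distribution.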
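The reductions listed above are easy to confirm numerically. The following sketch evaluates density (7) at a few points and compares it with the Kw density (4), the beta density and the EKw density under the parameter restrictions $\lambda=\gamma=1,\ \delta=0$; $\alpha=\beta=\lambda=1$; and $\gamma=1,\ \delta=0$, respectively. All function names and numerical values are assumptions made for the example.

```python
import math

def beta_fn(a, b):
    """Beta function B(a, b) via log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def gkw_pdf(x, alpha, beta, gamma, delta, lam):
    """GKw density (7) on 0 < x < 1."""
    g = 1.0 - (1.0 - x**alpha) ** beta
    return (lam * alpha * beta / beta_fn(gamma, delta + 1.0)
            * x ** (alpha - 1.0) * (1.0 - x**alpha) ** (beta - 1.0)
            * g ** (gamma * lam - 1.0) * (1.0 - g**lam) ** delta)

def kw_pdf(x, alpha, beta):
    """Kw density (4)."""
    return alpha * beta * x ** (alpha - 1.0) * (1.0 - x**alpha) ** (beta - 1.0)

def beta_pdf(x, a, b):
    """Standard beta density with parameters a and b."""
    return x ** (a - 1.0) * (1.0 - x) ** (b - 1.0) / beta_fn(a, b)

def ekw_pdf(x, alpha, beta, lam):
    """EKw density: derivative of G1(x)^lam."""
    g = 1.0 - (1.0 - x**alpha) ** beta
    return lam * kw_pdf(x, alpha, beta) * g ** (lam - 1.0)

grid = [0.1, 0.3, 0.5, 0.7, 0.9]
max_gap = max(
    max(abs(gkw_pdf(x, 2.0, 3.0, 1.0, 0.0, 1.0) - kw_pdf(x, 2.0, 3.0)),
        abs(gkw_pdf(x, 1.0, 1.0, 2.5, 1.5, 1.0) - beta_pdf(x, 2.5, 2.5)),
        abs(gkw_pdf(x, 2.0, 3.0, 1.0, 0.0, 1.7) - ekw_pdf(x, 2.0, 3.0, 1.7)))
    for x in grid)
```

`max_gap` should be at the level of floating-point rounding error if the three reductions hold.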

The LGKw distribution (9) contains as special sub-models the following distributions.

### The beta generalized exponential distribution (BGE)

For $\lambda=1$, (9) gives

$$f(y;\alpha,\beta,\gamma,\delta,1)=\frac{\alpha\beta}{B(\gamma,\delta+1)}\,e^{-\alpha y}(1-e^{-\alpha y})^{\beta(\delta+1)-1}\big[1-(1-e^{-\alpha y})^{\beta}\big]^{\gamma-1},\quad y>0, \tag{10}$$

which is the density function of the BGE distribution introduced by Barreto-Souza et al. (2010). If $\delta=0$ and $\gamma=1$ in addition to $\lambda=1$, the LGKw distribution becomes the generalized exponential distribution (Gupta and Kundu, 1999). If $\beta=1$ and $\gamma=1$, (10) coincides with the exponential distribution with mean $1/[\alpha(\delta+1)]$.

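### The beta exponential distribution (BE)

For $\beta=1$ and $\lambda=1$, (9) reduces to

$$f(y;\alpha,1,\gamma,\delta,1)=\frac{\alpha}{B(\gamma,\delta+1)}\,e^{-\alpha\gamma y}(1-e^{-\alpha y})^{\delta},\quad y>0,$$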

which is the density of the BE distribution introduced by Nadarajah and Kotz (2006).

## 4 Expansions for the Distribution and Density Functions

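We now give simple expansions for the cdf of the GKw distribution. If $|z|<1$ and $\delta>0$ is a non-integer real number, we have

$$(1-z)^{\delta}=\sum_{j=0}^{\infty}(-1)^{j}(\delta)_j\,z^{j}, \tag{11}$$

where $(\delta)_j=\delta(\delta-1)\cdots(\delta-j+1)/j!$ (for $j=0,1,\ldots$) is the descending factorial divided by $j!$, i.e., the generalized binomial coefficient $\binom{\delta}{j}$; this normalization is what makes (11) the binomial series. Clearly, if $\delta$ is a positive integer, the series stops at $j=\delta$. Using the series expansion (11) and the representation for the GKw cdf (6), we obtain

$$F(x;\boldsymbol{\theta})=\int_0^{G_1(x;\alpha,\beta)}\frac{\lambda}{B(\gamma,\delta+1)}\,y^{\gamma\lambda-1}\sum_{j=0}^{\infty}(\delta)_j(-1)^{j}y^{\lambda j}\,dy$$

if $\delta$ is a non-integer real number. By simple integration, we have

$$F(x;\boldsymbol{\theta})=\sum_{j=0}^{\infty}\omega_j\,[G_1(x;\alpha,\beta)]^{\lambda(\gamma+j)}, \tag{12}$$

where

$$\omega_j=\frac{(-1)^{j}(\delta)_j}{(\gamma+j)\,B(\gamma,\delta+1)}, \tag{13}$$

and $G_1(x;\alpha,\beta)$ is given by (3). If $\delta$ is a positive integer, the sum stops at $j=\delta$.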
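For a positive integer $\delta$ the expansion (12) is a finite sum, which makes it easy to check against a direct numerical evaluation of the integral in (6). The sketch below assumes that $(\delta)_j$ in (11) and (13) is the binomial coefficient $\binom{\delta}{j}$, which is the normalization under which (11) is the binomial theorem. Function names and parameter values are illustrative assumptions.

```python
import math

def beta_fn(a, b):
    """Beta function B(a, b) via log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def gkw_cdf_series(x, alpha, beta, gamma, delta, lam):
    """Finite form of (12)-(13); delta must be a non-negative integer here."""
    g = 1.0 - (1.0 - x**alpha) ** beta            # Kw cdf G1(x), eq. (3)
    norm = beta_fn(gamma, delta + 1)
    total = 0.0
    for j in range(delta + 1):
        omega_j = (-1) ** j * math.comb(delta, j) / ((gamma + j) * norm)
        total += omega_j * g ** (lam * (gamma + j))
    return total

def gkw_cdf_numeric(x, alpha, beta, gamma, delta, lam, steps=20000):
    """Midpoint-rule evaluation of the integral in (6)."""
    upper = 1.0 - (1.0 - x**alpha) ** beta
    h = upper / steps
    area = h * sum(((k + 0.5) * h) ** (gamma * lam - 1.0)
                   * (1.0 - ((k + 0.5) * h) ** lam) ** delta
                   for k in range(steps))
    return lam / beta_fn(gamma, delta + 1) * area

series_value = gkw_cdf_series(0.6, 2.0, 3.0, 1.5, 2, 1.2)
numeric_value = gkw_cdf_numeric(0.6, 2.0, 3.0, 1.5, 2, 1.2)
```

For non-integer $\delta$ the same loop would have to be truncated at some finite $j$, and the truncation error would need to be monitored separately.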

The moments of the GKw distribution do not have closed form. In order to obtain expansions for these moments, it is convenient to develop expansions for its density function. From (12), we can write

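$$f(x;\boldsymbol{\theta})=\sum_{j=0}^{\infty}\omega_j\,\lambda(\gamma+j)\,g_1(x;\alpha,\beta)\,[G_1(x;\alpha,\beta)]^{\lambda(\gamma+j)-1}.$$

If we replace $G_1(x;\alpha,\beta)$ by (3), expand $[1-(1-x^{\alpha})^{\beta}]^{\lambda(\gamma+j)-1}$ using (11), and use (4) in the form $g_1(x;\alpha,\beta)(1-x^{\alpha})^{\beta k}=g_1(x;\alpha,(k+1)\beta)/(k+1)$, we obtain

$$f(x;\boldsymbol{\theta})=\sum_{k=0}^{\infty}p_k\,g_1(x;\alpha,(k+1)\beta), \tag{14}$$

where

$$p_k=\frac{(-1)^{k}\lambda}{k+1}\sum_{j=0}^{\infty}\omega_j\,(\gamma+j)\,\big(\lambda(\gamma+j)-1\big)_k .$$

Here, $g_1(x;\alpha,(k+1)\beta)$ denotes the Kw density function with parameters $\alpha$ and $(k+1)\beta$. Further, we can express (14) as a mixture of power densities, since the Kw density (4) can also be written as a mixture of power densities. After some algebra, we obtain

$$f(x;\boldsymbol{\theta})=\sum_{i=0}^{\infty}v_i\,x^{(i+1)\alpha-1}, \tag{15}$$

where

$$v_i=(-1)^{i}\alpha\beta\sum_{k=0}^{\infty}(k+1)\big((k+1)\beta-1\big)_i\,p_k .$$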

Equations (14) and (15) are the main results of this section. They can provide some mathematical properties of the GKw distribution from the properties of the Kw and power distributions, respectively.

## 5 Moments and Moment Generating Function

Let $X$ be a random variable having the GKw distribution (7). First, we obtain an infinite sum representation for the $r$th ordinary moment of $X$, say $\mu'_r=E(X^r)$. From (14), we can write

$$\mu'_r=\sum_{k=0}^{\infty}p_k\,\tau_r(k), \tag{16}$$

where $\tau_r(k)$ is the $r$th moment of the $\mathrm{Kw}(\alpha,(k+1)\beta)$ distribution, which exists for all $r>-\alpha$. Using a result due to Jones (2009, Section 3), we have

$$\tau_r(k)=(k+1)\beta\,B\Big(1+\frac{r}{\alpha},(k+1)\beta\Big). \tag{17}$$
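As a quick numerical check of the Kw moment formula used in (17), the sketch below compares the closed form $E(X^r)=b\,B(1+r/a,b)$ for $X\sim\mathrm{Kw}(a,b)$ (the standard result from Jones, 2009) with a midpoint-rule integral of $x^r$ against density (4). The function names and the test values $a=2$, $b=3$ are assumptions for illustration.

```python
import math

def beta_fn(a, b):
    """Beta function B(a, b) via log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def kw_moment(r, alpha, beta):
    """Closed-form r-th moment of Kw(alpha, beta): beta * B(1 + r/alpha, beta)."""
    return beta * beta_fn(1.0 + r / alpha, beta)

def kw_moment_numeric(r, alpha, beta, steps=20000):
    """Midpoint-rule integral of x^r times the Kw density (4) over (0, 1)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        pdf = alpha * beta * x ** (alpha - 1.0) * (1.0 - x**alpha) ** (beta - 1.0)
        total += x**r * pdf
    return h * total

closed_form = [kw_moment(r, 2.0, 3.0) for r in (1, 2, 3)]
numeric = [kw_moment_numeric(r, 2.0, 3.0) for r in (1, 2, 3)]
```

For $a=2$, $b=3$ the second moment is easy to verify by hand: $X^2$ is beta distributed with parameters $1$ and $3$, so $E(X^2)=1/4$.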

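Hence, the moments of the GKw distribution follow directly from (16) and (17). The central moments ($\mu_r$) and cumulants ($\kappa_r$) of $X$ are easily obtained from the ordinary moments by $\mu_r=\sum_{k=0}^{r}(-1)^{k}\binom{r}{k}\mu_1'^{\,k}\mu'_{r-k}$ and $\kappa_1=\mu'_1$, $\kappa_2=\mu'_2-\mu_1'^{\,2}$, $\kappa_3=\mu'_3-3\mu'_2\mu'_1+2\mu_1'^{\,3}$, etc., respectively. The $r$th descending factorial moment of $X$ is

$$\mu'_{(r)}=E[X^{(r)}]=E[X(X-1)\times\cdots\times(X-r+1)]=\sum_{m=0}^{r}s(r,m)\,\mu'_m,$$

where $s(r,m)$ is the Stirling number of the first kind given by $s(r,m)=(m!)^{-1}\big[d^{m}x^{(r)}/dx^{m}\big]_{x=0}$. In absolute value, it counts the number of ways to permute a list of $r$ items into $m$ cycles. Thus, the factorial moments of $X$ are given by

$$\mu'_{(r)}=\sum_{k=0}^{\infty}p_k\sum_{m=0}^{r}s(r,m)\,\tau_m(k).$$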

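The moment generating function of the GKw distribution, say $M(t)=E[\exp(tX)]$, is obtained from (15) as

$$M(t)=\sum_{i=0}^{\infty}v_i\int_0^1 x^{(i+1)\alpha-1}\exp(tx)\,dx.$$

For $t<0$, the change of variable $u=-tx$ gives

$$M(t)=\sum_{i=0}^{\infty}v_i\,(-t)^{-(i+1)\alpha}\int_0^{-t}u^{(i+1)\alpha-1}\exp(-u)\,du$$

and then $M(t)$ reduces to the linear combination

$$M(t)=\sum_{i=0}^{\infty}v_i\,\frac{\gamma\big((i+1)\alpha,-t\big)}{(-t)^{(i+1)\alpha}},\quad t<0,$$

where $\gamma(a,z)=\int_0^{z}u^{a-1}e^{-u}\,du$ denotes the incomplete gamma function.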

## 6 Quantile Function

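We can write (8) as $F(x;\boldsymbol{\theta})=I_z(\gamma,\delta+1)=u$, where $z=[1-(1-x^{\alpha})^{\beta}]^{\lambda}$. From Wolfram's website we can obtain some expansions for the inverse of the incomplete beta function, say $z=Q_B(u)$, one of which is

$$z=Q_B(u)=a_1v+a_2v^{2}+a_3v^{3}+a_4v^{4}+O(v^{5}),$$

where $v=[\gamma\,u\,B(\gamma,\delta+1)]^{1/\gamma}$ for $\gamma>0$ and $a_0=0$, $a_1=1$, $a_2=\delta/(\gamma+1)$,

$$a_3=\frac{\delta\,[\gamma^{2}+3(\delta+1)\gamma-\gamma+5\delta+1]}{2(\gamma+1)^{2}(\gamma+2)},$$

$$a_4=\frac{\delta\,\{\gamma^{4}+(6\delta+5)\gamma^{3}+(\delta+3)(8\delta+3)\gamma^{2}+[33(\delta+1)^{2}-30\delta-26]\gamma+(\delta+1)(31\delta-16)+18\}}{3(\gamma+1)^{3}(\gamma+2)(\gamma+3)},\ \ldots$$

The coefficients $a_i$ for $i\ge 2$ can be derived from the cubic recursion (Steinbrecher and Shaw, 2007)

$$a_i=\frac{1}{i^{2}+(\gamma-2)i+(1-\gamma)}\Big\{(1-\rho_{i,2})\sum_{r=2}^{i-1}a_r a_{i+1-r}\,[r(1-\gamma)(i-r)-r(r-1)]+\sum_{r=1}^{i-1}\sum_{s=1}^{i-r}a_r a_s a_{i+1-r-s}\,[r(r-\gamma)+s(\gamma+\delta-1)(i+1-r-s)]\Big\},$$

where $\rho_{i,2}=1$ if $i=2$ and $\rho_{i,2}=0$ if $i\neq 2$. Here the second beta parameter is $\delta+1$, so the factor $\gamma+(\delta+1)-2$ of the general recursion becomes $\gamma+\delta-1$. In the last equation, we note that the quadratic term only contributes for $i\ge 3$. Hence, the quantile function of the GKw distribution can be written as $Q(u)=\big[1-\big(1-Q_B(u)^{1/\lambda}\big)^{1/\beta}\big]^{1/\alpha}$.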
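The recursion can be sketched in a few lines and checked against the closed-form coefficients. The code below writes the recursion for a generic beta distribution with parameters $a$ and $b$ (so $a=\gamma$ and $b=\delta+1$ for the GKw case, which is an interpretation of the notation rather than something stated explicitly in the text). For $a=1$ the inverse is known exactly, $z=1-(1-bv)^{1/b}$, whose fourth Taylor coefficient is $(b-1)(2b-1)(3b-1)/24$; this gives an independent check. Function names and numerical values are assumptions.

```python
def inverse_beta_coefficients(a, b, order):
    """Coefficients a_0..a_order of z = sum_i a_i v**i solving I_z(a, b) = u,
    with v = (a * u * B(a, b)) ** (1 / a)."""
    coef = [0.0, 1.0]                               # a_0 = 0, a_1 = 1
    for i in range(2, order + 1):
        quad = 0.0
        if i > 2:                                   # quadratic term only for i >= 3
            quad = sum(coef[r] * coef[i + 1 - r]
                       * (r * (1 - a) * (i - r) - r * (r - 1))
                       for r in range(2, i))
        cubic = sum(coef[r] * coef[s] * coef[i + 1 - r - s]
                    * (r * (r - a) + s * (a + b - 2) * (i + 1 - r - s))
                    for r in range(1, i) for s in range(1, i - r + 1))
        coef.append((quad + cubic) / (i * i + (a - 2) * i + (1 - a)))
    return coef

def gkw_quantile_from_z(z, alpha, beta, lam):
    """Map z = Q_B(u) back to the GKw scale by inverting z = [1-(1-x^alpha)^beta]^lam."""
    return (1.0 - (1.0 - z ** (1.0 / lam)) ** (1.0 / beta)) ** (1.0 / alpha)

gamma, delta = 1.5, 2.0
coef = inverse_beta_coefficients(gamma, delta + 1.0, 4)
a2_closed = delta / (gamma + 1.0)
a3_closed = (delta * (gamma**2 + 3 * (delta + 1) * gamma - gamma + 5 * delta + 1)
             / (2 * (gamma + 1) ** 2 * (gamma + 2)))
coef_a1 = inverse_beta_coefficients(1.0, 3.0, 4)    # exact inverse known when a = 1
```

The truncated series is only accurate for small $u$; in practice it is more useful as a starting value for a root finder applied to (8) than as a final answer.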

## 7 Mean Deviations

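If $X$ has the GKw distribution, we can derive the mean deviations about the mean $\mu'_1$ and about the median $M$ from

$$\delta_1=\int_0^1|x-\mu'_1|\,f(x;\boldsymbol{\theta})\,dx\quad\text{and}\quad\delta_2=\int_0^1|x-M|\,f(x;\boldsymbol{\theta})\,dx,$$

respectively. From (8), the median $M$ is the solution of the nonlinear equation

$$I_{[1-(1-M^{\alpha})^{\beta}]^{\lambda}}(\gamma,\delta+1)=1/2.$$

These measures can be calculated using the relationships

$$\delta_1=2\big[\mu'_1F(\mu'_1;\boldsymbol{\theta})-J(\mu'_1;\boldsymbol{\theta})\big]\quad\text{and}\quad\delta_2=\mu'_1-2J(M;\boldsymbol{\theta}).$$

Here, the integral $J(a;\boldsymbol{\theta})=\int_0^{a}x\,f(x;\boldsymbol{\theta})\,dx$ is easily calculated from the density expansion (15) as

$$J(a;\boldsymbol{\theta})=\sum_{i=0}^{\infty}v_i\,\frac{a^{(i+1)\alpha+1}}{(i+1)\alpha+1}.$$

We can use this result to obtain the Bonferroni and Lorenz curves. These curves have applications not only in economics to study income and poverty, but also in other fields, such as reliability, demography, insurance and medicine. They are defined by

$$B(p;\boldsymbol{\theta})=\frac{J(q;\boldsymbol{\theta})}{p\,\mu'_1}\quad\text{and}\quad L(p;\boldsymbol{\theta})=\frac{J(q;\boldsymbol{\theta})}{\mu'_1},$$

respectively, where $q=F^{-1}(p;\boldsymbol{\theta})$.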

## 8 Moments of Order Statistics

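The density function of the $i$th order statistic $X_{i:n}$, say $f_{i:n}(x;\boldsymbol{\theta})$, in a random sample of size $n$ from the GKw distribution, is given by (for $i=1,\ldots,n$)

$$f_{i:n}(x;\boldsymbol{\theta})=\frac{1}{B(i,n-i+1)}\,f(x;\boldsymbol{\theta})\,F(x;\boldsymbol{\theta})^{i-1}\{1-F(x;\boldsymbol{\theta})\}^{n-i},\quad 0<x<1. \tag{18}$$

The binomial expansion yields

$$f_{i:n}(x;\boldsymbol{\theta})=\frac{1}{B(i,n-i+1)}\,f(x;\boldsymbol{\theta})\sum_{j=0}^{n-i}\binom{n-i}{j}(-1)^{j}F(x;\boldsymbol{\theta})^{i+j-1},$$

and using (15) together with its integral, $F(x;\boldsymbol{\theta})=\sum_{s=0}^{\infty}v^{\star}_s\,x^{(s+1)\alpha}$, we arrive at

$$f_{i:n}(x;\boldsymbol{\theta})=\frac{1}{B(i,n-i+1)}\Big(\sum_{t=0}^{\infty}v_t\,x^{(t+1)\alpha-1}\Big)\sum_{j=0}^{n-i}\binom{n-i}{j}(-1)^{j}\Big(\sum_{s=0}^{\infty}v^{\star}_s\,x^{(s+1)\alpha}\Big)^{i+j-1},$$

where $v^{\star}_s=v_s/[(s+1)\alpha]$.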

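We use the following expansion for a power series raised to an integer power (Gradshteyn and Ryzhik, 2000, Section 0.314)

$$\Big(\sum_{j=0}^{\infty}a_j x^{j}\Big)^{p}=\sum_{j=0}^{\infty}c_{j,p}\,x^{j}, \tag{19}$$

where $p$ is any positive integer number, $c_{0,p}=a_0^{p}$ and $c_{j,p}=(j\,a_0)^{-1}\sum_{m=1}^{j}(mp-j+m)\,a_m\,c_{j-m,p}$ for all $j\ge 1$. We can write

$$f_{i:n}(x;\boldsymbol{\theta})=\frac{1}{B(i,n-i+1)}\sum_{j=0}^{n-i}\binom{n-i}{j}(-1)^{j}\sum_{s,t=0}^{\infty}v_t\,e_{s,i+j-1}\,x^{(s+t+i+j)\alpha-1},$$

where $e_{0,i+j-1}=(v^{\star}_0)^{i+j-1}$ and (for $s\ge 1$)

$$e_{s,i+j-1}=(s\,v^{\star}_0)^{-1}\sum_{m=1}^{s}[m(i+j-1)-s+m]\,v^{\star}_m\,e_{s-m,i+j-1}.$$

The $r$th moment of the $i$th order statistic becomes

$$E(X^{r}_{i:n})=\frac{1}{B(i,n-i+1)}\sum_{j=0}^{n-i}\binom{n-i}{j}(-1)^{j}\sum_{s,t=0}^{\infty}\frac{v_t\,e_{s,i+j-1}}{r+(s+t+i+j)\alpha}. \tag{20}$$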

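We now obtain another closed-form expression for the moments of the GKw order statistics using a general result due to Barakat and Abdelkader (2004) applied to the independent and identically distributed case. For a distribution with pdf $f(x;\boldsymbol{\theta})$ and cdf $F(x;\boldsymbol{\theta})$, we can write

$$E(X^{r}_{i:n})=r\sum_{m=n-i+1}^{n}(-1)^{m-n+i-1}\binom{m-1}{n-i}\binom{n}{m}I_m(r),$$

where

$$I_m(r)=\int_0^1 x^{r-1}\{1-F(x;\boldsymbol{\theta})\}^{m}\,dx.$$

For a positive integer $m$, we have

$$I_m(r)=\sum_{p=0}^{m}(-1)^{p}\binom{m}{p}\int_0^1 x^{r-1}F(x;\boldsymbol{\theta})^{p}\,dx.$$

By replacing (12) in the above equation we have

$$I_m(r)=\sum_{p=0}^{m}(-1)^{p}\binom{m}{p}\int_0^1 x^{r-1}\Big\{\sum_{j=0}^{\infty}\omega_j\,[G_1(x;\alpha,\beta)]^{\lambda(\gamma+j)}\Big\}^{p}dx. \tag{21}$$

Equations (19) and (21) yield

$$I_m(r)=\sum_{p=0}^{m}(-1)^{p}\binom{m}{p}\sum_{j=0}^{\infty}c_{j,p}\int_0^1 x^{r-1}[G_1(x;\alpha,\beta)]^{\lambda(\gamma p+j)}\,dx,$$

where $c_{j,p}$ is computed from (19) with $a_j=\omega_j$ (and $c_{0,0}=1$, $c_{j,0}=0$ for $j\ge 1$). By replacing $G_1(x;\alpha,\beta)$ by (3) and using (11) we obtain

$$I_m(r)=\sum_{p=0}^{m}(-1)^{p}\binom{m}{p}\sum_{j,w=0}^{\infty}(-1)^{w}c_{j,p}\,(\psi)_w\int_0^1 x^{r-1}(1-x^{\alpha})^{\beta w}\,dx,$$

where $\psi=\lambda(\gamma p+j)$. Since $\int_0^1 x^{r-1}(1-x^{\alpha})^{\beta w}\,dx=\alpha^{-1}B(r/\alpha,\beta w+1)$ for $r>0$ (Gupta and Nadarajah, 2004b), we have

$$I_m(r)=\sum_{p=0}^{m}\sum_{j,w=0}^{\infty}s_{p,j,w}\,B\Big(\frac{r}{\alpha},\beta w+1\Big),$$

where

$$s_{p,j,w}=\frac{(-1)^{p+w}\,m!}{\alpha\,(m-p)!\,p!}\,c_{j,p}\,(\psi)_w .$$

Finally, $E(X^{r}_{i:n})$ reduces to

$$E(X^{r}_{i:n})=r\sum_{m=n-i+1}^{n}\sum_{p=0}^{m}\sum_{j,w=0}^{\infty}(-1)^{m-n+i-1}\binom{m-1}{n-i}\binom{n}{m}\,s_{p,j,w}\,B\Big(\frac{r}{\alpha},\beta w+1\Big). \tag{22}$$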
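The structure of this second representation can be illustrated on a case where everything is known in closed form. A commonly quoted form of the Barakat and Abdelkader identity for i.i.d. samples is $E(X^{r}_{i:n})=r\sum_{m=n-i+1}^{n}(-1)^{m-n+i-1}\binom{m-1}{n-i}\binom{n}{m}I_m(r)$ with $I_m(r)=\int x^{r-1}\{1-F(x)\}^{m}dx$; treating that form as an assumption, the sketch below checks it on the uniform distribution on $(0,1)$, for which $I_m(r)=B(r,m+1)$ and $E(X^{r}_{i:n})=B(i+r,n-i+1)/B(i,n-i+1)$. The function names are hypothetical.

```python
import math

def beta_fn(a, b):
    """Beta function B(a, b) via log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def order_stat_moment(r, i, n, tail_integral):
    """E(X_{i:n}^r) from the tail integrals I_m(r), supplied as tail_integral(m, r)."""
    return r * sum((-1) ** (m - n + i - 1)
                   * math.comb(m - 1, n - i) * math.comb(n, m)
                   * tail_integral(m, r)
                   for m in range(n - i + 1, n + 1))

def uniform_tail(m, r):
    """I_m(r) for the uniform distribution on (0, 1): B(r, m + 1)."""
    return beta_fn(r, m + 1)

value = order_stat_moment(2, 2, 5, uniform_tail)        # E(U_{2:5}^2)
exact = beta_fn(2 + 2, 5 - 2 + 1) / beta_fn(2, 5 - 2 + 1)
```

For the GKw distribution, `tail_integral` would be replaced by a (truncated) evaluation of the triple sum for $I_m(r)$, which is where most of the numerical effort lies.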

Equations (20) and (22) are the main results of this section. The L-moments are analogous to the ordinary moments but can be estimated by linear combinations of order statistics. They are linear functions of expected order statistics defined by (Hosking, 1990)

$$\lambda_{r+1}=(r+1)^{-1}\sum_{k=0}^{r}(-1)^{k}\binom{r}{k}E(X_{r+1-k:r+1}),\quad r=0,1,\ldots$$

The first four L-moments are $\lambda_1=E(X_{1:1})$, $\lambda_2=\tfrac{1}{2}E(X_{2:2}-X_{1:2})$, $\lambda_3=\tfrac{1}{3}E(X_{3:3}-2X_{2:3}+X_{1:3})$ and $\lambda_4=\tfrac{1}{4}E(X_{4:4}-3X_{3:4}+3X_{2:4}-X_{1:4})$. These moments have several advantages over the ordinary moments. For example, they exist whenever the mean of the distribution exists, even though some higher moments may not exist, and are relatively robust to the effects of outliers. From (22) applied for the means ($r=1$), we can obtain expansions for the L-moments of the GKw distribution.
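The L-moment definition translates directly into code once the expected order statistics are available. The sketch below takes them as a user-supplied function and checks the result on the uniform distribution on $(0,1)$, where $E(X_{j:n})=j/(n+1)$ and the first four L-moments are $1/2$, $1/6$, $0$ and $0$. The function names are assumptions for the example.

```python
import math

def l_moment(order, expected_order_stat):
    """L-moment lambda_order, given expected_order_stat(j, n) = E(X_{j:n})."""
    r = order - 1
    return sum((-1) ** k * math.comb(r, k) * expected_order_stat(r + 1 - k, r + 1)
               for k in range(r + 1)) / (r + 1)

def uniform_mean(j, n):
    """E(U_{j:n}) for the uniform distribution on (0, 1)."""
    return j / (n + 1)

l_moments = [l_moment(order, uniform_mean) for order in (1, 2, 3, 4)]
```

For the GKw distribution, `expected_order_stat` would be supplied by a truncated evaluation of (20) or (22) with $r=1$.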

## 9 Rényi Entropy

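The entropy of a random variable $X$ with density function $f(x)$ is a measure of variation of the uncertainty. One of the popular entropy measures is the Rényi entropy given by

$$J_R(\rho)=\frac{1}{1-\rho}\log\Big[\int f^{\rho}(x)\,dx\Big],\quad \rho>0,\ \rho\neq 1. \tag{23}$$

From (15), we have

$$f(x;\boldsymbol{\theta})^{\rho}=\Big(\sum_{i=0}^{\infty}v_i\,x^{(i+1)\alpha-1}\Big)^{\rho}.$$

In order to obtain an expansion for the above power series for $\rho>0$ and $\rho\neq 1$, we can write

$$f(x;\boldsymbol{\theta})^{\rho}=\sum_{j=0}^{\infty}\binom{\rho}{j}(-1)^{j}\Big\{1-\Big(\sum_{i=0}^{\infty}v_i\,x^{(i+1)\alpha-1}\Big)\Big\}^{j}=\sum_{j=0}^{\infty}\sum_{r=0}^{j}(-1)^{j+r}\binom{\rho}{j}\binom{j}{r}x^{(\alpha-1)r}\Big(\sum_{i=0}^{\infty}v_i\,x^{i\alpha}\Big)^{r}.$$

Using equation (19), we obtain

$$f(x;\boldsymbol{\theta})^{\rho}=\sum_{i,j=0}^{\infty}\sum_{r=0}^{j}(-1)^{j+r}\binom{\rho}{j}\binom{j}{r}d_{i,r}\,x^{(i+r)\alpha-r},$$

where $d_{0,r}=v_0^{r}$ and $d_{i,r}=(i\,v_0)^{-1}\sum_{m=1}^{i}(mr-i+m)\,v_m\,d_{i-m,r}$ for all $i\ge 1$. Hence,

$$J_R(\rho)=\frac{1}{1-\rho}\log\Big[\sum_{i,j=0}^{\infty}\sum_{r=0}^{j}\frac{(-1)^{j+r}\binom{\rho}{j}\binom{j}{r}d_{i,r}}{(i+r)\alpha-r+1}\Big].$$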
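Because the series above involves several nested infinite sums, a direct numerical evaluation of (23) is a useful cross-check in practice. The sketch below integrates $f^{\rho}$ on $(0,1)$ with the midpoint rule and tests it on the sub-model $\alpha=2$, $\beta=\gamma=\lambda=1$, $\delta=0$, for which $f(x)=2x$ and $J_R(\rho)=(1-\rho)^{-1}\log[2^{\rho}/(\rho+1)]$. Function names and the test case are assumptions chosen for illustration.

```python
import math

def gkw_pdf(x, alpha, beta, gamma, delta, lam):
    """GKw density (7) on 0 < x < 1."""
    log_b = math.lgamma(gamma) + math.lgamma(delta + 1.0) - math.lgamma(gamma + delta + 1.0)
    g = 1.0 - (1.0 - x**alpha) ** beta
    return (lam * alpha * beta / math.exp(log_b)
            * x ** (alpha - 1.0) * (1.0 - x**alpha) ** (beta - 1.0)
            * g ** (gamma * lam - 1.0) * (1.0 - g**lam) ** delta)

def renyi_entropy(pdf, rho, steps=20000):
    """Numerical version of (23) on (0, 1) using the midpoint rule."""
    h = 1.0 / steps
    integral = h * sum(pdf((k + 0.5) * h) ** rho for k in range(steps))
    return math.log(integral) / (1.0 - rho)

# Sub-model with density f(x) = 2x, where the entropy is available in closed form.
rho = 2.0
numeric = renyi_entropy(lambda x: gkw_pdf(x, 2.0, 1.0, 1.0, 0.0, 1.0), rho)
closed_form = math.log(2.0**rho / (rho + 1.0)) / (1.0 - rho)
```

The midpoint rule is adequate only when $f^{\rho}$ is bounded on $(0,1)$; other parameter values may need a quadrature rule suited to endpoint singularities.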

## 10 Maximum Likelihood Estimation

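Let $x_1,\ldots,x_n$ be a random sample from the $\mathrm{GKw}(\alpha,\beta,\gamma,\delta,\lambda)$ distribution. From (7) the log-likelihood function is easy to derive. It is given by

$$\begin{aligned}\ell(\boldsymbol{\theta})={}&n\log(\lambda)+n\log(\alpha)+n\log(\beta)-n\log[B(\gamma,\delta+1)]+(\alpha-1)\sum_{i=1}^{n}\log(x_i)\\&+(\beta-1)\sum_{i=1}^{n}\log(1-x_i^{\alpha})+(\gamma\lambda-1)\sum_{i=1}^{n}\log\big[1-(1-x_i^{\alpha})^{\beta}\big]+\delta\sum_{i=1}^{n}\log\big[1-\{1-(1-x_i^{\alpha})^{\beta}\}^{\lambda}\big].\end{aligned}$$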
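The log-likelihood can be coded term by term. The sketch below is a plain standard-library implementation intended for illustration; the function name `gkw_loglik`, the parameter ordering and the sample values are assumptions. In practice this function would be passed (with a sign change) to a general-purpose numerical optimizer.

```python
import math

def gkw_loglik(theta, data):
    """GKw log-likelihood for theta = (alpha, beta, gamma, delta, lam), data in (0, 1)."""
    alpha, beta, gamma, delta, lam = theta
    n = len(data)
    log_b = (math.lgamma(gamma) + math.lgamma(delta + 1.0)
             - math.lgamma(gamma + delta + 1.0))     # log B(gamma, delta + 1)
    total = n * (math.log(lam) + math.log(alpha) + math.log(beta) - log_b)
    for x in data:
        kw_tail = (1.0 - x**alpha) ** beta           # (1 - x^alpha)^beta
        total += (alpha - 1.0) * math.log(x)
        total += (beta - 1.0) * math.log(1.0 - x**alpha)
        total += (gamma * lam - 1.0) * math.log(1.0 - kw_tail)
        total += delta * math.log(1.0 - (1.0 - kw_tail) ** lam)
    return total

# Hypothetical sample; with gamma = 1, delta = 0, lam = 1 this is the Kw log-likelihood.
sample = [0.2, 0.5, 0.7]
value = gkw_loglik((2.0, 3.0, 1.0, 0.0, 1.0), sample)
```

Observations very close to 0 or 1 can make the logarithms numerically unstable, so a production implementation would likely work with `math.log1p` and guard the boundaries.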

By taking the partial derivatives of the log-likelihood function with respect to $\alpha$, $\beta$, $\gamma$, $\delta$ and $\lambda$, we obtain the components of the score vector,