
# The COM-negative binomial distribution: modeling overdispersion and ultrahigh zero-inflated count data

Huiming Zhang, Kai Tan, Bo Li
###### Abstract

In this paper, we focus on the COM-type negative binomial distribution with three parameters, which belongs to the COM-type (a, b, 0) class of distributions and to the family of equilibrium distributions of arbitrary birth-death processes. Besides, we show abundant distributional properties such as overdispersion and underdispersion, log-concavity, log-convexity (infinite divisibility), pseudo compound Poisson, stochastic ordering and asymptotic approximation. Some characterizations, including the sum of equicorrelated geometrically distributed r.v.'s, the conditional distribution, the limit of the COM-negative hypergeometric distribution, and a functional operator characterization, are given for theoretical properties. The COM-negative binomial distribution is applied to overdispersed and ultrahigh zero-inflated data sets. We employ the maximum likelihood method to estimate the parameters, and the goodness of fit is evaluated by the discrete Kolmogorov-Smirnov test.

###### keywords:
overdispersion, zero-inflated data, compound Poisson distribution, infinite divisibility, discrete Kolmogorov-Smirnov test
MSC2010: 60E07, 60A05, 62F10

## 1 Introduction

Before 2005, the Conway-Maxwell-Poisson distribution (denoted as the COM-Poisson distribution) had been rarely used since Conway and Maxwell (1962) briefly introduced it for the modeling of queuing systems with state-dependent service time; see also Wimmer and Altmann (1999), Wimmer et al. (1995). About ten years ago, the COM-Poisson distribution with two parameters was revived by Shmueli et al. (2005) as a generalization of the Poisson distribution. More recently, there has been fast growth of research on the COM-Poisson distribution in terms of related statistical theory and applied methodology; see Sellers et al. (2012) and the references therein. The probability mass function (p.m.f.) is given by

 P(X=k)=\frac{\lambda^{k}}{(k!)^{\nu}}\cdot\frac{1}{Z(\lambda,\nu)},\quad k=0,1,2,\ldots, (1)

where λ > 0, ν ≥ 0, and Z(λ, ν) = ∑_{i=0}^∞ λ^i/(i!)^ν is the normalizing constant. We denote (1) as CMP(λ, ν).

Kokonendji et al. (2008) proved that the COM-Poisson distribution is overdispersed when ν < 1 and underdispersed when ν > 1. Another extension of the Poisson distribution is the negative binomial, a noted discrete distribution with the overdispersion property that is widely applied in actuarial science (see Denuit et al. (2007), Kaas et al. (2008)). The p.m.f. of the negative binomial r.v. is

 P(X=k)=\frac{\Gamma(r+k)}{k!\,\Gamma(r)}p^{k}(1-p)^{r},\quad k=0,1,2,\ldots, (2)

where r > 0 and 0 < p < 1.

In this paper, we introduce a new extension of the negative binomial distribution (denoted by CMNB) depending on three parameters, obtained by replacing the term Γ(r+k)/(k! Γ(r)) in (2) with (Γ(r+k)/(k! Γ(r)))^ν and dividing by the normalization constant C(r, ν, p).

###### Definition 1.1.

A r.v. X is said to follow the COM-negative binomial distribution with three parameters if its p.m.f. is given by

 P(X=k)=\frac{\left(\frac{\Gamma(r+k)}{k!\,\Gamma(r)}\right)^{\nu}p^{k}(1-p)^{r}}{\sum_{i=0}^{\infty}\left(\frac{\Gamma(r+i)}{i!\,\Gamma(r)}\right)^{\nu}p^{i}(1-p)^{r}}=\left(\frac{\Gamma(r+k)}{k!\,\Gamma(r)}\right)^{\nu}p^{k}(1-p)^{r}\,\frac{1}{C(r,\nu,p)},\quad k=0,1,2,\ldots, (3)

where r > 0, ν ≥ 0, 0 < p < 1, and C(r, ν, p) = ∑_{i=0}^∞ (Γ(r+i)/(i! Γ(r)))^ν p^i (1−p)^r.

From another point of view, an alternative form of (3) can be written as

 P(X=k)=\frac{\left[\frac{\Gamma(r+k)}{k!\,\Gamma(r)}\tilde{p}^{k}(1-\tilde{p})^{r}\right]^{\nu}}{\sum_{i=0}^{\infty}\left[\frac{\Gamma(r+i)}{i!\,\Gamma(r)}\tilde{p}^{i}(1-\tilde{p})^{r}\right]^{\nu}}=\left(\frac{\Gamma(r+k)}{k!\,\Gamma(r)}\right)^{\nu}\tilde{p}^{\nu k}(1-\tilde{p})^{\nu r}\,\frac{1}{C(r,\nu,\tilde{p})},\quad k=0,1,2,\ldots, (4)

where p̃ = p^{1/ν}, so that (4) coincides with (3).

When 0 < r ≤ 1, we will show that the COM-negative binomial distribution (3) is discrete compound Poisson, which has wide application in risk theory (including non-life insurance) as well; see Zhang et al. (2014) and the references therein. We plot 12 cases of the COM-negative binomial p.m.f. in Figure 1 at the end of this section.

It is easy to see that our COM-negative binomial distribution belongs to the COM-type extension of the (a, b, 0) class. The (a, b, 0) class is a famous family of distributions which is sometimes referred to as the Katz class (see the remarks in Section 2.3.1 of Johnson et al. (2005)). It has significant applications in non-life insurance mathematics, especially for modelling claim counts (loss models, collective risk models); see Kaas et al. (2008), Denuit et al. (2007). A classic result in non-life insurance textbooks states that the (a, b, 0) class contains only the degenerate, binomial, Poisson and negative binomial distributions. After adding a new parameter ν, we define the COM-type extension of the (a, b, 0) class, and it is convenient to see that the degenerate, COM-Poisson, COM-binomial and COM-negative binomial distributions belong to this class.

Shmueli et al. (2005) first proposed the COM-binomial distribution, which can be presented as a sum of equicorrelated Bernoulli variables. Borges et al. (2014) studied some properties and an asymptotic approximation (e.g. the COM-binomial approximates the COM-Poisson under some conditions) of this family of distributions in detail. We will show that some results for the COM-Poisson extend to our COM-negative binomial distribution. Kadane (2016) gives the exchangeability properties, sufficient statistics and a multivariate extension of the COM-binomial distribution.

Another variant of the COM-negative binomial distribution has been studied by Imoto (2014); it replaces the term Γ(r+k) in (2) by [Γ(r+k)]^ν and then divides by the normalization constant. We will give adequate reasons to support our extension in succeeding sections. Chakraborty and Imoto (2016) considered the extended COM-Poisson distribution (ECOMP):

 P(X=k)=\frac{[\Gamma(r+k)]^{\beta}\theta^{k}}{(k!)^{\alpha}}\Big/\sum_{i=0}^{\infty}\frac{[\Gamma(r+i)]^{\beta}\theta^{i}}{(i!)^{\alpha}},\quad k=0,1,2,\ldots, (5)

where r > 0, θ > 0 and α, β lie in the parameter space for which the normalizing series converges (for instance α > β, or α = β with 0 < θ < 1). The ECOMP distribution combines Imoto (2014)'s extension (α = 1) and our extension of the COM-negative binomial distribution (α = β). The COM-Poisson is a special case of ECOMP when β = 0. The ECOMP distribution has a queuing-system characterization (a birth-death process with arrival rate λ_k = θ(r+k)^β and service rate μ_k = k^α); see also Brown and Xia (2001) for the arbitrary birth-death process characterization.

The rest of the article is organized as follows. In Section 2, we propose the COM-type (a, b, 0) class of distributions and give some examples of this class, which includes the COM-negative binomial. Furthermore, some properties of the COM-negative binomial are given: Rényi entropy and Tsallis entropy representations, overdispersion and underdispersion, log-concavity, log-convexity (infinite divisibility), pseudo compound Poisson, stochastic ordering and asymptotic approximation. In Section 3, some conditional distribution characterizations and a Stein identity characterization are presented by using related lemmas, and we also show that the COM-negative hypergeometric distribution can approximate the COM-negative binomial. In Section 4, the inversion method is introduced to generate COM-negative binomial distributed random variables. Section 5 then estimates the parameters by the maximum likelihood method, in which the initial values are provided by the recursive formula. In Section 6, two simulated data sets and two applications to actuarial claim data sets are given as examples. In Section 7, we provide some potential further research suggestions based on the properties and characterizations of the COM-negative binomial distribution.

## 2 Properties

### 2.1 Recursive formula and ultrahigh zero-inflated property

The recursive formula (or ratios of consecutive probabilities) is given by

 \frac{P(X=k)}{P(X=k-1)}=p\left(\frac{\Gamma(k+r)}{k!\,\Gamma(r)}\right)^{\nu}\Big/\left(\frac{\Gamma(k-1+r)}{(k-1)!\,\Gamma(r)}\right)^{\nu}=p\cdot\left(\frac{k-1+r}{k}\right)^{\nu},\quad k=1,2,\ldots. (6)

We say that zero-inflated count data following some discrete distribution are ultrahigh zero-inflated if the ratio P(X=1)/P(X=0) is close to zero, so that the mass at zero dominates. For the COM-negative binomial case, setting k = 1 in (6) gives P(X=1)/P(X=0) = p·r^ν; for the negative binomial case (ν = 1), we get P(X=1)/P(X=0) = p·r. If we choose 0 < r < 1 and a large ν in the COM-negative binomial case, then the three-parameter case is more flexible in handling this ratio than the two-parameter case, since p·r^ν → 0 as ν → ∞. (For examples, see the plots of the COM-negative binomial distribution in Figure 1 and the fitted ratios in Table 4 of Section 6.2.) For an insurance company, the more zero claims there are, the lower the risk of bankruptcy.
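As an illustrative numerical sketch (not code from the paper; the parameter values are arbitrary), the recursion (6) gives a simple way to evaluate the p.m.f.: start from an unnormalized weight of 1 at k = 0, multiply by the consecutive ratio, and normalize by a truncated version of the infinite series C(r, ν, p).

```python
def cmnb_pmf(r, nu, p, kmax=500):
    """COM-negative binomial p.m.f. via the consecutive-probability ratio (6),
    normalized by truncating the infinite normalizing series at kmax."""
    w = [1.0]  # unnormalized weight at k = 0
    for k in range(1, kmax + 1):
        w.append(w[-1] * p * ((k - 1 + r) / k) ** nu)
    total = sum(w)
    return [x / total for x in w]

# Small r with large nu shrinks P(X=1)/P(X=0) = p * r^nu,
# producing the ultrahigh zero-inflation described above.
pmf = cmnb_pmf(r=0.5, nu=8.0, p=0.6)
zero_ratio = pmf[1] / pmf[0]   # equals p * r^nu up to rounding
```

Here P(X=0) is close to 1 even though p = 0.6 is large, which is exactly the extra flexibility the third parameter provides.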

The following definition provides a generalization of the (a, b, 0) class.

###### Definition 2.1.

Let X be a discrete r.v. with p.m.f. {p_k}_{k≥0}; if it satisfies the recursive formula

 p_{k}=\left(a+\frac{b}{k}\right)^{\nu}p_{k-1},\quad k=1,2,\ldots, (7)

for some constants a and b, then we call it a COM-type (a, b, 0) class distribution. We denote this class as the COM-type (a, b, 0) class.

The COM-Poisson distribution satisfies the case a = 0, b = λ^{1/ν}, since p_k/p_{k−1} = λ/k^ν = (λ^{1/ν}/k)^ν.

From (6), it is easy to see that the COM-negative binomial distribution belongs to the COM-type (a, b, 0) class with

 a=p^{1/\nu},\quad b=(r-1)\,p^{1/\nu}. (8)
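This membership is pure algebra, p·((k−1+r)/k)^ν = (p^{1/ν} + (r−1)p^{1/ν}/k)^ν, and can be sanity-checked numerically; the parameter values in this small sketch are arbitrary illustrations.

```python
# Check that the ratio (6) matches the COM-type (a, b, 0) form (a + b/k)^nu
# with a = p^(1/nu) and b = (r - 1) * p^(1/nu).
p, r, nu = 0.35, 2.4, 1.6
a = p ** (1 / nu)
b = (r - 1) * p ** (1 / nu)
for k in range(1, 50):
    ratio = p * ((k - 1 + r) / k) ** nu   # consecutive-probability ratio (6)
    assert abs(ratio - (a + b / k) ** nu) < 1e-12
```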

The COM-binomial distribution (CMB), see Shmueli et al. (2005), Borges et al. (2014), with p.m.f.

 P(X=k)=\frac{\binom{m}{k}^{\nu}p^{k}(1-p)^{m-k}}{\sum_{i=0}^{m}\binom{m}{i}^{\nu}p^{i}(1-p)^{m-i}},\quad k=0,1,\ldots,m, (9)

where 0 < p < 1 and ν ≥ 0. We denote (9) as CMB(m, ν, p).

Since the ratio of consecutive probabilities is p_k/p_{k−1} = [(m−k+1)/k]^ν · p/(1−p) = (a + b/k)^ν with a = −[p/(1−p)]^{1/ν} and b = (m+1)[p/(1−p)]^{1/ν}, the COM-binomial distribution belongs to the COM-type (a, b, 0) class.
Remark 1: As we know, the (a, b, 0) class contains only the degenerate, binomial, Poisson and negative binomial distributions. But other distributions belong to the COM-type (a, b, 0) class. For example, the integer m can be replaced by a positive real number in (9) (here with ν = 2), and the p.m.f. is given by

 P(X=k)=\frac{\binom{m}{k}^{2}p^{k}(1-p)^{m-k}}{\sum_{i=0}^{\infty}\binom{m}{i}^{2}p^{i}(1-p)^{m-i}},\quad k=0,1,2,\ldots,

where m > 0 is a real number such that the series in the denominator converges.
Remark 2: Brown and Xia (2001) considered the very large class of stationary distributions of birth-death processes with arrival rates λ_k and service rates μ_k, given by the recursive formula:

 \frac{P(X=k)}{P(X=k-1)}=\frac{\lambda_{k-1}}{\mu_{k}},\quad k=1,2,\ldots.

Thus we can construct a birth-death process with arrival rate λ_k = c[a(k+1)+b]^ν and service rate μ_k = c·k^ν (where c is a positive constant), whose stationary distribution characterizes the COM-type (a, b, 0) class distribution.

### 2.2 Related to Rényi entropy and Tsallis entropy

Recall the Rényi entropy (see Rényi (1961)) from information theory, which generalizes the Shannon entropy. The Rényi entropy of order α of a discrete r.v. X is:

 H_{\alpha}^{R}(X)=\frac{1}{1-\alpha}\ln\sum_{i=0}^{\infty}[P(X=i)]^{\alpha},\quad\alpha\neq 1.

Let Y be negative binomial distributed as in (2) (with p̃ in place of p) and X be COM-negative binomial distributed as in (4). Then the normalization constant C(r, ν, p̃) in (4) has the Rényi entropy representation C(r, ν, p̃) = exp{(1−ν)·H^R_ν(Y)}, so H^R_ν(Y) = ln C(r, ν, p̃)/(1−ν).

Another generalization of the Shannon entropy in physics is the Tsallis entropy. For a r.v. X, its Tsallis entropy of order α is defined by

 H_{\alpha}^{T}(X)=\frac{1}{1-\alpha}\left(\sum_{i=0}^{\infty}[P(X=i)]^{\alpha}-1\right),\quad\alpha\neq 1.

This entropy was introduced by Tsallis (1988) as a basis for generalizing the Boltzmann-Gibbs statistics.

Also, the normalization constant C(r, ν, p̃) in (4) has the Tsallis entropy representation C(r, ν, p̃) = 1 + (1−ν)·H^T_ν(Y), so H^T_ν(Y) = [C(r, ν, p̃) − 1]/(1−ν).
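These two representations amount to summing the ν-th powers of a negative binomial p.m.f.; the following sketch (illustrative, with arbitrarily chosen r, ν and p̃) computes that sum directly and reads off both entropies.

```python
import math

def nb_pmf(r, p, kmax=2000):
    """Negative binomial p.m.f. (2), computed on the log scale for stability."""
    out = []
    for k in range(kmax + 1):
        logpk = (math.lgamma(r + k) - math.lgamma(k + 1) - math.lgamma(r)
                 + k * math.log(p) + r * math.log(1 - p))
        out.append(math.exp(logpk))
    return out

r, nu, p_tilde = 2.0, 1.7, 0.4
q = nb_pmf(r, p_tilde)
C = sum(x ** nu for x in q)            # normalization constant in (4)
H_renyi = math.log(C) / (1 - nu)       # Renyi entropy of order nu of Y
H_tsallis = (C - 1) / (1 - nu)         # Tsallis entropy of order nu of Y
```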

### 2.3 Log-concave, Log-convex, Infinite divisibility

This subsection deals with the log-concavity and log-convexity of the COM-negative binomial distribution. A discrete distribution with p.m.f. {p_k}_{k≥0} is said to be log-concave (log-convex) if

 \frac{p_{k+1}p_{k-1}}{p_{k}^{2}}=\frac{p_{k+1}}{p_{k}}\Big/\frac{p_{k}}{p_{k-1}}\leq(\geq)\,1,\quad k\geq 1.
###### Lemma 2.1.

The COM-negative binomial distribution is log-concave if r ≥ 1 and log-convex if 0 < r ≤ 1.

###### Proof.

In fact, using the ratio of consecutive probabilities (6), we have

 M=\frac{p_{k+1}}{p_{k}}\Big/\frac{p_{k}}{p_{k-1}}=p\left(\frac{r+k}{k+1}\right)^{\nu}\Big/p\left(\frac{r+k-1}{k}\right)^{\nu}=\left(\frac{k^{2}+kr}{k^{2}+kr+r-1}\right)^{\nu}.

Then M ≤ 1 iff r ≥ 1 (log-concave) and M ≥ 1 iff 0 < r ≤ 1 (log-convex). ∎

Remark 3: Ibragimov (1956) called a distribution strongly unimodal if it is unimodal and its convolution with any unimodal distribution is unimodal. He showed that the strongly unimodal distributions are exactly the log-concave distributions. So the COM-negative binomial distribution is strongly unimodal (see Figure 1 for example) when r ≥ 1.
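The dichotomy in Lemma 2.1 is easy to check numerically. This sketch (with arbitrary illustrative parameter values) evaluates the ratios p_{k+1}p_{k−1}/p_k² from the unnormalized recursion (6); the normalization constant cancels in the ratio.

```python
def cmnb_weights(r, nu, p, kmax=200):
    # unnormalized COM-negative binomial probabilities via the ratio (6)
    w = [1.0]
    for k in range(1, kmax + 1):
        w.append(w[-1] * p * ((k - 1 + r) / k) ** nu)
    return w

def concavity_ratios(w):
    # p_{k+1} p_{k-1} / p_k^2 for k = 1, 2, ...
    return [w[k + 1] * w[k - 1] / w[k] ** 2 for k in range(1, len(w) - 1)]

ratios_concave = concavity_ratios(cmnb_weights(r=2.5, nu=1.3, p=0.4))  # r >= 1
ratios_convex = concavity_ratios(cmnb_weights(r=0.6, nu=1.3, p=0.4))   # 0 < r <= 1
```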

Steutel (1970) showed that all log-convex discrete distributions are infinitely divisible; the background and a detailed proof can be found in Steutel and van Harn (2003). Then we obtain the infinite divisibility of the COM-negative binomial distribution when 0 < r ≤ 1.

###### Corollary 2.1.

The COM-negative binomial distribution (3) is discrete infinitely divisible (a discrete compound Poisson distribution) if 0 < r ≤ 1.

Feller's characterization of discrete infinite divisibility shows that a non-negative integer-valued r.v. is infinitely divisible if and only if its distribution is a discrete compound Poisson distribution with p.g.f.:

 G(z)=\sum_{k=0}^{\infty}p_{k}z^{k}=\exp\left\{\sum_{i=1}^{\infty}\alpha_{i}\lambda(z^{i}-1)\right\},\quad|z|\leq 1, (10)

where α_i ≥ 0, ∑_{i=1}^∞ α_i = 1 and λ > 0.

For a theoretical treatment of discrete infinite divisibility (or discrete compound Poisson distribution), we refer readers to section 2 of Steutel and van Harn (2003), section 9.3 of Johnson et al. (2005), Zhang and Li (2016).

If some of the α_i in (10) are allowed to be negative, it turns into a generalization of the discrete compound Poisson distribution:

###### Definition 2.2 (Discrete pseudo compound Poisson distribution).

If a discrete r.v. X taking values in {0, 1, 2, …}, with P(X = 0) > 0, has a p.g.f. of the form

 G(z)=\sum_{k=0}^{\infty}p_{k}z^{k}=\exp\left\{\sum_{i=1}^{\infty}\alpha_{i}\lambda(z^{i}-1)\right\}, (11)

where ∑_{i=1}^∞ α_i = 1, ∑_{i=1}^∞ |α_i| < ∞, α_i ∈ ℝ and λ > 0, then X is said to follow a discrete pseudo compound Poisson distribution, abbreviated as DPCP.

Next, we will give two lemmas on the non-vanishing p.g.f. characterization of DPCP, see Zhang et al. (2014) and Zhang et al. (2017).

###### Lemma 2.2.

Let G(z) = ∑_{k=0}^∞ p_k z^k be the p.g.f. of a discrete r.v. X. Then G(z) has no zeros in {z : |z| ≤ 1} if and only if X is DPCP distributed.

The proof of Lemma 2.2 is based on the Wiener-Lévy theorem, a sophisticated theorem in Fourier analysis; see Zygmund (2002).

###### Lemma 2.3.

(Wiener-Lévy theorem) Let f(t) = ∑_n a_n e^{int} be an absolutely convergent Fourier series with ∑_n |a_n| < ∞. Suppose the values of f(t) lie on a curve C, and F(z) is an analytic (not necessarily single-valued) function of a complex variable which is regular at every point of C. Then F[f(t)] has an absolutely convergent Fourier series.

###### Lemma 2.4.

For any discrete r.v. X with p.g.f. G(z) = ∑_{k=0}^∞ p_k z^k, if the p.m.f. is non-increasing (p_0 ≥ p_1 ≥ p_2 ≥ ⋯) with p_0 > 0, then X is DPCP distributed.

###### Proof.

First, we show that (1−z)G(z) has no zeros in {z : |z| < 1}, since

 |(1-z)G(z)|=\left|p_{0}-(p_{0}-p_{1})z-(p_{1}-p_{2})z^{2}-\cdots\right|\geq p_{0}-\left[(p_{0}-p_{1})|z|+(p_{1}-p_{2})|z|^{2}+\cdots\right]>p_{0}-\sum_{k=1}^{\infty}(p_{k-1}-p_{k})\geq 0

for |z| < 1, where the coefficients p_{k−1} − p_k are non-negative by assumption. And notice that G(1) = 1 ≠ 0; for |z| = 1 with z ≠ 1 the triangle inequality above is strict, so these points are not zeros either. ∎

The condition of the next corollary is weaker than that of Corollary 2.1, and its conclusion (DPCP) is also weaker than that of Corollary 2.1 (DCP).

###### Corollary 2.2.

The COM-negative binomial distribution (3) is a discrete pseudo compound Poisson distribution if 0 < r ≤ 1 or p·r^ν ≤ 1.

###### Proof.

On the one hand, 0 < r ≤ 1 implies that the COM-negative binomial is discrete compound Poisson by Corollary 2.1, hence it is discrete pseudo compound Poisson. On the other hand, to apply Lemma 2.4 we need to guarantee that P(X=k)/P(X=k−1) = p·((k−1+r)/k)^ν ≤ 1 for k = 1, 2, …. When r > 1, ((k−1+r)/k)^ν is a decreasing function of k and reaches its maximum r^ν at k = 1. So p·r^ν ≤ 1 is the other case. ∎

Applying the recurrence relation (Lévy-Adelson-Panjer recursion) for the p.m.f. of a DPCP distribution (see Remark 1 in Zhang et al. (2014)),

 P_{n+1}=\frac{\lambda}{n+1}\left[\alpha_{1}P_{n}+2\alpha_{2}P_{n-1}+\cdots+(n+1)\alpha_{n+1}P_{0}\right],\quad P_{0}=e^{-\lambda},\ n=0,1,\ldots,

and P_0 = P(X = 0) = (1−p)^r/C(r, ν, p), the DPCP parametrization of the COM-negative binomial distribution is determined by the following system of equations:

 \left(\frac{\Gamma(r+n+1)}{(n+1)!\,\Gamma(r)}\right)^{\nu}p^{n+1}=\frac{\lambda}{n+1}\left[\alpha_{1}\left(\frac{\Gamma(r+n)}{n!\,\Gamma(r)}\right)^{\nu}p^{n}+2\alpha_{2}\left(\frac{\Gamma(r+n-1)}{(n-1)!\,\Gamma(r)}\right)^{\nu}p^{n-1}+\cdots+(n+1)\alpha_{n+1}\right],\quad n=0,1,\ldots,

where λ = −log P_0 = log C(r, ν, p) − r log(1−p).
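As a numerical illustration (a sketch with arbitrarily chosen parameter values, not code from the paper), the recursion can be inverted to recover λ and the first few α_i from the p.m.f.; for 0 < r ≤ 1 the recovered α_i should be non-negative, in line with Corollary 2.1.

```python
import math

def cmnb_pmf(r, nu, p, kmax=400):
    w = [1.0]
    for k in range(1, kmax + 1):
        w.append(w[-1] * p * ((k - 1 + r) / k) ** nu)
    total = sum(w)
    return [x / total for x in w]

def dpcp_parameters(P, n_alpha):
    """Invert the Levy-Adelson-Panjer recursion: recover lambda and alpha_1..alpha_n."""
    lam = -math.log(P[0])            # since P_0 = e^{-lambda}
    alphas = []
    for n in range(n_alpha):         # solve the n-th equation for alpha_{n+1}
        acc = sum((j + 1) * alphas[j] * P[n - j] for j in range(n))
        alphas.append(((n + 1) * P[n + 1] / lam - acc) / ((n + 1) * P[0]))
    return lam, alphas

P = cmnb_pmf(r=0.8, nu=1.2, p=0.4)
lam, alphas = dpcp_parameters(P, 8)
```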

### 2.4 Overdispersion and underdispersion

In statistics, for a given random sample X_1, …, X_n, overdispersion means that the sample variance exceeds the sample mean; conversely, underdispersion means that the sample variance is smaller than the sample mean, and equal-dispersion means the two are equal. Gómez-Déniz (2011) summarized the phenomena of insurance claim count data, which are characterized by two features: (i) overdispersion, i.e., the variance is greater than the mean; (ii) zero-inflation, i.e., the presence of a high percentage of zero values in the empirical distribution.

The COM-negative binomial distribution belongs to the family of weighted Poisson distributions (see Kokonendji et al. (2008)) with p.m.f.

 P(X=k)=\frac{w(k)}{E[w(N)]}\cdot\frac{\theta^{k}}{k!}e^{-\theta}, (12)

where w(·) is a non-negative weight function and N is a Poisson r.v. with mean θ.

Then the weighted Poisson representation of the COM-negative binomial distribution is

 P(X=k)=\frac{(1-p)^{r}e^{p}}{C(r,\nu,p)[\Gamma(r)]^{\nu}}\cdot\frac{[\Gamma(1+k)]^{1-\nu}[\Gamma(r+k)]^{\nu}p^{k}}{k!}e^{-p},\quad k=0,1,2,\ldots.

Therefore, the COM-negative binomial distribution in (3) can be seen as a weighted Poisson distribution (with θ = p) and weight function

 f(k,r,\nu)=w(k)=[\Gamma(1+k)]^{1-\nu}[\Gamma(r+k)]^{\nu}. (13)

Theorem 3 and its corollary in Kokonendji et al. (2008) provide an "iff" condition for the overdispersion and underdispersion of weighted Poisson distributions.

###### Lemma 2.5.

Let X_w be the weighted Poisson r.v. in (12) with Poisson mean θ, and let w be a weight function that does not depend on θ. Then the weight function w is log-convex (log-concave) if and only if the weighted Poisson r.v. X_w is overdispersed (underdispersed).

Kokonendji et al. (2008) applied it to show that the COM-Poisson distribution is overdispersed if ν < 1 and underdispersed if ν > 1. We employ their method to get a criterion for the overdispersion or underdispersion of the COM-negative binomial distribution.

###### Theorem 2.1.

The COM-negative binomial distribution (3) is overdispersed if ∑_{i=0}^∞ [(1−ν)/(i+k+1)² + ν/(i+k+r)²] ≥ 0 for all k ∈ ℕ, and underdispersed if this sum is ≤ 0 for all k ∈ ℕ.

###### Proof.

The function f(k, r, ν) is log-convex (log-concave) in k if d² log f(k, r, ν)/dk² ≥ 0 (≤ 0). By the formula for the logarithmic second derivative of the Gamma function (see p. 54 of Temme (2011)), d² log Γ(x)/dx² = ∑_{i=0}^∞ 1/(i+x)², we have

 \frac{d^{2}\log f(k,r,\nu)}{dk^{2}}=(1-\nu)\frac{d^{2}\log\Gamma(k+1)}{dk^{2}}+\nu\frac{d^{2}\log\Gamma(k+r)}{dk^{2}}=\sum_{i=0}^{\infty}\left(\frac{1-\nu}{(i+k+1)^{2}}+\frac{\nu}{(i+k+r)^{2}}\right),\quad\forall k\in\mathbb{N}.

Applying Lemma 2.5, the proof is complete. ∎

Then, results on overdispersion can be easily obtained from Theorem 2.1.

###### Corollary 2.3.

In the following two cases: 1. 0 ≤ ν ≤ 1; 2. 0 < r ≤ 1, the COM-negative binomial distribution is overdispersed.

Remark 4: The result of case 2 (0 < r ≤ 1) can also be obtained from Corollary 2.1 and the overdispersion of discrete compound Poisson distributions (equivalently, discrete infinitely divisible distributions).
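Both cases of Corollary 2.3 can be verified numerically; the following sketch (with arbitrary illustrative parameter values) computes the mean and variance from a truncated p.m.f. built with the ratio (6).

```python
def cmnb_pmf(r, nu, p, kmax=1000):
    w = [1.0]
    for k in range(1, kmax + 1):
        w.append(w[-1] * p * ((k - 1 + r) / k) ** nu)
    total = sum(w)
    return [x / total for x in w]

def mean_var(pmf):
    m = sum(k * q for k, q in enumerate(pmf))
    v = sum((k - m) ** 2 * q for k, q in enumerate(pmf))
    return m, v

m1, v1 = mean_var(cmnb_pmf(r=3.0, nu=0.8, p=0.3))   # case 1: 0 <= nu <= 1
m2, v2 = mean_var(cmnb_pmf(r=0.7, nu=2.5, p=0.3))   # case 2: 0 < r <= 1
```

In both cases the computed variance exceeds the mean, as the corollary predicts.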

### 2.5 Stochastic ordering

Stochastic ordering formalizes the idea of one r.v. X being stochastically greater than, less than, or equal to another r.v. Y. There are many types of stochastic orders, which have various applications in risk theory. Firstly, we present four different definitions for discrete r.v.'s: the usual stochastic order, the likelihood ratio order, the hazard rate order and the mean residual life order.

1. X is stochastically less than Y in the usual stochastic order (denoted by X ≤_st Y) if F̄_X(n) ≤ F̄_Y(n) for all n, where F̄_X(n) = P(X > n) is the survival function of X with p.m.f. {p_k}.

2. X is stochastically less than Y in the likelihood ratio order (denoted by X ≤_lr Y) if q_n/p_n increases in n over the union of the supports of X and Y, where p_n and q_n denote the p.m.f.'s of X and Y, respectively.

3. X is stochastically less than Y in the hazard rate order (denoted by X ≤_hr Y) if h_X(n) ≥ h_Y(n) for all n, where the hazard rate function of a discrete r.v. X with p.m.f. {p_k} is defined as h_X(n) = p_n/P(X ≥ n).

4. X is stochastically less than Y in the mean residual life order (denoted by X ≤_mrl Y) if m_X(n) ≤ m_Y(n) for all n, where the mean residual life function of a discrete r.v. X is defined as m_X(n) = E[X − n ∣ X > n].

The relationships among the above four stochastic orders are X ≤_lr Y ⟹ X ≤_hr Y ⟹ X ≤_st Y (see Theorem 1.C.1 of Shaked and Shanthikumar (2007)) and X ≤_hr Y ⟹ X ≤_mrl Y (see Theorem 1.B.1 of Shaked and Shanthikumar (2007)).

Gupta et al. (2014) gave the stochastic ordering between a COM-Poisson r.v. X and a Poisson r.v. Y with the same parameter λ in (1): X ≤_lr Y, therefore X ≤_hr Y, X ≤_mrl Y and X ≤_st Y. In the following result we show that the COM-negative binomial distribution also has some stochastic ordering properties.

###### Theorem 2.2.

Let X and Y be two r.v.'s following the COM-negative binomial distribution with parameters (r, ν₁, p) and (r, ν₂, p), respectively, where r ≥ 1. If ν₁ ≤ ν₂, then X ≤_lr Y, hence X ≤_hr Y, X ≤_mrl Y and X ≤_st Y.

###### Proof.

Note that the consecutive ratio of Γ(r+n)/(n! Γ(r)) is (r+n−1)/n ≥ 1 when r ≥ 1, so Γ(r+n)/(n! Γ(r)) is non-decreasing in n. Then

 \frac{P(Y=n)}{P(X=n)}=\left(\frac{\Gamma(r+n)}{n!\,\Gamma(r)}\right)^{\nu_{2}-\nu_{1}}\frac{C(r,\nu_{1},p)}{C(r,\nu_{2},p)},\quad n=0,1,2,\ldots,

which is increasing in n since ν₂ ≥ ν₁. ∎

In particular, when r is a positive integer, the COM-negative binomial distribution may be called the COM-Pascal distribution. Let X be COM-negative binomial distributed with parameters (r, ν, p) and Y be negative binomial distributed with the same (r, p); Theorem 2.2 then yields X ≤_lr Y when ν ≤ 1.

The next theorem is proved from the viewpoint of the weighted Poisson representation (12) with the weight function of the COM-negative binomial distribution. Example 1.C.59 of Shaked and Shanthikumar (2007) states the lemma below:

###### Lemma 2.6.

Let X be a nonnegative r.v. with density function f, and define X_w as the r.v. with the weighted density function w(x)f(x)/E[w(X)]. Similarly, for another nonnegative r.v. Y with density function g, define Y_w as the r.v. with the weighted density function w(x)g(x)/E[w(Y)]. If X ≤_lr Y and w is an increasing function, then X_w ≤_lr Y_w.

###### Theorem 2.3.

Let X and Y be two COM-negative binomial distributed r.v.'s with parameters (r₁, ν, p₁) and (r₂, ν, p₂), respectively. If r₁ ≤ r₂ with p₁ = p₂, or p₁ ≤ p₂ with r₁ = r₂, then X ≤_lr Y, and therefore X ≤_hr Y, X ≤_mrl Y and X ≤_st Y.

###### Proof.

For Poisson distributed X₁, X₂ with means p₁, p₂, the likelihood ratio P(X₂=k)/P(X₁=k) is proportional to (p₂/p₁)^k; if p₁ ≤ p₂, then it is increasing for all k, so X₁ ≤_lr X₂. From Section 2.4, we know that the COM-negative binomial is a weighted Poisson distribution with weight (13).

On the one hand, when p₁ ≤ p₂ and r₁ = r₂, the underlying Poisson r.v.'s are ordered in likelihood ratio and share the same weight function, so Lemma 2.6 gives X ≤_lr Y. On the other hand, when p₁ = p₂ and r₁ ≤ r₂, the ratio of weight functions [Γ(r₂+k)Γ(r₁)/(Γ(r₁+k)Γ(r₂))]^ν is increasing with respect to k, hence the likelihood ratio P(Y=k)/P(X=k) is increasing, that is, X ≤_lr Y. ∎

### 2.6 Approximate to COM-Poisson distribution

The next theorem shows that the COM-negative binomial distribution is a suitable generalization, since its limit distribution is the COM-Poisson under some conditions. We prove that the COM-negative binomial distribution converges to the COM-Poisson distribution as r goes to infinity.

###### Theorem 2.4.

Suppose that the r.v. X has the COM-negative binomial distribution with parameters (r, ν, p), ν > 0; denote its p.m.f. by P(X = k ∣ r, ν, p), and let p = λ/(r^ν + λ) for a fixed λ > 0. Then

 \lim_{r\to\infty}P(X=k\mid r,\nu,p)=\frac{\lambda^{k}}{(k!)^{\nu}}\cdot\frac{1}{Z(\lambda,\nu)},\quad k=0,1,2,\ldots. (14)
###### Proof.

Notice that p = λ/(r^ν + λ); substituting into the p.m.f. (3), we obtain

 P(X=k\mid r,\nu,p)=\frac{\lambda^{k}}{(k!)^{\nu}}\cdot\left(\frac{\Gamma(r+k)}{\Gamma(r)r^{k}}\right)^{\nu}\cdot\frac{1}{(1+\lambda/r^{\nu})^{k}}\Big/\left[C(r,\nu,p)\left(\frac{r^{\nu}}{r^{\nu}+\lambda}\right)^{-r}\right].

Hence,

 \lim_{r\to\infty}P(X=k\mid r,\nu,p)=\frac{\lambda^{k}}{(k!)^{\nu}}\Big/\lim_{r\to\infty}C(r,\nu,p)\left(\frac{r^{\nu}}{r^{\nu}+\lambda}\right)^{-r}=\frac{\lambda^{k}}{(k!)^{\nu}}\cdot\frac{1}{Z(\lambda,\nu)}

holds as r → ∞, since Γ(r+k)/(Γ(r)r^k) → 1, (1+λ/r^ν)^{−k} → 1, and

 \lim_{r\to+\infty}C(r,\nu,p)\left(\frac{r^{\nu}}{r^{\nu}+\lambda}\right)^{-r}=\lim_{n\to+\infty}\lim_{r\to+\infty}\sum_{i=0}^{n}\frac{\lambda^{i}}{(i!)^{\nu}}\cdot\left(\frac{\Gamma(r+i)}{\Gamma(r)r^{i}}\right)^{\nu}\cdot\frac{1}{(1+\lambda/r^{\nu})^{i}}=\lim_{n\to+\infty}\sum_{i=0}^{n}\frac{\lambda^{i}}{(i!)^{\nu}}=Z(\lambda,\nu).\ \ ∎
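The convergence in Theorem 2.4 can be observed numerically; this sketch (with illustrative values λ = 2, ν = 1.5) compares the COM-negative binomial p.m.f. with p = λ/(r^ν + λ) against the COM-Poisson p.m.f. as r grows.

```python
def cmnb_pmf(r, nu, p, kmax=400):
    w = [1.0]
    for k in range(1, kmax + 1):
        w.append(w[-1] * p * ((k - 1 + r) / k) ** nu)
    total = sum(w)
    return [x / total for x in w]

def cmp_pmf(lam, nu, kmax=400):
    # COM-Poisson p.m.f. (1) via its consecutive ratio lambda / k^nu
    w = [1.0]
    for k in range(1, kmax + 1):
        w.append(w[-1] * lam / k ** nu)
    total = sum(w)
    return [x / total for x in w]

lam, nu = 2.0, 1.5
target = cmp_pmf(lam, nu)
dists = []
for r in (10.0, 100.0, 1000.0):
    p = lam / (r ** nu + lam)
    approx = cmnb_pmf(r, nu, p)
    dists.append(max(abs(a - b) for a, b in zip(approx, target)))
```

The maximal pointwise distance shrinks as r increases through 10, 100, 1000.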

## 3 Characterizations

### 3.1 Sum of equicorrelated geometrically distributed r.v.

It is well known that the binomial r.v. S can be seen as the sum of m independent Bernoulli r.v.'s Z_i:

 S=Z_{1}+Z_{2}+\cdots+Z_{m},

where

 P(Z_{i}=1)=p,\quad P(Z_{i}=0)=1-p,\quad i=1,2,\ldots,m,
 P(S=k)=\binom{m}{k}p^{k}(1-p)^{m-k}.

Shmueli et al. (2005) and Borges et al. (2014) mentioned that the COM-binomial distribution (9) can be presented as a sum of equicorrelated Bernoulli r.v.’s with joint distribution

 P(Z_{1}=z_{1},\ldots,Z_{m}=z_{m})=\frac{\binom{m}{k}^{\nu-1}p^{k}(1-p)^{m-k}}{\sum_{x_{1}=0}^{1}\cdots\sum_{x_{m}=0}^{1}\binom{m}{\sum_{i=1}^{m}x_{i}}^{\nu-1}p^{\sum_{i=1}^{m}x_{i}}(1-p)^{m-\sum_{i=1}^{m}x_{i}}},\quad z=(z_{1},\ldots,z_{m})\in\{0,1\}^{m},

where k = z₁ + ⋯ + z_m.

As we know, the negative binomial distribution can be treated as the sum of m independent geometric r.v.'s Z_i:

 S=Z_{1}+Z_{2}+\cdots+Z_{m},
 P(S=x)=\binom{m+x-1}{x}p^{x}(1-p)^{m},

where P(Z_i = k) = p^k(1−p), k = 0, 1, 2, …, for i = 1, …, m.

Similarly, the COM-negative binomial distribution can be interpreted as a sum of equicorrelated geometric r.v.'s with joint distribution

 P(Z_{1}=z_{1},\ldots,Z_{m}=z_{m})\propto\binom{m+x-1}{x}^{\nu-1}p^{x}(1-p)^{m}, (15)

where x = z₁ + ⋯ + z_m.

The reason is as follows: we assume (Z₁, …, Z_m) is equicorrelated (exchangeable), and the equation z₁ + ⋯ + z_m = x has \binom{m+x-1}{x} feasible nonnegative integer solutions, each solution having the same probability. Then, summing over the possible values of the random vector (Z₁, …, Z_m) such that Z₁ + ⋯ + Z_m = x, the sum S is COM-negative binomial distributed, namely

 \binom{m+x-1}{x}P(Z_{1}=z_{1},\ldots,Z_{m}=z_{m})\propto\binom{m+x-1}{x}^{\nu}p^{x}(1-p)^{m}.

Thus we have (15).
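The counting argument above can be checked by brute-force enumeration; in this illustrative sketch (with arbitrarily chosen m, ν, p), summing the joint weight (15) over all compositions of x recovers the CMNB kernel with exponent ν.

```python
from itertools import product
from math import comb

m, x_max, nu, p = 3, 6, 1.8, 0.4
# joint weight (15), up to the common normalization constant
weight = {}
for z in product(range(x_max + 1), repeat=m):
    x = sum(z)
    if x <= x_max:
        weight[z] = comb(m + x - 1, x) ** (nu - 1) * p ** x * (1 - p) ** m

for x in range(x_max + 1):
    sols = [z for z in weight if sum(z) == x]
    # number of nonnegative integer solutions of z_1 + ... + z_m = x
    assert len(sols) == comb(m + x - 1, x)
    # summing over them raises the exponent from nu - 1 to nu, as in the display above
    total = sum(weight[z] for z in sols)
    assert abs(total - comb(m + x - 1, x) ** nu * p ** x * (1 - p) ** m) < 1e-9
```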

### 3.2 Conditional distribution

In this subsection, two conditional distribution characterizations are obtained for the COM-negative binomial distribution. For two independent r.v.'s X and Y, what is the form of the conditional distribution of X given X + Y = s? Consider the sum S = X + Y of independent COM-negative binomial r.v.'s with parameters (r_x, ν, p) and (r_y, ν, p); then

 P(S=s)=\sum_{x=0}^{s}P(X=x)P(Y=s-x)=\sum_{x=0}^{s}\left(\frac{\Gamma(r_{x}+x)}{x!\,\Gamma(r_{x})}\right)^{\nu}\frac{p^{x}(1-p)^{r_{x}}}{C(r_{x},\nu,p)}\left(\frac{\Gamma(r_{y}+s-x)}{(s-x)!\,\Gamma(r_{y})}\right)^{\nu}\frac{p^{s-x}(1-p)^{r_{y}}}{C(r_{y},\nu,p)}
 =\frac{(1-p)^{r_{x}+r_{y}}p^{s}}{C(r_{x},\nu,p)C(r_{y},\nu,p)}\sum_{x=0}^{s}\left(\frac{\Gamma(r_{x}+x)\Gamma(r_{y}+s-x)}{x!\,\Gamma(r_{x})(s-x)!\,\Gamma(r_{y})}\right)^{\nu}
 =\frac{(1-p)^{r_{x}+r_{y}}[\Gamma(r_{x}+r_{y}+s)]^{\nu}p^{s}}{C(r_{x},\nu,p)C(r_{y},\nu,p)[\Gamma(r_{x}+r_{y})]^{\nu}}\sum_{x=0}^{s}\Big(\cdots