Bivariate gamma-geometric law and its induced
Lévy process
Published in Journal of Multivariate Analysis, volume 109, August 2012,
pages 130-145, DOI: 10.1016/j.jmva.2012.03.004

Wagner Barreto-Souza
Departamento de Estatística, Universidade de São Paulo
Rua do Matão, 1010, São Paulo/SP 05508-090, Brazil
E-mail: wagnerbs85@gmail.com
Abstract

In this article we introduce a three-parameter extension of the bivariate exponential-geometric (BEG) law (Kozubowski and Panorska, 2005). We refer to this new distribution as the bivariate gamma-geometric (BGG) law. A bivariate random vector $(X,N)$ follows the BGG law if $N$ has a geometric distribution and $X$ may be represented (in law) as a sum of $N$ independent and identically distributed gamma variables, where these variables are independent of $N$. Statistical properties such as the moment generating and characteristic functions, moments and variance-covariance matrix are provided. The marginal and conditional laws are also studied. We show that the BGG distribution is infinitely divisible, just as the BEG model is. Further, we provide alternative representations for the BGG distribution and show that it enjoys a geometric stability property. Maximum likelihood estimation and inference are discussed and a reparametrization is proposed in order to obtain orthogonality of the parameters. We present an application to a real data set where our model provides a better fit than the BEG model. Our bivariate distribution induces a bivariate Lévy process with correlated gamma and negative binomial processes, which extends the bivariate Lévy motion proposed by Kozubowski et al. (2008). The marginals of our Lévy motion are mixtures of gamma and negative binomial processes, and we name it the BMixGNB motion. Basic properties such as stochastic self-similarity and the covariance matrix of the process are presented. The bivariate distribution at fixed time of our BMixGNB process is also studied and some results are derived, including a discussion of maximum likelihood estimation and inference.

Keywords: Bivariate gamma-geometric law; Characteristic function; Infinitely divisible distribution; Maximum likelihood estimation; Orthogonal parameters; Lévy process.

1 Introduction

Mixed univariate distributions have been introduced and studied in recent years by compounding continuous and discrete distributions. Marshall and Olkin (1997) introduced a class of distributions which can be obtained as the minimum and maximum of independent and identically distributed (iid) continuous random variables (independent of the random sample size), where the sample size follows a geometric distribution.

Chahkandi and Ganjali (2009) introduced some lifetime distributions by compounding exponential and power series distributions; these models are called exponential power series (EPS) distributions. Recently, Morais and Barreto-Souza (2011) introduced a class of distributions obtained by mixing Weibull and power series distributions and studied several of its statistical properties. This class contains the EPS distributions and other lifetime models studied recently, for example, the Weibull-geometric distribution (Marshall and Olkin, 1997; Barreto-Souza et al., 2011). The reader is referred to the introduction of Morais and Barreto-Souza (2011) for a brief literature review of univariate distributions obtained by compounding.

A mixed bivariate law with exponential and geometric marginals was introduced by Kozubowski and Panorska (2005), and named the bivariate exponential-geometric (BEG) distribution. A bivariate random vector $(X,N)$ follows the BEG law if it admits the stochastic representation

$$(X,N)\stackrel{d}{=}\left(\sum_{i=1}^{N}X_i,\;N\right), \qquad (1)$$

where the variable $N$ follows a geometric distribution and $\{X_i\}_{i=1}^{\infty}$ is a sequence of iid exponential variables, independent of $N$. The BEG law is infinitely divisible and therefore leads to a bivariate Lévy process, in this case with gamma and negative binomial marginal processes. This bivariate process, named BGNB motion, was introduced and studied by Kozubowski et al. (2008).

Other multivariate distributions involving exponential and geometric distributions have been studied in the literature. Kozubowski and Panorska (2008) introduced and studied a bivariate distribution involving geometric maximum of exponential variables. A trivariate distribution involving geometric sums and maximum of exponentials variables was also recently introduced by Kozubowski et al. (2011).

Our chief goal in this article is to introduce a three-parameter extension of the BEG law. We refer to this new three-parameter distribution as bivariate gamma-geometric (BGG) law. Further, we show that this extended distribution is infinitely divisible, and, therefore, it induces a bivariate Lévy process which has the BGNB motion as particular case. The additional parameter controls the shape of the continuous part of our models.

Our bivariate distribution may be applied in areas such as hydrology and finance. Here we focus on finance applications and use the BGG law for modeling log-returns (the $X_i$’s) corresponding to a daily exchange rate. More specifically, we are interested in modeling cumulative log-returns (the $X$) in growth periods of the exchange rates. In this case $N$ represents the duration of the growth period, where the consecutive log-returns are positive. As mentioned by Kozubowski and Panorska (2005), the geometric sum represented by $X$ in (1) is very useful in several fields including water resources, climate research and finance. We refer the reader to the introduction of Kozubowski and Panorska (2005) for a good discussion of practical situations where random vectors with description (1) may be useful.

The present article is organized as follows. In Section 2 we introduce the bivariate gamma-geometric law and derive basic statistical properties, including a study of some properties of its marginal and conditional distributions. Further, we show that our proposed law is infinitely divisible. Estimation by maximum likelihood and large-sample inference are addressed in Section 3, which also contains a proposed reparametrization of the model in order to obtain orthogonality of the parameters in the sense of Cox and Reid (1987). An application to a real data set is presented in Section 4. The induced Lévy process is treated in Section 5 and some of its basic properties are shown. We include a study of the bivariate distribution of the process at fixed time and also discuss estimation of the parameters and inferential aspects. We close the article with concluding remarks in Section 6.

2 The law and basic properties

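The bivariate gamma-geometric (BGG) law is defined by the stochastic representation (1), assuming that $\{X_i\}_{i=1}^{\infty}$ is a sequence of iid gamma variables independent of $N$ and with probability density function given by $g(x;\alpha,\beta)=\beta^{\alpha}x^{\alpha-1}e^{-\beta x}/\Gamma(\alpha)$, for $x>0$ and $\alpha,\beta>0$; we denote $X_i\sim\Gamma(\alpha,\beta)$. As before, $N$ is a geometric variable with probability mass function given by $P(N=n)=p(1-p)^{n-1}$, for $n\in\mathbb{N}$ and $0<p<1$; we denote $N\sim\mbox{Geom}(p)$. Clearly, the BGG law contains the BEG law as a particular case, for the choice $\alpha=1$. Since $X$ given $N=n$ follows the $\Gamma(n\alpha,\beta)$ distribution, the joint density function of $(X,N)$ is given by

$$f_{X,N}(x,n)=\frac{\beta^{n\alpha}}{\Gamma(n\alpha)}\,x^{n\alpha-1}e^{-\beta x}\,p(1-p)^{n-1},\quad x>0,\ n\in\mathbb{N}. \qquad (2)$$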
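As an illustration of representation (1), the following Python sketch draws from the BGG law by first sampling the geometric count and then summing that many gamma variables. It assumes the rate parametrization (gamma density proportional to $x^{\alpha-1}e^{-\beta x}$) and a geometric law on $\{1,2,\ldots\}$ with success probability $p$; the function names `sample_bgg` and `sample_means` are hypothetical and not part of the article.

```python
import random


def sample_bgg(beta, alpha, p, rng):
    """Draw one pair (X, N) via the random-sum representation (1).

    Assumptions: the gamma variables have shape alpha and rate beta,
    and N is geometric on {1, 2, ...} with success probability p.
    """
    # Count Bernoulli(p) trials up to and including the first success.
    n = 1
    while rng.random() >= p:
        n += 1
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    x = sum(rng.gammavariate(alpha, 1.0 / beta) for _ in range(n))
    return x, n


def sample_means(beta, alpha, p, size, seed=0):
    """Monte Carlo means of X and N over `size` independent draws."""
    rng = random.Random(seed)
    draws = [sample_bgg(beta, alpha, p, rng) for _ in range(size)]
    mean_x = sum(x for x, _ in draws) / size
    mean_n = sum(n for _, n in draws) / size
    return mean_x, mean_n
```

For instance, with `beta=2`, `alpha=1.5` and `p=0.4`, the two sample means should be close to $\alpha/(p\beta)=1.875$ and $1/p=2.5$, which follow from Wald's identity under the stated assumptions.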

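Hence, it follows that the joint cumulative distribution function (cdf) of the BGG distribution can be expressed by

$$P(X\leq x,N\leq n)=p\sum_{j=1}^{n}(1-p)^{j-1}\frac{\Gamma_{\beta x}(j\alpha)}{\Gamma(j\alpha)},$$

for $x>0$ and $n\in\mathbb{N}$, where $\Gamma_{x}(a)=\int_{0}^{x}t^{a-1}e^{-t}dt$ is the incomplete gamma function. We will denote $(X,N)\sim\mbox{BGG}(\beta,\alpha,p)$. We now show that $(pX,pN)\stackrel{d}{\rightarrow}(\alpha Z/\beta,Z)$ as $p\rightarrow0^{+}$, where ‘$\stackrel{d}{\rightarrow}$’ denotes convergence in distribution and $Z$ is an exponential variable with mean 1; for $\alpha=1$, we obtain the result given in Proposition 2.3 from Kozubowski and Panorska (2005). For this, we use the moment generating function of the BGG distribution, which is given in Subsection 2.2. Hence, we have that $E(e^{tpX+spN})=\varphi(pt,ps)$, where $\varphi(\cdot,\cdot)$ is given by (4). Using L’Hôpital’s rule, one may check that $\varphi(pt,ps)\rightarrow(1-s-\alpha t/\beta)^{-1}$ as $p\rightarrow0^{+}$, which is the moment generating function of $(\alpha Z/\beta,Z)$.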

2.1 Marginal and conditional distributions

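The marginal density of $X$ with respect to Lebesgue measure is an infinite mixture of gamma densities, which is given by

$$f_X(x)=\sum_{n=1}^{\infty}p(1-p)^{n-1}\frac{\beta^{n\alpha}x^{n\alpha-1}e^{-\beta x}}{\Gamma(n\alpha)}=\frac{p\,x^{-1}e^{-\beta x}}{1-p}\sum_{n=1}^{\infty}\frac{[(1-p)(\beta x)^{\alpha}]^{n}}{\Gamma(n\alpha)},\quad x>0. \qquad (3)$$

Therefore, the BGG distribution has infinite mixture of gamma and geometric marginals. Some alternative expressions for the marginal density of $X$ can be obtained. For example, for $\alpha=1$, we obtain the exponential density with rate $p\beta$. Further, with help from Wolfram Alpha (http://www.wolframalpha.com/), for , we have that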

and

respectively, where and is the error function. Figure 1 shows some plots of the marginal density of for , and some values of .

Figure 1: Plots of the marginal density of for , , (left) and (right).
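The mixture form of the marginal density of $X$ lends itself to direct numerical evaluation. The sketch below sums the geometric-weighted gamma densities on the log scale until the terms become negligible. It assumes the rate parametrization of the gamma law and geometric weights $p(1-p)^{n-1}$; the function name `bgg_marginal_density` and the truncation rule are illustrative choices, not the author's code.

```python
import math


def bgg_marginal_density(x, beta, alpha, p, tol=1e-15, max_terms=10000):
    """Marginal density of X as a geometric-weighted mixture of gammas.

    Assumptions: gamma components have shape n * alpha and rate beta,
    and the mixing weights are P(N = n) = p * (1 - p)**(n - 1).
    """
    if x <= 0.0:
        return 0.0
    total = 0.0
    for n in range(1, max_terms + 1):
        shape = n * alpha
        # Log of P(N = n) times the Gamma(shape, rate beta) density at x.
        log_term = (math.log(p) + (n - 1) * math.log1p(-p)
                    + shape * math.log(beta) + (shape - 1.0) * math.log(x)
                    - beta * x - math.lgamma(shape))
        term = math.exp(log_term)
        total += term
        # The summands are unimodal in n, so a term that is negligible
        # relative to the running total can only occur in the tail.
        if n > 1 and term < tol * total:
            break
    return total
```

A simple sanity check is the case `alpha=1`, where the series should collapse to the exponential density $p\beta e^{-p\beta x}$.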

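We now obtain some conditional distributions which may be useful in goodness-of-fit analyses when the BGG distribution is assumed to model real data (see Section 4). Let $m\leq n$ be positive integers and $x>0$. The conditional cdf of $(X,N)$ given $N\leq n$ is

$$P(X\leq x,N\leq m\mid N\leq n)=\frac{p}{1-(1-p)^{n}}\sum_{j=1}^{m}(1-p)^{j-1}\frac{\Gamma_{\beta x}(j\alpha)}{\Gamma(j\alpha)}.$$

We have that $P(X\leq x\mid N\leq n)$ is given by the right side of the above expression with $n$ replacing $m$.

For $0<x\leq y$ and $n\in\mathbb{N}$, the conditional cdf of $(X,N)$ given $X\leq y$ is

$$P(X\leq x,N\leq n\mid X\leq y)=\frac{\sum_{j=1}^{n}(1-p)^{j-1}\Gamma_{\beta x}(j\alpha)/\Gamma(j\alpha)}{\sum_{j=1}^{\infty}(1-p)^{j-1}\Gamma_{\beta y}(j\alpha)/\Gamma(j\alpha)}.$$

The conditional probability $P(N\leq n\mid X\leq y)$ is given by the right side of the above expression with $y$ replacing $x$.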

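From (2) and (3), we obtain that the conditional probability mass function of $N$ given $X=x$ is

$$P(N=n\mid X=x)=\frac{[(1-p)(\beta x)^{\alpha}]^{n}/\Gamma(\alpha n)}{\sum_{j=1}^{\infty}[(1-p)(\beta x)^{\alpha}]^{j}/\Gamma(\alpha j)},$$

for $n\in\mathbb{N}$. If $\alpha$ is known, the above probability mass function belongs to the one-parameter power series class of distributions; for instance, see Noack (1950). In this case, the parameter would be $(1-p)(\beta x)^{\alpha}$. For $\alpha=1$, the above expression reduces to $[(1-p)\beta x]^{n-1}e^{-(1-p)\beta x}/(n-1)!$, that is, a Poisson distribution with parameter $\beta x(1-p)$ shifted up by one, which agrees with formula (7) from Kozubowski and Panorska (2005). For the choice , we have that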

where .

2.2 Moments

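The moment generating function (mgf) of the BGG law is

$$\varphi(t,s)=E\left(e^{tX+sN}\right)=E\left[e^{sN}E\left(e^{tX}\mid N\right)\right]=E\left\{\left[e^{s}\left(\frac{\beta}{\beta-t}\right)^{\alpha}\right]^{N}\right\},$$

and then

$$\varphi(t,s)=\frac{p\,e^{s}\beta^{\alpha}}{(\beta-t)^{\alpha}-e^{s}\beta^{\alpha}(1-p)}, \qquad (4)$$

for $t<\beta\{1-[(1-p)e^{s}]^{1/\alpha}\}$ and $s<-\log(1-p)$. The characteristic function may be obtained in a similar way and is given by

$$\Phi(t,s)=\frac{p\,e^{is}\beta^{\alpha}}{(\beta-it)^{\alpha}-e^{is}\beta^{\alpha}(1-p)}, \qquad (5)$$

for $t,s\in\mathbb{R}$. With this, the product and marginal moments can be obtained by computing $E(X^{m}N^{k})=\partial^{m+k}\varphi(t,s)/\partial t^{m}\partial s^{k}|_{t,s=0}$ or $E(X^{m}N^{k})=(-i)^{m+k}\partial^{m+k}\Phi(t,s)/\partial t^{m}\partial s^{k}|_{t,s=0}$. Hence, we obtain the following expression for the product moments of the random vector $(X,N)$: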

(6)

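where $B(a,b)=\Gamma(a)\Gamma(b)/\Gamma(a+b)$, for $a,b>0$, is the beta function. In particular, we obtain that $E(X)=\alpha/(p\beta)$, $E(N)=1/p$ and the covariance matrix $\Sigma$ of $(X,N)$ is given by

$$\Sigma=\begin{pmatrix}\dfrac{(1-p)\alpha^{2}}{p^{2}\beta^{2}}+\dfrac{\alpha}{p\beta^{2}} & \dfrac{(1-p)\alpha}{\beta p^{2}}\\[2ex] \dfrac{(1-p)\alpha}{\beta p^{2}} & \dfrac{1-p}{p^{2}}\end{pmatrix}. \qquad (7)$$

The correlation coefficient between $X$ and $N$ is $\rho=\sqrt{(1-p)/(1-p+p/\alpha)}$. Let $\rho^{*}=\sqrt{1-p}$, that is, the correlation coefficient of a bivariate random vector following the BEG law. For $\alpha\leq1$, we have $\rho\leq\rho^{*}$, and for $\alpha>1$, it follows that $\rho>\rho^{*}$. Figure 2 shows some plots of the correlation coefficient of the BGG law as a function of $p$ for some values of $\alpha$.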
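The first two moments can be checked numerically from the random-sum structure alone, using the laws of total expectation and total variance. The sketch below assumes that the summands are gamma with shape $\alpha$ and rate $\beta$ and that $N$ is geometric on $\{1,2,\ldots\}$; the function name `bgg_moments` is hypothetical.

```python
import math


def bgg_moments(beta, alpha, p):
    """Means, covariance entries and correlation of (X, N).

    Assumptions: X is a sum of N iid gamma(shape alpha, rate beta)
    variables and N is geometric on {1, 2, ...} with parameter p.
    """
    mean_n = 1.0 / p
    var_n = (1.0 - p) / p ** 2
    mean_x = mean_n * alpha / beta
    # Law of total variance: E[Var(X | N)] + Var(E[X | N]).
    var_x = mean_n * alpha / beta ** 2 + var_n * (alpha / beta) ** 2
    # Cov(X, N) = Cov(E[X | N], N) = (alpha / beta) * Var(N).
    cov_xn = (alpha / beta) * var_n
    corr = cov_xn / math.sqrt(var_x * var_n)
    return {"mean_x": mean_x, "mean_n": mean_n, "var_x": var_x,
            "var_n": var_n, "cov_xn": cov_xn, "corr": corr}
```

The correlation does not depend on $\beta$. Evaluating it at the estimates reported in Section 4 ($\alpha=0.8805$, $p=0.5093$) gives a value that rounds to the fitted correlation 0.6775 quoted there.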

Figure 2: Plots of the correlation coefficient of the BGG law as a function of for .

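From (4), we find that the marginal mgf of $X$ is given by

$$\varphi_X(t)=\varphi(t,0)=\frac{p\,\beta^{\alpha}}{(\beta-t)^{\alpha}-\beta^{\alpha}(1-p)},$$

for $t<\beta\{1-(1-p)^{1/\alpha}\}$. The following expression for the $r$th moment of $X$ can be obtained from the above formula or from (6):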

We notice that the above expression is valid for any real .

2.3 Infinite divisibility, geometric stability and representations

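We now show that the BGG law is infinitely divisible, just as the BEG law is. Based on Kozubowski and Panorska (2005), we define, for each positive integer $n$ and $r=1/n$, the bivariate random vector

$$(R,V)=\left(\sum_{i=1}^{1+nT}W_i,\;r+T\right),$$

where the $W_i$’s are iid random variables following the $\Gamma(\alpha/n,\beta)$ distribution and independent of the random variable $T$, which follows a negative binomial distribution with the probability mass function

$$P(T=k)=\frac{\Gamma(k+r)}{k!\,\Gamma(r)}\,p^{r}(1-p)^{k},\quad k\in\{0,1,2,\ldots\}, \qquad (8)$$

where $r=1/n$. The moment generating function of $(R,V)$ is given by

$$E\left(e^{tR+sV}\right)=\left\{\frac{p\,e^{s}\beta^{\alpha}}{(\beta-t)^{\alpha}-e^{s}\beta^{\alpha}(1-p)}\right\}^{r},$$

which is valid for $t<\beta\{1-[(1-p)e^{s}]^{1/\alpha}\}$ and $s<-\log(1-p)$. In a similar way, we obtain that the characteristic function is given by

$$E\left(e^{itR+isV}\right)=\left\{\frac{p\,e^{is}\beta^{\alpha}}{(\beta-it)^{\alpha}-e^{is}\beta^{\alpha}(1-p)}\right\}^{r}, \qquad (9)$$

for $t,s\in\mathbb{R}$. With this, we have that $E(e^{itR+isV})=\Phi(t,s)^{1/n}$, where $\Phi(t,s)$ is the characteristic function of the BGG law given in (5). In words, the BGG distribution is infinitely divisible.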

The exponential, geometric and BEG distributions are closed under geometric summation. We now show that our distribution also enjoys this geometric stability property. Let $\{(X_i,N_i)\}_{i=1}^{\infty}$ be iid random vectors following the $\mbox{BGG}(\beta,\alpha,p)$ distribution, independent of $M$, where $M\sim\mbox{Geom}(q)$, with $0<q<1$. By using (4) and the probability generating function of the geometric distribution, one may easily check that

$$\sum_{i=1}^{M}(X_i,N_i)\sim\mbox{BGG}(\beta,\alpha,pq).$$

From the above result, we find another stochastic representation of the BGG law, which generalizes proposition 4.2 from Kozubowski and Panorska (2005):

where , with , and is defined as before. In what follows, another representation of the BGG law is provided, by showing that it is a convolution of a bivariate distribution (with gamma and degenerate at 1 marginals) and a compound Poisson distribution. Let be a sequence of iid random variables following logarithmic distribution with probability mass function , for , where . Define the random variable , independent of the ’s. Given the sequence , let , for , be a sequence of independent random variables and let be independent of all previously defined variables. Then, we have that

(10)

Taking in (10), we obtain the proposition 4.3 from Kozubowski and Panorska (2005). To show that the above representation holds, we use the probability generation functions (for ) and (for ). With this, it follows that

(11)

for . Furthermore, for , we have that

By using the above result in (11), we obtain the representation (10).
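The geometric stability property discussed earlier in this subsection can be probed by simulation. The sketch below compares a geometric sum of BGG vectors with direct draws from a BGG law whose geometric parameter is the product $pq$, which is what composing the two probability generating functions suggests. It assumes gamma summands with shape $\alpha$ and rate $\beta$ and geometric laws on $\{1,2,\ldots\}$; all function names are hypothetical.

```python
import random


def sample_geometric(prob, rng):
    """Geometric variable on {1, 2, ...} with success probability prob."""
    n = 1
    while rng.random() >= prob:
        n += 1
    return n


def sample_bgg(beta, alpha, p, rng):
    """One BGG draw; a sum of n iid gamma(alpha, rate beta) variables
    is gamma(n * alpha, rate beta), so it is sampled in one call."""
    n = sample_geometric(p, rng)
    return rng.gammavariate(n * alpha, 1.0 / beta), n


def sample_geometric_sum(beta, alpha, p, q, rng):
    """Sum of M iid BGG(beta, alpha, p) vectors with M ~ Geom(q)."""
    total_x, total_n = 0.0, 0
    for _ in range(sample_geometric(q, rng)):
        x, n = sample_bgg(beta, alpha, p, rng)
        total_x += x
        total_n += n
    return total_x, total_n


def summarize(draws):
    """Mean of X, mean of N and the share of draws with N equal to 1."""
    size = len(draws)
    return (sum(x for x, _ in draws) / size,
            sum(n for _, n in draws) / size,
            sum(1 for _, n in draws if n == 1) / size)


def compare_with_bgg(beta, alpha, p, q, size, seed=0):
    """Summaries of geometric sums versus direct BGG(beta, alpha, p*q)."""
    rng = random.Random(seed)
    sums = [sample_geometric_sum(beta, alpha, p, q, rng) for _ in range(size)]
    direct = [sample_bgg(beta, alpha, p * q, rng) for _ in range(size)]
    return summarize(sums), summarize(direct)
```

Agreement of these three summaries is only a necessary condition for equality in distribution; it is meant as a quick plausibility check, not a proof.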

3 Estimation and inference

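Let $(X_1,N_1),\ldots,(X_n,N_n)$ be a random sample from the $\mbox{BGG}(\beta,\alpha,p)$ distribution and $\theta=(\beta,\alpha,p)^{\top}$ be the parameter vector. The log-likelihood function $\ell=\ell(\theta)$ is given by

$$\ell=n\alpha\bar{N}_n\log\beta-n\beta\bar{X}_n-\sum_{i=1}^{n}\log\Gamma(\alpha N_i)+\sum_{i=1}^{n}(\alpha N_i-1)\log X_i+n\log p+n(\bar{N}_n-1)\log(1-p), \qquad (12)$$

where $\bar{X}_n=\sum_{i=1}^{n}X_i/n$ and $\bar{N}_n=\sum_{i=1}^{n}N_i/n$. The score function associated with the log-likelihood function (12) is $U(\theta)=(\partial\ell/\partial\beta,\partial\ell/\partial\alpha,\partial\ell/\partial p)^{\top}$, where

$$\frac{\partial\ell}{\partial\beta}=n\left(\frac{\alpha\bar{N}_n}{\beta}-\bar{X}_n\right),\qquad \frac{\partial\ell}{\partial\alpha}=n\bar{N}_n\log\beta+\sum_{i=1}^{n}N_i\left\{\log X_i-\Psi(\alpha N_i)\right\}$$

and

$$\frac{\partial\ell}{\partial p}=\frac{n}{p}-\frac{n(\bar{N}_n-1)}{1-p}, \qquad (13)$$

where $\Psi(x)=d\log\Gamma(x)/dx$ is the digamma function. By solving the nonlinear system of equations $U(\theta)=0$, it follows that the maximum likelihood estimators (MLEs) of the parameters are obtained by

$$\hat{p}=\frac{1}{\bar{N}_n},\qquad \hat{\beta}=\frac{\hat{\alpha}\bar{N}_n}{\bar{X}_n},\qquad \sum_{i=1}^{n}N_i\Psi(\hat{\alpha}N_i)-n\bar{N}_n\log\left(\frac{\hat{\alpha}\bar{N}_n}{\bar{X}_n}\right)=\sum_{i=1}^{n}N_i\log X_i. \qquad (14)$$

Since the MLE of $\alpha$ cannot be found in closed form, nonlinear optimization algorithms such as a Newton or a quasi-Newton algorithm are needed.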
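A minimal sketch of the estimation step is given below. It substitutes plain bisection on the profile score for $\alpha$ for the Newton-type algorithms mentioned above, because that profile score is strictly decreasing in $\alpha$ under the assumed model. The assumptions are the rate parametrization of the gamma summands and a geometric law on $\{1,2,\ldots\}$; `digamma`, `simulate_bgg` and `fit_bgg` are hypothetical helper names, and the digamma routine is a standard recurrence plus asymptotic expansion written out to keep the example free of external libraries.

```python
import math
import random


def digamma(x):
    """Digamma function via upward recurrence and an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    return (result + math.log(x) - 0.5 * inv
            - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0)))


def simulate_bgg(beta, alpha, p, size, seed=0):
    """Simulated sample (xs, ns) under the assumed parametrization."""
    rng = random.Random(seed)
    xs, ns = [], []
    for _ in range(size):
        n = 1
        while rng.random() >= p:
            n += 1
        xs.append(rng.gammavariate(n * alpha, 1.0 / beta))
        ns.append(n)
    return xs, ns


def fit_bgg(xs, ns, iterations=80):
    """Maximum likelihood estimates (beta, alpha, p) by bisection."""
    size = len(xs)
    mean_x = sum(xs) / size
    mean_n = sum(ns) / size
    total_n = sum(ns)
    sum_n_logx = sum(k * math.log(x) for x, k in zip(xs, ns))

    def profile_score(alpha):
        # Score for alpha after replacing beta by alpha * mean_n / mean_x.
        return (total_n * math.log(alpha * mean_n / mean_x) + sum_n_logx
                - sum(k * digamma(alpha * k) for k in ns))

    lo, hi = 1e-8, 1.0
    while profile_score(hi) > 0.0 and hi < 1e8:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if profile_score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    alpha_hat = 0.5 * (lo + hi)
    return alpha_hat * mean_n / mean_x, alpha_hat, 1.0 / mean_n
```

On a simulated sample, for example `fit_bgg(*simulate_bgg(2.0, 1.5, 0.4, 3000))`, the three estimates should fall near the generating values. Bisection is slower than Newton's method but needs no derivative of the digamma function and no starting value.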

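We are now interested in constructing confidence intervals for the parameters. For this, Fisher’s information matrix is required. The information matrix is

$$K(\theta)=\begin{pmatrix}\kappa_{\beta\beta} & \kappa_{\beta\alpha} & 0\\ \kappa_{\beta\alpha} & \kappa_{\alpha\alpha} & 0\\ 0 & 0 & \kappa_{pp}\end{pmatrix}, \qquad (15)$$

with

$$\kappa_{\beta\beta}=\frac{\alpha}{\beta^{2}p},\quad \kappa_{\beta\alpha}=-\frac{1}{\beta p},\quad \kappa_{\alpha\alpha}=p\sum_{j=1}^{\infty}j^{2}(1-p)^{j-1}\Psi'(j\alpha),\quad \kappa_{pp}=\frac{1}{p^{2}(1-p)},$$

where $\Psi'(x)=d^{2}\log\Gamma(x)/dx^{2}$ is the trigamma function.

Standard large sample theory gives us that $\sqrt{n}(\hat{\theta}-\theta)\stackrel{d}{\rightarrow}N_3(0,K^{-1}(\theta))$ as $n\rightarrow\infty$, where $K^{-1}(\theta)$ is the inverse of the matrix $K(\theta)$ defined in (15).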

The asymptotic multivariate normal distribution of $\sqrt{n}(\hat{\theta}-\theta)$ can be used to construct approximate confidence intervals and confidence regions for the parameters. Further, we can compute the maximum values of the unrestricted and restricted log-likelihoods to construct the likelihood ratio (LR) statistic for testing some sub-models of the BGG distribution. For example, we may use the LR statistic for testing the hypotheses $H_0:\alpha=1$ versus $H_1:\alpha\neq1$, which corresponds to testing the BEG distribution against the BGG distribution.

3.1 A reparametrization

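We here propose a reparametrization of the bivariate gamma-geometric distribution and show its advantages over the previous one. Consider the reparametrization $\mu=\alpha/\beta$, with $\alpha$ and $p$ as before. Define now the parameter vector $\theta^{*}=(\mu,\alpha,p)^{\top}$. Hence, the density (2) now becomes

$$f^{*}_{X,N}(x,n)=\frac{(\alpha/\mu)^{n\alpha}}{\Gamma(n\alpha)}\,x^{n\alpha-1}e^{-\alpha x/\mu}\,p(1-p)^{n-1},\quad x>0,\ n\in\mathbb{N}.$$

We shall denote $(X,N)\sim\mbox{BGG}(\mu,\alpha,p)$. Therefore, if $(X_1,N_1),\ldots,(X_n,N_n)$ is a random sample from the $\mbox{BGG}(\mu,\alpha,p)$ distribution, the log-likelihood function $\ell^{*}=\ell(\theta^{*})$ is given by

$$\ell^{*}=n\alpha\bar{N}_n\log\left(\frac{\alpha}{\mu}\right)-\frac{n\alpha\bar{X}_n}{\mu}-\sum_{i=1}^{n}\log\Gamma(\alpha N_i)+\sum_{i=1}^{n}(\alpha N_i-1)\log X_i+n\log p+n(\bar{N}_n-1)\log(1-p), \qquad (16)$$

where $\bar{X}_n$ and $\bar{N}_n$ denote the sample means of the $X_i$’s and $N_i$’s. The score function associated with (16) is $U^{*}(\theta^{*})=(\partial\ell^{*}/\partial\mu,\partial\ell^{*}/\partial\alpha,\partial\ell^{*}/\partial p)^{\top}$, where

$$\frac{\partial\ell^{*}}{\partial\mu}=\frac{n\alpha}{\mu}\left(\frac{\bar{X}_n}{\mu}-\bar{N}_n\right),\qquad \frac{\partial\ell^{*}}{\partial\alpha}=n\bar{N}_n\log\left(\frac{\alpha}{\mu}\right)+n\left(\bar{N}_n-\frac{\bar{X}_n}{\mu}\right)+\sum_{i=1}^{n}N_i\left\{\log X_i-\Psi(\alpha N_i)\right\},$$

with $\Psi$ the digamma function, and $\partial\ell^{*}/\partial p$ is given by (13). The MLE of $p$ is given (as before) in (14), and the MLEs of $\mu$ and $\alpha$ are obtained by

$$\hat{\mu}=\frac{\bar{X}_n}{\bar{N}_n},\qquad \sum_{i=1}^{n}N_i\Psi(\hat{\alpha}N_i)-n\bar{N}_n\log\left(\frac{\hat{\alpha}\bar{N}_n}{\bar{X}_n}\right)=\sum_{i=1}^{n}N_i\log X_i.$$

As before, nonlinear optimization algorithms are needed to find the MLE of $\alpha$. Under this reparametrization, Fisher’s information matrix becomes

$$K^{*}(\theta^{*})=\mbox{diag}\left(\kappa^{*}_{\mu\mu},\kappa^{*}_{\alpha\alpha},\kappa_{pp}\right),$$

with

$$\kappa^{*}_{\mu\mu}=\frac{\alpha}{\mu^{2}p},\qquad \kappa^{*}_{\alpha\alpha}=p\sum_{j=1}^{\infty}j^{2}(1-p)^{j-1}\Psi'(j\alpha)-\frac{1}{\alpha p},\qquad \kappa_{pp}=\frac{1}{p^{2}(1-p)}.$$

The asymptotic distribution of $\sqrt{n}(\hat{\theta}^{*}-\theta^{*})$ is trivariate normal with null mean and covariance matrix $K^{*-1}(\theta^{*})=\mbox{diag}(1/\kappa^{*}_{\mu\mu},1/\kappa^{*}_{\alpha\alpha},1/\kappa_{pp})$. We see that under this reparametrization the parameters are orthogonal in the sense of Cox and Reid (1987); the information matrix is diagonal. With this, we obtain desirable properties such as asymptotic independence of the estimates of the parameters. The reader is referred to Cox and Reid (1987) for more details.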
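Because the information matrix is diagonal under the orthogonal parametrization, asymptotic standard errors reduce to one reciprocal square root per parameter. The sketch below computes them from diagonal entries derived by differentiating the log-likelihood twice under the assumed model (gamma summands with mean $\mu$ and shape $\alpha$, geometric $N$ on $\{1,2,\ldots\}$). Those entries are a reconstruction and should be read as an assumption; `trigamma` and `orthogonal_standard_errors` are hypothetical names.

```python
import math


def trigamma(x):
    """Trigamma function via upward recurrence and an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    return (result + inv + 0.5 * inv2
            + inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0)))


def orthogonal_standard_errors(mu, alpha, p, n, tol=1e-14, max_terms=100000):
    """Asymptotic standard errors of the MLEs of (mu, alpha, p).

    Assumed diagonal information entries per observation:
      mu:    alpha / (mu^2 p)
      alpha: E[N^2 trigamma(alpha N)] - 1 / (alpha p)
      p:     1 / (p^2 (1 - p))
    """
    # E[N^2 trigamma(alpha N)] / p as a series over the geometric support.
    series = 0.0
    for j in range(1, max_terms + 1):
        term = j * j * (1.0 - p) ** (j - 1) * trigamma(j * alpha)
        series += term
        if term < tol * series:
            break
    info_mu = alpha / (mu * mu * p)
    info_alpha = p * series - 1.0 / (alpha * p)
    info_p = 1.0 / (p * p * (1.0 - p))
    return tuple(1.0 / math.sqrt(n * info)
                 for info in (info_mu, info_alpha, info_p))
```

Calling `orthogonal_standard_errors(0.0082, 0.8805, 0.5093, 549)` with the estimates and sample size from Section 4 gives values that agree with the standard errors reported in Table 1 up to rounding, which lends some support to the assumed entries.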

4 Application

Here, we show the usefulness of the bivariate gamma-geometric law applied to a real data set. We consider daily exchange rates between the Brazilian real and the U.K. pound, quoted in Brazilian reais, covering May 22, 2001 to December 31, 2009. With this, we obtain the daily log-returns, that is, the logarithms of the ratios of consecutive exchange rates. Figure 3 illustrates the daily exchange rates and the log-returns.

Figure 3: Graphics of the daily exchange rates and log-returns.

We will jointly model the magnitude and duration of the consecutive positive log-returns by using the BGG law. Note that the duration of the consecutive positive log-returns is the same as the duration of the growth periods of the exchange rates. The data set consists of 549 pairs $(X_i,N_i)$, where $X_i$ and $N_i$ are the magnitude and duration as described before, for $i=1,\ldots,549$. This approach of looking jointly at the magnitude and duration of the consecutive positive log-returns was first proposed by Kozubowski and Panorska (2005) with the BEG model, which showed a good fit to the other currencies they considered. Suppose $\{(X_i,N_i)\}_{i=1}^{549}$ are iid random vectors following the $\mbox{BGG}(\mu,\alpha,p)$ distribution. We work with the reparametrization proposed in Subsection 3.1.

Table 1 presents a summary of the fit of our model, which contains the maximum likelihood estimates of the parameters with their respective standard errors, and 95% asymptotic confidence intervals. Note that the confidence interval of $\alpha$ does not contain the value $1$. Then, by the Wald test, we reject the hypothesis that the data come from the BEG distribution in favor of the BGG distribution, at the 5% significance level. We also perform the likelihood ratio (LR) test and obtain that the LR statistic is equal to with associated p-value . Therefore, for any usual significance level (for example 5%), the likelihood ratio test rejects the hypothesis that the data come from the BEG distribution in favor of the BGG distribution, in agreement with the Wald test. The empirical and fitted correlation coefficients are equal to 0.6680 and 0.6775, respectively, so they are in good agreement.

Parameter   Estimate   Stand. error   Inf. bound   Sup. bound
$\mu$       0.0082     0.00026        0.0076       0.0087
$\alpha$    0.8805     0.04788        0.7867       0.9743
$p$         0.5093     0.01523        0.4794       0.5391
Table 1: Maximum likelihood estimates of the parameters, standard errors and bounds of the 95% asymptotic confidence intervals.

The BEG model was motivated by an empirical observation that the magnitude of the consecutive positive log-returns followed the same type of distribution as the positive one-day log-returns (see Kozubowski and Panorska, 2005). Indeed, the marginal distribution of $X$ in the BEG model is also exponential (with mean $1/(p\beta)$), just as the positive daily log-returns (with mean $1/\beta$). This stability of the returns was observed earlier by Kozubowski and Podgórski (2003), with the log-Laplace distribution. The BGG distribution does not enjoy this stability property, since the marginal distribution of $X$ is an infinite mixture of gamma distributions. We now show that the data set considered here does not present this stability.

Denote the th positive one-day log-returns by and define . If the data was generated from a distribution, then an empirical quantile-quantile plot between the ’s (-axis) and the ’s (-axis) would be around the straight line , for . Figure 4 presents this plot and we observe that a considerable part of the points are below of the straight line (we replace by its MLE ). Therefore, the present data set seems to have been generated by a distribution that lacks the stability property discussed above. In order to confirm this, we test the hypothesis that the ’s and ’s have the same distribution. In the BEG model, both have exponential distribution with mean . Since converges in probability to (as ), we perform the test with replacing . The Kolmogorov-Smirnov statistic and associated p-value are equal to 0.0603 and 0.0369, respectively. Therefore, using a significance level at 5%, we reject the hypothesis that the ’s and ’s have the same distribution.

Figure 4: Empirical quantile-quantile plot between cumulative consecutive positive log-returns and positive one-day log-returns, with the straight line . The range covers 85% of the data set.

Figure 5 presents the fitted marginal density (mixture of gamma densities) of the cumulative log-returns with the histogram of the data, and the empirical and fitted survival functions. These plots show a good fit of the mixture of gamma distributions to the data. This is confirmed by the Kolmogorov-Smirnov (KS) test, which we use to measure the goodness-of-fit of the mixture of gamma distributions to the data. The KS statistic and its p-value are equal to 0.0482 and 0.1557, respectively. Therefore, at any usual significance level, we do not reject the hypothesis that the mixture of gamma distributions is adequate to fit the cumulative log-returns.

Figure 5: Plot on the left shows the fitted mixture of gamma densities (density of ) with the histogram of the data. Plot on the right presents the empirical and fitted theoretical (mixture of gamma) survival functions.
Figure 6: Picture on the left shows the histogram and fitted gamma density for the daily positive log-returns. Empirical survival and fitted gamma survival are shown in the picture on the right.

Plots of the histogram, fitted gamma density and empirical and fitted survival functions for the daily positive log-returns are presented in Figure 6. These graphics show the good performance of the gamma distribution. In Table 2 we show the absolute frequency, relative frequency and fitted geometric model for the duration in days of the consecutive positive log-returns. From this, we observe that the geometric distribution fits the data well. This is confirmed by Pearson’s chi-squared (denoted by $\chi^{2}$) test, where our null hypothesis is that the duration follows a geometric distribution. The $\chi^{2}$ statistic equals 42 (degrees of freedom equals 36) with associated p-value 0.2270, so we do not reject (at any usual significance level) the hypothesis that the growth period follows a geometric distribution. The geometric distribution has also worked quite well for modeling the duration of the growth periods of exchange rates as part of the BEG model in Kozubowski and Panorska (2005).

$N$                  1         2         3         4         5         6         $\geq 7$
Absolute frequency   269       136       85        34        15        6         4
Relative frequency   0.48998   0.24772   0.15483   0.06193   0.02732   0.01093   0.00728
Fitted model         0.50928   0.24991   0.12264   0.06018   0.02953   0.01449   0.01396
Table 2: Absolute and relative frequencies and fitted marginal probability mass function of $N$ (duration in days of the growth periods).
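The fitted row of Table 2 can be reproduced from the estimate of $p$ alone. The sketch below assumes a geometric law on $\{1,2,\ldots\}$ and reads the seventh column as the pooled tail of seven or more days, an interpretation suggested by the fact that the seven fitted values sum to one; the function name `fitted_duration_probabilities` is hypothetical.

```python
def fitted_duration_probabilities(p, max_day=6):
    """Geometric probabilities for durations 1..max_day plus the tail.

    Assumption: P(N = n) = p * (1 - p)**(n - 1) for n = 1, 2, ...,
    with the last entry pooling all durations above max_day.
    """
    probs = [p * (1.0 - p) ** (n - 1) for n in range(1, max_day + 1)]
    probs.append((1.0 - p) ** max_day)  # P(N > max_day)
    return probs


def relative_frequencies(counts):
    """Observed shares corresponding to a list of absolute frequencies."""
    total = sum(counts)
    return [count / total for count in counts]
```

With `p=0.50928` the first function returns values matching the fitted row of Table 2 to five decimals, and `relative_frequencies([269, 136, 85, 34, 15, 6, 4])` gives the relative frequency row for the 549 growth periods.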
Figure 7: Plots of the fitted conditional density and survival functions of given , and . In the pictures of the density and survival functions, we also plot the histogram of the data and the empirical survival function, respectively.
Figure 8: Plots of the fitted conditional density and survival functions of given and . In the pictures of the density and survival functions, we also plot the histogram of the data and the empirical survival function, respectively.

So far our analysis has shown that the bivariate gamma-geometric distribution and its marginals provide a suitable fit to the data. We end our analysis by verifying whether the conditional distributions of the cumulative log-returns given the duration also provide good fits to the data. As mentioned before, the conditional distribution of $X$ given $N=n$ is $\Gamma(n\alpha,\beta)$. Figure 7 shows plots of the fitted density and fitted survival function of the conditional distributions of $X$ given $N=1,2,3$. The histograms of the data and the empirical survival functions are also displayed. The corresponding graphics for the conditional distributions of $X$ given $N=4,5$ are displayed in Figure 8. These graphics show a good performance of the gamma distribution in fitting the cumulative log-returns given the growth period (in days). We also use the Kolmogorov-Smirnov test to verify the goodness-of-fit of these conditional distributions. In Table 3 we present the KS statistics and their associated p-values. In all cases considered, at any usual significance level, we do not reject the hypothesis that the data come from the gamma distribution with the parameters specified above.

Given          one-day   two-day   three-day   four-day   five-day
KS statistic   0.0720    0.0802    0.1002      0.1737     0.2242
p-value        0.1229    0.3452    0.3377      0.2287     0.3809
Table 3: Kolmogorov-Smirnov statistics and their associated p-values for the goodness-of-fit of the conditional distributions of the cumulative log-returns given the durations (one-day, two-day, three-day, four-day and five-day).

5 The induced Lévy process

As seen before, the bivariate gamma-geometric distribution is infinitely divisible, therefore, we have that (9) is a characteristic function for any real . This characteristic function is associated with the bivariate random vector

where are iid random variables following distribution, , is a discrete random variable with distribution and all random variables involved are mutually independent. Hence, it follows that the BGG distribution induces a Lévy process , which has the following stochastic representation:

(17)

where the ’s are defined as before, is a gamma Lévy process and is a negative binomial Lévy process, both with characteristic functions given by

and

respectively. All random variables and processes involved in (17) are mutually independent.

From the process defined in (17), we may obtain other related Lévy motions by deleting and/or . Here, we focus on the Lévy process given by (17) and by deleting . In this case, we obtain the following stochastic representation for our process:

(18)

Since both processes (the left and the right ones of the equality in distribution) in (18) are Lévy, the above result follows by noting that for all fixed , we have . One may also see that the above result follows from the stochastic self-similarity property discussed, for example, by Kozubowski and Podgórski (2007): a gamma Lévy process subordinated to a negative binomial process with drift is again a gamma process.

The characteristic function corresponding to (18) is given by

(19)

for . With this, it easily follows that the characteristic function of the marginal process is

Since the above characteristic function corresponds to a random variable whose density is an infinite mixture of gamma densities (see Subsection 5.1), we have that is an infinite mixture of gamma Lévy process (with negative binomial weights). Then, we obtain that the marginal processes of are infinite mixture of gamma and negative binomial processes. Therefore, we define that is a Lévy process. We notice that, for the choice in (18), we obtain the bivariate process with gamma and negative binomial marginals introduced by Kozubowski et al. (2008), named BGNB Lévy motion.

As noted by Kozubowski and Podgórski (2007), if is a negative binomial process, with parameter , independent of another negative binomial process with parameter , then the time-changed process is a negative binomial process with parameter . With this and (18), we have that the time-changed process is a Lévy process.

In what follows, we derive basic properties of the bivariate distribution of the BMixGNB process for fixed and discuss estimation by maximum likelihood and inference for large sample. From now on, unless otherwise mentioned, we will consider fixed.

5.1 Basic properties of the bivariate process for fixed

For simplicity, we will denote . From stochastic representation (18), it is easy to see that the joint density and distribution function of are

(20)

and

for and . Making in (20), we obtain the BGNB distribution (bivariate distribution with gamma and negative binomial marginals) as particular case. This model was introduced and studied by Kozubowski et al. (2008). We have that the marginal distribution of is negative binomial with probability mass function given in (8). The marginal density of is given by

where is the density of a gamma variable as defined in Section 2. Therefore, the above density is an infinite mixture of gamma densities (with negative binomial weights). Since the marginal distributions of are infinite mixtures of gamma and negative binomial distributions, we denote . Some plots of the marginal density of are displayed in Figure 9, for and some values of , and .

Figure 9: Graphics of the marginal density of for , , and .

The conditional distribution of is gamma with parameters and , while the conditional probability distribution function of is given by

for , which belongs to one-parameter power series distributions if and are known. In this case, the parameter is . For positive integers and real , it follows that

and for and positive integer

The moments of a random vector following distribution may be obtained by