# Exponentiated Generalized Pareto Distribution: Properties and applications towards Extreme Value Theory

Se Yoon Lee, Texas A&M University, 400 Bizzell St, College Station, TX 77843 · Joseph H. T. Kim, Yonsei University, Seoul, Korea, 120-749
###### Abstract

The Generalized Pareto Distribution (GPD) plays a central role in modelling heavy tail phenomena in many applications. Applying the GPD to actual datasets, however, is a non-trivial task. One common way suggested in the literature to investigate the tail behaviour is to take the logarithm of the original dataset in order to reduce the sample variability. Inspired by this, we propose and study the Exponentiated Generalized Pareto Distribution (exGPD), which is created via log-transform of the GPD variable. After introducing the exGPD we derive various distributional quantities, including the moment generating function and tail risk measures. As an application we also develop a plot as an alternative to the Hill plot to identify the tail index of heavy tailed datasets, based on moment matching for the exGPD. Various numerical analyses with both simulated and actual datasets show that the proposed plot works well.
Key Words: Extreme Value Theory; Generalized Pareto Distribution (GPD); Exponentiated Generalized Pareto Distribution; Hill plot

## 1 Introduction

The Generalized Pareto Distribution (GPD) has recently emerged as an influential distribution in modelling heavy tailed datasets in various applications in finance, operational risk, insurance and environmental studies. In particular, it is widely used to model sample exceedances beyond some large threshold, a procedure commonly known as the peaks-over-threshold (POT) method in the Extreme Value Theory (EVT) literature; see, e.g., Embrechts et al. (1997) and Beirlant et al. (2006). The heavy tail phenomenon and the GPD are linked through the famous Pickands–Balkema–de Haan theorem (Balkema and De Haan (1974) and Pickands III (1975)), which states that, for an arbitrary distribution whose sample maximum tends to a non-degenerate distribution after suitable standardization, the distribution function of its exceedances over a large threshold converges to the GPD. To this extent, extensive research has been carried out in the literature to characterize the GPD and apply it to the EVT framework; see, for example, de Zea Bermudez and Kotz (2010) for a comprehensive survey on various estimation procedures for the GPD parameters.

In this paper we introduce a two-parameter exponentiated Generalized Pareto Distribution (exGPD in short) and study its distributional properties, which is the first contribution of the present paper. The term ‘exponentiated’ is used because this new distribution is obtained via log-transform of the GPD random variable. There are ample examples where a new distribution is constructed through logarithm or exponential transformation of an existing one. Such examples include the normal and log-normal, gamma and log-gamma, and Pareto and shifted exponential distributions. Thus introducing the exGPD is interesting in its own right from a statistical viewpoint, but we have further motivations for considering such a distribution in connection to EVT. In various graphical tools offered by EVT one frequently investigates the tail behaviour with log-transformed data rather than the original data, as seen in, for example, the Hill (1975) plot and the estimator of Pickands III (1975). As the log transformation greatly reduces the variability of extreme quantiles, this practice naturally allows one to investigate the tail behaviour in a more stable manner. Therefore, given that the GPD is the central distribution in EVT modelling, it makes sense to create an alternative plotting tool using the exGPD directly. This is the second contribution of this paper. In particular, we develop a new plot as an alternative to the Hill plot to identify the tail index of heavy tailed datasets. The proposed Log Variance (LV) plot is based on the idea of matching the sample variance of log exceedances to the variance of the exGPD. Through various numerical illustrations with both simulated and actual datasets, it is shown that the LV plot works reasonably well compared to the Hill plot, elucidating the usefulness of the exGPD.

This article is organized as follows. In Section 2 we define the exGPD and investigate its distributional quantities including the moment generating function, from which the moments can be obtained. We show that the moments of all orders are finite for the exGPD, unlike the GPD. Section 3 derives popular risk measures, including the Value-at-Risk and the conditional tail expectation. In Section 4 we develop the LV plot that is derived from the exGPD variance. We use both simulated and actual datasets to illustrate the proposed plot and compare it to the Hill plot. Section 5 concludes the article.

## 2 Exponentiated Generalized Pareto Distribution

### 2.1 Definition

The distribution function (df) of the two-parameter GPD with parameter $(\sigma, \xi)$ is defined as

$$G_X(x) = 1 - \Big(1 + \frac{\xi x}{\sigma}\Big)^{-1/\xi}, \qquad \xi \neq 0,$$

where the support is $x \geq 0$ for $\xi > 0$ and $0 \leq x \leq -\sigma/\xi$ for $\xi < 0$. Here, $\sigma > 0$ and $\xi$ are called the scale and shape parameter, respectively. For the case of $\xi = 0$, the df is defined as

$$G_X(x) = 1 - e^{-x/\sigma}, \qquad \xi = 0,$$

an exponential distribution defined on $x \geq 0$ with scale parameter $\sigma$. The density function is then

$$g_X(x) = \begin{cases} \dfrac{1}{\sigma}\Big(1 + \dfrac{\xi x}{\sigma}\Big)^{-1/\xi - 1}, & \xi \neq 0,\\[8pt] \dfrac{1}{\sigma}\, e^{-x/\sigma}, & \xi = 0. \end{cases}$$

The GPD contains three distributions. When $\xi > 0$ the GPD is a Pareto distribution of the second kind (or Lomax distribution in the insurance literature) with a heavy tail decaying at a polynomial speed; when $\xi = 0$ the GPD is an exponential distribution with a medium tail decaying exponentially; for $\xi < 0$, the GPD becomes a short-tailed distribution for which the upper bound of the distribution support is finite. The $k$th moment of the GPD exists only for $\xi < 1/k$; for instance, the mean and variance are finite only when $\xi < 1$ and $\xi < 1/2$, respectively.

Now we define the exponentiated GPD (exGPD) as the logarithm of the GPD random variable. That is, when $X$ is GPD distributed, the random variable $Y = \log X$ is said to be an exGPD random variable. Simple algebra gives its df as

$$F_Y(y) = P(Y \leq y) = P(\log X \leq y) = P(X \leq e^y) = G_X(e^y) = 1 - \Big(1 + \frac{\xi e^y}{\sigma}\Big)^{-1/\xi}, \qquad \xi \neq 0,$$

of which the support is $-\infty < y < \infty$ for $\xi > 0$, and $-\infty < y \leq \log(-\sigma/\xi)$ for $\xi < 0$. When $\xi = 0$, we have

$$F_Y(y) = 1 - e^{-e^y/\sigma} = 1 - \exp\big(-e^{\,y - \log\sigma}\big), \qquad \xi = 0,\ -\infty < y < \infty,$$

which is an extreme value distribution (the distribution of $-Y$ is that of the Gumbel) with location parameter $\log\sigma$. Combining these, we can write the density of the exGPD as

$$f_Y(y) = \begin{cases} \dfrac{e^y}{\sigma}\Big(1 + \dfrac{\xi e^y}{\sigma}\Big)^{-1/\xi - 1}, & \xi \neq 0,\\[8pt] \dfrac{1}{\sigma}\, e^{\,y - e^y/\sigma}, & \xi = 0. \end{cases}$$

We note that the role of $\sigma$ changes from a scale parameter under the GPD to a location parameter under the exGPD, as seen from the term $e^y/\sigma = e^{\,y - \log\sigma}$ in (2.1) for all $\xi$. In what follows we denote GPD$(\sigma, \xi)$ and $X$ to be the GPD variable, and exGPD$(\sigma, \xi)$ and $Y$ for the exGPD, to avoid confusion.
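Since the GPD df is available in closed form, both distributions can be sampled by inverse transform, with exGPD variates obtained as logs of GPD variates. The following minimal Python sketch illustrates this under the parametrization above; the function names are ours, not from the paper.

```python
import math
import random

def gpd_quantile(u, sigma, xi):
    """Inverse df of the GPD: solves G_X(x) = u for 0 < u < 1."""
    if xi == 0.0:
        return -sigma * math.log(1.0 - u)
    return sigma * ((1.0 - u) ** (-xi) - 1.0) / xi

def exgpd_cdf(y, sigma, xi):
    """df of the exGPD, F_Y(y) = G_X(e^y)."""
    if xi == 0.0:
        return 1.0 - math.exp(-math.exp(y) / sigma)
    return 1.0 - (1.0 + xi * math.exp(y) / sigma) ** (-1.0 / xi)

def exgpd_sample(n, sigma, xi, rng=random):
    """Draw n exGPD variates as logs of inverse-transform GPD variates."""
    return [math.log(gpd_quantile(rng.random(), sigma, xi))
            for _ in range(n)]
```

The round trip $F_Y(\log G_X^{-1}(u)) = u$ gives a quick correctness check for all three sign cases of $\xi$.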

An alternative way to create the exGPD is to define $Y = \log(W - d)$ conditional on $W > d$, where $W$ is a Pareto random variable with df $F_W(w) = 1 - (w/\beta)^{-\alpha}$, $w \geq \beta > 0$, and $d \geq \beta$ is a threshold. The survival function of $Y$ is then given by

$$P(Y > y) = P\big(\log(W - d) > y \,\big|\, W > d\big) = \frac{P(W > d + e^y)}{P(W > d)} = \frac{\big((d + e^y)/\beta\big)^{-\alpha}}{(d/\beta)^{-\alpha}} = \Big(1 + \frac{e^y}{d}\Big)^{-\alpha},$$

which corresponds to the exGPD with $\xi = 1/\alpha$ and $\sigma = d/\alpha$.
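This construction can be checked by simulation: sample the Pareto exceedances directly and compare the empirical survival function of $\log(W - d)$ with $(1 + e^y/d)^{-\alpha}$. A minimal sketch with illustrative values of $\alpha$ and $d$ chosen by us (the Pareto scale $\beta$ cancels in the conditional law, so it does not appear):

```python
import math
import random

random.seed(1)
alpha, d = 2.0, 3.0     # illustrative tail index and threshold
n = 100_000

# Given W > d, the exceedance law is again Pareto with scale d:
# P(W > w | W > d) = (w/d)^(-alpha), so W | W > d = d * (1-U)^(-1/alpha).
ys = [math.log(d * (1.0 - random.random()) ** (-1.0 / alpha) - d)
      for _ in range(n)]

def surv_formula(y):
    """(1 + e^y/d)^(-alpha): exGPD survival with xi = 1/alpha, sigma = d/alpha."""
    return (1.0 + math.exp(y) / d) ** (-alpha)

emp_surv = sum(1 for y in ys if y > 1.0) / n
```

With 100,000 draws the empirical tail probability at $y = 1$ should sit well within Monte Carlo error of the closed form.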

In Figure 1 we compare the densities of the GPD and exGPD side by side for selected parameter choices with $\xi > 0$. The most dramatic change is the shape itself: the GPD is always decreasing on its support, whereas the exGPD, defined on the entire real line, has a peak in the middle of the distribution. Obviously, all the realized GPD values between 0 and 1 are mapped to negative numbers under the exGPD. Also, as seen from the densities (2.1) and (2.1), the polynomial right tail of the GPD changes to an exponentially decaying tail under the exGPD through the log transformation. Later we will prove that the moments of all orders are actually finite for the exGPD. From the figure, we have the following additional comments:

• As $\sigma$ increases, the density of the exGPD shifts to the right because $\log\sigma$ acts as the location parameter. In fact, the mode can be shown to be $\log\sigma$, provided it exists; the proof will be given shortly. As $\sigma$ is strictly positive, the mode can take any real value, with $\sigma = 1$ being the boundary of the sign of the mode.

• As $\xi$ increases, the exGPD density becomes less peaked and shows larger dispersion or variance. This implies that the shape parameter $\xi$ of the GPD roughly plays the role of a scale parameter under the exGPD. This is also hinted at by the density (2.1); if the constant 1 is removed from the density function, $\xi$ becomes a scale parameter.

• The visual advantage of the exGPD over the GPD is clear from the figure. The exGPD expresses the tail thickness, represented by $\xi$, in a much clearer way in that the tail decaying angle becomes steeper as $\xi$ gets smaller. Thus one can quickly investigate how heavy the tail is by looking at, e.g., the histogram of the log data, which is less straightforward with the GPD densities.

In Figure 2, the densities of the GPD and exGPD are compared for $\xi < 0$. Again, the shape of the density changes substantially under the log-transform. Note that there is an upper limit in the support of both distributions in this case. However, when $\xi$ is a small negative value, the shape looks not that different from the $\xi > 0$ case in that its right tail gradually decays as if there were no upper limit to our eyes. As $|\xi|$ gets larger, but not greater than 1, there is no smooth landing around the upper limit of the support; the right tail abruptly drops to zero. When $\xi < -1$, the exGPD density takes a drastically different shape, shooting upward at its upper limit. Thus, when the shape parameter is negative, its magnitude can completely change the shape of the exGPD density, as is the case for the GPD. We note that the role of $\sigma$ is two-fold when $\xi < 0$: it acts as the location parameter and also controls the upper limit of the distribution support.

The following result formally shows how the shape parameter of the exGPD relates to its density shape, which is also linked to the existence of the mode.

###### Lemma 1.

The density of the exGPD$(\sigma, \xi)$ in (2.1) is bounded only for $\xi \geq -1$, in which case there is a unique mode at $y = \log\sigma$.

Proof: To find the mode, simple algebra yields that

$$f_Y'(y) = \frac{e^y}{\sigma}\Big(1 + \frac{\xi e^y}{\sigma}\Big)^{-1/\xi - 1} + \frac{e^y}{\sigma}\Big(-\frac{1}{\xi} - 1\Big)\Big(1 + \frac{\xi e^y}{\sigma}\Big)^{-1/\xi - 2}\cdot\frac{\xi e^y}{\sigma} = \frac{e^y}{\sigma}\Big(1 + \frac{\xi e^y}{\sigma}\Big)^{-1/\xi - 1}\bigg[1 + \Big(-\frac{1}{\xi} - 1\Big)\Big(1 + \frac{\xi e^y}{\sigma}\Big)^{-1}\frac{\xi e^y}{\sigma}\bigg] = f_Y(y)\,\frac{1 - e^y/\sigma}{1 + \xi e^y/\sigma}.$$

Note that in the last expression, both $f_Y(y)$ and $1 + \xi e^y/\sigma$ are non-negative regardless of the sign of $\xi$. We now investigate whether $f_Y'(y) = 0$ gives a sensible solution for different ranges of $\xi$.
(1) $\xi < -1$ case: Using the distribution support $y \leq \log(-\sigma/\xi)$ for negative $\xi$, we see that the numerator of the second term in (2.1) is bounded by

$$1 + 1/\xi \;\leq\; 1 - e^y/\sigma \;<\; 1.$$

When $\xi < -1$, the lower bound of this inequality is a strictly positive number, so $f_Y'(y) = 0$ has no solution, as seen in (2.1). In fact, the density value in this case gets indefinitely large as $y$ approaches its upper limit, since

$$\lim_{y \to \log(-\sigma/\xi)^-} f_Y(y) = \lim_{y \to \log(-\sigma/\xi)^-} \frac{e^y}{\sigma}\Big(1 + \frac{\xi e^y}{\sigma}\Big)^{-1/\xi - 1} = -\frac{1}{\xi}\,\lim_{y \to \log(-\sigma/\xi)^-} \Big(1 + \frac{\xi e^y}{\sigma}\Big)^{-1/\xi - 1} = +\infty.$$

The last equality holds because $-1/\xi - 1 < 0$ when $\xi < -1$, so the base tending to zero drives the limit to infinity. Thus the density is unbounded in this range.
(2) $\xi = -1$ case: The density reduces to $f_Y(y) = e^{\,y - \log\sigma}$, the exponential function shifted by $\log\sigma$. This function, defined on $(-\infty, \log\sigma]$, is increasing in $y$ and attains its maximum of 1 at the upper limit of the support, $\log\sigma$, which is the mode.
(3) $-1 < \xi < 0$ case: The lower bound of (2.1) is negative in this range, so $f_Y'(y) = 0$ has a unique solution at $y = \log\sigma$ from (2.1). Note that the mode in this case is strictly inside the distribution support because $\log\sigma < \log(-\sigma/\xi)$, in contrast to the $\xi = -1$ case where the maximum occurs at the upper boundary of the support.
(4) $\xi \geq 0$ case: In this case the distribution support is $(-\infty, \infty)$, which yields

$$-\infty \;<\; 1 - e^y/\sigma \;<\; 1.$$

Thus $f_Y'(y) = 0$ has a unique solution $y = \log\sigma$, which is the mode. $\square$
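The location of the mode can also be confirmed numerically by maximizing the exGPD density on a grid: in every case covered by the lemma, the maximizer should sit at (or, for $\xi = -1$, right at the boundary) $\log\sigma$. A small sketch, with grid resolution chosen by us:

```python
import math

def exgpd_pdf(y, sigma, xi):
    """exGPD density; returns 0 outside the support when xi < 0."""
    if xi == 0.0:
        return math.exp(y - math.exp(y) / sigma) / sigma
    t = 1.0 + xi * math.exp(y) / sigma
    if t <= 0.0:
        return 0.0
    return (math.exp(y) / sigma) * t ** (-1.0 / xi - 1.0)

def grid_mode(sigma, xi, lo=-6.0, hi=6.0, steps=120_000):
    """Brute-force maximizer of the density over a fine grid in [lo, hi]."""
    if xi < 0.0:  # stay inside the support (-inf, log(-sigma/xi)]
        hi = math.log(-sigma / xi) - 1e-9
    ys = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    return max(ys, key=lambda y: exgpd_pdf(y, sigma, xi))
```

For $\sigma = 2$ the maximizer is close to $\log 2 \approx 0.693$ for $\xi = 0.5$, $\xi = 0$, $\xi = -0.5$, and even at the boundary case $\xi = -1$.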

Now we turn to the hazard function of the exGPD, which is given by

$$h_Y(y) = \frac{f_Y(y)}{1 - F_Y(y)} = \frac{e^y}{\sigma}\Big(1 + \frac{\xi}{\sigma}e^y\Big)^{-1} = \frac{e^y}{\sigma + \xi e^y} = \frac{1}{\sigma e^{-y} + \xi}, \qquad \xi \neq 0.$$

Figure 3 compares the hazard functions of the GPD and exGPD. The hazard function of the GPD can increase or decrease depending on the sign of $\xi$, with a heavy tail implied for $\xi > 0$ as the hazard function decreases in $x$. However, the exGPD always has an increasing hazard function, a.k.a. increasing failure rate (IFR), regardless of the sign of $\xi$, as seen from (2.1) and the figure. According to the standard theory, therefore, we see that the exGPD is also DMRL (decreasing mean residual lifetime), indicating a light-tailed distribution.
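The closed form $h_Y(y) = 1/(\sigma e^{-y} + \xi)$ and its monotonicity are easy to verify numerically; the sketch below (our function names) checks it against the ratio $f_Y/(1 - F_Y)$ and against its limiting plateau $1/\xi$ as $y \to \infty$ for $\xi > 0$.

```python
import math

def exgpd_pdf(y, sigma, xi):
    t = 1.0 + xi * math.exp(y) / sigma
    return (math.exp(y) / sigma) * t ** (-1.0 / xi - 1.0)

def exgpd_surv(y, sigma, xi):
    return (1.0 + xi * math.exp(y) / sigma) ** (-1.0 / xi)

def exgpd_hazard(y, sigma, xi):
    # closed form h(y) = 1 / (sigma * e^{-y} + xi), valid for xi != 0
    return 1.0 / (sigma * math.exp(-y) + xi)
```

Note that for $\xi > 0$ the hazard increases to the finite limit $1/\xi$, while a heavy-tailed distribution would have hazard decreasing to zero.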

### 2.2 Moment Generating Function

In what follows, we denote $X$ to be a GPD$(\sigma, \xi)$ random variable and $Y = \log X$ to be the corresponding exGPD$(\sigma, \xi)$ variable. In order to derive the moment generating function (mgf) of $Y$, we note the following relationship:

$$M_Y(s) = E[e^{sY}] = E[e^{s\log X}] = E\big[(e^{\log X})^s\big] = E[X^s], \qquad s \in \mathbb{R},$$

from which we have the next result.

###### Lemma 2.

For the exGPD$(\sigma, \xi)$ distribution defined in (2.1), the moment generating function is given by

$$M_Y(s) = \begin{cases} -\dfrac{1}{\xi}\Big(-\dfrac{\sigma}{\xi}\Big)^s B(s+1,\, -1/\xi), & s \in (-1, \infty),\ \xi < 0,\\[8pt] \dfrac{1}{\xi}\Big(\dfrac{\sigma}{\xi}\Big)^s B(s+1,\, 1/\xi - s), & s \in (-1, 1/\xi),\ \xi > 0,\\[8pt] \sigma^s\,\Gamma(1+s), & s \in (-1, \infty),\ \xi = 0, \end{cases}$$

where $B(\cdot,\cdot)$ is the beta function defined as

$$B(x, y) = \int_0^1 t^{x-1}(1-t)^{y-1}\, dt = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}, \qquad x > 0,\ y > 0.$$

Proof: We prove this for three cases depending on the sign of $\xi$.
(1) $\xi < 0$ case: From the support of GPD$(\sigma, \xi)$, we have

$$M_Y(s) = E[X^s] = \int_0^{-\sigma/\xi} x^s \cdot \frac{1}{\sigma}\Big(1 + \frac{\xi}{\sigma}x\Big)^{-1/\xi - 1} dx.$$

If we let $t = -\xi x/\sigma$, we have

$$M_Y(s) = \int_0^1 \Big(-\frac{\sigma}{\xi}t\Big)^s \cdot \frac{1}{\sigma}(1-t)^{-1/\xi - 1}\cdot\Big(-\frac{\sigma}{\xi}\Big)\, dt = -\frac{1}{\xi}\Big(-\frac{\sigma}{\xi}\Big)^s \int_0^1 t^s (1-t)^{-1/\xi - 1}\, dt = -\frac{1}{\xi}\Big(-\frac{\sigma}{\xi}\Big)^s B(s+1,\, -1/\xi), \qquad s > -1.$$

(2) $\xi > 0$ case: As $X$ is defined for all positive values,

$$M_Y(s) = E[X^s] = \int_0^\infty x^s \cdot \frac{1}{\sigma}\Big(1 + \frac{\xi}{\sigma}x\Big)^{-1/\xi - 1} dx.$$

By letting $t = \xi x/\sigma$, we have

$$M_Y(s) = \int_0^\infty \Big(\frac{\sigma}{\xi}t\Big)^s \cdot \frac{1}{\sigma}(1+t)^{-1/\xi - 1}\cdot\frac{\sigma}{\xi}\, dt = \frac{1}{\xi}\Big(\frac{\sigma}{\xi}\Big)^s \int_0^\infty t^s (1+t)^{-1/\xi - 1}\, dt = \frac{1}{\xi}\Big(\frac{\sigma}{\xi}\Big)^s B(s+1,\, 1/\xi - s), \qquad -1 < s < 1/\xi,$$

where the last equality comes from a property of the beta function

$$B(x, y) = \int_0^1 t^{x-1}(1-t)^{y-1}\, dt = \int_0^\infty t^{x-1}(1+t)^{-x-y}\, dt, \qquad x > 0,\ y > 0.$$

It is pointed out that if we compare the two mgf's ($\xi < 0$ vs. $\xi > 0$) in (2), only the second argument of the beta function is different.

(3) $\xi = 0$ case: The derivation is straightforward and omitted; alternatively, this can be adapted and derived from the mgf of the Gumbel distribution as shown in, e.g., Ch. 22 of Johnson et al. (1994). $\square$
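Lemma 2 can be sanity-checked numerically: the beta-function expressions must return 1 at $s = 0$, reproduce known GPD moments through $M_Y(s) = E[X^s]$ (e.g. $M_Y(1) = E[X] = \sigma/(1-\xi)$), and agree with Monte Carlo averages of $X^s$. A sketch using only the standard library; the function names are ours and `math.gamma` supplies the gamma function.

```python
import math
import random

def exgpd_mgf(s, sigma, xi):
    """mgf of exGPD(sigma, xi) from Lemma 2, with the beta function
    expanded into gamma functions."""
    if xi > 0.0:      # valid for -1 < s < 1/xi
        beta = math.gamma(s + 1.0) * math.gamma(1.0 / xi - s) / math.gamma(1.0 / xi + 1.0)
        return (sigma / xi) ** s * beta / xi
    if xi < 0.0:      # valid for s > -1
        beta = math.gamma(s + 1.0) * math.gamma(-1.0 / xi) / math.gamma(s + 1.0 - 1.0 / xi)
        return -((-sigma / xi) ** s) * beta / xi
    return sigma ** s * math.gamma(1.0 + s)

# Monte Carlo check of M_Y(s) = E[X^s] for a GPD(2, 0.5) sample.
random.seed(2)
sigma, xi, s = 2.0, 0.5, 0.5
n = 200_000
mc = sum((sigma * ((1.0 - random.random()) ** (-xi) - 1.0) / xi) ** s
         for _ in range(n)) / n
```

Here $s = 1/2 < 1/\xi = 2$, so the fractional moment exists and the Monte Carlo average has finite variance.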

### 2.3 Moments

From the mgf of the exGPD given in Lemma 2, we may determine the moments by differentiating it with respect to $s$. We present the first two moments for $\xi \neq 0$ first. Let us rewrite the mgf in (2) as a function of gamma functions for easier differentiation:

$$M_Y(s) = \begin{cases} -\dfrac{1}{\xi}\,\Gamma(-1/\xi)\Big(-\dfrac{\sigma}{\xi}\Big)^s\, \dfrac{\Gamma(s+1)}{\Gamma(s - 1/\xi + 1)}, & s \in (-1, \infty),\ \xi < 0,\\[8pt] \dfrac{1}{\xi\,\Gamma(1/\xi + 1)}\Big(\dfrac{\sigma}{\xi}\Big)^s\, \Gamma(s+1)\,\Gamma(1/\xi - s), & s \in (-1, 1/\xi),\ \xi > 0. \end{cases}$$

For the $\xi < 0$ case, we may obtain the first two derivatives of the mgf as follows:

$$\frac{d}{ds}M_Y(s) = -\frac{1}{\xi}\,\Gamma(-1/\xi)\Big(-\frac{\sigma}{\xi}\Big)^s\, \frac{\Gamma(s+1)}{\Gamma(s - 1/\xi + 1)}\bigg(\log\Big(-\frac{\sigma}{\xi}\Big) + \psi(s+1) - \psi(s - 1/\xi + 1)\bigg),$$

$$\frac{d^2}{ds^2}M_Y(s) = -\frac{1}{\xi}\,\Gamma(-1/\xi)\Big(-\frac{\sigma}{\xi}\Big)^s\, \frac{\Gamma(s+1)}{\Gamma(s - 1/\xi + 1)}\bigg[\bigg(\log\Big(-\frac{\sigma}{\xi}\Big) + \psi(s+1) - \psi(s - 1/\xi + 1)\bigg)^2 + \psi'(s+1) - \psi'(s - 1/\xi + 1)\bigg],$$

where $\psi$ is the digamma function. In this derivation, we used $\Gamma'(z) = \psi(z)\Gamma(z)$ and

$$\bigg(\frac{1}{\Gamma(s - 1/\xi + 1)}\bigg)' = -\frac{\psi(s - 1/\xi + 1)}{\Gamma(s - 1/\xi + 1)}.$$

Similarly, when $\xi > 0$, we use the fact $\frac{d}{ds}\Gamma(1/\xi - s) = -\psi(1/\xi - s)\,\Gamma(1/\xi - s)$ to get

$$\frac{d}{ds}M_Y(s) = \frac{\Gamma(s+1)\,\Gamma(1/\xi - s)}{\xi\,\Gamma(1/\xi + 1)}\Big(\frac{\sigma}{\xi}\Big)^s\bigg(\log\Big(\frac{\sigma}{\xi}\Big) + \psi(s+1) - \psi(1/\xi - s)\bigg),$$

$$\frac{d^2}{ds^2}M_Y(s) = \frac{\Gamma(s+1)\,\Gamma(1/\xi - s)}{\xi\,\Gamma(1/\xi + 1)}\Big(\frac{\sigma}{\xi}\Big)^s\bigg[\bigg(\log\Big(\frac{\sigma}{\xi}\Big) + \psi(s+1) - \psi(1/\xi - s)\bigg)^2 + \psi'(s+1) + \psi'(1/\xi - s)\bigg].$$

Hence, by setting $s = 0$, we have the first moment of the exGPD:

$$E[Y] = \frac{d}{ds}M_Y(s)\bigg|_{s=0} = \begin{cases} \log(-\sigma/\xi) + \psi(1) - \psi(1 - 1/\xi), & \xi < 0,\\ \log(\sigma/\xi) + \psi(1) - \psi(1/\xi), & \xi > 0,\\ \log\sigma + \psi(1), & \xi = 0. \end{cases}$$

Here $\psi(1) = -\gamma \approx -0.5772$, where $\gamma$ is the Euler–Mascheroni constant. The case for $\xi = 0$ has to be determined separately, but it is easily done from the mgf $\sigma^s\,\Gamma(1+s)$. Likewise, the second moment is given by

$$E[Y^2] = \frac{d^2}{ds^2}M_Y(s)\bigg|_{s=0} = \begin{cases} \big(\log(-\sigma/\xi) + \psi(1) - \psi(1 - 1/\xi)\big)^2 + \psi'(1) - \psi'(1 - 1/\xi), & \xi < 0,\\ \big(\log(\sigma/\xi) + \psi(1) - \psi(1/\xi)\big)^2 + \psi'(1) + \psi'(1/\xi), & \xi > 0,\\ \big(\log\sigma + \psi(1)\big)^2 + \psi'(1), & \xi = 0, \end{cases} \;=\; \begin{cases} E[Y]^2 + \psi'(1) - \psi'(1 - 1/\xi), & \xi < 0,\\ E[Y]^2 + \psi'(1) + \psi'(1/\xi), & \xi > 0,\\ E[Y]^2 + \psi'(1), & \xi = 0. \end{cases}$$

From the second raw moment, the variance of the exGPD becomes

$$\mathrm{Var}[Y] = \begin{cases} \psi'(1) - \psi'(1 - 1/\xi), & \xi < 0,\\ \psi'(1) + \psi'(1/\xi), & \xi > 0,\\ \psi'(1), & \xi = 0, \end{cases} \;=\; \begin{cases} \dfrac{\pi^2}{6} - \displaystyle\sum_{k=1}^\infty \dfrac{1}{(k - 1/\xi)^2}, & \xi < 0,\\[8pt] \dfrac{\pi^2}{6} + \displaystyle\sum_{k=1}^\infty \dfrac{1}{(k + 1/\xi - 1)^2}, & \xi > 0,\\[8pt] \dfrac{\pi^2}{6}, & \xi = 0. \end{cases}$$

Note that the variance of the exGPD depends only on $\xi$. Furthermore, since the summation terms in (2.3) are always positive, the value of the sample variance of $Y$, denoted $s^2$, can serve as a quick indicator of the sign of $\xi$. That is, if $s^2 > \pi^2/6$, then $\xi$ is deemed positive; otherwise, $\xi$ is deemed negative. The first two moments yield the method of moments estimator (MME) of the exGPD parameters. For $\xi < 0$,

$$\hat{\xi}_{MME} = \frac{1}{1 - (\psi')^{-1}\big(\psi'(1) - s^2\big)}, \qquad \hat{\sigma}_{MME} = -\hat{\xi}_{MME}\cdot e^{\,\bar{Y} - \psi(1) + \psi(1 - 1/\hat{\xi}_{MME})},$$

and for $\xi > 0$,

$$\hat{\xi}_{MME} = \frac{1}{(\psi')^{-1}\big(s^2 - \psi'(1)\big)}, \qquad \hat{\sigma}_{MME} = \hat{\xi}_{MME}\cdot e^{\,\bar{Y} - \psi(1) + \psi(1/\hat{\xi}_{MME})},$$

where $\bar{Y}$ is the sample mean and $(\psi')^{-1}$ is the inverse of the trigamma function. When $\xi = 0$, we have $\hat{\sigma}_{MME} = e^{\,\bar{Y} - \psi(1)}$.

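The moment expressions above translate directly into a method-of-moments routine: compute $\bar{Y}$ and $s^2$, invert the trigamma function, and plug into the formulas. The sketch below implements digamma and trigamma from scratch (truncated series plus asymptotic expansion) so that it needs only the standard library; the bisection bracket and truncation lengths are our choices, not from the paper.

```python
import math
import random

def digamma(z):
    """psi(z) via the recurrence psi(z) = psi(z+1) - 1/z and an asymptotic expansion."""
    s = 0.0
    while z < 10.0:
        s -= 1.0 / z
        z += 1.0
    return (s + math.log(z) - 1.0 / (2.0 * z) - 1.0 / (12.0 * z ** 2)
            + 1.0 / (120.0 * z ** 4) - 1.0 / (252.0 * z ** 6))

def trigamma(z, terms=20_000):
    """psi'(z) = sum_{r>=0} 1/(z+r)^2, truncated with an integral tail correction."""
    return sum(1.0 / (z + r) ** 2 for r in range(terms)) + 1.0 / (z + terms)

def inv_trigamma(v):
    """Solve trigamma(x) = v on (0, inf); trigamma is strictly decreasing."""
    lo, hi = 1e-4, 1e4
    for _ in range(80):
        mid = math.sqrt(lo * hi)   # geometric bisection for the wide bracket
        if trigamma(mid) > v:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def exgpd_mme(ys):
    """Method-of-moments estimates (sigma, xi) from a log-exceedance sample."""
    n = len(ys)
    ybar = sum(ys) / n
    s2 = sum((y - ybar) ** 2 for y in ys) / (n - 1)
    psi1, g = digamma(1.0), math.pi ** 2 / 6.0
    if s2 > g:        # xi deemed positive
        xi = 1.0 / inv_trigamma(s2 - g)
        sigma = xi * math.exp(ybar - psi1 + digamma(1.0 / xi))
    else:             # xi deemed negative
        xi = 1.0 / (1.0 - inv_trigamma(g - s2))
        sigma = -xi * math.exp(ybar - psi1 + digamma(1.0 - 1.0 / xi))
    return sigma, xi

# Quick check: recover (sigma, xi) = (2, 0.5) from a simulated exGPD sample.
random.seed(5)
sample = [math.log(2.0 * ((1.0 - random.random()) ** -0.5 - 1.0) / 0.5)
          for _ in range(200_000)]
sig_hat, xi_hat = exgpd_mme(sample)
```

With 200,000 simulated log-exceedances both parameters are recovered to within a few percent.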
One may further obtain higher moments by differentiating the mgf repeatedly, but the task is not straightforward as higher order derivatives can be complicated, involving infinite series, as seen from the derivation of the first two moments above. However, formally establishing the existence of higher moments of the exGPD is important because the difficulty of handling the GPD, such as sampling variability, is essentially attributed to its tail heaviness, directly connected to the non-existence of its moments.

###### Corollary 3.

The exGPD defined in (2.1) has finite moments of all orders.

Proof: Referring to (2.3), the mgf of the exGPD is a product of three functions of $s$, for both the $\xi < 0$ and $\xi > 0$ cases. The first term is simply an exponential function of $s$, which has derivatives of all orders. The second and third terms are gamma functions or their reciprocals. The derivatives of a gamma function can be written through the polygamma function. The polygamma function of order $k$ is defined as the $k$th derivative of the digamma function, or equivalently the $(k+1)$th derivative of the logarithm of the gamma function:

$$\psi^{(k)}(z) = \frac{d^k}{dz^k}\psi(z) = \frac{d^{k+1}}{dz^{k+1}}\log\Gamma(z) = (-1)^{k+1}\, k! \sum_{r=0}^\infty \frac{1}{(z + r)^{k+1}}, \qquad k \geq 1,$$

which is finite for $z > 0$. Hence we conclude that the mgf of the exGPD has derivatives of all orders, each of which in turn gives a finite value when evaluated at $s = 0$. A similar argument holds for the $\xi = 0$ case. $\square$

### 2.4 Further properties

In the GPD literature, various distributional quantities are available in addition to the ordinary moments and the mean excess function (MEF); see, e.g., Ch. 3 of Embrechts et al. (1997). Here we list similar properties of the exGPD; some are parallel to those of the GPD, but others are different. We present the findings first, and then provide the proofs. All findings cover both the $\xi > 0$ and $\xi < 0$ cases.

(a) For a real value $r$ with $r\xi > -1$,

$$E\bigg[\Big(1 + \frac{\xi}{\sigma}e^Y\Big)^{-r}\bigg] = \frac{1}{1 + r\xi}.$$

(b) For a non-negative integer $k$,

$$E\bigg[\Big(\log\Big(1 + \frac{\xi}{\sigma}e^Y\Big)\Big)^k\bigg] = \xi^k\cdot k!.$$

(c) For a real value $r$ with $r + 1 > \max(\xi, 0)$,

$$E\big[e^Y\cdot\big(\bar{F}_Y(Y)\big)^r\big] = \frac{\sigma}{(r + 1 - \xi)(r + 1)}.$$

(d) Assume that $N \sim \mathrm{Poisson}(\lambda)$, independent of the iid sequence $Y_1, Y_2, \ldots$ where $Y_i \sim$ exGPD$(\sigma, \xi)$. Write $M_N = \max(Y_1, \ldots, Y_N)$. Then

$$P(M_N \leq y) = \exp\bigg(-\lambda\Big(1 + \frac{\xi e^y}{\sigma}\Big)^{-1/\xi}\bigg).$$

(e) For a constant $c$ in the distribution support,

$$\int_c^\infty \bar{F}_Y(y)\, dy = \int_c^\infty \Big(1 + \frac{\xi}{\sigma}e^y\Big)^{-1/\xi} dy = \begin{cases} B\big((1 + \xi e^c/\sigma)^{-1};\ 1/\xi,\ 0\big), & \xi > 0,\\[4pt] B\big(1 + \xi e^c/\sigma;\ 1 - 1/\xi,\ 0\big), & \xi < 0, \end{cases}$$

where $B(x; a, b)$ is the incomplete beta function

$$B(x; a, b) = \int_0^x t^{a-1}(1-t)^{b-1}\, dt.$$

Proof: Proof for (a):

$$E\bigg[\Big(1 + \frac{\xi}{\sigma}e^Y\Big)^{-r}\bigg] = \int_{-\infty}^\infty \Big(1 + \frac{\xi}{\sigma}e^y\Big)^{-r}\cdot\frac{e^y}{\sigma}\Big(1 + \frac{\xi}{\sigma}e^y\Big)^{-1/\xi - 1} dy = \int_{-\infty}^\infty \frac{e^y}{\sigma}\Big(1 + \frac{\xi}{\sigma}e^y\Big)^{-1/\xi - r - 1} dy.$$

For the $\xi > 0$ case, we take $t = 1 + \xi e^y/\sigma$; then by integration by substitution, (2.4) is equal to

$$\int_1^\infty t^{-1/\xi - r - 1}\cdot\frac{1}{\xi}\, dt = \frac{1}{\xi}\cdot\frac{1}{1/\xi + r} = \frac{1}{1 + r\xi}, \qquad r > -1/\xi.$$

When $\xi < 0$, (2.4) becomes, with the same substitution,

$$\int_1^0 t^{-1/\xi - r - 1}\cdot\frac{1}{\xi}\, dt = \frac{1}{\xi}\cdot\frac{1}{1/\xi + r} = \frac{1}{1 + r\xi}, \qquad r < -1/\xi,$$

completing the proof. Note that the different conditions for $r$ have been combined into $r\xi > -1$. We comment thus that the condition used in Theorem 4.3.13 of Embrechts et al. (1997) is not correct.
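Property (a) is easy to verify by Monte Carlo: since $Y = \log X$, the quantity inside the expectation is just $(1 + \xi X/\sigma)^{-r}$ for a GPD variate $X$. A sketch with illustrative parameter values chosen by us, keeping $r$ small enough for the Monte Carlo variance to stay finite:

```python
import math
import random

random.seed(3)
n = 200_000

def gpd_draw(sigma, xi):
    """Inverse-transform draw from GPD(sigma, xi), xi != 0."""
    u = random.random()
    return sigma * ((1.0 - u) ** (-xi) - 1.0) / xi

def check_property_a(sigma, xi, r):
    """Monte Carlo estimate vs. the exact value 1/(1 + r*xi)."""
    mc = sum((1.0 + xi * gpd_draw(sigma, xi) / sigma) ** (-r)
             for _ in range(n)) / n
    return mc, 1.0 / (1.0 + r * xi)

mc_pos, exact_pos = check_property_a(1.0, 0.5, 1.0)    # r*xi = 0.5 > -1
mc_neg, exact_neg = check_property_a(1.0, -0.5, 0.5)   # r*xi = -0.25 > -1
```

Both cases agree with $1/(1 + r\xi)$ to well within Monte Carlo error.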

Proof for (b):

$$E\bigg[\Big(\log\Big(1 + \frac{\xi}{\sigma}e^Y\Big)\Big)^k\bigg] = \int_{-\infty}^\infty \Big(\log\Big(1 + \frac{\xi}{\sigma}e^y\Big)\Big)^k\cdot\frac{e^y}{\sigma}\Big(1 + \frac{\xi}{\sigma}e^y\Big)^{-1/\xi - 1} dy.$$

For the $\xi > 0$ case, if we take $t = \log(1 + \xi e^y/\sigma)$, then by integration by substitution, (2.4) is equal to

$$\xi^{-1}\int_0^\infty t^k\, e^{-t/\xi}\, dt = \xi^k\cdot k!.$$

For the $\xi < 0$ case, we obtain the same result with the same substitution, but via a slightly different integration since $t$ now ranges over $(-\infty, 0]$. Note that depending on $k$ being odd or even, the quantity $\xi^k\cdot k!$ can be negative or positive for $\xi < 0$.

Proof for (c):

$$E\big[e^Y\cdot\big(\bar{F}_Y(Y)\big)^r\big] = \int_{-\infty}^\infty e^y\cdot\frac{e^y}{\sigma}\Big(1 + \frac{\xi}{\sigma}e^y\Big)^{-(r+1)/\xi - 1} dy.$$

For the $\xi > 0$ case, if we take $t = 1 + \xi e^y/\sigma$, then (2.4) is equal to

$$\int_1^\infty \frac{\sigma}{\xi}(t - 1)\, t^{-(r+1)/\xi - 1}\cdot\frac{1}{\xi}\, dt = \frac{\sigma}{\xi^2}\int_1^\infty \big(t^{-(r+1)/\xi} - t^{-(r+1)/\xi - 1}\big)\, dt = \frac{\sigma}{\xi^2}\bigg(\frac{\xi}{r + 1 - \xi} - \frac{\xi}{r + 1}\bigg) = \frac{\sigma}{(r + 1 - \xi)(r + 1)}, \qquad r + 1 > \xi.$$

For the $\xi < 0$ case, with the same substitution, (2.4) becomes

$$\int_1^0 \frac{\sigma}{\xi}(t - 1)\, t^{-(r+1)/\xi - 1}\cdot\frac{1}{\xi}\, dt = \frac{\sigma}{(r + 1 - \xi)(r + 1)}, \qquad r + 1 > 0.$$

Thus the conditions differ depending on the sign of $\xi$ even though the results are identical. If one wishes to merge them for convenience's sake, then $r + 1 > \max(\xi, 0)$ would serve the purpose.

Proof for (d): Note that

$$P(M_N \leq y) = E\big[P(M_N \leq y \,|\, N)\big] = \sum_{n=0}^\infty P(M_N \leq y \,|\, N = n)\cdot P(N = n).$$

On the other hand, we have

$$P(M_N \leq y \,|\, N = n) = P(M_n \leq y) = P(Y_1 \leq y, \ldots, Y_n \leq y) = P(Y_1 \leq y)\cdots P(Y_n \leq y) = \big(F_Y(y)\big)^n.$$

Therefore, we have

$$P(M_N \leq y) = \sum_{n=0}^\infty \big(F_Y(y)\big)^n\cdot\frac{\lambda^n e^{-\lambda}}{n!} = e^{-\lambda}\sum_{n=0}^\infty \frac{\big(\lambda\, F_Y(y)\big)^n}{n!} = e^{-\lambda}\cdot e^{\lambda F_Y(y)} = e^{-\lambda(1 - F_Y(y))} = e^{-\lambda\,\bar{F}_Y(y)}. \qquad \square$$

This property is related to the POT framework. If transformed back to the original scale, this is the generalized extreme value (GEV) distribution.
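Property (d) can likewise be checked by simulation, drawing a Poisson number of exGPD variates per trial and comparing the empirical df of the maximum with the closed form. The parameter values below are illustrative choices of ours; the empty maximum is taken as $-\infty$, so $N = 0$ always counts as a success.

```python
import math
import random

random.seed(4)
lam, sigma, xi, y0 = 3.0, 1.0, 0.5, 1.0   # illustrative values

def poisson(lmbda):
    """Knuth's multiplication method, adequate for small means."""
    L = math.exp(-lmbda)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

def exgpd_draw():
    """exGPD variate as the log of an inverse-transform GPD variate."""
    u = random.random()
    return math.log(sigma * ((1.0 - u) ** (-xi) - 1.0) / xi)

trials = 100_000
hits = 0
for _ in range(trials):
    n = poisson(lam)
    if all(exgpd_draw() <= y0 for _ in range(n)):   # vacuously true when n = 0
        hits += 1

exact = math.exp(-lam * (1.0 + xi * math.exp(y0) / sigma) ** (-1.0 / xi))
```

The empirical frequency of $\{M_N \leq y_0\}$ matches $\exp(-\lambda\,\bar{F}_Y(y_0))$ to within Monte Carlo error.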

Proof for (e): For the $\xi > 0$ case, use integration by substitution with $t = (1 + \xi e^y/\sigma)^{-1}$ to get

$$\int_c^\infty \bar{F}_Y(y)\, dy = \int_c^\infty \Big(1 + \frac{\xi e^y}{\sigma}\Big)^{-1/\xi} dy = \int_0^{(1 + \xi e^c/\sigma)^{-1}} t^{1/\xi - 1}(1 - t)^{-1}\, dt = B\big((1 + \xi e^c/\sigma)^{-1};\ 1/\xi,\ 0\big),$$

which can be evaluated using the hypergeometric function. Similarly, for the $\xi < 0$ case, we set $t = 1 + \xi e^y/\sigma$ to get

$$\int_c^{\log(-\sigma/\xi)} \bar{F}_Y(y)\, dy = \int_c^{\log(-\sigma/\xi)} \Big(1 + \frac{\xi e^y}{\sigma}\Big)^{-1/\xi} dy = \int_0^{1 + \xi e^c/\sigma} t^{-1/\xi}(1 - t)^{-1}\, dt = B\big(1 + \xi e^c/\sigma;\ 1 - 1/\xi,\ 0\big).$$

For the $\xi = 0$ case, we may use integration by substitution with $t = e^{\,y - \log\sigma}$ to get

$$\int_c^\infty \bar{F}_Y(y)\, dy = \int_c^\infty e^{-e^{\,y - \log\sigma}} dy = \int_{e^{\,c - \log\sigma}}^\infty t^{-1} e^{-t}\, dt = \Gamma\big(0,\ e^{\,c - \log\sigma}\big),$$

where $\Gamma(s, x)$ is the incomplete gamma function

$$\Gamma(s, x) = \int_x^\infty t^{s-1} e^{-t}\, dt. \qquad \square$$
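The change of variables in case (e) can be verified by computing both sides numerically: the left side by integrating the survival function, the right side by integrating the incomplete-beta integrand. A sketch for $\xi > 0$ using Simpson's rule; the truncation point 40 is our choice and lies far into the negligible tail.

```python
import math

sigma, xi, c = 1.0, 0.5, 0.0   # illustrative parameter values

def simpson(f, a, b, n=20_000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return s * h / 3.0

# Left side: integral of the exGPD survival function over [c, infinity).
lhs = simpson(lambda y: (1.0 + xi * math.exp(y) / sigma) ** (-1.0 / xi), c, 40.0)

# Right side: incomplete beta B(x; 1/xi, 0) with x = (1 + xi*e^c/sigma)^{-1}.
x = (1.0 + xi * math.exp(c) / sigma) ** -1.0
rhs = simpson(lambda t: t ** (1.0 / xi - 1.0) / (1.0 - t), 0.0, x)
```

For this particular choice $1/\xi = 2$, so the beta integral is elementary, $\int_0^x t/(1-t)\,dt = -x - \log(1-x)$, giving a third independent value to compare against.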

Sometimes the three-parameter GPD is found in the literature, which is created by adding a location parameter $\mu$ to the GPD. The df of this distribution is defined as $G_X(x - \mu)$, $x \geq \mu$, where $G_X$ is the GPD df in (2.1). We comment that most results in this section can be readily applied to the three-parameter GPD without additional difficulty.

### 2.5 Maximum likelihood estimation

Let $y_1, \ldots, y_n$ be an iid sample from exGPD$(\sigma, \xi)$ with $\xi \neq 0$. Then from its density (2.1) the log-likelihood of the exGPD can be written as

$$l(\sigma, \xi) = \sum_{i=1}^n y_i + \frac{n\log\sigma}{\xi} - \Big(\frac{1}{\xi} + 1\Big)\sum_{i=1}^n \log\big(\sigma + \xi e^{y_i}\big),$$

and the MLE may be found from solving

$$\frac{\partial l}{\partial \sigma} = \frac{n}{\sigma\xi} - \Big(\frac{1}{\xi} + 1\Big)\sum_{i=1}^n \frac{1}{\sigma + \xi e^{y_i}} = 0, \qquad \frac{\partial l}{\partial \xi} = -\frac{n\log\sigma}{\xi^2} - \Big(\frac{1}{\xi} + 1\Big)\sum_{i=1}^n \frac{e^{y_i}}{\sigma + \xi e^{y_i}} + \frac{1}{\xi^2}\sum_{i=1}^n \log\big(\sigma + \xi e^{y_i}\big) = 0.$$

Clearly, the MLE has to be solved numerically. Due to the condition that $\sigma + \xi e^{y_i} > 0$ for all $i$, one must have $\log(-\sigma/\xi) > y_{(n)}$ for $\xi < 0$, where $y_{(n)}$ is the sample maximum. Then, for any given $\xi < -1$, we see that

$$\lim_{\log(-\sigma/\xi)\, \to\, y_{(n)}^+} l(\sigma, \xi) = \infty,$$

implying that there is no finite MLE. So the MLE exists only when $\xi > -1$, the same condition for the MLE of the GPD to exist as pointed out in, e.g., Grimshaw (1993). In fact, it turns out that the MLEs for the GPD and exGPD are identical for a given sample. To see this, consider the densities of these two distributions with the same parameter $(\sigma, \xi)$, that is, $g_X(x|\sigma, \xi)$ and $f_Y(y|\sigma, \xi)$ from (2.1) and (2.1), respectively. When a sample $x_1, \ldots, x_n$ is given from the GPD, we can obtain a corresponding exGPD sample with $y_i = \log x_i$. The log-likelihood function for the exGPD sample $(y_1, \ldots, y_n)$ is then shown from the density to be

$$\sum_{i=1}^n \log f_Y(y_i|\sigma, \xi) = \sum_{i=1}^n y_i + \sum_{i=1}^n \log g_X(e^{y_i}|\sigma, \xi) = \sum_{i=1}^n \log x_i + \sum_{i=1}^n \log g_X(x_i|\sigma, \xi).$$

In the last expression, the first term involves no parameter and the second term stands for the log-likelihood function of the GPD. Hence maximizing the likelihood functions for both distributions leads to the identical parameter estimate. This implies that, even though the log transform may stabilize the volatility of the GPD sample in terms of, e.g., moments, the MLE does not offer additional benefits from this stabilization. Consequently, if we let $\hat{\theta} = (\hat{\sigma}, \hat{\xi})$ be the MLE of the parameter $\theta = (\sigma, \xi)$ for the exGPD, we have

$$\sqrt{n}\,\big(\hat{\theta} - \theta\big) \xrightarrow{d} \mathrm{BVN}\big(0,\ [I_1(\theta)]^{-1}\big),$$

where the covariance matrix is given by

 [I1(θ)]−1=[2σ2(1+ξ)−σ(