Quantile correlation coefficient: a new tail dependence measure
Ji-Eun Choi and Dong Wan Shin (corresponding author: D.W. Shin. Mailing address: Dept. of Statistics, Ewha University, Seoul, Korea. Tel: 82-2-3277-2614; Fax: 82-2-3277-3606; E-mail address: firstname.lastname@example.org)
Department of Statistics, Ewha University
March 16, 2018
Abstract. We propose a new measure related to tail dependence in terms of correlation: the quantile correlation coefficient of random variables X, Y. The quantile correlation is defined as the geometric mean of the two quantile regression slopes of X on Y and Y on X, in the same way that the Pearson correlation is related to the regression coefficients of Y on X and X on Y. The degree of tail-dependent association in X, Y, if any, is well reflected in the quantile correlation. The quantile correlation makes it possible to measure the sensitivity of a conditional quantile of one random variable with respect to change of the other variable. The properties of the quantile correlation are similar to those of the correlation. This enables us to interpret it from the perspective of correlation, on which tail dependence is reflected. We construct measures for tail-dependent correlation and tail asymmetry and develop statistical tests for them. We prove asymptotic normality of the estimated quantile correlation and derive limiting null distributions of the proposed tests, which are well supported in finite samples by a Monte-Carlo study. The proposed quantile correlation methods are illustrated by analyzing a birth weight data set and a stock return data set.
Keywords Quantile correlation; quantile regression; tail dependence; conditional quantile
MSC classification: 62H20
The correlation coefficient is a standard statistical tool for measuring the relationship between two variables. There are several versions of the correlation coefficient, such as the Pearson correlation coefficient, Spearman’s rank correlation coefficient, Kendall’s tau rank correlation coefficient, and others. The most common of these is the Pearson correlation coefficient, which is a measure of linear association. However, these correlation coefficients fail to measure tail-specific relationships.
Recently, interest in associations of random variables in their tails has grown in various fields. In finance, recurrent global financial crises have shown that distress at one financial institution causes a series of adverse impacts on other financial institutions or on the financial system as a whole. Hence, many studies on measures of tail dependence have been conducted in the recent literature: CoVaR (co-value at risk) of Adrian and Brunnermeier (2016) and Girardi and Ergun (2013), the volatility spillover index of Diebold and Yilmaz (2012), and many others. Other statistical tools have also been considered for tail dependence analysis. Copulas are considered by many authors; see Joe et al. (2010), Nikoloulopoulos et al. (2012) and Kollo et al. (2017).
In environmental science, as the frequency of abnormal climate events has increased, the importance of identifying associations of environmental factors in extreme tails has been accentuated. Accordingly, statistical analyses of the association between abnormal climate and other factors using quantile regression have been conducted by many authors: Sayegh et al. (2014) for concentration; Meng and Shen (2014) for extreme temperature; Villarini et al. (2011) for heavy rainfall; and others.
We therefore need a measure which captures tail-specific relations. We define a new correlation coefficient, called the “quantile correlation coefficient”, as a measure related to tail dependence in the context of correlation of random variables $X$ and $Y$. There is already a measure named “quantile correlation coefficient”, proposed by Li et al. (2015), which is the Pearson correlation of $X$ and the indicator of the event that $Y$ lies on a given side of its $\tau$-quantile, $\tau \in (0,1)$. Clearly, this measure is not symmetric in $(X, Y)$, in that interchanging $X$ and $Y$ generally changes its value. Moreover, the measure is a compound measure of the sensitivity of the conditional probability of that event to change in $X$ and the heterogeneity of the conditional expectations of $X$ given the event and given its complement, which makes it difficult to get a clear interpretation related to tail dependence; see Sections 2, 6. In fact, it fails to reflect the degree of tail-dependent association, as illustrated in Examples 2.1, 2.2 below. Therefore, it is necessary to define a new quantile correlation coefficient which captures well the degree of tail-dependent association and allows a clear interpretation for tail dependence.
The $\tau$-quantile correlation coefficient of two random variables $X$, $Y$ is defined as the geometric mean of the two $\tau$-quantile regression slopes of X on Y and of Y on X. Note that the Pearson correlation coefficient is the geometric mean of the two linear regression slopes of X on Y and of Y on X. The geometric mean indicates the overall sensitivity of the conditional mean of one variable with respect to change of the other variable. Similarly, the quantile correlation coefficient has the meaning of overall sensitivity of the conditional $\tau$-quantile of one variable with respect to change of the other variable.
Our quantile correlation coefficient will be shown to have the advantages of clear meaning and easy estimation. The quantile correlation coefficient satisfies the basic features of a correlation coefficient: being zero for independent random variables; being $\pm 1$ for perfectly linearly related random variables; commutativity; scale-location invariance; being bounded by 1 in absolute value for a general class of $(X, Y)$. This allows the quantile correlation coefficient to be interpreted as a correlation coefficient.
The quantile correlation coefficient can be applied in diverse ways. First, we can compare how sensitive the lower, upper and median conditional quantiles of one variable are to a unit change of the other variable. For example, we can identify whether a stock return is more affected by change of another stock return in its lower tail conditional quantiles than in its upper tail conditional quantiles or in its conditional median. Second, it can be used to rank variables by how sensitive their tail conditional quantiles are to change of a specific variable. For example, in environmental studies, we can use it in primary screening of environmental factors which cause abnormal climate, such as high concentration of fine dust, heavy snow, heat waves and many others.
An estimation method is implemented for the quantile correlation coefficient, giving us the sample quantile correlation coefficient. Based on the sample quantile correlation coefficient, we construct new measures and tests for differences between the $\tau$-quantile correlation and the median correlation and between the left $\tau$-quantile correlation and the right $\tau$-quantile correlation. We derive the asymptotic distributions of the sample quantile correlation coefficient and the asymptotic null distributions of the proposed tests.
A Monte-Carlo experiment shows the finite sample validity of the asymptotic distribution of the sample quantile correlation coefficient through its stable confidence interval coverage. The experiment also demonstrates that the proposed tests have reasonable sizes and powers. The proposed quantile correlation coefficient methods are demonstrated by analyzing a birth weight data set and a stock return data set, investigating the relations between mother’s weight gained during pregnancy and birth weight and between the US S&P 500 index return and the French CAC 40 index return.
The remainder of the paper is organized as follows. Section 2 defines the quantile correlation coefficient. Section 3 implements an estimation method. Section 4 establishes asymptotic distributions. Section 5 contains a finite sample Monte-Carlo simulation. Section 6 applies the quantile correlation coefficient methods to real data sets. Section 7 gives a conclusion.
2 Quantile correlation coefficient
In Section 2.1, the quantile correlation coefficient $\rho_\tau$ is defined for a random vector $(X, Y)$; it addresses the $\tau$-tail-specific relation of $X$ and $Y$, $\tau \in (0,1)$. The meaning of $\rho_\tau$ as a sensitivity measure of the conditional $\tau$-quantile of one variable with respect to change of the other variable is discussed. The proposed $\rho_\tau$ is shown to satisfy the properties that the Pearson correlation coefficient does. In Section 2.2, two examples illustrate that tail-dependent relations of $X$ and $Y$ are well reflected in the $\tau$-dependent shape of $\rho_\tau$. Measures of tail dependency and tail asymmetry are proposed.
2.1 Definition and properties
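The quantile correlation is motivated by the relationship between linear regression coefficients and the correlation coefficient. Let $(X, Y)$ be a random vector having finite second moment. Let $\sigma_X^2 = \mathrm{Var}(X)$, $\sigma_Y^2 = \mathrm{Var}(Y)$, $\sigma_{XY} = \mathrm{Cov}(X, Y)$. We observe that $\beta_{2.1} = \sigma_{XY}/\sigma_X^2$ is the $\beta$ minimizing the expected squared error loss $E[(Y - \alpha - \beta X)^2]$ and $\beta_{1.2} = \sigma_{XY}/\sigma_Y^2$ is the $\beta$ minimizing $E[(X - \alpha - \beta Y)^2]$. Note that the Pearson correlation $\rho = \mathrm{sign}(\beta_{2.1}) \sqrt{\beta_{2.1}\beta_{1.2}}$ is the geometric mean of the two linear regression coefficients. This correlation coefficient measures the sensitivity of the conditional mean of one random variable with respect to change of the other variable. The correlation is modified to measure the sensitivity of a conditional quantile rather than of the conditional mean by considering $\tau$-quantile regressions, which minimize the expected losses of $\tau$-quantile regression, $\tau \in (0,1)$, rather than linear regressions, which minimize the expected squared error losses: the $\tau$-quantile correlation coefficient $\rho_\tau$ is defined to be the geometric mean of the two $\tau$-quantile regression coefficients $\beta_{2.1}(\tau)$ and $\beta_{1.2}(\tau)$ of Y on X and X on Y.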
In order to see what $\rho_\tau$ tells us, we first review what the Pearson correlation coefficient $\rho$ tells us. Assume temporarily linearity of $E[Y \mid X]$ and $E[X \mid Y]$. Note that the regression slope $\beta_{2.1}$ of $Y$ on $X$ is the amount of change of $E[Y \mid X]$ with respect to a unit change in $X$, and the regression slope $\beta_{1.2}$ of $X$ on $Y$ is the amount of change of $E[X \mid Y]$ with respect to a unit change in $Y$. The regression coefficients $\beta_{2.1}$ and $\beta_{1.2}$ are sensitivities of conditional expectations with respect to changes of conditioning variables. When the linearity of the conditional expectations is violated, $\beta_{2.1}$ and $\beta_{1.2}$ are overall sensitivities of changes of conditional expectations with respect to changes of conditioning variables. Therefore, their geometric mean $\rho$ tells us the overall sensitivity of the conditional mean of one variable with respect to change of the other variable: the larger $|\rho|$, the more sensitive the conditional mean of one variable to change of the other variable in an overall sense. By the same reasoning, the median correlation $\rho_{0.5}$ is an overall sensitivity measure of the conditional median of one variable with respect to change of the other variable: the larger $|\rho_{0.5}|$, the more sensitive overall the conditional median of one random variable to change of the other variable.
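The identity between the Pearson correlation and the two least-squares slopes can be checked numerically. The following is a minimal sketch on simulated data; the helper names (`ols_slope`, `pearson`, `signed_geometric_mean`) and the data-generating model are illustrative assumptions, not part of the paper.

```python
import math
import random

def ols_slope(x, y):
    """Least-squares slope of y on x: sample cov(x, y) / var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def pearson(x, y):
    """Sample Pearson correlation computed directly from moments."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def signed_geometric_mean(b_yx, b_xy):
    """Geometric mean of the two slopes, carrying their common sign."""
    return math.copysign(math.sqrt(b_yx * b_xy), b_yx)

# Hypothetical example data with a negative linear association.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(200)]
y = [-0.7 * a + random.gauss(0, 1) for a in x]

rho_from_slopes = signed_geometric_mean(ols_slope(x, y), ols_slope(y, x))
rho_direct = pearson(x, y)
print(rho_from_slopes, rho_direct)  # the two values agree up to rounding
```

The product of the two slopes equals the squared correlation, so the square root is always well defined in the least-squares case; the quantile analogue needs a separate argument.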
Similarly, for given $\tau$, the larger $|\rho_\tau|$, the more sensitive overall the conditional $\tau$-quantile of one random variable to change of the other variable. Therefore, comparison of $\rho_\tau$ for different $\tau$ is meaningful. For example, if $|\rho_{0.1}| > |\rho_{0.5}|$, it means that the conditional 0.1-quantile of a random variable is overall more sensitive to change of the other variable than its conditional median. If $|\rho_{0.1}| > |\rho_{0.9}|$, it means that the left conditional 0.1-quantile of a random variable is overall more sensitive to change of the other variable than its right conditional 0.1-quantile. Therefore, we can say that $\rho_\tau$ is an overall sensitivity measure of the conditional $\tau$-quantile of one variable with respect to change of the other variable.
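On the other hand, the quantile correlation of Li et al. (2015) has a complicated implication. It is the Pearson correlation between $X$ and $I(A)$, where $A$ is the event that $Y$ lies on a given side of its $\tau$-quantile and $I(A)$ is the indicator function of an event A. Assume linear conditional expectations for $E[I(A) \mid X]$ and $E[X \mid I(A)]$. Then the measure is, up to sign, the geometric mean of the slopes of these two conditional expectations. Note that the first slope is the change of the conditional probability $P(A \mid X)$ associated with a unit change in $X$ and that the second slope is the difference between the conditional means of $X$ on $A$ and on its complement. Therefore, a large absolute value of the measure indicates (i) strong sensitivity, in that the conditional probability of $A$ is highly sensitive to change in $X$, or (ii) strong heterogeneity, in that the conditional mean of $X$ differs greatly depending on whether $A$ or its complement occurs. Hence, the measure is a compound measure of the sensitivity of the conditional probability $P(A \mid X)$ to change in $X$ and the heterogeneity of the conditional expectations of $X$ given $A$ and given its complement; see Section 6.1 for a real data illustration. It is hard to get a simple sensitivity interpretation from it. Moreover, it obviously lacks symmetry, in that interchanging $X$ and $Y$ generally changes its value. Furthermore, tail-dependent association of $X$, $Y$, if any, is not well reflected in it, as illustrated in Examples 2.1, 2.2. Unlike this measure, our quantile correlation $\rho_\tau$ reflects well the degree of association of $X$, $Y$, as demonstrated in Examples 2.1, 2.2, is symmetric in $(X, Y)$ and has a clear sensitivity interpretation.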
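The lack of symmetry of an indicator-based measure can be seen in a small deterministic example. The sketch below assumes the measure takes the form of the Pearson correlation between the indicator that one variable exceeds its sample $\tau$-quantile and the other variable; the direction of the inequality, the function names and the toy data are assumptions made for illustration.

```python
import math

def pearson(u, v):
    """Sample Pearson correlation."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def indicator_quantile_correlation(y, x, tau):
    """Correlation of 1{y > sample tau-quantile of y} with x (assumed form)."""
    q = sorted(y)[math.ceil(tau * len(y)) - 1]
    ind = [1.0 if v > q else 0.0 for v in y]
    return pearson(ind, x)

# Toy data: x on an even grid in (0, 1), y a monotone nonlinear transform of x.
n = 1000
x = [(i + 0.5) / n for i in range(n)]
y = [a ** 4 for a in x]
tau = 0.5

m_yx = indicator_quantile_correlation(y, x, tau)  # indicator built from y
m_xy = indicator_quantile_correlation(x, y, tau)  # indicator built from x
print(m_yx, m_xy)  # the two orderings give clearly different values
```

Because $y$ is an increasing function of $x$ here, both orderings use the same indicator, and the difference comes entirely from correlating it with $x$ versus with the skewed variable $y$.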
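Our $\tau$-quantile regressions of $Y$ on $X$ and $X$ on $Y$ are defined by minimizing the expected losses
$$E[l_\tau(Y - \alpha - \beta X)] \quad \text{and} \quad E[l_\tau(X - \alpha - \beta Y)], \tag{1}$$
where $l_\tau(e) = e\,(\tau - I(e < 0))$ is the loss function of the $\tau$-quantile regression. Let $\tau \in (0,1)$ be given and let
$$(\alpha_{2.1}(\tau), \beta_{2.1}(\tau)) = \arg\min_{\alpha, \beta} E[l_\tau(Y - \alpha - \beta X)], \quad (\alpha_{1.2}(\tau), \beta_{1.2}(\tau)) = \arg\min_{\alpha, \beta} E[l_\tau(X - \alpha - \beta Y)]. \tag{2}$$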
Note that these coefficients are more general than the “usual quantile regression coefficient”, which minimizes the conditional loss function under the linearity assumption of the conditional $\tau$-quantiles of $Y$ given $X$; see Koenker (2005, Section 4.1.2). Our quantile regression coefficient is defined without imposing the linearity assumption. If the $\tau$-quantile of Y given X is linear in $X$, the slope $\beta_{2.1}(\tau)$ of $Y$ on $X$ is the same as the “usual $\tau$-quantile regression coefficient”, as shown in Theorem 2.3 below.
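To make the loss function of the $\tau$-quantile regression concrete, the sketch below assumes the standard check loss $l_\tau(e) = e(\tau - I(e<0))$ of Koenker (2005) and verifies, in the intercept-only case, that minimizing the average loss over a constant recovers a sample $\tau$-quantile. The function names and the toy data are illustrative assumptions.

```python
def check_loss(e, tau):
    """Check loss: tau * e for e >= 0 and (tau - 1) * e for e < 0."""
    return e * (tau - (1.0 if e < 0 else 0.0))

def mean_check_loss(values, q, tau):
    """Average check loss of the residuals values[k] - q."""
    return sum(check_loss(v - q, tau) for v in values) / len(values)

values = [2.0, 7.0, 1.0, 9.0, 4.0, 6.0, 3.0, 8.0, 5.0, 10.0, 11.0]
tau = 0.3

# The average loss is convex and piecewise linear in q with kinks at the
# data points, so it suffices to search over the observed values.
best_q = min(values, key=lambda q: mean_check_loss(values, q, tau))
print(best_q)  # 4.0, the 4th smallest of the 11 values (11 * 0.3 = 3.3)
```

The asymmetric weights $\tau$ and $1-\tau$ on positive and negative residuals are what push the minimizer away from the median toward the $\tau$-quantile.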
We define the quantile correlation for a random vector $(X, Y)$ and study its basic properties.
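Let $(X, Y)$ be a random vector having finite first order moment. Let $\beta_{2.1}(\tau)$ and $\beta_{1.2}(\tau)$, the $\tau$-quantile regression slopes of $Y$ on $X$ and of $X$ on $Y$, be defined in (2). Given $\tau \in (0,1)$, the $\tau$-quantile correlation coefficient between X and Y is defined as
$$\rho_\tau(X, Y) = \mathrm{sign}(\beta_{2.1}(\tau)) \sqrt{\beta_{2.1}(\tau)\,\beta_{1.2}(\tau)}.$$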
If the relation between $X$ and $Y$ is heterogeneous, in that they have different degrees of association in the left tails of $(X, Y)$, in the right tails of $(X, Y)$ and elsewhere, then the heterogeneity is reflected in $\rho_\tau$. Therefore, $\rho_\tau$ can be regarded as a tail-dependence measure. This point is investigated further in Examples 2.1, 2.2 below.
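A sample analogue of this definition can be sketched by replacing the expected losses with sample averages. The code below is a brute-force illustration under stated assumptions and is not the estimation method of Section 3: it uses the fact that a minimizer of the sample check loss for a straight-line fit can be found among the lines through two data points, and it returns 0 when the product of the two sample slopes is not positive (a guard added here for illustration). The function names and the simulated data are hypothetical.

```python
import math
import random

def quantile_slope(x, y, tau):
    """Slope of a line minimizing the sample tau-check loss of y on x.

    Brute force over all lines through two data points (O(n^3) time).
    """
    n = len(x)
    best_loss, best_b = float("inf"), 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue  # vertical line, not a regression line
            b = (y[j] - y[i]) / (x[j] - x[i])
            a = y[i] - b * x[i]
            loss = 0.0
            for xk, yk in zip(x, y):
                e = yk - a - b * xk
                loss += e * tau if e >= 0 else e * (tau - 1.0)
            if loss < best_loss:
                best_loss, best_b = loss, b
    return best_b

def quantile_correlation(x, y, tau):
    """Signed geometric mean of the two sample quantile regression slopes."""
    b_yx = quantile_slope(x, y, tau)  # y regressed on x
    b_xy = quantile_slope(y, x, tau)  # x regressed on y
    prod = b_yx * b_xy
    if prod <= 0:
        return 0.0  # illustrative guard, an assumption of this sketch
    return math.copysign(math.sqrt(prod), b_yx)

# Hypothetical data: positively associated pair with homogeneous dependence.
random.seed(1)
x = [random.gauss(0, 1) for _ in range(60)]
y = [0.8 * a + 0.6 * random.gauss(0, 1) for a in x]

rho_by_tau = {tau: quantile_correlation(x, y, tau) for tau in (0.1, 0.5, 0.9)}
print(rho_by_tau)
```

For larger samples a linear-programming or dedicated quantile regression routine would be the natural choice; the enumeration is used here only to keep the sketch self-contained.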
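If the product $\beta_{2.1}(\tau)\beta_{1.2}(\tau)$ of the two quantile regression slopes is negative, $\rho_\tau$ is not defined. However, the following theorem shows that the proposed quantile correlation is always well-defined.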
Theorem 2.2
For all $\tau \in (0,1)$, $\beta_{2.1}(\tau)\,\beta_{1.2}(\tau) \ge 0$.
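The following theorem states that, under the linear quantile function conditions, the quantile regression coefficients $\beta_{2.1}(\tau)$ and $\beta_{1.2}(\tau)$ are the same as the “usual quantile regression coefficient” of Koenker (2005, Section 4.1.2) and many others. Let $Q_\tau(Y \mid X)$ be the conditional $\tau$-quantile of $Y$ given $X$ and let $Q_\tau(X \mid Y)$ be that of $X$ given $Y$:
$$Q_\tau(Y \mid X) = \inf\{y : P(Y \le y \mid X) \ge \tau\}, \quad Q_\tau(X \mid Y) = \inf\{x : P(X \le x \mid Y) \ge \tau\}.$$
Note that $Q_\tau(Y \mid X)$ minimizes the conditional expected loss $E[l_\tau(Y - q) \mid X]$ over $q$, where $l_\tau$ is the loss function of the $\tau$-quantile regression. Similarly, $Q_\tau(X \mid Y)$ minimizes $E[l_\tau(X - q) \mid Y]$.
Theorem 2.3
Assume $Q_\tau(Y \mid X)$ and $Q_\tau(X \mid Y)$ are both linear in $X$ and $Y$, respectively, that is,
$$Q_\tau(Y \mid X) = a_{2.1}(\tau) + b_{2.1}(\tau) X, \quad Q_\tau(X \mid Y) = a_{1.2}(\tau) + b_{1.2}(\tau) Y$$
for some $(a_{2.1}(\tau), b_{2.1}(\tau))$ and $(a_{1.2}(\tau), b_{1.2}(\tau))$. Then $\beta_{2.1}(\tau) = b_{2.1}(\tau)$ and $\beta_{1.2}(\tau) = b_{1.2}(\tau)$.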
Basic properties of $\rho_\tau$ such as commutativity, scale-location-equivariance, and others are given below.
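Theorem 2.4
Assume $(X, Y)$ has finite first moment. We have
(i) $\rho_\tau(X, Y) = \rho_\tau(Y, X)$,
(ii) $\rho_\tau(X, a + bX) = \mathrm{sign}(b)$ if $b \ne 0$,
(iii) $\rho_\tau(a + bX, c + dY) = \rho_\tau(X, Y)$ if $b > 0$ and $d > 0$, $a, c \in \mathbb{R}$,
(iv) if $X$ and $Y$ are independent, $\rho_\tau(X, Y) = 0$.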
Thanks to Theorem 2.4 (i), we can write $\rho_\tau(X, Y)$ simply as $\rho_\tau$, a convention adopted in the remainder of the paper. According to properties (ii) - (iv), we have $\rho_\tau = \pm 1$ for perfectly linearly related ($X$, $Y$) and $\rho_\tau = 0$ for independent $X$ and $Y$, and we know that $\rho_\tau$ is invariant under linear transforms of $X$ or of $Y$ with positive slopes. The following theorem shows that $|\rho_\tau| \le 1$ for a wide class of distributions. We can therefore say that $\rho_\tau$ is a correlation measure of $(X, Y)$ for such a class.
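Commutativity, the values $\pm 1$ under a perfect linear relation, and invariance under positive-slope linear transforms can be checked on a sample analogue. The sketch below is an illustration under assumptions: it uses a brute-force sample version of the quantile correlation (enumerating lines through two data points), returns 0 when the product of sample slopes is not positive, and relies on hypothetical function names and simulated data.

```python
import math
import random

def quantile_slope(x, y, tau):
    """Slope minimizing the sample tau-check loss of y on x (brute force)."""
    n = len(x)
    best_loss, best_b = float("inf"), 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue
            b = (y[j] - y[i]) / (x[j] - x[i])
            a = y[i] - b * x[i]
            loss = 0.0
            for xk, yk in zip(x, y):
                e = yk - a - b * xk
                loss += e * tau if e >= 0 else e * (tau - 1.0)
            if loss < best_loss:
                best_loss, best_b = loss, b
    return best_b

def quantile_correlation(x, y, tau):
    """Signed geometric mean of the two sample slopes (0 if product <= 0)."""
    b_yx = quantile_slope(x, y, tau)
    b_xy = quantile_slope(y, x, tau)
    prod = b_yx * b_xy
    return math.copysign(math.sqrt(prod), b_yx) if prod > 0 else 0.0

random.seed(2)
x = [random.gauss(0, 1) for _ in range(40)]
y = [0.8 * a + 0.6 * random.gauss(0, 1) for a in x]
tau = 0.3

# Commutativity: the order of the arguments does not matter.
r_xy = quantile_correlation(x, y, tau)
r_yx = quantile_correlation(y, x, tau)

# Perfect linear relations give +1 or -1 according to the sign of the slope.
r_pos = quantile_correlation(x, [1.0 + 0.5 * a for a in x], tau)
r_neg = quantile_correlation(x, [3.0 - 2.0 * a for a in x], tau)

# Linear transforms with positive slopes leave the value unchanged.
r_affine = quantile_correlation([2.0 + 3.0 * a for a in x],
                                [-1.0 + 0.5 * b for b in y], tau)
print(r_xy, r_yx, r_pos, r_neg, r_affine)
```

Transforms with negative slopes are deliberately left out: reflecting a variable turns its lower tail into its upper tail, so the quantile level would change as well.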
Theorem 2.5
Assume has finite first moment. We have if either (i) or (ii) , , where , , , , .
The condition of Theorem 2.5 for needs to be discussed. We have if , if , if or if . The first one is the case in which conditional -quantile of a random variable is negatively associated with the other variable. From the second condition, we have in any case. The third one is a kind of symmetry of the residuals and , which is satisfied for the usual symmetric bivariate distributions such as bivariate normal, bivariate t, bivariate uniform and many others. We finally discuss the last condition . For skewed distributions having , we have for . This is a satisfactory aspect. For the distributions with , left tails of the distributions of , are heavier than right tails. Special important such examples are financial asset returns. For such distributions with heavier left tails, people are more interested in dependence in left tails than in right tails. For distribution having , even though Theorem 2.5 does not guarantee for , it does not mean for .
The following theorem characterizes a situation in which the $\tau$-quantile correlation coefficient $\rho_\tau$ is identical to the Pearson correlation coefficient $\rho$.
Theorem 2.6
Assume the conditional distribution of Y given X depends on $X$ only through $E[Y \mid X]$ and $E[Y \mid X]$ is linear in X. Assume the same for the conditional distribution of X given Y. Then $\rho_\tau = \rho$ for all $\tau \in (0,1)$.
An important special random vector satisfying the conditions of Theorem 2.6 is the bivariate normal random vector, for which we hence have $\rho_\tau = \rho$ for all $\tau \in (0,1)$. A random vector satisfying the conditions of Theorem 2.6 has no tail-specific dependence because the association between X and Y is exhausted by the linear conditional expectations $E[Y \mid X]$ and $E[X \mid Y]$.
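The bivariate normal case can be explored by simulation. The sketch below draws a bivariate normal sample with correlation 0.8 and computes a brute-force sample version of the quantile correlation at several quantile levels; the estimator, the sample size, the seed and all names are assumptions of this illustration, and the sample values are expected to scatter around 0.8 with noticeable sampling noise rather than to equal it.

```python
import math
import random

def quantile_slope(x, y, tau):
    """Slope minimizing the sample tau-check loss of y on x (brute force)."""
    n = len(x)
    best_loss, best_b = float("inf"), 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue
            b = (y[j] - y[i]) / (x[j] - x[i])
            a = y[i] - b * x[i]
            loss = 0.0
            for xk, yk in zip(x, y):
                e = yk - a - b * xk
                loss += e * tau if e >= 0 else e * (tau - 1.0)
            if loss < best_loss:
                best_loss, best_b = loss, b
    return best_b

def quantile_correlation(x, y, tau):
    """Signed geometric mean of the two sample slopes (0 if product <= 0)."""
    b_yx = quantile_slope(x, y, tau)
    b_xy = quantile_slope(y, x, tau)
    prod = b_yx * b_xy
    return math.copysign(math.sqrt(prod), b_yx) if prod > 0 else 0.0

# Bivariate normal sample with unit variances and correlation rho.
random.seed(3)
rho, n = 0.8, 100
z1 = [random.gauss(0, 1) for _ in range(n)]
z2 = [random.gauss(0, 1) for _ in range(n)]
x = z1
y = [rho * a + math.sqrt(1 - rho ** 2) * b for a, b in zip(z1, z2)]

rho_hat = {tau: quantile_correlation(x, y, tau) for tau in (0.1, 0.5, 0.9)}
print(rho_hat)  # values near 0.8 at every level, up to sampling noise
```

A roughly flat profile across quantile levels is what the absence of tail-specific dependence looks like in this setting; a clearly tilted profile would point toward tail dependence or asymmetry.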
2.2 Local dependence measure
This subsection starts with a couple of illustrative examples having a $\tau$-dependent quantile correlation coefficient $\rho_\tau$ whose shape reflects the tail-dependent degree of association between $X$ and $Y$. Next, it proposes a tail dependence measure and a tail asymmetry measure based on $\rho_\tau$.