About kernel-based estimation of conditional Kendall’s tau: finite-distance bounds and asymptotic behavior


Alexis Derumigny and Jean-David Fermanian, ENSAE, 5, avenue Henry Le Chatelier, 91764 Palaiseau cedex, France. alexis.derumigny@ensae.fr, jean-david.fermanian@ensae.fr. This research has been supported by the Labex Ecodec.
July 20, 2019
Abstract

We study nonparametric estimators of conditional Kendall's tau, a measure of concordance between two random variables given some covariates. We prove non-asymptotic bounds with explicit constants that hold with high probability. We provide "direct proofs" of the consistency and the asymptotic law of these estimators. A simulation study evaluates their numerical performance.

Keywords : conditional dependence measures, kernel smoothing, conditional Kendall’s tau

1 Introduction

In the field of dependence modeling, it is common to work with dependence measures. Contrary to usual linear correlations, most of them have the advantage of being defined without any condition on moments, and of being invariant to changes in the underlying marginal distributions. Such summaries of information are very popular and can be explicitly written as functionals of the underlying copulas: Kendall's tau, Spearman's rho, Blomqvist's coefficient, etc. See Nelsen [9] for an introduction. In particular, Kendall's tau is a well-known dependence measure in $[-1,1]$ which quantifies the positive or negative dependence between two random variables $X_1$ and $X_2$. Denoting by $C_{1,2}$ the (assumed unique) underlying copula of $(X_1,X_2)$, their Kendall's tau can be directly defined as

$$\tau_{1,2} := 4\int_{[0,1]^2} C_{1,2}(u_1,u_2)\, dC_{1,2}(u_1,u_2) - 1 = \mathbb{P}\big((X_1-X_1')(X_2-X_2')>0\big) - \mathbb{P}\big((X_1-X_1')(X_2-X_2')<0\big), \qquad (1)$$

where $(X_1,X_2)$ and $(X_1',X_2')$ are two independent versions of the same random vector. This measure is then interpreted as the probability of observing a concordant pair minus the probability of observing a discordant pair.

Similar dependence measures can be introduced in a conditional setup, when a $p$-dimensional covariate $\mathbf{Z}$ is available. The goal is now to model the dependence between the two components $X_1$ and $X_2$, given the vector of covariates $\mathbf{Z}$. Logically, we can invoke the conditional copula $C_{1,2|\mathbf{Z}=\mathbf{z}}$ of $(X_1,X_2)$ given $\mathbf{Z}=\mathbf{z}$, for any point $\mathbf{z}$ (see Patton [10, 11]), and the corresponding conditional Kendall's tau would simply be defined as
$$\tau_{1,2|\mathbf{Z}=\mathbf{z}} := \mathbb{P}\big((X_1-X_1')(X_2-X_2')>0 \,\big|\, \mathbf{Z}=\mathbf{Z}'=\mathbf{z}\big) - \mathbb{P}\big((X_1-X_1')(X_2-X_2')<0 \,\big|\, \mathbf{Z}=\mathbf{Z}'=\mathbf{z}\big),$$

where $(X_1,X_2,\mathbf{Z})$ and $(X_1',X_2',\mathbf{Z}')$ are two independent versions of the same vector. As above, this is the probability of observing a concordant pair minus the probability of observing a discordant pair, conditionally on $\mathbf{Z}$ and $\mathbf{Z}'$ both being equal to $\mathbf{z}$. Indeed, if $\mathbf{Z}$ and $\mathbf{Z}'$ took two different values, then concordant/discordant pairs would not bring any information about the dependence between $X_1$ and $X_2$ for any fixed value of the conditioning covariate. Note that, as conditional copulas themselves, conditional Kendall's taus are invariant w.r.t. increasing transformations of the conditional margins of $X_1$ and $X_2$, given $\mathbf{Z}=\mathbf{z}$.

Of course, if $(X_1,X_2)$ is independent of $\mathbf{Z}$ then, for every $\mathbf{z}$, the conditional Kendall's tau $\tau_{1,2|\mathbf{Z}=\mathbf{z}}$ is equal to the (unconditional) Kendall's tau $\tau_{1,2}$. A weaker sufficient condition for this to happen is the so-called "simplifying assumption" about the conditional copula, i.e. "$C_{1,2|\mathbf{Z}=\mathbf{z}}$ does not depend on the choice of $\mathbf{z}$", a key assumption for vine modeling in particular: see [1] or [8] for a discussion, and [3] for a review and a presentation of formal tests of this hypothesis. However, in general, there is no reason why the mapping $\mathbf{z}\mapsto\tau_{1,2|\mathbf{Z}=\mathbf{z}}$, or any other conditional dependence measure, should stay constant. Conditional Kendall's taus are also of interest per se, because they allow us to summarize the evolution of the dependence between $X_1$ and $X_2$ as the covariate $\mathbf{Z}$ changes. Some conditional dependence measures and their estimates were introduced in the literature a few years ago ([7], [15], [6]), but their properties have not yet been fully studied in depth. Note that our $\tau_{1,2|\mathbf{Z}=\mathbf{z}}$ should not be confused with the so-called "conditional Kendall's tau" for truncated data, as in Tsai [14].

Interestingly, dependence measures are of interest for the purpose of estimating copula models too. Indeed, several popular parametric families of copulas have a simple one-to-one mapping between their parameter and the associated Kendall's tau (or Spearman's rho): Gaussian, Student with a fixed degree of freedom, Clayton, Gumbel and Frank copulas, etc. Getting back to a conditional framework, assume that the conditional copula belongs to such a parametric family, say a Gaussian copula with a parameter $\rho(\mathbf{z})$. Then, by estimating the conditional Kendall's tau $\tau_{1,2|\mathbf{Z}=\mathbf{z}}$, we get an estimate of the corresponding parameter $\rho(\mathbf{z})$, and finally of the conditional copula itself.
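For the Gaussian copula, this one-to-one mapping is explicit: $\tau = (2/\pi)\arcsin(\rho)$. A minimal sketch of the plug-in step (the function names are ours):

```python
import math

def rho_from_tau(tau):
    """Gaussian-copula parameter implied by a Kendall's tau:
    rho = sin(pi * tau / 2)."""
    return math.sin(math.pi * tau / 2.0)

def tau_from_rho(rho):
    """Inverse mapping: tau = (2 / pi) * arcsin(rho)."""
    return 2.0 * math.asin(rho) / math.pi

# The two mappings are inverses of each other:
print(tau_from_rho(rho_from_tau(0.5)))  # 0.5 (up to rounding)
```

An estimated conditional Kendall's tau at a point $\mathbf{z}$ can thus be mapped to an estimate of the conditional copula parameter at that point.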

Until now, the theoretical properties of conditional Kendall's tau estimates have been obtained "in passing" in the literature, as a by-product of the weak convergence of conditional copula processes ([15]), or as intermediate quantities to be "plugged in" ([5]). Therefore, such properties have been stated under unnecessarily demanding assumptions, in particular assumptions related to the estimation of conditional margins, even though the latter is not required. In this paper, we directly study nonparametric estimators, without relying on the theory/inference of copulas. We thus establish the main usual properties of statistical estimates: exponential bounds in probability, consistency, and asymptotic normality.

In Section 2, different kernel-based estimators of the conditional Kendall's tau are proposed. In Section 3, the theoretical properties of these estimators are proved, first through finite-distance bounds and then from an asymptotic point of view. A short simulation study is provided in Section 4. Proofs are postponed to the appendix.

2 Definition of several kernel-based estimators of $\tau_{1,2|\mathbf{Z}=\mathbf{z}}$

Let $(X_{i,1}, X_{i,2}, \mathbf{Z}_i)$, $i=1,\dots,n$, be an i.i.d. sample distributed as $(X_1,X_2,\mathbf{Z})$, and fix a point $\mathbf{z}\in\mathbb{R}^p$. Assuming continuous underlying distributions, there are several equivalent ways of defining the conditional Kendall's tau:
$$\tau_{1,2|\mathbf{Z}=\mathbf{z}} = 4\,\mathbb{P}\big(X_1<X_1',\, X_2<X_2' \mid \mathbf{Z}=\mathbf{Z}'=\mathbf{z}\big) - 1 = 1 - 4\,\mathbb{P}\big(X_1<X_1',\, X_2>X_2' \mid \mathbf{Z}=\mathbf{Z}'=\mathbf{z}\big)$$
$$= \mathbb{P}\big((X_1-X_1')(X_2-X_2')>0 \mid \mathbf{Z}=\mathbf{Z}'=\mathbf{z}\big) - \mathbb{P}\big((X_1-X_1')(X_2-X_2')<0 \mid \mathbf{Z}=\mathbf{Z}'=\mathbf{z}\big).$$

Motivated by each of the latter expressions, we introduce several kernel-based estimators of $\tau_{1,2|\mathbf{Z}=\mathbf{z}}$:
$$\hat\tau^{(1)}_{1,2|\mathbf{z}} := 4\sum_{i\neq j} w_{i,n}(\mathbf{z})\, w_{j,n}(\mathbf{z})\, \mathbb{1}\{X_{i,1}<X_{j,1},\, X_{i,2}<X_{j,2}\} - 1,$$
$$\hat\tau^{(2)}_{1,2|\mathbf{z}} := \sum_{i\neq j} w_{i,n}(\mathbf{z})\, w_{j,n}(\mathbf{z})\, \Big[\mathbb{1}\big\{(X_{i,1}-X_{j,1})(X_{i,2}-X_{j,2})>0\big\} - \mathbb{1}\big\{(X_{i,1}-X_{j,1})(X_{i,2}-X_{j,2})<0\big\}\Big],$$
$$\hat\tau^{(3)}_{1,2|\mathbf{z}} := 1 - 4\sum_{i\neq j} w_{i,n}(\mathbf{z})\, w_{j,n}(\mathbf{z})\, \mathbb{1}\{X_{i,1}<X_{j,1},\, X_{i,2}>X_{j,2}\},$$

where $\mathbb{1}\{\cdot\}$ denotes the indicator function and $(w_{i,n}(\mathbf{z}))_{i=1,\dots,n}$ is a sequence of weights given by

$$w_{i,n}(\mathbf{z}) = \frac{K_h(\mathbf{Z}_i - \mathbf{z})}{\sum_{j=1}^n K_h(\mathbf{Z}_j - \mathbf{z})}, \qquad (2)$$

with $K_h(\cdot) := h^{-p}K(\cdot/h)$ for some kernel $K$ on $\mathbb{R}^p$, where $h=h(n)$ denotes a usual bandwidth sequence that tends to zero when $n\to\infty$. In this paper, we have chosen usual Nadaraya-Watson weights. Obviously, there are alternatives (local linear, Priestley-Chao, Gasser-Müller, etc., weights) that would lead to different theoretical results.
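As an illustration, the Nadaraya-Watson weights of (2) can be computed as follows (a sketch for the univariate case $p=1$ with a Gaussian kernel; the function name is ours):

```python
import numpy as np

def nw_weights(Z, z, h):
    """Nadaraya-Watson weights w_{i,n}(z) = K_h(Z_i - z) / sum_j K_h(Z_j - z),
    here with a Gaussian kernel K; the normalizing constant of K cancels out."""
    K = np.exp(-0.5 * ((Z - z) / h) ** 2)
    return K / K.sum()

Z = np.array([0.1, 0.2, 0.5, 0.9])
w = nw_weights(Z, z=0.2, h=0.1)
# The weights sum to one and give more mass to observations close to z:
print(w.sum(), w.argmax())
```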

The estimators $\hat\tau^{(1)}_{1,2|\mathbf{z}}$, $\hat\tau^{(2)}_{1,2|\mathbf{z}}$ and $\hat\tau^{(3)}_{1,2|\mathbf{z}}$ look similar, but they are nevertheless different, as shown in Proposition 1. These differences are due to the fact that all of them are affine transformations of a double-indexed sum over every pair $(i,j)$, including the diagonal terms for which $i=j$. The treatment of these diagonal terms is different for each of the three estimators defined above. Indeed, setting $s_n := \sum_{i=1}^n w_{i,n}^2(\mathbf{z})$, it can easily be proved that $\hat\tau^{(1)}_{1,2|\mathbf{z}}$ takes values in the interval $[-1,\, 1-2s_n]$, $\hat\tau^{(2)}_{1,2|\mathbf{z}}$ in $[-1+s_n,\, 1-s_n]$, and $\hat\tau^{(3)}_{1,2|\mathbf{z}}$ in $[-1+2s_n,\, 1]$. Moreover, there exists a direct relationship between these estimators, given by the following proposition.

Proposition 1.

Almost surely, $\hat\tau^{(1)}_{1,2|\mathbf{z}} = \hat\tau^{(2)}_{1,2|\mathbf{z}} - s_n = \hat\tau^{(3)}_{1,2|\mathbf{z}} - 2s_n$, where $s_n := \sum_{i=1}^n w_{i,n}^2(\mathbf{z})$.

This proposition is proved in Section A.1. As a consequence, we can easily rescale the previous estimators so that the new estimator takes values in the whole interval $[-1,1]$. This would yield
$$\tilde\tau_{1,2|\mathbf{z}} := \frac{\hat\tau^{(2)}_{1,2|\mathbf{z}}}{1-s_n}, \quad \text{with } s_n := \sum_{i=1}^n w_{i,n}^2(\mathbf{z}).$$

Note that none of the latter estimators depends on any estimation of the conditional marginal distributions. In other words, we only have to choose the weights $w_{i,n}(\mathbf{z})$ conveniently to obtain an estimator of the conditional Kendall's tau. This is coherent with the fact that conditional Kendall's taus are invariant with respect to conditional marginal distributions. Moreover, note that, in the definition of our estimators, the inequalities are strict (there are no terms corresponding to the cases $X_{i,1}=X_{j,1}$ or $X_{i,2}=X_{j,2}$). This is in line with the definition of (conditional) Kendall's tau itself through concordant/discordant pairs of observations.
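To fix ideas, here is a sketch (with our own helper name) that computes the three estimators and the rescaled version at a given point, from precomputed weights; it also illustrates the relationship of Proposition 1:

```python
import numpy as np

def conditional_taus(X1, X2, w):
    """Kernel-based estimators of the conditional Kendall's tau at a point z,
    given the weights w = (w_{i,n}(z))_i, which sum to one.  Diagonal pairs
    i = j are excluded and the inequalities are strict, as in the text."""
    n = len(w)
    conc = 0.0  # weighted mass of pairs with X_{i,1} < X_{j,1} and X_{i,2} < X_{j,2}
    disc = 0.0  # weighted mass of pairs with X_{i,1} < X_{j,1} and X_{i,2} > X_{j,2}
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if X1[i] < X1[j] and X2[i] < X2[j]:
                conc += w[i] * w[j]
            elif X1[i] < X1[j] and X2[i] > X2[j]:
                disc += w[i] * w[j]
    s_n = float(np.sum(w ** 2))
    tau1 = 4.0 * conc - 1.0         # values in [-1, 1 - 2 s_n]
    tau3 = 1.0 - 4.0 * disc         # values in [-1 + 2 s_n, 1]
    tau2 = tau1 + s_n               # Proposition 1: tau1 = tau2 - s_n = tau3 - 2 s_n
    tau_tilde = tau2 / (1.0 - s_n)  # rescaled version, values in [-1, 1]
    return tau1, tau2, tau3, tau_tilde

X1 = np.array([0.1, 0.4, 0.2, 0.9])
X2 = np.array([0.2, 0.5, 0.1, 1.3])
t1, t2, t3, tt = conditional_taus(X1, X2, np.full(4, 0.25))
print(t1, t2, t3, tt)  # 0.25 0.5 0.75 0.666...
```

With uniform weights, the weights play no role and the sketch reduces to the classical (unconditional) sample Kendall's tau, up to the diagonal corrections discussed above.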

The definition of $\hat\tau^{(1)}_{1,2|\mathbf{z}}$ can be motivated as follows. For $k=1,2$, let $\hat F_{k|\mathbf{z}}$ be an estimator of the conditional cdf of $X_k$ given $\mathbf{Z}=\mathbf{z}$. Then, a usual estimator of the conditional copula of $X_1$ and $X_2$ given $\mathbf{Z}=\mathbf{z}$ is
$$\hat C_{1,2|\mathbf{z}}(u_1,u_2) := \sum_{i=1}^n w_{i,n}(\mathbf{z})\, \mathbb{1}\big\{\hat F_{1|\mathbf{z}}(X_{i,1}) \le u_1,\, \hat F_{2|\mathbf{z}}(X_{i,2}) \le u_2\big\}.$$

See, e.g., [15] or [6]. The latter estimator of the conditional copula can be plugged into (1) to define an estimator of the conditional Kendall's tau itself:

$$\check\tau_{1,2|\mathbf{z}} := 4\int_{[0,1]^2} \hat C_{1,2|\mathbf{z}}(u_1,u_2)\, d\hat C_{1,2|\mathbf{z}}(u_1,u_2) - 1. \qquad (3)$$

If the functions $\hat F_{k|\mathbf{z}}$ are increasing, this reduces to
$$\check\tau_{1,2|\mathbf{z}} = 4\sum_{i,j} w_{i,n}(\mathbf{z})\, w_{j,n}(\mathbf{z})\, \mathbb{1}\{X_{i,1} \le X_{j,1},\, X_{i,2} \le X_{j,2}\} - 1,$$
which coincides with $\hat\tau^{(1)}_{1,2|\mathbf{z}}$ up to the treatment of the diagonal terms $i=j$.

Veraverbeke et al. [15], Subsection 3.2, introduced their estimator of $\tau_{1,2|\mathbf{Z}=\mathbf{z}}$ through (3). By the functional Delta-Method, they deduced its asymptotic normality as a by-product of the weak convergence of the conditional copula process, when $\mathbf{Z}$ is univariate. In our case, we will obtain the theoretical properties of our estimators under weaker conditions, by a more direct analysis. We could justify $\hat\tau^{(3)}_{1,2|\mathbf{z}}$ in a similar way by considering conditional survival copulas.

Let us define the concordance score function $g$ by
$$g(\mathbf{v}_i, \mathbf{v}_j) := \mathbb{1}\big\{(x_{i,1}-x_{j,1})(x_{i,2}-x_{j,2})>0\big\} - \mathbb{1}\big\{(x_{i,1}-x_{j,1})(x_{i,2}-x_{j,2})<0\big\},$$

where, for $i=1,\dots,n$, we set $\mathbf{v}_i := (x_{i,1}, x_{i,2}, \mathbf{z}_i)$ and $\mathbf{V}_i := (X_{i,1}, X_{i,2}, \mathbf{Z}_i)$. Clearly, $\hat\tau^{(2)}_{1,2|\mathbf{z}} = \sum_{i\neq j} w_{i,n}(\mathbf{z})\, w_{j,n}(\mathbf{z})\, g(\mathbf{V}_i,\mathbf{V}_j)$ is a smoothed estimator of $\mathbb{E}[g(\mathbf{V}_1,\mathbf{V}_2) \mid \mathbf{Z}_1=\mathbf{Z}_2=\mathbf{z}] = \tau_{1,2|\mathbf{Z}=\mathbf{z}}$. The choice of the bandwidth $h$ can be done in a data-driven way following the general conditional U-statistics framework detailed in Dony and Mason [4, Section 2]. Indeed, for any $i \neq j$, denote by $\hat\tau^{(2),(i,j)}_{1,2|\mathbf{z}}$ the estimator that is made with our dataset from which the $i$-th and $j$-th observations have been removed. As a consequence, the random function $\hat\tau^{(2),(i,j)}_{1,2|\cdot}$ is independent of $(\mathbf{V}_i, \mathbf{V}_j)$. The bandwidth can then be chosen as the minimizer of the cross-validation criterion
$$CV(h) := \frac{1}{n(n-1)} \sum_{i\neq j} \Big( g(\mathbf{V}_i,\mathbf{V}_j) - \hat\tau^{(2),(i,j)}_{1,2|\mathbf{Z}=\mathbf{Z}_i} \Big)^2,$$

for $h$ in a given finite grid of bandwidths. A similar criterion can be proposed for the rescaled version $\tilde\tau_{1,2|\mathbf{z}}$.
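A leave-two-out implementation of such a criterion can be sketched as follows (the evaluation point $\mathbf{Z}_i$, the Gaussian kernel and the normalization are our reading, for a univariate covariate):

```python
import numpy as np

def tau2_hat(X1, X2, Z, z, h):
    """tau^(2)-type estimator with Gaussian Nadaraya-Watson weights."""
    K = np.exp(-0.5 * ((Z - z) / h) ** 2)
    w = K / K.sum()
    d = np.sign(X1[:, None] - X1[None, :]) * np.sign(X2[:, None] - X2[None, :])
    return float(w @ d @ w)  # diagonal terms vanish since d is zero there

def cv_criterion(X1, X2, Z, h):
    """Leave-two-out cross-validation score for the bandwidth h: the
    concordance score of each pair (i, j) is compared with the estimator
    computed without observations i and j, evaluated at Z_i."""
    n = len(Z)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            keep = np.ones(n, dtype=bool)
            keep[[i, j]] = False
            g_ij = np.sign(X1[i] - X1[j]) * np.sign(X2[i] - X2[j])
            total += (g_ij - tau2_hat(X1[keep], X2[keep], Z[keep], Z[i], h)) ** 2
    return total / (n * (n - 1))

rng = np.random.default_rng(1)
Z = rng.uniform(size=12)
X1, X2 = rng.standard_normal(12), rng.standard_normal(12)
scores = {h: cv_criterion(X1, X2, Z, h) for h in (0.1, 0.2, 0.4)}
h_star = min(scores, key=scores.get)  # selected bandwidth on this toy sample
```

The double loop makes the cost of this criterion quadratic in $n$ (with an $O(n^2)$ estimator inside), so in practice one would restrict the sum to a random subset of pairs for large samples.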

3 Theoretical results

3.1 Finite-distance bounds

Hereafter, we will consider the behavior of the conditional Kendall's tau estimators when $\mathbf{z}$ belongs to some fixed open subset $\mathcal{Z}$ of $\mathbb{R}^p$. For the moment, let us state an instrumental result that is of interest per se. Let $\hat f_{\mathbf{Z}}(\mathbf{z}) := n^{-1}\sum_{i=1}^n K_h(\mathbf{Z}_i-\mathbf{z})$ be the usual kernel estimator of the density $f_{\mathbf{Z}}$ of the conditioning variable $\mathbf{Z}$. Note that our estimators are well-behaved only whenever $\hat f_{\mathbf{Z}}(\mathbf{z})>0$. Denote the joint density of $(X_1,X_2,\mathbf{Z})$ by $f_{X_1,X_2,\mathbf{Z}}$. In our study, we need some usual regularity conditions.

Assumption 3.1.

The kernel $K$ is bounded; set $\|K\|_\infty := \sup_{\mathbf{u}}|K(\mathbf{u})|$. It is symmetrical and satisfies $\int K(\mathbf{u})\,d\mathbf{u} = 1$ and $\int K^2(\mathbf{u})\,d\mathbf{u} < \infty$. This kernel is of order $\alpha$ for some integer $\alpha$: for all $j=1,\dots,\alpha-1$ and all indices $i_1,\dots,i_j$ in $\{1,\dots,p\}$, $\int K(\mathbf{u})\, u_{i_1}\cdots u_{i_j}\, d\mathbf{u} = 0$. Moreover, $\int |K(\mathbf{u})|\,|\mathbf{u}|^\alpha\, d\mathbf{u} < \infty$.

Assumption 3.2.

$f_{\mathbf{Z}}$ is $\alpha$-times continuously differentiable on $\mathcal{Z}$ and there exists a finite constant $C_{\mathbf{Z}}>0$ s.t., for all $\mathbf{z}\in\mathcal{Z}$, every partial derivative of $f_{\mathbf{Z}}$ of order $\alpha$ is bounded by $C_{\mathbf{Z}}$ in absolute value at $\mathbf{z}$.

Assumption 3.3.

There exist two positive constants $f_{\min}$ and $f_{\max}$ such that, for every $\mathbf{z}\in\mathcal{Z}$, $f_{\min} \le f_{\mathbf{Z}}(\mathbf{z}) \le f_{\max}$.

Proposition 2.

Under Assumptions 3.1-3.3 and if $h$ is small enough, for any $\mathbf{z}\in\mathcal{Z}$, the estimator $\hat f_{\mathbf{Z}}(\mathbf{z})$ is strictly positive with a probability larger than $1 - 2\exp(-Cnh^p)$, for some explicit constant $C>0$ that depends only on the kernel and on the bounds for $f_{\mathbf{Z}}$.

The latter proposition is proved in Section A.2. It guarantees that our estimators $\hat\tau^{(k)}_{1,2|\mathbf{z}}$, $k=1,2,3$, are well-behaved with a probability close to one.

Assumption 3.4.

For every $(x_1,x_2)\in\mathbb{R}^2$, the function $\mathbf{z} \mapsto F_{X_1,X_2|\mathbf{Z}=\mathbf{z}}(x_1,x_2)\, f_{\mathbf{Z}}(\mathbf{z})$ is differentiable on $\mathcal{Z}$ almost everywhere up to the order $\alpha$. Assume that these partial derivatives of order $\alpha$ are integrable and that there exists a finite constant $C_{X,\mathbf{Z}}$ such that, for every $\mathbf{z}\in\mathcal{Z}$, every $(x_1,x_2)\in\mathbb{R}^2$ and all indices $i_1,\dots,i_\alpha$ in $\{1,\dots,p\}$,
$$\Big| \frac{\partial^\alpha}{\partial z_{i_1}\cdots\partial z_{i_\alpha}}\, F_{X_1,X_2|\mathbf{Z}=\mathbf{z}}(x_1,x_2)\, f_{\mathbf{Z}}(\mathbf{z}) \Big|$$
is less than $C_{X,\mathbf{Z}}$.

Proposition 3 (Exponential bound for the estimated conditional Kendall’s tau).

Under Assumptions 3.1-3.4, for every $\mathbf{z}\in\mathcal{Z}$ such that $f_{\mathbf{Z}}(\mathbf{z})>0$ and every $t>0$, we have an exponential inequality of the form
$$\mathbb{P}\Big( \big| \hat\tau^{(k)}_{1,2|\mathbf{z}} - \tau_{1,2|\mathbf{Z}=\mathbf{z}} \big| > t \Big) \le C_1 \exp\big(-C_2\, nh^p t^2\big),$$
for any $k\in\{1,2,3\}$ and every $t$ larger than the bias terms, with explicit constants $C_1$ and $C_2$.

Remark 4.

In Propositions 2 and 3, the lower bound $f_{\min}$ can be replaced by the true density $f_{\mathbf{Z}}(\mathbf{z})$ when it is positive. Moreover, when the support of $K$ is included in a ball $B(0,R)$ for some $R>0$, $f_{\min}$ can be replaced by a local bound $\inf_{\mathbf{z}'\in \bar B(\mathbf{z},Rh)} f_{\mathbf{Z}}(\mathbf{z}')$, denoting by $\bar B(\mathbf{z},r)$ the closed ball of center $\mathbf{z}$ and radius $r$, whenever $\bar B(\mathbf{z},Rh)\subset\mathcal{Z}$. Finally, if it is not guaranteed that $f_{\mathbf{Z}}(\mathbf{z})$ is positive, the results above still apply, replacing $f_{\mathbf{Z}}(\mathbf{z})$ by $\hat f_{\mathbf{Z}}(\mathbf{z})$ in the denominators.

This proposition is proved in Section A.3. As a corollary, it yields the weak consistency of $\hat\tau^{(k)}_{1,2|\mathbf{z}}$ for every $k$, under the assumptions of Proposition 3 and if $nh^p\to\infty$ (take $t$ sufficiently small). In the next section, some asymptotic results will be stated, including consistency under weaker assumptions.

3.2 Asymptotic behavior

Proposition 5 (Consistency).

Under Assumption 3.1, if $h\to 0$ and $nh^p\to\infty$ when $n\to\infty$, and if $f_{\mathbf{Z}}$ and $\mathbf{z}'\mapsto\tau_{1,2|\mathbf{Z}=\mathbf{z}'}$ are continuous on $\mathcal{Z}$, then $\hat\tau^{(k)}_{1,2|\mathbf{z}}$ tends to $\tau_{1,2|\mathbf{Z}=\mathbf{z}}$ in probability when $n\to\infty$, for any $\mathbf{z}\in\mathcal{Z}$ and any $k=1,2,3$.

This property is proved in Section A.4.

Proposition 6 (Uniform consistency).

Under Assumption 3.1, assume that $h\to 0$ and $nh^p/\log n\to\infty$ when $n\to\infty$, that $\mathbf{z}\mapsto\tau_{1,2|\mathbf{Z}=\mathbf{z}}$ is Lipschitz, that $f_{\mathbf{Z}}$ and $\mathbf{z}\mapsto\tau_{1,2|\mathbf{Z}=\mathbf{z}}$ are continuous on a bounded set $\mathcal{Z}$, and that there exists a lower bound $f_{\min}>0$ s.t. $f_{\mathbf{Z}}(\mathbf{z})\ge f_{\min}$ for any $\mathbf{z}\in\mathcal{Z}$. Then $\sup_{\mathbf{z}\in\mathcal{Z}} \big|\hat\tau^{(k)}_{1,2|\mathbf{z}} - \tau_{1,2|\mathbf{Z}=\mathbf{z}}\big| \to 0$ almost surely when $n\to\infty$, for any $k=1,2,3$.

This property is proved in Section A.5. To derive the asymptotic law of these estimators, we will assume:

Assumption 3.5.

(i) $nh^p\to\infty$ and $nh^{p+2\alpha}\to 0$, as $n\to\infty$; (ii) the kernel $K$ is compactly supported.

Proposition 7 (Joint asymptotic normality at different points).

Let $\mathbf{z}_1',\dots,\mathbf{z}_m'$ be fixed points in a set $\mathcal{Z}$. Assume Assumptions 3.1, 3.4 and 3.5, that the $\mathbf{z}_k'$ are distinct and that $f_{\mathbf{Z}}$ and $\mathbf{z}\mapsto\tau_{1,2|\mathbf{Z}=\mathbf{z}}$ are continuous on $\mathcal{Z}$, with $f_{\mathbf{Z}}(\mathbf{z}_k')>0$ for every $k$. Then, as $n\to\infty$, the vector $\big(\sqrt{nh^p}\,(\hat\tau^{(\ell)}_{1,2|\mathbf{z}_k'} - \tau_{1,2|\mathbf{Z}=\mathbf{z}_k'})\big)_{k=1,\dots,m}$ converges in law to a centered Gaussian vector with covariance matrix $\mathbb{H}$,

where $\mathbb{H}$ is a real $m\times m$ matrix whose entries involve $f_{\mathbf{Z}}(\mathbf{z}_k')$, the kernel $K$ and conditional moments of the concordance score, computed with independent versions of the underlying random vectors; its explicit expression is displayed in the proof.

This proposition is proved in Section A.6.

Remark 8.

The results of Propositions 5 and 7 apply to the "rescaled" estimator $\tilde\tau_{1,2|\mathbf{z}}$ too. Indeed, under our assumptions, it can easily be proved by Markov's inequality that $\sqrt{nh^p}\,s_n$ tends to zero in probability, where $s_n := \sum_{i=1}^n w_{i,n}^2(\mathbf{z})$. Then, by Slutsky's theorem, we get an asymptotic equivalence between the limiting laws of our estimated Kendall's taus $\hat\tau^{(k)}_{1,2|\mathbf{z}}$, $k=1,2,3$, and of the linearly transformed version.

4 Simulation study

In this simulation study, we draw i.i.d. random samples $(X_{i,1}, X_{i,2}, Z_i)$, $i=1,\dots,n$, with univariate explanatory variables ($p=1$). We consider two settings, which correspond to bounded and unbounded explanatory variables respectively:

  1. $\mathcal{Z}=[0,1]$ and the law of $Z$ is uniform on $[0,1]$. Conditionally on $Z=z$, $X_1$ and $X_2$ both follow a Gaussian distribution whose parameters may depend on $z$. Their associated conditional copula is Gaussian and their conditional Kendall's tau is given by $\tau_{1,2|Z=z} = (2/\pi)\arcsin\big(\rho(z)\big)$, for a chosen correlation function $\rho(\cdot)$.

  2. $\mathcal{Z}=\mathbb{R}$ and the law of $Z$ is Gaussian. Conditionally on $Z=z$, $X_1$ and $X_2$ both follow a Gaussian distribution whose parameters depend on $\Phi(z)$, where $\Phi$ is the cdf of the $\mathcal{N}(0,1)$ distribution. Their associated conditional copula is Gaussian and their conditional Kendall's tau is given by $\tau_{1,2|Z=z} = (2/\pi)\arcsin\big(\rho(\Phi(z))\big)$.

These simple frameworks allow us to compare the numerical properties of our different estimators in different parts of the space, in particular when $z$ (or $\Phi(z)$) is close to zero or one, i.e. when the conditional Kendall's tau is close to $-1$ or to $+1$. We compute the three estimators $\hat\tau^{(k)}_{1,2|z}$, $k=1,2,3$, and the symmetrically rescaled version $\tilde\tau_{1,2|z}$. The bandwidth $h$ is chosen among several values, for each sample size $n$. For each setting, we consider three local measures of goodness-of-fit: for a given $z$ and for any Kendall's tau estimate $\hat\tau_{1,2|z}$, let

  • the (local) bias: $\mathbb{E}\big[\hat\tau_{1,2|z}\big] - \tau_{1,2|z}$,

  • the (local) standard deviation: $\Big(\mathbb{E}\big[\big(\hat\tau_{1,2|z} - \mathbb{E}[\hat\tau_{1,2|z}]\big)^2\big]\Big)^{1/2}$,

  • the (local) mean square error: $\mathbb{E}\big[\big(\hat\tau_{1,2|z} - \tau_{1,2|z}\big)^2\big]$.

We also consider their integrated versions w.r.t. the Lebesgue measure on the whole support of $Z$, respectively denoted by IBias, ISd and IMSE. Some results concerning these integrated measures are given in Table 1 (resp. Table 2) for Setting 1 (resp. Setting 2), and for different choices of $n$ and $h$. For the effective calculation of these measures, all the previous theoretical expectations are replaced by their empirical counterparts, based on simulations.
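For concreteness, the following sketch mimics Setting 1 with the illustrative choice $\rho(z) = 2z-1$ (our own choice, not necessarily the one used for the tables) and evaluates $\hat\tau^{(2)}_{1,2|z}$ at one point:

```python
import numpy as np

rng = np.random.default_rng(0)

def rho(z):
    """Illustrative conditional correlation (our choice, for the sketch)."""
    return 2.0 * z - 1.0

def simulate(n):
    """Setting-1-like data: Z uniform on [0, 1]; given Z = z, (X1, X2) is
    bivariate Gaussian with correlation rho(z), hence a Gaussian conditional
    copula with tau(z) = (2 / pi) * arcsin(rho(z))."""
    Z = rng.uniform(size=n)
    e1, e2 = rng.standard_normal(n), rng.standard_normal(n)
    r = rho(Z)
    return e1, r * e1 + np.sqrt(1.0 - r ** 2) * e2, Z

def tau2_hat(X1, X2, Z, z, h):
    """tau^(2)-type estimator with Gaussian Nadaraya-Watson weights."""
    K = np.exp(-0.5 * ((Z - z) / h) ** 2)
    w = K / K.sum()
    d = np.sign(X1[:, None] - X1[None, :]) * np.sign(X2[:, None] - X2[None, :])
    return float(w @ d @ w)

X1, X2, Z = simulate(1000)
z = 0.5                                 # here rho(z) = 0, so the true tau is 0
est = tau2_hat(X1, X2, Z, z, h=0.1)
true = 2.0 / np.pi * np.arcsin(rho(z))  # = 0.0
print(round(est, 2), true)
```

Repeating such draws and averaging over a grid of $z$ values yields empirical counterparts of the local and integrated bias, standard deviation and MSE described above.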

For every $n$, the best results seem to be obtained with the fourth (rescaled) estimator $\tilde\tau_{1,2|z}$, particularly in terms of bias. This is not so surprising, because the estimators $\hat\tau^{(k)}_{1,2|z}$, $k=1,2,3$, do not have the right support at a finite distance. Note that this comparative advantage of $\tilde\tau_{1,2|z}$ in terms of bias decreases with $n$, as expected. In terms of integrated variance, all the considered estimators behave more or less similarly, particularly for large $n$.

To illustrate our results for Setting 1 (resp. Setting 2), the local bias, standard deviation and MSE have been plotted in Figures 1-2 (resp. Figures 3-4), with our empirically optimal choice of bandwidth. We can note that, considering the bias, the estimator $\tilde\tau_{1,2|z}$ behaves similarly to $\hat\tau^{(1)}_{1,2|z}$ when the true $\tau_{1,2|z}$ is close to $-1$, and similarly to $\hat\tau^{(3)}_{1,2|z}$ when the true Kendall's tau is close to $+1$. But globally, the best pointwise estimator is clearly the rescaled version $\tilde\tau_{1,2|z}$, after a quick inspection of the MSE levels, even if the differences between our four estimators weaken for large sample sizes. The comparative advantage of $\tilde\tau_{1,2|z}$ appears more clearly with Setting 2 than with Setting 1. Indeed, in the former case, the support of $Z$'s distribution is the whole real line. Then $\tilde\tau_{1,2|z}$ no longer suffers from the boundary bias phenomenon, contrary to what happened with Setting 1. As a consequence, the biases induced by the definitions of $\hat\tau^{(1)}_{1,2|z}$ and $\hat\tau^{(3)}_{1,2|z}$ appear more strikingly in Figure 3, for instance: when $\Phi(z)$ is close to $0$ (resp. $1$), the biases of $\hat\tau^{(1)}_{1,2|z}$ (resp. $\hat\tau^{(3)}_{1,2|z}$) and $\tilde\tau_{1,2|z}$ are close, whereas the bias of $\hat\tau^{(3)}_{1,2|z}$ (resp. $\hat\tau^{(1)}_{1,2|z}$) is a lot larger. Since the squared biases are here significantly larger than the variances in the tails, $\tilde\tau_{1,2|z}$ provides the best estimator globally, considering "both sides" together. But even in the center of $Z$'s distribution, the latter estimator behaves very well.

IBias ISd IMSE | IBias ISd IMSE | IBias ISd IMSE | IBias ISd IMSE
(one column group per sample size $n$; within each block of four rows, the rows correspond to the estimators $\hat\tau^{(1)}$, $\hat\tau^{(2)}$, $\hat\tau^{(3)}$ and $\tilde\tau$, and the successive blocks to different bandwidths $h$)

-133 197 66.5 -34.5 84.9 9.86 -18.2 61.6 4.85 -10.9 46 2.65
-12.9 187 43.7 -4.08 84.4 8.58 -0.9 61.5 4.49 -1.07 46 2.53
107 190 56.6 26.4 84.5 9.26 16.4 61.5 4.76 8.8 46 2.6
-0.91 213 48.2 -1.18 86.9 8.55 0.733 62.4 4.46 -0.149 46.4 2.5

-88 150 35.8 -26.3 68 6.32 -13.9 50.7 3.33 -7.98 37.6 1.8
-10.4 145 26.3 -5.97 67.9 5.6 -2.33 50.6 3.12 -1.39 37.5 1.74
67.2 146 30.6 14.3 67.9 5.75 9.2 50.6 3.19 5.2 37.5 1.76
-2.06 157 26.7 -3.99 69.2 5.49 -1.21 51.2 3.05 -0.76 37.8 1.69

-67.8 123 24.5 -19.2 58.7 4.8 -11 43.1 2.52 -6.34 33 1.44
-9.99 121 19 -3.95 58.6 4.39 -2.35 43.1 2.39 -1.39 33 1.4
47.8 122 20.9 11.3 58.7 4.47 6.34 43.1 2.41 3.57 33 1.41
-3.48 128 18.1 -2.34 59.5 4.18 -1.46 43.4 2.29 -0.897 33.2 1.35

-44.6 101 17.5 -15.9 50.4 4.12 -9.7 35.9 2.13 -5.52 27.6 1.28
-5.81 100 14.9 -5.68 50.3 3.84 -3.84 35.9 2.02 -2.18 27.6 1.24
33 101 15.5 4.58 50.3 3.77 2.01 35.9 1.99 1.15 27.6 1.23
-1.09 104 13.4 -4.55 50.8 3.57 -3.19 36.1 1.9 -1.83 27.7 1.18

-37.8 91.4 17.3 -11.8 43.8 4.14 -7.2 31.2 2.35 -5.97 23.7 1.43
-8.03 91.4 15.4 -3.93 43.8 3.94 -2.75 31.2 2.28 -3.44 23.7 1.39
21.7 91.7 15.4 3.91 43.8 3.87 1.7 31.2 2.24 -0.912 23.7 1.37
-4.5 94.2 13.5 -3.01 44.1 3.62 -2.24 31.3 2.12 -3.16 23.8 1.32
Table 1: Results of the simulation in Setting 1. All values have been multiplied by 1000. Bold values indicate optimal choices for the chosen measure of performance.
Figure 1: Local bias, standard deviation and MSE for the estimators $\hat\tau^{(1)}_{1,2|z}$ (red), $\hat\tau^{(2)}_{1,2|z}$ (blue), $\hat\tau^{(3)}_{1,2|z}$ (green) and $\tilde\tau_{1,2|z}$ (orange), in Setting 1. The dotted line on the first panel is the reference at 0.
Figure 2: Local bias, standard deviation and MSE for the estimators $\hat\tau^{(1)}_{1,2|z}$ (red), $\hat\tau^{(2)}_{1,2|z}$ (blue), $\hat\tau^{(3)}_{1,2|z}$ (green) and $\tilde\tau_{1,2|z}$ (orange), with a larger sample size, in Setting 1. The dotted line on the first panel is the reference at 0.
IBias ISd IMSE | IBias ISd IMSE | IBias ISd IMSE | IBias ISd IMSE
(one column group per sample size $n$; within each block of four rows, the rows correspond to the estimators $\hat\tau^{(1)}$, $\hat\tau^{(2)}$, $\hat\tau^{(3)}$ and $\tilde\tau$, and the successive blocks to different bandwidths $h$)

-207 227 180 -54.1 83.9 16.9 -29.6 55.3 5.81 -16.9 38.9 2.49
1.15 207 97 0.845 80.5 10.8 0.557 54.4 4.35 0.145 38.6 2.04
210 228 181 55.7 83.2 16.4 30.7 55.4 5.9 17.2 38.9 2.5
1.4 225 51.9 0.987 81.4 6.86 0.456 55 3.22 0.175 38.9 1.66

-144 175 98.6 -33.3 60.6 7.5 -19.8 41.9 3.12 -10.6 30.5 1.42