In this paper we propose a generalization of a class of Gaussian Semiparametric Estimators (GSE) of the fractional differencing parameter for long-range dependent multivariate time series. We generalize a known GSE-type estimator by modifying, at the objective function level, the estimator of the process’ spectral density matrix. We study the large sample properties of the estimator, without assuming Gaussianity, as well as hypothesis testing. The class of models considered here satisfies simple conditions on the spectral density function, restricted to a small neighborhood of the zero frequency. This includes, but is not limited to, the class of VARFIMA models. A simulation study to assess the finite sample properties of the proposed estimator is presented and supports its competitiveness. We also present an empirical application to exchange rate data.
Keywords: Multivariate processes; Long-range dependence; Semiparametric estimation; VARFIMA processes; Asymptotic theory.
Mathematical Subject Classification (2010). Primary 62H12, 62F12, 62M10, 60G10, 62M15.
A Semiparametric Estimator for Long-Range Dependent Multivariate Processes
Guilherme Pumi (corresponding author) and Sílvia R.C. Lopes
Mathematics Institute - Federal University of Rio Grande do Sul - 9500, Bento Gonçalves Avenue - 91509-900, Porto Alegre - RS - Brazil.
E-mail addresses: firstname.lastname@example.org (G. Pumi), email@example.com (S.R.C. Lopes).
This version: 12/13/2012.
Semiparametric estimation of the fractional differencing parameter in multivariate long-range dependent time series has received growing attention in the last few years (see, for instance, Lobato, 1999, Andersen et al., 2003 and Chiriac and Voev, 2011). The first attempt to develop the theory of semiparametric estimation in the context of univariate long-range dependent time series seems to date back to the late 1980s with the work of Künsch (1987), who proposed a local Whittle-type estimator. The idea of the estimator is to locally model the behavior of the spectral density function in long-range dependent time series by locally approximating the frequency-domain Gaussian likelihood near the origin. These estimators comprise the widely applied class of Gaussian Semiparametric Estimators (GSE, for short). The asymptotic theory of the particular estimator proposed in Künsch (1987) posed real theoretical difficulties, given the non-linear definition of the estimator. The first asymptotic results were presented in Robinson (1995b) and later further studied in Velasco (1999), Phillips and Shimotsu (2004), among others. Several variants also emerged (Hurvich and Chen, 2000, Shimotsu and Phillips, 2005, among others).
The estimation of the fractional differencing parameter in long-range dependent time series initially focused on the fully parametric case. In the univariate case, the asymptotic theory of Whittle’s estimator was fully described in the work of Fox and Taqqu (1986) and Giraitis and Surgailis (1990), while the asymptotic theory of the exact maximum likelihood estimator was established by Dahlhaus (1989) and later extended to the multivariate case by Hosoya (1997), although Sowell (1989) had already studied the method in the context of VARFIMA processes. The computational cost of the exact maximum likelihood procedure in the multivariate case is high. A relatively fast approximation is studied in Luceño (1996) and, more recently, in Tsay (2010), both in the context of VARFIMA processes.
Fully parametric methods have important asymptotic properties such as efficiency, $\sqrt{n}$-consistency and asymptotic normality under the correct specification of the parametric model. The main criticisms of the method are its inconsistency under misspecification of the underlying parametric structure and the crucial role played by Gaussianity assumptions in the asymptotic theory (but see Giraitis and Surgailis, 1990), both contestable in real life applications. In this respect, the semiparametric approach presents advantages over the parametric one, such as weaker distributional requirements and robustness against misspecified short-run dependencies, although at the cost of some efficiency relative to a correctly specified parametric model. Another important advantage of the semiparametric approach is that Gaussianity is usually not assumed in the asymptotic theory.
The first rigorous treatment of a multivariate semiparametric estimator was given in Robinson (1995a). Lobato (1999) analyzes a two-step GSE based on a simple local approximation of the spectral density function in the neighborhood of the origin and derives its (Gaussian) asymptotic distribution. Shimotsu (2007) analyzes another multivariate GSE by considering a refinement of the local approximation considered in Lobato (1999) and, extending the techniques in Robinson (1995b), shows its consistency and asymptotic normality. Shimotsu (2007) also considers a “single-step” version of Lobato (1999)’s estimator and shows its consistency and asymptotic normality. The work of Shimotsu (2007) has recently been extended to cover non-stationary multivariate long-range dependent processes in Nielsen (2011).
In this paper we are interested in semiparametric estimation of the fractional differencing parameter in multivariate long-range dependent processes. The idea is to generalize the “single-step” version of Lobato (1999)’s GSE considered in Shimotsu (2007) by replacing the periodogram function, applied in defining the estimator’s objective function, by an arbitrary estimator of the spectral density. Although a useful tool in spectral analysis, it is well-known that the periodogram is an ill-behaved, inconsistent estimator of the spectral density. Seen as a random variable, it does not even converge to a random variable at all (cf. Grenander, 1951), being considered by some authors “an extremely poor (if not useless) estimate of the spectral density function” (Priestley, 1981, p.420). A natural question is whether we can improve the performance of the GSE by considering spectral density estimators other than the periodogram. Answering this question is the main focus of our study.
Our theoretical contribution is focused on large sample properties of the proposed estimator considering different classes of spectral density estimators. First, we consider objective functions with the periodogram replaced by consistent estimators of the spectral density function. We show the consistency of the resulting GSE under the same conditions on the process as considered in Shimotsu (2007). Second, we relax the consistency condition by considering spectral density estimators satisfying certain mild moment conditions and show the consistency of the related estimator. Third, we examine the asymptotic normality of the proposed estimator under a certain mild regularity condition on the spectral density estimator and under the same assumptions as in Shimotsu (2007). The limiting distribution turns out to be the same as that of the estimator considered in Shimotsu (2007). We also consider hypothesis testing related problems. Gaussianity is nowhere assumed in the asymptotic theory.
To exemplify the use and to assess the finite sample performance of the proposed estimator, we consider the particular cases where the smoothed periodogram and the tapered periodogram are applied as spectral density estimators. We perform a Monte Carlo simulation study based on the resulting GSE and compare it to the same estimator based on the periodogram itself. We also apply the estimators to a real data set.
The paper is organized as follows. In the next section we present some preliminary results and definitions needed in this work and introduce the proposed estimator. In Section 3, we study the consistency of the proposed estimator while in Section 4 we derive the estimator’s asymptotic distribution. In Section 5 we present a Monte Carlo simulation study in order to assess the estimator’s finite sample performance and compare it to the “single-step” version of Lobato’s estimator considered in Shimotsu (2007). The real data application is presented in Section 6 and the conclusions in Section 7. For the sake of presentation, the proofs of the results in this work are presented in Appendix A.
Let be a weakly stationary -dimensional process and let denote the spectral density matrix function of , so that
for . In Lobato (1999), the author considers processes for which the spectral density matrix satisfies the following local approximation
where , and is a symmetric positive definite real matrix.
Although specification (2.1) is somewhat general, as noted in Lobato (1999), several fractionally integrated models satisfy this condition. Each coordinate process of exhibits long-range dependence whenever the respective parameter , in the sense that the respective (unidimensional) spectral density function satisfies , as , for some constant and .
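To make the long-range dependence condition concrete, the following minimal sketch compares the spectral density of a univariate ARFIMA(0, d, 0) process with its power-law approximation near the zero frequency. It assumes the usual form of the condition, namely that the spectral density behaves like c λ^(-2d) as λ decreases to zero; the function names and the choice d = 0.3 are purely illustrative and are not taken from the paper.

```python
import math


def arfima_spectral_density(lam, d, sigma2=1.0):
    """Spectral density of a univariate ARFIMA(0, d, 0) process at lam != 0."""
    return sigma2 / (2.0 * math.pi) * abs(2.0 * math.sin(lam / 2.0)) ** (-2.0 * d)


def power_law(lam, d, sigma2=1.0):
    """Local approximation c * lam**(-2d), with c = sigma2 / (2 * pi)."""
    return sigma2 / (2.0 * math.pi) * lam ** (-2.0 * d)


if __name__ == "__main__":
    d = 0.3  # illustrative value in (0, 0.5)
    for lam in (1.0, 0.1, 0.01, 0.001):
        ratio = arfima_spectral_density(lam, d) / power_law(lam, d)
        print(f"lambda = {lam:7.3f}   f(lambda) / (c * lambda^(-2d)) = {ratio:.8f}")
```

The ratio approaches one as the frequency approaches zero, which is all that a local condition of this type requires; nothing is imposed away from the origin.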
be the discrete Fourier transform and the periodogram of at , respectively, where denotes the conjugate transpose of a complex matrix .
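As an illustration, the discrete Fourier transform and the periodogram matrix can be computed as sketched below. The normalization (2πn)^(-1/2) and the sign of the complex exponential follow the convention of Shimotsu (2007) and are assumptions here, since the displayed definition may differ in such details; the function name is hypothetical.

```python
import numpy as np


def dft_and_periodogram(x, m):
    """DFT and periodogram matrix of an (n, q) series at lambda_j = 2*pi*j/n, j = 1..m.

    Assumed convention: w(lam) = (2*pi*n)^(-1/2) * sum_{t=1}^{n} x_t * exp(i*t*lam)
    and I(lam) = w(lam) w(lam)^*, with ^* the conjugate transpose.
    """
    x = np.asarray(x, dtype=float)
    n, q = x.shape
    t = np.arange(1, n + 1)
    freqs = 2.0 * np.pi * np.arange(1, m + 1) / n
    phases = np.exp(1j * np.outer(freqs, t))        # shape (m, n)
    w = phases @ x / np.sqrt(2.0 * np.pi * n)       # shape (m, q)
    # Outer product of each DFT vector with its own conjugate: shape (m, q, q).
    periodogram = w[:, :, None] * np.conj(w[:, None, :])
    return freqs, w, periodogram
```

Each periodogram matrix is Hermitian with real, non-negative diagonal entries, which are the univariate periodograms of the coordinate processes.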
From the local form of the spectral density function (2.1) and the frequency domain Gaussian log-likelihood localized in the neighborhood of zero, Lobato (1999) introduces a two-step Gaussian semiparametric estimator for the parameter , henceforth denoted by . The estimator is a two-step optimization procedure based on the objective function
where , for , are the Fourier frequencies, , with denoting the sample size, and
where, as usual, denotes the real part of . For , let denote the univariate quasi-maximum likelihood estimate (QMLE) given in Robinson (1995b) obtained from the -th coordinate process. The first step is to obtain an initial estimate of by calculating the univariate QMLE for each coordinate process. Let . The final estimate is obtained by calculating
Naturally, in this case the estimator of the matrix in (2.1) is just .
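For reference, the objective function and the associated estimate of the matrix in (2.1) can be written as follows in the notation of Lobato (1999). The symbols below (R, \widehat{G}, \Lambda_j, I_n, q, m) are assumptions made for this sketch and may differ from the notation used elsewhere in this paper.

```latex
R(d) = \log\det \widehat{G}(d) - 2\sum_{k=1}^{q} d_k \,\frac{1}{m}\sum_{j=1}^{m}\log\lambda_j ,
\qquad
\widehat{G}(d) = \frac{1}{m}\sum_{j=1}^{m}
  \operatorname{Re}\!\left[\Lambda_j(d)^{-1}\, I_n(\lambda_j)\, \Lambda_j(d)^{-1}\right],
\qquad
\Lambda_j(d) = \operatorname{diag}\!\left(\lambda_j^{-d_1},\dots,\lambda_j^{-d_q}\right).
```

In this form, the two-step procedure evaluates a single correction step of the objective starting from the vector of univariate estimates, while the matrix estimate is obtained by plugging the final estimate of d into \widehat{G}.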
Under some mild conditions, Lobato (1999) shows that, when the spectral density function follows (2.1), the estimator (2.5) satisfies , as tends to infinity, where is the identity matrix and denotes the Hadamard product.
Notice that one can consider the same estimator based on the objective function (2.3) as a “single-step” estimator by solving the -dimensional optimization problem
where is the parameter space, usually some subset of . Estimator (2.6) was considered in detail in Shimotsu (2007). Arguably, a two-step procedure like the one necessary to obtain in (2.5) is computationally faster than a direct -dimensional optimization procedure such as (2.6). In the late 1990s, a direct multidimensional optimization procedure could be troublesome given the computational resources available to the general public at the time. Nowadays, however, with the advances in computer science and the development of faster CPUs, a direct optimization procedure such as (2.6) poses no difficulty in practice.
Shimotsu (2007) considered a more refined local approximation for the spectral density matrix, namely
and studied the asymptotic behavior of (2.6) under (2.7). Under some mild conditions, the author showed the consistency and asymptotic normality of the estimator (2.6) under (2.7) even though, in this case, the estimator is based on the misspecified model (2.1).
Now let be a weakly stationary -dimensional process and let denote its spectral density matrix function satisfying (2.7). Let denote an arbitrary estimator of based on the observations . Consider the objective function
where denotes the space of admissible estimates, usually a subset of . In this work we are interested in studying the asymptotic behavior and finite sample performance of the estimator (2.10) as a function of . The estimator (2.10) is a refinement of the estimator of Lobato (1999) and a generalization of the results on the estimator (2.6) presented in Shimotsu (2007). Our asymptotic study is divided into two main cases. First, we consider the case where is an arbitrary consistent estimator of the spectral density . Second, we consider the case where is an arbitrary estimator of satisfying a certain moment condition. The intersection between the two cases is not empty, as we shall discuss later on.
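The construction can be sketched in code as follows. The objective is written in the form used by Lobato (1999), with the periodogram replaced by an array of arbitrary spectral density estimates evaluated at the first m Fourier frequencies. The function names, the array layout and the crude grid search are assumptions made for illustration only; in practice a numerical optimizer would replace the grid.

```python
import itertools

import numpy as np


def gse_objective(d, freqs, f_hat):
    """Objective log det G(d) - 2 * sum_k d_k * mean_j log(lambda_j).

    d:     (q,) candidate fractional differencing parameters
    freqs: (m,) Fourier frequencies lambda_1, ..., lambda_m
    f_hat: (m, q, q) spectral density estimates at freqs (periodogram or other)
    """
    d = np.asarray(d, dtype=float)
    # Lambda_j(d)^{-1} = diag(lambda_j ** d_k) is real, so it commutes with Re[.].
    scale = freqs[:, None] ** d[None, :]                       # (m, q)
    g_hat = np.mean(scale[:, :, None] * f_hat.real * scale[:, None, :], axis=0)
    sign, logdet = np.linalg.slogdet(g_hat)
    if sign <= 0:
        return np.inf                                          # not positive definite
    return float(logdet - 2.0 * d.sum() * np.mean(np.log(freqs)))


def minimize_on_grid(freqs, f_hat, grid):
    """Crude illustrative minimizer: exhaustive search over grid**q."""
    q = f_hat.shape[1]
    best = min(itertools.product(grid, repeat=q),
               key=lambda d: gse_objective(d, freqs, f_hat))
    return np.array(best)
```

Passing the periodogram matrices as `f_hat` gives the single-step estimator based on the periodogram, while passing, say, smoothed or tapered periodogram matrices gives the corresponding generalized estimator; only the input array changes.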
3 Asymptotic Theory: Consistency
Let be a weakly stationary -dimensional process and let be its spectral density matrix satisfying (2.1) for a real, symmetric and positive definite matrix . Let be the true fractional differencing vector parameter. As usual, the sup-norm is denoted by and, to simplify the notation, we shall denote the -th row and the -th column of a matrix by and , respectively. Before proceeding with the asymptotic theory, let us state the necessary conditions for the consistency of the estimator.
Assumption A1. As ,
Assumption A2. The process is a causal linear process, that is,
where the innovation process is a (not necessarily uncorrelated) square integrable martingale difference, in the sense that
for all , where denotes the -field generated by . Also assume that there exist a random variable and a constant such that and , for all .
Assumption A3. With given in (3.1), define the function
In a neighborhood , , of the origin, we assume that is differentiable and
Assumption A4. As ,
Assumptions A1-A4 are the multivariate extensions of those in Robinson (1995b) considered in Shimotsu (2007). Assumption A1 describes the true spectral density matrix behavior at the origin. Replacing by makes asymptotically no difference, since . Assumption A2 regards the behavior of the innovation process which is assumed to be a not necessarily uncorrelated square integrable martingale difference uniformly dominated in probability by a square integrable random variable. Assumption A3 is a regularity condition often imposed in the parametric case as, for instance, in Fox and Taqqu (1986) and Giraitis and Surgailis (1990). Assumption A4 is minimal but necessary since must go to infinity for consistency, but must do so slower than in view of Assumption A1.
Assumptions A1-A4 are local ones and only concern the behavior of the spectral density in the vicinity of the origin. Outside a small neighborhood of the origin, no assumption on the spectral density is made (except, of course, for the integrability of the spectral density, implied by the weak stationarity of the process).
For , let be a -consistent estimator of the spectral density function (that is, , for all ), where is a closed subset of . Since the spectral density matrix of the process is unbounded at the zero frequency when , for some , there is no hope of finding a consistent estimator for it in this situation. Let
Next we consider the estimator (2.10) with , that is
In the next theorem we establish the consistency of (3.4) under Assumptions A1-A4 considering an -consistent spectral density function estimator. For the sake of a better presentation, the proofs of all results in the paper are postponed to Appendix A.
Theorem 3.1. Let be a weakly stationary -dimensional process and let be its spectral density matrix. Let be a -consistent estimator of for all , for , and let be as in (3.4). Assume that Assumptions A1-A4 hold and let . Then, , as .
We now study the problem of relaxing the -consistency of the spectral density estimator assumed in Theorem 3.1. We consider the class of estimators satisfying, for ,
for . A relatively simpler condition implying (3.5) is as follows:
Let be a weakly stationary -dimensional process and let be its spectral density matrix and assume that assumptions A1-A4 hold. Let be an estimator of satisfying for all and ,
where is given by (3.2) and denotes the periodogram function associated to , that is,
Then, for ,
where and satisfy
Thus, (3.5) holds.
The class of estimators satisfying (3.5) is non-empty since, for instance, both the ordinary and the tapered periodogram satisfy (3.6) in the form (see lemma 1 in Shimotsu, 2007 and Section 5.3 below). Condition (3.5) is just slightly more general than (3.6). For the periodogram, (3.5) can be seen directly as well, but it is more involved (see lemma 1 in Shimotsu, 2007). Condition (3.5) plays a crucial role in replacing the -consistency assumed in Theorem 3.1, as we shall see later. From an asymptotic point of view, (3.5) is a sufficient condition to prove the consistency of under Assumptions A1-A4. This is the content of the next theorem.
4 Asymptotic Distribution and Hypothesis Testing
Let be a weakly stationary -dimensional process and let be its spectral density matrix. Suppose that satisfies (2.1) for a real, symmetric and positive definite matrix . Let be the true fractional differencing vector parameter. Assume that the following assumptions are satisfied:
Assumption B1. For and ,
Assumption B2. Assumption A2 holds and the process has finite fourth moment.
Assumption B3. Assumption A3 holds.
Assumption B4. For any ,
Assumption B5. There exists a finite real matrix such that
Assumption B1 is a smoothness condition on the behavior of the spectral density function near the origin. It is slightly stronger than Assumption A1 and is often imposed in spectral analysis. Assumption B2 imposes that the process is linear with finite fourth moment. This restriction on the innovation process is necessary since the proof of Theorem 4.1 depends on a CLT-type result for a martingale difference sequence defined as a quadratic form involving , which must have finite variance. Assumption B4 is the same as assumption 4’ in Shimotsu (2007). In particular, it implies that , for . Assumption B5 is the same as assumption 5’ in Shimotsu (2007) and it is a mild regularity condition on the degree of approximation of by . It is satisfied by general VARFIMA processes.
Let be a weakly stationary -dimensional process. Let be its spectral density matrix and assume that assumptions B1-B5 hold, with B4 holding for . Let be an estimator of satisfying
for all and . Then,
The next theorem establishes the asymptotic normality of estimator (2.10) under Assumptions B1-B5 considering the class of estimators satisfying (4.1). To make the presentation simpler, let us define the matrices
Theorem 4.1. Let be a weakly stationary -dimensional process, let be its spectral density matrix and assume that assumptions B1-B5 hold, with B4 holding for . Let be an estimator of satisfying (4.1), for all and . If , for , then
as tends to infinity, where
and the identity matrix. Furthermore, .
The asymptotic variance-covariance matrix takes a cumbersome form. A simple case occurs when in which case and , the variance-covariance matrix of the limiting distribution of Lobato (1999)’s two-step estimator. Also, by Theorem 4.1, is not a consistent estimator of . However, since the -th element of is , a consistent estimator of the matrix under the hypothesis of Theorem 4.1 is
This result allows for hypothesis testing. First, let denote the matrix defined in the same way as is defined in Theorem 4.1, but with in place of . It follows that, under the hypothesis of Theorem 4.1, . Let and let be a non-zero real matrix and . Consider testing a set of (independent) linear restrictions on of the form
Assuming the conditions of Theorem 4.1, under , the test statistic
is asymptotically distributed as a distribution. As particular cases we have: testing the process for a common fractional differencing parameter, in which case with dimension and , where is a vector composed of zeros; testing if the process is , in which case and is a vector of zeros. Notice that in the particular case where is the periodogram , and thus, , (4.5) is also valid by theorem 3 in Shimotsu (2007).
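A test of this kind can be sketched as follows, assuming that the statistic has the usual Wald form, as in Shimotsu (2007): m times a quadratic form in the estimated restrictions, weighted by the inverse of the restricted asymptotic covariance. The argument names (`d_hat`, `omega_hat`, `restriction`, `target`) are hypothetical, and the estimate of the matrix appearing in the limiting distribution is taken as given.

```python
import numpy as np


def wald_statistic(d_hat, omega_hat, m, restriction, target):
    """Wald-type statistic for the linear restrictions  restriction @ d = target.

    Assumed form: m * (R d_hat - v)' [R Omega_hat^{-1} R']^{-1} (R d_hat - v),
    where Omega_hat^{-1} estimates the asymptotic covariance of sqrt(m) * (d_hat - d0).
    """
    d_hat = np.asarray(d_hat, dtype=float)
    R = np.atleast_2d(np.asarray(restriction, dtype=float))
    v = np.atleast_1d(np.asarray(target, dtype=float))
    diff = R @ d_hat - v
    cov = R @ np.linalg.solve(omega_hat, R.T)      # R Omega^{-1} R'
    return float(m * diff @ np.linalg.solve(cov, diff))


if __name__ == "__main__":
    # Illustrative numbers only: test d_1 = d_2 in a bivariate process.
    stat = wald_statistic([0.2, 0.4], 4.0 * np.eye(2), m=100,
                          restriction=[[1.0, -1.0]], target=[0.0])
    print(stat)  # compare with a chi-square quantile with 1 degree of freedom
```

For a common fractional differencing parameter in a q-dimensional process, the restriction matrix would have q - 1 rows, each contrasting two consecutive coordinates, and the statistic would be compared with a chi-square quantile with q - 1 degrees of freedom.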
5 Simulation Study
In this section we present a Monte Carlo simulation study to assess the finite sample performance of the estimator (2.10) and compare it to (2.6). In order to do that, we apply the tapered periodogram and the smoothed periodogram as the spectral density estimator in (2.9). Before presenting the simulation study, let us recall some facts on the smoothed and tapered periodograms.
5.1 The Smoothed Periodogram
Let be a weakly stationary -dimensional process. Let be an array of functions (called weight functions) and be an increasing sequence of positive integers. For a Fourier frequency , we define the smoothed periodogram of by
where is given by (2.2) and denotes the Hadamard product. If, for some and , , we take as having period . In practice, at zero frequency, we use a slightly different estimate, namely,
When the process is long-range dependent, its spectral density presents a pole at zero frequency, so that some authors take the summation in (5.1) restricted to . In finite samples, however, is always well defined since the process is finite with probability one. More details on the smoothed periodogram can be found, for instance, in Priestley (1981) and references therein.
The smoothed periodogram as defined in (5.1) is a multivariate extension of the univariate smoothed periodogram. The use of the Hadamard product in the definition allows the use of different weight functions across different components of the process, accommodating, in this manner, the necessity often observed in practice of modeling different spectral density characteristics, including cross spectrum ones, with different weight functions. We refrain from discussing the different types of weight functions in the literature, since the subject is present in most textbooks. See, for instance, Priestley (1981), where a broad account of different weight functions, their properties and further references can be found.
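A minimal sketch of a smoothed periodogram is given below. It assumes the common form in which the estimate at a Fourier frequency is a weighted sum of periodogram matrices at neighboring Fourier frequencies, extended periodically. For brevity, a single weight sequence is used for all components, whereas the Hadamard form above allows component-specific weights. The triangular weights are of Bartlett type and are an assumption, not necessarily the exact weights used in this paper; the zero frequency is not treated specially.

```python
import numpy as np


def periodogram_all(x):
    """Periodogram matrices, shape (n, q, q), at lambda_j = 2*pi*j/n, j = 0..n-1."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    # conj(fft) matches the exp(+i*t*lambda) convention up to a phase that cancels.
    w = np.conj(np.fft.fft(x, axis=0)) / np.sqrt(2.0 * np.pi * n)
    return w[:, :, None] * np.conj(w[:, None, :])


def bartlett_weights(ell):
    """Symmetric, positive triangular weights on k = -ell..ell, summing to one."""
    k = np.arange(-ell, ell + 1)
    w = 1.0 - np.abs(k) / (ell + 1.0)
    return w / w.sum()


def smoothed_periodogram(x, ell, weights=None):
    """Sum over |k| <= ell of weights[k] * I(lambda_{j+k}), indices taken modulo n."""
    I = periodogram_all(x)
    if weights is None:
        weights = bartlett_weights(ell)
    f = np.zeros_like(I)
    for k, wk in zip(range(-ell, ell + 1), weights):
        # np.roll(I, -k, axis=0)[j] equals I[(j + k) % n].
        f += wk * np.roll(I, -k, axis=0)
    return f
```

With `ell = 0` the function returns the raw periodogram, so the amount of smoothing is controlled entirely by the truncation point.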
In the asymptotic theory, we are only interested in sequences and weight functions satisfying
Assumption C1. , as tends to infinity;
Assumption C2. and , for all ;
Assumption C3. ;
Assumption C4. , as tends to infinity.
Under Assumptions C1-C4, the smoothed periodogram is an -consistent estimator of the spectral density matrix for all . Assumptions C1-C4 are standard ones in the asymptotic theory of the smoothed periodogram (see, for instance, Priestley, 1981). Assumption C1 controls the convergence rate of the sequence with respect to . Assumptions C2-C3 impose that the weight functions must be non-negative symmetric functions and that the sequences form a convex sequence of coefficients for each , and . Assumption C4 is just a mild technical condition for the consistency of the estimator.
Since the smoothed periodogram (under Assumptions C1-C4) is an -consistent estimator of the spectral density function for , Theorem 3.1 applies and we conclude that the estimator (3.4) (under Assumptions A1-A4) is consistent for all . So far, we have not been able to establish the consistency of the estimator (2.10) based on the smoothed periodogram via Theorem 3.2 nor its asymptotic normality via Theorem 4.1. However, there is empirical evidence (as we shall show later) that this is indeed the case.
5.2 The Tapered Periodogram
The main idea of the tapered periodogram is to reduce the asymptotic bias by tapering the data before calculating the periodogram of the series. This is especially helpful in the case of long-range dependent time series. See, for instance, Priestley (1981) and Hurvich and Beltrão (1993).
Let be a weakly stationary -dimensional process. For , let be a collection of functions. Consider the vector of functions defined as and let
The tapered periodogram of is then defined by setting
We shall assume the following:
Assumption D. The tapering functions are of bounded variation and , for all
The tapered periodogram is not a consistent estimator of the spectral density function, since the reduction in the bias induces, in this case, an increase in the variance. Just as with the ordinary periodogram, the increase in the variance can be dealt with by smoothing the tapered periodogram in order to obtain a consistent estimator of the spectral density function in the case (see, for instance, the recent work of Fryzlewicz, Nason and von Sachs, 2008). Usually, a good performance of the tapered periodogram is obtained through tapering functions which decay faster than Fejér’s kernel. For more information on the choices of taper functions, see Priestley (1981), Dahlhaus (1983), Hurvich and Beltrão (1993), Fryzlewicz, Nason and von Sachs (2008) and references therein.
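The tapering step can be sketched as follows, using the cosine-bell taper applied in Section 5.3. The sketch assumes a single taper shared by all components and the standard normalization by the sum of squared taper values, which reduces to the ordinary periodogram when the taper is identically one. The function names and the evaluation of the taper at t/n are illustrative assumptions.

```python
import numpy as np


def cosine_bell(u):
    """Cosine-bell taper h(u) = (1 - cos(2*pi*u)) / 2 for u in [0, 1]."""
    return 0.5 * (1.0 - np.cos(2.0 * np.pi * np.asarray(u, dtype=float)))


def tapered_periodogram(x, m, taper=cosine_bell):
    """Tapered periodogram matrices, shape (m, q, q), at lambda_j = 2*pi*j/n, j = 1..m.

    Assumed form: w_h(lam) = (2*pi*sum_t h_t^2)^(-1/2) * sum_t h_t x_t exp(i*t*lam),
    with h_t = taper(t / n), and I_h(lam) = w_h(lam) w_h(lam)^*.
    """
    x = np.asarray(x, dtype=float)
    n, q = x.shape
    t = np.arange(1, n + 1)
    h = taper(t / n)
    freqs = 2.0 * np.pi * np.arange(1, m + 1) / n
    phases = np.exp(1j * np.outer(freqs, t))                     # (m, n)
    w = phases @ (h[:, None] * x) / np.sqrt(2.0 * np.pi * np.sum(h ** 2))
    return freqs, w[:, :, None] * np.conj(w[:, None, :])
```

The taper downweights the observations at both ends of the sample, which is what reduces leakage from the pole at the zero frequency into neighboring frequencies.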
Under Assumption D, (cf. Fryzlewicz, Nason and von Sachs, 2008) so that . This allows us to show that the estimator (2.10) based on the tapered periodogram is also consistent and asymptotically normally distributed. These are the contents of the next corollaries.
Let be a weakly stationary -dimensional process with spectral density satisfying Assumptions B1-B5, with B4 holding for . Let be the tapered periodogram given in (5.2) satisfying Assumption D. For , consider the estimator based on , as given in (2.10). Then, for ,
as tends to infinity, with as given in Theorem 4.1.
5.3 Simulation Results
Recall that the class of VARFIMA processes comprises -dimensional stationary processes which satisfy the difference equations
where is the backward shift operator, is an -dimensional stationary process (the innovation process), and are matrices in , given by the equations
assumed to have no common roots, where are real matrices and , the identity matrix.
The simulation study is based on bidimensional Gaussian VARFIMA time series (i.i.d. innovation process) of sample size for four different pairs of the parameter and (within component) correlation . A total of 1,000 replications is performed for each set of parameters. The time series are generated by the widely applied method of truncating the multidimensional infinite moving average representation of the process. The truncation point is fixed at 50,000 for all cases. The goal is the estimation of the parameter . To do that, we consider the estimator (2.10) with the tapered and the smoothed periodogram as the spectral density matrix estimator . For the tapered periodogram, we apply the cosine-bell tapering function, namely,
The resulting estimator is denoted by TLOB. The cosine-bell taper is often applied as tapering function in applications as, for instance, in Hurvich and Ray (1995), Velasco (1999) and Olbermann et al. (2006). For the smoothed periodogram, we apply the so-called Bartlett’s weights for all spectral density components, namely
We consider the smoothed periodogram with and without the restriction in (5.1) and the resulting estimators are denoted by SLOB and , respectively. We also apply, for comparison purposes, the estimator given in (2.6), denoted by LOB. The specific truncation point of the smoothed periodogram function is of the form , for while the truncation point of the objective function (2.8) is of the form , for for all estimators. All simulations were performed by using the computational resources of the (Brazilian) Center of Super Computing (CESUP-UFRGS). The routines were all implemented in FORTRAN 95 language optimized with OpenMP directives for parallel computing.
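The data generating step described above can be sketched as follows. The sketch assumes the simplest member of the class, a bivariate VARFIMA(0, d, 0) process with Gaussian innovations of unit variance and correlation `rho`, generated by truncating the infinite moving average representation of each component. The function names and the default truncation point are illustrative (the study above uses 50,000), and this is written in Python rather than in the FORTRAN 95 used for the actual simulations.

```python
import numpy as np


def ma_coefficients(d, trunc):
    """Coefficients psi_0..psi_trunc of (1 - B)^(-d), via psi_j = psi_{j-1} * (j - 1 + d) / j."""
    psi = np.empty(trunc + 1)
    psi[0] = 1.0
    for j in range(1, trunc + 1):
        psi[j] = psi[j - 1] * (j - 1 + d) / j
    return psi


def simulate_varfima_0d0(n, d, rho, trunc=2000, seed=None):
    """Simulate n observations of a bivariate VARFIMA(0, d, 0) process.

    d:   pair (d_1, d_2) of fractional differencing parameters
    rho: correlation between the two Gaussian innovation components
    """
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
    eps = rng.standard_normal((n + trunc, 2)) @ chol.T   # correlated innovations
    x = np.empty((n, 2))
    for k in range(2):
        psi = ma_coefficients(d[k], trunc)
        # 'valid' keeps only the n points that use a full window of trunc + 1 innovations.
        x[:, k] = np.convolve(eps[:, k], psi, mode="valid")
    return x
```

A Monte Carlo replication would then call `simulate_varfima_0d0` with a fresh seed, compute the chosen spectral density estimate, and minimize the objective function over the parameter space.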