T-statistic for Autoregressive Process
Abstract
In this paper, we discuss the distribution of the t-statistic under the assumption of a normal autoregressive distribution for the underlying discrete-time process. This result generalizes the classical result of the traditional t-distribution, where the underlying discrete-time process follows an uncorrelated normal distribution. For AR(1), however, the underlying process is correlated. All traditional results break down and the resulting t-statistic follows a new distribution that converges asymptotically to a normal. We give an explicit formula for this new distribution, obtained as the ratio of two dependent distributions (a normal and the distribution of the norm of another independent normal distribution). We also provide a modified statistic that follows a noncentral t-distribution. Its derivation comes from finding an orthogonal basis for the initial circulant Toeplitz covariance matrix. Our findings are consistent with the asymptotic distribution of the t-statistic derived for the asymptotic case of a large number of observations or zero correlation. The exact form of this distribution has applications in multiple fields and in particular provides a way to derive the exact distribution of the Sharpe ratio under normal AR(1) assumptions.
AMS 1991 subject classification: 62E10, 62E15
Keywords: t-Student, autoregressive process, Toeplitz matrix, circulant matrix, non-centered Student distribution
1 Introduction
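Let $X_1, \dots, X_n$ be a random sample from a cumulative distribution function (cdf) $F(\cdot)$ with a constant mean $\mu$, and define the following statistic, referred to as the t-statistic:
(1) $T_n = \dfrac{\sqrt{n}\,(\bar{X}_n - \mu)}{s_n}$
where $\bar{X}_n$ is the empirical mean, $s_n^2$ the Bessel-corrected empirical variance, and $\mathbf{X}_n$ the full history of the random sample, defined by:
(2) $\bar{X}_n = \dfrac{1}{n}\sum_{i=1}^{n} X_i, \qquad s_n^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X}_n)^2, \qquad \mathbf{X}_n = (X_1, \dots, X_n)^T$
It is well known that if the sample comes from a normal distribution $N(\mu, \sigma^2)$, $T_n$ has the Student t-distribution with $n-1$ degrees of freedom. The proof is quite simple (we provide a few in appendix A.1). If the variables $X_i$ have a non-zero mean $\mu$ that is not subtracted in the numerator, the distribution is referred to as a noncentral t-distribution with noncentrality parameter given by
(3) $\eta = \sqrt{n}\,\dfrac{\mu}{\sigma}$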
Extensions of the t-statistic to weaker conditions have been widely studied.
Mauldon (1956) raised the question of for which pdfs the t-statistic as defined by equation 1 is t-distributed with $n-1$ degrees of freedom. Indeed, this characterization problem can be generalized to that of finding all the pdfs for which a certain statistic possesses a property that is characteristic of these pdfs. Kagan et al. (1973), Bondesson (1974) and Bondesson (1983), to cite a few, tackled Mauldon's problem. Bondesson (1983) proved that the necessary and sufficient condition for a t-statistic to have Student's t-distribution with $n-1$ degrees of freedom for all sample sizes is the normality of the underlying distribution. It is not necessary that $X_1, \dots, X_n$ be an independent sample. Indeed, consider $X_1, \dots, X_n$ as a random vector $\mathbf{X}_n = (X_1, \dots, X_n)^T$, each component of which has the same marginal distribution function $F(\cdot)$. Efron (1969) pointed out that the weaker condition of symmetry can replace the normality assumption. Later, Fang et al. (2001) showed that if the vector $\mathbf{X}_n$ has a spherical distribution, then the t-statistic has a t-distribution. A natural question, which gave birth to this paper, is whether the Student distribution is preserved when the underlying process follows an AR(1) process. This question and its answer have more implications than a simple theoretical problem. Indeed, to test the statistical significance of a coefficient in a regression, one may do a t-test and rely upon the fact that the resulting distribution is a Student one. If the observations are not independent but suffer from autocorrelation, the building blocks supporting the test break down. Surprisingly, as this problem is not easy, there has been little research on it. Although it is related to the Dickey-Fuller statistic (whose distribution is not closed form and needs to be computed by Monte Carlo simulation), it is not the same statistic.
Mikusheva (2015) applied an Edgeworth expansion precisely to the Dickey-Fuller statistic but not to the original t-statistic. The neighboring Dickey-Fuller statistic has the great advantage of being equal to the ratio of two known continuous-time stochastic processes, which makes the problem easier. In the sequel, we will first review the problem and comment on the particular case of zero correlation and its consequence for the t-statistic. We will emphasize the differences and challenges that arise when the underlying observations are no longer independent. We will study the numerator and denominator of the t-statistic and derive their underlying distributions. We will in particular prove that it is only in the case of normal noise in the underlying AR(1) process that the numerator and denominator are independent. We will then provide a few approximations for this statistic and conclude.
2 AR(1) process
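The assumption that the underlying process (or observations) $(X_t)_{t=1,\dots,n}$ follows an AR(1) process writes:
(4) $X_t = \mu + \epsilon_t, \quad t \geq 1; \qquad \epsilon_t = \rho\,\epsilon_{t-1} + \sigma v_t, \quad t \geq 2$
where $(v_t)$ is an independent white noise process (i.i.d. variables with zero mean and unit variance). To ensure a stationary process, we impose
(5) $|\rho| < 1$
and take $\epsilon_1 = \sigma v_1 / \sqrt{1-\rho^2}$ so that the process starts from its stationary distribution. It is easy to check that equation 4 is equivalent to
(6) $X_t = \mu + \rho\,(X_{t-1} - \mu) + \sigma v_t, \quad t \geq 2$
We can also easily check that the variance and covariance of the observations are given by
(7) $V(X_t) = \dfrac{\sigma^2}{1-\rho^2}, \quad t \geq 1; \qquad \mathrm{Cov}(X_t, X_u) = \dfrac{\sigma^2 \rho^{|t-u|}}{1-\rho^2}, \quad t, u \geq 1$
Both expressions in equation 7 are independent of time $t$ and the covariance only depends on $|t-u|$, implying that $(X_t)$ is a stationary process.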
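The stationarity claim can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: it takes the AR(1) recursion $\epsilon_t = \rho\,\epsilon_{t-1} + \sigma v_t$ with Gaussian noise and a first value drawn from the stationary law, and compares empirical autocovariances with $\sigma^2\rho^{k}/(1-\rho^2)$. The function names `simulate_ar1` and `empirical_autocov` and the parameter values are hypothetical choices, not part of the paper.

```python
import math
import random


def simulate_ar1(n, mu, rho, sigma, rng):
    """Draw X_1..X_n with X_t = mu + eps_t and eps_t = rho*eps_{t-1} + sigma*v_t.

    Assumption: eps_1 is drawn from the stationary law N(0, sigma^2/(1-rho^2)).
    """
    eps = sigma * rng.gauss(0.0, 1.0) / math.sqrt(1.0 - rho * rho)
    path = [mu + eps]
    for _ in range(n - 1):
        eps = rho * eps + sigma * rng.gauss(0.0, 1.0)
        path.append(mu + eps)
    return path


def empirical_autocov(path, mu, lag):
    """Average of (X_t - mu)(X_{t+lag} - mu), with the mean mu assumed known."""
    m = len(path) - lag
    return sum((path[t] - mu) * (path[t + lag] - mu) for t in range(m)) / m


rng = random.Random(0)
mu, rho, sigma = 1.0, 0.5, 2.0
path = simulate_ar1(200_000, mu, rho, sigma, rng)
stationary_var = sigma ** 2 / (1.0 - rho ** 2)

# Empirical autocovariance next to the theoretical value for lags 0, 1, 2.
for lag in (0, 1, 2):
    print(lag, empirical_autocov(path, mu, lag), stationary_var * rho ** lag)
```

Drawing the first value from the stationary law is a design choice: starting from $\epsilon_1 = \sigma v_1$ instead would make the first observations have a smaller variance than the later ones.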
2.1 Case of Normal errors
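If, in addition, we assume that the $v_t$ are distributed according to a normal distribution, we can fully characterize the distribution of $\mathbf{X}_n = (X_1, \dots, X_n)^T$ and rewrite our model in reduced matrix formulation as follows:
(8) $\mathbf{X}_n = \mu\,\mathbb{1}_n + \epsilon$
where $\epsilon = (\epsilon_1, \dots, \epsilon_n)^T \sim N(0, \sigma^2\Omega)$ with $\Omega = \left(\dfrac{\rho^{|i-j|}}{1-\rho^2}\right)_{1 \le i,j \le n}$, hence $\mathbf{X}_n \sim N(\mu\,\mathbb{1}_n, \sigma^2\Omega)$.
The matrix $\Omega$ is a Toeplitz circulant matrix defined as
(9) $\Omega = \dfrac{1}{1-\rho^2}\begin{pmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{n-1} \\ \rho & 1 & \rho & \cdots & \rho^{n-2} \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \rho^{n-2} & \cdots & \rho & 1 & \rho \\ \rho^{n-1} & \cdots & \rho^2 & \rho & 1 \end{pmatrix}$
Its Cholesky decomposition $\Omega = M M^T$ is given by
(10) $M = \begin{pmatrix} \frac{1}{\sqrt{1-\rho^2}} & 0 & 0 & \cdots & 0 \\ \frac{\rho}{\sqrt{1-\rho^2}} & 1 & 0 & \cdots & 0 \\ \frac{\rho^2}{\sqrt{1-\rho^2}} & \rho & 1 & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 0 \\ \frac{\rho^{n-1}}{\sqrt{1-\rho^2}} & \rho^{n-2} & \cdots & \rho & 1 \end{pmatrix}$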
It is worth splitting into and another matrix as follows:
(11) 
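The inverse of $\Omega$ is given by
(12) $A = \Omega^{-1} = \begin{pmatrix} 1 & -\rho & 0 & \cdots & 0 \\ -\rho & 1+\rho^2 & -\rho & \ddots & \vdots \\ 0 & \ddots & \ddots & \ddots & 0 \\ \vdots & \ddots & -\rho & 1+\rho^2 & -\rho \\ 0 & \cdots & 0 & -\rho & 1 \end{pmatrix}$
Its Cholesky decomposition $A = L^T L$ is given by
(13) $L = \begin{pmatrix} \sqrt{1-\rho^2} & 0 & 0 & \cdots & 0 \\ -\rho & 1 & 0 & \cdots & 0 \\ 0 & -\rho & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & -\rho & 1 \end{pmatrix}$
Notice in the various matrices the asymmetry between the first term and the rest. This shows up, for instance, in the first diagonal term of $L$, which is $\sqrt{1-\rho^2}$, while all other diagonal terms are equal to 1. Similarly, in the Cholesky factor of $\Omega$, the first column is quite different from the other ones, as each of its entries is a fraction over $\sqrt{1-\rho^2}$.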
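These factorizations can be checked numerically. The sketch below is a sanity check under stated assumptions rather than the paper's own computation: it assumes $\Omega_{ij} = \rho^{|i-j|}/(1-\rho^2)$, a lower-triangular factor $M$ whose first column is $\rho^{i}/\sqrt{1-\rho^2}$, and a lower-bidiagonal factor $L$ with first diagonal term $\sqrt{1-\rho^2}$ and $-\rho$ on the sub-diagonal. All function names are hypothetical.

```python
import math


def omega(n, rho):
    """Assumed AR(1) matrix: Omega[i][j] = rho^|i-j| / (1 - rho^2)."""
    return [[rho ** abs(i - j) / (1.0 - rho ** 2) for j in range(n)] for i in range(n)]


def chol_omega(n, rho):
    """Lower-triangular M with M M^T = Omega; only the first column is rescaled."""
    scale = math.sqrt(1.0 - rho ** 2)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        m[i][0] = rho ** i / scale
        for j in range(1, i + 1):
            m[i][j] = rho ** (i - j)
    return m


def chol_precision(n, rho):
    """Lower-bidiagonal L with L^T L = Omega^{-1}; only L[0][0] differs from 1."""
    lower = [[0.0] * n for _ in range(n)]
    lower[0][0] = math.sqrt(1.0 - rho ** 2)
    for i in range(1, n):
        lower[i][i - 1] = -rho
        lower[i][i] = 1.0
    return lower


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


n, rho = 6, 0.7
om, m, lower = omega(n, rho), chol_omega(n, rho), chol_precision(n, rho)

err_cholesky = max_abs_diff(matmul(m, transpose(m)), om)  # M M^T against Omega
err_inverse = max_abs_diff(matmul(lower, m), identity(n))  # L M against I, i.e. L = M^{-1}
err_precision = max_abs_diff(
    matmul(matmul(transpose(lower), lower), om), identity(n)
)  # (L^T L) Omega against I
print(err_cholesky, err_inverse, err_precision)
```

All three discrepancies should be at the level of floating-point rounding, which also shows that $L$ is the inverse of $M$.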
2.2 T-statistic issue
The T-statistic given by equation 1 is not easy to compute. For the numerator, we have that $\sqrt{n}\,(\bar{X}_n - \mu)$ follows a normal distribution. The proof is immediate as $\bar{X}_n$ is a linear combination of the Gaussian vector generated by the AR(1) process. We have $\bar{X}_n = \frac{1}{n}\mathbb{1}_n^T \mathbf{X}_n$, where $\mathbf{X}_n = (X_1, \dots, X_n)^T$ has covariance matrix $\sigma^2\Omega$. It follows that $\bar{X}_n \sim N\!\left(\mu, \frac{\sigma^2}{n^2}\,\mathbb{1}_n^T \Omega\, \mathbb{1}_n\right)$ (for a quick proof of the fact that any linear combination of a Gaussian vector is normal, see B.1). In section 3, we will come back to the exact computation of the characteristics of the distribution of the numerator and denominator, as this will be useful in the rest of the paper.
As for the denominator, for a non-null correlation $\rho$, the distribution of $s_n^2$ is not a known distribution.
The distributions of the variables $X_i - \bar{X}_n$ are normal, given by
$X_i - \bar{X}_n \sim N\!\left(0, \sigma^2\, \delta_i^T \Omega\, \delta_i\right)$
with
$\delta_i = e_i - \frac{1}{n}\mathbb{1}_n$.
where $e_i$ is the $i$-th vector of the canonical basis of $\mathbb{R}^n$.
Hence the square of each of these normal variables follows a Gamma distribution, and $(n-1)s_n^2$ is a sum of Gamma-distributed variables. However, we cannot obtain a closed form for its distribution, as the variances of the different terms differ and the terms are not independent. If the correlation is null, and only in this specific case, we can apply Cochran's theorem to prove that $(n-1)s_n^2/\sigma^2$ follows a chi-square distribution with $n-1$ degrees of freedom. In the general case, however, we need to rely on approximations that will be presented in the rest of the paper.
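Another interesting result is obtained by using the Cholesky decomposition of the inverse of the covariance matrix of our process to build a modified t-statistic whose terms are now independent. It is defined as follows.
Let us take the modified process defined by
(14) $Z = L\,(\mathbf{X}_n - \mu\,\mathbb{1}_n)$
where $L$ is the Cholesky factor of $\Omega^{-1} = L^T L$. The vector $Z = (Z_1, \dots, Z_n)^T$ is distributed according to a normal $N(0, \sigma^2 I_n)$. We can compute the modified T-statistic $\tilde{T}_n$ on $Z$ as follows:
(15) $\tilde{T}_n = \dfrac{\sqrt{n}\,\bar{Z}_n}{\tilde{s}_n}$
where
(16) $\bar{Z}_n = \dfrac{1}{n}\sum_{i=1}^{n} Z_i, \qquad \tilde{s}_n^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (Z_i - \bar{Z}_n)^2$
In this specific case, the distribution of $\tilde{T}_n$ is a Student distribution with $n-1$ degrees of freedom. We will now work on the numerator and denominator of the T-statistic in the specific case of AR(1) with a non-null correlation $\rho$.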
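The contrast between the naive and the modified statistic can be illustrated by Monte Carlo. The sketch below is an illustration under stated assumptions, not the paper's method: it assumes a Gaussian AR(1) process started from its stationary law, a known mean $\mu$ and a known $\rho$, and a whitening step $Z_1 = \sqrt{1-\rho^2}\,(X_1-\mu)$, $Z_i = (X_i-\mu) - \rho\,(X_{i-1}-\mu)$. It compares the sample variance of both statistics with $(n-1)/(n-3)$, the variance of a Student variable with $n-1$ degrees of freedom. All names and parameter values are hypothetical.

```python
import math
import random


def simulate_ar1(n, mu, rho, sigma, rng):
    """Gaussian AR(1) path started from its stationary distribution (assumption)."""
    eps = sigma * rng.gauss(0.0, 1.0) / math.sqrt(1.0 - rho * rho)
    xs = [mu + eps]
    for _ in range(n - 1):
        eps = rho * eps + sigma * rng.gauss(0.0, 1.0)
        xs.append(mu + eps)
    return xs


def whiten(xs, mu, rho):
    """Bidiagonal whitening: Z_1 = sqrt(1-rho^2)(X_1-mu), Z_i = (X_i-mu) - rho(X_{i-1}-mu)."""
    ys = [x - mu for x in xs]
    first = math.sqrt(1.0 - rho * rho) * ys[0]
    return [first] + [ys[i] - rho * ys[i - 1] for i in range(1, len(ys))]


def t_stat(xs, mean0):
    """sqrt(n) * (sample mean - mean0) / Bessel-corrected sample standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return math.sqrt(n) * (mean - mean0) / math.sqrt(s2)


def sample_variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


rng = random.Random(1)
n, mu, rho, sigma, n_sims = 20, 0.0, 0.5, 1.0, 20_000
naive, modified = [], []
for _ in range(n_sims):
    xs = simulate_ar1(n, mu, rho, sigma, rng)
    naive.append(t_stat(xs, mu))                        # t-statistic on correlated data
    modified.append(t_stat(whiten(xs, mu, rho), 0.0))   # t-statistic on whitened data

var_naive = sample_variance(naive)
var_modified = sample_variance(modified)
var_student = (n - 1) / (n - 3)
print(var_naive, var_modified, var_student)
```

With positive correlation, the naive statistic is markedly more dispersed than a Student variable, while the whitened one matches the Student variance up to simulation noise. The whitening requires $\rho$ to be known, which is a strong assumption in practice.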
3 Expectation and variance of numerator and denominator
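The numerator of the T-statistic writes
(17) $\sqrt{n}\,(\bar{X}_n - \mu) = \dfrac{1}{\sqrt{n}}\sum_{i=1}^{n} (X_i - \mu)$
Its expectation is null as each term has zero expectation. Its variance is given by
(18) $V\!\left(\sqrt{n}\,(\bar{X}_n - \mu)\right) = \dfrac{\sigma^2}{1-\rho^2}\left[\dfrac{1+\rho}{1-\rho} - \dfrac{2\rho\,(1-\rho^n)}{n\,(1-\rho)^2}\right]$
(19) $\phantom{V\!\left(\sqrt{n}\,(\bar{X}_n - \mu)\right)} = \dfrac{\sigma^2}{(1-\rho)^2}\left[1 - \dfrac{2\rho\,(1-\rho^n)}{n\,(1-\rho^2)}\right]$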
Proof. See B.2. ∎
Proposition 3 is interesting as it states that the variance of the rescaled sample mean $\sqrt{n}\,(\bar{X}_n - \mu)$ converges to $\frac{\sigma^2}{(1-\rho)^2}$ for large $n$. It is useful to keep the two forms of the variance. The first one (equation 18) is useful in the following computations as it shares the denominator term $1-\rho^2$. The second form (equation 19) gives the asymptotic form.
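The variance of the numerator can be cross-checked numerically. The sketch below assumes the covariance structure $\mathrm{Cov}(X_i, X_j) = \sigma^2\rho^{|i-j|}/(1-\rho^2)$ and compares a brute-force double sum with the closed form $\frac{\sigma^2}{(1-\rho)^2}\left[1 - \frac{2\rho(1-\rho^n)}{n(1-\rho^2)}\right]$. That closed form is re-derived here from the covariance structure and is an assumption of this sketch, as are the function names.

```python
def var_numerator_bruteforce(n, rho, sigma):
    """Var(sqrt(n) * (mean - mu)) = (1/n) * sum over i, j of Cov(X_i, X_j)."""
    total = sum(rho ** abs(i - j) for i in range(n) for j in range(n))
    return sigma ** 2 / (1.0 - rho ** 2) * total / n


def var_numerator_closed_form(n, rho, sigma):
    """Closed form whose leading factor sigma^2/(1-rho)^2 is the large-n limit."""
    correction = 2.0 * rho * (1.0 - rho ** n) / (n * (1.0 - rho ** 2))
    return sigma ** 2 / (1.0 - rho) ** 2 * (1.0 - correction)


sigma = 1.5
for n in (2, 10, 100):
    for rho in (-0.5, 0.0, 0.5, 0.9):
        print(n, rho, var_numerator_bruteforce(n, rho, sigma),
              var_numerator_closed_form(n, rho, sigma))
```

For $\rho = 0$ both expressions reduce to $\sigma^2$, the classical i.i.d. value, and for $\rho$ close to 1 the limit $\sigma^2/(1-\rho)^2$ becomes very large, which is why ignoring autocorrelation understates the uncertainty of the sample mean.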
The denominator writes:
(20) $s_n = \sqrt{\dfrac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X}_n)^2}$
In the following, we denote by the zero mean variable and work with these variables to make computation easier. We also write the variable orthogonal to whose variance (we sometimes refer to it as its squared norm to make notation easier) is equal to the one of : . To see the impact of correlation, we can write for any , .
As studying this denominator is not easy because of the presence of the square root, it is easier to investigate the properties of its square, given by
(21) $s_n^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X}_n)^2$
We have that the mean of is zero while proposition 3 gives its variance :
(22) 
The covariance between and each stochastic variable is useful and given by
(23) 
In addition, we have a few remarkable identities
(24) 
(25) 
Proof. See B.3. ∎
We can now easily compute the expectation and variance of the denominator as follows.
The expectation of $s_n^2$ is given by:
(26) 
Proof. See B.4. ∎
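The expectation of the squared denominator can be sanity-checked from the covariance matrix alone. The sketch below does not reproduce the paper's formula. It relies on the identity $\sum_i (X_i - \bar{X}_n)^2 = \sum_i (X_i-\mu)^2 - n(\bar{X}_n-\mu)^2$, which gives $E[(n-1)s_n^2] = \sigma^2\left(\mathrm{tr}(\Omega) - \frac{1}{n}\mathbb{1}_n^T\Omega\,\mathbb{1}_n\right)$ under the assumption $\Omega_{ij} = \rho^{|i-j|}/(1-\rho^2)$. The closed form coded below is derived from that identity for this sketch, and the function names are hypothetical.

```python
def expected_s2_bruteforce(n, rho, sigma):
    """E[s_n^2] from sigma^2 * (trace(Omega) - 1^T Omega 1 / n) / (n - 1)."""
    trace = n / (1.0 - rho ** 2)
    total = sum(rho ** abs(i - j) for i in range(n) for j in range(n)) / (1.0 - rho ** 2)
    return sigma ** 2 * (trace - total / n) / (n - 1)


def expected_s2_closed_form(n, rho, sigma):
    """Same quantity, written as the stationary variance times a bias factor."""
    bias = (1.0
            - (1.0 + rho) / (n * (1.0 - rho))
            + 2.0 * rho * (1.0 - rho ** n) / (n ** 2 * (1.0 - rho) ** 2))
    return sigma ** 2 / (1.0 - rho ** 2) * n / (n - 1) * bias


sigma = 1.0
for n in (5, 20, 100):
    for rho in (0.0, 0.5, 0.9):
        stationary_var = sigma ** 2 / (1.0 - rho ** 2)
        print(n, rho, expected_s2_bruteforce(n, rho, sigma),
              expected_s2_closed_form(n, rho, sigma), stationary_var)
```

For $\rho = 0$ the result is $\sigma^2$, so $s_n^2$ is unbiased. For $\rho > 0$ the expectation falls below the stationary variance $\sigma^2/(1-\rho^2)$, so the denominator of the t-statistic is biased downward in addition to the numerator being more dispersed.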
The second moment of $s_n^2$ is given by:
(27) 
with
(28)  
(29)  
(30)  
(31) 
Proof. See B.5. ∎
Combining the two results leads to the following proposition. The variance of $s_n^2$ is given by:
(32) 
with
(33)  
(34)  
(35)  
(36)  
(37) 
Proof. See B.6. ∎
It is worth noting that a direct approach as explained in Benhamou (2018) could also give the results for the first, second moments and variance for the numerator and denominator.
4 Resulting distribution
The previous section shows that under the AR(1) assumptions, the t-statistic no longer follows a Student distribution but is the ratio of a normal variable, whose first and second moments have been given above, and the norm of a Gaussian vector, whose moments have also been provided. To go further, one needs to rely on numerical integration. This is the subject of further research.
5 Conclusion
In this paper, we have given the explicit first and second moments and the variance of the numerator and denominator of the t-statistic under the assumption of an underlying AR(1) process. We have seen that these moments are very sensitive to the correlation assumption and that the distribution is far from a Student distribution.
Appendix A Various Proofs for the Student density
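A.1 Deriving the t-Student density
Let us first remark that in the T-statistic, the factor $\sigma$ cancels out to show the degree of freedom $n-1$ as follows:
(38) $T_n = \dfrac{\bar{X}_n - \mu}{s_n/\sqrt{n}} = \dfrac{\dfrac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}}{\sqrt{\dfrac{1}{n-1}\sum_{i=1}^{n} \dfrac{(X_i - \bar{X}_n)^2}{\sigma^2}}}$
In the above expression, it is well known that if the $X_i$ are i.i.d. $N(\mu, \sigma^2)$, then the renormalized variables satisfy $U = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$ and $V = \sum_{i=1}^{n} \frac{(X_i - \bar{X}_n)^2}{\sigma^2} \sim \chi^2_{n-1}$, and $U$ and $V$ are independent. Hence, we need to prove that the distribution of $T = U/\sqrt{V/k}$ is a Student distribution, with $U \sim N(0,1)$ and $V \sim \chi^2_k$ mutually independent, and $k$ the degree of freedom of the chi-squared distribution.
The core of the proof relies on two steps that can be proved by various means.
Step 1 is to prove that the distribution of $T$ is given by
(39) $f_T(t) = \dfrac{1}{\sqrt{2\pi k}\; 2^{k/2}\,\Gamma(k/2)} \displaystyle\int_0^{\infty} v^{\frac{k+1}{2}-1}\, e^{-\frac{v}{2}\left(1+\frac{t^2}{k}\right)}\, dv$
Step 2 is to compute explicitly the integral in equation 39.
Step 1 can be done by transformation theory, using the Jacobian of the inverse transformation, or by the property of the ratio distribution. Step 2 can be done using the Gamma function, Gamma distribution properties, the Mellin transform or the Laplace transform.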
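A.2 Proving step 1
A.2.1 Using transformation theory
The joint density of $U \sim N(0,1)$ and $V \sim \chi^2_k$, taken independent, is:
(40) $f_{U,V}(u, v) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{u^2}{2}} \; \dfrac{1}{2^{k/2}\,\Gamma(k/2)}\, v^{\frac{k}{2}-1}\, e^{-\frac{v}{2}}$
with the distribution support given by $-\infty < u < \infty$ and $v > 0$.
Making the transformation $t = \frac{u}{\sqrt{v/k}}$ and $w = v$, we can compute the inverse: $u = t\sqrt{w/k}$ and $v = w$. The Jacobian (the determinant of the Jacobian matrix of the transformation) is given by
(41) $J = \begin{vmatrix} \sqrt{w/k} & \dfrac{t}{2\sqrt{k w}} \\ 0 & 1 \end{vmatrix}$
whose value is $\sqrt{w/k}$. The marginal pdf is therefore given by:
(42) $f_T(t) = \displaystyle\int_0^{\infty} f_{U,V}\!\left(t\sqrt{w/k},\, w\right) \sqrt{\dfrac{w}{k}}\; dw$
(43) $\phantom{f_T(t)} = \displaystyle\int_0^{\infty} \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{t^2 w}{2k}}\; \dfrac{1}{2^{k/2}\,\Gamma(k/2)}\, w^{\frac{k}{2}-1}\, e^{-\frac{w}{2}} \sqrt{\dfrac{w}{k}}\; dw$
(44) $\phantom{f_T(t)} = \dfrac{1}{\sqrt{2\pi k}\; 2^{k/2}\,\Gamma(k/2)} \displaystyle\int_0^{\infty} w^{\frac{k+1}{2}-1}\, e^{-\frac{w}{2}\left(1+\frac{t^2}{k}\right)}\, dw$
which proves the result ∎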
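A.2.2 Using ratio distribution
The square root of $V \sim \chi^2_k$, $\sqrt{V}$, is distributed as a chi-distribution with $k$ degrees of freedom, which has density
(45) $f_{\sqrt{V}}(y) = \dfrac{2^{1-\frac{k}{2}}}{\Gamma(k/2)}\, y^{k-1}\, e^{-\frac{y^2}{2}}, \quad y \geq 0$
Define $W = \sqrt{V/k}$. Then by change of variable, we can compute the density of $W$:
(46) $f_W(w) = \sqrt{k}\; f_{\sqrt{V}}\!\left(\sqrt{k}\, w\right)$
(47) $\phantom{f_W(w)} = \dfrac{2^{1-\frac{k}{2}}\, k^{\frac{k}{2}}}{\Gamma(k/2)}\, w^{k-1}\, e^{-\frac{k w^2}{2}}$
The Student's t random variable, defined as $T = U/W$ with $U \sim N(0,1)$ independent of $W$, has a distribution given by the ratio distribution:
(48) $f_T(t) = \displaystyle\int_{-\infty}^{\infty} |w|\, f_U(t w)\, f_W(w)\, dw$
We can notice that $f_W(w) = 0$ over the interval $(-\infty, 0)$ since $W$ is a nonnegative random variable. We are therefore entitled to eliminate the absolute value. This means that the integral reduces to
(49) $f_T(t) = \displaystyle\int_0^{\infty} w\, f_U(t w)\, f_W(w)\, dw$
(50) $\phantom{f_T(t)} = \displaystyle\int_0^{\infty} w\, \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{t^2 w^2}{2}}\; \dfrac{2^{1-\frac{k}{2}}\, k^{\frac{k}{2}}}{\Gamma(k/2)}\, w^{k-1}\, e^{-\frac{k w^2}{2}}\, dw$
(51) $\phantom{f_T(t)} = \dfrac{2^{1-\frac{k}{2}}\, k^{\frac{k}{2}}}{\sqrt{2\pi}\,\Gamma(k/2)} \displaystyle\int_0^{\infty} w^{k}\, e^{-\frac{w^2}{2}\left(k + t^2\right)}\, dw$
To conclude, we make the change of variable $v = k w^2$, which leads to
(52) $f_T(t) = \dfrac{1}{\sqrt{2\pi k}\; 2^{k/2}\,\Gamma(k/2)} \displaystyle\int_0^{\infty} v^{\frac{k+1}{2}-1}\, e^{-\frac{v}{2}\left(1+\frac{t^2}{k}\right)}\, dv$
∎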
A.3 Proving step 2
The first step is quite relevant as it shows that the integral to compute takes various forms depending on the change of variable used.
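A.3.1 Using Gamma function
Using the change of variable $w = \frac{v}{2}\left(1 + \frac{t^2}{k}\right)$ and knowing that $\Gamma(x) = \int_0^{\infty} w^{x-1} e^{-w}\, dw$, we can easily conclude as follows:
(53) $f_T(t) = \dfrac{1}{\sqrt{2\pi k}\; 2^{k/2}\,\Gamma(k/2)} \displaystyle\int_0^{\infty} v^{\frac{k+1}{2}-1}\, e^{-\frac{v}{2}\left(1+\frac{t^2}{k}\right)}\, dv$
(54) $\phantom{f_T(t)} = \dfrac{1}{\sqrt{2\pi k}\; 2^{k/2}\,\Gamma(k/2)} \left(\dfrac{2}{1+\frac{t^2}{k}}\right)^{\frac{k+1}{2}} \displaystyle\int_0^{\infty} w^{\frac{k+1}{2}-1}\, e^{-w}\, dw$
(55) $\phantom{f_T(t)} = \dfrac{1}{\sqrt{2\pi k}\; 2^{k/2}\,\Gamma(k/2)} \left(\dfrac{2}{1+\frac{t^2}{k}}\right)^{\frac{k+1}{2}} \Gamma\!\left(\dfrac{k+1}{2}\right)$
(56) $\phantom{f_T(t)} = \dfrac{\Gamma\!\left(\frac{k+1}{2}\right)}{\sqrt{k\pi}\;\Gamma\!\left(\frac{k}{2}\right)} \left(1+\dfrac{t^2}{k}\right)^{-\frac{k+1}{2}}$
∎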
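The two steps can be verified numerically. The sketch below assumes the integral representation $f_T(t) = \frac{1}{\sqrt{2\pi k}\,2^{k/2}\Gamma(k/2)}\int_0^\infty v^{\frac{k+1}{2}-1} e^{-\frac{v}{2}(1+t^2/k)}\,dv$ and compares a midpoint-rule evaluation of it with the closed-form Student density. The truncation bound, the number of steps and the function names are arbitrary choices made for this illustration.

```python
import math


def student_pdf(t, k):
    """Closed-form Student density with k degrees of freedom."""
    norm = math.gamma((k + 1) / 2) / (math.sqrt(k * math.pi) * math.gamma(k / 2))
    return norm * (1.0 + t * t / k) ** (-(k + 1) / 2)


def student_pdf_by_integration(t, k, upper=100.0, steps=100_000):
    """Midpoint rule on the integral over v, truncated to (0, upper)."""
    shape = (k + 1) / 2
    rate = 0.5 * (1.0 + t * t / k)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * h
        total += v ** (shape - 1.0) * math.exp(-rate * v)
    const = 1.0 / (math.sqrt(2.0 * math.pi * k) * 2.0 ** (k / 2) * math.gamma(k / 2))
    return const * total * h


# Compare both evaluations for a small and a moderate degree of freedom.
for k in (3, 10):
    for t in (0.0, 1.5):
        print(k, t, student_pdf_by_integration(t, k), student_pdf(t, k))
```

The truncation at `upper` is harmless here because the integrand decays at least like $e^{-v/2}$. For much larger $k$ the mass of the integrand moves to the right and the bound would have to be increased.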
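A.3.2 Using Gamma distribution properties
Another way to conclude is to notice the kernel of a gamma distribution pdf, given by $v^{\alpha-1} e^{-\beta v}$, in the integral of equation 39, with parameters $\alpha = \frac{k+1}{2}$ and $\beta = \frac{1}{2}\left(1 + \frac{t^2}{k}\right)$. The generic pdf for the gamma distribution is $\frac{\beta^{\alpha}}{\Gamma(\alpha)}\, v^{\alpha-1} e^{-\beta v}$ and it integrates to one over $[0, \infty)$, hence
(57) $\displaystyle\int_0^{\infty} v^{\alpha-1}\, e^{-\beta v}\, dv = \dfrac{\Gamma(\alpha)}{\beta^{\alpha}}$
(58) $\phantom{\displaystyle\int_0^{\infty} v^{\alpha-1}\, e^{-\beta v}\, dv} = \Gamma\!\left(\dfrac{k+1}{2}\right) \left(\dfrac{2}{1+\frac{t^2}{k}}\right)^{\frac{k+1}{2}}$
(59) $f_T(t) = \dfrac{\Gamma\!\left(\frac{k+1}{2}\right)}{\sqrt{k\pi}\;\Gamma\!\left(\frac{k}{2}\right)} \left(1+\dfrac{t^2}{k}\right)^{-\frac{k+1}{2}}$
∎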
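A.3.3 Using Mellin transform
The integral of equation 39 can be seen as a Mellin transform of the function $e^{-\beta v}$, with $\beta = \frac{1}{2}\left(1 + \frac{t^2}{k}\right)$, evaluated at $s = \frac{k+1}{2}$, whose solution is well known and given by
(60) $\displaystyle\int_0^{\infty} v^{s-1}\, e^{-\beta v}\, dv = \dfrac{\Gamma(s)}{\beta^{s}}$
As previously, this concludes the proof. ∎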
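A.3.4 Using Laplace transform
We can use a result on the Laplace transform of the function $f(v) = v^{\alpha}$ as follows:
(61) $\mathcal{L}(f)(s) = \displaystyle\int_0^{\infty} v^{\alpha}\, e^{-s v}\, dv = \dfrac{\Gamma(\alpha+1)}{s^{\alpha+1}}$
Hence the integral is simply the value of the Laplace transform of the power function $v^{\frac{k-1}{2}}$ taken at $s = \frac{1}{2}\left(1 + \frac{t^2}{k}\right)$, whose value is $\Gamma\!\left(\frac{k+1}{2}\right) / s^{\frac{k+1}{2}}$. Substituting this value into equation 39 enables us to conclude similarly to the proof using the Gamma function ∎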
A.3.5 Using other transforms
Indeed, as the Laplace transform is related to other transforms, we could also prove the result with the Laplace-Stieltjes, Fourier, Z or Borel transform.
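A.4 Sum of independent normals
We want to prove that if $X_1, \dots, X_n$ are i.i.d. $N(\mu, \sigma^2)$ then $\frac{(n-1)s_n^2}{\sigma^2} = \sum_{i=1}^{n} \frac{(X_i - \bar{X}_n)^2}{\sigma^2} \sim \chi^2_{n-1}$. There are multiple proofs of this result: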

Recursive derivation

Cochran’s theorem
A.4.1 Recursive derivation
Let us recall a simple lemma:

If $Z$ is a $N(0,1)$ random variable, then $Z^2 \sim \chi^2_1$; which states that the square of a standard normal random variable is a chi-squared random variable.

If $X_1, \dots, X_n$ are independent and $X_i \sim \chi^2_{p_i}$ then $X_1 + \dots + X_n \sim \chi^2_{p_1 + \dots + p_n}$, which states that independent chi-squared variables add to a chi-squared variable with its degree of freedom equal to the sum of the individual degrees of freedom.
The proof of this simple lemma can be established with variable transformations for the first part and by moment generating functions for the second part. We can now prove the following proposition.
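If $X_1, \dots, X_n$ is a random sample from a $N(\mu, \sigma^2)$ distribution, then

$\bar{X}_n$ and $s_n^2$ are independent random variables.

$\bar{X}_n$ has a $N(\mu, \sigma^2/n)$ distribution, where $N$ denotes the normal distribution.

$(n-1)s_n^2/\sigma^2$ has a chi-squared distribution with $n-1$ degrees of freedom.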
Proof.
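Without loss of generality, we assume that $\mu = 0$ and $\sigma = 1$. We first show that $s_n^2$ can be written only in terms of $(X_i - \bar{X}_n)_{i=2,\dots,n}$. This comes from:
(62) $s_n^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X}_n)^2 = \dfrac{1}{n-1}\left[(X_1 - \bar{X}_n)^2 + \sum_{i=2}^{n} (X_i - \bar{X}_n)^2\right]$
(63) $\phantom{s_n^2} = \dfrac{1}{n-1}\left[\left(\sum_{i=2}^{n} (X_i - \bar{X}_n)\right)^2 + \sum_{i=2}^{n} (X_i - \bar{X}_n)^2\right]$
where we have used the fact that $\sum_{i=1}^{n} (X_i - \bar{X}_n) = 0$, hence $X_1 - \bar{X}_n = -\sum_{i=2}^{n} (X_i - \bar{X}_n)$.
We now show that $\bar{X}_n$ and $s_n^2$ are independent as follows: The joint pdf of the sample $X_1, \dots, X_n$ is given by
(64) $f(x_1, \dots, x_n) = \dfrac{1}{(2\pi)^{n/2}}\, e^{-\frac{1}{2}\sum_{i=1}^{n} x_i^2}, \quad -\infty < x_i < \infty$
We make the change of variables
(65) $y_1 = \bar{x}_n$
(66) $y_2 = x_2 - \bar{x}_n$
(67) $\vdots$
(68) $y_n = x_n - \bar{x}_n$
The Jacobian of the inverse transformation is equal to $n$. Hence
(69) $f(y_1, \dots, y_n) = \dfrac{n}{(2\pi)^{n/2}}\, e^{-\frac{1}{2}\left(y_1 - \sum_{i=2}^{n} y_i\right)^2}\, e^{-\frac{1}{2}\sum_{i=2}^{n} (y_i + y_1)^2}$
(70) $\phantom{f(y_1, \dots, y_n)} = \left[\sqrt{\dfrac{n}{2\pi}}\, e^{-\frac{n y_1^2}{2}}\right] \left[\dfrac{\sqrt{n}}{(2\pi)^{(n-1)/2}}\, e^{-\frac{1}{2}\left[\sum_{i=2}^{n} y_i^2 + \left(\sum_{i=2}^{n} y_i\right)^2\right]}\right]$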
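which proves that $Y_1 = \bar{X}_n$ is independent of $Y_2, \dots, Y_n$, or equivalently, that $\bar{X}_n$ is independent of $s_n^2$. To finalize the proof, we need to derive a recursive equation for $s_n^2$ as follows: We first notice that there is a relationship between $\bar{x}_n$ and $\bar{x}_{n+1}$ as follows:
(71) $\bar{x}_{n+1} = \dfrac{\sum_{i=1}^{n+1} x_i}{n+1} = \dfrac{x_{n+1} + n\,\bar{x}_n}{n+1} = \bar{x}_n + \dfrac{1}{n+1}\,(x_{n+1} - \bar{x}_n)$
We have therefore:
(72) $n\, s_{n+1}^2 = \sum_{i=1}^{n+1} (x_i - \bar{x}_{n+1})^2$
(73) $\phantom{n\, s_{n+1}^2} = \sum_{i=1}^{n+1} \left[(x_i - \bar{x}_n) - \dfrac{1}{n+1}\,(x_{n+1} - \bar{x}_n)\right]^2$
(74) $\phantom{n\, s_{n+1}^2} = \sum_{i=1}^{n+1} (x_i - \bar{x}_n)^2 - \dfrac{2}{n+1}\,(x_{n+1} - \bar{x}_n)^2 + \dfrac{1}{n+1}\,(x_{n+1} - \bar{x}_n)^2$
(75) $\phantom{n\, s_{n+1}^2} = (n-1)\, s_n^2 + \dfrac{n}{n+1}\,(x_{n+1} - \bar{x}_n)^2$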
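The recursion between consecutive sample variances, $n\, s_{n+1}^2 = (n-1)\, s_n^2 + \frac{n}{n+1}(x_{n+1} - \bar{x}_n)^2$, is a purely algebraic identity and can be checked on arbitrary data. The sketch below does so on a simulated Gaussian sample. The helper names and the sample size are hypothetical choices made for this illustration.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def sum_sq_dev(xs):
    """Sum of squared deviations from the sample mean, i.e. (len(xs) - 1) * s^2."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


rng = random.Random(42)
xs = [rng.gauss(0.0, 1.0) for _ in range(50)]

max_gap = 0.0
for n in range(2, len(xs)):
    head = xs[:n]                      # x_1, ..., x_n
    new_point = xs[n]                  # x_{n+1}
    lhs = sum_sq_dev(xs[: n + 1])      # n * s_{n+1}^2
    rhs = sum_sq_dev(head) + n / (n + 1) * (new_point - mean(head)) ** 2
    max_gap = max(max_gap, abs(lhs - rhs))

print(max_gap)
```

The identity holds for any real numbers, not only for Gaussian draws. Normality and independence only matter for the distributional step, where the added term is shown to be an independent chi-squared variable with one degree of freedom.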
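We can now get the result by induction. The result is true for $n = 2$ since $s_2^2 = \frac{(X_2 - X_1)^2}{2}$ with $\frac{X_2 - X_1}{\sqrt{2}} \sim N(0,1)$, hence $s_2^2 \sim \chi^2_1$. Suppose it is true for $n$, that is $(n-1)\, s_n^2 \sim \chi^2_{n-1}$, then since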