Expected number of real roots of random trigonometric polynomials
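We investigate the asymptotics of the expected number of real roots of random trigonometric polynomials
$$X_n(t) = u + \frac{1}{\sqrt{n}} \sum_{k=1}^{n} \bigl(A_k \cos(kt) + B_k \sin(kt)\bigr), \quad t \in [0, 2\pi], \quad u \in \mathbb{R},$$
whose coefficients $A_k$, $B_k$, $k \in \mathbb{N}$, are independent identically distributed random variables with zero mean and unit variance. If $N_n[a,b]$ denotes the number of real roots of $X_n$ in an interval $[a,b] \subseteq [0,2\pi]$, we prove that
$$\lim_{n\to\infty} \frac{\mathbb{E} N_n[a,b]}{n} = \frac{b-a}{\pi\sqrt{3}} \exp\Bigl(-\frac{u^2}{2}\Bigr).$$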
1.1. Main result
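In this paper we are interested in the number of real roots of a random trigonometric polynomial $X_n : [0,2\pi] \to \mathbb{R}$ defined as
$$X_n(t) := u + \frac{1}{\sqrt{n}} \sum_{k=1}^{n} \bigl(A_k \cos(kt) + B_k \sin(kt)\bigr), \tag{1}$$
where $n \in \mathbb{N}$, $u \in \mathbb{R}$, and the coefficients $(A_k)_{k\in\mathbb{N}}$ and $(B_k)_{k\in\mathbb{N}}$ are independent identically distributed random variables with
$$\mathbb{E} A_k = \mathbb{E} B_k = 0, \qquad \mathbb{E}[A_k^2] = \mathbb{E}[B_k^2] = 1. \tag{2}$$
The random variable which counts the number of real roots of $X_n$ in an interval $[a,b] \subseteq [0,2\pi]$ is denoted by $N_n[a,b]$. By convention, the roots are counted with multiplicities and a root at $a$ or $b$ is counted with weight $1/2$. The main result of this paper is as follows.
Theorem 1. Under assumption (2) and for arbitrary $0 \le a < b \le 2\pi$, the expected number of real roots of $X_n$ satisfies
$$\lim_{n\to\infty} \frac{\mathbb{E} N_n[a,b]}{n} = \frac{b-a}{\pi\sqrt{3}} \exp\Bigl(-\frac{u^2}{2}\Bigr). \tag{3}$$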
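As a rough numerical illustration of Theorem 1, the following Python sketch estimates $\mathbb{E} N_n[0,2\pi]/n$ by simulation and compares it with the limit $\frac{b-a}{\pi\sqrt{3}} e^{-u^2/2}$, which equals $2/\sqrt{3}$ for $[a,b]=[0,2\pi]$ and $u=0$. Several choices here are assumptions made only for illustration: the coefficients are taken to be $\pm 1$-valued, roots are approximated by sign changes on a uniform grid (so multiple roots and very close pairs are missed), and the function names are hypothetical.

```python
import math
import random


def expected_roots_mc(n, u=0.0, trials=200, grid=400, seed=0):
    """Monte Carlo estimate of E N_n[0, 2*pi] for +-1-valued coefficients.

    Assumption: roots are approximated by sign changes of X_n between
    consecutive points of a uniform grid on the circle.
    """
    rng = random.Random(seed)
    ts = [2.0 * math.pi * i / grid for i in range(grid)]
    # Precompute cos(k t) and sin(k t); row index is k - 1.
    cos_tab = [[math.cos(k * t) for t in ts] for k in range(1, n + 1)]
    sin_tab = [[math.sin(k * t) for t in ts] for k in range(1, n + 1)]
    scale = 1.0 / math.sqrt(n)
    total = 0
    for _ in range(trials):
        a = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        b = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        xs = [
            u + scale * sum(a[k] * cos_tab[k][i] + b[k] * sin_tab[k][i]
                            for k in range(n))
            for i in range(grid)
        ]
        # X_n is 2*pi-periodic, so the last grid point is compared with the first.
        total += sum(1 for i in range(grid) if xs[i] * xs[(i + 1) % grid] < 0)
    return total / trials


def limit_density(u):
    """Limiting number of roots per unit length, divided by n."""
    return math.exp(-u * u / 2.0) / (math.pi * math.sqrt(3.0))


n = 20
estimate = expected_roots_mc(n)
predicted = 2.0 * math.pi * n * limit_density(0.0)
print("simulated E N_n:", estimate, " limit prediction:", predicted)
```

For small $n$ the simulated value need not match the prediction closely, since (3) is an asymptotic statement and the estimate carries both Monte Carlo and grid error.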
The number of real roots of random trigonometric polynomials has been much studied in the case when the coefficients are Gaussian; see [Dun66], [Das68], [Qua70], [Wil91], [Far90], [Sam78], to mention only a few references, and the books [Far98], [BRS86], where further references can be found. In particular, a proof of (3) in the Gaussian case can be found in [Dun66]. Recently, a central limit theorem for the number of real roots was obtained in [GW11] and then, by a different method employing Wiener chaos expansions, in [AL13]. For random trigonometric polynomials involving only cosines, the asymptotics of the variance (again, only in the Gaussian case) were obtained in [SS12].
All references mentioned above rely heavily on the Gaussian assumption, which allows for explicit computations. Much less is known when the coefficients are non-Gaussian. In the case when the coefficients are uniform on $[-1,1]$ and there are no terms involving the sine, an analogue of (3) was obtained in [Sam76]. The case when the third moment of the coefficients is finite has been studied in [ST83]. After the main part of this work was completed, we became aware of the work of Jamrom [Jam72] and a recent paper by Angst and Poly [AP]. Angst and Poly [AP] proved (3) (with $u=0$) assuming that the coefficients $A_k$ and $B_k$ have finite $5$-th moment and satisfy a certain Cramér-type condition. Although this condition is satisfied by some discrete probability distributions, it excludes the very natural case of $\pm 1$-valued Bernoulli random variables. Another recent work by Azaïs et al. [ADJ] studies the local distribution of zeros of random trigonometric polynomials and also involves conditions stronger than just the existence of the variance. In the paper of Jamrom [Jam72], Theorem 1 (and even its generalization to coefficients from an $\alpha$-stable domain of attraction) is stated without proof. Since full details of Jamrom’s proof do not seem to be available and since there were at least three works following [Jam72] in which the result was established under more restrictive conditions (namely, [Sam76], [ST83], [AP]), it seems of interest to provide a full proof of Theorem 1.
1.2. Method of proof
The proof uses ideas introduced by Ibragimov and Maslova [IM71] (see also the paper by Erdős and Offord [EO56]) who studied the expected number of real zeros of a random algebraic polynomial of the form
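For an interval $[\alpha,\beta] \subset [0,2\pi]$ and $n \in \mathbb{N}$ we introduce the random variable $N_n^*[\alpha,\beta]$ which is the indicator of a sign change of $X_n$ at the endpoints of $[\alpha,\beta]$ and is more precisely defined as follows:
$$N_n^*[\alpha,\beta] := \frac{1}{2} - \frac{1}{2}\operatorname{sgn}\bigl(X_n(\alpha) X_n(\beta)\bigr) = \begin{cases} 0, & \text{if } X_n(\alpha) X_n(\beta) > 0,\\ 1/2, & \text{if } X_n(\alpha) X_n(\beta) = 0,\\ 1, & \text{if } X_n(\alpha) X_n(\beta) < 0. \end{cases}$$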
The proof of Theorem 1 consists of two main steps.
Step 1: Reduce the study of roots to the study of sign changes. Intuition tells us that $N_n[\alpha,\beta]$ and $N_n^*[\alpha,\beta]$ should not differ much if the interval $[\alpha,\beta]$ becomes small. More concretely, one expects that the number of real zeros of $X_n$ on $[0,2\pi]$ should be of order $n$, hence the distance between consecutive roots should be of order $1/n$. This suggests that on an interval $[\alpha,\beta]$ of length $\delta n^{-1}$ (with small $\delta > 0$) the event of having at least two roots (or a root with multiplicity at least $2$) should be very improbable. The corresponding estimate will be given in Lemma 2. For this reason, it seems plausible that on intervals of length $\delta n^{-1}$ the events “there is at least one root”, “there is exactly one root” and “there is a sign change” should almost coincide. A precise statement will be given in Lemma 5. This part of the proof relies heavily on the techniques introduced by Ibragimov and Maslova [IM71] in the case of algebraic polynomials.
Step 2: Count sign changes. We compute the limit of $\mathbb{E} N_n^*[\alpha,\beta]$ on an interval $[\alpha,\beta]$ of length $\delta n^{-1}$. This is done by establishing a bivariate central limit theorem stating that as $n \to \infty$ the random vector $(X_n(\alpha), X_n(\beta))$ converges in distribution to a Gaussian random vector with mean $(u,u)$, unit variance, and covariance $\delta^{-1}\sin\delta$. From this we conclude that $\mathbb{E} N_n^*[\alpha,\beta]$ converges to the probability of a sign change of this Gaussian vector. Approximating the interval $[a,b]$ by a lattice with mesh size $\delta n^{-1}$ and passing to the limits $n \to \infty$ and then $\delta \downarrow 0$ completes the proof. This part of the proof is much simpler than the corresponding argument of Ibragimov and Maslova [IM71].
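The constant $1/(\pi\sqrt{3})$ can be traced through Step 2 numerically in the centered case. Since the coefficients are independent with unit variance, a direct computation gives $\operatorname{Cov}(X_n(\alpha), X_n(\alpha+\delta/n)) = \frac{1}{n}\sum_{k=1}^n \cos(k\delta/n)$, a Riemann sum converging to $\sin\delta/\delta$. The sketch below checks this limit and, for $u=0$ only, evaluates the sign-change probability of a centered unit-variance Gaussian pair with correlation $\rho$ through the classical identity $\arccos(\rho)/\pi$. The function names are hypothetical, and the restriction to $u=0$ is an assumption of this illustration, not of the paper.

```python
import math


def cov_finite_n(n, delta):
    """Cov(X_n(alpha), X_n(alpha + delta/n)) = (1/n) * sum_k cos(k*delta/n)."""
    return sum(math.cos(k * delta / n) for k in range(1, n + 1)) / n


def cov_limit(delta):
    """Riemann-sum limit of the covariance: sin(delta) / delta."""
    return math.sin(delta) / delta


def sign_change_prob_centered(delta):
    """P[Z1 * Z2 < 0] for a centered unit-variance Gaussian pair (case u = 0)
    with correlation sin(delta)/delta, via the identity arccos(rho) / pi."""
    return math.acos(cov_limit(delta)) / math.pi


target = 1.0 / (math.pi * math.sqrt(3.0))
for delta in (1.0, 0.1, 0.01):
    ratio = sign_change_prob_centered(delta) / delta
    print(delta, cov_finite_n(10_000, delta), cov_limit(delta), ratio, target)
```

As $\delta$ decreases, the ratio of the sign-change probability to $\delta$ approaches $1/(\pi\sqrt{3})$, which is the density appearing in (3) for $u=0$.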
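Notation. The common characteristic function of the random variables $(A_k)_{k\in\mathbb{N}}$ and $(B_k)_{k\in\mathbb{N}}$ is denoted by
$$\varphi(t) := \mathbb{E} \exp(i t A_1), \quad t \in \mathbb{R}.$$
Due to the assumptions on the coefficients in (1), we can write
$$\varphi(t) = \exp\Bigl(-\frac{t^2}{2} H(t)\Bigr) \tag{5}$$
for sufficiently small $|t|$, where $H$ is a continuous function with $H(0) = 1$.
In what follows, $C$ denotes a generic positive constant which may change from line to line.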
2. Estimate for $\mathbb{E} N_n[\alpha,\beta] - \mathbb{E} N_n^*[\alpha,\beta]$ on small intervals
In this section we investigate the expected difference between $N_n[\alpha,\beta]$ and $N_n^*[\alpha,\beta]$ on small intervals $[\alpha,\beta]$ of length $\delta n^{-1}$, where $\delta > 0$ is fixed.
2.1. Expectation and variance
The following lemma will be frequently needed.
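Lemma 1. For $j \in \mathbb{N}_0$ let $X_n^{(j)}(t)$ denote the $j$th derivative of $X_n(t)$. The expectation and the variance of $X_n^{(j)}(t)$ are given by
$$\mathbb{E} X_n^{(j)}(t) = \begin{cases} u, & j = 0,\\ 0, & j \ge 1, \end{cases} \qquad \operatorname{Var} X_n^{(j)}(t) = \frac{1}{n} \sum_{k=1}^{n} k^{2j}.$$
Proof. The $j$th derivative of $X_n$ reads as follows:
$$X_n^{(j)}(t) = u\,\mathbf{1}_{\{j=0\}} + \frac{1}{\sqrt{n}} \sum_{k=1}^{n} k^{j} \Bigl(A_k \cos\bigl(kt + \tfrac{j\pi}{2}\bigr) + B_k \sin\bigl(kt + \tfrac{j\pi}{2}\bigr)\Bigr).$$
Recalling that $A_k$ and $B_k$ are independent with zero mean and unit variance, and using $\cos^2 + \sin^2 = 1$, we immediately obtain the required formula. ∎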
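The variance computation can be checked numerically. The sketch below assumes the representation $\frac{d^j}{dt^j}\cos(kt) = k^j \cos(kt + j\pi/2)$ (and likewise for the sine), collects the deterministic weights multiplying $A_k$ and $B_k$ in the $j$th derivative, and compares the sum of their squares with $\frac{1}{n}\sum_{k=1}^n k^{2j}$. The function names are hypothetical.

```python
import math


def derivative_weights(n, j, t):
    """Weights multiplying (A_k, B_k) in the j-th derivative of X_n at t."""
    weights = []
    for k in range(1, n + 1):
        c = k**j * math.cos(k * t + j * math.pi / 2) / math.sqrt(n)
        s = k**j * math.sin(k * t + j * math.pi / 2) / math.sqrt(n)
        weights.append((c, s))
    return weights


def variance_from_weights(n, j, t):
    # Independent coefficients with unit variance: sum the squared weights.
    return sum(c * c + s * s for c, s in derivative_weights(n, j, t))


def variance_formula(n, j):
    """Closed form (1/n) * sum_{k=1}^{n} k^(2j), independent of t."""
    return sum(k ** (2 * j) for k in range(1, n + 1)) / n


for j in range(4):
    print(j, variance_from_weights(15, j, 0.7), variance_formula(15, j))
```

The two columns agree up to rounding for every $t$, reflecting that the variance does not depend on $t$.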
2.2. Estimate for the probability that $X_n^{(j)}$ has many roots
Given any interval $[\alpha,\beta] \subset [0,2\pi]$, denote by $D_m^{(j)}$ the event that the $j$th derivative of $X_n$ has at least $m$ roots in $[\alpha,\beta]$ (the roots are counted with their multiplicities and the roots on the boundary are counted without the weight $1/2$). Here, $j \in \mathbb{N}_0$ and $m \in \mathbb{N}$. A key element in our proofs is an estimate for the probability of this event presented in the next lemma.
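Lemma 2. Fix $j \in \mathbb{N}_0$ and $m \in \mathbb{N}$. For $\delta > 0$ and $n \in \mathbb{N}$ let $[\alpha,\beta] \subset [0,2\pi]$ be any interval of length $\delta n^{-1}$. Then,
where $C > 0$ is a constant independent of $n$, $\delta$, $\alpha$, $\beta$.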
For all , there exists a constant such that the estimate
holds for all , and all intervals .
By Rolle’s theorem, on the event we can find (random) in the interval such that
Thus we may consider the random variable
On the event , the random variables and are equal. On the complement of , . Hence, it follows that
Markov’s inequality yields
Using Hölder’s inequality we may proceed as follows
It remains to find a suitable estimate for . From Lemma 1 it follows that
holds, whence the statement follows immediately. ∎
Fix . There exists a constant such that for all , , ,
For let be a random variable (independent of ) with characteristic function
That is, is the sum of two independent random variables which are uniformly distributed on . Consider the random variable
For all we have
and we estimate the terms on the right-hand side separately.
First term on the RHS of (7). The density of exists and can be expressed using the inverse Fourier transform of its characteristic function denoted in the following by
Using the representation for obtained in the proof of Lemma 1 and recalling that is the characteristic function of and , we obtain
Using Fourier inversion, for every we may write
We used that for every and . The coefficients and are supposed to have zero mean and unit variance. From this we can conclude that
where is a sufficiently small constant. Let be a disjoint partition of defined by
We decompose the integral above as follows:
For the integral over we may write using and ,
The integral over is smaller than a positive constant independent of because we can estimate all terms involving by means of (8) as follows:
where is a small constant and we used that
For with we have
Thus, we can estimate all factors with using (8), whereas for all other factors we use the trivial estimate :
where are small constants and we substituted . Summing up yields
Taking the estimates for together, for every we obtain
2.3. Roots and sign changes
The next lemma contains the main result of this section.
For every there exists such that for all and every interval of length we have the estimate
where is a constant independent of , , , .
A crucial feature of this estimate is that the exponent of $\delta$ is greater than $1$, while the exponent of $n$ is negative.
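Let $D_m^{(j)}$ be the random event defined as in Section 2.2. Observe that, by the convention according to which $N_n[\alpha,\beta]$ counts the roots, the difference between $N_n[\alpha,\beta]$ and $N_n^*[\alpha,\beta]$ vanishes in the following cases:
$X_n$ has no roots in $[\alpha,\beta]$ (in which case $N_n[\alpha,\beta] = N_n^*[\alpha,\beta] = 0$);
$X_n$ has exactly one simple root in $(\alpha,\beta)$ and no roots on the boundary (in which case $N_n[\alpha,\beta] = N_n^*[\alpha,\beta] = 1$);
$X_n$ has no roots in $(\alpha,\beta)$ and one simple root (counted as $1/2$) at either $\alpha$ or $\beta$ (in which case $N_n[\alpha,\beta] = N_n^*[\alpha,\beta] = 1/2$).
In all other cases (namely, on the event $D_2^{(0)}$ when the number of roots in $[\alpha,\beta]$, with multiplicities, but without $1/2$-weights on the boundary, is at least $2$) we only have the trivial estimate
$$0 \le N_n[\alpha,\beta] - N_n^*[\alpha,\beta] \le N_n[\alpha,\beta].$$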
Since and on the event it holds that , we obtain
where in the last step we passed to the -th derivative of using Rolle’s theorem. The upper bounds for the first two terms on the right-hand side follow immediately by Lemma 2, namely
Thus we focus on the last term. For every (and big enough) we can find a number such that
For the estimate for the probability of presented in Lemma 2 is good enough, whereas for we use the fact that for all . This yields
Combining the above estimates yields the statement of the lemma. ∎
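The case distinction at the start of the proof can be made concrete on a toy example. The sketch below is purely illustrative: it replaces $X_n$ by a deterministic polynomial with prescribed roots, counts roots in $[\alpha,\beta]$ with weight $1/2$ at the endpoints, and compares this count with the sign-change indicator, assumed here to take the value $1$ for opposite signs at the endpoints, $1/2$ if an endpoint is a root, and $0$ otherwise. All function names are hypothetical.

```python
def weighted_root_count(roots, alpha, beta):
    """Roots (listed with multiplicity) in [alpha, beta]; endpoints weigh 1/2."""
    total = 0.0
    for r in roots:
        if alpha < r < beta:
            total += 1.0
        elif r == alpha or r == beta:
            total += 0.5
    return total


def sign_change_indicator(x_alpha, x_beta):
    """1/2 - 1/2 * sgn(x_alpha * x_beta): 0, 1/2 or 1."""
    p = x_alpha * x_beta
    sgn = (p > 0) - (p < 0)
    return 0.5 - 0.5 * sgn


def poly_from_roots(roots):
    """Toy stand-in for X_n: the monic polynomial with the given roots."""
    def f(t):
        value = 1.0
        for r in roots:
            value *= t - r
        return value
    return f


def compare(roots, alpha=0.0, beta=1.0):
    f = poly_from_roots(roots)
    return (weighted_root_count(roots, alpha, beta),
            sign_change_indicator(f(alpha), f(beta)))


print(compare([2.0]))       # no roots in [0, 1]
print(compare([0.5]))       # one simple interior root
print(compare([0.0]))       # one simple root at the left endpoint
print(compare([0.3, 0.7]))  # two interior roots: the two counts differ
```

Only the last configuration, which belongs to the event of having at least two roots, produces a discrepancy, in line with the reduction used in the proof.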
3. The related stationary Gaussian process
3.1. Convergence to the Gaussian case
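In the following let $(Z(t))_{t\in\mathbb{R}}$ denote the stationary Gaussian process with $\mathbb{E} Z(t) = u$, $\operatorname{Var} Z(t) = 1$, and covariance
$$\operatorname{Cov}(Z(t), Z(s)) = \frac{\sin(t-s)}{t-s}, \quad t \ne s.$$
The following lemma states the weak convergence of the bivariate distribution of $(X_n(\alpha), X_n(\beta))$ with $\beta - \alpha = \delta n^{-1}$ to $(Z(0), Z(\delta))$, as $n \to \infty$.
Lemma 6. Let $\delta > 0$ be arbitrary but fixed. For $n \in \mathbb{N}$ let $[\alpha,\beta] \subseteq [0,2\pi]$ be an interval of length $\delta n^{-1}$. Then
$$(X_n(\alpha), X_n(\beta)) \xrightarrow{d} (Z(0), Z(\delta)) \quad \text{as } n \to \infty.$$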
To prove the statement it suffices to show the pointwise convergence of the corresponding characteristic functions. Let
denote the characteristic function of . Recall that represents the common characteristic function of the coefficients and . Then the expression reads
Using (5) we have
where we have shortened the writing by defining
After elementary transformations and using that we obtain
where we have abbreviated
Since Riemann sums converge to Riemann integrals, we have
For we have that uniformly in . Hence,
as . The remaining term of the sum
goes to for all fixed , as . Therefore we have
and is nothing but the characteristic function of . This implies the statement. ∎
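The first two moments in this convergence can be examined empirically. The following sketch samples the pair $(X_n(\alpha), X_n(\alpha + \delta/n))$ and compares the empirical means and covariance with $u$ and $\sin\delta/\delta$. It is only a sanity check of the moments, not of the full distributional statement. The $\pm 1$-valued coefficients, the parameter values and the function name are assumptions chosen for illustration.

```python
import math
import random


def empirical_moments(n, alpha, delta, u=0.0, samples=4000, seed=0):
    """Empirical means and covariance of (X_n(alpha), X_n(alpha + delta/n)).

    Assumption: coefficients are independent and +-1-valued.
    """
    rng = random.Random(seed)
    beta = alpha + delta / n
    ca = [math.cos(k * alpha) for k in range(1, n + 1)]
    sa = [math.sin(k * alpha) for k in range(1, n + 1)]
    cb = [math.cos(k * beta) for k in range(1, n + 1)]
    sb = [math.sin(k * beta) for k in range(1, n + 1)]
    root_n = math.sqrt(n)
    xs, ys = [], []
    for _ in range(samples):
        xa = xb = 0.0
        for k in range(n):
            a = rng.choice((-1.0, 1.0))
            b = rng.choice((-1.0, 1.0))
            xa += a * ca[k] + b * sa[k]
            xb += a * cb[k] + b * sb[k]
        xs.append(u + xa / root_n)
        ys.append(u + xb / root_n)
    mx = sum(xs) / samples
    my = sum(ys) / samples
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / samples
    return mx, my, cov


mx, my, cov = empirical_moments(n=50, alpha=1.0, delta=1.0, u=0.5)
print("means:", mx, my, " covariance:", cov, " sin(1)/1:", math.sin(1.0))
```

For finite $n$ the exact covariance is the Riemann sum $\frac{1}{n}\sum_{k=1}^n \cos(k\delta/n)$, so a small systematic gap to $\sin\delta/\delta$ is expected in addition to sampling noise.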
3.2. The Gaussian case
Denote by the analogue of for the process , that is
As , we have
Let be bivariate normally distributed with parameters
Let be arbitrary but fixed. Then,
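In the special case of zero mean, the lemma could be deduced from the explicit formula
$$\mathbb{P}[X \ge 0,\, Y \ge 0] = \frac{1}{4} + \frac{\arcsin \rho}{2\pi}$$
for a centered bivariate normal vector $(X,Y)$ with unit variances and correlation $\rho$, due to F. Sheppard; see [BD88] and the references therein. For general mean, no similar formula seems to exist and we need a different method.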
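Sheppard's formula for a centered bivariate normal pair with unit variances and correlation $\rho$, namely $\mathbb{P}[X \ge 0, Y \ge 0] = \frac{1}{4} + \frac{\arcsin\rho}{2\pi}$, is easy to test by simulation. The sketch below is a minimal Monte Carlo check; the sample size, the seed and the function names are arbitrary choices for illustration.

```python
import math
import random


def sheppard_orthant(rho):
    """Sheppard's formula for P[X >= 0, Y >= 0], centered unit-variance case."""
    return 0.25 + math.asin(rho) / (2.0 * math.pi)


def orthant_mc(rho, samples=100_000, seed=1):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    c = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(samples):
        g1 = rng.gauss(0.0, 1.0)
        g2 = rng.gauss(0.0, 1.0)
        # (x, y) is centered Gaussian with unit variances and correlation rho.
        x, y = g1, rho * g1 + c * g2
        if x >= 0.0 and y >= 0.0:
            hits += 1
    return hits / samples


rho = 0.5
print("formula:", sheppard_orthant(rho), " simulation:", orthant_mc(rho))
```

For $\rho = 1/2$ the formula gives exactly $1/3$, since $\arcsin(1/2) = \pi/6$.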
By the formula for the density of the random vector , we have to investigate the integral