
Expected number of real roots of random trigonometric polynomials

Hendrik Flasche, Institut für Mathematische Statistik, Universität Münster, Orléans–Ring 10, 48149 Münster, Germany
Abstract.

We investigate the asymptotics of the expected number of real roots of random trigonometric polynomials

\[ X_n(t) = u + \frac{1}{\sqrt{n}}\sum_{k=1}^{n}\left(A_k\cos(kt) + B_k\sin(kt)\right), \qquad t\in[0,2\pi],\; u\in\mathbb{R}, \]

whose coefficients $A_k$, $B_k$, $k=1,\dots,n$, are independent identically distributed random variables with zero mean and unit variance. If $N_n[a,b]$ denotes the number of real roots of $X_n$ in an interval $[a,b]$, we prove that

\[ \lim_{n\to\infty}\frac{\mathbb{E} N_n[a,b]}{n} = \frac{b-a}{\pi\sqrt{3}}\exp\left(-\frac{u^2}{2}\right). \]

1. Introduction

1.1. Main result

In this paper we are interested in the number of real roots of a random trigonometric polynomial defined as

\[ X_n(t) := u + \frac{1}{\sqrt{n}}\sum_{k=1}^{n}\left(A_k\cos(kt) + B_k\sin(kt)\right), \tag{1} \]

where $t\in[0,2\pi]$, $u\in\mathbb{R}$, and the coefficients $A_k$ and $B_k$ are independent identically distributed random variables with

\[ \mathbb{E}A_k = \mathbb{E}B_k = 0, \qquad \mathbb{E}[A_k^2] = \mathbb{E}[B_k^2] = 1. \tag{2} \]

The random variable which counts the number of real roots of $X_n$ in an interval $[a,b]$ is denoted by $N_n[a,b]$. By convention, the roots are counted with multiplicities and a root at $a$ or $b$ is counted with weight $1/2$. The main result of this paper is as follows.

Theorem 1.

Under assumption (2) and for arbitrary $u\in\mathbb{R}$ and $0\leq a<b\leq 2\pi$, the expected number of real roots of $X_n$ satisfies

\[ \lim_{n\to\infty}\frac{\mathbb{E}N_n[a,b]}{n} = \frac{b-a}{\pi\sqrt{3}}\exp\left(-\frac{u^2}{2}\right). \tag{3} \]
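Theorem 1 lends itself to a quick numerical sanity check. The following sketch (an illustration only, assuming numpy is available; the helper `count_sign_changes` and all parameter choices are ours) uses Rademacher coefficients, counts sign changes of $X_n$ on a fine grid of $[0,2\pi]$, and compares $\mathbb{E}N_n[0,2\pi]/n$ with the constant $2/\sqrt{3}\approx 1.1547$ predicted by (3) for $u=0$:

```python
import numpy as np

def count_sign_changes(n, u=0.0, grid_mult=20, rng=None):
    """Approximate N_n[0, 2*pi] by counting sign changes of X_n on a grid."""
    rng = np.random.default_rng() if rng is None else rng
    k = np.arange(1, n + 1)
    A = rng.choice([-1.0, 1.0], size=n)  # Rademacher coefficients: +-1 with prob. 1/2
    B = rng.choice([-1.0, 1.0], size=n)
    t = np.linspace(0.0, 2.0 * np.pi, grid_mult * n)
    kt = np.outer(t, k)
    # X_n(t) = u + n^{-1/2} sum_k (A_k cos(kt) + B_k sin(kt)), cf. (1)
    X = u + (np.cos(kt) @ A + np.sin(kt) @ B) / np.sqrt(n)
    return int(np.sum(X[:-1] * X[1:] < 0))

rng = np.random.default_rng(0)
n, trials = 200, 30
mean_roots = np.mean([count_sign_changes(n, rng=rng) for _ in range(trials)])
print(mean_roots / n)  # should be close to 2/sqrt(3) ~ 1.1547 for u = 0
```

Note that the Rademacher case is covered by Theorem 1 although it is excluded by Cramér-type conditions used in earlier work.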

The number of real roots of random trigonometric polynomials has been much studied in the case when the coefficients are Gaussian; see [Dun66], [Das68], [Qua70], [Wil91], [Far90], [Sam78], to mention only a few references, and the books [Far98], [BRS86], where further references can be found. In particular, a proof of (3) in the Gaussian case can be found in [Dun66]. Recently, a central limit theorem for the number of real roots was obtained in [GW11] and then, by a different method employing Wiener chaos expansions, in [AL13]. For random trigonometric polynomials involving only cosines, the asymptotics for the variance (again, only in the Gaussian case) was obtained in [SS12].

All references mentioned above rely heavily on the Gaussian assumption, which allows for explicit computations. Much less is known when the coefficients are non-Gaussian. In the case when the coefficients are uniformly distributed and there are no terms involving the sine, an analogue of (3) was obtained in [Sam76]. The case when the third moment of the coefficients is finite has been studied in [ST83]. After the main part of this work was completed, we became aware of the work of Jamrom [Jam72] and a recent paper by Angst and Poly [AP]. Angst and Poly [AP] proved (3) (with $u=0$) assuming that the coefficients $A_k$ and $B_k$ have a finite moment of sufficiently high order and satisfy a certain Cramér-type condition. Although this condition is satisfied by some discrete probability distributions, it excludes the very natural case of $\pm 1$-valued Bernoulli random variables. Another recent work by Azaïs et al. [ADJ] studies the local distribution of zeros of random trigonometric polynomials and also involves conditions stronger than just the existence of the variance. In the paper of Jamrom [Jam72], Theorem 1 (and even its generalization to coefficients from an $\alpha$-stable domain of attraction) is stated without proof. Since full details of Jamrom’s proof do not seem to be available and since there were at least three works following [Jam72] in which the result was established under more restrictive conditions (namely, [Sam76], [ST83], [AP]), it seems of interest to provide a full proof of Theorem 1.

1.2. Method of proof

The proof uses ideas introduced by Ibragimov and Maslova [IM71] (see also the paper by Erdős and Offord [EO56]) who studied the expected number of real zeros of a random algebraic polynomial of the form

\[ Q_n(t) := \sum_{k=1}^{n} A_k t^k. \]

For an interval $[a,b]$ and $n\in\mathbb{N}$ we introduce the random variable $N_n^*[a,b]$ which is the indicator of a sign change of $X_n$ at the endpoints of $[a,b]$ and is more precisely defined as follows:

\[ N_n^*[a,b] := \frac{1}{2} - \frac{1}{2}\operatorname{sgn}\left(X_n(a)X_n(b)\right) = \begin{cases} 0 & \text{if } X_n(a)X_n(b) > 0,\\ 1/2 & \text{if } X_n(a)X_n(b) = 0,\\ 1 & \text{if } X_n(a)X_n(b) < 0. \end{cases} \tag{4} \]
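In code, the indicator (4) is a three-way comparison on the product of the endpoint values. A minimal sketch (illustrative only; the helper name is ours):

```python
import math

def sign_change_indicator(f, a, b):
    """N*[a,b] from (4): 0, 1/2 or 1 according to the sign of f(a)*f(b)."""
    p = f(a) * f(b)
    if p > 0:
        return 0.0
    if p == 0:
        return 0.5
    return 1.0

# cos changes sign on [0, pi], so the indicator equals 1:
print(sign_change_indicator(math.cos, 0.0, math.pi))
```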

The proof of Theorem 1 consists of two main steps.

Step 1: Reduce the study of roots to the study of sign changes. Intuition tells us that $N_n[a,b]$ and $N_n^*[a,b]$ should not differ much if the interval $[a,b]$ becomes small. More concretely, one expects that the number of real zeros of $X_n$ on $[0,2\pi]$ should be of order $n$, hence the distance between consecutive roots should be of order $1/n$. This suggests that on an interval of length $\delta/n$ (with small $\delta>0$) the event of having at least two roots (or a root with multiplicity at least $2$) should be very unlikely. The corresponding estimate will be given in Lemma 2. For this reason, it seems plausible that on intervals of length $\delta/n$ the events “there is at least one root”, “there is exactly one root” and “there is a sign change” should almost coincide. A precise statement will be given in Lemma 5. This part of the proof relies heavily on the techniques introduced by Ibragimov and Maslova [IM71] in the case of algebraic polynomials.

Step 2: Count sign changes. We compute the limit of $\mathbb{E}N_n^*$ on an interval $[\alpha_n,\beta_n]$ of length $\delta/n$. This is done by establishing a bivariate central limit theorem stating that as $n\to\infty$ the random vector $(X_n(\alpha_n), X_n(\beta_n))$ converges in distribution to a Gaussian random vector with mean $(u,u)$, unit variances, and covariance $\sin(\delta)/\delta$. From this we conclude that $\mathbb{E}N_n^*[\alpha_n,\beta_n]$ converges to the probability of a sign change of this Gaussian vector. Approximating the interval $[a,b]$ by a lattice with mesh size $\delta/n$ and passing to the limits $n\to\infty$ and then $\delta\downarrow 0$ completes the proof. This part of the proof is much simpler than the corresponding argument of Ibragimov and Maslova [IM71].

Notation. The common characteristic function of the random variables $A_k$ and $B_k$ is denoted by

\[ \varphi(t) := \mathbb{E}\exp(itA_1), \qquad t\in\mathbb{R}. \]

Due to the assumptions on the coefficients in (1), we can write

\[ \varphi(t) = \exp\left(-\frac{t^2}{2}H(t)\right) \tag{5} \]

for all sufficiently small $|t|$, where $H$ is a continuous function with $H(0)=1$.

In what follows, $C$ denotes a generic positive constant which may change from line to line.

2. Estimate for $\mathbb{E}N_n[a,b]-\mathbb{E}N_n^*[a,b]$ on small intervals

In this section we investigate the expected difference between $N_n[\alpha,\beta]$ and $N_n^*[\alpha,\beta]$ on small intervals of length $\delta/n$, where $\delta>0$ is fixed.

2.1. Expectation and variance

The following lemma will be frequently needed.

Lemma 1.

For $j\in\mathbb{N}_0$ let $X_n^{(j)}$ denote the $j$-th derivative of $X_n$. The expectation and the variance of $X_n^{(j)}(t)$ are given by

\[ \mathbb{E}X_n^{(j)}(t) = \begin{cases} u, & j = 0,\\ 0, & j\in\mathbb{N}, \end{cases} \qquad \mathbb{V}X_n^{(j)}(t) = \frac{1}{n}\sum_{k=1}^{n}k^{2j}. \]
Proof.

The $j$-th derivative of $X_n$ reads as follows:

\[ X_n^{(j)}(t) - u\mathbb{1}_{\{j=0\}} = \frac{1}{\sqrt{n}}\sum_{k=1}^{n}\left(A_k\frac{d^j}{dt^j}\cos(kt) + B_k\frac{d^j}{dt^j}\sin(kt)\right) = \frac{1}{\sqrt{n}}\sum_{k=1}^{n}k^j\begin{cases} (-1)^{j/2}A_k\cos(kt) + (-1)^{j/2}B_k\sin(kt), & \text{if } j \text{ is even},\\ (-1)^{\frac{j+1}{2}}A_k\sin(kt) + (-1)^{\frac{j-1}{2}}B_k\cos(kt), & \text{if } j \text{ is odd}. \end{cases} \]

Recalling that $A_k$ and $B_k$ have zero mean and unit variance we immediately obtain the required formula. ∎
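Lemma 1 is easy to probe empirically. The sketch below (an illustration, assuming numpy; all parameter choices are ours) compares the sample mean and variance of the first derivative $X_n'(t)$ at a fixed $t$, for standard Gaussian coefficients, with the values $0$ and $\frac{1}{n}\sum_{k=1}^{n}k^2$ predicted by the lemma:

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials, t = 10, 200_000, 0.7
k = np.arange(1, n + 1)
A = rng.standard_normal((trials, n))
B = rng.standard_normal((trials, n))
# First derivative (case j = 1): X_n'(t) = n^{-1/2} sum_k k*(-A_k sin(kt) + B_k cos(kt))
X1 = (A @ (-k * np.sin(k * t)) + B @ (k * np.cos(k * t))) / np.sqrt(n)
predicted = np.sum(k ** 2) / n  # Lemma 1 with j = 1
print(X1.mean(), X1.var(), predicted)
```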

2.2. Estimate for the probability that $X_n^{(j)}$ has many roots

Given any interval $[\alpha,\beta]\subseteq[0,2\pi]$, denote by $D_m^{(j)}$ the event that the $j$-th derivative of $X_n$ has at least $m$ roots in $[\alpha,\beta]$ (the roots are counted with their multiplicities and the roots on the boundary are counted without the weight $1/2$). Here, $j\in\mathbb{N}_0$ and $m\in\mathbb{N}$. A key element in our proofs is an estimate for the probability of this event presented in the next lemma.

Lemma 2.

Fix $j\in\mathbb{N}_0$ and $m\in\mathbb{N}$. For $n\in\mathbb{N}$ and $\delta>0$ let $[\alpha,\beta]\subseteq[0,2\pi]$ be any interval of length $\delta/n$. Then,

\[ \mathbb{P}\left(D_m^{(j)}\right) \leq C\left(\delta^{(2/3)m} + \delta^{-(1/3)m} n^{-(2j+1)/4}\right), \]

where $C$ is a constant independent of $n$, $\delta$, $\alpha$, $\beta$.

Proof.

For arbitrary $T>0$ we may write

\[ \mathbb{P}\left(D_m^{(j)}\right) \leq \mathbb{P}\left(D_m^{(j)}\cap\left\{\frac{|X_n^{(j)}(\beta)|}{n^j}\geq T\right\}\right) + \mathbb{P}\left(\frac{|X_n^{(j)}(\beta)|}{n^j}\leq T\right). \]

The terms on the right-hand side will be estimated in Lemmas 3 and 4 below. Using these lemmas, we obtain

\[ \mathbb{P}\left(D_m^{(j)}\right) \leq C\left[\frac{n^m(\beta-\alpha)^m}{T\,m!}\right]^2 + C\left(T + T^{-1/2}n^{-(2j+1)/4}\right). \]

Setting $T = \delta^{2m/3}$ and recalling that $\beta-\alpha=\delta/n$ yields the statement. ∎

Lemma 3.

For all $j\in\mathbb{N}_0$ and $m\in\mathbb{N}$, there exists a constant $C>0$ such that the estimate

\[ \mathbb{P}\left(D_m^{(j)}\cap\left\{\frac{|X_n^{(j)}(\beta)|}{n^j}\geq T\right\}\right) \leq C\left[\frac{n^m(\beta-\alpha)^m}{T\,m!}\right]^2 \]

holds for all $n\in\mathbb{N}$, $T>0$ and all intervals $[\alpha,\beta]\subseteq[0,2\pi]$.

Proof.

By Rolle’s theorem, on the event $D_m^{(j)}$ we can find (random) points $t_0,\dots,t_{m-1}$ in the interval $[\alpha,\beta]$ such that

\[ X_n^{(j+l)}(t_l) = 0 \quad \text{for all } l\in\{0,\dots,m-1\}. \]

Thus we may consider the random variable

\[ Y_n^{(j)} := \mathbb{1}_{D_m^{(j)}}\int_{t_0}^{\beta}\int_{t_1}^{x_1}\dots\int_{t_{m-1}}^{x_{m-1}} X_n^{(j+m)}(x_m)\,dx_m\dots dx_1. \]

On the event $D_m^{(j)}$, the random variables $Y_n^{(j)}$ and $X_n^{(j)}(\beta)$ are equal. On the complement of $D_m^{(j)}$, $Y_n^{(j)} = 0$. Hence, it follows that

\[ \mathbb{P}\left(D_m^{(j)}\cap\left\{\frac{|X_n^{(j)}(\beta)|}{n^j}\geq T\right\}\right) \leq \mathbb{P}\left(\frac{|Y_n^{(j)}|}{n^j}\geq T\right). \]

Markov’s inequality yields

\[ \mathbb{P}\left(|Y_n^{(j)}|\geq Tn^j\right) \leq \frac{\mathbb{E}|Y_n^{(j)}|^2}{T^2n^{2j}}. \]

Using Hölder’s inequality we may proceed as follows:

\[ \mathbb{P}\left(|Y_n^{(j)}|\geq Tn^j\right) \leq \frac{1}{T^2n^{2j}}\,\frac{(\beta-\alpha)^m}{m!}\,\mathbb{E}\int_{t_0}^{\beta}\int_{t_1}^{x_1}\dots\int_{t_{m-1}}^{x_{m-1}}|X_n^{(j+m)}(x_m)|^2\,dx_m\dots dx_1 \leq \frac{1}{T^2n^{2j}}\left[\frac{(\beta-\alpha)^m}{m!}\right]^2\sup_{x\in[\alpha,\beta]}\mathbb{E}|X_n^{(j+m)}(x)|^2. \]

It remains to find a suitable estimate for $\sup_{x\in[\alpha,\beta]}\mathbb{E}|X_n^{(j+m)}(x)|^2$. From Lemma 1 it follows that

\[ \mathbb{E}|X_n^{(j+m)}(x)|^2 = \mathbb{V}X_n^{(j+m)}(x) = \frac{1}{n}\sum_{k=1}^{n}k^{2(j+m)} \leq C(j,m)\,n^{2(j+m)} \]

holds, whence the statement follows immediately. ∎

Lemma 4.

Fix $j\in\mathbb{N}_0$. There exists a constant $C>0$ such that for all $n\in\mathbb{N}$, $T>0$, $\beta\in[0,2\pi]$,

\[ \mathbb{P}\left(\frac{|X_n^{(j)}(\beta)|}{n^j}\leq T\right) \leq C\left(T + T^{-1/2}n^{-(2j+1)/4}\right). \tag{6} \]
Proof.

For $\lambda>0$ let $\eta$ be a random variable (independent of $X_n$) with characteristic function

\[ \psi(t) := \mathbb{E}[\exp(it\eta)] = \frac{\sin^2(t\lambda)}{t^2\lambda^2}. \]

That is, $\eta$ is the sum of two independent random variables which are uniformly distributed on $[-\lambda,\lambda]$. Consider the random variable

\[ \widetilde{X}_n^{(j)}(\beta) := n^{-j}X_n^{(j)}(\beta) + \eta. \]

For all $z>0$ we have

\[ \mathbb{P}\left(\frac{|X_n^{(j)}(\beta)|}{n^j}\leq T\right) \leq \mathbb{P}\left(\left|\widetilde{X}_n^{(j)}(\beta)\right|\leq T+z\right) + \mathbb{P}\left(|\eta|\geq z\right), \tag{7} \]

and we estimate the terms on the right-hand side separately.

First term on the RHS of (7). The density of $\widetilde{X}_n^{(j)}(\beta)$ exists and can be expressed using the inverse Fourier transform of its characteristic function, denoted in the following by

\[ \widetilde{\varphi}_n(t) := \mathbb{E}\exp\left(it\widetilde{X}_n^{(j)}(\beta)\right). \]

Using the representation for $X_n^{(j)}$ obtained in the proof of Lemma 1 and recalling that $\varphi$ is the characteristic function of $A_k$ and $B_k$, we obtain

\[ |\widetilde{\varphi}_n(t)| = \psi(t)\prod_{k=1}^{n}\left|\varphi\left(\frac{k^jt\cos(k\beta)}{n^{j+1/2}}\right)\right|\left|\varphi\left(\frac{k^jt\sin(k\beta)}{n^{j+1/2}}\right)\right|. \]

Using Fourier inversion, for every $y>0$ we may write

\[ \mathbb{P}\left(\left|\widetilde{X}_n^{(j)}(\beta)\right|\leq y\right) = \frac{2}{\pi}\int_{0}^{\infty}\frac{\sin(yt)}{t}\,\operatorname{Re}\widetilde{\varphi}_n(t)\,dt \leq \frac{2y}{\pi}\int_{0}^{\infty}|\widetilde{\varphi}_n(t)|\,dt. \]

We used that $|\sin(yt)|\leq yt$ for every $t>0$ and $y>0$. The coefficients $A_k$ and $B_k$ are supposed to have zero mean and unit variance. From this we can conclude that

\[ |\varphi(t)| \leq \exp(-t^2/4) \quad \text{for } t\in[-c,c], \tag{8} \]

where $c>0$ is a sufficiently small constant. Let $\Gamma_0,\dots,\Gamma_n$ be a disjoint partition of $[0,\infty)$ defined by

\[ \Gamma_0 := \left[c\,n^{j+1/2},\,\infty\right),\qquad \Gamma_l := \left\{t\geq 0:\frac{c\,n^{j+1/2}}{(l+1)^j}\leq t<\frac{c\,n^{j+1/2}}{l^j}\right\},\quad l=1,\dots,n-1,\qquad \Gamma_n := \left[0,\,c\,n^{1/2}\right). \]

We decompose the integral above as follows:

\[ \mathbb{P}\left(\left|\widetilde{X}_n^{(j)}(\beta)\right|\leq y\right) \leq \frac{2y}{\pi}\sum_{l=0}^{n}I_l, \]

where

\[ I_l := \int_{\Gamma_l}\psi(t)\prod_{k=1}^{n}\left|\varphi\left(\frac{k^jt\cos(k\beta)}{n^{j+1/2}}\right)\right|\left|\varphi\left(\frac{k^jt\sin(k\beta)}{n^{j+1/2}}\right)\right|\,dt. \]

For the integral over $\Gamma_0$ we may write, using $|\varphi|\leq 1$ and $\sin^2(\lambda t)\leq 1$,

\[ I_0 \leq \int_{cn^{j+1/2}}^{\infty}\psi(t)\,dt = \int_{cn^{j+1/2}}^{\infty}\frac{\sin^2(\lambda t)}{\lambda^2t^2}\,dt \leq \frac{1}{c\lambda^2}\,n^{-(j+1/2)}. \]

The integral over $\Gamma_n$ is smaller than a positive constant independent of $n$ because we can estimate all factors involving $\varphi$ by means of (8) as follows:

\[ I_n \leq \int_{0}^{c\sqrt{n}}\psi(t)\exp\left(-\frac{1}{4}\,\frac{t^2}{n^{2j+1}}\sum_{k=1}^{n}k^{2j}\right)dt \leq \int_{0}^{\infty}\exp\left(-t^2\gamma\right)dt \leq C, \]

where $\gamma>0$ is a small constant and we used that

\[ \sum_{k=1}^{n}k^{2j} \sim \frac{n^{2j+1}}{2j+1}\quad\text{as } n\to\infty. \]

For $t\in\Gamma_l$ with $1\leq l\leq n-1$ we have

\[ \left|\frac{l^jt\cos(l\beta)}{n^{j+1/2}}\right| \leq \frac{t\,l^j}{n^{j+1/2}} \leq c,\qquad \left|\frac{l^jt\sin(l\beta)}{n^{j+1/2}}\right| \leq \frac{t\,l^j}{n^{j+1/2}} \leq c, \]

and the same bounds hold with $l$ replaced by any $k\leq l$. Thus, we can estimate all factors with $k\leq l$ using (8), whereas for all other factors we use the trivial estimate $|\varphi|\leq 1$:

\[ I_l \leq \int_{\Gamma_l}\psi(t)\exp\left(-\frac{1}{4}\,\frac{t^2}{n^{2j+1}}\sum_{k=1}^{l}k^{2j}\right)dt \leq \int_{cn^{j+1/2}/(l+1)^j}^{cn^{j+1/2}/l^j}\frac{1}{\lambda^2t^2}\exp\left(-\gamma_1t^2\left(\frac{l}{n}\right)^{2j+1}\right)dt = \frac{1}{\lambda^2}\left(\frac{l}{n}\right)^{j+1/2}\int_{cl^{j+1/2}/(l+1)^j}^{c\sqrt{l}}\frac{1}{u^2}\exp\left(-\gamma_1u^2\right)du \leq \frac{C}{\lambda^2}\left(\frac{l}{n}\right)^{j+1/2}\exp(-\gamma_2l), \]

where $\gamma_1,\gamma_2>0$ are small constants and we substituted $u = t(l/n)^{j+1/2}$. Summing up yields

\[ \sum_{l=1}^{n-1}I_l \leq C\lambda^{-2}n^{-(j+1/2)}\sum_{l=1}^{n-1}l^{j+1/2}\exp(-\gamma_2l) \leq C'\lambda^{-2}n^{-(j+1/2)}. \]

Taking the estimates for $I_0,\dots,I_n$ together, for every $y>0$ we obtain

\[ \mathbb{P}\left(\left|\widetilde{X}_n^{(j)}(\beta)\right|\leq y\right) \leq Cy\left(\frac{1}{\lambda^2}\,n^{-(j+1/2)}+1\right). \tag{9} \]

Second term on the RHS of (7). The second term on the right-hand side of (7) can be estimated using Chebyshev’s inequality (and $\mathbb{E}\eta=0$, $\mathbb{V}\eta=\frac{2}{3}\lambda^2$). Namely, for every $z>0$,

\[ \mathbb{P}(|\eta|\geq z) \leq \frac{\mathbb{V}\eta}{z^2} = \frac{2\lambda^2}{3z^2}. \tag{10} \]

Proof of (6). We arrive at the final estimate setting $y=T+z$ and $z=T$ in (9) and (10), respectively. We obtain that for every $T>0$ and $\lambda>0$ the inequality

\[ \mathbb{P}\left(\frac{|X_n^{(j)}(\beta)|}{n^j}\leq T\right) \leq C\left(T\left(\frac{1}{\lambda^2}\,n^{-(j+1/2)}+1\right)+\frac{\lambda^2}{T^2}\right) \]

holds for a positive constant $C$. This bound can be optimized by choosing a suitable $\lambda$. Setting $\lambda = T^{3/4}n^{-(2j+1)/8}$ the statement of the lemma follows. ∎

2.3. Roots and sign changes

The next lemma contains the main result of this section.

Lemma 5.

For every $\delta>0$ there exists $n_0\in\mathbb{N}$ such that for all $n\geq n_0$ and every interval $[\alpha,\beta]\subseteq[0,2\pi]$ of length $\delta/n$ we have the estimate

\[ 0 \leq \mathbb{E}N_n[\alpha,\beta]-\mathbb{E}N_n^*[\alpha,\beta] \leq C\left(\delta^{4/3}+\delta^{-7}n^{-1/4}\right), \]

where $C$ is a constant independent of $n$, $\delta$, $\alpha$, $\beta$.

A crucial feature of this estimate is that the exponent $4/3$ of $\delta$ in the first term is strictly greater than $1$, while the exponent of $n$ in the second term is negative.

Proof.

Let $D_m^{(j)}$ be the random event defined as in Section 2.2. Observe that due to the convention in which way $N_n$ counts the roots, the difference between $N_n[\alpha,\beta]$ and $N_n^*[\alpha,\beta]$ vanishes in the following cases:

• $X_n$ has no roots in $[\alpha,\beta]$ (in which case $N_n[\alpha,\beta]=N_n^*[\alpha,\beta]=0$);

• $X_n$ has exactly one simple root in $(\alpha,\beta)$ and no roots on the boundary (in which case $N_n[\alpha,\beta]=N_n^*[\alpha,\beta]=1$);

• $X_n$ has no roots in $(\alpha,\beta)$ and one simple root (counted as $1/2$) at either $\alpha$ or $\beta$ (in which case $N_n[\alpha,\beta]=N_n^*[\alpha,\beta]=1/2$).

In all other cases (namely, on the event $D_2^{(0)}$ when the number of roots in $[\alpha,\beta]$, with multiplicities, but without $1/2$-weights on the boundary, is at least $2$) we only have the trivial estimate

\[ 0 \leq N_n[\alpha,\beta]-N_n^*[\alpha,\beta] \leq N_n[\alpha,\beta]. \]

Since $X_n$ has at most $2n$ roots and on the event $\{N_n[\alpha,\beta]\geq m\}$ the event $D_m^{(0)}$ occurs, we obtain

\[ 0 \leq \mathbb{E}N_n[\alpha,\beta]-\mathbb{E}N_n^*[\alpha,\beta] \leq \mathbb{E}\left[N_n[\alpha,\beta]\mathbb{1}_{D_2^{(0)}}\right] \leq \mathbb{P}\left(D_2^{(0)}\right)+\sum_{m=2}^{2n}\mathbb{P}\left(D_m^{(0)}\right) \leq \mathbb{P}\left(D_2^{(0)}\right)+\sum_{m=2}^{21}\mathbb{P}\left(D_m^{(0)}\right)+\sum_{m=2}^{2n-20}\mathbb{P}\left(D_m^{(20)}\right), \]

where in the last step we passed to the $20$-th derivative of $X_n$ using Rolle’s theorem. The upper bounds for the first two terms on the right-hand side follow immediately by Lemma 2, namely

\[ \mathbb{P}\left(D_2^{(0)}\right)+\sum_{m=2}^{21}\mathbb{P}\left(D_m^{(0)}\right) \leq C\left(\delta^{4/3}+\delta^{-7}n^{-1/4}\right). \]

Thus we focus on the last term. For every $\delta\in(0,1)$ (and $n$ big enough) we can find a number $k_0\in\mathbb{N}$ such that

\[ n^2 \leq \delta^{-k_0/3} < \delta^{-2k_0/3} \leq n^5. \]

For $m\leq k_0$ the estimate for the probability of $D_m^{(20)}$ presented in Lemma 2 is good enough, whereas for $m>k_0$ we use the fact that $D_m^{(20)}\subseteq D_{k_0}^{(20)}$ for all $m\geq k_0$. This yields

\[ \sum_{m=2}^{2n-20}\mathbb{P}\left(D_m^{(20)}\right) \leq \sum_{m=2}^{k_0}\mathbb{P}\left(D_m^{(20)}\right)+\sum_{m=k_0+1}^{2n-20}\mathbb{P}\left(D_{k_0}^{(20)}\right) \leq \sum_{m=2}^{k_0}C\left(\delta^{2m/3}+\delta^{-m/3}n^{-10}\right)+2Cn\left(\delta^{2k_0/3}+\delta^{-k_0/3}n^{-10}\right) \leq C\left(\delta^{4/3}+n^{-5}\right)+2Cn\left(n^{-2}+n^{-5}\right) \leq C\left(\delta^{4/3}+\delta^{-7}n^{-1/4}\right). \]

Combining the above estimates yields the statement of the lemma. ∎

3. The related stationary Gaussian process

3.1. Convergence to the Gaussian case

In the following let $(Z(t))_{t\in\mathbb{R}}$ denote the stationary Gaussian process with $\mathbb{E}Z(t)=u$, $\mathbb{V}Z(t)=1$, and covariance

\[ \operatorname{Cov}[Z(t),Z(s)] = \frac{\sin(t-s)}{t-s},\qquad t\neq s. \]

The following lemma states the weak convergence of the bivariate distribution of $(X_n(\alpha_n),X_n(\beta_n))$ with $\beta_n-\alpha_n=\delta/n$ to $(Z(0),Z(\delta))$, as $n\to\infty$.

Lemma 6.

Let $\delta>0$ be arbitrary but fixed. For $n\in\mathbb{N}$ let $[\alpha_n,\beta_n]\subseteq[0,2\pi]$ be an interval of length $\delta/n$. Then

\[ \begin{pmatrix}X_n(\alpha_n)\\ X_n(\beta_n)\end{pmatrix} \to \begin{pmatrix}Z(0)\\ Z(\delta)\end{pmatrix}\quad\text{in distribution as } n\to\infty. \]
Proof.

To prove the statement it suffices to show the pointwise convergence of the corresponding characteristic functions. Let

\[ \varphi_n(\lambda,\mu) := \mathbb{E}\,e^{i(\lambda X_n(\alpha_n)+\mu X_n(\beta_n))} \]

denote the characteristic function of $(X_n(\alpha_n),X_n(\beta_n))$. Recall that $\varphi$ represents the common characteristic function of the coefficients $A_k$ and $B_k$. Then the expression reads

\[ \varphi_n(\lambda,\mu) = e^{iu(\lambda+\mu)}\prod_{k=1}^{n}\varphi\left(\frac{\lambda\cos(k\alpha_n)+\mu\cos(k\beta_n)}{\sqrt{n}}\right)\varphi\left(\frac{\lambda\sin(k\alpha_n)+\mu\sin(k\beta_n)}{\sqrt{n}}\right). \]

Using (5) we have

\[ \varphi_n(\lambda,\mu) = e^{-S_n(\lambda,\mu)},\qquad S_n(\lambda,\mu) := -iu(\lambda+\mu)+\frac{1}{2n}\sum_{k=1}^{n}\left(\lambda\cos(k\alpha_n)+\mu\cos(k\beta_n)\right)^2H_1(n,k)+\frac{1}{2n}\sum_{k=1}^{n}\left(\lambda\sin(k\alpha_n)+\mu\sin(k\beta_n)\right)^2H_2(n,k), \]

where we have shortened the writing by defining

\[ H_1(n,k) := H\left(\frac{\lambda\cos(k\alpha_n)+\mu\cos(k\beta_n)}{\sqrt{n}}\right),\qquad H_2(n,k) := H\left(\frac{\lambda\sin(k\alpha_n)+\mu\sin(k\beta_n)}{\sqrt{n}}\right). \]

After elementary transformations, using that $\cos(k\alpha_n)\cos(k\beta_n)+\sin(k\alpha_n)\sin(k\beta_n)=\cos(k\delta_n)$ with $\delta_n:=\beta_n-\alpha_n=\delta/n$, we obtain

\[ S_n(\lambda,\mu) = -iu(\lambda+\mu)+\frac{1}{n}\sum_{k=1}^{n}H_1(n,k)\left(\frac{\lambda^2}{2}+\frac{\mu^2}{2}+\lambda\mu\cos\left(\frac{k\delta}{n}\right)\right)+R_n(\lambda,\mu), \]

where we have abbreviated

\[ R_n(\lambda,\mu) := \frac{1}{2n}\sum_{k=1}^{n}\left(\lambda\sin(k\alpha_n)+\mu\sin(k\beta_n)\right)^2\left(H_2(n,k)-H_1(n,k)\right). \]

Since Riemann sums converge to Riemann integrals, we have

\[ \lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{n}\left(\frac{\lambda^2}{2}+\frac{\mu^2}{2}+\lambda\mu\cos\left(\frac{k\delta}{n}\right)\right) = \frac{\lambda^2}{2}+\frac{\mu^2}{2}+\lambda\mu\,\frac{\sin\delta}{\delta}. \]

As $n\to\infty$ we have that $H_1(n,k)\to 1$ and $H_2(n,k)\to 1$ uniformly in $k\in\{1,\dots,n\}$, since $H$ is continuous with $H(0)=1$ and the arguments are of order $n^{-1/2}$ uniformly in $k$. Hence,

\[ \left|\frac{1}{n}\sum_{k=1}^{n}\left(H_1(n,k)-1\right)\left(\frac{\lambda^2}{2}+\frac{\mu^2}{2}+\lambda\mu\cos\left(\frac{k\delta}{n}\right)\right)\right| \leq \frac{C}{n}\sum_{k=1}^{n}|H_1(n,k)-1| \longrightarrow 0 \]

as $n\to\infty$. The remaining term

\[ |R_n(\lambda,\mu)| \leq \frac{1}{2n}\sum_{k=1}^{n}C\,|H_2(n,k)-H_1(n,k)| \longrightarrow 0 \]

goes to $0$ for all fixed $\lambda,\mu$, as $n\to\infty$. Therefore we have

\[ S_\infty(\lambda,\mu) := \lim_{n\to\infty}S_n(\lambda,\mu) = -iu(\lambda+\mu)+\frac{\lambda^2+\mu^2}{2}+\lambda\mu\,\frac{\sin(\delta)}{\delta}, \tag{11} \]

and $e^{-S_\infty(\lambda,\mu)}$ is nothing but the characteristic function of $(Z(0),Z(\delta))$. This implies the statement. ∎
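A direct computation behind Lemma 6 is worth making explicit: since the coefficients are uncorrelated with unit variance, $\operatorname{Cov}(X_n(\alpha_n),X_n(\beta_n))=\frac{1}{n}\sum_{k=1}^{n}\big(\cos(k\alpha_n)\cos(k\beta_n)+\sin(k\alpha_n)\sin(k\beta_n)\big)=\frac{1}{n}\sum_{k=1}^{n}\cos(k\delta/n)$, a Riemann sum for $\int_0^1\cos(\delta s)\,ds=\sin(\delta)/\delta$. A minimal numerical check (numpy assumed, illustration only):

```python
import numpy as np

def cov_Xn(n, delta):
    # Exact covariance of (X_n(alpha_n), X_n(beta_n)) with beta_n - alpha_n = delta/n:
    # (1/n) * sum_k cos(k*delta/n); independent of alpha_n and of the coefficient law.
    k = np.arange(1, n + 1)
    return np.mean(np.cos(k * delta / n))

delta = 2.0
print(cov_Xn(10, delta), cov_Xn(10_000, delta), np.sin(delta) / delta)
```

For growing $n$ the exact covariance approaches the sinc kernel $\sin(\delta)/\delta$ of the limiting process $Z$.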

3.2. The Gaussian case

Denote by $\widetilde{N}^*[\alpha,\beta]$ the analogue of $N_n^*[\alpha,\beta]$ for the process $Z$, that is

\[ \widetilde{N}^*[\alpha,\beta] := \frac{1}{2}-\frac{1}{2}\operatorname{sgn}\left(Z(\alpha)Z(\beta)\right). \tag{12} \]
Lemma 7.

As $\delta\downarrow 0$, we have

\[ \mathbb{E}\widetilde{N}^*[0,\delta] = \frac{1}{\pi\sqrt{3}}\exp\left(-\frac{u^2}{2}\right)\delta+o(\delta). \tag{13} \]
Proof.

The bivariate random vector $(Z(0),Z(\delta))$ is normally distributed with mean $(u,u)$ and covariance $\rho:=\sin(\delta)/\delta$. We have

\[ \mathbb{E}\widetilde{N}^*[0,\delta] = \mathbb{P}\left(Z(0)Z(\delta)<0\right) = 2\,\mathbb{P}\left(Z(0)-u<-u,\;Z(\delta)-u>-u\right) \sim \frac{\sqrt{1-\rho^2}}{\pi}\exp\left(-\frac{u^2}{2}\right) \]

as $\delta\downarrow 0$ (equivalently, $\rho\uparrow 1$), where the last step will be justified in Lemma 8 below. Using the Taylor series of $\sin(\delta)/\delta$, which is given by

\[ \frac{\sin(\delta)}{\delta} = 1-\frac{\delta^2}{6}+o(\delta^2)\quad\text{as }\delta\downarrow 0, \tag{14} \]

we obtain the required relation (13). ∎

Lemma 8.

Let $(X,Y)$ be bivariate normally distributed with parameters

\[ \mu = \begin{pmatrix}0\\ 0\end{pmatrix}\quad\text{and}\quad\Sigma = \begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}. \]

Let $u\in\mathbb{R}$ be arbitrary but fixed. Then,

\[ \mathbb{P}(X\leq u,\,Y\geq u) \sim \frac{\sqrt{1-\rho^2}}{2\pi}\exp\left(-\frac{u^2}{2}\right)\quad\text{as }\rho\uparrow 1. \]
Proof.

In the special case $u=0$ the lemma could be deduced from the explicit formula

\[ \mathbb{P}(X\geq 0,\,Y\geq 0) = \frac{1}{4}+\frac{\arcsin\rho}{2\pi} \]

due to W. F. Sheppard; see [BD88] and the references therein. For general $u$, no similar formula seems to exist and we need a different method.
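Sheppard's formula is easy to confirm by simulation. The sketch below (illustration only, numpy assumed; parameter choices are ours) draws correlated standard normals via $X=Z_1$, $Y=\rho Z_1+\sqrt{1-\rho^2}\,Z_2$ and compares the empirical orthant probability with $\frac{1}{4}+\frac{\arcsin\rho}{2\pi}$:

```python
import numpy as np

rng = np.random.default_rng(2)
rho, N = 0.5, 500_000
Z1 = rng.standard_normal(N)
Z2 = rng.standard_normal(N)
X, Y = Z1, rho * Z1 + np.sqrt(1 - rho ** 2) * Z2  # Corr(X, Y) = rho
empirical = np.mean((X >= 0) & (Y >= 0))
sheppard = 0.25 + np.arcsin(rho) / (2 * np.pi)    # equals 1/3 for rho = 1/2
print(empirical, sheppard)
```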

By the formula for the density of the random vector $(X,Y)$, we have to investigate the integral

\[ \int_{x\leq u}\int_{y\geq u}\frac{1}{2\pi\sqrt{1-\rho^2}}\exp\left(-\frac{1}{2(1-\rho^2)}\left(x^2+y^2-2\rho xy\right)\right)dy\,dx \]