Random sensor arrays and Kerdock codes

Accurate detection of moving targets via random sensor arrays and Kerdock codes

Thomas Strohmer Department of Mathematics
University of California
Davis, CA 95616
and  Haichao Wang Department of Mathematics
University of California
Davis, CA 95616
Abstract.

The detection and parameter estimation of moving targets is one of the most important tasks in radar. Arrays of randomly distributed antennas have been popular for this purpose for about half a century. Yet, surprisingly little rigorous mathematical theory exists for random arrays that addresses fundamental questions such as how many targets can be recovered, at what resolution, at which noise level, and with which algorithm. In a different line of research in radar, mathematicians and engineers have invested significant effort into the design of radar transmission waveforms which satisfy various desirable properties. In this paper we bring these two seemingly unrelated areas together. Using tools from compressive sensing we derive a theoretical framework for the recovery of targets in the azimuth-range-Doppler domain via random antenna arrays. In one manifestation of our theory we use Kerdock codes as transmission waveforms and exploit some of their peculiar properties in our analysis. Our paper provides two main contributions: (i) We derive the first rigorous mathematical theory for the detection of moving targets using random sensor arrays. (ii) The transmitted waveforms satisfy a variety of properties that are very desirable and important from a practical viewpoint. Thus our approach does not just lead to useful theoretical insights, but is also of practical importance. Various extensions of our results are derived and numerical simulations confirming our theory are presented.

Key words and phrases:
Sparsity, Radar, Compressive Sensing, Random Sensor Arrays, MIMO, Kerdock Codes

1. Introduction

The detection and parameter estimation of moving targets is one of the most important radar applications. The use of antenna arrays greatly improves our ability to perform this task. Antenna arrays make it possible to estimate not only the range and Doppler frequency, but also the azimuth of the target. Furthermore, using multiple antennas can significantly increase signal strength and thus in turn can greatly enhance accuracy and our ability to locate low contrast targets (“faint” or “weak” targets).

Therefore it does not come as a surprise that in recent years radar systems employing multiple antennas at the transmitter and the receiver (also referred to as MIMO radar, where MIMO stands for multiple-input multiple-output) have attracted enormous attention in the engineering and signal processing community. Despite the significant resources that have been devoted to MIMO radar, there exists fairly little rigorous mathematical theory for MIMO radar that addresses fundamental questions, such as how many targets can be detected at which azimuth-range-Doppler resolution and at what signal-to-noise ratio. Existing theory focuses mainly on the detection of a single target [12, 26]. Only very recently, in the footsteps of compressive sensing, do we see the emergence of a rigorous mathematical theory for MIMO radar that addresses the more realistic and more interesting case of multiple targets [33]. However, for the widely popular case of randomly spaced antennas, the mathematical theory is still in its infancy. (In this paper we only consider the case of co-located transmitters and receivers, which is the most relevant situation in practice. We do not discuss the case of widely separated antennas [15].)

In an independent and seemingly disparate line of research in radar, mathematicians and engineers have devoted substantial efforts to the design of radar transmission waveforms that satisfy a variety of desirable properties. The vast majority of this research has focused on single antenna radar systems, and it is a priori not clear if and how these waveforms can be utilized for MIMO radar. In this paper we bring together these two independent areas of research, MIMO radar with random antenna arrays and radar waveform design, by developing a rigorous mathematical framework for accurate target detection via random arrays, which at the same time utilizes some of the most attractive radar waveforms, such as Kerdock codes.

A radar system illuminates a region of interest in order to detect the location, velocity, and reflectivity of the objects (targets) in its field of view. We consider the following standard (narrowband) radar model [32]. Suppose a target located at range $r$ is traveling with constant velocity $v$ and has reflection coefficient $a$. Suppose further just for the moment that we have only one target, one transmitter and one receiver (in which case we cannot detect direction). After transmitting the signal $s(t)$, the receiver observes the reflected signal

 $$y(t) = a\, s(t-\tau_r)\, e^{2\pi i \omega_v t}, \tag{1}$$

where $\tau_r = 2r/c$ is the round trip time of flight, $c$ is the speed of light, $\omega_v \approx -2\omega_0 v/c$ is the Doppler shift, and $\omega_0$ is the carrier frequency. The basic idea is that the range-velocity information of the target can be inferred from the observed time delay-Doppler shift of $s$ in (1). For only one target this can be done conveniently by correlating the received signal with time-frequency shifted versions of the transmitted signal. Since we are dealing with bandlimited signals, it suffices to consider discrete signals sampled with a properly chosen step size $\Delta_t$. It is therefore common practice to compute

 $$V(\tau,\omega) := \sum_{l} y(l\Delta_t)\, \overline{s(l\Delta_t-\tau)\, e^{2\pi i \omega l}} \tag{2}$$

and then locate the largest value of $|V(\tau,\omega)|$ in order to detect the target in the range-Doppler domain.
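The correlation in (2) can be sketched for a single noise-free target. This is only an illustration under stated assumptions: delay and Doppler are taken on an integer grid of size $n$ (so the modulation is $e^{2\pi i \omega l/n}$), shifts are circular, and the probing signal is a hypothetical cubic-phase sequence that is not one of the waveforms analyzed in this paper. The names `matched_filter`, `tau0` and `omega0` are illustrative.

```python
import cmath

def matched_filter(y, s):
    """Return the grid point (tau, omega) that maximizes |V(tau, omega)|."""
    n = len(s)
    best, best_val = (0, 0), -1.0
    for tau in range(n):
        for omega in range(n):
            # Correlate y with the conjugate of the delayed, modulated copy of s.
            v = sum(
                y[l] * (s[(l - tau) % n]
                        * cmath.exp(2j * cmath.pi * omega * l / n)).conjugate()
                for l in range(n)
            )
            if abs(v) > best_val:
                best, best_val = (tau, omega), abs(v)
    return best

n = 17  # prime length, chosen only for illustration
# Hypothetical probing signal: cubic-phase sequence (assumption, not from the paper).
s = [cmath.exp(2j * cmath.pi * (l ** 3 % n) / n) for l in range(n)]
a, tau0, omega0 = 0.8 - 0.3j, 5, 3  # assumed reflectivity, delay bin, Doppler bin
# Discrete analogue of (1): delayed, Doppler-shifted, scaled copy of s.
y = [a * s[(l - tau0) % n] * cmath.exp(2j * cmath.pi * omega0 * l / n)
     for l in range(n)]
estimate = matched_filter(y, s)
```

For this signal the correlation at the true delay-Doppler bin has magnitude $n|a|$, while all other bins are at most $\sqrt{n}\,|a|$, so the maximum is unambiguous.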

In the presence of multiple targets more sophisticated methods are necessary. In order to resolve azimuth in addition to range and Doppler, we need to employ an array of antennas. We assume an array of transmit and receive antennas that are co-located (also known as mono-static radar) as illustrated in Figure 1. A more detailed description of the setup is postponed to Section 2.

The transmit antennas simultaneously send probing signals, which can differ from antenna to antenna and can be chosen to our specifications. It is convenient to divide the region of interest into range-azimuth-Doppler cells corresponding to distance, direction and velocity, respectively. Let $A$ be a measurement matrix whose columns correspond to the signal recorded at each receive antenna from a single unit-strength scatterer at a specific range-azimuth-Doppler cell. Let $x$ denote a vector whose elements represent the complex amplitudes of the scatterers. In many cases the radar scene is sparse in the sense that only a small fraction (often a very small fraction) of the cells is occupied by the objects of interest. In this case most of the entries of $x$ will be zero, but we do not know which ones, otherwise we would have located the targets already. With $w$ representing a noise vector, we are faced with the linear system of equations

 y=Ax+w, (3)

where $y$ is a vector of measurements collected by the receive antennas over an observation interval. Typically this system will be underdetermined, which implies that it will have infinitely many solutions. What comes to our rescue here is the sparsity of $x$. While conventional radar processing techniques do not take full advantage of sparsity of the radar scene, the recent development of compressive sensing provides us with the possibility to optimally utilize this property [17, 31, 33]. The approach pursued in this paper to obtain a sparse solution of (3) is based on the lasso [36], which gained tremendous popularity in connection with compressive sensing. The lasso solves

 $$\min_{x}\ \tfrac{1}{2}\|Ax-y\|_2^2 + \lambda\|x\|_1, \tag{4}$$

where the parameter $\lambda$ trades off goodness of fit with sparsity.
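To make the role of $\lambda$ in (4) concrete, here is a minimal sketch of one standard way to solve the lasso, iterative soft thresholding (ISTA). It is an illustration only: the paper does not prescribe a solver, the sketch is restricted to real-valued data for brevity, and the function names and the small orthonormal test matrix are assumptions made for this example.

```python
def soft(v, t):
    """Soft-thresholding: shrink v toward zero by t."""
    if abs(v) <= t:
        return 0.0
    return (abs(v) - t) * (1.0 if v > 0 else -1.0)

def lasso_ista(A, y, lam, iters=500):
    """Minimize 0.5*||Ax - y||_2^2 + lam*||x||_1 by proximal gradient steps."""
    m, n = len(A), len(A[0])
    # Step 1/||A||_F^2 is a safe (conservative) choice since ||A||_op <= ||A||_F.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(iters):
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        x = [soft(x[j] - step * g[j], step * lam) for j in range(n)]
    return x

# Illustrative orthonormal matrix (scaled 4x4 Hadamard), so the minimizer
# is known in closed form: soft-thresholding of A^T y at level lam.
H = [[0.5 * v for v in row] for row in
     [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]]
x_true = [3.0, 0.0, 0.0, -2.0]
y = [sum(H[i][j] * x_true[j] for j in range(4)) for i in range(4)]
x_hat = lasso_ista(H, y, lam=0.5)
```

In this orthonormal example the estimate keeps the correct support but each nonzero amplitude is shrunk by $\lambda$, which illustrates the bias that a subsequent least squares step on the detected support removes.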

However, one of the main challenges in bringing compressive sensing theory into radar is that in radar the sensing matrix cannot be freely chosen. Its structure is dictated by the laws of physics on which radar is based. The crux is to carefully balance the desired resolution in the azimuth-range-Doppler domain with the degrees of freedom at our disposal in the formation of $A$, such as the antenna locations and the transmit waveforms.

A great deal of work has been devoted in the mathematical and engineering literature to the design of radar transmission waveforms, see for instance [1, 3, 9, 13, 14, 29, 38] for a small sample of references. The design criteria for radar waveforms can be roughly split into two categories: (i) properties that are important from the viewpoint of hardware implementation, and (ii) properties that are relevant for target detection. Waveforms that fall in the first category are for example polyphase sequences (sequences whose coefficients are complex roots of unity, see e.g. [13]), since they give rise to signals with low peak-to-average power ratio (PAPR; the PAPR of a signal $s$ is defined, up to different normalizations, as the ratio of its peak power $\max_t |s(t)|^2$ to its average power). A low PAPR is desirable in the digital-to-analog conversion of signals, since signals with large PAPR would require expensive power amplifiers. Polyphase sequences also have the advantage that they can be very conveniently implemented in hardware via simple look-up tables. The second category usually includes waveforms with low auto-correlation and (nearly) ideal ambiguity function. Quite a number of polyphase sequences, such as Alltop sequences or Kerdock codes, fall in this category. With the exception of [17] a rigorous mathematical theory concerning the benefits (or even optimality) of such sequences has only existed for the detection of a single target. Common to all these carefully constructed sequences in both categories is that they have been designed for single-antenna radar systems and it is a priori not clear at all if any of these sequences are useful in exploiting the potential benefits of a MIMO radar system.

Our paper provides two main contributions: (i) We derive the first rigorous mathematical theory for the detection of moving targets in the azimuth-range-Doppler domain for random sensor arrays. (ii) The transmitted waveforms satisfy a variety of properties that are very desirable and important from a practical viewpoint. In particular, we show that Kerdock sequences, which would perform very poorly in single-antenna radar, are nearly ideally suited for MIMO radar with randomly spaced antennas. Since Kerdock codes are polyphase sequences, they have excellent PAPR and they are easy to implement in hardware via a simple look-up table. Thus, our framework does not just lead to useful theoretical insights, but also has a very strong practical appeal.

1.1. Connections with prior work and innovations

Random sensor arrays have been around for half a century. The pioneering work [27, 28] by Lo contains a mathematical analysis of important specific characteristics of random arrays, such as sidelobe behavior and antenna gain. There is extensive engineering literature that deals with random arrays in connection with phased array radar technology, e.g. see [11]. Recently, Carin made an explicit connection between the areas of random sensor arrays and compressive sensing [6]. He has shown that algorithms developed in these two seemingly different areas are in fact highly inter-related. The setup in [6] is quite different from ours, since the paper is only concerned with angular resolution (thus transmission waveforms do not even explicitly enter into the model), while it is often crucial in practice to be able to estimate range and Doppler as well. Moreover, the theoretical analysis in [6] follows more of an engineering style and places less emphasis on mathematical rigor. The paper [7] provides interesting results for the angular estimation of stationary targets. Its setup is similar to that in [6], and quite different from ours, as it deals neither with waveform design nor with moving targets.

Kerdock codes have been proposed for radar in [20]. However, in the setting of a single transmit antenna, Kerdock codes are known to perform rather poorly, even in the case of single targets as considered in [20]. (This poor performance is caused by Property (ii) in Theorem 3.1.) Only in the setting of multiple transmit antennas can Kerdock codes exhibit their enormous potential. Our paper utilizes some properties of Kerdock codes proved in [20], but otherwise there is no overlap. In our paper we also present an extension of the main result that allows, for instance, the use of the so-called finite harmonic oscillator system as transmission waveforms. These sequences have been derived in [14], where the authors also briefly sketch their use in a single-antenna radar system for the simple case of a single target. Thus, while our framework allows one to employ the finite harmonic oscillator system, there is essentially no overlap of our results with those in [14].

The paper [33] (coauthored by one of the authors) is closest to this paper, but the setting is in a sense complementary. [33] considers a MIMO radar setting with a very specific (non-random) choice for the antenna locations, but random waveforms, while the current paper deals with randomly spaced antennas, but very specific, deterministic waveforms. At first glance, the difference may appear to be mainly semantic. But in practice, the second setting has many advantages. From an engineer’s viewpoint random waveforms have several drawbacks compared with properly designed deterministic waveforms: they are much harder to implement on a digital device (requiring more complicated hardware, more memory, …); and they exhibit a larger peak-to-average-power ratio. On the other hand it makes no difference from the viewpoint of physics or hardware, if we place the antennas at random or at deterministic locations. In particular, the current paper yields some important insights, which cannot be inferred from [33]: We obtain a theoretical framework for radar operating with random antenna arrays, a technique which has been around for half a century; we show that Kerdock sequences, which are not useful for SISO or SIMO radar (SISO stands for single-input-single-output radar, and SIMO for single-input-multiple-output radar, i.e., a radar with one transmit and multiple receive antennas), are excellent for MIMO radar; our approach allows for waveforms that satisfy a number of properties which are very desirable in practice, and are not satisfied by random waveforms. Indeed, as mentioned above, we also show that the finite harmonic oscillator system “plays well” with random antenna arrays.

Our paper is organized as follows. Section 2 describes the problem setup and the radar model. We review key properties of Kerdock codes in Section 3. Our main theorem is presented in Section 4 and Section 5 is devoted to the proof of the main theorem. In Section 6 we extend our framework to other deterministic waveforms, such as the finite harmonic oscillator system. Numerical simulations are presented in Section 7.

1.2. Notation

For a matrix $A$, we use $A^*$ to denote its adjoint matrix, which is its conjugate transpose. The operator norm of $A$ is the largest singular value of $A$ and is denoted by $\|A\|_{op}$. We denote the $k$-th column of $A$ by $A_k$ and the element in the $i$-th row and $j$-th column by $A_{[i,j]}$. The coherence of $A$ is defined as

 $$\mu(A) := \max_{k\neq l} \frac{|\langle A_k, A_l\rangle|}{\|A_k\|_2\,\|A_l\|_2}. \tag{5}$$
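As a small illustration of definition (5), the following sketch computes the coherence of a matrix stored as a list of columns. The helper names and the toy matrix are assumptions made for this example only.

```python
def inner(u, v):
    """Inner product <u, v> = sum_i u_i * conj(v_i)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def coherence(columns):
    """Largest normalized inner product between two distinct columns, as in (5)."""
    norms = [abs(inner(c, c)) ** 0.5 for c in columns]
    return max(
        abs(inner(columns[k], columns[l])) / (norms[k] * norms[l])
        for k in range(len(columns))
        for l in range(len(columns))
        if k != l
    )

# Toy example: two orthogonal unit vectors plus their (unnormalized) sum.
cols = [[1, 0], [0, 1], [1, 1]]
mu = coherence(cols)  # equals 1/sqrt(2) for this example
```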

The Discrete Fourier Transform (DFT) matrix is written as $F_n$ and the identity matrix as $I_n$. For $x \in \mathbb{C}^n$, let $T_\tau$ denote the circulant translation operator, defined by

 $$T_\tau x(l) = x(l-\tau), \qquad \tau \in \{0,\dots,n-1\}, \tag{6}$$

where $l-\tau$ is understood modulo $n$, and let $M_f$ be the modulation operator defined by

 $$M_f x(l) = x(l)\, e^{2\pi i f l/n}. \tag{7}$$
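The two operators in (6) and (7) are easy to realize on finite sequences. The sketch below (function names are illustrative) also checks the commutation relation $M_f T_\tau = e^{2\pi i f\tau/n}\, T_\tau M_f$, which follows directly from the two definitions and explains why time-frequency shifts are only determined up to a phase factor.

```python
import cmath

def translate(x, tau):
    """Circulant translation T_tau: (T_tau x)(l) = x(l - tau), indices modulo n."""
    n = len(x)
    return [x[(l - tau) % n] for l in range(n)]

def modulate(x, f):
    """Modulation M_f: (M_f x)(l) = x(l) * exp(2*pi*i*f*l/n)."""
    n = len(x)
    return [x[l] * cmath.exp(2j * cmath.pi * f * l / n) for l in range(n)]

n, tau, f = 8, 3, 5
x = [complex(l + 1, -l) for l in range(n)]  # arbitrary test vector

lhs = modulate(translate(x, tau), f)              # M_f T_tau x
phase = cmath.exp(2j * cmath.pi * f * tau / n)
rhs = [phase * v for v in translate(modulate(x, f), tau)]  # phase * T_tau M_f x
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
```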

Acknowledgements

The authors acknowledge generous support by the National Science Foundation under grant DTRA-DMS 1042939 and by DARPA under grant N66001-11-1-4090.

2. Problem Setup

We consider a MIMO radar employing $N_T$ antennas at the transmitter and $N_R$ antennas at the receiver. We assume for convenience that transmitters and receivers are co-located, cf. Figure 1. Furthermore, we assume a coherent propagation scenario, i.e., the element spacing is sufficiently small so that the radar return from a given scatterer is fully correlated across the array. The arrays and all the scatterers are assumed to be in the same 2-D plane. The extension to the 3-D case is straightforward.

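The array manifolds $a_T(\beta)$, $a_R(\beta)$ with randomly spaced antennas are given by

 $$a_T(\beta) = \big[e^{2\pi i p_1\beta}, e^{2\pi i p_2\beta}, \dots, e^{2\pi i p_{N_T}\beta}\big]^T, \tag{8}$$

and

 $$a_R(\beta) = \big[e^{2\pi i q_1\beta}, e^{2\pi i q_2\beta}, \dots, e^{2\pi i q_{N_R}\beta}\big]^T, \tag{9}$$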

where we assume that the relative antenna spacings $p_j$’s and $q_j$’s are i.i.d. uniformly on . The -th transmit antenna repeatedly transmits the signal , which is assumed to be a periodic, continuous-time signal of period-duration seconds and bandwidth . We observe the back-scattered signal over a duration , and since its bandwidth is , it suffices that each receive antenna takes $N_s$ samples, where and . (Actually the received signal will have a somewhat larger bandwidth due to the Doppler effect. However, in practice this increase in bandwidth is small, so we can assume .) It is convenient to introduce the finite-length vector associated with , via .

Let $Z(t;\beta,\tau,f)$ be the noise-free received signal matrix from a unit strength target at direction $\beta$, delay $\tau$, and Doppler $f$ (corresponding to its radial velocity with respect to the radar). Then

 $$Z(t;\beta,\tau,f) = a_R(\beta)\, a_T^T(\beta)\, S_{\tau,f}^T,$$

where $S_{\tau,f}$ is an $N_s \times N_T$ matrix whose columns are the circularly delayed and Doppler shifted transmit signals.

We let be the noise-free vectorized received signal. We set up a discrete azimuth-range-Doppler grid for , and , where and denote the corresponding discretization stepsizes. Using vectors for all grid points we construct a complete response matrix $A$ whose columns are for and , . In other words, $A$ is an $N_R N_s \times N_\tau N_f N_\beta$ matrix with columns

 $$A_{\beta,\tau,f} = a_R(\beta) \otimes S_{\tau,f}\, a_T(\beta). \tag{10}$$
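A single column of the response matrix in (10) can be assembled as follows. This is a sketch under several assumptions that are not fixed by the text above: the antenna positions are drawn from a hypothetical interval $[0,4]$, the waveforms in `S` are arbitrary unimodular placeholders rather than Kerdock waveforms, delay and Doppler are integer grid indices, and the delayed-then-modulated ordering of the time-frequency shift is chosen for illustration.

```python
import cmath
import random

def steering(positions, beta):
    """Array manifold as in (8)/(9): entries exp(2*pi*i*position*beta)."""
    return [cmath.exp(2j * cmath.pi * pos * beta) for pos in positions]

def delay_doppler(S, tau, f):
    """Circularly delay by tau and modulate by f every column of the Ns x NT matrix S."""
    ns = len(S)
    return [[S[(l - tau) % ns][j] * cmath.exp(2j * cmath.pi * f * l / ns)
             for j in range(len(S[0]))] for l in range(ns)]

def response_column(p, q, S, beta, tau, f):
    """Column a_R(beta) (Kronecker) S_{tau,f} a_T(beta) of the response matrix."""
    a_t, a_r = steering(p, beta), steering(q, beta)
    v = [sum(row[j] * a_t[j] for j in range(len(a_t)))
         for row in delay_doppler(S, tau, f)]       # S_{tau,f} a_T(beta)
    return [ar * vl for ar in a_r for vl in v]      # Kronecker product

random.seed(0)
n_t, n_r, n_s = 2, 3, 5
p = [random.uniform(0, 4) for _ in range(n_t)]  # hypothetical transmit positions
q = [random.uniform(0, 4) for _ in range(n_r)]  # hypothetical receive positions
# Placeholder unimodular waveforms, one per transmit antenna (assumption).
S = [[cmath.exp(2j * cmath.pi * (j + 1) * l * l / n_s) for j in range(n_t)]
     for l in range(n_s)]
col = response_column(p, q, S, beta=0.25, tau=1, f=2)
col0 = response_column(p, q, S, beta=0.0, tau=0, f=0)
```

Each column has length $N_R N_s$: it stacks $N_R$ copies of the vector $S_{\tau,f}\, a_T(\beta)$, each multiplied by the phase of the corresponding receive antenna.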

Assume that the radar illuminates a scene consisting of $S$ scatterers located on $S$ points of the $(\beta,\tau,f)$-grid. Let $x$ be a sparse vector whose non-zero elements are the complex amplitudes of the scatterers in the scene. The zero elements correspond to grid points which are not occupied by scatterers. We can then define the radar signal received from this scene by

 y=Ax+w (11)

where $y$ is an $N_R N_s \times 1$ vector, $x$ is an $N_\tau N_f N_\beta \times 1$ sparse vector and $w$ is an $N_R N_s \times 1$ complex Gaussian noise vector. Our goal is to solve for $x$, i.e., to locate the scatterers (and their reflection coefficients) in the azimuth-delay-Doppler domain.

Remark: The assumption that the targets lie on the grid points, while common in compressive sensing, is certainly restrictive. A violation of this assumption will result in a model mismatch, sometimes dubbed gridding error, which can potentially be quite severe [18, 8]. Recently some interesting strategies have been proposed to overcome this gridding error [10, 35]. But these methods – at least in their current form – are not directly applicable to our setting. This model mismatch issue is beyond the scope of this paper and will be addressed in our future research.

3. Kerdock codes

In this section we introduce one particularly useful set of transmission waveforms. Due to the setup in Section 2 it suffices that we deal with discrete, finite-length signals as transmission waveforms. We briefly review the construction of Kerdock codes and some of their fundamental properties. There is a long list of properties that radar waveforms should satisfy. As we will see in this paper, Kerdock codes fulfill many of them. Kerdock codes over $\mathbb{Z}_2$ (i.e., binary Kerdock codes) were originally introduced in [23]. In the seminal paper [4] the authors extend Kerdock codes from $\mathbb{Z}_2$ to $\mathbb{Z}_4$. By doing so, they uncover many fascinating properties of Kerdock codes and reveal numerous deep connections between coding theory, discrete geometry and group theory. In the same paper, the authors also extend Kerdock codes to the setting of $\mathbb{Z}_p$, where $p$ is an odd prime.

Kerdock codes are an example of so-called mutually unbiased bases [39, 34]. Kerdock codes have also been proposed for use in communications engineering [16, 22]. In [20] the authors suggest the use of Kerdock codes for radar, based on the peculiar properties of the discrete ambiguity function associated with Kerdock codes. We emphasize however that for the single transmit antenna radar scenario Kerdock codes would actually perform rather badly, as discussed after Theorem 3.1 and shown by Figure 5 in Section 7. It is only in the setting of multiple transmit antennas that Kerdock codes become useful for radar.

For the remainder of this paper we will only be concerned with Kerdock codes over $\mathbb{Z}_p$. Some of the Kerdock codes over $\mathbb{Z}_p$, namely those corresponding to desarguesian planes in the language of [4], have also been derived earlier in [25] and [24]. A simple way to construct these Kerdock codes is the following, in which they arise as eigenvectors of time-frequency shift operators. Let $p$ be an odd prime number. For each $k = 0,\dots,p-1$ we compute the eigenvector decomposition of $T_1 M_k$ (which always exists, since $T_1 M_k$ is a unitary matrix)

 $$U_{(k)}\, \Sigma_{(k)}\, U_{(k)}^* = T_1 M_k, \tag{12}$$

where the unitary matrix $U_{(k)}$ contains the eigenvectors of $T_1 M_k$ and the diagonal matrix $\Sigma_{(k)}$ the associated eigenvalues. (The attentive reader will have noticed that $U_{(0)}$ is just the DFT matrix $F_p$.) Furthermore, we define $U_{(p)} := I_p$. Now, let $u_{k,j}$ be the $j$-th column of $U_{(k)}$. The set consisting of the vectors $u_{k,j}$, $k=0,\dots,p$, $j=0,\dots,p-1$, forms a Kerdock code over $\mathbb{Z}_p$. There are numerous equivalent ways to derive this Kerdock code, but, as pointed out earlier, not all Kerdock codes over $\mathbb{Z}_p$ are equivalent (see also the comment following Corollary 11.6 in [4]). But we will be a bit sloppy, and simply refer to the Kerdock code constructed above as the Kerdock code.
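The eigenvector construction can be checked numerically without an eigenvalue solver. The sketch below assumes that the operator in (12) is the unit shift composed with the modulation $M_k$, with $T$ and $M$ as in (6) and (7). Under that assumption a short calculation (ours, not taken from [4] or [20]) shows that the quadratic-phase vectors $u(l) = p^{-1/2} e^{2\pi i (a l^2 + j l)/p}$ with $2a \equiv k \pmod p$ are eigenvectors; the code verifies this for $p = 7$. The names `chirp_vector` and `apply_T1_Mk` are illustrative.

```python
import cmath

def chirp_vector(p, k, j):
    """Candidate eigenvector of T_1 M_k: a unit-norm quadratic-phase sequence."""
    a = (k * ((p + 1) // 2)) % p  # a = k/2 modulo the odd prime p
    return [cmath.exp(2j * cmath.pi * ((a * l * l + j * l) % p) / p) / p ** 0.5
            for l in range(p)]

def apply_T1_Mk(x, k):
    """Modulate by frequency k (M_k), then shift circularly by one sample (T_1)."""
    p = len(x)
    mx = [x[l] * cmath.exp(2j * cmath.pi * k * l / p) for l in range(p)]
    return [mx[(l - 1) % p] for l in range(p)]

def eigen_residual(p, k, j):
    """Largest entry of T_1 M_k u - lambda*u, where lambda = <T_1 M_k u, u>."""
    u = chirp_vector(p, k, j)
    v = apply_T1_Mk(u, k)
    lam = sum(vi * ui.conjugate() for vi, ui in zip(v, u))
    return max(abs(vi - lam * ui) for vi, ui in zip(v, u))

p = 7
worst = max(eigen_residual(p, k, j) for k in range(p) for j in range(p))
```

For $k = 0$ the quadratic term vanishes and the vectors reduce to the columns of the DFT matrix, in line with the remark on $U_{(0)}$.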

In the following theorem we collect those key properties of Kerdock codes that are most relevant for radar. These properties are either explicitly proved in [4, 20] or can be derived easily from properties stated in those papers.

Theorem 3.1.

Kerdock codes over $\mathbb{Z}_p$, where $p$ is an odd prime, satisfy the following properties:

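(i) Mutually unbiased bases: For all $k, k'$ and all $j, j'$, there holds:

 $$|\langle u_{k,j}, u_{k',j'}\rangle| = \begin{cases} 1 & \text{if } k=k',\ j=j', \\ 0 & \text{if } k=k',\ j\neq j', \\ \frac{1}{\sqrt{p}} & \text{if } k\neq k'. \end{cases}$$

(ii) Time-frequency “autocorrelation”:
(a) For any fixed $(f,l) \neq (0,0)$ there exists a unique $k_0$ such that

 $$|\langle M_f T_l u_{k_0,j}, u_{k_0,j}\rangle| = 1 \quad \text{for } j=0,\dots,p-1, \tag{13}$$

 $$|\langle M_f T_l u_{k,j}, u_{k,j}\rangle| = 0 \quad \text{for } k\neq k_0. \tag{14}$$

(b) For any fixed $k$, there exist $f_r, l_r$ such that

 $$|\langle M_{f_r} T_{l_r} u_{k,j}, u_{k,j}\rangle| = 1 \quad \text{for } j=0,\dots,p-1. \tag{15}$$

(iii) Time-frequency crosscorrelation: For all $k \neq k'$ and all $f$ and $l$ there holds:

 $$|\langle M_f T_l u_{k,j}, u_{k',j}\rangle| \le \frac{1}{\sqrt{p}} \quad \text{for } j=0,\dots,p-1. \tag{16}$$

(iv) Polyphase property (Roots of unity property) in time and in frequency:
For any , there holds:

 $$u_{k,j}(l) = e^{2\pi i r/p} \quad \text{for some } r\in\{0,\dots,p-1\}. \tag{17}$$

For any , there holds:

 $$\hat{u}_{k,j}(l) = e^{2\pi i r/p} \quad \text{for some } r\in\{0,\dots,p-1\}. \tag{18}$$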
Proof.

Property (i) is proved for instance in Lemma 11.3 in [4]. Properties (ii) and (iii) appear in Theorem 3 of [20]. Statement (17) of property (iv) follows from the comment right after Corollary 11.6 in [4]. Finally, statement (18) of property (iv) follows from (12) together with property (3) and the well-known fundamental relationships

 $$F_p T_x F_p^* = M_{-x}, \qquad F_p M_x F_p^* = T_x.$$

Kerdock codes have been proposed for adaptive radar in [20]. We emphasize again though that Kerdock codes would not be very effective for a radar system with a single transmit antenna (SISO or SIMO radar). This can be easily seen as follows: Assume we only have one antenna that transmits one waveform $u_{k,j}$. Because of (15), $u_{k,j}$ is (up to a constant phase factor) equal to $M_{f_r} T_{l_r} u_{k,j}$ for some $(f_r, l_r)$. In practice this ambiguity prevents us from determining the distance and the velocity of the object, when using Kerdock codes for SISO or SIMO.
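The properties collected in Theorem 3.1, and the ambiguity just described, can be observed numerically. The sketch below uses quadratic-phase vectors $p^{-1/2} e^{2\pi i (a l^2 + j l)/p}$ with $2a \equiv k \pmod p$ as a stand-in for the bases $U_{(0)},\dots,U_{(p-1)}$; this closed form is an assumption of the example, not a formula quoted from [4] or [20], and the helper names are illustrative. It measures the deviation from mutual unbiasedness, the largest crosscorrelation in (16), and the ambiguity (15) for the shift $(f_r, l_r) = (k, 1)$.

```python
import cmath

def chirp_vector(p, k, j):
    """Quadratic-phase stand-in for the j-th vector of the k-th basis (assumption)."""
    a = (k * ((p + 1) // 2)) % p  # a = k/2 modulo the odd prime p
    return [cmath.exp(2j * cmath.pi * ((a * l * l + j * l) % p) / p) / p ** 0.5
            for l in range(p)]

def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def tf_shift(x, f, l):
    """M_f T_l x: circular shift by l followed by modulation with frequency f."""
    p = len(x)
    return [x[(m - l) % p] * cmath.exp(2j * cmath.pi * f * m / p) for m in range(p)]

p = 7
U = [[chirp_vector(p, k, j) for j in range(p)] for k in range(p)]

# Property (i): vectors from different bases have inner product of modulus 1/sqrt(p).
unbiased_gap = max(abs(abs(inner(U[k][j], U[k2][j2])) - p ** -0.5)
                   for k in range(p) for k2 in range(p) if k != k2
                   for j in range(p) for j2 in range(p))

# Property (iii): crosscorrelation between different bases stays at most 1/sqrt(p).
worst_cross = max(abs(inner(tf_shift(U[k][j], f, l), U[k2][j]))
                  for k in range(p) for k2 in range(p) if k != k2
                  for f in range(p) for l in range(p) for j in range(p))

# Property (ii)(b): shifting by one sample and modulating by k only changes the phase.
ambiguity_gap = max(abs(abs(inner(tf_shift(U[k][j], k, 1), U[k][j])) - 1.0)
                    for k in range(p) for j in range(p))
```

The last quantity being zero (up to rounding) is exactly the single-antenna problem: a target at delay 1 and Doppler $k$ is indistinguishable from a target at the origin.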

As a consequence of the aforementioned ambiguity we will not use all of the Kerdock codes as transmission signals for our MIMO radar; instead we will choose one code for each index $k$. The reason is that we need the waveforms to have low time-frequency crosscorrelation, while (16) only holds when $k$ and $k'$ are different.

Definition 3.2 (Kerdock waveforms).

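Let $\{u_{k,j}\}$ be a Kerdock code over $\mathbb{Z}_p$. The Kerdock waveforms $k_0,\dots,k_{p-1}$, where $k_k \in \mathbb{C}^p$, are given by $k_k = u_{k,j}$ for some arbitrary $j$. In other words, for each $k$ we pick an arbitrary vector from the orthonormal basis $U_{(k)}$.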

Note that Kerdock waveforms do not include any unit vectors, since only the first unitary matrices are considered and is strictly less than (recall that ).

4. The main result

As mentioned in the introduction, a standard approach to solve (11) when $x$ is sparse, is

 $$\min_{x}\ \tfrac{1}{2}\|Ax-y\|_2^2 + \lambda\|x\|_1, \tag{19}$$

which is also known as lasso [36]. But instead of (19), we will use the debiased lasso. That means first we compute an approximation $\tilde{I}$ for the support of $x$ by solving (19). This is the detection step. Then, in the estimation step, we “debias” the solution by computing the amplitudes of $x$ via solving the reduced-size least squares problem $\min \|A_{\tilde{I}}\, x_{\tilde{I}} - y\|_2$, where $A_{\tilde{I}}$ is the submatrix of $A$ consisting of the columns corresponding to the index set $\tilde{I}$, and similarly for $x_{\tilde{I}}$.

We assume that the locations of the targets are random. To be precise, we assume that the $S$ nonzero coefficients of $x$ are selected uniformly at random and the phases of the non-zero entries of $x$ are random and uniformly distributed in $[0, 2\pi)$. We will refer to this model as the generic $S$-sparse model.
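A draw from this target model can be sketched as follows. The model only constrains the support and the phases, so the unit magnitudes used here are an assumption of the example, as are the function name and the parameter values.

```python
import cmath
import random

def generic_sparse_vector(n, s, rng):
    """Length-n vector with s nonzero entries at uniformly random positions
    and independent phases uniform on [0, 2*pi). Magnitudes are set to 1
    for illustration; the model itself leaves them unspecified."""
    x = [0j] * n
    for idx in rng.sample(range(n), s):
        x[idx] = cmath.exp(1j * rng.uniform(0.0, 2.0 * cmath.pi))
    return x

rng = random.Random(1)  # fixed seed so the example is reproducible
x = generic_sparse_vector(n=200, s=5, rng=rng)
support = [i for i, v in enumerate(x) if v != 0]
```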

We are now ready to state our main result.

Theorem 4.1.

Consider $y = Ax + w$, where $A$ is defined as in (10) and . Assume that the positions of the transmit and receive antennas $p_j$’s and $q_j$’s are chosen i.i.d. uniformly on at random. Suppose further that each transmit antenna sends a different Kerdock waveform, i.e. the columns of the signal matrix $S$ are different Kerdock waveforms. Choose the discretization stepsizes to be ,, and suppose that

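 $$\max\big(N_R N_T,\ 32 N_T^3 \log(N_\tau N_f N_\beta)\big) \le N_s = N_\tau, \tag{20}$$

and also

 $$\log^2(N_\tau N_f N_\beta) \le N_T \le N_R. \tag{21}$$

If $x$ is drawn from the generic $S$-sparse scatterer model with

 $$S \le \frac{c_0 N_\tau}{\log(N_\tau N_f N_\beta)} \tag{22}$$

for some constant $c_0 > 0$, and if

 $$\min_{k\in I} |x_k| > \frac{8\sqrt{3}\,\sigma}{\sqrt{N_R N_T}} \sqrt{2\log(N_\tau N_f N_\beta)}, \tag{23}$$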

then the solution of the debiased lasso computed with satisfies

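 $$\operatorname{supp}(\tilde{x}) = \operatorname{supp}(x), \tag{24}$$

with probability at least

 $$1 - p_1, \tag{25}$$

and

 $$\frac{\|\tilde{x}-x\|_2}{\|x\|_2} \le \frac{5\sigma\sqrt{3 N_R N_s}}{\|y\|_2} \tag{26}$$

with probability at least

 $$(1-p_1)(1-p_2), \tag{27}$$

where

 $$p_1 = 16 N_\tau^{-2} N_R^{-1} + 8 N_\tau^{-2} N_f^{-2} + 4 N_T N_\tau^{-2} N_f^{-2} + 4 (N_\tau N_f)^{-1} + 4 N_\tau^{-3} N_f^{-3} N_R^{-2} N_T^{-1} + 8 N_T^{-2} (N_\tau N_f N_R)^{-3},$$

and

 $$p_2 = 2 (N_\tau N_f N_\beta)^{-1} \big(2\pi \log(N_\tau N_f N_\beta) + K (N_\tau N_f N_\beta)^{-1}\big) + O\big((N_\tau N_f N_\beta)^{-2\log 2}\big).$$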

Remarks:

1. The condition in (21) is by no means necessary, but rather to make our computation a little cleaner. We could change it into , then the theorem would remain true with a slightly different probability of success.

2. It may seem that the conditions in (20) and (21) are a bit restrictive. But, in practice, our method works with a broad range of parameters as the simulations show in Section 7.
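To get a feeling for the size of the failure probability, the expression for $p_1$ in Theorem 4.1 can simply be evaluated. The sketch below reads $p_1$ as the sum of the six terms stated in the theorem; the function name and the parameter values are chosen for illustration only and are not taken from the simulations in Section 7.

```python
def failure_probability_p1(n_tau, n_f, n_r, n_t):
    """Evaluate p_1 from Theorem 4.1 as the sum of its six terms."""
    return (16 * n_tau ** -2 * n_r ** -1
            + 8 * n_tau ** -2 * n_f ** -2
            + 4 * n_t * n_tau ** -2 * n_f ** -2
            + 4 * (n_tau * n_f) ** -1
            + 4 * n_tau ** -3 * n_f ** -3 * n_r ** -2 * n_t ** -1
            + 8 * n_t ** -2 * (n_tau * n_f * n_r) ** -3)

# Illustrative parameters (assumption): 64 range cells, 8 Doppler cells,
# 8 receive and 4 transmit antennas.
p1 = failure_probability_p1(n_tau=64, n_f=8, n_r=8, n_t=4)
dominant = 4 / (64 * 8)  # the term 4 (N_tau N_f)^{-1}
```

For these values the term $4(N_\tau N_f)^{-1}$ dominates, so the bound on the failure probability is driven mainly by the number of range-Doppler cells.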

5. Proof of the result

To prove Theorem 4.1, we use a theorem by Candès and Plan (Theorem 1.3 in [5]), which requires estimating the operator norm of $A$ and the coherence of $A$. The original theorem only treats the real-valued case; it can be extended to the complex-valued case after some straightforward modifications (see Appendix B in [33]).

5.1. Auxiliary results

We first need the following Bernstein type lemma.

Lemma 5.1.

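Suppose $M = (m_{kj})$ is an $m \times m$ matrix, and $\alpha$ and $\beta$ are two jointly independent random vectors in $\mathbb{C}^m$ with zero means and $|\alpha_j| = |\beta_j| = 1$ for $j = 1,\dots,m$. If $n$ is a positive constant, then for any $t > 0$ and $s > 0$,

1. if $|m_{kj}| \le \sqrt{n}$ for all $k, j$, then

 $$P\big(|\langle M\alpha,\beta\rangle| \le mt\big) \ge 1 - 4m \exp\Big(-\frac{t^2}{4mn}\Big) \tag{28}$$

and

 $$P\big(|\langle M\alpha,\alpha\rangle| \le 2mt\big) \ge 1 - 8m \exp\Big(-\frac{t^2}{2mn}\Big), \tag{29}$$

2. if $m_{jj} = 1$ for $j = 1,\dots,m$ and $|m_{kj}| \le \sqrt{n}$ for $k \neq j$, then

 $$P\big(|\langle M\alpha,\beta\rangle| \le s + mt\big) \ge 1 - 4\exp\Big(-\frac{s^2}{4m}\Big) - 4m \exp\Big(-\frac{t^2}{4mn}\Big), \tag{30}$$

and

 $$P\big(m(1-2t) \le |\langle M\alpha,\alpha\rangle| \le m(1+2t)\big) \ge 1 - 8m \exp\Big(-\frac{t^2}{2mn}\Big). \tag{31}$$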
Proof.
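 $$\langle M\alpha,\beta\rangle = \sum_{k,j=1}^{m} m_{kj}\,\alpha_j \bar{\beta}_k = \sum_{l=1}^{m}\sum_{j=1}^{m} m_{j\oplus l,j}\,\alpha_j \bar{\beta}_{j\oplus l},$$

where $\oplus$ denotes addition modulo $m$. Let us first assume that $|m_{kj}| \le \sqrt{n}$ for all $k, j$.

Since $\alpha$ and $\beta$ are jointly independent, for any fixed $l$ the terms $\alpha_j \bar{\beta}_{j\oplus l}$, $j=1,\dots,m$, are all jointly independent, and it is easy to check that they have zero mean and modulus one. Theorem 4.5 in [21] then gives

 $$P\Big(\Big|\sum_{j=1}^{m} m_{j\oplus l,j}\,\alpha_j \bar{\beta}_{j\oplus l}\Big| \le t\Big) \ge 1 - 4\exp\Big(-\frac{t^2}{4\sum_j |m_{j\oplus l,j}|^2}\Big) \ge 1 - 4\exp\Big(-\frac{t^2}{4mn}\Big). \tag{32}$$

Applying the union bound over all $m$ choices of $l$, we obtain

 $$P\Big(\Big|\sum_{l=1}^{m}\sum_{j=1}^{m} m_{j\oplus l,j}\,\alpha_j \bar{\beta}_{j\oplus l}\Big| \le mt\Big) \ge 1 - 4m\exp\Big(-\frac{t^2}{4mn}\Big), \tag{33}$$

which proves (28).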

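For the second bound we write

 $$\langle M\alpha,\alpha\rangle = \sum_{l=1}^{m}\sum_{j=1}^{m} m_{j\oplus l,j}\,\alpha_j \bar{\alpha}_{j\oplus l}.$$

Different from above, the terms $\alpha_j \bar{\alpha}_{j\oplus l}$ are no longer all jointly independent. But similar to the proof of Theorem 5.1 in [30] and Lemma 3 in [33], we observe that for any $l$ we can split the index set $\{1,\dots,m\}$ into two subsets $T_l^1, T_l^2$, each of size $m/2$, such that the variables $\alpha_j \bar{\alpha}_{j\oplus l}$ are jointly independent for $j \in T_l^1$, and analogously for $T_l^2$. (For convenience we assume here that $m$ is even, but with a negligible modification the argument also applies for odd $m$.) In other words, each of the sums $\sum_{j\in T_l^r} m_{j\oplus l,j}\,\alpha_j \bar{\alpha}_{j\oplus l}$, $r=1,2$, contains only jointly independent terms.

So for each $r = 1, 2$,

 $$P\Big(\Big|\sum_{j\in T_l^r} m_{j\oplus l,j}\,\alpha_j \bar{\alpha}_{j\oplus l}\Big| \le t\Big) \ge 1 - 4\exp\Big(-\frac{t^2}{2mn}\Big), \tag{34}$$

which implies that

 $$P\Big(\Big|\sum_{j} m_{j\oplus l,j}\,\alpha_j \bar{\alpha}_{j\oplus l}\Big| \le 2t\Big) \ge 1 - 8\exp\Big(-\frac{t^2}{2mn}\Big). \tag{35}$$

Again, applying the union bound over all $m$ choices of $l$, we obtain

 $$P\Big(\Big|\sum_{l=1}^{m}\sum_{j=1}^{m} m_{j\oplus l,j}\,\alpha_j \bar{\alpha}_{j\oplus l}\Big| \le 2mt\Big) \ge 1 - 8m\exp\Big(-\frac{t^2}{2mn}\Big), \tag{36}$$

which proves (29).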

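Now let us assume that $m_{jj} = 1$ for $j = 1,\dots,m$ and $|m_{kj}| \le \sqrt{n}$ for $k \neq j$. Then

 $$\langle M\alpha,\beta\rangle = \sum_{j=1}^{m} m_{jj}\,\alpha_j \bar{\beta}_j + \sum_{l=1}^{m-1}\sum_{j=1}^{m} m_{j\oplus l,j}\,\alpha_j \bar{\beta}_{j\oplus l} = \sum_{j=1}^{m} \alpha_j \bar{\beta}_j + \sum_{l=1}^{m-1}\sum_{j=1}^{m} m_{j\oplus l,j}\,\alpha_j \bar{\beta}_{j\oplus l}.$$

Since $\alpha$ and $\beta$ are jointly independent with zero means and unimodular entries,

 $$P\Big(\Big|\sum_{j=1}^{m} \alpha_j \bar{\beta}_j\Big| \le s\Big) \ge 1 - 4\exp\Big(-\frac{s^2}{4m}\Big). \tag{37}$$

Similar to the proof of (33) above, we have that

 $$P\Big(\Big|\sum_{l=1}^{m-1}\sum_{j=1}^{m} m_{j\oplus l,j}\,\alpha_j \bar{\beta}_{j\oplus l}\Big| \le (m-1)t\Big) \ge 1 - 4(m-1)\exp\Big(-\frac{t^2}{4mn}\Big). \tag{38}$$

Together with (37), it follows that

 $$P\big(|\langle M\alpha,\beta\rangle| \le s + (m-1)t\big) \ge 1 - 4\exp\Big(-\frac{s^2}{4m}\Big) - 4(m-1)\exp\Big(-\frac{t^2}{4mn}\Big), \tag{39}$$

which proves (30). Finally,

 $$\langle M\alpha,\alpha\rangle = \sum_{j=1}^{m} m_{jj} + \sum_{l=1}^{m-1}\sum_{j=1}^{m} m_{j\oplus l,j}\,\alpha_j \bar{\alpha}_{j\oplus l} = m + \sum_{l=1}^{m-1}\sum_{j=1}^{m} m_{j\oplus l,j}\,\alpha_j \bar{\alpha}_{j\oplus l},$$

so (31) follows from an argument similar to the proof of (29), combined with the triangle inequality. ∎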
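The proof rests on regrouping the double sum $\langle M\alpha,\beta\rangle$ along the cyclic diagonals of $M$, so that each diagonal contributes a sum of independent terms. The sketch below checks this regrouping numerically. It assumes that $\oplus$ is addition modulo $m$ and uses 0-based indices; the function names and the random test data are illustrative.

```python
import cmath
import random

def inner_direct(M, alpha, beta):
    """<M alpha, beta> = sum over k, j of m_kj * alpha_j * conj(beta_k)."""
    m = len(M)
    return sum(M[k][j] * alpha[j] * beta[k].conjugate()
               for k in range(m) for j in range(m))

def inner_by_diagonals(M, alpha, beta):
    """Same quantity, summed along the cyclic diagonals k = j + l (mod m)."""
    m = len(M)
    return sum(M[(j + l) % m][j] * alpha[j] * beta[(j + l) % m].conjugate()
               for l in range(m) for j in range(m))

rng = random.Random(0)
m = 6
M = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(m)] for _ in range(m)]
# Unimodular entries with uniformly random phases, hence zero mean.
alpha = [cmath.exp(2j * cmath.pi * rng.random()) for _ in range(m)]
beta = [cmath.exp(2j * cmath.pi * rng.random()) for _ in range(m)]
gap = abs(inner_direct(M, alpha, beta) - inner_by_diagonals(M, alpha, beta))
```

Both functions visit every pair $(k, j)$ exactly once; only the order of summation differs, which is what allows the union bound over the $m$ diagonals.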

5.2. Estimation of the Operator Norm

Lemma 5.2.

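Let $A$ be the matrix in Theorem 4.1 satisfying (20). Then

 $$P\big(\|A\|_{op}^2 \le 2 N_f N_R^2 N_T^2\big) \ge 1 - 8 N_\tau^{-2} N_R^{-1}. \tag{40}$$
Proof.

Since $\|A\|_{op}^2 = \|AA^*\|_{op}$, we consider the matrix $B = AA^*$ as a block matrix

 $$\begin{bmatrix} B_{1,1} & B_{1,2} & \dots & B_{1,N_R} \\ \vdots & & \ddots & \vdots \\ B_{N_R,1} & \dots & & B_{N_R,N_R} \end{bmatrix},$$

where the blocks $B_{j,j'}$ are matrices of size $N_s \times N_s$.

Via a simple permutation, we can turn $B$ into a matrix $C$ with blocks $C_{l,l'}$ of size $N_R \times N_R$, where the $(j,j')$-th entry of the block $C_{l,l'}$ is defined as

 $$\begin{aligned} C_{[l,j;l',j']} = B_{[j,l;j',l']} = (AA^*)_{[j,l;j',l']} &= \sum_{\beta}\sum_{\tau}\sum_{f} A_{[j,l;\tau,f,\beta]}\, \overline{A_{[j',l';\tau,f,\beta]}} \\ &= \sum_{\beta} e^{2\pi i (q_j-q_{j'})\beta} \sum_{k=1}^{N_T}\sum_{k'=1}^{N_T} e^{2\pi i (p_k-p_{k'})\beta}\, \langle T_l k_k, T_{l'} k_{k'}\rangle \sum_{m=1}^{N_f} e^{2\pi i (l-l')\Delta_t m \Delta_f} \\ &= N_f\, \delta_{l,l'} \sum_{\beta} e^{2\pi i (q_j-q_{j'})\beta} \sum_{k=1}^{N_T}\sum_{k'=1}^{N_T} e^{2\pi i (p_k-p_{k'})\beta}\, \langle T_l k_k, T_{l'} k_{k'}\rangle. \end{aligned} \tag{41}$$

Then it is easy to see that $C$ is block-diagonal, and all the diagonal blocks are identical. So we only have to bound the first block $C_{1,1}$:

 $$\begin{aligned} C_{[1,j;1,j']} &= N_f \sum_{\beta} e^{2\pi i (q_j-q_{j'})\beta} \sum_{k=1}^{N_T}\sum_{k'=1}^{N_T} e^{2\pi i (p_k-p_{k'})\beta}\, \langle k_k, k_{k'}\rangle \\ &= N_f \sum_{n=0}^{N_R N_T-1} e^{2\pi i (q_j-q_{j'})\frac{n}{N_R N_T}} \sum_{k=1}^{N_T}\sum_{k'=1}^{N_T} e^{2\pi i (p_k-p_{k'})\frac{n}{N_R N_T}}\, \langle k_k, k_{k'}\rangle. \end{aligned}$$

Define $c_n = \sum_{k=1}^{N_T}\sum_{k'=1}^{N_T} e^{2\pi i (p_k-p_{k'})\frac{n}{N_R N_T}}\, \langle k_k, k_{k'}\rangle$, then

 $$C_{1,1} = N_f \sum_{n=0}^{N_R N_T-1} c_n X_n,$$

where $X_n$ is the matrix-valued random variable given by $X_n = a_R\big(\tfrac{n}{N_R N_T}\big)\, a_R^*\big(\tfrac{n}{N_R N_T}\big)$ and therefore $\|X_n\|_{op} = N_R$.

Note that $\langle k_k, k_k\rangle = 1$ and $|\langle k_k, k_{k'}\rangle| \le 1/\sqrt{N_s}$ for $k \neq k'$. Choosing $t = 2\sqrt{\tfrac{N_T}{N_s}\log(N_\tau N_R N_T)}$ in (31) of Lemma 5.1 (with $m = N_T$ and $n = 1/N_s$), we arrive at

 $$P\Big(|c_n| \le N_T\Big(1 + 4\sqrt{\frac{N_T}{N_s}}\sqrt{\log(N_\tau N_R N_T)}\Big)\Big) \ge 1 - 8 N_T (N_\tau N_R N_T)^{-2}.$$

The assumption in (20) implies that $4\sqrt{N_T/N_s}\,\sqrt{\log(N_\tau N_R N_T)} \le 1$, therefore

 $$P\big(|c_n| \le 2N_T\big) \ge 1 - 8 N_T (N_\tau N_R N_T)^{-2}.$$

We apply the union bound over the $N_R N_T$ possibilities associated with $n$ and get

 $$P\big(\max_n |c_n| \le 2N_T\big) \ge 1 - 8 N_\tau^{-2} N_R^{-1},$$

which implies that

 $$P\big(\|C_{1,1}\|_{op} \le 2 N_f N_R^2 N_T^2\big) \ge 1 - 8 N_\tau^{-2} N_R^{-1}.$$

Then the fact that $\|A\|_{op}^2 = \|AA^*\|_{op} = \|C\|_{op} = \|C_{1,1}\|_{op}$ gives the conclusion.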

5.3. Estimation of the Coherence

Lemma 5.3.

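Let $A$ be the matrix in Theorem 4.1 satisfying (20) and (21). Then

 $$\max_{(\tau,f,\beta)\neq(\tau',f',\beta')} \big|\langle A_{\tau,f,\beta}, A_{\tau',f',\beta'}\rangle\big| \le 16 N_R \log(N_\tau N_f N_R N_T) \tag{43}$$

with probability at least

 $$1 - 8 N_\tau^{-2} N_f^{-2} - 4 N_T N_\tau^{-2} N_f^{-2} - 4 (N_\tau N_f)^{-1} - 4 N_\tau^{-3} N_f^{-3} N_R^{-2} N_T^{-1} - 8 N_T^{-2} (N_\tau N_f N_R)^{-3}.$$
Proof.

We need to find an upper bound for

 $$\max \big|\langle A_{\tau,f,\beta}, A_{\tau',f',\beta'}\rangle\big| \quad \text{for } (\tau,f,\beta)\neq(\tau',f',\beta').$$

It follows from the definition (10) that

 $$A_{\tau,f,\beta} = a_R(\beta) \otimes \big(S_{\tau,f}\, a_T(\beta)\big),$$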