A FFAST framework for computing a k-sparse DFT in O(k log k) time using sparse-graph alias codes
Abstract
Given an n-length input signal x, it is well known that its Discrete Fourier Transform (DFT), X, can be computed from all n samples in O(n log n) operations using a Fast Fourier Transform (FFT) algorithm. If the spectrum X is k-sparse (where k << n), can we do better? We show that asymptotically in k and n, when k is sublinear in n (precisely, k = O(n^δ), where 0 < δ < 1), and the support of the non-zero DFT coefficients is uniformly random, our proposed FFAST (Fast Fourier Aliasing-based Sparse Transform) algorithm computes the DFT X from O(k) samples in O(k log k) arithmetic operations, with high probability. Further, the constants in the big-Oh notation for both the sample and computational cost are small, e.g., when δ ≤ 0.99, which essentially covers almost all practical cases of interest, the sample cost is less than 4k.
Our approach is based on filterless subsampling of the input signal x using a set of carefully chosen uniform subsampling patterns guided by the Chinese Remainder Theorem (CRT). The idea is to cleverly exploit, rather than avoid, the resulting aliasing artifacts induced by subsampling. Specifically, the subsampling operation on x is designed to create aliasing patterns on the spectrum X that “look like” parity-check constraints of a good erasure-correcting sparse-graph code. Next, we show that computing the sparse DFT is equivalent to decoding these sparse-graph codes. These codes further allow for fast peeling-style decoding. The resulting DFT computation is low in both sample complexity and decoding complexity. We analytically connect our proposed CRT-based aliasing framework to random sparse-graph codes, and analyze the performance of our algorithm using density evolution techniques from coding theory. We also provide simulation results that are in tight agreement with our theoretical findings.
I Introduction
Spectral analysis using the Discrete Fourier Transform (DFT) has been of universal importance in engineering and scientific applications for a long time. The Fast Fourier Transform (FFT) is the fastest known way to compute the DFT of an arbitrary n-length signal, and has a computational complexity of O(n log n)^{1}^{1}1Recall that a single-variable function f(n) is said to be O(g(n)), if for sufficiently large n the function is bounded above by a scaled version of g(n), i.e., |f(n)| ≤ c g(n) for some constant c. Similarly, f(n) is o(g(n)) if the growth rate of f(n), as n → ∞, is negligible as compared to that of g(n), i.e., lim_{n→∞} f(n)/g(n) = 0.. Many applications of interest involve signals, e.g. audio, image, video data, biomedical signals etc., which have a sparse Fourier spectrum. In such cases, a small subset of the spectral components typically contains most or all of the signal energy, with most spectral components being either zero or negligibly small. If the n-length DFT X is k-sparse, where k = o(n), can we do better in terms of both the sample and the computational complexity of computing the sparse DFT? We answer this question affirmatively. In particular, we show that asymptotically in k and n, when k is sublinear in n (precisely, k = O(n^δ), where 0 < δ < 1), and the support of the non-zero DFT coefficients is uniformly random, our proposed FFAST (Fast Fourier Aliasing-based Sparse Transform) algorithm computes the DFT X from O(k) judiciously chosen samples in O(k log k) arithmetic operations, with high probability. Further, the constants in the big-Oh notation for both the sample and computational cost are small, e.g., when δ ≤ 0.99, which essentially covers almost all practical cases of interest, the sample cost is less than 4k.
At a high level, our idea is to cleverly exploit rather than avoid the aliasing resulting from subsampling, and to shape the induced spectral artifacts to look like parity constraints of “good” erasure-correcting codes, e.g., Low-Density Parity-Check (LDPC) codes [1] and fountain codes [2], that have both low computational complexity and are near-capacity achieving. Towards our goal of shaping the spectral artifacts to look like parity constraints of good erasure-correcting codes, we are challenged by not being able to design any arbitrary spectral code at will, as we can control only the subsampling operation on the time-domain signal. The key idea is to design subsampling patterns, guided by the Chinese-Remainder-Theorem (CRT) [3], that create the desired code-like aliasing patterns. Based on the qualitative nature of the subsampling patterns needed, our analysis is decomposed into two regimes (see Section VI and Section VII for more details):

The “very-sparse” regime, where 0 < δ ≤ 1/3. For the very-sparse regime, the subsampling patterns are based on relatively coprime numbers.

The “less-sparse” regime, where 1/3 < δ < 1. The subsampling patterns for the less-sparse regime comprise “cyclically-shifted” overlapping coprime integers.
Our approach is summarized in Fig. 1. The n-length input signal x is uniformly subsampled, through a small number^{2}^{2}2We show that the number of stages d depends on the sparsity index δ, and ranges from 3 to 14 as δ increases from 1/3 to 0.9999, as shown in Table II. of stages, say d. Each stage subsamples the input signal and its circularly shifted version by a CRT-guided set (see Sections VI and VII) of sampling periods, to generate f_i samples per subsampling path, for stages i = 0, 1, …, d − 1. Next, the large n-length DFT X is synthesized from the much smaller f_i-length DFTs, using the FFAST back-end peeling decoder. For the entire range of practical interest of sublinear sparsity, i.e., δ ≤ 0.99, the FFAST algorithm computes the k-sparse n-length DFT in O(k log k) computations from less than 4k samples. It is gratifying to note that both the sample complexity and the computational complexity depend only on the sparsity parameter k, which is sublinear in n.
We emphasize the following caveats. First, our analytical results are probabilistic, and hold for asymptotic values of k and n, with a success probability that approaches 1 asymptotically. This contrasts with the FFT algorithm, which works deterministically for all values of k and n. Secondly, we assume a uniformly random model for the support of the non-zero DFT coefficients. Lastly, we require the signal length n to be a product of a few (typically 3 to 9) distinct primes^{3}^{3}3This is not a major restriction for two reasons. Firstly, in many problems of interest, the choice of n is available to the system designer, and choosing n to be a power of 2 is often invoked only to take advantage of the readily-available radix-2 FFT algorithms. Secondly, by truncating or zero-padding the given signal, by a constant number of samples, one can obtain a modified signal of a desired length. The DFT of the modified signal has k dominant non-zero DFT coefficients at the same frequencies as the original signal but with additional noise-like sidelobes, due to the windowing effect, and can be decoded using a noise-robust version of the FFAST algorithm [4].. In effect, our algorithm trades off sample and computational complexity for asymptotically-zero probability-of-failure guarantees, under a uniformly random model of the support. The FFAST algorithm is applicable whenever k is sublinear in n (i.e. k is o(n)), but is obviously most attractive when k is much smaller than n. As a concrete example, when k is several orders of magnitude smaller than n, our algorithm achieves savings in both the computation and the number of input samples by correspondingly large factors over the standard FFT (see [5] for the computational complexity of a prime-factor FFT algorithm). This can be a significant advantage in many existing applications, as well as enable new classes of applications not thought practical so far.
The focus of this paper is on signals having an exactly-sparse DFT. Our motivation for this focused study is threefold: (i) to provide conceptual clarity about our proposed approach in a noiseless setting; (ii) to present our deterministic subsampling front-end measurement subsystem as a viable alternative to the class of randomized measurement matrices popular in the compressive-sensing literature [6, 7]; and (iii) to explore the fundamental limits on both the sample complexity and the computational complexity of the exactly-sparse DFT problem, which is of intellectual interest. In addition, the key insights derived from analyzing the exactly-sparse signal model apply more broadly to the noisy setting (see the discussion in Section X), i.e., where the observations are further corrupted by noise [4].
The rest of the paper is organized as follows. Section II states the problem and introduces important notation. Section III presents our main results, and the related literature is reviewed in Section IV. Section V exemplifies the mapping of the problem of computing a sparse DFT to that of decoding over an appropriate sparse-graph code. In Section VI and Section VII we analyze the performance of the FFAST algorithm for the very-sparse and the less-sparse regimes respectively. Section IX provides simulation results that empirically corroborate our theoretical findings. The paper is concluded in Section X, with a few remarks about extending the FFAST framework for noise robustness and other applications.
II Problem formulation, notation and preliminaries
II-A Problem formulation
Consider an n-length discrete-time signal x that is a sum of k complex exponentials, i.e., its n-length discrete Fourier transform X has k non-zero coefficients:

x[p] = (1/n) ∑_{q=0}^{k−1} X[ℓ_q] e^{ı2πℓ_q p/n},  p = 0, 1, …, n − 1,  (1)
where the discrete frequencies ℓ_q ∈ {0, 1, …, n − 1} and the amplitudes X[ℓ_q] ∈ ℂ, for q = 0, 1, …, k − 1. We further assume that the discrete frequencies ℓ_q are uniformly random between 0 and n − 1 and that the amplitudes are drawn from some continuous distribution over the complex numbers. Under these assumptions, we consider the problem of identifying the k unknown frequencies and the corresponding complex amplitudes from the time-domain samples of x. A straightforward solution is to compute the n-length DFT X using a standard FFT algorithm [3], and locate the non-zero coefficients. Such an algorithm requires n samples and O(n log n) computations. When the DFT X is known to be exactly k-sparse and k << n, computing all n DFT coefficients seems redundant.
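To make the signal model concrete, here is a small illustrative sketch (not from the paper; n = 20 and k = 5 are arbitrary choices) that builds a signal as a sum of k complex exponentials and checks that its DFT is exactly k-sparse. A naive O(n²) DFT stands in for the FFT.

```python
import cmath
import random

def dft(x):
    """Naive O(n^2) DFT: X[j] = sum_p x[p] * exp(-2*pi*i*j*p/n)."""
    n = len(x)
    return [sum(x[p] * cmath.exp(-2j * cmath.pi * j * p / n) for p in range(n))
            for j in range(n)]

n, k = 20, 5
random.seed(1)
support = sorted(random.sample(range(n), k))       # uniformly random support
amps = {l: complex(1 + random.random(), random.random()) for l in support}

# Signal model of Eq. (1): a sum of k complex exponentials.
x = [sum(amps[l] * cmath.exp(2j * cmath.pi * l * p / n) for l in support) / n
     for p in range(n)]

X = dft(x)
nonzero = sorted(j for j in range(n) if abs(X[j]) > 1e-9)
print(nonzero == support)  # -> True
```

Locating the non-zero coefficients this way costs a full-length transform; the rest of the paper is about avoiding exactly that.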
In this paper, we address the problem of designing an algorithm to compute the k-sparse n-length DFT X of x in the asymptotic regime of k and n. We would like the algorithm to have the following features:

it takes as few input samples m of x as possible.

it has a low computational cost, that is possibly a function only of the number of input samples m.

it is applicable for the entire sub-linear regime, i.e., for all k = O(n^δ), where 0 < δ < 1.

it computes the DFT X with a probability of failure vanishing to 0 as k, n become large.
In Table I, we provide the notation and definitions of the important parameters used in the rest of the paper.
Notation  Description 

n  Signal length.  
k  Number of non-zero coefficients in the DFT X.  
δ  Sparsity-index: k = O(n^δ).  
m  Sample complexity: number of samples of the input x used by the FFAST algorithm.  
r  Oversampling ratio m/k: input samples per non-zero DFT coefficient of the signal.  
d  Number of stages in the subsampling “front-end” architecture of the FFAST.  
f_i  Number of output samples at each subsampling path of stage i of the FFAST front-end. 
II-B The Chinese-Remainder-Theorem (CRT)
The CRT plays an important role in our proposed FFAST architecture as well as in the FFAST decoder. For integers a and n, we use (a)_n to denote the modulo operation, i.e., (a)_n ≡ a mod n.
Theorem II.1 (Chinese-Remainder-Theorem [3]).
Suppose n_0, n_1, …, n_{d−1} are pairwise coprime positive integers and N = ∏_{i=0}^{d−1} n_i. Then, every integer ‘a’ between 0 and N − 1 is uniquely represented by the sequence of its remainders modulo n_0, n_1, …, n_{d−1} respectively, and vice-versa.
Further, given a sequence of remainders r_0, r_1, …, r_{d−1}, where 0 ≤ r_i < n_i, one can find an integer ‘a’ between 0 and N − 1, such that,

(a)_{n_i} = r_i,  for i = 0, 1, …, d − 1.  (2)
For example, consider the following pairwise coprime integers n_0 = 3, n_1 = 4 and n_2 = 5. Then, given a sequence of remainders (r_0, r_1, r_2) = (1, 2, 3), there exists a unique integer ‘a’, such that,

(a)_3 = 1,  (a)_4 = 2,  (a)_5 = 3.  (3)

It is easy to verify that a = 58 is the unique integer, between 0 and 59, that satisfies the congruencies in (3).
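The constructive direction of the CRT can be sketched in a few lines of Python (an illustrative sketch, not part of any FFAST implementation; the moduli 3, 4, 5 and remainders (1, 2, 3) are example values):

```python
def crt(moduli, remainders):
    """Return the unique a in [0, prod(moduli)) with a % n_i == r_i."""
    N = 1
    for n_i in moduli:
        N *= n_i
    a = 0
    for n_i, r_i in zip(moduli, remainders):
        M_i = N // n_i              # product of all the other moduli
        inv = pow(M_i, -1, n_i)     # modular inverse of M_i modulo n_i
        a = (a + r_i * M_i * inv) % N
    return a

print(crt([3, 4, 5], [1, 2, 3]))  # -> 58
```

The three-argument `pow` with a negative exponent (Python 3.8+) computes the modular inverse, which exists here precisely because the moduli are pairwise coprime.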
III Main Results
We propose a novel FFAST architecture and algorithm to compute a k-sparse n-length DFT X of an n-length signal x. In this paper, an n-length input signal x is said to have a k-sparse DFT X, if X has at most k non-zero coefficients, whose locations are uniformly randomly distributed in {0, 1, …, n − 1}. The FFAST algorithm computes the k-sparse n-length DFT with high probability, using as few as O(k) samples of x and O(k log k) computations. The following theorem states the main result of the paper.
Theorem III.1.
For any 0 < δ < 1, and large enough n, the FFAST algorithm computes a k-sparse DFT X of an n-length input x, where k = O(n^δ), with the following properties:

Sample complexity: The algorithm needs m = rk samples of x, where r is a small constant that depends on the sparsity index δ;

Computational complexity: The FFAST algorithm computes the DFT X using O(k log k) arithmetic operations.

Probability of success: The probability that the algorithm correctly computes the k-sparse DFT X is at least 1 − O(1/k).
Proof. The claimed sample complexity, computational complexity and probability of success follow from the analysis of the FFAST constructions for the very-sparse and the less-sparse regimes in Sections VI and VII respectively. ∎
Remark III.2.
[Oversampling ratio r] The achievable oversampling ratio r = m/k in the sample complexity m = rk depends on the number of stages d used in the FFAST architecture. The number of stages d, in turn, is a function of the sparsity index δ (recall k = O(n^δ)), and increases as δ → 1 (i.e., as the number of the non-zero coefficients, k, approaches the linear regime in n). In Sections VI and VII, we provide constructions of the FFAST front-end architecture that require an increasing number of stages d as δ increases from 0 to 1. In our proposed construction, the increase in d occurs over intervals of δ, resulting in a staircase plot as shown in Fig. 2. Note that the plot in Fig. 2 is an achievable performance of the FFAST algorithm using the constructions described in Section VI-D and Section VII-D. There are many other constructions of the FFAST architecture that may achieve similar or better performance for a specific sparsity index δ. Table II provides some example values of d and r for different values of the sparsity index δ.
δ  1/3  2/3  0.99  0.999  0.9999  
d  3  6  8  11  14 
r  2.45  3.14  3.74  4.64  5.51 
IV Related work
The problem of computing a sparse discrete Fourier transform of a signal is related to the rich literature on frequency estimation [8, 9, 10, 11] in statistical signal processing, as well as to compressive sensing [6, 7]. In frequency estimation, it is assumed that a signal consists of k complex exponentials embedded in additive noise. The frequency estimation techniques are based on well-studied statistical methods like MUSIC and ESPRIT [8, 9, 10, 11], where the focus is on ‘super-resolution’ spectral estimation from an initial few consecutive samples, i.e., extrapolation. In contrast, the algorithm proposed in this paper combines tools from coding theory, number theory, graph theory and signal processing, to ‘interpolate’ the signal from interspersed but significantly fewer samples. In compressive sensing, the objective is to reliably reconstruct the sparse signal from as few measurements as possible, using a fast recovery technique. The bulk of this literature concentrates on random linear measurements, followed by either convex programming or greedy pursuit reconstruction algorithms [7, 12, 13]. An alternative approach, in the context of sampling a continuous-time signal with a finite rate of innovation, is explored in [14, 15, 16, 17]. Unlike the compressive sensing problem, where the resources to be optimized are the number of measurements^{4}^{4}4Consider a compressive sensing problem with a measurement matrix A, i.e., y = Ax, where y is a measurement vector and x is the input signal. Then, the sample complexity is equal to the number of non-zero columns of A and the measurement complexity is equal to the number of non-zero rows of A. and the recovery cost, in our problem, we want to minimize the number of input samples needed by the algorithm in addition to the recovery cost.
At a higher level though, despite some key differences in our approach, our problem is indeed closely related to the spectral estimation and compressive sensing literature; our approach is naturally inspired by, and draws from, the rich set of tools offered by this literature.
A number of previous works [18, 19, 20, 21, 22] have addressed the problem of computing a 1-D DFT of a discrete-time signal that has a sparse Fourier spectrum, in sub-linear sample and time complexity. Most of these algorithms achieve a sub-linear time performance by first isolating the non-zero DFT coefficients into different bins, using specific filters or windows that have ‘good’ (concentrated) support in both time and frequency. The non-zero DFT coefficients are then recovered iteratively, one at a time. The filters or windows used for the binning operation are typically of length O(k log n). As a result, the sample and computational complexity is typically O(k log n) or more. Moreover, the constants involved in the big-Oh notation can be large, e.g., the empirical evaluation of [19] presented in [23] shows that the number of samples required can be many times more than the sample complexity of the FFAST algorithm^{5}^{5}5As mentioned earlier, the FFAST algorithm requires the length of the signal n to be a product of a few distinct primes. Hence, the comparison is for an equivalent n and k.. The work of [20] provides an excellent tutorial on some of the key ideas used by most sub-linear time sparse FFT algorithms in the literature. While we were writing this paper, we became aware of a recent work [24], in which the authors consider the problem of computing a noisy as well as an exactly-sparse 2-D DFT of a √n × √n signal. For an exactly sparse signal, and when k = Θ(√n), the algorithm in [24] uses O(k) samples to compute the 2-D DFT of the signal in O(k log k) time with a constant probability of failure (that is controllable but that does not appear to go to zero asymptotically). In [25], the author proposes a sub-linear time algorithm with a sample and computational complexity of O(k poly(log n)) or O(k² poly(log n)) to compute a k-sparse DFT, with high probability or zero-error respectively. 
The algorithm in [25] exploits the Chinese-Remainder-Theorem, along with a large number of subsampling patterns, to identify the locations of the non-zero DFT coefficients. In contrast, the FFAST algorithm exploits the CRT to induce ‘good’ sparse-graph codes using a small constant number of subsampling patterns, and computes the k-sparse DFT with a vanishing probability of failure.
V Computing the DFT using decoding of alias-codes
In this section, we use a simple example to illustrate the working mechanics of the FFAST subsampling “front-end” and the associated “back-end” peeling-decoder. We then demonstrate how the output of the FFAST front-end subsampling can be viewed as a sparse-graph code in the frequency domain, which we refer to as an “alias-code”, and how computing the DFT is equivalent to decoding this resulting alias-code. Later, we also point out a connection between the FFAST and coding techniques for a packet erasure channel.
Consider an n = 20-length discrete-time signal x, such that its 20-length DFT X is k = 5 sparse. Let the non-zero DFT coefficients of the signal be X[1], X[3], X[5], X[10] and X[13].
V-A FFAST subsampling front-end
In general, the FFAST subsampling front-end architecture, as shown in Fig. 1, consists of multiple subsampling stages, and each stage further has 2 subsampling paths, referred to as “delay-chains”. The sampling periods used in both the delay-chains of a stage are identical. For example, consider the 2-stage FFAST subsampling front-end, shown in Fig. 3, that samples the 20-length input signal and its circularly shifted version^{6}^{6}6Conventionally, in the signal processing literature a time-delay is used. In this paper, for ease of exposition, we use a time-advancement (see Fig. 3 for an example). by factors of 4 and 5 respectively. In the sequel we use f_i to denote the number of the output samples per delay-chain of stage i, e.g., f_0 = 5 and f_1 = 4 in Fig. 3. The FFAST front-end then computes short DFTs, of appropriate lengths, of the individual output data streams of the delay-chains. Next, we group the outputs of the short DFTs into “bin-observations”.
V-A1 Bin observation
A bin-observation is a 2-dimensional vector formed by collecting one scalar output value from the DFT output of the signal from each of the 2 delay-chains in a stage. For example, y_{i,j} is the observation vector of bin j in stage i and is given by,

y_{i,j} = (Y_i[j], Ỹ_i[j])^T,  (4)

where Y_i and Ỹ_i denote the f_i-length DFTs of the outputs of the first and the second delay-chain of stage i respectively.
The first index of the observation vector corresponds to the stage number, while the second index is the bin number within a stage. Note that in the FFAST architecture of Fig. 3, there are a total of f_0 = 5 bins in stage 0 and f_1 = 4 bins in stage 1.
Using the basic Fourier transform properties of sampling, aliasing and circular shift, reviewed below, one can compute the relation between the original n-length DFT X of the signal and the output bin-observations of the FFAST front-end.

Aliasing: If a signal is subsampled in the time-domain, its frequency components mix together, i.e., alias, in a pattern that depends on the sampling procedure. For example, consider the output of the first delay-chain of stage 0 in Fig. 3. The input signal x is uniformly sampled by a factor of 4 to get x_s[m] = x[4m], for m = 0, 1, …, 4. Then, the 5-length DFT X_s of x_s is related to the original 20-length DFT X as:

X_s[k] = (1/4) ∑_{(j)_5 = k} X[j],  k = 0, 1, …, 4.

More generally, if the sampling period is L (we assume that L divides n), so that f = n/L output samples are retained, then,

X_s[k] = (1/L) ∑_{(j)_f = k} X[j],  (5)

where the notation (j)_f denotes j mod f. In the sequel, we suppress the known constant scale factor 1/L for convenience.

Circular shift in time: A circular shift in the time-domain results in a phase shift in the frequency-domain. Consider the circularly shifted signal x̃ obtained from x as x̃[p] = x[(p + 1)_n]. The DFT coefficients of the shifted signal are then given as X̃[j] = ω^j X[j], where ω = e^{ı2π/n} is an n-th root of unity. In general, a circular shift of p_0 results in X̃[j] = ω^{j p_0} X[j].
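Both properties are easy to verify numerically. The following sketch (with an arbitrarily chosen length-20 test signal, not taken from the paper) checks the aliasing relation, including its 1/L scale factor, and the one-sample circular-advancement phase shift:

```python
import cmath

def dft(x):
    """Naive O(n^2) DFT used in place of an FFT for this small example."""
    n = len(x)
    return [sum(x[p] * cmath.exp(-2j * cmath.pi * j * p / n) for p in range(n))
            for j in range(n)]

n, L = 20, 4                 # signal length and sampling period
f = n // L                   # number of retained samples (and of aliased bins)
x = [cmath.exp(0.3j * p) + 0.5 * p / n for p in range(n)]  # arbitrary test signal
X = dft(x)

# Aliasing: the f-point DFT of x[0], x[L], x[2L], ... equals, bin by bin,
# (1/L) times the sum of X over each residue class modulo f.
Y = dft([x[L * m] for m in range(f)])
for k in range(f):
    alias = sum(X[j] for j in range(n) if j % f == k) / L
    assert abs(Y[k] - alias) < 1e-9

# Circular shift: advancing x by one sample multiplies X[j] by exp(2*pi*i*j/n).
X_shift = dft([x[(p + 1) % n] for p in range(n)])
for j in range(n):
    assert abs(X_shift[j] - X[j] * cmath.exp(2j * cmath.pi * j / n)) < 1e-9

print("aliasing and circular-shift identities verified")
```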
V-B Alias-codes and their connection to computing a sparse DFT from a subsampled signal
In this section, we represent the relation between the original n-length sparse DFT X and the FFAST front-end output, i.e., the bin-observations, obtained using the above Fourier properties, in a graphical form, and interpret it as a “channel code”. In particular, since this code is the result of subsampling and aliasing operations, we refer to it as an “alias-code”. We also establish that computing the sparse DFT of the signal from its subsampled data is equivalent to decoding the alias-code resulting from processing the input signal through the FFAST front-end.
Suppose that the 20-length example input signal x is processed through the 2-stage FFAST front-end architecture shown in Fig. 3, to obtain the bin-observation vectors y_{0,j}, j = 0, …, 4, and y_{1,j}, j = 0, …, 3. Then, the relation between the bin-observation vectors and the non-zero DFT coefficients of the signal x can be computed using the signal processing properties. A graphical representation of this relation is shown in Fig. 4. The left nodes of the bipartite graph represent the non-zero DFT coefficients of the input signal x, while the right nodes represent the “bins” (also sometimes referred to in the sequel as “check nodes”) with vector observations. An edge connects a left node to a right node iff the corresponding non-zero DFT coefficient contributes to the observation vector of that particular bin, e.g., after aliasing, due to subsampling, the DFT coefficient X[ℓ] contributes to the observation vectors of bin (ℓ)_5 of stage 0 and bin (ℓ)_4 of stage 1. Thus, the problem of computing the 5-sparse 20-length DFT has been transformed into that of decoding the values and the support of the left nodes of the bipartite graph in Fig. 4, i.e., decoding of the alias-code. Next, we classify the observation bins based on their edge degree in the bipartite graph, i.e., the number of non-zero DFT coefficients contributing to a bin, which is then used to decode the alias-code.

zeroton: A bin that has no contribution from any of the non-zero DFT coefficients of the signal, e.g., bin 2 of stage 0 in Fig. 4. A zeroton bin can be trivially identified from its observations.

singleton: A bin that has a contribution from exactly one non-zero DFT coefficient of the signal, e.g., bin 2 of stage 1. Using the signal processing properties, it is easy to verify that the observation vector of bin 2 of stage 1 is given as,

y_{1,2} = X[10] (1, e^{ı2π10/20})^T.

The observation vector of a singleton bin can be used to determine the support and the value of the only non-zero DFT coefficient contributing to that bin, as follows:

support: The support ℓ of the non-zero DFT coefficient contributing to a singleton bin can be computed as,

ℓ = (n/2π) ∠(y_{i,j}[1]/y_{i,j}[0]).  (6)
Value: The value of the non-zero DFT coefficient is then given by the observation X[ℓ] = y_{i,j}[0].
We refer to this procedure as the “ratio-test” in the sequel. Thus, a simple ratio-test on the observations of a singleton bin correctly identifies the support and the value of the only non-zero DFT coefficient connected to that bin. It is easy to verify that this property holds for all the singleton bins.


multiton: A bin that has a contribution from more than one non-zero DFT coefficient of the signal, e.g., bin 1 of stage 1. The observation vector of bin 1 of stage 1 is,

y_{1,1} = X[1] (1, e^{ı2π1/20})^T + X[5] (1, e^{ı2π5/20})^T + X[13] (1, e^{ı2π13/20})^T.

Now, if we perform the “ratio-test” on these observations, the computed support is, in general, not an integer. Since we know that the support has to be an integer value between 0 and 19, we conclude that the observations do not correspond to a singleton bin. In Appendix B, we rigorously show that the ratio-test identifies a multiton bin almost surely.
Hence, using the “ratio-test” on the bin-observations, the FFAST decoder can determine whether a bin is a zeroton, a singleton or a multiton, almost surely. Moreover, when a bin is a singleton, the ratio-test provides the support as well as the value of the non-zero DFT coefficient connected to that bin. In the following Section V-C, we provide the FFAST peeling-decoder that computes the support and the values of all the non-zero DFT coefficients of the signal x.
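The zeroton/singleton/multiton classification can be prototyped end-to-end. The sketch below is illustrative and not the paper's code: the length n = 20, the support {1, 3, 5, 10, 13} and the amplitudes are assumed example values, and the singleton check is implemented as a consistency test between the two delay-chain observations (equivalent in spirit to requiring an integer support in the ratio-test):

```python
import cmath

def dft(x):
    """Naive O(n^2) DFT, standing in for the short FFTs of the front-end."""
    n = len(x)
    return [sum(x[p] * cmath.exp(-2j * cmath.pi * j * p / n) for p in range(n))
            for j in range(n)]

n = 20
coeffs = {1: 1.3 + 0.4j, 3: -0.8 + 1.1j, 5: 2.2 - 0.6j,
          10: 0.9 + 1.7j, 13: -1.5 - 0.3j}          # assumed non-zero DFT values
x = [sum(c * cmath.exp(2j * cmath.pi * l * p / n) for l, c in coeffs.items()) / n
     for p in range(n)]

def bin_observations(x, L):
    """Subsample x and its one-sample advancement by period L, then DFT both.
    The known aliasing scale 1/L is undone, so each bin holds a plain sum of
    DFT coefficients (first entry) and a phase-rotated sum (second entry)."""
    n, f = len(x), len(x) // L
    y0 = dft([x[L * m] for m in range(f)])
    y1 = dft([x[(L * m + 1) % n] for m in range(f)])
    return [(L * a, L * b) for a, b in zip(y0, y1)]

def ratio_test(obs, n):
    """Classify one bin; for a singleton also return (location, value)."""
    a, b = obs
    if abs(a) < 1e-9:
        return ("zeroton" if abs(b) < 1e-9 else "multiton"), None
    ell = round(cmath.phase(b / a) * n / (2 * cmath.pi)) % n
    # A singleton must satisfy b = a * exp(2*pi*i*ell/n), up to numerical noise.
    if abs(b - a * cmath.exp(2j * cmath.pi * ell / n)) < 1e-9:
        return "singleton", (ell, a)
    return "multiton", None

found = {}
for L in (4, 5):                       # two stages: f = 5 bins and f = 4 bins
    for obs in bin_observations(x, L):
        kind, payload = ratio_test(obs, n)
        if kind == "singleton":
            found[payload[0]] = payload[1]

print(sorted(found))  # -> [1, 3, 10]  (the bins that start out as singletons)
```

Note that a magnitude comparison alone is not enough: two colliding coefficients can produce observations of equal magnitude, which is why the code checks full consistency of the phase-rotated observation.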
V-C FFAST back-end peeling-decoder
In the previous section we explained how the ratio-test can be used to determine, almost surely, whether a bin is a zeroton, a singleton or a multiton. Moreover, in the case of a singleton bin, the ratio-test also identifies the support and the value of the non-zero DFT coefficient connected to that bin.
FFAST peeling-decoder
The FFAST decoder repeats the following steps (the pseudocode is provided in Algorithm 1 and Algorithm 2 in Appendix A):

Select all the edges in the graph with right degree 1 (edges connected to singleton bins).

Remove these edges from the graph as well as the associated left and right nodes.

Remove all the other edges that were connected to the left nodes removed in step 2. When a neighboring edge of any right node is removed, its contribution is subtracted from that check node.
Decoding is successful if, at the end, all the edges have been removed from the graph. It is easy to verify that the FFAST peeling-decoder operated on the example graph of Fig. 4 results in successful decoding, with the coefficients being uncovered in the following possible order: X[1], X[10], X[3], X[5], X[13].
Thus, the FFAST architecture has transformed the problem of computing the DFT of x into that of decoding the alias-code of Fig. 4. Clearly, the success of the FFAST decoder depends on the properties of the sparse bipartite graph resulting from the subsampling operation of the FFAST front-end. In particular, if the subsampling-induced aliasing bipartite graph is peeling-friendly, i.e., it has a few singleton bins to initiate the peeling procedure and creates new singletons at each iteration, until all the DFT coefficients are uncovered, the FFAST peeling-decoder succeeds in computing the DFT X.
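The peeling steps above can be sketched as a short loop over the bin observations. As before, this is an illustrative sketch with assumed example values (n = 20, support {1, 3, 5, 10, 13}): singleton bins are found by a consistency test on the two delay-chain observations, and each recovered coefficient is subtracted from every bin it aliases to.

```python
import cmath

def dft(x):
    """Naive O(n^2) DFT, standing in for the short FFTs of the front-end."""
    n = len(x)
    return [sum(x[p] * cmath.exp(-2j * cmath.pi * j * p / n) for p in range(n))
            for j in range(n)]

n = 20
coeffs = {1: 1.3 + 0.4j, 3: -0.8 + 1.1j, 5: 2.2 - 0.6j,
          10: 0.9 + 1.7j, 13: -1.5 - 0.3j}          # assumed non-zero DFT values
x = [sum(c * cmath.exp(2j * cmath.pi * l * p / n) for l, c in coeffs.items()) / n
     for p in range(n)]

def bins(x, L):
    """Bin observations of one stage: subsample by L, DFT, undo the 1/L scale."""
    n, f = len(x), len(x) // L
    y0 = dft([x[L * m] for m in range(f)])
    y1 = dft([x[(L * m + 1) % n] for m in range(f)])
    return [[L * a, L * b] for a, b in zip(y0, y1)]

def singleton(obs, n):
    """Return (location, value) if obs is consistent with one coefficient."""
    a, b = obs
    if abs(a) < 1e-9:
        return None
    ell = round(cmath.phase(b / a) * n / (2 * cmath.pi)) % n
    if abs(b - a * cmath.exp(2j * cmath.pi * ell / n)) < 1e-9:
        return ell, a
    return None

periods = (4, 5)                       # stage sampling periods -> 5 and 4 bins
stage_bins = {L: bins(x, L) for L in periods}
recovered = {}
progress = True
while progress:                        # peel until no singleton bin remains
    progress = False
    for L in periods:
        for obs in stage_bins[L]:
            s = singleton(obs, n)
            if s and s[0] not in recovered:
                ell, val = s
                recovered[ell] = val
                for L2 in periods:     # subtract the peeled coefficient from
                    o = stage_bins[L2][ell % (n // L2)]   # every bin it hits
                    o[0] -= val
                    o[1] -= val * cmath.exp(2j * cmath.pi * ell / n)
                progress = True

print(sorted(recovered) == sorted(coeffs))  # -> True
```

In this toy run the multiton bins become singletons after the first round of peeling, so all five coefficients are eventually uncovered and every bin is driven to zero.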
V-D Connection to coding for packet erasure channels
Erasure Channel  Sparse DFT 

1. Explicitly designed sparse-graph code.  1. Implicitly designed sparse-graph code induced by subsampling. 
2. n − k correctly received packets.  2. n − k zero DFT coefficients. 
3. k erased packets.  3. k unknown non-zero DFT coefficients. 
4. Peeling-decoder recovers the values of the erased packets using ‘singleton’ check nodes. The identity (location) of the erased packets is known.  4. Peeling-decoder recovers both the values and the locations of the non-zero DFT coefficients using ‘singleton’ check nodes. The locations of the non-zero DFT coefficients are not known. This results in an additional cost in the sample complexity. 
5. Codes based on regular-degree bipartite graphs are near-capacity-achieving. More efficient, capacity-achieving irregular-degree bipartite graph codes can be designed.  5. The FFAST architecture based on uniform subsampling can induce only left-regular degree bipartite graphs. 
The problem of decoding sparse-graph codes over erasure channels has been well studied in the coding theory literature. In this section we draw an analogy between decoding sparse-graph codes for a packet erasure channel and decoding over the sparse bipartite graphs induced by the FFAST architecture.
Sparse-graph code for a packet-erasure channel
Consider a sparse-graph packet erasure code, whose length-n codewords consist of information packets and parity packets. The erasure code is defined by a bipartite graph as shown in Fig. 5(a). An n-length sequence of packets that satisfies the constraints defined by the graph in Fig. 5(a), i.e., the sum of the packets connected to a parity check node equals zero, is a valid codeword. Suppose a codeword from the code defined by the graph of Fig. 5(a) is transmitted over an erasure channel that uniformly at random drops some number of packets. In Fig. 5(a), we use dotted circles to represent the correctly received packets (a cyclic redundancy check can be used to verify the correctly received packets), and solid circles for the packets that are dropped/erased by the channel. The dotted edges in the bipartite graph of Fig. 5(a) denote the operation of subtracting the contribution of all the correctly received packets from the corresponding check nodes. A peeling-decoder can now iteratively unravel the erased packets that are connected to a check node with exactly one erased packet. If the bipartite graph consisting only of the solid variable nodes and the solid edges is such that the peeling-decoder successfully unravels all the erased packets, decoding is successful.
Decoding over a bipartite graph induced by the FFAST
Consider an n-length signal x that has a k-sparse DFT X. Let the signal x be subsampled by some appropriately designed FFAST front-end. The induced spectral-domain aliasing, due to the FFAST front-end subsampling operation, is graphically represented by the bipartite graph shown in Fig. 5(b). The variable (left) nodes correspond to the n-length DFT X and the check (right) nodes are the bins consisting of the aliased DFT coefficients. A variable node is connected to a check node iff, after aliasing, that particular DFT coefficient contributes to the observation of the considered check node. The solid circles denote the k non-zero DFT coefficients, while the n − k zero DFT coefficients are represented by dotted circles. A dashed edge in Fig. 5(b) denotes that the contribution to the bin-observation due to this particular edge is zero. Using the ratio-test on the vector observation at each check node, one can determine whether the check node is a “singleton”, i.e., has exactly one solid edge. A peeling-decoder can now iteratively unravel the non-zero DFT coefficients connected to singleton check nodes. If the bipartite graph consisting only of the solid variable nodes and the solid edges is such that the peeling-decoder successfully unravels all the variable nodes, the algorithm succeeds in computing the DFT X. In Table III we provide a comparison between decoding over the bipartite graphs of Fig. 5(a) and Fig. 5(b).
Thus, the problem of decoding the bipartite graphs corresponding to sparse-graph codes designed for a packet-erasure channel is closely related to decoding the sparse bipartite graphs induced by the FFAST architecture. We use this analogy: a) to design a subsampling front-end that induces ‘good’ left-regular degree sparse-graph codes; and b) to formally connect our proposed Chinese-Remainder-Theorem based aliasing framework to a random sparse-graph code constructed using a balls-and-bins model (explained in Section VI-A), and to analyze the convergence behavior of our algorithm using well-studied density evolution techniques from coding theory.
Next, we address the question of how to carefully design the subsampling parameters of the FFAST front-end architecture so as to induce “good graphs” or “alias-codes”. In Section VI and Section VII we provide constructions of the FFAST front-end architecture and analyze the performance of the FFAST peeling-decoder for the very-sparse, i.e., 0 < δ ≤ 1/3, and the less-sparse, i.e., 1/3 < δ < 1, regimes of sparsity respectively.
VI FFAST Construction and Performance analysis for the very-sparse (0 < δ ≤ 1/3) regime
In Section V, using an example, we illustrated how the problem of computing a k-sparse n-length DFT of a signal can be transformed into a problem of decoding over sparse bipartite graphs using the FFAST architecture. In this section, we provide a choice of the parameters of the FFAST front-end architecture and analyze the probability of success of the FFAST peeling-decoder for the very-sparse regime of k = O(n^δ), 0 < δ ≤ 1/3. As shown in Section V-D, the FFAST decoding process is closely related to the decoding procedure on sparse-graph codes designed for erasure channels. From the coding theory literature, we know that there exist several sparse-graph code constructions that are low-complexity and capacity-achieving for erasure channels. The catch for us is that we are not at liberty to use any arbitrary bipartite graph, but can choose only those graphs that correspond to alias-codes, i.e., graphs that are induced via aliasing through our proposed FFAST subsampling front-end. How do we go about choosing the right parameters and inducing the good graphs?
We describe two ensembles of bipartite graphs. The first ensemble is based on a “balls-and-bins” model, while the second is based on the CRT. The balls-and-bins ensemble is closer in spirit to the sparse-graph codes in the coding-theory literature, and is hence amenable to a rigorous analysis using coding-theoretic tools like density evolution [26]. The bipartite graphs induced by the FFAST front-end subsampling operation belong to the CRT ensemble. Later, in Lemma VI.1, we show that the two ensembles are equivalent; hence, the analysis of the balls-and-bins construction carries over to the FFAST architecture. We start by setting up some notation and common parameters.
Consider a set $\{f_0, f_1, \ldots, f_{d-1}\}$ of $d$ pairwise coprime integers. Let the signal length be $N = P\prod_{i=0}^{d-1} f_i$, for some positive integer $P$. The integers $f_i$ are chosen such that they are approximately equal, and we use $F$ to denote this common value. More precisely, $f_i = F + O(1)$, for $i = 0, \ldots, d-1$, where $F$ is an asymptotically large number. The $O(1)$ perturbation term in each $f_i$ is used to obtain a set of coprime integers^{7} approximately equal to $F$. We construct a FFAST subsampling front-end architecture with $d$ stages. Each stage further has $2$ delay-chains (see Fig. 1 for reference). The subsampling period used in both delay-chains of stage $i$ is $N/f_i$, and hence the number of output samples per delay-chain is $f_i$. The total number of input samples used by the FFAST algorithm is $2\sum_{i=0}^{d-1} f_i \approx 2dF$ (see Fig. 1). In this section, we use $F = \eta k$, for some constant $\eta > 0$. This results in a sparsity index $\delta = 1/d$ (for constant $P$), depending on the value of the integer $d$. ^{7}An example of a set of approximately equal sized coprime integers: for any even integer $F$ greater than $2$, the consecutive integers $F-1$, $F$ and $F+1$ are pairwise coprime.
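The guiding principle of the front-end, namely that uniform subsampling aliases the spectrum in a structured, predictable way, is easy to verify directly. The sketch below (a toy $O(N^2)$ DFT and a made-up test signal, not the actual FFAST implementation) checks that keeping every $(N/f)$-th sample of an $N$-point signal yields an $f$-point DFT in which bin $b$ is a scaled sum of all $X[j]$ with $j \equiv b \pmod{f}$:

```python
import cmath

def dft(x):
    """Direct O(n^2) DFT; adequate for a tiny sanity check."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * b * t / n) for t in range(n))
            for b in range(n)]

N, f = 20, 5
x = [complex((3 * t * t + t) % 11) for t in range(N)]  # arbitrary test signal
X = dft(x)

# Keep every (N/f)-th sample and take the short f-point DFT.
xs = [x[t * (N // f)] for t in range(f)]
Xs = dft(xs)

# Each output bin b collects (f/N times) every X[j] with j = b (mod f).
for b in range(f):
    aliased = (f / N) * sum(X[j] for j in range(N) if j % f == b)
    assert abs(Xs[b] - aliased) < 1e-8
```

It is exactly these per-stage aliased sums that play the role of check-node observations in what follows.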
VI-A Ensemble of bipartite graphs constructed using a “Balls-and-Bins” model
Bipartite graphs in the ensemble $\mathcal{C}_1(k, d, \{f_i\})$ have $k$ variable nodes on the left and $\sum_{i=0}^{d-1} f_i$ check nodes on the right. Further, each variable (left) node is connected to $d$ right nodes, i.e., these are $d$-left-regular bipartite graphs. An example graph from such an ensemble, for $d = 3$, is shown in Fig. 4. More generally, the ensemble of $d$-left-regular bipartite graphs constructed using a “balls-and-bins” model is defined as follows. Set $f_i = F + O(1)$, where $F = \eta k$. Partition the set of $\sum_{i=0}^{d-1} f_i$ check nodes into $d$ subsets, with the $i$th subset having $f_i$ check nodes. For each variable node, choose one neighboring check node in each of the $d$ subsets, uniformly at random. The corresponding $d$-left-regular bipartite graph is then defined by connecting the variable nodes with their neighboring check nodes by undirected edges.
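A minimal sketch of sampling from this balls-and-bins ensemble (the representation is ours: a graph is stored, per variable node, as its list of chosen check nodes, one per stage):

```python
import random

def balls_and_bins_graph(k, bins, seed=0):
    """Sample a d-left-regular bipartite graph: each of the k variable
    nodes picks one uniformly random check node in each of the d stages.
    `bins` lists the number of check nodes per stage, e.g. [f0, f1, f2]."""
    rng = random.Random(seed)
    return [[rng.randrange(f) for f in bins] for _ in range(k)]

g = balls_and_bins_graph(k=6, bins=[5, 7, 9])
assert len(g) == 6 and all(len(nbrs) == 3 for nbrs in g)       # 3-left-regular
assert all(c < f for nbrs in g for c, f in zip(nbrs, [5, 7, 9]))
```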
An edge in the graph is represented as a pair of nodes $e = \{v, c\}$, where $v$ and $c$ are the variable and check nodes incident on the edge $e$. By a directed edge $\vec{e}$ we mean an ordered pair $(v, c)$ or $(c, v)$ corresponding to the edge $e = \{v, c\}$. A path in the graph is a directed sequence of directed edges $\vec{e}_1, \ldots, \vec{e}_\ell$ such that, if $\vec{e}_t = (u_t, w_t)$, then $w_t = u_{t+1}$ for $t = 1, \ldots, \ell - 1$. The length of the path is the number of directed edges in it, and we say that the path $\vec{e}_1, \ldots, \vec{e}_\ell$ starts from $u_1$ and ends at $w_\ell$.
VI-A1 Directed Neighborhood
The directed neighborhood of depth $\ell$ of a directed edge $\vec{e} = (v, c)$, denoted by $\mathcal{N}_{\vec{e}}^{\ell}$, is defined as the induced subgraph containing all the edges and nodes on paths of length at most $\ell$ starting at node $v$ whose first edge is distinct from $\vec{e}$. An example of a directed neighborhood is given in Fig. 6. If the induced subgraph corresponding to the directed neighborhood is a tree, then we say that the depth-$\ell$ neighborhood of the edge $\vec{e}$ is tree-like.
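One concrete way to test the tree-like condition (the adjacency-list representation and the helper below are our own, not from the paper) is a breadth-first walk from the starting edge that never immediately reverses an edge and reports a cycle whenever any node is revisited within the explored depth:

```python
def neighborhood_tree_like(var_nbrs, v, c, depth):
    """Return True if the depth-`depth` directed neighborhood of the
    directed edge (v -> c) is cycle-free. `var_nbrs[v]` lists the check
    nodes of variable v; check labels must be unique across stages."""
    chk_nbrs = {}
    for u, nbrs in enumerate(var_nbrs):
        for ch in nbrs:
            chk_nbrs.setdefault(ch, []).append(u)
    seen = {('v', v), ('c', c)}
    frontier = [('c', c, v)]          # (side, node, node we arrived from)
    for _ in range(depth):
        nxt = []
        for side, node, prev in frontier:
            if side == 'c':
                succ = [('v', u, node) for u in chk_nbrs[node] if u != prev]
            else:
                succ = [('c', ch, node) for ch in var_nbrs[node] if ch != prev]
            for s in succ:
                if (s[0], s[1]) in seen:
                    return False      # a node repeats within the depth: cycle
                seen.add((s[0], s[1]))
            nxt.extend(succ)
        frontier = nxt
    return True

# Variable nodes 0 and 1 share both their check nodes: a length-4 cycle.
assert not neighborhood_tree_like([['A', 'B'], ['A', 'B']], 0, 'A', depth=3)
assert neighborhood_tree_like([['A', 'B'], ['A', 'C']], 0, 'A', depth=3)
```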
VI-B Ensemble of bipartite graphs constructed using the Chinese-Remainder-Theorem (CRT)
The CRT ensemble $\mathcal{C}_2(k, d, \{f_i\})$ of $d$-left-regular bipartite graphs, with $k$ variable nodes and $\sum_{i=0}^{d-1} f_i$ check nodes, is defined as follows. Partition the set of check nodes into $d$ subsets, with the $i$th subset having $f_i$ check nodes (see Fig. 4 for an example). Consider a set $\mathcal{I}$ of $k$ integers, where each element of the set is between $0$ and $N-1$. Assign the integers from the set $\mathcal{I}$ to the $k$ variable nodes in an arbitrary order. Label the check nodes in the $i$th subset from $0$ to $f_i - 1$, for all $i = 0, \ldots, d-1$. A $d$-left-regular bipartite graph with $k$ variable nodes and $\sum_{i=0}^{d-1} f_i$ check nodes is then obtained by connecting a variable node with an associated integer $v$ to the check node $v \bmod f_i$ in the $i$th subset, for $i = 0, \ldots, d-1$. The ensemble $\mathcal{C}_2(k, d, \{f_i\})$ is the collection of all the $d$-left-regular bipartite graphs induced by all possible sets $\mathcal{I}$.
Lemma VI.1.
The ensemble $\mathcal{C}_2(k, d, \{f_i\})$ of bipartite graphs is identical to the ensemble $\mathcal{C}_1(k, d, \{f_i\})$.
Proof.
It is trivial to see that $\mathcal{C}_2 \subseteq \mathcal{C}_1$. Next we show the reverse containment. Consider a graph $\mathcal{G} \in \mathcal{C}_1$. Suppose a variable node of $\mathcal{G}$ is connected to the check nodes numbered $(r_0, r_1, \ldots, r_{d-1})$ in the $d$ subsets. Then, using the CRT, one can find $P$ integers $v$ between $0$ and $N-1$ such that $v \equiv r_i \pmod{f_i}$ for all $i = 0, \ldots, d-1$. Thus, for every graph $\mathcal{G} \in \mathcal{C}_1$, there exists a set $\mathcal{I}$ of $k$ integers that results in an identical graph using the CRT based construction. Hence, $\mathcal{C}_1 \subseteq \mathcal{C}_2$. ∎
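The reverse containment in the proof rests on inverting the residue map, which is constructive. A sketch, assuming for simplicity that the signal length equals the product of the moduli (the helper name is ours), using the standard CRT reconstruction formula:

```python
def crt(residues, moduli):
    """Reconstruct the unique v in [0, prod(moduli)) with v = r_i (mod f_i),
    for pairwise coprime moduli, via the standard CRT formula."""
    n = 1
    for f in moduli:
        n *= f
    v = 0
    for r, f in zip(residues, moduli):
        m = n // f
        v += r * m * pow(m, -1, f)    # pow(m, -1, f): inverse of m modulo f
    return v % n

moduli = [5, 7, 9]                    # pairwise coprime, d = 3 stages
v = crt([3, 2, 8], moduli)
assert [v % f for f in moduli] == [3, 2, 8]
```

In words: any valid assignment of check-node neighbors, one residue per stage, is realized by an actual variable-node index, which is precisely why every balls-and-bins graph also arises from the CRT construction.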
Note that the modulo rule used to generate a graph in the ensemble $\mathcal{C}_2$ is the same as the one used in equation (5) of Section V. Thus, the FFAST architecture of Fig. 1 generates graphs from the CRT ensemble $\mathcal{C}_2$, where the integers associated with the variable nodes correspond to the locations (or support) of the nonzero DFT coefficients^{8} of the signal $\vec{x}$. Also, under the assumption that the support of the nonzero DFT coefficients of the signal $\vec{x}$ is uniformly random, the resulting aliasing graph is a uniformly random choice from the ensemble $\mathcal{C}_2$. ^{8}A set $\mathcal{I}$ with repeated elements corresponds to a signal with fewer than $k$ nonzero DFT coefficients.
Next, we analyze the performance of the FFAST peeling-decoder over a uniformly random choice of a graph from the ensemble $\mathcal{C}_1$, which, along with Lemma VI.1, provides a lower bound on the success performance of the FFAST decoder over graphs from the ensemble $\mathcal{C}_2$. Although the construction and the results described in this section are applicable to any value of $d \geq 3$, we are particularly interested in the case $d = 3$. For $d = 3$ and $F = \eta k$, we achieve the sublinear sparsity index $\delta = 1/3$, while smaller values of $\delta$ are achieved using larger values of $d$.
VI-C Performance analysis of the FFAST peeling-decoder on graphs from the ensemble $\mathcal{C}_1$
In this section, we analyze the probability of success of the FFAST peeling-decoder over a randomly chosen graph from the ensemble $\mathcal{C}_1$, after a fixed number of peeling iterations $\ell$. Our analysis closely follows the arguments in [26] and [27]. Thus, one may be tempted to take the results from [26] “off-the-shelf”. However, we choose to provide a detailed analysis here for two reasons. First, our graph construction in the ensemble $\mathcal{C}_1$ is different from the one used in [26], which results in some fairly important differences in the analysis, such as the expansion properties of the graphs, thus warranting an independent analysis. Second, we want to make the paper more self-contained and complete.
We now provide a brief outline of the proof, highlighting the main technical components needed to show that the FFAST peeling-decoder decodes all the nonzero DFT coefficients with high probability.

Density evolution: We analyze the performance of the message-passing algorithm over a typical graph from the ensemble for $\ell$ iterations. First, we assume that the local neighborhood of depth $2\ell$ of every edge in a typical graph in the ensemble is tree-like, i.e., cycle-free. Under this assumption, all the messages between the variable nodes and the check nodes in the first $\ell$ rounds of the algorithm are independent. Using this independence, we derive a recursive equation that represents the expected evolution of the fraction of edges that remain undecoded after each round of the algorithm for this typical graph.

Convergence to the cycle-free case: Using a Doob martingale argument as in [26], we show that a random graph from the ensemble, chosen as per nature’s choice of the nonzero DFT coefficients, behaves like a “typical” graph, i.e., the depth-$2\ell$ neighborhood of most of the edges in the graph is cycle-free, with high probability. This proves that, for a random graph in $\mathcal{C}_1$, the FFAST peeling-decoder decodes all but an arbitrarily small fraction of the variable nodes with high probability in a constant number of iterations $\ell$.

Completing the decoding using the graph expansion property: We first show that if a graph is an “expander” (as defined later in Section VI-C3), and the FFAST peeling-decoder successfully decodes all but a small fraction of the nonzero DFT coefficients, then it decodes all the nonzero DFT coefficients successfully. Next, we show that a random graph from the ensemble is an expander with high probability.
VI-C1 Density evolution for the local tree-like view
In this section we assume that the local neighborhood of depth $2\ell$ of every edge in a graph in the ensemble $\mathcal{C}_1$ is tree-like. Next, we define the edge-degree distribution polynomials of the bipartite graphs in the ensemble as $\lambda(x) = \sum_i \lambda_i x^{i-1}$ and $\rho(x) = \sum_i \rho_i x^{i-1}$, where $\lambda_i$ (resp. $\rho_i$) denotes the probability that an edge of the graph is connected to a left (resp. right) node of degree $i$. Thus for the ensemble $\mathcal{C}_1$, constructed using the balls-and-bins procedure, $\lambda(x) = x^{d-1}$ by construction. Further, as shown in Appendix C, the edge degree distribution $\rho(x) = e^{-(1-x)/\eta}$.
Let $p_j$ denote the probability that an edge is present (i.e., undecoded) after round $j$ of the FFAST peeling-decoder; then $p_0 = 1$. Under the tree-like assumption, the probability $p_{j+1}$ is given as,
$p_{j+1} \;=\; \lambda\bigl(1 - \rho(1 - p_j)\bigr). \qquad (7)$
Equation (7) can be understood as follows (also see Fig. 6): the tree-like assumption implies that, up to iteration $j$, the messages on different edges are independent. Thus, the probability that at iteration $j$ a variable node is decoded due to a particular neighboring check node is given by $\rho(1 - p_j)$, and similarly the probability that none of the other neighboring check nodes decodes the variable node is $\lambda(1 - \rho(1 - p_j))$. Specializing equation (7) to the edge degree distributions of $\mathcal{C}_1$, we get,
$p_{j+1} \;=\; \left(1 - e^{-p_j/\eta}\right)^{d-1}, \qquad p_0 = 1, \qquad (8)$
where $\eta = F/k$. The evolution process of (8) asymptotically (in the number of iterations $j$) converges to $0$ for an appropriate choice of the parameter $\eta$, e.g., see Table IV.
TABLE IV: Values of $\eta = F/k$ (and the corresponding total $d\eta$) for which the recursion (8) converges to zero, for various left-degrees $d$.

d       2        3        4        5        6        7        8        9
$\eta$  1.0000   0.4073   0.3237   0.2850   0.2616   0.2456   0.2336   0.2244
$d\eta$ 2.0000   1.2219   1.2948   1.4250   1.5696   1.7192   1.8688   2.0196
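The recursion (8) is easy to iterate numerically. The sketch below (our own helper) illustrates the sharp threshold behavior for $d = 3$ (threshold $\eta \approx 0.4073$, per Table IV): slightly above it the undecoded fraction is driven to zero, slightly below it the recursion stalls at a nonzero fixed point.

```python
import math

def density_evolution(d, eta, iters=200):
    """Iterate p_{j+1} = (1 - exp(-p_j / eta))^(d - 1) from p_0 = 1,
    where p_j is the probability that an edge is still undecoded after
    j peeling rounds, under the tree-like assumption."""
    p = 1.0
    for _ in range(iters):
        p = (1.0 - math.exp(-p / eta)) ** (d - 1)
    return p

# d = 3, threshold eta ~ 0.4073:
assert density_evolution(3, eta=0.45) < 1e-6   # above threshold: decodes
assert density_evolution(3, eta=0.40) > 0.1    # below threshold: stalls
```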
VI-C2 Convergence to the cycle-free case
In the following Lemma VI.2 we show: a) the expected behavior over all the graphs in the ensemble converges to that of the cycle-free case, and b) with exponentially high probability, the proportion of the edges that are not decoded after $\ell$ iterations of the FFAST peeling-decoder is tightly concentrated around $p_\ell$, as defined in (8).
Lemma VI.2 (Convergence to the cycle-free case).
Over the probability space of all graphs in the ensemble $\mathcal{C}_1$, let $Z$ be the total number of edges that are not decoded after $\ell$ (an arbitrarily large but fixed number of) iterations of the FFAST peeling-decoder over a randomly chosen graph. Further, let $p_\ell$ be as given by the recursion (8). Then there exist constants $\beta > 0$ and $k_0$ such that for any $\epsilon > 0$ and sufficiently large $k > k_0$ we have
$\bigl|\,\mathbb{E}[Z] - k d\, p_{\ell}\,\bigr| \;\leq\; k d\,\epsilon/2, \qquad (9)$

$\Pr\bigl(\,\bigl|Z - \mathbb{E}[Z]\bigr| > k d\,\epsilon/2\,\bigr) \;\leq\; 2e^{-\beta\epsilon^{2}k}. \qquad (10)$
Proof.
Please see Appendix F. ∎
VI-C3 Successful Decoding using Expansion
In the previous section we showed that, with high probability, the FFAST peeling-decoder decodes all but an arbitrarily small fraction of the variable nodes. In this section, we show how to complete the decoding when the graph is a “good expander”. Our problem requires the following definition of an “expander graph”, which is somewhat different from the conventional notions of expander graphs in the literature, e.g., edge expanders, vertex expanders or spectral expanders.
Definition VI.3 (Expander graph).
A $d$-left-regular bipartite graph from the ensemble $\mathcal{C}_1(k, d, \{f_i\})$ is called an $\alpha$-expander if, for every subset $\mathcal{S}$ of variable nodes of size at most $\alpha k$, there exists a right neighborhood of $\mathcal{S}$ restricted to some single stage $i$, i.e., $\mathcal{N}_i(\mathcal{S})$, that satisfies $|\mathcal{N}_i(\mathcal{S})| > |\mathcal{S}|/2$, for some constant $\alpha > 0$.
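For small instances this expansion property can be checked by brute force. The sketch below assumes the formalization just given (every subset $\mathcal{S}$ with $|\mathcal{S}| \leq \alpha k$ must have, in some single stage, strictly more than $|\mathcal{S}|/2$ distinct check-node neighbors); it enumerates all subsets and is meant only to make the definition concrete, not to be efficient:

```python
from itertools import combinations

def is_expander(graph, d, alpha):
    """Brute-force expansion check: `graph[v]` lists the check node chosen
    by variable v in each of the d stages. For every subset S of variable
    nodes with |S| <= alpha * k, some stage's neighborhood of S must have
    more than |S| / 2 distinct check nodes."""
    k = len(graph)
    for size in range(1, int(alpha * k) + 1):
        for S in combinations(range(k), size):
            if not any(len({graph[v][i] for v in S}) > size / 2
                       for i in range(d)):
                return False
    return True

# Two variable nodes with identical neighborhoods break expansion.
assert not is_expander([[0, 0, 0], [0, 0, 0], [1, 2, 3]], d=3, alpha=1.0)
assert is_expander([[0, 0, 0], [1, 1, 1], [2, 2, 2]], d=3, alpha=1.0)
```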
In the following lemma, we show that if a graph is an expander, and if the FFAST peelingdecoder successfully decodes all but a small fraction of the nonzero DFT coefficients, then it decodes all the nonzero DFT coefficients successfully.
Lemma VI.4.
Consider a graph from the ensemble $\mathcal{C}_1(k, d, \{f_i\})$, with $d \geq 3$, that is an $\alpha$-expander for some $\alpha > 0$. If the FFAST peeling-decoder over this graph succeeds in decoding all but at most $\alpha k$ of the variable nodes, then it decodes all the variable nodes.
Proof.
See Appendix D ∎
In Lemma VI.5, we show that most of the graphs in the ensemble $\mathcal{C}_1$ are expanders.
Lemma VI.5.
Consider a random graph from the ensemble $\mathcal{C}_1(k, d, \{f_i\})$, where $d \geq 3$. Then, all the subsets $\mathcal{S}$ of the variable nodes of the graph satisfy $|\mathcal{N}_i(\mathcal{S})| > |\mathcal{S}|/2$ for at least one stage $i$:

with probability at least $1 - \mathcal{O}(1/k)$, for sets of size $|\mathcal{S}| \leq \epsilon k$, for small enough $\epsilon > 0$;

with probability at least $1 - e^{-ck}$, for sets of size $|\mathcal{S}| > \epsilon k$, for some constant $c > 0$.
Proof.
See Appendix E ∎
The condition $d \geq 3$ is necessary for part 1) of Lemma VI.5. This can be seen as follows. Consider a random graph from the ensemble $\mathcal{C}_1(k, d, \{f_i\})$, where $d = 2$. If any two variable nodes in the graph have the same set of neighboring check nodes, then the expander condition for the set consisting of these two variable nodes is not satisfied. In a bipartite graph from the ensemble $\mathcal{C}_1(k, 2, \{f_0, f_1\})$, there are a total of $M = f_0 f_1$ distinct sets of check nodes. Each of the $k$ variable nodes chooses a set of check nodes uniformly at random, with replacement, from these $M$ sets. If we draw $k$ integers uniformly at random between $1$ and $M$, the probability that at least two of the numbers are the same is given by,

$P_{\mathrm{coll}} \;=\; 1 - \prod_{i=1}^{k-1}\left(1 - \frac{i}{M}\right). \qquad (11)$

This is also known as the birthday paradox or the birthday problem in the literature [28]. For a graph from the ensemble $\mathcal{C}_1(k, 2, \{f_0, f_1\})$, we have $M = f_0 f_1 \approx (\eta k)^2$. Hence, if the number of stages is $d = 2$, there is a constant probability that there exists a pair of variable nodes that share the same neighboring check nodes in both the stages, thus violating the expander condition.
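The birthday-problem probability in (11) is straightforward to evaluate numerically. The sketch below (helper name ours) reproduces the dichotomy just discussed: with $d = 2$ there are only on the order of $F^2$ distinct check-node sets, so $k \approx F$ draws collide with constant probability, whereas three stages give roughly $F^3$ sets and collisions become rare.

```python
def collision_prob(k, M):
    """Probability that k draws, uniform with replacement from M equally
    likely values, contain at least one repeat (the birthday problem)."""
    p_distinct = 1.0
    for i in range(k):
        p_distinct *= (M - i) / M
    return 1.0 - p_distinct

F = 1000
assert collision_prob(F, F * F) > 0.35       # d = 2: constant collision prob.
assert collision_prob(F, F ** 3) < 0.001     # d = 3: collisions are rare
```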
Theorem VI.6.
The FFAST peeling-decoder over a random graph from the ensemble $\mathcal{C}_1(k, d, \{f_i\})$, where $d \geq 3$ and $F = \eta k$:

successfully uncovers all the variable nodes with probability at least $1 - \mathcal{O}(1/k)$;

successfully uncovers all but a vanishingly small fraction, i.e., at most $\epsilon k$, of the variable nodes with probability at least $1 - 2e^{-\beta\epsilon^{2}k}$, for some constant $\beta > 0$, an arbitrarily small constant $\epsilon > 0$, and a fixed number of iterations $\ell$,

for an appropriate choice of the constant $\eta$ as per Table IV.
Proof.
Consider a random graph from the ensemble $\mathcal{C}_1(k, d, \{f_i\})$. Let $Z$ be the number of edges not decoded by the FFAST peeling-decoder in $\ell$ (a large but fixed constant number of) iterations after processing this graph. Then, from the recursion (8) and Lemma VI.2, for an appropriate choice of the constant $\eta$ (as per Table IV), $Z < \epsilon k d$ for an arbitrarily small constant $\epsilon > 0$, with probability at least $1 - 2e^{-\beta\epsilon^{2}k}$. The result then follows from Lemmas VI.5 and VI.4. ∎
Corollary VI.7.
The FFAST peeling-decoder over a random graph from the CRT ensemble $\mathcal{C}_2(k, d, \{f_i\})$, where $d \geq 3$ and $F = \eta k$:

successfully uncovers all the variable nodes with probability at least $1 - \mathcal{O}(1/k)$;

successfully uncovers all but a vanishingly small fraction, i.e., at most $\epsilon k$, of the variable nodes with probability at least $1 - 2e^{-\beta\epsilon^{2}k}$, for some constant $\beta > 0$, an arbitrarily small constant $\epsilon > 0$, and a fixed number of iterations $\ell$,

for an appropriate choice of the constant $\eta$ as per Table IV.
VI-D The FFAST front-end architecture parameters for achieving the sparsity index $0 < \delta \leq 1/3$
Consider a set $\{f_0, \ldots, f_{d-1}\}$ of $d$ pairwise coprime integers. The integers $f_i$ are such that they are approximately equal, i.e., $f_i = F + O(1)$, for $i = 0, \ldots, d-1$, where $F = \eta k$ (with $\eta$ as per Table IV) is an asymptotically large number. Set the signal length $N = P\prod_{i=0}^{d-1} f_i$, where $P \approx F^{1/\delta - d}$, thus achieving the sparsity index $k = \Theta(N^{\delta})$, for any positive constant $\delta \leq 1/d$. We construct a FFAST subsampling front-end with $d$ stages, where each stage further has $2$ delay-chains (see Fig. 1). The subsampling period used in both the delay-chains of stage $i$ is $N/f_i$. As a result, the number of samples at the output of each delay-chain of stage $i$ is $f_i$, for $i = 0, \ldots, d-1$, i.e., the total number of samples used by the FFAST algorithm is $2\sum_{i=0}^{d-1} f_i \approx 2d\eta k$.
VII FFAST Construction and Performance analysis for the less-sparse ($1/3 < \delta < 1$) regime
In the FFAST front-end architecture for the less-sparse regime, the per-stage subsampling integers, unlike in the very-sparse regime of Section VI, are not pairwise coprime. Instead, for the less-sparse regime ($1/3 < \delta < 1$) the relation between the integers is a bit more involved. Hence, for ease of exposition, we adhere to the following approach:

First, we describe the FFAST front-end construction and analyze the performance of the FFAST decoding algorithm for a simple case of the less-sparse regime: $\delta = 2/3$.

Then, in Section VII-B, we provide a brief sketch of how to generalize the FFAST architecture from $\delta = 2/3$ to $\delta = (d-1)/d$, for integer values of $d \geq 3$. This covers the values $\delta = 2/3, 3/4, 4/5$, etc.

In Section VII-C, we show how to achieve the intermediate values of the sparsity index $\delta$.

Finally, in Section VII-D, we use all the techniques developed in the previous sections to provide an explicit choice of parameters for the FFAST front-end architecture that achieves every sparsity index in the range $1/3 < \delta < 1$.
VII-A Less-sparse regime of $\delta = 2/3$
VII-A1 FFAST front-end construction
Consider $N = \prod_{i=0}^{2} f_i$, where the set $\{f_0, f_1, f_2\}$ consists of approximately equal sized pairwise coprime integers, with each $f_i = F + O(1)$ and $k = \eta F^2$ for some constant $\eta > 0$. This results in $N \approx F^3$, and hence a sparsity index of $\delta = 2/3$. Then, we construct a $3$-stage FFAST front-end architecture, where stage $i$ has two delay-chains, each with a subsampling period of $f_i$ and $N/f_i$ output samples.
VII-A2 Performance analysis of the FFAST decoding algorithm
In order to analyze the performance of the FFAST decoding algorithm, we follow an approach similar to that of Section VI for the very-sparse regime. We first describe an ensemble of bipartite graphs constructed using a balls-and-bins model. Then, we describe the CRT based ensemble of bipartite graphs generated by the FFAST front-end of Section VII-A1. We show that, by construction, these two ensembles are equivalent, and we analyze the performance of the FFAST peeling-decoder on a uniformly random graph from the balls-and-bins ensemble.
Balls-and-Bins construction
We construct a bipartite graph with $k$ variable nodes on the left and $\sum_{i=0}^{2} N/f_i$ check nodes on the right (see Fig. 7) using a balls-and-bins model as follows. Partition the check nodes into $3$ sets/stages containing $N/f_0 = f_1 f_2$, $N/f_1 = f_0 f_2$ and $N/f_2 = f_0 f_1$ check nodes respectively. The check nodes in each of the sets are arranged in a matrix format as shown in Fig. 7, e.g., the check nodes in the first set are arranged as $f_1$ rows and $f_2$ columns. A check node in row $r$ and column $c$ of a set is indexed by the pair $(r, c)$, and so on and so forth. Each variable node uniformly randomly chooses a triplet $(r_0, r_1, r_2)$, where $r_i$ is between $0$ and $f_i - 1$. The triplets are chosen with replacement and independently across all the variable nodes. A $3$-left-regular bipartite graph with $k$ variable nodes and $\sum_{i=0}^{2} N/f_i$ check nodes is then constructed by connecting a variable node with a triplet $(r_0, r_1, r_2)$ to the check nodes $(r_1, r_2)$, $(r_0, r_2)$ and $(r_0, r_1)$ in the three sets on the right, respectively, as shown in Fig. 7.
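A minimal sketch of this triplet-based construction (the representation is ours: a stage's check node is identified by its residue pair rather than by a row-column position in Fig. 7):

```python
import random

def less_sparse_balls_and_bins(k, f, seed=0):
    """Toy 3-stage less-sparse ensemble with assumed moduli f = [f0, f1, f2]:
    each variable node draws a triplet (r0, r1, r2), with r_i uniform in
    [0, f_i), and connects in stage j to the check node indexed by the
    pair of residues that omits r_j."""
    rng = random.Random(seed)
    graph = []
    for _ in range(k):
        r = [rng.randrange(fi) for fi in f]
        graph.append([(r[1], r[2]), (r[0], r[2]), (r[0], r[1])])
    return graph

g = less_sparse_balls_and_bins(k=4, f=[3, 4, 5])
# Stage 0 has f1 * f2 = 20 check nodes, indexed by pairs in [0,4) x [0,5).
assert all(0 <= a < 4 and 0 <= b < 5 for (a, b), _, _ in g)
```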