Toeplitz Block Matrices in Compressed Sensing
Abstract.
Recent work in compressed sensing theory shows that sensing matrices whose entries are drawn independently and identically distributed (IID) from certain probability distributions guarantee exact recovery of a sparse signal with high probability, even when the number of measurements is far smaller than the signal dimension. Motivated by signal processing applications, random filtering with Toeplitz sensing matrices whose elements are drawn from the same distributions was considered and shown to be sufficient to recover a sparse signal exactly from a reduced number of samples with high probability. This paper considers Toeplitz block matrices as sensing matrices. They naturally arise in multichannel and multidimensional filtering applications and include Toeplitz matrices as special cases. It is shown that they, too, yield a high probability of exact reconstruction. Their performance is validated using simulations.
1. Introduction
The central problem in compressed sensing (CS) is the recovery of a vector $x \in \mathbb{R}^n$ from linear measurements $y \in \mathbb{R}^k$ of the form
(1.1) $y_j = \langle x, \phi_j \rangle, \quad j = 1, \ldots, k,$
where $k$ is assumed to be much smaller than $n$. Of course, for $k < n$, (1.1) poses an underdetermined system of equations which has nonunique solutions, so exact recovery of the original vector requires further prior information. The work by Candès, Donoho, Romberg, Tao, and others (see e.g. [1], [2], and the references therein) showed that, under the assumption that $x$ is sparse, one can actually recover $x$ from a sample $y$ which is much smaller in size than $x$ by solving a convex program with a suitably chosen set of sampling vectors $\phi_j$. If we write the linear system (1.1) in the matrix form
(1.2) $y = \Phi x,$
then the question of which sampling methods guarantee the exact recovery of $x$ becomes the question of which matrices $\Phi$ are “good” compressed sensing matrices, meaning that they ensure exact recovery of a sparse $x$ from $y$ with high probability under the condition that $k \ll n$.
In [3] Candès and Tao introduced the restricted isometry property (RIP) as a condition on matrices $\Phi$ which provides a guarantee on the performance of $\Phi$ in compressed sensing.
Following their definition, we say that a matrix $\Phi$ satisfies RIP of order $m$ and constant $\delta \in (0,1)$ if
(1.3) $(1-\delta)\|c\|_2^2 \le \|\Phi_T c\|_2^2 \le (1+\delta)\|c\|_2^2$
holds for every $T \subset \{1, \ldots, n\}$ with $|T| \le m$ and every $c \in \mathbb{R}^{|T|}$, where $\Phi_T$ denotes the matrix obtained by retaining only the columns of $\Phi$ corresponding to the entries of $T$.
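For a fixed support set, the sharpest constant in (1.3) is determined by the extreme eigenvalues of the Gram matrix of the retained columns. The following minimal sketch computes it numerically; the function name, sizes, and Gaussian scaling are illustrative choices, not part of the theory above:

```python
import numpy as np

def rip_constant(Phi, T):
    """Smallest delta for which (1.3) holds for the single support set T,
    computed from the extreme eigenvalues of the Gram matrix Phi_T^T Phi_T."""
    PhiT = Phi[:, list(T)]                     # retain only the columns indexed by T
    eigs = np.linalg.eigvalsh(PhiT.T @ PhiT)
    return max(1.0 - eigs.min(), eigs.max() - 1.0)

rng = np.random.default_rng(0)
k, n, m = 128, 512, 5
Phi = rng.normal(0.0, 1.0 / np.sqrt(k), size=(k, n))  # IID Gaussian, E||column||^2 = 1
T = rng.choice(n, size=m, replace=False)
assert 0.0 <= rip_constant(Phi, T) < 1.0   # a random m-column submatrix is well conditioned
```

A matrix with orthonormal retained columns has constant zero for that set, which matches the definition.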
It was shown in [3] (reinterpreted in [4]) that if $\Phi$ satisfies RIP of a suitable order (a small multiple of $m$) with a sufficiently small constant $\delta$:
(1.4) $(1-\delta)\|c\|_2^2 \le \|\Phi_T c\|_2^2 \le (1+\delta)\|c\|_2^2,$
where $T$ ranges over the corresponding support sets and $c \in \mathbb{R}^{|T|}$, the decoder $\Delta$ given by
(1.5) $\Delta(y) = \arg\min_{z:\,\Phi z = y} \|z\|_1$
ensures exact recovery of $x$ from $y$.
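The decoder (1.5) can be evaluated in practice by recasting the $\ell_1$ minimization as a linear program via the standard splitting $z = u - v$ with $u, v \ge 0$. The sketch below uses SciPy's general-purpose LP solver as a stand-in for the specialized solvers used in the CS literature; the sizes are illustrative:

```python
import numpy as np
from scipy.optimize import linprog

def l1_decode(Phi, y):
    """Solve min ||z||_1 subject to Phi z = y as a linear program,
    writing z = u - v with u, v >= 0 (a standard reformulation of (1.5))."""
    k, n = Phi.shape
    c = np.ones(2 * n)                  # objective: sum(u) + sum(v) = ||z||_1
    A_eq = np.hstack([Phi, -Phi])       # equality constraint: Phi u - Phi v = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * n))
    u, v = res.x[:n], res.x[n:]
    return u - v

rng = np.random.default_rng(1)
k, n, m = 40, 100, 3
Phi = rng.normal(0.0, 1.0 / np.sqrt(k), size=(k, n))
x = np.zeros(n)
x[rng.choice(n, size=m, replace=False)] = rng.normal(size=m)
x_hat = l1_decode(Phi, Phi @ x)
assert np.allclose(x_hat, x, atol=1e-4)   # exact recovery up to solver tolerance
```

With $k$ well above the recovery threshold for this sparsity, the LP returns the original sparse vector up to numerical tolerance.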
Recently Baraniuk et al. [5] showed that matrices whose entries are drawn independently from certain probability distributions satisfy RIP of order $m$ with probability at least $1 - 2e^{-c_2 k}$ for every $\delta \in (0,1)$, provided that $k \ge c_1 m \log(n/m)$, where $c_1, c_2$ are some positive constants depending only on $\delta$. Motivated by applications in signal processing, Bajwa et al. [6] considered (truncated) Toeplitz-structured matrices whose entries are drawn from the same probability distributions and showed that these also satisfy RIP of order $m$ with high probability, at the cost of a stronger requirement on the number of measurements $k$.
Some examples of probability distributions that can be used in this context have been studied in [7]. They include
(1.6) $\phi_{ij} \sim N\!\left(0, \tfrac{1}{k}\right)$; $\quad \phi_{ij} = \pm\tfrac{1}{\sqrt{k}}$, each with probability $\tfrac{1}{2}$; $\quad \phi_{ij} \in \left\{+\sqrt{3/k},\, 0,\, -\sqrt{3/k}\right\}$ with probabilities $\tfrac{1}{6}, \tfrac{2}{3}, \tfrac{1}{6}$.
Motivated by applications in multichannel sampling, in this paper we consider Toeplitz block matrices with the elements in each block drawn independently from one of the probability distributions in (1.6), as well as some other block matrices with similar structures. We show that such matrices also satisfy RIP of order $m$ for every $\delta \in (0,1)$ with high probability, provided that $k$ is sufficiently large, with constants depending only on $\delta$ (see Theorem 2.1). These Toeplitz block matrices naturally represent the system equation matrices in multichannel sampling applications where a single input signal is recovered from the output samples of multiple channels with IID random filters. The result justifies the use of multichannel over single-channel systems in compressed sensing. The advantages of Toeplitz matrices pointed out in [6], e.g. efficient implementations, also apply to the matrices considered in this paper.
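For concreteness, columns drawn from the three distributions in (1.6) can be sampled as follows. The scaling reflects the normalization that makes the expected squared column norm equal to 1 (cf. the footnote in Section 3); the sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
k = 10_000   # many rows, so empirical column norms concentrate near 1

# Three admissible entry distributions, scaled so that E||column||^2 = 1:
gaussian  = rng.normal(0.0, 1.0 / np.sqrt(k), size=k)          # N(0, 1/k)
bernoulli = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)       # +-1/sqrt(k), prob 1/2 each
ternary   = rng.choice([1.0, 0.0, -1.0], p=[1/6, 2/3, 1/6], size=k) * np.sqrt(3.0 / k)

for col in (gaussian, bernoulli, ternary):
    assert abs(np.sum(col ** 2) - 1.0) < 0.1   # squared column norm is 1 in expectation
```

The Bernoulli column has squared norm exactly 1; the other two concentrate around 1 as $k$ grows.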
2. Main Result
Theorem 2.1.
For Toeplitz block matrices of the form
(2.1) 
with blocks whose elements are drawn independently from one of the probability distributions in (1.6), there exist constants depending only on $\delta$ such that:
(i) If the number of blocks in one column of $\Phi$ does not exceed a certain value depending only on the sparsity $m$, then for any sufficiently large $k$, $\Phi$ satisfies RIP of order $m$ for every $\delta \in (0,1)$ with high probability.
(ii) If the number of blocks is not bounded in this way, then for a correspondingly larger $k$, $\Phi$ satisfies RIP of order $m$ for every $\delta \in (0,1)$ with high probability.
The above theorem gives the requirement on $k$ and the probability of exact reconstruction of a sparse signal from a measurement $y$ when Toeplitz block matrices are used. In particular, it says that if the number of blocks in one column of $\Phi$ does not exceed a certain value depending only on the sparsity $m$ of the signal, the probability of perfect reconstruction is higher and the number of required measurements is smaller than when the number of blocks is not bounded in this way.
As noted in [6, 9], Toeplitz matrices naturally arise in one-dimensional single-channel filtering applications where the matrix elements are filter coefficients. Similarly, the Toeplitz block matrices defined in (2.1) naturally arise in one-dimensional multichannel sampling applications where the filter is longer than the input signal by a prescribed number of points. The conventional multichannel sampling theorem states that, for exact recovery, the sampling rate reduction over the single-channel system cannot exceed the number of channels, whereas Theorem 2.1 suggests that multichannel systems with IID random filters might be able to reduce the sampling rate by a factor higher than the number of channels.
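As an illustration of the multichannel filtering interpretation, a sensing matrix can be assembled by stacking one truncated Toeplitz (convolution) block per channel, each generated by its own IID random filter. The construction below is a sketch under these assumptions (Bernoulli filters, equal rows per channel, illustrative sizes), not the paper's formal definition (2.1):

```python
import numpy as np
from scipy.linalg import toeplitz

def toeplitz_block(n, rows, rng):
    """One channel: a truncated Toeplitz (convolution) matrix of size rows x n,
    generated by a single IID Bernoulli +-1 filter sequence."""
    seq = rng.choice([-1.0, 1.0], size=n + rows - 1)
    return toeplitz(seq[n - 1:], seq[n - 1::-1])   # first column, first row

def multichannel_matrix(n, k, L, rng):
    """Stack L channel blocks of k/L rows each into a k x n sensing matrix,
    scaled so that every column has unit squared norm."""
    assert k % L == 0
    blocks = [toeplitz_block(n, k // L, rng) for _ in range(L)]
    return np.vstack(blocks) / np.sqrt(k)

rng = np.random.default_rng(3)
Phi = multichannel_matrix(n=64, k=16, L=4, rng=rng)
assert Phi.shape == (16, 64)
assert np.allclose((Phi ** 2).sum(axis=0), 1.0)    # unit-norm columns
```

Each channel contributes $k/L$ consecutive convolution outputs, so the whole matrix is determined by far fewer random variables than an IID matrix of the same size.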
We remark that the result in Theorem 2.1 also holds for other block matrices with similar structures (see Section 4).
3. Proof of Main Result
Let $T \subset \{1, \ldots, n\}$. Denote by $\Phi_T$ the matrix obtained by retaining only those columns of $\Phi$ corresponding to the elements in $T$, and consider, for each row of $\Phi_T$ and each block of $\Phi$, the set of random variables they have in common.
We note that if (1.4) holds for a set $T$, then it also holds for any $T' \subset T$. To prove that Toeplitz IID block matrices satisfy RIP with high probability, it is therefore enough to consider only those sets $T$ with $|T| = m$.
Lemma 3.1.
Define, for each row of $\Phi_T$, the set of rows that are stochastically dependent on it. Then:
(i) If $T$ satisfies the first condition, the cardinality of each such set obeys the first bound.
(ii) If $T$ satisfies the second condition, the cardinality of each such set obeys the second bound.
Proof.
Fix $T$. The set $T$ defines a sequence $(m_1, m_2, \ldots)$, where $m_i$ is the number of columns from block $i$ in $T$; thus $\sum_i m_i = m$. Consider the number of rows that have a dependency with the elements in a given row. Since all elements inside a single block are independent, there can be no dependencies within one block. Moreover, because of the structure of the matrix $\Phi$, only a bounded number of rows outside the block can depend on any element of the given row.
(i) If $T$ satisfies the first condition, these rows may all be distinct, and we obtain the stated number of dependent rows.
(ii) If $T$ satisfies the second condition, the count is upper bounded by the number of blocks, giving the stated bound.
∎
In [7] it has been shown that for given $\delta$, $m$, and $T$ with $|T| = m$, an IID matrix of size $k \times n$ with entries drawn independently from one of the distributions in (1.6)¹ satisfies (1.3) with probability at least
(3.1) $1 - 2(12/\delta)^m e^{-k c_0},$
where
(3.2) $c_0 = c_0(\delta) = \delta^2/16 - \delta^3/48.$
¹These matrices consist of columns whose squared norm is equal to 1 in expectation.
Now consider a (truncated) Toeplitz block matrix as in (2.1), where the blocks are such IID matrices with entries drawn independently from the same set of distributions as above.
The following lemma gives a lower bound for the probability that a matrix as in (2.1) satisfies (1.4) for any fixed subset $T$ with $|T| = m$. Lemma 3.3 gives a tighter bound for the other case.
Lemma 3.2.
For given $T$ with $|T| = m$ and $\delta \in (0,1)$, the Toeplitz block submatrix $\Phi_T$ satisfies (1.4) with probability at least
Proof.
We can write the matrix $\Phi_T$ in the stacked form
(3.3) 
where the blocks are given by the columns determined by $T$ in the $i$th row of blocks in $\Phi$.
Note that each such block is an IID matrix with entries from one of the distributions in (1.6). After appropriate rescaling, these matrices have columns whose squared norm is equal to 1 in expectation and, by (3.1), each satisfies (1.4) with probability at least
(3.4) 
Now, since the identity
(3.5) 
holds, the restricted isometry bounds for the individual blocks combine to give the bound for $\Phi_T$. In other words, the intersection of the former events implies the latter event. Consequently, the claimed probability bound follows by a union bound over the blocks.
∎
Lemma 3.3.
For given $T$ with $|T| = m$ and $\delta \in (0,1)$, if the number of blocks satisfies the condition of case (ii), the Toeplitz block submatrix $\Phi_T$ satisfies (1.4) with a probability bound tighter than that of Lemma 3.2.
Proof.
Construct an undirected dependency graph on the rows of $\Phi_T$, joining two rows whenever they are stochastically dependent.
By Lemma 3.1, each row can be dependent on at most a bounded number of other rows, so the maximum degree of the graph is bounded accordingly. Using the Hajnal–Szemerédi theorem on equitable colorings of graphs, we can therefore partition the rows into color classes of nearly equal size, using one more color than the maximum degree. Let the different color classes be so obtained; then
Now, let each submatrix be obtained from $\Phi_T$ by retaining the rows corresponding to the indices in one color class, so that $\|\Phi_T c\|_2^2$ decomposes as
(3.6) 
Every such submatrix is an IID matrix whose columns, after rescaling, have squared norm equal to 1 in expectation. By (3.1), they satisfy (1.4) with probability at least
(3.7) 
Since the color classes have nearly equal sizes, by (3.6) we have that if every class submatrix satisfies its restricted isometry bound, then so does $\Phi_T$. In other words, the intersection of the former events implies the latter event. Consequently, the claimed probability bound follows by a union bound over the color classes.
∎
Proof of Theorem 2.1.
Proof.
(i) From (3.2) and Lemma 3.2 we have that $\Phi$ satisfies (1.4) for any $T$ such that $|T| = m$ with probability at least
(3.8) 
Since there are $\binom{n}{m}$ such subsets, using Bonferroni’s inequality (see e.g. [8]) yields that $\Phi$ satisfies RIP of order $m$ with probability at least
(3.9) 
Fix $\delta$ and pick the constants accordingly. Then for any sufficiently large $k$, the exponent of $e$ in (3.9) is upper bounded as required.
(ii) From (3.2) and Lemma 3.3 we have that $\Phi$ satisfies (1.4) for any $T$ such that $|T| = m$ with probability at least
(3.10) 
Since there are $\binom{n}{m}$ such subsets, using Bonferroni’s inequality again yields that $\Phi$ satisfies RIP of order $m$ with probability at least
(3.11) 
Now fix $\delta$ and pick the constants accordingly. Then, for any sufficiently large $k$, the exponent of $e$ in (3.11) is upper bounded as required. This completes the proof of the theorem. ∎
4. Other Block Matrices
4.1. Circulant matrices
The above considerations can be applied to (truncated) circulant block matrices of the form
(4.1) 
where the blocks are all IID matrices.
Similar to (2.1), the circulant matrices in (4.1) also represent the system equation matrices in multichannel sampling, but the convolution is a circular one. They usually arise in applications where convolutions are implemented by multiplications in the Fourier domain.
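The remark about Fourier-domain implementations can be made concrete: multiplying by a circulant matrix generated by an IID filter is exactly a circular convolution, which can be computed with FFTs, and a truncated measurement then simply keeps $k$ of the $n$ outputs. A small sketch with illustrative sizes:

```python
import numpy as np
from scipy.linalg import circulant

rng = np.random.default_rng(4)
n = 64
h = rng.choice([-1.0, 1.0], size=n)    # one IID random filter
x = rng.normal(size=n)

# circular convolution via multiplication in the Fourier domain ...
y = np.real(np.fft.ifft(np.fft.fft(h) * np.fft.fft(x)))

# ... agrees with multiplication by the circulant matrix generated by h
C = circulant(h)
assert np.allclose(C @ x, y)

# a truncated circulant measurement keeps only k of the n outputs
k = 16
assert np.allclose(C[:k] @ x, y[:k])
```

The FFT route costs $O(n \log n)$ per measurement vector instead of $O(kn)$ for an explicit matrix multiply.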
Before we present the theorem for this type of matrix, we first comment on the maximum number of stochastically dependent rows in a (truncated) circulant matrix of the form
(4.2) 
Again, we consider the rows of the matrix obtained by retaining only those columns of (4.2) corresponding to $T$.
Lemma 4.1.
For each row, define the set of rows that are stochastically dependent on it. Then each such set has cardinality bounded as in the Toeplitz case.
Proof.
Note first that an upper bound for the square case clearly upper bounds the truncated case, so we may assume the matrix is a square circulant matrix. Then the number of rows stochastically dependent on a given row is the same for every row, and we can, w.l.o.g., consider the first row. Let a tuple be defined by
and consider the matrix
(4.3) 
where defines the rightshift . Denote by the matrix obtained by retaining only those columns of corresponding to . It is now easy to see that
where is the Hamming distance defined by
∎
The following theorem gives lower bounds for the probability that a circulant block matrix as in (4.1) satisfies the RIP of order $m$. Note that the bounds obtained are the same as in Theorem 2.1, although the number of independent entries in (4.1) is greater than before. This is due to the nature of the proof, which uses the number of stochastically dependent rows, and this number is the same for both Toeplitz and circulant matrices.
Theorem 4.1.
Let $\Phi$ be as in (4.1). Then there exist constants depending only on $\delta$ such that:
(i) If the number of blocks in one column of $\Phi$ does not exceed a certain value depending only on the sparsity $m$, then for any sufficiently large $k$, $\Phi$ satisfies RIP of order $m$ for every $\delta \in (0,1)$ with high probability.
(ii) If the number of blocks is not bounded in this way, then for a correspondingly larger $k$, $\Phi$ satisfies RIP of order $m$ for every $\delta \in (0,1)$ with high probability.
Proof.
A similar argument to the one in the proof of Lemma 3.1 shows that the upper bound on the maximum number of rows stochastically dependent on any row of a (truncated) circulant block matrix is the same as for (truncated) Toeplitz block matrices (use Lemma 4.1). The proof of Theorem 2.1 then applies directly to the setting at hand. ∎
4.2. Circulant-Circulant Matrices
We also consider matrices that are (truncated) circulant block matrices whose blocks are themselves circulant:
(4.4)  
(4.5) 
Let one operator denote the right-shift of the blocks and another the right-shift of the elements inside a block, both by one position. These matrices arise in two-dimensional imaging applications where the independent elements are the coefficients of the point spread function of the imaging system. Replacing (4.3) in the proof of Lemma 4.1 by the corresponding two-dimensional shift matrix
readily yields the upper bound for the number of rows stochastically dependent on any one row. Applying Lemma 3.3 and Theorem 4.1 then shows that the probability of perfect reconstruction is no less than the bound given in Theorem 4.1. This says that imaging systems with IID random point spread functions can significantly reduce the number of acquired samples while still being able to reconstruct the original sparse image when the above conditions hold.
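A sketch of this acquisition model, assuming a Bernoulli point spread function and a random choice of retained samples (all sizes illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
N = 32                                        # image is N x N, so n = N*N unknowns
psf = rng.choice([-1.0, 1.0], size=(N, N))    # IID random point spread function

image = np.zeros((N, N))
support = rng.choice(N * N, size=10, replace=False)
image.ravel()[support] = rng.normal(size=10)  # sparse image

# acquisition: 2D circular convolution with the PSF (multiplication in the
# 2D Fourier domain), then retain only k of the N*N output samples
full = np.real(np.fft.ifft2(np.fft.fft2(psf) * np.fft.fft2(image)))
keep = rng.choice(N * N, size=256, replace=False)
y = full.ravel()[keep]
assert y.shape == (256,)                      # k = 256 measurements of n = 1024 pixels
```

Reconstruction from `y` would then proceed by $\ell_1$ minimization as in (1.5), with the measurement operator applied implicitly via FFTs.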
4.3. Circulant-Circulant Block Matrices
As a generalization of the matrices defined by (4.4) and (4.5), the following matrices are also considered:
(4.6)  
where the blocks are all IID matrices. These matrices arise in multichannel two-dimensional imaging applications where the number of rows of blocks corresponds to the number of independent channels. We show next that these matrices are also good compressed sensing matrices.
Corollary 4.1.
Let $\Phi$ be as in (4.6). Then there exist constants depending only on $\delta$ such that:
(i) If the number of blocks in one column of $\Phi$ does not exceed a certain value depending only on the sparsity $m$, then for any sufficiently large $k$, $\Phi$ satisfies RIP of order $m$ for every $\delta \in (0,1)$ with high probability.
(ii) If the number of blocks is not bounded in this way, then for a correspondingly larger $k$, $\Phi$ satisfies RIP of order $m$ for every $\delta \in (0,1)$ with high probability.
4.4. Deterministic Construction
The CS matrices we have considered so far are based on randomized constructions. However, in certain applications, deterministic constructions are preferred. In [10] DeVore provided a deterministic construction of CS matrices using polynomials over finite fields. We will consider deterministic block matrices based on DeVore’s construction. Let us first recall the construction in [10].
Consider the set $\mathbb{F}_p \times \mathbb{F}_p$, where $\mathbb{F}_p$ denotes the field of integers modulo $p$, a prime. This set has $p^2$ elements. Define $\mathcal{P}_r$ to be the set of polynomials of degree at most $r$ over $\mathbb{F}_p$. This set has $p^{r+1}$ elements. For every $q \in \mathcal{P}_r$, define the graph of $q$ by
$G(q) = \{(x, q(x)) : x \in \mathbb{F}_p\} \subset \mathbb{F}_p \times \mathbb{F}_p,$
and consider the column vector $v_q \in \mathbb{R}^{p^2}$, indexed by the elements of $\mathbb{F}_p \times \mathbb{F}_p$ ordered lexicographically, given by
$v_q(a, b) = 1$ if $(a, b) \in G(q)$, and $0$ otherwise.
Construct the matrix whose columns are the normalized vectors $\frac{1}{\sqrt{p}} v_q$, where the polynomials $q \in \mathcal{P}_r$ are ordered lexicographically with respect to their coefficients. It was shown in [10] that this matrix satisfies RIP with $\delta = (m-1)r/p$ for any $m < p/r + 1$.
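DeVore's construction can be transcribed directly. The sketch below builds the unnormalized 0/1 matrix and checks the two properties used in the RIP argument: every column has exactly $p$ ones, and two distinct columns overlap in at most $r$ positions. The function name and parameter values are illustrative:

```python
import numpy as np
from itertools import product

def devore_matrix(p, r):
    """Unnormalized 0/1 DeVore matrix of size p^2 x p^(r+1): one column per
    polynomial of degree <= r over Z_p, with ones on the graph {(x, q(x))}."""
    cols = []
    for coeffs in product(range(p), repeat=r + 1):   # lexicographic coefficient order
        col = np.zeros(p * p)
        for x in range(p):
            qx = sum(c * x ** j for j, c in enumerate(coeffs)) % p
            col[x * p + qx] = 1.0                    # row indexed by the pair (x, q(x))
        cols.append(col)
    return np.column_stack(cols)

p, r = 5, 2
Phi = devore_matrix(p, r)
assert Phi.shape == (p * p, p ** (r + 1))
assert np.all(Phi.sum(axis=0) == p)    # every column has exactly p ones
G = Phi.T @ Phi                        # integer overlap counts between columns
off = G[~np.eye(p ** (r + 1), dtype=bool)]
assert off.max() <= r                  # distinct polynomial graphs meet in <= r points
```

Normalizing the columns by $1/\sqrt{p}$ then yields unit-norm columns whose pairwise inner products are at most $r/p$, which is exactly what drives the RIP bound.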
Now consider
(4.7) 
where each block is constructed from the first such column vectors, as above.
Theorem 4.2.
The matrix in (4.7) satisfies RIP with a constant of the same form for any $m$ in the corresponding range.
Proof.
As before, we only have to consider the case $|T| = m$. Let $T$ be such that $|T| = m$, and let $\Psi_T$ denote the matrix obtained by retaining only those columns of the matrix in (4.7) corresponding to the elements in $T$. Consider the Gram matrix $\Psi_T^{\mathsf{T}} \Psi_T$. Since every column of $\Psi_T$ has the same number of nonzero entries of equal magnitude, after normalization the diagonal elements of the Gram matrix are all one. An off-diagonal element is the inner product of two normalized indicator vectors of graphs of distinct polynomials. Since the graphs of two different polynomials in $\mathcal{P}_r$ have at most $r$ elements in common, each off-diagonal element is at most $r/p$ in magnitude, and the sum of all off-diagonal elements in any row or column of the Gram matrix is at most $(m-1)r/p$. We can therefore write
(4.8) $\Psi_T^{\mathsf{T}} \Psi_T = I + B,$
where $I$ is the identity and $B$ collects the off-diagonal part. Since each row sum of $|B|$ is at most $(m-1)r/p$, the spectral norm of $B$ is at most $(m-1)r/p$, so the eigenvalues of the Gram matrix lie in $[1 - (m-1)r/p,\ 1 + (m-1)r/p]$. This shows that $\Psi_T$ satisfies (1.4). ∎
5. Numerical Results
To validate that the probability of exact recovery for Toeplitz block CS matrices is high, the performance of Toeplitz block, IID, and Toeplitz CS matrices is compared empirically. In our simulation, a length n = 2048 signal with m = 20 randomly placed nonzero entries drawn independently from the Gaussian distribution was generated. Each such signal is sampled using IID, Toeplitz, and Toeplitz block matrices with entries drawn independently from the Bernoulli distribution and reconstructed using the log-barrier solver from [11]. The experiment is declared a success if the signal is exactly recovered, i.e., the error is within the range of machine precision. The empirical probability of success is determined by repeating the reconstruction experiment 1000 times and calculating the fraction of successes. This empirical probability of success is plotted as a function of the number of measurement samples k in Fig. 1. The simulation results show that the Toeplitz block matrices perform comparably to IID matrices.
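A reduced-scale version of this experiment can be sketched as follows, substituting SciPy's LP solver for the log-barrier solver of [11] and smaller dimensions for speed; all sizes are illustrative:

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.optimize import linprog

def l1_min(Phi, y):
    # min ||z||_1 s.t. Phi z = y, via the standard LP split z = u - v, u, v >= 0
    n = Phi.shape[1]
    res = linprog(np.ones(2 * n), A_eq=np.hstack([Phi, -Phi]), b_eq=y,
                  bounds=[(0, None)] * (2 * n))
    return res.x[:n] - res.x[n:]

rng = np.random.default_rng(6)
n, m, k, trials = 128, 4, 48, 20
successes = 0
for _ in range(trials):
    x = np.zeros(n)
    x[rng.choice(n, size=m, replace=False)] = rng.normal(size=m)   # sparse signal
    seq = rng.choice([-1.0, 1.0], size=n + k - 1)                  # Bernoulli filter
    Phi = toeplitz(seq[n - 1:], seq[n - 1::-1]) / np.sqrt(k)       # truncated Toeplitz
    successes += np.allclose(l1_min(Phi, Phi @ x), x, atol=1e-4)

print(successes / trials)   # empirical probability of exact recovery
```

Sweeping `k` and repeating for IID and Toeplitz block matrices reproduces success curves of the kind plotted in Fig. 1.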
Acknowledgment
The third author would like to acknowledge the support from IMA for his participation in the short course “Compressive Sampling and Frontiers in Signal Processing”.
References
 [1] E. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information", IEEE Trans. Inf. Theory 52, no. 2, pp. 489–509, 2006.
 [2] D. Donoho, "Compressed sensing", IEEE Trans. Inf. Theory 52, no. 4, pp. 1289–1306, 2006.
 [3] E. Candès and T. Tao, "Decoding by linear programming", IEEE Trans. Inf. Theory 51, no. 12, pp. 4203–4215, 2005.
 [4] A. Cohen, W. Dahmen, and R. DeVore, "Compressed sensing and best k-term approximation", 2006, Preprint.
 [5] R. Baraniuk, M. Davenport, R. DeVore, and M. Wakin, "A simple proof of the restricted isometry property for random matrices", 2007, Preprint.
 [6] W. Bajwa, J. Haupt, G. Raz, S. Wright, and R. Nowak, "Toeplitz-structured compressed sensing matrices", IEEE SSP Workshop, pp. 294–298, 2007.
 [7] D. Achlioptas, "Database-friendly random projections", Proc. ACM SIGMOD-SIGACT-SIGART Symp. on Principles of Database Systems, pp. 274–281, 2001.
 [8] Y. Dodge, F. Marriott, Int. Statistical Institute, The Oxford Dictionary of Statistical Terms, 6th ed., Oxford University Press, p. 47, 2003.
 [9] J. A. Tropp, "Random filters for compressive sampling", Proc. 40th Annual Conference on Information Sciences and Systems, pp. 216–217, 22–24 March 2006.
 [10] R. DeVore, "Deterministic constructions of compressed sensing matrices", 2007, Preprint.
 [11] E. Candès, ℓ1-magic software, http://www.acm.caltech.edu/l1magic/.