Uniform Uncertainty Principle and signal recovery via Regularized Orthogonal Matching Pursuit

# Uniform Uncertainty Principle and signal recovery via Regularized Orthogonal Matching Pursuit

Deanna Needell and Roman Vershynin Department of Mathematics, University of California, Davis, CA 95616, USA
July 23, 2007
###### Abstract.

This paper seeks to bridge the two major algorithmic approaches to sparse signal recovery from an incomplete set of linear measurements – -minimization methods and iterative methods (Matching Pursuits). We find a simple regularized version of Orthogonal Matching Pursuit (ROMP) which has advantages of both approaches: the speed and transparency of OMP and the strong uniform guarantees of -minimization. Our algorithm ROMP reconstructs a sparse signal in a number of iterations linear in the sparsity, and the reconstruction is exact provided the linear measurements satisfy the Uniform Uncertainty Principle.

###### Key words and phrases:
signal recovery algorithms, restricted isometry condition, uncertainty principle, Basis Pursuit, Compressed Sensing, Orthogonal Matching Pursuit, signal recovery, sparse approximation
###### 1991 Mathematics Subject Classification:
68W20, 65T50, 41A46
Partially supported by the Alfred P. Sloan Foundation and by NSF DMS grants 0401032 and 0652617.
Communicated by Emmanuel Candes.

## 1. Introduction

Sparse recovery problems arise in many applications ranging from medical imaging to error correction. Suppose is an unknown -dimensional signal with at most nonzero components:

 v∈Rd,|supp(v)|≤n≪d.

We call such signals -sparse. Suppose we are able to collect nonadaptive linear measurements of , and wish to efficiently recover from these. The measurements are given as the vector , where is some measurement matrix.111We chose to work with real numbers for simplicity of presentation; similar results hold over complex numbers.

As discussed in , exact recovery is possible with just . However, recovery using only this property is not numerically feasible; the sparse recovery problem in general is known to be NP-hard. Nevertheless, massive recent work in the emerging area of Compressed Sensing demonstrated that for several natural classes of measurement matrices , the signal can be exactly reconstructed from its measurements with

 (1.1) N=nlogO(1)(d).

In other words, the number of measurements should be almost linear in the sparsity . Survey  contains some of these results; the Compressed Sensing webpage  documents progress in this area.

The two major algorithmic approaches to sparse recovery are methods based on -minimization and iterative methods (Matching Pursuits). We now briefly describe these methods. Then we propose a new iterative method that has advantages of both approaches.

### 1.1. L1-minimization

This approach to sparse recovery has been advocated over decades by Donoho and his collaborators (see e.g. ). The sparse recovery problem can be stated as the problem of finding the sparsest signal with the given measurements :

 (L0) min∥u∥0subject toΦu=Φv

where . Donoho and his associates advocated the principle that for some measurement matrices , the highly non-convex combinatorial optimization problem should be equivalent to its convex relaxation

 (L1) min∥u∥1subject toΦu=Φv

where denotes the -norm of the vector . The convex problem can be solved using methods of convex and even linear programming.

The recent progress in the emerging area of Compressed Sensing pushed forward this program (see survey ). A necessary and sufficient condition of exact sparse recovery is that the map be one-to-one on the set of -sparse vectors. Candès and Tao  proved that a stronger quantitative version of this condition guarantees the equivalence of the problems and .

###### Definition 1.1 (Restricted Isometry Condition).

A measurement matrix satisfies the Restricted Isometry Condition (RIC) with parameters for if we have

 (1−ε)∥v∥2≤∥Φv∥2≤(1+ε)∥v∥2for all m-sparse vectors.

The Restricted Isometry Condition states that every set of columns of forms approximately an orthonormal system. One can interpret the Restricted Isometry Condition as an abstract version of the Uniform Uncertainty Principle in harmonic analysis (, see also discussions in  and ).

###### Theorem 1.2 (Sparse recovery under RIC ).

Assume that the measurement matrix satisfies the Restricted Isometry Condition with parameters . Then every -sparse vector can be exactly recovered from its measurements as a unique solution to the convex optimization problem .

In a lecture on Compressive Sampling, Candès sharpened this to work for the Restricted Isometry Condition with parameters . Measurement matrices that satisfy the Restricted Isometry Condition with number of measurements as in (1.1) include random Gaussian, Bernoulli and partial Fourier matrices. Section 2 contains more detailed information.

### 1.2. Orthogonal Matching Pursuit (OMP)

An alternative approach to sparse recovery is via iterative algorithms, which find the support of the -sparse signal progressively. Once is found correctly, it is easy to compute the signal from its measurements as , where denotes the measurement matrix restricted to columns indexed by .

A basic iterative algorithm is Orthogonal Matching Pursuit (OMP), popularized and analyzed by Gilbert and Tropp in , see  for a more general setting. OMP recovers the support of , one index at a time, in steps. Under a hypothetical assumption that is an isometry, i.e. the columns of are orthonormal, the signal can be exactly recovered from its measurements as .

The problem is that the matrix is never an isometry in the interesting range where the number of measurements is smaller than the ambient dimension . Even though the matrix is not an isometry, one can still use the notion of coherence in recovery of sparse signals. In that setting, greedy algorithms are used with incoherent dictionaries to recover such signals, see , , . In our setting, for random matrices one expects the columns to be approximately orthogonal, and the observation vector to be a good approximation to the original signal .

The biggest coordinate of the observation vector in magnitude should thus be a nonzero coordinate of the signal . We thus find one point of the support of . Then OMP can be described as follows. First, we initialize the residual . At each iteration, we compute the observation vector . Denoting by the coordinates selected so far, we solve a least squares problem and update the residual

 y=argminz∈RI∥x−Φz∥2;r=x−Φy,

to remove any contribution of the coordinates in . OMP then iterates this procedure times, and outputs a set of size , which should equal the support of the signal .

Tropp and Gilbert  analyzed the performance of OMP for Gaussian measurement matrices ; a similar result holds for general subgaussian matrices. They proved that, for every fixed -sparse -dimensional signal , and an random Gaussian measurement matrix , OMP recovers (the support of) from the measurements correctly with high probability, provided the number of measurements is .

### 1.3. Advantages and challenges of both approaches

The -minimization method has strongest known guarantees of sparse recovery. Once the measurement matrix satisfies the Restricted Isometry Condition, this method works correctly for all sparse signals . No iterative methods have been known to feature such uniform guarantees, with the exception of Chaining Pursuit  and the HHS Algorithm  which however only work with specifically designed structured measurement matrices.

The Restricted Isometry Condition is a natural abstract deterministic property of a matrix. Although establishing this property is often nontrivial, this task is decoupled from the analysis of the recovery algorithm.

-minimization is based on linear programming, which has its advantages and disadvantages. One thinks of linear programming as a black box, and any development of fast solvers will reduce the running time of the sparse recovery method. On the other hand, it is not very clear what this running time is, as there is no strongly polynomial time algorithm in linear programming yet. All known solvers take time polynomial not only in the dimension of the program , but also on certain condition numbers of the program. While for some classes of random matrices the expected running time of linear programming solvers can be bounded (see the discussion in  and subsequent work in ), estimating condition numbers is hard for specific matrices. For example, there is no result yet showing that the Restricted Isometry Condition implies that the condition numbers of the corresponding linear program is polynomial in .

Orthogonal Matching Pursuit is quite fast, both theoretically and experimentally. It makes iterations, where each iteration amounts to a multiplication by a matrix (computing the observation vector ), and solving a least squares problem in dimensions at most (with matrix ). This yields strongly polynomial running time. In practice, OMP is observed to perform faster and is easier to implement than -minimization . For more details, see .

Orthogonal Matching Pursuit is quite transparent: at each iteration, it selects a new coordinate from the support of the signal in a very specific and natural way. In contrast, the known -minimization solvers, such as the simplex method and interior point methods, compute a path toward the solution. However, the geometry of is clear, whereas the analysis of greedy algorithms can be difficult simply because they are iterative.

On the other hand, Orthogonal Matching Pursuit has weaker guarantees of exact recovery. Unlike -minimization, the guarantees of OMP are non-uniform: for each fixed sparse signal and not for all signals, the algorithm performs correctly with high probability. Rauhut has shown that uniform guarantees for OMP are impossible for natural random measurement matrices .

Moreover, OMP’s condition on measurement matrices given in  is more restrictive than the Restricted Isometry Condition. In particular, it is not known whether OMP succeeds in the important class of partial Fourier measurement matrices.

These open problems about OMP, first stated in  and often reverberated in the Compressed Sensing community, motivated the present paper. We essentially settle them in positive by the following modification of Orthogonal Matching Pursuit.

### 1.4. Regularized OMP

This new algorithm for sparse recovery will perform correctly for all measurement matrices satisfying the Restricted Isometry Condition, and for all sparse signals.

When we are trying to recover the signal from its measurements , we can use the observation vector as a good local approximation to the signal . Namely, the observation vector encodes correlations of the measurement vector with the columns of . Note that is a dictionary, and so since the signal is sparse, has a sparse representation with respect to the dictionary. By the Restricted Isometry Condition, every columns form approximately an orthonormal system. Therefore, every coordinates of the observation vector look like correlations of the measurement vector with the orthonormal basis and therefore are close in the Euclidean norm to the corresponding coefficients of . This is documented in Proposition 3.2 below.

The local approximation property suggests to make use of the biggest coordinates of the observation vector , rather than one biggest coordinate as OMP did. We thus force the selected coordinates to be more regular (ie. closer to uniform) by selecting only the coordinates with comparable sizes. To this end, a new regularization step will be needed to ensure that each of these coordinates gets an even share of information. This leads to the following algorithm for sparse recovery:

Regularized Orthogonal Matching Pursuit (ROMP)

Input: Measurement vector and sparsity level Output: Index set Initialize: Let the index set and the residual . Repeat the following steps until : Identify: Choose a set of the biggest coordinates in magnitude of the observation vector , or all of its nonzero coordinates, whichever set is smaller. Regularize: Among all subsets with comparable coordinates: choose with the maximal energy . Update: Add the set to the index set: , and update the residual:

###### Remark.

The identification and regularization steps of ROMP can be performed efficiently. In particular, the regularization step does not imply combinatorial complexity, but actually can be done in linear time. The running time of ROMP is thus comparable to that of OMP in theory, and is often better than OMP in practice. We discuss the runtime in detail in Section 4.

The main theorem of this paper states that ROMP yields exact sparse recovery provided that the measurement matrix satisfies the Restricted Isometry Condition.

###### Theorem 1.3 (Exact sparse recovery via ROMP).

Assume a measurement matrix satisfies the Restricted Isometry Condition with parameters for . Let be an -sparse vector in with measurements . Then ROMP in at most iterations outputs a set such that

 supp(v)⊂Iand|I|≤2n.

This theorem is proved in Section 3.

Remarks.   1. Theorem 1.3 guarantees exact sparse recovery. Indeed, it is easy to compute the signal from its measurements and the set given by ROMP as , where denotes the measurement matrix restricted to columns indexed by .

2. Theorem 1.3 gives uniform guarantees of sparse recovery. Indeed, once the measurement matrix satisfies a deterministic condition (RIC), then our algorithm ROMP correctly recovers every sparse vector from its measurements. Uniform guarantees have been shown to be impossible for OMP , and it has been an open problem to find a version of OMP with uniform guarantees (see ). Theorem 1.3 says that ROMP essentially settles this problem.

3. The logarithmic factor in may be an artifact of the proof. At this moment, we do not know how to remove it.

4. Measurement matrices known to satisfy the Restricted Isometry Condition include random Gaussian, Bernoulli and partial Fourier matrices, with number of measurements almost linear in the sparsity , i.e. as in (1.1). Section 2 contains detailed information. It has been unknown whether OMP gives sparse recovery for partial Fourier measurements (even with non-uniform guarantees). ROMP gives sparse recovery for these measurements, and even with uniform guarantees.

The rest of the paper is organized as follows. In Section 2 we describe known classes of measurement matrices satisfying the Restricted Isometry Condition. In Section 3 we give the proof of Theorem 1.3. In Section 4 we discuss implementation, running time, and empirical performance of ROMP.

### Acknowledgment

We would like to thank the referees for a thorough reading of the manuscript and making useful suggestions which greatly improved the paper.

## 2. Measurement matrices satisfying the Restricted Isometry Condition

The only known measurement matrices known to satisfy the Restricted Isometry Condition with number of measurements as in (1.1) are certain classes of random matrices. The problem of deterministic constructions is still open. The known classes include: subgaussian random matrices (in particular, Gaussian and Bernoulli), and random partial bounded orthogonal matrices (in particular, partial Fourier matrices).

Throughout the paper, denote positive absolute constants unless otherwise specified.

A subgaussian random matrix is a matrix whose entries are i.i.d. subgaussian random variables with variance . A random variable is subgaussian if its tail distribution is dominated by that of the standard Gaussian random variable: there are constants such that for all . Examples of subgaussian random variables are: standard Gaussian, Bernoulli (uniform ), and any bounded random variables.

A partial bounded orthogonal matrix is formed by randomly uniformly chosen rows of an orthogonal matrix , whose entries are bounded by , for some constant . An example of is the discrete Fourier transform matrix. Taking measurements with a partial Fourier matrix thus amounts to observing random frequencies of the signal .

The following theorem documents known results on the Restricted Isometry Condition for these classes of random matrices.

###### Theorem 2.1 (Measurement matrices satisfying RIC).

Consider an measurement matrix , and let , , and .

1. If is a subgaussian matrix, then with probability the matrix satisfies the Restricted Isometry Condition with parameters provided that

 N≥Cnε2log(dε2n).

2. If is a partial bounded orthogonal matrix, then with probability the matrix satisfies the Restricted Isometry Condition with parameters provided that

 N≥C(nlogdε2)log(nlogdε2)log2d.

In both cases, the constant depends only on the confidence level and the constants from the definition of the corresponding classes of matrices.

Remarks.   1. The first part of this theorem is proved in . The second part is from ; a similar estimate with somewhat worse exponents in the logarithms was proved in . See these results for the exact dependence of on the confidence level (although usually would be chosen to be some small constant itself.)

2. In Theorem 1.3, we needed to use RIC for . An immediate consequence of Theorem 2.1 is that subgaussian matrices satisfy such RIC for the number of measurements

 N∼nlog2d

and partial bounded orthogonal matrices for

 N∼nlog5d.

These numbers of measurements guarantee exact sparse recovery using ROMP.

## 3. Proof of Theorem 1.3

We shall prove a stronger version of Theorem 1.3, which states that at every iteration of ROMP, at least of the newly selected coordinates are from the support of the signal .

###### Theorem 3.1 (Iteration Invariant of ROMP).

Assume satisfies the Restricted Isometry Condition with parameters for . Let be an -sparse vector with measurements . Then at any iteration of ROMP, after the regularization step, we have , and

 (3.1) |J0∩supp(v)|≥12|J0|.

In other words, at least of the coordinates in the newly selected set belong to the support of .

In particular, at every iteration ROMP finds at least one new coordinate in the support of the signal . Coordinates outside the support can also be found, but (3.1) guarantees that the number of such “false” coordinates is always smaller than those in the support. This clearly implies Theorem 1.3.

Before proving Theorem 3.1 we explain how the Restricted Isometry Condition will be used in our argument. RIC is necessarily a local principle, which concerns not the measurement matrix as a whole, but its submatrices of columns. All such submatrices , , are almost isometries. Therefore, for every -sparse signal , the observation vector approximates locally, when restricted to a set of cardinality . The following proposition formalizes these local properties of on which our argument is based.

###### Proposition 3.2 (Consequences of Restricted Isometry Condition).

Assume a measurement matrix satisfies the Restricted Isometry Condition with parameters . Then the following holds.

1. (Local approximation) For every -sparse vector and every set , , the observation vector satisfies

 ∥u|I−v|I∥2≤2.03ε∥v∥2.
2. (Spectral norm) For any vector and every set , , we have

 ∥(Φ∗z)|I∥2≤(1+ε)∥z∥2.
3. (Almost orthogonality of columns) Consider two disjoint sets , . Let denote the orthogonal projections in onto and , respectively. Then

 ∥PIPJ∥2→2≤2.2ε.
###### Proof.

Part 1. Let , so that . Let denote the identity operator on . By the Restricted Isometry Condition,

 ∥Φ∗ΓΦΓ−IdΓ∥2→2=supy∈RΓ,∥y∥2=1∣∣∥ΦΓy∥22−∥y∥22∣∣≤(1+ε)2−1≤2.03ε.

Since , we have

 ∥u|Γ−v|Γ∥2=∥Φ∗ΓΦΓv−IdΓv∥2≤2.03ε∥v∥2.

The conclusion of Part 1 follows since .

Part 2. Denote by the orthogonal projection in onto . Since , the Restricted Isometry Condition yields

 ∥QIΦ∗∥2→2=∥ΦQI∥2→2≤1+ε.

This yields the inequality in Part 2.

Part 3. The desired inequality is equivalent to:

 |⟨x,y⟩|∥x∥2∥y∥2≤2.2ε% for all x∈range(ΦI),y∈range(ΦJ).

Let so that . For any , there are so that

 x=ΦKa,y=ΦKb,a∈RI,b∈RJ.

By the Restricted Isometry Condition,

 ∥x∥2≥(1−ε)∥a∥2,∥y∥2≥(1−ε)∥b∥2.

By the proof of Part 2 above and since , we have

 |⟨x,y⟩|=|⟨(Φ∗KΦK−IdΓ)a,b⟩|≤2.03ε∥a∥2∥b∥2.

This yields

 |⟨x,y⟩|∥x∥2∥y∥2≤2.03ε(1−ε)2≤2.2ε,

which completes the proof. ∎

We are now ready to prove Theorem 3.1.

The proof is by induction on the iteration of ROMP. The induction claim is that for all previous iterations, the set of newly chosen indices is nonempty, disjoint from the set of previously chosen indices , and (3.1) holds.

Let be the set of previously chosen indices at the start of a given iteration. The induction claim easily implies that

 (3.2) |supp(v)∪I|≤2n.

Let , , be the sets found by ROMP in the current iteration. By the definition of the set , it is nonempty.

Let be the residual at the start of this iteration. We shall approximate by a vector in . That is, we want to approximately realize the residual as measurements of some signal which lives on the still unfound coordinates of the the support of . To that end, we consider the subspace

 H:=range(Φsupp(v)∪I)

and its complementary subspaces

 F:=range(ΦI),E0:=range(Φsupp(v)∖I).

The Restricted Isometry Condition in the form of Part 3 of Proposition 3.2 ensures that and are almost orthogonal. Thus is close to the orthogonal complement of in ,

 E:=F⊥∩H.

We will also consider the signal we seek to identify at the current iteration, its measurements, and its observation vector:

 (3.3) v0:=v|supp(v)∖I,x0:=Φv0∈E0,u0:=Φ∗x0.

Lemma 3.5 will show that for any small enough subset is small, and Lemma 3.8 will show that is not too small. First, we show that the residual has a simple description:

###### Lemma 3.3 (Residual).

Here and thereafter, let denote the orthogonal projection in onto a linear subspace . Then

 r=PEx.
###### Proof.

By definition of the residual in the algorithm, . Since , we conclude from the orthogonal decomposition that . Thus . ∎

To guarantee a correct identification of , we first state two approximation lemmas that reflect in two different ways the fact that subspaces and are close to each other. This will allow us to carry over information from to .

###### Lemma 3.4 (Approximation of the residual).

We have

 ∥x0−r∥2≤2.2ε∥x0∥2.
###### Proof.

By definition of , we have . Therefore, by Lemma 3.3, , and so

 x0−r=x0−PEx0=PFx0=PFPE0x0.

Now we use Part 3 of Proposition 3.2 for the sets and whose union has cardinality at most by (3.2). It follows that as desired. ∎

###### Lemma 3.5 (Approximation of the observation).

Consider the observation vectors and . Then for any set with , we have

 ∥(u0−u)|T∥2≤2.4ε∥v0∥2.
###### Proof.

Since , we have by Lemma 3.4 and the Restricted Isometry Condition that

 ∥x0−r∥2≤2.2ε∥Φv0∥2≤2.2ε(1+ε)∥v0∥2≤2.3ε∥v0∥2.

To complete the proof, it remains to apply Part 2 of Proposition 3.2, which yields . ∎

We next show that the energy (norm) of when restricted to , and furthermore to , is not too small. By the approximation lemmas, this will yield that ROMP selects at least a fixed percentage of energy of the still unidentified part of the signal. By the regularization step of ROMP, since all selected coefficients have comparable magnitudes, we will conclude that not only a portion of energy but also of the support is selected correctly. This will be the desired conclusion.

We have .

###### Proof.

Let = . Since , the maximality property of in the algorithm implies that

 ∥u0|J∥2≥∥u0|S∥2.

Furthermore, since , by Part 1 of Proposition 3.2 we have

 ∥u0|S∥2≥(1−2.03ε)∥v0∥2.

Putting these two inequalities together and using Lemma 3.5, we conclude that

 ∥u|J∥2≥(1−2.03ε)∥v0∥2−2.4ε∥v0∥2≥0.8∥v0∥2.

This proves the lemma. ∎

We next bound the norm of restricted to the smaller set . We do this by first noticing a general property of regularization:

###### Lemma 3.7 (Regularization).

Let be any vector in , . Then there exists a subset with comparable coordinates:

 (3.4) |y(i)|≤2|y(j)|for all i,j∈A,

and with big energy:

 (3.5) ∥y|A∥2≥12.5√logm∥y∥2.
###### Proof.

We will construct at most subsets with comparable coordinates as in (3.4), and such that at least one of these sets will have large energy as in (3.5).

Let , and consider a partition of using sets with comparable coordinates:

 Ak:={i:2−k∥y∥2<|yi|≤2−k+1∥y∥2},k=1,2,…

Let , so that for all , . Then the set contains most of the energy of :

 ∥y|Uc∥2≤(m(1m∥y∥2)2)1/2=1√m∥y∥2≤1√2∥y∥2.

Thus

 (∑k≤k0∥y|Ak∥22)1/2=∥y|U∥2=(∥y∥22−∥y|Uc∥22)1/2≥1√2∥y∥2.

Therefore there exists such that

 ∥y|Ak∥2≥1√2k0∥y∥2≥12.5√logm∥y∥2,

which completes the proof. ∎

In our context, Lemma 3.7 applied to the vector along with Lemma 3.6 directly implies:

###### Lemma 3.8 (Regularizing the energy).

We have

 ∥u|J0∥2≥0.32√logn∥v0∥2.

We now finish the proof of Theorem 3.1.

To show the first claim, that is nonempty, we note that . Indeed, otherwise by (3.3) we have , so by the definition of the residual in the algorithm, we would have at the start of the current iteration, which is a contradiction. Then by Lemma 3.8.

The second claim, that , is also simple. Indeed, recall that by the definition of the algorithm, . It follows that the observation vector satisfies . Since by its definition the set contains only nonzero coordinates of we have . Since , the second claim follows.

The nontrivial part of the theorem is its last claim, inequality (3.1). Suppose it fails. Namely, suppose that , and thus

 |J0∖supp(v)|>12|J0|.

Set . By the comparability property of the coordinates in and since , there is a fraction of energy in :

 (3.6) ∥u|Λ∥2>1√5∥u|J0∥2≥17√logn∥v0∥2,

where the last inequality holds by Lemma 3.8.

On the other hand, we can approximate by as

 (3.7) ∥u|Λ∥2≤∥u|Λ−u0|Λ∥2+∥u0|Λ∥2.

Since and using Lemma 3.5, we have

 ∥u|Λ−u0|Λ∥2≤2.4ε∥v0∥2

Furthermore, by definition (3.3) of , we have . So, by Part 1 of Proposition 3.2,

 ∥u0|Λ∥2≤2.03ε∥v0∥2.

Using the last two inequalities in (3.7), we conclude that

 ∥u|Λ∥2≤4.43ε∥v0∥2.

This is a contradiction to (3.6) so long as . This proves Theorem 3.1. ∎

## 4. Implementation and empirical performance of ROMP

### 4.1. Running time

The Identification step of ROMP, i.e. selection of the subset , can be done by sorting the coordinates of in the nonincreasing order and selecting biggest. Many sorting algorithms such as Mergesort or Heapsort provide running times of .

The Regularization step of ROMP, i.e. selecting , can be done fast by observing that is an interval in the decreasing rearrangement of coefficients. Moreover, the analysis of the algorithm shows that instead of searching over all intervals , it suffices to look for among consecutive intervals with endpoints where the magnitude of coefficients decreases by a factor of . (these are the sets in the proof of Lemma 3.7). Therefore, the Regularization step can be done in time .

In addition to these costs, the -th iteration step of ROMP involves multiplication of the matrix by a vector, and solving the least squares problem with the matrix , where . For unstructured matrices, these tasks can be done in time and respectively. Since the submatrix of when restricted to the index set is near an isometry, using an iterative method such as the Conjugate Gradient Method allows us to solve the least squares method in a constant number of iterations (up to a specific accuracy.) Using such a method then reduces the time of solving the least squares problem to just . Thus in the cases where ROMP terminates after a fixed number of iterations, the total time to solve all required least squares problems would be just . For structured matrices, such as partial Fourier, these times can be improved even more using fast multiply techniques.

In other cases, however, ROMP may need more than a constant number of iterations before terminating, say the full iterations. In this case, it may be more efficient to maintain the QR factorization of and use the Modified Gram-Schmidt algorithm. With this method, solving all the least squares problems takes total time just However, storing the QR factorization is quite costly, so in situations where storage is limited it may be best to use the iterative methods mentioned above.

ROMP terminates in at most iterations. Therefore, for unstructured matrices using the methods mentioned above and in the interesting regime , the total running time of ROMP is O(dNn). This is the same bound as for OMP .

### 4.2. Non-sparse signals

In many applications, one needs to recover a signal which is not sparse but close to being sparse in some way. Such are, for example, compressible signals, whose coefficients decay at a certain rate (see , ). To make ROMP work for such signals, one can replace the stopping criterion of exact recovery by “repeat times or until , whichever occurs first”. Note that we could amend the algorithm for sparse signals in this way as well, allowing for a specific level of accuracy to be attained before terminating.

We recently proved that ROMP is stable and guarantees approximate recovery of non-sparse signals with noisy measurements; this will be discussed in a forthcoming paper.

### 4.3. Experiments

This section describes our experiments that illustrate the signal recovery power of ROMP. We experimentally examine how many measurements are necessary to recover various kinds of -sparse signals in using ROMP. We also demonstrate that the number of iterations ROMP needs to recover a sparse signal is in practice at most linear the sparsity.

First we describe the setup of our experiments. For many values of the ambient dimension , the number of measurements , and the sparsity , we reconstruct random signals using ROMP. For each set of values, we generate an Gaussian measurement matrix and then perform independent trials. The results we obtained using Bernoulli measurement matrices were very similar. In a given trial, we generate an -sparse signal in one of two ways. In either case, we first select the support of the signal by choosing components uniformly at random (independent from the measurement matrix ). In the cases where we wish to generate flat signals, we then set these components to one.222Our work as well as the analysis of Gilbert and Tropp  show that this is a challenging case for ROMP (and OMP). In the cases where we wish to generate sparse compressible signals, we set the component of the support to plus or minus for a specified value of . We then execute ROMP with the measurement vector .

Figure 1 depicts the percentage (from the trials) of sparse flat signals that were reconstructed exactly. This plot was generated with for various levels of sparsity . The horizontal axis represents the number of measurements , and the vertical axis represents the exact recovery percentage. We also performed this same test for sparse compressible signals and found the results very similar to those in Figure 1. Our results show that performance of ROMP is very similar to that of OMP which can be found in .

Figure 2 depicts a plot of the values for and at which of sparse flat signals are recovered exactly. This plot was generated with . The horizontal axis represents the number of measurements , and the vertical axis the sparsity level .

Theorem 1.3 guarantees that ROMP runs with at most iterations. Figure 3 depicts the number of iterations executed by ROMP for and . ROMP was executed under the same setting as described above for sparse flat signals as well as sparse compressible signals for various values of , and the number of iterations in each scenario was averaged over the trials. These averages were plotted against the sparsity of the signal. As the plot illustrates, only iterations were needed for flat signals even for sparsity as high as . The plot also demonstrates that the number of iterations needed for sparse compressible is higher than the number needed for sparse flat signals, as one would expect. The plot suggests that for smaller values of (meaning signals that decay more rapidly) ROMP needs more iterations. However it shows that even in the case of , only iterations are needed even for sparsity as high as . Figure 1. The percentage of sparse flat signals exactly recovered by ROMP as a function of the number of measurements N in dimension d=256 for various levels of sparsity n. Figure 2. The 99% recovery limit as a function of the sparsity n and the number of measurements N for sparse flat signals. Figure 3. The number of iterations executed by ROMP as a function of the sparsity n in dimension d=10,000 with N=200.

## References

•  E. Candès, Compressive sampling, Proc. International Congress of Mathematics, 3, pp. 1433-1452, Madrid, Spain (2006).
•  A. Cohen, W. Dahmen, R. DeVore, Compressed sensing and k-term approximation, Manuscript (2007).
•  E. Candès, J. Romberg, T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inform. Theory 52 (2006), 489-509.
•  E. Candès, T. Tao, Near-optimal signal recovery from random projections: universal encoding strategies, IEEE Trans. Inform. Theory 52 (2004), 5406–5425.
•  E. J. Candès, T. Tao, Decoding by linear programming, IEEE Trans. Inform. Theory 51 (2005), 4203–4215.
•  Compressed Sensing webpage, http://www.dsp.ece.rice.edu/cs/
•  D. Donoho, Compressed sensing, IEEE Trans. Inform. Theory 52 (2006), 1289–1306.
•  D. Donoho, M. Elad, V. Temlyakov, Stable recovery of sparse overcomplete representations in the presence of noise, IEEE Trans. Inform. Theory 52 (2006), 6-18.
•  D. Donoho, M. Elad, V. Temlyakov, On the Lebesgue type inequalities for greedy approximation, J. Approximation Theory 147 (2007), 185-195.
•  D. Donoho, Ph. Stark, Uncertainty principles and signal recovery, SIAM J. Appl. Math. 49 (1989), 906–931.
•  D. Donoho, J. Tanner, Counting faces of randomly-projected polytopes when the projection radically lowers dimension, submitted.
•  D. Donoho, J. Tanner, Thresholds for the recovery of sparse solutions via ell-1 minimization, Conf. on Information Sciences and Systems (2006).
•  A. Gilbert, S. Muthukrishnan, M. Strauss, Approximation of functions over redundant dictionaries using coherence, The 14th Annual ACM-SIAM Symposium on Discrete Algorithms (2003).
•  A. Gilbert, M. Strauss, J. Tropp, R. Vershynin, Algorithmic linear dimension reduction in the norm for sparse vectors, submitted. Conference version in: Algorithmic Linear Dimension Reduction in the L1 Norm for Sparse Vectors, Allerton 2006 (44th Annual Allerton Conference on Communication, Control, and Computing).
•  A. Gilbert, M. Strauss, J. Tropp, R. Vershynin, One sketch for all: fast algorithms for compressed sensing, STOC 2007 (39th ACM Symposium on Theory of Computing), to appear
•  Yu. Lyubarskii, R. Vershynin, Uncertainty principles and vector quantization, submitted.
•  S. Mendelson, A. Pajor, N. Tomczak-Jaegermann, Uniform uncertainty principle for Bernoulli and subgaussian ensembles, submitted.
•  H. Rauhut, On the impossibility of uniform recovery using greedy methods, in preparation
•  M. Rudelson, R.Vershynin, On sparse reconstruction from Fourier and Gaussian measurements, Communications on Pure and Applied Mathematics, to appear. Conference version in: CISS 2006 (40th Annual Conference on Information Sciences and Systems).
•  D. Spielman, S.-H. Teng, Smoothed analysis of algorithms: why the simplex algorithm usually takes polynomial time, J. ACM 51 (2004), 385–463.
•  J. A. Tropp, A. C. Gilbert, Signal recovery from random measurements via orthogonal matching pursuit, IEEE Trans. Info. Theory. To appear, 2007.
•  V. Temlyakov, Nonlinear methods of approximation, Found. Comput. Math. 3 (2003), 33-107.
•  R. Vershynin, Beyond Hirsch Conjecture: walks on random polytopes and smoothed complexity of the simplex method, submitted. Conference version in: FOCS 2006 (47th Annual Symposium on Foundations of Computer Science), 133–142.
You are adding the first comment!
How to quickly get a good reply:
• Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
• Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
• Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters   