Noisy Signal Recovery via Iterative Reweighted L1-Minimization

# Noisy Signal Recovery via Iterative Reweighted L1-Minimization

Deanna Needell DN is with the Dept. of Mathematics, University of California, Davis, One Sheilds Ave., Davis CA 95616, USA e-mail: dneedell@math.ucdavis.edu.
April 2009
###### Abstract

Compressed sensing has shown that it is possible to reconstruct sparse high dimensional signals from few linear measurements. In many cases, the solution can be obtained by solving an -minimization problem, and this method is accurate even in the presence of noise. Recent a modified version of this method, reweighted -minimization, has been suggested. Although no provable results have yet been attained, empirical studies have suggested the reweighted version outperforms the standard method. Here we analyze the reweighted -minimization method in the noisy case, and provide provable results showing an improvement in the error bound over the standard bounds.

## I Introduction

Compressed sensing refers to the problem of realizing a sparse input using few linear measurements that possess some incoherence properties. Its applications range from error correction to image processing. Since the measurements are linear, the problem can be formulated as the recovery of a signal from its measurements where is a measurement matrix. In the interesting case where , it is clearly impossible to reconstruct any arbitrary signal, so we must restrict the domain to which the signals belong. To that end, we consider sparse signals, those with few non-zero coordinates relative to the actual dimension. In particular, for , we say that a signal is -sparse if has or fewer non-zero coordinates:

 ∥x∥0def=|{i:xi≠0}|≤s.

It is now well known that many signals such as real-world audio and video images or biological measurements are sparse either in this sense or with respect to a different basis.

Much work in the field of compressed sensing has led to promising reconstruction algorithms for these kinds of sparse signals. One solution to the recovery problem is simply to select the signal whose measurements are equal to those of , with the smallest sparsity. That is, one could solve the optimization problem

 min^x∈Rd∥^x∥0 subject to Φ^x=Φx. (L0)

This straightforward approach is quite accurate, and if the columns of the matrix are in general position, it can recover signals that are up to -sparse. The crucial drawback to the problem is of course that it is highly nonconvex and requires a search through the exponentially many column sets. This clearly makes it of little use in practice. A natural alternative then is to relax the problem , and use a convex problem instead. One can then consider instead the -minimization problem

 min^x∈Rd∥^x∥1 subject to Φ^x=Φx. (L1)

Here and throughout, denotes the standard -norm: .

Problem is convex, and can actually be reformulated as a linear program. Due to recent work linear programming and smoothed analysis, it is now well known that it can be solved efficiently in practice [1, 11]. Notice that the solution to is the contact point where the smallest -ball meets the subspace . The geometry of the octahedron lends itself well to sparsity due to its wedges at the lower dimensional subspaces.

Indeed, Candès and Tao prove that when the measurement matrix satisfies a certain quantitative property, the solution to the problem will be the original sparse signal ([3], see also [10]). This restricted isometry condition guarantees that every submatrix of approximately preserves norm:

###### Definition I.1

The measurement matrix satisfies the restricted isometry condition with parameters if

 (1−δ)∥x∥22≤∥Φx∥22≤(1+δ)∥x∥22,

holds for all -sparse vectors . Here and throughout, denotes the usual Euclidean norm: .

It has been shown that many random measurement matrices satisfy the restricted isometry condition with small and nearly linear in the sparsity. In  [7] it is shown that measurement matrices whose entries are subgaussian satisfy the restricted isometry condition with parameters with high probability when

 m=O(sδ2logdδ2s).

Note that this implies in particular that matrices whose entries are (normalized) random Gaussian or Bernoulli satisfy the restricted isometry condition with this number of measurements. An alternative type of measurement matrix is a partial bounded orthogonal matrix. One such example is obtained by selecting rows uniformly at random from the discrete Fourier matrix. Rudelson and Vershynin show in [9] that these matrices satisfy the restricted isometry condition with parameters with

 m=O((slogdϵ2)log(slogdϵ2)log2d).

Note that in both cases we need only measurements.

Candès and Tao showed that if the measurement matrix satisfies the restricted isometry condition with parameters , that the -sparse signal is the unique solution to the problem . In [5], Candès sharpened these results to show success with parameters just .

As is evident, these results provide strong guarantees for the -minimization problem on sparse signals. In practice, however, signals are rarely exactly sparse, and may often be corrupted by noise. The problem then becomes to reconstruct an approximately sparse signal from its noisy measurements , where is an error vector. In this case, problem will clearly not suffice for recovery, since with noise the original signal may not even satisfy the constraint requirements. However, the problem can simply be adapted to account for noise error:

 min^x∈Rd∥^x∥1 subject to ∥Φ^x−u∥2≤ε, (L′1)

where is a noise parameter with . Candès, Romberg, and Tao showed that the solution to the problem is close in Euclidean norm to the original signal  [2]. Candès improved these results to provide the following.

###### Theorem I.2 (ℓ1-minimization from [5])

Assume has . Let be an arbitrary signal with noisy measurements , where . Then the approximation to from -minimization satisfies

 ∥x−^x∥2≤Cε+C′∥x−xs∥1√s,

where , , and .

The error bound provided here is optimal up to the constants, as the error can be viewed as the unrecoverable energy due to the inherent noise. See [8] for a detailed discussion of the unrecoverable energy. Note also that in the case where is exactly sparse, but the measurements are noisy, the error bound is proportional to the norm of the error vector . Although these results provide very strong guarantees, recent work has been done on a variant of the -minimization problem that seems to outperform the standard method.

## Ii Reweighted ℓ1-minimization

As discussed above, the -minimization problem is equivalent to the nonconvex problem when the measurement matrix satisfies a certain condition. The key difference between the two problems of course, is that depends on the magnitudes of the coefficients of a signal, whereas does not. To reconcile this imbalance, a new weighted -minimization algorithm was proposed by Candès, Wakin, and Boyd [4]. This algorithm solves the following weighted version of at each iteration:

 min^x∈Rdd∑i=1δi^xi % subject to Φx=Φ^x. (WL1)

It is clear that in this formulation, large weights will encourage small coordinates of the solution vector, and small weights will encourage larger coordinates. Indeed, suppose the -sparse signal was known exactly, and that the weights were set as . Notice that in this case, the weights are infinite at all locations outside of the support of . This will force the coordinates of the solution vector at these locations to be zero. Thus if the signal is -sparse with , these weights would guarantee that . Of course, these weights could not be chosen without knowing the actual signal itself. However even if the weights are close to the actual signal, the geometry of the weighted -ball becomes “pinched” toward the signal, decreasing the liklihood of an inaccurate solution.

Although the weights might not initially induce this geometry, one hopes that by solving the problem at each iteration, the weights will get closer to the optimal values , thereby improving the reconstruction of . Of course, one cannot actually have an infinite weight, so a stability parameter must also be used in the selection of the weight values. The reweighted -minimization algorithm can thus be described precisely as follows.

Reweighted -minimization

Input: Measurement vector , stability parameter , noise parameter Output: Reconstructed vector Initialize: Set the weights for . Repeat the following until convergence or a fixed number of times: Approximate: Solve the reweighted -minimization problem: Update the weights:

In [4], the reweighted -minimization algorithm is discussed thoroughly, and experimental results are provided to show that it often outperforms the standard method. However, no provable guarantees have yet been made for the algorithm’s success. Here we analyze the algorithm when the measurements and signals are corrupted with noise. Since the reweighted method needs a weight vector that is somewhat close to the actual signal , it is natural to consider the noisy case since the standard -minimization method itself produces such a vector. We are able to prove an error bound in this noisy case that improves upon the best known bound for the standard method. We also provide numerical studies that show the bounds are improved in practice as well.

## Iii Main Results

The main theorem of this paper guarantees an error bound for the reconstruction using reweighted -minimization that improves upon the best known bound of Theorem I.2 for the standard method. For initial simplicity, we consider the case where the signal is exactly sparse, but the measurements are corrupted with noise. Our main theorem, Theorem III.1 will imply results for the case where the signal is arbitrary.

###### Theorem III.1 (Reweighted ℓ1, Sparse Case)

Assume satisfies the restricted isometry condition with parameters where . Let be an -sparse vector with noisy measurements where . Assume the smallest nonzero coordinate of satisfies . Then the limiting approximation from reweighted -minimization satisfies

 ∥x−^x∥2≤C′′ε,

where , and .

Remarks.

1. We actually show that the reconstruction error satisfies

 ∥x−^x∥2≤2αε1+√1−4αεμ−4αερμ. (III.1)

This bound is stronger than that given in Theorem III.1, which is only equal to this bound when nears the value . However, the form in Theorem III.1 is much simpler and clearly shows the role of the parameter by the use of .

2. For signals whose smallest non-zero coefficient does not satisfy the condition of the theorem, we may apply the theorem to those coefficients that do satisfy this requirement, and treat the others as noise. See Theorem III.2 below.

3. Although the bound in the theorem is the limiting bound, we provide a recursive relation (III.8) in the proof which provides an exact error bound per iteration. In Section IV we use dynamic programming to show that in many cases only a very small number of iterations are actually required to obtain the above error bound.

We now discuss the differences between Theorem I.2 and our new result Theorem III.1. In the case where nears its limit of , the constant increases to , and so the constant in Theorem I.2 is unbounded. However, the constant in Theorem III.1 remains bounded even in this case. In fact, as approaches , the constant approaches just . The tradeoff of course, is in the requirement on . As gets closer to , the bound needed on requires the signal to have unbounded non-zero coordinates relative to the noise level . However, to use this theorem efficiently, one would select the largest that allows the requirement on to be satisfied, and then apply the theorem for this value of . Using this strategy, when the ratio , for example, the error bound is just .

Theorem III.1 and a short calculation will imply the following result for arbitrary signals .

###### Theorem III.2 (Reweighted ℓ1)

Assume satisfies the restricted isometry condition with parameters . Let be an arbitrary vector with noisy measurements where . Assume the smallest nonzero coordinate of satisfies where . Then the limiting approximation from reweighted -minimization satisfies

 ∥x−^x∥2≤4.1α1+ρ(∥x−xs/2∥1√s+ε),

and

 ∥x−^x∥2≤2.4α1+ρ(∥x−xs∥2+∥x−xs∥1√s+ε),

where and are as in Theorem III.1.

Again in the case where nears its bound of , both constants and in Theorem I.2 approach infinity. However, in Theorem III.2, the constant remains bounded even in this case. The same strategy discussed above for Theorem III.1 should also be used for Theorem III.2. Next we begin proving Theorem III.1 and Theorem III.2.

### Iii-a Proofs

We will first utilize a lemma that bounds the norm of a small portion of the difference vector by the -norm of its remainder. This lemma is proved in [5] and essentially in [2] as part of the proofs of the main theorems of those papers.

###### Lemma III.3

Set , and let , , and be as in Theorem III.1. Let be the set of largest coefficients in magnitude of and be the largest coefficients of . Then

 ∥hT0∪T1∥2≤αε+ρ√s∥hTc0∥1, (III.2)

and

 ∥h(T0∪T1)c∥2≤1√s∥hTc0∥1. (III.3)

We will next require two lemmas that give results about a single iteration of reweighted -minimization.

###### Lemma III.4 (Single reweighted ℓ1-minimization)

Assume satisfies the restricted isometry condition with parameters . Let be an arbitrary vector with noisy measurements where . Let be a vector such that for some constant . Denote by the vector consisting of the (where ) largest coefficients of in absolute value. Let be the smallest coordinate of in absolute value, and set . Then when and , the approximation from reweighted -minimization using weights satisfies

 ∥x−^x∥2≤D1ε+D2∥x−xs∥1a,

where , , , , and and are as in Theorem III.1.

{proof}

Now we begin the proof of Lemma III.4.

Set and for as in Lemma III.3. For simplicity, denote by the weighted -norm:

 ∥z∥wdef=d∑i=11|wi|+azi.

Since is the minimizer of (), we have

 ∥x∥w ≥∥x+h∥w=∥(x+h)T0∥w+∥(x+h)Tc0∥w ≥∥xT0∥w−∥hT0∥w+∥hTc0∥w−∥xTc0∥w.

This yields

 ∥hTc0∥w≤∥hT0∥w+2∥xTc0∥w.

Next we relate the reweighted norm to the usual -norm. We first have

 ∥hTc0∥w≥∥hTc0∥1A+a+b,

by definition of the reweighted norm as well as the values of , , and . Similarly we have

 ∥hT0∥w≤∥hT0∥1μ−A+a.

Combining the above three inequalities, we have

 ∥hTc0∥1 ≤(A+a+b)∥hTc0∥w (III.4) ≤(A+a+b)(∥hT0∥w+2∥xTc0∥w) ≤A+a+bμ−A+a∥hT0∥1+2(A+a+b)∥xTc0∥w. (III.5)

Using (III.3) and (III.4) along with the fact , we have

 ∥h(T0∪T1)c∥2≤C1∥hT0∥2+C2∥xTc0∥w, (III.6)

where and . By (III.2) of Lemma III.3, we have

 ∥hT0∪T1∥2≤αε+ρ√s∥hTc0∥1,

where and . Thus by (III.4), we have

 ∥hT0∪T1∥2 ≤αε+ρ√s(C1∥hT0∥1+2(A+a+b)∥xTc0∥w) =αε+ρC1∥hT0∪T1∥2+ρC2∥xTc0∥w.

Therefore,

 ∥hT0∪T1∥2≤(1−ρC1)−1(αε+ρC2∥xTc0∥w). (III.7)

Finally by (III.6) and (III.7),

 ∥h∥2 ≤∥hT0∪T1∥2+∥h(T0∪T1)c∥2 ≤(1+C1)∥hT0∪T1∥2+C2∥xTc0∥w ≤(1+C1)(αε+ρC2∥xTc0∥w1−ρC1)+C2∥xTc0∥w.

Applying the inequality and simplifying completes the claim.

Applying Lemma III.4 to the case where and yields the following.

###### Lemma III.5 (Single reweighted ℓ1-minimization, Sparse Case)

Assume satisfies the restricted isometry condition with parameters . Let be an -sparse vector with noisy measurements where . Let be a vector such that for some constant . Let be the smallest non-zero coordinate of in absolute value. Then when , the approximation from reweighted -minimization using weights satisfies

 ∥x−^x∥2≤D1ε.

Here , , and and are as in Theorem III.1.

Now we begin the proof of Theorem III.1. {proof} The proof proceeds as follows. First, we use the error bound in Theorem I.2 as the initial error, and then apply Lemma III.5 repeatedly. We show that the error decreases at each iteration, and then deduce its limiting bound using the recursive relation. To this end, let for , be the error bound on where is the reconstructed signal at the iteration. Then by Theorem I.2 and Lemma III.5, we have the recursive definition

 E(1)=2α1−ρε,E(k+1)=(1+E(k)μ−E(k))α1−ρE(k)μ−E(k)ε. (III.8)

Here we have taken iteratively (or if remains fixed, a small constant will be added to the error). First, we show that the base case holds, . Since , we have that

 E(1)μ−E(1)=2αε1−ρμ−2αε1−ρ≤1.

Therefore we have

 E(2)=(1+E(1)μ−E(1))α1−ρE(1)μ−E(1)ε≤2α1−ρε=E(1).

Next we show the inductive step, that assuming the inequality holds for all previous . Indeed, if , then we have

 E(k+1)=(1+E(k)μ−E(k))α1−ρE(k)μ−E(k)ε≤(1+E(k−1)μ−E(k−1))α1−ρE(k−1)μ−E(k−1)ε=E(k).

Since and we have that and , so is also bounded below by zero. Thus E(k) is a bounded decreasing sequence, so it must converge. Call its limit . By the recursive definition of , we must have

 L=(1+Lμ−L)α1−ρLμ−Lε.

Solving this equation yields

 L=μ−√μ2−4μαε−4μαερ2(1+ρ),

where we choose the solution with the minus since is decreasing and (note also that when ). Multiplying by the conjugate and simplifying yields

 L =4μαε+4μαερ2(1+ρ)(μ+√μ2−4μαε−4μαερ) =2αε1+√1−4αεμ−4αερμ.

Then again since , we have

 L≤2αε1+ρ.
{proof}

Now we begin the proof of Theorem III.2. By Lemma 6.1 of [8] and Lemma 7 of [6], we can rewrite as where

 ∥˜e∥2 ≤1.2(∥x−xs∥2+1√s∥x−xs∥1)+∥e∥2 ≤2.04(∥x−xs/2∥1√s)+∥e∥2.

This combined with Theorem III.1 completes the claim.

## Iv Numerical Results and Convergence

Our main theorems prove bounds on the reconstruction error limit. However, as is the case with many recursive relations, convergence to this threshold is often quite fast. To show this, we use dynamic programming to compute the theoretical error bound given by (III.8) and test its convergence rate to the threshold given by eqrefactualbnd. Since the ratio between and is important, we fix and test the convergence for various values of and . The results are displayed in Figure 1. We observe that in each case, as increases we require slightly more iterations. This is not surprising since higher means a lower bound. We also confirm that less iterations are required when the ratio is smaller.

Next we examine some numerical experiments conducted to test the actual error with reweighted -minimization versus the standard method. In these experiments we consider signals of dimension with non-zero entries. We use a measurement matrix consisting of Gaussian entries. We note that we found similar results when the measurement matrix consisted of symmetric Bernoulli entries. For each trial in our experiments we construct an -sparse signal with support chosen uniformly at random and entries from either the Gaussian distribution or the symmetric Bernoulli distribution, all independent of the matrix . We then construct the normalized Gaussian noise vector , and run the reweighted -algorithm using such that where is the variance of the normalized error vectors. This value is likely to provide a good upper bound on the noise norm (see e.g. [2], [4]). We set in the iteration. We run 500 trials for each parameter selection and signal type. We found similar results for non-sparse signals, which is not surprising since we can treat the signal error as measurement error after applying the measurement matrix (see the proof of Theorem III.2). Figures 2 and 3 display the results of the experiments and demonstrate large improvements in the error of the reweighted reconstruction compared to the reconstruction from the standard method.

## References

• [1] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
• [2] E. Candès, J. Romberg, and T. Tao. Stable signal recovery from incomplete and inaccurate measurements. Comm. Pure Appl. Math, 59(8):1207–1223, 2006.
• [3] E. Candès and T. Tao. Decoding by linear programming. IEEE Trans. Inform. Theory, 51:4203–4215, 2005.
• [4] E. Candès, M. Wakin, and S. Boyd. Enhancing sparsity by reweighted ell-1 minimization. J. Fourier Anal. Appl., 14(5):877–905, Dec. 2008.
• [5] E. J. Candès. The restricted isometry property and its implications for compressed sensing. Technical report, California Institute of Technology, 2008.
• [6] A. Gilbert, M. Strauss, J. Tropp, and R. Vershynin. One sketch for all: Fast algorithms for compressed sensing. In Proc. 39th ACM Symp. Theory of Computing, San Diego, June 2007.
• [7] S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann. Uniform uncertainty principle for Bernoulli and subgaussian ensembles. To appear, Constr. Approx., 2009.
• [8] D. Needell and J. A. Tropp. CoSaMP: Iterative signal recovery from noisy samples. Appl. Comput. Harmon. Anal., 2008. DOI: 10.1016/j.acha.2008.07.002.
• [9] M. Rudelson and R. Vershynin. On sparse reconstruction from Fourier and Gaussian measurements. Comm. Pure Appl. Math., 61:1025–1045, 2008.
• [10] M. Rudelson and R. Veshynin. Geometric approach to error correcting codes and reconstruction of signals. Int. Math. Res. Not., 64:4019–4041, 2005.
• [11] R. Vershynin. Beyond hirsch conjecture: walks on random polytopes and smoothed complexity of the simplex method. SIAM J. Comput., 2006. To appear.
You are adding the first comment!
How to quickly get a good reply:
• Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
• Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
• Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters