Sparse recovery based on q-ratio constrained minimal singular values

# Sparse recovery based on q-ratio constrained minimal singular values

Zhiyong Zhou and Jun Yu The authors are with the Department of Mathematics and Mathematical Statistics, Umeå University, Umeå, 901 87, Sweden (e-mail: zhiyong.zhou@umu.se, jun.yu@umu.se).
###### Abstract

We study verifiable sufficient conditions and computable performance bounds for sparse recovery algorithms such as the Basis Pursuit, the Dantzig selector and the Lasso estimator, in terms of a newly defined family of quality measures for the measurement matrices. With high probability, the developed measures for subgaussian random matrices are bounded away from zero as long as the number of measurements is reasonably large. Comparing to the restricted isotropic constant based performance analysis, the arguments in this paper are much more concise and the obtained bounds are tighter. Numerical experiments are presented to illustrate our theoretical results.

Compressive sensing; -ratio sparsity; -ratio constrained minimal singular values; Convex-concave procedure.

## I Introduction

Sparse signal recovery, particularly compressive sensing [1, 2, 3, 4], aims to reconstruct a sparse signal from noisy underdetermined linear measurements:

 y=Ax+w, (1)

where is the true sparse or compressible signal, is the measurement vector with , is the measurement matrix, and is the noise vector. If the measurement matrix satisfies the stable or robust null space property (NSP) [5] or restricted isometry property (RIP) [1, 6], stable and robust recovery can be guaranteed. Although probabilistic results conclude that the NSP and RIP are fulfilled for some specific random matrices with high probability [7, 8, 9], it’s computationally hard to verify NSP and compute restricted isometry constant (RIC) for a given measurement matrix [10, 11]. Several relaxation techniques are used to obtain an approximate solution, for instance semi-definite programming [12, 13] and linear programming [14]. Recently, [15] and [16] defined new kinds of computable quality measures of the measurement matrices. Specifically, [15] developed -constrained minimal singular values (CMSV) and obtained the error bounds in terms of this quality measure of the measurement matrix. Similarly, in [16], the authors defined another quantity with denoting a general norm, and derived the performance bounds on the norm of the recovery error vector based on this quality measure. This kind of measures has also been used in establishing results for block sparsity recovery [17] and low-rank matrix recovery [18]. In this paper we generalize these two quantities to a more general quantity called -ratio CMSV with , and establish the performance bounds for both norm and norm of the reconstruction error.

### I-a Contributions

Our contribution mainly has four aspects. First, we proposed a sufficient condition based on a -ratio sparsity level for the exact recovery using minimization in the noise free case, and designed a convex-concave procedure to solve the corresponding non-convex problem, leading to an acceptable verification algorithm. Second, we introduced -ratio CMSV and derived concise bounds on both norm and norm of the reconstruction error for the Basis Pursuit (BP) [19], the Dantzig selector (DS) [20], and the Lasso estimator [21] in terms of -ratio CMSV. We established the corresponding stable and robust recovery results involving both sparsity defect and measurement error. Third, we demonstrated that for subgaussion random matrices, the -ratio CMSVs are bounded away from zero with high probability, as long as the number of measurement is large enough. Finally, we presented algorithms to compute the -ratio CMSV for an arbitrary measurement matrix, and studied the effects of different parameters on the proposed -ratio CMSV. Moreover, we illustrated that -ratio CMSV based bound is tighter than the RIC based one.

### I-B Organization and Notations

The paper is organized as follows. In Section II, we present the definitions of -ratio sparsity and -ratio CMSV, and give a sufficient condition for unique noiseless recovery based on the -ratio sparsity and an inequality for the -ratio CMSV. In Section III, we derive performance bounds on both norm and norm of the reconstruction errors for several convex recovery algorithms in terms of -ratio CMSVs. In Section IV, we demonstrate that the subgaussian random matrices have non-degenerate -ratio CMSVs with high probability as long as the number of measurements is relatively large. In Section V, we design algorithms to verify the sufficient condition for unique recovery in noise free case and compute the -ratio CMSV. Section VI contains the conclusion. Finally, the proofs are postponed to the Appendix.

Throughout the paper, we denote vectors by lower case letters and matrices by upper case letters. Vectors are columns by default. denotes the transpose of the vector and denotes the -th entry of . For any vector , we denote the norm , the norm and the norm for . We say a signal is -sparse if . denotes the set and denotes the cardinality of a set . Furthermore, we write for the complement of a set in . . For a vector and a set , we denote by the vector coincides with on the indices in and is extended to zero outside . For any matrix , , is the transpose and is the common trace function. is the inner product function. denotes the floor function.

## Ii q-ratio Sparsity and q-ratio CMSV

In this section, we present the definitions of -ratio sparsity and -ratio CMSV, and give their basic properties. A sufficient condition is established for unique sparse recovery via noise free BP. We start with a stable sparsity measure, which is called -ratio sparsity level here.

###### Definition 1

([22, 23]) For any non-zero and non-negative , the -ratio sparsity level of is defined as

 sq(z)=(∥z∥1∥z∥q)qq−1. (2)

The cases of are evaluated as limits:

 s0(z) =limq→0sq(z)=∥z∥0 (3) s1(z) =limq→1sq(z)=exp(H1(π(z))) (4) s∞(z) =limq→0sq(z)=∥z∥1∥z∥∞. (5)

Here with entries and is the ordinary Shannon entropy.

This kind of sparsity measure was proposed in [22, 23], where estimation and statistical inference via -stable random projection method were studied. Its extension to block sparsity was developed in [24]. In fact, this kind of sparsity measure is entropy-based, which counts effective coordinates of by counting effective states of via entropy. Formally, we have

 sq(z)={exp(Hq(π(z)))if z≠00if z=0, (6)

where is the Rényi entropy of order [25, 26]. When , the Rényi entropy is given by , and the cases of are defined by evaluating limits, with being the ordinary Shannon entropy. The sparsity measure has the following basic properties (see also [22, 23]):

• Continuity: Unlike the traditional sparsity measure norm, the function is continuous on for all . Thus, it is stable with respective to small perturbations of the signal.

• Range equal to : For all and all , we have

 0≤sq(z)≤N.
• Scale-invariance: For all , it holds that . This property is in line with the common sense that sparsity should be based on relative (rather than absolute) magnitudes of the entries of the signal.

• Non-increasing in : For any , we have

 ∥z∥1∥z∥∞=s∞(z)≤sq′(z)≤sq(z)≤s0(z)=∥z∥0,

which follows from the non-increasing property of the Rényi entropy with respect to .

Next, we present a sufficient condition for the exact recovery via noise free BP in terms of -ratio sparsity. First, it is well known that when the true signal is -sparse, the sufficient and necessary condition for the exact recovery of the noise free BP problem:

 minz∈RN∥z∥1%s.t.Az=Ax (7)

is given by the null space property of order :

 ∥zS∥1<∥zSc∥1,∀z∈kerA∖{0},S⊂[N]with|S|≤k,

see Theorem 4.5 in [4]. Then, the sufficient condition for exact recovery of -sparse signal via noise free BP (7) in terms of -ratio sparsity goes as follows.

###### Proposition 1

if is -sparse and there exists some such that is strictly less than

 minz∈kerA∖{0}2q1−qsq(z), (8)

then the unique solution to problem (7) is the true signal .

Remarks. Obviously, this result is a direct extension of the Proposition 1 in [15], which states that if the sparsity level is strictly less than either or , then the unique recovery is achieved via (7). We extends it to a weaker condition, that is . When , the minimization problem (8) can be solved by solving linear programs with a polynomial time, see the algorithm (30). However, in the cases of , it’s very difficult to solve exactly. In Section V, we adopt a convex-concave procedure algorithm to solve it approximately.

Now we are ready to present the definition of -ratio constrained minimal singular value, which is developed based on -ratio sparsity level.

###### Definition 2

For any real number , and matrix , the -ratio constrained minimal singular value (CMSV) of is defined as

 ρq,s(A)=minz≠0,sq(z)≤s∥Az∥2∥z∥q. (9)

Remarks. When and , reduces to given in [15] and given in [16], respectively. For measurement matrices with columns of unit norm, it is obvious that for any since , and , where is the -th canonical basis for . Moreover, when and are fixed, as a function with respective to , is non-increasing. For any , we have . This fact together with Theorem 1 in Section III imply that increasing the sensing energy without changing the sensing matrix structure proportionally reduces the norm of reconstruction errors when the true signal is exactly sparse. In fact, other extensions of this definition can be done, for instance we can define, for any ,

 ρ◊,q(A,s)=minz≠0,sq(z)≤s∥Az∥◊∥z∥q,

where denotes a general norm. Then the measure defined in [16] is exactly . Thus the corresponding results there can be generalized in terms of this newly defined measure. But we do not pursue the extensions in this paper. Basically, the recovery condition ( with some proper ) discussed later to achieve the norm error bounds is equivalent to the robust width property investigated in [27, 28, 29]. The recovery condition in terms of -ratio CMSV is quiet similar to the restricted eigenvalue condition [30, 31] or more general restricted strong convexity condition [32]. The difference is that here we use a restricted set in terms of -ratio sparsity, which is more intuitive and makes the proof procedure more concise. Not least it is computable! Comparing to the RIP, all these conditions do not require upper bounds on the restricted eigenvalues.

As for different , we have the following important inequality, which will play a crucial role in analysing the probabilistic behavior of via the existing results for established in [15].

###### Proposition 2

If , then for any real number , we have

 ρq1,s(A)≥ρq2,sq2(q1−1)q1(q2−1)(A)≥s−q2(q1−1)q1(q2−1)ρq1,sq2(q1−1)q1(q2−1)(A). (10)

Remarks. Let and , we have . The left hand side is exactly the right hand side inequality of Proposition 4 in [16], i.e., . But as as , so . Similarly, according to the right hand side of the inequality, we have for any . But obviously . Therefore, we can not obtain the monotonicity with respective to of when and are fixed. However, when , then since for any , , it holds trivially that is increasing with respect to by using the decreasing property of norm.

## Iii Recovery results

In this section, we derive performance bounds on both norm and norm of the reconstruction errors for several convex sparse recovery algorithms in terms of the -ratio CMSV of the measurement matrix. Let where is the true sparse or compressible signal, is the measurement matrix and is the noise vector. We focus on three most renowned sparse recovery algorithms based on convex relaxation: the BP, the DS and the Lasso estimator.

BP: .

DS: .

Lasso: .

Here , and are parameters used in the conditions to control the noise levels. We first present the following main recovery results for the case that the true signal is exactly sparse.

###### Theorem 1

Suppose is -sparse. For any , we have
1) If , then the solution to the BP obeys

 ∥^x−x∥q ≤2ερq,2qq−1k(A), (11) ∥^x−x∥1 ≤4k1−1/qερq,2qq−1k(A). (12)

2) If the noise in the DS satisfies , then the solution to the DS obeys

 ∥^x−x∥q≤4k1−1/qρ2q,2qq−1k(A)λNσ, (13) ∥^x−x∥1≤8k2−2/qρ2q,2qq−1k(A)λNσ. (14)

3) If the noise in the Lasso satisfies for some , then the solution to the Lasso obeys

 ∥^x−x∥q ≤1+κ1−κ⋅2k1−1/qρ2q,(21−κ)qq−1k(A)λNσ, (15) ∥^x−x∥1 ≤1+κ(1−κ)2⋅4k2−2/qρ2q,(21−κ)qq−1k(A)λNσ. (16)

Remarks. When the noise vector , the conditions on noise for the DS and Lasso hold with high probability if (the parameter related to the signal dimensional ) is properly chosen. As a by product of (11), we have if , then the noise free BP (7) can uniquely recover any -sparse signal by letting . We established both the error norm and norm bounds. Our results for the error norm bounds generalize from the existing results in [15] () and [16] () to any . The error norm bounds depend on the -ratio CMSV of the measurement matrix , which is bounded away from zero for subgaussian random matrices and can be computed approximately by using some specific algorithms. The details will be discussed in the later sections.

Next, we extend Theorem 1 to the case that the true signal is allowed to be not exactly sparse, but is compressible, i.e., it can be well approximately by an exactly sparse signal.

###### Theorem 2

Let the -error of best -term approximation of be , which is a function that measures how close is to being -sparse. For any , we have
1) If , then the solution to the BP obeys

 ∥^x−x∥q ≤2ερq,4qq−1k(A)+k1/q−1σk(x)1, (17) ∥^x−x∥1 ≤4k1−1/qερq,4qq−1k(A)+4σk(x)1. (18)

2) If the noise in the DS satisfies , then the solution to the DS obeys

 ∥^x−x∥q ≤8k1−1/qρ2q,4qq−1k(A)λNσ+k1/q−1σk(x)1, (19) ∥^x−x∥1 ≤16k2−2/qρ2q,4qq−1k(A)λNσ+4σk(x)1. (20)

3) If the noise in the Lasso satisfies for some , then the solution to the Lasso obeys

 ∥^x−x∥q ≤1+κ1−κ⋅4k1−1/qρ2q,(41−κ)qq−1k(A)λNσ+k1/q−1σk(x)1, (21) ∥^x−x∥1 ≤1+κ(1−κ)2⋅8k2−2/qρ2q,(41−κ)qq−1k(A)λNσ+41−κσk(x)1. (22)

Remarks. As we can see, all the error bounds consist of two components, one is caused by the measurement error, while the other one is caused by the sparsity defect. And according to the proof procedure presented later, we can sharpen the error bounds to be the maximum of these two components instead of their summation. Comparing to the exactly sparse case, we need slightly stronger conditions to achieve the valid error bounds. Concisely, we require , and for the BP, DS and Lasso in the compressible case, while the conditions are , and in the exactly sparse case, respectively.

## Iv Random matrices

In this section, we study the property of -ratio CMSVs for the subgaussian random matrices. A random vector is called isotropic and subgaussian with constant if it holds for all that and . Then as shown in Theorem 2 of [15], we have the following lemma.

###### Lemma 1

([15]) Suppose the rows of the scaled measurement matrix to be i.i.d isotropic and subgaussian random vectors with constant . Then there exists constants and such that for any and satisfying

 m≥c1L2slogNη2

we have

 E|1−ρ2,s(A)|≤η

and

 P(1−η≤ρ2,s(A)≤1+η)≥1−exp(−c2η2mL4).

Then as a direct consequence of Proposition 2 (i.e., if , . While if , .) and Lemma 1, we have the following probabilistic statements about .

###### Theorem 3

Under the assumptions and notations of Lemma 1, it holds that

1) When , there exist constants and such that for any and satisfying

 m≥c1L2slogNη2

we have

 E[ρq,s(A)] ≥s−1(1−η), (23) P{ρq,s(A) ≥s−1(1−η)}≥1−exp(−c2η2mL4). (24)

2) When , there exist constants and such that for any and satisfying

 m≥c1L2s2(q−1)qlogNη2

we have

 E[ρq,s(A)] ≥1−η, (25) P{ρq,s(A) ≥1−η}≥1−exp(−c2η2mL4). (26)

Remarks. Theorem 3 shows that at least for subgaussian random matrices, the -ratio CMSV is bounded away from zero as long as the number of measurements is large enough. Random measurement matrices with i.i.d isotropic subgaussian random vector rows include the Gaussian and Bernoulli ensembles. The order of required number of measurements are close to the optimal order for establishing the norm error bound, see [33]. Besides, [34] shows that the of a class of structured random matrices including the Fourier random matrices and Hadamard random matrices is bounded from zero with high probability as long as the number of measurements is reasonably large. Then by adopting Proposition 2 again, this conclusion still holds for the with .

In Fig. 1, we plot the histograms of using the computing algorithm (33) for Gaussian random matrices normalized by . We set but with three different kind of , i.e., , and . We obtain each histogram from 100 Gaussian random matrices. It can be observed that as is expected that the -ratio CMSVs are all bounded away from zero both in expectation and with high probability in this setting.

## V Numerical experiments

In this section, we first describe a convex-concave procedure used to compute the maximal sparse level such that the sufficient condition (8) is fulfilled, and then introduce the computation of the -ratio CMSV and compare the -ratio CMSV based bound and RIC based bound on the BP.

### V-a Verifying Sufficient Conditions

In order to use the -ratio sparsity level to verify the sufficient condition (8), for each , we need to solve the optimization problem:

 minz∈kerA∖{0}2q1−q(∥z∥1∥z∥q)qq−1. (27)

Regardless of the constant, it is essentially equivalent to solve the problem:

 maxz∈RN∥z∥qs% .t. Az=0 and ∥z∥1≤1. (28)

Unfortunately, this maximizing norm over a polyhedron problem is non-convex. For , [15] proposed to use a semidefinite relaxation to obtain an upper bound:

 (L2): maxZ∈RN×N:Z⪰0trace(Z) s.t.trace(AZAT)=0,∥Z∥1≤1, (29)

where and is the entry-wise norm of . For , it can be solved by solving linear programs (see [15, 16]):

 (L∞):max1≤i≤N{maxz∈RNzi,s.t.\,\,Az=0 and ∥z∥1≤1}. (30)

Here we adopt the convex-concave procedure (CCP) (see [35] for details) to solve the problem (28) for any . The basic CCP algorithm goes as follows:

Algorithm: CCP to solve (28). Given an initial point . Let . Repeat 1. Convexify. Linearize with the approximation 2. Solve. Set the value of to be a solution of (31) 3. Update iteration: . Until stopping criterion is satisfied.

We first compare the and algorithms for verifying the sufficient condition with our developed CCP algorithm. We present the results of CCP algorithms for and 20 here. All the convex problems including (29), (30) and (31) are solved by CVX toolbox in Matlab [36]. The initial point used in CCP is taken to be the solution of (30). If we denote the optimal objective values obtained to solve (29), (30) and (28) via CCP algorithm with some by , and , then the maximal sparsity levels to achieve unique recovery for noise free BP are calculated as , and , respectively. In TABLE I, we present the corresponding maximal sparsity levels calculated via different algorithms for a small size Bernoulli matrix with fixed while varying the number of measurements . As is shown, the convex relaxation algorithm actually give a lower bound for the solution of the case . In TABLE II, we compare the results computed by and CCP for larger Gaussian random matrix with , also varying the number of measurements . From both tables, it is observed that the maximal sparsity levels to achieve unique recovery for noise free BP computed by the algorithms and are quite conservative. But it is much more acceptable and closer to the theoretical well-known optimal recovery condition bound for the Gaussian measurement matrix via our proposed CCP algorithm with some proper , for instance . According to Proposition 1, to obtain unique recovery for noise free BP, the sparsity level of the unknown true signal is merely required to less than or equal to the maximal sparsity level calculated for some .

### V-B Computing q-ratio CMSVs

For each , the computation of the -ratio CMSV is equivalent to

 minz∈RN∥Az∥2% s.t.∥z∥1≤sq−1q,∥z∥q=1. (32)

The above optimization problem is not convex because of the constraint . Here we use an interior point (IP) algorithm to directly compute an approximate numerical solution of (32). However IP approach requires that the objective and constraint function to possess continuous second order derivatives, which is not fulfilled by the constraint . The problem can be addressed by defining with and . This leads to the following augmented optimization problem:

 minz+,z−∈RN (z+−z−)TATA(z+−z−) s.t.∑iz+i+∑iz−i−sq−1q≤0, ∥z+−z−∥qq=1, z+≥0,z−≥0. (33)

The IP algorithm is implemented using the Matlab function fmincon. Due to the existences of local minima, we run the IP 30 times and select the minimal function value for all the trials. In Fig. 2, we compare the -ratio CMSVs as a function of approximated by the IP for Bernoulli random matrices. We set but with three different and three different . A Bernoulli random matrix with dimensionality is first simulated. Then for , the corresponding matrix is obtained by taking the first rows of that full Bernoulli random matrix. And the columns of all the used matrix are normalized to have unit norms, which guarantees that for any . In general, as is shown that the -ratio CMSVs decrease as increases for all the nine cases. For fixed , the -ratio CMSVs increases as increases for all the . The influence of on the -ratio CMSVs is relatively small and apparently not monotonous.

In Fig. 3, we plot the -ratio CMSVs as a function of with varying and . For each and , we computing the -ratio CMSVs with increasing from 20 to 40. For each , the construction of the corresponding matrix follows the same procedure given previously. Under the same settings, the -ratio CMSVs as a function of with varying and are presented in Fig. 4. Similar behaviors as Fig. 2 are observed from these two figures.

### V-C Bounds Comparison

Finally, we compare the -ratio CMSV based bound and the RIC based bound on the BP for different configurations of and . It’s known that if the order RIC of the measurement matrix satisfies that , then for any solution of the noisy BP approximates the true -sparse signal with errors

 ∥x−^x∥q≤Ck1/q−1/2ε, (34)

where with any .

Without loss of generality, we set . The RIC is approximated using Monte Carlo simulations. Specifically, to compute , we randomly take 1000 submatrices of of size , and approximate using the maximum of among all sampled submatrices. Here and are the corresponding maximal and minimal singular values of the sampled submatrix. As it is obvious that the approximated RIC is always smaller than or equal to the exact RIC, the error bounds based on the exact RIC are always worse than those based on the approximated RIC. Therefore, if our -ratio CMSV based bound is better than the approximated RIC based bound, it is even better than the exact RIC based one.

We approximate the -ratio CMSV and the RIC for column normalized submatrices of a row-randomly-permuted Hadamard matrix with , , , and . Fig. 5 shows that for all the tested cases, the -ratio based bounds are smaller than those based on the RIC. For some certain and , the -ratio CMSV based bounds apply even when the RIC based bound do not apply (i.e., ). When approaches , it can be observed that the -ratio based bounds are slightly larger than while the RIC based bounds approach as .

## Vi Conclusion

In this paper, we proposed a new measure of the measurement matrix’s incoherence, the -ratio CMSV which was defined based on the -ratio sparsity measure. We established the bounds for both norm and norm of the reconstruction errors of the Basis Pursuit, the Dantzig selector and the Lasso estimator using the -ratio CMSV. For the subgaussian random matrices, we showed that the -ratio CMSV is bounded away from zero as long as the number of measurements is relatively large. A CCP algorithm was developed to verify the sufficient conditions guaranteeing the unique noiseless recovery and an interior point problem was used to compute the -ratio CMSV. Numerical experiments were presented to illustrate our theoretical results and assess all the algorithms. Some further generalizations including the block sparsity recovery, low-rank matrix recovery or more general high dimensional M-estimation are left for future work.

[Proofs.]

Proof of Proposition 1. If there exists and such that , then for any , we have

 ∥z∥1=∥zS∥1+∥zSc∥1≤2∥zS∥1 ≤2k1−1/q∥zS∥q ≤2k1−1/q∥z∥q,

which implies that for any .

As a consequence of contraposition, when there exists some such that , holds that for all and . Thus the null space property of order is fulfilled and the unique solution to problem (7) is exactly the true -sparse signal .

Proof of Proposition 2. We firstly prove the left hand side of (10). For any and , when , we have as . Then, we have

 ∥z∥1∥z∥q2≤sq1−1q1⇒sq2(z)=(∥z∥1∥z∥q2)q2q2−1≤sq2(q1−1)q1(q2−1),

which implies that

 {z:sq1(z)≤s}⊆{z:sq2(z)≤sq2(q1−1)q1(q2−1)}.

Therefore, we have

 ρq1,s(A) =minz≠0,sq1(z)≤s∥Az∥2∥z∥q1≥minz≠0,sq2(z)≤sq2(q1−1)q1(q2−1)∥Az∥2</