Simultaneous Block-Sparse Signal Recovery Using Pattern-Coupled Sparse Bayesian Learning

# Simultaneous Block-Sparse Signal Recovery Using Pattern-Coupled Sparse Bayesian Learning

###### Abstract

In this paper, we consider the block-sparse signal recovery problem in the context of multiple measurement vectors (MMV) with a common row sparsity pattern. We develop a new method for recovering MMV signals with common row sparsity, in which a pattern-coupled hierarchical Gaussian prior model is introduced to characterize both the block-sparsity of the coefficients and the statistical dependency between neighboring coefficients. Unlike many other methods, the proposed method automatically captures the block-sparse structure of the unknown signal. The method is developed within an expectation-maximization (EM) framework. Simulation results show that the proposed method offers competitive performance in recovering block-sparse MMV signals with a common row sparsity pattern.


Hang Xiao¹, Zhengli Xing², Linxiao Yang¹, Jun Fang¹, and Yanlun Wu¹

¹National Key Laboratory of Science and Technology on Communications, University of Electronic Science and Technology of China, Chengdu 611731, China

²China Academy of Engineering Physics, Mianyang 621900, China

Index Terms— Compressed sensing, block-sparse signal, multiple measurement vectors (MMV), sparse Bayesian learning.

## 1 Introduction

Compressed sensing is a new paradigm for data acquisition and reconstruction that exploits the inherent sparsity of signals of interest. In practice, sparse signals usually have additional structure that can be exploited to enhance recovery performance. For example, the atomic decomposition of multi-band signals[1] or audio signals[2] usually results in a block-sparse structure, in which the nonzero coefficients occur in clusters. A number of algorithms, e.g. block-OMP[3] and mixed-norm minimization[4], were proposed to recover block-sparse signals. These methods address only the single measurement vector (SMV) recovery problem. In practice, however, we usually obtain multiple observations, and the assumption that the underlying signals share the same sparsity pattern often holds, so recovery performance can be enhanced through joint estimation. Many real-world signals keep an unchanged sparsity pattern over time. For instance, communication signals are assigned to specific bands of the frequency spectrum and therefore often share the same sparsity pattern in the frequency domain. It has been shown that, compared to the SMV case, the successful recovery rate can be greatly improved by using multiple measurement vectors[5][6]. Wipf and Rao first introduced sparse Bayesian learning (SBL) to sparse signal recovery for the SMV model, and later extended it to the MMV model, deriving the MSBL algorithm[7]. However, the MSBL algorithm does not take the block-sparse property of MMV signals into consideration. Zhang and Rao developed the BSBL algorithm to recover temporally correlated block-sparse MMV signals[8]. Nevertheless, the BSBL algorithm requires the block partition to be known a priori.

In this paper, we extend our previous work[9] to the MMV scenario. To exploit the statistical dependencies, we propose a pattern-coupled hierarchical Gaussian prior model that characterizes both the block sparseness of the coefficients and the statistical dependency between neighboring coefficients. A key assumption in the considered MMV model is that the supports (i.e. the indices of the nonzero entries) of all columns of the MMV signal are identical[10], and that the block structure of each signal is entirely unknown[8]. In our hierarchical Bayesian model, the prior for each coefficient involves not only its own hyperparameter but also the hyperparameters of its immediate neighbors. An expectation-maximization (EM) algorithm is developed to learn the hyperparameters and to estimate the block-sparse MMV signals. Simulation results are provided to illustrate the effectiveness of the proposed algorithm.

## 2 Problem Formulation

We consider the problem of simultaneously recovering a set of block-sparse signals from a basic underdetermined system

$$
\mathbf{y}_l = \boldsymbol{\Phi}\mathbf{x}_l + \mathbf{v}_l, \quad \forall l = 1, \ldots, L, \tag{1}
$$

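where $\mathbf{y}_l \in \mathbb{R}^{M}$, $\boldsymbol{\Phi} \in \mathbb{R}^{M \times N}$ ($M < N$) and $\mathbf{v}_l \in \mathbb{R}^{M}$ denote the $l$th measurement vector, the sensing matrix and the noise, respectively. Each signal $\mathbf{x}_l \in \mathbb{R}^{N}$ has a block-sparse structure and all $\mathbf{x}_l$ share the same support. We note that the block partition of $\mathbf{x}_l$ is unknown. We aim to recover $\{\mathbf{x}_l\}_{l=1}^{L}$ by exploiting their block-sparsity and the fact that they share the same row sparsity pattern.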

It is easy to see that the model (1) can be rewritten in matrix form, given by

$$
\mathbf{Y} = \boldsymbol{\Phi}\mathbf{X} + \mathbf{V}, \tag{2}
$$

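where $\mathbf{Y} \triangleq [\mathbf{y}_1, \ldots, \mathbf{y}_L]$, $\mathbf{X} \triangleq [\mathbf{x}_1, \ldots, \mathbf{x}_L]$, and $\mathbf{V} \triangleq [\mathbf{v}_1, \ldots, \mathbf{v}_L]$ is the unknown noise matrix. We assume that the elements of $\mathbf{V}$ are i.i.d. white Gaussian noise with zero mean and variance $\lambda^{-1}$. Then the distribution of $\mathbf{Y}$ conditional on $\mathbf{X}$ is given as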

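$$
p(\mathbf{Y}|\mathbf{X}) = \left(\frac{\lambda}{2\pi}\right)^{\frac{ML}{2}} \exp\left(-\frac{\lambda}{2}\|\mathbf{Y} - \boldsymbol{\Phi}\mathbf{X}\|_F^2\right) \tag{3}
$$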
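As an illustration of the measurement model (1)-(3), the following NumPy sketch draws a small block-sparse signal matrix with a common row support and evaluates the log-likelihood. The sizes `M`, `N`, `L`, the precision `lam` and the chosen support are hypothetical values picked only for this example; they are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, L = 8, 20, 3           # hypothetical sizes, M < N (underdetermined)
lam = 100.0                  # noise precision, i.e. noise variance 1/lam

Phi = rng.standard_normal((M, N))

# Two nonzero blocks of rows shared by all L columns (common row support).
X = np.zeros((N, L))
support = np.r_[3:6, 12:16]
X[support] = rng.standard_normal((support.size, L))

V = rng.standard_normal((M, L)) / np.sqrt(lam)
Y = Phi @ X + V              # matrix model (2)

def log_likelihood(Y, Phi, X, lam):
    """ln p(Y|X) from (3): (ML/2) ln(lam / 2pi) - (lam/2) ||Y - Phi X||_F^2."""
    M, L = Y.shape
    resid = Y - Phi @ X
    return 0.5 * M * L * np.log(lam / (2 * np.pi)) - 0.5 * lam * np.sum(resid**2)

# The column-wise model (1) stacked side by side reproduces the matrix model (2).
Y_cols = np.column_stack([Phi @ X[:, l] + V[:, l] for l in range(L)])
```

Here `Y_cols` is built column by column from (1), so comparing it with `Y` is a quick check that the two forms of the model agree.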

To simultaneously capture the column-wise block-sparsity and the common row sparsity pattern of $\mathbf{X}$, we assign a Gaussian prior to each row of $\mathbf{X}$. Specifically, we impose on the $n$th row of $\mathbf{X}$, i.e. $\mathbf{x}_{n\cdot}$, a Gaussian prior with zero mean and a structured covariance matrix, i.e.,

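$$
p(\mathbf{x}_{n\cdot}) = \mathcal{N}\left(\mathbf{0}, (\alpha_n + \beta\alpha_{n-1} + \beta\alpha_{n+1})^{-1}\mathbf{B}_2^{-1}\right) \tag{4}
$$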

where $\mathbf{B}_2 \in \mathbb{R}^{L \times L}$ is a positive definite matrix characterizing the dependency among the elements of $\mathbf{x}_{n\cdot}$. We note that all the rows of $\mathbf{X}$ share the same $\mathbf{B}_2$. It has been shown that such a prior is able to promote the low-rankness of $\mathbf{X}$[11], i.e., to automatically capture the correlation among the rows of $\mathbf{X}$. In (4), $\{\alpha_n\}$ are positive scalars controlling the sparsity of the rows of $\mathbf{X}$. We assume $\alpha_0 = 0$ and $\alpha_{N+1} = 0$ for the end rows $\mathbf{x}_{1\cdot}$ and $\mathbf{x}_{N\cdot}$, and $\beta$ is a parameter indicating the relevance between a coefficient and its neighboring coefficients. It has been shown that, by coupling neighboring elements through $\beta$, such a prior has the potential to encourage a block-sparse solution[9]. The joint distribution of $\mathbf{X}$ given $\boldsymbol{\alpha}$ and $\mathbf{B}_2$ can then be written as

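$$
p(\mathbf{X}) = \frac{|\mathbf{B}_1|^{\frac{L}{2}} |\mathbf{B}_2|^{\frac{N}{2}}}{\sqrt{(2\pi)^{NL}}} \exp\left(-\frac{\mathrm{tr}(\mathbf{X}^T \mathbf{B}_1 \mathbf{X} \mathbf{B}_2)}{2}\right) \tag{5}
$$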

where $\mathbf{B}_1$ is a diagonal matrix whose $n$th diagonal element equals $\alpha_n + \beta\alpha_{n-1} + \beta\alpha_{n+1}$.
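To make the pattern-coupled prior concrete, the sketch below builds the diagonal of $\mathbf{B}_1$ from the hyperparameters and evaluates the log of the joint prior (5). The function names `coupled_precisions` and `log_prior` are hypothetical helpers introduced for this illustration, and the end-row convention $\alpha_0 = \alpha_{N+1} = 0$ is implemented by zero padding.

```python
import numpy as np

def coupled_precisions(alpha, beta):
    """Diagonal of B1: alpha_n + beta*alpha_{n-1} + beta*alpha_{n+1},
    with alpha_0 = alpha_{N+1} = 0 for the end rows."""
    padded = np.concatenate(([0.0], alpha, [0.0]))
    return padded[1:-1] + beta * (padded[:-2] + padded[2:])

def log_prior(X, alpha, beta, B2):
    """ln p(X) from (5), with B1 = diag(coupled_precisions(alpha, beta))."""
    N, L = X.shape
    b = coupled_precisions(alpha, beta)
    _, logdet_B2 = np.linalg.slogdet(B2)
    # tr(X^T B1 X B2); B1 is diagonal, so B1 X is a row scaling of X.
    quad = np.trace(X.T @ (b[:, None] * X) @ B2)
    return (0.5 * L * np.sum(np.log(b)) + 0.5 * N * logdet_B2
            - 0.5 * N * L * np.log(2 * np.pi) - 0.5 * quad)

alpha = np.array([1.0, 2.0, 3.0, 4.0])
b = coupled_precisions(alpha, beta=0.5)   # entries 2, 4, 6, 5.5
```

With $\beta = 0$ the diagonal of $\mathbf{B}_1$ reduces to $\boldsymbol{\alpha}$ itself, so each row is then controlled only by its own hyperparameter.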

## 3 Proposed Bayesian Inference Algorithm

In this section, we develop a sparse Bayesian learning method for block-sparse MMV signal recovery. Based on the above hierarchical model, the posterior distribution of $\mathbf{X}$ can be computed as

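$$
p(\mathbf{X}|\mathbf{Y}; \boldsymbol{\alpha}, \mathbf{B}_2, \lambda) \propto p(\mathbf{Y}|\mathbf{X}; \lambda)\, p(\mathbf{X}; \boldsymbol{\alpha}, \mathbf{B}_2) \tag{6}
$$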

where $\boldsymbol{\alpha} \triangleq [\alpha_1, \ldots, \alpha_N]^T$.

The log-posterior of $\mathbf{X}$ can be written as

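$$
\begin{aligned}
\ln p(\mathbf{X}|\mathbf{Y}; \boldsymbol{\alpha}, \mathbf{B}_2, \lambda) &\propto -\frac{\lambda}{2}\|\mathbf{Y} - \boldsymbol{\Phi}\mathbf{X}\|_F^2 - \frac{1}{2}\mathrm{tr}(\mathbf{X}^T \mathbf{B}_1 \mathbf{X} \mathbf{B}_2) \\
&\propto -\frac{1}{2}\mathbf{x}^T\left(\lambda \mathbf{I} \otimes (\boldsymbol{\Phi}^T\boldsymbol{\Phi}) + \mathbf{B}_2 \otimes \mathbf{B}_1\right)\mathbf{x} + \lambda \mathbf{y}^T(\mathbf{I} \otimes \boldsymbol{\Phi})\mathbf{x}
\end{aligned}
$$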

where $\mathbf{x}$ and $\mathbf{y}$ denote the vectorizations of $\mathbf{X}$ and $\mathbf{Y}$, respectively, and $\otimes$ denotes the Kronecker product. It follows that the posterior distribution of $\mathbf{x}$ is Gaussian with mean and covariance matrix given as

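$$
\begin{aligned}
\boldsymbol{\mu} &= \lambda \boldsymbol{\Sigma} (\mathbf{I} \otimes \boldsymbol{\Phi})^T \mathbf{y} && (7) \\
\boldsymbol{\Sigma} &= \left(\lambda \mathbf{I} \otimes (\boldsymbol{\Phi}^T\boldsymbol{\Phi}) + \mathbf{B}_2 \otimes \mathbf{B}_1\right)^{-1} && (8)
\end{aligned}
$$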

Given a set of estimated hyperparameters $\boldsymbol{\alpha}$, $\mathbf{B}_2$, $\lambda$ and the observed $\mathbf{Y}$, the maximum a posteriori (MAP) estimate of $\mathbf{x}$ is the mean of its posterior distribution, i.e.

$$
\hat{\mathbf{x}}_{\mathrm{MAP}} = \boldsymbol{\mu} \tag{9}
$$
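The posterior moments (7)-(8) and the MAP estimate (9) can be evaluated directly for small problems. The following is a minimal sketch, assuming column-stacking vectorization (NumPy `order="F"`) and an explicit $NL \times NL$ inverse, which is only practical for small $N$ and $L$. The function name `posterior` and its argument `b` (the diagonal of $\mathbf{B}_1$) are hypothetical.

```python
import numpy as np

def posterior(Y, Phi, b, B2, lam):
    """Posterior mean and covariance of x = vec(X), following (7) and (8).

    b holds the diagonal of B1; vec() stacks columns, hence order="F".
    """
    M, L = Y.shape
    N = Phi.shape[1]
    A = np.kron(np.eye(L), Phi)                       # I ⊗ Phi
    precision = lam * np.kron(np.eye(L), Phi.T @ Phi) + np.kron(B2, np.diag(b))
    Sigma = np.linalg.inv(precision)                  # (8)
    mu = lam * Sigma @ A.T @ Y.reshape(-1, order="F") # (7)
    X_map = mu.reshape(N, L, order="F")               # MAP estimate (9), as a matrix
    return mu, Sigma, X_map
```

When $\mathbf{B}_2 = \mathbf{I}$ the precision matrix is block diagonal, so each column of `X_map` coincides with a separate regularized least-squares solve for that column.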

Our problem therefore reduces to estimating the hyperparameters $\boldsymbol{\alpha}$, $\mathbf{B}_2$ and $\lambda$. One strategy for maximizing the likelihood function of these hyperparameters is the expectation-maximization (EM) formulation, in which we first introduce a hidden variable and then iteratively maximize a lower bound of the likelihood function (this lower bound is also referred to as the Q-function). Briefly speaking, the algorithm alternates between an E-step and an M-step. In the E-step, we compute a new Q-function by taking the expectation of the log joint distribution of the data and the hidden variable with respect to the posterior of the hidden variable, which is computed using the current estimate of the hyperparameters. In the M-step, we update the hyperparameters by maximizing the Q-function with respect to them.

We define the hyperparameters $\Theta \triangleq \{\boldsymbol{\alpha}, \mathbf{B}_2, \lambda\}$ and treat $\mathbf{X}$ as the hidden variable. Then the Q-function can be expressed as

$$
Q(\Theta) = E_{p(\mathbf{X}|\mathbf{Y}; \Theta^{(t)})}\left[\ln p(\mathbf{Y}, \mathbf{X}; \Theta)\right], \tag{10}
$$

and, consequently, $\Theta$ can be updated by maximizing the Q-function, i.e.,

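$$
\begin{aligned}
\boldsymbol{\alpha}^{(t+1)} &= \arg\max_{\boldsymbol{\alpha}} Q(\Theta) = \arg\max_{\boldsymbol{\alpha}} E_{p(\mathbf{X}|\mathbf{Y}; \Theta^{(t)})}\left[\ln p(\mathbf{X}; \boldsymbol{\alpha}, \mathbf{B}_2)\right] && (11) \\
\mathbf{B}_2^{(t+1)} &= \arg\max_{\mathbf{B}_2} Q(\Theta) = \arg\max_{\mathbf{B}_2} E_{p(\mathbf{X}|\mathbf{Y}; \Theta^{(t)})}\left[\ln p(\mathbf{X}; \boldsymbol{\alpha}, \mathbf{B}_2)\right] && (12) \\
\lambda^{(t+1)} &= \arg\max_{\lambda} Q(\Theta) = \arg\max_{\lambda} E_{p(\mathbf{X}|\mathbf{Y}; \Theta^{(t)})}\left[\ln p(\mathbf{Y}|\mathbf{X}; \lambda)\right] && (13)
\end{aligned}
$$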

We first evaluate the expectation in (11) and (12), which, up to an additive constant, is given as

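$$
\begin{aligned}
E_{p(\mathbf{X}|\mathbf{Y}; \Theta^{(t)})}\left[\ln p(\mathbf{X}; \boldsymbol{\alpha}, \mathbf{B}_2)\right] &= \left\langle \frac{L}{2}\ln|\mathbf{B}_1| + \frac{N}{2}\ln|\mathbf{B}_2| - \frac{\mathrm{tr}(\mathbf{X}^T \mathbf{B}_1 \mathbf{X} \mathbf{B}_2)}{2} \right\rangle \\
&= \frac{L}{2}\ln|\mathbf{B}_1| + \frac{N}{2}\ln|\mathbf{B}_2| - \frac{1}{2}\mathrm{tr}\left((\mathbf{B}_2 \otimes \mathbf{B}_1)\langle \mathbf{x}\mathbf{x}^T \rangle\right)
\end{aligned} \tag{14}
$$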

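where $\langle \cdot \rangle$ denotes the expectation with respect to the posterior distribution $p(\mathbf{X}|\mathbf{Y}; \Theta^{(t)})$, and $\langle \mathbf{x}\mathbf{x}^T \rangle$ is given as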

$$
\langle \mathbf{x}\mathbf{x}^T \rangle = \boldsymbol{\mu}\boldsymbol{\mu}^T + \boldsymbol{\Sigma} \tag{15}
$$

Then the problem (11) can be solved by setting the first derivative of (14) with respect to $\alpha_i$ to zero, i.e.,

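$$
\frac{\partial Q(\Theta)}{\partial \alpha_i} = \frac{L}{2}\left(\nu_i + \beta\nu_{i-1} + \beta\nu_{i+1}\right) - \phi_i = 0 \tag{16}
$$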

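where we define $\nu_i \triangleq (\alpha_i + \beta\alpha_{i-1} + \beta\alpha_{i+1})^{-1}$, with $\nu_0 = 0$ and $\nu_{N+1} = 0$, and $\phi_i \triangleq \frac{1}{2}\mathrm{tr}\left(\mathbf{B}_2(\boldsymbol{\Omega}_i + \beta\boldsymbol{\Omega}_{i-1} + \beta\boldsymbol{\Omega}_{i+1})\right)$, in which $\boldsymbol{\Omega}_i \triangleq \langle \mathbf{x}_{i\cdot}\mathbf{x}_{i\cdot}^T \rangle$, with $\mathbf{x}_{i\cdot}$ denoting the $i$th row of $\mathbf{X}$. Similarly, we set $\boldsymbol{\Omega}_0 = \mathbf{0}$ and $\boldsymbol{\Omega}_{N+1} = \mathbf{0}$. Then the optimal solution should satisfy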

$$
\frac{L}{2}\left(\nu_i^* + \beta\nu_{i-1}^* + \beta\nu_{i+1}^*\right) = \phi_i \tag{17}
$$

Since all the hyperparameters are non-negative, we have

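$$
\begin{aligned}
\frac{1}{\alpha_i^*} &> \nu_i^* > 0, \quad \forall i = 1, \ldots, N && (18) \\
\frac{1}{\beta\alpha_{i+1}^*} &> \nu_i^* > 0, \quad \forall i = 1, \ldots, N-1 && (19) \\
\frac{1}{\beta\alpha_{i-1}^*} &> \nu_i^* > 0, \quad \forall i = 2, \ldots, N && (20)
\end{aligned}
$$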

Hence the term on the left-hand side of (17) is bounded from below and above, which gives

$$
\frac{3L}{2\alpha_i^*} > \phi_i > 0 \tag{21}
$$

Combining equations (17)-(21), we arrive at

$$
\alpha_i^* \in \left[0, \frac{3L}{2\phi_i}\right] \tag{22}
$$
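The factor of 3 in the upper bound can be traced term by term. The worked step below is a restatement under the assumption $\beta > 0$: it applies (18) to $\nu_i^*$, (19) with index $i-1$ to $\nu_{i-1}^*$, and (20) with index $i+1$ to $\nu_{i+1}^*$, and then substitutes the result into (17). For the end rows the same bound holds because $\nu_0 = \nu_{N+1} = 0$.

```latex
\begin{aligned}
\nu_i^* + \beta\nu_{i-1}^* + \beta\nu_{i+1}^*
  &< \frac{1}{\alpha_i^*} + \beta\cdot\frac{1}{\beta\alpha_i^*} + \beta\cdot\frac{1}{\beta\alpha_i^*}
   = \frac{3}{\alpha_i^*}, \\
\phi_i = \frac{L}{2}\left(\nu_i^* + \beta\nu_{i-1}^* + \beta\nu_{i+1}^*\right)
  &< \frac{3L}{2\alpha_i^*}
  \;\Longrightarrow\; \alpha_i^* < \frac{3L}{2\phi_i}.
\end{aligned}
```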

Due to the high computational complexity of computing an exact solution of (16), we employ a sub-optimal solution, i.e., we simply set $\alpha_i$ to its upper bound, which gives

$$
\alpha_i^{(t+1)} = \frac{3L}{2\phi_i} \tag{23}
$$

Although we employ the sub-optimal solution (23) to update the hyperparameters in the M-step, numerical results show that this update rule is quite effective. This is because the sub-optimal solution (23) provides a reasonable estimate of the optimal solution.
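The update (23) can be sketched in a few lines once the posterior moments are available. The expression used for $\phi_i$ here is an assumption obtained by differentiating the trace term of (14) with respect to $\alpha_i$, namely $\phi_i = \frac{1}{2}(w_i + \beta w_{i-1} + \beta w_{i+1})$ with $w_i = \mathrm{tr}(\mathbf{B}_2 \langle \mathbf{x}_{i\cdot}\mathbf{x}_{i\cdot}^T \rangle)$ and $w_0 = w_{N+1} = 0$. The function name `update_alpha` and the symbol $w_i$ are hypothetical, and column-stacking vectorization is assumed.

```python
import numpy as np

def update_alpha(mu, Sigma, B2, beta, N, L):
    """Sub-optimal M-step (23): alpha_i = 3L / (2 phi_i).

    Assumes phi_i = 0.5 * (w_i + beta*w_{i-1} + beta*w_{i+1}),
    w_i = tr(B2 <x_i. x_i.^T>), and w_0 = w_{N+1} = 0.
    """
    w = np.empty(N)
    for i in range(N):
        idx = i + N * np.arange(L)          # entries of row i inside vec(X)
        Omega_i = np.outer(mu[idx], mu[idx]) + Sigma[np.ix_(idx, idx)]
        w[i] = np.trace(B2 @ Omega_i)
    padded = np.concatenate(([0.0], w, [0.0]))
    phi = 0.5 * (padded[1:-1] + beta * (padded[:-2] + padded[2:]))
    return 1.5 * L / phi
```

A row whose own moments are small still receives a moderate $\alpha_i$ if a neighboring row carries energy, which is how the coupling through $\beta$ enters this update.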

We then consider solving the problem (12). Similarly, we set the first derivative of the Q-function with respect to $\mathbf{B}_2$ to zero, i.e.,

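$$
\frac{\partial Q(\Theta)}{\partial \mathbf{B}_2} = \frac{N}{2}\mathbf{B}_2^{-1} - \frac{1}{2}\sum_{i=1}^{N} \nu_i^{-1}\boldsymbol{\Omega}_i = \mathbf{0} \tag{24}
$$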

and find that the optimal solution is given by

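$$
\mathbf{B}_2^{(t+1)} = \left(\frac{1}{N}\sum_{i=1}^{N} (\alpha_i + \beta\alpha_{i-1} + \beta\alpha_{i+1})\,\boldsymbol{\Omega}_i\right)^{-1} \tag{25}
$$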

To estimate $\lambda$, the Q-function can be simplified to

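$$
Q(\lambda) = E_{\mathbf{x}|\mathbf{y}, \Theta^{(t)}}\left[\log p(\mathbf{y}|\mathbf{x}; \lambda)\right] \propto \frac{ML}{2}\log\lambda - \frac{\lambda}{2}\left\langle \|\mathbf{Y} - \boldsymbol{\Phi}\mathbf{X}\|_F^2 \right\rangle \tag{26}
$$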

By computing the derivative of (26) and setting it to zero, we arrive at

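$$
\lambda^{(t+1)} = \frac{ML}{\left\langle \|\mathbf{Y} - \boldsymbol{\Phi}\mathbf{X}\|_F^2 \right\rangle} = \frac{ML}{\left\langle \|\mathbf{y} - (\mathbf{I} \otimes \boldsymbol{\Phi})\mathbf{x}\|_2^2 \right\rangle} \tag{27}
$$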

Some of the expectations and moments used during the update are summarized as

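$$
\begin{aligned}
\langle \mathbf{x}\mathbf{x}^T \rangle &= \boldsymbol{\mu}\boldsymbol{\mu}^T + \boldsymbol{\Sigma}, \qquad \langle \mathbf{x}_{i\cdot}\mathbf{x}_{i\cdot}^T \rangle = \boldsymbol{\mu}_i\boldsymbol{\mu}_i^T + \boldsymbol{\Sigma}_i && (28) \\
\left\langle \|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2 \right\rangle &= \|\mathbf{y} - \mathbf{A}\boldsymbol{\mu}\|_2^2 + \mathrm{tr}(\mathbf{A}^T\mathbf{A}\boldsymbol{\Sigma}) && (29)
\end{aligned}
$$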

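where $\boldsymbol{\mu}_i$ is a vector whose $l$th element equals the $((l-1)N+i)$th element of $\boldsymbol{\mu}$, $\boldsymbol{\Sigma}_i$ is a matrix whose $(k,l)$th entry equals the $((k-1)N+i, (l-1)N+i)$th entry of $\boldsymbol{\Sigma}$, and $\mathbf{A} \triangleq \mathbf{I} \otimes \boldsymbol{\Phi}$.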
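The index bookkeeping in (28) and the residual moment (29) are easy to get wrong, so a small sketch may help. It assumes column-stacking vectorization, so that entry $(i, l)$ of $\mathbf{X}$ sits at position $(l-1)N + i$ of $\mathbf{x}$ (zero-based: `i + N*l`). The function names `row_moment` and `expected_residual` are hypothetical.

```python
import numpy as np

def row_moment(mu, Sigma, i, N, L):
    """<x_i. x_i.^T> = mu_i mu_i^T + Sigma_i from (28), for row i (0-based) of X."""
    idx = i + N * np.arange(L)          # positions of X[i, 0..L-1] in vec(X)
    mu_i = mu[idx]
    Sigma_i = Sigma[np.ix_(idx, idx)]
    return np.outer(mu_i, mu_i) + Sigma_i

def expected_residual(y, A, mu, Sigma):
    """<||y - A x||^2> = ||y - A mu||^2 + tr(A^T A Sigma) from (29)."""
    r = y - A @ mu
    return r @ r + np.trace(A.T @ A @ Sigma)

# Tiny example with N = 2, L = 2: vec(X) = [X00, X10, X01, X11].
mu = np.arange(4.0)
Sigma = np.diag([1.0, 2.0, 3.0, 4.0])
Omega_0 = row_moment(mu, Sigma, 0, 2, 2)   # uses entries 0 and 2 of mu
```

In this toy case row 0 picks up `mu[[0, 2]] = [0, 2]` and the matching diagonal block `diag(1, 3)` of `Sigma`.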

For clarity, we summarize our algorithm as follows.
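One possible reading of the EM iterations derived in this section is sketched below in NumPy. It is an illustration under several assumptions, not the authors' reference implementation: the initial values, the iteration count, the function name `mpcsbl_sketch` and the flags `learn_B2` and `learn_lam` are hypothetical; $\phi_i$ is computed as $\frac{1}{2}(w_i + \beta w_{i-1} + \beta w_{i+1})$ with $w_i = \mathrm{tr}(\mathbf{B}_2\boldsymbol{\Omega}_i)$; $\mathbf{B}_2$ and $\lambda$ are set to the roots of the derivatives of (24) and (26); and the explicit $NL \times NL$ inverse limits it to small problems.

```python
import numpy as np

def mpcsbl_sketch(Y, Phi, beta=1.0, lam=1e2, n_iter=30,
                  learn_B2=True, learn_lam=False):
    """EM iterations for the pattern-coupled MMV model (illustrative sketch)."""
    M, L = Y.shape
    N = Phi.shape[1]
    alpha, B2 = np.ones(N), np.eye(L)
    A = np.kron(np.eye(L), Phi)                     # I ⊗ Phi
    AtA = A.T @ A
    y = Y.reshape(-1, order="F")                    # y = vec(Y), column stacking
    for _ in range(n_iter):
        # Diagonal of B1, with alpha_0 = alpha_{N+1} = 0.
        a = np.concatenate(([0.0], alpha, [0.0]))
        b = a[1:-1] + beta * (a[:-2] + a[2:])
        # E-step: posterior covariance (8) and mean (7).
        Sigma = np.linalg.inv(lam * AtA + np.kron(B2, np.diag(b)))
        mu = lam * Sigma @ (A.T @ y)
        # Row-wise second moments Omega_i, see (28).
        Omega = np.empty((N, L, L))
        for i in range(N):
            idx = i + N * np.arange(L)
            Omega[i] = np.outer(mu[idx], mu[idx]) + Sigma[np.ix_(idx, idx)]
        # M-step for alpha: sub-optimal rule (23).
        w = np.einsum("kl,ilk->i", B2, Omega)       # w_i = tr(B2 Omega_i)
        wp = np.concatenate(([0.0], w, [0.0]))
        phi = 0.5 * (wp[1:-1] + beta * (wp[:-2] + wp[2:]))
        alpha = 1.5 * L / phi
        # M-step for B2: root of the derivative in (24), using the new alpha.
        if learn_B2:
            a = np.concatenate(([0.0], alpha, [0.0]))
            b = a[1:-1] + beta * (a[:-2] + a[2:])
            B2 = np.linalg.inv(np.einsum("i,ikl->kl", b, Omega) / N)
        # Optional M-step for lambda: root of the derivative of (26).
        if learn_lam:
            r = y - A @ mu
            lam = M * L / (r @ r + np.trace(AtA @ Sigma))
    return mu.reshape(N, L, order="F"), alpha
```

A call such as `X_hat, alpha = mpcsbl_sketch(Y, Phi, beta=1.0)` returns the posterior mean reshaped to $N \times L$ together with the final $\boldsymbol{\alpha}$. Large entries of `alpha` indicate rows that the prior drives toward zero. A practical implementation would likely add a stopping criterion and pruning of such rows, neither of which is specified in the text above.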

## 4 Simulation Results

In our simulations, we study how the proposed algorithm benefits from multiple measurement vectors in recovering block-sparse signals with a common row sparsity pattern. Suppose each $N$-dimensional sparse vector contains nonzero coefficients that are partitioned into blocks with random sizes and random locations. The over-complete dictionary is randomly generated with each entry independently drawn from a normal distribution.
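A test signal of the kind described above could be generated as in the following sketch. The function name `random_block_support`, the argument names `K` (number of nonzero rows) and `T` (number of blocks), and the example sizes are all hypothetical. The text does not say whether blocks may touch, so this sketch additionally assumes that blocks are separated by at least one zero row.

```python
import numpy as np

def random_block_support(N, K, T, rng):
    """Sorted indices of K nonzero rows split into T separated blocks."""
    # Block sizes: a random composition of K into T positive parts.
    cuts = np.sort(rng.choice(np.arange(1, K), size=T - 1, replace=False))
    sizes = np.diff(np.concatenate(([0], cuts, [K])))
    # Gaps: T+1 parts summing to N-K, with interior gaps of at least 1.
    free = N - K - (T - 1)
    extra = rng.multinomial(free, np.ones(T + 1) / (T + 1))
    gaps = extra + np.concatenate(([0], np.ones(T - 1, dtype=int), [0]))
    support, pos = [], 0
    for gap, size in zip(gaps[:-1], sizes):
        pos += gap
        support.extend(range(pos, pos + size))
        pos += size
    return np.array(support)

rng = np.random.default_rng(1)
N, M, L, K, T = 100, 40, 3, 25, 4                  # hypothetical sizes
support = random_block_support(N, K, T, rng)
X = np.zeros((N, L))
X[support] = rng.standard_normal((K, L))            # common row support
Phi = rng.standard_normal((M, N))                   # random dictionary
Y = Phi @ X                                         # noiseless measurements
```

Every column of `X` uses the same `support`, which is what gives the MMV signal its common row sparsity pattern.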

We examine the recovery performance of our proposed algorithm, also referred to as the MMV pattern-coupled sparse Bayesian learning algorithm (MPCSBL), under different choices of $\beta$. As indicated earlier, $\beta$ is a parameter quantifying the dependencies among neighboring coefficients. Fig. 1 depicts the success rates vs. the ratio for different choices of $\beta$ in the noiseless case, where we set , , . Results are averaged over 1000 independent runs, with the measurement matrix and the sparse signal randomly generated for each run. The performance of the conventional sparse Bayesian learning method (denoted as "MSBL"[7]) is also included for comparison. When , our proposed algorithm achieves a significant performance improvement over MSBL by exploiting the underlying block-sparse structure, even without knowing the exact locations and sizes of the nonzero blocks. We also observe that our proposed algorithm is not very sensitive to the choice of $\beta$ as long as . The success rates of the proposed method and MSBL as a function of the sparsity level are plotted in Fig. 2, where , and . We see that our proposed algorithm performs better than the MSBL method. In the noisy case, we set , , and . Fig. 3 shows the normalized mean square errors (NMSE) vs. SNR with nonzero rows for different choices of $\beta$. We also see that our proposed method achieves better estimation accuracy than existing methods.
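For the noisy experiments, the error metric and the noise scaling might be computed as in the sketch below. The NMSE is taken here as $\|\hat{\mathbf{X}} - \mathbf{X}\|_F^2 / \|\mathbf{X}\|_F^2$, and the SNR as the ratio of average signal power in the clean measurements to the noise variance. Both are common conventions but are assumptions, since the text does not spell out either definition. The function names are hypothetical.

```python
import numpy as np

def nmse(X_hat, X):
    """Normalized mean square error ||X_hat - X||_F^2 / ||X||_F^2."""
    return np.sum((X_hat - X) ** 2) / np.sum(X ** 2)

def add_noise(Y_clean, snr_db, rng):
    """Add white Gaussian noise at the requested SNR (in dB).

    Assumes SNR = mean(Y_clean^2) / noise_variance.
    """
    noise_var = np.mean(Y_clean ** 2) / (10 ** (snr_db / 10))
    return Y_clean + np.sqrt(noise_var) * rng.standard_normal(Y_clean.shape)
```

Under this convention the noise precision used in the model is $\lambda = 1/\texttt{noise\_var}$.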

## 5 Conclusion

We proposed a new Bayesian method for recovering block-sparse MMV signals with a common row sparsity pattern. A pattern-coupled hierarchical Gaussian prior model was introduced to characterize both the sparseness of the coefficients and the statistical dependency between neighboring coefficients of the signal. Numerical results show that, by exploiting the underlying block structure, the proposed method outperforms MSBL in recovering block-sparse MMV signals with a common row sparsity pattern.

## References

• [1] M. Mishali and Y. C. Eldar, “Blind multi-band signal reconstruction: compressed sensing for analog signals,” IEEE Trans. Signal Processing, vol. 57, no. 3, pp. 993–1009, 2009.
• [2] Rémi Gribonval and Emmanuel Bacry, “Harmonic decomposition of audio signals with matching pursuit,” IEEE Trans. Signal Processing, vol. 51, no. 1, pp. 101–111, Jan. 2003.
• [3] Yonina C. Eldar, Patrick Kuppinger, and Helmut Bölcskei, “Block-sparse signals: uncertainty relations and efficient recovery,” IEEE Trans. Signal Processing, vol. 58, no. 6, pp. 3042–3054, June 2010.
• [4] Y. C. Eldar and M. Mishali, “Robust recovery of signals from a structured union of subspaces,” IEEE Trans. Inform. Theory, vol. 55, no. 11, pp. 5302–5316, 2009.
• [5] Yonina C. Eldar and Moshe Mishali, “Robust recovery of signals from a structured union of subspaces,” IEEE Trans. Information Theory, vol. 55, no. 11, pp. 5302–5316, Nov. 2009.
• [6] Y. C. Eldar and H. Rauhut, “Average case analysis of multichannel sparse recovery using convex relaxation,” IEEE Trans. Inform. Theory, vol. 56, no. 1, pp. 505–519, 2010.
• [7] D. P. Wipf and B. D. Rao, “An empirical Bayesian strategy for solving the simultaneous sparse approximation problem,” IEEE Trans. Signal Processing, vol. 55, no. 7, pp. 3704–3716, 2007.
• [8] Zhilin Zhang and Bhaskar D. Rao, “Sparse signal recovery with temporally correlated source vectors using sparse Bayesian learning,” IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 5, pp. 912–926, Sept. 2011.
• [9] Jun Fang, Yanning Shen, Hongbin Li, and Pu Wang, “Pattern-coupled sparse Bayesian learning for recovery of block-sparse signals,” IEEE Transactions on Signal Processing, vol. 63, no. 2, pp. 360–372, 2015.
• [10] D. Malioutov, M. Cetin, and A. S. Willsky, “A sparse signal reconstruction perspective for source localization with sensor arrays,” IEEE Trans. Signal Processing, vol. 53, no. 8, pp. 3010–3022, 2005.
• [11] Bo Xin, Yizhou Wang, Wen Gao, and David Wipf, “Exploring algorithmic limits of matrix rank minimization under affine constraints,” IEEE Transactions on Signal Processing, vol. 64, no. 19, pp. 4960–4974, 2016.