Generalized Distributed Compressive Sensing

Jeonghun Park, Seunggye Hwang, Janghoon Yang, and Dongku Kim* (*corresponding author). J.-H. Park, S.-K. Hwang, and D.-K. Kim are with the School of Electrical and Electronic Engineering, Yonsei University, Seoul, Korea. Emails: @yonsei.ac.kr, Tel: 82-2-2123-2877, Fax: 82-2-365-4504. J.-H. Yang is with the Department of Newmedia, Korean German Institute of Technology, Seoul, Korea. Email: jhyang@kgit.ac.kr, Tel: 82-2-6393-3237.
Abstract

Distributed Compressive Sensing (DCS) [1] improves the signal recovery performance of multi-signal ensembles by exploiting both intra- and inter-signal correlation and sparsity structure. However, the existing DCS was proposed for a very limited ensemble of signals that has only a single type of common information [1]. In this paper, we propose a generalized DCS (GDCS) which can improve sparse signal detection performance given arbitrary types of common information, classified not only as full common information but also as a variety of partial common information. The theoretical bound on the required number of measurements using the GDCS is obtained. Unfortunately, the GDCS may require substantial a priori knowledge of the various kinds of common information shared within the ensemble in order to outperform the existing DCS. To deal with this problem, we propose a novel algorithm that can search for the correlation structure among the signals, with which the proposed GDCS improves detection performance even without a priori knowledge of the correlation structure for arbitrarily correlated multi-signal ensembles.

Index Terms—Compressive sensing, distributed source coding, sparsity, random projection, sensor networks.

I Introduction

Generally, signals in various applications can be represented by sparse coefficients over a particular basis, meaning that a length-$N$ signal vector has only $K \ll N$ nonzero coefficients. Many compression algorithms exploit this sparse structure, including MP3 [2], JPEG [3], and JPEG2000 [4]. Compressive sensing (CS) is an emerging signal acquisition technique that has the advantage of reducing the number of measurements required to recover a sparse signal. If a target signal is represented as a sparse signal with a particular sparse basis, one can recover it with only $M < N$ measurements. It is known that the signal can be recovered with overwhelming probability if the sparsity $K$ (simply, the number of nonzero elements) of the signal satisfies $K \leq cM/\log(N/M)$ [5], where $c$ is a constant.

Baron et al. [1] introduced Distributed Compressive Sensing (DCS), which exploits not only intra-signal but also inter-signal joint sparsity to improve detection performance. They assume the scenario of a Wireless Sensor Network (WSN) consisting of an arbitrary number of sensors and one sink node. In this scenario, each sensor carries out the compression in a distributed way, without cooperation with the other sensors, and transmits the compressed signal to the sink node. At the sink node, the received signals from all the sensors are reconstructed jointly. A key concept of DCS is joint sparsity, defined as the sparsity of the entire signal ensemble. Three joint sparse signal models are considered in [1]. In the first model, not only is each signal individually sparse, but there are also common components shared by every signal, called common information, which allow a reduction of the required measurements through joint recovery. In the second model, all signals share the same support, i.e., the locations of the nonzero coefficients. In the third model, no signal is sparse by itself; nevertheless, the signals share a large amount of common information, which makes it possible to compress and recover them. While the second model, called the Multiple Measurement Vector (MMV) setting, has been actively explored in [6, 7, 8], to the best of the authors' knowledge, the first model has been studied only for a limited ensemble of signals that has a single piece of common information.

Despite this limitation, the first model is used in many applications, such as [9, 10], as well as in the WSN setting [1]. In [9], the authors extract a common component and an innovation component from various face images to facilitate analysis tasks such as face recognition. In [10], when implementing image fusion, which combines multiple images of the same scene into a single image suitable for human perception and practical applications, the constant background image is modeled as common information and the variable foreground image as innovation information for efficiency of the process.

However, it is unrealistic to assume that only full common information exists. In practice, in most situations, partial common information, which was first proposed in our earlier conference paper [11], as well as full common information, is measured by an arbitrary number of sensors. Using this notion, we introduce partial common information, leading to a generalized DCS (GDCS) model, and obtain the theoretical bound on the number of measurements for exact reconstruction. However, to take advantage of partial common information, the decoder must know the partial common structure of the signals, which is typically unknown to the decoder. To deal with this problem, we also propose a novel algorithm that can find the correlation structure among the sensors and thereby help the decoder exploit partial common information. This algorithm can provide significant performance improvement. In summary, the main contributions of this paper are as follows.

  1. We propose a GDCS model of which the model in [1] is a special case.

  2. The theoretical bound on the required number of measurements of the GDCS model is obtained.

  3. To remove the need for a priori knowledge of the correlation structure, which is a burden on the decoder, we propose a novel algorithm that iteratively detects the signals while searching for that structure.

The remainder of this paper is organized as follows. We briefly summarize the background of CS in Section II. In Section III, we explain the concept of the existing joint sparse signal model and define its generalized extension. Based on this model, we obtain the theoretical bound on the required number of measurements in Section IV, and in Section V we propose a novel algorithm to capitalize on the GDCS in a practical environment. Numerical simulations are provided in Section VI, followed by conclusions in Section VII.

II Compressive sensing background

When we deal with signals sensed in the real world, in many cases we can represent a real-valued signal $x \in \mathbb{R}^N$ as sparse coefficients over a particular basis $\Psi = [\psi_1, \ldots, \psi_N]$. We can write

$x = \sum_{i=1}^{N} \theta_i \psi_i$   (1)

where $\theta_i$ is the $i$th component of the sparse coefficient vector $\theta$ and $\psi_i$ is the $i$th column of the sparse basis. Without loss of generality, let us assume that $\|\theta\|_0 = K \ll N$. Here, $\|\theta\|_0$ is the number of nonzero elements in the vector $\theta$. In matrix multiplication form, this is represented as

$x = \Psi\theta.$   (2)

Including the widely used Fourier and wavelet bases, various expansions, e.g., Gabor bases [12] and bases obtained by Principal Component Analysis (PCA) [13], can be used as a sparse basis. For convenience, we use the identity matrix for the sparse basis $\Psi$; without loss of generality, an arbitrary sparse basis is easily incorporated into the developed framework.

Candes, Romberg and Tao [5] and Donoho [14] showed that a reduced set of linear projections can contain enough information to recover a sparse signal. This technique, introduced in [5, 14], has been named CS. In CS, compression is performed by simply projecting the signal onto a measurement matrix $\Phi \in \mathbb{R}^{M \times N}$, where $M < N$. We can describe the compression procedure as follows.

$y = \Phi x$   (3)
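As a concrete illustration of this acquisition step (a minimal sketch with assumed sizes, not taken from the paper), the following Python code generates a $K$-sparse signal and compresses it with an i.i.d. Gaussian measurement matrix:

import numpy as np

np.random.seed(0)
N, M, K = 256, 64, 8                      # assumed signal length, measurements, sparsity

theta = np.zeros(N)                       # K-sparse coefficient vector (sparse basis Psi = I)
support = np.random.choice(N, K, replace=False)
theta[support] = np.random.randn(K)
x = theta                                 # x = Psi @ theta with Psi = identity

Phi = np.random.randn(M, N) / np.sqrt(M)  # i.i.d. Gaussian measurement matrix
y = Phi @ x                               # compressed measurements as in (3)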

Since the number of equations $M$ is smaller than the number of unknowns $N$, this system is ill-posed. However, the sparsity of the signal allows perfect recovery if the restricted isometry property (RIP) of $\Phi$ [5], [15] is satisfied with an appropriate constant. Assuming that the signal can be represented as in (2), the sparsest coefficient vector can be found by solving the following $\ell_0$ minimization.

$\hat{\theta} = \arg\min_{\theta} \|\theta\|_0 \quad \text{subject to} \quad y = \Phi\Psi\theta$   (4)

If the original coefficient vector $\theta$ is sparse enough, there is no other solution as sparse as $\theta$ that satisfies $y = \Phi\Psi\theta$, which implies that we can recover the original signal in spite of the ill-posedness of the system.

However, although the $\ell_0$ minimization problem guarantees a significant reduction in the required number of measurements for recovery, we cannot use $\ell_0$ minimization in practice because of its huge complexity. To solve the $\ell_0$ minimization, we must search over all $\binom{N}{K}$ possible sparse subspaces, which makes $\ell_0$ minimization NP-hard [16].

Instead of solving the $\ell_0$ minimization, we can use the solution of the following $\ell_1$ minimization as the coefficient vector of the original signal, paying more measurements [5] as the cost of a tractable algorithm.

$\hat{\theta} = \arg\min_{\theta} \|\theta\|_1 \quad \text{subject to} \quad y = \Phi\Psi\theta$   (5)

This approach is called Basis Pursuit. Contrary to $\ell_0$ minimization, we can solve the $\ell_1$ minimization with bearable complexity, which is polynomial in $N$.
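One common way to solve the Basis Pursuit problem in practice is to recast it as a linear program by splitting $\theta$ into nonnegative parts ($\theta = u - v$ with $u, v \geq 0$). The sketch below is our own illustration, not the paper's implementation; it assumes $\Psi = I$ so that the constraint reads $\Phi\theta = y$:

import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Phi, y):
    """Solve min ||theta||_1 subject to Phi @ theta = y via linear programming."""
    M, N = Phi.shape
    c = np.ones(2 * N)                     # minimize sum(u) + sum(v) = ||theta||_1
    A_eq = np.hstack([Phi, -Phi])          # Phi @ (u - v) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    u, v = res.x[:N], res.x[N:]
    return u - v

# theta_hat = basis_pursuit(Phi, y)  # recovers the sparse vector from the earlier sketch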

Not only $\ell_1$-norm minimization but also iterative greedy algorithms can be used to find the original signal from the observed signal $y$. Orthogonal Matching Pursuit (OMP) [17] is the most typical among the iterative greedy algorithms. It iteratively chooses the column vector of the measurement matrix that accounts for the largest portion of the observed signal $y$. It is proven in [17] that the original signal can be recovered with appropriately high probability by OMP. Greedy algorithms have been developed into more sophisticated algorithms, e.g., CoSaMP [18] and Subspace Pursuit [19].
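A minimal OMP sketch following the greedy procedure described above (an illustrative implementation that stops after a known sparsity $K$; this stopping rule and the variable names are our assumptions):

import numpy as np

def omp(Phi, y, K):
    """Orthogonal Matching Pursuit: greedily select K columns of Phi to explain y."""
    N = Phi.shape[1]
    residual = y.copy()
    support = []
    for _ in range(K):
        idx = int(np.argmax(np.abs(Phi.T @ residual)))               # column most correlated with residual
        if idx not in support:
            support.append(idx)
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)   # least squares on current support
        residual = y - Phi[:, support] @ coef                        # update the residual
    theta_hat = np.zeros(N)
    theta_hat[support] = coef
    return theta_hat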

III Joint Sparse Signal Model

In [1], the joint sparse signal model is defined. Using the same notation as [1], let $\Lambda = \{1, 2, \ldots, J\}$ denote the index set of the signal ensemble. The ensemble consists of the signals $x_j \in \mathbb{R}^N$, $j \in \Lambda$, and we use $x_j(n)$ to denote the $n$th sample of the signal $x_j$. Each sensor $j$ is given a distinct measurement matrix $\Phi_j \in \mathbb{R}^{M_j \times N}$, which is an i.i.d. Gaussian matrix. The compressed signal can be written as $y_j = \Phi_j x_j$. Concatenating all the signals from $1$ to $J$, we can write this in the following form.

$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_J \end{bmatrix} = \begin{bmatrix} \Phi_1 & 0 & \cdots & 0 \\ 0 & \Phi_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \Phi_J \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_J \end{bmatrix}$   (6)

where $y = [\,y_1^T \; y_2^T \; \cdots \; y_J^T\,]^T$, $x = [\,x_1^T \; x_2^T \; \cdots \; x_J^T\,]^T$, and $\Phi = \mathrm{diag}(\Phi_1, \Phi_2, \ldots, \Phi_J)$. Finally we can write

$y = \Phi x.$   (7)
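For illustration, the concatenated system can be formed with a block-diagonal measurement matrix as in the sketch below (the sizes and the use of SciPy's block_diag are our own choices):

import numpy as np
from scipy.linalg import block_diag

np.random.seed(1)
J, N, M = 3, 128, 40                                     # assumed: sensors, signal length, measurements per sensor
x_list = [np.random.randn(N) for _ in range(J)]          # placeholder signals x_1, ..., x_J
Phi_list = [np.random.randn(M, N) for _ in range(J)]     # distinct i.i.d. Gaussian matrices

y_list = [Phi_list[j] @ x_list[j] for j in range(J)]     # per-sensor compression y_j = Phi_j x_j
Phi = block_diag(*Phi_list)                              # block-diagonal ensemble matrix as in (6)
x = np.concatenate(x_list)
y = np.concatenate(y_list)
assert np.allclose(y, Phi @ x)                           # y = Phi x as in (7)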

In [1], the signal $x_j$ can be decomposed into two parts. The first part is the common information $z_C$, which is measured by every sensor, and the second part is the innovation information $z_j$, which is measured uniquely by sensor $j$. The signal can accordingly be written as

$x_j = z_C + z_j, \quad j \in \Lambda.$   (8)

While (8) is composed of two kinds of components, we can refine the model by defining partial common information as follows: it is information measured by $L$ sensors, where $L$ is an arbitrary number satisfying $1 < L < J$. Then, the innovation information in (8) can be decomposed into partial common information and innovation information.

Fig. 1: Venn diagram description of the characterizing signal model for the DCS

For ease of explanation, we consider the simple sensor network in Fig. 1, where each sensor measures its own innovation information and the full common information of all three sensors. In addition, three different pieces of partial common information are measured in pairs by {sensor 1, sensor 2}, {sensor 2, sensor 3}, and {sensor 3, sensor 1}. For $\Gamma \subset \Lambda$ with $|\Gamma| \geq 2$, let $z_\Gamma$ denote the partial common information measured by the sensors $j \in \Gamma$. To avoid confusion with the notation of the existing signal model, we change the notation of the innovation information of sensor $j$ to $z_{\{j\}}$ in our model. With the defined notation, we can write the signals of Fig. 1 in the following form.

$x_1 = z_C + z_{\{1,2\}} + z_{\{3,1\}} + z_{\{1\}}$
$x_2 = z_C + z_{\{1,2\}} + z_{\{2,3\}} + z_{\{2\}}$
$x_3 = z_C + z_{\{2,3\}} + z_{\{3,1\}} + z_{\{3\}}$   (9)

We can readily extend (9) to the case of an arbitrarily large number of sensors.
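As an illustration of (9), the following sketch (with arbitrary assumed sparsities) synthesizes a three-sensor ensemble with full common, pairwise partial common, and innovation components:

import numpy as np

np.random.seed(2)
N = 128

def sparse_vec(N, K):
    """Length-N vector with K nonzero Gaussian entries at random locations."""
    z = np.zeros(N)
    z[np.random.choice(N, K, replace=False)] = np.random.randn(K)
    return z

z_C = sparse_vec(N, 4)                                    # full common information
z_12, z_23, z_31 = (sparse_vec(N, 3) for _ in range(3))   # pairwise partial common information
z_1, z_2, z_3 = (sparse_vec(N, 2) for _ in range(3))      # innovation information

x_1 = z_C + z_12 + z_31 + z_1                             # signals of Fig. 1 as in (9)
x_2 = z_C + z_12 + z_23 + z_2
x_3 = z_C + z_23 + z_31 + z_3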

Adopting the same notation as in DCS [1], which decouples the locations and values of a signal, we also write an arbitrary sparse signal $z \in \mathbb{R}^N$ as

$z = P\,\theta$   (10)

for $K = \|z\|_0$, where $\theta \in \mathbb{R}^K$, called a value vector, contains only the nonzero elements of $z$, and $P$, called a location matrix, is an $N \times K$ identity submatrix, which consists of $K$ column vectors chosen from the $N \times N$ identity matrix. With these, we can describe the signal model (9) as follows.

$x_1 = P_C\theta_C + P_{\{1,2\}}\theta_{\{1,2\}} + P_{\{3,1\}}\theta_{\{3,1\}} + P_{\{1\}}\theta_{\{1\}}$
$x_2 = P_C\theta_C + P_{\{1,2\}}\theta_{\{1,2\}} + P_{\{2,3\}}\theta_{\{2,3\}} + P_{\{2\}}\theta_{\{2\}}$
$x_3 = P_C\theta_C + P_{\{2,3\}}\theta_{\{2,3\}} + P_{\{3,1\}}\theta_{\{3,1\}} + P_{\{3\}}\theta_{\{3\}}$   (11)

where, for $\Gamma \subset \Lambda$ with $|\Gamma| \geq 2$ and $j \in \Lambda$, $K_C$, $K_\Gamma$, and $K_{\{j\}}$ denote the sparsity of $z_C$, $z_\Gamma$, and $z_{\{j\}}$, respectively. Likewise, $P_C$, $P_\Gamma$, and $P_{\{j\}}$ are the location matrices and $\theta_C$, $\theta_\Gamma$, and $\theta_{\{j\}}$ are the value vectors of $z_C$, $z_\Gamma$, and $z_{\{j\}}$, respectively. From here on, we use $z_\Gamma$ and $\theta_\Gamma$ universally for the information measured by an arbitrary sensor set $\Gamma \subseteq \Lambda$ (so that $z_\Lambda = z_C$ and $z_{\{j\}}$ is the innovation information of sensor $j$), without being restricted to a specific signal model.
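The location/value decomposition in (10) can be sketched as follows (an illustrative helper; the function name is our own):

import numpy as np

def location_decomposition(z):
    """Return (P, theta) with z = P @ theta, P an identity submatrix, theta the nonzero values."""
    N = len(z)
    support = np.flatnonzero(z)            # indices of the nonzero elements of z
    P = np.eye(N)[:, support]              # location matrix: selected identity columns
    theta = z[support]                     # value vector
    return P, theta

z = np.array([0.0, 1.3, 0.0, 0.0, -0.7, 0.0])
P, theta = location_decomposition(z)
assert np.allclose(z, P @ theta)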

IV Theoretical Bound on the Required Number of Measurements

In this section, we find the condition on the number of measurements needed to recover the original signal ensemble in a noiseless environment. First, we summarize the theoretical bound of the existing DCS [1], then obtain the theoretical bound for the simple three-sensor network described in Fig. 1 from the viewpoint of the proposed GDCS model, and finally extend it to the general case.

If a signal is $K$-sparse, the minimum number of measurements required to recover it individually scales with $K$ [5]. If we know the supports of the nonzero elements, it is obvious that $K$ measurements would be sufficient for perfect recovery. Therefore, thinking naively, the required number of measurements for recovering the ensemble is the sum of the individual sparsities, assuming known supports. However, in the DCS scenario, because the full common information is measured by every sensor, it is possible to recover the original signals with fewer measurements than this naive sum. The remaining problem is how to allocate the measurements to the sensors so that no information is missed. Obviously, when the common information and the innovation information of a sensor overlap at the same index, we cannot recover both from that sensor's measurements alone. However, the common information can be recovered with the help of other sensors whose innovation information does not overlap with it. In this spirit, the size of the overlaps for a subset of signals $\Gamma \subseteq \Lambda$ can be quantified.

Definition 1 ([1], Size of overlaps).

The overlap size for the set of signals $\Gamma \subseteq \Lambda$, denoted by $K_C(\Gamma)$, is the number of indices in which there is overlap between the common information support and the innovation information supports of all signals $j \notin \Gamma$:

$K_C(\Gamma) = \left|\{\, n \in \{1, \ldots, N\} : z_C(n) \neq 0 \ \text{and} \ z_j(n) \neq 0 \ \text{for all} \ j \notin \Gamma \,\}\right|$   (12)

We also define $K_C(\Lambda) = K_C$ and $K_C(\emptyset) = 0$.

Simply, $K_C(\Gamma)$ for $\Gamma \subsetneq \Lambda$ is a penalty term counting the indices of the common information that must be recovered with the help of the measurements in $\Gamma$, due to overlaps between the common and the innovation information at every $j \notin \Gamma$. With the above definition, the theoretically required number of measurements for recovering the original signal can be determined from the following theorem.
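Under this reading of Definition 1, the overlap size can be computed directly from the signal supports. The following sketch (our own illustration with assumed data structures, not code from [1]) counts the indices where the common information collides with the innovation information of every sensor outside $\Gamma$; sensor indices are 0-based here:

import numpy as np

def overlap_size_common(z_C, innovations, Gamma):
    """K_C(Gamma): indices where z_C and every innovation z_j, j not in Gamma, are all nonzero."""
    J = len(innovations)
    if len(Gamma) == 0:                    # convention: K_C(empty set) = 0
        return 0
    outside = [j for j in range(J) if j not in Gamma]
    if not outside:                        # Gamma = Lambda: K_C(Lambda) = ||z_C||_0
        return int(np.count_nonzero(z_C))
    mask = z_C != 0
    for j in outside:
        mask &= innovations[j] != 0
    return int(np.count_nonzero(mask))

# e.g., overlap_size_common(z_C, [z_1, z_2, z_3], {1, 2}) counts collisions with sensor 0's innovation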

Theorem 1 ([1], Achievable, known $P$).

Assume that a signal ensemble $x$ is obtained from a common/innovation information JSM (Joint Sparsity Model). Let $(M_1, \ldots, M_J)$ be a measurement tuple, and let $\Phi_j$ be random matrices having $M_j$ rows of i.i.d. Gaussian entries for each $j \in \Lambda$. Suppose there exists a full rank location matrix $P \in \mathcal{P}_F(x)$, where $\mathcal{P}_F(x)$ is the set of feasible location matrices, such that

$\sum_{j \in \Gamma} M_j \;\geq\; \sum_{j \in \Gamma} K_j + K_C(\Gamma)$   (13)

for all $\Gamma \subseteq \Lambda$. Then, with probability one over $\{\Phi_j\}$, there exists a unique solution $\hat{\theta}$ to the system of equations $y = \Phi P \hat{\theta}$; hence, the signal ensemble can be uniquely recovered as $\hat{x} = P\hat{\theta}$.

IV-A The three-sensor network using the proposed GDCS model

The theoretical bound for the proposed GDCS model can be computed in a similar way to Theorem 1. We find the required number of measurements for a subset $\Gamma \subseteq \Lambda$. A difference between the existing DCS model and the proposed GDCS model is that there are various types of overlaps among the signals in GDCS, since we also consider the partial common information between them.

Fig. 2: Three possible cases of a subset $\Gamma$ of sensors.

Before going into further detail about the bound, we define the notation for partial common information. We denote by $z_\Gamma$ the partial common information observed by a set of sensors $\Gamma$. For example, if partial common information is measured by a sensor set $\{j_1, \ldots, j_L\}$, where $L$ is an arbitrary number less than $J$, it is represented as $z_{\{j_1, \ldots, j_L\}}$, where $j_1, \ldots, j_L \in \Lambda$. For the three-sensor network considered in Fig. 2, all the partial common information can be written as $z_{\{1,2\}}$, $z_{\{2,3\}}$, and $z_{\{3,1\}}$. Now we define two groups of information for explaining the theoretical bound. We divide all existing information into two groups, where the existing information includes the full common, partial common, and innovation information.

Definition 2 (Exclusive information group).

If the set of all sensors measuring a given piece of information is a subset of $\Gamma$, where $\Gamma \subseteq \Lambda$, such information is categorized into the exclusive information group $G_{ex}(\Gamma)$. We can write this as follows.

$G_{ex}(\Gamma) = \{\, z_\Omega : \emptyset \neq \Omega \subseteq \Gamma \,\}$   (14)

We call the defined group an exclusive information group, since the information included in this group can be measured only by the sensors belonging to $\Gamma$. This concept can be clarified using Fig. 2, where each type of information is symbolized and three possible cases of a subset $\Gamma$ are shown. In Fig. 2-(a), where $\Gamma = \{1,2,3\}$, the full common information $z_C$, the partial common information $z_{\{1,2\}}$, $z_{\{2,3\}}$, $z_{\{3,1\}}$, and the innovation information $z_{\{1\}}$, $z_{\{2\}}$, $z_{\{3\}}$ are all included in $G_{ex}(\Gamma)$. In Fig. 2-(b), where $\Gamma = \{2,3\}$, the partial common information $z_{\{2,3\}}$ and the innovation information $z_{\{2\}}$ and $z_{\{3\}}$ are included in $G_{ex}(\Gamma)$. In Fig. 2-(c), where $\Gamma = \{3\}$, only the innovation information $z_{\{3\}}$ is included in $G_{ex}(\Gamma)$.

In contrast, we can define another group as follows.

Definition 3 (Shared information group).

If the set of all sensors measuring a given piece of information has a nonempty intersection with $\Gamma$, where $\Gamma \subseteq \Lambda$, but is not a subset of $\Gamma$, such information is categorized into the shared information group $G_{sh}(\Gamma)$. We can write this as follows.

$G_{sh}(\Gamma) = \{\, z_\Omega : \Omega \cap \Gamma \neq \emptyset \ \text{and} \ \Omega \not\subseteq \Gamma \,\}$   (15)

We call this group a shared information group, since the information included in this group can be measured by sensors both belonging to and not belonging to $\Gamma$. In Fig. 2-(a), no information is included in $G_{sh}(\Gamma)$. In Fig. 2-(b), the full common information $z_C$ and the partial common information $z_{\{1,2\}}$ and $z_{\{3,1\}}$ are included in $G_{sh}(\Gamma)$. In Fig. 2-(c), the full common information $z_C$ and the partial common information $z_{\{2,3\}}$ and $z_{\{3,1\}}$ are included in $G_{sh}(\Gamma)$. Lastly, we define the third group as follows.

Definition 4 (Unrelated information group).

If the set of all sensors measuring a given piece of information has an empty intersection with $\Gamma$, where $\Gamma \subseteq \Lambda$, such information is categorized into the unrelated information group $G_{un}(\Gamma)$. We can write this as follows.

$G_{un}(\Gamma) = \{\, z_\Omega : \Omega \cap \Gamma = \emptyset \,\}$   (16)

Since $G_{un}(\Gamma)$ is not used in obtaining the theoretical bound, the third group has no practical meaning. The three defined groups are disjoint.

For $G_{ex}(\Gamma)$, since the information included in it can be recovered only from the measurements of the sensors belonging to $\Gamma$, we must take at least as many measurements from the sensors in $\Gamma$ as the total sparsity of the information included in $G_{ex}(\Gamma)$. On the other hand, for $G_{sh}(\Gamma)$, if there is no overlap, we do not need measurements from the sensors in $\Gamma$, since the information included in $G_{sh}(\Gamma)$ can be recovered from the measurements of the sensors not belonging to $\Gamma$. However, if there is an overlap, that information cannot be recovered from the measurements of the sensors outside $\Gamma$ alone, so measurements from the sensors in $\Gamma$ are needed. Therefore, we need additional measurements from the sensors in $\Gamma$ to compensate for these overlaps.

Now, we explain the concept of overlap in more detail using Fig. 2. Assume that we want to find the number of measurements required in a subset $\Gamma$ for recovery. We first assume that $\Gamma = \Lambda = \{1,2,3\}$, as in Fig. 2-(a). In this case, $z_C$, $z_{\{1,2\}}$, $z_{\{2,3\}}$, $z_{\{3,1\}}$, $z_{\{1\}}$, $z_{\{2\}}$, and $z_{\{3\}}$ are all included in $G_{ex}(\Gamma)$, and $G_{sh}(\Gamma)$ is empty. Therefore, as mentioned above, the required number of measurements is as follows.

$M_1 + M_2 + M_3 \;\geq\; K_C + K_{\{1,2\}} + K_{\{2,3\}} + K_{\{3,1\}} + K_{\{1\}} + K_{\{2\}} + K_{\{3\}}$   (17)

The right-hand side of the inequality in (17) is the sum of the sparsities of the information included in $G_{ex}(\Gamma)$. Next, let us consider the case of Fig. 2-(b), where $\Gamma = \{2,3\}$. In this case, $z_{\{2,3\}}$, $z_{\{2\}}$, and $z_{\{3\}}$ are included in $G_{ex}(\Gamma)$, and $z_C$, $z_{\{1,2\}}$, and $z_{\{3,1\}}$ are included in $G_{sh}(\Gamma)$. If there is no overlap on the information included in $G_{sh}(\Gamma)$, we only need measurements from the sensors belonging to $\Gamma$ for $z_{\{2,3\}}$, $z_{\{2\}}$, and $z_{\{3\}}$, since the other information can be recovered from the measurements of the sensors not belonging to $\Gamma$. Therefore, the required measurements are as follows.

$M_2 + M_3 \;\geq\; K_{\{2,3\}} + K_{\{2\}} + K_{\{3\}}$   (18)

The right-hand side of the inequality in (18) is the sum of the sparsities of the information included in $G_{ex}(\Gamma)$. However, if there are overlaps on the information included in $G_{sh}(\Gamma)$, e.g., $z_C(n) \neq 0$ and $z_{\{1\}}(n) \neq 0$ for some arbitrary index $n$, more measurements than (18) are needed, since the overlapped information has to be recovered with the help of the measurements of the sensors belonging to $\Gamma$. The necessary number of measurements is as follows.

$M_2 + M_3 \;\geq\; K_{\{2,3\}} + K_{\{2\}} + K_{\{3\}} + \delta(\Gamma)$   (19)

where $\delta(\Gamma)$ denotes the additionally required number of measurements due to such overlaps. In the case of Fig. 2-(c), where $\Gamma = \{3\}$, similarly to (19), the necessary number of measurements is as follows.

$M_3 \;\geq\; K_{\{3\}} + \delta(\Gamma)$   (20)

where $\delta(\Gamma)$ again denotes the additionally required number of measurements. In summary, in order to recover the original signal perfectly, we need at least as many measurements from the sensors belonging to $\Gamma$ as the total sparsity of the information included in $G_{ex}(\Gamma)$ plus the size of the overlaps of the information included in $G_{sh}(\Gamma)$.
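As a purely illustrative instance of this counting rule (the sparsities below are assumed, not taken from the paper), take $\Gamma = \{2,3\}$ as in Fig. 2-(b) and suppose $K_{\{2,3\}} = 3$ and $K_{\{2\}} = K_{\{3\}} = 2$. If the full common information $z_C$ collides with the innovation information of sensor 1 at two indices, while $z_{\{1,2\}}$ and $z_{\{3,1\}}$ have no such collisions, then the rule above requires $M_2 + M_3 \geq 3 + 2 + 2 + 2 = 9$: the sparsities of the information exclusive to $\Gamma$ plus the two overlapped indices of $z_C$ that must be resolved using the measurements of sensors 2 and 3.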

To calculate the theoretical bound on the required number of measurements analytically, we need to define the sizes of these overlaps. Two types of overlaps are considered separately: the overlaps of the full common information and the overlaps of the partial common information. We define the size of each type of overlap as follows.

Definition 5 (Size of overlaps of full common information).

The overlap size of the full common information for the set of signals $\Gamma \subseteq \Lambda$, denoted by $K_C(\Gamma)$, is the number of indices in which there are overlaps between the full common information support and the other information supports of all signals $j \notin \Gamma$.

(21)

We also define $K_C(\Lambda) = K_C$ and $K_C(\emptyset) = 0$.

Next, we need to quantify the overlaps of the partial common information. Using the same principle as in Definition 5, we can define the size of overlaps of partial common information as follows.

Definition 6 (Size of overlaps of partial common information).

Assume that $\Gamma_p \cap \Gamma \neq \emptyset$ and $\Gamma_p \not\subseteq \Gamma$. For the set of signals $\Gamma \subseteq \Lambda$, the overlap size of the partial common information measured by the signals in $\Gamma_p$, i.e., $z_{\Gamma_p}$, denoted by $K_{\Gamma_p}(\Gamma)$, is the number of indices for which there is an overlap between this partial common information support and the other information supports at every signal $j \in \Gamma_p \setminus \Gamma$.

(22)

We also define $K_{\Gamma_p}(\Lambda) = K_{\Gamma_p}$ and $K_{\Gamma_p}(\emptyset) = 0$.

The reason that we only consider the overlaps of partial common information satisfying $\Gamma_p \cap \Gamma \neq \emptyset$ and $\Gamma_p \not\subseteq \Gamma$ is that we only need the sizes of overlaps for the partial common information included in $G_{sh}(\Gamma)$.

With these definitions, we can determine the theoretical bound on the number of measurements for the three-sensor example in the proposed GDCS model.

Theorem 2 (Achievable, known $P$).

Assume that a three-signal ensemble $x$ is obtained from a full common/partial common/innovation information JSM (Joint Sparsity Model), as described in Fig. 1. Let $(M_1, M_2, M_3)$ be a measurement tuple, and let $\Phi_j$ be random matrices having $M_j$ rows of i.i.d. Gaussian entries for each $j \in \Lambda$. Suppose there exists a full rank location matrix $P \in \mathcal{P}_F(x)$, where $\mathcal{P}_F(x)$ is the set of feasible location matrices, such that

$\sum_{j \in \Gamma} M_j \;\geq \sum_{z_\Omega \in G_{ex}(\Gamma)} K_\Omega \;+ \sum_{z_\Omega \in G_{sh}(\Gamma)} K_\Omega(\Gamma)$   (23)

for all $\Gamma \subseteq \Lambda$, where $K_\Omega$ denotes the sparsity of $z_\Omega$ and $K_\Omega(\Gamma)$ denotes its overlap size. Then, with probability one, there exists a unique solution $\hat{\theta}$ to the system of equations $y = \Phi P\hat{\theta}$; hence, the signal ensemble can be uniquely recovered as $\hat{x} = P\hat{\theta}$.

Assuming a subset $\Gamma$, arbitrary partial common information $z_{\Gamma_p}$ is included in $G_{ex}(\Gamma)$ if $\Gamma_p \subseteq \Gamma$, and it is included in $G_{sh}(\Gamma)$ if $\Gamma_p \cap \Gamma \neq \emptyset$ and $\Gamma_p \not\subseteq \Gamma$. Therefore, for the partial common information satisfying $\Gamma_p \subseteq \Gamma$, we count its sparsity toward the required number of measurements, and for the partial common information satisfying $\Gamma_p \cap \Gamma \neq \emptyset$ and $\Gamma_p \not\subseteq \Gamma$, we count its overlap size.

IV-B The general case

The case of a larger number of sensors is readily extended from the three-sensor network. With a larger number of sensors, many different kinds of partial common information can arise, depending on how the sensors share information. We now generalize the sizes of overlaps to derive the bound on the number of measurements.

Definition 7 (Size of overlaps of full common information, the general version).

The overlap size of the full common information, denoted by $K_C(\Gamma)$, is the number of indices in which there is overlap between the full common information support and the other information supports at all signals $j \notin \Gamma$.

(24)

where $\Gamma_p$ denotes a sensor set for partial common information and the union in (24) is taken over such sensor sets. We also define $K_C(\Lambda) = K_C$ and $K_C(\emptyset) = 0$.

(24) consists of a union of three sets. These sets represent overlaps between the full common information and the innovation information, overlaps between the full common information and the partial common information, and overlaps between the full common information and both the innovation information and the partial common information, respectively. Now, Definition 6 should be extended to the general case.

Definition 8 (Size of overlaps of partial common information, the general version).

The overlap size of the partial common information measured by a sensor set $\Gamma_p$, i.e., $z_{\Gamma_p}$, denoted by $K_{\Gamma_p}(\Gamma)$, is the number of indices for which there is overlap between this partial common information support and the other information supports at all signals $j \in \Gamma_p \setminus \Gamma$.

(25)

where the unions in (25) are taken over the other sensor sets for partial common information. We also define $K_{\Gamma_p}(\Lambda) = K_{\Gamma_p}$ and $K_{\Gamma_p}(\emptyset) = 0$.

As in (24), (25) consists of a union of three sets, each covering a case of overlap: the first set covers overlaps between the partial common information and the innovation information, the second covers overlaps between the partial common information and other partial common information, and the third covers overlaps between the partial common information and both the innovation information and other partial common information.

With these definitions, we can compute the theoretical bound on the required number of measurements for the proposed GDCS model for the general case.

Theorem 3 (Achievable, known $P$).

Assume that a signal ensemble $x$ is obtained from a full common/partial common/innovation information JSM (Joint Sparsity Model). Let $(M_1, \ldots, M_J)$ be a measurement tuple, and let $\Phi_j$ be random matrices having $M_j$ rows of i.i.d. Gaussian entries for each $j \in \Lambda$. Suppose there exists a full rank location matrix $P \in \mathcal{P}_F(x)$, where $\mathcal{P}_F(x)$ is the set of feasible location matrices, such that

$\sum_{j \in \Gamma} M_j \;\geq \sum_{z_\Omega \in G_{ex}(\Gamma)} K_\Omega \;+ \sum_{z_\Omega \in G_{sh}(\Gamma)} K_\Omega(\Gamma)$   (26)

for all $\Gamma \subseteq \Lambda$. Then, with probability one, there exists a unique solution $\hat{\theta}$ to the system of equations $y = \Phi P\hat{\theta}$; hence, the signal ensemble can be uniquely recovered as $\hat{x} = P\hat{\theta}$.

The proof is given in the Appendix. As in Theorem 2, for the information included in $G_{ex}(\Gamma)$, we count its sparsity toward the required number of measurements; on the other hand, for the information included in $G_{sh}(\Gamma)$, we count its overlap size.

We can observe that Theorem 1, the theoretical bound of the existing DCS model, is a special case of Theorem 3, i.e., the case in which partial common information is not considered. Therefore, Theorem 3 can be regarded as a more refined version of Theorem 1. When $P$ is unknown, it is known that $|\Gamma|$ additional measurements on the right side of (26) would be sufficient for recovery [1].

V Iterative Signal Detection With Sequential Correlation Search

In this section, we discuss a method that can benefit from partial common information without any a priori knowledge of the correlation structure, which is the main obstacle to exploiting partial common information in practical implementations. To compare the required a priori knowledge of the existing DCS and the proposed GDCS, the problem formulation of the existing DCS model is described as follows. The notation for the information follows the existing DCS style.

(27)
(28)
(29)
(30)
(31)
(32)

where $W_C$ and $W_j$, $j \in \Lambda$, are weight matrices. Thanks to the joint recovery, improved recovery performance can be obtained compared to separate recovery.

To gain some insight into the proposed algorithm, let us consider a case in which partial common information is measured by a set of sensors $\Gamma_p \subset \Lambda$. This case can be formulated as the following problem by using the proposed GDCS model.

(33)
(34)
(35)
(36)
(37)
(38)

where $W_{\Gamma_p}$ and $W_{\{j\}}$, $j \in \Lambda$, are weight matrices. As shown above, to exploit partial common information, we have to find the sensor set of the partial common information, in this case $\Gamma_p$. Unfortunately, it is not straightforward to find which sensors are correlated. Since each sensor compresses its signal without cooperation from the other sensors, nothing can be done to determine the correlation structure during the compression process. In the recovery process, although we could find the correlation structure by an exhaustive search, this demands on the order of $2^J$ searches, which is not practical. Therefore, we need a moderately complex algorithm that finds the correlation structure.
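To make the difference between the two formulations concrete, the following sketch (our own illustration; variable names, sizes, and the interface are assumptions, not the paper's notation) builds the stacked measurement matrix used for joint recovery. The first block column models the common component shared by the selected sensors, and the block-diagonal part models the per-sensor innovations; the resulting matrix can be passed to any weighted ℓ1 / basis pursuit solver, e.g., the one sketched in Section II.

import numpy as np
from scipy.linalg import block_diag

def joint_matrix(Phi_list, sharing_set):
    """Stack [common block column | block-diagonal innovations] for joint recovery.

    sharing_set: sensors (0-based) assumed to measure the common component;
    use all sensors for full common information (existing DCS) or a subset
    Gamma_p for partial common information (GDCS).
    """
    common_col = np.vstack([Phi if j in sharing_set else np.zeros_like(Phi)
                            for j, Phi in enumerate(Phi_list)])
    return np.hstack([common_col, block_diag(*Phi_list)])

# Example: three sensors with M = 40 measurements of length-128 signals
M, N = 40, 128
Phi_list = [np.random.randn(M, N) for _ in range(3)]
A_dcs = joint_matrix(Phi_list, {0, 1, 2})   # full common information (DCS-style joint recovery)
A_gdcs = joint_matrix(Phi_list, {0, 1})     # partial common information shared by sensors 0 and 1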

A novel algorithm is proposed for finding the correlation structure. The algorithm iteratively selects the least correlated sensor so that we can approximate the sensor set $\Gamma_p$ of the partial common information. For simplicity, we assume a joint sparse signal ensemble with partial common information $z_{\Gamma_p}$, where $\Gamma_p \subset \Lambda$ as in (34). However, since we have no knowledge of the correlation structure, we cannot formulate the measurement matrix as in (36). Instead, we use the solutions of the separate recovery and of the existing DCS framework, which considers only full common information. The solution obtained by using the existing DCS framework must include full common information, even though the given signal ensemble may not have full common information. By using this forcefully found full common information, we can obtain a clue about the correlation structure. When we compare the $\ell_1$ norms of the innovation information between the solution vector of the separate recovery and that of the existing DCS, the norm tends to increase (the $\ell_1$ norm of the innovation information of the solution vector obtained using the existing DCS becomes larger) if the corresponding sensor $j \notin \Gamma_p$, while it tends to decrease if the corresponding sensor $j \in \Gamma_p$.

Though this phenomenon may be difficult to understand at first glance, it is quite straightforward. Note that the forcefully found full common information may be related to the true partial common information. In fact, the forcefully found full common information is likely to be similar to the partial common information in order to minimize the $\ell_1$ norm of the solution vector (but not always; we explain this in the next paragraph). Then, if the sensor $j \in \Gamma_p$, which means it is one of the sensors that measure the partial common information, the joint recovery process successfully divides the energy of the signal into a joint recovery part (the first block column of the stacked measurement matrix in (30)) and a separate recovery part (the remaining block columns in (30)). However, if the sensor $j \notin \Gamma_p$, which means it is one of the sensors that do not measure the partial common information, the innovation information of that sensor must be adjusted to compensate for the forcefully found full common information, causing an increase in the $\ell_1$ norm of its innovation information.

However, this can be exploited only if the forcefully found full common information is similar to the partial common information. If only a small number of sensors measure the partial common information, i.e., $|\Gamma_p|$ is small, the forcefully found full common information bears little relationship to the partial common information. In this case, we cannot expect to find the sensor set based on the above observation. Therefore, in this paper, we assume that any partial common information is measured by a sufficiently large number of sensors. This assumption can be justified by the fact that the significant performance gain from joint recovery of partial common information is achieved only when a sufficient number of sensors measure it.

Exploiting the above intuition, an iterative signal detection algorithm with sequential correlation search is proposed under an underlying assumption of arbitrary inter-signal correlation (full common information or partial common information). The algorithm assumes the following: the number of sensors is $J$, each signal is of length $N$, and arbitrary inter-signal correlation (full common information or partial common information) exists. The received signal is denoted by $y_j \in \mathbb{R}^M$, where $M$ is the number of measurements assigned to each sensor node.

Algorithm 1.

0: Set the iteration counts to zero and initialize the matrices. Define a matrix such that it holds the measurement matrix of the corresponding sensor in the $j$th block position while setting the other blocks to zero. For example,

(39)

Initialize the set of sensor indices, the loop-control variables, and a temporary matrix. The variables will be used as parameters for escaping the loop, and the temporary matrix will hold the updated measurement matrix. The notation "$\leftarrow$" means that the left-hand-side parameter is replaced by the right-hand-side parameter.
1: Construct two measurement matrices as follows.

(40)

2: Obtain the two solution vectors by solving the following weighted $\ell_1$ minimization problems.

(41)
(42)

where the $W$'s are weight matrices. To avoid confusion, we denote the innovation information part of each solution in the following forms. The innovation information part is the part of the solution vector that is multiplied by the block-diagonal part of the corresponding measurement matrix in (40).

(43)

3: . (We treat as zero if .)
4: Find the sensor index by solving the following problem.

(44)

We denote the obtained index as .
5: .
6: