# Joint Source-Channel Vector Quantization for Compressed Sensing

Amirpasha Shirazinia, Student Member, IEEE, Saikat Chatterjee, Member, IEEE, Mikael Skoglund, Senior Member, IEEE
###### Abstract

We study joint source-channel coding (JSCC) of compressed sensing (CS) measurements using a vector quantizer (VQ). We develop a framework for realizing optimum JSCC schemes that enable encoding and transmitting CS measurements of a sparse source over discrete memoryless channels, and decoding the sparse source signal. For this purpose, the optimal design of the encoder-decoder pair of a VQ is considered, where the optimality is addressed by minimizing the end-to-end mean square error (MSE). We derive a theoretical lower-bound on the MSE performance, and propose a practical encoder-decoder design through an iterative algorithm. The resulting coding scheme is referred to as channel-optimized VQ for CS, coined COVQ-CS. In order to address the encoding complexity issue of the COVQ-CS, we propose to use a structured quantizer, namely low complexity multi-stage VQ (MSVQ). We derive new encoding and decoding conditions for the MSVQ, and then propose a practical encoder-decoder design algorithm referred to as channel-optimized MSVQ for CS, coined COMSVQ-CS. Through simulation studies, we compare the proposed schemes vis-a-vis relevant quantizers.

Index Terms: Vector quantization, multi-stage vector quantization, joint source-channel coding, noisy channel, compressed sensing, sparsity, mean square error.

## I Introduction

Compressed sensing (CS) [2] considers retrieving a high-dimensional sparse vector from a relatively small number of measurements. In many practical applications, the measurements collected at a CS sensor node need to be encoded using a finite number of bits and transmitted over noisy communication channels. To do so, efficient design of source and channel codes should be considered for reliable transmission of the CS measurements over noisy channels. The optimum performance theoretically attainable in a point-to-point memoryless channel can be achieved using separate design of source and channel codes, but this performance requires infinite source and channel code block lengths, resulting in delay as well as coding complexity. Considering a finite-length sparse source and CS measurement vector, it is theoretically guaranteed that joint source-channel coding (JSCC) can provide better performance than a separate design of source and channel codes. Therefore, to design a practical coding method, we focus on optimal JSCC principles for CS in the current work. Denoting the sparse source vector by $\mathbf{X}$ and its reconstruction at a decoder by $\hat{\mathbf{X}}$, our main objective is to develop a generic framework for optimum JSCC of CS measurements using vector quantization, or in other words, optimum joint source-channel vector quantization for CS, such that the MSE $\mathbb{E}[\|\mathbf{X}-\hat{\mathbf{X}}\|_2^2]$ is minimized.

### I-A Background

Recently, significant research interest has been devoted to design and analysis of source coding, e.g. quantization, for CS, and a wide range of problems has been formulated. Existing work on this topic is mainly divided into three categories.

1. The first category considers optimum quantizer design for quantization of CS measurements, where a CS reconstruction algorithm is held fixed at the decoder. Examples include [3] and [4], where CS reconstruction algorithms are LASSO and message passing, respectively. Based on analysis-by-synthesis principle, we have recently developed a quantizer design method in [5], where any CS reconstruction algorithm can be used.

2. The second category considers the design of a good CS reconstruction algorithm, where the quantizer is held fixed. CS reconstruction from noisy measurements – where the noise properties follow the effect of quantization – falls in the category. Examples are [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]. To elaborate, let us consider [9] where CS measurements are uniformly quantized and a convex optimization-based CS reconstruction algorithm, called basis pursuit dequantizing (BPDQ), is developed to suit the effect of uniform quantization. Further, the design of CS reconstruction algorithms and their performance bounds for reconstructing a sparse source from 1-bit quantized measurements have been investigated in [12, 13, 14, 15].

3. Another line of previous work focuses on trade-offs between the quantization resources (e.g., quantization rate) and CS resources (e.g., number of measurements or complexity of CS reconstruction) [16, 17, 8, 18]. For example, in [18], a trade-off between number of measurements and quantization rate was established by introducing the concept of two compression regimes as quantification of resources – quantization compression regime and CS compression regime.

We mention that all the above works are dedicated to pure source coding through quantization of CS measurements. To the best of our knowledge, there is limited work on JSCC of CS measurements using a vector quantizer (VQ). In this regard, we had our previous effort in [1]. The current paper builds upon the work of [1], and provides a comprehensive framework for developing optimum JSCC schemes to encode and transmit CS measurements (of a sparse source $\mathbf{X}$) over discrete memoryless channels, and to decode the sparse source so as to provide the reconstruction $\hat{\mathbf{X}}$. The optimality is addressed by minimizing the MSE performance measure $\mathbb{E}[\|\mathbf{X}-\hat{\mathbf{X}}\|_2^2]$.

### I-B Contributions

We first consider the optimal design of VQ encoder-decoder pair for CS in the sense of minimizing the MSE. Here, we stress that we use the VQ in its generic form. This is different from the design methods using uniform quantization [9] or 1-bit quantization of CS measurements [12, 13, 14, 15]. Our contributions include

• Establishing (necessary) optimal encoding and decoding conditions for VQ.

• Providing a theoretical bound on the MSE performance.

• Developing a practical VQ encoder-decoder design through an iterative algorithm.

• Addressing the encoding complexity issue of VQ using a structured quantizer, namely low complexity multistage VQ (MSVQ), where we derive new encoder-decoder conditions for sub-optimal design of the MSVQ.

Our practical encoder-decoder designs consider Channel-Optimized VQ for CS, coined COVQ-CS, and Channel-Optimized MSVQ for CS, coined COMSVQ-CS. To demonstrate the strength of the proposed designs, we compare them with relevant quantizer design methods through different simulation studies. Particularly, we show that in noisy channel scenarios, the proposed COVQ-CS and COMSVQ-CS schemes provide better and more robust (against channel noise) performances compared to existing quantizers for CS followed by separate channel coding.

### I-C Outline

The rest of the paper is organized as follows. In Section II, we introduce some preliminaries of CS. The optimal design and performance analysis of a joint source-channel VQ for CS are presented in Section III. In Section IV, we propose a practical VQ encoder-decoder design algorithm. Further, in Section V, we deal with the complexity issue by proposing the design of a computationally- and memory-efficient MSVQ for CS. The performance comparison of the proposed quantization schemes with other relevant methods is made in Section VI, and conclusions are drawn in Section VII.

### I-D Notations

Notations: Random variables (RV’s) will be denoted by upper-case letters while their realizations (instants) will be denoted by the respective lower-case letters. Random vectors of dimension $n$ will be represented by boldface characters. We will denote a sequence of RV’s $J_1, \ldots, J_N$ by $J_1^N$; further, $J_1^N = j_1^N$ implies that $J_1 = j_1, \ldots, J_N = j_N$. Matrices will be denoted by capital Greek letters, except that the square identity matrix of dimension $n$ is denoted by $\mathbf{I}_n$. The matrix operators determinant, trace, transpose and the maximum eigenvalue of a matrix are denoted by $\det(\cdot)$, $\mathrm{Tr}(\cdot)$, $(\cdot)^{\top}$, and $\lambda_{\max}(\cdot)$, respectively. Further, cardinality of a set is shown by $|\cdot|$. We will use $\mathbb{E}[\cdot]$ to denote the expectation operator. The $\ell_p$-norm ($p > 0$) of a vector $\mathbf{z}$ will be denoted by $\|\mathbf{z}\|_p$. Also, $\|\mathbf{z}\|_0$ represents the $\ell_0$-norm, which is the number of non-zero coefficients in $\mathbf{z}$.

## II Preliminaries of CS

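In CS, a random sparse vector $\mathbf{X} \in \mathbb{R}^N$ (where most coefficients are likely zero) is linearly measured by a known sensing matrix $\boldsymbol{\Phi} \in \mathbb{R}^{M \times N}$ ($M < N$), resulting in an under-determined set of linear measurements (possibly) perturbed by noise

$$\mathbf{Y} = \boldsymbol{\Phi}\mathbf{X} + \mathbf{W}, \tag{1}$$

where $\mathbf{Y} \in \mathbb{R}^M$ and $\mathbf{W} \in \mathbb{R}^M$ denote the measurement and the additive measurement noise vectors, respectively. We assume that $\mathbf{X}$ is a $K$-sparse vector, i.e., it has at most $K$ ($K \le N$) non-zero coefficients, where the location and magnitude of the non-zero components are drawn from known distributions. We also assume that the sparsity level $K$ is known in advance. We define the support set of the sparse vector $\mathbf{X} = [X_1, \ldots, X_N]^{\top}$ as $\mathcal{S} \triangleq \{n : X_n \neq 0\}$ with $|\mathcal{S}| \le K$. Next, we define the mutual coherence notion which characterizes the merit of a sensing matrix $\boldsymbol{\Phi}$. The mutual coherence is defined as [19]

$$\mu \triangleq \max_{i \neq j} \frac{|\boldsymbol{\Phi}_i^{\top}\boldsymbol{\Phi}_j|}{\|\boldsymbol{\Phi}_i\|_2 \|\boldsymbol{\Phi}_j\|_2}, \quad 1 \le i, j \le N, \tag{2}$$

where $\boldsymbol{\Phi}_i$ denotes the $i$-th column of $\boldsymbol{\Phi}$. The mutual coherence formalizes the dependence between the columns of $\boldsymbol{\Phi}$, and can be calculated in polynomial-time complexity.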
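As a small illustration of definition (2), the following sketch computes the mutual coherence of the $2 \times 3$ sensing matrix quoted in Example 2. The function name `mutual_coherence` and the list-of-rows matrix layout are illustrative choices, not part of the paper.

```python
import math


def mutual_coherence(phi):
    """Return max over i != j of |phi_i^T phi_j| / (||phi_i|| ||phi_j||), phi_i = column i."""
    m, n = len(phi), len(phi[0])
    cols = [[phi[r][c] for r in range(m)] for c in range(n)]
    norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
    mu = 0.0
    for i in range(n):
        for j in range(i + 1, n):  # the ratio is symmetric in (i, j)
            inner = sum(a * b for a, b in zip(cols[i], cols[j]))
            mu = max(mu, abs(inner) / (norms[i] * norms[j]))
    return mu


# The 2 x 3 sensing matrix quoted in Example 2 (rows listed first).
PHI = [[0.9924, 0.8961, 0.7201],
       [0.1230, 0.4439, 0.6939]]
mu = mutual_coherence(PHI)
print(round(mu, 4))
```

The double loop visits each column pair once, which is one way to see the polynomial-time claim: the cost is on the order of $N^2 M$ operations.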

In order to reconstruct an unknown sparse source from a noisy under-sampled measurement vector, several reconstruction methods have been developed based on convex optimization methods, iterative greedy search algorithms and Bayesian estimation approaches. In this paper, through the design and analysis procedures, we adopt the Bayesian framework [20, 21, 22, 23, 24] for reconstructing a sparse source from noisy and quantized measurements.

In the subsequent sections, we describe our proposed design methods for quantization by observing the CS measurement vector, and then develop theoretical results.

## III Joint Source-Channel VQ for CS

In this section, we first introduce a general joint source-channel VQ system model for CS measurements in Section III-A. We derive necessary conditions for optimality of encoder-decoder pair in Section III-B. Thereafter, we investigate the effects of optimal conditions in Section III-C, and proceed to analysis of performance in Section III-D.

### III-A General System Description and Performance Criterion

Consider the general system model, shown in Figure 1, for transmitting CS measurements and reconstructing a sparse source. Let the total bit budget allocated for encoding (quantization) be fixed at $R$ bits per dimension of the source vector. Given the noisy measurement vector $\mathbf{Y}$, a VQ encoder is defined by a mapping $\mathsf{E}: \mathbb{R}^M \to \mathcal{I}$, where $\mathcal{I}$ is a finite index set defined as $\mathcal{I} \triangleq \{0, 1, \ldots, 2^R - 1\}$ with $|\mathcal{I}| \triangleq \mathfrak{R} = 2^R$. Denoting the quantized index by $I$, the encoder works according to $\mathbf{Y} \in \mathcal{R}_i \Rightarrow I = i$, where the sets $\{\mathcal{R}_i\}_{i \in \mathcal{I}}$ are encoder regions and $\bigcup_{i \in \mathcal{I}} \mathcal{R}_i = \mathbb{R}^M$, such that when $\mathbf{Y} \in \mathcal{R}_i$ the encoder outputs the index $\mathsf{E}(\mathbf{Y}) = i \in \mathcal{I}$. Note that given an index $i$, the set $\mathcal{R}_i$ is not necessarily a connected set (due to non-linear CS reconstruction) in the space $\mathbb{R}^M$. Also, $\mathcal{R}_i$ might be an empty set (due to channel noise, see e.g. [25]).

Next, we consider a memoryless channel consisting of discrete input and output alphabets, which is referred to as a discrete memoryless channel (DMC). In our problem setup, the DMC accepts the encoded index $I$ and outputs a noisy symbol $J \in \mathcal{I}$. The channel is defined by a random mapping $\mathcal{I} \to \mathcal{I}$ characterized by transition probabilities

$$P(j|i) \triangleq \Pr(J = j \mid I = i), \quad i, j \in \mathcal{I}, \tag{3}$$

which indicates the probability that index $j$ is received given that the input index to the channel was $i$. We assume that the transmitted index $i$ and the received index $j$ share the same index set $\mathcal{I}$, and the channel transition probabilities (3) are known in advance. We denote the capacity of a given channel by $C$ bits/channel use. Given the received index $j$, a decoder is characterized by a mapping $\mathsf{D}: \mathcal{I} \to \mathcal{C}$, where $\mathcal{C}$ is a finite discrete codebook set containing all reproduction codevectors $\{\mathbf{c}_j \in \mathbb{R}^N\}_{j \in \mathcal{I}}$. The decoder’s functionality is described by a look-up table, $J = j \Rightarrow \hat{\mathbf{X}} = \mathbf{c}_j$, such that when the received index from the channel is $j$, the decoder outputs $\mathsf{D}(j) = \mathbf{c}_j \in \mathcal{C}$.
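To make the transition probabilities (3) concrete, the sketch below builds them for one common DMC, a binary symmetric channel (BSC) acting independently on the bits of the index. The natural binary labelling of indexes, the helper names `bsc_transition_matrix` and `bsc_capacity`, and the value `eps=0.05` are assumptions made for illustration; the capacity expression is the standard BSC formula, not one taken from this section.

```python
import math


def bsc_transition_matrix(num_bits, eps):
    """P[i][j] = Pr(J = j | I = i) when each index bit flips independently w.p. eps."""
    size = 1 << num_bits
    table = []
    for i in range(size):
        row = []
        for j in range(size):
            flips = bin(i ^ j).count("1")  # Hamming distance between the two labels
            row.append(eps ** flips * (1.0 - eps) ** (num_bits - flips))
        table.append(row)
    return table


def bsc_capacity(eps):
    """Standard BSC capacity 1 - H_b(eps), in bits per channel use."""
    if eps in (0.0, 1.0):
        return 1.0
    return 1.0 + eps * math.log2(eps) + (1.0 - eps) * math.log2(1.0 - eps)


P = bsc_transition_matrix(num_bits=3, eps=0.05)
row_sums = [sum(row) for row in P]  # each row is a probability distribution over j
```

Storing the table as `P[i][j]` keeps one row per transmitted index, which matches how the sums over $j$ appear in the encoder and decoder conditions later in this section.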

Next, we state how we quantify the performance of Figure 1 and our design goal. It is important to design an encoder-decoder pair in order to minimize a distortion measure which reflects the requirements of the receiving-end user. Therefore, we quantify the source reconstruction distortion of our studied system by the end-to-end MSE defined as

$$D \triangleq \mathbb{E}[\|\mathbf{X} - \hat{\mathbf{X}}\|_2^2], \tag{4}$$

where the expectation is taken with respect to the distributions on the sparse source $\mathbf{X}$ (which, itself, depends on the distribution of non-zero coefficients in $\mathbf{X}$ as well as their random placements (sparsity pattern)), the noise $\mathbf{W}$ and the randomness in the channel. We mention that the end-to-end MSE depends on CS reconstruction error, quantization error as well as channel noise. While the CS sensing matrix $\boldsymbol{\Phi}$ is given, our concern is to design an encoder-decoder pair robust against all these three kinds of error.

### III-B Optimality Conditions for VQ Encoder and Decoder

We consider an optimization technique for the system illustrated in Figure 1 in order to determine encoder and decoder mappings E and D, respectively, in the presence of channel noise. More precisely, the aim of the VQ design is to find

• MSE-minimizing encoder regions $\{\mathcal{R}_i\}_{i \in \mathcal{I}}$ and

• MSE-minimizing decoder codebook $\mathcal{C} \triangleq \{\mathbf{c}_j\}_{j \in \mathcal{I}}$.

We note that the optimal joint design of encoder and decoder cannot be implemented since the resulting optimization is analytically intractable. To address this issue, in Section III-B1, we show how the encoding index $i$ (or equivalently the encoder region $\mathcal{R}_i$) can be chosen to minimize the MSE for a given codebook $\mathcal{C}$. Then, in Section III-B2, we derive an expression for the optimal decoder codebook $\mathcal{C}$ for given encoder regions $\{\mathcal{R}_i\}_{i \in \mathcal{I}}$.

#### III-B1 Optimal Encoder

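First, let us introduce the minimum mean-square error (MMSE) estimator of the source $\mathbf{X}$ given the observed measurements (1), which is (see [26, Chapter 11])

$$\tilde{\mathbf{x}}(\mathbf{y}) \triangleq \mathbb{E}[\mathbf{X} \mid \mathbf{Y} = \mathbf{y}] \in \mathbb{R}^N. \tag{5}$$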
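For intuition about (5), the estimator can be written out by hand in one special case. The sketch below assumes a 1-sparse source whose support is uniform over the $N$ positions, a standard Gaussian non-zero coefficient, and white Gaussian measurement noise with variance `noise_var` greater than zero. It is a direct derivation for that case only, not the general closed form of [22], and the function name `mmse_one_sparse` is hypothetical.

```python
import math


def mmse_one_sparse(y, phi, noise_var):
    """E[X | Y = y] for a 1-sparse X: uniform support, N(0, 1) non-zero value, AWGN."""
    m, n = len(phi), len(phi[0])
    log_w, cond_mean = [], []
    for s in range(n):
        col = [phi[r][s] for r in range(m)]
        energy = sum(v * v for v in col) + noise_var  # ||phi_s||^2 + sigma^2
        corr = sum(a * b for a, b in zip(col, y))     # phi_s^T y
        # Log-likelihood of y given support s, up to a term common to all s.
        log_w.append(-0.5 * math.log(energy)
                     + corr * corr / (2.0 * noise_var * energy))
        cond_mean.append(corr / energy)               # E[X_s | y, support = s]
    top = max(log_w)                                  # stabilise the exponentials
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    # Posterior-weighted mixture of the per-support linear estimates.
    return [w[s] / total * cond_mean[s] for s in range(n)]


x_hat = mmse_one_sparse([2.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 0.01)
```

The output is a mixture over candidate supports, so it is non-linear in $\mathbf{y}$ even though each per-support estimate is linear. This non-linearity is what later allows encoder regions to be non-convex in the measurement space.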

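Now, assume that the decoder codebook $\mathcal{C}$ is known and fixed. We focus on how the encoding index $i$ should be chosen to minimize the MSE given the observed noisy CS measurement vector $\mathbf{y}$. We rewrite the MSE as

$$
\begin{aligned}
D &\triangleq \mathbb{E}[\|\mathbf{X} - \hat{\mathbf{X}}\|_2^2] = \mathbb{E}[\|\mathbf{X} - \mathbf{c}_J\|_2^2] \\
&\overset{(a)}{=} \int_{\mathbf{y}} \sum_{i \in \mathcal{I}} \Pr\{I = i \mid \mathbf{Y} = \mathbf{y}\}\, \mathbb{E}[\|\mathbf{X} - \mathbf{c}_J\|_2^2 \mid \mathbf{Y} = \mathbf{y}, I = i]\, f(\mathbf{y})\, d\mathbf{y} \\
&\overset{(b)}{=} \sum_{i \in \mathcal{I}} \int_{\mathbf{y} \in \mathcal{R}_i} \big\{\mathbb{E}[\|\mathbf{X} - \mathbf{c}_J\|_2^2 \mid \mathbf{Y} = \mathbf{y}, I = i]\big\} f(\mathbf{y})\, d\mathbf{y},
\end{aligned} \tag{6}
$$

where $(a)$ follows from marginalization of the MSE over $\mathbf{Y}$ and $I$. Further, $f(\mathbf{y})$ is the $M$-fold probability density function (pdf) of the measurement vector. Also, $(b)$ follows by interchanging the integral and the summation and the fact that $\Pr\{I = i \mid \mathbf{Y} = \mathbf{y}\} = 1$, $\forall \mathbf{y} \in \mathcal{R}_i$, and otherwise the probability is zero. Now, since $f(\mathbf{y})$ is always non-negative, the MSE-minimizing points in $\mathbb{R}^M$ that shall be assigned to the encoder region $\mathcal{R}_i$ are those that minimize the term within the braces in the last expression of (6). Then, the MSE-minimizing encoding index, denoted by $i^\star \in \mathcal{I}$, is given by

$$
\begin{aligned}
i^\star &= \arg\min_{i \in \mathcal{I}} \mathbb{E}[\|\mathbf{X} - \mathbf{c}_J\|_2^2 \mid \mathbf{Y} = \mathbf{y}, I = i] \\
&\overset{(a)}{=} \arg\min_{i \in \mathcal{I}} \big\{\mathbb{E}[\|\mathbf{c}_J\|_2^2 \mid \mathbf{Y} = \mathbf{y}, I = i] - 2\,\mathbb{E}[\mathbf{X}^{\top}\mathbf{c}_J \mid \mathbf{Y} = \mathbf{y}, I = i]\big\} \\
&\overset{(b)}{=} \arg\min_{i \in \mathcal{I}} \big\{\mathbb{E}[\|\mathbf{c}_J\|_2^2 \mid I = i] - 2\,\mathbb{E}[\mathbf{X}^{\top} \mid \mathbf{Y} = \mathbf{y}]\, \mathbb{E}[\mathbf{c}_J \mid I = i]\big\},
\end{aligned} \tag{7}
$$

where $(a)$ follows from the fact that $\mathbf{X}$ is independent of $I$, conditioned on $\mathbf{Y}$; hence, $\mathbb{E}[\|\mathbf{X}\|_2^2 \mid \mathbf{Y} = \mathbf{y}, I = i] = \mathbb{E}[\|\mathbf{X}\|_2^2 \mid \mathbf{Y} = \mathbf{y}]$, which is pulled out of the optimization. Also, $(b)$ follows from the fact that $\mathbf{c}_J$ is independent of $\mathbf{Y}$, conditioned on $I$, and from the Markov chain $\mathbf{X} \to \mathbf{Y} \to I \to J$. Next, note that introducing the channel transition probabilities in (3) and the MMSE estimator in (5), the last equality in (7) can be expressed as

$$i^\star = \arg\min_{i \in \mathcal{I}} \Big\{\sum_{j=0}^{\mathfrak{R}-1} P(j|i)\, \|\mathbf{c}_j\|_2^2 - 2\,\tilde{\mathbf{x}}(\mathbf{y})^{\top} \sum_{j=0}^{\mathfrak{R}-1} P(j|i)\, \mathbf{c}_j\Big\}. \tag{8}$$

Equivalently, with $\mathfrak{R} \triangleq |\mathcal{I}|$ denoting the number of indexes, the optimized encoding regions are obtained by

$$
\begin{aligned}
\mathcal{R}_i^\star = \Big\{\mathbf{y} \in \mathbb{R}^M : {} & \sum_{j=0}^{\mathfrak{R}-1} \big[P(j|i) - P(j|i')\big] \|\mathbf{c}_j\|_2^2 \le \\
& 2\,\tilde{\mathbf{x}}(\mathbf{y})^{\top} \sum_{j=0}^{\mathfrak{R}-1} \big[P(j|i) - P(j|i')\big] \mathbf{c}_j, \; i \neq i' \in \mathcal{I}\Big\}.
\end{aligned} \tag{9}
$$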
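The encoding rule (8) is a full search over the index set. A minimal sketch is given below, assuming the MMSE estimate $\tilde{\mathbf{x}}(\mathbf{y})$ has already been computed and that the channel is supplied as a table with `P[i][j]` equal to $P(j|i)$. The name `covq_encode` and the toy codebook are hypothetical.

```python
def covq_encode(x_tilde, codebook, P):
    """Return argmin_i of sum_j P(j|i)||c_j||^2 - 2 x_tilde^T sum_j P(j|i) c_j, as in (8)."""
    best_index, best_cost = 0, float("inf")
    for i, row in enumerate(P):
        # First term: channel-averaged codevector energy for transmitted index i.
        energy = sum(p * sum(v * v for v in c) for p, c in zip(row, codebook))
        # Second term: channel-averaged codevector for transmitted index i.
        mean_c = [sum(p * c[d] for p, c in zip(row, codebook))
                  for d in range(len(x_tilde))]
        cost = energy - 2.0 * sum(a * b for a, b in zip(x_tilde, mean_c))
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index


# Toy check: with a noiseless channel the rule reduces to nearest-neighbour search.
toy_codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
index_noiseless = covq_encode([0.9, 0.1], toy_codebook, identity)
index_noisy = covq_encode([0.3], [[-1.0], [1.0]], [[0.9, 0.1], [0.1, 0.9]])
```

Both channel-averaged terms depend only on the codebook and the channel, so in practice they can be computed once per index and stored, leaving one inner product and one subtraction per candidate index at encoding time.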

#### III-B2 Optimal Decoder

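Applying the MSE criterion, it is straightforward to show that the codevectors which minimize $D$ in (4) for a fixed encoder are obtained by letting $\mathbf{c}_j$ represent the MMSE estimator of the vector $\mathbf{X}$ based on the received index $j$ from the channel, that is

$$\mathbf{c}_j^\star = \mathbb{E}[\mathbf{X} \mid J = j], \quad j \in \mathcal{I}. \tag{10}$$

Now, using Bayes’ rule, the expression for $\mathbf{c}_j^\star$ can be rewritten as

$$
\begin{aligned}
\mathbf{c}_j^\star &= \mathbb{E}[\mathbf{X} \mid J = j] \\
&= \sum_{i} P(i|j)\, \mathbb{E}[\mathbf{X} \mid J = j, I = i] \\
&\overset{(a)}{=} \frac{\sum_{i} P(j|i) P(i) \int_{\mathbf{y}} \mathbb{E}[\mathbf{X} \mid \mathbf{Y} = \mathbf{y}]\, f(\mathbf{y}|i)\, d\mathbf{y}}{\sum_{i} P(j|i) P(i)} \\
&\overset{(b)}{=} \frac{\sum_{i} P(j|i) \int_{\mathcal{R}_i} \tilde{\mathbf{x}}(\mathbf{y})\, f(\mathbf{y})\, d\mathbf{y}}{\sum_{i} P(j|i) \int_{\mathcal{R}_i} f(\mathbf{y})\, d\mathbf{y}},
\end{aligned} \tag{11}
$$

where $(a)$ follows from marginalization over $\mathbf{Y}$ and the Markov chain $\mathbf{X} \to \mathbf{Y} \to I \to J$. Moreover, $f(\mathbf{y}|i)$ is the conditional pdf of $\mathbf{Y}$ given that $\mathbf{Y} \in \mathcal{R}_i$. Also, $(b)$ follows by using (5) and by the fact that $P(i) = \int_{\mathcal{R}_i} f(\mathbf{y})\, d\mathbf{y}$, $\forall i \in \mathcal{I}$.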

The optimal conditions in (8) and (11) can be used in an alternate-iterate procedure to design a practical encoder-decoder pair for vector quantization of CS measurements. The resulting algorithm will be presented later in Section IV.
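The integrals in (11) are typically replaced by sample averages over training data. The sketch below is one such Monte-Carlo form: it assumes a list of training MMSE estimates together with the indexes the encoder assigned to them, and it weights the per-region sums with the known transition probabilities rather than simulating the channel. The function name `covq_update_codebook` and the choice to keep the previous codevector for an index that is never received are assumptions.

```python
def covq_update_codebook(x_tildes, indices, P, old_codebook):
    """Sample-average form of (11): c_j = sum_i P(j|i) S_i / sum_i P(j|i) n_i."""
    size, dim = len(P), len(x_tildes[0])
    sums = [[0.0] * dim for _ in range(size)]  # S_i: sum of x_tilde over region i
    counts = [0] * size                        # n_i: number of samples in region i
    for x, i in zip(x_tildes, indices):
        counts[i] += 1
        for d in range(dim):
            sums[i][d] += x[d]
    codebook = []
    for j in range(size):
        weight = sum(P[i][j] * counts[i] for i in range(size))
        if weight == 0.0:
            # Index j is never received under this design: keep the old codevector.
            codebook.append(list(old_codebook[j]))
            continue
        codebook.append([sum(P[i][j] * sums[i][d] for i in range(size)) / weight
                         for d in range(dim)])
    return codebook


train = [[0.0], [2.0], [10.0]]
labels = [0, 0, 1]
noisy = [[0.9, 0.1], [0.1, 0.9]]
updated = covq_update_codebook(train, labels, noisy, [[0.0], [0.0]])
```

With a noisy channel each codevector becomes a weighted mixture of all region centroids, so codevectors are pulled toward one another compared with the noiseless design.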

### III-C Insights Through Analyzing the Optimal Conditions

Here, we provide insights into the necessary optimal conditions (8) and (10). Note that the encoding condition (8) implies that the sparse source is first MMSE-wise reconstructed from CS measurements at the encoder, and then quantized to an appropriate index. Hence, it suggests that the system shown in Figure 1 may be translated to the equivalent system shown in Figure 2.

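Let us first denote the MMSE estimator as the RV $\tilde{\mathbf{X}}(\mathbf{Y}) \triangleq \mathbb{E}[\mathbf{X} \mid \mathbf{Y}]$; then we rewrite the end-to-end distortion as

$$
\begin{aligned}
D &= \mathbb{E}[\|\mathbf{X} - \tilde{\mathbf{X}}(\mathbf{Y}) + \tilde{\mathbf{X}}(\mathbf{Y}) - \hat{\mathbf{X}}\|_2^2] \\
&= \mathbb{E}[\|\mathbf{X} - \tilde{\mathbf{X}}(\mathbf{Y})\|_2^2] + \mathbb{E}[\|\tilde{\mathbf{X}}(\mathbf{Y}) - \hat{\mathbf{X}}\|_2^2],
\end{aligned} \tag{12}
$$

where the second equality can be proved by showing that the estimation error of the source and the quantized transmission error are uncorrelated. This holds from the definition of $\tilde{\mathbf{X}}(\mathbf{Y})$ and the long Markov property $\mathbf{X} \to \mathbf{Y} \to I \to J \to \hat{\mathbf{X}}$, due to the assumption of deterministic mappings E and D and a memoryless channel.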
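The uncorrelatedness claim behind (12) can be spelled out in two lines. The sketch below uses only the fact that the decoder output is the codevector $\mathbf{c}_J$, a deterministic function of $J$, and that $\mathbb{E}[\mathbf{X} \mid \mathbf{Y}, J] = \mathbb{E}[\mathbf{X} \mid \mathbf{Y}]$ by the Markov chain; it is a worked step, not a reproduction of the authors' proof.

```latex
\begin{aligned}
\mathbb{E}\big[(\mathbf{X}-\tilde{\mathbf{X}}(\mathbf{Y}))^{\top}(\tilde{\mathbf{X}}(\mathbf{Y})-\hat{\mathbf{X}})\big]
&= \mathbb{E}\Big[\mathbb{E}\big[(\mathbf{X}-\tilde{\mathbf{X}}(\mathbf{Y}))^{\top} \,\big|\, \mathbf{Y}, J\big]\,
   \big(\tilde{\mathbf{X}}(\mathbf{Y})-\mathbf{c}_J\big)\Big] \\
&= \mathbb{E}\Big[\big(\mathbb{E}[\mathbf{X}\mid\mathbf{Y}]-\tilde{\mathbf{X}}(\mathbf{Y})\big)^{\top}
   \big(\tilde{\mathbf{X}}(\mathbf{Y})-\mathbf{c}_J\big)\Big] = 0 .
\end{aligned}
```

Since the cross term vanishes, expanding the squared norm in the first line of (12) leaves exactly the two terms in its second line.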

###### Remark 1.

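Following (12), let us denote by $D_{\mathrm{cs}} \triangleq \mathbb{E}[\|\mathbf{X} - \tilde{\mathbf{X}}(\mathbf{Y})\|_2^2]$ the CS reconstruction distortion, and by $D_{\mathrm{q}} \triangleq \mathbb{E}[\|\tilde{\mathbf{X}}(\mathbf{Y}) - \hat{\mathbf{X}}\|_2^2]$ the quantized transmission distortion. Then, the decomposition (12) indicates that the end-to-end source distortion $D$, without loss of optimality, is equivalent to $D = D_{\mathrm{cs}} + D_{\mathrm{q}}$.

Interestingly, it can also be seen from (12) that $D_{\mathrm{cs}}$ does not depend on quantization and channel aspects. Hence, to find optimal encoding indexes (given fixed codevectors) and optimal codevectors (given fixed encoding regions) with respect to the end-to-end distortion $D$, it suffices to find them with respect to minimizing $D_{\mathrm{q}}$. It can be proved that the necessary conditions for optimality (with respect to $D_{\mathrm{q}}$) of the encoder-decoder pair derived for the system of Figure 2 coincide with the ones developed for the system of Figure 1, i.e., (8) and (11). The proof of this claim is as follows. Similar to the steps taken in (6), the $D_{\mathrm{q}}$-minimizing encoding index is given by

$$
\begin{aligned}
i^\star &= \arg\min_{i \in \mathcal{I}} \mathbb{E}[\|\tilde{\mathbf{X}}(\mathbf{Y}) - \mathbf{c}_J\|_2^2 \mid \mathbf{Y} = \mathbf{y}, I = i] \\
&= \arg\min_{i \in \mathcal{I}} \big\{\mathbb{E}[\|\mathbf{c}_J\|_2^2 \mid I = i] - 2\,\tilde{\mathbf{x}}(\mathbf{y})^{\top}\, \mathbb{E}[\mathbf{c}_J \mid I = i]\big\} \\
&= \arg\min_{i \in \mathcal{I}} \Big\{\sum_{j=0}^{\mathfrak{R}-1} P(j|i)\, \|\mathbf{c}_j\|_2^2 - 2\,\tilde{\mathbf{x}}(\mathbf{y})^{\top} \sum_{j=0}^{\mathfrak{R}-1} P(j|i)\, \mathbf{c}_j\Big\}.
\end{aligned}
$$

Further, the $D_{\mathrm{q}}$-minimizing decoder is obtained by

$$
\begin{aligned}
\mathbf{c}_j^\star &= \mathbb{E}[\tilde{\mathbf{X}}(\mathbf{Y}) \mid J = j] \\
&\overset{(a)}{=} \int \mathbb{E}[\mathbf{X} \mid J = j, \mathbf{Y} = \mathbf{y}]\, p(\mathbf{y}|j)\, d\mathbf{y} \\
&= \mathbb{E}[\mathbf{X} \mid J = j],
\end{aligned}
$$

where $(a)$ follows from the Markov property $\mathbf{X} \to \mathbf{Y} \to J$. Now, we provide the following remark.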

###### Remark 2.

The systems of Figure 1 and Figure 2 are equivalent with respect to the end-to-end MSE criterion, for a fixed sensing matrix and fixed channel transition probabilities.

Before proceeding to the analysis of the MSE using the developed equivalence property, we provide a comparative study between our proposed design scheme and related methods in the literature which follow the building block structure shown in Figure 3. Under this system model, for a fixed CS reconstruction algorithm (or, a fixed quantizer encoder-decoder pair), a quantizer encoder-decoder pair (or, CS reconstruction algorithm) is designed in order to satisfy a certain performance criterion, e.g. minimizing end-to-end distortion, quantization distortion or the $\ell_1$-norm of the reconstruction vector. Some examples of system models following Figure 3 include [3, 9, 11, 5] (assuming a noiseless channel) and the conventional nearest-neighbor coding of CS measurements. In general, according to this system model, the quantizer decoder D outputs the vector $\hat{\mathbf{Y}}$ after receiving the channel output. Finally, a given CS reconstruction decoder takes $\hat{\mathbf{Y}}$ and makes an estimate of the sparse source.

Following Figure 2 (as the equivalent system model of Figure 1), we note that it is structurally different from the system model of Figure 3 in the location of the CS reconstruction, either at the transmitter side or at the receiver side. In the former system, an encoder reconstructs the source from CS measurements, whereas the latter system puts all CS reconstruction complexity at the decoder.

### III-D Analysis of MSE

In this section, we provide an analysis into the impact of CS reconstruction distortion, quantization error and channel noise on the end-to-end MSE by deriving a lower-bound.

###### Proposition 1.

Consider the linear CS model (1) with an exact $K$-sparse source under the following assumptions:

1. The magnitudes of the non-zero coefficients in $\mathbf{X}$ are drawn according to the i.i.d. standard Gaussian distribution.

2. The elements of the support set are uniformly drawn from all $\binom{N}{K}$ possibilities.

3. The measurement noise is drawn as uncorrelated with the measurements, where .

Further, assume a sensing matrix $\boldsymbol{\Phi}$ with mutual coherence $\mu$. Let the total quantization rate be $R$ bits/vector, and the channel be characterized by capacity $C$ bits/channel use; then the end-to-end MSE of the system of Figure 1 asymptotically (in quantization rate and dimension) is lower-bounded as

$$D \ge K c_1 + c_1 c_2\, 2^{-2C\left(\frac{R - \log_2 \binom{N}{K}}{K}\right)}, \tag{13}$$

where , and , in which denotes the Gamma function.

###### Proof.

The proof can be found in the Appendix. ∎

###### Remark 3.

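Each component of the lower-bound (13) is intuitive. The first term is the contribution of the CS reconstruction distortion, and the second term reflects the distortion due to the vector quantized transmission. When the CS measurements are noisy, it can be verified that as $R$ increases, the end-to-end MSE attains an error floor. This result can also be inferred from (12): as the quantization rate increases, the quantized transmission distortion $D_{\mathrm{q}} = \mathbb{E}[\|\tilde{\mathbf{X}}(\mathbf{Y}) - \hat{\mathbf{X}}\|_2^2]$ decays (asymptotically) exponentially; however, the CS reconstruction distortion $D_{\mathrm{cs}} = \mathbb{E}[\|\mathbf{X} - \tilde{\mathbf{X}}(\mathbf{Y})\|_2^2]$ is constant irrespective of rate. Hence, as $R \to \infty$, the value that the MSE converges to is $D_{\mathrm{cs}}$.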

It should be noted that when the CS measurements are noiseless ($\mathbf{W} = \mathbf{0}$), the lower-bound (13) becomes trivial. In this case, a simple asymptotic lower-bound for the system of Figure 1, under the assumptions of Proposition 1, can be obtained as

$$D \ge c_2\, 2^{-2C\left(\frac{R - \log_2 \binom{N}{K}}{K}\right)}, \tag{14}$$

where the constant $c_2$ is the same dimensionality-dependent constant as in (13).

The lower-bound (14) (also known as the adaptive bound in [16, 17] in the noiseless channel case) can be proved assuming that the support set of $\mathbf{X}$ is a priori known. Therefore, one can transmit the known support set using $\log_2 \binom{N}{K}$ bits, and the Gaussian coefficients within the support set can be quantized via $R - \log_2 \binom{N}{K}$ bits. Under the noiseless channel condition ($C = 1$), the right hand side in (14) is shown to achieve the distortion rate function of a $K$-sparse source vector with Gaussian non-zero coefficients and a support set uniformly drawn from $\binom{N}{K}$ possibilities [27]. Then, the separate source-channel coding theorem [28, Chapter 7] can be applied to find the optimum performance theoretically attainable (OPTA) by introducing the channel capacity $C$.

###### Remark 4.

The lower-bound in (14) shows that the end-to-end MSE can at most decay exponentially (in quantization rate $R$) with exponent $-6C/K$ dB/bit. Since the sparsity ratio $K/N \ll 1$, the decaying exponent can be far steeper than $-6C/N$ dB/bit for a Gaussian non-sparse source vector of dimension $N$.
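The dB/bit figure in Remark 4 follows from taking $10\log_{10}$ of the exponential factor in (14). The sketch below evaluates that slope and the rate-dependent factor itself; the constant $c_2$ is deliberately left out, and the function names and the example values $N = 16$, $K = 2$ are hypothetical.

```python
import math


def decay_slope_db_per_bit(capacity, dim):
    """Slope of 10*log10(2^(-2*C*R/dim)) with respect to R, in dB per bit."""
    return -20.0 * capacity * math.log10(2.0) / dim


def rate_factor(rate_bits, n, k, capacity):
    """The factor 2^(-2C (R - log2 binom(N, K)) / K) appearing in (14), without c_2."""
    support_bits = math.log2(math.comb(n, k))  # bits spent on the support set
    return 2.0 ** (-2.0 * capacity * (rate_bits - support_bits) / k)


slope_sparse = decay_slope_db_per_bit(capacity=1.0, dim=2)   # exponent uses K = 2
slope_dense = decay_slope_db_per_bit(capacity=1.0, dim=16)   # exponent uses N = 16
per_bit_ratio = rate_factor(11, 16, 2, 1.0) / rate_factor(10, 16, 2, 1.0)
```

Each extra bit multiplies the factor by $2^{-2C/K}$, which is about $-6.02\,C/K$ dB, so under these example values the sparse slope is eight times steeper than the dense one.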

The following toy example offers some insights into the tightness of the lower-bound (14).

###### Example 1.

Using a simple example, we show how tight the lower-bound (14) is with respect to our proposed design. In Figure 4, we compare simulation results with the lower-bound in a regime where the CS reconstruction distortion vanishes. (Footnote: this scenario can be realized in an event where the measurements are noiseless and the number of measurements is such that the CS reconstruction is perfect.) Following this best-case scenario, we generate realizations of $\mathbf{X}$ with sparsity level $K = 1$, where the non-zero coefficient is a standard Gaussian RV, and its location is drawn uniformly at random over the $N$ coordinates. Then, we use the necessary optimal conditions (8) and (10) iteratively (as will be shown later in Algorithm 1). Considering a binary symmetric channel (BSC) with a given bit cross-over probability and capacity $C$ bits/channel use (see (29)), we plot the MSE versus the quantization rate $R$ for a noiseless channel and for a noisy channel in Figure 4. It can be observed that for the noiseless channel, the bound (dashed line) is tight. As would be expected, degrading the channel condition reduces the performance. For the noisy channel, the gap between the simulation result (solid line marked by ‘o’) and its corresponding lower-bound (dotted line) increases. Note that in the noisy channel case, the lower-bound is based upon the asymptotic assumption of infinite source and channel code lengths (used in the OPTA). Therefore, the lower-bound is not tight in the noisy channel case for low dimensions.

## IV Practical Quantizer Design

In this section, we first develop a practical VQ encoder-decoder design algorithm, referred to as channel-optimized VQ for CS (COVQ-CS) using the necessary optimal conditions (8) and (11). Then, we provide a practical comparison between our proposed algorithm and a conventional quantizer design algorithm. We finalize this section by analyzing encoding and decoding computational complexity.

### IV-A Training Algorithm for Practical Design

The results presented in Section III-B1 and Section III-B2 can be utilized to formulate an iterate-alternate training algorithm for the problem of interest. Similar to the generalized Lloyd algorithm for noisy channels [29], we propose a VQ training method for the design problem in this paper which is summarized in Algorithm 1. The following remarks can be considered for implementing Algorithm 1:

• In step (1), besides the channel transition probabilities $P(j|i)$, we assume that the statistics of the sparse source vector are given for training.

• In general, it is not easy to derive closed-form solutions for the optimal decoding condition (11), for example, due to difficulties in calculating the integrals even if the pdf is known. In practice, we calculate the codevector $\mathbf{c}_j$ ($j \in \mathcal{I}$) in (10) using the Monte-Carlo method. To implement this computationally-efficient procedure, we first generate a finite set of training vectors, and then sample-average over those vectors that have led to the index $j$.

• To address the issue of encountering empty regions, we, in each iteration of the algorithm, pick the codevector whose index has been sent the largest number of times. Then, a codevector associated with an index that has not been sent is calculated as a copy of that codevector shifted by a sufficiently small perturbation. Using this technique (which is also known as the splitting method in the initialization phase of the LBG algorithm [30]), we efficiently re-include those encoding indexes that have never been selected due to the limited number of generated samples. This will lead to a design that efficiently uses all degrees of freedom.

• The performance of the COVQ-CS is sensitive to initialization, which affects whether the algorithm converges to a smaller value of the distortion $D$. Therefore, in step (3), when the channel is noiseless, the codevectors are initialized using the splitting procedure of the so-called LBG design algorithm. Then, the final optimized codevectors are chosen for initialization of Algorithm 1 in the noisy channel case. Furthermore, convergence in step (7) may be checked by tracking the MSE and terminating the iterations when the relative improvement is small enough. By construction, and ignoring issues such as numerical precision, the iterative design in Algorithm 1 always converges to a local optimum, since when the criteria in steps (5) and (6) of the algorithm are invoked, the performance can only remain unchanged or improve, given the updated indexes and codevectors. This is a common rationale behind the proof of convergence for such iterative algorithms (see e.g. [31, Lemma 11.3.1]). However, nothing can be generally guaranteed about the global optimality of this algorithm.
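The alternation described in these remarks can be sketched as a short training loop. This is not a transcription of Algorithm 1: it assumes the training set is given directly as MMSE estimates $\tilde{\mathbf{x}}(\mathbf{y})$ (which suffices by the equivalence in Remark 2), it takes the initial codebook as an input instead of running LBG splitting, and it simply keeps the old codevector for an index that is never received rather than applying the perturbation technique. The name `train_covq_cs` and its parameters are hypothetical.

```python
def train_covq_cs(x_tildes, P, init_codebook, max_iters=100, rel_tol=1e-6):
    """Alternate the encoder rule (8) and the decoder rule (11) on training data.

    x_tildes      : list of MMSE estimates x_tilde(y), one per training sample
    P             : P[i][j] = Pr(J = j | I = i), the channel transition table
    init_codebook : initial codevectors c_j
    """
    size, dim, n = len(P), len(x_tildes[0]), len(x_tildes)
    codebook = [list(c) for c in init_codebook]
    prev, dist = float("inf"), float("inf")
    for _ in range(max_iters):
        # Encoder step (8): channel-averaged energy and mean codevector per index.
        energy = [sum(P[i][j] * sum(v * v for v in codebook[j]) for j in range(size))
                  for i in range(size)]
        mean_c = [[sum(P[i][j] * codebook[j][d] for j in range(size))
                   for d in range(dim)] for i in range(size)]
        assign = []
        for x in x_tildes:
            costs = [energy[i] - 2.0 * sum(a * b for a, b in zip(x, mean_c[i]))
                     for i in range(size)]
            assign.append(costs.index(min(costs)))
        # Decoder step (11): channel-weighted sample averages of x_tilde.
        counts = [assign.count(i) for i in range(size)]
        sums = [[sum(x[d] for x, a in zip(x_tildes, assign) if a == i)
                 for d in range(dim)] for i in range(size)]
        for j in range(size):
            weight = sum(P[i][j] * counts[i] for i in range(size))
            if weight > 0.0:  # otherwise keep the old codevector (assumption)
                codebook[j] = [sum(P[i][j] * sums[i][d] for i in range(size)) / weight
                               for d in range(dim)]
        # Empirical quantized-transmission distortion of the current design.
        dist = sum(P[a][j] * sum((x[d] - codebook[j][d]) ** 2 for d in range(dim))
                   for x, a in zip(x_tildes, assign) for j in range(size)) / n
        if prev - dist <= rel_tol * max(dist, 1e-12):
            break  # relative improvement is small enough
        prev = dist
    return codebook, dist


samples = [[-1.1], [-0.9], [0.9], [1.1]]
noisy = [[0.9, 0.1], [0.1, 0.9]]
codebook, dist = train_covq_cs(samples, noisy, [[-0.5], [0.5]])
```

In this toy run the two codevectors settle closer to the origin than the cluster means at $\pm 1$, which illustrates how a channel-optimized design trades some quantization accuracy for a smaller penalty when an index is received in error.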

### IV-B Practical Comparison

Here, we offer further insights into quantization aspects through the design of conventional nearest-neighbor coding (NNC) as a representative of Figure 3, and the design of proposed COVQ-CS method as a representative of Figure 2. The NNC for CS is often considered as a benchmark for performance evaluations.

The NNC for CS measurements is accomplished by designing a channel-optimized VQ for the input vector $\mathbf{Y}$ aiming to minimize the quantization distortion, i.e., $\mathbb{E}[\|\mathbf{Y} - \hat{\mathbf{Y}}\|_2^2]$, where $\hat{\mathbf{Y}}$ is the quantizer decoder output as shown in Figure 3 (see e.g. [29] for more details regarding the design of channel-optimized VQ in a non-CS system model). Considering the notations given for Figure 3, the design procedure of the quantizer encoder and the quantizer decoder is as follows: for a quantization rate of $R$ bits/vector, a fixed codebook $\{\mathbf{g}_v\}_{v \in \mathcal{U}}$, with $\mathbf{g}_v \in \mathbb{R}^M$, and channel transition probability $P(v|u)$, the optimized encoding region becomes

$$
\begin{aligned}
\mathcal{R}_u^\star = \Big\{\mathbf{y} \in \mathbb{R}^M : {} & \sum_{v=0}^{\mathfrak{R}-1} \big[P(v|u) - P(v|u')\big] \|\mathbf{g}_v\|_2^2 \le \\
& 2\,\mathbf{y}^{\top} \sum_{v=0}^{\mathfrak{R}-1} \big[P(v|u) - P(v|u')\big] \mathbf{g}_v, \; u \neq u' \in \mathcal{U}\Big\},
\end{aligned} \tag{15}
$$

where $\mathcal{U}$ is the encoding index set and $\mathfrak{R} \triangleq |\mathcal{U}|$. Now, for the given region (15) and channel transition probability $P(v|u)$, the quantization MSE-minimizing codevectors satisfy

$$\mathbf{g}_v^\star = \mathbb{E}[\mathbf{Y} \mid V = v], \quad v \in \mathcal{U}. \tag{16}$$

In order to design an encoder-decoder pair using the NNC, an iterative algorithm can be used to alternate between (15) and (16). Finally, a CS reconstruction algorithm R produces the reconstruction vector $\hat{\mathbf{X}}$ from the quantizer decoder output $\hat{\mathbf{Y}}$. We refer to this design method as NNC-CS.

###### Example 2.

In this example, we illustrate how the COVQ-CS and NNC-CS design methods are different in shaping encoding regions (given that CS measurements are observed) and positioning codevectors (given that channel output index observed). For illustration purpose, we choose the input sparse vector dimension, measurement vector dimension and sparsity level as , and , respectively. The location of non-zero coefficient is drawn uniformly at random from , and its value is a standard Gaussian RV. For implementing the COVQ-CS via Algorithm 1, the MMSE estimator (used in (8)) is calculated via the closed-form solution given in [22, eq. (27)]. We generate realizations for (and subsequently ), where measurement noise vector is drawn from with . Then, we fix the quantization rate at bits/vector and assume a BSC with cross-over probability . For implementing the NNC-CS, an iterative algorithm is used by alternating between encoding regions (15) and codevectors (16). Finally, a CS reconstruction algorithm (here, we choose the same MMSE estimator used at the encoder of COVQ-CS) takes the NNC-CS codevectors and produces an estimate of the sparse source. In both NNC-CS and COVQ-CS schemes, the sensing matrix is chosen as

 Φ=(0.99240.89610.72010.12300.44390.6939).

In Figure 5, we qualitatively illustrate encoding regions and codevectors using the two designs. Figure 5(a) shows the samples of CS measurements classified by encoding regions of COVQ-CS in , i.e., (8), and Figure 5(b) shows the samples of classified by the index of encoding regions (in the same color) together with the codevectors of COVQ-CS in , i.e., in (11). Figure 5(c) illustrates the encoding regions of NNC-CS, i.e., (15), together with codevectors shown by black circles, and Figure 5(d) shows the samples of the sparse source along with the codevectors of NNC-CS mapped to the 3-dimensional space using the CS reconstruction algorithm, i.e., . From the samples in the measurement space, we observe that the entries of the CS measurements are highly correlated, in this particular example, due to a large mutual coherence of the sensing matrix (). Hence, as shown in Figure 5(c), the codevectors designed by the NNC-CS (almost) lie on a single line. Although, in this case, the location of codevectors are optimized to minimize the quantization distortion, , it is critical when the codevectors are mapped back to the source domain. From Figure 5(d), it is observed that the reconstructed codevectors, , are not only situated (approximately) on one axis but also far (in Euclidean distance) to their corresponding source samples (shown in same color) resulting in a high end-to-end distortion. Further, if, for example, the codevector is received as due to channel noise, it produces a large end-to-end distortion. Using other experiments, in the case of noiseless channel, we observed the same trend in the location of reconstructed codevectors (using NNC-CS) on the source domain which also produces large MSE in terms of the average distance between source samples and their corresponding reconstructed codevectors. 
While this is the case in NNC-CS, it can be seen from Figure 5(a) that the encoding regions using COVQ-CS may not form convex sets (for example, region 3), unlike the ones using the NNC-CS. This is because the region fixed by the rule (9) may not be a convex set in due to non-linearity in . As a result, the COVQ-CS uses the measurement space more efficiently in order to reduce the end-to-end distortion, . It can be observed from Figure 5(b) that the COVQ-CS codevectors are located on different coordinates in the 3-dimensional source space to minimize the end-to-end source distortion. In addition, the codevectors are located such that the COVQ-CS design becomes more robust against channel noise, which yields a smaller end-to-end distortion than the NNC-CS design. For example, as shown in Figure 5(b), if the codevector is chosen as at the decoder due to channel noise, it provides much less end-to-end distortion than that of the NNC-CS. A numerical performance comparison between these two schemes will be made later in Section VI-B through different simulation studies.
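The remark about the large mutual coherence of the sensing matrix can be checked numerically. The following is a minimal sketch, assuming the usual definition of mutual coherence as the largest absolute normalized inner product between two distinct columns; the function name `mutual_coherence` is illustrative and not part of the paper.

```python
import math

# Sensing matrix of the example (2 x 3), stored row by row.
PHI = [
    [0.9924, 0.8961, 0.7201],
    [0.1230, 0.4439, 0.6939],
]

def mutual_coherence(phi):
    """Largest absolute normalized inner product between two distinct columns."""
    cols = list(zip(*phi))  # transpose: one tuple per column
    norms = [math.sqrt(sum(v * v for v in c)) for c in cols]
    best = 0.0
    for a in range(len(cols)):
        for b in range(a + 1, len(cols)):
            inner = sum(u * v for u, v in zip(cols[a], cols[b]))
            best = max(best, abs(inner) / (norms[a] * norms[b]))
    return best

mu = mutual_coherence(PHI)
print(f"mutual coherence = {mu:.3f}")
```

For this matrix the value is about 0.95, attained by the second and third columns, which is consistent with the nearly collinear measurement samples described above.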

### IV-C Complexity of COVQ-CS

We analyze the encoding computational complexity (time usage) as well as the encoder-decoder memory complexity (space usage) of the COVQ-CS. For the encoding computational complexity, we calculate the number of operations, in FLOPs (each addition, multiplication and comparison counts as one floating-point operation, FLOP), required for transmitting an encoded index over the channel based on (8). For the memory complexity, we calculate the memory, in floats (a float is taken to be one single-precision floating-point unit), required for storing vector parameters at the encoder and decoder.

The encoding complexity for computing the argument in (8) requires one FLOP for calculating the subtraction as well as FLOPs ( multiplications and additions) for calculating the inner product in the second term. Thus, the total complexity for the full-search minimization at the encoder is FLOPs. Note that we do not consider the complexity of the CS reconstruction algorithm since its calculation is required for all relevant quantizers for CS. Next, considering the argument in (8), the encoder needs one float to store the first constant term in (8), i.e., , and also floats to store the second term in (8), i.e., the codevector . Thus, the total encoding memory for full-search minimization is . It also follows that the decoder requires floats to store in (10).
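The full-search encoder described above can be sketched as follows. Equation (8) is not reproduced in this section, so the sketch assumes the form implied by Remark 5, namely (24) without its third term: the index minimizing $\sum_j P(j|i)\|\mathbf{c}_j\|_2^2 - 2\tilde{\mathbf{x}}(\mathbf{y})^\top \sum_j P(j|i)\mathbf{c}_j$. The names `precompute_tables` and `encode` are hypothetical, and `P[i][j]` stands for $P(j|i)$.

```python
def precompute_tables(P, codebook):
    """Offline step: per index i, one float a_i = sum_j P(j|i) ||c_j||^2
    and one length-N vector b_i = sum_j P(j|i) c_j."""
    n = len(codebook[0])
    energies = [sum(v * v for v in c) for c in codebook]
    a = [sum(p * e for p, e in zip(row, energies)) for row in P]
    b = [[sum(p * c[k] for p, c in zip(row, codebook)) for k in range(n)]
         for row in P]
    return a, b

def encode(x_tilde, a, b):
    """Full search over all indices: argmin_i a_i - 2 <x_tilde, b_i>."""
    costs = [a_i - 2.0 * sum(x * y for x, y in zip(x_tilde, b_i))
             for a_i, b_i in zip(a, b)]
    return min(range(len(costs)), key=costs.__getitem__)

# Toy usage with a noiseless channel (identity transition matrix).
codebook = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [-1.0, 0.0, 0.0]]
P = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
a, b = precompute_tables(P, codebook)
best = encode([0.1, 0.2, 0.9], a, b)  # x_tilde plays the role of the MMSE estimate
print(best)
```

Storing one scalar and one $N$-dimensional vector per index mirrors the memory accounting above; with a noiseless channel the rule reduces to a nearest-neighbor search over the codevectors.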

With high-dimensional VQ and CS, the implementation of the quantizer encoder and decoder may not be feasible, in terms of both computational complexity and memory complexity. The complexity can be reduced by exploiting sub-optimal approaches (with respect to (4)) such as multi-stage VQ (MSVQ), which splits a single VQ into multiple VQs at different stages. In the next section, we focus on the design of JSCC strategies for CS measurements using MSVQ.

## V Joint Source-Channel MSVQ for CS

Multi-stage VQ (MSVQ) was developed to retain the advantages of VQ while effectively addressing its encoding complexity.

### V-A System Description and Performance Criterion

In this section, we describe the basic assumptions and models of the investigated system depicted in Figure 6. We illustrate an $L$-stage VQ, where $L$ is the maximum number of stages. Our MSVQ system model basically follows that of [32].

More specifically, we consider the () stage with allocated bits/vector, where , and is the total available quantization rate. Indeed, adjusts a trade-off between complexity and performance of MSVQ. A quantizer encoder, at stage , accepts the measurement vector and the encoded index from the stage as inputs, then maps them into an integer index with . Therefore, the –stage encoder is described by a mapping such that

$$\mathsf{E}_l(\mathbf{Y}, I_{l-1}) = i_l, \quad \text{if } \left(\mathbf{Y} \in \mathcal{R}_{i_1}^{i_l},\ I_{l-1} = i_{l-1}\right), \tag{17}$$

where is called the –stage encoding region. The region might be a connected set or union of some connected sets in . We also make the assumption that .

The encoded index is transmitted over a DMC (independent of other channels) with transition probabilities

$$P(j_l | i_l) = \Pr(J_l = j_l \mid I_l = i_l), \quad i_l, j_l \in \mathcal{I}_l, \tag{18}$$

where $J_l$ denotes the channel output at the $l^{th}$ stage.
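As a concrete instance of the transition probabilities in (18), the following sketch builds them for a binary symmetric channel. It assumes that each index is sent as its natural binary label over independent uses of a BSC with cross-over probability `eps`, so that $P(j|i)$ depends only on the Hamming distance between the two labels; the labeling and the function name `bsc_index_transitions` are assumptions made for illustration.

```python
def bsc_index_transitions(bits, eps):
    """Return P with P[i][j] = P(j|i) for indices sent as `bits`-bit labels
    over a memoryless BSC with cross-over probability `eps`."""
    size = 1 << bits
    P = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            d = bin(i ^ j).count("1")  # Hamming distance between the labels
            P[i][j] = (eps ** d) * ((1.0 - eps) ** (bits - d))
    return P

P = bsc_index_transitions(bits=2, eps=0.1)
for row in P:
    print([round(p, 4) for p in row])
```

Each row sums to one, and a different index labeling would permute the entries, which is one reason the index assignment matters for channel-optimized designs.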

Next, a decoder accepts the noisy index , and provides an estimate of the quantization error according to an available codebook set. Formally, the –stage decoder is defined by a mapping where denotes a codebook set consisting of reproduction codevectors, i.e., , thus

$$\mathsf{D}_l(j_l) = \mathbf{c}_{j_l}, \quad \text{if } J_l = j_l,\ j_l \in \mathcal{I}_l. \tag{19}$$

We denote the output of the stage decoder by , and the final reconstructed vector by .

We are interested in designing the quantizers in the system of Figure 6 using the end-to-end MSE criterion defined in (4). Nevertheless, it is not easy to find optimal encoders (by fixing the decoders) and decoders (by fixing encoders) for all the stages jointly with respect to minimizing (4). Therefore, we define a new performance criterion as

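$$D_l \triangleq \mathbb{E}\left[\Big\| \mathbf{X} - \sum_{t=1}^{l} \widehat{\mathbf{X}}_t \Big\|_2^2\right], \quad l = 1, \ldots, L. \tag{20}$$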

Using the performance criterion in (20), we assume that the stage only observes the previous stages. Applying , we derive necessary encoding and decoding policies for optimality (with respect to (20)) at stage (). Then, encoder-decoder pairs at the next stages are sequentially designed one after another. Using the sequential optimization at stage , we assume that the subsequent codevectors are populated with zeros. This assumption means that the sequential design is sub-optimal with respect to (4), and the resulting conditions lead to neither a global nor a local minimum of the end-to-end MSE. However, it provides better performance than schemes that consider only the quantization distortion at each stage separately.

### V-B Optimality Conditions for MSVQ Encoder and Decoder

In this section, we develop encoding and decoding principles for the () stage of the MSVQ system shown in Figure 6. Following the arguments of Section III-B, we first assume that decoder codevectors and all encoding regions/codevectors at previous stages are fixed and known, then we find necessary optimal encoding regions with respect to minimizing in (20) in Section V-B1. Second, we fix the encoding regions , and then derive necessary optimal codevectors in Section V-B2. Finally, in Section V-B3, we combine these necessary optimal conditions to develop a practical MSVQ design algorithm referred to as channel-optimized MSVQ for CS (COMSVQ-CS).

#### V-B1 Optimal Encoder

In order to derive encoding regions , we fix the codevectors and all the codevectors at previous stages. First, let us define

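$$D_l(\mathbf{y}, i_1^l) \triangleq \mathbb{E}\left[\Big\| \mathbf{X} - \sum_{t=1}^{l} \widehat{\mathbf{X}}_t \Big\|_2^2 \,\Big|\, \mathbf{Y} = \mathbf{y},\ I_1^l = i_1^l\right], \quad 1 \le l \le L. \tag{21}$$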

Now, $D_l$ in (20) can be rewritten as

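$$\begin{aligned} D_l &\triangleq \mathbb{E}\left[\Big\| \mathbf{X} - \sum_{t=1}^{l} \mathbf{c}_{J_t} \Big\|_2^2\right] \\ &\overset{(a)}{=} \sum_{i_1, \ldots, i_l} \int_{\mathcal{R}_{i_1}^{i_l}} D_l(\mathbf{y}, i_1^l)\, f(\mathbf{y})\, d\mathbf{y}, \end{aligned} \tag{22}$$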

where follows from marginalization of over and , and the fact that , and otherwise the probability is zero.

Thus, , the optimized index, denoted by , is attained by (23), where follows from the Markov chain (), hence, which is pulled out of the optimization. Also, follows from the Markov chain , .

Introducing transition probabilities (18) and the MMSE estimator (5), the last equality in (23) is expressed as

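$$\begin{aligned} i_l^\star = \arg\min_{i_l \in \mathcal{I}_l} \Bigg\{ &\sum_{j_l=0}^{R_l-1} P(j_l | i_l) \big\| \mathbf{c}_{j_l} \big\|_2^2 - 2\, \tilde{\mathbf{x}}(\mathbf{y})^\top \sum_{j_l=0}^{R_l-1} P(j_l | i_l)\, \mathbf{c}_{j_l} \\ &+ 2 \sum_{j_l=0}^{R_l-1} \sum_{t=1}^{l-1} \sum_{j_t=0}^{R_t-1} P(j_l | i_l)\, P(j_t | i_t)\, \mathbf{c}_{j_l}^\top \mathbf{c}_{j_t} \Bigg\}. \end{aligned} \tag{24}$$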
###### Remark 5.

Comparing the optimized encoding index for MSVQ for CS in (24), with that of the VQ for CS in (8), it can be seen that the third term in (24) is due to imposing multi-stage structure on the original VQ. As , this term vanishes and the resulting expression coincides with (8).
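The stage-wise rule (24) can be sketched as a sequential search. Writing $\mathbf{b}_{i_l} = \sum_{j_l} P(j_l|i_l)\mathbf{c}_{j_l}$ and $\mathbf{m} = \sum_{t<l} \mathbf{b}_{i_t}$, the third term of (24) equals $2\,\mathbf{b}_{i_l}^\top \mathbf{m}$, so each stage effectively encodes the residual $\tilde{\mathbf{x}}(\mathbf{y}) - \mathbf{m}$. The function names below are hypothetical, `channels[l][i][j]` stands for $P(j_l|i_l)$, and the sketch evaluates the third term online for brevity rather than tabulating it offline.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def expected_codevector(p_row, codebook):
    """sum_j P(j|i) c_j for one row P(.|i) of a stage's transition matrix."""
    n = len(codebook[0])
    return [sum(p * c[k] for p, c in zip(p_row, codebook)) for k in range(n)]

def msvq_encode(x_tilde, channels, codebooks):
    """Choose i_1, ..., i_L one stage at a time using the three terms of (24)."""
    indices = []
    m = [0.0] * len(x_tilde)  # expected sum of the earlier stages' outputs
    for P, C in zip(channels, codebooks):
        energies = [dot(c, c) for c in C]
        best_i, best_cost, best_b = None, float("inf"), None
        for i, p_row in enumerate(P):
            b = expected_codevector(p_row, C)
            cost = dot(p_row, energies) - 2.0 * dot(x_tilde, b) + 2.0 * dot(b, m)
            if cost < best_cost:
                best_i, best_cost, best_b = i, cost, b
        indices.append(best_i)
        m = [mk + bk for mk, bk in zip(m, best_b)]
    return indices

def identity(size):
    return [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]

# Toy two-stage example with noiseless channels (illustrative codebooks).
stage1 = [[1.0, 0.0], [-1.0, 0.0]]
stage2 = [[0.0, 0.25], [0.0, -0.25], [-0.25, 0.0], [0.25, 0.0]]
indices = msvq_encode([0.9, 0.3], [identity(2), identity(4)], [stage1, stage2])
print(indices)
```

With a single stage, `m` stays at zero and the cost reduces to the first two terms, in line with Remark 5.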

#### V-B2 Optimal Decoder

In order to derive codevectors , we fix encoding regions and all prior codebook sets. Therefore, applying in (20), it is straightforward to show that the optimal –stage codevectors, denoted by , are obtained as

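$$\mathbf{c}_{j_l}^\star = \mathbb{E}\left[\mathbf{X} - \sum_{t=1}^{l-1} \mathbf{c}_{J_t} \,\Big|\, J_l = j_l\right], \quad j_l \in \mathcal{I}_l. \tag{25}$$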

Similar to the steps taken in (11), the codevectors (25) can be parameterized in terms of encoding regions, channel transition probabilities and MMSE estimation. Here, for the sake of analysis, we only provide closed-form codebook expressions for $l = 1, 2$, which are given by

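$$\mathbf{c}_{j_1}^\star = \frac{\sum_{i_1} P(j_1 | i_1) \int_{\mathcal{R}_{i_1}} \tilde{\mathbf{x}}(\mathbf{y})\, f(\mathbf{y})\, d\mathbf{y}}{\sum_{i_1} P(j_1 | i_1) \int_{\mathcal{R}_{i_1}} f(\mathbf{y})\, d\mathbf{y}},$$

$$\mathbf{c}_{j_2}^\star = \frac{\sum_{i_1, i_2} P(j_2 | i_2) \int_{\mathcal{R}_{i_1}^{i_2}} \Big( \tilde{\mathbf{x}}(\mathbf{y}) - \sum_{j_1} P(j_1 | i_1)\, \mathbf{c}_{j_1} \Big) f(\mathbf{y})\, d\mathbf{y}}{\sum_{i_1, i_2} P(j_2 | i_2) \int_{\mathcal{R}_{i_1}^{i_2}} f(\mathbf{y})\, d\mathbf{y}}.$$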

Finally, we note that when , the condition (25) simplifies into (11).

#### V-B3 Training Algorithm

Similar to Algorithm 1, we can develop a practical method for training channel-optimized MSVQ for CS, coined COMSVQ-CS, summarized in Algorithm 2. Remarks similar to those stated for Algorithm 1 also apply to implementing Algorithm 2, with the difference that convergence in step (8) may be checked by tracking the distortion , and terminating the iterations when the relative improvement is small enough. Furthermore, in order to calculate the codevector () in (25), we use a Monte-Carlo method by first generating a set of finite training vectors , with known pdf, and then calculating the vector . Finally, we average over those vectors that have resulted in the index .
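The Monte-Carlo averaging step for (25) can be sketched as follows. The sketch assumes that, for every training sample, the target vector (the source sample minus the earlier stages' decoder outputs) and the received stage index have already been computed; the function name `update_codevectors` and the choice to keep the old codevector for an index that never occurs are assumptions made for illustration, not part of Algorithm 2.

```python
def update_codevectors(targets, received, old_codebook):
    """Average the target vectors over the samples that produced each
    received index; indices that never occur keep their old codevector."""
    n = len(old_codebook[0])
    sums = [[0.0] * n for _ in old_codebook]
    counts = [0] * len(old_codebook)
    for vec, j in zip(targets, received):
        counts[j] += 1
        for k in range(n):
            sums[j][k] += vec[k]
    return [
        [s / counts[j] for s in sums[j]] if counts[j] else list(old_codebook[j])
        for j in range(len(old_codebook))
    ]

# Toy usage: three training targets, three codevectors, index 2 never received.
targets = [[1.0, 0.0], [3.0, 2.0], [0.0, -1.0]]
received = [0, 0, 1]
old_codebook = [[0.0, 0.0], [0.0, 0.0], [9.0, 9.0]]
new_codebook = update_codevectors(targets, received, old_codebook)
print(new_codebook)
```

The sample mean over each received index is the empirical counterpart of the conditional expectation in (25).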

### V-C Complexity of COMSVQ-CS

In order to calculate the MSVQ encoder complexity, we calculate the number of operations at the encoder based on (24). Here, the computational complexity of the CS reconstruction algorithm is not considered. We consider the argument of (24), which requires two FLOPs for the subtraction and addition, and also FLOPs for computing the second inner product term. Note that the first constant term and the third inner product term can be computed offline, and they are not counted in our complexity analysis. Thus, in total, the COMSVQ-CS encoder requires operations, where () is the quantization rate available at a stage and is the total number of stages such that .
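To give a feel for the saving, the following sketch compares operation counts under explicitly assumed per-candidate costs: $2N$ FLOPs for the full search (one subtraction plus a length-$N$ inner product) and $2N+1$ FLOPs per candidate at each MSVQ stage (one subtraction, one addition and the inner product). It further assumes that a stage with $b$ bits has $2^b$ candidates and that comparisons are not counted. The dimension and bit split are hypothetical, and the totals are illustrative rather than the paper's exact expressions.

```python
def full_search_flops(n, total_bits):
    """Assumed cost: 2N FLOPs for each of the 2**total_bits candidates."""
    return 2 * n * (2 ** total_bits)

def multistage_flops(n, bits_per_stage):
    """Assumed cost: (2N + 1) FLOPs per candidate, summed over the stages."""
    return sum((2 * n + 1) * (2 ** b) for b in bits_per_stage)

# Hypothetical setting: N = 512 and 12 bits in total, split as 6 + 6.
n, split = 512, [6, 6]
single = full_search_flops(n, sum(split))
staged = multistage_flops(n, split)
print(single, staged)
```

Under these assumptions the count grows with the sum of the stage codebook sizes rather than with their product, which is the source of the complexity reduction.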

It can also be shown that at stage , the encoder requires one float to store the first term in (24), i.e., , floats to store the second term in (24), i.e., , and also floats for storing the third term in (24). Therefore, considering stages, the total encoding memory of the COMSVQ-CS is . Now, we consider the decoder memory complexity. Each decoder at stage requires floats to store the codevector , given that the memory for storing the codebooks of previous stages has already been accounted for. Hence, the decoder storage memory is