Joint Source-Channel Vector Quantization for Compressed Sensing
Abstract
We study joint source-channel coding (JSCC) of compressed sensing (CS) measurements using a vector quantizer (VQ). We develop a framework for realizing optimum JSCC schemes that enable encoding and transmitting CS measurements of a sparse source over discrete memoryless channels, and decoding the sparse source signal. For this purpose, the optimal design of the encoder-decoder pair of a VQ is considered, where optimality is addressed by minimizing the end-to-end mean square error (MSE). We derive a theoretical lower bound on the MSE performance, and propose a practical encoder-decoder design through an iterative algorithm. The resulting coding scheme is referred to as channel-optimized VQ for CS, coined COVQ-CS. In order to address the encoding complexity issue of the COVQ-CS, we propose to use a structured quantizer, namely a low-complexity multi-stage VQ (MSVQ). We derive new encoding and decoding conditions for the MSVQ, and then propose a practical encoder-decoder design algorithm referred to as channel-optimized MSVQ for CS, coined COMSVQ-CS. Through simulation studies, we compare the proposed schemes vis-à-vis relevant quantizers.
I Introduction
Compressed sensing (CS) [2] considers retrieving a high-dimensional sparse vector from a relatively small number of measurements. In many practical applications, the measurements collected at a CS sensor node need to be encoded using a finite number of bits and transmitted over noisy communication channels. To do so, efficient design of source and channel codes should be considered for reliable transmission of the CS measurements over noisy channels. The optimum performance theoretically attainable in a point-to-point memoryless channel can be achieved using separate design of source and channel codes, but this performance requires infinite source and channel code block lengths, resulting in delay as well as coding complexity. Considering a finite-length sparse source and CS measurement vector, it is theoretically guaranteed that joint source-channel coding (JSCC) can provide better performance than a separate design of source and channel codes. Therefore, to design a practical coding method, we focus on optimal JSCC principles for CS in the current work. Denoting the reconstruction of the sparse source $\mathbf{X}$ at a decoder by $\widehat{\mathbf{X}}$, our main objective is to develop a generic framework for optimum JSCC of CS measurements using vector quantization, or in other words, optimum joint source-channel vector quantization for CS, such that $\mathbb{E}[\|\mathbf{X} - \widehat{\mathbf{X}}\|_2^2]$ is minimized.
I-A Background
Recently, significant research interest has been devoted to design and analysis of source coding, e.g. quantization, for CS, and a wide range of problems has been formulated. Existing work on this topic is mainly divided into three categories.

The first category considers optimum quantizer design for quantization of CS measurements, where a CS reconstruction algorithm is held fixed at the decoder. Examples include [3] and [4], where the CS reconstruction algorithms are LASSO and message passing, respectively. Based on the analysis-by-synthesis principle, we have recently developed a quantizer design method in [5], where any CS reconstruction algorithm can be used.

The second category considers the design of a good CS reconstruction algorithm, where the quantizer is held fixed. CS reconstruction from noisy measurements, where the noise properties follow the effect of quantization, falls in this category. Examples are [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]. To elaborate, let us consider [9], where CS measurements are uniformly quantized and a convex optimization-based CS reconstruction algorithm, called basis pursuit dequantizing (BPDQ), is developed to suit the effect of uniform quantization. Further, the design of CS reconstruction algorithms and their performance bounds for reconstructing a sparse source from 1-bit quantized measurements have been investigated in [12, 13, 14, 15].

The third category focuses on trade-offs between the quantization resources (e.g., quantization rate) and CS resources (e.g., number of measurements or complexity of CS reconstruction) [16, 17, 8, 18]. For example, in [18], a trade-off between the number of measurements and the quantization rate was established by introducing two compression regimes as a quantification of resources: the quantization compression regime and the CS compression regime.
We mention that all the above works are dedicated to pure source coding through quantization of CS measurements. To the best of our knowledge, there is limited work on JSCC of CS measurements using a vector quantizer (VQ). Our previous effort in this direction is reported in [1]. The current paper builds upon the work of [1], and provides a comprehensive framework for developing optimum JSCC schemes to encode and transmit CS measurements (of a sparse source $\mathbf{X}$) over discrete memoryless channels, and to decode the sparse source so as to provide the reconstruction $\widehat{\mathbf{X}}$. The optimality is addressed by minimizing the MSE performance measure $\mathbb{E}[\|\mathbf{X} - \widehat{\mathbf{X}}\|_2^2]$.
I-B Contributions
We first consider the optimal design of a VQ encoder-decoder pair for CS in the sense of minimizing the MSE. Here, we stress that we use the VQ in its generic form. This is different from the design methods using uniform quantization [9] or 1-bit quantization of CS measurements [12, 13, 14, 15]. Our contributions include

Establishing (necessary) optimal encoding and decoding conditions for VQ.

Providing a theoretical bound on the MSE performance.

Developing a practical VQ encoderdecoder design through an iterative algorithm.

Addressing the encoding complexity issue of VQ using a structured quantizer, namely a low-complexity multi-stage VQ (MSVQ), where we derive new encoder-decoder conditions for sub-optimal design of the MSVQ.
Our practical encoder-decoder designs consider Channel-Optimized VQ for CS, coined COVQ-CS, and Channel-Optimized MSVQ for CS, coined COMSVQ-CS. To demonstrate the strength of the proposed designs, we compare them with relevant quantizer design methods through different simulation studies. Particularly, we show that in noisy channel scenarios, the proposed COVQ-CS and COMSVQ-CS schemes provide better and more robust (against channel noise) performance compared to existing quantizers for CS followed by separate channel coding.
I-C Outline
The rest of the paper is organized as follows. In Section II, we introduce some preliminaries of CS. The optimal design and performance analysis of a joint source-channel VQ for CS are presented in Section III. In Section IV, we propose a practical VQ encoder-decoder design algorithm. Further, in Section V, we deal with the complexity issue by proposing the design of a computationally and memory-efficient MSVQ for CS. The performance comparison of the proposed quantization schemes with other relevant methods is made in Section VI, and conclusions are drawn in Section VII.
I-D Notations
Random variables (RVs) will be denoted by upper-case letters while their realizations (instants) will be denoted by the respective lower-case letters. Random vectors of dimension will be represented by boldface characters. We will denote a sequence of RVs by ; further, implies that . Matrices will be denoted by capital Greek letters, except that the square identity matrix of dimension is denoted by . The matrix operators determinant, trace, transpose and the maximum eigenvalue of a matrix are denoted by , , , and , respectively. Further, the cardinality of a set is shown by . We will use to denote the expectation operator. The norm () of a vector will be denoted by . Also, represents the norm which is the number of non-zero coefficients in .
II Preliminaries of CS
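In CS, a random sparse vector $\mathbf{X} \in \mathbb{R}^N$ (where most coefficients are likely zero) is linearly measured by a known sensing matrix $\boldsymbol{\Phi} \in \mathbb{R}^{M \times N}$ ($M < N$), resulting in an under-determined set of linear measurements (possibly) perturbed by noise
$$\mathbf{Y} = \boldsymbol{\Phi}\mathbf{X} + \mathbf{W}, \tag{1}$$
where $\mathbf{Y} \in \mathbb{R}^M$ and $\mathbf{W} \in \mathbb{R}^M$ denote the measurement and the additive measurement noise vectors, respectively. We assume that $\mathbf{X}$ is a $K$-sparse vector, i.e., it has at most $K$ ($K \leq M$) non-zero coefficients, where the location and magnitude of the non-zero components are drawn from known distributions. We also assume that the sparsity level $K$ is known in advance. We define the support set of the sparse vector $\mathbf{X} = [X_1, \ldots, X_N]^\top$ as $\mathcal{S} \triangleq \{n : X_n \neq 0\} \subseteq \{1, \ldots, N\}$ with $|\mathcal{S}| = \|\mathbf{X}\|_0 \leq K$. Next, we define the mutual coherence notion which characterizes the merit of a sensing matrix $\boldsymbol{\Phi}$. The mutual coherence is defined as [19]
$$\mu \triangleq \max_{i \neq j} \frac{|\boldsymbol{\Phi}_i^\top \boldsymbol{\Phi}_j|}{\|\boldsymbol{\Phi}_i\|_2 \, \|\boldsymbol{\Phi}_j\|_2}, \quad 1 \leq i, j \leq N, \tag{2}$$
where $\boldsymbol{\Phi}_i$ denotes the $i$-th column of $\boldsymbol{\Phi}$. The mutual coherence formalizes the dependence between the columns of $\boldsymbol{\Phi}$, and can be calculated with polynomial-time complexity.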
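To make the measurement model and the coherence measure concrete, the following Python sketch draws a sparse source, forms noisy measurements as in (1), and evaluates the mutual coherence as the largest normalized inner product between distinct columns of the sensing matrix. The function names, the Gaussian sensing matrix and the chosen dimensions are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def mutual_coherence(Phi):
    """Largest absolute normalized inner product between distinct columns."""
    cols = Phi / np.linalg.norm(Phi, axis=0, keepdims=True)  # unit-norm columns
    gram = np.abs(cols.T @ cols)
    np.fill_diagonal(gram, 0.0)  # exclude the i == j terms
    return float(gram.max())

def draw_sparse_measurement(Phi, K, noise_std, rng):
    """Draw a K-sparse x (uniform support, N(0,1) non-zeros) and y = Phi x + w."""
    M, N = Phi.shape
    x = np.zeros(N)
    support = rng.choice(N, size=K, replace=False)
    x[support] = rng.standard_normal(K)
    y = Phi @ x + noise_std * rng.standard_normal(M)
    return x, y

# Illustrative sizes only (assumed, not from the paper).
rng = np.random.default_rng(0)
Phi = rng.standard_normal((4, 8))
x, y = draw_sparse_measurement(Phi, K=2, noise_std=0.1, rng=rng)
mu = mutual_coherence(Phi)
```

The coherence is computed from the full Gram matrix, which costs on the order of $N^2 M$ operations and is consistent with the polynomial-time remark above.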
In order to reconstruct an unknown sparse source from a noisy undersampled measurement vector, several reconstruction methods have been developed based on convex optimization methods, iterative greedy search algorithms and Bayesian estimation approaches. In this paper, through the design and analysis procedures, we adopt the Bayesian framework [20, 21, 22, 23, 24] for reconstructing a sparse source from noisy and quantized measurements.
In the subsequent sections, we describe our proposed design methods for quantization by observing the CS measurement vector, and then develop theoretical results.
III Joint Source-Channel VQ for CS
In this section, we first introduce a general joint source-channel VQ system model for CS measurements in Section III-A. We derive necessary conditions for optimality of the encoder-decoder pair in Section III-B. Thereafter, we investigate the effects of the optimal conditions in Section III-C, and proceed to the analysis of performance in Section III-D.
III-A General System Description and Performance Criterion
Consider the general system model, shown in Figure 1, for transmitting CS measurements and reconstructing a sparse source. Let the total bit budget allocated for encoding (quantization) be fixed at bits per dimension of the source vector. Given the noisy measurement vector , a VQ encoder is defined by a mapping , where is a finite index set defined as with . Denoting the quantized index by , the encoder works according to , where the sets are encoder regions and such that when the encoder outputs the index . Note that given an index , the set is not necessarily a connected set (due to nonlinear CS reconstruction) in the space . Also, might be an empty set (due to channel noise, see e.g. [25]).
Next, we consider a memoryless channel with discrete input and output alphabets, referred to as a discrete memoryless channel (DMC). In our problem setup, the DMC accepts the encoded index $I$ and outputs a noisy symbol $J$. The channel is defined by a random mapping characterized by the transition probabilities
$$P(j \mid i) \triangleq \Pr(J = j \mid I = i), \quad i, j \in \mathcal{I}, \tag{3}$$
which indicate the probability that index $j$ is received given that the input index to the channel was $i$. We assume that the transmitted index and the received index share the same index set $\mathcal{I}$, and that the channel transition probabilities (3) are known in advance. We denote the capacity of a given channel by $C$ bits/channel use. Given the received index $J$, a decoder is characterized by a mapping $\mathsf{D}: \mathcal{I} \rightarrow \mathcal{C}$, where $\mathcal{C}$ is a finite discrete codebook set containing all reproduction codevectors $\{\mathbf{c}_j \in \mathbb{R}^N\}_{j \in \mathcal{I}}$. The decoder's functionality is described by a look-up table such that when the received index from the channel is $j$, the decoder outputs $\widehat{\mathbf{X}} = \mathbf{c}_j$.
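As an illustration of the transition probabilities in (3), the sketch below builds the index-level transition matrix for a binary symmetric channel (BSC) that flips each bit of an $R$-bit index independently. The natural binary labeling of the indexes, the function name and the argument names are assumptions made for this example.

```python
import numpy as np

def bsc_transition_matrix(R, eps):
    """P[i, j] = Pr(J = j | I = i) when an R-bit index crosses a BSC bit by bit."""
    n = 2 ** R
    idx = np.arange(n)
    xor = idx[:, None] ^ idx[None, :]
    # Hamming distance between the binary labels of i and j.
    ham = np.array([[bin(v).count("1") for v in row] for row in xor])
    return (eps ** ham) * ((1.0 - eps) ** (R - ham))

P = bsc_transition_matrix(3, 0.1)
```

Each row of `P` sums to one, and for `eps = 0` the matrix reduces to the identity, i.e., the noiseless channel.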
Next, we state how we quantify the performance of the system in Figure 1 and our design goal. It is important to design an encoder-decoder pair in order to minimize a distortion measure which reflects the requirements of the receiving-end user. Therefore, we quantify the source reconstruction distortion of our studied system by the end-to-end MSE defined as
$$D \triangleq \mathbb{E}\big[\|\mathbf{X} - \widehat{\mathbf{X}}\|_2^2\big], \tag{4}$$
where the expectation is taken with respect to the distributions on the sparse source (which, itself, depends on the distribution of the non-zero coefficients as well as their random placements, i.e., the sparsity pattern), the noise and the randomness in the channel. We mention that the end-to-end MSE depends on the CS reconstruction error, the quantization error as well as the channel noise. While the CS sensing matrix is given, our concern is to design an encoder-decoder pair robust against all three kinds of error.
III-B Optimality Conditions for VQ Encoder and Decoder
We consider an optimization technique for the system illustrated in Figure 1 in order to determine encoder and decoder mappings E and D, respectively, in the presence of channel noise. More precisely, the aim of the VQ design is to find

MSE-minimizing encoder regions $\{\mathcal{R}_i\}_{i \in \mathcal{I}}$ and

MSE-minimizing decoder codebook $\mathcal{C}$.
We note that the optimal joint design of encoder and decoder cannot be implemented since the resulting optimization is analytically intractable. To address this issue, in Section III-B1, we show how the encoding index $i$ (or equivalently the encoder region $\mathcal{R}_i$) can be chosen to minimize the MSE for a given codebook $\mathcal{C}$. Then, in Section III-B2, we derive an expression for the optimal decoder codebook $\mathcal{C}$ for given encoder regions $\{\mathcal{R}_i\}_{i \in \mathcal{I}}$.
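III-B1 Optimal Encoder
First, let us introduce the minimum mean-square error (MMSE) estimator of the source given the observed measurements (1), which is (see [26, Chapter 11])
$$\widetilde{\mathbf{x}}(\mathbf{y}) \triangleq \mathbb{E}[\mathbf{X} \mid \mathbf{Y} = \mathbf{y}]. \tag{5}$$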
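For small problem sizes, the MMSE estimator (5) can be evaluated by enumerating all supports. The sketch below assumes an exactly $K$-sparse source with a uniformly drawn support, i.i.d. standard Gaussian non-zero coefficients and white Gaussian measurement noise with variance `noise_var` > 0. Under these assumptions the posterior mean is a weighted sum of per-support linear estimates. The function name is hypothetical, and this brute-force form is only an illustration rather than the closed-form expression used in the cited references.

```python
import itertools
import numpy as np

def mmse_sparse_estimate(y, Phi, K, noise_var):
    """E[X | Y = y] for exactly K-sparse X (uniform support, N(0,1) non-zeros)."""
    M, N = Phi.shape
    log_w, cond_means = [], []
    for S in itertools.combinations(range(N), K):
        idx = list(S)
        A = Phi[:, idx]
        cov = A @ A.T + noise_var * np.eye(M)      # covariance of Y given support S
        sol = np.linalg.solve(cov, y)
        _, logdet = np.linalg.slogdet(cov)
        # Gaussian log-likelihood of y given S, up to an additive constant.
        log_w.append(-0.5 * (y @ sol) - 0.5 * logdet)
        m = np.zeros(N)
        m[idx] = A.T @ sol                         # E[X_S | y, S]
        cond_means.append(m)
    log_w = np.array(log_w)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()                                   # posterior support probabilities
    return w @ np.array(cond_means)
```

The number of supports grows as $\binom{N}{K}$, so this enumeration is practical only for toy dimensions such as those used in the examples of this paper.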
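Now, assume that the decoder codebook $\mathcal{C}$ is known and fixed. We focus on how the encoding index $i$ should be chosen to minimize the MSE given the observed noisy CS measurement vector $\mathbf{y}$. We rewrite the MSE as
$$
\begin{aligned}
D &= \mathbb{E}\big[\|\mathbf{X} - \widehat{\mathbf{X}}\|_2^2\big] \\
&\overset{(a)}{=} \int_{\mathbf{y}} \sum_{i \in \mathcal{I}} \Pr(I = i \mid \mathbf{Y} = \mathbf{y}) \, \mathbb{E}\big[\|\mathbf{X} - \widehat{\mathbf{X}}\|_2^2 \mid \mathbf{Y} = \mathbf{y}, I = i\big] f(\mathbf{y}) \, d\mathbf{y} \\
&\overset{(b)}{=} \sum_{i \in \mathcal{I}} \int_{\mathbf{y} \in \mathcal{R}_i} \Big\{ \mathbb{E}\big[\|\mathbf{X} - \widehat{\mathbf{X}}\|_2^2 \mid \mathbf{Y} = \mathbf{y}, I = i\big] \Big\} f(\mathbf{y}) \, d\mathbf{y},
\end{aligned} \tag{6}
$$
where $(a)$ follows from marginalization of the MSE over $\mathbf{Y}$ and $I$. Further, $f(\mathbf{y})$ is the $M$-fold probability density function (pdf) of the measurement vector. Also, $(b)$ follows by interchanging the integral and the summation and the fact that $\Pr(I = i \mid \mathbf{Y} = \mathbf{y}) = 1$, $\forall \mathbf{y} \in \mathcal{R}_i$, and otherwise the probability is zero. Now, since $f(\mathbf{y})$ is always non-negative, the MSE-minimizing points in $\mathbb{R}^M$ that shall be assigned to the encoder region $\mathcal{R}_i$ are those that minimize the term within the braces in the last expression of (6). Then, the MSE-minimizing encoding index, denoted by $i^\star$, is given by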
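$$
\begin{aligned}
i^\star &= \arg\min_{i \in \mathcal{I}} \; \mathbb{E}\big[\|\mathbf{X} - \widehat{\mathbf{X}}\|_2^2 \mid \mathbf{Y} = \mathbf{y}, I = i\big] \\
&\overset{(a)}{=} \arg\min_{i \in \mathcal{I}} \Big\{ \mathbb{E}\big[\|\widehat{\mathbf{X}}\|_2^2 \mid \mathbf{Y} = \mathbf{y}, I = i\big] - 2\, \mathbb{E}\big[\mathbf{X}^\top \widehat{\mathbf{X}} \mid \mathbf{Y} = \mathbf{y}, I = i\big] \Big\} \\
&\overset{(b)}{=} \arg\min_{i \in \mathcal{I}} \Big\{ \mathbb{E}\big[\|\widehat{\mathbf{X}}\|_2^2 \mid I = i\big] - 2\, \mathbb{E}\big[\mathbf{X}^\top \mid \mathbf{Y} = \mathbf{y}\big] \, \mathbb{E}\big[\widehat{\mathbf{X}} \mid I = i\big] \Big\},
\end{aligned} \tag{7}
$$
where $(a)$ follows from the fact that $\mathbf{X}$ is independent of $I$, conditioned on $\mathbf{Y}$; hence, $\mathbb{E}[\|\mathbf{X}\|_2^2 \mid \mathbf{Y} = \mathbf{y}, I = i] = \mathbb{E}[\|\mathbf{X}\|_2^2 \mid \mathbf{Y} = \mathbf{y}]$, which is pulled out of the optimization. $(b)$ follows from the fact that $\widehat{\mathbf{X}}$ is independent of $\mathbf{Y}$, conditioned on $I$, and from the Markov chain $\mathbf{X} \rightarrow \mathbf{Y} \rightarrow I \rightarrow J \rightarrow \widehat{\mathbf{X}}$. Next, note that introducing the channel transition probabilities in (3) and the MMSE estimator in (5), the last equality in (7) can be expressed as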
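$$i^\star = \arg\min_{i \in \mathcal{I}} \Big\{ \sum_{j \in \mathcal{I}} P(j \mid i) \|\mathbf{c}_j\|_2^2 - 2\, \widetilde{\mathbf{x}}(\mathbf{y})^\top \sum_{j \in \mathcal{I}} P(j \mid i)\, \mathbf{c}_j \Big\}. \tag{8}$$
Equivalently, the optimized encoding regions are obtained by
$$
\mathcal{R}_i^\star = \Big\{ \mathbf{y} \in \mathbb{R}^M : \sum_{j \in \mathcal{I}} P(j \mid i) \|\mathbf{c}_j\|_2^2 - 2\, \widetilde{\mathbf{x}}(\mathbf{y})^\top \sum_{j \in \mathcal{I}} P(j \mid i)\, \mathbf{c}_j \leq \sum_{j \in \mathcal{I}} P(j \mid i') \|\mathbf{c}_j\|_2^2 - 2\, \widetilde{\mathbf{x}}(\mathbf{y})^\top \sum_{j \in \mathcal{I}} P(j \mid i')\, \mathbf{c}_j, \; \forall i' \neq i \Big\}. \tag{9}
$$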
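The encoding rule can be sketched in a few lines of Python. The function below picks the index $i$ that minimizes the channel-averaged distortion $\sum_j P(j \mid i) \|\widetilde{\mathbf{x}} - \mathbf{c}_j\|_2^2$; the term $\|\widetilde{\mathbf{x}}\|_2^2$ is common to all indexes and is dropped, which leaves a constant term plus an inner product per index. The function name and the array layout (`P[i, j]` for the probability of receiving `j` when `i` is sent, one codevector per row of `codebook`) are assumptions made for illustration.

```python
import numpy as np

def covq_encode(x_tilde, codebook, P):
    """Index i minimizing sum_j P[i, j]*||c_j||^2 - 2 * x_tilde^T sum_j P[i, j]*c_j."""
    energy = P @ np.sum(codebook ** 2, axis=1)   # constant term, one scalar per index i
    centroid = P @ codebook                      # channel-averaged codevector per index i
    cost = energy - 2.0 * centroid @ x_tilde
    return int(np.argmin(cost))
```

Both `energy` and `centroid` depend only on the codebook and the channel, so in a full-search encoder they could be precomputed once and reused for every measurement vector.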
III-B2 Optimal Decoder
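Applying the MSE criterion, it is straightforward to show that the codevectors which minimize the MSE in (4) for a fixed encoder are obtained by letting $\mathbf{c}_j$ represent the MMSE estimator of the vector $\mathbf{X}$ based on the received index $j$ from the channel, that is
$$\mathbf{c}_j^\star = \mathbb{E}[\mathbf{X} \mid J = j], \quad j \in \mathcal{I}. \tag{10}$$
Now, using Bayes' rule, the expression for $\mathbf{c}_j^\star$ can be rewritten as
$$
\begin{aligned}
\mathbf{c}_j^\star &\overset{(a)}{=} \sum_{i \in \mathcal{I}} \Pr(I = i \mid J = j)\, \mathbb{E}[\mathbf{X} \mid I = i] \\
&= \frac{\sum_{i} P(j \mid i) P(i) \int_{\mathcal{R}_i} \mathbb{E}[\mathbf{X} \mid \mathbf{Y} = \mathbf{y}]\, f(\mathbf{y} \mid i)\, d\mathbf{y}}{\sum_{i} P(j \mid i) P(i)} \\
&\overset{(b)}{=} \frac{\sum_{i} P(j \mid i) \int_{\mathcal{R}_i} \widetilde{\mathbf{x}}(\mathbf{y})\, f(\mathbf{y})\, d\mathbf{y}}{\sum_{i} P(j \mid i) \int_{\mathcal{R}_i} f(\mathbf{y})\, d\mathbf{y}},
\end{aligned} \tag{11}
$$
where $(a)$ follows from marginalization over $I$ and the Markov chain $\mathbf{X} \rightarrow I \rightarrow J$. Moreover, $P(i) \triangleq \Pr(I = i)$ and $f(\mathbf{y} \mid i)$ is the conditional pdf of $\mathbf{Y}$ given that $\mathbf{Y} \in \mathcal{R}_i$. Also, $(b)$ follows by using (5) and by the fact that $f(\mathbf{y} \mid i) = f(\mathbf{y}) / P(i)$, $\forall \mathbf{y} \in \mathcal{R}_i$, with $P(i) = \int_{\mathcal{R}_i} f(\mathbf{y})\, d\mathbf{y}$.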
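On a finite training set, the integrals over the encoder regions can be replaced by sums over the training vectors assigned to each index. The sketch below is one such sample-based form of the decoder update: each codevector is a channel-weighted average of the MMSE estimates grouped by encoder index. The function and argument names are hypothetical, and `P[i, j]` is assumed to hold the probability of receiving index `j` when `i` is sent.

```python
import numpy as np

def covq_update_codebook(x_tilde_train, indices, P):
    """c_j = sum_i P[i, j] * (sum of x_tilde in region i) / sum_i P[i, j] * (count in region i)."""
    n = P.shape[0]
    N = x_tilde_train.shape[1]
    sums = np.zeros((n, N))
    np.add.at(sums, indices, x_tilde_train)          # per-region sums of MMSE estimates
    counts = np.bincount(indices, minlength=n).astype(float)
    num = P.T @ sums                                  # row j: sum_i P(j|i) * sums[i]
    den = P.T @ counts                                # entry j: sum_i P(j|i) * counts[i]
    return num / np.maximum(den, 1e-12)[:, None]      # guard against empty denominators
```

For a noiseless channel (`P` equal to the identity) the update reduces to the ordinary per-region centroid.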
III-C Insights Through Analyzing the Optimal Conditions
Here, we provide insights into the necessary optimal conditions (8) and (10). Note that the encoding condition (8) implies that the sparse source is first reconstructed in the MMSE sense from the CS measurements at the encoder, and then quantized to an appropriate index. Hence, it suggests that the system shown in Figure 1 may be translated to the equivalent system shown in Figure 2.
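Let us first denote the MMSE estimator as the RV $\widetilde{\mathbf{X}} \triangleq \mathbb{E}[\mathbf{X} \mid \mathbf{Y}]$, then we rewrite the end-to-end distortion as
$$
\begin{aligned}
D &= \mathbb{E}\big[\|\mathbf{X} - \widetilde{\mathbf{X}} + \widetilde{\mathbf{X}} - \widehat{\mathbf{X}}\|_2^2\big] \\
&= \mathbb{E}\big[\|\mathbf{X} - \widetilde{\mathbf{X}}\|_2^2\big] + \mathbb{E}\big[\|\widetilde{\mathbf{X}} - \widehat{\mathbf{X}}\|_2^2\big] \triangleq D_{cs} + D_q,
\end{aligned} \tag{12}
$$
where the second equality can be proved by showing that the estimation error of the source, $\mathbf{X} - \widetilde{\mathbf{X}}$, and the quantized transmission error, $\widetilde{\mathbf{X}} - \widehat{\mathbf{X}}$, are uncorrelated. This holds from the definition of $\widetilde{\mathbf{X}}$ and the long Markov property $\mathbf{X} \rightarrow \mathbf{Y} \rightarrow \widetilde{\mathbf{X}} \rightarrow I \rightarrow J \rightarrow \widehat{\mathbf{X}}$ due to the assumption of deterministic mappings E and D and a memoryless channel.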
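The uncorrelatedness argument can be spelled out in a few lines. The following is a sketch, writing $\widetilde{\mathbf{X}} = \mathbb{E}[\mathbf{X} \mid \mathbf{Y}]$ for the MMSE estimator and $\widehat{\mathbf{X}}$ for the decoder output; these symbols are our shorthand wherever the surrounding text does not fix them.

```latex
\begin{align*}
\mathbb{E}\big[(\mathbf{X}-\widetilde{\mathbf{X}})^{\top}(\widetilde{\mathbf{X}}-\widehat{\mathbf{X}})\big]
&= \mathbb{E}\Big[\,\mathbb{E}\big[(\mathbf{X}-\widetilde{\mathbf{X}})^{\top}(\widetilde{\mathbf{X}}-\widehat{\mathbf{X}}) \,\big|\, \mathbf{Y}\big]\Big] \\
&= \mathbb{E}\Big[\,\mathbb{E}\big[\mathbf{X}-\widetilde{\mathbf{X}} \,\big|\, \mathbf{Y}\big]^{\top}\,
   \mathbb{E}\big[\widetilde{\mathbf{X}}-\widehat{\mathbf{X}} \,\big|\, \mathbf{Y}\big]\Big] \\
&= 0 .
\end{align*}
```

The second line uses that $\widetilde{\mathbf{X}}$ is a function of $\mathbf{Y}$ and that $\widehat{\mathbf{X}}$ is conditionally independent of $\mathbf{X}$ given $\mathbf{Y}$ (Markov property); the third line uses $\mathbb{E}[\mathbf{X} - \widetilde{\mathbf{X}} \mid \mathbf{Y}] = \mathbf{0}$. Expanding $\|\mathbf{X} - \widehat{\mathbf{X}}\|_2^2$ around $\widetilde{\mathbf{X}}$ then leaves only the two squared-error terms.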
Remark 1.
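Interestingly, it can also be seen from (12) that the CS reconstruction distortion $D_{cs} = \mathbb{E}[\|\mathbf{X} - \widetilde{\mathbf{X}}\|_2^2]$ does not depend on quantization and channel aspects. Hence, to find optimal encoding indexes (given fixed codevectors) and optimal codevectors (given fixed encoding regions) with respect to the end-to-end distortion $D$, it suffices to find them with respect to minimizing $D_q = \mathbb{E}[\|\widetilde{\mathbf{X}} - \widehat{\mathbf{X}}\|_2^2]$. It can be proved that the necessary conditions for optimality (with respect to $D_q$) of the encoder-decoder pair derived for the system of Figure 2 coincide with the ones developed for the system of Figure 1, i.e., (8) and (11). The proof of this claim is as follows. Similar to the steps taken in (6), the $D_q$-minimizing encoding index is given by
$$i^\star = \arg\min_{i \in \mathcal{I}} \; \mathbb{E}\big[\|\widetilde{\mathbf{X}} - \widehat{\mathbf{X}}\|_2^2 \mid \widetilde{\mathbf{X}} = \widetilde{\mathbf{x}}, I = i\big] = \arg\min_{i \in \mathcal{I}} \Big\{ \sum_{j \in \mathcal{I}} P(j \mid i) \|\mathbf{c}_j\|_2^2 - 2\, \widetilde{\mathbf{x}}^\top \sum_{j \in \mathcal{I}} P(j \mid i)\, \mathbf{c}_j \Big\}.$$
Further, the $D_q$-minimizing decoder is obtained by
$$\mathbf{c}_j^\star = \mathbb{E}[\widetilde{\mathbf{X}} \mid J = j] = \mathbb{E}\big[\mathbb{E}[\mathbf{X} \mid \mathbf{Y}] \mid J = j\big] \overset{(a)}{=} \mathbb{E}[\mathbf{X} \mid J = j],$$
where $(a)$ follows from the Markov property $\mathbf{X} \rightarrow \mathbf{Y} \rightarrow J$. Now, we provide the following remark.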
Remark 2.
Before proceeding to the analysis of the MSE using the developed equivalence property, we compare our proposed design scheme with related methods in the literature which follow the building block structure shown in Figure 3. Under this system model, for a fixed CS reconstruction algorithm (or a fixed quantizer encoder-decoder pair), a quantizer encoder-decoder pair (or a CS reconstruction algorithm) is designed in order to satisfy a certain performance criterion, e.g. minimizing the end-to-end distortion, the quantization distortion or the $\ell_1$-norm of the reconstruction vector. Some examples of system models following Figure 3 include [3, 9, 11, 5] (assuming a noiseless channel) and the conventional nearest-neighbor coding of CS measurements. In general, according to this system model, the quantizer decoder D outputs the vector $\widehat{\mathbf{Y}}$ after receiving the channel output. Finally, a given CS reconstruction decoder takes $\widehat{\mathbf{Y}}$ and makes an estimate of the sparse source.
Following Figure 2 (as the equivalent system model of Figure 1), we note that it is structurally different from the system model of Figure 3 in the location of the CS reconstruction, either at the transmitter side or at the receiver side. In the former system, an encoder reconstructs the source from CS measurements, whereas the latter system puts all CS reconstruction complexity at the decoder.
III-D Analysis of MSE
In this section, we analyze the impact of the CS reconstruction distortion, the quantization error and the channel noise on the end-to-end MSE by deriving a lower bound.
Proposition 1.
Consider the linear CS model (1) with an exact sparse source under the following assumptions:

The magnitudes of the non-zero coefficients in $\mathbf{X}$ are drawn according to the i.i.d. standard Gaussian distribution.

The elements of the support set $\mathcal{S}$ are uniformly drawn from all $\binom{N}{K}$ possibilities.

The measurement noise is drawn as uncorrelated with the measurements, where .
Further, assume a sensing matrix $\boldsymbol{\Phi}$ with mutual coherence $\mu$. Let the total quantization rate be $R$ bits/vector, and the channel be characterized by capacity $C$ bits/channel use, then the end-to-end MSE of the system of Figure 1 is asymptotically (in quantization rate and dimension) lower-bounded as
(13) 
where , and , in which denotes the Gamma function.
Proof.
The proof can be found in the Appendix. ∎
Remark 3.
Each component of the lower bound (13) is intuitive. The first term is the contribution of the CS reconstruction distortion, and the second term reflects the distortion due to the vector quantized transmission. When the CS measurements are noisy, it can be verified that as the quantization rate $R$ increases, the end-to-end MSE attains an error floor. This result can also be inferred from (12): as the quantization rate increases, the quantized transmission distortion $D_q$ decays (asymptotically) exponentially, whereas the CS reconstruction distortion $D_{cs}$ is constant irrespective of the rate. Hence, as $R \rightarrow \infty$, the value that the MSE converges to is $D_{cs}$.
It should be noted that when the CS measurements are noiseless (), the lower bound (13) becomes trivial. In this case, a simple asymptotic lower bound for the system of Figure 1, under the assumptions of Proposition 1, can be obtained as
(14) 
where the constant is the same dimensionalitydependent constant in (13).
The lower bound (14) (also known as the adaptive bound in [16, 17] in the noiseless channel case) can be proved assuming that the support set of $\mathbf{X}$ is known a priori. Therefore, one can transmit the known support set using $\log_2 \binom{N}{K}$ bits, and the Gaussian coefficients within the support set can be quantized via $R - \log_2 \binom{N}{K}$ bits. Under noiseless channel condition (), the right hand side in (14) is shown to achieve the distortion rate function of a sparse source vector with Gaussian non-zero coefficients and a support set uniformly drawn from $\binom{N}{K}$ possibilities [27]. Then, the separate source-channel coding theorem [28, Chapter 7] can be applied to find the optimum performance theoretically attainable (OPTA) by introducing the channel capacity $C$.
Remark 4.
The lower bound in (14) shows that the end-to-end MSE can at most decay exponentially (in quantization rate ) with exponent dB/bit. Since the sparsity ratio , the decaying exponent can be far steeper than dB/bit for a Gaussian non-sparse source vector of dimension .
The following toy example offers some insights into the tightness of the lower bound (14).
Example 1.
Using a simple example, we show how tight the lower bound (14) is with respect to our proposed design. In Figure 4, we compare simulation results with the lower bound in some region where . (Footnote: This scenario can be realized in an event where and the number of measurements is such that the CS reconstruction is perfect.) Following this best-case scenario, we generate realizations of with sparsity level , where the non-zero coefficient is a standard Gaussian RV, and its location is drawn uniformly at random over . Then, we use the necessary optimal conditions (8) and (10) iteratively (as will be shown later in Algorithm 1). Considering a binary symmetric channel (BSC) with bit cross-over probability and capacity bits/channel use (see (29)), we plot MSE, versus quantization rate for (noiseless channel) and (noisy channel) in Figure 4. It can be observed that at , the bound (dashed line) is tight. As would be expected, degrading channel condition to reduces the performance. At , the gap between the simulation result (solid line marked by ‘o’) and its corresponding lower bound (dotted line) increases. Note that in the noisy channel case, the lower bound is based upon the asymptotic assumption of infinite source and channel code lengths (used in the OPTA). Therefore, the lower bound is not tight at for low dimensions.
IV Practical Quantizer Design
In this section, we first develop a practical VQ encoder-decoder design algorithm, referred to as channel-optimized VQ for CS (COVQ-CS), using the necessary optimal conditions (8) and (11). Then, we provide a practical comparison between our proposed algorithm and a conventional quantizer design algorithm. We finalize this section by analyzing the encoding and decoding computational complexity.
IV-A Training Algorithm for Practical Design
The results presented in Section III-B1 and Section III-B2 can be utilized to formulate an iterate-alternate training algorithm for the problem of interest. Similar to the generalized Lloyd algorithm for noisy channels [29], we propose a VQ training method for the design problem in this paper, which is summarized in Algorithm 1. The following remarks can be considered for implementing Algorithm 1:

In step (1), besides the channel transition probabilities , we assume that the statistics of the sparse source vector are given for training.

In general, it is not easy to derive closed-form solutions for the optimal decoding condition (11), for example, due to difficulties in calculating the integrals even if the pdf is known. In practice, we calculate the codevector () in (10) using the Monte-Carlo method. To implement this computationally-efficient procedure, we first generate a set of finite training vectors , and then sample-average over those vectors that have led to the index .

To address the issue of encountering empty regions, in each iteration of the algorithm we pick the codevector whose index has been sent the largest number of times, denoted by . Then, a codevector associated with the index that has not been sent is calculated as , where is sufficiently small. Using this technique (which is also known as the splitting method in the initialization phase of the LBG algorithm [30]), we efficiently re-include those encoding indexes that have never been selected due to the limited number of generated samples. This leads to a design that efficiently uses all degrees of freedom.

The performance of the COVQ-CS is sensitive to initialization, which influences how small a value of the distortion $D$ the algorithm converges to. Therefore, in step (3), when the channel is noiseless, the codevectors are initialized using the splitting procedure of the so-called LBG design algorithm. Then, the final optimized codevectors are chosen for initialization of Algorithm 1 in the noisy channel case. Furthermore, convergence in step (7) may be checked by tracking the MSE and terminating the iterations when the relative improvement is small enough. By construction, and ignoring issues such as numerical precision, the iterative design in Algorithm 1 always converges to a local optimum, since when the criteria in steps (5) and (6) of the algorithm are invoked, the performance can only remain unchanged or improve, given the updated indexes and codevectors. This is a common rationale behind the proof of convergence for such iterative algorithms (see e.g. [31, Lemma 11.3.1]). However, nothing can be generally guaranteed about the global optimality of this algorithm.
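The alternation described in these remarks can be sketched as follows. This is a minimal illustration rather than Algorithm 1 itself: it assumes that the MMSE estimates of the training vectors are precomputed and passed as `x_tilde`, it initializes the codebook from randomly chosen training vectors instead of the LBG splitting procedure, and it simply keeps the previous codevector when an index receives no probability mass instead of applying the splitting step for empty regions. All names are hypothetical.

```python
import numpy as np

def train_covq(x_tilde, P, n_iter=50, tol=1e-6, seed=0):
    """Alternate encoder and decoder updates on precomputed MMSE estimates."""
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    # Assumed initialization: random training vectors (the text uses LBG splitting).
    codebook = x_tilde[rng.choice(len(x_tilde), size=n, replace=False)].copy()
    history = []
    for _ in range(n_iter):
        # Encoder step: best index for every training vector, current codebook fixed.
        energy = P @ np.sum(codebook ** 2, axis=1)
        cost = energy[None, :] - 2.0 * x_tilde @ (P @ codebook).T
        idx = np.argmin(cost, axis=1)
        # Decoder step: channel-weighted centroids, encoder indexes fixed.
        sums = np.zeros_like(codebook)
        np.add.at(sums, idx, x_tilde)
        counts = np.bincount(idx, minlength=n).astype(float)
        den = P.T @ counts
        new_codebook = P.T @ sums / np.maximum(den, 1e-12)[:, None]
        codebook = np.where(den[:, None] > 0, new_codebook, codebook)
        # Empirical channel-averaged distortion between x_tilde and the decoder output.
        d2 = ((x_tilde[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        dist = float(np.mean(np.sum(P[idx] * d2, axis=1)))
        history.append(dist)
        # Stop when the relative improvement is small enough.
        if len(history) > 1 and history[-2] - dist <= tol * history[-2]:
            break
    return codebook, history
```

Because each step minimizes the same empirical distortion with the other quantity held fixed, the values stored in `history` are non-increasing, which mirrors the convergence argument above.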
IV-B Practical Comparison
Here, we offer further insights into quantization aspects through the design of conventional nearest-neighbor coding (NNC) as a representative of Figure 3, and the design of the proposed COVQ-CS method as a representative of Figure 2. The NNC for CS is often considered as a benchmark for performance evaluations.
The NNC for CS measurements is accomplished by designing a channel-optimized VQ for the input vector $\mathbf{Y}$ aiming to minimize the quantization distortion, i.e., $\mathbb{E}[\|\mathbf{Y} - \widehat{\mathbf{Y}}\|_2^2]$, where $\widehat{\mathbf{Y}}$ is the quantizer decoder output as shown in Figure 3. (Footnote: See e.g. [29] for more details regarding the design of channel-optimized VQ in a non-CS system model.) Considering the notations given for Figure 3, the design procedure of the quantizer encoder and the quantizer decoder is as follows: for a quantization rate bits/vector, a fixed codebook , with , and channel transition probability , the optimized encoding region becomes
(15)  
where is the encoding index set. Now, for the given region (15) and channel transition probability , the quantization MSEminimizing codevectors satisfy
(16) 
In order to design an encoder-decoder pair using the NNC, an iterative algorithm can be used to alternate between (15) and (16). Finally, a CS reconstruction algorithm R produces the reconstruction vector $\widehat{\mathbf{X}}$ from the quantizer decoder output $\widehat{\mathbf{Y}}$. We refer to this design method as NNC-CS.
Example 2.
In this example, we illustrate how the COVQ-CS and NNC-CS design methods differ in shaping encoding regions (given that the CS measurements are observed) and positioning codevectors (given that the channel output index is observed). For illustration purposes, we choose the input sparse vector dimension, measurement vector dimension and sparsity level as , and , respectively. The location of the non-zero coefficient is drawn uniformly at random from , and its value is a standard Gaussian RV. For implementing the COVQ-CS via Algorithm 1, the MMSE estimator (used in (8)) is calculated via the closed-form solution given in [22, eq. (27)]. We generate realizations for (and subsequently ), where the measurement noise vector is drawn from with . Then, we fix the quantization rate at bits/vector and assume a BSC with cross-over probability . For implementing the NNC-CS, an iterative algorithm is used by alternating between encoding regions (15) and codevectors (16). Finally, a CS reconstruction algorithm (here, we choose the same MMSE estimator used at the encoder of COVQ-CS) takes the NNC-CS codevectors and produces an estimate of the sparse source. In both NNC-CS and COVQ-CS schemes, the sensing matrix is chosen as
In Figure 5, we qualitatively illustrate encoding regions and codevectors using the two designs. Figure 5(a) shows the samples of CS measurements classified by encoding regions of COVQ-CS in , i.e., (8), and Figure 5(b) shows the samples of classified by the index of encoding regions (in the same color) together with the codevectors of COVQ-CS in , i.e., in (11). Figure 5(c) illustrates the encoding regions of NNC-CS, i.e., (15), together with codevectors shown by black circles, and Figure 5(d) shows the samples of the sparse source along with the codevectors of NNC-CS mapped to the 3-dimensional space using the CS reconstruction algorithm, i.e., . From the samples in the measurement space, we observe that the entries of the CS measurements are highly correlated, in this particular example, due to a large mutual coherence of the sensing matrix (). Hence, as shown in Figure 5(c), the codevectors designed by the NNC-CS (almost) lie on a single line. Although, in this case, the locations of the codevectors are optimized to minimize the quantization distortion, $\mathbb{E}[\|\mathbf{Y} - \widehat{\mathbf{Y}}\|_2^2]$, this becomes critical when the codevectors are mapped back to the source domain. From Figure 5(d), it is observed that the reconstructed codevectors, , are not only situated (approximately) on one axis but also far (in Euclidean distance) from their corresponding source samples (shown in the same color), resulting in a high end-to-end distortion. Further, if, for example, the codevector is received as due to channel noise, it produces a large end-to-end distortion. In other experiments, in the case of a noiseless channel, we observed the same trend in the location of the reconstructed codevectors (using NNC-CS) in the source domain, which also produces a large MSE in terms of the average distance between source samples and their corresponding reconstructed codevectors.
While this is the case in NNC-CS, it can be seen from Figure 5(a) that the encoding regions using COVQ-CS may not form convex sets (for example, region 3) unlike the ones using the NNC-CS. This is due to the fact that the region fixed by the rule (9) may not be a convex set in due to non-linearity in . As a result, the COVQ-CS uses the measurement space more efficiently in order to reduce end-to-end distortion, . It can be observed from Figure 5(b) that the COVQ-CS codevectors are located on different coordinates in the 3-dimensional source space to minimize the end-to-end source distortion. In addition, the codevectors are located such that the COVQ-CS design becomes more robust against channel noise, which produces smaller end-to-end distortion unlike the NNC-CS design. For example, as shown in Figure 5(b), if the codevector is chosen as at the decoder due to channel noise, it provides much less end-to-end distortion than that of the NNC-CS. Numerical performance comparison between these two schemes will be made later in Section VI-B through different simulation studies.
IV-C Complexity of COVQ-CS
We analyze the encoding computational complexity (time usage) as well as the encoder-decoder memory complexity (space usage) for the COVQ-CS. For encoding computational complexity, we calculate the number of operations, in terms of FLOPs (each addition, multiplication and comparison is represented by one floating point operation (FLOP)), required for transmitting an encoded index over the channel based on (8). In addition, for memory complexity, we calculate the memory, in terms of floats (a float is considered as a single-precision floating point unit), required for storing vector parameters at the encoder and decoder.
The encoding complexity for computing the argument in (8) requires one FLOP for calculating the subtraction as well as FLOPs ( multiplications and additions) for calculating the inner product in the second term. Thus, the total complexity for the full-search minimization at the encoder is FLOPs. Note that we do not consider the complexity of the CS reconstruction algorithm since its calculation is required for all relevant quantizers for CS. Next, considering the argument in (8), the encoder needs one float to store the first constant term in (8), i.e., , and also floats to store the second term in (8), i.e., the codevector . Thus, the total encoding memory for full-search minimization is . It also follows that the decoder requires floats to store in (10).
Using high-dimensional VQ and CS, the implementation of the quantizer encoder and decoder may not be feasible, both from computational complexity and from memory complexity viewpoints. The complexity can be reduced by exploiting sub-optimal approaches (with respect to (4)) such as multi-stage VQ (MSVQ), which splits a single VQ into multiple VQs at different stages. In the next section, we focus on the design of JSCC strategies for CS measurements using MSVQ.
V Joint SourceChannel MSVQ for CS
The need to retain the advantages of VQ while effectively reducing its encoding complexity has led to the development of multistage VQ (MSVQ).
VA System Description and Performance Criterion
In this section, we describe the basic assumptions and models of the investigated system depicted in Figure 6. We illustrate an stage VQ, where is the maximum number of stages. Our MSVQ system model basically follows that of [32].
More specifically, we consider the () stage with allocated bits/vector, where , and is the total available quantization rate. Indeed, adjusts a tradeoff between complexity and performance of MSVQ. A quantizer encoder, at stage , accepts the measurement vector and the encoded index from the stage as inputs, then maps them into an integer index with . Therefore, the –stage encoder is described by a mapping such that
(17) 
where is called the –stage encoding region. The region might be a connected set or union of some connected sets in . We also make the assumption that .
The encoded index is transmitted over a DMC (independent of other channels) with transition probabilities
(18) 
where denotes the channel output at the stage.
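As an illustration of a DMC acting on quantization indexes, the following sketch builds the transition probabilities for the special case where the index bits are sent over independent uses of a binary symmetric channel. This channel choice, the function names and the parameter `eps` (bit crossover probability) are assumptions made for the example only.

```python
import random


def bsc_transition_matrix(rate_bits, eps):
    """Return P with P[i][j] = P(j | i) for rate_bits-bit indexes over a BSC."""
    m = 1 << rate_bits
    P = []
    for i in range(m):
        row = []
        for j in range(m):
            d = bin(i ^ j).count("1")  # Hamming distance between i and j
            row.append((eps ** d) * ((1.0 - eps) ** (rate_bits - d)))
        P.append(row)
    return P


def transmit(i, P, rng):
    """Draw a channel output index j according to row i of P."""
    return rng.choices(range(len(P[i])), weights=P[i])[0]


# Usage: 2-bit indexes with a 10% bit crossover probability.
P = bsc_transition_matrix(2, 0.1)
received = transmit(3, P, random.Random(0))
```

Each stage of the MSVQ would use its own matrix of this kind, with the number of bits allocated to that stage.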
Next, a decoder accepts the noisy index , and provides an estimate of the quantization error according to an available codebook set. Formally, the –stage decoder is defined by a mapping where denotes a codebook set consisting of reproduction codevectors, i.e., , thus
(19) 
We denote the output of the stage decoder by , and the final reconstructed vector by .
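A minimal sketch of the multistage decoding step is given below. It assumes, as is usual for MSVQ, that the final reconstruction is the sum of the stage outputs, each stage refining the error left by the previous ones. The function name `msvq_decode` and the toy codebooks are hypothetical.

```python
def msvq_decode(received, codebooks):
    """Sum the codevectors selected by the received (possibly noisy) indexes.

    received  : list of stage indexes, one per stage
    codebooks : codebooks[l][j] is codevector j of stage l (list of floats)
    """
    n = len(codebooks[0][0])
    x_hat = [0.0] * n
    for j, stage_codebook in zip(received, codebooks):
        # Stage output: a table lookup, added to the running reconstruction.
        x_hat = [a + b for a, b in zip(x_hat, stage_codebook[j])]
    return x_hat


# Usage: a coarse first stage followed by a refinement stage.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[-0.25, 0.25], [0.25, -0.25]],
]
x_hat = msvq_decode([1, 0], codebooks)
```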
We are interested in designing the quantizers in the system of Figure 6 using the endtoend MSE criterion defined in (4). Nevertheless, it is not easy to find optimal encoders (by fixing the decoders) and decoders (by fixing encoders) for all the stages jointly with respect to minimizing (4). Therefore, we define a new performance criterion as
(20) 
Using the performance criterion in (20), we assume that the stage only observes the previous stages. Applying , we derive necessary encoding and decoding policies for optimality (with respect to (20)) at stage (). Then, the encoderdecoder pairs at the next stages are designed sequentially, one after another. In the sequential optimization at stage , we assume that the subsequent codevectors are populated with zeros. This assumption means that the sequential design is suboptimal with respect to (4), and the resulting conditions would lead to neither a global nor a local minimum of the endtoend MSE. However, it provides better performance than schemes that only consider the quantization distortion at each stage separately.
VB Optimality Conditions for MSVQ Encoder and Decoder
In this section, we develop encoding and decoding principles for the () stage of the MSVQ system shown in Figure 6. Following the arguments of Section IIIB, we first assume that the decoder codevectors and all encoding regions/codevectors at previous stages are fixed and known, and then find the necessary optimal encoding regions with respect to minimizing in (20) in Section VB1. Second, we fix the encoding regions , and then derive the necessary optimal codevectors in Section VB2. Finally, in Section VB3, we combine these necessary optimal conditions to develop a practical MSVQ design algorithm referred to as channeloptimized MSVQ for CS (COMSVQCS).
VB1 Optimal Encoder
In order to derive encoding regions , we fix the codevectors and all the codevectors at previous stages. First, let us define
(21) 
Now, in (20) can be rewritten as
(22)  
where follows from marginalization of over and , and the fact that , and otherwise the probability is zero.
Thus, , the optimized index, denoted by , is attained by (23), where follows from the Markov chain (), hence, which is pulled out of the optimization. Also, follows from the Markov chain , .
(23)  
VB2 Optimal Decoder
In order to derive codevectors , we fix encoding regions and all prior codebook sets. Therefore, applying in (20), it is straightforward to show that the optimal –stage codevectors, denoted by , are obtained as
(25) 
Similar to the steps taken in (11), the codevectors (25) can be parameterized in terms of encoding regions, channel transition probabilities and MMSE estimation. Here, for the sake of analysis, we only provide closedform codebook expressions for which are given by
Finally, we note that when , the condition (25) simplifies into (11).
VB3 Training Algorithm
Similar to Algorithm 1, we can develop a practical method for training channeloptimized MSVQ for CS, coined COMSVQCS, summarized in Algorithm 2. The remarks stated for Algorithm 1 also apply to the implementation of Algorithm 2, with the difference that convergence in step (8) may be checked by tracking the distortion , and terminating the iterations when the relative improvement is small enough. Furthermore, in order to calculate the codevector () in (25), we use the MonteCarlo method by first generating a finite set of training vectors , with known pdf, and then calculating the vector . Finally, we average over those vectors that have resulted in the index .
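The averaging step can be sketched as follows. The sketch assumes the standard channel-optimized centroid form, in which the average over each encoding region is weighted by the probability that the channel maps the corresponding transmitted index to the received one. The function name and argument names are hypothetical, and `samples` stands for whatever per-training-vector quantity is averaged in (25).

```python
def channel_optimized_centroids(samples, indices, trans):
    """Monte Carlo estimate of channel-optimized codevectors.

    samples : list of T training vectors (lists of N floats)
    indices : indices[t] is the index assigned by the encoder to sample t
    trans   : trans[i][j] = P(j | i), channel transition probabilities
    """
    m, n = len(trans), len(samples[0])
    sums = [[0.0] * n for _ in range(m)]
    counts = [0] * m
    for s, i in zip(samples, indices):
        counts[i] += 1
        for k in range(n):
            sums[i][k] += s[k]
    centroids = []
    for j in range(m):
        # Empirical weight of receiving j: sum_i P(j|i) * (number of samples in region i)
        weight = sum(trans[i][j] * counts[i] for i in range(m))
        if weight == 0.0:
            centroids.append([0.0] * n)  # index j is never received; arbitrary choice
        else:
            centroids.append(
                [sum(trans[i][j] * sums[i][k] for i in range(m)) / weight
                 for k in range(n)]
            )
    return centroids


# Usage: three training vectors, two indexes, noiseless channel.
samples = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]]
indices = [0, 1, 1]
clean = channel_optimized_centroids(samples, indices, [[1.0, 0.0], [0.0, 1.0]])
```

With a noiseless channel the result is the plain per-region mean. With a completely uninformative channel every codevector collapses to the overall mean, which shows how channel noise pulls the codevectors together.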
VC Complexity of COMSVQCS
In order to calculate the MSVQ encoder complexity, we calculate the number of operations at the encoder based on (24). Here, the computational complexity of the CS reconstruction algorithm is not considered. We consider the argument of (24), which requires two FLOPs for the subtraction and addition, and also FLOPs for computing the second inner product term. Note that the first constant term and the third inner product term can be computed offline, and they are not counted in our complexity analysis. Thus, in total, the COMSVQCS encoder requires operations, where () is the quantization rate available at a stage and is the total number of stages such that .
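The source of the saving can be made concrete by counting candidate indexes. The sketch below compares a full search over all indexes of a single VQ with a sequential search over the stages of an MSVQ at the same total rate. It assumes that the cost per candidate is comparable in both schemes, and the function name `search_sizes` is hypothetical.

```python
def search_sizes(stage_bits):
    """Return (full-search candidates, multistage candidates) for a bit split."""
    total_bits = sum(stage_bits)
    full_search = 2 ** total_bits                      # single VQ at the total rate
    multistage = sum(2 ** bits for bits in stage_bits)  # one search per stage
    return full_search, multistage


# Usage: 8 bits/vector in total, split evenly over two stages.
full_search, multistage = search_sizes([4, 4])
```

For this split the single VQ examines 256 candidates while the two-stage quantizer examines 32, at the price of the suboptimality discussed in Section VA.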
It can also be shown that at stage , the encoder requires one float to store the first term in (24), i.e., , floats to store the second term in (24), i.e., , and also floats for storing the third term in (24). Therefore, considering stages, the total encoding memory of the COMSVQCS is . Now, we consider the decoder memory complexity. Each decoder at stage requires floats to store the codevector , considering the fact that the memory for storing the codebooks of previous stages has already been accounted for. Hence, the decoder storage memory is