# Quantized Compressive Sensing with RIP Matrices: The Benefit of Dithering

Chunlei Xu and Laurent Jacques

CX and LJ are with the Image and Signal Processing Group (ISPGroup), ICTEAM/ELEN, Université catholique de Louvain (UCL). E-mail: {chunlei.xu,laurent.jacques}@uclouvain.be. The authors are funded by the Belgian F.R.S.-FNRS. Part of this study is funded by the project AlterSense (MIS-FNRS).

###### Abstract

In Compressive Sensing theory and its applications, quantization of signal measurements, as integrated into any realistic sensing model, impacts the quality of signal reconstruction. In fact, there even exist incompatible combinations of quantization functions (e.g., the 1-bit sign function) and sensing matrices (e.g., Bernoulli) that cannot lead to an arbitrarily low reconstruction error when the number of observations increases.

This work shows that, for a scalar and uniform quantization, provided that a uniform random vector, or random dithering, is added to the compressive measurements of a low-complexity signal (e.g., a sparse or compressible signal, or a low-rank matrix) before quantization, a large class of random matrix constructions known to respect the restricted isometry property (RIP) are made “compatible” with this quantizer. This compatibility is demonstrated by the existence of (at least) one signal reconstruction method, the projected back projection (PBP), whose reconstruction error is proved to decay when the number of quantized measurements increases.

Despite the simplicity of PBP, which amounts to projecting the back projection of the compressive observations (obtained from their multiplication by the adjoint sensing matrix) onto the low-complexity set containing the observed signal, we also prove that given a RIP matrix and for a single realization of the dithering, this reconstruction error decay is also achievable uniformly for the sensing of all signals in the considered low-complexity set.

We finally confirm empirically these observations in several sensing contexts involving sparse signals, low-rank matrices, and compressible signals, with various RIP matrix constructions such as sub-Gaussian random matrices and random partial Discrete Cosine Transform (DCT) matrices.

## 1 Introduction

Compressive sensing (CS) theory [1, 2, 3] has shown us how to compressively and non-adaptively sample low-complexity signals, such as sparse vectors or low-rank matrices, in high-dimensional domains. In this framework, accurate estimation of such signals from their compressive measurements is still possible thanks to non-linear reconstruction algorithms (e.g., $\ell_1$-norm minimization, greedy algorithms) exploiting the low-complexity nature of the signal. In other words, by generalizing the concepts of sampling and reconstruction, CS has somehow extended Shannon-Nyquist theory, initially restricted to the class of band-limited signals.

Specifically, given a sensing (or measurement) matrix $\Phi \in \mathbb{R}^{m \times n}$ with $m < n$, CS describes how one can recover a signal $x \in \mathbb{R}^n$ from the $m$ measurements associated with the underdetermined linear model (some of the mathematical notations and conventions used below are defined at the end of this section)

 y=Φx+n, (1)

where $y \in \mathbb{R}^m$ is the measurement vector, $n \in \mathbb{R}^m$ stands for a possible additive measurement noise, and $x$ is assumed restricted to a low-complexity signal set $\mathcal{K} \subset \mathbb{R}^n$, e.g., the set of $k$-sparse vectors in an orthonormal basis. In particular, it has been shown that the recovery of $x$ is guaranteed if $\Phi$ respects the Restricted Isometry Property (RIP) over $\mathcal{K}$, which essentially states that $\Phi/\sqrt{m}$ behaves as an approximate isometry for all elements of $\mathcal{K}$ (see Sec. 3.1). Interestingly, many random constructions of sensing matrices have been proved to respect the RIP with high probability (w.h.p.; hereafter, we write w.h.p. if the probability of failure of the considered event decays exponentially with respect to the number of measurements) [4, 5, 6, 3]. For instance, if $\Phi$ is a Gaussian random matrix with entries identically and independently distributed (i.i.d.) as a standard normal distribution $\mathcal{N}(0,1)$, $\Phi$ respects the RIP over the set of $k$-sparse vectors with very high probability provided $m \gtrsim k \log(n/k)$. For a more general set $\mathcal{K}$, the RIP is verified as soon as $m$ is sufficiently large compared to the intrinsic complexity of $\mathcal{K}$ in $\mathbb{R}^n$, e.g., as measured by the squared Gaussian mean width or the Kolmogorov entropy of $\mathcal{K}$ [7, 5, 4] (see Sec. 3.1 and Sec. 7).

When the RIP is satisfied, many signal reconstruction methods (e.g., Basis Pursuit DeNoise [1], or greedy algorithms such as Orthogonal Matching Pursuit [8] or Iterative Hard Thresholding [3]) achieve a stable and robust estimate of $x$ from the sensing model (1), e.g., when $\mathcal{K}$ is the set of $k$-sparse vectors. They typically display the following reconstruction error bound, or instance optimality [9, 1, 2],

$$\|x - \hat{x}\| \;\leq\; C\,\frac{\|x - x_k\|_1}{\sqrt{k}} + D\,\epsilon, \tag{2}$$

where $\hat{x}$ is the signal estimate, $x_k$ the best $k$-term approximation of $x$, $\epsilon \geq \|n\|$ is an estimate bounding the noise energy, and $C, D > 0$ depend only on $\Phi$.

In this brief overview of CS theory, we thus see that, at least in the noiseless setting, the measurement vector is assumed represented with infinite precision. However, any realistic device model imposes digitalization and finite precision data representations, e.g., to store, transmit or process the acquired observations. In particular, (1) must be turned into a Quantized CS formalism where the objective is to reliably estimate a low-complexity signal from the quantized measurements

 y=Qg(Φx). (3)

In (3), the map applied to $\Phi x$ is a general quantization function, or quantizer, mapping $m$-dimensional vectors to vectors in a discrete set, or codebook.

While [10] only studied uniform quantization of CS measurements as an additive, bounded noise in (1), thus inducing a constant error bound in (2) [11, 12], various kinds of quantizers have since been studied more deeply in the context of QCS [12]. Their list includes $\Sigma\Delta$-quantization [11], non-regular scalar quantizers [13], non-regular binned quantization [14, 15], and even vector quantization by frame permutation [16]. These quantizers, when combined with an appropriate signal reconstruction procedure, achieve different decay rates of the reconstruction error when the number of measurements $m$ increases. For instance, for a $\Sigma\Delta$-quantizer combined with Gaussian or sub-Gaussian sensing matrices [11], or with random partial circulant matrices generated by a sub-Gaussian random vector [17], this error can decay polynomially in $m$ for an appropriate reconstruction procedure, and, in the case of a 1-bit quantizer, adapting the sign quantizer by inserting adaptive thresholds in it can even lead to exponential decay [18].

In this paper, our objective is, however, not to focus on optimizing the quantizer to achieve the best decay rate for the reconstruction error of some appropriate algorithm when $m$ increases. Actually, our aim is to show that a simple scalar quantization procedure, i.e., a uniform quantizer, applied componentwise onto vectors (or entry-wise on matrices), is compatible with the large class of sensing matrices known to satisfy the RIP, provided that we combine the quantization with a random, uniform pre-quantization dithering [13, 19, 20]. This access to a broader set of sensing matrices for QCS, i.e., not only restricted to unstructured sub-Gaussian random constructions, is indeed desirable in many CS applications where specific, structured sensing matrices are constrained by technology or physics, such as (random) partial Fourier/DCT matrices in magnetic resonance imaging [21], in radio-astronomy [22] or in radar or communication applications [23, 24]. Moreover, in this context, we focus on the estimation of signals belonging to a general low-complexity set $\mathcal{K}$ in $\mathbb{R}^n$, e.g., the set of sparse or compressible vectors, the set of low-rank matrices, or any set having a small Kolmogorov entropy (see Sec. 3.1 and Sec. 7), provided that this set also supports the RIP of $\Phi$, i.e., we want the reconstruction guarantees of QCS to reduce to those of CS if the quantization disappears (e.g., when its precision becomes infinite).

Mathematically, our work considers the problem of estimating a signal from the QCS model

$$y = A(x) = A(x;\Phi,\xi) := \mathcal{Q}(\Phi x + \xi), \tag{4}$$

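where $A: \mathbb{R}^n \to \delta\mathbb{Z}^m$ is a quantized random mapping, $\mathcal{Q}(\cdot) := \delta\lfloor \cdot/\delta \rfloor$ is a uniform scalar quantization of resolution $\delta > 0$ applied componentwise (the term “resolution” does not refer here to the number of bits used to encode the quantization bins [25]), and $\xi \in \mathbb{R}^m$ is a uniform random dithering vector whose components are i.i.d. as a uniform distribution over $[0,\delta]$, i.e., $\xi_i \sim \mathcal{U}([0,\delta])$ for $i \in [m]$, or, more briefly, $\xi \sim \mathcal{U}^m([0,\delta])$.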
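As a minimal sketch of the QCS model (4), the snippet below builds the quantized mapping under two assumptions that are not guaranteed to match the paper's exact conventions: the quantizer is taken as $\delta\lfloor t/\delta\rfloor$ applied entry-wise, and the dithering is drawn uniformly in $[0,\delta]$. The function names, the Gaussian matrix and the dimensions are illustrative choices only.

```python
import numpy as np

def uniform_quantizer(t, delta):
    """Entry-wise uniform quantizer of resolution delta (assumed form: delta * floor(t / delta))."""
    return delta * np.floor(t / delta)

def quantized_mapping(x, Phi, xi, delta):
    """Dithered quantized mapping A(x) = Q(Phi x + xi), as in the QCS model (4)."""
    return uniform_quantizer(Phi @ x + xi, delta)

rng = np.random.default_rng(0)
m, n, delta = 64, 256, 0.5           # illustrative sizes and resolution
Phi = rng.standard_normal((m, n))    # Gaussian matrix used here as a stand-in RIP matrix
x = np.zeros(n)
x[:4] = [1.0, -0.5, 0.25, 0.1]       # a 4-sparse test signal
xi = rng.uniform(0.0, delta, size=m) # uniform dithering, one draw per measurement
y = quantized_mapping(x, Phi, xi, delta)

# Each measurement lies on the lattice delta * Z, at most delta below the dithered value.
residual = (Phi @ x + xi) - y
print(residual.min(), residual.max())
```

With this floor-based convention, the quantization error on each dithered measurement lies in $[0,\delta)$.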

The compatibility mentioned above between the QCS model (4) and the class of RIP matrices is demonstrated by showing that a simple (often non-iterative) reconstruction method, the Projected Back Projection (PBP) of the quantized measurements onto the set , i.e., finding the closest point in to the back projection for any (see Sec. 4), achieves a reconstruction error that decays like when increases, for some only depending on .

For instance, we prove in Sec. 7 that, given a RIP matrix and a fixed signal , if the dithering is random and uniform over , then one achieves, w.h.p., when is the set of sparse vectors, the set of low-rank matrices (up to the identification of these matrices with their vector representation), or any finite union of low-dimensional subspaces, as with model-based CS schemes [26] or group-sparse signal models [27]. Interestingly, for these specific sets, the same error decay rate is proved, up to extra log factors in the involved dimensions, in a uniform setting, i.e., when the randomly generated allows the estimation of all vectors of w.h.p. More generally, if is a convex and bounded set of , e.g., the set of compressible signals , we observe that and in the non-uniform and in the uniform setting, respectively.

Whether other reconstruction algorithms can reach a faster error decay is a matter of future study. In this regard, PBP can be seen as a reconstruction principle providing an initial guide for more advanced reconstruction algorithms, e.g., iteratively enforcing the consistency of the estimate with the observations from an initial guess provided by PBP [28, 29, 30, 31, 32].

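In all our developments, the importance of the random dithering in the QCS model (4) finds its origin in the simple observation that, for $\xi \sim \mathcal{U}([0,\delta])$, $\mathbb{E}\,\mathcal{Q}(\lambda + \xi) = \lambda$ for all $\lambda \in \mathbb{R}$ (see Lemma A.1 in Appendix A). By the law of large numbers, this means that for $m$ different r.v.’s $\xi_i \sim \mathcal{U}([0,\delta])$ with $i \in [m]$ and $m$ increasingly large, an arbitrary projection $\frac{1}{m}\langle \mathcal{Q}(a + \xi) - a, b\rangle$ of the vector $\mathcal{Q}(a+\xi) - a$ for some vector $a \in \mathbb{R}^m$ onto a fixed direction $b \in \mathbb{R}^m$ tends to its expectation, which is zero, when $m$ increases. Moreover, this effect should persist for all $a$ and $b$ selected in a set whose dimension is small compared to $m$, and, in our case of interest, if these vectors are selected in the image of a low-complexity set $\mathcal{K}$ by a RIP matrix $\Phi$.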
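The effect of the dithering on the expectation of the quantizer can be checked numerically. The following sketch is a Monte Carlo illustration, assuming the floor-based quantizer $\delta\lfloor t/\delta\rfloor$ and a dithering uniform on $[0,\delta]$; the value of `lam` and the sample size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
delta = 1.0        # quantizer resolution (illustrative)
lam = 0.3          # scalar to be quantized (illustrative)

# Without dithering, the floor quantizer always returns the same biased value.
undithered = delta * np.floor(lam / delta)

# With a uniform dither on [0, delta], the quantized value is random but unbiased.
xi = rng.uniform(0.0, delta, size=200_000)
dithered_mean = np.mean(delta * np.floor((lam + xi) / delta))

print(undithered, dithered_mean)
```

The undithered output is stuck at the lower bin edge, whereas the average of the dithered outputs approaches `lam` as the number of draws grows.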

In order to accurately bound the deviation between these projections and zero, we prove using tools from measure concentration theory (and some extra care to deal with the discontinuities of $\mathcal{Q}$) that, given a resolution $\delta > 0$ and for $m$ large compared to the intrinsic complexity of $\mathcal{K}$ (as measured by its Kolmogorov entropy), the quantized random mapping $A$ associated with a RIP matrix $\Phi$ and a random dithering $\xi$ respects, w.h.p., the Limited Projection Distortion property over $\mathcal{K}$, or LPD, defined by

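$$\tfrac{1}{m}\,\big|\langle A(u), \Phi v\rangle - \langle \Phi u, \Phi v\rangle\big| \;\leq\; \nu, \quad \forall u, v \in \mathcal{K} \cap \mathbb{B}^n,$$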

where is a certain distortion depending on , , and . In fact, we will see in Sec. 6 that if the dithering is random and uniform, where is an arbitrary small distortion impacting the requirement on . For instance, forgetting all other dependencies, for the set of sparse vectors as classically established for ensuring the RIP of a Gaussian random matrix [4] (see Sec. 6 and Sec. 7.3). Moreover, by localizing the LPD on a fixed , the impact of quantization is reduced and , as deduced in Sec. 5.

Interestingly, the LPD is useful to characterize the reconstruction error of PBP. This is easily understood in the case of the estimation of -sparse signals in . Postponing the accurate proof of this fact to Sec. 4, we first observe that if respects the RIP over with distortion one can show that for all (see Lemma 3.5). Therefore, if satisfies the LPD property over the same set with distortion , then, a simple use of the triangular inequality provides

Therefore, for a bounded sparse signal , its estimate provided by the PBP of the quantized observations is such that is the best -sparse approximation of . However, it is also the best -sparse approximation of the -sparse vector whose entries are equal to those of if they are indexed in and to 0 otherwise. Therefore, . Moreover, by the definition of the -norm,

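$$\|x - \bar{a}\| = \sup_{v \in \mathbb{B}^n} \langle v, x - \bar{a}\rangle = \sup_{v \in \Sigma^n_T \cap \mathbb{B}^n} \langle v, x - a\rangle = \sup_{v \in \Sigma^n_T \cap \mathbb{B}^n} \langle v, x\rangle - \tfrac{1}{m}\langle \Phi v, A(x)\rangle,$$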

with the set of vectors in supported on , i.e., the support of . Consequently, since is at most -sparse, the LPD of over with distortion provides finally the bound on the reconstruction error of PBP.

The rest of the paper is structured as follows. We present in Sec. 2 a few works related to our study, namely former usages of the PBP method in 1-bit CS, other definitions in 1-bit CS and in non-linear CS of matrix properties similar to our definition of the LPD, and certain known reconstruction error bounds of PBP and related algorithms for a few QCS and non-linear sensing contexts. In this presentation of the state of the art, we note that all works are based on (sub)Gaussian random projections of signals altered by quantization or other non-linear disturbances, with one notable exception using a subsampled Gaussian circulant sensing matrix [33]. After having introduced a few preliminary concepts in Sec. 3, such as the characterization of low-complexity spaces, the PBP method and the formal definition of the (L)LPD, Sec. 4 establishes the reconstruction error bound of PBP when the LPD of $A$ and the RIP of $\Phi$ are both verified. We carry out this analysis for three kinds of low-complexity sets, namely, finite unions of low-dimensional spaces (e.g., the set of (group) sparse signals), the set of low-rank matrices, and the (unstructured) case of a general bounded convex set. In Sec. 5, we prove that the L-LPD holds w.h.p. over low-complexity sets for a linear sensing model corrupted by additive sub-Gaussian noise. This analysis will later simplify the characterization of PBP of QCS observations in the non-uniform case, when the observed signal is fixed prior to the generation of the random dithering. In Sec. 6, we prove that the quantized random mapping integrating a uniform random dithering respects the (uniform) LPD w.h.p. provided $m$ is large compared to the complexity of $\mathcal{K}$. From the results of these last two sections, we instantiate in Sec. 7 the general bounds found in Sec. 4 and establish the decay rate of the PBP reconstruction error when $m$ increases for the same low-complexity sets considered in Sec. 4 and for several classes of RIP matrices including sub-Gaussian and structured random sensing matrices. Finally, in Sec. 8, we numerically validate the error distortions via the PBP over the special sets discussed in Sec. 7.

#### Conventions and notations:

We find it useful to finish this introduction with the conventions and notations used throughout this paper. We denote vectors and matrices with bold symbols, e.g.,   or , while lowercase light letters are associated with scalar values. The identity matrix in reads and the zero vector , its dimension being clear from the context. The  component of a vector (or of a vector function)  reads either  or , while the vector  may refer to the  element of a set of vectors. The set of indices in  is  and the support of is . The Kronecker symbol is denoted by and is equal to 1 if and to 0 otherwise, while the indicator of a set is equal to 1 if and to 0 otherwise. For any  of cardinality , denotes the restriction of to , while is the matrix obtained by restricting the columns of  to those indexed by . The complement of a set reads . For any , the -norm of  is with . The -sphere in  in is , and the unit ball reads . For , we write and . By extension, is the Frobenius unit ball of matrices with , where the Frobenius norm is associated with the scalar product through , for two matrices . The common flooring operator is denoted .

An important feature of our study is that we do not pay particular attention to constants in the many bounds developed in this paper. For instance, the symbols are positive and universal constants whose values can change from one line to the other. We also use the ordering notations (or ), if there exists a such that (resp. ) for two quantities and .

Concerning statistical quantities, and  denote an  random matrix or an -length random vector, respectively, whose entries are identically and independently distributed (or ) as the probability distribution , e.g., (or ) is the distribution of a matrix (resp. vector) whose entries are as the standard normal distribution  (resp. the uniform distribution ). We also use extensively the sub-Gaussian and sub-exponential characterization of random variables (or r.v.) and of random vectors detailed in [34]. The sub-Gaussian and the sub-exponential norms of a random variable are thus denoted by and , respectively, with the Orlicz norm for . The random variable is therefore sub-Gaussian (or sub-exponential) if (resp. ).

## 2 Related works

We now provide a comparison of our work with the most relevant literature in the fields of 1-bit and quantized compressive sensing, and in signal recovery from non-linear sensing models. All the results presented below are summarized in Table 1, which reports, amongst other aspects, the sensing model, the algorithm, the type of admissible sensing matrices and the low-complexity sets chosen in each of the referenced works.

#### PBP in 1-bit CS:

Recently, signal reconstruction via projected back projection has been studied in the context of 1-bit compressive sensing (1-bit CS), an extreme QCS scenario where only the signs of the compressive measurements are retained [40, 18, 31, 39]. In this case (3) is turned into

 y=sign(Φx). (5)

It has been shown that if the sensing matrix satisfies the sign product embedding property (SPE) over [39, 31], that is, up to some distortion and some universal normalization ,

$$\big|\tfrac{\mu}{m}\langle \operatorname{sign}(\Phi u), \Phi v\rangle - \langle u, v\rangle\big| \;\leq\; \epsilon, \tag{SPE}$$

for all , then the reconstruction error of the PBP of is bounded by [31, Prop. 2]. In other words, for a signal with unknown norm, since the binary measurements are invariant under a positive renormalization of the signal [40], the PBP method allows us to estimate the direction of a sparse signal, but not its norm. This remains true for all methods assuming to be of unit norm, as those explained below.

So far, the SPE property has only been proved for Gaussian sensing matrices, with i.i.d. standard normal entries and . Such matrices respect the SPE with high probability if , conferring to PBP a (uniform) reconstruction error decay of when increases for all . Besides, by localizing the SPE to a given , a non-uniform variant of the previous result, i.e., where is randomly drawn conditionally to the knowledge of , gives a faster error decay of [31, Prop. 2].

For more general low-complexity set (such that ) and with , provided is a random sub-Gaussian matrix (with i.i.d. centered sub-Gaussian random entries of unit variance [34]), a variant of the PBP method, which amounts to finding the vector maximizing its scalar product with  (in principle, this is similar, but not equivalent, to finding the closest point of , in the Euclidean sense, to the back projection , as in the PBP), is proved to reach, with high probability, small reconstruction error [38], and this even if the binary sensing model (5) is noisy (e.g., with possible random sign flips on a small percentage of the measurements). In fact, this error decays like when increases, with depending only on the level of measurement noise, on the distribution of , and on (actually, on its Gaussian mean width, see Sec. 7), while is associated with the non-Gaussian nature of the sub-Gaussian random matrix (i.e., if it is Gaussian) [38, Thm 1.1]. Therefore, as a price to pay for having a 1-bit sub-Gaussian sensing (e.g., 1-bit Bernoulli sensing), the reconstruction error is no longer guaranteed to decrease below a certain floor level , and this level is driven by the sparsity of (i.e., it is high if is very sparse). In fact, in the case of 1-bit Bernoulli sensing, there exist counterexamples of -sparse signals that cannot be reconstructed, i.e., with constant reconstruction error if increases, showing that the bound above is tight [39].

More recently, [33] has shown that, if is a subsampled Gaussian circulant sensing matrix in the binary observation model (5), PBP can reconstruct the direction of any sparse vector up to an error decaying as (see [33, Thm 4.1]). Moreover, by adding a random dithering to the linear random measurements before their binarization, the same authors proved that a second-order cone program (CP) can fully estimate the vector , i.e., not only its direction. For the same subsampled circulant sensing matrix, they also proved that their results extend to the dithered, uniformly quantized CS expressed in (4). In fact, with high probability, and for all effectively sparse signals , i.e., such that is small, the same CP program achieves a reconstruction error decaying like when increases. Their only requirement is that, first, the dithering is made of the addition of a Gaussian random vector with a uniform random vector adjusted to the quantization resolution, and second, that is smaller than [33, Thm 6.2].

Along the same lines, [41, 18] have also shown that, for Gaussian random sensing matrices, adding an adaptive or random dithering to the compressive measurements of a signal before their binarization allows accurate reconstruction of this signal, i.e., of both its norm and its direction, using either PBP or the same CP program as in [33]. Additionally, for random observations altered by an adaptive dithering before their 1-bit quantization, i.e., in a process close to noise shaping or $\Sigma\Delta$-quantization [11, 12], an appropriate reconstruction algorithm can achieve an exponential decay of its error in terms of the number of measurements. This is only demonstrated, however, in the case of Gaussian sensing matrices and for sparse signals.

#### QCS and other non-linear sensing models:

The (scalar) QCS model (4) can be seen as a special case of the more general, non-linear sensing model , with the random non-linear function such that , for some random function  [35, 37]. In the QCS context defined in (4), this non-linear sensing model corresponds to setting with .

In [37], the authors proved that, for a Gaussian random matrix and for a bounded, star-shaped set (a set $\mathcal{K}$ is star-shaped if, for any $\lambda \in [0,1]$, $\lambda\mathcal{K} \subset \mathcal{K}$), provided that leads to finite moments , , and with , and provided that is sub-Gaussian with finite sub-Gaussian norm [34], one can estimate with high probability from the solution of the PBP of in (58) (see [37, Thm 9.1]). In the specific case where matches the QCS model (4), this analysis proves that for Gaussian random matrix , the PBP of QCS observations estimates the direction with a reconstruction error decaying like when increases (the details of this analysis are given in App. B).

A similar result is obtained in [35] for the estimate provided by a K-Lasso program, which finds the element of minimizing the -cost function , when , and under the similar hypotheses on the non-linear corruption than above (i.e., with finite moments , , and  ). Of interest for this work, [35] introduced a form of the (local) LPD (see Sec. 1 and Sec. 3) in the case where and is a Gaussian random matrix (with possibly unknown covariance between rows). The authors indeed analyzed when, for some ,

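$$\tfrac{1}{m}\big(\langle f(\Phi x), \Phi v\rangle - \langle \Phi \mu x, \Phi v\rangle\big) \;\lesssim\; \epsilon, \quad \forall v \in D^* = D \cap \mathbb{B}^n, \tag{6}$$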

with being the tangent cone of at . An easy rewriting of [35, Proof of Thm 1.4] then essentially shows that the RIP of over combined with (6) provides . In particular, thanks to the Gaussianity of , they prove that, with large probability, (6) holds (as implied by Markov’s inequality combined with [35, Lem. 4.3]) with , with the Gaussian mean width of measuring its intrinsic complexity (see Sec. 3.1). For instance, if and if is such that , this proves that . Correspondingly, if the ’s are thus selected to match the QCS model with a Gaussian sensing , this shows that K-Lasso achieves a non-uniform reconstruction error decay of if the Gaussian mean width of the tangent cone can be bounded (e.g., for sparse or compressible signals, or for low-rank matrices).

In other words, when instantiated to our specific QCS model, but only in the context of a Gaussian random matrix and with some restrictions on the norm of , the non-uniform reconstruction error decays of PBP and K-Lasso in [37] and [35], respectively, are similar to the one achieved in our work (see Sec. 7).

## 3 Preliminaries

### 3.1 Low-complexity spaces

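In this work, our ability to estimate a signal from the QCS model (4) is developed on the hypothesis that this signal belongs to a “low-complexity” set $\mathcal{K} \subset \mathbb{R}^n$. More precisely, we first suppose that, for any radius $\eta > 0$, the restriction $\mathcal{K} \cap \mathbb{B}^n$ of $\mathcal{K}$ to the $\ell_2$-ball $\mathbb{B}^n$ can be covered by a relatively small number of translated $\ell_2$-balls of radius $\eta$ (note that if $\mathcal{K}$ is bounded, all our developments can be rescaled in order to directly analyze $\mathcal{K}$ instead of $\mathcal{K} \cap \mathbb{B}^n$). In other words, we assume that $\mathcal{K} \cap \mathbb{B}^n$ has a small Kolmogorov entropy $\mathcal{H}(\mathcal{K} \cap \mathbb{B}^n, \eta)$ compared to $n$ [42], with, for any bounded set $\mathcal{S}$,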

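$$\mathcal{H}(\mathcal{S}, \eta) := \log \min\{|\mathcal{G}| : \mathcal{G} \subset \mathcal{S} \subset \mathcal{G} + \eta\,\mathbb{B}^n\},$$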

where the addition is the Minkowski sum between sets. Most of the time, e.g., if is a low-dimensional subspace of (or a finite union of such spaces, as for the set of sparse vectors) or the set of low-rank matrices, is well controlled by standard covering arguments of  [43]. In fact, as explained in Sec. 7 and summarized in [20, Table 1], for most of these sets, we can consider that , where is the Gaussian mean width of a bounded set [39, 44] defined by

$$w(\mathcal{S}) := \mathbb{E} \sup_{u \in \mathcal{S}} |\langle g, u\rangle|, \quad g \sim \mathcal{N}(0, I_n).$$

Interestingly, we have indeed for a large number of low-complexity sets , such as those mentioned above. For instance, , and the square Gaussian mean width of bounded, square rank- matrices with entries is bounded by (see e.g., [45, 44] and [20, Table 1]).
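To give a feel for the Gaussian mean width, the sketch below estimates it by Monte Carlo for the set of $k$-sparse vectors restricted to the unit ball. It relies on the fact that, for a fixed Gaussian vector $g$, the supremum of $|\langle g, u\rangle|$ over $k$-sparse unit vectors equals the $\ell_2$-norm of the $k$ largest-magnitude entries of $g$. The function name, the dimensions and the comparison with $k\log(n/k)$ are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def sparse_ball_mean_width(n, k, trials=2000, seed=0):
    """Monte Carlo estimate of the Gaussian mean width of k-sparse vectors in the unit ball."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((trials, n))
    # Keep, for each draw, the k entries of largest magnitude (in arbitrary order).
    top_k = np.partition(np.abs(g), n - k, axis=1)[:, n - k:]
    # The supremum over k-sparse unit vectors is the l2 norm of these k entries.
    return np.linalg.norm(top_k, axis=1).mean()

n, k = 1000, 10
w = sparse_ball_mean_width(n, k)
print("squared width:", w**2, " k*log(n/k):", k * np.log(n / k))
```

The squared estimate stays within a small constant factor of $k\log(n/k)$, far below the ambient dimension $n$.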

When does not belong to these easy cases, Sudakov minoration provides the (generally) looser bound [39, 44]. The analysis of both and for dithered QCS will be further investigated in Sec. 7.

Another implicit assumption we make on the set , or on its -multiple for some (see Sec. 7), is that it is compatible with the RIP of . In other words, given a distortion , we assume respects the RIP defined by

$$\big|\tfrac{1}{m}\|\Phi u\|^2 - \|u\|^2\big| \;\leq\; \epsilon, \quad \forall u \in \mathcal{K} \cap \mathbb{B}^n. \tag{7}$$

This assumption is backed up by a growing literature in the field of compressive sensing and we will refer to it in many places. In particular, it is known that sensing matrices with i.i.d. centered sub-Gaussian random entries satisfy the RIP if is large compared to the typical dimension of , as measured by the square Gaussian mean width of [7, 5, 3]. Note that in the case where with a cone, i.e., for all , a simple rescaling argument provides the usual formulation of the RIP, i.e., (7) implies

$$(1-\epsilon)\|u\|^2 \;\leq\; \tfrac{1}{m}\|\Phi u\|^2 \;\leq\; (1+\epsilon)\|u\|^2, \quad \forall u \in \mathcal{K}_0. \tag{8}$$
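The RIP can be probed empirically, with the caveat that sampling random vectors only yields a lower bound on the true distortion, which is a supremum over the whole set. The sketch below, assuming a Gaussian sensing matrix with unit-variance entries and randomly drawn $k$-sparse unit vectors, measures the largest observed deviation of $\frac{1}{m}\|\Phi u\|^2$ from $\|u\|^2 = 1$. All names and sizes are hypothetical.

```python
import numpy as np

def empirical_rip_distortion(m, n, k, trials=500, seed=0):
    """Largest observed | ||Phi u||^2 / m - 1 | over random k-sparse unit vectors u."""
    rng = np.random.default_rng(seed)
    Phi = rng.standard_normal((m, n))
    U = np.zeros((trials, n))
    for row in U:  # each row is a view, so the assignment below fills U in place
        support = rng.choice(n, size=k, replace=False)
        row[support] = rng.standard_normal(k)
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    energies = np.sum((U @ Phi.T) ** 2, axis=1) / m
    return np.max(np.abs(energies - 1.0))

eps_small_m = empirical_rip_distortion(m=50, n=256, k=5)
eps_large_m = empirical_rip_distortion(m=2000, n=256, k=5)
print(eps_small_m, eps_large_m)
```

The observed distortion shrinks as $m$ grows, in line with the requirement that $m$ be large compared to the complexity of the set.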

### 3.2 Projected Back Projection

As announced in the Introduction, the standpoint of this work is to show the compatibility of a RIP matrix $\Phi$ with the dithered QCS model (4), provided that the dithering is random and uniform, through the possibility to estimate $x \in \mathcal{K}$ via the Projected Back Projection (PBP) onto $\mathcal{K}$ of the quantized observations $y = A(x)$.

Mathematically, the PBP method is simply defined by

$$\hat{x} := \mathcal{P}_{\mathcal{K}}\big(\tfrac{1}{m}\Phi^\top y\big), \tag{9}$$

where $\mathcal{P}_{\mathcal{K}}$ is the (minimal distance) projector on $\mathcal{K}$ (note that there exist cases where the minimization below could have several equivalent minimizers, e.g., if $\mathcal{K}$ is non-convex; if this happens, we just assume $\mathcal{P}_{\mathcal{K}}$ picks one of them, arbitrarily), i.e.,

$$\mathcal{P}_{\mathcal{K}}(z) \in \arg\min_{u \in \mathcal{K}} \|z - u\|.$$

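Throughout this work, we assume that $\mathcal{P}_{\mathcal{K}}$ can be computed efficiently, i.e., in polynomial complexity with respect to $m$ and $n$. For instance, if $\mathcal{K} = \Sigma^n_k$ is the set of $k$-sparse vectors, $\mathcal{P}_{\mathcal{K}}$ is the standard best $k$-term hard thresholding operator, and if $\mathcal{K}$ is convex and bounded, $\mathcal{P}_{\mathcal{K}}$ is the orthogonal projection onto this set.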
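For sparse signals, the PBP estimate (9) reduces to a back projection followed by hard thresholding. The following sketch illustrates this under several assumptions that are ours rather than the paper's: a Gaussian sensing matrix with unit-variance entries, the floor-based quantizer $\delta\lfloor t/\delta\rfloor$, a dithering uniform on $[0,\delta]$, and arbitrary dimensions. The helper names are hypothetical.

```python
import numpy as np

def hard_threshold(z, k):
    """Best k-term approximation: keep the k largest-magnitude entries of z, zero the rest."""
    out = np.zeros_like(z)
    keep = np.argpartition(np.abs(z), -k)[-k:]
    out[keep] = z[keep]
    return out

def pbp_sparse(y, Phi, k):
    """Projected back projection onto k-sparse vectors: threshold Phi^T y / m."""
    m = Phi.shape[0]
    return hard_threshold(Phi.T @ y / m, k)

def pbp_error(m, n=256, k=4, delta=0.5, seed=0):
    """Reconstruction error of PBP on one dithered, quantized sensing of a unit-norm sparse signal."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    x[rng.choice(n, size=k, replace=False)] = 1.0 / np.sqrt(k)
    Phi = rng.standard_normal((m, n))
    xi = rng.uniform(0.0, delta, size=m)
    y = delta * np.floor((Phi @ x + xi) / delta)  # dithered uniform quantization
    return np.linalg.norm(x - pbp_sparse(y, Phi, k))

err_small_m, err_large_m = pbp_error(100), pbp_error(6400)
print(err_small_m, err_large_m)
```

In this toy experiment the error drops markedly between $m = 100$ and $m = 6400$, which is the qualitative behavior the analysis aims to quantify. A single run is only an illustration and does not establish a decay rate.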

In the context where we assume the matrix $\Phi$ to be fixed and to satisfy the RIP (i.e., whatever the random construction that led to the generation of $\Phi$), the only random element in the QCS model is the dithering $\xi$. With respect to this randomness, the analysis of the reconstruction errors achieved by PBP is thus divided into two categories: uniform estimation, i.e., with high probability on the choice of the dithering, all signals in the set are estimated using the same dithering, and fixed signal estimation (or non-uniform) where, given a fixed signal, the dithering is randomly generated and, with high probability, one can estimate this signal.

### 3.3 Limited Projection Distortion

We already sketched at the end of the Introduction that a central tool of our analysis is the combination of the RIP of $\Phi$ with another property jointly verified by $(\Phi, \xi)$, or equivalently by the quantized random mapping $A$ defined in (4). As will be clear later, this property, the (local) limited projection distortion, or (L)LPD, and the RIP allow us to bound the reconstruction error of the PBP. We define it as follows.

###### Definition 3.1 (Limited Projection Distortion).

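Given a matrix $\Phi \in \mathbb{R}^{m \times n}$ and a distortion $\nu > 0$, we say that a general mapping $A: \mathbb{R}^n \to \mathbb{R}^m$ respects the limited projection distortion property over a set $\mathcal{K} \subset \mathbb{R}^n$ observed by $\Phi$, or LPD, if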

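$$\tfrac{1}{m}\,\big|\langle A(u), \Phi v\rangle - \langle \Phi u, \Phi v\rangle\big| \;\leq\; \nu, \quad \forall u, v \in \mathcal{K} \cap \mathbb{B}^n. \tag{10}$$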

In particular, when $u$ is fixed in (10), we say that $A$ respects the local limited projection distortion on $u$, or L-LPD.

###### Remark 3.2.

As explained in Sec. 2, the LPD property was (implicitly) introduced in [35] in the special case where and is a non-linear function applied componentwise on the image of . The LPD is also connected to the SPE introduced in [39] for the specific case of a 1-bit sign quantizer if we combine the LPD property with the RIP of in order to approximate by in (10) (see Lemma 3.5). This literature was however restricted to the analysis of Gaussian random matrices.

###### Remark 3.3.

In the case where $A$ is the quantized random mapping introduced in (4), a small distortion $\nu$ in (10) is meaningful in certain contexts. As proved in Sec. 6, if $\xi$ is a random uniform dithering, an arbitrarily low distortion $\nu$ is expected for large values of $m$ since, in expectation, $\mathbb{E}\langle A(u), \Phi v\rangle = \langle \Phi u, \Phi v\rangle$ from Lemma A.1. Note also that for such a random dithering, if $\delta$ tends to 0, then the quantizer tends to the identity operator and $\nu$ must vanish. In fact, by Cauchy-Schwarz and the triangle inequality, this is supported by the deterministic bound

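$$\big|\langle A(u), \Phi v\rangle - \langle \Phi u, \Phi v\rangle\big| = \big|\langle A(u) - \Phi u, \Phi v\rangle\big| \;\leq\; \|\Phi v\|\,\big(\|A(u) - (\Phi u + \xi)\| + \|\xi\|\big) \;\leq\; 2\delta\sqrt{m}\,\|\Phi v\|.$$
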
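The deterministic bound above can be verified numerically on a single draw. The sketch below assumes the floor-based quantizer $\delta\lfloor t/\delta\rfloor$ and a dithering in $[0,\delta]$, so that both the quantization error and the dither are bounded by $\delta$ entry-wise; the Gaussian matrix, the sparse test vectors and the helper `sparse_unit` are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k, delta = 200, 128, 5, 0.25   # illustrative sizes and resolution

def sparse_unit(rng, n, k):
    """Random k-sparse vector with unit l2 norm (hypothetical helper)."""
    u = np.zeros(n)
    u[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
    return u / np.linalg.norm(u)

Phi = rng.standard_normal((m, n))
xi = rng.uniform(0.0, delta, size=m)
u, v = sparse_unit(rng, n, k), sparse_unit(rng, n, k)

A_u = delta * np.floor((Phi @ u + xi) / delta)            # quantized mapping A(u)
lhs = abs(np.dot(A_u - Phi @ u, Phi @ v))                 # |<A(u) - Phi u, Phi v>|
rhs = 2 * delta * np.sqrt(m) * np.linalg.norm(Phi @ v)    # 2 delta sqrt(m) ||Phi v||

print("observed:", lhs / m, " worst-case bound:", rhs / m)
```

The observed projection distortion is typically much smaller than the worst-case bound, which is what the probabilistic analysis of the LPD exploits.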
###### Remark 3.4.

The definition of the L-LPD also includes the simple case of linear random observations corrupted by an additive noise, i.e., if $A(u) = \Phi u + \rho$ for some noise vector $\rho \in \mathbb{R}^m$. We then have $\frac{1}{m}|\langle A(u), \Phi v\rangle - \langle \Phi u, \Phi v\rangle| = \frac{1}{m}|\langle \rho, \Phi v\rangle|$, and proving the L-LPD amounts to showing that this quantity is small, as for instance in the case where $\rho$ is composed of i.i.d. sub-Gaussian random components. As will be clear later, this includes the situation where $u$ is fixed and where $A$ is the quantized random mapping (4), since then the i.i.d. r.v.’s $\rho_i = \mathcal{Q}((\Phi u)_i + \xi_i) - (\Phi u)_i$ are bounded and thus sub-Gaussian. However, this cannot be easily generalized to a uniform LPD property without considering more accurately the geometrical nature of $\mathcal{K}$. In the case of a quantized mapping, we need in particular to control the impact of the discontinuities introduced by $\mathcal{Q}$. This will be developed in Sec. 6.

The (L)LPD characterizes the proximity of scalar products between distorted and undistorted random observations in the compressed domain $\mathbb{R}^m$. In order to assess how $\frac{1}{m}\langle \Phi u, \Phi v\rangle$ approximates $\langle u, v\rangle$, we can consider this standard lemma from the CS literature (see e.g., [46]).

###### Lemma 3.5.

Given two symmetric subsets and , if is RIP with , then

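$$\big|\tfrac{1}{m}\langle \Phi u, \Phi v\rangle - \langle u, v\rangle\big| \;\leq\; 2\epsilon, \quad \forall u \in \mathcal{K}_1 \cap \mathbb{B}^n,\ \forall v \in \mathcal{K}_2 \cap \mathbb{B}^n. \tag{11}$$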

In particular, if and are two cones, we have

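$$\big|\tfrac{1}{m}\langle \Phi u, \Phi v\rangle - \langle u, v\rangle\big| \;\leq\; 2\epsilon\,\|u\|\,\|v\|, \quad \forall u \in \mathcal{K}_1,\ \forall v \in \mathcal{K}_2. \tag{12}$$
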
###### Proof.

Note that since and are symmetric, . Given and , if is RIP with , then, from the polarization identity, the fact that and from (7),

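$$\tfrac{1}{m}\langle \Phi u, \Phi v\rangle = \tfrac{1}{m}\big\|\Phi\big(\tfrac{u+v}{2}\big)\big\|^2 - \tfrac{1}{m}\big\|\Phi\big(\tfrac{u-v}{2}\big)\big\|^2 \;\leq\; \tfrac{1}{4}\|u+v\|^2 - \tfrac{1}{4}\|u-v\|^2 + 2\epsilon = \langle u, v\rangle + 2\epsilon.$$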

The lower bound is obtained similarly. A simple rescaling argument provides (12). ∎

Therefore, applying the triangle inequality, it is easy to verify the following corollary.

###### Corollary 3.6.

Given , two symmetric subsets , and . If  respects the RIP and verifies the LPD for , then

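$$\big|\tfrac{1}{m}\langle A(u), \Phi v\rangle - \langle u, v\rangle\big| \;\leq\; 2\epsilon + \nu, \quad \forall u \in \mathcal{K}_1 \cap \mathbb{B}^n,\ \forall v \in \mathcal{K}_2 \cap \mathbb{B}^n. \tag{13}$$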

The same observation holds when $u$ is fixed and the L-LPD is invoked instead of the LPD.

Note that we recover Lemma 3.5 if $A$ is identified with $\Phi$, since the LPD then holds with $\nu = 0$.

## 4 PBP reconstruction error in Distorted CS

In this section, we provide a general analysis of the reconstruction error of the estimate provided by the PBP of some general distorted CS model

 y=D(x),x∈K∩Bn. (14)

This is achieved in the context where is only assumed to respect the (L)LPD property, which, in a certain sense, characterizes the proximity of this (possibly non-linear) mapping with a RIP matrix .

Note that the results of this general study can (and will) be applied of course to the quantized, random mapping introduced in (4) (as explained in Sec. 5 and Sec. 6), but it could potentially concern other distorted sensing models, provided that the associated mapping meets the (L)LPD property.

Hereafter, we analyze the cases where the low-complexity set of the vector is a union of low-dimensional subspaces, the set of low-rank matrices, or a convex set of . Sec. 7 will later analyze these general results when is the quantized random mapping introduced in (4).

### 4.1 Union of low-dimensional subspaces

Let us first consider the reconstruction error of PBP for estimating vectors belonging to a union of low-dimensional subspaces, or ULS. In a ULS model we can write , where each is a low-dimensional subspace of for . This model encompasses, e.g., sparse signals in an orthonormal basis or in a dictionary [47, 44], co-sparse signal models [48], group-sparse signals [27] and model-based sparsity [26].

The next theorem states that the PBP reconstruction error is bounded by the addition of the distortion induced by the RIP of (as in CS) and the one provided by (L)LPD of .

###### Theorem 4.1 (PBP for ULS).

Let us consider the ULS model . Given two distortions , if respects the RIP and if the mapping satisfies the LPD, then, for all , the estimate obtained by the PBP of onto satisfies

$$\|x - \hat{x}\| \;\leq\; 4\epsilon + 2\nu.$$

Moreover, if is fixed, then the same result holds if respects the L-LPD.

###### Proof.

The proof generalizes the proof sketch given at the end of the Introduction for the reconstruction error of PBP in the case where . Since and , there must exist two subspaces and , for some such that and . Let us define , the subspace and the orthogonal complement of . We can always decompose as with and , with the projector defined in Sec. 4. Since , we have

$$\|\hat{x} - a\|^2 = \|\hat{x} - \bar{a} - \bar{a}^\perp\|^2 \;\leq\; \|x - a\|^2 = \|x - \bar{a} - \bar{a}^\perp\|^2.$$

Moreover, since both and belong to , the last inequality is equivalent to which implies . Consequently, the triangular inequality gives

$$\|x - \hat{x}\| \;\leq\; \|x - \bar{a}\| + \|\hat{x} - \bar{a}\| \;\leq\; 2\,\|x - \bar{a}\|.$$

From the assumptions of the theorem, and respect the RIP and the LPD, respectively. Therefore, from the symmetry of , using Cor. 3.6 with and , since and for all , we find

 ∥x−