Robust 1-Bit Compressive Sensing viaBinary Stable Embeddings of Sparse Vectors1footnote 11footnote 1L. J. is funded by the Belgian Science Policy “Return Grant”, BELSPO, IAP-VI BCRYPT, and by the Belgian F.R.S-FNRS. J. L. and R. B. were supported by the grants NSF CCF-0431150, CCF-0728867, CCF-0926127, CNS-0435425, and CNS-0520280, DARPA/ONR N66001-08-1-2065, N66001-11-1-4090, ONR N00014-07-1-0936, N00014-08-1-1067, N00014-08-1-1112, and N00014-08-1-1066, AFOSR FA9550-07-1-0301 and FA9550-09-1-0432, ARO MURI W911NF-07-1-0185 and W911NF-09-1-0383, and the Texas Instruments Leadership University Program. P.B. is funded by Mitsubishi Electric Research Laboratories.

# Robust 1-Bit Compressive Sensing via Binary Stable Embeddings of Sparse Vectors111L. J. is funded by the Belgian Science Policy “Return Grant”, BELSPO, IAP-VI BCRYPT, and by the Belgian F.R.S-FNRS. J. L. and R. B. were supported by the grants NSF CCF-0431150, CCF-0728867, CCF-0926127, CNS-0435425, and CNS-0520280, DARPA/ONR N66001-08-1-2065, N66001-11-1-4090, ONR N00014-07-1-0936, N00014-08-1-1067, N00014-08-1-1112, and N00014-08-1-1066, AFOSR FA9550-07-1-0301 and FA9550-09-1-0432, ARO MURI W911NF-07-1-0185 and W911NF-09-1-0383, and the Texas Instruments Leadership University Program. P.B. is funded by Mitsubishi Electric Research Laboratories.

Laurent Jacques, Jason N. Laska , Petros T. Boufounos, and Richard G. Baraniuk ICTEAM Institute, ELEN Department, Université catholique de Louvain (UCL), B-1348 Louvain-la-Neuve, Belgium. Email: laurent.jacques@uclouvain.beDropcam, Inc. San Francisco, CA. Email: jason@dropcam.comDepartment of Electrical and Computer Engineering, Rice University, Houston, TX, 70015 USA. Email: laska@rice.edu, richb@rice.eduMitsubishi Electric Research Laboratories (MERL) Email: petrosb@merl.com
June 30, 2019
###### Abstract

The Compressive Sensing (CS) framework aims to ease the burden on analog-to-digital converters (ADCs) by reducing the sampling rate required to acquire and stably recover sparse signals. Practical ADCs not only sample but also quantize each measurement to a finite number of bits; moreover, there is an inverse relationship between the achievable sampling rate and the bit-depth. In this paper, we investigate an alternative CS approach that shifts the emphasis from the sampling rate to the number of bits per measurement. In particular, we explore the extreme case of 1-bit CS measurements, which capture just their sign. Our results come in two flavors. First, we consider ideal reconstruction from noiseless -bit measurements and provide a lower bound on the best achievable reconstruction error. We also demonstrate that i.i.d. random Gaussian matrices provide measurement mappings that, with overwhelming probability, achieve nearly optimal error decay. Next, we consider reconstruction robustness to measurement errors and noise and introduce the Binary -Stable Embedding (BSE) property, which characterizes the robustness of the measurement process to sign changes. We show that the same class of matrices that provide almost optimal noiseless performance also enable such a robust mapping. On the practical side, we introduce the Binary Iterative Hard Thresholding (BIHT) algorithm for signal reconstruction from 1-bit measurements that offers state-of-the-art performance.

## 1 Introduction

Recent advances in signal acquisition theory have led to significant interest in alternative sampling methods. Specifically, conventional sampling systems rely on the Shannon sampling theorem that states that signals must be sampled uniformly at the Nyquist rate, i.e., a rate twice their bandwidth. However, the compressive sensing (CS) framework describes how to reconstruct a signal from the linear measurements

 y=Φx, (1)

where with is an underdetermined measurement system [1, 2]. It is possible to design a physical sampling system such that where is a vector of Nyquist-rate samples of a bandlimited signal , . In this case, (1) translates to low, sub-Nyquist sampling rates, providing the framework’s axial significance: CS enables the acquisition and accurate reconstruction of signals that were previously out of reach, limited by hardware sampling rates [3] or number of sensors [4].

Although inversion of (1) seems ill-posed, it has been demonstrated that -sparse signals, i.e., where , can be reconstructed exactly [1, 2]. To do this, we could naïvely solve for the sparsest signal that satisfies (1),

 x∗=argminu∈RN∥u∥0s.t.y=Φu; (RCS)

however, this non-convex program exhibits combinatorial complexity in the size of the problem [5]. Instead, we solve Basis Pursuit (BP) by relaxing the objective in (R) to the -norm; the result is a convex, polynomial-time algorithm [6]. A key realization is that, under certain conditions on , the BP solution will be equivalent to that of (R[1]. This basic reconstruction framework has been expanded to include numerous fast algorithms as well as provably robust algorithms for reconstruction from noisy measurements [7, 8, 9, 10, 11]. Reconstruction can also be performed with iterative and greedy methods [12, 13, 14].

Reconstruction guarantees for BP and other algorithms are often demonstrated for that are endowed with the restricted isometry property (RIP), the sufficient condition that the norm of the measurements is close to the norm of the signal for all sparse  [7].222The RIP is in fact not needed to demonstrate exact reconstruction guarantees in noiseless settings, however it proves quite useful for establishing robust reconstruction guarantees in noise. This can be expressed, in general terms, as a -stable embedding. Let and . We say the mapping is a -stable embedding of if

 (1−δ)∥x−s∥22≤∥Φx−Φs∥22≤(1+δ)∥x−s∥22, (2)

for all and . The RIP requires that (2) hold for all ; that is, it is a stable embedding of sparse vectors. A key result in the CS literature is that, if the coefficients of are randomly drawn from a sub-Gaussian distribution, then will satisfy the RIP with high probability as long as , for some constant  [16, 17]. Several hardware inspired designs with only a few randomized components have also been shown to satisfy this property [18, 3, 19, 20].

In practice, CS measurements must be quantized, i.e., each measurement is mapped from a real value (over a potentially infinite range) to a discrete value over some finite range. For example, in uniform quantization, a measurement is mapped to one of distinct values, where denotes the number of bits per measurement. Quantization is an irreversible process that introduces error in the measurements. One way to account for quantization error is to treat it as bounded noise and employ robust reconstruction algorithms. Alternatively, we might try to reduce the error by choosing the most efficient quantizer for the distribution of the measurements. Several reconstruction techniques that specifically address CS quantization have also been proposed [21, 22, 23, 24, 25, 26].

While quantization error is a minor inconvenience, fine quantization invokes a more burdensome, yet often overlooked source of adversity: in hardware systems, it is the primary bottleneck limiting sample rates [27, 28]. In other words, the analog-to-digital converter (ADC) is beholden to the quantizer. First, quantization significantly limits the maximum speed of the ADC, forcing an exponential decrease in sampling rate as the number of bits is increased linearly [28]. Second, the quantizer is the primary power consumer in an ADC. Thus, more bits per measurement directly translates to slower sampling rates and increased ADC costs. Third, fine quantization is more susceptible to non-linear distortion in the ADC electronics, requiring explicit treatment in the reconstruction [29]. As we have seen, the CS framework provides one mechanism to alleviate the quantization bottleneck by reducing the ADC sampling rate. Is it possible to extend the CS framework to mitigate this problem directly in the quantization domain by reducing the number of bits per measurement (bit-depth) instead?

In this paper we concretely answer this question in the affirmative. We consider an extreme quantization; just one bit per CS measurement, representing its sign. The quantizer is thus reduced to a simple comparator that tests for values above or below zero, enabling extremely simple, efficient, and fast quantization. A -bit quantizer is also more robust to a number of commonly encountered non-linear distortions in the input electronics, as long as they preserve the signs of the measurements.

It is not obvious that the signs of the CS measurements retain enough information for signal reconstruction; for example, it is immediately clear that the scale (absolute amplitude) of the signal is lost. Nonetheless, there is strong empirical evidence that signal reconstruction is possible [30, 31, 32, 29]. In this paper we develop strong theoretical reconstruction and robustness guarantees, in the same spirit as classical guarantees provided in CS by the RIP.

We briefly describe the -bit CS framework proposed in [30]. Measurements of a signal are computed via

 ¯y=A(x):=sign(Φx), (3)

where the sign operator is applied component wise on , where equals if and otherwise, for any . Thus, the measurement operator is a mapping from to the Boolean cube333Generally, the -dimensional Boolean cube is defined as . Without loss of generality, we use instead. . At best, we hope to recover signals where is the unit hyper-sphere of dimension . We restrict our attention to sparse signals on the unit sphere since, as previously mentioned, the scale of the signal has been lost during the quantization process. To reconstruct, we enforce consistency on the signs of the estimate’s measurements, i.e., that . Specifically, we define a general non-linear reconstruction decoder such that, for , the solution is

1. sparse, i.e., satisfies ,

2. consistent, i.e., satisfies .

With (R) from CS as a guide, one candidate program for reconstruction that respects these two conditions is

 x∗=argminu∈SN−1∥u∥0s.t.¯y=sign(Φu). (R1BCS)

Although the parameter is not explicit in (R), the solution will be -sparse with because is a feasible point of the constraints.

Since (R) is computationally intractable, [30] proposes a relaxation that replaces the objective with the -norm and enforces consistency via a linear convex constraint. However, the resulting program remains non-convex due to the unit-sphere requirement. Be that as it may, several optimization algorithms have been developed for the relaxation, as well as a greedy algorithm inspired by the same ideas [30, 31, 32]. While previous empirical results from these algorithms provide motivation for the validity of this -bit framework, there have been few analytical guarantees to date.

The primary contribution of this paper is a rigorous analysis of the -bit CS framework. Specifically, we examine how the reconstruction error behaves as we increase the number of measurement bits given the signal dimension and sparsity . We provide two flavors of results. First, we determine a lower bound on reconstruction performance from all possible mappings with the reconstruction decoder , i.e., the best achievable performance of this -bit CS framework. We further demonstrate that if the elements of are drawn randomly from Gaussian distribution or its rows are drawn uniformly from the unit sphere, then the worst-case reconstruction error using will decay at a rate almost optimal with the number of measurements, up to a log factor in the oversampling rate and the signal dimension . Second, we provide conditions on that enable us to characterize the reconstruction performance even when some of the measurement signs have changed (e.g., due to noise in the measurements). In other words, we derive the conditions under which robust reconstruction from -bit measurements can be achieved. We do so by demonstrating that is a stable embedding of sparse signals, similar to the RIP. We apply these stable embedding results to the cases where we have noisy measurements and signals that are not strictly sparse. Our guarantees demonstrate that the -bit CS framework is on sound footing and provide a first step toward analysis of the relaxed -bit techniques used in practice.

To develop robust reconstruction guarantees, we propose a new tool, the binary -stable embedding (BSE), to characterize -bit CS systems. The BSE implies that the normalized angle between any sparse vectors in is close to the normalized Hamming distance between their -bit measurements. We demonstrate that the same class of random as above exhibit this property when (where is some constant). Thus remarkably, there exist such that the BSE holds when both the number of measurements is smaller than the dimension of the signal and the measurement bit-depth is at minimum.

As a complement to our theoretical analysis, we introduce a new -bit CS reconstruction algorithm, Binary Iterative Hard Thresholding (BIHT). Via simulations, we demonstrate that BIHT yields a significant improvement in both reconstruction error as well as consistency, as compared with previous algorithms. To gain intuition about the behavior of BIHT, we explore the way that this algorithm enforces consistency and compare and contrast it with previous approaches. Perhaps more important than the algorithm itself is the discovery that the BIHT consistency formulation provides a significantly better feasible solution in noiseless settings, as compared with previous algorithms. Finally, we provide a brief explanation regarding why this new formulation achieves better solutions, and its connection with results in the machine learning literature.

Since the first appearance of this work, Plan and Vershynin have developed additional theoretical results and bounds on the performance of 1-bit CS, as well as two convex algorithms with theoretical guarantees [33, 34, 35]. The results in [33, 34] generalize the BSE guarantees for more general classes of signals, including compressible signals in addition to simply sparse ones. However, the guarantees provided in that work exhibit worse decay rates in the error performance and the tightness of the BSE property. Furthermore, the results of [34, 35] are intimately tied to reconstruction algorithms, in contrast to our analysis. We point out similarities and differences with our results when appropriate in the subsequent development.

In addition to benchmarking the performance of BIHT, our simulations demonstrate that many of the theoretical predictions that arise from our analysis (such as the error rate as a function of the number of measurements or the error rate as a function of measurement Hamming distance), are actually exhibited in practice. This suggests that our theoretical analysis is accurately explaining the true behavior of the framework.

The remainder of this paper is organized as follows. In Section 2, we develop performance results for -bit CS in the noiseless setting. Specifically we develop a lower bound on reconstruction performance as well as provide the guarantee that Gaussian matrices enable this performance. In Section 3 we introduce the notion of a BSE for the mapping and demonstrate that Gaussian matrices facilitate this property. We also expand reconstruction guarantees for measurements with Gaussian noise (prior to quantization) and non-sparse signals. To make use of these results in practice, in Section 4 we present the BIHT algorithm for practical -bit reconstruction. In Section 5 we provide simulations of BIHT to verify our claims. In Section 6 we conclude with a discussion about implications and future extensions. To facilitate the flow of the paper and clear descriptions of the results, most of our proofs are provided in the appendices.

## 2 Noiseless Reconstruction Performance

### 2.1 Reconstruction performance lower bounds

In this section, we seek to provide guarantees on the reconstruction error from -bit CS measurements. Before analyzing this performance from a specific mapping with the consistent sparse reconstruction decoder , it is instructive to determine the best achievable performance from measurements acquired using any mapping. Thus, in this section we seek a lower bound on the reconstruction error.

We develop the lower bound on the reconstruction error based on how well the quantizer exploits the available measurement bits. A distinction we make in this section is that of measurement bits, which is the number of bits acquired by the measurement system, versus information bits, which represent the true amount of information carried in the measurement bits. Our analysis follows similar ideas to that in [36, 37], adapted to sign measurements.

We first examine how 1-bit quantization operates on the measurements. Specifically, we consider the orthants of the measurement space. An orthant in is the set of vectors such that all the vector’s coefficients have the same sign pattern

 O¯z:={z∈RM | signz=¯z}, (4)

where . Notice that and if . Therefore, any -dimensional space is partitioned to orthants. Figure 1(a) shows the 8 orthants of as an example. Since 1-bit quantization only preserves the signs of the measurements, it encodes in which measurement space orthant the measurements lie. Thus, each available quantization point corresponds to an orthant in the measurement space. Any unquantized measurement vector that lies in an orthant of the measurement space will quantize to the corresponding quantization point of that orthant and cannot be distinguished from any other measurement vector in the same orthant. To obtain a lower bound on the reconstruction error, we begin by bounding the number of quantization points (or equivalently the number of orthants) that are used to encode the signal.

While there are generally orthants in the measurement space, the space formed by measuring all sparse signals occupies a small subset of the available orthants. We determine the number of available orthants that can be intersected by the measurements in the following lemma:

###### Lemma 1.

Let belong to a union of subspaces of dimension , and let -bit measurements be acquired via the mapping as defined in (3). Then the measurements can effectively use at most quantization points, i.e., carry at most information bits.

###### Proof.

A -dimensional subspace in an -dimensional space cannot lie in all the available octants. For example, as shown in Fig. 1(b), a 2-dimensional subspace of a 3-dimensional space can intersect at most 6 of the available octants. In Appendix A, we demonstrate that one arbitrary -dimensional subspace in an -dimensional space intersects at most orthants of the available. Since is a linear operator, any -dimensional subspace in the signal space is mapped through to a subspace that is also at most -dimensional and therefore follows the same bound. Thus, if the signal of interest belongs in a union of such -dimensional subspaces, then , and it follows that at most orthants are intersected. This means that at most effective quantization points can be used, i.e., at most information bits can be obtained.

Since -sparse signals in any basis belong to a union of at most subspaces in  with , using Lemma 1 we can obtain the following corollary.444This corollary is easily adaptable to a redundant frame with .

###### Corollary 1.

Let be -sparse in a certain basis , i.e., . Then the measurements can effectively use at most 1-bit quantization points, i.e, carry at most information bits.

The set of signals of interest to be encoded is the set of unit-norm -sparse signals . Since unit-norm signals of a -dimensional subspace form a -dimensional unit sphere in that subspace, is a union of such unit spheres. The available quantization points partition into smaller sets, each of which contains all the signals that quantize to the same point.

To develop the lower bound on the reconstruction error we examine how can be optimally partitioned with respect to the worst-case error, given the number of quantization points used. The measurement and reconstruction process maps each signal in to a finite set of quantized signals . At best this map ensures that the worst case reconstruction error is minimized, i.e.,

 ϵopt = maxx∈Σ∗Kminq∈Q∥∥x−q∥2, (5)

where denotes the worst-case quantization error and each of the available quantization points. The optimal lower bound is achieved by designing to minimize (5) without considering whether the measurement and reconstruction process actually achieve this design. Thus, designing the set becomes a set covering problem. Appendix B precises this intuition and proves the following statement.

###### Theorem 1.

Let the mapping and measurements be defined as in (3) and let . Then the estimate from the reconstruction decoder has error defined by (5) of at least

 ϵopt≥K2eM+2K3/2 →M Ω(KM).

Thus, when is high compared to , the worst-case error cannot decay at a rate faster than as a function of the number measurements, no matter what reconstruction algorithm is used.

This result assumes noiseless acquisition and provides no guarantees of robustness and noise resiliency. This is in line with existing results on scalar quantization in oversampled representations and CS that state that the distortion due to scalar quantization of noiseless measurements cannot decrease faster than the inverse of the measurement rate [37, 38, 39, 40, 36].

To improve the rate vs. distortion trade-off, alternative quantization methods must be used, such as Sigma-Delta () quantization [41, 42, 43, 44, 45, 46, 47] or non-monotonic scalar quantization [48]. Specifically, approaches to CS can achieve error decay rate of , where is the order of the quantizer [47]. However, quantization requires feedback during the quantization process, which is not necessary in scalar quantization. Furthermore, the result in [47] only holds for multibit quantizers, not 1-bit ones. While efficient 1-bit quantization has been shown for classical sampling [42, 49], to the best of our knowledge, similar results are not currently known for 1-bit in CS applications. Alternatively, non-monotonic scalar quantization can achieve error decay exponential in the number of measurements , even in CS applications [48]. However, such a scheme requires a significantly more complex scalar quantizer and reconstruction approach [50].

Theorem 1 bounds the best possible performance of a consistent reconstruction over all possible mappings . However, not all mappings will behave as the lower bound suggests. In the next section we identify two classes of matrices such that the mapping  admits an upper bound on the reconstruction error from a general decoder  that decays almost optimally.

### 2.2 Achievable performance via random projections

In this section we describe a class of matrices such that the consistent sparse reconstruction decoder can indeed achieve error decay rates of optimal order, described by Theorem 1, with the number of measurements growing linearly in the sparsity and logarithmically in the dimension , as is required in conventional CS. We first focus our analysis on Gaussian matrices, i.e., such that each element is randomly drawn i.i.d. from the standard Gaussian distribution, . In the rest of the paper, we use the short notation to characterize such matrices, and we write to describe the equivalent random vectors in (e.g., the rows of ). For these matrices , we prove the following in Appendix C.

###### Theorem 2.

Let be matrix generated as , and let the mapping be defined as in (3). Fix and . If the number of measurements is

 M≥2ϵo(2Klog(N)+4Klog(17ϵo)+log1η), (6)

then for all we have that

 ∥x−s∥2>ϵo ⇒ A(x)≠A(s), (7)

or equivalently

 A(x)=A(s) ⇒ ∥x−s∥2≤ϵo,

with probability higher than .

Theorem 2 is a uniform reconstruction result, meaning that with high probability all vectors can be reconstructed as opposed to a non-uniform result where each vector could be reconstructed with high probability.

As derived in Appendix G, Theorem 2 demonstrates that if we use Gaussian matrices in the mapping , then, given a fixed probability level , the reconstruction decoder will recover signals with error order

 ϵo = O(KMlogMNK),

which decays almost optimally compared to the lower bound given in Theorem 1 up to a log factor in . Whether the gap can be closed, with tighter lower or upper bounds is still an open question. Notice that the hidden proportionality factor in this last relation depends linearly on which is assumed fixed.

We should also note a few minor extensions of Theorem 2. We can multiply the rows of with a positive scalar without changing the signs of the measurements. By normalizing the rows of the Gaussian matrix we obtain another class of matrices, ones with rows drawn uniformly from the unit sphere in . It is thus straightforward to extend the Theorem to matrices with such rows as well. Furthermore, these projections are rotation invariant (often referred to as “universal” in CS systems), meaning that the theorem remains valid for sparse signals in any basis , i.e., for belonging to . This is true since for any orthonormal basis , when .

### 2.3 Related Work

A similar result to Theorem 2 has been recently shown for sign measurements of non-sparse signals in the context of quantization using frame permutations [51]. Specifically, it has been shown that reconstruction from sign measurements of signals can be achieved (almost surely) with an error rate decay arbitrarily close to . Our main contribution here is demonstrating that this result is true uniformly for all -sparse vectors in , given a sparse and consistent decoder. Our results, in addition to introducing the almost linear dependence on , also show that proving this error bound uniformly for all -sparse signals involves a logarithmic penalty in . This does not seem to be necessary from the lower bound in the previous section. We will see in Section 5 that for Gaussian matrices, the optimal error behavior is empirically exhibited on average. Finally, we note that for a constant , the number of measurements required to guarantee (7) is , nearly the same as order in conventional CS.

Furthermore, since the first appearance of our work, a bound on the achievable reconstruction error for compressible signals and for signals in arbitrary subsets of appeared in [33, 34]. Specifically for compressible signals, that works leads to error decay , which decreases more slowly (with ) than both our bound and the one provided in [51]. However, the results in [33, 34] are for more general classes of signals.

We can also view the binary measurements as a hash or a sketch of the signal. With this interpretation of the result we guarantee with high probability that no sparse vectors with Euclidean distance greater than will “hash” to the same binary measurements. In fact, similar results play a key role in locality sensitive hashing (LSH), a technique that aims to efficiently perform approximate nearest neighbors searches from quantized projections [52, 53, 54, 55]. Most LSH results examine the performance on point-clouds of a discrete number of signals instead of the infinite subspaces that we explore in this paper. Furthermore, the primary goal of the LSH is to preserve the structure of the nearest neighbors with high probability. Instead, in this paper we are concerned with the ability to reconstruct the signal from the hash, as well as the robustness of this reconstruction to measurement noise and signal model mismatch. To enable these properties, we require a property of the mapping that preserves the structure (geometry) of the entire signal set. Thus, in the next section we seek an embedding property of that preserves geometry for the set of sparse signals and thus ensures robust reconstruction.

## 3 Acquisition and Reconstruction Robustness

### 3.1 Binary ϵ-stable embeddings

In this section we establish an embedding property for the -bit CS mapping that ensures that the sparse signal geometry is preserved in the measurements, analogous to the RIP for real-valued measurements. This robustness property enables us to upper bound the reconstruction performance even when some measurement signs have been changed due to noise. Conventional CS achieves robustness via the -stable embeddings of sparse vectors (2) discussed in Section 1. This embedding is a restricted quasi-isometry between the metric spaces and , where the distance metrics and are the -norm in dimensions and , respectively, and the domain is restricted to sparse signals.555A function is called a quasi-isometry between metric spaces and if there exists and such that for , and such that for all [56]. Since for -stable embeddings, they are also called bi-Lipschitz mappings. We seek a similar definition for our embedding; however, now the signals and measurements lie in the different spaces and , respectively. Thus, we first consider appropriate distance metrics in these spaces.

The Hamming distance is the natural distance for counting the number of unequal bits between two measurement vectors. Specifically, for we define the normalized Hamming distance as

 dH(¯a,¯b)=1MM∑i=1¯ai⊕¯bi,

where is the XOR operation between such that equals 0 if and 1 otherwise. The distance is normalized such that . In the signal space we only consider unit-norm vectors, thus, a natural distance is the angle formed by any two of these vectors. Specifically, for , we consider

 dS(x,s):=1πarccos⟨x,s⟩.

As with the Hamming distance, we normalize the true angle such that . Note that since both vectors have the same norm, the inner product can easily be mapped to the -distance using the polarization identity.

Using these distance metrics we define the binary stable embedding.

###### Definition 1 (Binary ϵ-Stable Embedding).

Let . A mapping is a binary -stable embedding (BSE) of order for sparse vectors if

 dS(x,s)−ϵ≤dH(A(x),A(s))≤dS(x,s)+ϵ

for all with .

Our definition describes a specific quasi-isometry between the two metric spaces and , restricted to sparse vectors. While this mirrors the form of the -stable embedding for sparse vectors, one important difference is that the sensitivity term is additive, rather than multiplicative, and thus the BSE is not bi-Lipschitz. This is a necessary side-effect of the loss of information due to quantization.

Any BSE of order enables robustness guarantees on any reconstruction algorithm extracting a unit sparse signal estimate of . In this case, the angular error is immediately bounded by

 dS(x,x∗) ≤dH(A(x),A(x∗))+ϵ.

Thus, if an algorithm returns a unit norm sparse solution with measurements that are not consistent (i.e., ), as is the case with several algorithms [30, 31, 32], then the worst-case angular reconstruction error is close to Hamming distance between the estimate’s measurements’ signs and the original measurements’ signs. Section 5 verifies this behavior with simulation results. Furthermore, in Section 3.3 we use the BSE property to guarantee that if measurements are corrupted by noise or if signals are not exactly sparse, then the reconstruction error is bounded.

Note that, in the best case, for a BSE , the angular error of any sparse and consistent decoder is bounded by since then . As we have seen earlier this is to be expected because, unlike conventional noiseless CS, quantization fundamentally introduces uncertainty and exact recovery cannot be guaranteed. This is an obvious consequence of the mapping of the infinite set to a discrete set of quantized values.

We next identify a class of matrices for which is a BSE.

### 3.2 Binary ϵ-stable embeddings via random projections

As is the case for conventional CS systems with RIP, designing a for -bit CS such that has the BSE property is possibly a computationally intractable task (and no such algorithm is yet known). Fortunately, an overwhelming number of “good” matrices do exist. Specifically we again focus our analysis on Gaussian matrices as in as in Section 2.2. As motivation that this choice of will indeed enable robustness, we begin with a classical concentration of measure result for binary measurements from a Gaussian matrix.

###### Lemma 2.

Let be a pair of arbitrary fixed vectors, draw according to , and let the mapping be defined as in (3). Fix . Then we have

 P( ∣∣dH(A(x),A(s)) − dS(x,s)∣∣ ≤ ϵ ) ≥ 1−2e−2ϵ2M, (8)

where the probability is with respect to the generation of .

###### Proof.

This lemma is a simple consequence of Lemma 3.2 in [57] which shows that, for one measurement, . The result then follows by applying Hoeffding’s inequality to the binomial random variable with trials.

In words, Lemma 2 implies that the Hamming distance between two binary measurement vectors tends to the angle between the signals and as the number of measurements increases. In [57] this fact is used in the context of randomized rounding for max-cut problems; however, this property has also been used in similar contexts as ours with regards to preservation of inner products from binary measurements [58, 59].

The expression (8) indeed looks similar to the definition of the BSE, however, it only holds for a fixed pair of arbitrary (not necessarily sparse) signals, chosen prior to drawing . Our goal is to extend (8) to cover the entire set of sparse signals. Indeed, concentration results similar to Lemma 2, although expressed in terms of norms, have been used to demonstrate the RIP [16]. These techniques usually demonstrate that the cardinality of the space of all sparse signals is sufficiently small, such that the concentration result can be applied to demonstrate that distances are preserved with relatively few measurements.

Unfortunately, due to the non-linearity of we cannot immediately apply Lemma 2 using the same procedure as in [16]. To briefly summarize, [16] proceeds by covering the set of all -sparse signals with a finite set of points (with covering radius ). A concentration inequality is then applied to this set of points. Since any sparse signal lies in a -neighborhood of at least one such point, the concentration property can be extended from the finite set to by bounding the distance between the measurements of the points within the -neighborhood. Such an approach cannot be used to extend (8) to , because the severe discontinuity of our mapping does not permit us to characterize the measurements using and and obtain a bound on the distance between measurements of signals in a -neighborhood.

To resolve this issue, we extend Lemma 2 to include all points within Euclidean balls around the vectors and inside the (sub) sphere for some fixed support set of size . Define the -ball to be the ball of Euclidean distance around , and let .

###### Lemma 3.

Given of size , let be a matrix generated as , and let the mapping be defined as in (3). Fix and . For any , we have

 P( ∀u∈B∗δ(x),∀v∈B∗δ(s), ∣∣dH(A(u),A(v)) − dS(x,s)∣∣ ≤ ϵ+√π2Dδ) ≥ 1−2e−2ϵ2M.

The proof of this result is given in Appendix D. It should be noted that the proof does not depend on the radial behavior of the Gaussian pdf in . In other words, this result is easily generalizable to matrices whose rows are independent random vectors drawn from an isotropic pdf .

In words, if the width is sufficiently small, then the Hamming distance between the -bit measurements , of any points , within the balls , , respectively, will be close to the angle between the centers of the balls.

Lemma 3 is key for providing a similar argument to that in [16]. We now simply need to count the number of pairs of -sparse signals that are euclidean distance apart. The Lemma can then be invoked to demonstrate that the angles between all of these pairs will be approximately preserved by our mapping.666 We note that the covering argument in the proof of Theorem 2 also employs -balls in similar fashion but only considers the probability that , rather than the concentration inequality. Thus, with Lemma 3 under our belt, we demonstrate in Appendix E the following result.

###### Theorem 3.

Let be a matrix generated as and let the mapping be defined as in (3). Fix and . If the number of measurements is

 M ≥ 2ϵ2(Klog(N)+2Klog(35ϵ)+log(2η)), (9)

then with probability exceeding , the mapping is a BSE of order for sparse vectors.

As with Lemma 3, the theorem extends easily to matrices with independent rows in drawn from an isotropic pdf in this space.

By choosing with , with high probability we ensure that the mapping is a BSE. Additionally, using (9) with a fixed and the development in Appendix G, we find that the error decreases as

 ϵ = O(√KMlogMNK).

Unfortunately, this decay rate is slower, roughly by a factor of , than the lower bound in Section 2.1. This error rate results from an application of the Chernoff-Hoeffding inequality in the proof of Theorem 3. An open question is whether it is possible to obtain a tighter bound (with optimal error rate) for this robustness property.

As with Theorem 2, Gaussian matrices provide a universal mapping, i.e., the result remains valid for sparse signals in a basis . Moreover, Theorem 3 can also be extended to rows of that are drawn uniformly on the sphere, since the rows of in Theorem 3 can be normalized without affecting the outcome of the proof. Note that normalizing the Gaussian rows of is as if they had been drawn from a uniform distribution of unit-norm signals.

We have now established a random construction providing robust BSEs with high probability: -bit quantized Gaussian projections. We now make use of this robustness by considering an example where the measurements are corrupted by Gaussian noise.

### 3.3 Noisy measurements and compressible signals

In practice, hardware systems may be inaccurate when taking measurements; this is often modeled by additive noise. The mapping is robust to noise in an unusual way. After quantization, the measurements can only take the values or . Thus, we can analyze the reconstruction performance from corrupted measurements by considering how many measurements flip their signs. For example, we analyze the specific case of Gaussian noise on the measurements prior to quantization, i.e.,

 An(x):=sign(Φx+n), (10)

where has i.i.d. elements . In this case, we demonstrate, via the following lemma, a bound on the Hamming distance between the corrupted and ideal measurements with the BSE from Theorem 3 (see Appendix F).

###### Lemma 4.

Let be a matrix generated as , let the mapping be defined as in (3), and let be defined as in (10). Let be a Gaussian random vector with i.i.d. components . Fix . Then, given , we have

 E(dH(An(x),A(x))) ≤ e(σ,∥x∥2), P(dH(An(x),A(x))> e(σ,∥x∥2)+γ)≤ e−2Mγ2,

where .

If is the estimate from a sparse consistent reconstruction decoder from the measurements with and if satisfies (9), then it immediately follows from Lemma 4 and Theorem 3 that

 dS(x∗n,x) ≤ dH(An(x),A(x))+ϵ ≤ 12σ∥x∥2+γ+ϵ, (11)

with a probability higher than . Given alternative noise distributions, e.g., Poisson noise, a similar analysis can be carried out to determine the likely number of sign flips and thus provide a bound on the error due to noise.

Another practical consideration is that real signals are not always strictly -sparse. Indeed, it may be the case that signals are compressible; i.e., they can be closely approximated by a -sparse signal. In this case, we can reuse the non-uniform result of Lemma 2 to see that, given and for ,

 P(dH(A(x),A(xK))>dS(x,xK)+γ) ≤ e−2Mγ2.

In similar fashion to (11), if satisfies (9), this result and Theorem 3 imply that, given (not necessarily sparse) and for the angular reconstruction error of is such that , with probability higher than . Therefore, from the triangular inequality on , this provides the bound

 dS(x∗,x) ≤ 2dS(x,xK)+γ+ϵ,

with the same probability. Much like conventional CS results, the reconstruction error depends on the magnitude of the best -term approximation error of the signal, here expressed angularly by .

This reconstruction error bound is non-uniform with respect to the selection of . A uniform bound on the BSE for more general classes of signals is developed in [33, 34], albeit with a worse error decay— for compressible signals.

Thus far we have demonstrated a lower bound on the reconstruction error from -bit measurements (Theorem 2) and introduced a condition on the mapping that enables stable reconstruction in noiseless, noisy, and compressible settings (Definition 1). We have furthermore demonstrated that a large class of random matrices—specifically matrices with coefficients drawn from a Gaussian distribution and matrices with rows drawn uniformly from the unit sphere—provide good mappings (Theorem 3).

Using these results we can characterize the error performance of any algorithm that reconstructs a -sparse signal. If the reconstructed signal quantizes to the same quantization point as the original data, then the error is characterized by Theorem 2. If the algorithm terminates unable to reconstruct a signal consistent with the quantized data, then Theorem 3 describes how far the solution is from the original signal. Since (R) is a combinatorially complex problem, in the next section we describe a new greedy reconstruction algorithm that attempts to find a solution as consistent with the measurements as possible, while guaranteeing this solution is -sparse.

## 4 BIHT: A Simple First-Order Reconstruction Algorithm

### 4.1 Problem formulation and algorithm definition

We now introduce a simple algorithm for the reconstruction of sparse signals from -bit compressive measurements. Our algorithm, Binary Iterative Hard Thresholding (BIHT), is a simple modification of IHT, the real-valued algorithm from which is takes its name [14]. Demonstrating theoretical convergence guarantees for BIHT is a subject of future work (and thus not shown in this paper), however the algorithm is of significant value since it i) has a simple and intuitive formulation and ii) outperforms previous algorithms empirically, demonstrated in Section 5. We further note that the IHT algorithm has recently been extended to handle measurement non-linearities [60]; however, these results do not apply to quantized measurements since quantization does not satisfy the requirements in [60].

We briefly recall that the IHT algorithm consists of two steps that can be interpreted as follows. The first step can be thought of as a gradient descent to reduce the least squares objective . Thus, at iteration , IHT proceeds by setting . The second step imposes a sparse signal model by projecting onto the “ ball”, i.e., selecting the largest in magnitude elements. Thus, IHT for CS can be thought of as trying to solve the problem

 argminu12∥y−Φu∥22 s.t. ∥u∥0=K. (12)

The BIHT algorithm simply modifies the first step of IHT to instead minimize a consistency-enforcing objective. Specifically, given an initial estimate and -bit measurements , at iteration BIHT computes

 al+1 =xl+τ2ΦT(¯y−A(xl)), (13) xl+1 =ηK(al+1), (14)

where is defined as in (3), is a scalar that controls gradient descent step-size, and the function computes the best -term approximation of by thresholding. Once the algorithm has terminated (either consistency is achieved or a maximum number of iterations have been reached), we then normalize the final estimate to project it onto the unit sphere. Section 4.2 discusses several variations of this algorithm, each with different properties.

The key to understanding BIHT lies in the formulation of the objective. The following Lemma shows that the term in (13) is in fact the negative subgradient of a convex objective . Let denote the negative function, i.e., with if and 0 else, and denote the Hadamard product, i.e., for two vectors and .

###### Lemma 5.

The quantity in (13) is a subgradient of the convex one-sided -norm

 J(x)=∥[¯y⊙(Φx)]−∥1,

Thus, BIHT aims to decrease at each step (13).

###### Proof.

We first note that is convex. We can write with each convex function given by

 Ji(x;¯y,Φ)={|⟨φi,x⟩|,if Ai(x)¯yi<0,0,else,

where denotes a row of and . Moreover, if , then the gradient of  is

 ∇Ji(x;¯y,Φ)=12(Ai(x)−¯yi)φi={Ai(x)φiif ¯yiAi(x)<0,0,else

while if , then the gradient is replaced by the subdifferential set

 ∇Ji(x;¯y,Φ)={ξ2(Ai(x)−¯yi)φi:ξ∈[0,1]}∋12(Ai(x)−¯yi)φi.

Thus, by summing over we conclude that .

Consequently, the BIHT algorithm can be thought of as trying to solve the problem:

 x∗ = argminu τ∥[¯y⊙(Φu)]−∥1s.t. ∥u∥0=K,∥u∥2=1.

Observe that since simply scales the elements of by the signs , minimizing the one-sided objective enforces a positivity requirement,

 ¯y⊙(Φx)≥0, (15)

that, when satisfied, implies consistency.

Previously proposed -bit CS algorithms have used a one-sided -norm to impose consistency [30, 31, 32, 29]. Specifically, they have applied a constraint or objective that takes the form . Both the one-sided and functions imply a consistent solution when they evaluate to zero, and thus, both approaches are capable of enforcing consistency. However, the choice of the vs.  penalty term makes a significant difference in performance depending on the noise conditions. We explore this difference in the experiments in Section 5.

### 4.2 BIHT shifts

Several modifications can be made to the BIHT algorithm that may improve certain performance aspects, such as consistency, reconstruction error, or convergence speed. While a comprehensive comparison is beyond the scope of this paper, we believe that such variations exhibit interesting and useful properties that should be mentioned.

Projection onto sphere at each iteration. We can enforce that every intermediate solution have unit norm. To do this, we modify the “impose signal model” step (14) by normalizing after choosing the best -term approximation, i.e., we replace the update of in (14) by , where . While this step is found in previous algorithms such as [30, 31, 32], empirical observations suggest that it not required for BIHT to converge to an appropriate solution.

If we choose to impose the projection, must be appropriately normalized or, equivalently, the step size of the gradient descent must be carefully chosen. Otherwise, the algorithm will not converge. Empirically, we have found that for a Gaussian matrix, an appropriate scaling is , where the controls the amplification of the estimate from in the gradient descent step (13) and the ensures that . Similar gradient step scaling requirements have been imposed in the conventional IHT algorithm and other sparse recovery algorithms as well (e.g., [9]).

Minimizing hinge loss. The one-sided -norm is related to the hinge-loss function in the machine learning literature, which is known for its robustness to outliers [61]. Binary classification algorithms seek to enforce the same consistency function as in (15) by minimizing a function , where sets negative elements to zero. When , the objective is both convex and has a non-trivial solution. Further connections and interpretations are discussed in Section 5. Thus, rather than minimizing the one-sided norm, we can instead minimize the hinge-loss. The hinge-loss can be interpreted as ensuring that the minimum value that an unquantized measurement can take is bounded away form zero, i.e., . This requirement is similar to the sphere constraint in that it avoids a trivial solution; however, will perform differently than the sphere constraint. In this case, in the gradient descent step (13), we instead compute

 al+1=xl−τΘT(sign(Θxl−κ)−1)/2

where scales the rows of by the signs of . Again, the step size must be chosen appropriately, this time as , where is a parameter that depends on .

General one-sided objectives. In general, any function , where is continuous and has a negative gradient for and is for , can be used to enforce consistency. To employ such functions, we simply compute the gradient of and apply it in (13). As an example, the previously mentioned one-sided -norm has been used to enforce consistency in several algorithms. We can use it in BIHT by computing

 al=xl+τΦT[¯y⊙Φxl]+

in (13). We compare and contrast the behavior of the one-sided and norms in Section 5.

As another example, in similar fashion to the Huber norm [15], we can combine the and functions in a piecewise fashion. One potentially useful objective is