Sparsity-aware sphere decoding: Algorithms and complexity analysis
Abstract
Integer least-squares problems, concerned with solving a system of equations where the components of the unknown vector are integer-valued, arise in a wide range of applications. In many scenarios the unknown vector is sparse, i.e., a large fraction of its entries are zero. Examples include applications in wireless communications, digital fingerprinting, and array-comparative genomic hybridization systems. Sphere decoding, commonly used for solving integer least-squares problems, can exploit knowledge about the sparsity of the unknown vector to perform a computationally efficient search for the solution. In this paper, we formulate and analyze a sparsity-aware sphere decoding algorithm that imposes an ℓ0 norm constraint on the admissible solution. Analytical expressions for the expected complexity of the algorithm for alphabets typical of sparse channel estimation and source allocation applications are derived and validated through extensive simulations. The results demonstrate the superior performance and speed of the sparsity-aware sphere decoder compared to the conventional sparsity-unaware sphere decoding algorithm. Moreover, the variance of the complexity of the sparsity-aware sphere decoding algorithm for binary alphabets is derived. The search space of the proposed algorithm can be further reduced by imposing lower bounds on the value of the objective function. The algorithm is modified to allow for such a lower bounding technique, and simulations illustrating the efficacy of the method are presented. Performance of the algorithm is demonstrated in an application to sparse channel estimation, where it is shown that the sparsity-aware sphere decoder performs close to theoretical lower limits.
I Introduction
Given a matrix H ∈ R^{n×m} and a vector y ∈ R^n, the integer least-squares (ILS) problem is concerned with finding an m-dimensional vector x comprising integer entries such that

(1)  x̂ = arg min_{x ∈ Z^m} ||y − Hx||^2,
where Z^m denotes the m-dimensional integer lattice. In many applications, the unknown vector belongs to a finite m-dimensional subset D^m of the infinite lattice Z^m such that D has L elements per dimension, i.e., each component of the unknown vector can take one of L possible discrete values from the set D. In multi-antenna communication systems, for instance, x is the transmitted symbol while the received signal y = Hx + v is perturbed by the additive noise v (hence, the noise renders the system of equations inconsistent). Note that the symbols in D^m form an m-dimensional rectangular lattice and, therefore, Hx belongs to an m-dimensional lattice skewed in the direction of the eigenvectors of H. An integer least-squares problem can be interpreted as the search for the nearest point in a given lattice, commonly referred to as the closest lattice point problem [1]. Geometric interpretation of the integer least-squares problem as the closest lattice point search is illustrated in Fig. 1(a).
Integer least-squares problems with sparse solutions arise in a broad range of applications including multiuser detection in wireless communication systems [2], sparse array processing [3], collusion-resistant digital fingerprinting [4], and array-comparative genomic hybridization microarrays [5]. Formally, a sparse integer least-squares problem can be stated as the cardinality-constrained optimization

(2)  x̂ = arg min_{x ∈ D^m} ||y − Hx||^2  subject to  ||x||_0 ≤ p,
where ||·||_0 denotes the ℓ0 norm of its argument and p is an upper bound on the number of nonzero entries of x. In this paper, we restrict our attention to the case where n ≥ m (all of the aforementioned applications fall in this category). Note that (2) can be interpreted as a search for the lattice point closest to the given point y in a sparse lattice, where the sparsity constraint is explicitly stated as ||x||_0 ≤ p. Search over a sparse lattice is illustrated in Fig. 1(b).
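As a concrete reference point, the cardinality-constrained problem (2) can be solved on toy instances by exhaustive search. The sketch below is not from the paper (the function name and alphabet are illustrative); it is useful only for validating faster decoders on small problems, since its cost grows exponentially with the problem size.

```python
from itertools import product

import numpy as np

def sparse_ils_bruteforce(H, y, alphabet, p):
    """Exhaustive search for min ||y - H x||^2 subject to ||x||_0 <= p,
    with the entries of x drawn from `alphabet`.  Exponential cost; use
    only to validate faster decoders on small problems."""
    best, best_val = None, float("inf")
    for x in product(alphabet, repeat=H.shape[1]):
        if sum(s != 0 for s in x) > p:      # enforce the l0 constraint
            continue
        val = float(np.sum((y - H @ np.asarray(x, dtype=float)) ** 2))
        if val < best_val:
            best, best_val = list(x), val
    return best
```

For example, with H the 3x3 identity, y = (0, 0.9, 0.1), binary alphabet, and p = 1, the minimizer is (0, 1, 0).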
Finding the exact solution to (2) is computationally intensive (in fact, the closest lattice point problem is known to be NP-hard [6]). The sphere decoding algorithm, developed by Fincke and Pohst [7], efficiently solves the ILS problem and provides the optimal solution in a broad range of scenarios [8, 9, 10]. In particular, it was shown in [9, 10] that if the sphere radius is chosen using the statistics of the noise, then the sphere decoding algorithm has an expected complexity that is practically feasible over a wide range of problem parameters.
Recently, several variants of the sphere decoder that exploit information about sparsity of the unknown vector were proposed [3], [11], [2]. In [3], a modified sphere decoding algorithm with the ℓ0 norm constraint relaxed to an ℓ1 norm regularizer was proposed. This scheme is only applicable to nonnegative alphabets, in which case the ℓ1 norm can be decomposed into the sum of the components of the vector. In [11], a generalized sphere decoding approach imposing an ℓ0 constraint on the solution was adopted for sparse integer least-squares problems over binary alphabets. That work examines the underdetermined case and essentially considers a compressed sensing scenario. A regularized sphere decoder has been proposed and studied in [2], where the regularizing parameter was chosen to be a function of the prior probabilities of the activity of independent users in a multiuser detection environment. In contrast, the sphere decoder in the present manuscript directly imposes the ℓ0 norm constraint on the unknown vector, i.e., we perform no relaxation or regularization of the distance metric. Note that the closest point search in a sparse lattice has previously been studied in [12], but sparsity there stems from the fact that not all lattice points are valid codewords of linear block codes. In [13], a sparse integer least-squares problem arising in the context of spatial modulation was studied for the special case of symbol vectors with a single nonzero entry; the method there does not rely on the branch-and-bound search typically used by sphere decoding. Note that none of the previous works on characterizing the complexity of the sphere decoder (see, e.g., [14], [15], [16] and the references therein) considered sparse integer least-squares problems except for [17], where the worst-case complexity of the sphere decoder in the spatial modulation systems of [13] was studied.
Finally, we should also point out a considerable amount of related research on compressed sensing, where one is interested in the recovery of an inherently sparse signal using potentially far fewer measurements than what is typically needed for a signal which is not sparse [18]–[24].
The paper is organized as follows. In Section II, we review the sphere decoding algorithm and formalize its sparsity-aware variant. Following the analysis framework in [9, 10], in Section III we derive the expected complexity of the sparsity-aware sphere decoder for binary and ternary alphabets commonly encountered in various applications. The derived analytical expressions are validated via simulations. Section IV presents an expression for the variance of the complexity of the proposed algorithm for binary alphabets. In Section V, the algorithm is modified by introducing an additional mechanism for pruning nodes in the search tree, leading to a significant reduction of the number of points traversed and the overall computational complexity. In Section VI, performance of the algorithm is demonstrated in an application to sparse channel estimation. The paper is summarized in Section VII.
II Algorithms
In this section, we first briefly review the classical sphere decoding algorithm and then discuss its modification that accounts for sparsity of the unknown vector.
II-A Sphere decoding algorithm
To find the closest point in an m-dimensional lattice, the sphere decoder performs the search within an m-dimensional sphere of radius r centered at the given point. In particular, to solve (1), the sphere decoding algorithm searches over x ∈ D^m such that ||y − Hx||^2 ≤ r^2. The search procedure is reminiscent of branch-and-bound optimization [25]. In a preprocessing step, one typically starts with the QR decomposition of H,

(3)  H = [Q1 Q2] [R; 0],

where R is an m × m upper-triangular matrix and Q1 and Q2 are composed of the first m and the last n − m orthonormal columns of Q, respectively. Therefore, we can rewrite the sphere constraint as

(4)  ||y' − Rx||^2 ≤ r'^2,

where y' = Q1^T y and r'^2 = r^2 − ||Q2^T y||^2. Now, (4) can be expanded to

(5)  r'^2 ≥ (y'_m − R_{m,m} x_m)^2 + (y'_{m−1} − R_{m−1,m} x_m − R_{m−1,m−1} x_{m−1})^2 + …,

where y'_i and x_i are the components of y' and x, respectively, and R_{i,j} is the (i,j) entry of R. Note that the first term on the right-hand side (RHS) of (5) depends only on x_m, the second term on {x_m, x_{m−1}}, and so on. A necessary (but not sufficient) condition for x to lie inside the sphere is (y'_m − R_{m,m} x_m)^2 ≤ r'^2. For every x_m satisfying this condition, a stronger necessary condition can be found by considering the first two terms on the RHS of (5) and imposing a condition that x_{m−1} should satisfy to lie within the sphere. One can proceed in a similar fashion to determine conditions for x_{m−2}, …, x_1, thereby determining all the lattice points that satisfy ||y' − Rx||^2 ≤ r'^2. If no point is found within the sphere, its radius is increased and the search is repeated. If multiple points satisfying the constraint are found, then the one yielding the smallest value of the objective function is declared the solution. Clearly, the choice of the radius r is of critical importance for facilitating a computationally efficient search. If r is too large, the search complexity may become infeasible, while if r is too small, no lattice point will be found within the sphere. To this end, r can be chosen according to the statistics of the noise, hence providing a probabilistic guarantee that a point is found inside the sphere [9].
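The level-by-level pruning just described can be sketched in a few lines. The following depth-first implementation is a simplified sketch with illustrative names: it enumerates a finite alphabet at each level instead of computing interval bounds, and it tightens the pruning threshold whenever a full point is found.

```python
import numpy as np

def sphere_decode(R, yp, alphabet, radius):
    """Find the point x with entries from `alphabet` that minimizes
    ||yp - R x||^2 among all points with ||yp - R x||^2 <= radius^2,
    where R is upper triangular (e.g., from a QR factorization).
    Returns None if no point lies inside the sphere."""
    m = R.shape[0]
    best, best_dist = None, radius ** 2
    def search(k, x, cost):
        nonlocal best, best_dist
        if k < 0:                   # a full m-dimensional point was reached
            if cost <= best_dist:
                best, best_dist = list(x), cost
            return
        for s in alphabet:
            # residual of dimension k given the already-fixed x_{k+1},...,x_m
            r = yp[k] - R[k, k] * s - sum(R[k, j] * x[j] for j in range(k + 1, m))
            c = cost + r * r
            if c <= best_dist:      # necessary sphere condition; prune otherwise
                x[k] = s
                search(k - 1, x, c)
    search(m - 1, [0] * m, 0.0)
    return best
```

Note that `best_dist` shrinks as candidate points are found, so the search effectively performs the radius tightening discussed later in Remark 1.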
The sphere decoding algorithm can be interpreted as a search on a tree, as illustrated in Fig. 2(a). The nodes at the k-th level of the tree represent the k-dimensional partial points (x_{m−k+1}, …, x_m). The algorithm uses the aforementioned constraints to prune nodes from the tree, keeping only those that satisfy the condition imposed by the m-dimensional sphere of radius r. Many variants of the basic sphere decoding algorithm have been developed [26], [27], [28].
II-B Sparsity-aware sphere decoding
In scenarios where the unknown vector is known to be sparse, imposing an ℓ0 norm constraint on x improves the speed of the search since the necessary conditions that the components of x must satisfy become more restrictive. Clearly, not all lattice points within the search sphere satisfy the sparsity constraint, and thus fewer points need to be examined in each step of the sparsity-aware sphere decoding algorithm. We impose the sparsity constraint on the components of x at each node of the search tree traversed by the algorithm. Note that the number of nonzero symbols along the path from the root node to a given node is a measure of the sparseness of the partial point associated with that node. Suppose that a node at some level of the tree satisfies the sphere constraint. The sparsity constraint implies that, in addition to the node being inside the sphere, the number of nonzero symbols along the path leading to the node must not exceed the upper bound p. Hence, knowledge of sparsity allows us to impose stricter conditions on the potential solutions to the integer least-squares problem, and the number of nodes that the algorithm prunes is greater than that in the absence of sparsity (or in the absence of knowledge about sparsity).
Input: sphere radius r, upper-triangular factor R, y' = Q1^T y,
sparsity constraint p.
1. Initialize k = m, t = 0,
r'^2_m = r^2 − ||Q2^T y||^2, y'_{m|m+1} = y'_m.
2. Update interval: UB(x_k) = floor((r'_k + y'_{k|k+1}) / R_{k,k}),
x_k = ceil((−r'_k + y'_{k|k+1}) / R_{k,k}) − 1.
3. Update x_k: x_k = x_k + 1. If x_k ≤ UB(x_k)
go to 4; else go to 5.
4. Check sparsity: If t + 1{x_k ≠ 0} ≤ p,
t = t + 1{x_k ≠ 0} and go to 6; else go to 3.
5. Increase k: k = k + 1. If k = m + 1 stop;
else, t = t − 1{x_k ≠ 0} and go to 3.
6. Decrease k: If k = 1 go to 7; else k = k − 1,
y'_{k|k+1} = y'_k − sum_{j=k+1..m} R_{k,j} x_j, r'^2_k = r'^2_{k+1} − (y'_{k+1|k+2} − R_{k+1,k+1} x_{k+1})^2, and go to 2.
7. Solution found: Save x and its distance ||y' − Rx||^2 from y',
t = t − 1{x_1 ≠ 0}, and go to 3.
Table I formalizes the sparsity-aware sphere decoding algorithm with the ℓ0 norm constraint imposed on the solution. Note that, in the pseudocode, the variable t denotes the number of nonzero symbols selected in the levels traversed so far. This variable is used to impose the sparsity constraint in Step 4 of the algorithm (calculating t and imposing the sparsity constraint requires only a minor increase in the number of computations per node). Whenever the algorithm backtracks up a level in the tree, the value of t is adjusted to reflect the sparseness of the current node (as indicated in Steps 5 and 7 of the algorithm). The depth-first tree search of the sparsity-aware sphere decoding algorithm is illustrated in Figure 2(b), where fewer points survive pruning than in Figure 2(a).
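A compact functional sketch of the pruning logic formalized in Table I follows. It is recursive rather than following the iterative Fincke-Pohst bookkeeping, and the names are illustrative; the nonzero counter plays the role of the variable tracked in Step 4.

```python
import numpy as np

def sparse_sphere_decode(R, yp, alphabet, radius, p):
    """Sparsity-aware depth-first search: prune a partial path as soon as
    either the sphere condition fails or the number of nonzero symbols on
    the path exceeds the l0 bound p."""
    m = R.shape[0]
    best, best_dist = None, radius ** 2
    def search(k, x, cost, nnz):
        nonlocal best, best_dist
        if k < 0:
            if cost <= best_dist:
                best, best_dist = list(x), cost
            return
        for s in alphabet:
            t = nnz + (s != 0)
            if t > p:               # sparsity pruning (cf. Step 4 of Table I)
                continue
            r = yp[k] - R[k, k] * s - sum(R[k, j] * x[j] for j in range(k + 1, m))
            c = cost + r * r
            if c <= best_dist:      # sphere pruning
                x[k] = s
                search(k - 1, x, c, t)
    search(m - 1, [0] * m, 0.0, 0)
    return best
```

With p = 0 the search degenerates to checking only the all-zero vector, which illustrates how the sparsity bound shrinks the tree independently of the radius.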
Remark 1: The algorithm in Table I relies on the original Fincke-Pohst strategy for conducting the tree search [7]. There exist more efficient implementations such as the so-called Schnorr-Euchner variant of sphere decoding, where the search in each dimension starts from the middle of the feasible interval for x_k and proceeds to explore the remaining points in the interval in a “zigzag” fashion (for details see, e.g., [1]). For computational benefits, one should combine such a search strategy with a radius update, i.e., as soon as a valid point inside the sphere is found, the algorithm should be restarted with the radius set to the distance between that point and y'. The expected complexity results derived in Section III are exact for the Fincke-Pohst search strategy and can be viewed as an upper bound on the expected complexity of the Schnorr-Euchner scheme with radius update.
Remark 2: The pseudocode of the algorithm shown in Table I assumes an alphabet having unit spacing, which can be generalized in a straightforward manner. For nonnegative alphabets, the algorithm can be further improved by imposing the condition that if a node at a given level violates the sparsity constraint, then all the remaining nodes at that level will also violate the constraint. Details are omitted for brevity.
We should also point out that, in addition to helping improve computational complexity, utilizing knowledge about the sparseness of the unknown vector also improves accuracy of the algorithm. To illustrate this, in Figure 3 we compare the error rate (i.e., the average fraction of incorrectly inferred components of x) of the sparsity-aware sphere decoder with that of the classical (sparsity-unaware) sphere decoding algorithm. In this figure, the error rate is plotted as a function of the sparsity ratio p/m. We use a sparse binary alphabet and simulate the system at SNR = 10 dB. The classical sphere decoder is unaware of the sparseness of the unknown vector (i.e., it essentially assumes p = m). The figure shows that the sparsity-aware sphere decoding algorithm performs exceptionally well compared to the classical sphere decoder for low values of p/m. As expected, the performance gap between the two decoders diminishes if the unknown vector is less sparse. For comparison, Figure 3 also shows the performance of the method where the relaxed version of the integer least-squares problem is solved via orthogonal matching pursuit (OMP) and the result is then rounded to the nearest integer in the symbol space. It is worth mentioning that the OMP method is suboptimal since there are no guarantees that it will find the closest lattice point. Therefore, its performance (as well as the performances of other suboptimal schemes such as LASSO) is generally inferior to that of the sparsity-aware sphere decoder, as illustrated in Figure 3. The panels of Figure 3 show the error rates as a function of the relative sparsity level for two problem dimensions; the considered algorithms exhibit the same trends in both, i.e., the performance gap between the sparsity-aware SD and the classical SD widens as the sparsity ratio decreases.
Figure 4 shows the error rate performance of the sparsity-aware and classical sphere decoders, along with OMP, as a function of SNR, with the sparsity ratio kept fixed. In this figure too, the sparsity-aware sphere decoder is seen to dramatically outperform its classical counterpart as well as OMP.
III Expected Complexity of Sparsity-Aware Sphere Decoding
As elaborated in [9], the computational complexity of the sphere decoding algorithm depends on the effective size of the search tree, i.e., the number of nodes in the tree visited by the algorithm during the search for the optimal solution to the integer least-squares problem. If both H and v are random, so is the number of tree nodes visited by the algorithm. Therefore, it is meaningful to treat the complexity as a random variable and characterize it via a distribution or its moments. This motivates the study of the expected complexity of the sparsity-aware sphere decoding algorithm presented next.
Assume that v (the perturbation noise) consists of independent, identically distributed entries having Gaussian distribution N(0, σ^2). Clearly, ||v||^2 / σ^2 is a random variable following a chi-square distribution with n degrees of freedom. As argued in [9], if the radius of the search sphere is chosen as r^2 = α n σ^2, where α is such that γ(n/2, αn/2) = 1 − ε for small ε (here, γ(·,·) denotes the normalized incomplete gamma function), then with high probability the sphere decoding algorithm will visit at least one lattice point. Moreover, with such a choice of the radius, the probability that an arbitrary lattice point x belongs to a sphere of radius r centered at y is

(6)  γ( n/2, r^2 / (2(σ^2 + ||x̄ − x||^2)) ),

where x̄ denotes the true value of the unknown vector x.
Let N_p be the random variable denoting the number of points in the tree visited by the sparsity-aware sphere decoding algorithm searching for a p-sparse solution to (2). It is easy to see that

(7)  N_p = sum_{k=1..m} sum_{x^{(k)}: ||x^{(k)}||_0 ≤ p} 1{ x^{(k)} lies inside the sphere },

where x^{(k)} denotes the vector consisting of the last k entries of x, i.e., x^{(k)} = (x_{m−k+1}, …, x_m). Note that the inner sum in (7) is only over those x^{(k)} that satisfy the sparsity constraint. Therefore, the expected number of points visited by our sparsity-aware sphere decoder, averaged over all permissible x̄, is given by
(8)  E[N_p] = E_{x̄} [ sum_{k=1..m} sum_{x^{(k)}: ||x^{(k)}||_0 ≤ p} P( x^{(k)} lies inside the sphere ) ]

(9)  = sum_{x̄} P(x̄) sum_{k=1..m} sum_{x^{(k)}: ||x^{(k)}||_0 ≤ p} P( x^{(k)} lies inside the sphere ),

where the outer expectation in (8) is evaluated with respect to x̄, and P(x̄) in (9) denotes the probability that x̄ is the true value of x. Note that without the sparsity constraint on x, it would hold that P(x̄) = 1/L^m, where L denotes the alphabet size. However, due to the sparsity constraint, P(x̄) is not uniform over the full lattice. As an illustration, consider the simple example of the 2-dimensional 1-sparse set on the binary alphabet comprising the vectors (0,0), (0,1), and (1,0). If all are equally likely, it is easy to see that the probability of each is given by 1/3 rather than 1/4. For a general sparse m-dimensional constellation, the probability distribution can be obtained by enumerating all m-dimensional vectors in the set for which ||x||_0 ≤ p.
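The enumeration just described is easy to carry out numerically. The sketch below uses illustrative sizes (not from the paper): it lists a p-sparse binary set, checks its cardinality against the closed-form count, and recovers the induced distribution of the number of nonzeros.

```python
from collections import Counter
from itertools import product
from math import comb

m, p = 4, 2  # illustrative sizes
sparse_set = [x for x in product((0, 1), repeat=m) if sum(x) <= p]

# the set size matches the closed-form count sum_{i<=p} C(m, i)
assert len(sparse_set) == sum(comb(m, i) for i in range(p + 1))

# with all points equally likely, P(||x||_0 = j) = C(m, j) / |set|
weights = Counter(sum(x) for x in sparse_set)
pmf = {j: weights[j] / len(sparse_set) for j in sorted(weights)}
# here pmf == {0: 1/11, 1: 4/11, 2: 6/11}
```

This makes concrete why P(x̄) is non-uniform in the number of nonzero entries even when every admissible vector is equally likely.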
Having determined the expected number of lattice points visited by the sparsity-aware sphere decoding algorithm, the average complexity can be expressed as

(10)  C = sum_{k=1..m} f(k) E[N_p(k)],

where E[N_p(k)] is the expected number of points visited at the k-th level of the tree and f(k) denotes the number of elementary operations performed by the algorithm per node in the k-th dimension.
The main challenge in evaluating (7) is to find an efficient enumeration of the symbol space, i.e., to determine the number of sparse vectors at a given distance from x̄. While this enumeration appears to be difficult in general, it can be found in closed form for some of the most commonly encountered alphabets in sparse integer least-squares problems: the binary alphabet {0,1} (relevant to the applications in [4], [29] and [30]) and the ternary alphabet {−1,0,1} (relevant to the application in [5]). In the rest of this section, we provide closed-form expressions for the expected complexity of the sparsity-aware sphere decoder for these alphabets.
III-A Binary Alphabet {0,1}
Recall that computing (7) requires an enumeration of the sparse symbol space, i.e., counting the sparse vectors x at squared distance d from a given sparse vector x̄. Note that, for the binary alphabet, the condition ||x − x̄||^2 = d is equivalent to the Hamming distance between x and x̄ being equal to d. Let w denote the number of nonzero entries of x̄, and denote the i-th entry of x̄ as x̄_i.
Lemma 1: Given a k-dimensional x̄ with w nonzero entries and a distance d, the number of k-dimensional lattice points x with ||x||_0 ≤ p such that ||x − x̄||^2 = d is given by

(11)  N(d) = sum_j C(w, j) C(k − w, d − j),

where the sum is over the integers j satisfying max(0, ceil((w + d − p)/2)) ≤ j ≤ min(w, d), with the convention that C(a, b) = 0
if b < 0
or if b > a.
See Appendix A.
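The quantity counted in Lemma 1 can be validated by direct enumeration on small instances. In the sketch below (names illustrative), the closed-form sum organizes the count by the number j of positions where x̄ has a one and x has a zero; this is one way such a count can be derived, offered here as a check rather than as the paper's exact expression (11).

```python
from itertools import product
from math import ceil, comb

def count_bruteforce(x_bar, d, p):
    """Number of binary vectors x (same length as x_bar) with at most p
    ones and ||x - x_bar||^2 = d; for the {0,1} alphabet the squared
    Euclidean distance equals the Hamming distance."""
    m = len(x_bar)
    return sum(
        1
        for x in product((0, 1), repeat=m)
        if sum(x) <= p and sum((a - b) ** 2 for a, b in zip(x, x_bar)) == d
    )

def count_closed_form(m, w, d, p):
    """Same count via binomials: j ones of x_bar flipped down and d - j
    zeros flipped up, subject to the resulting weight w - j + (d - j) <= p."""
    lo = max(0, ceil((w + d - p) / 2))
    return sum(
        comb(w, j) * comb(m - w, d - j)
        for j in range(lo, min(w, d) + 1)
        if d - j <= m - w
    )
```

For example, with x̄ = (1,1,0,0,0) (so m = 5, w = 2), d = 2, and p = 2, both counts equal 7.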
Note that for a given x̄, the possible values of d belong to the set {0, 1, …, k}. Then, (7) can be written as
(12) 
Finally, the outer expectation in (8) is enumerated as follows. The set of sparse binary vectors x̄ for a given m can be parameterized by the number of nonzero entries w. For x̄ to be sparse, it should hold that w ≤ p, and for each such w the number of vectors is given by C(m, w). Therefore, the total number of all possible sparse vectors is given by sum_{w=0..p} C(m, w), and (9) can be written in terms of w as

(13)

where the summations are evaluated over the respective ranges as described above.
Theorem 1: The expected number of tree points examined by the sparsity-aware sphere decoding algorithm while solving (2) over a binary alphabet is

(14)

where r denotes the search radius and the counting function is defined in (11).
The proof follows from (7) and Lemma 1.
Remark: Note that (14) has important implications for the worst-case complexity of the sparsity-aware sphere decoding algorithm. Assume, for the sake of argument, that the radius of the sphere is sufficiently large to cover the entire lattice. In the absence of the radius constraint, pruning of the search tree happens only when the sparsity constraint is violated, and hence (12) reduces to

(15)  N_p = sum_{k=1..m} sum_{i=0..min(k,p)} C(k, i).

Expression (15) essentially represents the worst-case scenario where a brute-force search over all sparse signals is required, leading to a complexity exponential in m. For p = m, this is clearly the case. For p < m, the partial sum of binomial coefficients in (15) cannot be computed in closed form, but various approximations and bounds exist in the literature. For example, it has been shown in [31] that the partial sum of binomial coefficients in (15) grows exponentially in m if p = αm for a constant α. A corollary to this result states that if m/p is an unbounded, monotonically increasing function of m, the sum in (15) does not grow exponentially in m. Therefore, unless the fraction of the nonzero components of x becomes vanishingly small as the dimension of x grows, the worst-case complexity of the sparsity-aware sphere decoding algorithm is exponential in m (the length of x).
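The growth behavior discussed in the remark is easy to observe numerically. The sketch below (sizes illustrative) computes the exponent log2 of the partial binomial sum, normalized by m: for linear sparsity p = m/4 the exponent increases toward the binary entropy H(1/4) ≈ 0.811, i.e., exponential growth, while for fixed p the exponent decays toward zero, i.e., subexponential growth.

```python
from math import comb, log2

def partial_binomial_sum(m, p):
    """sum_{i=0}^{p} C(m, i): the number of p-sparse binary m-vectors."""
    return sum(comb(m, i) for i in range(p + 1))

# linear sparsity p = m/4: the exponent log2(count)/m approaches H(1/4)
rates_linear = [log2(partial_binomial_sum(m, m // 4)) / m for m in (20, 40, 80)]

# fixed sparsity p = 4: the exponent decays, so growth is subexponential
rates_fixed = [log2(partial_binomial_sum(m, 4)) / m for m in (20, 40, 80)]
```

Running this, `rates_linear` increases with m while staying below 0.812, and `rates_fixed` decreases with m, mirroring the dichotomy stated in the remark.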
III-B Ternary Alphabet {−1, 0, 1}
Define the support sets S_s = {i : x̄_i = s} for s ∈ {−1, 0, 1} and let w_s = |S_s|. Unlike the binary case, here we evaluate (7) by reversing the order of summation, i.e., by enumerating all possible sparse vectors x and then summing over all x̄ such that ||x − x̄||^2 = d. To this end, let us introduce

(17)  n_{s,t} = |{ i ∈ S_s : x_i = t }|,  s, t ∈ {−1, 0, 1}.

In words, n_{s,t} denotes the number of symbols t in x in the positions where x̄ has symbol s. It is easy to see that sum_t n_{s,t} = w_s and, furthermore,

(18)  ||x − x̄||^2 = sum_{s,t} (s − t)^2 n_{s,t},

where (s − t)^2 ∈ {0, 1, 4}. Given the counts n_{s,t}, d can be written as

(19)  d = n_{0,1} + n_{0,−1} + n_{1,0} + n_{−1,0} + 4 n_{1,−1} + 4 n_{−1,1}.
Lemma 2: Given x̄ and d, the number of k-dimensional lattice points x with ||x||_0 ≤ p such that ||x − x̄||^2 = d is given by
(20) 
for and
(21) 
for .
See Appendix B. Similar to the binary case, it can be shown that the total number of sparse ternary vectors is given by T = sum_{w=0..p} 2^w C(m, w), where w denotes the number of nonzero entries of x̄, each of which can equal 1 or −1. Therefore, (9) can be written in this case as a sum over the parameters introduced above, weighted by 1/T, where the summations are evaluated over the respective ranges as described above.
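As in the binary case, the ternary counts of Lemma 2 can be validated by enumeration on small instances. A hedged sketch (names illustrative, exponential cost, small dimensions only):

```python
from itertools import product

def count_ternary(x_bar, d, p):
    """Number of ternary vectors x in {-1,0,1}^m with at most p nonzero
    entries and squared distance ||x - x_bar||^2 exactly d.  Useful for
    checking closed-form counts on small m."""
    m = len(x_bar)
    return sum(
        1
        for x in product((-1, 0, 1), repeat=m)
        if sum(s != 0 for s in x) <= p
        and sum((a - b) ** 2 for a, b in zip(x, x_bar)) == d
    )
```

Note the distinctive ternary feature captured by (18): a nonzero entry disagreeing in sign contributes 4 to the squared distance, whereas a zero/nonzero disagreement contributes 1.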
Theorem 2: The expected number of tree points examined by the sparsity-aware sphere decoding algorithm while solving (2) over a ternary alphabet is

(22)

where the value of d is given by (19).
The proof follows from (7) and Lemma 2.
III-C Results
It is useful to define the complexity exponent as a measure of the complexity of the algorithm. To validate the derived theoretical expressions for the expected complexity of the sparsity-aware sphere decoding algorithm, we compare them with empirically evaluated complexity exponents. The results are shown in Fig. 5. The empirical expected complexity is obtained by averaging over Monte Carlo runs. As can be seen in Fig. 5, the theoretical expressions derived in this section match the empirical results exactly and are hence corroborated.
Fig. 6 compares the complexity exponent of the sparsity-aware sphere decoder with that of the classical (sparsity-unaware) sphere decoding algorithm for the binary alphabet. As in Fig. 3, the classical sphere decoder assumes the unknown vector to be non-sparse. As can be seen from the plot, the complexity of the sparsity-aware sphere decoder is less than that of the classical decoder, with the gap being more significant at lower values of the signal-to-noise ratio (SNR) and decreasing as the SNR increases. This is expected because, for low SNRs, the radius of the sphere is large and the complexity of the sparsity-aware sphere decoder approaches that of an exhaustive search for the closest point. Since the cardinality of the sparse set in this example is significantly smaller than that of the full lattice, the complexity gap between the two algorithms is pronounced. However, for high SNRs, the sphere often contains very few lattice points (occasionally only one), and hence the difference between the complexity exponents of the two algorithms reduces. Also note that the complexity of the sparsity-aware sphere decoder decreases as p decreases, since the sparsity constraint further reduces the number of points that fall within a sphere of a given radius.
IV Variance Analysis
In this section, we characterize the second-order moment of the complexity of the sparsity-aware sphere decoder. The following analysis is restricted to the binary alphabet, but it can be extended to more general alphabets in a relatively straightforward manner. It has been shown in [10] that the variance of the complexity of the sphere decoding algorithm is given by

(23)  Var(C) = sum_{k=1..m} sum_{l=1..m} f(k) f(l) ( E[N_k N_l] − E[N_k] E[N_l] ),

where N_k denotes the number of points visited at the k-th level of the tree and f(k) has the same meaning as in Section III. Note that (23) applies to the sparsity-aware sphere decoder as well, but the quantities entering it differ from those for the classical (sparsity-unaware) algorithm. In Section III, we found an expression for the expected number of points visited by the sparsity-aware sphere decoding algorithm. Therefore, to evaluate (23), what remains to be determined is E[N_k N_l], i.e., the correlation between the numbers of points of dimensions k and l lying inside a sphere of radius r. Let each of these points be p-sparse.
Given x̄, and using (4), the correlation between N_k and N_l can be found as [10]

(24)  E[N_k N_l] = sum_{x^{(k)}} sum_{x^{(l)}} P( x^{(k)} and x^{(l)} both lie inside the sphere of radius r ),

where x^{(k)} and x^{(l)} are arbitrary k- and l-dimensional sparse vectors having entries from {0, 1}, and the joint probability is evaluated using the following partition of R,
(25) 
for . Let be the dimensional subvector of the true solution, and define and . Depending on whether or not, the summand in (24) is given by [10]:

If ,
(26) 
(28) and where , , , and .
The number of pairs of points (x^{(k)}, x^{(l)}) over which the summation in (24) is evaluated depends on the specific symbol alphabet. Here we outline how to enumerate the total number of such pairs for the binary alphabet {0, 1}. Let us assume, without loss of generality, that k ≤ l. As shown in [10], the summand in (24) is a function of the quantities defined above. Therefore, we can evaluate (24) by counting the number of all possible solutions to the system of equations
(29) 
where the unknowns are integers satisfying the constraints imposed by the dimensions k and l. Unlike the scenario studied in [10], the space of permissible solutions to our problem is not isotropic (due to the sparsity constraint). Note that since x^{(k)} and x^{(l)} belong to the binary alphabet, they are k- and l-dimensional vectors with entries from the set {0, 1}. Moreover, each of these vectors is p-sparse. For the binary alphabet, the ℓ0 norm is equivalent to the ℓ1 norm. Therefore, the range of values that ||x^{(k)}||_1 can take is {0, 1, …, min(k, p)} and, similarly, the range of values that ||x^{(l)}||_1 can take is {0, 1, …, min(l, p)}; for a given value of one of these weights, the range of the other is constrained accordingly. Now, the nonzero entries in x^{(k)} and x^{(l)} – let us denote their numbers by q_k and q_l, respectively – can be arranged in C(k, q_k) and C(l, q_l) ways. Therefore, the number of possible vectors with the given weights is given by
(30) 
For any given and , we proceed by finding the number of possible pairs () satisfying the conditions (29). A close inspection of the definitions of and reveals that corresponds to the number of positions (out of ) where the entries of and can take values