Efficient Soft-Input Soft-Output Detection of Dual-Layer MIMO Systems
Abstract
A dual-layer multiple-input multiple-output (MIMO) system with multi-level modulation is considered. A computationally efficient soft-input soft-output receiver based on the exact max-log maximum a posteriori (max-log-MAP) principle is presented in the context of iterative detection and decoding. We show that the computational complexity of our exact max-log-MAP solution grows linearly with the constellation size and is also less than that of the best known Turbo-LORD methods, which provide only approximate solutions. Using decoder feedback to change the decision thresholds of the constellation symbols, we show that the exhaustive search operation reduces to a simple slicing operation.
I Introduction
Iterative detection and decoding (IDD) techniques have been widely used [1, 2, 3, 4] to improve the performance of multiple-input multiple-output (MIMO) systems. The detector utilizes the feedback from the decoder to enhance the accuracy of its output statistics. In [1, 2], the detector was designed as a linear minimum mean square error equalizer, accepting soft input from the channel decoder. The soft input was used to cancel the interference from other streams and to adapt the equalization (weight) vector by modifying the variance of the canceled streams. In [3], the detector was designed as a decision feedback equalizer with successive cancellation at the symbol level before passing the log-likelihood ratios (LLRs) of the code bits to the decoder. In [4], IDD was used to mitigate the effect of inter-cell interference in orthogonal frequency division multiplexing (OFDM) systems. In [5, 6, 7], a maximum a posteriori (MAP) approximating algorithm was proposed as an improvement over the layered orthogonal lattice detector (LORD) approach [8, 9]. In [10], list detectors were proposed in addition to iterative channel estimation in OFDM systems. Other MAP approximation algorithms were proposed in [11, 12, 13, 14], where modified sphere detection techniques were used.
Dual-layer transmission schemes are widely used in current cellular systems where user equipments cannot easily support more than two antennas. The solution presented in this paper is an exact solution of the max-log MAP detector for dual-layer systems and uses fewer metric computations than the approximate solution provided in [5, 6]. To generate the LLRs for one layer, we use the a priori LLRs generated by the turbo decoder for the other layer to modify its decision thresholds and then use the slicer as a simple search device.
The rest of the paper is organized as follows. The system model is described in Section II, and the exact max-log MAP solution is derived in Section III. In Section IV, we prove that the a priori probabilities can lead to constellation symbols with empty decision regions. In Section V, we provide the complete algorithm and describe it in pseudocode. We analyze the algorithm's computational complexity and compare it with that of other algorithms in Section VI, and we conclude the paper in Section VII.
Notations: Unless otherwise stated, lower case and upper case bold letters denote vectors and matrices, respectively, and $\mathbf{I}_N$ denotes the identity matrix of size $N$. Furthermore, $|\cdot|$ and $\|\cdot\|$ denote the absolute value and the $\ell_2$-norm, respectively, while $(\cdot)^H$ denotes the complex conjugate transpose operation.
II System Model
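We consider dual-layer transmission schemes, where two layers (streams) are transmitted over $N_t$ antennas using the precoding matrix $\mathbf{W}$ of size $N_t \times 2$. The receiver detects the transmitted streams using $N_r$ receive antennas. The input-output relation is given by
$$\mathbf{y} = \mathbf{H}\mathbf{W}\mathbf{x} + \mathbf{n} = \mathbf{h}_1 x_1 + \mathbf{h}_2 x_2 + \mathbf{n} \tag{1}$$
where $\mathbf{y}$, $\mathbf{x}$, $\mathbf{n}$, and $\mathbf{H}$ denote the received signal, transmitted symbols, background noise plus inter-cell interference, and channel matrix, respectively. Furthermore, $\mathbf{h}_k$ is the $k$th column vector of the equivalent channel matrix $\mathbf{H}\mathbf{W}$, and $x_k$ is the $k$th transmitted symbol chosen from the $M$-QAM constellation $\chi$. The $M$-QAM symbol, $x_k$, represents $\log_2 M$ code bits $b_{k,1}, \ldots, b_{k,\log_2 M}$. The above model suits single-carrier systems over flat fading channels and OFDM systems over frequency-selective channels where the relation in (1) applies to every subcarrier. In IDD, the detector computes the LLRs of the code bits and passes them to the channel decoder, which computes the extrinsic LLRs and feeds them back to the detector. The detector uses the a priori LLRs computed by the decoder to generate more accurate LLRs for the channel decoder and so forth. Assuming a known channel and zero-mean circularly symmetric complex Gaussian noise of covariance matrix $\mathbf{R}$, we write the log-MAP a posteriori detector LLR of the bit $b_{1,i}$ as follows:
$$L(b_{1,i}) = \log \frac{\sum_{x_1 \in \chi_i^1} \sum_{x_2 \in \chi} \exp\left(-\left\|\mathbf{R}^{-1/2}\left(\mathbf{y} - \mathbf{h}_1 x_1 - \mathbf{h}_2 x_2\right)\right\|^2\right) P(x_1) P(x_2)}{\sum_{x_1 \in \chi_i^0} \sum_{x_2 \in \chi} \exp\left(-\left\|\mathbf{R}^{-1/2}\left(\mathbf{y} - \mathbf{h}_1 x_1 - \mathbf{h}_2 x_2\right)\right\|^2\right) P(x_1) P(x_2)} \tag{2}$$
where $\chi_i^1$ and $\chi_i^0$ denote the constellation sets where the $i$th bit is '1' and '0', respectively, and $P(x_1)$ and $P(x_2)$ denote the a priori probabilities of the symbols $x_1$ and $x_2$, respectively. The max-log MAP approximation of the LLR of $b_{1,i}$ is given by:
$$L(b_{1,i}) \approx \max_{x_1 \in \chi_i^1,\, x_2 \in \chi} \left[-\left\|\mathbf{R}^{-1/2}\left(\mathbf{y} - \mathbf{h}_1 x_1 - \mathbf{h}_2 x_2\right)\right\|^2 + \log P(x_1) + \log P(x_2)\right] - \max_{x_1 \in \chi_i^0,\, x_2 \in \chi} \left[-\left\|\mathbf{R}^{-1/2}\left(\mathbf{y} - \mathbf{h}_1 x_1 - \mathbf{h}_2 x_2\right)\right\|^2 + \log P(x_1) + \log P(x_2)\right] \tag{3}$$
Similarly, the max-log MAP LLR of $b_{2,i}$ is:
$$L(b_{2,i}) \approx \max_{x_1 \in \chi,\, x_2 \in \chi_i^1} \left[-\left\|\mathbf{R}^{-1/2}\left(\mathbf{y} - \mathbf{h}_1 x_1 - \mathbf{h}_2 x_2\right)\right\|^2 + \log P(x_1) + \log P(x_2)\right] - \max_{x_1 \in \chi,\, x_2 \in \chi_i^0} \left[-\left\|\mathbf{R}^{-1/2}\left(\mathbf{y} - \mathbf{h}_1 x_1 - \mathbf{h}_2 x_2\right)\right\|^2 + \log P(x_1) + \log P(x_2)\right] \tag{4}$$
The brute-force solution of (3) (and similarly (4)) requires the computation of $M^2$ metrics where, for each instance of $x_1$, the metric is computed for all instances of $x_2$. However, we show in Section III how we obtain the exact max-log MAP solution for the LLRs using fewer than $4M$ (rather than $M^2$) metric computations.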
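As a rough illustration of the brute-force search described above, the following sketch enumerates every symbol pair for a toy 4-QAM constellation. It assumes the standard max-log-MAP metric (negative squared residual plus the log-priors of both symbols), noise that is already whitened, and the LLR convention log P(b=1)/P(b=0). The bit labelling and the names `qam4`, `log_prior`, and `brute_force_llrs` are hypothetical and are not taken from the paper.

```python
import itertools
import numpy as np

def qam4():
    """Toy 4-QAM: returns (symbols, bits); bits[m] is the 2-bit label of symbols[m]."""
    bits = np.array(list(itertools.product([0, 1], repeat=2)))
    symbols = ((1 - 2 * bits[:, 0]) + 1j * (1 - 2 * bits[:, 1])) / np.sqrt(2)
    return symbols, bits

def log_prior(bits, llr):
    """log P(symbol) from per-bit a priori LLRs (assumed LLR = log P(b=1)/P(b=0))."""
    log_p1 = -np.logaddexp(0.0, -llr)   # log P(b=1)
    log_p0 = -np.logaddexp(0.0, llr)    # log P(b=0)
    return np.sum(bits * log_p1 + (1 - bits) * log_p0, axis=1)

def brute_force_llrs(y, h1, h2, symbols, bits, llr1, llr2):
    """Max-log-MAP LLRs of the layer-1 bits by enumerating all M*M symbol pairs."""
    lp1, lp2 = log_prior(bits, llr1), log_prior(bits, llr2)
    # resid[m, n, :] = y - h1 * s_m - h2 * s_n
    resid = (y[None, None, :]
             - symbols[:, None, None] * h1
             - symbols[None, :, None] * h2)
    metric = -np.sum(np.abs(resid) ** 2, axis=2) + lp1[:, None] + lp2[None, :]
    best_per_x1 = metric.max(axis=1)    # inner maximisation over x2 for each x1
    llrs = np.array([
        best_per_x1[bits[:, i] == 1].max() - best_per_x1[bits[:, i] == 0].max()
        for i in range(bits.shape[1])
    ])
    return llrs, metric.size            # metric.size counts the M^2 metrics evaluated
```

The inner `max(axis=1)` is the step that the slicer-based method replaces: for each candidate of one layer, the best symbol of the other layer is found without evaluating all of its candidates.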
III Exact Max-Log MAP Solution
The main strategy of our solution is to convert the norms in (3) and (4) into simple absolute values suited to slicing operations. Then, we exploit the a priori LLRs to control the thresholds of the slicers. We begin by whitening the noise to get $\tilde{\mathbf{y}} = \mathbf{R}^{-1/2}\mathbf{y}$ and $\tilde{\mathbf{h}}_k = \mathbf{R}^{-1/2}\mathbf{h}_k$, where $\mathbf{R}$ is the noise covariance matrix. We then rewrite the bottleneck maximization problem as follows:
(5) 
where is the all-zero vector of length , is an unitary matrix as is an matrix chosen such that and . We write the matrix in this form to exploit its unitary structure: it can be taken as a common factor out of the norm without affecting the value of the norm. This converts the norm into a single absolute value as follows. Since , we rewrite the maximization as follows:
(6) 
where . If the a priori probability term were absent [15] (i.e., ML instead of MAP), then the solution of the maximization in (6) would be a simple slicer, and only $2M$ metrics (one enumeration over each layer's symbol in (3) and (4), where $M$ is the constellation size) would need to be computed to obtain the LLRs of the code bits of both layers. With the a priori probability term, we obtain the exact solution of (6) with a reasonable increase in the number of metric computations which is, interestingly, less than that of the approximate solution in [5, 6]. In modern communications standards [16], the real and imaginary parts of each QAM symbol correspond to two orthogonal $\sqrt{M}$-PAM constellations (assuming a square constellation, without loss of generality). In Fig. 1, we show the one-dimensional PAM constellation corresponding to the real or imaginary part of any complex QAM constellation. Hence, we rewrite (6) as follows:
(7) 
where and denote the real and imaginary parts of , respectively. Furthermore, and denote the a priori probabilities that the real and imaginary parts of equal and , computed using the a priori LLRs of the bits corresponding to the real and imaginary parts, respectively.
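The reduction from a vector norm to a scalar absolute value, and then to two independent real searches, can be checked numerically. The sketch below covers only the prior-free (ML) case and makes several assumptions: `r` stands for the whitened residual after one layer's candidate has been subtracted, `h2` for the other layer's whitened channel column, and the scalar statistic is taken as `z = h2^H r / ||h2||`. These names, and the unnormalised 16-QAM grid, are illustrative rather than the paper's notation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_rx = 4
h2 = rng.normal(size=n_rx) + 1j * rng.normal(size=n_rx)  # assumed whitened channel column
r = rng.normal(size=n_rx) + 1j * rng.normal(size=n_rx)   # assumed residual for a fixed x1

pam = np.array([-3.0, -1.0, 1.0, 3.0])
qam = (pam[:, None] + 1j * pam[None, :]).ravel()         # 16-QAM candidates for x2

c = np.linalg.norm(h2)
z = np.vdot(h2, r) / c                                   # h2^H r / ||h2|| (vdot conjugates h2)

# Full vector metric versus scalar metric for every candidate x2
full = np.sum(np.abs(r[None, :] - qam[:, None] * h2[None, :]) ** 2, axis=1)
scalar = np.abs(z - c * qam) ** 2
offset = full - scalar                                   # same value for every candidate

# The scalar metric separates into real and imaginary parts, so two PAM
# searches replace one QAM search.
best_re = pam[np.argmin((z.real - c * pam) ** 2)]
best_im = pam[np.argmin((z.imag - c * pam) ** 2)]
x2_sliced = best_re + 1j * best_im
x2_exhaustive = qam[np.argmin(full)]
```

Because `offset` does not depend on the candidate, minimising the scalar metric picks the same symbol as minimising the full norm. When a priori terms are present and the bits map independently onto the real and imaginary parts, each one-dimensional search gains its own log-prior term.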
Next, we use the a priori probabilities (LLRs) to modify the decision regions of the PAM real and imaginary symbols; correspondingly apply the slicer to the real and imaginary parts of , respectively, to find the solution of (7); and then compute the metrics in (3) and (4), which can be significantly simplified using (7). To develop the method of modifying the decision boundaries, we derive the decision region of the symbol in Fig. 1 by writing the conditions on such that
(8) 
Simplifying (8), we get the decision region of as follows:
(9) 
Similarly, the decision region of is given by
(10) 
and the decision region of the last symbol is given by
(11)  
(12) 
 is called the probabilistic boundary between the constellation symbols and . Equation (12) shows that the boundary between two neighboring symbols moves towards the symbol with the lower a priori probability, shrinking its decision region while extending that of the symbol with the higher a priori probability. Equation (12) also shows that without a priori LLRs (i.e., with equal a priori probabilities), the boundaries between symbols return to their original values (the average of the two constellation symbol amplitudes).
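The behaviour of the probabilistic boundary can be reproduced with a short sketch. It assumes a one-dimensional metric of the form -(z - s)^2 / scale + ln P(s), where `scale` stands in for whatever normalisation the rotated and whitened model implies; under that assumption, the threshold between a lower symbol and a higher symbol is their midpoint plus a shift proportional to the log-ratio of their priors. The function names are hypothetical.

```python
import numpy as np

def boundary(s_lo, s_hi, p_lo, p_hi, scale=1.0):
    """Threshold t such that s_hi beats s_lo exactly when z > t.

    Derived from -(z - s)^2 / scale + ln P(s), an assumed metric form.
    """
    return 0.5 * (s_lo + s_hi) + scale * np.log(p_lo / p_hi) / (2.0 * (s_hi - s_lo))

def map_slice(z, pam, prior, scale=1.0):
    """Index of the MAP symbol, found using threshold comparisons only.

    pam must be sorted in ascending order. The current winner is compared
    against each higher symbol through their pairwise boundary.
    """
    k = 0
    for n in range(1, len(pam)):
        if z > boundary(pam[k], pam[n], prior[k], prior[n], scale):
            k = n
    return k
```

With equal priors, `boundary(-1.0, 1.0, 0.5, 0.5)` returns the midpoint 0, while `boundary(-1.0, 1.0, 0.8, 0.2)` is positive: the threshold moves towards the less likely symbol at +1 and shrinks its region.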
IV Symbols With Empty Decision Regions
We prove that the a priori probability distribution can lead to constellation symbols with empty decision regions that will not be chosen by the slicer regardless of (or ).
Theorem 1
When computing the lower bound of the decision region for the constellation symbol , given by , the following can occur:
(13) 
meaning that the lower bound of the symbol is not determined by its boundary with the adjacent symbol , but determined instead by its boundary with a farther symbol . In this case, all symbols lying between and (i.e., the constellation symbols , where ) do not have decision regions and will not be chosen regardless of the decision statistic value.
From (10), the decision boundaries for , where are given by
(14) 
However, there is no value for that satisfies (14) if
(15) 
In the sequel, we prove that the condition in (15) is satisfied if , i.e., and, hence,
(16)  
(17) 
where and is the separation between adjacent real (or imaginary) constellation symbols as shown in Fig. 1. Since , we define
(18) 
where . We rewrite (17) as follows:
(19) 
Next, we rewrite the condition in (15) as follows:
(20) 
Using the inequality in (19), we bound as follows:
(21) 
which concludes the proof. The practical importance of this theorem is that it can reduce the algorithm complexity and further speed it up. For example, if the lower boundary of is determined by then we do not need to compute the decision boundaries of and because they will have empty decision regions.
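A small numerical case makes the theorem concrete. The sketch below assumes the same one-dimensional metric form, -(z - s)^2 / scale + ln P(s), and computes every pairwise boundary explicitly; the 4-PAM amplitudes, the priors, and `scale=4.0` are illustrative values chosen so that the two inner symbols lose their decision regions. The function name is hypothetical.

```python
import numpy as np

def decision_regions(pam, prior, scale=1.0):
    """Return (lower, upper) decision bounds per symbol; pam sorted ascending."""
    m = len(pam)
    t = np.full((m, m), np.nan)   # t[a, b]: boundary between symbols a < b
    for a in range(m):
        for b in range(a + 1, m):
            t[a, b] = (0.5 * (pam[a] + pam[b])
                       + scale * np.log(prior[a] / prior[b]) / (2.0 * (pam[b] - pam[a])))
    regions = []
    for k in range(m):
        # Lower bound: the largest boundary with any symbol below k.
        lower = max([t[a, k] for a in range(k)], default=-np.inf)
        # Upper bound: the smallest boundary with any symbol above k.
        upper = min([t[k, b] for b in range(k + 1, m)], default=np.inf)
        regions.append((lower, upper))
    return regions

pam = np.array([-3.0, -1.0, 1.0, 3.0])
prior = np.array([0.45, 0.05, 0.05, 0.45])   # strong prior on the two outer symbols
regions = decision_regions(pam, prior, scale=4.0)
empty = [k for k, (lo, hi) in enumerate(regions) if lo >= hi]
```

Here the lower bound of the last symbol is set by its boundary with the first symbol rather than with its neighbour, and `empty` lists the two inner symbols, so their boundaries never need to be evaluated by the slicer.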
V Algorithm and Computational Complexity
In the sequel, we summarize the algorithm and show the receiver model in Fig. 2.
Preprocessing: Compute and whiten the noise by computing , , and .
Procedure:
1) Get the decision regions for and using the corresponding a priori LLRs as follows:
Initialize .
While
A) Compute the lower and upper thresholds of the th constellation symbol as and , respectively, where
(22)
where , denotes the a priori LLR of the code bit , and are the bit vectors corresponding to the constellation symbols and , respectively. The transition from the probability domain in (12) to the LLR domain in (22) is straightforward.
B) If , set the decision regions of the symbols , where , to empty, and set
Else, set
End While
2) Enumeration step over constellation points of and .
For
a) Compute the following quantities for
b) Slice the real and imaginary parts of and using the thresholds obtained in Step 1 to obtain and , respectively.
c) Compute the following metrics
(23)
(24)
where are the bit vectors of , respectively.
End For
3) Compute the detector LLRs for and
(25)
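The move from the probability domain to the LLR domain in the threshold computation rests on a simple identity, sketched below. It assumes independent code bits and the convention LLR = ln P(b=1)/P(b=0), under which the log-ratio of two symbol priors is the inner product of the bit-label difference with the LLR vector, so no probabilities need to be formed. The helper names and the `scale` factor are hypothetical.

```python
import numpy as np

def symbol_log_prior(bits, llr):
    """ln P(symbol) for independent bits, assuming LLR = ln P(b=1)/P(b=0)."""
    bits = np.asarray(bits, dtype=float)
    llr = np.asarray(llr, dtype=float)
    # ln P(b) = b * L - ln(1 + e^L) covers both b = 0 and b = 1.
    return float(np.sum(bits * llr - np.logaddexp(0.0, llr)))

def log_prob_ratio_from_llrs(bits_a, bits_b, llr):
    """ln(P(a) / P(b)) computed directly from the LLRs; normalisers cancel."""
    diff = np.asarray(bits_a, dtype=float) - np.asarray(bits_b, dtype=float)
    return float(np.dot(diff, np.asarray(llr, dtype=float)))

def llr_domain_boundary(s_lo, s_hi, bits_lo, bits_hi, llr, scale=1.0):
    """Threshold between two PAM symbols written in terms of a priori LLRs."""
    shift = scale * log_prob_ratio_from_llrs(bits_lo, bits_hi, llr) / (2.0 * (s_hi - s_lo))
    return 0.5 * (s_lo + s_hi) + shift
```

Only the bits in which the two labels differ contribute to the shift, which is why neighbouring symbols under a Gray-type labelling would need a single LLR per boundary.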
VI Complexity Analysis
We count the number of required metric computations to obtain the detector LLRs corresponding to $x_1$ and $x_2$, the symbols of the two layers, each drawn from a constellation of size $M$. To get the new decision regions, we need to compute the probabilistic boundaries between every two symbols of the $\sqrt{M}$ PAM symbols (for both real and imaginary parts). Since there are $\binom{\sqrt{M}}{2} = \sqrt{M}(\sqrt{M}-1)/2$ symbol pairs per dimension, the number of metrics (boundaries) to be computed is
$$2\binom{\sqrt{M}}{2} = \sqrt{M}\left(\sqrt{M}-1\right) \tag{26}$$
Note that these boundaries are computed only once and are not included inside the enumeration over $x_1$ in (3). Hence, the total number of metric computations to obtain the LLRs of the code bits corresponding to $x_1$ is $M + \sqrt{M}(\sqrt{M}-1)$. To obtain the LLRs corresponding to both layers (i.e., $x_1$ and $x_2$), the number of metric computations per tone becomes
$$2M + 2\sqrt{M}\left(\sqrt{M}-1\right) = 4M - 2\sqrt{M} < 4M \tag{27}$$
In Table I, we compare our algorithm with the Turbo-LORD (TLORD) [5] and the brute-force algorithms in terms of the number of metrics to be computed, the number of real multiplications (Muls), and the number of real additions (Adds) per tone per iteration as a function of the constellation size and the number of receive antennas. In Table II, we compare these algorithms for 256-QAM and two receive antennas, where we observe a significant computational complexity saving without any performance loss, since our algorithm obtains the exact solution rather than the approximate solution in [5]. In TLORD [5], while enumerating over $x_1$, three candidates for $x_2$ are obtained for every candidate value of $x_1$. Hence, we have $3M$ candidates for the $(x_1, x_2)$ pair, and the metric in (23) is computed for each candidate. Doing the same for $x_2$, we have another $3M$ metrics, summing up to $6M$ metrics to be computed.
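Table I: Number of metrics, real multiplications, and real additions per tone per iteration as a function of the constellation size and the number of receive antennas.
Detector  Metrics  Real Muls  Real Adds
Proposed
TLORD
Brute force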
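Table II: Number of metrics, real multiplications, and real additions per tone per iteration for 256-QAM and two receive antennas.
Detector     Metrics  Real Muls  Real Adds
Proposed     992      12768      16732
TLORD        1536     20480      23548
Brute force  65536    524288     785408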
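The metric counts in Table II can be reproduced from the counting argument in this section. The sketch below assumes a square constellation of size M with the square root of M PAM levels per dimension, one boundary per pair of PAM levels in each of the two dimensions, M enumerated metrics per layer for the proposed method, three candidates per enumerated symbol for TLORD, and all M^2 pairs for brute force. The function name is hypothetical, and the multiplication and addition counts are not modelled.

```python
import math

def metric_counts(m):
    """Metric computations per tone per iteration for a square M-QAM constellation."""
    root = math.isqrt(m)
    if root * root != m:
        raise ValueError("expected a square constellation size")
    # Boundaries between every pair of PAM levels, for real and imaginary parts.
    boundaries_per_layer = 2 * math.comb(root, 2)
    return {
        "proposed": 2 * (m + boundaries_per_layer),  # both layers
        "tlord": 2 * 3 * m,                          # three candidates per symbol, both layers
        "brute_force": m * m,                        # every symbol pair
    }

counts_256 = metric_counts(256)
```

For 256-QAM this evaluates to 992, 1536, and 65536 metrics, matching the Metrics column of Table II; the proposed count equals 4M minus twice the square root of M, which stays below 4M for any constellation size.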
VII Conclusion
We developed the exact max-log MAP detector for IDD in dual-layer MIMO schemes, requiring fewer than $4M$ metric computations, where $M$ is the constellation size. The idea is to use the a priori LLRs to modify the decision thresholds of the constellation symbols. We also showed that the a priori LLRs can lead to constellation symbols with empty decision regions, reducing the search space of the slicing block. Comparing the computational complexity with the approximate Turbo-LORD solution and the exact brute-force solution, we show that our algorithm achieves a significant complexity reduction while still obtaining the exact max-log MAP solution. We have numerically verified that our method yields the same performance as the brute-force solution for various simulation parameters, but the simulation results are not shown here due to space limitations.
References
 [1] M. Tuchler, A. Singer, and R. Koetter, “Minimum mean squared error equalization using a priori information,” IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 673–683, 2002.
 [2] M. Sellathurai and S. Haykin, “Turbo-BLAST for wireless communications: theory and experiments,” IEEE Transactions on Signal Processing, vol. 50, no. 10, pp. 2538–2546, 2002.
 [3] J. Choi, A. Singer, L. Jungwoo, and N. Cho, “Improved linear soft-input soft-output detection via soft feedback successive interference cancellation,” IEEE Trans. on Comm., vol. 58, no. 3, pp. 986–996, 2010.
 [4] M. Mikami and T. Fujii, “Iterative MIMO signal detection with inter-cell interference cancellation for downlink transmission in coded OFDM cellular systems,” in IEEE Vehicular Technology Conference, 2009.
 [5] A. Tomasoni, M. Siti, M. Ferrari, and S. Bellini, “Low complexity, quasi-optimal MIMO detectors for iterative receivers,” IEEE Transactions on Wireless Communications, vol. 9, no. 10, pp. 3166–3177, 2010.
 [6] A. Tomasoni, M. Siti, M. Ferrari, and S. Bellini, “Turbo-LORD: A MAP-approaching soft-input soft-output detector for iterative MIMO receivers,” in IEEE Global Telecomm. Conference, 2007, pp. 3504–3508.
 [7] A. Tomasoni, M. Siti, M. Ferrari, and S. Bellini, “A K-best version of the turbo-LORD MIMO detector in realistic settings,” in IEEE International Conference on Communications, 2009.
 [8] M. Siti and M. Fitz, “Layered orthogonal lattice detector for two transmit antenna communications,” in Allerton Conference On Communication, Control, And Computing, 2005.
 [9] M. Siti and M. Fitz, “A novel soft-output layered orthogonal lattice detector for multiple antenna communications,” in IEEE ICC, 2006.
 [10] J. Ylioinas and M. Juntti, “Iterative joint detection, decoding, and channel estimation in turbo-coded MIMO-OFDM,” IEEE Transactions on Vehicular Technology, vol. 58, no. 4, pp. 1784–1796, 2009.
 [11] J. Choi, Y. Hong, and J. Yuan, “An approximate MAP-based iterative receiver for MIMO channels using modified sphere detection,” IEEE Trans. on Wireless Communications, vol. 5, no. 8, pp. 2119–2126, 2006.
 [12] H. Vikalo, B. Hassibi, and T. Kailath, “Iterative decoding for MIMO channels via modified sphere decoding,” IEEE Transactions on Wireless Communications, vol. 3, no. 6, pp. 2299–2311, 2004.
 [13] S. Han, T. Cui, and C. Tellambura, “Improved K-best sphere detection for uncoded and coded MIMO systems,” IEEE Wireless Communications Letters, vol. 1, no. 5, pp. 472–475, 2012.
 [14] B. Hochwald and S. ten Brink, “Achieving near-capacity on a multiple-antenna channel,” IEEE Transactions on Communications, vol. 51, no. 3, pp. 389–399, 2003.
 [15] R. Ghaffar and R. Knopp, “Interference sensitivity for multiuser MIMO in LTE,” in IEEE SPAWC Workshop, 2011.
 [16] “Physical channels and modulation,” 3GPP TS 36.211, 2010-2013.