# On the Outage Capacity of a Practical Decoder Accounting for Channel Estimation Inaccuracies^{1}

## Abstract

The optimal decoder achieving the outage capacity under imperfect channel estimation is investigated. First, by searching into the family of nearest neighbor decoders, which can be easily implemented on most practical coded modulation systems, we derive a decoding metric that minimizes the average of the transmission error probability over all channel estimation errors. Next, we specialize our general expression to obtain the corresponding decoding metric for fading MIMO channels. According to the notion of estimation-induced outage (EIO) capacity introduced in our previous work and assuming no channel state information (CSI) at the transmitter, we characterize maximal achievable information rates, using Gaussian codebooks, associated to the proposed decoder. In the case of uncorrelated Rayleigh fading, these achievable rates are compared to the rates achieved by the classical mismatched maximum-likelihood (ML) decoder and the ultimate limits given by the EIO capacity. Numerical results show that the derived metric provides significant gains for the considered scenario, in terms of achievable information rates and bit error rate (BER), in a bit interleaved coded modulation (BICM) framework, without introducing any additional decoding complexity.

## 1 Introduction

Consider a practical wireless communication system in which the receiver has access only to noisy channel estimates, which may in some circumstances be poor, and these estimates are not available at the transmitter. This constraint is a practical concern for the design of such systems, which, in spite of their limited knowledge, must ensure communication with a prescribed quality of service (QoS). This QoS requires guaranteeing transmissions with a given target information rate and small error probability, whatever degree of estimation accuracy arises during the transmission. The described scenario raises two important questions: (i) what are the theoretical limits of reliable transmission rates, using the best possible decoder in the presence of imperfect channel state information at the receiver (CSIR), and (ii) how can those limits be achieved by practical decoders in coded modulation systems? Of course, these questions are strongly related to the notion of capacity, which must take into account the above-mentioned constraints.

We addressed the first question (i) in [1], for arbitrary memoryless channels, by introducing the notion of *Estimation-Induced Outage Capacity* (EIO capacity). This novel notion characterizes the information-theoretic limits of such scenarios, where the transmitter and receiver strive to construct codes that ensure the desired communication service, whatever degree of estimation accuracy arises during the transmission. The explicit expression of this capacity allows one to evaluate the optimal trade-off between the maximal achievable outage rate (i.e. maximizing over all possible transmitter-receiver pairs) and the outage probability (the QoS constraint). A system designer can use it to optimally share the available resources (e.g. power for transmission and training, the amount of training used, etc.) so that the communication requirements are satisfied. Nevertheless, the theoretical decoder used to achieve this capacity cannot be implemented on practical communication systems.

The second question (ii), concerning the derivation of a practical decoder that can achieve information rates close to the EIO capacity, is addressed in this paper. Classically, one replaces the exact channel by its estimate in the decoding metric. This is known as mismatched maximum-likelihood (ML) decoding. However, this scheme is not appropriate in the presence of channel estimation errors (CEE), at least if the estimation errors are large, i.e. for a small number of training symbols [2]. This problem has recently motivated a lot of work. In [3] and [4] the authors analyze the bit error rate (BER) performance of this mismatched decoder in the case of an orthogonal frequency division multiplexing (OFDM) system. Reference [5] considered a training-based MIMO system and showed that, to compensate for the performance degradation due to CEE, the number of receive antennas should be increased, which may become a limiting factor for mobile applications. On the other hand, the performance of Bit Interleaved Coded Modulation (BICM) over fading MIMO channels with perfect CSI was studied, for instance, in [6], [7] and [8]. Cavers, in [9], derived a tight upper bound on the symbol error rate of pilot symbol assisted modulation (PSAM) for a QAM constellation. A similar investigation was carried out in [10], showing that for iterative decoding of BICM at low SNR, the quality of channel estimates is too poor for them to be used in the mismatched ML decoder.

As an alternative to the aforementioned decoder, Tarokh *et al.* in [11] and Taricco and Biglieri in [2] proposed an improved ML detection metric and applied it to a space-time coded MIMO system, where they showed the superiority of this metric in terms of BER. Interestingly enough, this decoding metric can be formally derived as a special case of the general framework presented in this paper. So far, most of the research in the field has focused on evaluating the performance of mismatched decoders in terms of BER (cf. [12]), without providing an answer to question (ii). In [13], the authors investigate achievable rates of a weighted nearest-neighbor decoder for multiple-antenna channels. Moreover, in [14] and [1], the authors show that the achievable rates using mismatched ML decoding are largely sub-optimal (at least for a limited number of training symbols) compared to the ultimate limits given by the EIO capacity. In this paper, according to the notion of EIO capacity, we investigate the maximal achievable information rate with Gaussian codebooks of the improved decoder in [11]. Furthermore, it can be shown that this decoder achieves the capacity of a composite (more noisy) channel.

This paper is organized as follows. Section 2 briefly reviews our notion of capacity. Then, we search within the family of decoders that can be easily implemented on most practical coded modulation systems to derive the general expression of the decoder. This decoder minimizes the average of the transmission error probability over all CEE. We accomplish this by exploiting the availability of the statistics characterizing the quality of channel estimates, i.e., the *a posteriori* probability density function (pdf) of the unknown (true) channel conditioned on its estimate. Section 3 describes the fading MIMO model. In Section 4, we specialize our expression of the decoding metric to the case of MIMO channels and use it for iterative decoding of MIMO-BICM. In Section 5, we compute achievable information rates of a receiver using the proposed decoder and compare these to the EIO capacity and the achievable rates of the classical mismatched approach. Section 6 illustrates via simulations, conducted over uncorrelated Rayleigh fading, the performance of the improved decoder in terms of achievable outage rates and BER, compared to those provided by mismatched ML decoding.

Notational conventions are as follows. Upper and lower case bold symbols are used to denote matrices and vectors; $\mathbb{I}_M$ represents an $(M \times M)$ identity matrix; $\mathbb{E}_{\mathbf{X}}\{\cdot\}$ refers to expectation with respect to the random vector $\mathbf{X}$; $|\cdot|$ and $\|\cdot\|_F$ denote matrix determinant and Frobenius norm, respectively; $(\cdot)^T$ and $(\cdot)^\dagger$ denote vector transpose and Hermitian transpose, respectively.

## 2 Decoding under Imperfect Channel Estimation

Throughout this section we focus on deriving a practical decoder for general memoryless channels that achieves information rates close to the EIO capacity (the ultimate bound).

### 2.1 Communication Model Under Channel Uncertainty

A specific instance of the memoryless channel is characterized by a transition probability with an unknown channel state , over input and output alphabets . Here, is a family of conditional pdf parameterized by the vector of parameters , where denotes the number of parameters. Throughout the paper we assume that the channel state, which neither the transmitter nor the receiver know exactly, remains constant within blocks of symbols, related to the product of the coherence time and the coherence bandwidth of a wireless channel, and these states for different blocks are i.i.d. (e.g. block Rayleigh fading). The transmitter does not know and the receiver only knows an estimate and a *characterization of the estimator performance* in terms of the conditional pdf (obtained by using , the estimation function and ). A decoder using , instead of , obviously might not support an information rate (even small rates might not be supported if and are strongly different). Consequently, outage events induced by CEE will occur with a certain probability . The scenario underlying these assumptions is motivated by current wireless systems, where the coherence time for mobile receivers may be too short to permit reliable estimation of the fading coefficients and in spite of this fact, the desired communication service must be guaranteed. This leads to the following notion of capacity.

### 2.2 A Brief Review of EIO Capacity

A message is transmitted using a pair of mappings, where is the encoder, and is the decoder (that utilizes ). The random rate, which depends on the unknown channel realization through its probability of error, is given by . The maximum error probability (over all messages)

where . For a given channel estimate , and , an outage rate is -achievable if for every and every sufficiently large there exists a sequence of length- block codes such that the rate satisfies the quality of service

where stands for the set of all channel states allowing for the desired transmission rate , and is the set of all channel states allowing for reliable decoding (arbitrary small error probability). This definition requires that maximum error probabilities larger than occur with probability less than . The practical advantage of such definition is that for % of channel estimates, the transmitter and receiver strive to construct codes for ensuring the desired communication service. The EIO capacity is then defined as the largest -achievable rate, for an outage probability and a given channel estimate , as

where the maximization is taken over all encoder and decoder pairs. In [1], we proved the following coding Theorem that provides an explicit way to evaluate the maximal outage rate versus outage probability for an estimate , characterized by .

The existence of a decoder in achieving the capacity is proved using a random-coding argument, based on the well-known method of typical sequences [15]. Nevertheless, this decoder cannot be implemented on practical communication systems.

### 2.3 Derivation of a Practical Decoder Using Channel Estimation Accuracy

We now consider the problem of deriving a practical decoder that achieves the capacity . Assume that we restrict the search for decoding functions , maximizing , to the class of additive decoding metrics, which can be implemented on realistic systems. This means that for a given channel output , we set the decoding function

where and is an arbitrary per-letter additive metric. Consequently, the maximization in is actually equivalent to maximizing over all decoding metrics . Note, however, that this restriction does not necessarily lead to an optimal decoder achieving the capacity.

Problem statement:

In order to find the optimal decoding metric maximizing the outage rates in , for a given outage probability and channel estimate , it is necessary to look at the intrinsic properties of the capacity definition. Observe that the size of the set of all channel states allowing for reliable decoding is determined by the decoding function . The maximal achievable rate , constrained to the outage probability , is thus limited by this size. Hence, for a given decoder , there exists an optimal set of channel states with conditional probability larger than , providing the largest achievable rate, which follows as the minimal instantaneous rate for the worst . The optimal set is equal to the set maximizing the expression . Hence, an optimal decoding metric must guarantee minimum error probability for every .

The computation of such a metric becomes very difficult (not necessarily feasible by using the class of decoders in ), since the maximization in by using is not an explicit function of . However, it is interesting to note that if the set defines a compact and convex set of channels , then the optimal decoding metric can be chosen as the ML decoder , where is the channel state minimizing the mutual information in . The receiver can thus be an ML receiver with respect to the worst channel in the family [16]. However, in most practical cases, the channel states are represented by vectors of complex coefficients that do not lead to convex sets of channels.

Optimal decoder for composite channels:

Instead of trying to find an optimal decoding metric minimizing the error probability for every , we propose to look at the decoding metric minimizing the average of the transmission error probability over all CEE. This means,

where is obtained by replacing in . Since the channel is memoryless, the average of error probability in can be written as the error probability of a composite (more noisy) channel . This channel follows as the average of the unknown channel over all CEE given the estimate . Then, by taking the logarithm of this channel we obtain its ML decoder, which minimizes (for sufficiently large) the error probability in . Actually, by following an analogy with the proof in [16], it can be shown that

Remark:

We emphasize that this decoder cannot guarantee small error probabilities for every channel state , and consequently it only achieves a lower bound of the EIO capacity . Nevertheless, it achieves the capacity of the composite channel. The remaining question is how much lower the achievable outage rates using the metric are, compared to those of the theoretical decoder achieving the EIO capacity. In Section 5, we evaluate and its achievable information rates for the fading MIMO channel with no CSI at the transmitter.

## 3 System Model

### 3.1 Fading MIMO Channel

We consider a single-user MIMO system with transmit and receiver antennas transmitting over a frequency non-selective channel and refer to it as a MIMO channel. Figure 1 depicts the BICM coding scheme used at the transmitter. The binary data sequence is encoded by a non-recursive and non-systematic convolutional (NRNSC) code, before being interleaved by a quasi-random interleaver. The output bits are gathered in subsequences of bits and mapped to complex M-QAM vector symbols with average power . We also send some pilot symbols at the beginning of each data frame for channel estimation. The symbols of a frame are then multiplexed for being transmitted through antennas. Assuming a frame of transmitted symbols associated to each channel matrix , the received signal vector of dimension is given by

where is the vector of transmitted symbols, referred to as a compound symbol. Here, the entries of the random matrix are independent identically distributed (i.i.d.) Zero-Mean Circularly Symmetric Complex Gaussian (ZMCSCG) random variables. Thus, the channel state is distributed as

where is the Hermitian covariance matrix of the columns of (assumed to be the same for all columns), i.e., . The noise vector consists of ZMCSCG random vector with covariance matrix . Both and are assumed ergodic and stationary random processes, and the channel matrix is independent of and .

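### 3.2 Pilot Based Channel Estimation

Assuming that the channel matrix $\mathbf{H}$ is time-invariant over an entire frame, channel estimation is usually performed on the basis of known training (pilot) symbols transmitted at the beginning of each frame. The transmitter, before sending the data, sends a training sequence of $N$ vectors $\mathbf{X}_T = (\mathbf{x}_{T,1},\dots,\mathbf{x}_{T,N})$. According to the channel model, this sequence is affected by the channel matrix $\mathbf{H}$, allowing the receiver to observe separately $\mathbf{Y}_T = \mathbf{H}\mathbf{X}_T + \mathbf{Z}_T$, where $\mathbf{Z}_T$ is the noise matrix affecting the transmission of training symbols. We assume that the coherence time is much longer than the training time and that the average energy of the training symbols is $P_T = \frac{1}{N M_T}\operatorname{tr}(\mathbf{X}_T\mathbf{X}_T^\dagger)$, where $M_T$ is the number of transmit antennas.

We focus on the estimation of $\mathbf{H}$ from the observed signals $\mathbf{Y}_T$ and $\mathbf{X}_T$. In the ML sense this estimate is obtained by minimizing $\|\mathbf{Y}_T - \mathbf{H}\mathbf{X}_T\|_F^2$ with respect to $\mathbf{H}$. This yields $\hat{\mathbf{H}} = \mathbf{Y}_T\mathbf{X}_T^\dagger(\mathbf{X}_T\mathbf{X}_T^\dagger)^{-1} = \mathbf{H} + \boldsymbol{\mathcal{E}}$, where $\boldsymbol{\mathcal{E}} = \mathbf{Z}_T\mathbf{X}_T^\dagger(\mathbf{X}_T\mathbf{X}_T^\dagger)^{-1}$ denotes the estimation error matrix. For simplicity, we assume orthogonal training sequences, for which we must have $N \geq M_T$, and consequently the error matrix becomes decorrelated. Thus, the matrix $\mathbf{X}_T$ must be full rank, so that $\mathbf{X}_T\mathbf{X}_T^\dagger$ is nonsingular, with orthogonal rows such that $\mathbf{X}_T\mathbf{X}_T^\dagger = N P_T\,\mathbb{I}_{M_T}$. Next, denoting by $\boldsymbol{\mathcal{E}}_j$ the $j$th column of the error matrix $\boldsymbol{\mathcal{E}}$, we can write $\boldsymbol{\Sigma}_{\mathcal{E}} = \mathbb{E}\{\boldsymbol{\mathcal{E}}_j\boldsymbol{\mathcal{E}}_j^\dagger\} = \sigma_{\mathcal{E}}^2\,\mathbb{I}_{M_R}$ with $\sigma_{\mathcal{E}}^2 = \sigma_Z^2/(N P_T)$, where $\sigma_Z^2$ is the noise variance per receive antenna and $M_R$ is the number of receive antennas. This yields a white error matrix, i.e. the entries of $\boldsymbol{\mathcal{E}}$ are i.i.d. ZMCSCG random variables with variance $\sigma_{\mathcal{E}}^2$. Thus, for each frame, the conditional pdf of $\hat{\mathbf{H}}$ given $\mathbf{H}$ is the complex normal matrix pdf

$$\psi_{\hat{\mathbf{H}}|\mathbf{H}}(\hat{\mathbf{H}}\,|\,\mathbf{H}) = \mathcal{CN}\big(\mathbf{H},\ \mathbb{I}_{M_T}\otimes\boldsymbol{\Sigma}_{\mathcal{E}}\big).$$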
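As a rough numerical illustration of the training-based estimator described in this subsection, the following NumPy sketch simulates ML (least-squares) estimation with orthogonal pilots and measures the empirical variance of the estimation error. Under orthogonal training with white noise, that variance should be close to the noise variance divided by the total pilot energy per transmit antenna. The scaled-DFT pilot matrix, the function names and the parameter names (`m_t`, `m_r`, `n_pilots`, `p_t`, `sigma_z2`) are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def orthogonal_pilots(m_t, n_pilots, p_t):
    """Hypothetical pilot design: first m_t rows of a DFT matrix, scaled so
    that X_T X_T^H = n_pilots * p_t * I (requires n_pilots >= m_t)."""
    k = np.arange(m_t)[:, None]
    l = np.arange(n_pilots)[None, :]
    return np.sqrt(p_t) * np.exp(-2j * np.pi * k * l / n_pilots)

def complex_gaussian(rng, shape, variance):
    """i.i.d. zero-mean circularly symmetric complex Gaussian samples."""
    return np.sqrt(variance / 2) * (
        rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    )

def ml_channel_estimate(y_t, x_t):
    """Least-squares (ML under white Gaussian noise) estimate of H."""
    gram = x_t @ x_t.conj().T
    return y_t @ x_t.conj().T @ np.linalg.inv(gram)

def empirical_error_variance(m_t=2, m_r=2, n_pilots=4, p_t=1.0,
                             sigma_z2=0.5, trials=5000, seed=0):
    """Average |H_hat - H|^2 per entry over many independent frames."""
    rng = np.random.default_rng(seed)
    x_t = orthogonal_pilots(m_t, n_pilots, p_t)
    acc = 0.0
    for _ in range(trials):
        h = complex_gaussian(rng, (m_r, m_t), 1.0)
        z_t = complex_gaussian(rng, (m_r, n_pilots), sigma_z2)
        h_hat = ml_channel_estimate(h @ x_t + z_t, x_t)
        acc += np.mean(np.abs(h_hat - h) ** 2)
    return acc / trials

if __name__ == "__main__":
    print("empirical :", empirical_error_variance())
    print("predicted :", 0.5 / (4 * 1.0))  # sigma_z2 / (n_pilots * p_t)
```

Increasing `n_pilots` or `p_t` in this sketch shrinks the error variance, which is the regime in which the mismatched and improved decoders are expected to behave alike.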

## 4 Metric Computation and Iterative Decoding of BICM

In this section, we specialize the expression to derive the decoding metric for MIMO channels and then we consider MIMO-BICM decoding with the derived metric.

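### 4.1 Mismatched ML Decoder

The classical mismatched ML decoder uses the likelihood function of the channel pdf evaluated at the channel estimate $\hat{\mathbf{H}}$ in place of the true channel. For Gaussian noise, this leads to the following Euclidean distance metric:

$$\mathcal{D}_{\mathrm{ML}}(\mathbf{x},\mathbf{y}\,|\,\hat{\mathbf{H}}) = \|\mathbf{y}-\hat{\mathbf{H}}\mathbf{x}\|^2 .$$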

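### 4.2 Metric Computation

We now specialize the general decoding metric to the case of a MIMO channel $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{z}$. To this end, we need to derive the *a posteriori* pdf of $\mathbf{H}$ given $\hat{\mathbf{H}}$, which can be obtained from the prior pdf of $\mathbf{H}$ and the pdf of $\hat{\mathbf{H}}$ given $\mathbf{H}$ (see Appendix @.1). The corresponding pdf is:

$$\psi_{\mathbf{H}|\hat{\mathbf{H}}}(\mathbf{H}\,|\,\hat{\mathbf{H}}) = \mathcal{CN}\big(\boldsymbol{\Sigma}_\Delta\hat{\mathbf{H}},\ \mathbb{I}_{M_T}\otimes\boldsymbol{\Sigma}_\Delta\boldsymbol{\Sigma}_{\mathcal{E}}\big),$$

where $\boldsymbol{\Sigma}_\Delta = \boldsymbol{\Sigma}_H(\boldsymbol{\Sigma}_{\mathcal{E}}+\boldsymbol{\Sigma}_H)^{-1}$ and, for uncorrelated fading with $\boldsymbol{\Sigma}_H = \sigma_h^2\,\mathbb{I}_{M_R}$, $\boldsymbol{\Sigma}_\Delta = \delta\,\mathbb{I}_{M_R}$ with $\delta = \sigma_h^2/(\sigma_h^2+\sigma_{\mathcal{E}}^2)$. The availability of the distribution characterizing the CEE is the key feature of pilot assisted channel estimation. Then, by averaging the channel over all CEE, using this *a posteriori* pdf, and after some algebra we obtain the composite channel (cf. Appendix @.1)

$$\widetilde{W}(\mathbf{y}\,|\,\mathbf{x},\hat{\mathbf{H}}) = \mathcal{CN}\big(\delta\hat{\mathbf{H}}\mathbf{x},\ (\sigma_Z^2+\delta\sigma_{\mathcal{E}}^2\|\mathbf{x}\|^2)\,\mathbb{I}_{M_R}\big).$$

Finally, taking the negative logarithm of the composite channel and dropping constant terms, the optimal decoding metric for the MIMO channel reduces to:

$$\mathcal{D}_{\mathcal{M}}(\mathbf{x},\mathbf{y}\,|\,\hat{\mathbf{H}}) = M_R\log\big(\sigma_Z^2+\delta\sigma_{\mathcal{E}}^2\|\mathbf{x}\|^2\big) + \frac{\|\mathbf{y}-\delta\hat{\mathbf{H}}\mathbf{x}\|^2}{\sigma_Z^2+\delta\sigma_{\mathcal{E}}^2\|\mathbf{x}\|^2}.$$

This metric coincides with that proposed for space-time decoding, from independent results in [2]. We note that under near perfect CSI, obtained when $N\to\infty$, we have $\delta\to 1$ and $\delta\sigma_{\mathcal{E}}^2\to 0$, so that

$$\lim_{N\to\infty}\mathcal{D}_{\mathcal{M}}(\mathbf{x},\mathbf{y}\,|\,\hat{\mathbf{H}}) = M_R\log\sigma_Z^2 + \frac{1}{\sigma_Z^2}\|\mathbf{y}-\hat{\mathbf{H}}\mathbf{x}\|^2 .$$

Consequently, we have the expected result that the metric $\mathcal{D}_{\mathcal{M}}$ tends, up to an additive constant and a positive scaling, to the classical mismatched ML decoding metric $\mathcal{D}_{\mathrm{ML}}$ when the estimation error variance $\sigma_{\mathcal{E}}^2\to 0$.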
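The two decoding rules of this section can be compared side by side in a few lines of NumPy. The sketch below is illustrative only: it assumes i.i.d. channel entries of variance `sigma_h2`, white estimation errors of variance `sigma_e2` and white noise of variance `sigma_z2`, in which case standard Gaussian conditioning gives a posterior mean `delta * h_hat` with `delta = sigma_h2 / (sigma_h2 + sigma_e2)` and an effective noise variance `sigma_z2 + delta * sigma_e2 * ||x||^2`. All function and parameter names are hypothetical.

```python
import numpy as np

def mismatched_ml_metric(y, h_hat, x):
    """Euclidean distance that treats the estimate as the true channel."""
    return float(np.linalg.norm(y - h_hat @ x) ** 2)

def improved_metric(y, h_hat, x, sigma_z2, sigma_e2, sigma_h2=1.0):
    """Negative log-likelihood (up to a constant) of the composite Gaussian
    channel obtained by averaging over the estimation error."""
    delta = sigma_h2 / (sigma_h2 + sigma_e2)
    eff_var = sigma_z2 + delta * sigma_e2 * np.linalg.norm(x) ** 2
    m_r = y.shape[0]
    dist = np.linalg.norm(y - delta * (h_hat @ x)) ** 2
    return float(m_r * np.log(eff_var) + dist / eff_var)

def decode(y, h_hat, candidates, metric, **kwargs):
    """Index of the candidate compound symbol with the smallest metric."""
    scores = [metric(y, h_hat, x, **kwargs) for x in candidates]
    return int(np.argmin(scores))

if __name__ == "__main__":
    qpsk = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
    candidates = [np.array([a, b]) / np.sqrt(2) for a in qpsk for b in qpsk]
    h_hat = np.array([[1.0, 0.5j], [0.2, 1.0]], dtype=complex)
    y = h_hat @ candidates[5]
    print(decode(y, h_hat, candidates, mismatched_ml_metric))
    print(decode(y, h_hat, candidates, improved_metric,
                 sigma_z2=0.1, sigma_e2=0.3))
```

When all candidates have the same energy and `sigma_e2` is small, both rules rank the candidates almost identically; the difference shows up for multi-level constellations, where the energy-dependent effective variance reweights the candidates, and for poor estimates, where `delta` shrinks the estimate toward zero.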

### 4.3 Receiver Structure

The problem of decoding MIMO-BICM has been addressed in [17] under the assumption of perfect CSIR. Here we consider the same problem with CEE, for which we use the metric in the iterative decoding process of BICM. Block diagrams of the transmitter and the receiver are shown in Figure 1 and Figure 2, respectively. Basically, the receiver consists of the combination of two sub-blocks operating successively. The first sub-block, referred to as the soft symbol-to-bit MIMO demapper, produces bit metrics (probabilities) from the input symbols, and the second one is a soft-input soft-output (SISO) trellis decoder. Each sub-block can take advantage of the *a posteriori* probabilities (APP) provided by the other sub-block as *a priori* information. Here, SISO decoding is performed using the well-known forward-backward algorithm [18]. We now recall the formulation of the soft MIMO detector.

Suppose first the case where the channel matrix is perfectly known at the receiver. The MIMO demapper provides at its output the extrinsic probabilities on coded and interleaved bits . Let , , be the interleaved bits corresponding to the -th compound symbol where the cardinality of is equal to . The extrinsic probability of the bit (bit metrics) at the MIMO demapper output is calculated as

where and is the normalization factor satisfying and is the *extrinsic* information coming from the SISO decoder. The summation in is taken over the product of the channel likelihood given a compound symbol , and the *a priori* probability on this symbol (the term ) fed back from the SISO decoder at the previous iteration. In this latter term, the *a priori* probability of the bit itself is excluded, so that only extrinsic information is exchanged between the channel decoder and the MIMO demapper. Also, note that this term assumes independent coded bits , which is a valid approximation for random interleaving of large size. At the first iteration we set (there is no *a priori* information).

Note that by replacing the unknown channel in by its channel estimate , we obtain the mismatched ML decoder . The proposed decoder follows by introducing the metric given by in , yielding the same equation with the appropriate constant .
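The demapping rule described above can be sketched as follows. For each bit position, the likelihoods of all compound symbols are weighted by the *a priori* probabilities of the other bits and summed separately over the symbols labeled 1 and 0 at that position. This is a minimal illustration rather than the receiver used in the simulations: the function name, the argument layout and the convention that the likelihood is proportional to `exp(-metric)` are assumptions of this example, and the `metric` callable is expected to include any noise-variance scaling.

```python
import numpy as np

def extrinsic_bit_probabilities(y, h_hat, symbols, labels, priors, metric):
    """Extrinsic P(bit_j = 1) for each bit of one compound symbol.

    symbols: list of candidate compound symbol vectors
    labels : (n_symbols, n_bits) array of 0/1 bit labels
    priors : (n_bits,) a priori P(bit = 1) fed back by the SISO decoder
             (0.5 everywhere at the first iteration)
    metric : callable (y, h_hat, x) -> float, likelihood ~ exp(-metric)
    """
    labels = np.asarray(labels)
    priors = np.asarray(priors, dtype=float)
    dist = np.array([metric(y, h_hat, x) for x in symbols])
    like = np.exp(-(dist - dist.min()))  # shift for numerical stability
    # A priori probability of each labeled bit value, per symbol and bit.
    bit_prior = np.where(labels == 1, priors, 1.0 - priors)
    out = np.empty(labels.shape[1])
    for j in range(labels.shape[1]):
        # Exclude the prior of bit j itself: only extrinsic information.
        others = np.prod(np.delete(bit_prior, j, axis=1), axis=1)
        weights = like * others
        p1 = weights[labels[:, j] == 1].sum()
        p0 = weights[labels[:, j] == 0].sum()
        out[j] = p1 / (p0 + p1)
    return out

if __name__ == "__main__":
    s = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)
    symbols = [np.array([v]) for v in s]          # single-antenna QPSK
    labels = [[0, 0], [1, 0], [1, 1], [0, 1]]     # Gray labeling
    h_hat = np.array([[1.0 + 0j]])
    metric = lambda y, h, x: np.linalg.norm(y - h @ x) ** 2 / 0.1
    print(extrinsic_bit_probabilities(h_hat @ symbols[2], h_hat,
                                      symbols, labels, [0.5, 0.5], metric))
```

Passing either a mismatched Euclidean metric or an estimation-aware metric as `metric` leaves the rest of the demapper unchanged, which reflects the remark that the proposed decoder only changes the likelihood term.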

## 5 Achievable Information Rates over MIMO Channels

In this section we derive the achievable information rates in the sense of outage rates, associated to a receiver using the decoding rule based on metrics and .

### 5.1 Achievable Information Rates Associated to the Improved Decoder

Assume a given pair of matrices , characterizing a specific instance of the channel realization and its estimate. We first derive the instantaneous achievable rates for MIMO channels , associated to a receiver using the derived metric . This is done by using the following Theorem from [19], which provides the general expression for the maximal achievable rate with a given decoding metric.

In order to solve the constrained minimization problem in Theorem for our metric (expression ), we must find the channel and the covariance matrix defining the test channel that minimizes the relative entropy . On the other hand, throughout this paper we assume that the transmitter does not have access to the channel estimates, and consequently no power control is possible. Thus, we choose the sub-optimal input distribution with . We first compute the constraint set , given by and , and then we factorize matrix to solve the minimization problem. Before this, to compute the constraint , we need the following result (Appendix @.2).

From Lemma ? and some algebra, it is not difficult to show that the constraints require that

From expression and computing the relative entropy, the minimization in reads

where must be chosen such that . In order to obtain a simpler and more tractable expression of , we consider the following decomposition of the matrix with . Let be a diagonal matrix such that , whose diagonal values are given by the vector . We define , the vector formed by its diagonal, and let . Using the above definitions and some algebra, the optimization becomes equivalent to

with . The constraint set in the minimization , which corresponds to the set of vectors , is a closed convex polyhedral set. Thus, the infimum in is attained at the extremal points of the set given by the equality (cf. [20]). Furthermore, for every vector such that , we observe that expression is a monotonically increasing function of the squared norm of . As a consequence, it is sufficient to find the optimal vector by minimizing the squared norm over the constraint set. This becomes a classical minimization problem that can be easily solved by using Lagrange multipliers. The corresponding achievable rates are then presented in the following corollary.

and .

### 5.2 Achievable Information Rates Associated to the Mismatched ML decoder

Next, we aim at comparing the achievable rates obtained in to those provided by the classical mismatched ML decoder . Following the same steps as above, we can compute the achievable rates associated with the mismatched ML decoder. In this case, the minimization problem reads

where must be chosen such that . The resulting achievable rates are given by

where and

### 5.3 Estimation-Induced Outage Rates

Throughout this section, we have so far considered instantaneous achievable rates over MIMO channels. We now provide the associated outage rates, according to the notion of EIO capacity defined in Section 2.2. In order to compute these outage rates, it is necessary to calculate the outage probability as a function of the outage rate. Given an outage rate and a channel estimate , the outage probability is defined as

then the maximal outage rate for an outage probability is given by

Since this outage rate still depends on the channel estimate, we consider the average over all channel estimates as . These achievable rates are upper bounded by the mean outage rates given by the EIO capacity, which provides the maximal outage rate (i.e. maximizing over all possible receivers using the channel estimates), achieved by a theoretical decoder. In our case, this capacity is given by , where can be computed from by setting and .
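The outage computation described in this subsection lends itself to a simple Monte Carlo sketch: for a fixed channel estimate, draw channel realizations from the *a posteriori* distribution, evaluate an instantaneous rate for each, and take the largest rate whose empirical outage probability does not exceed the target. The code below is a hedged illustration only. It assumes i.i.d. channel entries of variance `sigma_h2` and white estimation errors of variance `sigma_e2`, and it uses the standard Gaussian-input rate `log2 det(I + (snr / M_T) H H^H)` as a stand-in for the instantaneous rate expressions of this section. All names are hypothetical.

```python
import math
import numpy as np

def outage_probability(rates, target_rate):
    """Empirical probability that the instantaneous rate is below target."""
    rates = np.asarray(rates, dtype=float)
    return float(np.mean(rates < target_rate))

def outage_rate(rates, gamma):
    """Largest sample value R with empirical Pr(rate < R) <= gamma."""
    ordered = np.sort(np.asarray(rates, dtype=float))
    k = min(int(math.floor(gamma * ordered.size)), ordered.size - 1)
    return float(ordered[k])

def sample_rates_given_estimate(h_hat, snr, sigma_e2, n_samples, rng,
                                sigma_h2=1.0):
    """Instantaneous rates for channels drawn from the posterior of H
    given h_hat (assumed Gaussian with mean delta * h_hat)."""
    m_r, m_t = h_hat.shape
    delta = sigma_h2 / (sigma_h2 + sigma_e2)
    post_var = delta * sigma_e2
    rates = np.empty(n_samples)
    for i in range(n_samples):
        noise = (rng.standard_normal((m_r, m_t))
                 + 1j * rng.standard_normal((m_r, m_t)))
        h = delta * h_hat + np.sqrt(post_var / 2) * noise
        gram = np.eye(m_r) + (snr / m_t) * (h @ h.conj().T)
        rates[i] = np.linalg.slogdet(gram)[1] / np.log(2)
    return rates

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h_hat = np.eye(2, dtype=complex)
    rates = sample_rates_given_estimate(h_hat, snr=10.0, sigma_e2=0.2,
                                        n_samples=2000, rng=rng)
    print("outage rate at gamma = 0.01:", outage_rate(rates, 0.01))
```

Averaging `outage_rate` over many independently drawn estimates `h_hat` would give the mean outage rate discussed above; swapping in a different instantaneous rate function changes only the line that fills `rates[i]`.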

## 6 Simulation Results

In this section we provide numerical results to analyze the performance of a receiver using the decoder based on the metric . We consider uncorrelated Rayleigh fading MIMO channels, assuming that the channel changes for each compound symbol inside a frame of symbols. This assumption is made so that the BICM interleaver operates efficiently. The performance is measured in terms of BER and achievable outage rates. The binary information data is encoded by a rate non-recursive non-systematic convolutional (NRNSC) channel code with constraint length defined in octal form by . The interleaver is random and operates over the entire frame with size bits. The symbols belong to a -QAM constellation with either Gray or set-partition labeling. In addition, it is assumed that the average pilot symbol energy is equal to the average data symbol energy.

### 6.1 Bit Error Rate Analysis of BICM Decoding Under Imperfect Channel Estimation

Here, we compare the BER performance of the proposed decoder and the mismatched decoder for BICM decoding (Section 4). Figure 3 and Figure 4 show, for a MIMO channel (), the increase in the required caused by decoding with the mismatched ML decoder in the presence of CEE. The BER obtained with perfect CSIR is also presented for comparison. In this case, we insert or pilots per frame for channel training. At and , we observe about dB of SNR gain with set-partition labeling by using the proposed decoder. The performance improvement is larger with set-partition labeling (which is well suited to iterative decoding) than with Gray labeling (which is preferred if no iteration is allowed).

We also note that the performance loss of the mismatched receiver with respect to our receiver becomes insignificant for . This can be explained from , since by increasing the number of pilot symbols both decoders coincide. Results show that the decoder under investigation outperforms the mismatched decoder, especially when few pilot symbols are dedicated to training.

### 6.2 Achievable Outage Rates Using the Derived Metric

Numerical results concerning achievable information rates decoding with the investigated metric over fading MIMO channels are based on Monte Carlo simulations.

Figure 5 compares average outage rates (in bits per channel use) over all channel estimates, of both mismatched ML decoding (given by expression ) and the proposed metric (given by ) versus the SNR. The MIMO channel is estimated by sending pilot symbols per frame, and the outage probability has been set to . For comparison, we also display the upper bound of these rates given by the EIO capacity (obtained by evaluating the expression ), and the capacity with perfect channel knowledge. It can be observed that the achievable rate using mismatched ML decoding is about dB of SNR (at a mean outage rate of bits) away from the EIO capacity. In contrast, the proposed decoder achieves higher rates at all SNR values and reduces the aforementioned SNR gap by about dB.

Similar plots are shown in Figure 6 in the case of a MIMO channel estimated by sending training sequences of length . Again, it can be observed that the modified decoder achieves higher rates than the mismatched decoder. However, we note that the performance degradation using the mismatched decoder has decreased to less than dB (at a mean outage rate of bits). This observation is a consequence of using orthogonal training sequences, which require (CEE are reduced by increasing the number of antennas [21]). In contrast, for (using non-orthogonal sequences) the performance degradation would be larger than here.

Note that the achievable rates of the proposed decoder are still about dB away from the ultimate performance given by the EIO capacity. However, the new metric provides significant gains in terms of information rates compared to the classical mismatched approach.

## 7 Summary

This paper studied the problem of reception in practical communication systems, when the receiver has access only to noisy estimates of the channel and these estimates are not available at the transmitter. Specifically, we focused on determining the optimal decoder that achieves the EIO capacity of arbitrary memoryless channels under imperfect channel estimation. By using the tools of information theory, we derived a practical decoding metric that minimizes the average of the transmission error probability over all CEE. This decoder is not optimal in the sense that it cannot achieve the EIO capacity, but it offers improved performance without introducing any additional decoding complexity.

By using the general decoder, we analyzed the case of uncorrelated fading MIMO channels with ML channel estimation at the decoder and without channel information at the transmitter. Then, we used this metric for iterative BICM decoding of MIMO systems. Moreover, we obtained the maximal achievable rates, using Gaussian codebooks, associated to the proposed decoder and compared these rates to those of the classical mismatched ML decoder. Simulation results indicate that mismatched ML decoding is sub-optimal under short training sequences, in terms of both BER and achievable outage rates, and confirmed the adequacy of the proposed decoder.

Although we showed that the proposed decoder outperforms classical mismatched approaches, the derivation of a practical decoder that achieves the EIO capacity (the maximum over all possible theoretical decoders) under imperfect channel estimation is still an open problem in its full generality. Moreover, other types of decoding metrics that also incorporate the outage probability value have yet to be fully explored.

### @.1 Metric evaluation

From and , by choosing and in Theorem ?, we obtain the *a posteriori* pdf , where . In order to evaluate the general expression of the decoding metric for fading MIMO channels, we compute the expectation of over the pdf . To this end, we need the following result (see [22]).

From this theorem, we can compute the composite channel . Let us define such that the conditional pdf of given is with and . Thus, by defining from and after some algebra, we obtain .
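The averaging step of this appendix can be checked numerically in the scalar case. The sketch below assumes a single-antenna channel whose posterior given the estimate is complex Gaussian with mean `delta * h_hat` and variance `delta * sigma_e2`, where `delta = sigma_h2 / (sigma_h2 + sigma_e2)`. It compares a Monte Carlo average of the Gaussian channel likelihood over that posterior with the closed-form Gaussian density of mean `delta * h_hat * x` and variance `sigma_z2 + delta * sigma_e2 * |x|^2`. The function names and numerical values are illustrative assumptions.

```python
import numpy as np

def complex_gaussian_pdf(y, mean, variance):
    """Density of a circularly symmetric complex Gaussian scalar."""
    return np.exp(-np.abs(y - mean) ** 2 / variance) / (np.pi * variance)

def composite_closed_form(y, x, h_hat, sigma_z2, sigma_e2, sigma_h2=1.0):
    """Closed-form composite channel density for a scalar channel."""
    delta = sigma_h2 / (sigma_h2 + sigma_e2)
    eff_var = sigma_z2 + delta * sigma_e2 * abs(x) ** 2
    return float(complex_gaussian_pdf(y, delta * h_hat * x, eff_var))

def composite_monte_carlo(y, x, h_hat, sigma_z2, sigma_e2, n_samples, rng,
                          sigma_h2=1.0):
    """Average the channel likelihood over posterior samples of h."""
    delta = sigma_h2 / (sigma_h2 + sigma_e2)
    post_var = delta * sigma_e2
    noise = rng.standard_normal(n_samples) + 1j * rng.standard_normal(n_samples)
    h = delta * h_hat + np.sqrt(post_var / 2) * noise
    return float(np.mean(complex_gaussian_pdf(y, h * x, sigma_z2)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    args = dict(y=0.5 + 0.2j, x=1.0 + 0j, h_hat=0.8 + 0.3j,
                sigma_z2=0.5, sigma_e2=0.25)
    print("closed form :", composite_closed_form(**args))
    print("monte carlo :", composite_monte_carlo(n_samples=200000, rng=rng, **args))
```

The two printed values should agree to within Monte Carlo error, which illustrates why the composite channel is again Gaussian, with a shrunken mean and an input-dependent variance.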

### @.2 Proof of Lemma

Consider the quadratic expressions and , where is a vector of elements, such that *almost surely*. Consider the joint generating function of and , namely, . It is easy to see that

Then from the Gamma integral and setting in we have

where it is not difficult to show that

Finally, by solving the integral in , we obtain the expression .

### Footnotes

- The material in this paper was published in part at the International Symposium on Information Theory (ISIT07).

### References

- [1] P. Piantanida, G. Matz, and P. Duhamel, “Outage behavior of discrete memoryless channels under channel estimation errors,” *Submitted to Trans. on Information Theory*, January 2007.
- [2] G. Taricco and E. Biglieri, “Space-time decoding with imperfect channel estimation,” *IEEE Trans. on Wireless Communications*, vol. 4, pp. 2426 – 2467, July 2005.
- [3] K. Ahmed, C. Tepedelenlioglu, and A. Spanias, “Effect of channel estimation on pair-wise error probability in OFDM,” in *Proc. of Int. Conf. of Acoustics, Speech and Signal Processing (ICASSP)*, vol. 4, pp. 745–748, May 2004.
- [4] A. Leke and J. M. Cioffi, “Impact of imperfect channel knowledge on the performance of multicarrier systems,” in *IEEE Global Telecommun. Conf*, vol. 4, pp. 951–955, Nov. 1998.
- [5] P. Garg, R. K. Mallik, and H. M. Gupta, “Performance analysis of space-time coding with imperfect channel estimation,” *IEEE Trans. Wireless Commun.*, vol. 4, pp. 257–265, Jan. 2005.
- [6] G. Caire, G. Taricco, and E. Biglieri, “Bit-interleaved coded modulation,” *IEEE Trans. Information Theory*, vol. IT-44, pp. 927–945, May 1998.
- [7] E. Zehavi, “8-PSK trellis codes for a Rayleigh channel,” *IEEE Trans. Communications*, vol. 40, pp. 873–887, May 1992.
- [8] X. Li, A. Chindapol, and J. A. Ritcey, “Bit-interleaved coded modulation with iterative decoding and 8-PSK modulation,” *IEEE Trans. Communications*, vol. 50, pp. 1250–1257, Aug. 2002.
- [9] J. K. Cavers, “An analysis of pilot symbol assisted modulation for Rayleigh fading channels,” *IEEE Trans. Veh. Technol.*, vol. 40, pp. 686–693, Nov. 1991.
- [10] Y. Huang and J. A. Ritcey, “16-QAM BICM-ID in fading channels with imperfect channel state information,” *IEEE Trans. Communications*, vol. 2, pp. 1000–1007, Sept. 2003.
- [11] V. Tarokh, A. Naguib, N. Seshadri, and A. Calderbank, “Space-time codes for high data rate wireless communication: performance criteria in the presence of channel estimation errors, mobility, and multiple paths,” *IEEE Transactions on Communications*, pp. 199–207, Feb 1999.
- [12] D. Divsalar, *Performance of mismatched receivers on bandlimited channels*. PhD thesis, Univ. of California, Los Angeles, 1979.
- [13] H. Weingarten, Y. Steinberg, and S. Shamai, “Gaussian codes and weighted nearest neighbor decoding in fading multiple-antenna channels,” *IEEE Trans. Information Theory*, vol. 50, pp. 1665–1686, Aug 2004.
- [14] A. Lapidoth and S. Shamai, “Fading channels: how perfect need ‘perfect side information’ be?,” *IEEE Transactions on Information Theory*, vol. 48, pp. 1118–1134, May 2002.
- [15] I. Csiszár and J. Körner, *Information theory: coding theorems for discrete memoryless systems*. Academic, New York, 1981.
- [16] I. Csiszár and P. Narayan, “Channel capacity for a given decoding metric,” *IEEE Trans. Information Theory*, vol. IT-41, no. 1, pp. 35–43, 1995.
- [17] J. J. Boutros, F. Boixadera, and C. Lamy, “Bit-interleaved coded modulations for multiple-input multiple-output channels,” in *Int. Symp. on Spread Spectrum Tech. and Applications*, pp. 123–126, Sept. 2000.
- [18] L. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate,” *IEEE Trans. Information Theory*, pp. 284–287, March 1974.
- [19] N. Merhav, G. Kaplan, A. Lapidoth, and S. Shamai (Shitz), “On information rates for mismatched decoders,” *IEEE Trans. Information Theory*, vol. IT-40, pp. 1953–1967, Nov. 1994.
- [20] J. Hiriart-Urruty and C. Lemaréchal, *Convex Analysis and Minimization Algorithms I*. Springer-Verlag, 1993.
- [21] P. Garg, R. K. Mallik, and H. M. Gupta, “Performance analysis of space-time coding with imperfect channel estimation,” *IEEE Trans. Wireless Communications*, vol. 4, pp. 257–265, Jan. 2005.
- [22] M. Schwartz, W. Bennett, and S. Stein, *Communication Systems and Techniques*. New York McGraw-Hill, 1996.