# Turbo Packet Combining for Broadband Space–Time BICM Hybrid–ARQ Systems with Co–Channel Interference

Tarik Ait-Idir, Houda Chafnaji, and Samir Saoudi

The Associate Editor coordinating the review of this paper and approving it for publication is Dr. M. C. Valenti. Manuscript received March 26, 2009; revised December 24, 2009; accepted February 21, 2010. This paper was presented in part at the 19th Annual IEEE Symposium on Personal Indoor and Mobile Radio Communications (PIMRC 2008), Cannes, France, September 2008, and in part at the IEEE Global Communications Conference (Globecom’09), Honolulu, Hawaii, Nov-Dec 2009. T. Ait-Idir and H. Chafnaji are with the Communication Systems Department, INPT, Madinat Al-Irfane, Rabat, Morocco. They are also with Institut Telecom / Telecom Bretagne / LabSticc, Brest, France (email: aitidir@ieee.org). S. Saoudi is with Institut Telecom / Telecom Bretagne / LabSticc, Brest, France. He is also with Université Européenne de Bretagne.

###### Abstract

In this paper, efficient turbo packet combining for single carrier (SC) broadband multiple-input–multiple-output (MIMO) hybrid–automatic repeat request (ARQ) transmission with unknown co-channel interference (CCI) is studied. We propose a new frequency domain soft minimum mean square error (MMSE)-based signal level combining technique where received signals and channel frequency responses (CFRs) corresponding to all retransmissions are used to decode the data packet. We provide a recursive implementation algorithm for the introduced scheme, and show that both its computational complexity and memory requirements are quite insensitive to the ARQ delay, i.e., the maximum number of ARQ rounds. Furthermore, we analyze the asymptotic performance, and show that under a sum-rank condition on the CCI MIMO ARQ channel, the proposed packet combining scheme is not interference-limited. Simulation results are provided to demonstrate the gains offered by the proposed technique.

###### Keywords

Automatic repeat request (ARQ) mechanisms, multiple-input–multiple-output (MIMO), single carrier (SC), unknown co-channel interference (CCI), intersymbol interference (ISI), frequency domain methods.

## I Introduction

Space–time–bit-interleaved coded modulation (ST–BICM) with iterative decoding is an attractive signaling scheme that offers high spectral efficiencies over multiple-input–multiple-output (MIMO)-intersymbol interference (ISI) channels [1, 2, 3, 4, 5]. To combat ISI in single carrier (SC) broadband ST–BICM transmission, frequency domain equalization, initially introduced for single antenna systems [6, 7, 8, 9], has been proposed using iterative (turbo) processing [10]. It is a receiver scheme that allows high ISI cancellation capability at an affordable complexity cost. In practical systems, unknown co-channel interference (CCI) caused by other transmitters (distant users and/or neighboring cells) who simultaneously use the same radio resource can dramatically degrade the link performance. This limitation can be overcome by using the so-called hybrid–automatic repeat request (ARQ) protocols, where channel coding is combined with ARQ [11, 12]. In hybrid–ARQ, erroneous data packets are kept in the receiver and used to detect/decode the retransmitted frame [13, 14, 15, 16, 17, 18, 19]. This technique is often referred to as “packet combining”. Practical packet combining schemes have been addressed in [20]. In [21], an elegant information-theoretic framework has been introduced to analyze the throughput and delay of hybrid–ARQ under random user behavior. Interestingly, the authors have shown that hybrid–ARQ systems are not interference limited, i.e., arbitrarily high throughput can be achieved by simply increasing the transmit power of all users even when multi-user detection (MUD) techniques are not used at the receiver. Motivated by the above considerations, we investigate efficient low-complexity turbo frequency domain reception techniques for SC broadband ST–BICM signaling with hybrid–ARQ operating over CCI-limited MIMO channels.

The powerful diversity–multiplexing tradeoff tool, initially introduced by Zheng and Tse for coherent delay-limited, i.e., quasi-static, MIMO channels [22], has been elegantly extended by El Gamal et al. to MIMO ARQ channels with flat fading, and referred to as diversity–multiplexing–delay tradeoff [23]. The authors have proved that the ARQ delay, i.e., maximum number of ARQ protocol rounds, improves the outage probability¹ performance for large classes of MIMO ARQ channels [23]. In particular, they have demonstrated that the diversity order can be increased due to ARQ even when the MIMO ARQ channel is long-term static, i.e., the MIMO channel is random but fixed for all ARQ rounds. The diversity–multiplexing–delay tradeoff has then been characterized in the case of block-fading MIMO ARQ channels, i.e., multiple fading blocks are allowed within the same ARQ round [25]. In [26], the outage probability of MIMO-ISI ARQ channels has been evaluated under the assumptions of short-term static channel dynamic², and Chase-type ARQ [27], i.e., the data packet is entirely retransmitted. It has been shown that, as in the flat fading case, ARQ presents an important source of diversity, but its influence becomes minimal when the ARQ delay is increased. This observation suggests that the design of practical packet combining schemes should target a high diversity order for early ARQ rounds. Supplementary retransmissions are then used to correct rare erroneous data packets, when they occur.

¹ In non-ergodic, i.e., block fading quasi-static channels, the outage probability is a meaningful measure that provides a lower bound on the block error probability. It is defined as the probability that the mutual information, as a function of the channel realization and the average signal-to-noise ratio (SNR), is below the transmission rate [24].

² In the case of short-term static dynamic, the ARQ channel realizations are independent from round to round. This dynamic applies to slow ARQ protocols where the delay between two rounds is larger than the channel coherence time.

More recently, packet combining for MIMO ARQ systems has been investigated (e.g. [28, 29, 30, 31, 32, 33, 34, 35, 36]). Turbo combining techniques, where decoding is iteratively performed through the exchange of soft information between the soft-input–soft-output (SISO) packet combiner and the SISO decoder, have been proposed for the MIMO-ISI ARQ channel using unconditional minimum mean square error (MMSE)-aided combining [37, 26]. These approaches have then been extended to broadband MIMO code division multiple access (CDMA) systems with ARQ [38]. Time domain turbo packet combining for CCI-limited MIMO-ISI ARQ channels has been introduced in [39].

In this paper, we investigate efficient turbo receiver techniques for SC ST–BICM transmission with Chase-type ARQ over a broadband MIMO channel with unknown CCI. We introduce a frequency domain MMSE-based turbo packet combining scheme, where all ARQ rounds are used to decode the data packet. By using an identical cyclic prefix (CP) word for multiple retransmissions of a symbol block, we perform transmission combining at the signal level. The frequency domain soft MMSE packet combiner performs soft ISI cancellation and retransmission combining in the presence of unknown CCI jointly over all received signal blocks. We also provide an efficient recursive implementation algorithm for the proposed scheme, and show that both the computational load and memory requirements are quite insensitive to the ARQ delay. The complexity order is only cubic in terms of the number of transmit antennas. Received signals and channel frequency responses (CFRs) corresponding to all ARQ rounds are used without having to be stored in the receiver. We analyze the asymptotic performance of the proposed combining scheme. Interestingly, we show that under a rank condition on the MIMO ARQ channel corresponding to unknown CCI, the proposed combining scheme is not interference-limited, i.e., unknown CCI can be completely removed. Finally, we provide numerical simulation results for some scenarios to validate our findings.

The remainder of the paper is organized as follows. In Section II, we describe the ARQ system under consideration, along with the communication model in the presence of unknown CCI. In Section III, we present the proposed frequency domain turbo packet combining scheme, and analyze both its complexity and memory requirements. In Section IV, we carry out the asymptotic performance analysis, and provide representative numerical results that demonstrate the gains achieved by the proposed scheme. Finally, we draw conclusions in Section V.

Notation:

• Superscripts $^{\star}$, $^{\top}$, and $^{H}$ denote conjugate, transpose, and Hermitian transpose, respectively. $E[\cdot]$ is the mathematical expectation of the argument $(\cdot)$.

• Let be a square matrix, denotes the row vector corresponding to the diagonal of , and denotes the trace of . When , denotes the matrix whose diagonal blocks are . is the diagonal matrix whose diagonal entries are the elements of the complex vector . denotes the th diagonal entry of matrix .

• is the identity matrix, and denotes an all zero matrix. For , is a zero matrix where the th block is equal to .

• Operator denotes the Kronecker product, and is the Kronecker symbol, i.e., for and for .

• For each sequence of matrices (respectively, scalars ), denotes its time average (respectively, ).

• is a unitary matrix whose th element is for , where . is defined as .

• For each vector , denotes the discrete Fourier Transform (DFT) of , i.e., .

• The acronym i.i.d. means “independent and identically distributed”.

## II ARQ System Model

### II-A SC–MIMO ARQ Transmission Scheme

We consider an SC multi-antenna-aided transmission scheme where the transmitter and the receiver are equipped with $N_T$ transmit (index $t$) and $N_R$ receive (index $r$) antennas, respectively. The MIMO channel is frequency selective and is composed of $L$ symbol-spaced taps (index $l$). The energy of each tap is denoted $\sigma_l^2$, and the total energy is normalized to one, i.e., $\sum_{l=0}^{L-1}\sigma_l^2 = 1$.

Each information block is initially encoded then interleaved with the aid of a semi-random interleaver . The resulting frame is serial to parallel converted and mapped over the elements of the constellation set to produce symbol matrix , where is the number of channel use (c.u). A CP word, whose length is , is then appended to , thereby yielding matrix . This allows the prevention of inter-block interference (IBI) and the exploitation of the multipath diversity of the MIMO broadband channel. We suppose that no channel state information (CSI) is available at the transmitter and assume infinitely deep interleaving. Therefore, transmitted symbols are independent and have equal transmit power, i.e.,

$$E\left[s_{t,i}\,s^{\star}_{t',i'}\right]=\delta_{t-t',\,i-i'}. \tag{1}$$

At the upper layer, an ARQ protocol is used to help correct erroneous frames. An acknowledgment message is generated after the decoding of each information block. Therefore, when the decoding is successful, the receiver sends back a positive acknowledgment (ACK) to the transmitter, while the feedback of a negative acknowledgment (NACK) indicates that the decoding outcome is erroneous. Let denote the ARQ delay, and denote the ARQ round index. When the transmitter receives an ACK feedback, it stops the transmission of the current block and moves on to the next information block. Reception of a NACK message incurs supplementary ARQ rounds until the packet is correctly decoded or the ARQ delay is reached. We focus on Chase-type ARQ, i.e., the symbol matrix is completely retransmitted. In addition, we suppose perfect packet error detection, and assume that the one bit ACK/NACK feedback is error-free.
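The ACK/NACK logic described above can be sketched as a short loop. This is only an illustrative sketch: the callback `try_decode` is a hypothetical stand-in for the whole receiver (it is not part of the paper), and `max_rounds` plays the role of the ARQ delay.

```python
from typing import Callable, Tuple


def chase_arq(try_decode: Callable[[int], bool], max_rounds: int) -> Tuple[bool, int]:
    """Return (decoded, rounds_used) for one information block.

    try_decode(k) is a hypothetical callback: it returns True when decoding
    after ARQ round k succeeds (ACK) and False otherwise (NACK).
    """
    for k in range(1, max_rounds + 1):
        # Chase-type ARQ: the same symbol matrix is sent again at every round.
        if try_decode(k):
            return True, k        # ACK: the transmitter moves on to the next block
    return False, max_rounds      # ARQ delay reached without a successful decoding


# Example: decoding fails at round 1 and succeeds at round 2.
result = chase_arq(lambda k: k >= 2, max_rounds=3)
```

Perfect error detection and an error-free feedback channel, as assumed in the text, are implicit here: the boolean returned by the callback is taken at face value.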

### II-B Communication Model in the Presence of Unknown CCI

The broadband MIMO ARQ channel is assumed to be short-term static fading, i.e., the channel independently changes from round to round. Note that this channel dynamic applies to slow ARQ protocols where the delay between two consecutive ARQ rounds is larger than the channel coherence time. It also applies to orthogonal frequency division multiplexing (OFDM) systems where frequency hopping is used to mitigate ISI. Let denote channel matrices at the th ARQ round, and whose entries are i.i.d. zero-mean circularly symmetric Gaussian, i.e., , where denotes the fading channel corresponding to path and connecting the th transmit and the th receive antennas at the th ARQ round. Therefore, the channel energy at each receive antenna is

$$\sum_{l=0}^{L-1}\sum_{t=1}^{N_T}E\left[\left|h^{(k)}_{r,t,l}\right|^{2}\right]=N_T. \tag{2}$$

The channel profile, i.e., power distribution and number of taps , is supposed to be identical for at least consecutive rounds. This is a reasonable assumption because the channel profile dynamic mainly depends on the shadowing effect.

Transmitted data blocks are corrupted by an unknown CCI signal caused by a co–channel transmission that uses transmit antennas (index ) and c.u. The link between the interferer transmitter and the receiver is composed of taps, where the channel matrix of each tap at round is and its energy is .³ We suppose that the receiver has no knowledge either about the interferer CSI or about its channel profile and number of transmit antennas (i.e., parameters , , , and are completely unknown at the receiver). Like the desired user, the interferer employs a CP-aided transmission strategy. Its transmitted symbols at each round satisfy the independence/energy-normalization condition (1), as do the useful symbols. Therefore, the signal-to-interference ratio (SIR) at each receive antenna is given as

³ The ARQ processes corresponding to the desired user and the interferer are not necessarily synchronized. Therefore, the round index appearing in the CCI channel matrices only refers to the index of a realization of the interferer channel at ARQ round . The same remark holds for CCI symbols in (4). Also, note that in order to account for the path-loss between the interferer and the receiver.

$$\mathrm{SIR}=\frac{N_T}{N'_T\sum_{l'=0}^{L'-1}\sigma^{2}_{ul'}}. \tag{3}$$
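As a quick numerical illustration of (3), the following minimal sketch evaluates the SIR from the number of transmit antennas of each link and the interferer tap energies. The function and parameter names are hypothetical, and the example values are chosen only for illustration.

```python
import math
from typing import Sequence


def sir_linear(n_t: int, n_t_cci: int, cci_tap_energies: Sequence[float]) -> float:
    """SIR per receive antenna, following the structure of (3).

    n_t: number of transmit antennas of the desired user.
    n_t_cci: number of transmit antennas of the interferer.
    cci_tap_energies: energies of the interferer channel taps (not normalized to one).
    """
    return n_t / (n_t_cci * sum(cci_tap_energies))


def sir_db(n_t: int, n_t_cci: int, cci_tap_energies: Sequence[float]) -> float:
    """Same quantity expressed in decibels."""
    return 10.0 * math.log10(sir_linear(n_t, n_t_cci, cci_tap_energies))


# Illustrative values: 2 desired antennas, 1 interfering antenna, two taps of energy 0.25.
example_sir = sir_linear(2, 1, [0.25, 0.25])
```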

We assume perfect frame synchronization between the interferer and the desired user. They can differ in terms of the CP word length, which depends on the delay of the multipath channel, but are synchronized in terms of the useful symbol frames. Under this assumption, CP deletion yields the following baseband received signal at round and channel use ,

 (4)

where denotes the receiver thermal noise. The SC–MIMO ARQ communication scheme at round is depicted in Fig. 1. In the following, we assume perfect channel estimation at each ARQ round (i.e., are perfectly known) while CCI channel matrices are completely unknown at the receiver side.

#### II-B1 Single-Round Communication Model

To derive the block communication model corresponding to ARQ round , we consider the following block signal vector,

 (5)

that groups signals corresponding to the entire symbol frame. Vector can be expressed as,

$$\mathbf{y}^{(k)}=\mathbf{H}^{(k)}\mathbf{s}+\mathbf{w}^{(k)}, \tag{6}$$

where

$$\mathbf{s}\triangleq\left[\mathbf{s}_0^{\top},\cdots,\mathbf{s}_{T-1}^{\top}\right]^{\top}\in\mathcal{S}^{TN_T}, \tag{7}$$
 (8)
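$$\mathbf{H}^{(k)}\triangleq\begin{bmatrix}
\mathbf{H}_0^{(k)} & \mathbf{0} & \cdots & \mathbf{0} & \mathbf{H}_{L-1}^{(k)} & \cdots & \mathbf{H}_1^{(k)}\\
\vdots & \mathbf{H}_0^{(k)} & \ddots & & \ddots & \ddots & \vdots\\
\mathbf{H}_{L-1}^{(k)} & & \ddots & \ddots & & \ddots & \mathbf{H}_{L-1}^{(k)}\\
\mathbf{0} & \mathbf{H}_{L-1}^{(k)} & & \ddots & \ddots & & \mathbf{0}\\
\vdots & \ddots & \ddots & & \ddots & \ddots & \vdots\\
\mathbf{0} & \cdots & \mathbf{0} & \mathbf{H}_{L-1}^{(k)} & \cdots & \mathbf{H}_1^{(k)} & \mathbf{H}_0^{(k)}
\end{bmatrix}_{N_RT\times N_TT}, \tag{9}$$

where each $\mathbf{0}$ stands for the all zero block $\mathbf{0}_{N_R\times N_T}$. $\mathbf{H}^{(k)}$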

is a block circulant matrix that can be block diagonalized in a Fourier basis as

$$\mathbf{H}^{(k)}=\mathbf{U}^{H}_{T,N_R}\boldsymbol{\Lambda}^{(k)}\mathbf{U}_{T,N_T}, \tag{10}$$

where

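$$\boldsymbol{\Lambda}^{(k)}\triangleq\mathrm{diag}\left\{\boldsymbol{\Lambda}^{(k)}_0,\cdots,\boldsymbol{\Lambda}^{(k)}_{T-1}\right\}\in\mathbb{C}^{N_RT\times N_TT}. \tag{11}$$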

Exploiting (10) and the block circulant structure of , we get

$$\boldsymbol{\Lambda}^{(k)}_i=\sum_{l=0}^{L-1}\mathbf{H}^{(k)}_l\exp\left\{-j\frac{2\pi il}{T}\right\}. \tag{12}$$
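The per-bin channel frequency response in (12) can be computed directly from the tap matrices. The sketch below is a plain-Python illustration under the assumption that each tap is given as a nested list with one row per receive antenna and one column per transmit antenna; the function name `cfr` and the example taps are hypothetical.

```python
import cmath
from typing import List

Matrix = List[List[complex]]


def cfr(taps: List[Matrix], T: int) -> List[Matrix]:
    """Return [Lambda_0, ..., Lambda_{T-1}] with Lambda_i = sum_l H_l exp(-j 2 pi i l / T)."""
    n_r, n_t = len(taps[0]), len(taps[0][0])
    responses = []
    for i in range(T):
        lam = [[0j] * n_t for _ in range(n_r)]
        for l, H in enumerate(taps):
            phase = cmath.exp(-2j * cmath.pi * i * l / T)
            for r in range(n_r):
                for t in range(n_t):
                    lam[r][t] += H[r][t] * phase
        responses.append(lam)
    return responses


# Illustrative single-antenna channel with two unit taps and T = 4 frequency bins.
lams = cfr([[[1.0]], [[1.0]]], T=4)
```

In this toy case the response equals 2 at bin 0 and vanishes at bin 2, which shows the frequency selectivity created by the second tap.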

Applying the DFT on signal vector yields the single-round frequency domain communication model

$$\mathbf{y}^{(k)}_f=\boldsymbol{\Lambda}^{(k)}\mathbf{s}_f+\mathbf{w}^{(k)}_f, \tag{13}$$

where , , and denote the DFT of , , and , respectively.
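The diagonalization behind (10) and (13) can be checked numerically in the simplest setting. The following sketch assumes a single transmit and a single receive antenna, no noise and no CCI, and a unitary DFT normalized by the square root of the block length (the normalization is an assumption here). All names and numerical values are illustrative.

```python
import cmath
from typing import List


def unitary_dft(x: List[complex]) -> List[complex]:
    """Unitary DFT: X[i] = (1/sqrt(T)) * sum_n x[n] exp(-j 2 pi i n / T)."""
    T = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * i * n / T) for n in range(T)) / T ** 0.5
            for i in range(T)]


def circular_channel(h: List[complex], s: List[complex]) -> List[complex]:
    """Output of the channel after CP deletion: a circular convolution of s with h."""
    T = len(s)
    return [sum(h[l] * s[(n - l) % T] for l in range(len(h))) for n in range(T)]


def cfr_scalar(h: List[complex], T: int) -> List[complex]:
    """Scalar version of (12)."""
    return [sum(h[l] * cmath.exp(-2j * cmath.pi * i * l / T) for l in range(len(h)))
            for i in range(T)]


h = [0.8, 0.6]                       # two taps with unit total energy
s = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]
y_f = unitary_dft(circular_channel(h, s))
s_f = unitary_dft(s)
lam = cfr_scalar(h, len(s))
# Each frequency bin sees a flat channel: y_f[i] = lam[i] * s_f[i].
max_err = max(abs(y - l_i * x) for y, l_i, x in zip(y_f, lam, s_f))
```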

#### II-B2 Multi-Round Communication Model

Let us suppose that received signals and channel matrices corresponding to ARQ rounds are available at the receiver. First, we introduce the signal vector notation

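$$\underline{\mathbf{y}}^{(k)}_i\triangleq\left[\mathbf{y}^{(1)\top}_i,\cdots,\mathbf{y}^{(k)\top}_i\right]^{\top}\in\mathbb{C}^{kN_R}, \tag{14}$$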

where received signals corresponding to multiple ARQ rounds are grouped in such a way to construct virtual receive antennas. Similarly, we define,

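$$\underline{\mathbf{H}}^{(k)}_l\triangleq\left[\mathbf{H}^{(1)\top}_l,\cdots,\mathbf{H}^{(k)\top}_l\right]^{\top}\in\mathbb{C}^{kN_R\times N_T}, \tag{15}$$

$$\underline{\mathbf{w}}^{(k)}_i\triangleq\left[\mathbf{w}^{(1)\top}_i,\cdots,\mathbf{w}^{(k)\top}_i\right]^{\top}\in\mathbb{C}^{kN_R}. \tag{16}$$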

The block signal vector that serves for jointly performing, at ARQ round , packet combining and equalization in the presence of CCI is constructed similarly to (5),

 (17)

and can be expressed as,

$$\underline{\mathbf{y}}^{(k)}=\underline{\mathbf{H}}^{(k)}\mathbf{s}+\underline{\mathbf{w}}^{(k)}, \tag{18}$$

where

 (19)

Matrix has the same structure as (9), where its first block column is equal to

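$$\left[\underline{\mathbf{H}}^{(k)\top}_0,\cdots,\underline{\mathbf{H}}^{(k)\top}_{L-1},\mathbf{0}_{N_T\times(T-L)kN_R}\right]^{\top}. \tag{20}$$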

can be factorized similarly to (10) as,

$$\underline{\mathbf{H}}^{(k)}=\mathbf{U}^{H}_{T,kN_R}\underline{\boldsymbol{\Lambda}}^{(k)}\mathbf{U}_{T,N_T}, \tag{21}$$

where

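$$\underline{\boldsymbol{\Lambda}}^{(k)}\triangleq\mathrm{diag}\left\{\underline{\boldsymbol{\Lambda}}^{(k)}_0,\cdots,\underline{\boldsymbol{\Lambda}}^{(k)}_{T-1}\right\}\in\mathbb{C}^{kN_RT\times N_TT}, \tag{22}$$

$$\underline{\boldsymbol{\Lambda}}^{(k)}_i\triangleq\left[\boldsymbol{\Lambda}^{(1)\top}_i,\cdots,\boldsymbol{\Lambda}^{(k)\top}_i\right]^{\top}\in\mathbb{C}^{kN_R\times N_T}, \tag{23}$$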

and matrices , , are given by (12). The multi-round frequency domain communication model at ARQ round is then expressed as,

$$\underline{\mathbf{y}}^{(k)}_f=\underline{\boldsymbol{\Lambda}}^{(k)}\mathbf{s}_f+\underline{\mathbf{w}}^{(k)}_f, \tag{24}$$

where and denote the DFT of and , respectively.

## III Frequency Domain Turbo Packet Combining in the Presence of Unknown CCI

### III-A General Description

At each ARQ round, the decoding of a data packet is performed by iteratively exchanging soft information in the form of log-likelihood ratio (LLR) values between the soft packet combiner, i.e., the joint transmission combining and equalization unit, and the soft-input–soft-output (SISO) decoder. Let us suppose that, at ARQ round $k$, all received signals and channel matrices corresponding to previous rounds are available at the receiver. Note that this assumption may not be feasible in practice, since the receiver would require a very large memory. In Subsection III-D, we show that the proposed turbo packet combining algorithm requires little memory even though it uses signals and CSI corresponding to all ARQ rounds . The block diagram of the frequency domain turbo packet combining receiver at ARQ round $k$ is depicted in Fig. 2.

First, the multi-round frequency domain block signal vector and CFR are constructed. Second, the soft packet combiner estimates the covariance of unknown CCI plus noise, then computes the multi-transmission MMSE filter that takes into account both co-antenna interference (CAI) and ISI while suppressing unknown CCI. These two elements are then used with a priori information to compute extrinsic LLRs corresponding to coded and interleaved bits. The generated soft information is transferred to the SISO decoder to compute a posteriori LLRs about both coded and useful bits. Only extrinsic information is fed back to the soft packet combiner to help perform transmission combining and equalization in the next turbo iteration. The iterative soft packet combining and decoding process is stopped after a preset number of turbo iterations, and a decision on the data packet is made. The ACK/NACK message is then sent back to the transmitter depending on the decoding outcome. Note that during the first iteration, a priori LLR values are the output of the SISO decoder obtained at the last iteration of the previous round $k-1$.
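The iteration schedule described above can be outlined as a small control loop. This is a structural sketch only: `combine` and `decode` are hypothetical callables standing in for the soft packet combiner and the SISO decoder, and the LLR values are represented as plain lists of floats.

```python
from typing import Callable, List, Tuple

LLRs = List[float]


def turbo_round(combine: Callable[[LLRs], LLRs],
                decode: Callable[[LLRs], Tuple[LLRs, bool]],
                apriori: LLRs,
                n_iterations: int) -> Tuple[LLRs, bool]:
    """Run the turbo iterations of one ARQ round.

    combine(apriori) returns extrinsic LLRs on coded and interleaved bits.
    decode(extrinsic) returns (extrinsic LLRs fed back, packet decoded correctly).
    The returned LLRs can serve as a priori input at the next ARQ round.
    """
    ok = False
    for _ in range(n_iterations):
        extrinsic = combine(apriori)      # soft combining and equalization
        apriori, ok = decode(extrinsic)   # SISO decoding, extrinsic feedback only
    return apriori, ok                    # ok drives the ACK/NACK message


# Toy stand-ins, used only to exercise the control flow.
out, ok = turbo_round(lambda a: [x + 1.0 for x in a],
                      lambda e: ([2.0 * x for x in e], all(x > 0 for x in e)),
                      [0.0], 2)
```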

### III-B Properties of CCI plus Noise Covariance

In this subsection, we focus on covariance properties of CCI plus noise present in both the single-round and multi-round communication models given by (6) and (18), respectively. These properties present an important ingredient in the turbo packet combining algorithm we introduce in Subsection III-C.

Let denote the covariance of CCI plus noise present in received signal (4) at round ,

$$\boldsymbol{\Theta}_k\triangleq E\left[\mathbf{w}^{(k)}_i\mathbf{w}^{(k)H}_i\right]\in\mathbb{C}^{N_R\times N_R}. \tag{25}$$

Let us group covariance matrices corresponding to rounds in the block diagonal matrix

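$$\boldsymbol{\Xi}_k\triangleq\mathrm{diag}\left\{\boldsymbol{\Theta}_1,\cdots,\boldsymbol{\Theta}_k\right\}\in\mathbb{C}^{kN_R\times kN_R}. \tag{26}$$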
###### Proposition 1

The covariance of the CCI plus noise block vector present in the multi-round communication model (18) after rounds is expressed as

$$\underline{\boldsymbol{\Xi}}_k=\mathbf{I}_T\otimes\boldsymbol{\Xi}_k\in\mathbb{C}^{TkN_R\times TkN_R}. \tag{27}$$

###### Proof

The expression in (27) is easily obtained by calculating the mathematical expectation of . In the derivation, we only exploit the independence between the entries of and and (i.e., short-term static block fading dynamic of the CCI MIMO ARQ channel), and the fact that CCI symbols satisfy (1). No assumption on the structure of the CCI block matrix is used. A detailed proof of (27) in the case of sliding-window aided time-domain detection can be found in [39, Subsection III.C].

###### Proposition 2

The covariance of the single-round CCI plus noise block vector at ARQ round is

$$\underline{\boldsymbol{\Theta}}_k=\mathbf{I}_T\otimes\boldsymbol{\Theta}_k. \tag{28}$$

###### Proof

The proof follows by simply invoking Proposition 1 for one round.

###### Proposition 3

Covariance matrices of frequency domain CCI plus noise vectors and (corresponding to the DFTs of and , respectively) are and , respectively.

###### Proof

The proof of Proposition 3 follows from the fact that and are block circulant and block diagonal matrices.

Proposition 1 indicates that the covariance of the multi-round CCI plus noise vector can be obtained by separately computing single-round covariances using Proposition 2. This result greatly impacts the computational complexity of the proposed algorithm, as will be shown in Subsection III-D.

### III-C Proposed Scheme

In this subsection, we derive the frequency domain MMSE-based soft packet combiner that cancels both CAI and ISI jointly over multiple ARQ rounds in the presence of unknown CCI.

To combine signals corresponding to ARQ rounds , we use conventional soft parallel interference cancellation (PIC) (of both multi-round CAI and ISI) and unconditional MMSE filtering techniques [3]. Therefore, at each turbo iteration of ARQ round , the MMSE-based soft packet combiner produces a complex scalar decision that serves for computing extrinsic LLR values corresponding to coded and interleaved bits mapped over symbol . Let denote the vector of a priori LLRs of bits corresponding to symbol , and available at the input of the soft combiner at a particular turbo iteration. denotes the conditional variance of . By invoking either the orthogonal projection theorem or Lagrangian methods, and using (11) and (27), soft MMSE-based packet combining at ARQ round , can be performed in the frequency domain as,

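$$\mathbf{z}^{(k)}_f=\boldsymbol{\Gamma}^{(k)}\underline{\mathbf{y}}^{(k)}_f-\boldsymbol{\Omega}^{(k)}\bar{\mathbf{s}}_f, \tag{29}$$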

where is the DFT of , i.e., , denotes the DFT of the soft symbol vector , and

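$$\begin{cases}
\boldsymbol{\Gamma}^{(k)}=\mathrm{diag}\left\{\underline{\boldsymbol{\Lambda}}^{(k)H}_0\mathbf{B}^{(k)\,-1}_0,\cdots,\underline{\boldsymbol{\Lambda}}^{(k)H}_{T-1}\mathbf{B}^{(k)\,-1}_{T-1}\right\},\\
\boldsymbol{\Omega}^{(k)}=\mathbf{C}^{(k)}-\mathbf{I}_T\otimes\mathrm{diag}\left\{\left(\tilde{\mathbf{C}}^{(k)}\right)_{1,1},\cdots,\left(\tilde{\mathbf{C}}^{(k)}\right)_{N_T,N_T}\right\},
\end{cases} \tag{30}$$

$$\begin{cases}
\mathbf{B}^{(k)}_i=\underline{\boldsymbol{\Lambda}}^{(k)}_i\tilde{\boldsymbol{\Sigma}}\,\underline{\boldsymbol{\Lambda}}^{(k)H}_i+\boldsymbol{\Xi}_k,\\
\mathbf{C}^{(k)}_i=\underline{\boldsymbol{\Lambda}}^{(k)H}_i\mathbf{B}^{(k)\,-1}_i\underline{\boldsymbol{\Lambda}}^{(k)}_i,\\
\mathbf{C}^{(k)}\triangleq\mathrm{diag}\left\{\mathbf{C}^{(k)}_0,\cdots,\mathbf{C}^{(k)}_{T-1}\right\},\\
\boldsymbol{\Sigma}_i\triangleq\mathrm{diag}\left\{\sigma^{2}_{1,i},\cdots,\sigma^{2}_{N_T,i}\right\}\in\mathbb{R}^{N_T\times N_T}.
\end{cases} \tag{31}$$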

Matrices and denote time averages of and , respectively, as defined in Section I. The input for the soft demapper can be extracted from as where is the th vector of the canonical basis. As it can be seen from the forward–backward filtering structure in (29), the frequency domain MMSE filter explicitly cancels soft CAI and ISI while it only requires the covariance of unknown CCI plus noise. Note that both Propositions 1 and 2 are used to derive (29).

To obtain estimates of unknown CCI plus noise covariance matrices , required by (31), let us consider the single-round frequency domain communication model (13). Proposition 3 indicates that the covariance of is Therefore, with respect to the block diagonal structure of (13), unknown CCI plus noise covariance can directly be estimated in the frequency domain at each turbo iteration, with the aid of a priori LLRs, according to the following average,

 (32)

and denote the DFTs of and at frequency bin , respectively, i.e., and . Covariance matrices are similarly estimated at ARQ rounds , respectively, and correspond to estimates obtained at the last turbo iteration. In other words, when the decoding outcome is erroneous, a NACK message is fed back to the transmitter, and the unknown CCI plus noise covariance estimate obtained at the last iteration is saved in the receiver to help perform packet combining at the next ARQ round.

### III-D Implementation Aspects

We first provide an efficient implementation of the proposed scheme since turbo combining requires at each turbo iteration the computation of matrix inverses given by (31). Second, we analyze the computational complexity and memory requirements of the proposed implementation algorithm.

#### III-D1 An Efficient Implementation Algorithm

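The special structure of the frequency domain ARQ channel matrix (23) together with the matrix inversion lemma [40] allows us to express the inverse of $\mathbf{B}^{(k)}_i$ as,

$$\mathbf{B}^{(k)\,-1}_i=\boldsymbol{\Xi}^{-1}_k-\boldsymbol{\Xi}^{-1}_k\underline{\boldsymbol{\Lambda}}^{(k)}_i\left(\tilde{\boldsymbol{\Sigma}}^{-1}+\mathbf{D}^{(k)}_i\right)^{-1}\underline{\boldsymbol{\Lambda}}^{(k)H}_i\boldsymbol{\Xi}^{-1}_k, \tag{33}$$

where $\mathbf{D}^{(k)}_i=\underline{\boldsymbol{\Lambda}}^{(k)H}_i\boldsymbol{\Xi}^{-1}_k\underline{\boldsymbol{\Lambda}}^{(k)}_i$. Since $\boldsymbol{\Xi}_k$ is block diagonal, $\mathbf{D}^{(k)}_i$ is obtained according to the following recursion,

$$\begin{cases}
\mathbf{D}^{(k)}_i=\mathbf{D}^{(k-1)}_i+\boldsymbol{\Lambda}^{(k)H}_i\boldsymbol{\Theta}^{-1}_k\boldsymbol{\Lambda}^{(k)}_i,\\
\mathbf{D}^{(0)}_i=\mathbf{0}_{N_T\times N_T}.
\end{cases} \tag{34}$$

Therefore, matrices $\mathbf{C}^{(k)}_i$ are simply computed as,

$$\mathbf{C}^{(k)}_i=\mathbf{D}^{(k)}_i-\mathbf{D}^{(k)}_i\left(\tilde{\boldsymbol{\Sigma}}^{-1}+\mathbf{D}^{(k)}_i\right)^{-1}\mathbf{D}^{(k)}_i, \tag{35}$$

while the forward filtering part of (29) is calculated at each ARQ round as,

$$\boldsymbol{\Gamma}^{(k)}\underline{\mathbf{y}}^{(k)}_f=\mathbf{F}^{(k)}\tilde{\underline{\mathbf{y}}}^{(k)}_f, \tag{36}$$

where

$$\mathbf{F}^{(k)}=\mathrm{diag}\left\{\mathbf{I}_{N_T}-\mathbf{D}^{(k)}_0\left(\tilde{\boldsymbol{\Sigma}}^{-1}+\mathbf{D}^{(k)}_0\right)^{-1},\cdots,\mathbf{I}_{N_T}-\mathbf{D}^{(k)}_{T-1}\left(\tilde{\boldsymbol{\Sigma}}^{-1}+\mathbf{D}^{(k)}_{T-1}\right)^{-1}\right\}, \tag{37}$$

and $\tilde{\underline{\mathbf{y}}}^{(k)}_f$ is given by the following recursion,

$$\begin{cases}
\tilde{\underline{\mathbf{y}}}^{(k)}_f=\tilde{\underline{\mathbf{y}}}^{(k-1)}_f+\boldsymbol{\Lambda}^{(k)H}\left(\mathbf{I}_T\otimes\boldsymbol{\Theta}^{-1}_k\right)\mathbf{y}^{(k)}_f,\\
\tilde{\underline{\mathbf{y}}}^{(0)}_f=\mathbf{0}_{TN_T\times 1}.
\end{cases} \tag{38}$$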

The proposed turbo packet combining algorithm is summarized in Table I. Note that, during the first iteration of round , the anti-causal parts in recursions (34) and (38), i.e., and , respectively, correspond to the output of these recursions at the last iteration of previous round .
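The benefit of recursion (34) is that each round only adds one term to a matrix whose size depends on the number of transmit antennas. The following plain-Python sketch checks, for a single frequency bin, that the recursive update gives the same matrix as the direct computation with stacked CFRs and a block diagonal covariance. The helper names and all numerical values are hypothetical and chosen only for illustration; recursion (38) has the same additive structure.

```python
from typing import List

Matrix = List[List[complex]]


def matmul(A: Matrix, B: Matrix) -> Matrix:
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def herm(A: Matrix) -> Matrix:
    """Hermitian transpose."""
    return [[x.conjugate() for x in col] for col in zip(*A)]


def add(A: Matrix, B: Matrix) -> Matrix:
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def inv(A: Matrix) -> Matrix:
    """Gauss-Jordan inversion with partial pivoting (fine for small matrices)."""
    n = len(A)
    M = [[complex(v) for v in row] + [complex(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]


def update_D(D_prev: Matrix, lam_k: Matrix, theta_k: Matrix) -> Matrix:
    """One step of recursion (34) for a single frequency bin."""
    return add(D_prev, matmul(matmul(herm(lam_k), inv(theta_k)), lam_k))


def direct_D(lams: List[Matrix], thetas: List[Matrix]) -> Matrix:
    """Direct computation with stacked CFRs (23) and block diagonal covariance (26)."""
    stacked = [row for lam in lams for row in lam]
    n = sum(len(th) for th in thetas)
    xi = [[0j] * n for _ in range(n)]
    off = 0
    for th in thetas:
        for r, row in enumerate(th):
            for c, v in enumerate(row):
                xi[off + r][off + c] = v
        off += len(th)
    return matmul(matmul(herm(stacked), inv(xi)), stacked)


# Illustrative numbers: 2 transmit antennas, 2 receive antennas, two ARQ rounds.
lams = [[[1 + 1j, 0.5], [0.2j, 1 - 0.5j]], [[0.3, -1j], [1 + 0.2j, 0.7]]]
thetas = [[[2.0, 0.5j], [-0.5j, 1.5]], [[1.0, 0.2], [0.2, 3.0]]]
D = [[0j, 0j], [0j, 0j]]
for lam_k, theta_k in zip(lams, thetas):
    D = update_D(D, lam_k, theta_k)      # only D is kept between rounds
D_ref = direct_D(lams, thetas)
max_diff = max(abs(a - b) for ra, rb in zip(D, D_ref) for a, b in zip(ra, rb))
```

Only the accumulated matrix is carried from one round to the next in this sketch, while the direct computation needs the CFRs and covariances of every round.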

#### III-D2 Computational Complexity and Memory Requirements

The proposed recursive implementation algorithm avoids storing received signals and CFRs corresponding to multiple ARQ rounds. It also prevents the computation of matrix inverses. This dramatically reduces the implementation cost since the complexity order of directly computing is cubic in , and increases greatly from round to round. In the following, we analyze both the complexity and memory requirements of the proposed scheme, and compare them with those of the LLR-level combining technique.⁴

⁴ In this paper, LLR-level combining refers to the iterative (turbo) packet combining and SISO decoding receiver, where transmissions corresponding to ARQ rounds are separately turbo equalized using frequency domain MMSE soft equalizers. To perform packet combining at each iteration of ARQ round , extrinsic LLR values generated by the soft MMSE equalizer at round and those obtained at the last iteration of previous rounds are added together, then SISO decoding is performed.

First, note that in the case of LLR-level packet combining, frequency domain MMSE equalization is separately performed for each ARQ round. Therefore, inversions of matrices are required to compute the forward and backward filters. Since in general it is required to have more receive than transmit antennas, especially when CCI is present in the system, an implementation similar to that introduced in the previous subsection is beneficial because only inversions of matrices will be required. In this case, the two variables in recursions (34) and (38) are computed at ARQ round as, and while all the other steps in Table I remain the same (including the CCI plus noise covariance estimation procedure in step 1.2.2.). Therefore, by letting denote the number of turbo iterations at each ARQ round, both combining algorithms have similar computational complexities since the proposed scheme and the LLR-level scheme require at most and arithmetic additions to perform (34) and (38), and to combine LLRs corresponding to multiple rounds, respectively.

LLR-level packet combining performs the combination of extrinsic LLR values generated by frequency domain soft equalizers at multiple ARQ rounds. Therefore, a storage capacity of real values is required to store accumulated LLR values corresponding to all ARQ rounds. The proposed scheme combines multiple transmissions at the signal level using signals and CFRs corresponding to all ARQ rounds, without requiring them to be explicitly stored in the receiver. This is performed with the aid of the two variables and in recursions (34) and (38), respectively. This translates into a memory size of real values. Therefore, the computational complexity and storage requirements are less sensitive to the ARQ delay. The technique requires only a few more additions and a bit more memory compared to LLR-level combining. Table II summarizes implementation requirements and reports the relative costs⁵ for some modulation schemes.

⁵ Relative costs refer to the relative number of arithmetic additions and memory required by the proposed scheme compared to LLR-level combining. With respect to storage requirements and number of arithmetic additions in Table II, we have .

## IV Performance Evaluation

### IV-A Asymptotic Performance Analysis

In the following, we provide a frame-basis analysis where we derive system conditions under which perfect CCI cancellation holds. We suppose that the interferer CSI is perfectly known, and investigate the influence of its channel properties on the interference cancellation capability of the proposed packet combining scheme in the high SNR regime.

###### Theorem 1

We consider a CCI-limited MIMO ARQ system with transmit and receive antennas, and ARQ delay . Let denote the CCI covariance at ARQ round , i.e., the covariance of the global noise at the receiver is , and be the rank of . We assume perfect LLR feedback from the SISO decoder. The frequency domain soft MMSE packet combiner provides perfect CCI suppression for asymptotically high SNR if

 k∑u=1ρu
###### Proof

See the Appendix.

We now proceed to derive an upper bound on $\rho_k$, where we incorporate the rank of the CCI fading channel. Under the assumption that CCI symbols satisfy (1), i.e., infinitely deep interleaving, we get

$$\boldsymbol{\Theta}^{\mathrm{CCI}}_k=\sum_{l'=0}^{L'-1}\mathbf{H}^{\mathrm{CCI}(k)}_{l'}\mathbf{H}^{\mathrm{CCI}(k)H}_{l'}. \tag{40}$$

Let us write each CCI channel matrix as

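$$\mathbf{H}^{\mathrm{CCI}(k)}_{l'}=\mathbf{R}^{1/2}_{N_R}\mathbf{A}^{\mathrm{CCI}(k)}_{l'}\mathbf{R}^{1/2}_{N'_T}\quad\forall\, l', \tag{41}$$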

where characterizes the scattering environment between the CCI transmitter and receiver [41], and and are the correlation matrices controlling the receive and transmit antenna arrays, and are in general given by (42), where [42]. Note that (41) corresponds to a general model of correlated fading MIMO channels, where the scattering radii at transmitter and receiver sides is taken into account, and is not necessarily a full rank matrix, i.e., [41]. Noting that and are full rank matrices, and with respect to the fact that CCI tap channel matrices are independent, and using (40) and (41), we get

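$$\rho_k\leq\min\left\{N_R,\sum_{l'=0}^{L'-1}\mathrm{rank}\left\{\mathbf{H}^{\mathrm{CCI}(k)}_{l'}\mathbf{H}^{\mathrm{CCI}(k)H}_{l'}\right\}\right\}\leq\min\left\{N_R,\sum_{l'=0}^{L'-1}\mathrm{rank}\left\{\mathbf{A}^{\mathrm{CCI}(k)}_{l'}\right\}\right\}. \tag{43}$$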
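The right-hand side of (43) is simple to evaluate once the ranks of the per-tap scattering matrices are known. The helper below is a minimal sketch with a hypothetical name and illustrative inputs; it only evaluates the bound and does not test the sum-rank condition of Theorem 1.

```python
from typing import Sequence


def cci_rank_upper_bound(n_r: int, scattering_ranks: Sequence[int]) -> int:
    """Upper bound (43) on the rank of the CCI covariance at one ARQ round.

    n_r: number of receive antennas.
    scattering_ranks: rank of the scattering matrix of each interferer tap.
    """
    return min(n_r, sum(scattering_ranks))


# Illustrative cases: a single rank-2 tap with 4 receive antennas, and two
# rank-2 taps with 2 receive antennas (the bound saturates at n_r).
bound_single_tap = cci_rank_upper_bound(4, [2])
bound_two_taps = cci_rank_upper_bound(2, [2, 2])
```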

A closer look at Theorem 1 and upper bound (43) provides interesting system interpretations.

• Impact of CCI Fading Channel: First, note that the CCI cancellation capability of the frequency domain MMSE packet combiner is related to the CCI channel rank. When the interferer has a rank-deficient channel matrix at a certain ARQ round, interference can completely be removed (at subsequent rounds) if the sum-rank condition in Theorem 1 is satisfied. In practice, the channel rank can dramatically drop in the case of the so-called pinhole channel, where the transmitter and receiver are largely separated and are surrounded by multiple scatterers [41]. In this scenario, the channel can even prevent multipath from building up since the thin air pipe connecting transmitter and receiver scatterers is very long. For instance, in a system with receive and transmit antennas, and an unknown interferer who is experiencing one path () channel realizations with rank equal to two, CCI can be removed at the second ARQ round because the sum-rank condition (39) holds for .

• Impact of the Number of Transmit Antennas and ARQ Delay: Condition (39) suggests how, for a given CCI channel profile, the number of transmit antennas and ARQ rounds are chosen to achieve perfect CCI cancellation. For instance, if transmission is corrupted by CCI with quasi-static channel rank (in this case, CCI with quasi-static channel rank refers to an interferer whose channel rank is constant over multiple ARQ rounds), and if the ARQ delay allowed by the upper layer is , then only transmit antennas can be allocated to the user of interest to achieve interference suppression at the latest at ARQ round , where is the rank of , i.e., . Increasing the ARQ delay will relax the condition on the number of transmit antennas and therefore allow for an increase in the diversity and/or multiplexing gains depending on the diversity-multiplexing-delay trade-off operating point [23]. Note that when , the CCI channel rank dramatically drops, and therefore CCI suppression is achieved even when a short ARQ delay is required.

• Interaction with the Scheduling Mechanism: In the case of opportunistic communications, interference with co-channel users who have high channel ranks can be prevented. For instance, when a retransmission is required on the reverse link, the base station (BS) can choose the timing of the next ARQ round so that the retransmission coincides with the transmission of a user with low channel rank. This is feasible since the BS has complete knowledge of the users' CSI in the reverse link. The same scheduling mechanism can be used in the forward link if all users provide the BS with feedback information about their channel ranks. When the system suffers from CCI caused by neighboring cells, the sum-rank condition (39) can be achieved by simply increasing the number of ARQ rounds because the CCI channel rank tends to be constant over time.
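The rank collapse of a pinhole channel mentioned in the first remark can be illustrated numerically. The following is a minimal sketch, not the simulation setup of this paper: it assumes flat (single-tap) fading, models the pinhole channel as the outer product of a receive-side and a transmit-side fading vector, and uses hypothetical antenna counts. The function names `rayleigh_channel` and `pinhole_channel` are illustrative only.

```python
import numpy as np

def rayleigh_channel(n_rx, n_tx, rng):
    """i.i.d. complex Gaussian channel matrix (rich-scattering assumption)."""
    real = rng.standard_normal((n_rx, n_tx))
    imag = rng.standard_normal((n_rx, n_tx))
    return (real + 1j * imag) / np.sqrt(2)

def pinhole_channel(n_rx, n_tx, rng):
    """Pinhole channel modeled as an outer product of two fading vectors.

    Every column is a scaled copy of the same receive-side vector,
    so the matrix has rank one regardless of the antenna counts.
    """
    g_rx = rayleigh_channel(n_rx, 1, rng)  # receive-side scatterers
    g_tx = rayleigh_channel(1, n_tx, rng)  # transmit-side scatterers
    return g_rx @ g_tx

rng = np.random.default_rng(0)
n_rx, n_tx = 4, 4  # hypothetical antenna configuration

rank_rich = np.linalg.matrix_rank(rayleigh_channel(n_rx, n_tx, rng))
rank_pinhole = np.linalg.matrix_rank(pinhole_channel(n_rx, n_tx, rng))
print("rich scattering rank:", rank_rich)
print("pinhole rank:", rank_pinhole)
```

Per-round ranks computed this way are the quantities that enter the sum-rank condition (39); the condition itself is not reproduced in this sketch.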

### IV-B Numerical Results

In this subsection, we provide block error rate (BLER) performance results for the proposed combining technique. Our focus is to demonstrate the superior performance of the introduced scheme compared to LLR-level combining. We also evaluate BLER performance for scenarios where the interferer has rank-deficient channel matrices to corroborate the theoretical analysis in Subsection IV-A.

In all simulations, we consider a BICM scheme where the encoder is a -rate convolutional code with polynomial generators , and the modulation scheme is quadrature phase shift keying (QPSK). The length of the code bit frame is bits including tails. The ARQ delay is , and the ratio appearing in all figures is the SNR per useful bit per receive antenna. We consider a path MIMO-ISI channel profile where . In practical wireless systems, the wireless channel may have more than two paths due to severe frequency selective fading. In this paper, we restrict ourselves to for the sake of simulation simplicity. Performance in the case of severe frequency selective fading channels can be found in [38]. We use both the matched filter bound (MFB) per ARQ round and the outage probability [26] of the CCI-free MIMO-ISI channel as absolute performance bounds to evaluate the CCI cancellation capability and diversity order achieved by the proposed combining scheme. The number of turbo iterations is set to five and the Max-Log-MAP version of the maximum a posteriori (MAP) algorithm is used for SISO decoding.
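As a reminder of what the Max-Log-MAP simplification amounts to, the sketch below contrasts the exact Jacobian logarithm used in Log-MAP decoding with its max-only approximation. It is a generic illustration under our own naming (`max_star`, `max_log`), not the decoder implementation used to produce the results reported here.

```python
import math

def max_star(a, b):
    """Exact Jacobian logarithm: log(exp(a) + exp(b))."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_log(a, b):
    """Max-Log approximation: the correction term log(1 + exp(-|a-b|)) is dropped."""
    return max(a, b)

# Two hypothetical branch metrics in the log domain.
a, b = 1.0, 1.5
exact = max_star(a, b)
approx = max_log(a, b)

# The approximation never exceeds the exact value,
# and the gap is at most log(2), reached when a == b.
print("exact:", exact, "approx:", approx, "gap:", exact - approx)
```

The approximation removes the logarithm and exponential from the SISO decoder recursions at the cost of a bounded per-operation error.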

We first investigate performance for scenarios where the user of interest and the interferer have the same number of transmit antennas () and identical channel profiles, i.e., , equal-power taps, and CCI fading channel coefficients are i.i.d. In Fig. 3, we compare the BLER performance of the proposed scheme with that of LLR-level combining for an ST–BICM code with rate , i.e., . The number of receive antennas is , and . We observe that the proposed scheme significantly outperforms LLR-level combining. The performance gap at ARQ round is about for . Note that both combining schemes fail to cancel CCI perfectly, since the performance curves tend to saturate for high values. Fig. 4 reports performance of both techniques when is increased to . In this case, the performance gap between the two schemes is reduced. The CCI cancellation capability is also improved, as can be seen from the steeper slopes of the BLER curves. In Fig. 5, we evaluate the performance for a high-rate ST–BICM code where , i.e., . Only receive antennas are considered, and . The proposed scheme dramatically outperforms LLR-level combining: the performance gap at ARQ round is about at BLER. The proposed scheme also offers higher cancellation capability and diversity order than LLR-level combining.