# Four-Group Decodable Space-Time Block Codes

Dũng Ngọc Ðào, Chau Yuen, Chintha Tellambura, Yong Liang Guan, and Tjeng Thiang Tjhung

Manuscript received November 7, 2006; revised February 23, 2007, and May 7, 2007. The work of D. N. Ðào and C. Tellambura was supported by the Natural Sciences and Engineering Research Council (NSERC) and Alberta Informatics Circle of Research Excellence (iCORE), Canada. The editor coordinating the review of this paper and approving it for publication was Dr. Franz Hlawatsch.

D. N. Ðào was with the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2V4, Canada. He is now with the Department of Electrical and Computer Engineering, McGill University, Montréal, Québec, H3A 2A7, Canada (e-mail: ngoc.dao@mail.mcgill.ca).

C. Yuen and T. T. Tjhung are with the Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613 (e-mail: {cyuen, tjhungtt}@i2r.a-star.edu.sg).

C. Tellambura is with the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta T6G 2V4, Canada (e-mail: chintha@ece.ualberta.ca).

Y. L. Guan is with the School of Electrical and Electronic Engineering, Nanyang Technological University, S1-B1c-108, Nanyang Avenue, Singapore 639798 (e-mail: eylguan@ntu.edu.sg).

###### Abstract

Two new rate-one full-diversity space-time block codes (STBC) are proposed. They are characterized by the lowest decoding complexity among the known rate-one STBC, arising due to the complete separability of the transmitted symbols into four groups for maximum likelihood detection. The first and the second codes are delay-optimal if the number of transmit antennas is a power of 2 and even, respectively. The exact pair-wise error probability is derived to allow for the performance optimization of the two codes. Compared with existing low-decoding complexity STBC, the two new codes offer several advantages such as higher code rate, lower encoding/decoding delay and complexity, lower peak-to-average power ratio, and better performance.

###### Keywords

Orthogonal designs, performance analysis, quasi-orthogonal space-time block codes, space-time block codes.

## I Introduction

Space-time block codes (STBC; the term "STBC" stands for space-time block code/codes/coding, depending on the context) have been extensively studied since they exploit the diversity and/or the capacity of multiple-input multiple-output (MIMO) channels. Among various STBC, orthogonal STBC (OSTBC) [1, 2, 3] offer the minimum decoding complexity and full diversity. However, they have low code rates when the number of transmit (Tx) antennas is more than 2 [3]. A rate of one symbol per channel use (pcu) exists only for 2 Tx antennas, and the rate approaches 1/2 for a large number of Tx antennas [1, 2, 3].

To improve the low rate of OSTBC, several quasi-orthogonal STBC (QSTBC) have been proposed (see [4, 5, 6, 7] and references therein). They allow joint maximum likelihood (ML) decoding of pairs of complex symbols. However, the rate-one QSTBC exist for 4 Tx antennas only and the code rate is smaller than 1 for more than 4 Tx antennas. Several rate-one STBC have been proposed (e.g. [8, 9, 10]), in which the transmitted symbols can be completely separated into two groups for ML detection. However, for more than 4 Tx antennas, the decoding complexity of the rate-one STBC in [8, 9, 10] increases significantly compared with OSTBC and QSTBC.

In this paper, we propose two new rate-one STBC for any number of Tx antennas. Compared with the existing rate-one STBC, our new codes have the lowest decoding complexity since the transmitted symbols can be decoupled into 4 groups (4Gp) for ML detection. The first code is called 4Gp-QSTBC. The second code is derived from semi-orthogonal algebraic space-time (SAST) codes [10] and is thus called 4Gp-SAST codes. The first and the second codes are delay-optimal when the number of Tx antennas is a power of 2 and even, respectively. The equivalent transmit-receive signals are derived so that sphere decoders [11] can be applied for data detection. To achieve full diversity, signal rotations are required for the two codes. The exact pair-wise error probability (PEP) of the two codes is derived to optimize the signal rotations.

We compare the main parameters of our new codes and several existing STBC for 6 and 8 Tx antennas in Table I. Clearly, the new codes offer several distinct advantages such as higher code rate, lower decoding complexity, and lower encoding/decoding delay. The two new codes also have lower peak-to-average power ratio (PAPR) than OSTBC, QSTBC, and minimum decoding complexity (MDC) QSTBC [12]. Moreover, simulation results show that our new codes also yield significant SNR gains compared with the existing codes.

Notation: Superscripts $^T$, $^*$, and $^\dagger$ denote matrix transpose, conjugate, and transpose conjugate, respectively. The identity and all-zero square matrices of proper size are denoted by $\mathbf{I}$ and $\mathbf{0}$. The diagonal matrix with elements of vector $\mathbf{x}$ on the main diagonal is denoted by $\mathrm{diag}(\mathbf{x})$. $\|\mathbf{X}\|_F$ stands for the Frobenius norm of matrix $\mathbf{X}$ and $\otimes$ denotes Kronecker product [13]. A mean-$m$ and variance-$\sigma^2$ circularly complex Gaussian random variable is written by $\mathcal{CN}(m, \sigma^2)$. $\Re(\mathbf{X})$ and $\Im(\mathbf{X})$ denote the real and imaginary parts of $\mathbf{X}$, respectively.

## II System Model and Preliminaries

### II-A System Model

We consider data transmission over a MIMO quasi-static Rayleigh flat fading channel with $M$ Tx and $N$ receive (Rx) antennas [14]. The channel gain $h_{mn}$ between the $(m, n)$-th Tx-Rx antenna pair is assumed $\mathcal{CN}(0, 1)$ and remains constant over $T$ time slots. We assume no spatial correlation at either the Tx or the Rx array. The receiver, but not the transmitter, completely knows the channel gains.

A STBC can be represented in a general dispersion form [14] as follows:

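$$\mathbf{X} = \sum_{k=1}^{K} \left(a_k \mathbf{A}_k + b_k \mathbf{B}_k\right) \tag{1}$$

where $\mathbf{A}_k$ and $\mathbf{B}_k$ ($k = 1, 2, \ldots, K$) are $T \times M$ constant matrices, commonly called dispersion matrices; $a_k$ and $b_k$ are the real and imaginary parts of the symbol $s_k$. We can use an equivalent form of STBC as

$$\mathbf{X} = \sum_{l=1}^{L} c_l \mathbf{C}_l \tag{2}$$

where $L$ is the number (not necessarily even) of transmitted symbols, $c_l$ are real-valued transmitted symbols, and $\mathbf{C}_l$ are dispersion matrices. The average energy of code matrices is constrained such that $\mathrm{E}[\|\mathbf{X}\|_F^2] = T$.

The received signals $y_{tn}$ of the $n$th antenna at time $t$ can be arranged in a matrix $\mathbf{Y}$ of size $T \times N$. Thus, one can represent the Tx-Rx signal relation as [15, 14]

$$\mathbf{Y} = \sqrt{\rho}\, \mathbf{X}\mathbf{H} + \mathbf{Z} \tag{3}$$

where $\mathbf{H} = [h_{mn}]$ is the channel matrix and $\mathbf{Z} = [z_{tn}]$ is the noise matrix of size $T \times N$, whose elements are independently, identically distributed (i.i.d.) $\mathcal{CN}(0, 1)$. The Tx power is scaled by $\rho$ so that the average signal-to-noise ratio (SNR) at each Rx antenna is $\rho$, independent of the number of Tx antennas.

Let the data vector be $\mathbf{c} = [c_1, c_2, \ldots, c_L]^T$. The ML decoding of STBC is to find the solution $\hat{\mathbf{c}}$ so that:

$$\hat{\mathbf{c}} = \arg\min_{\mathbf{c}} \left\|\mathbf{Y} - \sqrt{\rho}\, \mathbf{X}\mathbf{H}\right\|_F^2. \tag{4}$$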

### II-B Algebraic Constraints of QSTBC

The key idea of QSTBC is to divide the $L$ (real) transmitted symbols embedded in a code matrix into $\Gamma$ groups, so that the ML detection of the transmitted symbol vector can be decoupled into $\Gamma$ sub-metrics, where each sub-metric involves the symbols of only one group [6, 8, 16, 10]. We provide a definition of STBC with this feature to unify the notation in this paper as follows.

###### Definition 1

A STBC is said to be a $\Gamma$-group decodable STBC if the ML decoding metric (4) can be decoupled into a linear sum of $\Gamma$ independent submetrics, where each submetric consists of the symbols from only one group. The $\Gamma$-group decodable STBC is denoted by $\Gamma$Gp-STBC for short.

In the most general case, we assume that there are $\Gamma$ groups; group $i$ ($i = 1, 2, \ldots, \Gamma$) has $L_i$ symbols. Thus $L = \sum_{i=1}^{\Gamma} L_i$. Let $\Theta_i$ be the set of indexes of symbols in group $i$.

Yuen et al. [16, Theorem 1] have shown a sufficient condition for a STBC to be $\Gamma$-group decodable. In fact, this condition is also necessary. We will state these results in the following theorem without proof for brevity.

###### Theorem 1

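The necessary and sufficient conditions, so that a STBC is $\Gamma$-group decodable, are

$$\mathbf{C}_p^\dagger \mathbf{C}_q + \mathbf{C}_q^\dagger \mathbf{C}_p = \mathbf{0} \quad \forall p \in \Theta_i,\ \forall q \in \Theta_j,\ i \neq j. \tag{5}$$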

Note that Theorem 1 covers [17, Theorem 9] (single-symbol decodable STBC) and can be shown similarly.
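
As a small illustration of condition (5), the following sketch tests every pair of dispersion matrices taken from different groups. The Alamouti code used as input, the helper names, and the grouping (one real symbol per group) are assumptions chosen for illustration; they are not taken from the paper.

```python
def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def herm(A):
    """Conjugate transpose of a nested-list matrix."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]


def anticommute(Cp, Cq, tol=1e-12):
    """True if Cp^H Cq + Cq^H Cp = 0, i.e. the pair satisfies (5)."""
    P, Q = matmul(herm(Cp), Cq), matmul(herm(Cq), Cp)
    return all(abs(P[i][j] + Q[i][j]) < tol
               for i in range(len(P)) for j in range(len(P[0])))


def is_group_decodable(C, groups):
    """Check (5) for all pairs of matrices whose indexes lie in different groups."""
    for gi in range(len(groups)):
        for gj in range(gi + 1, len(groups)):
            for p in groups[gi]:
                for q in groups[gj]:
                    if not anticommute(C[p], C[q]):
                        return False
    return True


# Illustrative input: Alamouti code X = [[s1, s2], [-s2*, s1*]] written in
# form (2) with s1 = c1 + j c2 and s2 = c3 + j c4.
ALAMOUTI = [
    [[1, 0], [0, 1]],      # C1 multiplies c1 = Re(s1)
    [[1j, 0], [0, -1j]],   # C2 multiplies c2 = Im(s1)
    [[0, 1], [-1, 0]],     # C3 multiplies c3 = Re(s2)
    [[0, 1j], [1j, 0]],    # C4 multiplies c4 = Im(s2)
]

print(is_group_decodable(ALAMOUTI, [[0], [1], [2], [3]]))
```

The same function can be applied to any code in form (2) once its dispersion matrices and a candidate grouping are listed.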

## III Four-Group Decodable STBC Derived from QSTBC

### III-A Encoding

In this section, we will study the new 4Gp-QSTBC. As we will see later, the general form of STBC in (1) is convenient for studying 4Gp-QSTBC; hence Theorem 1 can be restated as follows.

###### Lemma 1 ([18])

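The necessary and sufficient conditions for a STBC in (1) to become $\Gamma$-group decodable are: (a) $\mathbf{A}_p^\dagger \mathbf{A}_q + \mathbf{A}_q^\dagger \mathbf{A}_p = \mathbf{0}$, (b) $\mathbf{B}_p^\dagger \mathbf{B}_q + \mathbf{B}_q^\dagger \mathbf{B}_p = \mathbf{0}$, and (c) $\mathbf{A}_p^\dagger \mathbf{B}_q + \mathbf{B}_q^\dagger \mathbf{A}_p = \mathbf{0}$, $\forall p \in \Theta_i$, $\forall q \in \Theta_j$, $i \neq j$.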

We next consider another sufficient condition so that a STBC is four-group decodable.

###### Theorem 2

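Given a 4Gp-STBC for $M$ Tx antennas with code length $T$ and $K$ sets of dispersion matrices $(\mathbf{A}_k, \mathbf{B}_k)$, $1 \le k \le K$, a 4Gp-STBC with code length $2T$ for $2M$ Tx antennas, which consists of $2K$ sets of dispersion matrices denoted as $(\bar{\mathbf{A}}_i, \bar{\mathbf{B}}_i)$, $1 \le i \le 2K$, can be constructed using the following mapping rules:

$$\bar{\mathbf{A}}_{2k-1} = \begin{bmatrix} \mathbf{A}_k & \mathbf{0} \\ \mathbf{0} & \mathbf{A}_k \end{bmatrix},\quad \bar{\mathbf{A}}_{2k} = \begin{bmatrix} \mathbf{B}_k & \mathbf{0} \\ \mathbf{0} & \mathbf{B}_k \end{bmatrix},\quad \bar{\mathbf{B}}_{2k-1} = \begin{bmatrix} \mathbf{0} & \mathbf{A}_k \\ \mathbf{A}_k & \mathbf{0} \end{bmatrix},\quad \bar{\mathbf{B}}_{2k} = \begin{bmatrix} \mathbf{0} & \mathbf{B}_k \\ \mathbf{B}_k & \mathbf{0} \end{bmatrix}. \tag{6}$$

###### Proof

Theorem 2 can be proved by showing that if the dispersion matrices $(\mathbf{A}_k, \mathbf{B}_k)$ satisfy Lemma 1 for four groups, then the dispersion matrices $(\bar{\mathbf{A}}_i, \bar{\mathbf{B}}_i)$ constructed from $(\mathbf{A}_k, \mathbf{B}_k)$ using (6) also satisfy Lemma 1, with the four index sets mapped accordingly. The detailed proof is omitted here, as the steps are routine.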
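
The omitted steps can be spot-checked numerically. The sketch below applies the mapping rules of Theorem 2 twice and tests the anticommutation condition of Theorem 1 between matrices of different groups. Two points are assumptions of this sketch rather than statements from the paper: the seed is the Alamouti code (the paper starts from the MDC-QSTBC of [12]), and each new dispersion matrix is taken to inherit the group of the old matrix it is built from.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def herm(A):
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]


def anticommute(P, Q, tol=1e-12):
    """True if P^H Q + Q^H P = 0."""
    X, Y = matmul(herm(P), Q), matmul(herm(Q), P)
    return all(abs(X[i][j] + Y[i][j]) < tol
               for i in range(len(X)) for j in range(len(X[0])))


def block2x2(P, Q, R, S):
    """Assemble the block matrix [[P, Q], [R, S]] from equally sized blocks."""
    return [p + q for p, q in zip(P, Q)] + [r + s for r, s in zip(R, S)]


def expand(code):
    """Apply the mapping rules once.

    `code` is a list of ((A_k, group of a_k), (B_k, group of b_k)).
    Assumption: a new matrix inherits the group label of its source matrix.
    """
    out = []
    for (A, gA), (B, gB) in code:
        Z = [[0] * len(A[0]) for _ in A]
        out.append(((block2x2(A, Z, Z, A), gA), (block2x2(Z, A, A, Z), gA)))
        out.append(((block2x2(B, Z, Z, B), gB), (block2x2(Z, B, B, Z), gB)))
    return out


def four_group_ok(code):
    """Matrices carrying different group labels must anticommute."""
    flat = [item for pair in code for item in pair]
    return all(anticommute(P, Q)
               for i, (P, gP) in enumerate(flat)
               for (Q, gQ) in flat[i + 1:] if gP != gQ)


# Illustrative seed: Alamouti code, each real symbol in its own group.
seed = [
    (([[1, 0], [0, 1]], 0), ([[1j, 0], [0, -1j]], 1)),
    (([[0, 1], [-1, 0]], 2), ([[0, 1j], [1j, 0]], 3)),
]
code4 = expand(seed)    # 4 pairs of 4x4 matrices (4 Tx antennas)
code8 = expand(code4)   # 8 pairs of 8x8 matrices (8 Tx antennas)
print(four_group_ok(code4), four_group_ok(code8))
```

Each expansion doubles the code length, the number of Tx antennas, and the number of symbols, so the rate is unchanged while the number of groups stays at four.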

The recursive construction of 4Gp-STBC specified in Theorem 2 suggests that we can start with the MDC-QSTBC for 4 Tx antennas proposed in [12] to construct 4Gp-STBC for 8, 16 Tx antennas and so on, because MDC-QSTBC is one of the STBC satisfying Lemma 1; the resulting STBC is thus called 4Gp-QSTBC. For practical interest, we will illustrate the encoding process of 4Gp-QSTBC for 8 Tx antennas from the MDC-QSTBC for 4 Tx antennas [12]. The code matrix of MDC-QSTBC for 4 Tx antennas is

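$$\mathbf{F}_4 = \begin{bmatrix}
a_1 + ja_3 & a_2 + ja_4 & b_1 + jb_3 & b_2 + jb_4 \\
-a_2 + ja_4 & a_1 - ja_3 & -b_2 + jb_4 & b_1 - jb_3 \\
b_1 + jb_3 & b_2 + jb_4 & a_1 + ja_3 & a_2 + ja_4 \\
-b_2 + jb_4 & b_1 - jb_3 & -a_2 + ja_4 & a_1 - ja_3
\end{bmatrix} \tag{7}$$

where $j = \sqrt{-1}$.

The code matrix $\mathbf{F}_8$ of 4Gp-QSTBC for 8 Tx antennas, obtained from $\mathbf{F}_4$ using the mapping rules in (6), is given below:

$$\mathbf{F}_8 = \begin{bmatrix}
a_1 + ja_5 & a_3 + ja_7 & a_2 + ja_6 & a_4 + ja_8 & b_1 + jb_5 & b_3 + jb_7 & b_2 + jb_6 & b_4 + jb_8 \\
-a_3 + ja_7 & a_1 - ja_5 & -a_4 + ja_8 & a_2 - ja_6 & -b_3 + jb_7 & b_1 - jb_5 & -b_4 + jb_8 & b_2 - jb_6 \\
a_2 + ja_6 & a_4 + ja_8 & a_1 + ja_5 & a_3 + ja_7 & b_2 + jb_6 & b_4 + jb_8 & b_1 + jb_5 & b_3 + jb_7 \\
-a_4 + ja_8 & a_2 - ja_6 & -a_3 + ja_7 & a_1 - ja_5 & -b_4 + jb_8 & b_2 - jb_6 & -b_3 + jb_7 & b_1 - jb_5 \\
b_1 + jb_5 & b_3 + jb_7 & b_2 + jb_6 & b_4 + jb_8 & a_1 + ja_5 & a_3 + ja_7 & a_2 + ja_6 & a_4 + ja_8 \\
-b_3 + jb_7 & b_1 - jb_5 & -b_4 + jb_8 & b_2 - jb_6 & -a_3 + ja_7 & a_1 - ja_5 & -a_4 + ja_8 & a_2 - ja_6 \\
b_2 + jb_6 & b_4 + jb_8 & b_1 + jb_5 & b_3 + jb_7 & a_2 + ja_6 & a_4 + ja_8 & a_1 + ja_5 & a_3 + ja_7 \\
-b_4 + jb_8 & b_2 - jb_6 & -b_3 + jb_7 & b_1 - jb_5 & -a_4 + ja_8 & a_2 - ja_6 & -a_3 + ja_7 & a_1 - ja_5
\end{bmatrix}. \tag{8}$$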

The code rate of 4Gp-QSTBC for 8 Tx antennas is one symbol pcu. In general, by construction, the rate of 4Gp-QSTBC for $2^m$ Tx antennas (integer $m \ge 2$) is the same as the rate of MDC-QSTBC for 4 Tx antennas. Since the maximal rate of MDC-QSTBC is one symbol pcu [12], the maximal achievable rate of 4Gp-QSTBC is also one symbol pcu for $2^m$ Tx antennas. If the number of Tx antennas is $M < 2^m$, then $(2^m - M)$ columns of the code matrix for $2^m$ Tx antennas can be deleted to obtain the code for $M$ antennas. Thus, the maximum rate of 4Gp-QSTBC is one symbol pcu and it is achievable for any number of Tx antennas. Additionally, the code matrix $\mathbf{F}_4$ is square. By the recursive construction (6), the code matrices of 4Gp-QSTBC are also square for $2^m$ Tx antennas; therefore, 4Gp-QSTBC are delay-optimal if the number of Tx antennas is $2^m$ [17].

### III-B Decoding

We know that the symbols of $\mathbf{F}_4$ can be separately detected [12]. Therefore, from Theorem 2, the 8 complex symbols of $\mathbf{F}_8$ fall into 4 groups that can be detected independently. These 4 groups are $(a_1, a_2, b_1, b_2)$, $(a_3, a_4, b_3, b_4)$, $(a_5, a_6, b_5, b_6)$, and $(a_7, a_8, b_7, b_8)$. The ML metric given in (4) can be derived to detect the 4 groups of symbols of $\mathbf{F}_8$. However, to provide more insights into the decoding of 4Gp-QSTBC, we will derive an equivalent code and the equivalent channel of $\mathbf{F}_8$. Furthermore, using the equivalent channel of $\mathbf{F}_8$, we can use a sphere decoder [11] to reduce the complexity of the ML search.

The equivalent code of $\mathbf{F}_8$ is obtained by column permutations of the code matrix $\mathbf{F}_8$ in (8): the order of columns is changed to (1, 3, 5, 7, 2, 4, 6, 8). The same permutation is also applied to the rows of $\mathbf{F}_8$. Let $x_1 = a_1 + ja_5$, $x_2 = a_2 + ja_6$, $x_3 = b_1 + jb_5$, $x_4 = b_2 + jb_6$, $x_5 = a_3 + ja_7$, $x_6 = a_4 + ja_8$, $x_7 = b_3 + jb_7$, and $x_8 = b_4 + jb_8$ be the intermediate variables; we obtain a permutation-equivalent code $\mathbf{D}$ of $\mathbf{F}_8$ below

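$$\mathbf{D} = \begin{bmatrix} \mathbf{D}_1 & \mathbf{D}_2 \\ -\mathbf{D}_2^* & \mathbf{D}_1^* \end{bmatrix} \tag{9}$$

where

$$\mathbf{D}_1 = \begin{bmatrix} x_1 & x_2 & x_3 & x_4 \\ x_2 & x_1 & x_4 & x_3 \\ x_3 & x_4 & x_1 & x_2 \\ x_4 & x_3 & x_2 & x_1 \end{bmatrix},\quad \mathbf{D}_2 = \begin{bmatrix} x_5 & x_6 & x_7 & x_8 \\ x_6 & x_5 & x_8 & x_7 \\ x_7 & x_8 & x_5 & x_6 \\ x_8 & x_7 & x_6 & x_5 \end{bmatrix}. \tag{10}$$

The sub-matrices $\mathbf{D}_1$ and $\mathbf{D}_2$ have a special form called block-circulant matrix with circulant blocks [13].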
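
The permutation step can be checked mechanically. The sketch below builds the 8 Tx code matrix of (8) from arbitrary real symbols, applies the column and row order (1, 3, 5, 7, 2, 4, 6, 8), and compares the result with the block form (9)-(10). The assignment of the intermediate variables used in the code ($x_1 = a_1 + ja_5$, $x_2 = a_2 + ja_6$, $x_3 = b_1 + jb_5$, $x_4 = b_2 + jb_6$, $x_5 = a_3 + ja_7$, $x_6 = a_4 + ja_8$, $x_7 = b_3 + jb_7$, $x_8 = b_4 + jb_8$) is the one that makes the comparison succeed and should be read as an assumption of this sketch; all function names are hypothetical.

```python
def quasi_block(u1, u2, v1, v2):
    """4x4 block [[U, V], [V, U]] with Alamouti-like U(u1, u2) and V(v1, v2)."""
    c = lambda z: z.conjugate()
    return [
        [u1, u2, v1, v2],
        [-c(u2), c(u1), -c(v2), c(v1)],
        [v1, v2, u1, u2],
        [-c(v2), c(v1), -c(u2), c(u1)],
    ]


def encode_f8(a, b):
    """Code matrix (8); a and b hold the real symbols a1..a8 and b1..b8."""
    Fa = quasi_block(complex(a[0], a[4]), complex(a[2], a[6]),
                     complex(a[1], a[5]), complex(a[3], a[7]))
    Fb = quasi_block(complex(b[0], b[4]), complex(b[2], b[6]),
                     complex(b[1], b[5]), complex(b[3], b[7]))
    return ([ra + rb for ra, rb in zip(Fa, Fb)]
            + [rb + ra for ra, rb in zip(Fa, Fb)])


def xor_block(v):
    """4x4 matrix of (10): entry (r, c) is v[r XOR c] with 0-based indices."""
    return [[v[r ^ c] for c in range(4)] for r in range(4)]


def encode_d(x):
    """Code matrix (9) built from the intermediate variables x1..x8."""
    D1, D2 = xor_block(x[:4]), xor_block(x[4:])
    top = [r1 + r2 for r1, r2 in zip(D1, D2)]
    bottom = [[-z.conjugate() for z in r2] + [z.conjugate() for z in r1]
              for r1, r2 in zip(D1, D2)]
    return top + bottom


# Arbitrary real symbol values, chosen only for the check.
a = [1, -3, 3, 1, -1, 1, 3, -3]
b = [3, 1, -1, -3, 1, 3, -3, -1]
F8 = encode_f8(a, b)

order = [0, 2, 4, 6, 1, 3, 5, 7]   # (1, 3, 5, 7, 2, 4, 6, 8) in 0-based form
permuted = [[F8[r][c] for c in order] for r in order]

x = [complex(a[0], a[4]), complex(a[1], a[5]), complex(b[0], b[4]), complex(b[1], b[5]),
     complex(a[2], a[6]), complex(a[3], a[7]), complex(b[2], b[6]), complex(b[3], b[7])]
print(permuted == encode_d(x))
```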

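We next show how to decode the code $\mathbf{D}$. For simplicity, a single Rx antenna is considered. The generalization for multiple Rx antennas is straightforward. Assume that the Tx symbols are drawn from a constellation with unit average power; the Tx-Rx signal model in (3) for the case of STBC $\mathbf{D}$ follows

$$\mathbf{y} = \sqrt{\rho/8}\, \mathbf{D}\mathbf{h} + \mathbf{z}. \tag{11}$$

Let $\mathbf{y} = [\mathbf{y}_1^T\ \mathbf{y}_2^T]^T$ and $\mathbf{z} = [\mathbf{z}_1^T\ \mathbf{z}_2^T]^T$ be split into halves of length 4, $\mathbf{x}_1 = [x_1, x_2, x_3, x_4]^T$, $\mathbf{x}_2 = [x_5, x_6, x_7, x_8]^T$, $\mathbf{x} = [\mathbf{x}_1^T\ \mathbf{x}_2^T]^T$, $\mathbf{h} = [h_1, h_2, \ldots, h_8]^T$, and

$$\mathbf{H}_1 = \begin{bmatrix} h_1 & h_2 & h_3 & h_4 \\ h_2 & h_1 & h_4 & h_3 \\ h_3 & h_4 & h_1 & h_2 \\ h_4 & h_3 & h_2 & h_1 \end{bmatrix},\quad \mathbf{H}_2 = \begin{bmatrix} h_5 & h_6 & h_7 & h_8 \\ h_6 & h_5 & h_8 & h_7 \\ h_7 & h_8 & h_5 & h_6 \\ h_8 & h_7 & h_6 & h_5 \end{bmatrix}. \tag{12}$$

We have an equivalent expression of (11) as

$$\underbrace{\begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2^* \end{bmatrix}}_{\hat{\mathbf{y}}} = \sqrt{\frac{\rho}{8}} \underbrace{\begin{bmatrix} \mathbf{H}_1 & \mathbf{H}_2 \\ \mathbf{H}_2^* & -\mathbf{H}_1^* \end{bmatrix}}_{\bar{\mathbf{H}}} \begin{bmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{bmatrix} + \underbrace{\begin{bmatrix} \mathbf{z}_1 \\ \mathbf{z}_2^* \end{bmatrix}}_{\hat{\mathbf{z}}}. \tag{13}$$

Note that $\mathbf{H}_1$ and $\mathbf{H}_2$ are block-circulant matrices with circulant blocks [13]. Thus, they commute, and so do $\mathbf{H}_1^*$ and $\mathbf{H}_2^*$. We can multiply both sides of (13) with $\bar{\mathbf{H}}^\dagger$ to get

$$\underbrace{\bar{\mathbf{H}}^\dagger \hat{\mathbf{y}}}_{\bar{\mathbf{y}}} = \sqrt{\frac{\rho}{8}} \begin{bmatrix} \mathbf{H}_1^*\mathbf{H}_1 + \mathbf{H}_2^*\mathbf{H}_2 & \mathbf{0} \\ \mathbf{0} & \mathbf{H}_1^*\mathbf{H}_1 + \mathbf{H}_2^*\mathbf{H}_2 \end{bmatrix} \mathbf{x} + \underbrace{\bar{\mathbf{H}}^\dagger \hat{\mathbf{z}}}_{\bar{\mathbf{z}}}. \tag{14}$$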

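It can be shown that the noise elements of vector $\bar{\mathbf{z}}$ are correlated with covariance matrix $\bar{\mathbf{H}}^\dagger\bar{\mathbf{H}}$. Thus this noise vector can be whitened by multiplying both sides of (14) with the matrix $(\bar{\mathbf{H}}^\dagger\bar{\mathbf{H}})^{-1/2}$. Let $\hat{\mathbf{H}} = \mathbf{H}_1^*\mathbf{H}_1 + \mathbf{H}_2^*\mathbf{H}_2$. After the noise whitening step, (14) is equivalent to the following equations

$$\hat{\mathbf{H}}^{-1/2}\bar{\mathbf{y}}_i = \sqrt{\frac{\rho}{8}}\, \hat{\mathbf{H}}^{1/2}\mathbf{x}_i + \bar{\mathbf{z}}_i, \quad (i = 1, 2), \tag{15}$$

where $\bar{\mathbf{y}}_1$ and $\bar{\mathbf{y}}_2$ are the two length-4 halves of $\bar{\mathbf{y}}$, and the noise vectors $\bar{\mathbf{z}}_i$ now denote the whitened noise; they are uncorrelated and have elements $\mathcal{CN}(0, 1)$.

At this point, the decoding of the 8 transmitted symbols of the code $\mathbf{D}$ can be readily decoupled into 2 groups. However, since the code $\mathbf{D}$ is a 4Gp-STBC, we can further decompose them into 4 groups in the following.

Denote the $2 \times 2$ (real) discrete Fourier transform (DFT) matrix by $\mathbf{F}_2 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$. The block-circulant matrices $\mathbf{H}_1$ and $\mathbf{H}_2$ can be diagonalized by a (real) unitary matrix $\boldsymbol{\Theta} = \mathbf{F}_2 \otimes \mathbf{F}_2$ [13, Theorem 5.8.2, p. 185]. Note that $\boldsymbol{\Theta}^\dagger = \boldsymbol{\Theta}^T$; therefore, $\mathbf{H}_1 = \boldsymbol{\Theta}^T\boldsymbol{\Lambda}_1\boldsymbol{\Theta}$ and $\mathbf{H}_2 = \boldsymbol{\Theta}^T\boldsymbol{\Lambda}_2\boldsymbol{\Theta}$, where $\boldsymbol{\Lambda}_1$ and $\boldsymbol{\Lambda}_2$ are diagonal matrices, with eigenvalues of $\mathbf{H}_1$ and $\mathbf{H}_2$ in the main diagonal, respectively. Thus, $\hat{\mathbf{H}} = \boldsymbol{\Theta}^T(\boldsymbol{\Lambda}_1^\dagger\boldsymbol{\Lambda}_1 + \boldsymbol{\Lambda}_2^\dagger\boldsymbol{\Lambda}_2)\boldsymbol{\Theta}$, and also $\hat{\mathbf{H}}^{1/2} = \boldsymbol{\Theta}^T(\boldsymbol{\Lambda}_1^\dagger\boldsymbol{\Lambda}_1 + \boldsymbol{\Lambda}_2^\dagger\boldsymbol{\Lambda}_2)^{1/2}\boldsymbol{\Theta}$. Since $\hat{\mathbf{H}}^{1/2}$ is a real matrix, (15) becomes

$$\hat{\mathbf{H}}^{-1/2}\Re(\bar{\mathbf{y}}_i) = \sqrt{\rho/8}\, \hat{\mathbf{H}}^{1/2}\Re(\mathbf{x}_i) + \Re(\bar{\mathbf{z}}_i), \quad i = 1, 2, \tag{16a}$$

$$\hat{\mathbf{H}}^{-1/2}\Im(\bar{\mathbf{y}}_i) = \sqrt{\rho/8}\, \hat{\mathbf{H}}^{1/2}\Im(\mathbf{x}_i) + \Im(\bar{\mathbf{z}}_i), \quad i = 1, 2. \tag{16b}$$

Note that $\Re(\mathbf{x}_1) = [a_1, a_2, b_1, b_2]^T$, i.e. $\Re(\mathbf{x}_1)$ is only dependent on the complex symbols $s_1$ and $s_2$. Similarly, $\Im(\mathbf{x}_1)$, $\Re(\mathbf{x}_2)$, and $\Im(\mathbf{x}_2)$ depend on $(s_5, s_6)$, $(s_3, s_4)$, and $(s_7, s_8)$, respectively.

Eq. (16) shows that the decoding of 8 transmitted symbols of STBC $\mathbf{D}$ is separated into the decoding of 4 groups, each with two symbols (thus the search space size has been reduced from $q^8$ to $q^2$, where $q$ is the transmit constellation size). A sphere decoder [11] can also be used to reduce the complexity of the ML search for each group. The matrix $\hat{\mathbf{H}}^{1/2}$ can be considered as the equivalent channel of the 4Gp-QSTBC $\mathbf{F}_8$.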

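### III-C Performance Analysis

In (16), the PEP of the four transmit symbol vectors are the same. We thus need to consider the PEP of only one of the vectors, $\Re(\mathbf{x}_1)$. For notational simplicity, the subindex of $\mathbf{x}_1$ is dropped. Additionally, we can introduce redundancy on the signal space by applying a real unitary rotation $\mathbf{R}$ to the data vector $\mathbf{d}$. Thus the transmitted vector is $\Re(\mathbf{x}) = \mathbf{R}\mathbf{d}$.

From (16a), the PEP of the pair $\mathbf{d}$ and $\bar{\mathbf{d}}$ can be expressed by the Gaussian tail function as [19]

$$P(\mathbf{d} \to \bar{\mathbf{d}} \mid \hat{\mathbf{H}}) = Q\left(\sqrt{\frac{\frac{\rho}{8}\,\|\hat{\mathbf{H}}^{1/2}\mathbf{R}\boldsymbol{\delta}\|_F^2}{4N_0}}\right) = Q\left(\sqrt{\frac{\rho\left[\boldsymbol{\delta}^T\mathbf{R}^T\boldsymbol{\Theta}^T(\boldsymbol{\Lambda}_1^\dagger\boldsymbol{\Lambda}_1 + \boldsymbol{\Lambda}_2^\dagger\boldsymbol{\Lambda}_2)\boldsymbol{\Theta}\mathbf{R}\boldsymbol{\delta}\right]}{16}}\right) \tag{17}$$

where $\boldsymbol{\delta} = \mathbf{d} - \bar{\mathbf{d}}$ and $N_0 = 1/2$ is the variance of the elements of the white noise vector $\Re(\bar{\mathbf{z}}_i)$ in (16a).

Remember that $\boldsymbol{\Lambda}_i$ ($i = 1, 2$) is a diagonal matrix with eigenvalues of $\mathbf{H}_i$ on the main diagonal. Let $\lambda_{i,j}$ ($j = 1, \ldots, 4$) be the eigenvalues of $\mathbf{H}_i$. Then $\boldsymbol{\Lambda}_i = \mathrm{diag}(\lambda_{i,1}, \lambda_{i,2}, \lambda_{i,3}, \lambda_{i,4})$. Let $\boldsymbol{\beta} = [\beta_1, \beta_2, \beta_3, \beta_4]^T = \boldsymbol{\Theta}\mathbf{R}\boldsymbol{\delta}$; we have

$$P(\mathbf{d} \to \bar{\mathbf{d}} \mid \hat{\mathbf{H}}) = Q\left(\sqrt{\frac{\rho\left(\sum_{i=1}^{2}\sum_{j=1}^{4}\beta_j^2|\lambda_{i,j}|^2\right)}{16}}\right). \tag{18}$$

To derive a closed form of (18), we need to evaluate the distribution of $\lambda_{i,j}$. The eigenvectors of $\mathbf{H}_1$ are the columns of the matrix $\boldsymbol{\Theta}^T$. Thus, the eigenvalues of $\mathbf{H}_1$ are: $[\lambda_{1,1}, \lambda_{1,2}, \lambda_{1,3}, \lambda_{1,4}]^T = 2\boldsymbol{\Theta}[h_1, h_2, h_3, h_4]^T$. Since $h_k \sim \mathcal{CN}(0, 1)$ for $k = 1, \ldots, 8$, we have $\lambda_{1,j} \sim \mathcal{CN}(0, 4)$, and likewise $\lambda_{2,j} \sim \mathcal{CN}(0, 4)$.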
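
The eigenvalue claim can be illustrated with a short check. The sketch assumes that the diagonalizing matrix is the Kronecker product of two $2 \times 2$ real DFT matrices, i.e. one half of the $4 \times 4$ Sylvester Hadamard matrix (its definition is not fully legible in this excerpt, so this is an assumption), and that the channel block has the XOR-indexed structure of (12). The channel values are arbitrary.

```python
# Assumed Theta = F2 (x) F2: one half of the 4x4 Sylvester Hadamard matrix.
THETA = [[0.5 * s for s in row] for row in
         [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]]


def xor_block(v):
    """4x4 matrix with entry (r, c) = v[r XOR c], the structure of (12)."""
    return [[v[r ^ c] for c in range(4)] for r in range(4)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


h1 = [0.3+0.1j, -0.7+0.4j, 0.2-0.9j, 1.1+0.5j]     # arbitrary channel gains
H1 = xor_block(h1)

Lam = matmul(matmul(THETA, H1), transpose(THETA))   # Theta H1 Theta^T
eig = [2 * sum(t * v for t, v in zip(row, h1)) for row in THETA]   # 2 * Theta h1

is_diag = all(abs(Lam[r][c]) < 1e-12
              for r in range(4) for c in range(4) if r != c)
matches = all(abs(Lam[j][j] - eig[j]) < 1e-12 for j in range(4))
print(is_diag, matches)
```

Each eigenvalue is a sum of four independent unit-variance gains with signs $\pm 1$, which is where the variance of 4 comes from.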

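We now use Craig's formula [20] to derive the conditional PEP in (18).

$$P(\mathbf{d} \to \bar{\mathbf{d}} \mid \hat{\mathbf{H}}) = Q\left(\sqrt{\frac{\rho\left(\sum_{i=1}^{2}\sum_{j=1}^{4}\beta_j^2|\lambda_{i,j}|^2\right)}{16}}\right) = \frac{1}{\pi}\int_0^{\pi/2} \exp\left(-\frac{\rho\left(\sum_{i=1}^{2}\sum_{j=1}^{4}\beta_j^2|\lambda_{i,j}|^2\right)}{32\sin^2\alpha}\right) d\alpha. \tag{19}$$

Applying a method based on the moment generating function [19], we obtain the unconditional PEP as:

$$P(\mathbf{d} \to \bar{\mathbf{d}}) = \frac{1}{\pi}\int_0^{\pi/2}\left[\prod_{i=1}^{4}\left(1 + \frac{\rho\beta_i^2}{8\sin^2\alpha}\right)\right]^{-2} d\alpha. \tag{20}$$

If $\beta_i \neq 0$ for all $i = 1, \ldots, 4$, then at high SNR, the approximation of the exact PEP in (20) is

$$P(\mathbf{d} \to \bar{\mathbf{d}}) \approx \left(\frac{2^{24}\rho^{-8}}{\pi}\int_0^{\pi/2}(\sin\alpha)^{16}\, d\alpha\right)\prod_{i=1}^{4}|\beta_i|^{-4} = \frac{2^7\, 16!\, \rho^{-8}}{8!\, 8!}\prod_{i=1}^{4}|\beta_i|^{-4}. \tag{21}$$

The exponent of SNR in (21) is $-8$. This indicates that the maximum diversity order of 4Gp-QSTBC is 8 and it is achievable if the product distance $\prod_{i=1}^{4}|\beta_i|$ (see [21] and references therein) is nonzero for all possible data vectors. Furthermore, at high SNR, the asymptotic PEP becomes very tight to the exact PEP. Recall that $\boldsymbol{\beta} = \boldsymbol{\Theta}\mathbf{R}\boldsymbol{\delta}$; thus, the product matrix $\boldsymbol{\Theta}\mathbf{R}$ is the combined rotation matrix for data vector $\mathbf{d}$. Since $\boldsymbol{\Theta}$ is a constant matrix, we can optimize the matrix $\mathbf{R}$ so that the minimum product distance $\min_{\boldsymbol{\delta} \neq \mathbf{0}}\prod_{i=1}^{4}|\beta_i|$, where $\boldsymbol{\beta} = \boldsymbol{\Theta}\mathbf{R}\boldsymbol{\delta}$, is nonzero and maximized.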
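
To see how tight the high-SNR approximation is, the exact integral (20) can be evaluated numerically and compared with the closed form (21). The sketch below uses a plain midpoint rule; the function names, the number of integration steps, and the example values of $\rho$ and $\boldsymbol{\beta}$ are illustrative assumptions.

```python
import math


def exact_pep(rho, beta, steps=4000):
    """Midpoint-rule evaluation of the integral in (20)."""
    width = (math.pi / 2) / steps
    total = 0.0
    for n in range(steps):
        alpha = (n + 0.5) * width
        sin2 = math.sin(alpha) ** 2
        prod = 1.0
        for b in beta:
            prod *= 1.0 + rho * b * b / (8.0 * sin2)
        total += prod ** -2
    return total * width / math.pi


def asymptotic_pep(rho, beta):
    """Closed-form high-SNR approximation (21)."""
    coeff = 2 ** 7 * math.factorial(16) / (math.factorial(8) ** 2)
    prod = 1.0
    for b in beta:
        prod *= abs(b) ** -4
    return coeff * rho ** -8 * prod


beta = [1.0, 1.0, 1.0, 1.0]           # illustrative, all entries nonzero
for snr_db in (10, 20, 30, 40):
    rho = 10 ** (snr_db / 10)
    ratio = exact_pep(rho, beta) / asymptotic_pep(rho, beta)
    print(snr_db, ratio)               # ratio approaches 1 from below as SNR grows
```

Because each factor $1 + \rho\beta_i^2/(8\sin^2\alpha)$ exceeds $\rho\beta_i^2/(8\sin^2\alpha)$, the approximation (21) upper-bounds the exact PEP (20), and the gap closes as $\rho$ increases.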

If the complex signals are drawn from QAM, the (real) elements of $\mathbf{d}$ are in the set $\{\pm 1, \pm 3, \pm 5, \ldots\}$. The best known rotations for QAM in terms of maximizing the minimum product distance are provided in [21, 22]. Denoting the rotation matrix in [21, 22] by $\mathbf{R}_{\mathrm{BOV}}$, the signal rotation for our 4Gp-QSTBC is given by

$$\mathbf{R} = \boldsymbol{\Theta}\mathbf{R}_{\mathrm{BOV}}. \tag{22}$$

Simulations show that the above vector signal rotation performs better than the symbol-wise rotation proposed in [18] (details omitted for brevity). We have presented important properties of 4Gp-QSTBC. In the next section, we will investigate 4Gp-SAST codes.

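## IV Four-Group Decodable STBC Derived from SAST Codes

### IV-A Encoding

The SAST code matrix is constructed for $M = 2\bar{M}$ Tx antennas using circulant blocks. Two length-$\bar{M}$ data vectors $\mathbf{s}_1 = [s_1, s_2, \ldots, s_{\bar{M}}]^T$ and $\mathbf{s}_2 = [s_{\bar{M}+1}, s_{\bar{M}+2}, \ldots, s_M]^T$ are used to generate two $\bar{M}$-by-$\bar{M}$ circulant matrices [13]. Note that the first row of the circulant matrix $\mathcal{C}(\mathbf{x})$ copies the row vector $\mathbf{x}$; the $i$th row is obtained by circularly shifting the vector $\mathbf{x}$ to the right $(i - 1)$ times. The SAST code matrix is constructed as

$$\mathbf{S} = \begin{bmatrix} \mathcal{C}(\mathbf{s}_1^T) & \mathcal{C}(\mathbf{s}_2^T) \\ -\mathcal{C}^\dagger(\mathbf{s}_2^T) & \mathcal{C}^\dagger(\mathbf{s}_1^T) \end{bmatrix}. \tag{23}$$

By construction, 4Gp-SAST codes have a rate of one symbol pcu; the code matrices for an even number of Tx antennas are square; thus 4Gp-SAST codes are delay-optimal for an even number of Tx antennas.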
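
A minimal encoding sketch for the block structure in (23) follows, here for 6 Tx antennas (two data vectors of length 3). The function names and symbol values are hypothetical, and no signal rotation is applied. The final check confirms that the off-diagonal block of $\mathbf{S}^\dagger\mathbf{S}$ vanishes, which is what lets the two data vectors be decoupled.

```python
def circulant(x):
    """Right circulant: row i is x cyclically shifted i places to the right."""
    n = len(x)
    return [[x[(k - i) % n] for k in range(n)] for i in range(n)]


def herm(A):
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def sast_encode(s1, s2):
    """Code matrix with blocks [[C(s1), C(s2)], [-C(s2)^H, C(s1)^H]]."""
    C1, C2 = circulant(s1), circulant(s2)
    C1h, C2h = herm(C1), herm(C2)
    top = [r1 + r2 for r1, r2 in zip(C1, C2)]
    bottom = [[-z for z in r2] + r1 for r1, r2 in zip(C1h, C2h)]
    return top + bottom


# Arbitrary complex symbols (illustrative values).
s1 = [1+1j, -1+3j, 3-1j]
s2 = [-3+1j, 1-1j, -1-3j]
S = sast_encode(s1, s2)          # 6 x 6 matrix: 6 symbols in 6 time slots

n = len(s1)
G = matmul(herm(S), S)
off_diag_zero = all(abs(G[r][c]) < 1e-9
                    for r in range(n) for c in range(n, 2 * n))
print(len(S), len(S[0]), off_diag_zero)
```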

### IV-B Decoder of 4Gp-SAST codes

Similar to 4Gp-QSTBC, the decoding of 4Gp-SAST codes requires two steps. First, the two data vectors $\mathbf{s}_1$ and $\mathbf{s}_2$ are decoupled [10]; then, the real and imaginary parts of vectors $\mathbf{s}_1$ and $\mathbf{s}_2$ are separated. We provide the detailed decoder with only one Rx antenna, as the generalization for multiple Rx antennas can be easily done.

We introduce another type of circulant matrix called left circulant, denoted by $\mathcal{C}_L(\mathbf{x})$, where the $i$th row is obtained by circularly shifting the row vector $\mathbf{x}$ to the left $(i - 1)$ times.

Let us define a permutation $\Pi$ on an arbitrary matrix $\mathbf{X}$ with $\bar{M}$ rows such that the $i$th row is permuted with the $(\bar{M} - i + 2)$th row for $i = 2, 3, \ldots, \lceil \bar{M}/2 \rceil$, where $\lceil \cdot \rceil$ is the ceiling function. One can verify that

$$\Pi(\mathcal{C}_L(\mathbf{x})) = \mathcal{C}(\mathbf{x}). \tag{24}$$
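
Identity (24) can be verified for several sizes with a few lines of code. The sketch assumes that the permutation swaps row $i$ with row $n - i + 2$ for $i = 2, \ldots, \lceil n/2 \rceil$ (1-based indices, $n$ being the number of rows); the exact indices are not legible in this excerpt, so this reading is an assumption that the check itself supports.

```python
def circulant(x):
    """Right circulant: row i is x cyclically shifted i places to the right."""
    n = len(x)
    return [[x[(k - i) % n] for k in range(n)] for i in range(n)]


def left_circulant(x):
    """Left circulant: row i is x cyclically shifted i places to the left."""
    n = len(x)
    return [[x[(k + i) % n] for k in range(n)] for i in range(n)]


def permute_rows(X):
    """Swap row i with row n - i + 2 for i = 2, ..., ceil(n / 2) (1-based)."""
    n = len(X)
    Y = [row[:] for row in X]
    for i in range(2, (n + 1) // 2 + 1):
        j = n - i + 2
        Y[i - 1], Y[j - 1] = Y[j - 1], Y[i - 1]
    return Y


# Check the identity for vector lengths 2 to 8 using distinct entries.
checks = [permute_rows(left_circulant(list(range(n)))) == circulant(list(range(n)))
          for n in range(2, 9)]
print(all(checks))
```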

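Let $\mathbf{y}_1 = [y_1, \ldots, y_{\bar{M}}]^T$, $\mathbf{y}_2 = [y_{\bar{M}+1}, \ldots, y_M]^T$, $\mathbf{y} = [\mathbf{y}_1^T\ \mathbf{y}_2^T]^T$, $\mathbf{h}_1 = [h_1, \ldots, h_{\bar{M}}]^T$, $\mathbf{h}_2 = [h_{\bar{M}+1}, \ldots, h_M]^T$, $\mathbf{h} = [\mathbf{h}_1^T\ \mathbf{h}_2^T]^T$, $\mathbf{z}_1 = [z_1, \ldots, z_{\bar{M}}]^T$, $\mathbf{z}_2 = [z_{\bar{M}+1}, \ldots, z_M]^T$, $\mathbf{z} = [\mathbf{z}_1^T\ \mathbf{z}_2^T]^T$. We can write the Tx-Rx signal relation as

$$\begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \end{bmatrix} = \sqrt{\frac{\rho}{M}} \begin{bmatrix} \mathcal{C}(\mathbf{s}_1) & \mathcal{C}(\mathbf{s}_2) \\ -\mathcal{C}^\dagger(\mathbf{s}_2) & \mathcal{C}^\dagger(\mathbf{s}_1) \end{bmatrix} \begin{bmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \end{bmatrix} + \begin{bmatrix} \mathbf{z}_1 \\ \mathbf{z}_2 \end{bmatrix}. \tag{25}$$

An equivalent form of (25) is

$$\begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2^* \end{bmatrix} = \sqrt{\frac{\rho}{M}} \begin{bmatrix} \mathbf{X}_1 & \mathbf{X}_2 \\ \mathbf{X}_3 & \mathbf{X}_4 \end{bmatrix} \begin{bmatrix} \mathbf{s}_1 \\ \mathbf{s}_2 \end{bmatrix} + \begin{bmatrix} \mathbf{z}_1 \\ \mathbf{z}_2^* \end{bmatrix} \tag{26}$$

where $\mathbf{X}_1 = \mathcal{C}_L(\mathbf{h}_1^T)$, $\mathbf{X}_2 = \mathcal{C}_L(\mathbf{h}_2^T)$, $\mathbf{X}_3 = \mathcal{C}^\dagger(\mathbf{h}_2^T)$, $\mathbf{X}_4 = -\mathcal{C}^\dagger(\mathbf{h}_1^T)$.

Applying the permutation $\Pi$ in (24) to the column matrix $\mathbf{y}_1$, we obtain

$$\begin{bmatrix} \bar{\mathbf{y}}_1 \\ \bar{\mathbf{y}}_2 \end{bmatrix} \triangleq \begin{bmatrix} \Pi(\mathbf{y}_1) \\ \mathbf{y}_2^* \end{bmatrix} = \sqrt{\frac{\rho}{M}} \begin{bmatrix} \Pi(\mathbf{X}_1) & \Pi(\mathbf{X}_2) \\ \mathbf{X}_3 & \mathbf{X}_4 \end{bmatrix} \begin{bmatrix} \mathbf{s}_1 \\ \mathbf{s}_2 \end{bmatrix} + \begin{bmatrix} \Pi(\mathbf{z}_1) \\ \mathbf{z}_2^* \end{bmatrix} = \sqrt{\frac{\rho}{M}} \underbrace{\begin{bmatrix} \mathbf{H}_1 & \mathbf{H}_2 \\ \mathbf{H}_2^\dagger & -\mathbf{H}_1^\dagger \end{bmatrix}}_{\mathcal{H}} \begin{bmatrix} \mathbf{s}_1 \\ \mathbf{s}_2 \end{bmatrix} + \begin{bmatrix} \bar{\mathbf{z}}_1 \\ \bar{\mathbf{z}}_2 \end{bmatrix} \tag{27}$$

where $\mathbf{H}_1 = \mathcal{C}(\mathbf{h}_1^T)$, $\mathbf{H}_2 = \mathcal{C}(\mathbf{h}_2^T)$, $\bar{\mathbf{z}}_1 = \Pi(\mathbf{z}_1)$, $\bar{\mathbf{z}}_2 = \mathbf{z}_2^*$. The elements of $\bar{\mathbf{z}}_1$ and $\bar{\mathbf{z}}_2$ are $\mathcal{CN}(0, 1)$, as are the elements of $\mathbf{z}_1$ and $\mathbf{z}_2$. We now multiply both sides of (27) with $\mathcal{H}^\dagger$. Let $\hat{\mathbf{H}} = \mathbf{H}_1^\dagger\mathbf{H}_1 + \mathbf{H}_2^\dagger\mathbf{H}_2$; we get

$$\begin{bmatrix} \hat{\mathbf{y}}_1 \\ \hat{\mathbf{y}}_2 \end{bmatrix} = \mathcal{H}^\dagger \begin{bmatrix} \bar{\mathbf{y}}_1 \\ \bar{\mathbf{y}}_2 \end{bmatrix} = \sqrt{\frac{\rho}{M}} \begin{bmatrix} \hat{\mathbf{H}} & \mathbf{0}_{\bar{M}} \\ \mathbf{0}_{\bar{M}} & \hat{\mathbf{H}} \end{bmatrix} \begin{bmatrix} \mathbf{s}_1 \\ \mathbf{s}_2 \end{bmatrix} + \mathcal{H}^\dagger \begin{bmatrix} \bar{\mathbf{z}}_1 \\ \bar{\mathbf{z}}_2 \end{bmatrix} = \sqrt{\frac{\rho}{M}} \begin{bmatrix} \hat{\mathbf{H}} & \mathbf{0}_{\bar{M}} \\ \mathbf{0}_{\bar{M}} & \hat{\mathbf{H}} \end{bmatrix} \begin{bmatrix} \mathbf{s}_1 \\ \mathbf{s}_2 \end{bmatrix} + \underbrace{\begin{bmatrix} \hat{\mathbf{z}}_1 \\ \hat{\mathbf{z}}_2 \end{bmatrix}}_{\hat{\mathbf{z}}}. \tag{28}$$

The covariance matrix of the additive noise vector $\hat{\mathbf{z}}$ is $\mathcal{H}^\dagger\mathcal{H} = \begin{bmatrix} \hat{\mathbf{H}} & \mathbf{0}_{\bar{M}} \\ \mathbf{0}_{\bar{M}} & \hat{\mathbf{H}} \end{bmatrix}$. Therefore, the noise vectors $\hat{\mathbf{z}}_1$ and $\hat{\mathbf{z}}_2$ are uncorrelated and have the same covariance matrix $\hat{\mathbf{H}}$. Thus $\mathbf{s}_1$ and $\mathbf{s}_2$ can be decoded separately using $\hat{\mathbf{y}}_1$ and $\hat{\mathbf{y}}_2$, respectively. The noise vectors $\hat{\mathbf{z}}_1$ and $\hat{\mathbf{z}}_2$ can be whitened by the same whitening matrix $\hat{\mathbf{H}}^{-1/2}$. The equivalent equations for Tx-Rx signals are

$$\hat{\mathbf{H}}^{-1/2}\hat{\mathbf{y}}_i = \sqrt{\rho/M}\, \hat{\mathbf{H}}^{1/2}\mathbf{s}_i + \hat{\mathbf{H}}^{-1/2}\hat{\mathbf{z}}_i, \quad i = 1, 2. \tag{29}$$

At this point, the decoding of SAST codes becomes the detection of 2 groups of complex symbols $\mathbf{s}_1$ and $\mathbf{s}_2$; this is similar to the detection of 4Gp-QSTBC in (15). Our next step is to separate the real and imaginary parts of vectors $\mathbf{s}_i$ to obtain 4 groups of symbols for data detection.

Recall that $\hat{\mathbf{H}} = \mathbf{H}_1^\dagger\mathbf{H}_1 + \mathbf{H}_2^\dagger\mathbf{H}_2$, and both $\mathbf{H}_1$ and $\mathbf{H}_2$ are circulant. Hence, $\hat{\mathbf{H}}$ is also circulant [13]. Let $\lambda_1, \lambda_2, \ldots, \lambda_{\bar{M}}$ be the eigenvalues of $\hat{\mathbf{H}}$ and $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \ldots, \lambda_{\bar{M}})$. We can diagonalize $\hat{\mathbf{H}}$ by the DFT matrix $\mathbf{F}$ as $\hat{\mathbf{H}} = \mathbf{F}^\dagger\boldsymbol{\Lambda}\mathbf{F}$. Thus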