# Grassmannian Predictive Coding for Limited Feedback in Multiple Antenna Wireless Systems

Takao Inoue and Robert W. Heath, Jr.

This material is based in part upon work supported by the National Science Foundation under grant CCF-830615. This work has appeared in part in the 2011 IEEE Int. Conf. on Acoustics, Speech and Signal Process. Takao Inoue is with National Instruments, 11500 N. Mopac Expwy, Austin, TX 78759 USA. Email: takao@ieee.org. Robert W. Heath, Jr. is with The University of Texas at Austin, Department of Electrical and Computer Engineering, Wireless Networking and Communication Group, 1 University Station C0803, Austin, TX, 78712-0240 USA. Email: rheath@ece.utexas.edu.

###### Abstract

Limited feedback is a paradigm for the feedback of channel state information in wireless systems. In multiple antenna wireless systems, limited feedback usually entails quantizing a source that lives on the Grassmann manifold. Most work on limited feedback beamforming considered single-shot quantization. In wireless systems, however, the channel is temporally correlated, which can be used to reduce feedback requirements. Unfortunately, conventional predictive quantization does not incorporate the non-Euclidean structure of the Grassmann manifold. In this paper, we propose a Grassmannian predictive coding algorithm where the differential geometric structure of the Grassmann manifold is used to formulate a predictive vector quantization encoder and decoder. We analyze the quantization error and derive bounds on the distortion attained by the proposed algorithm. We apply the algorithm to a multiuser multiple-input multiple-output wireless system and show that it improves the achievable sum rate as the temporal correlation of the channel increases.

Index Terms: Prediction methods, correlation, feedback communication, MIMO systems, quantization, vector quantization.

## I Introduction

Multiple antenna wireless communication systems can improve throughput and reliability when channel state information (CSI) is known at the transmitter. Limited feedback is a flexible approach for providing quantized channel state information from the receiver to the transmitter. Most prior work on limited feedback uses one-shot feedback, which makes an instantaneous channel measurement and sends back the quantized CSI without memory. In a mobile environment, however, the channel exhibits coherence over time that may be exploited to improve the resolution of the quantized CSI at the transmitter.

Predictive vector quantization (PVQ) is a class of memory based coding techniques used in applications such as speech, image, and video processing [1, 2, 3, 4]. In PVQ, the error signal between the current observed vector and the vector predicted from past observations is quantized. When the observed data to be encoded are correlated, usually in time or space, quantizing the error signal leads to lower distortion compared with memoryless vector quantization [3]. The effectiveness of PVQ rests on the correlation exhibited by the data, the prediction function, and the quantization technique employed. Due to temporal correlation in the propagation channel, it is natural to consider a predictive coding approach for encoding CSI in temporally correlated channels. Classical PVQ has been applied to signals in linear vector spaces, where the usual difference, addition, and prediction are well understood. Unfortunately, in multiple antenna limited feedback beamforming in wireless communication, the CSI to be encoded often lives on the Grassmann manifold. The Grassmann manifold, denoted $\mathcal{G}_{n,p}$, is the set of $p$-dimensional subspaces of $n$-dimensional Euclidean space. Because it is a nonlinear manifold, extending classical PVQ is challenging since the usual linear operations, not to mention important functions like prediction, are not well defined.

Motivated by applications in multiple-input multiple-output (MIMO) wireless communication, there has been research in analyzing [5], quantizing [6, 7, 8], and coding [9, 10, 11] on the Grassmann manifold, driven in part by applications to commercial wireless systems [12, 13]. Prior work exists on designing suitable memoryless quantization codebooks such as Grassmannian line packing [14], vector quantization [15], Grassmannian frames [16], and Kerdock codebooks [17] (see also the references in [18]). Several techniques have been proposed to exploit the temporal correlation of the propagation channel [19, 20, 21, 22, 23, 24, 25, 26, 27]. In [19, 20], modeling the feedback state transitions allows the net feedback rate to be reduced. The resolution, however, is fixed by the codebook size. To reduce the quantization error, an adaptive codebook approach was proposed that can adapt to a given channel distribution [21]. Additional feedback overhead to retrain or synchronize the pre-computed codebooks may be needed when the channel distribution changes. Alternatively, a hierarchical codebook strategy uses two codebooks, coarse and fine, for layered feedback in temporally correlated channels [22, 23]. A codeword describing the coarse encoding region is updated infrequently and a finer local codebook is used for frequent feedback. A more flexible approach is a progressive refinement strategy in which rotation and scaling are applied to a structured codebook to provide high resolution feedback [24, 25]. An approach related to our paper is the complex Householder transform based PVQ-like technique for correlated normalized channel vectors in multiple-input single-output communication systems [26]. The current vector channel is decomposed into the previous vector channel and a weighted sum of orthogonal subspaces to represent the temporal variation. While the algorithm is presented in the form of PVQ, the actual operation is successive decomposition and projection using the complex Householder transform with unit delay, which was shown to be optimal for the specific application. A differential feedback approach using a rotation codebook has been proposed for spatial multiplexing systems [27]. It requires long term correlation statistics to design a suitable codebook and exploits the structure of the Riemannian manifold. Unfortunately, the codebooks are specific to the given long term statistics and may become outdated. The Grassmannian predictive coding technique proposed in this paper was presented in part in [28], where we proposed the Grassmannian predictive coding algorithm applied to a multiuser MIMO system. We did not, however, provide the details of the derivation, nor did we consider an efficient codebook representation or a distortion analysis.

In this paper, we propose a predictive coding algorithm for correlated data on the Grassmann manifold, which we call the Grassmannian predictive coding (GPC) algorithm. The GPC algorithm is derived using the intrinsic geometry of the manifold and corresponding mathematical operations that respect the curved manifold structure. The main contributions of this paper are as follows.

• Grassmannian predictive coding algorithm: We propose a framework for predictive coding on the Grassmann manifold. The key idea of our approach is to use the tangent vector to establish the notion of a difference between points on the manifold. The proposed prediction function uses parallel transport as a one step prediction. The prediction step uses only the immediate past difference; formulating higher order prediction functions remains for future work. The concepts of tangent vector and parallel transport have been used in [29] for optimization problems, but have not been exploited to develop a predictive coding concept.

• Efficient codebook structure: We propose a tangent space codebook design using the Lloyd algorithm. The codebook lives in the tangent space of the Grassmann manifold, with magnitude dependent on the correlation exhibited by the channel and on the prediction function. We also propose an efficient codebook storage strategy that exploits the direction and magnitude decomposition of the tangent space vector.

• Distortion bounds: Based on a geometric interpretation of our GPC algorithm, a simple model of the quantization region is obtained. Using metric volume computations on the Grassmann manifold [7], lower and upper bounds on the quantization error are derived. We compare the obtained bounds with distortion obtained in simulations. Furthermore, we show that the distortion for the proposed GPC algorithm is lower than the lower bound of memoryless quantizer distortion for a given codebook size.

• Application to limited feedback multiuser MIMO systems: We apply the GPC algorithm to limited feedback zero-forcing multiuser MIMO systems with multiple transmit antennas and a single receive antenna at each mobile terminal [30]. We show that the proposed GPC algorithm provides a substantial sum rate improvement over the memoryless random codebook technique with the same feedback rate [30]. The sum rate improvement, however, depends on the channel correlation. When the channel is highly correlated, the proposed GPC algorithm is shown to provide sum rates close to those of a system with perfect CSI at the transmitter, i.e., infinite feedback.

Notation: We use lower case bold letters, e.g., , to denote vectors and upper case bold letters, e.g., , to denote matrices. A 2-norm is denoted by and a normalized vector is denoted by . The identity matrix is denoted by . The space of integers and complex numbers are denoted by and , respectively, with an appropriate superscript to denote the dimension of the respective spaces. We use , , and to denote the transposition, Hermitian transpose, and pseudo inverse, respectively. The -th column entry of a matrix is denoted by . The expectation is denoted .

## II System Model

In this paper we apply the GPC algorithm to limited feedback multiuser MIMO communication. It can also be applied to single user MIMO and to multi-cell MIMO systems. Multiuser MIMO is a challenging application of limited feedback because it requires high resolution quantization [30] and is known to be sensitive to channel variations [31]. We consider a multiuser limited feedback system with $N_t$ transmit antennas at the base station and $U$ mobile users, each equipped with a single receive antenna. To isolate the impact of using predictive coding for limited feedback, we assume that the $U$ users are scheduled a priori from a possibly large pool of users; we do not consider scheduling or the effects of multiuser diversity in this paper. Let $s_u[k]$, $\mathbf{v}_u[k]$, and $\mathbf{h}_u[k]$ be the complex transmit symbol, unit norm beamforming vector, and channel vector for the $u$-th user at time index $k$, respectively. We assume that the transmit vector satisfies the total transmit power constraint $P$. Then, the input-output relationship for the $u$-th user may be written as

$$y_u[k] = \mathbf{h}_u^*[k]\mathbf{v}_u[k]s_u[k] + \mathbf{h}_u^*[k]\sum_{n=1,\,n\neq u}^{U}\mathbf{v}_n[k]s_n[k] + n_u[k] \tag{1}$$

where $n_u[k]$ is independent identically distributed (i.i.d.) zero mean complex Gaussian noise with unit variance at user $u$. The first term in (1) is the desired signal for the $u$-th user, while the second term is the interference signal. The signal to interference plus noise ratio (SINR) for the $u$-th user can be written as

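$$\mathrm{SINR}_u = \frac{\frac{P}{U}\,|\mathbf{h}_u^*\mathbf{v}_u|^2}{1+\sum_{n\neq u}\frac{P}{N_t}\,|\mathbf{h}_u^*\mathbf{v}_n|^2}. \tag{2}$$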

If the transmit signal is assumed to be Gaussian, the achievable rate for user $u$ is given by

$$R_u = \log_2(1+\mathrm{SINR}_u) \tag{3}$$

and the sum rate is $R = \sum_{u=1}^{U} R_u$.
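The SINR and sum rate computation in (2) and (3) can be sketched numerically as follows. This is a minimal illustration, not the authors' simulation code: the function names and the single per-user power scaling `p` are assumptions introduced here, and the two-user channels are toy values.

```python
import math

def inner(x, y):
    """Hermitian inner product x^* y."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def sinr(h, v, u, p):
    """SINR of user u with the structure of (2); p is an assumed per-user power scaling."""
    signal = p * abs(inner(h[u], v[u])) ** 2
    interference = sum(p * abs(inner(h[u], v[n])) ** 2
                       for n in range(len(v)) if n != u)
    return signal / (1.0 + interference)

def sum_rate(h, v, p):
    """Sum of the per-user rates log2(1 + SINR_u) from (3)."""
    return sum(math.log2(1.0 + sinr(h, v, u, p)) for u in range(len(h)))

# Toy example: two users with orthogonal channels.
h = [[1.0, 0.0], [0.0, 1.0]]
v_zf = [[1.0, 0.0], [0.0, 1.0]]    # beams orthogonal to the other user's channel
v_bad = [[1.0, 0.0], [1.0, 0.0]]   # user 1's beam points at user 0's channel

rate_zf = sum_rate(h, v_zf, p=3.0)
rate_bad = sum_rate(h, v_bad, p=3.0)
print(rate_zf, rate_bad)
```

With the orthogonal beams the interference terms vanish, while the mismatched beam both wastes user 1's signal power and raises the interference seen by user 0.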

The SINR expression (2) shows that the amount of interference depends on the design of the beamforming vectors. Zero forcing uses beamforming vectors that are orthogonal to the other users' channel vectors, i.e., $\mathbf{h}_u^*\mathbf{v}_n = 0$ for $n \neq u$, to null the inter-user interference [32]. With perfect CSI, the interference can be completely eliminated by choosing the unit norm beamforming vectors as the normalized columns of the pseudo inverse of the composite channel matrix formed from the users' channel vectors. Zero forcing creates interference free parallel channels, providing a nearly linear throughput increase as a function of the number of users but with some power loss due to normalization [33].

In limited feedback multiuser MIMO systems, quantized CSI is fed back to the transmitter from each user [30, 31]. Assuming that a perfect channel estimate is obtained, we consider the quantization of the channel direction and assume that the scalar channel gain is known perfectly at the transmitter [31]. We assume that the channel gain depends on longer term statistics that vary much more slowly than the channel direction. Since the channel gain is also a real valued quantity that is easier to feed back, we consider the effects of channel shape quantization only [31]. Writing the channel direction as $\mathbf{g}_u = \mathbf{h}_u/\|\mathbf{h}_u\|$, the SINR can be rewritten as

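$$\mathrm{SINR}_u = \frac{\frac{P}{N_t}\,\|\mathbf{h}_u\|^2\,|\mathbf{g}_u^*\mathbf{v}_u|^2}{1+\sum_{n\neq u}\frac{P}{N_t}\,\|\mathbf{h}_u\|^2\,|\mathbf{g}_u^*\mathbf{v}_n|^2}. \tag{4}$$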

We make two observations from (4). First, if the channel vector $\mathbf{h}_u$ has i.i.d. entries distributed according to $\mathcal{CN}(0,1)$, then $\mathbf{g}_u$ is isotropically distributed on the $N_t$-dimensional hyper-sphere. Second, due to the absolute value around $\mathbf{g}_u^*\mathbf{v}_u$, the SINR is independent of arbitrary unitary rotations of the channel direction. That is, $|\mathbf{g}_u^*\mathbf{v}_u| = |(e^{j\phi}\mathbf{g}_u)^*\mathbf{v}_u|$ for $\phi \in [0, 2\pi)$. Therefore, we may identify the space of channel shapes as the Grassmann manifold. Thus, the problem of transmit beamformer design is to feed back the channel shapes on the Grassmann manifold from each user and to use the collected channel shape information at the transmitter to design the beamforming vectors by zero forcing.

In conventional codebook based limited feedback multiuser MIMO systems, each user has a normalized channel vector codebook that is shared with the transmitter [30, 31]. The transmitter maintains a table of the codebook for every user. Each user selects the codeword with minimum chordal distance from the normalized channel vector estimate, and the index of the selected codeword is fed back to the transmitter. The transmitter collects the decoded channel vectors of all users to form the composite channel matrix, from which the beamforming vectors are computed by zero forcing. Using a random codebook, it was shown in [30] that the sum rate performance becomes interference limited as the signal to noise ratio (SNR) increases and that the codebook size needs to be increased linearly as a function of SNR, in dB, to maintain the multiplexing gain. Herein lies the practical limitation of the conventional codebook approach: the codebook size that approaches the achievable sum rate becomes impractical even for moderate SNR. The proposed GPC algorithm overcomes this problem.

## III Grassmann Manifold: Preliminaries

The geometric and linear algebraic properties of the Grassmann manifold are fundamental in the derivation of our proposed algorithm. In this section we review key definitions, properties, and mathematical tools pertaining to designing algorithms for the Grassmann manifold. Then we propose a predictor on the Grassmann manifold built from the tangent vector, the mapping from the tangent space onto the manifold, and parallel transport.

Let be the unitary group formed by unitary matrices. For , the Grassmann manifold, , is the set of subspaces spanned by the columns of the quotient group . It may also be identified as the quotient space of the unitary group, . A point may be considered as an equivalence class, i.e., . For notational brevity, we denote to mean the equivalence class of matrices whose columns span the same -dimensional subspace. For numerical computation, we understand to be one representative of the equivalence class. The Grassmann manifold is a smooth topological manifold with a locally Euclidean property and smooth tangent space structure [34], both of which will be essential in the derivation of the proposed algorithm. In this paper, we consider the Grassmann manifold ; the general case of is a topic of future work.

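Let the inner product of $\mathbf{x}, \mathbf{y} \in \mathcal{G}_{n,1}$ be denoted by $\rho = \mathbf{x}^*\mathbf{y}$. Let $\theta$ be the subspace angle between $\mathbf{x}$ and $\mathbf{y}$ [35]. The chordal distance metric for $\mathbf{x}, \mathbf{y} \in \mathcal{G}_{n,1}$ is given by [29, 36]

$$\begin{aligned} d(\mathbf{x},\mathbf{y}) &= \sqrt{1-|\rho|^2} \\ &= |\sin\theta|. \end{aligned} \tag{5}$$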

For notational brevity, we use $d$ without the arguments when there is no confusion. Unlike the arc length, given by $\theta$, the chordal distance is differentiable everywhere and provides a close approximation of the arc length when the points are close [37].
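As a small illustration of (5), the following sketch computes the chordal distance between two unit-norm vectors and checks that it is unchanged when one of them is multiplied by a unit-modulus phase, which is the invariance that lets channel directions be treated as points on the Grassmann manifold. The helper names are chosen here for illustration.

```python
import cmath
import math

def inner(x, y):
    """Hermitian inner product rho = x^* y."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def chordal_distance(x, y):
    """Chordal distance d = sqrt(1 - |rho|^2) between unit-norm vectors, as in (5)."""
    rho = inner(x, y)
    return math.sqrt(max(0.0, 1.0 - abs(rho) ** 2))

theta = 0.3
x = [1.0, 0.0]
y = [math.cos(theta), math.sin(theta)]
y_rot = [cmath.exp(1j * 1.1) * v for v in y]   # same subspace, different phase

d = chordal_distance(x, y)          # equals |sin(theta)|
d_rot = chordal_distance(x, y_rot)  # unchanged by the phase rotation
print(d, d_rot, math.sin(theta))
```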

Using the chordal distance metric, we define the correlation of two sequences by which can be interpreted as the mean chordal distance between two sequences on the Grassmann manifold.

Based on the smooth manifold structure of the Grassmann manifold, it is possible to relate two points by considering the tangent vector emanating from $\mathbf{x}[k]$ to $\mathbf{x}[k+1]$. Fig. 1 illustrates the concept. The tangent has been used successfully in the development of Newton and conjugate gradient algorithms with orthogonality constraints [38, 29, 39, 40]. We utilize the tangent relationship for its computational benefits and the geometric insight it gives into the problem.

###### Lemma 1 (Tangent)

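If $\mathbf{x}[k], \mathbf{x}[k+1] \in \mathcal{G}_{n,1}$ with $\rho = \mathbf{x}^*[k]\mathbf{x}[k+1]$, then the tangent vector emanating from $\mathbf{x}[k]$ to $\mathbf{x}[k+1]$ is

$$\mathbf{e} = \tan^{-1}\!\left(\frac{d}{|\rho|}\right)\frac{\mathbf{x}[k+1]/\rho-\mathbf{x}[k]}{\|\mathbf{x}[k+1]/\rho-\mathbf{x}[k]\|} \tag{6}$$

such that $\|\mathbf{e}\| = \tan^{-1}(d/|\rho|)$ is the arc length between $\mathbf{x}[k]$ and $\mathbf{x}[k+1]$ and

$$\vec{\mathbf{e}} = \frac{\mathbf{x}[k+1]/\rho-\mathbf{x}[k]}{d/|\rho|}$$

is the unit tangent direction vector.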

See Appendix A.

Lemma 1 provides a compact formula for the tangent vector relating points and on . For notational brevity, we denote . The tangent vector can be interpreted as a length preserving unwrapping of the arc between and onto the tangent space at . Furthermore, it is conveniently expressed as the product of a magnitude component and the normalized directional component. The decomposition will be exploited in codebook design for efficient storage.
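A minimal numerical sketch of the tangent vector in (6) is given below. It assumes unit-norm inputs that are not orthogonal (so that $\rho \neq 0$), and the helper names are illustrative rather than taken from the paper. The example checks the two properties stated in Lemma 1: the tangent is orthogonal to the base point and its norm equals the arc length.

```python
import cmath
import math

def inner(x, y):
    """Hermitian inner product x^* y."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(sum(abs(v) ** 2 for v in x))

def tangent(x0, x1):
    """Tangent vector at x0 pointing toward x1, following (6).

    Assumes unit-norm inputs with nonzero inner product rho.
    """
    rho = inner(x0, x1)
    d = math.sqrt(max(0.0, 1.0 - abs(rho) ** 2))
    if d < 1e-12:                      # same subspace: zero tangent
        return [0j] * len(x0)
    diff = [b / rho - a for a, b in zip(x0, x1)]
    scale = math.atan2(d, abs(rho)) / norm(diff)
    return [scale * v for v in diff]

angle = 0.4
x0 = [1.0, 0.0]
# x1 is rotated by `angle` and carries an arbitrary phase factor.
x1 = [cmath.exp(0.7j) * math.cos(angle), cmath.exp(0.7j) * math.sin(angle)]

e = tangent(x0, x1)
print(norm(e), abs(inner(x0, e)))   # arc length and orthogonality residual
```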

The tangent vector describes the shortest distance path along the arc from $\mathbf{x}[k]$ to $\mathbf{x}[k+1]$, called the geodesic [29]. The geodesic can be parameterized by a single parameter using the tangent vector, as the next lemma shows.

###### Lemma 2 (Geodesic)

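If $\mathbf{x}[k], \mathbf{x}[k+1] \in \mathcal{G}_{n,1}$, and $\mathbf{e}$, $\|\mathbf{e}\|$, and $\vec{\mathbf{e}}$ are the tangent vector emanating from $\mathbf{x}[k]$ to $\mathbf{x}[k+1]$, the norm of the tangent vector, and the normalized tangent vector, respectively, then the geodesic path between $\mathbf{x}[k]$ and $\mathbf{x}[k+1]$ is

$$G(\mathbf{x}[k],\mathbf{e},t) = \mathbf{x}[k]\cos(\|\mathbf{e}\|t)+\vec{\mathbf{e}}\sin(\|\mathbf{e}\|t) \tag{7}$$

for $t \in [0,1]$ such that $G(\mathbf{x}[k],\mathbf{e},0) = \mathbf{x}[k]$ and $G(\mathbf{x}[k],\mathbf{e},1) = \mathbf{x}[k+1]$.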

See Appendix B.
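The geodesic map in (7) can be checked numerically with the sketch below, which combines it with the tangent vector of (6). The function names are illustrative. Equality at $t=1$ is understood on the Grassmann manifold, i.e., up to a unit-modulus phase, so the check uses the chordal distance rather than a component-wise comparison.

```python
import cmath
import math

def inner(x, y):
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(sum(abs(v) ** 2 for v in x))

def chordal(x, y):
    return math.sqrt(max(0.0, 1.0 - abs(inner(x, y)) ** 2))

def tangent(x0, x1):
    """Tangent vector at x0 toward x1, following (6); assumes rho != 0."""
    rho = inner(x0, x1)
    d = math.sqrt(max(0.0, 1.0 - abs(rho) ** 2))
    if d < 1e-12:
        return [0j] * len(x0)
    diff = [b / rho - a for a, b in zip(x0, x1)]
    scale = math.atan2(d, abs(rho)) / norm(diff)
    return [scale * v for v in diff]

def geodesic(x0, e, t=1.0):
    """Point on the geodesic from x0 along tangent e at step t, following (7)."""
    n = norm(e)
    if n < 1e-15:                      # zero tangent: stay at x0
        return list(x0)
    return [a * math.cos(n * t) + (v / n) * math.sin(n * t)
            for a, v in zip(x0, e)]

x0 = [1.0, 0.0, 0.0]
x1 = [cmath.exp(0.9j) * c for c in (math.cos(0.3), 0.0, math.sin(0.3))]
e = tangent(x0, x1)

end_gap = chordal(geodesic(x0, e, 1.0), x1)     # ~0: same subspace as x1
start_gap = chordal(geodesic(x0, e, 0.0), x0)   # ~0: starts at x0
mid_norm = norm(geodesic(x0, e, 0.5))           # stays on the unit sphere
print(end_gap, start_gap, mid_norm)
```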

Lemma 2 provides a convenient formula to relate points between $\mathbf{x}[k]$ and $\mathbf{x}[k+1]$ in terms of the tangent vector and the step size $t$. To introduce the notion of prediction, we use the tangent vector with respect to $\mathbf{x}[k+1]$ such that it extends the geodesic path from $\mathbf{x}[k]$ to $\mathbf{x}[k+1]$. The translation of the tangent vector along the Grassmann manifold is accomplished by parallel transport.

###### Lemma 3 (Parallel Transport)

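Let $\mathbf{x}[k], \mathbf{x}[k+1] \in \mathcal{G}_{n,1}$ and let $\mathbf{e}$ be the tangent vector emanating from $\mathbf{x}[k]$ to $\mathbf{x}[k+1]$. Then, the parallel transported tangent vector emanating from $\mathbf{x}[k+1]$ along the geodesic direction is

$$\hat{\mathbf{e}} = \tan^{-1}\!\left(\frac{d}{|\rho|}\right)\frac{\mathbf{x}[k+1]\rho^*-\mathbf{x}[k]}{d}. \tag{8}$$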

See Appendix C.

Note that the general expression in [29] involves a singular value decomposition (SVD), which is typically expensive to implement. To the best of our knowledge, a compact form without an SVD on $\mathcal{G}_{n,1}$ has not appeared in the literature before. Thus Lemma 3 provides a convenient expression for transporting the base of the tangent vector from $\mathbf{x}[k]$ to $\mathbf{x}[k+1]$. It can be interpreted as transforming the tangent vector onto another tangent space connected by the geodesic.

Using the concepts of the tangent vector, geodesic, and parallel transport, we propose a one step prediction for .

###### Definition 4 (One Step Grassmannian Prediction)

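Let $\mathbf{x}[k-1], \mathbf{x}[k] \in \mathcal{G}_{n,1}$. The one step predicted vector along the geodesic direction from $\mathbf{x}[k-1]$ to $\mathbf{x}[k]$ is

$$\tilde{\mathbf{x}}[k+1] = |\rho|\,\mathbf{x}[k]+\rho^*\mathbf{x}[k]-\mathbf{x}[k-1] \tag{9}$$

such that $\rho = \mathbf{x}^*[k-1]\mathbf{x}[k]$.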

See Appendix D for a detailed derivation. It is surprising that the predicted vector can be computed from the knowledge of $\mathbf{x}[k]$ and $\mathbf{x}[k-1]$ using linear operations and that the result remains on the Grassmann manifold. This simplification only happens when taking a full step using $t = 1$. It is also possible to consider smaller steps as well as adaptive step sizes, but we defer this to future work.
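The one step prediction in (9) is short enough to sketch directly. The code below implements the formula as printed, with $\rho$ taken as the inner product of the two most recent vectors; the function name `predict` is an illustrative choice. In the real-valued example the two inputs are separated by an angle $a$, and the prediction continues along the same great circle by another $a$. The second example only checks that the output stays unit norm for complex inputs.

```python
import cmath
import math

def inner(x, y):
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(sum(abs(v) ** 2 for v in x))

def predict(x_prev, x_cur):
    """One step prediction |rho| x[k] + rho^* x[k] - x[k-1], following (9)."""
    rho = inner(x_prev, x_cur)
    return [abs(rho) * c + rho.conjugate() * c - p
            for p, c in zip(x_prev, x_cur)]

a = 0.2
x_prev = [1.0, 0.0]
x_cur = [math.cos(a), math.sin(a)]
x_pred = predict(x_prev, x_cur)          # expected: [cos(2a), sin(2a)]

# Complex example: the prediction remains unit norm.
z_cur = [cmath.exp(0.5j) * v for v in x_cur]
z_pred = predict(x_prev, z_cur)
print(x_pred, norm(z_pred))
```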

## IV Grassmannian Predictive Coding

In this section, we describe the proposed GPC algorithm. First, a general overview of the algorithm is provided. Second, the codebook design for encoding the error tangent vector is described. Finally, strategies for initialization are considered.

### IV-A GPC Algorithm

Let $\mathbf{x}[k] \in \mathcal{G}_{n,1}$ be a correlated input sequence with time index $k$. The general operation of the proposed GPC algorithm closely follows that of the conventional predictive vector quantization technique [3]. Linear operations such as difference, quantization, addition, and prediction are replaced by equivalent operators on the Grassmann manifold using the concepts derived in Section III. The main idea of predictive coding is to quantize the error between the predicted vector $\tilde{\mathbf{x}}[k]$ and the current observed vector $\mathbf{x}[k]$, as illustrated on the left hand side of Fig. 2. Then, the quantized error is applied to the predicted vector to construct the estimate $\hat{\mathbf{x}}[k]$ of the current observed vector, as illustrated on the right hand side of Fig. 2. The current and previous estimated vectors, $\hat{\mathbf{x}}[k]$ and $\hat{\mathbf{x}}[k-1]$, are used to compute the predicted vector $\tilde{\mathbf{x}}[k+1]$ as shown in Section III and Fig. 1. Since both the encoder and decoder use estimated vectors for prediction, they both obtain the same predicted vectors. This is in contrast to quantizing $\mathbf{x}[k]$ directly in the conventional one-shot approach [18]. By exploiting memory, predictive vector quantization offers higher resolution for a given number of bits.

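Fig. 3 illustrates the proposed GPC encoder; the pseudo code is provided in Algorithm 1. At time $k$, an error tangent vector is computed from the predicted vector $\tilde{\mathbf{x}}[k]$ to the current observed vector $\mathbf{x}[k]$. Using (6), the error tangent vector emanating from $\tilde{\mathbf{x}}[k]$ to $\mathbf{x}[k]$ is computed as

$$\mathbf{e}[k] = \tan^{-1}\!\left(\frac{d}{|\rho|}\right)\frac{\mathbf{x}[k]/\rho-\tilde{\mathbf{x}}[k]}{\|\mathbf{x}[k]/\rho-\tilde{\mathbf{x}}[k]\|} \tag{10}$$

where $\rho = \tilde{\mathbf{x}}^*[k]\mathbf{x}[k]$ and $d = \sqrt{1-|\rho|^2}$.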

If $\mathcal{C} = \{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_{N_C}\}$ is the size $N_C$ codebook of error tangent vectors, the index of the quantized error tangent vector is obtained by

$$i[k] = \arg\min_{i\in\{1,2,\ldots,N_C\}} d\big(G(\tilde{\mathbf{x}}[k],\mathbf{c}_i,1),\mathbf{x}[k]\big). \tag{11}$$

The corresponding codeword is $\mathbf{c}_{i[k]}$. That is, the codeword that yields the geodesic map with the shortest distance to the observed vector is selected. For notational brevity, we refer to the operation that takes the error tangent vector and outputs the codeword index as the quantization step. The design of the codebook and its efficient representation for implementation are described in Section IV-B.

Continuing at the encoder, the estimated vector becomes

$$\hat{\mathbf{x}}[k] = G(\tilde{\mathbf{x}}[k],\mathbf{c}_{i[k]},1). \tag{12}$$

Finally, the prediction using Definition 4 is performed using the two most recent estimates

$$\tilde{\mathbf{x}}[k+1] = |\rho|\,\hat{\mathbf{x}}[k]+\rho^*\hat{\mathbf{x}}[k]-\hat{\mathbf{x}}[k-1] \tag{13}$$

where $\rho = \hat{\mathbf{x}}^*[k-1]\hat{\mathbf{x}}[k]$. For notational brevity, we refer to the map that takes the current and previous state vectors and outputs the predicted vector as the prediction operation. The predicted vector is used in the next step to compute the error tangent vector. The encoding procedure is repeated for each time $k$.

Fig. 4 illustrates the proposed GPC decoder; the pseudo code is shown in Algorithm 2. The same error tangent codebook as the encoder is assumed to be available. The received index $i[k]$ is decoded to recover the codeword $\mathbf{c}_{i[k]}$. The predicted vector $\tilde{\mathbf{x}}[k]$ is mapped to the estimated vector $\hat{\mathbf{x}}[k]$ using the codeword as in (12). As in the encoder, the prediction is performed using (13) to obtain $\tilde{\mathbf{x}}[k+1]$ for the next time period. Note that for the first iteration of the decoder, knowledge of the initial predicted vector, or equivalently of the two preceding estimated vectors, is needed. Synchronizing the initial vectors with the encoder is important because if the initial predicted vector at the decoder differs from that at the encoder, the received codeword no longer represents the correct error tangent vector. In Section IV-C, we provide an efficient strategy for initialization over a finite rate communication channel. With appropriate initialization, symmetric operation at the encoder and decoder yields the same predicted vector $\tilde{\mathbf{x}}[k]$ for each time $k$.

### IV-B Codebook Design

One strategy for PVQ codebook design is to employ an open-loop approach followed by a closed-loop approach to refine the codebook [3]. The open-loop approach performs the prediction using the prior vectors from a training data set instead of the estimates. The error tangent vector is computed using (10). Then the Lloyd iterative algorithm is used to obtain the open-loop codebook. Using the open-loop codebook, GPC is performed on the training data set to obtain a sequence of closed-loop error tangent vectors. The Lloyd iteration is performed on the closed-loop error tangent vectors to obtain the final codebook. It is difficult to show the optimality of the Lloyd iteration for the open-loop and closed-loop approaches due to the feedback structure of the GPC, but these approaches are known to provide good results in the PVQ literature [3]. Thus, in this paper, we employ the open-loop and closed-loop approach to obtain the error tangent vector codebook.

For storage of the codebook, we propose an efficient codebook representation by exploiting the product structure of the tangent space. We quantize separately the tangent vector magnitude and direction [3]. Shape-gain vector quantization is widely used, for example, in speech and video coding [41]. We use the shape-gain decomposition to provide efficient codebook storage and exploit it to analyze the rate-distortion of the proposed GPC that is otherwise very difficult. The tangent magnitude is dependent on the distance between the predicted vector and the observed vector, which in turn is dependent on the rate of change of the input vectors. The unit norm error tangent vector depends on the location at which the tangent is computed and the directional statistics of the error. If is the obtained error tangent codebook of size , the shape-gain decomposed codebooks are for the error tangent direction codebook and for the error tangent magnitude codebook. The desired codeword is reconstructed as at time . With some heuristic design, it is possible to express, for example, a size -bit codebook of vectors by a size -bit codebook of scalars representing the magnitude and a size -bit codebook of vectors representing the normalized tangent directions. Thus codebook storage reduction is possible at an expense of extra computation to reconstruct the codeword.
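The storage saving from the shape-gain decomposition can be illustrated with the sketch below. The codebook sizes and values are arbitrary toy choices, and the index convention (direction index as the high part, magnitude index as the low part) is an assumption made here for illustration; the paper's actual bit allocations are not reproduced.

```python
import math

def norm(x):
    return math.sqrt(sum(abs(v) ** 2 for v in x))

def build_product_codebook(directions, magnitudes):
    """Explicit joint codebook: every magnitude applied to every direction."""
    return [[g * v for v in s] for s in directions for g in magnitudes]

def reconstruct_codeword(directions, magnitudes, index):
    """Rebuild codeword `index` on the fly from the two small codebooks."""
    i_dir, i_mag = divmod(index, len(magnitudes))
    return [magnitudes[i_mag] * v for v in directions[i_dir]]

# Toy codebooks (hypothetical values): 4 unit-norm directions, 2 magnitudes.
directions = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
magnitudes = [0.01, 0.05]

full = build_product_codebook(directions, magnitudes)
n = len(directions[0])
stored_joint = len(full) * n                          # entries stored explicitly
stored_split = len(directions) * n + len(magnitudes)  # entries stored when decomposed
print(len(full), stored_joint, stored_split)
```

The decomposed representation stores fewer entries but needs one scalar-vector multiplication per lookup, which is the trade-off between storage and computation described above.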

### IV-C Initialization

As in PVQ, the initial states of the GPC at the encoder and the decoder need to match to obtain correct results. For example, in next generation wireless standards such as IEEE 802.16m, various feedback initialization intervals are defined [42, Sec.16.3.6]. Thus, an efficient mechanism for initialization is also important. Two approaches may be considered for initialization. One approach is to perform an initialization process in which the two initial estimated vectors are communicated from the encoder to the decoder. Since the complete description of both vectors must be communicated to the decoder, there is a system dependent communication overhead. Another approach is to use the one-shot memoryless quantization technique to initialize the two vectors. This approach is attractive because it does not add any implementation overhead to systems already using a one-shot feedback approach, e.g., 3GPP LTE. In particular, if the same codebook is used for the error tangent direction codebook and the one-shot memoryless quantization codebook, there is no codebook memory overhead, resulting in an efficient implementation. A consequence of using the memoryless quantization approach for initialization is that there may be an initial transient period in which the quantization error is larger than in steady state. As we show in Section V, this is because memoryless quantization generally results in a larger quantization error.

## V Performance Analysis of GPC

In this section, we provide a quantization error analysis under a small angle approximation. We derive upper and lower distortion bounds, and then derive a closed loop prediction gain metric for the GPC algorithm.

### V-A Small Angle Approximation

In this section we use the locally Euclidean property of the Grassmann manifold to derive an expression for the prediction error as a function of the tangent vector. If $\mathbf{y}$ is obtained by small changes to $\mathbf{x}$, we can approximate the chordal distance between $\mathbf{x}$ and $\mathbf{y}$ as

$$d(\mathbf{x},\mathbf{y}) = \sqrt{1-|\mathbf{x}^*\mathbf{y}|^2} \tag{14}$$

$$\phantom{d(\mathbf{x},\mathbf{y})} = |\sin(\theta)| \approx \|\mathbf{x}-\mathbf{y}\| \tag{15}$$

where (14) follows from the subspace angle of vectors [35] and (15) follows from the small angle approximation. Thus, for a sufficiently small perturbation around $\mathbf{x}$, the subspace distance between $\mathbf{x}$ and $\mathbf{y}$ is approximated by the usual Euclidean distance.

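We may express the current observed vector at time $k$, $\mathbf{x}[k]$, in terms of the predicted vector and the error tangent vector as

$$\begin{aligned} \mathbf{x}[k] &= G(\tilde{\mathbf{x}}[k],\mathbf{e}[k],1) \\ &\approx \tilde{\mathbf{x}}[k]+\vec{\mathbf{e}}[k]\,\|\mathbf{e}[k]\| \\ &= \tilde{\mathbf{x}}[k]+\mathbf{e}[k] \end{aligned} \tag{16}$$

using the small angle approximation. Furthermore,

$$\begin{aligned} \mathbf{x}^*[k]\mathbf{x}[k] &\approx (\tilde{\mathbf{x}}[k]+\mathbf{e}[k])^*(\tilde{\mathbf{x}}[k]+\mathbf{e}[k]) \\ &= 1+2\|\mathbf{e}[k]\|\,\Re\!\left(\vec{\mathbf{e}}^*[k]\tilde{\mathbf{x}}[k]\right)+\|\mathbf{e}[k]\|^2 \\ &\approx 1. \end{aligned} \tag{17}$$

The second term, $2\|\mathbf{e}[k]\|\,\Re(\vec{\mathbf{e}}^*[k]\tilde{\mathbf{x}}[k])$, in (17) is zero because the unit norm tangent vector is orthogonal to $\tilde{\mathbf{x}}[k]$. Similarly, if $\mathbf{c}_{i[k]}$ is the selected error tangent codeword, the estimated signal can be expanded as

$$\begin{aligned} \hat{\mathbf{x}}[k] &= G(\tilde{\mathbf{x}}[k],\mathbf{c}_{i[k]},1) \\ &\approx \tilde{\mathbf{x}}[k]+\mathbf{c}_{i[k]}. \end{aligned} \tag{18}$$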

Both (16) and (18) reveal that for a small enough change, both vectors are expressed as an additive correction to the predicted vector. Thanks to the locally Euclidean property and using the usual 2-norm for the local difference, the estimation error is

$$\|\mathbf{x}[k]-\hat{\mathbf{x}}[k]\| \approx \|\mathbf{e}[k]-\mathbf{c}_{i[k]}\|. \tag{19}$$

Therefore, the estimation error can be approximated as the normed difference between the actual tangent vector and the quantized tangent vector. Thus for small changes in the observed vector, the accuracy of tangent direction and tangent magnitude determines the accuracy of the estimate.
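The quality of the small angle approximation in (15) can be probed numerically. The sketch below compares the chordal distance with the Euclidean distance for real, phase-aligned unit vectors separated by an angle `theta`; the phase alignment is an assumption of this example, since the Euclidean distance, unlike the chordal distance, changes if one vector is multiplied by a phase factor.

```python
import math

def inner(x, y):
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def chordal(x, y):
    return math.sqrt(max(0.0, 1.0 - abs(inner(x, y)) ** 2))

def euclidean(x, y):
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(x, y)))

def compare(theta):
    """Chordal and Euclidean distance between two vectors separated by theta."""
    x = [1.0, 0.0, 0.0]
    y = [math.cos(theta), math.sin(theta), 0.0]
    return chordal(x, y), euclidean(x, y)

for theta in (0.5, 0.1, 0.01):
    d, eu = compare(theta)
    print(theta, d, eu, abs(d - eu))   # the gap shrinks quickly with theta
```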

### V-B Distortion Bounds

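The average distortion induced by a quantizer is a typical measure of performance. In what follows, we derive upper and lower bounds on the distortion for the proposed GPC algorithm. Recall that a metric ball with radius $\delta$ centered at $\mathbf{z}$ on the Grassmann manifold is defined as

$$B_\delta(\mathbf{z}) = \{\mathbf{y}\in\mathcal{G}_{n,1} : d(\mathbf{y},\mathbf{z})\le\delta\} \tag{20}$$

with $0 \le \delta \le 1$. A closed form volume formula for $B_\delta(\mathbf{z})$ is given as [43]

$$\mathrm{Vol}(B_\delta(\mathbf{z})) = \delta^{2(n-1)}. \tag{21}$$

Consider a ball $B_\gamma(\mathbf{z})$ of radius $\gamma$ whose volume is given by (21). Let $(d\mathbf{y})$ denote the differential form of the Haar measure on $\mathcal{G}_{n,1}$. The distortion in the ball normalized by the volume of the ball was shown to be [7, Lemma 1]

$$\frac{\int_{B_\gamma(\mathbf{z})} d^2(\mathbf{y},\mathbf{z})\,(d\mathbf{y})}{\mathrm{Vol}(B_\gamma(\mathbf{z}))} = \left(\frac{2(n-1)}{2n}\right)\gamma^2. \tag{22}$$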

For memoryless quantization, the volume together with a point density and covering assumption over the entire $\mathcal{G}_{n,1}$ are used to characterize distortion. For the proposed GPC algorithm, the Voronoi region is determined by the tangent direction and tangent magnitude codebooks, which makes the covering argument difficult. To overcome this difficulty, we assume that the tangent magnitude codebook provides concentric annular partitions of the sphere cap centered around the predicted vector and that the tangent direction codebook partitions each annulus into equiangular sectors. We obtain the bounds by considering the ball that is enclosed in the smallest annular sector and the ball that encloses the largest annular sector: the distortion lower bound is given by the volume of the ball inscribed in the Voronoi cell and, similarly, the distortion upper bound is given by the volume of the ball that covers the Voronoi cell.

Let denote the minimum chordal distance between the tangent direction codewords and denote the minimum Euclidean distance between the tangent magnitude codewords. Similarly, let denote the maximum chordal distance between the tangent direction codewords and denote the maximum Euclidean distance between the tangent magnitude codewords. Suppose that the tangent direction and magnitude codebooks maps uniformly to an equiangle sectors of concentric annulus centered at the predicted vector. Then the following lemma provides the bounds on the distortion for GPC algorithm.

###### Lemma 5 (Distortion bounds)

If and , lower and upper quantization distortion bounds are given by

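$$D_{\text{lower}} = \left(\frac{2(n-1)}{2n}\right)\left(\frac{\gamma_{\text{lower}}}{2}\right)^2, \qquad D_{\text{upper}} = \left(\frac{2(n-1)}{2n}\right)\left(\frac{\lambda_{\text{upper}}}{2}\right)^2. \tag{23}$$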

The lower bound is given by the volume of a metric ball whose radius is the smaller of half the minimum chordal distance of the tangent direction codebook and half the minimum distance of the tangent magnitude codebook. The upper bound is similarly obtained by considering the volume of a metric ball that covers a Voronoi region. The bounds are exact since the metric ball volume formula is accurate [7, Lemma 1].

No claim is made on the tightness of the bound since an accurate description of the Voronoi region obtained by the proposed tangent codebook remains an open problem. In Section VI-A, we provide numerical examples comparing the bounds obtained with actual distortion using fixed codebooks.

Using the obtained lower bound, we may further quantify the reduction in distortion lower bound compared to memoryless quantization on the Grassmann manifold. For , the lower bound on the fixed rate quantizer on the Grassmann manifold was shown to be

$$D_{\mathcal{G}_{n,1}}(N) = \left(\frac{2(n-1)}{2n}\right) N^{-\frac{1}{n-1}} \tag{24}$$

where $N$ is the size of the codebook with rate bits [6, 7]. Suppose that is dominated by the tangent direction codebook such that and that a Grassmannian codebook is used for the tangent direction codebook. Then, the lower bound for the GPC algorithm can be expressed as

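$$\begin{aligned}
D_{\mathrm{lower}} &= \left(\frac{2(n-1)}{2n}\right)\left(\frac{\gamma_{\mathrm{lower}}^2}{4}\right) \\
&= \frac{1}{4}\left(\frac{2(n-1)}{2n}\right)^2 N_d^{-\frac{1}{n-1}} \\
&= \frac{1}{4}\left(\frac{2(n-1)}{2n}\right) D_{\mathcal{G}_{n,1}}(N_d)
\end{aligned} \tag{25}$$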

showing that the lower bound is smaller than when .
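The arithmetic in (23) and (24) can be sketched in a few lines of Python. This is only an illustration: the function names and the numerical inputs (`gamma_lower`, `lambda_upper`, `N`) are hypothetical and are not the values used in the paper.

```python
def ball_distortion_factor(n):
    """Constant 2(n-1)/(2n) that multiplies the squared ball radius in (22)."""
    return 2.0 * (n - 1) / (2.0 * n)


def gpc_distortion_bounds(n, gamma_lower, lambda_upper):
    """Lower and upper distortion bounds of Lemma 5, following (23)."""
    c = ball_distortion_factor(n)
    d_lower = c * (gamma_lower / 2.0) ** 2
    d_upper = c * (lambda_upper / 2.0) ** 2
    return d_lower, d_upper


def memoryless_lower_bound(n, N):
    """Lower bound (24) for a memoryless codebook with N codewords."""
    return ball_distortion_factor(n) * N ** (-1.0 / (n - 1))


# Hypothetical inputs chosen only to exercise the formulas.
d_lower, d_upper = gpc_distortion_bounds(n=4, gamma_lower=0.2, lambda_upper=1.0)
d_memoryless = memoryless_lower_bound(n=4, N=8)
```

With these made-up inputs the GPC lower bound falls below the memoryless lower bound, which is the qualitative behavior the comparison above describes.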

### V-C Performance Measures

The closed loop prediction gain ratio is often used in the vector quantization literature [3] as a measure of how well the predictor performs with respect to changes in the input. The closed loop prediction gain is usually written as the ratio of the mean squared norm of the observed signal to the mean squared norm of the prediction error. We define the mean squared error to be . For our GPC algorithm, we measure the closed loop prediction performance by

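$$\begin{aligned}
G_{\mathrm{clp}} &= \frac{E\left[\|x[k]\|^2\right]}{E\left[d^2(\tilde{x}[k],x[k])\right]} \\
&= \frac{1}{E\left[d^2(\tilde{x}[k],x[k])\right]}
\end{aligned} \tag{26}$$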

where $d^2(\tilde{x}[k],x[k])$ denotes the squared chordal prediction error. In fact, (26) can be further expressed as a function of the tangent vector assuming that the small angle approximation holds. Using (15) and (16), the distance function in the denominator can be approximated as $\|e[k]\|^2$. Therefore, the closed loop prediction gain for the GPC algorithm becomes

$$G_{\mathrm{clp}} \approx \frac{1}{E\left[\|e[k]\|^2\right]} \tag{27}$$

which shows the dependence of the closed loop prediction gain performance on the tangent magnitude. The tangent magnitude is in turn dependent on the changes in the observed process. A closed form relationship between the observed process and the tangent magnitude is in general difficult to obtain. In Section VI-B, we show some empirical results of the closed loop prediction gain performance for the proposed GPC algorithm.
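As a rough illustration of how (26) might be evaluated empirically, the sketch below averages the squared chordal distance over two sequences of unit-norm vectors. The helper names and the assumption that the vectors are stored as NumPy arrays are ours, not the paper's.

```python
import numpy as np


def chordal_distance(x, y):
    """Chordal distance sqrt(1 - |x^* y|^2) between two unit-norm vectors."""
    rho = np.vdot(x, y)  # np.vdot conjugates its first argument
    return np.sqrt(max(0.0, 1.0 - abs(rho) ** 2))


def closed_loop_prediction_gain(predicted, observed):
    """Empirical version of (26): 1 / mean squared chordal prediction error."""
    errors = [chordal_distance(p, x) ** 2 for p, x in zip(predicted, observed)]
    return 1.0 / np.mean(errors)


def gain_in_db(gain):
    """Convert a linear gain to decibels."""
    return 10.0 * np.log10(gain)
```

The numerator of (26) equals one because the observed vectors have unit norm, which is why only the denominator is computed.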

## VI Simulation Results

In this section, we provide numerical results to illustrate the performance of the proposed GPC algorithm.

### VI-A Distortion Bounds

We present a numerical example illustrating the operational distortion and compare it with the upper and lower bounds given in Lemma 5. Correlated vectors were generated according to a second order autoregressive model with memory coefficients and with additive noise distributed according to a zero mean complex Gaussian distribution with variance , i.e., . The normalized vectors were considered to be the samples on to which the proposed GPC algorithm was applied. For this experiment, an tangent direction codebook was used and the tangent magnitude codebook size was varied from to . Fig. 6 shows the operational distortion with the upper and lower distortion bounds obtained in Lemma 5 as a function of the tangent magnitude codebook size. The lower bound captures the distortion trend over the range of codebook sizes, while the upper bound is loose. We also illustrate the lower bound of memoryless quantization using a Grassmannian codebook with codebook sizes of , , , and bits so that the total number of bits used for the codebook matches that of the proposed GPC algorithm. We see that the proposed GPC algorithm provides a significant improvement in distortion over the memoryless quantization technique. Unfortunately, the upper bound from Lemma 5 is dominated by the resolution of the -bit tangent direction codebook, which has higher distortion than memoryless quantization with the matched codebook size. Nevertheless, the result shows that a significant reduction in distortion is achieved by the proposed GPC algorithm and that the achievable distortion can be controlled by the tangent magnitude codebook, which is a simple scalar codebook.

### VI-B Closed Loop Prediction Gain and Prediction Error

To illustrate the dependence on the tangent direction and tangent magnitude codebooks, Fig. 7 shows the closed loop prediction gains for various error tangent magnitude codebook sizes and a fixed tangent direction codebook of size . For these numerical examples, a correlated vector sequence was generated according to a first order autoregressive model (or Gauss-Markov model [44]) with correlation coefficient where is the Bessel function of zeroth order and is the normalized Doppler frequency. The sequence of channel coefficients is generated according to

$$h[k] = \alpha h[k-1] + \sqrt{1-\alpha^2}\, z[k] \tag{28}$$

where $k$ is the time index and $z[k]$ is a vector with each entry drawn from an i.i.d. zero mean complex white Gaussian process. The normalized vectors are the correlated sequence on the Grassmann manifold. For the tangent direction codebook, an Grassmannian codebook [45] was used and the tangent magnitude codebooks were based on a uniform quantization between and using , , , and bits. For an upper bound, the closed loop prediction gain without quantizing the tangent magnitude is also shown. The result illustrates the dependence of the closed loop prediction gain on the tangent magnitude codebook size as a function of the correlation parameter $\alpha$. For highly correlated data, the tangent magnitude codebook resolution has a higher impact on the closed loop prediction gain. This is because the smallest tangent magnitude quantization level may be larger than the prediction error, leading to an overestimation. If the tangent magnitude codebook is adjusted based on the correlation, e.g., quantize in the range of instead of , this gap may be closed.
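The recursion in (28) can be sketched as follows. This is a minimal illustration under stated assumptions: the innovation entries are taken to have unit variance, the correlation coefficient `alpha` is passed in directly rather than computed from a Bessel function, and the function name and arguments are hypothetical.

```python
import numpy as np


def gauss_markov_sequence(n, num_samples, alpha, rng):
    """Generate unit-norm vectors from the first order model in (28).

    Assumes unit-variance complex Gaussian innovations (an assumption,
    not a value taken from the paper).
    """
    def complex_gaussian():
        return (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)

    h = complex_gaussian()  # initial channel state
    samples = []
    for _ in range(num_samples):
        h = alpha * h + np.sqrt(1.0 - alpha ** 2) * complex_gaussian()
        samples.append(h / np.linalg.norm(h))  # normalize onto the unit sphere
    return np.array(samples)


rng = np.random.default_rng(0)
sequence = gauss_markov_sequence(n=4, num_samples=200, alpha=0.999, rng=rng)
```

Values of `alpha` close to one produce slowly varying unit vectors, which is the regime where the tangent magnitude codebook resolution matters most according to the discussion above.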

Another useful performance measure is the chordal distance error between the estimated vector and the observed vector . The chordal distance error shows how close the estimated vector is to the observed vector using the proposed GPC algorithm. In the MIMO communication application considered in Section VI-C, the chordal distance error has a direct impact on the respective communication theoretic performance measures. In Fig. 8, we show the chordal distance between and and the chordal distance between the quantized vector and the observed vector for memoryless quantization using a Grassmannian codebook with . Fig. 8 illustrates the substantial improvement in quantization accuracy compared with the memoryless technique.

To further illustrate the quantizer accuracy, we show the operational mean squared chordal distance error (MSE) as a function of for the proposed GPC algorithm and a memoryless quantizer using a Grassmannian codebook in Fig. 9. The memoryless quantizer provides approximately dB of MSE whereas the proposed GPC algorithm provides as little as dB of MSE, which shows that significant accuracy can be gained over memoryless quantization techniques. As the correlation decreases, the MSE approaches that of memoryless quantization.

### VI-C Application to Zero Forcing Multiuser MIMO System

In this section, we illustrate the application of the proposed GPC algorithm to a limited feedback multiuser MIMO system using zero forcing precoding [30]. We assume that the transmitter has transmit antennas and each user is equipped with a single receive antenna. We assume that the encoder and decoder are initialized and that each user has a perfect channel estimate. Then, each user performs the prediction as described in Section IV and feeds back the indices of the quantized tangent direction and tangent magnitude codewords. The transmitter uses the received indices and performs the prediction as depicted in Fig. 4. The predicted channel vectors are then used to form the composite channel matrix from which the zero forcing precoder is computed. The channel to each user is assumed to be temporally correlated with correlation according to [46]. Each user’s channel is generated independently assuming the same temporal correlation.

To compare the random codebook approach and the proposed GPC algorithm, we compare the achievable sum rate for three scenarios. First, the achievable sum rate assuming perfect CSI at the transmitter is obtained. For the perfect CSI case, an i.i.d. channel is assumed. The perfect CSI case provides the baseline for what can be achieved. The second scenario is the random vector codebook approach, also assuming an i.i.d. channel [30]. The third scenario is the proposed GPC algorithm using a 9-bit codebook for and that correspond to Doppler frequencies of 0.2 Hz, 2 Hz, 4 Hz, and 8 Hz at the 5 ms update interval found in LTE-Advanced and IEEE 802.16m.

Fig. 10 illustrates the achievable sum rate for the cases being considered. In contrast to the random codebook strategy, the proposed GPC algorithm provides a significant sum rate gain. In fact, for , the system starts to become interference limited above an SNR of 20 dB, illustrating the superior CSI accuracy when the channel is highly correlated. Furthermore, each user is equipped with the same codebooks, which eliminates the need to store multiple codebooks at the transmitter and thus reduces the overhead for practical applications.

Fig. 11 illustrates the sum rate improvement of the proposed technique over the Householder technique in [26] over a range of SNR for channels with various normalized Doppler frequencies. Both methods used -bit feedback per channel use. The plot shows that the proposed GPC algorithm outperforms the Householder technique, especially at high SNR, illustrating the higher CSI resolution obtained by the GPC algorithm.

## VII Conclusion

In this paper, we proposed a new predictive coding algorithm on the Grassmann manifold for limited feedback in multiple antenna wireless systems. Building on classical predictive vector quantization in linear vector spaces and the geometric properties of the Grassmann manifold, we derived a predictive coding framework for . Distortion bounds were obtained showing a possible distortion improvement over memoryless quantization. In simulations, we showed that the proposed GPC algorithm provides a significant sum rate improvement for a multiuser MIMO system using practical codebook sizes. Future work should consider the optimization of the tangent magnitude codebook and extensions to a higher dimensional Grassmann manifold, i.e., for .

## Appendix A Proof of Lemma 1

It was shown in [47] that the tangent vector between $x_1$ and $x_2$ on the Grassmann manifold can be written as

$$e = \tan^{-1}\left(\left\|\frac{x_2}{\rho}-x_1\right\|\right)\frac{x_2/\rho - x_1}{\left\|x_2/\rho - x_1\right\|}. \tag{29}$$

The normed term can be simplified as

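$$\begin{aligned}
\left\|\frac{x_2}{\rho}-x_1\right\|^2 &= \left(\frac{x_2}{\rho}-x_1\right)^*\left(\frac{x_2}{\rho}-x_1\right) \\
&= \frac{1}{|\rho|^2}-1.
\end{aligned} \tag{30}$$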

Therefore,

$$\left\|\frac{x_2}{\rho}-x_1\right\| = \sqrt{\frac{1}{|\rho|^2}-1} = \frac{d}{|\rho|}$$

where is the chordal distance between and . Clearly, and such that .
Using the exponential form of trigonometric identities and , we have

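$$\begin{aligned}
\tan^{-1}\left(\frac{d}{|\rho|}\right) &= \frac{j}{2}\ln\left(\frac{1-j\,\frac{d}{|\rho|}}{1+j\,\frac{d}{|\rho|}}\right) \\
&= -j\ln\left(|\rho|+\sqrt{|\rho|^2-1}\right) \\
&= \cos^{-1}|\rho|.
\end{aligned} \tag{31}$$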

Since is the cosine of the subspace angle between and , this shows that the norm of the tangent vector is equal to the arc length, i.e., with subspace angle [35, p. 603].
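The claim of Lemma 1 can be checked numerically with a small sketch. We assume here that $\rho = x_1^* x_2$, which is the convention consistent with (30) and (41). The function names are hypothetical.

```python
import numpy as np


def random_unit_vector(n, rng):
    """Draw a random unit-norm complex vector (illustrative helper)."""
    v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    return v / np.linalg.norm(v)


def tangent(x1, x2):
    """Tangent vector from x1 toward x2 following (29).

    Assumes rho = x1^* x2 is nonzero, i.e. x1 and x2 are not orthogonal.
    """
    rho = np.vdot(x1, x2)  # np.vdot conjugates its first argument
    v = x2 / rho - x1
    norm_v = np.linalg.norm(v)  # equals d / |rho| by (30)
    return np.arctan(norm_v) * v / norm_v


rng = np.random.default_rng(0)
x1 = random_unit_vector(4, rng)
x2 = random_unit_vector(4, rng)
e = tangent(x1, x2)
subspace_angle = np.arccos(abs(np.vdot(x1, x2)))
# np.linalg.norm(e) and subspace_angle should agree up to rounding error.
```

The tangent vector is also orthogonal to `x1`, since $x_1^*(x_2/\rho - x_1) = 0$ under the stated convention.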

## Appendix B Proof of Lemma 2

For the general case where with , the geodesic between and was shown to be [29]

$$X(t) = X_1 V\cos(\Sigma t)V^* + U\sin(\Sigma t)V^*$$

where is the compact singular value decomposition of the tangent emanating from to . For the case , let be the tangent vector emanating from to . Then, we may assume without loss of generality and identify with and with to obtain

$$G(x_1,e,t) = x_1\cos(\|e\|t) + \vec{e}\,\sin(\|e\|t). \tag{32}$$

It is clear that $G(x_1,e,0) = x_1$. At $t=1$, we have

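$$\begin{aligned}
G(x_1,e,1) &= \frac{x_1}{\sqrt{1+d^2/|\rho|^2}} + \frac{x_2/\rho - x_1}{d/|\rho|}\,\frac{d/|\rho|}{\sqrt{1+d^2/|\rho|^2}} \\
&= \frac{x_2}{\rho\sqrt{1+d^2/|\rho|^2}} \\
&= x_2
\end{aligned} \tag{33}$$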

where we have used the identities

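$$\begin{aligned}
\sin\left(\tan^{-1}x\right) &= \frac{x}{\sqrt{1+x^2}} \\
\cos\left(\tan^{-1}x\right) &= \frac{1}{\sqrt{1+x^2}}
\end{aligned} \tag{35}$$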

in (33) and the fact that in (B).
To verify that for is a valid point on the Grassmann manifold, taking the inner product of with itself yields for by using the fact that .
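The geodesic in (32) can be explored numerically with the following sketch. It assumes $\rho = x_1^* x_2$ and treats $\vec{e}$ as the unit-norm direction $e/\|e\|$. Both helper names are hypothetical. Because points on the Grassmann manifold are only defined up to a unit-modulus scalar, the endpoint is compared with $x_2$ through the inner product magnitude rather than entry by entry.

```python
import numpy as np


def tangent(x1, x2):
    """Tangent vector from x1 toward x2 following (29), with rho = x1^* x2."""
    rho = np.vdot(x1, x2)
    v = x2 / rho - x1
    norm_v = np.linalg.norm(v)
    return np.arctan(norm_v) * v / norm_v


def geodesic(x1, e, t):
    """Point at parameter t on the geodesic (32) leaving x1 along e."""
    norm_e = np.linalg.norm(e)
    if norm_e == 0.0:
        return x1  # zero tangent: the geodesic stays at x1
    return x1 * np.cos(norm_e * t) + (e / norm_e) * np.sin(norm_e * t)


rng = np.random.default_rng(1)
x1 = rng.standard_normal(4) + 1j * rng.standard_normal(4)
x1 /= np.linalg.norm(x1)
x2 = rng.standard_normal(4) + 1j * rng.standard_normal(4)
x2 /= np.linalg.norm(x2)

e = tangent(x1, x2)
start = geodesic(x1, e, 0.0)   # coincides with x1
middle = geodesic(x1, e, 0.3)  # stays on the unit sphere
end = geodesic(x1, e, 1.0)     # spans the same subspace as x2
```

Intermediate points keep unit norm because `x1` and `e` are orthogonal, so the cosine and sine terms combine to a unit-length vector.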

## Appendix C Proof of Lemma 3

For the general case where , , the parallel transport of tangent emanating from along the geodesic direction with compact singular value decomposition, , was shown to be [29]

$$\hat{E} = \left[-X_1 V\sin(\Sigma t)U^* + U\cos(\Sigma t)U^* + (I-UU^*)\right]E. \tag{36}$$

We need to show the parallel transport of the tangent vector emanating from to in the geodesic direction for the case . Without loss of generality, we assume that the singular value decomposition of is given with as the left singular vector, as the singular value, and for the right singular vector. Then

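$$\begin{aligned}
\vec{e}(t) &= \left[-x_1\sin(\|e\|t)\frac{e^*}{\|e\|} + \frac{e}{\|e\|}\cos(\|e\|t)\frac{e^*}{\|e\|} + \left(I-\frac{ee^*}{\|e\|^2}\right)\right]e \\
&= -x_1\|e\|\sin(\|e\|t) + e\cos(\|e\|t).
\end{aligned} \tag{37}$$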

Since , the parallel transported tangent vector emanating from is found by evaluating (37) for . Using (6) and (35), we have

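$$\begin{aligned}
\hat{e} &= -x_1\|e\|\sin(\|e\|) + e\cos(\|e\|) \\
&= -x_1\tan^{-1}\left(\frac{d}{|\rho|}\right)\frac{d/|\rho|}{\sqrt{1+d^2/|\rho|^2}} + \tan^{-1}\left(\frac{d}{|\rho|}\right)\frac{x_2/\rho - x_1}{d/|\rho|}\,\frac{1}{\sqrt{1+d^2/|\rho|^2}} \\
&= \frac{\tan^{-1}(d/|\rho|)}{(d/|\rho|)\sqrt{1+d^2/|\rho|^2}}\left(\frac{x_2}{\rho} - x_1\left(1+\frac{d^2}{|\rho|^2}\right)\right) \\
&= \tan^{-1}\left(\frac{d}{|\rho|}\right)\frac{x_2\rho^* - x_1}{d}
\end{aligned} \tag{38}$$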

which is the desired result.
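A numerical check of the chain of equalities in (38) might look like the sketch below. It assumes $\rho = x_1^* x_2$ and $d = \sqrt{1-|\rho|^2}$, and the function names are hypothetical. The first function evaluates the first line of (38) and the second evaluates the last line, so the two outputs should coincide.

```python
import numpy as np


def tangent(x1, x2):
    """Tangent vector from x1 toward x2 following (29), with rho = x1^* x2."""
    rho = np.vdot(x1, x2)
    v = x2 / rho - x1
    norm_v = np.linalg.norm(v)
    return np.arctan(norm_v) * v / norm_v


def transport_first_line(x1, e):
    """First line of (38): -x1 ||e|| sin(||e||) + e cos(||e||)."""
    norm_e = np.linalg.norm(e)
    return -x1 * norm_e * np.sin(norm_e) + e * np.cos(norm_e)


def transport_closed_form(x1, x2):
    """Last line of (38); requires 0 < |rho| < 1 so that d and rho are nonzero."""
    rho = np.vdot(x1, x2)
    d = np.sqrt(1.0 - abs(rho) ** 2)
    return np.arctan(d / abs(rho)) * (x2 * np.conj(rho) - x1) / d


rng = np.random.default_rng(2)
x1 = rng.standard_normal(4) + 1j * rng.standard_normal(4)
x1 /= np.linalg.norm(x1)
x2 = rng.standard_normal(4) + 1j * rng.standard_normal(4)
x2 /= np.linalg.norm(x2)

e = tangent(x1, x2)
e_hat_a = transport_first_line(x1, e)
e_hat_b = transport_closed_form(x1, x2)
```

The transported vector keeps the norm of `e` and is orthogonal to `x2`, which is what one expects of a tangent vector based at `x2`.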

## Appendix D Derivation of Prediction Function in Definition 4

Recall that the parallel transported tangent vector emanating from is given in (8). Computing the geodesic with from along at gives

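$$\begin{aligned}
\hat{x} &= G(x_2,\hat{e},1) \\
&= x_2\cos(\|\hat{e}\|) + \frac{\hat{e}}{\|\hat{e}\|}\sin(\|\hat{e}\|) \\
&= |\rho|x_2 + \rho^* x_2 - x_1.
\end{aligned} \tag{39}$$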

To see that , we have

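$$\begin{aligned}
\hat{x}^*\hat{x} &= \left(|\rho|x_2 + \rho^* x_2 - x_1\right)^*\left(|\rho|x_2 + \rho^* x_2 - x_1\right) \\
&= 1
\end{aligned} \tag{40}$$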

where we have used the fact that . To see that the prediction is distance preserving, the inner product of and gives

$$\begin{aligned}
x_2^*\hat{x} &= x_2^* x_2|\rho| + x_2^* x_2\rho^* - x_2^* x_1 \\
&= |\rho|.
\end{aligned} \tag{41}$$

Therefore,

$$d(x_2,\hat{x}) = \sqrt{1-|\rho|^2} = d(x_1,x_2). \tag{42}$$
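The closed form in (39) and the properties in (40) to (42) can be verified numerically with a short sketch. It assumes $\rho = x_1^* x_2$, and the function names are hypothetical.

```python
import numpy as np


def chordal_distance(x, y):
    """Chordal distance sqrt(1 - |x^* y|^2) between two unit-norm vectors."""
    return np.sqrt(max(0.0, 1.0 - abs(np.vdot(x, y)) ** 2))


def predict(x1, x2):
    """One-step prediction (39): x_hat = |rho| x2 + conj(rho) x2 - x1."""
    rho = np.vdot(x1, x2)  # rho = x1^* x2
    return (abs(rho) + np.conj(rho)) * x2 - x1


rng = np.random.default_rng(3)
x1 = rng.standard_normal(4) + 1j * rng.standard_normal(4)
x1 /= np.linalg.norm(x1)
x2 = rng.standard_normal(4) + 1j * rng.standard_normal(4)
x2 /= np.linalg.norm(x2)

x_hat = predict(x1, x2)
rho = np.vdot(x1, x2)
# (40): x_hat has unit norm.
# (41): x2^* x_hat equals |rho|.
# (42): the step from x2 to x_hat has the same chordal length as x1 to x2.
```

The prediction requires only one inner product and a vector update, so no explicit evaluation of the tangent or trigonometric functions is needed at run time.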

## References

• [1] A. Haoui and D. Messerschmitt, “Predictive vector quantization,” in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Process., vol. 9, 1984, pp. 420–423.
• [2] H.-M. Hang and J. Woods, “Predictive vector quantization of images,” IEEE Trans. Commun., vol. 33, no. 11, pp. 1208–1219, 1985.
• [3] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression.   Kluwer Academic, 1991.
• [4] H. Khalil, K. Rose, and S. L. Regunathan, “The asymptotic closed-loop approach to predictive vector quantizer design with application in video coding,” IEEE Trans. Image Process., vol. 10, no. 1, pp. 15–23, 2001.
• [5] A. Barg and D. Nogin, “Bounds on packings of spheres in the Grassmann manifold,” IEEE Trans. Inf. Theory, vol. 48, no. 9, pp. 2450–2454, 2002.
• [6] W. Dai, Y. Liu, and B. Rider, “Quantization bounds on Grassmann manifolds of arbitrary dimensions and MIMO communications with feedback,” in Proc. of IEEE Global Telecom. Conf., vol. 3, 2005, pp. 1456–1460.
• [7] B. Mondal, S. Dutta, and R. W. Heath Jr., “Quantization on the Grassmann manifold,” IEEE Trans. Signal Process., vol. 55, no. 8, pp. 4208–4216, 2007.
• [8] A. Ashikhmin and R. Gopalan, “Grassmannian packings for efficient quantization in MIMO broadcast systems,” in Proc. of IEEE Int. Symp. on Info. Theory, 2007, pp. 1811–1815.
• [9] L. Zheng and D. N. C. Tse, “Communication on the Grassmann manifold: a geometric approach to the noncoherent multiple-antenna channel,” IEEE Trans. Inf. Theory, vol. 48, no. 2, pp. 359–383, 2002.
• [10] I. Kammoun and J.-C. Belfiore, “A new family of Grassmann space-time codes for non-coherent MIMO systems,” IEEE Commun. Lett., vol. 7, no. 11, pp. 528–530, 2003.
• [11] A. M. Cipriano, I. Kammoun, and J.-C. Belfiore, “Simplified decoding for some non-coherent codes over the Grassmannian,” in Proc. of IEEE Int. Conf. on Commun., vol. 2, 2005, pp. 757–761.
• [12] IEEE, “IEEE 802.16e-2005 IEEE Standard for Local and metropolitan area networks, Part 16: Air interface for fixed and mobile broadband wireless access systems, Amendment 2: Physical and Medium Access Control Layers for Combined Fixed and Mobile Operation in Licensed Bands and Corrigendum 1,” Dec. 2005.
• [13] 3GPP, “Physical layer aspects of UTRA high speed downlink packet access,” Technical Report TR25.814, 2006. [Online]. Available: http://www.3gpp.org/ftp/Specs/html-info/Meetings-R1.htm
• [14] A. Narula, M. J. Lopez, M. D. Trott, and G. W. Wornell, “Efficient use of side information in multiple-antenna data transmission over fading channels,” IEEE J. Sel. Areas Commun., vol. 16, no. 8, pp. 1423–1436, 1998.
• [15] J. C. Roh and B. D. Rao, “Transmit beamforming in multiple-antenna systems with finite rate feedback: a VQ-based approach,” IEEE Trans. Inf. Theory, vol. 52, no. 3, pp. 1101–1112, 2006.
• [16] J. A. Tropp, I. S. Dhillon, R. W. Heath Jr., and T. Strohmer, “Designing structured tight frames via an alternating projection method,” IEEE Trans. Inf. Theory, vol. 51, no. 1, pp. 188–209, 2005.
• [17] T. Inoue and R. W. Heath Jr., “Kerdock codes for limited feedback precoded MIMO systems,” IEEE Trans. Signal Process., vol. 57, no. 9, pp. 3711–3716, 2009.
• [18] D. J. Love, R. W. Heath Jr., V. K. N. Lau, D. Gesbert, B. D. Rao, and M. Andrews, “An overview of limited feedback in wireless communication systems,” IEEE J. Sel. Areas Commun., vol. 26, no. 8, pp. 1341–1365, 2008.
• [19] K. Huang, B. Mondal, R. W. Heath Jr., and J. G. Andrews, “Markov models for limited feedback MIMO systems,” in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Process., vol. 4, 2006, pp. 9–12.
• [20] K. Huang, R. W. Heath, Jr., and J. G. Andrews, “Limited feedback beamforming over temporally-correlated channels,” IEEE Trans. Signal Process., vol. 57, no. 5, pp. 1959–1975, 2009.
• [21] B. Mondal and R. W. Heath Jr., “Adaptive feedback for MIMO beamforming systems,” in Proc. of IEEE Workshop on Signal Process. Adv. in Wireless Commun., 2004, pp. 213–217.
• [22] R. Samanta and R. W. Heath Jr., “Codebook adaptation for quantized MIMO beamforming systems,” in Proc. of Asilomar Conf. on Signals, Systems and Computers, 2005, pp. 376–380.
• [23] J. H. Kim, W. Zirwas, and M. Haardt, “Efficient feedback via subspace-based channel quantization for distributed cooperative antenna systems with temporally correlated channels,” EURASIP Journal on Advances in Signal Processing, vol. 2008, 2008. [Online]. Available: http://www.hindawi.com/GetArticle.aspx?doi=10.1155/2008/847296
• [24] V. Raghavan, R. W. Heath Jr., and A. M. Sayeed, “Systematic codebook designs for quantized beamforming in correlated MIMO channels,” IEEE J. Sel. Areas Commun., vol. 25, no. 7, pp. 1298–1310, 2007.
• [25] R. W. Heath Jr., T. Wu, and A. C. K. Soong, “Progressive refinement for high resolution limited feedback multiuser MIMO beamforming,” in Proc. of Asilomar Conf. on Signals, Systems and Computers, 2008, pp. 743–747.
• [26] L. Liu and H. Jafarkhani, “Novel transmit beamforming schemes for time-selective fading multiantenna systems,” IEEE Trans. Signal Process., vol. 54, no. 12, pp. 4767–4781, 2006.
• [27] T. Kim, D. J. Love, and B. Clerckx, “MIMO systems with limited rate differential feedback in slowly varying channels,” IEEE Trans. Commun., vol. 59, no. 4, pp. 1175–1189, 2011.
• [28] T. Inoue and R. W. Heath Jr., “Grassmannian predictive coding for limited feedback multiuser MIMO systems,” in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Process., 2011.
• [29] A. Edelman, T. A. Arias, and S. T. Smith, “The geometry of algorithms with orthogonality constraints,” SIAM J. Matrix Analysis and Applications, vol. 20, no. 2, pp. 303–353, 1998.
• [30] N. Jindal, “MIMO broadcast channels with finite-rate feedback,” IEEE Trans. Inf. Theory, vol. 52, no. 11, pp. 5045–5060, 2006.
• [31] K. Huang, J. G. Andrews, and R. W. Heath Jr., “Orthogonal beamforming for SDMA downlink with limited feedback,” in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Process., vol. 3, 2007, pp. 97–100.
• [32] G. Caire and S. Shamai, “On the achievable throughput of a multiantenna Gaussian broadcast channel,” IEEE Trans. Inf. Theory, vol. 49, no. 7, pp. 1691–1706, 2003.
• [33] C. B. Peel, B. M. Hochwald, and A. L. Swindlehurst, “A vector-perturbation technique for near-capacity multiantenna multiuser communication-part I: channel inversion and regularization,” IEEE Trans. Commun., vol. 53, no. 1, pp. 195–202, 2005.
• [34] J. M. Lee, Introduction to Smooth Manifolds, ser. Graduate texts in mathematics; 218.   Springer, 2003.
• [35] G. H. Golub and C. H. Van Loan, Matrix Computations, 3rd ed.   The Johns Hopkins University Press, 1996.
• [36]