# Recursive CSI Quantization of Time-Correlated MIMO Channels by Deep Learning Classification

## Abstract

In frequency division duplex (FDD) multiple-input multiple-output (MIMO) wireless communications, limited channel state information (CSI) feedback is a central tool to support advanced single- and multi-user MIMO beamforming/precoding. To achieve a given CSI quality, the CSI quantization codebook size has to grow exponentially with the number of antennas, leading to quantization complexity as well as feedback overhead issues for larger MIMO systems. We have recently proposed a multi-stage recursive Grassmannian quantizer that enables a significant complexity reduction of CSI quantization. In this paper, we show that this recursive quantizer can effectively be combined with deep learning classification to further reduce the complexity, and that it can exploit temporal channel correlations to reduce the CSI feedback overhead.

- 3GPP: Third Generation Partnership Project
- ABF: transmit beamformed Alamouti space-time/frequency coding
- BD: block-diagonalization
- BF: beamforming
- CDF: cumulative distribution function
- CSI: channel state information
- CSIT: CSI at the transmitter
- DNN: deep neural network
- DoF: degrees of freedom
- eMBMS: enhanced multimedia broadcast/multicast service
- FDD: frequency division duplex
- IA: interference alignment
- i.i.d.: independent and identically distributed
- LBM: leakage-based multicasting
- LTE: long term evolution
- MC-IC: multicast interference channel
- MISO: multiple-input single-output
- MIMO: multiple-input multiple-output
- MSE: mean-squared error
- RBD: regularized block-diagonalization
- RVQ: random vector quantization
- SDR: semidefinite relaxation
- SINR: signal to interference and noise ratio
- SNR: signal to noise ratio
- SQBC: subspace quantization based combining
- SVD: singular value decomposition
- SVR: singular value ratio
- UMTS: universal mobile telecommunications system

## I Introduction

Limited CSI feedback is a well-established technique for supporting efficient MIMO transmissions in FDD systems [1, 2, 3, 4]. Often, the framework of Grassmannian CSI quantization is adopted, since subspace information is required for many popular transmit precoding schemes. Many constructions of Grassmannian quantization codebooks exist; see, e.g., [5, 6, 7, 8] for some of the more recent ones.

Generally, in case of memoryless quantization of isotropic channels, such as independent and identically distributed (i.i.d.) Rayleigh fading channels, it is known that maximally spaced subspace packings achieve optimal quantization performance in terms of subspace chordal distance; however, such packings are difficult to construct for larger MIMO systems and codebook sizes [9, 10, 11]. When adopting Grassmannian quantization in larger-scale MIMO systems and/or for high resolution quantization, one faces two main challenges: 1) quantization complexity and 2) feedback overhead. The former issue can effectively be tackled if the channel exhibits structure that can be exploited for quantization; e.g., in the millimeter wave band, the channel is often assumed to be sparse, which allows for efficient parametric CSI quantization by sparse decomposition [12, 13, 14]. Such techniques, however, are not applicable in i.i.d. Rayleigh fading situations. Also, a number of approaches that utilize deep neural networks have recently been proposed to enable efficient CSI quantization [15, 16, 17]; yet, these publications mostly consider relatively low resolution quantization, as neural networks are hard to train for large quantization codebooks. When the channel exhibits temporal correlation, quantizers with memory, such as differential quantizers or techniques based on recurrent neural networks, can provide significantly better performance than memoryless approaches [18, 19, 20, 21, 22, 23, 24, 25]. Yet, they mostly require adaptation of the quantization codebook on the fly or online neural network learning, which can be prohibitive in terms of complexity.

Contribution: In [26], we have proposed a recursive multi-stage quantization approach that can reduce the complexity of high resolution Grassmannian quantization in moderate to large-scale MIMO systems by orders of magnitude. In this paper, we show that this approach can effectively be enhanced by DNN classification to further reduce the implementation complexity and, thus, support high resolution Grassmannian quantization with low complexity. Hence, rather than adopting an end-to-end DNN approach, we propose to enhance well-known model-based CSI quantizers by neural network features. We furthermore propose a simple approach to exploit temporal channel correlation in recursive multi-stage quantization, by selectively updating the individual stages of the quantizer.

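Notation: The Grassmann manifold of $m$-dimensional subspaces of the $n$-dimensional Euclidean space is $\mathcal{G}(n,m)$. The trace of matrix $\mathbf{A}$ is $\mathrm{tr}(\mathbf{A})$, the conjugate-transpose is $\mathbf{A}^{\mathrm{H}}$, the Frobenius norm is $\|\mathbf{A}\|$ and vectorization is $\mathrm{vec}(\mathbf{A})$. The $m$-dimensional subspace spanned by the columns of $\mathbf{A} \in \mathbb{C}^{n \times m}$ is $\mathrm{span}(\mathbf{A}) \in \mathcal{G}(n,m)$. The expected value of random variable $r$ is $\mathbb{E}(r)$. The operation $\arg\min_{x \in \mathcal{X}} f(x)$ determines the minimizer of the function $f(x)$ over the set $\mathcal{X}$. The size of set $\mathcal{X}$ is $|\mathcal{X}|$. The vector-valued complex Gaussian distribution with mean $\boldsymbol{\mu}$ and covariance $\mathbf{C}$ is $\mathcal{CN}(\boldsymbol{\mu}, \mathbf{C})$. The zeroth-order Bessel function of the first kind is $J_0(\cdot)$.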

## II Channel Model

We consider a MIMO wireless communication system with $n$ transmit and $m$ receive antennas, where $n \geq m$. We denote the frequency-flat baseband MIMO channel matrix at time instant $k$ as $\mathbf{H}[k] \in \mathbb{C}^{n \times m}$. We consider a spatially uncorrelated Rayleigh fading channel, i.e., $\mathrm{vec}(\mathbf{H}[k]) \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{nm})$, as caused by a strong scattering environment.

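We assume that the MIMO channel follows a stationary Gaussian stochastic process with temporal auto-correlation function $r[\ell]$, parametrized by the time-lag $\ell$, according to

$$\mathbb{E}\left(\mathrm{vec}(\mathbf{H}[k])\,\mathrm{vec}(\mathbf{H}[k+\ell])^{\mathrm{H}}\right) = r[\ell]\,\mathbf{I}_{nm}. \tag{1}$$

Considering, for example, Clarke's Doppler spectrum [27], the auto-correlation function is $r[\ell] = J_0(2\pi \nu_d \ell)$, where $\nu_d = f_d T_s$ is the normalized Doppler shift, $f_d$ is the maximal absolute Doppler shift and $T_s$ is the symbol time interval.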
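As a rough numerical illustration of such an auto-correlation function, the following sketch evaluates $J_0(2\pi \nu_d \ell)$ for a few time-lags. The Bessel function is computed from its integral representation with plain NumPy; the function names and the value of the normalized Doppler shift are illustrative assumptions, not values from this paper.

```python
import numpy as np

def bessel_j0(x, num_points=2000):
    """Zeroth-order Bessel function of the first kind, evaluated via
    J0(x) = (1/pi) * int_0^pi cos(x sin(theta)) d(theta) with a midpoint rule."""
    theta = (np.arange(num_points) + 0.5) * np.pi / num_points
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.cos(np.outer(x, np.sin(theta))).mean(axis=1)

def clarke_autocorrelation(lags, nu_d):
    """Auto-correlation J0(2*pi*nu_d*lag) for a normalized Doppler shift nu_d."""
    return bessel_j0(2.0 * np.pi * nu_d * np.asarray(lags, dtype=float))

# Hypothetical normalized Doppler shift, chosen only for illustration
lags = np.arange(0, 5)
r = clarke_autocorrelation(lags, nu_d=0.05)
print(np.round(r, 4))
```

The correlation equals one at lag zero and decays faster for larger normalized Doppler shifts, which is what limits how long previously fed back CSI remains useful.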

## III Grassmannian Quantization

In this paper, we focus on Grassmannian CSI quantization at the receiver, in order to provide CSI feedback to the transmitter. For this purpose, we apply a compact-size singular value decomposition (SVD) to the channel

$$\mathbf{H}[k] = \mathbf{U}[k]\,\boldsymbol{\Sigma}[k]\,\mathbf{V}[k]^{\mathrm{H}}, \tag{2}$$

and utilize the orthonormal basis $\mathbf{U}[k] \in \mathbb{C}^{n \times m}$, consisting of the left singular vectors corresponding to the non-zero singular values, as relevant CSI to represent the $m$-dimensional subspace $\mathrm{span}(\mathbf{H}[k]) = \mathrm{span}(\mathbf{U}[k]) \in \mathcal{G}(n,m)$ spanned by the channel.
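The extraction of the subspace basis from the channel can be sketched with NumPy's compact SVD. The antenna numbers below are hypothetical and the channel is a single i.i.d. Rayleigh fading realization; the check at the end only confirms that the left singular vectors span the same subspace as the channel matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 2  # hypothetical numbers of transmit and receive antennas

# One i.i.d. Rayleigh fading channel realization of size n x m
H = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2)

# Compact SVD: U is n x m and holds the left singular vectors
U, singular_values, Vh = np.linalg.svd(H, full_matrices=False)

# Orthogonal projectors onto span(H) and span(U) coincide
P_H = H @ np.linalg.pinv(H)
P_U = U @ U.conj().T
print(np.allclose(P_H, P_U))
```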

### III-A Single-Stage Quantization

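In single-stage quantization, matrix $\mathbf{U}[k]$ is quantized by applying a quantization codebook $\mathcal{Q}$ consisting of semi-unitary matrices $\mathbf{Q}_i \in \mathbb{C}^{n \times m}$, $\mathbf{Q}_i^{\mathrm{H}}\mathbf{Q}_i = \mathbf{I}_m$. For $b$ bits of CSI feedback per time instant, the codebook is of size $|\mathcal{Q}| = 2^b$.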

As quantization metric, we consider the subspace chordal distance, as it is relevant for many subspace-based precoding techniques, such as block-diagonalization and interference alignment [28, 29, 30, 31]. The CSI quantization problem thus is

$$\hat{\mathbf{U}}[k] = \arg\min_{\mathbf{Q}_i \in \mathcal{Q}} d_c^2\left(\mathbf{U}[k], \mathbf{Q}_i\right), \tag{3}$$

$$d_c^2\left(\mathbf{U}, \mathbf{Q}\right) = 1 - \frac{1}{m}\,\mathrm{tr}\left(\mathbf{U}^{\mathrm{H}}\mathbf{Q}\mathbf{Q}^{\mathrm{H}}\mathbf{U}\right), \tag{4}$$

where (4) denotes the chordal distance normalized by the subspace dimension $m$. Solving this non-convex Grassmannian quantization problem usually implies an exhaustive search over all codebook entries, which can easily become intractable in case of large codebooks.
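A minimal sketch of the exhaustive search is given below. It assumes the common form of the normalized chordal distance, $1 - \mathrm{tr}(\mathbf{U}^{\mathrm{H}}\mathbf{Q}\mathbf{Q}^{\mathrm{H}}\mathbf{U})/m$, and uses a random codebook; the dimensions, the number of bits and all function names are illustrative assumptions.

```python
import numpy as np

def random_semi_unitary(n, m, rng):
    """Random n x m matrix with orthonormal columns (reduced QR of a Gaussian matrix)."""
    G = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
    Q, _ = np.linalg.qr(G)
    return Q

def chordal_distance_sq(U, Q):
    """Normalized squared chordal distance: 1 - ||Q^H U||_F^2 / m."""
    m = U.shape[1]
    return 1.0 - np.linalg.norm(Q.conj().T @ U, "fro") ** 2 / m

def quantize_exhaustive(U, codebook):
    """Return the index and distortion of the closest codebook entry."""
    dists = [chordal_distance_sq(U, Q) for Q in codebook]
    idx = int(np.argmin(dists))
    return idx, dists[idx]

rng = np.random.default_rng(0)
n, m, b = 6, 2, 6  # hypothetical dimensions and feedback bits
codebook = [random_semi_unitary(n, m, rng) for _ in range(2 ** b)]
U = random_semi_unitary(n, m, rng)
index, distortion = quantize_exhaustive(U, codebook)
print(index, round(float(distortion), 3))
```

The cost of this search grows linearly in the codebook size, i.e., exponentially in the number of feedback bits, which is the complexity issue addressed by multi-stage quantization.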

#### Quantization Distortion

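In case of memoryless random vector quantization (RVQ), the average normalized single-stage quantization distortion is

$$\bar{d}_c^2 = \mathbb{E}\left(d_c^2\left(\mathbf{U}[k], \hat{\mathbf{U}}[k]\right)\right) \approx \frac{\Gamma\left(\frac{1}{m(n-m)}\right)}{m^2(n-m)} \left(c_{n,m}\, 2^{b}\right)^{-\frac{1}{m(n-m)}}, \tag{5}$$

with the gamma function $\Gamma(\cdot)$ and the dimension-dependent constant $c_{n,m}$ as specified in [32].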
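To get a feeling for how the codebook size scales with the number of antennas, the following sketch evaluates the well-known high-resolution RVQ approximation for the special case of one-dimensional subspaces ($m = 1$), $\Gamma\left(1 + \tfrac{1}{n-1}\right) 2^{-b/(n-1)}$, for which the dimension-dependent constant equals one. The restriction to $m = 1$, the target distortion and the function names are assumptions made for this illustration.

```python
import math

def rvq_distortion_lines(n, b):
    """High-resolution RVQ distortion for m = 1: Gamma(1 + 1/(n-1)) * 2^(-b/(n-1))."""
    t = n - 1
    return math.gamma(1.0 + 1.0 / t) * 2.0 ** (-b / t)

def bits_for_target(n, target):
    """Smallest number of bits whose approximate distortion does not exceed the target."""
    b = 0
    while rvq_distortion_lines(n, b) > target:
        b += 1
    return b

# Hypothetical target distortion; the required bits grow roughly linearly in n,
# i.e., the codebook size grows exponentially with the number of antennas
for n in (4, 8, 16, 32):
    print(n, bits_for_target(n, target=0.05))
```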

#### Selective CSI Update

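In a temporally correlated channel, it may not be necessary to update the quantized CSI every time instant, as the channel may not have changed sufficiently. To exploit this, we consider a simple selective CSI update based on the quantization error w.r.t. the previously quantized CSI

$$\hat{\mathbf{U}}[k] = \begin{cases} \hat{\mathbf{U}}[k-1], & d_c^2\left(\mathbf{U}[k], \hat{\mathbf{U}}[k-1]\right) \leq \delta, \\ \arg\min_{\mathbf{Q}_i \in \mathcal{Q}} d_c^2\left(\mathbf{U}[k], \mathbf{Q}_i\right), & \text{otherwise}. \end{cases} \tag{6}$$

Here, the tuning parameter $\delta$ determines the trade-off between the frequency of CSI updates and the achieved average quantization distortion.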
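The selective update can be sketched as a simple threshold test. The sketch assumes that the previously quantized CSI is kept whenever its chordal distance to the current subspace does not exceed a threshold; the name `delta` for the tuning parameter and the helper functions are hypothetical.

```python
import numpy as np

def chordal_distance_sq(U, Q):
    """Normalized squared chordal distance: 1 - ||Q^H U||_F^2 / m."""
    return 1.0 - np.linalg.norm(Q.conj().T @ U, "fro") ** 2 / U.shape[1]

def selective_update(U, U_hat_prev, codebook, delta):
    """Keep the previous quantized CSI if it is still close enough, else re-quantize.

    Returns the quantized CSI and a flag telling whether feedback was sent."""
    if U_hat_prev is not None and chordal_distance_sq(U, U_hat_prev) <= delta:
        return U_hat_prev, False
    dists = [chordal_distance_sq(U, Q) for Q in codebook]
    return codebook[int(np.argmin(dists))], True
```

A larger threshold leads to fewer updates and thus less feedback, at the price of a larger average distortion.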

### III-B Recursive Multi-Stage Quantization

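We have proposed recursive multi-stage quantization in [26] as a means to reduce the quantization complexity in case a large quantization codebook is employed. In this approach, the CSI is recursively quantized in $S$ stages according to

$$\hat{\mathbf{Q}}_s = \arg\min_{\mathbf{Q} \in \mathcal{Q}_s} d_c^2\left(\mathbf{U}_s, \mathbf{Q}\right), \tag{7}$$

$$\mathbf{B}_s = \left(\mathbf{U}_s^{\mathrm{H}} \hat{\mathbf{Q}}_s \hat{\mathbf{Q}}_s^{\mathrm{H}} \mathbf{U}_s\right)^{-1/2}, \tag{8}$$

$$\mathbf{U}_{s+1} = \hat{\mathbf{Q}}_s^{\mathrm{H}}\, \mathbf{U}_s\, \mathbf{B}_s, \tag{9}$$

where $s \in \{1, \ldots, S\}$, $\mathbf{U}_1 = \mathbf{U}[k]$ and the quantized CSI is $\hat{\mathbf{U}}[k] = \hat{\mathbf{Q}}_1 \hat{\mathbf{Q}}_2 \cdots \hat{\mathbf{Q}}_S$. Matrix $\mathbf{B}_s$ is known as subspace quantization based combining (SQBC) matrix and has been derived in [33]. This recursive multi-stage quantizer successively reduces the dimensions of the intermediate quantizer input $\mathbf{U}_s$ until the intended subspace dimension $m$ is reached.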

In each of the $S$ stages of this approach, a Grassmannian quantization problem is solved. Each stage uses a quantization codebook $\mathcal{Q}_s$ with codebook entries of dimension $n_{s-1} \times n_s$, with $n_0 = n$ and $n_S = m$; the difference $n_{s-1} - n_s$ is known as dimension step-size. Compared to single-stage quantization, however, each stage uses a much smaller codebook, since the total number of quantization bits $b$ is distributed amongst the stages, such that $\sum_{s=1}^{S} b_s = b$. In this paper, we apply equal bit allocation amongst stages, even though this is suboptimal in terms of quantization distortion; yet, this choice leads to the smallest total number of codebook entries of the $S$ stages: $\sum_{s=1}^{S} 2^{b_s} = S\, 2^{b/S}$.

Furthermore, we apply a dimension step-size $n_{s-1} - n_s = 1$, so that $S = n - m$ stages are required, as this achieves the lowest quantization complexity via one-dimensional Grassmannian quantization in the orthogonal complement. Specifically, this means that instead of the minimum chordal distance quantization problem in (7), we actually employ a quantization codebook for the one-dimensional orthogonal complements of the elements of $\mathcal{Q}_s$, and find the codebook entry that maximizes the chordal distance. The details of this equivalent, yet less complex quantization problem formulation are explained in [26].
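The recursion can be sketched in a few lines of NumPy. This is only an illustration under several assumptions: random per-stage codebooks, a dimension step-size of one, the direct minimum chordal distance search per stage (rather than the equivalent orthogonal-complement formulation of [26]), and an inverse matrix square root as combining matrix that re-orthonormalizes the projected input. All names and dimensions are hypothetical.

```python
import numpy as np

def random_semi_unitary(n, m, rng):
    G = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
    Q, _ = np.linalg.qr(G)
    return Q

def inv_sqrt_hermitian(A):
    """Inverse square root of a Hermitian positive definite matrix."""
    w, V = np.linalg.eigh(A)
    return (V / np.sqrt(w)) @ V.conj().T

def recursive_quantize(U, codebooks):
    """Quantize span(U) stage by stage; returns the product of the chosen entries."""
    U_s = U
    U_hat = np.eye(U.shape[0], dtype=complex)
    for codebook in codebooks:
        # Largest projected energy is equivalent to smallest chordal distance
        gains = [np.linalg.norm(Q.conj().T @ U_s, "fro") for Q in codebook]
        Q_s = codebook[int(np.argmax(gains))]
        A = Q_s.conj().T @ U_s                         # project into the chosen subspace
        U_s = A @ inv_sqrt_hermitian(A.conj().T @ A)   # re-orthonormalize the input
        U_hat = U_hat @ Q_s
    return U_hat

rng = np.random.default_rng(0)
n, m, bits_per_stage = 6, 2, 4             # hypothetical parameters
dims = list(range(n, m - 1, -1))           # n, n-1, ..., m
codebooks = [
    [random_semi_unitary(dims[s], dims[s + 1], rng) for _ in range(2 ** bits_per_stage)]
    for s in range(len(dims) - 1)
]
U = random_semi_unitary(n, m, rng)
U_hat = recursive_quantize(U, codebooks)
distortion = 1.0 - np.linalg.norm(U_hat.conj().T @ U, "fro") ** 2 / m
print(U_hat.shape, round(float(distortion), 3))
```

Each stage searches only $2^{4} = 16$ entries in this example, whereas a single-stage codebook with the same total of 16 bits would contain $2^{16}$ entries.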

#### Quantization Distortion

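The average normalized chordal distance distortion of the recursive multi-stage quantizer utilizing RVQ in each stage, a dimension step-size of $n_{s-1} - n_s = 1$ and equal bit allocation is

$$\bar{d}_c^2 \approx 1 - \prod_{s=1}^{S}\left(1 - \bar{d}_{c,s}^2\right), \tag{10}$$

$$\bar{d}_{c,s}^2 \approx \frac{\Gamma\left(\frac{1}{m}\right)}{m^2} \left(c_s\, 2^{b/S}\right)^{-\frac{1}{m}}. \tag{11}$$

Here, $\bar{d}_{c,s}^2$ denotes the average normalized chordal distance of the $s$-th stage, which quantizes the $m$-dimensional subspace $\mathrm{span}(\mathbf{U}_s)$ by the $n_s$-dimensional subspace $\mathrm{span}(\hat{\mathbf{Q}}_s)$. The dimension-dependent constant $c_s$ is provided in [32].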
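For one-dimensional subspaces ($m = 1$), the distortion of a multi-stage quantizer with step-size one can be evaluated in closed form under simple assumptions, which the following sketch illustrates. It assumes independent isotropic random codebooks per stage, so that the squared projection of the stage input onto a random one-dimensional orthogonal complement is Beta$(1, n_{s-1}-1)$ distributed and the minimum over $2^{b_s}$ entries has mean $1/(2^{b_s}(n_{s-1}-1)+1)$; it further assumes that the stage errors are independent and combine multiplicatively. These assumptions and the function names are ours and are not taken from [26] or [32].

```python
def stage_distortion_lines(n_in, bits):
    """Mean of the minimum of 2^bits i.i.d. Beta(1, n_in - 1) variables."""
    return 1.0 / (2 ** bits * (n_in - 1) + 1)

def multistage_distortion_lines(n, bits_per_stage):
    """Combine the stages n -> n-1 -> ... -> 1 multiplicatively."""
    keep = 1.0
    for n_in in range(n, 1, -1):  # stage input dimensions n, n-1, ..., 2
        keep *= 1.0 - stage_distortion_lines(n_in, bits_per_stage)
    return 1.0 - keep

# Example: 31 stages with 6 bits each for 32-dimensional vectors
print(round(multistage_distortion_lines(32, 6), 4))
```

The per-stage distortion grows as the stage input dimension shrinks, so the late stages dominate the total distortion.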

#### Selective Stage Update

Similarly to single-stage quantization, we can also adopt a selective CSI update in multi-stage quantization. Yet, here we have the additional degree of freedom to only update a subset of the stages of the quantizer, based on the currently achieved CSI quality. Specifically, fixing the first $s$ stages of the quantizer to the previously quantized matrices $\hat{\mathbf{Q}}_1[k-1], \ldots, \hat{\mathbf{Q}}_s[k-1]$, the quantizer input matrix $\mathbf{U}_{s+1}$ of the $(s+1)$-th stage can be calculated recursively by replacing $\hat{\mathbf{Q}}_j$ with $\hat{\mathbf{Q}}_j[k-1]$, $j \leq s$, in Eq. (9), and then the quantizer proceeds as usual to update the remaining stages.

Table I: Adopted DNN structure.

| Input | 1st Layer | 2nd Layer | Output |
|---|---|---|---|
| | fully connected | fully connected | class output |
| | ReLU with dropout | soft-max | cross-entropy loss |

To decide how many stages the quantizer should update, we propose an approach similar to (6). If the distortion with respect to the previously quantized CSI is still acceptable, we do not update any stage. Otherwise, we calculate the expected distortion under the assumption that the first $s$ stages are not updated, and determine the largest $s$ (smallest number of CSI updates) that achieves an acceptable distortion

(12) | |||

(13) |

The two tuning parameters effectively define a hysteresis for the acceptable CSI quality and thereby determine the frequency of the stage updates.
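The decision logic described above can be sketched as follows. The sketch assumes that the expected distortion for keeping the first $s$ stages has already been computed for every $s$; how these predictions are obtained is not reproduced here. The two threshold names are hypothetical stand-ins for the two tuning parameters.

```python
def stages_to_keep(current_distortion, predicted_distortion, delta_keep, delta_target):
    """Number of leading stages that are left unchanged.

    predicted_distortion[s] is the expected distortion if the first s stages
    are kept and the remaining ones are updated, for s = 0, ..., S-1."""
    num_stages = len(predicted_distortion)
    if current_distortion <= delta_keep:
        return num_stages            # CSI still acceptable: no stage is updated
    for s in range(num_stages - 1, -1, -1):
        if predicted_distortion[s] <= delta_target:
            return s                 # largest s with acceptable expected distortion
    return 0                         # fall back to updating all stages

# Hypothetical numbers for a three-stage quantizer
print(stages_to_keep(0.10, [0.05, 0.06, 0.20], delta_keep=0.08, delta_target=0.07))
```

Choosing the update trigger above the post-update target creates the hysteresis: after an update the distortion can drift for some time before the next update is triggered.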

### III-C Deep Learning Classification

The chordal distance quantization problem (3) is essentially a classification problem and can as such, in principle, be handled by neural network structures. The problem is that the number of classes is often too large, such that a DNN does not achieve sufficient classification accuracy.

Consider, for example, a system with antennas. If we intend to achieve an average normalized chordal distance distortion of , we have to employ a single-stage quantization codebook of 34 bits, which gives an intractable number of quantization classes. In contrast, for the multi-stage quantizer, we achieve the same accuracy with 7 bits per stage, i.e., a codebook size of 128 entries per stage, which is a number that a neural network can handle. However, the total feedback overhead is increased to 7 bits per stage times 6 stages equals 42 bits. Hence, for the single-stage approach (3), the quantization problem is essentially intractable, whereas with multi-stage quantization (7) the number of classes per stage is in fact so small that we can even adopt DNN classification.
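The numbers in this example can be checked with a few lines of arithmetic; only the bit counts stated above are used.

```python
single_stage_bits = 34
bits_per_stage, num_stages = 7, 6

single_stage_classes = 2 ** single_stage_bits   # classes of one single-stage classifier
classes_per_stage = 2 ** bits_per_stage         # classes of each per-stage classifier
total_entries = num_stages * classes_per_stage  # codebook entries over all stages
total_bits = num_stages * bits_per_stage        # feedback bits if all stages are updated

print(single_stage_classes)  # 17179869184
print(classes_per_stage, total_entries, total_bits)  # 128 768 42
```

The price of the small per-stage class count is thus 8 additional feedback bits compared to the single-stage codebook.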

As we consider Grassmannian quantization, the DNN classification outcome should be unaffected by right-multiplication of the quantizer input $\mathbf{U}_s$ by an arbitrary unitary matrix. To exploit this invariance, we apply a phase-rotation to the individual columns of $\mathbf{U}_s$, such that the first row contains only real numbers, and vectorize the result before feeding it into the neural network.
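A possible implementation of this pre-processing is sketched below. Stacking real and imaginary parts into one real-valued feature vector is our assumption, as the exact input format of the network is not specified here. Note that a per-column phase rotation removes the ambiguity with respect to diagonal unitary matrices, which covers the full invariance for one-dimensional subspaces.

```python
import numpy as np

def dnn_input_features(U):
    """Rotate each column so that its first entry is real and non-negative,
    then vectorize column-wise and stack real and imaginary parts."""
    phases = np.exp(-1j * np.angle(U[0, :]))
    U_rot = U * phases                      # broadcasts one phase per column
    v = U_rot.reshape(-1, order="F")        # column-wise vectorization
    return np.concatenate([v.real, v.imag])

rng = np.random.default_rng(0)
U = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
D = np.diag(np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=2)))
# Column-wise phase changes do not alter the features
print(np.allclose(dnn_input_features(U), dnn_input_features(U @ D)))
```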

We summarize the adopted DNN structure in Table I. We have investigated DNN structures of varying width and depth. As can be seen from Table I, we employ a relatively shallow neural network with a wider first layer. Going deeper did not achieve more accurate classification. The adopted structure provides a good trade-off between complexity and achieved classification accuracy.
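The structure of Table I can be written down as a plain NumPy forward pass, as sketched below. The layer widths, the initialization and the dropout rate are hypothetical placeholders; the network is untrained, so the snippet only illustrates the data flow from input features to class probabilities and the cross-entropy loss.

```python
import numpy as np

def init_params(num_inputs, hidden_width, num_classes, rng):
    return {
        "W1": rng.standard_normal((num_inputs, hidden_width)) * np.sqrt(2.0 / num_inputs),
        "b1": np.zeros(hidden_width),
        "W2": rng.standard_normal((hidden_width, num_classes)) * np.sqrt(1.0 / hidden_width),
        "b2": np.zeros(num_classes),
    }

def forward(params, x, dropout_rate=0.0, rng=None):
    """Fully connected + ReLU (with optional dropout), fully connected + soft-max."""
    h = np.maximum(x @ params["W1"] + params["b1"], 0.0)
    if rng is not None and dropout_rate > 0.0:
        mask = rng.random(h.shape) >= dropout_rate    # dropout is used in training only
        h = h * mask / (1.0 - dropout_rate)
    logits = h @ params["W2"] + params["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerically stable soft-max
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))

rng = np.random.default_rng(0)
# Hypothetical sizes: 16 real input features, 128 classes (7 bits per stage)
params = init_params(num_inputs=16, hidden_width=256, num_classes=128, rng=rng)
x = rng.standard_normal((4, 16))
labels = np.array([3, 17, 42, 99])
probs = forward(params, x)
loss = cross_entropy(probs, labels)
predicted_index = probs.argmax(axis=1)  # codebook index fed back per stage
```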

## IV Simulations

In our first simulation, we compare recursive multi-stage quantization to differential and predictive Grassmannian quantization, utilizing the same simulation setup and quantizers as in [22]. Specifically, the channel matrices of dimension are from a Rayleigh fading distribution and are temporally correlated according to Clarke’s Doppler spectrum [27] by employing the sum-of-sinusoids approach of [34]. The predictive and differential quantizers, as proposed in [21], provide 6 bits of feedback per time instant. For comparison, we also show the performance of memoryless single-stage quantization (without selective CSI update) for 9, 26, 39 and 53 bits of feedback, respectively. For the recursive multi-stage quantizer, we have selected the number of quantization bits per stage to achieve the same average performance as the differential quantizer. The average feedback overhead of the recursive quantizer with selective stage update is written next to the simulated data points in Fig. 1.

We observe in Fig. 1 that at low Doppler frequencies the differential and predictive quantizers are more effective in exploiting temporal correlation than the recursive quantizer with selective stage update, as they require only 6 bits of feedback. However, they involve on-the-fly adaptation of the entire quantization codebook at each time instant, which is hard to realize in practice. At moderate to higher Doppler frequencies all three approaches achieve very similar performance. In terms of complexity, though, the recursive quantizer is a lot more tractable, as it does not require any codebook adaptations.

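Table II: Normalized chordal distance distortion of the individual stages, for DNN classification and for exhaustive search.

| Stage | 1 | 3 | 5 | 7 | 9 | 11 | 13 | 15 | 17 | 19 | 21 | 23 | 25 | 27 | 29 | 31 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Dimension | $32 \times 1$ | $30 \times 1$ | $28 \times 1$ | $26 \times 1$ | $24 \times 1$ | $22 \times 1$ | $20 \times 1$ | $18 \times 1$ | $16 \times 1$ | $14 \times 1$ | $12 \times 1$ | $10 \times 1$ | $8 \times 1$ | $6 \times 1$ | $4 \times 1$ | $2 \times 1$ |
| Distortion (DNN) | 5.2 e | 5.6 e | 5.9 e | 6.4 e | 7.0 e | 7.6 e | 8.4 e | 9.4 e | 1.1 e | 1.2 e | 1.4 e | 1.8 e | 2.3 e | 3.2 e | 5.4 e | 13.2 e |
| Distortion (exhaustive search) | 5.0 e | 5.4 e | 5.8 e | 6.2 e | 6.8 e | 7.4 e | 8.2 e | 9.2 e | 1.0 e | 1.1 e | 1.3 e | 1.7 e | 2.2 e | 3.1 e | 5.1 e | 13.1 e |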

In Fig. 2, we show an exemplary trace of the instantaneous chordal distance quantization error of recursive, differential and predictive quantization for a normalized Doppler frequency of . We observe that the recursive quantizer, in contrast to the other two, has no convergence phase and exhibits relatively predictable quantization performance, which is basically dictated by the hysteresis parameters of the selective stage update. In this example, we have selected and . The black-dotted line in the figure corresponds to the maximally acceptable distortion , which determines when a stage update takes place. For predictive and differential quantization, the performance shows larger variations over time, with increasing distortion whenever the quantizer cannot keep up with the temporal subspace variation of the channel.

In our second simulation, we consider quantization on and employ a Gauss-Markov channel model to generate the temporally correlated channel vectors according to , with i.i.d. and . We select the codebook sizes to achieve an average chordal distance quantization distortion of . For single-stage quantization this requires a codebook size of 125 bits. Obviously, we cannot implement a codebook that is this large; the results of single-stage quantization are therefore based on the theoretic performance investigations of RVQ provided in [32]. For recursive quantization, we employ 31 stages and use a codebook size of 6 bits per stage, i.e., a total feedback overhead of 186 bits when all stages are updated. The individual stages of the recursive quantizer are realized by the classification neural network explained in Section III-C. We summarize the normalized chordal distance distortion of the network in Table II. The classification accuracy of the individual stages of the quantizer lies at approximately 90%; however, the incurred average chordal distance distortion penalty compared to an exhaustive search is negligible.
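A first-order Gauss-Markov process is commonly generated with the recursion $\mathbf{h}[k] = \rho\, \mathbf{h}[k-1] + \sqrt{1-\rho^2}\, \mathbf{w}[k]$ with i.i.d. unit-variance complex Gaussian innovations. The sketch below uses this standard form as an assumption; the correlation coefficient `rho` and its relation to the normalized Doppler frequency are not taken from this paper and the chosen value is purely illustrative.

```python
import numpy as np

def gauss_markov_channel(n, num_steps, rho, rng):
    """Generate num_steps temporally correlated n-dimensional channel vectors."""
    scale = np.sqrt(1.0 - rho ** 2)
    h = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    out = np.empty((num_steps, n), dtype=complex)
    for k in range(num_steps):
        out[k] = h
        w = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
        h = rho * h + scale * w   # keeps unit variance per entry
    return out

rng = np.random.default_rng(0)
channels = gauss_markov_channel(n=32, num_steps=1000, rho=0.95, rng=rng)

# Normalized vectors represent the one-dimensional subspaces to be quantized
subspaces = channels / np.linalg.norm(channels, axis=1, keepdims=True)
print(subspaces.shape)
```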

In Fig. 3, we show the average number of quantization bits of single- and multi-stage quantization with selective CSI/stage update to achieve an average distortion of as a function of the normalized Doppler frequency. We observe that at low Doppler frequencies the recursive multi-stage quantizer requires less feedback overhead than the single-stage quantizer, as it can selectively update just a subset of the stages of the quantizer. This behavior is investigated in more detail in Fig. 4, where we plot the relative frequency of the number of updated stages of the quantizer. We can see that for small Doppler frequencies in most cases no or just few of the later stages of the quantizer are updated, whereas for high Doppler frequencies almost all stages have to be updated every time.

## V Conclusion

In this paper, we have extended recursive Grassmannian multi-stage quantization to exploit temporal channel correlations by selectively updating the individual stages of the quantizer, depending on the achieved distortion. We have shown that this approach performs similarly to differential/predictive Grassmannian quantization at moderate to high Doppler frequencies, yet without requiring complicated quantization codebook adaptations. We have furthermore shown that multi-stage quantization can effectively be combined with neural network structures, as the number of classes per stage is sufficiently small to enable accurate DNN classification.

### References

1. D. Love, R. Heath, Jr., V. Lau, D. Gesbert, B. Rao, and M. Andrews, “An overview of limited feedback in wireless communication systems,” IEEE Journal on Selected Areas in Communications, vol. 26, no. 8, Oct. 2008.
2. J. Choi, Z. Chance, D. Love, and U. Madhow, “Noncoherent trellis coded quantization: A practical limited feedback technique for massive MIMO systems,” IEEE Transactions on Communications, vol. 61, no. 12, pp. 5016–5029, December 2013.
3. J. Park and R. W. Heath, “Multiple-antenna transmission with limited feedback in device-to-device networks,” IEEE Wireless Communications Letters, vol. 5, no. 2, pp. 200–203, 2016.
4. G. Kwon and H. Park, “Limited feedback hybrid beamforming for multi-mode transmission in wideband millimeter wave channel,” IEEE Transactions on Wireless Communications, vol. 19, no. 6, pp. 4008–4022, 2020.
5. A. Medra and T. N. Davidson, “Incremental Grassmannian feedback schemes for multi-user MIMO systems,” IEEE Transactions on Signal Processing, vol. 63, no. 5, pp. 1130–1143, 2015.
6. A. Decurninge and M. Guillaud, “Cube-split: Structured quantizers on the Grassmannian of lines,” in IEEE Wireless Communications and Networking Conference, pp. 1–6, March 2017.
7. V. V. Ratnam, A. F. Molisch, O. Y. Bursalioglu, and H. C. Papadopoulos, “Hybrid beamforming with selection for multiuser massive MIMO systems,” IEEE Transactions on Signal Processing, vol. 66, no. 15, pp. 4105–4120, 2018.
8. S. Schwarz, M. Rupp, and S. Wesemann, “Grassmannian product codebooks for limited feedback massive MIMO with two-tier precoding,” IEEE Journal of Selected Topics in Signal Processing, vol. 13, no. 5, pp. 1119–1135, Sep. 2019.
9. I. S. Dhillon, R. Heath, Jr., T. Strohmer, and J. A. Tropp, “Constructing packings in Grassmannian manifolds via alternating projection,” ArXiv e-prints, Sept. 2007.
10. H. E. A. Laue and W. P. du Plessis, “A coherence-based algorithm for optimizing rank-1 Grassmannian codebooks,” IEEE Signal Processing Letters, vol. 24, no. 6, pp. 823–827, 2017.
11. B. Tahir, S. Schwarz, and M. Rupp, “Constructing Grassmannian frames by an iterative collision-based packing,” IEEE Signal Processing Letters, vol. 26, no. 7, pp. 1056–1060, July 2019.
12. X. Luo, P. Cai, X. Zhang, D. Hu, and C. Shen, “A scalable framework for CSI feedback in FDD massive MIMO via DL path aligning,” IEEE Trans. on Signal Processing, vol. 65, no. 18, pp. 4702–4716, Sep. 2017.
13. H. Xie, F. Gao, S. Zhang, and S. Jin, “A unified transmission strategy for TDD/FDD massive MIMO systems with spatial basis expansion model,” IEEE Transactions on Vehicular Technology, vol. 66, no. 4, pp. 3170–3184, April 2017.
14. S. Schwarz, “Robust full-dimension MIMO transmission based on limited feedback angular-domain CSIT,” EURASIP Journal on Wireless Communications and Networking, vol. 2018, no. 1, pp. 1–20, Mar 2018.
15. T. Wang, C. Wen, S. Jin, and G. Y. Li, “Deep learning-based CSI feedback approach for time-varying massive MIMO channels,” IEEE Wireless Communications Letters, vol. 8, no. 2, pp. 416–419, April 2019.
16. Z. Liu, L. Zhang, and Z. Ding, “Exploiting bi-directional channel reciprocity in deep learning for low rate massive MIMO CSI feedback,” IEEE Wireless Communications Letters, vol. 8, no. 3, pp. 889–892, 2019.
17. J. Jang, H. Lee, S. Hwang, H. Ren, and I. Lee, “Deep learning-based limited feedback designs for MIMO systems,” IEEE Wireless Communications Letters, vol. 9, no. 4, pp. 558–561, 2020.
18. T. Inoue and R. Heath, Jr., “Grassmannian predictive coding for delayed limited feedback MIMO systems,” in 47th Annual Allerton Conference on Communication, Control, and Computing, Oct. 2009.
19. D. Sacristan-Murga and A. Pascual-Iserte, “Differential feedback of MIMO channel Gram matrices based on geodesic curves,” IEEE Trans. on Wireless Communications, vol. 9, no. 12, pp. 3714–3727, Dec. 2010.
20. O. El Ayach and R. Heath, Jr., “Grassmannian differential limited feedback for interference alignment,” IEEE Transactions on Signal Processing, vol. 60, no. 12, pp. 6481–6494, Dec 2012.
21. S. Schwarz, R. Heath, Jr., and M. Rupp, “Adaptive quantization on the Grassmann-manifold for limited feedback multi-user MIMO systems,” in 38th International Conference on Acoustics, Speech and Signal Processing, pp. 5021–5025, Vancouver, Canada, May 2013.
22. S. Schwarz and M. Rupp, “Predictive quantization on the Stiefel manifold,” IEEE Signal Processing Letters, vol. 22, no. 2, pp. 234–238, 2015.
23. Y. Ge, Z. Zeng, T. Zhang, and Y. Liu, “Spatio-temporal correlated channel feedback for massive MIMO systems,” in IEEE/CIC International Conference on Communications in China, pp. 1–5, 2018.
24. C. Lu, W. Xu, H. Shen, J. Zhu, and K. Wang, “MIMO channel information feedback using deep recurrent network,” IEEE Communications Letters, vol. 23, no. 1, pp. 188–191, 2019.
25. X. Li and H. Wu, “Spatio-temporal representation with deep neural recurrent network in MIMO CSI feedback,” IEEE Wireless Communications Letters, vol. 9, no. 5, pp. 653–657, 2020.
26. S. Schwarz and M. Rupp, “Reduced complexity recursive Grassmannian quantization,” IEEE Signal Processing Letters, vol. 27, pp. 321–325, 2020.
27. R. H. Clarke, “A statistical theory of mobile radio reception,” Bell System Technical Journal, vol. 47, pp. 957–1000, 1968.
28. N. Jindal, “MIMO broadcast channels with finite-rate feedback,” IEEE Transactions on Information Theory, vol. 52, no. 11, p. 5, Nov. 2006.
29. N. Ravindran and N. Jindal, “Limited feedback-based block diagonalization for the MIMO broadcast channel,” IEEE Journal on Selected Areas in Communications, vol. 26, no. 8, pp. 1473–1482, Oct. 2008.
30. M. Rezaee and M. Guillaud, “Limited feedback for interference alignment in the K-user MIMO interference channel,” in Proc. Information Theory Workshop, pp. 1–5, Lausanne, Switzerland, September 2012.
31. R. Krishnamachari and M. Varanasi, “Interference alignment under limited feedback for MIMO interference channels,” IEEE Transactions on Signal Processing, vol. 61, no. 15, pp. 3908–3917, Aug 2013.
32. W. Dai, Y. Liu, and B. Rider, “Quantization bounds on Grassmann manifolds and applications to MIMO communications,” IEEE Trans. on Information Theory, vol. 54, no. 3, pp. 1108–1123, March 2008.
33. S. Schwarz and M. Rupp, “Subspace quantization based combining for limited feedback block-diagonalization,” IEEE Transactions on Wireless Communications, vol. 12, no. 11, pp. 5868–5879, 2013.
34. Y. R. Zheng and C. Xiao, “Simulation models with correct statistical properties for Rayleigh fading channels,” IEEE Transactions on Communications, vol. 51, no. 6, pp. 920–928, June 2003.