Deep Learning Strategies For Joint Channel Estimation and Hybrid Beamforming in Multi-Carrier mm-Wave Massive MIMO Systems
Hybrid analog and digital beamforming transceivers are instrumental in addressing the challenge of expensive hardware and high training overheads in next-generation millimeter-wave (mm-Wave) massive multiple-input multiple-output (MIMO) systems. However, the lack of fully digital beamforming in hybrid architectures and short coherence times at mm-Wave impose additional constraints on channel estimation. Prior works addressing these challenges have focused largely on narrowband channels, wherein optimization-based or greedy algorithms were employed to derive hybrid beamformers. In this paper, we introduce a deep learning (DL) approach for joint channel estimation and hybrid beamforming for frequency-selective, wideband mm-Wave systems. In particular, we consider a massive MIMO Orthogonal Frequency Division Multiplexing (MIMO-OFDM) system and propose three different DL frameworks comprising convolutional neural networks (CNNs), which accept the received pilot signal as input and yield the hybrid beamformers at the output. Numerical experiments demonstrate that, compared to the current state-of-the-art optimization and DL methods, our approach provides higher spectral efficiency, lower computational cost, and higher tolerance against deviations in the received pilot data, corrupted channel matrix, and propagation environment.
Conventional cellular communication systems suffer from spectrum shortage while the demand for wider bandwidth and higher data rates is continuously increasing [mimoOverview]. In this context, the millimeter-wave (mm-Wave) band is a preferred candidate for fifth-generation (5G) communications technology because it provides higher data rates and wider bandwidth [mimoOverview, mishra2019toward, 5GwhatWillItBe, hodge2019reconfigurable, ayyar2019robust]. Compared to sub-6 GHz transmissions envisaged in 5G, mm-Wave signals encounter a more complex propagation environment that is characterized by higher scattering, severe penetration losses, lower diffraction, and higher path loss for fixed transmitter and receiver gains [mimoHybridLeus1, mimoHybridLeus2]. The mm-Wave systems leverage massive antenna arrays, usually in a multiple-input multiple-output (MIMO) configuration, to achieve array and multiplexing gain, and thereby compensate for the propagation losses at high frequencies [mimoRHeath].
However, such a large array requires a dedicated radio-frequency (RF) chain for each antenna resulting in an expensive system architecture and high power consumption. In order to address this, hybrid analog and baseband beamforming architectures have been introduced, wherein a small number of phase-only analog beamformers are employed to steer the beams. The down-converted signal is then processed by baseband beamformers, each of which is dedicated to a single RF chain [mimoHybridLeus1, mimoHybridLeus2, mimoRHeath, mimoScalingUp]. This combination of high-dimensional phase-only analog and low-dimensional baseband digital beamformers significantly reduces the number of RF chains while also maintaining sufficient beamforming gain [mmwaveKeyElements, mimoRHeath].
However, the lack of fully digital beamforming in hybrid architectures poses challenges in mm-Wave channel estimation [channelEstLargeArrays, channelEstLargeArrays2, channelEstimation1, channelEstimation1CS, channelModelSparseBajwa, channelModelSparseSayeed]. The instantaneous channel state information (CSI) is essential for massive MIMO communications because precoding at downlink or decoding at uplink transmission requires highly accurate CSI to achieve spatial diversity and multiplexing gain [mimoHybridLeus1, mimoHybridLeus2]. In practice, pilot signals are periodically transmitted and the received signals are processed to estimate the CSI [channelEstLargeArrays2]. Further, mm-Wave environments such as indoor and vehicular communications are highly variable with short coherence times [coherenceTimeRef], which necessitates channel estimation algorithms that are robust to deviations in the channel data. Once the CSI is obtained, the hybrid analog and baseband beamformers are designed using either the instantaneous channel matrix or the channel covariance matrix (CCM). Beamforming based on the latter provides lower spectral efficiency [widebandHBWithoutInsFeedback] because the CCM does not reflect the instantaneous profile of the channel. Hence, it is more common to utilize the channel matrix for hybrid beamforming [mimoHybridLeus3, hybridBFAltMin, hybridBFLowRes, sohrabiOFDM].
In recent years, several techniques have been proposed to design the hybrid precoders in mm-Wave MIMO systems. Initial works have focused on narrow-band channels [mimoHybridLeus1, mimoHybridLeus2, mimoHybridLeus3, mimoRHeath, hybridBFLowRes]. However, to effectively utilize the mm-Wave MIMO architectures with relatively larger bandwidth, there are recent and concerted efforts toward developing broadband hybrid beamforming techniques. The key challenge in hybrid beamforming for a broadband frequency-selective channel is designing a common analog beamformer that is shared across all subcarriers, while the digital (baseband) beamformer weights need to be specific to each subcarrier. This difference between the hybrid beamforming design for frequency-selective channels and that for the flat-fading case is the primary motivation for considering hybrid beamforming for orthogonal frequency division multiplexing (OFDM) modulation. The optimal beamforming vector in a frequency-selective channel depends on the frequency, i.e., a subcarrier in OFDM, but the analog beamformer in any of the narrow-band hybrid structures cannot vary with frequency. Thus, a common analog beamformer must be designed to account for its impact on all subcarriers, which makes hybrid precoding more difficult than in the narrow-band case.
Among prior works, [widebandChannelEst1, widebandChannelEst2] consider channel estimation for wideband mm-Wave massive MIMO systems. The hybrid beamforming design was investigated in [alkhateeb2016frequencySelective, sohrabiOFDM, widebandHBWithoutInsFeedback, widebandMLbased], where OFDM-based frequency-selective structures are designed. In particular, [alkhateeb2016frequencySelective] proposes a Gram-Schmidt orthogonalization based approach for hybrid beamforming (GS-HB) under the assumption of perfect CSI; GS-HB selects the precoders from a finite codebook that is obtained from the instantaneous channel data. Using the same assumption on CSI, [sohrabiOFDM] proposed a phase extraction approach for hybrid precoder design. In [zhu2016novel], a unified analog beamformer is designed based on the second-order spatial channel covariance matrix of a wideband channel. In [zhang2016low], the Eckart-Young-Mirsky matrix approximation is employed to find the wideband beamforming matrices that have the minimum Euclidean distance from the optimal solutions. In [lee2014matrix], the wideband beamformer design is cast as a search for a common basis matrix for the subspaces spanned by all subcarriers' channel matrices, and the higher order singular value decomposition (HOSVD) method is applied. In [chen2018hybrid], antenna selection is also introduced to wideband hybrid beamforming. It exploits the asymptotic orthogonality of array steering vectors and proposes two angular-information-based beamforming schemes to relax the assumption of full CSI at the transmitter such that knowledge of only the angles of departure is required.
Nearly all of the aforementioned methods strongly rely on perfect CSI knowledge. This is impractical given the highly dynamic nature of the mm-Wave channel [coherenceTimeRef]. To relax this dependence and obtain robust performance against the imperfections in the estimated channel matrix, we examine a deep learning (DL) approach. DL is capable of uncovering complex relationships in data/signals and, thus, can achieve better performance. This has been demonstrated in several successful applications of DL in wireless communications problems such as channel estimation [mimoDLChannelEstimation, deepCNN_ChannelEstimation], analog beam selection [mimoDLHybrid, hodge2019multi], and also hybrid beamforming [mimoDLHybrid, mimoDLChannelModelBeamformingFacebook, mimoDeepPrecoderDesign, elbirDL_COMML, elbirQuantizedCNN2019, elbirHybrid_multiuser]. In particular, DL-based techniques have been shown [deepCNN_ChannelEstimation, deepLearningCommOverAir, elbirIETRSN2019, elbirQuantizedCNN2019, elbirDL_COMML] to be computationally efficient in searching for optimum beamformers and tolerant to imperfect channel inputs when compared with the conventional methods. However, these works investigated only narrow-band channels [mimoDeepPrecoderDesign, mimoDLChannelModelBeamformingFacebook, elbirDL_COMML, elbirQuantizedCNN2019]. The DL-based design of hybrid precoders for broadband mm-Wave massive MIMO systems, despite its high practical importance, remains unexamined so far.
In this paper, we propose a DL-based joint channel estimation and hybrid beamformer design for wideband mm-Wave systems. The proposed framework constructs a non-linear mapping between the received pilot signals and the hybrid beamformers. In particular, we employ convolutional neural networks (CNNs) in three different DL structures. In the first framework (F1), a single CNN maps the received pilot signals directly to the hybrid beamformers. In the second (F2) and third (F3) frameworks, we employ multiple CNNs to also estimate the channel separately. In F2, entire subcarrier data are fed to a single CNN for channel estimation. This is a less complex architecture but it does not allow flexibility of controlling each channel individually. Therefore, we tune the performance of F2 in F3, which has a dedicated CNN for each subcarrier.
The proposed DL framework operates in two stages: offline training and online prediction. During training, several received pilot signals and channel realizations are generated, and the hybrid beamforming problem is solved via the manifold optimization (MO) approach [hybridBFAltMin, manopt] to obtain the network labels. In the prediction stage, when the CNNs operate in real time, the channel matrix and the hybrid beamformers are estimated by simply feeding the CNNs with the received pilot data. The proposed approach is advantageous because it does not require perfect channel data in the prediction stage, yet it provides robust performance. Moreover, our CNN structure takes less computational time to produce hybrid beamformers than the conventional approaches.
The rest of the paper is organized as follows. In the following section, we introduce the system model for wideband mm-Wave channel. We formulate the joint channel estimation and beamforming problem in Section III. We then present our approaches toward both of these problems in Sections IV and V, respectively. We introduce our various DL frameworks in Section VI and follow it with numerical simulations in Section VII. We conclude in Section VIII.
Throughout this paper, we denote the vectors and matrices by boldface lower and upper case symbols, respectively. In case of a vector , represents its th element. For a matrix , and denote the th column and the th entry, respectively. The is the identity matrix of size ; denotes the statistical expectation; denotes the rank of its matrix argument; is the Frobenius norm; denotes the Moore-Penrose pseudo-inverse; and denotes the angle of a complex scalar/vector. The notation expressing a convolutional layer with filters/channels of size , is given by @.
II System Model
We consider hybrid precoder design for a frequency selective wideband mm-Wave massive MIMO-OFDM system with subcarriers (Fig. 1). The base station (BS) has antennas and RF chains to transmit data streams. In the downlink, the BS first precodes data symbols at each subcarrier by applying the subcarrier-dependent baseband precoders . Then, the signal is transformed to the time-domain via -point inverse fast Fourier transforms (IFFTs). After adding the cyclic prefix, the transmitter employs a subcarrier-independent RF precoder to form the transmitted signal. Given that consists of analog phase shifters, we assume that the RF precoder has constant equal-norm elements, i.e., . Additionally, we have the power constraint that is enforced by the normalization of baseband precoder where . Thus, the transmit signal is
In mm-Wave transmission, the channel is represented by a geometric model with limited scattering [mimoChannelModel1]. The channel matrix includes the contributions of clusters, each of which has the time delay and scattering paths/rays within the cluster. Hence, each ray in the th cluster has a relative time delay , angle-of-arrival (AOA) , angle-of-departure (AOD) , relative AOA (AOD) shift () between the center of the cluster and each ray [alkhateeb2016frequencySelective], and complex path gain for . Let denote a pulse shaping function for -spaced signaling evaluated at seconds [channelModelSparseSayeed], then the mm-Wave delay- MIMO channel matrix is
where and are the and steering vectors representing the array responses of the receive and transmit antenna arrays respectively. Let be the wavelength for the subcarrier with frequency of . Since the operating frequency is relatively higher than the bandwidth in mm-Wave systems and the subcarrier frequencies are close to each other, (i.e., , ), we use a single operating wavelength where is speed of light and is the central carrier frequency [sohrabiOFDM]. This approximation also allows for a single frequency-independent analog beamformer for each subcarrier. Then, for a uniform linear array (ULA), the array response of the transmit array is
where is the antenna spacing and can be defined in a similar way as for . Using the delay- channel model in (II), the channel matrix at subcarrier is
where is the length of cyclic prefix [channelModelSparseBajwa].
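The array response and the per-subcarrier channel described above can be illustrated with a minimal NumPy sketch. It assumes the standard unit-norm ULA steering vector with half-wavelength spacing and a plain DFT over the delay taps. The function names, the single-ray-per-tap channel, and the angle range are hypothetical simplifications chosen for illustration, and the pulse-shaping function and cluster structure of the full model are omitted.

```python
import numpy as np

def ula_response(num_antennas, angle_rad, spacing_over_wavelength=0.5):
    """Unit-norm array response of a uniform linear array (assumed standard form)."""
    n = np.arange(num_antennas)
    phase = 2j * np.pi * spacing_over_wavelength * n * np.sin(angle_rad)
    return np.exp(phase) / np.sqrt(num_antennas)

def subcarrier_channels(delay_taps, num_subcarriers):
    """Map delay-domain taps of shape (D, N_R, N_T) to per-subcarrier matrices (M, N_R, N_T)."""
    num_taps = delay_taps.shape[0]
    m = np.arange(num_subcarriers)[:, None]
    d = np.arange(num_taps)[None, :]
    dft = np.exp(-2j * np.pi * m * d / num_subcarriers)  # shape (M, D)
    return np.tensordot(dft, delay_taps, axes=(1, 0))

# Illustrative sizes only; these are not the values used in the experiments.
rng = np.random.default_rng(0)
N_T, N_R, D, M = 8, 4, 3, 16
taps = np.stack([
    (rng.standard_normal() + 1j * rng.standard_normal())        # complex path gain
    * np.outer(ula_response(N_R, rng.uniform(-np.pi / 3, np.pi / 3)),
               ula_response(N_T, rng.uniform(-np.pi / 3, np.pi / 3)).conj())
    for _ in range(D)
])
H = subcarrier_channels(taps, M)  # H[m] is the channel matrix at subcarrier m
```

Because a single wavelength is used for all subcarriers, the steering vectors in this sketch do not depend on the subcarrier index; only the DFT phase across delay taps does.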
With the aforementioned block-fading channel model [mmWaveModel1], the received signal at subcarrier is
where represents the average received power and channel matrix and is additive white Gaussian noise (AWGN) vector. The received signal is first processed by the analog combiner . Then, the cyclic prefix is removed from the processed signal and -point FFTs are applied to yield the signal in the frequency domain. Finally, the receiver employs low-dimensional digital combiners . The received and processed signal is obtained as , i.e.,
where the analog combiner has the constraint similar to the RF precoder.
III Problem Formulation
In practice, estimating the channel matrix is a challenging task, especially when a large number of antennas is deployed, as in massive MIMO communications [channelEstLargeArrays, channelEstimation1]. Further, short coherence times of the mm-Wave channel imply that the channel characteristics change rapidly [coherenceTimeRef]. The literature offers several mm-Wave channel estimation techniques [mimoChannelModel2, channelEstimation1CS, channelEstimation1, mimoAngleDomainFaiFai, mimoHybridLeus2]. In our DL framework, the channel estimation is performed by a deep network which accepts the received pilot signals as input and yields the channel matrix estimate at the output layer [deepCNN_ChannelEstimation]. During the pilot transmission process, the transmitter activates only one RF chain to transmit the pilot on a single beam; the receiver meanwhile turns on all RF chains [mimoHybridLeus2]. Hence, unlike other DL-based beamformers [elbirDL_COMML, elbirQuantizedCNN2019, mimoDLChannelModelBeamformingFacebook, mimoDeepPrecoderDesign] that presume knowledge of the channel, our framework exploits DL for both channel matrix approximation and beamforming.
Specifically, we focus on designing hybrid precoders , by maximizing the overall spectral efficiency of the system under power spectral density constraint for each subcarrier. Let be the overall spectral efficiency of the subcarrier . Assuming that the Gaussian symbols are transmitted through the mm-Wave channel [mimoRHeath, mimoHybridLeus1, mimoHybridLeus2, alkhateeb2016frequencySelective], is
where corresponds to the noise term in (II). The hybrid beamformer design is equivalent to the following optimization problem:
where and are the feasible sets for the RF precoder and combiners which obey the unit-norm constraint and
The hybrid beamformer design problem in (III) requires analog and digital beamformers which, in turn, are obtained by exploiting the structure of the channel matrix in mm-Wave channel. Our goal is to recover , , , and for the given received pilot signal. In the following section, we describe the channel estimation and design methodology of hybrid beamformers before introducing learning-based approach.
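The per-subcarrier spectral efficiency used as the design objective can be evaluated numerically as sketched below. The sketch assumes the log-det expression commonly used in the hybrid beamforming literature cited above, with the noise covariance taken after combining. The function name and argument names are hypothetical, and the fully digital SVD beamformers in the usage lines serve only as a sanity check.

```python
import numpy as np

def spectral_efficiency(H, F_rf, F_bb, W_rf, W_bb, rho, noise_var):
    """Spectral efficiency (bits/s/Hz) of one subcarrier under Gaussian signalling.

    Assumed form: log2 det(I + (rho / N_S) * Lambda^{-1} W^H H F F^H H^H W),
    with Lambda = noise_var * W^H W, F = F_rf F_bb and W = W_rf W_bb.
    """
    F = F_rf @ F_bb
    W = W_rf @ W_bb
    n_s = F.shape[1]
    noise_cov = noise_var * (W.conj().T @ W)   # noise covariance after combining
    eff = W.conj().T @ H @ F                   # effective N_S x N_S channel
    inner = np.eye(n_s) + (rho / n_s) * np.linalg.solve(noise_cov, eff @ eff.conj().T)
    _, logdet = np.linalg.slogdet(inner)       # natural log of |det|
    return logdet / np.log(2)

# Sanity check with fully digital SVD beamformers (illustrative sizes only).
rng = np.random.default_rng(1)
H = (rng.standard_normal((4, 8)) + 1j * rng.standard_normal((4, 8))) / np.sqrt(2)
U, s, Vh = np.linalg.svd(H)
n_s = 2
F_opt, W_opt = Vh.conj().T[:, :n_s], U[:, :n_s]
rate = spectral_efficiency(H, F_opt, np.eye(n_s), W_opt, np.eye(n_s),
                           rho=1.0, noise_var=0.1)
```

With the SVD beamformers the effective channel is diagonal, so the result reduces to the sum of log2(1 + rho * s_i^2 / (N_S * noise_var)) over the strongest singular values, which is a convenient check of the implementation.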
IV Channel Estimation
In our work, the DL network estimates the channel from the received pilot signals in the preamble stage. Consider the downlink scenario when the transmitter employs a single RF chain to transmit pilot signals on a single beam where . Then, the receiver activates RF chains to apply for to process the received pilots [deepCNN_ChannelEstimation, mimoHybridLeus2]. Since the number of RF chains in the receiver is limited by (usually less than in a single channel use), a total of combining vectors are employed. Hence, the total channel use in the channel acquisition process is . After processing through combiners, the received pilot signal becomes
where and are and beamformer matrices. The denotes pilot signals and is effective noise matrix, where . The noise corruption of the pilot training data is measured by SNR. Without loss of generality, we assume that and , and , where is the transmit power. Then, the received signal (9) becomes
The initial channel estimate (ICE) is then
We consider as an initial estimate because, later, we improve this approximation with a deep network that maps to .
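As a rough illustration of how an initial estimate can be formed from the combined pilot observations, the sketch below inverts the known pilot beamformer and combiner matrices with pseudo-inverses. This least-squares inversion is an assumption made for illustration and is not necessarily the exact ICE expression used here. The helper names are hypothetical, and the DFT-based pilot matrices follow the choice described later for the simulations.

```python
import numpy as np

def dft_columns(n, k):
    """First k columns of the unitary n x n DFT matrix."""
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx[:k]) / n) / np.sqrt(n)

def initial_channel_estimate(G, W_bar, F_bar):
    """Least-squares inversion of G = W_bar^H H F_bar (+ noise); an assumed ICE form."""
    return np.linalg.pinv(W_bar.conj().T) @ G @ np.linalg.pinv(F_bar)

# Noise-free check with full DFT pilot matrices (illustrative sizes only).
rng = np.random.default_rng(2)
N_R, N_T = 4, 8
H = (rng.standard_normal((N_R, N_T)) + 1j * rng.standard_normal((N_R, N_T))) / np.sqrt(2)
W_bar, F_bar = dft_columns(N_R, N_R), dft_columns(N_T, N_T)
G_clean = W_bar.conj().T @ H @ F_bar
H_ice = initial_channel_estimate(G_clean, W_bar, F_bar)
```

In the noise-free case with full unitary pilot matrices this inversion recovers the channel exactly. With noise or fewer pilot beams it yields only a coarse estimate, which is the reason a refinement network is applied afterwards.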
V Hybrid Beamformer Design For Wideband mm-Wave MIMO Systems
The design problem in (III) requires a joint optimization over several matrices. This approach is computationally complex and even intractable. Instead, a decoupled problem is preferred [mimoRHeath, sohrabiOFDM, elbirQuantizedCNN2019, hybridBFAltMin]. Here, the hybrid precoders are estimated first and then the hybrid combiners are found. Define the mutual information of the mm-Wave channel that can be achieved at the BS through Gaussian signalling as [alkhateeb2016frequencySelective]
The hybrid precoders are then obtained by maximizing the mutual information, i.e.,
We note here that one could approximate the optimization problem in (V) by exploiting the similarity between the hybrid beamformer and the optimal unconstrained beamformer . The latter is obtained from the right singular matrix of the channel matrix [hybridBFAltMin, mimoRHeath]. Let the singular value decomposition of the channel matrix be , where and are the left and the right singular value matrices of the channel matrix, respectively, and is matrix composed of the singular values of in descending order.
By decomposing and as where , the unconstrained precoder is readily obtained as [mimoRHeath]. The hybrid precoder design problem for subcarrier then becomes the minimization of the Euclidean distance between and as
Incorporating all subcarriers in the problem produces
contain the beamformers for all subcarriers.
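The Euclidean-distance formulation above can be made concrete with a short sketch: the unconstrained precoder is taken from the dominant right singular vectors, a common analog precoder with random phases is fixed, and a least-squares baseband precoder is computed per subcarrier. This is only one inner step of an alternating scheme, not the manifold optimization used for the labels. The function names are hypothetical, and the constant-modulus scaling and the power normalization are assumed conventions.

```python
import numpy as np

def unconstrained_precoder(H, n_s):
    """First n_s right singular vectors of the channel matrix."""
    _, _, Vh = np.linalg.svd(H)
    return Vh.conj().T[:, :n_s]

def baseband_for_fixed_analog(F_opt, F_rf):
    """Least-squares baseband precoder for a fixed analog precoder.

    Returns the power-normalized F_bb (||F_rf F_bb||_F^2 = N_S, an assumed
    constraint) and the Frobenius distance to the unconstrained precoder.
    """
    F_bb, *_ = np.linalg.lstsq(F_rf, F_opt, rcond=None)
    F_bb *= np.sqrt(F_opt.shape[1]) / np.linalg.norm(F_rf @ F_bb)
    return F_bb, np.linalg.norm(F_opt - F_rf @ F_bb)

# Illustrative sizes only; random i.i.d. channels stand in for the mm-Wave model.
rng = np.random.default_rng(3)
M, N_R, N_T, N_RF, N_S = 4, 4, 16, 4, 2
H = (rng.standard_normal((M, N_R, N_T)) + 1j * rng.standard_normal((M, N_R, N_T))) / np.sqrt(2)

# Common analog precoder: constant-modulus entries with random phases.
F_rf = np.exp(1j * rng.uniform(0, 2 * np.pi, (N_T, N_RF))) / np.sqrt(N_T)

total_cost = 0.0
for m in range(M):
    F_opt = unconstrained_precoder(H[m], N_S)
    F_bb, residual = baseband_for_fixed_analog(F_opt, F_rf)
    total_cost += residual ** 2   # objective summed over all subcarriers
```

The summed cost shows why the wideband problem is harder than the narrow-band one: a single `F_rf` has to serve every subcarrier, while only `F_bb` can adapt per subcarrier.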
Once the hybrid precoders are designed, the hybrid combiners are realized by minimizing the mean-square-error (MSE), . The combiner-only optimization is
A more efficient form of (V) is due to [mimoRHeath], where a constant term is added to the cost function. Here, denotes the minimum MSE (MMSE) estimator defined as . Then, (V) reduces to the optimization problem
where is the covariance of the array output in (5). The unconstrained combiner in a compact form is then [WoptCombiner],
In (V), the multiplicative term does not depend on or . It therefore has no bearing on the solution and can be ignored. Define
Then, the hybrid combiner design problem becomes
In [manopt], the manifold optimization or “Manopt” algorithm is suggested to effectively solve the optimization problems in (V) and (V). Note that neither of these problems requires a codebook or a set of array responses of the transmit and receive arrays [mimoRHeath]. In fact, the manifold optimization problems for (V) and (V) are initialized at a random point, i.e., beamformers with unit-norm and random phases.
VI Learning-Based Joint Channel Estimation and Hybrid Beamformer Design
We introduce three DL frameworks F1, F2, and F3 (Fig. 2). In all of them, hybrid beamformers are the outputs. The ICE values obtained from the received pilot signal in the preamble stage form the inputs. The F1 architecture is the Multi-Carrier Hybrid Beamforming Network (MC-HBNet). It comprises a single CNN which accepts the ICEs jointly for all subcarriers. The input size is . The ICEs introduce a performance loss if the channel estimates are inaccurate. To address this, F2 employs separate CNNs for channel estimation (Multi-Carrier Channel Estimation Network or MC-CENet) and hybrid beamforming (HBNet). The MC-CENet accepts the ICE of a single subcarrier as input; other subcarriers are fed sequentially, one at a time. So, the training data consists of a single ICE (with input of size ) for each subcarrier. To make the setup even more flexible at the cost of computational simplicity, F3 employs one CNN per subcarrier for estimating the channel. For the th subcarrier, each Single Carrier Channel Estimation Network (SC-CENet, ) feeds into a single HBNet.
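For reference, the unconstrained MMSE combiner that serves as the target of the combiner design can be computed as in the sketch below. It assumes the standard signal model with transmit symbol covariance (1/N_S) I and the closed form given in [mimoRHeath]. The function name and the example sizes are hypothetical.

```python
import numpy as np

def mmse_combiner(H, F, rho, noise_var):
    """Unconstrained MMSE combiner W (N_R x N_S) for y = sqrt(rho) H F s + n.

    Assumed closed form:
    W^H = (1/sqrt(rho)) (F^H H^H H F + (N_S * noise_var / rho) I)^{-1} F^H H^H.
    """
    n_s = F.shape[1]
    HF = H @ F
    gram = HF.conj().T @ HF
    W_h = np.linalg.solve(gram + (n_s * noise_var / rho) * np.eye(n_s),
                          HF.conj().T) / np.sqrt(rho)
    return W_h.conj().T

# Illustrative example with a random channel and a random precoder.
rng = np.random.default_rng(4)
H = (rng.standard_normal((4, 8)) + 1j * rng.standard_normal((4, 8))) / np.sqrt(2)
F = rng.standard_normal((8, 2)) + 1j * rng.standard_normal((8, 2))
F *= np.sqrt(2) / np.linalg.norm(F)          # ||F||_F^2 = N_S
W_mmse = mmse_combiner(H, F, rho=1.0, noise_var=0.1)
```

The hybrid combiner is then sought as the product of an analog and a baseband combiner that approximates `W_mmse` in the weighted sense described above.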
VI-A Input Data
We partition the input ICE data into three components to enrich the input features. In our previous works, similar approaches have provided good features for DL implementations [elbirQuantizedCNN2019, elbirDL_COMML, elbirIETRSN2019, deepCNN_ChannelEstimation]. In particular, we use the real part, the imaginary part, and the absolute value of each entry of the ICEs. The absolute value entry indicates to the DL network that the real and imaginary input feeds are connected. Define the input for MC-HBNet in F1 as . Then, for ICE, the -th entry of the submatrices per subcarrier is for the first “channel” or input matrix of . The second and the third channels are and , respectively. Hence, the size of is . In F2, the input data comprises single subcarrier ICEs. The input for MC-CENet is of size . The input data for each SC-CENet in F3 is the same as in F2. The inputs of HBNet in both F2 and F3 also have the same structure; it is denoted as which is of size , where , and .
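The three-component input described above can be assembled as in the following sketch. The ordering of the channels (absolute value, real part, imaginary part) and the stacking of all subcarriers along the last axis for the multi-carrier network are assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def ice_to_channels(ice):
    """Complex ICE matrix (rows, cols) -> real array (rows, cols, 3).

    Channel order (assumed): absolute value, real part, imaginary part.
    """
    return np.stack([np.abs(ice), ice.real, ice.imag], axis=-1)

def multicarrier_input(ices):
    """ICEs of all subcarriers (M, rows, cols) -> joint input (rows, cols, 3M)."""
    return np.concatenate([ice_to_channels(g) for g in ices], axis=-1)

# Illustrative sizes only.
rng = np.random.default_rng(5)
M, rows, cols = 4, 6, 8
ices = rng.standard_normal((M, rows, cols)) + 1j * rng.standard_normal((M, rows, cols))
x_single = ice_to_channels(ices[0])   # single-subcarrier input, e.g. for a CENet
x_joint = multicarrier_input(ices)    # joint input over all subcarriers
```

The absolute-value channel is redundant in an information-theoretic sense, but it gives the network an explicit feature tying the real and imaginary channels together.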
The hybrid beamformers are the common output for all three frameworks (Fig. 2). We represent the output as the vectorized form of analog beamformers common to all subcarriers and baseband beamformers corresponding to all subcarriers. The output is an real-valued vector
where is a real-valued vector which includes the phases of analog beamformers. The is composed of the baseband beamformers for all subcarriers as where
The output label of MC-CENet in F2 is the channel matrix. Given that MC-CENet is fed by the ICE , the output label for MC-CENet is
which is a real-valued vector of size . The SC-CENet in F3 has similar input and output structures as the MC-CENet but ICEs are fed to each SC-CENet separately.
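The label construction can be sketched as follows. The beamformer label concatenates the phases of the analog beamformers with the real and imaginary parts of the baseband beamformers of every subcarrier, and the channel label stacks the real and imaginary parts of the vectorized channel matrix. The exact ordering within each vector is an assumption, and the function names are hypothetical.

```python
import numpy as np

def beamformer_label(F_rf, W_rf, F_bb_all, W_bb_all):
    """Real-valued label: analog phases, then per-subcarrier baseband entries."""
    parts = [np.angle(F_rf).ravel(), np.angle(W_rf).ravel()]
    for F_bb, W_bb in zip(F_bb_all, W_bb_all):
        for B in (F_bb, W_bb):
            parts += [B.real.ravel(), B.imag.ravel()]
    return np.concatenate(parts)

def channel_label(H):
    """Real-valued label of size 2 * N_R * N_T: [vec(Re H); vec(Im H)]."""
    return np.concatenate([H.real.ravel(), H.imag.ravel()])

# Illustrative sizes only.
rng = np.random.default_rng(6)
N_T, N_R, N_RF, N_S, M = 8, 4, 2, 2, 3
cplx = lambda *shape: rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
F_rf = np.exp(1j * rng.uniform(0, 2 * np.pi, (N_T, N_RF)))
W_rf = np.exp(1j * rng.uniform(0, 2 * np.pi, (N_R, N_RF)))
z_bf = beamformer_label(F_rf, W_rf, cplx(M, N_RF, N_S), cplx(M, N_RF, N_S))
z_ch = channel_label(cplx(N_R, N_T))
```

Representing the analog beamformers by their phases alone keeps the constant-modulus constraint satisfied by construction when the network output is mapped back to beamformers.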
VI-C Network Architectures and Training
We design four deep network architectures (Fig. 3). The MC-HBNet and HBNet have input size of whereas the input for MC-CENet and SC-CENet is . The number of filters and number of units for all layers are shown in Fig. 3. There are dropout layers with a probability after each fully connected layer in each network. We use pooling layers after the first and second convolutional layers only in MC-HBNet and HBNet to reduce the dimension by two. The output layer of each network is a regression layer whose size depends on the application, as discussed earlier. The network parameters are fixed after a hyperparameter tuning process that yields the best performance for the considered scenario [elbirDL_COMML, elbirQuantizedCNN2019, elbirIETRSN2019]. The proposed deep networks are realized and trained in MATLAB on a PC with a single GPU and a 768-core processor. We used the stochastic gradient descent algorithm with momentum 0.9 and updated the network parameters with learning rate and mini-batch size of samples. Then, we reduced the learning rate by a factor of after every 30 epochs. We also applied a stopping criterion in training so that the training ceases when the validation accuracy does not improve in three consecutive epochs. Algorithm 2 summarizes the steps for training data generation.
To train the proposed CNN structures, we realize different scenarios for (see Algorithm 2). For each scenario, we generated a channel matrix and a received pilot signal, and we introduced additive noise to the training data on both the channel matrix and the received pilot signal; the noise levels are defined by SNR and SNR, respectively.
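The learning-rate schedule and the stopping rule described above can be expressed framework-independently as in the sketch below. The initial learning rate and the drop factor in the usage lines are arbitrary placeholder values rather than the values used in the experiments, and the class and function names are hypothetical.

```python
def step_decay(initial_lr, drop_factor, epoch, period=30):
    """Piecewise-constant schedule: multiply the rate by drop_factor every `period` epochs."""
    return initial_lr * drop_factor ** (epoch // period)

class EarlyStopping:
    """Stop when the validation accuracy has not improved for `patience` consecutive epochs."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("-inf")
        self.stale_epochs = 0

    def update(self, val_accuracy):
        """Record one epoch; return True if training should stop."""
        if val_accuracy > self.best:
            self.best = val_accuracy
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience

# Placeholder values for illustration only.
lr_at_epoch_60 = step_decay(initial_lr=0.1, drop_factor=0.5, epoch=60)

stopper = EarlyStopping(patience=3)
history = [0.50, 0.60, 0.60, 0.59, 0.58]
stop_flags = [stopper.update(acc) for acc in history]
```

In the usage lines, the stop flag becomes true only at the fifth epoch, after three consecutive epochs without improvement over the best accuracy of 0.60.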
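The noise injection used when generating training data can be sketched as follows. The sketch assumes that the SNR is defined as the ratio of average signal power to noise power in decibels, which may differ from the definition used here; the function name is hypothetical.

```python
import numpy as np

def add_awgn(x, snr_db, rng):
    """Add circularly symmetric complex Gaussian noise at a target SNR (assumed power-ratio definition)."""
    signal_power = np.mean(np.abs(x) ** 2)
    noise_power = signal_power / 10 ** (snr_db / 10)
    noise = np.sqrt(noise_power / 2) * (rng.standard_normal(x.shape)
                                        + 1j * rng.standard_normal(x.shape))
    return x + noise

# Illustrative check on a long random vector.
rng = np.random.default_rng(7)
clean = rng.standard_normal(100_000) + 1j * rng.standard_normal(100_000)
noisy = add_awgn(clean, snr_db=10.0, rng=rng)
measured_snr_db = 10 * np.log10(np.mean(np.abs(clean) ** 2)
                                / np.mean(np.abs(noisy - clean) ** 2))
```

Applying such a routine separately to the channel matrix and to the received pilot signal, each at its own SNR, produces several corrupted copies per channel realization, which is what makes the trained networks tolerant to imperfect inputs.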
VII Numerical Simulations
We evaluated the performance of the proposed DL frameworks through several experiments. We compared our DL-based hybrid beamforming (hereafter, DLHB) with state-of-the-art hybrid precoding algorithms such as the Gram-Schmidt-orthogonalization-based method (GS-HB) [alkhateeb2016frequencySelective], the phase-extraction-based method (PE-HB) [sohrabiOFDM], and another recent DL-based multilayer perceptron (MLP) method [mimoDeepPrecoderDesign]. As a benchmark, we implemented a fully digital beamformer obtained from the SVD of the channel matrix. We also present the performance of the MO algorithm [hybridBFAltMin] used for the labels of the hybrid beamforming networks. The MO algorithm constitutes a performance yardstick for DLHB, in the sense that the latter cannot perform better than the MO algorithm because the hybrid beamformers used as labels are obtained from MO itself. Finally, we implemented the spatial-frequency CNN (SF-CNN) architecture [deepCNN_ChannelEstimation] that has been proposed recently for wideband mm-Wave channel estimation. We compare the performance of our DL-based channel estimation with SF-CNN using the same parameters.
We followed the training procedure outlined in Section VI with elements, antennas, and RF chains. Throughout the experiments, unless stated otherwise, we use subcarriers at GHz with GHz bandwidth, and clusters with scatterers for all transmit and receive angles that are selected uniformly at random from the interval . We selected and as the first columns of an discrete Fourier transform (DFT) matrix and the first columns of an DFT matrix respectively [deepCNN_ChannelEstimation]. Then, we set and . In the prediction stage, the preamble data are different from the training stage. Instead, we construct from (9) and (11) with a completely different realization of noise corresponding to SNR.
VII-A Spectral efficiency evaluation
Figure 4 shows the spectral efficiency of various algorithms for varying test SNR, given SNR dB. The DLHB techniques, fed with only the received pilot data (i.e., ), outperform GS-HB [alkhateeb2016frequencySelective] and PE-HB [sohrabiOFDM], which utilize the perfect channel matrix to yield hybrid beamformers. Further, the GS-HB algorithm requires the set of array responses of the received paths, which is difficult to obtain in practice. The MO algorithm is used to obtain the labels of the deep networks for hybrid beamforming; hence, the performance of the DL approaches is upper-bounded by that of the MO algorithm. However, note that perfect channel information is required even for the benchmark MO algorithm [hybridBFAltMin]. The gap between the MO algorithm and the DL frameworks is explained by the corruptions in the DL input, which cause deviations from the label data (obtained via MO) at the output regression layer. Note that our DLHB methods improve upon other DL-based techniques such as MLP [mimoDeepPrecoderDesign], which lacks the feature extraction stage provided by convolutional layers in our networks. Among the DL frameworks, F2 and F3 outperform F1 because the channel estimated by MC-CENet and SC-CENet has higher accuracy. In contrast, F1 uses ICEs directly as input and is, therefore, unable to achieve a similar improvement. While F2 and F3 have similar hybrid beamforming performance, F3 is computationally more complex because of the presence of CNNs in the channel estimation stage.
In order to compare the algorithms with the same input channel data, we use the channel matrix estimate obtained from MC-CENet for MO, GS-HB, PE-HB and MLP when SNR dB. Figure 5 shows the spectral efficiency so obtained with respect to SNR, which determines the noise added to the received pilot data. For SNR dB, we note that the non-DL methods perform imperfectly, but their performance is at least similar to that of the true channel matrix case shown in Fig. 4. The DL-based techniques excel in comparison and exhibit higher tolerance against the corrupted channel data corresponding to SNR. F2 and F3 quickly reach the maximum efficiency when SNR is increased to dB. Again, F1 fares poorly because it is directly fed by the ICEs and lacks the channel estimation network.