Millimeter Wave MIMO Channel Estimation using Overlapped Beam Patterns and Rate Adaptation
This paper is concerned with the channel estimation problem in millimeter wave (mmWave) wireless systems with large antenna arrays. By exploiting the inherent sparse nature of the mmWave channel, we first propose a fast channel estimation (FCE) algorithm based on a novel overlapped beam pattern design, which can increase the amount of information carried by each channel measurement and thus reduce the required channel estimation time compared to the existing non-overlapped designs. We develop a maximum likelihood (ML) estimator to optimally extract the path information from the channel measurements. Then, we propose a novel rate-adaptive channel estimation (RACE) algorithm, which can dynamically adjust the number of channel measurements based on the expected probability of estimation error (PEE). The performance of both proposed algorithms is analyzed. For the FCE algorithm, an approximate closed-form expression for the PEE is derived. For the RACE algorithm, a lower bound for the minimum signal energy-to-noise ratio required for a given number of channel measurements is developed based on the Shannon-Hartley theorem. Simulation results show that the FCE algorithm significantly reduces the number of channel estimation measurements compared to the existing algorithms using non-overlapped beam patterns. By adopting the RACE algorithm, we can achieve up to a 6 dB gain in signal energy-to-noise ratio for the same PEE compared to the existing algorithms.
Millimeter wave (mmWave) communication has been shown to be a promising technique for next-generation wireless systems due to the large expanse of available spectrum [3, 4, 5, 6]. This spectrum, ranging from 30 GHz to 300 GHz, offers a potential 200 times the bandwidth currently allocated in today’s mobile systems. However, a critical challenge in exploiting the mmWave frequency band is its severe signal propagation loss compared to that over conventional microwave frequencies [5, 8, 9]. To compensate for such a loss, large antenna arrays can be employed to achieve a high power gain. Fortunately, owing to the small wavelength of mmWave signals, these arrays can be packed into small areas at the transmitter and receiver [10, 11]. For such mmWave systems, channel state information (CSI) is essential for effective communication and precoder design. However, the use of large antenna arrays results in a large multiple-input multiple-output (MIMO) channel matrix. This makes the channel estimation of mmWave systems very challenging due to the large number of channel parameters to be estimated. Moreover, owing to the high frequency, it is often not feasible to obtain digital samples from each antenna. To resolve this high frequency sampling problem, analog beamforming techniques have been proposed and widely adopted in the open literature (see [12, 13, 14, 10] and references therein). The main idea of analog beamforming is to control the phase of the signal transmitted or received by each antenna via a network of analog phase shifters.
Using analog beamforming techniques, the most straightforward channel estimation method is to exhaustively search in all possible angular directions. Specifically, consider a system with transmit antennas and receive antennas. If we aim at achieving a minimum angular resolution of , an exhaustive search-based channel estimation would then require a set of transmit beamforming vectors at the transmitter designed to span all possible beam directions and likewise with receive beamforming vectors at the receiver. By searching all possible combinations, an matrix can be formed whose entries represent the channel gains between the transmit and the receive beams. This matrix is commonly referred to as the virtual channel matrix [15, 16, 17, 18, 19]. Despite the large number of entries expected for the mmWave MIMO channel matrix, it has been shown in recent measurements [20, 21] that the mmWave channel exhibits sparse propagation characteristics in the angular domain. That is, there are only a few dominant propagation paths in mmWave channels. This sparsity can be seen in the virtual channel matrix, as only a limited number of transmit and receive direction combinations have a non-zero gain . Therefore, the key objective of mmWave channel estimation is to identify these paths so that the transceiver can align the transmit and receive beams along these paths.
Recently, some compressed sensing based channel estimation algorithms have been proposed to explore the channel sparsity in mmWave systems, e.g., [1, 5, 10, 22]. The fundamental idea in some of these approaches is to search multiple transmit/receive directions in each measurement, by creating initial beam patterns that span a wider angular range than those used by the exhaustive search. Similar adaptive beamforming algorithms and multi-stage codebooks were proposed in [23, 14, 24, 25]. More recent work  has also shown that such hierarchical codebooks can be achieved with a single RF chain by exploiting sub-array and deactivation (turning-off) antenna processing techniques. By initially using wider beam patterns, multi-stage approaches are able to reduce the number of measurements required for channel estimation. However, this introduces a loss of directionality gain, leading to a lower signal-to-noise ratio (SNR) at the receiver and a higher probability of estimation error (PEE). In this sense, there exists a challenging trade-off between estimation time and accuracy for mmWave channel estimation.
As one of the seminal works on multi-stage codebook approaches,  developed a “divide and conquer” search type algorithm to estimate sparse mmWave channels. As shown in Fig. 1, in each stage of this algorithm, the possible ranges of angles of departure (AODs) and angles of arrival (AOAs) are both divided into non-overlapped angular sub-ranges. Correspondingly, non-overlapped beam patterns are designed at both the transmitter and receiver such that each transmit (receive) beam pattern exactly covers one AOD (AOA) sub-range. The channel estimation carried out in each stage consists of time slots. In each time slot, the pilot signal is transmitted using one of the beam patterns at the transmitter, and then collected by one of the beam patterns at the receiver. The corresponding channel output for each pair of transmit-receive beam patterns can then be obtained. These time slots span all the combinations of transmit-receive beam patterns. By comparing the magnitudes of the corresponding channel outputs, the transmit/receive sub-ranges that the AOD/AOA most likely belong to are determined. The receiver can then feed back the AOD information for use at the transmitter. Afterwards, the algorithm will limit the estimation to the angular sub-range identified at each link end in the previous stage and further divide it into sub-ranges for the channel estimation in the next stage. An example of multi-stage angular refinement, using our proposed approach, can be seen in Fig. 3. This process continues until the smallest beam width resolution is reached. It is shown in  that the algorithm requires estimation time proportional to per path. Despite the significant improvement when compared to the exhaustive search approach, such a channel estimation algorithm still might not be quick enough to track the fast channel variations, especially for mmWave mobile channels with rapidly changing parameters. 
Furthermore, at high SNR it may not be necessary to perform so many measurements, which would result in an unnecessary time delay. Adaptive training approaches are also investigated in [27, 28, 29, 30]. While these approaches are shown to significantly improve system performance as the number of measurement iterations is increased, there is generally no adaptation of the number of measurements with respect to the associated probability of success or channel conditions.
Motivated by this, in this paper we develop a fast mmWave MIMO channel estimation framework by designing a set of novel overlapped beam patterns that can significantly reduce the number of channel estimation measurements. In order to improve estimation accuracy, we then introduce a novel rate-adaptive channel estimation approach, where the average number of channel measurements is adapted to channel conditions. The main contributions of this paper are summarized as follows:
Relying on novel overlapped beam pattern design, we first present a fast channel estimation (FCE) algorithm for mmWave systems. In this algorithm, we develop a maximum likelihood (ML) detector to optimally extract the channel AOD/AOA information from the measurements. We also design a linear minimum mean squared error (LMMSE) channel estimator to estimate the channel coefficients by optimally combining the selected measurements in all stages.
We then develop a rate-adaptive channel estimation (RACE) algorithm, in which additional measurements are permitted when the current measurements are found to have an inadequate probability of success. In this way, the number of measurements can adapt to the channel conditions and the channel estimation accuracy can be significantly improved with minimal measurements.
We analyze the probability of channel estimation error for the FCE algorithm. In particular, we derive a closed-form approximation, lower bound and upper bound for the PEE. Based on the Shannon-Hartley theorem, we also provide some theoretical analysis for the minimum energy required to estimate the channel using the RACE algorithm.
Finally, we compare the performance of the proposed algorithms to that of the algorithm in  with non-overlapped beam patterns. Simulation results show that both of the proposed algorithms can significantly reduce the number of channel measurements compared to . We show that the FCE algorithm achieves a guaranteed reduction of channel measurements, at the expense of estimation accuracy. On the other hand, the RACE algorithm can achieve the same average reduction of channel measurements as the FCE algorithm, but using up to 6 dB less signal energy compared to the algorithm in .
Notation : is a matrix, is a vector, is a scalar, and is a set. is the 2-norm of , is the determinant of and represents the cardinality of . , and are the transpose, conjugate transpose and conjugate of , respectively. For a square matrix , represents its inverse. We use to denote a diagonal matrix with entries of on its diagonal. is the identity matrix, is an all column vector and denotes the ceiling function. is a complex Gaussian random vector with mean and covariance matrix . and denote the expected value and covariance of , respectively. denotes the Kronecker product of and whereas denotes the row-wise Kronecker product of and .
II System Model
Consider a mmWave MIMO system composed of transmit antennas and receive antennas. We consider that both the transmitter and receiver are equipped with a limited number of radio frequency (RF) chains. Following , we further assume that these RF chains, at one end, can only be combined to form a single beam pattern, indicating that only one pilot signal can be transmitted and received at one time. In this paper, for simplicity, we consider the unconstrained beamforming vectors by ignoring some practical constraints imposed by hardware such as constant amplitude and quantized phase shifters. However, in practice, our unconstrained beamforming vectors could be realized by using a network of constrained beamformers with quantized phase shifters and constant amplitude as the hybrid-beamforming approach adopted in  and depicted by Figure 2 therein. To estimate the channel matrix, the transmitter sends a pilot signal , with unit energy (), to the receiver. Denote by and , respectively, the beamforming vector at the transmitter and beamforming vector at the receiver. The corresponding channel output can be represented as
where denotes the MIMO channel matrix, is the transmit power and is an complex additive white Gaussian noise (AWGN) vector following distribution .
In this paper we follow  and adopt a two-dimensional (2D) sparse geometric-based channel model. Specifically, we consider an -path channel between the transceiver, with the th path having steering AOD, , and AOA, where . Then the corresponding channel matrix can be expressed in terms of the physical propagation path parameters as
where is the fading coefficient of the th propagation path, and and respectively denote the transmit and receive spatial signatures of the th path. To simplify the analysis, we assume that the transmitter and receiver have the same number of antennas (i.e., ). However, it is worth pointing out that the developed schemes can be easily extended to a general asymmetric system. If uniform linear antenna arrays (ULA) are employed at both the transmitter and receiver, we can define and , respectively, where
Here, the steering angle, , is related to the physical angle by with denoting the signal wavelength. (Footnote 1: Note that the use of ULA results in no distinguishable difference between AODs and or between AOAs and . Hence, only AODs and AOAs in the range need to be considered.) A similar expression can be written for at the receiver. With half-wavelength spacing, the distance between antenna elements becomes .
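As an illustration of the ULA spatial signature, the following minimal sketch builds a unit-norm steering vector and checks the angular ambiguity mentioned in the footnote. It assumes the common convention that the steering angle equals 2π(d/λ) sin(θ) and that the signature is normalized by the square root of the number of antennas; both conventions, and the function names `steering_angle` and `steering_vector`, are assumptions made for this example rather than definitions taken from the paper.

```python
import cmath
import math


def steering_angle(theta, spacing_over_wavelength=0.5):
    """Map a physical angle theta (radians) to a steering angle.

    Assumed convention: psi = 2*pi*(d/lambda)*sin(theta), with
    half-wavelength spacing (d/lambda = 0.5) as the default.
    """
    return 2.0 * math.pi * spacing_over_wavelength * math.sin(theta)


def steering_vector(psi, num_antennas):
    """Unit-norm ULA spatial signature for steering angle psi (assumed form)."""
    scale = 1.0 / math.sqrt(num_antennas)
    return [scale * cmath.exp(1j * k * psi) for k in range(num_antennas)]


# Under the sin() convention, theta and pi - theta give the same signature,
# so only angles in (-pi/2, pi/2) need to be distinguished.
a_front = steering_vector(steering_angle(0.3), 8)
a_back = steering_vector(steering_angle(math.pi - 0.3), 8)
max_gap = max(abs(x - y) for x, y in zip(a_front, a_back))
```

Here `max_gap` is zero up to floating-point rounding, which is the indistinguishability the footnote refers to under the assumed convention.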
From (2), we can see that the overall channel state information of each path includes only three parameters, i.e., the AOD , the AOA , and the fading coefficient . We assume that the fading coefficient of each path follows a complex Gaussian distribution with zero mean and variance and that both and can only take some discrete values from the set . Here, for the sake of the ensuing mathematical problem formulation, we only consider discrete AOAs and AODs. It is noteworthy that they can be continuous in practice. However, the extension to the case with continuous AOD/AOA may require the consideration of other more practical issues such as the number of RF chains needed to realize the beam patterns and the hardware constraints (e.g., quantized phase shifters) imposed on the RF beamforming vectors, which may constitute a new paper. We thus leave this extension as future work.
We aim to find an efficient way to estimate the three parameters for each path. The key challenge here is how to design a sequence of and in such a way that the channel parameters can be quickly and accurately estimated. We consider pairs of beam patterns that are designed to span all possible transmit-receive combinations. Denote by and , respectively, the transmit and receive beamforming vectors adopted in the th channel measurement time slot such that . Similarly to , we assume the same pilot symbol is transmitted during the time slots. Then, after time slots, we can obtain a sequence of measurements represented as
where describes the channel input-output relationship for a given set of transmit and receive beamforming vectors defined by
is an vector of the corresponding noise terms. Note that since , , the vector follows the same distribution as that of , i.e., .
Motivated by the geometric sparsity of the mmWave channel, in the following two sections, we propose to use a set of overlapped beam patterns that are able to estimate the AOD/AOA information very quickly. We then extend the algorithm to use a rate-adaptive estimation approach. The adaptive nature of the algorithm permits additional measurements to be performed under poor channel conditions. This allows the fast channel estimation to be carried out with significant accuracy and energy efficiency.
III Fast Channel Estimation with Overlapped Beam Patterns
In this section, we develop a fast channel estimation framework for mmWave systems using overlapped beam patterns, as illustrated in Fig. 3. Specifically, we design a set of beam patterns that are adopted in different measurement time intervals and are overlapped with one another in the angular domain. A maximum likelihood based estimation algorithm is then proposed to accurately retrieve the channel state information from the set of measurements. The proposed channel estimation algorithm also works in a similar multi-stage manner as that in  where each stage reduces the possible sub-ranges in which the AOD/AOA are expected to be found.
III-A An Example of Overlapped Beam Pattern Design
We will first explain the design principle of overlapped beam patterns using a simple example. Following Fig. 1, we divide the AOD/AOA angular spaces into sub-ranges in the first stage, denoted by , and , respectively. However, instead of using 3 beam patterns to cover them at each transceiver end as in Fig. 1(a), we propose to use only 2 overlapped beam patterns to achieve this. Fig. 2(a) illustrates our designed beam patterns in the first stage. We can see that the first and second beam patterns cover , and , , respectively, and are overlapped in the whole range of . Intuitively, if a path is observed in two measurements using adjacent beam patterns, the AOD or AOA must belong to the overlapped sub-range of these two beam patterns. It is also seen that each beam pattern can have different amplitudes in different sub-ranges. We represent the amplitudes of each beam pattern in different sub-ranges by a vector. For beam pattern 1 and 2 in Fig. 2(a), these vectors are respectively defined as
where corresponds to the th beam pattern with denoting the amplitude of the th beam pattern in sub-range . By using measurement time slots, we can then span all beam pattern combinations between the transceiver. We denote the sequential set of beam patterns respectively adopted at the transmitter and receiver by
We refer to these as the beam pattern design matrices, with their th row denoting the beam pattern adopted in the th measurement time slot, where . The efficient design of these beam patterns can lead to many solutions. However, one desirable property is that the same quantity of signal energy is transmitted/received via each sub-range over all measurements, i.e., each column of (8)-(9) should have the same Euclidean norm. This provides the same accuracy for each possible sub-range combination. Another desirable property is that the transmit/receive beamforming gains of all measurement patterns are equal, i.e., each row of (8)-(9) should have the same energy.
One possible way to make the beam pattern sub-range amplitudes in (7) follow the aforementioned properties, is to normalize and to have unit energy in each row and have equal energy in each column. For the beam patterns in Fig. 2, we have , as beam patterns and do not cover, respectively, the third and first sub-ranges. Due to the symmetry between the two beam patterns, we further have and . This leads to and , and the matrices in (8)-(9) become
In order to observe the resultant amplitude gains (i.e., the transceiver gain) over each of the sub-range combinations, we introduce another matrix referred to as the generator matrix. We define this as the row-wise Kronecker product between the transmit beam pattern design matrix and the receive beam pattern design matrix such that
recalling that and represent the row-wise Kronecker and Kronecker product operations, respectively. By denoting and as the entry on the th and th column on the th row in and , respectively, we can express the th column of as
where the relationship is a result of the Kronecker product operation.
The columns of the generator matrix describe the received measurement gains over each of the sub-range combinations. For example, if a path were present between the transmit sub-range and the receive sub-range , the measurement vector would be expected to be a scalar multiple of column of , expressed as . It is important to note that the sets of beam patterns used in this paper are not unique. We later show that their performance, in terms of PEE, depends only on the Euclidean distances between the columns of , which directly determine the probability that the column corresponding to one sub-range combination is mistaken for another. In the example set shown in (III-A), each column has the same minimum Euclidean distance to the other columns, although some have more spatial neighbours at this minimum distance than others.
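The three-sub-range example can be checked numerically with the short sketch below. Solving the two stated properties (unit-energy rows and equal-energy columns) for the two symmetric patterns gives an amplitude of sqrt(2/3) in each non-shared sub-range and sqrt(1/3) in the shared middle sub-range. The slot ordering (all transmit-receive pattern pairs in lexicographic order) and the names `patterns`, `F_b`, `W_b`, `G` and `rowwise_kron` are assumptions made for this illustration.

```python
import math

# Amplitudes implied by unit-energy rows and equal-energy columns.
a = math.sqrt(2.0 / 3.0)  # non-shared sub-range
b = math.sqrt(1.0 / 3.0)  # shared (overlapped) middle sub-range
patterns = [[a, b, 0.0], [0.0, b, a]]

# Assumed slot ordering: every (tx pattern, rx pattern) pair, 4 slots in total.
slots = [(i, j) for i in range(2) for j in range(2)]
F_b = [patterns[i] for i, _ in slots]  # transmit beam pattern design matrix
W_b = [patterns[j] for _, j in slots]  # receive beam pattern design matrix


def rowwise_kron(A, B):
    """Row-wise Kronecker product: row m is kron(A[m], B[m])."""
    return [[x * y for x in ra for y in rb] for ra, rb in zip(A, B)]


def column(M, d):
    return [row[d] for row in M]


def distance(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


G = rowwise_kron(F_b, W_b)  # 4 measurements x 9 sub-range combinations

# Energy sent through each transmit sub-range over all slots.
column_energy = [sum(x * x for x in column(F_b, k)) for k in range(3)]

# Minimum distance from each generator column to any other column.
num_cols = len(G[0])
min_dist = [
    min(distance(column(G, d), column(G, e)) for e in range(num_cols) if e != d)
    for d in range(num_cols)
]
```

With this ordering, the column of `G` for transmit sub-range `kt` and receive sub-range `kr` (both zero-based) sits at index `3 * kt + kr`, and every entry of `min_dist` takes the same value, in line with the equal-minimum-distance observation above.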
III-B Beamforming Vector Design
To generate the beam patterns illustrated in Fig. 2(a) and described in (III-A), the transmit and receive beamforming vectors should be designed as follows. Denote by and , respectively, the transmit beamforming vector and receive beamforming vector corresponding to the th pair of beam patterns in and . We then design the product of the transmit array response and transmit beamforming vector to have
and the product of the receiver array response and receive beamforming vector to have
where has been defined in (3) and is a scalar constant that ensures . Physically, corresponds to the average directivity gain of each beam pattern and is the same for all due to the normalization of the rows in (III-A). Eqs. (14) and (15) can be expressed in a matrix form as
where denotes the cardinality of set and is a matrix whose columns describe the antenna array response at each angle. Therefore and can be designed as
where is the pseudo inverse of .
III-C Channel Measurements
We now perform channel estimation in the first stage using the previously designed transmit and receive beamforming vectors and . In each time slot, the beamforming vectors and are adopted to transmit/receive the pilot signal . If we substitute the channel in (2) into (5), we get
Without loss of generality, let us consider the case with AOD, and AOA, , where and are respectively, the transmit and receive sub-range indices of the th propagation path. By recalling (14)-(15) we write
which leads to
where is a sparse row vector that describes the channel gain at each of the sub-range combinations by
and zero otherwise. For example, with , a single path (i.e., ) with coefficient that exists on the first transmit sub-range (i.e., ) and second receive sub-range (i.e., ) leads to and
III-D Maximum Likelihood Detection of AOD/AOA Information
We now require an efficient means of detecting given that a generator matrix has been used to obtain the channel outputs in (26). Due to its optimal detection properties, this subsection elaborates how to implement a maximum likelihood detection  method to extract the AOD/AOA information from the received measurements. We begin by considering the distribution of . From (26), this can be expressed as
Recall that , where is the identity matrix. Also recall that the pilot signal has unit energy, i.e., . For the signal component, as each of the path coefficients has zero mean we can write and
By defining a binary version of , denoted by , with elements defined by
we can separate the AOD/AOA information in from each of the path coefficient variances, . As each path coefficient has variance , this then gives . We can then re-write the distribution of as
It can now be seen that follows a zero mean, circularly symmetric complex Gaussian (CSCG) distribution with corresponding probability density function (PDF) defined as 
Now let us find the conditional probability of , given the receive measurement vector and knowledge of , denoted by . Define as the set of all possible binary channel realizations such that . We also define to represent the cardinality of this set. Following the principle of maximum likelihood detection and based on Bayes' rule, we can express the probability of for all possible as
where the term
is independent of a particular channel realization. We assume that each channel realization is equiprobable, therefore
We then denote the probability that the th element of has a path by . We can express this probability as the sum of all in (35) in which by
Following the maximum likelihood approach we then find the most likely sub-range combination by
Finally by finding the most likely transmitter and receiver subranges through
we can reduce the ranges of possible AOD and AOA to, respectively, the th transmit and th receive angular sub-ranges. Each of these two sub-ranges will be further divided into another sub-ranges for the channel estimation in the next stage.
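The detection step can be sketched for the special case of a single path. Under that restriction the measurement covariance is the noise covariance plus a rank-one term along one generator column, so the zero-mean Gaussian likelihood has a closed form via the matrix inversion lemma. The sketch below is a simplification under stated assumptions, not the general multi-path detector: `signal_var` is a hypothetical parameter lumping together transmit power, directivity gain and path variance, `noise_var` is the per-measurement noise variance, and the column index is taken to be `K * kt + kr` (zero-based), following the Kronecker ordering.

```python
import math


def single_path_posteriors(y, G, signal_var, noise_var):
    """Posterior probability of each generator column, one equiprobable path.

    Model assumed here: y ~ CN(0, noise_var * I + signal_var * g g^H),
    where g is the column of G for the true sub-range combination.
    """
    y_energy = sum(abs(v) ** 2 for v in y)
    log_like = []
    for d in range(len(G[0])):
        g = [complex(row[d]) for row in G]
        g_energy = sum(abs(x) ** 2 for x in g)
        corr = sum(x.conjugate() * v for x, v in zip(g, y))  # g^H y
        # Quadratic form y^H C^{-1} y via the matrix inversion lemma.
        quad = (y_energy
                - signal_var * abs(corr) ** 2
                / (noise_var + signal_var * g_energy)) / noise_var
        # log det(C) up to a term shared by all hypotheses.
        log_det = math.log(1.0 + signal_var * g_energy / noise_var)
        log_like.append(-quad - log_det)
    peak = max(log_like)
    weights = [math.exp(v - peak) for v in log_like]  # stable normalization
    total = sum(weights)
    return [w / total for w in weights]


def detect_subranges(posteriors, K):
    """Most likely (transmit, receive) sub-range indices, zero-based."""
    d_hat = max(range(len(posteriors)), key=lambda d: posteriors[d])
    return divmod(d_hat, K)


# Usage with the three-sub-range, four-slot example generator matrix.
a, b = math.sqrt(2.0 / 3.0), math.sqrt(1.0 / 3.0)
patterns = [[a, b, 0.0], [0.0, b, a]]
G = [[x * z for x in patterns[i] for z in patterns[j]]
     for i in range(2) for j in range(2)]
alpha = 0.8 - 0.6j  # illustrative path coefficient
y = [alpha * row[1] for row in G]  # noise-free: tx sub-range 0, rx sub-range 1
posteriors = single_path_posteriors(y, G, signal_var=1.0, noise_var=0.01)
kt_hat, kr_hat = detect_subranges(posteriors, K=3)
```

When all generator columns have equal norm, as in this example, the rule reduces to picking the column with the largest correlation magnitude with the measurement vector.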
III-E Multi-stage Generalization
In general the proposed channel estimation algorithm works in a similar multi-stage manner as that in , requiring stages. We show a high level overview of this process in Fig. 3. In the th stage, we initially divide the possible AOA angular space into non-overlapped sub-ranges and divide the possible AOD angular space into . Then only overlapped beam pattern pairs will be designed at the transmitter and receiver to cover these sub-ranges. The designed beam patterns are characterized by the beam pattern design matrices and . These should be generated to maximize the minimum Euclidean distance between the columns of the corresponding generator matrix .
Given and , we can then generate both the transmit beamforming vectors and receive beamforming vectors respectively in the same way as in (18)-(19). For example, to generate , the corresponding vector in (16), which is redefined as for rigor, should be designed such that its th entry, denoted by , satisfies
where is a scalar constant for the th stage to guarantee that satisfies . Physically, describes the desired beam pattern amplitude at angle when is used. Each receive beamforming vector can be designed in the same way.
The channel output on the th estimation stage can then be obtained after time slots by
and denotes the transmit power of the pilot signal in the th stage. Similar to that in , we prefer that all the stages have an equal probability of failure, indicating that we should allocate power among stages inversely proportional to the beamforming gains of these beam patterns, i.e.,
where is a constant. Similar to (39) we then find the most likely sub-range combination of the th stage by
with the corresponding most likely transmitter and receiver sub-ranges given by
The selected sub-ranges, and , are then used for the channel estimation in the next stage. This process continues until the minimum angle resolution is reached, requiring stages. It is worth pointing out that although the proposed algorithms are elaborated based on the estimation process of a single path, their implementation in a multi-path scenario is feasible by following the same procedure as in [5, Algorithm 2]. More specifically, multiple paths are estimated sequentially, with the first path being estimated using the multi-stage algorithms described above. Subsequent paths can then be found by returning to the first stage and repeating the estimation. Moreover, in each stage’s measurements, the expected contributions from all previously estimated paths can be subtracted to reveal new paths.
III-F Estimation of the Fading Coefficient
Once all estimation stages described in the previous subsection have been performed, we estimate the identified path fading coefficient . In , the value of was estimated based on the measurement of the final stage only. To improve the estimation accuracy, we estimate by using all measurements in all stages of the algorithm. Denote by and , respectively, the vector of all received measurements and the vector of their estimates such that
where denotes the th column of . Provided that the AOD/AOA estimation is correct in each stage, we can write
where is the vector of corresponding noise terms. In the case where the AOD/AOA estimation is incorrect, the estimation of the fading coefficient is not important as there will be a beam misalignment between the transmitter and receiver. Following the LMMSE principle, we can then estimate the fading coefficient as
where is an identity matrix. Now we can formally describe the proposed fast channel estimation algorithm using overlapped beam patterns in Algorithm 1.
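For a single scalar coefficient observed through a known gain vector in white noise, the LMMSE estimate collapses to a scaled correlation, which the following minimal sketch illustrates. It assumes the stacked model y = alpha * g + n, where `g` collects the expected per-measurement gains over all stages for the detected sub-ranges (including any power and directivity scaling), `coeff_var` is the prior variance of the coefficient, and `noise_var` is the noise variance; these names are illustrative and not taken from the paper.

```python
def lmmse_fading_estimate(y, g, coeff_var, noise_var):
    """LMMSE estimate of a zero-mean scalar alpha from y = alpha * g + n.

    Uses coeff_var * g^H (coeff_var * g g^H + noise_var * I)^{-1} y, which
    simplifies to coeff_var * g^H y / (coeff_var * ||g||^2 + noise_var).
    """
    g_energy = sum(abs(x) ** 2 for x in g)
    corr = sum(complex(x).conjugate() * v for x, v in zip(g, y))  # g^H y
    return coeff_var * corr / (coeff_var * g_energy + noise_var)


# Illustrative check: with negligible noise the estimate approaches alpha.
g_example = [0.5, 0.3, 0.7, 0.2]
alpha_true = 0.6 + 0.3j
y_example = [alpha_true * x for x in g_example]
alpha_hat = lmmse_fading_estimate(y_example, g_example, 1.0, 1e-9)
```

As the noise variance grows, the denominator shrinks the estimate toward zero, the prior mean; this is the usual LMMSE behaviour when the measurements carry little information.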
Remark 1. It can be seen that, compared with the channel estimation algorithm in  with the same value of , our proposed algorithm also requires stages, but the number of measurement time slots required in each stage reduces to , instead of . In general, this yields a reduction in measurement time slots. For the example of discussed earlier with , a increase in estimation rate can be achieved.
IV Rate Adaptive Channel Estimation Algorithm
The proposed channel estimation scheme explained in the previous section uses a fixed where the detector is forced to make a decision after measurements, irrespective of what the computed probability may be. Leveraging the detection method developed in the previous section, we now propose a novel rate-adaptive channel estimation (RACE) algorithm.
We first introduce a target maximum probability of estimation error (PEE), denoted by . The basic principle of the RACE algorithm is that after the initial measurements are completed in any given stage, if the most likely sub-range combination probability does not satisfy , then additional measurements will be performed. To this end, the receiver will feed back the current most likely transmit sub-range, , and also the information indicating whether more measurements are required or not.