A Hardware-Efficient Analog Network Structure for Hybrid Precoding in Millimeter Wave Systems

Xianghao Yu, Jun Zhang, and Khaled B. Letaief

This work was supported in part by the Hong Kong Research Grants Council under Grant No. 16210216. This paper was presented in part at the IEEE Global Communications Conference (GLOBECOM), Singapore, Dec. 2017 [1]. X. Yu, J. Zhang, and K. B. Letaief are with the Department of Electronic and Computer Engineering, the Hong Kong University of Science and Technology (HKUST), Kowloon, Hong Kong (e-mail: {xyuam, eejzhang, eekhaled}@ust.hk). K. B. Letaief is also with Hamad Bin Khalifa University, Doha, Qatar (e-mail: kletaief@hbku.edu.qa).

Abstract

Hybrid precoding has recently been proposed as a cost-effective transceiver solution for millimeter wave (mm-wave) systems. While the number of radio frequency (RF) chains has been effectively reduced in existing works, a large number of high-precision phase shifters are still needed. Practical phase shifters have coarsely quantized phases, and their number should be reduced to a minimum due to cost and power considerations. In this paper, we propose a novel hardware-efficient implementation for hybrid precoding, called the fixed phase shifter (FPS) implementation. It only requires a small number of phase shifters with quantized and fixed phases. To enhance the spectral efficiency, a switch network is put forward to provide dynamic connections from phase shifters to antennas, which is adaptive to the channel states. An effective alternating minimization (AltMin) algorithm is developed with closed-form solutions in each iteration to determine the hybrid precoder and the states of switches. Moreover, to further reduce the hardware complexity, a group-connected mapping strategy is proposed to reduce the number of switches. Simulation results show that the FPS fully-connected hybrid precoder achieves higher hardware efficiency with far fewer phase shifters than existing proposals. Furthermore, the group-connected mapping achieves a good balance between spectral efficiency and hardware complexity.

Index Terms: Alternating minimization, hardware efficiency, hybrid precoding, large-scale antenna arrays, millimeter wave communications.

I Introduction

Uplifting the carrier frequency to millimeter wave (mm-wave) bands is an effective approach to meet the capacity requirement of the upcoming 5G networks, and thus mm-wave communication has drawn extensive attention from both academia and industry [2, 3]. Thanks to the small wavelength of mm-wave signals, large-scale antenna arrays can be leveraged at transceivers to combat huge path loss at mm-wave frequencies and support directional transmissions with advanced multiple-input-multiple-output (MIMO) techniques. As equipping each antenna element with a single radio frequency (RF) chain is costly and power hungry, hybrid precoding has been put forward as a cost-effective transceiver solution, which utilizes a limited number of RF chains to connect a digital baseband precoder and an analog RF precoder [4].

In contrast to the conventional fully digital precoder, the additional hardware in the hybrid one is the analog component, also called the analog network, which determines the overall hardware structure of the hybrid precoder. Most existing works on hybrid precoding are performance-oriented, i.e., aiming at maximizing the spectral efficiency [5, 4, 6]. However, spectral efficiency close to that of the fully digital precoder was achieved with bulky hardware and impractical assumptions for the analog network, which results in poor hardware efficiency and hinders practical implementation. Thus, it is of great importance to develop hardware-efficient analog networks that help the practical deployment of hybrid precoders.

To discuss hardware-efficient design, we first introduce some terminology for describing the hybrid precoder structure. Each hybrid precoder structure is specified by its mapping strategy and hardware implementation. Specifically, the mapping strategy decides how the RF chains and antenna elements are connected, which also determines the number of hardware components needed in the analog network. Typical mapping strategies include the fully- and partially-connected ones. The fully-connected one exploits all the degrees of freedom to perform the mapping, i.e., it maps every RF chain to all the antennas, e.g., [4]. In contrast, each RF chain is only connected to a subset of antennas in the partially-connected one, e.g., [7]. On the other hand, the hardware implementation specifies the adopted hardware components and the way each RF chain-antenna pair is connected. The single phase shifter (SPS) implementation is the most commonly adopted one, which deploys one phase shifter to realize each RF chain-antenna connection [8]. More recently, a double phase shifter (DPS) implementation was proposed in [9, 10] to simplify the hybrid precoding algorithm design, where two distinct phase shifters are used to connect each RF chain-antenna pair.

In this paper, we propose a novel analog network structure that significantly improves the hardware efficiency of hybrid precoders. This is achieved by an innovative hardware implementation, called the fixed phase shifter (FPS) implementation, and a new mapping strategy, i.e., the group-connected mapping. In particular, the new structure can approach the performance of the fully digital precoder with very few fixed phase shifters.

I-A Related Works

The fully-connected mapping strategy with the SPS implementation, referred to as the SPS fully-connected structure, is the most popular structure in earlier works on hybrid precoding [4, 11, 6, 12, 13]. However, this structure entails a drawback in the analog network, i.e., the number of phase shifters in use is $N_\mathrm{RF}^t N_t$, with $N_\mathrm{RF}^t$ and $N_t$ being the numbers of RF chains and antennas, respectively. Note that phase shifters, originally utilized in military radar systems, are newly introduced hardware components in hybrid precoding systems and are currently very costly for commercial use, e.g., a single phase shifter can cost around a hundred US dollars even with low resolution [14]. Hence, deploying such a large number of phase shifters would cause prohibitively high cost and power consumption. More importantly, phase shifters are assumed to have variable phases with high resolution in order to provide near-optimal performance with effective algorithms, which is far from practical.

To improve the hardware efficiency, one possible way is to reduce the number of phase shifters in use via changing the mapping strategy. Partially-connected mapping, which connects each RF chain to a subset of antennas, stands out as a popular solution [15, 16, 7, 17, 10]. A semidefinite relaxation based alternating minimization (SDR-AltMin) algorithm was proposed in [15] for hybrid precoder design with this mapping strategy. Based on a similar idea as successive interference cancellation (SIC), an iterative hybrid precoding algorithm for the partially-connected mapping was proposed in [16]. In addition, a greedy algorithm and a modified K-means algorithm were developed in [17] and [10], respectively, to dynamically optimize the subarrays in the partially-connected mapping for performance improvement. While various techniques were introduced to design hybrid precoders with the partially-connected mapping, there still exists a non-negligible gap in spectral efficiency compared with the fully-connected one. Inevitably, trade-offs need to be made between hardware efficiency and spectral efficiency, but the partially-connected mapping goes to an extreme, i.e., it enhances the hardware efficiency by incurring too much performance degradation. It is thus of practical importance to develop hardware-efficient hybrid precoder structures that can achieve more flexible trade-offs.

On the other hand, different hybrid precoding algorithms have been proposed assuming phase shifters with arbitrary precision, e.g., orthogonal matching pursuit (OMP) [4], manifold optimization [15], and SIC [16]. Following these works, a straightforward refinement for practical hardware implementation is to design hybrid precoders with quantized phase shifters [12, 18, 11, 19, 20]. The main approach is either to determine all the phases at once [4, 18, 11, 19] or to update one phase at a time [20], ignoring the quantization effect at first. Then the phases are heuristically quantized into the finite feasible set according to certain criteria. However, a simple quantization step is far from satisfactory, and the optimality and convergence of the proposed algorithms cannot be guaranteed [20]. In addition, hybrid precoder design based on codebooks consisting of quantized phases was investigated in [21, 22, 23]. While codebook-based design enjoys low complexity, it incurs a certain performance loss, and it is not clear how much performance gain can be further obtained. The number of quantized phase shifters was to some extent reduced in [19], which is approximately for achieving a certain required precision , e.g., around quantized phase shifters are needed for . Unfortunately, a large number of phase shifters are still needed for achieving a high spectral efficiency under practical settings in multiuser OFDM systems, i.e., 40 quantized phase shifters for each RF chain, and the number varies with the precision requirement. More importantly, in these existing works, the phases need to be adapted to the channel states, which brings high hardware implementation complexity and also increases power consumption. Recently, a hybrid precoder structure that adopts switches to improve the hardware efficiency was put forward in [24]. Nevertheless, simply replacing variable phase shifters with switches will cause significant performance degradation. Therefore, a more effective approach to handle quantized phases is needed, and the number of phase shifters should be reduced to a minimum.

Fig. 1: A multiuser mm-wave MIMO-OFDM system with FPS hybrid precoder implementation. To simplify the figure, in the analog precoder, each solid line with a slash represents parallel signal transmissions while each dotted line stands for switches.

I-B Contributions

In this paper, we investigate hardware-efficient design for hybrid precoding in general multiuser orthogonal frequency-division multiplexing (OFDM) mm-wave systems. The main contributions are summarized as follows.

  • As a first step, a novel hardware implementation is proposed for the analog network, called the fixed phase shifter (FPS) implementation, where only a small number of phase shifters with fixed phases are needed. To compensate for the performance loss induced by the fixed phases, a switch network is proposed to provide dynamic connections from phase shifters to antennas, which is easily implementable with adaptive switches.

  • An AltMin algorithm is developed to design the hybrid precoder with the fully-connected mapping, where an upper bound of the objective function is derived as an effective surrogate. In particular, the large-scale binary constraints induced by the switch network are tackled with the help of the upper bound, which leads to closed-form solutions for both the dynamic switch network and the digital baseband precoder, and therefore enables a low-complexity hybrid precoding algorithm.

  • To further reduce the hardware complexity, a novel mapping strategy, i.e., the group-connected mapping, is proposed and then applied along with the FPS implementation. This flexible mapping strategy incorporates the popular fully- and partially-connected mapping strategies as special cases. More importantly, the introduction of this new mapping strategy does not incur any additional design challenges as the hybrid precoder can be readily designed by leveraging existing hybrid precoding algorithms.

  • Extensive comparisons are provided to reveal valuable design insights. In particular, the FPS fully-connected hybrid precoder structure is shown to be able to easily approach the performance of the fully digital precoder, and enjoys a higher hardware efficiency than existing proposals. What deserves a special mention is the sharp reduction of the number of phase shifters compared with existing hybrid precoder implementations, e.g., 10 fixed phase shifters in total are sufficient. In addition, the FPS group-connected structure, which further reduces the number of switches, provides a flexible way to trade off spectral efficiency with hardware complexity.

In summary, our results firmly show that the proposed FPS group-connected structure is a promising candidate for hardware-efficient hybrid precoding in 5G mm-wave communication systems.

I-C Organization

The remainder of this paper is organized as follows. In Section II, we introduce the system model and proposed FPS implementation, followed by the problem formulation. The AltMin algorithms for the single-carrier and multicarrier systems with the FPS fully-connected mapping strategy are demonstrated in Sections III and III-C, respectively. Section IV introduces the group-connected mapping strategy. Simulation results are presented in Section V. Finally, we conclude this paper in Section VI.

I-D Notations

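The following notations are used throughout this paper. $\mathbf{a}$ and $\mathbf{A}$ stand for a column vector and a matrix, respectively; the conjugate, transpose, and conjugate transpose of $\mathbf{A}$ are represented by $\mathbf{A}^*$, $\mathbf{A}^T$, and $\mathbf{A}^H$; $\|\mathbf{a}\|_2$ and $\|\mathbf{A}\|_F$ denote the $\ell_2$ and Frobenius norms of vector $\mathbf{a}$ and matrix $\mathbf{A}$; $\mathrm{blkdiag}(\mathbf{A}_1, \ldots, \mathbf{A}_N)$ establishes a block diagonal matrix using $\mathbf{A}_1, \ldots, \mathbf{A}_N$ as its diagonal terms; $\mathrm{Tr}(\cdot)$ and $\mathrm{vec}(\cdot)$ indicate the trace and vectorization; expectation and the real part of a complex variable are denoted by $\mathbb{E}[\cdot]$ and $\Re[\cdot]$.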

II System Model

II-A Hybrid Precoding and Combining

Consider the downlink transmission of a multiuser mm-wave MIMO-OFDM system as shown in Fig. 1. A base station (BS) leverages an $N_t$-size antenna array to serve $K$ users over $F$ subcarriers using OFDM. Each user is equipped with $N_r$ antennas and receives $N_s$ data streams from the BS on each subcarrier. The numbers of available RF chains are $N_\mathrm{RF}^t$ and $N_\mathrm{RF}^r$ for the BS and each user, respectively, which are restricted as $K N_s \le N_\mathrm{RF}^t < N_t$ and $N_s \le N_\mathrm{RF}^r < N_r$.

The received signal of the -th user on the -th subcarrier is given by

(1)

where the subscript stands for the -th user on the -th subcarrier. The average received power of the -th user is denoted as , and is the transmitted signal such that , where is the transmit power. In addition, denotes the circularly symmetric complex Gaussian noise with power as at the users. The digital baseband precoders and combiners are denoted as and , respectively, with dimensions and . Since the transmitted signals for all the users are mixed together by the digital precoders, and analog RF precoding is a post-IFFT (inverse fast Fourier transform) operation, the RF analog precoder with dimension is a common component shared by all the users and subcarriers. Correspondingly, the RF analog combiner is subcarrier-independent for each user. In this paper, we focus on the precoder design while the combiners can be designed in a similar way.

As discussed in Section I, each hybrid precoder structure is primarily determined by the mapping strategy and hardware implementation. In particular, the former maps the signals out of the limited RF chains to the large-scale antenna array, while the latter decides what kind of and how many hardware components are adopted to process the signal for each RF chain-antenna pair. In this section, a novel hardware implementation is first proposed to seek a hardware-efficient hybrid precoder structure. Then, to achieve a better balance between the hardware complexity and spectral efficiency, a flexible mapping strategy is introduced in Section IV.

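Structure | Mapping | Phase shifter: number | Phase shifter: type | Phase shifter: power | Other hardware: component | Other hardware: number | Other hardware: power
SPS [4, 15] | Fully-connected | | Adaptive | 50 mW | N/A | N/A | N/A
SPS [4, 15] | Partially-connected | | Adaptive | 50 mW | N/A | N/A | N/A
SPS with Butler matrices [25] | Fully-connected | | Fixed | 20 mW | Coupler | | 10 mW
SPS with Butler matrices [25] | Partially-connected | | Fixed | 20 mW | Coupler | | 10 mW
DPS [9, 10] | Fully-connected | | Adaptive | 50 mW | N/A | N/A | N/A
DPS [9, 10] | Partially-connected | | Adaptive | 50 mW | N/A | N/A | N/A
FPS | Fully-connected | | Multi-channel fixed | 20 mW | Switch | | 5 mW
FPS | Group-connected | | Multi-channel fixed | 20 mW | Switch | | 5 mW
TABLE I: Comparisons of hardware components in the analog network for different hybrid precoder structures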

II-B FPS Implementation

Recently, a DPS implementation was proposed in [9, 10], which enables low-complexity hybrid precoder design and also greatly improves the spectral efficiency. These benefits come from allowing the same signal to pass through two phase shifters. Inspired by this insight, we propose a hardware-efficient implementation in the following.

In the proposed implementation, phase shifters are used, where , as shown in Fig. 1. One critical difference between the proposed implementation and existing ones is that the number of phase shifters no longer depends on any other parameters, e.g., the number of RF chains or antennas, and can be made very small, which effectively improves the hardware efficiency. Inspired by the beneficial operation in the DPS implementation, the signal from each RF chain is passed through all available phase shifters. In other words, each phase shifter is an -channel phase shifter [26] that can simultaneously process the output signals from RF chains, i.e., in a parallel fashion. On the other hand, while the number of (multi-channel) phase shifters could be small, it is still intractable to shift arbitrary phases or to switch between multiple quantized phase levels at a high speed to adapt to the channel states. In our proposal, instead of variable phase shifters, the phase shifters are assumed to have fixed phases [27], which are independent of the channel states. Thus, this proposal is referred to as the FPS implementation.

Remark 1: With the limited number of fixed phase shifters, the analog precoder can only provide the same static precoding gain for all RF chain-antenna pairs and therefore inevitably entails performance loss.

To overcome this drawback brought by the simplified hardware implementation, we propose to cascade a dynamic switch network after the fixed phase shifters, which is adapted to the channel states. To clearly illustrate the signal flow in the proposed FPS implementation, we focus on one RF chain-antenna pair, as shown in Fig. 2.

Fig. 2: The FPS implementation from an RF chain to a connected antenna.

The fixed phase shifters generate signals with different phases for the output signal of the given RF chain. We propose to adaptively combine a subset of the signals to compose the analog precoding gain from the RF chain to the antenna, which is realized by adaptive switches. Hence, switches are needed for each RF chain-antenna pair. Note that, with only binary on-off states, adaptive switches are much easier to implement than adaptive phase shifters [27, 24].

Remark 2: The adaptive switch network enables the analog precoder to offer various precoding gains for different RF chain-antenna pairs to adapt to the channel states. Later we will see that although the proposed FPS implementation can only provide the analog precoding gains from a -dimension codebook, its performance is satisfactory with just a small value of .

In summary, all the hardware components needed for the FPS implementation are fixed phase shifters and switches per RF chain-antenna pair, and the total number of switches depends on the employed mapping strategy.

Accordingly, the analog RF precoding matrix can be expressed as

(2)

where the switch matrix is a binary matrix with dimension , and the Boolean constraints are induced by the switches with binary states. Note that some entries may be forced to be zero due to different mapping strategies, which shall be discussed later. The matrix stands for the phase shift operation carried out by the available fixed phase shifters, given by a block diagonal matrix as

(3)

where is the normalized phase shifter vector containing all fixed phases . Note that although there are non-zero parameters in matrix , only phase shifters are required since the phase shifters are with parallel channels and shared by all RF chain-antenna pairs.
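To make the structure described around (2) and (3) concrete, the following minimal sketch builds a block diagonal phase shifter matrix from one normalized phase vector and multiplies it by a random binary switch matrix. The names `n_t`, `n_rf`, `n_c`, `fps_phase_matrix`, the uniformly spaced phases, and the 1/sqrt(n_c) normalization are assumptions made for illustration; the text above only states that the phase shifter vector is normalized and that the phase shifter matrix is block diagonal.

```python
import numpy as np

def fps_phase_matrix(n_rf, n_c):
    """Block diagonal matrix with the same normalized phase vector in every block."""
    # Assumption: the fixed phases are uniformly spaced on the unit circle.
    theta = 2 * np.pi * np.arange(n_c) / n_c
    c = np.exp(1j * theta) / np.sqrt(n_c)            # unit-norm phase shifter vector
    return np.kron(np.eye(n_rf), c.reshape(-1, 1))   # shape (n_c * n_rf, n_rf)

rng = np.random.default_rng(0)
n_t, n_rf, n_c = 16, 4, 8                            # hypothetical sizes
C = fps_phase_matrix(n_rf, n_c)
S = rng.integers(0, 2, size=(n_t, n_c * n_rf))       # binary on-off switch states
F_rf = S @ C                                         # analog precoder, shape (n_t, n_rf)
```

In this sketch, entry `F_rf[i, j]` is the sum of the fixed phase terms whose switches are on for antenna `i` and RF chain `j`, so only `n_c` distinct phase values appear even though the matrix `C` has `n_c * n_rf` non-zero entries.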

Table I lists the required hardware components in the analog network for different hybrid precoder structures, as well as the corresponding power consumption of each kind of hardware component [24]. It shows that the proposed FPS implementation employs far fewer (fixed) phase shifters, each of which consumes less power than the adaptive phase shifters in existing works. While a number of switches are cascaded after the fixed phase shifters, the advantages of this proposal in hardware complexity and power consumption shall be demonstrated more explicitly in Section V via numerical comparisons.
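The component counts for the fully-connected mapping can be tallied directly from the descriptions in the text: one phase shifter per RF chain-antenna pair for SPS, two per pair for DPS, and a fixed pool of phase shifters plus one switch per phase shifter and pair for FPS. The sketch below is only such a tally; the function name and the example sizes are hypothetical, and no power figures are derived from it.

```python
def analog_component_counts(n_t, n_rf, n_c):
    """Component counts for fully-connected analog networks, as described in the text."""
    pairs = n_t * n_rf  # every RF chain is connected to every antenna
    return {
        "SPS": {"phase_shifters": pairs, "switches": 0},
        "DPS": {"phase_shifters": 2 * pairs, "switches": 0},
        "FPS": {"phase_shifters": n_c, "switches": n_c * pairs},
    }

# Hypothetical sizes; n_c = 10 echoes the "10 fixed phase shifters" remark in Section I-B.
counts = analog_component_counts(n_t=64, n_rf=4, n_c=10)
for name, entry in counts.items():
    print(name, entry)
```

The tally makes the trade-off visible: the FPS structure removes almost all phase shifters but introduces a switch count that grows with the number of connected pairs, which is what the group-connected mapping is later meant to reduce.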

Remark 3: The ease of implementation and operation is another important aspect in hybrid precoder design. As switches only have binary states while high-resolution phase shifters need to be adaptive between a large number of states, the design and implementation of adaptive switches are generally easier than high-resolution adaptive phase shifters [28], which makes the proposed FPS a practical and hardware-efficient implementation for the hybrid precoder structure.

II-C Problem Formulation

There exist different formulations to maximize the spectral efficiency of hybrid precoding systems. One can either directly maximize the spectral efficiency [5], or adopt other performance metrics, e.g., mean square error (MSE) [29], as surrogates to maximize the spectral efficiency. However, these formulations either result in high-complexity algorithms or yield poor performance. More importantly, in multiuser multicarrier (MU-MC) systems, the analog precoder is a component that is shared by all users and subcarriers, which incurs additional difficulties in hybrid precoder design and therefore calls for a more tractable formulation to maximize the spectral efficiency. It has been shown in [4, 15, 18, 20, 9, 13, 30] that minimizing the Euclidean distance between the fully digital precoder and the hybrid precoder is an effective and tractable alternative objective for maximizing the spectral efficiency in mm-wave systems.

On the other hand, it was found in [9, 10] that the hybrid precoder in the multiuser setting produces residual inter-user interference, as it only approximates the fully digital precoder. Such interference will significantly degrade the system performance, especially in the high-SNR regime. Moreover, this issue is more prominent in the multicarrier system as the analog precoder is shared by a large number of subcarriers.

Therefore, to both effectively approximate the fully digital precoder and cancel the inter-user interference, we propose to apply a two-layer precoding at the baseband [31]. In particular, the digital baseband precoder consists of two parts, i.e.,

(4)

where is a normalization factor, is the precoder that is utilized for approximating the fully digital precoder along with the analog precoder , and is the precoder that is responsible for canceling the inter-user interference. A similar approach was adopted in [32].

Correspondingly, the first task, i.e., to approximate the fully digital precoder, can be formulated as

(5)

where the combined fully digital precoder is denoted as , and is the concatenated digital precoder (see Footnote 1) with dimension . The constraint set of the switch matrix is denoted as . Note that, while the transmit power constraint is not explicitly considered in , it shall be satisfied by adapting the normalization factor after is solved.

Footnote 1: The phrase “digital precoder” is used to refer in the remainder of this paper with a slight abuse of terminology, as it is the digital part in the hybrid precoder that approximates the fully digital precoder.

With the digital precoder at hand, the other precoder is cascaded after it to cancel the inter-user interference based on the effective channel including the hybrid precoder and physical channel, which is given by

(6)

where with dimension is the composite digital precoder on the -th subcarrier. Then, our goal is to design precoders that satisfy the conditions

(7)

A simple way to achieve the conditions is the block diagonal (BD) precoder. More details can be found in [33].
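As a rough illustration of how a BD-type precoder removes inter-user interference, the sketch below uses the standard null-space construction: each user's precoder is confined to the null space of the other users' effective channels. It is not taken from [33] or from (6) and (7), whose exact forms are not reproduced here; the function name `bd_precoders`, the variable `n_in` (number of inputs of the effective channel), and the random effective channels are assumptions, and at least two users are assumed.

```python
import numpy as np

def bd_precoders(h_eff, n_s):
    """Null-space block diagonalization.

    h_eff: list of per-user effective channels, each of shape (n_s, n_in).
    Returns one (n_in, n_s) precoder per user that lies in the null space
    of all other users' effective channels.
    """
    precoders = []
    for k in range(len(h_eff)):
        others = np.vstack([h for j, h in enumerate(h_eff) if j != k])
        # Rows of vh beyond rank(others) span the null space of `others`.
        _, s, vh = np.linalg.svd(others)
        rank = int(np.sum(s > 1e-10 * s.max()))
        null_basis = vh[rank:].conj().T               # (n_in, n_in - rank)
        # Inside the null space, align with user k's own strongest directions.
        _, _, vh_k = np.linalg.svd(h_eff[k] @ null_basis)
        precoders.append(null_basis @ vh_k[:n_s].conj().T)
    return precoders

rng = np.random.default_rng(1)
K, n_s, n_in = 3, 2, 6                                # hypothetical sizes, n_in >= K * n_s
h_eff = [rng.standard_normal((n_s, n_in)) + 1j * rng.standard_normal((n_s, n_in))
         for _ in range(K)]
P = bd_precoders(h_eff, n_s)
# Largest leakage from any user's precoder into any other user's channel.
leak = max(np.linalg.norm(h_eff[j] @ P[k])
           for k in range(K) for j in range(K) if j != k)
```

The construction needs the null space to have at least `n_s` dimensions, which holds here because `n_in` equals `K * n_s`.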

Since the inter-user interference is canceled, we can determine the normalization factor to satisfy the transmit power constraint , which is given by

(8)

Note that the combiners at the user side have the same analog network structure as (2). The hybrid combiners can be designed in a similar way as for each user independently, and thus are omitted due to space limitation. In addition, the problem formulation is not limited to any specific channel models or fully digital precoding schemes. It can be easily observed that the hybrid precoder can be readily designed by (6) to (8) once is solved, and hence we will focus on in the following sections.

III Hybrid Precoder Design With the FPS Implementation

In this section, we design the hybrid precoder with the FPS implementation and the popular fully-connected mapping strategy, for which every entry in the switch matrix is a binary optimization variable and there are in total switches. As shown in the hybrid precoder design problem , the main task is to design the binary switch matrix and the digital precoding matrix . First we make some observations on .

Remark 4: Since the switch matrix can take only finitely many values, the cardinality of the constraint set for the analog precoding matrix is finite, which means that the OMP algorithm [4, 13] is applicable to . However, different from the SPS case, the dictionary in the OMP algorithm for the FPS implementation is excessively large, i.e., , which is a huge number in large-scale antenna systems and hence hinders its practical implementation.

Remark 5: Alternating minimization can be directly applied to where the binary constraints can be tackled with the semidefinite relaxation (SDR) technique [15]. However, an -dimension semidefinite programming (SDP) problem should be solved in each iteration, which causes prohibitive computational complexity. Moreover, how to recover a rank-one solution from an SDR with binary constraints is still an open problem [34]. This means that the optimality of the relaxation in each iteration of the alternating procedure cannot be ensured and hence the overall convergence of the AltMin algorithm cannot be guaranteed.

As discussed above, the main difficulty in solving is the large number of binary constraints on the switch matrix . As a matter of fact, even if we only focus on the design of the switch matrix , is an NP-hard problem [34]. In this section, by deriving an effective surrogate for the objective function and adopting alternating minimization, we develop a low-complexity hybrid precoding algorithm that handles the binary constraints effectively.

Note that the property of the combined digital precoding matrix differs for different system settings. It is a tall matrix in single-carrier systems, i.e., , since . In contrast, when it comes to multicarrier systems, is likely to be a fat matrix as for practical system parameters. As we will see in this section, this difference affects the manipulation of the algorithm, and we first present the hybrid precoder design in single-carrier systems (see Footnote 2).

Footnote 2: In this paper, single-carrier systems refer to single-carrier transmissions assuming flat-fading channels. The choice of such a model is for the ease of presentation, and the algorithm will be later extended to the more realistic multicarrier case with frequency-selective fading channels.

III-A An Upper Bound for the Objective

In [15, 9, 5], it has been shown that imposing a semi-orthogonal structure for is an efficient way to achieve near-optimal performance. Inspired by these results, we take a similar approach as follows. In single-carrier systems, the digital precoding matrix is a tall matrix, and thus the semi-orthogonal constraint is specified as

(9)

where , is a scaling factor, and is a semi-unitary matrix. Then, an upper bound is derived for the objective function in in the following lemma.

Lemma 1.

The objective function in is upper bounded by

(10)
Proof:

The objective function in can be rewritten as

(11)

According to (3), the phase shifter matrix is a semi-unitary matrix, i.e., . Therefore, we can derive an upper bound for the last term in (11), given by

(12)

Step (a) follows the singular value decomposition (SVD) of by utilizing the semi-unitary property of , whose left singular vectors are the columns of . ∎
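The effect of the semi-unitary property used in the proof can be checked numerically. The sketch below assumes the bound is obtained by replacing the squared Frobenius norm of the product of the switch matrix, the phase shifter matrix, and the semi-orthogonal digital precoder with the squared scaling factor times the squared Frobenius norm of the switch matrix; this follows from the two semi-unitary properties but is an assumption about the exact form of (10), which is not reproduced above. All names (`F_opt`, `S`, `C`, `U`, `alpha`) and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_t, n_rf, n_c, n_col = 12, 4, 6, 3                   # hypothetical sizes, n_col <= n_rf

def crandn(*shape):
    """Complex Gaussian test data."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

c = np.exp(2j * np.pi * np.arange(n_c) / n_c) / np.sqrt(n_c)
C = np.kron(np.eye(n_rf), c.reshape(-1, 1))           # semi-unitary: C^H C = I
F_opt = crandn(n_t, n_col)                            # stand-in for the fully digital precoder
S = rng.integers(0, 2, size=(n_t, n_c * n_rf)).astype(float)
U, _ = np.linalg.qr(crandn(n_rf, n_col))              # semi-unitary: U^H U = I
alpha = 0.7                                           # arbitrary positive scaling factor

# Exact objective ||F_opt - alpha * S C U||_F^2.
exact = np.linalg.norm(F_opt - alpha * S @ C @ U) ** 2
# Surrogate: same constant and cross terms, last term relaxed to alpha^2 ||S||_F^2.
cross = np.real(np.trace(U @ F_opt.conj().T @ S @ C))
bound = np.linalg.norm(F_opt) ** 2 - 2 * alpha * cross + alpha**2 * np.linalg.norm(S) ** 2
```

The relaxed term no longer couples the switch matrix with the digital precoder, which is what makes the later closed-form updates possible.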

III-B Alternating Minimization

By adopting the upper bound (10) as the surrogate objective function and dropping the constant term , the hybrid precoder design problem is reformulated as

(13)

Alternating minimization, as an effective tool for optimization problems involving different subsets of variables, has been widely applied and shown empirically successful in hybrid precoder design [15, 9, 5]. In this section, we apply this design principle to the hybrid precoder design with the FPS fully-connected structure. In each step of the AltMin algorithm, one subset of the optimization variables is optimized while keeping the other parts fixed.

When the switch matrix and are fixed, the optimization problem can be written as

(14)

According to the definition of the dual norm [35], we have

(15)

where and stand for the infinite and one Schatten norms [35], and (b) follows from Hölder’s inequality. Equality holds only when

(16)

where follows the SVD and is a diagonal matrix with non-zero singular values .
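The update described above has the form of an orthogonal Procrustes problem, for which the SVD-based solution is standard. The sketch below assumes the subproblem maximizes the real part of the trace of the product of the semi-unitary matrix and a fixed matrix formed from the fully digital precoder, the switch matrix, and the phase shifter matrix; the function name and all variable names are hypothetical, and the code is not claimed to be identical to (16).

```python
import numpy as np

def update_semi_unitary(F_opt, S, C):
    """Semi-unitary U maximizing Re tr(U F_opt^H S C), with the attained value."""
    M = F_opt.conj().T @ S @ C                        # shape (n_col, n_rf)
    A, sigma, Bh = np.linalg.svd(M, full_matrices=False)
    U = Bh.conj().T @ A.conj().T                      # shape (n_rf, n_col), U^H U = I
    return U, sigma.sum()                             # optimum equals the nuclear norm of M

rng = np.random.default_rng(4)
n_t, n_rf, n_c, n_col = 12, 4, 6, 3                   # hypothetical sizes
c = np.exp(2j * np.pi * np.arange(n_c) / n_c) / np.sqrt(n_c)
C = np.kron(np.eye(n_rf), c.reshape(-1, 1))
F_opt = rng.standard_normal((n_t, n_col)) + 1j * rng.standard_normal((n_t, n_col))
S = rng.integers(0, 2, size=(n_t, n_c * n_rf)).astype(float)
U, best_value = update_semi_unitary(F_opt, S, C)
```

The returned value is the sum of singular values of the fixed matrix, which matches the dual-norm argument: the trace inner product with a matrix of unit spectral norm cannot exceed the one Schatten norm.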

While we can divide the optimization of the two variables and into two separate subproblems, we propose to update them simultaneously to save the number of subproblems involved in the AltMin algorithm and therefore reduce the computational complexity. By adding a constant term to the objective function in , the subproblem of updating and can be recast as

(17)
Proposition 1.

The optimal solution to (17) is given by

(18)
(19)

where , , is the indicator function, and denotes an matrix with all entries equal to one. The objective function in (17) can be rewritten as in (36) in the proof. In addition, is the -th smallest entry in , and

(20)

where .

Proof:

See Appendix A. ∎

Basically, is a quadratic function within each interval , as shown in (36) in the proof. This means that the optimal solutions of in all the intervals can only be obtained either at the endpoints of the intervals, i.e., , or at the vertices of the parabolas, i.e., , if they fall into the intervals. Therefore, the optimal is obtained via a closed-form solution by comparing the optimal solutions of in all the intervals , as indicated in (18). Nevertheless, since the number of intervals to be compared is , it will incur high computational complexity when is large as in mm-wave systems. In the following lemma, we show that there is no need to compute the optimal in all the intervals , which further reduces the complexity of the proposed algorithm.

Lemma 2.

The optimal is obtained at one of the points , where denotes the set of the ’s that have finite values of .

Proof:

See Appendix B. ∎

Lemma 2 indicates that no endpoint of the intervals can be the optimal solution for . Moreover, since is a coercive function, i.e., , we only need to pick the ’s that have finite values of , i.e., the ones that satisfy the first two conditions in (20), and the optimal solution for is given by

(21)

By Lemma 2, the number of intervals we need to compare to obtain the optimal is shrunk from to , which is empirically shown to be less than 5 via simulations in Section V and hence further reduces the computational complexity of the proposed AltMin algorithm.
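The piecewise-quadratic structure discussed above can be exploited with a single sort. The sketch below assumes that, after adding the constant term, the subproblem amounts to fitting a real matrix `R` by a positive scalar times a binary matrix in the Frobenius norm, where `R` would be the real part of the product of the fully digital precoder with the conjugate transposes of the semi-unitary digital precoder and the phase shifter matrix. That reading, the restriction to a positive scaling factor, and the function name are assumptions; the code is not claimed to reproduce (18) to (21) literally.

```python
import numpy as np

def update_alpha_and_switches(R):
    """Jointly minimize ||R - alpha * S||_F^2 over alpha > 0 and binary S."""
    x = np.sort(R.ravel())[::-1]                      # entries, largest first
    prefix = np.cumsum(x)
    counts = np.arange(1, x.size + 1)
    # Selecting the i largest entries puts the parabola vertex at their mean and
    # lowers the objective by prefix_i^2 / i (only useful if that mean is positive).
    gain = np.where(prefix > 0, prefix**2 / counts, 0.0)
    best = int(np.argmax(gain))
    if gain[best] <= 0:                               # no positive mean: keep all switches off
        return 0.0, np.zeros(R.shape, dtype=int)
    alpha = prefix[best] / counts[best]
    S = (R > alpha / 2).astype(int)                   # threshold rule for fixed alpha
    return alpha, S

R = np.array([[0.9, -0.2, 0.4],
              [0.1, 0.7, -0.5]])                      # hypothetical data
alpha, S = update_alpha_and_switches(R)
```

Only vertices of the parabolas are ever evaluated, in line with Lemma 2, and the cost is dominated by the sort.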

Thus, we have shown that, with the help of the upper bound derived in (12), the large-scale binary switch matrix can be efficiently optimized by a closed-form solution (19), which verifies the benefits of the surrogate objective function adopted in . With the closed-form solutions derived in (16), (19), and (21) at hand, the AltMin algorithm for the FPS fully-connected structure in single-carrier systems is summarized as the FPS-AltMin Algorithm. There are several issues involved in the FPS-AltMin algorithm that require some further remarks.

0:  
1:  Construct an initial point for according to (16) and set ;
2:  repeat
3:     Fix , optimize and according to (21) and (19), respectively;
4:     Fix and , update with (16);
5:     ;
6:  until convergence.
7:  Compute the additional BD precoder at the baseband to cancel the inter-user interference [9], and calculate the normalization factor according to (8) for the hybrid precoder at the transmit end.
8:  return   and .
FPS-AltMin Algorithm: A Low-Complexity Hybrid Precoding Algorithm for the FPS Fully-Connected Structure

1) Convergence: The FPS-AltMin algorithm is essentially a block coordinate descent (BCD) algorithm with two blocks and , whose globally optimal solutions are given by (16), (19) and (21). Hence, the algorithm is guaranteed to converge to a stationary point of [36].
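The two-block structure and its monotone behavior can be sketched on a toy problem. The sketch below assumes a surrogate of the form ||F_opt F_DD^H C^H - alpha*S||_F^2, a phase shifter matrix C with orthonormal columns, and a semi-unitary F_DD; since (12), (16), (19), and (21) are not reproduced here, these forms, all helper names, and the identity-like starting point are illustrative assumptions rather than the exact steps of the FPS-AltMin algorithm. The final BD precoder and normalization step is omitted.

```python
import numpy as np

def phase_shifter_matrix(n_rf, n_c):
    # n_c fixed phases spread uniformly over [0, 2*pi), one copy per RF chain.
    c = np.exp(1j * 2 * np.pi * np.arange(n_c) / n_c) / np.sqrt(n_c)
    return np.kron(np.eye(n_rf), c.reshape(-1, 1))

def update_alpha_switch(X):
    # Exact joint minimiser of ||X - alpha*S||_F^2 over real alpha and binary S
    # (X real): compare the means of the k largest / k smallest entries.
    x = np.sort(X.ravel())[::-1]
    k = np.arange(1, x.size + 1)
    cands = np.concatenate((np.cumsum(x) / k, np.cumsum(x[::-1]) / k))
    costs = [np.sum(np.minimum(X**2, (X - a)**2)) for a in cands]
    alpha = float(cands[int(np.argmin(costs))])
    S = ((X - alpha)**2 < X**2).astype(float)
    return alpha, S

def update_digital(F_opt, S, C, alpha):
    # Orthogonal Procrustes step: semi-unitary F_DD from a thin SVD.
    U, _, Vh = np.linalg.svd(alpha * F_opt.conj().T @ S @ C, full_matrices=False)
    return Vh.conj().T @ U.conj().T

def surrogate(F_opt, F_dd, S, C, alpha):
    return float(np.linalg.norm(F_opt @ F_dd.conj().T @ C.conj().T - alpha * S)**2)

def fps_altmin_toy(F_opt, C, iters=20):
    n_rf, n_s = C.shape[1], F_opt.shape[1]
    F_dd = np.eye(n_rf, n_s, dtype=complex)  # simple semi-unitary start (assumption)
    history = []
    for _ in range(iters):
        X = np.real(F_opt @ F_dd.conj().T @ C.conj().T)
        alpha, S = update_alpha_switch(X)            # block 1: scalar and switches
        F_dd = update_digital(F_opt, S, C, alpha)    # block 2: digital precoder
        history.append(surrogate(F_opt, F_dd, S, C, alpha))
    return alpha, S, F_dd, history
```

Because each block is minimized exactly while the other is held fixed, the values recorded in `history` should be non-increasing, which is the property the convergence remark relies on.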

2) Initial point: Since the algorithm converges to a stationary point, it may be sensitive to the initial point . We provide a way to construct an initial point in the FPS-AltMin algorithm. The fully digital precoding matrix can be decomposed as follows according to its SVD , i.e.,

(22)

where is an matrix with full column rank, is a -dimension square matrix, and is an arbitrary matrix. In (22), the fully digital precoding matrix is decomposed into two matrices that satisfy the dimensions of and , respectively. In other words, , , and is a globally optimal solution to the hybrid precoding problem without any constraints on the analog precoding matrix . In this way, we generate the initial point as

(23)

Note that fully extracts the information of the row space of , whose basis consists of the first rows in . We also stress that the satisfies the semi-unitary constraint introduced in (9).
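One way to read this construction is sketched below. Because (22) and (23) are not reproduced here, the specific form used (the right singular vectors of the fully digital precoder stacked on top of zero rows) is an assumption that is merely consistent with the description above, and the function name `initial_digital_precoder` is hypothetical.

```python
import numpy as np

def initial_digital_precoder(F_opt, n_rf):
    # Thin SVD: F_opt = U diag(s) Vh, with Vh of size n_s x n_s
    # (requires at least as many rows as columns in F_opt).
    _, _, Vh = np.linalg.svd(F_opt, full_matrices=False)
    n_s = F_opt.shape[1]
    # Stack Vh on top of zeros so the result is n_rf x n_s and semi-unitary.
    return np.vstack((Vh, np.zeros((n_rf - n_s, n_s), dtype=Vh.dtype)))
```

With this choice, multiplying `[U diag(s), A]` by the returned matrix recovers `F_opt` for any matrix `A` of matching size, which mirrors the role of the arbitrary matrix in the decomposition.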

3) Computational complexity: We compare the computational complexity of the proposed algorithm with the ones mentioned in Remarks 4 and 5. Since the dictionary size in the OMP algorithm is , the computational complexity could be prohibitively high even though this algorithm only needs a small number of iterations. For the SDR method mentioned in Remark 5, in each iteration (where the procedure that updates both the analog and digital precoders is counted as one iteration), an -dimension SDP problem should be solved for updating the analog part while a pseudo-inverse operation is needed for updating the digital precoder. Therefore, the computational complexity per iteration is . On the contrary, in each iteration of the proposed FPS-AltMin algorithm, the computational complexity is dominated by the truncated SVD and sorting operations, with the complexity , which is much lower than those of the OMP algorithm and SDR method. As a reference point, to solve the switch matrix in one iteration, the running time of the SDR method is 1.3 s while the proposed FPS-AltMin algorithm takes 0.04 s when , , and .

(a) Fully-connected mapping strategy.
(b) Partially-connected mapping strategy.
(c) Group-connected mapping strategy.
Fig. 3: Three mapping strategies for hybrid precoding in mm-wave MIMO systems: each RF chain is connected to all antennas in (a), to antennas in (b), and to antennas in (c).

III-C Hybrid Precoder Design in Multicarrier Systems

Multicarrier techniques such as OFDM are often utilized to overcome the frequency-selective fading caused by the large available bandwidth in mm-wave systems. Compared with the narrowband hybrid precoder design in Section III, the main difference in OFDM systems is that the analog precoder is shared not only by all users but also across all subcarriers [15, 21]. In particular, the digital precoding matrix in is no longer a tall matrix, since for practical OFDM system settings.

In this section, we modify the FPS-AltMin algorithm for OFDM systems. Similar to (9), we enforce a semi-orthogonal constraint on the digital precoding matrix. As is generally a fat matrix, the semi-orthogonal constraint is specified as

(24)

In this way, the upper bound of the objective function derived in (12) still holds since

(25)

where (c) comes from the SVD of , i.e., , since is a semi-unitary matrix, and the columns of are the left singular vectors of . As the modifications in multicarrier systems lie in the digital precoding matrices and , in the modified AltMin algorithm, the update of and is the same as that in Section III-B. On the other hand, since is a fat matrix in OFDM systems, the optimization of should be modified as

(26)

where and is a diagonal matrix with non-zero singular values , which is the SVD of . Correspondingly, the construction of the initial is given by

(27)

where is the SVD of and the subscript denotes the first to the -th columns of a matrix.

By substituting (27) and (26) into Steps 1 and 4 in the FPS-AltMin algorithm, respectively, we obtain the modified FPS-AltMin algorithm for mm-wave OFDM systems. The conclusion on convergence remains the same as was discussed in Section III-B while the computational complexity is . Furthermore, the inter-user interference canceling approach can also be extended to OFDM systems, i.e., an additional BD precoder is utilized based on the effective channel that is defined as

(28)

where with dimension is the composite digital precoder on the -th subcarrier. Therefore, the extension to multicarrier systems does not lead to extra design difficulties compared with single-carrier systems.

IV The Group-Connected Mapping Strategy for Hybrid Precoding

In previous sections, the hybrid precoder design is based on a novel hardware implementation but with a conventional mapping strategy, i.e., the fully-connected mapping. In this section, a new mapping strategy, called the group-connected mapping, is proposed to offer a flexible trade-off between hardware complexity and spectral efficiency. In particular, with this mapping strategy, the number of switches in the FPS implementation is further reduced.

IV-A The Group-Connected Mapping Strategy

Fig. 3 compares different mapping strategies. In the group-connected mapping, the RF chains and antennas are divided into groups, as shown in Fig. 3(c). Within each group, the mapping strategy is the same as the fully-connected mapping, i.e., each RF chain is connected to all antennas. Thus, the analog precoding matrix has the block diagonal structure, with each block corresponding to one RF chain-antenna group, specified as

(29)

with being the analog precoding matrix in the -th group. Note that while the RF chains and antennas are uniformly divided into groups in Fig. 3(c) to simplify notation, the grouping can be flexible, i.e., the numbers of RF chains and antennas in different groups can be different.

The proposed group-connected mapping is a general mapping strategy that incorporates existing mapping strategies as special cases:

  • When , which means that all RF chains and antennas are in the only one group, the group-connected mapping reduces to the fully-connected one, as shown in Fig. 3(a).

  • When , which means there is only one RF chain in each group, and each of them is connected to antennas, as shown in Fig. 3(b), the mapping strategy corresponds to the partially-connected one, and the analog precoding matrix is a block diagonal matrix with each block being an -dimension vector [15, Eq. 29].
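The two special cases can be checked with a small connection mask, where a 1 in entry (i, j) means that antenna i is connected to RF chain j. The function name `group_connected_mask` and the uniform grouping are assumptions for illustration.

```python
import numpy as np

def group_connected_mask(n_t, n_rf, eta):
    # Uniform grouping only: eta must divide both the antennas and the RF chains.
    if n_t % eta or n_rf % eta:
        raise ValueError("eta must divide both n_t and n_rf in this uniform sketch")
    block = np.ones((n_t // eta, n_rf // eta), dtype=int)
    # Block-diagonal pattern: one all-ones block per RF chain-antenna group.
    return np.kron(np.eye(eta, dtype=int), block)
```

With one group the mask is all ones (fully-connected mapping); with as many groups as RF chains every antenna is tied to exactly one RF chain (partially-connected mapping). In general the number of connections is the product of the numbers of antennas and RF chains divided by the number of groups.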

Inevitably, trade-offs need to be made between hardware complexity and spectral efficiency. The two existing mapping strategies provide such a trade-off, but only at the two extremes. The fully-connected mapping strategy has low hardware efficiency, while the partially-connected one incurs substantial performance degradation. In contrast, it will be shown later in Section V that the group-connected mapping provides a smoother transition between the two extreme cases. To the best of the authors’ knowledge, this is the first proposal for a general mapping strategy in hybrid precoding systems.

Similar to existing mapping strategies, the group-connected mapping can also be applied to hybrid precoding along with any hardware implementations, e.g., SPS, DPS, and FPS implementations. As this paper mainly focuses on the FPS hardware implementation, we will elaborate the hybrid precoder design with the FPS group-connected structure in the following.

0:  
1:  if  then
2:     Construct an initial point for according to (23) and set ;
3:     repeat
4:        Fix , optimize and according to (21) and (19), respectively;
5:        Fix and , update with (16);
6:        ;
7:     until convergence.
8:  else
9:     Construct an initial point for according to (27) and set ;
10:     repeat
11:        Fix , optimize and according to (21) and (19), respectively;
12:        Fix and , update with (26);
13:        ;
14:     until convergence.
15:  end if
16:  Compute the additional BD precoder at the baseband to cancel the inter-user interference [9], and calculate the normalization factor according to (8) for the hybrid precoder at the transmit end.
17:  return   and .
FPS-AltMin Algorithm: A Low-Complexity Hybrid Precoding Algorithm for the FPS Group-Connected Structure

IV-B Hybrid Precoder Design for the FPS Group-Connected Structure

As mentioned before, the numbers of RF chains and phase shifters have already been reduced by the FPS implementation. On the other hand, the number of switches depends on the number of connections, which in turn is determined by the mapping strategy. For the group-connected structure, the analog precoding matrix can be rewritten as

(30)

where is a block diagonal matrix that extracts the first blocks from the matrix in (3), and with dimension is the switch matrix for the -th group. Hence, there are RF chain-antenna pairs, and the number of switches in use is , which is reduced by the factor of compared with the FPS fully-connected structure. Furthermore, the hardware implementation of the analog network is simplified with the group-connected mapping. In particular, with the conventional fully-connected mapping, -way power dividers and -way power combiners are required [37]. In contrast, with the proposed group-connected mapping, only -way power dividers and -way power combiners are needed.

Fortunately, the reduced hardware complexity does not incur additional difficulties and computational complexity in hybrid precoder design. Due to the block diagonal structure of , the product of and can be expressed as

(31)

The matrix is the sub-matrix consisting of the -th to the -th rows of . In this way, the hybrid precoder design problem can be decoupled into subproblems, each of which corresponds to one group, given by

(32)

where is the sub-matrix that extracts the -th to the -th rows from . We can observe that each subproblem has the same form as that of the FPS fully-connected structure. This is also intuitive, since the mapping strategy within each group is nothing but the fully-connected one.

Following the same procedures in Sections III and III-C, the subproblems can be solved in a parallel fashion. The only additional step is to determine whether the matrix is a tall or fat matrix, i.e., to decide whether or not, since they correspond to different ways to update in single-carrier and multicarrier design, respectively. For the FPS group-connected structure, the computational complexity of the proposed FPS-AltMin algorithm is .
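The decoupling can be verified numerically: with a block diagonal analog precoder, the overall approximation error equals the sum of the per-group errors. The sketch below assumes uniform groups; the helper names and the tall-versus-fat test (rows at least as many as columns) are illustrative assumptions, since the exact condition is not reproduced here.

```python
import numpy as np

def block_diag(blocks):
    # Assemble a block diagonal matrix from a list of 2-D blocks.
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols), dtype=complex)
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

def split_groups(F_opt, F_bb, eta):
    # Group i receives its own rows of the fully digital precoder
    # and its own rows of the digital precoder.
    return list(zip(np.split(F_opt, eta, axis=0), np.split(F_bb, eta, axis=0)))

def is_tall(F_bb_i):
    # Assumed selector between the single-carrier and multicarrier updates.
    rows, cols = F_bb_i.shape
    return rows >= cols

def total_error(F_opt, rf_blocks, F_bb):
    return float(np.linalg.norm(F_opt - block_diag(rf_blocks) @ F_bb)**2)

def per_group_errors(F_opt, rf_blocks, F_bb):
    pairs = split_groups(F_opt, F_bb, len(rf_blocks))
    return [float(np.linalg.norm(Fo - Frf @ Fb)**2)
            for (Fo, Fb), Frf in zip(pairs, rf_blocks)]
```

Since the per-group terms share no variables, each subproblem can be handed to an independent worker, and `is_tall` decides which digital precoder update applies within that group.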

Note that this design methodology for the group-connected mapping is applicable to any kind of hardware implementation. This means that the algorithm design for the group-connected mapping with any hardware implementation can be realized by directly migrating the design for the fully-connected mapping, which has been investigated in many existing works [4, 11, 6, 5, 15]. It also shows the benefits of introducing this group-connected mapping from the algorithmic perspective.

V Simulation Results

In this section, we evaluate the performance of the proposed FPS-AltMin algorithm via simulations. Unless otherwise specified, the BS and each user are equipped with 144 and 16 antennas, respectively, while all the transceivers are equipped with uniform planar arrays. The phases of the available fixed phase shifters are uniformly separated within by equal-length intervals. Four users and 128 subcarriers are assumed when considering multiuser OFDM systems. To reduce the cost and power consumption, the minimum number of RF chains is adopted according to the assumptions in Section II-A, i.e., and . The nominal SNR is defined as , and all the simulation results are averaged over 1000 channel realizations. For the fully digital precoder, the BD precoder is adopted, which is asymptotically optimal in high SNR regimes [33]. Furthermore, the Saleh-Valenzuela model is adopted in simulations to characterize mm-wave channels [4, 15], and the frequency domain channel matrix for the -th subcarrier is given by [38, 15]

(33)

where is the normalization factor. The numbers of clusters and rays in each cluster are represented by and , respectively. The channel gain of the -th ray in the -th cluster is denoted as . Furthermore, represent the receive and transmit array response vectors, where () and () stand for azimuth and elevation angles of arrival and departure, respectively. While this channel model is used in the simulation, our precoder design does not depend on the channel model and is also applicable to other more general models.

V-A Single-User Single-Carrier (SU-SC) Systems

As a great number of previous efforts have been spent on point-to-point systems, it is of interest to test the performance of the proposed implementation and algorithm against existing works as benchmarks. The OMP algorithm proposed in [4, 13] has been widely used as a low-complexity algorithm with the analog precoder selected from a predefined set, which contains the array response vectors of the channels. An alternating minimization algorithm, referred to as the MO-AltMin algorithm, was then proposed in [15] to improve the performance over the OMP algorithm, yet with the high computational complexity of performing the manifold optimization. For the SPS partially-connected structure, a dynamic subarray approach was proposed in [17] to compensate for the performance loss caused by the fewer connections between the RF chains and antennas. As the algorithm in [17] can only design the hybrid precoder at the BS side, a fully digital combiner is adopted at the user side for this approach, while the other approaches adopt hybrid combiners in Fig. 4.

Fig. 4: Spectral efficiency achieved by different hybrid precoding algorithms in SU-SC systems when and .

In Fig. 4, the performance of a random binary switch matrix in the FPS fully-connected structure is first presented. It shows that this approach is far from satisfactory and therefore a careful design of the switch matrix is needed. Fig. 4 also compares the performance achieved by the proposed FPS-AltMin algorithm in the FPS fully-connected structure with three existing approaches in the SPS fully-connected structure. It shows that, although the phase shifters have fixed phases and their number is small, i.e., 30 fixed phase shifters, the proposed FPS fully-connected structure achieves the highest spectral efficiency. Thanks to the proposed low-complexity FPS-AltMin algorithm, the simulation time of the proposed algorithm is comparable to that of the OMP algorithm for the SPS fully-connected structure. The performance gain in spectral efficiency over the benchmarks is mainly attributed to the proposed FPS hardware implementation, where each signal from an RF chain passes through more than one phase shifter. Furthermore, the results show that the proposed FPS-AltMin algorithm leads to an effective design of the dynamic switch network. Note that the MO-AltMin algorithm is so far the one that achieves the best performance in the SPS fully-connected structure, which means the proposed structure and algorithm stand out as an excellent candidate for hybrid precoding with high hardware efficiency, high spectral efficiency, and a low-complexity design methodology.

V-B Multiuser Multicarrier Systems

As we have shown that only a small number of phase shifters is required to approach the performance of the fully digital precoder in SU-SC systems, a natural question is whether this phenomenon still holds when the analog precoder is shared by all subcarriers and users in MU-MC systems. While the MO-AltMin algorithm handles the unit modulus constraint induced by the SPS implementation well, its extremely high computational complexity hinders its further extension to MU-MC systems, where the dimension of the optimization scales up quickly.

Besides the fully digital case, we consider the following three baseline cases for comparison. A hybrid precoder design where one phase shifter is optimized in each iteration was developed in [39], which so far achieves the best spectral efficiency in the literature. In addition, Butler matrices can utilize fixed phase shifters and hybrid couplers to realize the SPS fully-connected structure, and the OMP algorithm is suitable for designing the analog network based on Butler matrices. In [9], the DPS fully-connected structure was proposed for MU-MC systems to approach the performance of the fully digital precoder by sacrificing the hardware efficiency of employing a large number of phase shifters, i.e., phase shifters. In the evaluation of MU-MC systems, the DPS fully-connected structure is adopted as the benchmark, where a simple low-rank matrix approximation is enough for designing the hybrid precoder.

Fig. 5: Spectral efficiency achieved by different hybrid precoding algorithms in MU-MC systems when , , and .

As shown in Fig. 5, the proposed FPS fully-connected structure only entails little performance loss compared to the DPS fully-connected one when only 30 fixed phase shifters are adopted. Both the DPS fully-connected and FPS fully-connected structures benefit from the operation that allows the same signal to pass through multiple phase shifters, while the main difference between them is the quantized and fixed phases assumed in the FPS one. This simulation result demonstrates that the performance loss caused by the quantization is negligible with the proposed hybrid precoder structure. On the other hand, the FPS fully-connected structure enjoys significant improvement in terms of spectral efficiency compared with the SPS fully-connected structure with the algorithm in [39] and the OMP algorithm based on Butler matrices, which illustrates the effectiveness of both the newly proposed implementation and algorithm. More importantly, it indicates that the number of phase shifters can also be sharply reduced by the proposed FPS implementation even if the analog precoder is shared in MU-MC systems.

Fig. 6: Spectral efficiency achieved by different hybrid precoding algorithms in mm-wave MIMO systems given SNR dB.
TABLE II: Power consumption of the analog network for different hybrid precoder structures in MU-MC systems

| Structure | Phase shifters: number | Phase shifters: type | Other hardware: type | Other hardware: number | Total power (see footnote 6) |
|---|---|---|---|---|---|
| DPS fully-connected [9] | 2304 | Adaptive | N/A | N/A | 115.2 W |
| FPS fully-connected | 10 | Fixed (see footnote 7) | Switch | 11520 | 59.2 W |
| SPS fully-connected, 4-bit quantization [39] | 1152 | Adaptive | N/A | N/A | 57.6 W |
| FPS fully-connected | 2 | Fixed | Switch | 2304 | 11.84 W |
| SPS fully-connected with Butler matrices | 3456 | Fixed | Coupler | 4032 | 109.44 W |

Footnote 6: The total power consumed by the main hardware components in the analog network.
Footnote 7: For fair comparisons, the power consumed by the FPS implementation is counted by calculating the power of fixed phase shifters, each of which has the same power consumption as the fixed phase shifter in the Butler matrix implementation.

V-C Comparisons of Hardware Efficiency

To improve the hardware efficiency, the number of fixed phase shifters, i.e., , should be reduced to a minimum. Thus, a natural question is how many fixed phase shifters are needed to support a satisfactory spectral efficiency. Fig. 6 plots the spectral efficiency achieved with different numbers of fixed phase shifters, i.e., . The simulation parameters are the same as those in Figs. 4 and 5 for SU-SC and MU-MC systems, respectively. Fig. 6 shows that in SU-SC systems 15 phase shifters are enough for achieving a satisfactory performance, as the spectral efficiency almost saturates when we further increase the number of fixed phase shifters. By contrast, 576 variable phase shifters with arbitrary precision are needed in the SPS implementation. Moreover, the OMP algorithm achieves a lower spectral efficiency and the MO-AltMin algorithm suffers from high computational complexity. A similar phenomenon is found in MU-MC systems, i.e., around 10 fixed phase shifters are sufficient, which has not been revealed in existing works. Although the DPS implementation slightly outperforms the proposed FPS-AltMin algorithm, it employs 200 times more phase shifters with variable and high resolution. This illustrates that the proposed FPS implementation is much more hardware-efficient than existing hybrid precoder implementations while delivering satisfactory performance, which is quite attractive for practical implementation of hybrid precoding.

As MU-MC is more likely to be the system setting in future 5G mm-wave networks, we compare the power consumption of different hybrid precoder structures in such systems, as listed in Table II.

As the power consumption of the baseband and RF chains is the same for different hybrid precoder structures, in this section we compare the power consumption of the analog network, which is the distinct part for different structures and is mainly determined by the power consumed by phase shifters, switches or couplers. The total power consumption of the analog network in Table II is calculated as

(34)

where and are the power consumption of each phase shifter and switch/coupler given in Table I. For fair comparisons, we compare the hardware efficiency by calculating the power consumption of different hybrid precoder structures while keeping comparable spectral efficiency. As indicated in Fig. 6, 10 fixed phase shifters in the FPS fully-connected structure are sufficient to achieve performance comparable to that of the DPS fully-connected one. Table II shows that, while a switch network is required in the FPS fully-connected structure, it consumes much less power as the power consumption of each switch is small. This leads to a higher hardware efficiency than the DPS fully-connected structure that requires a large number of adaptive phase shifters.
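The totals in Table II can be reproduced with a short calculation of the form in (34). The per-unit powers below are assumptions back-calculated from the Table II totals, not values quoted from Table I; likewise, the choice of 8 RF chains and the counting of one physical fixed phase shifter per phase and RF chain in the FPS rows are inferences from the switch counts and the table footnote.

```python
import math

def analog_network_power(n_ps, p_ps, n_other=0, p_other=0.0):
    """Total analog-network power in watts: phase shifters plus switches or couplers."""
    return n_ps * p_ps + n_other * p_other

# Assumed per-unit powers in watts (back-calculated, not taken from Table I).
P_ADAPTIVE, P_FIXED, P_SWITCH, P_COUPLER = 0.05, 0.02, 0.005, 0.01
N_RF = 8  # assumed number of RF chains, inferred from the switch counts

totals = {
    "DPS fully-connected": analog_network_power(2304, P_ADAPTIVE),
    "FPS fully-connected, 10 phases": analog_network_power(10 * N_RF, P_FIXED, 11520, P_SWITCH),
    "SPS fully-connected, 4-bit": analog_network_power(1152, P_ADAPTIVE),
    "FPS fully-connected, 2 phases": analog_network_power(2 * N_RF, P_FIXED, 2304, P_SWITCH),
    "SPS with Butler matrices": analog_network_power(3456, P_FIXED, 4032, P_COUPLER),
}

# Ratio between the 4-bit SPS structure and the 2-phase FPS structure.
sps_to_fps_ratio = totals["SPS fully-connected, 4-bit"] / totals["FPS fully-connected, 2 phases"]
```

Under these assumptions the five totals match the last column of Table II, and `sps_to_fps_ratio` comes out slightly below 5.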

On the other hand, it is found in Fig. 6 that 2 fixed phase shifters in the FPS fully-connected structure are sufficient for achieving a spectral efficiency comparable to that of the SPS fully-connected one with the algorithm in [39]. Note that although infinite resolution phase shifters are assumed in [39], quantized phase shifters should be adopted to ensure practical comparison in terms of the power consumption. Therefore, as suggested in [24], all the phase shifters in the SPS fully-connected structure are quantized with 4 bits. According to Table II, to achieve the same spectral efficiency, the SPS fully-connected structure needs almost 5 times more power than the FPS fully-connected one, which again demonstrates the advantages of our proposal in terms of hardware efficiency. In addition, due to the large numbers of fixed phase shifters and hybrid couplers in the Butler matrix implementation, it suffers from a huge power consumption and the lowest spectral efficiency, which results in a low hardware efficiency. Moreover, it is observed that different levels of hardware efficiency can be readily achieved by adapting the number of fixed phase shifters in the FPS fully-connected structure.

Fig. 7: Spectral efficiency of different values of with the FPS group-connected structure in SU-SC systems when , , , and .

V-D The FPS Group-Connected Hybrid Precoder Structure

In this part, we evaluate the spectral efficiency achieved by the proposed group-connected mapping strategy. By employing this mapping strategy with the FPS implementation, the number of switches can be reduced by a factor of , which is the number of groups in the mapping. In existing works, only the fully-connected () and p