# Channel Reconstruction-Based Hybrid Precoding for Millimeter Wave Multi-User MIMO Systems

Miguel R. Castellanos, Vasanthan Raghavan, Jung H. Ryu,
Ozge H. Koymen, Junyi Li, David J. Love, and Borja Peleato
Purdue University, West Lafayette, IN 47907, USA
Qualcomm Corporate R&D, Bridgewater, NJ 08807, USA
This material is based upon work supported in part by the National Science Foundation under grants CCF1403458 and CNS1642982.
###### Abstract

The focus of this paper is on multi-user multi-input multi-output (MIMO) transmissions for millimeter wave systems with a hybrid precoding architecture at the base-station. To enable multi-user transmissions, the base-station uses a cell-specific codebook of beamforming vectors over an initial beam alignment phase. Each user uses a user-specific codebook of beamforming vectors to learn the top-$P$ (where $P \geq 1$) beam pairs in terms of the observed signal-to-noise ratio (SNR) in a single-user setting. The top-$P$ beam indices along with their SNRs are fed back from each user and the base-station leverages this information to generate beam weights for simultaneous transmissions. A typical method to generate the beam weights is to use only the best beam for each user, either to steer energy along this beam or to reduce multi-user interference. The other beams are used as fall back options to address blockage or mobility. Such an approach completely discards information learned about the channel condition(s) even though each user feeds back this information. With this background, this work develops an advanced directional precoding structure for simultaneous transmissions at the cost of an additional marginal feedback overhead. This construction relies on three main innovations: 1) Additional feedback to allow the base-station to reconstruct a rank-$P$ approximation of the channel matrix between it and each user, 2) A zeroforcing structure that leverages this information to combat multi-user interference by remaining agnostic of the receiver beam knowledge in the precoder design, and 3) A hybrid precoding architecture that allows both amplitude and phase control at low complexity and cost to allow the implementation of the zeroforcing structure. Numerical studies show that the proposed scheme results in a significant sum rate performance improvement over naïve schemes even with a coarse initial beam alignment codebook.

###### Index Terms

Millimeter wave, multi-input multi-output, multi-user, beamforming, hybrid precoding, phase and amplitude control, zeroforcing, generalized eigenvector, channel estimation

## I Introduction

Over the last few years, there has been growing interest in leveraging the newly opened spectrum in the millimeter wave band (- GHz) to meet the emerging higher data rate demands of cellular systems [1, 2, 3, 4]. Communications in the millimeter wave band suffer from increased path loss exponents, higher shadow fading, blockage and penetration losses, etc., relative to sub-6 GHz systems, leading to a poorer link margin than legacy systems [5, 6, 7, 8, 9, 10]. However, by restricting attention to small cell coverage and by reaping the increased array gains from the use of large antenna arrays at both the base-station and user ends, significant rate improvements can be realized in practice.

Millimeter wave propagation is spatially sparse with few dominant clusters in the channel relative to the number of antennas [5, 6, 11, 12]. Spatial sparsity of the channel along with the use of large antenna arrays motivates a subset of physical layer beamforming schemes based on directional transmissions for signaling. In this context, there have been a number of studies on the design and performance analysis of directional beamforming/precoding structures for single-user multi-input multi-output (MIMO) systems [13, 14, 15, 16, 17, 18, 19, 20, 21, 22]. These works [16, 17, 18, 19] show that directional schemes are not only good from an implementation standpoint, but are also robust to phase changes across clusters and allow a smooth tradeoff between peak beamforming gain and initial user discovery latency. There has also been progress in generalizing such directional constructions for multi-user MIMO transmissions [22, 23, 24, 25].

In this context, while legacy systems use as many radio frequency (RF) chains as the number of antennas, the higher cost, energy consumption, area and weight of RF chains at millimeter wave carrier frequencies have resulted in the popularity of hybrid beamforming systems [26, 27, 28, 29]. (An RF chain includes, but is not limited to, analog-to-digital converters (ADCs), digital-to-analog converters (DACs), mixers, low-noise amplifiers and power amplifiers (PAs).) A hybrid beamforming system uses a smaller number of RF chains than the number of antennas, with the one extreme case of a single RF chain being called the analog/RF beamforming system and the other extreme of as many RF chains as the number of antennas being called the digital beamforming system. Spatial sparsity of millimeter wave channels ensures that having as many RF chains as the number of dominant clusters in the channel is sufficient to reap the full array gain possible over these channels.

A number of recent works have addressed hybrid beamforming for millimeter wave systems. The problem of finding the optimal precoder and combiner with a hybrid architecture is posed as a sparse reconstruction problem in [17], leading to algorithms and solutions based on basis pursuit methods. While the solutions achieve good performance in certain cases, to address the performance gap between the solution proposed in [17] and the unconstrained beamformer structure, an iterative scheme is proposed in [30, 31] relying on a hierarchical training codebook for adaptive estimation of millimeter wave channels. The authors in [30, 31] show that a few iterations of the scheme are sufficient to achieve near-optimal performance. In [32], it is established that a hybrid architecture can approach the performance of a digital architecture as long as the number of RF chains is twice that of the data-streams. A heuristic algorithm with good performance is developed when this condition is not satisfied. A number of other works such as [33, 34, 35, 36] have also explored iterative/algorithmic solutions for hybrid beamforming.

A common theme that underlies most of these works is the assumption of phase-only control in the RF/analog domain for the hybrid beamforming architecture. This assumption makes sense at the user end with a smaller number of antennas (relative to the base-station end), where operating the PAs below their peak rating across RF chains can lead to substantially poorer uplink performance. On the other hand, amplitude control (denoted as amplitude tapering in the antenna theory literature) is necessary at the base-station end with a large number of antennas for side-lobe management and mitigating out-of-band emissions. Further, given that the base-station is a network resource, simultaneous amplitude and phase control of the individual antennas across RF chains is feasible at millimeter wave base-stations at a low complexity and cost [37, pp. 285-289][38, 39], since any calibration complexity can be seen as a one-time effort at the unit level for a large array and defrayed as a low network cost. In particular, the millimeter wave experimental prototype demonstrated in [40] allows simultaneous amplitude and phase control. Thus, it is important to consider a hybrid architecture with these constraints. Further, given the directional nature of the channel, a solution should both inherit a directional structure and provide an intuitive description of the beam weights. For example, a black box-type algorithmic solution that does not provide an intuitive description of the beam weights is less preferable than a solution that is constructed out of measurement reports obtained over an initial beam alignment phase with a directional structure for the sounding beams.

Main Contributions: With this backdrop, this work addresses these two fundamental issues in hybrid beamformer design. It is assumed that the base-station trains all the users in the cell with a cell-specific codebook of beamforming vectors over an initial beam alignment phase. Each user makes an estimate of the top-$P$ (where $P \geq 1$) beams over this phase and reports the beam indices to be used by the base-station as well as the measured/received signal-to-noise ratios (SNRs). (In a practical implementation such as the Third Generation Partnership Project New Radio (3GPP 5G-NR) design, is typically assumed both in terms of measurements and reporting [41]. The received SNR is estimated as the received power of a beamformed link (corresponding to the beam pair under consideration) using a certain reference symbol resource. This metric is typically known as the reference symbol received power (RSRP) of the link.) The simplest implementation at the base-station uses only the best beam information for beam steering or zeroforcing as in [23, 24], with other beams serving as fall back options.

In contrast to this approach, we propose to reconstruct or estimate a rank-$P$ approximation of the channel matrix between the base-station and the user (at the base-station end). To realize this reconstruction, we envision the additional feedback of the phase of the received signal estimate of the top-$P$ beams over the beam alignment phase and the cross-correlation information of the top-$P$ beams at the user end with the beam used for multi-user reception. With this novel construction, the base-station can remain agnostic of the user’s top-$P$ beams in precoder design. In terms of overhead, in 3GPP 5G-NR, these quantities can be fed back over the physical uplink control channel (PUCCH) with a Type-II feedback scheme [41, Sec. 8.2.1.6.3, pp. 24-26]; see Sec. V-C for a detailed study that demonstrates this feedback overhead to be marginal. Leveraging the rank-$P$ channel approximation, we propose the use of a zeroforcing structure that is then quantized to meet the RF precoding constraints (amplitude and phase control) at the base-station end for simultaneous transmissions.

To benchmark and compare the performance of the proposed scheme, we establish two upper bounds for the sum rate. This is a fundamentally difficult problem given the non-convex dependence of the sum rate on the beamforming vectors [42, 43, 44]. The first bound is based on an intuitive parsing and understanding of the zeroforcing structure. The second bound is based on an alternating optimization of the beamformer-combiner pair with the signal-to-leakage and noise ratio (SLNR) [45] and the signal-to-interference and noise ratio (SINR) as optimization metrics. Numerical studies show that the proposed scheme performs significantly better than a naïve beam steering solution even for an initial beam alignment codebook of poor resolution. Further, the proposed scheme is comparable with the established upper bounds provided the beam alignment codebook resolution is moderate-to-good. Thus, our work establishes the utility and efficacy of the proposed feedback techniques and opens up avenues for further investigation of such approaches in hybrid beamforming with millimeter wave systems.

Organization: This paper is organized as follows. Sec. II develops the system setup and explains the RF precoder architectural constraints adopted in this work. In Sec. III, we provide a background of the initial beam alignment phase and the feedback mechanism necessary for the multi-user beamforming envisioned in this work. Sec. IV generates two upper bounds on the sum rate to benchmark the performance of the proposed scheme. Sec. V performs a number of numerical studies to understand the performance of the proposed scheme relative to a naïve beam steering solution as well as to the upper bounds developed in Sec. IV. Concluding remarks are provided in Sec. VI.

Notations: Lower- and upper-case bold symbols are used to denote vectors and matrices, respectively. The -th entry of a vector and the -th entry of a matrix are denoted by and , respectively. The regular matrix transpose and complex conjugate Hermitian transpose operations of a matrix are denoted by and , respectively. The two-norm of a vector is denoted as with , and standing for the set of reals, complex numbers and the complex normal random variable, respectively.

## II System Setup

We consider a cellular downlink scenario with a single base-station serving potential users. The base-station and each user are assumed to be equipped with planar arrays of dimensions antennas and antennas, respectively. At both ends, the inter-antenna element spacing is where is the wavelength of propagation. With and , the base-station and each user are assumed to have and RF chains, respectively.

For the channel between the base-station and the -th user (where ), we assume an extended geometric propagation model over clusters/paths [46, 6]

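$$\mathbf{H}_k = \sqrt{\frac{N_r N_t}{L_k}} \sum_{\ell=1}^{L_k} \alpha_{k,\ell}\, \mathbf{u}_{k,\ell} \mathbf{v}_{k,\ell}^{\dagger}. \tag{1}$$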

In (1), , and denote the complex gain, the array steering vector at the user end corresponding to the angle of arrival (AoA) in azimuth/zenith, and the array steering vector at the base-station corresponding to the angle of departure (AoD) in azimuth/zenith, respectively. The cluster gains are assumed to be independent and identically distributed (i.i.d.) standard complex Gaussian random variables: . The normalization of the channel ensures that .
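As a rough illustration of (1), the sketch below draws one channel realization. It is only a sketch under simplifying assumptions that are not part of the model above: uniform linear arrays with half-wavelength spacing replace the planar arrays, the angular ranges are arbitrary placeholders, and the names `steering_vector` and `geometric_channel` are hypothetical.

```python
import cmath
import math
import random


def steering_vector(n_ant, angle_rad, spacing=0.5):
    """Unit-norm steering vector of a uniform linear array (spacing in wavelengths)."""
    phase_step = 2.0 * math.pi * spacing * math.sin(angle_rad)
    return [cmath.exp(1j * phase_step * i) / math.sqrt(n_ant) for i in range(n_ant)]


def geometric_channel(n_r, n_t, n_clusters, rng):
    """Draw H = sqrt(Nr*Nt/L) * sum_l alpha_l u_l v_l^H as an Nr x Nt nested list."""
    scale = math.sqrt(n_r * n_t / n_clusters)
    h = [[0j] * n_t for _ in range(n_r)]
    for _ in range(n_clusters):
        # Standard complex Gaussian gain: real and imaginary parts have variance 1/2.
        alpha = complex(rng.gauss(0.0, math.sqrt(0.5)), rng.gauss(0.0, math.sqrt(0.5)))
        # Assumed (placeholder) angular ranges for the AoA and AoD.
        u = steering_vector(n_r, rng.uniform(-math.pi / 2, math.pi / 2))
        v = steering_vector(n_t, rng.uniform(-math.pi / 3, math.pi / 3))
        for r in range(n_r):
            for t in range(n_t):
                h[r][t] += scale * alpha * u[r] * v[t].conjugate()
    return h


H = geometric_channel(n_r=4, n_t=16, n_clusters=3, rng=random.Random(0))
```

With a single cluster the resulting matrix has rank one, which is a quick sanity check on the outer-product structure.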

In terms of the system model, we focus on the narrowband aspects and assume that the base-station serves users simultaneously with data along RF chains. The base-station precodes data-streams for the -th user with the symbol vector using the digital/baseband precoder which is then up-converted to the carrier frequency by the use of the RF precoder . This results in the following system equation at the -th user

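$$\mathbf{y}_k = \sqrt{\frac{\rho}{K}}\, \mathbf{H}_k \mathbf{F}_{\mathsf{RF}} \cdot \left[ \sum_{m=1}^{K} \mathbf{F}_{\mathsf{Dig},m} \mathbf{s}_m \right] + \mathbf{n}_k \tag{2}$$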

where is the pre-precoding and is the white Gaussian noise vector added at the -th user. We assume that are i.i.d. complex Gaussian random vectors with and .

At the -th user, we assume that is processed (down-converted) with an user-specific RF combiner followed by a user-specific digital combiner to produce an estimate of as follows

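$$\hat{\mathbf{s}}_k = \mathbf{G}_{\mathsf{Dig},k}^{\dagger} \mathbf{G}_{\mathsf{RF},k}^{\dagger} \mathbf{y}_k \tag{3}$$

$$= \sqrt{\frac{\rho}{K}}\, \mathbf{G}_{\mathsf{Dig},k}^{\dagger} \mathbf{G}_{\mathsf{RF},k}^{\dagger} \mathbf{H}_k \mathbf{F}_{\mathsf{RF}} \mathbf{F}_{\mathsf{Dig},k} \mathbf{s}_k + \sqrt{\frac{\rho}{K}}\, \mathbf{G}_{\mathsf{Dig},k}^{\dagger} \mathbf{G}_{\mathsf{RF},k}^{\dagger} \mathbf{H}_k \mathbf{F}_{\mathsf{RF}} \sum_{m=1,\, m \neq k}^{K} \mathbf{F}_{\mathsf{Dig},m} \mathbf{s}_m + \mathbf{n}_k. \tag{4}$$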

The achievable rate (in nats/s/Hz) at the -th user when treating multi-user interference as noise is given as

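$$R_k = \log \det \left( \mathbf{I}_{r_k} + \frac{\rho}{K}\, \mathbf{G}_{\mathsf{Dig},k}^{\dagger} \mathbf{G}_{\mathsf{RF},k}^{\dagger} \mathbf{H}_k \mathbf{F}_{\mathsf{RF}} \mathbf{F}_{\mathsf{Dig},k} \mathbf{F}_{\mathsf{Dig},k}^{\dagger} \mathbf{F}_{\mathsf{RF}}^{\dagger} \mathbf{H}_k^{\dagger} \mathbf{G}_{\mathsf{RF},k} \mathbf{G}_{\mathsf{Dig},k} \cdot \boldsymbol{\Sigma}_{\mathsf{intf}}^{-1} \right) \tag{5}$$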

where denotes the interference and noise covariance matrix

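$$\boldsymbol{\Sigma}_{\mathsf{intf}} = \mathbf{I}_{r_k} + \frac{\rho}{K}\, \mathbf{G}_{\mathsf{Dig},k}^{\dagger} \mathbf{G}_{\mathsf{RF},k}^{\dagger} \mathbf{H}_k \mathbf{F}_{\mathsf{RF}} \left( \sum_{m \neq k} \mathbf{F}_{\mathsf{Dig},m} \mathbf{F}_{\mathsf{Dig},m}^{\dagger} \right) \mathbf{F}_{\mathsf{RF}}^{\dagger} \mathbf{H}_k^{\dagger} \mathbf{G}_{\mathsf{RF},k} \mathbf{G}_{\mathsf{Dig},k}. \tag{6}$$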

The traditional use of finite-rate feedback has been to convey the index of a precoder matrix from an appropriately-designed codebook of precoders to assist with adaptive transmissions to improve  [47, 48]. More generally, feedback from users can also be used to aid in scheduling, channel estimation and advanced/non-codebook based precoder design. In this work, as we will see later in Sec. III, we assume that each user feeds back its top beam indices, an estimate of the received and signal phase, and cross-correlation of the top receive beams to assist with the design of a non-codebook based multi-user precoder structure. In terms of precoder constraints, we make the assumption that .

For the RF precoder, we assume that the amplitude and phase of each entry in are controlled by a finite precision gain controller and phase shifter, respectively. In other words, the amplitude and phase come from a set of and quantization levels

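$$|\mathbf{F}_{\mathsf{RF}}(i,j)| \in \left\{ A_1, \cdots, A_{2^{B_{\mathsf{amp}}}} \right\}, \qquad \angle \mathbf{F}_{\mathsf{RF}}(i,j) \in \left\{ \phi_1, \cdots, \phi_{2^{B_{\mathsf{phase}}}} \right\}, \tag{7}$$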

where . Prior work on hybrid beamforming such as [17, 30, 31, 32] etc., assume that the RF precoder can only be controlled by a phase shifter. However, such constraining assumptions are not reflective of practical implementations [38, 39, 40], where an independent gain controller can be used in every RF chain for every antenna. With these structural constraints on the precoder, the transmit power constraint is captured by

 (8)

We are interested in the design of RF and digital precoders with the sum rate, , being the metric to maximize. In general, we only need the constraints and . However, the considered sum rate optimization with such an assumption is quite complicated. To overcome this complexity, we consider a simple use-case in this work.

## III Multi-User Beamformer Design

We are interested in the practically-motivated setting where each user is equipped with only one RF chain and the base-station transmits one data-stream to each user that is simultaneously scheduled. In this scenario, (for all ) and . The system decoding model in (2) and (4) reduce to

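$$\hat{s}_k = \mathbf{G}_{\mathsf{Dig},k}^{\dagger} \mathbf{G}_{\mathsf{RF},k}^{\dagger} \mathbf{y}_k \tag{9}$$

$$= \sqrt{\frac{\rho}{K}} \cdot \mathbf{g}_k^{\dagger} \mathbf{H}_k \left[ \mathbf{f}_1 s_1, \cdots, \mathbf{f}_K s_K \right] + \mathbf{g}_k^{\dagger} \mathbf{n}_k \tag{10}$$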

where and , and the second equation follows assuming and . (A simple realization of the hybrid precoding architecture is achieved by setting and the desired for the $k$-th user is set as the $k$-th column of . The desired is such that and meets the quantization constraints in (7). In a practical implementation, could be primarily used for sub-band precoding and in the narrowband context of this work, would reflect such an implementation-driven model.) The power constraint is equivalent to and $R_k$ reduces to

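$$R_k = \log \left( 1 + \frac{\frac{\rho}{K} \cdot |\mathbf{g}_k^{\dagger} \mathbf{H}_k \mathbf{f}_k|^2}{1 + \frac{\rho}{K} \cdot \sum_{m \neq k} |\mathbf{g}_k^{\dagger} \mathbf{H}_k \mathbf{f}_m|^2} \right). \tag{11}$$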
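To make (11) concrete, the following minimal sketch evaluates the per-user rate for given beams and a given combiner. The helper names (`effective_gain`, `user_rate`) and the toy channel are illustrative assumptions, and vectors and matrices are stored as plain Python lists.

```python
import math


def effective_gain(g, H, f):
    """Return |g^H H f|^2 for combiner g, channel H (nested list) and beam f."""
    total = 0j
    for r, row in enumerate(H):
        total += g[r].conjugate() * sum(h_rt * f_t for h_rt, f_t in zip(row, f))
    return abs(total) ** 2


def user_rate(k, g_k, H_k, beams, rho):
    """Rate in nats/s/Hz of user k as in (11); beams[m] is the beam of user m."""
    K = len(beams)
    signal = (rho / K) * effective_gain(g_k, H_k, beams[k])
    interference = sum((rho / K) * effective_gain(g_k, H_k, beams[m])
                       for m in range(K) if m != k)
    return math.log(1.0 + signal / (1.0 + interference))


# Toy example: one receive antenna, two transmit antennas, orthogonal beams.
H_0 = [[1.0 + 0j, 0j]]
orthogonal_beams = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
rate_clean = user_rate(0, [1.0 + 0j], H_0, orthogonal_beams, rho=10.0)

# Same channel, but the second user's beam now points at the first user.
colliding_beams = [[1.0 + 0j, 0j], [1.0 + 0j, 0j]]
rate_interfered = user_rate(0, [1.0 + 0j], H_0, colliding_beams, rho=10.0)
```

In the toy example the interference-free rate is $\log(1 + \rho/K)$, and the colliding beam reduces it because the interference term enters the denominator of (11).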

The focus of this section is to first develop an advanced feedback mechanism and a systematic design of the multi-user beamforming structure based on a directional representation of the channel. This structure allows the base-station to combat multi-user interference in simultaneous transmissions.

### III-A Initial Beam Alignment

Enabling multi-user transmissions in practice is critically dependent on an initial beam acquisition process (commonly known as the beam alignment phase). In a practical implementation such as 3GPP 5G-NR, beam alignment corresponds to a beam sweep over a block of secondary synchronization (SS) signals transmitted over multiple ports/RF chains. The use of multiple directional beams over multiple ports results in a composite beam pattern at the base-station end (as seen from the user side). The composite pattern can lead to uncertainty in the direction of the strongest path between the base-station and the user. This directional ambiguity is subsequently resolved with a beam refinement over the individual constituent beams that make the composite beam on separate resource elements. Beam refinement allows identification and ambiguity resolution of the constituent beams.

Such a “post directional ambiguity resolved” beam alignment process is modeled by assuming that the base-station is equipped with an element codebook

$$\mathcal{F}_{\mathsf{tr}} = \left\{ \mathbf{f}_{\mathsf{tr},1}, \ldots, \mathbf{f}_{\mathsf{tr},N} \right\}, \tag{12}$$

and the -th user is equipped with an element user-specific codebook

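$$\mathcal{G}^{k}_{\mathsf{tr}} = \left\{ \mathbf{g}^{(k)}_{\mathsf{tr},1}, \ldots, \mathbf{g}^{(k)}_{\mathsf{tr},M} \right\}. \tag{13}$$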

A typical design methodology for is a hierarchical design with different sets of beams that trade-off peak array gain at the cost of initial beam acquisition latency. For example, at least from the 3GPP 5G-NR perspective, the designs of and are intended to be implementation-specific at the base-station and user ends, respectively. Nevertheless, overarching design guidelines for beam broadening are provided in [14, 19, 49, 50]. In particular, a broadened beam can be generated by an optimal co-phasing of a number of array steering vectors in appropriately chosen directions. Both the number of such vectors as well as their steering directions can be optimized to produce a broadened beam. It must also be pointed out that most of the beam broadening works have some variations in terms of design principles and these variations themselves do not affect the flavor of results reported in this paper.

In the beam alignment phase, the top- beam indices at the base-station and each user that maximize an estimate of the received are learned. In particular, the received corresponding to the -th beam index pair at the -th user is given as

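$$\mathsf{SNR}^{(k)}_{\mathsf{rx}}(m,n) = \left| \left( \mathbf{g}^{(k)}_{\mathsf{tr},m} \right)^{\dagger} \mathbf{H}_k \mathbf{f}_{\mathsf{tr},n} \right|^2. \tag{14}$$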

Let the beam pair indices at the -th user be arranged in non-increasing order of the received and let the top- beam pair indices be denoted as

 (15)

With the simplified notation of

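$$\mathsf{SNR}^{(k)}_{\mathsf{rx},\ell} \triangleq \mathsf{SNR}^{(k)}_{\mathsf{rx}}\left( m^k_\ell, n^k_\ell \right), \quad \ell = 1, \cdots, P, \tag{16}$$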

we have . With the initial beam alignment methodology as described above, we now leverage the top- beam information learned at the -th user to estimate the channel matrix and to design at the base-station end.
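The beam-pair ranking in (14)-(16) can be sketched as an exhaustive sweep over both codebooks followed by a sort. This is a minimal noise-free sketch; the function names are hypothetical, beam indices are zero-based here, and a practical sweep would use noisy RSRP-type measurements rather than the exact channel.

```python
def received_snr(g, H, f):
    """Return |g^H H f|^2 as in (14)."""
    total = 0j
    for r, row in enumerate(H):
        total += g[r].conjugate() * sum(h_rt * f_t for h_rt, f_t in zip(row, f))
    return abs(total) ** 2


def top_beam_pairs(H, user_codebook, bs_codebook, P):
    """Return the P triples (m, n, snr) with the largest SNR, in non-increasing order."""
    table = [(received_snr(g, H, f), m, n)
             for m, g in enumerate(user_codebook)
             for n, f in enumerate(bs_codebook)]
    table.sort(key=lambda entry: entry[0], reverse=True)
    return [(m, n, snr) for snr, m, n in table[:P]]


# Toy example: 2 x 2 diagonal channel, both codebooks hold the unit vectors.
H_toy = [[2.0 + 0j, 0j], [0j, 1.0 + 0j]]
unit_beams = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
best_pairs = top_beam_pairs(H_toy, unit_beams, unit_beams, P=2)
```

The sweep costs one measurement per beam pair, which is the acquisition latency that the codebook sizes trade off against array gain.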

### III-B Channel Reconstruction and Beamformer Design

A typical use of the feedback information at the base-station is to select the top/best beam indices for all the users and to leverage this information to construct a multi-user transmission scheme. Such an approach is adopted in [24], where multi-user beam designs leveraging only the top beam pair index, , and intended to serve different objectives are proposed: i) greedily (from each user’s perspective) steering a beam to the best direction for that user (called the beam steering scheme), ii) using the information collated from different users to combat interference to other simultaneously scheduled users via a zeroforcing solution (called the zeroforcing scheme), and iii) for leveraging both the beam steering and interference management objectives via a generalized eigenvector optimization (called the generalized eigenvector scheme). If the beam pair is blocked or fades, the -th user requests the base-station to switch to the beam index and it switches to the beam with index (and so on) [10].

In this work, we propose to generalize the structures in [24] by leveraging all the top- beam pair indices fed back from each user. In this direction, the base-station intends to reconstruct or estimate a rank- approximation of (a scaled version of) the channel matrix corresponding to the -th user as follows

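$$\hat{\mathbf{H}}_k = \sum_{\ell=1}^{P} \hat{\alpha}_{k,\ell}\, \hat{\mathbf{u}}_{k,\ell} \hat{\mathbf{v}}_{k,\ell}^{\dagger}, \tag{17}$$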

where and are defined as estimates of the array steering vectors and , respectively. Given the channel model structure in (1), (17) is simplified by estimating and by and , respectively, where

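$$\gamma_{k,\ell} \triangleq \sqrt{ \mathcal{Q}_{B_{\mathsf{SNR}}} \left( \mathsf{SNR}^{(k)}_{\mathsf{rx},\ell} \right) } \tag{18}$$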

for some choice of $B_{\mathsf{SNR}}$. In the above description, $\mathcal{Q}_B(\cdot)$ denotes an appropriately-defined $B$-bit quantization operation of the quantity under consideration. (A $B$-bit quantization operation is precisely specified if $2^B$ disjoint intervals that exactly and entirely span the range of the quantity and a representative/quantized value from each interval are specified.) However, estimating $\hat{\mathbf{H}}_k$ as in (17) is not complete until we have an estimate for and . The quantity can be estimated by the user with the same reference symbol resource (or pilot symbol) transmitted during the beam training phase with no additional training overhead. Therefore, we define as the -bit quantization of the phase of an estimate of the pilot symbol

 (19)

for some choice of . The noise term captures the additive noise in the initial beam alignment process corresponding to the top- beam pairs.

For , we note that the base-station not only needs the beam indices that are useful for the user side, but also the useful part of the user’s codebook () since the base-station is typically unaware of it. To avoid this unnecessary complexity and feedback given the proprietary structure of , we assume that the -th user uses a multi-user reception beam . In the simplest manifestation, could be the best training beam learned in the beam alignment phase, . However, a more sophisticated choice for is not precluded. For example, an iterative choice that maximizes the (instead of the ) could be considered for .

We then note that the estimated , defined as,

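$$\widehat{\mathsf{SINR}}_k \triangleq \frac{\frac{\rho}{K} \cdot |\mathbf{g}_k^{\dagger} \hat{\mathbf{H}}_k \mathbf{f}_k|^2}{1 + \frac{\rho}{K} \cdot \sum_{m \neq k} |\mathbf{g}_k^{\dagger} \hat{\mathbf{H}}_k \mathbf{f}_m|^2} \tag{20}$$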

is only dependent on in the form of . Building on this fact, each user generates , defined as,

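$$\beta_{k,\ell} \triangleq \mathbf{g}_k^{\dagger} \hat{\mathbf{u}}_{k,\ell} \quad \text{where} \quad \hat{\mathbf{u}}_{k,\ell} = \mathbf{g}^{(k)}_{\mathsf{tr},\, m^k_\ell}. \tag{21}$$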

It then quantizes the amplitude and phase of for some choice of and and feeds them back

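$$\mu_{k,\ell} \triangleq \mathcal{Q}_{B_{\mathsf{corr,amp}}} \left( |\beta_{k,\ell}| \right), \qquad \nu_{k,\ell} \triangleq \mathcal{Q}_{B_{\mathsf{corr,phase}}} \left( \angle \beta_{k,\ell} \right). \tag{22}$$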

For both and , without loss in generality, relative phases with respect to and (that is, and ) can be reported.

The mappings between the quantities of interest and the approximated quantities as well as the feedback overhead needed from each user to implement the proposed scheme are described in Table I. While the feedback overhead increases linearly with (the rank of the channel approximation), there are diminishing returns in terms of channel representation accuracy since the clusters captured in are sub-dominant as increases (and are eventually limited by ). Thus, it is useful to select to trade-off these two conflicting objectives.

Following the above discussion, the -th user feeds back the matrix , defined as

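$$\mathbf{P}_k \triangleq \begin{bmatrix}
n^k_1 & \gamma_{k,1} & 0 & \mu_{k,1} & 0 \\
n^k_2 & \gamma_{k,2} & \varphi_{k,2} - \varphi_{k,1} & \mu_{k,2} & \nu_{k,2} - \nu_{k,1} \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
n^k_P & \gamma_{k,P} & \varphi_{k,P} - \varphi_{k,1} & \mu_{k,P} & \nu_{k,P} - \nu_{k,1}
\end{bmatrix}, \tag{23}$$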

and the base-station approximates as follows

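$$\mathbf{g}_k^{\dagger} \hat{\mathbf{H}}_k = \sum_{\ell=1}^{P} \mu_{k,\ell}\, \gamma_{k,\ell} \cdot e^{j(\varphi_{k,\ell} + \nu_{k,\ell})} \cdot \left( \mathbf{f}_{\mathsf{tr},\, n^k_\ell} \right)^{\dagger}. \tag{24}$$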

In other words, is represented as a linear combination of the top- beams as estimated from in the initial beam alignment phase. The weights in this linear combination correspond to the relative strengths of the clusters as distinguished by the codebook resolution (at both ends).
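The reconstruction in (24) amounts to a weighted sum of conjugated training beams. A minimal sketch, assuming each feedback row is stored as a tuple `(n, gamma, phi, mu, nu)` following the column order of (23), with zero-based beam indices and phases already expressed relative to the first row (the function name is hypothetical):

```python
import cmath
import math


def effective_channel_row(feedback, bs_codebook):
    """Rebuild the row vector g_k^H H_hat_k from rows (n, gamma, phi, mu, nu) as in (24)."""
    n_t = len(bs_codebook[0])
    row = [0j] * n_t
    for n, gamma, phi, mu, nu in feedback:
        weight = mu * gamma * cmath.exp(1j * (phi + nu))
        for t in range(n_t):
            # (f_tr,n)^H contributes the conjugate of each beam entry.
            row[t] += weight * bs_codebook[n][t].conjugate()
    return row


# Toy example with P = 2 and a codebook made of the two unit vectors.
codebook = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
feedback_k = [(0, 2.0, 0.0, 1.0, 0.0), (1, 1.0, math.pi / 2, 0.5, 0.0)]
row_k = effective_channel_row(feedback_k, codebook)
```

Only the base-station codebook appears in the computation, which reflects the point that the base-station does not need to know the user-side beams.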

The base-station uses the channel matrix constructed for each user based on its feedback information () and generates a good beamformer structure, illustrated in the next result, for use in multi-user transmissions.

###### Proposition 1.

The zeroforcing beamformer structure is one where for every user that is simultaneously scheduled, the beam nulls the multi-user interference in with as given in (20). The beams in the zeroforcing structure are the unit-norm column vectors of the matrix , where is the matrix given as

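$$\mathcal{H} = \begin{bmatrix}
\mathbf{g}_1^{\dagger} \hat{\mathbf{H}}_1 \\
\mathbf{g}_2^{\dagger} \hat{\mathbf{H}}_2 \\
\vdots \\
\mathbf{g}_K^{\dagger} \hat{\mathbf{H}}_K
\end{bmatrix}
= \begin{bmatrix}
\sum_{\ell=1}^{P} \mu_{1,\ell}\, \gamma_{1,\ell} \cdot e^{j(\varphi_{1,\ell} + \nu_{1,\ell})} \cdot \left( \mathbf{f}_{\mathsf{tr},\, n^1_\ell} \right)^{\dagger} \\
\sum_{\ell=1}^{P} \mu_{2,\ell}\, \gamma_{2,\ell} \cdot e^{j(\varphi_{2,\ell} + \nu_{2,\ell})} \cdot \left( \mathbf{f}_{\mathsf{tr},\, n^2_\ell} \right)^{\dagger} \\
\vdots \\
\sum_{\ell=1}^{P} \mu_{K,\ell}\, \gamma_{K,\ell} \cdot e^{j(\varphi_{K,\ell} + \nu_{K,\ell})} \cdot \left( \mathbf{f}_{\mathsf{tr},\, n^K_\ell} \right)^{\dagger}
\end{bmatrix}. \tag{25}$$
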
###### Proof.

See Appendices -A and -B. ∎
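A minimal sketch of the zeroforcing beams follows. It assumes that the beams are the unit-norm columns of the right pseudo-inverse $\mathcal{H}^{\dagger} (\mathcal{H} \mathcal{H}^{\dagger})^{-1}$, which is the standard zeroforcing construction; this matrix expression is an assumption of the sketch and should be checked against the statement of Prop. 1. The rows of $\mathcal{H}$ are assumed linearly independent, and the small linear solver is included only to keep the example self-contained.

```python
import math


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting (complex entries)."""
    n = len(A)
    aug = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [x / piv for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def zeroforcing_beams(Hc):
    """Unit-norm columns of Hc^H (Hc Hc^H)^{-1}; Hc is K x Nt with rows g_k^H H_hat_k."""
    K, n_t = len(Hc), len(Hc[0])
    gram = [[sum(Hc[i][t] * Hc[j][t].conjugate() for t in range(n_t))
             for j in range(K)] for i in range(K)]
    identity = [[1.0 + 0j if i == j else 0j for j in range(K)] for i in range(K)]
    gram_inv = solve(gram, identity)
    beams = []
    for k in range(K):
        f = [sum(Hc[m][t].conjugate() * gram_inv[m][k] for m in range(K))
             for t in range(n_t)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in f))
        beams.append([x / norm for x in f])
    return beams


# Toy example: K = 2 users, 3 transmit antennas.
Hc = [[1.0 + 0j, 1j, 0j], [1.0 + 0j, 0j, 1.0 + 0j]]
zf_beams = zeroforcing_beams(Hc)
```

Each resulting beam is orthogonal to the reconstructed effective channels of the other users, so the interference term in (20) vanishes for the estimated channel, although not necessarily for the true one.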

## IV Upper Bounds for $R_{\mathsf{sum}}$

We are interested in benchmarking the performance of the zeroforcing structure against an upper bound on . The goal of optimizing over with perfect channel state information is a non-convex optimization problem [42, 43, 44] that appears to be complicated. In this context, an alternate formulation based on the signal-to-leakage and noise ratio metric [45] that simultaneously maximizes the array gain seen by the -th user, , and minimizes the interfering array gain seen by the other users, is relevant. Since these objectives are in some sense conflicting and can be weighed differently, we consider the composite metric

 (26)

for an appropriate set of weighting factors with .

### IV-A Upper Bound Motivated by the Zeroforcing Structure

Building on Prop. 1, we now develop an upper bound for motivated by the zeroforcing structure. In this direction, we consider a signal-to-leakage-type metric equivalent of (26) based on the estimated channel matrix

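$$\widehat{\mathsf{SLNR}}_k \triangleq \frac{\eta_{k,k}\, |\mathbf{g}_k^{\dagger} \hat{\mathbf{H}}_k \mathbf{f}_k|^2}{1 + \sum_{m \neq k} \eta_{m,k}\, |\mathbf{g}_m^{\dagger} \hat{\mathbf{H}}_m \mathbf{f}_k|^2} \tag{27}$$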

for an appropriate set of weighting factors with .

###### Proposition 2.

Assuming that and are known at the base-station, the choice of that maximizes is given by the generalized eigenvector structure

 (28)
###### Proof.

See Appendix -C. ∎

Several remarks are in order at this stage.

• In the case where are set to zero for all (that is, the focus is not on interference management), the solution in (28) reduces to

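$$\mathbf{f}_k = \frac{\hat{\mathbf{H}}_k^{\dagger} \mathbf{g}_k}{\| \hat{\mathbf{H}}_k^{\dagger} \mathbf{g}_k \|} = \frac{\sum_{\ell=1}^{P} \mu_{k,\ell}\, \gamma_{k,\ell} \cdot e^{-j(\varphi_{k,\ell} + \nu_{k,\ell})} \cdot \mathbf{f}_{\mathsf{tr},\, n^k_\ell}}{\left\| \sum_{\ell=1}^{P} \mu_{k,\ell}\, \gamma_{k,\ell} \cdot e^{-j(\varphi_{k,\ell} + \nu_{k,\ell})} \cdot \mathbf{f}_{\mathsf{tr},\, n^k_\ell} \right\|}. \tag{29}$$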

This is not surprising, and the base-station greedily steers a beam along the weighted set of top- beams from for the -th user. In other words, the base-station generates a set of transmit weights that are matched to the transmit angular spread of the channel as identified by the resolution of .

• In the case where except if or (for a specific ), it can be seen that reduces to

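$$\mathbf{f}_k = \frac{\hat{\mathbf{H}}_k^{\dagger} \mathbf{g}_k - \eta_{m',k} \cdot \left( \mathbf{g}_{m'}^{\dagger} \hat{\mathbf{H}}_{m'} \hat{\mathbf{H}}_k^{\dagger} \mathbf{g}_k \right) \cdot \hat{\mathbf{H}}_{m'}^{\dagger} \mathbf{g}_{m'}}{\left\| \hat{\mathbf{H}}_k^{\dagger} \mathbf{g}_k - \eta_{m',k} \cdot \left( \mathbf{g}_{m'}^{\dagger} \hat{\mathbf{H}}_{m'} \hat{\mathbf{H}}_k^{\dagger} \mathbf{g}_k \right) \cdot \hat{\mathbf{H}}_{m'}^{\dagger} \mathbf{g}_{m'} \right\|}. \tag{30}$$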

In other words, the specific design of in (30) removes a certain component of the beam corresponding to the -th user from the beam corresponding to the -th user.

• In the general case, while it gets much harder to simplify in (28), it can be seen that has the structure

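$$\mathbf{f}_k = \frac{\sum_{m=1}^{K} \hat{\delta}_{m,k}\, \hat{\mathbf{H}}_m^{\dagger} \mathbf{g}_m}{\left\| \sum_{m=1}^{K} \hat{\delta}_{m,k}\, \hat{\mathbf{H}}_m^{\dagger} \mathbf{g}_m \right\|} \tag{31}$$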

for some complex scalars . In other words, the optimal is in the span of with the weights that make the linear combination being a complicated function of as well as .

• The above observations are not entirely surprising given the Karhunen-Loève interpretation of the eigen-space of the channel(s) [51, 52, 11] and utilizing an expansion of on this basis. Such an expansion is also consistent with Prop. 1 which shows that in the pure interference management case ( for all ), is given as

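$$\mathbf{f}_k = \frac{\sum_{m=1}^{K} G_{m,k}\, \hat{\mathbf{H}}_m^{\dagger} \mathbf{g}_m}{\left\| \sum_{m=1}^{K} G_{m,k}\, \hat{\mathbf{H}}_m^{\dagger} \mathbf{g}_m \right\|} \tag{32}$$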

where the matrix .

• On the other hand, from (24), we note that is itself a linear combination of the beams from . Thus, in (28) is a linear combination of beams from . In other words, the design of is equivalent to a search over scalar (complex) weights, where denotes the size of the initial beam alignment codebook at the base-station end.

With this interpretation, while Prop. 2 considers only the maximization of (not even the sum rate with ), we can consider the optimization of over from a class , defined as

 (33)
###### Theorem 1.

Assume that the same multi-user beams as in the zeroforcing scheme are used for reception at the -th user. Let be defined as the solution to the search over the complex scalars

 (34)

With as above and

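$$\mathbf{f}_k = \frac{\sum_{n=1}^{N} \delta^{\star}_{n,k}\, \mathbf{f}_{\mathsf{tr},n}}{\left\| \sum_{n=1}^{N} \delta^{\star}_{n,k}\, \mathbf{f}_{\mathsf{tr},n} \right\|}, \tag{35}$$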

we obtain an upper bound to the sum rate with the zeroforcing scheme. ∎

The proof is trivial following the structure of in the zeroforcing scheme in (32) and the definition of the class in (33). Since the structure in (35) is obtained as a search over scalar parameters, we call this upper bound a scalar optimization-based upper bound. Further, while (35) is difficult to practically implement, it provides a benchmark to compare the realizable zeroforcing scheme of Prop. 1.

Another important consequence of (35) is that the coefficients of for either the zeroforcing or the upper bound are (in general) not of equal amplitude. Thus, has to be quantized for implementation to ensure that the RF beamforming constraints are satisfied. In particular, we compute with an appropriate quantization scheme as below

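$$|\hat{\mathbf{f}}_k(i)| = \widetilde{\mathcal{Q}}_{B_{\mathsf{amp}}} \left( |\mathbf{f}_k(i)| \right), \qquad \angle \hat{\mathbf{f}}_k(i) = \widetilde{\mathcal{Q}}_{B_{\mathsf{phase}}} \left( \angle \mathbf{f}_k(i) \right), \tag{36}$$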

and use them in transmissions for the -th user. Good choices for will be discussed in Sec. V-C.
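One possible entry-wise quantizer in the spirit of (36) is sketched below. It assumes uniformly spaced amplitude levels scaled to the largest entry and uniformly spaced phase levels over $[0, 2\pi)$, followed by a common rescaling to unit norm; these choices and the name `quantize_beam` are assumptions made for illustration only and are not the quantizers studied in Sec. V-C.

```python
import cmath
import math


def quantize_beam(f, b_amp, b_phase):
    """Quantize each entry of f to 2**b_amp amplitude and 2**b_phase phase levels."""
    n_amp, n_phase = 2 ** b_amp, 2 ** b_phase
    a_max = max(abs(x) for x in f)
    phase_step = 2.0 * math.pi / n_phase
    quantized = []
    for x in f:
        # Amplitude levels a_max * {1/n_amp, 2/n_amp, ..., 1}; pick the nearest one.
        level = min(n_amp, max(1, round(abs(x) / a_max * n_amp)))
        amplitude = a_max * level / n_amp
        # Phase levels {0, step, ..., (n_phase - 1) * step}; pick the nearest one.
        phase = phase_step * (round(cmath.phase(x) / phase_step) % n_phase)
        quantized.append(cmath.rect(amplitude, phase))
    # Common rescaling so that the quantized beam has unit norm.
    norm = math.sqrt(sum(abs(x) ** 2 for x in quantized))
    return [x / norm for x in quantized]


q_two_level = quantize_beam([1.0 + 0j, 0.5j], b_amp=1, b_phase=2)
q_phase_only = quantize_beam([1.0 + 0j, 0.3j], b_amp=0, b_phase=2)
```

Setting `b_amp=0` leaves a single amplitude level, which recovers the phase-only control assumed in much of the prior work on hybrid beamforming.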

### IV-B Bounding $R_{\mathsf{sum}}$ with an Alternating/Iterative Optimization

We now propose an iterative maximization algorithm to optimize over . In this approach, we first optimize the metric over (assuming is fixed), and then optimize the metric over (assuming is fixed). The algorithm is as follows:

1. Initialize randomly.

2. For , where is chosen according to a stopping criterion to determine convergence:

• With fixed, compute as the solution to the following optimization

$$\mathbf{f}^{(i)}_k = \arg\max_{\mathbf{f}_k} \max_{\{\eta_{m,k}\}} \mathsf{SLNR}_k. \tag{37}$$

From Lemma 1 in Appendix -A, the solution to the above problem with fixed can be seen to be

 (38)

This candidate has to be used to compute for all possible weights and optimized to produce .

• With fixed, compute as the solution to the following optimization

$$\mathbf{g}^{(i+1)}_k = \arg\max_{\mathbf{g}_k} \mathsf{SINR}_k. \tag{39}$$

Again, from Lemma 1 in Appendix -A, we have

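$$\mathbf{g}^{(i+1)}_k = \left( \mathbf{I}_{N_r} + \frac{\rho}{K} \sum_{m \neq k} \mathbf{H}_k \mathbf{f}^{(i)}_m \mathbf{f}^{(i),\dagger}_m \mathbf{H}_k^{\dagger} \right)^{-1} \mathbf{H}_k \mathbf{f}^{(i)}_k. \tag{40}$$
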
3. Compute with and for a (potential) upper bound.

Numerical studies show that for almost all channel realizations, the proposed algorithm converges in a small number of steps () to lead to a tolerable level of difference between successive iterates of . Further, while we are unable to theoretically establish that the proposed algorithm results in an upper bound to , numerical studies (see Sec. V-D) suggest that it leads to an upper bound for almost all channel realizations.
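The combiner update in (40) can be illustrated for the special case of $K = 2$ users, where the matrix inverse has a closed form through the Sherman-Morrison identity. The sketch below covers only this combiner step; the restriction to two users, the helper names, and the SINR helper (which normalizes the noise power by $\|\mathbf{g}\|^2$ because the combiner in (40) is not unit-norm) are assumptions of the example. A general $K$ would require a linear solve instead.

```python
def mat_vec(H, f):
    """Return the matrix-vector product H f."""
    return [sum(h * x for h, x in zip(row, f)) for row in H]


def inner(a, b):
    """Return the inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def combiner_update_two_users(H_k, f_k, f_other, rho):
    """Evaluate (40) for K = 2 via (I + c a a^H)^{-1} b = b - c a (a^H b) / (1 + c a^H a)."""
    c = rho / 2.0
    b = mat_vec(H_k, f_k)        # desired-signal direction H_k f_k
    a = mat_vec(H_k, f_other)    # interference direction H_k f_m
    shrink = c * inner(a, b) / (1.0 + c * inner(a, a).real)
    return [b_i - shrink * a_i for b_i, a_i in zip(b, a)]


def sinr_two_users(g, H_k, f_k, f_other, rho):
    """SINR of user k for a combiner g that need not be unit-norm."""
    c = rho / 2.0
    signal = c * abs(inner(g, mat_vec(H_k, f_k))) ** 2
    interference = c * abs(inner(g, mat_vec(H_k, f_other))) ** 2
    return signal / (inner(g, g).real + interference)


# Toy example: identity channel, partially overlapping transmit beams.
H_id = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
f_own = [1.0 + 0j, 0j]
f_int = [2 ** -0.5 + 0j, 2 ** -0.5 + 0j]
g_new = combiner_update_two_users(H_id, f_own, f_int, rho=10.0)
sinr_new = sinr_two_users(g_new, H_id, f_own, f_int, rho=10.0)
sinr_matched = sinr_two_users(mat_vec(H_id, f_own), H_id, f_own, f_int, rho=10.0)
```

In this toy case the updated combiner trades a little desired-signal gain for a large reduction in interference, so its SINR exceeds that of the matched combiner $\mathbf{H}_k \mathbf{f}_k$.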

## V Numerical Studies

We now present numerical studies in a single-cell downlink framework to illustrate the advantages of the proposed beamforming solutions. The channel model from (1) is used to generate a channel matrix with clusters, AoDs uniformly distributed in a coverage area, and AoAs uniformly distributed in a coverage area for each of the users in the cell. The AoD spread captures a traditional three-sector approach with a zenith coverage and the AoA spread corresponds to the assumption of the use of multiple subarrays [9] with the best subarray limited to a coverage. is justified from millimeter wave channel measurements reported in [9, 12]. The antenna dimensions assumed in these studies are and at the base-station end, and and at each user. We consider simultaneous transmissions from the base-station to out of the users in the cell.

In terms of user scheduling, commonly used criteria include a round-robin or a proportional fair scheduler. On the other hand, a recently proposed directional scheduler [24] leverages the smaller beamwidths afforded by large antenna dimensions to schedule users with dominant clusters that are spatially well-separated. In this work, the first of the users is scheduled randomly and the second user is chosen to ensure that . In other words, the considered scheduler implements a directional avoidance protocol with the dominant cluster in the channel of the first user separated spatially from the dominant cluster in the channel of the second user, as parsed by . With this scheduler, we now primarily focus on the beamforming aspects.

For the initial beam alignment codebooks, based on the beam broadening principles proposed in [19], Figs. 1(a)-(d) illustrate the beam patterns in the azimuth plane for codebooks of sizes , , and , respectively, to cover the AoD space with a planar array at the base-station side. The optimization proposed in [19] results in a discrete Fourier transform (DFT) codebook solution for and . From Fig. 1, we observe that a beam codebook of small size (e.g., ) where each beam offers a broad directional coverage can reduce the acquisition latency at the cost of peak and/or worst-case array gain. On the other hand, a beam codebook of large size (e.g., ) where each beam can offer precision in terms of beamspace (and array gain) comes at the cost of acquisition latency. For the codebooks at the user end, two codebook sizes ( for a reduced acquisition latency and for performance improvement at the cost of acquisition latency) are considered with similar beam design principles as for the base-station side.

At this stage, it is worth noting that a number of system parameters impact the performance of the proposed multi-user schemes such as: i) Granularity of and (initial beam alignment codebook sizes), ii) Coarseness of channel approximation (rank-), iii) Finite-rate feedback of channel reconstruction parameters, and iv) Quantization of the resulting multi-user beams.

### V-A Impact of Initial Beam Alignment Codebook

In the first study, we consider the performance of the zeroforcing scheme (proposed in Prop. 1) relative to a baseline beam steering scheme with different initial beam alignment codebooks. We assume that the system has infinite-precision feedback of channel reconstruction parameters and infinite-precision resolution in the quantization of multi-user beams. We also compare the performance of the proposed schemes with the zeroforcing scheme presented in [23, 24], where the system is assumed to be able to find perfectly aligned directional beams in the training phase. Fig. 2 illustrates this comparative performance with different choices of in approximating and different codebook sizes ( and ).

While it is intuitive that there should be diminishing returns in performance as increases (since increasing beyond the channel rank is not expected to improve performance), whether this saturation is observed with a low-rank channel approximation depends on the resolution of the codebooks. In particular, increasing when the codebook granularity is already poor (small and ) does not lead to any performance improvement over that observed with (beam steering). On the other hand, with a high resolution for (large ), even a rank- approximation appears to be sufficient to reap most of the performance gains. This is because the performance of the baseline (beam steering) scheme is already quite good, and a significant relative improvement over it with increasing is unlikely unless the channel has a large number of clusters with similar gains (a low-probability event). When is large and is small, the beam steering performance is poor and the channel can be better approximated with the higher codebook resolution of , leading to a sustained performance improvement for even up to . For example, with or and , zeroforcing based on a rank- channel approximation leads to around bps/Hz improvement at the median level.

In terms of performance comparison, note that the scheme from [23, 24] assumes but infinite precision in terms of beam alignment (). Thus, it is not surprising that as and increase, the performance of the proposed schemes compares well with that of [23, 24]. For lower codebook resolutions, the proposed schemes overcome the codebook disadvantage by leveraging a better channel approximation as increases. These observations suggest that the optimal choice of the rank in approximating (which in turn determines the feedback overhead) depends not only on the rank of the true channel , but also on the codebook granularities. In general, a higher (and feedback overhead) is necessary if the codebook resolution is rich enough at the user end to allow better parsing of the channel, but poor enough at the base-station end to allow a sustained performance improvement with increasing . In particular, we provide the following heuristic design guidelines based on our studies:

 (41)

### V-B Quantizer Design

For the second study, we utilize different quantization functions to quantize the different parameters needed in channel reconstruction. For a phase term with a dynamic range of (e.g., and ), we use a uniform quantizer of the form

 $$Q_B(\theta) = \frac{2\pi}{2^B} \cdot \mathrm{round}\left( \frac{2^B}{2\pi} \cdot \theta \right), \tag{42}$$

where stands for a function that rounds off the underlying quantity to the nearest integer. For an amplitude term with a dynamic range of (e.g., ), we use a non-uniform quantizer of the form

 $$Q_B(\alpha) = \frac{\mathrm{round}\big( (2^B - 1) \cdot \alpha \big)}{2^B - 1}. \tag{43}$$

We scale with respect to in (43) instead of by because we want the quantized set to include both and for proper cross-correlation quantization. For example, in the typical case where the multi-user reception beam , we have and the use of a uniform amplitude quantizer will not allow the correct reproduction of this important quantity at the base-station end.
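A minimal Python sketch of the phase and amplitude quantizers in (42) and (43) is given below. The function names and the explicit round-half-up helper are assumptions made for illustration; the helper is used because Python's built-in `round` resolves ties to the nearest even integer, which may differ from the rounding convention intended in the text.

```python
import math


def round_half_up(x):
    """Round to the nearest integer, with ties rounded upwards."""
    return math.floor(x + 0.5)


def quantize_phase(theta, bits):
    """Phase quantizer of the form in (42): 2^bits levels spaced 2*pi/2^bits apart."""
    step = 2 * math.pi / (2 ** bits)
    return step * round_half_up(theta / step)


def quantize_amplitude(alpha, bits):
    """Amplitude quantizer of the form in (43) for alpha in [0, 1]."""
    scale = 2 ** bits - 1
    return round_half_up(scale * alpha) / scale


# With bits = 2 the reachable amplitude levels are 0, 1/3, 2/3 and 1,
# so both endpoints of the range are reproduced exactly.
amplitude_levels = sorted({quantize_amplitude(a / 1000, 2) for a in range(1001)})
```

Note that a phase close to $2\pi$ is mapped to $2\pi$ itself, which represents the same angle as $0$.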

Quantization of the is performed on a dB scale rather than on a linear scale. This is intuitive since measurements have a wide dynamic range. The proposed quantizer is similar to quantizations considered in Fourth Generation (4G) systems. In particular, for a received term (in dB) with a theoretically unbounded range (e.g., ), we first cap to a maximum value of and quantize a spread of (in dB) with quantization levels (denoted as ) as follows:

 $$\varrho_i = \varrho_{\max} - \frac{\Delta}{2^B - 1} \cdot i, \qquad i = 0, \cdots, 2^B - 1. \tag{44}$$

The quantization of is given as

 $$Q_B(\varrho) = \varrho_{i^\star} \quad \text{where} \quad i^\star = \arg\min_{i = 0, \cdots, 2^B - 1} |\varrho - \varrho_i|. \tag{45}$$

The parameters and correspond to the maximum quantizer level value and the distance between adjacent quantizer levels, respectively. In our numerical studies, we use dB with dB for bits, and dB for bits.
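The dB-scale quantizer in (44) and (45) can be sketched in Python as follows. The function names are assumptions, and the numerical values used in the example (a cap of 30 dB, a spread of 45 dB and 4 bits) are hypothetical placeholders rather than the settings used in the studies reported here.

```python
def db_levels(rho_max, spread, bits):
    """Levels of the form in (44): 2^bits values from rho_max down to rho_max - spread."""
    count = 2 ** bits
    step = spread / (count - 1)
    return [rho_max - step * i for i in range(count)]


def quantize_db(rho, rho_max, spread, bits):
    """Nearest-level quantization as in (45); values above rho_max map to rho_max."""
    levels = db_levels(rho_max, spread, bits)
    return min(levels, key=lambda level: abs(rho - level))


# Hypothetical example: 16 levels spaced 3 dB apart between 30 dB and -15 dB.
example_levels = db_levels(30.0, 45.0, 4)
example_value = quantize_db(11.2, 30.0, 45.0, 4)
```

Because the nearest level is selected, the cap at the maximum value and the floor at the lowest level are both handled without a separate clipping step.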

A similar approach is pursued in quantizing the amplitudes of the multi-user beam. While these amplitudes do not span a wide range, they can vary widely in relative terms across the antenna array. Specifically, the infinite-precision zeroforcing beams generated in Prop. 1 are quantized to meet the RF constraints in (7) as described next. Since , we assume that on average . By scaling by , we can ensure that is centered around dB and for this quantity, we generate quantization levels in dB scale (denoted as ) corresponding to a step size of (in dB) as follows:

 $$f_i = \Delta_f \cdot \left[ i + 1 - 2^{B-1} \right], \qquad i = 0, \cdots, 2^B - 1. \tag{46}$$

With these levels that are spaced apart, we obtain the quantized beam weights as

 (47)

where

 j⋆=argminj10log10(Nt⋅|fk(i)|2)−fjprovided10log10(Nt⋅|fk(i)|2)>f