Tensor-Based Channel Estimation for Dual-Polarized Massive MIMO Systems

# Tensor-Based Channel Estimation for Dual-Polarized Massive MIMO Systems

Cheng Qian, Member, IEEE, Xiao Fu, Member, IEEE, Nicholas D. Sidiropoulos, Fellow, IEEE, Ye Yang Preliminary conference version of part of this work will be presented at ICASSP 2018 [1]. This work was supported in part by the National Science Foundation under project NSF ECCS 1808159 and NSF ECCS 1608961. C. Qian and N. D. Sidiropoulos are with the Department of Electrical and Computer Engineering, University of Virginia, Charlottesville, VA 22904 USA (e-mail: alextoqc@gmail.com, nikos@virginia.edu).X. Fu is with the School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331 (xiao.fu@oregonstate.edu). Y. Yang is with Physical Layer & RRM IC Algorithm Department, WN Huawei Co., Ltd, Shanghai, China (yangye@huawei.com).MATLAB code is available at https://www.mathworks.com/matlabcentral/fileexchange/69176-tensor-based-channel-estimation-for-dual-polarized-mimo
###### Abstract

The 3GPP suggests to combine dual polarized (DP) antenna arrays with the double directional (DD) channel model for downlink channel estimation. This combination strikes a good balance between high-capacity communications and parsimonious channel modeling, and also brings limited feedback schemes for downlink channel state information within reach—since such channel can be fully characterized by several key parameters. However, most existing channel estimation work under the DD model has not yet considered DP arrays, perhaps because of the complex array manifold and the resulting difficulty in algorithm design. In this paper, we first reveal that the DD channel with DP arrays at the transmitter and receiver can be naturally modeled as a low-rank tensor, and thus the key parameters of the channel can be effectively estimated via tensor decomposition algorithms. On the theory side, we show that the DD-DP parameters are identifiable under mild conditions, by leveraging identifiability of low-rank tensors. Furthermore, a compressed tensor decomposition algorithm is developed for alleviating the downlink training overhead. We show that, by using judiciously designed pilot structure, the channel parameters are still guaranteed to be identified via the compressed tensor decomposition formulation even when the size of the pilot sequence is much smaller than what is needed for conventional channel identification methods, such as linear least squares and matched filtering. Extensive simulations are employed to showcase the effectiveness of the proposed method.

Channel estimation, massive MIMO, dual-polarized array, tensor factorization, identifiability.

## I Introduction

The dual-polarized (DP) antenna array has many appealing features that make it a strong candidate for adoption in next generation communication systems and massive MIMO [2, 3, 4, 5]. For example, Foschini and Gans [6] showed that the capacity of systems with DP antennas at the transmitter can be increased up to 50% compared to systems without polarization. Besides the increased capacity, DP antenna arrays have other key advantages relative to single-polarization counterparts with the same number of antennas, including smaller form factor and easier installation, better interference mitigation capability, and higher link reliability.

In the recent releases of technical specifications suggested by the 3GPP, the DP array and the double directional (DD) channel model are considered key techniques [5, 3, 4]. The DD channel model is parsimonious for multipath channels with a small number of dominant paths, and such scenarios arise in millimeter wave (mmWave) based wireless communications. Modlel parsimony is really essential for designing limited feedback schemes for downlink channel state information in frequency-division duplex (FDD) massive MIMO [3]. Specifically, 3GPP suggests that the mobile users estimate the DD channel parameters such as directions-of-arrival (DOAs), directions-of-departure (DODs), the complex path-loss associated with each path, and then feed back these parameters to the base station (BS). This strategy is rather economical, as it is expected that the number of dominant paths will be small to moderate in practical deployments. On the other hand, there are very few works related to the DD-DP channel/parameter estimation problem. Most of the existing channel estimation algorithms such as [7, 8, 9, 10, 11, 12] do not take polarization into consideration, and thus cannot be applied to this particular kind of system.

There are many challenges in the way of estimating the key parameters of the DD-DP channel. First, considering polarization adds another level of difficulty on top of the (DODs, DOAs, path losses) parametrization, which is already not easy to handle in some cases, e.g., when we have small-size pilot matrices or a large number of multipaths. Formulating the parameter estimation problem for the DD-DP channel in a mathematically tractable form and tackling it using effective signal processing tools that provide analytical performance guarantees is quite nontrivial. Second, although the DD-DP channel (or to be more precise, blocks of the channel) can be modeled using long-existing array processing models (as we will show), it is hard to apply the classic array processing algorithms (e.g., MUSIC [13] and ESPRIT [14]) for estimating the key parameters. The reason is that classic array processing methods usually work under relatively restrictive assumptions—e.g., MUSIC and ESPRIT need the number of multipaths to be smaller than the number of transmit and the number of receive antennas, which may not be satisfied in practice. Real systems often have to deal with more multipaths than antennas on one end of the link. Third, the conventional estimation methods use matched filtering or linear least squares to extract an estimate of the channel matrix out of the received signals, and then perform parameter identification. To do this, the pilot sequence has to be quasi-orthogonal or at least full row-rank, respectively. This is very expensive for massive MIMO systems if the number of transmit antennas is large.

Very recently, Zhu et. al., proposed an interesting framework for two-dimensional DOA and DOD estimation of wideband massive MIMO-OFDM systems with DP arrays [5]. The key idea behind this algorithm is to exploit a so-called multi-layer reference signal structure to estimate the arrival and departure angles. Specifically, the transmitter and receiver communicate with each other iteratively, and in each iteration the transmitter (or receiver) fixes a beam and then the receiver (or transmitter) varies a paired beamforming vector to receive (or transmit) data. This way, a closed-form formula for DOA/DOD can be asymptotically derived. The total number of iterations is proportional to the product of the number of paired beams and the number of radio frequency bins at both transmitter and receiver. This closed-loop iterative protocol implicitly assumes that the transmitter, receiver, and scatterers remain static, and the two ends of the link are synchronized in beam sweeping. Its resolution is also limited by the utilized beamwidth.

In this work, we consider the parameter estimation problem for frequency division duplex (FDD) dual-polarized DD channels—but the algorithms can be easily implemented in the time division duplex (TDD) systems as well. We aim at designing novel efficient channel estimation algorithms and analyzing the identifiability of the key parameters, i.e., DOAs, DODs and path-losses, of this model. Unlike the existing methods for DD-DP channels as in [5], our proposed approach does not require multiple iterations between the transmitter and receiver, which may not be desired or realistic in practice, e.g., in a scenario with relatively higher mobility. Our method is also naturally with high spatial resolution, inheriting nice properties of related array processing techniques. In addition, we fully characterize the theoretical boundaries of our methods in terms of parameter identifiability, leveraging advanced tensor algebra, and show that our method can work under a variety of challenging scenarios where existing methods tend to fail. Our detailed contributions can be summarized as follows:

• Tensor-Based Formulation. We show that the DD-DP channel can be naturally modeled as a low-rank tensor. Leveraging this structure, we recast the associated channel estimation problem as a low-rank tensor decomposition problem [15] and handle it using effective tensor decomposition algorithms.

• Rigorous Identifiability Analysis. On the theory side, we show that the channel (i.e., the multipath parameters) are identifiable under very mild and practical conditions—even when the number of paths largely exceeds the number of receive antennas, a practically important case that classic DP array processing algorithms, e.g., [16], cannot cope with.

• Reduced-pilot Formulation and Identifiability. We propose a downlink signaling strategy that utilizes a judiciously designed pilot structure. We show that this pilot structure combined with compressed tensor modeling can substantially reduce the downlink overhead, without losing identifiability of the channel parameters. This design is particularly suitable for massive MIMO systems, for which existing methods usually need very long pilot sequences to help the receivers extract the channel matrix and then perform parameter estimation. An effective estimation algorithm is also proposed for the designed piloting strategy.

We should mention that some very recent work [11, 12] also studied the downlink and uplink channel estimation problems from a tensor decomposition viewpoint. Nevertheless, the work in [11, 12] did not consider dual-polarized antenna arrays. Hence, the formulated problems and analyses there are quite different from ours.

A preliminary conference version of part of this work was presented at ICASSP 2018 [1]. The conference version includes the basic modeling and identifiability claims without detailed proofs. This journal version additionally includes detailed proofs of the identifiability results, and the compressed tensor factorization formulation, its identifiability proof, and a new algorithm that handles the compressed formulation.

Notation: Throughout the paper, superscripts , , , and represent transpose, complex conjugate, Hermitian transpose, matrix inverse and pseudo inverse, respectively. We use , , and for absolute value, Frobenius norm, -norm and -norm, respectively; denotes an estimate of , is a diagonal matrix holding the argument in its diagonal, is the vectorization operator and takes the phase of its argument; is the th element of a vector, is the entry of , and is the th column of . Symbols denote the Kronecker, Khatri-Rao, element-wise, and outer products, respectively; extracts the elements in rows to and columns to , extracts the elements in the columns to and extracts the elements in the rows to . is the identity matrix and is the zero matrix.

## Ii Signal Model and Problem Statement

### Ii-a Double Directional Dual-Polarized Channel Model

We consider an FDD massive MIMO system, where there are DP transmit antennas and DP receive antennas, see Fig. 1. In the array processing literature, this type of DP array is also known as “cross-polarized” array [16]. Under the considered scenario, each antenna pair consists of a vertical (V) polarized antenna and a twin horizontal (H) polarized antenna—and each antenna is connected with an RF chain. Therefore, the channel can be represented as a matrix, where each element of the channel matrix represents a link between a transmit antenna (which could be a V-polarized or a H-polarized antenna) and a (V- or H-polarized) receive antenna. The signal received by the user is given by [4]

 x(t)=Hs(t)+n(t), t=1,⋯,N (1)

where is the transmitted signal, is zero-mean i.i.d. circularly symmetric complex Gaussian noise. The downlink channel matrix can be represented as

 (2)

where is a channel matrix between all the V-polarized transmit antennas and V-polarized receive antennas, is a channel matrix between all the H-polarized transmit antennas and V-polarized receive antennas, and likewise for the other two blocks in (2).

For notational simplicity, let and . Then, according to the channel model suggested by the 3GPP [4], the th subchannel matrix is modeled as

 (3)

where is the component of line-of-sight (LOS) and is the component of non-line-of-sight (NLOS), and are energy normalization factors with being the ratio between the power related to the LOS and the power related to the NLOS, and

 H(p,q)LOS =~β(p,q)1ar(θ1,ϕ1)aHt(ϑ1,φ1) (4) H(p,q)NLOS =K∑k=2~β(p,q)kar(θk,ϕk)aHt(ϑk,φk) (5)

where the first path is assumed to be the LOS path that usually exists in systems operating at millimeter wave (mmWave) frequencies, and is the number of paths between the two subarrays. is associated with the array manifold of subarray : and are azimuth and elevation DOAs of the th path, respectively. Similarly, is determined by the subarray , and and are the azimuth and elevation DODs of the th path, respectively. Note that are generalized path-losses, which are random variables and affected by the small-scale loss, large-scale loss, distance between BS and MS, and dual-polarization parameters. We may express

 ~β(p,q)k=αk~γ(p,q)k,k=1,⋯,K (6)

where denotes the standard path-loss which is caused by propagation and fading, while denotes the polarization factor. Without loss of generality, we can absorb and into and , respectively, and define and . Thus,

 β(p,q)k=αkγ(p,q)k,k=1,⋯,K. (7)

Substituting (4) and into (3) produces

 H(p,q)=Ardiag(β(p,q))AHt (8)

where

 Ar =[ar(θ1,ϕ1)⋯ar(θK,ϕK)] (9) At =[at(ϑ1,φ1)⋯at(ϑK,φK)] (10) β(p,q) =[β(p,q)1⋯β(p,q)K]T. (11)

Now the channel matrix in (2) can be rewritten as

 H [AtAt]H (12)

which in a more compact form is

 H=(I2⊗Ar)Λ(I2⊗At)H (13)

where

 (14)

The model in (II-A) has been advocated by the 3GPP as a standardized channel modeling approach for the long-term evolution (LTE) systems [4, 3]. As mentioned in [4], most standardized channels like spatial channel model (SCM), SCM extension (SCME) [17], WINNER [18] and ITU [19] are based on this model. The model assumes that the H-polarized subarray and the V-polarized subarray share the same array manifolds, while the polarization information is contained in the path-loss vectors, i.e., ’s. This model has many favorable features. It concisely models the effect of polarization. More importantly, it incorporates the elevation information of the transmit and receive antenna arrays in addition to the azimuth information—leading to the so-called 3-D channel modeling, which is considered very useful for next generation wireless communication systems, since it provides many more degrees of freedom that can potentially enhance system performance; see detailed discussion in [4, 3].

In practice, the specific form of and are intimately tangled with the array geometry. For example, the BSs are usually equipped with uniform rectangular arrays (URAs) that each has and horizontal and vertical array units111Note that in the DP arrays, each array unit consists of a pair of DP antennas.. An illustration is shown in Fig. 2. In this special case, the th steering vector for the transmitter becomes

 at(ϑk,φk)=at,k=ay,k⊗ax,k (15)

where and with and . Here, is the wavelength, and are the inter-element spacing distances for horizontal and vertical units, respectively.

For ease of exposition, let us assume that the receivers employ uniform linear arrays (ULAs). Note that the analytical tools that we use can be easily extended to cover cases where the receive array has a different geometry, e.g., URA. Under the ULA assumption, the th steering vector for the receiver is

 (16)

where with being the inter-element spacing of the ULA at the receiver side. Note that to avoid phase wrapping, and must be carefully chosen. The most widely adopted choice for and is half-wavelength. In this paper, given the angular ranges of , and , we assume that , and are determined such that , and for all in their own ranges, respectively.

### Ii-B Problem Statement

Given the described channel model, our goal is to estimate the key parameters of the 3-D downlink channel. Here, by “key parameters”, we mean the set of DOAs and DODs that are associated with the paths and the corresponding path-losses, i.e., , and for all and . Note that for massive MIMO systems that follow the channel model in (II-A), one is very well motivated to estimate these parameters. The reason is that the (potentially large) channel matrix that has complex-valued elements is fully characterized by the parameters of interest. Since the number of multipaths is usually not large in practice, the number of parameters is relatively small, i.e., ( for the DOAs, for the DODs and for the complex-valued path-losses) and we usually have

 8K≪4MrMt.

For example, in a scenario where and paths exist, the channel matrix consists of complex-valued elements while we only have 48 key parameters, of which are real-valued (DOAs and DODs). Therefore, by estimating the key parameters, one can feedback the downlink channel to the BS in a very economical way—only feeding back the parameters rather than the whole channel suffices to recover the downlink MIMO channel at the BS. In fact, this is the main idea enabling implementation of limited feedback schemes in mmWave massive MIMO systems suggested by the 3GPP [3].

In this work, we begin with the scenario where the downlink channel matrix can be estimated at the receiver via simple procedures such as matched filtering or linear least squares (LS). Our objective is to estimate the key parameters from a given . We will first study identifiability theory associated with this simpler scenario—i.e., under what conditions the DOAs, DODs and path-losses can be provably identified from ? Effective algorithms based on the identifiability analysis will also be proposed. In addition, we will consider more challenging yet desirable scenarios where only a compressed version of available at the receiver, and provide a pragmatic and effective algorithm to estimate the parameters of interest—this approach can substantially reduce the pilot length, thereby greatly saving the downlink training overhead.

### Ii-C Prior Art and Challenges

Estimating the DOAs, DODs and path-losses from is not a trivial task. Nevertheless, many classic methods from the array processing society can be applied, under some conditions. For example, if the submatrix has full-column rank , and , and also if the receive and transmit antenna arrays are ULAs/URAs, the subspace methods such as ESPRIT and MUSIC can be applied to estimate the DOAs and DODs. After that, the path-losses can be recovered, e.g., via LS estimation.

This is viable, but can only work under relatively stringent conditions. The classic array processing methods like ESPRIT and MUSIC all assume full column rank of , , and , which implicitly assumes that (to be precise, is needed for subspace methods like MUSIC). In many practical scenarios, is relatively small—e.g., the newest model of iPhone (i.e., iPhone X released in 2017) only supports two receive antennas, while the number of paths can easily exceed two. Is there a more powerful method that can provably identify the parameters of interest under much more relaxed conditions? We will address this question in the next section.

Another possible way to handle the parameter estimation problem is to treat each block as a sparse optimization problem [8, 9]. For each subchannel block, we discretize the DOA and DOD domains into fine angle grids and then construct three overcomplete angle dictionaries (codebooks), denoted by , and . Then, we have , where is a sparse matrix that selects out the columns associated with the active DODs and DOAs from the dictionaries. This way, the parameter estimation problem becomes a sparse recovery problem that can be handled by formulations such as LASSO [20], i.e.,

 ming(p,q) ∥h(p,q)−(D∗y⊗D∗x⊗Dr)g(p,q)∥22+λ∥g(p,q)∥1

where with ; and other sparse optimization algorithms such as orthogonal matching pursuit [9]. The difficulty is that to ensure good spatial resolution, , and are very “fat” matrices, where , and denotes the number of angle grid points after quantization. Consequently, is of size . If one quantizes the DOA and DOD space (ranging from to ) using a resolution of one degree, then —which poses a very hard sparse optimization problem.

## Iii Proposed Approach

In this section, we propose to estimate the key parameters of interest using low-rank tensor factorization—which has provable guarantees under realistic and relaxed conditions.

### Iii-a Tensor Preliminaries

To make the paper self-contained, we briefly present the definition of tensor and some useful theorems on the uniqueness of tensor decomposition in the following.

###### Definition 1

(Tensor). A tensor is a multidimensional array indexed by three or more indices. Specifically, an -th order tensor that has latent factor matrices can be written as

 X=F∑f=1[U1]:,f∘⋯∘[UN]:,f

where and the minimal such is the rank of tensor or the canonical polyadic decomposition (CPD) rank of . [15].

###### Definition 2

(Unfolding). For an -th order tensor in Definition 1, its -mode matrix unfolding can be written as

 X(n)=(UN⊙⋯⊙Un+1⊙Un+1⊙⋯⊙U1)UTn.

Simply speaking, each unfolding is obtained by taking the mode- slabs of the tensor (i.e., subtensors obtained by fixing the th index of the original tensor), vectorizing the slabs, and then stacking all the vectors into a matrix—see details in [15].

Low-rank tensor decomposition [also known as CPD or Parallel Factor Analysis (PARAFAC)] aims at factoring into a sum of column-wise outer products of —with each such outer product being a rank-one tensor. Unlike matrix factorization, which is in general non-unique, the PARAFAC decomposition has unique solution under mild conditions, up to scaling and permutation of the components. One of the best-known uniqueness results for third-order tensors is due to Kruskal [21], which was later extended to higher orders by Sidiropoulos and Bro [22].

###### Theorem 1

[22] Given a th order tensor as in Definition 1, if , then and the decomposition of is essentially unique, where denotes Kruskal rank.

The essential uniqueness makes the latent factors of a tensor identifiable from the ‘ambient data’ up to some trivial ambiguities—which has enabled a tremendous amount of applications—see an overview in [15]. Theorem 1 is known to be a broadly applicable general result. In some special cases where the factor matrices have special structure, e.g., Vandermonde structure, the uniqueness condition in Theorem 1 can be improved. For example, we have the following theorem:

###### Theorem 2

[23] Consider a tensor , where , and with Vandermonde having distinct nonzero generators. Then, if

 krank(U2)+min{I1+krank(U3),2F}≥2F+2

the factors are essentially unique.

Theorem 2 presents a much milder identifiability condition relative to that in Theorem 1. The result is tailored for tensors that have a latent factor with Vandermonde structure. Such structure emerges quite often in array processing, since some array geometries (ULA, URA, nested or coprime arrays) naturally give rise to Vandermonde matrices.

### Iii-B Tensor-Based Method and Identifiability

Our proposed approach starts by noticing that is in fact a tensor of rank (at most) , when the BS is equipped with a URA and the receiver with a ULA. To see this, let us first vectorize each block as , and stack the vectorized ’s into a matrix as follows:

 ˇH =[h(Vr,Vt),h(Vr,Ht),h(Hr,Vt),h(Hr,Ht)] =(A∗y⊙A∗x⊙Ar)BT (17)

where and and

 B=[β(Vr,Vt)β(Vr,Ht)β(Hr,Vt)β(Hr,Ht)]T∈C4×K (18)

in which we have used (15) (or, more precisely ) and .

Eq. (III-B) is exactly the definition of a four-slab fourth-order tensor of rank in the matrix unfolding form [15] when (cf. Definition 2). We have this fourth-order tensor because of the array manifolds that we have assumed for the transmit and receive arrays. The four factor matrices , (, ) and are the manifold of the receive antenna array, the manifold matrices of (horizontal and vertical) transmit antenna arrays, and the path-loss matrix, receptively. A side comment is that if some other array geometries are employed at both sides, one can also derive a low-rank tensor from blocks of by rearranging. We list the resulting tensor structure of some pertinent cases for widely used configurations of the transmit and receive antenna arrays in Table I.

#### Iii-B1 Array Manifold Estimation

Our idea is to estimate , , and from the tensor , and then estimate the multipath parameters using the estimated factor matrices. As will be shown shortly, if , and are accurately estimated, the DOAs and DODs can then be estimated in closed-form. To estimate the array manifolds and the path-losses (i.e., ), we propose to employ the following tensor decomposition formulation:

 minAr,Ax,Ay,B∥∥ˇH−(A∗y⊙A∗x⊙Ar)BT∥∥2F. (19)

which is the least squares fitting formulation for low-rank tensor factorization.

Various low-rank tensor decomposition algorithms can be applied to identify the loading matrices [24, 25, 15]. Among them, one of the most popular methods is the so-called alternating least squares (ALS) technique. To implement ALS for solving (19), we make use of different unfoldings of the tensor , which are denoted as follows:

 H(1) =(B⊙A∗y⊙A∗x)ATr (20) H(2) =(B⊙A∗y⊙Ar)AHx (21) H(3) =(B⊙A∗x⊙Ar)AHy (22) H(4) =(A∗y⊙A∗x⊙Ar)BT. (23)

Note that is exactly . Using the unfoldings, one can easily implement the following alternating optimization algorithm:

 Ar ←argminAr∥∥H(1)−(B⊙A∗y⊙A∗x)ATr∥∥2F (24a) Ax (24b) Ay (24c) B ←argminB∥∥H(4)−(A∗y⊙A∗x⊙Ar)BT∥∥2F (24d)

where the four subproblems are all linear LS problems that can be readily solved in closed-form. The ALS algorithm repeatedly solves the subproblems until convergence. Derivative-based schemes can also be used for optimization, from Gauss-Newton and BFGS to simple stochastic gradient type methods. See [15] for more information.

#### Iii-B2 Parameter Estimation

Once , , and are obtained from (24), the angles and can be estimated by exploiting the manifold structure of and . To proceed, let us consider the following:

 ^ωr,k =∠(¯¯¯¯¯AHr,kA––r,k) (25) ^ωx,k =∠(¯¯¯¯¯AHx,kA––x,k) (26) ^ωy,k =∠(¯¯¯¯¯AHy,kA––y,k) (27)

where and are the vectors consisting of the first and last entries of with length , respectively. Then we estimate from , and and from and as

 ^θk =sin−1(ν2πdr^ωr,k) (28) ^φk =sin−1(√(ν2πdx^ωx,k)2+(ν2πdy^ωy,k)2) (29) ^ϑk =tan−1(dx^ωy,kdy^ωx,k). (30)

Eqs. (28)-(30) hold because we have assumed that the transmit and receive antenna arrays are URA and ULA, respectively. The closed-form solutions are also rotationaly invariant—not affected by scaling ambiguity that is brought by tensor decomposition.

We should mention that the tensor decomposition algorithm in (24) has already given an initial estimate of , i.e., the path-losses. However, since there is an intrinsic scaling ambiguity of tensor decomposition (cf. Lemma 1 in Appendix A), such an initial estimate may not be useful. Nevertheless, this is easy to fix. Note that can be reconstructed from and without scaling ambiguity. Then, the estimate of without scaling ambiguity can be computed as

 ^B ←argminB∥∥H(4)−(^A∗y⊙^A∗x⊙^Ar)BT∥∥2F. (31)

Algorithm 1 summarizes the tensor based parameter estimation, where in the first step we have assumed that the pilot matrix has orthogonal rows so that gives a fairly accurate estimate of . Note that the order of applying tensor decomposition, angle estimation, and path-loss estimation matters—since angle decomposition can naturally remove the scaling ambiguity, as we discussed.

Note that the complexity of (25)-(27) is very low but the resulting estimate could be suboptimal. For better accuracy, one can resort to single-tone frequency estimation algorithms, e.g., [26, 27] or maximum likelihood (ML)-based (periodogram) methods, to estimate the DODs and DOAs from the estimated manifolds. These latter methods are statistically efficient (approximately) in the high SNR regime, but are computationally more demanding than the simple closed-form solutions provided earlier.

#### Iii-B3 Parameter Identifiability

As we mentioned, the channel parameters can be estimated using some other methods, e.g., MUSIC and ESPRIT, which possibly admit lower complexity compared to tensor decomposition. However, a salient feature of tensors is that the factors are uniquely identifiable under mild conditions. For example, it can be shown that meet the -rank condition [23] provided that all the DOAs, DODs and path-losses are not the same, which is a mild condition considering the random nature of the multiple paths. Then we have the following theorem:

###### Theorem 3

Assume that the scenario where the transmitter is equipped with a URA and the receiver a ULA, and that and are different for any . Also assume that the pathloss parameters in are generated following some jointly continuous distribution. Then, the key parameters and the path-losses for all and are uniquely identifiable via the proposed approach provided that

 min(Mr,K)+min(Mx,K) +min(My,K) +min(4,K)≥2K+3 (32)

almost surely.

The proof relies on the identifiability of the four-slab four-way tensors and is relegated to Appendix A. Although the proof is relatively straightforward for someone who is frequently exposed to tensors, the implication of Theorem 3 is important: Using the proposed approach, the key parameters are uniquely identifiable even when the number of paths largely exceeds the number of . This makes the proposed method widely applicable to many realistic scenarios, especially where the receive antenna array is of a relatively small size, as in many mobile phones. Furthermore, this identifiability is independent of the array configurations of transmit and receive antennas.

Theorem 3 is intuitive and easy to read, but is not the best bound that we can get. In fact, if we look at the parameter estimation problem from a multi-snapshot 2-D harmonic retrieval viewpoint, a much stronger identifiability result can be obtained, which is summarized in the following theorem:

###### Theorem 4

The parameters are all uniquely identifiable provided that

 K≤argmaxF,Pr,Px,Py F s.t. max((Pr−1)PxPy,Pr(Px−1)Py, PrPx(Py−1))≥F 8QrQxQy≥F (33)

where .

Theorem 4 can be proven by invoking identifiability results in multi-dimensional harmonic retrieval, in particular, the construction following the IMDF algorithm [28]. The result in Theorem 4 is a bit harder to read compared to Theorem 3, but is far better upon close inspection. For example, when , the identifiable case under Theorem 3 is with up to , while multi-paths can be guaranteed to be identified under Theorem 4. Furthermore, even when the MS only has a single dual-polarized antenna, it can be shown using the IMDF based approach that the number of identifiable paths is upper bounded by . For details about the IMDF method, we refer the readers to [28]. Due to its simplicity, it is either a good candidate to initialize the method proposed in Section III-B or can be directly applied for computational efficiency.

#### Iii-B4 Computational Complexity

We should remark that the complexity of the proposed tensor decomposition method is dominated by the step for solving (24), where each subproblem is a least squares problem. Taking (24a) as an example, the solution to this subproblem is as follows:

 ^ATr= ((BHB)⊛(ATyA∗y)⊛(ATxA∗x))−1× (B⊙A∗y⊙A∗x)HH(1),

which needs flops if one uses the above relatively naive implementation. The matrix inversion and large matrix product (i.e., ) parts are the most costly to compute. Nevertheless, these two operations can be avoided if some advanced solvers are employed, e.g., [24, 29].

## Iv Parameter Identification Using Frugal Pilots

In the previous section, we have proposed a tensor decomposition-based method for estimating the DOAs, DODs, and path-losses of the MIMO 3-D channel when a reliable estimate of is available. This is viable when is ‘fat’ or square and with full row-rank. In other words, when the pilot sequence is long enough so that has full row rank, can be estimated via least squares (or simply matched filtering if ). Then, the method that is proposed in the previous section can be applied. In practice, using a long pilot sequence is not desirable since this creates large downlink training overhead. When is large, the size of is at least if one wishes to make the rows of orthogonal to each other, that could be costly. In this section, we propose another approach to handle the above challenge. We carefully design the transmit pilot sequence and formulate a compressed tensor decomposition (CTD) problem for parameter estimation. As it turns out, we can use a pilot matrix whose size is much smaller than to identify the channel parameters.

### Iv-a Proposed Downlink Training and Parameter Identification Approach

To reduce downlink overhead in a massive MIMO system while keeping identifiability of the key parameters of interest, we propose to employ the following specially structured pilot matrix:

 S=[Q00Q]∈C2Mt×N (34)

where (assuming is an even number for simplicity) whose elements are generated following a certain absolutely continuous distribution and . This way, the (noise-free) received data matrix becomes

 X=[Ardiag(β(Vr,Vt))AHtQArdiag(β(Vr,Ht))AHtQArdiag(β(Hr,Vt))AHtQArdiag(β(Hr,Ht))AHtQ] (35)

Given the above , our goal is to identify , , —i.e., the path losses and the array manifolds, since all the key parameters can be easily estimated from them as described in the previous section.

Physically, the proposed design of in (34) corresponds to a time-division multiplexing strategy that transmits pilots from the H-polarized array first, and then transmits the same pilots from the V-polarized array (or the other way around), with the other turned off—which is very easy to implement in practice. Nevertheless, as we will show, such a simple signaling strategy combined with tensor algebra allows us to identify the parameters of interest under very mild conditions—even when the number of columns of is much smaller than .

### Iv-B Identification Approach and Theoretical Guarantees

One can see that the four blocks in (35) comprise a four-slab three-way tensor, where the th block is defined as

 X(p,q)=Ardiag(β(p,q))AHtQ (36)

for and . Taking the transpose of and then vectorizing it yields

 x(p,q)=(Ar⊙(QTA∗t))β(p,q). (37)

Thus, by collecting , we have

 Z =[x(Vr,Vt) x(Vr,Ht) x(Hr,Vt) x(Hr,Ht)] =(Ar⊙QTA∗t)BT∈CNMr/2×4 (38)

It is readily seen that is nothing but a matrix unfolding of a third order tensor whose latent factors are , and . It immediately follows that , and can be identified under the proposed pilot design under certain conditions. Hence, at least the angle of arrivals can be easily estimated from , under quite mild conditions that guarantee identifiability of third-order tensors, as stated in Theorem 1. Let

 E=(QTA∗t)∗=QHAt. (39)

Estimating from the compressed measurements is nontrivial—can we uniquely identify the DODs from ? The answer is affirmative—if the transmit array is an ULA or URA and certain mild conditions are satisfied, which we will explain shortly.

#### Iv-B1 Manifold Estimation via Smoothed ESPRIT

There are many ways to estimate , and from , since this is nothing but a third-order tensor decomposition problem. The identifiability of this kind of tensor (in which there is at least one factor which has a Vandermonde structure) has also been well understood—e.g., the aforementioned Theorem 2 that was derived in [23]. Here, we propose to employ a method that was recently proposed in [30] to handle this problem. The method is in nature a subspace method, which works under very relaxed identifiability conditions by exploiting the Vandermonde structure of a latent factor of the tensor. The detailed proof of the algorithm can be found in [30]. Here, we refer to this algorithm as the smoothed ESPRIT algorithm.

The algorithm starts by working with a third-order tensor as follows (with a bit of notational abuse):

 X=(A⊙B)CT (40)

where and are drawn from some continuous distributions, and , and is Vandermonde with distinct nonzero generators. To identify , and , one can employ the following procedure:

First, let us define a cyclic selection matrix . It is easy to check that

 (Ji2⊗IJ)X =((Ji2A)⊙B)CT =(A1⊗B)diag([A]i2+1,:)CT (41)

where contains the first rows of . Thus, by varying from 0 to where , we can construct a smoothed matrix as

 Xs =[J0X⋯JI2−1X] =(A1⊙B)(A2⊙C)T∈CI1J×I2K. (42)

where takes the first rows of . Starting from (IV-B1), the smoothed ESPRIT algorithm applies a series of re-arranging of the tensor elements and finally converts the factorization problem to a classical DOA estimation problem that can be handled by ESPRIT—and solves it via eigen-decomposition—see details in [30]. The algorithm also offers very favorable identifiablity guarantees:

###### Theorem 5

[30] Consider a third-order tensor , where , , , is Vandermonde with distinct nonzero generators. Assume that and are drawn from certain absolutely continuous distributions, respectively. Then, if

 F≤min((I1−1)J, I2K) (43)

where and are chosen from

 {I1,I2}=argmax{I1,I2}∈Z+ min((I1−1)J,I2K) (44)

Then , and are identifiable up to permutation and scaling of columns, almost surely.

The above method can be directly applied to in (IV-B), if we treat , , as , , , respectively, and the corresponding dimensions are , and . It is clear that the , , and are identifiable up to scaling and permutation ambiguities under quite mild conditions—both and can be smaller than the number of paths. We remark that , , and can be identified using any tensor factorization method, e.g., the least squares fitting formulation for tensor decomposition and ALS, which may have some other benefits such as being more noise-robust. Nevertheless, the employed approach admits by far the strongest identifiability result for a third-order tensor which has a Vandermonde latent factor. Another good feature of the employed approach is that it is very lightweight and consists of only simple algebraic procedures and eigen-decomposition, which is very friendly to real-time implementation.

#### Iv-B2 Identification of DODs

By the proposed procedure (or any other tensor factorization algorithm), and can be identified. However, whether is identifiable from is not yet clear. Recall that we previously defined in (39). Since there is complex scaling ambiguity in the estimate of , i.e., , the problem is equivalent to solving

 ^e=ξQHat

when is a known compression (fat) matrix and is a given compressed measurement vector, where can represent any column of the estimated and represents the corresponding column in . Here, is a complex-valued non-zero scalar that represents the scaling ambiguity inherited from the tensor factorization phase. Solving the above underdetermined system of equations to recover the vector is quite similar to the problem of compressive sensing [31]. However, our is not sparse, and sparsity is what modern compressive sensing relies on to establish signal identifiability—this raises the question if is still identifiable from the system of linear equations, since an underdetermined system could have an infinite number of solutions?

To address this issue, we have the following lemma:

###### Lemma 1

Given a system of equations where , , , and is a function of . Assume that is generated following some absolutely continuous distribution, and that is a steering vector keeping the transmit array structure. Also assume that . Then, is identifiable from almost surely.

###### Proof:

Let us assume that there exists another steering vector that also satisfies , where and . Hence, we have

 QH[ξat,−ξ′a′t]=0. (45)

Since and are Kronecker product of two Vandermonde vectors, we have

 rank([ξat,−ξ′a′t])=2

if . Also because is a random matrix generated following some absolutely continuous distribution and , we have

 rank(QH[ξat,−ξ′a′t])=2

holds with probability 1 (Lemma 1, [32])—which is a contradiction to (45). This completes the proof. \qed

Lemma 1 clearly indicates that if , the solution to is unique (where contains the inherited scaling ambiguities from the tensor factorization stage)—if the columns of are Vandermonde vectors that have different generators. To be specific, we have the following corollary:

###### Corollary 1

Assume that , and that is the manifold of an URA or ULA, which has a set of different DODs, i.e, for . Then, can be uniquely identified from the system .

Corollary 1 indicates that under the premise that the third-order tensor is identifiable, then one can use a pilot matrix has as few as columns, which can be rather economical in practice. On the other hand, the results in Lemma 1 and Corollary 1 are not entirely surprising: After all, all the columns in are parametrized by only two variables—it makes much sense that we can identify them from two equations.

Combining the above results, we have the following theorem that states an integrated result of the two steps (i.e., tensor factorization and identification):

###### Theorem 6

Assume that the receive antenna array is a ULA and the transmit array is a URA, and that every component in