Joint Pilot Design and Uplink Power Allocation in Multi-Cell Massive MIMO Systems

Trinh Van Chien, Student Member, IEEE, Emil Björnson, Senior Member, IEEE, and Erik G. Larsson, Fellow, IEEE

The authors are with the Department of Electrical Engineering (ISY), Linköping University, SE-581 83 Linköping, Sweden. This paper was supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 641985 (5Gwireless). It was also supported by ELLIIT and CENIIT. Parts of this paper were presented at IEEE ICC 2017.

This paper considers pilot design to mitigate pilot contamination and provide good service for everyone in multi-cell Massive multiple-input multiple-output (MIMO) systems. Instead of modeling the pilot design as a combinatorial assignment problem, as in prior works, we express the pilot signals using a pilot basis and treat the associated power coefficients as continuous optimization variables. We compute a lower bound on the uplink capacity for Rayleigh fading channels with maximum ratio detection that applies with arbitrary pilot signals. We further formulate the max-min fairness problem under power budget constraints, with the pilot signals and data powers as optimization variables. Because this optimization problem is non-deterministic polynomial-time hard (NP-hard) due to signomial constraints, we propose an algorithm to obtain a local optimum with polynomial complexity. Our framework serves as a benchmark for pilot design in scenarios with either ideal or non-ideal hardware. Numerical results show that the proposed optimization algorithms perform close to the optimal solution obtained by exhaustive search over different pilot assignments, and that the new pilot structure and optimization bring large gains over the state-of-the-art suboptimal pilot design.

Massive MIMO, Pilot Design, Signomial Programming, Geometric Programming, Hardware Impairments.

I Introduction

The demands on capacity and reliability in wireless cellular networks are continuously increasing. It is known that multiple input multiple output (MIMO) techniques can improve both capacity and reliability [1, 2, 3], but current systems only support up to eight antennas per base station (BS). While codebook-based channel acquisition is attractive in such small-scale MIMO systems, these methods are not scalable and unable to support the fifth generation (5G) demands on spectral efficiency (SE) in non-line-of-sight conditions [4]. Massive MIMO was proposed in [5] as a possible solution and it has emerged as a key 5G technology, because it offers significant improvements in both SE and energy efficiency [6, 5, 4, 7, 8]. By equipping the BSs with hundreds of antennas, mutual interference, thermal noise, and small-scale fading can be almost eliminated by virtue of the channel hardening and favorable propagation properties [6]. The BSs only need to use linear detection schemes, such as maximum ratio (MR) or zero forcing, to achieve nearly optimal performance [9]. In addition, the SE only depends on the large-scale fading coefficients, thus power control algorithms are easier to deploy than in small-scale MIMO systems, which are greatly affected by small-scale fading [10].

The uplink (UL) detection and downlink precoding in Massive MIMO are based on instantaneous channel state information (CSI), which the BSs obtain from UL pilot signals. Mutually orthogonal pilots are desirable, but this is impractical in multi-cell scenarios since the pilot overhead would be proportional to the total number of users in the entire system. The consequence is that the pilot signals need to be reused across cells. This leads to pilot contamination [11, 12], where users sending the same pilot degrade each other's channel estimation and cause large mutual interference. Hence, the pilot design is of key importance in Massive MIMO and should be optimized to mitigate the pilot contamination effects.

The baseline scheme for mitigating pilot contamination is to introduce a pilot reuse factor $f \geq 1$, such that each pilot is only reused in $1/f$ of the cells. This approach, which was studied in [13, 14, 15, 16], can greatly reduce the pilot contamination, even if the pilots are randomly assigned within each cell. However, this gain comes at the cost of using $f$ times more pilots than in a system reusing the pilots in every cell. For any given cell, only a few users in the neighboring cells cause most of the potential pilot contamination, thus it is most important that these potential contaminators are assigned different pilots from the users in the given cell. Algorithms for coordinated pilot assignment were proposed in [17, 18, 19, 20]. A pilot reuse dictionary was defined in [17] and the corresponding pilot assignment problem was shown to be non-deterministic polynomial-time hard (NP-hard), which motivates the design of heuristic assignment mechanisms. Although [17] proposed several greedy algorithms, the optimized SE was far from that with exhaustive search over all pilot assignments. Graph theory was used for pilot assignment in [18], by exploiting variations in the large-scale fading coefficients. A method called “smart pilot assignment” was proposed in [20] to enhance the max-min fairness SE level, by optimizing a heuristic mutual interference metric. Alternatively, [19] formulated the pilot assignment problem as a potential game. The numerical results in [18, 19, 20] show performance that is similar to an exhaustive search, but with a substantially lower computational complexity. Moreover, the authors of [21, 22] utilized particular channel properties to reduce channel estimation errors and mitigate pilot contamination. In particular, [21] utilized the orthogonality among different channels and an assumed low-rankness of the channel covariance matrices. An adjustable phase shift pilot construction was suggested in [22] based on the relationship between channel correlations in the frequency domain and their power angle-delay spectrum. However, all these algorithms rely on the assumption of fixed pilot and data power.

The pilot and payload data powers are usually treated as constants in the Massive MIMO literature, but it is known from [23, 12] that the performance can be much improved by using the optimal power allocation, which balances the mutual interference levels. To improve the channel estimation quality, more power might also be assigned to the pilots than to the data transmissions [24, 25]. For single-cell systems, [24] showed that a pilot-data power imbalance is especially important for cell-edge users. Moreover, the power allocation that maximizes the sum SE is much different from the one that maximizes the max-min SE. Similar behaviors for multi-cell systems were observed in [25]. The authors in [26] considered power optimization problems with pilot reuse factors. To the best of our knowledge, no prior work analyzes joint pilot design and power control in Massive MIMO systems.

In this paper, we propose a novel pilot design and optimize the UL performance in multi-cell Massive MIMO systems, using the max-min fairness utility. Our main contributions are:

  • We propose a new pilot design where the pilot signals are treated as continuous variables. We demonstrate that previous pilot designs are special cases of our proposal.

  • Based on the proposed pilot design, we derive closed-form expressions of the SE with Rayleigh fading channels and MR detection, for the cases of ideal hardware and with hardware impairments. These expressions explicitly demonstrate how the SE is affected by mutual interference, noise, and pilot contamination.

  • We formulate the max-min fairness problem for the proposed pilot design, by treating the pilot signals, pilot powers, and data powers as optimization variables. This is an NP-hard signomial program, so we propose an algorithm that finds a local optimum in polynomial time. For comparison, the optimal solution obtained by an exhaustive search over different pilot assignments is also investigated.

  • The proposed algorithms are evaluated numerically, with either ideal hardware or hardware impairments. The results show that our local solution is close to the global optimum by exhaustive search over different pilot assignments and demonstrate significant improvements over the heuristic algorithms in prior works.

A preliminary version of this work, focusing only on pilot optimization with fixed data powers, was presented in [27].

The rest of this paper is organized as follows: Section II presents our proposed pilot structure and compares it with prior works. Lower bounds on the UL ergodic SE for arbitrary pilots are derived in Section III, while Section IV formulates the max-min fairness optimization problems and provides the global and local solutions. Sections V and VI extend our research to the case of hardware impairments and correlated Rayleigh fading, respectively. Finally, Section VII gives extensive numerical results and some conclusions are provided in Section VIII.

Notations: Bold lowercase letters are used for vectors and bold uppercase letters for matrices. $(\cdot)^{T}$ and $(\cdot)^{H}$ stand for the regular transpose and the Hermitian transpose, respectively. The superscript $(\cdot)^{*}$ denotes the complex conjugate of a complex number. $\mathbf{I}_{n}$ is the identity matrix of size $n \times n$. $\mathbb{C}^{a \times b}$ ($\mathbb{R}^{a \times b}$) is the space of $a \times b$ complex (real) matrices, while $\mathbb{C}^{a}$ denotes the space of $a$-length complex vectors. $\mathbb{R}_{+}$ is the set of nonnegative real numbers. $\mathbb{E}\{\cdot\}$ denotes the expectation of a random variable and $\|\cdot\|$ is the Euclidean norm. Finally, $\mathcal{CN}(\cdot,\cdot)$ is the circularly symmetric complex Gaussian distribution, while $\mathcal{N}(\cdot,\cdot)$ is the normal distribution.

II Pilot Designs for Massive MIMO Systems

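We consider the UL of a multi-cell Massive MIMO system with $L$ cells. Each cell consists of a BS equipped with $M$ antennas that serves $K$ single-antenna users. All tuples of cell and user indices belong to a set $\mathcal{S}$ defined as

$$\mathcal{S} = \left\{ (i,t) : i \in \{1,\ldots,L\},\ t \in \{1,\ldots,K\} \right\}. \tag{1}$$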

The radio channels vary over time and frequency. We divide the time-frequency plane into coherence intervals, each containing samples, such that the channel between each user and each BS is static and frequency flat. In each coherence block, the pilot signaling utilizes symbols and the remaining is dedicated to data transmission. In this paper, we focus on the UL, so the fraction of the coherence interval is dedicated to UL data transmission. However, it is straightforward to extend our work to the downlink by using time division duplex (TDD) and channel reciprocity. We assume to keep the training process feasible and stress that the case is of practical importance since it gives rise to pilot contamination and since is large in practice.

II-A Proposed Pilot Design

Let us denote the mutually orthonormal basis vectors , where is a vector whose th element has unit magnitude, and all other elements are equal to zero. The corresponding basis matrix is


We assume that the pilot signals of the users can span arbitrarily over the above basis vectors. We aim at designing a pilot signal collection comprising the pilot signals used by all users in the network and each of them has the length of symbols. The pilot signal of user  in cell  is and the power that this user assigns to the th pilot basis is denoted as . Thus, the pilot of user  in cell  is


We stress that the pilot construction in (3) can be used to create any set of orthogonal pilot signals (up to a unitary transformation) and many different sets of non-orthogonal signals. [Footnote 1: The pilot signals in (3) are formed as linear combinations of basis vectors in the complex field.] The new pilot design allows the use of non-orthogonal pilot signals even within a cell in order to get extra degrees of freedom to minimize the interference in the network. The total pilot power consumption of each user is assumed to satisfy the power constraint


where is the maximum pilot power for user  in cell . The inner product of two pilot signals and is


These pilot signals are orthogonal if the product is zero, which only happens when they allocate their powers to different subsets of basis vectors. Otherwise, they are non-orthogonal and then the two users cause pilot contamination to each other. If the square roots of the powers allocated to the users in cell  are gathered in matrix form as


then the users in cell  utilize a pilot matrix defined as


We now describe the difference between this new pilot structure and the prior works, for example [20, 18, 24, 25].
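Before turning to prior designs, the construction in Section II-A can be illustrated numerically. The following is a minimal sketch, assuming the basis vectors are the standard unit vectors and using hypothetical names (`build_pilots`, `power_coeffs`) that do not come from the paper. Each pilot is the sum of the basis vectors weighted by the square roots of the assigned powers, so two pilots are orthogonal exactly when their power supports do not overlap.

```python
import numpy as np

def build_pilots(power_coeffs):
    """Return a (tau_p, K) matrix whose columns are the pilot signals.

    power_coeffs[k, b] is the power that user k puts on basis vector b.
    Assumption: the pilot basis is the standard basis of length tau_p.
    """
    power_coeffs = np.asarray(power_coeffs, dtype=float)
    tau_p = power_coeffs.shape[1]
    basis = np.eye(tau_p)                      # assumed orthonormal basis
    return basis @ np.sqrt(power_coeffs).T     # column k = sum_b sqrt(p[k, b]) * basis[:, b]

# Three users sharing tau_p = 2 pilot symbols (illustrative numbers only).
power_coeffs = np.array([[2.0, 0.0],    # user 0 uses only basis vector 0
                         [0.0, 3.0],    # user 1 uses only basis vector 1
                         [1.0, 1.0]])   # user 2 spreads its power over both
Psi = build_pilots(power_coeffs)

gram = Psi.conj().T @ Psi               # gram[j, k] = inner product of pilots j and k
pilot_powers = np.real(np.diag(gram))   # total pilot power per user

print(np.round(gram, 3))
```

Users 0 and 1 allocate power to disjoint basis vectors, so their inner product is zero. User 2 overlaps with both and therefore contaminates, and is contaminated by, both of them.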

II-B Other Pilot Designs

The works [20, 18] considered the assignment of orthogonal pilot signals under the assumption of fixed equal pilot power. Using our notation, the pilot matrix in cell  is


where is the equal power level of all users. is a permutation matrix, that assigns the pilot signals to each user in cell . The assignment is optimized in [20, 18] to minimize a heuristic mutual interference metric. Note that these works assume orthogonal pilot signals and equal power allocation, which are simplifications compared to (7). These assumptions are generally suboptimal. Apart from this, the selection of the optimal permutation matrices for cell  is a combinatorial problem, so to limit the computational complexity [20, 18] and the references therein only study the special case of .

The previous work [24] optimized the pilot powers to maximize functions of the SE, but the paper only considered a single cell without pilot contamination. The authors of [25] optimized the pilot powers to minimize the UL transmit power for a multi-cell system. This work assumed and a fixed pilot assignment. If is the pilot power of user  in cell , the square root of the power matrix allocated to the users in cell  is a diagonal matrix defined as


where denotes the diagonal matrix with the vector on the diagonal. The pilot matrix in cell  is then formulated as


Similar to (4), the pilot power at user  in cell  is limited as


Since orthogonal pilots and fixed pilot assignment are assumed, this is also a special case of (7). We can combine the pilot structure in (10) and the idea of selecting a permutation matrix in (8) to jointly optimize the power allocation and pilot assignment. In particular, the pilot signals of the users in cell  are now defined as


This modified pilot design is a special case of (7) and has not been studied in prior works, but will be considered herein. In order to analyze the channel estimation, we define a pilot reuse set including all tuples of cell and user indices that cause pilot contamination to user  in cell :


We stress that designing an exhaustive search to obtain the best pilot assignment strategy is extremely computationally expensive. [Footnote 2: For the first user in the first cell , there are possibilities of . There are then possible and so on.] As in prior works, we only consider the case when using (12) and we further assume that orthogonal pilots are used within each cell; that is, for any user indices in cell . To perform an exhaustive search, we need to construct a dictionary , see Fig. 1, with all the possible combinations of pilot assignments in the network. Let denote the index of the pilot signal assigned to user  in cell . It follows that for since all users within a cell use different pilots. The pilot assignment matrix containing the pilot indices of the users is


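With $K$ users in each of the $L$ cells, each row of the pilot assignment matrix contains the indices $1$ to $K$ and there are $(K!)^{L}$ different combinations, each defining a permutation matrix for the pilot signals in (8) and (12). The dictionary contains all the pilot assignment matrices. For each pilot assignment matrix in the dictionary, we can extract the pilot reuse sets as follows. [Footnote 3: Each collection of pilot reuse sets is generated by $K!$ different pilot assignment matrices. By eliminating the copies, the size of the dictionary can be reduced to $(K!)^{L-1}$, which still grows rapidly with $K$ and $L$.]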


The dictionary will be later used to obtain the pilot assignment that maximizes the SE performance.
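To give a feel for how quickly the dictionary grows, the enumeration can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, pilot indices are 0-based rather than 1-based, it assumes as many pilots as users per cell, and each reuse set is taken to include the user itself.

```python
from itertools import permutations, product
from math import factorial

def pilot_dictionary(num_cells, num_users):
    """All pilot assignments: one permutation of the pilot indices per cell."""
    per_cell = list(permutations(range(num_users)))
    return list(product(per_cell, repeat=num_cells))

def reuse_sets(assignment):
    """Map every (cell, user) tuple to the set of tuples sharing its pilot.

    assignment[cell][user] is the pilot index of that user.
    """
    groups = {}
    for cell, row in enumerate(assignment):
        for user, pilot in enumerate(row):
            groups.setdefault(pilot, set()).add((cell, user))
    return {tup: frozenset(group) for group in groups.values() for tup in group}

def distinct_reuse_patterns(dictionary):
    """Collapse assignments that differ only by a relabeling of the pilots."""
    return {frozenset(reuse_sets(a).values()) for a in dictionary}

num_cells, num_users = 2, 3
dictionary = pilot_dictionary(num_cells, num_users)
patterns = distinct_reuse_patterns(dictionary)
print(len(dictionary), len(patterns))
```

For two cells with three users each, the full dictionary has 36 entries, and only 6 distinct collections of reuse sets remain after removing relabeled copies. Even the reduced count grows factorially, which is why the exhaustive search is only practical as a benchmark for small networks.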

Fig. 1: The dictionary contains all possible pilot assignment indices for all users in the network.

III Uplink Massive MIMO Transmission

This section provides ergodic SE expressions with arbitrary pilot signals, which are later used for pilot optimization.

III-A Channel Estimation with Arbitrary Pilots

During the UL pilot transmission, the received signal at the BS of cell  is


where denotes the channel between user  in cell  and BS . is the additive noise with independent elements distributed as . Correlating in (16) with the pilot of user  in cell , we obtain


We consider uncorrelated Rayleigh fading since the results obtained with this tractable model match well with the results obtained in non-line-of-sight measurements [28]. The channel between user $t$ in cell $i$ and BS $l$ is distributed as


where the variance determines the large-scale fading, including geometric attenuation and shadowing. By using minimum mean squared error (MMSE) estimation, the distributions of the channel estimate and estimation error when using the pilot structure in (7) are given in Lemma 1.

Lemma 1.

If the system uses the pilot structure in (7), the MMSE estimate of based on in (17) is computed as


The channel estimate is distributed as




The estimation error is independent of the channel estimate and distributed as


The proof follows directly from standard MMSE estimation techniques in [29]. ∎
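As a numerical companion to Lemma 1, the sketch below evaluates the scalar that a linear MMSE estimator applies to the correlated pilot signal, together with the per-antenna variances of the estimate and of the estimation error. It is derived here from standard linear MMSE theory for uncorrelated Rayleigh fading, under the assumption that the received pilot signal is the sum of all channels times the conjugated pilots plus white noise. The function name and argument layout are hypothetical, and the expressions are not claimed to match the notation of (19) to (22).

```python
import numpy as np

def mmse_statistics(pilots, betas, noise_var, target):
    """Linear MMSE scaling and variances for the user with index `target`.

    pilots:    (tau_p, N) array, column n is the pilot of user n (all cells stacked)
    betas:     (N,) large-scale fading coefficients towards the serving BS
    noise_var: noise variance per antenna
    """
    pilots = np.asarray(pilots, dtype=complex)
    betas = np.asarray(betas, dtype=float)
    psi = pilots[:, target]
    inner = pilots.conj().T @ psi                 # inner[n] = psi_n^H psi_target
    energy = np.vdot(psi, psi).real               # squared norm of the target pilot
    # Per-antenna variance of the correlated received pilot signal.
    denom = np.sum(betas * np.abs(inner) ** 2) + noise_var * energy
    coeff = betas[target] * energy / denom        # estimate = coeff * correlated signal
    est_var = (betas[target] * energy) ** 2 / denom
    err_var = betas[target] - est_var
    return coeff, est_var, err_var

# Two users with orthogonal pilots (illustrative numbers only).
pilots = np.array([[np.sqrt(2.0), 0.0],
                   [0.0, np.sqrt(3.0)]])
betas = np.array([1.0, 0.5])
coeff, est_var, err_var = mmse_statistics(pilots, betas, noise_var=1.0, target=0)
print(coeff, est_var, err_var)
```

With orthogonal pilots the interfering user drops out of the denominator and the estimate variance reduces to the familiar single-user form. If the two columns are made identical, the second user's fading coefficient enters the denominator, which is how pilot contamination degrades the estimate.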

Lemma 1 provides the MMSE estimator for the pilot design in (7). The pilot powers as well as inner products between pilot signals appear explicitly in the expressions. We now compute the channel estimate and estimation error of when using the pilot structure in (12).

Corollary 1.

If the system uses the alternative pilot structure in (12), the MMSE channel estimate in (19) is simplified to


The channel estimate and estimation error are distributed as


This follows from replacing the terms and in Lemma 1 by and , and then doing some algebra. ∎

Corollary 1 reveals that the quality of the estimated channel heavily depends on both the pilot power control and the pilot reuse set . A proper selection of mitigates channel estimation errors, and will also reduce the coherent interference during data transmission. Aligned with prior works, in the special case of , the channel estimate and estimation error are obtained for the pilot structure in (8). We now use the distributions in Lemma 1 and Corollary 1 to derive lower bounds on the UL ergodic capacity.

III-B Uplink Data Transmission


In the UL data transmission, user  in cell  transmits the signal . The received signal vector at BS  is the superposition of the transmitted signals


where is the transmit power corresponding to the signal and the additive noise is . To detect the transmitted signal, BS  selects a detection vector and applies it to the received signal as


A general lower bound on the UL ergodic capacity of user  in cell  is computed in [9] as


where the effective SINR value, , is


The lower bound on the UL ergodic capacity in (28) is computed by using the use-and-then-forget bounding technique [6] and its tightness compared to the other possible bounds is discussed in Appendix D in [6]. Although the channel capacity for Massive MIMO in the case of imperfect CSI is unknown, we believe that the lower bound in (28) is quite close to the actual capacity. This is because the effective noise is a sum of many uncorrelated terms and is therefore close to Gaussian, which agrees with the worst-case-is-Gaussian assumption made when obtaining the bound. As a contribution of this paper, we compute a closed-form expression for this lower bound in the case of MR detection, where the detection vector is the channel estimate of the desired user. This is a highly computationally scalable detection method for Massive MIMO systems.

Lemma 2.

If the system uses the pilot structure in (7) and MR detection, the SE in (28) for user $k$ in cell $l$ becomes


where is shown in (32).


The proof is available in Appendix -A. ∎

From (32), we notice that it is always advantageous to add more BS antennas since the numerator grows linearly with the number of BS antennas $M$ (and only some terms in the denominator have the same scaling). The first term in the denominator represents non-coherent interference that only depends on the number of BSs and users, while it is independent of $M$. The second term in the denominator represents coherent interference caused by pilot contamination and it grows linearly with $M$. As a consequence, as $M \to \infty$, we have


This limit depends only on the pilot design (i.e., inner products between pilot signals) and data power. An optimized selection of the power terms improves the SE by enhancing the channel estimation quality and reducing the coherent interference.
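The scaling behavior described above can be reproduced with a short numerical sketch. The SINR expression below was derived here for MR detection with uncorrelated Rayleigh fading and linear MMSE estimation under the use-and-then-forget bound. It has the structure described in the text (a non-coherent term independent of the number of antennas and a coherent term that grows with it), but its notation and function names are assumptions and it is not claimed to be a verbatim copy of (32).

```python
import numpy as np

def mr_sinr(pilots, betas, data_powers, noise_var, num_antennas, target):
    """Effective SINR of user `target` with MR detection (sketch, see lead-in)."""
    pilots = np.asarray(pilots, dtype=complex)
    betas = np.asarray(betas, dtype=float)
    data_powers = np.asarray(data_powers, dtype=float)
    psi = pilots[:, target]
    inner_sq = np.abs(pilots.conj().T @ psi) ** 2     # squared inner products with the target pilot
    energy = np.vdot(psi, psi).real
    others = np.arange(betas.size) != target
    signal = num_antennas * data_powers[target] * betas[target] ** 2 * energy ** 2
    estimation_term = np.sum(betas * inner_sq) + noise_var * energy
    noncoherent = estimation_term * (np.sum(data_powers * betas) + noise_var)
    coherent = num_antennas * np.sum(data_powers[others] * betas[others] ** 2 * inner_sq[others])
    return signal / (noncoherent + coherent)

def asymptotic_sinr(pilots, betas, data_powers, target):
    """Limit of mr_sinr as the number of antennas grows without bound."""
    pilots = np.asarray(pilots, dtype=complex)
    betas = np.asarray(betas, dtype=float)
    data_powers = np.asarray(data_powers, dtype=float)
    psi = pilots[:, target]
    inner_sq = np.abs(pilots.conj().T @ psi) ** 2
    others = np.arange(betas.size) != target
    coherent = np.sum(data_powers[others] * betas[others] ** 2 * inner_sq[others])
    signal = data_powers[target] * betas[target] ** 2 * np.vdot(psi, psi).real ** 2
    return np.inf if coherent == 0 else signal / coherent

def spectral_efficiency(sinr, tau_p, tau_c):
    """Pre-log factor times log2(1 + SINR), in bit/s/Hz."""
    return (1.0 - tau_p / tau_c) * np.log2(1.0 + sinr)

# Two users that share the same length-1 pilot (illustrative numbers only).
shared = np.array([[1.0, 1.0]])
betas = np.array([1.0, 0.5])
rho = np.array([1.0, 1.0])
for M in (100, 1000, 10**7):
    print(M, mr_sinr(shared, betas, rho, 1.0, M, target=0))
print("limit:", asymptotic_sinr(shared, betas, rho, target=0))
```

In this toy setup the SINR increases with the number of antennas but saturates at the ratio of the desired and contaminating terms, whereas with orthogonal pilots the coherent term vanishes and the limit is unbounded.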

We also consider the achievable SE for the modified pilot structure in (12) as shown in Corollary 2.

Corollary 2.

If the system uses the pilot structure in (12), a lower bound on the capacity for user  in cell  with uncorrelated Rayleigh fading channels and MR detection is


where the SINR value, , is given in (35).


This follows as a special case of Lemma 2. ∎

The SE in Corollary 2 depends explicitly on the choice of thus the optimization of the pilot assignment is a combinatorial problem. We stress that the SINR expressions reflect the joint effects of pilot design, channel estimation quality, pilot contamination, and data power control, in contrast to the MSE that cannot distinguish between pilot contamination and noise. Hence, the SINR is a good metric to consider in the max-min fairness optimization as shown in the next section.

IV Max-min Fairness Optimization

In this section, we first utilize the SE expressions in Lemma 2 and Corollary 2 to formulate max-min fairness problems with joint pilot and data optimization. We demonstrate that these optimization problems are NP-hard and propose an algorithm to find the globally optimal solution with the pilot design in (12) by making an exhaustive search over all pilot assignments. In addition, instead of looking for the global optimum, an algorithm to obtain a locally optimal solution in polynomial time is presented when using the new pilot design in (7).

IV-A Problem Formulation


A key vision of Massive MIMO is to provide uniformly good quality of service for everyone in the network. We will investigate how to optimize the pilots and powers towards this goal. We consider the pilot and data powers as optimization variables. The max-min fairness optimization problem is first formulated for the proposed pilot design in (7) as follows. [Footnote 4: The optimization problem (36) requires coordination among the cells to be solved, but the main target in this paper is to investigate how much the max-min fairness SE can be improved in multi-cell Massive MIMO by joint pilot design and UL power control. One potential way to deal with practical limitations such as backhaul signaling, delays, and scalability is to implement the optimization problem in a distributed manner using dual/primal decomposition [30].]


where is the maximum power that users can provide for each data symbol. Note that this optimization problem jointly generates the pilot signals and performs power control on the pilot and data transmission. The epigraph-form representation of (36) is

subject to (37b)

From the expression of the SINR constraints in (37b), we realize that the proposed optimization problem is a signomial program. [Footnote 5: A function $f(x_1,\ldots,x_{N_1}) = \sum_{n=1}^{N_2} c_n \prod_{m=1}^{N_1} x_m^{a_{n,m}}$ defined in $\mathbb{R}_{+}^{N_1}$ is signomial with $N_2$ terms ($N_2 \geq 2$) if the exponents $a_{n,m}$ are real numbers and the coefficients $c_n$ are also real but at least one must be negative. In case all $c_n$ are positive, $f$ is a posynomial function.] Therefore, the max-min fairness optimization problem is NP-hard in general and seeking the optimal solution has very high complexity in any non-trivial setup [31]. However, the power constraints (37c) and (37d) ensure a compact feasible domain and make the SINRs continuous functions of the optimization variables. According to Weierstrass’ theorem [32], an optimal solution always exists.
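The step from (36) to (37) is the standard epigraph reformulation. Because all users share the same pre-log factor and the logarithm is increasing, maximizing the minimum SE is equivalent to maximizing the minimum SINR. As a generic sketch, with $\xi$ as an assumed name for the auxiliary variable, $\mathbf{x}$ collecting all pilot and data powers, and $\mathcal{P}$ denoting the feasible set defined by the power constraints (these symbol names are assumptions and need not match the original equations):

```latex
\begin{aligned}
\underset{\mathbf{x} \in \mathcal{P}}{\text{maximize}} \ \ \min_{(l,k)} \ \mathrm{SINR}_{l,k}(\mathbf{x})
\quad \Longleftrightarrow \quad
& \underset{\xi,\ \mathbf{x} \in \mathcal{P}}{\text{maximize}} \ \ \xi \\
& \text{subject to} \ \ \mathrm{SINR}_{l,k}(\mathbf{x}) \geq \xi, \quad \forall (l,k).
\end{aligned}
```

Each constraint can be rewritten as $\xi \cdot (\text{denominator}) / (\text{numerator}) \leq 1$. The problem is a geometric program only when every such ratio is a posynomial, which requires monomial numerators. A numerator that is itself a sum of several power terms breaks this structure, which is the source of the signomial constraints.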

For the alternative pilot design in (12), the max-min fairness optimization problem is formulated as


The optimization problem (38) is non-convex since it contains a combinatorial pilot assignment selection. Fortunately the optimal solution to this problem can be obtained by looking up every instance in the dictionary . For each we attain the pilot reuse sets , and then convert (38) to a convex problem as shown in Corollary 3.

Corollary 3.

For a given pilot assignment matrix , (38) reduces to the geometric program


The optimal solution to (39) is obtained in polynomial time due to its convexity. By checking every instance in the dictionary and solving the corresponding problem (39), the global optimum to (38) is obtained as the highest objective value to (39).

In more detail, the globally optimal solution to (38) is obtained as shown in Algorithm 1. Each iteration takes one pilot assignment from the dictionary and seeks the optimal pilot and data powers for that assignment by solving (39). The algorithm terminates once the iteration index equals the number of instances in the dictionary. The globally optimal pilot and data powers, together with the corresponding pilot reuse sets, are those of the iteration that attains the largest objective value. The practical limitation is the running time: the solution can indeed be found, but doing so takes a very long time.

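Input: Set the iteration index $n = 1$; select the initial values of the pilot and data powers for all users; set up the dictionary.

  1. Iteration $n$:

    • Assign the pilot reuse sets according to the $n$th instance in the dictionary.

    • Solve the geometric program (40) for this instance to obtain the objective value and the corresponding pilot and data powers.

  2. If $n$ equals the number of instances in the dictionary, stop. Otherwise, go to Step 3.

  3. Store the objective value and the optimized pilot and data powers of iteration $n$. Set $n = n + 1$, then go to Step 1.

Output: Select the iteration with the largest objective value; the optimal solutions are its pilot powers, data powers, and pilot reuse sets.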

Algorithm 1 Global solution to (38) by exhaustive search
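The control flow of Algorithm 1 can be sketched as a plain loop over the dictionary. In the sketch below, `solve_inner` is a hypothetical stand-in for solving the geometric program for one fixed assignment, and the toy score used in the example is a crude contamination ratio chosen only to make the code runnable. It is not the paper's objective, and the gain table is made up.

```python
from itertools import permutations, product

def exhaustive_search(num_cells, num_users, solve_inner):
    """Return (best_value, best_assignment, best_solution) over all assignments.

    solve_inner(assignment) must return (objective_value, solution) and plays
    the role of the per-assignment optimization in Algorithm 1.
    """
    best_value, best_assignment, best_solution = float("-inf"), None, None
    per_cell = list(permutations(range(num_users)))
    for assignment in product(per_cell, repeat=num_cells):
        value, solution = solve_inner(assignment)
        if value > best_value:                      # keep the best instance so far
            best_value, best_assignment, best_solution = value, assignment, solution
    return best_value, best_assignment, best_solution

# gain[l][i][t]: made-up large-scale fading from user t in cell i to BS l.
gain = [[[1.00, 0.80], [0.30, 0.02]],
        [[0.03, 0.25], [1.00, 0.90]]]

def toy_inner(assignment):
    """Toy score: worst ratio of own gain to the gains of co-pilot users."""
    ratios = []
    for l, row in enumerate(assignment):
        for k, pilot in enumerate(row):
            interference = sum(gain[l][i][t]
                               for i, other in enumerate(assignment)
                               for t, p in enumerate(other)
                               if p == pilot and (i, t) != (l, k))
            ratios.append(gain[l][l][k] / interference)
    return min(ratios), None                        # no power solution in this toy

best_value, best_assignment, _ = exhaustive_search(2, 2, toy_inner)
print(best_value, best_assignment)
```

The loop itself is trivial. The cost lies in the number of iterations and in each call to the inner solver, which is why the exhaustive search is used only as a benchmark.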

Algorithm 1 is computationally heavy since the number of iterations grows rapidly with and , but it obtains the global optimum to the max-min SE problem (38). Specifically, the main cost of each iteration in Algorithm 1 is the geometric program (40) which includes optimization variables and constraints. Based on [33], in general, the computational complexity of this algorithm is of the order of


where is the cost of evaluating the first and second derivatives of the objective and constraint functions in (40). Therefore, this approach will serve as a benchmark for comparison in Section VII. For the sake of completeness, we also include another benchmark in which the data powers are fixed at their maximum value and Algorithm 1 is run with respect to the remaining pilot power variables, as was done in our previous work [27].

IV-B Local Optimality Algorithm

This subsection provides a method to obtain a local optimum to the optimization problem (37). To this end, the signomial SINR constraints are converted to monomial ones by using the weighted arithmetic mean-geometric mean inequality [34] stated in Lemma 3. [Footnote 6: A function $f(x_1,\ldots,x_{N_1}) = c \prod_{m=1}^{N_1} x_m^{a_m}$ defined in $\mathbb{R}_{+}^{N_1}$ is monomial if the coefficient $c > 0$ and the exponents $a_m$ are real numbers.]

Lemma 3.

[34, Lemma 1] Assume that a posynomial function is defined from the set of monomials as


then it is lower bounded by a monomial function as


where is a non-negative weight corresponding to . We say that is the best approximation to near the point in the sense of the first order Taylor expansion, if the weight is selected as


By using this lemma, the max-min fairness optimization problem (37) is converted to a geometric program by bounding the term in the numerators of the SINR constraints:


where is the weight value corresponding to . This leads to a lower bound on the SINR value for user  in cell  obtained as


where the value is presented in (47).
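The bound in Lemma 3 is easy to check numerically. The sketch below applies the weighted arithmetic mean-geometric mean inequality to a sum of positive terms, with the weights chosen as each term's share of the sum at a reference point. The function names are hypothetical, and the example treats the total pilot power of one user as the posynomial, with each per-basis power as one monomial term.

```python
import numpy as np

def amgm_weights(terms_at_reference):
    """Weights that make the monomial bound tight at the reference point."""
    terms = np.asarray(terms_at_reference, dtype=float)
    return terms / terms.sum()

def monomial_lower_bound(terms, weights):
    """Weighted AM-GM bound: prod_k (u_k / w_k)^{w_k} <= sum_k u_k.

    Terms must be positive. Terms with zero weight are skipped, following the
    convention that a factor raised to the power zero equals one.
    """
    terms = np.asarray(terms, dtype=float)
    weights = np.asarray(weights, dtype=float)
    active = weights > 0
    return float(np.prod((terms[active] / weights[active]) ** weights[active]))

reference = np.array([1.0, 3.0])          # powers from the previous iteration
weights = amgm_weights(reference)

tight = monomial_lower_bound(reference, weights)         # evaluated at the reference point
loose = monomial_lower_bound(np.array([2.0, 2.0]), weights)
print(tight, reference.sum())
print(loose, 4.0)
```

The bound coincides with the sum at the reference point and falls below it elsewhere. This is the property the successive approximation relies on: each geometric program is a conservative approximation that is exact at the previous iterate.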

The optimal solution to the max-min SE optimization problem (37) is lower bounded by solving the geometric program

subject to

By virtue of the successive approximation technique [35], a locally optimal Karush-Kuhn-Tucker (KKT) point to the max-min fairness optimization problem (37) can be obtained if we solve (48) iteratively as shown in Theorem 1.

Theorem 1.

Selecting an initial point in the feasible domain and solving (48) in an iterative manner by consecutively updating the weight values from the optimal powers of the previous iteration, the solution will converge to a KKT local point to (37).


The proof is adapted from the general framework in [35] and is sketched in Appendix -B. ∎

In particular, we first select initial powers that satisfy the power constraints. Then the corresponding weight values are computed as in (44). Furthermore, in each iteration, the SINR constraints are converted to the corresponding monomials by bounding the pilot power of user $k$ in cell $l$ as in (47), by using the weight values computed from the optimal pilot powers in the previous iteration. The pilot and data allocation solution is obtained by solving the geometric program (48) before the weight values are updated again at the end of each iteration. We repeat the procedure until this algorithm has converged to a KKT point. The convergence can be declared, for example, when the variation between two consecutive iterations is sufficiently small. The proposed algorithm for obtaining a locally optimal solution is summarized in Algorithm 2. Note that one can also fix the data powers and only optimize the pilot signals in Algorithm 2, as was done in our previous work [27]. Algorithm 2 involves optimization with variables and constraints. [Footnote 7: The exact complexity and the runtime of the proposed algorithms are not suitable metrics since they depend significantly on the computer configuration and how much time is spent to optimize the implementations. However, (41) and (49) give basic insights into the general computational complexity scaling.] Its computational complexity is of the order of [33]


where is the cost of evaluating the first and second derivatives of the objective and constraint functions in (48). is the number of iterations needed for this algorithm to converge to the KKT point. Even though each iteration in Algorithm 2 is more costly than in Algorithm 1 since we carefully design powers for all pilot signals, the successive approximation approach converges after only a few iterations.

Input: Set ; Select the maximum powers and for Select the initial values of powers for ; Compute the weight values