# Asymptotics of Nonlinear LSE Precoders with Applications to Transmit Antenna Selection

## Abstract

This paper studies the large-system performance of Least Square Error (LSE) precoders which minimize the input-output distortion over an arbitrary support subject to a general penalty function. The asymptotics are determined via the replica method in a general form which encompasses the Replica Symmetric (RS) and Replica Symmetry Breaking (RSB) ansätze. As a result, the “marginal decoupling property” of LSE precoders for $b$ steps of RSB is derived. The generality of the studied setup enables us to address special cases in which the number of active transmit antennas is constrained. Our numerical investigations show that the computationally efficient forms of LSE precoders based on “$\ell_1$-norm” minimization perform close to the cases with “zero-norm” penalty function, which offer a considerable improvement compared to random antenna selection. For the case with BPSK signals and a restricted number of active antennas, the results show that RS fails to predict the performance while the RSB ansatz is consistent with theoretical bounds.

## 1 Introduction

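For the Multiple-Input Multiple-Output (MIMO) channel $\mathbf{H} \in \mathbb{C}^{k \times n}$ with source vector $\mathbf{s} \in \mathbb{C}^{k}$ and input vector $\mathbf{x} \in \mathbb{X}^{n}$, the nonlinear LSE precoder with the general penalty function $u(\cdot)$ is given by

$$\mathbf{x} = \arg\min_{\mathbf{v} \in \mathbb{X}^{n}} \|\mathbf{H}\mathbf{v} - \sqrt{\rho}\,\mathbf{s}\|^{2} + u(\mathbf{v}).$$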

The precoder maps the $k$-dimensional source vector $\mathbf{s}$, scaled with the power control factor $\rho$, to the $n$-dimensional input vector $\mathbf{x}$ whose entries are taken from the given support $\mathbb{X}$. The mapping is such that the distortion caused by the channel, i.e., $\|\mathbf{H}\mathbf{x} - \sqrt{\rho}\,\mathbf{s}\|^{2}$, is minimized over the given input support subject to the constraints imposed by $u(\cdot)$. Conventional precoding schemes, such as Tomlinson-Harashima or vector precoding, mostly consider an average transmit power constraint and assume the set of possible input constellation points to be the complex plane, i.e., $\mathbb{X} = \mathbb{C}$. The latter assumption was partially relaxed in [1], where the authors studied “per-antenna constant envelope precoding”. The set of possible constellation points was later generalized to an arbitrary set by introducing a class of power-limited nonlinear precoders [2]. The LSE precoder generalizes these earlier schemes by letting different types of constraints be imposed on the precoded vector. In fact, due to the generality of the penalty function, the scope of restrictions on $\mathbf{x}$ is broadened. Consequently, several precoding schemes are special cases of the LSE precoder. To name some examples, let $u(\mathbf{v}) = \lambda \|\mathbf{v}\|^{2}$; then, for $\mathbb{X} = \mathbb{C}$, the precoder reduces to the regularized zero-forcing precoder introduced in [3], and by considering $\mathbb{X} = \{x \in \mathbb{C} : |x|^{2} = P\}$ for some constant $P$, the precoder reduces to a constant envelope precoder [1].
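To make the definition concrete, the following minimal sketch solves a tiny instance of the precoding problem by exhaustive search. It assumes the objective is the squared distortion between the channel output and the source scaled by $\sqrt{\rho}$, plus a penalty that decouples over the entries; the function name, the toy support $\{-1, 0, +1\}$ and the penalty weight are hypothetical choices for illustration only, and the search is exponential in $n$, so it is not a practical algorithm.

```python
import itertools
import numpy as np

def lse_precoder_bruteforce(H, s, rho, support, penalty):
    """Exhaustive search over support^n (illustrative only; cost grows as |support|^n).

    Assumed objective: ||H v - sqrt(rho) s||^2 + sum_j penalty(v_j).
    """
    k, n = H.shape
    target = np.sqrt(rho) * s
    best_v, best_cost = None, np.inf
    for candidate in itertools.product(support, repeat=n):
        v = np.array(candidate, dtype=complex)
        cost = np.linalg.norm(H @ v - target) ** 2 + sum(penalty(x) for x in v)
        if cost < best_cost:
            best_v, best_cost = v, cost
    return best_v, best_cost

# Toy instance: identity channel, support with a "silent" point 0,
# and a hypothetical zero-norm style penalty of 0.05 per active entry.
H = np.eye(2)
s = np.array([1.0, 0.1])
x, cost = lse_precoder_bruteforce(
    H, s, rho=1.0, support=(-1.0, 0.0, 1.0),
    penalty=lambda v: 0.05 * (v != 0),
)
print(x, cost)
```

In this toy instance the second entry is kept silent, since activating it would cost more in penalty and distortion than leaving the small source entry unserved.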

This paper investigates the asymptotic performance of the LSE precoder. Our motivation comes from recent promising results reported for massive MIMO systems [4]. For some choices of $u(\cdot)$ and $\mathbb{X}$, the system can be asymptotically analyzed via tools from random matrix theory [5]. These tools, however, fail to characterize the large-system performance of the precoder for many other choices. Therefore, we invoke the “replica method” developed in statistical mechanics. In the context of multiuser systems, the replica method was initially utilized by Tanaka in [6] to study the asymptotic performance of randomly spread CDMA detectors. The method was later widely employed for large-system analysis in communications and information theory; see for example [7] and the references therein.

### Contributions

For nonlinear LSE precoders, we determine the input-output distortion, as well as the marginal distribution of output entries, in the large-system limit via the replica method. We deviate from our earlier replica symmetric study in [8] by determining the general replica ansatz, which includes both the replica symmetric and the replica symmetry breaking ansätze. Our general result furthermore shows that under any assumed structure of the replicas, the output symbols of the precoder marginally decouple in the asymptotic regime. A brief introduction to the replica method is given in the appendix through the large-system analysis. As an application, we study special cases of the precoder with constraints on the number of active antennas. Our numerical investigations show that computationally efficient precoders based on $\ell_1$-norm minimization perform close to precoders with zero-norm penalty. Moreover, the problem of BPSK transmission with a constraint on the number of active antennas is shown to exhibit replica symmetry breaking.

### Notation

We represent scalars, vectors and matrices with non-bold, bold lower case and bold upper case letters, respectively. A $k \times k$ identity matrix is shown by $\mathbf{I}_k$, and the $k \times k$ matrix with all entries equal to one is denoted by $\mathbf{1}_k$. $\mathbf{A}^{\mathsf{H}}$ indicates the Hermitian of the matrix $\mathbf{A}$. The sets of real and integer numbers are denoted by $\mathbb{R}$ and $\mathbb{Z}$, and their corresponding non-negative subsets by superscript $+$; moreover, $\mathbb{C}$ represents the complex plane. For $s \in \mathbb{C}$, $\mathrm{Re}\{s\}$ and $\measuredangle s$ identify the real part and argument, respectively. $\|\cdot\|$ and $\|\cdot\|_1$ denote the Euclidean and $\ell_1$-norm, respectively, and $\|\cdot\|_0$ represents the zero-norm defined as the number of nonzero entries. For a random variable $x$, $p_x$ represents either the probability mass or density function. Moreover, $\mathbb{E}$ identifies the expectation operator. For the sake of compactness, the set of integers $\{1, \ldots, n\}$ is abbreviated as $[1:n]$, and a zero-mean complex Gaussian distribution with variance $\rho$ is represented by $\mathcal{CN}(0, \rho)$. Whenever needed, we assume the support $\mathbb{X}$ to be discrete. The results, however, are in full generality and hold also for continuous distributions.

## 2 Problem Formulation

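Consider the LSE precoding scheme in which

- $\mathbf{J} = \mathbf{H}^{\mathsf{H}} \mathbf{H}$ is a random matrix whose eigendecomposition is $\mathbf{J} = \mathbf{U} \mathbf{D} \mathbf{U}^{\mathsf{H}}$, with $\mathbf{U}$ being a Haar distributed unitary matrix and $\mathbf{D}$ being a diagonal matrix with asymptotic eigenvalue distribution $p_{\mathrm{D}}$.
- $\mathbf{s}$ has zero-mean and unit-variance complex Gaussian entries, i.e., $s_j \sim \mathcal{CN}(0, 1)$, and is independent of $\mathbf{H}$.
- $\rho$ is a non-negative real power control factor.
- $u(\cdot)$ is a general penalty function with decoupling property, i.e., $u(\mathbf{v}) = \sum_{j=1}^{n} u(v_j)$.
- The dimensions of $\mathbf{H}$ grow large, such that the load factor, defined as $\alpha := k/n$, is kept fixed in both $k$ and $n$.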

For this setup, we define the asymptotic marginal as follows.

The asymptotic marginal of $\mathbf{x}$ determines large-system characteristics of $\mathbf{x}$, including the marginal distribution of its entries. In order to quantify the large-system performance, we further define the asymptotic distortion as a measure.

## 3 Main Results

We start by defining the $\mathrm{R}$-transform of a distribution.

Proposition 1 expresses the asymptotic marginal and the asymptotic distortion in terms of the $\mathrm{R}$-transform of the asymptotic eigenvalue distribution. The result is determined for a general structure of replicas, and relies only on the replica continuity assumption, which is briefly explained in the appendix.

To evaluate the asymptotic marginal and the asymptotic distortion in Proposition 1, one needs to solve the underlying fixed-point equation and then evaluate the resulting expressions in an analytic form. Finding the solution of the fixed-point equation, however, is notoriously difficult, and some of the solutions may not be of use. The straightforward approach is to restrict the search to a set of parameterized matrices. The most basic set is given by RS. The RS solution, however, may result in an invalid prediction of the performance. A more general structure is given by imposing the RSB structure, which we address in the sequel.

### 3.1 General Marginal Decoupling Property

Proposition 1 enables us to investigate a more general form of the “asymptotic marginal decoupling property” introduced in [8]. The property indicates that in the large-system limit, the marginal distributions of all output entries are identical and are expressed as the output distribution of an equivalent single-user system. In fact, it can be considered as a dual version of the decoupling property investigated in the literature for different classes of nonlinear estimators, e.g., [9]. As the analysis in [8] was under the RS assumption, the result was limited to the cases in which the RS assumption gives a valid prediction. The generality of Proposition 1, however, enables us to investigate this property of the precoder for any structure of replicas. To illustrate the property, consider the following definition.

### 3.2 RSB Ansätze

Parisi proposed the method of RSB to construct a set of parameterized matrices which recursively extends to larger classes. The method starts from the RS structure and then recursively constructs new structures. After $b$ steps of recursion, the structure becomes of the form

for some non-negative real scalars and sequences. This structure reduces to RS when the coefficient sequence of the breaking steps is set to zero. By substituting it in Proposition 1, the $b$-steps RSB ansatz is determined. In cases where the RS ansatz gives the exact solution, these coefficients are equal to zero at the saddle point. However, in cases where RS fails, the sequence has non-zero entries. The investigations in [2] show that the RS ansatz clearly fails to give a valid prediction of the performance in some cases. Therefore, the RSB ansätze need to be considered further. For the sake of compactness, we state the one-step RSB ansatz, i.e., $b = 1$, in this paper. The result, however, extends to an arbitrary number of breaking steps by taking the approach in Appendix D of [12].
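The nesting of the RS and one-step RSB structures can be illustrated with a short sketch. It assumes a commonly used parameterization, namely a scaled identity plus a block-diagonal term of all-ones blocks plus a scaled all-ones matrix; the parameter names `chi_over_beta`, `p`, `q` and `block` are hypothetical, and the exact parameterization used in the paper is not reproduced here.

```python
import numpy as np

def rsb_matrix(m, chi_over_beta, q, p=0.0, block=1):
    """Build an m x m replica correlation matrix with one breaking step.

    Assumed form: chi_over_beta * I_m + p * (I_{m/block} kron 1_block) + q * 1_m.
    Setting p = 0 removes the block term and leaves the RS structure.
    """
    if m % block != 0:
        raise ValueError("block must divide m")
    identity = np.eye(m)
    all_ones = np.ones((m, m))
    blocks = np.kron(np.eye(m // block), np.ones((block, block)))
    return chi_over_beta * identity + p * blocks + q * all_ones

Q_rs = rsb_matrix(4, chi_over_beta=1.0, q=3.0)                       # RS: two distinct values
Q_1rsb = rsb_matrix(4, chi_over_beta=1.0, q=3.0, p=2.0, block=2)     # one step: three values
print(Q_rs)
print(Q_1rsb)
```

Under this assumed form, the RS matrix has one diagonal and one off-diagonal value, while the one-step matrix distinguishes replicas inside the same block from replicas in different blocks.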

## 4 Applications to Transmit Antenna Selection

As discussed, considering a general penalty function lets us investigate several transmit constraints. A restriction on the number of active antennas is a constraint which arises in MIMO systems with Transmit Antenna Selection (TAS) [13]. The goal in these systems is to minimize the number of Radio Frequency (RF) chains, which significantly reduces the overall RF cost. The fundamental limits, as well as efficient selection algorithms, however, have not yet been precisely addressed in the literature. In this section, we investigate the asymptotics of some special cases of the LSE precoder which imply TAS.

### 4.1 TAS by Zero-Norm Minimization

The precoder with imposes constraints on the average transmit power and number of active antennas. For , the decoupled precoder reads

for . Here, the decoupled precoder is a hard thresholding operator. As , tends to zero as well. For the case with limited peak power where for some

the decoupled precoder is given by

where and . The decoupled precoder is a two-step hard thresholding operator which constrains the transmit peak power in the first step and imposes the TAS constraint in the second step. By setting , becomes zero and .
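The hard thresholding behavior can be sketched on a simplified scalar problem. The sketch assumes the decoupled problem has the form $\min_v |v - y|^2 + \lambda_0 \mathbb{1}\{v \neq 0\}$, optionally with an amplitude limit on $v$; the names `lam0` and `peak` are hypothetical, and the thresholds in the paper depend on fixed-point parameters that are not modeled here.

```python
import numpy as np

def hard_threshold(y, lam0, peak=None):
    """Scalar minimizer of |v - y|^2 + lam0 * 1{v != 0}, optionally with |v| <= peak.

    Step 1 projects y onto the disc of radius `peak` (peak power limit).
    Step 2 keeps the projected point only if it beats staying silent (v = 0).
    """
    y = np.asarray(y, dtype=complex)
    mag = np.abs(y)
    if peak is None:
        v = y.copy()
    else:
        # Scale down entries whose magnitude exceeds the peak; guard against 0/0.
        v = y * np.minimum(1.0, peak / np.maximum(mag, 1e-300))
    cost_active = np.abs(v - y) ** 2 + lam0
    cost_silent = mag ** 2
    return np.where(cost_active < cost_silent, v, 0.0)

print(hard_threshold([0.5, 2.0], lam0=1.0))            # small entry is silenced
print(hard_threshold([0.5, 2.0], lam0=1.0, peak=1.0))  # large entry is clipped to the peak
```

Without the peak limit, an entry survives exactly when $|y|^2 > \lambda_0$, so the threshold shrinks to zero as the assumed weight `lam0` vanishes.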

The precoders with zero-norm penalty function need to minimize a non-convex function, which incurs a high computational complexity. We therefore propose an alternative form of the precoder based on $\ell_1$-norm minimization.

### 4.2 TAS by $\ell_1$-Norm Minimization

To reduce the complexity of the precoding schemes in Section 4.1, we replace the zero-norm in the penalty function with the $\ell_1$-norm. The objective function in this case is convex, and therefore, for convex choices of $\mathbb{X}$, the resulting form of the precoder can be implemented efficiently by computationally feasible algorithms. We start by considering the case with no peak power constraint, in which

with . The decoupled precoder in this case is a soft thresholding operator. In fact, it is obtained from the hard thresholding operator of Section 4.1 by multiplying a shrinkage factor. Similar to the zero-norm case, the threshold tends to zero as . For the case with limited peak transmit power, the decoupled precoder reads

for and . As in Section 4.1, the decoupled precoder is a two-step thresholding operator. In the first step, the peak power is constrained via a hard thresholding operator with level , and in the second step, the TAS constraint is imposed on the decoupled input by a soft thresholding operator. By setting , the threshold reads and .
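For comparison with the zero-norm case, the soft thresholding behavior can be sketched on a simplified scalar problem. The sketch assumes the decoupled problem has the form $\min_v |v - y|^2 + \lambda_1 |v|$, optionally with an amplitude limit; the names `lam1` and `peak` are hypothetical, and this direct solution of the toy problem is not claimed to match the paper's thresholds or its ordering of the two steps.

```python
import numpy as np

def soft_threshold(y, lam1, peak=None):
    """Scalar minimizer of |v - y|^2 + lam1 * |v|, optionally with |v| <= peak.

    The optimal v keeps the phase of y; its magnitude is |y| - lam1 / 2,
    floored at zero and, if a peak is given, capped at `peak`.
    """
    y = np.asarray(y, dtype=complex)
    mag = np.abs(y)
    radius = np.maximum(mag - lam1 / 2.0, 0.0)
    if peak is not None:
        radius = np.minimum(radius, peak)
    # Unit-magnitude phase factor, defined as 0 where y itself is 0.
    safe_mag = np.where(mag > 0, mag, 1.0)
    phase = np.where(mag > 0, y / safe_mag, 0.0)
    return radius * phase

print(soft_threshold([0.5, 2.0, -3j], lam1=2.0))
print(soft_threshold([0.5, 2.0, -3j], lam1=2.0, peak=1.5))
```

Unlike the hard thresholding sketch, surviving entries are shrunk toward zero by `lam1 / 2`, which reflects the convexity of the $\ell_1$ penalty.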

### 4.3 TAS with $M$-PSK Signals on Antennas

Considering the precoding support as , for , the LSE precoder is constrained to map the source to a vector of $M$-PSK symbols over a subset of antennas while keeping the others silent. In this case, the transmit power on each active antenna is , and therefore, which indicates that any restriction on the average transmit power imposes a proportional constraint on the number of active antennas. Consequently, TAS is applied via the LSE precoder by setting the penalty function as . By defining the function as , the decoupled precoder in this case is derived as

where for . As in Sections 4.1 and 4.2, the decoupled precoder describes a thresholding operator over the $M$-PSK constellation. Here, by growth of , the threshold increases, and consequently, the number of active transmit antennas reduces.
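The thresholding over a PSK constellation can be sketched on a simplified scalar problem. The sketch assumes a support consisting of the silent point 0 and $M$ equally spaced points of power $P$, with a quadratic penalty of weight `lam` on the transmit power; these names and the scalar objective $|v - y|^2 + \lambda |v|^2$ are illustrative assumptions rather than the paper's exact decoupled precoder.

```python
import numpy as np

def psk_threshold(y, M, P, lam):
    """Pick, per entry, the point of {0} U {M-PSK with power P} minimizing
    |v - y|^2 + lam * |v|^2 (assumed scalar objective)."""
    y = np.asarray(y, dtype=complex)
    points = np.sqrt(P) * np.exp(2j * np.pi * np.arange(M) / M)
    candidates = np.concatenate(([0.0], points))
    cost = np.abs(y[..., None] - candidates) ** 2 + lam * np.abs(candidates) ** 2
    return candidates[np.argmin(cost, axis=-1)]

y = np.array([0.4, 0.6, -2.0])
x_small = psk_threshold(y, M=2, P=1.0, lam=0.0)  # BPSK, no power penalty
x_large = psk_threshold(y, M=2, P=1.0, lam=1.0)  # BPSK, larger power penalty
print(x_small, np.mean(x_small != 0))
print(x_large, np.mean(x_large != 0))
```

In this toy BPSK case an entry becomes active when its real part exceeds $\sqrt{P}(1 + \lambda)/2$ in magnitude, so increasing the assumed weight `lam` raises the threshold and lowers the fraction of active entries.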

### 4.4 Numerical Results

Throughout the numerical investigations, the asymptotic fraction of active antennas is denoted by which is determined by with being the indicator function. The average transmit power is represented by , and the Peak-to-Average Power Ratio (PAPR) is denoted by which reads . We consider $\mathbf{H}$ to be an i.i.d. fading channel whose entries have zero mean and variance ; thus, the asymptotic eigenvalue distribution follows the Marčenko-Pastur law [14].
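As a numerical sanity check of the Marčenko-Pastur statement, the sketch below draws an i.i.d. complex Gaussian matrix and compares the first two empirical eigenvalue moments of $\mathbf{H}^{\mathsf{H}}\mathbf{H}$ with their large-system limits. The per-entry variance $1/n$, the matrix sizes, and the function name are assumptions made for illustration; under this normalization the limiting moments are $\alpha$ and $\alpha + \alpha^2$ with $\alpha = k/n$.

```python
import numpy as np

def gram_eigen_moments(k, n, seed=0):
    """Return the first and second empirical eigenvalue moments of H^H H,
    where H is k x n with i.i.d. CN(0, 1/n) entries (assumed normalization)."""
    rng = np.random.default_rng(seed)
    H = (rng.standard_normal((k, n)) + 1j * rng.standard_normal((k, n))) / np.sqrt(2 * n)
    eig = np.linalg.eigvalsh(H.conj().T @ H)
    return eig.mean(), (eig ** 2).mean()

k, n = 200, 400
alpha = k / n
m1, m2 = gram_eigen_moments(k, n)
print("first moment :", m1, "limit:", alpha)
print("second moment:", m2, "limit:", alpha + alpha ** 2)
```

The empirical moments are already close to their limits at these moderate dimensions, which is the usual justification for using the asymptotic law in finite systems.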

Considering Sections 4.1 and 4.2, Fig. ? shows the predicted asymptotic distortion at in terms of the inverse load factor for two cases of PAPR dB and no peak power constraint. In the PAPR-limited case, the curves have been sketched for , and in the other case, has been considered; moreover, the average transmit power is set to be . As a benchmark, we have also plotted the points for random TAS which meet the corresponding curves. In fact, in random TAS, the precoder selects a subset of transmit antennas randomly and precodes using the penalty function . As the figure shows, for the case of no peak power restriction, the zero-norm and $\ell_1$-norm based precoders need respectively about and fewer active transmit antennas compared to random TAS. The gains in the case of PAPR dB reduce to and respectively.

In order to investigate the impact of RSB, we have also considered an example of antenna selection with BPSK transmission, i.e., $M = 2$ in Section 4.3. Fig. ? illustrates the RS as well as the one-step RSB prediction of the asymptotic distortion at for two cases of and when . For the sake of comparison, a theoretically rigorous lower bound for the case of has also been sketched. The lower bound is derived as in [2]. As the figure shows, the RS ansatz starts to fail in predicting the asymptotic distortion as grows, and it even violates the lower bound at large inverse load factors. For this regime of , however, the one-step RSB ansatz gives a theoretically valid prediction.

## Appendix: Large-System Analysis

In the sequel, we briefly sketch the derivations. Consider the Hamiltonian , and define the partition function to be

By a standard large deviation argument, it is shown that

in which . Moreover, the asymptotic distortion reads where we define , and . is determined in terms of by setting in , and

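Thus, the evaluation of the asymptotic marginal and the asymptotic distortion reduces to determining ; a task which we carry out via the replica method. Using the Riesz equality, which states $\mathbb{E} \log x = \lim_{m \downarrow 0} \frac{1}{m} \log \mathbb{E}\, x^{m}$,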

*Replica Method:* Evaluating the moment in the Riesz equality is not trivial, as the exponent is real-valued. The replica method determines it by conjecturing the replica continuity. The replica continuity indicates that the “analytic continuation” of the non-negative integer moment function, i.e., the moment for $m \in \mathbb{Z}^{+}$, onto $\mathbb{R}^{+}$ equals the non-negative real moment function, i.e., the moment for $m \in \mathbb{R}^{+}$. The rigorous justification of the replica continuity has not yet been precisely addressed; however, the analytic results from the theory of spin glasses confirm the validity of the conjecture for several cases.
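The limit behind the Riesz equality can be checked on a toy example in which the moment function is available in closed form. The sketch assumes a log-normal variable $Z = e^{G}$ with $G \sim \mathcal{N}(\mu, \sigma^2)$, for which $\mathbb{E} Z^m = \exp(m\mu + m^2\sigma^2/2)$ holds for integer and real $m$ alike; this is an illustrative stand-in and not the partition function of the paper.

```python
import math

def scaled_log_moment(m, mu, sigma):
    """(1/m) * log E[Z^m] for Z = exp(G), G ~ N(mu, sigma^2), any real m != 0.

    The closed form log E[Z^m] = m*mu + (m*sigma)^2 / 2 is first known at
    integer m and extends analytically to real m in this toy case.
    """
    log_moment = m * mu + 0.5 * (m * sigma) ** 2
    return log_moment / m

mu, sigma = 0.7, 1.3  # E[log Z] equals mu for this toy variable
for m in (2, 1, 0.1, 0.001):
    print(m, scaled_log_moment(m, mu, sigma))
```

As $m$ decreases toward zero, the printed values approach $\mu = \mathbb{E} \log Z$, which is the quantity the replica method targets; in the toy case the continuation from integer to real $m$ is exact, whereas in general it is the conjectured step.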

Considering the replica continuity assumption, Proposition 1 is concluded after some lines of calculation, which have been left for the extended version of the manuscript due to the page limitation.

### References

- [1] S. K. Mohammed and E. G. Larsson, “Per-antenna constant envelope precoding for large multi-user MIMO systems,” *IEEE Trans. on Comm.*, vol. 61, no. 3, pp. 1059–1071, 2013.
- [2] M. A. Sedaghat, A. Bereyhi, and R. Mueller, “LSE precoders for massive MIMO with hardware constraints: Fundamental limits,” *arXiv preprint arXiv:1612.07902*, 2016.
- [3] C. B. Peel, B. M. Hochwald, and A. L. Swindlehurst, “A vector-perturbation technique for near-capacity multiantenna multiuser communication-Part I: channel inversion and regularization,” *IEEE Trans. on Comm.*, vol. 53, no. 1, pp. 195–202, 2005.
- [4] J. Hoydis, S. Ten Brink, and M. Debbah, “Massive MIMO in the UL/DL of cellular networks: How many antennas do we need?” *IEEE Journal on Selected Areas in Communications*, vol. 31, no. 2, pp. 160–171, 2013.
- [5] D. A. Schmidt, M. Joham, and W. Utschick, “Minimum mean square error vector precoding,” *European Transactions on Telecommunications*, vol. 19, no. 3, pp. 219–231, 2008.
- [6] T. Tanaka, “A statistical-mechanics approach to large-system analysis of CDMA multiuser detectors,” *IEEE Trans. on Inf. Theory*, vol. 48, no. 11, pp. 2888–2910, 2002.
- [7] B. M. Zaidel, R. Müller, A. L. Moustakas, and R. de Miguel, “Vector precoding for Gaussian MIMO broadcast channels: Impact of replica symmetry breaking,” *IEEE Trans. on Inf. Theory*, vol. 58, no. 3, pp. 1413–1440, 2012.
- [8] A. Bereyhi, M. A. Sedaghat, S. Asaad, and R. R. Müller, “Nonlinear precoders for massive MIMO systems with general constraints,” *International ITG Workshop on Smart Antennas (WSA)*, 2017.
- [9] D. Guo and S. Verdú, “Randomly spread CDMA: Asymptotics via statistical physics,” *IEEE Trans. on Inf. Theory*, vol. 51, no. 6, pp. 1983–2010, 2005.
- [10] S. Rangan, A. K. Fletcher, and V. Goyal, “Asymptotic analysis of MAP estimation via the replica method and applications to compressed sensing,” *IEEE Trans. on Inf. Theory*, pp. 1902–1923, 2012.
- [11] A. Bereyhi, R. Müller, and H. Schulz-Baldes, “RSB decoupling property of MAP estimators,” *IEEE Inf. Theory Work. (ITW)*, pp. 379–383, 2016.
- [12] A. Bereyhi, R. R. Müller, and H. Schulz-Baldes, “Statistical mechanics of MAP estimation: General replica ansatz,” *arXiv preprint arXiv:1612.01980*, 2016.
- [13] A. F. Molisch, M. Z. Win, Y.-S. Choi, and J. H. Winters, “Capacity of MIMO systems with antenna selection,” *IEEE Trans. on Wireless Comm.*, vol. 4, no. 4, pp. 1759–1772, 2005.
- [14] V. A. Marčenko and L. A. Pastur, “Distribution of eigenvalues for some sets of random matrices,” *Mathematics of the USSR-Sbornik*, vol. 1, no. 4, pp. 457–483, 1967.