Linear Precoding for Relay Networks with Finite-Alphabet Constraints
Abstract
In this paper, we investigate the optimal precoding scheme for relay networks with finite-alphabet constraints. We show that previous work, which utilizes various design criteria to maximize either the diversity order or the transmission rate under the Gaussian-input assumption, may lead to significant performance loss for a practical system constrained to a finite constellation set. A linear precoding scheme is proposed to maximize the mutual information of relay networks. We exploit the structure of the optimal precoding matrix and develop a unified two-step iterative algorithm utilizing the theory of convex optimization and optimization on the complex Stiefel manifold. Numerical examples show that this iterative algorithm achieves significant gains compared to its conventional counterparts.
I Introduction
Cooperative relaying is an emerging technology that provides reliable, high-data-rate transmission for wireless networks without the need for multiple antennas at each node. These benefits can be further exploited through judicious cooperative design (see [1, 2, 3, 4, 5] and the references therein).
Most existing designs optimize the performance of relay networks under Gaussian-input assumptions, for example, maximizing the output signal-to-noise ratio (SNR) [1, 2], minimizing the mean square error (MSE) [3, 1], and maximizing the channel capacity [1, 4, 5]. Despite the information-theoretic optimality of Gaussian inputs, they can never be realized in practice. Rather, in a practical communication system, the inputs must be drawn from a finite constellation set, such as pulse amplitude modulation (PAM), phase shift keying (PSK) modulation, or quadrature amplitude modulation (QAM). These discrete constellations depart significantly from the Gaussian idealization [6, 7]. Therefore, there is a significant performance gap between schemes designed under the Gaussian-input assumption and schemes designed from the standpoint of the finite-alphabet constraint [8].
In this paper, we consider two-hop relay networks with the finite-input constraint and utilize a linear precoder to improve the maximal possible transmission rate of the network. We exploit the optimal structure of the precoding matrix under the finite-alphabet constraint and develop a unified framework to solve this nonconvex optimization problem. We prove that the left singular matrix of the precoder coincides with the right singular matrix of the effective channel matrix; that the mutual information is a concave function of the power allocation vector of the precoder; and that the optimization of the right singular matrix under the unitary constraint can be formulated as an unconstrained problem on the complex Stiefel manifold. Building on these results, the optimal precoder is obtained via a complete two-step iterative algorithm utilizing the theory of convex optimization and optimization on the manifold. We show that this iterative algorithm achieves significant gains compared to its conventional counterpart.
Notation: Boldface uppercase letters denote matrices, boldface lowercase letters denote column vectors, and italics denote scalars. The superscripts (·)^T and (·)^H stand for the transpose and Hermitian operations, respectively. The subscripts v_i and A_{i,j} denote the i-th element of vector v and the (i,j)-th element of matrix A, respectively. The operator diag(v) represents a diagonal matrix whose nonzero elements are given by the elements of vector v. Furthermore, vec(A) represents the vector obtained by stacking the columns of A; I and 0 represent an identity matrix and a zero matrix of appropriate dimensions, respectively; tr(·) denotes the trace operation. Besides, all logarithms are base 2.
II System Model
Consider a relay network with one transmit-and-receive pair, where the source node attempts to communicate with the destination node with the assistance of multiple relays. All nodes are equipped with a single antenna and operate in half-duplex mode. We consider a flat-fading cooperative transmission system, with channel gains from the source to the destination, from the source to each relay, and from each relay to the destination.
We focus on amplify-and-forward protocols combined with single relay selection [5]. The signal transmission is carried out in blocks; for the selected relay node, a data-receiving period precedes a data-transmitting period of equal length.
The original data at the source node is a vector of data symbols, one per time slot. We assume that the original information symbols are drawn equiprobably from discrete signaling constellations such as PSK, PAM, or QAM, with unit covariance matrix.
The original data is processed by a generalized complex precoding matrix before being transmitted from the source node.
The source node sends the precoded signal under a transmit power constraint during the first half of the block. The received signals at each relay node and at the destination are the transmitted signal scaled by the corresponding channel gains plus noise, where the noise terms at the relays and the destination are independent and identically distributed (i.i.d.) zero-mean circularly symmetric Gaussian with unit variance.
Assuming a relay node is selected, during the second half of the block it scales its received signal by an amplification factor (so that its average transmit power meets the relay power constraint) and forwards the result to the destination. We assume that only the second-order statistics of the source-relay channel are known at the relay, so the amplification factor is chosen based on these statistics. During the same period, the source node continues to transmit precoded data. Therefore, the destination node receives a superposition of the relay transmission and the source transmission, in which the effective noise combines the destination noise with the forwarded relay noise.
For convenience of presentation, we normalize the received signals by the standard deviation of the effective noise and stack them into a single received vector y. Thus, the effective input-output relation for the two-hop transmission with precoding is summarized as

y = HGx + n, (1)

where x is the original transmitted signal vector; n is an i.i.d. complex Gaussian noise vector with zero mean and unit variance; G is the precoding matrix; and H is the effective channel matrix of the two-hop relay channel
(2) 
Our precoding scheme is thus the design of the precoding matrix to maximize the mutual information under finite-alphabet constraints. Note that for the proposed algorithm to be implemented effectively in practice, the destination estimates the effective channel matrices of the relay network through pilot-assisted channel estimation. The destination node then selects one relay for cooperation and provides the corresponding effective channel to the source node via a feedback channel. Considering the special structure of the channel matrix (2), the amount of feedback can be very small. After the feedback, the source node utilizes the proposed precoding algorithm to optimize the network performance.
III Mutual Information for Relay Networks
We consider conventional equiprobable discrete constellations such as M-ary PSK, PAM, or QAM, where M is the number of points in the signal constellation. With the equivalent channel matrix H and the precoding matrix G known at the receiver, the mutual information between x and y is given by [7, 8]

I(x; y) = N log M - (1/M^N) \sum_{m=1}^{M^N} E_n [ log \sum_{k=1}^{M^N} exp( -||HG(x_m - x_k) + n||^2 + ||n||^2 ) ]  (3)

where || · || denotes the Euclidean norm of a vector, N is the block length, and both x_m and x_k contain N symbols taken independently from the M-ary signal constellation.
Proposition 1
Let W be a unitary matrix. Then the following relationships hold:
(4)  
(5) 
Proof:
See the proof in [9].
Proposition 1 implies that the behavior of the mutual information for a discrete input vector differs from the Gaussian-input case. For Gaussian inputs, the mutual information is unchanged when either the transmitted signal or the received signal is rotated by a unitary matrix. Finite inputs do not follow the same rule, which provides a new opportunity to improve the system performance.
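To make this rotation sensitivity concrete, the following self-contained sketch estimates the finite-alphabet mutual information by Monte Carlo for a simplified real-valued BPSK model (the diagonal channel, the 45-degree rotation, and the sample count are illustrative assumptions, not values from the paper) and contrasts it with the Gaussian-input quantity, which is invariant under the same input-side rotation:

```python
import numpy as np

def finite_alphabet_mi(HG, n_noise=3000, seed=0):
    """Monte Carlo estimate (in bits) of the mutual information of
    equiprobable BPSK input vectors over a real-valued channel
    y = HG x + n with n ~ N(0, I).  A simplified real-valued stand-in
    for the complex-valued model of the paper."""
    rng = np.random.default_rng(seed)
    n_in = HG.shape[1]
    # enumerate all BPSK input vectors (K = 2^n_in columns)
    X = np.array(np.meshgrid(*[[-1.0, 1.0]] * n_in)).reshape(n_in, -1)
    S = HG @ X                                  # noiseless received points
    K = S.shape[1]
    noise = rng.standard_normal((n_noise, HG.shape[0]))
    mi = np.log2(K)
    for m in range(K):
        d = S[:, [m]] - S                       # HG (x_m - x_k) for all k
        e = noise[:, :, None] + d[None, :, :]   # (n_noise, rows, K)
        expo = -(np.sum(e**2, axis=1) - np.sum(noise**2, axis=1, keepdims=True)) / 2.0
        mi -= np.mean(np.log2(np.sum(np.exp(expo), axis=1))) / K
    return mi

H = np.diag([2.0, 0.4])                         # two unequal subchannels
c, s = np.cos(np.pi / 4), np.sin(np.pi / 4)
V = np.array([[c, -s], [s, c]])                 # unitary input rotation

mi_plain = finite_alphabet_mi(H)                # no precoding
mi_rot = finite_alphabet_mi(H @ V)              # input rotated by V

# Gaussian-input mutual information (identity input covariance) is
# invariant under the rotation: det(I + H V V^T H^T) = det(I + H H^T).
gauss = lambda A: 0.5 * np.log2(np.linalg.det(np.eye(2) + A @ A.T))
```

Rotating the BPSK input mixes the strong and weak subchannels and changes the finite-alphabet mutual information, while the Gaussian-input value stays exactly the same.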
The optimization of the linear precoding matrix is carried out over all complex precoding matrices under a transmit power constraint, which can be cast as a constrained nonlinear optimization problem
(6) 
Proposition 2 (Necessary Condition)
Proof:
See the proof in [9].
It is important to note that Proposition 2 gives only a necessary condition satisfied by any critical point, since the optimization problem (6) is nonconvex. It is possible to develop an algorithm based on the gradient of the Lagrangian, as proposed in [10, 8]. However, this kind of algorithm can get stuck at a local maximum, which is influenced by the starting point and may be far away from the optimal solution. This fact will be illustrated via an example in the sequel.
IV Precoder Design to Maximize the Mutual Information
IV-A Optimal Left Singular Vector
We start by characterizing the dependence of the mutual information on the precoder G. Consider the singular value decomposition (SVD) of the channel matrix, H = U_H Λ V_H^H, where U_H and V_H are unitary matrices and the diagonal of Λ contains the nonnegative singular values in decreasing order. Note that the equivalent channel matrix defined in (2) is full rank for any nonzero channel gains. We also consider the SVD of the precoding matrix, G = U_G Σ V_G^H, where U_G and V_G are named the left and right singular vectors, respectively; the vector of diagonal entries of Σ is nonnegative and constrained by the transmit power.
Proposition 3
The mutual information depends on the precoding matrix only through the product HG. For a given HG, we can always choose the precoding matrix such that its left singular vector coincides with the right singular vector of the channel matrix, in order to minimize the transmit power.
Proof:
See the proof in [9].
From the results in Propositions 1 and 3, the channel model (1) can be simplified to
(10) 
Our discussion will now be based on this simplified channel model (10). The optimization variables are the power allocation vector and the right singular vector, which are the focus of the next two subsections. In the sequel, we will write the mutual information as a function of each of these variables to emphasize its dependence on them.
IV-B Optimal Power Allocation
Given a right singular vector of the precoder, we consider the following optimization problem over the power allocation vector
(11) 
where 1 denotes a column vector with all entries equal to one.
Proposition 4
The mutual information is a concave function of the squared singular values of the precoder; i.e., the Hessian of the mutual information with respect to the power allocation vector is negative semidefinite. Moreover, the Jacobian of the mutual information with respect to the power allocation vector is given by
(12) 
where the reduction matrix is given by
(13) 
Proof:
See the proof in [9].
The concavity result in Proposition 4 extends the Hessian and concavity results in [11, Theorem 5] from the real-valued signal model to a generalized complex-valued case. It guarantees that the globally optimal power allocation vector can be found for a given right singular vector, and the gradient result in (12) makes it possible to develop a steepest-descent-type algorithm to reach the global optimum [12].
We first rewrite the problem (11) using the barrier method:
(14) 
where the logarithmic barrier function, which approximates an indicator of whether the constraints are violated, is
(15) 
with a parameter that sets the accuracy of the approximation. The gradient of the objective function in (14) is
(16) 
Therefore, the steepest descent direction is chosen as the gradient in (16). Then it is necessary to choose a positive step size so that the objective improves along this direction. The backtracking line search conditions [12] state that the step size should be chosen to satisfy the inequalities
(17)  
(18) 
The above ideas are summarized in the following algorithm, which is guaranteed to converge to the optimal power allocation vector because of the concavity.
Algorithm 1
Steepest Descent to Maximize the Mutual Information Over Power Allocation Vector

Step 1. Choose a feasible starting point, an initial barrier parameter, a barrier update factor, and a tolerance.

Step 2. Compute the gradient of the objective at the current point as in (16) and the corresponding descent direction. Set the step size to one.

Step 3. Evaluate the norm of the gradient. If it is sufficiently small, go to Step 7.

Step 4. If the first line search condition (17) is violated, halve the step size and repeat Step 4.

Step 5. If the second line search condition (18) is violated, halve the step size and repeat Step 5.

Step 6. Update the power allocation vector along the descent direction with the chosen step size. Go to Step 2.

Step 7. Stop if the barrier approximation is sufficiently accurate; otherwise, increase the barrier parameter and go to Step 2.
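As a concrete illustration of the ingredients of Algorithm 1 (log barrier, steepest descent, backtracking line search, outer barrier updates), the following sketch maximizes a simple concave rate-like objective under a nonnegativity and total-power constraint. The toy objective sum(log(1 + g_i p_i)), the gains g, and all step-size constants are illustrative assumptions standing in for the mutual information in (11):

```python
import numpy as np

def barrier_ascent(g, P, t0=1.0, mu=10.0, n_outer=4, n_inner=500):
    """Barrier-method steepest ascent with backtracking line search,
    mirroring Algorithm 1.  Maximizes the concave toy objective
    f(p) = sum(log(1 + g_i * p_i))  s.t.  p >= 0, sum(p) <= P."""
    p = np.full(len(g), P / (2 * len(g)))      # strictly feasible start
    t = t0
    for _ in range(n_outer):                   # outer barrier iterations
        def phi(q):                            # objective plus log barrier
            slack = P - q.sum()
            if np.any(q <= 0) or slack <= 0:
                return -np.inf                 # infeasible: reject in search
            return np.sum(np.log(1 + g * q)) + (np.sum(np.log(q)) + np.log(slack)) / t
        for _ in range(n_inner):               # steepest ascent steps
            slack = P - p.sum()
            grad = g / (1 + g * p) + (1.0 / p - 1.0 / slack) / t
            if np.linalg.norm(grad) < 1e-9:
                break
            step, f0 = 1.0, phi(p)             # backtracking, cf. (17)-(18)
            while phi(p + step * grad) < f0 + 0.3 * step * (grad @ grad):
                step *= 0.5
                if step < 1e-12:
                    break
            p = p + step * grad
        t *= mu                                # tighten the barrier
    return p

g, P = np.array([4.0, 1.0]), 1.0
p = barrier_ascent(g, P)
rate = np.sum(np.log(1 + g * p))
```

For this Gaussian-rate toy objective the optimum is the water-filling point p = [0.875, 0.125] with objective ln(4.5) + ln(1.125) ≈ 1.622, so the barrier iterates can be checked against a known answer; by Proposition 4 the same machinery applies to the concave mutual-information objective.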
IV-C Optimization Over Right Singular Vector
This section considers an alternative optimization problem for maximizing the mutual information over the right singular vector for a given power allocation vector,
(19) 
This unitary-matrix-constrained problem can be formulated as an unconstrained one over a constrained search space
(20) 
where the objective function is defined as the mutual information with its domain restricted to the feasible set:
(21) 
in which the feasible set is the complex Stiefel manifold [13]
(22) 
Associated with each point on the manifold is a vector space called the tangent space, which is formed by all the tangent vectors at that point.
Proposition 5
The gradient of the mutual information on the tangent space is
(23) 
Proof:
See the proof in [9].
Utilizing the gradient on the tangent space as the descent direction has been suggested in [13]; however, moving along a direction in the tangent space may destroy the unitary property, which therefore needs to be restored in each step via projection.
The projection of an arbitrary matrix onto the Stiefel manifold is defined as the point on the Stiefel manifold closest to it in the Euclidean norm
(24) 
Proposition 6 (Projection)
Let the given matrix be full rank. Then its projection onto the Stiefel manifold is unique and is given by the product of the left and right singular matrices of its SVD.
Proof:
See the proof in [9].
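Proposition 6 translates directly into a few lines of NumPy: the Euclidean projection of a full-rank matrix onto the Stiefel manifold is the product of the left and right singular matrices of its SVD (the matrix size and random seeds below are illustrative assumptions):

```python
import numpy as np

def stiefel_projection(A):
    """Euclidean projection onto the (complex) Stiefel manifold:
    if A = U S V^H is the SVD of a full-rank A, the closest matrix
    with orthonormal columns is U V^H (Proposition 6)."""
    U, _, Vh = np.linalg.svd(A, full_matrices=False)
    return U @ Vh

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Q = stiefel_projection(A)

# any other unitary matrix is at least as far from A as Q is
W = stiefel_projection(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
```

The result Q has exactly orthonormal columns, and by Proposition 6 no other point on the manifold is closer to A.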
Combining the search direction and projection method with the line search conditions in (17) and (18), we are able to develop the optimization algorithm to maximize the mutual information over the right singular vector.
Algorithm 2
Steepest Descent to Maximize the Mutual Information on Complex Stiefel Manifold
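Since the detailed steps of Algorithm 2 are not reproduced here, the following sketch illustrates the same pattern on a toy objective: ascend along the gradient projected onto the tangent space of the unitary group, then restore unitarity with the SVD projection of Proposition 6. The objective Re tr(B^H V), the step size, and the iteration count are illustrative assumptions; this objective's known maximizer (the polar factor of B, with optimal value equal to the sum of B's singular values) makes convergence easy to verify:

```python
import numpy as np

def project(A):
    """SVD projection onto the unitary group (Proposition 6)."""
    U, _, Vh = np.linalg.svd(A)
    return U @ Vh

def stiefel_ascent(B, n_steps=2000, lr=0.05):
    """Projected steepest ascent on the manifold for the toy objective
    f(V) = Re tr(B^H V) over unitary V.  The Euclidean gradient is B;
    G - V G^H V lies in the tangent space at V (it equals the tangent
    projection up to a constant factor), and the SVD projection retracts
    the iterate back onto the manifold after each step."""
    V = np.eye(B.shape[0], dtype=complex)
    for _ in range(n_steps):
        G = B                               # Euclidean gradient of f
        T = G - V @ G.conj().T @ V          # tangent-space ascent direction
        V = project(V + lr * T)             # restore the unitary property
    return V

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
V = stiefel_ascent(B)
f_opt = np.linalg.svd(B, compute_uv=False).sum()   # known maximum value
f_val = np.real(np.trace(B.conj().T @ V))
```

The iterate stays unitary after every step, and the objective climbs to the sum of singular values, mirroring how Algorithm 2 ascends the mutual information over the right singular vector.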
IV-D Two-Step Approach to Optimize the Precoder
We are now ready to develop a complete two-step approach that maximizes the mutual information over a generalized precoding matrix by combining Proposition 3 with Algorithms 1 and 2.
Algorithm 3
Two-Step Algorithm to Maximize the Mutual Information Over a Generalized Precoding Matrix
V Applications
We consider a single-relay network with fixed block length and channel coefficients. We assume the same transmit power at the source and relay nodes, and the SNR is 3 dB. When the elements of the transmitted signal are drawn independently from BPSK constellations, the mutual information is bounded by 1 bit/s/Hz, as shown in (3).
The convergence of the proposed approach is illustrated in Fig. 1, together with the convergence of the algorithms proposed in [8, 10]. Fig. 1 shows that the direct gradient method gets stuck at a local maximum (0.53 bit/s/Hz). The reason for this behavior is that the optimization problem is not convex in general. In contrast, the proposed two-step algorithm exploits the characterization of the optimal solution, which leads to a solution with the optimal left singular vector, the optimal power allocation vector (for a given right singular vector), and a locally optimal right singular vector from an arbitrary starting point. Hence, the algorithm converges to a much higher value (0.85 bit/s/Hz), an improvement of about 60 percent. Note that the progress of the proposed method has a staircase shape, with each stair associated either with an outer iteration of the barrier method [12] or with the alternation between the optimizations of the power allocation vector and the right singular vector.
The performance of the proposed algorithm is shown in Fig. 2, in which the information symbols are QPSK modulated and the channel is the same as in the above case. For the sake of completeness, we also show the performance corresponding to the cases of no precoding, the maximum diversity design in [14], the maximum coding gain design in [14, 15], and the maximum capacity design under the Gaussian-input assumption in [4]. From Fig. 2, we make the following observations.
The method based on maximizing capacity under the Gaussian-input assumption may result in a significant loss for discrete inputs, especially in the medium-to-high SNR region. The reason lies in the design of the power allocation vector and the right singular vector. For Gaussian inputs, it is always helpful to allocate more power to the stronger subchannels and less power to the weaker subchannels in order to maximize capacity. However, this does not work for the case of finite inputs: since the mutual information of the relay network is upper bounded by a constant determined by the constellation size and block length, as seen from (3), there is little incentive to allocate more power to subchannels that are already close to saturation. Moreover, the right singular vector for Gaussian inputs is an arbitrary unitary matrix, because the Gaussian-input mutual information is unchanged when the input signal is rotated by a unitary matrix.
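The saturation effect can be reproduced in a few lines for a scalar real AWGN channel (a simplified stand-in for the network model; the SNR values and sample count are illustrative assumptions): the BPSK mutual information flattens out at 1 bit/symbol while the Gaussian-input value keeps growing with SNR, so extra power on a near-saturated subchannel buys almost nothing.

```python
import numpy as np

def bpsk_mi(snr, n_samples=200_000, seed=3):
    """Monte Carlo mutual information (bits) of equiprobable BPSK over
    a real AWGN channel y = s + n, with s = ±sqrt(snr) and n ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    a = np.sqrt(snr)
    n = rng.standard_normal(n_samples)
    # I = 1 - E[ log2(1 + exp(-2 a^2 - 2 a n)) ], the scalar BPSK case
    return 1.0 - np.mean(np.log2(1.0 + np.exp(-2.0 * a * a - 2.0 * a * n)))

def gaussian_mi(snr):
    """Gaussian-input mutual information (capacity) of the same channel."""
    return 0.5 * np.log2(1.0 + snr)

mi_bpsk_low, mi_bpsk_high = bpsk_mi(0.5), bpsk_mi(10.0)
```

At an SNR of 10, BPSK is already pinned near its 1 bit/symbol ceiling while the Gaussian-input value exceeds 1.7 bits, which is why a Gaussian-optimal power allocation overinvests in subchannels that a finite alphabet cannot exploit.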
The maximum coding gain design has better performance than the maximum diversity order design and no precoding. We note that the maximum coding gain design in [14] is valid only for a specific block length and QPSK inputs; it is extended to a longer block length and 16-QAM inputs in [15].
The proposed two-step precoder optimization yields significant mutual-information gains over a wide SNR range. For example, it is about 2 dB, 4 dB, and 10 dB better than the maximum coding gain, no precoding, and maximum capacity methods, respectively, when the channel coding rate is 2/3. Moreover, the algorithm can be applied to an arbitrary block length and input type.
VI Summary and Conclusion
In this paper, we have studied the precoding design for dual-hop AF relay networks. In contrast with previous work utilizing various design criteria under the idealistic Gaussian-input assumption, we have formulated the linear precoding design from the standpoint of discrete-constellation inputs. To develop an efficient precoding design algorithm, we have chosen the mutual information as the utility function. Unfortunately, the maximization of this utility function over all possible complex precoding matrices is nonconvex; i.e., direct optimization of the precoder can get stuck at a local maximum, which is influenced by the starting point and may be far away from the optimal solution. We have exploited the structure of the precoding matrix under the finite-alphabet constraint and developed a unified framework to solve this nonconvex optimization problem, proposing a two-step iterative algorithm to maximize the mutual information. Numerical examples have shown substantial mutual-information gains of the proposed approach compared to its conventional counterparts.
References
 [1] A. S. Behbahani, R. Merched, and A. M. Eltawil, “Optimizations of a MIMO relay network,” IEEE Trans. Signal Process., vol. 56, no. 10, pp. 5062–5073, 2008.
 [2] A. B. Gershman, N. D. Sidiropoulos, S. Shahbazpanahi, M. Bengtsson, and B. Ottersten, “Convex optimization-based beamforming: From receive to transmit and network designs,” IEEE Signal Process. Mag., vol. 27, no. 3, pp. 62–75, 2010.
 [3] N. Khajehnouri and A. Sayed, “Distributed MMSE relay strategies for wireless sensor networks,” IEEE Trans. Signal Process., vol. 55, no. 7, p. 3336, 2007.
 [4] Y. Rong, X. Tang, and Y. Hua, “A unified framework for optimizing linear nonregenerative multicarrier MIMO relay communication systems,” IEEE Trans. Signal Process., vol. 57, no. 12, pp. 4837–4851, 2009.
 [5] W. Zeng, C. Xiao, Y. Wang, and J. Lu, “Opportunistic cooperation for multiantenna multirelay networks,” IEEE Trans. Wireless Commun., vol. 9, no. 10, pp. 3189–3199, 2010.
 [6] A. Lozano, A. Tulino, and S. Verdu, “Optimum power allocation for parallel Gaussian channels with arbitrary input distributions,” IEEE Trans. Inform. Theory, vol. 52, no. 7, pp. 3033–3051, 2006.
 [7] C. Xiao and Y. R. Zheng, “On the mutual information and power allocation for vector Gaussian channels with finite discrete inputs,” in Proc. IEEE Globecom, New Orleans, LA, Nov. 2008, pp. 1–5.
 [8] W. Zeng, M. Wang, C. Xiao, and J. Lu, “On the power allocation for relay networks with finitealphabet constraints,” in Proc. IEEE Globecom, Miami, FL, 2010.
 [9] W. Zeng, C. Xiao, M. Wang, and J. Lu, “Linear precoding for relay networks with finitealphabet inputs: Theory and practice,” submitted to IEEE Trans. Wireless Commun., Dec. 2010.
 [10] D. Palomar and S. Verdú, “Gradient of mutual information in linear vector Gaussian channels,” IEEE Trans. Inform. Theory, vol. 52, no. 1, pp. 141–154, 2006.
 [11] M. Payaro and D. P. Palomar, “Hessian and concavity of mutual information, differential entropy, and entropy power in linear vector Gaussian channels,” IEEE Trans. Inform. Theory, vol. 55, no. 8, pp. 3613–3628, 2009.
 [12] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
 [13] J. H. Manton, “Optimization algorithms exploiting unitary constraints,” IEEE Trans. Signal Process., vol. 50, no. 3, pp. 635–650, 2002.
 [14] Y. Ding, J. Zhang, and K. Wong, “The amplifyandforward halfduplex cooperative system: Pairwise error probability and precoder design,” IEEE Trans. Signal Process., vol. 55, no. 2, pp. 605–617, 2007.
 [15] ——, “Optimal precoder for amplifyandforward halfduplex relay system,” IEEE Trans. Wireless Commun., vol. 7, no. 8, p. 2891, 2008.