A Deep Learning Framework for Optimization of MISO Downlink Beamforming
Abstract
Beamforming is an effective means to improve the quality of the received signals in multiuser multiple-input single-output (MISO) systems. This paper studies fast optimal downlink beamforming strategies by leveraging deep learning techniques. Traditionally, finding the optimal beamforming solution relies on iterative algorithms, which lead to high computational delay and are thus not suitable for real-time implementation. In this paper, we propose a deep learning framework for the optimization of downlink beamforming. In particular, the solution is obtained based on convolutional neural networks and the exploitation of expert knowledge, such as the uplink-downlink duality and the structure of known optimal solutions. Using this framework, we construct three beamforming neural networks (BNNs) for three typical optimization problems, i.e., the signal-to-interference-plus-noise ratio (SINR) balancing problem, the power minimization problem, and the sum rate maximization problem. The BNNs for the former two problems adopt the supervised learning approach, while the BNN for the sum rate maximization problem employs a hybrid method of supervised and unsupervised learning to improve the performance beyond the state of the art. Simulation results show that, with much reduced computational complexity, the BNNs can achieve near-optimal solutions to the SINR balancing and power minimization problems, and outperform the existing algorithms that maximize the sum rate. In summary, this work paves the way for fast realization of the optimal beamforming in multiuser MISO systems.
I Introduction
The beamforming technique has attracted much attention in the past decades for its ability to realize the performance gain of multiple antennas in the downlink. Beamforming has been formulated in various ways, namely as a signal-to-interference-plus-noise ratio (SINR) balancing problem (also known as the interference balancing problem) under a total power constraint [1, 2, 3, 4, 5], as a power minimization problem under quality of service (QoS) constraints [6, 7, 8, 9, 10], or as a sum rate maximization problem under a total power constraint [3, 11, 12, 13]. The existing approaches to finding the optimal beamforming solutions rely heavily on tailor-made iterative algorithms and convex optimization, the latter in turn being solved by general iterative algorithms such as the interior point method. For instance, the SINR balancing problem can be solved by the iterative algorithm of [14]. The power minimization problem can be reformulated as a second-order cone programming (SOCP) problem [10, 8] or a semidefinite programming (SDP) problem [15], which can be solved directly by an optimization software package such as CVX [16]. Its optimal solution can also be obtained using iterative algorithms such as Algorithm A of [17] and the dual algorithm of [6]. However, the optimal solution to the sum rate maximization problem is usually hard to obtain because the problem is nonconvex. Instead, locally optimal solutions are obtained via iterative algorithms, such as the weighted minimum mean square error (WMMSE) algorithm [11, 12], and the water-filling algorithm combined with zero-forcing (ZF) beamforming [13].
The main drawbacks of existing iterative algorithms are their high computational complexity and delay. As a result, the beamforming technique is unable to meet the demands of real-time applications in the fifth-generation (5G) system and beyond, such as autonomous vehicles and mission-critical communications. Even in non-real-time applications, where the small-scale fading varies on the order of milliseconds, the latency introduced by the iterative process renders the beamforming solution outdated. To address this challenge, researchers have proposed simple heuristic beamforming schemes that admit closed-form solutions, such as maximum-ratio transmission beamforming, ZF beamforming, and regularized ZF (RZF) beamforming. These heuristic beamforming solutions are computed directly from the channel state information (CSI) without iteration, and thus involve low computational delay. However, the reduction of delay is achieved at the cost of performance loss. The tradeoff between delay and performance seems to restrict the potential of beamforming techniques and their applications in practice.
Thanks to the recent advances in deep learning (DL) techniques, it becomes possible to learn the optimal beamforming in real time by taking into account both the performance and the computational delay simultaneously. This is because the DL technique trains neural networks offline and then deploys the trained neural networks for online optimization. The computational complexity is transferred from the online optimization to the offline training, and only simple linear and nonlinear operations are needed when the trained neural network is used to find the optimal beamforming solution, thus greatly reducing the computational complexity and delay.
Benefiting from the development of specialized hardware, such as graphics processing units and field-programmable gate arrays, DL can be implemented conveniently using these hardware resources. Accordingly, DL techniques have been widely used in many applications including wireless communications. Many studies have attempted to use DL to deal with issues in the physical layer, including channel decoding [18, 19, 20], detection [21, 22, 23, 24, 25], channel estimation [26, 27, 28], and resource management [29, 30, 31, 32, 33, 34]. More specifically, based on iterative belief propagation, [18] and [19] proposed a deep neural network (DNN) architecture and a convolutional neural network (CNN) architecture for channel decoding, respectively. The work in [21] demonstrated that, by using tools from DL, the trained detectors performed well without any prior knowledge of the underlying channel models. The authors of [27] developed a CNN called CsiNet to learn a transformation from CSI to a near-optimal number of representations and an inverse transformation from codewords to CSI. Furthermore, [28] proposed a real-time CSI feedback architecture called CsiNet-long short-term memory for time-varying massive multiple-input multiple-output (MIMO) channels. The DL-based approach to channel estimation and signal detection in [35] achieved performance comparable to the MMSE estimator. Among these efforts, the autoencoder based on unsupervised DL, investigated in [36, 37, 38], is an ambitious attempt [39] to learn an end-to-end communications system. Besides, DL can also facilitate resource management [29, 30], e.g., power allocation [31, 32, 33, 34]. Finally, [40, 41] provided an overview of the recent advances in DL-based physical layer communications and [42] suggested potential applications of DL to the physical layer.
However, with the exception of [43, 44, 45, 46], there are no works focusing on DL-based beamforming design in multi-antenna communications. [43] considered an outage-based approach to transmit beamforming in order to deal with the channel uncertainty at base stations (BSs). However, only a single user was considered in [43]. [44] designed a decentralized robust precoding scheme based on a DNN in a network MIMO configuration. The projection over a finite-dimensional subspace in [44] reduced the difficulty, but also limited the performance. [45] used a DL model to predict the beamforming matrix directly from the signals received at the distributed BSs based on omni or quasi-omni beam patterns in millimeter wave systems, whose sum rate performance was restricted by the quantized codebook constraint. We notice that none of these works addressed the SINR balancing problem under a power constraint or the power minimization problem under SINR constraints. The sum rate maximization problem was investigated in [45] but without considering the total power constraint. [44, 45] predicted the beamforming matrix in a finite solution space at the cost of performance loss. Furthermore, [43, 46] directly estimated the beamforming matrix without exploiting the problem structure, so the number of variables to predict increases significantly as the numbers of transmit antennas and users increase. This leads to high training complexity of the neural networks when the numbers of transmit antennas and users are large.
Motivated by the aforementioned facts and the universal approximation theorem [31], we propose a general DL framework that not only achieves a near-optimal beamforming matrix but also reduces complexity compared to the iterative methods. Based on the proposed framework, we develop beamforming neural networks (BNNs) to solve the three aforementioned optimization problems. Learning the optimal beamforming solution is highly nontrivial, and there are still challenges that need to be addressed in designing the BNNs. Firstly, popular neural network software packages such as Keras and TensorFlow currently (December 2018) do not support complex numbers as input or output [39]. Both channel and beamforming matrices are inherently complex. Naive transformation of complex vectors to real vectors by concatenating the real and imaginary parts not only leads to high prediction complexity, but may also lose the specific structures of the problems of interest. Secondly, the power minimization problem has strict QoS constraints, and guaranteeing a feasible solution using neural networks is a challenge. In addition, unlike the SINR balancing problem and the power minimization problem, whose optimal solutions exist so that supervised learning can be used, there is no known algorithm that can achieve the optimal solution to the sum rate maximization problem (and other nonconvex beamforming problems), and thus the supervised learning method based on a locally optimal solution cannot achieve good performance. In this paper, we tackle these challenges, and our main contributions are summarized as follows:

We provide a DL-based framework for the beamforming optimization in the multiple-input single-output (MISO) downlink, where the BS has multiple antennas while each user terminal has a single antenna. The proposed framework is designed based on the CNN structure and the exploitation of expert knowledge such as the uplink-downlink duality and the structure of the optimal solution. The real and imaginary parts of complex channel coefficients are fed into the BNNs as two vectors. Due to the parameter sharing scheme used in the CNN structure, fewer parameters are required. Furthermore, the expert knowledge exploits the model/structure of the specific problem to improve learning efficiency by specifying the best parameters to be learned; those parameters are typically not the beamforming matrix itself. Under this framework, we propose three BNNs for solving three typical optimization problems in MISO systems, i.e., the SINR balancing problem under a total power constraint, the power minimization problem under QoS constraints, and the sum rate maximization problem under a total power constraint.

In the proposed supervised BNNs for the SINR balancing problem and the power minimization problem, instead of estimating the beamforming matrix with $NK$ elements, where $N$ is the number of transmit antennas at the BS and $K$ is the number of users, we exploit the uplink-downlink duality of the solutions [14, 6] and predict the virtual uplink power allocation vector with only $K$ elements. Thus, the demand on the prediction capability of the BNNs in terms of network neurons and layers is significantly reduced. The training and prediction complexity and cost are reduced as well. In the proposed BNN for the sum rate maximization problem, we exploit the structure of the optimal solutions and predict two power allocation vectors with $2K$ elements in total. This approach still has advantages compared to predicting the beamforming matrix directly.

We propose a hybrid two-stage BNN with both supervised and unsupervised learning to find the beamforming solution to the sum rate maximization problem [33], since its optimal solution is still unknown. In the first stage, we use the supervised learning method with the mean squared error (MSE)-based loss function to make the predictions as close as possible to the output of the WMMSE algorithm, which is known to achieve the best known locally optimal solution. In the second stage, we change the metric in the loss function to the sum rate, and update the network parameters according to the unsupervised learning method, which achieves an improved performance over the WMMSE algorithm.
The remainder of this paper is organized as follows. Section II introduces the system model and formulates three beamforming optimization problems in the MISO downlink. Section III provides the framework for the beamforming optimization, and Sections IV, V and VI then propose the BNNs under the framework for the SINR balancing problem, the power minimization problem, and the sum rate maximization problem, respectively. Numerical results are presented in Section VII. Finally, conclusions are drawn in Section VIII.
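Notations: The notations are given as follows. Matrices and vectors are denoted by bold capital and lowercase symbols, respectively. $(\mathbf{A})^T$ and $(\mathbf{A})^H$ stand for the transpose and conjugate transpose of $\mathbf{A}$, respectively. The notations $\|\cdot\|_1$ and $\|\cdot\|_2$ are the $\ell_1$ and $\ell_2$ norm operators, respectively. The operator $\mathrm{diag}(\mathbf{a})$ denotes the operation to diagonalize the vector $\mathbf{a}$ into a matrix whose main diagonal elements are from $\mathbf{a}$. Finally, $\mathbf{x} \sim \mathcal{CN}(\mathbf{0}, \boldsymbol{\Sigma})$ represents a complex Gaussian vector with zero mean and covariance matrix $\boldsymbol{\Sigma}$.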
II System Model
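We consider a downlink transmission scenario where a BS equipped with $N$ antennas serves $K$ single-antenna users. The channel between user $k$ and the BS is denoted as $\mathbf{h}_k \in \mathbb{C}^{N \times 1}$, which comprises both the small-scale fading and the large-scale fading. The received signal at user $k$ is given by
$$y_k = \mathbf{h}_k^H \sum_{k'=1}^{K} \mathbf{w}_{k'} s_{k'} + n_k, \tag{1}$$
where $\mathbf{w}_k$ represents the beamforming vector for user $k$, $s_k$ is the transmitted symbol from the BS to user $k$, and $n_k$ denotes the additive white Gaussian noise (AWGN) with zero mean and variance $\sigma^2$. The received SINR of user $k$ equals
$$\gamma_k^{\mathrm{dl}} = \frac{|\mathbf{h}_k^H \mathbf{w}_k|^2}{\sum_{k' \neq k} |\mathbf{h}_k^H \mathbf{w}_{k'}|^2 + \sigma^2}. \tag{2}$$
One conventional optimization problem seeks to maximize $\min_k \gamma_k^{\mathrm{dl}} / \rho_k$ subject to a transmit power constraint, where the $\rho_k$'s are constant weights denoting the importance of the substreams. Such an optimization problem is referred to as interference or SINR balancing, and has been investigated in many works [1, 2, 3, 4, 5]. The SINR balancing problem is formulated as:
$$\mathbf{P1}: \quad \max_{\mathbf{W}} \min_{1 \le k \le K} \frac{\gamma_k^{\mathrm{dl}}}{\rho_k} \quad \text{s.t.} \quad \sum_{k=1}^{K} \|\mathbf{w}_k\|_2^2 \le P_{\max}, \tag{3}$$
where $\mathbf{W} = [\mathbf{w}_1, \ldots, \mathbf{w}_K]$ is a set of beamforming vectors and $P_{\max}$ is the power budget.
Another important problem is the power minimization problem under a set of SINR constraints [7, 8]. A network operator may be more interested in how to minimize the transmit power while fulfilling the demands for QoS, i.e.,
$$\mathbf{P2}: \quad \min_{\mathbf{W}} \sum_{k=1}^{K} \|\mathbf{w}_k\|_2^2 \quad \text{s.t.} \quad \gamma_k^{\mathrm{dl}} \ge \Gamma_k, \ \forall k, \tag{4}$$
where $\Gamma_k$ is the SINR constraint of user $k$. For ease of composition, we define $\boldsymbol{\Gamma} = [\Gamma_1, \ldots, \Gamma_K]^T$ as the SINR constraint vector.
Finally, the weighted sum rate maximization problem under a power constraint is also an important issue that has attracted much attention [3, 12, 11], and can be formulated as:
$$\mathbf{P3}: \quad \max_{\mathbf{W}} \sum_{k=1}^{K} \alpha_k \log_2\left(1 + \gamma_k^{\mathrm{dl}}\right) \quad \text{s.t.} \quad \sum_{k=1}^{K} \|\mathbf{w}_k\|_2^2 \le P_{\max}, \tag{5}$$
where $\alpha_k$ is a constant weight of user $k$.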
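As a concrete illustration of the received SINR in (2) and the weighted sum rate objective of P3, the following NumPy sketch evaluates both for a given channel and beamforming matrix. The function names, the array layout (the columns of `H` and `W` hold the per-user channel and beamforming vectors), and the maximum-ratio transmission beamformer with an equal power split are illustrative assumptions, not part of the proposed method.

```python
import numpy as np

def downlink_sinr(H, W, noise_var):
    """Received SINR of every user.

    H: (N, K) complex array whose k-th column is the channel h_k (assumed layout).
    W: (N, K) complex array whose k-th column is the beamformer w_k.
    """
    G = np.abs(H.conj().T @ W) ** 2          # G[k, j] = |h_k^H w_j|^2
    signal = np.diag(G)                      # desired-signal power per user
    interference = G.sum(axis=1) - signal    # power leaked from the other beams
    return signal / (interference + noise_var)

def weighted_sum_rate(H, W, noise_var, alpha):
    """Weighted sum rate (bit/s/Hz) for user weights alpha."""
    return float(np.sum(alpha * np.log2(1.0 + downlink_sinr(H, W, noise_var))))

rng = np.random.default_rng(0)
N, K, noise_var, P_max = 4, 3, 1.0, 10.0
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)

# Heuristic baseline: maximum-ratio transmission with an equal power split.
W = H / np.linalg.norm(H, axis=0) * np.sqrt(P_max / K)

sinr = downlink_sinr(H, W, noise_var)
rate = weighted_sum_rate(H, W, noise_var, np.ones(K))
print("SINR per user:", sinr)
print("Sum rate:", rate)
```

The same two helpers can score any candidate beamforming matrix, which is convenient when comparing a learned solution against a heuristic baseline under the same power budget.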
We choose the above problems as representative examples to demonstrate the effectiveness of our proposed DL beamforming framework. The optimal solutions are available for P1 [14, 47, 10] and P2 [6, 10, 8, 15], so supervised learning can be adopted. P2 has the additional challenge of satisfying the strict QoS constraints. P3 is a difficult nonconvex problem and is usually solved using the iterative WMMSE approach [12, 11]; therefore, supervised learning alone is not adequate. In the rest of the paper, we will show how the solutions to these three types of problems can be efficiently learned by the proposed DL-based beamforming framework.
III A DL-based Framework for Beamforming Optimization
DL-based neural networks were initially designed for solving classification problems, but they can also achieve satisfactory performance in regression problems. For example, the DNN was used to predict transmit power [32, 31]. Existing works mainly take real data, such as channel gains and transmit power, as input and output, but channel and beamforming matrices are both complex. In addition, predicting the beamforming matrix with $NK$ elements directly may lead to inaccurate results and even underfitting. In principle, we can use wider or deeper neural networks with more neurons to improve the learning ability, but such a huge network leads to high training and implementation complexity and cannot guarantee the learning performance. For example, neural networks that are too deep or too wide can cause overfitting.
The proposed DL-based framework for the beamforming optimization in the MISO downlink is shown in Fig. 1. To deal with complex data, we choose the CNN architecture, which naturally accepts the complex channel input. The reason for choosing the CNN, instead of other neural networks, is that the CNN can share parameters among the real and imaginary parts of the complex channel coefficients, thus reducing the number of parameters. To overcome the challenge of predicting the beamforming matrix directly, we take the expert knowledge (e.g., the structure of the optimal solution) of the beamforming matrix into account. The proposed framework, instead of estimating the beamforming matrix directly, only predicts the key features extracted from the beamforming matrix according to the expert knowledge specific to the problem of interest. Therefore, the demand on the prediction capability of the BNNs in terms of network neurons and layers, as well as the complexity, is significantly reduced.
The proposed framework includes two main modules: the neural network module and the beamforming recovery module. The neural network module is composed of an input layer, convolutional (CL) layers, batch normalization (BN) layers, activation (AC) layers, a flatten layer, a fully-connected (FC) layer, and an output layer, whereas the key features and the functional layers in the beamforming recovery module are specified by the expert knowledge. Below we give a brief introduction to these layers.
III-1 Input Layer and CL Layer
The complex channel coefficients are fed into the neural network module to predict the key features, but complex-valued inputs are not supported by current neural network software. To deal with this issue, two data transformations are available. One is to separate the complex channel vector, for example $\mathbf{h}_k$, into the in-phase component $\Re(\mathbf{h}_k)$ and the quadrature component $\Im(\mathbf{h}_k)$, which contain the real and imaginary parts of each element in $\mathbf{h}_k$, respectively. We call this transformation the I/Q transformation. Another transformation, suggested by [48], is to map the complex channel vector $\mathbf{h}_k$ into two real vectors $\angle\mathbf{h}_k$ and $|\mathbf{h}_k|$, where the former contains the phase information and the latter includes the magnitude information of $\mathbf{h}_k$. This transformation is referred to as the P/M transformation. Without loss of generality, we adopt the I/Q transformation of complex channels as the input of the first CL layer. Each CL layer creates one or more convolution kernels that are convolved with the layer input, and the parameters of the convolution kernels are shared among different channel coefficients. Note that the samples are fed into the neural network module in batches.
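The two input transformations can be sketched in a few lines of NumPy. The function names and the choice of stacking the two real vectors along a new leading axis are assumptions made for illustration; the tensor layout expected by a particular CNN implementation may differ.

```python
import numpy as np

def iq_transform(h):
    """I/Q transformation: stack real (in-phase) and imaginary (quadrature) parts."""
    h = np.asarray(h, dtype=complex)
    return np.stack([h.real, h.imag], axis=0)

def pm_transform(h):
    """P/M transformation: stack phase and magnitude."""
    h = np.asarray(h, dtype=complex)
    return np.stack([np.angle(h), np.abs(h)], axis=0)

def iq_inverse(x):
    """Rebuild the complex vector from its I/Q representation."""
    return x[0] + 1j * x[1]

h = np.array([1 + 2j, -3j, 0.5 - 0.5j])
x_iq = iq_transform(h)   # shape (2, 3): row 0 = real parts, row 1 = imaginary parts
x_pm = pm_transform(h)   # shape (2, 3): row 0 = phases, row 1 = magnitudes
print(x_iq)
print(x_pm)
```

Both representations are lossless, so either one can be inverted to recover the original complex channel vector.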
III-2 BN Layer
The BN layers are introduced in the neural network module, which can be put before or after the AC layers [49] according to practical experience. In the proposed framework, we adopt the former where the BN layers normalize the output of the CL layers through subtracting the batch mean and dividing by the batch standard deviation. Consequently, the BN operation introduces two trainable parameters, i.e., a “mean” parameter and a “standard deviation” parameter, in each BN layer. The denormalization is allowed by changing only the two parameters, instead of changing all parameters which may lead to the instability of the neural network module. Besides, the BN layer has the following advantages:

The probability of overfitting is reduced since the BN layer presents some regularization effects similar to dropout, by adding some noise to each AC layer.

The BN layer enables a higher learning rate, which can accelerate convergence, because the BN operation prevents the AC function from entering the gradient-insensitive region.

In addition, with the BN layer, the neural network is less sensitive to the initialization of weights.
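The normalization and the two trainable parameters described above can be illustrated with a minimal training-mode forward pass. This is a generic sketch of batch normalization, not the implementation used in the BNNs; the names `gamma` (the "standard deviation" parameter), `beta` (the "mean" parameter), and `eps` are assumptions.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Training-mode batch normalization over the batch axis (axis 0)."""
    mean = x.mean(axis=0)                      # batch mean per feature
    var = x.var(axis=0)                        # batch variance per feature
    x_hat = (x - mean) / np.sqrt(var + eps)    # normalized activations
    return gamma * x_hat + beta                # trainable scale and shift

rng = np.random.default_rng(0)
x = 3.0 * rng.standard_normal((64, 8)) + 5.0   # batch of 64 samples, 8 features

# With gamma = 1 and beta = 0 the output has roughly zero mean and unit variance.
y = batch_norm_forward(x, gamma=1.0, beta=0.0)

# Setting the two parameters to the batch statistics undoes the normalization.
x_back = batch_norm_forward(x, gamma=np.sqrt(x.var(axis=0) + 1e-5), beta=x.mean(axis=0))
```

The second call shows the denormalization mentioned above: only `gamma` and `beta` need to change to restore the original activations.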
III-3 Activation Layer
Since the predicted variables are continuous and positive real numbers, AC functions that can generate negative values, such as the tanh and linear functions, should not be used in the last AC layer. The rectified linear unit (ReLU) and sigmoid functions are good choices for the last AC layer. For the intermediate AC layers, the ReLU function generally shows better performance than other AC functions. However, if a BN layer is adopted before each AC layer, the sigmoid function can also work well because of the normalization operation introduced by the BN layer.
III-4 Flatten Layer and Output Layer
The flatten layer is only used to change the shape of its input into the correct format, i.e., a vector, for the FC layer to interpret. The main function of the output layer is to generate the predicted results after the neural network finishes training.
Note that apart from these functional layers, the loss function, which is marked on the output layer in Fig. 1, also plays an important role in the proposed framework. The loss function together with the learning rate guides the learning process of the neural network. In other words, the loss function “tells” the neural network how to update its parameters. Since the output values are continuous, it is suggested to utilize the mean absolute error (MAE) or the MSE as a metric. Given that the predicted result of the $l$-th sample in the neural network module is $\hat{\mathbf{q}}^{(l)}$ and the target result is $\mathbf{q}^{(l)}$, the MAE and MSE are defined as
$$\mathrm{MAE} = \frac{1}{LK} \sum_{l=1}^{L} \left\| \hat{\mathbf{q}}^{(l)} - \mathbf{q}^{(l)} \right\|_1 \tag{6}$$
and
$$\mathrm{MSE} = \frac{1}{LK} \sum_{l=1}^{L} \left\| \hat{\mathbf{q}}^{(l)} - \mathbf{q}^{(l)} \right\|_2^2, \tag{7}$$
respectively, where $L$ is the size of a batch, i.e., the number of samples fed into the neural network module for each training step.
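For a quick numerical check of the two metrics, the sketch below computes the MAE and MSE over a batch of predictions. Averaging over both the batch and the output elements is an assumption made here; a different normalization constant only rescales the loss and does not change its minimizer.

```python
import numpy as np

def mae(pred, target):
    """Mean absolute error averaged over the batch and the output elements."""
    return float(np.mean(np.abs(np.asarray(pred) - np.asarray(target))))

def mse(pred, target):
    """Mean squared error averaged over the batch and the output elements."""
    return float(np.mean((np.asarray(pred) - np.asarray(target)) ** 2))

# Batch of 2 samples with 2 outputs each; element-wise errors are 0, 1, 2, 3.
pred = np.array([[1.0, 2.0], [3.0, 4.0]])
target = np.ones((2, 2))

print(mae(pred, target))   # 1.5
print(mse(pred, target))   # 3.5
```

The MSE penalizes the largest error (3) far more heavily than the MAE does, which is the usual reason for preferring one metric over the other.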
The beamforming recovery module is an important component whose aim is to recover the beamforming matrix from the predicted key features at the output layer. The functional layers in the beamforming recovery module are designed according to the expert knowledge of the beamforming optimization, which maps the key features to the beamforming matrix. The expert knowledge is problem-dependent and has no unified form, but what is in common is that it can significantly reduce the number of variables to be predicted compared to the beamforming matrix. For example, the uplink-downlink duality and specific solution structures [3] are typical expert knowledge for beamforming optimization.
In what follows we propose three BNNs under the proposed framework for problems P1, P2, and P3, respectively, and provide implementation details to show how to make use of the expert knowledge.
IV BNN for SINR Balancing Problem
As mentioned above, estimating the beamforming matrix directly leads to high prediction complexity due to the large number of variables. In order to reduce the prediction complexity, we introduce a scheme which first predicts the power allocation vector as the key feature and then obtains the corresponding beamforming matrix based on the predicted results. Such a scheme is based on the expert knowledge known as the uplink-downlink duality.
IV-A Uplink-Downlink Duality
Before we present the BNN for the SINR balancing problem P1, we first introduce the following lemma to describe the uplink-downlink duality of problem P1 [14].
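Lemma 1.
Given the normalized beamforming matrix $\tilde{\mathbf{W}} = [\tilde{\mathbf{w}}_1, \ldots, \tilde{\mathbf{w}}_K]$ and $P_{\max}$, we have
$$C^{\mathrm{DL}}(\tilde{\mathbf{W}}, P_{\max}) = C^{\mathrm{UL}}(\tilde{\mathbf{W}}, P_{\max}), \tag{8}$$
where $C^{\mathrm{DL}}(\tilde{\mathbf{W}}, P_{\max})$ and $C^{\mathrm{UL}}(\tilde{\mathbf{W}}, P_{\max})$ are given as
$$C^{\mathrm{DL}}(\tilde{\mathbf{W}}, P_{\max}) = \max_{\mathbf{p}} \min_{1 \le k \le K} \frac{\gamma_k^{\mathrm{dl}}(\tilde{\mathbf{W}}, \mathbf{p})}{\rho_k} \tag{9}$$
$$\text{s.t.} \quad \|\mathbf{p}\|_1 \le P_{\max},$$
and
$$C^{\mathrm{UL}}(\tilde{\mathbf{W}}, P_{\max}) = \max_{\mathbf{q}} \min_{1 \le k \le K} \frac{\gamma_k^{\mathrm{ul}}(\tilde{\mathbf{w}}_k, \mathbf{q})}{\rho_k} \tag{10}$$
$$\text{s.t.} \quad \|\mathbf{q}\|_1 \le P_{\max},$$
respectively, with
$$\gamma_k^{\mathrm{dl}}(\tilde{\mathbf{W}}, \mathbf{p}) = \frac{p_k |\mathbf{h}_k^H \tilde{\mathbf{w}}_k|^2}{\sum_{k' \neq k} p_{k'} |\mathbf{h}_k^H \tilde{\mathbf{w}}_{k'}|^2 + \sigma^2} \tag{11}$$
and
$$\gamma_k^{\mathrm{ul}}(\tilde{\mathbf{w}}_k, \mathbf{q}) = \frac{q_k |\mathbf{h}_k^H \tilde{\mathbf{w}}_k|^2}{\sum_{k' \neq k} q_{k'} |\mathbf{h}_{k'}^H \tilde{\mathbf{w}}_k|^2 + \sigma^2}. \tag{12}$$
Note that $\mathbf{p} = [p_1, \ldots, p_K]^T$ and $\mathbf{q} = [q_1, \ldots, q_K]^T$ are the downlink and uplink power vectors, respectively.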
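Note that problem (9) is an equivalent virtual problem of problem P1 whose optimal solutions are connected by $\mathbf{W}^* = \tilde{\mathbf{W}}^* (\mathbf{P}^*)^{1/2}$ where $\mathbf{P}^* = \mathrm{diag}(\mathbf{p}^*)$, $\mathbf{W}^*$ is the optimal solution to problem P1, and $\tilde{\mathbf{W}}^*$ and $\mathbf{p}^*$ are the optimal solutions to problem (9). Based on Lemma 1, we find that the uplink and downlink scenarios have the same achievable SINR region and the normalized beamforming designed for the uplink reception immediately carries over to the downlink transmission [14]. Thus we first obtain the optimal power allocation $\mathbf{q}^*$ and beamforming matrix $\tilde{\mathbf{W}}^*$ for the easy-to-solve uplink problem (10). Then, given the optimal beamforming $\tilde{\mathbf{W}}^*$, the optimal $\mathbf{p}^*$ is obtained as the first $K$ components of the dominant eigenvector (scaled so that its last component equals one) of the following matrix [50]
$$\boldsymbol{\Upsilon}(\tilde{\mathbf{W}}^*, P_{\max}) = \begin{bmatrix} \mathbf{D}\mathbf{U} & \mathbf{D}\boldsymbol{\sigma} \\ \frac{1}{P_{\max}} \mathbf{1}^T \mathbf{D}\mathbf{U} & \frac{1}{P_{\max}} \mathbf{1}^T \mathbf{D}\boldsymbol{\sigma} \end{bmatrix}, \tag{13}$$
where $\mathbf{D} = \mathrm{diag}\left\{ \rho_1 / |\mathbf{h}_1^H \tilde{\mathbf{w}}_1^*|^2, \ldots, \rho_K / |\mathbf{h}_K^H \tilde{\mathbf{w}}_K^*|^2 \right\}$, $\mathbf{1} = [1, \ldots, 1]^T$, $\boldsymbol{\sigma} = \sigma^2 \mathbf{1}$, and
$$[\mathbf{U}]_{k,k'} = \begin{cases} |\mathbf{h}_k^H \tilde{\mathbf{w}}_{k'}^*|^2, & k' \neq k, \\ 0, & k' = k. \end{cases} \tag{14}$$
Finally, the downlink beamforming matrix is derived as $\mathbf{W}^* = \tilde{\mathbf{W}}^* (\mathbf{P}^*)^{1/2}$. Thus, instead of predicting $\mathbf{W}^*$ directly, we can predict the uplink power allocation vector $\mathbf{q}^*$.
In the supervised learning method, the prediction performance of the BNN depends on the quality of training samples. To generate the training samples, the optimal $\tilde{\mathbf{W}}^*$ and $\mathbf{q}^*$ can be found by the iterative optimization algorithm in [14, Table 1].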
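The duality stated in Lemma 1 can be checked numerically for a fixed normalized beamforming matrix. The sketch below builds the extended coupling matrix for the downlink and, as an assumption following the standard duality construction, an uplink counterpart that uses the transposed cross-coupling matrix. The function names, the random unit-norm beamformers, and the parameter values are all illustrative.

```python
import numpy as np

def couplings(H, W_tilde):
    """Signal gains g_k = |h_k^H w~_k|^2 and cross-coupling matrix U with
    U[k, j] = |h_k^H w~_j|^2 for j != k and zeros on the diagonal."""
    G = np.abs(H.conj().T @ W_tilde) ** 2
    gains = np.diag(G).copy()
    return gains, G - np.diag(gains)

def extended_coupling(D, Psi, noise_var, P_max):
    """Extended matrix [[D Psi, D s], [1^T D Psi / P_max, 1^T D s / P_max]]."""
    K = D.shape[0]
    s = noise_var * np.ones((K, 1))
    top = np.hstack([D @ Psi, D @ s])
    return np.vstack([top, np.ones((1, K)) @ top / P_max])

def dominant_eig(M):
    """Largest eigenvalue and its eigenvector, scaled so the last entry is 1."""
    vals, vecs = np.linalg.eig(M)
    i = int(np.argmax(vals.real))
    v = vecs[:, i].real
    return float(vals[i].real), v / v[-1]

rng = np.random.default_rng(1)
N, K, noise_var, P_max = 4, 3, 1.0, 10.0
rho = np.array([1.0, 2.0, 0.5])
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
V = rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))
W_tilde = V / np.linalg.norm(V, axis=0)        # arbitrary unit-norm beamformers

gains, U = couplings(H, W_tilde)
D = np.diag(rho / gains)

lam_dl, v_dl = dominant_eig(extended_coupling(D, U, noise_var, P_max))
lam_ul, v_ul = dominant_eig(extended_coupling(D, U.T, noise_var, P_max))
p, q = v_dl[:K], v_ul[:K]                      # downlink and uplink power vectors

sinr_dl = p * gains / (U @ p + noise_var)
sinr_ul = q * gains / (U.T @ q + noise_var)
print("balanced level (downlink):", 1.0 / lam_dl)
print("balanced level (uplink):  ", 1.0 / lam_ul)
```

Both links reach the same balanced level, the reciprocal of the dominant eigenvalue, while each power vector sums to the power budget.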
IV-B BNN Structure
The proposed BNN for the SINR balancing problem P1, shown in Fig. 2, is based on the proposed BNN framework in Fig. 1. The functions and operations of the basic layers, such as the input, CL, BN, and output layers, are the same as those in the proposed framework. Therefore, we do not explain these layers here and readers can refer to Section III for details. Note that in the proposed BNN for problem P1, the intermediate AC layers use the ReLU function whereas the last AC layer is implemented using the sigmoid function. Besides the existing layers in the framework, a scaling layer and a conversion layer are also introduced in the BNN for problem P1; both belong to the beamforming recovery module. In the following, we give the details of the scaling layer and the conversion layer.
IV-B1 Scaling Layer
Due to the limitations of the BNN, it is almost impossible to guarantee that the output of the output layer, denoted by $\hat{\mathbf{q}}$, always meets the power constraint in the SINR balancing problem P1. As we know, the optimal solution is achieved when the power constraint in problem P1 holds with equality. Therefore, we scale the results of the output layer to meet the power constraint by the following transformation,
$$\hat{\mathbf{q}}^* = \frac{P_{\max}}{\|\hat{\mathbf{q}}\|_1} \hat{\mathbf{q}}. \tag{15}$$
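IV-B2 Conversion Layer
After receiving the scaled power allocation vector $\hat{\mathbf{q}}^*$, the conversion layer produces the downlink beamforming matrix $\hat{\mathbf{W}}$ as the final output of the BNN based on $\hat{\mathbf{q}}^*$. The beamforming recovery implemented by the conversion layer includes the following process:

Calculate $\mathbf{T} = \sigma^2 \mathbf{I}_N + \sum_{k=1}^{K} \hat{q}_k^* \mathbf{h}_k \mathbf{h}_k^H$.

Calculate $\tilde{\mathbf{W}} = [\tilde{\mathbf{w}}_1, \ldots, \tilde{\mathbf{w}}_K]$ where $\tilde{\mathbf{w}}_k = \mathbf{T}^{-1}\mathbf{h}_k / \|\mathbf{T}^{-1}\mathbf{h}_k\|_2$.

Find the maximal eigenvalue $\lambda_{\max}$ of $\boldsymbol{\Upsilon}(\tilde{\mathbf{W}}, P_{\max})$ and the associated eigenvector with respect to $\lambda_{\max}$, i.e., $\boldsymbol{\Upsilon}(\tilde{\mathbf{W}}, P_{\max}) [\hat{\mathbf{p}}^T, 1]^T = \lambda_{\max} [\hat{\mathbf{p}}^T, 1]^T$.

Output $\hat{\mathbf{W}} = \tilde{\mathbf{W}} \hat{\mathbf{P}}^{1/2}$ as the final result where $\hat{\mathbf{P}} = \mathrm{diag}(\hat{\mathbf{p}})$.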
In the proposed BNN for the SINR balancing problem P1, the supervised learning with the loss function based on the MSE metric is adopted.
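The scaling and conversion layers for problem P1 can be sketched end to end in NumPy. The closed form used for the normalized beamformers (an MMSE-type receive filter built from the scaled uplink powers) is an assumption consistent with the uplink-downlink duality, and the function names, the random stand-in for the network output `q_hat`, and the parameter values are hypothetical.

```python
import numpy as np

def uplink_beamformers(H, q, noise_var):
    """Unit-norm MMSE-type receive beamformers for uplink powers q (assumed form)."""
    N = H.shape[0]
    T = noise_var * np.eye(N) + (H * q) @ H.conj().T   # sigma^2 I + sum_k q_k h_k h_k^H
    V = np.linalg.solve(T, H)                          # columns T^{-1} h_k
    return V / np.linalg.norm(V, axis=0)

def recover_p1_beamforming(H, q_hat, rho, noise_var, P_max):
    """Scaling layer followed by the conversion layer for the SINR balancing problem."""
    K = H.shape[1]
    q = P_max * q_hat / np.sum(q_hat)                  # scale to the power budget
    W_tilde = uplink_beamformers(H, q, noise_var)

    G = np.abs(H.conj().T @ W_tilde) ** 2              # G[k, j] = |h_k^H w~_j|^2
    gains = np.diag(G).copy()
    U = G - np.diag(gains)
    D = np.diag(rho / gains)
    s = noise_var * np.ones((K, 1))
    top = np.hstack([D @ U, D @ s])
    Upsilon = np.vstack([top, np.ones((1, K)) @ top / P_max])

    vals, vecs = np.linalg.eig(Upsilon)
    v = vecs[:, int(np.argmax(vals.real))].real        # dominant eigenvector
    p = v[:K] / v[K]                                   # downlink powers
    return W_tilde * np.sqrt(p)

rng = np.random.default_rng(2)
N, K, noise_var, P_max = 4, 3, 1.0, 10.0
rho = np.ones(K)
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
q_hat = rng.uniform(0.1, 1.0, K)                       # stand-in for the sigmoid output

W = recover_p1_beamforming(H, q_hat, rho, noise_var, P_max)
G = np.abs(H.conj().T @ W) ** 2
sinr = np.diag(G) / (G.sum(axis=1) - np.diag(G) + noise_var)
print("total power:", np.sum(np.abs(W) ** 2))
print("weighted SINRs:", sinr / rho)
```

Whatever the quality of the predicted uplink powers, the recovered beamforming matrix uses the full power budget and balances the weighted SINRs; prediction errors only lower the common level that is reached.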
V BNN for Power Minimization Problem
Similar to the BNN for the SINR balancing problem P1, the BNN for the power minimization problem P2 obtains the downlink beamforming matrix according to the uplink-downlink duality, i.e., the expert knowledge. Specifically, we first predict the uplink power allocation vector as the key feature using the trained neural network, and then obtain the normalized beamforming matrix based on the predicted results. Finally, the downlink beamforming matrix is recovered from the normalized beamforming matrix by the uplink-downlink conversion method.
V-A Uplink-Downlink Duality
Note that the conversion method adopted in the BNN for the SINR balancing problem P1 cannot be used again, because the power budget is unknown in the power minimization problem P2. Instead, we employ the conversion method in the following lemma [47].
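Lemma 2.
Given the optimal beamforming matrix $\tilde{\mathbf{W}}^*$ for the uplink problem, i.e.,
$$\min_{\mathbf{q}, \tilde{\mathbf{W}}} \ \|\mathbf{q}\|_1 \tag{16}$$
$$\text{s.t.} \quad \gamma_k^{\mathrm{ul}}(\tilde{\mathbf{w}}_k, \mathbf{q}) \ge \Gamma_k, \ \forall k,$$
where $\gamma_k^{\mathrm{ul}}$ is given as in (12),
the optimal beamforming vectors for the downlink problem P2 can be obtained by multiplying the optimal normalized beamforming vector by a scaling factor, i.e., $\mathbf{w}_k^* = \sqrt{p_k^*}\, \tilde{\mathbf{w}}_k^*$, where $p_k^*$ is the $k$-th element of vector $\mathbf{p}^*$ and
$$\mathbf{p}^* = \sigma^2 \mathbf{M}^{-1} \mathbf{1}, \tag{17}$$
where
$$[\mathbf{M}]_{k,k'} = \begin{cases} \dfrac{1}{\Gamma_k} |\mathbf{h}_k^H \tilde{\mathbf{w}}_k^*|^2, & k' = k, \\ -|\mathbf{h}_k^H \tilde{\mathbf{w}}_{k'}^*|^2, & k' \neq k. \end{cases} \tag{18}$$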
The vector $\mathbf{p}^*$ of the scaling factors is the optimal downlink power allocation vector. Given the optimal normalized beamforming matrix $\tilde{\mathbf{W}}^*$, Lemma 2 allows us to obtain the optimal downlink power vector $\mathbf{p}^*$ by (17), and then $\mathbf{w}_k^* = \sqrt{p_k^*}\, \tilde{\mathbf{w}}_k^*$. In fact, if we know the uplink power allocation vector $\mathbf{q}^*$, the normalized beamforming matrix can be inferred as
$$\tilde{\mathbf{w}}_k^* = \frac{\mathbf{T}^{-1} \mathbf{h}_k}{\|\mathbf{T}^{-1} \mathbf{h}_k\|_2}, \quad \forall k, \tag{19}$$
where $\mathbf{T} = \sigma^2 \mathbf{I}_N + \sum_{k=1}^{K} q_k^* \mathbf{h}_k \mathbf{h}_k^H$. Therefore, the only result that needs to be predicted by the BNN is the uplink power allocation vector $\mathbf{q}^*$, which significantly reduces the computational complexity compared to a strategy that attempts to predict the beamforming matrix directly. The iterative algorithm in [6] provides a way to obtain the optimal uplink power allocation vector for the training samples in the supervised learning method.
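V-B BNN Structure
The BNN for the power minimization problem P2 in Fig. 3 is also based on the proposed BNN framework. However, the operations of the conversion layer in Fig. 3 are different from those in the BNN for problem P1. After receiving the uplink power allocation vector $\hat{\mathbf{q}}$ from the output layer, the beamforming recovery in the conversion layer performs the following operations:

Calculate $\mathbf{T} = \sigma^2 \mathbf{I}_N + \sum_{k=1}^{K} \hat{q}_k \mathbf{h}_k \mathbf{h}_k^H$.

Calculate $\tilde{\mathbf{W}} = [\tilde{\mathbf{w}}_1, \ldots, \tilde{\mathbf{w}}_K]$ where $\tilde{\mathbf{w}}_k = \mathbf{T}^{-1}\mathbf{h}_k / \|\mathbf{T}^{-1}\mathbf{h}_k\|_2$.

Calculate the downlink power allocation vector $\hat{\mathbf{p}} = \sigma^2 \mathbf{M}^{-1} \mathbf{1}$, with $\mathbf{M}$ built from $\tilde{\mathbf{W}}$ as in (18).

Output the downlink beamforming vectors $\hat{\mathbf{w}}_k = \sqrt{\hat{p}_k}\, \tilde{\mathbf{w}}_k$ as the final results.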
Note that the power vector $\hat{\mathbf{q}}$ predicted by the BNN is, in general, not exact. The prediction error leads to inaccuracy in the power allocation vector $\hat{\mathbf{p}}$ as well as in the downlink beamforming $\hat{\mathbf{W}}$. More specifically, if the predicted power vector $\hat{\mathbf{q}}$ has an acceptable accuracy with respect to the target power vector $\mathbf{q}^*$, i.e., $\|\hat{\mathbf{q}} - \mathbf{q}^*\| \le \epsilon$ where $\epsilon$ is a small constant, then we obtain a suboptimal solution whose objective value is larger than that of the optimal solution, i.e., $\|\hat{\mathbf{p}}\|_1 \ge \|\mathbf{p}^*\|_1$. Intuitively, the extra power consumption can be regarded as the cost of the prediction error. However, if the predicted vector $\hat{\mathbf{q}}$ has a significant error, i.e., $\|\hat{\mathbf{q}} - \mathbf{q}^*\| > \epsilon$, the downlink beamforming inferred from the prediction may become infeasible since some elements of the vector $\hat{\mathbf{p}}$ have negative values. This suggests that, unlike for problem P1, there is a certain probability that the BNN prediction for problem P2 is infeasible. However, our experiments show that the failure probability of the proposed BNN for problem P2 is low in most settings. More details will be given in Section VII. Moreover, supervised learning with the loss function based on the MSE metric is adopted in the proposed BNN for problem P2.
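The conversion layer for problem P2 and the feasibility issue discussed above can be sketched as follows. The fixed-point iteration that generates the reference uplink powers is a generic stand-in and is not claimed to be the algorithm of [6]; the linear system for the downlink powers follows from requiring every SINR constraint to hold with equality. All function names and parameter values are assumptions.

```python
import numpy as np

def uplink_beamformers(H, q, noise_var):
    """Unit-norm beamformers T^{-1} h_k with T = sigma^2 I + sum_k q_k h_k h_k^H."""
    N = H.shape[0]
    T = noise_var * np.eye(N) + (H * q) @ H.conj().T
    V = np.linalg.solve(T, H)
    return V / np.linalg.norm(V, axis=0)

def reference_uplink_power(H, Gamma, noise_var, iters=500):
    """Fixed-point iteration for the uplink powers (stand-in for a training-label solver)."""
    N, K = H.shape
    q = np.ones(K)
    for _ in range(iters):
        T = noise_var * np.eye(N) + (H * q) @ H.conj().T
        q_new = np.empty(K)
        for k in range(K):
            h = H[:, k]
            T_k = T - q[k] * np.outer(h, h.conj())     # drop user k's own term
            q_new[k] = Gamma[k] / np.real(h.conj() @ np.linalg.solve(T_k, h))
        q = q_new
    return q

def recover_p2_beamforming(H, q_hat, Gamma, noise_var):
    """Conversion layer: uplink powers -> downlink beamformers plus feasibility flag."""
    K = H.shape[1]
    W_tilde = uplink_beamformers(H, q_hat, noise_var)
    G = np.abs(H.conj().T @ W_tilde) ** 2
    M = -G
    M[np.diag_indices(K)] = np.diag(G) / Gamma
    p = noise_var * np.linalg.solve(M, np.ones(K))     # SINR constraints with equality
    feasible = bool(np.all(p > 0))                     # negative power means failure
    return W_tilde * np.sqrt(np.maximum(p, 0.0)), p, feasible

rng = np.random.default_rng(3)
N, K, noise_var = 4, 3, 1.0
Gamma = np.ones(K)                                     # 0 dB SINR targets
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)

q_ref = reference_uplink_power(H, Gamma, noise_var)
W, p, ok = recover_p2_beamforming(H, q_ref, Gamma, noise_var)
G = np.abs(H.conj().T @ W) ** 2
sinr = np.diag(G) / (G.sum(axis=1) - np.diag(G) + noise_var)

# Emulate a small prediction error on the uplink powers.
q_pert = q_ref * (1.0 + 0.01 * rng.standard_normal(K))
W_pert, p_pert, ok_pert = recover_p2_beamforming(H, q_pert, Gamma, noise_var)
print("feasible:", ok, "power:", p.sum(), "power with error:", p_pert.sum())
```

With exact uplink powers the downlink and uplink total powers coincide, while a small prediction error still meets the SINR targets but costs extra power. The returned flag makes the infeasible case (a negative entry in the downlink power vector) explicit.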
VI BNN for Sum Rate Maximization Problem
Unlike the SINR balancing problem P1 and the power minimization problem P2, whose optimal solutions are available for supervised learning, the optimal solution to the sum rate maximization problem P3 is still unknown, and thus the uplink-downlink duality cannot be used directly. However, we will exploit a connection between problems P2 and P3 to find some key features of the optimal solution to problem P3.
Via Solution Structure
It was suggested in [51] that the optimal solution to problem P2, using the minimal amount of power to achieve the given SINR targets, must meet the power constraint in problem P3. In this case the beamforming matrix resulting from problem P2 is feasible for problem P3 and also achieves the maximal sum rate. According to the connection between problems P2 and P3, it has been pointed out in [3] that the optimal downlink beamforming vectors for problem P3 follows the structure as
(20) 
where is a positive parameter and according to the strong duality of problem P2. This is because is the optimal cost function in problem P2 and is the dual function. Note that the parameter vector can be considered as a virtual power allocation vector. The solution structure in (20) provides the required expert knowledge for the beamforming design in problem P3 and and are the key features. But to our best knowledge, there is no lowcomplexity algorithm in the literature that can find the optimal and in (20). The WMMSE algorithm is a good choice to find the suboptimal solutions [11, 12]. Therefore, we can obtain the power allocation vectors and according to the WMMSE algorithm. The supervised learning with the loss function based on the MSE metric will be first used to achieve as close to the results of the WMMSE algorithm as possible, i.e.,
(21) 
where the target power vectors are obtained from the WMMSE algorithm and their counterparts are the results predicted by the BNN. It is worth pointing out that the results in the training samples of problems P1 and P2 are optimal. The MSE-based loss function is therefore equivalent to the objective function, and the supervised learning method updates the network parameters in the direction of the optimal solution. However, the WMMSE algorithm for problem P3 is suboptimal, so (21) is not equivalent to the real objective of problem P3, which is to maximize the weighted sum rate. To further improve the sum rate performance, we continue to train the BNN in an unsupervised way, with a loss function that takes the objective function directly as the metric, i.e.,
(22) 
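The two loss functions in (21) and (22) can be sketched numerically as follows. This is a hedged illustration rather than the exact expressions of the paper: the function names, the row-wise layout of the channel and beamforming matrices, and the sign convention (negating the weighted sum rate so that minimizing the loss maximizes the rate) are assumptions.

```python
import numpy as np

def mse_loss(pred, target):
    """Supervised metric: mean squared error between prediction and label."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))

def weighted_sum_rate(H, W, noise_power=1.0, weights=None):
    """Weighted sum rate in bit/s/Hz.

    H: (K, N) array, row k is the channel h_k (assumed layout).
    W: (K, N) array, row j is the beamforming vector w_j.
    """
    G = np.abs(H.conj() @ W.T) ** 2          # G[k, j] = |h_k^H w_j|^2
    signal = np.diag(G)
    interference = G.sum(axis=1) - signal
    sinr = signal / (interference + noise_power)
    if weights is None:
        weights = np.ones(len(sinr))          # equal user priority
    return float(np.sum(weights * np.log2(1.0 + sinr)))

def unsupervised_loss(H, W, noise_power=1.0, weights=None):
    """Unsupervised metric: the negated objective of the sum rate problem."""
    return -weighted_sum_rate(H, W, noise_power, weights)

# Two orthogonal users, each received with SINR 3, i.e. 2 bit/s/Hz per user.
H_demo = np.eye(2, dtype=complex)
W_demo = np.sqrt(3.0) * np.eye(2, dtype=complex)
print(mse_loss([1.0, 2.0], [1.0, 4.0]), unsupervised_loss(H_demo, W_demo))
```

The supervised loss only measures the distance to the WMMSE labels, whereas the second loss depends on the channel and the resulting beamformers, which is why it can push the network beyond the quality of its labels.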
VI-B Hybrid BNN Structure
The BNN for the sum rate maximization problem P3 is presented in Fig. 4. The major difference from the BNNs in Figs. 2 and 3 is that the BNN in Fig. 4 has two stages of training. The first stage performs pre-training using the supervised learning method with the MSE-based loss function (21), while the second stage performs enhanced training using the unsupervised learning method with the loss function whose metric is the objective function (22). Such a hybrid of supervised and unsupervised learning can significantly improve the learning performance and also accelerate convergence [33]. More specifically, the pre-training, which approximates the WMMSE algorithm, starts from a random initialization of the neural network parameters and uses the loss function (21). After the pre-training is finished, the neural network parameters are retained and the loss function is replaced by (22), so that the second-stage training can achieve at least the same performance as the WMMSE algorithm.
Unlike the BNNs in Figs. 2 and 3, the output layer in Fig. 4 generates the values of both power allocation vectors. The scaling layer then scales these outputs to meet the power constraint as follows:
(23) 
Finally, the construction layer constructs the downlink beamforming vectors according to (20):
(24) 
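The scaling and construction layers can be sketched as below. This is an illustration under assumptions, not the exact layers of Fig. 4: the proportional rescaling to the power budget is one plausible reading of (23), the construction step uses the well-known structure reported in [3] (a regularized channel inversion followed by normalization and power loading), and all names, the noise normalization, and the row-wise channel layout are hypothetical.

```python
import numpy as np

def scale_to_budget(x, p_max):
    """Rescale a non-negative vector so that its elements sum to p_max."""
    x = np.asarray(x, dtype=float)
    return p_max * x / np.sum(x)

def construct_beamformers(H, lam, p, noise_power=1.0):
    """Build beamformers from virtual powers lam and downlink powers p.

    H: (K, N) array, row k is the channel h_k (assumed layout).
    Returns W with row k equal to sqrt(p_k) times a unit-norm direction.
    """
    K, N = H.shape
    lam = np.asarray(lam, dtype=float)
    p = np.asarray(p, dtype=float)
    # A = I_N + sum_i (lam_i / noise) * h_i h_i^H
    A = np.eye(N, dtype=complex) + (H.T * (lam / noise_power)) @ H.conj()
    D = np.linalg.solve(A, H.T)                        # column k is A^{-1} h_k
    D = D / np.linalg.norm(D, axis=0, keepdims=True)   # unit-norm directions
    return (np.sqrt(p) * D).T

# Hypothetical raw network outputs for K = 3 users and N = 4 antennas.
rng = np.random.default_rng(0)
H_demo = (rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))) / np.sqrt(2)
P_MAX = 10.0
lam_demo = scale_to_budget(rng.random(3) + 0.1, P_MAX)
p_demo = scale_to_budget(rng.random(3) + 0.1, P_MAX)
W_demo = construct_beamformers(H_demo, lam_demo, p_demo)
print(W_demo.shape, float(np.sum(np.abs(W_demo) ** 2)))
```

Because each direction has unit norm, the total transmit power equals the sum of the downlink powers, so the rescaling step alone is enough to satisfy the power constraint in this sketch.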
VII Simulation Results
To evaluate the performance of the proposed BNNs, we carry out numerical simulations to compare the BNNs with several benchmark solutions (when available), including the optimal beamforming, the ZF beamforming, the RZF beamforming, and the WMMSE algorithm. We consider a downlink transmission scenario where the BS is equipped with antennas and its coverage is a disc with a radius of 500 m. There are single-antenna users and these users are distributed uniformly within the coverage of the BS. Note that none of these users is closer to the BS than 100 m. The path loss (in dB) between a user and the BS follows the model in [52] as a function of the distance in km. The noise power spectral density is dBm/Hz and the total system bandwidth is 20 MHz. Without loss of generality, we assume that all the substreams have the same importance and all the users have the same priority. In addition, perfect CSI is assumed to be available at the BS.
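The geometry and noise settings above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the spectral density of -174 dBm/Hz used in the usage line is an assumed typical thermal-noise value, not a figure taken from this paper. Users are dropped uniformly by area in the ring between 100 m and 500 m, and the noise power follows from the spectral density and the 20 MHz bandwidth.

```python
import math
import random

def sample_user_distance_m(rng, r_min=100.0, r_max=500.0):
    """Distance to the BS for a user dropped uniformly by area in a ring."""
    u = rng.random()
    # Inverse-CDF sampling: the area grows with the square of the radius.
    return math.sqrt(r_min ** 2 + u * (r_max ** 2 - r_min ** 2))

def noise_power_dbm(psd_dbm_per_hz, bandwidth_hz):
    """Total noise power in dBm over the given bandwidth."""
    return psd_dbm_per_hz + 10.0 * math.log10(bandwidth_hz)

rng = random.Random(0)
distances = [sample_user_distance_m(rng) for _ in range(1000)]

# Assumed spectral density of -174 dBm/Hz over the 20 MHz system bandwidth.
noise_dbm = noise_power_dbm(-174.0, 20e6)
print(round(min(distances), 1), round(max(distances), 1), round(noise_dbm, 2))
```

Sampling the squared radius uniformly, rather than the radius itself, avoids clustering users near the BS.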
In our simulation, we prepare 20000 training samples and 5000 testing samples. All the BNNs have one input layer, two BN layers, two CL layers, three AC layers, one flatten layer, one FC layer, and one output layer. The number of neurons in the FC layer of the BNNs for problems P1 and P2 differs from that in the BNN for problem P3. In addition, each CL layer has 8 kernels and the first two AC layers adopt the ReLU function. The Adam optimizer is used with the MSE metric-based loss function. However, in the second stage of the BNN for problem P3, the metric of the loss function becomes the sum rate. Note that the last AC layer can be the ReLU or the sigmoid function. Here, we adopt the sigmoid function, so the target outputs in the training and testing samples must be normalized.
VII-A BNN for SINR Balancing Problem
We first consider the BNN for the SINR balancing problem P1, which updates the network parameters in a supervised way. The iterative algorithm in [14, Table 1] is used to generate the training and testing samples. Fig. 5 shows the SINR performance averaged over 5000 samples in two cases: one considering only the small-scale fading and the other considering both the small-scale fading and the large-scale fading. In both cases, the SINR performance of the proposed BNN solution is very close to that of the optimal solution [14]. There is an obvious gap between the optimal solution and the ZF beamforming in the low signal-to-noise ratio (SNR) regime of Fig. 5(a) as well as in the low transmit-power regime of Fig. 5(b). However, the gap decreases as the SNR or transmit power increases.
To further compare the SINR performance of the optimal solution, the ZF beamforming, the RZF beamforming, and the BNN solution, we evaluate the output SINR in Fig. 6, assuming that the number of users equals the number of BS antennas and that both increase together. The BNN solution has some performance loss compared to the optimal solution due to the estimation error, but it always achieves a better performance than the ZF beamforming and the RZF beamforming. This fact indicates the application prospect of the BNN: the computational complexity and time of the BNN solution are similar to those of the ZF beamforming and the RZF beamforming, but much lower than those of the optimal solution, because the optimal solution relies on an iterative process. We also find that the SINR performance of all four solutions decreases as the number of transmit antennas (and users) increases, and that the ZF beamforming suffers the largest performance loss among the four.
In Fig. 7, we demonstrate the generality of the proposed BNN by fixing the number of users and the transmit power and showing the SINR performance for different numbers of transmit antennas. We train only a single BNN for one antenna and user configuration, but allow the number of transmit antennas to vary from 4 to 10 when using the trained BNN. It can be seen that the predicted results are very close to those of the optimal solution. This fact suggests the generality of the BNN, i.e., we can train a large BNN with more antennas, which will also work for cases with fewer antennas without retraining. This will be useful when some transmit antennas of the BS are malfunctioning or turned off.
VII-B BNN for Power Minimization Problem
In this subsection, we consider the BNN for the power minimization problem P2, which also updates the network parameters in a supervised way. The iterative algorithm in [14, Table 1] is used to generate the training and testing samples. We first investigate the effect of the SINR constraints of the users on the power consumption. For convenience of comparison, we assume that the SINR constraints of all users are the same. In Fig. 8, we compare the power performance of the optimal beamforming, the ZF beamforming, and the beamforming obtained by the BNN. Note that both Figs. 8(a) and 8(b) have two Y-axes, where the left Y-axis measures the transmit power (or SNR) and the right Y-axis shows the feasibility of the BNN. As mentioned in Section V, the BNN may fail to find a feasible solution to problem P2 if the prediction error is unacceptable. Figs. 8(a) and 8(b) present the power/SNR performance in the cases without and with consideration of the large-scale fading, respectively. In both cases, the power/SNR performance of the BNN solution is close to that of the optimal solution. In the low SINR-constraint regime, the BNN solution significantly outperforms the ZF beamforming, whose power consumption is higher than that of the optimal solution. In addition, we find that the feasibility of the BNN solution in both cases is more than 99.4%.
To further compare the BNN solution with the optimal solution and the ZF beamforming, we plot their power performance and execution time per sample in Figs. 9(a) and 9(b), respectively. In Fig. 9, the number of BS antennas and the SINR target of the users are fixed. It is observed from Fig. 9(a) that as the number of users increases, the performance gap between the ZF beamforming and the optimal beamforming becomes large because more users share the array gain. The BNN solution shows a better performance than the ZF beamforming and has a feasibility of up to 99%. Fig. 9(b) demonstrates that, compared to the optimal solution, the BNN solution reduces the execution time per sample by two orders of magnitude, to a level that is only slightly longer than that of the ZF beamforming. This is because both the BNN solution and the ZF beamforming are obtained without an iterative process, but the BNN needs to execute the neural network operations as well as the conversion process. According to the results in Figs. 9(a) and 9(b), we can conclude that the BNN solution provides a good balance between performance and computational complexity.
VII-C BNN for Sum Rate Maximization Problem
In this subsection, we evaluate the performance of the BNN for the sum rate maximization problem P3 based on the proposed hybrid learning. The ZF and RZF beamforming with equal power allocation are introduced as two baseline solutions. The WMMSE algorithm with random initialization [11, 12] is used to generate the samples for the supervised learning in the first stage. First, Fig. 10 shows the sum rate performance averaged over 5000 samples in two different cases: the former case in Fig. 10(a) considers only the small-scale fading, and the latter case in Fig. 10(b) considers both the small-scale fading and the large-scale fading. The sum rate performance of all solutions increases as the transmit power/SNR increases. We observe that in both cases the proposed BNN solution based on the hybrid learning always achieves the best performance, whereas the performance of the supervised learning-based BNN solution is barely satisfactory. This is because the second stage of the hybrid learning method aims to maximize the sum rate, so its performance is bounded by the globally optimal solution to problem P3. In contrast, the BNN solution based on the supervised learning aims to get as close to the WMMSE solution as possible, so its performance is restricted by the WMMSE solution, which is verified in Figs. 10(a) and 10(b).
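The two baselines with equal power allocation can be sketched as follows. This is a minimal illustration under assumptions: the function names and the row-wise channel layout are hypothetical, and the RZF regularization factor (number of users times the noise power divided by the power budget) is a common textbook choice rather than a value confirmed by this paper.

```python
import numpy as np

def _equal_power(W_cols, p_max):
    """Normalize each beamformer and split the power budget equally."""
    K = W_cols.shape[1]
    W_cols = W_cols / np.linalg.norm(W_cols, axis=0, keepdims=True)
    return (np.sqrt(p_max / K) * W_cols).T    # row j is w_j

def zf_beamformers(H, p_max):
    """Zero-forcing: h_k^H w_j = 0 for k != j (needs N >= K, full rank).

    H: (K, N) array, row k is the channel h_k (assumed layout).
    """
    Hc = H.conj()                              # row k is h_k^H
    W_cols = Hc.conj().T @ np.linalg.inv(Hc @ Hc.conj().T)
    return _equal_power(W_cols, p_max)

def rzf_beamformers(H, p_max, noise_power=1.0):
    """Regularized zero-forcing with an assumed regularization factor."""
    K = H.shape[0]
    Hc = H.conj()
    alpha = K * noise_power / p_max            # assumption, see lead-in
    W_cols = Hc.conj().T @ np.linalg.inv(Hc @ Hc.conj().T + alpha * np.eye(K))
    return _equal_power(W_cols, p_max)

rng = np.random.default_rng(1)
H_demo = (rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))) / np.sqrt(2)
W_zf = zf_beamformers(H_demo, p_max=10.0)
W_rzf = rzf_beamformers(H_demo, p_max=10.0)

# Effective channel gains |h_k^H w_j|: off-diagonal entries vanish for ZF only.
leak_zf = np.abs(H_demo.conj() @ W_zf.T)
leak_rzf = np.abs(H_demo.conj() @ W_rzf.T)
print(np.round(leak_zf, 3))
```

ZF removes inter-user interference entirely at the cost of signal power, while RZF tolerates some residual interference, which is why the two baselines differ most at low SNR.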
We further compare the sum rate performance and the computational complexity, in terms of the execution time per sample, of five beamforming solutions in Figs. 11(a) and 11(b), respectively. We fix the transmit power budget and assume that the number of transmit antennas equals the number of users. As the number of transmit antennas increases, the sum rate performance of all five solutions increases. However, the performance of the proposed BNN solution based on the hybrid learning method is always superior to that of the other solutions, and the performance gap becomes larger as the number of transmit antennas increases. According to Fig. 11(b), the execution times per sample of the BNN solutions based on the supervised learning and hybrid learning methods are at the same level, which is slightly longer than that of the ZF beamforming and the RZF beamforming, for the same reason as in Fig. 9(b). As expected, the WMMSE algorithm consumes the most time because of its iterative process. As with the other proposed BNNs, this shows that the proposed BNN solution to the sum rate problem P3 provides a good balance between performance and computational complexity.
VIII Conclusions
In this paper, we proposed a DL-based framework for fast optimization of the beamforming vectors in the MISO downlink and then devised three BNNs under this framework for the SINR balancing problem under the total power constraint, the power minimization problem under individual QoS constraints, and the sum rate maximization problem under the total power constraint, respectively. The proposed BNNs are based on the CNN structure and expert knowledge. The supervised learning method was adopted for the SINR balancing problem and the power minimization problem because their optimal solutions are available for generating training samples. However, there is no known optimal solution to the nonconvex sum rate maximization problem. The corresponding BNN therefore adopts a hybrid learning method, which first pre-trains the neural network with the supervised learning method and then updates the network parameters with the unsupervised learning method to further improve the learning performance. Furthermore, in order to reduce the complexity of prediction, the proposed BNNs take advantage of expert knowledge to extract the key features instead of predicting the beamforming matrix directly. Simulation results demonstrated that the proposed BNN solutions provide a good balance between performance and computational complexity.
Acknowledgment
We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.
References
 [1] J. Zander, “Performance of optimum transmitter power control in cellular radio systems,” IEEE Trans. Veh. Technol., vol. 41, no. 1, pp. 57–62, Feb. 1992.
 [2] G. Montalbano and D. T. M. Slock, “Matched filter bound optimization for multiuser downlink transmit beamforming,” in Proc. IEEE Int. Conf. Universal Personal Commun., vol. 1, Florence, Italy, Oct. 1998, pp. 677–681.
 [3] E. Björnson, M. Bengtsson, and B. Ottersten, “Optimal multiuser transmit beamforming: A difficult problem with a simple solution structure,” IEEE Signal Process. Mag., vol. 31, no. 4, pp. 142–148, Jul. 2014.
 [4] H. Boche and M. Schubert, “A general duality theory for uplink and downlink beamforming,” in Proc. IEEE Conf. Veh. Technol. Conf. (VTC), vol. 1, Vancouver, Canada, Sep. 2002, pp. 87–91.
 [5] D. Gerlach and A. Paulraj, “Base station transmitting antenna arrays for multipath environments,” Signal Process., vol. 54, no. 1, pp. 59–73, Oct. 1996.
 [6] Q. Shi, M. Razaviyayn, M. Hong, and Z. Luo, “SINR constrained beamforming for a MIMO multiuser downlink system: Algorithms and convergence analysis,” IEEE Trans. Signal Process., vol. 64, no. 11, pp. 2920–2933, Jun. 2016.
 [7] F. Rashid-Farrokhi, K. R. Liu, and L. Tassiulas, “Transmit beamforming and power control for cellular wireless systems,” IEEE J. Sel. Areas Commun., vol. 16, no. 8, pp. 1437–1450, Oct. 1998.
 [8] A. B. Gershman, N. D. Sidiropoulos, S. Shahbazpanahi, M. Bengtsson, and B. Ottersten, “Convex optimization-based beamforming,” IEEE Signal Process. Mag., vol. 27, no. 3, pp. 62–75, May 2010.
 [9] D. P. Palomar, M. A. Lagunas, and J. M. Cioffi, “Optimum linear joint transmit-receive processing for MIMO channels with QoS constraints,” IEEE Trans. Signal Process., vol. 52, no. 5, pp. 1179–1197, May 2004.
 [10] A. Wiesel, Y. C. Eldar, and S. Shamai, “Linear precoding via conic optimization for fixed MIMO receivers,” IEEE Trans. Signal Process., vol. 54, no. 1, pp. 161–176, Jan. 2006.
 [11] Q. Shi, M. Razaviyayn, Z. Luo, and C. He, “An iteratively weighted MMSE approach to distributed sum-utility maximization for a MIMO interfering broadcast channel,” IEEE Trans. Signal Process., vol. 59, no. 9, pp. 4331–4340, Sep. 2011.
 [12] S. S. Christensen, R. Agarwal, E. D. Carvalho, and J. M. Cioffi, “Weighted sum-rate maximization using weighted MMSE for MIMO-BC beamforming design,” IEEE Trans. Wireless Commun., vol. 7, no. 12, pp. 4792–4799, Dec. 2008.
 [13] T. Yoo and A. Goldsmith, “On the optimality of multiantenna broadcast scheduling using zero-forcing beamforming,” IEEE J. Sel. Areas Commun., vol. 24, no. 3, pp. 528–541, Mar. 2006.
 [14] M. Schubert and H. Boche, “Solution of the multiuser downlink beamforming problem with individual SINR constraints,” IEEE Trans. Veh. Technol., vol. 53, no. 1, pp. 18–28, Jan. 2004.
 [15] Z.-Q. Luo, W.-K. Ma, A. M.-C. So, Y. Ye, and S. Zhang, “Semidefinite relaxation of quadratic optimization problems,” IEEE Signal Process. Mag., vol. 27, no. 3, pp. 20–34, May 2010.
 [16] M. Grant and S. Boyd, “CVX: Matlab software for disciplined convex programming, version 2.1,” http://cvxr.com/cvx, Mar. 2014.
 [17] F. Rashid-Farrokhi, L. Tassiulas, and K. R. Liu, “Joint optimal power control and beamforming in wireless networks using antenna arrays,” IEEE Trans. Commun., vol. 46, no. 10, pp. 1313–1324, Oct. 1998.
 [18] E. Nachmani, Y. Be’ery, and D. Burshtein, “Learning to decode linear codes using deep learning,” in Proc. IEEE Annual Allerton Conf. Commun. Control Comput., Monticello, USA, Sep. 2016, pp. 341–346.
 [19] F. Liang, C. Shen, and F. Wu, “An iterative BP-CNN architecture for channel decoding,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 144–159, Feb. 2018.
 [20] H. Kim, Y. Jiang, R. Rana, S. Kannan, S. Oh, and P. Viswanath, “Communication algorithms via deep learning,” arXiv preprint arXiv:1805.09317, 2018.
 [21] N. Farsad and A. Goldsmith, “Detection algorithms for communication systems using deep learning,” arXiv preprint arXiv:1705.08044, 2017.
 [22] H. He, C.-K. Wen, S. Jin, and G. Y. Li, “A model-driven deep learning network for MIMO detection,” arXiv preprint arXiv:1809.09336, 2018.
 [23] C. Fan, X. Yuan, and Y.-J. A. Zhang, “CNN-based signal detection for banded linear systems,” arXiv preprint arXiv:1809.03682, 2018.
 [24] X. Tan, W. Xu, Y. Be’ery, Z. Zhang, X. You, and C. Zhang, “Improving massive MIMO belief propagation detector with deep neural network,” arXiv preprint arXiv:1804.01002, 2018.
 [25] N. Samuel, T. Diskin, and A. Wiesel, “Learning to detect,” arXiv preprint arXiv:1805.07631, 2018.
 [26] H. He, C. Wen, S. Jin, and G. Y. Li, “Deep learning-based channel estimation for beamspace mmWave massive MIMO systems,” IEEE Wireless Commun. Lett., vol. 7, no. 5, pp. 852–855, Oct. 2018.
 [27] C.-K. Wen, W.-T. Shih, and S. Jin, “Deep learning for massive MIMO CSI feedback,” IEEE Wireless Commun. Lett., vol. 7, no. 5, pp. 748–751, Oct. 2018.
 [28] T. Wang, C.-K. Wen, S. Jin, and G. Y. Li, “Deep learning-based CSI feedback approach for time-varying massive MIMO channels,” IEEE Wireless Commun. Lett., 2018.
 [29] M. Eisen, C. Zhang, L. F. Chamon, D. D. Lee, and A. Ribeiro, “Learning optimal resource allocations in wireless systems,” arXiv preprint arXiv:1807.08088, 2018.
 [30] K. Ahmed, H. Tabassum, and E. Hossain, “Deep learning for radio resource allocation in multicell networks,” arXiv preprint arXiv:1808.00667, 2018.
 [31] H. Sun, X. Chen, Q. Shi, M. Hong, X. Fu, and N. D. Sidiropoulos, “Learning to optimize: Training deep neural networks for wireless resource management,” in Proc. IEEE Int. Workshop Signal Process. Advances Wireless Commun. (SPAWC), Sapporo, Japan, Jul. 2017, pp. 1–6.
 [32] F. Liang, C. Shen, W. Yu, and F. Wu, “Towards optimal power control via ensembling deep neural networks,” arXiv preprint arXiv:1807.10025, 2018.
 [33] W. Lee, M. Kim, and D.-H. Cho, “Deep power control: Transmit power control scheme based on convolutional neural network,” IEEE Commun. Lett., vol. 22, no. 6, pp. 1276–1279, Jun. 2018.
 [34] X. Li, J. Fang, W. Cheng, H. Duan, Z. Chen, and H. Li, “Intelligent power control for spectrum sharing in cognitive radios: A deep reinforcement learning approach,” IEEE Access, vol. 6, pp. 25 463–25 473, 2018.
 [35] H. Ye, G. Y. Li, and B. Juang, “Power of deep learning for channel estimation and signal detection in OFDM systems,” IEEE Wireless Commun. Lett., vol. 7, no. 1, pp. 114–117, Feb. 2018.
 [36] S. Dörner, S. Cammerer, J. Hoydis, and S. t. Brink, “Deep learning based communication over the air,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 132–143, Feb. 2018.
 [37] T. J. O’Shea, K. Karra, and T. C. Clancy, “Learning to communicate: Channel autoencoders, domain specific regularizers, and attention,” in Proc. IEEE Int. Symp. Signal Process. Inf. Technol. (ISSPIT), Limassol, Cyprus, Dec. 2016, pp. 223–228.
 [38] T. J. O’Shea, T. Erpek, and T. C. Clancy, “Deep learning-based MIMO communications,” arXiv preprint arXiv:1707.07980, 2017.
 [39] Z. Zhao, “Deepwaveform: A learned OFDM receiver based on deep complex convolutional networks,” arXiv preprint arXiv:1810.07181, 2018.
 [40] Z. Qin, H. Ye, G. Y. Li, and B.-H. F. Juang, “Deep learning in physical layer communications,” arXiv preprint arXiv:1807.11713, 2018.
 [41] C. Zhang, P. Patras, and H. Haddadi, “Deep learning in mobile and wireless networking: A Survey,” arXiv preprint arXiv:1803.04311, 2018.
 [42] T. Wang, C.-K. Wen, H. Wang, F. Gao, T. Jiang, and S. Jin, “Deep learning for wireless physical layer: Opportunities and challenges,” China Commun., vol. 14, no. 11, pp. 92–111, Nov. 2017.
 [43] Y. Shi, A. Konar, N. D. Sidiropoulos, X. Mao, and Y. Liu, “Learning to beamform for minimum outage,” IEEE Trans. Signal Process., vol. 66, no. 19, pp. 5180–5193, Oct. 2018.
 [44] P. de Kerret and D. Gesbert, “Robust decentralized joint precoding using team deep neural network,” in Proc. Int. Symp. Wireless Commun. Systems (ISWCS), Lisbon, Portugal, Aug. 2018.
 [45] A. Alkhateeb, S. Alex, P. Varkey, Y. Li, Q. Qu, and D. Tujkovic, “Deep learning coordinated beamforming for highly-mobile millimeter wave systems,” IEEE Access, vol. 6, pp. 37 328–37 348, 2018.
 [46] H. Huang, W. Xia, J. Xiong, J. Yang, G. Zheng, and X. Zhu, “Unsupervised learning based fast beamforming design for downlink MIMO,” IEEE Access, pp. 1–1, 2018.
 [47] W. Yu and T. Lan, “Transmitter optimization for the multi-antenna downlink with per-antenna power constraints,” IEEE Trans. Signal Process., vol. 55, no. 6, pp. 2646–2660, Jun. 2007.
 [48] M. Kulin, T. Kazaz, I. Moerman, and E. D. Poorter, “End-to-end learning from spectrum data: A deep learning approach for wireless signal identification in spectrum monitoring applications,” IEEE Access, vol. 6, pp. 18 484–18 501, 2018.
 [49] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” arXiv preprint arXiv:1502.03167, 2015.
 [50] W. Yang and G. Xu, “Optimal downlink power assignment for smart antenna systems,” in Proc. IEEE Int. Conf. Acoustics, Speech Process. (ICASSP), vol. 6, Seattle, USA, May 1998.
 [51] E. Björnson and E. Jorswieck, “Optimal resource allocation in coordinated multicell systems,” Foundations and Trends® in Communications and Information Theory, vol. 9, no. 2–3, pp. 113–381, 2013.
 [52] H. Dahrouj and W. Yu, “Coordinated beamforming for the multicell multi-antenna wireless system,” IEEE Trans. Wireless Commun., vol. 9, no. 5, pp. 1748–1759, May 2010.