Distributed Power Allocation for Coordinated Multipoint Transmissions in Distributed Antenna Systems


Xiujun Zhang, Yin Sun, Xiang Chen, Shidong Zhou, Jing Wang

Digital Object Identifier 10.1109/TWC.2013.13.120863

Abstract

This paper investigates the distributed power allocation problem for coordinated multipoint (CoMP) transmissions in distributed antenna systems (DAS). Traditional duality-based optimization techniques cannot be directly applied to this problem, because the achievable rate of a CoMP transmission is not strictly concave with respect to the transmission power, so the local power allocation subproblems can have non-unique optimum solutions. We propose a distributed power allocation algorithm to resolve this non-strict concavity difficulty. This algorithm only requires local information exchange among neighboring base stations serving the same user, and is thus flexible with respect to network size and topology. The step-size parameters of this algorithm are determined only by the local user access relationship (i.e., the number of users served by each antenna) and do not rely on channel coefficients. Therefore, the convergence speed of this algorithm is quite robust to channel fading. We rigorously prove that this algorithm converges to an optimum solution of the power allocation problem. Simulation results are presented to demonstrate the effectiveness of the proposed power allocation algorithm.

Index Terms: Coordinated multipoint transmission, distributed power allocation, distributed antenna system.

I Introduction

The explosive growth of mobile access services has led to a huge demand for enhanced throughput and extended coverage in the next generation wireless networks. In recent years, the distributed antenna system (DAS) has emerged as a promising network architecture to achieve these goals [1, 2, 3]. In this architecture, each base station is equipped with several remote antennas distributed over the entire cell area, as shown in Fig. 1. These distributed antennas are connected to the base station via a wired backhaul network. In this way, nearby distributed antennas are able to coordinate with each other and provide enhanced service experience to the mobile users. This technique is called coordinated multipoint (CoMP) transmission in the literature [3, 4].

One of the key techniques to realize high throughput in wireless networks is power allocation. Traditionally, power allocation of wireless networks is handled by centralized algorithms, e.g., [5, 6, 7, 8]. These algorithms require multi-hop signaling mechanisms to gather the channel state information (CSI) of all the wireless links at a central processing unit in a short time period, and then distribute the obtained power allocation solution to the transmitters. Such mechanisms would generate enormous signaling overhead on the backhaul network, and are unlikely to scale as the network size grows.

Recently, a great deal of research effort has focused on distributed power allocation for various wireless networks. Game theory based power allocation techniques were proposed in [9, 10, 11, 12, 13], which aim to compute Nash equilibrium power allocation solutions. However, these Nash equilibrium solutions might be far from optimal [12]. Duality-based distributed power allocation techniques were proposed in [14, 15, 16], where the global power allocation problem is decomposed into many local power allocation subproblems, each of which can be solved by utilizing only locally available network information. However, these techniques cannot be directly applied to CoMP transmissions in DAS: the local power allocation subproblem may have many optimum solutions, because the data rate of CoMP transmission is not strictly concave with respect to the power variables [14, 17]. Since no global network information is available when solving the local power allocation subproblems, it is quite difficult to find a globally feasible solution among all the locally optimum solutions.

One promising method to address this non-strict concavity problem is the proximal point method [18], which adds strictly concave terms to the objective function without affecting the optimum solution. However, typical proximal point algorithms require a two-layer nested iteration structure, where each outer-layer update can proceed only after the inner-layer iterations converge [18]. Such a structure is not suitable for on-line distributed implementation, because it is difficult to decide in a distributed manner when the inner-layer iterations can stop. In [19], a single-layer proximal point algorithm was proposed for multi-path routing problems. However, the convergence analysis in [19] also cannot be directly utilized for the power allocation problem considered here, owing to the additional channel coefficients in our problem. It is difficult to answer whether the channel coefficients have significant impact on the convergence behavior of the algorithm mentioned above.

Fig. 1: System model of coordinated multipoint (CoMP) transmissions in a distributed antenna system (DAS).

This paper investigates the distributed power allocation problem for a downlink DAS with many antennas and many single antenna users. Each user is served by several nearby antennas via CoMP transmission techniques. Meanwhile, each antenna may serve several users over orthogonal channels. The main contributions of this paper are summarized as follows:

  • A distributed power allocation algorithm is proposed to maximize the weighted sum rate of the downlink DAS, subject to per-antenna power consumption constraints. This power allocation algorithm is implemented in a distributed manner among the base stations instead of being executed in a centralized fashion. The algorithm has a single-layer iteration structure, which is desirable for on-line implementations. In each iteration, the algorithm only requires local information exchange among neighboring base stations serving the same user, which makes it flexible with respect to network size and topology.

  • A novel procedure is proposed to compute the primal optimum solution of the local power allocation subproblem, which is simpler than that proposed in [19].

  • We rigorously analyze the convergence and optimality of the proposed distributed algorithm for the power allocation problem. The bounds of the step-size parameters to ensure convergence are derived.

  • We show that the step-size parameters of this algorithm are determined only by the local user access relationship (i.e., the number of users served by each antenna) and do not rely on channel coefficients (the channel coefficients are only utilized locally to solve the local power allocation subproblem). Therefore, the convergence speed of this algorithm is quite robust to channel fading.

Our proposed power allocation algorithm is motivated by the work of [19]. However, our work differs from it in several respects. First, our analysis indicates that a larger step-size can be used for the algorithm in [19], which can achieve a faster convergence speed. Second, while our problem has additional channel fading coefficients, we show that the step-size parameters and the convergence speed of our algorithm are robust to different channel fading coefficients. Finally, our procedure for solving the local power allocation problem is simpler than that proposed in [19].

For ease of later use, we define the following notations: Let denote the number of elements in set , and let denote the set . The projection of a real number on the set is defined as .

The remaining parts of this paper are organized as follows: In Section II, the system model and problem formulation are presented. Section III presents the proposed power allocation algorithm and its distributed implementation. Simulation results of the proposed power allocation strategy are presented in Section IV. Conclusions are drawn in Section V.

II System Model and Problem Formulation

II-A System Model

We consider a downlink DAS with distributed antennas and single antenna mobile users, which are denoted by and , respectively. Each base station is equipped with several distributed antennas, as illustrated in Fig. 1. These distributed antennas are connected to the base station via wired backhaul network. The total throughput of this network is limited by the strong co-channel interference. By allowing several nearby antennas to transmit to one user in a coordinated fashion, the CoMP transmission techniques, such as space-time block coding or maximum ratio transmission [3], convert the strong interferences into useful signals and thereby significantly boost the total network throughput. The set of antennas serving the th user is denoted by , and the set of users communicating with the th antenna is expressed as . In practice, the number of serving antennas for each user, i.e., , is usually small, due to the limitation of implementation complexity for CoMP transmissions.

When the density of the distributed antennas is high, CoMP transmissions cannot mitigate all the strong interference, which leaves some strong residual interference. In [20], it was shown that orthogonal transmission is Pareto optimal for strong interference Gaussian channels. Therefore, the users with strong mutual interference should be scheduled to communicate over orthogonal channels via frequency (or time) division multiple access, while geographically separated users with weak mutual interference are allowed to share the same channel resource. This scheduling task is a type of timetabling problem, which is a classic problem in the computer science literature with many practical algorithms [21, 22].
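As an illustration of the scheduling step, the following minimal sketch assigns orthogonal channels with a greedy graph-colouring heuristic. It is not the scheduler of [21, 22]: the conflict rule (two users conflict when they share a serving antenna), the function name `assign_channels`, and the input format are assumptions made for this example only.

```python
def assign_channels(serving):
    """Greedy colouring: users that share a serving antenna get different channels.

    serving[m] is the set of antenna indices serving user m (hypothetical format).
    Returns a list whose m-th entry is the channel index given to user m.
    """
    channel = []
    for m, antennas in enumerate(serving):
        # Channels already used by earlier users that conflict with user m.
        taken = {channel[k] for k in range(m) if serving[k] & antennas}
        c = 0
        while c in taken:  # pick the smallest channel index that is still free
            c += 1
        channel.append(c)
    return channel


# Toy example: four users, five antennas.
serving = [{0, 1}, {1, 2}, {3, 4}, {0, 4}]
channels = assign_channels(serving)
```

A greedy pass of this kind does not minimise the number of channels in general; it only guarantees that conflicting users never share one.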

After selecting proper antennas for CoMP transmission and scheduling the users, there are only weak interferences in the network. We consider a slow fading wireless environment. Let be the complex coefficient of the wireless channel from the th antenna to the th user and be the transmission power of the th antenna for serving the th user. The data rate of CoMP transmission to the th user is given by

(1)

where is the set of source antenna and serving user pair which may interfere the th user, or more specifically, represents that the -th source antenna serving the -th user through the same serving channel of the th user.

There are two difficulties in utilizing the data rate function to formulate the power allocation problem: First, it leads to a non-convex optimization problem that is NP-hard [23], for which one may not be able to find a solution that is both fast and globally optimal even by centralized optimization. The design of a distributed optimization algorithm will be even more difficult, if not impossible. In order to reduce the solution complexity, we need to find an approximate rate function of that leads to a convex problem. Second, it can be quite difficult to attain the exact expression of the rate function . In practice, the number of interfering antennas is usually much larger than the number of source antennas. Although the receiver can get an accurate estimate of the channel gain for each source antenna , it may be too demanding to estimate the channel gains from the many interfering antennas, especially when the powers of the interference signals are weak. On the other hand, estimating the noise-plus-interference power is much easier. For these reasons, we utilize an upper bound of the noise-plus-interference power , which is denoted by , to derive an approximate rate function. Let denote the normalized channel gain from the th antenna to the th user. Then, we derive a conservative rate function

(2)

The key benefit of the conservative rate function is that it is concave in the transmission powers, so that the resulting power allocation problem is convex, and that it is computable without accurate knowledge of the channel gains of the many interfering antennas. This resolves the two difficulties mentioned above. We will illustrate the rate loss incurred by using this conservative rate function to formulate the power allocation problem in Section IV.
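The relation between the two rate functions can be illustrated numerically. The sketch below assumes the usual signal-to-interference-plus-noise form, i.e., a rate of log2(1 + signal / (interference + noise)), and obtains the conservative rate by replacing the true noise-plus-interference power with an upper bound. The function names and the numbers are hypothetical and are not taken from (1) and (2).

```python
import math


def exact_rate(signal_gains, powers, interference_power, noise_power):
    """Rate in bit/s/Hz when the true interference power is known (assumed SINR form)."""
    signal = sum(g * p for g, p in zip(signal_gains, powers))
    return math.log2(1.0 + signal / (interference_power + noise_power))


def conservative_rate(signal_gains, powers, noise_plus_interference_bound):
    """Rate computed from an upper bound on the noise-plus-interference power."""
    signal = sum(g * p for g, p in zip(signal_gains, powers))
    return math.log2(1.0 + signal / noise_plus_interference_bound)


# Hypothetical numbers: two serving antennas, true interference 0.5, noise 1.0, bound 2.0.
gains, powers = [4.0, 1.0], [0.5, 0.5]
r_exact = exact_rate(gains, powers, interference_power=0.5, noise_power=1.0)
r_cons = conservative_rate(gains, powers, noise_plus_interference_bound=2.0)
```

Whenever the bound is at least the true noise-plus-interference power, the conservative rate never exceeds the exact one, so a power allocation designed with it remains achievable.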

II-B Problem Formulation

The rest of this paper focuses on the following power allocation problem to maximize the weighted sum rate of the DAS:

(3)

where is the weight of the th user’s data rate and is the maximal allowable transmission power of the th antenna.

This problem is a convex optimization problem, which can be solved by standard centralized convex optimization algorithms such as the interior point method [24]. However, these centralized algorithms are hard to implement in large-scale DAS, due to the heavy signaling overhead over the backhaul network. Duality-based optimization techniques [14, 15, 16] cannot be directly applied to this problem either, because they require the objective function to be strictly concave. The objective function in (3) is not strictly concave with respect to the transmission power variables, since it is constant when the value of is fixed. If the duality-based optimization techniques [14, 15, 16] are utilized, the decomposed local power allocation subproblem may have many locally optimum solutions at some special dual points, and it is quite difficult to recover a globally feasible solution among them. When the dual variables are updated around these dual points, the primal power allocation variables keep oscillating and hardly converge (see [19] for more details).
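The non-strict concavity can be seen in a small numerical example. The sketch below assumes that a user's rate depends on the powers only through a gain-weighted sum, as in a rate of the form w log2(1 + sum of gain times power); the gains and power vectors are hypothetical.

```python
import math


def weighted_rate(w, g, p):
    """Weighted rate that depends on p only through the weighted sum of g[n] * p[n]."""
    return w * math.log2(1.0 + sum(gn * pn for gn, pn in zip(g, p)))


g = [2.0, 1.0]
p_a = [1.0, 0.0]    # all power on the first antenna:  2*1.0 + 1*0.0 = 2
p_b = [0.0, 2.0]    # all power on the second antenna: 2*0.0 + 1*2.0 = 2
p_mid = [0.5, 1.0]  # midpoint of p_a and p_b:         2*0.5 + 1*1.0 = 2

rates = [weighted_rate(1.0, g, p) for p in (p_a, p_b, p_mid)]
```

All three allocations give the same rate, so the function is flat along the segment between `p_a` and `p_b`. A strictly concave function would be strictly larger at the midpoint, which is why a maximizer of the local subproblem need not be unique.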

III Distributed Power Allocation Algorithm

In this section, we propose a power allocation algorithm to solve the problem (3), which is distributed among the base stations instead of being centralized over the entire network. The key feature of this algorithm is that its step-size parameters and convergence speed are robust to different channel fading coefficients, which makes our algorithm quite convenient for practical implementations. The details are provided in the following subsections.

III-A Single-layer Distributed Power Allocation Algorithm

To circumvent the aforementioned oscillation problem, we make use of the idea in the proximal point method [18], which is to add some quadratic terms to the objective function and make it strictly concave in the primal variables. We reformulate the original power allocation problem (3) as

(4)
(5)

where we have introduced some quadratic auxiliary terms to make the objective function strictly concave with respect to the transmission power variables. Here, is the auxiliary variable corresponding to , is the parameter of the quadratic terms. For notational convenience, let us use the dimensional vector to denote the transmission power variables of the antennas serving the th user, and the dimensional vector to denote all the transmission power variables. Similarly, we define the dimensional vector and the dimensional vector as the auxiliary variable vectors corresponding to and . It is known that the optimum value of the objective function in (4) coincides with that in (3) [18]. In particular, if is an optimum solution to (3), then solves (4).
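The effect of the quadratic terms can be checked numerically. The sketch below assumes a proximal objective of the form rate minus (c/2) times the squared distance between the power vector and the auxiliary vector; the symbol names `y` and `c` and all numbers are hypothetical.

```python
import math


def proximal_objective(w, g, p, y, c):
    """Rate term minus a quadratic penalty that pulls p towards the auxiliary point y."""
    rate = w * math.log2(1.0 + sum(gn * pn for gn, pn in zip(g, p)))
    penalty = 0.5 * c * sum((pn - yn) ** 2 for pn, yn in zip(p, y))
    return rate - penalty


g, y, c = [2.0, 1.0], [0.5, 1.0], 1.0
p_a, p_b, p_mid = [1.0, 0.0], [0.0, 2.0], [0.5, 1.0]  # same weighted sum, hence same rate

f_a = proximal_objective(1.0, g, p_a, y, c)
f_b = proximal_objective(1.0, g, p_b, y, c)
f_mid = proximal_objective(1.0, g, p_mid, y, c)
```

The rate term is identical at the three points, but the penalty is not, so the value at the midpoint now strictly exceeds the average of the two endpoint values. This is the strict concavity that makes the maximizer over the power variables unique for a fixed auxiliary vector.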

The standard proximal point method in general has a two-layer nested optimization structure: the inner layer iterations optimizing for fixed by a Lagrangian dual optimization method, and the outer layer iterations optimizing the auxiliary variable . Such a layered structure is not suitable for on-line distributed implementations, because it is difficult to decide in a distributed manner when the inner-layer iterations have converged. In the following, we present a modified proximal point method with a single-layer optimization structure, where the outer-layer update of does not require that the inner-layer dual updates have converged.

The Lagrangian of the problem (4) can be written as:

(6)

where is the vector of dual variables corresponding to the constraints in (5). Now we are able to present our distributed power allocation algorithm as follows:

Algorithm A: Single-layer Distributed Power Allocation Algorithm
At the th iteration,

  • Dual variable update:
    Let and , maximize with respect to :

    (7)

    Update the dual variables by

    (8)

    where is the step-size of the dual update.

  • Auxiliary variable update:
    Let and , maximize with respect to :

    (9)

    Update the auxiliary variables by

    (10)

    where is the step-size for auxiliary variable update.

The value of can be chosen arbitrarily in . The choices of to ensure convergence of Algorithm A will be discussed in Section III-C. We note that while the convergence analysis in [19] applies to the degenerate case of , it is difficult to answer whether practical channel coefficients would have a significant impact on the convergence behavior of the algorithm. One major contribution of this paper is to show that the step-size parameters that ensure convergence are independent of (see Section III-C). Since the convergence speed of iterative optimization algorithms is mainly affected by the step-size, the convergence speed of our algorithm is quite robust to different values of . We will also show that step-sizes larger than those of [19] can be utilized in our algorithm to achieve a faster convergence speed.
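To make the single-layer structure concrete, the following sketch runs the two updates of Algorithm A on a toy network. Several ingredients are assumptions made for illustration: the rate uses the natural logarithm, the quadratic parameter `c` is common to all links, the dual update is the projected step q <- max(0, q + alpha (allocated power - power budget)), the auxiliary update is the relaxation y <- y + beta (z - y), and the local maximizations are solved by a simple bisection used here as a stand-in for the closed-form procedure of Section III-B. The default step sizes are merely small values for this toy case and are not the bound of Theorem 1.

```python
def local_argmax(w, g, q, y, c):
    """argmax over p >= 0 of w*ln(1 + sum g*p) - sum q*p - (c/2)*sum (p - y)**2."""
    def powers(s):
        # Stationarity condition for a given s = sum g[n]*p[n], clipped at zero.
        return [max(0.0, yn + (w * gn / (1.0 + s) - qn) / c)
                for gn, qn, yn in zip(g, q, y)]

    def weighted_sum(s):
        return sum(gn * pn for gn, pn in zip(g, powers(s)))

    lo, hi = 0.0, weighted_sum(0.0)
    for _ in range(80):  # bisection on the fixed point s = weighted_sum(s)
        mid = 0.5 * (lo + hi)
        if weighted_sum(mid) > mid:
            lo = mid
        else:
            hi = mid
    return powers(0.5 * (lo + hi))


def algorithm_a(w, g, serving, p_max, c=1.0, alpha=0.1, beta=1.0, iters=2000):
    """serving[m] lists the antennas of user m; g[m][k] is the gain from serving[m][k]."""
    n_users = len(serving)
    q = [0.0] * len(p_max)                    # one dual variable per antenna
    y = [[0.0] * len(s) for s in serving]     # one auxiliary variable per link

    def maximise(q_now):
        return [local_argmax(w[m], g[m], [q_now[n] for n in serving[m]], y[m], c)
                for m in range(n_users)]

    for _ in range(iters):
        x = maximise(q)                       # maximizer for the current duals
        used = [0.0] * len(p_max)
        for m in range(n_users):
            for k, n in enumerate(serving[m]):
                used[n] += x[m][k]
        q = [max(0.0, qn + alpha * (un - pn)) for qn, un, pn in zip(q, used, p_max)]
        z = maximise(q)                       # maximizer for the updated duals
        y = [[yk + beta * (zk - yk) for yk, zk in zip(y[m], z[m])]
             for m in range(n_users)]
    return y, q


# Toy network (hypothetical numbers): user 0 uses antennas 0 and 1, user 1 uses antenna 1.
powers, prices = algorithm_a(w=[1.0, 1.0], g=[[2.0, 1.0], [1.5]],
                             serving=[[0, 1], [1]], p_max=[1.0, 1.0])
```

Note that each pass performs exactly one dual step and one auxiliary step; no inner loop is run to convergence, which is the point of the single-layer design.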

III-B Distributed Implementation of Algorithm A

We proceed to explain how to implement Algorithm A in a distributed fashion. The Lagrangian maximization problems (7) and (9) can be decomposed into many independent local power allocation subproblems, one for each user. Specifically, the terms of the Lagrangian (6) can be reassembled as

(11)

Therefore, the Lagrangian maximization problems (7) and (9) can be rewritten as

(12)

where

(13)

Therefore, problems (7) and (9) can be decomposed into a series of local power allocation subproblems.

In practice, the resource allocation of a user is carried out at a nearby base station. However, the antennas serving this user may belong to several base stations, as illustrated in Fig. 1. Therefore, neighboring base stations need to exchange information during the iterations of Algorithm A. The distributed implementation procedure of Algorithm A is described as follows:

At the th iteration of Algorithm A, the base station assigned to the th user first utilizes the channel quality information to solve a subproblem of (7), given by

(14)

and forwards the power allocation solutions to nearby base stations controlling the antennas . Then, the base station controlling the th antenna utilizes the power allocation solutions to update the dual variable according to (8), and sends to the base station assigned to the th user. Next, the base station assigned to the th user solves a subproblem of (9), i.e.,

(15)

and utilizes the resultant solution to update the auxiliary variables according to (10). Therefore, Algorithm A can be implemented in a totally distributed fashion, and it only requires local exchange of the power allocation solution and the dual variable among neighboring base stations in each iteration. In addition, when the channel power gain changes, each user sends the updated channel power gain to its assigned base station.

III-B1 Solution to Local Power Allocation Subproblem (14)

The optimum solution to (14) satisfies the following Karush-Kuhn-Tucker (KKT) conditions [24]:

(16)

Define

(17)

as the set of antennas serving the th user with positive power. Hence, for all . If is known, the KKT conditions in (16) indicate

(18)

By conducting a weighted summation of the equations in (18), we obtain an equation of , i.e.,

(19)

Let us define , then (19) can be reformulated as

(20)

where , . The root of the quadratic equation (20) is given by

(21)

Substituting (21) into (18), we obtain the optimum solution to the subproblem (14) as

(22)

III-B2 A Novel Procedure to Derive

The remaining task is to determine in the optimum solution to (14). Let us consider the unconstrained problem corresponding to (14), i.e.,

(23)

Our research indicates that if in the solution to (23) satisfies , then the solution to (14) must satisfy (i.e., ). This statement is expressed in the following lemma:

Lemma 1

Suppose that is a differentiable concave function on , and is defined as with . If and , then for any satisfying .

Proof.

See Appendix A. ∎

With Lemma 1, we are able to compute the optimum choice of . The detailed procedure is given as follows:

  • P-1 (Initialization): Set .

  • P-2: Compute and according to (21) and (22), respectively.

  • P-3: If for all , output and exit; otherwise, set and return to P-2.

Remark 1: Lemma 1 allows us to rule out all the elements with from in one iteration. Therefore, the proposed procedure can converge much faster than the method proposed in [19, 25], which eliminates only the one element with the smallest negative in each iteration. Our method significantly reduces the number of iterations needed to compute and does not require a sorting procedure for .
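The closed-form solution and the pruning procedure above can be sketched as follows. The sketch assumes a local subproblem of the form max over p >= 0 of w ln(1 + sum g p) - sum q p - (c/2) sum (p - y)^2, with a natural logarithm and a common quadratic parameter `c`. The names `a`, `b`, and `s`, and the quadratic equation derived from this assumed form, are illustrative and are not copied from (19) to (22).

```python
import math


def solve_subproblem(w, g, q, y, c):
    """Active-set solution of the assumed local subproblem; returns (p, active)."""
    active = list(range(len(g)))          # start with every serving antenna
    while True:
        # On the active set, p[n] = y[n] + (w*g[n]/(1+s) - q[n])/c with s = sum g[n]*p[n].
        # Multiplying by g[n] and summing gives s = a + b/(1+s), a quadratic in s.
        a = sum(g[n] * (y[n] - q[n] / c) for n in active)
        b = w * sum(g[n] ** 2 for n in active) / c
        # Root of s^2 + (1 - a)*s - (a + b) = 0 for which 1 + s > 0.
        s = 0.5 * ((a - 1.0) + math.sqrt((a + 1.0) ** 2 + 4.0 * b))
        p = [0.0] * len(g)
        for n in active:
            p[n] = y[n] + (w * g[n] / (1.0 + s) - q[n]) / c
        if all(p[n] >= 0.0 for n in active):
            return p, active
        # Drop every antenna with negative power at once; no sorting is needed.
        active = [n for n in active if p[n] >= 0.0]


# Hypothetical inputs: one user served by three antennas.
w, g, c = 1.0, [2.0, 1.0, 0.5], 1.0
q, y = [0.1, 0.9, 0.2], [0.2, 0.1, 0.0]
p_opt, active_set = solve_subproblem(w, g, q, y, c)
```

For these inputs the first pass removes the second antenna and the next pass removes the third, leaving only the first antenna with positive power. The result can be validated against the KKT conditions: the gradient is zero on the active set and non-positive elsewhere.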

III-C Convergence Analysis

In this subsection, we obtain the bounds on the step-sizes to ensure convergence. First, we need some notations and definitions to simplify the expressions of our theoretical analysis. Let us consider the function

(24)

which has incorporated the power constraint in the definition. The analysis of this subsection applies to any objective function in the form of with being a concave function. With (24), the Lagrangian (6) can be rewritten as

(25)

where is a dimensional matrix with binary elements representing the relationship between the antennas and their transmit power variables, i.e., if the th antenna is selected to serve the th user, one of the elements on the th row and the th to the th columns is 1; otherwise, all of these elements are 0. Moreover, it satisfies

(26)

because the th antenna serves users and each transmit power variable belongs to only one antenna. is a diagonal matrix with diagonal elements representing the parameters of the quadratic terms. represents the vector of maximal transmission power of the antennas. Therefore, the Lagrangian maximization problems (7) and (9) can be expressed as

(27)

and

(28)

respectively. Let be a diagonal matrix with diagonal elements representing the step size for dual update. Let be a diagonal matrix with diagonal elements representing the step size for auxiliary update. Then, the dual update (8) and auxiliary update (10) can be rephrased as

(29)

and

(30)

We also need to define the stationary point of Algorithm A.

Definition 1

A point is a stationary point of Algorithm A, if

(31)
(32)
(33)

where represents the Hadamard (elementwise) product of two vectors and with the same dimension.

Let us further consider a Lagrangian maximization problem . The KKT conditions suggest that there must exist a subgradient of satisfying

(34)

Similarly, let denote a stationary point of Algorithm A, then we can get from (31) that

(35)

Now we are ready to introduce the main result of this paper in the following theorem, i.e., the sufficient condition for the convergence of Algorithm A.

Theorem 1

If the objective function is in the form of with being a concave function, and the step-size satisfies

(36)

where is the number of users served by the th antenna, the proposed distributed power allocation Algorithm A will converge to a stationary point of the algorithm, and is an optimum solution.

The proof of Theorem 1 relies on the following key result:

Lemma 2

Let and be two maximizers of the Lagrangian (25) for fixed auxiliary variable , i.e., and , and is a stationary point of Algorithm A, then

(37)

where and are defined in (34) and (35).

Proof.

See Appendix B. ∎

With Lemma 2, we are able to prove Theorem 1. The details are relegated to Appendix C. Some remarks about Theorem 1 are provided as follows:

Remark 2: If we choose , then the step-size parameters do not rely on the channel coefficients . On the contrary, they are only determined by the number of users served by the th antenna, i.e., . Since the convergence speed of iterative optimization algorithms is mainly affected by the step-size, the convergence speed of our algorithm is quite robust to different values of .

Remark 3: It is worthwhile to note that the channel fading coefficients are involved in on the left hand side of (37), but not on the right hand side of (37). This is the key reason that the bound on the step-size in Theorem 1 is independent of .

Remark 4: In [19, Lemma 3], the authors proved that

(38)

for the degenerate case of . One can see that (38) is looser than the inequality (37) in Lemma 2. Moreover, in [19, Proposition 4], the authors only proved the convergence of their algorithm for the step-sizes , which are smaller than the step-sizes of our algorithm, i.e., . Therefore, our algorithm can achieve a faster speed of convergence than that of [19]. Some numerical results will be provided in the next section to illustrate this.

Remark 5: Theorem 1 provides a sufficient condition for the convergence of Algorithm A under all system circumstances. In our simulation experience, there are circumstances in which step-sizes larger than those of (36) also yield an optimum solution to problem (3). However, it is difficult to prove that such weaker conditions ensure the convergence of Algorithm A uniformly for all system circumstances.

Remark 6: If the channel gains change before convergence, as in the slow fading environment, the resultant power allocation solution may not be optimum. However, Algorithm A is able to track the changes of the slow fading environment to some extent. For example, suppose that the channel gains change after the algorithm has reached a near optimum solution. We can still use the dual variable and auxiliary variable of the last iteration as the initial state of the subsequent iterations. As long as the changes of channel gains are small, the dual variable and auxiliary variable of the last iteration are not far from the optimum solution, and the number of iterations for convergence is much smaller than when using a random initial state.

IV Numerical Simulations

In this section, we present some simulation results to demonstrate the efficiency of the proposed power allocation algorithm. We consider a downlink DAS with 7 cells. Each cell is equipped with 7 distributed antennas, including 1 antenna located at the center of the cell and 6 remote antennas distributed near the boundary of the cell. Similar to Fig. 1, the locations of these 49 antennas form a hexagonal lattice. The minimal distance between two neighboring antennas is meters. The users are distributed uniformly in the entire network area, with the extra constraint that the distance from a user to the nearest antenna is no smaller than 10 meters. The wireless channel coefficients are composed of three components: large-scale path loss, shadowing, and small-scale Rayleigh fading. The path loss and shadowing are determined by the SCM model for Urban Macro environments [26]. Specifically, the path loss is given by , where is the distance in meters between the user and the antenna. The shadowing component follows a log-normal distribution with zero mean and a standard deviation of 8 dB. For downlink CoMP transmissions, each user is served by antennas, which are selected based on large-scale channel path loss. The maximal transmission power of each antenna is assumed to be the same, i.e., . The data rate weights are chosen as . Two users are allowed to be scheduled on the same channel if they are served by different antennas. The bandwidth of each receiver is 1 MHz, and the noise figure of each receiver is 5 dB. The conservative noise-plus-interference power is chosen to be 5 dB greater than the noise power. Therefore, the noise-plus-interference power at each receiver is dBm. We utilize in (2) to formulate the power allocation problem (3). After the power allocation solution is derived, we substitute it into the original rate function in (1) to compute the achievable data rate. All the simulation results are obtained by averaging over 1000 system realizations.
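The noise budget described above can be reproduced with a short calculation. The sketch assumes a thermal noise density of -174 dBm/Hz, a common room-temperature figure that is not stated in the text, so the result may differ from the value used in the simulations. The function names are hypothetical.

```python
import math

# Assumption: thermal noise density at room temperature, in dBm per Hz.
THERMAL_NOISE_DBM_PER_HZ = -174.0


def noise_plus_interference_dbm(bandwidth_hz, noise_figure_db, margin_db):
    """Receiver noise power plus a conservative interference margin, in dBm."""
    noise_dbm = THERMAL_NOISE_DBM_PER_HZ + 10.0 * math.log10(bandwidth_hz) + noise_figure_db
    return noise_dbm + margin_db


def dbm_to_watt(power_dbm):
    """Convert a power level from dBm to watts."""
    return 10.0 ** ((power_dbm - 30.0) / 10.0)


# 1 MHz bandwidth, 5 dB noise figure, 5 dB conservative margin.
level_dbm = noise_plus_interference_dbm(1e6, 5.0, 5.0)
level_watt = dbm_to_watt(level_dbm)
```

Under this assumption the bandwidth contributes 60 dB, so the level is -174 + 60 + 5 + 5 = -104 dBm.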

We compare our proposed power allocation strategy for problem (3) with the following two reference strategies: The first strategy considers the optimal power allocation for downlink CoMP transmissions with no interference, which provides a performance upper bound for the practical scenarios with interference. The second one is a simple equal power allocation (EPA) strategy, where each antenna allocates its transmission power equally among the users it serves.
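The EPA baseline is simple enough to state in a few lines. In the sketch below, the data layout (`serving[m]` lists the antennas of user m, `p_max[n]` is the budget of antenna n) and the function name are assumptions made for illustration.

```python
def equal_power_allocation(serving, p_max):
    """Each antenna splits its power budget equally among the users it serves.

    Returns p with p[m][k] = power that antenna serving[m][k] spends on user m.
    """
    users_per_antenna = [0] * len(p_max)
    for antennas in serving:
        for n in antennas:
            users_per_antenna[n] += 1
    return [[p_max[n] / users_per_antenna[n] for n in antennas]
            for antennas in serving]


# Toy network: user 0 uses antennas 0 and 1, user 1 uses antenna 1 only.
epa = equal_power_allocation(serving=[[0, 1], [1]], p_max=[1.0, 1.0])
```

Here antenna 0 gives its whole budget to user 0, while antenna 1 splits its budget in half between the two users. The rule ignores the channel gains entirely.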

Figure 2 illustrates the simulation results of per-user throughput versus transmission power for different power allocation strategies, where each cell has 10 users.

Fig. 2: Simulation results of per-user throughput versus transmission power for .

Figure 3 presents the simulation results of per-user throughput versus the number of users per cell , where the transmission power dBm. One can observe that the proposed power allocation strategy has a small gap from the performance upper bound, especially when the transmission power is small. However, the simple equal power allocation scheme has a lower throughput. The performance of equal power allocation is poor, because the wireless links from different antennas to one user have quite different channel quality. The base station should spend more power on the strong wireless links, instead of using the same power for different wireless links. Through careful user scheduling and a reasonable threshold of noise amplification, the proposed algorithm can achieve performance approaching that of the ideal non-interference scenario. Therefore, the proposed power allocation strategy plays an essential role in realizing the benefits of downlink CoMP transmissions in DAS.

Fig. 3: Simulation results of per-user throughput versus the number of users per cell for dBm.

Figure 4 illustrates the evolution of the dual optimality gap of the proposed power allocation algorithm and the distributed optimization algorithm of [19] for and dBm, where the dual optimality gap is given by . The parameters of our distributed power allocation algorithm are chosen as , , and . The parameters of the algorithm of [19] are given by , (see Remark 4), and . Since the step-sizes of our proposed power allocation algorithm are larger than those of the reference algorithm in [19], our algorithm exhibits a faster convergence speed. We note that this is the convergence speed when the algorithm is cold started. In practice, since the channel condition varies slowly, the power allocation solution from the previous run of the algorithm is an excellent initial state for warm-starting the algorithm. In this way, the algorithm generally converges much faster.

Fig. 4: Convergence of our proposed algorithm and the algorithm in [19] for and dBm.

V Conclusions

We have proposed a distributed power allocation algorithm for downlink CoMP transmissions in DAS. We considered an approximate power allocation problem with a non-strictly concave objective function, which makes traditional duality-based optimization techniques inapplicable to this problem. We have resolved this non-strict concavity issue by adding some quadratic terms to make the objective function strictly concave, and developed a distributed algorithm to solve the power allocation problem. A key merit of this algorithm is that its convergence speed is robust to different values of the channel coefficients. Its implementation only requires local information exchange among neighboring base stations serving the same user. The convergence and optimality of this algorithm have been established rigorously. Our simulation results have revealed that significant throughput improvements can be realized by this power allocation algorithm.

Appendix A Proof of Lemma 1

Since is a concave function of , is also concave with respect to . The KKT conditions indicate

(A.1)

and

(A.2)

By taking the weighted summation of the partial derivatives , we obtain

(A.3)

Let , (A.1) and (A.3) imply

(A.4)

For , (A.2) and (A.3) suggest

(A.5)

Comparing (A.4) and (A.5), we derive that

(A.6)

If , then

(A.7)

Since is a concave function, . Further, by the positivity of , we have .

Suppose there exists some such that and . Then, (A.1) and (A.2) imply

(A.8)

which further suggests

(A.9)

Since and is a concave function, we have . Therefore, , which contradicts the assumption of and . Therefore, if , .

Appendix B Proof of Lemma 2

We need to use the fact that is in the form of , where is a concave function. Equation (34) can also be written as

(B.1)

where is the subgradient of . By conducting a weighted summation of the equations in (B.1), we obtain

(B.2)

Let us define

(B.3)
(B.4)

Then, (B.2) indicates

(B.5)

Then, the formula on the left hand side of (37) can be written as

(B.6)

Since is a concave function, we obtain