Joint Data Routing and Power Scheduling for Wireless Powered Communication Networks
Abstract
In a wireless powered communication network (WPCN), an energy access point (EAP) supplies the energy needs of the network nodes through radio frequency wave transmission, and the nodes store the received energy in their batteries for future data transmission. In this paper, we propose an online stochastic policy that jointly controls energy transmission from the EAP to the nodes and data transfer among the nodes. For this purpose, we first introduce a novel perturbed Lyapunov function to address the limitations that the batteries impose on the energy consumption of the nodes. Then, using the Lyapunov optimization method, we propose a policy that adapts to arbitrary channel statistics in the network. Finally, we provide a theoretical analysis of the performance of the proposed policy and show that it stabilizes the network, and that the average power consumption of the network under this policy is within a bounded gap of the minimum power level required for stabilizing the network.
1 Introduction
Nowadays, smart electronic devices are increasingly making their way into our daily life. It is predicted that by 2021, there will be around 28 billion connected devices all over the world [1], a great number of which will be portable and battery-powered. However, in some applications, replacing the batteries or recharging them by cable is impossible, e.g., in biomedical implants inside human bodies [2] or in monitoring sensors distributed over a wide forest area. Consequently, to ensure a better user experience in next-generation networks, the problem of supplying power to portable battery-operated devices has recently gained a lot of attention from both academia and industry [3, 2]. Charging batteries over the air has been proposed as a solution that guarantees an uninterrupted connection and operates autonomously, while reducing massive battery disposal. Wireless power transfer (WPT) is the key enabling technology for charging over the air. There are various WPT methods, including radio frequency (RF) power transfer [3], resonant coupling [4] and inductive coupling [5]. Compared to the latter two methods, RF power transfer provides a wider coverage range and is more flexible with respect to transmitter/receiver deployment and movement [2]. Therefore, it is regarded in the literature as the most promising WPT approach.
Adopting WPT technology in wireless communication networks introduces new research challenges, mostly related to increasing coverage and efficiency. A prominent challenge is how to maintain power transfer efficiency despite the transmission path loss [6]. There have been numerous studies on energy beamforming as a technique for alleviating the high transmission path loss (e.g., see [6, 7, 8, 9]). In [6] and [7], a wireless powered communication network (WPCN) consisting of a hybrid data/energy access point with multiple antennas and several single-antenna users is considered, in which the access point transmits energy toward the nodes in the downlink direction and the nodes transmit data to the access point in the uplink direction. In these works, the minimum achievable rate among users is maximized by optimizing the beamforming vector and some other controllable parameters. Energy beamforming for the so-called simultaneous wireless information and power transfer (SWIPT) method is studied in [8, 9]. Under SWIPT, energy and data are jointly transmitted by an RF carrier in the downlink, and the receiver extracts data or harvests power by splitting the received signal in the time or power domain.
Cooperative wireless powered communication is another line of research that aims at increasing the network coverage (e.g., see [10, 11, 12, 13]). The intuition behind it is that in WPCNs, the users nearer to the access point harvest more energy, while needing less energy for their own data transmission. Hence, through cooperation, these nodes can use some of their surplus energy to help relay the data of the farther nodes. In [10], a two-user scenario is considered in which the nearer user allocates a portion of its harvested energy to help relay the farther user’s data to the access point. The authors have maximized the sum rate of the users by optimizing the resource allocation. In [11], the authors have derived the achievable rate of a two-hop relay network, in which the relay stores its harvested energy in the battery for its future transmissions. While most works in the related literature have considered two-node cooperation, there are very few works on multihop cooperation, e.g., [12] and [13]. In these works, a general multihop network with energy transfer capability has been considered, and the routing policy and energy allocation are determined so as to maximize the sum rate and the lifetime of the network.
It should be noted that most existing works have focused on optimizing the network parameters for a single timeslot. Clearly, this approach is not optimal when the users can store the harvested energy in their batteries for future use. There are very few works on long-term network optimization [14, 15, 16, 17, 18]. The authors in [14] have studied long-term network utility optimization through Markov decision process (MDP) theory. The MDP method requires statistical knowledge of the channel variation, and the complexity of its solutions grows fast as the network dimension increases [19]. In contrast, the Lyapunov optimization technique applied in [15, 16, 17, 18] is independent of the channel statistics and the network dimension.
In this work, we consider the problem of designing an optimal WPT policy that schedules power allocation, data routing and energy beamforming in a multihop WPCN. We consider a battery level constraint for each node, which states that at each timeslot, the energy consumed by a node cannot exceed the energy stored in its battery. Moreover, the average data backlog in the network should remain finite. The battery constraint complicates the problem, since high energy consumption at one time may substantially lower the battery level and cause an energy outage in the future. Therefore, the decision at one timeslot also affects the optimal decisions in future timeslots. This coupling makes finding the optimal policy highly challenging. A similar problem has been considered for utility maximization in energy harvesting wireless sensor networks in [20, 21]. The authors have addressed the battery constraint by a modified Lyapunov optimization method. However, their method is not applicable to our energy optimization problem, as their objective function (and hence, the corresponding analysis) is totally different. In this paper, we use the Lyapunov optimization method with a novel Lyapunov function to avoid energy outage. We propose an online policy that is independent of the channel statistics. Under this policy, at each timeslot, the energy beam is focused toward the nodes with lower battery levels, greater queue backlogs and better energy link conditions. Moreover, the data is routed through the nodes with less congested queues and greater battery levels. We then analyze the performance of the proposed policy and provide theoretical results showing that its average power consumption is within a bounded gap of that of the optimal policy, where the gap is governed by a control parameter that also determines the upper bound on the average backlogs of the data queues. We would like to note that the work most closely related to our paper is [15], which studies energy optimization in a single-hop WPCN.
However, the authors in [15] have pursued a different approach to address the energy outage problem, which imposes a minimum requirement on maximum transmission power of the access point. This minimum requirement can be too high in practice and may not be satisfied in certain cases, due to safety or implementation issues.
The contributions of this paper can be summarized as follows:

We propose a power scheduling, energy beamforming and data routing policy for a general multihop WPCN.

We show that our policy conforms to the battery level constraint.

Using the Lyapunov optimization method, we bound the optimality gap of the EAP average power consumption and the average backlog of the queues.
The rest of the paper is organized as follows. Section 2 illustrates our system model and problem formulation. Section 3 presents our proposed policy. The performance of the policy is analyzed in Section 4. Simulation results are presented in Section 5, and finally, Section 6 concludes the paper.
Notation: We use boldface letters to denote matrices and vectors. denotes the transpose of a matrix. denotes the absolute value. Unless otherwise stated, vectors are row vectors. represents the expectation. denotes . equals if the condition is satisfied and equals , otherwise.
2 System Model
We consider a WPCN consisting of one energy access point (EAP) and wireless nodes, where there exist streams of data between distinct endpoints in the network. The nodes are battery-powered, and the batteries are recharged by the energy received from the EAP. There exist energy links between the EAP and the nodes, and data links between the nodes. The topology of a sample network is depicted in Fig. 1. For each data link , and denote the transmitter and receiver of the th link, respectively. Moreover, we define and to be the sets of the ingoing and outgoing data links of node , respectively. The time horizon is divided into timeslots of fixed length, indexed by . (Without loss of generality, we assume the slot duration is normalized to 1. Therefore, we sometimes use the terms “power” and “energy” interchangeably.) At the beginning of each timeslot, a small portion of it is devoted to channel estimation and control signaling. The rest of the timeslot is divided equally between energy and data transmission. The EAP is equipped with antennas to focus its transmission beam toward the nodes. Moreover, we assume that the nodes use a single antenna for both energy reception and data transmission/reception.
The channel states are assumed to be constant during a timeslot but to vary randomly and independently across successive timeslots. At each timeslot , and represent the channel gain of the th data link and the gain of the channel between the th antenna of the EAP and node , respectively. Accordingly, we define and as the channel vectors for the data links and the energy link of node , respectively.
2.1 Data and Energy Transmission
Let denote the data link power vector, in which the th entry specifies the transmission power over the th data link. Moreover, let denote the set of all feasible power vectors. We assume that setting any element of a power vector in to zero results in a new power vector that also belongs to . Furthermore, we assume that the peak transmission power is limited to (i.e., ). Let denote the data transmission capacity of link under power vector and channel vector . Some important properties of are presented in the following remark.
Remark 1
Consider two power vectors and , where and . The capacity of link under each of these two power vectors satisfies the following properties:
(1)  
(2)  
(3) 
where (2) holds for some .
Note that the above properties are satisfied under conventional rate-power functions. Equation (1) implies that for any link , no data can be passed through it if no power is assigned to this link. Inequality (2) states that the rate-power function is upper bounded by a linear function, which is the case for differentiable functions with a bounded first derivative. Finally, inequality (3) holds due to the interference effect among wireless links. Furthermore, we assume that there exists a constant , such that . For any link , let denote the transmission rate allocated to stream , over that link. Clearly, the sum of the rates allocated over each link should not exceed the capacity of that link. Therefore, a feasible rate allocation scheme should satisfy
(4) 
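As a rough illustration of the properties in Remark 1 and of the feasibility condition (4), the following Python sketch uses an interference-free AWGN rate-power function. This particular capacity model, the function names and all numerical values are assumptions made for illustration only; they are not taken from the system model above. For this concave function, the slope of the tangent at zero power can serve as the constant of the linear upper bound in (2).

```python
import math

def awgn_capacity(p, gain, bandwidth_hz, noise_psd):
    """Assumed rate-power function: Shannon capacity of one interference-free link."""
    noise_power = noise_psd * bandwidth_hz
    return bandwidth_hz * math.log2(1.0 + p * gain / noise_power)

def linear_bound_slope(gain, noise_psd):
    """Derivative of awgn_capacity at p = 0; by concavity, capacity <= slope * p."""
    return gain / (noise_psd * math.log(2.0))

def rates_feasible(stream_rates, capacity, tol=1e-9):
    """Condition (4): the rates allocated to the streams must fit into the link capacity."""
    return all(r >= 0.0 for r in stream_rates) and sum(stream_rates) <= capacity + tol

if __name__ == "__main__":
    # Hypothetical link parameters (not from the paper).
    W, N0, g = 1e5, 1e-18, 1e-9
    slope = linear_bound_slope(g, N0)
    for p in (0.0, 1e-4, 1e-3, 1e-2):
        c = awgn_capacity(p, g, W, N0)
        print(f"p={p:.0e}  capacity={c:.3e}  linear bound={slope * p:.3e}")
```

With zero power the capacity is zero, matching property (1), and the printed capacities stay below the linear bound, matching property (2). Property (3) concerns interference between links and is not exercised by this single-link sketch.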
Fig. 2 shows the considered structure of an EAP. The EAP performs energy beamforming to concentrate its transmit energy towards the nodes. Vector denotes the normalized beamforming vector of the EAP, and accordingly, the received power at each node is given by
(5) 
where is the EAP’s transmit power at time , with its peak power equal to , i.e., .
2.2 Wireless Nodes
As shown in Fig. 3, each node includes data queues and is equipped with a battery. Let denote the backlog of the data queue allocated to stream in node . The backlog evolves as follows
(6) 
where denotes the random arrival process of data stream at node . Note that is nonzero only if node is the source of stream . Let , then clearly we have
where is the arrival rate for stream .
The battery of each node is recharged by the energy received from the EAP and is discharged when the node transmits data. Let denote the battery level of node at the beginning of timeslot . The battery level at node then evolves according to the following equation:
(7) 
where is the total transmit power of node at timeslot .
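The queue and battery dynamics described in this subsection can be sketched in a few lines of Python. The update rules below are assumed standard forms (a backlog that is served, floored at zero, and then increased by incoming and exogenous traffic; a battery that loses the transmitted energy and gains the harvested energy), and the function names are hypothetical. They are meant to convey the mechanics rather than to reproduce (6) and (7) exactly.

```python
def next_backlog(backlog, out_rate, in_rate, arrival):
    """Assumed queue update: serve up to out_rate, then add relayed and exogenous data."""
    return max(backlog - out_rate, 0.0) + in_rate + arrival

def next_battery(level, tx_power, harvested):
    """Assumed battery update with the battery constraint enforced explicitly."""
    if tx_power > level:
        raise ValueError("battery constraint violated: transmit power exceeds stored energy")
    return level - tx_power + harvested

def simulate_battery(initial_level, tx_powers, harvested):
    """Run the battery update over a sequence of slots and return the final level."""
    level = initial_level
    for p, e in zip(tx_powers, harvested):
        level = next_battery(level, p, e)
    return level

if __name__ == "__main__":
    # Hypothetical slot-normalized values.
    print(next_backlog(3.0, 5.0, 1.0, 2.0))
    print(simulate_battery(1.0, [0.5, 0.25], [0.0, 0.5]))
```

Because every update keeps the battery level nonnegative, the total energy consumed over any horizon cannot exceed the initial charge plus the total harvested energy, which is the intuition behind relaxing the instantaneous battery constraint to an average one.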
2.3 Network Controller
There exists a network controller, located at the EAP, that controls both the data and energy links and has access to the channel state information, the data queue backlogs and the battery levels of all links and nodes in the network. It controls the energy links by specifying the EAP transmission power and the beamforming vector , and the data links by determining their power vector and the routing of the data streams. It routes the data by allocating the capacity of the links to the existing data streams in the network.
As mentioned earlier, in this paper we focus on designing a joint data routing and energy transfer control policy for the network controller that minimizes the total power transferred by the EAP while guaranteeing that all data queues in the network are stable. The intended policy can be explicitly formulated as the solution to the following problem. (Note that both the time average and the expectation are included in this problem formulation because we seek the optimal policy among both stationary and nonstationary policies. For a nonstationary policy, the expectations are time-dependent and hence, taking the time average is required.)
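\[
\begin{aligned}
\min_{\mathbf{p}(t),\,\mathbf{w}(t),\,p_{AP}(t),\,C_l^s(t)} \quad & \bar{p}_{AP} = \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}\{p_{AP}(t)\} \\
\text{subject to} \quad & \sum_{l \in \mathcal{O}(n)} p_l(t) \le E_n(t), \quad \forall n, t \\
& \bar{U} = \limsup_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \sum_{n,s} \mathbb{E}\{U_n^s(t)\} < \infty \\
& (4),\ (6),\ (7),
\end{aligned}
\tag{2.3}
\]
where the first constraint in (2.3), the battery constraint, guarantees that the sum power allocated to the outgoing links of a node is not greater than what can be supported by the battery level of the node. Moreover, the second constraint in (2.3) ensures stability of all queues.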
Note that the problem formulated in (2.3) is a stochastic utility optimization problem. Although in general these problems can be tackled by the so-called min drift plus penalty (MDPP) algorithm [22], the battery constraint in (2.3) complicates our problem considerably. This is mainly due to the fact that in the battery-operated case, consuming high power in a specific timeslot may drastically lower the battery level and restrict future transmissions. Therefore, under the battery level constraint, policies that make independent decisions at each timeslot are no longer optimal, which does not fit the MDPP problem formulation. In the sequel, we propose a solution to handle the battery constraint in the MDPP problem formulation. For convenience, all the notations in the paper and their definitions are presented in Table 1.
Symbol  Meaning 

The backlog of data queue allocated to stream in node at timeslot .  
The battery level of node at timeslot .  
Number of nodes, streams and data links.  
The transmitter index for link .  
The receiver index for link .  
The set of outgoing links from node .  
The set of ingoing links to node .  
The maximum number of outgoing links from a specific node.  
The maximum number of ingoing links to a specific node.  
The harvested energy by node at timeslot .  
The beamforming vector.  
The transmission power of EAP at timeslot .  
The exogenous data arrival for stream in node .  
The mean value of the exogenous data arrival for stream in node .  
The vector of data link channel states at timeslot .  
The vector of energy link channel states at timeslot .  
The vector of power allocation to data links at timeslot .  
The set of valid power vectors.  
Capacity of link at timeslot .  
Maximum transmission power of nodes and EAP. 
3 The Proposed Online Control Policy for Joint Data Routing and Power Transfer Scheduling
In this section, we present an online control policy for the network controller. This policy is developed based on the Lyapunov optimization method [22]. We propose a novel perturbed Lyapunov function to push the battery level up. The Lyapunov function is defined as
where
(8) 
Note that the constant in (8) is an energy normalization factor, and we set it to , where is the slope of the linear upper bound for the rate-power function in (2) and is a constant that will be discussed later. We also define the Lyapunov drift function,
where and are the sets of all data queues and batteries in the network, respectively. Next, we define the drift-plus-penalty function as follows
(9) 
where is a control parameter. The following lemma establishes an upper bound on the above drift-plus-penalty function.
Lemma 1
For the defined drift-plus-penalty function, the following inequality always holds
(10) 
where , and . The constant is defined in Appendix 7.
Proof: See Appendix 7. The parameter and the set in the above lemma are the congestion threshold and the set of congested queues, respectively. Moreover, we call the congested queues in the set the critically congested queues, since the size of their backlog exceeds the normalized battery level of their corresponding node. Our policy tends to decrease a queue backlog only if the queue is congested. Consequently, setting the congestion threshold to the smallest possible value reduces the average queue backlog. The parameter can be optimized to achieve the minimum congestion threshold. Substituting in the definition of with , it can be verified that is minimized at .
As will be shown in Section 4, any policy that minimizes the right-hand side of (10) at each timeslot stabilizes the network and yields an average power consumption within a bounded gap of the optimal power consumption. Consequently, we are interested in finding a policy that minimizes the upper bound in (10). For this purpose, we first rearrange the right-hand side of (10) and rewrite it as follows:
(11) 
where is called the data coefficient of stream over link and is defined as
(12) 
Furthermore, is called the power coefficient for node and is defined as
(13) 
In order to minimize the right-hand side of (11), it suffices to minimize the inner terms of the two expectations, as the other terms are constant with respect to the control variables and . To minimize the first expectation, we first allocate the whole capacity of each link to the stream with the greatest data coefficient over that link, and then we select the minimizing power vector. Furthermore, minimization of the second expectation can be decomposed into the selection of the beamforming vector and the selection of the EAP transmission power. We rewrite the term inside the expectation as
(14) 
where
(15) 
It can be verified that the term inside the brackets is minimized if we select the beamforming vector in the direction of the eigenvector of with the maximum eigenvalue. Substituting in (14), is then minimized by determining according to the following rule
(16) 
The data link control and energy link control policies are summarized in Algorithms 1 and 2, respectively.
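The beamforming step can be sketched as follows. The Python code below is a minimal illustration, assuming that the received gain at a node is the squared magnitude of the inner product between its row channel vector and a unit-norm beamforming vector, and that the per-node weights are nonnegative (so that plain power iteration finds the eigenvector with the maximum eigenvalue). The weights stand in for the power coefficients of the nodes; the function names and the weight values are hypothetical.

```python
import math

def weighted_gram(channels, weights):
    """Build A = sum_n a_n * h_n^H h_n for row channel vectors h_n."""
    m = len(channels[0])
    A = [[0j] * m for _ in range(m)]
    for h, a in zip(channels, weights):
        for i in range(m):
            for j in range(m):
                A[i][j] += a * h[i].conjugate() * h[j]
    return A

def normalize(v):
    """Scale a complex vector to unit Euclidean norm."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]

def principal_eigenvector(A, iters=500):
    """Power iteration; assumes A is Hermitian positive semidefinite and nonzero."""
    m = len(A)
    v = normalize([complex(1.0, 0.1 * k) for k in range(m)])
    for _ in range(iters):
        v = normalize([sum(A[i][j] * v[j] for j in range(m)) for i in range(m)])
    return v

def weighted_received_gain(channels, weights, w):
    """Evaluate sum_n a_n * |h_n w|^2, which equals w^H A w."""
    return sum(a * abs(sum(hm * wm for hm, wm in zip(h, w))) ** 2
               for h, a in zip(channels, weights))

if __name__ == "__main__":
    # Two hypothetical nodes seen by a three-antenna EAP.
    channels = [[1 + 0j, 1j, -1 + 0j], [1 + 0j, 1 + 0j, 1 + 0j]]
    weights = [2.0, 1.0]
    w = principal_eigenvector(weighted_gram(channels, weights))
    print(weighted_received_gain(channels, weights, w))
```

With a single node, this reduces to pointing the beam along the conjugate of that node's channel vector. With several nodes, the beam is a compromise that favors nodes with larger weights and stronger channels, in line with the qualitative behavior of the policy described in the introduction.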
It should be noted that finding the optimal power vector in the data link control policy requires solving the max-weight problem
which can be NP-hard in general. However, in certain cases, e.g., in interference-free networks, closed-form solutions can be found. Furthermore, approximate solutions to this problem result in a bounded optimality gap in the overall performance. Such approximate solutions have been extensively discussed in [22, Chapter 6].
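To make the interference-free case concrete, the following Python sketch first picks, for one link, the stream with the greatest data coefficient, and then computes a closed-form power for that link. The per-link objective used here (a weight times an AWGN capacity minus a linear power price) is an assumed stand-in for the max-weight objective, and the names `best_stream`, `link_power` and all numbers are hypothetical.

```python
import math

def awgn_capacity(p, gain, bandwidth_hz, noise_psd):
    """Assumed interference-free rate-power function."""
    return bandwidth_hz * math.log2(1.0 + p * gain / (noise_psd * bandwidth_hz))

def best_stream(data_coeffs):
    """Return (index, value) of the largest data coefficient on a link."""
    s = max(range(len(data_coeffs)), key=lambda k: data_coeffs[k])
    return s, data_coeffs[s]

def link_power(weight, price, gain, bandwidth_hz, noise_psd, p_max):
    """Maximize weight * capacity(p) - price * p over 0 <= p <= p_max."""
    if weight <= 0.0:
        return 0.0      # serving this link cannot reduce the objective
    if price <= 0.0:
        return p_max    # power is not penalized, so use the peak power
    # Stationary point of the concave objective, clipped to the feasible range.
    p = weight * bandwidth_hz / (price * math.log(2.0)) - noise_psd * bandwidth_hz / gain
    return min(max(p, 0.0), p_max)

if __name__ == "__main__":
    # Hypothetical link parameters and coefficients.
    W, N0, g, p_max = 1e5, 1e-18, 1e-9, 1e-2
    stream, coeff = best_stream([0.2, 1.0, -0.5])
    p_star = link_power(coeff, 1e8, g, W, N0, p_max)
    print(stream, p_star, awgn_capacity(p_star, g, W, N0))
```

Because links do not interfere under this assumption, the network-wide problem separates into one such scalar problem per link, which is why a closed form is available here but not in general.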
4 Performance Analysis of the Proposed Control Policies
In this section, we first derive a lower bound on the minimum required power for stability. We then use Lyapunov Optimization Theorem [22] to compare the proposed control policy to the derived lower bound.
4.1 Lower Bound on the Minimum Power for Stability
In order to obtain a lower bound on the minimum power that stabilizes the queue of each link, we substitute the instantaneous battery constraint (2.3) with a more relaxed constraint on the average power consumption. The battery constraint along with (7) imply that
(17) 
where a limited initial battery charge is assumed. The last inequality in (17) holds under any policy that conforms to the battery constraint. We name (17) the average power constraint and use it as a substitute for the battery constraint (2.3).
Let denote the arrival rate vector of the data streams. We define the capacity region as the set of all data arrival rate vectors that can be stabilized under the average power constraint (interested readers are referred to [23] for more details on the network capacity region). The following theorem introduces a randomized policy that achieves the minimum power consumption among all policies satisfying the average power constraint.
Theorem 1
Suppose that the channel states and data arrivals are i.i.d. over different timeslots. Moreover, assume that the arrival rates belong to the capacity region (i.e., ). The minimum power required for stability, , can be obtained by a stationary and possibly randomized policy. This policy is a pure function of , and , with the following properties
(18)  
(19) 
The proof is similar to the proof of Theorem 4.5 in [22], and hence, is omitted for brevity.
Note that Theorem 1 only states that a stationary optimal policy with the aforementioned properties exists; it does not derive such a policy. In the sequel, we use these properties to compare the average power consumption under our proposed policy to the lower bound on the minimum required power for stability, i.e., .
4.2 Performance of the Proposed Policy
In this section, we derive the optimality gap of our proposed policy. Moreover, we show that the proposed policy stabilizes the network and conforms to the battery constraint. The following theorem summarizes the performance of the proposed policy.
Theorem 2
Suppose the channel states and data arrivals are i.i.d. over timeslots, and the arrival rates are strictly inside the capacity region, i.e., there is a scalar such that , where is a vector with all entries equal to . Under the proposed policy,

At any timeslot , the transmission power assigned to the data links originating from node is nonzero only if its battery level is higher than the maximum data transmission power, i.e., .

The time average expected power consumption satisfies,
(20) 
The queues are stable and time average expected sum backlog satisfies,
(21) where .
It should be noted that part 1 in Theorem 2 guarantees that our proposed policy does not violate the battery level constraint. Moreover, parts 2 and 3 show the optimality of the power consumption and stability of the network under our proposed policy, respectively.
Remark 2
Note that the performance bounds in (20) and (21) introduce a tradeoff between the optimality gap and the average queue backlog. According to this tradeoff, when the average power consumption is within of the minimum required power, the average backlog could be upper bounded by a term of the order of .
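The tradeoff in this remark can be observed on a toy problem. The Python sketch below is not the network policy of this paper: it applies a generic drift-plus-penalty rule to a single queue with an on/off transmitter, a two-state channel and Bernoulli arrivals, all of which are assumptions chosen for illustration. The control parameter is named `V` here as a hypothetical label.

```python
import random

def simulate(V, slots=20000, seed=0):
    """Single-queue drift-plus-penalty toy; returns (average power, average backlog)."""
    rng = random.Random(seed)
    queue, total_power, total_backlog = 0, 0, 0
    for _ in range(slots):
        rate_if_on = 2 if rng.random() < 0.5 else 1  # good or bad channel state
        arrival = 1 if rng.random() < 0.5 else 0     # Bernoulli packet arrival
        # Minimize V * power - queue * rate(power) over power in {0, 1}.
        power = 1 if queue * rate_if_on > V else 0
        total_power += power
        total_backlog += queue
        queue = max(queue - rate_if_on * power, 0) + arrival
    return total_power / slots, total_backlog / slots

if __name__ == "__main__":
    for V in (0, 10, 50):
        avg_power, avg_backlog = simulate(V)
        print(f"V={V:3d}  average power={avg_power:.3f}  average backlog={avg_backlog:.2f}")
```

A larger control parameter makes the rule wait for a long queue and a good channel before transmitting, which lowers the average power while raising the average backlog.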
Now we prove Theorem 2.
Part 1 is proven in Appendix 8; optimality (part 2) and stability (part 3) are proven here. Suppose that the arrival rate is . Since the arrivals are i.i.d., according to Theorem 1 there is a stationary randomized policy with the following properties,
(22) 
where is the minimum power required for stability when the arrival rate equals . Let denote the above stationary policy. Our proposed policy minimizes the right-hand side of (10) over all alternative policies, including . Plugging the properties in (22) into the right-hand side of (10) yields,
Taking expectation with respect to and from both sides results in
Then, summing both sides over yields
Now by rearranging the terms and dropping the negative terms when appropriate, we get the following inequalities:
(23)  
(24) 
The bounds in (23) and (24) can be separately optimized over values of . Since , letting in (23) and taking limits as concludes the second statement of Theorem 2. To prove the last statement, we first substitute the inner summation in (24) with summation over all data queues and then add the term to the right hand side as a compensation. Setting and taking limits as completes the proof of the third statement of Theorem 2.
5 Simulation Results
In this section, we consider a wireless network consisting of one EAP and five wireless nodes, as shown in Fig. 4. There are two streams of data, from node 1 to node 4 and from node 2 to node 5, with average arrival rates of bit/slot and bit/slot, respectively. The channel states of the data links and energy links are generated according to the Rician fading model [24], with the Rician factor equal to 1. The EAP is equipped with antennas that are configured as a half-wavelength-spaced array. Moreover, similar to existing works in the literature (e.g., see [13]), we assume no interference across the data links and consider an AWGN model for their capacity, as follows
where kHz and dBm/Hz are the channel bandwidth and the noise spectral density, respectively. Finally, the maximum transmission powers of the EAP and the nodes are set to and , respectively. All numerical results have been obtained by running the simulation for timeslots using MATLAB 2015a on a simulation platform with 20 cores and 256 GB of RAM.
Fig. 5 shows the average power consumption of the EAP as well as the average backlog of the data queues in the network, versus the tradeoff parameter . As can be seen in Figs. 5(a) and 5(b), the average power consumption decays very fast as increases, while the average data queue backlog increases linearly with . Such behavior agrees with our theoretical results derived in (20) and (21).
Next, Fig. 6 depicts a sample path for the data queue backlog process and the battery level process of node 1 (for ). It can be verified from this figure that the queue backlog is stabilized around Megabits while no energy outage has occurred.
Finally, Fig. 7 shows the average transmission pattern of the EAP (for ). As can be clearly seen in this figure, there are three distinct peaks in the transmission pattern of the EAP in the directions toward nodes 1, 2 and 3, which are the most congested nodes in this network. Note that the peak at is due to the linear structure of the EAP antenna (the gains at and for a linear array are reciprocal). Moreover, as can be verified from this figure, the maximum value of the pattern is in the direction toward node 3. This is due to the fact that all network traffic passes through this node.
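The mirror peak of a linear array mentioned above can be reproduced with a short Python sketch. It assumes a uniform linear array with half-wavelength spacing, a far-field steering vector whose phase depends on the cosine of the angle measured from the array axis, and a beam steered by conjugate matching. The number of antennas, the steering angle and the function names are hypothetical and are not taken from the simulation setup.

```python
import cmath
import math

def steering_vector(theta_deg, num_antennas):
    """Far-field response of a half-wavelength ULA; angle measured from the array axis."""
    phase = math.pi * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * phase * m) for m in range(num_antennas)]

def matched_weights(theta_deg, num_antennas):
    """Unit-norm beamforming vector steered toward theta_deg by conjugate matching."""
    a = steering_vector(theta_deg, num_antennas)
    return [x.conjugate() / math.sqrt(num_antennas) for x in a]

def array_gain(weights, theta_deg):
    """Power gain of the beam in direction theta_deg."""
    a = steering_vector(theta_deg, len(weights))
    return abs(sum(am * wm for am, wm in zip(a, weights))) ** 2

if __name__ == "__main__":
    M = 4  # hypothetical number of antennas
    w = matched_weights(60.0, M)
    for theta in (60.0, -60.0, 120.0):
        print(f"theta={theta:6.1f}  gain={array_gain(w, theta):.3f}")
```

Since the response depends on the angle only through its cosine, a beam steered toward one direction produces an identical lobe in the mirrored direction on the other side of the array axis, which is consistent with the additional peak observed in the pattern.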
6 Conclusion
In this paper, we focused on a wireless powered communication network with battery-operated nodes and proposed a joint power allocation, data routing and energy beamforming policy to minimize the average power consumption in the network. The proposed policy adapts to general networks with arbitrary channel models, without any knowledge of the channel statistics. Through theoretical analysis, we proved that our proposed policy conforms to the battery constraint and stabilizes the network. Moreover, we derived the optimality gap for the average power consumption under this policy. Finally, numerical results were provided to illustrate the performance of the proposed solution.
7 Upper Bound for Drift Plus Penalty Function
Here we prove that the inequality in (10) holds. We enumerate three different cases for and , and bound the increment of over successive timeslots for each case:

:

and :

and :
Considering the above three cases and taking summation over for and we would have,
(28) 
where . Adding to both sides of (28), taking expectation conditioned on and and rearranging the terms proves the intended result.
8 The Proposed Policy Conforms to Battery Constraint
Here we prove part 1 of Theorem 2. Let us assume for a specific node . Consider a data link such that and a power vector . Let us define another power vector, , by setting the th entry in to zero. The transmission powers of the data links are determined by the solution of the minimization problem,
(29) 
To prove the intended result, it suffices to show . The following inequalities always hold,
(30)  
(31)  
(32)  
(33) 
where (32) and (33) are due to the properties of the capacity function in (3) and (2). It can be verified that