Two-Scale Stochastic Control for Integrated Multipoint Communication Systems with Renewables
Abstract
Increasing threats of global warming and climate change call for an energy-efficient and sustainable design of future wireless communication systems. To this end, a novel two-scale stochastic control framework is put forth for smart-grid powered coordinated multipoint (CoMP) systems. Taking into account renewable energy sources (RES), dynamic pricing, two-way energy trading facilities and imperfect energy storage devices, the energy management task is formulated as an infinite-horizon optimization problem minimizing the time-averaged energy transaction cost, subject to the users’ quality of service (QoS) requirements. Leveraging the Lyapunov optimization approach as well as the stochastic subgradient method, a two-scale online control (TSOC) approach is developed for the resultant smart-grid powered CoMP systems. Using only historical data, the proposed TSOC makes online control decisions at two timescales, and features a provably feasible and asymptotically near-optimal solution. Numerical tests further corroborate the theoretical analysis, and demonstrate the merits of the proposed approach.
Two-scale control, ahead-of-time market, real-time market, battery degeneration, CoMP systems, smart grids, Lyapunov optimization.
I Introduction
Interference is a major obstacle in wireless communication systems due to their broadcast nature, and becomes more severe in next-generation spectrum- and energy-constrained cellular networks with smaller cells and more flexible frequency reuse [1]. With ever increasing demand for energy-efficient transmissions, coordinated multipoint processing (CoMP) has been proposed as a promising paradigm for efficient inter-cell interference management in heterogeneous networks (HetNets) [2]. In CoMP systems, base stations (BSs) are partitioned into clusters, where BSs per cluster perform coordinated beamforming to serve the users [3, 4, 5]. As the number of BSs in HetNets increases, their electricity consumption constitutes a major part of their operational expenditure, and contributes a considerable portion to the global carbon footprint [6]. Fortunately, emerging characteristics of smart grids offer ample opportunities to achieve both energy-efficient and environmentally friendly communication solutions. Such characteristics include high penetration of renewable energy sources (RES), two-way energy trading, and dynamic-pricing-based demand-side management (DSM) [9, 10, 11]. In this context, energy-efficient “green” communication solutions have been proposed for their economic and ecological merits [3, 4, 6, 5]. Driven by the need for sustainable “green communications,” manufacturers and network operators such as Ericsson, Huawei, Vodafone and China Mobile have started developing “green” BSs that can be jointly supplied by the persistent power sources from the main electric grid as well as from harvested renewable energy sources (e.g., solar and wind) [7, 8]. It is expected that renewable-powered BSs will be widely deployed to support future-generation cellular systems.
A few recent works have considered smart-grid powered CoMP transmissions [12, 13, 14, 15]. Assuming that the energy harvested from RES is accurately available a priori through, e.g., forecasting, [12] and [13] considered energy-efficient resource allocation for RES-powered CoMP downlinks. Building on realistic models, our recent work dealt with robust energy management and transmit-beamforming designs that minimize the worst-case energy transaction cost for the CoMP downlink with RES and DSM [14]. Leveraging novel stochastic optimization tools [16, 17, 18], we further developed an efficient approach to obtain a feasible and asymptotically optimal online control scheme for smart-grid powered CoMP systems, without knowing the distributions of involved random variables [15].
A salient assumption in [12, 13, 14, 15] is that all involved resource allocation tasks are performed in a single time scale. However, RES and wireless channel dynamics typically evolve over different time scales in practice. Development of two-scale control schemes is then well motivated for CoMP systems with RES. In related contexts, a few stochastic-optimization-based two-scale control schemes were recently proposed and analyzed in [19, 20, 21, 22]. Extending the traditional Lyapunov optimization approach [16, 17, 18], [19] introduced a two-scale control algorithm that makes distributed routing and server management decisions to reduce power cost for large-scale data centers. Based on a similar approach, [20] developed a so-called MultiGreen algorithm for data centers, which allows cloud service providers to make energy transactions at two time scales for minimum operational cost. As far as wireless communications are concerned, [21] performed joint precoder assignment, user association, and channel resource scheduling for HetNets with non-ideal backhaul, while [22] introduced a two-timescale approach for network selection and subchannel allocation for integrated cellular and Wi-Fi networks with an emphasis on using predictive future information. Note, however, that neither [21] nor [22] considers the diversity of energy prices in fast/slow-timescale energy markets, or the energy leakage effects in the energy management task.
In the present paper, we develop a two-scale online control (TSOC) approach for smart-grid powered CoMP systems considering RES, dynamic pricing, two-way energy trading facilities and imperfect energy storage devices. Suppose that the RES harvesting occurs at the BSs over a slow timescale relative to the coherence time of wireless channels. The proposed scheme performs an ahead-of-time (e.g., 15-minute-ahead or hour-ahead) energy planning upon RES arrivals, while deciding real-time energy balancing and transmit-beamforming schedules per channel coherence time slot. Specifically, the TSOC determines the amount of energy to trade (purchase or sell) with the ahead-of-time wholesale market based on RES generation, as the basic energy supply for all the time slots within a RES harvesting interval. On the other hand, it decides the amount of energy to trade with the real-time market, energy charging to (or discharging from) the batteries, as well as the coordinated transmit-beamformers to guarantee the users’ quality of service (QoS) per time slot. Generalizing the Lyapunov optimization techniques in [19, 20, 21, 22, 23], we propose a synergetic framework to design and analyze such a two-scale dynamic management scheme to minimize the long-term time-averaged energy transaction cost of the CoMP transmissions, without knowing the distributions of the random channel, RES, and energy price processes. The main contributions of our work are summarized as follows.

Leveraging the ahead-of-time and real-time electricity markets, and building on our generalized system models in [14, 15], a novel two-scale optimization framework is developed to facilitate the dynamic resource management for smart-grid powered CoMP systems with RES and channel dynamics at different time scales.

While [15, 19] and [20] do not account for battery degeneration (energy leakage), we integrate the modified Lyapunov optimization technique into the two-scale stochastic optimization approach to leverage the diversity of energy prices along with the energy leakage effects on the dynamic energy management task.

Using only past channel and energy-price realizations, a novel stochastic subgradient approach is developed to solve the ahead-of-time energy planning (sub)problem. The approach is suitable for a general family of continuous distributions, and avoids constructing a histogram estimate, which is computationally cumbersome, especially for high-dimensional vectors of random optimization variables.

Rigorous analysis is presented to justify the feasibility and quantify the optimality gap for the proposed two-scale online control algorithm.
The rest of the paper is organized as follows. The system models are described in Section II. The proposed dynamic resource management scheme is developed in Section III. Performance analysis is the subject of Section IV. Numerical tests are provided in Section V, followed by concluding remarks in Section VI.
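Notation. Boldface lower (upper) case letters represent vectors (matrices); $\mathbb{C}^{N \times 1}$ and $\mathbb{R}^{N \times M}$ stand for spaces of $N \times 1$ complex vectors and $N \times M$ real matrices, respectively; $(\cdot)'$ denotes transpose, and $(\cdot)^H$ conjugate transpose; $\mathrm{diag}(a_1, \ldots, a_M)$ denotes a diagonal matrix with diagonal elements $a_1, \ldots, a_M$; $|\cdot|$ the magnitude of a complex scalar; and $\mathbb{E}$ denotes expectation.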
II System Models
Consider a cluster-based CoMP downlink setup, where a set $\mathcal{I} := \{1, \ldots, I\}$ of distributed BSs (e.g., macro/micro/pico BSs) is selected to serve a set $\mathcal{K} := \{1, \ldots, K\}$ of mobile users, as in e.g., [14, 15]. Each BS is equipped with $M \geq 1$ transmit antennas, whereas each user has a single receive antenna. Suppose that through the smart-grid infrastructure conventional power generation is available, but each BS can also harvest RES (through e.g., solar panels and/or wind turbines), and it has an energy storage device (i.e., battery) to save the harvested energy. Relying on a two-way energy trading facility, the BS can also buy energy from or sell energy to the main grid at dynamically changing market prices. For the CoMP cluster, there is a low-latency backhaul network connecting the set of BSs to a central controller [4], which coordinates energy trading as well as cooperative communication. This central entity can collect both communication data (transmit messages, channel state information) from each BS through the cellular backhaul links, as well as the energy information (energy purchase/selling prices, energy queue sizes) via smart meters installed at BSs, and the grid-deployed communication/control links connecting them.¹
¹ Perfect channel state information will be assumed hereafter, but the proposed formulation can readily account for the channel estimation errors to robustify the beamforming design; see e.g., [14, 15]. In addition, generalizations are possible to incorporate imperfect energy queue information based on the Lyapunov optimization framework in [20]. Although their detailed study falls outside the scope of the present paper, such imperfections are not expected to substantially affect the effectiveness of the proposed scheme.
As the RES and wireless channel dynamics typically emerge at different time scales in practice, we propose a two-scale control mechanism. As shown in Fig. 1, time is divided into slots of length smaller than the coherence time of the wireless channels. For convenience, the slot duration is normalized to unity; thus, the terms “energy” and “power” can be used interchangeably. On the other hand, we define the (virtual) “coarse-grained” time intervals in accordance with the slow RES harvesting scale, with each coarse-grained interval consisting of $T$ time slots.
II-A Ahead-of-Time Energy Planning
At the beginning of each “coarse-grained” interval, namely at time $t = nT$, $n = 1, 2, \ldots$, let $A_{i,n}$ denote the RES amount collected per BS $i \in \mathcal{I}$, and $\mathbf{A}_n := [A_{1,n}, \ldots, A_{I,n}]'$. With $\mathbf{A}_n$ available, an energy planner at the central unit decides the energy amounts $E_i[n]$, $\forall i$, to be used in the next $T$ slots per BS $i$. With a two-way energy trading facility, the BSs then either purchase energy from the main grid according to their shortage, or sell their surplus energy to the grid at a fair price in order to reduce operational costs. Specifically, following the decision, BS $i$ contributes its RES amount $A_{i,n}$ to the main grid and requests the grid to supply an average energy amount of $E_i[n]/T$ per slot $t = nT, \ldots, (n+1)T - 1$.
RES is assumed harvested for free after deployment. Given the requested energy $E_i[n]$ and the harvested energy $A_{i,n}$, the shortage energy that is purchased from the grid for BS $i$ is clearly $[E_i[n] - A_{i,n}]^+$; or, the surplus energy that is sold to the grid is $[A_{i,n} - E_i[n]]^+$, where $[a]^+ := \max\{a, 0\}$. Depending on the difference $E_i[n] - A_{i,n}$, the BS either buys electricity from the grid with the ahead-of-time (i.e., long-term) price $\alpha_n^{lt}$, or sells electricity to the grid with price $\beta_n^{lt}$ for profit (the latter leads to a negative cost). Notwithstanding, we shall always set $\alpha_n^{lt} > \beta_n^{lt}$ to avoid meaningless buy-and-sell activities of the BSs for profit. The transaction cost with BS $i$ for such an energy planning is therefore given by
$$G^{lt}(E_i[n]) := \alpha_n^{lt} [E_i[n] - A_{i,n}]^+ - \beta_n^{lt} [A_{i,n} - E_i[n]]^+. \tag{1}$$
For conciseness, we concatenate into a single random vector $\boldsymbol{\xi}_n^{lt}$ all the random variables evolving at this slow timescale; i.e., $\boldsymbol{\xi}_n^{lt} := \{\alpha_n^{lt}, \beta_n^{lt}, \mathbf{A}_n\}$.
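As a minimal sketch of the ahead-of-time cost in (1), the following Python function charges the purchase price on the energy shortage and credits the selling price on the surplus. The function and argument names are illustrative assumptions and do not come from the paper.

```python
def ahead_of_time_cost(requested, harvested, buy_price, sell_price):
    """Ahead-of-time transaction cost of one BS for one coarse-grained interval.

    A shortage (requested > harvested) is bought at buy_price; a surplus is
    sold at sell_price, which yields a negative cost (profit).
    """
    if buy_price <= sell_price:
        # The model assumes the purchase price strictly exceeds the selling price.
        raise ValueError("buy_price must exceed sell_price")
    shortage = max(requested - harvested, 0.0)
    surplus = max(harvested - requested, 0.0)
    return buy_price * shortage - sell_price * surplus
```

For instance, requesting 10 units while harvesting 4 at prices 2 (buy) and 1 (sell) costs 12, whereas requesting 4 while harvesting 10 earns 6.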
II-B CoMP Downlink Transmissions
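Per slot $t$, let $\mathbf{h}_{ik,t} \in \mathbb{C}^{M \times 1}$ denote the vector channel from BS $i$ to user $k$, $\forall i \in \mathcal{I}$, $\forall k \in \mathcal{K}$; let $\mathbf{h}_{k,t} := [\mathbf{h}_{1k,t}', \ldots, \mathbf{h}_{Ik,t}']'$ collect the channel vectors from all BSs to user $k$, and $\mathbf{H}_t := [\mathbf{h}_{1,t}, \ldots, \mathbf{h}_{K,t}]$. With linear transmit beamforming performed across BSs, the vector signal transmitted to user $k$ is $\mathbf{q}_k(t) = \mathbf{w}_k(t) s_k(t)$, $\forall k$, where $s_k(t)$ denotes the information-bearing scalar symbol with unit energy, and $\mathbf{w}_k(t) \in \mathbb{C}^{MI \times 1}$ denotes the beamforming vector across the BSs serving user $k$. The received signal at slot $t$ for user $k$ is therefore
$$y_k(t) = \mathbf{h}_{k,t}^H \mathbf{q}_k(t) + \sum_{l \neq k} \mathbf{h}_{k,t}^H \mathbf{q}_l(t) + n_k(t) \tag{2}$$
where $\mathbf{h}_{k,t}^H \mathbf{q}_k(t)$ is the desired signal of user $k$, $\sum_{l \neq k} \mathbf{h}_{k,t}^H \mathbf{q}_l(t)$ is the inter-user interference from the same cluster, and $n_k(t)$ denotes additive noise, which may also include the downlink interference from other BSs outside user $k$’s cluster. It is assumed that $n_k(t)$ is a circularly symmetric complex Gaussian (CSCG) random variable with zero mean and variance $\sigma_k^2$.
The signal-to-interference-plus-noise ratio (SINR) at user $k$ can be expressed as
$$\mathrm{SINR}_k(\{\mathbf{w}_k(t)\}) = \frac{|\mathbf{h}_{k,t}^H \mathbf{w}_k(t)|^2}{\sum_{l \neq k} |\mathbf{h}_{k,t}^H \mathbf{w}_l(t)|^2 + \sigma_k^2}. \tag{3}$$
The transmit power at each BS $i$ is clearly given by
$$P_{x,i}(t) = \sum_{k \in \mathcal{K}} \mathbf{w}_k^H(t) \mathbf{B}_i \mathbf{w}_k(t) \tag{4}$$
where the matrix
$$\mathbf{B}_i := \mathrm{diag}\big(\underbrace{0, \ldots, 0}_{(i-1)M}, \underbrace{1, \ldots, 1}_{M}, \underbrace{0, \ldots, 0}_{(I-i)M}\big) \in \mathbb{R}^{MI \times MI}$$
selects the corresponding rows out of $\{\mathbf{w}_k(t)\}_{k \in \mathcal{K}}$ to form the $i$-th BS’s transmit-beamforming vector of size $M \times 1$.
To guarantee QoS per slot for each user $k$, it is required that the central controller selects a set of $\{\mathbf{w}_k(t)\}$ satisfying [cf. (3)]
$$\mathrm{SINR}_k(\{\mathbf{w}_k(t)\}) \geq \gamma_k, \quad \forall k \tag{5}$$
where $\gamma_k$ denotes the target SINR value per user $k$.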
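The SINR in (3) and the per-BS transmit power in (4) can be evaluated numerically along the following lines. This is a sketch under stated assumptions: channels and beamformers are stored column-wise in NumPy arrays (one column per user, antennas of each BS stacked in consecutive rows), and the function names are hypothetical.

```python
import numpy as np

def sinr_per_user(H, W, noise_var):
    """SINR of every user for stacked channels H and beamformers W.

    H, W: complex arrays of shape (num_bs * num_antennas, num_users);
    column k holds the channel / beamformer of user k.
    noise_var: array of shape (num_users,) with the noise variances.
    """
    gains = np.abs(H.conj().T @ W) ** 2      # gains[k, l] = |h_k^H w_l|^2
    signal = np.diag(gains)                  # desired-signal power per user
    interference = gains.sum(axis=1) - signal
    return signal / (interference + noise_var)

def per_bs_transmit_power(W, num_bs, num_antennas):
    """Transmit power of each BS: energy in its block of rows, summed over users."""
    row_power = (np.abs(W) ** 2).sum(axis=1)
    return row_power.reshape(num_bs, num_antennas).sum(axis=1)
```

A set of beamformers is feasible for the QoS constraint (5) when every entry returned by `sinr_per_user` is at least the corresponding target SINR.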
II-C Real-Time Energy Balancing
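For the $i$-th BS, the total energy consumption $P_{g,i}(t)$ per slot $t$ includes the transmission-related power $P_{x,i}(t)$, and the rest that is due to other components such as air conditioning, data processor, and circuits, which can be generally modeled as a constant power, $P_c > 0$ [13]. We further suppose that $P_{g,i}(t)$ is bounded by $P_g^{\max}$. Namely,
$$P_{g,i}(t) = P_c + \sum_{k \in \mathcal{K}} \mathbf{w}_k^H(t) \mathbf{B}_i \mathbf{w}_k(t) \leq P_g^{\max}, \quad \forall i. \tag{6}$$
Per slot $t$, the energy supply available from the ahead-of-time planning may not exactly meet the actual demand at BS $i$. Hence, the BS is also allowed to perform real-time energy trading with the main grid to balance its supply with demand. Let $P_i(t)$ denote the real-time energy amount that is purchased from ($P_i(t) > 0$) or sold to ($P_i(t) < 0$) the grid by BS $i$. Let $\alpha_t^{rt}$ and $\beta_t^{rt}$ ($\alpha_t^{rt} > \beta_t^{rt}$) denote the real-time energy purchase and selling prices, respectively. Then the real-time energy transaction cost for BS $i$ is
$$G^{rt}(P_i(t)) := \alpha_t^{rt} [P_i(t)]^+ - \beta_t^{rt} [-P_i(t)]^+. \tag{7}$$
Fig. 2 depicts the day-ahead and real-time energy prices in the Pennsylvania-Jersey-Maryland (PJM) wholesale market [24]. In practice, the average purchase price in the real-time market tends to be no lower than that in the day-ahead market; that is, $\mathbb{E}\{\alpha_t^{rt}\} \geq \mathbb{E}\{\alpha_n^{lt}\}$; similarly, we have $\mathbb{E}\{\beta_t^{rt}\} \leq \mathbb{E}\{\beta_n^{lt}\}$. Again, we use a random vector $\boldsymbol{\xi}_t^{rt} := \{\alpha_t^{rt}, \beta_t^{rt}, \mathbf{H}_t\}$ to collect all random variables evolving at the fast timescale.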
II-D Energy Storage with Degeneration
As energy consumption will become a major concern of future large-scale cellular networks, uninterruptible power supply type storage units can be installed at the BSs to prevent power outages, and provide opportunities to optimize the BSs’ electricity bills. Different from the ideal battery models in [12, 13, 14, 15, 20], we consider here a practical battery with degeneration (i.e., energy leakage over time even in the absence of discharging) as in [23].
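For the battery of the $i$-th BS, let $C_i(0)$ denote the initial amount of stored energy, and $C_i(t)$ its state of charge (SoC) at the beginning of time slot $t$. The battery is assumed to have a finite capacity $C^{\max}$. Furthermore, for reliability purposes, it might be required to ensure that a minimum energy level $C^{\min}$ is maintained at all times. Let $P_{b,i}(t)$ denote the energy delivered to or drawn from the battery at slot $t$, which amounts to either charging ($P_{b,i}(t) > 0$) or discharging ($P_{b,i}(t) < 0$). The stored energy then obeys the dynamic equation
$$C_i(t+1) = \eta C_i(t) + P_{b,i}(t), \quad C^{\min} \leq C_i(t) \leq C^{\max}, \quad \forall i \tag{8}$$
where $\eta \in (0, 1]$ denotes the storage efficiency (e.g., $\eta = 0.9$ means that 10% of the stored energy will be “leaked” over a slot, even in the absence of discharging).
The amount of power (dis)charged is also assumed bounded by
$$P_b^{\min} \leq P_{b,i}(t) \leq P_b^{\max}, \quad \forall i \tag{9}$$
where $P_b^{\min} < 0$ and $P_b^{\max} > 0$ are introduced by physical constraints.
With $n_t := \lfloor t/T \rfloor$ and consideration of $P_{b,i}(t)$, we have the following demand-and-supply balance equation per slot $t$:
$$P_c + \sum_{k \in \mathcal{K}} \mathbf{w}_k^H(t) \mathbf{B}_i \mathbf{w}_k(t) + P_{b,i}(t) = \frac{E_i[n_t]}{T} + P_i(t), \quad \forall i. \tag{10}$$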
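The leaky battery update in (8), the (dis)charging bounds in (9), and the balance equation (10) can be sketched as two small helpers. All names below are assumptions made for illustration; the balance helper simply solves the supply-demand equality for the real-time trade.

```python
def step_battery(soc, charge, efficiency, soc_min, soc_max, charge_min, charge_max):
    """One slot of the leaky battery: next SoC = efficiency * SoC + charge."""
    if not charge_min <= charge <= charge_max:
        raise ValueError("(dis)charging amount violates its bounds")
    next_soc = efficiency * soc + charge
    if not soc_min <= next_soc <= soc_max:
        raise ValueError("state of charge leaves the allowed range")
    return next_soc

def realtime_trade(constant_power, transmit_power, charge,
                   planned_energy, slots_per_interval):
    """Energy to buy (> 0) or sell (< 0) in real time so that supply meets demand.

    Demand is the BS consumption plus battery charging; the ahead-of-time plan
    supplies planned_energy / slots_per_interval in every slot.
    """
    demand = constant_power + transmit_power + charge
    return demand - planned_energy / slots_per_interval
```

With an efficiency of 0.9 and no charging, a stored amount of 10 decays to 9 after one slot, which is the 10% leakage mentioned in the text.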
III Dynamic Resource Management Scheme
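Note that the harvested RES amounts $\{\mathbf{A}_n\}$, the ahead-of-time prices $\{\alpha_n^{lt}, \beta_n^{lt}\}$, the real-time prices $\{\alpha_t^{rt}, \beta_t^{rt}\}$, and the wireless channel matrices $\{\mathbf{H}_t\}$ are all random. The smart-grid powered CoMP downlink to be controlled is a stochastic system. The goal is to design an online resource management scheme that chooses the ahead-of-time energy-trading amounts $\{E_i[n]\}$ at every $t = nT$, as well as the real-time energy-trading amounts $\{P_i(t)\}$, battery (dis)charging amounts $\{P_{b,i}(t)\}$, and the CoMP beamforming vectors $\{\mathbf{w}_k(t)\}$ per slot $t$, so as to minimize the expected total energy transaction cost, without knowing the distributions of the aforementioned random processes.
According to (1) and (7), define the energy transaction cost for BS $i$ per slot $t$ as:
$$\Phi_i(t) := \frac{1}{T} G^{lt}(E_i[n_t]) + G^{rt}(P_i(t)). \tag{11}$$
Let $\mathcal{X} := \{E_i[n], \forall i, n;\ P_i(t), P_{b,i}(t), C_i(t), \forall i, t;\ \mathbf{w}_k(t), \forall k, t\}$. The problem of interest is to find
$$\Phi^{opt} := \min_{\mathcal{X}} \lim_{N \to \infty} \frac{1}{NT} \sum_{t=0}^{NT-1} \sum_{i \in \mathcal{I}} \mathbb{E}\{\Phi_i(t)\} \quad \text{s. t. } (5), (6), (8), (9), (10), \ \forall t \tag{12}$$
where the expectations of $\Phi_i(t)$ are taken over all sources of randomness. Note that here the constraints (5), (6), (8), (9), and (10) are implicitly required to hold for every realization of the underlying random states $\boldsymbol{\xi}_t^{rt}$ and $\boldsymbol{\xi}_n^{lt}$.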
III-A Two-Scale Online Control Algorithm
Problem (12) is a stochastic optimization task. We next generalize and integrate the Lyapunov optimization techniques in [19, 20, 21, 22, 23] to develop a TSOC algorithm, which will be proven feasible, and asymptotically near-optimal for (12). To start, assume the following two relatively mild conditions for the system parameters:
$$P_b^{\max} \geq (1 - \eta) C^{\min} \tag{13}$$
$$C^{\max} - C^{\min} \geq \frac{1 - \eta^T}{1 - \eta} \big(P_b^{\max} - P_b^{\min}\big). \tag{14}$$
Condition (13) simply implies that the energy leakage of the battery can be compensated by the charging. Condition (14) requires that the allowable SoC range is large enough to accommodate the largest possible charging/discharging over the $T$ time slots of each coarse-grained interval. This then makes the system “controllable” by our two-scale mechanism.
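A quick numerical check of the two conditions can be sketched as follows. The sketch assumes that (13) reads as "the maximum charging amount covers the leakage at the minimum SoC level" and that (14) bounds the SoC range by the leakage-discounted sum of the largest (dis)charging swings over one coarse-grained interval. Both encodings, as well as all parameter names, are assumptions made for illustration.

```python
def conditions_hold(efficiency, soc_min, soc_max,
                    charge_min, charge_max, slots_per_interval):
    """Return (leak_covered, range_ok) under the assumed readings of (13)-(14)."""
    # Assumed reading of (13): full charging compensates the per-slot leakage.
    leak_covered = charge_max >= (1.0 - efficiency) * soc_min
    # Geometric sum 1 + eff + ... + eff^(T-1); equals T when there is no leakage.
    if efficiency == 1.0:
        horizon = float(slots_per_interval)
    else:
        horizon = (1.0 - efficiency ** slots_per_interval) / (1.0 - efficiency)
    # Assumed reading of (14): the SoC range absorbs the largest swing over T slots.
    range_ok = (soc_max - soc_min) >= horizon * (charge_max - charge_min)
    return leak_covered, range_ok
```

A small battery with a wide (dis)charging range fails the second check, which is the situation the text describes as not "controllable" by the two-scale mechanism.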
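Our algorithm depends on two parameters, namely a “queue perturbation” parameter $\Gamma$, and a weight parameter $V$. Define $\bar{\alpha} := \max\{\alpha_t^{rt}, \forall t\}$ and $\underline{\beta} := \min\{\beta_t^{rt}, \forall t\}$. Derived from the feasibility requirement of the proposed algorithm (see the proof of Proposition 1 in the sequel), any pair $(\Gamma, V)$ that satisfies the following conditions can be used:
$$\Gamma^{\min} \leq \Gamma \leq \Gamma^{\max}, \quad 0 < V \leq V^{\max} \tag{15}$$
where the bounds $\Gamma^{\min}$, $\Gamma^{\max}$, and $V^{\max}$ are defined in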
(16)  
(17)  
(18) 
Note that the interval for $V$ in (15) is well defined under condition (14), and the interval for $\Gamma$ is valid when $V \leq V^{\max}$.
We now present the proposed TSOC algorithm:

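Initialization: Select $\Gamma$ and $V$, and introduce a virtual queue $Q_i(0) := C_i(0) + \Gamma$, $\forall i$.

Ahead-of-time energy planning: Per interval $\tau = nT$, observe a realization $\boldsymbol{\xi}_n^{lt}$, and determine the energy amounts $\{E_i^*[n]\}$ by solving
$$\min_{\{E_i[n], P_i(t), P_{b,i}(t), \mathbf{w}_k(t)\}} \ \sum_{i \in \mathcal{I}} \Big\{ V \Big[ G^{lt}(E_i[n]) + \sum_{t=\tau}^{\tau+T-1} \mathbb{E}\{G^{rt}(P_i(t))\} \Big] + \sum_{t=\tau}^{\tau+T-1} Q_i(\tau)\, \mathbb{E}\{P_{b,i}(t)\} \Big\} \quad \text{s. t. } (5), (6), (9), (10), \ \forall t \tag{19}$$
where expectations are taken over $\boldsymbol{\xi}_t^{rt}$. Then the BSs trade energy with the main grid based on $\{E_i^*[n]\}$, and request the grid to supply an average amount $E_i^*[n]/T$ per slot $t = \tau, \ldots, \tau + T - 1$.

Energy balancing and beamforming schedule: At every slot $t \in [nT, (n+1)T - 1]$, observe a realization $\boldsymbol{\xi}_t^{rt}$, and decide $\{P_{b,i}^*(t), P_i^*(t), \mathbf{w}_k^*(t)\}$ by solving the following problem given $E_i[n] = E_i^*[n]$
$$\min_{\{P_i(t), P_{b,i}(t), \mathbf{w}_k(t)\}} \ \sum_{i \in \mathcal{I}} \big\{ V G^{rt}(P_i(t)) + Q_i(nT) P_{b,i}(t) \big\} \quad \text{s. t. } (5), (6), (9), (10). \tag{20}$$
The BSs perform real-time energy trading with the main grid based on $\{P_i^*(t)\}$, and coordinated beamforming based on $\{\mathbf{w}_k^*(t)\}$.

Queue updates: Per slot $t$, charge (or discharge) the battery based on $\{P_{b,i}^*(t)\}$, so that the stored energy $C_i(t+1) = \eta C_i(t) + P_{b,i}^*(t)$, $\forall i$; and update the virtual queues $Q_i(t+1) := C_i(t+1) + \Gamma$, $\forall i$.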
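The per-slot step of the algorithm above can be illustrated on a deliberately simplified scalar case. The sketch below assumes that the beamformers are already fixed, so the consumption of a BS in the slot is a known number, and that only the battery (dis)charging amount and the real-time trade remain to be chosen for one BS. Under these assumptions the weighted objective is piecewise linear and convex in the (dis)charging amount, so it suffices to compare the two bounds and the kink where the real-time trade is zero. All function and parameter names are hypothetical.

```python
def realtime_cost(trade, buy_price, sell_price):
    """Real-time cost: pay buy_price when buying, earn sell_price when selling."""
    return buy_price * max(trade, 0.0) - sell_price * max(-trade, 0.0)

def slot_decision(queue, weight, demand, planned_per_slot,
                  buy_price, sell_price, charge_min, charge_max):
    """Pick the (dis)charging amount minimizing weight * cost + queue * charge.

    demand: BS consumption in the slot (beamformers assumed fixed).
    planned_per_slot: energy supplied per slot by the ahead-of-time plan.
    Returns (charge, trade) with trade = demand + charge - planned_per_slot.
    """
    def objective(charge):
        trade = demand + charge - planned_per_slot
        return weight * realtime_cost(trade, buy_price, sell_price) + queue * charge

    # The objective is piecewise linear and convex in the charge, so a minimizer
    # lies at a bound or at the kink where the real-time trade equals zero.
    kink = min(max(planned_per_slot - demand, charge_min), charge_max)
    best = min((charge_min, kink, charge_max), key=objective)
    return best, demand + best - planned_per_slot
```

In this toy setting a strongly negative queue value makes charging attractive, while a large queue value makes full discharging optimal, which mirrors the threshold structure analyzed later in the performance analysis.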
Remark 1
Note that we use queue sizes $Q_i(nT)$ instead of $Q_i(t)$ in problems (19) and (20); see also [19, 20]. Recall that the main design principle in Lyapunov optimization is to choose control actions that minimize a drift-plus-penalty expression involving the queue backlogs at every slot. For the ahead-of-time energy planning, this requires a priori knowledge of the future queue backlogs $Q_i(t)$ over slots $t = nT + 1, \ldots, (n+1)T - 1$ at time $nT$. It is impractical to assume that this information is available. For this reason, we simply approximate future queue backlog values as the current value at $nT$, i.e., $Q_i(t) \approx Q_i(nT)$, $\forall t = nT + 1, \ldots, (n+1)T - 1$, in (19). To ensure that the real-time energy balancing and beamforming schedule solves the same problem as the ahead-of-time energy planning, we also use $Q_i(nT)$ in (20) although the real-time battery state of charge, and hence $Q_i(t)$, is available at slot $t$. Rigorous analysis shows that the performance penalty incurred by such an approximation does not affect the asymptotic optimality of the proposed stochastic control scheme. On the other hand, using $Q_i(t)$ in real-time energy balancing can also be suggested in practice. While our feasibility analysis affords such a modification, deriving the optimality gap is left for future research.
III-B Real-Time Energy Balancing and Beamforming
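It is easy to argue that the objective of (20) is convex. Indeed, with $\alpha_t^{rt} > \beta_t^{rt}$, the transaction cost with $P_i(t)$ can be alternatively written as
$$G^{rt}(P_i(t)) = \max\{\alpha_t^{rt} P_i(t),\ \beta_t^{rt} P_i(t)\} \tag{21}$$
which is clearly convex [25]; and so is the objective in (20).
The SINR constraints in (5) can actually be rewritten into a convex form. Observe that an arbitrary phase rotation can be added to the beamforming vectors $\mathbf{w}_k(t)$ without affecting the SINRs. Hence, we can choose a phase so that $\mathbf{h}_{k,t}^H \mathbf{w}_k(t)$ is real and nonnegative. Then by proper rearrangement, the SINR constraints become convex second-order cone (SOC) constraints [27]; that is,
$$\sqrt{\sum_{l \neq k} |\mathbf{h}_{k,t}^H \mathbf{w}_l(t)|^2 + \sigma_k^2} \leq \frac{1}{\sqrt{\gamma_k}} \mathrm{Re}\{\mathbf{h}_{k,t}^H \mathbf{w}_k(t)\}, \quad \mathrm{Im}\{\mathbf{h}_{k,t}^H \mathbf{w}_k(t)\} = 0, \quad \forall k.$$
Problem (20) can then be rewritten as the convex program
$$\min_{\{P_i(t), P_{b,i}(t), \mathbf{w}_k(t)\}} \ \sum_{i \in \mathcal{I}} \big\{ V G^{rt}(P_i(t)) + Q_i(nT) P_{b,i}(t) \big\} \quad \text{s. t. } (6), (9), (10), \text{ and the SOC constraints above.} \tag{22}$$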
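The equivalence between the buy/sell form of the real-time cost in (7) and the max form in (21) is easy to confirm numerically. The following sketch uses hypothetical function names and checks the two expressions against each other; the max of two linear functions is convex, which is what the argument above relies on.

```python
def realtime_cost(trade, buy_price, sell_price):
    """Buy/sell form: pay buy_price on purchases, earn sell_price on sales."""
    return buy_price * max(trade, 0.0) - sell_price * max(-trade, 0.0)

def realtime_cost_max_form(trade, buy_price, sell_price):
    """Equivalent form when buy_price > sell_price: maximum of two linear functions."""
    return max(buy_price * trade, sell_price * trade)

def forms_agree(trades, buy_price, sell_price, tol=1e-12):
    """True when both forms coincide on every trade amount in trades."""
    return all(
        abs(realtime_cost(p, buy_price, sell_price)
            - realtime_cost_max_form(p, buy_price, sell_price)) <= tol
        for p in trades
    )
```

Note that the equivalence needs the purchase price to exceed the selling price; if the order is reversed, the max form no longer matches the buy/sell form.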
III-C Ahead-of-Time Energy Planning
To solve (19), the probability distribution function (pdf) of the fast-timescale random state $\boldsymbol{\xi}_t^{rt}$ must be known across slots $t = nT, \ldots, (n+1)T - 1$. However, this pdf is seldom available in practice. Suppose that $\boldsymbol{\xi}_t^{rt}$ is independent and identically distributed (i.i.d.) over time slots, and takes values from a finite state space. It was proposed in [19] to obtain an empirical pdf of $\boldsymbol{\xi}_t^{rt}$ from past realizations over a large window comprising many past intervals. This estimate becomes accurate as the window grows sufficiently large; then it can be used to evaluate the expectations in (19). Based on such an empirical pdf, an approximate solution for (19) could be obtained.
Different from [19], here we propose a stochastic gradient approach to solve (19). Suppose that $\boldsymbol{\xi}_t^{rt}$ is i.i.d. across time slots (but not necessarily with a finite support). For stationary $\boldsymbol{\xi}_t^{rt}$, we can remove the index $t$ from all optimization variables, and rewrite (19) as (with shorthand notation for the queue sizes at the start of the interval)
s. t.  
(23a)  
(23b)  
(23c)  
(23d) 
Note that this form explicitly indicates the dependence of the real-time decision variables on the realization of the fast-timescale random state $\boldsymbol{\xi}^{rt}$.
Since the energy planning problem (19) only determines the optimal ahead-of-time energy purchase, we can then eliminate the real-time variables and write (23) as an unconstrained optimization problem with respect to the ahead-of-time energy amounts $\mathbf{E}$, namely
(24) 
where we define
(25) 
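with a compact notation for the per-realization real-time objective. Since this objective is jointly convex in the ahead-of-time and real-time variables [cf. the convex reformulation of (20)], and the minimization over the real-time variables is within a convex set, the function defined in (25) is still convex with respect to $\mathbf{E}$ [25, Sec. 3.2.5]. In addition, because the ahead-of-time purchase price exceeds the selling price, we can alternatively write the ahead-of-time cost in (1) as the pointwise maximum of two affine functions of the requested energy, which is in the family of convex functions. Hence, (24) is generally a nonsmooth and unconstrained convex problem with respect to $\mathbf{E}$, which can be solved using the stochastic subgradient iteration described next.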
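The subgradient of the ahead-of-time cost in (1) with respect to the requested energy can first be written as follows: it equals the ahead-of-time purchase price when the requested energy exceeds the harvested RES amount, the ahead-of-time selling price when it falls below, and any value between the two prices when they coincide.
With the optimal real-time solution of the problem in (25) available for a given realization of $\boldsymbol{\xi}^{rt}$, the partial subgradient of the function in (25) with respect to the ahead-of-time energy amount of each BS is obtained by taking the expectation, over $\boldsymbol{\xi}^{rt}$, of a subgradient of the real-time transaction cost evaluated at that optimal solution.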
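Defining $\bar{\mathbf{g}}(\mathbf{E})$ as the resulting overall subgradient of the objective in (24), a standard subgradient descent iteration can be employed to find the optimal $\mathbf{E}^*$ for (24), as
$$\mathbf{E}^{(j+1)} = \mathbf{E}^{(j)} - \mu^{(j)} \bar{\mathbf{g}}(\mathbf{E}^{(j)}) \tag{26}$$
where $j$ denotes iteration index, and $\{\mu^{(j)}\}$ is the sequence of stepsizes.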
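Implementing (26) essentially requires performing (high-dimensional) integration over the unknown multivariate distribution function of $\boldsymbol{\xi}^{rt}$ present in $\bar{\mathbf{g}}(\mathbf{E})$ through the expectation in (25). To circumvent this impasse, a stochastic subgradient approach is devised based on the past realizations of $\boldsymbol{\xi}^{rt}$. Per iteration $j$, we randomly draw a realization $\boldsymbol{\xi}_j$ from past realizations, and run the following iteration
$$\mathbf{E}^{(j+1)} = \mathbf{E}^{(j)} - \mu^{(j)} \mathbf{g}(\mathbf{E}^{(j)}; \boldsymbol{\xi}_j) \tag{27}$$
where $\mathbf{g}(\mathbf{E}^{(j)}; \boldsymbol{\xi}_j)$ is the sample counterpart of $\bar{\mathbf{g}}(\mathbf{E}^{(j)})$, with the optimal real-time solution obtained by solving a convex problem (25) with $\boldsymbol{\xi}^{rt} = \boldsymbol{\xi}_j$.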
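As $\mathbf{g}(\mathbf{E}; \boldsymbol{\xi}_j)$ is indeed an unbiased random realization of $\bar{\mathbf{g}}(\mathbf{E})$ [28], if we adopt a sequence of non-summable diminishing stepsizes satisfying $\lim_{j \to \infty} \mu^{(j)} = 0$ and $\sum_{j=0}^{\infty} \mu^{(j)} = \infty$, the iteration (27) asymptotically converges to the optimal $\mathbf{E}^*$ as $j \to \infty$ [29].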
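To make the stochastic subgradient iteration concrete, the sketch below applies it to a toy scalar stand-in for the energy planning problem, not to problem (24) itself. It assumes a single BS without battery or RES: the planner buys an amount of energy ahead of time at a fixed price, and any mismatch with a random demand is settled at fixed real-time purchase and selling prices. Each iteration draws one past demand realization and takes a step with a diminishing, non-summable stepsize. All names and the toy cost are assumptions made for illustration.

```python
import random

def plan_energy(demand_samples, ahead_price, rt_buy_price, rt_sell_price,
                num_iters=100000, step0=2.0, start=0.0, seed=0):
    """Stochastic subgradient descent on a toy expected-cost planning problem.

    Per-sample cost: ahead_price * E + rt_buy_price * max(D - E, 0)
                     - rt_sell_price * max(E - D, 0), with D the sampled demand.
    """
    rng = random.Random(seed)
    energy = start
    for j in range(1, num_iters + 1):
        demand = rng.choice(demand_samples)        # one past realization
        if demand > energy:
            # Shortage: one more planned unit saves a real-time purchase.
            subgrad = ahead_price - rt_buy_price
        else:
            # Surplus: one more planned unit is resold at the real-time price.
            subgrad = ahead_price - rt_sell_price
        energy -= (step0 / j) * subgrad            # stepsize step0 / j
    return energy
```

With demand samples 1 to 5 and prices 2 (ahead-of-time), 3 (real-time purchase) and 1 (real-time selling), the expected cost is minimized at the median demand, so the iterate should settle close to 3. No histogram of the demand distribution is ever built; only individual samples are used.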
Compared with [19], the proposed stochastic subgradient method is particularly tailored for our setting, which does not require the random vector $\boldsymbol{\xi}^{rt}$ to have discrete and finite support. In addition, whereas the former essentially belongs to the class of statistical-learning-based approaches [30], the proposed stochastic method avoids constructing a histogram for learning the underlying multivariate distribution and requires a considerably smaller number of samples to obtain an accurate estimate of the optimal ahead-of-time energy amounts.
Remark 2
The computational complexity of the proposed algorithm is fairly low. Specifically, for solving the real-time energy balancing and beamforming problem (20) in its convex form per slot $t$, an off-the-shelf interior-point solver incurs a polynomial worst-case complexity to obtain the decisions [26]; for solving the ahead-of-time energy planning problem (24) every $T$ slots, the stochastic subgradient approach needs $\mathcal{O}(1/\epsilon^2)$ iterations to obtain an $\epsilon$-optimal solution, while the per-iteration complexity is dominated by solving one convex problem of the form (25). Updating the ahead-of-time energy amounts in (27) requires only a complexity that is linear in the number of BSs.
IV Performance Analysis
In this section, we show that the TSOC can yield a feasible and asymptotically (near-)optimal solution for problem (12).
IV-A Feasibility Guarantee
Note that in problems (19) and (20), $\{C_i(t)\}$ are removed from the set of optimization variables and the constraints in (8) are ignored. While the battery dynamics are accounted for by the TSOC algorithm (in the step of “Queue updates”), it is not clear whether the resultant $C_i(t) \in [C^{\min}, C^{\max}]$, $\forall i$. Yet, we will show that by selecting a pair $(\Gamma, V)$ in (15), we can guarantee that $C^{\min} \leq C_i(t) \leq C^{\max}$, $\forall i, t$; meaning, the online control policy produced by the TSOC is a feasible one for the original problem (12), under the conditions (13)–(14).
To this end, we first show the following lemma.
Lemma 1
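If $\bar{\alpha} := \max\{\alpha_t^{rt}, \forall t\}$ and $\underline{\beta} := \min\{\beta_t^{rt}, \forall t\}$, the battery (dis)charging amounts $P_{b,i}^*(t)$ obtained from the TSOC algorithm satisfy: i) $P_{b,i}^*(t) = P_b^{\min}$, if $C_i(nT) > -V\underline{\beta} - \Gamma$; and ii) $P_{b,i}^*(t) = P_b^{\max}$, if $C_i(nT) < -V\bar{\alpha} - \Gamma$.
In TSOC, we determine $P_{b,i}^*(t)$ by solving (20). From its equivalent convex form, we can see that the determination of $P_{b,i}^*(t)$ is decoupled across BSs, and it depends on the first derivative of $G^{rt}(\cdot)$. By (21), the maximum possible gradient for $V G^{rt}(\cdot)$ is $V\bar{\alpha}$. It then follows that if $V\bar{\alpha} + Q_i(nT) < 0$, we must have $P_{b,i}^*(t) = P_b^{\max}$. Similarly, if $V\underline{\beta} + Q_i(nT) > 0$, we must have $P_{b,i}^*(t) = P_b^{\min}$. Given that $Q_i(nT) = C_i(nT) + \Gamma$, the lemma follows readily.
Lemma 1 reveals partial characteristics of the dynamic TSOC policy. Specifically, when the energy queue (i.e., battery SoC) is large enough, the battery must be discharged as much as possible; that is, $P_{b,i}^*(t) = P_b^{\min}$. On the other hand, when the energy queue is small enough, the battery must be charged as much as possible; i.e., $P_{b,i}^*(t) = P_b^{\max}$. Alternatively, such results can be justified by the economic interpretation of the virtual queues. Specifically, $-Q_i(nT)/V$ can be viewed as the instantaneous discharging price. For high prices $-Q_i(nT)/V > \bar{\alpha}$, the TSOC dictates full charge. Conversely, the battery units can afford full discharge if the price $-Q_i(nT)/V$ is low.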
Based on the structure in Lemma 1, we can thus establish the following result.
Proposition 1
See Appendix A.
Remark 3
Note that Proposition 1 is a sample path result; meaning, the bounded energy queues $C_i(t) \in [C^{\min}, C^{\max}]$, $\forall i$, hold per time slot under arbitrary, even non-stationary, $\{\boldsymbol{\xi}_n^{lt}, \boldsymbol{\xi}_t^{rt}\}$ processes. In other words, under the mild conditions (13)–(14), the proposed TSOC with proper selection of $(\Gamma, V)$ always yields a feasible control policy for (12).
IV-B Asymptotic Optimality
To facilitate the analysis, we assume that the random processes $\{\boldsymbol{\xi}_n^{lt}\}$ and $\{\boldsymbol{\xi}_t^{rt}\}$ are both i.i.d. over slow and fast timescales, respectively. Define and