Two-Scale Stochastic Control for Integrated Multipoint Communication Systems with Renewables

Xin Wang, Xiaojing Chen, Tianyi Chen, Longbo Huang, and Georgios B. Giannakis
Work in this paper was supported by the Program for New Century Excellent Talents in University, the Innovation Program of Shanghai Municipal Education Commission; US NSF grants ECCS-1509005, ECCS-1508993, CCF-1423316, CCF-1442686, and ECCS-1202135. The work of L. Huang was supported in part by the National Basic Research Program of China Grant 2011CBA00300, 2011CBA00301, the National Natural Science Foundation of China Grant 61033001, 61361136003, 61303195, and the China youth 1000-talent grant. X. Wang and X. Chen are with the Key Laboratory for Information Science of Electromagnetic Waves (MoE), Department of Communication Science and Engineering, Fudan University, 220 Han Dan Road, Shanghai, China. Emails: {xwang11, 13210720095} X. Wang is also with the Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431 USA. T. Chen and G. B. Giannakis are with the Department of Electrical and Computer Engineering and the Digital Technology Center, University of Minnesota, Minneapolis, MN 55455 USA. Emails: {chen3827, georgios} L. Huang is with the Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China. Email: X. Wang, X. Chen, and T. Chen contributed equally to this work.

Increasing threats of global warming and climate change call for an energy-efficient and sustainable design of future wireless communication systems. To this end, a novel two-scale stochastic control framework is put forth for smart-grid powered coordinated multi-point (CoMP) systems. Taking into account renewable energy sources (RES), dynamic pricing, two-way energy trading facilities and imperfect energy storage devices, the energy management task is formulated as an infinite-horizon optimization problem minimizing the time-averaged energy transaction cost, subject to the users’ quality of service (QoS) requirements. Leveraging the Lyapunov optimization approach as well as the stochastic subgradient method, a two-scale online control (TS-OC) approach is developed for the resultant smart-grid powered CoMP systems. Using only historical data, the proposed TS-OC makes online control decisions at two timescales, and features a provably feasible and asymptotically near-optimal solution. Numerical tests further corroborate the theoretical analysis, and demonstrate the merits of the proposed approach.


Two-scale control, ahead-of-time market, real-time market, battery degeneration, CoMP systems, smart grids, Lyapunov optimization.

I Introduction

Interference is a major obstacle in wireless communication systems due to their broadcast nature, and becomes more severe in next-generation spectrum- and energy-constrained cellular networks with smaller cells and more flexible frequency reuse [1]. With ever increasing demand for energy-efficient transmissions, coordinated multi-point processing (CoMP) has been proposed as a promising paradigm for efficient inter-cell interference management in heterogeneous networks (HetNets) [2]. In CoMP systems, base stations (BSs) are partitioned into clusters, where BSs per cluster perform coordinated beamforming to serve the users [3, 4, 5]. As the number of BSs in HetNets increases, their electricity consumption constitutes a major part of their operational expenditure, and contributes a considerable portion to the global carbon footprint [6]. Fortunately, emerging characteristics of smart grids offer ample opportunities to achieve both energy-efficient and environmentally-friendly communication solutions. Such characteristics include high penetration of renewable energy sources (RES), two-way energy trading, and dynamic pricing based demand-side management (DSM) [9, 10, 11]. In this context, energy-efficient “green” communication solutions have been proposed for their economic and ecological merits [3, 4, 6, 5]. Driven by the need of sustainable “green communications,” manufacturers and network operators such as Ericsson, Huawei, Vodafone and China Mobile have started developing “green” BSs that can be jointly supplied by the persistent power sources from the main electric grid as well as from harvested renewable energy sources (e.g., solar and wind) [7, 8]. It is expected that renewable powered BSs will be widely deployed to support future-generation cellular systems.

A few recent works have considered the smart-grid powered CoMP transmissions [12, 13, 14, 15]. Assuming that the energy harvested from RES is accurately available a priori through e.g., forecasting, [12] and [13] considered the energy-efficient resource allocation for RES-powered CoMP downlinks. Building on realistic models, our recent work dealt with robust energy management and transmit-beamforming designs that minimize the worst-case energy transaction cost for the CoMP downlink with RES and DSM [14]. Leveraging novel stochastic optimization tools [16, 17, 18], we further developed an efficient approach to obtain a feasible and asymptotically optimal online control scheme for smart-grid powered CoMP systems, without knowing the distributions of involved random variables [15].

A salient assumption in [12, 13, 14, 15] is that all involved resource allocation tasks are performed at a single time scale. However, RES and wireless channel dynamics typically evolve over different time scales in practice. Development of two-scale control schemes is then well motivated for CoMP systems with RES. In related contexts, a few stochastic optimization based two-scale control schemes were recently proposed and analyzed in [19, 20, 21, 22]. Extending the traditional Lyapunov optimization approach [16, 17, 18], [19] introduced a two-scale control algorithm that makes distributed routing and server management decisions to reduce power cost for large-scale data centers. Based on a similar approach, [20] developed a so-called MultiGreen algorithm for data centers, which allows cloud service providers to make energy transactions at two time scales for minimum operational cost. As far as wireless communications are concerned, [21] performed joint precoder assignment, user association, and channel resource scheduling for HetNets with non-ideal backhaul; while [22] introduced a two-timescale approach for network selection and subchannel allocation for integrated cellular and Wi-Fi networks with an emphasis on using predictive future information. Note, however, that neither [21] nor [22] considers the diversity of energy prices in fast/slow-timescale energy markets, or the energy leakage effects in the energy management task.


In the present paper, we develop a two-scale online control (TS-OC) approach for smart-grid powered CoMP systems considering RES, dynamic pricing, two-way energy trading facilities and imperfect energy storage devices. Suppose that the RES harvesting occurs at the BSs over a slow timescale relative to the coherence time of wireless channels. The proposed scheme performs an ahead-of-time (e.g., 15-minute ahead, or, hour-ahead) energy planning upon RES arrivals, while deciding real-time energy balancing and transmit-beamforming schedules per channel coherence time slot. Specifically, the TS-OC determines the amount of energy to trade (purchase or sell) with the ahead-of-time wholesale market based on RES generation, as the basic energy supply for all the time slots within a RES harvesting interval. On the other hand, it decides the amount of energy to trade with the real-time market, energy charging to (or discharging from) the batteries, as well as the coordinated transmit-beamformers to guarantee the users’ quality of service (QoS) per time slot. Generalizing the Lyapunov optimization techniques in [19, 20, 21, 22, 23], we propose a synergetic framework to design and analyze such a two-scale dynamic management scheme to minimize the long-term time-averaged energy transaction cost of the CoMP transmissions, without knowing the distributions of the random channel, RES, and energy price processes. The main contributions of our work are summarized as follows.

  • Leveraging the ahead-of-time and real-time electricity markets, and building on our generalized system models in [14, 15], a novel two-scale optimization framework is developed to facilitate the dynamic resource management for smart-grid powered CoMP systems with RES and channel dynamics at different time scales.

  • While [15, 19] and [20] do not account for battery degeneration (energy leakage), we integrate the modified Lyapunov optimization technique into the two-scale stochastic optimization approach to leverage the diversity of energy prices along with the energy leakage effects on the dynamic energy management task.

  • Using only past channel and energy-price realizations, a novel stochastic subgradient approach is developed to solve the ahead-of-time energy planning (sub-)problem. The approach is suitable for a general family of continuous distributions, and avoids constructing a histogram estimate, which is computationally cumbersome, especially for a high-dimensional vector of random variables.

  • Rigorous analysis is presented to justify the feasibility and quantify the optimality gap for the proposed two-scale online control algorithm.

The rest of the paper is organized as follows. The system models are described in Section II. The proposed dynamic resource management scheme is developed in Section III. Performance analysis is the subject of Section IV. Numerical tests are provided in Section V, followed by concluding remarks in Section VI.

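Notation. Boldface lower (upper) case letters represent vectors (matrices); $\mathbb{C}^{N \times 1}$ and $\mathbb{R}^{N \times M}$ stand for spaces of $N \times 1$ complex vectors and $N \times M$ real matrices, respectively; $(\cdot)'$ denotes transpose, and $(\cdot)^H$ conjugate transpose; $\mathrm{diag}(a_1, \ldots, a_N)$ denotes a diagonal matrix with diagonal elements $a_1, \ldots, a_N$; $|\cdot|$ the magnitude of a complex scalar; and $\mathbb{E}$ denotes expectation.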

II System Models

Consider a cluster-based CoMP downlink setup, where a set $\mathcal{I} := \{1, \ldots, I\}$ of distributed BSs (e.g., macro/micro/pico BSs) is selected to serve a set $\mathcal{K} := \{1, \ldots, K\}$ of mobile users, as in e.g., [14, 15]. Each BS is equipped with $M$ transmit antennas, whereas each user has a single receive antenna. Suppose that through the smart-grid infrastructure conventional power generation is available, but each BS can also harvest RES (through e.g., solar panels and/or wind turbines), and it has an energy storage device (i.e., battery) to save the harvested energy. Relying on a two-way energy trading facility, the BS can also buy energy from or sell energy to the main grid at dynamically changing market prices. For the CoMP cluster, there is a low-latency backhaul network connecting the set of BSs to a central controller [4], which coordinates energy trading as well as cooperative communication. This central entity can collect both communication data (transmit messages, channel state information) from each BS through the cellular backhaul links, as well as the energy information (energy purchase/selling prices, energy queue sizes) via smart meters installed at BSs, and the grid-deployed communication/control links connecting them (see Footnote 1).

Footnote 1: Perfect channel state information will be assumed hereafter, but the proposed formulation can readily account for the channel estimation errors to robustify the beamforming design; see e.g., [14, 15]. In addition, generalizations are possible to incorporate imperfect energy queue information based on the Lyapunov optimization framework in [20]. Although their detailed study falls outside the scope of the present paper, such imperfections are not expected to substantially affect the effectiveness of the proposed scheme.

Fig. 1: A smart grid powered CoMP system. Two BSs with local renewable energy harvesting and storage devices implement two-way energy trading with the main grid.

As the RES and wireless channel dynamics typically emerge at different time scales in practice, we propose a two-scale control mechanism. As shown in Fig. 1, time is divided into slots of length smaller than the coherence time of the wireless channels. For convenience, the slot duration is normalized to unity; thus, the terms “energy” and “power” can be used interchangeably. On the other hand, we define the (virtual) “coarse-grained” time intervals in accordance with the slow RES harvesting scale, with each coarse-grained interval consisting of $T$ time slots.

II-A Ahead-of-Time Energy Planning

At the beginning of each “coarse-grained” interval, namely at time $t = nT$, $n = 0, 1, 2, \ldots$, let $A_{i,n}$ denote the RES amount collected per BS $i \in \mathcal{I}$, and $\mathbf{A}_n := [A_{1,n}, \ldots, A_{I,n}]'$. With $\mathbf{A}_n$ available, an energy planner at the central unit decides the energy amounts $E_i[n]$, $\forall i$, to be used in the next $T$ slots per BS $i$. With a two-way energy trading facility, the BSs then either purchase energy from the main grid according to their shortage, or sell their surplus energy to the grid at a fair price in order to reduce operational costs. Specifically, following the decision, BS $i$ contributes its RES amount $A_{i,n}$ to the main grid and requests the grid to supply an average energy amount of $E_i[n]/T$ per slot $t = nT, \ldots, (n+1)T-1$.

RES is assumed harvested for free after deployment. Given the requested energy $E_i[n]$ and the harvested energy $A_{i,n}$, the shortage energy that is purchased from the grid for BS $i$ is clearly $[E_i[n] - A_{i,n}]^+$; or, the surplus energy that is sold to the grid is $[A_{i,n} - E_i[n]]^+$, where $[a]^+ := \max\{a, 0\}$. Depending on the difference $(E_i[n] - A_{i,n})$, the BS either buys electricity from the grid with the ahead-of-time (i.e., long-term) price $\alpha_n^{lt}$, or sells electricity to the grid with price $\beta_n^{lt}$ for profit (the latter leads to a negative cost). Notwithstanding, we shall always set $\alpha_n^{lt} > \beta_n^{lt}$ to avoid meaningless buy-and-sell activities of the BSs for profit. The transaction cost with BS $i$ for such an energy planning is therefore given by

$$G^{lt}(E_i[n]) := \alpha_n^{lt} \big[E_i[n] - A_{i,n}\big]^+ - \beta_n^{lt} \big[A_{i,n} - E_i[n]\big]^+. \tag{1}$$

For conciseness, we concatenate into a single random vector all the random variables evolving at this slow timescale; i.e., $\boldsymbol{\xi}_n^{lt} := \{\alpha_n^{lt}, \beta_n^{lt}, \mathbf{A}_n\}$.
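As a minimal sketch of the ahead-of-time transaction cost described above, the following Python function evaluates the cost for a single BS and a single coarse-grained interval. The function and argument names are hypothetical and the numbers are illustrative only; the logic simply charges the shortage at the purchase price and credits the surplus at the selling price.

```python
def ahead_of_time_cost(requested, harvested, buy_price, sell_price):
    """Ahead-of-time transaction cost of one BS for one coarse-grained interval.

    All names are illustrative assumptions: `requested` is the planned energy,
    `harvested` the RES amount, and the two prices are the ahead-of-time
    purchase and selling prices.
    """
    if buy_price <= sell_price:
        # The text requires purchase price > selling price to rule out arbitrage.
        raise ValueError("purchase price must exceed selling price")
    shortage = max(requested - harvested, 0.0)   # energy bought from the grid
    surplus = max(harvested - requested, 0.0)    # energy sold to the grid
    return buy_price * shortage - sell_price * surplus


print(ahead_of_time_cost(10.0, 4.0, 50.0, 30.0))  # 6 units bought at 50 -> 300.0
print(ahead_of_time_cost(2.0, 4.0, 50.0, 30.0))   # 2 units sold at 30 -> -60.0
```

A negative return value corresponds to a profit from selling surplus RES, matching the "negative cost" remark in the text.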

II-B CoMP Downlink Transmissions

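Per slot $t$, let $\mathbf{h}_{ik,t} \in \mathbb{C}^{M \times 1}$ denote the vector channel from BS $i$ to user $k$, $\forall i \in \mathcal{I}$, $\forall k \in \mathcal{K}$; let $\mathbf{h}_{k,t} := [\mathbf{h}_{1k,t}', \ldots, \mathbf{h}_{Ik,t}']'$ collect the channel vectors from all BSs to user $k$, and $\mathbf{H}_t := [\mathbf{h}_{1,t}, \ldots, \mathbf{h}_{K,t}]$. With linear transmit beamforming performed across BSs, the vector signal transmitted to user $k$ is: $\mathbf{q}_k(t) = \mathbf{w}_k(t) s_k(t)$, $\forall k$, where $s_k(t)$ denotes the information-bearing scalar symbol with unit energy, and $\mathbf{w}_k(t) \in \mathbb{C}^{MI \times 1}$ denotes the beamforming vector across the BSs serving user $k$. The received signal at slot $t$ for user $k$ is therefore

$$y_k(t) = \mathbf{h}_{k,t}^H \mathbf{q}_k(t) + \sum_{l \neq k} \mathbf{h}_{k,t}^H \mathbf{q}_l(t) + n_k(t) \tag{2}$$

where $\mathbf{h}_{k,t}^H \mathbf{q}_k(t)$ is the desired signal of user $k$, $\sum_{l \neq k} \mathbf{h}_{k,t}^H \mathbf{q}_l(t)$ is the inter-user interference from the same cluster, and $n_k(t)$ denotes additive noise, which may also include the downlink interference from other BSs outside user $k$’s cluster. It is assumed that $n_k(t)$ is a circularly symmetric complex Gaussian (CSCG) random variable with zero mean and variance $\sigma_k^2$.

The signal-to-interference-plus-noise ratio (SINR) at user $k$ can be expressed as

$$\mathrm{SINR}_k(\{\mathbf{w}_k(t)\}) = \frac{|\mathbf{h}_{k,t}^H \mathbf{w}_k(t)|^2}{\sum_{l \neq k} |\mathbf{h}_{k,t}^H \mathbf{w}_l(t)|^2 + \sigma_k^2}. \tag{3}$$

The transmit power at each BS $i$ clearly is given by

$$P_{x,i}(t) = \sum_{k \in \mathcal{K}} \mathbf{w}_k^H(t) \mathbf{B}_i \mathbf{w}_k(t) \tag{4}$$

where the matrix

$$\mathbf{B}_i := \mathrm{diag}\big(\underbrace{0, \ldots, 0}_{(i-1)M}, \underbrace{1, \ldots, 1}_{M}, \underbrace{0, \ldots, 0}_{(I-i)M}\big) \in \mathbb{R}^{MI \times MI}$$

selects the corresponding rows out of $\{\mathbf{w}_k(t)\}_{k \in \mathcal{K}}$ to form the $i$-th BS’s transmit-beamforming vector of size $M \times 1$.

To guarantee QoS per slot for every user $k$, it is required that the central controller selects a set of $\{\mathbf{w}_k(t)\}$ satisfying [cf. (3)]

$$\mathrm{SINR}_k(\{\mathbf{w}_k(t)\}) \geq \gamma_k, \quad \forall k \tag{5}$$

where $\gamma_k$ denotes the target SINR value per user $k$.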
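To make the SINR expression, the per-BS transmit power, and the QoS requirement concrete, here is a small pure-Python sketch. The helper names, the two-BS/two-user toy channels, the noise variance, and the target SINR of 1.0 are all assumptions chosen for illustration; beamformers are stored as flat lists whose $i$-th block of `antennas_per_bs` entries belongs to BS $i$.

```python
def inner(h, w):
    """Return the Hermitian inner product h^H w of two equal-length vectors."""
    return sum(hc.conjugate() * wc for hc, wc in zip(h, w))


def sinr(k, channels, beams, noise_var):
    """SINR of user k: desired power over interference-plus-noise power."""
    desired = abs(inner(channels[k], beams[k])) ** 2
    interference = sum(
        abs(inner(channels[k], beams[l])) ** 2
        for l in range(len(beams)) if l != k
    )
    return desired / (interference + noise_var)


def bs_transmit_power(i, beams, antennas_per_bs):
    """Power radiated by BS i: energy of its own block of every beamformer."""
    start, stop = i * antennas_per_bs, (i + 1) * antennas_per_bs
    return sum(abs(w[m]) ** 2 for w in beams for m in range(start, stop))


# Hypothetical toy setup: 2 BSs with 1 antenna each, 2 users.
channels = [[1j, 0j], [0j, 1 + 0j]]          # channels[k] stacks all BSs' channels to user k
beams = [[2 + 0j, 0j], [0.5 + 0j, 1 + 0j]]   # beams[k] is the beamformer of user k
noise_var = 0.75

sinrs = [sinr(k, channels, beams, noise_var) for k in range(2)]
powers = [bs_transmit_power(i, beams, 1) for i in range(2)]
targets_met = all(s >= 1.0 for s in sinrs)   # QoS check with an assumed target of 1.0
print(sinrs, powers, targets_met)
```

Selecting the block of entries for BS $i$ plays the role of the diagonal selection matrix in the transmit-power expression.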

II-C Real-Time Energy Balancing

For the $i$-th BS, the total energy consumption $P_{g,i}(t)$ per slot $t$ includes the transmission-related power $P_{x,i}(t)$, and the rest that is due to other components such as air conditioning, data processor, and circuits, which can be generally modeled as a constant power, $P_c > 0$ [13]. We further suppose that $P_{g,i}(t)$ is bounded by $P_g^{\max}$. Namely,

$$P_{g,i}(t) = P_c + P_{x,i}(t) \leq P_g^{\max}, \quad \forall i. \tag{6}$$

Fig. 2: Hourly price trend for day-ahead and real-time electricity markets during Oct. 01-07, 2015 [24].

Per slot $t$, the energy supply available from the ahead-of-time planning may not exactly meet the actual demand at BS $i$. Hence, the BS is also allowed to perform real-time energy trading with the main grid to balance its supply with demand. Let $P_i(t)$ denote the real-time energy amount that is purchased from ($P_i(t) > 0$) or sold to ($P_i(t) < 0$) the grid by BS $i$. Let $\alpha_t^{rt}$ and $\beta_t^{rt}$ ($\alpha_t^{rt} > \beta_t^{rt}$) denote the real-time energy purchase and selling prices, respectively. Then the real-time energy transaction cost for BS $i$ is

$$G^{rt}(P_i(t)) := \alpha_t^{rt} [P_i(t)]^+ - \beta_t^{rt} [-P_i(t)]^+ \tag{7}$$

with $[a]^+ := \max\{a, 0\}$.

Fig. 2 depicts the day-ahead and real-time energy prices in the Pennsylvania-Jersey-Maryland (PJM) wholesale market [24]. In practice, the average purchase price in the real-time market tends to be no lower than that in the day-ahead market; that is, $\mathbb{E}\{\alpha_t^{rt}\} \geq \mathbb{E}\{\alpha_n^{lt}\}$; similarly, we have $\mathbb{E}\{\beta_t^{rt}\} \leq \mathbb{E}\{\beta_n^{lt}\}$. Again, we use a random vector $\boldsymbol{\xi}_t^{rt} := \{\alpha_t^{rt}, \beta_t^{rt}, \mathbf{H}_t\}$ to collect all random variables evolving at the fast timescale.

II-D Energy Storage with Degeneration

As energy consumption will become a major concern of the future large-scale cellular networks, uninterrupted power supply type storage units can be installed at the BSs to prevent power outages, and provide opportunities to optimize the BSs’ electricity bills. Different from the ideal battery models in [12, 13, 14, 15, 20], we consider here a practical battery with degeneration (i.e., energy leakage over time even in the absence of discharging) as in [23].

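For the battery of the $i$-th BS, let $C_i(0)$ denote the initial amount of stored energy, and $C_i(t)$ its state of charge (SoC) at the beginning of time slot $t$. The battery is assumed to have a finite capacity $C^{\max}$. Furthermore, for reliability purposes, it might be required to ensure that a minimum energy level $C^{\min}$ is maintained at all times. Let $P_{b,i}(t)$ denote the energy delivered to or drawn from the battery at slot $t$, which amounts to either charging ($P_{b,i}(t) > 0$) or discharging ($P_{b,i}(t) < 0$). The stored energy then obeys the dynamic equation

$$C_i(t+1) = \eta C_i(t) + P_{b,i}(t), \quad C^{\min} \leq C_i(t) \leq C^{\max}, \quad \forall i \tag{8}$$

where $\eta \in (0, 1]$ denotes the storage efficiency (e.g., $\eta = 0.9$ means that 10% of the stored energy will be “leaked” over a slot, even in the absence of discharging).

The amount of power (dis)charged is also assumed bounded by

$$P_b^{\min} \leq P_{b,i}(t) \leq P_b^{\max}, \quad \forall i \tag{9}$$

where $P_b^{\min} < 0$ and $P_b^{\max} > 0$ are introduced by physical constraints.

With $n_t := \lfloor t/T \rfloor$ and consideration of $P_{b,i}(t)$, we have the following demand-and-supply balance equation per slot $t$, where $P_{g,i}(t)$ is the total energy consumption of BS $i$, $E_i[n_t]/T$ the per-slot energy supplied under the ahead-of-time plan, and $P_i(t)$ the real-time energy traded:

$$P_{g,i}(t) + P_{b,i}(t) = \frac{E_i[n_t]}{T} + P_i(t), \quad \forall i. \tag{10}$$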
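The leaky-battery recursion and the demand-and-supply balance can be sketched in a few lines of Python. The function names, the efficiency of 0.9 (taken from the example in the text), and the charging limits are illustrative assumptions, not values from the paper's experiments.

```python
def step_battery(soc, charge, efficiency, charge_min, charge_max):
    """One slot of leaky-battery dynamics: next SoC = efficiency * soc + charge."""
    if not charge_min <= charge <= charge_max:
        raise ValueError("(dis)charging amount outside its physical limits")
    return efficiency * soc + charge


def real_time_trade(consumption, charge, planned_per_slot):
    """Energy to buy (> 0) or sell (< 0) in real time so that supply meets demand."""
    return consumption + charge - planned_per_slot


# Idle for one slot (pure leakage), then charge 1 unit, then discharge 2 units.
soc = 10.0
for charge in (0.0, 1.0, -2.0):
    soc = step_battery(soc, charge, efficiency=0.9, charge_min=-2.0, charge_max=2.0)
print(soc)

# A BS consuming 5 units while charging 1 unit, with 4 units planned ahead of time,
# has to buy the remaining 2 units in the real-time market.
print(real_time_trade(consumption=5.0, charge=1.0, planned_per_slot=4.0))
```

Note that the SoC bounds are not enforced inside `step_battery`; in the paper they are guaranteed by the parameter choice analyzed later rather than imposed per slot.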


III Dynamic Resource Management Scheme

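Note that the harvested RES amounts $\{\mathbf{A}_n\}$, the ahead-of-time prices $\{\alpha_n^{lt}, \beta_n^{lt}\}$, the real-time prices $\{\alpha_t^{rt}, \beta_t^{rt}\}$, and the wireless channel matrices $\{\mathbf{H}_t\}$ are all random. The smart-grid powered CoMP downlink to be controlled is a stochastic system. The goal is to design an online resource management scheme that chooses the ahead-of-time energy-trading amounts $\{E_i[n]\}$ at every $t = nT$, as well as the real-time energy-trading amounts $\{P_i(t)\}$, battery (dis)charging amounts $\{P_{b,i}(t)\}$, and the CoMP beamforming vectors $\{\mathbf{w}_k(t)\}$ per slot $t$, so as to minimize the expected total energy transaction cost, without knowing the distributions of the aforementioned random processes.

According to (1) and (7), define the energy transaction cost for BS $i$ per slot $t$ as:

$$\Phi_i(t) := \frac{1}{T} G^{lt}(E_i[n_t]) + G^{rt}(P_i(t)). \tag{11}$$

Let $\mathcal{X} := \{E_i[n], \forall i, n;\ P_i(t), P_{b,i}(t), C_i(t), \forall i, t;\ \mathbf{w}_k(t), \forall k, t\}$. The problem of interest is to find

$$G^* := \min_{\mathcal{X}} \lim_{N \to \infty} \frac{1}{NT} \sum_{t=0}^{NT-1} \sum_{i \in \mathcal{I}} \mathbb{E}\{\Phi_i(t)\} \quad \text{s. t. } (5), (6), (8), (9), (10), \ \forall t \tag{12}$$

where the expectations of $\Phi_i(t)$ are taken over all sources of randomness. Note that here the constraints (5), (6), (8), (9), and (10) are implicitly required to hold for every realization of the underlying random states $\boldsymbol{\xi}_t^{rt}$ and $\boldsymbol{\xi}_n^{lt}$.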

III-A Two-Scale Online Control Algorithm

Problem (12) is a stochastic optimization task. We next generalize and integrate the Lyapunov optimization techniques in [19, 20, 21, 22, 23] to develop a TS-OC algorithm, which will be proven feasible, and asymptotically near-optimal for (12). To start, assume the following two relatively mild conditions for the system parameters:

$$P_b^{\max} \geq (1 - \eta) C^{\min} \tag{13}$$

$$C^{\max} - C^{\min} \geq \frac{1 - \eta^T}{1 - \eta} \big(P_b^{\max} - P_b^{\min}\big). \tag{14}$$

Condition (13) simply implies that the energy leakage of the battery can be compensated by the charging. Condition (14) requires that the allowable SoC range is large enough to accommodate the largest possible charging/discharging over the $T$ time slots of each coarse-grained interval. This then makes the system “controllable” by our two-scale mechanism.

Our algorithm depends on two parameters, namely a “queue perturbation” parameter $\Gamma$, and a weight parameter $V$. Define $\bar{\alpha} := \max\{\alpha_t^{rt}, \forall t\}$ and $\underline{\beta} := \min\{\beta_t^{rt}, \forall t\}$. Derived from the feasibility requirement of the proposed algorithm (see the proof of Proposition 1 in the sequel), any pair $(\Gamma, V)$ that satisfies the following conditions can be used:




Note that the interval for in (15) is well defined under condition (14), and the interval for is valid when .

We now present the proposed TS-OC algorithm:

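  • Initialization: Select $\Gamma$ and $V$, and introduce a virtual queue $Q_i(0) := C_i(0) + \Gamma$, $\forall i$.

  • Ahead-of-time energy planning: Per interval $n$ (i.e., at time $t = nT$), observe a realization $\boldsymbol{\xi}_n^{lt}$, and determine the energy amounts $\{E_i^*[n]\}$ by solving

    $$\min_{\{E_i[n]\}} \sum_{i \in \mathcal{I}} \Big\{ V \Big[ G^{lt}(E_i[n]) + \sum_{t=nT}^{(n+1)T-1} \mathbb{E}\{G^{rt}(P_i(t))\} \Big] + \sum_{t=nT}^{(n+1)T-1} Q_i(nT)\, \mathbb{E}\{P_{b,i}(t)\} \Big\} \quad \text{s. t. } (5), (6), (9), (10), \ \forall t \tag{19}$$

    where expectations are taken over $\boldsymbol{\xi}_t^{rt}$. Then the BSs trade energy with the main grid based on $\{E_i^*[n]\}$, and request the grid to supply an average amount $E_i^*[n]/T$ per slot $t = nT, \ldots, (n+1)T-1$.

  • Energy balancing and beamforming schedule: At every slot $t \in [nT, (n+1)T-1]$, observe a realization $\boldsymbol{\xi}_t^{rt}$, and decide $\{P_i^*(t), P_{b,i}^*(t), \mathbf{w}_k^*(t)\}$ by solving the following problem given $\{E_i^*[n]\}$

    $$\min_{\{P_i(t), P_{b,i}(t), \mathbf{w}_k(t)\}} \sum_{i \in \mathcal{I}} \big\{ V G^{rt}(P_i(t)) + Q_i(nT) P_{b,i}(t) \big\} \quad \text{s. t. } (5), (6), (9), (10). \tag{20}$$

    The BSs perform real-time energy trading with the main grid based on $\{P_i^*(t)\}$, and coordinated beamforming based on $\{\mathbf{w}_k^*(t)\}$.

  • Queue updates: Per slot $t$, charge (or discharge) the battery based on $\{P_{b,i}^*(t)\}$, so that the stored energy $C_i(t+1) = \eta C_i(t) + P_{b,i}^*(t)$, $\forall i$; and update the virtual queues $Q_i(t+1) := C_i(t+1) + \Gamma$, $\forall i$.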

Remark 1

Note that we use the virtual queue sizes $Q_i(nT)$ instead of $Q_i(t)$ in problems (19) and (20); see also [19, 20]. Recall that the main design principle in Lyapunov optimization is to choose control actions that minimize $\sum_{t=nT}^{(n+1)T-1} \sum_{i \in \mathcal{I}} \mathbb{E}\{V \Phi_i(t) + Q_i(t) P_{b,i}(t)\}$, with $\Phi_i(t)$ the per-slot transaction cost in (11) and $P_{b,i}(t)$ the battery (dis)charging amount. For the ahead-of-time energy planning, this requires a-priori knowledge of the future queue backlogs $Q_i(t)$ over slots $nT+1, \ldots, (n+1)T-1$ at time $nT$. It is impractical to assume that this information is available. For this reason, we simply approximate future queue backlog values as the current value at $nT$, i.e., $Q_i(t) \approx Q_i(nT)$, $\forall t = nT+1, \ldots, (n+1)T-1$, in (19). To ensure that the real-time energy balancing and beamforming schedule solves the same problem as the ahead-of-time energy planning, we also use $Q_i(nT)$ in (20) although the real-time battery state of charge is available at slot $t$. Rigorous analysis shows that the performance penalty incurred by such an approximation does not affect the asymptotic optimality of the proposed stochastic control scheme. On the other hand, using $Q_i(t)$ in real-time energy balancing can also be suggested in practice. While our feasibility analysis affords such a modification, deriving the optimality gap is left for future research.

Next, we develop efficient solvers of (19) and (20) to obtain the TS-OC algorithm.

III-B Real-Time Energy Balancing and Beamforming

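It is easy to argue that the objective of the real-time problem (20) is convex. Indeed, with the real-time purchase price exceeding the selling price, $\alpha_t^{rt} > \beta_t^{rt}$, the transaction cost with the real-time traded energy $P_i(t)$ can be alternatively written as

$$G^{rt}(P_i(t)) = \max\{\alpha_t^{rt} P_i(t),\ \beta_t^{rt} P_i(t)\} \tag{21}$$

which is clearly convex [25]; and so is the objective in (20).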
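The equivalence between the buy/sell form of the real-time cost and its pointwise-maximum form can be checked numerically. The sketch below uses hypothetical function names and illustrative prices (purchase price 5, selling price 1); it compares the two forms on a grid and also spot-checks midpoint convexity.

```python
def real_time_cost(traded, buy_price, sell_price):
    """Buy/sell form: pay for purchased energy, get credited for sold energy."""
    return buy_price * max(traded, 0.0) - sell_price * max(-traded, 0.0)


def real_time_cost_max_form(traded, buy_price, sell_price):
    """Pointwise maximum of two linear functions, valid when buy_price > sell_price."""
    return max(buy_price * traded, sell_price * traded)


grid = [-3.0 + 0.5 * n for n in range(13)]   # traded energy from -3 to 3

# Largest disagreement between the two forms over the grid.
max_gap = max(
    abs(real_time_cost(p, 5.0, 1.0) - real_time_cost_max_form(p, 5.0, 1.0))
    for p in grid
)

# Midpoint convexity: f((a + b) / 2) <= (f(a) + f(b)) / 2 for all grid pairs.
midpoint_convex = all(
    real_time_cost((a + b) / 2, 5.0, 1.0)
    <= (real_time_cost(a, 5.0, 1.0) + real_time_cost(b, 5.0, 1.0)) / 2 + 1e-12
    for a in grid for b in grid
)
print(max_gap, midpoint_convex)
```

If the selling price exceeded the purchase price, the maximum would pick the wrong branch and the two forms would no longer agree, which is why the price ordering matters for convexity.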

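The SINR constraints in (5) can actually be rewritten in a convex form. Observe that an arbitrary phase rotation can be added to the beamforming vectors $\mathbf{w}_k(t)$ without affecting the SINRs. Hence, we can choose a phase so that $\mathbf{h}_{k,t}^H \mathbf{w}_k(t)$ is real and nonnegative. Then by proper rearrangement, the SINR constraints become convex second-order cone (SOC) constraints [27]; that is,

$$\sqrt{\sum_{l \neq k} |\mathbf{h}_{k,t}^H \mathbf{w}_l(t)|^2 + \sigma_k^2} \leq \frac{1}{\sqrt{\gamma_k}} \mathrm{Re}\{\mathbf{h}_{k,t}^H \mathbf{w}_k(t)\}, \quad \mathrm{Im}\{\mathbf{h}_{k,t}^H \mathbf{w}_k(t)\} = 0, \quad \forall k. \tag{22}$$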
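The phase-rotation argument can be illustrated numerically. In the sketch below (pure Python; helper names and the toy channels are assumptions), each beamformer is rotated so that its own user's effective channel gain is real and nonnegative, after which the SOC test agrees with the original SINR test.

```python
import cmath
import math


def inner(h, w):
    """Hermitian inner product h^H w."""
    return sum(hc.conjugate() * wc for hc, wc in zip(h, w))


def interference_plus_noise(k, channels, beams, noise_var):
    return noise_var + sum(
        abs(inner(channels[k], beams[l])) ** 2
        for l in range(len(beams)) if l != k
    )


def sinr_constraint_holds(k, channels, beams, noise_var, target):
    desired = abs(inner(channels[k], beams[k])) ** 2
    return desired / interference_plus_noise(k, channels, beams, noise_var) >= target


def rotate_to_real(h, w):
    """Rotate w by a unit-modulus phase so that h^H w is real and nonnegative."""
    phase = cmath.phase(inner(h, w))
    return [wc * cmath.exp(-1j * phase) for wc in w]


def soc_constraint_holds(k, channels, beams, noise_var, target, tol=1e-9):
    desired = inner(channels[k], beams[k])
    lhs = math.sqrt(interference_plus_noise(k, channels, beams, noise_var))
    return abs(desired.imag) <= tol and lhs <= desired.real / math.sqrt(target)


# Hypothetical 2-user example with 2 transmit antennas in total.
channels = [[1 + 1j, 0.5 + 0j], [0.2j, 1 - 1j]]
beams = [[1j, 0j], [0j, 1 + 0j]]
rotated = [rotate_to_real(channels[k], beams[k]) for k in range(2)]

for target in (3.0, 5.0):
    for k in range(2):
        print(target, k,
              sinr_constraint_holds(k, channels, beams, 0.25, target),
              soc_constraint_holds(k, channels, rotated, 0.25, target))
```

The rotation leaves every $|\mathbf{h}^H \mathbf{w}|$ unchanged, so interference terms and SINRs are untouched; only the imaginary part of the desired term is removed, which is what the SOC form requires.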

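Using the balance equation (10) to eliminate the real-time traded energy, we can then rewrite the problem (20) as

$$\min_{\{P_{b,i}(t), \mathbf{w}_k(t)\}} \sum_{i \in \mathcal{I}} \Big\{ V G^{rt}\Big(P_{g,i}(t) + P_{b,i}(t) - \frac{E_i^*[n]}{T}\Big) + Q_i(nT) P_{b,i}(t) \Big\} \quad \text{s. t. } (6), (9), (22)$$

where $P_{g,i}(t)$ is the total energy consumption of BS $i$, which is a convex quadratic function of the beamforming vectors $\{\mathbf{w}_k(t)\}$.

As $G^{rt}(\cdot)$ is convex and increasing, it is easy to see that $G^{rt}(P_{g,i}(t) + P_{b,i}(t) - E_i^*[n]/T)$ is jointly convex in $(P_{b,i}(t), \{\mathbf{w}_k(t)\})$ [25, Sec. 3.2.4]. It then readily follows that the rewritten problem above is a convex optimization problem, which can be solved via off-the-shelf solvers.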

III-C Ahead-of-Time Energy Planning

To solve the ahead-of-time planning problem (19), the probability distribution function (pdf) of the fast-timescale random state must be known across the slots of the upcoming coarse-grained interval. However, this pdf is seldom available in practice. Suppose that the random state is independent and identically distributed (i.i.d.) over time slots, and takes values from a finite state space. It was proposed in [19] to obtain an empirical pdf of the random state from past realizations over a large window of past intervals. This estimate becomes accurate as the window grows sufficiently large; then it can be used to evaluate the expectations in (19). Based on such an empirical pdf, an approximate solution for (19) could be obtained.

Different from [19], here we propose a stochastic subgradient approach to solve (19). Suppose that the fast-timescale random state $\boldsymbol{\xi}_t^{rt}$ is i.i.d. across time slots (but not necessarily with a finite support). For stationary $\boldsymbol{\xi}_t^{rt}$, we can remove the index $t$ from all optimization variables, and rewrite (19) as (with short-hand notation $Q_i[n] := Q_i(nT)$)

s. t.

Note that this form explicitly indicates the dependence of the real-time decision variables on the realization of the fast-timescale random state.

Since the energy planning problem (IV-A) only determines the optimal ahead-of-time energy purchase , we can then eliminate the variable and write (23) as an unconstrained optimization problem with respect to the variable , namely


where we define


with the compact notation . Since is jointly convex in [cf. (IV-B)], then the minimization over is within a convex set; thus, (23a)-(23c) is still convex with respect to [25, Sec. 3.2.5]. In addition, due to , we can alternatively write , which is in the family of convex functions. Hence, (24) is generally a nonsmooth and unconstrained convex problem with respect to , which can be solved using the stochastic subgradient iteration described next.

The subgradient of can be first written as

With denoting the optimal solution for the problem in (IV-C), the partial subgradient of with respect to is , where

with .

Defining , a standard sub-gradient descent iteration can be employed to find the optimal for (24), as


where denotes iteration index, and is the sequence of stepsizes.

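Implementing (26) essentially requires performing (high-dimensional) integration over the unknown multivariate distribution function of the fast-timescale random state, which enters the subgradient through the expected real-time cost term. To circumvent this impasse, a stochastic subgradient approach is devised based on the past realizations of this random state. Per iteration, we randomly draw a realization from the past realizations, and run iteration (27), which has the same form as (26) except that the expected subgradient is replaced by its sample-based counterpart. The latter is evaluated at the optimal real-time decisions obtained by solving the convex real-time problem for the drawn realization and the current iterate.

As the sample-based subgradient is indeed an unbiased random realization of the exact one [28], if we adopt a sequence of non-summable diminishing stepsizes $\{\mu^{(j)}\}$, with $j$ the iteration index, satisfying $\lim_{j \to \infty} \mu^{(j)} = 0$ and $\sum_{j=0}^{\infty} \mu^{(j)} = \infty$, the iteration (27) asymptotically converges to the optimal ahead-of-time energy amounts as $j \to \infty$ [29].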
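The mechanics of the stochastic subgradient iteration can be sketched on a deliberately simplified planning problem. The toy below is an assumption-laden illustration, not the paper's solver: it considers a single BS, one slot per interval, no battery and no beamforming, so the inner real-time problem reduces to trading the shortfall `demand - planned`. All names, prices, and the uniform demand history are hypothetical.

```python
import random


def sample_subgradient(planned, harvested, demand, lt_buy, lt_sell, rt_buy, rt_sell):
    """Subgradient, w.r.t. the planned amount, of the cost for one demand sample."""
    lt_slope = lt_buy if planned > harvested else lt_sell   # ahead-of-time cost slope
    shortfall = demand - planned                            # energy traded in real time
    rt_slope = rt_buy if shortfall > 0 else rt_sell         # real-time cost slope
    return lt_slope - rt_slope                              # shortfall shrinks as planned grows


def plan_energy(history, harvested, lt_buy, lt_sell, rt_buy, rt_sell,
                iterations=5000, seed=0):
    """Stochastic subgradient iteration with stepsizes 1/(j+1)."""
    rng = random.Random(seed)
    planned = 0.0
    for j in range(iterations):
        demand = rng.choice(history)      # draw one past realization at random
        step = 1.0 / (j + 1)              # diminishing and non-summable
        planned -= step * sample_subgradient(
            planned, harvested, demand, lt_buy, lt_sell, rt_buy, rt_sell
        )
    return planned


# Hypothetical history of past demands, roughly uniform on [8, 12].
history_rng = random.Random(1)
history = [history_rng.uniform(8.0, 12.0) for _ in range(1000)]

planned = plan_energy(history, harvested=3.0,
                      lt_buy=3.0, lt_sell=2.0, rt_buy=5.0, rt_sell=1.0)
print(planned)
```

With these illustrative prices, the optimality condition places the plan where half of the demand samples exceed it, so the iterate should settle near the median of the history (about 10). No histogram of the demand distribution is ever built; each iteration touches a single past sample.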

Compared with [19], the proposed stochastic subgradient method is particularly tailored for our setting, and does not require the random vector to have discrete and finite support. In addition, whereas the approach in [19] essentially belongs to the class of statistical learning based approaches [30], the proposed stochastic method avoids constructing a histogram for learning the underlying multivariate distribution and requires a considerably smaller number of samples to obtain an accurate estimate of the optimal ahead-of-time energy amounts.

Remark 2

The computational complexity of the proposed algorithm is fairly low. Specifically, for solving the real-time energy balancing and beamforming problem of Section III-B per slot $t$, the off-the-shelf interior-point solver incurs a worst-case complexity that is polynomial in the numbers of BSs and users [26]; for solving the ahead-of-time energy planning problem of Section III-C every $T$ slots, the stochastic subgradient approach needs $\mathcal{O}(1/\epsilon^2)$ iterations to obtain an $\epsilon$-optimal solution, while the per-iteration complexity is of the same order as that of the real-time problem, since each iteration solves one instance of it. Updating the iterate in (27) requires only complexity linear in the number of BSs.

IV Performance Analysis

In this section, we show that the TS-OC can yield a feasible and asymptotically (near-)optimal solution for problem (12).

IV-A Feasibility Guarantee

Note that in problems (19) and (20), the battery states $\{C_i(t)\}$ are removed from the set of optimization variables and the constraints in (8) are ignored. While the battery dynamics are accounted for by the TS-OC algorithm (in the step of “Queue updates”), it is not clear whether the resultant $C_i(t) \in [C^{\min}, C^{\max}]$, $\forall i, t$. Yet, we will show that by selecting a pair $(\Gamma, V)$ in (15), we can guarantee that $C^{\min} \leq C_i(t) \leq C^{\max}$, $\forall i, t$; meaning, the online control policy produced by the TS-OC is a feasible one for the original problem (12), under the conditions (13)–(14).

To this end, we first show the following lemma.

Lemma 1

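If $\bar{\alpha} := \max\{\alpha_t^{rt}, \forall t\}$ and $\underline{\beta} := \min\{\beta_t^{rt}, \forall t\}$, the battery (dis)charging amounts $P_{b,i}^*(t)$ obtained from the TS-OC algorithm satisfy: i) $P_{b,i}^*(t) = P_b^{\min}$, if $C_i(nT) > -V \underline{\beta} - \Gamma$; and ii) $P_{b,i}^*(t) = P_b^{\max}$, if $C_i(nT) < -V \bar{\alpha} - \Gamma$.

Proof: In TS-OC, we determine $P_{b,i}^*(t)$ by solving the real-time problem (20). From the equivalent rewritten problem in Section III-B, we can see that the determination of $P_{b,i}^*(t)$ is decoupled across BSs, and it depends on the first derivative of $G^{rt}(\cdot)$. By (21), the maximum possible gradient for $V G^{rt}(\cdot)$ is $V \bar{\alpha}$, and the minimum possible one is $V \underline{\beta}$. It then follows that if $V \bar{\alpha} + Q_i(nT) < 0$, the objective is decreasing in the charging amount and we must have $P_{b,i}^*(t) = P_b^{\max}$. Similarly, if $V \underline{\beta} + Q_i(nT) > 0$, we must have $P_{b,i}^*(t) = P_b^{\min}$. Given that $Q_i(t) = C_i(t) + \Gamma$, the lemma follows readily.

Lemma 1 reveals partial characteristics of the dynamic TS-OC policy. Specifically, when the energy queue (i.e., battery SoC) is large enough, the battery must be discharged as much as possible; that is, $P_{b,i}^*(t) = P_b^{\min}$. On the other hand, when the energy queue is small enough, the battery must be charged as much as possible; i.e., $P_{b,i}^*(t) = P_b^{\max}$. Alternatively, such results can be justified by the economic interpretation of the virtual queues. Specifically, $-Q_i(nT)/V$ can be viewed as the instantaneous discharging price. For high prices $-Q_i(nT)/V > \bar{\alpha}$, the TS-OC dictates full charge. Conversely, the battery units can afford full discharge if the price is low.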
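The threshold structure can be checked numerically on a scalar toy version of the per-slot problem. The sketch below is an illustration under stated assumptions rather than the TS-OC solver itself: it considers one BS with a fixed net demand (consumption minus the planned per-slot supply), ignores beamforming, and finds the best (dis)charging amount by grid search. The function name, the grid resolution, and all numerical values are hypothetical.

```python
def best_charge(queue, weight, buy_price, sell_price, net_demand,
                charge_min, charge_max, grid=2001):
    """Grid search for the (dis)charging amount minimizing
    weight * real-time cost + queue * charge (one BS, fixed net demand)."""
    best_value, best_amount = None, None
    for n in range(grid):
        amount = charge_min + (charge_max - charge_min) * n / (grid - 1)
        traded = net_demand + amount                      # real-time energy traded
        cost = weight * max(buy_price * traded, sell_price * traded) + queue * amount
        if best_value is None or cost < best_value - 1e-12:
            best_value, best_amount = cost, amount
    return best_amount


settings = dict(weight=1.0, buy_price=5.0, sell_price=1.0,
                net_demand=0.5, charge_min=-2.0, charge_max=2.0)

high_queue = best_charge(queue=0.0, **settings)    # queue > -weight * sell_price
low_queue = best_charge(queue=-6.0, **settings)    # queue < -weight * buy_price
mid_queue = best_charge(queue=-3.0, **settings)    # between the two thresholds
print(high_queue, low_queue, mid_queue)
```

In this toy, a queue above the upper threshold yields full discharge and a queue below the lower threshold yields full charge, in line with the lemma; between the thresholds the battery simply cancels the net demand so that nothing is traded in real time.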

Based on the structure in Lemma 1, we can thus establish the following result.

Proposition 1

Under the conditions (13)–(14), the TS-OC algorithm with any pair $(\Gamma, V)$ specified in (15) guarantees $C^{\min} \leq C_i(t) \leq C^{\max}$, $\forall i$, $\forall t$.


Proof: See Appendix A.

Remark 3

Note that Proposition 1 is a sample path result; meaning, the bounded energy queues $C_i(t) \in [C^{\min}, C^{\max}]$, $\forall i$, hold per time slot under arbitrary, even non-stationary, random processes. In other words, under the mild conditions (13)–(14), the proposed TS-OC with proper selection of $(\Gamma, V)$ always yields a feasible control policy for (12).

IV-B Asymptotic Optimality

To facilitate the analysis, we assume that the random processes $\{\boldsymbol{\xi}_n^{lt}\}$ and $\{\boldsymbol{\xi}_t^{rt}\}$ are i.i.d. over the slow and fast timescales, respectively. Define and