On Online Energy Harvesting in Multiple Access Communication Systems1footnote 11footnote 1This work was presented in part at the 2012 IEEE International Symposium on Information Theory [12].

# On Online Energy Harvesting in Multiple Access Communication Systems111This work was presented in part at the 2012 IEEE International Symposium on Information Theory [12].

M. Badiei Khuzani, P. Mitran
Department of Electrical and Computer Engineering
University of Waterloo, Canada
###### Abstract

We investigate performance limits of a multiple access communication system with energy harvesting nodes where the utility function is taken to be the long-term average sum-throughput. We assume a causal structure for energy arrivals and study the problem in the continuous time regime. For this setting, we first characterize a storage dam model that captures the dynamics of a battery with energy harvesting and variable transmission power. Using this model, we next establish an upper bound on the throughput problem as a function of battery capacity. We also formulate a non-linear optimization problem to determine optimal achievable power policies for transmitters. Applying a calculus of variation technique, we then derive Euler-Lagrange equations as necessary conditions for optimum power policies in terms of a system of coupled partial integro-differential equations (PIDEs). Based on a Gauss-Seidel algorithm, we devise an iterative algorithm to solve these equations. We also propose a fixed-point algorithm for the symmetric multiple access setting in which the statistical descriptions of energy harvesters are identical. Along with the analysis and to support our iterative algorithms, comprehensive numerical results are also obtained.

{keywords}

Energy harvesting, multiple access communication, iterative algorithm.

## I Introduction

The direct impact of energy on communication cost and lifetime has spurred significant efforts to manage and optimize energy consumption. In this respect, current and future state of the art technology has focused on harvesting energy from the environment. It is thus of paramount importance to design suitable adaptive power transmission policies for these technologies. In particular, the formulation of power policies in energy harvesting systems depends intricately on many factors, including energy arrival distribution, battery capacity, quality of service, etc. Moreover, most renewable energy resources have unpredictable behaviour that makes the design process of optimal power policies difficult. Solar panels, for instance, can only scavenge sunlight during the daytime and even then, this is a function of weather conditions. Another example is thermoelectric generators where energy is absorbed based on random temperature gradients between two metal junctions. Regarding these examples, a key objective of recent studies is to engineer optimal transmission power polices. These studies, depending on causal or non-causal knowledge of future energy arrivals, fall within two major categories: offline or deterministic (for non-causal), and online or stochastic (for causal) energy harvesting systems.

In the offline regime and in terms of throughput maximization, optimal power allocation for different communication topologies has been well studied. For instance, [23] studies the multiple access channel (MAC), [21] studies the broadcast channel, and the interference channel is studied in [20]. In addition, the issue of maximizing throughput in a fading channel has been treated in [14]. There, a directional water-filling algorithm is proposed. In [5], a continuous time energy harvesting system with constant energy leakage rate due to battery imperfections is considered. Another interesting problem has been studied in [22] where an offline energy harvesting problem subject to minimizing the transmission completion time is analyzed. Specifically, a continuous-time policy to minimize the delivery time of data packets is formulated. Among more recent results in the offline setting is [8] where energy cooperation in a two-hop communication system is considered.

As an overview of prior works in the online regime, we refer the reader to [14], [16], [18], and [15]. In [14] an algorithm in the offline problem of throughput maximization by a deadline was heuristically applied to the online counterpart. The authors have also considered a dynamic programming solution for online policies. However, the curse of dimensionality in the backward induction renders the computational cost of this approach very expensive. In [16], the capacity of the additive white Gaussian noise channel (AWGN) under discrete-time energy arrivals and infinite battery capacity is characterized. Additionally, two achievable schemes based on save-and-transmit and best-effort-transmit are studied there. In [18], queuing aspects of the online energy harvesting problem with infinite battery and buffer capacity have been considered. The authors have also suggested a greedy policy that in the low signal to noise ratio (SNR) regime is throughput optimal and attains minimum delay. A more relevant study related to the work presented here is [19]. Therein, Srivastava and Koksal have investigated an optimization problem where the objective is to maximize a utility function subject to causality and battery constraints. More interestingly, they addressed a trade-off between achieving the optimum utility and keeping the discharge rate low.

In this paper, we consider the online setting with continuous time policies in which the energy release rates are regulated dynamically based on the remaining charge of the battery at each moment. This architecture naturally requires a different mathematical framework in terms of modelling and analysis. Particularly, the main tool here for modelling the interaction between battery, energy arrivals, and energy consumption is a stochastic process known as a compound Poisson dam model. This model was pioneered by Moran in 1954 [13] and studied further by Gaver-Miller [6] and Harrison-Resnick [9]. In connection with this model, we derive an upper bound on the total sum-throughput of an online energy harvesting system. Also in terms of achievability, we construct an optimization problem to maximize the sum-throughput subject to an ergodicity constraint. This maximization problem turns out to be non-linear and analytically cumbersome. Relying on a calculus of variations approach, we subsequently find a system of simultaneous PIDEs as necessary conditions for an optimal power policy. We then propose a Gauss-Seidel method (see [3]) to solve these equations efficiently. In the symmetric case, when the statistical description of all the energy harvesters are identical, we obtain an alternative algorithm using a fixed point iteration method. Moreover, in the case of the point-to-point channel setting, the necessary condition further reduces to a non-linear, autonomous ordinary differential equation (ODE) that can be solved directly, using conventional numerical methods [12].

The rest of the paper is organized as follows. In section II, we review some background, definitions, and notation. Section III deals with necessary and sufficient conditions for ergodicity of the storage process. In Section IV, we derive an upper bound as well as the achievability results for both finite and infinite storage cases, including two algorithms for the achievability part. These algorithms are then used to compute the numerical results in Section V. Lastly, in Section VI, we summarize our main findings and outline possible future directions.

## Ii Preliminaries

### Ii-a Communication model

We consider multiple access transmission nodes that wish to transmit their data over a shared communication channel. Furthermore, each transmission node has an energy harvesting module and a battery to capture and store arriving energy packets. Throughout the paper, we denote the instantaneous transmission power at time from the node () by . Also, to quantify the corresponding transmission rate of the nodes, we consider Shannon’s rate function, , where denotes the noise power spectral density. In particular, Shannon’s rate function carries the following properties and, unless stated otherwise, only these properties will be used in Section IV:

[Positivity] .

[Differentiability] is three times continuously differentiable on .

[Monotone increasing] for all .

[Concavity] for all .

Letting denote the long-term average rate of the user, we then have the rate-region described by

 ∑k∈SRk≤limT→∞1T∫T0r(∑k∈SPk(s)) ds, (1)

where the inequality holds for all subsets , and the resulting region is a polytope called polymatroid. In this study, we restrict ourselves to the dominant face of this polymatroid (called permutahedron) that represents the total sum-throughput (or sum-rate) of the channel. Then, the sum-throughput is

 M∑k=1Rk=limT→∞1T∫T0r(M∑k=1Pk(s)) ds. (2)

### Ii-B Energy harvesting and storage model

In our energy harvesting model, we allow the transmission nodes to use different techniques for harvesting exogenous energy. For example, while one node may collect solar energy, another node can use a thermoelectric generator. This mechanism is especially important for sensor networks where distributed terminals may measure miscellaneous targets that also feed sensors with energy (e.g. see [7]). Mathematically, we assume that for each individual node , energy is replenished into the corresponding battery according to specific energy arrivals , where the superscript denotes the order of arrivals. Furthermore, the energy arrivals for node are independent, identically distributed (i.i.d) according to which occur at random arrival times denoted by . The interarrival times are also assumed to be i.i.d and exponentially distributed. Therefore, the attributed point process, , is a homogeneous Poisson point process with intensity denoted by . Consequently, the total energy flow into node and up to time is a compound Poisson process,

 EInk(0,t]≜N(t)∑i=0Eik. (3)

To characterize the storage model, we also need to determine the output process at each transmitter. To do so, let denote the energy stored in the -th battery as a function of time. Then, the total energy expenditure until time is

 EOutk(t) ≜∫t0Pk(s) ds, (4) =∫t0pk(Xk(s)) ds, (5)

where represents the transmission power policy of the -th transmitter, modulated by the available energy in the battery. Now, the storage equation in terms of the energy arrivals in (3) and the drift process in (5) is

 Xk(t)=Xk(0)+EInk(0,t]−∫t0pk(Xk(s)) ds, (6)

where is the initial battery reserve at time , and here the battery is assumed to have infinite capacity . In the case that the -th battery has a finite storage capacity, say , then , and we can similarly characterize the following dynamics,

 Xk(t)=Xk(0)+EInk(0,t]−∫t0pk(Xk(s)) ds−Zk(t), (7)

where is valued process that is null at zero , non-decreasing, continuous almost everywhere, and such that . This process, known as reflection process [17], ensures that for any energy arrival, the storage process remains inside the boundary, i.e., .

It is also interesting to note that the application of the structures in (6) and (7) are not limited to the current problem. In fact, this formulation has wide applicability in other fields of studies. Examples include workload modulated queues [4], water reservoir dam analysis [1], food contaminants exposure in bioscience [2], etc. In this paper, the ergodicity results of [1] will be used and are summarized in section III.

### Ii-C Notation

In the rest of the paper and for conciseness, we adopt several shorthand notations. In particular, stands for . For , we define the rectangular domain as

 A≜[0,L1]×[0,L2]×⋯×[0,LM].

Related to this, we also define the dimensional integral by

 L1∫0L2∫0⋯LM∫0(⋅) dx1dx2⋯dxM,

which is represented by . For all subsets , we use to denote the projection of onto the coordinates indexed by , i.e.,

 A({1,3})=[0,L1]×[0,L3].

Then, denotes integration over a subset of . is also a shorthand for

 Aj≜[0,L1]×⋯[0,Lj−1]×[0,Lj+1]⋯×[0,LM].

Finally, to avoid confusion between energy arrivals and the expectation operator, we use for the latter.

## Iii Ergodic Theory of Storage Process

We here summarize necessary and sufficient conditions for ergodicity of the storage process in (6). Before stating the definitions regarding ergodic behaviour, we first put some mild constraints on the transmission policies. Particularly, for all ,

The first condition indicates that as long as there is energy in the battery, transmission continues (otherwise, the battery would have a minimum energy reserve that can not be consumed). The second condition does not permit the energy in the battery to be consumed instantly. Regarding these constraints, we say a policy is admissible iff it fulfills these two conditions.

###### Definition 1

The hitting time, , is defined as the first time that the energy level in the battery reaches the value of . More specifically,

 τ(x)=inf{t≥0:X(t)=x}.
###### Definition 2

[1, pp. 290] The storage process is said to be transient, if and only if for all initial energy levels in the battery, we have . Alternatively, the storage process is said to be recurrent if and only if , . In the case of a recurrent storage process, it is said to be positive recurrent if it further satisfies for one and therefore for all (irreducibility). Similarly, the recurrent storage process is null recurrent if for one and therefore for all .

One motivation for surveying ergodic conditions is to rule out policies that result in transient and null recurrent battery behaviours. For example in the transient case a.s. which is unrealistic. Also, in the null recurrent case which implies an unbounded energy reserve in the battery.

###### Theorem 1

(Assmussen [1, Thm. 3.6]) The storage process is positive recurrent if and only if there exist a probability measure that is absolutely continuous on and which may possess an atom at zero, , i.e.,

 πk(xk)=π0k+∫xk0+fk(vk) dvk, (8)

and such that

 fk(xk)=λkpk(xk){π0k(1−Bk(xk))+∫xk0+(1−Bk(xk−vk))fk(vk) dvk}. (9)

Furthermore, is the unique stationary distribution of the process .

###### Remark 1

The elegant proof of Assmussen for the converse part of Theorem 1 is based on an embedded Markov chain at marked arrival times. In particular, for recurrent embedded chains, it is shown that any storage interval is recurrent in the sense of Harris. An alternative proof of the converse part of Theorem 1 adopts the additional condition . Due to this extra condition, the required time to reach the zero state in the absence of new arrivals from any energy level in the battery must be finite. For this constraint, it can also be shown that is a regenerative recurrent point for the process and therefore, due to the additional constraint, the probability measure has a strict atom at zero.

###### Remark 2

As discussed in [1, pp. 297], in the finite energy case (), the storage process is always positive recurrent and the probability measure is likewise governed by (8) and (9).

###### Remark 3

We note that the atom of the probability measure corresponds to an absorbing state of the process in the sense that upon entering state , the process remains there until an energy arrival occurs (at which point the process transits to another state). Based on this and the first constraint on admissible power policies (in particular ), there is no atom at in the finite case since it has a strictly negative drift in (7) that shifts the process to the inner region of the state-space instantaneously, i.e., . Therefore, the battery never idles with (reflecting boundary).

An interpretation for the forward equation in (9) can be provided in terms of level crossing theory. In particular,

 fk(xk)pk(xk)=λk{π0k(1−Bk(xk))+∫xk0+(1−Bk(xk−vk))fk(vk) dvk}, (10)

is the equilibrium condition between the rate of down crossing at level (the l.h.s of (10)) and up crossing at level (the r.h.s of (10)). We can also view (9) as a Volterra integral equation of the second kind with the kernel function , and it can thus be solved numerically (see [10]).

In this paper, we consider the energy arrivals , to be exponentially distributed with parameter . Thereby, we have

 K(xk,vk)=exp(−ζk(xk−vk)),

that simplifies (9) to

 fk(xk)=λkexp(−ζkxk)pk(xk)Gk(xk), (11)

where

 Gk(xk)≜(π0+∫xk0+exp(ζkvk)fk(vk) dvk). (12)
###### Remark 4

The storage models in (6) and (7) are memoryless, in the sense that at each time instant , the power policy only depends on the available charge in the battery and not the entire sample path . As an extension, we can also define a storage model with memory and infinite battery capacity as follows

 Xk(t)=Xk(0)+EInk(0,t]−∫t0pk(Xk(u);u≤s) ds. (13)

The extension of the storage model with memory and finite battery capacity follows similarly. However, when the arrival process is Poisson, it can be shown that is a sufficient statistic for an optimal power policy for both infinite and finite battery cases (see Appendix A). In this regard, knowledge of the entire path as an argument of is excessive.

## Iv Bounds on Total Average Throughput

Our objective now is to derive an upper bound on the average throughput as well as achievable policies with good performance. In connection with our system model, we will analyze a MAC with 1) finite, and 2) infinite storage batteries.

In particular, in the finite storage case, a good power policy must manage overflow in the battery as regular overflow causes energy waste and potentially decreases the sum throughput. To reduce overflow, the power policy must result in a large transmission power when the battery charge is large as otherwise overflow is likely to occur upon a new arrival. However, transmitting with too large a transmission power when the battery happens to have large charge is also undesirable due to the concavity of the rate function. In other words, there is a tension between overflow and the rate at which the large battery charge is consumed to reduce overflow likelihood.

To further clarify the latter point, consider an energy harvesting system with a single node () in which energy is replenished into a battery exactly every units of time. In addition, assume that the transmitter sends data by using a constant transmission power . Two cases can now be examined:

i) : In this case, the transmitter fails to consume the entire battery charge before the next arrival, and thus overflow occurs regularly. We then have

 T×r(EαT)≤T×r(ET). (14)

ii) : In this case, the transmitter depletes its available battery charge within of each arrival. From the concavity of the rate function, we have the following inequality

 αT×r(EαT)≤T×r(ET). (15)

Here, the tension between (i) and (ii) is resolved by the optimal choice of , i.e., .

### Iv-a An Upper Bound

#### Iv-A1 Finite Storage Battery

In this case . Then from (2) and due to ergodicity of the storage processes in the finite battery case (ref. Remark 3), we have almost surely

 M∑k=1Rk\lx@stackrela.s.=E[r(M∑k=1pk(Xk))], (16)

where the expectation is with respect to the stationary distribution in Theorem 1. In addition, from the concavity property of the rate function and Jensen’s inequality,

 E[r(M∑k=1pk(Xk))]≤r(M∑k=1E[pk(Xk)]). (17)

It thus remains to bound the mean transmission power . This can be accomplished by integrating by parts as follows

 E[pk(Xk)] =π0kpk(0)+∫Lk0+pk(xk)fk(xk) dxk (18) \lx@stackrel(a)=∫Lk0+pk(xk)fk(xk) dxk (19) \lx@stackrel(b)=λk∫Lk0+exp(−ζkxk)Gk(xk) dxk (20) =−λkζkexp(−ζkxk)Gk(xk)∣∣Lk0++λkζk∫Lk0+exp(−ζkxk)G′k(xk) dxk, (21)

where (a) comes from the first constraint on the admissible power policies and (b) follows from (11). Now from (12),

 G′k(xk)=fk(xk)exp(ζkxk). (22)

Also we note that and

 e−ζkLkGk(Lk) =e−ζkLk(π0k+∫Lk0+eζkxkfk(xk) dxk) (23) \lx@stackrel(c)≥e−ζkLk(π0k+∫Lk0+fk(xk) dxk) (24) =e−ζkLk, (25)

where inequality (c) is due to the fact that for all since . Substituting (25) and (22) in (21) thus leaves us

 E[pk(Xk)] =λkζk(Gk(0+)−eζkLkGk(Lk))+λkζk∫Lk0+fk(xk) dxk, (26) ≤λkζk(π0k−e−ζkLk+∫Lk0+fk(xk) dxk) (27) =λkζk(1−exp(−ζkLk)), (28)

In the last step, we now use (28) and the non-decreasing property of the rate function to characterize an upper bound for all as follows

 M∑k=1Rk≤r(M∑k=1λkζk(1−e−ζkLk))≜Rupper. (29)

#### Iv-A2 Infinite Storage Battery

We now take . In this case, similar to (20) we can directly compute,

 E[pk(xk)] =λk∫∞0+e−ζkxkGk(xk) dxk (30) =λk∫∞0+e−ζkxk(π0k+∫xk0+eζkvkfk(vk)dvk) dxk (31) =λkζkπ0k+λk∫∞0+∫xk0+eζk(vk−xk)fk(vk) dvkdxk (32) \lx@stackrel(a)=λkζkπ0k+λk∫∞0+∫∞vkeζk(vk−xk)fk(vk) dxkdvk (33) =λkζkπ0k+λkζk∫∞0+fk(vk) dvk (34) =λkζk, (35)

where in (a), we changed the order of integration due to the Fubini’s theorem. Thus, for positive recurrent policies and when all , we have the following upper bound

 M∑k=1Rk≤r(M∑k=1λkζk). (36)
###### Remark 5

In contrast with the inequality (29) which only holds for positive recurrent transmission power policies, (36) is valid for transient and null recurrent power policies as well.

In particular, in the infinite battery case, regardless of the type of power policy, and thus (36) follows by concavity of the rate function. Nevertheless, the strict equality in (35) will be used to study transmission power policies that result in ergodic behavior for the infinite battery capacity case in subsection IV-B.

### Iv-B Achievable allocation scheme

To derive transmission power policies with good performance, we start with the ergodicity assumption and the definition of expectation, i.e.,

 M∑k=1Rk =limT→∞1T∫T0r(M∑k=1Pk(s)) ds (37) \lx@stackrela.s.=∫Ar(M∑k=1pk(xk))M∏k=1πk(dxk) (38) ≜R({pk(xk)}Mk=1), (39)

where

 πk(dxk)=[π0kδ(xk)+fk(xk)]dxk, (40)

and denotes the Dirac delta function. We now aim to find achievable policies through the following optimization problem

 sup{π0k,fk(xk)}Mk=1∫Ar(M∑k=1pk(xk))M∏k=1πk(dxk), (41) s.t.: ∀k∈[M] fk(xk)=λke−ζkxpk(xk)(πk0+∫xk0+e−ζkvfk(v) dv), (42) π0k+∫Lk0+fk(xk) dxk=1, (43) π0k≥0,  fk(xk)≥0, (44)

which maximizes the overall expected throughput of the multiple access channel subject to the stationary probability measure constraints of the batteries. However, tackling this non-linear optimization problem is challenging as the feasibility constraint in (42) is not in an explicit form. To circumvent this difficulty, we use a calculus of variations approach to transform the problem into a set of necessary conditions for an optimal solution. As a starting point, consider the following linear mappings

 gk(xk)≜fk(xk)eζkxk,   xk>0, (45)

that converts the positive recurrent condition in (11) into

 gk(xk) =λkpk(xk)(πk0+∫xk0+gk(v) dv) (46) =λkpk(xk)Gk(xk), (47)

with as in (12). Hence, (38) is valid with

 pk(xk)={λkGk(xk)/gk(xk)xk>00xk=0, (48) πk(dxk)=[π0kδ(xk)+e−ζkxkgk(xk)]dxk. (49)

With this substitution, we obtain an equivalent formulation for the optimization problem in (41)-(44) as below

 sup{π0k},{gk(xk)}∫Ar(M∑k=1pk(xk))M∏k=1πk(dxk), (50) s.t.: ∀k∈[M] Gk(xk)=(π0k+∫xk0+gk(v) dv), (51) π0k+∫Lk0+e−ζkvgk(v) dv=1, (52) π0k≥0,  gk(xk)≥0, (53)

where and are according to (48) and (49).

Through the formulation in (50)-(53), we can show that the throughput maximization problem in (41)-(44) is concave with respect to each coordinate over a convex feasible set. In particular, since the transformation between and is linear, the concavity of (41)-(44) can be shown equivalently by proving the concavity of the formulation in (50)-(53). To this end, suppose that are two arbitrary sets of optimization parameters belonging to the feasible region defined in (51)-(53). Then for all and , it readily follows that also satisfies (51)-(53), where and are the convex combinations of the densities and atoms, respectively. This proves the convexity of the feasible region (51)-(53).

###### Proposition 1

Let , and be the utility functions corresponding to , and respectively, such that

 k=j, (π0,αk,gαk(xk))=(π0,1k,g1k(xk))=(π0,2k,g2k(xk)), k≠j.

Then,

 Rαj≥αR1j+¯αR2j. (54)
{proof}

The proof is relegated to Appendix B.

Now define an ensemble of perturbation functions, , such that

 ∫Lk0+ψk(v) dv=0, (55) ∫Lk0+exp(−ζkv)ψk(v) dv=0, (56)

and the are continuous and bounded on their domain and . For sufficiently small , it thus follows that meets (51)-(53) with the same atoms and thus lies inside the feasibility region. Then, with , it must be true for a global maximum solution that

 Rϵ≤R, (57)

where

 Rϵ=∫Ar(M∑k=1pϵkk(xk))M∏k=1πϵkk(dxk), (58)

and

 πϵkk(xk) ≜[π0kδ(xk)+e−ζkxkgk(xk)+ϵke−ζkxkψk(xk)]dxk =πk(dxk)+ϵke−ζkxkψk(xk)dxk, (59)

and is calculated from (48) to be,

 pϵkk(xk)=⎧⎪⎨⎪⎩λkGk(xk)+ϵkΨk(xk)gk(xk)+ϵkψk(xk)xk>00xk=0, (60)

with,

 Ψk(xk)≜∫xk0ψk(v) dv. (61)

For the moment, we assume that only the coordinate is perturbed; that is . Expanding the right hand side of (58) to first order then results in

 Rϵj ×[πj(dxj)+ϵje−ζjxjψj(xj)dxj]∏k∈[M]−jπk(dxk) (62) =R+ϵj∫Ar(M∑k=1pk(xk))eζjxjψj(xj)dxj∏k∈[M]−jπk(dxk) +ϵj∫A∂r(∑Mk=1pk(xk))∂pj(xj)dpϵjj(xj)dϵj∣∣ϵj=0M∏k=1πk(dxk) +O(ϵ2j). (63)

On the other hand, we note that

 dpϵjj(0)dϵj∣∣ϵj=0=0, (64)

since from (60).222Alternatively, for all as the battery is empty. Therefore,

 ∫A∂r(∑Mk=1pk(xk))∂pj(xj)dpϵjj(xj)dϵj∣∣ϵj=0δ(xj)dxj=0, (65)

and we thus have from (63)

 Rϵj =R+ϵj∫Ar(M∑k=1pk(xk))eζjxjψj(xj)dxj∏k∈[M]−jπk(dxk) +ϵj∫A∂r(∑Mk=1pk(xk))∂pj(xj)dpϵjj(xj)dϵj∣∣ϵj=0eζjxjgj(xj)dxj∏k∈[M]−jπk(dxk)+O(ϵ2j).

This expansion, accompanied with inequality (57) establishes the following necessary condition for a locally (and thus globally) optimal solution

 ∫Ar(M∑k=1pk(xk))eζjxjψj(xj)d<