# Optimal Offline and Competitive Online Strategies for Transmitter-Receiver Energy Harvesting

###### Abstract

A joint transmitter-receiver energy harvesting model is considered, where both the transmitter and receiver are powered by a (renewable) energy harvesting source. Given a fixed number of bits, the problem is to find the optimal transmission power profile at the transmitter and ON-OFF profile at the receiver to minimize the transmission time. With infinite battery capacity at both the transmitter and receiver, optimal offline and optimal online policies are derived. The optimal online policy is shown to be two-competitive in the arbitrary input case. With finite battery capacities at both ends, only random energy arrival sequences with a given distribution are considered, for which an online policy with bounded expected competitive ratio is proposed.

## I Introduction

Extracting energy from nature to power communication devices has been an emerging area of research. Starting with [1, 2], a lot of work has been reported on finding the capacity, approximate capacity [3], structure of optimal policies [4], optimal power transmission profile [5, 6, 7, 8], competitive online algorithms [9], etc. One thing that is common to almost all the prior work is the assumption that energy is harvested only at the transmitter while the receiver has some conventional power source. This is clearly a limitation; however, it has helped yield some critical insights into the problem.

In this paper, we broaden the horizon and study the more general problem where energy harvesting is employed both at the transmitter and the receiver. The joint (tx-rx) energy harvesting model has not been studied in detail and only some preliminary results are available, e.g., a constant approximation to the maximum throughput has been derived in [10] or [11], [12]. This problem is fundamentally different from using energy harvesting only at the transmitter, where the receiver is always assumed to have energy to stay on. In contrast to the variable power model at the transmitter, where it can choose to transmit at any power level given the available energy constraint, the receiver energy consumption model is binary: it uses a fixed amount of energy to stay on, and is off otherwise. Since useful transmission happens only when the receiver is on, the problem is to find jointly optimal decisions about the transmit power and the receiver ON-OFF schedule. Under this model, there is an issue of coordination between the transmitter and the receiver to implement the joint decisions; however, we ignore that in the interest of making some analytical progress, and assume that the decisions are made by a centralized controller.

We study the canonical problem of finding the optimal transmission power and receiver ON-OFF schedule to minimize the time required for transmitting a fixed number of bits, first in the case when there is no limit on the battery capacities, and then generalize it to finite battery capacities at both the transmitter and the receiver. We first consider the offline case, where the energy arrivals both at the transmitter and the receiver are assumed to be known non-causally. Even though the offline scenario is unrealistic, it still gives some design insights. Then we consider the more useful online scenario, where both the transmitter and the receiver only have causal information about the energy arrivals. To characterize the performance of an online algorithm, the competitive ratio is typically used, defined as the maximum ratio of the ‘profit’ of the online and the offline algorithm over all possible inputs.

For the infinite battery capacity case, in prior work [5], an optimal offline algorithm has been derived for the case when energy is harvested only at the transmitter, which cannot be generalized to energy harvesting at the receiver together with the transmitter. To understand the difficulty, assume that the receiver can be on for maximum time . The policy of [5] starts transmission at the first energy arrival time, and its power transmission profile is the one that yields the tightest piecewise linear energy consumption curve that lies under the energy harvesting curve at all times and touches the energy harvesting curve at the end time. The policy of [5], however, may take more than time and hence may not be feasible with the receiver on time constraint. So, we may have to delay the start of transmission and/or keep stopping in-between to accumulate more energy to transmit with higher power for shorter bursts, such that the total time for which the transmitter and receiver are on is less than . Similarly, for the finite battery capacity, an optimal offline algorithm has been derived for the case when energy is harvested only at the transmitter in [13]. However, once again there is no easy way of extending the results of [13] when both the transmitter and receiver are powered by EH, and we need a new approach.

With infinite battery capacity at both the transmitter and the receiver, in the offline scenario, we derive the structure of the optimal algorithm, and then propose an algorithm that is shown to satisfy the optimal structure. The power profile of the proposed algorithm is fundamentally different from that of the optimal offline algorithm of [5]; however, the two algorithms have some common structural properties. The recipe of our solution is to first solve the simpler problem of finding the optimal offline algorithm when there is only one energy arrival at the receiver. Building upon this solution, we then show that the optimal offline solution to the problem with multiple energy arrivals at the receiver is one among finitely many solutions of the problem with only one energy arrival at the receiver, where the corresponding single energy arrivals are suitably constructed. This technique not only gives an elegant method to prove optimality, but also helps in reducing the complexity of the optimal algorithm.

Next, we consider the more useful setup of online algorithms that use only causal information. With infinite battery capacities at both ends, for the online scenario, we propose an online algorithm, which starts at time where the accumulated energy at both the transmitter and the receiver is sufficient to transmit the given number of bits eventually. The transmit power at any time (only updated at energy arrival epochs of the transmitter) is such that, using the available energy, the remaining number of bits is transmitted in minimum time assuming no more energy is going to arrive in the future. We show that the competitive ratio of the proposed online algorithm is strictly less than 2 for any energy arrival inputs, even if chosen by an adversary. With only energy harvesting at the transmitter, a 2-competitive online algorithm has been derived in [9]. Our result is more general, with a different proof technique that allows energy harvesting at the receiver. To prove that the proposed online algorithm is optimal, we show a lower bound on the competitive ratio that is arbitrarily close to 2 for any online algorithm. This is accomplished by constructing two “bad” sequences of energy arrivals at the transmitter and the receiver, for which any algorithm fails to achieve a competitive ratio of better than 2 for at least one of the two sequences.
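The power-selection rule described above can be sketched numerically. This is a minimal illustration, not the algorithm as specified later in the paper: the rate function `log2(1 + p)`, the function names, and the start test `can_start` are all assumptions introduced here. With stored energy `energy` and no further arrivals, sending at constant power over a duration `T` delivers `T * rate(energy / T)` bits, which grows with `T`, so the minimum-time power can be found by bisection on `T`.

```python
import math

def rate(p):
    """Assumed AWGN-style rate (bits per unit time) at transmit power p."""
    return math.log2(1.0 + p)

def bits_in_time(energy, duration):
    """Bits sent by spending `energy` at constant power over `duration`."""
    return duration * rate(energy / duration)

def can_start(energy, on_time, bits):
    """Assumed start test: stored energy and receiver on time suffice for `bits`."""
    return bits_in_time(energy, on_time) >= bits

def min_time_power(energy, bits, tol=1e-12):
    """Return (duration, power) sending `bits` fastest with `energy` only.

    Returns None when the bits can never be delivered, i.e. when
    bits >= energy / ln(2), the limit of bits_in_time as duration grows.
    """
    if bits >= energy / math.log(2.0):
        return None
    lo, hi = 0.0, 1.0
    while bits_in_time(energy, hi) < bits:   # grow until hi is long enough
        hi *= 2.0
    while hi - lo > tol * hi:                # bisection on the duration
        mid = 0.5 * (lo + hi)
        if bits_in_time(energy, mid) < bits:
            lo = mid
        else:
            hi = mid
    return hi, energy / hi
```

In an online setting, a rule of this kind would be re-evaluated at each transmitter energy arrival with the updated stored energy and remaining bits; the receiver on time still has to cover the returned duration.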

Finally, we consider the case of finite battery capacity. With finite battery capacity, it is easy to show that the competitive ratio of any online algorithm with the worst case input is unbounded as follows. Suppose, by time slot , any online algorithm consumes more (less) energy than the optimal offline algorithm; then it is easy to construct future energy arrival sequences for which the optimal offline algorithm can finish transmission of the given number of bits, on account of knowing the input sequence and transmitting at a slower (faster) rate, while the online algorithm can never finish the transmission. Thus, we restrict ourselves to the scenario where energy arrivals follow a known distribution, but the realization is only known causally. We propose a simple Accumulate and Dump algorithm that waits for the battery to fill up to a certain prefixed level and, as soon as the accumulated energy is above the level, uses all the energy in the next slot, and restarts accumulating all over again. We show that the expected competitive ratio of the proposed algorithm is finite, and that it can be computed explicitly given the energy arrival distribution. In prior work [13, 14, 15], an optimal offline algorithm has been derived when only the transmitter is powered with EH and has a finite battery capacity. Instead of the offline regime, in this paper, we concentrate on the online setting, which is more relevant in practice, and propose algorithms that have a finite penalty with respect to the optimal offline algorithm.
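The accumulate-then-dump behaviour can be illustrated with a short slotted simulation. This is only a sketch of the transmitter side under assumptions made here: the names `arrivals`, `capacity`, and `threshold` are hypothetical, energy harvested during a slot is assumed usable from the next slot, and harvest beyond the battery capacity is assumed lost. The receiver battery is not modelled.

```python
def accumulate_and_dump(arrivals, capacity, threshold):
    """Per-slot energy spent by a threshold policy on a finite battery.

    arrivals[k] is the energy harvested during slot k; it is assumed to be
    usable only from slot k + 1 onwards.
    """
    battery = 0.0
    spent = []
    for harvested in arrivals:
        if battery >= threshold:
            use = battery      # level reached earlier: dump everything now
            battery = 0.0
        else:
            use = 0.0          # keep accumulating
        spent.append(use)
        # Store this slot's harvest; anything above capacity is lost.
        battery = min(capacity, battery + harvested)
    return spent
```

For example, with unit arrivals in every slot, a capacity of 5 and a threshold of 2, the sketch stays silent for two slots, dumps 2 units in the third, and then repeats the accumulate-dump cycle.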

## II System Model

The energy arrival instants at transmitter are marked by ’s with energy ’s for . The total energy harvested at the transmitter till time is given by

(1) |

Similarly, the energy arrival instants at the receiver are denoted as with energy . We initialize to without affecting the system model as follows. If , i.e. the first energy arrival at the receiver occurs before the first energy arrival at the transmitter, then we assume that energy is harvested at the receiver at time , i.e. . We shift the time origin to , i.e. . Note that, since the transmitter has no energy to transmit before time , no transmission policy can start transmission before . Therefore, assuming whenever , does not affect any transmission policy. Similarly, whenever , we assume energy arrives at the transmitter at time , i.e. , and we offset the time origin to .

The receiver spends a constant amount of power to be in ‘on’ state during which it can receive data from the transmitter. When it is in ‘off’ state it does not receive data, and uses no power. Hence, each energy arrival of adds amount of receiver on time. The total ‘time’ harvested at the receiver till time is given by,

(2) |

The rate of transmission using transmit power when the receiver is on is given by a function which is assumed to follow the following properties,

P1) | |||

P2) | |||

P3) |

Assuming an AWGN channel, function is one such example satisfying all the above properties.
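As a quick numerical sanity check, the sketch below assumes the AWGN example has the form `log2(1 + p)` (the exact expression and the statements of P1-P3 are not reproduced here, so this form is an assumption). It verifies on a grid two facts that the later proofs rely on: the rate is concave in the power, and the rate per unit power falls as the power grows. The helper names are hypothetical.

```python
import math

def rate(p):
    """Assumed AWGN-style rate (bits per unit time) at transmit power p."""
    return math.log2(1.0 + p)

def is_midpoint_concave(f, xs, slack=1e-12):
    """Check f((a + b) / 2) >= (f(a) + f(b)) / 2 for all grid pairs."""
    return all(f(0.5 * (a + b)) >= 0.5 * (f(a) + f(b)) - slack
               for a in xs for b in xs)

def is_strictly_decreasing(f, xs):
    """Check that f falls along the increasing grid xs."""
    values = [f(x) for x in xs]
    return all(u > v for u, v in zip(values, values[1:]))

powers = [0.1 * k for k in range(1, 200)]   # grid of positive powers
concave = is_midpoint_concave(rate, powers)
efficiency_falls = is_strictly_decreasing(lambda p: rate(p) / p, powers)
```

A grid check of this kind is not a proof; it only confirms that the assumed example behaves as the analysis expects.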

Let a transmission policy change its transmission power at time instants ’s, i.e. is the transmitter power between time and . The receiver is on from time to whenever and is off only if . Thus, succinctly, we say that the receiver is on at time to mean that transmit power for and the receiver is on. The start and the end time of any policy are denoted by and , respectively. Thus, any policy can be represented as , , where and . The energy used by a policy at the transmitter up to time is denoted by , and the number of bits sent by time is represented by . Clearly, for ,

(3) | ||||

(4) | ||||

(5) | ||||

(6) | ||||

(7) | ||||

(8) |

Similarly, the total time for which the receiver is on till time is denoted as .

Except for Section VII, we assume that an infinite battery capacity is available both at the transmitter and the receiver to store the harvested energy. Our objective is, given a fixed number of bits , to minimize the time of their transmission. For any policy, the total time for which the receiver is on is referred to as the ‘transmission time’ or the ‘transmission duration’, and the time by which the transmission of bits is finished is called the ‘finish time’. Thus, we want to minimize the finish time. Also, since the receiver may not always be on before the finish time, the transmission time is less than or equal to the finish time. Formally, we want to solve,

(9) | ||||

subject to | (10) | |||

(11) | ||||

(12) |

Under transmission policy , the total receiver on time till time for is given by,

(13) |

where and is a function that takes value if and if . Constraints (11) and (12) are the energy neutrality constraints at the transmitter and the receiver, i.e. the energy/on-time used cannot exceed the available energy/on-time.
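The constraints can be made concrete with a small feasibility checker for a piecewise-constant policy. This is a sketch under assumptions introduced here: the rate is taken as `log2(1 + p)`, arrivals are `(time, amount)` pairs, receiver arrivals are already converted to on time, and an arrival at time `t` is assumed usable only after `t`. All function and parameter names are hypothetical.

```python
import math

def rate(p):
    """Assumed AWGN-style rate (bits per unit time) at transmit power p."""
    return math.log2(1.0 + p)

def used_until(t, switch_times, powers, end_time, per_unit):
    """Integrate per_unit(power) from the policy start up to min(t, end_time).

    powers[i] is active from switch_times[i] until the next switch time
    (or end_time for the last entry).
    """
    edges = list(switch_times) + [end_time]
    total = 0.0
    for left, right, p in zip(edges, edges[1:], powers):
        overlap = min(t, right) - left
        if overlap > 0:
            total += overlap * per_unit(p)
    return total

def harvested_before(t, arrivals):
    """Total amount from (time, amount) arrivals strictly before t."""
    return sum(amount for when, amount in arrivals if when < t)

def is_feasible(switch_times, powers, end_time, bits,
                tx_arrivals, rx_on_time_arrivals, slack=1e-9):
    """Check the bit, energy and receiver on-time constraints of a policy."""
    # Usage grows continuously while harvest jumps at arrivals, so it is
    # enough to test just before each arrival and at the finish time.
    checks = {end_time}
    checks.update(when for when, _ in tx_arrivals + rx_on_time_arrivals
                  if when <= end_time)
    for t in checks:
        energy = used_until(t, switch_times, powers, end_time, lambda p: p)
        on_time = used_until(t, switch_times, powers, end_time,
                             lambda p: 1.0 if p > 0 else 0.0)
        if energy > harvested_before(t, tx_arrivals) + slack:
            return False
        if on_time > harvested_before(t, rx_on_time_arrivals) + slack:
            return False
    sent = used_until(end_time, switch_times, powers, end_time, rate)
    return sent >= bits - slack
```

A zero entry in `powers` models an interval in which the receiver is off, so it consumes neither energy nor on time and sends no bits.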

## III Optimal Offline Algorithm for Single Energy Arrival at the Receiver

In this section, we consider an offline scenario, i.e., all energy arrival epochs ’s and energy harvest amounts ’s at the transmitter are known ahead of time non-causally. Moreover, we assume that the receiver gets only one energy arrival of at time , and hence the total receiver on time is . The crux of the problem in both cases (with single/multiple energy arrivals at the receiver) lies in overcoming the limited transmission time available at the receiver, and is not affected much by the number of energy harvests at the receiver. As we shall see, the optimal offline algorithm with multiple energy arrivals at the receiver (solving (9)) consists of repeated application of the derived optimal algorithm for the single energy arrival case. Hence, we postpone the analysis with multiple energy arrivals at the receiver to Section V.

With only one energy harvest at the receiver, i.e. with total receiver time harvested at time , a special case of (9) to minimize the finish time of transmission of bits is,

(14) | ||||

(15) | ||||

(16) | ||||

(17) |

Compared to the case with no receiver constraint [5], Problem (14) is far more complicated, since it involves jointly solving for the optimal transmitter power allocation and the time for which to keep the receiver on.

We next present some structural results on the optimal policy to (14) starting with Lemma 1, which states that transmission powers in the optimal policy to (14) are non-decreasing over time.

###### Lemma 1.

In an optimal solution to Problem (14), if , then with . (Footnote 1: Observe that without receiver energy harvesting constraint (17), from [5] and Lemma 1 would be same as Lemma 1 in [5]. But, as we have a constraint on the total receiver time, in the optimal solution the transmitter may shut off for some time and resume transmission when enough energy is harvested. Hence, may be in-between transmission. Lemma 1 shows that even if this happens, non-zero powers still remain non-decreasing.)

###### Proof.

We prove this by contradiction. Assume that the optimal policy (say ), with violates the condition stated in Lemma 1. Let be the first transmission power such that . Let .

Suppose . This situation is shown in Fig. 1 (a). In this case, consider a new transmission policy (say ) which is same as the optimal policy till time . From to , transmits at a constant power . Then the number of bits transmitted by policy from time to is given by while the optimal policy transmits bits. Due to concavity of ,

Hence, both and transmit an equal number of bits till time , while transmits more bits than by time . After time , suppose policy transmits with the same power as policy till it completes transmitting bits. Since has transmitted more bits than till time , it finishes transmitting all bits earlier than , contradicting the optimality of .
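The concavity step of this argument can be checked numerically. The sketch below is an illustration under assumptions made here: the rate is taken as `log2(1 + p)`, and the function names are hypothetical. It compares a two-level power profile against a constant power that spends the same energy over the same total duration; by Jensen's inequality for a concave rate, the constant profile never sends fewer bits. When the first power is the larger one, the constant profile also spends energy more slowly at first, so it cannot violate the energy constraint if the original profile did not.

```python
import math

def rate(p):
    """Assumed AWGN-style rate (bits per unit time) at transmit power p."""
    return math.log2(1.0 + p)

def bits_two_level(p_first, d_first, p_second, d_second):
    """Bits sent at p_first for d_first, then p_second for d_second."""
    return d_first * rate(p_first) + d_second * rate(p_second)

def bits_equalized(p_first, d_first, p_second, d_second):
    """Bits sent at one constant power with the same energy and duration."""
    energy = p_first * d_first + p_second * d_second
    duration = d_first + d_second
    return duration * rate(energy / duration)
```

For instance, powers 3 then 1 over unit durations give 3 bits in the two-level profile, while the constant power 2 over the same two time units gives `2 * log2(3)`, about 3.17 bits.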

When , by our assumption on choosing , and . So, . If any of is non zero, then no longer remains the minimum index violating the condition stated in Lemma 1. Hence, . This situation is shown in Fig. 1(b). Now, consider a policy where the transmission power is same as the optimal policy before time and after time . From to , keeps the receiver off (so transmitter does not transmit in this duration) and from to it transmits at power . This policy still transmits equal number of bits and ends at the same time as the optimal policy . Now that matches with the form of in Case 1 from time to , we could proceed to generate another policy form (like in Case 1) which would finish earlier than . Hence, this new policy would finish earlier than as well and we would reach a contradiction.

∎

Although Lemma 1 is valid for every optimal policy to (14), we will narrow down the search for optimal solutions by looking at an interesting property presented in Lemma 2, which tells us that there is no need to stop in-between transmissions and start again. Thus, without affecting optimality, the start of the transmission can be delayed so that the transmission power is non-zero throughout.

###### Lemma 2.

The optimal solution to Problem (14) may not be unique, but there always exists an optimal solution where, once the transmission has started, the receiver remains ‘on’ throughout, until the transmission is complete.

###### Proof.

We construct an optimal solution for which for all , i.e., with no breaks in transmission, from any other optimal solution. Let an optimal policy be characterized by . Now, if , then we are done. Suppose some powers, say for some , where . We first look at instant .

Consider Fig. 2 (a), and a new policy (say ) which is same as policy before time and after time . But, it keeps the receiver off for a duration of starting from time (i.e. from to ) and transmits with power from time till . transmits same amount of bits in same time as and also satisfies constraints (15)-(17). So is also an optimal policy. But the receiver off duration in , , has been shifted to left.

Next, we generate another policy from by shifting the off duration to start from epoch up to , , as shown in Fig. 2 (b). is shifted right to start from . Note that is also optimal. We continue this process of shifting the receiver off period to the left to generate new optimal policies till we reach a policy (say ) where the receiver is off for time from , i.e. from to , , as shown in Fig. 2 (c). As has transmission power from the start time to , the effective start time of can now be changed to .

We can repeat this procedure for each off period corresponding to till the total off period is shifted to the beginning of transmission. This results in a policy with no zero powers in between, that starts after time (at ) and ends at the same time as policy .

∎

In the subsequent discussion, the optimal solution to Problem (14) means one with no breaks in transmission (reception). As we shall see in Theorem 1, such an optimal solution is unique.

Next, we show that the transmission power changes (if at all) only at energy arrival epochs ’s, and the energy used up by that epoch is equal to all the energy that has arrived till then.

###### Lemma 3.

For optimal policy , for some , , and .

It may happen that at some epoch , holds true, but the transmission power does not change. For notational simplicity, we include all such ’s in , where .

The next lemma states that if we take any feasible policy, and decrease and increase while keeping the number of transmitted bits fixed, the transmission time increases, while the finish time of the policy reduces. Lemma 4 will be useful to prove uniqueness of the optimal policy with no breaks in transmission.

###### Lemma 4.

Consider two policies , and , , which are feasible with respect to energy constraint (16), have non-decreasing powers and transmit same number of bits in total. If is same as from time to , but with and , then we have that the finish time with is less than that of , i.e., with some , and the transmission time of is more than that of , i.e., .

###### Proof.

and having used same amount of energy from to , we can say that , and . Thus, we can define and . As and transmit equal number of bits in total and are identical between time and , we can just equate the number of bits transmitted by before and after (LHS of (18)) with that of (RHS of (18)), i.e.,

(18) |

(Note that only one of the four variables can be independently chosen.) Therefore, from (18),

(19) |

As is a continuous and differentiable function, the mean value theorem implies that and such that

(20) | |||

(21) |

Substituting (20) and (21) in (19) we get,

(22) |

Now is an increasing function of since is convex. Hence, with ,

(23) |

Thus, (22) implies . So, transmission time in the policy , , is greater than the transmission time in policy i.e. . ∎
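The trade-off in Lemma 4 can be illustrated numerically. The sketch below is a simplified reading under assumptions made here: the rate is taken as `log2(1 + p)`, only the first and last segments of a policy are modelled, each segment keeps its energy fixed, and the names are hypothetical. Stretching the first segment (lower power, earlier start) adds bits, so the last segment can be shortened (higher power, earlier finish) until the total bit count is restored; the shortening is found by bisection.

```python
import math

def rate(p):
    """Assumed AWGN-style rate (bits per unit time) at transmit power p."""
    return math.log2(1.0 + p)

def segment_bits(energy, duration):
    """Bits sent when `energy` is spent at constant power over `duration`."""
    return duration * rate(energy / duration)

def shrink_last_segment(e_first, d_first, e_last, d_last, d_first_new,
                        tol=1e-12):
    """New last-segment duration that keeps the total bits unchanged.

    Assumes d_first_new > d_first and that the last segment still has to
    carry a positive number of bits after the first one is stretched.
    """
    target = (segment_bits(e_first, d_first) + segment_bits(e_last, d_last)
              - segment_bits(e_first, d_first_new))
    lo, hi = 0.0, d_last
    while hi - lo > tol * d_last:        # segment_bits grows with duration
        mid = 0.5 * (lo + hi)
        if segment_bits(e_last, mid) < target:
            lo = mid
        else:
            hi = mid
    return hi
```

With first segment (energy 2, duration 1) and last segment (energy 8, duration 1), stretching the first segment to 1.5 lets the last one shrink to roughly 0.87: the finish time moves earlier by about 0.13 while the total transmission time grows by about 0.37, in line with the lemma.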

Lemma 5 uses Lemma 4 to prove that if the start time of the optimal policy is delayed beyond the first ‘time’ arrival instant at the receiver, then the transmission time will be equal to , i.e., it will exhaust all the transmission time available with the receiver.

###### Lemma 5.

For an optimal policy , either or .

###### Proof.

We use contradiction to prove the result. Suppose the optimal policy say , starts at and has transmission time . We will generate another policy which has finish time less than that of , having transmission time squeezed in between and . Consider policy () in relation to , as defined in Lemma 4. As , , , are all related (by constraints presented in Lemma 4), choice of one variable (we consider ) defines . By definition of ’s, is the first energy arrival which is on the boundary of energy constraint (16) i.e. and is the last epoch satisfying . Hence, we can choose , such that and would be feasible with respect to energy constraint (16). Note that if , then any value of would have made infeasible.

From Lemma 4, we know that the transmission time of policy is more than that of , i.e. . From the hypothesis . Therefore, let , with . If the chosen value of is such that , then . If not, then we can further reduce so that (,,, being related by continuous functions). Note that, when , any choice of would make . Hence, with this choice of , holds and policy contradicts the optimality of policy (as finish time of is less than finish time of , from Lemma 4). Thus if in an optimal policy. ∎

Summarising the results of Lemmas 1-5, the optimal policy may change transmission powers only at energy arrival epochs i.e. for some . At these epochs, it exhausts the total energy available i.e. . The transmission powers are also non-decreasing with time, and the optimal policy uses up the total ‘receiver time’ allowed, if it does not start transmitting from .

Now we prove in Theorem 1 that the structure described in Lemmas 1-5, together with Lemma 6 (for ease of presentation, Lemma 6 is postponed to Section IV), is not only necessary but also sufficient for optimality of a policy.

###### Theorem 1.

###### Proof.

The proof consists of establishing both necessity and sufficiency. The necessity of (24) follows as it is a constraint of Problem (14), (25) follows from Lemmas 1 and 2, (26) follows from Lemma 3, (27) follows from Lemma 5, and (28) follows from Lemma 6.

Now, we prove the sufficiency of the structure (24)-(28). Let a policy , follow structure (24)-(28). We need to show that this policy is optimal, which we do via contradiction. Suppose is not optimal, and suppose there exists another policy , which is optimal. Since abides by Lemmas 1-6 on account of its optimality, also satisfies structure (24)-(28). (Now both and satisfy structure (24)-(28), but is optimal, i.e. it finishes before . This would mean that there possibly exist some more conditions which are followed by but not .) We need to show that such an optimal policy (different from ) cannot exist or is infeasible, i.e., both and cannot simultaneously satisfy (24)-(28) and be different. (Footnote 2: Note that Lemma 2 suggests that the optimal solution to Problem (14) may not be unique in general, but Theorem 1 shows that the optimal solution without breaks in transmission is indeed unique.)

The following cases arise depending on whether , or .

Case 1: If , then by (27), . So policy finishes after time and hence cannot be optimal.

Case 2: Suppose . Let be the first epoch for which for some .

Suppose . If, in policy , transmission continues after i.e. , then the amount of energy used by in interval can be lower bounded by , which follows from (25). Since , is more than , which is the energy used by policy . But by structure (26), uses all energy available at both and . So, the maximum energy available in is . Therefore, uses more than available energy in and is not feasible with respect to the energy constraint.

If , then it can be easily verified by concavity of function that transmits strictly fewer bits in interval than in interval . Both policies being the same till , we conclude that transmits fewer than bits by its finish time , and thus it is not feasible with respect to (24).

When , symmetrical arguments follow.

Case 3: This case argues the infeasibility of when . Since , the transmission time of is equal to from (27). The idea of the proof is to show that if an optimal policy starts its transmission early and finishes earlier than policy , it always takes more transmission time than (), which is going to violate the time constraint (17). First, we establish that must be the same as policy from epoch to an epoch such that . Let , and continue from with constant power till . Clearly from the definition of .

Suppose . Since transmission with a constant power from to is feasible, transmission with constant power from to , and from to is also feasible for any policy (Refer to Fig. 4 (a)) and hence,

(29) |

Transmission with power exhausts all available energy at epochs and . Therefore, power (in policy ) from to must be greater than . If not, then transmission with power in would become infeasible. Thus, from (29),

(30) |

Now, transmission with power from to , and transmission with power from to are both feasible for any policy. This combined with (30) would imply transmission with a constant power from to is feasible and hence,

(31) |

Since finish time of , , transmits in interval and uses at least energy in this interval, which follows from (25). But, the maximum energy available for transmission in interval is . From (31), we can infer that uses more than this available energy in , and therefore, we reach a contradiction over feasibility of . So, our hypothesis, , is incorrect. Since , we can conclude that .

Now, let and . From the definition of , . Then the amount of energy used by policy between and is more than what is available. So () and similarly, we can show that () till epoch . This completes the proof that is same as policy from epoch to .

By structure (28) we can be sure that there exists at least one epoch which belongs to as well as . So, .

Continuing with Case 3, consider the following process which creates feasible policies from policy as shown in Fig. 4 (b). We define two pivots and . Initially we set and . The transmission power right before is ( initially) and right after is ( initially). Keeping the policy the same from to , we increase by a small amount to and decrease by a small amount to such that the number of bits transmitted (i.e. ) remains the same under this transformation. This leads to a change in the start time and finish time . Let the starting time of transmission change to and the finish time change to for some (note that is dependent on ). We denote such a policy by vectors .

Following Lemma 4, we can conclude that . We continue increasing till either (in which case we change ) or (where we change ) or hits an epoch, say (we change , in this case). After this, we again start increasing with changed definitions of . We continue this process till or becomes equal to . Note that the value of for which becomes equal to , would be greater than , since policy