
# Age of Information Minimization for an Energy Harvesting Source with Updating Erasures: Without and With Feedback

Songtao Feng and Jing Yang

Songtao Feng and Jing Yang are with the School of Electrical Engineering and Computer Science, The Pennsylvania State University, University Park, PA 16802, USA. Email: {sxf302, yangjing}@psu.edu. This work was presented in part at the 2018 IEEE International Conference on Computer Communications (INFOCOM) - Workshop on Age of Information [1] and at the 2018 IEEE International Symposium on Information Theory [2].
###### Abstract

Consider an energy harvesting (EH) sensor that continuously monitors a system and sends time-stamped status updates to a destination. The sensor harvests energy from nature and uses it to power its updating operations. The destination keeps track of the system status through the successfully received updates. With the recently introduced information freshness metric "Age of Information" (AoI), our objective is to design optimal online status updating policies to minimize the long-term average AoI at the destination, subject to the energy causality constraint at the sensor. Due to the noisy channel between the sensor and the destination, each transmitted update may be erased with a fixed probability, and the AoI at the destination is reset to zero only when an update is successfully received. We first consider status updating without feedback available to the sensor and show that the Best-effort Uniform updating (BU) policy is optimal. We then investigate status updating with perfect feedback to the sensor and prove the optimality of the Best-effort Uniform updating with Retransmission (BUR) policy. In order to prove the optimality of the proposed policies, for each case, we first identify a lower bound on the long-term average AoI among a broad class of online policies, and then construct a sequence of virtual policies to approach the lower bound asymptotically. Since those virtual policies are sub-optimal to the corresponding original policy, the original policy must achieve the lower bound as well, and is thus optimal.

Age of information, energy harvesting, online status updating, noisy channel, feedback

## I Introduction

Recently, a metric called "Age of Information" (AoI) has been introduced to measure the freshness of the information in a status monitoring system from the destination's perspective [3]. Specifically, at time t, the AoI in the system is defined as Δ(t) := t − U(t), where U(t) is the time stamp of the latest received update at the destination. AoI has been shown to be fundamentally different from standard network performance metrics, such as throughput or delay. It has attracted growing attention from different research communities, due to its simple form and its potential to unify sampling and transmission for timely information delivery.

Generally speaking, there are two main approaches in the study of AoI. The first approach is to characterize the AoI under given status updating policies [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]. The second approach is to design status updating policies that actively optimize AoI [18, 19, 20]. Modeling the status monitoring system as a queueing system, where updates are generated at the source according to a random process, the time average AoI has been analyzed in different queueing management settings. For systems with a single server, the corresponding AoI has been studied in single-source single-server queues [3], the Last-Come First-Served (LCFS) queue with preemption in service [4], the First-Come First-Served (FCFS) queue with multiple sources [5, 6], the queue with multiple sources that only keeps the latest status packet of each source [7], and the LCFS queue with gamma-distributed service time and Poisson update packet arrivals [8]. Moreover, in queueing systems, packet deadlines are found to improve AoI performance in [9], and AoI in the presence of packet delivery errors is evaluated in [10]. The AoI in systems with multiple servers has been evaluated in [11, 12, 13]. A related metric, Peak Age of Information (PAoI), is introduced and studied in [14, 15, 16, 19]. For more complicated multi-hop networks, reference [17] introduces a novel stochastic hybrid system (SHS) approach to derive explicit age distributions. The optimality properties of a preemptive Last Generated First Served (LGFS) service discipline in a multi-hop network are identified in [18]. AoI optimization with knowledge of the server state has been studied in [19]. The relationship between AoI and the MMSE in remote estimation of a Wiener process is investigated in [20].

Due to the magnified tension between keeping information fresh and the stringent energy constraint, AoI in energy harvesting (EH) wireless networks has attracted increasing interest recently [21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]. An EH sensor harvests energy from the environment and uses it to power its sensing and communication operations. Due to the stochastic energy arrival process, all of the operations are subject to the so-called energy causality constraint. Under such constraints, various policies have been proposed to optimize different communication and sensing performance metrics [32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42]. Such a sample-path-wise constraint also makes the design and analysis of status updating policies in EH systems extremely challenging. Under the assumption that the battery size is sufficiently large, [21] shows that updates should be scheduled only when the server is free to avoid queueing delay, and a lazy update policy that introduces inter-update delays outperforms the greedy policy. Reference [22] investigates AoI-optimal offline and online status updating policies, where the online problem is modeled as a Markov decision process and solved through dynamic programming. In [23, 24, 25, 26], optimal online status updating policies under different assumptions on the battery size have been identified. Specifically, for the infinite battery case, [23] shows that the best-effort uniform updating policy, which updates at a constant rate when the source has sufficient energy, is optimal when the channel between source and destination is perfect. When the battery size is finite, the optimal policies are shown to have certain threshold structures [24, 25, 26]. Offline policies to minimize AoI in EH channels have been studied in [27, 28]. Reference [29] analyzes the AoI performance of two channel coding schemes when channel erasures are present.
Using the SHS tools proposed in [17], reference [31] and reference [43] study the average AoI for a finite battery EH system, with and without preemption of packets in service allowed, respectively. An interesting setting is considered in [30], where extra information is carried by the timing of the update packets. A tradeoff between the average AoI and the average message rate is studied for several achievable schemes.

In this paper, we take the imperfect updating channel into consideration and investigate the optimal updating policies of an EH system where updating erasures can happen. Assuming each update can be erased with a constant probability, the AoI at the destination will be reset only when an update is successfully received. Our objective is to design online status updating policies to minimize the average AoI at the destination. Depending on whether there exists updating feedback to the source, we consider two possible scenarios:

1) No updating feedback. In this case, the source has no knowledge of whether an update is successful. It can only use the up-to-date energy arrival profile and updating decisions, as well as the statistical information, such as the energy arrival rate and the erasure probability of the channel, to decide the upcoming updating time points. We show that the Best-effort Uniform updating (BU) policy, which was shown to be optimal under the perfect channel setting in [23], is still optimal.

2) Perfect updating feedback. In this case, the source receives an instantaneous feedback when an update is transmitted. Therefore, it can decide when to update next based on the feedback information, along with the information it uses for the no feedback case. For this case, we propose a Best-effort Uniform updating with Retransmission (BUR) policy and prove its optimality.

Although the proposed policies are quite intuitive, their optimality is quite challenging to establish, compared with [23]. This is because both battery outage and updating erasure will affect the AoI under the proposed policies. While the impact of either of those two events can be analyzed relatively easily when isolated, it becomes extremely challenging when both of them are involved. Besides, when there exists perfect updating feedback to the source, the updating erasure under the BUR will lead to subsequent retransmission and energy consumption, thus affecting the battery outage probability in the future. Such complicated interplay between those two events makes the problem even more complicated. In order to overcome such difficulties, we propose a novel virtual policy based approach. Specifically, for both BU and BUR updating policies, we construct a sequence of virtual policies, which are strictly suboptimal to their original counterparts, and eventually converge to them. By designing the virtual policies in a sophisticated manner, we are able to decouple the effects of battery outage and updating errors in the performance analysis. We show that the long-term average AoI under virtual policies converges to the corresponding lower bound, which implies the optimality of the original policy.

The remainder of the paper is structured as follows: In Sec. II, we describe the system model and problem formulation. In Sec. III and Sec. IV, we consider the no updating feedback case and the perfect updating feedback case, respectively. In Sec. V, we evaluate the proposed policies through extensive simulation results. We conclude in Sec. VI. For the sake of readability, we defer some proofs to the appendix.

## II System Model and Problem Formulation

Consider a scenario where an energy harvesting sensor continuously monitors a system and sends time-stamped status updates to a destination. The destination keeps track of system status through received updates. We use the metric Age of Information (AoI) to measure the “freshness” of the status information available at the destination.

We assume that the energy unit is normalized so that each status update requires one unit of energy. This energy unit represents the energy cost of both measuring and transmitting a status update. Assume energy arrives at the sensor according to a Poisson process with parameter λ. Hence, energy arrivals occur at discrete time instants t_1, t_2, …. We assume λ = 1 for ease of exposition, since we can always scale the time axis proportionally to make the energy arrival rate one unit per unit time. The sensor is equipped with a battery to store the harvested energy. In this paper, we focus on the case where the battery size is infinite.

We assume the time used to collect and transmit a status update is negligible compared with the time scale of the long-term average AoI in the system. Therefore, a status update can be generated and transmitted at any time as long as the energy level is greater than or equal to one. We assume the channel between the source and the destination is noisy, thus a transmitted update may be corrupted and unrecognizable at the destination. Specifically, we assume that with probability p, 0 < p < 1, an update will be successfully delivered to the destination, independently of all other factors in the system. As shown in Fig. 1, the AoI at the destination will be reset to zero only when an update is successfully received. We consider two possible cases. For the no updating feedback case, the source has no information about the updating result. For the perfect updating feedback case, we assume there is a perfect feedback channel between the destination and the source, so that the source is notified about an updating failure once it happens.

A status update policy is denoted as π := {l_n, n = 1, 2, …}, where l_n is the nth update time at the source. However, due to channel fading, only a subset of the update packets will be successfully delivered. Thus, the actual status update times at the destination are in general different from {l_n}. Therefore, we use S_i to denote the ith actual update time at the destination. We assume S_0 = 0, i.e., an update is successfully delivered right before time zero, and the system starts with an initial energy of E_0, E_0 ≥ 1.

Define A_n as the total amount of energy harvested in (l_{n−1}, l_n], and E(l_n^-) as the energy level of the sensor right before the update time l_n. Then, under any feasible status update policy, the energy queue evolves as follows

 E(l_1^-) = E_0 + A_1,  (1)
 E(l_n^-) = E(l_{n−1}^-) − 1 + A_n,  n = 2, 3, ….  (2)

Based on the Poisson arrival process assumption, A_n is an independent Poisson random variable with parameter l_n − l_{n−1}.
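The recursion (1)-(2), together with the causality check (3) below, is easy to prototype. A minimal Python sketch (illustrative only; the function and variable names are ours, and Poisson sampling uses Knuth's method to stay dependency-free). It assumes every scheduled update is actually sent, so a violation of causality shows up as a level below one:

```python
import math, random

def battery_levels(update_times, e0=1.0, seed=0):
    """Track E(l_n^-), the battery level right before each update time,
    via (1)-(2): E(l_1^-) = E_0 + A_1 and
    E(l_n^-) = E(l_{n-1}^-) - 1 + A_n, where A_n ~ Poisson(l_n - l_{n-1})
    since the normalized arrival rate is one."""
    rng = random.Random(seed)

    def poisson(mu):  # Knuth's method; fine for the small means used here
        k, prod, L = 0, rng.random(), math.exp(-mu)
        while prod > L:
            k += 1
            prod *= rng.random()
        return k

    levels, e, prev = [], float(e0), 0.0
    for t in update_times:
        e += poisson(t - prev)   # A_n, energy harvested in (l_{n-1}, l_n]
        levels.append(e)         # E(l_n^-)
        e -= 1.0                 # one unit consumed by the update
        prev = t
    return levels

levels = battery_levels([1.0, 2.0, 3.0, 4.0], e0=2.0)
feasible = all(e >= 1 for e in levels)   # energy causality check, cf. (3)
```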

In order to ensure every update time is feasible, we must have the energy causality constraint satisfied all the time, i.e.,

 E(l_n^-) ≥ 1,  n = 1, 2, …,  (3)

which indicates that the source will generate and transmit an update only when it has sufficient energy.

We use M(T) and N(T) to denote the number of status updates sent by the source and the number of status updates successfully received at the destination over (0, T], respectively. Define R(T) as the cumulative AoI at the destination over (0, T]. Denote the delay between two successful updates as X_i := S_i − S_{i−1}, for i = 1, 2, …. Then,

 R(T) = (∑_{i=1}^{N(T)} X_i^2 + (T − S_{N(T)})^2) / 2,  (4)

and the time-average AoI over the duration (0, T] can be expressed as R(T)/T.
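For a given sequence of successful update times, R(T)/T can be computed directly from (4), since the age curve is a sawtooth whose area decomposes into triangles. A small self-contained sketch (names are ours):

```python
def time_average_aoi(success_times, T):
    """R(T)/T per (4): sum of squared inter-success delays X_i = S_i - S_{i-1}
    over 2, plus the last partial triangle (T - S_{N(T)})^2 / 2.
    Assumes S_0 = 0 (a successful update right before time zero)."""
    s = [t for t in success_times if 0 < t <= T]
    area, prev = 0.0, 0.0
    for t in s:
        x = t - prev              # X_i
        area += x * x / 2.0
        prev = t
    area += (T - prev) ** 2 / 2.0  # last, possibly incomplete, interval
    return area / T

avg = time_average_aoi([1.0, 3.0, 4.0], T=5.0)
# areas: 1^2/2 + 2^2/2 + 1^2/2 + (5-4)^2/2 = 3.5, so avg = 0.7
```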

Our objective is to determine the sequence of update times at the source, so that the time-average AoI at the destination is minimized, subject to the energy causality constraint. We focus on a set of online policies. Specifically, for the no updating feedback case, the information available for determining the updating point l_{n+1} includes the updating history {l_i}_{i=1}^n, the energy arrival profile over [0, l_{n+1}), as well as the energy harvesting statistics (i.e., λ in this scenario) and the probability of updating success p. Denote the set of such online policies as Π_1. For the perfect updating feedback case, the source also utilizes the up-to-date updating feedback to make its decisions. We denote the set of such online policies as Π_2. Then, the optimization problem can be formulated as

 min_{π∈Π} limsup_{T→+∞} E[R(T)/T]  (5)
 s.t. (1)–(3),

where Π equals Π_1 or Π_2, depending on the setting, and the expectation in the objective function is taken over all possible energy harvesting sample paths and channel fading realizations.

## III Status Updating Without Feedback

In this section, we will study the optimal status updating policy for the case where there is no update feedback available to the sensor. We show that the expected long-term average AoI has a lower bound for a broad class of online policies, which can be achieved by the BU updating policy.

### III-A A Lower Bound

Note that when the battery size is infinite, no energy overflow will happen, and the long-term average status updating rate is only subject to the energy harvesting rate constraint. Specifically, we have the following lemma.

###### Lemma 1 (Lemma 1 in [41])

Under any policy π ∈ Π_1, it must have limsup_{T→∞} M(T)/T ≤ 1 almost surely.

We point out that Lemma 1 is also valid for all π ∈ Π_2, which will be discussed in Sec. IV.

Besides, we also have the following intuitive yet important observation.

###### Lemma 2

For any π ∈ Π_1 that achieves a finite expected long-term average AoI, it must have lim_{T→∞} M(T) = ∞ almost surely.

###### Proof.

We prove it by contradiction. Assume

 P[lim_{T→∞} M(T) = ∞] < 1,

i.e., there exist M_0 < ∞ and δ > 0, such that

 P[lim_{T→∞} M(T) < M_0] ≥ δ.

Define

 p_n := (1 − p)^{n−1} p,  (6)

i.e., the probability that the nth update sent by the source is the first one to be successfully delivered. Then, with probability at least δ(1 − p)^{M_0}, fewer than M_0 updates are ever sent and all of them are erased, in which case the AoI at the destination grows linearly in T, which implies that the expected long-term average AoI cannot be finite. ∎
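As a numeric sanity check on (6): the number of transmissions until the first success is geometric with pmf p_n, the probabilities sum to one, and the mean number of attempts is 1/p. The value p = 0.4 below is an arbitrary illustration:

```python
p = 0.4
# p_n = (1-p)^(n-1) p from (6); the truncated tail is negligible here
pn = [(1 - p) ** (n - 1) * p for n in range(1, 200)]
total = sum(pn)            # ≈ 1: some transmission eventually succeeds
mean_tries = sum(n * (1 - p) ** (n - 1) * p for n in range(1, 2000))
# mean_tries ≈ 1/p = 2.5 attempts per successful update on average
```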

In order to obtain a valid lower bound, in the following, we only need to focus on the policies that achieve finite expected long-term average AoI. To facilitate the following analysis, we introduce a broad class of online policies defined as follows.

###### Definition 1 (Bounded Updating Policy)

If under a policy π ∈ Π_1, the nth updating point at the source (i.e., l_n) satisfies E[l_n^2] < ∞ for any fixed n, π is called a bounded updating policy.

Denote the set of bounded updating policies as Π_1'. Then, Π_1' ⊂ Π_1. Intuitively, any practical status updating policy should lie in Π_1', as it is undesirable for any nth updating point (or the inter-update delay between consecutive updating points before l_n) to become unbounded in expectation. We have the following lower bound for bounded updating policies.

###### Theorem 1 (Lower Bound for Channel without Feedback)

For any policy π ∈ Π_1', the expected long-term average AoI is lower bounded by (2 − p)/(2p).

The proof of Theorem 1 is provided in Appendix -A.

### III-B Optimal Online Status Updating

In this section, we propose online status updating policies to achieve the lower bound derived in Section III-A. We start with the BU updating policy introduced in [23]. Although we assume a noisy channel in this work, when there is no channel state information or feedback available to the source, intuitively, it is still desirable for the source to update in a uniform fashion, so that the successfully received updates at the destination are as uniformly distributed in time as possible.

###### Definition 2 (BU Updating)

The sensor is scheduled to update the status at s_n = n, n = 1, 2, …. The sensor performs the task at s_n if E(s_n^-) ≥ 1; otherwise, the sensor keeps silent until the next scheduled status updating time point.

Here we use s_n to denote the nth scheduled updating time point. It is in general different from the nth actual updating time l_n, since some scheduled updates may be infeasible due to battery outage.
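A Monte Carlo sketch of BU updating helps build intuition (this is our illustrative simulation, not code from the paper): scheduled sends one time unit apart, unit-rate Poisson energy arrivals, an infinite battery, and erasures with probability 1 − p. The long-run average should hover near the (2 − p)/(2p) value established in Theorem 2 below:

```python
import math, random

def simulate_bu(p, slots=200_000, seed=1):
    """Best-effort uniform updating over integer time slots: send at
    t = 1, 2, ... whenever the battery holds at least one unit; each
    sent update survives the erasure channel with probability p."""
    rng = random.Random(seed)
    battery, last_success, area = 1.0, 0.0, 0.0
    for n in range(1, slots + 1):
        # Poisson(1) energy arrivals during (n-1, n] (Knuth's method)
        k, prod, L = 0, rng.random(), math.exp(-1.0)
        while prod > L:
            k += 1
            prod *= rng.random()
        battery += k
        if battery >= 1.0:           # best-effort: send if feasible
            battery -= 1.0
            if rng.random() < p:     # update successfully delivered
                x = n - last_success
                area += x * x / 2.0  # triangle contribution to R(T)
                last_success = n
    area += (slots - last_success) ** 2 / 2.0
    return area / slots              # time-average AoI, R(T)/T

avg = simulate_bu(p=0.5)
bound = (2 - 0.5) / (2 * 0.5)        # (2-p)/(2p) = 1.5 for p = 0.5
```

The finite-horizon estimate sits slightly above the asymptotic value because battery outages occasionally stretch the inter-success delays.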

BU updating ensures that the energy causality constraint is always satisfied. We expect that BU updating achieves the lower bound in Theorem 1. However, analyzing its AoI performance is very challenging. Although we are able to identify a renewal structure in the system status evolution under the BU updating policy (i.e., a renewal interval can begin right after the sensor successfully delivers an update and the battery level equals one), the analysis of the expected average AoI over one renewal interval is still very complicated, mainly for two reasons:

First, different from the perfect channel case [23], the actual update time at the destination may deviate from the scheduled update time due to two possible events: battery outage and update erasure. Although the average AoI can be characterized in systems where only one of these events can happen, it is hard to analyze the AoI when the effects of both events are involved.

Second, the expected length of such a renewal interval is unbounded. This is because the battery evolution under BU updating can be modeled as a martingale, and as we will show in the proof of Lemma 4, the expected time until it becomes empty for the first time (i.e., the hitting time of zero) is infinite. Since with a non-zero probability the renewal interval contains such an interval, the expected length of each renewal interval is unbounded, and the corresponding expected average AoI becomes intractable.

To overcome such challenges, we will construct a sequence of virtual policies, and show that the expected time average AoI under those virtual policies approaches the lower bound in Theorem 1. Since such virtual policies are sub-optimal to the BU updating policy, the optimality of BU updating can thus be proved. In order to simplify the definition and analysis of the virtual policy, we assume E_0 = 1. The proof can be slightly modified to show that the optimality of the proposed policy is valid for any E_0 ≥ 1.

###### Definition 3 (BU with Energy Removal (BU-ER_{T_0}))

The sensor performs BU updating until the battery level after sending an update becomes zero for the first time, or until time T_0, in which case the sensor depletes its battery; after that, when the battery level becomes higher than or equal to one after a successful update for the first time, the sensor reduces the battery level to one, and then repeats the process.

###### Lemma 3

For any T_0, the BU-ER_{T_0} updating policy is sub-optimal to the BU updating policy.

###### Proof.

We note that BU-ER updating is identical to BU updating except for the energy removal at time T_0 and whenever the battery level exceeds one at the beginning of a renewal interval. Given the same energy harvesting sample path, the battery level under BU is always greater than or equal to that under BU-ER. Thus, BU-ER incurs more infeasible status updates. With the same channel fading profile, the instantaneous AoI under BU-ER updating is always greater than or equal to that under BU updating sample-path-wise. Thus, the expected time-average AoI under BU-ER is greater than or equal to that under BU, which proves the lemma. ∎

We note that the BU-ER updating policy is a renewal type policy, i.e., the states of the system evolve according to a renewal process. To analyze the expected long-term average AoI, it suffices to analyze the expected average AoI over one renewal interval. In the following, we will focus on the first renewal interval, and show that the corresponding expected average AoI converges to the lower bound in Theorem 1 as T_0 increases. As illustrated in Fig. 2, the renewal interval consists of two stages. The first stage starts at time zero and ends when the battery becomes empty for the first time, or at time T_0, whichever comes first. We denote T_1 as the end of the first stage. We note that all scheduled status updating epochs over (0, T_1] are feasible. The second stage starts at T_1 and ends when the battery level becomes higher than or equal to one after a successful update for the first time after T_1. We denote the end of the second stage as T_1 + T_2.

###### Lemma 4

Under BU-ER_{T_0} updating, lim_{T_0→∞} E[T_1] = ∞.

###### Proof.

Consider a “random walk” {Ω_n}, which starts with Ω_0 = 1 and evolves as Ω_n = Ω_{n−1} − 1 + A_n, where the A_n's are i.i.d. Poisson random variables with parameter one. Denote the first zero-hitting time for {Ω_n} as κ. Then Ω_κ = 0 and κ < ∞ almost surely. Note that when T_0 = ∞, {Ω_n} is identical to the battery level evolution process under the BU-ER updating policy almost surely, and the corresponding T_1 equals κ.

Define a martingale process associated with {Ω_n} as G_n := exp(−αΩ_n − nγ(α)), where α > 0 and γ(α) := log E[exp(−α(A_n − 1))] = α + e^{−α} − 1. Applying the optional stopping theorem to G_n at κ, as in the corresponding proof in [41], we have

 exp(−αΩ_0) = E[exp(−αΩ_κ − κγ(α))].  (9)

Taking the derivative of both sides of (9) with respect to α, we have

 Ω_0 exp(−αΩ_0) = E[(Ω_κ + κγ′(α)) exp(−αΩ_κ − κγ(α))].  (10)

Since Ω_0 = 1 and Ω_κ = 0, (10) can be reduced to

 exp(−α) = E[κγ′(α) exp(−κγ(α))] ≤ E[κγ′(α)],  (11)

where the inequality follows from the fact that γ(α) ≥ 0, so that exp(−κγ(α)) ≤ 1.

Dividing both sides of (11) by γ′(α), we have

 E[κ] ≥ exp(−α)/γ′(α).  (12)

Note that

 lim_{α→0} γ′(α) = lim_{α→0} (−e^{−α} + 1) = 0^+.  (13)

Thus, we have

 lim_{T_0→∞} E[T_1] ≥ lim_{α→0} exp(−α)/γ′(α) = ∞.  (14) ∎
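The null recurrence behind Lemma 4 can also be observed numerically: the mean of the zero-hitting time truncated at a cap keeps growing as the cap grows, consistent with E[κ] = ∞. A small illustrative experiment (our sketch, not from the paper):

```python
import math, random

def hitting_time(cap, rng):
    """First n with Ω_n = 0 for the walk Ω_n = Ω_{n-1} - 1 + A_n,
    Ω_0 = 1, A_n ~ Poisson(1); truncated at `cap` steps."""
    omega = 1
    for n in range(1, cap + 1):
        k, prod, L = 0, rng.random(), math.exp(-1.0)  # Knuth's Poisson(1)
        while prod > L:
            k += 1
            prod *= rng.random()
        omega += k - 1
        if omega == 0:
            return n
    return cap

rng = random.Random(7)
means = [sum(hitting_time(cap, rng) for _ in range(2000)) / 2000
         for cap in (10, 100, 1000)]
# the truncated means keep growing with the cap: the walk is null
# recurrent, so the (untruncated) mean hitting time is infinite
```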

###### Lemma 5

Under BU-ER_{T_0} updating, E[T_2], E[T_2^2], E[T_1 − S_{N(T_1)}], and E[(T_1 − S_{N(T_1)})^2] are bounded.

###### Proof.

We consider another genie-aided virtual process starting at time T_1 as follows. The source performs BU-ER after T_1, and keeps tracking the battery level and the genie-informed update results. If a status update is erased and the battery level is above zero, the sensor depletes its battery and repeats the process. The process stops when the battery level after a successful update becomes one for the first time. Denote the duration of the second stage under this virtual process as T_2'.

For each sample path, we can see that the battery level under the new virtual process is always less than or equal to that under BU-ER, due to the extra energy depletion after T_1 and before T_1 + T_2'. Since the update erasure patterns are the same under both policies, we must have T_2 ≤ T_2'. We note that at each updating time point between T_1 and T_1 + T_2', the battery level is above zero with probability 1 − e^{−1}; and if that event happens, the update is successfully delivered with probability p. Therefore, T_2' under the new virtual policy is a geometric random variable with parameter (1 − e^{−1})p. Thus, its first and second moments are bounded. Therefore, E[T_2] and E[T_2^2] are bounded.

Next, we note that under BU-ER updating, the AoI over (0, T_1] is a renewal reward process, which resets to zero at every successful updating point. According to Proposition 3.4.6 in [44], E[(T_1 − S_{N(T_1)})^2] is bounded; therefore, it is uniformly bounded for any T_0. Similarly, we can show that E[T_1 − S_{N(T_1)}] is uniformly bounded. ∎

###### Lemma 6

As T_0 → ∞, the expected long-term average AoI under BU-ER_{T_0} is upper bounded by (2 − p)/(2p).

###### Proof.

First, we note that

 lim_{T_0→∞} E[(T_1 + T_2 − S_{N(T_1)})^2] / (2E[T_1 + T_2]) = lim_{T_0→∞} (E[(T_1 − S_{N(T_1)})^2] + E[T_2^2] + 2E[T_1 − S_{N(T_1)}]E[T_2]) / (2E[T_1]) = 0,  (15)

where the first equality follows from the fact that T_1 − S_{N(T_1)} and T_2 are independent, and the second equality follows from Lemma 4 and Lemma 5.

Then, we note that under BU-ER,

 lim_{T→∞} E[R(T)/T] ≤ E[∑_{i=1}^{N(T_1)} X_i^2 + (T_1 + T_2 − S_{N(T_1)})^2] / (2E[T_1 + T_2]).

Consider the channel state realizations at the scheduled status updating epochs under BU (and BU-ER) updating. Let Y_i be the duration between the (i−1)st and the ith epochs at which the channel state is good, i.e., the corresponding update would be successful if it were sent. Then, X_i is identical to Y_i for 1 ≤ i ≤ N(T_1). This is because there is no battery outage over (0, T_1], and whether an update is successful or not only depends on the channel state. Combining with (15), we have

 lim_{T_0→∞} lim_{T→∞} E[R(T)/T] ≤ lim_{T_0→∞} E[∑_{i=1}^{N(T_1)} X_i^2] / (2E[T_1 + T_2])  (16)
 ≤ lim_{T_0→∞} E[∑_{i=1}^{N(T_1)+1} Y_i^2] / (2E[∑_{i=1}^{N(T_1)+1} Y_i − (∑_{i=1}^{N(T_1)+1} Y_i − T_1)])  (17)
 = lim_{T_0→∞} E[N(T_1)+1]E[Y_1^2] / (2(E[N(T_1)+1]E[Y_1] − E[∑_{i=1}^{N(T_1)+1} Y_i − T_1])),  (18)

where (18) follows from Wald's equality and the fact that N(T_1) + 1 is a stopping time for {Y_i, i = 1, 2, …} for any given T_0.

Since ∑_{i=1}^{N(T_1)+1} Y_i ≥ T_1, according to Lemma 4,

 lim_{T_0→∞} E[N(T_1)+1]E[Y_1] ≥ lim_{T_0→∞} E[T_1] = ∞.  (19)

Meanwhile, E[∑_{i=1}^{N(T_1)+1} Y_i − T_1] is uniformly bounded for any T_0 based on Proposition 3.4.6 in [44]. Therefore, (18) equals E[Y_1^2]/(2E[Y_1]), i.e., (2 − p)/(2p). ∎

Theorem 1, Lemma 3 and Lemma 6 imply the optimality of the BU updating, as summarized in the following theorem.

###### Theorem 2 (Optimality of BU Updating)

Among all policies in Π_1', the BU updating policy is optimal when updating feedback is unavailable, i.e.,

 limsup_{T→∞} E[R(T)/T] = (2 − p)/(2p).

## IV Status Updating With Perfect Feedback

In this section, we consider the case where there exists perfect updating feedback to the sensor. With perfect updating feedback, the sensor has the choice to retransmit an erased update immediately or to wait and update later, thus leading to optimal solutions different from the no feedback case. In order to facilitate the analysis, in the following, we focus on another class of online policies, termed uniformly bounded policies.

### IV-A A Lower Bound

Define K_i as the number of attempted updates (including the last successful one) between two successful updates at S_{i−1} and S_i under any online policy in Π_2. Then, K_i can be any integer greater than or equal to one.

###### Definition 4 (Uniformly bounded policy)

Under a policy , if: 1) there exists a function such that when , , , and , and 2) for any , then, is called a uniformly bounded policy.

Roughly speaking, the first condition ensures that the source updates frequently enough so that the AoI at the destination does not grow unbounded in expectation; the second condition requires that the source does not update too frequently in any period of time. Such conditions are consistent with our intuition that the optimal policies should try to maintain a constant updating pace as much as possible. We note that uniformly bounded policies do not have to be renewal or Markovian in general. Denote the set of uniformly bounded policies as Π_2'; then Π_2' ⊂ Π_2. We have the following lemma.

###### Lemma 7

For any π ∈ Π_2', it must have lim_{T→∞} E[X_{N(T)+1}]/T = 0 and lim_{T→∞} E[S_{N(T)}]/T = 1.

The proof of this lemma is adapted from the proof of Theorem 3 in [23], and provided in Appendix -B.

Besides, we also have the following observation.

###### Lemma 8

Under any policy π ∈ Π_2', it must have lim_{T→∞} E[N(T)+1]/T ≤ p.

###### Proof.

First, we observe that

 lim_{T→∞} E[∑_{i=1}^{N(T)+1} K_i]/T ≤ lim_{T→∞} (E_0 + E[∑_{i=1}^{N(T)+1} A_i])/T  (20)

due to the energy causality constraint. We note that A(t) − t is a continuous-time martingale, where A(t) is a Poisson process with parameter one. Therefore, according to the optional stopping theorem [44], for any stopping time τ, we have E[A(τ) − τ] = 0, i.e., E[A(τ)] = E[τ]. Since S_{N(T)+1} is a stopping time associated with the past energy arrivals and channel fading realizations under any π ∈ Π_2', we have E[∑_{i=1}^{N(T)+1} A_i] = E[S_{N(T)+1}]. Plugging this into (20), we have

 lim_{T→∞} E[∑_{i=1}^{N(T)+1} K_i]/T ≤ lim_{T→∞} E[S_{N(T)+1}]/T = 1 + lim_{T→∞} E[X_{N(T)+1}]/T = 1,  (21)

where the last equality follows from Lemma 7.

Besides, we note that under any online policy π ∈ Π_2', the K_i's are i.i.d. geometric random variables with parameter p. Therefore, applying Wald's equality, we have

 lim_{T→∞} E[∑_{i=1}^{N(T)+1} K_i]/T = lim_{T→∞} E[N(T)+1]E[K_i]/T = lim_{T→∞} E[N(T)+1]/(Tp).  (22)

Combining with (21), we have lim_{T→∞} E[N(T)+1]/T ≤ p. ∎
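Equation (22) rests on Wald's equality for the i.i.d. geometric K_i's stopped at a stopping time. The toy Monte Carlo check below (entirely ours; the attempt-budget stopping rule is just one example of a valid stopping time) illustrates the identity E[∑_{i=1}^{n} K_i] = E[n]·E[K_1] = E[n]/p:

```python
import random

rng = random.Random(3)
p = 0.25

def geom(q):
    """Number of attempts until the first success; mean 1/q."""
    k = 1
    while rng.random() >= q:
        k += 1
    return k

# Stop at the first n whose cumulative attempt count reaches a budget of
# 100; n is a stopping time for the i.i.d. K_i's, so Wald's equality
# gives E[sum of the first n geometrics] = E[n] / p.
trials, sum_total, sum_n = 4000, 0.0, 0.0
for _ in range(trials):
    total, n = 0, 0
    while total < 100:
        total += geom(p)
        n += 1
    sum_total += total
    sum_n += n
lhs, rhs = sum_total / trials, (sum_n / trials) / p
# lhs and rhs agree up to Monte Carlo error
```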

In order to obtain a lower bound on the AoI for all π ∈ Π_2', we first drop the energy causality constraint and focus on those online policies that satisfy Lemma 8 and are also uniformly bounded. Denote the set of such policies as Π_2''. Then, Π_2' ⊂ Π_2''. Since not all policies in Π_2'' would be feasible if the energy causality constraint were imposed, the minimum expected long-term average AoI achieved by policies in Π_2'' serves as a lower bound for policies in Π_2'.

###### Theorem 3

Any policy π ∈ Π_2'' is suboptimal to a renewal policy, i.e., a policy under which the successful updating points form a renewal process. Besides, under the renewal policy, X_i only depends on K_i.

A sketch of the proof is as follows: For any given policy π ∈ Π_2'', we construct a renewal policy based on all possible sample paths under π. Specifically, our approach is to first average over sample paths with the same K_i, so that all factors other than K_i that may affect X_i can be averaged out. Then, we form a sophisticated linear combination of those averaged inter-update delays, and use it as the inter-update delay under the new policy. Such a policy is a renewal policy, and each renewal interval only depends on the corresponding K_i. Through rigorous stochastic analysis, we prove that the constructed renewal policy always outperforms the original policy. The detailed proof of Theorem 3 is provided in Appendix -C.

In the following, we will focus on renewal policies in Π_2'' and identify the AoI-optimal renewal policy.

###### Theorem 4

Under the optimal renewal policy in Π_2'', X_i equals a constant irrespective of K_i, and the corresponding long-term average AoI equals 1/(2p).

Proof:  Define x_k as the length of the renewal interval when the number of attempts over that interval is k, and p_k := (1 − p)^{k−1}p. Then, p_k is the probability that the kth attempt is the first successful one. Therefore, to minimize the expected long-term average AoI, it suffices to solve the following optimization problem:

 min_{{x_k}} ∑_{k=1}^∞ x_k^2 p_k / (2∑_{k=1}^∞ x_k p_k)   s.t. ∑_{k=1}^∞ x_k p_k ≥ 1/p.  (23)

This is a non-linear fractional programming problem and can be solved using the parametric approach in [45]. Below we provide an alternative yet simpler approach.

We note that

 ∑_{k=1}^∞ x_k^2 p_k / (2∑_{k=1}^∞ x_k p_k) ≥ (∑_{k=1}^∞ x_k p_k)^2 / (2∑_{k=1}^∞ x_k p_k) = (1/2)∑_{k=1}^∞ x_k p_k ≥ 1/(2p),  (24)

where the first inequality in (24) follows from Jensen's inequality and the second one follows from the constraint in (23). The equalities hold if and only if x_k equals a constant for all k and ∑_{k=1}^∞ x_k p_k = 1/p. Combining with ∑_{k=1}^∞ p_k = 1, we have x_k = 1/p for all k. Thus, the solution of (23) is x_k = 1/p for all k, and the corresponding minimum is 1/(2p).
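The conclusion can be verified numerically by evaluating the objective of (23) at two feasible candidates: a constant profile x_k = 1/p (retransmit immediately, as feedback allows), which attains 1/(2p), and a linear profile x_k = k (one attempt per unit of time, as without feedback), which gives (2 − p)/(2p). Our illustrative check, truncating the sums where the geometric tail is negligible:

```python
p = 0.3
K = 600                        # truncation level; (1-p)^K is negligible
pk = [(1 - p) ** (k - 1) * p for k in range(1, K + 1)]

def obj(x):
    """Objective of (23): sum_k x_k^2 p_k / (2 sum_k x_k p_k)."""
    num = sum(v * v * q for v, q in zip(x, pk))
    den = 2 * sum(v * q for v, q in zip(x, pk))
    return num / den

uniform = [1 / p] * K                          # x_k = 1/p, the optimum
linear = [float(k) for k in range(1, K + 1)]   # x_k = k, attempts one unit apart

# both satisfy sum_k x_k p_k >= 1/p (with equality); the constant
# profile attains 1/(2p), while the linear one gives (2-p)/(2p)
```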

Combining Theorem 3 and Theorem 4, we obtain a lower bound for all π ∈ Π_2' as follows.

###### Theorem 5 (Lower Bound for Channel with Perfect Feedback)

For any policy π ∈ Π_2', the expected long-term average AoI is lower bounded by 1/(2p).

### IV-B Optimal Online Status Updating

Motivated by the uniform structure of the optimal renewal intervals in Theorem 4, we define the Best-effort Uniform updating with Retransmission (BUR) policy as follows.

###### Definition 5 (BUR Updating)

The sensor is scheduled to update the status at s_n = n/p, n = 1, 2, …. If E(s_n^-) ≥ 1, the sensor keeps sending updates at s_n until one update is successfully delivered or until it runs out of energy; otherwise, the sensor keeps silent until the next scheduled status update time.
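A Monte Carlo sketch of BUR (ours, under the assumption that scheduled epochs are spaced 1/p apart, matching the constant renewal length 1/p from Theorem 4, and that retransmissions take negligible time): at each feasible epoch, the sensor uses the feedback to retransmit until a success or an empty battery. The long-run average should approach the 1/(2p) lower bound of Theorem 5:

```python
import math, random

def simulate_bur(p, epochs=100_000, seed=2):
    """BUR with scheduled epochs at n/p: at each epoch, retransmit
    (one energy unit per attempt, success probability p) until a
    success or until the battery is empty."""
    rng = random.Random(seed)
    spacing = 1.0 / p
    battery, last_success, area = 1.0, 0.0, 0.0
    for n in range(1, epochs + 1):
        t = n * spacing
        # Poisson(spacing) energy arrivals in the last interval (Knuth)
        k, prod, L = 0, rng.random(), math.exp(-spacing)
        while prod > L:
            k += 1
            prod *= rng.random()
        battery += k
        while battery >= 1.0:        # retransmit until success or outage
            battery -= 1.0
            if rng.random() < p:
                x = t - last_success
                area += x * x / 2.0  # triangle contribution to R(T)
                last_success = t
                break
    T = epochs * spacing
    area += (T - last_success) ** 2 / 2.0
    return area / T

avg = simulate_bur(p=0.5)            # the lower bound is 1/(2p) = 1.0
```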

In order to prove that the BUR updating policy is optimal, we will first construct a sequence of policies which are sub-optimal to the BUR updating policy, and show that the limit of those suboptimal policies achieves the lower bound in Theorem 5.

###### Definition 6 (BUR with Energy Removal (BUR-ER_{T_0}))

The sensor performs the BUR updating policy until the battery level after sending an update becomes zero for the first time, or until time T_0, in which case the sensor depletes its battery after a successful update; after that, when the battery level becomes higher than or equal to one after a successful update for the first time, the sensor reduces the battery level to one, and then repeats the process.

###### Lemma 9

The BUR-ER updating policy is suboptimal to the BUR updating policy.

###### Proof.

We note that the BUR-ER updating policy is identical to the BUR updating policy up to the energy removal step. Given the same energy harvesting sample path, the battery level under BUR is always higher than or equal to that under BUR-ER. Thus, BUR-ER incurs more infeasible status updating points. With the same channel fading profile, the instantaneous AoI under BUR-ER is always greater than or equal to that under BUR sample-path-wise. Thus, the expected time-average AoI under BUR-ER is greater than or equal to that under BUR. ∎

Note that BUR-ER updating is a renewal policy and Fig. 3 is an illustration of one renewal interval. In order to analyze the expected long-term average AoI, it suffices to analyze the expected average AoI over one renewal interval. Thus, we will focus on the first renewal interval, and show that the expected average AoI converges to the lower bound in Theorem 5. The renewal interval consists of two stages. The first stage starts at time zero and ends when the battery becomes empty for the first time, or at time $T_0$; its duration is denoted as $T_1$. We note that all scheduled updating points over the first stage are feasible. The second stage starts at $T_1$ and ends when the battery level after a successful update becomes higher than or equal to one for the first time after $T_1$; its duration is denoted as $T_2$.

###### Lemma 10

Under BUR-ER updating, $\lim_{T_0\to\infty}\mathbb{E}[T_1]=\infty$.

###### Proof.

Consider a “random walk” $\{\Omega_k\}$. It starts with $\Omega_0>0$ and evolves as $\Omega_k=\left(\Omega_{k-1}+A_k-G_k\right)^{+}$, where $A_k$ is an i.i.d. Poisson random variable with parameter $1/p$ and $G_k$ is an i.i.d. geometric random variable with parameter $p$. Denote the first zero-hitting time for $\{\Omega_k\}$ as $T_1$. Then $T_1=\min\{k:\Omega_k=0\}$ and $\Omega_{T_1}=0$. We note that when $k\leq T_1$, $\{\Omega_k\}$ is identical to the battery level evolution process under the BUR-ER updating policy.

For ease of exposition, define $C_k:=A_k-G_k$, and $\gamma(\alpha):=\log\mathbb{E}\left[e^{-\alpha C_k}\right]$ for $\alpha\geq 0$. Then, we have

$$\mathbb{E}\left[e^{-\alpha C_n-\gamma(\alpha)}\right]=1. \tag{25}$$

Based on the definitions of $A_n$, $G_n$ and $C_n$, for $0\leq\alpha<\log\frac{1}{1-p}$, we have

$$\mathbb{E}\left[e^{-\alpha C_n}\right]=e^{\frac{1}{p}\left(e^{-\alpha}-1\right)}\cdot\frac{pe^{\alpha}}{1-(1-p)e^{\alpha}}. \tag{26}$$

Therefore,

$$\gamma(\alpha)=\log\mathbb{E}\left[e^{-\alpha C_n}\right]=\frac{1}{p}\left(e^{-\alpha}-1\right)+\log\frac{pe^{\alpha}}{1-(1-p)e^{\alpha}}. \tag{27}$$

Taking derivative of (27), we get

$$\gamma'(\alpha)=-\frac{1}{p}e^{-\alpha}+\frac{1}{1-(1-p)e^{\alpha}}. \tag{28}$$
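The log-MGF in (27) and its derivative (28) can be spot-checked numerically. The sketch below (our own illustration under the same distributional assumptions, $A\sim\text{Poisson}(1/p)$ and $G\sim\text{Geometric}(p)$) compares a Monte Carlo estimate of $\mathbb{E}[e^{-\alpha C}]$ against the closed form (26), and verifies $\gamma'(0)=0$, the zero-drift property that drives Lemma 10.

```python
import math
import random

def gamma(alpha, p):          # log-MGF, Eq. (27); requires alpha < log(1/(1-p))
    return (1 / p) * (math.exp(-alpha) - 1) + math.log(
        p * math.exp(alpha) / (1 - (1 - p) * math.exp(alpha)))

def gamma_prime(alpha, p):    # Eq. (28)
    return -(1 / p) * math.exp(-alpha) + 1 / (1 - (1 - p) * math.exp(alpha))

def sample_C(p, rng):
    """One sample of C = A - G, A ~ Poisson(1/p) (Knuth's method),
    G ~ Geometric(p) on {1, 2, ...}."""
    L, a, prod = math.exp(-1 / p), 0, rng.random()
    while prod > L:
        a, prod = a + 1, prod * rng.random()
    g = 1
    while rng.random() >= p:
        g += 1
    return a - g

rng = random.Random(0)
p, alpha, n = 0.5, 0.1, 200_000
mc = sum(math.exp(-alpha * sample_C(p, rng)) for _ in range(n)) / n
print(mc, math.exp(gamma(alpha, p)))   # the two estimates should be close
print(gamma_prime(0.0, p))             # 0: zero drift at alpha = 0
```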

Next, we define a process associated with $\{\Omega_k\}$ as $M_k:=e^{-\alpha\Omega_k-\gamma(\alpha)k}$. We note that

$$\begin{aligned}
\mathbb{E}\left[e^{-\alpha\Omega_k-\gamma(\alpha)k}\mid\Omega_1,\ldots,\Omega_{k-1}\right]&=\mathbb{E}\left[e^{-\alpha\left(\Omega_{k-1}+C_k\right)^{+}-\gamma(\alpha)k}\mid\Omega_1,\ldots,\Omega_{k-1}\right]\\
&\leq\mathbb{E}\left[e^{-\alpha\left(\Omega_{k-1}+C_k\right)-\gamma(\alpha)k}\mid\Omega_1,\ldots,\Omega_{k-1}\right]\\
&=e^{-\alpha\Omega_{k-1}-\gamma(\alpha)(k-1)}\,\mathbb{E}\left[e^{-\alpha C_k-\gamma(\alpha)}\right]\\
&=e^{-\alpha\Omega_{k-1}-\gamma(\alpha)(k-1)},
\end{aligned} \tag{29}$$

where the last equality in (29) follows from (25). Therefore, $\{M_k\}$ is a super-martingale process. Combining the optional stopping theorem with the inequality $e^{-x}\geq 1-x$, we have

$$e^{-\alpha\Omega_0}\geq\mathbb{E}\left[e^{-\alpha\Omega_{T_1}-\gamma(\alpha)T_1}\right]\geq\mathbb{E}\left[1-\left(\alpha\Omega_{T_1}+T_1\gamma(\alpha)\right)\right].$$

Since $\Omega_{T_1}=0$ and $\gamma(\alpha)>0$ for $\alpha>0$, combining with (28), we have

$$\mathbb{E}[T_1]\geq\lim_{\alpha\to 0^{+}}\frac{1-e^{-\alpha\Omega_0}}{\gamma(\alpha)}=\lim_{\alpha\to 0^{+}}\frac{\Omega_0e^{-\alpha\Omega_0}}{\gamma'(\alpha)}=\infty. \tag{30}$$

∎
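Lemma 10 can also be illustrated empirically: because the increments $A_k-G_k$ have zero mean, the zero-hitting time of the battery walk has no finite expectation, so its capped empirical mean keeps growing as the cap increases. The sketch below is our own illustration; the starting level, cap values, and run count are arbitrary choices.

```python
import math
import random

def poisson(lam, rng):                 # Knuth's method, fine for small lam
    L, k, prod = math.exp(-lam), 0, rng.random()
    while prod > L:
        k, prod = k + 1, prod * rng.random()
    return k

def geometric(p, rng):                 # support {1, 2, ...}, mean 1/p
    g = 1
    while rng.random() >= p:
        g += 1
    return g

def capped_hitting_time(p, omega0, cap, rng):
    """Steps until the zero-drift walk Omega_k = Omega_{k-1} + A_k - G_k
    first reaches zero, capped at `cap`."""
    omega = omega0
    for k in range(1, cap + 1):
        omega += poisson(1 / p, rng) - geometric(p, rng)
        if omega <= 0:
            return k
    return cap

rng = random.Random(7)
p, omega0, runs = 0.5, 3, 400
m_short = sum(capped_hitting_time(p, omega0, 50, rng) for _ in range(runs)) / runs
m_long = sum(capped_hitting_time(p, omega0, 2000, rng) for _ in range(runs)) / runs
print(m_short, m_long)   # the capped mean keeps growing with the cap
```

For a zero-drift walk the capped mean grows on the order of the square root of the cap, consistent with $\mathbb{E}[T_1]\to\infty$ in (30).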

###### Lemma 11

Under the BUR-ER updating policy, $\mathbb{E}[T_2]$ and $\mathbb{E}[T_2^2]$ are uniformly bounded in $T_0$.

###### Proof.

Under the BUR-ER updating policy, the number of energy arrivals over a scheduled updating interval of length $1/p$ (denoted as $A_k$) is a Poisson random variable with parameter $1/p$. If the source has sufficient energy, the total number of attempts at the $k$th scheduled epoch (denoted as $G_k$) is an i.i.d. geometric random variable with parameter $p$. Therefore, if the battery is empty at one scheduled epoch, it will increase to one or above after a successful update at the next scheduled epoch only when the number of new energy arrivals exceeds the number of attempts, which happens with a constant probability. Thus, $T_2$ is dominated by a geometrically distributed number of scheduled intervals, and its first and second moments are finite. ∎

###### Lemma 12

As $T_0\to\infty$, the expected long-term average AoI under BUR-ER updating is upper bounded by $\frac{1}{2p}$.

###### Proof.

First, we note that

$$\lim_{T_0\to\infty}\frac{\mathbb{E}\left[\left(T_1+T_2-S_{N(T_1)}\right)^2\right]}{2\,\mathbb{E}\left[T_1+T_2\right]}\leq\lim_{T_0\to\infty}\frac{\mathbb{E}\left[\left(T_2+\frac{1}{p}\right)^2\right]}{2\,\mathbb{E}\left[T_1\right]}=0, \tag{31}$$

where (31) follows from the fact that $T_1-S_{N(T_1)}$ is upper bounded by $1/p$ under the BUR-ER policy, Lemma 10, and Lemma 11.

Next, we note that the BUR-ER updating policy is a renewal policy, and the expected long-term average AoI is equal to the expected average AoI over one renewal interval. Therefore,

$$\begin{aligned}
\lim_{T_0\to\infty}\lim_{T\to\infty}\mathbb{E}\left[\frac{R(T)}{T}\right]&\leq\lim_{T_0\to\infty}\frac{\mathbb{E}\left[\sum_{i=1}^{N(T_1)}X_i^2+\left(T_1+T_2-S_{N(T_1)}\right)^2\right]}{2\,\mathbb{E}\left[T_1+T_2\right]}&&(32)\\
&\leq\lim_{T_0\to\infty}\frac{\mathbb{E}\left[\sum_{i=1}^{N(T_1)}X_i^2\right]}{2\,\mathbb{E}\left[S_{N(T_1)}\right]}=\lim_{T_0\to\infty}\frac{\mathbb{E}\left[N(T_1)\right]\cdot\frac{1}{p^2}}{2\,\mathbb{E}\left[N(T_1)\right]\cdot\frac{1}{p}}=\frac{1}{2p},&&(33)
\end{aligned}$$

where (33) follows from (31) and the fact that $X_i=1/p$ for $i=1,\ldots,N(T_1)$ and $S_{N(T_1)}=\sum_{i=1}^{N(T_1)}X_i$. ∎

Lemma 12 indicates that the expected time-average AoI under the BUR-ER updating policy converges to the lower bound in Theorem 5 as $T_0$ goes to infinity. According to Lemma 9, BUR-ER is suboptimal to BUR. Therefore, the BUR updating policy also achieves the lower bound, and it is thus optimal. We summarize the optimality result in the next theorem.

###### Theorem 6 (Optimality of BUR Updating)

Among all online policies in the considered class, the BUR updating policy is optimal when transmission feedback is available, i.e.,

$$\limsup_{T\to\infty}\mathbb{E}\left[\frac{R(T)}{T}\right]=\frac{1}{2p}.$$

## V Simulation Results

In this section, we evaluate the performance of the proposed status updating policies through simulations. For each case, we generate sample paths for the unit-rate Poisson energy harvesting process and compute the sample average of the time-average AoI over those sample paths.

### V-a Status Updating Without Feedback

First, we evaluate the BU updating policy in Fig. 4. We vary $p$, and plot both the time-average AoI as a function of $T$ and the corresponding lower bound in the figure. We observe that all time-average AoI curves gradually approach the corresponding lower bound as $T$ increases. The results show that the proposed BU updating policy is optimal. Note that the time-average AoI is monotonically decreasing as $p$ increases. This is intuitive, since a channel with better quality, i.e., larger $p$, renders a smaller time-average AoI.

Next, we evaluate the performances of the virtual BU-ER policies for different values of $T_0$ in Fig. 5. We fix $p$ and plot the time-average AoI under BU-ER for various values of $T_0$. We also compare with a greedy updating policy and the BU updating policy. Under the greedy updating policy, the sensor updates instantly whenever one unit of energy arrives. As we observe in Fig. 5, the greedy policy results in the highest average AoI, and never approaches the lower bound. The time-average AoI under the BU-ER updating policy is monotonically decreasing as $T_0$ increases, and gradually approaches that under the BU updating policy. This is consistent with Lemma 3 and Lemma 6: BU-ER updating is sub-optimal to BU updating, and eventually converges to it as $T_0$ increases.
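The qualitative gap between greedy and BU can be reproduced with a small simulation sketch (our own illustration, assuming unit-rate Poisson energy arrivals and instantaneous transmissions): under greedy updating, successful updates form a thinned Poisson process of rate $p$, giving a time-average AoI near $1/p$, while BU stays near the $\frac{2-p}{2p}$ lower bound.

```python
import random

def simulate(policy, p, horizon=200_000, seed=3):
    """Time-average AoI with unit-rate Poisson energy and success prob p.
    'greedy' transmits the instant a unit arrives; 'bu' transmits at t = 1, 2,
    ... whenever the battery is non-empty (no feedback in either case)."""
    rng = random.Random(seed)
    next_e = rng.expovariate(1.0)        # next energy-arrival time
    battery, last, area = 0, 0.0, 0.0
    if policy == "greedy":
        while next_e < horizon:
            if rng.random() < p:         # the update survives the erasure
                area += (next_e - last) ** 2 / 2
                last = next_e
            next_e += rng.expovariate(1.0)
    else:                                # BU: best-effort uniform updating
        for k in range(1, horizon + 1):
            while next_e <= k:           # harvest energy up to epoch k
                battery += 1
                next_e += rng.expovariate(1.0)
            if battery >= 1:
                battery -= 1
                if rng.random() < p:
                    area += (k - last) ** 2 / 2
                    last = k
    area += (horizon - last) ** 2 / 2
    return area / horizon

p = 0.5
aoi_bu, aoi_greedy = simulate("bu", p), simulate("greedy", p)
print(aoi_bu, aoi_greedy)   # BU near (2-p)/(2p) = 1.5, greedy near 1/p = 2.0
```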

### V-B Status Updating With Perfect Feedback

Next, we evaluate the performances of the proposed online policies when perfect feedback is available to the sensor. In Fig. 6, under the BUR updating policy, we plot the time-average AoI for different values of $p$, together with the corresponding lower bound $\frac{1}{2p}$. We note that as $T$ increases, the time-average AoI approaches the lower bound. Thus, BUR updating is optimal. We then evaluate the performances of the BUR-ER updating policy in Fig. 7. We fix $p$, choose various values of $T_0$, and plot the time-average AoI as a function of $T$. As a comparison, we also plot the time-average AoI under the BU updating policy and the BUR updating policy in the figure. We note that the AoI under BUR-ER gradually decreases and approaches that under the BUR updating policy as $T_0$ increases, which is consistent with Lemma 9 and Lemma 12. The performance gap between BU updating and BUR updating indicates that exploiting updating feedback can significantly reduce the time-average AoI in the system.

## Vi Conclusions

In this paper, we considered optimal online status updating policies for an energy harvesting source in the presence of updating erasures. We investigated both the case where no updating feedback is available to the source and the case where perfect feedback is available. For each case, we first obtained a lower bound and then proved that the proposed status updating policy achieves the lower bound among a broadly defined class of policies. The optimality of the proposed status updating policies was established by constructing a sequence of virtual status updating policies which are sub-optimal to the original policy and asymptotically achieve the lower bound. The performances of the proposed policies were evaluated through simulations. We point out that although we only showed the optimality of the proposed policies within a subset of online policies, we conjecture that their optimality can be extended to all online policies. How to generalize the results is one of our future steps. Another direction we would like to pursue is to investigate the impact of update erasures on the optimal updating policy for an EH source with a finite battery.

### -a Proof of Theorem 1

Define $p_n:=p(1-p)^{n-1}$, $l_n$ as the $n$th updating epoch with $M(T)$ being the number of updating epochs over $(0,T]$, and $l_n^T:=\min\{l_n,T\}$. Then, under any policy, the expected average AoI over $(0,T]$ can be expressed as

$$\begin{aligned}
\mathbb{E}\left[\frac{R(T)}{T}\right]&=\frac{1}{T}\mathbb{E}\left[\sum_{i=0}^{N(T)}\frac{\left(\min\{S_{i+1},T\}-S_i\right)^2}{2}\right]&&(34)\\
&=\frac{1}{2T}\mathbb{E}\left[\sum_{n=1}^{M(T)}p_nl_n^2+\left(1-\sum_{n=1}^{M(T)}p_n\right)T^2+\sum_{n=1}^{M(T)}\sum_{j=1}^{\infty}\left(l_{n+j}^T-l_n\right)^2p\,p_j\right],&&(35)
\end{aligned}$$

where the first two terms inside the expectation in (35) correspond to the AoI contribution from time zero to the first successful update (or to $T$ if no update succeeds), and the last term corresponds to the AoI contribution over every other segment. This can be explained as follows. With fixed updating epochs $\{l_n\}$, depending on the realization of the channel state, the interval $(0,T]$ can be decomposed into segments separated by successful updates. The probability to have $(l_n,l_{n+j}^T]$, $j\geq 1$, as one of such segments equals $p\,p_j$, which corresponds to the event that the update at $l_n$ succeeds, and the next successful update is at $l_{n+j}$. The corresponding AoI contribution over $(l_n,l_{n+j}^T]$ thus needs to be weighted by $p\,p_j$ when the expected AoI is calculated. Since the AoI contribution over the first segment is always positive, in the following, we will drop it to obtain a lower bound, i.e.,

$$\begin{aligned}
\lim_{T\to\infty}\mathbb{E}\left[\frac{R(T)}{T}\right]&\geq\lim_{T\to\infty}\frac{1}{2T}\mathbb{E}\left[p\sum_{j=1}^{\infty}p_j\sum_{n=1}^{M(T)}\left(l_{n+j}^T-l_n\right)^2\right]&&(36)\\
&\geq\lim_{T\to\infty}\frac{1}{2T}\mathbb{E}\left[p\sum_{j=1}^{\infty}p_j\frac{1}{M(T)}\left(\sum_{n=1}^{M(T)}\left(l_{n+j}^T-l_n\right)\right)^2\right]&&(37)\\
&=\lim_{T\to\infty}\frac{1}{2T}\mathbb{E}\left[p\sum_{j=1}^{\infty}p_j\frac{1}{M(T)}\left(jT-\sum_{n=1}^{j}l_n^T\right)^2\right]&&(38)\\
&=\lim_{T\to\infty}\frac{p}{2}\sum_{j=1}^{\infty}p_jj^2\,\mathbb{E}\left[\frac{\left(T-\bar{l}_j^T\right)^2}{M(T)\,T}\right],&&(39)
\end{aligned}$$

where (37) is based on Jensen's inequality, (38) is derived by considering the cases $l_{n+j}\leq T$ and $l_{n+j}>T$ separately, and $\bar{l}_j^T:=\frac{1}{j}\sum_{n=1}^{j}l_n^T$.

Since each term in the summation in (39) is positive, we can switch the order of limit and summation. We note that for any given $j$, $\bar{l}_j^T$ is bounded according to the definition of bounded policies. Besides, for any policy that renders a finite expected average AoI, we must have $M(T)\to\infty$ almost surely according to Lemma 2. Therefore, according to the bounded convergence theorem [46], we have

$$\lim_{T\to\infty}\mathbb{E}\left[\frac{\bar{l}_j^T}{M(T)}\right]=0,\qquad\lim_{T\to\infty}\mathbb{E}\left[\frac{\left(\bar{l}_j^T\right)^2}{M(T)\,T}\right]=0. \tag{40}$$

Combining with (39), we have

$$\lim_{T\to\infty}\mathbb{E}\left[\frac{R(T)}{T}\right]\geq\frac{p}{2}\sum_{j=1}^{\infty}p_jj^2\lim_{T\to\infty}\mathbb{E}\left[\frac{T}{M(T)}\right]\geq\frac{p}{2}\sum_{j=1}^{\infty}j^2(1-p)^{j-1}p=\frac{2-p}{2p}, \tag{41}$$

where the last inequality follows from Lemma 1, i.e., $\lim_{T\to\infty}\mathbb{E}\left[T/M(T)\right]\geq 1$.
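The geometric second-moment identity used in the final step of (41), $\sum_{j=1}^{\infty}j^2(1-p)^{j-1}p=\frac{2-p}{p^2}$, is easy to verify numerically with a truncated sum (the value of $p$ and the truncation depth below are arbitrary choices):

```python
p = 0.3
s = sum(j * j * p * (1 - p) ** (j - 1) for j in range(1, 2000))
print(p / 2 * s)            # matches (2-p)/(2p) up to truncation error
print((2 - p) / (2 * p))
```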