# Optimal Status Update for Age of Information Minimization with an Energy Harvesting Source

###### Abstract

In this paper, we consider a scenario where an energy harvesting sensor continuously monitors a system and sends time-stamped status updates to a destination. The destination keeps track of the system status through the received updates. We use the metric Age of Information (AoI), the time that has elapsed since the last received update was generated, to measure the “freshness” of the status information available at the destination. Our objective is to design optimal online status update policies to minimize the long-term average AoI, subject to the energy causality constraint at the sensor. We consider three scenarios, i.e., the battery size is infinite, finite, and one unit only, respectively.

For the infinite battery scenario, we adopt a best-effort uniform status update policy and show that it minimizes the long-term average AoI. For the finite battery scenario, we adopt an energy-aware adaptive status update policy, and prove that it is asymptotically optimal when the battery size goes to infinity. For the last scenario where the battery size is one, we first show that within a broadly defined class of online policies, the optimal policy should have a renewal structure. We then focus on a renewal interval, and prove that the optimal policy should have a threshold structure, i.e., if the AoI in the system is below a threshold when an energy arrival enters an empty battery, the sensor should store the energy first and then update when the AoI reaches the threshold; otherwise, it updates the status immediately. Simulation results corroborate the theoretical bounds.

## I Introduction

Enabled by widespread wireless communications and the proliferation of ultra-low power sensors, ubiquitous sensing has profoundly changed almost every aspect of our daily lives. In many applications, such as environment monitoring [2] and vehicle tracking [3], sensors are deployed to monitor the status of sensing objects and communicate the status information to a monitor. To keep track of the status, it is desirable to keep the status information at the monitor as fresh as possible. However, this is often constrained by limited physical resources, such as energy and bandwidth. To quantify the freshness of the status updates at the monitor, a metric called “Age of Information” (AoI) has been introduced [4]. Proposed for a node monitoring a system and sending time-stamped status updates to a destination, the metric has proved to be of fundamental importance for quantifying the freshness of information, as it considers the time of generation of information in addition to network delivery delay. Specifically, at time $t$, the AoI in the system is defined as $t - U(t)$, where $U(t)$ is the time stamp of the latest received update packet at the destination, i.e., the time at which it was acquired at the source. AoI is fundamentally different from standard network performance metrics, such as throughput and delay. Roughly speaking, to maximize the throughput of update packets in a system, source nodes should generate as many updates as possible and push them through the network. However, a heavy traffic load may congest the network and lengthen the delivery time of each update packet, which increases the age of each received update and thus the AoI in the system. Additionally, although the age of each update is closely related to the delay it experiences in the network, update packets that get stuck in a network may become outdated after fresher update packets arrive at the destination. Thus, conventional first-come first-served (FCFS) queue management protocols are no longer desirable. Last-come first-served (LCFS) service, or even dropping some aged packets, may be preferable.

There have been two main directions in the study of AoI since it was first introduced in [4]. The first direction is to model the status updating system as a queueing system, where the update packets are generated according to a random process, and analyze the corresponding AoI under different queue management protocols. For single-server systems, all update packets from all sources are buffered in a single queue and then delivered to the destination through a single transmitter. The corresponding AoI has been analyzed in single-source single-server queues [4], the Last-Come First-Served (LCFS) queue with preemption in service [5], the First-Come First-Served (FCFS) system with multiple sources [6, 7], and a multiple-source system which only keeps the latest status packet of each source in the queue [8]. LCFS with gamma-distributed service times and Poisson update packet arrivals is considered in [9]. Most recently, packet deadlines are found to improve AoI in queueing systems in [10], and AoI in the presence of packet delivery errors in a queueing system is evaluated in [11]. A related metric, Peak Age of Information (PAoI), is introduced in [12, 13], and has been studied in multi-class queueing systems in [14]. Age penalty functions, or non-linear age, have been studied in [15]. In systems with multiple servers, AoI has been evaluated in [16, 17, 18], and the optimality properties of a preemptive Last Generated First Served (LGFS) service discipline are identified in [19, 20]. The second direction is to control the generating process of the update packets, so that the AoI is optimized. The optimal status update policy with knowledge of the server state has been studied in [21]. The relationship between AoI and the MMSE in remote estimation of a Wiener process is investigated in [22, 23]. Various source and channel coding techniques for AoI optimization have been discussed in [24, 25, 26, 27, 28]. AoI optimization for data storage has been studied in [29].

In parallel, energy harvesting (EH) has been well on its way to becoming a game-changing technology in the field of autonomous wireless networked systems. The notion of acquiring energy from nature to power wireless transmitters is intriguing since careful transmission scheduling can render extended or even perpetual operation of the network [30, 31, 32]. Optimal transmission scheduling for throughput and delay optimization has been studied under both infinite battery setting [30, 33, 32] and finite battery setting [31, 34, 35, 36, 37]. With signal processing related performance metrics, such as detection delay and estimation error, optimal sensing scheduling policies have been developed to optimize the sensing performances of EH sensor networks [38, 39, 40, 41].

The study of Age of Information in EH wireless networks is in its infancy, with only a few recent works that investigate various status update policies under an energy harvesting setting, in very specific setups [42, 43, 44, 45]. It has been shown in [42] that in this setting, with knowledge of the system state, updates should be submitted only when the server is free, to avoid queueing delay. Moreover, a greedy policy that submits a fresh update as soon as the system becomes idle is shown to be inefficient; a lazy update policy that introduces inter-update delays is better. The optimal update policy remains open even in this setting. In [43], under the assumption that a status update packet can be generated and served (transmitted) instantly, the authors investigate optimal offline and online policies. The optimal offline policy is to equalize the inter-update delays as much as possible, subject to the energy constraint imposed by the energy harvesting source. The online problem is cast as a Markov decision process in a discrete-time setting, and solved through dynamic programming. Other threshold type status update policies have been studied in [44] and shown to be optimal under certain conditions. An offline policy to minimize AoI in a two-hop relay channel is studied in [45].

In this paper, we investigate optimal online status update policies for an energy harvesting source with various battery sizes in a continuous-time setting. Similar to [43, 44], we assume a status update packet can be generated by the source at any time and transmitted to a destination instantly, given sufficient energy is available at the source. We assume that the energy unit is normalized so that each status update requires one unit of energy. This energy unit represents the cost of both measuring and transmitting a status packet. We assume energy arrives at the sensor according to a Poisson process, and the sensor only has causal information of the energy arrival profile in addition to the parameter of the Poisson process. Our objective is then to determine the sequence of update instants so that the long-term average AoI at the destination is minimized, subject to the energy causality constraint at the source.

We first study the properties of the time-average AoI as a function of inter-update delays, and establish a connection between this problem and the optimal sensing problem studied in [41]. This motivates us to adopt the (asymptotically) optimal sensing policies in [41] for AoI minimization, namely, a best-effort uniform status update policy for the infinite battery case, and an energy-aware adaptive status update policy for the finite battery case. Since the AoI function does not have all the properties required to establish the optimality of those policies in [41], we revise the proofs accordingly to re-establish their (asymptotic) optimality. We then study a special case where the battery size is one unit, and propose a threshold based status update policy, i.e., if the AoI in the system is below a threshold when an energy unit enters an empty battery, the sensor should store the energy and hold the status update until the AoI reaches the threshold; otherwise, it consumes the energy to update the status immediately. Through rigorous stochastic analysis, we show that within a broadly defined class of online policies, this threshold based status update policy is optimal.

## II System Model and Problem Formulation

Consider a scenario where an energy harvesting sensor continuously monitors a system and sends time-stamped status updates to a destination. The destination keeps track of the system status through the received updates. We use the metric Age of Information (AoI) to measure the “freshness” of the status information available at the destination.

In a typical wireless sensor network, the measurement and radio frequency transmission processes consume power in the range of , and take a few seconds or less. Meanwhile, typical output power under average conditions for different EH technologies, such as indoor solar cells, piezoelectric cells and wireless power transfer, ranges from below 100 to hundreds of [46, 47, 48]. Roughly speaking, it takes about a few minutes or more for an EH node to charge its battery in order to perform one status update.

Therefore, in the following, we assume that the time used to collect and transmit a status update is negligible compared with the time scale of inter-update delays, i.e., given sufficient energy is available at the source, a status update can be generated by the source at any time and transmitted to the destination instantly. In this case, a status update is transmitted immediately after it is generated to avoid unnecessary queueing delay. We assume the channel between the source and the destination is noiseless, thus the transmitted update always gets delivered successfully. We leave the more general setting where the channel is noisy and updates may be corrupted and unrecognizable at the destination as our future work.

We assume that the energy unit is normalized so that each status update requires one unit of energy. This energy unit represents the cost of both measuring and transmitting a status update. In the following, we assume a Poisson energy arrival process to make the theoretical analysis more tractable. We will relax this assumption in the simulations in Section V. Assume energy arrives at the sensor according to a Poisson process with parameter $\lambda$. Hence, energy units arrive at discrete time instants $t_1, t_2, \ldots$. We assume $\lambda = 1$ throughout this paper for ease of exposition. The sensor is equipped with a battery with capacity $B$, $B \geq 1$. When $B = \infty$, it corresponds to the infinite battery case.

A status update policy is denoted as $\{l_n\}_{n=1}^{\infty}$, where $l_n$ is the $n$-th update epoch. We assume $l_0 = 0$, i.e., the system updates its status information right before time zero. Denote the inter-update delays as $X_n := l_n - l_{n-1}$, for $n = 1, 2, \ldots$. Then, we have $l_n = \sum_{i=1}^{n} X_i$.

Define $A_n$ as the total amount of energy harvested in $[l_{n-1}, l_n)$, and $E(l_n^-)$ as the energy level of the sensor right before the scheduled updating epoch $l_n$. For a clear exposition of the paper, we assume the system has one unit of energy before it updates at time zero, and after that, the battery becomes empty, i.e.,

$$E(l_0^-) = 1. \tag{1}$$

Then, under any feasible status update policy, the energy queue evolves as follows

$$E(l_n^-) = \min\{E(l_{n-1}^-) - 1 + A_n,\ B\}, \tag{2}$$

$$E(l_n^-) \geq 1, \tag{3}$$

for $n = 1, 2, \ldots$. Equation (3) corresponds to the energy causality constraint in the system. Based on the Poisson arrival process assumption, $A_n$ is an independent Poisson random variable with parameter $X_n$.
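The battery evolution and the energy causality constraint can be illustrated with a short feasibility check. This is only a minimal sketch under the stated model (unit-sized energy arrivals, one unit consumed per update, an empty battery right after the update at time zero); the function name `is_feasible` and its arguments are hypothetical and do not come from the paper.

```python
def is_feasible(update_epochs, arrival_times, capacity):
    """Check energy causality for a candidate update schedule.

    update_epochs: increasing update times after time 0 (hypothetical input).
    arrival_times: times at which one unit of energy arrives.
    capacity: battery size; use float("inf") for an infinite battery.
    Assumes the battery is empty right after the update at time 0.
    """
    arrivals = sorted(arrival_times)
    battery = 0
    i = 0
    for epoch in update_epochs:
        # Store every unit that arrived strictly before this epoch,
        # discarding any unit that would overflow the battery.
        while i < len(arrivals) and arrivals[i] < epoch:
            battery = min(battery + 1, capacity)
            i += 1
        if battery < 1:
            return False  # the update would violate energy causality
        battery -= 1      # each update consumes one unit of energy
    return True
```

For instance, with two arrivals at times 0.2 and 0.5 and updates at times 1 and 2, the schedule is infeasible for a unit-size battery (the second arrival overflows) but feasible when the capacity is two.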

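Under any feasible status update policy, the AoI as a function of time is shown in Fig. 1. We use $N(T)$ to denote the number of status updates generated over $(0, T]$. Define $R(T)$ as the total “reward”, i.e., age of information experienced by the system over $[0, T]$. Since the AoI drops to zero at each update and grows linearly with unit slope in between, $R(T)$ is a sum of triangle areas. Then,

$$R(T) = \sum_{n=1}^{N(T)} \frac{X_n^2}{2} + \frac{\left(T - l_{N(T)}\right)^2}{2}, \tag{4}$$

where $X_n$ is the $n$-th inter-update delay and $l_{N(T)}$ is the last update epoch in $[0, T]$, and the time average AoI over the duration $[0, T]$ can be expressed as $R(T)/T$.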
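Because updates are delivered instantly, the AoI is a sawtooth that resets to zero at each update and grows with unit slope, so the accumulated age is a sum of triangle areas. The following sketch computes the time-average AoI from a list of update epochs under that assumption; the helper name `time_average_aoi` is hypothetical.

```python
def time_average_aoi(update_epochs, horizon):
    """Time-average AoI over [0, horizon], assuming an update at time 0.

    update_epochs: increasing update times in (0, horizon].
    Each inter-update delay x contributes a triangle of area x**2 / 2,
    and the time after the last update contributes a final triangle.
    """
    area = 0.0
    last = 0.0
    for epoch in update_epochs:
        if epoch > horizon:
            break
        gap = epoch - last
        area += 0.5 * gap * gap
        last = epoch
    tail = horizon - last
    area += 0.5 * tail * tail
    return area / horizon
```

With a horizon of four time units, updating at every unit of time gives an average age of 0.5, while updating only at times 2 and 4 gives 1.0, which illustrates why evenly spaced and frequent updates are preferred.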

Our objective is to determine the sequence of update epochs $\{l_n\}$, so that the time average AoI at the destination is minimized, subject to the energy causality constraint. We focus on a set of online policies in which the information available for determining the updating epoch $l_n$ includes the updating history $\{l_i\}_{i=0}^{n-1}$, the energy arrival profile over $[0, l_n)$, as well as the energy harvesting statistics (i.e., $\lambda$ in this scenario). The optimization problem can be formulated as

$$\min_{\{l_n\}} \ \limsup_{T \to \infty} \mathbb{E}\left[\frac{R(T)}{T}\right] \quad \text{s.t. } (2)\text{-}(3), \tag{5}$$

where $R(T)$ is the total age accumulated over $[0, T]$ as in (4), and the expectation in the objective function is taken over all possible energy harvesting sample paths. This problem does not admit an MDP formulation in general, and it is extremely challenging to explicitly identify the optimal solution.

## III Optimal Status Updating when $B$ is large

In [41], we studied an optimal sensing scheduling problem. Our objective was to strategically select the sensing epochs, so that the long-term average sensing performance can be optimized. We assumed that the sensing performance over can be expressed as , where is the -th inter-sensing delay. Under the assumption that 1) is convex and monotonically increasing in ; 2) is increasing in ; and 3) is upper bounded by a positive constant, we proposed two sensing policies, for the infinite and finite battery cases, respectively, and proved their (asymptotic) optimality.

We note that the AoI minimization problem can be treated as a particularized case of the optimal sensing scheduling problem studied in [41], by replacing the general sensing performance metric with AoI. Thus, for this particular case, . We note that this function exhibits the first two properties required to establish the optimality of the proposed sensing scheduling policies in [41]. However, the last condition, i.e., is upper bounded by a positive constant, does not hold, due to the fact that and it is unbounded. Therefore, the optimality of the policies proposed in [41] needs to be carefully examined. In the following, we will utilize the specific form of the AoI function to bypass the last condition and reaffirm the optimality of the policies.

For the completeness of this paper, in this section, we adapt the major results and policies in [41] for the AoI minimization setup. We leave out the proofs that do not rely on the third assumption, and provide necessary new proofs only. We will start with the infinite battery case, and investigate its performance lower bound and the corresponding bound-achieving status updating policy. The policy is shown to have a uniform updating structure. With insights drawn from the infinite battery case, we will then study the finite battery case. We will develop an energy-aware status updating policy by modifying the uniform updating policy, and show that as the battery size increases, it approaches the uniform updating policy, and thus it is asymptotically optimal.

### III-A Status Update with Infinite Battery

When the battery size is infinite, no energy overflow will happen. Thus, the maximum achievable long-term average status update rate is one update per unit time. If we drop the energy causality constraint, and replace it with this long-term average status update rate constraint, we obtain a lower bound on the long-term average AoI, which is 1/2. This lower bound corresponds to a uniform status update policy which updates once per unit time. However, it may become infeasible when the energy causality constraint is imposed. Thus, we propose the following policy to ensure the status update policy is always feasible.

###### Definition 1 (Best-effort Uniform Status Update Policy)

The sensor is scheduled to update the status at $s_n = n$, $n = 1, 2, \ldots$. The sensor performs the task at $s_n$ if the battery level right before $s_n$, denoted $E(s_n^-)$, satisfies $E(s_n^-) \geq 1$; otherwise, the sensor keeps silent until the next scheduled status update epoch.

Here we use $s_n$ to denote the $n$-th scheduled status update epoch, which is in general different from the $n$-th actual status update epoch $l_n$ since some of the scheduled status update epochs may be infeasible.
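To make the policy concrete, the sketch below simulates it for a unit-rate Poisson energy arrival process and an infinite battery, and reports the time-average AoI. It is an illustrative simulation under those assumptions rather than the simulation code used in the paper; the function name `simulate_best_effort_uniform`, the seed, and the horizon are hypothetical choices.

```python
import random


def simulate_best_effort_uniform(horizon, seed=0):
    """Time-average AoI over [0, horizon] under best-effort uniform updating.

    Assumes unit-rate Poisson energy arrivals, an infinite battery,
    scheduled epochs at s = 1, 2, ..., horizon, and an empty battery
    right after the update at time 0.
    """
    rng = random.Random(seed)
    battery = 0
    next_arrival = rng.expovariate(1.0)  # first energy arrival time
    last_update = 0
    area = 0.0
    for s in range(1, horizon + 1):
        # Store every energy unit that arrived before the scheduled epoch.
        while next_arrival < s:
            battery += 1
            next_arrival += rng.expovariate(1.0)
        if battery >= 1:
            # Feasible epoch: spend one unit and reset the age to zero.
            battery -= 1
            gap = s - last_update
            area += gap * gap / 2.0
            last_update = s
        # Otherwise the sensor stays silent until the next scheduled epoch.
    tail = horizon - last_update
    area += tail * tail / 2.0
    return area / horizon
```

Since every inter-update delay is a positive integer under this policy, the simulated average can never fall below 1/2, and for long horizons it is expected to stay close to that value because infeasible epochs become increasingly rare.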

###### Theorem 1

The best-effort uniform status update policy is optimal when the battery size is infinite, i.e., the time-average AoI under this policy converges to the lower bound 1/2 almost surely.

### III-B Status Update with Finite Battery

In order to minimize the long-term average AoI when the battery size is finite, intuitively, the status update policy should try to prevent any battery overflow, as wasted energy leads to performance degradation. Meanwhile, the properties of AoI require the status update rate to be as uniform as possible in time. Those two objectives are not aligned with each other, thus, the optimal status update policy should strike a balance between them.

In the following, we propose an energy-aware adaptive status update policy, which adaptively changes the update rate based on the instantaneous battery level. When the battery level is high, the sensor updates more frequently in order to prevent battery overflow; when the battery level is low, the sensor updates less frequently to avoid infeasible status update epochs. Meanwhile, the update rate does not vary significantly in time, in order to limit the increase in time-average AoI caused by jitter in the updating epochs.

###### Definition 2 (Energy-aware Adaptive Status Update Policy)

Assume . The adaptive status update policy defines status update epochs recursively as follows

(6) |

where , , and , with being a positive number such that . The sensor samples and updates the status at if ; Otherwise, the sensor keeps silent until the next scheduled status update epoch.

As , we have for any fixed , i.e., the adaptive status update policy converges to the best-effort uniform status update policy as battery size increases. Thus, we expect that the long-term average AoI under the adaptive status update policy converges to that under the best-effort uniform status update policy, which is 1/2, as the battery size approaches infinity.

Let and be two functions defined on some subset of the real numbers. Denote if and only if , where is a positive constant. Then, the asymptotic optimality of the adaptive status update policy is described in the following theorem.

###### Theorem 2

Under the adaptive status update policy, the gap between the long-term average AoI and its lower bound scales in almost surely.

Theorem 2 implies that as battery size increases, the long-term average AoI under the adaptive status update policy approaches 1/2, which is the lower bound on the long-term average AoI in a system with infinite battery. Thus, it is asymptotically optimal. The proof of Theorem 2 is provided in Appendix -B.

## IV A Special Case: $B = 1$

In the previous section, we investigated the optimal and asymptotically optimal status update policies when the battery size is infinite, and finite but sufficiently large, respectively. However, when the battery size is so small that the asymptotics cannot kick in, those policies may not perform very well. This motivates us to investigate other status update policies when the battery size is small. One extreme case for this scenario is $B = 1$, i.e., the battery can only store the energy for one status update operation. In this case, the battery only has two states: empty, or full. When it is empty, obviously, no status update can be scheduled. When one unit of energy arrives, the battery jumps to the other state, and the sensor then needs to decide when to spend the energy for a status update. Denote $\tau_n$ as the time duration between the $(n-1)$-th update epoch $l_{n-1}$ and the first energy arrival time after $l_{n-1}$. Then, we have the following observations.

###### Lemma 1

When $B = 1$, under any feasible online policy, we must have $X_n \geq \tau_n$ for all $n$, where $X_n$ is the $n$-th inter-update delay, and the $\tau_n$s are independent and identically distributed (i.i.d.) random variables, with a common exponential distribution with unit rate.

Lemma 1 is based on the energy causality constraint, and the memoryless property of the inter-arrival times of the Poisson energy arrivals.

As defined in Sec. II, the policy space includes all of the online policies which make the status updating decision based on up-to-date updating history and energy arrival profile, as well as the energy harvesting statistics. In other words, is a function of , among other variables.

In order to facilitate the analysis, in the following, we focus on a special class of online policies, termed as uniformly bounded policies.

###### Definition 3 (Uniformly bounded policy)

For an online policy with , if , , and there exists a function such that , and the second moment of is finite, then this policy is a uniformly bounded policy.

###### Theorem 3

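Any uniformly bounded policy is sub-optimal to a renewal policy, i.e., a policy under which the updating epochs form a renewal process. Besides, under the renewal policy, each inter-update delay depends only on the waiting time from the previous update until the first energy arrival after it.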

The proof is provided in Appendix -C. Our approach involves two steps of averaging. The first step of averaging is in the space of status update sample paths under a given uniformly bounded policy. For each fixed and , we group all of the sample paths with , and obtain the corresponding average inter-update delay . This step essentially averages out all factors that may affect other than . The second step is to do an averaging in the temporal domain. For each fixed , we form a sophisticated linear combination of involved s, and use it as the inter-update delay under the new policy. Such a policy is a renewal policy, it is always feasible, and each renewal interval only depends on . Through rigorous stochastic analysis, we prove that the new renewal policy always outperforms the original policy in terms of time-average AoI.

In the following, we will focus on renewal policies, and establish the threshold structure of the optimal renewal policy in the following theorem.

###### Theorem 4

In the class of renewal policies, the optimal policy has a threshold structure, i.e., equals a constant if ; otherwise equals . Here , and the corresponding long-term average AoI equals .

Theorem 4 indicates the optimality of the following threshold-based status update policy.

###### Definition 4 (Threshold-based Status Update Policy)

When an energy unit enters an empty battery, the sensor performs a status update immediately if the AoI at the destination is greater than a threshold $\tau_0$; otherwise, it holds its operation until the AoI is exactly equal to $\tau_0$.
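The effect of the threshold can be explored numerically. The sketch below is an illustrative calculation worked out here under the unit-rate Poisson assumption, not a reproduction of the paper's proof: with a unit-size battery and threshold $c$, each inter-update delay is $\max(\tau, c)$ with $\tau$ exponential with unit rate, so that $\mathbb{E}[X] = c + e^{-c}$ and $\mathbb{E}[X^2] = c^2 + 2(c+1)e^{-c}$, and the renewal-reward argument gives a long-term average AoI of $\mathbb{E}[X^2] / (2\mathbb{E}[X])$. The function names and the grid search are hypothetical choices.

```python
import math


def average_aoi(threshold):
    """Long-term average AoI of a threshold policy with a unit-size battery.

    Assumes unit-rate Poisson energy arrivals, so the inter-update delay
    is X = max(tau, threshold) with tau ~ Exp(1).
    """
    c = threshold
    mean_x = c + math.exp(-c)                              # E[X]
    second_moment_x = c * c + 2.0 * (c + 1.0) * math.exp(-c)  # E[X^2]
    return second_moment_x / (2.0 * mean_x)


def best_threshold(upper=3.0, steps=30000):
    """Grid search for the threshold minimizing the average AoI on [0, upper]."""
    candidates = [upper * i / steps for i in range(steps + 1)]
    return min(candidates, key=average_aoi)
```

Under these assumptions the search returns a threshold of roughly 0.90, at which the average AoI is also about 0.90, compared with an average AoI of 1 for the greedy policy that updates as soon as energy arrives (threshold zero).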

## V Simulation Results

We consider a wearable device powered by a piezoelectric energy harvester which harvests energy at an average rate of 10 per second. The device sends update packets periodically to a monitoring device, such as a cell phone, through low-power transmission technologies, such as Bluetooth or Zigbee. Each update consumes energy 1 and lasts for one second. We normalize the unit energy to be 1 , and the unit time to be seconds. Therefore, the EH rate is equivalent to one unit energy per unit time. We will first evaluate the proposed policies with a Poisson EH process, and then study them with a first-order Markov process.

First, we fix the battery size . We generate sample paths for the Poisson EH process over , and perform status updating according to the best-effort uniform status update policy. The time-average AoI as a function of is shown in Fig. 2. We plot one sample path of the time-average AoI and the sample average over sample paths in the figure. We observe that both curves gradually approach the lower bound as increases. When , there is only a very small difference between the simulation results and the analytical lower bound. The results indicate that the proposed best-effort uniform status update policy is optimal.

Next, we study the time average AoI under the adaptive status update policy with finite battery sizes. We fix time units and plot the average AoI over sample paths in Fig. 3. We note that for each fixed , the gap between the time average AoI and the lower bound monotonically decreases as increases, which is consistent with Theorem 2.

Then, we compare the performances of the three policies for $B = 1$. For a fair comparison, we optimize the parameters for the best-effort uniform status update policy and the adaptive status update policy numerically before we perform the comparison. We note that the optimal update rate for the best-effort uniform policy is once every seconds. We also modify the adaptive status update policy to make it applicable for the case $B = 1$. Specifically, we schedule the next update seconds away if the battery level is full right before the current update; otherwise, we schedule it in time seconds. We numerically search for the optimal value of , and it turns out that when , the time-average AoI is minimized. This is opposite to the case when $B$ is large but finite. Although it is a bit counterintuitive, it is due to the memoryless property of the inter-arrival times of a Poisson process, i.e., the expected waiting time for the next energy arrival remains unchanged after the current scheduled update epoch, regardless of its feasibility. If at the current scheduled update epoch, the battery will become empty immediately after it updates the status, and the AoI will then linearly grow from zero; if , the AoI has a positive value already, and will grow at the same rate. Thus, in order to balance the inter-update delays to minimize the time average AoI, the system should be more aggressive to update if the current scheduled update is infeasible. We then generate a sample path and plot the time average AoI as a function of time units under each policy, as shown in Fig. 4. The corresponding sample-path average over sample paths is plotted in Fig. 5. As we expect, for both scenarios, the threshold based updating policy outperforms the other two policies, and approaches its limit as gets sufficiently large.

Last, we evaluate the performance of the proposed algorithms with Markovian EH processes, which have been typically assumed in the literature [49]. Specifically, we model the EH process as a stationary first-order discrete-time Markov chain with two states, namely, ON and OFF, and the length of each time slot will be decided later. We assume that at the end of each time slot the device will go from OFF to ON with probability , and from ON to OFF with probability . Furthermore, we assume that the energy harvesting device will harvest 1 energy in one time slot if the state is ON, and does not harvest any energy in the OFF state. Thus, the steady state probabilities that the device is in the OFF and ON states are and , respectively, and the average EH rate is thus equal to per time slot. To make the EH rate consistent with that under the Poisson setting for comparison, we normalize the length of each time slot as second, so that the EH rate for the Markov process is always equal to 1 . The values of and control the burstiness of the EH process. When , the EH process becomes a uniform process, and when are very small, the EH process becomes very bursty, i.e., it may be ON and harvest energy in many consecutive time slots, and then switch to OFF and be inactive for a long period of time. Intuitively, the bursty EH process will result in a larger AoI than the uniform EH process.
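The two-state harvesting model can be sketched as follows. This is a minimal illustration, assuming the chain starts in the OFF state and harvests one unit in every ON slot; the names `simulate_on_off_harvest`, `stationary_on_probability`, `p_off_to_on`, and `p_on_to_off` are hypothetical labels for the two transition probabilities described above. For such a chain, the stationary probability of the ON state is the OFF-to-ON probability divided by the sum of the two transition probabilities, which is also the average number of energy units harvested per slot.

```python
import random


def simulate_on_off_harvest(p_off_to_on, p_on_to_off, num_slots, seed=0):
    """Per-slot harvested energy (0 or 1) of a two-state ON/OFF Markov chain.

    Assumes the chain starts in OFF and that the state may change
    only at the end of each time slot.
    """
    rng = random.Random(seed)
    on = False
    harvested = []
    for _ in range(num_slots):
        harvested.append(1 if on else 0)
        # State transition at the end of the slot.
        if on:
            if rng.random() < p_on_to_off:
                on = False
        else:
            if rng.random() < p_off_to_on:
                on = True
    return harvested


def stationary_on_probability(p_off_to_on, p_on_to_off):
    """Steady-state probability of ON, i.e., average energy per slot."""
    return p_off_to_on / (p_off_to_on + p_on_to_off)
```

When both transition probabilities equal one, the chain alternates deterministically between OFF and ON, while small transition probabilities produce long ON and OFF runs, i.e., bursty harvesting with the same kind of long-run average.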

In the following, we consider various values of and , and evaluate the time-average AoI under the proposed policies for different values of $B$. Specifically, we generate EH sample paths over seconds under each setting. For each sample path, we run the policies and track the time-average AoI. We then average over the sample paths to get the average AoI, and summarize the results in Table I. When $B = \infty$, we run the best-effort uniform status updating policy, which updates once every unit time if it has sufficient energy. As we note in Table I, when , the resulting average AoI is exactly equal to the lower bound, i.e., 0.5 unit time. This is because no battery outage would happen under the uniform EH process. For the remaining cases, the average AoI values are close to 0.5. This is because the probability of battery outage approaches zero when for . The AoI monotonically increases as the EH process becomes more bursty, which is consistent with our intuition. When , we perform the energy-aware adaptive status updating with parameter . As we observe in the table, the average AoI is close to when , and it exhibits the same monotonicity in as for the case. When $B = 1$, we choose the threshold based updating with threshold . Again, we note that the AoI monotonically decreases as increases, and it exactly equals when . This is because when the EH process is uniform, the AoI in the system is always equal to one when an energy unit arrives. Since it is above the threshold, the sensor will perform an update immediately, which leads to uniform updating.

The simulation results indicate that although we assume a Poisson EH process for the ease of theoretical analysis, such an assumption may not be critical for the optimality of the proposed policies, especially for the cases that and is finite but large. Theoretically characterizing the performance of these policies with more general EH processes is one of our future steps.

| Setting | $B = \infty$ | Finite $B$ | $B = 1$ |
|---|---|---|---|
| | 0.5212 | 1.2069 | 2.2291 |
| | 0.5039 | 0.5627 | 0.9855 |
| | 0.5018 | 0.5224 | 0.6991 |
| | 0.5009 | 0.5152 | 0.5761 |
| | 0.5000 | 0.5009 | 0.5000 |

## VI Conclusions

In this paper, we investigated optimal status update policies for an energy harvesting source equipped with a battery. We considered three different cases, namely, the battery size is infinite, finite but large, and one unit. We proposed three different status updating policies for those cases, and established their optimality through theoretical analysis. We also evaluated the performances of the proposed policies through simulation results.

We point out that the (asymptotically) optimal status update policies for the infinite battery and finite battery cases are closely related to our earlier work [41]. In [41], we studied an optimal sensing problem where the objective is to optimize the long-term average sensing performance. We assumed that the sensing performance was measured by a general function of the inter-sensing delays. Examples include the MMSE in reconstructing a wide-sense stationary random process. We observe that the average AoI as a function of the inter-update delays can be treated as a particularized case of that general function. Such an inherent connection between AoI minimization and general sensing performance optimization implies that AoI, as a metric of information freshness, does have deep connections with other performance metrics in remote sensing/estimation systems. Unveiling the intricate relationship between AoI and other remote estimation related metrics is one of our future directions.

### -A Proof of Theorem 1

To prove Theorem 1, it suffices to show that the time-average AoI under the best-effort uniform status update policy converges to 1/2 almost surely.

The uniform best-effort status update policy partitions the time axis into slots, each with length . Let be the energy level of the sensor right before the scheduled status update at time . Based on , we can group the time slots into intervals labeled as , where corresponds to the -th interval that begins with for some and ends when becomes positive as increases; corresponds to the -th interval that begins with for some and ends when becomes zero as increases. Note that we assume one unit energy is available at time , i.e., .

We note that jumps from zero to some positive value at the end of , due to random energy arrivals over the last time slot in . Based on the assumption that the energy arrivals follow a Poisson process, the length of follows an independent geometric distribution, where

(7) |

With a slight abuse of notation, in equation (7) and in the following proofs, we use to denote the length of the interval labeled as ; similarly for .
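As a numerical illustration of the geometric structure used here, the following minimal sketch compares the closed-form first and second moments of a geometric random variable with Monte Carlo estimates. It assumes a unit slot length and a unit-rate Poisson EH process, so that a slot contains at least one energy arrival with probability 1 − e^{-1}; the function names are hypothetical and not part of the paper.

```python
import math
import random


def geometric_moments(p):
    """First and second moments of a geometric variable on {1, 2, ...}."""
    mean = 1.0 / p
    second = (2.0 - p) / p ** 2
    return mean, second


def sample_empty_run(rng, rate=1.0, slot=1.0):
    """Count slots up to and including the first slot with an energy arrival.

    Assumption: arrivals are Poisson with the given rate, so a slot of the
    given length is non-empty with probability 1 - exp(-rate * slot).
    """
    p_arrival = 1.0 - math.exp(-rate * slot)
    k = 1
    while rng.random() >= p_arrival:  # this slot saw no arrival
        k += 1
    return k


def empirical_moments(n_samples=100_000, seed=0):
    """Monte Carlo estimates of the first and second moments."""
    rng = random.Random(seed)
    runs = [sample_empty_run(rng) for _ in range(n_samples)]
    mean = sum(runs) / n_samples
    second = sum(k * k for k in runs) / n_samples
    return mean, second


if __name__ == "__main__":
    p = 1.0 - math.exp(-1.0)
    print("closed form:", geometric_moments(p))
    print("empirical  :", empirical_moments())
```

Both moments are finite constants, which is the only property of the distribution that the convergence argument relies on.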

Over the interval labeled as , all of the scheduled status update epochs are feasible, except for the last one bounding . Considering the duration bounded by the first and last feasible status update epochs over , the aggregated AoI equals , where . Since all of the scheduled status updating epochs over are infeasible except for the last one (which is also the first feasible status update epoch over ), the aggregated AoI over the duration bounded by the last feasible status update epoch over and the first feasible status update epoch over is .

Let be the number of s over . Then the number of s over is either or , depending on whether time is a feasible update epoch or not. Therefore,

(8) | |||

(9) |

where (8) follows from the definition of and , and (9) follows from the results that and almost surely, as shown in the proof of Theorem 1 in [41]. Since the ’s are i.i.d. geometric random variables, and converge to the first and second moments of the geometric distribution specified in (7), which are finite constants. Therefore, (9) converges to 1/2 almost surely.
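The convergence to 1/2 can also be checked numerically. The sketch below simulates a best-effort uniform policy under assumptions that we state explicitly: unit slot length, a unit-rate Poisson EH process, an infinite battery, one unit of energy per update, one unit of energy available at time 0, zero AoI at time 0, and instantaneous delivery. The function name and parameters are hypothetical.

```python
import random


def simulate_best_effort_uniform(num_slots, seed=0, rate=1.0):
    """Return (time-average AoI, number of infeasible epochs)."""
    rng = random.Random(seed)
    energy = 1                      # assumed initial energy at time 0
    next_arrival = rng.expovariate(rate)
    last_update = 0.0               # assumed: AoI starts at zero
    aoi_area = 0.0
    infeasible = 0
    for n in range(1, num_slots + 1):
        # Harvest every energy unit that arrived during (n - 1, n].
        while next_arrival <= n:
            energy += 1
            next_arrival += rng.expovariate(rate)
        if energy >= 1:
            # Feasible epoch: AoI grew linearly from 0 to d, then resets.
            d = n - last_update
            aoi_area += 0.5 * d * d
            energy -= 1
            last_update = float(n)
        else:
            # Infeasible epoch: stay silent until the next scheduled epoch.
            infeasible += 1
    # Account for the AoI accumulated after the last successful update.
    tail = num_slots - last_update
    aoi_area += 0.5 * tail * tail
    return aoi_area / num_slots, infeasible


if __name__ == "__main__":
    for horizon in (1_000, 10_000, 100_000):
        avg_aoi, skipped = simulate_best_effort_uniform(horizon, seed=1)
        print(horizon, round(avg_aoi, 4), skipped)
```

Because every inter-update delay in this sketch is a positive integer, the simulated average can never fall below 1/2, and it approaches 1/2 as the fraction of infeasible epochs vanishes.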

### -B Proof of Theorem 2

Consider the first scheduled status update epochs under the proposed adaptive status update policy for a sample path of the energy harvesting process. Let denote the number of intervals between two scheduled status update epochs with duration , be that with duration , and be that with duration . Let be the number of status update epochs at which the battery overflows, and be the number of infeasible status update epochs. Then, the -th scheduled status update epoch happens at time . Let be the total amount of energy wasted. Then,

(10) |

where is a Poisson random variable with parameter . Dividing both sides by and taking the limit as goes to , we have

According to Theorem 3 in [41], for almost every sample path,

(11) | ||||

(12) |

Combining with the fact that and , we have

(13) |

Based on Taylor expansion and (13), we have

On the other hand, due to the existence of infeasible status update epochs, we have

(14) |

where the inequality in (14) follows from the fact that differs from the delay between two consecutive scheduled status update epochs only when a battery outage happens, and denotes the number of ’s with .

### -C Proof of Theorem 3

Let be the status update epochs under a uniformly bounded policy, and be the corresponding inter-update delays. Based on the definition of in (4), we have

(17) |

Thus,

(18) | ||||

(19) |

We aim to show that 1) , and 2) is suboptimal to a renewal policy. In the following, we will show them separately.

#### -C1

First, we denote as the cumulative distribution function (cdf) of under the uniformly bounded policy, and as the total number of updates over . Then, we have

(20) |

Next, we note that

(21) | |||

(22) | |||

(23) | |||

(24) |

where (23) follows from the definition of uniformly bounded policy, and (24) follows from the memoryless property of the inter-arrival time of a Poisson process.

Therefore, by first conditioning on the last update epoch prior to (or at) time , we have

(25) | |||

(26) |

Note that is defined as the total number of energy arrivals over , which upper bounds the total number of status updates over , i.e., , due to the energy causality constraint. Thus,

(28) | |||

(29) | |||

(30) |

under each status update sample path. Plugging in (27) and letting , we have

(31) | |||

(32) |

where (32) holds for any , due to the assumption in Definition 3 that is bounded. Since , we have as .

#### -C2 is sub-optimal to a renewal policy

For any given uniformly bounded policy, we will construct a renewal policy as follows: For all of the status update sample paths under the given uniformly bounded policy, we will group those with based on the value of , and find the corresponding average inter-update delay . Specifically, we define

(33) |

Since each under the given policy, we have , and it depends only on . Besides, we have the following observation:

###### Lemma 2

For any fixed ,

Proof: Based on the property of conditional expectation, we have

(34) | ||||

(35) |

where in (34) is an indicator function, which takes value 1 if event is true; otherwise, it equals 0. Equation (35) follows from the fact that events and are equivalent, and

The lemma is proved by taking the expectation of both sides of (35) with respect to .

Next, we will construct a renewal policy based on the definition of . Define

(36) | ||||

(37) |

Then, we have the following observations.

###### Proposition 1

For any fixed , is a valid distribution.

This proposition can be proved based on the facts that , and .

###### Proposition 2

For any fixed , , and it depends on only.

This proposition is due to , and it depends only on , as well as Proposition 1. Proposition 2 indicates that if we define a status update policy such that the corresponding inter-update delay is determined by the delay between the last status update epoch and the first energy arrival time after that according to , then, the corresponding policy always satisfies the energy causality constraint, and the inter-update delays over are independent and identically distributed, thus it is a renewal policy over .

With a slight abuse of notation, in the following, we use to denote a random variable that has the same distribution as .

###### Lemma 3

For any fixed ,

(38) |

Proof: First, we note that . Thus, we have

(39) | ||||

(40) | ||||

(41) | ||||

(42) | ||||

(43) | ||||

(44) |

where we switch the order of summation and expectation in (41) since , (42) follows from Lemma 2, (43) follows from the definition of in (36), and (44) follows from the definition of in (37). Dividing both sides of (44) by , we obtain (38).

###### Lemma 4

Under the uniformly bounded policy, we have

Proof: By the Cauchy-Schwarz inequality, we have

(45) | |||

(46) |

Lemma 4 then follows from the fact that is independent of .

Last, we will show that the corresponding renewal policy always outperforms the original uniformly bounded policy in terms of AoI.

(47) | |||

(48) | |||

(49) | |||

(50) | |||

(51) | |||

(52) | |||

(53) | |||

(54) |

where (49) follows from Lemma 4, (50) follows from Lemma 2, and (52) follows from Jensen’s inequality and Proposition 1. Combining this with Lemma 3 yields (53), which is greater than or equal to (54), the long-term average AoI achieved by the optimal renewal policy. We use to denote the set of feasible renewal policies under which depends only on . Since the inequality holds for every , we have

(55) | ||||

(56) |

### -D Proof of Theorem 4

Based on Theorem 3, we assume that the inter-update delay under a renewal policy is a function of , the duration between the last update epoch and the first energy arrival after it.

Then, minimizing the long-term average AoI is equivalent to solving

(57) |

Based on the assumption that the energy arrival process is Poisson with , is an exponential random variable with rate 1. To make problem (57) more tractable, we introduce the following parameterized problem

(58) |
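To make the renewal objective concrete, the following sketch evaluates the long-term average AoI of a threshold rule for the unit-battery case and locates the best threshold by grid search. The assumptions are ours and are stated plainly: the time from the last update to the first energy arrival is exponential with rate 1, the inter-update delay is the larger of that time and a threshold `tau` (the threshold structure described in the abstract), and the average AoI of a renewal policy is taken as the standard renewal-reward ratio E[d²]/(2E[d]). The closed-form moments in the code follow from integrating against the exponential density; the function names are hypothetical.

```python
import math


def threshold_policy_aoi(tau):
    """Average AoI when the inter-update delay is d = max(X, tau), X ~ Exp(1).

    E[d]   = tau + exp(-tau)
    E[d^2] = tau^2 + 2 * (tau + 1) * exp(-tau)
    """
    exp_neg = math.exp(-tau)
    mean_d = tau + exp_neg
    second_d = tau * tau + 2.0 * (tau + 1.0) * exp_neg
    return second_d / (2.0 * mean_d)


def best_threshold(step=1e-4, tau_max=3.0):
    """Grid search for the threshold minimizing the average AoI."""
    best_tau, best_aoi = 0.0, threshold_policy_aoi(0.0)
    n_points = int(round(tau_max / step))
    for i in range(1, n_points + 1):
        tau = i * step
        aoi = threshold_policy_aoi(tau)
        if aoi < best_aoi:
            best_tau, best_aoi = tau, aoi
    return best_tau, best_aoi


if __name__ == "__main__":
    tau_star, aoi_star = best_threshold()
    print("zero threshold (update on arrival):", threshold_policy_aoi(0.0))
    print("best threshold and its average AoI:", tau_star, aoi_star)
```

Under these assumptions, setting the derivative of the ratio to zero gives the condition that the threshold equals the achieved average AoI, so the grid search returns a threshold near 0.90 whose average AoI matches it numerically, compared with an average AoI of 1 for the zero-threshold rule that updates on every arrival.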

We have the following observation.

###### Proposition 3

The optimal solution of problem (57) is given by