Opportunistic Multi-Channel Access in Heterogeneous 5G Network with Renewable Energy Supplies
Abstract
A heterogeneous system, where small networks (e.g., small cell or WiFi) boost the system throughput under the umbrella of a large network (e.g., large cell), is a promising architecture for the 5G wireless communication networks, where green and sustainable communication is also a key aspect. Renewable energy based communication via energy harvesting (EH) devices is one such green technology candidate. In this paper, we study an uplink transmission scenario under a heterogeneous network hierarchy, where each mobile user (MU) is powered by a sustainable energy supply and is capable of both deterministic access to the large network via one private channel, and dynamic access to a small network with certain probability via one common channel shared by multiple MUs. Considering a general EH model, i.e., energy arrivals are time-correlated, we study an opportunistic transmission scheme and aim to maximize the average throughput for each MU, which jointly exploits the statistics and current states of the private channel, common channel, battery level, and EH rate. Applying a simple yet efficient “save-then-transmit” scheme, the throughput maximization problem is cast as a “rate-of-return” optimal stopping problem. The optimal stopping rule is proved to have a time-dependent threshold-based structure for the case with general Markovian system dynamics, and reduces to a pure threshold policy for the case with independent and identically distributed system dynamics. As a performance benchmark, the optimal power allocation scheme with conventional power supplies is also examined. Finally, numerical results are presented, and a new concept of “EH diversity” is discussed.
Heterogeneous networks, small cell, energy harvesting, opportunistic transmission, optimal stopping.
1 Introduction
1.1 Motivations
Heterogeneous networks (HetNets), where small networks (e.g., small cell or WiFi) composed of low-power access points (APs) are placed under the coverage of a large network (e.g., large cell), are evolving into a new type of network deployment that could enhance the overall system throughput with reasonable cost and power consumption [1, 2]. Standardization bodies, such as ETSI and 3GPP, have paid much attention to this shift in network paradigm and have made HetNets part of the current and future cellular standards. Commercial small cell deployments can already be found globally, operated by various cellular carriers [3, 4].
In a traditional cellular network, a mobile user (MU) is usually assigned a dedicated private channel to access the base station (BS), but this link may experience bad channel conditions due to possibly severe path loss and shadowing between the MU and the BS. In such cases, however, the desired quality-of-service (QoS) could still be satisfied by allowing the MU to access a nearby AP in an underlying small network via a common channel, if the corresponding channel condition is relatively good. Essentially, the MU in the above HetNet could deploy a multichannel access scheme: The messages from the MU could be directly delivered to the cellular BS, or if available, jointly via a nearby low-power AP [5]. It is worth noting that the small network could be operated over a band orthogonal to the large network: e.g., WiFi uses the unlicensed band [6] and femtocells could be allocated different bands from the large network via orthogonal frequency division multiple access (OFDMA) or time division multiple access (TDMA) [4, 7]. If needed, the small network can also share the same bands with the large network. In either case, there are two modes of access control for small networks: restricted access, i.e., only preregistered MUs could access the corresponding AP [4, 5]; and open access, i.e., any local MU in the small network could gain access. In practice, the MU may fail to establish a dedicated link to the small network due to congestion over the limited spectrum resources, which introduces another type of access randomness beyond channel variation in the conventional cellular system.
Another significant advantage enabled by the aforementioned HetNet is that the MU could potentially enjoy a longer lifetime, since its power consumption may be reduced by communicating with the nearby local AP. However, since the lifetime of an MU is still limited by the energy stored in its batteries [8], the MU should seek an “active” way to recharge itself, especially in a green fashion. Such renewable energy powered nodes, which can efficiently convert certain environmental energy (e.g., solar, wind, and vibration) into electric energy [9], will play critical roles in the next generation or 5G wireless system, which is designed to be environmentally friendly and to support diversified applications such as machine-to-machine communications and Internet of things (IoT). In this way, the MU could prolong the battery life almost indefinitely, and fulfil the increasing demands of green operations in 5G [10]. Compared with the conventional power supply, such a renewable energy supply raises a new transmission design constraint: The energy consumed up to any time should be bounded by the energy harvested until that point, which is called the EH constraint [8].
In this paper, we study a simple uplink HetNet scenario depicted in Fig. 1, where each EH-based MU has an individual link, namely a private channel, to the large network BS for deterministic access. Moreover, a local AP of a small network offers a common channel, which is randomly shared by all nearby MUs. Here we consider a scenario in which each MU could access the common channel with a certain probability at each time slot. Thus, based on this multichannel access setup, the MU could fulfil a transmission by using the harvested energy either via its private channel alone or via both the private and common channels simultaneously. Joint information processing is done with low latency by a cloud-based radio access network (C-RAN) platform, which is a popular platform candidate for 5G [11, 12, 13].
On the MU side, there are two types of state information that could be causally known before the transmission: the channel state information (CSI) of the links to the large network and the small network (if the AP was successfully accessed by the MU); and the energy state information (ESI), i.e., the EH rate (the harvested energy per unit time) and the battery state at the MU. Therefore, the MU could decide when to start a transmission with both CSI and ESI at hand. A longer time spent harvesting energy while probing the system may result in a higher transmission power and a higher likelihood of securing the common channel; however, it may reduce the average effective transmission time. This leaves an interesting tradeoff to optimize: energy saving time vs. data transmission time. In addition, we consider a “save-then-transmit” scheme such that each transmission consumes all the harvested energy at the MU. This suboptimal power utilization scheme is able to deploy a large instantaneous transmit power such that the short-term transmission rate is maximized, and is more tractable for analysis as well.
1.2 Contributions
First, we propose an opportunistic transmission scheme for the multichannel HetNet uplink powered by sustainable energy supplies, which enhances the average throughput for each user by jointly exploiting the stochastic CSI and ESI. More precisely, the throughput maximization is cast as a “rate-of-return” optimal stopping problem. With Markovian private channel and EH models, the optimal stopping rule is proved to exist and to have a state-dependent threshold-based structure under both finite and infinite battery capacity assumptions. The optimal throughput is proved to be strictly increasing in the probability that the common channel is secured.
Second, we study the case when the private channel gains and the EH rates are respectively independent and identically distributed (i.i.d.) across different communication blocks. The corresponding optimal stopping rule is proved to be a pure-threshold policy, i.e., the threshold does not change over time, which could be found via a one-dimensional search. With such a fixed threshold, the mean saving time is proved to be decreasing polynomially in the probability that the common channel is secured. We also show via simulations that the randomness of EH rates, leading to the so-called “EH diversity”, influences the throughput performance and could be exploited by our proposed pure-threshold policy: Specifically, we find that the more dynamically the EH rate varies, the higher the average throughput that the MU could achieve.
Finally, we quantify the performance of the case with conventional power supplies as the benchmark, showing that the corresponding optimal power allocation has a “water-filling” structure, where the water level is jointly determined by the statistics of the private and common channels, and the probability that the common channel is secured.
1.3 Related Works
Most existing works related to the uplink of heterogeneous cellular networks assume certain deterministic access control of the underlying small networks [4, 7, 14, 15]. From the views of both the femtocell owner and the overall network operator, the authors in [7] evaluated the femtocell performance with open and restricted accesses. It was shown that with nonorthogonal (in terms of frequency or time) multiple access for mobile users, open access benefits both the femtocell owner and the network operator; with orthogonal multiple access, the femtocell access control strategy (open or restricted) is closely dependent on the user density. In [14], by adopting open access, the outage behaviors of both femtocell and large cell users were analyzed via stochastic geometry to model the locations of both the femtocell APs and the cellular users. The authors also presented several interference avoidance methods to enhance the per-user capacity. In [15], each large cell user was assigned one direct link to the BS, and one relay link to the femtocell AP. Playing a noncooperative game against the others, each user could seek its preferred open-access femtocell and split the rates between the BS and the AP to maximize its own utility. In contrast to these existing works, here we consider users with random, not deterministic, access to the local AP, which is more realistic in WiFi-based HetNets.
On the other hand, the study of wireless transmitters powered by renewable energy has drawn a lot of attention in recent years [16]. In particular, with noncausal knowledge of energy arrival processes, the throughput maximization problem was investigated for both non-fading and fading channels in [8, 17], in addition to the classic three-node Gaussian relay channel [18]. With causal knowledge, the optimal throughput in fading channels over finite-time horizons was obtained via dynamic programming in [8, 17]. A save-then-transmit protocol was proposed in [19], where each communication block is divided into two parts: the first one for harvesting energy and the other for data transmission. In contrast, we consider the save-then-transmit strategy in this paper over an infinite number of communication blocks. For a wireless network where multiple EH-based users share one common channel, the authors in [20] investigated the performance of some standard medium access control protocols, e.g., TDMA, framed-Aloha, and dynamic-framed-Aloha. Under a similar system setup, the authors in [21] proposed a decentralized access scheme based on game theory, which could achieve some local maxima of the network utility. In this paper, a different scenario is studied where each user has multichannel access and an individual utility to maximize.
Channel probing techniques have also been studied in the literature. In [22], the authors discussed how a transmitter probes a relay channel with some additional time cost when its direct channel is undesirable. In addition, similar channel probing and selection problems for WiFi and cognitive radios were investigated in [23] and [24], respectively. For [22, 23, 24], the key idea is that the sender may spend time on probing the channel quality before starting a transmission. We here adopt a similar idea. However, we need to face a different and more challenging scenario: Besides probing the large cell network, we also need to probe the resource availability in the small local network, as well as the local battery status that is dynamic due to the energy arrival and withdrawal.
The remainder of this paper is organized as follows. The specific system model and problem formulation are described in Section 2. The throughput optimization problem is solved for both Markovian and i.i.d. cases in Section 3. The optimal power allocation with traditional power supplies is discussed in Section 4. Numerical results are provided in Section 5. Finally, Section 6 concludes the paper.
2 System Model and Problem Formulation
2.1 System Model
As shown in Fig. 1, an uplink HetNet communication scenario is considered: One private channel connected to the large network BS is assigned to each EH-based MU, and one common channel connected to a given small network AP is randomly accessed by all nearby users. All private and common channels are orthogonal in frequency, slotted in time, and synchronized. The duration of each time slot is uniform. Moreover, in each slot, an MU can access at most one local AP through the common channel. Define the probability that the common channel is secured by an MU as , called securing probability. Similar to a WiFi system, the MU cannot hold the common channel forever; it is required to release the common channel after use.
Channel model
Under the above setup, an MU can fulfill a transmission: i) via the private channel only; ii) or via both the private and common channels.

In case i), the received signal in the th time slot at the BS is given by
(1) where is the channel gain of the MU-to-BS link, is the transmit power, is the transmitted signal with zero mean and unit variance, and is the circularly symmetric complex Gaussian (CSCG) noise with zero mean and unit variance. Define on a state space with finite mean and variance.
Here, we assume that follows a more general Markovian model [25] while follows an i.i.d. model, due to the fact that the MU-to-BS link usually spans a much longer distance such that the channel may be under correlated shadowing, while the MU-to-AP link usually experiences fast fading, given its much shorter distance. The CSI includes both and . For simplicity, the time for the MU to learn the CSI is neglected given the much longer length of one time slot.
Assume that the fiber connections between the BS/AP and the C-RAN are perfect, such that the C-RAN based joint decoding is optimal. By applying the Shannon capacity formula, at time slot , the instant transmission rate of the MU over the above channel model is expressed as
Note that the common channel can be secured with probability in our proposed multichannel model. To make the expression of more concise, we introduce an indicator such that
Then, can be written as
(3) 
The constraint on transmit power levels and will be specified later.
Energy model
In general, the entire operation of the MU relies on the harvested energy. Here, we mainly focus on the effect of the EH constraint on transmit power, not only for analytical tractability and gaining insights, but also due to the fact that data transmission usually dominates the power consumption in medium-to-long range wireless systems [26, 27]. In other words, the energy consumption on circuit overhead and channel training (acquiring CSI of both private and common channels) is assumed to be relatively negligible.
We use to denote the energy level at the battery for the considered MU at the beginning of time slot , and quantize the energy level into unit steps, i.e., , where is the smallest energy unit, and could be either a finite integer or infinity. For the case of , it is a good approximation when the battery capacity is large enough compared with the EH rate, e.g., an AA-sized NiMH battery has a capacity of 7.7 kJ, which requires a couple of hours to be fully charged by some commercial solar panels [28]. During time slot , the MU harvests amount of energy, where the sequence is modeled as a homogeneous Markov process [29]. Due to hardware limitations, the EH rate could be represented over a finite state space . The energy state information (ESI, i.e., EH rate and battery status) is assumed causally known by the MU.
Operation model
Given that the MU is driven by the accumulated energy, we consider a “save-then-transmit” scheme over multiple time slots: The MU harvests energy and exploits the access opportunity of the common channel simultaneously over a certain number of time slots, and then transmits by using up the total available energy in the battery. Such a scheme has the nature of maximizing the short-term transmission rate, and is practical due to its implementation simplicity. As such, if we let as the first time slot after one data transmission, can be written as
When , there is , where is the accumulated energy during the transmission slot in the previous save-then-transmit period. The MU decides when to stop “saving” and start a transmission according to its current CSI and ESI. Specifically, at the beginning of time slot , according to some optimal save-then-transmit policy, an MU can:

either transmit immediately during the current time slot (via either the private channel or both the private and common channels);

or skip transmission (release the common channel if it has been secured by the MU).
In Fig. 2, we show one realization of the saving and access process, in which two users are assigned with two private channels, respectively, and share one common channel. In particular, MU 1 transmits only through its private channel at time and MU 2 transmits via both its private and the common channel at time .
2.2 Problem Formulation
Our goal is to maximize the average throughput of the MU. First, we determine the transmit power for maximizing the instant rate . At time , according to the save-then-transmit scheme, it is easy to see that the transmit power and satisfy . When , it follows , since the MU can only use the private channel; and when if , in order to maximize , the power allocation follows the “water-filling” scheme given in the next lemma.
Lemma 2.1
When the MU can transmit via both the private and common channels (i.e., ), it is optimal to allocate power as follows:

If , we have that and ;

If and , we have and ;

If and , we have and .
Lemma 2.1 can be proved by using standard convex optimization techniques and thus the proof is omitted for brevity. For notational simplicity, we define the state of the MU, including CSI and ESI, at time as . In this way, is fully determined by .
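To make the three cases of Lemma 2.1 concrete, the following is a minimal sketch of a two-channel water-filling split. It assumes unit noise variance on both links and a sum rate of the form log2(1 + h·p_private) + log2(1 + g·p_common); the function names `split_power` and `sum_rate` and the symbols `h`, `g`, and `total_power` are illustrative choices, not the paper's notation.

```python
import math

def split_power(h, g, total_power):
    """Split total_power between the private (gain h) and common (gain g) channel.

    Assumption: both gains are positive, noise variance is 1, and the objective
    is log2(1 + h*p_private) + log2(1 + g*p_common) with p_private + p_common
    equal to total_power.
    """
    # Unconstrained water-filling solution: both channels share one water level.
    p_private = (total_power + 1.0 / g - 1.0 / h) / 2.0
    # Clip to [0, total_power]; the clipped cases put all power on one channel.
    p_private = min(max(p_private, 0.0), total_power)
    return p_private, total_power - p_private

def sum_rate(h, g, p_private, p_common):
    """Sum rate in bits per channel use under the assumed rate expression."""
    return math.log2(1.0 + h * p_private) + math.log2(1.0 + g * p_common)

if __name__ == "__main__":
    for h, g, power in [(1.0, 1.0, 2.0), (1.0, 0.1, 2.0), (2.0, 1.0, 1.0)]:
        p1, p2 = split_power(h, g, power)
        print(h, g, power, "->", p1, p2, sum_rate(h, g, p1, p2))
```

When one channel is much weaker than the other and the available energy is small, the clip drives that channel's power to zero, which corresponds to the single-channel cases of the lemma.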
Next, we let be some stopping rule indicating the time slot to stop saving and start transmission. Thus, the transmission rate at the time slot would be denoted as . Here, we make the following assumption: The steady-state distribution of exists under the stopping rule . We will verify this assumption later by showing that our proposed transmission scheme will indeed result in a stationary . With the above assumption, it follows that the steady-state distribution of also exists given that and are stationary, respectively. Then, applying the stopping rule for infinitely many times, we obtain
where the expectation is taken over the stationary distribution of and , and is the average throughput per save-then-transmit period. The core of the proposed save-then-transmit scheme is to find the optimal stopping rule to achieve the maximum throughput , which are defined as
(4) 
In the next section, we will find and .
3 Optimal Stopping Rule and Throughput
The problem defined in (4) is a “rateofreturn” problem and could be converted into a standard optimal stopping problem [30, 31]. With some and, we let , and consider a new problem:
(5) 
Under this interpretation, can be regarded as the offer at time , is the cost, and is the net reward. We let since it is irrational that a transmitter does not send any data forever. The following lemma, which is directly from Theorem 1 of chapter 6 in [31], connects problems (4) and (5):
Lemma 3.1
Therefore, we just need to focus on finding the optimal stopping rule for problem (5) and such that . In the rest of this section, we first solve problem (5) for the case with Markovian private channel states and EH rates. Then, we consider the corresponding i.i.d. case.
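As a reading aid, the rate-of-return conversion can be stated in generic notation. The symbols below are introduced only for illustration and may differ from the notation of (4) and (5): $R_N$ is the rate earned when stopping at slot $N$, $\lambda$ is a candidate per-slot throughput, and the slot duration is normalized to one.

```latex
\lambda^{\ast} \;=\; \sup_{N} \frac{\mathbb{E}[R_N]}{\mathbb{E}[N]}
\qquad\Longleftrightarrow\qquad
\sup_{N} \, \mathbb{E}\!\left[ R_N - \lambda^{\ast} N \right] \;=\; 0 .
```

In words, if for some $\lambda$ the ordinary stopping problem with net reward $R_N - \lambda N$ has optimal value zero and the optimum is attained by a rule $N^{\ast}$, then $N^{\ast}$ also maximizes the ratio on the left and $\lambda$ equals the maximum rate of return.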
3.1 Markovian Case
Here, we assume that and are homogeneous Markov processes with some stationary distributions, respectively. Given some , we define the remaining expected maximum reward starting at time in state as
(6) 
Moreover, we observe that the “cost” is a constant, which allows us to use to represent , i.e.,
(7) 
Based on this observation, the following proposition shows that the optimal stopping rule for problem (5) exists and also shows the form of the optimal stopping rule, whose proof is given in Appendix A.
Proposition 3.1
The optimal stopping rule for problem (5) exists with either or , and it has the following form
(8) 
Moreover, the optimal throughput satisfies
(9) 
where is the initial state of each savethentransmit period, a random vector defined over the space with a certain stationary distribution.
It is observed from (8) that the optimal stopping rule for problem (5) is state-dependent and has a threshold-based structure with parameter . The structure is derived from the optimality equation (see Theorem 2 in [30]), or equivalently, the dynamic programming equation (see (3) in [32]). Such a structure also implies that a closed-form calculation of is in general extremely difficult, especially in Proposition 3.1 where the stationary distribution of the battery is unknown and the battery capacity could be infinite. Thus, numerical methods are preferred for finding .
Although the calculation of is hard, some properties of can be obtained and are given in the next proposition.
Proposition 3.2
is uniquely determined by (9) and is strictly increasing over .
We first show the uniqueness of . We observe that in (9), its left-hand side is monotonically increasing from zero to positive infinity over . Notice that in the right-hand side of (9), we have
which is obtained according to (6). It follows that the right-hand side of (9) is monotonically decreasing from a finite number, i.e., from
to negative infinity over . Thus, there exists a unique that makes (9) hold.
For the monotonicity of over , please see Appendix B.
Remark 3.1
The strict monotonicity of the optimal throughput over the securing probability implies that the common channel is helpful in general.
Remark 3.2
The stationary distribution of exists under the optimal stopping rule in (8). Specifically:

When is finite, the transition probability of the energy level is also determined under the stopping rule and the stationary distribution of . Moreover, all the attainable states of the battery form a positive recurrent class. Thus, has a steady-state distribution.

When is infinite, from the perspective of queueing theory, the average discharging rate of the battery is the same as the recharging rate since all energy will be used for transmission in each save-then-transmit period. Therefore, the stationary distribution of exists. Moreover, it can be approximated as a Brownian motion process [33].
3.2 i.i.d. Case
In this subsection, we focus on the case when and are both i.i.d., respectively. Since this is a special case of the one studied in the previous subsection, the optimal stopping rule still exists. Going one step further, the corresponding optimal stopping rule simplifies to a pure-threshold structure.
Proposition 3.3
When and are i.i.d. with finite means and variances, respectively, the optimal stopping rule for problem (5) has the following form:
(10) 
where is a fixed real number.
Since the optimal stopping rule is given by (8) based on Proposition 3.1, we could further rearrange the rule as
The function is defined by , where . The following properties of play a key role in the proof of this proposition:

for all ;

for all . Moreover, as ;

for all . Moreover, as ;

as .
If all the above properties are true, it follows that , there exists such that whenever , which implies that the stopping rule has the form given by (10) (similar to the technique used in [30]). The proof of the four properties is given in Appendix C.
Moreover, we note that the expected value of the optimal stopping rule indicates the mean saving time. The next proposition shows that for a fixed threshold, the mean saving time is shortened under the proposed opportunistic scheme with multichannel access.
Proposition 3.4
Given a fixed , is decreasing polynomially over .
The proof is given in Appendix D. Following Proposition 3.3 and Lemma 3.1, we have
Then, we obtain
(11) 
Conjecture: is a quasiconcave function over .
Our conjecture will be validated via numerical results in Section 5. Such a conjecture enables us to apply some simple search methods, e.g., bisection search, to find the optimal threshold.
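If the conjecture holds, a one-dimensional search is enough to locate the best threshold. The sketch below uses ternary search rather than the bisection search mentioned in the text, since ternary search needs only function evaluations and no derivative. The helper name `maximize_unimodal` and the toy objective are hypothetical, and the method assumes the objective has no flat segments away from its maximum.

```python
def maximize_unimodal(objective, low, high, tol=1e-6):
    """Ternary search for the maximizer of a unimodal function on [low, high].

    Assumption: objective rises strictly up to its peak and falls strictly
    afterwards, which is a stronger condition than quasi-concavity alone.
    """
    while high - low > tol:
        third = (high - low) / 3.0
        left, right = low + third, high - third
        if objective(left) < objective(right):
            low = left      # the peak cannot lie in [low, left]
        else:
            high = right    # the peak cannot lie in [right, high]
    return (low + high) / 2.0

if __name__ == "__main__":
    # Toy stand-in for the throughput-versus-threshold curve.
    toy_throughput = lambda gamma: -(gamma - 1.3) ** 2
    print(maximize_unimodal(toy_throughput, 0.0, 5.0))
```

In practice the objective would be an estimate of the average throughput for a given threshold, for example from simulation, in which case the tolerance should not be set finer than the estimation noise.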
4 Throughput with Conventional Power Supply
In this section, we investigate the throughput of the MU with a conventional power supply in the discussed multichannel access system, which will serve as performance benchmarks for our proposed schemes introduced in previous sections. Note that we only need to change the EH constraints into the average power constraints in the setup, and keep the same channel and access models as before.
With a conventional power supply, the instant transmission rate given by (3) still holds. Then, finding the optimal power allocation is equivalent to solving the following optimization problem:
(12)  
(13)  
where is the maximum average total power. The optimal power allocation is given in the next proposition.
Proposition 4.1
Proposition 4.1 can be proved by applying a similar technique to the proof of optimal adaptation (5) in [34], and thus is omitted here.
The optimal power allocation has a “water-filling” structure similar to the optimal solution of the single fading channel case under an average power constraint, while the water level is jointly determined by the securing probability and the statistics of both private and common channels.
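For reference, the classical single-channel case mentioned above can be sketched as follows. This is not the allocation of Proposition 4.1: it ignores the common channel and the securing probability, and it assumes a finite set of fading states with unit noise variance. The names `water_level` and `power_per_state` are illustrative.

```python
def water_level(gains, probs, avg_power, tol=1e-10):
    """Bisection for the level nu with sum_i probs[i] * max(nu - 1/gains[i], 0) = avg_power.

    Assumption: gains are positive, probs sum to one, avg_power is positive.
    """
    low = 0.0
    # At this level every state receives at least avg_power, so it is an upper bound.
    high = avg_power + max(1.0 / g for g in gains)
    while high - low > tol:
        mid = (low + high) / 2.0
        used = sum(p * max(mid - 1.0 / g, 0.0) for g, p in zip(gains, probs))
        if used > avg_power:
            high = mid
        else:
            low = mid
    return (low + high) / 2.0

def power_per_state(gains, probs, avg_power):
    """Transmit power assigned to each fading state under water-filling."""
    nu = water_level(gains, probs, avg_power)
    return [max(nu - 1.0 / g, 0.0) for g in gains]

if __name__ == "__main__":
    print(power_per_state([1.0, 0.25], [0.5, 0.5], 1.0))
```

With a small power budget the weak state receives no power at all, and with a larger budget both states are filled up to the same level.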
5 Numerical Results
In this section, we present some numerical results to validate our analysis. Besides the optimal power allocation with a conventional power supply, we also consider the method of best-effort delivery [35] as a comparison benchmark, i.e., the transmitter directly uses up the harvested energy in the previous time slot and does not store energy. In the simulation, the length of each time slot is ms and the energy step is set to be J.
5.1 Markovian Private Channel Gains
First, we consider a renewable energy supply at the MU with a time-correlated private channel, which corresponds to Section 3.1. Here, we use a simple model to illustrate the throughput performance with different schemes. Let the capacity of the battery , and the EH rate . The common channel is static with a constant power gain for . The gain of the private channel has two states with transition probability 1 from state to and probability from state to .
In Fig. 3, we show the average throughput with the opportunistic scheme proposed in Section 3.1 against other schemes under the impact of securing probability . First, we observe that the optimal allocation with the conventional power supply serves as the performance upper bound. We also observe that the throughput attained by the opportunistic scheme increases as increases, which agrees with Proposition 3.2. Second, when , the opportunistic scheme is better than the best-effort delivery. It agrees with our intuition that when the transmitter experiences a bad channel (), it skips the transmission immediately and waits for a better channel state, which may lead to a higher average throughput. Third, the opportunistic scheme and the best-effort delivery have the same performance at , since the common channel is good () and always secured by the MU, such that the MU does not need to skip any transmission, which results in the same average throughput for the two schemes. We can also conclude that the opportunistic scheme performs significantly better only when the difference between the good and bad channel conditions is large enough.
5.2 i.i.d. Private Channel Gains
We apply a two-state EH model (similar to that in [29]), where the EH rate can be either zero (“BAD”) or (“GOOD”) with probability for each state. The channel gains in either the private or the common channel are i.i.d. following an exponential distribution with unit mean.
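A setup of this kind can be mimicked with a small Monte Carlo sketch of a pure-threshold save-then-transmit policy. Everything numeric here is an assumption made for illustration: the harvested amount in the “GOOD” state (`eh_good`), the equal state probabilities, unit slot length, unit noise variance, an unbounded battery, and the sum-rate expression over the two channels. The function names `best_rate` and `simulate` are hypothetical, and the output is not a reproduction of the paper's figures.

```python
import math
import random

def best_rate(h, g, power):
    """Largest sum rate log2(1+h*p1) + log2(1+g*p2) with p1 + p2 = power (assumed model)."""
    if power <= 0.0:
        return 0.0
    if g <= 0.0:                      # common channel not secured
        return math.log2(1.0 + h * power)
    if h <= 0.0:
        return math.log2(1.0 + g * power)
    p1 = (power + 1.0 / g - 1.0 / h) / 2.0
    p1 = min(max(p1, 0.0), power)     # two-channel water-filling with clipping
    return math.log2(1.0 + h * p1) + math.log2(1.0 + g * (power - p1))

def simulate(threshold, secure_prob, n_slots=200_000, eh_good=1.0, seed=0):
    """Return (average rate per slot, average slots per save-then-transmit period)."""
    rng = random.Random(seed)
    battery, total_rate, transmissions = 0.0, 0.0, 0
    for _ in range(n_slots):
        h = rng.expovariate(1.0)                       # private channel gain
        g = rng.expovariate(1.0)                       # common channel gain
        secured = rng.random() < secure_prob
        rate = best_rate(h, g if secured else 0.0, battery)
        if rate >= threshold:                          # stop saving, use all energy
            total_rate += rate
            transmissions += 1
            battery = 0.0
        # Energy harvested during this slot is available from the next slot on.
        battery += eh_good if rng.random() < 0.5 else 0.0
    mean_period = n_slots / transmissions if transmissions else float("inf")
    return total_rate / n_slots, mean_period

if __name__ == "__main__":
    for gamma in (0.0, 1.0, 2.0, 3.0):
        print(gamma, simulate(gamma, secure_prob=0.5, n_slots=50_000))
```

With a zero threshold the MU transmits in every slot using only the energy harvested in the previous slot, so the sketch then behaves like best-effort delivery.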
In Fig. 4, we show how the threshold influences the average throughput with different securing probability . We observe that the average throughput could be optimized by adjusting the threshold , which validates the results in (11) and our conjecture in Section 3.2.
We also show how the mean saving time varies over the securing probability in Fig. 5. Since the optimal threshold is different when changes, we choose two typical values for comparison: , which is optimal for ; and , which is optimal for based on our results in Fig. 4. For either or , we observe from Fig. 5 that the mean saving time decreases as increases, which agrees with Proposition 3.4. The mean saving time with optimal falls in between those with and , respectively.
Next, we show the existence of EH diversity and that the proposed opportunistic scheme is able to exploit this type of diversity. For better illustration, we focus on the pure-threshold policy discussed in Section 3.2. The securing probability is set to be 0.5. The Markovian EH model (a) in Fig. 6 is the benchmark, which is equivalent to an i.i.d. EH model with probability 0.5 to be either “GOOD” or “BAD”. For comparison, we choose EH models (b) and (c) as shown in Fig. 6, which have the same stationary distribution as that of model (a), while bearing different “randomness”: Model (b) changes from one state to the other with a higher frequency compared with model (a), and model (c) changes with a lower frequency such that the EH rate is likely to stay in one state and rarely change over time. In addition, we also consider model (d), which represents the case that the EH rate has a higher stationary probability to be “GOOD”.
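The relation between switching frequency and stationary distribution in a two-state chain can be checked with a few lines. The transition probabilities below are hypothetical stand-ins, since the values of Fig. 6 are not given in the text; only the qualitative pattern (same stationary distribution, different switching rates) is taken from the description above.

```python
def stationary_two_state(p_bad_to_good, p_good_to_bad):
    """Stationary probabilities (pi_bad, pi_good) of a two-state Markov chain."""
    total = p_bad_to_good + p_good_to_bad
    return p_good_to_bad / total, p_bad_to_good / total

def mean_sojourn(p_leave):
    """Expected number of consecutive slots spent in a state left with probability p_leave."""
    return 1.0 / p_leave

if __name__ == "__main__":
    # Hypothetical transition probabilities (BAD->GOOD, GOOD->BAD) for each model.
    models = {"a": (0.5, 0.5), "b": (0.9, 0.9), "c": (0.1, 0.1), "d": (0.7, 0.3)}
    for name, (up, down) in models.items():
        print(name, stationary_two_state(up, down), mean_sojourn(down))
```

Any symmetric chain has the stationary distribution (0.5, 0.5), so symmetric models differ only in how long the EH rate dwells in each state.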
In Fig. 7, we show the average throughput over different threshold values for the four EH models depicted in Fig. 6 under the pure-threshold opportunistic scheme. First, we observe that EH model (d) achieves the highest throughput, since model (d) has the largest stationary probability for the EH rate to be in the “GOOD” state. Second, we observe the throughput differences across EH models (a), (b) and (c). When , these three models lead to the same throughput performance, for implies that the opportunistic transmission scheme is not applied such that the average throughput is mainly determined by the stationary characteristics. When increases until the optimal value that leads to the maximum average throughput, we observe that the EH model (b) could make the transmitter achieve a slightly higher throughput than model (a). Similarly, model (a) is able to achieve a higher throughput than that of model (c). Note that among models (a), (b) and (c), EH model (b) is more likely to shift from one state to the other, while model (c) is likely to keep staying in either “BAD” or “GOOD” state. The EH model (a) behaves in between. The observation is that when the EH rate varies more dramatically, it has larger randomness, which we interpret as a higher EH diversity. Accordingly, our proposed opportunistic scheme takes advantage of such EH diversity by exploiting the EH variation, where the transmitter could opportunistically wait or start the transmission depending on the energy state.
Finally, over i.i.d. channels, the throughput performance of the MU with different power supplies and transmission schemes is shown in Fig. 8. To make them comparable, we let W. The EH-based transmitter with the opportunistic transmission scheme could achieve about of the throughput with the optimal power allocation, which is relatively worse than the Markovian case as shown in Fig. 3.
6 Conclusion
In this paper, we considered a HetNet uplink with multichannel access, where each EH-powered MU has deterministic access to a private channel linked to the cellular BS, and random access to a common channel linked to a local AP. As such, the MU could fulfil a transmission via its private channel or via both private and common channels. By jointly taking advantage of channel-energy variation and common channel sharing, we proposed an opportunistic transmission scheme that allows the transmitter to properly probe the channel-energy state, such that the average transmission rate is maximized. In particular, we formulated the average throughput maximization problem as a rate-of-return optimal stopping problem. By applying optimal stopping theory, we proved that the optimal stopping rule exists and, in general, has a state-dependent, threshold-based structure. Moreover, when the private channel gains and EH rates are each i.i.d., the optimal stopping rule reduces to a simple pure-threshold policy. We also found the optimal power allocation scheme for a transmitter powered by a conventional power supply, to serve as a performance benchmark. Numerical results validated the analysis with both Markovian and i.i.d. statistical models for the private channel gains and EH rates. We showed that under a renewable energy supply, the proposed opportunistic transmission scheme could achieve a higher throughput than best-effort delivery. Our simulation results also revealed the throughput gap between the cases with conventional and renewable energy supplies. Furthermore, we briefly discussed the phenomenon of EH diversity, which the proposed pure-threshold policy can exploit to enhance the throughput performance.
Appendices
6.1 Proof of Proposition 3.1
According to the optimal stopping theory [30, 31], the existence of the optimal stopping rule could be proved by checking the following two conditions: For a given ,
 C1:

;
 C2:

a.s..
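For reference, existence conditions of this kind are commonly stated in the optimal stopping literature as follows. The symbol $Y_n$, the reward for stopping at step $n$, is a generic placeholder assumed here and may differ from the notation used in this paper.

```latex
% Standard existence conditions for an optimal stopping rule,
% written with a generic reward sequence Y_n (assumed notation).
\begin{align*}
\text{C1:}\quad & \mathbb{E}\Big[\sup_{n} Y_n\Big] < \infty, \\
\text{C2:}\quad & \limsup_{n \to \infty} Y_n \le Y_\infty \quad \text{a.s.}
\end{align*}
```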
We first check C1 and C2 for and , respectively.

: For C1, we have . Since the channel gains are finite a.s., and the battery capacity is finite, the expectation of the transmission rate is finite as well, which proves that C1 holds. For C2, we only need to show that for any large negative real number , there exists a.s. such that for all , . In fact, for any , , which implies that . However, the term will increase to infinity as . Thus, when , can be as small as we want a.s., i.e., a.s., which proves that C2 holds.

: For this case, we check C2 first. Recall the expression of in (3) and given as , where is the maximum EH rate and is finite. Then, we have
(14) a.s. By using L’Hôpital’s rule [36], the first term in (14) satisfies
We could apply a similar check for the second term of (14). Thus, C2 holds. For C1, we could use the above results of C2 and obtain that , there exists an such that . Since the channel gains are finite a.s., and for all , , we obtain , which implies that C1 holds.
Therefore, both C1 and C2 hold for either or , which implies that the optimal stopping rule exists.
Next, we derive the optimal stopping rule. According to (7), we have . Meanwhile, satisfies the dynamic programming equation (equation (3) in [32]):
(15) 
Therefore, the optimal stopping rule has the following form
where the second equality holds due to (7). By letting , we obtain the form of as shown in (8).
Finally, we compute . By Lemma 3.1, makes the following equation hold:
Thus, we could obtain by some simple rearrangements.
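To make the rate-of-return idea concrete, the following sketch solves a deliberately simplified toy problem, not the system model of this paper. It assumes i.i.d. discrete rates, a fixed time cost per probe, and a fixed transmission duration; all function names and parameters are hypothetical. In this toy setting the optimal rate solves E[(X - λ)+] · tx_time = λ · probe_time, and stopping when the observed rate reaches that value attains it.

```python
# Toy rate-of-return optimal stopping (assumed model, for illustration only):
# each probe costs probe_time and reveals an i.i.d. rate X; after stopping,
# the sender transmits for tx_time at the observed rate.

def expected_excess(values, probs, lam):
    """E[(X - lam)^+] for a discrete distribution."""
    return sum(p * max(x - lam, 0.0) for x, p in zip(values, probs))

def optimal_rate(values, probs, probe_time, tx_time, tol=1e-12):
    """Bisection on lam for E[(X - lam)^+] * tx_time = lam * probe_time."""
    lo, hi = 0.0, max(values)
    ratio = probe_time / tx_time
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_excess(values, probs, mid) > mid * ratio:
            lo = mid  # left side still larger: root lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def threshold_throughput(values, probs, probe_time, tx_time, threshold):
    """Average throughput when stopping at the first X >= threshold."""
    p_stop = sum(p for x, p in zip(values, probs) if x >= threshold)
    if p_stop == 0:
        return 0.0  # the policy never stops
    mean_rate = sum(p * x for x, p in zip(values, probs) if x >= threshold) / p_stop
    expected_time = probe_time / p_stop + tx_time  # geometric number of probes
    return mean_rate * tx_time / expected_time

values, probs = [1.0, 3.0], [0.5, 0.5]
lam = optimal_rate(values, probs, probe_time=0.5, tx_time=1.0)
print(lam, threshold_throughput(values, probs, 0.5, 1.0, lam))
```

Comparing `threshold_throughput` at the threshold `lam` with the same function at threshold 0 (always transmit after the first probe) shows the gain from waiting for a better state in this toy example.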
6.2 Proof of Proposition 3.2
Recall from Proposition 3.1 that the optimal stopping rule exists, and it is easy to check that . Then, given some , there exists an such that for all , we have . Therefore, when we consider the expected value of , we can restrict attention to a finite horizon, i.e., . Then, by the dynamic programming algorithm (e.g., Theorem 2 of Chapter 3 in [31], or equation (3) in [32]), we have
Now, we show that is strictly increasing over by contradiction. First, we fix , and let increase to , where is a small positive real number. Then, we move backward. Note that at step , only depends on and does not change with . At , we observe that
Note that the private channel could not be strictly better than the common channel [4], i.e., it is unrealistic that . It follows that . Thus, we have that strictly increases as increases to .
Suppose that at for , strictly increases as increases to . Since the expected value of also strictly increases following a similar argument as we discussed at step , we have that the expected value of strictly increases. Then, at , we have
(16) 
which strictly increases and thus implies that such an increment holds for all .
At the step , we have
(17) 
where should also strictly increase as increases to . However, we recall from Proposition 3.1 that , which is attained by and . It implies that in order to make , the value should not be fixed and must strictly increase accordingly, which contradicts the assumption in the first step that is fixed. Thus, strictly increases as increases. Finally, this proposition is proved by letting (i.e., is large enough).
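The backward-induction step used in this kind of finite-horizon argument can be sketched on a generic toy problem. The model below is an assumption made purely for illustration (i.i.d. discrete rewards, a fixed cost for each additional observation, and forced stopping at the horizon); it is not the recursion of this paper, and the function name is hypothetical.

```python
# Finite-horizon optimal stopping by backward induction (toy model, assumed):
# at step n the reward X_n is observed; continuing costs `cost` and reveals
# X_{n+1}; at the final step the process must stop.

def backward_thresholds(values, probs, cost, horizon):
    """Return (thresholds for steps 1..horizon-1, value before step 1)."""
    # Expected payoff of arriving at the last step, where stopping is forced.
    cont = sum(p * x for x, p in zip(values, probs))
    thresholds = []
    for _ in range(horizon - 1):
        # Stop at this step iff the observed reward is at least `t`.
        t = cont - cost
        thresholds.append(t)
        # Expected payoff of arriving at this step under the optimal rule.
        cont = sum(p * max(x, t) for x, p in zip(values, probs))
    thresholds.reverse()  # computed from the last step back to the first
    return thresholds, cont

thresholds, value = backward_thresholds([1.0, 3.0], [0.5, 0.5], cost=0.5, horizon=3)
print(thresholds, value)
```

In this toy example the stopping threshold depends on the step index, which mirrors the time-dependent threshold structure that arises when the horizon is finite.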
6.3 Proof of Proposition 3.3
For Property 1), it is straightforward to see that
(18) 
For Property 2), suppose that the transmitter does not stop channel-energy probing until time ; then starting at , we should have . Thus, could be written as
due to . Then, with a fixed such that , along with Property 1), is expanded as
(19)  
(20) 
where the second inequality holds due to for . Note that we do not put the time index on and since and are i.i.d., respectively. Next, we want to show that both (19) and (20) are finite and could be as small as we want with a large , which would complete the proof for 2).

For (19): by plugging , we obtain
since has finite mean and are i.i.d. with finite mean as well. Moreover, if , .

For (20): Since , and both and have finite means, respectively, it follows that (20) is finite. When the transmitter occupies the common channel at time , there are three possible events by Lemma 2.1: if , all power is allocated to one of the two channels; otherwise, the power is split between both channels at a certain ratio. Note that the probability of each of these events does not depend on if is large enough. To see this point, we let
When is large, we have
and similarly, we have
Then, by applying , and , we can expand (20) as
By reasoning similar to that for (19), we obtain that as .
Therefore, we conclude that is finite and could be arbitrarily small when is large enough.
For 3), we expand as
since only are correlated over time. By Property 2), we know is finite and thus is finite since is a finite space. Moreover, by Property 2), we have