Optimal Channel Sensing Strategy for Cognitive Radio Networks with Heavy-Tailed Idle Times
In a Cognitive Radio Network (CRN), the secondary user (SU) opportunistically accesses the wireless channels whenever they are free from the licensed Primary User (PU). Even after occupying a channel, the SU has to sense it intermittently to detect the reappearance of the PU, so that it can stop its transmission and avoid interference to the PU. Frequent channel sensing degrades the SU’s throughput, whereas sparse sensing increases the interference experienced by the PU. Thus, the sensing interval policy plays a vital role in CRN. In the literature, the optimal channel sensing strategy has been analyzed for the case when the ON-OFF time distributions of the PU are exponential. However, analysis of recent spectrum measurement traces reveals that the PU exhibits heavy-tailed idle times, which can be approximated well by a hyper-exponential distribution (HED). In this work, we deduce the structure of the optimal sensing interval policy for channels with HED OFF times through a Markov Decision Process (MDP). We then use a dynamic programming framework to derive sub-optimal sensing interval policies. A new Multishot sensing interval policy is proposed and compared with existing policies in terms of the number of channel sensings and the interference to the PU.
In recent years, the usage of wireless devices such as smartphones and laptops has grown exponentially. A major concern over this growth is that a large number of wireless devices are now trying to access a limited wireless spectrum. Further, spectrum measurement campaigns have shown that the fixed spectrum assignment policy for wireless devices has resulted in underutilization of the allotted bandwidth. Hence, to solve this problem and to achieve better spectrum utilization, researchers have proposed the Cognitive Radio Network (CRN). In a CRN, the licensed bands are made available to unlicensed users, also called Secondary Users (SUs), whenever the licensed or Primary Users (PUs) are not using the spectrum. The channel sensing parameters of the SUs, such as sensing time, sensing interval, and sensing accuracy, have an impact on the performance of both the secondary and the primary network. Many works in the literature [2, 3, 4, 5, 6, 7], and references therein, have studied the effect of PHY and MAC layer sensing parameters on the throughput of the secondary network. Most of these papers have assumed the ON and OFF time distributions of the PU’s channel occupancy to be exponential.
However, an in-depth analysis of spectrum measurement traces reveals that the idle times of ISM and GSM bands exhibit power law decay till some critical time after which it has exponential decay , . By power law decay, we mean that the log-log plot of probability density function of channel idle times, given by , will be a straight line with negative slope . The data sets with above behavior have been shown to be well modeled with Hyper-exponential distribution (HED) , . Similarly, Sharma et al.  have simulated 802.11 WLAN clients-server model in OPNET simulator and have observed that the channel idle times can be modeled using HED distribution. The authors of  and  have proposed an optimal SU sensing / transmission strategy to maximize the throughput of SU with constraint on PU packet collision for generalized as well as hyper-exponential PU idle time distributions.
Many of the existing works have made the unrealistic assumption that SUs have full-duplex capability. A full duplex SU can transmit signal and detect the reappearance of PU at the same time. However, the design of full-duplex system with acceptable PU detection probability is highly complex. Further, it increases the energy consumption of the SU. A promising use-case scenario of secondary network is the wireless sensor network wherein it is not cost effective to deploy full-duplex SU. Considering the practical difficulties in implementing full-duplex SU, we look at the design of opportunistic secondary network that has half-duplex capability. In a full-duplex system the SU can stop its transmission as soon as the PU is detected. Thus the interference to PU can be kept minimal (maximum of one PU packet as a result of packet header corruption). However in the case of half-duplex system the interference of SU with PU is in general more than that of full-duplex case and can be as large as the inter-sensing duration. Thus the problem of finding optimal sensing strategy becomes even more crucial in the half-duplex system.
The major contributions of our work, which differ from the existing literature on channel sensing strategies , ,  in CRN, are as follows. First, in contrast to , we model PU OFF times as HED, which is more realistic. Second, we design an optimal channel sensing interval framework for half-duplex SUs, whereas ,  have assumed full-duplex SUs. Finally, we frame an optimization problem that minimizes both the cost of channel sensing and the cost of interference to the PU by choosing optimal channel sensing intervals. We use dynamic programming to derive the optimal channel sensing interval policy. Interested readers can refer to [15, 16] for applications of dynamic programming to other optimization problems in CRN.
The important insight of our work is that the constant periodic sensing policy is not an optimal solution for non-exponential PU OFF times. We prove this point by deducing the structure of the optimal solution using a Markov Decision Process (MDP). We further suggest a new sub-optimal policy called “Multishot sensing interval policy”, which outperforms existing sub-optimal channel sensing policies in the literature .
The rest of the paper is organized as follows. A brief overview of the system model is given in section II. In section III, we formulate the optimization problem that balances the number of SU channel sensings against the interference to the PU. The structure of the optimal solution is derived using an MDP in section IV. In section V, different sub-optimal channel sensing interval policies are proposed. Section VI compares the performance of the sub-optimal policies under various channel traffic conditions through simulation. Section VII studies the effect of the channel sensing parameters. Finally, we conclude the paper in section VIII.
II System Model
We consider a CRN with half-duplex SUs. Following the studies on spectrum measurement traces, we model the OFF time distribution of the PU as heavy-tailed. As mentioned earlier, the heavy-tailed idle times (OFF times) of the PU are well modeled by a K-phase HED,
$$f_X(x) = \sum_{i=1}^{K} p_i \lambda_i e^{-\lambda_i x}, \quad x \geq 0, \qquad (1)$$
where the $p_i$’s are the phase probabilities such that $\sum_{i=1}^{K} p_i = 1$, and the $\lambda_i$’s are the rates of the exponential distributions in the mixture. (A random variable $X$ is said to follow HED if $X$ is, with probability $p_i$, exponentially distributed with parameter $\lambda_i$ for $i = 1, \dots, K$.) Realistic spectrum measurement traces can be used to estimate the parameters $p_i$ and $\lambda_i$ of the HED as shown in . Suppose the SU senses the channel to be free from the primary user and occupies it. Let $t_0$ be the time at which the SU occupies the channel. In order to avoid interference to the PU, the half-duplex SU has to limit its transmission duration and intermittently sense the channel for the reappearance of the PU. Let the sensing instants of the SU fall at times $t_1, t_2, \dots, t_N$, where the index $N$ denotes the sensing instant at which the SU detects the presence of the PU. We denote the channel sensing intervals $(t_0, t_1), (t_1, t_2), \dots, (t_{N-1}, t_N)$ by $b_1, b_2, \dots, b_N$, respectively, as indicated in Fig. 1. Our formulation of the optimal sensing interval policy minimizes the number of sensings as well as the cost of the interference to the PU.
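To make the K-phase mixture concrete, the following minimal Python sketch evaluates the HED density and survival function and draws OFF times by first picking a phase and then sampling the corresponding exponential. The function names, the symbols `probs` and `rates` (standing for the phase probabilities and phase rates), and the two-phase parameter values are illustrative assumptions, not values estimated from any measurement trace.

```python
import math
import random

def hed_pdf(x, probs, rates):
    """Density of a K-phase hyper-exponential distribution at x >= 0."""
    return sum(p * lam * math.exp(-lam * x) for p, lam in zip(probs, rates))

def hed_survival(x, probs, rates):
    """P(X > x) for the same mixture."""
    return sum(p * math.exp(-lam * x) for p, lam in zip(probs, rates))

def hed_sample(probs, rates, rng=random):
    """Pick a phase with its phase probability, then draw an exponential OFF time."""
    lam = rng.choices(rates, weights=probs, k=1)[0]
    return rng.expovariate(lam)

# Made-up 2-phase parameters, used only for illustration.
PROBS = [0.7, 0.3]
RATES = [2.0, 0.2]

# Mean OFF time of the mixture: sum of p_i / lambda_i.
MEAN_OFF = sum(p / lam for p, lam in zip(PROBS, RATES))
```

The slow phase (rate 0.2) carries only 30% of the probability but most of the mean, which is the heavy-tail effect that the rest of the analysis relies on.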
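We assume that the SU accesses the channel immediately when the PU goes from the ON state to the OFF state. The effect of delayed occupancy of the channel is studied in section VII. The residual PU OFF time at the $j$-th sensing instant, $t_j$, also has an HED with the same rates $\lambda_i$ but with different phase probabilities. For example, the phase probabilities of the residual PU OFF time at the first sensing instant $t_1$, denoted by $p_1^{(1)}, \dots, p_K^{(1)}$, can be calculated from the elapsed interval $b_1 = t_1 - t_0$ as follows:
$$p_i^{(1)} = \frac{p_i e^{-\lambda_i b_1}}{P(X > b_1)}, \quad i = 1, \dots, K,$$
where $P(X > b_1)$ is given as
$$P(X > b_1) = \sum_{k=1}^{K} p_k e^{-\lambda_k b_1}.$$
In general, the phase probabilities at the $j$-th sensing instant are given by
$$p_i^{(j)} = \frac{p_i e^{-\lambda_i (t_j - t_0)}}{\sum_{k=1}^{K} p_k e^{-\lambda_k (t_j - t_0)}},$$
which can be rewritten as
$$p_i^{(j)} = \frac{p_i^{(j-1)} e^{-\lambda_i b_j}}{\sum_{k=1}^{K} p_k^{(j-1)} e^{-\lambda_k b_j}}, \quad \text{with } p_i^{(0)} = p_i.$$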
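The update of the phase probabilities is a Bayes step: each phase is reweighted by the probability that an exponential with that phase's rate survives the elapsed interval. A minimal sketch follows, in which the function name and argument names are our own choices.

```python
import math

def residual_phase_probs(probs, rates, elapsed):
    """Phase probabilities of the residual OFF time, given the PU is still OFF
    after `elapsed` time units."""
    weights = [p * math.exp(-lam * elapsed) for p, lam in zip(probs, rates)]
    total = sum(weights)  # equals P(X > elapsed)
    return [w / total for w in weights]
```

Applying the update interval by interval gives the same vector as applying it once with the total elapsed time, and for long elapsed times the mass concentrates on the phase with the smallest rate.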
Let denote the cost per channel sensing. Let denote the cost which is a measure of the interference to PU per unit time. The costs and can be a measure of channel sensing/switching energy and SU retransmission energy, respectively. Alternatively, they can represent the time for channel sensing/switching and the time for SU retransmission. We now define an indicator random variable to represent the SU’s interference to PU in sensing interval as follows:
Then, the average amount of interference to PU in sensing interval, denoted as , is calculated as
Let and be the weights (importance) that we assign to balance the number of channel sensing by SU and Interference to PU. Thus, the average cost incurred by SU at channel sensing instant for choosing next sensing interval as is given by
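As an illustration of the per-stage cost, the sketch below assumes that the stage cost has the form w·c_s + (1 − w)·c_I·E[interference], where w weights the sensing cost, 1 − w weights the interference cost, and the expected interference in an interval of length b is E[(b − X)·1{X ≤ b}] for a residual OFF time X with HED. The symbol names (`w`, `c_s`, `c_i`, `b`) and this exact form are our reading of the definitions above, not a transcription of (6).

```python
import math

def expected_interference(probs, rates, b):
    """E[(b - X) 1{X <= b}] for X ~ HED(probs, rates): mean overlap with the PU
    when the next sensing happens b time units from now."""
    return sum(p * (b - (1.0 - math.exp(-lam * b)) / lam)
               for p, lam in zip(probs, rates))

def stage_cost(probs, rates, b, w, c_s, c_i):
    """Assumed stage cost: weighted sensing cost plus weighted interference cost."""
    return w * c_s + (1.0 - w) * c_i * expected_interference(probs, rates, b)
```

For a single exponential phase with rate 1 and b = 1, the expected interference evaluates to e^{-1}, which is a quick sanity check on the closed form.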
III Formulation of Optimization Problem
We formulate the problem of optimal channel sensing interval mechanism in this section. Let be the sensing instant at which SU detects the presence of PU. is given by . Note that the sensing interval is chosen by the SU at sensing instant. Let denote the probability that the residual OFF time is greater than . It is given by
By using the average cost function per sensing instant given by (6), the average total cost incurred by the SU during OFF time of the PU can be calculated as
The total cost function can be rewritten as
where we use the fact that to get the second equality. By observing the SU channel sensing activity in Fig. 1, we can also write total cost function, as
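The two ways of writing the total cost can be checked numerically. The sketch below does this for a constant sensing interval b, assuming the stage cost w·c_s + (1 − w)·c_I·E[interference] and the aggregate form w·c_s·E[N] + (1 − w)·c_I·(E[t_N] − E[X]); both forms and all symbol names are our reconstruction, so this is a consistency check under stated assumptions rather than the paper's own computation.

```python
import math

def total_cost_stagewise(probs, rates, b, w, c_s, c_i, tol=1e-15, max_stages=100000):
    """Sum over stages of P(stage is reached) times the stage cost."""
    total = 0.0
    for j in range(max_stages):
        weights = [p * math.exp(-lam * j * b) for p, lam in zip(probs, rates)]
        survival = sum(weights)          # P(PU still OFF at the j-th sensing instant)
        if survival < tol:
            break
        state = [x / survival for x in weights]   # residual phase probabilities
        interference = sum(q * (b - (1.0 - math.exp(-lam * b)) / lam)
                           for q, lam in zip(state, rates))
        total += survival * (w * c_s + (1.0 - w) * c_i * interference)
    return total

def total_cost_aggregate(probs, rates, b, w, c_s, c_i):
    """Same cost written with E[N] and the overshoot E[t_N] - E[X]."""
    expected_n = sum(p / (1.0 - math.exp(-lam * b)) for p, lam in zip(probs, rates))
    mean_off = sum(p / lam for p, lam in zip(probs, rates))
    return w * c_s * expected_n + (1.0 - w) * c_i * (b * expected_n - mean_off)
```

The agreement of the two functions reflects the identity that the probability of reaching stage j equals the product of the per-stage survival probabilities.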
We now formally state our optimization problem as follows. Our objective is to find the optimal channel sensing intervals such that is minimized, i.e.,
Note that the variables in above optimization problem take values from and the sensing index is a function of and .
IV Dynamic Programming Framework
A closer look at the system model and the objective function (Equations (8) and (9)) shows that, at each sensing instant, the SU has to select the next sensing interval while accounting for the past sensing intervals in order to minimize the overall cost of the sensing process. This observation suggests Stochastic Dynamic Programming (SDP) as a tool to solve the above optimization problem . SDP is a generic method for solving complex problems by breaking them into sub-problems. To solve the optimization problem using SDP, the optimal solution must be decomposable into sub-problems. The total cost function defined in (8) has such a decomposable structure, and hence we can use SDP to arrive at the optimal solution. We formulate the SDP as
where is the minimum cost at time . If the value of , the minimal cost will be same as optimal total cost through recursion of (11).
IV-A Structure of Markov Decision Process
We model the above optimization problem as a Markov Decision Process (MDP) whose state space is the set of all probability vectors $\mathbf{p} = (p_1, \dots, p_K)$ such that $p_i \geq 0$ and $\sum_{i=1}^{K} p_i = 1$. The action space of the MDP is the whole non-negative real line $[0, \infty)$. At the $j$-th sensing instant, the state is the probability vector $\mathbf{p}^{(j)}$ of phase probabilities of the residual OFF time. The probability vector at the zeroth sensing instant, $\mathbf{p}^{(0)}$, is the initial state of the MDP. The SU chooses an action $b_1$ from the action space at state $\mathbf{p}^{(0)}$, which incurs a stage cost. At the first channel sensing instant, the system is in state $\mathbf{p}^{(1)}$. The SU chooses an action $b_2$, which moves the system to state $\mathbf{p}^{(2)}$. Similarly, at the $j$-th channel sensing instant, the system is in state $\mathbf{p}^{(j)}$, which depends only on the previous state $\mathbf{p}^{(j-1)}$ and the action $b_j$ (i.e., the state process satisfies the Markov property). The cost for choosing an action $b_{j+1}$ at state $\mathbf{p}^{(j)}$ is given by (6).
IV-A1 Countable state space
IV-A2 Compact action space
The action space of the above MDP can be restricted to the compact set without loss of optimality. The is the upper bound on channel sensing intervals that the SU can choose from and is given as (Lemma III.4 in )
where is the upper bound on the total expected cost when SU always take channel sensing interval of unit length,
We have shown that the MDP structure of our optimization problem has countable state space, compact action space and a non-negative cost function. From the above discussion, we conclude that the optimal policies for MDP can be restricted to non-randomized decision policies and the action space is restricted to . Thus, the minimum total cost function can be achieved by the minimal solution of the following SDP :
where the random variable follows HED with parameters and phase-probabilities , the probability vector is a function of and action . The optimal channel sensing interval for a given is the one that minimizes the above equation.
IV-B Periodic sensing interval for exponential OFF times
We now demonstrate the correctness of MDP framework by deriving the optimal sensing interval policy for the well-known case of channels with exponential OFF times. The exponential distribution can be considered as a special case of HED with number of phases with and . Thus the above MDP framework can be used to derive the optimal sensing interval policy for exponential PU OFF time distribution. In this case, the residual OFF time at every sensing instant has same exponential distribution and thus results in one-state MDP problem, i.e. . The optimal sensing interval is found by minimizing the total cost function of SDP given below:
By substituting and in the above equation and re-arranging,
The second derivative of above function is given as
where for . Thus, for and hence the total cost function of SDP is a convex function. On differentiating Eq. (14) w.r.t and equating to zero, we get
Multiplying both sides of above equation by , we get
which can be written in the form , where and . The solution of the above form is Lambert-W function  at point , i.e,
where denotes the branch of the Lambert-W function that is real-valued on the interval with values below -1. From the above equation, we will get the optimal sensing interval which is used by SU at all sensing instants (i.e. periodic sensing interval ) as
Thus, we have shown that (15) is equivalent to the results derived in . We have also proven that the optimal sensing policy for exponential OFF time distribution is periodic sensing with sensing interval .
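The periodic optimum can also be obtained numerically without a Lambert-W routine. The sketch below assumes a stage cost of the form w·c_s + (1 − w)·c_I·E[interference], for which the total cost of sensing every b time units under exponential OFF times with rate λ is (w·c_s + (1 − w)·c_I·b)/(1 − e^{−λb}) − (1 − w)·c_I/λ. Setting its derivative to zero gives e^{λb} = 1 + λb + λr with r = w·c_s/((1 − w)·c_I), which is equivalent to b = −(1/λ)[1 + λr + W_{−1}(−e^{−(1+λr)})]. The cost form, the symbol names and the use of bisection instead of the Lambert-W function are assumptions of this sketch.

```python
import math

def periodic_cost(b, lam, w, c_s, c_i):
    """Total expected cost of sensing every b time units (assumed cost form)."""
    a, beta = w * c_s, (1.0 - w) * c_i
    return (a + beta * b) / (1.0 - math.exp(-lam * b)) - beta / lam

def optimal_periodic_interval(lam, w, c_s, c_i, tol=1e-12):
    """Root of exp(lam*b) = 1 + lam*b + lam*r by bisection; requires 0 < w < 1."""
    ratio = (w * c_s) / ((1.0 - w) * c_i)

    def g(b):
        return math.exp(lam * b) - 1.0 - lam * b - lam * ratio

    lo, hi = 0.0, 1.0 / lam
    while g(hi) <= 0.0:          # g is increasing for b > 0, so expand until it is positive
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

Because the stationarity function is monotone for positive b, the bracket-and-bisect search finds the unique positive root, which corresponds to the real branch of the Lambert-W function with values below −1.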
V Sub-optimal Policies
The hyper-exponential distribution given in (1) is a convex combination of exponential distributions. Using concepts from reliability theory, we can show that HED has a decreasing failure rate, i.e., the probability that the residual OFF time at the $j$-th sensing instant exceeds a given duration increases with $j$ . As a result, the optimal policy for (11) has to account for an infinite number of optimal actions (sensing intervals), one for each sensing instant. Moreover, the formulated MDP has continuous state and action spaces. Thus, deriving the optimal sensing intervals (i.e., actions) for HED OFF times is computationally complex, and we therefore resort to sub-optimal policies. First, we adapt some of the existing sub-optimal policies  to our cost function given in (16). We then suggest a new policy called “Multishot sensing interval policy”, which outperforms existing sub-optimal policies in many scenarios.
V-A Exponential sensing interval policy
In the exponential sensing interval policy, the SU at each sensing instant (state) selects the next sensing interval (action) as a realization of an exponential random variable with parameter . The optimal exponential parameter is derived as follows. The total cost function for the exponential sensing interval policy is calculated as:
where , & are
Substituting the above values in (16), we will get for exponential sensing interval policy as
Taking the first and second order derivative of with respect to , we get
and the minimal total cost as
Thus, at each sensing instant the SU draws the next channel sensing interval from an exponential distribution with parameter . We plot the optimal parameter given in (18) against the weight in Fig. 2, using numerical computation in C++. In our simulation, we vary the channel load by varying only the HED OFF times as in . Thus, the average PU OFF time decreases as the channel load increases. We can observe from Fig. 2 that the mean optimal sensing interval decreases as the channel load increases. Further, the SU senses the channel more frequently when more importance is given to reducing interference to the PU, i.e., as the weight on the sensing cost approaches 0.
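Under the exponential sensing interval policy the sensing instants form a Poisson process, which leads to simple closed forms. The sketch below assumes a sensing rate μ, a mean OFF time E[X], and the cost form w·c_s·E[N] + (1 − w)·c_I·E[interference]; then E[N] = 1 + μ·E[X], the expected interference is 1/μ by memorylessness, and the minimizing rate is μ* = sqrt((1 − w)·c_I / (w·c_s·E[X])). These expressions are our reconstruction under the stated assumptions and are not copied from (16) to (19).

```python
import math

def exponential_policy(mean_off, w, c_s, c_i):
    """Optimal rate and resulting costs for i.i.d. exponential sensing intervals.

    Returns (mu, expected_sensings, expected_interference, total_cost).
    Requires 0 < w < 1.
    """
    mu = math.sqrt((1.0 - w) * c_i / (w * c_s * mean_off))
    expected_sensings = 1.0 + mu * mean_off      # Poisson points in [0, X], plus the detecting one
    expected_interference = 1.0 / mu             # overshoot past the PU's return
    total = w * c_s * expected_sensings + (1.0 - w) * c_i * expected_interference
    return mu, expected_sensings, expected_interference, total
```

With the illustrative inputs mean OFF time 1.757, w = 0.1, c_s = 5 and c_I = 1 (all guessed here, none stated in the text), the sketch gives about 2.78 sensings, 0.99 time units of interference and a total cost near 2.28, which is close to the first Exponential entries for light traffic in Table I.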
V-B One-stage sensing interval policy
One-stage sensing interval policy is a policy improvement over first stage (zeroth sensing instant) of existing exponential sensing interval policy. In one-stage sensing policy, the SU uses the SDP formulation given in (11) to select only the first sensing interval . Thereafter, the SU follows exponential sensing interval policy by replacing random variable with following residual HED OFF time distribution with phase probabilities . The value of in (11) will be , as given in (19).
The optimal parameters and for one-stage sensing interval policy are derived as follows:
(i) Evaluate the upper bound on sensing interval, i.e. using (IV-A2).
(ii) Vary the values of from zero to in steps of (In our simulation, we set based on analysis of )
(iii) For each value of , calculate the cost and the probability vector using (6) and (4), respectively. For the remaining stages, the exponential sensing interval policy is used.
(iv) For each value of , the parameter of exponential sensing interval policy is calculated using (18) by replacing with . Similarly, the is calculated using (19) with .
(v) The total cost of one-stage policy, , is given as
The value of which minimizes the is taken as the optimal first sensing interval and its corresponding exponential parameter is taken as for one-stage sensing interval policy.
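Steps (i) to (v) amount to a one-dimensional grid search over the first sensing interval. The sketch below is a minimal version under our own assumptions: the stage cost is w·c_s + (1 − w)·c_I·E[interference], the continuation cost of the exponential policy from a residual state with mean OFF time m is w·c_s + 2·sqrt(w·c_s·(1 − w)·c_I·m), and the upper bound `b_max` and step `step` are hypothetical inputs standing in for the bound and step size of steps (i) and (ii).

```python
import math

def exponential_tail_cost(mean_off, w, c_s, c_i):
    """Minimal total cost of the exponential sensing policy (assumed closed form)."""
    return w * c_s + 2.0 * math.sqrt(w * c_s * (1.0 - w) * c_i * mean_off)

def one_stage_policy(probs, rates, w, c_s, c_i, b_max, step):
    """Grid search over the first interval; returns (total_cost, b1, mu_after)."""
    best = None
    for k in range(1, int(round(b_max / step)) + 1):
        b1 = k * step
        weights = [p * math.exp(-lam * b1) for p, lam in zip(probs, rates)]
        survival = sum(weights)                      # P(PU still OFF at first sensing)
        state = [x / survival for x in weights]      # residual phase probabilities
        mean_res = sum(q / lam for q, lam in zip(state, rates))
        interference = sum(p * (b1 - (1.0 - math.exp(-lam * b1)) / lam)
                           for p, lam in zip(probs, rates))
        total = (w * c_s + (1.0 - w) * c_i * interference
                 + survival * exponential_tail_cost(mean_res, w, c_s, c_i))
        if best is None or total < best[0]:
            mu = math.sqrt((1.0 - w) * c_i / (w * c_s * mean_res))
            best = (total, b1, mu)
    return best
```

For the made-up parameters `probs = [0.7, 0.3]`, `rates = [2.0, 0.2]`, `w = 0.5`, `c_s = 5`, `c_i = 1`, the search returns a total cost below that of the pure exponential policy started from the same state, as expected from a policy improvement step. The cost of the search grows linearly with the number of grid points `b_max / step`.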
V-C Multishot sensing interval policy
We propose a new sub-optimal policy called “Multishot sensing interval policy”, based on the observation that the probability vector $\mathbf{p}^{(j)} \to (0, \dots, 0, 1)$ as $j \to \infty$ when we arrange the HED parameters such that $\lambda_1 > \lambda_2 > \dots > \lambda_K$. At the zeroth sensing instant, the SU assumes that the idle time follows an exponential random variable with parameter $\lambda_1$ and uses the periodic sensing interval policy to derive the first sensing interval $b_1$. If the channel is still idle at the first sensing instant, the SU assumes that the idle time was generated by an exponential random variable with parameter $\lambda_2$ and uses the periodic sensing interval policy to derive $b_2$. The SU keeps changing the parameter of the exponential RV until it reaches the $(K-1)$-th sensing instant, where it uses $\lambda_K$ to derive $b_K$. If the channel is still free from the PU, the SU uses $b_K$ as the sensing interval for the remaining sensing instants. We will show through simulation that the multishot policy outperforms existing sub-optimal policies in most of the test cases.
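The multishot schedule can be sketched in a few lines: sort the phase rates in decreasing order, compute one periodic-optimal interval per rate, and hold the last interval once all phases have been used. The sketch assumes that the periodic optimum for rate λ solves e^{λb} = 1 + λb + λ·w·c_s/((1 − w)·c_I), which follows from a stage cost of the form w·c_s + (1 − w)·c_I·E[interference]; this cost form, the bisection solver and all names are assumptions of the example.

```python
import math

def periodic_interval(lam, w, c_s, c_i, tol=1e-12):
    """Periodic-optimal interval for exponential OFF times with rate lam (0 < w < 1)."""
    ratio = (w * c_s) / ((1.0 - w) * c_i)
    g = lambda b: math.exp(lam * b) - 1.0 - lam * b - lam * ratio
    lo, hi = 0.0, 1.0 / lam
    while g(hi) <= 0.0:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if g(mid) > 0.0 else (mid, hi)
    return 0.5 * (lo + hi)

def multishot_intervals(rates, w, c_s, c_i):
    """One interval per phase, fastest phase first."""
    ordered = sorted(rates, reverse=True)
    return [periodic_interval(lam, w, c_s, c_i) for lam in ordered]

def next_interval(intervals, sensing_index):
    """Interval chosen at the given sensing instant (0-based); the last one is held."""
    return intervals[min(sensing_index, len(intervals) - 1)]
```

Since the periodic-optimal interval grows as the rate shrinks, the resulting schedule senses often at first and progressively less often as the channel stays idle. Only K root-finding calls are needed, independent of any time grid.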
V-D Computational complexity of suboptimal policies
We now compare the computational complexity in obtaining the optimal parameters of one-stage and multishot sub-optimal channel sensing policies. In case of exponential and multishot policy, the expected number of times the sensing interval is computed is a bounded constant and hence the order of complexity is . For the case of one-stage sub-optimal policy, we fix the appropriate time step and then evaluate Eq.(20) for number of times to get the optimal parameters. We can observe that the order of complexity of one-stage sub-optimal policy is .
In similar lines to that of one-stage sub-optimal policy, it is possible to derive Mth stage suboptimal policy but the computational complexity will be of order . As the number of stages M increases, the suboptimal policy gets closer to the optimal solution. When , we will get the optimal policy (based on the observation that the vector as )
VI Simulation Results
We calculate the optimal parameters of the different sub-optimal policies numerically using C++. Then, we simulate the PU channel occupancy patterns, where OFF times are generated with the HED given in , . When the channel becomes free from the PU, the SU accesses the channel using one of the sub-optimal policies. We evaluate the performance of the different sub-optimal policies in terms of total cost through a simulator written in C++.
The optimal sensing intervals of the multishot sub-optimal policy, $b_1, \dots, b_K$, are calculated using (15) for the parameters $\lambda_1, \dots, \lambda_K$, respectively. From the $(K-1)$-th sensing instant onwards, the SU always chooses $b_K$ as the channel sensing interval.
In our simulation, we generated the channel occupancy model using two sets of HED parameters given in  and  (Light traffic – load 0.1 and Medium traffic – load [0.3,0.5]). Then, we evaluated the performance of different sub-optimal policies in terms of average number of channel sensing , average interference to PU (in time units) and which are plotted in Figs. 3 – 5, respectively. We observed that the interference to PU is less in case of multishot policy as compared to other sub-optimal policies.
The performance of all sub-optimal policies for channels with cost functions and is tabulated in Table I. We can observe that the multishot policy outperforms the exponential sensing interval policy in terms of total cost in all types of traffic conditions. The performance of the sub-optimal policies also depends on the channel’s traffic conditions as well as the costs and . For example, we observe from Fig. 5(a) and Table I that the cross-over point of for the multishot and one-stage sub-optimal policies varies with a change in costs.
In general, the proposed multishot policy outperforms the one-stage sub-optimal policy when more weight is given to reducing interference to the PU. The one-stage policy outperforms the multishot policy if more importance is given to reducing the number of channel sensings. (The cross-over point of for the multishot and one-stage policies varies with , and also with the HED parameters.) However, the major advantage of the multishot sub-optimal policy is that the complexity of calculating the parameters of the one-stage policy is very high () compared to that of the multishot policy ().
|HED Params||Policy||No. of sensing||Interference||Total cost|
|Light Traffic ||Exponential||2.778, 1.905, 1.593, 1.388||9.881×10^{-1}, 1.941, 2.963, 4.529||2.278, 4.217, 5.463, 6.217|
|Medium Traffic ||Exponential||1.634, 1.323, 1.211, 1.138||3.521×10^{-1}, 6.918×10^{-1}, 1.056, 1.613||1.134, 2.468, 3.556, 4.468|
|5-phase HED ||Exponential||2.912, 1.973, 1.637, 1.417||1.065, 2.093, 3.196, 4.880||2.415, 4.425, 5.691, 6.424|
VII Effect of Sensing Parameters and Delayed Occupancy
VII-A Effect of delayed occupancy
When the channel is busy due to transmission of PU, the SU has to periodically sense the channel for spectrum opportunity following a busy-period channel sensing strategy. As a result the SU cannot occupy the channel as soon as it is released by the PU resulting in the missed spectrum opportunity. We now account for the effect of the delayed channel occupancy by the SU on the sub-optimal channel sensing policies.
Let the random variable X denote the OFF time of the PU. Let the interval between the time the channel becomes free and the time it is sensed and occupied by the SU be denoted by the random variable , with p.d.f. . The residual channel idle time, obtained by subtracting the missed opportunity from the PU’s OFF time, still follows an HED but with different phase probabilities, as derived below. The p.d.f. of the residual channel idle time, denoted as , is calculated as
Thus, the remaining channel idle time after delayed occupancy follows an HED, irrespective of the SU’s busy-period sensing interval mechanism, with the same but with different phase probabilities . (In a multi-channel scenario, the validity of this assumption depends on the sensing duration, channel sensing order, channel switch delay and transmit/receive mode switch delays for a half-duplex SU.)
For example, we have considered an exponential sensing interval policy with parameter for the SU’s busy-period sensing. As a result of the memoryless property of the exponential distribution, the missed opportunity due to delayed occupancy also follows the same exponential distribution, i.e. . We have plotted the normalized throughput of the SU against for different values of in Fig. 6. The normalized throughput decreases as the weight on interference in the total cost function decreases. When the weight for interference to the PU decreases, the optimal sensing intervals become larger, and hence the throughput is lower due to interference with the PU.
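For the exponential busy-period example, the modified phase probabilities have a simple form. The sketch below assumes that the delay Y before occupancy is exponential with rate `busy_rate`, is independent of the OFF time X, and that we condition on the SU catching the OFF period at all (X > Y). Under these assumptions each phase probability is reweighted by E[e^{−λ_i Y}] = busy_rate / (busy_rate + λ_i). The names and the conditioning are our assumptions, not a transcription of the derivation above.

```python
def delayed_phase_probs(probs, rates, busy_rate):
    """Phase probabilities of the idle time left after a delayed occupancy.

    Returns (new_probs, catch_probability), where catch_probability = P(X > Y).
    """
    weights = [p * busy_rate / (busy_rate + lam) for p, lam in zip(probs, rates)]
    catch_probability = sum(weights)
    return [x / catch_probability for x in weights], catch_probability
```

Slow phases gain weight because long OFF periods are more likely to outlast the delay, and as `busy_rate` grows the result approaches the original phase probabilities.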
VII-B Effect of sensing error and sensing duration
Two important parameters that affect the performance of channel sensing are (i) probability of detection and (ii) probability of false alarm which are defined as,
The probability of false alarm can be expressed in terms of , channel sensing time and signal-to-noise ratio (SNR) of complex valued PU signal as 
where is the tail probability of standard normal distribution, is the sampling frequency. The target probability of signal detection is usually set by regulatory bodies to avoid interference to PU. For example, IEEE 802.22 WRAN working group sets the target in the worst-case scenario of dB. Thus with received SNR and target , we can calculate false alarm for different values of .
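The dependence of the false alarm probability on the sensing time can be illustrated with the commonly used energy-detector expression for a complex-valued PSK signal in complex Gaussian noise, P_f = Q(sqrt(2γ + 1)·Q^{-1}(P_d) + sqrt(τ·f_s)·γ), where γ is the SNR, τ the sensing time and f_s the sampling frequency. We assume this is the expression referred to above; the numerical values used in the usage note (P_d = 0.9, SNR = −20 dB, f_s = 6 MHz) are illustrative assumptions as well.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def q_function(x):
    """Tail probability of the standard normal distribution."""
    return 1.0 - _STD_NORMAL.cdf(x)

def q_inverse(p):
    """Inverse of the Q-function for 0 < p < 1."""
    return _STD_NORMAL.inv_cdf(1.0 - p)

def false_alarm_probability(p_detect, snr_db, sensing_time, sample_rate):
    """P_f for a target detection probability (assumed energy-detector model)."""
    snr = 10.0 ** (snr_db / 10.0)
    arg = (math.sqrt(2.0 * snr + 1.0) * q_inverse(p_detect)
           + math.sqrt(sensing_time * sample_rate) * snr)
    return q_function(arg)
```

For example, `false_alarm_probability(0.9, -20.0, 0.01, 6e6)` evaluates the false alarm probability for a 10 ms sensing time under the assumed parameters; longer sensing times give smaller values, which is the trade-off between sensing duration and sensing error discussed in this section.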
The channel sensing error can be included in the cost function given by (11) of the stochastic dynamic programming framework as
Note that the probability of detection and other channel sensing parameters are indirectly captured by as shown in (21). We have evaluated the performance of our proposed multishot policy for different values of , i.e. for different channel sensing durations , for a fixed , dB, and MHz. Whenever the channel is sensed busy (due to either PU reappearance or a false alarm), the SU follows the busy-period sensing interval policy until the channel is sensed idle, and reverts to the multishot policy (restarting from ) after regaining the channel. In our simulation, we have assumed an exponential policy with parameter as the SU’s busy-period sensing interval policy.
We have also incorporated the channel sensing duration , which is a function of . The normalized throughput of the SU for varying channel load conditions is plotted against in Fig. 7(a) for . We can observe that the normalized throughput decreases with an increase in . However, we did not observe much difference in normalized throughput across different channel loads. The reason is that the normalized throughput is measured as the fraction of the channel idle time that the SU uses for packet transmission. Similarly, the normalized throughput is plotted against for a fixed in Fig. 7(b).
We now discuss the effect of finite channel sensing duration on the total cost function of various sub-optimal policies. Any optimal (even sub optimal) solution would choose sensing interval that are much larger than the sensing duration . Else the fraction of time spent on sensing will be a large overhead. Under this condition, the sensing duration has minimal impact on our total cost. Our total cost depends on the number of sensing made and the interference to PU. Finite sensing duration adds a small constant to the successive sensing interval chosen, and slowly drifts the sensing points as compared to the ideal case of “Zero sensing duration”. If is the expected number of sensing done in ideal case , with finite sensing duration case it will be around ””. Therefore the error involved in total sensing cost is just of the order .
VIII Conclusion
In this paper, we have considered optimal channel sensing policies for channels with heavy-tailed idle time distributions, which are modeled as HED. We have shown that periodic sensing is not optimal when the channel’s traffic deviates from the exponential distribution. We formulated an optimization problem whose objective is to minimize the number of channel sensings by the SU and the SU’s interference to the PU. The structure of the optimal solution is deduced through the MDP and dynamic programming framework. Having shown that the state and action spaces of the MDP are continuous, we proposed a sub-optimal channel sensing interval policy called “Multishot sensing interval policy” that reduces the cost of sensing and of interference to the PU. Finally, we compared the performance of the proposed Multishot sensing interval policy with other existing sub-optimal policies in the literature for various channel traffic conditions.
-  Cormio, C., Chowdhury K. R.: ’A survey on MAC protocols for cognitive radio networks’, Ad Hoc Networks, 2009, 7, (7), pp.1315–1329.
-  Liang, Y. C., Zeng, Y., Peh, E. C. Y., Hoang, A. T.: ‘A Sensing-Throughput Tradeoff for Cognitive Radio Networks’, IEEE Trans.on Wireless communication, 2008, 7, (4), pp.1326–1337.
-  Pei, Y., Liang, Y.-C., Teh, K., Li, K. H.: ‘Energy-efficient design of sequential channel sensing in cognitive radio networks: Optimal sensing strategy, power allocation, and sensing order’, IEEE Jour. Sel. Areas Communication, 2011, 29, (8), pp.1648–1659.
-  Khoshkholgh, M., Navaie, K., Yanikomeroglu, H.: ‘Optimal design of the spectrum sensing parameters in the overlay spectrum sharing’, IEEE Trans. Mobile Computing, 2014, 13, (9), pp. 2071–2085.
-  Shokri-Ghadikolaei, H., Fischione, C.: ‘Analysis and optimization of random sensing order in cognitive radio networks’, IEEE Jour. Sel. Areas Communication, 2015, 33, (5), pp. 803–819.
-  Liang, Y. C., Chen, K. C., Li, F. Y., Mähönen, P.: ‘Cognitive Radio Networking and Communications: An Overview’, IEEE Trans. on Vehicular Technology, 2011, 60, (7), pp. 3386–3407.
-  Pei, Y., Hoang, A.T., Liang, Y. C.: ‘Sensing-throughput tradeoff in cognitive radio networks: how frequently should spectrum sensing be carried out?’, Proc. IEEE PIMRC 2007, pp. 1–5.
-  Stabellini, L.: ‘Quantifying and modeling spectrum opportunities in a real wireless environment’, Proc. IEEE WCNC 2010, pp. 1–6.
-  Wellens, M., Riihijärvi, J., Mähönen, P.: ‘Empirical time and frequency domain models of spectrum use’, Physical Communication, 2009, 2, (1), pp. 10–32.
-  Liu, Y., Tewfik, A.: ‘Hyperexponential approximation of channel idle time distribution with implication to secondary transmission strategy’, Proc. IEEE ICC 2012, pp. 1800–1804.
-  Feldmann, A., Whitt, W.: ‘Fitting mixtures of exponentials to long-tail distributions to analyze network performance models’, Proc. IEEE INFOCOM, 1997, pp. 1096–1104.
-  Sharma, M., Sahoo, A.: ’A comprehensive methodology for opportunistic spectrum access based on residual white space distribution’, ACM Proc. of the 4th International Conference on Cognitive Radio and Advanced Spectrum Management, 2011.
-  Huang, S., Liu, X., Ding, Z.: ‘Optimization of transmission strategies for opportunistic access in cognitive radio networks’, IEEE Trans. on Mobile Computing, 2009, 8, (12), pp.1636–1648.
-  Liu, Y., Tewfik, A.: ‘Primary traffic characterization and secondary transmissions’, IEEE Trans. on Wireless Communications, 2014, 13, (6), pp. 3003–3016.
-  Shabara, Y., Zahran, A., ElBatt, T.: ‘Efficient spectrum access strategies for cognitive networks with general idle time statistics’, Proc. IEEE ICC 2015, pp. 7743–7749.
-  Lee, W. Y., Akyildiz, I. F.: ‘Optimal Spectrum Sensing Framework for Cognitive Radio Networks’, IEEE Trans. on Wireless Communications, 2008, 7, (10), pp.3845–3857.
-  Azad, A.P., Alouf, S., Altman, E., Borkar, V., Paschos, G.S.: ‘Optimal control of sleep periods for wireless terminals’, IEEE J. on Select. Areas in Communications, 2011, 29, (8), pp. 1605–1617.
-  Arthur F. Veinott, Jr.: ‘Lectures in Dynamic Programming and Stochastic Control’, MS&E 351 Dynamic Programming and Stochastic Control, Stanford University, 2008.
-  Feinberg, E.A.: ‘On stationary strategies in borel dynamic programming’, Mathematics of operation research, 1992, 17, (2), pp. 392–397.
-  Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, 1994.
-  Corless, R., Gonnet, G., Hare, D., Jeffrey, D., Knuth, D.: ‘On the Lambert W function’, Advances in Computational Mathematics, 1996, 5, pp. 329–359.