# Underlay Cognitive Radios with Capacity Guarantees for Primary Users

Antonio G. Marques
The work in this paper was supported by the Spanish Min. of Sci. & Inn. grant No. TEC2009-12098. Parts of this paper were presented at CROWNCOM 2012. This paper has been submitted for publication to IEEE TSP. A. G. Marques is with the Dept. of Signal Theory and Comms., King Juan Carlos Univ., Camino del Molino s/n, Fuenlabrada, Madrid 28943, Spain. Phone: +34 914-888-222, fax: +34 914-887-500, email: antonio.garcia.marques@urjc.es
###### Abstract

To use the spectrum efficiently, cognitive radios leverage knowledge of the channel state information (CSI) to optimize the performance of the secondary users (SUs) while limiting the interference to the primary users (PUs). The algorithms in this paper are designed to maximize the weighted ergodic sum-capacity of SUs, which transmit orthogonally and adhere simultaneously to constraints limiting: i) the long-term (ergodic) capacity loss caused to each PU receiver; ii) the long-term interference power at each PU receiver; and iii) the long-term power at each SU transmitter. Formulations accounting for short-term counterparts of i) and ii) are also discussed. Although the long-term capacity constraints are non-convex, the resultant optimization problem exhibits zero duality gap and can be efficiently solved in the dual domain. The optimal allocation schemes (power and rate loadings, frequency bands to be accessed, and SU links to be activated) are a function of the CSI of the primary and secondary networks as well as the Lagrange multipliers associated with the long-term constraints. The optimal resource allocation algorithms are first designed under the assumption that the CSI is perfect; then, the modifications needed to accommodate different forms of imperfect CSI (quantized, noisy, and outdated) are analyzed.

###### Keywords

Cognitive radios, resource management, stochastic approximation, imperfect channel state information.

## I Introduction

Cognitive radios (CRs) implementing dynamic spectrum access (DSA) schemes are the next generation solution for the problem of deploying new wireless services in an overcrowded radio environment [12, 10]. CR users, typically referred to as secondary users (SUs), have to sense the radio spectrum and use the sensing measurements to adapt dynamically the configuration of the CR. Such tasks have to be carried out with the aim of optimizing the quality of service (QoS) of the SUs while limiting the interference to the receivers which hold the licence of the frequency band, referred to as primary users (PUs). The specific rules that establish how SUs and PUs coexist and how the interference is limited depend on the so-called CR paradigm considered (underlay, overlay, or interweave [10]) and the DSA policy implemented [31].

The merits of adaptive schemes for traditional wireless systems that first acquire knowledge of the channel state information (CSI) and then use the CSI to optimally allocate the transmit resources are well documented; see [9]. However, for channel-adaptive schemes to be deployed in CR scenarios [20, 26, 27, 13], important challenges not present in traditional wireless networks arise. Next we describe several of them.

Challenge 1: Sensing the CR spectrum and acquiring the corresponding CSI (especially that of the primary network) is a difficult task. The CSI in CRs is heterogeneous (presence of PUs, SU-to-PU channels, SU-to-SU channels, PU-to-PU channels) and inherently distributed. Some PUs can be located far away and be unwilling to collaborate with the SUs. The CSI may also vary fast and, due to interference, might not be stationary. Furthermore, to become aware of the overall radio environment, not only channels but also additional (network) side information may need to be sensed/estimated [10]. As a result, the CSI in CRs has higher dimensionality and heterogeneous quality (information of SU-to-SU links is typically better than that of SU-to-PU links). Hence, advanced signal processing schemes that keep track of the CSI and mitigate the existing uncertainties have to be implemented. To deal with these problems, most CR works consider that the CSI contains some type of imperfections. Such imperfections are typically modeled as either noisy CSI (the actual CSI is corrupted with additive noise [20]) or quantized CSI (only a coarse description of the CSI is available [19, 15]). Fewer works have considered the fact that the CSI may be not only noisy but also outdated [4, 17]; have developed signal processing schemes to mitigate the CSI uncertainties; or have incorporated those imperfections into the design of resource allocation (RA) algorithms [20, 25, 1, 5]. In this paper we take a general approach to model the CSI imperfections and consider that the distribution of the instantaneous CSI (referred to as belief) is available. This will allow us to: i) consider simultaneously different sources of CSI imperfections; and ii) address the design of systems with a broad degree of CSI uncertainties (from almost perfect CSI to severely degraded CSI). The expression for the belief and the rules to update it will depend on the operating conditions of the system. For example, if the CSI is perfect, the belief coincides with the instantaneous channel measurements. On the other hand, if only statistical CSI is available, the belief coincides with the long-term distribution of the channel and does not vary with time.

Challenge 2: As already mentioned, CR transmissions must obey additional rules that establish how SUs and PUs coexist and how to control interference. Such rules are typically formulated as constraints and depend on the specific CR paradigm and the DSA policies implemented. Overlay CRs (referred to as interweave CRs in [10]) allow SUs to transmit only if PUs are not active. In contrast, underlay CRs allow for SU transmissions provided that the damage (interference) to the PUs is not too high. To keep the interference low, some works limit the interference power at the primary receiver side, either by imposing instantaneous (short-term) or average (long-term) interference power constraints; see, e.g., [13, 30, 29, 11]. The latter are better suited for fading channels because they can exploit the diversity of the interfering link [30, 11]. Other works guarantee a minimum signal-to-interference-plus-noise ratio (SINR) at the PU receiver [14, 8]. Short-term SINR constraints can be easily translated to (short-term) interference power constraints, while long-term SINR constraints cannot. More recent designs use a probabilistic approach to limit the probability of interfering with the primary transmissions [26, 27, 2, 17]. Other works have designed schemes either guaranteeing a minimum capacity (rate) for the PU or limiting the capacity loss at the PU receiver [8, 19]. Providing guarantees on the capacity of the PU links is typically a non-convex problem, so that most works have developed suboptimal solutions and focused on short-term formulations, which are more tractable and in some cases can be rendered convex [8]. In this paper we consider that PUs are not always active. When the channels are not occupied, the SUs are allowed to transmit (overlay paradigm). When the PUs are active, the SU transmissions adhere to diverse DSA constraints (short- and long-term interference power and rate loss) that guarantee that the damage to PUs is kept under control (underlay paradigm).

Challenge 3: CRs have to use the time-varying (imperfect) CSI to dynamically adapt the available resources (power and rate loadings of the SUs) and decide the frequency bands to be used and the specific SUs that will use them. Relative to the RA in traditional wireless systems, the problem in CRs is challenging not only because more variables are involved, but also because the description of the CSI is more complicated and the schemes have to satisfy the additional DSA constraints. Different approaches have been used to formulate and solve the RA problem: game theory [21], non-linear optimization [29], convex approximation [5], dynamic programming [4], adaptive control [26] and even bio-inspired models [6]. In this paper, we design the RA schemes using non-linear optimization and dual stochastic approximation tools. The stochastic schemes are robust to channel non-stationarities and entail a lower computational burden than their non-stochastic counterparts. Moreover, they are well suited for dealing with CSI imperfections. Dual stochastic algorithms have been successfully used to allocate resources in wireless networks; see, e.g., [23, 18], and [19, 27] for examples in the context of CRs.

Motivated by these challenges, we design RA algorithms that optimize the rate performance of the SUs and limit the interference to the PUs. We focus on CRs where SUs adapt their power and rate loadings dynamically, and access orthogonally a set of frequency bands which are primarily devoted to PU transmissions. Orthogonal here means that if an SU is transmitting, no other SU can be active in the same band. The RA schemes are then obtained as the solution of a weighted sum-average capacity maximization subject to four types of constraints: i) limits on the long-term (ergodic) capacity loss inflicted on each PU; ii) limits on the long-term interference power at each PU [11]; iii) limits on the long-term power transmitted by each SU; and iv) short-term formulations of i) and ii). Consideration of i) is challenging because the interfering (SU) powers render the capacity term non-convex, and it is the main contribution of this work. Although the formulated problem is non-convex, it has zero duality gap. As a result, the Lagrangian relaxation is optimal. Additionally, the operating conditions of the secondary network (and the formulation of the objective to be optimized) are such that the problem in the dual domain can be separated across users and frequency bands. This favorable structure allows for a significant reduction in the complexity required to find the optimal solution and, hence, renders the non-convex problem computationally tractable. Different forms of channel imperfections are considered (quantized, noisy, outdated, statistical). The optimal RA schemes are complemented with simple but effective stochastic signal processing algorithms both to mitigate the effects of the CSI imperfections and to estimate online the value of the multipliers required to implement the optimal RA. Such stochastic algorithms are able to track the time variation of the environment and/or learn unknown parameters on the fly, features that are especially attractive for CR systems [12, 19].

The rest of the paper is organized as follows. Sec. II presents the model for the (perfect) CSI, describes the operating conditions of the secondary network, and formulates the DSA constraints that SUs must obey. Sec. III deals with the design of the optimal RA algorithms. First, the optimization problem which gives rise to the RA is formulated and then its solution is obtained. Sec. IV discusses different methods (including stochastic ones) to estimate the multipliers required to implement the optimal RA. Sec. V describes different forms of CSI imperfections and analyzes how the optimal schemes have to be modified to account for imperfect CSI. Sec. VI presents different illustrative numerical examples that corroborate the theoretical claims. Conclusions in Sec. VII wrap up this paper.

Notation: $(\cdot)^{T}$ denotes vector transposition; $x^{*}$ the optimal value of variable $x$; $\wedge$ ($\vee$) the Boolean “and” (“or”) operator; $\mathbb{E}[\cdot]$ expectation; $\mathbb{1}_{\{\cdot\}}$ the indicator function ($\mathbb{1}_{\{x\}}=1$ if $x$ is true and zero otherwise); and $[x]_{a}^{b}$ the projection of the scalar $x$ onto the interval $[a,b]$, i.e., $[x]_{a}^{b}:=\min\{\max\{x,a\},b\}$.

## II Model description

We consider a CR network with $M$ secondary users (indexed by $m$) transmitting opportunistically and orthogonally over $K$ different frequency bands (indexed by $k$). For simplicity, we assume that: i) each band has the same bandwidth and is occupied by a different primary user; and ii) the secondary network has an access point (AP) which is the destination of all secondary users. The AP acts as a central scheduler which collects the CSI and then makes the RA decisions. Extensions to scenarios where those assumptions do not hold true can be handled with a moderate increase in complexity.

### II-A Channel state information

Intuitively speaking, the CSI in wireless systems comprises the information of the channel links which: i) is known by the system and ii) is relevant from an RA perspective. A key feature of CR systems is that the CSI is heterogeneous, meaning that it is typically different for the primary and secondary networks. The reason for that is twofold. First, the schemes used to acquire the CSI are different for the primary and secondary networks [cf. i)]. Second, the impact of the CSI on the design of the RA is different [cf. ii)]. For ease of exposition, we first design the RA schemes assuming that the CSI is error-free. Accordingly, the model for the perfect CSI is presented here, while the model for imperfect CSI (and the corresponding modifications for the RA schemes) is presented in Sec. V.

The CSI available at instant $n$ is formed by the variables $a_{k,1}[n]$, $h^{m}_{k,1}[n]$, and $h^{m}_{k,2}[n]$ for all $k$ and $m$. Before explaining the meaning of such variables, we clarify that subscript “1” will be used to emphasize that the channel involves primary transceivers, while subscript “2” is used to emphasize that only secondary transceivers are involved. Starting with the CSI of the PUs, $a_{k,1}[n]$ is a Boolean variable which is one if the PU that transmits on the $k$th channel is active at time $n$ and zero otherwise. Variable $h^{m}_{k,1}[n]$ represents the instantaneous noise-normalized power gain between the $m$th SU and the $k$th PU at instant $n$. Similarly, $h^{m}_{k,2}[n]$ represents the instantaneous noise-normalized power gain between the $m$th SU and the AP in the $k$th channel at instant $n$. All $a_{k,1}[n]$, $h^{m}_{k,1}[n]$ and $h^{m}_{k,2}[n]$ are stationary random processes. The assumption of perfect CSI implies that at instant $n$, the value of those variables is known deterministically. Finally, the formulation also involves the (interference-free) signal-to-noise ratio (SNR) between the PU transmitter and the PU receiver of each band. For simplicity, we will assume that this SNR does not vary with time (either because the PU channels are fixed or because the PU transmitter implements a channel-inversion power loading [9]). Nonetheless, our schemes can be easily modified to account for a time-varying SNR.

To finish this section, let $\mathbf{h}$ denote the vector of overall CSI containing: i) the power gains of the CR-to-CR links; ii) the normalized power gains of the CR-to-PU links; and iii) the Boolean variables indicating whether the channels are occupied. Clearly, the value of $\mathbf{h}$ varies with time and, wherever convenient, we will write $\mathbf{h}[n]$ to stress this fact.

### II-B Resources at the secondary network

Now, we introduce the design variables, i.e., the variables that will be adapted as a function of the (primary and secondary) CSI $\mathbf{h}$. Let $w^{m}_{k,2}$ denote a Boolean variable taking the value one if the $m$th secondary user is scheduled to transmit into the $k$th band and zero otherwise. Provided that $w^{m}_{k,2}=1$, let $p^{m}_{k,2}$ denote the instantaneous power transmitted over the $k$th band by the $m$th secondary user. We analyze the case where instantaneous rate and power variables are coupled through Shannon’s capacity formula. Such a coupling will be written as $r^{m}_{k,2}(h^{m}_{k,2}\,p^{m}_{k,2})$, where $r^{m}_{k,2}(\cdot)$ is an increasing and concave function. Nonetheless, the basic results in this paper hold for any increasing and concave $r^{m}_{k,2}(\cdot)$.

The CR operates in a time-block fashion, where the duration of each block corresponds to the coherence time of the fading channel. This way, at every time $n$ the AP will use the current CSI vector $\mathbf{h}[n]$ to find the (optimum) value of $w^{m}_{k,2}$ and $p^{m}_{k,2}$. Since $\mathbf{h}$ varies with $n$, and $w^{m}_{k,2}$ and $p^{m}_{k,2}$ depend on $\mathbf{h}$, the value of the design variables will vary across time as well. Throughout the manuscript, we will write $w^{m}_{k,2}[n]$ and $p^{m}_{k,2}[n]$, or $w^{m}_{k,2}(\mathbf{h})$ and $p^{m}_{k,2}(\mathbf{h})$, wherever it is convenient to emphasize the corresponding dependence.

Having introduced the design variables, we now formulate the constraints that these variables need to satisfy. To ensure that at most one user transmits into a given band $k$, we need $\sum_{m\geq 1} w^{m}_{k,2}(\mathbf{h})\leq 1$. If the left-hand side of the constraint is equal to one, then one user is accessing the channel (orthogonal access). If it is equal to zero, then none is transmitting (either because all secondary channels are poor, or because transmitting would cause very high interference to the PUs). To simplify the notation, we consider an additional virtual SU, indexed by $m=0$, with zero transmit power and rate. The virtual user will be active (and thus $w^{0}_{k,2}=1$) if none of the actual SUs is transmitting. Then, with the sum including the virtual user, we can write

$$\sum_{m} w^{m}_{k,2}(\mathbf{h}) = 1, \qquad \forall k. \tag{1}$$

We also consider that the maximum average (long-term) power the $m$th SU can transmit is $\check{p}^{m}_{2}$; hence,

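$$\mathbb{E}_{\mathbf{h}}\Big[\sum_{k} w^{m}_{k,2}(\mathbf{h})\, p^{m}_{k,2}(\mathbf{h})\Big] \leq \check{p}^{m}_{2}, \qquad \forall m. \tag{2}$$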

Such a constraint is not only reasonable to enforce QoS across CRs, but also to limit the power consumption of each of the CR transmitters. The expectation in (2) is taken over all possible values of $\mathbf{h}$; i.e., considering all $a_{k,1}$, $h^{m}_{k,1}$, and $h^{m}_{k,2}$. While (1) needs to hold for each and every channel realization (hence, for each and every time instant), (2) only needs to hold in the long term.

### II-C Dynamic spectrum access constraints

The next step is to identify the rules that dictate how SU transmissions affect the performance of the PUs. Such rules will be formulated as constraints that will be incorporated into the optimization problem that gives rise to the RA schemes. In other words, the DSA constraints will represent how SUs have to modify their behavior so that the damage caused to the PUs is kept under control.

When the DSA constraints are formulated, several factors have a significant impact both in terms of the system operation and the mathematical formulation of the problem. Two important ones are discussed next. The first factor is whether the interference constraints are formulated as instantaneous (short-term) or as average (long-term) constraints. The former requires the constraint to hold for each and every time instant, while the latter requires the constraint to hold on average (taking into account all time instants jointly). Clearly, instantaneous constraints are more restrictive than their average counterparts (which can exploit the so-called “cognitive diversity” of the primary CSI [30, 29]), and therefore the performance of the secondary network will be higher in the latter case. Mathematically, long-term constraints are typically dualized, while short-term constraints are handled using alternative methods. The second factor is the metric used to measure the actual damage that the CRs inflict on the PUs. Among the metrics considered in the literature we find: interference power at the PUs, probability of interfering with the PUs, and rate loss inflicted on the PUs. Most works have focused on limiting the interference power. The reason is twofold: i) it is a simple (and intuitive) metric to measure the interference, and ii) it can be formulated as a convex constraint. Limiting the rate loss may be considered a better alternative because it focuses on the actual damage that the interference causes to the PUs (most communications systems are designed to either guarantee or maximize a certain transmission rate). From a mathematical perspective, constraints limiting the rate loss are typically non-convex. As a result, very few works have explored that alternative; see, e.g., [8, 19]. The problem of limiting the probability of interference for a system with operating conditions very similar to the ones considered in this paper was thoroughly investigated in [17].

As already mentioned, the main contribution of this work is to limit the long-term rate (capacity) loss on the PUs. However, we will also impose limits on the long-term interference power. The reason is twofold. First, such constraints were not considered for systems with exactly the same operating conditions as those considered in this work; see [11] for a closely related one. More importantly, joint consideration of rate loss and interference power constraints will help us to compare these two alternatives. For similar reasons, the end of the section is devoted to discussing the modifications required to handle short-term interference power and rate loss constraints.

We start with the formulation of the long-term interference power constraints. Let $\check{p}_{k,1}$ denote the maximum average interference power the $k$th primary receiver can tolerate (provided that the PU is active) and recall that the $m$th SU transmits in the $k$th channel only if the Boolean scheduling variable $w^{m}_{k,2}$ is one. Then, the following constraints need to hold

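$$\mathbb{E}_{\mathbf{h}}\Big[\sum_{m} w^{m}_{k,2}(\mathbf{h})\, h^{m}_{k,1}\, p^{m}_{k,2}(\mathbf{h}) \,\Big|\, a_{k,1}=1\Big] \leq \check{p}_{k,1}, \qquad \forall k. \tag{3}$$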

The fact that the expectation is taken across all $\mathbf{h}$ reflects that (3) is a long-term constraint. Clearly, for a given channel realization just one of the terms inside the expectation is active. This property will be exploited in upcoming sections. Finally, note that only CSI realizations for which $a_{k,1}=1$ are considered in the expectation. In fact, since $a_{k,1}$ is Boolean, (3) can be rewritten as $\mathbb{E}_{\mathbf{h}}\big[a_{k,1}\sum_{m} w^{m}_{k,2}(\mathbf{h})\, h^{m}_{k,1}\, p^{m}_{k,2}(\mathbf{h})\big] \leq \Pr\{a_{k,1}=1\}\,\check{p}_{k,1}$. If one does not want to bound the long-term interference power that the PU receives when it is active, but the long-term power at the PU receiver irrespective of whether the PU is active or not, then the conditioning on $a_{k,1}=1$ has to be removed from the previous expressions.

Next, we formulate the long-term (ergodic) capacity constraints. For such a purpose we use the capacity function $r_{k,1}(x)$ of the $k$th PU link, where $x$ stands for the interference power at the $k$th PU receiver. Our formulation guarantees a minimum long-term rate of $\check{r}_{k,1}$ for the $k$th PU. This minimum rate can either be a fixed value [19] or be expressed as a fraction of the rate that the PU achieves when no CRs are present. In the latter case, $\check{r}_{k,1}$ equals the interference-free rate $r_{k,1}(0)$ scaled by one minus the maximum (relative) capacity loss that the CRs can cause to the $k$th PU. With these issues in mind, the long-term capacity constraint is formulated as

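$$\mathbb{E}_{\mathbf{h}}\Big[\sum_{m} w^{m}_{k,2}(\mathbf{h})\, r_{k,1}\big(h^{m}_{k,1}\, p^{m}_{k,2}(\mathbf{h})\big) \,\Big|\, a_{k,1}=1\Big] \geq \check{r}_{k,1}, \qquad \forall k. \tag{4}$$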

Again, for a given channel realization only one of the terms inside the expectation is active. The expression in (4) confirms that if the constraint is written in standard form (a function of the transmit powers upper-bounded by zero), then that function is non-convex [cf. the definition of $r_{k,1}(\cdot)$].

We close this section by briefly discussing the formulation of the short-term DSA constraints. To write the short-term counterparts of (3) and (4) we do not need to take into account all $\mathbf{h}$, but only the current one $\mathbf{h}[n]$. Hence, the short-term constraints for the time instant $n$ are

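$$a_{k,1}[n]\sum_{m} w^{m}_{k,2}[n]\, h^{m}_{k,1}[n]\, p^{m}_{k,2}[n] \leq a_{k,1}[n]\,\check{p}_{k,1}, \tag{5}$$
$$a_{k,1}[n]\sum_{m} w^{m}_{k,2}[n]\, r_{k,1}\big(h^{m}_{k,1}[n]\, p^{m}_{k,2}[n]\big) \geq a_{k,1}[n]\,\check{r}_{k,1}, \tag{6}$$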

which need to hold for all $k$ and $n$. Capitalizing on the fact that at every time instant only one SU is active, the following alternative set of constraints can be considered

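$$a_{k,1}[n]\, h^{m}_{k,1}[n]\, p^{m}_{k,2}[n] \leq a_{k,1}[n]\,\check{p}_{k,1}, \tag{7}$$
$$a_{k,1}[n]\, r_{k,1}\big(h^{m}_{k,1}[n]\, p^{m}_{k,2}[n]\big) \geq a_{k,1}[n]\,\check{r}_{k,1}, \tag{8}$$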

which in this case need to hold for all $m$, $k$ and $n$. Clearly, if (7) and (8) are satisfied, then (5) and (6) are satisfied too. It can also be rigorously shown that (7) and (8) do not imply a loss of optimality relative to (5) and (6). As already pointed out, key for showing this result is that at every time instant at most one SU is active, so that bounds on the non-active users are irrelevant. The main advantage of considering (7) and (8) is that the transmit powers of the different SUs are decoupled, so that each of the expressions in (7) and (8) can be solved with respect to (w.r.t.) $p^{m}_{k,2}[n]$. This implies that the constraints can be rewritten as simple box constraints. To be specific, consider the maximum power that the amplifier at the SU can transmit. Moreover, assume that the rate requirement is feasible when the SU is silent, and consider the two values of $p^{m}_{k,2}[n]$ for which the constraints (7) and (8) are, respectively, satisfied with equality. Based on these conventions, we define the maximum short-term power $\check{p}^{m}_{k,2}[n]$ as the minimum of those three values if $a_{k,1}[n]=1$, and as the amplifier limit if $a_{k,1}[n]=0$. Then, the short-term DSA constraints can be replaced with $0\leq p^{m}_{k,2}[n]\leq \check{p}^{m}_{k,2}[n]$. In a nutshell, the orthogonal access among SUs allows us to rewrite the short-term DSA constraints as time-varying power peak constraints. The power bound enforced by each of such peak constraints will depend on the metrics used to measure the interference (rate loss and/or interference power), the limits set on the chosen metric ($\check{p}_{k,1}$ and $\check{r}_{k,1}$), and the CSI at instant $n$.
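
The box-constraint rewriting above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the PU capacity function `pu_rate` (a Shannon-type rate in which the interference adds to the noise), the function name `peak_power`, and the parameter names `p_amp`, `p_int_max`, `r_min` and `snr` are all assumptions introduced here for illustration.

```python
import math

def pu_rate(x, snr):
    """Assumed PU rate (bits/s/Hz) under noise-normalized interference x.

    Hypothetical form: the interference simply adds to the unit noise power.
    """
    return math.log2(1.0 + snr / (1.0 + x))

def peak_power(a, h1, p_amp, p_int_max, r_min, snr):
    """Time-varying peak power implied by (7), (8) and the amplifier limit.

    a: 1 if the PU of the band is active, 0 otherwise
    h1: noise-normalized SU-to-PU power gain (must be positive)
    p_amp: amplifier limit of the SU (assumed name)
    p_int_max, r_min: short-term interference and rate limits of the PU
    """
    if not a:
        # With an idle PU, (7) and (8) hold trivially.
        return p_amp
    # Power that meets (7) with equality.
    p_from_interference = p_int_max / h1
    # Power that meets (8) with equality under the assumed pu_rate.
    p_from_rate = (snr / (2.0 ** r_min - 1.0) - 1.0) / h1
    return max(0.0, min(p_amp, p_from_interference, p_from_rate))
```

For instance, with `h1 = 0.5`, `p_int_max = 1`, `r_min = 2` and `snr = 15`, the interference limit allows a power of 2 and the rate limit a power of 8, so an amplifier limit of 10 yields a peak power of 2.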

## III Formulating and solving the RA problem

To formulate the optimization problem that gives rise to the optimum RA algorithms, we need to identify: i) the variables to be optimized; ii) the constraints the variables need to satisfy; and iii) the metric to be optimized. The first step was accomplished in Sec. II-B. Regarding the second step, the Boolean variables $w^{m}_{k,2}$ are constrained to belong to the set $\{0,1\}$ and the variables $p^{m}_{k,2}$ are constrained to belong to the set $[0,\check{p}^{m}_{k,2}]$, where $\check{p}^{m}_{k,2}$ stands for the instantaneous peak power constraint introduced at the end of Sec. II-C. Moreover, $w^{m}_{k,2}$ and $p^{m}_{k,2}$ need to satisfy (1) and (2), and the DSA constraints in (3) and (4).

Regarding the third step (metric to be optimized), we are interested in maximizing the weighted ergodic sum-capacity given by $\sum_{k,m}\mathbb{E}_{\mathbf{h}}\big[\beta^{m} w^{m}_{k,2}(\mathbf{h})\, r^{m}_{k,2}(h^{m}_{k,2}\, p^{m}_{k,2}(\mathbf{h}))\big]$, where $\beta^{m}$ represents a user-dependent priority coefficient. Note that by varying $\beta^{m}$, the boundary of the capacity region can be found [28]. Recall that for a given channel realization $\mathbf{h}$ and channel $k$ only one of the terms (SUs) is active. Other objective functions, such as the ergodic sum-utility of the rates, could be used without changing the basic structure of the solution; see, e.g., [18] for further details on a related problem.

Under all previous considerations, the optimal RA is obtained as the solution of the following problem:

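$$\bar{c}^{*}_{2} := \max_{\{w^{m}_{k,2}(\mathbf{h}),\, p^{m}_{k,2}(\mathbf{h})\}} \ \sum_{k,m} \mathbb{E}_{\mathbf{h}}\Big[\beta^{m}\, w^{m}_{k,2}(\mathbf{h})\, r^{m}_{k,2}\big(h^{m}_{k,2}\, p^{m}_{k,2}(\mathbf{h})\big)\Big] \tag{9a}$$
$$\text{s. to:}\quad w^{m}_{k,2}(\mathbf{h}) \in \{0,1\},\quad 0 \leq p^{m}_{k,2}(\mathbf{h}) \leq \check{p}^{m}_{k,2}(\mathbf{h}),\quad \text{(1)}; \tag{9b}$$
$$\phantom{\text{s. to:}}\quad \text{(2), (3), (4)}; \tag{9c}$$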

where the dependence of the optimization variables on the CSI has been made explicit. Note that we are interested in optimizing a long-term objective (9a), subject to both short-term (9b) and long-term (9c) constraints. As we will see in the next section, the approach to handle (9b) and (9c) will not be the same.

### III-A Optimal RA

The main challenge of finding the optimal RA is that (9) is not a convex problem. Basically, there are three sources of non-convexity in (9): i) the scheduling coefficients $w^{m}_{k,2}$ are constrained to belong to $\{0,1\}$, which is a non-convex set; ii) the monomials that multiply $w^{m}_{k,2}$ by $p^{m}_{k,2}$, or by functions of $p^{m}_{k,2}$, are not jointly convex; and iii) the constraint (4) is not convex w.r.t. $p^{m}_{k,2}$. The first two sources of non-convexity can be “easily” bypassed by transforming (relaxing) the problem in (9) into a convex one which yields the same optimality conditions; see App. A for technical details. However, the third source of non-convexity cannot be bypassed. Two undesirable consequences associated with lack of convexity are [3]: (c1) zero duality gap is not guaranteed, and (c2) development of numerical algorithms that find the optimal solution in polynomial time is not guaranteed. Remarkably, it can be shown that the problem in (9) exhibits zero duality gap; see the related discussion in App. A, and [24], [22]. This result implies that the constraints can be dualized without losing optimality. However, (c2) still holds, so that finding an efficient algorithm to optimize the (unconstrained) Lagrangian is still challenging. Interestingly, due to the structure of (9) we will show that the optimization can be separated (decomposed) across channels and users, dramatically decreasing the computational complexity of finding the optimal solution.

After the previous discussion, we are ready to present the solution of (9). Our approach to deal with the constraints in (9) is twofold. The long-term constraints in (9c), namely (2), (3) and (4), will be dualized, while the constraints in (9b) (all short-term) will be handled using alternative methods such as scalar projections. Regarding the long-term constraints, let $\pi^{m}$, $\theta_{k}$ and $\rho_{k}$ denote the Lagrange multipliers associated with (2), (3) and (4), respectively. With these notational conventions, it can be shown (see App. A) that the optimal solution of (9) is

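$$\varphi^{m}_{k}\big(p^{m}_{k,2}[n]\big) := \beta^{m}\, r^{m}_{k,2}\big(h^{m}_{k,2}[n]\, p^{m}_{k,2}[n]\big) - \pi^{m}\, p^{m}_{k,2}[n] - \theta_{k}\, a_{k,1}[n]\, h^{m}_{k,1}[n]\, p^{m}_{k,2}[n] + \rho_{k}\, a_{k,1}[n]\, r_{k,1}\big(h^{m}_{k,1}[n]\, p^{m}_{k,2}[n]\big), \tag{10}$$
$$p^{m*}_{k,2}[n] := \Big[\arg\max_{p^{m}_{k,2}[n]} \varphi^{m}_{k}\big(p^{m}_{k,2}[n]\big)\Big]_{0}^{\check{p}^{m}_{k,2}[n]}, \tag{11}$$
$$w^{m*}_{k,2}[n] := \mathbb{1}_{\{m=\arg\max_{l} \varphi^{l}_{k}(p^{l*}_{k,2}[n])\}}\ \mathbb{1}_{\{p^{m*}_{k,2}[n]>0 \,\vee\, m=0\}}. \tag{12}$$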

Key for understanding the solution of (9) is the definition of the functional $\varphi^{m}_{k}(\cdot)$ in (10). Mathematically, $\varphi^{m}_{k}(p^{m}_{k,2}[n])$ represents the contribution to the Lagrangian of (9) if the transmit power is $p^{m}_{k,2}[n]$ and $w^{m}_{k,2}[n]=1$. Intuitively, (10) can be interpreted as a user-channel quality indicator (the higher the indicator, the better). Under this interpretation, the rates of SUs and PUs are rewards (first and fourth terms), and the transmit and interference powers are costs (second and third terms). The corresponding prices are $\beta^{m}$, $\pi^{m}$, $\theta_{k}$ and $\rho_{k}$, respectively. The indicator also manifests the existing trade-off between the SUs (first and second terms) and the PUs (third and fourth terms). Note that if the fourth term in (10) is replaced with $-\rho_{k}\, a_{k,1}[n]\,\big(r_{k,1}(0)-r_{k,1}(h^{m}_{k,1}[n]\, p^{m}_{k,2}[n])\big)$, which differs from it only by a term that does not depend on the power or the user, the optimum values of $p^{m*}_{k,2}[n]$ and $w^{m*}_{k,2}[n]$ in (11) and (12) do not change. This implies that we can also interpret the quality indicator as a functional which penalizes the allocations that entail a high capacity loss for the PU.

Based on the definition of $\varphi^{m}_{k}(\cdot)$, equation (11) reveals that $p^{m*}_{k,2}[n]$ is found separately for each of the user-channel pairs. Similarly, (12) reveals that finding $w^{m*}_{k,2}[n]$, i.e., the optimal scheduling for channel $k$, requires no information from channels other than $k$. These attractive features are present because the optimization problem in the dual domain is separable across users and channels (see [18], [17]). Key for this property to hold are the consideration of orthogonal access in the secondary network and the definition of the objective in (9). Capitalizing on the favorable structure of the solution, we now analyze in further detail the optimal RA. Starting with the optimal scheduling in (12), we observe that $w^{m*}_{k,2}[n]$ is available in closed form, provided that the optimum power is known. Equation (12) reveals that the scheduling follows a winner-takes-all strategy, guaranteeing that the access is orthogonal (at most one user is active), opportunistic (the quality indicators that decide the winner are continuous random variables), and greedy (only the user with the highest quality in a given band is scheduled). Note that the second condition in (12) dictates that if all users decide to transmit with zero power, the channel is assigned to the virtual user $m=0$.

The details of the optimum power allocation are a bit more intricate. To obtain $p^{m*}_{k,2}[n]$ we first need to maximize $\varphi^{m}_{k}(p^{m}_{k,2}[n])$ w.r.t. $p^{m}_{k,2}[n]$. Consider first a simplified case where the CR constraints (3) and (4) are not present. In such a case only the first two terms in (10) are present, so that $\varphi^{m}_{k}(\cdot)$ is strictly concave and differentiable. As a result, the optimization is convex and $p^{m*}_{k,2}[n]$ can be easily found. Specifically, for this case $p^{m*}_{k,2}[n]$ is available in closed form: it is the stationary point of the first two terms in (10), projected onto $[0,\check{p}^{m}_{k,2}[n]]$. This is basically a waterfilling power loading [9] projected onto the feasible interval defined by the instantaneous constraints. When the CR constraint (3) is active, the third term in (10) needs to be considered. However, since that term is linear w.r.t. $p^{m}_{k,2}[n]$, the structure of $\varphi^{m}_{k}(\cdot)$ is basically the same and $p^{m*}_{k,2}[n]$ can still be efficiently found. In fact, the solution follows again a (modified) waterfilling scheme in which the price of the power is increased by $\theta_{k}\, a_{k,1}[n]\, h^{m}_{k,1}[n]$; see, e.g., [11]. In contrast, when all four terms in (10) are considered, the optimization is challenging because $\varphi^{m}_{k}(\cdot)$ is not concave any more. The reason is that the last term is strictly convex, rendering the sum of the four terms in (10) non-concave and, therefore, the optimization non-convex.
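
As a minimal sketch of the (modified) waterfilling loading discussed above, the following Python function assumes that the SU rate is `log2(1 + h2 * p)`; this specific rate function, the function name `waterfilling_power` and the argument names are illustrative assumptions, not definitions taken from the paper. Setting the derivative of the first three terms of (10) to zero gives the stationary point, which is then projected onto the feasible interval.

```python
import math

def waterfilling_power(beta, h2, pi, theta, a, h1, p_peak):
    """Maximizer of beta*log2(1 + h2*p) - (pi + theta*a*h1)*p over [0, p_peak].

    beta: priority weight, h2: SU-to-AP gain, pi/theta: multipliers,
    a: PU activity flag, h1: SU-to-PU gain, p_peak: short-term peak power.
    """
    # Effective price of power: transmit cost plus interference cost.
    price = pi + theta * a * h1
    if price <= 0.0:
        # Power is free, so the increasing objective is maximized at the peak.
        return p_peak
    # Stationary point: beta*h2 / ((1 + h2*p) * ln 2) = price.
    p = beta / (price * math.log(2.0)) - 1.0 / h2
    # Projection onto the interval [0, p_peak].
    return min(max(p, 0.0), p_peak)
```

With `theta = 0` or `a = 0` the function reduces to the classical waterfilling loading; a positive `theta` on an occupied band raises the price and lowers the water level.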

However, the fact that the optimization is not convex does not necessarily imply that $p^{m*}_{k,2}[n]$ cannot be efficiently found. The first reason is that optimizing $\varphi^{m}_{k}(\cdot)$ involves a single (scalar) variable. As a result, simple line search methods can be used. The second reason is that the structure of $\varphi^{m}_{k}(\cdot)$ can be exploited to focus the search on a small region. For example, it can be rigorously shown that the waterfilling solution is an upper bound for $p^{m*}_{k,2}[n]$. Moreover, if the CSI is perfect, then $\varphi^{m}_{k}(\cdot)$ has at most three stationary points, so that $p^{m*}_{k,2}[n]$ is either 0 or one of those three points. Once the powers $p^{m*}_{k,2}[n]$ are found, finding $w^{m*}_{k,2}[n]$ just requires the evaluation of closed-form expressions [cf. (12)]. In other words, because in the dual domain the problem can be separated across users and channels, optimizing the Lagrangian does not require solving one non-convex problem in a high-dimensional space. Rather, closed forms need to be evaluated (for the scheduling coefficients), and non-convex problems in a one-dimensional space, one per user-channel pair, need to be solved (for the power loadings).
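
The per-band procedure (a scalar search for each user followed by the winner-takes-all rule) can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's algorithm: the rate functions `su_rate` and `pu_rate`, the uniform grid search, and every function and dictionary key name are hypothetical. The sketch also searches directly over the feasible interval `[0, p_peak]` rather than projecting an unconstrained maximizer.

```python
import math

def su_rate(x):
    # Assumed SU rate: Shannon formula on the noise-normalized receive power.
    return math.log2(1.0 + x)

def pu_rate(x, snr):
    # Assumed PU rate under noise-normalized interference x (hypothetical form).
    return math.log2(1.0 + snr / (1.0 + x))

def quality(p, beta, h2, pi, theta, rho, a, h1, snr):
    # Quality indicator of (10) for one user-channel pair at one instant.
    return (beta * su_rate(h2 * p) - pi * p
            - theta * a * h1 * p + rho * a * pu_rate(h1 * p, snr))

def best_power(p_peak, n_grid=2001, **csi):
    # Scalar grid search over [0, p_peak]; ties resolve to the lowest power.
    grid = [p_peak * i / (n_grid - 1) for i in range(n_grid)]
    return max(grid, key=lambda p: quality(p, **csi))

def schedule_band(users, theta, rho, a, snr):
    # Winner-takes-all rule of (12); index 0 is the virtual (silent) user.
    winner, power = 0, 0.0
    best_q = rho * a * pu_rate(0.0, snr)  # indicator of the virtual user
    for m, u in enumerate(users, start=1):
        csi = dict(beta=u["beta"], h2=u["h2"], pi=u["pi"], h1=u["h1"],
                   theta=theta, rho=rho, a=a, snr=snr)
        p = best_power(u["p_peak"], **csi)
        q = quality(p, **csi)
        if p > 0.0 and q > best_q:
            winner, power, best_q = m, p, q
    return winner, power
```

Calling `schedule_band` once per band with the current multipliers gives the allocation for the whole instant, since no information from other bands is needed. A finer grid, or a search restricted to the interval below the waterfilling power, would trade accuracy against cost.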

The expressions obtained in this section revealed how the optimal RA depends on the (perfect) CSI and the Lagrange multipliers. Schemes to compute the multipliers in our CR setup are discussed in the next section, while the alternatives to account for CSI imperfections are analyzed in Sec. V.

## IV Stochastic estimation of the multipliers

Different methods can be used to obtain the value of , and . Based on Lagrangian Duality Theory, are set to a constant value corresponding to the value that maximizes the dual function associated with (9). Since our problem has zero duality gap, when , and are substituted into (10)-(12), the resulting RA is the optimal solution of (9) [3]. To find such values, one has to resort to iterative search algorithms such as dual subgradient methods [3], which at each iteration update the value of the multiplier according to the long-term violation of the corresponding constraint (recall that, regardless of the convexity of the primal problem, the dual problem is always convex). Dual subgradient methods (either with constant or diminishing stepsize) and dual descent methods are reasonable alternatives for the problem at hand. Methods exploiting the separability in the dual domain can be used too. The main drawback associated with all previous methods is that at every iteration, the expectations in the long-term constraints (which require averaging over all possible states of ) need to be computed. Moreover, the multipliers have to be recomputed if either the long-term distribution of the channels or the number of users changes.

Recently, alternative approaches that rely on stochastic approximation tools have been proposed to find the value of the multipliers [23, 19, 27]. These approaches do not try to find the optimal value of , but time-varying estimates of them which are updated at every instant and remain sufficiently close to . An important advantage of these approaches is that their computational complexity is very low. Moreover, they exhibit additional advantages that are especially attractive in CR setups. Namely: i) they are robust to channel non-stationarities (which may arise in environments with interference); ii) they do not need to have statistical knowledge of the channels; and iii) they can cope with changes in either the secondary network (number of users, or QoS levels) or the primary network (limits on the interference power, rate loss, or capacity function of the PUs). In other words, stochastic schemes offer a way to learn the environment online and keep track of its time variation. As we will see, the only price to pay is that the resulting schemes are slightly suboptimal.

To be specific and rigorous, with , and being small and constant stepsizes, the following iterations are proposed

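\begin{align}
\pi^{m}[n+1] &= \tag{13}\\
\theta_{k}[n+1] &= \Big[\theta_{k}[n]-\eta_{\theta}\,a_{k,1}[n]\Big(\check{p}_{k,1}-\sum_{m} w^{m*}_{k,2}[n]\,h^{m}_{k,1}[n]\,p^{m*}_{k,2}[n]\Big)\Big]_{0}^{\infty} \tag{14}\\
\rho_{k}[n+1] &= \Big[\rho_{k}[n]+\eta_{\rho}\,a_{k,1}[n]\Big(\check{r}_{k,1}-\sum_{m} w^{m*}_{k,2}[n]\,r_{k,1}\big(h^{m}_{k,1}[n]\,p^{m}_{k,2}[n]\big)\Big)\Big]_{0}^{\infty}. \tag{15}
\end{align}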
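As an illustration, one step of the updates (14) and (15) for a single PU can be written as a projected stochastic subgradient step. The sketch below assumes that the aggregate interference and the achieved PU rate for the current instant (the two sums over the SUs) have already been computed from the RA; the function and argument names are hypothetical.

```python
def update_pu_multipliers(theta, rho, a, interference, pu_rate,
                          p_check, r_check, eta_theta, eta_rho):
    """One constant-stepsize iteration of (14) and (15) for a single PU.

    theta, rho   : current multiplier estimates
    a            : PU activity indicator at this instant (0 or 1)
    interference : sum over SUs of w * h * p under the current RA
    pu_rate      : sum over SUs of w * r(h * p) under the current RA
    p_check      : long-term interference power limit
    r_check      : long-term PU rate requirement
    """
    # (14): theta grows when the instantaneous interference exceeds its limit.
    theta_next = max(0.0, theta - eta_theta * a * (p_check - interference))
    # (15): rho grows when the instantaneous PU rate falls below its requirement.
    rho_next = max(0.0, rho + eta_rho * a * (r_check - pu_rate))
    return theta_next, rho_next
```

Each iteration only involves the current channel realization, so no expectation over the channel distribution has to be evaluated.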

From an optimization point of view, the updates in (13)-(15) form an unbiased stochastic subgradient of the dual function of (9); see [3]. Assuming that the updates in (13)-(15) are bounded, the following optimality/feasibility result can be shown. (A proof of this result is not presented here due to space limitations, but it can be derived following the lines of [23, 18].)

###### Proposition 1

The sample average of the stochastic RA: i) is feasible and ii) entails a small loss of performance relative to the optimal solution of (9). Specifically, defining ; ; ; ; and . Then, it holds with probability one that as :
i) , , , and
ii) , where as .

In words, the proposition guarantees asymptotic optimality of the stochastic iterates because they give rise to an RA which is feasible and achieves a value (performance) arbitrarily close to , which is the optimal objective that the original (non-stochastic) solution of (9) achieves [cf. (9a)]. Note also that can be used as a parameter to set the tradeoff between optimality and tracking capabilities. If optimality is the only concern, the stochastic iterations in (13)-(15) could be run using a time-varying stepsize which diminishes with time. Under mild conditions, it can be shown that such iterations converge to the optimal solution; see, e.g., [19] for details. Clearly, the price to pay in that case is that the algorithms would lose their tracking capabilities.

###### Remark 1

In this work, we have assumed that there is a central scheduler (AP) that gathers the CSI, finds the optimum RA, and runs the stochastic iterates. Moreover, we have also assumed that the signalling channels which convey the control information are error free. Nonetheless, it is worth remarking that the stochastic estimates are robust to errors. In fact, if the errors in the updates are bounded and have zero mean, then the results in Prop. 1 still hold. See [7] for a related result. In addition, the next section will show that our schemes are also robust to errors/imperfections in the CSI.

## V Imperfect channel state information

The optimal RA schemes were designed assuming that the CSI was perfect. Here, we relax that assumption and account for CSI imperfections. Although the assumption of perfect CSI may be reasonable for some wireless systems, it is unlikely to hold in CR scenarios (see related discussion in Sec. I). This is especially true for the CSI of the primary network, which is typically more difficult to obtain and entails a higher cost than that of secondary links. We first present different alternatives to model the CSI imperfections and then describe how the RA schemes have to be modified to account for them.

The main change in the formulation when the CSI is not perfect is that the values of , and (instantaneous CSI) are no longer deterministically known at instant . Rather, the knowledge of , and will be probabilistic and time varying. As a result, the CSI now will correspond to the probability density function (pdf) of , , available at time . Such a pdf will be referred to as instantaneous belief and denoted as , , , respectively. The specific expression for the instantaneous belief will depend on the operating conditions of the system. Focusing on for illustrative purposes, two extreme examples are analyzed next. First, consider the case when the CSI is perfect. For this case, the value of at instant is perfectly known, so that the belief at instant (instantaneous pdf) would be , where is a Dirac delta function. Consider now that no instantaneous measurements are available, so that only (long-term) statistical CSI is available. For the case of Rayleigh channels, the belief would be , where represents the average gain of the SU-to-PU channel. Clearly, in this case the belief would not vary with time.

Three different sources of imperfections are considered here: quantized CSI, noisy CSI, and outdated CSI. For each of them, we first give a high level description of how to model the imperfections and the corresponding belief. Then, we provide several examples that will allow us to gain insights and be more specific.

Regarding the first source of imperfections, research has consistently shown that feeding back a small number of information bits about the instantaneous channel conditions to the transmitter (or schedulers) can allow near optimal channel adaptation [15]. To implement such schemes, the channel domain has to be quantized into non-overlapping quantization regions. Such quantization can be carried out jointly for different channels (vector quantization) or separately for each of them. Once the quantizer is known, at each instant the transmitter is notified of the region the instantaneous channel falls into. The instantaneous belief will be given by the pdf of the channel gain within the active region.

A different source of imperfections is the presence of noise in the channel measurements. A zero-mean additive white noise is typically assumed, so that the belief will be given by the instantaneous channel measurement and the noise pdf. Many systems do not estimate the power gain of the channel, but its complex low-pass equivalent. In such a case, the (complex) noise would affect the low-pass equivalent. The belief in this case can be obtained from the actual measurement and the noise distribution, taking into account that the power gain is the squared modulus of the complex low-pass equivalent.

Finally, we also consider that the CSI may be outdated. This model is well motivated in CRs where sensing the (PU) channels entails a high cost, so that they cannot be sensed at every time instant. To update the belief in this case we need to assume a specific time-correlation model for the CSI. Based on that model and on the available measurements up to instant , the belief is estimated using stochastic prediction/correction schemes.

###### Example 1

A simple but very effective alternative to define the quantized CSI is to use a scalar quantizer for each of the channel gains. For example, focusing on the SU-to-SU channels, the domain of can be divided into non-overlapping intervals , where , stands for the th quantization threshold and and . Clearly, in this case bits suffice to identify the region (interval) the channel falls into. Most quantized CSI designs ignore the time-correlation of the channel and assume that the CSI is available instantaneously and free of errors [15]. In such a scenario, let be the index which identifies the region the channel falls into. If the channel follows an exponential distribution (Rayleigh model) and its average gain is , then the belief of at instant is .
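The scalar quantizer of this example can be sketched in Python as follows. The code assumes an exponentially distributed power gain, so that the belief within a region is a truncated exponential pdf; it returns the region index, the number of feedback bits, and the conditional mean of the gain inside a region (a quantity that is typically needed when taking expectations over the belief). The names `region_index`, `feedback_bits`, and `conditional_mean` are illustrative, and the thresholds passed to `region_index` are the interior ones (the first and last thresholds are taken to be zero and infinity).

```python
import math
from bisect import bisect_right


def region_index(gain, interior_thresholds):
    """Index l of the interval [tau_l, tau_{l+1}) that contains the gain."""
    return bisect_right(interior_thresholds, gain)


def feedback_bits(num_regions):
    """Number of bits needed to signal one out of num_regions regions."""
    return math.ceil(math.log2(num_regions))


def conditional_mean(lo, hi, mean_gain):
    """Mean of an exponential gain with average mean_gain, given lo <= gain < hi."""
    w_lo = math.exp(-lo / mean_gain)
    if math.isinf(hi):
        # Memoryless property: the excess over lo is again exponential.
        return mean_gain + lo
    w_hi = math.exp(-hi / mean_gain)
    return mean_gain + (lo * w_lo - hi * w_hi) / (w_lo - w_hi)
```

For instance, with interior thresholds `[1.0, 2.0]` the domain is split into three regions, so two feedback bits are enough.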

###### Example 2

The task of acquiring the Boolean variable is basically a detection problem. Consider that the output of the detection process is binary and denoted by . In order to incorporate the sensing errors into our model, we denote the probabilities of missed detection and false alarm as and , respectively. Based on those, we define and , where and stand for the long-term probabilities of and , respectively. If the time-correlation of is ignored, then the belief of at time is simply: if ; and if . Schemes to update the belief for more general sensing models and that leverage the time-correlation of the PUs activity can be found in, e.g., [17].
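When the time-correlation is ignored, the belief on the PU activity follows from Bayes' rule applied to the detector output. The sketch below is one way to write this computation; it is stated in terms of the probabilities of missed detection and false alarm and the long-term probability of the PU being active, and the argument names (`p_md`, `p_fa`, `prior_active`) are hypothetical.

```python
def activity_belief(detected, p_md, p_fa, prior_active):
    """Posterior probability that the PU is active given the binary detector output.

    detected     : True if the detector declares the PU active
    p_md         : probability of missed detection, P(output=0 | PU active)
    p_fa         : probability of false alarm, P(output=1 | PU idle)
    prior_active : long-term probability of the PU being active
    """
    if detected:
        num = (1.0 - p_md) * prior_active
        den = num + p_fa * (1.0 - prior_active)
    else:
        num = p_md * prior_active
        den = num + (1.0 - p_fa) * (1.0 - prior_active)
    # If the observed output has zero probability, fall back to the prior.
    return num / den if den > 0.0 else prior_active
```

With an error-free detector (`p_md = p_fa = 0`) the belief collapses to 0 or 1, which recovers the perfect-CSI case.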

###### Example 3

In this example, we design prediction/correction schemes for a practical channel/measurement model for the SU-to-PU channels. Let be the low-pass equivalent of the SU-to-PU channel, so that . We will assume that is a complex Gaussian process with independent real and imaginary parts (Rayleigh model). For notational convenience we will deal with as a vector whose first and second entries correspond to the real and imaginary parts, respectively. The time dynamics of are assumed to follow a first-order Markovian model with where represents the autocorrelation coefficient and an innovation process independent of . The process is assumed to be white and complex Gaussian distributed with zero mean and diagonal covariance matrix , where is the identity matrix [9]. Once the model of the ground-truth channel has been described, we introduce the model for the measurements and errors. For such a purpose, let denote a Boolean variable which is one if the channel is sensed at instant and zero otherwise. Moreover, let denote the noisy measurement of obtained if . The measurement is modeled as where is a white noise independent of which follows a complex Gaussian distribution with zero mean and diagonal covariance matrix . Let denote the pdf of at instant , conditioned on all measurements up to instant . Under the previous model, it readily follows that is a Gaussian pdf and its mean and covariance (denoted, respectively, as and ) suffice to describe the full distribution. The stochastic iterations to update and are described next.

If , then it holds that and . If , we first update the belief of the previous instant to get the predictions and . Then, we use the measurement to correct the predictions as follows:

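\begin{align}
\mu^{m}_{k}[n] &= \big(\hat{\upsilon}^{m}_{k}[n]+\nu^{m}_{k}\big)^{-1}\big(\hat{\upsilon}^{m}_{k}[n]\,\tilde{g}^{m}_{k}[n]+\nu^{m}_{k}\,\hat{\mu}^{m}_{k}[n]\big) \tag{16}\\
\upsilon^{m}_{k}[n] &= \big(\hat{\upsilon}^{m}_{k}[n]+\nu^{m}_{k}\big)^{-1}\big(\hat{\upsilon}^{m}_{k}[n]\,\nu^{m}_{k}\big). \tag{17}
\end{align}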

Clearly, when the updates correspond to those of a classical Kalman filter. Different prediction/correction steps will be required if either the time dynamics or the sensing errors are modeled differently. See, e.g., [17] for alternative models. As mentioned before, based on (instantaneous pdf of ), the belief (instantaneous pdf of ) can be obtained by using the transformation .
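The prediction/correction recursion of this example can be sketched in Python as shown next. The correction step implements (16) and (17) directly. The prediction step is an assumption: it takes the first-order Markovian model to be a standard AR(1) recursion in which the previous channel is scaled by a coefficient `corr` and an innovation with per-component variance `innov_var` is added. The variance is handled as a single per-component scalar (diagonal covariance with equal entries), and all function and argument names are hypothetical. When the channel is not sensed, the sketch simply keeps the prediction as the belief.

```python
import numpy as np


def predict(mu_prev, var_prev, corr, innov_var):
    """Prediction under the assumed AR(1) model g[n] = corr * g[n-1] + z[n]."""
    mu_pred = corr * np.asarray(mu_prev, dtype=float)
    var_pred = corr ** 2 * var_prev + innov_var
    return mu_pred, var_pred


def correct(mu_pred, var_pred, meas, noise_var):
    """Correction step, cf. (16) and (17)."""
    den = var_pred + noise_var
    mu = (var_pred * np.asarray(meas, dtype=float) + noise_var * mu_pred) / den
    var = var_pred * noise_var / den
    return mu, var


def belief_step(mu_prev, var_prev, corr, innov_var, sensed,
                meas=None, noise_var=None):
    """One belief update; without a measurement the prediction is kept."""
    mu_pred, var_pred = predict(mu_prev, var_prev, corr, innov_var)
    if not sensed:
        return mu_pred, var_pred
    return correct(mu_pred, var_pred, meas, noise_var)


def expected_power_gain(mu, var):
    """Mean of the squared modulus: |mu|^2 plus the two per-component variances."""
    mu = np.asarray(mu, dtype=float)
    return float(np.dot(mu, mu) + 2.0 * var)
```

The helper `expected_power_gain` gives the first moment of the power gain under the Gaussian belief, which is the quantity needed when only the average interference matters.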

To finish this section, we introduce notation to denote the overall imperfect CSI available at time . For example, suppose that: a) the CSI of the SU-to-SU gains is quantized as described in Example 1; b) the errors on the activity of the PUs follow the model described in Example 2; and c) the CSI of the SU-to-PU channels is outdated and noisy as described in Example 3. With these operating conditions, is a vector of length containing: i) the region index of each of the gains of the SU-to-SU links; ii) the probability of each of the PUs being active; and iii) the means and variances of the SU-to-PU links. Clearly, based on the information gathered in , the instantaneous beliefs , , can be trivially obtained. For notational convenience, we will use to denote the belief of the CSI of the overall system. Moreover, will be written as whenever it is convenient to stress the dependence on .

### V-A Modifying the RA schemes

The first step to design RA schemes capable of accounting for CSI imperfections is to modify the formulation of the constraints which depend explicitly on the instantaneous CSI. Strictly speaking, the formulation of the long-term constraints in (2), (3) and (4) (and the objective ) do not have to be modified. One just has to take into account that the total expectation in those constraints can be rewritten as . The notation emphasizes that the inner expectation is taken over , and according to the pdfs in . Differently, the short-term constraints in (7) and (8) need to be modified. When the CSI is imperfect, those constraints involve random variables, so that strict satisfaction of the constraints may be impossible (e.g., if the instantaneous belief has infinite support). As a result, the constraints have to be reformulated. A reasonable reformulation is to take expectations across the instantaneous belief at both sides of the constraints and consider

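\begin{align}
\mathbb{E}_{b_{k,1}(x|n)}\big[a_{k,1}[n]\big]\,\mathbb{E}_{b^{m}_{k,1}(x|n)}\big[h^{m}_{k,1}[n]\big]\,p^{m}_{k,2}[n] &\le \mathbb{E}_{b_{k,1}(x|n)}\big[a_{k,1}[n]\big]\,\check{p}_{k,1}, \tag{18}\\
\mathbb{E}_{b_{k,1}(x|n)}\big[a_{k,1}[n]\big]\,\mathbb{E}_{b^{m}_{k,1}(x|n)}\big[r_{k,1}\big(h^{m}_{k,1}[n]\,p^{m}_{k,2}[n]\big)\big] &\ge \mathbb{E}_{b_{k,1}(x|n)}\big[a_{k,1}[n]\big]\,\check{r}_{k,1}. \tag{19}
\end{align}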

Note that to gain intuition in (18) and (19) we have implicitly assumed that and are independent, so that the expectations were obtained separately. The long-term expectations in (3) and (4) are different from those in (18) and (19). In the former, the expectations were taken considering all time instants. In the latter, the expectations are taken at instant and only over the CSI uncertainties. Clearly, as the knowledge of the CSI improves, the beliefs approximate to a Dirac delta centered in the actual value of the channel and hence, the constraints in (18) and (19) approximate to those in (7) and (8). As we did in Sec. II-C, to handle the short-term DSA constraints we solve (18) and (19) w.r.t. and redefine the maximum instantaneous peak power constraint as , where and are the roots of (18) and (19), respectively. Another reasonable reformulation to handle the CSI imperfections is to consider that (7) and (8) need to hold with a certain short-term probability (e.g., the probability of the interference power at time exceeding has to be less than a certain value). The procedure to deal with the constraints would be similar. The instantaneous belief would be used to solve the constraints w.r.t. the , the corresponding values of and would be found, and such values would be used to obtain .
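To make the procedure concrete, the following sketch solves (18) and (19) for the power when the belief on the SU-to-PU gain is represented by a set of samples. The expected PU activity cancels on both sides when it is positive, so only the gain belief is needed. Constraint (18) is linear in the power and has a closed-form root; constraint (19) is solved by bisection, assuming that the PU rate decreases with the interference. The rate function `pu_rate` (a base-2 logarithmic capacity with a hypothetical PU signal-to-noise ratio `gamma`), the sample-based expectation, and all names are assumptions made for illustration.

```python
import numpy as np


def pu_rate(x, gamma):
    """Assumed PU rate as a function of the received interference power x."""
    return np.log2(1.0 + gamma / (1.0 + x))


def max_power_interference(h_samples, p_check):
    """Root of (18): largest power whose expected interference is at most p_check."""
    mean_h = float(np.mean(h_samples))
    return p_check / mean_h if mean_h > 0.0 else float("inf")


def max_power_rate(h_samples, r_check, gamma, p_hi=1e6, tol=1e-9):
    """Root of (19): largest power whose expected PU rate is at least r_check."""
    h = np.asarray(h_samples, dtype=float)

    def mean_rate(p):
        return float(np.mean(pu_rate(h * p, gamma)))

    if mean_rate(0.0) < r_check:
        return 0.0            # infeasible even without interference
    if mean_rate(p_hi) >= r_check:
        return p_hi           # constraint inactive over the search range
    lo, hi = 0.0, p_hi
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if mean_rate(mid) >= r_check:
            lo = mid          # mid is feasible: move the lower end up
        else:
            hi = mid
    return lo
```

A natural way to combine the two roots, assumed here, is to take the minimum of both values and the original peak power limit as the redefined instantaneous power cap.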

With these modifications in mind, it can be shown (see App. A) that the optimal RA with imperfect CSI is

 ~φmk(pmk,2[n]) := \mathbbmEb(x|n)[φmk(pmk,2[n])], (20) pm∗k,2[n] := [argmaxpmk,