Joint Sensing and Power Allocation in Nonconvex Cognitive Radio Games: Nash Equilibria and Distributed Algorithms
In this paper, we propose a novel class of Nash problems for Cognitive Radio (CR) networks, modeled as Gaussian frequency-selective interference channels, wherein each secondary user (SU) competes against the others to maximize his own opportunistic throughput by choosing jointly the sensing duration, the detection thresholds, and the vector power allocation. The proposed general formulation accommodates several (transmit) power and (deterministic/probabilistic) interference constraints, such as constraints on the maximum individual and/or aggregate (probabilistic) interference tolerable at the primary receivers. To keep the optimization as decentralized as possible, global (coupling) interference constraints are imposed by penalizing each SU with a set of time-varying prices based upon his contribution to the total interference; the prices are thus additional variables to optimize. The resulting players’ optimization problems are nonconvex; moreover, there are possibly price clearing conditions associated with the global constraints to be satisfied by the solution. All this makes the analysis of the proposed games a challenging task; none of the classical results in the game theory literature can be successfully applied.
The main contribution of this paper is to develop a novel optimization-based theory for studying the proposed nonconvex games; we provide a comprehensive analysis of the existence and uniqueness of a standard Nash equilibrium, devise alternative best-response based algorithms, and establish their convergence. Some of the proposed algorithms are totally distributed and asynchronous, whereas some others require limited signaling among the SUs (in the form of consensus algorithms) in favor of better performance; overall, they are thus applicable to a variety of CR scenarios, either cooperative or noncooperative, which allows the SUs to explore the existing trade-off between signaling and performance.
Over the past decade, there has been a growing interest in Cognitive Radio (CR) as an emerging paradigm to address the de jure shortage of allocated spectrum that contrasts with the de facto abundance of unused spectrum in virtually any spatial location at almost any given time. The paradigm posits that so-called cognitive radios [also termed as secondary users (SUs)] would use licensed spectrum in an ad-hoc fashion in such a way as to cause no harmful interference to the primary spectrum license holders [also termed as primary users (PUs)]. Evidently, such an opportunistic spectrum access is intertwined with the design of multiple secondary system components, such as (but not limited to) spectrum sensing and transmission parameters adaptation. Indeed, the choice of the sensing parameters (e.g., the detection thresholds and the sensing duration) as well as the consequent design of the physical layer transmission strategies (e.g., the transmission rate, the power allocation) have both a direct impact on the performance of primary and secondary systems. The interplay between these two interacting components calls for a joint optimization of the sensing and transmission parameters of the SUs, which is the main focus of this paper.
1.1 Motivation and related work
The joint optimization of the sensing and transmission strategies has been only partially addressed in the literature, even for simple CR scenarios composed of one PU and one SU. For example, in [1, 2], the authors proposed alternative centralized schemes that optimize the detection thresholds for a bank of energy detectors, in order to maximize the opportunistic throughput of a SU, for a given sensing time and constant-rate/power transmissions. The optimization of the sensing time and the sensing time/detection thresholds for a given missed detection probability and constant rate of one SU was addressed in [3, 4] and , respectively. A throughput-sensing trade-off for a fixed transmission rate was studied in . In  (or ) the authors focused on the joint optimization of the power allocation and the equi-false alarm rate (or the sensing time) of a SU over multi-channel links, for a fixed sensing time (or detection probability). All the aforementioned schemes however are not applicable to scenarios composed of multiple SUs (and PUs). The case of multiple SUs and one PU was considered in  (and more recently in ), under the same assumptions of ; however no formal analysis of the proposed formulation was provided.
The transceiver design of OFDM-based CR systems composed of multiple primary and secondary users has been widely studied in the literature on power control over the interference channel, and has traditionally been approached from two very different perspectives: a holistic design of the system and an individual selfish design of each of the users. The former is also referred to as Network Utility Maximization (NUM) (other approaches within this perspective are based on Nash bargaining formulations) and has the potential of obtaining the best of the network at the expense of a centralized computation or heavy signaling/cooperation among the users; examples are [11, 12, 13, 14, 15, 16, 17]. The latter fits perfectly within the mathematical framework of Game Theory and usually leads to distributed algorithms at the expense of a loss of global performance; related papers are [18, 19, 20, 21, 22, 23], and two recent overviews are [24, 25]. In both of the aforementioned approaches the sensing process is not considered as part of the optimization; in fact, the SUs do not perform any sensing but are allowed to transmit over the licensed spectrum provided that they satisfy the interference constraints imposed by the PUs, regardless of whether the PUs are active or not.
When the sensing comes explicitly into the system design, the application of the holistic approach mentioned above leads to nonconvex, NP-hard optimization problems. These problems cannot be solved globally by efficient polynomial-time algorithms; one can typically design only (centralized) suboptimal algorithms that converge to a stationary solution. Their implementation, however, would require heavy signaling among the users (or the presence of a centralized network controller with knowledge of all the system parameters), which strongly limits the applicability of such formulations to practical CR networks. For these reasons, in this paper, we attack the multi-agent decision making problem from a different perspective; we concentrate on optimization strategies where the SUs are able to self-enforce the negotiated agreements on the usage of the licensed spectrum either in a totally decentralized way or by requiring limited and local signaling among the SUs (in the form of consensus algorithms). Since it aims at exploring the trade-off between signaling and performance, the proposed approach is expected to be more flexible than classical optimization techniques and applicable to a wider range of CR scenarios.
1.2 Main contributions
This paper along with our companion work  advances the current approaches (based on the optimization of specific components of a CR system in isolation), in the direction of a joint and distributed design of sensing and transmission parameters of a CR network, composed of multiple PUs and SUs.
We study a novel class of Nash equilibrium problems as proposed in , wherein each SU aims at maximizing his own opportunistic throughput by jointly optimizing the sensing parameters (the sensing time and the false alarm rate, and thus the decision thresholds, of a bank of energy detectors) and the power allocation over the multi-channel links. Because of sensing errors, the SUs might access the licensed spectrum when it is still occupied by active PUs, thus causing harmful interference. This motivates the introduction of probabilistic interference constraints that are imposed to control the power radiated over the licensed spectrum whenever a missed detection event occurs (in a probabilistic sense). The proposed formulation accommodates alternative combinations of power/interference constraints. For instance, on top of classical (deterministic) transmit power (and possibly spectral mask) constraints, we envisage the use of constraints on the average individual (i.e., on each SU) and/or global (i.e., over all the SUs) interference tolerable at the primary receivers. The former class of constraints is more suitable for scenarios where the SUs are not willing to cooperate, whereas the latter constraints, which are less conservative, seem more realistic in settings where SUs may want to trade some limited signaling for better performance. By imposing a coupling among the transmit and sensing strategies of the SUs, global interference constraints introduce a new challenge in the system design: how to enforce global interference constraints without requiring a centralized optimization but possibly only limited signaling among the SUs? We address this issue by introducing a pricing mechanism in the game, through a penalization in the players’ objective functions. The prices need to be chosen so that the interference constraints are satisfied at any solution of the game and a clearing condition holds; they are thus additional variables to be determined.
The resulting class of games is nonconvex (because of the nonconvexity of the players’ payoff functions and constraints), lacks boundedness in the price variables, and has side constraints with associated price equilibration that must be satisfied by the equilibrium; all these features make the analysis a challenging task. The convexity of the players’ individual optimization problems is, in fact, one indispensable assumption under which noncooperative games have traditionally been studied and analyzed. The classical case where a NE exists is indeed when the players’ objective functions are (quasi-)convex in their own variables with the other players’ strategies fixed, and the players’ constraint sets are compact and convex and independent of their rivals’ strategies (see, e.g., [27, 28]). Without such convexity, a NE may not exist (as in the well-known case of a matrix game with pure strategies); analytically, abstract mathematical theories granting its existence, like those in [29, 30], are difficult to apply to games arising from realistic applications such as those arising in the present paper.
The main contribution of this work is to develop a novel optimization-based theory for the solution analysis of the proposed class of nonconvex games (possibly) with side constraints and price clearing conditions, and to design distributed best-response based algorithms for computing the Nash equilibria, along with their convergence properties. Building on , the solution analysis is addressed by introducing a “best-response” map (including price variables) defined on a proper convex and compact set, whose fixed-points, if they exist, are Nash equilibria of the original nonconvex games; the obtained conditions are in fact sufficient for such a map to be a single-valued continuous map; this enables the application of the Brouwer fixed-point theorem to deduce the existence of a fixed-point of the best-response map, and thus of a NE of the whole class of proposed games. While seemingly very simple, the technical difficulty lies in deriving (reasonable) conditions under which the best-response map is single-valued and the prices are bounded, so that there exists a compact set on which the Brouwer result can be based. Interestingly, the obtained conditions have the same physical interpretation as those obtained for the convergence of the renowned iterative waterfilling algorithm solving the power control game over interference channels [18, 19, 20, 21, 22]. We then focus on solution schemes for the proposed class of games; we design alternative distributed (possibly) asynchronous best-response based algorithms that differ in performance, level of protection of the PUs, computational effort, degree of cooperation/signaling among the SUs, and convergence speed, which makes them applicable to a variety of CR scenarios (either cooperative or noncooperative). For each algorithm, we establish its convergence and also quantify the time and communication costs for its implementation.
Our numerical results show that: i) the proposed joint sensing/transmission optimization outperforms current centralized and decentralized state-of-the-art results based on separated optimization of the sensing and the transmission parts; ii) our algorithms exhibit a fast convergence behavior; and iii) as expected, some (limited) cooperation among the SUs (in the form of consensus algorithms) yields a significant improvement in the system performance. The proposed solution schemes can also be used to compute the so-called Quasi-NE of the associated games, a relaxed equilibrium concept introduced and studied in our companion paper .
The paper is organized as follows. Sec. 2 briefly introduces the system model, as proposed in ; Sec. 3 focuses on the system design and formulates the joint optimization of the sensing parameters and the power allocation of the SUs within the framework of game theory; several games are introduced. The solution analysis of the proposed games is addressed in Sec. 4, where sufficient conditions for the existence and uniqueness of a standard NE along with their interpretation are derived. Distributed algorithms solving the proposed games along with their convergence properties and computational/communication complexity are studied in Sec. 5. Numerical experiments are reported in Sec. 6, whereas Sec. 7 draws the conclusions. Proofs of our results are given in Appendices A-F. The paper requires a background on Variational Inequalities (VIs); we refer to [32, 33] for an introductory overview of the subject and its application to equilibrium problems in signal processing and communications. A comprehensive treatment of VIs can be found in the two monographs [34, 35]; a detailed study of convex games based on the VI and complementarity approach is addressed in [36, 22]. The main properties of Z and P matrices, which are widely used in the paper, can be found in [34, 37].
2 System Model
We consider a scenario composed of active SUs, each consisting of a transmitter-receiver pair, coexisting in the same area and sharing the same band with PUs. The network of the SUs is modeled as an -frequency-selective SISO Interference Channel (IC), where is the number of subcarriers available to the cognitive users. We focus on multicarrier block-transmissions without loss of generality. In order not to interfere with on-going PU transmissions, before transmitting, the SUs sense periodically the licensed spectrum looking for the subcarriers that are temporarily not occupied by the PUs. A brief description of the sensing mechanism and transmission phase performed by the SUs as proposed in the companion paper  is given in the following, where we introduce the basic definitions and notation used throughout the paper; we refer the reader to  for details and the assumptions underlying the proposed model.
2.1 The spectrum sensing phase
In , we formulated the sensing problem as a binary hypothesis testing; the decision rule of SU over carrier based on the energy detector is
where is the received baseband complex signal over carrier ; is the number of samples, with and denoting the sensing time and the sampling frequency, respectively; is the decision threshold for the carrier ; represents the absence of any primary signal over the subcarrier , whereas represents the presence of the primary signaling.
The performance of the energy detection performed by SU over carrier is measured in terms of the detection probability and false alarm probability . Under standard assumptions in decision theory, these probabilities are given by 
where is the Q-function, and , , , and are constant parameters, whose explicit expressions are given in . The detection probability can also be rewritten as a function of the false alarm rate as:
where we also introduced the definition of the missed detection probability .
The interpretation of and within the CR scenario is the following: signifies the probability that the SU successfully identifies a spectral hole over carrier , whereas the missed detection probability represents the probability of SU failing to detect the presence of the PUs on the subchannel and thus generating interference against the PUs. The free variables to optimize are the detection thresholds ’s and the sensing times ’s; ideally, we would like to choose ’s and ’s in order to minimize both and , but (3) shows that there exists a trade-off between these two quantities that will affect both primary and secondary performance. It turns out that ’s and ’s cannot be chosen by focusing only on the detection problem (as in classical decision theory); rather, the optimal choice of and must be the result of a joint optimization of the sensing and transmission strategies over the two phases; such an optimization is introduced in Sec. 3.
Robust sensing model. The proposed sensing model can be generalized in several directions; see [38, 26]. For instance, one can explicitly take into account device-level uncertainties (e.g., uncertainty in the power spectral density of the PUs’ signals and thermal noise) as well as system-level uncertainties (e.g., the current number of active PUs) by modeling the detection process of the primary signals as a composite hypothesis testing problem. This leads to a uniformly most-powerful detector scheme that is robust against device-level and system-level uncertainties; details can be found in [38, 26] and are omitted here. It is important however to remark that the resulting detection probability and false alarm rate of the aforementioned robust scheme are still given by (2) and (3), but with a different expression for ’s and ’s . This means that the analysis and results developed in the next sections are valid also for this more general model.
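The trade-off between false alarm and missed detection can be illustrated numerically. The sketch below is not the paper's exact expression: it assumes a Gaussian approximation of the energy-detector statistic, with hypothetical constants `mu0`, `mu1` (means) and `sigma0`, `sigma1` (standard deviations) under the two hypotheses, and hypothetical values for the sensing time `tau` and the sampling frequency `fs`.

```python
from math import erfc, sqrt
from statistics import NormalDist


def q_func(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * erfc(x / sqrt(2.0))


def q_inv(p):
    """Inverse of the Q-function for 0 < p < 1."""
    return NormalDist().inv_cdf(1.0 - p)


def detection_prob(p_fa, tau, fs, mu0, mu1, sigma0, sigma1):
    """Detection probability as a function of the false alarm rate.

    Assumed model (illustrative only, with threshold gamma):
      P_fa = Q(sqrt(tau * fs) * (gamma - mu0) / sigma0)
      P_d  = Q(sqrt(tau * fs) * (gamma - mu1) / sigma1)
    Eliminating gamma between the two gives the expression returned here.
    """
    n = sqrt(tau * fs)
    return q_func((sigma0 * q_inv(p_fa) - n * (mu1 - mu0)) / sigma1)


if __name__ == "__main__":
    # Hypothetical detector constants, chosen only for illustration.
    params = dict(fs=1e6, mu0=1.0, mu1=1.1, sigma0=1.0, sigma1=1.1)
    for p_fa in (0.2, 0.1, 0.01):
        p_miss = 1.0 - detection_prob(p_fa, tau=1e-3, **params)
        print(f"P_fa = {p_fa:.2f}  ->  P_miss = {p_miss:.4f}")
```

Under these assumptions, lowering the false alarm rate at a fixed sensing time raises the missed detection probability, while a longer sensing time improves both at the cost of transmission time.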
2.2 The transmission phase
The transmission strategy of each SU is the power allocation vector over the subcarriers, subject to the following (local) transmit power constraints
where denotes possibly spectral mask [the vector inequality in (4) is component-wise].
According to the opportunistic transmission paradigm, each subcarrier is available for the transmission of SU if no primary signal is detected over that frequency band, which happens with probability . This motivates the use of the aggregate opportunistic throughput as a measure of the spectrum efficiency of each SU . Given the power allocation profile of the SUs, the target false alarm rate (assumed to be equal over the whole licensed spectrum), the sensing time , and taking the log of the opportunistic throughput, the payoff function of each SU is then (see  for more details)
where , with , is the portion of the frame duration available for opportunistic transmissions and is the maximum information rate achievable on link over carrier when no primary signal is detected and the power allocation profile of the SUs is :
with and , where is the channel transfer function of the direct link and is the cross-channel transfer function between the secondary transmitter and the secondary receiver ; and is the power spectral density (PSD) of the background noise over carrier at the receiver (assumed to be Gaussian zero-mean distributed).
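As a rough illustration of the payoff just described, the following sketch evaluates the log of the opportunistic throughput of one SU. The data layout is an assumption made for this example only: `H[r][q][k]` holds the squared channel gain from secondary transmitter `r` to secondary receiver `q` on carrier `k`, `noise[q][k]` is the noise PSD, `T` is the frame duration, and rates are measured with the natural logarithm.

```python
from math import log


def rate(k, q, p, H, noise):
    """Rate of link q on carrier k, treating the other SUs' signals as noise."""
    interference = sum(
        H[r][q][k] * p[r][k] for r in range(len(p)) if r != q
    )
    return log(1.0 + H[q][q][k] * p[q][k] / (noise[q][k] + interference))


def payoff(q, p, p_fa, tau, T, H, noise):
    """Log of the opportunistic throughput of SU q.

    (1 - tau / T) is the fraction of the frame left for transmission;
    (1 - p_fa) is the probability that a free carrier is declared free
    (the same false alarm rate is used on every carrier).
    """
    n_carriers = len(p[q])
    throughput = (1.0 - tau / T) * sum(
        (1.0 - p_fa) * rate(k, q, p, H, noise) for k in range(n_carriers)
    )
    return log(throughput)
```

A longer sensing time or a larger false alarm rate lowers this payoff directly; both can only pay off through the interference constraints, which is why the two phases have to be optimized jointly.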
As a final remark note that the throughput defined in (5) is not the average throughput experienced by the SUs, which instead would include an additional rate contribution resulting from the erroneous decision of the SUs to transmit over the licensed spectrum still occupied by the PUs. We have not included this contribution in the objective functions of the SUs because in maximizing the function we do not want to “incentivize” the undue usage of the licensed spectrum. Moreover, differently from the opportunistic throughput in (5), the maximization of the average throughput would require the knowledge from the SUs of the a-priori probabilities of the PUs’ spectrum occupancy, which is in general not available.
2.3 Probabilistic interference constraints
Due to the inherent trade-off between and [see (2) and (3)], maximizing the aggregate opportunistic throughput (5) of SUs will result in low and thus large , hence causing harmful interference to PUs. To allow the SUs’ transmissions while preserving the QoS of the PUs, we envisage the use of probabilistic interference constraints that limit the interference generated by the SUs whenever they misdetect the presence of a PU. Examples of these constraints are the following:
Individual overall bandwidth interference constraint: for each SU
Global overall bandwidth interference constraints:
where [or ] are the maximum average interference allowed to be generated by the SU [or all the SU’s] that is tolerable at the primary receiver; and ’s are a given set of positive weights. If an estimate of the cross-channel transfer functions between the secondary transmitters and the primary receiver is available, then the natural choice for is , so that (7) and (8) become the average interference experienced at the primary receiver. Methods to obtain the interference limits along with some implementation aspects related to this issue and alternative interference constraints are discussed in Sec. 5.1.1.
We wish to point out that other interference constraints, like per-carrier interference constraints, as well as multiple PUs can be readily accommodated, without affecting the analysis and results that will be presented in the forthcoming sections. For notational simplicity, we stay within the above setting.
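To make the two kinds of constraints concrete, the sketch below computes the average interference as a weighted sum, over the carriers, of missed detection probability times transmit power. This form is an assumption inferred from the description above rather than a transcription of (7) and (8); the function names and the nested-list layout (one inner list per SU, one entry per carrier) are hypothetical.

```python
def individual_interference(p_miss_q, w_q, p_q):
    """Average interference from one SU: sum_k P_miss[k] * w[k] * p[k]."""
    return sum(pm * w * p for pm, w, p in zip(p_miss_q, w_q, p_q))


def global_interference(p_miss, w, p):
    """Average interference from all SUs: the sum of individual terms."""
    return sum(
        individual_interference(pm_q, w_q, p_q)
        for pm_q, w_q, p_q in zip(p_miss, w, p)
    )


def satisfies_individual(p_miss, w, p, i_max_per_su):
    """True if every SU respects its own interference limit."""
    return all(
        individual_interference(pm_q, w_q, p_q) <= limit
        for pm_q, w_q, p_q, limit in zip(p_miss, w, p, i_max_per_su)
    )


def satisfies_global(p_miss, w, p, i_max_total):
    """True if the aggregate interference respects the global limit."""
    return global_interference(p_miss, w, p) <= i_max_total
```

If the individual limits sum to at most the global limit, any profile that meets every individual constraint also meets the global one, but not conversely; this is the sense in which global constraints are less conservative.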
3 System Design based on Game Theory
We focus now on the system design and formulate the joint optimization of the sensing parameters and the power allocation of the SUs within the framework of game theory. We consider next two classes of equilibrium problems: i) games with individual constraints only (Sec. 3.1 below); and ii) games with individual and global constraints (Sec. 3.2 and Sec. 3.3 below). The former formulation is suitable for modeling scenarios where the SUs are selfish users who are not willing to cooperate, whereas the latter class of games is applicable to the design of systems where the SUs can exchange limited signaling in favor of better performance. Indeed, being less conservative than individual interference constraints, global interference constraints are expected to yield better performance of the SUs at the cost of more signaling. The aforementioned formulations are thus applicable to complementary CR scenarios.
3.1 Game with local interference constraints
In the proposed game, each SU is modeled as a player who aims to maximize his own opportunistic throughput by choosing jointly a proper power allocation strategy , sensing time , and false alarm rate , subject to power and individual probabilistic interference constraints. Stated in mathematical terms we have the following formulation.
In (9) we also included additional lower and upper bounds of satisfying and upper bounds on detection and missed detection probabilities and , respectively. These bounds provide additional degrees of freedom to limit the probability of interference to the PUs as well as to maintain a certain level of opportunistic spectrum utilization from the SUs . Note that the constraints and do not represent a real loss of generality, because practical CR systems are required to satisfy even stronger constraints on false alarm and detection probabilities; for instance, in the WRAN standard, .
3.2 Game with global interference constraints
We add now global interference constraints to the game theoretical formulation in (9). This introduces a new challenge: how to enforce global interference constraints in a distributed way? By imposing a coupling among the transmissions and the sensing strategies of all the SUs, global interference constraints in principle would call for a centralized optimization. To overcome this issue, we introduce a pricing mechanism in the game, based on the relaxation of the coupling interference constraints as a penalty term in the SUs’ objective functions, so that the interference generated by all the SUs will depend on these prices. Prices are thus additional variables to be optimized (there is one common price associated with each of the global interference constraints); they must be chosen so that any solution of the game will satisfy the global interference constraints, which requires the introduction of additional constraints on the prices, in the form of price clearance conditions. Denoting by the price variable associated with the global interference constraint (8), we have the following formulation.
Player ’s optimization problem is to determine, for given and , a tuple such that (10) Price equilibrium: The price obeys the following complementarity condition: (11)
In (11), the compact notation $0 \leq a \perp b \geq 0$ means $a \geq 0$, $b \geq 0$, and $a \cdot b = 0$. The price clearance conditions (11) state that the global interference constraint (8) must be satisfied together with a nonnegative price; in addition, they imply that if the global interference constraint holds with strict inequality then the price should be zero (no penalty is needed). Thus, at any solution of the game, the optimal price is such that the global interference constraint is satisfied.
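The price clearance condition can be checked numerically as follows. This is a minimal sketch: the function name, the scalar arguments, and the tolerance `tol` are assumptions of this example, and `total_interference` stands for the left-hand side of the global interference constraint evaluated at a candidate solution.

```python
def price_clearing_holds(price, total_interference, i_max, tol=1e-9):
    """Check the complementarity condition 0 <= price  _|_  slack >= 0.

    slack = i_max - total_interference is the unused interference budget.
    The condition requires a nonnegative price, a satisfied constraint,
    and a zero price whenever the constraint is not tight.
    """
    slack = i_max - total_interference
    return (
        price >= -tol
        and slack >= -tol
        and abs(price * slack) <= tol
    )


if __name__ == "__main__":
    print(price_clearing_holds(0.0, 0.5, 1.0))  # inactive constraint, zero price
    print(price_clearing_holds(2.0, 1.0, 1.0))  # tight constraint, positive price
    print(price_clearing_holds(2.0, 0.5, 1.0))  # positive price with slack left
    print(price_clearing_holds(0.0, 1.5, 1.0))  # constraint violated
```

The first two cases satisfy the condition; the last two do not, because a positive price is charged although the budget is not exhausted, or because the budget is exceeded.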
3.3 The equi-sensing case
The decision model proposed in Sec. 2.1 is based on the assumption that the SUs are somehow able to distinguish between primary and secondary signaling. This can be naturally accomplished if there is a common sensing time (still to optimize) during which all the SUs stay silent while sensing the spectrum. However, the formulation (10), in general, leads to different optimal sensing times of the SUs, implying that some SUs may start transmitting while others are still in the sensing phase. To overcome this issue, several directions have been explored in the companion paper , under the model (10)-(11). Here we follow the approach of modifying the formulation in (10) in order to “force” in a distributed way the same optimal sensing time for all the SUs. Roughly speaking, the idea is to perturb the payoff functions of the players by a penalty term that discourages the players from deviating from equi-sensing strategies. Stated in mathematical terms, we have the following formulation.
Player ’s optimization problem is to determine, for given , and , a tuple in order to (12) Price equilibrium: The price obeys the complementarity condition (11).
The third term in the objective function of each SU in (12) helps to induce the same optimal sensing time for all the SUs. Roughly speaking, one expects that for sufficiently large , the aforementioned term will become the dominant term in the objective functions of the SUs, thus leading to solutions of the game whose sensing times differ from their average by no more than any prescribed accuracy. This intuition has been made formal in our companion paper  for stationary solutions of the game (12), and it can be similarly extended to the Nash equilibria; we omit the details because of space limitations.
3.4 Unified formulation and summary of notation
In this section, we introduce a compact and unified formulation of the proposed games that simplifies their analysis. Let us start by separating the convex constraints in the feasible set of the players from the nonconvex ones. The interference constraints (a) in (9) are bi-convex and thus not convex, whereas constraints (b) are convex in and . This motivates the following change of variables:
so that the constraints on in each player’s feasible set become convex in the tuple [with ]. Indeed, for each , we have
where denotes the inverse of the Q-function [ is a strictly decreasing function on ], which are convex constraints in [provided that ]. Using the above transformation, we can equivalently rewrite the missed detection probability and the throughput of each player in terms of the tuples ’s, denoted by and , respectively; the explicit expression of these quantities is:
To incorporate the equi-sensing case in our unified formulation, we introduce the functions , which represent the objective functions of the users including the equi-sensing term, with denoting the strategy profile of all the players:
We can now rewrite the feasible set of each player’s optimization problem in terms of the new variables , denoted by : for each let
where we have separated the convex part and the nonconvex part; the convex part is given by the polyhedron corresponding to the constraints (b) and (c) in (9) under the transformation (13) [cf. (14)]:
whereas the nonconvex part in (18) is given by the constraint (a) that we have rewritten as by introducing the local interference violation function
This measures the violation of the local interference constraint (a) at . Similarly, it is convenient to introduce also the global interference violation function , which depends on the strategy profile of all the players:
Based on the above definitions, throughout the paper, we will use the following notation. The convex part of the joint strategy set is denoted by , whereas the set containing all the (convex part of) players’ strategy sets except the -th one is denoted by ; similarly, we define and . For notational simplicity, when it is needed, we will use interchangeably either or to denote the strategy tuple of player ; similarly, the strategy profile of all the players will be denoted either by or , with , and , whereas is the strategy profile of all the players except the -th one. All the tuples above are intended to be column vectors; for instance, signifies , with , where each and For future convenience, Table 1 collects the above definitions and symbols. Using the above notation, the games introduced in the previous sections can be unified under the following reformulation.
Players’ optimization. The optimization problem of player is: (23) Price equilibrium. The price obeys the following complementarity condition: (24)
|sensing time of SU|
|power allocation vector of SU|
|scalar price variable|
|false alarm probability of SU|
|missed detection probability of SU on carrier [cf. (3)]|
|normalized sensing time of SU [cf. (13)]|
|strategy tuple of SU|
|strategy profile of all the SUs except the -th one|
|strategy profile of all the SUs|
|payoff function of SU including the equisensing penalization [cf. (17)]|
|local interference constraint violation of SU [cf. (21)]|
|global interference constraint violation of SU [cf. (22)]|
|,||feasible set of SU [cf. (18)], joint feasible strategy set of|
|joint strategy set of the SUs except the -th one|
|,||convex part of [cf. (19)], Cartesian product of all ’s|
Needless to say, when and , reduces to the game in (9) where there are only individual interference constraints (7), whereas when , coincides with the game in (10)-(11) with local and global interference constraints.
As a final remark, we observe that the proposed formulations may be extended to cover more general settings, without affecting the validity of the results we are going to present. For instance, the case of multiple active PUs and additional local/global interference constraints (such as per-carrier constraints) can be readily accommodated: Instead of having a single price variable, we associate a different price to each global interference constraint and proceed similarly as in (23)-(24). Also, the sensing model introduced in Sec. 2.1 can be generalized to the case of multiple active PUs, and the presence of device-level uncertainties (e.g., uncertainty in the power spectral density of the PUs’ signals and thermal noise) as well as system level uncertainties (e.g., lack of knowledge of the number of active PUs). The mathematical details of these more general formulations can be found in our companion paper ; for notational simplicity, here we will stay within the formulation (23)-(24), without loss of generality.
4 Solution Analysis: Nash Equilibria
This section is devoted to the solution analysis of the games introduced in the previous section. In order to provide a unified analysis, we focus on the general game with side constraints; results for the other proposed formulations are obtained as special cases. We start our analysis by studying the feasibility of each optimization problem in (23) (cf. Sec. 4.1); we then extend the definitions of NE to a game with side constraints and establish its main properties (cf. Sec. 4.2).
4.1 Feasibility conditions
Introducing the detection SNR experienced by SU over carrier and using the definitions given in Sec. 2.1, sufficient conditions guaranteeing the existence of an optimal solution for each player’s optimization problem (23) are the following: For all and , there must exist a common sensing time (corresponding to normalized sensing times ) such that
The first set of conditions in (25) simply postulates the existence of an overlap among the (normalized) sensing time intervals in (23), which is necessary to guarantee the existence of a common value for the sensing times in the original variables ’s. The second set of conditions guarantees that the strategy sets ’s (and thus ’s) are not empty. Interestingly, they quantify the existing trade-off between the sensing time (the “time-bandwidth” product of the system) and the detection accuracy: the smaller the required false alarm and missed detection probabilities, the longer the sensing time must be (the decision process must be more accurate).
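The trade-off between sensing time and detection accuracy can be illustrated numerically. The sketch below does not use the expressions (3) or (25) of this paper; it assumes the textbook Gaussian approximation of an energy detector with a real Gaussian primary signal, and the function names and numeric values are hypothetical.

```python
from statistics import NormalDist


def q_inv(p):
    """Inverse of the Gaussian tail function Q(x) = 1 - Phi(x)."""
    return NormalDist().inv_cdf(1.0 - p)


def min_sensing_samples(p_fa, p_md, snr):
    """Smallest number of samples meeting the target false alarm (p_fa) and
    missed detection (p_md) probabilities at a given detection SNR.

    Assumed model (not the paper's): the normalized test statistic is
    N(1, 2/N) under H0 and N(1 + snr, 2 (1 + snr)^2 / N) under H1.
    Eliminating the detection threshold gives
        sqrt(N / 2) = (Qinv(p_fa) + (1 + snr) * Qinv(p_md)) / snr.
    """
    root = (q_inv(p_fa) + (1.0 + snr) * q_inv(p_md)) / snr
    return 2.0 * max(root, 0.0) ** 2


if __name__ == "__main__":
    for target in (0.1, 0.05, 0.01):
        n = min_sensing_samples(target, target, snr=0.1)
        print(f"p_fa = p_md = {target}: about {n:.0f} samples")
```

Under this assumed model, tightening both target probabilities or lowering the SNR increases the required number of samples, which is the qualitative behavior described above.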
Throughout the paper, we tacitly assume that each user’s optimization problem under consideration has a nonempty strategy set (the associated feasibility conditions above are satisfied).
4.2 Existence and uniqueness of the NE
We focus in this section on the NE of . The definition of NE for a game with price equilibrium conditions such as is the natural generalization of the same concept introduced for classical noncooperative games having no side constraints (see, e.g., ) and is given next.
In words, the proposed notion of equilibrium is a stable state of the network consisting of an equilibrium power/sensing profile and price : at , the SUs have no incentive to change their power/sensing profiles based on the current state of the network [represented by (27)], while the optimal value of the price is such that all global interference constraints are met [a situation represented by (28)]. Note that, for a fixed price , the equilibrium power/sensing profile can be interpreted as the NE of a classical noncooperative game (thus having only local constraints), wherein the payoff function of each player is and the strategy set is . The proposed equilibrium concept is thus a NE of the aforementioned game with an appropriately selected price.
The game is nonconvex with the nonconvexity occurring in the players’ objective functions and the local/global interference constraints; moreover, the feasible price [satisfying (28)] is not explicitly bounded [note that this price cannot be normalized due to the lack of homogeneity in the players’ optimization problem (23)]. Because of that, the existence of a NE is in jeopardy. The rest of this section is therefore devoted to providing a detailed solution analysis of the game; we derive sufficient conditions for the existence and the uniqueness of a NE.
Mathematically, a NE can be interpreted as a fixed-point of the players’ best-response map. When this map is a continuous single-valued function, the existence of a fixed-point can be proved by using the renowned Brouwer fixed-point theorem (see, e.g., [35, Th. 2.1.18]), provided that one can identify a convex compact set for the application of the theorem. (The Brouwer fixed-point theorem states that every continuous vector-valued function mapping a nonempty convex compact set into itself has a fixed point in that set.) Our goal is then to derive a set of sufficient conditions under which the best-response map associated with is a single-valued continuous map over a proper compact and convex set; this is a nontrivial task, because of the nonconvexity of the players’ optimization problems and the potential unboundedness of the price. The new line of analysis we propose is based on the following three steps:
To deal with the unboundedness of the price, we introduce an auxiliary price-truncated game , where the price is constrained to be upper bounded by a given positive constant ;
We derive sufficient conditions for the nonconvex players’ optimization problems in the game to have unique optimal solutions; building on such solutions, we introduce a continuous single-valued map, the best response associated with the game, defined on a convex and compact set, whose fixed-points are the Nash equilibria of the game . We can then apply the Brouwer fixed-point theorem to deduce that has a NE;
The final step is to demonstrate that there exists a sufficiently large such that the price truncation in the game is not binding. This will allow us to deduce that a NE of is also a NE of the original, un-truncated, game .
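The role of the truncation in the first and third steps can be seen on a toy example. The sketch below is not one of the games of this paper: it assumes a single player choosing a scalar x in [0, 1] to maximize log(1 + x) - price * x, a global constraint x <= 0.5, and a damped projected price update on [0, c]. All names and numbers are hypothetical.

```python
def best_response(price):
    """Unique maximizer of log(1 + x) - price * x over x in [0, 1]."""
    if price <= 0.0:
        return 1.0
    return min(max(1.0 / price - 1.0, 0.0), 1.0)


def run_truncated(c, step=0.2, iters=200, limit=0.5):
    """Iterate player and price updates with the price truncated to [0, c]."""
    price = 0.0
    for _ in range(iters):
        x = best_response(price)
        violation = x - limit  # positive when the global constraint is violated
        price = min(max(price + step * violation, 0.0), c)
    return best_response(price), price


if __name__ == "__main__":
    for cap in (0.3, 10.0):
        x, price = run_truncated(cap)
        status = "binding" if price >= cap else "not binding"
        print(f"c = {cap}: x = {x:.4f}, price = {price:.4f}, truncation {status}")
```

In this toy problem the untruncated equilibrium price is 2/3. With a cap below that value the iteration stalls at the cap and the global constraint stays violated, whereas with a sufficiently large cap the truncation is inactive and the limit point is also an equilibrium of the untruncated problem.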
Step 1: The price-truncated game
To motivate the price-truncated game, observe first that the price complementarity condition in (28) is equivalent to
In order to bound the price in (29), let us introduce the price interval defined as: given ,
Game . The game is composed of players’ optimization problems: the following nonconvex optimization problems for the players (32) and the price-truncated optimization problem for the -st player (33)
Note that in the game there are no side constraints; instead, the price complementarity condition in (28) is treated as an additional player of the game, on the same level as the other players. This formulation facilitates the solution analysis of the game, as detailed next.
Let us start our analysis by rewriting the NE of as fixed-points of a proper best-response map defined on a convex and compact set, which allows us to apply standard fixed-point arguments. Given , suppose that each optimization problem in (32) has a unique optimal solution for every fixed and (we derive shortly conditions for this assumption to hold; see Proposition 2 below); let us denote such a solution by , i.e.,
where in (34) we made explicit the dependence of on the strategy profile of the other players and the price. In order to also have a unique solution of the price-truncated linear optimization problem (33), we introduce the following proximal-based regularization in (33): given , , and , let
Note that, thanks to the proximal regularization, the optimization problem in (35) becomes strongly convex for any given , and thus has a unique solution , which depends on . Building on (34) and (35), we can introduce the following best-response map associated with the price-truncated game :
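The effect of the proximal term can be checked on a scalar instance. The sketch below does not reproduce (35); it assumes a price objective that is linear in the price with slope g (standing in for the global interference constraint violation), a unit proximal weight, and the price interval [0, c]. Under these assumptions the regularized problem has the closed-form solution obtained by projecting lam + g onto [0, c]. Without the proximal term, every price in [0, c] would be optimal when g = 0.

```python
def regularized_price(lam, g, c):
    """Unique maximizer over s in [0, c] of  s * g - 0.5 * (s - lam)**2.

    The objective is strongly concave in s, so the maximizer is the
    projection of the unconstrained stationary point lam + g onto [0, c].
    """
    return min(max(lam + g, 0.0), c)


def grid_argmax(lam, g, c, n=20001):
    """Brute-force maximizer on a uniform grid, used only as a cross-check."""
    best_s, best_val = 0.0, float("-inf")
    for k in range(n):
        s = c * k / (n - 1)
        val = s * g - 0.5 * (s - lam) ** 2
        if val > best_val:
            best_s, best_val = s, val
    return best_s


if __name__ == "__main__":
    for lam, g in ((0.5, 0.3), (0.5, -1.0), (1.8, 0.7), (1.0, 0.0)):
        print(lam, g, regularized_price(lam, g, c=2.0), grid_argmax(lam, g, c=2.0))
```

The last case (g = 0) returns the current price lam itself, which shows how the regularization selects a single point where the unregularized linear problem has a continuum of solutions.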
Note that, even though the feasible sets of the players’ optimization problems in (32) are nonconvex, the map is defined over the convex and compact set , which is key to applying the Brouwer fixed-point theorem. Moreover, the set of fixed-points of coincides with the set of NE of the game , thus establishing the desired connection between the map (36) and the game . More formally, we have the following.
Lemma 1. Suppose that each optimization problem in (34) has a unique optimal solution for every given and . A tuple is a NE of if and only if it is a fixed-point of the map ; that is .
Based on Lemma 1, we can now study the existence of a NE of by focusing on the fixed-points of the map .
Step 2: Existence of a NE of
We provide now sufficient conditions guaranteeing that each nonconvex problem (32) has a unique optimal solution, for every given