Dynamic Control of Interference Limited Underlay D2D Network
Abstract
Device-to-Device (D2D) communication is emerging as a key communication paradigm for realizing the vision of the Internet of Things (IoT) by supporting the interconnection of heterogeneous objects in a large-scale network. These devices may be many types of objects with embedded intelligence and communication capabilities, e.g., smart phones, cars, or home appliances. The main issue in this type of communication is the interference that D2D communication causes to cellular communication. Thus, power control and resource allocation should be properly coordinated in the D2D network to prevent excessive interference and a drastic decrease in the throughput of the cellular system. In this paper, we consider the problem of cross-layer resource allocation in time-varying cellular wireless networks with D2D communication and incorporate the average interference to the cellular system as a quality-of-service constraint. Specifically, each D2D pair in the network injects packets into its queue at rates chosen to maximize a global utility function, subject to network stability and interference constraints. The interference constraint enforces an arbitrarily low interference to the cellular system caused by D2D communication. We first obtain the stability region for multiuser systems assuming that the nodes have full channel state information (CSI) of their neighbors. Then, we provide a joint flow control and scheduling scheme, which is proven to achieve a utility arbitrarily close to the maximum achievable utility. Finally, we address practical implementation issues such as distributed scheduling by designing an algorithm that is capable of taking advantage of the diversity gain introduced by fading channels. We demonstrate the efficacy of our policies by numerical studies under various network conditions.
I Introduction
The explosive growth in user demand has driven the emergence of new data-intensive applications and of technologies to accommodate them. One example is 4G cellular technologies (WiMAX and LTE-A), which have efficient physical and MAC layer performance but still lag behind users’ extremely fast-growing data demand. Therefore, researchers are seeking new paradigms in the context of 5G technologies to increase the efficiency of wireless networks.
While the conventional cellular architecture consists of connections from base stations to user equipment, 5G systems may well rely upon a two-tier architecture consisting of a macrocell tier for base-station-to-device communication and a second device tier for device-to-device (D2D) communication. Such architectures are a hybrid of conventional cellular and ad hoc designs. D2D communication in cellular networks is defined as direct communication between two mobile users without traversing the access point (AP) or the first-tier network. Hence, communicating directly with nearby wireless devices can greatly increase the efficiency of the network by reusing cellular spectrum resources. This concept not only improves the efficiency of spectrum usage, but also has great potential for enhancing network performance in terms of capacity, coverage, energy efficiency and end-to-end delay [1, 2]. Because of these benefits, D2D communication appears to be a promising component of next-generation cellular networks.
In the literature, various paradigms are introduced based on the spectrum in which D2D communication occurs. These paradigms are classified as inband (overlay or underlay) and outband communications. The paradigm of interest in this paper is underlay inband communication, in which both D2D and cellular devices use the same spectrum to transmit their data. Operating in the same licensed band, devices will inevitably affect macrocell users by causing interference. To ensure minimal impact on the performance of existing APs, a two-tier network needs to be designed with smart interference management strategies and appropriate resource allocation schemes. Furthermore, users in today’s cellular networks use real-time, high-data-rate services such as video sharing, gaming or proximity-aware social networking. Those applications require the end-to-end delay of incoming packets to be finite. Hence, the stability of user queues, i.e., the network stability, which guarantees that the delay is finite, is another important design parameter that should be taken into account.
Considering these design parameters, we combine basic networking mechanisms such as flow control and scheduling in the context of a D2D network underlaying a cellular network. To this end, we model the entire problem as a network utility maximization problem, in which the interference caused by D2D pairs to the cellular BS is incorporated as an additional constraint, and develop the associated dynamic flow control, power control, and scheduling mechanisms.
Scheduling in wireless networks is a prominent and challenging problem that has attracted significant interest from the networking community. The challenge arises from the time-varying nature of wireless channels, and hence of their capacity. Optimal scheduling in wireless networks has been extensively studied in the literature under various assumptions [3, 4, 5, 6]. Starting with the seminal work of Tassiulas and Ephremides [3], where the throughput optimality of the backpressure algorithm is proven, policies that opportunistically exploit the time-varying nature of the wireless channel to schedule users are shown to be at least as good as static policies [4]. In principle, these opportunistic policies schedule the user with the most favorable channel condition to increase the overall performance of the system. However, without imposing individual performance guarantees for each user in the system, this type of scheduling may result in unfair sharing of resources and may lead to starvation of some users, for example, those far away from the base station in a cellular network. Hence, in order to address fairness issues, the scheduling problem was investigated jointly with the network utility maximization problem [7, 8], and the stochastic network optimization framework, which guarantees network stability and provides fairness, was developed [9].
In our model, we consider a D2D network in which communication takes place independently of the cellular system as long as the interference caused to the AP is kept low enough to guarantee a required level of Quality-of-Service (QoS) on the cellular network. We call such a network an interference-aware D2D network. In this setting, we use the stochastic optimization framework to address the basic wireless network control problem and develop a cross-layer resource allocation solution that incorporates the interference to the AP as a QoS metric. Our main contributions are summarized as follows:

We evaluate the stability region of the interference-aware D2D network, defined as the set of rates achievable by any flow control and link-scheduling scheme. Then, we compare it with the stability region obtained when there is no interference constraint, and quantify the rate loss due to the interference-aware operation of the D2D network.

We formulate the resource allocation problem in the interference-aware D2D network as a network utility maximization (NUM) problem, in which the optimal scheduling of D2D communication links is implemented in the physical layer, while the flow control is realized in the transport layer.

Although there is no standard for D2D communications, D2D communications in cellular networks are expected to be overseen and controlled by a central entity such as an evolved Node B (eNB). Thus, our first focus is to design a centralized algorithm. We propose a dynamic control algorithm, which is a joint scheduling and flow control scheme, as the solution of the NUM problem, and we prove that this scheme achieves a utility arbitrarily close to the maximum achievable utility. Note that the flow control algorithm moves the rate vector to the Pareto boundary of the stability region based on the utilities defined for the users. For example, a logarithmic utility function moves the rates to the point where proportional fairness is satisfied.

We investigate an important practical limitation, namely the unavailability of a centralized scheduler. Hence, we design a distributed scheduling algorithm called the channel-aware distributed scheduler, where the devices decide whether to transmit based on their local information (i.e., their channel conditions and queue backlogs). The proposed algorithm is channel-aware in the sense that it takes wireless channel variations into account and can thus take advantage of the diversity gain introduced by fading channels. Then, we obtain performance bounds on the algorithm, since optimality can no longer be guaranteed when only limited information is available.

We demonstrate via simulations the performance of the proposed algorithms in comparison with other well-known distributed schedulers, and verify the analytical results.
II Related Works
Our results in this paper are related to the optimization of D2D communication underlaying cellular networks. Hence, we only mention the papers that are most relevant to ours.
The authors of [10] and [11] consider a single-cell scenario where a cellular user and two D2D users share the same spectrum, and where the BS controls the transmit power and spectrum of the D2D communication link. The objective is to optimize the sum rate with an energy/power constraint under non-orthogonal and orthogonal sharing modes. The authors show analytically that an optimal solution can be given either in closed form or can be chosen from a set. Subsequent research turned to efficient resource allocation with multiple D2D pairs. The work in [12] controls the interference of the D2D links to the cellular users by limiting the maximum transmit power of the D2D users. In [13], the authors employ knowledge of the interference-limited area for D2D receivers to maximize the network capacity with multiuser MIMO, and cellular users in the vicinity of the interference-limited area are not scheduled. The optimal power allocation problem for D2D communication pairs is analyzed in [14, 15, 16, 17]. The authors in [14] and [16] show that the problems of optimal power allocation and mode selection are rather involved, and they propose heuristic approaches to solve them. The authors of [15] propose a method that applies exhaustive search to find the power efficiency, which is a function of the transmission rate and power consumption of the users.
The works in [18, 19, 20, 21, 22] focus on improving the system performance while maintaining certain QoS constraints. The authors in [18] consider a resource allocation problem to maximize the overall network throughput while guaranteeing QoS requirements for both D2D and regular users. A scheme based on maximum-weight bipartite matching is developed to select a suitable D2D communication pair. The authors of [19] propose a graph-based resource allocation method, where they formulate the optimal resource allocation as a nonlinear problem. Since the problem is NP-hard, they propose a suboptimal graph-based approach, which accounts for the interference and capacity of the network. In their proposed graph, each vertex represents a link (D2D or cellular) and each edge connecting two vertices represents the potential interference between the two links. The authors of [20] formulate the problem of maximizing the system throughput with minimum data rate requirements, and they use the particle swarm optimization framework to obtain the solution. In [22], the authors formulate the problem as an integer programming problem, which is NP-hard and therefore hard to solve. Hence, they propose a suboptimal solution that obtains the elements of the optimization problem in different phases.
On the other hand, problems in which the cellular infrastructure is missing, i.e., distributed algorithms where D2D users act autonomously, are mostly analyzed in a game-theoretical framework. In [23], the proposed approach optimally selects the most beneficial source devices by analyzing the interactions between the base station’s rewarding strategies and the devices’ transmission power using a Stackelberg game model. In addition to pricing models, the authors of [24] propose auction-based game-theoretic models, where D2D communication pairs bid for the channel to transmit their information. Other approaches propose a scheme based on a coalition formation game [25], [26], where cellular users and D2D communication pairs that want to communicate in the same spectrum form a coalition. Most game-theoretical solutions impose a large overhead due to the need for heavy information exchange in terms of bids, or prices and demands.
The solutions of these approaches and their derived suboptimal heuristics can indeed improve the system performance under QoS constraints. However, they do not seem to be good candidates for time-stringent applications with limited computational capacity. The authors of [10] and [11] do derive a closed-form solution, which reduces the complexity, but they only consider a scenario with a cellular user and a single D2D communication pair, which is not practical. The algorithm derived in this paper, which is shown to be close to the optimal solution, is a simple index policy, i.e., the centralized entity performs simple algebraic manipulations to obtain the solution at each time instant. Furthermore, we consider a general network model that contains an arbitrary number of D2D communication pairs.
III System Model, Scheduling Policy and D2D Network Stability
In this section, we will introduce the details of our system model and the definitions of the main concepts that are used throughout the paper in relation to this model.
III-A System Model
Our primary aim in this paper is to characterize the maximum rates of communication that can be stably supported by a D2D network and to discover the cross-layer centralized and/or distributed control mechanisms (i.e., flow control and scheduling) that can achieve these rates with provable performance guarantees. To this end, we consider a group of communication devices forming distinct D2D communication pairs and sharing the same frequency band with an access point (AP), as shown in Fig. 1. The devices are in close proximity to each other so that they can reach their intended receivers in a single hop, but they also cause excessive interference to each other when two or more pairs are active at the same time. This leads to a fully connected interference graph topology with a collision model for the D2D network in question.
The devices operate in slotted time with slot indices represented by . The link quality between the device pairs varies over time according to the block fading model, in which the channel gain is constant over a time slot and changes from one slot to another independently according to a common fading distribution. We use , , to represent the direct channel gain between the transmitter and receiver of the th D2D pair. These direct channel gains are independent and identically distributed (i.i.d.) over users as well as over time. Operating in the same frequency band, the devices also cause interference to the AP in Fig. 1, and we denote the interference channel gain between the transmitter of the th D2D pair and the AP by , . Again, the interference channel gains obey the i.i.d. block fading model described above (possibly with a different distribution from that of the direct channel gains). For notational simplicity, we often use the vector notation and to denote the channel gains more compactly.
To better understand the interplay between the scheduling decisions at the MAC layer and the flow control decisions at the transport layer, it is assumed that no power control is exercised at the physical layer of the D2D network and that all devices transmit at a constant power level over all time slots. This assumption helps us to distill the effect of physical layer parameters on the interactions of the upper layer scheduling and flow control protocols, which is the main focus of the current paper. In this setting, an important quantity of interest that determines the D2D network performance is the rate (measured in units of bits/slot) offered over a communication link during time slot . We assume that these communication rates are described by the functions (as functions of transmission power levels and channel gains) for . We do not assume any specific functional form for , which is usually determined by the coding and communication technologies embedded in the transceiver circuits of the devices, but we require that has a bounded second moment, i.e., for all . (For example, if the Shannon capacity formula is used to quantify the communication rates for the th D2D pair, can be given as , where represents the total noise plus interference power degrading transmissions over the th link. Indeed, these communication rates are achievable by using Gaussian codebooks when slot durations are large enough [27].) The significance of the rate function in our analysis is that it will determine the service rates of the network layer queues maintained at the devices.
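As a rough illustration of the channel and rate model described above, the following sketch draws one slot of block-fading gains and maps them to Shannon rates. It is only a sketch under stated assumptions: Rayleigh fading (so that power gains are exponentially distributed), unit bandwidth, and a single noise-plus-interference power value are choices made here for illustration, and the function and parameter names are hypothetical rather than taken from the paper.

```python
import math
import random


def sample_block_fading_gains(num_pairs, mean_gain, rng):
    """Draw one slot of i.i.d. channel power gains, one per D2D pair.

    Assumption: Rayleigh fading, so each power gain is exponentially
    distributed with the given mean. Gains are redrawn independently
    in every slot (block fading).
    """
    return [rng.expovariate(1.0 / mean_gain) for _ in range(num_pairs)]


def shannon_rate(tx_power, direct_gain, noise_power):
    """Rate in bits/slot per unit bandwidth from the Shannon formula.

    noise_power stands for the total noise plus interference power
    degrading the link (hypothetical parameter name).
    """
    return math.log2(1.0 + tx_power * direct_gain / noise_power)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
direct_gains = sample_block_fading_gains(num_pairs=4, mean_gain=1.0, rng=rng)
rates = [shannon_rate(tx_power=1.0, direct_gain=h, noise_power=1.0)
         for h in direct_gains]
```

Because every device transmits at the same constant power, the rate in a slot depends only on that slot's direct channel gain, which is what allows a scheduler to exploit fading.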
An application runs at the application layer of each device, and generates the bits to be stored at the transport layer queues of the devices. These bits are accepted to the network layer according to a flow control mechanism that runs at the transport layer. We let represent the amount of data (in bits per slot) that enters the network layer at the beginning of time slot and is stored at a network layer queue with size at device . The relationship between these important network parameters at the queue level is displayed in Fig. 2.
It is assumed that the input rate is admissible in the sense that for all , and it has a long-term average , i.e., . The utility obtained by the th D2D pair is a function of the long-term average rate . We assume that , and is a continuously differentiable, monotonically increasing and concave function of its argument. This concludes the description of our system model. In the next subsections, we formally introduce the notions of scheduling policy and network stability, and provide the main definitions that characterize the scheduling policies and the network stability region.
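To make the utility assumptions concrete, the sketch below uses a logarithmic utility, one common choice that is continuously differentiable, increasing and concave. The specific form log(1 + r) and the function names are assumptions made for illustration; the analysis only requires the properties stated above.

```python
import math


def log_utility(avg_rate):
    """Example utility: increasing, concave, and zero at zero rate.

    The form log(1 + r) is an illustrative assumption, not the paper's
    prescribed utility.
    """
    return math.log(1.0 + avg_rate)


def total_utility(avg_rates):
    """Sum of the per-pair utilities of the long-term average rates."""
    return sum(log_utility(r) for r in avg_rates)
```

With a concave utility, splitting a fixed total rate evenly between two pairs yields a larger sum utility than giving everything to one pair, which is the mechanism by which utility maximization promotes fairness.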
III-B Scheduling Policy
Due to the close geographical proximity of the devices in our network model, only one D2D pair at a time can communicate its data reliably over its wireless communication link. Hence, a scheduling decision must be made at the beginning of each time slot to select an appropriate user based on the current (both direct and interference) channel conditions. For this purpose, roughly speaking, a scheduling policy should determine which link is to be activated in each time slot for data transmission.
Definition 1
A scheduling policy is a vector-valued function mapping the direct and interference channel states to scheduling probabilities, i.e., for , for devices and satisfying the feasibility constraint .
It should be noted that the scheduling policies given in Definition 1 constitute a collection of randomized control mechanisms for the D2D network in question, specifying scheduling probabilities for each pair of devices. Implicit in this definition is that a scheduling policy does not allow two D2D pairs to be active simultaneously, due to the topological constraints of our network model. More explicitly, once scheduling probabilities are identified for all D2D pairs for time slot , only one of them is selected for transmission by using the probability distribution (possibly a defective one) induced by over the set of device indices to determine the index of the selected D2D pair.
An important subset of the randomized scheduling policies introduced above is the deterministic ones. We say that a scheduling policy is a deterministic scheduling policy if is either zero or one for all and for all time slots . It will be shown below that the use of randomized scheduling policies will facilitate the mathematical analysis of the collection of optimization problems leading to the network stability region by turning them into convex optimization problems, whilst the solutions of these optimization problems lie in the set of deterministic scheduling policies.
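The selection step described above can be sketched as follows. This is a minimal illustration, assuming the scheduling probabilities for the current slot are already available as a list; the function names are hypothetical. When the probabilities sum to less than one (a defective distribution), the remaining mass corresponds to leaving the slot idle.

```python
import random


def is_feasible(probabilities, tol=1e-12):
    """Feasibility: every entry lies in [0, 1] and the entries sum to at most 1."""
    in_range = all(0.0 <= p <= 1.0 for p in probabilities)
    return in_range and sum(probabilities) <= 1.0 + tol


def select_pair(probabilities, rng):
    """Return the index of the scheduled pair, or None if the slot stays idle.

    The idle outcome absorbs the missing mass 1 - sum(probabilities)
    of a defective distribution.
    """
    if not is_feasible(probabilities):
        raise ValueError("scheduling probabilities violate the feasibility constraint")
    u = rng.random()
    cumulative = 0.0
    for index, p in enumerate(probabilities):
        cumulative += p
        if u < cumulative:
            return index
    return None


def is_deterministic(probabilities):
    """A deterministic policy assigns probability exactly 0 or 1 to every pair."""
    return all(p in (0.0, 1.0) for p in probabilities)
```

For a deterministic policy the draw is irrelevant: the single pair with probability one is always selected, and an all-zero vector always leaves the slot idle.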
III-C D2D Network Stability
In this part, we will provide the details of the notion of D2D network stability by relating the scheduling policies to the queue level dynamics of the devices. To this end, we will first put forward the notion of interference-aware D2D operation. All the network stability definitions presented afterwards will be with respect to the interference-aware D2D operation.
The main communication paradigm of interest in this paper for the coexistence of a D2D network with an AP in the same spectrum is the underlay paradigm [28]. The main idea underpinning the underlay communication paradigm is that the D2D network can utilize the same spectrum as the AP as long as it does not cause harmful degradation to the data communication at the AP, which is ensured by keeping the interference levels (peak and average) below prespecified interference threshold values. This leads to the interference-aware D2D operation, formally defined below.
Definition 2
A D2D network is said to be an interference-aware D2D network if the average and peak interference power levels that it causes to the AP are bounded above by the prespecified interference threshold values as
(1) 
(2) 
where and denote the upper limits on the aggregate average and instantaneous interference powers from all D2D links, respectively.
This definition makes the coupling between the scheduling policies and the restrictions due to the interference-aware operation of the D2D network explicit. In particular, the optimum scheduling policy achieving the maximum communication rates that can be stably supported by an interference-aware D2D network must strike a balance between choosing the best D2D link for device communication and respecting the radio etiquette rules arbitrating the spectrum access rights between the devices and the AP. The above interference constraints are primarily designed to safeguard two types of data traffic at the AP against harmful D2D interference. The average interference constraint in (1) is for delay-insensitive data traffic (e.g., text messaging), for which the messages are encoded and decoded over many slots. On the other hand, the instantaneous interference constraint in (2) is for delay-sensitive data traffic (e.g., video streaming), for which the messages are encoded and decoded over a single time slot. A D2D network may not know the type of data traffic at the AP at any given time, and hence must respect both constraints simultaneously.
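The two constraints can be checked empirically on a recorded schedule, as in the sketch below. Note the assumptions: the definition is stated in terms of long-term averages, whereas this check uses a finite window as a proxy, and the names `avg_limit` and `peak_limit` are hypothetical labels for the two thresholds.

```python
def interference_trace(schedule, interference_gains, tx_power):
    """Per-slot interference power received at the AP.

    schedule[t] is the index of the pair active in slot t (None if idle);
    interference_gains[t][n] is the interference channel gain of pair n
    in slot t. Only the active pair contributes, since at most one pair
    transmits per slot.
    """
    trace = []
    for active, gains in zip(schedule, interference_gains):
        trace.append(0.0 if active is None else tx_power * gains[active])
    return trace


def is_interference_aware(trace, avg_limit, peak_limit):
    """Finite-window check of the average and the peak interference constraints."""
    average = sum(trace) / len(trace)
    return average <= avg_limit and max(trace) <= peak_limit
```

A schedule can satisfy the average constraint while violating the peak constraint in a single slot, which is why both checks are needed when the traffic type at the AP is unknown.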
Next, we formally define the concept of the stability of an interference-aware D2D network. As is standard [9], stability here means that the long-term averages of the expected queue sizes are finite, i.e., . Further, we say that the D2D network is stable under if the network layer queues of all devices are stable. An important concept that expands upon the definition of network stability and relates the flow control and scheduling mechanisms for a D2D network is the network stability region, which is defined below.
Definition 3
The network stability region of an interference-aware D2D network, denoted by , consists of all arrival rate vectors such that there exists a scheduling policy satisfying the conditions below for all .
(3)  
(4)  
(5) 
The constraints in (4) and (5) are due to the interference-aware operation of the D2D network and the feasibility condition. The constraint in (3) is the classical necessary condition for queue stability, describing the fact that the incoming rate to the network layer queues should be equal to or smaller than the outgoing service rate, which depends on the choice of the scheduling policy in our particular D2D communication scenario [29]. (Note that is the minimal set that contains all achievable arrival rates, i.e., no vector outside can be stabilized by any feasible and interference-aware scheduling policy.) Although not needed in our analysis, it is also important to note that the stability region can easily be shown to be a convex set by using standard time-averaging arguments.
In the next section, we will obtain the Pareto boundary of the network stability region; no feasible and interference-aware scheduling policy can stabilize the D2D network when the arrival rates are beyond this boundary. This will provide a complete characterization of . This characterization will be carried out under the full channel-state information (CSI) assumption. Although helpful in understanding the maximum rates that can be stably supported by a D2D network, such a characterization of the D2D network stability region does not provide any insight into how to design dynamic control mechanisms achieving the rates in .
To resolve this drawback, we design in Section V a dynamic but centralized flow control and scheduling algorithm that achieves all the rates within the D2D network stability region. The scheduling part of the proposed algorithm provides design insights into how to construct a feasible and interference-aware scheduling mechanism for a D2D network. In addition to stabilizing an interference-aware D2D network, the proposed algorithm also maximizes the collective utility of the devices. The flow control part of the proposed algorithm provides design insights into how to construct a flow control mechanism that maximizes collective network performance. Distributed solutions achieving these desirable properties up to some provable performance bounds are given in Section VI.
IV Stability Region for Interference-Aware D2D Network
In this section, we derive the boundary of the stability region of an interference-aware D2D network such that any arrival rate vector outside the closure of the boundary is unattainable. Then, we analyze the effect of interference-aware communication on the network stability region by comparing the boundaries obtained with and without interference constraints.
We begin our analysis by computing the boundary of the network stability region. This is equivalent to maximizing the average outgoing (service) rate achieved by for given average service rate values of the other devices. Recall that the average arrival rate should be smaller than or equal to the average service rate in a stable network. Hence, any arrival rate of device , , that is larger than this maximized service rate cannot be achieved. Before giving the mathematical description of the problem, we make the following remark. Since we assume that the channels are ergodic and stationary, we utilize statistical averages in constructing the optimization problem. Hence, we drop the time index for the sake of notational simplicity in this section, and reintroduce it in the next section, where we perform dynamic control. Further, for notational convenience, let be the set of all scheduling policies. Therefore, the aim is to maximize , associated with the point , by solving the following linear program:
(6)  
subject to  (7)  
(8)  
(9) 
where the expectations are over the joint distribution of the instantaneous channel gains of direct and interference channels. We solve the above optimization problem using the dual method that is particularly appealing to our problem structure, whose solution is given in the next theorem.
Theorem 1
Proof:
Please see Appendix A.
Theorem 1 gives us the optimal scheduling policy achieving for all . Then, the boundary of the stability region can be obtained by varying and finding the points where the average rates of device are maximized. Another important point is that even though we state the optimization for the randomized scheduling policies, i.e., , the optimal solution turns out to be a deterministic scheduling policy, i.e., is either zero or one. In addition, observe that if the condition and for all , then the channel remains idle, i.e., . The reason is that the channel conditions are not good enough to justify accessing the channel at the expense of the interference caused to the transmissions of the AP.
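The qualitative behavior described above (a deterministic choice, with the channel left idle when no pair is worth the interference it causes) is typical of solutions obtained by the dual method. The sketch below shows one generic Lagrangian-style index rule with that behavior. It is not a reproduction of the policy in Theorem 1: the index form, the weights `rate_weights`, and the multiplier `interference_price` are assumptions made purely for illustration.

```python
def index_schedule(rates, interference_gains, rate_weights,
                   interference_price, tx_power, peak_limit):
    """Pick the pair with the largest positive index, or None to stay idle.

    Assumed index for pair n:
        rate_weights[n] * rates[n] - interference_price * tx_power * g_n
    Pairs whose instantaneous interference exceeds peak_limit are skipped.
    """
    best_pair, best_index = None, 0.0
    for n, (rate, gain) in enumerate(zip(rates, interference_gains)):
        interference = tx_power * gain
        if interference > peak_limit:
            continue  # this pair would violate the instantaneous constraint
        index = rate_weights[n] * rate - interference_price * interference
        if index > best_index:
            best_pair, best_index = n, index
    return best_pair
```

In this sketch, raising the interference price makes the rule progressively more conservative: it first shifts transmissions toward pairs with weaker interference channels and eventually leaves the slot idle.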
As indicated above, there may be time instants during which the channel remains idle in an interference-aware D2D network to safeguard the AP. This results in a decrease in the optimal rates due to underutilization of the channel and, consequently, in a contraction of the network stability region. To understand this phenomenon better, we also derive the optimum scheduling policy without interference constraints, and compare the achievable rate regions with and without interference constraints. Following arguments similar to those above, we have the following optimum scheduling problem
(11)  
subject to  (12) 
without interference constraints, whose solution is given by the next lemma.
where is the Lagrange multiplier associated with the rate constraint in (12).
Proof:
The proof follows similar lines to the proof of Theorem 1, and hence is omitted to avoid repetition.
Interpretation of the Network Stability Region for the Two-Device Case:
Now, we consider a communication scenario containing only two D2D pairs. In this case, the interference-aware optimum scheduling policy is given by
and
In Figs. (a) and (b), the stability region for a two-link D2D network is illustrated for Rayleigh fading direct and interference channels. To plot the regions, we varied the rate achieved by the second D2D pair, and calculated and the boundary rate pair for each point. Recall that and are the direct and interference channel gains of the th D2D pair, respectively. In Fig. (a), we take and and we obtain different boundary rate pairs for varying values of the interference parameter, . As seen in Fig. (a), as we decrease the value of , i.e., as the interference constraint becomes more stringent, both D2D pairs have fewer transmission opportunities in order to meet the interference constraint. Thus, the network stability region becomes smaller. In Fig. (b), we fix the value of at 0.1 and vary . As seen in Fig. (b), when , the network stability regions (with and without the interference constraint) coincide for smaller values of , where the second D2D pair takes a larger portion of the transmissions. In this case, the interference constraint is inactive since the second D2D pair, which has the smaller interference channel gain, transmits predominantly. On the other hand, as increases, i.e., , more interference is caused to the AP and the network stability region shrinks.
V Control of Underlay D2D Communication Networks with Centralized Scheduling
In the previous section, we characterized the stability region by obtaining the maximum rates that an interference-aware D2D network can support. In this section, we present a dynamic control algorithm that solves a network utility maximization (NUM) problem while stabilizing the network layer queues in a D2D network. To do so, we follow a cross-layer design approach. In the lower layer, the scheduling policy ensures network stability and satisfies the interference requirements. In the upper layer, on the other hand, the flow control policy strives to move the network layer arrival rates to the optimal point within the stability region. Since the derived cross-layer algorithm is a dynamic online algorithm, we use the time index again in this section to indicate its operation in time.
The dynamic cross-layer algorithm takes the queue lengths (both virtual and real queues) and the instantaneous direct and cross-channel gains as input, and determines the scheduled device in each time slot as output. We start our analysis by first formulating the NUM problem and providing the queue dynamics to set the stage for the cross-layer design approach.
V-A Problem Formulation
Our objective is to stabilize the network while maximizing the sum of device utilities. That is, we aim to find the solution of the following NUM problem:
(14)  
subject to  (15) 
The objective function in (14) calculates the total expected utility of D2D pairs over random stationary channel conditions and scheduling decisions. The constraint (15) ensures that the network layer arrival rates of D2D pairs are within the rate region supported by the network defined as . The above problem could in principle be solved by means of standard convex optimization techniques if the stability region were known in advance. Although this approach may give us an idea about how to select transmission rates, it says nothing about how to reach the optimum operating point in an actual wireless network. Thus, in the following subsections, we develop a practical dynamic control algorithm to facilitate our understanding of the interplay between the interference requirements of the D2D network and the critical functionalities of wireless networks, such as scheduling and flow control.
V-B Queue Dynamics
We assume that there is an infinite backlog of data at the transport layer of each node. Our proposed dynamic flow control algorithm determines the amount of traffic injected into the queues at the network layer. The dynamics of the network layer queue of th D2D pair is given as follows:
(16) 
To meet the average interference constraint given in (1), we also maintain a virtual queue as:
(17) 
The state of the virtual queue at any given time instant is an indicator of the amount by which we have exceeded the allowable interference constraint. Thus, the larger the state of these queues, the more conservative our dynamic algorithm has to be in order to meet these constraints, i.e., the fewer transmissions by D2D pairs take place. The strong stability of the virtual queues guarantees that the interference constraints are satisfied in the long run (see Theorem 5.1 in [9]).
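As a minimal sketch of how such queue recursions are commonly implemented in the stochastic network optimization framework of [9], the Python fragment below updates one real queue and one virtual queue per slot. The exact forms of (16) and (17) are not reproduced here; the max-plus updates, the function names, and the parameter `budget` (standing in for the average interference limit) are assumptions made for illustration.

```python
def update_data_queue(backlog, admitted, served):
    """Real queue: remove what was served (never below zero), then add admitted data."""
    return max(backlog - served, 0.0) + admitted


def update_virtual_queue(backlog, interference, budget):
    """Virtual queue: grows when the slot's interference exceeds the per-slot budget."""
    return max(backlog + interference - budget, 0.0)


# Hypothetical trace: interference caused to the AP in five consecutive slots.
trace = [1.5, 0.0, 2.0, 0.0, 0.5]
budget = 1.0  # assumed average interference limit
z = 0.0
for caused in trace:
    z = update_virtual_queue(z, caused, budget)

# Property of this update when starting from an empty virtual queue:
# total interference <= budget * number_of_slots + final virtual backlog.
assert sum(trace) <= budget * len(trace) + z
```

Under these assumptions, a virtual backlog that stays bounded forces the time-average interference toward the budget, which is the intuition behind the stability argument above.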
V-C Dynamic Control
The proposed cross-layer dynamic control algorithm is based on the stochastic network optimization framework [9]. This framework allows the solution of a long-term stochastic optimization problem without requiring explicit characterization of the stability region. Furthermore, it enables the simultaneous treatment of stability and performance optimization by the introduction of virtual queues to transform performance constraints into queuing stability problems.
To this end, consider queue backlog vectors for D2D communication pairs as and . Let be a quadratic Lyapunov function of real and virtual queue backlogs defined as:
(18) 
Also, consider the one-step expected Lyapunov drift, for the Lyapunov function as:
The aim of the stochastic optimization framework is to minimize the drift to ensure network stability, which can be achieved by having a negative Lyapunov drift whenever the sum of queue backlogs is sufficiently large. Intuitively, this property ensures network stability because whenever the queue backlog vector leaves the stability region, the negative drift eventually drives it back to this region. Furthermore, the following utility-mixed Lyapunov drift
(19) 
enables us to maximize the network performance in conjunction with the network stability, where the conditional expectation is taken with respect to the random fading realizations.
Control Algorithm: By analogy with the back-pressure algorithm, we propose a cross-layer algorithm that executes the following steps in each time slot :

Upper Layer - Flow control: The flow controller at each device observes its current queue backlog, . It then injects bits, where is the solution of the following optimization:
(20) where is a design constant that determines the final performance of the designed algorithm. The above problem involves maximizing a concave function, which can be solved easily by using convex optimization techniques [30].

Lower Layer - Scheduling: A scheduler observes the backlogs in all devices and all fading states. Then, it determines the scheduling decision for time slot as the solution of the following optimization:
subject to where is the weight of pair and is given as:
(21) The above is a standard linear optimization problem, whose solution is attained on the boundary. Specifically, among the pairs that satisfy the instantaneous interference constraint, the one with the maximum weight is allowed to transmit in a given time slot.
We note that the parameter in the flow control algorithm determines the extent to which the utility optimization is emphasized. Indeed, if is large relative to the current backlog in the source queues, then the admitted rates will be large, increasing the time average utility while consequently increasing congestion. This effect is mitigated as the backlog grows at the source queues and flow control decisions become more conservative. Note that the flow control algorithm is decentralized because the control values for each node require only knowledge of the queue backlogs in device .
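The flow control step can be sketched in closed form under two assumptions that are not taken from (20): a utility of the form log(1 + a), and a cap `a_max` on the bits admitted per slot. With these, the per-slot problem is to maximize v * log(1 + a) - backlog * a, whose stationary point is a = v / backlog - 1. The function name and parameters are hypothetical.

```python
def admitted_rate(backlog, v, a_max):
    """Maximize v * log(1 + a) - backlog * a over 0 <= a <= a_max (assumed utility)."""
    if backlog <= 0:
        # With an empty queue the objective is increasing in a: admit the maximum.
        return a_max
    unconstrained = v / backlog - 1.0  # where the derivative v / (1 + a) - backlog vanishes
    return min(max(unconstrained, 0.0), a_max)


# Larger backlogs lead to more conservative admission for a fixed design constant.
rates = [admitted_rate(q, v=10.0, a_max=5.0) for q in (0.0, 2.0, 20.0)]
```

The sketch uses only the local backlog of a device, which mirrors the decentralized nature of the flow controller noted above.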
In the scheduling policy, the weight equation in (21) consists of reward and cost terms. Specifically, the larger the data queue backlog size and/or the higher the instantaneous channel rate , the more likely the transmission of D2D pair occurs. On the other hand, the larger the interference queue backlog (representing the interference level caused to the AP) and/or the higher the interference channel gain , the less likely the transmission of pair takes place. In this setting, the flow control algorithm strives to maximize the collective network utility, whereas the scheduling policy makes sure that the utility maximizing operating point is within the stability region. Indeed, by utilizing the proposed scheduling algorithm, we can achieve any point in the stability region.
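The reward and cost structure described above can be sketched as follows. The weight used here, data backlog times rate minus virtual backlog times the interference caused to the AP, is an assumed form consistent with the verbal description and is not a transcription of (21). All names, and the optional instantaneous limit `inst_limit`, are hypothetical.

```python
def pair_weight(backlog, rate, virtual_backlog, interference):
    """Assumed weight: reward (backlog * rate) minus cost (virtual backlog * interference)."""
    return backlog * rate - virtual_backlog * interference


def schedule(backlogs, rates, virtual_backlogs, interferences, inst_limit=float("inf")):
    """Return the index of the pair to schedule, or None to keep the slot idle."""
    best_pair, best_weight = None, 0.0
    for n, (q, r, z, i) in enumerate(zip(backlogs, rates, virtual_backlogs, interferences)):
        if i > inst_limit:
            continue  # violates the instantaneous interference constraint
        w = pair_weight(q, r, z, i)
        if w > best_weight:  # only strictly positive weights are worth a transmission
            best_pair, best_weight = n, w
    return best_pair
```

For example, `schedule([4, 1], [1.0, 2.0], [10, 0], [0.5, 0.5])` selects pair 1 in this sketch, because the large virtual backlog of pair 0 makes its weight negative.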
Theorem 2
Suppose is the average arrival rates produced by the proposed dynamic control algorithm. Then, for any flow parameter , the dynamic control algorithm yields the following performance bound for the aggregate network utility:
while bounding the total long-term expected queue lengths as:
where are constants, and is the optimal aggregate utility, i.e., the solution of the problem in (14)-(15). This theorem shows that the proposed dynamic control gets arbitrarily close to the optimal utility by choosing sufficiently large at the expense of proportionally increased average queue sizes.
Proof:
The proof is given in Appendix B.
We note that the proposed dynamic control algorithm is not distributed since its scheduling part depends on global queue length information. Compared to distributed scheduling algorithms, centralized scheduling schemes usually lead to better performance at the cost of requiring a central authority to allocate the network resources. In a large-scale wireless network, such a central authority does not always exist. Furthermore, the implementation of centralized algorithms results in high overhead on the network due to collecting the channel conditions and queue states of all users. In the remainder of the paper, we focus on designing distributed algorithms that relax the assumptions necessary for the centralized algorithm. Note that the flow control part of the proposed solution is already distributed, i.e., each node decides its admitted flow using only local information. Thus, it remains the same below.
VI Channel-Aware Distributed Algorithms
In this section, we relax the requirement in Section V of having a central authority for device scheduling. We investigate a contention-based distributed scheduling algorithm with multiple-round contention, called Channel-Aware Distributed Scheduler (CADS) with uniform mapping, that operates based on the local queue size and channel state information at each D2D pair. The distributed mode of operation necessitates the modification of the NUM problem as
(22)  
subject to  (23) 
where is the contraction coefficient. The constraint in (23) suggests that the distributed scheduling algorithms can still stabilize the network, provided that the arrival rates are interior to , which is a scaled version of the stability region.
In the remainder of the section, we introduce a distributed scheduling algorithm that is channel-aware in the sense that it can utilize the diversity gain in time-varying environments, and we obtain its performance bounds by characterizing . For analytical purposes, we assume that all channel gains, and are i.i.d. However, we perform simulations for both i.i.d. and non-i.i.d. cases and observe that the proposed algorithm often achieves scheduling performance far better than the derived lower bounds. Furthermore, we only assume the average interference constraint, but it is straightforward to incorporate the instantaneous interference requirement in the solutions.
VI-A Contention Resolution Phase in CADS
In this part, we introduce the common operational principles shared by all CADS-type distributed schedulers introduced in the subsequent sections. The operation of a CADS takes place in slotted time in two phases: (i) the contention phase and (ii) the data transmission phase. The contention phase is composed of minislots, each of which is long enough to detect contention signals from other devices, i.e., a minislot must be at least in IEEE 802.11b. If is the ratio of the minislot duration to the duration of a regular time slot, then the parameter , which will appear in our derivations below, signifies the fraction of time spent resolving collisions by means of a contention resolution phase before the transmission of actual data.
The contention from devices for the time slot is resolved as follows. The th D2D pair selects a minislot to send its contention signal. The selected minislot depends on the pair ’s weight that incorporates queue backlog, direct channel and interference channel information into a single parameter. If pair senses a contention signal from another pair before the th minislot, it stops contending for the channel and defers its data transmission to the next time slot, i.e., . Otherwise, it sends a contention signal at the beginning of the th minislot. If no collision is sensed, the th D2D pair obtains access to the channel to transmit its data for the remaining part of the time slot, called the data transmission phase, which commences after the contention phase. If a collision is sensed, then the time slot remains idle and no data transmission takes place. These steps are illustrated in Fig. 4 and summarized in Algorithm 1 below.
We further assume that when a D2D pair detects contention from other devices, it stops contending for the channel, i.e., the first pair that contends for the channel obtains the right to access the channel and starts transmitting in the data transmission phase, as illustrated in Fig. (a). Furthermore, if more than one user contends in the same minislot, we say that a collision takes place, and the channel remains idle in the data transmission phase, as illustrated in Fig. (b), to prevent redundant interference to the AP. The basic operational steps of CADS are summarized in Algorithm 1.
Based on the contention resolution algorithm in CADS described above, the D2D pair with the smallest backoff time earns the access rights for the channel. Hence, it is of critical importance to design an efficient association policy mapping smaller backoff times to the larger weights to ensure high utility and to exploit multiuser diversity. Our aim below is to investigate the structure of such efficient policies associating device weights with the minislot indices.
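The contention rule described in this subsection can be sketched in a few lines. This is an illustrative reading of the text and not a transcription of Algorithm 1: each pair reports the minislot it selected (or `None` if it does not contend), the earliest minislot wins, and a tie in that minislot counts as a collision. The function name is hypothetical.

```python
def resolve_contention(minislots):
    """Return the index of the pair that wins the channel, or None if the slot stays idle.

    minislots[n] is the minislot index chosen by pair n, or None if pair n
    does not contend in this time slot.
    """
    contenders = [(m, n) for n, m in enumerate(minislots) if m is not None]
    if not contenders:
        return None  # nobody contends
    earliest = min(m for m, _ in contenders)
    first = [n for m, n in contenders if m == earliest]
    # Pairs with later minislots sense the earlier signal and back off;
    # two or more signals in the earliest minislot collide.
    return first[0] if len(first) == 1 else None
```

For instance, `resolve_contention([3, 1, None, 5])` grants the channel to pair 1, whereas `resolve_contention([2, 2, 4])` leaves the slot idle because of a collision in minislot 2.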
Definition 4
A minislot association policy is a mapping such that its th component function determines the minislot index to which the th pair is assigned given that .
Further, is said to be a threshold policy if all of its component functions can be written as
where indicates that the pair does not contend for the channel in time slot .
Note that the input of the mapping function is the weight of the devices, . Hence, if , the transmission of the pair does more harm to the AP, by injecting excessive interference, than the benefit gained from the data transmission.
Below, we design a threshold-based minislot association policy whose goal is to operate close to the optimal point without imposing high complexity, and to provide fairness among D2D pairs. Furthermore, in the numerical section, we compare the performance of the designed policy with other threshold-based minislot association policies that differ mainly in how the threshold values are determined.
VI-B CADS with Uniform Mapping
In CADS with uniform mapping, each pair is assigned to a minislot uniformly at random over all available minislots. This is achieved by utilizing the distribution of the weights of pairs as follows (footnote 3: in our model, the fading process is stationary and independent from slot to slot; hence, the D2D pairs know their instantaneous channel gains, and can learn channel distributions by observing the channel over a period of time [31]). Let be the conditional cumulative distribution function (CDF) of at time slot given that is larger than zero, and let be the corresponding probability density function (PDF). Given and , each device can obtain as follows:
(24) 
Furthermore, let be the inverse function of . The following lemma indicates how to select the threshold values to achieve uniform distribution over all minislot indices.
Lemma 2
In CADS with uniform mapping, the mapping function, , is as follows:
The minislot association policy above ensures that each pair picks a minislot uniformly at random, i.e., with a probability given that its weight is positive. The goal here is to minimize the probability of collisions by spreading the contention probabilities uniformly over all minislots. Furthermore, this minislot association policy also favors the scheduling of a good D2D pair with respect to the current channel and queue states. However, it should be noted that such a uniform mapping policy, although promising, does not necessarily guarantee the scheduling of the best user, i.e., the user that has the maximum weight, as discussed subsequently.
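One way to realize such a mapping is the probability integral transform: if the conditional CDF of the positive weights is continuous, its value at the current weight is uniformly distributed, so equal-probability CDF bins translate into equally likely minislots. The sketch below is an assumption-laden illustration and not the mapping of Lemma 2: it replaces the true CDF with an empirical one built from observed weights, and the function name and binning rule are hypothetical.

```python
import bisect


def make_uniform_mapping(weight_samples, num_minislots):
    """Build a weight-to-minislot mapping from an empirical conditional CDF."""
    positive = sorted(w for w in weight_samples if w > 0)
    if not positive:
        raise ValueError("need at least one positive weight sample")

    def mapping(weight):
        if weight <= 0:
            return None  # the pair does not contend in this slot
        # Empirical CDF of the weight, conditioned on the weight being positive.
        cdf = bisect.bisect_right(positive, weight) / len(positive)
        # Larger weights (CDF near 1) map to earlier minislots.
        slot = int(num_minislots * (1.0 - cdf)) + 1
        return min(slot, num_minislots)

    return mapping


mapping = make_uniform_mapping([1, 2, 3, 4, -1], num_minislots=4)
slots = [mapping(w) for w in (4, 3, 2, 1)]
```

In this toy example the four observed positive weights are spread over the four minislots, with the largest weight assigned to the first minislot.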
VI-C Performance Analysis of CADS with Uniform Mapping
Here, we characterize the performance of CADS with uniform mapping by studying the performance loss due to fraction of time allocated to the contention resolution period, collisions in the contention period and imperfect scheduling.
1) Performance Loss Due to Contention Period: We assume that the length of minislots is not negligible. Hence, the devices that are scheduled to transmit can only use fraction of a whole time slot. That is, the loss due to a contention resolution phase with multiple minislots is .
2) Performance Loss Due to Collisions in the Contention Phase: The CADS with uniform mapping cannot completely prevent collisions in the contention phase. Whenever such a collision occurs, all D2D pairs remain silent during the data transmission phase, and the channel becomes underutilized. In the sequel, we characterize this loss.
3) Performance Loss Due to Imperfect Scheduling: The CADS with uniform mapping does not always schedule the D2D pair that has the maximum weight. The main underlying reason behind this phenomenon is that each device is assigned to a minislot uniformly at random with respect to their respective weights. Although this provides fairness among devices in giving the access rights to the channel (i.e., devices with lower and higher weights are treated equally), it can lead to assignment of channel access rights to the devices with smaller weights. Let us define the expected loss due to such an imperfect scheduling as , where .
We do not attempt to characterize analytically for an arbitrary number of devices and minislots, since the calculations are intractable. Hence, appears as a parameter in Theorem 3 determining the performance of the CADS with uniform mapping. Our simulation results indicate that the loss due to imperfect scheduling is negligibly small for i.i.d. channels.
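The first two loss terms can be illustrated numerically under simplifying assumptions that are not part of the analysis in this paper: every pair contends in every slot, and each pair picks its minislot independently and uniformly at random. Under these assumptions, the probability that exactly one pair occupies the earliest chosen minislot has a simple closed form, and multiplying it by the fraction of the slot left for data gives a rough efficiency figure. The function and parameter names are hypothetical.

```python
def success_probability(num_pairs, num_minislots):
    """Probability that exactly one pair holds the earliest chosen minislot."""
    n, m = num_pairs, num_minislots
    # Pair i picks minislot k (prob. 1/m) and the other n - 1 pairs pick later ones.
    return sum(
        n * (1.0 / m) * ((m - k) / m) ** (n - 1)
        for k in range(1, m + 1)
    )


def slot_efficiency(num_pairs, num_minislots, minislot_ratio):
    """Expected fraction of a slot that carries data under the assumptions above."""
    data_fraction = max(1.0 - num_minislots * minislot_ratio, 0.0)
    return data_fraction * success_probability(num_pairs, num_minislots)
```

Sweeping `num_minislots` for a fixed number of pairs in this sketch exposes the trade-off between fewer collisions and a longer contention phase; it does not account for the loss due to imperfect scheduling.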
We need the following lemma to characterize the performance of the CADS with uniform mapping.
Lemma 3
The CADS with uniform mapping satisfies the following inequality:
where
Proof:
The proof is given in Appendix C.
Lemma 3 indicates that the sum of average weights achieved by the uniform minislot association policy is larger than a fraction of the maximum weight. By using Lemma 3, we next characterize the performance of the CADS with uniform mapping.
Theorem 3
Suppose is the average arrival rates produced by the CADS with uniform mapping. Then, for any flow parameter , the algorithm achieves the following performance bound:
while bounding the long-term expected queue lengths as:
where are constants, and is the optimal aggregate utility, i.e., the solution of the problem in (14)-(15).
VII Numerical Results
VII-A Simulation Setup and Parameters
In our numerical experiments, we consider a two-tier network in which D2D pairs communicate with each other while causing interference to the CBS. Furthermore, we consider a logarithmic utility function for all D2D pairs (footnote 4: we utilize a logarithmic utility function to provide proportional fairness). Specifically, pair obtains utility of at the rate of . The rates depicted in the graphs are the sum of the arrival rates of all users, and the unit of the plotted rates is nats/channel use. First, the main channel between D2D pairs and the interference channel between transmitters of pairs are modeled as i.i.d. Rayleigh fading Gaussian channels. Thus, the main and interference power gains are exponentially distributed with means 2 and 1, respectively. The noise normalized power is . Furthermore, in the experiments, we only consider the average interference constraint. We compare the performance of our algorithm with different minislot association policies and with that of the widely used regulated queue approach.
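A channel model of this kind can be sampled as sketched below. The exponential means of 2 and 1 come from the setup above; everything else is an assumption for illustration, in particular the rate expression log(1 + gain / noise) in nats, the utility log(1 + rate), the default noise power of 1, and all function names.

```python
import math
import random


def sample_gains(num_pairs, rng, mean_direct=2.0, mean_interference=1.0):
    """Draw i.i.d. exponential power gains (Rayleigh fading) for every D2D pair."""
    direct = [rng.expovariate(1.0 / mean_direct) for _ in range(num_pairs)]
    interference = [rng.expovariate(1.0 / mean_interference) for _ in range(num_pairs)]
    return direct, interference


def rate_nats(gain, noise_power=1.0):
    """Assumed achievable rate in nats per channel use."""
    return math.log(1.0 + gain / noise_power)


def log_utility(rate):
    """Assumed logarithmic utility of an admitted rate."""
    return math.log(1.0 + rate)


rng = random.Random(0)  # fixed seed so that a run is repeatable
direct, interference = sample_gains(20000, rng)
```

With a large number of samples, the empirical means of `direct` and `interference` should be close to 2 and 1, respectively, which is a quick sanity check on the sampler.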
VII-A1 CADS with Optimal Weight Mapping
In this subsection, we design a mapping function that maximizes the expected weight of D2D pairs by neglecting the effect of imperfect scheduling. That is to say, the sequence of threshold values is determined as the solution of the following optimization problem:
Since the above optimization problem is highly nonlinear and depends on the distribution of the fading process, we are not able to obtain a closed-form solution. Hence, we solve the problem numerically. Furthermore, obtaining a performance bound for this mapping function is nontrivial because no closed-form solution is available. However, we should note that, since the algorithm maximizes the sum of weights, it results in better bounds than those obtained for the one with uniform mapping in Lemma 3 and Theorem 3. Hence, it results in better performance compared to uniform mapping at the expense of high complexity.
VII-A2 CADS with Linear Mapping
In this subsection, we explicitly consider the complexity of computing the mapping and propose an easily implementable and efficient algorithm for users that have limited power and memory. This time, we utilize a linear discrete mapping function , defined as follows:
where is ideally a value such that the number of realizations of that are larger than is small. However, since we assume that the users are not capable of calculating the mapping according to the distribution function, they agree on a value for .
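A linear mapping of this kind might look as follows. The exact expression used in the paper is not reproduced here; the sketch assumes that the agreed value acts as a cap `w_max`, that weights at or above the cap go to the first minislot, and that the interval between zero and the cap is split into equal-width bins. The names are hypothetical.

```python
import math


def linear_mapping(weight, w_max, num_minislots):
    """Map a weight to a minislot index using equal-width weight intervals."""
    if weight <= 0:
        return None  # the pair does not contend
    if weight >= w_max:
        return 1  # weights at or above the agreed cap contend first
    slot = math.ceil(num_minislots * (1.0 - weight / w_max))
    return max(1, min(slot, num_minislots))
```

Unlike the uniform mapping, this rule needs no knowledge of the weight distribution, at the price of minislots that are no longer equally likely when the weights are not uniformly spread below `w_max`.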
VII-A3 Interference Regulated Distributed Scheduler
In the sequel, we modify the regulated queue model to consider the interference constraint. The resulting scheme, called Interference Regulated Distributed Scheduler (IRDS), is based on the baseline algorithm in [32].
The operation of IRDS again has two phases: (i) the contention phase and (ii) the data transmission phase. To facilitate the discussion, we introduce two new random variables related to the contention and scheduling phases. The first one is the contention variable, , that is 1 with probability , and 0 with probability . The second one is the transmission variable, , that is 1 with probability , and 0 with probability , where is the weight of D2D pair defined in (21). We note that the transmission variable also takes into account the interference level that is caused to the AP. This is why the algorithm is regulated with respect to the interference level.
The scheduling decision of D2D pair depends on the following three conditions:
Condition (1): The contention of pair is successful, i.e., .
Condition (2): None of the neighboring pairs were scheduled in the previous time slot, i.e.,
Condition (3): The transmission variable .
Based on these three conditions, the scheduling phase consists of three different cases, as given by Algorithm 2.
Notice that IRDS considers only single-round contention, unlike the proposed CADSs, which limits the performance of the algorithm, as shown in the simulation results.
VII-B Simulation Results
In Fig. 5, we investigate the effect of the system parameter in our dynamic control algorithms. We take interference constraint, and the ratio of the length of a minislot to the length of a slot is taken as , i.e., . Furthermore, the number of minislots, , is 200. As expected, the average rate of all algorithms increases with increasing , and Fig. 5 shows that the long-term utilities for converge fairly closely to their optimal values, verifying the results in Theorem 1. Furthermore, the distributed algorithm with the best performance is CADS with optimal weight mapping, achieving over of the average rate of the centralized algorithm. CADS with uniform mapping exhibits a performance fairly close to that of the one with optimal weight mapping. However, IRDS is the distributed scheduler with the worst performance, achieving only approximately of the average rate of the centralized algorithm.
For the rest of the experiments, we take . In Figs. (a) and (b), we analyze the effect of the interference constraint, , and the number of D2D pairs, , on the system performance, respectively. As illustrated in Fig. (a), the average rate for all algorithms increases with increasing . This is because, for low values, in order to satisfy a tight interference constraint, a larger fraction of time slots remains idle, i.e., fewer transmission opportunities are given to D2D pairs. Starting around = 1, the interference constraint becomes inactive, since the constraint is realized with strict inequality. From Fig. (b), we first notice that the performance of IRDS does not change with the increasing number of pairs and only achieves of the rate achieved by the centralized algorithm when the number of pairs is one. Thus, we can conclude that IRDS cannot take advantage of the diversity gain of fading channels. On the other hand, CADS with optimal weight, uniform and linear mapping achieve a performance that is closer to that of the centralized algorithm. As expected, CADS with optimal weight mapping performs the best, whereas the one with linear mapping has the worst performance. In linear mapping, we sacrifice some performance in favor of reducing the complexity of the mapping. Furthermore, as the number of pairs increases, the collisions during the contention phase increase. This results in an increase in the difference between the performance of the centralized algorithm and that of CADSs.
Next, we analyze the effect of the number of minislots on the performance of CADSs. We take the number of D2D pairs as 100. In Fig. (a), we take and in Fig. (b), we take . As illustrated in Fig. (a), the average data rate increases initially with increasing . This is because, with increasing , the network experiences fewer collisions. However, when is high, the benefit of reducing collisions becomes less significant, and the loss due to the length of the contention phase, i.e., increases. As a result, the performance of CADSs gets worse. The optimal is achieved when is around 600. In Fig. (b), we notice that the optimal is around 400 and the decrease of the average rate due to having a long contention phase is sharper compared to the case when .
Lastly, we investigate the performance of the algorithms with respect to and the number of D2D pairs, , when the channels are non-i.i.d. Precisely, the main and interference channel gains of D2D pairs are chosen at random, uniformly distributed in the intervals [1.2, 2.8] and [0.2, 1.8], respectively. In addition, we take 10 runs, and the rates depicted in Figs. (a) and (b) are the average of these 10 runs. From Figs. (a) and (b), we notice that CADS with optimal weight and uniform mappings perform slightly worse compared to the case when the channel gains are i.i.d. The reason is that, when the channel gains are non-i.i.d., the algorithms fail to schedule the pair with the highest weight in a large number of instances, because each pair has a different mapping interval.
VIII Conclusion
In this paper, we considered the problem of resource allocation in a wireless two-tier network where D2D communication pairs have information to be shared, while the interference to the CBS caused by the transmissions between D2D pairs should be kept arbitrarily low. First, we studied the stability region of such an interference-aware network, and compared the region to the case when there is no interference constraint. Then, we described a cross-layer dynamic algorithm, and we proved that our algorithm achieves a utility arbitrarily close to the achievable optimal utility.
Finally, we investigated distributed algorithms, where each pair decides whether to transmit according to local information. We designed a channel-aware distributed scheduling algorithm, which is a threshold-based minislot association policy. We derived the performance bound achieved by this algorithm, and compared the performance of the proposed algorithm with the widely used regulated queue approach and with different minislot association policies. Via simulations, we showed that the proposed algorithm achieves high performance, and that the reduction in average rate due to implementing a distributed approach is relatively small.
References
 [1] L. Lei, Z. Zhong, C. Lin, and X. Shen, “Operator controlled device-to-device communications in LTE-Advanced networks,” IEEE Wireless Commun. Mag., vol. 19, no. 3, pp. 96–104, Jun. 2012.
 [2] G. Fodor, E. Dahlman, G. Mildh, S. Parkvall, N. Reider, G. Miklos, and Z. Turanyi, “Design aspects of network assisted device-to-device communications,” IEEE Commun. Mag., vol. 50, no. 3, pp. 170–177, Mar. 2012.
 [3] L. Tassiulas and A. Ephremides, “Jointly optimal routing and scheduling in packet radio networks,” IEEE Transactions on Information Theory, vol. 38, no. 1, pp. 165–168, Jan. 1992.
 [4] X. Liu, E. K. P. Chong, and N. B. Shroff, “A framework for opportunistic scheduling in wireless networks,” Computer Networks, vol. 41, no. 4, pp. 451–474, 2003.
 [5] R. Urgaonkar and M. J. Neely, “Opportunistic scheduling with reliability guarantees in cognitive radio networks,” IEEE Trans. Mob. Comput., vol. 8, no. 6, pp. 766–777, 2009.
 [6] J. J. Jaramillo and R. Srikant, “Optimal scheduling for fair resource allocation in ad hoc networks with elastic and inelastic traffic,” in INFOCOM, 2010, pp. 2231–2239.
 [7] F. P. Kelly, A. K. Maulloo, and D. K. H. Tan, “Rate Control for Communication Networks: Shadow Prices, Proportional Fairness and Stability,” The Journal of the Operational Research Society, vol. 49, no. 3, pp. 237–252, 1998. [Online]. Available: http://dx.doi.org/10.2307/3010473
 [8] X. Wang and K. Kar, “Cross-layer rate control for end-to-end proportional fairness in wireless networks with random access,” in MobiHoc, 2005, pp. 157–168.
 [9] L. Georgiadis, M. J. Neely, and L. Tassiulas, “Resource allocation and cross-layer control in wireless networks,” Foundations and Trends in Networking, vol. 1, no. 1, 2006.
 [10] C. H. Yu, O. Tirkkonen, K. Doppler, and C. B. Ribeiro, “On the performance of device-to-device underlay communication with simple power control,” in Proc. IEEE Vehicular Technology Conf. (VTC), Apr. 2009, pp. 1–5.
 [11] C. H. Yu, K. Doppler, C. Ribeiro, and O. Tirkkonen, “Resource Sharing Optimization for Device-to-Device Communication Underlaying Cellular Communication,” IEEE Transactions on Wireless Communications, vol. 10, no. 8, pp. 2752–2763, 2011.
 [12] P. Janis, C. Yu, K. Doppler, C. Ribeiro, C. Wijting, K. Hugl, O. Tirkkonen, and V. Koivunen, “Device-to-device communication underlaying cellular communications system,” IEEE Trans. Inf. Theory, vol. 2, no. 3, pp. 169–178, Mar. 2009.
 [13] H. Min, J. Lee, S. Park, and D. Hong, “Capacity enhancement using an interference limited area for device-to-device uplink underlaying cellular networks,” IEEE Trans. Wireless Commun., vol. 10, no. 12, pp. 3995–4000, Dec. 2011.
 [14] X. Xiao, X. Tao, and J. Lu, “A QoS-aware power optimization scheme in OFDMA systems with integrated device-to-device (D2D) communications,” in Proceedings of IEEE VTC-Fall, 2011, pp. 1–5.
 [15] M. Jung, K. Hwang, and S. Choi, “Joint mode selection and power allocation scheme for power-efficient device-to-device (D2D) communication,” in Proceedings of IEEE VTC-Spring, 2012, pp. 1–5.
 [16] S. Hakola, T. Chen, J. Lehtomaki, and T. Koskela, “Device-to-device (D2D) communication in cellular network - performance analysis of optimum and practical communication mode selection,” in Proceedings of IEEE WCNC, 2010, pp. 1–6.
 [17] M. Belleschi, G. Fodor, and A. Abrardo, “Performance analysis of a distributed resource allocation scheme for D2D communications,” in Proceedings of IEEE GLOBECOM Workshops, 2011, pp. 358–362.
 [18] D. F. et al., “DevicetoDevice Communications Underlaying Cellular Networks,” IEEE Transactions on Communications, vol. 61, no. 8, pp. 3541–3551, 2013.
 [19] R. Zhang, X. Cheng, L. Yang, and B. Jiao, “Interferenceaware graph based resource sharing for devicetodevice communication underlaying cellular communication,” in in Proc. IEEE WNC, 2013, pp. 140–145.
 [20] L. Su, Y. Ji, P. Wang, and F. Liu, “Resource allocation using particle swarm optimization for d2d communication underlay of cellular networks,” in in Proceedings of IEEE WCNC, 2013, pp. 129–133.
 [21] B. G. K. M. H. Han and J. W. Lee, “Subchannel and transmission mode scheduling for d2d communication in ofdma networks,” in in Proceedings of IEEE VTCFall, 2012, pp. 1–5.
 [22] L. B. Le, “Fair resource allocation for devicetodevice communications in wireless cellular networks,” in in Proceedings of IEEE GLOBECOM, 2012, pp. 5451–â5456.
 [23] Q. W. et al., “Qualityoptimized joint source selection and power control for wireless multimedia D2D communication using stackelberg game,” IEEE Transactions on Vehicular Technology, vol. PP, no. 99, pp. 1–1, Sep. 2014.
 [24] C. X. et al., “Efficiency resource allocation for devicetodevice underlay communication systems: A reverse iterative combinatorial auction based approach,” IEEE Journal on Selected Areas in Communications, vol. 31, no. 9, pp. 348â–358, Sept. 2013.
 [25] Y. Li, D. Jin, J. Yuan, and Z. Han, “Coalitional games for resource allocation in the devicetodevice uplink underlaying cellular networks,” IEEE Transactions on Wireless Communications, vol. 13, no. 7, pp. 3965â–3977, July 2014.
 [26] H. Chen, D. Wu, and Y. Cai, “Coalition formation game for green resource management in D2D communications,” IEEE Communications Letterss, vol. 18, no. 8, pp. 1395–â1398, Aug. 2014.
 [27] D. N. C. Tse and S. V. Hanly, “Multiaccess fading channelsâpart I: Polymatroid structure, optimal resource allocation and throughput capacities,” IEEE Trans. Inf. Theory, vol. 44, no. 7, pp. 2796â–2815, 1998.
 [28] A. Goldsmith, S. A. Jafar, I. Maric, and S. Srinivasa, “Breaking spectrum gridlock with cognitive radios: An information theoretic perspective,” Proc. IEEE, vol. 97, no. 5, pp. 894â–914, 2009.
 [29] M. J. Neely, “Optimal energy and delay tradeoffs for multiuser wireless downlinks,” Tech. Rep. CSI050601, vol. University of Southern California, 2005.
 [30] S. Boyd and L. Vandenberghe, Convex Optimization. New York, NY: Cambridge University Press, 2004.
 [31] A. van der Vaart, Asymptotic Statistics. Cambridge, UK: Cambridge University Press, 1998.
 [32] D. Xue and E. Ekici, “Efficient Distributed Scheduling in Cognitive Radio Networks in the Many-Channel Regime,” in Proceedings of WiOpt, May 2013.
 [33] Z. Latreuch and B. Beladi, “New inequalities for convex sequences with applications,” Int. J. Open Problems Comput. Math., vol. 5, no. 3, pp. 15–27, Sept. 2012.
Appendix A Proof of Theorem 1
(25)  
subject to  (26) 
where we ignore and in the expectation as they are merely constants and do not affect the solution.
Note that, since the channel rates are independently distributed, optimizing in each fading realization yields the optimal solution of the statistical averages in (25)-(26) in the dual problem. That is to say, the solution, will be a stationary policy. For any given values of the Lagrange multipliers, and , and transmission rate vector , the optimal policy will choose the scheduling decision in each time slot as the solution of the following Lagrangian problem:
(27)  
subject to  (28) 
Since the above optimization problem has a linear objective function and linear constraints, we conclude that its optimal solution lies on the boundary of the feasible set. Specifically, for any given values of the Lagrange multipliers for all and , the optimal policy will be a deterministic policy, i.e., . Furthermore, for all , i.e., the channel remains idle, if or , and or for all . Also, for the duality gap to be zero, the following KKT conditions should be satisfied for the optimal Lagrange multipliers [30]:
(29)  
(30)  
(31) 
The conditions in (30) and (31) express the fact that if a constraint is not active, then its corresponding Lagrange multiplier is 0. Since the objective function, i.e., the rate achieved by a D2D pair, is inversely proportional to the rates achieved by the other pairs, we conclude that for all for which the corresponding constraint is realized with equality.
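The boundary argument used in this proof can be illustrated numerically. The following is a minimal sketch, not the paper's formulation: it assumes that at most one D2D pair is scheduled per slot, so the feasible set is the simplex of nonnegative vectors summing to at most one, and it uses hypothetical per-pair weights standing in for the coefficients of the linear objective in (27). Under these assumptions, no randomized (time-sharing) decision does better than the deterministic one-hot decision, and idling is optimal when every weight is nonpositive.

```python
import random

def schedule(weights):
    """Return a one-hot vector maximizing sum(w_i * x_i) over
    {x : x_i >= 0, sum(x) <= 1}; an all-zero vector means the channel idles."""
    best = max(range(len(weights)), key=lambda i: weights[i])
    decision = [0.0] * len(weights)
    if weights[best] > 0:  # if no weight is positive, idling is optimal
        decision[best] = 1.0
    return decision

def objective(weights, x):
    """Linear objective evaluated at scheduling vector x."""
    return sum(w * xi for w, xi in zip(weights, x))

def random_feasible_point(n, rng):
    """Draw a random point with x_i >= 0 and sum(x) <= 1."""
    raw = [rng.random() for _ in range(n)]
    scale = rng.random() / sum(raw)
    return [r * scale for r in raw]

rng = random.Random(0)
weights = [0.7, -0.2, 1.3, 0.4]  # hypothetical weights, not taken from the paper
vertex = schedule(weights)
best_value = objective(weights, vertex)

# Compare the deterministic decision against 1000 random feasible points.
never_beaten = all(
    objective(weights, random_feasible_point(len(weights), rng)) <= best_value + 1e-12
    for _ in range(1000)
)
```

Here `vertex` selects the pair with the largest positive weight, and `never_beaten` records whether any sampled randomized decision exceeded it.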
Appendix B Proof of Theorem 2
To derive the performance bound on the control algorithm, we start by providing an upper bound on in the following lemma.
Lemma 4
(32) 
where is a constant.
Proof:
By using the bounds assumed for the channel rates and the arrival rates, the following inequalities can be obtained for each real queue:
(33) 
where . The same line of derivation can be performed for the virtual queue to obtain
(34) 
where , and is the bound on the first moment of the interference channel gain, , i.e., , for all .
(35) 
It is easy to observe that our proposed dynamic network control algorithm minimizes the right-hand side of (35).
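The per-queue inequalities used in this proof are of the standard one-slot drift type. As a hedged illustration, the sketch below assumes the usual queue recursion Q(t+1) = max(Q(t) - service, 0) + arrivals, which is not restated in this appendix, and checks numerically the well-known bound Q(t+1)^2 - Q(t)^2 <= service^2 + arrivals^2 - 2 Q(t) (service - arrivals). The function names and the sampled ranges are hypothetical. A virtual queue tracking the interference constraint would obey the same bound if it follows the same recursion, with the interference caused playing the role of arrivals and the interference threshold playing the role of service.

```python
import random

def queue_update(q, service, arrivals):
    """One-slot update Q(t+1) = max(Q(t) - service, 0) + arrivals (assumed form)."""
    return max(q - service, 0.0) + arrivals

def drift_bound_holds(q, service, arrivals):
    """Check Q(t+1)^2 - Q(t)^2 <= service^2 + arrivals^2 - 2*Q(t)*(service - arrivals)."""
    lhs = queue_update(q, service, arrivals) ** 2 - q ** 2
    rhs = service ** 2 + arrivals ** 2 - 2.0 * q * (service - arrivals)
    return lhs <= rhs + 1e-9  # small tolerance for floating-point rounding

rng = random.Random(1)
# Sample nonnegative backlogs, service amounts, and arrivals (hypothetical ranges).
bound_always_holds = all(
    drift_bound_holds(rng.uniform(0, 100), rng.uniform(0, 10), rng.uniform(0, 10))
    for _ in range(10000)
)
```

Bounding the squared service and arrival terms by constants, as the bounded-rate assumptions allow, is what produces the constant term in bounds of this kind.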
If the feasible region is nonempty, it has been shown in [9] that there must exist a stationary scheduling and rate control policy that chooses the users and the arrival rates independent of queue backlogs and only with respect to the channel statistics. In particular, the optimal stationary policy can be found as the solution of a deterministic optimization problem if the channel statistics are known a priori.
Let be the optimal value of the objective function of the problem (14)-(15) obtained by the aforementioned stationary policy. Also, let be the optimal traffic arrival rates found as the solution of the same problem. In particular, the optimal input rate could in principle be achieved by the simple backlog-independent admission control algorithm of including all new arrivals for a given pair in time independently with probability . Then, the right-hand side (RHS) of (35) can be rewritten as:
(36) 
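The backlog-independent admission rule mentioned above can be sketched as an independent thinning of the arrival stream. This is only an illustration under stated assumptions: arrivals are modeled as one packet per slot with a fixed probability, and the admission probability is a free parameter here, since its exact expression is not reproduced in this appendix. The function name and all numerical values are hypothetical. The long-run admitted rate then approaches the product of the arrival probability and the admission probability, regardless of any queue state.

```python
import random

def admitted_rate(arrival_prob, admit_prob, slots, seed=0):
    """Empirical admitted packets per slot when each arrival is admitted
    independently with probability admit_prob (no dependence on backlog)."""
    rng = random.Random(seed)
    admitted = 0
    for _ in range(slots):
        arrived = rng.random() < arrival_prob      # Bernoulli arrival (assumption)
        if arrived and rng.random() < admit_prob:  # independent admission coin flip
            admitted += 1
    return admitted / slots

# Hypothetical numbers: arrivals in half of the slots, 60% of them admitted,
# so the admitted rate should be close to 0.5 * 0.6 = 0.3 packets per slot.
rate = admitted_rate(arrival_prob=0.5, admit_prob=0.6, slots=100000)
```

Choosing the admission probability as the ratio of a target input rate to the raw arrival rate would make the admitted rate match that target on average, which is the role such a rule plays in arguments of this type.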
Also, since is in the achievable rate region, i.e., the arrival rates are strictly interior to the rate region, there must exist a stationary scheduling and rate allocation policy that is independent of queue backlogs and satisfies the following:
(37) 
Note that, since we consider stationary and ergodic policies, the long-term averages in (37) correspond to expectations of the same variables as in (35). Clearly, any stationary policy should satisfy (35). Recall that our proposed policy minimizes the RHS of (35); hence, any other stationary policy (including the optimal policy) attains an RHS value no smaller than the one attained by our policy. In particular, the stationary policy that satisfies (37) and implements the aforementioned probabilistic admission control can be used to obtain an upper bound for the RHS of our proposed policy. Inserting (37) into (36), we obtain the following upper bound for our policy:
Appendix C Proof of Lemma 3
Our aim here is to obtain a bound on the average weight achieved by CADS compared with the weight achieved by the centralized algorithm given in Section V, i.e., the max weight algorithm. Also, we d