Robust Fuzzy-Learning for Partially Overlapping Channels Allocation in UAV Communication Networks
Abstract
In this paper, we consider a mesh-structured unmanned aerial vehicle (UAV) network exploiting partially overlapping channels (POCs). For general data-collection tasks in UAV networks, we aim to optimize the network throughput under constraints on transmission power and quality of service (QoS). Because UAV networks are highly mobile and constantly changing, most existing methods, which rely on definite information, are vulnerable to the dynamic environment and suffer degraded system performance. To combat the dynamic topology and varying interference of UAV networks, a robust and distributed learning scheme is proposed. Rather than assuming perfect channel state information (CSI), we introduce uncertainties to characterize the dynamic channel gains among UAV nodes, which are then interpreted with fuzzy numbers. Instead of the traditional observation space, where the channel capacity is a crisp reward, we implement the learning and decision process in a mapped fuzzy space. This allows the system to achieve a smoother and more robust performance by optimizing in an alternate space. To this end, we design a fuzzy payoffs function (FPF) to describe the fluctuating utility, and the problem of POCs assignment is formulated as a fuzzy payoffs game (FPG). Assisted by an attractive property of fuzzy bimatrix games, the existence of a fuzzy Nash equilibrium (FNE) for our formulated FPG is proved. Our robust fuzzy-learning algorithm can reach the equilibrium solution via a least-deviation method. Finally, numerical simulations are provided to demonstrate the advantages of our new scheme over the existing scheme.
I Introduction
Triggered by the development of automation and sensor technology, unmanned aerial vehicles (UAVs) have become increasingly prevalent in military, public and civil applications, such as autonomous combat, target detection, video surveillance, data collection, disaster management and network coverage extension [1, 2]. With the distinctive advantages of high mobility, quick deployment, cost-effectiveness and line-of-sight (LOS) or near-LOS communication channels, UAVs open a promising prospect for the future development of society and technology and have therefore attracted tremendous interest from both academia and industry [3, 4]. To support diverse applications of UAVs, reliable and effective information transmission within the UAV network and with the ground control station (GCS) is of crucial importance [5]. However, communication among a swarm of UAVs can become unreliable, especially when considering the high mobility of UAVs, the constrained power of hardware, the throughput requirements of data packets, and the limited amount of radio resources [6].
Radio resources, particularly the available channels in a specific wireless spectrum, are generally regarded as a main factor affecting the communication quality [7]. Besides, the rapid development of wireless technologies poses growing demands on the limited spectrum resource [8]. Therefore, improving the efficiency of channel utilization becomes more significant for emerging wireless services [9], including UAV networks. Even for military UAV applications with adequate spectrum resources, enhancing spectrum efficiency is of great importance for improving the reliability and effectiveness of data transmission [10]. To this end, optimal resource allocation with higher utilization efficiency and better network performance in UAV systems is a fundamental problem to be addressed. Since the number of non-overlapping (orthogonal) channels is limited by the available spectrum, partially overlapping channels (POCs) have become a focus of research in the past five years [11]. As one of the most promising techniques in the multi-radio multi-channel (MRMC) field, POCs specified by the IEEE 802.11b/g standard can improve the network throughput by permitting more parallel transmissions under tolerable interference [12]. The allocation of POCs must therefore be properly designed; otherwise, adjacent-channel interference and self interference may become serious and noticeably degrade network performance instead of improving it [13].
There are few studies on optimizing channel/spectrum resources in UAV networks [14, 15]. Under a configuration of orthogonal channels, a navigation data-assisted opportunistic spectrum access (OSA) scheme for heterogeneous UAV networks is proposed in [14]. In [15], a resource allocation scheme is presented to minimize the mean packet transmission delay in multi-layer UAV networks. In order to accommodate more parallel transmission channels, the POCs assignment in a combined UAV-D2D network based on crisp game theory was first studied in [16]. Various schemes have been proposed to implement the optimal assignment of POCs in the context of static wireless networks, e.g. the WLAN scenarios [16, 17, 18, 19, 20, 21, 22, 23]. The research in [17] demonstrates that, with proper assignment, POCs can efficiently avoid interference and improve the overall throughput. In [18], a greedy algorithm is presented for POCs allocation to maximize the network throughput, while a heuristic POCs assignment algorithm is proposed in [19]. In [20], an interference-tolerant medium access method is presented to optimize the throughput of the WLAN/cellular integrated network (WCIN) by utilizing POCs. In [21], the authors study the interaction between the density of access points (APs) and POCs assignment with parameter tuning. In [22], a load-aware channel assignment exploiting POCs is proposed for wireless mesh networks. The problem of distributed channel allocation in OSA networks with POCs using a game-theoretic learning algorithm is investigated in [23].
It is noteworthy, however, that the distinguishing features of UAV networks would compromise the effectiveness of existing methods developed for POCs allocation in ground wireless networks. First, most schemes rely on the ideal assumption that the channel state information (CSI) is static and can be perfectly estimated. Unfortunately, due to the high mobility of UAV nodes, the intermittence of links, the dynamics of topologies and the short training duration, such assumptions become impracticable in UAV scenarios. Therefore, most schemes may be vulnerable to dynamic environments [24]. Moreover, given the hardware limitations of UAVs (e.g. volume and weight), a UAV node is generally energy-constrained [25]. Thus, information exchange across the whole network is resource-demanding and tends to be prohibitive. In UAV scenarios, existing learning schemes premised on global and definite knowledge will no longer be reliable and will greatly deteriorate the network performance. To the best of our knowledge, robust channel allocation for dynamic UAV networks has not been reported in previous works, especially when considering the ubiquitous varying CSI and the complex coupling interferences.
In this paper, we focus on robust and distributed POCs assignment in UAV communication networks, fully taking their intrinsic dynamics and uncertainties into consideration. To be specific, we aim to realize robust channel access in the presence of uncertain environmental knowledge and, simultaneously, to guarantee QoS-provisioning data transmission under time-varying mutual interference [26]. As opposed to the previous crisp game-theoretical approaches, in this work we propose a novel robust fuzzy-learning scheme for distributed channel allocation in UAV communication networks. By mapping the uncertain utility from a direct observation space to a fuzzy space, and with the aid of fuzzy-logic analysis, our proposed method effectively reduces the sensitivity to changing environments. It thereby ensures robust channel access and QoS-aware transmissions even in dynamic UAV scenarios. The main contributions of this work are summarized as follows.

We formulate an optimization problem of POCs allocation in UAV communication networks. Specifically, we consider the mesh-structured UAV network and introduce a virtual interference factor to thoroughly characterize the coupling network interferences. By taking the QoS-provisioning requirement into account, we model the optimal POCs allocation in highly dynamic environments as a global throughput maximization problem with multiple constraints.

We develop a fuzzy payoffs game (FPG) to describe the optimal POCs allocation with uncertain dynamics, and investigate the properties of the FPG to ensure the existence of a fuzzy Nash equilibrium (FNE). In order to alleviate the sensitivity to varying CSI and coupling interferences, we employ fuzzy numbers to describe the uncertain channel gains, and use the fuzzy payoffs function (FPF) to evaluate the fluctuating utility of UAV nodes [27]. Owing to the fuzzy-number representation and the fuzzy-logic computation, our new FPG can effectively address the channel resource competition with dynamic and uncertain environmental information.

We design a robust fuzzy-learning algorithm for distributed POCs allocation. For the FPG involving fuzzy numbers, we first process the fuzzy payoffs in the mapped fuzzy space, and calculate the priority vector of channels with the assistance of a fuzzy preference relation (FPR). Relying on the priority vector derived in the fuzzy space, the UAV nodes can implement robust fuzzy learning and distributed updating to achieve the FNE of the formulated FPG. In this regard, our scheme is capable of combating dynamic environments and thereby realizing the optimal POCs allocation that maximizes the global throughput under the predefined constraints.

We evaluate the performance of our robust fuzzy-learning scheme in mesh-structured UAV networks. Numerical results show that our proposed scheme can achieve the maximum throughput and improve the resource efficiency, even with dynamic and uncertain utility, whereas the crisp-game based scheme is less effective in terms of robust allocation. We demonstrate that our new fuzzy-learning scheme can significantly improve the global throughput (60%) and the allowable number of active links, which hence holds great promise for emerging UAV communication networks.
The remainder of this paper is organized as follows. In Section II, we formulate the optimal POCs allocation problem with constraints for mesh-structured UAV networks. In Section III, based on the preliminaries of game theory and fuzzy set theory, we develop an FPG and further prove the existence of an FNE. The robust fuzzy-learning algorithm for distributed POCs allocation in UAV networks is proposed in Section IV. The performance of our proposed scheme is demonstrated via numerical simulations in Section V. Finally, we draw the conclusions of our work in Section VI.
II System Model and Problem Formulation
II-A Network Architecture
We consider a mesh-structured UAV network, where the UAV nodes act as clients (e.g. undertaking data-collection tasks) and attempt to convey information to other nodes or to the GCS with a limited number of channel resources. As shown in Fig. 1, the UAV nodes are randomly distributed in a 3D space and form clusters according to their spatial positions [28]. A cluster unit consists of a cluster head (CH) and multiple cluster members (CMs). The CMs use inter-links to communicate with their corresponding CHs, while the CHs use outer-links to forward information to the GCS. Compared with the outer-links, the inter-links are much shorter, which results in reduced downlink bandwidth requirements and improved energy efficiency. Considering the energy limitation of the CHs as well as the QoS requirements of the CMs, the number of UAV nodes in a specific cluster is bounded by .
Based on the above constructed UAV communication architecture, we denote the set of UAV nodes as , in which one UAV can be modeled by the Poisson Point Process (PPP). The full state of each UAV node is given by:
(1) 
Here is the spatial position of node [29]. A Bernoulli random variable characterizes the role mode of node in a cluster. Specifically, if node acts as a CH, , otherwise, . is the transmission power of node , which is determined by , i.e.,
A Bernoulli random variable represents the transmission state of node , i.e., if node is active in the current time slot, , else . denotes the corresponding CH of node , and is the CHs set. Based on the clustered structure of UAV networks, the set of UAV nodes can be rewritten as , where is the UAV sets of cluster .
The POCs are assumed to support the communication of the clustered UAV networks. Denote the set of the available channels as , and the minimum channel separation for two channels to be regarded as orthogonal as . Therefore, the maximum number of the orthogonal channels can be expressed as . For a specific channel , the set of orthogonal channels is denoted as , and the orthogonal channels set class of is denoted as , i.e. . Specifications on the above parameters can be found in the IEEE 802.11b/g standard. An illustrative partition of the POCs is shown in Fig. 2.
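As a rough illustration of the orthogonality notions above, the following Python sketch assumes the common 2.4 GHz IEEE 802.11b/g layout with 11 channels and a minimum orthogonal separation of 5 channel indices. Both values, the use of the index difference as the channel separation, and the function names are assumptions made here for illustration only.

```python
def orthogonal_channels(channel, channels, min_sep):
    """Channels separated from `channel` by at least `min_sep` indices."""
    return [c for c in channels if abs(c - channel) >= min_sep]


def max_orthogonal_count(channels, min_sep):
    """Greedy count of mutually orthogonal channels in a channel list."""
    count, last = 0, None
    for c in sorted(channels):
        if last is None or c - last >= min_sep:
            count += 1
            last = c
    return count


channels = list(range(1, 12))   # assumed: channels 1..11
min_sep = 5                     # assumed minimum orthogonal separation
```

Under these assumptions the greedy scan picks channels 1, 6 and 11, which matches the familiar three-channel orthogonal plan.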
II-B Interference Model
For a wireless network with MRMC, there are three different types of interference which should be addressed to ensure the reliable transmission of network nodes: co-channel interference, adjacent-channel interference, and self interference (e.g. when a single node utilizes two adjacent channels). To comprehensively describe the interference, a metric named Interference Factor (IF) is recommended [11], which is defined as a ratio of the spatial distance and the Interference Range (IR) between two nodes, and represents the effective channel separation level. To be specific, the IR refers to the minimum distance that two UAVs should maintain to avoid interference, and it is related to a channel separation factor (e.g. between channel and ). Based on real measurements [30] and the scale-up degree [11], the relationships between IR and are given in Table I, in which the IR is measured in meters.
Table I: Interference range (IR) versus channel separation
Channel separation    0      1     2     3     4     5
IR (m)                132.6  90.8  75.9  46.9  32.1  0
Denote the distance between two nodes and operating with channel and as , then the IF can be evaluated via the following three cases:

, when or .
In this case, the nodes are assigned into orthogonal channels or have enough distance to avoid interference. Thus, there is no interference between them.

, when and .
When two nodes occupy overlapping channels and, meanwhile, the spatial distance between them is less than the IR, the co-channel interference () or the adjacent-channel interference () will arise.

, when and .
This situation corresponds to the self interference, which is excluded for its serious damages to the QoS. That is to say, two overlapping channels () will not be assigned to a single node.
Premised on the above interference model for analyzing POCs, we will design a robust channel allocation scheme to avoid co-channel and adjacent-channel interference, thereby enhancing the overall throughput of UAV communication networks in dynamic and uncertain environments.
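The interference test described by the cases above can be sketched in Python as follows. The IR values are copied from Table I; taking the channel separation as the absolute difference of channel indices, and the function names, are assumptions of this sketch. The IF ratio itself is not computed here.

```python
# Interference range (meters) per channel separation, copied from Table I.
IR_TABLE = {0: 132.6, 1: 90.8, 2: 75.9, 3: 46.9, 4: 32.1, 5: 0.0}


def interference_range(ch_a, ch_b):
    """IR for two channels; separations of 5 or more are treated as orthogonal."""
    sep = abs(ch_a - ch_b)          # assumed definition of channel separation
    return IR_TABLE.get(sep, 0.0)   # sep > 5 -> orthogonal, IR = 0


def links_interfere(distance_m, ch_a, ch_b):
    """True when the nodes are closer than the IR of their channel pair."""
    return distance_m < interference_range(ch_a, ch_b)
```

For example, two nodes 100 m apart interfere on the same channel (IR 132.6 m) but not on adjacent channels (IR 90.8 m).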
II-C Problem Formulation
High mobility, dynamic topology, intermittent links and varying link quality are inherent properties of UAV communication networks, which need to be carefully addressed in practice. To illustrate the dynamics of UAV nodes, we apply the Paparazzi mobility model (PMM), which is a stochastic mobility model for simulating UAV behavior [31]. The PMM consists of five possible movement types with different trajectories: StayAt, Waypoint, Eight, Scan and Oval. Following the PMM, the spatial position of UAV node is changeable, making the link distance between the node and the destination node (the CHs or the GCS) time-variant. Thus, there would exist a discrepancy of link distance between the real value and the estimated value , i.e.,
(2) 
where is the uncertain boundary.
By introducing a scaling factor [32], the channel gain of UAV node , which is related with the link distance, is denoted as:
(3) 
where is a constant to reflect the influence of antenna gain and the average channel attenuation, is a reference distance which is fixed to be 1-10 m indoors and 10-100 m outdoors, and is the path loss exponent.
Due to the dynamic characteristics of UAV networks, the CSI tends to be uncertain and can hardly be estimated. Denote the imperfect estimation of channel gain as [33], then the uncertain channel gain can be expressed as:
(4) 
where accounts for a bounded error [34].
Moving on, the signal-to-interference-plus-noise ratio (SINR) of node , when accessing channel , can be described as:
(5) 
where is the variance of additive white Gaussian noise (AWGN) on the channel , and the represents the interference, which is given by:
(6) 
(1) Achievable Rate: Let denote the allocated channel set of node , where is the number of allocated channels, and is the feasible channel set of node . According to Shannon’s capacity formula, the achievable data rate of node is:
(7) 
where is the bandwidth of each channel, and is the data rate in channel .
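A minimal Python sketch of the rate computation, assuming the usual SINR form (received power over noise plus interference) and a sum of per-channel Shannon capacities. The function and parameter names are hypothetical, and how transmit power is split across channels is left to the caller.

```python
import math


def sinr(tx_power_w, channel_gain, noise_var_w, interference_w):
    """SINR = received signal power / (noise + interference)."""
    return tx_power_w * channel_gain / (noise_var_w + interference_w)


def achievable_rate(bandwidth_hz, sinr_per_channel):
    """Sum of Shannon capacities B*log2(1 + SINR) over the allocated channels."""
    return sum(bandwidth_hz * math.log2(1.0 + s) for s in sinr_per_channel)
```

With a unit bandwidth and per-channel SINRs of 1 and 3, the two channels contribute 1 and 2 bit/s respectively.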
(2) Generalized Throughput: Considering the mesh architecture of UAV networks, the performance of node should be assessed not only by its data rate but also by the topology structure. Instead of the achievable data rate , we introduce a more comprehensive indicator [11] to redefine the throughput of node , whereby the previously defined IF as well as the network connectivity are also taken into consideration, i.e.
(8) 
where denotes a connectivity factor, is the IF when node occupies channel , and is the hop count from node to the destination receiver.
Denote the channel allocation pattern of all UAV nodes as , with the above two performance metrics, i.e. the achievable rate and the generalized throughput , then the global utility of all UAVs can be given by:
(9) 
The purpose of channel resource management is thereby to optimize the network utility (whether the rate or the throughput is preferred depends on the practical application) by carefully allocating the POCs under the following constraints.

Total power constraint:
(10) where denotes the maximum transmission power of node .

QoS constraint:
(11) where is the minimum transmit data rate for node to maintain QoS requirement of diverse applications.

Cluster size constraint:
(12) where is the maximum number of UAV nodes that a cluster unit can accommodate.

Orthogonality constraint:
(13) As we claimed before, the self interference would severely undermine the QoS. Therefore, the multiple channels occupied by one UAV node should be mutually orthogonal, and their number cannot exceed .
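The four constraints can be read as a per-node feasibility test. The Python sketch below is one plausible reading: the total power is taken as the sum of per-channel powers, and orthogonality is checked through a minimum index separation. All parameter names are hypothetical.

```python
def is_feasible(channels, powers, rate, cluster_size,
                p_max, r_min, n_max, min_sep, c_max):
    """Check the four constraints for a single node's tentative allocation."""
    power_ok = sum(powers) <= p_max                 # total power constraint
    qos_ok = rate >= r_min                          # QoS (minimum rate)
    cluster_ok = cluster_size <= n_max              # cluster size constraint
    ordered = sorted(channels)
    orthogonal_ok = (
        len(ordered) <= c_max
        and all(b - a >= min_sep for a, b in zip(ordered, ordered[1:]))
    )                                               # orthogonality constraint
    return power_ok and qos_ok and cluster_ok and orthogonal_ok
```

Checking only neighboring entries of the sorted channel list is sufficient, since any two non-neighboring channels are separated by at least as much.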
Based on the above elaborations, the corresponding POCs allocation problem for the meshstructured UAV networks can be mathematically formulated as:
(14)  
s.t.  
(15) 
Due to the NP-hard nature of the above problem, solving it in static networks with definite information is already challenging, not to mention taking the dynamic properties of UAV networks and additional complex constraints into consideration. To the best of our knowledge, a robust algorithm for POCs allocation that can efficiently cope with the dynamics and uncertainties of UAV communication networks has not been studied in the literature.
III Fuzzy Payoffs Game for POCs Allocation
As shown by the previous analysis and the subsequent simulations, the intrinsic dynamics and uncertainties of the considered UAV networks would degrade the performance of conventional crisp-game theoretical algorithms because of their sensitivity to environmental variations. In this section, we exploit fuzzy game theory [35] to reformulate the above optimization problem with uncertain information and multiple constraints. Specifically, we first summarize some basic definitions and notions of conventional crisp-game theory and fuzzy set theory. On this basis, we introduce the FPG concept to describe the POCs allocation problem in mesh UAV networks. Furthermore, the existence of the equilibrium solution for our established FPG is demonstrated.
III-A Game Theory
Definition 1 (Crisp Non-Cooperative Game).
A crisp non-cooperative game is defined as , where:

is the set of players (UAV nodes);

, is the set of strategy profiles of the game, where is the set of strategies of the th player;

is the set of payoff functions for the players.
Since the game is non-cooperative, only self-enforcing solutions are reasonable and rational for it. The core concept of the non-cooperative game is the Nash equilibrium (NE), which is described as follows.
Definition 2 (Nash Equilibrium).
A strategy pattern is called an NE of the game if,
(16) 
where is the set of strategies of all players except the th player, and is an element of .
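To make the NE condition concrete, the following Python sketch enumerates the pure-strategy equilibria of a small crisp two-player game: a profile is kept when neither player can gain by deviating alone. The payoff matrices are a made-up toy example, not taken from the paper.

```python
from itertools import product


def pure_nash_equilibria(payoff_1, payoff_2):
    """Return all pure-strategy NE (i, j) of a two-player bimatrix game."""
    rows, cols = len(payoff_1), len(payoff_1[0])
    equilibria = []
    for i, j in product(range(rows), range(cols)):
        best_1 = all(payoff_1[i][j] >= payoff_1[k][j] for k in range(rows))
        best_2 = all(payoff_2[i][j] >= payoff_2[i][l] for l in range(cols))
        if best_1 and best_2:
            equilibria.append((i, j))
    return equilibria


# Toy two-node, two-channel game: sharing a channel halves each payoff.
u1 = [[1, 2], [2, 1]]
u2 = [[1, 2], [2, 1]]
```

In this toy game the two equilibria are the profiles in which the nodes pick different channels.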
III-B Fuzzy Set Theory
Definition 3 (Fuzzy Number).
A real fuzzy number is precisely described as any fuzzy subset on the space of real numbers , whose membership function satisfies the following conditions:

is a continuous mapping from to the closed interval .

is constant on and . Specifically, , and , .

is strictly increasing and continuous over , and strictly decreasing and continuous over .
Here , , and are real numbers satisfying .
The membership function gives a quantitative description of the fuzzy number , which is the basic concept of fuzzy mathematics. Here, we take the triangular fuzzy number (TFN) as an example, whose membership function is given by:
(17) 
where , and are all real numbers. An illustration of the membership function of TFN is shown in Fig. 3.
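A triangular membership function can be sketched in Python as below. The parameterization by a peak value with left and right deviations is an assumption of this sketch (chosen to match the later use of left and right deviations), and the function name is hypothetical.

```python
def tfn_membership(x, center, left, right):
    """Membership degree of x in a TFN with peak `center` and spreads left/right."""
    if left <= 0 or right <= 0:
        raise ValueError("spreads must be positive")
    if center - left <= x <= center:
        return (x - (center - left)) / left      # rising edge
    if center < x <= center + right:
        return ((center + right) - x) / right    # falling edge
    return 0.0                                   # outside the support
```

The degree is 1 at the peak, falls linearly to 0 at both ends of the support, and is 0 elsewhere.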
The operations of the fuzzy number obey the following lemma.
Lemma 1.
Let , represent TFNs, is a real number. It holds that:

;

;

dominates (denoted by ) if and only if and .
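The sum and scalar multiple of TFNs can be illustrated with standard TFN arithmetic, sketched here in Python. The (center, left, right) representation and the class name are assumptions of this sketch, and only the two arithmetic operations are covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TFN:
    center: float
    left: float    # left deviation (spread), non-negative
    right: float   # right deviation (spread), non-negative

    def __add__(self, other):
        # Sum of two TFNs: centers and spreads add component-wise.
        return TFN(self.center + other.center,
                   self.left + other.left,
                   self.right + other.right)

    def scale(self, k):
        # Scalar multiple; a negative k swaps the left and right spreads.
        if k >= 0:
            return TFN(k * self.center, k * self.left, k * self.right)
        return TFN(k * self.center, -k * self.right, -k * self.left)
```

Multiplying by a negative scalar mirrors the support around zero, which is why the two spreads trade places in that branch.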
Since fuzzy numbers represent ambiguous numeric values, it is difficult to rank them according to their magnitude. Various methods of fuzzy numbers ranking have been developed. In [36], the method of evaluating fuzzy numbers with the satisfaction function (SF) and the viewpoint, and then ranking the numbers on the basis of their relative indexes of the evaluation values is introduced. The definitions of the SF, the viewpoint, the evaluation value, and the relative index are presented as follows.
Definition 4 (Satisfaction Function).
The SF between two fuzzy numbers and is defined as:
(18a) 
(18b) 
where the operator is a T-norm; without loss of generality, here we employ the commonly used multiplication operator. represents the possibility that fuzzy number is smaller than . Similarly, represents the possibility that is larger than .
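A numerical sketch of the SF is given below in Python. It assumes that the SF has a normalized double-integral form, i.e. the membership-weighted share of value pairs in which the first number lies below the second, with the product T-norm. This form, the grid approximation and the function names are assumptions of the sketch rather than a transcription of eq. (18).

```python
def tri(x, center, left, right):
    """Triangular membership with peak `center` and positive spreads."""
    if center - left <= x <= center:
        return (x - (center - left)) / left
    if center < x <= center + right:
        return ((center + right) - x) / right
    return 0.0


def satisfaction_less(a, b, n=200):
    """Approximate S(A < B) on a uniform grid with the product T-norm.

    a, b: (center, left, right) tuples. Ties x == y are split evenly.
    """
    lo = min(a[0] - a[1], b[0] - b[1])
    hi = max(a[0] + a[2], b[0] + b[2])
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    mu_a = [tri(x, *a) for x in xs]
    mu_b = [tri(x, *b) for x in xs]
    less = total = 0.0
    for i, ma in enumerate(mu_a):
        for j, mb in enumerate(mu_b):
            w = ma * mb              # product T-norm of the two degrees
            total += w
            if i < j:
                less += w
            elif i == j:
                less += 0.5 * w
    return less / total
```

Under this reading, the value for "A smaller than B" and the value for "A larger than B" sum to one, and two TFNs with disjoint supports are ranked with certainty.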
Definition 5 (Viewpoint).
For a fuzzy number , a fuzzy set which satisfies the following conditions is a viewpoint:

, where ;

exists and it is not zero.
The fuzzy set is a viewpoint, which is used for evaluating the fuzzy numbers and can be broadly divided into three categories: optimistic, neutral and pessimistic. The second condition is added so that a viewpoint can be applicable to the SF.
Definition 6 (Evaluation Value).
On the basis of the interpretation of the SF, the evaluation value of fuzzy number in a viewpoint , is given by:
(19) 
Definition 7 (Relative Index).
The relative index of the fuzzy number in the viewpoint , , which shows how close is to the one having the best evaluation in viewpoint , is defined as:
(20) 
where is the set of fuzzy numbers.
III-C Fuzzy Payoffs Game
In order to deal with the POCs allocation problem with uncertain information in dynamic UAV communication networks, we map the channel assignment problem into a fuzzy-logic space rather than an observational space. Then, we employ the TFN to describe the uncertain channel gains. Without loss of generality, we presume the left deviation and the right deviation of the TFN are equal, i.e. . Recall that gives the bounded estimation error of channel gains. For clarity, we denote
(21) 
Expanding each component of the crisp game to a fuzzy set would lead to a fuzzy game. In this paper, we assume the player set and the strategy profiles are definite, whereas the payoff of each player, which is influenced by the TFN, is a fuzzy number.
Definition 8 (Fuzzy Payoff Function).
The FPF of each UAV node is defined as the uncertain achievable rate or generalized throughput of node , which become fuzzy numbers due to the projection into the fuzzy space, i.e.,
(22) 
In this regard, the utility of each node is affected not only by the actions taken by all nodes, but also by the dynamic and uncertain environments.
With the formulated FPF, we now present an FPG to characterize the problem of POCs allocation in UAV communication networks with indefinite CSI.
Definition 9 (Fuzzy Payoffs Game).
The FPG is defined as:
(23) 
where and are identical to those in the crisp game . is the FPF set, and is the vector of uncertain channel gains modeled by fuzzy numbers.
Based on the above analysis, the problem in eq. (14) constrained by eq. (15) can be reformulated as an FPG, in which the players attempt to find an appropriate channel selection pattern to maximize their fuzzy payoffs, i.e.,
(24)  
s.t.  
(25) 
The property of the above designed FPG is investigated in the following subsection.
III-D Analysis of Fuzzy Nash Equilibrium
As with the crisp game, the fuzzy game also has an NE concept, which is referred to as the FNE [37]. The definition of the FNE is presented as follows.
Definition 10 (Fuzzy Nash Equilibrium).
A strategy pattern is called an FNE of the fuzzy game if,
(26) 
Theorem 1.
There exists an FNE solution for the formulated FPG in eq. (23).
Proof.
In order to demonstrate the existence of FNE, we first present the definition and the property of fuzzy bimatrix games.
Definition 11 (Fuzzy Bimatrix Game).
A fuzzy bimatrix game is defined as a bimatrix game, which involves two players with fuzzy payoffs [38], i.e.,
(27) 
where and are the sets of the strategies of Player I and Player II, respectively.
(28) 
is the payoffs matrix of the players. Each element specifies the attained fuzzy payoffs, when Player I adopts the strategy while Player II adopts the strategy .
One key property of fuzzy bimatrix games is characterized by the following lemma.
Lemma 2.
A fuzzy bimatrix game has at least one FNE solution, if there exists a subset such that the function is convex on [kacher2008existence].
Based on the favorable feature of fuzzy bimatrix games, the mathematical induction (MI) is employed to analyze our formulated FPG, with which the existence of FNE can be guaranteed.
First, we consider the situation that the FPG consists of two players and present the following theorem.
Theorem 2.
There exists at least one FNE solution for the FPG with two players.
Proof.
For the first condition in Lemma 2, intuitively,
(29) 
is a fuzzy bimatrix game.
For the second condition, here, we choose , then we have
(30) 
To determine the convexity of function , we calculate its second derivative , i.e.
(31) 
where and are constants. According to the definition of FPF in eq. (22), we discuss the following two cases.

When , we have
(32) 
When , we have
(33)
In the above two cases, the function is convex on .
Based on Lemma 2 and the above elaborations, Theorem 2 can be proved. ∎
Moving on, we execute the second step of the MI and provide the following theorem.
Theorem 3.
Assuming that the FPG with players has FNE solutions, then there exists at least one FNE solution for the FPG with players.
Proof.
Denote the th player as , the FPG can be expressed as:
(34) 
Due to the existence of an FNE of the FPG , the players set can be regarded as a single entity. Therefore, the FPG turns into a fuzzy bimatrix game, in which is Player I, and is Player II.
After establishing the FPG as a fuzzy bimatrix game, we analyze the convexity of the payoff functions and .
For the payoff function of Player I, let , by performing the same steps (from eq. (30) to eq. (33)) in Theorem 2, it can be proved convex on .
The payoff function of Player II is the sum of the payoffs of all players , , i.e.
(35) 
Obviously, is a convex function on , since is convex on , . Recall that denotes the vector of uncertain channel gains.
On the basis of Lemma 2 and the above analysis, Theorem 3 can be proved. ∎
Combining Theorem 2 and Theorem 3, Theorem 1 can be proved. ∎
Having demonstrated the existence of an FNE for our formulated FPG, what remains is how to reach the equilibrium solution. It should be pointed out that identifying the FNE of a fuzzy game is far more complex than finding the NE of a crisp game: it involves, for example, fuzzy number ranking, and is influenced by the membership function of the fuzzy numbers as well as the viewpoint of the players, which renders most existing learning methods invalid. Thus, we need to design a new learning algorithm to cope with the fuzzy parameters, with which robust access and optimal allocation can be implemented in dynamic UAV environments.
IV Global Optimization in Dynamic UAV Communication Networks
In order to solve the concerned FPG, in this section we introduce a robust fuzzy-learning algorithm for distributed POCs allocation, and then demonstrate its convergence property.
IV-A Fuzzy-Learning Algorithm
In UAV communication networks, UAV nodes experience dynamic environments and suffer from varying coupling interference in most cases, i.e. their utilities fluctuate and become uncertain. Existing crisp-game theoretical learning approaches, which depend on accurate rewards (e.g. the channel capacity or SINR) to make decisions and update strategies in an observational space, are vulnerable to the encountered dynamics, and hence their convergence can hardly be guaranteed.
Instead of an observational space, we implement the learning and updating in a mapped fuzzy space, whereby the priority vector of actions can be acquired by resorting to the fuzzy payoffs rather than the real-valued crisp payoffs. A remarkable advantage of our developed learning algorithm is that, by introducing a mapped fuzzy space and fuzzy-domain learning, it desensitizes the decision process to the fluctuating utility and is thereby capable of combating environmental changes and ensuring robust access. A conceptual algorithm flow is shown in Fig. 4.
Based on the above elaborations, our proposed robust fuzzy-learning algorithm for POCs assignment in dynamic UAV networks is summarized in Algorithm 1. With the fuzzy-space interpolation and processing, a general action-taking strategy is adopted as follows:
(36) 
It is worth highlighting that the updating rule contains two parts, of which the first corresponds to the best response with fuzzy logic, while the second is added to fulfill the QoS requirement (e.g. the orthogonality constraints).
As mentioned, we assume the channel gain varies fast between two adjacent slots, and the strategy updating at slot is based on the current uncertain utility. The termination criterion of Algorithm 1 is that the difference of UAV node ’s utility between two adjacent iteration slots is less than a predefined threshold (data rate criterion) or (throughput criterion), i.e.,
(37) 
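The termination criterion can be sketched in Python as a simple comparison of utilities across two adjacent slots. The sketch works on crisp utility values and applies the same threshold to every node; both choices, and the function name, are assumptions made for illustration.

```python
def has_converged(prev_utilities, curr_utilities, threshold):
    """True when every node's utility changed by less than `threshold`."""
    return all(abs(c - p) < threshold
               for p, c in zip(prev_utilities, curr_utilities))
```

A learning loop would call this after each slot with the rate-based or the throughput-based threshold, depending on which utility is being optimized.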
IV-B Priority Vector
From eq. (34) and the proposed Algorithm 1, it is noted that the priority vector is the cornerstone of fuzzy-space learning for UAV node as far as POC assignment is concerned. It accounts for the degree of importance of the channels from the viewpoint of node and also satisfies the normalizing condition , . In the following, we present a least deviation algorithm [39] in order to quantify the fuzzy payoffs and finally calculate the priority vector via fuzzy logic.
Algorithm 2: Least deviation algorithm for the priority vector, eqs. (39)-(45)
We first rank the fuzzy number payoffs, and then introduce a fuzzy preference relation (FPR) to make a soft measurement of fuzzy numbers, with which the stable priority vector of the actions can be derived. The FPR matrix of node is defined as with complementary matrix properties:
(38) 
where denotes the preference degree of the UAV node between channel and .
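For reference, the complementary properties usually required of an FPR matrix in the least deviation literature [39] can be written as below. The symbols $\mathbf{P}_n$, $p_{ij}$ and $M$ (the number of channels) are notation assumed here for illustration and may differ from the notation of eq. (38).

```latex
\mathbf{P}_n = \left( p_{ij} \right)_{M \times M}, \qquad
0 \le p_{ij} \le 1, \qquad
p_{ij} + p_{ji} = 1, \qquad
p_{ii} = 0.5, \qquad
i, j = 1, \dots, M .
```

Under this convention, $p_{ij} > 0.5$ indicates that channel $i$ is preferred to channel $j$, while $p_{ij} = 0.5$ indicates indifference.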
On this basis, the priority vector can be determined by incorporating the viewpoint projection. Given the triangular fuzzy numbers used here, the schematic flow of the least deviation algorithm is illustrated in Algorithm 2. By resorting to fuzzy logic to analyze the mapped fuzzy payoffs, the fuzzy-space interpolation desensitizes the learner to fast-changing environments and fluctuating utilities; hence, each player learns smoothly in dynamic environments and evolves steadily towards a satisfactory solution.
IV-C Convergence of the Proposed Algorithm
To demonstrate the convergence of our new fuzzy-learning algorithm for POCs allocation, we present the following theorem.
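To make the idea of extracting a priority vector from an FPR matrix concrete, the following is a minimal sketch in the spirit of the least deviation method of [39]. It is not a reproduction of Algorithm 2: the fuzzy-number ranking and the viewpoint projection are omitted, and the function name, the coordinate-wise update order and the stopping tolerance are assumptions made for illustration. The sketch minimizes the deviation sum over i ≠ j of (b_ij w_j)/(b_ji w_i) + (b_ji w_i)/(b_ij w_j) − 2 by updating one weight at a time in closed form.

```python
import math
from typing import List


def least_deviation_priority(
    fpr: List[List[float]],
    tol: float = 1e-12,
    max_sweeps: int = 500,
) -> List[float]:
    """Coordinate-descent sketch for a least-deviation priority vector.

    fpr -- complementary FPR matrix (fpr[i][j] + fpr[j][i] == 1), with all
           off-diagonal entries strictly between 0 and 1
    Returns a weight vector that sums to 1.
    """
    n = len(fpr)
    w = [1.0 / n] * n  # start from uniform weights
    for _ in range(max_sweeps):
        max_change = 0.0
        for i in range(n):
            # Closed-form minimizer of the deviation with respect to w[i] alone.
            num = sum(fpr[i][j] * w[j] / fpr[j][i] for j in range(n) if j != i)
            den = sum(fpr[j][i] / (fpr[i][j] * w[j]) for j in range(n) if j != i)
            new_wi = math.sqrt(num / den)
            max_change = max(max_change, abs(new_wi - w[i]))
            w[i] = new_wi
        total = sum(w)
        w = [x / total for x in w]  # enforce the normalizing condition
        if max_change < tol:
            break
    return w


# Toy usage: an FPR built from known weights, b_ij = w_i / (w_i + w_j).
true_w = [0.5, 0.3, 0.2]
toy_fpr = [[wi / (wi + wj) for wj in true_w] for wi in true_w]
estimated_w = least_deviation_priority(toy_fpr)
```

For a perfectly consistent FPR such as the toy matrix above, the deviation is zero at the generating weights, so the sketch should recover them up to numerical tolerance; for an inconsistent FPR it returns a compromise vector.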
Theorem 4.
The proposed Algorithm 1 for the dynamic UAV networks is guaranteed to converge to a stable channel allocation profile, with which the maximal network throughput can be achieved.
Proof.
For our robust fuzzy-learning algorithm for distributed POCs allocation, the best-response strategy updating with fuzzy logic in Step 3 of Algorithm 1 ensures that the reward of UAV node is always non-decreasing, and thus the network throughput increases until global stability is achieved. On the other hand, owing to the limited channel resources and the finite number of UAV nodes, the global QoS utility is upper bounded. Thus, the proposed algorithm converges to a stable channel allocation profile after a finite number of iterations, i.e., no further throughput improvement can be made, as the maximal network throughput has been achieved. Theorem 4 is thus proved. ∎
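The stabilization part of this argument can be written compactly as follows. The symbols $U(t)$ (network utility at slot $t$), $U_{\max}$ (its upper bound) and $\mathcal{S}$ (the finite set of channel allocation profiles) are notation assumed here for illustration.

```latex
\begin{align}
  & U(t+1) \ge U(t), \qquad U(t) \le U_{\max}, \qquad \forall t, \\
  & |\mathcal{S}| < \infty \;\Rightarrow\; U(t) \text{ takes only finitely many values}, \\
  & \Rightarrow\; \exists\, T < \infty : \; U(t) = U(T), \quad \forall t \ge T .
\end{align}
```

In words, a non-decreasing sequence that can only take finitely many values can strictly increase only finitely often, so it becomes constant after a finite number of slots.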
V Simulation Results
In this section, numerical simulations are provided to demonstrate the performance of our robust fuzzy-learning algorithm in the context of self-adaptive POCs allocation in mesh UAV communication networks. In the following analysis, the size of the 3D space is configured to m. The maximum number of UAV nodes in a cluster is 6, i.e., . The transmission powers of the CH and CM are dBm and dBm, respectively. The available channels for UAV networks are specified by the IEEE 802.11b/g standard, i.e. , and . In order to characterize different movement types, the normalized uncertain boundary of the varying channel gains is . The AWGN variance and the path-loss exponents of the channels are set to dBm and , , respectively. Other constant parameters for implementing the fuzzy-learning algorithm are set to and . The viewpoint of UAV node is assumed to be neutral.
In the following, we first describe the mesh-structured UAV network used in our simulations. Then, the convergence performance of the proposed scheme is presented, and both the achievable rate and the generalized throughput of our fuzzy-learning algorithm are compared with those of its counterpart. Finally, the system performance is demonstrated via two main metrics: the number of active links, and the network throughput with QoS assurance. Note that all numerical results are derived from 50 independently simulated UAV network topologies and 100 trials for each topology.
V-A Mesh UAV Network Model
A diagram of the simulated mesh UAV network is shown in Fig. 5, which involves a GCS and multiple UAV nodes. The network size is , whereby a total of 10 UAV nodes form 3 clusters (one may refer to related works for cluster formation algorithms; here we simply assume that clustering is based on spatial distances). The required number of outer-links is therefore 3. For a star-structured UAV network, the number of required outer-links equals exactly the number of UAV nodes. In comparison, the long-distance outer-links are limited in a mesh UAV network and, more importantly, the QoS requirements can be fulfilled with a much lower transmission power (dominated by short-distance interlinks). Thus, compared with a star architecture, the mesh configuration is preferable for UAV communication networks.
V-B Convergence Performance
We then evaluate our proposed scheme in dynamically uncertain UAV communication environments. First, the cumulative distribution function (CDF) of the number of iterations required to achieve a satisfactory solution is presented in Fig. 6, which gives an indicator of the convergence speed of our proposed fuzzy-learning scheme from a statistical perspective. The numerical results suggest that the number of iterations needed for convergence is positively related to the total number of UAV nodes, i.e. a larger network needs more iterations to converge. Besides, we note that the convergence of our proposed method is relatively rapid even for a larger network size: the mean numbers of required iterations under different UAV network sizes (=10, 20, 30, 40) are about 3, 4, 6 and 8, respectively. Such rapid convergence makes our scheme particularly attractive for robust access in energy-constrained UAV communication networks.
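A CDF of the kind described above can be obtained from per-trial iteration counts with a short sketch. The function name `empirical_cdf` and the sample counts below are hypothetical placeholders for illustration only; they are not data from the simulations reported here.

```python
from collections import Counter
from statistics import mean
from typing import List, Sequence, Tuple


def empirical_cdf(samples: Sequence[int]) -> List[Tuple[int, float]]:
    """Return sorted (value, P[X <= value]) pairs for integer samples."""
    n = len(samples)
    counts = Counter(samples)
    cdf = []
    running = 0
    for value in sorted(counts):
        running += counts[value]  # number of trials that needed <= value iterations
        cdf.append((value, running / n))
    return cdf


# Hypothetical iteration counts from ten trials (illustration only).
iterations_per_trial = [2, 3, 3, 4, 3, 5, 2, 3, 4, 3]
cdf_points = empirical_cdf(iterations_per_trial)
mean_iterations = mean(iterations_per_trial)
```

Each pair in `cdf_points` gives the fraction of trials that converged within the stated number of iterations, which is the quantity a CDF plot would display.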
After convergence, comparative results for both the achievable rate and the generalized throughput of our proposed fuzzy-learning algorithm and its counterpart, i.e. the crisp-game theoretical algorithm, are plotted together in Fig. 7. First, as far as these two performance metrics are concerned, the generalized throughput is less than the achievable rate under the same parameter configuration, which is consistent with our previous definitions (i.e., the generalized throughput is further weighted by the connectivity factor and the interference factor). More importantly, the maximum utility attained by our new fuzzy-learning algorithm substantially exceeds that of the conventional scheme, regardless of the performance metric and of how large the UAV network is. Taking the generalized throughput under for example, the total throughput of our new method converges to 71, whilst the crisp-game based algorithm only approaches 50. That is, a significant improvement of around 42%, i.e. (71 − 50)/50, is achieved by our proposed scheme.
The main reason is that, lacking mechanisms to combat the randomly fluctuating utility, most conventional learning-based methods are inevitably influenced by dynamic environments and thereby fail to achieve a satisfactory solution. In contrast, by implementing the learning and updating in a mapped fuzzy space, our proposed scheme is largely immune to the involved dynamics and uncertainties. It is hence more competitive in identifying the optimal solution to POCs allocation and enables robust access even in dynamic UAV communication networks.
V-C System Performance
We further study the performance of various multi-channel allocation algorithms under different system configurations, i.e. the UAV network size . First, we examine the number of active links under different UAV network sizes. Then, we compare the achieved network throughput of our proposed scheme with that of its counterparts, i.e. the crisp-game based algorithm and a random selection approach.
V-C1 Number of Active Links
For our proposed fuzzy-learning scheme, the permitted number of parallel active links under different UAV network sizes is illustrated in Fig. 8. The numerically derived curve can be partitioned into an unsaturated regime and a saturated regime, with a regime bound of . In the left unsaturated regime, the number of active links increases with the total number of UAV nodes . In the right saturated regime, the number of active links remains unchanged, which means that the network capacity has an upper bound even with channel reuse, due to serious coupling interference. For the specific parameter configuration considered, the maximum number of active links is 35, and the maximal channel reuse ratio is around 3.
V-C2 Achieved Network Throughput
Furthermore, we study the network throughput of our proposed scheme and the counterpart approaches under various UAV network sizes. The expected network throughput is shown in Fig. 9, which gives both the mean value and the variance of the network throughput for the three allocation schemes. It is observed that our proposed fuzzy-learning algorithm significantly outperforms the other two approaches no matter what the number of UAV nodes is. Our new method attains superior network performance (i.e. a higher mean) with more favorable stability (i.e. a lower variance). In comparison, the other existing approaches become less competitive as far as robust access in dynamic environments is concerned, especially the crisp-learning scheme, for which the performance variance may even surpass its mean value (e.g. and ).
In particular, we note from Fig. 9(a) that the network throughput achieved by the crisp-game algorithm is only slightly greater than that of the random selection approach. In other words, a classical crisp-game algorithm, developed mostly for static environments (i.e. with time-independent channel gains and mutual interference), becomes basically invalid: its learning behavior is completely undermined by the constantly changing network topology and the time-varying channels or local utilities. In sharp contrast, despite the mobility of UAV nodes and the resulting dynamically uncertain CSI, our proposed algorithm with fuzzy-space mapping can still achieve the optimal POCs allocation by maximizing the network throughput.
In addition, from Fig. 9(b), we find that the performance variance of the conventional crisp-learning algorithm is even worse (i.e. higher) than that of the random selection approach, which further demonstrates the extreme vulnerability of a crisp-game method when handling dynamically uncertain information in practical scenarios. Hence, it loses its effectiveness in channel allocation for dynamic UAV networks. This problem may hold for a large class of greedy learning schemes, whereby temporal variations in the local utility trigger an impetuous response in strategy updating, causing sharp fluctuations in strategies and leading to random evolution behaviors. With the fuzzy-space learning framework, our proposed scheme can cope with this challenging problem. As highlighted, it desensitizes the fluctuating utility and thereby ensures robust access even in dynamic UAV networks, producing a much lower variance in the achieved performance.
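The mean and variance comparison described above can be reproduced from raw per-trial throughput samples with a short sketch. The function name `summarize_throughput` and the sample values are hypothetical and serve only to illustrate the computation, including the check of whether the variance exceeds the mean.

```python
from statistics import fmean, pvariance
from typing import Dict, Sequence, Union


def summarize_throughput(samples: Sequence[float]) -> Dict[str, Union[float, bool]]:
    """Mean, population variance, and a flag for variance exceeding the mean."""
    mean_value = fmean(samples)
    variance = pvariance(samples, mu=mean_value)
    return {
        "mean": mean_value,
        "variance": variance,
        "variance_exceeds_mean": variance > mean_value,
    }


# Hypothetical per-trial throughput samples (illustration only).
stable_scheme = summarize_throughput([1.0, 2.0, 3.0, 4.0])
unstable_scheme = summarize_throughput([0.0, 10.0, 0.0, 10.0])
```

In this toy data the second scheme has the higher mean but a variance that exceeds it, which is the kind of instability the flag is meant to expose.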
VI Conclusion
In this paper, the optimal POCs allocation with multiple constraints in dynamic mesh-structured UAV communication networks is studied. By projecting the randomly fluctuating utility into a fuzzy space, a robust fuzzy-learning paradigm for distributed POCs allocation is developed to cope with the major challenges of UAV communication networks, i.e. the dynamically uncertain environment and the energy-constrained deployment. In contrast to most existing learning schemes, which operate directly in the observational space, our proposed scheme implements the learning and updating in a mapped fuzzy space, whereby the fluctuating utilities are interpreted with fuzzy numbers and the decisions are made on the basis of the derived priority vectors. Our new scheme is characterized by its desensitization of dynamically uncertain CSI and fluctuating utilities, which effectively combats the oscillation effects in decisions and thereby ensures robust access even in dynamic UAV communication networks. The potential advantages of our algorithm are also demonstrated by numerical results. Owing to its notable stability and robustness, our proposed algorithm is capable of achieving the maximum network throughput and enhancing resource efficiency even in dynamically changing environments. As a consequence, our proposed robust fuzzy-learning scheme holds significant promise for emerging UAV applications.
Acknowledgment
This work was supported by BUPT Excellent Ph.D. Students Foundation under Grant No. CX2017209, and Natural Science Foundation of China (NSFC) under Grant No. 61471061.
References
 [1] Y. Zeng, R. Zhang, and T. J. Lim, “Wireless communications with unmanned aerial vehicles: opportunities and challenges,” IEEE Communications Magazine, vol. 54, no. 5, pp. 36–42, 2016.
 [2] S. Koulali, E. Sabir, T. Taleb, and M. Azizi, “A green strategic activity scheduling for uav networks: A submodular game perspective,” IEEE Communications Magazine, vol. 54, no. 5, pp. 58–64, 2016.
 [3] L. Gupta, R. Jain, and G. Vaszkun, “Survey of important issues in uav communication networks,” IEEE Communications Surveys & Tutorials, vol. 18, no. 2, pp. 1123–1152, 2016.
 [4] F. Jiang and A. L. Swindlehurst, “Optimization of uav heading for the ground-to-air uplink,” IEEE Journal on Selected Areas in Communications, vol. 30, no. 5, pp. 993–1005, 2012.
 [5] N. Goddemeier, K. Daniel, and C. Wietfeld, “Role-based connectivity management with realistic air-to-ground channels for cooperative uavs,” IEEE Journal on Selected Areas in Communications, vol. 30, no. 5, pp. 951–963, 2012.
 [6] Z. Liu, Y. Chen, B. Liu, C. Cao, and X. Fu, “Hawk: an unmanned mini-helicopter-based aerial wireless kit for localization,” IEEE Transactions on Mobile Computing, vol. 13, no. 2, pp. 287–298, 2014.
 [7] M. Naeem, A. Anpalagan, M. Jaseemuddin, and D. C. Lee, “Resource allocation techniques in cooperative cognitive radio networks,” IEEE Communications Surveys and Tutorials, vol. 16, no. 2, pp. 729–744, 2014.
 [8] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. Soong, and J. C. Zhang, “What will 5g be?” IEEE Journal on selected areas in communications, vol. 32, no. 6, pp. 1065–1082, 2014.
 [9] C. Fan, B. Li, C. Zhao, W. Guo, and Y. C. Liang, “Learning-based spectrum sharing and spatial reuse in mmwave ultra dense networks,” IEEE Transactions on Vehicular Technology, vol. PP, no. 99, pp. 1–1, 2017.
 [10] D. Orfanus, E. P. De Freitas, and F. Eliassen, “Self-organization as a supporting paradigm for military uav relay networks,” IEEE Communications Letters, vol. 20, no. 4, pp. 804–807, 2016.
 [11] P. B. Duarte, Z. M. Fadlullah, A. V. Vasilakos, and N. Kato, “On the partially overlapped channel assignment on wireless mesh network backbone: A game theoretic approach,” IEEE Journal on Selected Areas in Communications, vol. 30, no. 1, pp. 119–127, 2012.
 [12] Y. Cui, W. Li, and X. Cheng, “Partially overlapping channel assignment based on ‘node orthogonality’ for 802.11 wireless networks,” in INFOCOM, 2011 Proceedings IEEE. IEEE, 2011, pp. 361–365.
 [13] Y. Su, Y. Wang, Y. Zhang, Y. Liu, and J. Yuan, “Partially overlapped channel interference measurement implementation and analysis,” in 2016 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), April 2016, pp. 760–765.
 [14] P. Si, F. R. Yu, R. Yang, and Y. Zhang, “Dynamic spectrum management for heterogeneous uav networks with navigation data assistance,” in Wireless Communications and Networking Conference (WCNC), 2015 IEEE. IEEE, 2015, pp. 1078–1083.
 [15] J. Li and Y. Han, “Optimal resource allocation for packet delay minimization in multilayer uav networks,” IEEE Communications Letters, vol. 21, no. 3, pp. 580–583, 2017.
 [16] F. Tang, Z. M. Fadlullah, N. Kato, and R. Miura, “Acpoca: Anti-coordination game based partially overlapping channels assignment in combined uav and d2d based networks,” IEEE Transactions on Vehicular Technology, 2017.
 [17] A. Mishra, V. Shrivastava, S. Banerjee, and W. Arbaugh, “Partially overlapped channels not considered harmful,” in ACM SIGMETRICS Performance Evaluation Review, vol. 34, no. 1. ACM, 2006, pp. 63–74.
 [18] Y. Ding, Y. Huang, G. Zeng, and L. Xiao, “Using partially overlapping channels to improve throughput in wireless mesh networks,” IEEE Transactions on Mobile Computing, vol. 11, no. 11, pp. 1720–1733, 2012.
 [19] P. B. Duarte, Z. M. Fadlullah, K. Hashimoto, and N. Kato, “Partially overlapped channel assignment on wireless mesh network backbone,” in Global Telecommunications Conference (GLOBECOM 2010), 2010 IEEE. IEEE, 2010, pp. 1–5.
 [20] J. Li, Y. Cheng, X. Jia, and L. M. Ni, “Throughput optimization in wlan/cellular integrated network using partially overlapped channels,” IEEE Transactions on Wireless Communications, vol. PP, no. 99, pp. 1–1, 2017.
 [21] W. Zhao, H. Nishiyama, Z. M. Fadlullah, N. Kato, and K. Hamaguchi, “Dapa: Capacity optimization in wireless networks through a combined design of density of access points and partially overlapped channel allocation,” IEEE Transactions on Vehicular Technology, vol. 65, no. 5, pp. 3715–3722, 2016.
 [22] Y. Liu, R. Venkatesan, and C. Li, “Load-aware channel assignment exploiting partially overlapping channels for wireless mesh networks,” in 2010 IEEE Global Telecommunications Conference GLOBECOM 2010, Dec 2010, pp. 1–5.
 [23] Y. Xu, Q. Wu, J. Wang, L. Shen, and A. Anpalagan, “Opportunistic spectrum access using partially overlapping channels: Graphical game and uncoupled learning,” IEEE Transactions on Communications, vol. 61, no. 9, pp. 3906–3918, 2013.
 [24] C. Fan, B. Li, Y. Zhang, and C. Zhao, “Robust dynamic spectrum access in uncertain channels: a fuzzy payoffs game approach,” in Global Telecommunications Conference, 2017. IEEE GLOBECOM 2017. IEEE. IEEE, 2017.
 [25] J. Nguyen, N. Lawrance, R. Fitch, and S. Sukkarieh, “Energy-constrained motion planning for information gathering with autonomous aerial soaring,” in 2013 IEEE International Conference on Robotics and Automation, May 2013, pp. 3825–3831.
 [26] S. M. Perlaza, H. Tembine, S. Lasaulce, and M. Debbah, “Quality-of-service provisioning in decentralized networks: A satisfaction equilibrium approach,” IEEE Journal of Selected Topics in Signal Processing, vol. 6, no. 2, pp. 104–116, April 2012.
 [27] D.-F. Li, “An effective methodology for solving matrix games with fuzzy payoffs,” IEEE Transactions on Cybernetics, vol. 43, no. 2, pp. 610–621, 2013.
 [28] S. H. Breheny, R. D’Andrea, and J. C. Miller, “Using airborne vehicle-based antenna arrays to improve communications with uav clusters,” in 42nd IEEE International Conference on Decision and Control (IEEE Cat. No.03CH37475), vol. 4, Dec 2003, pp. 4158–4162.
 [29] B. Li, S. Li, A. Nallanathan, and C. Zhao, “Deep sensing for future spectrum and location awareness 5g communications,” IEEE Journal on Selected Areas in Communications, vol. 33, no. 7, pp. 1331–1344, 2015.
 [30] Z. Feng and Y. Yang, “Characterizing the impact of partially overlapped channel on the performance of wireless networks,” in Global Telecommunications Conference, 2008. IEEE GLOBECOM 2008. IEEE. IEEE, 2008, pp. 1–6.
 [31] O. Bouachir, A. Abrassart, F. Garcia, and N. Larrieu, “A mobility model for uav ad hoc network,” in Unmanned Aircraft Systems (ICUAS), 2014 International Conference on. IEEE, 2014, pp. 383–388.
 [32] Z. Feng and Y. Yang, “How much improvement can we get from partially overlapped channels?” in 2008 IEEE Wireless Communications and Networking Conference, March 2008, pp. 2957–2962.
 [33] B. Li, J. Hou, X. Li, Y. Nan, A. Nallanathan, and C. Zhao, “Deep sensing for space-time doubly selective channels: When a primary user is mobile and the channel is flat rayleigh fading,” IEEE Transactions on Signal Processing, vol. 64, no. 13, pp. 3362–3375, 2016.
 [34] M. Hasan and E. Hossain, “Distributed resource allocation for relay-aided device-to-device communication under channel uncertainties: A stable matching approach,” IEEE Transactions on Communications, vol. 63, no. 10, pp. 3882–3897, 2015.
 [35] H.-J. Zimmermann, Fuzzy set theory - and its applications. Springer Science & Business Media, 2011.
 [36] H. Lee-Kwang and J.-H. Lee, “A method for ranking fuzzy numbers and its application to decision-making,” IEEE Transactions on Fuzzy Systems, vol. 7, no. 6, pp. 677–685, 1999.
 [37] A. Chakeri and F. Sheikholeslam, “Fuzzy nash equilibriums in crisp and fuzzy games,” IEEE Transactions on Fuzzy Systems, vol. 21, no. 1, pp. 171–176, 2013.
 [38] L. Cunlin and Z. Qiang, “Nash equilibrium strategy for fuzzy noncooperative games,” Fuzzy Sets and Systems, vol. 176, no. 1, pp. 46–55, 2011.
 [39] Z. Xu and Q. Da, “A least deviation method to obtain a priority vector of a fuzzy preference relation,” European Journal of Operational Research, vol. 164, no. 1, pp. 206–216, 2005.