AI-based Two-Stage Intrusion Detection for Software Defined IoT Networks

AI-based Two-Stage Intrusion Detection for Software Defined IoT Networks

Jiaqi Li, Zhifeng Zhao, Rongpeng Li and Honggang Zhang College of Information Science Electronic Engineering, Zhejiang University
Email: {21631097, zhaozf, lirongpeng, honggangzhang}

Software Defined Internet of Things (SD-IoT) Networks profits from centralized management and interactive resource sharing which enhances the efficiency and scalability of IoT applications. But with the rapid growth in services and applications, it is vulnerable to possible attacks and faces severe security challenges. Intrusion detection has been widely used to ensure network security, but classical detection means are usually signature-based or explicit-behavior-based and fail to detect unknown attacks intelligently, which are hard to satisfy the requirements of SD-IoT Networks. In this paper, we propose an AI-based two-stage intrusion detection empowered by software defined technology. It flexibly captures network flows with a globle view and detects attacks intelligently through applying AI algorithms. We firstly leverage Bat algorithm with swarm division and Differential Mutation to select typical features. Then, we exploit Random forest through adaptively altering the weights of samples using weighted voting mechanism to classify flows. Evaluation results prove that the modified intelligent algorithms select more important features and achieve superior performance in flow classification. It is also verified that intelligent intrusion detection shows better accuracy with lower overhead comparied with existing solutions.

thanks: J. Li, Z. Zhao, R. Li, and H. Zhang are with College of Information Science and Electronic Engineering, Zhejiang University. Email: {21631097, lirongpeng, zhaozf, honggangzhang} This work was supported in part by the Program for Zhejiang Leading Team of Science and Technology Innovation (No. 2013TD20), in part by the National Postdoctoral Program for Innovative Talents of China (No. BX201600133), and in part by the Project funded by China Postdoctoral Science Foundation (No. 2017M610369).

I Introduction

Internet of Things (IoT) is an evolving technology which provides ubiquitous connectivity and interaction between the physical and cyber worlds [1]. However, the rapid increase in the number and diversity of smart devices connected to the Internet has raised the issues of flexibility, efficiency and availability within the current IoT networks. As an essential trend of Iot networks, the emergence of Software Defined IoT networks [2] provides a manageable solution which has drawn significant attention. Benifiting from the advantages of Software Defined Network (SDN) [3], SDN-based approach facilitates the supervision of network status and the collection of information under centralized control in an active manner. Moreover, it also optimizes network management and resource allocation flexibly through software programmability to meet the diverse demands of IoT networks. Nevertheless, the development of SD-IoT dose not totally eliminate various security issues and challenges. In order to address these emerging problems, it is urgent to build up effective and intelligent algorithms for enforcing the security of Software Defined IoT networks [4].

As an indispensable technology in network security, intrusion detection mechanisms dynamically monitor the abnormal behaviors or patterns in a system and indicate whether some events are susceptible of an attack [5]. There are two main categories of intrusion detection techniques: misuse detection and anomaly detection. Misuse detection are usually signature-based, which can only detect known attacks by matching the behaviors of incoming intrusions with the historical knowledge and predefined rules. Anomaly detection automatically constructs a normal behavior of the systems and stubbornly detects incoming intrusions by explicitly computing deviations. It can recognize novel attacks but may raise false alarms as well.

To overcome the limitation of traditional intrusion detection, Artificial Intelligence (AI) has been taken into acount for intelligent detection. AI-based schemes can automatically discover deep knowledge or patterns from the historical data and make wise judgments to predict network intrusions [6][7]. Though, there have been a few researches on combining IDS and AI, they are still incapable for universally precise detection and robustly considering the evolution and development of SD-IoT networks.

In this paper, we propose an advanced intrusion detection technology using AI algorithms based on Software Defined IoT architecture. Specifically we apply a combination of enhanced AI algorithms to perform feature selection and flow classification, which are two crucial steps in intrusion detection. In particular for feature selection, we take advantage of improved Bat algorithm by splitting the whole swarm into subgroups using K-means method so that each subgroup can learn within and among different populations more efficiently. Besides, Differential Evolution is also employed to increase the diversity of individuals. For flow classification, we optimize Random forest through updating the weight of each sample after building each tree iteratively and making the final decision by using weighted voting mechanism.

The remainder of the paper is organized as follows. Section II discusses the related research works. Section III introduces the AI-based two-stage intrusion detection empowered by Software Defined technology. Section IV describes the proposed BA algorithm for feature selection. Section V presents the improved RF algorithm for traffic classification. Section VI presents the performance evaluation results. Section VII summarizes the paper and points out the potential future work.

Ii Related work

As a critical enabling technology, SDN radically revolutionizes the way network operators will architect and coordinate, and meets the demands of IoT networks in terms of performance and reliability. It dynamically manages network configurations and provides flexible service provisioning mechanisms in a centralized control manner [8]. Accordingly, the combination of IoT networks and SDN has attracted tremendous research interests. In [9], an architecture with SDN based IoT framework coupled with network functionality virtualization (NFV) is introduced. It provides a general implementation by virtualizing the IoT gateway which makes it possible to be dynamic, scalable and elastic in the IoT networks. Another IoT architecture originated from SDN to overcome big data problem is proposed in [10]. By evaluating the usefulness of the sensed values in the lower layers (especially in gateway layer) instead of application layer, the number of packets being sent to the Internet is reduced, which overcomes the huge data volume problem of IoT.

The advancement of SDN has strengthened IoT security through supporting the supervision of network status and the collection of flows statistics. It also provides network-layer security services such as packet routing, identity authentication, and automated security management in a global view, which facilitates the detection and prevention of attacks [11]. However, SD-IoT networks still face severe security challengesa as new attacks quikly appear. Therefore, several previous studies have investigated the ability of SDN and introduces various solutions to improve IoT security. In [12], it proposes an identity-based authentication scheme for IoT based on SDN. The specific identity formats used by different communication protocols are mapped to a shared identity and a trusted certificate authority is implemented on the SDN controller. [13] also proposes a host-based intrusion detection and mitigation framework for IoT, in which network visibility and flexibility properties of SDN are exploited and modules of intrusion detection and mitigation are implemented at the SDN controller. Moreover, remote security management is provided by the third-party entities that offer ‘Security as a Service’. A flow-based security approach for IoT devices is proposed using an SDN gateway in [14]. It aims to mitigate Distributed Denial of Service (DDoS) attacks that violates services availability by monitoring traffic flows and anomalous behavior. These proposed schemes are applicable to solve specific types of security problems and can only perform well under certain scenarios.

Fig. 1: AI-based Two-Stage Intrusion Detection for Software Defined IoT Networks

As the focus of network security, intrusion detection has gained intensive attentions and wide investigations in recent years [15]. Most of them depend on pre-defined rules, which are still unable to recognize uprising new attacks intelligently. In terms of this issue, various machine learning algorithms have been adopted with flow-based classification in solving such problems [16]. Most often, there are two stages in this process: feature selection and flow classification. The former stage addresses high dimension data efficiently and decreases the number of features from a noisy dataset, which improves the learning efficiency and prediction accuracy of flow classification. Recently, a variety of Swarm Intelligence (SI) algorithms such as ant colony optimization (ACO) [17] and particle swarm optimization (PSO) [18] have been applied to select optimal features as well. The latter stage distinguishes network flows by marking whether it belongs to specific types of attacks or benign traffic. [19] proposes a network-based IDPS (Intrusion Detection and Prevention System) which performs C4.5 algorithm to build the decision tree for classifying the traffic. In [20], the authors propose a three-layer Recurrent Neural Network (RNN) which is capable of automatically finding the correlations between flow records and acting as a neural classifier for misuse detection. However current algorithms are still inadequate for effectively selecting optimal features and detecting new types of attacks with lower cost under different circumstances owing to their inherent limits. Therefore, we need to optimize existing algorithms for the two critical stages to detect network intrusions efficiently and adaptively.

Iii AI-based two stage intrusion detection

In this paper, we propose a two-stage intrusion detection using AI algorithms under SD-IoT Networks, see Figure 1. Taking advantage of SDN, it captures network packets and collects status information with centralized control. The Controllers partitions network packets into flows and delivers them to the upper layer. In this way, flow-based intelligent intrusion detection can be implemented using AI algorithms in two stages. It selects optimal flow features and detects network anomalies by classifying each flow into specific catogories. Afterwards, the Controllers manages resource arrangement and organizes specific actions for defending attacks according to the classification results.

We combine two artificial intelligent algorithms in the two main stage of intrusion detection. Swarm Intelligence (SI) algorithms have been widely adopted for global optimization through heuristic searching iteratively combined by classification which contributes to higher accuracy. As a promising novel SI algorithm, Bat Algorithm (BA) solves feature selection problems and behaves better than traditional SI algorithms with simple structure, fewer parameters and stronger robustness. It achieve outstanding performance owing to its flexibility, simplicity, and robustness. Therefore in the fist stage, we enhance BA to improve its ability for searching optimal features. As an ensemble method, Random forest (RF) can prevent overfitting and make final dicision using majority voting. It has been validated that it outperforms other algorithms in various situations in terms of prediction accuracy with tolerable time complexity, which gains much more popularities in classification. So in the second stage, we optimize RF algorithm to classify the network traffic into different classes of attacks with the selected features as input.

Iv Improved of Bat Algorithm for feature selection

Bat algorithm [21] is a metaheuristic algorithm for global optimization. It was inspired by the echolocation behaviour of microbats, with varying pulse rates of emission and loudness, which achieves great performance in But it still suffers from getting trapped into local minima easily since the position of each bat is only strongly influenced by the global best individual without communicating with its neighbors, which lacks diverse positions of the swarm. Also, the algorithm lacks of a mutation mechanism, which can not escape from local minima once the bat is adjacent to it. Thus, in order to address these problems, we enhance the algorithm as in the following two ways.

Fig. 2: Swarm division

Iv-a Swarm division

At each iteration during the process, we divide the whole swarm (the total number of individuals is ) into several subgroups (e.g. ) containing the same number of individuals using K-means algorithm referring to the distances between them, see Figure 2. For each subgroup (), we select the local minima whose position is . For all the subgroups, the global best bat is found and locates in . Each bat retains its previous best position as (). For the th individual, the flight is described by its location in space and velocity , where is the problem dimension. is the frequency of the bat. The bat updates its velocity at iteration in two typical situations as follows:

For the non-local-minima individuals in each subgroup


For the local-minima individuals of each subgroup


In each iteration, is the inertia weight for each bat and is the self-learning factor of each bat. , , , are the maximum and the minimum of and , respectively. is the the largest number of iterations. The two variables are calculated as below.


We introduce the linearly decreasing inertia weight varying as the iterations increase to improve the optimization ability of the algorithm. At early iterations, the bat with a larger inertia weight as well as a great speed is enabled with a strong global search ability. Later, a smaller inertia weight contributes to a more accurate local search, which accelerates the rate of convergence.

By employing the parameter , we slightly enhance the velocity update by taking advantage of the historical experience of the bat. The dynamically adjusting parameter indicates the influence of the previous best position of each bat on the current speed. The bat with larger retains its own position with a better exploration ability initially. Afterwards, the position of the bat tends to be greatly affected by the global best bat which elevates its exploitation ability.

Iv-B Binary Differential Mutation

After the update of a bat at each iteration, we further apply the mutation mechanism of Differential Evolution [22] to BA algorithm, which enhances the diversity of population and the ability of bats to jump out of local optimum. It disturbs the target value by using the differences of random selected individuals in the swarm. Since the original method can only solve continuous optimization problem, we put forward a Binary Differential Evolution algorithm using bit operation based on swarm division.

In the proposed new algorithm, the location and velocity of each bat in space are represented through binary strings. We leverage logical operations to implement the mutation process where ‘+’ represents ‘xor’ operation and represents ‘or’ operation. is a random number generated between 0 and 1. ‘’ means only if will the operations in the parentheses can be carried out. At iteration for the th bat, if , we conduct the mutation on the velocity of the current bat as following, else we do nothing:


Where ,, are bats randomly selected in the same subgroup of the target while , are bats selected in the distinctive subgroups.

is the mutation probability controlling whether bats perform the mutation operations or not. It varies adaptively with the number of iterations to obtain better search ability. is the largest number of iterations. is calculated as:


At early iterations, each bat can make full use of its ability to search within a larger space with a small mutation probability. As the number of iterations increases, it is more likely for bats to mutate, which breaks the constraints of local minima for avoiding any premature convergence.

is the shrinkage factor that is randomly generated between 0 and 1.The shrinkage factor regulates the variation of individuals by controlling the effect of the differential vectors. Larger values of is conductive to maintaining the diversity of the population while smaller values of enables bats with better local search ability. and are the maximum and the minimum of , which is calculated as:


The modification of BA will ensure the algorithm with better exploration ability at early stage and higher exploitation at later stage. The division of the swarm enables efficient learning among similar individuals in the vicinity within each subgroup meanwhile sharing optimal information among the subgroups through the local minima. In this way, each bat can step to the global best gradually without being trapped into the local minim in avoid of its dramatical influence. By applying the Binary Differential Mutation, the proposed approach also increases the diversity of swarm and prevents from trapping into local minima, which also accelerates the rate of convergence. The pseudocode of the new algorithm is given in Algorithm 1. The relevant steps which are not introduced in our paper remains the same as original BA algorithm.

0:    size of the swarm ; the number of subgroups ; maximum number of iterations ; fitness function ;
0:  best solution of the swarm ;
1:   initialization
2:  generate initial population (,,,,);
3:  compute fitness function
4:  select global best solution ;
5:  for  to  do
6:     update controlling factors cf.eq.(3)(4)(6)(7);
7:     divide the swarm using K-means;
8:     for  to  do
9:        select local minima at position ;
10:        for  to round(do
11:           retain previous best position
12:           update the frequency
13:           if  is non-local-minima bat then
14:              update the velocity cf.eq.(1);
15:           else
16:              update the velocity of local-minima cf.eq.(2);
17:           end if
18:           update the position
19:           if rand then
20:              local search for the best solution ;
21:           end if
22:           if rand AND  then
23:              accept new solution;
24:              update and ;
25:           end if
26:           if rand then
27:              apply Binary Differential Evolution cf.eq.(5);
28:           end if
29:        end for
30:     end for
31:     determine best fitness and update current ;
32:  end for
33:  return  
Algorithm 1 Improved Bat algorithm

V Improved of Random Forest for flow classification

As an ensemble machine learning algorithm, RF [23] has been widely used in processing high dimension dataset collaterally and preventing over-fitting to some extent. However, it degrades its performance in the minority classes of data when coping with an imbalanced dataset. Since the samples are randomly selected with replacement when building trees, the minority class with fewer samples is less likely to be selected and learned. In addition, we find that the cost of the minority class which is mis-classified is even higher than the majority ones which is urgent to improve its detection rate. Therefore, we need to optimize the algorithm as in the following ways.

V-a Weight initialization

Usually, the weight of each sample is initialized similarly as ( is the total number of the dataset) and the sum of them is 1. In this way, each sample is equal to be selected. In our mechanism, we initialize each training sample with a different weight according to the class that it belongs to. Corresponding to the original distribution of each class in our dataset, we initial the weights of five classes as 0.3, 0.15, 0.35, 0.05, 0.15. It reduces the weights of the majority class while boosting those of the minority class. In class j (j=1,2…5), the total number of samples is and the weight of each sample (i=1,2…) is calculated as:


Where is the weight of class j. is the total number of samples in class j. represents the probability of each sample being selected from samples. In this way, the minority samples can be selected and paid more attentions instead of over-selecting the abundant samples in the majority class. We randomly select samples using the roulette wheel selection scheme from the original dataset with replacement to train each tree.

V-B Weight update

After building each tree, we classsify the whole dataset and intend to update the weight of each sample according to the result it is classified. The weights of misclassified samples will be increased while samples correctly classified should be cut down. Consequently, we can stress more concern on the samples which are misclassified. Those samples with higher weights are more likely to be selected and learned in the next tree. We calculate the accuracy using the error rate of tree (=1,2,…M) under the whole dataset and update the weight of the sample as following. We use to boost the weights of mis-classified samples and for those correctly classified. M is the pre-defined number of trees for training. is the scaling factor so that the total sum of weights remains 1.


We use a cost-sensitive way of learning by altering the weight of each sample. For the purpose of training the misclassified samples in distinctive classes specifically, we update the weight of each sample to various extent by controlling the factor in the Eq. (11) considering four situations in Table I [24]. is calculated as:

Different classes
correctly classified
The majority classes
The minority classes
TABLE I: calculated for different classes

In Table I, m and n are the sum of weights in the majority and minority classes. From the table we can see that, for the minority classes, we obviously increase the weights of misclassified samples, but slightly decrease those which are classified into the right class. For the majority classes, it is just the opposite. By this method, the minority samples can be selected and trained consistently and iteratively without degrading their weights remarkably when building trees. The weight update operation can also prevent samples from over-training repetitively.

V-C Weighted voting

Now that the classification ability of each tree varies in different classes, the traditional method using the majority votes of all the trees for final result could not be used anymore. We introduce the weighted voting mechanism to the ensemble trees. Specifically, we compute the Accuracy Matrix of the classification accuracy of each tree in each class as in Table II.

Tree 1
Tree 2
Tree m
Tree M
TABLE II: Accuracy Matrix

The final result is computed according to Eq.(13). is the ensemble result and is the classification judgement of each tree . The trees specializing in classifying different classes of samples can maximum its advantage in deciding the final result with higher weights. The process is illustrated in Figure 3. The colorful histograms represent accuracy of each tree in each class which is of different length. The figure shows classification of sample in class 2 as a example.

Fig. 3: Weighted voting

The modification of RF helps to strike a balance between over-learning in the majority class and directing more emphasis on the minority class, which enhances its performance in imbalanced dataset. Furthermore, the cost-sensitive learning of mis-classified samples also contributes to the overall accuracy. By using the weighted voting mechanism, various trees with distinguished abilities in classification can be strongly combined. The pseudocode of the algorithm is given in Algorithm 2.

0:    the number of training data the number of trees samples selected for traning each tree the number of samples of
0:  the ensemble classification result
1:   initialization
2:  initialize weights of different classes ;
3:  compute weight of each sample in each class cf.eq.(8);
4:  for =1 to  do
5:     training tree using sampling data ;
6:     calculate the error rate and accuracy cf.eq.(9);
7:     for =1 to  do
8:        update the weight of each sample cf.eq.(10)(11)(12) and Table I;
9:     end for
10:  end for
11:  determine the ensemble result cf.eq.(13);
12:  return  
Algorithm 2 Improved Random forest

Vi Evaluation result

In this section, we conduct several numerical experiments to evaluate the proposed intrusion detection mechanisms.

Vi-a Dataset and evaluation metrics

As the KDD Cup 1999 dataset [25] has been widely used to evaluate various intrusion detection approaches, we perform a five-class flow classification using a subset of it after down-sampling in this paper. The distribution of both training and testing data marked by their attack type is summarized in Table III.

Generally, the performance of an intrusion detection system is evaluated in the light of precision (P), recall (R), F-score (F), accuracy (AC), and false alarm rate (FA). We desire a system with higher detection rate and lower false rate. Another comparative metric is defined to measure the cost damage of misclassification for different attacks per sample. The relevant details of the dataset and evaluating metrics are introduced in our previous work [26].

Class Training dataset Testing dataset
No. of samples Percentage No. of samples Percentage
Normal 17129 29.99% 12183 32.52%
Probe 3107 5.44% 1880 5.02%
DoS 35700 62.51% 21705 57.94%
U2R 52 0.09% 228 0.61%
R2L 1126 1.97% 1468 3.92%
TABLE III: Distribution of data used in our evaluation

Vi-B Performance Analysis

We intend to evaluate the proposed mechanisms from three aspects. Firstly, we assess the optimality and convergence of the proposed BA algorithm for feature selection. Secondly, we estimate the detection ability of the enhanced RF algorithm for flow classification on the overall dataset as well as in different classes of flows respectively. Finally, we implement the combination of the above-mentioned algorithms in the proposed two-stage intrusion detection to make comparisons with the existing solutions.

Vi-B1 Evaluations on the proposed Bat algorithm

There have been several improvements of BA to overcome its inherent shortcomings [27][28]. We carry out a comparison with these methods as well as several typical Swarm Intelligence algorithms for feature selection together with RF for classification. The results are shown in Table IV in terms of classification accuracy and false alarm rate. As noticed from the table, the proposed algorithm selects the features contributing more to differentiate attack traffic, which gives rise to higher accuracy and lower false rate. It is proved that the proposed algorithm for optimal feature selection performs well, which is conductive to achieve better performance in classification.

Different algorithms
Accuracy ()
FPR ()
ACO 94.25 4.78
PSO 94.52 3.99
BA 94.93 3.68
Reference[27] 95.35 2.74
Reference[28] 95.64 2.06
Our proposed algorthm 96.03 1.18
TABLE IV: Performance comparision using different algorithms
Fig. 4: Convergence comparision using different SI algorithms

In Figure 4, we verify the advantage of the proposed algorithm in finding a better subset of features with higher fitness within less iterations. It can be noticed that our algorithm converges faster than the others at about iteration 40 with a steeper slope, which lowers the time complexity. Furthermore, it obtains a higher fitness value after convergence and remains unchanged above the other curves. It is validated that the improvement of swarm division and the mutation mechanism help to escape from trapping into local minima and search for better solution. The linear time-varying parameters also strengthen individuals with dynamic searching ability to adjust different phases of iterations.

Fig. 5: Performance comparision using different numbers of individuals with various numbers of iterations
Fig. 6: Performance comparision using different methods of swarm division

Since the size of the swarm and the number of iterations are two critical parameters in process of solving optimization problems, we evaluate their influence with various values in Figure 5. It is illustrated that the accuracy of the algorithm enhances as the number of iterations increases with constant number of individuals in the swarm. It can be deduced that to some extent, the more iterations the swarm goes through, the better they evolves in finding the optimal solutions. Also, the algorithm will converge within restricted number of iterations in finding the approximate optimal solution. At the same iteration, we can see that a larger swarm with more individuals performs better than those smaller ones. It is because that a larger swarm with better diversity of population can communicate with each other more interactively without gathering close to the local minima, which points to a greater searching ability in larger area.

One of the most significant improvement of the proposed BA algorithm is the division of swarm into different populations. There are different kinds of methods to cluster individuals into populations. Some just randomly assign individuals into different clusters while the others use clustering algorithms in the light of various metrics. Therefore, we examine the influence of different clustering algorithms on the final BA algorithm and give the convergence results in Figure 6. It is apparent that using K-means [29] clustering algorithm achieves better performance with higher fitness and faster rate of convergence, since the adjacent individuals are clustered into same subgroup based on distances. On one hand, the whole swarm moves towards the current best position by learning within each subgroup as well as sharing knowledge between populations. On the other hand, each individual only learns partially from the local best in its subgroup and moves slightly in case of being badly affected by the local minima.

Decision Tree 96.59 92.84 95.42 4.79 0.2271
Adaboost 97.42 93.21 95.68 3.98 0.2032
RF 98.09 93.84 95.92 3.78 0.1738
SVM 98.74 94.36 96.55 2.75 0.1688
GBDT 99.17 94.84 96.72 1.68 0.1608
Proposed algorithm 99.51 95.17 97.29 0.98 0.1302
TABLE V: Performance comparision using different algorithms
Algorithms Normal Probe DoS U2R R2L
Decision Tree 95.63 95.47 99.02 6.25 8.27
Adaboost 96.17 96.62 99.85 20.34 19.68
RF 96.28 95.85 100 12.19 14.51
SVM 97.72 97.55 99.78 12.49 12.72
GBDT 98.54 98.79 100 27.65 21.45
Proposed algorithm 99.02 99.63 100 57.46 23.84
TABLE VI: Detection Performance comparision in different classes

Vi-B2 Evaluations on the proposed Random forest algorithm

In Table V, we verify the advantage of improved RF algorithm through the performance of detection in contrast with the ordinary RF algorithm and other machine learning algorithms. All the classification algorithms use features selected by the proposed BA as input. By altering the weights of samples, each tree in the forest can be trained more effectively through picking the samples which are more frequent to be mis-classified. By applying the weighted voting mechanism, the weight of each tree in a specific class is directly affected by its performance, which contributes distinctively to determining the final result. As we can see, it is obvious that the proposed method generates a better performance in every metric.

Fig. 7: Overhead produced by algorithms with different numbers of flows

We observe the detection performance of the improved algorithm in five classes individually comparing with the above-mentioned algorithms in Table VI. In real network scenarios, some intrusions generate more connections than others which leads to an extreme unbalanced dataset for classification. The detection rate varies remarkably in different classes. Usually, those intrusions with fewer flows generate higher costs when mis-classified, so it is urgent to improve their performance. Thus, we solve the problem through altering the weights of samples accordingly to stress more attentions on those in the minority classes while avoiding over-fitting for the majority ones. It can be observed in Table VI, it is apparent that the proposed RF algorithm improves the detection accuracy of minority intrusions dramatically while slightly increases the detection rate of the majority ones. The result indicates that the algorithm adaptively balances its training for samples in different classes and decides the final result according to its learning ability, which accelerates the performance in each class.

Combination of
No. of
Proposed BA Proposed RF 32 96.42 0.98
RF GBDT 22 95.49 1.84
Tree SVM 10 94.41 2.64
Fisher RF 10 94.97 2.35
ReliefF Adaboost 8 95.32 1.92
IG Decision Tree 8 94.07 4.35
CFS LR 18 93.22 6.75
TABLE VII: Performance of different combinations of algorithms
Proposed system (and related works)
Acurracy ()
FPR ()
Our system 96.42 0.98
Reference [26] 95.21 1.57
Reference [16] 94.56 1.83
Reference [30] 93.36 2.07
Reference [31] 92.42 2.82
Reference [19] 90.27 3.45
TABLE VIII: Performance comparision of different systems
Fig. 8: Processing times of different systems

We measure the overhead that the proposed algorithm causes using the above-mentioned cost metrics. We compare the proposed algorithm with Random forest and the other machine learning algorithms in Figure 7. The metric indicates that the more mis-classified flows there are when making classification, the more overhead the algorithm generates for misclassification. As the number of flows grows, all of them become more well-training to make classification with less faults. It can be seen that our algorithm produces less overhead than its comparisons.

Vi-B3 Evaluations on the performance of combined algorithms in the two stages

We evaluate the combination of our improved algorithms comparing with several groups of traditional feature selection and machine learning algorithms in Table VII. Since we know that the selection of algorithms for feature selection and traffic classification possesses a mutual influence on each other, we care more about the performance of the combination of them. As we can see, it is obvious that the proposed methods generate a better performance among all the combination alternatives in every metric.

There have been several intelligent architectures proposed to detect and prevent network intrusions under SDN environment [16][19][26][30][31]. We conduct a comparison with the previous results in terms of classification accuracy and error rate as described in Table VIII. The processing time of each approach using a portion of flows in the dataset is also illustrated in Figure 8 to evaluate the efficiency. It can be noticed that the proposed two-stage intrusion detection improves the classification accuracy with tolerable time consumption.

Vii Conclusion

This paper presents an AI-based two-stage intrusion detection implemented in Software Defined IoT Networks. It leverages SDN contributing to status monitoring as well as traffic capturing under a global view. It integrates and coordinates two stages of IDS including feature selection and flow classification to detect novel intrusions with a self-learning ability. We improve Bat algorithm to select optimal features and design network flow classification methods by enhancing Random forest algorithm. Evaluation results validate the optimality of our proposed algorithms in achieving higher accuracy and lower overhead. The experiments also reveal that the system improves its detection ability without much time consumption compared with existing solutions.

In the future, we will implement this approach in a real network to traffic and evaluate the performance.


  • [1] A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari, and M. Ayyash, “Internet of things: A survey on enabling technologies, protocols, and applications,” IEEE Communications Surveys and Tutorials, vol. 17, no. 4, pp. 2347–2376, 2015.
  • [2] M. S. Mahdavinejad, M. Rezvan, M. Barekatain, P. Adibi, P. Barnaghi, and A. P. Sheth, “Machine learning for Internet of Things data analysis: A survey,” Digital Communications and Networks, 2017.
  • [3] D. Wu, L. Zhou, and Y. Cai, “Improving network management with software defined networking,” IEEE Communications Magazine, vol. 51, no. 2, pp. 114–119, 2013.
  • [4] M. B. Yassein, S. Aljawarneh, M. Al-Rousan, W. Mardini, and W. Al-Rashdan, “Combined Software-Defined Network (SDN) and Iinternet of Things (IoT),” in International Conference on Electrical and Computing Technologies and Applications, 2018.
  • [5] T. A. Tang, L. Mhamdi, D. Mclernon, S. A. R. Zaidi, and M. Ghogho, “Deep learning approach for network intrusion detection in software defined networking,” in International Conference on Wireless Networks and Mobile Communications, 2018.
  • [6] A. L. Buczak and E. Guven, “A survey of data mining and machine learning methods for cyber security intrusion detection,” IEEE Communications Surveys and Tutorials, vol. 18, no. 2, pp. 1153–1176, 2017.
  • [7] R. Li, Z. Zhao, X. Zhou, G. Ding, Y. Chen, Z. Wang, and H. Zhang, “Intelligent 5G: When cellular networks meet artificial intelligence,” IEEE Wireless Communications, vol. PP, no. 99, pp. 2–10, 2017.
  • [8] Y. Zhang, M. Chen, and Lai, “Cloudified and software defined 5G networks: Architecture, solutions, and emerging applications,” Mobile Networks and Applications, vol. 21, no. 5, pp. 727–728, 2016.
  • [9] M. Ojo, D. Adami, and S. Giordano, “A SDN-IoT architecture with NFV implementation,” in GLOBECOM Workshops, 2017, pp. 1–6.
  • [10] M. T. Kakiz, E. Ozturk, and T. Cavdar, “A novel SDN-based IoT architecture for big data,” in International Artificial Intelligence and Data Processing Symposium, 2006, pp. 1–5.
  • [11] D. B. Rawat and S. R. Reddy, “Software defined networking architecture, security and energy efficiency: A survey,” IEEE Communications Surveys and Tutorials, vol. 19, no. 1, pp. 325–346, 2017.
  • [12] O. Salman, S. Abdallah, I. H. Elhajj, A. Chehab, and A. Kayssi, “Identity-based authentication scheme for the internet of things,” Computers and Communication, pp. 1109–1111, 2016.
  • [13] M. Nobakht, V. Sivaraman, and R. Boreli, “A host-based intrusion detection and mitigation framework for smart home IoT using openflow,” in International Conference on Availability, Reliability and Security, 2016, pp. 147–156.
  • [14] P. Bull, R. Austin, M. Sharma, and R. Watson, “Flow based security for iot devices using an SDN gateway,” in IEEE International Conference on Future Internet of Things and Cloud, 2016, pp. 157–163.
  • [15] N. Farah, M. Avishek, F. Muhammad, A. Rahman, M. Rafni, and D. Md, “Application of machine learning approaches in intrusion detection system: A survey,” International Journal of Advanced Research in Artificial Intelligence, vol. 4, no. 3, 2015.
  • [16] A. S. D. Silva, J. A. Wickboldt, L. Z. Granville, and A. Schaeffer-Filho, “Atlantic: A framework for anomaly traffic detection, classification, and mitigation in SDN,” 2016, pp. 27–35.
  • [17] T. Mehmood and H. B. M. Rais, “SVM for network anomaly detection using ACO feature subset,” in International Symposium on Mathematical Sciences and Computing Research, 2016, pp. 121–126.
  • [18] N. Cleetus and K. A. Dhanya, “Multi-objective functions in particle swarm optimization for intrusion detection,” in International Conference on Advances in Computing, Communications and Informatics, 2014, pp. 387–392.
  • [19] A. Le, P. Dinh, H. Le, and N. C. Tran, “Flexible network-based intrusion detection and prevention system on software-defined networks,” in International Conference on Advanced Computing and Applications, 2016, pp. 106–111.
  • [20] J. Kim, J. Kim, H. L. T. Thu, and H. Kim, “Long short term memory recurrent neural network classifier for intrusion detection,” in International Conference on Platform Technology and Service, 2016, pp. 1–5.
  • [21] A. C. Enache and V. Sgârciu, “A feature selection approach implemented with the binary bat algorithm applied for intrusion detection,” in International Conference on Telecommunications and Signal Processing, 2015, pp. 11–15.
  • [22] J. Wang, J. Liao, Y. Zhou, and Y. Cai, “Differential evolution enhanced with multiobjective sorting-based mutation operators,” IEEE Transactions on Cybernetics, vol. 44, no. 12, pp. 2792–2805, 2017.
  • [23] K. Singh, S. C. Guntuku, A. Thakur, and C. Hota, “Big data analytics framework for peer-to-peer botnet detection using random forest,” Information Sciences, vol. 278, no. 19, pp. 488–497, 2014.
  • [24] B. Sun, J. Luo, S. Shu, and E. Xue, “Improve the performance of random forests by introducing weight update technique,” in International Conference on Intelligent Human-Machine Systems and Cybernetics, 2010, pp. 34–37.
  • [25] H. Saxena and V. Richariya, “Intrusion detection in KDD99 dataset using SVM-PSO and feature reduction with information gain,” International Journal of Computer Applications, vol. 98, no. 6, pp. 25–29, 2014.
  • [26] J. Li, Z. Zhao, and R. Li, “A machine learning based intrusion detection system for software defined 5G network,” IET Networks, 2017.
  • [27] A. C. Enache and V. Sgârciu, “Anomaly intrusions detection based on support vector machines with an improved bat algorithm,” in International Conference on Control Systems and Computer Science, 2015, pp. 856–861.
  • [28] A. C. Enache and V. Sgrciu, “An improved bat algorithm driven by support vector machines for intrusion detection,” in International Joint Conference, 2015, pp. 41–51.
  • [29] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, “An efficient k-means clustering algorithm: analysis and implementation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
  • [30] X. Ye, X. Chen, H. Wang, X. Zeng, G. Shao, X. Yin, and C. Xu, “An anomalous behavior detection model in cloud computing,” Tsinghua Science and Technology, vol. 21, no. 3, pp. 322–332, 2016.
  • [31] P. Wang, K. M. Chao, H. C. Lin, W. H. Lin, and C. C. Lo, “An efficient flow control approach for sdn-based network threat detection and migration using support vector machine,” in IEEE International Conference on E-Business Engineering, vol. 24, no. 7, 2017, pp. 56–63.
Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
Add comment
Loading ...
This is a comment super asjknd jkasnjk adsnkj
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test description