# Local structure entropy of complex networks

Qi Zhang, Meizhu Li, Yuxian Du: School of Computer and Information Science, Southwest University, Chongqing 400715, China

Yong Deng: School of Computer and Information Science, Southwest University, Chongqing 400715, China; School of Automation, Northwestern Polytechnical University, Xian, Shaanxi 710072, China
July 6, 2019
###### Abstract

Identifying influential nodes in complex networks is of theoretical and practical significance, and many methods have been proposed for this purpose. In this paper, a local structure entropy, based on the degree centrality and statistical mechanics, is proposed to identify the influential nodes in a complex network. In the definition of the local structure entropy, each node has a local network, and the local structure entropy of each node is equal to the structure entropy of its local network. The main idea is to use the influence of the local network to replace the node’s influence on the whole network. The influential nodes identified by the local structure entropy are the intermediate nodes of the network, that is, the nodes that connect nodes with a large degree. We use the susceptible-infective (SI) model to evaluate the performance of the influential nodes identified by the local structure entropy. In the SI model, these nodes are used as the sources of infection. According to the SI model, the larger the percentage of infective nodes in the network, the more important the source node is to the whole network. Simulations on four real networks show that the proposed method is effective and reasonable for identifying the influential nodes in complex networks.

Complex networks, Structure entropy, Local structure entropy, Influential nodes, SI model
###### pacs:
89.20.-a, 05.10.-a, 02.50.-r, 02.10.-v
preprint: Physical Review E

## I Introduction

A complex network is a system composed of many interacting parts Kim and Wilhelm (2008); Newman (2003). Many real systems can be modeled as complex networks. Identifying influential nodes in complex networks is of theoretical and practical significance. Many methods have been proposed to identify influential nodes, such as the degree centrality, the betweenness centrality and so on Chen et al. (2012); Freeman (1977); Anand and Bianconi (2009).

The degree centrality method is very simple but cannot capture global characteristics. The betweenness centrality method is useful but is impractical to compute on large-scale complex networks. In order to find a more reasonable method for identifying influential nodes in large-scale complex networks, we propose a new method that merges statistical mechanics with the degree centrality of complex networks, named the local structure entropy of complex networks.

The local structure entropy is defined based on the Shannon entropy and the degree centrality. The results of our tests show that the proposed method is useful and efficient.

The rest of this paper is organised as follows. Section II introduces some preliminaries of this work, such as the definitions of the degree centrality, the betweenness centrality, the Shannon entropy and the SI model. Section III discusses some problems in the existing methods. In Section IV, a new method for identifying influential nodes in complex networks is proposed. The application of the proposed method is illustrated in Section V. The conclusion is given in Section VI.

## II Preliminaries

In this section, we introduce some core concepts which will be used in this paper.

### II.1 Degree centrality

The degree of a node in a network is the number of edges connected to the node. Most properties of complex networks, such as the clustering coefficient and the community structure, are based on the degree distribution. In the network, $\mathrm{degree}(i)$ represents the degree of the $i$th vertex Newman (2003). The details of the degree centrality are shown in Fig. 1.
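As a minimal sketch of this definition, the degree of every node can be counted from an edge list. The function name `degrees` and the toy edge list are illustrative assumptions and do not come from the paper; the network is assumed to be undirected and simple.

```python
from collections import defaultdict


def degrees(edges):
    """Return {node: degree} for an undirected simple graph given as (u, v) pairs."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # self-loops are ignored in this sketch
        neighbours[u].add(v)
        neighbours[v].add(u)
    return {node: len(adjacent) for node, adjacent in neighbours.items()}


# Hypothetical toy network (not from the paper): a triangle with one pendant node.
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
toy_degrees = degrees(toy_edges)
```

Storing neighbours in sets means repeated edges are counted once, which matches the simple-graph assumption stated above.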

### II.2 Betweenness centrality

The betweenness centrality is an important index for describing the importance of nodes. It is defined based on the shortest paths of the network Barthelemy (2004).

The betweenness centrality of a node in a complex network is defined as follows Barthelemy (2004):

$$\mathrm{bet}(i) = \frac{\upsilon(i)}{\sum \sigma_{st}}, \quad s \neq i \neq t \tag{1}$$

In Eq. (1), $\sigma_{st}$ is the number of shortest paths from vertex $s$ to vertex $t$, and $\upsilon(i)$ is the number of shortest paths that pass through vertex $i$ Barthelemy (2004). The details of the betweenness centrality are shown in Fig. 2. However, the betweenness centrality is clearly hard to calculate when the network is large.
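The cost of this measure can be made concrete with a small sketch. It assumes one particular reading of Eq. (1): the number of shortest paths passing through node $i$ divided by the number of all shortest paths between pairs that exclude $i$, over ordered pairs of an undirected, unweighted network. The helper names are hypothetical, and other normalisations of the betweenness centrality exist in the literature.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency sets from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path_counts(adj, source):
    """BFS from source: hop distance and number of shortest paths to each reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness_ratio(adj):
    """Assumed reading of Eq. (1): paths through i over all paths with s != i != t."""
    info = {s: shortest_path_counts(adj, s) for s in adj}
    bet = {}
    for i in adj:
        dist_i, sigma_i = info[i]
        through = 0
        total = 0
        for s in adj:
            if s == i:
                continue
            dist_s, sigma_s = info[s]
            for t in adj:
                if t == i or t == s or t not in dist_s:
                    continue
                total += sigma_s[t]
                # i lies on a shortest s-t path exactly when the distances add up
                if i in dist_s and t in dist_i and dist_s[i] + dist_i[t] == dist_s[t]:
                    through += sigma_s[i] * sigma_i[t]
        bet[i] = through / total if total else 0.0
    return bet


# Hypothetical toy network: a path 1 - 2 - 3, where node 2 carries every path.
path_bet = betweenness_ratio(build_adjacency([(1, 2), (2, 3)]))
```

The three nested loops over nodes make this sketch cubic in the number of nodes, on top of one breadth-first search per node, which illustrates why the measure becomes expensive on large networks.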

### II.3 Shannon entropy

The Shannon entropy, also known as the information entropy Shannon (2001), is the basis of information theory. In information theory, the Shannon entropy is used to measure the unpredictability of information content. In other words, it measures the uncertainty of a system that is described by probability theory.

The classical form of the Shannon entropy is defined as follows Shannon (2001).

$$E_{\mathrm{Shannon}} = -\sum_{i=1}^{n} p_i \log(p_i) \tag{2}$$
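A minimal sketch of Eq. (2) follows, assuming the usual negative-sum convention so that the entropy is non-negative. The logarithm base is not stated in the text, so the natural logarithm default and the `base` parameter are assumptions of this sketch.

```python
import math


def shannon_entropy(probabilities, base=math.e):
    """Entropy of a discrete distribution; zero-probability terms contribute nothing."""
    total = 0.0
    for p in probabilities:
        if p > 0:
            total -= p * math.log(p, base)
    return total


# A uniform distribution over four outcomes, measured in bits (base 2).
uniform_bits = shannon_entropy([0.25, 0.25, 0.25, 0.25], base=2)
# A certain outcome carries no uncertainty.
certain = shannon_entropy([1.0])
```

A uniform distribution maximises the entropy for a given number of outcomes, while a distribution concentrated on one outcome gives zero.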

### II.4 Susceptible and infective model in the network

In order to evaluate the influence of the nodes, we use the SI (susceptible and infective) model on the network. The infection process can be divided into three steps:

Step 1: Choose one node as the source of infection. Set the number of times that an infective node may try to infect other normal nodes.

Step 2: Find the neighbour nodes of the infection source node and infect them at random with a given probability. Repeat this process for the set number of times.

Step 3: After the set number of infection times, count the infective nodes in the network and calculate the probability of infection, that is, the proportion of infective nodes.

The probability of infection represents the influence of the source node in the network. The number of infection times is decided by the scale of the network: the larger the target network, the larger the number of infection times.
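The three steps above can be sketched as a short simulation. Several details are assumptions of this sketch rather than statements from the text: updates are synchronous, every infective node gets one attempt per susceptible neighbour in each step, and the parameter names `steps` and `infect_prob` are hypothetical.

```python
import random


def si_spread(adj, source, steps, infect_prob, rng=None):
    """Run an SI process and return the infected proportion after each step."""
    rng = rng or random.Random()
    infected = {source}
    history = []
    for _ in range(steps):
        newly_infected = set()
        for u in infected:
            for v in adj[u]:
                # each infective node gets one chance per susceptible neighbour
                if v not in infected and rng.random() < infect_prob:
                    newly_infected.add(v)
        infected |= newly_infected
        history.append(len(infected) / len(adj))
    return history


# Hypothetical toy network: a path 1 - 2 - 3 - 4 given as adjacency sets.
path_adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
# With probability 1 the infection advances exactly one hop per step.
certain_spread = si_spread(path_adj, source=1, steps=3, infect_prob=1.0)
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing source nodes chosen by different measures.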

## III Some problems in the existing methods

Many methods can be used to identify the influential nodes in complex networks, such as the degree centrality, the betweenness centrality and so on de Arruda et al. (2014); Huang et al. (2011); Xu et al. (2010). However, calculating the betweenness centrality is inefficient in large-scale networks, and the degree centrality lacks the information needed to describe the structural properties of special nodes in a large-scale network. The details are shown in Fig. 3.

Existing research shows that the degree centrality is an effective method for identifying the influential nodes in a complex network. However, it is incomplete: the degree centrality of a node considers only the direct connections to that node. Many nodes in complex networks have the same structural property as node 7 in Fig. 3(d) Park et al. (2004); Masuda et al. (2009). Such nodes have a small degree but a large influence in the local network. In order to describe the structure of a complex network more effectively and more conveniently, we propose a new method based on the degree centrality and the Shannon entropy.

## IV Local structure entropy of the complex networks

Because the degree centrality is effective and convenient Gao et al. (2013, 2014); Wei et al. (2013); Du et al. (2014); Li et al. (2014); Ji et al. (2014), the new method proposed in this paper is also based on the degree centrality. However, the new method also considers the influence of the neighbour nodes of the target node.

The main idea of the new method is to include the influence of the target node’s neighbours. Therefore, we establish a local network around each node. The influence of the node on the whole network is replaced by the influence of its local network on the whole network.

Much research on complex networks is based on statistical mechanics Albert and Barabási (2002); Tsallis (1988, 2009), such as the information dimension Zhang et al. (2015); Daijun et al. (2014) and the structure entropy Anand and Bianconi (2009). This research shows that statistical mechanics is a useful tool for describing the structural properties of complex networks. The structure entropy of a network describes its structural complexity: the more complex the network, the larger the value of the structure entropy. Based on this definition of the structure entropy, the influence of each node can be described by the structure entropy of its local network.

Based on the definitions of the local network, the degree centrality and the Shannon entropy, a local structure entropy of the complex network is proposed in this paper to identify the influential nodes in the complex network.

The definition of the local structure entropy can be separated into three steps:

Step 1 Creating a local network: Choose one node in the network as the central node of the local network. The neighbour nodes of the central node are contained in the local network. In other words, the local network of each node is the part of the complex network that contains the target node and its neighbour nodes.

Step 2 Calculating the unit of the local structure entropy: Calculate the degree of each node in the local network and the total degree of the local network. The unit of the local structure entropy is denoted $p_{ij}$ and is defined in Eq. (4).

Step 3 Calculating the local structure entropy of each node: The definition of the local structure entropy for each node is given in Eq. (3).

$$LE_i = -\sum_{j=1}^{n} p_{ij} \log p_{ij} \tag{3}$$

Here $LE_i$ represents the local structure entropy of the $i$th node in the complex network, $n$ is the total number of nodes in the local network, and $p_{ij}$ represents the share of the degree held by the $j$th node in the local network. The definition of $p_{ij}$ is given in Eq. (4).

$$p_{ij} = \frac{\mathrm{degree}(j)}{\sum_{j=1}^{n} \mathrm{degree}(j)} \tag{4}$$

The details of the process of calculating the local structure entropy are shown in Fig. 4.
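The three steps and Eqs. (3) and (4) can be sketched as follows. Two points are assumptions of this sketch, since the text does not settle them: the degree of each node in the local network is taken from the whole network, and the natural logarithm is used with a leading minus sign so that the entropy is non-negative. The function names and the toy network are hypothetical.

```python
import math


def build_adjacency(edges):
    """Undirected adjacency sets from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_structure_entropy(adj):
    """Return {node: LE} following Steps 1 to 3 under the stated assumptions."""
    degree = {node: len(nbrs - {node}) for node, nbrs in adj.items()}
    entropy = {}
    for i, nbrs in adj.items():
        # Step 1: the local network is the central node plus its neighbours.
        local_nodes = set(nbrs) | {i}
        # Step 2: share of the total local degree held by each node, Eq. (4).
        total = sum(degree[j] for j in local_nodes)
        le = 0.0
        if total > 0:
            for j in local_nodes:
                p = degree[j] / total
                if p > 0:
                    # Step 3: accumulate -p * log(p), Eq. (3).
                    le -= p * math.log(p)
        entropy[i] = le
    return entropy


# Hypothetical toy network: a star with centre 0 and leaves 1, 2, 3.
star_le = local_structure_entropy(build_adjacency([(0, 1), (0, 2), (0, 3)]))
```

In the star example, the centre has a local network of four nodes with degree shares 1/2, 1/6, 1/6 and 1/6, while each leaf has a two-node local network with shares 1/4 and 3/4, so the centre receives the larger entropy. Only each node's neighbourhood is visited, so the cost grows with the sum of the degrees rather than with all pairs of nodes.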

In order to show that the local structure entropy is a reasonable measure of the influence of nodes, an example network with 21 nodes and 33 edges is used to identify its most influential nodes. The degree, the betweenness and the local structure entropy of each node in the example network are shown in Table 1.

The six most influential nodes in the example network are identified by the existing methods and by the proposed method. The details are shown in Table 2.

Among the six most influential nodes in the example network identified by the local structure entropy, the top three are the same as the nodes identified by the degree centrality, and the other three are the same as the nodes identified by the betweenness centrality. This means that, in the example network, the local structure entropy captures some properties of the betweenness centrality. To show more details of the nodes identified by the local structure entropy, we have marked the most influential nodes in Fig. 5.

From Fig. 5, we can see that the nodes identified by the local structure entropy lie between the nodes that have a large degree. In other words, they are nodes that have a small degree but a series of important neighbours.
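Comparing the top-ranked nodes of different measures, as done above, can be sketched with two small helpers. The scores below are hypothetical placeholders and are not the values of Table 1 or Table 2; breaking ties by node label is also an assumption of this sketch.

```python
def top_k(scores, k):
    """Return the k nodes with the largest score (ties broken by node label)."""
    ranked = sorted(scores, key=lambda node: (-scores[node], str(node)))
    return ranked[:k]


def overlap(nodes_a, nodes_b):
    """Nodes that appear in both rankings, in sorted order."""
    return sorted(set(nodes_a) & set(nodes_b))


# Hypothetical scores for a five-node network (illustrative only).
degree_scores = {1: 4, 2: 3, 3: 2, 4: 2, 5: 1}
entropy_scores = {1: 1.2, 2: 0.9, 3: 1.3, 4: 0.4, 5: 1.1}

top_by_degree = top_k(degree_scores, 3)
top_by_entropy = top_k(entropy_scores, 3)
shared = overlap(top_by_degree, top_by_entropy)
```

The size of `shared` relative to `k` gives a quick indication of how far two measures agree on the most influential nodes.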

## V Application

In this section, the local structure entropy is used to identify the influence of the nodes in several real networks: Zachary’s Karate Club network uci (2014), the US-airport network net (2014a), the Email network net (2014a), the Germany highway network net (2014b) and the protein-protein interaction network in budding yeast net (2014a).

### V.1 The property of the local structure entropy

First, we use Zachary’s Karate Club network uci (2014) to show the properties of the proposed method. The results are shown in Table 3. The top six important nodes are identified by the betweenness centrality, the degree centrality and the local structure entropy.

The results of our test on Zachary’s Karate Club network uci (2014) show that the local structure entropy can identify the nodes that connect nodes with a large degree.

### V.2 The difference between the local structure entropy and the existing methods in identifying the influential nodes in the US-airport network net (2014a)

In order to demonstrate the rationality of the proposed method, we run the SI model on the US-airport network net (2014a), using the most important nodes identified by the different methods as infection sources. The results are shown in Table 4.

The US-airport network net (2014a) has 332 nodes, so each step in the process contains one infection time. That is, each infective node has one chance to infect its neighbour nodes in a step. The following tables give the proportion of infective nodes in the network at each step, and the figures show the process directly. Because the plots at step 3 and step 5 are not clear, we have created two subfigures in every figure.

Fig. 7 and Table 5 show the proportion of infective nodes in the network at each step. The infection source nodes shown in the figure and table are identified by the betweenness centrality. The results show that, in the first step, each node identified by the betweenness centrality has a small percentage of infection. This means that most of these nodes have a small degree, so they cannot infect many nodes in the first step. In steps 2 to 5, most of the nodes identified by the betweenness centrality also have a small percentage of infection. This means that, according to the SI model, these nodes do not seem very important in influencing the whole network.

In Fig. 8 and Table 6, the infection source nodes are identified by the degree centrality. In the first step, each node has a large percentage of infection. This means that the nodes identified by the degree centrality have a large degree, and most of them can infect many nodes in the network.

In Fig. 9 and Table 7, the infection source nodes are identified by the local structure entropy. In the first step, most of them infect a small percentage of the nodes in the network, which means that they have a small degree. However, as the infection process continues, the percentage of infective nodes in the network grows and most of the nodes in the network can be infected. This means that, according to the SI model, these nodes have a larger influence in the US-airport network net (2014a).

### V.3 The influential nodes in the real networks

In this subsection, the influential nodes in the Email network net (2014a), the Germany highway network net (2014b) and the protein-protein interaction network in budding yeast net (2014a) are identified by the local structure entropy. The results are as follows.

The infection process in the Email network net (2014a) is shown in Table 9 and Fig. 10. The infection source nodes are identified by the local structure entropy. The results show that each influential node identified by the local structure entropy can infect most of the nodes in the Email network net (2014a) after 25 infection times.

The infection processes of the influential nodes in the Germany highway network net (2014b) are shown in Table 10 and Fig. 11. After 100 infection times, almost half of the nodes in the network have been infected. The nodes identified by the local structure entropy therefore have a large influence in the Germany highway network net (2014b).

The protein-protein interaction network in budding yeast net (2014a) is a biological network. The infection processes in this network are shown in Table 11 and Fig. 12. The results show that the nodes identified by the local structure entropy have a stable and large influence on the protein-protein interaction network in budding yeast net (2014a).

Our results show that the local structure entropy can identify nodes that have a small degree but a large influence on the whole network. Most of the nodes identified by the local structure entropy are intermediate connection nodes: they connect nodes that have a large degree.

## VI Conclusion

Identifying the influential nodes in a network is one of the most important research directions in the study of complex networks. It can be used to identify the leaders in a social network and to find the central nodes in a power network. It can also be used in the human disease network to find the main genes that control human health. Many methods can be used to identify the influential nodes in a network according to different needs. In this paper, the local structure entropy is proposed based on the degree centrality and statistical mechanics. The influence of the local network on the whole network is used to replace the node’s influence on the whole network. In our opinion, the local structure entropy can avoid the complex calculation of the traditional methods and merge the influence of the degree and the betweenness of the nodes in the local network. The results of this paper show that the local structure entropy is effective and reasonable.

## VII Acknowledgment

The work is partially supported by the National Natural Science Foundation of China (Grant No. 61174022), the Specialized Research Fund for the Doctoral Program of Higher Education (Grant No. 20131102130002), the R&D Program of China (2012BAH07B01), the National High Technology Research and Development Program of China (863 Program) (Grant No. 2013AA013801), and the open funding project of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (Grant No. BUAA-VR-14KF-02).

## References
