Measure the similarity of nodes in the complex networks

Qi Zhang, Meizhu Li, Yong Deng (ydeng@swu.edu.cn, prof.deng@hotmail.com), Sankaran Mahadevan
School of Computer and Information Science, Southwest University, Chongqing, 400715, China
School of Automation, Northwestern Polytechnical University, Xian, Shaanxi 710072, China
School of Engineering, Vanderbilt University, Nashville, TN, 37235, USA

Abstract

Measuring the similarity of nodes in complex networks has attracted the interest of many researchers. In this paper, a new method based on degree centrality and relative entropy is proposed to measure the similarity of nodes in complex networks. The results show that nodes with a common structural property have a high similarity to other nodes. Nodes that are highly influential have a low similarity to other nodes, and marginal nodes also have a low similarity to other nodes. These results indicate that the proposed method is a useful and reasonable way to measure the similarity of nodes in complex networks.

keywords:
Complex networks, Similarity of nodes, Cross-entropy
journal: Elsevier

1 Introduction

Complex networks provide a mathematical framework for describing complex systems. Many real-world systems can be modeled as complex networks, such as biological, social and technological systems [1, 2, 3, 4, 5]. Researchers in this field have illuminated many properties of complex networks, such as network topology and dynamics [6, 7, 8, 9], the properties of network structure [2, 10], the self-similarity and fractal properties of complex networks [11, 12, 13], evolutionary games on complex networks [14, 15], the controllability and synchronization of complex networks [16, 17], and so on [18, 19, 20, 8, 21, 22, 23].

The similarity of nodes in complex networks is a new research direction. Questions of interest include "How similar are these two vertices?" and "Which node is most similar to the other nodes?". Many methods have been proposed to address this problem [24, 25, 26, 27, 28]. In this paper, a new method based on the relative entropy (Kullback–Leibler divergence) [29] is proposed to describe the similarity of nodes in complex networks. The definition of the probabilities of each node is based on the degree distribution.

The rest of this paper is organised as follows. Section 2 introduces some preliminaries of this work. In Section 3, a new method to measure the similarity of nodes in complex networks is proposed. The application of the proposed method is illustrated in Section 4. The conclusion is given in Section 5.

2 Preliminaries

2.1 Local network in the complex network

Existing research on complex networks shows that many properties of a complex network depend on its structure [2]. In a complex network, each node's influence on the whole network is determined by its neighbour nodes. Based on existing research on the local structure of complex networks [25, 30], a local network of each node has been proposed [31]. The details of the local network are shown as follows:

(a) Network A
(b) The local network of node 4
(c) The local network of node 8
Figure 1: Network A in subfigure (a) is a simple network. Subfigure (b) shows the details of the local network of node 4. Subfigure (c) shows the details of the local network of node 8.

The local network of a target node contains the target node and its neighbour nodes.
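The definition of the local network can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph `toy` is a hypothetical example (it is not Network A from Fig. 1), `local_network` is a hypothetical helper name, and keeping the edges among the selected nodes is an assumption, since the text only specifies which nodes belong to the local network.

```python
def local_network(adjacency, target):
    """Return the nodes and edges of the local network of `target`.

    Nodes: the target node and its neighbours (as stated in the text).
    Edges: every edge of the graph whose two ends are both in that node
    set (an assumption made for this sketch).
    """
    nodes = {target} | set(adjacency[target])
    edges = {
        frozenset((u, v))
        for u in nodes
        for v in adjacency[u]
        if v in nodes
    }
    return nodes, edges


# Hypothetical undirected toy graph stored as an adjacency list.
toy = {
    1: [2, 3],
    2: [1, 3],
    3: [1, 2, 4],
    4: [3, 5],
    5: [4],
}

nodes, edges = local_network(toy, 3)
print(sorted(nodes))  # [1, 2, 3, 4]
print(len(edges))     # 4
```

Node 5 is left out because it is not a neighbour of node 3, so the edge between nodes 4 and 5 is not part of this local network.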

2.2 Relative entropy (Kullback–Leibler divergence)

The relative entropy (Kullback–Leibler divergence) is a basic concept in probability theory and information theory. It was proposed by Kullback and Leibler [29]. The relative entropy is a non-symmetric measure of the difference between two probability distributions. For two probability distributions $P$ and $Q$, the relative entropy is defined in Eq. (1).

$$D_{KL}(P \,\|\, Q) = \sum_{k=1}^{n} P(k) \log \frac{P(k)}{Q(k)} \qquad (1)$$

where $P$ and $Q$ have the same number of components, $n$.
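As an illustration of the relative entropy, the following Python sketch implements the standard Kullback–Leibler divergence and shows that it is not symmetric. It is a sketch under stated assumptions: the natural logarithm is used (the base is not specified in the text), terms with a zero component in the first distribution are taken as zero, and the result is infinite when the second distribution has a zero where the first does not. The function name and the two example distributions are hypothetical.

```python
import math


def relative_entropy(p, q):
    """Kullback-Leibler divergence of p from q, natural logarithm."""
    if len(p) != len(q):
        raise ValueError("P and Q must have the same number of components")
    total = 0.0
    for p_k, q_k in zip(p, q):
        if p_k == 0:
            continue  # convention: a zero component of P contributes 0
        if q_k == 0:
            return math.inf  # P puts mass where Q has none
        total += p_k * math.log(p_k / q_k)
    return total


# Two hypothetical distributions with three components each.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

print(relative_entropy(p, q))  # differs from the next value
print(relative_entropy(q, p))
print(relative_entropy(p, p))  # 0.0
```

The first two printed values differ, which is why a similarity built on the relative entropy has to decide how to treat the two directions.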

3 Measure the similarity of each node

The proposed method is based on the definition of the local network and the relative entropy. It can be divided into two parts.

  • The definition of the probabilities of each node. First, calculate the degree of each node and find the maximum degree in the network. Second, set the number of components in each node's probability set based on the maximum degree. Third, use the degrees of the node and its neighbour nodes as the components of the probability set. Finally, sort the components from high to low.

  • The relative entropy of each node to the other nodes. Calculate the relative entropy between the probability sets of each pair of nodes.

Based on the local network of each node and the degree centrality, the probabilities of each node are defined as follows. Let $L_i$ denote the local network of node $i$. The total degree of the nodes in $L_i$ is denoted by $D(L_i)$, and $N_i$ is the number of nodes in $L_i$. The maximum degree in the whole network is $d_{\max}$, so the number of components in each node's probability set is $m = d_{\max} + 1$. The probability set of node $i$ is defined in Eq. (2).

$$P(i) = [\,p(i,1), p(i,2), \ldots, p(i,m)\,], \qquad p(i,j) = \begin{cases} d(j) / D(L_i), & j \le N_i \\ 0, & j > N_i \end{cases} \qquad (2)$$

where $d(j)$ in Eq. (2) is the degree of the $j$-th node in the local network $L_i$, and $D(L_i) = \sum_{j=1}^{N_i} d(j)$. If the number of nodes in the local network is smaller than $m$, the remaining components are set to 0. Finally, the components of $P(i)$ are sorted from high to low.

An example of the definition of $P(i)$ is shown in Fig. 2.


Figure 2: This figure shows a part of Network A from Fig. 1. The maximum degree in Network A is 6, so the number of components in the probability set is 7. The total degree in the local network of node 4 is 23. The degree of node 1 is 2, the degree of node 2 is 2, the degree of node 3 is 3, the degree of node 5 is 4, the degree of node 6 is 4, the degree of node 7 is 2 and the degree of node 4 is 6. Then $P(4) = [2/23, 2/23, 3/23, 4/23, 4/23, 2/23, 6/23]$. After sorting the components, $P(4) = [6/23, 4/23, 4/23, 3/23, 2/23, 2/23, 2/23]$.
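The construction in Fig. 2 can be reproduced with a short Python sketch. It is an illustration rather than the authors' code: `probability_set` and its parameter names are hypothetical, the degrees are the ones listed in the caption of Fig. 2, and exact fractions are used so that the result can be compared with the caption without rounding.

```python
from fractions import Fraction


def probability_set(local_degrees, max_degree):
    """Build the sorted probability set of one node.

    local_degrees: degrees of the nodes in the local network
                   (the target node and its neighbours).
    max_degree:    maximum degree in the whole network.
    """
    size = max_degree + 1  # number of components
    if len(local_degrees) > size:
        raise ValueError("local network has more nodes than components")
    total = sum(local_degrees)
    components = [Fraction(d, total) for d in local_degrees]
    # Pad with zeros when the local network has fewer nodes than components.
    components += [Fraction(0)] * (size - len(components))
    return sorted(components, reverse=True)


# Degrees of nodes 1, 2, 3, 5, 6, 7 and 4 from the caption of Fig. 2.
degrees_node_4 = [2, 2, 3, 4, 4, 2, 6]
p4 = probability_set(degrees_node_4, max_degree=6)
print([str(x) for x in p4])
# ['6/23', '4/23', '4/23', '3/23', '2/23', '2/23', '2/23']
```

A node with a single neighbour of degree 3 in a network with maximum degree 6 would give `probability_set([1, 3], 6)`, that is 3/4 and 1/4 followed by five zeros.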

Then the similarity between node $i$ and node $j$ is defined as follows:

(3)

The sum of a node's similarities to all other nodes in the network is used to identify which node is most similar to the others. The larger this sum, the more similar the node is to the other nodes.

To illustrate the usefulness of the new method, an example network (Network A-21) is used and the similarity of its nodes is measured. The details of the example network (Network A-21) are shown in Fig. 3.


Figure 3: The example network (Network A-21)

The probabilities of each node in the example network (Network A-21) are shown in Table 1.

P(1) = [0.27 0.27 0.27 0.18 0.00 0.00 0.00]
P(2) = [0.31 0.31 0.19 0.19 0.00 0.00 0.00]
P(3) = [0.36 0.21 0.21 0.21 0.00 0.00 0.00]
P(4) = [0.44 0.33 0.22 0.00 0.00 0.00 0.00]
P(5) = [0.23 0.23 0.14 0.14 0.14 0.14 0.00]
P(6) = [0.36 0.27 0.27 0.09 0.00 0.00 0.00]
P(7) = [0.24 0.24 0.14 0.14 0.14 0.10 0.00]
P(8) = [0.31 0.31 0.19 0.19 0.00 0.00 0.00]
P(9) = [0.75 0.25 0.00 0.00 0.00 0.00 0.00]
P(10) = [0.29 0.21 0.21 0.14 0.14 0.00 0.00]
P(11) = [0.45 0.36 0.18 0.00 0.00 0.00 0.00]
P(12) = [0.33 0.28 0.22 0.17 0.00 0.00 0.00]
P(13) = [0.50 0.33 0.17 0.00 0.00 0.00 0.00]
P(14) = [0.40 0.40 0.20 0.00 0.00 0.00 0.00]
P(15) = [0.24 0.16 0.16 0.12 0.12 0.12 0.08]
P(16) = [0.44 0.33 0.22 0.00 0.00 0.00 0.00]
P(17) = [0.38 0.25 0.19 0.19 0.00 0.00 0.00]
P(18) = [0.35 0.24 0.18 0.12 0.12 0.00 0.00]
P(19) = [0.35 0.24 0.18 0.12 0.12 0.00 0.00]
P(20) = [0.33 0.27 0.20 0.20 0.00 0.00 0.00]
P(21) = [0.43 0.21 0.21 0.14 0.00 0.00 0.00]
Table 1: The probabilities P(i) of each node in Network A-21

Then the similarity matrix of the nodes in the example network (Network A-21) is shown in the Eq.(4):

(4)
Figure 4: The similar nodes of each node (from node 1 to node 10). Panels (a) to (j) show the similar nodes of nodes 1 to 10, respectively.
Figure 5: The similar nodes of each node (from node 11 to node 20). Panels (a) to (j) show the similar nodes of nodes 11 to 20, respectively.

From the similarity matrix, we find that node 2 and node 8, node 4 and node 16, and node 18 and node 19 have the same structure in the example network (Network A-21). The details are shown in Fig. 6:

(a) The details of node 2 and node 8
(b) The details of node 4 and node 16
(c) The details of node 18 and node 19
Figure 6: The details of the nodes that have the same structure.
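The identical pairs can be checked directly from the rounded values in Table 1. The sketch below is a minimal illustration, not the authors' implementation: it computes the relative entropy of Eq. (1) for every ordered pair of rows and lists the pairs whose divergence is zero in both directions. Many components in Table 1 are zero, which makes the strict relative entropy infinite for some pairs, so the sketch adds a small constant `EPS` to every component and renormalises. This smoothing, the natural logarithm and all function names are assumptions made for illustration. The sketch stops at the raw relative entropies and does not apply the similarity of Eq. (3).

```python
import math

# Rounded probability sets copied from Table 1 (Network A-21).
TABLE_1 = {
    1: [0.27, 0.27, 0.27, 0.18, 0.00, 0.00, 0.00],
    2: [0.31, 0.31, 0.19, 0.19, 0.00, 0.00, 0.00],
    3: [0.36, 0.21, 0.21, 0.21, 0.00, 0.00, 0.00],
    4: [0.44, 0.33, 0.22, 0.00, 0.00, 0.00, 0.00],
    5: [0.23, 0.23, 0.14, 0.14, 0.14, 0.14, 0.00],
    6: [0.36, 0.27, 0.27, 0.09, 0.00, 0.00, 0.00],
    7: [0.24, 0.24, 0.14, 0.14, 0.14, 0.10, 0.00],
    8: [0.31, 0.31, 0.19, 0.19, 0.00, 0.00, 0.00],
    9: [0.75, 0.25, 0.00, 0.00, 0.00, 0.00, 0.00],
    10: [0.29, 0.21, 0.21, 0.14, 0.14, 0.00, 0.00],
    11: [0.45, 0.36, 0.18, 0.00, 0.00, 0.00, 0.00],
    12: [0.33, 0.28, 0.22, 0.17, 0.00, 0.00, 0.00],
    13: [0.50, 0.33, 0.17, 0.00, 0.00, 0.00, 0.00],
    14: [0.40, 0.40, 0.20, 0.00, 0.00, 0.00, 0.00],
    15: [0.24, 0.16, 0.16, 0.12, 0.12, 0.12, 0.08],
    16: [0.44, 0.33, 0.22, 0.00, 0.00, 0.00, 0.00],
    17: [0.38, 0.25, 0.19, 0.19, 0.00, 0.00, 0.00],
    18: [0.35, 0.24, 0.18, 0.12, 0.12, 0.00, 0.00],
    19: [0.35, 0.24, 0.18, 0.12, 0.12, 0.00, 0.00],
    20: [0.33, 0.27, 0.20, 0.20, 0.00, 0.00, 0.00],
    21: [0.43, 0.21, 0.21, 0.14, 0.00, 0.00, 0.00],
}

EPS = 1e-6  # assumed smoothing constant, not taken from the text


def smooth(p, eps=EPS):
    """Add eps to every component and renormalise so the sum is 1."""
    shifted = [x + eps for x in p]
    total = sum(shifted)
    return [x / total for x in shifted]


def relative_entropy(p, q):
    """Relative entropy of Eq. (1), natural logarithm, no zero components."""
    return sum(a * math.log(a / b) for a, b in zip(p, q))


smoothed = {node: smooth(p) for node, p in TABLE_1.items()}
divergence = {
    (i, j): relative_entropy(smoothed[i], smoothed[j])
    for i in smoothed
    for j in smoothed
}

# Pairs of distinct nodes whose divergence is zero in both directions.
identical = [
    (i, j)
    for i in smoothed
    for j in smoothed
    if i < j and divergence[(i, j)] < 1e-12 and divergence[(j, i)] < 1e-12
]
print(identical)  # [(2, 8), (4, 16), (18, 19)]
```

The three pairs found are those whose rows in Table 1 are identical, so their relative entropy is zero whichever node is taken as the reference.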

From the similarity matrix, we also find that node 9 has the lowest similarity to the other nodes and node 12 has the highest similarity to the other nodes.

The results of our test on the example network (Network A-21) indicate that measuring the similarity of nodes based on the relative entropy is a reasonable and useful method. The method can also be used for node classification in complex networks. Node 12 has the highest similarity to the other nodes. The degree of node 12 is 3, and most nodes in the example network (Network A-21) have degree 3. This shows, from another angle, that the degree is very important for describing the structural properties of complex networks. Node 9 is a marginal node, because no node has a high similarity to it.

4 Application

In this section, the new method is used to find the most similar node in four real networks. The four networks are Zachary's Karate Club network (Karate) [32], the US-airport network (Us-airport) [33], the Email network (Email) [33] and the Germany highway network (Highway) [34]. The results are shown as follows:

Network      Nodes   Edges   High similarity node   Low similarity node
Karate          34      78                     28                    12
Us-airport     332    2126                     55                   118
Email         1133   10902                    855                   644
Highway       1168    2481                     31                   798
Table 2: The most similar node and the most marginal node in the four real networks

5 Conclusion

Measuring the similarity of nodes in complex networks is an interesting topic. In this paper, a new method based on the relative entropy is proposed to measure the similarity of nodes in complex networks. Nodes with a common structure have a high similarity to others. When the similarity between two nodes is equal to 1, the two nodes have the same structural property in the complex network. Nodes that are influential, and nodes that are marginal in the complex network, have a low similarity to others. The results in this paper show that the proposed method is useful and reasonable for measuring the similarity of nodes in complex networks.

Acknowledgments

The work is partially supported by the National Natural Science Foundation of China (Grant No. 61174022), the Specialized Research Fund for the Doctoral Program of Higher Education (Grant No. 20131102130002), the R&D Program of China (2012BAH07B01), the National High Technology Research and Development Program of China (863 Program) (Grant No. 2013AA013801), the open funding project of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (Grant No. BUAA-VR-14KF-02), the Fundamental Research Funds for the Central Universities (No. XDJK2015D009) and the Chongqing Graduate Student Research Innovation Project (Grant No. CYS14062).

References

  • (1) R. Albert, H. Jeong, A. Barabási, Error and attack tolerance of complex networks, Nature 406 (6794) (2000) 378–382.
  • (2) M. Newman, The structure and function of complex networks, SIAM Review (2003) 167–256.
  • (3) P. De Meo, E. Ferrara, G. Fiumara, A. Provetti, On facebook, most ties are weak, Communications of the ACM 57 (11) (2014) 78–84.
  • (4) P. Csermely, Creative elements: network-based predictions of active centres in proteins and cellular and social networks, Trends in biochemical sciences 33 (12) (2008) 569–576.
  • (5) P. Csermely, Weak links: the universal key to the stability of networks and complex systems, Springer Science & Business Media, 2009.
  • (6) D. Watts, S. Strogatz, Collective dynamics of 'small-world' networks, Nature 393 (6684) (1998) 440–442.
  • (7) M. Newman, A.-L. Barabási, D. J. Watts, The structure and dynamics of networks, Princeton University Press, 2006.
  • (8) E. Ferrara, O. Varol, F. Menczer, A. Flammini, Traveling trends: social butterflies or frequent fliers?, in: Proceedings of the first ACM conference on Online social networks, ACM, 2013, pp. 213–222.
  • (9) E. Ferrara, A large-scale community structure analysis in facebook, EPJ Data Science 1 (1) (2012) 1–30.
  • (10) M. Barthelemy, Betweenness centrality in large complex networks, The European Physical Journal B-Condensed Matter and Complex Systems 38 (2) (2004) 163–168.
  • (11) C. Song, S. Havlin, H. Makse, Self-similarity of complex networks, Nature 433 (7024) (2005) 392–395.
  • (12) D. Wei, B. Wei, Y. Hu, H. Zhang, Y. Deng, A new information dimension of complex networks, Physics Letters A 378 (16) (2014) 1091–1094.
  • (13) Q. Zhang, C. Luo, M. Li, Y. Deng, S. Mahadevan, Tsallis information dimension of complex networks, Physica A: Statistical Mechanics and its Applications 419 (2015) 707–717.
  • (14) Z. Wang, C.-Y. Xia, S. Meloni, C.-S. Zhou, Y. Moreno, Impact of social punishment on cooperative behavior in complex networks, Scientific Reports 3. doi:10.1038/srep03055.
  • (15) Z. Wang, L. Wang, M. Perc, Degree mixing in multilayer networks impedes the evolution of cooperation, Phys. Rev. E 89 (2014) 052813.
  • (16) Y.-Y. Liu, J.-J. Slotine, A.-L. Barabási, Controllability of complex networks, Nature 473 (7346) (2011) 167–173.
  • (17) A. Arenas, A. Díaz-Guilera, J. Kurths, Y. Moreno, C. Zhou, Synchronization in complex networks, Physics Reports 469 (3) (2008) 93–153.
  • (18) A.-L. Barabási, et al., Scale-free networks: a decade and beyond, Science 325 (5939) (2009) 412.
  • (19) A. Barabási, R. Albert, Emergence of scaling in random networks, Science 286 (5439) (1999) 509–512.
  • (20) P. De Meo, E. Ferrara, F. Abel, L. Aroyo, G.-J. Houben, Analyzing user behavior across social sharing environments, ACM Transactions on Intelligent Systems and Technology (TIST) 5 (1) (2013) 14.
  • (21) G. Teixeira, M. Aguiar, C. Carvalho, D. Dantas, M. Cunha, J. Morais, H. Pereira, J. Miranda, Complex semantic networks, International Journal of Modern Physics C 21 (03) (2010) 333–347.
  • (22) P. Csermely, Strong links are important, but weak links stabilize them, Trends in biochemical sciences 29 (7) (2004) 331–334.
  • (23) Z. Wang, A. Szolnoki, M. Perc, Evolution of public cooperation on interdependent networks: The impact of biased utility functions, EPL (Europhysics Letters) 97 (4) (2012) 48001.
  • (24) E. Leicht, P. Holme, M. E. Newman, Vertex similarity in networks, Physical Review E 73 (2) (2006) 026120.
  • (25) T. Zhou, L. Lü, Y.-C. Zhang, Predicting missing links via local information, The European Physical Journal B-Condensed Matter and Complex Systems 71 (4) (2009) 623–630.
  • (26) Y. Pan, D.-H. Li, J.-G. Liu, J.-Z. Liang, Detecting community structure in complex networks via node similarity, Physica A: Statistical Mechanics and its Applications 389 (14) (2010) 2849–2857.
  • (27) W. Lu, J. Janssen, E. Milios, N. Japkowicz, Node similarity in networked information spaces, in: Proceedings of the 2001 conference of the Centre for Advanced Studies on Collaborative research, IBM Press, 2001, p. 11.
  • (28) W. Lu, J. Janssen, E. Milios, N. Japkowicz, Y. Zhang, Node similarity in the citation graph, Knowledge and Information Systems 11 (1) (2007) 105–129.
  • (29) S. Kullback, R. A. Leibler, On information and sufficiency, The annals of mathematical statistics (1951) 79–86.
  • (30) R. E. Ulanowicz, D. Baird, Nutrient controls on ecosystem dynamics: the chesapeake mesohaline community, Journal of Marine Systems 19 (1) (1999) 159–172.
  • (31) Q. Zhang, M. Li, Y. Du, Y. Deng, Local structure entropy of complex networks, arXiv preprint arXiv:1412.3910.
  • (32) UCI network data repository, http://networkdata.ics.uci.edu/data.php?id=105 (2014).
  • (33) Pajek datasets, http://vlado.fmf.uni-lj.si/pub/networks/data/ (2014).
  • (34) Tore Opsahl, http://toreopsahl.com/datasets/ (2014).