Identification of influential nodes in network of networks

Meizhu Li School of Computer and Information Science, Southwest University, Chongqing 400715, China    Qi Zhang School of Computer and Information Science, Southwest University, Chongqing 400715, China    Qi Liu Center for Quantitative Sciences, Vanderbilt University School of Medicine, Nashville, TN 37232, USA Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN 37232, USA    Yong Deng ydeng@swu.edu.cn; prof.deng@hotmail.com School of Computer and Information Science, Southwest University, Chongqing 400715, China School of Automation, Northwestern Polytechnical University, Xian, Shaanxi 710072, China School of Engineering, Vanderbilt University, Nashville, TN, 37235, USA
July 3, 2019
Abstract

Research on the network of networks (NON) studies the properties of interdependent networks, which are ubiquitous in the real world. Identifying the influential nodes in a network of networks is of both theoretical and practical significance. However, it is hard to describe the structural properties of a NON with traditional methods. In this paper, a new method based on evidence theory is proposed to identify the influential nodes in a network of networks. The proposed method fuses the different kinds of relationships between the network components to construct a comprehensive similarity network. The nodes with large similarity values are the influential nodes in the NON. The experimental results illustrate that the proposed method is reasonable and significant.

Network of networks, Evidence theory, Influential nodes
pacs:
89.20.-a, 05.10.-a, 02.50.-r, 02.10.-v
preprint: Physical Review Letters

I Introduction

Complex networks describe a wide range of systems in nature and society, and they have been widely used in many fields Newman (2003); Newman et al. (2006); Watts and Strogatz (1998). In the real world, a large number of systems can be described as complex networks, such as the Internet, airline routes, electric power grids and protein interaction networks. The function of all these networks relies on the connectivity between the network components. However, in many real-world networks the nodes are connected by different kinds of relationships based on different principles, such as the protein interaction networks and the cancer gene expression networks Nicosia et al. (2013); Wang et al. (2014); Kenett et al. (2014). This kind of network is called a network of networks (NON) Gao et al. (2011).

Compared to traditional complex networks with a single kind of relationship, the structural properties of a network of networks are more difficult to characterize Boccaletti et al. (2014); Battiston et al. (2014); Zhang et al. (2015). In a NON, identifying the influential nodes is of both theoretical and practical significance. In this paper, a new method based on evidence theory is proposed to identify the influential nodes in a network of networks.

The Dempster-Shafer theory of evidence Dempster (1967); Shafer (1976) is used to deal with uncertain information and has been widely applied in many fields Bloch (1996); Cuzzolin (2008); Denœux (2011); Deng et al. (2011); Li et al. (2014). Here the combination rule of evidence theory is used to fuse the influence of each node in the different single networks. The nodes with large similarity values are the influential nodes in the NON. One advantage of the proposed method is that the more types of interrelation between network components are available, the more accurate the results are.

The rest of this paper is organised as follows. Section II introduces some preliminaries of this work. In Section III, a new method is proposed to identify the influential nodes in the network of networks. The application to cancer gene expression networks is illustrated in Section IV. The conclusion is given in Section V.

II Preliminaries

II.1 The network of networks

The network of networks (NON), sometimes called a multilayer or multiplex network, has attracted more and more attention. Due to the fast growth of this field, there are many definitions of different types of NON, such as interdependent networks, interconnected networks, multilayered networks, multiplex networks and so on. Many datasets can be represented as NON, such as transportation networks (flight, railway and road networks) and networks of biological networks (gene regulation, metabolic and protein-protein interaction networks).

In this paper, we focus on networks that share the same set of nodes but have different kinds of relationships between the components.

II.2 The evidence theory

Dempster-Shafer theory Dempster (1967); Shafer (1976) is often regarded as an extension of Bayesian theory. For completeness of the explanation, a few basic concepts are introduced as follows.

Definition 1.

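Let $\Omega$ be a set of mutually exclusive and collectively exhaustive events, indicated by

$$ \Omega = \{E_1, E_2, \ldots, E_i, \ldots, E_N\}. \tag{1} $$

The set $\Omega$ is called the frame of discernment. The power set of $\Omega$ is indicated by $2^\Omega$, where

$$ 2^\Omega = \{\emptyset, \{E_1\}, \ldots, \{E_N\}, \{E_1, E_2\}, \ldots, \{E_1, E_2, \ldots, E_i\}, \ldots, \Omega\}. \tag{2} $$

If $A \in 2^\Omega$, $A$ is called a proposition.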

Definition 2.

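For a frame of discernment $\Omega$, a mass function is a mapping $m$ from $2^\Omega$ to $[0,1]$, formally defined by:

$$ m: 2^\Omega \to [0,1], \tag{3} $$

which satisfies the following condition:

$$ m(\emptyset) = 0 \quad \text{and} \quad \sum_{A \in 2^\Omega} m(A) = 1. \tag{4} $$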

In Dempster-Shafer theory, a mass function is also called a basic probability assignment (BPA).
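
As a minimal sketch of Definitions 1 and 2, the following Python snippet enumerates the power set of a small frame of discernment and checks whether a dictionary of masses is a valid BPA. The names `power_set` and `is_valid_bpa`, the representation of propositions as `frozenset` objects, and the numerical tolerance are illustrative assumptions and not part of the original method.

```python
from itertools import chain, combinations


def power_set(frame):
    """Return every subset of the frame of discernment as a frozenset."""
    items = sorted(frame)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def is_valid_bpa(mass, frame, tol=1e-9):
    """Check the mass-function conditions: m(empty set) = 0 and masses sum to 1.

    `mass` maps frozenset propositions to numbers; propositions that are
    missing from the dictionary are treated as having mass 0 (an assumption).
    """
    allowed = set(power_set(frame))
    if any(a not in allowed for a in mass):
        return False  # a focal element lies outside the power set
    if any(v < 0.0 or v > 1.0 for v in mass.values()):
        return False  # every mass must lie in [0, 1]
    if mass.get(frozenset(), 0.0) != 0.0:
        return False  # the empty set must carry no mass
    return abs(sum(mass.values()) - 1.0) <= tol
```

For a two-element frame such as similarity and dissimilarity, `power_set` returns four propositions: the empty set, the two singletons and the whole frame.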

Consider two pieces of evidence indicated by two BPAs $m_1$ and $m_2$ on the frame of discernment $\Omega$. Dempster’s rule of combination is used to combine them. This rule assumes that these BPAs are independent.

Definition 3.

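Dempster’s rule of combination, also called the orthogonal sum and denoted by $m = m_1 \oplus m_2$, is defined as follows:

$$ m(A) = \begin{cases} \dfrac{1}{1-K} \sum_{B \cap C = A} m_1(B)\, m_2(C), & A \neq \emptyset, \\ 0, & A = \emptyset, \end{cases} \tag{5} $$

with

$$ K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C), \tag{6} $$

where $B$ and $C$ are also elements of $2^\Omega$, and $K$ is a constant that measures the conflict between the two BPAs.

Note that Dempster’s rule of combination is only applicable to two BPAs that satisfy the condition $K < 1$.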
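
The combination rule can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the authors' code: BPAs are represented as dictionaries from `frozenset` propositions to masses, and the function name `dempster_combine` and the tolerance used to detect total conflict are hypothetical choices.

```python
def dempster_combine(m1, m2, tol=1e-12):
    """Combine two BPAs with Dempster's rule; return (fused BPA, conflict K)."""
    combined = {}
    conflict = 0.0
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            intersection = b & c
            if intersection:
                # Mass of every pair (B, C) whose intersection is A goes to A.
                combined[intersection] = (
                    combined.get(intersection, 0.0) + mass_b * mass_c
                )
            else:
                # Pairs with an empty intersection contribute to the conflict K.
                conflict += mass_b * mass_c
    if 1.0 - conflict <= tol:
        raise ValueError("K = 1: the two BPAs are in total conflict")
    fused = {a: v / (1.0 - conflict) for a, v in combined.items()}
    return fused, conflict


# Example on a two-element frame with made-up masses.
S, D = frozenset({"s"}), frozenset({"d"})
THETA = S | D
fused, K = dempster_combine(
    {S: 0.6, D: 0.1, THETA: 0.3},
    {S: 0.5, D: 0.2, THETA: 0.3},
)
```

With these made-up inputs the conflict is K = 0.17, and the normalised mass on the first singleton is 0.63 / 0.83. Because the rule is associative, more than two BPAs can be fused by applying the function repeatedly.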

III New methods to identify the influential nodes in the network of networks

Identifying the influential nodes in complex networks is one of the important directions in network science. For a single network, many methods have been proposed to identify the influential nodes, such as degree centrality, betweenness centrality, local structure entropy and so on. However, in a network of networks, multiple networks depend on each other, and the structure becomes more complex. Based on evidence theory, a new method is proposed to identify the influential nodes in a NON by combining the single networks into which the NON is divided.


Figure 1: Division of NON.

As shown in subfigure (a) of Fig. 1, our research focuses on NON whose component networks share the same nodes but have different edges. In the proposed method, a NON is divided into a number of single networks based on different principles, as shown in subfigure (b) of Fig. 1. From each single network divided from the NON, a similarity network is established. A comprehensive network is then constructed by fusing the similarity networks with the combination rule of evidence theory. The nodes that have large similarity values compared with the other nodes in the comprehensive network are the influential nodes in the NON.

The proposed method consists of four steps, described as follows.

Step 1

Based on the different kinds of edges, divide the NON into a number of single networks.
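
A minimal sketch of this division step is given below. It assumes, for illustration only, that the NON is stored as a list of undirected edges labelled with their kind, `(u, v, kind)`, and that every single network keeps the full node set; the function name `split_into_layers` is hypothetical.

```python
from collections import defaultdict


def split_into_layers(nodes, typed_edges):
    """Group labelled, undirected edges into one adjacency map per edge kind.

    Every layer contains all nodes, so a node without edges of a given kind
    simply has an empty neighbour set in that layer.
    """
    layers = defaultdict(lambda: {n: set() for n in nodes})
    for u, v, kind in typed_edges:
        layers[kind][u].add(v)
        layers[kind][v].add(u)
    return dict(layers)


# Toy example: three nodes, two kinds of relationship.
layers = split_into_layers(
    [1, 2, 3],
    [(1, 2, "a"), (1, 3, "a"), (2, 3, "b")],
)
```

Keeping the same node set in every layer matches the setting considered here, in which the single networks share their nodes and differ only in their edges.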

Step 2

According to the distance between each pair of nodes, establish the distance matrix of each single network. Each single network has a distance matrix to describe the similarity between its nodes. For a network with $n$ nodes, the distance matrix is:

$$ D = \begin{pmatrix} d_{11} & \cdots & d_{1n} \\ \vdots & \ddots & \vdots \\ d_{n1} & \cdots & d_{nn} \end{pmatrix} $$

where $d_{ij}$ is the shortest distance between node $i$ and node $j$.
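
The distance matrix of one single network can be computed with a breadth-first search from every node. The sketch below assumes an unweighted, undirected network given as an adjacency map, and it marks unreachable pairs with infinity; both choices, as well as the function name `distance_matrix`, are assumptions made for illustration.

```python
from collections import deque
import math


def distance_matrix(adjacency):
    """Return (ordered node list, matrix of shortest hop counts)."""
    nodes = sorted(adjacency)
    index = {node: k for k, node in enumerate(nodes)}
    dist = [[math.inf] * len(nodes) for _ in nodes]
    for source in nodes:
        row = dist[index[source]]
        row[index[source]] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if row[index[v]] == math.inf:  # first visit gives the shortest path
                    row[index[v]] = row[index[u]] + 1
                    queue.append(v)
    return nodes, dist


# Path 1 - 2 - 3 plus an isolated node 4.
order, dist = distance_matrix({1: {2}, 2: {1, 3}, 3: {2}, 4: set()})
```

For weighted networks a shortest-path algorithm such as Dijkstra's would be needed instead of breadth-first search.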

Based on the distance matrix, the similarity network can be defined as follows:

Step 3

According to the similarity network, the basic probability assignment (BPA), an essential concept in evidence theory, can be constructed. Each element of the similarity network has a corresponding BPA, which is defined as follows.

Definition 4.

Given a similarity network , the frame of discernment of the network is , where represents similarity and represents dissimilarity. The BPA of element is:

(7)
(8)
(9)
(10)

Where represents the maximum element in the similarity network, except the diagonal elements. represents the minimum element in the similarity network.

Based on the combination rule of evidence theory, fuse the BPAs of the corresponding elements in the similarity networks into a comprehensive similarity network.
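
The element-wise fusion can be sketched as follows. The sketch assumes that each matrix element carries a BPA with three masses on the two-element frame: similarity, dissimilarity, and the whole frame (unknown), stored as a tuple. How those masses are obtained from the similarity values is given by Definition 4 and is not reproduced in the code; the function names and the tuple layout are hypothetical.

```python
from functools import reduce


def combine_binary(bpa1, bpa2, tol=1e-12):
    """Dempster's rule for BPAs (m_similar, m_dissimilar, m_unknown)."""
    s1, d1, t1 = bpa1
    s2, d2, t2 = bpa2
    conflict = s1 * d2 + d1 * s2  # similar vs. dissimilar is the only conflict
    norm = 1.0 - conflict
    if norm <= tol:
        raise ValueError("K = 1: the two BPAs are in total conflict")
    return (
        (s1 * s2 + s1 * t2 + t1 * s2) / norm,
        (d1 * d2 + d1 * t2 + t1 * d2) / norm,
        (t1 * t2) / norm,
    )


def fuse_layers(layer_bpas):
    """Fuse same-sized matrices of BPAs element by element across layers."""
    n = len(layer_bpas[0])
    return [
        [
            reduce(combine_binary, (layer[i][j] for layer in layer_bpas))
            for j in range(n)
        ]
        for i in range(n)
    ]


# One-element matrices with made-up masses, one matrix per single network.
fused = fuse_layers([[[(0.6, 0.1, 0.3)]], [[(0.5, 0.2, 0.3)]]])
```

The first component of each fused tuple can then serve as the entry of the comprehensive similarity network, which is again an assumption about how the fused BPA is read out.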

Step 4

The nodes that have large similarity values with the other nodes in the fused similarity network are the influential nodes in the NON.
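
One possible way to turn this step into a ranking is sketched below. It assumes that the influence of a node is scored by the sum of its fused similarity values with all other nodes, which is an interpretation chosen for illustration; the text does not fix the aggregation, and the function name `rank_nodes` is hypothetical.

```python
def rank_nodes(nodes, fused_similarity):
    """Rank nodes by their total fused similarity with all other nodes."""
    scores = {}
    for i, node in enumerate(nodes):
        # Skip the diagonal entry, i.e. the similarity of a node with itself.
        scores[node] = sum(
            value for j, value in enumerate(fused_similarity[i]) if j != i
        )
    ranking = sorted(nodes, key=lambda n: scores[n], reverse=True)
    return ranking, scores


# Made-up fused similarity matrix for three nodes.
ranking, scores = rank_nodes(
    ["a", "b", "c"],
    [
        [1.0, 0.9, 0.2],
        [0.9, 1.0, 0.5],
        [0.2, 0.5, 1.0],
    ],
)
```

In this toy matrix node "b" has the largest total similarity and is ranked first.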

Here a network of networks, which is shown in Fig.2, is constructed to show the details of the combination process.


Figure 2: Example of NON and the corresponding divided single networks.

The NON in Fig. 2 can be divided into three single networks. The details of network (b), network (c) and network (d) are shown as follows.

The similarity matrices of the three single networks are shown as follows.

The similarity matrix of the single network (b):

The similarity matrix of the single network (c):

The similarity matrix of the single network (d):

According to Definition 4, the BPA of each element in the similarity networks can be constructed. Then, using the combination rule of evidence theory, the corresponding BPAs of each element in the similarity networks are fused. Here an example is shown that fuses the element of network (b), network (c) and network (d).

Based on the three single networks above, the values of the element in the networks can be shown as follows.

, , .

According to Definition 4, the BPA of elements in the networks can be constructed.

, , .

, , .

, , .

After combination based on evidence theory, the fusion result of element is:

, , .

So the value of similarity between node 3 and node 4 in the network of networks is . The order of the influential nodes in the example networks is shown in Table 1:

Order number  1   2   3   4   5   6   7   8   9   10
Node number   7   6   4   10  3   9   8   5   1   2
Table 1: The influential nodes in the example network of networks

IV Application of the new method in the cancer gene expression networks

To illustrate the usefulness of the proposed method, four cancer gene expression networks are used as cases: glioblastoma multiforme (GBM), breast invasive carcinoma (BIC), kidney renal clear cell carcinoma (KRCCC) and lung squamous cell carcinoma (LSCC). Each cancer gene expression network has three kinds of expression networks: the DNA methylation network, the mRNA expression network and the miRNA expression network.

The nodes in the networks represent the patients. The edges in these expression networks represent the similarity between patients. Based on the definition of the network of networks, the four cancer gene expression networks can be treated as four NON. To find the influential patients in each NON, a comprehensive similarity expression network is constructed with the combination rule of evidence theory. Each NON in this case has three single networks: the DNA methylation network, the mRNA expression network and the miRNA expression network.

The experimental results are shown in Table 2 and Fig. 3. Table 2 lists the node numbers of the ten most influential nodes in each of the four NON. For example, in the GBM network, the most influential node is node 116. In Fig. 3, the nodes with larger size and deeper color are the more influential nodes. The results illustrate that the proposed method is reasonable and significant.

Order number  1    2    3    4    5    6    7    8    9    10
GBM           116  60   190  179  50   42   139  72   194  209
BIC           106  34   71   49   68   7    55   76   100  51
KRCCC         104  15   113  117  110  50   41   96   77   87
LSCC          37   90   91   51   93   33   88   28   92   38
Table 2: The influential nodes in the four cancer gene expression networks
Figure 3: Influential nodes of four NON. Panels: (a) GBM, (b) BIC, (c) KRCCC, (d) LSCC.

V Conclusion

Many real-world systems can be treated as a network of networks. Identifying the influential nodes in a network of networks is of both theoretical and practical significance. The Dempster-Shafer theory of evidence is used to deal with uncertain information and has been widely applied in many fields. In this paper, the combination rule of evidence theory is used to fuse the influence of each node in the different single networks. The proposed method fuses the different kinds of relationships between the network components to construct a comprehensive similarity network. The nodes with large similarity values are the influential nodes in the NON. One advantage of the proposed method is that the more types of interrelation between network components are available, the more accurate the results are. The experimental results illustrate that the proposed method is reasonable and significant.

VI Acknowledgment

The work is partially supported by the National High Technology Research and Development Program of China (863 Program) (Grant No. 2013AA013801), the National Natural Science Foundation of China (Grant No. 61174022), the Specialized Research Fund for the Doctoral Program of Higher Education (Grant No. 20131102130002), the R&D Program of China (Grant No. 2012BAH07B01), the open funding project of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (Grant No. BUAA-VR-14KF-02), the Fundamental Research Funds for the Central Universities (Grant No. XDJK2015D009), and the Chongqing Graduate Student Research Innovation Project (Grant No. CYS14062).

References

  • Newman (2003) M. E. Newman, SIAM Review 45, 167 (2003).
  • Newman et al. (2006) M. Newman, A.-L. Barabási, and D. J. Watts, The Structure and Dynamics of Networks (Princeton University Press, 2006).
  • Watts and Strogatz (1998) D. J. Watts and S. H. Strogatz, Nature 393, 440 (1998).
  • Nicosia et al. (2013) V. Nicosia, G. Bianconi, V. Latora, and M. Barthelemy, Physical Review Letters 111, 058701 (2013).
  • Wang et al. (2014) B. Wang, A. M. Mezlini, F. Demir, M. Fiume, Z. Tu, M. Brudno, B. Haibe-Kains, and A. Goldenberg, Nature Methods 11, 333 (2014).
  • Kenett et al. (2014) D. Y. Kenett, J. Gao, X. Huang, S. Shao, I. Vodenska, S. V. Buldyrev, G. Paul, H. E. Stanley, and S. Havlin, in Networks of Networks: The Last Frontier of Complexity (Springer, 2014) pp. 3–36.
  • Gao et al. (2011) J. Gao, S. V. Buldyrev, S. Havlin, and H. E. Stanley, Physical Review Letters 107, 195701 (2011).
  • Boccaletti et al. (2014) S. Boccaletti, G. Bianconi, R. Criado, C. Del Genio, J. Gómez-Gardeñes, M. Romance, I. Sendiña-Nadal, Z. Wang, and M. Zanin, Physics Reports 544, 1 (2014).
  • Battiston et al. (2014) F. Battiston, V. Nicosia, and V. Latora, Physical Review E 89, 032804 (2014).
  • Zhang et al. (2015) Q. Zhang, C. Luo, M. Li, Y. Deng, and S. Mahadevan, Physica A: Statistical Mechanics and its Applications 419, 707 (2015).
  • Dempster (1967) A. P. Dempster, The Annals of Mathematical Statistics 38, 325 (1967).
  • Shafer (1976) G. Shafer, A Mathematical Theory of Evidence, Vol. 1 (Princeton University Press, Princeton, 1976).
  • Bloch (1996) I. Bloch, Pattern Recognition Letters 17, 905 (1996).
  • Cuzzolin (2008) F. Cuzzolin, IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews 38, 522 (2008).
  • Denœux (2011) T. Denœux, Fuzzy Sets and Systems 183, 72 (2011).
  • Deng et al. (2011) Y. Deng, R. Sadiq, W. Jiang, and S. Tesfamariam, Expert Systems with Applications 38, 15438 (2011).
  • Li et al. (2014) M. Li, X. Lu, Q. Zhang, and Y. Deng, Mathematical Problems in Engineering 2014, Article ID 319264, 6 pages (2014).