Corona graphs as a model of small-world networks
Abstract
We introduce recursive corona graphs as a model of small-world networks. We investigate analytically the critical characteristics of the model, including order and size, degree distribution, average path length, clustering coefficient, and the number of spanning trees, as well as the Kirchhoff index. Furthermore, we study the spectra of the adjacency matrix and the Laplacian matrix of the model. We obtain explicit results for all these quantities of the recursive corona graphs, which are similar to those observed in real-life networks.
keywords:
Corona product, Small-world, Spectra, Laplacian matrix
Corresponding author. Email address: zhangzz@fudan.edu.cn
1 Introduction
For decades, researchers have sought to understand what real networks look like. We want to reveal the principles of network behaviour that are hidden behind complex topology and dynamics, and to learn how network structure evolves over time and how it affects the dynamical processes running on the network. Because real networks are decentralized, it is hard to observe them directly. Instead, we observe them by taking snapshots of their structure and content and updating these snapshots repeatedly. Yet this gives little information about the future, since many networks keep growing over time. It is therefore desirable to build models that fit real networks in both structure and functionality, and to use these models to mimic real-life networks. We also expect that the properties of the models can be proven rigorously, so that relations between topological and dynamical properties of networks can be established. Even when closed-form expressions are hard to obtain for some quantities, it is useful to make them tractable enough to be estimated.
Among various network models, the ER graph proposed by Erdős and Rényi (1) is the earliest one. It generates a random graph by joining every pair of vertices independently with a constant probability. The model exhibits interesting statistical properties and has been studied extensively. However, it lacks some important properties of real-world networks. For example, many real-world networks exhibit the small-world property (2); (3); (4): their diameters grow logarithmically with the number of vertices, while they maintain a high clustering coefficient. The Watts-Strogatz (WS) model (3) is a typical graph model with the small-world effect. Nevertheless, because of its randomness, many of its properties cannot be derived exactly, for example the eigenvalues of the adjacency and Laplacian matrices. Thus deterministic models are often used to mimic complex networks (5), since their structural (6) and spectral (7) characteristics can be determined analytically. In addition to the small-world effect, another important feature of a network is its degree distribution. Many real-world networks exhibit a heavy-tailed degree distribution, while some have an exponential one (8); (9); (10). The well-known preferential attachment scheme (11); (12); (13); (14) successfully describes the growth of networks with a heavy-tailed degree distribution. This work, like the WS model (3), leads to a network with an exponential degree distribution.
Recently, graph (matrix) products have been applied to model graphs with the same properties as real-life networked systems, for example the Cartesian product (15), the dot product (16) and the Kronecker product (17); (18); (19); (20); (21). A merit of such methods is that graph and matrix products make the properties of the generated graphs easier to estimate.
In this paper we introduce a recursive way to generate small-world networks with an exponential degree distribution, based on the corona product of graphs. We obtain exact solutions for many structural properties of the networks. Moreover, we derive all the eigenvalues of their adjacency and Laplacian matrices, which are given recursively. Based on the obtained eigenvalues, we calculate the number of spanning trees as well as the Kirchhoff index of the networks.
2 Graph Construction
In this paper, we use the corona product to generate a small-world network model. The corona product and related graphs have been studied in (22); (23). Let $G$ be the underlying graph of a network. Suppose that $G$ is undirected and has vertex set $V$ and edge set $E$. We call the number of vertices the order of the graph and the number of edges the size of the graph, denoted $|V|$ and $|E|$, respectively.
Given two graphs $G_1$ and $G_2$, their corona product is defined as follows.
Definition 2.1.
Let $G_1$ and $G_2$ be two graphs with disjoint vertex sets, where $G_1$ has $n_1$ vertices and $G_2$ has $n_2$ vertices. Their corona product $G_1 \circ G_2$ is a new graph which consists of one copy of $G_1$ and $n_1$ copies of $G_2$. The $i$th vertex of $G_1$ is joined by a new edge to every vertex in the $i$th copy of $G_2$.
In this paper we investigate the case where the second graph in the product is a complete graph, which leads to the following definition of the recursive corona graph.
Definition 2.2.
Let $K_q$ be the complete graph on $q$ vertices. The $g$th generation of the recursive corona graph (RCG), denoted $G_q(g)$, is defined as the corona product of the previous generation of the RCG and $K_q$. More formally, $G_q(g) = G_q(g-1) \circ K_q$ for $g \ge 1$, where the initial graph $G_q(0)$ is a complete graph.
Figure 1 illustrates the construction process for a particular network .
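The construction can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names (`complete_graph`, `corona_with_complete`, `recursive_corona_graph`) are hypothetical, vertices are assumed to be labelled 0, 1, 2, ..., and the initial graph is assumed to be the complete graph on `q` vertices unless another order `n0` is passed.

```python
from itertools import combinations

def complete_graph(n):
    """Adjacency sets of the complete graph on vertices 0..n-1."""
    return {v: {u for u in range(n) if u != v} for v in range(n)}

def corona_with_complete(adj, q):
    """Corona product of the graph `adj` with K_q.

    Every existing vertex v receives its own copy of K_q,
    and v is joined to all q vertices of that copy.
    """
    new_adj = {v: set(nbrs) for v, nbrs in adj.items()}
    next_label = len(adj)                 # assumes labels 0..n-1
    for v in list(adj):
        copy = list(range(next_label, next_label + q))
        next_label += q
        for u in copy:
            new_adj[u] = {v}              # edge to the parent vertex
            new_adj[v].add(u)
        for a, b in combinations(copy, 2):  # edges inside the K_q copy
            new_adj[a].add(b)
            new_adj[b].add(a)
    return new_adj

def recursive_corona_graph(q, g, n0=None):
    """g-th generation; the initial graph K_{n0} is an assumption (default n0 = q)."""
    adj = complete_graph(q if n0 is None else n0)
    for _ in range(g):
        adj = corona_with_complete(adj, q)
    return adj

graph = recursive_corona_graph(q=2, g=2)
order = len(graph)
size = sum(len(nbrs) for nbrs in graph.values()) // 2
print(order, size)
```

Adjacency sets are used here only because they keep the corona step short; any graph library with a corona product would serve equally well.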
3 Structural properties
In this section we derive several important quantities of the RCG, showing that it is an appropriate model for small-world complex networks. Thanks to the deterministic construction of the RCG, we can give exact expressions for its properties. We give explicit results for its order, size, degree distribution, degree correlation, average distance, clustering coefficient, number of spanning trees and Kirchhoff index.
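We denote by $N_g$ and $M_g$ the order and the size of $G_q(g)$, respectively. Next we show how to derive these quantities. Let $L_v(g)$ and $L_e(g)$ denote the numbers of vertices and edges newly generated at step $g$. Since every existing vertex receives one new copy of $K_q$, we have $L_v(g) = q N_{g-1}$ for $g \ge 1$, which gives $N_g = (q+1) N_{g-1}$ and determines $N_g$ once the initial order $N_0$ is fixed. With respect to the size of the network, each new copy of $K_q$ contributes $q(q-1)/2$ internal edges and $q$ edges to the vertex it is attached to, so $L_e(g) = \frac{q(q+1)}{2} N_{g-1}$ and $M_g = M_{g-1} + L_e(g)$ for $g \ge 1$, together with the initial size $M_0$.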
Proposition 3.1.
The order and size of the RCG are, respectively,
(1) 
and
(2) 
The average degree is $\langle k \rangle = 2M_g/N_g$, which tends to $q+1$ for large $g$. Note that many real-life networks are sparse, and their average degrees tend to a constant value.
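The growth of the order and size can be followed numerically by iterating the construction rule: at each step every existing vertex receives $q$ new vertices and $q(q+1)/2$ new edges. The sketch below is illustrative only; `order_and_size` is a hypothetical helper, and the initial values `n0` and `m0` are assumptions (here those of the complete graph on $q$ vertices).

```python
def order_and_size(q, g, n0, m0):
    """Iterate the order/size recursion of the corona construction g times."""
    n, m = n0, m0
    for _ in range(g):
        m += n * q * (q + 1) // 2   # q(q-1)/2 internal + q attaching edges per old vertex
        n *= q + 1                  # each old vertex brings q new vertices
    return n, m

q = 3
n0, m0 = q, q * (q - 1) // 2        # assumed initial graph: K_q
for g in (1, 5, 10):
    n, m = order_and_size(q, g, n0, m0)
    print(g, n, m, 2 * m / n)       # the last column approaches q + 1
```

The limit $q+1$ of the average degree does not depend on the choice of the initial graph, since the contribution of the initial edges becomes negligible as $g$ grows.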
3.1 Degree distribution
The degree distribution $P(k)$ of a network gives the fraction of vertices with degree $k$. It is a very important characteristic of a graph and is essential to the analysis of many other structural properties.
The cumulative degree distribution (8) is defined as
$P_{\mathrm{cum}}(k) = \sum_{k' \ge k} P(k'),$
which is often used to analyse the degree distribution of a graph. The quantity $P_{\mathrm{cum}}(k)$ gives the fraction of vertices whose degree is greater than or equal to $k$. In addition, networks whose degree distributions are exponential, $P(k) \sim e^{-k/\kappa}$, also have an exponential cumulative distribution with the same exponent:
(3) $P_{\mathrm{cum}}(k) = \sum_{k' \ge k} P(k') \approx \sum_{k' \ge k} e^{-k'/\kappa} \sim e^{-k/\kappa}.$
Next we investigate the degree distribution of $G_q(g)$. At time $g = 0$, the network has $N_0$ vertices of degree $N_0 - 1$. Now we study the degree of a vertex $i$ at step $g$, denoted $k_i(g)$, and look in detail at how this quantity evolves. Assume that vertex $i$ is added to the network at step $t_i$ ($t_i \ge 0$). For any $t_i \ge 1$, we have $k_i(t_i) = q$, where $q-1$ edges link to the other vertices in the same copy of $K_q$ and the remaining edge links to a vertex of $G_q(t_i - 1)$. At every subsequent step, every existing vertex increases its degree by $q$.
Theorem 3.2.
The cumulative degree distribution of the graph $G_q(g)$ follows an exponential distribution: $P_{\mathrm{cum}}(k) \sim (q+1)^{-k/q}$.
Proof.
The degree of vertex at step , denoted as , can be written as
(4) 
Thus we have
(5) 
This means that the numbers of vertices with the degree equal to are, respectively, .
For a certain value of degree , we have where . Therefore we can find
(6)  
For large we have
∎
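The exponential decay can be checked numerically without building the graph, using only the two rules stated above: at every step each existing vertex gains $q$ neighbours, and each new vertex starts with degree $q$. The following sketch is an illustration under stated assumptions: the helper names are hypothetical and the initial graph is assumed to be a complete graph on `n0` vertices (here `n0 = q`). For the degrees of non-initial vertices, which are multiples of $q$, the computed cumulative distribution is compared with $(q+1)^{1-k/q}$.

```python
import math
from collections import Counter

def degree_counts(q, g, n0):
    """Number of vertices of each degree after g steps (initial graph K_{n0} assumed)."""
    counts = Counter({n0 - 1: n0})
    for _ in range(g):
        n = sum(counts.values())
        counts = Counter({k + q: c for k, c in counts.items()})  # old vertices gain q
        counts[q] += q * n                                       # new vertices have degree q
    return counts

def cumulative(counts):
    """Fraction of vertices with degree >= k, for every occurring degree k."""
    total = sum(counts.values())
    result, running = {}, 0
    for k in sorted(counts, reverse=True):
        running += counts[k]
        result[k] = running / total
    return result

q, g = 3, 6
p_cum = cumulative(degree_counts(q, g, n0=q))
for k in sorted(p_cum):
    if k % q == 0:                      # degrees of non-initial vertices
        expected = (q + 1) ** (1 - k / q)
        print(k, p_cum[k], expected, math.isclose(p_cum[k], expected))
```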
3.2 Degree correlation
One important measure of degree correlation is the average degree of the nearest neighbours of vertices with degree $k$, which we denote by $k_{nn}(k)$. If $k_{nn}(k)$ increases with $k$, vertices tend to connect to vertices of similar or larger degree, and the graph is called assortative. For vertices of degree $k$ generated at a given step, this quantity can be written as
(7) 
According to Eq.(3.1), we can express it as
Since , therefore we have , which means
(8) 
As for the initial vertices, every one of them has the same distribution of neighbour degrees. This leads to
(9) 
which yields
(10) 
By checking the results we can see that the considered graph is assortative.
3.3 Average distance
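Given a graph $G$ of order $N$, its average distance (or mean distance) is defined as $\mu(G) = \frac{2}{N(N-1)} \sum_{\{u,v\}} d(u,v)$, where the sum runs over all unordered pairs of distinct vertices and $d(u,v)$ is the distance between the vertices $u$ and $v$.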
Theorem 3.3.
The average distance of graph is
(11) 
Proof.
To begin with, consider the sum of distances between all pairs of vertices in the RCG of generation $g$. We also use the sum of distances between all pairs $(u, v)$ in which $u$ belongs to one vertex set and $v$ belongs to another vertex set disjoint from it.
To exploit the recursive construction of the RCG, we classify the vertex pairs into four categories and sum the distances within each category separately.
The first category consists of the pairs within the same copy of $K_q$ added to the network at step $g$. The second category consists of the pairs $(u, v)$ where $u$ is selected from one of the copies of $K_q$ added at step $g$ and $v$ is selected from any other copy added at the same step. The third category consists of the pairs where $u$ is a new vertex and $v$ is a vertex of the previous generation. The fourth category consists of the pairs where both $u$ and $v$ are from the previous generation of the graph. Thus we have the following equations:
(12)  
(13)  
(14)  
(15)  
(16) 
Combining these recursive expressions, we have:
(17) 
with the initial condition we get the result
(18) 
Dividing by yields Eq.(11). For large , we have , which increases logarithmically with the network order.
∎
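The four-category argument can be checked on small instances. The sketch below is an illustration, not the authors' derivation: `next_distance_sum` encodes one way of writing the recursion implied by the categories (a new vertex is at distance 1 from its parent and from the other vertices of its copy, and every path leaving a copy passes through the parent), and the result is compared with a brute-force breadth-first search. All function names are hypothetical, and the initial graph is assumed to be the complete graph on $q$ vertices.

```python
from collections import deque
from itertools import combinations

def complete_graph(n):
    return {v: {u for u in range(n) if u != v} for v in range(n)}

def corona_with_complete(adj, q):
    """Attach a fresh copy of K_q to every vertex of `adj` (labels 0..n-1)."""
    new_adj = {v: set(nbrs) for v, nbrs in adj.items()}
    label = len(adj)
    for v in list(adj):
        copy = list(range(label, label + q))
        label += q
        for u in copy:
            new_adj[u] = {v}
            new_adj[v].add(u)
        for a, b in combinations(copy, 2):
            new_adj[a].add(b)
            new_adj[b].add(a)
    return new_adj

def distance_sum(adj):
    """Sum of shortest-path distances over unordered vertex pairs (BFS from each vertex)."""
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        total += sum(dist.values())
    return total // 2

def next_distance_sum(s_prev, n_prev, q):
    """Distance sum after one corona step, from the previous sum and order."""
    same_copy = n_prev * q * (q - 1) // 2                       # pairs inside one new K_q
    different_copies = q * q * (s_prev + n_prev * (n_prev - 1))  # d(parents) + 2
    new_old = q * (2 * s_prev + n_prev * n_prev)                 # d(parent, v) + 1
    return same_copy + different_copies + new_old + s_prev       # plus old-old pairs

checks = []
for q in (2, 3):
    adj = complete_graph(q)          # assumed initial graph: K_q
    s = distance_sum(adj)
    for g in range(1, 4):
        predicted = next_distance_sum(s, len(adj), q)
        adj = corona_with_complete(adj, q)
        s = distance_sum(adj)
        checks.append(s == predicted)
        print(q, g, s, predicted)
```

Dividing the recursion by the square of the order shows that the normalized distance sum increases by a constant amount per generation, which is consistent with the logarithmic growth of the average distance in the network order.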
3.4 Clustering coefficient
The clustering coefficient (3) is another crucial quantity used to characterize network structure. It and related quantities have been studied extensively for both graph models and real graphs (3); (24); (25); (26).
The clustering coefficient of a vertex $i$ with degree $k_i$ is defined as
(19) $C_i = \frac{2 e_i}{k_i (k_i - 1)},$
where $e_i$ is the number of edges between the neighbours of vertex $i$. The network clustering coefficient is defined as the average of $C_i$ over all $N$ vertices. That is,
(20) $C = \frac{1}{N} \sum_{i} C_i.$
Theorem 3.4.
Let $i$ be a vertex of $G_q(g)$ whose degree is $k$. Unless $i$ is an initial vertex, its clustering coefficient is
(21) $C(k) = \frac{q-1}{k-1}.$
Proof.
Recall the intermediate result from the calculation of the degree distribution, which gives the number of vertices of each degree. Except for the initial vertices, all vertices follow the same rule: a vertex with degree $k$ has $k$ neighbours, which are evenly distributed in $k/q$ clusters, and each cluster forms a complete graph $K_q$. Thus the clustering coefficient of such a vertex is derived as
(22) 
and for initial vertices
(23) 
Theorem 3.4 follows immediately. ∎
Theorem 3.5.
The clustering coefficient of the RCG converges to
(24) 
when the network order is large enough. In the expression, $\Phi$ is the Lerch transcendent function. For large $q$, the clustering coefficient tends to $1$.
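The local rule behind Theorem 3.4 can be verified directly on a small instance: the neighbours of a non-initial vertex of degree $k$ split into $k/q$ cliques of size $q$, which gives a clustering coefficient of $(q-1)/(k-1)$. The sketch below is illustrative only; the function names are hypothetical, exact rational arithmetic is used to avoid rounding, and the initial graph is assumed to be the complete graph on $q$ vertices (labels `0..n0-1`).

```python
from fractions import Fraction
from itertools import combinations

def complete_graph(n):
    return {v: {u for u in range(n) if u != v} for v in range(n)}

def corona_with_complete(adj, q):
    """Attach a fresh copy of K_q to every vertex of `adj` (labels 0..n-1)."""
    new_adj = {v: set(nbrs) for v, nbrs in adj.items()}
    label = len(adj)
    for v in list(adj):
        copy = list(range(label, label + q))
        label += q
        for u in copy:
            new_adj[u] = {v}
            new_adj[v].add(u)
        for a, b in combinations(copy, 2):
            new_adj[a].add(b)
            new_adj[b].add(a)
    return new_adj

def local_clustering(adj, v):
    """Fraction of neighbour pairs of v that are themselves adjacent."""
    nbrs = sorted(adj[v])
    k = len(nbrs)
    if k < 2:
        return Fraction(0)
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return Fraction(2 * links, k * (k - 1))

q, g = 3, 3
n0 = q                                   # assumed initial graph: K_q
adj = complete_graph(n0)
for _ in range(g):
    adj = corona_with_complete(adj, q)

# Non-initial vertices carry labels >= n0 in this construction.
mismatches = [v for v in adj
              if v >= n0 and local_clustering(adj, v) != Fraction(q - 1, len(adj[v]) - 1)]
average = sum(local_clustering(adj, v) for v in adj) / len(adj)
print(len(mismatches), float(average))
```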
3.5 Spanning trees
Next we derive the number of spanning trees in graph .
Theorem 3.6.
The number of spanning trees of $G_q(g)$ is
(27) 
Proof.
According to Cayley's formula (27), the number of spanning trees of the complete graph $K_n$ is $n^{n-2}$. Since all vertices of a copy of $K_q$ added to the graph are connected to one vertex of the original graph, these vertices together with that vertex form a new complete graph $K_{q+1}$. Therefore we have the following recursive relation for the number of spanning trees of $G_q(g)$:
(28) 
Together with the initial condition , we can derive the expression for ;
(29) 
∎
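The multiplicative structure of this argument can be checked with the matrix-tree theorem. Each corona step attaches one copy of $K_{q+1}$ at every existing vertex, and each such block contributes a factor $(q+1)^{q-1}$ by Cayley's formula. The sketch below is an illustration under assumptions: the function names are hypothetical, the determinant is computed exactly over the rationals, and the initial graph is assumed to be the complete graph on $q$ vertices.

```python
from fractions import Fraction
from itertools import combinations

def complete_graph(n):
    return {v: {u for u in range(n) if u != v} for v in range(n)}

def corona_with_complete(adj, q):
    """Attach a fresh copy of K_q to every vertex of `adj` (labels 0..n-1)."""
    new_adj = {v: set(nbrs) for v, nbrs in adj.items()}
    label = len(adj)
    for v in list(adj):
        copy = list(range(label, label + q))
        label += q
        for u in copy:
            new_adj[u] = {v}
            new_adj[v].add(u)
        for a, b in combinations(copy, 2):
            new_adj[a].add(b)
            new_adj[b].add(a)
    return new_adj

def det(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            if factor:
                for c in range(col, n):
                    m[r][c] -= factor * m[col][c]
    return result

def spanning_trees(adj):
    """Matrix-tree theorem: a cofactor of the Laplacian counts the spanning trees."""
    vertices = sorted(adj)[1:]            # delete the row and column of one vertex
    index = {v: i for i, v in enumerate(vertices)}
    lap = [[0] * len(vertices) for _ in vertices]
    for v in vertices:
        lap[index[v]][index[v]] = len(adj[v])
        for u in adj[v]:
            if u in index:
                lap[index[v]][index[u]] = -1
    value = det(lap)
    return value.numerator // value.denominator

checks = []
for q, generations in ((2, 2), (3, 1)):
    adj = complete_graph(q)               # assumed initial graph: K_q
    trees = spanning_trees(adj)
    for g in range(1, generations + 1):
        predicted = trees * (q + 1) ** ((q - 1) * len(adj))
        adj = corona_with_complete(adj, q)
        trees = spanning_trees(adj)
        checks.append(trees == predicted)
        print(q, g, trees, predicted)
```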
3.6 Kirchhoff index
Resistance distance is an important characteristic of a graph, which reflects many of its dynamic properties. The Kirchhoff index (28) of a graph is the sum of effective resistances between all vertex pairs in the associated electrical network, which is obtained from the graph by replacing each edge with a unit resistor. Denote the effective resistance between vertices and as or , then the Kirchhoff index of graph is defined as
(30) 
We denote the Kirchhoff index of graph by .
Theorem 3.7.
The Kirchhoff index of is
(31) 
Proof.
For two disjoint vertex sets, we consider the sum of all effective resistances between pairs of vertices with one vertex in each set. As in the calculation of the average distance, we classify the vertex pairs into four categories, defined exactly as before, and sum the effective resistances within each category separately. We have the following equations:
(32)  
(33)  
(34)  
(35)  
(36) 
which yields
(37) 
Notice that the effective resistance between two vertices $u$ and $v$ in a complete graph $K_n$ is $2/n$, since the potentials at all other vertices are identical when a potential difference is imposed between $u$ and $v$.
Along with the initial condition we can deduce
(38)  
This completes the proof. ∎
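The same four categories give a recursion for the Kirchhoff index, because every new copy of $K_q$ hangs on a single vertex: resistances add along the path through that vertex, and the resistance between two vertices of the resulting $K_{q+1}$ block is $2/(q+1)$. The sketch below checks one way of writing this recursion against an exact computation. It is an illustration under assumptions: all names are hypothetical, the effective resistance is obtained from the standard ratio of Laplacian minors (two vertices removed over one vertex removed), and the initial graph is assumed to be the complete graph on $q$ vertices.

```python
from fractions import Fraction
from itertools import combinations

def complete_graph(n):
    return {v: {u for u in range(n) if u != v} for v in range(n)}

def corona_with_complete(adj, q):
    """Attach a fresh copy of K_q to every vertex of `adj` (labels 0..n-1)."""
    new_adj = {v: set(nbrs) for v, nbrs in adj.items()}
    label = len(adj)
    for v in list(adj):
        copy = list(range(label, label + q))
        label += q
        for u in copy:
            new_adj[u] = {v}
            new_adj[v].add(u)
        for a, b in combinations(copy, 2):
            new_adj[a].add(b)
            new_adj[b].add(a)
    return new_adj

def det(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            if factor:
                for c in range(col, n):
                    m[r][c] -= factor * m[col][c]
    return result

def laplacian_minor(adj, removed):
    """Determinant of the Laplacian with the rows/columns in `removed` deleted."""
    vertices = [v for v in sorted(adj) if v not in removed]
    index = {v: i for i, v in enumerate(vertices)}
    lap = [[0] * len(vertices) for _ in vertices]
    for v in vertices:
        lap[index[v]][index[v]] = len(adj[v])
        for u in adj[v]:
            if u in index:
                lap[index[v]][index[u]] = -1
    return det(lap)

def kirchhoff_index(adj):
    """Sum of effective resistances, r_uv = minor(u, v) / minor(u)."""
    vertices = sorted(adj)
    trees = laplacian_minor(adj, {vertices[0]})
    return sum(laplacian_minor(adj, {u, v}) for u, v in combinations(vertices, 2)) / trees

def next_kirchhoff(kf_prev, n_prev, q):
    """Kirchhoff index after one corona step, from the previous index and order."""
    extra = n_prev * q * (q - 1) // 2 + q * q * n_prev * (n_prev - 1) + q * n_prev * n_prev
    return (q + 1) ** 2 * kf_prev + Fraction(2, q + 1) * extra

q = 2
adj = complete_graph(q)                   # assumed initial graph: K_q
kf = kirchhoff_index(adj)
checks = []
for g in (1, 2):
    predicted = next_kirchhoff(kf, len(adj), q)
    adj = corona_with_complete(adj, q)
    kf = kirchhoff_index(adj)
    checks.append(kf == predicted)
    print(g, kf, predicted)
```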
4 Spectral analysis
By convention, the (unweighted) adjacency matrix $A$ of a graph $G$ is the matrix whose entry $a_{ij}$ is the number of edges with endpoints $i$ and $j$. The degree matrix $D$ of $G$ is the diagonal matrix whose $i$th diagonal entry equals the degree of vertex $i$. We call $L = D - A$ the Laplacian matrix of $G$. These matrices determine the structure of the graph, and the eigenvalues of $A$ and $L$ reflect many structural properties, which have a remarkable impact on the dynamical processes running on the network.
Definition 4.1.
Given the adjacency matrix of , we define the spectra of as
(39) 
Similarly, we have:
Definition 4.2.
Given as the Laplacian matrix of , we define its Laplacian spectra
(40) 
4.1 Spectra of Adjacency Matrix
Theorem 4.1.
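The relation between the adjacency spectrum of $G_q(g)$ and that of $G_q(g-1)$ is as follows. Each adjacency eigenvalue $\mu_i$ of $G_q(g-1)$ generates the two eigenvalues
$\lambda_i^{\pm} = \frac{\mu_i + q - 1 \pm \sqrt{(\mu_i - q + 1)^2 + 4q}}{2}$
with multiplicity $1$ for $i = 1, 2, \ldots, N_{g-1}$, and
$\lambda = -1$
with multiplicity $(q-1)N_{g-1}$.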
4.2 Spectra of Laplacian Matrix
Theorem 4.2.
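The relation between the Laplacian spectrum of $G_q(g)$ and that of $G_q(g-1)$ is as follows. Each Laplacian eigenvalue $\mu_i$ of $G_q(g-1)$ generates the two eigenvalues
$\lambda_i^{\pm} = \frac{\mu_i + q + 1 \pm \sqrt{(\mu_i + q + 1)^2 - 4\mu_i}}{2}$
with multiplicity $1$ for $i = 1, 2, \ldots, N_{g-1}$, and
$\lambda = q + 1$
with extra multiplicity $(q-1)N_{g-1}$.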
Note that $\mu = 0$ in the first part generates an eigenvalue equal to $q+1$ with multiplicity $1$ in iteration $g$. So the actual multiplicity of $q+1$ is $(q-1)N_{g-1} + 1$ for any $g \ge 1$. The proof of Theorem 4.2 follows from the methods in (22); (23); (29). For convenience of the following discussion we give a similar proof here:
Proof.
The Laplacian matrix of is
(41) 
Let be the Laplacian eigenvectors of corresponding to the eigenvalues , respectively. For , let
Note that , are Laplacian eigenvalues of corresponding to the eigenvectors
respectively. In fact is obtained by solving
(42) 
Thus we can derive the following equations:
(43)  
(44) 
From Eq.(44) we can obtain that . Therefore we can substitute Eq.(44) into Eq.(43), we have:
(45) 
which leads to the result of the first part of the theorem.
If the Laplacian eigenvalues of are correlated with the eigenvectors , respectively, then for , we have
(46) 
This completes the proof. ∎
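Both spectral relations can be tested exactly for one corona step without computing any eigenvalue numerically. For a graph $G$ with adjacency eigenvalue $\mu$, the standard corona argument with $K_q$ gives the quadratic $\lambda^2 - (\mu + q - 1)\lambda + \mu(q-1) - q = 0$ together with the eigenvalue $-1$; for a Laplacian eigenvalue $\mu$ it gives $\lambda^2 - (\mu + q + 1)\lambda + \mu = 0$ together with the eigenvalue $q+1$. The sketch below multiplies these factors into a predicted characteristic polynomial and compares it with $\det(xI - A)$ and $\det(xI - L)$ at more integer points than the degree. It is an illustration under assumptions: all names are hypothetical, and the starting graph is taken to be $K_q$, whose adjacency eigenvalues are $q-1$ and $-1$ and whose Laplacian eigenvalues are $0$ and $q$, so that all coefficients are integers.

```python
from fractions import Fraction
from itertools import combinations

def complete_graph(n):
    return {v: {u for u in range(n) if u != v} for v in range(n)}

def corona_with_complete(adj, q):
    """Attach a fresh copy of K_q to every vertex of `adj` (labels 0..n-1)."""
    new_adj = {v: set(nbrs) for v, nbrs in adj.items()}
    label = len(adj)
    for v in list(adj):
        copy = list(range(label, label + q))
        label += q
        for u in copy:
            new_adj[u] = {v}
            new_adj[v].add(u)
        for a, b in combinations(copy, 2):
            new_adj[a].add(b)
            new_adj[b].add(a)
    return new_adj

def det(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            if factor:
                for c in range(col, n):
                    m[r][c] -= factor * m[col][c]
    return result

def char_poly_at(matrix, x):
    """Value of det(x I - matrix) at the point x."""
    n = len(matrix)
    return det([[(x if i == j else 0) - matrix[i][j] for j in range(n)] for i in range(n)])

def predicted_adjacency(x, q):
    value = 1
    for mu, mult in ((q - 1, 1), (-1, q - 1)):          # adjacency spectrum of K_q
        value *= (x * x - (mu + q - 1) * x + mu * (q - 1) - q) ** mult
    return value * (x + 1) ** ((q - 1) * q)             # eigenvalue -1 from the copies

def predicted_laplacian(x, q):
    value = 1
    for mu, mult in ((0, 1), (q, q - 1)):               # Laplacian spectrum of K_q
        value *= (x * x - (mu + q + 1) * x + mu) ** mult
    return value * (x - (q + 1)) ** ((q - 1) * q)       # eigenvalue q + 1 from the copies

checks = []
for q in (2, 3):
    adj = corona_with_complete(complete_graph(q), q)    # one step from the assumed K_q
    order = sorted(adj)
    a = [[1 if u in adj[v] else 0 for u in order] for v in order]
    lap = [[(len(adj[v]) if u == v else 0) - a[i][j] for j, u in enumerate(order)]
           for i, v in enumerate(order)]
    for x in range(-3, len(order) + 4):
        checks.append(char_poly_at(a, x) == predicted_adjacency(x, q))
        checks.append(char_poly_at(lap, x) == predicted_laplacian(x, q))
print(all(checks))
```

Two monic polynomials of the same degree that agree at more points than their degree are identical, so agreement at these sample points confirms the predicted spectra for this single step.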
Next we use the results on the Laplacian spectra to prove Theorems 3.6 and 3.7. First we give an alternative proof of Theorem 3.6.
Proof.
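It is known that the number of spanning trees of a connected graph $G$ can be written as (30); (31)
(47) $N_{\mathrm{st}}(G) = \frac{1}{N} \prod_{i=2}^{N} \lambda_i,$
where $N$ is the number of vertices and $\lambda_i$ ($i = 1, 2, \ldots, N$) are the eigenvalues of the Laplacian matrix of $G$. Since the graph is connected, $\lambda_1 = 0$ is the unique zero eigenvalue, and $\lambda_i$, $i = 2, \ldots, N$, are the nonzero eigenvalues of the graph Laplacian.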
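Theorem 4.2 tells us that the Laplacian spectrum of $G_q(g)$ consists of two parts. For the first part, we can derive from Eq.(45) that, in iteration $g$, each Laplacian eigenvalue $\mu$ of $G_q(g-1)$ generates two eigenvalues $\lambda^{+}$ and $\lambda^{-}$, which satisfy the relations $\lambda^{+} \lambda^{-} = \mu$ and $\lambda^{+} + \lambda^{-} = \mu + q + 1$. In particular, the trivial eigenvalue $0$ generates $0$ and $q+1$. As for the second part, there is an eigenvalue $q+1$ with multiplicity $(q-1)N_{g-1}$. We denote by $\Sigma_g$ the sum of all nonzero Laplacian eigenvalues of $G_q(g)$ and by $\Pi_g$ the product of all nonzero Laplacian eigenvalues of $G_q(g)$. Then we can obtain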
(48) 
Eq.(48) and the initial condition yield
(49) 
Therefore
∎
The result agrees with what we derived using the combinatorial method.
In the following we give an alternative proof of Theorem 3.7 using the spectra information.
Proof.
The Kirchhoff index of a connected graph $G$ can be expressed as (32); (33):
(50) $Kf(G) = N \sum_{i=2}^{N} \frac{1}{\lambda_i},$
where $N$ and $\lambda_i$ are the same as in the previous definition. Let
(51) 
We can follow the clue of the previous analysis by separating the eigenvalues of its Laplacian matrix into two parts. Recall that the eigenvalues of the Laplacian consist of two parts and as defined by Theorem 4.2. Assume that . For the first part of the eigenvalues of , we denote them as and , . Suppose the original eigenvalue in , which is correlated with and , is . Then
(52) 
where
(53) 
(54) 
(55) 
and
(56) 
Accordingly we can obtain