Unveiling the Multifractal Structure of Complex Networks
Abstract
The fractal nature of graphs has traditionally been investigated by using the network’s nodes as the basic units. Here, instead, we propose to concentrate on the graph’s edges, and introduce a practical and computationally not demanding method for revealing changes in the fractal behavior of networks, and particularly for allowing distinction between monofractal, quasi monofractal, and multifractal structures. We show that degree homogeneity plays a crucial role in determining the fractal nature of the underlying network, and report on six different proteinprotein interaction networks along with their corresponding random networks. Our analysis allows to identify varying levels of complexity in the species.
[name=SI Figure]suppfigure \DeclareFloatingEnvironment[name=SI equation]suppequation \DeclareFloatingEnvironment[name=SI Table]supptable
Unveiling the Multifractal Structure of Complex Networks
Sarika Jalan, Alok Yadav, Camellia Sarkar & Stefano Boccaletti
Complex Systems Lab, Discipline of Physics, Indian Institute of Technology Indore, Khandwa Road, Simrol, Indore 453552, India
Centre for Biosciences and Biomedical Engineering, Indian Institute of Technology Indore, Khandwa Road, Simrol, Indore 453552, India
CNRInstitute of Complex Systems, Via Madonna del Piano, 10, 50019 Sesto Fiorentino, Florence, Italy
The Italian Embassy in Israel, 25 Hamered st., 68125 Tel Aviv, Israel
Corresponding author email: sarikajalan9@gmail.com
The fractal geometry of nature was first described by Mandelbrot already in 1967 [1], and the approach has been extensively used, since then, to gain insights into the underlying scaling of a variety of visually complex structures, like fracture surfaces of metals [2], strange attractors [3, 4], diffusion [5] and medical imaging [6] patterns, galaxies [7], and atomic spectra [8], just to quote a few examples. In complex networks, so far the focus has been investigating selfsimilarity [9] and fractal structure of skeleton [10], as well as growth models which capture the observed fractal behavior[11, 12].
A fundamental issue is identifying whether a system is a mono or a multifractal, i.e. whether or not a unique fractal scaling spans all the different system’s parts, regions or components. Multifractal analysis requires considering a physical measure: the number of nodes within a box of size, say , has been used so far to analyze how the distribution of such number of nodes scales in a network, as the box size increases [10]. In this Letter, instead, we quarter the focus on the scaling of the number of edges in a partitionbox of the network’s adjacency matrix . Precisely, we consider a spatial distribution of the network’s edges (instead of the network’s nodes), this way making edges (the entries of the ’s plane) as the basic units for the evaluation of fractal dimensions.
The aim is introducing a practical and computationally non demanding procedure, able to distinguish mono from multifractal characters in complex networks. To do so, the distribution of 1’s in square boxes of length in is analyzed by the use of the box counting method (BCM) [13, 14, 15], which considers only non overlapping boxes, thus preventing specific parts of the matrix from getting overweighted due to systematic overcounting. Furthermore, the robustness of the results is pledged by reshuffling the nodes, i.e. by avoiding any form of biasness generated by a specific node indexing. Precisely, the supplementary material contains evidence that each shuffling step provides a new ensemble of adjacency matrices [16].
A schematic representation of the procedure is sketched in Fig. A.1. We start with partitioning the adjacency matrix into size boxes, and counting the number of boxes with at least one nonzero entry (edge), with varying from 2 to . exhibits typically the scaling , with giving the dimension of the network. If is a noninteger number, the network is said to be fractal. , however, gives no information whatsoever on the system’s multifractality, which should be accounted for, instead, by a multifractal approach [13]. Let therefore digitalize into size boxes, and let be now the number of ‘1’ entries in the element of the partition. The occurrence probability of 1’s in the box of size , denoted by , ranges therefore between and . At each value, the moment (for all real numbers) of this probability is given by
(1) 
The scaling exponent is given by
(2) 
and is obtained from the slope of vs. at all value. Notice that is here calculated always as ensemble average over the values obtained by shuffling the node indices. The moment dimension is then given by
(3) 
For each box of size and occupation probability , the singularity strength is given by , and at every , is evaluated as . The singularity spectrum, , is related with by means of the Legendre transform
(4) 
with being the Hlder exponent, and indicating the dimension of the subset scaling with .
A multifractal structure is indicated by the following marks [13]: i) multiple slopes of vs. ; ii) a non constant value of vs. , and iii) vs. covering a broad range instead of being accumulated at nearby noninteger values of .
We start by briefly discussing the effect of different reshuffling of the nodes’ labeling, in particular (i) random reshuffling, and (ii) degreebased reshuffling. For an size box (located at the coordinate in the adjacency matrix plane and denoted as ) the probability of occurrence of 1’s is in case i) (detailed derivations are provided in Supplementary Material), and is therefore independent of the network architecture. Case ii) leads instead to interesting behavior, as the probability of occurrence of 1’s depends on the heterogeneity present in the networks. For 1d lattices, where no degree heterogeneity is present, random and degreebased reshuffling provide the same results [16]. At variance, ErdösRényi (ER) and scalefree (SF) networks, which display varying levels of degree heterogeneity, exhibit different behaviors for the two cases. For the degreebased reshuffling one has , where is the degree of a node inside the box and is the probability of occurrence of degree nodes in the network. Note that depends upon the degree as well as the probability density of degree of all the nodes inside the box [16]. In the following, we first sort the nodes of a network in decreasing order of their degrees and assign their indices. As the usual case is that many nodes have the same degree, in order to avoid the possibility of the preference being conferred on a particular node in a set of the same degree nodes, we carry out several realizations of the shuffling of indices among the nodes with the same degree.
We now present the results for 1d lattices, ER and SF networks, and real world proteinprotein interaction (PPI) networks of six different species. With a range of values spanning from to , is evaluated for values ranging between to . The slope of (calculated using Eq. 1) versus (on a doubly logarithmic scale) provides the estimation of the value of the scaling exponent at each , denoted as (Eq. 2). For all networks, the value of the scaling exponent is zero at and nonzero at other values (see Fig. 2). The range can be divided, then into the left () and the right () region. A same value of the slope of in both regions indicates a monofractal nature of the network, while different values are a signature of a multifractal structure. For a regular lattice, all nodes have the same degree. This kind of uniformity results in a single value of the scaling exponent on both left and right regions (see Fig. 3(a)). While the slopes of are indicative measures of the multi, mono or nonfractality of the system, the range of values gives the dimension(s) of the corresponding networks. For 1d lattices, the values shrink to a very narrow range about , confirming again the monofractal character of the graph, while the singularity spectrum, (calculated using Eq. 4) shows that, for a certain range of , most of the values are accumulated in a nearby noninteger region, with only a few points deviating from this behavior.
The case of ER networks is far different. There, by construction, a new edge does not have preference to connect with specific nodes [17], leading to approximately similar degrees of all the nodes, and to an adjacency matrix where the 1 entries are dispersed in the entire plane, without clusters or patterns. The associated homogeneity gives nearly close values of scaling exponents on the the left and right regions on performing multifractal analysis (see Fig. 3(a) and consult the Supplementary Material for the analysis of the impact of degree homogeneity on multifractal dimensions). Further, on plotting vs. , one finds find that the right region exhibits exactly the same behavior as that of the 1d lattice, whereas the left region deviates significantly from a monofractal pattern (see Fig. 3(b)). The conclusion is that ER networks can be said to exhibit a multifractal nature in the left region of the spectrum, and a quasimonofractal character for . When SF networks are taken into consideration, the growth mechanism producing them has a bias towards high degree nodes getting more and more edges (as preferential attachment induces new nodes to get linked to existing high degree nodes [17]), and gives a large spread in the degrees of the network’s units. In the adjacency matrix, this is reflected by few rows and columns (corresponding to high degree nodes) with a very large number of entries, with the majority of rows and columns having only very few 1’s. Two large structures in the adjacency matrix then arise: one is a strip structure with a dense accumulation of 1’s, and another is the sparse dispersion of entries in rest of the space. The mixing of these two patterns yields different scaling exponents, and is indicative of a multifractal behavior (see Fig. 3(a)).
The probability of occurrence of 1’s in the box of size , , ranges between and . On considering the moments of these probabilities (Eq. 1), one finds that in the positive region the values dominate over the , while in the negative region the values exhibit dominance over values with increasing magnitude of . This means that the behavior of the densely packed boxes of entries in the adjacency matrix plane are reflected in the right region, while the sparsely packed boxes are determining the behavior of the left region. The conclusion is that the relevant character of the system (in terms of high density of edges) is described by the right region, while generally small fluctuations in the system’s fractal properties are accounted for the left region (as very small changes in the moments of the probabilities get magnified in the negative region).
In 1d lattices, all nodes have exactly the same degree, thus rendering uniform the distribution of 1’s across the adjacency matrix planes. As a consequence, there are only two probabilities of occurrence of 1’s: for all the boxes in the offdiagonal region, the probability is , while for all the boxes lying in the diagonal region, the probability is . The slight kink in the values in the left region can be therefore attributed to the probabilities corresponding to the diagonal regions.
The degree distribution of a ER random network follows a Poisson statistics. Though most of the nodes have degree close to , there are nodes having higher and lower degrees as well. The probability of occurrence of 1’s in a box depends on the degree distribution, and deviates from that of a 1d lattice. The deviation in the homogeneity of node degrees is reflected in the lower and upper bounds for the probability of occurrence of 1’s: for all the boxes in the offdiagonal region, the probability is , whereas for those in the diagonal region, the probability is . Though the distribution of 1’s in the adjacency matrix plane of the ER random networks appears very similar to that of a 1d lattice (se Supplementary Material), the deviation in homogeneity of node degrees results in significant deviations within the left region of , as small changes are actually magnified in this region..
For SF networks, two types of dominant structures (patterns) exist in the adjacency matrix plane. The first structure is made up of densely packed rows and columns of 1’s, while the second structure is formed by sparse population of 1’s. The mixing of these two structures give rise to wide variety of probabilities ’s. For positive (negative) values, the principal contribution to the multifractal nature of the graph comes from the first (the second) structure. Thus, for negative values, SF networks exhibit the same behavior as ER random networks, while their behavior strongly differs from that of ER graphs for positive values.
Finally, we apply our measure for shedding light on the structure and nature of associations in real world proteinprotein interaction (PPI) networks. Namely, we consider the PPI networks of six different species: the C. elegans, the D. melanogaster, the E. coli, the H. pylori, the H. sapiens and the S. cerevisiae. In order to draw a fair comparison among the different PPI networks, we construct corresponding ER random networks having the same size and average degree as the real networks, and perform a comparative multifractal analysis for 100,000 rounds of node reshufflings, and 50 realizations. On plotting vs. for the PPI networks and their corresponding random controls, one finds that for E. coli and S. cerevisiae both left and right regions deviate significantly from their random counterparts (see Fig. 4(c) and (f)), indicating a genuine multifractal behavior of the underlying networks, which is quite expected as all these species are known to display scalefree properties in the degree distribution [18]. In the cases of D. melanogaster and H. pylori, the extent of deviation of the PPI networks from their corresponding ER graphs is less evident (see Fig. 4(b) and (d)). Interestingly, for C. elegans, the PPI network very closely resembles the behaviour of its corresponding random network (see Fig. 4(a)).
In the case of H. sapiens, the behavior is very different as compared to all other PPI networks. Indeed, in contrast to the other five graphs, the H. sapiens PPI network exhibits a fractal behavior intermediate between ER random networks and 1d lattices in the left region (see Fig. 4(e) and 3(b)), indicating some kind of structural homogeneity in the underlying system, as also revealed recently through an analysis based on random matrix theory [18]. The multifractal analysis supports and corroborates the hypotheses that real world biological systems are formed based on different designing principles, which are then reflected in difference in their dimensionality.
In conclusion, this Letter provides a practical and computationally not demanding method for inspecting the complexity of real world systems, by analyzing the scaling behaviour of edges in the underlying networks. Our study reveals that networks of the same size and average degree can have completely different fractal character and nature, depending upon how edges are distributed in the networks. Further, we demonstrated how homogeneity in node degrees influences the scaling behavior of the underlying networks, and a deviation from degree homogeneity leads to significant changes in the graph’s dimensionality. Our results point out the importance of edge distribution in determining the behavior of the system, and have diverse applications. For instance, different biological networks such as proteinprotein interaction networks, metabolic networks, transcription regulatory networks, neural networks (which are based on different basic design and functional principles) may display similar fractal behavior depending upon similarity in their edge distributions. It is indeed important to remark that the underlying biological systems have evolved and survived over a long span of time without any major changes in their properties [19]. Additional examples demonstrating the importance of edge distribution are the edgeserver network which facilitates the search on the internet by Google [20, 21], and the identification of influential nodes in social networks which takes into account the arrangement of the edges in the graph [22]. Therefore the easy method for multifractal analysis presented here can be of interest and use for a better understanding and description of complexity in these systems. It could be furthermore interesting to investigate multifractality in networks constructed based on a given ordering of nodes, which would result in an adjacency matrix with a more pronounced block structure (specifically more dense diagonal blocks). Looking at the differences between a random permutation of the nodes and the ordering of the nodes that gives the best community structure can be one of the dimensions of future investigation.
SJ and AY acknowledge the Department of Science and Technology grant 25(0205)/12/EMRII and the grant CSIR 09/1022(0013)/2014EMRI. SJ thanks J. N. Bandyopadhyay for useful discussions in the initial phase of the work.
Supplementary Material
Appendix A Random reshuffling and degreebased reshuffling of the nodes
In our work, two different methods of reshuffling the nodes have been considered.

The labeling of the nodes is reshuffled randomly. Usually saturation of measurements occurs for a number of reshuffling being 10 times larger with respect to the number of connections.

A degree based reshuffling, where nodes are first sequenced based on their degree, and reshuffling operations are made only among those nodes which have the same degree. For this case, the saturation of measurements is reached much earlier.
a.1 Random reshuffling
For random reshuffling, the probability of occurrence of 1’s in boxes of the adjacency matrix plane is proportional to . In the following, we analytically derive the differences.
For a network of size , upon shuffling randomly the nodes’ indices, a node of degree (say) can have an index (say) without any relation between and . Therefore, the probability for any node (say ) to have index is given by . Let us now represent each box by its position in the adjacency matrix plane. represents a box which starts from in the adjacency matrix plane. {suppfigure}
The number of 1’s in a box is given as
(5) 
where and are the adjacency matrix coordinates inside a box.
=1, if is connected to and 0 otherwise, meaning that the contribution of row to is proportional to the degree of index node, as the column is randomly allotted for a fixed . This condition is arising due to the fact that for a fixed , the probability that is connected to node is proportional to the degree of node.
Therefore,the probability of any is given as
(6) 
and the probability of any in a box is
(7) 
is given by
Let be the probability that a given box of size has at least one entry . Then one has x Probability of degree node in the box of size . One then has
where is the probability of node being allotted index.
In random reshuffling method, all the nodes in a network have equal probability of occurring at a particular index. Hence, . Therefore,
Since index runs within a box having size , hence
(8) 
Therefore,
(9) 
This indicates that on considering random reshuffling of the nodes, the probability of occurrence of 1’s in a box of size is directly proportional to the average degree of the underlying network. Further, to capture minor changes in this probability, with respect to various degree distributions, we consider 1d, ER and SF networks. {suppfigure}[t]
In the 1d lattices with circular boundary condition, all the nodes have exactly the same degree which is equal to the average degree of the network. Thus each box exhibits exactly the same behaviour. For the ER random networks, the degree of nodes are closely distributed about the average degree of the underlying network. Therefore, probability of occurrence of 1’s in each box has small deviations from the average behaviour. In the case of SF networks, a very high degree heterogeneity is present in the networks, which leads to significant local deviations from the average behaviour, depending on the type of the node (high degree node or low degree node) occupying the boxes (see Fig. S1 and S2).
To further assess the importance of the reshuffling method on the probability of occurrence of 1’s in a box, in the following we adopt a different method of reshuffling considering degree homogeneity.
a.2 Degreebased reshuffling
In the degreebased reshuffling method, we arrange the nodes in a decreasing order of their degrees and assign them indices. To avoid any preferential allocation of indices between the same degree nodes, we reshuffle the indices between the same degree nodes. For a network of size , and , where is the probability of occurrence of degree node and is number of nodes with degree . Given that ,

If , any degree node having index is given as .

If , any degree node having index is given as
(10)
For instance, let us consider a degree sequence viz. [200, 136, 80, 60, 60, 60, 30, 30, 22, 22, 22, 19 ……]. We want to find the range of indices which can have a node with degree 60. From Eq. 10, the upper and lower bound of indices that can have a node with degree 60 is given as .
[h]
Since in the degreebased reshuffling, the nodes having similar degrees get reshuffled, the indices accessible to a particular degree (say ) are given by the probability
(11) 
where is the Heaviside step function (Fig. S3).
Assuming there is no preference of the nodes being connected based on their degrees, we say that probability that any given degree node to be connected to any other node is equal to .
Therefore, a box of size () starting from a point in the adjacency matrix plane () will have number of 1’s, given as;
(12) 
where , if and are connected and , otherwise.
The probability of any node being connected to the node is given by
(13) 
The probability of getting 1’s in box of size is given by
(14) 
Now, according to the degree sequence of the network and the position , can assume various values which are discussed in the following subsections.
Case 1: The nodes have only one type of degree (say ) in the box .
The indices of size box, i.e. ( to , to ), which lies within the range given by Eq. 10 can have only degree nodes. Therefore, the probability of occurrence of degree node within the box is , while the probability of occurrence of other degree node is zero. The probability of having 1’s at position within the given box is drawn from Eq. 13 as
(15) 
Since is the same for all within the box, the consequence is that
(16) 
Case 2: The nodes have two types of degrees (say and ) in the box.
The indices of a size box, i.e. ( to , to ) lies within the range given by Eq. 10 for and degree nodes. Therefore, the probability of any degree node allotted within the box is and the probability of any degree node allotted within the box is . The probability of occurrence of any other degree nodes is zero. Therefore, the probability of any being within the box is given by
(17) 
Therefore, the probability of having 1’s inside the box is
(18) 
In comparison to the random reshuffling, the degreebased reshuffling highlights following differences for both the case. For the case of degreebased reshuffling, the probability of occurrence of 1’s within a box depends on the position of the box as well as the coordinates within the box (for instance, (,) (,) …….. (,)). More importantly, the probability of occurrence of 1’s in the degreebased reshuffling method depends on , i.e., network’s degree distribution. Whereas, in the case of random reshuffling, the probability of occurrence of 1’s within a box depends only on the average degree of the underlying network. Therefore, the degreebased reshuffling leads to different behaviours for 1d lattices, ER random networks and SF networks of the same average degree and size (see Fig. S1).
After discussing impact of reshuffling on the probability of occurrence of in a particular box, in the following we derive expressions for probability distribution as well as moment of fractal dimension for various different types of the edge distributions.
Appendix B Edge distribution of different networks on adjacency matrix plane
For the edge distribution of random networks and 1d lattices, we present three simplified cases: (a) a strictly homogeneous distribution, (b) an approximately homogeneous distribution and (c) a strictly linear distribution.
b.1 Case A: Strictly Homogeneous distribution
Any size box in the adjacency matrix plane will have number of 1’s. Here, we consider large size network to avoid any boundary effect while assigning the boxes.
Therefore, the probability of occurrence of 1’s in a box of size is given as
.
Summing up these probabilities gives
.
The moment of the probability of occurrence of 1’s for a size box is given as
.
By taking log of both the sides, one has
This is an equation for a straight line with a slope , which we define as the
scaling exponent .
The moment dimension of the system defined as gives a single dimension i.e. 2.
The singularity spectrum, is related with by means of
Legendre tranform
.
where .
Substituting the value of , and .
This indicates that a homogeneous distribution over a large system (where boundary effect can be neglected) leads to a nonfractal structure with dimension 2.
b.2 Case B : Approximately Homogeneous distribution
In this case, we assume that the number of 1’s in the box deviates from the
homogeneous distribution of 1’s by a small amount generated randomly for each box
with alternately changing sign in such a way that sum over all the multifractal measures is
normalized. This gives the following expression for the multifractal measure,
,
which further gives us the following restriction on the random numbers () to be generated,
Now, moment of can be written as,
For small values, the qth moment can be approximated to,
Taking log on both the side, we get,
By using approximation for small , we get,
Also, gives
One can see that theoretically this approximation does not affect values of and from those of case A. However, numerical evaluation shows significant deviations of these values from those of the case A. For these two cases of strictly homogeneous and approximately homogeneous distribution, on plotting vs we find that the strictly homogeneous case has a straight line behaviour with existence of various slopes indicating nonfractal structure with dimension 2. Whereas the approximately homogeneous case manifests a deviation leading to nonfractal dimensions distributed very closely about 2 which is in fact a multifractal structure (see Fig. S4). This indicates that one of the reasons behind vs plot deviating from the linear behaviour can be attributed to the nonhomogeneity in the degree distribution.
Appendix C versus for PPI Networks
On plotting versus for the PPI networks and their corresponding random controls, for E. coli and S. cerevisiae, we find a significant deviation exhibited by the PPI networks from their corresponding random controls (see Fig. S5(c) and (f)), indicating multifractal behaviour of the underlying networks. This is quite expected owing to their scalefree nature of degree distribution [18]. The extent of deviation between the PPI networks of and and their corresponding ER random networks diminishes (Fig. S5(b) and (d)), while noticeably for the PPI network very closely resembles the behaviour of its corresponding random network (Fig. S4(a)). Interestingly, exhibits a completely different behaviour from rest of the model organisms. The left region of PPI network exhibits a slope which is lower than that of its corresponding ER random network (see Fig. S5(e)), thus indicating that though the underlying system maintains an universality in large scale properties with the other PPI networks, minor fluctuations in fractal behaviour captured by the left region indicates a homogeneity in the underlying structure. These features of the PPI networks are magnified and better understood through the behaviour of versus , discussed in length in the paper. {suppfigure}[bh]
References
 B. B. Mandelbrot, The Fractal Geometry of Nature, (Freeman, New York,1975).
 B. B. Mandelbrot, D. E. Passoja and A. J. Paullay, Nature 308, 721 (1984).
 H. G. E. Hentschel and I. Procaccia, Physica D 8, 435 (1983).
 D. A. Egolf and H. S. Greenside, Nature 389, 129 (1994).
 B. B. Mandelbrot, J. Fluid Mech. 62, 331 (1974).
 R. Lopes and N. Betrouni, Medical Image Analysis 13, 634 (2009).
 K. K. S. Wu, O. Lahav and M. J. Rees, Nature 397, 225 (1999).
 A. Sáiz and V. J. Martínez, Phys. Rev. E 54, 2431 (1996).
 C. Song, S. Havlin and H. A. Makse, Nature 433, 392 (2005); C. Song et. al., J. Stat. Mech. 3, 03006 (2007).
 J. S. Kim et. al., Phys. Rev. E 75, 016110 (2007).
 S. Jung, S. Kim and B. Kahng, Phys. Rev. E 65, 056101 (2002).
 G. Palla, L. Lovász and T. Vicsek, PNAS 107, 17, 7640 (2010).
 P. Grassberger and I. Procaccia, Phys. Rev. Letts. 50, 346 (1983).
 A. L.Barabási, P. Szèpfalusy and T. Vicsek, Physica A 178, 17 (1991).
 J. W. Kantelnardt, H. E. Roman and M. Greiner, Physica A 220, 219 (1995).
 the reader is addressed to the Supplementary Material for the detailed derivations of the probability distributions (and the numerical calculations of fractal dimensions) for a random reshuffling case, as well as matrix diagrams for various types of nodes’ index reshufflings.
 M. Newman, Network: An Introduction, (Oxford Univ. Press, 2010).
 A. Agrawal et. al., Physica A 404, 359 (2014).
 R. Albert, Journal of Cell Science 118, 4947 (2005).
 S. Jain et. al., ACM SIGCOMM Computer Communication Review 43, 3 (2013)
 E. Nygren et. al., ACM SIGOPS Operating Systems Review 44, 2 (2010).
 D. Kempe, J. Kleinberg and É. Tardos, Theory Of Computing 11, 105 (2015).