# Revealing the Micro-Structure of the Giant Component in Random Graph Ensembles

Ido Tishby, Ofer Biham and Eytan Katzav: Racah Institute of Physics, The Hebrew University, Jerusalem, 91904, Israel

Reimer Kühn: Mathematics Department, King’s College London, Strand, London WC2R 2LS, UK

###### Abstract

The micro-structure of the giant component of the Erdős-Rényi network and other configuration model networks is analyzed using generating function methods. While configuration model networks are uncorrelated, the giant component exhibits a degree distribution which is different from the overall degree distribution of the network and includes degree-degree correlations of all orders. We present exact analytical results for the degree distributions as well as higher order degree-degree correlations on the giant components of configuration model networks. We show that the degree-degree correlations are essential for the integrity of the giant component, in the sense that the degree distribution alone cannot guarantee that it will consist of a single connected component. To demonstrate the importance and broad applicability of these results, we apply them to the study of the distribution of shortest path lengths on the giant component, percolation on the giant component and the spectra of sparse matrices defined on the giant component. We show that by using the degree distribution on the giant component, one obtains high quality results for these properties, which can be further improved by taking the degree-degree correlations into account. This suggests that many existing methods, currently used for the analysis of the whole network, can be adapted in a straightforward fashion to yield results conditioned on the giant component.

###### pacs:
64.60.aq,89.75.Da

## I Introduction

There is a broad range of phenomena in the natural sciences and engineering as well as in the economic and social sciences, which can be usefully described in terms of network models. This realization has stimulated increasing interest during the past two decades in the study of the structure of random graphs and complex networks, and in the dynamics of processes which take place on them Albert2002 (); Dorogovtsev2003 (); Dorogovtsev2008 (); Newman2010 (); Barrat2012 (); Hofstad2013 (). One of the central lines of inquiry since Erdős and Rényi’s seminal study of the evolution of random graphs Erdos1959 (); Erdos1960 (); Erdos1961 () has been concerned with the existence, under suitable conditions, of a giant component, which occupies a finite, non-zero fraction of the graph in the thermodynamic limit of infinite system size. Critical parameters for the emergence of a giant component in the thermodynamic limit of Erdős-Rényi (ER) networks were identified and the asymptotic fraction occupied by the giant component was determined Erdos1960 (); Bollobas1984 (). For configuration model networks, i.e. networks that are maximally random subject to a given degree sequence, those problems were solved by Molloy and Reed Molloy1995 (); Molloy1998 (). These authors also established a so-called duality relation according to which the degree distribution, when restricted to nodes which reside on finite components of a configuration model network, is simply related to the degree distribution of the whole network Molloy1998 (). This property, which was known before for ER networks Bollobas1984 (); Bollobas2001 (), has since been generalized also to a class of heterogeneous canonical random graph models with broad distributions of expected degrees Bollobas2007 (); Janson2010 (). Curiously, with the single exception of a study on large deviation properties of ER networks by Engel et al. 
Engel2004 (), we have not come across corresponding statements concerning degree distributions when restricted to the giant component of a random graph ensemble. Clearly, the knowledge of degree distributions and degree-degree correlations restricted to the giant component of a network would be very useful when investigating dynamical processes on complex networks. It would help to obtain results pertaining only to the giant component of such systems, without the contributions from finite components which often amount to trivial contaminations or (unwanted) distortions of results. Examples that come to mind are localization phenomena in sparse matrix spectra Biroli1999 (); Kuhn2008 () (where finite components of a random graph support eigenvectors that are trivially localized), properties of random walks Sood2005 (); Bacco2015 (); Tishby2016 (); Tishby2017 () (where a random walker chosen to start a walk on one of the finite components will never be able to explore an appreciable fraction of the entire network), or the spread of diseases or cascading failures Watts2002 (); Newman2002 (); Newman2002b (); Karrer2010 (); Rogers2015 (); Satorras2015 () (where an initial failure or initial infection occurring on a finite component will never lead to a global system failure or the outbreak of an epidemic). Component-size distributions in the percolation problem on complex networks Callaway2000 (); Rogers2015 () will likewise contain a component originating from clusters that were finite, before nodes or edges were randomly removed from the network. Finally, the distribution of shortest path lengths between pairs of nodes in a network Blondel2007 (); Katzav2015 (); Nitzan2016 (); Melnik2016 () contains contributions from pairs of nodes on different components whose distance is, by convention, infinite. 
To eliminate such unwanted contributions, numerical studies using message passing algorithms, or straightforward simulation methods, are often performed directly on the largest component of a given random network. The results may be difficult to compare with theoretical results if the latter do not eliminate finite component contributions (or suppress them by taking the density of links to be sufficiently large to make them effectively negligible).

It is the purpose of the present contribution to explore the micro-structure of the giant component as well as the finite components of random graphs in the configuration model class. More specifically, we use generating functions Newman2001 () and their probabilistic interpretation to obtain degree distributions conditioned on both the giant and finite components appearing in networks in the configuration model class, as well as degree-degree correlations of all orders in these networks. The key assumption underlying the use of the generating function method is that configuration model networks (with finite mean degree), which are only locally tree-like, are in fact probabilistically well approximated by trees in the limit of large system sizes. The underlying reason is that any correlations between neighbours of a given node which are generated by the existence of loops become arbitrarily small, as the typical lengths of such loops diverge like $\ln N$ with the system size, $N$.

The following are among our key results: (i) the giant components of ER networks and of configuration model networks exhibit degree-degree correlations of all orders, and are thus not in the configuration model class; (ii) degree-degree correlations of all orders need to be taken into account in order to verify that, conditioned on the giant component of a random network in the configuration model class, there are indeed no finite trees of any size; (iii) for finite components of ER and configuration model networks one has a duality relation linking the degree distribution restricted to the finite components to that of a different sub-percolating configuration model via renormalization of the original degree distribution over the whole network; (iv) we provide examples demonstrating the quality of results that can be obtained for properties of the giant component when neglecting degree-degree correlations, and improvements that can be made when taking degree-degree correlations into account.

The paper is organized as follows. In Sec. II we present the configuration model network ensemble. In Sec. III we recall the generating function approach Newman2001 (), which allows one to establish the existence of a giant component in configuration model networks, and to compute the asymptotic fraction, $g$, of nodes that belong to the giant component. Concentrating on the probabilistic content of individual contributions to the expression for $g$ in that approach allows us to extract the degree distributions conditioned on nodes belonging to either the giant component or to any of the finite components. For the finite components we recover the duality relation obtained by Molloy and Reed Molloy1998 (). Using iterated versions of one of the generating functions then allows us in Sec. IV to obtain joint degree distributions (and thereby degree-degree correlations) of all orders for nodes surrounding a given node, conditioned on the central node belonging to either the giant component or to one of the finite components. In Sec. V we derive an analytical expression for the assortativity coefficient on the giant component of a configuration model network with any given degree distribution. In Sec. VI we apply the results to several specific configuration model networks with a Poisson degree distribution (ER network), an exponential degree distribution, a ternary degree distribution, a Zipf degree distribution and a power-law degree distribution (scale-free network). In Sec. VII we demonstrate the consistency of our results in the sense that, conditioned on a node belonging to the giant component of a configuration model network, it cannot belong to a finite tree of any size. In Sec. VIII we provide examples illustrating the quality of various approximate descriptions of the degree statistics conditioned on the giant component in the calculation of the distribution of shortest path lengths, in the computation of the spectra of sparse matrices defined on the giant component and in the analysis of epidemic spreading on the giant component. We conclude the paper with a summary and discussion in Sec. IX.

## II The Configuration Model Network and its percolation properties

The configuration model is a maximum entropy ensemble of networks under the condition that the degree distribution is imposed Newman2001 (); Newman2010 (). Here we focus on the case of undirected networks, in which all the edges are bidirectional and the degree distribution $P(k)$, $k=0,1,\dots,N-1$, satisfies $\sum_{k}P(k)=1$. The mean degree over the ensemble of networks is denoted by

$$c=\langle K\rangle=\sum_{k=0}^{N-1}k\,P(k). \tag{1}$$

To construct such a network of a given size, $N$, one can draw the degrees of all $N$ nodes from $P(k)$, producing the degree sequence $k_i$, $i=1,\dots,N$ (where $\sum_i k_i$ must be even). Before proceeding to the next step of actually constructing the network, one should check whether the resulting sequence is graphic, namely admissible as a degree sequence of at least one network instance. The graphicality of the sequence is tested using the Erdős-Gallai theorem, which states that an ordered sequence of the form $k_1\ge k_2\ge\dots\ge k_N$ is graphic if and only if the condition Erdos1960b (); Choudum1986 ()

$$\sum_{i=1}^{n}k_i\le n(n-1)+\sum_{i=n+1}^{N}\min(k_i,n) \tag{2}$$

holds for all the values of $n$ in the range $1\le n\le N-1$.
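The graphicality test of Eq. (2) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code; the function name `is_graphic` is an assumption made here, and the loop also checks the (automatically satisfied) case in which the prefix covers the whole sequence.

```python
def is_graphic(degrees):
    """Erdos-Gallai test: can `degrees` be the degree sequence of a simple graph?"""
    k = sorted(degrees, reverse=True)          # the theorem assumes a non-increasing order
    N = len(k)
    if any(d < 0 for d in k) or sum(k) % 2 != 0:
        return False                           # the sum of the degrees must be even
    prefix = 0
    for n in range(1, N + 1):
        prefix += k[n - 1]                     # left hand side of Eq. (2)
        tail = sum(min(d, n) for d in k[n:])   # second term on the right hand side
        if prefix > n * (n - 1) + tail:
            return False
    return True


print(is_graphic([3, 1, 1, 1]))   # a star with three leaves
print(is_graphic([3, 3, 1, 1]))   # violates Eq. (2) at n = 2
```

The inner sum makes this sketch quadratic in the number of nodes, which is adequate for illustration but not for very large sequences.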

A convenient way to construct a configuration model network with a given degree sequence, $k_i$, $i=1,\dots,N$, is to prepare the $N$ nodes such that each node, $i$, is connected to $k_i$ half edges or stubs Newman2010 (). Pairs of half edges from different nodes are then chosen randomly and are connected to each other in order to form the network. The result is a network with the desired degree sequence and no correlations. Note that towards the end of the construction the process may get stuck. This may happen when the only remaining pairs of stubs belong to the same node or to nodes which are already connected to each other. In such cases one may perform some random reconnections in order to enable completion of the construction.
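The stub-matching construction can be sketched as follows. Note the swapped-in simplification: instead of the random reconnections mentioned above, this sketch simply restarts the pairing whenever a self-loop or a repeated edge appears. The function name `configuration_model` and the parameter `max_tries` are assumptions of this example.

```python
import random


def configuration_model(degrees, rng=None, max_tries=10000):
    """Pair stubs uniformly at random; restart on self-loops or multi-edges.

    Returns a set of edges (i, j) with i < j. Restarting replaces the random
    reconnections described in the text (an assumption of this sketch).
    """
    rng = rng or random.Random()
    if sum(degrees) % 2 != 0:
        raise ValueError("the sum of the degrees must be even")
    # node i contributes degrees[i] stubs
    stubs = [i for i, k in enumerate(degrees) for _ in range(k)]
    for _ in range(max_tries):
        rng.shuffle(stubs)
        edges = set()
        ok = True
        for a, b in zip(stubs[0::2], stubs[1::2]):
            edge = (min(a, b), max(a, b))
            if a == b or edge in edges:        # self-loop or repeated edge
                ok = False
                break
            edges.add(edge)
        if ok:
            return edges
    raise RuntimeError("no simple graph found; is the sequence graphic?")


edges = configuration_model([2, 2, 2, 2, 1, 1], rng=random.Random(0))
print(sorted(edges))
```

Restarting is only practical for small or sparse sequences, since the probability that a random pairing is simple decreases as the degrees grow.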

A special case of the configuration model is the ER network ensemble, which is a maximum entropy ensemble under the condition that the mean degree is constrained. ER networks can be constructed by independently connecting each pair of nodes with probability $p=c/(N-1)$. In the thermodynamic limit the resulting degree distribution follows a Poisson distribution of the form

$$P(k)=\frac{e^{-c}c^{k}}{k!}. \tag{3}$$

Consider a random network in the configuration model class, described by a degree distribution $P(k)$, where $k=0,1,2,\dots$. It is well known Newman2001 () that the fraction of nodes that reside on the giant component of such a network can be found using generating functions as described below. Let us first introduce the degree generating function of $P(k)$, namely

$$G_0(x)=\sum_{k=0}^{\infty}x^{k}P(k), \tag{4}$$

while

$$G_1(x)=\sum_{k=1}^{\infty}x^{k-1}\frac{k}{c}P(k) \tag{5}$$

is the generating function of the distribution of degrees of nodes reached via a random edge. From the definitions of $G_0(x)$ and $G_1(x)$ in Eqs. (4) and (5), respectively, we find that $G_0(1)=1$ and $G_1(1)=1$.

To obtain the probability, $g$, that a random node in the network resides on the giant component, one needs to first calculate the probability, $\tilde g$, that a random neighbor of a random node, $i$, belongs to the giant component of the reduced network, which does not include the node $i$. In the thermodynamic limit, $N\to\infty$, the probability $\tilde g$ is given as a solution of the self-consistency equation Molloy1995 (); Molloy1998 ()

$$1-\tilde g=G_1(1-\tilde g). \tag{6}$$

The left hand side of this equation is the probability that a random neighbor of a random reference node in the network does not reside on the giant component of the reduced network from which the reference node is removed. The right hand side represents the same quantity in terms of its neighbors, namely as the probability that none of the neighbors of such a node resides on the giant component of the reduced network. Once $\tilde g$ is known, the probability $g$ can be obtained from

$$g=1-G_0(1-\tilde g). \tag{7}$$

This relation is based on the same consideration as Eq. (6), where the difference is that the reference node is a random node rather than a random neighbor of a random node.

Clearly, $\tilde g=0$ is always a solution of Eq. (6). A random network exhibits a giant component if Eq. (6) also has a non-trivial solution. The condition for the existence of a giant component can be expressed in the form Newman2010 ()

$$\sum_{k=2}^{\infty}\frac{k(k-1)}{c}P(k)=\frac{\langle K^{2}\rangle-\langle K\rangle}{\langle K\rangle}>1, \tag{8}$$

which is known as the Molloy-Reed criterion Molloy1995 (); Molloy1998 (). In essence, this criterion states that a giant component exists if the mean excess degree of the neighbours of a random node exceeds one. In Fig. 1 we present the parameter $g$, which is the probability that a random node resides on the giant component, for an ER network, as a function of the mean degree, $c$. For $c<1$ there is no giant component and thus $g=0$. The percolation transition takes place at $c=1$, above which $g$ gradually increases towards the dense network limit of $g=1$.
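For a degree distribution without a closed-form solution, Eqs. (6) and (7) can be solved numerically. The sketch below iterates $u\leftarrow G_1(u)$ for $u=1-\tilde g$, starting from $u=0$, which converges to the smallest fixed point in $[0,1]$. The function names, the truncation of the distribution at a maximal degree and the tolerance are assumptions of this example, and convergence becomes slow close to the percolation threshold.

```python
import math


def poisson_pmf(c, kmax):
    """Poisson degree distribution of Eq. (3), truncated at degree kmax."""
    return [math.exp(-c) * c**k / math.factorial(k) for k in range(kmax + 1)]


def molloy_reed_ratio(P):
    """Left hand side of Eq. (8); a giant component exists if it exceeds 1."""
    c = sum(k * p for k, p in enumerate(P))
    return sum(k * (k - 1) * p for k, p in enumerate(P)) / c


def giant_fractions(P, tol=1e-12, max_iter=100000):
    """Return (g_tilde, g) from Eqs. (6) and (7) for the distribution P[k]."""
    c = sum(k * p for k, p in enumerate(P))

    def G0(x):
        return sum(p * x**k for k, p in enumerate(P))

    def G1(x):
        return sum(k * p * x**(k - 1) for k, p in enumerate(P) if k >= 1) / c

    u = 0.0                                    # u = 1 - g_tilde
    for _ in range(max_iter):
        u_next = G1(u)
        converged = abs(u_next - u) < tol
        u = u_next
        if converged:
            break
    return 1.0 - u, 1.0 - G0(u)


g_tilde, g = giant_fractions(poisson_pmf(2.0, 80))
print(g_tilde, g)    # for a Poisson distribution the two values coincide
```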

## III The Degree Distribution on the Giant Component

The probability, $g$, that a randomly selected node belongs to the giant component can be expressed in the form

$$g=\sum_{k=0}^{\infty}g_k P(k), \tag{9}$$

where $g_k$ is the conditional probability that a random node belongs to the giant component, given that its degree is $k$. Comparing this expression to Eq. (7) we find that

$$g_k=1-(1-\tilde g)^{k}. \tag{10}$$

To make the conditioning explicit, we introduce an indicator variable $\Lambda\in\{0,1\}$, with $\Lambda=1$ indicating that an event happens on the giant component, whereas $\Lambda=0$ indicates that it happens on one of the finite components of the network. The probability that a random node resides on the giant component is given by $P(\Lambda=1)=g$, while the probability that it resides on one of the finite components is $P(\Lambda=0)=1-g$. The probability that a random node of a given degree resides on the giant component is given by

$$P(\Lambda=1|K=k)=g_k=1-(1-\tilde g)^{k}, \tag{11}$$

while the probability that it resides on one of the finite components is

$$P(\Lambda=0|K=k)=1-g_k=(1-\tilde g)^{k}. \tag{12}$$

Using Bayes’ theorem

$$P(K=k|\Lambda=\lambda)=\frac{P(\Lambda=\lambda|K=k)}{P(\Lambda=\lambda)}P(K=k), \tag{13}$$

we invert these relations so as to obtain the degree distributions conditioned on a node belonging to the giant and the finite components, respectively. For brevity, in the rest of the paper we use a more compact notation, in which $P(K=k)$, $P(K=k|\Lambda=1)$ and $P(K=k|\Lambda=0)$ are replaced by $P(k)$, $P(k|1)$ and $P(k|0)$, respectively, except for a few places in which the more detailed notation is needed for clarity. The conditional degree distribution of nodes which reside on the giant component is given by

$$P(k|1)=\frac{1-(1-\tilde g)^{k}}{g}P(k), \tag{14}$$

while the conditional degree distribution for nodes which reside on the finite tree components is

$$P(k|0)=\frac{(1-\tilde g)^{k}}{1-g}P(k). \tag{15}$$

This result can be expressed in the form

$$P(k|0)=\frac{1}{1-g}e^{-ak}P(k), \tag{16}$$

where $a=-\ln(1-\tilde g)$. This highlights the fact that the degree distribution of the finite components is an exponentially attenuated variant of the original degree distribution. The result for the finite components, Eq. (15), was first derived by Molloy and Reed Molloy1995 (); Molloy1998 () (although in a less transparent form), while the result for the giant component, Eq. (14), was reported by Engel et al. Engel2004 () for the special case of ER networks (for which $\tilde g=g$).
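As an illustration, Eqs. (14) and (15) translate directly into code. The sketch below assumes that the degree distribution is supplied as a truncated list `P[k]` and that $\tilde g>0$ has already been obtained from Eq. (6); the function name and the ER usage example with $c=2$ are choices made here, not part of the original analysis.

```python
import math


def conditional_degree_distributions(P, g_tilde):
    """Return (P1, P0, g): degree distributions on the giant and finite components.

    P[k] is the degree distribution of the whole network; g_tilde > 0 solves Eq. (6).
    """
    q = 1.0 - g_tilde
    g = sum((1.0 - q**k) * p for k, p in enumerate(P))      # Eqs. (9) and (10)
    P1 = [(1.0 - q**k) * p / g for k, p in enumerate(P)]    # Eq. (14)
    P0 = [q**k * p / (1.0 - g) for k, p in enumerate(P)]    # Eq. (15)
    return P1, P0, g


# Usage: ER network with c = 2, for which g_tilde = g solves 1 - g = exp(-c g)
c = 2.0
P = [math.exp(-c) * c**k / math.factorial(k) for k in range(80)]
u = 0.0
for _ in range(500):
    u = math.exp(-c * (1.0 - u))           # fixed-point iteration of Eq. (6)
P1, P0, g = conditional_degree_distributions(P, 1.0 - u)
c0 = c * (1.0 - g)                          # mean degree on the finite components
print(P1[:4])
print(P0[:4])
```

In this ER example `P0` coincides, up to truncation error, with a Poisson distribution of mean `c0`, which is the duality relation discussed in the text.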

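The mean degree conditioned on the giant component is

$$c_1=\mathbb{E}[K|1]=\sum_{k=1}^{\infty}k\,P(k|1). \tag{17}$$

Using Eq. (14) and performing the summation, we obtain

$$c_1=\frac{c}{g}\left[1-(1-\tilde g)^{2}\right]=\frac{(2-\tilde g)\tilde g}{g}\,c. \tag{18}$$

The mean degree conditioned on the finite components is given by

$$c_0=\mathbb{E}[K|0]=\sum_{k=0}^{\infty}k\,P(k|0). \tag{19}$$

From Eq. (15), we obtain

$$c_0=\frac{(1-\tilde g)^{2}}{1-g}\,c. \tag{20}$$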

Using Eq. (15) and the generating function $G_0(x)$, it can be shown that $(1-\tilde g)^{2}<1-g$ for $0<\tilde g<1$, which implies that $c_0<c$ and $c_1>c$. For ER networks these results specialize to $c_0=(1-g)c$ and $c_1=(2-g)c$, respectively. Actually, the value of $c_0$ corresponds to the mean degree below the percolation threshold as it must correspond to a sub-percolating configuration model.

In the literature, the finite component result of Eq. (15) is referred to as a discrete duality relation Molloy1998 (); Bollobas2007 (); Janson2010 (). Indeed, for ER networks $P(k|0)$ is in itself a Poisson distribution of the form

$$P(k|0)=\frac{e^{-c_0}c_0^{k}}{k!}, \tag{21}$$

where $c_0=(1-g)c$ is the mean degree of the nodes which reside on the finite components. The degree distribution, restricted to nodes on the finite components of an ER network, is thus of the same type as the degree distribution of the entire network, albeit with a renormalized parameter for the mean degree, $c_0$. Note that $c_0<1$ for any $c>1$, reflecting the fact that the finite components are equivalent to a sub-percolating ER network.

An analogous parametric renormalization relating the degree distribution of the whole network to a degree distribution conditioned on the finite components is found for any degree distribution which has a component which scales exponentially in $k$. Such degree distributions can be expressed in the form

$$P(k)=\phi(k)e^{-\alpha k}, \tag{22}$$

where $\alpha>0$, and the function $\phi(k)$ is chosen such that $P(k)$ is properly normalized. Clearly, the simplest example of such a degree distribution is the exponential distribution, for which $\phi(k)$ is merely a normalization constant. For networks with an exponential component in the degree distribution as described in Eq. (22), the degree distribution conditioned on the finite components takes the form

$$P(k|0)=\frac{1}{1-g}\phi(k)e^{-\alpha_0 k} \tag{23}$$

with $\alpha_0=\alpha-\ln(1-\tilde g)$. This simple parametric renormalization with respect to Eq. (22) is in close analogy to the results obtained earlier for ER networks. The degree distribution conditioned on the giant component can be compactly expressed as

$$P(k|1)=\frac{e^{-\alpha k}-e^{-\alpha_0 k}}{g}\phi(k). \tag{24}$$

In order to obtain the conditional degree distributions for a given network, one needs to evaluate the parameters $g$ and $\tilde g$. The latter is obtained from the solution of Eq. (6), while the former is obtained by inserting the solution for $\tilde g$ into Eq. (7).

## IV Degree-Degree Correlations on the Giant Component

Having computed degree distributions conditioned on the giant and finite components of configuration model networks, we now turn to investigating the micro-structure of these giant and finite components further by looking at various joint degree distributions and degree-degree correlations. We shall find that, on the giant component, there are degree-degree correlations of any order. This could have been anticipated, as degree-degree correlations of arbitrarily high order are clearly required in order to exclude the possibility that a randomly selected node belongs to a tree of any finite size. In what follows we go some way to quantify these correlations. The key step is to use Eq. (6) to express the powers appearing in Eq. (7), resulting in

$$g=\sum_{k=0}^{\infty}P(k)\left[1-(1-\tilde g)^{k}\right]=\sum_{k=0}^{\infty}P(k)\sum_{k_1,\dots,k_k}\left[\prod_{\mu=1}^{k}\frac{k_\mu}{c}P(k_\mu)\right]\left[1-\prod_{\mu=1}^{k}(1-\tilde g)^{k_\mu-1}\right]. \tag{25}$$

Here we use the notation $k;\{k_\mu\}$ to denote a configuration consisting of a central node of degree $k$, surrounded by a first coordination shell of nodes with degrees $k_1,k_2,\dots,k_k$. The probabilistic interpretation of this identity is that the probability that a random node of degree $k$, whose neighbors are of degrees $\{k_\mu\}$, resides on the giant component is

$$P(1|k;\{k_\mu\})=\left[1-\prod_{\mu=1}^{k}(1-\tilde g)^{k_\mu-1}\right], \tag{26}$$

while the probability that it resides on one of the finite components is

$$P(0|k;\{k_\mu\})=\prod_{\mu=1}^{k}(1-\tilde g)^{k_\mu-1}. \tag{27}$$

Using Bayes’ Theorem, one can invert these relations to obtain

$$P(k;\{k_\mu\}|1)=\frac{1}{g}\left[1-\prod_{\mu=1}^{k}(1-\tilde g)^{k_\mu-1}\right]P(k)\prod_{\mu=1}^{k}\frac{k_\mu}{c}P(k_\mu), \tag{28}$$

and

$$P(k;\{k_\mu\}|0)=\frac{1}{1-g}\left[\prod_{\mu=1}^{k}(1-\tilde g)^{k_\mu-1}\right]P(k)\prod_{\mu=1}^{k}\frac{k_\mu}{c}P(k_\mu), \tag{29}$$

as the probabilities for nodes to have a degree $k$ and first neighbour shell configuration $\{k_\mu\}$, conditioned on this happening on the giant component, and on one of the finite components, respectively. Note that Eq. (14) correctly predicts that the probability of a node of degree $k=0$ to belong to the giant component is zero. Moreover, Eq. (28) also correctly predicts that the probability of a node of degree $k$ to connect to $k$ nodes of degree $1$, thereby forming an isolated $k$-star, is zero on the giant component.

Marginalizing over $k_2,\dots,k_k$, namely summing Eq. (28) over all the values of $k_2,\dots,k_k$, and replacing $k_1\to k'$, gives the probability of a random node of degree $k$ to be connected to a node of degree $k'$, conditioned on them being on the giant component

$$P(k;k'|1)=\frac{1}{g}\left[1-(1-\tilde g)^{k-1}(1-\tilde g)^{k'-1}\right]P(k)\frac{k'}{c}P(k'),\qquad k\ge 1. \tag{30}$$

Similarly, from Eq. (29) we obtain the probability of a random node of degree $k$ to be connected to a node of degree $k'$, under the condition that they do not reside on the giant component

$$P(k;k'|0)=\frac{1}{1-g}(1-\tilde g)^{k-1}(1-\tilde g)^{k'-1}P(k)\frac{k'}{c}P(k'),\qquad k\ge 1. \tag{31}$$

Here we have exploited the fact that the averages over the distributions of neighbouring degrees factor, each of them giving

$$\sum_{k_\mu}(1-\tilde g)^{k_\mu-1}\frac{k_\mu}{c}P(k_\mu)=1-\tilde g, \tag{32}$$

by using Eq. (6). Marginalizing Eqs. (30) and (31) by summing over $k'$, we recover $P(k|1)$ and $P(k|0)$ as given by Eqs. (14) and (15). On the other hand, marginalizing Eq. (30) by summing over $k$ gives the probability, starting from a randomly chosen node on the giant component, to reach a node of degree $k'$, namely

$$\tilde P(k'|1)\equiv\sum_{k\ge 1}P(k;k'|1). \tag{33}$$

Carrying out the summation we obtain

$$\tilde P(k'|1)=\frac{1}{g}\left[1-p_0-\frac{1-g-p_0}{1-\tilde g}(1-\tilde g)^{k'-1}\right]\frac{k'}{c}P(k'), \tag{34}$$

where $p_0=P(0)$, namely the probability of an isolated node in the original network. It is easy to see that this is a normalized distribution. It is also important to stress the asymmetric role of the two degrees $k$ and $k'$ appearing in Eqs. (30) and (31).

Consider a random edge in a configuration model network. The joint degree distribution, $\hat P(k,k')$, of the nodes which reside on both sides of such an edge is given by

$$\hat P(k,k')=\frac{k}{c}P(k)\frac{k'}{c}P(k'). \tag{35}$$

The non-giant components of a configuration model network constitute a sub-network which is itself a configuration model network, and is in the sub-percolation regime. The degree distribution, $P(k|0)$, of this sub-network is given by Eq. (15). Thus, the joint degree distribution of pairs of connected nodes which reside on the non-giant components is given by

$$\hat P(k,k'|0)=\frac{(1-\tilde g)^{k-1}(1-\tilde g)^{k'-1}}{(1-\tilde g)^{2}}\frac{k}{c}P(k)\frac{k'}{c}P(k'). \tag{36}$$

The fraction of edges in the network which reside on the giant component is denoted by $g_E$. It is given by

$$g_E=1-(1-\tilde g)^{2}, \tag{37}$$

while the fraction of edges which reside on the non-giant components is $1-g_E=(1-\tilde g)^{2}$. Therefore, the joint degree distribution can be expressed in the form

$$\hat P(k,k')=\left[1-(1-\tilde g)^{2}\right]\hat P(k,k'|1)+(1-\tilde g)^{2}\hat P(k,k'|0). \tag{38}$$

Using Eqs. (36) and (38) we find that

$$\hat P(k,k'|1)=\frac{1-(1-\tilde g)^{k+k'-2}}{1-(1-\tilde g)^{2}}\frac{k}{c}P(k)\frac{k'}{c}P(k'). \tag{39}$$

## V Assortativity on the Giant Component

From the joint probability that a randomly chosen edge on the giant component connects two vertices of degrees $k$ and $k'$, one obtains the corresponding probability for a random edge to connect nodes of excess-degrees $k$ and $k'$, by a simple shift of arguments as

$$\hat P_e(k,k'|1)=\hat P(k+1,k'+1|1). \tag{40}$$

The conditional joint probability of excess degrees, $\hat P_e(k,k'|1)$, is given by

$$\hat P_e(k,k'|1)=\left[\frac{1-(1-\tilde g)^{k+k'}}{1-(1-\tilde g)^{2}}\right]\frac{k+1}{c}P(k+1)\frac{k'+1}{c}P(k'+1). \tag{41}$$

Summing over $k'$ and using Eq. (6), we obtain the marginal distribution

$$\hat P_e(k|1)=\left[\frac{1-(1-\tilde g)^{k+1}}{1-(1-\tilde g)^{2}}\right]\frac{k+1}{c}P(k+1). \tag{42}$$

In terms of these definitions, the assortativity coefficient on the giant component is given by Newman2002b ()

$$r=\frac{1}{\hat\sigma^{2}}\sum_{k,k'\ge 0}k\,k'\left[\hat P_e(k,k'|1)-\hat P_e(k|1)\hat P_e(k'|1)\right], \tag{43}$$

where

$$\hat\sigma^{2}=\sum_{k\ge 0}k^{2}\hat P_e(k|1)-\left[\sum_{k\ge 0}k\,\hat P_e(k|1)\right]^{2} \tag{44}$$

is the variance of $\hat P_e(k|1)$. The assortativity coefficient is actually the Pearson correlation coefficient of degrees between pairs of linked nodes Newman2002b ().

While the assortativity coefficient can be evaluated directly from Eq. (43), it turns out that there is a more effective approach for its calculation, using generating functions. To this end, we introduce the bivariate generating function, $B(u,v)$, of $\hat P_e(k,k'|1)$, which is given by

$$B(u,v)=\sum_{k,k'\ge 0}\hat P_e(k,k'|1)\,u^{k}v^{k'}. \tag{45}$$

This function is symmetric in $u$ and $v$, reflecting the symmetric form of $\hat P_e(k,k'|1)$ in terms of $k$ and $k'$. We also introduce the generating function, $S(u)$, of the marginal distribution $\hat P_e(k|1)$, which takes the form

$$S(u)=\sum_{k\ge 0}\hat P_e(k|1)\,u^{k}. \tag{46}$$

Note that it can be expressed in terms of the bivariate generating function, as $S(u)=B(u,1)$. Inserting the expression for $\hat P_e(k,k'|1)$ from Eq. (41) into Eq. (45), we find that the bivariate generating function can be expressed in terms of the generating function $G_1(x)$. It takes the form

$$B(u,v)=\frac{G_1(u)G_1(v)-G_1[(1-\tilde g)u]\,G_1[(1-\tilde g)v]}{1-(1-\tilde g)^{2}}. \tag{47}$$

Plugging in $v=1$ we obtain

$$S(u)=\frac{G_1(u)-(1-\tilde g)\,G_1[(1-\tilde g)u]}{1-(1-\tilde g)^{2}}. \tag{48}$$

Expressing the terms on the right hand side of Eq. (43) in terms of derivatives of the generating functions $B(u,v)$ and $S(u)$, we express the assortativity coefficient in the form

$$r=\left.\frac{\partial_u\partial_v B(u,v)-\left[\partial_u S(u)\right]^{2}}{(u\partial_u)^{2}S(u)-\left[\partial_u S(u)\right]^{2}}\right|_{u=v=1}. \tag{49}$$

In the next section we use this formulation to obtain exact analytical results for the assortativity coefficients on the giant components of different configuration model networks. The assortativity coefficient is expected to be negative, which implies that the giant component of a configuration model network is disassortative. This is due to the fact that high-degree nodes are over-represented in the giant component. Thus, in order for the giant component to be a single connected component, low degree nodes must have a greater than normal probability to connect to high degree nodes. In particular, a node of degree $k=1$ on the giant component must be connected to a node of degree $k'\ge 2$, while a node of degree $k=2$ can have at most one neighbor of degree $1$. The disassortativity of the giant component is most pronounced just above the percolation threshold, where most nodes are of low degrees.
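As a numerical cross-check of the generating function route, the assortativity coefficient can also be evaluated by brute force from Eq. (43). The sketch below builds the joint excess-degree distribution from Eq. (39) with both arguments shifted by one, as prescribed by Eq. (40), for a truncated degree distribution `P[k]`. The function name, the truncation and the ER usage example are assumptions made for illustration.

```python
import math


def giant_assortativity(P, g_tilde):
    """Assortativity coefficient on the giant component, evaluated from Eq. (43)."""
    c = sum(k * p for k, p in enumerate(P))
    q = 1.0 - g_tilde
    gE = 1.0 - q**2                                    # edge fraction, Eq. (37)
    # excess-degree distribution of the whole network: (k+1) P(k+1) / c
    e = [(k + 1) * P[k + 1] / c for k in range(len(P) - 1)]
    n = len(e)
    # Eq. (39) evaluated at degrees (k+1, k'+1), i.e. excess degrees (k, k')
    joint = [[(1.0 - q**(k + kp)) / gE * e[k] * e[kp] for kp in range(n)]
             for k in range(n)]
    marg = [sum(row) for row in joint]                 # marginal excess-degree distribution
    mean = sum(k * m for k, m in enumerate(marg))
    var = sum(k * k * m for k, m in enumerate(marg)) - mean**2
    cov = sum(k * kp * joint[k][kp] for k in range(n) for kp in range(n)) - mean**2
    return cov / var


# Usage: ER network with c = 2, for which g_tilde = g
c = 2.0
P = [math.exp(-c) * c**k / math.factorial(k) for k in range(60)]
u = 0.0
for _ in range(500):
    u = math.exp(-c * (1.0 - u))                       # u = 1 - g_tilde
r = giant_assortativity(P, 1.0 - u)
print(r)                                               # negative: disassortative
```

The double sum costs a number of operations quadratic in the truncation degree, so Eq. (49) remains the more efficient route when the generating functions are known in closed form.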

## VI Analysis of specific network models

In this section we discuss in detail some specific network models and the properties of their giant components.

### VI.1 Erdős-Rényi networks

Consider an ER network of $N$ nodes and mean degree $c$. In this case the degree distribution follows a Poisson distribution, given by Eq. (3). The generating functions $G_0(x)$ and $G_1(x)$ of this distribution coincide and satisfy

$$G_0(x)=G_1(x)=e^{-c(1-x)}. \tag{50}$$

As a result, in this case $\tilde g=g$. Using Eq. (7) one obtains a closed form expression for $g$, which is given by

$$g=1+\frac{W(-ce^{-c})}{c}, \tag{51}$$

where $W(x)$ is the Lambert W function Olver2010 (). In this case a giant component exists for $c>1$ (Fig. 1). The degree distribution on the giant component is obtained from Eq. (14), where $g$ and $\tilde g$ are given by Eq. (51). It is given by

$$P(k|1)=\frac{1}{g}\left[\frac{e^{-c}c^{k}}{k!}-(1-g)\frac{e^{-c(1-g)}[c(1-g)]^{k}}{k!}\right], \tag{52}$$

which takes the form of the difference between two Poisson distributions. In Fig. 2 we present analytical results for the degree distribution, $P(k|1)$, of the giant components of ER networks with three values of the mean degree, $c$ (solid lines), obtained from Eq. (52). The analytical results are found to be in excellent agreement with the results of computer simulations (circles), performed on finite networks. For comparison we also show the degree distribution on the entire network (dashed lines), obtained from Eq. (3).
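The ER results can be evaluated without a Lambert W implementation by solving $g=1-e^{-cg}$ directly. The following sketch, with function names chosen here for illustration, uses a plain fixed-point iteration (which is slow close to $c=1$) and then evaluates Eq. (52).

```python
import math


def er_giant_fraction(c, iterations=10000):
    """Solve g = 1 - exp(-c g) by fixed-point iteration, starting from g = 1."""
    if c <= 1.0:
        return 0.0                       # no giant component at or below c = 1
    g = 1.0
    for _ in range(iterations):
        g = 1.0 - math.exp(-c * g)
    return g


def poisson(mean, k):
    return math.exp(-mean) * mean**k / math.factorial(k)


def er_giant_degree_distribution(c, kmax):
    """Eq. (52): P(k|1) as a difference of two Poisson distributions (c > 1)."""
    g = er_giant_fraction(c)
    return [(poisson(c, k) - (1.0 - g) * poisson(c * (1.0 - g), k)) / g
            for k in range(kmax + 1)]


P1 = er_giant_degree_distribution(2.0, 80)
mean_degree = sum(k * p for k, p in enumerate(P1))
print(P1[:4], mean_degree)
```

The mean of the resulting distribution can be compared with $(2-g)c$, the ER specialization of Eq. (18).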

The mean degree of the giant component of an ER network, obtained from Eq. (18), is given by

$$\mathbb{E}[K|1]=(2-g)c, \tag{53}$$

while the mean degree of the finite components, obtained from Eq. (20), is

$$\mathbb{E}[K|0]=(1-g)c. \tag{54}$$

In Fig. 3 we present analytical results for the mean degree, $\mathbb{E}[K|1]$, of the giant component (solid line) and the mean degree, $\mathbb{E}[K|0]$, of the finite components (dotted line), of an ER network as a function of $c$. The mean degree of the whole network, $c$, is also shown (dashed line). It is observed that at the percolation threshold ($c=1$), $\mathbb{E}[K|1]=2$, while $\mathbb{E}[K|0]=1$. As $c$ is increased, the mean degree of the giant component converges asymptotically towards the overall mean degree of the network, while the mean degree of the finite components decays to zero.

The assortativity coefficient for the giant component of an ER network is given by

$$r=-\frac{c(1-g)^{2}}{(2-g)^{3}-(1-g)(2-g)-c(1-g)^{2}}. \tag{55}$$

In the limit of large $c$, where $1-g\simeq e^{-c}$, the assortativity coefficient vanishes according to $r\simeq -c\,e^{-2c}$. For values of $c$ just above the percolation threshold, we find that

$$r\simeq-\frac{1}{5}+\frac{12}{25}(c-1)^{2}+O\left[(c-1)^{3}\right]. \tag{56}$$

The negative value of $r$ implies that the giant component is disassortative, meaning that high degree nodes on the giant component tend to connect to low degree nodes and vice versa. As $c$ is increased, the absolute value of $r$ gradually decreases. In Fig. 4 we present the assortativity coefficient, $r$, of the giant component of an ER network, as a function of $c$. Just above the percolation transition, the assortativity coefficient is large and negative. Its absolute value gradually decreases and eventually vanishes as $c$ is increased, reflecting the fact that the giant component coincides with the entire network, and all the correlations are lost.
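The closed form of Eq. (55) is straightforward to evaluate once $g$ is known. The sketch below finds $g$ by bisection, which stays reliable close to the threshold where fixed-point iteration slows down; the function names and the bracketing interval are assumptions of this example, and the bracket is only valid when $c-1$ is not smaller than about $10^{-8}$.

```python
import math


def er_giant_fraction_bisect(c, steps=200):
    """Positive root of g + expm1(-c g) = 0, i.e. g = 1 - exp(-c g), for c > 1."""
    if c <= 1.0:
        return 0.0
    lo, hi = 1e-9, 1.0                   # the function is negative at lo, positive at hi
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if mid + math.expm1(-c * mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def er_assortativity(c):
    """Eq. (55): assortativity coefficient on the giant component of an ER network."""
    g = er_giant_fraction_bisect(c)
    t = 1.0 - g
    return -c * t**2 / ((2.0 - g)**3 - t * (2.0 - g) - c * t**2)


for c in (1.01, 1.5, 2.0, 4.0):
    print(c, er_assortativity(c))
```

Evaluating the function slightly above $c=1$ gives values close to $-1/5$, in line with the expansion in Eq. (56).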

### VI.2 Configuration model networks with an exponential degree distribution

Consider a configuration model network with an exponential degree distribution of the form

$$P(k)=Ae^{-\alpha k}, \tag{57}$$

where $k\ge k_{\min}$. Here we focus on the case of $k_{\min}=1$, for which the normalization factor is $A=e^{\alpha}-1$. The mean degree is given by

$$c=\langle K\rangle=\frac{1}{1-e^{-\alpha}}. \tag{58}$$

For the analysis presented below, it is convenient to parametrize the degree distribution in terms of the mean degree, $c$. Plugging in $\alpha=\ln[c/(c-1)]$ we obtain

$$P(k)=\frac{1}{c}\left(\frac{c-1}{c}\right)^{k-1}, \tag{59}$$

with $k\ge 1$. The degree generating function is given by

$$G_0(x)=\frac{x}{c-x(c-1)}. \tag{60}$$

It exhibits two trivial fixed points, namely $x=0$ and $x=1$. The cavity generating function is

$$G_1(x)=\frac{1}{[c-x(c-1)]^{2}}. \tag{61}$$

This generating function has a trivial fixed point given by $x=1$. The size of the giant component is obtained using a two step process. In the first step we find the non-trivial fixed point of $G_1(x)$, by solving Eq. (6) for $\tilde g$. We find that

$$\tilde g=\frac{c-3}{2(c-1)}+\frac{1}{2}\sqrt{\frac{c+3}{c-1}}. \tag{62}$$

In the second step we obtain the fraction of nodes which reside on the giant component, which is given by Eq. (7), namely

$$g=\frac{3c}{2(c-1)}-\frac{c(c+3)^{1/2}}{2(c-1)^{3/2}}. \tag{63}$$

The percolation transition occurs at $c=3/2$, such that a giant component exists for $c>3/2$. The degree distribution on the giant component, obtained from Eq. (14), takes the form

$$P(k|1)=\left[\frac{1-(1-\tilde g)^{k}}{(c-1)g}\right]\left(\frac{c-1}{c}\right)^{k}, \tag{64}$$

where $\tilde g$ is given by Eq. (62) and $g$ is given by Eq. (63). The mean degree on the giant component is given by Eq. (18). The assortativity coefficient of a configuration model with an exponential degree distribution takes the form

$$r=-\frac{2(c-1)(1-\tilde g)^{2}\left[1-(1-\tilde g)^{3/2}\right]^{2}}{\left[1-(1-\tilde g)^{2}\right]\left\{1-(1-\tilde g)^{7/2}+3(c-1)\left[1-(1-\tilde g)^{5}\right]\right\}-2(c-1)\left[1-(1-\tilde g)^{7/2}\right]^{2}}. \tag{65}$$

In the limit of large $c$, the assortativity coefficient approaches zero according to

$$r\simeq-2c^{-4}+2c^{-5}+O(c^{-6}). \tag{66}$$

Just above the percolation threshold, which is located at $c=3/2$, the assortativity coefficient can be approximated by

$$r\simeq-\frac{3}{13}+\frac{2240}{1521}\left(c-\frac{3}{2}\right)^{2}-\frac{35840}{13689}\left(c-\frac{3}{2}\right)^{3}+O\left(c-\frac{3}{2}\right)^{4}. \tag{67}$$
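The closed-form expressions for the exponential degree distribution can be checked against the defining equations. The sketch below evaluates Eqs. (62), (63) and (65) and verifies numerically that the result satisfies Eqs. (6) and (7); the function name `exponential_model` and the test value $c=2$ are choices made for this illustration.

```python
import math


def exponential_model(c):
    """Return (g_tilde, g, r) for P(k) = (1/c) ((c-1)/c)^(k-1), k >= 1.

    Below or at the percolation threshold c = 3/2 there is no giant component
    and r is returned as None.
    """
    if c <= 1.5:
        return 0.0, 0.0, None
    g_tilde = (c - 3.0) / (2.0 * (c - 1.0)) + 0.5 * math.sqrt((c + 3.0) / (c - 1.0))  # Eq. (62)
    g = 3.0 * c / (2.0 * (c - 1.0)) - c * math.sqrt(c + 3.0) / (2.0 * (c - 1.0)**1.5)  # Eq. (63)
    x = 1.0 - g_tilde
    num = -2.0 * (c - 1.0) * x**2 * (1.0 - x**1.5)**2
    den = ((1.0 - x**2) * (1.0 - x**3.5 + 3.0 * (c - 1.0) * (1.0 - x**5))
           - 2.0 * (c - 1.0) * (1.0 - x**3.5)**2)
    return g_tilde, g, num / den                                                       # Eq. (65)


c = 2.0
g_tilde, g, r = exponential_model(c)
x = 1.0 - g_tilde
G0 = x / (c - x * (c - 1.0))             # Eq. (60)
G1 = 1.0 / (c - x * (c - 1.0))**2        # Eq. (61)
print(abs(x - G1), abs(g - (1.0 - G0)), r)   # the first two numbers should vanish
```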

### VI.3 Configuration model networks with a ternary degree distribution

The properties of the giant components of random networks are very sensitive to the abundance of nodes of low degrees, particularly nodes of degree $k=1$ (leaf nodes) and $k=2$. Nodes of degree $k=0$ (isolated nodes) are excluded from the giant component and their weight in the overall degree distribution has no effect on the properties of the giant component. Therefore, it is useful to consider a simple configuration model in which all nodes are restricted to a small number of low degrees. Here we consider a configuration model network with a ternary degree distribution of the form Newman2010 ()

$$P(k) = p_1 \delta_{k,1} + p_2 \delta_{k,2} + p_3 \delta_{k,3}, \tag{68}$$

where $\delta_{k,n}$ is the Kronecker delta, and $p_1+p_2+p_3=1$. The mean degree of such a network is given by

$$\langle K \rangle = p_1 + 2p_2 + 3p_3. \tag{69}$$

The generating functions are

$$G_0(x) = p_1 x + p_2 x^2 + p_3 x^3, \tag{70}$$

and

$$G_1(x) = \frac{p_1 + 2p_2 x + 3p_3 x^2}{p_1 + 2p_2 + 3p_3}. \tag{71}$$

Solving Eq. (6) for $\tilde{g}$, with $G_1(x)$ given by Eq. (71), we find that

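$$\tilde{g} = \begin{cases} 0 & p_3 \le p_1/3 \\[4pt] 1 - \dfrac{p_1}{3p_3} & p_3 > p_1/3. \end{cases} \tag{72}$$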

Using Eq. (7) for $g$, where $G_0(x)$ is given by Eq. (70), we find that

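$$g = \begin{cases} 0 & p_3 \le p_1/3 \\[4pt] 1 - \dfrac{p_1^2}{3p_3} - \dfrac{p_1^2 p_2}{9p_3^2} - \dfrac{p_1^3}{27p_3^2} & p_3 > p_1/3. \end{cases} \tag{73}$$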

Thus, the percolation threshold is located at $p_3 = p_1/3$. This can be understood intuitively by recalling that the finite components exhibit tree structures. In a tree that includes a single node of degree $k=3$, with three chains of arbitrary lengths attached to it, there must be three leaf nodes of degree $k=1$. In more complex tree structures, let alone in the giant component, there must be more than one node of degree $k=3$ for every three nodes of degree $k=1$. This is not likely to occur in the case that $p_3 < p_1/3$. Using the normalization condition, we find that for any given value of $p_2$, a giant component exists for $p_3 > (1-p_2)/4$.
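The location of the threshold can be made explicit with a short calculation. Assuming that Eq. (6) is the fixed-point condition $x = G_1(x)$ with $x = 1-\tilde{g}$, Eq. (71) gives $3p_3 x^2 - (p_1+3p_3)x + p_1 = (x-1)(3p_3 x - p_1) = 0$, so the non-trivial root is $x = p_1/(3p_3)$, which is smaller than $1$ only for $p_3 > p_1/3$. The sketch below (illustrative names, assuming $p_3>0$ and $g = 1 - G_0(x)$ for Eq. (7)) evaluates $\tilde{g}$ and $g$ along these lines.

```python
def G0(x, p1, p2, p3):
    """Degree generating function of Eq. (70)."""
    return p1 * x + p2 * x**2 + p3 * x**3

def G1(x, p1, p2, p3):
    """Cavity generating function of Eq. (71)."""
    return (p1 + 2.0 * p2 * x + 3.0 * p3 * x**2) / (p1 + 2.0 * p2 + 3.0 * p3)

def giant_component(p1, p2, p3):
    """Return (g_tilde, g); assumes p3 > 0 and p1 + p2 + p3 = 1."""
    x = min(1.0, p1 / (3.0 * p3))  # smallest root of x = G1(x) in [0, 1]
    return 1.0 - x, 1.0 - G0(x, p1, p2, p3)

def critical_p3(p2):
    """Threshold value of p3 at fixed p2, from p3 = p1/3 and normalization."""
    return (1.0 - p2) / 4.0

if __name__ == "__main__":
    print(giant_component(0.3, 0.3, 0.4))  # inside the percolating phase
    print(critical_p3(0.2))                # threshold for p2 = 0.2
```

For $(p_1,p_2,p_3) = (0.3, 0.3, 0.4)$ this yields $x = 0.25$, $\tilde{g}=0.75$ and $g = 0.9$.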

The degree distribution on the giant component is given by

$$P(k|1) = \frac{1 - \left( \dfrac{p_1}{3p_3} \right)^k}{1 - \dfrac{p_1^2}{3p_3} - \dfrac{p_1^2 p_2}{9p_3^2} - \dfrac{p_1^3}{27p_3^2}} \, P(k), \tag{74}$$

where $k=1,2,3$ and $P(k)$ is given by Eq. (68). The degree distribution on the finite components is given by

$$P(k|0) = \frac{\left( \dfrac{p_1}{3p_3} \right)^k}{\dfrac{p_1^2}{3p_3} + \dfrac{p_1^2 p_2}{9p_3^2} + \dfrac{p_1^3}{27p_3^2}} \, P(k). \tag{75}$$

Thus, the mean degree on the giant component is given by

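$$\mathbb{E}[K|1] = \frac{1 - \left( \dfrac{p_1}{3p_3} \right)^2}{1 - \dfrac{p_1^2}{3p_3} - \dfrac{p_1^2 p_2}{9p_3^2} - \dfrac{p_1^3}{27p_3^2}} \, \langle K \rangle, \tag{76}$$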

while the mean degree on the finite components is given by

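$$\mathbb{E}[K|0] = \frac{\left( \dfrac{p_1}{3p_3} \right)^2}{\dfrac{p_1^2}{3p_3} + \dfrac{p_1^2 p_2}{9p_3^2} + \dfrac{p_1^3}{27p_3^2}} \, \langle K \rangle. \tag{77}$$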
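As a consistency check, the conditional means can be obtained by summing $k P(k|1)$ and $k P(k|0)$ directly. Writing $x = p_1/(3p_3)$ and using $x\,G_1(x) = x^2$ at the fixed point, the sums reduce to $\mathbb{E}[K|1] = (1-x^2)\langle K \rangle / g$ and $\mathbb{E}[K|0] = x^2 \langle K \rangle/(1-g)$. The sketch below (illustrative names, assuming $0 < p_1 < 3p_3$) verifies this numerically together with the normalization of Eqs. (74) and (75).

```python
def conditional_degree_stats(p1, p2, p3):
    """Degree distributions on the giant and finite components of a
    ternary network; assumes 0 < p1 < 3*p3 and p1 + p2 + p3 = 1."""
    P = {1: p1, 2: p2, 3: p3}
    x = p1 / (3.0 * p3)                           # 1 - g_tilde
    g = 1.0 - sum(P[k] * x**k for k in P)         # fraction on the giant component
    P_gc = {k: (1.0 - x**k) * P[k] / g for k in P}      # Eq. (74)
    P_fc = {k: x**k * P[k] / (1.0 - g) for k in P}      # Eq. (75)
    return P, P_gc, P_fc, x, g

def mean(dist):
    """Mean of a distribution given as {degree: probability}."""
    return sum(k * p for k, p in dist.items())

if __name__ == "__main__":
    P, P_gc, P_fc, x, g = conditional_degree_stats(0.3, 0.3, 0.4)
    print(mean(P_gc), (1.0 - x**2) * mean(P) / g)
    print(mean(P_fc), x**2 * mean(P) / (1.0 - g))
```

For $(p_1,p_2,p_3) = (0.3,0.3,0.4)$ both routes give $\mathbb{E}[K|1] = 2.1875$ and $\mathbb{E}[K|0] = 1.3125$, compared with $\langle K \rangle = 2.1$ for the whole network.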

The assortativity coefficient of the ternary network is given by

$$r = \frac{-18 p_1^2 p_3^2}{27 p_2 p_3^3 + 9 p_1^2 p_3 (p_2 + 2p_3) + 27 p_1 p_3^2 (p_2 + 2p_3) + p_1^3 (p_2 + 6p_3)}. \tag{78}$$

As $p_3$ is increased, while keeping $p_2$ fixed, the network becomes denser and the fraction of nodes, $g$, which reside on the giant component increases, reaching $g=1$ at $p_1=0$, i.e. $p_3 = 1-p_2$ (namely, at the point at which the number of leaf nodes vanishes). At this point the giant component encompasses the entire network and the assortativity coefficient vanishes. In the opposite case, in which $p_3$ is decreased, the network becomes sparser. The percolation transition takes place at $p_3 = p_{3,c} = (1-p_2)/4$. In the limit of sparse networks just above the percolation threshold the assortativity coefficient can be approximated by

$$r \simeq -3\,\frac{1-p_2}{9+7p_2} - 64\,p_2\,\frac{p_3 - p_{3,c}}{(9+7p_2)^2} + O\left[ (p_3 - p_{3,c})^2 \right]. \tag{79}$$
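The expansion can be compared with the full expression of Eq. (78). The sketch below is illustrative (function names are ours) and assumes $p_1 = 1-p_2-p_3$ and $p_{3,c} = (1-p_2)/4$. At the threshold itself, $p_1 = 3p_3$, Eq. (78) reduces to $r = -3p_3/(4p_2+9p_3)$, which coincides with the leading term of Eq. (79).

```python
def assortativity(p2, p3):
    """Assortativity coefficient of Eq. (78), with p1 fixed by normalization."""
    p1 = 1.0 - p2 - p3
    numerator = -18.0 * p1**2 * p3**2
    denominator = (
        27.0 * p2 * p3**3
        + 9.0 * p1**2 * p3 * (p2 + 2.0 * p3)
        + 27.0 * p1 * p3**2 * (p2 + 2.0 * p3)
        + p1**3 * (p2 + 6.0 * p3)
    )
    return numerator / denominator

def assortativity_near_threshold(p2, p3):
    """First-order expansion of Eq. (79) around p3_c = (1 - p2)/4."""
    p3c = (1.0 - p2) / 4.0
    return -3.0 * (1.0 - p2) / (9.0 + 7.0 * p2) - 64.0 * p2 * (p3 - p3c) / (9.0 + 7.0 * p2) ** 2

if __name__ == "__main__":
    p2 = 0.2
    for dp in (0.0, 1e-3, 1e-2):
        p3 = (1.0 - p2) / 4.0 + dp
        print(dp, assortativity(p2, p3), assortativity_near_threshold(p2, p3))
```

The two expressions agree at the threshold and separate at second order in $p_3 - p_{3,c}$.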

In the limit of $p_3 \to 1$ (and $p_1, p_2 \to 0$), the ternary network becomes a random regular graph (RRG) with a degenerate degree distribution of the form $P(k) = \delta_{k,3}$, while in the limit of $p_2 \to 1$ (and $p_1, p_3 \to 0$) it becomes an RRG with $P(k) = \delta_{k,2}$. In general, random regular graphs exhibit degree distributions of the form $P(k) = \delta_{k,c}$, where $c$ is an integer. In RRGs with $c \ge 3$ the giant component encompasses the entire network, namely $g=1$ Bonneau2017 (). Thus, the degree distribution of the giant component is simply $P(k|1) = \delta_{k,c}$.

The case of an RRG with $c=2$, which corresponds to the limit of $p_2 \to 1$ and $p_1, p_3 \to 0$, is special. An RRG with $c=2$ consists of a collection of closed cycles. The local structure of all the cycles is identical, and follows the overall degree distribution of the network, $P(k) = \delta_{k,2}$. Thus, unlike the case of other configuration model networks, there is no further information to be revealed about the degree distribution of the giant component. The generating function method used in this paper does not permit the calculation of the percolating fraction, $g$, in the case of RRGs with $c=2$, as the value of $\tilde{g}$ turns out to be indeterminate in this case (since $G_1(x)=x$ for $c=2$, every value of $x$ is a fixed point). However, an interesting analogy between the cycles of RRGs with $c=2$ and the cycles which appear in the theory of random permutations enables one to conclude that the average size of the longest cycle is extensive in $N$. In random permutations of $N$ objects, the average length of the longest cycle turns out to be $\lambda N$, where $\lambda \simeq 0.6243$ is the Golomb-Dickman constant Shepp1966 (). However, in numerical simulations of RRGs with $c=2$ we found that the average length of the longest cycle, while also proportional to $N$, has a different prefactor. This difference can be understood from the fact that the two systems differ in some details. For example, unlike the case of random permutations, in RRGs with $c=2$ fixed points (namely isolated nodes) and cycles of length $2$ (namely dimers) are not allowed, and the minimal cycle length is $3$.
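The random-permutation benchmark quoted above is easy to reproduce. The following sketch is only an illustration of that benchmark and is not the RRG simulation referred to in the text. It samples uniform random permutations with Python's standard library, finds the longest cycle of each, and averages the ratio of that length to $N$. The function names, the seed and the sample sizes are arbitrary choices.

```python
import random

def longest_cycle_length(perm):
    """Length of the longest cycle of a permutation given as a list,
    where perm[i] is the image of i."""
    n = len(perm)
    seen = [False] * n
    longest = 0
    for start in range(n):
        if seen[start]:
            continue
        length = 0
        j = start
        while not seen[j]:      # walk the cycle containing `start`
            seen[j] = True
            j = perm[j]
            length += 1
        longest = max(longest, length)
    return longest

def mean_longest_cycle_fraction(n, trials, seed=0):
    """Average of (longest cycle length)/n over random permutations."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)       # uniform random permutation
        total += longest_cycle_length(perm)
    return total / (trials * n)

if __name__ == "__main__":
    print(mean_longest_cycle_fraction(200, 2000))
```

For moderate $N$ the sampled average is expected to lie close to the Golomb-Dickman constant, up to finite-size and sampling fluctuations.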

### vi.4 Configuration model networks with a Zipf degree distribution

Consider a configuration model network with a Zipf degree distribution of the form

$$P(k) = \frac{e^{\alpha k_{\min}}}{\Phi\left(e^{-\alpha},1,k_{\min}\right)} \, \frac{e^{-\alpha k}}{k}, \tag{80}$$

where $\Phi(z,s,a)$ is the Lerch transcendent function Olver2010 (). This distribution exhibits a power-law component of the form $k^{-\gamma}$, with $\gamma=1$, and a cutoff in the form of an exponential tail controlled by the parameter $\alpha$, which sets the range of the tail. The mean degree is given by

$$\langle K \rangle = \frac{\Phi\left(e^{-\alpha},0,k_{\min}\right)}{\Phi\left(e^{-\alpha},1,k_{\min}\right)}. \tag{81}$$

The generating functions take the form

$$G_0(x) = x^{k_{\min}} \, \frac{\Phi\left(x e^{-\alpha},1,k_{\min}\right)}{\Phi\left(e^{-\alpha},1,k_{\min}\right)}, \tag{82}$$

and

$$G_1(x) = x^{k_{\min}-1} \, \frac{\Phi\left(x e^{-\alpha},0,k_{\min}\right)}{\Phi\left(e^{-\alpha},0,k_{\min}\right)}. \tag{83}$$

Note that $G_0(1)=1$ and $G_1(1)=1$.

From this point on, we focus on the case $k_{\min}=1$, where many of the quantities mentioned above become significantly simpler. In particular, the mean degree becomes

$$\langle K \rangle = -\frac{1}{\left(e^{\alpha}-1\right) \ln\left(1-e^{-\alpha}\right)}, \tag{84}$$

and the two degree generating functions become

$$G_0(x) = \frac{\ln\left(1-e^{-\alpha}x\right)}{\ln\left(1-e^{-\alpha}\right)}, \tag{85}$$

and

$$G_1(x) = \frac{1-e^{-\alpha}}{1-e^{-\alpha}x}. \tag{86}$$

Inserting the expression of $G_1(x)$ given by Eq. (86) into Eq. (6) we find that there is a non-trivial solution of the form

$$\tilde{g} = 2 - e^{\alpha}. \tag{87}$$

Using Eq. (7) we find that

$$g = 1 + \frac{\alpha}{\ln\left(1-e^{-\alpha}\right)}. \tag{88}$$
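These closed forms can be checked against a direct iteration of the generating functions. The sketch below is illustrative and assumes, as before, that Eq. (6) is the fixed-point condition $1-\tilde{g} = G_1(1-\tilde{g})$ and that Eq. (7) reads $g = 1-G_0(1-\tilde{g})$. With Eq. (86) the fixed-point condition factorizes as $(x-1)\left[e^{-\alpha}x - (1-e^{-\alpha})\right]=0$, whose non-trivial root $x = e^{\alpha}-1$ lies below $1$ only for $\alpha < \ln 2$.

```python
import math

def G0(x, alpha):
    """Degree generating function of Eq. (85), for k_min = 1."""
    return math.log(1.0 - math.exp(-alpha) * x) / math.log(1.0 - math.exp(-alpha))

def G1(x, alpha):
    """Cavity generating function of Eq. (86), for k_min = 1."""
    return (1.0 - math.exp(-alpha)) / (1.0 - math.exp(-alpha) * x)

def giant_component_numeric(alpha, iterations=500):
    """Iterate x -> G1(x) from x = 0 and return (g_tilde, g)."""
    x = 0.0
    for _ in range(iterations):
        x = G1(x, alpha)
    x = min(x, 1.0)  # guard against rounding slightly above 1
    return 1.0 - x, 1.0 - G0(x, alpha)

if __name__ == "__main__":
    alpha = 0.5
    print(giant_component_numeric(alpha))
    print(2.0 - math.exp(alpha), 1.0 + alpha / math.log(1.0 - math.exp(-alpha)))
```

For $\alpha=0.5$ both routes give $\tilde{g} \simeq 0.351$ and $g \simeq 0.464$, while for $\alpha > \ln 2$ the iteration converges to the trivial fixed point $x=1$.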

The percolation transition takes place at $\alpha = \ln 2$, below which there is a giant component. The degree distribution on the giant component is given by Eq. (14), where $\tilde{g}$ is given by Eq. (87) and $g$ is given by Eq. (88). The mean degree on the giant component is given by

$$\mathbb{E}[K|1] = \frac{e^{\alpha}-2}{\left(1-e^{-\alpha}\right)\left[ \alpha + \ln\left(1-e^{-\alpha}\right) \right]}. \tag{89}$$

The assortativity coefficient is given by

$$r = -\frac{\left(1-e^{-\alpha}\right)^2}{e^{2\alpha} - 3e^{\alpha} + 3}. \tag{90}$$

For small values of $\alpha$ we obtain

$$r \simeq -\alpha^2 - \frac{\alpha^4}{12} + O\left(\alpha^6\right). \tag{91}$$

For values of $\alpha$ slightly below $\ln 2$, namely just above the percolation threshold, we obtain

$$r \simeq -\frac{1}{4} + \frac{5}{4} \left( \alpha - \ln 2 \right)^2 + O\left[ \left( \alpha - \ln 2 \right)^3 \right]. \tag{92}$$
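Both expansions follow from Eq. (90) and can be checked by direct evaluation. The sketch below (illustrative names) compares Eq. (90) with Eq. (91) at small $\alpha$ and with Eq. (92) close to $\alpha = \ln 2$, where Eq. (90) gives exactly $r=-1/4$.

```python
import math

def assortativity(alpha):
    """Assortativity coefficient of Eq. (90), for k_min = 1."""
    return -((1.0 - math.exp(-alpha)) ** 2) / (
        math.exp(2.0 * alpha) - 3.0 * math.exp(alpha) + 3.0
    )

def small_alpha_expansion(alpha):
    """Leading terms of Eq. (91)."""
    return -(alpha**2) - alpha**4 / 12.0

def near_threshold_expansion(alpha):
    """Leading terms of Eq. (92)."""
    return -0.25 + 1.25 * (alpha - math.log(2.0)) ** 2

if __name__ == "__main__":
    print(assortativity(0.01), small_alpha_expansion(0.01))
    a = math.log(2.0) - 0.05
    print(assortativity(a), near_threshold_expansion(a))
```

The absence of a linear term in Eq. (92) means that $r$ reaches its most negative value, $-1/4$, at the threshold itself.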

### vi.5 Configuration model networks with a power-law degree distribution

Consider a configuration model network with a power-law degree distribution of the form

$$P(k) = \frac{A}{k^{\gamma}} \tag{93}$$

for $k_{\min} \le k \le k_{\max}$, where the normalization coefficient is

$$A = \frac{1}{\zeta(\gamma,k_{\min}) - \zeta(\gamma,k_{\max}+1)} \tag{94}$$

and $\zeta(s,a)$ is the Hurwitz zeta function Olver2010 (). The mean degree is given by

$$\langle K \rangle = \frac{\zeta(\gamma-1,k_{\min}) - \zeta(\gamma-1,k_{\max}+1)}{\zeta(\gamma,k_{\min}) - \zeta(\gamma,k_{\max}+1)}, \tag{95}$$

while the second moment of the degree distribution is

$$\langle K^2 \rangle = \frac{\zeta(\gamma-2,k_{\min}) - \zeta(\gamma-2,k_{\max}+1)}{\zeta(\gamma,k_{\min}) - \zeta(\gamma,k_{\max}+1)}. \tag{96}$$

For $\gamma \le 2$ the mean degree diverges when $k_{\max} \to \infty$. For $2 < \gamma \le 3$ the mean degree is bounded while the second moment, $\langle K^2 \rangle$, diverges. For $\gamma > 3$ both moments are bounded. For and (where nodes of degrees and do not exist), namely the Molloy and Reed criterion is satisfied and the network exhibits a giant component Molloy1995 (); Molloy1998 (). Moreover, under these conditions the giant component encompasses the entire network Bonneau2017 ().
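The behavior of the two moments can be explored numerically for finite $k_{\max}$. The sketch below is illustrative only. It replaces the Hurwitz zeta differences of Eqs. (94)-(96) by the equivalent finite sums over $k_{\min} \le k \le k_{\max}$, and tests the Molloy and Reed criterion in its standard form $\langle K^2 \rangle > 2\langle K \rangle$. The function names are ours.

```python
def power_law_moments(gamma, k_min, k_max):
    """Return (<K>, <K^2>) for P(k) = A k^(-gamma), k_min <= k <= k_max,
    using direct sums instead of Hurwitz zeta differences."""
    ks = range(k_min, k_max + 1)
    norm = sum(k ** (-gamma) for k in ks)               # equals 1/A
    mean = sum(k ** (1.0 - gamma) for k in ks) / norm   # Eq. (95)
    second = sum(k ** (2.0 - gamma) for k in ks) / norm  # Eq. (96)
    return mean, second

def molloy_reed(mean, second):
    """Standard Molloy-Reed criterion for the existence of a giant component."""
    return second > 2.0 * mean

if __name__ == "__main__":
    for k_max in (100, 1000, 10000):
        mean, second = power_law_moments(2.5, 1, k_max)
        print(k_max, mean, second, molloy_reed(mean, second))
```

For $\gamma = 2.5$ the mean degree saturates as $k_{\max}$ grows while the second moment keeps increasing, in line with the regimes described above.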

The case of $k_{\min}=1$ is particularly interesting. In this case, the degree distribution is given by Eq. (93) with

$$A = \frac{1}{\zeta(\gamma) - \zeta(\gamma,k_{\max}+1)}, \tag{97}$$

and its first two moments are

$$\langle K \rangle = \frac{\zeta(\gamma-1) - \zeta(\gamma-1,k_{\max}+1)}{\zeta(\gamma) - \zeta(\gamma,k_{\max}+1)}, \tag{98}$$

and

 ⟨K2⟩=ζ(γ−2)−ζ(γ−2,kmax+1)ζ(γ)−