# Percolation transition in correlated hypergraphs

## Abstract

Correlations are known to play a crucial role in determining the structure of complex networks. Here we study how their presence affects the computation of the percolation threshold in random hypergraphs. In order to mimic the correlations in real networks, we build hypergraphs from a generalized hidden-variable ensemble and study the percolation transition by mapping the problem to the fully connected Potts model with heterogeneous couplings.

## 1 Introduction

In the last decade the topological properties of networks have attracted great interest [1, 2, 3, 4], mainly driven by the emergence of novel dynamical effects in random processes defined on them [5, 6]. The impact of this research is wide and has important consequences in the domains of biology, socio-economic theories and technological infrastructure design. It has been shown that complex networks can display a universal behavior which strongly affects the dynamics of statistical models built on them.

A major example is provided by the percolation transition [7, 8, 9, 10], one of the most famous emergent collective phenomena that can be defined on complex networks. Its dependence on the degree distribution, degree correlations and directionality of the links has been extensively studied in recent years [11, 12, 13]. In particular, the interplay between topological features and the nature of the percolation transition has been investigated within different types of random network ensembles [14, 15, 16, 17, 18, 19, 20]. These constitute null models for networks, each of them being formed by graphs sharing with real complex networks a number of structural features, such as the degree distribution or correlations between neighboring components.

Recently, attention has been devoted to the structure of hypergraphs [21, 22], which describe, for example, many on-line social and professional communities that collaborate in order to give a semantic structure to a set of data. Among these communities, also named folksonomies, we mention Flickr and CiteULike, whose structure is formed by triplets of users, resources and tags linked together. Such networks show important correlations, since the interests of the user and the subject of the resources (for example a picture for Flickr and an article for CiteULike) usually have a strong inter-dependence.

These correlations are responsible for the build-up of communities observed after projecting these networks onto networks made only of user-user, resource-resource and tag-tag interactions [22].

In a recent paper [21], the percolation transition in random uncorrelated hypergraphs was characterized, providing a first approximation to the properties of real hypergraphs. In this work we extend these results to more general ensembles of correlated hypergraphs, which can give a better description of real social communities. We will show how to build null models for hypergraphs based on a parallel construction recently introduced for networks [19, 20]. Moreover, we will derive the percolation threshold of correlated hypergraphs by mapping the problem to the solution of a fully connected Potts model with heterogeneous couplings [23], following the method developed in a recent paper on the percolation phase transition in simple networks [24].

## 2 Correlated random hypergraphs

To mimic the correlations of real hypergraphs we propose to study randomized ensembles of correlated hypergraphs within the same theoretical framework developed in [19, 20] for networks. These works show how to define statistical microcanonical and canonical network ensembles consistently. In particular, they relate the random network ensemble $G(N,L)$, a collection of graphs with a fixed number of nodes $N$ and links $L$, to the ensemble $G(N,p)$ formed by $N$ nodes linked pairwise with probability $p$. In the latter case the number of links fluctuates and has a Poissonian distribution with mean $\langle L \rangle = pN(N-1)/2$. In the thermodynamic limit it is known that these two ensembles share the same statistical properties when the external parameters, $L$ and $p$, satisfy the relation $L = pN(N-1)/2$. We call $G(N,L)$ the microcanonical ensemble because it satisfies a hard constraint on the extensive number of links, like the energy constraint for the usual microcanonical ensemble. The ensemble $G(N,p)$ is instead canonical, in the sense that a Lagrange multiplier lets the number of links fluctuate, fixing only its average value to $\langle L \rangle$. Generalizing this approach to several properties, such as the degree sequence or a given division into communities, it is possible to define complex microcanonical and canonical network ensembles satisfying more stringent constraints [19, 20]. Here we sketch how this general framework can be extended to hypergraphs.

Let us consider a hypergraph formed by nodes of $K$ different types, $\alpha = 1, \dots, K$, linked in groups of $K$. For example, in Flickr we will have $K = 3$ and $\alpha = 1, 2, 3$ indicating respectively agents, pictures and tags. We also assume, in order to gain in generality, that each node $i_\alpha$ can be associated with a feature $r_{i_\alpha}$ indicating a given classification of the nodes. Again, in the case of Flickr, the agents can be classified according to their interests or age, the tags according to their general meaning and the pictures according to the type of subject they represent.

A given random correlated ensemble of hypergraphs can be defined as the set of all the hypergraphs which satisfy a number of constraints.

In particular, we choose these constraints to be the number of hyperlinks each node has and the number of hyperlinks bridging sets of nodes with different features. Following these prescriptions we can construct microcanonical and canonical hypergraph ensembles.

### 2.1 Microcanonical hypergraph ensembles

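Let us define a hypergraph by the tensor $a_{i_1,i_2,\dots,i_K}$, with $a_{i_1,i_2,\dots,i_K} = 1$ if the nodes $i_1, i_2, \dots, i_K$ are linked together and $0$ otherwise. Let us call $N_\alpha$ the number of nodes of type $\alpha$. The networks in the microcanonical hypergraph ensemble will then satisfy the following conditions,

$$
\begin{cases}
k_{i_\alpha} = \sum_{\{i_\gamma\},\,\gamma\neq\alpha} a_{i_1,\dots,i_\alpha,\dots,i_K} \\
A(x_1,x_2,\dots,x_K) = \sum_{\{i_\gamma\}} a_{i_1,\dots,i_\alpha,\dots,i_K} \prod_{\gamma} \delta(r_{i_\gamma} - x_\gamma)
\end{cases}
\tag{1}
$$

with $k_{i_\alpha}$ being the hyperdegree of node $i_\alpha$ and with $A(x_1,x_2,\dots,x_K)$ being the number of hyperedges between nodes of features $x_1, x_2, \dots, x_K$.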

### 2.2 Canonical hypergraph ensembles

By using the statistical mechanics approach described in [19, 20] it can be shown that networks in the canonical hypergraph ensembles, satisfying on average the constraints (1), can be constructed by assigning a hyperlink with probability

$$
p_{i_1,i_2,\dots,i_K} = \frac{\theta_{i_1}\theta_{i_2}\dots\theta_{i_K}\, W(r_{i_1},r_{i_2},\dots,r_{i_K})}{1 + \theta_{i_1}\theta_{i_2}\dots\theta_{i_K}\, W(r_{i_1},r_{i_2},\dots,r_{i_K})}. \tag{2}
$$

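If the constants $\theta_{i_\alpha}$ and the tensor $W(r_1,r_2,\dots,r_K)$ satisfy the following conditions

$$
\begin{cases}
\overline{k_{i_\alpha}} = \sum_{\{i_\gamma\},\,\gamma\neq\alpha} p_{i_1,i_2,\dots,i_K} \\
\overline{A(x_1,x_2,\dots,x_K)} = \sum_{\{i_\gamma\}} p_{i_1,i_2,\dots,i_K} \prod_{\gamma=1}^{K} \delta(r_{i_\gamma} - x_\gamma)
\end{cases}
\tag{3}
$$

then the hyperdegrees and the number of hyperedges

$$
\begin{cases}
k_{i_\alpha} = \sum_{\{i_\gamma\},\,\gamma\neq\alpha} a_{i_1,\dots,i_\alpha,\dots,i_K} \\
A(x_1,x_2,\dots,x_K) = \sum_{\{i_\gamma\}} a_{i_1,\dots,i_\alpha,\dots,i_K} \prod_{\gamma} \delta(r_{i_\gamma} - x_\gamma)
\end{cases}
$$

are Poisson distributed [17] with averages $\overline{k_{i_\alpha}}$ and $\overline{A(x_1,x_2,\dots,x_K)}$ given by (3).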

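In the limit of hypergraphs with a linking probability independent of the features of the nodes, i.e. $W(r_{i_1},\dots,r_{i_K}) = 1$, the probability (2) becomes

$$
p_{i_1,i_2,\dots,i_K} = \frac{\theta_{i_1}\theta_{i_2}\dots\theta_{i_K}}{1 + \theta_{i_1}\theta_{i_2}\dots\theta_{i_K}}. \tag{4}
$$

Moreover, we recover the configuration model for uncorrelated hypergraphs by taking the limit $\theta_{i_1}\theta_{i_2}\dots\theta_{i_K} \ll 1$. In this case the hyperedge probability becomes

$$
p_{i_1,i_2,\dots,i_K} \simeq \theta_{i_1}\theta_{i_2}\dots\theta_{i_K} = \frac{\overline{k_{i_1}} \dots \overline{k_{i_K}}}{\left(\langle \overline{k} \rangle N\right)^{K-1}}, \tag{5}
$$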

and this last expression describes uncorrelated hypergraphs whose properties have been studied in [21].
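As an illustration of the canonical construction, the following minimal Python sketch draws one hypergraph by visiting every $K$-tuple of nodes (one node per type) and linking it with the probability of Eq. (2). The function names, the nested-list data layout and the example weight `W` are assumptions made for this sketch and are not taken from [19, 20]; the exhaustive loop is only practical for very small systems.

```python
import itertools
import random

def hyperlink_probability(thetas, w_value):
    """Eq. (2): p = x / (1 + x), with x = theta_1 * ... * theta_K * W."""
    x = w_value
    for t in thetas:
        x *= t
    return x / (1.0 + x)

def sample_hypergraph(theta, feature, W, rng):
    """Draw one hypergraph from the canonical ensemble (illustrative layout).

    theta[a][i]   : hidden variable of node i of type a
    feature[a][i] : feature label of node i of type a
    W             : callable mapping a tuple of K features to a weight >= 0
    Returns the list of hyperedges as tuples (i_1, ..., i_K).
    """
    K = len(theta)
    edges = []
    # Enumerate all K-tuples containing exactly one node of each type.
    for nodes in itertools.product(*(range(len(theta[a])) for a in range(K))):
        th = [theta[a][nodes[a]] for a in range(K)]
        r = tuple(feature[a][nodes[a]] for a in range(K))
        if rng.random() < hyperlink_probability(th, W(r)):
            edges.append(nodes)
    return edges

# Toy example with K = 3 types (values chosen arbitrarily).
theta = [[0.2, 0.5], [0.3, 0.4, 0.1], [0.6, 0.2]]
feature = [[0, 1], [0, 0, 1], [1, 0]]
W = lambda r: 2.0 if len(set(r)) == 1 else 0.5  # hypothetical feature affinity
edges = sample_hypergraph(theta, feature, W, random.Random(0))
```

Setting `W` to a constant equal to one reproduces Eq. (4), and for small products of the hidden variables the probability reduces to the product itself, as in Eq. (5).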

## 3 Potts Model and percolation transition

It is a well-known result [25, 26, 27] of statistical mechanics that the Potts phase transition in a fully connected system, in the limit $q \to 1$ of the number of colors $q$, describes the bond percolation transition [23, 28] in Erdős-Rényi random graphs [29]. Recently a fully connected Potts model with heterogeneous couplings [23, 24] was introduced in order to study the percolation transition in random networks with heterogeneous degree distribution and additional structural properties. Here we show that such a method can be extended to correlated canonical hypergraphs.

Instead of a pure Potts model we consider the following generalized Hamiltonian

$$
H = -\sum_{i_1 i_2 \dots i_K} J_{i_1 i_2 \dots i_K}\, \delta_{s_{i_1} s_{i_2} \dots s_{i_K}} \tag{6}
$$

where the summation runs over all the $K$-sets that compose the fully connected hypergraph and the variables $s_{i_\alpha}$ are Potts spins taking values from $1, \dots, q$. We define for later convenience the vector $\mathbf{N} = (N_1, \dots, N_K)$, where $N_\alpha$ is the number of sites of a given type $\alpha$. Introducing an external parameter $\beta$ as the inverse temperature, the partition function of the Hamiltonian model reads

$$
Z = \sum_{s_1 s_2 \dots s_K} e^{-\beta H[s_\alpha]}, \tag{7}
$$

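whose cluster expansion gives the following expression

$$
Z = \sum_{\mathcal{H} \in K(\mathcal{H})} \prod_{i_1 i_2 \dots i_K \in E} v_{i_1 i_2 \dots i_K}\, q^{C(\mathcal{H})}. \tag{8}
$$

The value $C(\mathcal{H})$ is the number of connected components of the hypergraph $\mathcal{H}$ and

$$
v_{i_1 i_2 \dots i_K} = e^{\beta J_{i_1 i_2 \dots i_K}} - 1 \simeq \beta J_{i_1,i_2,\dots,i_K},
$$

where the last expression is valid in the limit of small $\beta J_{i_1,i_2,\dots,i_K}$. After identifying the coupling

$$
\beta J_{i_1,i_2,\dots,i_K} = \frac{p_{i_1 i_2 \dots i_K}}{1 - p_{i_1 i_2 \dots i_K}} = \theta_{i_1}\theta_{i_2}\dots\theta_{i_K}\, W(r_{i_1},r_{i_2},\dots,r_{i_K}) \tag{9}
$$

the sum in (8) becomes a weighted sum over random hypergraphs in which each link has the probability reported in equation (2). If the small-coupling condition $\beta J_{i_1,i_2,\dots,i_K} \ll 1$ holds, the transition of the Hamiltonian model for $q \to 1$ coincides with the percolation transition in the random correlated hypergraph ensemble. In the next section we will solve the Potts model, providing the percolation condition for hypergraphs in the ensemble, which can be written in the form

$$
\det(\Xi) = 0 \tag{10}
$$

with the matrix $\Xi$ to be determined in the following.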

### 3.1 Solving Potts model

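In this subsection we solve the Potts model (6), in the generic case, for arbitrary $q$ and $J$ given by (9). We introduce the order parameter $c^{\alpha}_{\theta,r}(s)$ indicating the fraction of nodes of type $\alpha$ associated with the 'hidden variable' $\theta$ and the feature $r$, having Potts spin equal to $s$,

$$
c^{\alpha}_{\theta,r}(s) = \frac{1}{N_\alpha\, p_\alpha(\theta,r)} \sum_{i_\alpha} \delta_{s,s_{i_\alpha}}\, \delta(\theta,\theta_{i_\alpha})\, \delta_{r,r_{i_\alpha}}, \tag{11}
$$

where $p_\alpha(\theta,r)$ is the probability distribution to have a node of type $\alpha$ with local properties described by the variables $\theta$ and $r$. We have introduced for convenience the Dirac delta function, with its proper normalization, and the Kronecker delta, defined as

$$
\delta(\theta,\theta') = \begin{cases} 0 & \theta \neq \theta' \\ \infty & \theta = \theta' \end{cases}
\qquad
\delta_{s,s'} = \begin{cases} 0 & s \neq s' \\ 1 & s = s' \end{cases}
\tag{12}
$$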

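Noticing the symmetry of the Hamiltonian in terms of $c^{\alpha}_{\theta,r}(s)$,

$$
H[\{c^{\alpha}_{\theta,r}(s)\}] = -\prod_{\gamma} N_\gamma \sum_{s,\{r\}} \prod_{\gamma} \left[ \int d\theta_\gamma\, p_\gamma(\theta_\gamma,r_\gamma)\, c^{\gamma}_{\theta_\gamma,r_\gamma}(s) \right] J(\{\theta\},\{r\}) \tag{13}
$$

we can express the partition function as a summation over these new variables

$$
Z = \sum_{\{c^{\alpha}_{\theta,r}(s)\}} e^{-\beta F[\{c^{\alpha}_{\theta,r}(s)\}]} \tag{14}
$$

where the free energy function is given by

$$
\beta F[\{c^{\alpha}_{\theta,r}(s)\}] = \beta H[\{c^{\alpha}_{\theta,r}(s)\}] + \sum_{\alpha} N_\alpha \int d\theta_\alpha \sum_{r_\alpha} p_\alpha(\theta_\alpha,r_\alpha)\, c^{\alpha}_{\theta_\alpha,r_\alpha}(s) \ln c^{\alpha}_{\theta_\alpha,r_\alpha}(s). \tag{15}
$$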

The phase transition of the Potts model is determined by the point at which the free energy becomes unstable with respect to variations of the order parameters around the symmetric solution

$$
c^{\alpha}_{\theta,r} = \frac{1}{q}. \tag{16}
$$

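In order to determine the stability of the free energy we evaluate the Hessian, with components

$$
H_{\{a\},\{a'\}}(s,s') = \frac{\partial^2\, \beta F}{\partial c_{\{a\}}(s)\, \partial c_{\{a'\}}(s')} \tag{17}
$$

where we indicate by $\{a\}$ the triplets $(\alpha, \theta_\alpha, r_\alpha)$. Carrying out the calculation explicitly, equation (17) reads

$$
\begin{aligned}
H_{\{a\},\{a'\}}(s,s') = \delta_{s,s'} \Bigg\{ & \delta_{\{a\},\{a'\}} \frac{N_\alpha\, p_\alpha(\theta_\alpha,r_\alpha)}{c^{\alpha}_{\theta_\alpha,r_\alpha}(s)} - p_\alpha(\theta_\alpha,r_\alpha)\, p_{\alpha'}(\theta'_{\alpha'},r'_{\alpha'}) \\
& \times \prod_{\gamma} N_\gamma \prod_{\gamma \neq \alpha,\alpha'} \left[ \int d\theta_\gamma \sum_{r_\gamma} p_\gamma(\theta_\gamma,r_\gamma)\, c^{\gamma}_{\theta_\gamma,r_\gamma}(s) \right] \beta J(\{\theta\},\{r\}) \Bigg\},
\end{aligned}
\tag{18}
$$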

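with $\{a'\} = (\alpha', \theta'_{\alpha'}, r'_{\alpha'})$. Taking $c^{\alpha}_{\theta,r} = 1/q$ from equation (16), we obtain that the eigenvalue problem associated with the Hessian matrix is

$$
\begin{aligned}
\left[ \lambda + N_\alpha\, p_\alpha(\theta_\alpha,r_\alpha)\, q \right] e(\{a\}) = \frac{N_\alpha\, p_\alpha(\theta_\alpha,r_\alpha)}{q^{K-2}} \sum_{\alpha' \neq \alpha} \Bigg\{ & \prod_{\gamma \neq \alpha,\alpha'} \left[ \int d\theta_\gamma \sum_{r_\gamma} N_\gamma\, p_\gamma(\theta_\gamma,r_\gamma) \right] \\
& \times N_{\alpha'} \int d\theta_{\alpha'} \sum_{r_{\alpha'}} p_{\alpha'}(\theta_{\alpha'},r_{\alpha'})\, \beta J(\{\theta\},\{r\})\, e(\{a'\}) \Bigg\},
\end{aligned}
\tag{19}
$$

where $\lambda$ and $e(\{a\})$ are respectively the eigenvalue and the eigenvector of the problem. Equation (19) can be written as

$$
e(\{a\}) = \frac{N_\alpha\, p_\alpha(\theta_\alpha,r_\alpha)}{\left( \lambda + N_\alpha\, p_\alpha(\theta_\alpha,r_\alpha)\, q \right) q^{K-2}}\, \Delta(\{a\}) \tag{20}
$$

with $\Delta(\{a\})$ defined as

$$
\begin{aligned}
\Delta(\{a\}) = \sum_{\alpha' \neq \alpha} \Bigg\{ & \prod_{\gamma \neq \alpha,\alpha'} \left[ \int d\theta_\gamma \sum_{r_\gamma} N_\gamma\, p_\gamma(\theta_\gamma,r_\gamma) \right] \\
& \times N_{\alpha'} \int d\theta_{\alpha'} \sum_{r_{\alpha'}} p_{\alpha'}(\theta_{\alpha'},r_{\alpha'})\, \beta J(\{\theta\},\{r\})\, e(\{a'\}) \Bigg\}.
\end{aligned}
\tag{21}
$$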

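The symmetric solution becomes unstable when the maximal eigenvalue of the Hessian problem becomes positive. Therefore, in order to determine the critical point of the Potts model for $q \to 1$ we consider equations (20) and (21) when the eigenvalue vanishes, $\lambda = 0$. If we take into account the explicit form of the coupling constant given by Eq. (9), every term of $\Delta(\{a\})$ carries a factor $\theta_\alpha$, so the vector takes the form $\Delta(\{a\}) = \theta_\alpha\, v(\alpha, r_\alpha)$, where the $v$'s satisfy the linear system of equations

$$
\Xi\, v = 0 \tag{22}
$$

with the matrix $\Xi$ given by

$$
\begin{aligned}
\Xi_{\{\alpha,r_\alpha\}\{\alpha',r_{\alpha'}\}} = {} & -\delta_{\alpha,\alpha'}\, \delta_{r_\alpha,r_{\alpha'}} + \left[ 1 - \delta_{\alpha,\alpha'}\, \delta_{r_\alpha,r_{\alpha'}} \right] \prod_{\gamma \neq \alpha,\alpha'} \left[ \int d\theta_\gamma \sum_{r_\gamma} N_\gamma\, p_\gamma(\theta_\gamma,r_\gamma)\, \theta_\gamma \right] \\
& \times \int d\theta_{\alpha'}\, N_{\alpha'}\, p_{\alpha'}(\theta_{\alpha'},r_{\alpha'})\, (\theta_{\alpha'})^2\, W(r_1,\dots,r_\alpha,\dots,r_{\alpha'},\dots,r_K).
\end{aligned}
\tag{23}
$$

Therefore the condition determining the critical point of the Potts model for $q \to 1$ comes from the vanishing of the determinant, i.e. $\det(\Xi) = 0$.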

### 3.2 Simplified cases

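In the simplified case in which the linking distribution does not depend on the communities we have that $W(r_1,\dots,r_K) = 1$. In fact the coupling constant in the Potts model given by (9) becomes simply

$$
J_{i_1,i_2,\dots,i_K} = \theta_{i_1}\theta_{i_2}\dots\theta_{i_K} \tag{24}
$$

and the solution can be expressed as

$$
v(\alpha, r_\alpha) = u_\alpha. \tag{25}
$$

Using the definition $\langle \theta^n \rangle_\alpha = \sum_{r} \int d\theta\, p_\alpha(\theta,r)\, \theta^n$ we get a non-null solution for the $u$'s if and only if the condition $\det(\Phi) = 0$ is satisfied, with $\Phi$ defined as

$$
\Phi_{\alpha,\alpha'} = -\delta_{\alpha,\alpha'} + (1 - \delta_{\alpha,\alpha'})\, N_{\alpha'} \langle \theta^2 \rangle_{\alpha'} \prod_{\gamma \neq \alpha,\alpha'} \left[ N_\gamma \langle \theta \rangle_\gamma \right]. \tag{26}
$$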

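In the case $K = 3$ this condition reduces to the following relation

$$
2\pi_{12}\pi_{23}\pi_{31} + \pi_{12}\pi_{32} + \pi_{13}\pi_{23} + \pi_{21}\pi_{31} - 1 = 0 \tag{27}
$$

with the identification

$$
\pi_{\alpha,\alpha'} = N_\alpha \langle \theta^2 \rangle_\alpha\, N_{\alpha'} \langle \theta \rangle_{\alpha'}. \tag{28}
$$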
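The reduction from the determinant condition to the $K = 3$ relation can be checked numerically. The sketch below, written under the assumption that the moments $\langle\theta\rangle_\alpha$ and $\langle\theta^2\rangle_\alpha$ are supplied directly as numbers, builds the matrix of Eq. (26) and compares its determinant with the left-hand side of Eq. (27); the function names and the test values are hypothetical.

```python
import numpy as np

def phi_matrix(N, m1, m2):
    """Matrix of Eq. (26).

    N[a]  : number of nodes of type a
    m1[a] : first moment  <theta>_a
    m2[a] : second moment <theta^2>_a
    """
    K = len(N)
    Phi = -np.eye(K)
    for a in range(K):
        for b in range(K):
            if a != b:
                # Product over all types other than a and b.
                rest = [N[g] * m1[g] for g in range(K) if g not in (a, b)]
                Phi[a, b] = N[b] * m2[b] * np.prod(rest)
    return Phi

def lhs_eq27(N, m1, m2):
    """Left-hand side of Eq. (27), with pi[a][b] defined as in Eq. (28)."""
    pi = [[N[a] * m2[a] * N[b] * m1[b] for b in range(3)] for a in range(3)]
    return (2 * pi[0][1] * pi[1][2] * pi[2][0]
            + pi[0][1] * pi[2][1]
            + pi[0][2] * pi[1][2]
            + pi[1][0] * pi[2][0]
            - 1.0)

# Arbitrary illustrative values for the three node types.
N = [10, 20, 30]
m1 = [0.10, 0.05, 0.02]
m2 = [0.020, 0.010, 0.001]
det_phi = np.linalg.det(phi_matrix(N, m1, m2))
poly = lhs_eq27(N, m1, m2)
```

For any choice of the inputs the two quantities `det_phi` and `poly` agree up to floating-point error, so a sign change of either one locates the same threshold.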

Formula (27) is valid in the general case of hypergraphs that show non-trivial correlations between nodes. We recover as a special case the percolation condition for uncorrelated hypergraphs by recalling the relation between the variables $\theta$ and the mean site connectivity implied by Eq. (5). Using this relation, the $\pi$'s are given by

$$
\pi_{\alpha,\alpha'} = \frac{\langle k(k-1) \rangle_\alpha}{\langle k \rangle_\alpha} \tag{29}
$$

and after some algebra we obtain the percolation condition already found in [21],

$$
\frac{\langle k \rangle_1}{\langle k^2 \rangle_1} + \frac{\langle k \rangle_2}{\langle k^2 \rangle_2} + \frac{\langle k \rangle_3}{\langle k^2 \rangle_3} = 2. \tag{30}
$$
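The algebra leading to the last condition can be sketched as follows, assuming that the average in the denominator of Eq. (29) is taken over nodes of type $\alpha$. The shorthand $\kappa_\alpha$ is introduced here only for convenience. Writing $\kappa_\alpha = \langle k^2 \rangle_\alpha / \langle k \rangle_\alpha$, Eq. (29) gives $\pi_{\alpha,\alpha'} = \kappa_\alpha - 1$, independent of $\alpha'$, and Eq. (27) becomes

$$
2(\kappa_1 - 1)(\kappa_2 - 1)(\kappa_3 - 1) + (\kappa_1 - 1)(\kappa_3 - 1) + (\kappa_1 - 1)(\kappa_2 - 1) + (\kappa_2 - 1)(\kappa_3 - 1) - 1 = 0.
$$

Expanding the products, the terms linear in $\kappa_\alpha$ and the constants cancel, leaving

$$
2\kappa_1\kappa_2\kappa_3 - (\kappa_1\kappa_2 + \kappa_1\kappa_3 + \kappa_2\kappa_3) = 0
\quad\Longrightarrow\quad
\frac{1}{\kappa_1} + \frac{1}{\kappa_2} + \frac{1}{\kappa_3} = 2,
$$

which is Eq. (30) once $1/\kappa_\alpha = \langle k \rangle_\alpha / \langle k^2 \rangle_\alpha$ is restored.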

## 4 Conclusions

Correlations account for the non-trivial structure of complex networks and must also play a significant role in the characterization of hypergraphs describing folksonomies. In this paper we have studied ensembles of correlated hypergraphs which can be used to model the interactions between different types of nodes in real complex hypergraphs. We determined the percolation threshold by mapping this problem to a fully connected Potts model with heterogeneous couplings. Our approach extends the present knowledge on percolation in uncorrelated hypergraphs. Future developments will link these findings to the study and characterization of real folksonomies and to the analysis of the robustness of the giant component phase against the removal of nodes or hyperedges.

## 5 Acknowledgments

This paper was supported by the project IST STREP GENNETEC contract No.034952 and by MIUR grant 2007JHLPEZ.

### References

1. R. Albert and A. L. Barabási. Statistical mechanics of complex networks. Rev. Mod. Phys., 74(1):47–97, Jan 2002.
2. M. E. J. Newman. The structure and function of complex networks. SIAM Review, 45:167, 2003.
3. S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. Hwang. Complex networks: Structure and dynamics. Phys. Rep., 424(4-5):175 – 308, 2006.
4. R. Pastor-Satorras and A. Vespignani. Evolution and structure of the Internet: A statistical physics approach. Cambridge University Press, 1st edition, 2004.
5. S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes. Critical phenomena in complex networks. Rev. Mod. Phys., 80(4):1275, 2008.
6. A. Barrat, M. Barthélemy, and A. Vespignani. Dynamical Processes on Complex Networks. Cambridge University Press, 1st edition, 2008.
7. R. Albert, H. Jeong, and A.-L. Barabási. Error and attack tolerance of complex networks. Nature, 406:378, 2000.
8. R. Cohen, K. Erez, D. ben Avraham, and S. Havlin. Resilience of the internet to random breakdowns. Phys. Rev. Lett., 85(21):4626–4628, Nov 2000.
9. R. Cohen, K. Erez, D. ben Avraham, and S. Havlin. Breakdown of the internet under intentional attack. Phys. Rev. Lett., 86(16):3682–3685, Apr 2001.
10. M. E. J. Newman and G. Ghoshal. Bicomponents and the robustness of networks to failure. Phys. Rev. Lett., 100:138701, 2008.
11. M. Boguñá and M. A. Serrano. Generalized percolation in random directed networks. Phys. Rev. E, 72:016106, 2005.
12. A. V. Goltsev, S. N. Dorogovtsev, and J. F. F. Mendes. Percolation on correlated networks. Phys. Rev. E, 78(5):051105, 2008.
13. M. E. J. Newman. Random graphs with clustering. arXiv.org:0903.4009, 2009.
14. F. Chung and L. Lu. Connected components in random graphs with given expected degree sequences. Annals of Combinatorics, 6(2):125, 2002.
15. K.-I. Goh, B. Kahng, and D. Kim. Universal behavior of load distribution in scale-free networks. Phys. Rev. Lett., 87:278701, 2001.
16. G. Caldarelli, A. Capocci, P. De Los Rios, and M. A. Muñoz. Scale-free networks from varying vertex intrinsic fitness. Phys. Rev. Lett., 89(25):258702, Dec 2002.
17. M. Boguñá and R. Pastor-Satorras. Class of correlated random networks with hidden variables. Phys. Rev. E, 68:036112, 2003.
18. J. Park and M. E. J. Newman. Statistical mechanics of networks. Phys. Rev. E, 70:066117, 2004.
19. G. Bianconi. Entropy of randomized network ensembles. Europhys. Lett., 81:28005, 2008.
20. G. Bianconi. Entropy of network ensembles. Phys. Rev. E, 79(3):036114, 2009.
21. G. Ghoshal, V. Zlatić, G. Caldarelli, and M. E. J. Newman. Random hypergraphs and their applications. arXiv.org:0903.0419, 2009.
22. V. Zlatić, G. Ghoshal, and G. Caldarelli. Hypergraph topological quantities for tagged social networks. arXiv.org:0905.0976, 2009.
23. D.-S. Lee, K.-I. Goh, B. Kahng, and D. Kim. Evolution of scale-free random graphs: Potts model formulation. Nucl. Phys. B, 696(3):351, 2004.
24. S. Bradde and G. Bianconi. Percolation transition and distribution of connected components in generalized random network ensembles. J. Phys. A: Math. Theor., 42(19):195007, 2009.
25. C. M. Fortuin and P. W. Kasteleyn. On the random-cluster model: I. Introduction and relation to other models. Physica, 57(4):536–564, 1972.
26. F. Y. Wu. The Potts model. Rev. Mod. Phys., 54(1):235–268, Jan 1982.
27. T. C. Lubensky. Thermal and geometrical critical phenomena in random systems. North-Holland, Amsterdam, 1979.
28. A. Engel, R. Monasson, and A. K. Hartmann. On large deviation properties of Erdős-Rényi random graphs. J. Stat. Phys., 117:387, 2004.
29. B. Bollobás. Random Graphs. Cambridge University Press, 2nd edition, 2001.