# Cayley Trees and Bethe Lattices, a concise analysis for mathematicians and physicists

M. Ostilli Dept. of Computational & Theoretical Sciences, IIUM, Kuantan, Pahang, Malaysia Statistical Mechanics and Complexity Center (SMC), INFM-CNR SMC, Rome, Italy
###### Abstract

We critically review the concepts and applications of Cayley Trees and Bethe Lattices in statistical mechanics, in an attempt to remove the widespread misuse of these simple, yet important and distinct, ideal graphs. We illustrate, in particular, two rigorous techniques for dealing with Bethe Lattices, based respectively on self-similarity and on the Kolmogorov consistency theorem, linking the latter with the Cavity and Belief Propagation methods, which are better known to the physics community.

###### pacs:
05.50.+q, 64.60.aq, 64.70.-p, 05.10.-a

## 1 Introduction

Many years after their introduction, Cayley Trees (CT) [1] and Bethe Lattices (BL) [2] still play an important role as prototypes of graphs. From the point of view of graph theory one can say that these ideal graphs (see Footnote 1) are by now obsolete objects, replaced by more realistic random graphs, such as the classical random graph several decades ago [3] (see Footnote 2) and, more recently, complex networks [4]. Nevertheless, their key feature, namely that they are exact trees, i.e., cycle-free, makes CT and BL very instructive examples where exact calculations can be done [5]. However, even though there are several excellent works on their applications, there is still quite widespread confusion about their exact definition and use. In particular, a rapid survey (September 2011), ranging from Wikipedia to famous textbooks in statistical mechanics, as well as many papers in refereed journals of mathematics or physics, turns up statements claiming, for example, that the BL is the thermodynamic limit of the CT, or that a BL is the interior of a large CT, which can then be analyzed by introducing large but finite subtrees, etc. Such false statements and misuses are not just formal mistakes, but serious conceptual misunderstandings that may in turn lead to fatal errors.

The difference between a CT and a BL emerged long ago in [6, 7, 8, 9, 10]; nevertheless, confusion on the subject has persisted over the years, among both mathematicians and physicists. We think that the main reasons are an ill-posed mathematical approach to the BL and scarce communication between the physics and mathematics communities. Moreover, at the time of Refs. [6, 7, 8, 9, 10], a proper nomenclature for the two kinds of graphs was not yet consolidated, causing further confusion.

Footnote 1: We will see soon that both CT's and BL's have little numerical availability: on the one hand, a CT is very sensitive to the boundary conditions while, on the other hand, a BL, being an infinite graph, cannot be simulated (represented) on a PC.

Footnote 2: Note that nowadays the name Bethe Lattice is often used to indicate the regular random graph, which is a tree-like graph but not an exact tree (a tree-like graph is a graph containing only long cycles, so that locally it looks like a tree [4]). In this paper, however, we reserve the name Bethe Lattice for an exact (and hence infinite) tree.

The aim of this paper is to give a concise definition of CT and BL and to illustrate ambiguity-free mathematical tools for statistical mechanical models built on CT and BL. We will show two rigorous techniques for dealing with the BL: self-similarity and the Kolmogorov consistency theorem, pointing out that the latter is equivalent to the Cavity and Belief Propagation methods, which are better known among physicists.

## 2 Cayley Trees and Bethe Lattices: definition and basic properties

Both CT and BL are simple connected undirected graphs $G=(V,E)$ ($V$ set of vertices, $E$ set of edges) with no cycles (a cycle is a closed path of distinct edges), i.e., they are trees.

A CT of order $q$ with $n$ shells is defined in the following way. Given a root vertex $(0)$, we link $(0)$ with $q$ new vertices by means of $q$ edges. This first set of $q$ vertices constitutes the shell $k=1$ of the CT. Then, to build the shell $k \geq 2$, each vertex of the shell $k-1$ is linked to $q-1$ new vertices. Note that the vertices in the last shell $n$ have degree $1$, while all the other vertices have degree $q$ (the shell $k=0$ is represented by the single root vertex $(0)$).

The BL of degree $q$ is instead defined as a tree in which every vertex has degree $q$, so that there is no boundary and no central vertex. As a consequence, the main difference between CT and BL is simply that the CT is finite while the BL is infinite:

$$\mathrm{CT}: \quad |V|, |E| < \infty, \tag{1}$$

$$\mathrm{BL}: \quad |V|, |E| = \infty. \tag{2}$$

Note that Eqs. (1) and (2) imply an important difference in the average connectivity $c$ (the connectivity of a given vertex is defined as the number of edges emanating from it) between a CT and a BL. In fact, for any finite tree, and in particular a CT, it holds $|E| = |V| - 1$. Therefore

$$\mathrm{CT}: \quad c = 2 - \frac{2}{|V|}, \tag{3}$$

$$\mathrm{BL}: \quad c = q, \tag{4}$$

where in deriving $c$ for the CT case we have used $c = 2|E|/|V|$ (valid for any graph). In Secs. 3 and 4 we will see that the difference between Eqs. (3) and (4) has a dramatic consequence in statistical mechanics.
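The shell-by-shell construction and the connectivity count of Eq. (3) can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names `cayley_tree` and `average_connectivity`, the symbols `q` (order) and `n` (number of shells), and the adjacency-list representation are illustrative assumptions.

```python
def cayley_tree(q: int, n: int) -> dict:
    """Adjacency lists of a Cayley tree of order q with n shells.

    Vertex 0 is the root; new vertices are numbered in order of creation.
    """
    adj = {0: []}
    shell = [0]
    for _shell_index in range(n):
        new_shell = []
        for v in shell:
            # The root gets q children; every other vertex gets q - 1,
            # so that all non-boundary vertices end up with degree q.
            n_children = q if v == 0 else q - 1
            for _child in range(n_children):
                w = len(adj)
                adj[w] = [v]
                adj[v].append(w)
                new_shell.append(w)
        shell = new_shell
    return adj


def average_connectivity(adj: dict) -> float:
    """Mean degree c = 2|E| / |V|."""
    return sum(len(neighbours) for neighbours in adj.values()) / len(adj)


if __name__ == "__main__":
    for n in (1, 3, 6):
        tree = cayley_tree(3, n)
        num_vertices = len(tree)
        # Compare the measured mean degree with 2 - 2/|V| from Eq. (3).
        print(n, num_vertices, average_connectivity(tree), 2 - 2 / num_vertices)
```

However large `n` is chosen, the printed mean degree stays below 2, whereas every vertex of the BL has degree `q`.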

In probability theory, an important distinction must be made between a sequence of probability spaces of increasing size that eventually diverges, and a probability space that is infinite by definition. Similarly, in statistical mechanics, one can be interested either in the thermodynamic limit of the free energy density of the system, or in the physical properties of a system defined from the very beginning on an infinite space (physical or abstract). Although only the former kind of infinity leads to a constructive theory of statistical mechanics and seems physically relevant (in the real world nothing is really infinite, not even the universe), the second kind of infinity may still be not only mathematically convenient, but even physically important (we will see this soon). From Eqs. (1) and (2), we see that, when we study a model of statistical mechanics built on a CT, we have access only to the thermodynamic limit of the system, while when we study a model on a BL, by definition, we have access only to the physical properties of the model as defined over an infinite space. Even though most models in physics show equivalence between the two different "kinds of infinity", in the case of CT and BL such equivalence is lost. The reason is easily seen even without entering into the details of a specific model. In fact, when we study a model on the CT we need to specify the boundary conditions of the model, while on the BL, by definition, there are no boundary conditions. Given a CT of degree $q$ with $n$ shells, the number of vertices on the $k$-th shell is $N_k = q(q-1)^{k-1}$, while the total number of vertices of the CT is $|V| = 1 + q[(q-1)^n - 1]/(q-2)$. We see therefore that the ratio $N_n/|V|$ of the CT, in the thermodynamic limit $n \to \infty$, does not approach zero (for $q > 2$); it tends to $(q-2)/(q-1)$.

This situation is very different from what happens in a $d$-dimensional regular lattice box of side $L$, where the ratio of the number of boundary vertices (vertices on the $(d-1)$-dimensional surface) to the total number of vertices reaches zero as fast as $1/L$ for $L \to \infty$. Therefore, while the thermodynamic limit of a model built on increasing boxes, subsets of $\mathbb{Z}^d$, is equivalent to the physical properties of the model defined on the infinite space $\mathbb{Z}^d$, the boundary conditions becoming negligible, the thermodynamic limit of a model built on increasing CT's is not equivalent to the physical properties of the model defined on the infinite space BL where, by definition, there are no boundaries. No matter how large $n$ is, the model built on the CT will heavily depend on the boundary conditions, a feature that makes the model on the CT rarely representative of a real-world physical system, so that the model on the BL is often preferred. In the next Sections the non-equivalence between CT and BL will be made concrete with the example of the Ising model.
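The contrast between the two boundary fractions can be made concrete with a few lines of arithmetic. This is a minimal sketch under stated assumptions: the function names are illustrative, `q` and `n` denote the order and the number of shells of the CT, and the box is taken to be a cube of side `L` (with `L >= 2`) in `d` dimensions.

```python
def ct_boundary_fraction(q: int, n: int) -> float:
    """Fraction of CT vertices lying on the last shell, N_n / |V|."""
    shell_sizes = [q * (q - 1) ** (k - 1) for k in range(1, n + 1)]
    total = 1 + sum(shell_sizes)  # the extra 1 is the root vertex
    return shell_sizes[-1] / total


def box_boundary_fraction(d: int, L: int) -> float:
    """Fraction of sites of an L^d box (L >= 2) that lie on its surface."""
    interior = (L - 2) ** d
    return (L ** d - interior) / L ** d


if __name__ == "__main__":
    for size in (10, 20, 40):
        # The CT fraction saturates at (q - 2)/(q - 1); the box fraction decays like 1/L.
        print(size, ct_boundary_fraction(3, size), box_boundary_fraction(3, size))
```

For `q = 3` the CT fraction settles near 1/2, so about half of all spins sit on the boundary regardless of the size of the tree.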

## 3 Ising model on Cayley Trees and on General Trees

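Given a CT of degree $q$ with $n$ shells, we want to analyze the Ising model built on it, having Hamiltonian

$$H[\{\sigma_i\}] \overset{\mathrm{def}}{=} -J \sum_{(i,j) \in E} \sigma_i \sigma_j - H_0 \sum_i \sigma_i, \tag{5}$$

and partition function

$$Z_n \overset{\mathrm{def}}{=} \sum_{\{\sigma_i\}} e^{-\beta H[\{\sigma_i\}]}, \tag{6}$$

where $\beta$ is the inverse temperature, $J$ the coupling constant, $H_0$ the external field, the $\sigma_i$'s are the spin variables, and $Z_n$ stands for the partition function of a CT with $n$ shells. Let us consider free boundary conditions and, for simplicity, the case $H_0 = 0$. Since the CT is finite, we can in particular start to perform the summation of Eq. (6) by summing over the boundary spins, i.e., over the spins on the $n$-th shell. By using $\sum_{\sigma = \pm 1} e^{\beta J \sigma \sigma'} = 2\cosh(\beta J)$ for $\sigma' = \pm 1$, and omitting here and below the constant factor 2 produced by each spin sum (it only adds the constant $\log 2$ to $-\beta f$), we get the recursive equation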

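$$Z_n = Z_{n-1} [\cosh(\beta J)]^{N_n}, \tag{7}$$

where $N_k$ denotes the number of vertices on the $k$-th shell. By iterating, we arrive at

$$Z_n = [\cosh(\beta J)]^{N_n + N_{n-1} + \ldots + N_1}, \tag{8}$$

that is, since $N_1 + \ldots + N_n = |V| - 1 = |E|$,

$$Z_n = [\cosh(\beta J)]^{|V| - 1}. \tag{9}$$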

From Eq. (9) we see that the density free energy

$$-\beta f \overset{\mathrm{def}}{=} \lim_{|V| \to \infty} \frac{\log(Z_n)}{|V|} = \log[\cosh(\beta J)] \tag{10}$$

is an analytic function of $\beta$ for any $\beta$; therefore the Ising model on the CT does not give rise to a spontaneous magnetization. Eq. (10) actually has a more general counterpart, as it holds for any tree, i.e., any graph with no cycles. This can be easily seen, for example, by using the high temperature expansion of the free energy [11]. By using the high temperature expansion, for any graph $G$, $-\beta f$ can be written as

$$-\beta f = \lim_{|V| \to \infty} \frac{|E|}{|V|} \log[\cosh(\beta J)] + \varphi_G, \tag{11}$$

where $\varphi_G$ is the non-trivial part of the expansion, determined by the sum over all the closed non-overlapping paths $\gamma$ of $G$ of $\tanh^{l_\gamma}(\beta J)$, where $l_\gamma$ is the length of the closed path $\gamma$. Now, if $G$ is a tree, there are no closed paths, therefore $\varphi_G = 0$, and from Eq. (11) it follows

$$-\beta f = \log[\cosh(\beta J)] \lim_{|V| \to \infty} \frac{|E|}{|V|}. \tag{12}$$

The last factor is nothing else than $c/2$, $c$ being the average connectivity of $G$. By using Eq. (3) we see that the free energy of any tree does not depend on the details of the graph. In conclusion, the thermodynamic limit of the Ising model built on trees does not have a spontaneous magnetization. However, it can be shown [7, 8, 10] that the free energy density of a CT is a non-analytic function of the external field $H_0$ below the critical temperature that will be introduced in the next Section.
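The statement that the zero-field partition function of a tree depends only on $|V|$ and $|E|$ can be verified by brute force on a small CT. The sketch below is an illustration, not the author's code; the function names are assumptions. It keeps the factor 2 per spin explicitly, so the closed form it checks is $2^{|V|}[\cosh(\beta J)]^{|E|}$; that factor only shifts $-\beta f$ by the constant $\log 2$ and does not appear in Eqs. (7)-(12) as written.

```python
import itertools
import math


def cayley_tree_edges(q: int, n: int):
    """Edge list and vertex count of a Cayley tree of order q with n shells."""
    edges, shell, next_id = [], [0], 1
    for _shell_index in range(n):
        new_shell = []
        for v in shell:
            for _child in range(q if v == 0 else q - 1):
                edges.append((v, next_id))
                new_shell.append(next_id)
                next_id += 1
        shell = new_shell
    return edges, next_id  # next_id equals |V|


def partition_function(edges, num_vertices: int, beta_J: float) -> float:
    """Zero-field Ising partition function by explicit sum over 2^|V| states."""
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=num_vertices):
        energy_term = sum(spins[i] * spins[j] for i, j in edges)
        total += math.exp(beta_J * energy_term)
    return total


if __name__ == "__main__":
    edges, num_vertices = cayley_tree_edges(3, 2)  # 10 vertices, 9 edges
    beta_J = 0.7
    brute = partition_function(edges, num_vertices, beta_J)
    closed = 2 ** num_vertices * math.cosh(beta_J) ** len(edges)
    print(brute, closed)
```

The enumeration cost grows as $2^{|V|}$, so this check is only practical for trees with a couple of shells.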

## 4 Ising model on the Bethe Lattice

We now illustrate how to solve an Ising model on the BL by using two methods which are free of ambiguities.

### 4.1 Self-Similarity method

Let us consider for simplicity a BL $G$ with degree $q = 3$, with the exception of one single vertex, the root $(0)$, that instead has degree 2 (see Fig. 1).

We introduce this little irregularity in our BL just to make things simpler (later on we will show how to restore the fully regular BL). From the root $(0)$ emanate 2 edges pointing at the vertices $(1)$ and $(2)$. Since $G$ is infinite, it turns out that the two equivalent infinite subgraphs $G_1$ and $G_2$ that we obtain by eliminating from $G$ the vertex $(0)$ together with its two edges are also both equivalent to the original $G$, i.e., up to a change of name of the vertices, we have the self-similarity $G_1 = G_2 = G$. Note, furthermore, that $G_1$ and $G_2$ are disconnected from each other. If we define the conditional partition function of the system with respect to the root as

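$$Z(\sigma_0) \overset{\mathrm{def}}{=} \sum_{\{\sigma_i, i \neq 0\}} e^{-\beta H[\{\sigma_i\}]}, \tag{13}$$

where $H$ is defined similarly to Eq. (5), by using the self-similarity and the fact that $G_1$ and $G_2$ are disconnected, we get (again for simplicity we consider here only the case with no external field, $H_0 = 0$)

$$Z(\sigma_0) = \sum_{\sigma_1, \sigma_2} e^{\beta J \sigma_0 (\sigma_1 + \sigma_2)} Z(\sigma_1) Z(\sigma_2). \tag{14}$$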

One can feel uneasy with Eqs. (13) and (14) due to the fact that $G$ is infinite, as well as $H$ and the $Z$'s. However, we can get rid of the $Z$'s and of any ill-defined quantity when we consider the finite ratio $Z(-)/Z(+) = P(-)/P(+)$, where $P(\sigma)$ stands for the probability that the spin at $(0)$ has value $\sigma$. If we define

$$e^{-2h} \overset{\mathrm{def}}{=} \frac{P(-)}{P(+)}, \tag{15}$$

from Eq. (14) we arrive at the equation for $h$

$$e^{-2h} = \frac{e^{2\beta J - 4h} + e^{-2\beta J} + 2e^{-2h}}{e^{2\beta J} + e^{-2\beta J - 4h} + 2e^{-2h}}. \tag{16}$$

Eq. (16) always has the trivial solution $h = 0$, corresponding to zero magnetization. However, by expanding Eq. (16) for small $h$, it is easy to see that a non-trivial solution is present when $T < T_c$, where $T_c$ is the critical temperature given as solution of the equation

$$(q - 1) \tanh(\beta J) = 1, \tag{17}$$

which has a solution as soon as $q > 2$. In other words, the value $q = 2$ represents the critical number of neighbors above which the system has a phase transition. We can now physically understand why, unlike the BL, the CT can never have a spontaneous magnetization. From Eqs. (3) and (4) we see that, unlike the BL, no matter how large the degree of the CT is, the average connectivity of the CT is always strictly below the value $2$; therefore, in the CT, no matter how large its degree is, on average any spin is surrounded by an insufficient number of neighbors, so that the system cannot have a spontaneous magnetization (at zero external field; see Footnote 3). Similar conclusions apply to the Potts model [10]. The average magnetization of the spin $\sigma_0$ can be calculated from Eq. (15) and gives $\langle \sigma_0 \rangle = \tanh(h)$, while the average magnetization for the fully regular BL with $q = 3$ is simply $\tanh(3h/2)$. Notice however the necessity of considering the irregular BL: self-similarity, and hence the possibility of having an equation for the effective field $h$, applies only to the irregular BL, not to the fully regular BL. The physicist familiar with the Cavity Method (see next subsection) recognizes in this step another way to see that $h$ is the field felt by a spin due to all the other spins in the absence (“the cavity”) of the spin itself.

Footnote 3: More precisely, since in the thermodynamic limit for the CT it holds $c \to 2$, from Eq. (17) applied with $q = 2$ we see that the Ising model on the CT is critical only at $T = 0$.
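The small-$h$ expansion that leads from Eq. (16) to Eq. (17) can be written out explicitly for the case $q = 3$ considered here. The steps below are a worked sketch added for convenience; they use only the observation that the right-hand side of Eq. (16) is a perfect square, together with the standard identity $\log[(1+x)/(1-x)] = 2\tanh^{-1}(x)$.

```latex
\begin{align*}
e^{-2h} &= \left[\frac{e^{-\beta J} + e^{\beta J - 2h}}{e^{\beta J} + e^{-\beta J - 2h}}\right]^2
         = \left[\frac{\cosh(\beta J - h)}{\cosh(\beta J + h)}\right]^2 \\
\Longrightarrow\quad
h &= \log\frac{\cosh(\beta J + h)}{\cosh(\beta J - h)}
   = 2\tanh^{-1}\!\left[\tanh(\beta J)\tanh(h)\right] \\
  &= 2\tanh(\beta J)\,h
     - \tfrac{2}{3}\tanh(\beta J)\left[1 - \tanh^2(\beta J)\right]h^3 + O(h^5).
\end{align*}
```

Since the cubic coefficient is negative, a solution $h \neq 0$ exists only when $2\tanh(\beta J) > 1$, which is Eq. (17) with $q - 1 = 2$ taken as a strict inequality.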

We end this subsection with the following critical note. In many textbooks of statistical mechanics, as well as in many papers that deal with a BL (regular or not), in place of Eqs. (14) or (16) we find respectively

$$Z_{n+1}(\sigma_0) = \sum_{\sigma_1, \sigma_2} e^{\beta J \sigma_0 (\sigma_1 + \sigma_2)} Z_n(\sigma_1) Z_n(\sigma_2), \tag{18}$$

and

$$e^{-2h_{n+1}} = \frac{e^{2\beta J - 4h_n} + e^{-2\beta J} + 2e^{-2h_n}}{e^{2\beta J} + e^{-2\beta J - 4h_n} + 2e^{-2h_n}}, \tag{19}$$

and it is often wrongly said that $Z_n$ stands for the partition function of the finite portion of the BL having only $n$ shells. From all we have seen above, it should be clear that such an approach to the BL is conceptually wrong: $Z_n$ cannot be interpreted as the partition function of a finite subgraph of the BL, otherwise for $n \to \infty$ we would get Eq. (9) again. At this level, the only correct meaning we can attribute to $Z_n$ is that of the conditional partition function of an infinite subgraph self-similar to the one associated with $Z_{n+1}$, as shown above, and, most importantly, $Z_n$ and $Z_{n+1}$ are the same thing. In this context, the index $n$ appearing in Eqs. (18) and (19) is a quite misleading symbol. More precisely, both sides refer to equally infinite subgraphs, so that for the ratios we get the same value regardless of the index: $h_{n+1} = h_n$. Rather, looking at Eqs. (14) and (16), as a mere numerical tool it can be convenient to introduce the recurrence Eqs. (18) or (19), since their fixed points are the solutions of Eqs. (14) or (16), respectively. A more rigorous and general meaning can be attributed to $h_n$ by following the measure-theoretical approach presented next.
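Used purely as a numerical tool, the recurrence of Eq. (19) can be iterated until it reaches a fixed point of Eq. (16). The following is a minimal sketch under stated assumptions: the function name `iterate_field`, the starting value, and the stopping rule are illustrative choices, and the index of the loop counts iterations, not shells of any finite graph.

```python
import math


def iterate_field(beta_J: float, h_start: float = 0.1,
                  max_iter: int = 10_000, tol: float = 1e-12) -> float:
    """Iterate Eq. (19) from h_start until successive values differ by less than tol."""
    h = h_start
    for _iteration in range(max_iter):
        numerator = (math.exp(2 * beta_J - 4 * h) + math.exp(-2 * beta_J)
                     + 2 * math.exp(-2 * h))
        denominator = (math.exp(2 * beta_J) + math.exp(-2 * beta_J - 4 * h)
                       + 2 * math.exp(-2 * h))
        h_new = -0.5 * math.log(numerator / denominator)
        if abs(h_new - h) < tol:
            return h_new
        h = h_new
    return h


if __name__ == "__main__":
    beta_c_J = math.atanh(0.5)  # solution of Eq. (17) for q = 3
    for beta_J in (0.3, beta_c_J, 0.8, 1.0):
        h = iterate_field(beta_J)
        # tanh(h) is the magnetization of the degree-2 root, from Eq. (15).
        print(f"beta*J = {beta_J:.4f}  h = {h:.6f}  root magnetization = {math.tanh(h):.6f}")
```

Above the critical temperature the iteration collapses onto $h = 0$, while below it a positive starting value flows to the non-trivial fixed point; exactly at the critical point convergence is slow and the loop may stop at `max_iter` before `h` has fully decayed.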

### 4.2 Kolmogorov's condition

A more rigorous approach to the BL comes from the Kolmogorov consistency theorem (see e.g. [12] and [13]), by which we can avoid introducing any ill-defined quantity. Most importantly, as we will see, the method attributes a rigorous and physical meaning to Eq. (19) and to its generalizations.

From the point of view of probability theory, solving the Ising model on the BL amounts to finding all the marginal probabilities of an infinite probability space, i.e., a probability space characterized by an infinite number of random variables (in our case the spins $\sigma_i$ with indices in $V$), a suitable sigma-algebra [12], and a measure $\mu$. The marginal probabilities can be calculated in turn from an infinite set of given finite dimensional marginal distributions $\mu_n(\{\sigma_i, i \in V_n\})$, where $V_n$ is the set of vertices that are at distance not greater than $n$ from $0$. Note however that, when we consider $V_n$, we are not dealing with a finite disconnected subgraph of the given BL; the spins of $V_n$ that are located on the shell $n$ are in fact in turn connected to the spins outside $V_n$. The key point to be used is that, given the values of the spins on $V_n$, the spins outside $V_n$ are conditioned only by their neighboring spins in $V_n$, i.e., the spins of the $n$-th shell. In turn, given a spin $\sigma_i$ on the $n$-th shell, the effects of all its neighbor spins that lie outside $V_n$ can be encoded via a single effective external field $h_i$ acting only on $\sigma_i$. Therefore, for the $\mu_n$, we look for solutions of the form

$$\mu_n(\{\sigma_i, i \in V_n\}) = \frac{e^{-\beta H_n + \sum_{i \in W_n} h_i \sigma_i}}{Z_n}, \tag{20}$$

where $H_n$ is given by Eq. (5) restricted to $V_n$, $W_n$ stands for the set of vertices of the shell $n$, and $Z_n$ is the normalization constant. Physically, Eq. (20) represents the first step of the Bethe-Peierls approach [2] (nowadays also known as the Cavity Method in physics [16], or Belief Propagation in computer science [18]), in which the effects of all the spins other than those of the set $V_n$ are encoded in the fields $h_i$, to be determined self-consistently. From a strictly probabilistic point of view, Eqs. (20) and (5) represent the definition of the Ising model itself on the BL. Now, as we have mentioned in Sec. 2, the distinction between a sequence of probability spaces of increasing size and an infinite probability space is important. The sequence of probability spaces may or may not converge to anything, and even if it converges, it may happen that it does not converge to a probability space. The tool to investigate whether a sequence of increasing probability spaces converges to an infinite probability space is provided by the Kolmogorov consistency theorem [12]. It should be clear that on an infinite space like the BL, the probability that a given configuration of spins (a point-like event) is realized is zero: $\mu(\{\sigma_i\}) = 0$. Therefore, when one considers an infinite probability space, the introduction of a proper sigma-algebra is not just a formal necessity. However, since here we do not need to work directly with the measure of the infinite space, in what follows we can skip the introduction of the sigma-algebras. The Kolmogorov consistency theorem says that a sequence of probability spaces with measures $\mu_n$ converges to the probability space on the BL, with measure $\mu$, if for any $n$ one has

$$\sum_{\{\sigma_i, i \in W_{n+1}\}} \mu_{n+1}(\{\sigma_i, i \in V_{n+1}\}) = \mu_n(\{\sigma_i, i \in V_n\}). \tag{21}$$

Eqs. (20) and (21) give the functional equations for the fields $h_i$.

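Due to the tree-like structure of the BL, Eq. (21) factorizes over independent branches, allowing for a fundamental simplification. Let us apply Eqs. (20) and (21) to the same model analyzed in the previous subsection (a BL with $q = 3$ in which the root vertex $(0)$ has only two first neighbors) [14, 15]. For any given $i_0 \in W_n$ we indicate with $i_1$ and $i_2$ the indices of the two first neighbors of $i_0$ that belong to $W_{n+1}$. We have ($H_0 = 0$)

$$\frac{e^{-\beta H_n(\{\sigma_l, l \in V_n\})}}{Z_{n+1}} \prod_{i_0 \in W_n} \sum_{\sigma_{i_1}, \sigma_{i_2}} \left[ e^{\beta J \sigma_{i_0} (\sigma_{i_1} + \sigma_{i_2}) + h_{i_1} \sigma_{i_1} + h_{i_2} \sigma_{i_2}} \right] = \frac{e^{-\beta H_n(\{\sigma_l, l \in V_n\})}}{Z_n} \prod_{i_0 \in W_n} e^{h_{i_0} \sigma_{i_0}}. \tag{22}$$

From Eq. (22), for any $i_0 \in W_n$ and for any value of $\sigma_{i_0}$, we have an equation for the field $h_{i_0}$ as a function of the fields $h_{i_1}$ and $h_{i_2}$,

$$\frac{Z_n}{Z_{n+1}} \sum_{\sigma_{i_1}, \sigma_{i_2}} e^{\beta J \sigma_{i_0} (\sigma_{i_1} + \sigma_{i_2}) + h_{i_1} \sigma_{i_1} + h_{i_2} \sigma_{i_2}} = e^{h_{i_0} \sigma_{i_0}},$$

from which, by evaluating the cases $\sigma_{i_0} = -1$ and $\sigma_{i_0} = +1$ and taking their ratio, we get

$$e^{-2h_{i_0}} = \frac{e^{2\beta J - h_{i_1} - h_{i_2}} + e^{-2\beta J + h_{i_1} + h_{i_2}} + e^{h_{i_1} - h_{i_2}} + e^{h_{i_2} - h_{i_1}}}{e^{2\beta J + h_{i_1} + h_{i_2}} + e^{-2\beta J - h_{i_1} - h_{i_2}} + e^{h_{i_1} - h_{i_2}} + e^{h_{i_2} - h_{i_1}}}. \tag{23}$$

It is immediate to verify that Eq. (23) reduces to Eq. (19) for the choice $h_{i_1} = h_{i_2} = h_n$, $h_{i_0} = h_{n+1}$. Besides being rigorous, this approach, based on the Kolmogorov consistency theorem, shows us that a possible non-homogeneous solution for the effective fields $h_i$, periodic or not, is not an artifact of the theory: an effective field that depends on the vertex index corresponds to a non-homogeneous marginal probability that depends on the vertex index too.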

In the context of the Cavity and Belief Propagation methods, Eq. (23) is better known in another form, more suited for an intuitive interpretation of the effective fields as messages passing from a vertex $i$ to a neighbor vertex $l$; see Fig. 2.

By using $\tanh^{-1}(x) = \frac{1}{2} \log\frac{1+x}{1-x}$, valid for $|x| < 1$, it is easy to see that Eq. (23) can be rewritten as

$$h_{i \to l} = \tanh^{-1}[\tanh(\beta J) \tanh(h_{j \to i})] + \tanh^{-1}[\tanh(\beta J) \tanh(h_{k \to i})]. \tag{24}$$
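The equivalence between the exponential form of Eq. (23) and the message form of Eq. (24) is easy to confirm numerically. In the sketch below, which is only an illustration, the function names `field_eq23` and `field_eq24` are assumptions, and `h1`, `h2` stand for the two incoming fields ($h_{i_1}$, $h_{i_2}$ in Eq. (23), or $h_{j \to i}$, $h_{k \to i}$ in Eq. (24)).

```python
import math
import random


def field_eq23(beta_J: float, h1: float, h2: float) -> float:
    """Outgoing field obtained by solving Eq. (23) for h_{i0}."""
    numerator = (math.exp(2 * beta_J - h1 - h2) + math.exp(-2 * beta_J + h1 + h2)
                 + math.exp(h1 - h2) + math.exp(h2 - h1))
    denominator = (math.exp(2 * beta_J + h1 + h2) + math.exp(-2 * beta_J - h1 - h2)
                   + math.exp(h1 - h2) + math.exp(h2 - h1))
    return -0.5 * math.log(numerator / denominator)


def field_eq24(beta_J: float, h1: float, h2: float) -> float:
    """Outgoing field in the message-passing form of Eq. (24)."""
    t = math.tanh(beta_J)
    return math.atanh(t * math.tanh(h1)) + math.atanh(t * math.tanh(h2))


if __name__ == "__main__":
    rng = random.Random(0)
    for _trial in range(5):
        beta_J = rng.uniform(0.1, 2.0)
        h1, h2 = rng.uniform(-2, 2), rng.uniform(-2, 2)
        print(field_eq23(beta_J, h1, h2), field_eq24(beta_J, h1, h2))
```

The two columns agree to rounding error, including for incoming fields of opposite sign, which is the non-homogeneous case discussed above.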

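More generally, the equation for an Ising model built on a generic BL, regular or not, having generic couplings $J_{ij}$ and an external field $H_0$, reads

$$h_{i \to j} = \sum_{k \in \partial i \setminus j} \tanh^{-1}[\tanh(\beta J_{ik}) \tanh(\beta H_0 + h_{k \to i})], \tag{25}$$

where $\partial i$ stands for the set of the first neighbors of the vertex $i$, and $J_{ik}$ is the coupling on the edge along which the message $h_{k \to i}$ arrives.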
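To see the update rule of Eq. (25) in action, one can run it on a small finite tree and compare the resulting magnetizations with an exact enumeration. This is a minimal sketch under explicit assumptions: a finite tree with free boundary is of course not a BL (its leaves simply send messages given by empty sums), the coupling is taken uniform, the function names are hypothetical, and the local magnetization is computed with the standard cavity expression, which is not written out in the text.

```python
import itertools
import math


def bp_magnetizations(adj: dict, beta_J: float, beta_H0: float, sweeps: int = 50) -> dict:
    """Iterate Eq. (25) with uniform coupling; adj maps vertex -> list of neighbours."""
    t = math.tanh(beta_J)
    messages = {(i, j): 0.0 for i in adj for j in adj[i]}  # h_{i -> j}
    for _sweep in range(sweeps):
        # Parallel update: the new dictionary is built entirely from the old one.
        messages = {
            (i, j): sum(math.atanh(t * math.tanh(beta_H0 + messages[(k, i)]))
                        for k in adj[i] if k != j)
            for (i, j) in messages
        }
    # Local field on i: external field plus the contributions of all its neighbours.
    return {
        i: math.tanh(beta_H0 + sum(math.atanh(t * math.tanh(beta_H0 + messages[(k, i)]))
                                   for k in adj[i]))
        for i in adj
    }


def exact_magnetizations(adj: dict, beta_J: float, beta_H0: float) -> dict:
    """Brute-force magnetizations; vertices must be labelled 0 .. |V|-1."""
    n = len(adj)
    edges = [(i, j) for i in adj for j in adj[i] if i < j]
    Z, sums = 0.0, [0.0] * n
    for s in itertools.product((-1, 1), repeat=n):
        weight = math.exp(beta_J * sum(s[i] * s[j] for i, j in edges) + beta_H0 * sum(s))
        Z += weight
        for i in range(n):
            sums[i] += s[i] * weight
    return {i: sums[i] / Z for i in range(n)}


if __name__ == "__main__":
    # Two shells of the irregular tree of Sec. 4.1: a degree-2 root, two children each.
    tree = {0: [1, 2], 1: [0, 3, 4], 2: [0, 5, 6], 3: [1], 4: [1], 5: [2], 6: [2]}
    print(bp_magnetizations(tree, 0.8, 0.1))
    print(exact_magnetizations(tree, 0.8, 0.1))
```

On any finite tree the messages stop changing after a number of sweeps equal to the diameter of the tree, and the two outputs coincide. On the BL there are no leaves to start from, which is why the fields have to be found as solutions of the functional equations instead.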

The literature on the convergence and algorithmic issues around Eq. (25), as well as on its applications in physics, computer science, and statistical inference, is vast (see e.g. [16], [17], [18]). Here we just wanted to stress the equivalence between the Cavity/Belief-Propagation approaches and a rigorous method based on the Kolmogorov consistency theorem.

## 5 Last observation

From the previous Section, we see that we could re-interpret the (infinite) BL as the thermodynamic limit of (finite) CT's having specific “boundary conditions” determined by the fields $h_i$ that solve Eqs. (20) and (21), as shown in [19]. However, this point of view is physically rather poor (a system whose thermodynamic limit exists only for a specific boundary condition) and, pedagogically, very inconvenient.

## 6 Conclusions

We have critically reviewed the concepts and applications of CT and BL in statistical mechanics, emphasizing their very different features. We have pointed out that care must be taken especially in the case of a BL, an infinite space, where serious misuses and misunderstandings are currently seen in both textbooks and journal papers, due to an ill-posed mathematical approach to the BL. We have then illustrated, for the BL case, two alternative approaches which are rigorous and free of dangerous ambiguities, based respectively on self-similarity and on the Kolmogorov consistency theorem, pointing out the link of the latter with the Cavity and Belief Propagation methods, which are better known to the physics community. We hope that this critical review, where concepts and tools from physics and mathematics (too often kept apart) are used together, might reduce the quite widespread misuse and conceptual errors around CT and BL.

## Acknowledgments

Work supported by Grant IIUM EDW B 11-159-0637. We thank F. Mukhamedov for useful discussions.

## References

• [1] A. Cayley, Desiderata and suggestions: No. 2. The Theory of groups: graphical representation, Amer. J. Math. (2), 174 (1878).
• [2] H. A. Bethe, Statistical theory of superlattices, Proc. Roy. Soc. London Ser. A 150, 552 (1935).
• [3] P. Erdős, A. Rényi, Publ. Math. Debrecen 6, 290 (1959).
• [4] R. Albert, A.-L. Barabási, Rev. Mod. Phys. 74, 47 (2002); S. N. Dorogovtsev, J. F. F. Mendes, Evolution of Networks (Oxford University Press, Oxford, 2003); M. E. J. Newman, SIAM Review 45, 167 (2003); S. N. Dorogovtsev, Lectures on Complex Networks (Oxford Master Series in Statistical, Computational, and Theoretical Physics, 2010); S. N. Dorogovtsev, A. V. Goltsev, J. F. F. Mendes, Rev. Mod. Phys. 80, 1275 (2008).
• [5] R. J. Baxter, Exactly Solved Models in Statistical Mechanics (Academic Press, London, 1982).
• [6] T. P. Eggarter, Phys. Rev. B 9, 2989 (1974).
• [7] H. Matsuda, Progr. Theor. Phys. 51, 1053 (1973).
• [8] E. Müller-Hartmann and J. Zittartz, Phys. Rev. Lett. 33, 893 (1974).
• [9] F. Y. Wu, J. Phys. A: Math. Gen. 9, 593 (1976).
• [10] L. Turban, Phys. Lett. 78A, 404 (1980).
• [11] C. Domb, Ising model, in Phase Transitions and Critical Phenomena, Vol. 3, Eds. C. Domb and M. S. Green (Academic Press, London, 1974), p. 357.
• [12] A. N. Shiryaev, Probability (Springer, New York, 1984).
• [13] Y. G. Sinai, Theory of phase transitions: rigorous results (Pergamon Press, Oxford; New York, 1982).
• [14] H. O. Georgii, Gibbs measures and phase transitions (Walter de Gruyter, Berlin, 1988).
• [15] P. M. Bleher, J. Ruiz and V. A. Zagrebnov, J. Stat. Phys. 79, 473 (1995).
• [16] M. Mézard, G. Parisi, Eur. Phys. J. B 20, 217-233 (2001).
• [17] R. Zecchina, O. C. Martin, and R. Monasson, Statistical mechanics methods and phase transitions in optimization problems, Theoretical Computer Science, 3, 265 (2001).
• [18] J. S. Yedidia, W. T. Freeman, Understanding Belief Propagation and Its Generalizations, in G. Lakemeyer, B. Nebel (Eds.), Exploring Artificial Intelligence in the New Millennium (Morgan Kaufmann, 2003), p. 239.
• [19] C. J. Thompson, J. Stat. Phys. 27, 441 (1982).