# Some probabilistic trees with algebraic roots

Olivier Bernardi and Alejandro H. Morales
July 14, 2019
###### Abstract.

In this article we consider several probabilistic processes defining random graphs. One of these processes appeared recently in connection with a factorization problem in the symmetric group. For each of the probabilistic processes, we prove that the probability for the random graph to be a tree has an extremely simple expression, which is independent of most parameters of the problem. This raises many open questions.

O.B. acknowledges support from NSF grant DMS-1308441, and ERC ExploreMaps.
A.M. acknowledges support from a CRM-ISM postdoctoral fellowship.

## 1. Introduction: an example

In this paper we consider several probabilistic processes defining a random graph. These processes were originally motivated by factorization problems in the symmetric group investigated in . Our main result is a formula, for each of the probabilistic processes, for the probability that the random graph is a tree. This probability formula turns out to be surprisingly simple, and is in particular independent of most parameters of the processes. This is reminiscent of the result of Kenyon and Winkler  about 2-dimensional branched polymers. We conjecture that more results of this type should hold, but we were not able to prove them.

Before describing our results in full detail, let us describe one particular case. Let be a positive integer. Given a tuple of proper subsets of , we define as the digraph having vertex set and arc set , where the arc has origin and endpoint the unique integer in such that , with the integers considered cyclically modulo . For instance, if and , then the arc has origin 5 and endpoint 2. In Figure 1 we have drawn some digraphs in the case . We now fix a tuple of non-negative integers, and choose uniformly at random a tuple of proper subsets of such that for all the integer is contained in exactly of the subsets . This gives a random digraph . One of the results proved in this paper is that the probability that is a tree (oriented toward the vertex ) is equal to . This result is unexpectedly simple, especially because it does not depend on the parameters .

Figure 1. We represent above the three situations for which the digraph $G(S)$ is a tree, with $k=3$ and $S=(S_1,S_2)$.

Let us investigate in more detail the case $k=3$ of the aforementioned result. The situation is represented in Figure 1. By definition, the tuple $S=(S_1,S_2)$ is a pair of proper subsets of $[3]=\{1,2,3\}$, and there are three possible trees with vertex set $[3]$ (in general, there are $k^{k-2}$ possible Cayley trees). Let $A$, $B$, and $C$ be respectively the events leading to the trees represented in parts (a), (b), and (c) of Figure 1. For instance,

$$C=\{2\in S_1\}\cap\{3\notin S_1\}\cap\{3\in S_2\}\cap\{1\notin S_2\}.$$

Now it is not hard to check that the event $C$ has the same probability as the event

$$C'=\{2\in S_1\}\cap\{3\notin S_2\}\cap\{3\in S_1\}\cap\{1\notin S_1\}.$$

Moreover the events $A$, $B$ and $C'$ are disjoint and their union is

$$A\cup B\cup C'=\bigl(\{2\notin S_1\}\cup\{2\in S_1\cap 3\notin S_1\}\cup\{2\in S_1\cap 3\in S_1\cap 1\notin S_1\}\bigr)\cap\{3\notin S_2\}.$$

But since $S_1$ is by definition a proper subset of $[3]$, the first condition in the above expression (the union in parentheses) is always satisfied. It follows that

$$P(A\cup B\cup C)=P(A\cup B\cup C')=P(3\notin S_2)=1-p_3/2,$$

as claimed. Observe that the individual probabilities of the trees represented in Figure 1 do depend on the values of $p_1$ and $p_2$, but the sum of these probabilities is independent of $p_1$ and $p_2$.
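The case $k=3$ worked out above is small enough to check by exhaustive enumeration. The following Python sketch is not part of the original argument: the function names are hypothetical, and it assumes the arc rule described at the start of this section (the arc leaving $i$ ends at the first integer after $i$, read cyclically, that does not belong to $S_i$). It computes the exact probability that the digraph is a tree for a given tuple of multiplicities.

```python
from fractions import Fraction
from itertools import combinations, product


def proper_subsets(k):
    """All proper subsets of {1, ..., k}, as frozensets."""
    elems = range(1, k + 1)
    return [frozenset(c) for size in range(k) for c in combinations(elems, size)]


def endpoint(i, subset, k):
    """First integer after i (cyclically mod k) that is not in `subset`."""
    j = i % k + 1
    while j in subset:  # terminates because `subset` is a proper subset
        j = j % k + 1
    return j


def is_tree(S, k):
    """True if the arcs i -> endpoint(i, S_i) form a tree oriented toward k."""
    succ = {i: endpoint(i, S[i - 1], k) for i in range(1, k)}
    for start in range(1, k):
        v = start
        for _ in range(k):  # a path to the root uses at most k - 1 arcs
            if v == k:
                break
            v = succ[v]
        if v != k:
            return False
    return True


def tree_probability(p):
    """Exact probability that the digraph is a tree; None if no tuple matches p."""
    k = len(p)
    total = trees = 0
    for S in product(proper_subsets(k), repeat=k - 1):
        # keep only tuples in which each integer i occurs in exactly p_i subsets
        if all(sum(i in s for s in S) == p[i - 1] for i in range(1, k + 1)):
            total += 1
            trees += is_tree(S, k)
    return Fraction(trees, total) if total else None


for p in [(0, 0, 0), (1, 1, 1), (0, 1, 1), (0, 0, 2)]:
    print(p, tree_probability(p))
```

For each of the four tuples printed, the value agrees with $1-p_3/2$. The enumeration is exponential in $k$, so this is only meant as a sanity check on very small cases.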

The rest of the paper is organized as follows. In Section 2, we state the main results of the paper. In Section 3, we derive a generalization of the matrix-tree theorem tailored to our needs: it allows us to express the probability that our random graphs are trees as the probability of a “determinant of a matrix of events”. In Section 4 we simplify the matrices of events corresponding to our different random processes. In Section 5, we compute the determinant of the matrices of events. This computation uses some sign reversing involutions. We conclude in Section 6 with some conjectures about another random process, and some open questions.

## 2. Main results

In this section we fix some notation and state our main results. We denote by the cardinality of a set . We denote the disjoint union of two sets . For a positive integer , we denote by the set of integers . For , we denote by the set of integers , where integers are considered cyclically modulo . For instance , , and . For an integer and a tuple of non-negative integers, we denote by the set of tuples such that for all , is a subset of and for all the integer is contained in exactly of the subsets . We also denote by the set of tuples such that for all .

We now define three ways of associating a digraph to an element in , using three mappings . The mappings take as arguments an integer and a subset and are defined as follows:

• if , and otherwise is the integer such that but .

• if , and otherwise is the integer such that but .

• if , if , and otherwise is the integer such that but .

The mappings are represented in Figure 2.

Figure 2. Rules $\alpha,\beta,\gamma$ for creating an arc of the complete graph $K_k$ (an informal description of the rule is given between quotation marks where $\bar{i}$ means that $i\notin S$).

We now use the mappings to define digraphs. Let , and let be a surjection from to . For , we define to be the digraph with vertex set and arc set where for all . For instance, the digraph defined in Section 1 corresponds to the case and where Id is the identity mapping from to . Observe that the graph has loops unless .

We are now interested in the probability that the digraph is a tree. Observe that in this case the tree is oriented toward the vertex (since every vertex in has one outgoing arc). Our main result is the following.

###### Theorem 1.

Let and be positive integers such that , and let be a tuple of non-negative integers. Let be a uniformly random element of (supposing that this set is non-empty), and let be a uniformly random surjection from to independent from S. For , let be the probability that the random digraph is a tree. Then

• . This is equal to the probability that .

• where . This is equal to the probability that .

• , where and for all , if and . This is equal to the probability that .

Observe that the cardinality of appearing in Theorem 1 can be expressed as the coefficient of in the polynomial . The case (c) of Theorem 1 was needed to complete the combinatorial proof described in  of Jackson’s formula . Before embarking on the proof of Theorem 1, we make a few remarks.

Remark 1. Theorem 1 can be stated in terms of uniformly random tuples S in instead of in . More precisely, for , if S is a uniformly random tuple in , and is a uniformly random surjection from to independent from S, then the probability for the graph to be a tree is . This is simply because the graph is never a tree if .

Remark 2. In the case , the result of Theorem 1 can be stated without referring to a random surjection . Indeed, for and , let us define as the graph where Id is the identity mapping from to . Then the probability that the graph is a tree has the same expression as in Theorem 1. For instance, the probability that is a tree is , as claimed in Section 1. Indeed, in the particular case of Theorem 1, the surjection would be a bijection from to independent from S. But then the tuple has the same distribution as , hence the surjection does not affect probabilities.

Remark 3. The results in Theorem 1 would hold for any probability distribution on the tuples of proper subsets of , provided that the probability of a tuple only depends on the total number of occurrences of each integer . For instance, the probability that is a tree would still be equal to the probability that for such a probability distribution. This result follows from Theorem 1 since one can always condition on the total number of occurrences of each integer .

Remark 4. A slightly weaker version of Theorem 1 can be obtained by not requiring the function to be surjective. More precisely, for and for any positive integer , if S is a uniformly random tuple in , and is a uniformly random function from to independent from S, then the probability that has the same expression as in Theorem 1. For instance, the probability that is a tree is . Indeed, this result follows from Theorem 1 by conditioning on the cardinality of the image of the function , and by the number of occurrences of each integer in the subsets . It is actually this version of Theorem 1 (in the case ) which was needed in .

## 3. A probabilistic analogue of the matrix-tree theorem

The matrix-tree theorem is a classical result giving the number of oriented spanning trees of a graph as a determinant; see e.g. . In order to prove Theorem 1, it is tempting to consider the trees on the vertex set as the spanning trees of the complete graph , and apply a suitable analogue of the matrix-tree theorem. In this section we develop the framework necessary to establish this suitable analogue.

We first recall the matrix-tree theorem (in its weighted, directed version). Let denote the complete digraph having vertex set and arc set . We call spanning tree of rooted at a set of arcs not containing any cycle and such that every vertex is incident to exactly one outgoing arc in (this is equivalent to asking that is a spanning tree of “oriented toward” the root vertex ). We denote by the set of spanning trees of rooted at . Given some weights (taken in a commutative ring) for the arcs , one defines the weight of a tree as . The matrix-tree theorem states that

$$\sum_{T\in\mathcal{T}_n}w(T)=\det(L), \tag{1}$$

where is the reduced Laplacian matrix, defined by if and . Observe that the matrix-tree theorem gives results about the spanning trees of any digraph with vertex set , because one can restrict one's attention to the spanning trees of simply by setting for all arcs not in . Observe also that the weights are actually irrelevant in (1).
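The identity (1) can be checked numerically on a small complete digraph. The Python sketch below is an illustration only, under stated assumptions: vertices are labeled $0,\ldots,n-1$ with the last vertex as root (a labeling chosen for convenience in code), and the reduced Laplacian is taken with the usual convention, namely $-w(i,j)$ off the diagonal and $\sum_{j\neq i}w(i,j)$ on the diagonal, with the row and column of the root removed. All function names are hypothetical.

```python
from itertools import permutations, product


def sign(perm):
    """Sign of a permutation given as a tuple, computed by counting inversions."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1


def det(matrix):
    """Determinant by the Leibniz formula (fine for very small matrices)."""
    total = 0
    for perm in permutations(range(len(matrix))):
        term = sign(perm)
        for i, j in enumerate(perm):
            term *= matrix[i][j]
        total += term
    return total


def reduced_laplacian(w, n):
    """Laplacian of the weights w, with the row and column of the root n-1 removed."""
    return [
        [
            sum(w[i][t] for t in range(n) if t != i) if i == j else -w[i][j]
            for j in range(n - 1)
        ]
        for i in range(n - 1)
    ]


def reaches_root(start, targets, n):
    """Follow outgoing arcs from `start`; True if the root n-1 is reached."""
    v = start
    for _ in range(n):
        if v == n - 1:
            return True
        v = targets[v]
    return v == n - 1


def tree_weight_sum(w, n):
    """Sum of the weights of all spanning trees oriented toward the root n-1."""
    total = 0
    for targets in product(range(n), repeat=n - 1):  # one out-arc per non-root vertex
        if any(targets[i] == i for i in range(n - 1)):
            continue  # loops are not allowed
        if all(reaches_root(s, targets, n) for s in range(n - 1)):
            weight = 1
            for i in range(n - 1):
                weight *= w[i][targets[i]]
            total += weight
    return total


n = 4
w = [[(3 * i + 5 * j) % 7 + 1 for j in range(n)] for i in range(n)]  # arbitrary weights
print(tree_weight_sum(w, n) == det(reduced_laplacian(w, n)))
```

With all weights equal to 1 and $n=4$, both sides equal 16, the number of trees on four labeled vertices.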

We will now derive a generalization of the matrix-tree theorem. Let be a probability space, where is the sample set, is the set of events (which is a -algebra on ), and is the probability measure. In our applications, will be a finite set and will be the powerset . Let be a positive integer and let be a matrix whose entries are events. For a tree , we define the probability of as

$$P(T):=P\Bigl(\bigcap_{(i,j)\in T}E_{i,j}\Bigr).$$

Now we aim at expressing as some kind of determinant, and this requires some notation. Let be the set of (formal) finite linear combinations of events, with coefficients in the field (i.e., the free -module with basis ). The elements of are called generalized events and are of the form with and . We then define the ring . Here the intersection operation “” is defined to act distributively on , that is, for all , we set , and moreover for all we set . We define the determinant of a matrix of generalized events as

$$\det(M)=\sum_{\pi\in\mathfrak{S}_n}\epsilon(\pi)\,M_{1,\pi(1)}\cap M_{2,\pi(2)}\cap\cdots\cap M_{n,\pi(n)},$$

where denotes the set of permutations of , and is the sign of the permutation .

We also extend the probability measure from to by linearity. More concretely, we set and . We call -determinant of a matrix of generalized events , and denote by , the probability of , that is,

$$\operatorname{Pdet}(M)=\sum_{\pi\in\mathfrak{S}_n}\epsilon(\pi)\,P\bigl(M_{1,\pi(1)}\cap M_{2,\pi(2)}\cap\cdots\cap M_{n,\pi(n)}\bigr).$$

We can now state our generalization of the matrix-tree theorem. For a matrix of generalized events, we define its reduced Laplacian matrix by setting for all if and .

###### Proposition 2.

Let be a matrix of generalized events and let be its reduced Laplacian matrix. Then

$$\sum_{T\in\mathcal{T}_n}P(T)=\operatorname{Pdet}(L).$$

Observe that if the events are all independent, then , and so that Proposition 2 reduces to the usual matrix-tree theorem given in (1) for the weights .

###### Proof.

The known combinatorial proofs of the matrix-tree theorem actually extend almost verbatim to give Proposition 2. We sketch one such proof, following , mainly for the reader’s convenience.

By definition,

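$$\det(L)=\sum_{\pi\in\mathfrak{S}_{n-1}}\epsilon(\pi)\bigcap_{i\in[n]}L_{i,\pi(i)}=\sum_{\pi\in\mathfrak{S}_{n-1}}\epsilon(\pi)\Bigl(\bigcap_{i\in[n],\ \pi(i)\neq i}-E_{i,\pi(i)}\Bigr)\cap\Bigl(\bigcap_{i\in[n],\ \pi(i)=i}\ \sum_{j\neq i}E_{i,j}\Bigr). \tag{2}$$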

Expanding the right-hand side of (2) leads to a sum over a set that we now describe. Let be the set of digraphs with vertex set having exactly one outgoing (non-loop) arc at each vertex , and no outgoing arc at vertex . Let be the set of edge-colored digraphs that can be obtained from digraphs in by coloring edges in either in blue or red, in such a way that the blue edges form a disjoint union of simple directed cycles. We claim that

$$\det(L)=\sum_{C\in\mathcal{C}_n}(-1)^{\mathrm{cycle}(C)}\bigcap_{(i,j)\in C}E_{i,j},$$

where is the number of blue cycles of the colored digraph . Indeed, the blue arcs of an element encode a permutation (the blue arcs are ), the red arcs of encode a summand in the expansion of (the red arcs form a set of the form ), and the factor is equal to .

Now, the digraphs in are all the graphs made of a (possibly empty) tree oriented toward the vertex together with a (possibly empty) set of directed cycles on which are possibly attached oriented trees. Moreover, if one sums the contribution of all the elements corresponding to the same underlying graph one gets 0 if there are some directed cycles (because these cycles can be colored either blue or red), and otherwise (because all the edges have to be red). This gives,

$$\det(L)=\sum_{C\in\mathcal{C}_n}(-1)^{\mathrm{cycle}(C)}\bigcap_{(i,j)\in C}E_{i,j}=\sum_{C\in\mathcal{T}_n}\bigcap_{(i,j)\in C}E_{i,j},$$

and taking probability on both sides gives . ∎
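Proposition 2 can also be tested numerically on a small finite probability space. The Python sketch below is only an illustration under stated assumptions: the sample space `OMEGA` is a hypothetical eight-point set with the uniform measure, the events are drawn pseudo-randomly, vertices are labeled $0,\ldots,n-1$ with the last one as root, and a generalized event is stored as a dictionary mapping events (frozensets) to coefficients. The function names are not taken from the paper.

```python
import random
from fractions import Fraction
from itertools import permutations, product

OMEGA = frozenset(range(8))  # hypothetical sample space, uniform measure


def meet(x, y):
    """Intersection of two generalized events, extended distributively."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            out[a & b] = out.get(a & b, 0) + ca * cb
    return out


def prob(x):
    """Probability of a generalized event, extended by linearity."""
    return sum(c * Fraction(len(a), len(OMEGA)) for a, c in x.items())


def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    n = len(perm)
    inv = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
    return -1 if inv % 2 else 1


def pdet(matrix):
    """Probability of the determinant of a matrix of generalized events."""
    total = Fraction(0)
    for perm in permutations(range(len(matrix))):
        term = {OMEGA: 1}  # OMEGA is the unit for intersection
        for i, j in enumerate(perm):
            term = meet(term, matrix[i][j])
        total += sign(perm) * prob(term)
    return total


def reduced_laplacian(E, n):
    """Entries -E[i][j] off the diagonal, sum of E[i][t] over t != i on it."""
    L = []
    for i in range(n - 1):
        row = []
        for j in range(n - 1):
            if i != j:
                row.append({E[i][j]: -1})
            else:
                diag = {}
                for t in range(n):
                    if t != i:
                        diag[E[i][t]] = diag.get(E[i][t], 0) + 1
                row.append(diag)
        L.append(row)
    return L


def reaches_root(start, targets, n):
    """Follow outgoing arcs from `start`; True if the root n-1 is reached."""
    v = start
    for _ in range(n):
        if v == n - 1:
            return True
        v = targets[v]
    return v == n - 1


def tree_probability_sum(E, n):
    """Sum over trees T rooted at n-1 of P(intersection of E[i][j], (i,j) in T)."""
    total = Fraction(0)
    for targets in product(range(n), repeat=n - 1):
        if any(targets[i] == i for i in range(n - 1)):
            continue
        if all(reaches_root(s, targets, n) for s in range(n - 1)):
            common = OMEGA
            for i in range(n - 1):
                common &= E[i][targets[i]]
            total += Fraction(len(common), len(OMEGA))
    return total


rng = random.Random(0)
n = 4
E = [[frozenset(x for x in OMEGA if rng.random() < 0.5) for _ in range(n)]
     for _ in range(n - 1)]
print(tree_probability_sum(E, n) == pdet(reduced_laplacian(E, n)))
```

The events here are arbitrary and generally not independent, which is the situation Proposition 2 is designed for; the two sides agree for every choice of events.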

Remark. Observe that more generally, for any commutative ring , any abelian group , and any homomorphism from to , there is an analogue of the matrix-tree theorem which holds with the same proof:

$$\sum_{T\in\mathcal{T}_n}P\Bigl(\bigotimes_{(i,j)\in T}E_{i,j}\Bigr)=P(\det(L)).$$

Before closing this section we define an equivalence relation on the set of generalized events. Let , be events in , and let , be complex numbers. We say that the generalized events and are equivalent, and we denote this , if for all , the quantities and are equal. For instance, for all , the generalized events and are equivalent. Also the event and the generalized event 0 are equivalent. It is easy to see that is an equivalence relation (symmetric, reflexive, transitive) and that if then . Moreover, if and then , and . We say that two matrices of generalized events and are equivalent if for all . The preceding properties immediately imply the following result.

###### Lemma 3.

If and are equivalent matrices of events, then they have the same -determinant.

## 4. Determinantal expressions for the probabilities $P_\zeta(p,r)$

We fix , , and p as in Theorem 1. We define a probability space in the following way:

• is the set of pairs , where S is in and is a surjection from to .

• is the power set ,

• is the uniform distribution on .

We denote by the set of triples , where is in and is a permutation of . For a matrix of events we denote by the set of triples such that is in the intersection . Observe that

$$\operatorname{Pdet}(E)=\frac{1}{|\Omega|}\sum_{(S,f,\pi)\in\Omega(E)}\epsilon(\pi). \tag{3}$$

We now express the probabilities defined in Theorem 1 as -determinants. By definition, for , is the conditional probability, in the space , that the random digraph is a tree given that is in (equivalently, given that none of the subsets is equal to ). Since the random digraph is never a tree unless S is in (because has loops if ) one gets

where is the set of spanning trees of rooted at .

We will now use our generalization of the matrix-tree theorem. For in , , and , we define the event as the set of pairs in such that . In other words, is the event “the arc of the digraph is ”. By definition, for any tree in , the event is equal to . Thus,

$$P_\zeta(p,r)=\frac{|\mathcal{S}_{p,r}|}{|\mathcal{R}_{p,r}|}\sum_{T\in\mathcal{T}_k}P\Bigl(\bigcap_{(i,j)\in T}E_{\zeta,i,j}\Bigr).$$

Hence by Proposition 2,

$$P_\zeta(p,r)=\frac{|\mathcal{S}_{p,r}|}{|\mathcal{R}_{p,r}|}\operatorname{Pdet}(L_\zeta),$$

where is the reduced Laplacian matrix of .

We will now define a matrix equivalent to the reduced Laplacian . For , and in we define the event as follows:

$$I^{t}_{i,j}=\{(S,f)\in\Omega,\ ]i,j]\subseteq S_t\}.$$

Observe that because . We also define the event by if and . For and we define the following generalized events

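$$\begin{aligned}
L'_{\alpha,i,j}&=I^{f(i)}_{i,j}-I^{f(i)}_{i,j-1},\\
L'_{\beta,i,j}&=J^{f(i)}_{i,j+1}-J^{f(i)}_{i,j},\\
L'_{\gamma,i,j}&=J^{f(i)}_{i,j+1}-J^{f(i)}_{i,j}-J^{f(i)}_{i-1,j+1}+J^{f(i)}_{i-1,j}.
\end{aligned} \tag{4}$$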

Here and in the following, we consider the subscripts of the events and cyclically modulo ; for instance is understood as .

It is easy to check that for in and for all such that , one has the equivalence of events . For instance, where

$$E_{\alpha,i,j}=\{i+1,\ldots,j-1\in S_{f(i)},\ \text{and}\ j\notin S_{f(i)}\}=I_{i,j-1}\setminus I_{i,j},$$

hence . Moreover, for all ,

$$L'_{\zeta,i,i}=-\sum_{j\in[k-1]\setminus\{i\}}L'_{\zeta,i,j}\ \sim\ -\sum_{j\in[k-1]\setminus\{i\}}L_{\zeta,i,j}=L_{\zeta,i,i}.$$

Thus, by Lemma 3 the matrices and have the same -determinant. Our findings so far are summarized in the following lemma.

###### Lemma 4.

For all , the probability defined in Theorem 1 is

$$P_\zeta(p,r)=\frac{|\mathcal{S}_{p,r}|}{|\mathcal{R}_{p,r}|}\operatorname{Pdet}(L'_\zeta),$$

where is the matrix of generalized events defined by (4).

Next we derive simpler determinantal expressions for the probabilities .

###### Proposition 5.

For all , the probability defined in Theorem 1 is

$$P_\zeta(p,r)=\frac{|\mathcal{S}_{p,r}|}{|\mathcal{R}_{p,r}|}\operatorname{Pdet}(M_\zeta),$$

where is the matrix of generalized events defined by

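$$\begin{aligned}
M_{\alpha,i,j}&=I^{f(i)}_{i,j} &&\text{if } i\in[k-1], &&\text{and } M_{\alpha,k,j}=\Omega,\\
M_{\beta,i,j}&=J^{f(i)}_{i,j+1} &&\text{if } i\in[k-1], &&\text{and } M_{\beta,k,j}=\Omega,\\
M_{\gamma,i,j}&=J^{f(i)}_{i,j+1}-J^{f(i)}_{i-1,j+1} &&\text{if } i\in[k-1], &&\text{and } M_{\gamma,k,j}=\Omega.
\end{aligned}$$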

Example. Let us illustrate Proposition 5 in the case and . In this case,

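$$M_\alpha=\begin{pmatrix}
I^{f(1)}_{1,1} & I^{f(1)}_{1,2} & I^{f(1)}_{1,3}\\
I^{f(2)}_{2,1} & I^{f(2)}_{2,2} & I^{f(2)}_{2,3}\\
\Omega & \Omega & \Omega
\end{pmatrix}
=\begin{pmatrix}
\Omega & 2\in S_{f(1)} & 2,3\in S_{f(1)}\\
1,3\in S_{f(2)} & \Omega & 3\in S_{f(2)}\\
\Omega & \Omega & \Omega
\end{pmatrix},$$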

hence by definition of the -determinant,

 (5)

Proposition 5 asserts that the probability that the graph is a tree is equal to . We leave it as an exercise to prove that the right-hand side of (5) is equal to as predicted by Theorem 1.
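The entries of the example matrix above can be recomputed mechanically. The short Python sketch below assumes that $]i,j]$ denotes the set $\{i+1,\ldots,j\}$ read cyclically modulo $k$, and that it is empty when $i=j$; this is the reading consistent with the entries displayed in the example. The helper names are hypothetical.

```python
def cyclic_interval(i, j, k):
    """Return ]i, j] = {i+1, ..., j}, with integers read cyclically in {1, ..., k}."""
    members = []
    v = i
    while v != j:
        v = v % k + 1  # successor of v modulo k, staying in {1, ..., k}
        members.append(v)
    return frozenset(members)


def in_event_I(subset, i, j, k):
    """True if the subset S_t contains ]i, j], i.e. the outcome lies in I^t_{i,j}."""
    return cyclic_interval(i, j, k) <= frozenset(subset)


k = 3
for i in (1, 2):
    # Row i of the example: the integers required to lie in S_{f(i)} for j = 1, 2, 3
    print([sorted(cyclic_interval(i, j, k)) for j in (1, 2, 3)])
# prints [[], [2], [2, 3]]
# prints [[1, 3], [], [3]]
```

An empty list corresponds to an entry equal to $\Omega$, since the inclusion of the empty set imposes no condition.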

The rest of this section is devoted to the proof of Proposition 5. We first treat in detail the case . Given Lemma 4 we only need to prove where . Since -determinants are alternating in the columns of matrices, we can replace the th column of by the sum of its first columns without changing the -determinant. This gives,

$$\operatorname{Pdet}(L'_\alpha)=\operatorname{Pdet}\bigl(I^{f(i)}_{i,j}-I^{f(i)}_{i,k}\bigr)_{i,j\in[k-1]}.$$

Next, by linearity of the -determinant in the rows of the matrix, one gets

$$\operatorname{Pdet}(L'_\alpha)=\sum_{D\subseteq[k-1]}(-1)^{|D|}\operatorname{Pdet}(M^{D}),$$

where with if and otherwise. We now show that only of the subsets contribute to the above sum.

###### Lemma 6.

If contains more than one element, then .

###### Proof.

We assume that contains two distinct integers and and want to show that . We will use the expression (3) of -determinant. By definition, a triple is in if and only if for all , and for all , . Observe that the above conditions for and , namely and , do not depend on the permutation . More generally, none of the above conditions is affected by changing the permutation by , where is the transposition of the integers and . Thus a triple is in if and only if is in . Thus the mapping is an involution of . Moreover, since the involution changes the sign of the permutation , we get

$$\operatorname{Pdet}(M^{D})=\frac{1}{|\Omega|}\sum_{(S,f,\pi)\in\Omega(M^{D})}\epsilon(\pi)=0,$$

as claimed. ∎

So far we have shown that

$$\operatorname{Pdet}(L'_\alpha)=\operatorname{Pdet}(M^{\emptyset})-\sum_{a\in[k-1]}\operatorname{Pdet}(M^{\{a\}}).$$

Next, we observe that the set of triples in identifies with the set of triples in such that . Indeed, the correspondence is simply obtained by replacing the permutation of by the permutation of such that , and for all in . Similarly, for all , there is a bijection between the set of triples in and the set of triples in such that . Indeed, the bijection is simply obtained by replacing the permutation of by the permutation of such that , and for all in . Observe that this bijection changes the sign of the permutation, hence

 Pdet(L′α) = 1|Ω|</