Limits of Structures and the Example of Tree-Semilattices
The notion of left-convergent sequences of graphs introduced by Lovász et al. (in relation with homomorphism densities for fixed patterns and Szemerédi’s regularity lemma) has been increasingly studied over the past years. Recently, Nešetřil and Ossona de Mendez introduced a general framework for convergence of sequences of structures. In particular, the authors introduced the notion of QF-convergence, which is a natural generalization of left-convergence. In this paper, we initiate the study of QF-convergence for structures with functional symbols by focusing on the particular case of tree-semilattices. We fully characterize the limit objects and give an application to the study of left-convergence of $m$-partite cographs, a generalization of cographs.
Key words and phrases: Graph limits, Structural limit, Random-free graphon, Tree-order, $m$-partite cograph
2010 Mathematics Subject Classification: 05C99 (Graph theory)
The study of limits of graphs has recently gained major interest [Borgs20081801, Borgs2012, pre05504139, LovaszBook, Lov'asz2006, Lovasz2010]. In the framework studied in the aforementioned papers, a sequence $(G_n)_{n\in\mathbb{N}}$ of graphs is said to be left-convergent if, for every (finite) graph $F$, the probability
$$t(F,G_n)=\frac{\hom(F,G_n)}{|G_n|^{|F|}}$$
that a random map $f:V(F)\to V(G_n)$ is a homomorphism (i.e. a mapping preserving adjacency) converges as $n$ goes to infinity. (For a graph $G$, we denote by $|G|$ the order of $G$, that is the number of vertices of $G$.) In this case, the limit object can be represented by means of a graphon, that is a measurable function $W:[0,1]^2\to[0,1]$. The definition of the function $t$ above is extended to graphons by
$$t(F,W)=\int\cdots\int\prod_{ij\in E(F)}W(x_i,x_j)\,\mathrm{d}x_1\cdots\mathrm{d}x_p$$
(where we assume that $F$ is a graph with vertex set $\{1,\dots,p\}$) and then the graphon $W$ is the left-limit of a left-convergent sequence of graphs $(G_n)_{n\in\mathbb{N}}$ if for every graph $F$ it holds
$$t(F,W)=\lim_{n\to\infty}t(F,G_n).$$
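As a small illustration of the probability used in the definition of left-convergence (commonly called the homomorphism density), the following sketch counts by brute force the maps from a pattern graph to a target graph that preserve adjacency. The function name `hom_density` and the encoding of graphs as edge lists over `range(n)` are assumptions of this sketch, not notation from the paper.

```python
from fractions import Fraction
from itertools import product

def hom_density(pattern_edges, p, graph_edges, n):
    """Fraction of maps {0..p-1} -> {0..n-1} sending every pattern edge to a graph edge."""
    adjacent = set()
    for u, v in graph_edges:
        adjacent.add((u, v))
        adjacent.add((v, u))
    homs = 0
    for f in product(range(n), repeat=p):  # f[i] is the image of pattern vertex i
        if all((f[i], f[j]) in adjacent for i, j in pattern_edges):
            homs += 1
    return Fraction(homs, n ** p)

triangle = [(0, 1), (1, 2), (0, 2)]
edge_in_triangle = hom_density([(0, 1)], 2, triangle, 3)
triangle_in_triangle = hom_density(triangle, 3, triangle, 3)
```

Here the single edge has 6 homomorphisms into the triangle out of 9 maps, and the triangle has 6 homomorphisms into itself out of 27 maps. The enumeration is exponential in the order of the pattern, so it is only meant for tiny examples.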
For $k$-regular hypergraphs, the notion of left-convergence extends in the natural way, and left-limits, called hypergraphons, are measurable functions and have been constructed by Elek and Szegedy using ultraproducts [ElekSze] (see also [Zhao2014]). These limits were also studied by Hoover [Hoover1979], Aldous [AldousICM], and Kallenberg [Kallenberg2005] in the setting of exchangeable random arrays (see also [Austin2008]). For other structures, let us mention limits of permutations [hoppen2011limits, Hoppen201393] and limits of posets [brightwell2010continuum, janson2011poset, hladky2015poset].
A signature $\sigma$ is a set of symbols of relations and functions with their arities. A $\sigma$-structure $\mathbf{A}$ is defined by its domain $A$ and an interpretation in $A$ of all the relations and functions declared in $\sigma$. A $\sigma$-structure is relational if the signature $\sigma$ only contains symbols of relations. Thus relational structures are a natural generalization of $k$-uniform hypergraphs. In contrast, a $\sigma$-structure is functional (or an algebra) if the signature $\sigma$ only contains symbols of functions. Denote by $\mathrm{QF}_p(\sigma)$ the fragment of all quantifier free formulas with $p$ free variables (in the language of $\sigma$) and by $\mathrm{QF}(\sigma)$ the fragment of all quantifier free formulas. In the following, we shall use $\mathrm{QF}_p$ and $\mathrm{QF}$ when the signature $\sigma$ is clear from context. For a formula $\phi$ with $p$ free variables, the set of satisfying assignments of $\phi$ is denoted by $\phi(\mathbf{A})$:
$$\phi(\mathbf{A})=\{(v_1,\dots,v_p)\in A^p:\ \mathbf{A}\models\phi(v_1,\dots,v_p)\}.$$
In the general framework of finite $\sigma$-structures (that is, $\sigma$-structures with finite domain), the notion of QF-convergence has been introduced by Nešetřil and the third author [CMUC]. In this setting, a sequence $(\mathbf{A}_n)_{n\in\mathbb{N}}$ of $\sigma$-structures is QF-convergent if, for every quantifier free formula $\phi$ with free variables $x_1,\dots,x_p$, the probability
$$\langle\phi,\mathbf{A}_n\rangle=\frac{|\phi(\mathbf{A}_n)|}{|A_n|^p}$$
that a random (uniform independent) assignment to the free variables of $\phi$ of elements of $A_n$ satisfies $\phi$ converges as $n$ goes to infinity. These notions naturally extend to weighted structures, that is structures equipped with a non uniform probability measure.
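The satisfaction probability in this definition can be computed directly on a small finite structure. The sketch below is a minimal illustration for the uniform case only; the name `stone_pairing` and the representation of a quantifier free formula as a Python predicate are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

def stone_pairing(domain, formula, p):
    """Probability that p independent uniform elements of `domain` satisfy `formula`."""
    domain = list(domain)
    satisfied = sum(1 for xs in product(domain, repeat=p) if formula(*xs))
    return Fraction(satisfied, len(domain) ** p)

# A three-element chain 0 < 1 < 2, seen as a structure with one binary function (the minimum).
meet = min
# Quantifier free formula with two free variables: "x1 meet x2 = x1".
below = stone_pairing(range(3), lambda x1, x2: meet(x1, x2) == x1, 2)
# Quantifier free formula with three free variables: "x1 meet x2 = x3".
meet_is_third = stone_pairing(range(3), lambda x1, x2, x3: meet(x1, x2) == x3, 3)
```

On the chain, 6 of the 9 pairs satisfy the first formula, and exactly one value of the third variable fits each of the 9 pairs in the second formula, which gives 9 satisfying triples out of 27.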
The notion of QF-convergence extends several notions of convergence.
It was proven in [CMUC] that a sequence of graphs (or of $k$-uniform hypergraphs) with order going to infinity is QF-convergent if and only if it is left-convergent. This is intuitive, as for every finite graph $F$ with vertex set $\{1,\dots,p\}$ there is a quantifier-free formula $\phi_F$ with free variables $x_1,\dots,x_p$ such that for every graph $G$ and every $p$-tuple $(v_1,\dots,v_p)$ of vertices of $G$ it holds $G\models\phi_F(v_1,\dots,v_p)$ if and only if the map $i\mapsto v_i$ is a homomorphism from $F$ to $G$.
As mentioned before, the left-limit of a left-convergent sequence of graphs can be represented by a graphon. However it cannot, in general, be represented by a Borel graph, that is a graph having a standard Borel space $V$ as its vertex set and a Borel subset of $V\times V$ as its edge set. A graphon $W$ is random-free if it is almost everywhere $\{0,1\}$-valued. Notice that a random-free graphon is essentially the same (up to isomorphism mod $0$) as a Borel graph equipped with a non-atomic probability measure on $V$. A class of graphs $\mathcal{C}$ is said to be random-free if, for every left-convergent sequence of graphs $(G_n)_{n\in\mathbb{N}}$ with $G_n\in\mathcal{C}$ (for all $n$), the sequence has a random-free limit.
Local convergence of graphs with bounded degree has been defined by Benjamini and Schramm [Benjamini2001]. A sequence $(G_n)_{n\in\mathbb{N}}$ of graphs with maximum degree $D$ is local-convergent if, for every $r\in\mathbb{N}$, the distribution of the isomorphism types of the distance $r$-neighborhood of a random vertex of $G_n$ converges as $n$ goes to infinity. This notion can also be expressed by means of QF-convergence (in a slightly stronger form). Let $G_n$ be graphs with maximum degree strictly smaller than $D$. By considering a proper edge coloring of $G_n$ by $D$ colors, we can represent $G_n$ as a functional structure $\mathbf{V}_n$ with signature containing $D$ unary functions $f_1,\dots,f_D$, where $V_n$ is the vertex set of $G_n$ and $f_1,\dots,f_D$ are defined as follows: for every vertex $v\in V_n$, $f_i(v)$ is either the unique vertex adjacent to $v$ by an edge of color $i$, or $v$ if no edge of color $i$ is incident to $v$. It is easily checked that the sequence $(\mathbf{V}_n)_{n\in\mathbb{N}}$ is QF-convergent if and only if the sequence of edge-colored graphs $(G_n)_{n\in\mathbb{N}}$ is local-convergent. If $(\mathbf{V}_n)_{n\in\mathbb{N}}$ is QF-convergent, then the limit is a graphing, that is a functional structure $\mathbf{V}$ (with same signature as $\mathbf{V}_n$) such that $V$ is a standard Borel space, and $f_1,\dots,f_D$ are measure-preserving involutions.
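The passage from a properly edge-colored graph to a family of unary functions can be sketched as follows. This is only an illustration under the assumption that a vertex with no incident edge of a given color is mapped to itself (which is what makes each function an involution); the name `coloring_to_involutions` and the triple encoding `(u, v, color)` are hypothetical.

```python
def coloring_to_involutions(vertices, colored_edges, num_colors):
    """One dict per color: v -> its neighbor along that color, or v itself if there is none."""
    funcs = [{v: v for v in vertices} for _ in range(num_colors)]
    for u, v, color in colored_edges:
        # A proper coloring never puts two edges of the same color at one vertex.
        if funcs[color][u] != u or funcs[color][v] != v:
            raise ValueError("the edge coloring is not proper")
        funcs[color][u] = v
        funcs[color][v] = u
    return funcs

# Path 0 - 1 - 2 whose two edges receive colors 0 and 1.
path_funcs = coloring_to_involutions([0, 1, 2], [(0, 1, 0), (1, 2, 1)], 2)
all_involutions = all(f[f[v]] == v for f in path_funcs for v in f)
```

Each resulting map swaps the endpoints of the edges of its color and fixes every other vertex, so applying it twice gives the identity.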
In the case above, the property of the functions to be involutions is essential. The case of quantifier free limits of general functional structures is open, even in the case of unary functions. Only the simplest case of a single unary function has been recently settled [MapLim]. The case of QF-limits of functional structures with a single binary function is obviously at least as complicated as the case of graphs, as a graph $G$ can be encoded by means of a (non-symmetric) function $f$ defined by $f(x,y)=x$ if $x$ and $y$ are adjacent, and $f(x,y)=y$ otherwise, with the property that QF-convergence of the encoding is equivalent to left-convergence of the graphs. The natural guess here for a limit object is the following:
Let $\sigma$ be the signature formed by a single binary functional symbol $f$.
Then the limit of a QF-convergent sequence of finite $\sigma$-structures can be represented by means of a measurable function $w:[0,1]\times[0,1]\to\mathfrak{P}([0,1])$, where $\mathfrak{P}([0,1])$ stands for the space of probability measures on $[0,1]$.
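The remark preceding this conjecture, that a graph can be encoded by a single non-symmetric binary function, can be made concrete with a short sketch. The specific choice below (return the first argument on adjacent pairs and the second argument otherwise) is an assumption of this example, as are the names `encode_graph` and `decode_graph`; any encoding from which adjacency can be read back by a quantifier free formula would serve the same purpose.

```python
def encode_graph(edges):
    """Binary function f with f(x, y) = x if x and y are adjacent, and f(x, y) = y otherwise."""
    adjacent = {frozenset(e) for e in edges}
    def f(x, y):
        return x if frozenset((x, y)) in adjacent else y
    return f

def decode_graph(vertices, f):
    """Recover the edge set: distinct x, y are adjacent exactly when f(x, y) = x."""
    return {frozenset((x, y)) for x in vertices for y in vertices
            if x != y and f(x, y) == x}

path_edges = [(0, 1), (1, 2)]
f = encode_graph(path_edges)
recovered = decode_graph([0, 1, 2], f)
```

The function is not symmetric, since `f(0, 1)` is 0 while `f(1, 0)` is 1, and the decoding step corresponds to the quantifier free formula stating that the two variables differ and the function returns the first one.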
As witnessed by the case of local-convergence of graphs with bounded degrees, the “random-free” case, that is the case where the limit object can be represented by a Borel structure with same signature, is of particular interest. In this paper, we will focus on the case of simple structures defined by a single binary function, the tree-semilattices, and we will prove that they admit Borel tree-semilattices for QF-limits. Conversely, we will prove that every Borel tree-semilattice (with domain equipped with an atomless probability measure) can be arbitrarily well approximated by a finite tree-semilattice, hence leading to a full characterization of QF-limits of finite tree-semilattices.
2. Statement of the Results
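A tree-semilattice is an algebraic structure $\mathbf{T}=(T,\wedge)$ such that:
(1) $(T,\wedge)$ is a meet semi-lattice (i.e. an idempotent commutative semigroup);
(2) for all $x,y,z\in T$ s.t. $x\wedge z=x$ and $y\wedge z=y$ it holds $x\wedge y\in\{x,y\}$.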
Because we consider structures with infimum operator , note that we shall use the symbol for the logical conjunction.
Each tree-semilattice $\mathbf{T}$ canonically defines a partial order $\le$ on its domain by $x\le y$ if $x\wedge y=x$. In the case where $T$ is finite, it is a partial order induced by the ancestor relation of a rooted tree.
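To make the finite case concrete, the following sketch builds the meet of a rooted tree as the lowest common ancestor and checks the axioms by brute force. It assumes that the second axiom reads "any two elements below a common element are comparable"; the helper names `make_meet` and `is_tree_semilattice` and the parent-dictionary encoding are hypothetical.

```python
from itertools import product

def make_meet(parent):
    """parent maps each node to its father (None for the root); the meet is the lowest common ancestor."""
    def ancestors(x):
        chain = []
        while x is not None:
            chain.append(x)
            x = parent[x]
        return chain
    def meet(x, y):
        above_x = set(ancestors(x))
        for z in ancestors(y):  # walking up from y, the first common node is the deepest one
            if z in above_x:
                return z
        raise ValueError("nodes do not belong to the same rooted tree")
    return meet

def is_tree_semilattice(domain, meet):
    """Brute-force check of idempotence, commutativity, associativity and the tree condition."""
    idempotent = all(meet(x, x) == x for x in domain)
    commutative = all(meet(x, y) == meet(y, x) for x, y in product(domain, repeat=2))
    associative = all(meet(meet(x, y), z) == meet(x, meet(y, z))
                      for x, y, z in product(domain, repeat=3))
    chains_below = all(meet(x, y) in (x, y)
                       for x, y, z in product(domain, repeat=3)
                       if meet(x, z) == x and meet(y, z) == y)
    return idempotent and commutative and associative and chains_below

parent = {"r": None, "a": "r", "b": "r", "c": "a"}
tree_meet = make_meet(parent)
tree_ok = is_tree_semilattice(list(parent), tree_meet)
# Bitwise AND on {0, 1, 2, 3} is a meet semi-lattice, but 1 and 2 lie below 3 and are incomparable.
diamond_ok = is_tree_semilattice([0, 1, 2, 3], lambda x, y: x & y)
```

The second example shows why the tree condition is needed: the four-element Boolean lattice satisfies the semigroup axioms but fails the check.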
It is possible to add finitely many unary relations to the signature of tree-semilattices. In this case, we speak of colored tree-semilattices, and we define the color of a vertex as the set of the indices of those unary relations it belongs to: .
A Borel tree-semilattice is a tree-semilattice $\mathbf{T}$ on a standard Borel space $T$, such that $\wedge:T\times T\to T$ is Borel. Note that every finite tree-semilattice is indeed a Borel tree-semilattice.
Our main results concerning QF-convergence of tree-semilattices are as follows:
every QF-convergent sequence of finite colored weighted tree-semilattices admits a limit, which is a Borel colored tree-semilattice (Theorem LABEL:thm:limwato), and conversely: every Borel colored tree-semilattice is the limit of some QF-convergent sequence of colored weighted tree-semilattices (Corollary 2);
every QF-convergent sequence of colored uniform tree-semilattices admits a limit, which is an atomless Borel colored tree-semilattice (Corollary LABEL:cor:limato), and conversely: every atomless Borel colored tree-semilattice is the limit of some QF-convergent sequence of finite colored uniform tree-semilattices (Theorem 1).
The notion of $m$-partite cographs has been introduced in [Ganian2013], based on the generalization of the characterization of cographs by means of cotrees [Corneil1981]: a graph $G$ is an $m$-partite cograph if there exists a colored tree-semilattice $\mathbf{T}$, such that the vertices of $G$ are the leaves of $\mathbf{T}$, the leaves of $\mathbf{T}$ are colored with a set of at most $m$ colors, and the adjacency of any two vertices $x$ and $y$ is fully determined by the colors of $x$, $y$ and $x\wedge y$. (Notice that there is no restriction on the colors used for internal elements of $\mathbf{T}$.) In this setting we prove (Theorem LABEL:thm:limco):
every left-convergent sequence of $m$-partite cographs has a Borel limit, which is the interpretation of an atomless Borel colored tree-semilattice;
conversely, every interpretation of an atomless Borel colored tree-semilattice is the left-limit of a sequence of $m$-partite cographs.
The class of all finite $m$-partite cographs can be characterized by means of a finite family $\mathcal{F}_m$ of excluded induced subgraphs [Ganian2012, Ganian2013]. We prove that this characterization extends to Borel graphs (Theorem LABEL:thm:char) in the sense that an atomless Borel graph excludes all graphs in $\mathcal{F}_m$ as induced subgraphs if and only if it is the interpretation of an atomless colored Borel tree-semilattice.
In a finite tree-semilattice, each element $x$ except the minimum has a unique predecessor, which we call the father of $x$ (as it is the father of $x$ in the associated tree).
For a tree-semilattice and an element we further define
Let be a tree-semilattice, and let . The sub-tree-semilattice of generated by is the tree-semilattice with elements
where is defined as the restriction of the function of to the domain of .
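Condition (2) of the definition of a tree-semilattice can be replaced by the condition:
$$\forall x,y,z\in T\qquad |\{x\wedge y,\ x\wedge z,\ y\wedge z\}|\le 2.$$
It follows that for $X\subseteq T$, the sub-tree-semilattice of $\mathbf{T}$ generated by $X$ has domain
$$X\cup\{x\wedge y:\ x,y\in X\}.$$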
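The fact that one round of pairwise meets already yields a domain closed under the meet can be checked on a small example. The sketch below uses finite strings with the longest common prefix as meet, which is one convenient model of a tree-semilattice chosen for this illustration; the helper names are hypothetical.

```python
def meet(x, y):
    """Longest common prefix of two strings, i.e. the meet in the prefix tree."""
    n = 0
    while n < min(len(x), len(y)) and x[n] == y[n]:
        n += 1
    return x[:n]

def generated_domain(X, meet):
    """X together with all pairwise meets of elements of X."""
    X = set(X)
    return X | {meet(x, y) for x in X for y in X}

def is_closed(S, meet):
    """True if S is closed under the meet operation."""
    return all(meet(x, y) in S for x in S for y in S)

X = {"aab", "aac", "ab", "b"}
domain = generated_domain(X, meet)
closed = is_closed(domain, meet)
```

The set `X` itself is not closed (for instance the meet of `"aab"` and `"aac"` is `"aa"`), while the generated domain adds only `"aa"`, `"a"` and the empty string and needs no further iteration.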
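If $\phi$ is any quantifier-free formula with $p$ free variables (in the language of tree-semilattices) and $\mathbf{T}$ is a Borel tree-semilattice, then $\phi(\mathbf{T})$ is a Borel subset of $T^p$; thus any (Borel) probability measure $\mu$ on $T$ allows us to define
$$\langle\phi,(\mathbf{T},\mu)\rangle=\mu^{\otimes p}(\phi(\mathbf{T})).$$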
Let be Borel tree-semilattices, and let and be probability measures on and , respectively. We define the pseudometric
Note that a sequence of Borel tree-semilattices is QF-convergent if and only if it is Cauchy for the above distance.
As mentioned in Section 2, the color of an element of a colored tree-semilattice is the set of the indices of those unary relations belongs to. The order of the relations naturally induces a total order on these colors: for distinct it holds if (with convention ).
4. Sampling and Approximating
Two Borel structures $\mathbf{A},\mathbf{B}$ are QF-equivalent if $\langle\phi,\mathbf{A}\rangle=\langle\phi,\mathbf{B}\rangle$ holds for every quantifier free formula $\phi$. The following lemma, which is trivial for uniform structures, requires a little work for structures with a probability measure. As this result is not really needed here, we leave the proof to the reader.
Two finite structures are QF-equivalent if and only if they are isomorphic.
Let be a Borel structure, and let . The -sampling of is the random structure defined as follows:
the domain of is the union of sets , and , where is a set of random independent elements of sampled with probability , is the set of all the elements that can be obtained from by at most applications of a function, and is an additional element;
the relations are defined on as in , as well as functions when the image belongs to . When undefined, functions have image ;
the probability measure on assigns probability to , and probability to the other elements.
Lemma 2 ([mcdiarmid1989method]).
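Let $X_1,\dots,X_n$ be independent random variables, with $X_k$ taking values in a set $A_k$ for each $k$. Suppose that a (measurable) function $f:\prod_k A_k\to\mathbb{R}$ satisfies
$$|f(\mathbf{x})-f(\mathbf{x}')|\le c_k$$
whenever the vectors $\mathbf{x}$ and $\mathbf{x}'$ differ only in the $k$th coordinate. Let $Y$ be the random variable $f(X_1,\dots,X_n)$. Then for any $t>0$,
$$\Pr\bigl(|Y-\mathbb{E}Y|\ge t\bigr)\le 2\exp\Bigl(-2t^2\Big/\sum_{k=1}^{n}c_k^2\Bigr).$$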
Let be a quantifier-free formula with at most free variables and at most functional symbols.
Then, for every Borel structure and every and , it holds
where is the -sampling of .
Let and let be the indicator function of . Then
where is the Pochhammer symbol.
Then it holds
Considering the expectation we get
So we have
Now remark that for every it holds
as bounds the probability that a mapping from to will map some value to . Thus, according to Lemma 2 it holds for any :
In particular, for it holds
By union bound, we deduce that for sufficiently large there exists an -sampling which has -close Stone pairing with any formula with at most free variables and at most functional symbols. Precisely, we have:
For every signature there exists a function with the following property:
For every Borel -structure , every and every there exists, for each an -sampling of such that for every formula with at most free variables and functional symbols it holds
Hence we have:
Every Borel -structure is the limit of a sequence of weighted finite -structures.
Note that the finite weighted structures obtained as samplings of a Borel structure usually have many elements with zero measure. The problem of determining whether an infinite Borel structure is the limit of a sequence of finite unweighted structures is much more difficult. Note that we have some (easy) necessary conditions on the structure:
the domain is uncountable and the measure is atomless;
for every definable functions , and every definable subset of the set of fixed points of , the sets and have the same measure.
The second condition can be seen as a simple generalization of the intrinsic mass transport principle of Benjamini and Schramm: a graphing indeed defines a purely functional structure, with functional symbols, each interpreted as a measure preserving involution. In this case, the existence for each graphing of a sequence of bounded degree graphs having the given graphing as its limit is the difficult and well-known Aldous-Lyons conjecture [Aldous2006]. It turns out that one of the main difficulties of this problem concerns the expansion properties of the graphing. This naturally leads us to first consider a weakened version, which we now present (for generalized structures):
Let $\mathbf{A}$ be a $\sigma$-structure and let $X\subseteq A$. The substructure of $\mathbf{A}$ generated by $X$ is the $\sigma$-structure, denoted $\mathbf{A}[X]$, whose domain is the smallest subset of $A$ that includes $X$ and is closed under all functional symbols of $\sigma$, with the same interpretation of the relation and function symbols as $\mathbf{A}$.
We shall now prove that in the case of atomless Borel tree semi-lattices, the sampling techniques can be used to build arbitrarily good finite approximations, thus to build a converging sequence of finite tree semi-lattices with the given Borel semi-lattice as a limit.
Every atomless Borel tree-semilattice is the limit of a sequence of uniform finite Borel tree-semilattices.
Let be an -sampling of (note that in the context of tree-semilattices, and according to Remark 1, taking yields the same structures as with ). Let , that is, the number of vertices of that were not directly sampled.
Fix and let . Let be the Borel tree-semilattice with elements set defined by:
with meet operation defined by
and uniform measure .
Informally, is obtained from by replacing each of the randomly selected elements used to create with a chain on vertices, and considering a uniform measure.
Define the map
Note that for every quantifier free formula with free variables, for every distinct and every it holds:
Let be a Borel tree-semilattice, an integer, and (resp. ) independent random variables in (resp. ) (with , resp. ). Let be a quantifier-free formula. As for any formula and any structure it holds , we have
Similarly, denoting the event
But, denoting by the indicator function of set , it holds:
Altogether, we have:
In other words, it holds
Together with Corollary 2, we get that for every atomless Borel tree semi-lattice , every and every there exists a finite (unweighted) tree semi-lattice such that for every quantifier free formula with free variables it holds
hence if we choose it holds .
5. Limits of Tree-Semilattices
In this section, we focus on providing an approximation lemma for finite colored tree-semilattices, which can be seen as an analog of the weak version of Szemerédi’s regularity lemma. For the sake of simplicity, in this section, by tree-semilattice we always mean a finite weighted colored tree-semilattice.
5.1. Partitions of tree-semilattices
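Let $(\mathbf{T},\mu)$ be a tree-semilattice and let $\varepsilon>0$. Then an element $v\in T$ is said to be:
$\varepsilon$-light if $\mu(T_v)<\varepsilon$;
$\varepsilon$-singular if $\mu(T'_v)\ge\varepsilon$, where $T'_v$ is the set $T_v$ minus the sets $T_u$ for non-$\varepsilon$-light children $u$ of $v$;
$\varepsilon$-chaining if $v$ is not singular and has exactly $1$ non-light child;
$\varepsilon$-branching if $v$ is not singular and has at least $2$ non-light children.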
(One can easily check that every vertex of a tree-semilattice falls into exactly one of these categories.)
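The four categories can be computed in one pass over a finite weighted rooted tree. The sketch below assumes the following reading of the definition: a vertex is light when its subtree has measure below the threshold, and singular when what remains of its subtree after removing the subtrees of its non-light children still has measure at least the threshold. The function name `classify` and the dictionary encoding of the tree are hypothetical.

```python
from fractions import Fraction

def classify(children, weight, root, eps):
    """Return a dict mapping every vertex to 'light', 'singular', 'chaining' or 'branching'."""
    subtree = {}
    def fill(v):
        # Measure of the subtree rooted at v (recursive, fine for small trees).
        subtree[v] = weight[v] + sum(fill(u) for u in children.get(v, []))
        return subtree[v]
    fill(root)
    kind = {}
    for v in subtree:
        heavy = [u for u in children.get(v, []) if subtree[u] >= eps]
        remaining = subtree[v] - sum(subtree[u] for u in heavy)
        if subtree[v] < eps:
            kind[v] = "light"
        elif remaining >= eps:
            kind[v] = "singular"
        elif len(heavy) == 1:
            kind[v] = "chaining"
        else:
            # Non-light and non-singular with no heavy child is impossible, so at least 2 here.
            kind[v] = "branching"
    return kind

children = {"r": ["a", "b", "d"], "a": ["c"]}
weight = {"r": Fraction(0), "a": Fraction(1, 8), "b": Fraction(3, 8),
          "c": Fraction(3, 8), "d": Fraction(1, 8)}
kinds = classify(children, weight, "r", Fraction(1, 4))
```

In this example the root has two non-light children and little measure of its own, so it is branching; `a` has a single non-light child and is chaining; `b` and `c` are singular leaves; `d` is light. Exact fractions are used so that comparisons with the threshold are not affected by rounding.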
Let be a tree-semilattice. A partition of is an -partition of if
each part is of one of the following types (for some , see Fig. 1):
for some non-empty ,
for some with ,
for some distinct from ( is called the cut vertex of , and the path from to the father of is called the spine of ),
where (which is easily checked to be the infimum of ) is called the attachment vertex of and is denoted by ;
every part which is not a singleton has $\mu$-measure at most $\varepsilon$.
An -partition of a tree-semilattice canonically defines a quotient rooted tree:
Let be an -partition of a tree-semilattice . The quotient rooted tree is the rooted tree with node set , the root of which is the unique part of that contains the minimum element of , where the father of any non-root part is the part that contains the father of if (i.e. types 1, 2, or 4) or itself if (i.e. type 3). By abuse of notation, will also denote the tree-semilattice defined by the ancestor relation in the rooted tree .
In several circumstances it will be handy to refer to an -partition of a tree-semilattice directly by means of the partition map (where is meant as the vertex set of ). Note that this mapping is a weak homomorphism in the sense that for every it holds except (maybe) in the case where . The definition of -partitions can easily be transposed to provide a characterization of those weak homomorphisms that define an -partition. Such mappings are called -partition functions.
We now prove that every tree-semilattice has a small -partition.
Let be a tree-semilattice and . Then there exists an -partition of with at most elements.
For the sake of clarity, we construct the desired -partition in two steps. First, let be the partition of obtained in the following manner:
for every -singular vertex , keep in its own part, and, for an arbitrary order on the -light children of , group the greedily such that the -measure of each part is maximum while remaining less than .
for every -branching or -chaining vertex , group with ,
It is easily seen that is indeed an -partition of (with only parts of type ). However, the number of parts is not bounded by .
Now, as per the definition, an -chaining vertex has at most one -chaining child, and the previous construction never groups the two together. Hence, we can consider chains of parts rooted at -chaining vertices, the parts of which we merge greedily (starting from the closest to the root, and going from parent to child) so that each part has maximum -measure while remaining less than , thus yielding a partition , in which every part is either:
an -singular vertex alone,
a subtree of rooted at an -branching vertex,
or for a set of children of some -singular vertex.
for an -chaining vertex and a descendent of ,
Note that these four categories correspond exactly to the types described in Definition 3, and since by definition, all parts are indeed of -measure at most , we get an -partition of . Now we need to prove that the number of parts is bounded.
First, note that there are at most $1/\varepsilon$ sets of the first kind. Indeed, to each singular vertex $v$ corresponds a subtree ($T_v$ minus the sets $T_u$ for non-$\varepsilon$-light children $u$ of $v$) of measure at least $\varepsilon$, and these subtrees are pairwise disjoint.
Sets of type correspond to -branching vertices. Consider the tree obtained by deleting -light vertices. We obtain a rooted tree, in which every chaining vertex has exactly one child, branching vertices have at least two children, and leaves are necessarily singular vertices. Therefore the number of branching vertices is at most the number of singular vertices, hence there are at most sets of type .
Finally, note that in the greedy construction of sets of types 3 and 4, we apply a similar principle: we have a collection of disjoint sets, each of measure at most , and of total measure, say , and we partition this collection by forming groups of total measure at most . It is an easy observation that this can be done using at most groups: one can group the sets greedily by decreasing order of measure, which ensures that all groups but the last have weight at least . Moreover these groups are pairwise disjoint (a non-light vertex cannot be the descendant of a light one), so the total number of parts of type and is at most . This concludes our proof.
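The greedy grouping used in this counting argument can be sketched as follows. This is only an illustration under stated assumptions: measures are given as numbers, the bound on a group is a parameter called `cap` here, and a group is closed as soon as the next set (in decreasing order of measure) does not fit. With this rule every closed group weighs more than half of `cap`, since the rejected set is no larger than any set already in the group.

```python
def greedy_groups(weights, cap):
    """Group weights (each at most cap) in decreasing order, closing a group when the next one does not fit."""
    groups, current, total = [], [], 0
    for w in sorted(weights, reverse=True):
        if w > cap:
            raise ValueError("every set must have measure at most cap")
        if current and total + w > cap:
            groups.append(current)
            current, total = [], 0
        current.append(w)
        total += w
    if current:
        groups.append(current)
    return groups

groups = greedy_groups([5, 4, 4, 3, 2, 2, 1], cap=10)
```

On this input the procedure returns three groups of total weight 9, 9 and 3. In general, if the total weight is `W`, the number of groups produced by this sketch is at most `2 * W / cap + 1`, because all groups except possibly the last have weight greater than `cap / 2`.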
A partition $\mathcal{P}'$ is said to be a refinement of another one $\mathcal{P}$ if each element of $\mathcal{P}'$ is a subset of an element of $\mathcal{P}$.
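The refinement relation is straightforward to test on finite partitions. The following sketch is a direct transcription of the definition, with partitions represented as lists of sets; the function name `is_refinement` is hypothetical.

```python
def is_refinement(fine, coarse):
    """True if every part of `fine` is contained in some part of `coarse`."""
    coarse = [set(part) for part in coarse]
    return all(any(set(part) <= big for big in coarse) for part in fine)

coarse_partition = [{1, 2}, {3, 4}]
fine_partition = [{1}, {2}, {3, 4}]
refines = is_refinement(fine_partition, coarse_partition)
refines_back = is_refinement(coarse_partition, fine_partition)
```

The check does not verify that the two arguments are partitions of the same set; it only tests the containment condition of the definition.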
Let , be a (finite) tree-semilattice, and be an -partition of . Then there exists an -partition of with at most elements, that is a refinement of . Moreover, induces an -partition of the tree-semilattice .
Each part not of type is a tree with total measure at most , so we can apply Lemma 4 independently on each of these trees. For parts of type , we start by putting the attachment vertex back in the part, and then again apply Lemma 4, before removing it. Thus, we obtain an -partition of with at most elements. ∎
In the previous subsection, we defined -partitions and described the quotient map and quotient tree associated to an -partition. These are convenient objects to represent the partition (in particular with respect to successive refinements) but they miss some information about the measure and the colors. This is why we introduce here the concept of -reduction. We first give the definition and two easy but essential lemmas before showing how to construct the particular “small” reductions that will be used to construct approximations in the proof of our main theorem.
Let be a tree-semilattice and let be an -partition function of .
An -reduction of is a color-preserving mapping , where is a tree-semilattice, such that factorizes as , where is an -partition function of , and satisfies the property that for every with it holds
Such a situation we depict by the following commutative diagram: