Modeling Limits in Hereditary Classes:
Reduction and Application to Trees
Abstract.
The study of limits of graphs was initiated recently in the two extreme contexts of dense and bounded degree graphs. This led to elegant limiting structures called graphons and graphings. These approaches have been unified and generalized by the authors in a more general setting, using a combination of analytic tools and model theory, to FO-limits (and X-limits) and to the notion of modeling. The existence of modeling limits was established for sequences in a bounded degree class and, in addition, for classes of trees with bounded height and of graphs with bounded tree depth. The study of these seemingly very special classes is in fact a key step in the development of limits for more general situations. A monotone class of graphs that admits modeling limits must be nowhere dense, and it has been conjectured that this condition is also sufficient. Extending earlier results, we derive several general results which present a realistic approach to this conjecture. As an example, we then prove that the class of all finite trees admits modeling limits.
Key words and phrases:
Graph, Relational structure, Graph limits, Structural limits, Radon measures, Stone space, Model theory, First-order logic, Measurable graph
2010 Mathematics Subject Classification:
Primary 03C13 (Finite structures), 03C98 (Applications of model theory), 05C99 (Graph theory), 06E15 (Stone spaces and related structures), Secondary 28C05 (Integration theory via linear functionals)
1. Introduction
The study of limiting properties of large graphs has recently received great attention, mainly in two directions: limits of graphs with bounded degrees [1] and limits of dense graphs [11]. These developments are nicely documented in the recent monograph of Lovász [10]. Motivated by a possible unifying scheme for the study of structural limits, we introduced the notion of Stone pairing and FO-convergence [16, 18]. Precisely, we proposed an approach based on the Stone pairing $\langle\phi, G\rangle$ of a first-order formula $\phi$ (with set of free variables $x_1,\dots,x_p$) and a graph $G$, which is defined by the following expression:
$$\langle\phi, G\rangle = \frac{|\{(v_1,\dots,v_p)\in G^p : G\models\phi(v_1,\dots,v_p)\}|}{|G|^p}.$$
In other words, $\langle\phi, G\rangle$ is the probability that $\phi$ is satisfied in $G$ by a random assignment of vertices (chosen independently and uniformly in the vertex set of $G$) to the free variables of $\phi$.
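As an informal illustration of this probabilistic reading, the following Python sketch evaluates a Stone pairing on a small finite graph by brute-force enumeration of all assignments. The helper names (`stone_pairing`, `adjacent`, `has_two_neighbors`) and the example path graph are assumptions made for this illustration only; formulas are modeled as Python predicates rather than parsed first-order syntax.

```python
from fractions import Fraction
from itertools import product

def stone_pairing(vertices, predicate, arity):
    """Exact fraction of the |V|^arity assignments that satisfy `predicate`."""
    vertices = list(vertices)
    hits = sum(1 for assignment in product(vertices, repeat=arity)
               if predicate(*assignment))
    return Fraction(hits, len(vertices) ** arity)

# Hypothetical example graph: the path 0 - 1 - 2 - 3.
VERTICES = range(4)
EDGES = {(0, 1), (1, 2), (2, 3)}

def adjacent(u, v):
    """Quantifier-free formula x1 ~ x2 (symmetric adjacency)."""
    return (u, v) in EDGES or (v, u) in EDGES

def has_two_neighbors(x):
    """Formula with one free variable: x1 has at least two neighbors."""
    return sum(1 for y in VERTICES if adjacent(x, y)) >= 2

# 6 of the 16 ordered pairs are adjacent; vertices 1 and 2 have two neighbors.
edge_density = stone_pairing(VERTICES, adjacent, 2)
inner_fraction = stone_pairing(VERTICES, has_two_neighbors, 1)
print(edge_density, inner_fraction)
```

Exact rational arithmetic is used so that the values can be compared without rounding; the enumeration is exponential in the number of free variables and is only meant for toy examples.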
Stone pairing induces a notion of convergence: a sequence of graphs $(G_n)_{n\in\mathbb{N}}$ is FO-convergent if, for every first-order formula $\phi$ (in the language of graphs), the values $\langle\phi, G_n\rangle$ converge as $n\to\infty$. In other words, $(G_n)_{n\in\mathbb{N}}$ is FO-convergent if the probability that a formula $\phi$ is satisfied by the graph $G_n$ with a random assignment of vertices of $G_n$ to the free variables of $\phi$ converges as $n$ grows to infinity.
It is sometimes interesting to consider weaker notions of convergence, by restricting the set of considered formulas to a fragment $X$ of FO. In this case, we speak about $X$-convergence instead of FO-convergence. Of special importance are the following fragments:
Fragment | Type of formulas | Type of convergence
$\mathrm{QF}$ | quantifier free formulas | left convergence [11]
$\mathrm{FO}_0$ | sentences (no free variables) | elementary convergence
$\mathrm{FO}_p$ | formulas with free variables in $\{x_1,\dots,x_p\}$ |
$\mathrm{FO}^{\mathrm{local}}$ | local formulas (depending on a fixed distance neighborhood of the free variables) |
$\mathrm{FO}_1^{\mathrm{local}}$ | local formulas with single free variable | local convergence (if bounded degree) [1]
Note that the above notions clearly extend to relational structures. Precisely, if we consider relational structures with signature $\lambda$, the symbols of the relations and constants in $\lambda$ define the non-logical symbols of the vocabulary of the first-order language $\mathrm{FO}(\lambda)$ associated to $\lambda$-structures. Notice that if $\lambda$ is at most countable then $\mathrm{FO}(\lambda)$ is countable. We have shown in [16, 18] that every finite relational structure $\mathbf{A}$ with (at most countable) signature $\lambda$ defines (injectively) a probability measure $\mu_{\mathbf{A}}$ on the standard Borel space $S(\mathcal{B}(\mathrm{FO}(\lambda)))$, which is the Stone space of the Lindenbaum-Tarski algebra $\mathcal{B}(\mathrm{FO}(\lambda))$ of first-order formulas (modulo logical equivalence) in the language of $\lambda$-structures. In this setting, a sequence $(\mathbf{A}_n)_{n\in\mathbb{N}}$ of $\lambda$-structures is FO-convergent if and only if the sequence of measures $(\mu_{\mathbf{A}_n})_{n\in\mathbb{N}}$ converges (in the sense of weak* convergence), and the uniquely determined limit probability measure $\mu$ is such that for every first-order formula $\phi$ it holds
$$\int_{S(\mathcal{B}(\mathrm{FO}(\lambda)))} \mathbf{1}_{K(\phi)}(T)\,\mathrm{d}\mu(T) = \lim_{n\to\infty}\langle\phi,\mathbf{A}_n\rangle,$$
where $\mathbf{1}_{K(\phi)}$ stands for the indicator function of the set $K(\phi)$ of the $T\in S(\mathcal{B}(\mathrm{FO}(\lambda)))$ such that $\phi\in T$. Note that the space of probability measures on the Stone space of a countable Boolean algebra, equipped with the weak topology, is compact.
It is natural to search for a limit object that would look more like a relational structure. Thus we introduced in [18], as a candidate for a possible limit object of sparse structures, the notion of modeling, which extends the notion of graphing introduced for bounded degree graphs. Here is an outline of its definition. A relational sample space is a relational structure $\mathbf{A}$ (with signature $\lambda$) with additional structure: the domain $A$ of a relational sample space is a standard Borel space (with Borel $\sigma$-algebra $\Sigma_{\mathbf{A}}$) with the property that every subset of $A^p$ that is first-order definable in $\mathrm{FO}(\lambda)$ is measurable (in $A^p$ with respect to the product $\sigma$-algebra). A modeling is a relational sample space $\mathbf{A}$ equipped with a probability measure (denoted $\nu_{\mathbf{A}}$). For brevity we shall use the same letter $\mathbf{A}$ for structure, relational sample space, and modeling. The definition of modelings allows us to extend Stone pairing naturally to modelings: the Stone pairing $\langle\phi,\mathbf{A}\rangle$ of a first-order formula $\phi$ (with free variables in $\{x_1,\dots,x_p\}$) and a modeling $\mathbf{A}$ is defined by
$$\langle\phi,\mathbf{A}\rangle = \int_{x\in A^p} \mathbf{1}_{\Omega_\phi(\mathbf{A})}(x)\,\mathrm{d}\nu_{\mathbf{A}}^p(x),$$
where $\mathbf{1}_{\Omega_\phi(\mathbf{A})}$ is the indicator function of the solution set $\Omega_\phi(\mathbf{A})$ of $\phi$ in $\mathbf{A}$, that is:
$$\Omega_\phi(\mathbf{A}) = \{(v_1,\dots,v_p)\in A^p : \mathbf{A}\models\phi(v_1,\dots,v_p)\}.$$
Note that every finite structure canonically defines a modeling (with the same universe, discrete $\sigma$-algebra, and uniform probability measure) and that $\langle\phi,\mathbf{A}\rangle$ in the definition above matches the definition of Stone pairing of a formula and a finite structure introduced earlier.
In the following, we assume that free variables of formulas are of the form with . Note that the free variables need not be indexed by consecutive integers. For a formula , denote by the formula obtained by packing the free variables of : if the free variables of are with then is obtained from by renaming to . Although and differ in general, it is clear that they have the same measure (as can be obtained from by taking the Cartesian product by some power of , and then permuting the coordinates). Hence for every formula it holds
that is: the Stone pairing is invariant by renaming of the free variables.
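The invariance under renaming can be checked numerically on a finite graph. The sketch below is only an illustration under assumed names (`measure`, `adjacent`, the path graph): it compares the measure of the solution set of the formula "x2 ~ x5", viewed as a subset of the fifth power of the vertex set, with that of its packed version "x1 ~ x2", viewed as a subset of the square.

```python
from fractions import Fraction
from itertools import product

VERTICES = list(range(4))
EDGES = {(0, 1), (1, 2), (2, 3)}   # hypothetical path 0 - 1 - 2 - 3

def adjacent(u, v):
    return (u, v) in EDGES or (v, u) in EDGES

def measure(solution_test, arity):
    """Uniform measure of {v in V^arity : solution_test(v)}."""
    hits = sum(1 for v in product(VERTICES, repeat=arity) if solution_test(v))
    return Fraction(hits, len(VERTICES) ** arity)

# Formula "x2 ~ x5": its solution set lives in V^5 and ignores x1, x3, x4.
unpacked = measure(lambda v: adjacent(v[1], v[4]), 5)
# Packed version "x1 ~ x2": its solution set lives in V^2.
packed = measure(lambda v: adjacent(v[0], v[1]), 2)
print(unpacked, packed)
```

The unused coordinates contribute a factor equal to a power of the whole vertex set, which cancels against the larger normalization, so both values coincide.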
The expressive power of the Stone pairing goes slightly beyond satisfaction statistics of first-order formulas. In particular, we prove (see Corollary 1) that the Stone pairing can be extended in a unique way to the infinitary language $L_{\omega_1\omega}$, which is an extension of FO allowing countable conjunctions and disjunctions [20, 21]. Note that this language is still complete, as proved by Karp [5]. Although the compactness theorem does not hold for $L_{\omega_1\omega}$, the interpolation theorem for $L_{\omega_1\omega}$ was proved by Lopez-Escobar [9] and Scott's isomorphism theorem for $L_{\omega_1\omega}$ by Scott [19]. For a modeling $\mathbf{A}$ and an integer $p$, the $L_{\omega_1\omega}$-definable subsets of $A^p$ correspond to the smallest $\sigma$-algebra that contains all the first-order definable subsets of $A^p$ (see Lemma 7). According to the definition of a modeling, this means that all $L_{\omega_1\omega}$-definable sets of a modeling are Borel measurable.
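We say that a class $\mathcal{C}$ of structures admits modeling limits if for every FO-convergent sequence of structures $\mathbf{A}_n\in\mathcal{C}$ there is a modeling $\mathbf{L}$ such that for every $\phi\in\mathrm{FO}$ it holds
$$\langle\phi,\mathbf{L}\rangle = \lim_{n\to\infty}\langle\phi,\mathbf{A}_n\rangle,$$
which we denote by $\mathbf{A}_n\xrightarrow{\mathrm{FO}}\mathbf{L}$. More generally, for a fragment $X$ of FO, we say that a class $\mathcal{C}$ of structures admits modeling $X$-limits if for every $X$-convergent sequence of structures $\mathbf{A}_n\in\mathcal{C}$ there is a modeling $\mathbf{L}$ such that for every $\phi\in X$ it holds $\langle\phi,\mathbf{L}\rangle=\lim_{n\to\infty}\langle\phi,\mathbf{A}_n\rangle$, and we denote this by $\mathbf{A}_n\xrightarrow{X}\mathbf{L}$.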
The following results have been proved in [18]:

every class of graphs with bounded degree admits modeling limits;

every class of colored rooted trees with bounded height admits modeling limits;

every class of graphs with bounded treedepth admits modeling limits.
On the other hand, only sparse monotone classes of graphs can admit modeling limits. Precisely, if a monotone class of graphs admits modeling limits, then it is nowhere dense [18], and we conjectured that a monotone class of graphs actually admits modeling limits if and only if it is nowhere dense.
Recall that a monotone class of graphs $\mathcal{C}$ is nowhere dense if, for every integer $p$, there exists a graph whose $p$-th subdivision is not in $\mathcal{C}$ (for more on nowhere dense graphs, see [13, 12, 14, 15, 17]). The importance of nowhere dense classes and the strong relationship of this notion with first-order logic is exemplified by the recent result of Grohe, Kreutzer, and Siebertz [4], which states that (under a reasonable complexity theoretic assumption) deciding first-order properties of graphs in a monotone class $\mathcal{C}$ is fixed-parameter tractable if and only if $\mathcal{C}$ is nowhere dense.
In this paper, we initiate a systematic study of hereditary classes that admit modeling limits. We prove that the problem of the existence of a modeling limit can be reduced to the study of convergence, and then to two “typical” particular cases:

Residual sequences, that is sequences such that (intuitively) the limit has only zero-measure connected components,

Non-dispersive sequences, that is sequences such that (intuitively) the limit is (almost) connected.
A modeling with universe satisfies the Finitary Mass Transport Principle if, for every and all integers such that
it holds
It is clear that every finite structure satisfies the Finitary Mass Transport Principle, hence every modeling limit of finite structures satisfies the Finitary Mass Transport Principle, too.
A stronger version of this principle, which is also satisfied by every finite structure, does not automatically hold in the limit. A modeling $\mathbf{A}$ with universe $A$ satisfies the Strong Finitary Mass Transport Principle if, for all measurable subsets $X, Y$ of $A$ and all integers $a, b$, the following property holds:
If every $x\in X$ has at least $a$ neighbors in $Y$ and every $y\in Y$ has at most $b$ neighbors in $X$, then $a\,\nu_{\mathbf{A}}(X)\le b\,\nu_{\mathbf{A}}(Y)$ (where $\nu_{\mathbf{A}}$ is the probability measure of $\mathbf{A}$).
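For a finite graph with the uniform measure, the strong principle reduces to a double counting of the pairs formed by a vertex of the first set and an adjacent vertex of the second set. The following sketch checks this exhaustively on one small graph. The function names and the example graph are hypothetical, and the inequality tested, a|X| <= b|Y|, is the finite uniform-measure reading of the principle.

```python
from itertools import chain, combinations

def neighbors(edges, vertices):
    """Adjacency sets of an undirected graph given by an edge list."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def strong_fmtp_holds(adj, X, Y, a, b):
    """True unless the hypothesis holds for (X, Y, a, b) and a|X| <= b|Y| fails."""
    hypothesis = (all(len(adj[x] & Y) >= a for x in X)
                  and all(len(adj[y] & X) <= b for y in Y))
    # With the uniform measure, a*nu(X) <= b*nu(Y) is equivalent to a|X| <= b|Y|.
    return (not hypothesis) or a * len(X) <= b * len(Y)

def all_subsets(vertices):
    vs = list(vertices)
    return [set(c) for c in chain.from_iterable(
        combinations(vs, k) for k in range(len(vs) + 1))]

# Hypothetical example: a star with center 0 plus a pendant edge 3 - 4.
VERTICES = range(5)
ADJ = neighbors({(0, 1), (0, 2), (0, 3), (3, 4)}, VERTICES)

violations = [(X, Y, a, b)
              for X in all_subsets(VERTICES) for Y in all_subsets(VERTICES)
              for a in range(4) for b in range(4)
              if not strong_fmtp_holds(ADJ, X, Y, a, b)]
print(len(violations))
```

The check is a sanity test on one graph, not a proof; the general finite case follows from counting the adjacent pairs once from each side.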
In this context, we prove the following theorem, which is the principal result of this paper.
Theorem 1.
Let be a hereditary class of structures.
Assume that for every and every () the following properties hold:

if is convergent and residual, then it has a modeling limit;

if is convergent (resp. convergent) and non-dispersive then it has a modeling limit (resp. a limit).
Then admits modeling limits (resp. modeling limits).
Moreover, if in cases (1) and (2) the modeling limits satisfy the Strong Finitary Mass Transport Principle, then admits modeling limits (resp. modeling limits) that satisfy the Strong Finitary Mass Transport Principle.
Then we apply this theorem in Section 8 to give a simple proof of the fact that the class of forests admits modeling limits.
Theorem 2.
The class of finite forests admits modeling limits, that is: every convergent sequence of finite forests has a modeling limit that satisfies the Strong Finitary Mass Transport Principle.
2. Preliminaries
Let be a relational structure with signature and universe , and let . The substructure induced by has domain and the same relations as (restricted to ). A class of structures is hereditary if every induced substructure of a structure in belongs to : .
The distance between two vertices is the smallest number of relations inducing a connected substructure of and containing both and , that is the graph distance between and in the Gaifman graph of . For and , we denote by the ball of radius centered at , that is the substructure of induced by the vertices at distance at most from in . More generally, for , we denote by the substructure of induced by the vertices at distance at most from at least one of the () in .
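The distance and ball notions above can be made concrete for a finite relational structure. The sketch below is an illustration under assumed conventions: a structure is represented as a dictionary mapping each relation name to a set of tuples, and the functions `gaifman_adjacency` and `ball` are hypothetical helpers. It returns the vertex set of a ball; the ball itself is the substructure induced on that set.

```python
from collections import deque
from itertools import combinations

def gaifman_adjacency(universe, relations):
    """Two distinct elements are adjacent if they occur together in some tuple."""
    adj = {v: set() for v in universe}
    for tuples in relations.values():
        for t in tuples:
            for u, v in combinations(set(t), 2):
                adj[u].add(v)
                adj[v].add(u)
    return adj

def ball(adj, centers, radius):
    """Vertices at Gaifman distance at most `radius` from at least one center."""
    dist = {c: 0 for c in centers}
    queue = deque(centers)
    while queue:                      # multi-source breadth-first search
        u = queue.popleft()
        if dist[u] == radius:
            continue                  # do not explore beyond the radius
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)

# Hypothetical structure: a binary relation E and a ternary relation T.
UNIVERSE = range(6)
RELATIONS = {"E": {(0, 1), (1, 2), (2, 3)}, "T": {(3, 4, 5)}}
ADJ = gaifman_adjacency(UNIVERSE, RELATIONS)

print(ball(ADJ, [0], 1), ball(ADJ, [2], 1), ball(ADJ, [0, 5], 1))
```

The ternary tuple makes 3, 4 and 5 pairwise adjacent in the Gaifman graph, which is why the ball of radius 1 around 5 already contains 3 and 4.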
A formula is local if its satisfaction only depends on the neighborhood of the free variables, that is: for every structure and every it holds
Recall the particular case of Gaifman's locality theorem for sentences, which will be useful in the following. A local sentence is a sentence of the form
where and is local.
Theorem 3 (Gaifman [3]).
Every firstorder sentence is equivalent to a Boolean combination of local sentences.
We end this section with two very simple but useful lemmas.
Lemma 4.
Let be formulas. Then it holds
Proof.
Thus
∎
Lemma 5.
Let be formulas without common free variables. Then it holds
Proof.
Let . For every modeling , the solution set can be obtained from by permuting the coordinates, hence both sets have the same measure, that is:
∎
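The coordinate-permutation argument can be illustrated on a finite graph. The sketch below assumes that the intended consequence is multiplicativity: for formulas with disjoint free variables, the pairing of the conjunction equals the product of the pairings. That reading, the helper names and the example graph are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

VERTICES = list(range(4))
EDGES = {(0, 1), (1, 2), (2, 3)}   # hypothetical path 0 - 1 - 2 - 3

def adjacent(u, v):
    return (u, v) in EDGES or (v, u) in EDGES

def pairing(predicate, arity):
    """Fraction of assignments in V^arity satisfying `predicate`."""
    hits = sum(1 for v in product(VERTICES, repeat=arity) if predicate(*v))
    return Fraction(hits, len(VERTICES) ** arity)

def is_leaf(x):
    """phi(x1): x1 has exactly one neighbor."""
    return sum(1 for y in VERTICES if adjacent(x, y)) == 1

phi = pairing(is_leaf, 1)     # free variable x1
psi = pairing(adjacent, 2)    # free variables x2, x3 (renamed, disjoint from x1)
both = pairing(lambda x1, x2, x3: is_leaf(x1) and adjacent(x2, x3), 3)
print(phi, psi, both)
```

Because the two conjuncts constrain disjoint coordinates, the solution set of the conjunction is a product of the two solution sets, and the uniform measure of a product is the product of the measures.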
3. What does Stone pairing measure?
By definition, the Stone pairing $\langle\phi,\mathbf{A}\rangle$ measures the probability that a given first-order formula $\phi$ is satisfied in $\mathbf{A}$ by a random assignment of vertices of $\mathbf{A}$ to the free variables. For this definition to make sense, we have to assume that every subset of a power of $A$ that is first-order definable without parameters is measurable. Hence we have to consider, for each $p$, a $\sigma$-algebra on $A^p$ that contains all subsets of $A^p$ that are first-order definable without parameters.
The aim of this section is to prove that the minimal $\sigma$-algebra including all subsets of $A^p$ that are first-order definable without parameters is exactly the family of all subsets of $A^p$ that are $L_{\omega_1\omega}$-definable without parameters.
We take time out for two lemmas.
Lemma 6.
Let be a set. For , let be a field of sets on , and let be the minimal $\sigma$-algebra that contains . For , let and let be defined by .
Assume that for each , maps to .
Then maps to .
Proof.
The proof follows the standard construction of a $\sigma$-algebra by transfinite induction. For , we let

be the collection of sets obtained as countable unions of increasing sets in , that is: sets of the form where and ;

be the collection of sets obtained as countable intersections of decreasing sets in , that is: sets of the form where and ;

(for not a limit ordinal) be the collection of sets obtained as countable unions of increasing sets in ;

(for not a limit ordinal) be the collection of sets obtained as countable intersections of decreasing sets in ;

(for limit ordinal) and .
Then it is easily checked by induction that for every up to it holds:

for all , ;

for every limit ordinal , ;

if then and ;

if then and ;

maps to and to ;
According to the monotone class theorem, . ∎
Lemma 7.
We consider a relational structure with countable signature.
Let (resp. ) be the field of sets of all the subsets of that are first-order definable without (resp. with) parameters. Then the smallest $\sigma$-algebra (resp. ) is the $\sigma$-algebra of all the subsets of that are $L_{\omega_1\omega}$-definable in without (resp. with) parameters.
Proof.
Let and be the projection map. According to Lemma 6, the projection map sends sets in to sets in (and sets in to sets in ). It follows easily that subsets of that are $L_{\omega_1\omega}$-definable without (resp. with) parameters are exactly those in (resp. ). ∎
Note that when is a modeling, the collection of the subsets of definable in without parameters is the $\sigma$-algebra generated by the projection , mapping a tuple of vertices of to its type: a subset of is definable in without parameters if and only if it is the preimage by of a Borel subset of (see [18] for a detailed definition and analysis of ).
Corollary 1.
For every modeling the Stone pairing can be extended in a unique way to $L_{\omega_1\omega}$.
Remark 8.
Let be the set of all probability measures on the Stone space , and let be the $\sigma$-algebra generated by evaluation maps for measurable set of . It is well known that is a standard Borel space ([6], Sect. .E). (Hence the space of all finite structures and their limits is also a compact standard Borel space, as it can be identified with a closed subspace of .) The mapping embeds the space of modelings into . The initial topology on with respect to this mapping is the same as the topology induced by Stone pairing. Hence the mapping , which maps to , is continuous for , and measurable for .
Also remark that the topology of can be defined by means of Lévy–Prokhorov metric (by choosing some metric on the Stone space). For instance, for finite signature , the topology of can be generated by the pseudometric:
Theorem 9.
Let be a relational sample space. Then every subset of that is definable (with parameters) is measurable (with respect to the product Borel $\sigma$-algebra ).
3.1. Interpretation Schemes
Interpretation Schemes (introduced in this setting in [18]) generalize to other logics than .
Definition 10.
Let be a logic (for us, or ). For and a signature , denotes the set of the formulas in the language of in logic , with free variables in .
Let be signatures, where has relational symbols with respective arities .
An interpretation scheme of structures in structures is defined by an integer — the exponent of the interpretation scheme — a formula , a formula , and a formula for each symbol , such that:

the formula defines an equivalence relation of tuples;

each formula is compatible with , in the sense that for every it holds
where , boldface and represent tuples of free variables, and where stands for .
For a structure , we denote by the structure defined as follows:

the domain of is the subset of the equivalence classes of the tuples such that ;

for each and every such that (for every ) it holds
From the standard properties of model theoretical interpretations (see, for instance [8] p. 180), we state the following: if is an interpretation of structures in structures, then there exists a mapping (defined by means of the formulas above) such that for every , and every structure , the following property holds (while letting and identifying elements of with the corresponding equivalence classes of ):
For every (where ) it holds
It directly follows from the existence of the mapping that

an interpretation scheme of structures in structures defines a continuous mapping from to ;

an interpretation scheme of structures in structures defines a measurable mapping from to .
Definition 11.
Let be signatures. A basic interpretation scheme of structures in structures with exponent is defined by a formula for each symbol with arity .
For a structure , we denote by the structure with domain such that, for every with arity and every it holds
It is immediate that every basic interpretation scheme defines a mapping such that for every structure , every , and every it holds
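A basic interpretation scheme with exponent 1 keeps the domain and redefines each relation by a formula. As an informal sketch, the code below interprets the square of a graph: the new edge relation is defined by the formula "x and y are distinct and at distance at most 2". The names `interpret` and `rho_square`, and the choice of this particular scheme, are assumptions made for illustration; formulas are again modeled as Python predicates.

```python
from fractions import Fraction
from itertools import product

def interpret(universe, edges, rho):
    """Same domain; the new binary relation is the solution set of `rho`."""
    def adjacent(u, v):
        return (u, v) in edges or (v, u) in edges
    return {(u, v) for u, v in product(universe, repeat=2)
            if rho(adjacent, universe, u, v)}

def rho_square(adjacent, universe, x, y):
    """Defining formula: x != y and (x ~ y or there is z with x ~ z and z ~ y)."""
    return x != y and (adjacent(x, y)
                       or any(adjacent(x, z) and adjacent(z, y) for z in universe))

# Hypothetical source structure: the path 0 - 1 - 2 - 3.
UNIVERSE = range(4)
EDGES = {(0, 1), (1, 2), (2, 3)}

square = interpret(UNIVERSE, EDGES, rho_square)
# Edge density of the interpreted structure, computed as the pairing of the
# defining formula in the source structure.
density = Fraction(len(square), len(UNIVERSE) ** 2)
print(sorted(square), density)
```

The last computation reflects the general pattern: a statistic of the interpreted structure is obtained as a Stone pairing, in the original structure, of the formula produced by substituting the defining formulas.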
We deduce the following general properties:
Lemma 12 ([18]).
Let be an interpretation scheme of structures in structures.
Then, if a sequence of finite structures is convergent then the sequence of (finite) structures is convergent.
Lemma 13.
Let be an interpretation scheme of structures in structures.
If is injective and is a relational sample space, then is a relational sample space.
Furthermore, if is a basic interpretation scheme and is a modeling, then is a modeling and for every , it holds
Proof.
Assume is an injective interpretation scheme and is a relational sample space.
We first mark all the (finitely many) parameters and reduce to the case where the interpretation has no parameters (as in the case of interpretation, see [18]). Let be the domain of . As is definable in , is definable in hence . Then is a Borel subspace of . As is a bijection from to , we deduce that is a standard Borel space. Moreover, as the inverse image of every definable set of is definable in , we deduce that is a relational sample space.
Assume is a basic interpretation scheme and is a modeling. The pushforward of by defines a probability measure on such that for every , it holds
∎
4. Residual Sequences
Every modeling can be decomposed into countably many connected components with nonzero measure and a union of connected components with (individual) zero measure. A residual modeling is a modeling all of whose components have zero measure.
Lemma 14.
A modeling is residual if and only if it holds
Proof.
Assume that for every it holds . For , the connected component of is . As all these balls are firstorder definable (hence measurable) we deduce
It follows that every connected component of has zero measure, hence is residual.
Conversely, assume that there exists and such that . Then the connected component of does not have zero measure, hence is not residual. ∎
This equivalence justifies the following notion of residual sequence.
Definition 15 (Residual sequence).
A sequence of modelings is residual if
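For finite graphs with the uniform measure, the quantity controlling residuality can be computed directly. The sketch below reads the definition as requiring that, for each fixed radius, the largest measure of a ball of that radius tends to 0 along the sequence; this reading is an assumption suggested by Lemma 14, and the function names and the two example families (paths and stars) are hypothetical.

```python
from collections import deque
from fractions import Fraction

def path(n):
    """Adjacency sets of the path on vertices 0, ..., n-1."""
    return {v: {w for w in (v - 1, v + 1) if 0 <= w < n} for v in range(n)}

def star(n):
    """Adjacency sets of the star with center 0 and n-1 leaves."""
    adj = {v: {0} for v in range(1, n)}
    adj[0] = set(range(1, n))
    return adj

def ball_size(adj, center, radius):
    """Number of vertices at distance at most `radius` from `center`."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return len(dist)

def max_ball_measure(adj, radius):
    """Largest uniform measure of a ball of the given radius."""
    return max(Fraction(ball_size(adj, v, radius), len(adj)) for v in adj)

for n in (10, 100, 1000):
    print(n, max_ball_measure(path(n), 2), max_ball_measure(star(n), 1))
```

Along the paths the maximum ball measure for radius 2 is 5/n and tends to 0, in line with a residual sequence, whereas along the stars the ball of radius 1 around the center always has measure 1.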
Lemma 16.
Let be local, and define the formula
Then there exist local formulas such that it holds
Proof.
According to Lemma 4 it holds
According to the locality of there exist local formulas such that is logically equivalent to (where denotes the formula with free variable renamed ). Thus, according to Lemma 4, it holds
As the formulas use no common free variables, it holds (according to Lemma 5):
Hence the result. ∎
Corollary 2.
Let be local.
Then there exist local formulas such that for every modeling it holds
Proof.
Lemma 17.
For a residual sequence, convergence is equivalent to convergence.
Proof.
Let be a residual sequence.
If is convergent, it is convergent.
Assume is convergent, let be an local formula, and let be the formula .
As is residual, it holds
According to Lemma 16, there exist local formulas such that for every it holds
Hence
Hence for residual sequences, convergence implies convergence. ∎
To every formula and integer we associate the formula defined as
Definition 18.
A modeling is clean if for every formula it holds
(Note that the right-hand side condition is equivalent to the existence of such that .)
Lemma 19.
Let be a residual clean modeling and let .
If is not empty, then it is uncountable.
Proof.
Assume is not empty. As is clean, there exists such that , that is: . Clearly . Assume for contradiction that is countable. Then
As is residual, for every it holds , which contradicts the assumption . ∎
Let be a fragment of . Two modelings and are equivalent if, for every it holds . We shall now show how any modeling can be transformed into a residual clean modeling, which is equivalent.
Lemma 20.
Let be a modeling. Then there exists a residual modeling that is equivalent to .
Proof.
Consider the modeling with universe , measure (where is standard Borel measure of ) and relations defined as follows: for every relation of arity it holds
Then is residual and for every it holds
∎
Lemma 21.
Let be a residual modeling. Then there exists a residual clean modeling obtained from by removing a union of connected components with global measure zero.
Proof.
Let be such that and . For , denote by the connected component of that contains , that is: . Note that if and if then but .
Note that the assumption on rewrites as “ while for every it holds ”.
Then
Denote by the set of all