Modeling Limits in Hereditary Classes:
Reduction and Application to Trees

Jaroslav Nešetřil
Computer Science Institute of Charles University (IUUK and ITI)
Malostranské nám.25, 11800 Praha 1, Czech Republic
nesetril@iuuk.mff.cuni.cz
and Patrice Ossona de Mendez
Centre d’Analyse et de Mathématiques Sociales (CNRS, UMR 8557)
190-198 avenue de France, 75013 Paris, France — and — Computer Science Institute of Charles University (IUUK)
Malostranské nám.25, 11800 Praha 1, Czech Republic
pom@ehess.fr
July 7, 2019
Abstract.

Limits of graphs were initiated recently in the two extreme contexts of dense and bounded degree graphs. This led to elegant limiting structures called graphons and graphings. These approaches have been unified and generalized by the authors in a more general setting, using a combination of analytic tools and model theory, to $\mathrm{FO}$-limits (and $X$-limits) and to the notion of modeling. The existence of modeling limits was established for sequences in a bounded degree class and, in addition, for classes of trees with bounded height and of graphs with bounded tree depth. The study of these seemingly very special classes is in fact a key step in the development of limits for more general situations. The natural obstacle to the existence of modeling limits for a monotone class of graphs is the nowhere dense property, and it has been conjectured that this is a sufficient condition. Extending earlier results, we derive several general results which present a realistic approach to this conjecture. As an example, we then prove that the class of all finite trees admits modeling limits.

Key words and phrases:
Graph, Relational structure, Graph limits, Structural limits, Radon measures, Stone space, Model theory, First-order logic, Measurable graph
2010 Mathematics Subject Classification:
Primary 03C13 (Finite structures), 03C98 (Applications of model theory), 05C99 (Graph theory), 06E15 (Stone spaces and related structures), Secondary 28C05 (Integration theory via linear functionals)
Supported by grant ERCCZ LL-1201 and CE-ITI P202/12/G061, and by the European Associated Laboratory “Structures in Combinatorics” (LEA STRUCO)
Supported by grant ERCCZ LL-1201 and by the European Associated Laboratory “Structures in Combinatorics” (LEA STRUCO)

1. Introduction

The study of limiting properties of large graphs has recently received great attention, mainly in two directions: limits of graphs with bounded degrees [1] and limits of dense graphs [11]. These developments are nicely documented in the recent monograph of Lovász [10]. Motivated by a possible unifying scheme for the study of structural limits, we introduced the notion of Stone pairing and $\mathrm{FO}$-convergence [16, 18]. Precisely, we proposed an approach based on the Stone pairing $\langle\phi,G\rangle$ of a first-order formula $\phi$ (with set of free variables $\mathrm{Fv}(\phi)$) and a graph $G$, which is defined by the following expression:

$$\langle\phi,G\rangle=\frac{\bigl|\{(v_1,\dots,v_{|\mathrm{Fv}(\phi)|})\in G^{|\mathrm{Fv}(\phi)|}:\ G\models\phi(v_1,\dots,v_{|\mathrm{Fv}(\phi)|})\}\bigr|}{|G|^{|\mathrm{Fv}(\phi)|}}.$$

In other words, $\langle\phi,G\rangle$ is the probability that $\phi$ is satisfied in $G$ by a random assignment of vertices (chosen independently and uniformly in the vertex set of $G$) to the free variables of $\phi$.
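For a small finite graph the Stone pairing can be evaluated by brute-force enumeration of all assignments. The following is only a minimal sketch, under the assumption that a formula is encoded as a Python predicate on vertices (so quantifiers are evaluated by exhaustive search); the names `stone_pairing`, `adjacent` and `has_neighbour` are hypothetical and do not come from the paper.

```python
from fractions import Fraction
from itertools import product


def stone_pairing(vertices, formula, p):
    """Fraction of the p-tuples of vertices (repetitions allowed) that satisfy `formula`.

    `formula` is a Python predicate taking p vertices; this encoding is an
    illustrative assumption, not a general first-order evaluator.
    """
    vertices = list(vertices)
    satisfied = sum(1 for t in product(vertices, repeat=p) if formula(*t))
    return Fraction(satisfied, len(vertices) ** p)


# Example graph: the triangle K3 on vertices 0, 1, 2.
V = range(3)
EDGES = {frozenset(e) for e in [(0, 1), (1, 2), (0, 2)]}


def adjacent(x1, x2):
    # Quantifier-free formula "x1 is adjacent to x2".
    return frozenset((x1, x2)) in EDGES


def has_neighbour(x1):
    # Formula with one quantifier: "there exists y adjacent to x1".
    return any(adjacent(x1, y) for y in V)


print(stone_pairing(V, adjacent, 2))       # 6 of the 9 ordered pairs are edges
print(stone_pairing(V, has_neighbour, 1))  # every vertex of K3 has a neighbour
```

Exact rationals are used so that the values can be compared without rounding; the enumeration has cost $|G|^{p}$ and is only meant for toy examples.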

Stone pairing induces a notion of convergence: a sequence of graphs $(G_n)_{n\in\mathbb{N}}$ is $\mathrm{FO}$-convergent if, for every first-order formula $\phi$ (in the language of graphs), the values $\langle\phi,G_n\rangle$ converge as $n\to\infty$. In other words, $(G_n)_{n\in\mathbb{N}}$ is $\mathrm{FO}$-convergent if the probability that a formula $\phi$ is satisfied by the graph $G_n$ with a random assignment of vertices of $G_n$ to the free variables of $\phi$ converges as $n$ grows to infinity.

It is sometimes interesting to consider weaker notions of convergence, by restricting the set of considered formulas to a fragment $X$ of $\mathrm{FO}$. In this case, we speak about $X$-convergence instead of $\mathrm{FO}$-convergence. Of special importance are the following fragments:

Fragment                          Type of formulas                                                                   Type of convergence
$\mathrm{QF}$                     quantifier-free formulas                                                           left convergence [11]
$\mathrm{FO}_0$                   sentences (no free variables)                                                      elementary convergence
$\mathrm{FO}_p$                   formulas with free variables in $\{x_1,\dots,x_p\}$
$\mathrm{FO}^{\mathrm{local}}$    local formulas (depending on a fixed distance neighborhood of the free variables)
$\mathrm{FO}_1^{\mathrm{local}}$  local formulas with single free variable                                           local convergence (if bounded degree) [1]

Table 1. Fragments of specific importance.

Note that the above notions clearly extend to relational structures. Precisely, if we consider relational structures with signature $\lambda$, the symbols of the relations and constants in $\lambda$ define the non-logical symbols of the vocabulary of the first-order language $\mathrm{FO}(\lambda)$ associated to $\lambda$-structures. Notice that if $\lambda$ is at most countable then $\mathrm{FO}(\lambda)$ is countable. We have shown in [16, 18] that every finite relational structure $\mathbf{A}$ with (at most countable) signature $\lambda$ defines (injectively) a probability measure $\mu_{\mathbf{A}}$ on the standard Borel space $S(\mathcal{B}(\mathrm{FO}(\lambda)))$, which is the Stone space of the Lindenbaum-Tarski algebra $\mathcal{B}(\mathrm{FO}(\lambda))$ of first-order formulas (modulo logical equivalence) in the language of $\lambda$-relational structures. In this setting, a sequence $(\mathbf{A}_n)_{n\in\mathbb{N}}$ of $\lambda$-structures is $\mathrm{FO}$-convergent if and only if the sequence of measures $\mu_{\mathbf{A}_n}$ converges (in the sense of weak-* convergence), and the uniquely determined limit probability measure $\mu$ is such that for every first-order formula $\phi$ it holds

$$\int_{S(\mathcal{B}(\mathrm{FO}(\lambda)))}\mathbf{1}_{\phi}(T)\,\mathrm{d}\mu(T)=\lim_{n\to\infty}\langle\phi,\mathbf{A}_n\rangle,$$

where $\mathbf{1}_{\phi}$ stands for the indicator function of the set of the $T\in S(\mathcal{B}(\mathrm{FO}(\lambda)))$ such that $\phi\in T$. Note that the space of probability measures on the Stone space of a countable Boolean algebra, equipped with the weak topology, is compact.

It is natural to search for a limit object that would look more like a relational structure. Thus we introduced in [18], as a candidate for a possible limit object of sparse structures, the notion of modeling, which extends the notion of graphing introduced for bounded degree graphs. Here is an outline of its definition. A relational sample space is a relational structure $\mathbf{A}$ (with signature $\lambda$) with additional structure: the domain $A$ of a relational sample space $\mathbf{A}$ is a standard Borel space (with Borel $\sigma$-algebra $\Sigma_{\mathbf{A}}$) with the property that every subset of $A^p$ that is first-order definable in $\mathrm{FO}(\lambda)$ is measurable (in $A^p$ with respect to the product $\sigma$-algebra). A modeling is a relational sample space equipped with a probability measure (denoted $\nu_{\mathbf{A}}$). For brevity we shall use the same letter $\mathbf{A}$ for structure, relational sample space, and modeling. The definition of modelings allows us to extend Stone pairing naturally to modelings: the Stone pairing of a first-order formula $\phi$ (with free variables in $\{x_1,\dots,x_p\}$) and a modeling $\mathbf{A}$ is defined by

$$\langle\phi,\mathbf{A}\rangle=\int_{\mathbf{x}\in A^p}\mathbf{1}_{\Omega_\phi(\mathbf{A})}(\mathbf{x})\,\mathrm{d}\nu_{\mathbf{A}}^p(\mathbf{x}),$$

where $\mathbf{1}_{\Omega_\phi(\mathbf{A})}$ is the indicator function of the solution set $\Omega_\phi(\mathbf{A})$ of $\phi$ in $\mathbf{A}$, that is:

$$\Omega_\phi(\mathbf{A})=\{(v_1,\dots,v_p)\in A^p:\ \mathbf{A}\models\phi(v_1,\dots,v_p)\}.$$

Note that every finite structure canonically defines a modeling (with same universe, discrete $\sigma$-algebra, and uniform probability measure) and that $\langle\phi,\mathbf{A}\rangle$ in the definition above matches the definition of Stone pairing of a formula and a finite structure introduced earlier.

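In the following, we assume that free variables of formulas are of the form $x_i$ with $i\in\mathbb{N}$. Note that the free variables need not be indexed by consecutive integers. For a formula $\phi$, denote by $\hat\phi$ the formula obtained by packing the free variables of $\phi$: if the free variables of $\phi$ are $x_{i_1},\dots,x_{i_p}$ with $i_1<\dots<i_p$ then $\hat\phi$ is obtained from $\phi$ by renaming $x_{i_j}$ to $x_j$. Although $\Omega_\phi(\mathbf{A})$ and $\Omega_{\hat\phi}(\mathbf{A})$ differ in general, it is clear that they have the same measure (as $\Omega_\phi(\mathbf{A})$ can be obtained from $\Omega_{\hat\phi}(\mathbf{A})$ by taking the Cartesian product by some power of $A$, and then permuting the coordinates). Hence for every formula $\phi$ it holds

$$\langle\phi,\mathbf{A}\rangle=\langle\hat\phi,\mathbf{A}\rangle,$$

that is: the Stone pairing is invariant under renaming of the free variables.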

The expressive power of the Stone pairing goes slightly beyond satisfaction statistics of first-order formulas. In particular, we prove (see Corollary 1) that the Stone pairing $\langle\,\cdot\,,\mathbf{A}\rangle$ can be extended in a unique way to the infinitary language $L_{\omega_1\omega}$, which is an extension of $\mathrm{FO}$ allowing countable conjunctions and disjunctions [20, 21]. Note that this language is still complete, as proved by Karp [5]. Although the compactness theorem does not hold for $L_{\omega_1\omega}$, the interpolation theorem for $L_{\omega_1\omega}$ was proved by Lopez-Escobar [9] and Scott’s isomorphism theorem for $L_{\omega_1\omega}$ by Scott [19]. For a modeling $\mathbf{A}$ and an integer $p$, the $L_{\omega_1\omega}$-definable subsets of $A^p$ correspond to the smallest $\sigma$-algebra that contains all the first-order definable subsets of $A^p$ (see Lemma 7). According to the definition of a modeling, this means that all $L_{\omega_1\omega}$-definable sets of a modeling are Borel measurable.

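We say that a class $\mathcal{C}$ of structures admits modeling limits if for every $\mathrm{FO}$-convergent sequence of structures $\mathbf{A}_n\in\mathcal{C}$ there is a modeling $\mathbf{L}$ such that for every $\phi\in\mathrm{FO}$ it holds

$$\langle\phi,\mathbf{L}\rangle=\lim_{n\to\infty}\langle\phi,\mathbf{A}_n\rangle,$$

which we denote by $\mathbf{A}_n\xrightarrow{\mathrm{FO}}\mathbf{L}$. More generally, for a fragment $X$ of $\mathrm{FO}$, we say that a class $\mathcal{C}$ of structures admits modeling $X$-limits if for every $X$-convergent sequence of structures $\mathbf{A}_n\in\mathcal{C}$ there is a modeling $\mathbf{L}$ such that for every $\phi\in X$ it holds $\langle\phi,\mathbf{L}\rangle=\lim_{n\to\infty}\langle\phi,\mathbf{A}_n\rangle$, and we denote this by $\mathbf{A}_n\xrightarrow{X}\mathbf{L}$.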

The following results have been proved in [18]:

  • every class of graphs with bounded degree admits modeling limits;

  • every class of colored rooted trees with bounded height admits modeling limits;

  • every class of graphs with bounded tree-depth admits modeling limits.

On the other hand, only sparse monotone classes of graphs can admit modeling limits. Precisely, if a monotone class of graphs admits modeling limits, then it is nowhere dense [18], and we conjectured that a monotone class of graphs actually admits modeling limits if and only if it is nowhere dense.

Recall that a monotone class of graphs $\mathcal{C}$ is nowhere dense if, for every integer $p$, there exists a graph whose $p$-subdivision is not in $\mathcal{C}$ (for more on nowhere dense graphs, see [13, 12, 14, 15, 17]). The importance of nowhere dense classes and the strong relationship of this notion with first-order logic is exemplified by the recent result of Grohe, Kreutzer, and Siebertz [4], which states that (under a reasonable complexity theoretic assumption) deciding first-order properties of graphs in a monotone class $\mathcal{C}$ is fixed-parameter tractable if and only if $\mathcal{C}$ is nowhere dense.

In this paper, we initiate a systematic study of hereditary classes that admit modeling limits. We prove that the problem of the existence of a modeling limit can be reduced to the study of $\mathrm{FO}^{\mathrm{local}}$-convergence, and then to two “typical” particular cases:

  • Residual sequences, that is sequences such that (intuitively) the limit has only zero-measure connected components,

  • Non-dispersive sequences, that is sequences such that (intuitively) the limit is (almost) connected.

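A modeling $\mathbf{L}$ with universe $L$ satisfies the Finitary Mass Transport Principle if, for all $\phi,\psi\in\mathrm{FO}_1$ and all integers $a,b$ such that

$$\begin{cases}\phi\vdash(\exists^{\geq a}y)\,\bigl(x_1\sim y\wedge\psi(y)\bigr)\\ \psi\vdash(\exists^{\leq b}y)\,\bigl(x_1\sim y\wedge\phi(y)\bigr)\end{cases}$$

it holds

$$a\,\langle\phi,\mathbf{L}\rangle\leq b\,\langle\psi,\mathbf{L}\rangle.$$

It is clear that every finite structure satisfies the Finitary Mass Transport Principle, hence every modeling $\mathrm{FO}$-limit of finite structures satisfies the Finitary Mass Transport Principle, too.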

A stronger version of this principle, which is also satisfied by every finite structure, does not automatically hold in the limit. A modeling $\mathbf{L}$ with universe $L$ satisfies the Strong Finitary Mass Transport Principle if, for all measurable subsets $X,Y$ of $L$ and all integers $a,b$, the following property holds:

If every $x\in X$ has at least $a$ neighbors in $Y$ and every $y\in Y$ has at most $b$ neighbors in $X$ then $a\,\nu_{\mathbf{L}}(X)\leq b\,\nu_{\mathbf{L}}(Y)$.
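For a finite graph with the uniform measure, the strong principle is a double counting of the adjacent pairs between the two sets. The sketch below checks it exhaustively on one small graph; the adjacency-dictionary encoding and the names `transport_inequality_holds`, `check_all` and `STAR` are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction
from itertools import combinations


def subsets(vertices):
    """Yield every subset of `vertices` as a Python set."""
    vertices = list(vertices)
    for k in range(len(vertices) + 1):
        for chosen in combinations(vertices, k):
            yield set(chosen)


def transport_inequality_holds(adj, X, Y, a, b):
    """Check 'hypothesis implies a*nu(X) <= b*nu(Y)' for the uniform measure nu."""
    n = len(adj)
    hypothesis = (all(len(adj[x] & Y) >= a for x in X)
                  and all(len(adj[y] & X) <= b for y in Y))
    if not hypothesis:
        return True  # the implication is vacuously true
    return a * Fraction(len(X), n) <= b * Fraction(len(Y), n)


def check_all(adj, max_count):
    """Test all pairs of subsets X, Y and all integers 0 <= a, b <= max_count."""
    return all(transport_inequality_holds(adj, X, Y, a, b)
               for X in subsets(adj) for Y in subsets(adj)
               for a in range(max_count + 1) for b in range(max_count + 1))


# Star with centre 0 and leaves 1, 2, 3 (adjacency given as sets of neighbours).
STAR = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(check_all(STAR, 3))
```

With $X$ the set of leaves, $Y$ the centre, $a=1$ and $b=3$, both sides of the inequality equal $3/4$, so the bound is attained on this example.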

In this context, we prove the following theorem, which is the principal result of this paper.

Theorem 1.

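Let $\mathcal{C}$ be a hereditary class of structures.

Assume that for every $\mathbf{A}_n\in\mathcal{C}$ and every $\rho_n\in A_n$ ($n\in\mathbb{N}$) the following properties hold:

  1. if $(\mathbf{A}_n)_{n\in\mathbb{N}}$ is $\mathrm{FO}_1^{\mathrm{local}}$-convergent and residual, then it has a modeling $\mathrm{FO}_1^{\mathrm{local}}$-limit;

  2. if $(\mathbf{A}_n,\rho_n)_{n\in\mathbb{N}}$ is $\mathrm{FO}^{\mathrm{local}}$-convergent (resp. $\mathrm{FO}_p^{\mathrm{local}}$-convergent) and $\rho$-non-dispersive then it has a modeling $\mathrm{FO}^{\mathrm{local}}$-limit (resp. a modeling $\mathrm{FO}_p^{\mathrm{local}}$-limit).

Then $\mathcal{C}$ admits modeling limits (resp. modeling $\mathrm{FO}_p$-limits).

Moreover, if in cases (1) and (2) the modeling limits satisfy the Strong Finitary Mass Transport Principle, then $\mathcal{C}$ admits modeling limits (resp. modeling $\mathrm{FO}_p$-limits) that satisfy the Strong Finitary Mass Transport Principle.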

Then we apply this theorem in Section 8 to give a simple proof of the fact that the class of forests admits modeling limits.

Theorem 2.

The class of finite forests admits modeling limits, that is: every $\mathrm{FO}$-convergent sequence of finite forests has a modeling $\mathrm{FO}$-limit that satisfies the Strong Finitary Mass Transport Principle.

Note that a result similar to Theorem 2 was recently claimed by Kráľ, Kupec, and Tůma [7].

2. Preliminaries

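Let $\mathbf{A}$ be a relational structure with signature $\lambda$ and universe $A$, and let $X\subseteq A$. The substructure $\mathbf{A}[X]$ induced by $X$ has domain $X$ and the same relations as $\mathbf{A}$ (restricted to $X$). A class $\mathcal{C}$ of $\lambda$-structures is hereditary if every induced substructure of a structure in $\mathcal{C}$ belongs to $\mathcal{C}$: if $\mathbf{A}\in\mathcal{C}$ and $X\subseteq A$ then $\mathbf{A}[X]\in\mathcal{C}$.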

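The distance between two vertices $u,v\in A$ is the smallest number of relations inducing a connected substructure of $\mathbf{A}$ and containing both $u$ and $v$, that is the graph distance between $u$ and $v$ in the Gaifman graph of $\mathbf{A}$. For $r\in\mathbb{N}$ and $v\in A$, we denote by $B_r(\mathbf{A},v)$ the ball of radius $r$ centered at $v$, that is the substructure of $\mathbf{A}$ induced by the vertices at distance at most $r$ from $v$ in $\mathbf{A}$. More generally, for $v_1,\dots,v_k\in A$, we denote by $B_r(\mathbf{A},v_1,\dots,v_k)$ the substructure of $\mathbf{A}$ induced by the vertices at distance at most $r$ from at least one of the $v_i$ ($1\leq i\leq k$) in $\mathbf{A}$.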
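The distance and ball notions above are easy to compute on finite structures. The following sketch builds the Gaifman graph (two elements are adjacent when they occur together in a tuple of some relation) and collects the vertex set of a ball by a multi-source breadth-first search. The encoding of a structure as a dictionary from relation names to sets of tuples, and the names `gaifman_graph` and `ball`, are assumptions made for illustration only.

```python
from collections import deque
from itertools import combinations


def gaifman_graph(universe, relations):
    """Adjacency sets of the Gaifman graph of a finite relational structure.

    `relations` maps a relation name to a set of tuples over `universe`.
    """
    adj = {v: set() for v in universe}
    for tuples in relations.values():
        for t in tuples:
            for u, v in combinations(set(t), 2):
                adj[u].add(v)
                adj[v].add(u)
    return adj


def ball(adj, centers, r):
    """Vertices at distance at most r from at least one of the `centers`."""
    dist = {c: 0 for c in centers}
    queue = deque(centers)
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue  # do not explore beyond radius r
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)


# A path 0 - 1 - 2 - 3 - 4 given by a binary relation E,
# plus one ternary relation T joining 4, 5 and 6.
universe = range(7)
relations = {"E": {(0, 1), (1, 2), (2, 3), (3, 4)}, "T": {(4, 5, 6)}}
adj = gaifman_graph(universe, relations)

print(sorted(ball(adj, [2], 1)))     # ball of radius 1 around vertex 2
print(sorted(ball(adj, [0, 6], 1)))  # ball of radius 1 around the pair (0, 6)
```

The induced substructure on the returned vertex set is then the ball itself; only the vertex set is computed here to keep the sketch short.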

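A formula $\phi\in\mathrm{FO}_p$ is $r$-local if its satisfaction only depends on the $r$-neighborhood of the free variables, that is: for every $\lambda$-structure $\mathbf{A}$ and every $v_1,\dots,v_p\in A$ it holds

$$\mathbf{A}\models\phi(v_1,\dots,v_p)\iff B_r(\mathbf{A},v_1,\dots,v_p)\models\phi(v_1,\dots,v_p).$$

Recall the particular case of Gaifman's locality theorem for sentences, which will be useful in the following. A local sentence is a sentence of the form

$$\exists x_1\dots\exists x_k\Bigl(\bigwedge_{1\leq i<j\leq k}\mathrm{dist}(x_i,x_j)>2r\ \wedge\ \bigwedge_{1\leq i\leq k}\psi(x_i)\Bigr),$$

where $r,k\geq 1$ and $\psi$ is $r$-local.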

Theorem 3 (Gaifman [3]).

Every first-order sentence is equivalent to a Boolean combination of local sentences.

We end this section with two very simple but useful lemmas.

Lemma 4.

Let be formulas. Then it holds

Proof.

Thus

Lemma 5.

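Let $\phi,\psi$ be formulas without common free variables. Then, for every modeling $\mathbf{A}$, it holds

$$\langle\phi\wedge\psi,\mathbf{A}\rangle=\langle\phi,\mathbf{A}\rangle\,\langle\psi,\mathbf{A}\rangle.$$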

Proof.

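Let $p$ and $q$ be the numbers of free variables of $\phi$ and $\psi$. For every modeling $\mathbf{A}$, the solution set of $\phi\wedge\psi$ (viewed as a subset of $A^{p+q}$ after packing the free variables) can be obtained from the product of the solution sets of $\phi$ and $\psi$ by permuting the coordinates, hence both sets have the same measure, that is:

$$\langle\phi\wedge\psi,\mathbf{A}\rangle=\langle\phi,\mathbf{A}\rangle\,\langle\psi,\mathbf{A}\rangle.\qquad∎$$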
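The product rule for formulas with disjoint free variables can be checked numerically on a finite graph. The sketch below does so on a path with four vertices; as before, formulas are encoded as Python predicates, and all names (`stone_pairing`, `phi`, `psi`, `phi_and_psi`) are hypothetical choices made for this illustration.

```python
from fractions import Fraction
from itertools import product


def stone_pairing(vertices, formula, p):
    """Fraction of the p-tuples of vertices that satisfy the predicate `formula`."""
    vertices = list(vertices)
    hits = sum(1 for t in product(vertices, repeat=p) if formula(*t))
    return Fraction(hits, len(vertices) ** p)


# Path 0 - 1 - 2 - 3.
V = range(4)
EDGES = {frozenset((i, i + 1)) for i in range(3)}


def adjacent(u, v):
    return frozenset((u, v)) in EDGES


def phi(x1):
    # "x1 has exactly one neighbour" (free variable x1 only).
    return sum(adjacent(x1, w) for w in V) == 1


def psi(x2, x3):
    # "x2 is adjacent to x3" (free variables x2, x3 only).
    return adjacent(x2, x3)


def phi_and_psi(x1, x2, x3):
    # Conjunction; its free variables are the disjoint union of the two sets.
    return phi(x1) and psi(x2, x3)


lhs = stone_pairing(V, phi_and_psi, 3)
rhs = stone_pairing(V, phi, 1) * stone_pairing(V, psi, 2)
print(lhs, rhs)  # both sides equal 3/16 on this example
```

If the two formulas shared a free variable, the assignments would no longer be independent and the equality would fail in general.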

3. What does Stone pairing measure?

By definition, the Stone pairing $\langle\phi,\mathbf{A}\rangle$ measures the probability that a given first-order formula $\phi$ is satisfied in $\mathbf{A}$ by a random assignment of vertices of $\mathbf{A}$ to the free variables. For this definition to make sense, we have to assume that every subset of a power of $A$ that is first-order definable without parameters is measurable. Hence we have to consider, for each $p\in\mathbb{N}$, a $\sigma$-algebra on $A^p$ that contains all subsets of $A^p$ that are first-order definable without parameters.

The aim of this section is to prove that the minimal $\sigma$-algebra including all subsets of $A^p$ that are first-order definable without parameters is exactly the family of all subsets of $A^p$ that are $L_{\omega_1\omega}$-definable without parameters.

We take time out for two lemmas.

Lemma 6.

Let be a set. For , let be a field of sets on , and let be the minimal -algebra that contains . For , let and let be defined by .

Assume that for each , maps to .

Then maps to .

Proof.

The proof follows the standard construction of a -algebra by transfinite induction. For , we let

  • be the collection of sets obtained as countable unions of increasing sets in , that is: sets of the form where and ;

  • be the collection of sets obtained as countable intersections of decreasing sets in , that is: sets of the form where and ;

  • (for not a limit ordinal) be the collection of sets obtained as countable unions of increasing sets in ;

  • (for not a limit ordinal) be the collection of sets obtained as countable intersections of decreasing sets in ;

  • (for limit ordinal) and .

Then it is easily checked by induction that for every up to it holds:

  • for all , ;

  • for every limit ordinal , ;

  • if then and ;

  • if then and ;

  • maps to and to ;

According to the monotone class theorem, . ∎

Lemma 7.

We consider a relational structure with countable signature.

Let (resp. ) be the field of sets of all the subsets of that are first-order definable without (resp. with) parameters. Then the smallest -algebra (resp. ) is the algebra of all the subsets of that are definable in without (resp. with) parameters.

Proof.

Let and be the projection map. According to Lemma 6, the projection map send sets in to sets in (and sets in to sets in ). It follows easily that subsets of that are -definable without (resp. with) parameters are exactly those in (resp. ). ∎

Note that when is a modeling, the collection of the subsets of definable in without parameters is the -algebra generated by the projection , mapping a -tuple of vertices of to its -type: a subset of is definable in without parameters if and only if it is the preimage by of a Borel subset of (see [18] for detailed definition and analysis of ).

Corollary 1.

For every modeling $\mathbf{A}$, the Stone pairing $\langle\,\cdot\,,\mathbf{A}\rangle$ can be extended in a unique way to $L_{\omega_1\omega}$.

Remark 8.

Let be the set of all probability measures on the Stone space , and let be the -algebra generated by evaluation maps for measurable set of . It is well known that is a standard Borel space ([6], Sect. .E). (Hence the space of all finite -structures and their -limits is also a compact standard Borel space, as it can be identified to a closed subspace of .) The mapping embeds the space of modelings into . The initial topology on with respect to this mapping is the same as the topology induced by Stone pairing. Hence the mapping , which maps to , is continuous for , and measurable for .

Also remark that the topology of can be defined by means of Lévy–Prokhorov metric (by choosing some metric on the Stone space). For instance, for finite signature , the topology of can be generated by the pseudometric:

Theorem 9.

Let $\mathbf{A}$ be a relational sample space. Then every subset of $A^p$ that is $L_{\omega_1\omega}$-definable (with parameters) is measurable (with respect to the product Borel $\sigma$-algebra $\Sigma_{\mathbf{A}}^p$).

3.1. Interpretation Schemes

Interpretation Schemes (introduced in this setting in [18]) generalize to other logics than $\mathrm{FO}$.

Definition 10.

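Let $\mathcal{L}$ be a logic (for us, $\mathrm{FO}$ or $L_{\omega_1\omega}$). For $p\in\mathbb{N}$ and a signature $\lambda$, $\mathcal{L}_p(\lambda)$ denotes the set of the formulas in the language of $\lambda$ in logic $\mathcal{L}$, with free variables in $\{x_1,\dots,x_p\}$.

Let $\kappa,\lambda$ be signatures, where $\lambda$ has $q$ relational symbols $R_1,\dots,R_q$ with respective arities $r_1,\dots,r_q$.

An $\mathcal{L}$-interpretation scheme $\mathsf{I}$ of $\lambda$-structures in $\kappa$-structures is defined by an integer $k$ (the exponent of the $\mathcal{L}$-interpretation scheme), a formula $E\in\mathcal{L}_{2k}(\kappa)$, a formula $\theta_0\in\mathcal{L}_k(\kappa)$, and a formula $\theta_i\in\mathcal{L}_{r_ik}(\kappa)$ for each symbol $R_i\in\lambda$, such that:

  • the formula $E$ defines an equivalence relation of $k$-tuples;

  • each formula $\theta_i$ is compatible with $E$, in the sense that for every $0\leq i\leq q$ it holds

    $$\bigwedge_{1\leq j\leq r_i}E(\mathbf{x}_j,\mathbf{y}_j)\ \vdash\ \theta_i(\mathbf{x}_1,\dots,\mathbf{x}_{r_i})\leftrightarrow\theta_i(\mathbf{y}_1,\dots,\mathbf{y}_{r_i}),$$

    where $r_0=1$, boldface $\mathbf{x}_j$ and $\mathbf{y}_j$ represent $k$-tuples of free variables, and where $\theta_i(\mathbf{x}_1,\dots,\mathbf{x}_{r_i})$ stands for $\theta_i(x_{1,1},\dots,x_{1,k},\dots,x_{r_i,1},\dots,x_{r_i,k})$.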

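For a $\kappa$-structure $\mathbf{A}$, we denote by $\mathsf{I}(\mathbf{A})$ the $\lambda$-structure $\mathbf{B}$ defined as follows:

  • the domain $B$ of $\mathbf{B}$ is the subset of the $E$-equivalence classes $[\mathbf{x}]\subseteq A^k$ of the tuples $\mathbf{x}=(x_1,\dots,x_k)$ such that $\mathbf{A}\models\theta_0(\mathbf{x})$;

  • for each $1\leq i\leq q$ and every $\mathbf{v}_1,\dots,\mathbf{v}_{r_i}\in A^k$ such that $\mathbf{A}\models\theta_0(\mathbf{v}_j)$ (for every $1\leq j\leq r_i$) it holds

    $$\mathbf{B}\models R_i([\mathbf{v}_1],\dots,[\mathbf{v}_{r_i}])\iff\mathbf{A}\models\theta_i(\mathbf{v}_1,\dots,\mathbf{v}_{r_i}).$$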

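From the standard properties of model theoretical interpretations (see, for instance, [8] p. 180), we state the following: if $\mathsf{I}$ is an $\mathcal{L}$-interpretation of $\lambda$-structures in $\kappa$-structures, then there exists a mapping $\tilde{\mathsf{I}}:\mathcal{L}(\lambda)\to\mathcal{L}(\kappa)$ (defined by means of the formulas $E,\theta_0,\dots,\theta_q$ above) such that for every $\phi\in\mathcal{L}_p(\lambda)$, and every $\kappa$-structure $\mathbf{A}$, the following property holds (while letting $\mathbf{B}=\mathsf{I}(\mathbf{A})$ and identifying elements of $B$ with the corresponding equivalence classes of $A^k$):

For every $[\mathbf{v}_1],\dots,[\mathbf{v}_p]\in B$ (where $\mathbf{v}_i\in A^k$) it holds

$$\mathbf{B}\models\phi([\mathbf{v}_1],\dots,[\mathbf{v}_p])\iff\mathbf{A}\models\tilde{\mathsf{I}}(\phi)(\mathbf{v}_1,\dots,\mathbf{v}_p).$$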

It directly follows from the existence of the mapping that

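  • an $\mathrm{FO}$-interpretation scheme $\mathsf{I}$ of $\lambda$-structures in $\kappa$-structures defines a continuous mapping from $S(\mathcal{B}(\mathrm{FO}(\kappa)))$ to $S(\mathcal{B}(\mathrm{FO}(\lambda)))$;

  • an $L_{\omega_1\omega}$-interpretation scheme $\mathsf{I}$ of $\lambda$-structures in $\kappa$-structures defines a measurable mapping from $S(\mathcal{B}(\mathrm{FO}(\kappa)))$ to $S(\mathcal{B}(\mathrm{FO}(\lambda)))$.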

Definition 11.

Let be signatures. A basic -interpretation scheme of -structures in -structures with exponent is defined by a formula for each symbol with arity .

For a -structure , we denote by the structure with domain such that, for every with arity and every it holds

It is immediate that every basic -interpretation scheme defines a mapping such that for every -structure , every , and every it holds

We deduce the following general properties:

Lemma 12 ([18]).

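Let $\mathsf{I}$ be an $\mathrm{FO}$-interpretation scheme of $\lambda$-structures in $\kappa$-structures.

Then, if a sequence $(\mathbf{A}_n)_{n\in\mathbb{N}}$ of finite $\kappa$-structures is $\mathrm{FO}$-convergent then the sequence $(\mathsf{I}(\mathbf{A}_n))_{n\in\mathbb{N}}$ of (finite) $\lambda$-structures is $\mathrm{FO}$-convergent.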

Lemma 13.

Let be an -interpretation scheme of -structures in -structures.

If is injective and is a relational sample space, then is a relational sample space.

Furthermore, if is a basic -interpretation scheme and is a modeling, then is a modeling and for every , it holds

Proof.

Assume is an injective -interpretation scheme and is a relational sample space.

We first mark all the (finitely many) parameters and reduce to the case where the interpretation has no parameters (as in the case of -interpretation, see [18]). Let be the domain of . As is -definable in , is -definable in hence . Then is a Borel sub-space of . As is a bijection from to , we deduce that is a standard Borel space. Moreover, as the inverse image of every -definable set of is -definable in , we deduce that is a -relational sample space.

Assume is a basic -interpretation scheme and is a modeling. The pushforward of by defines a probability measure on such that for every , it holds

4. Residual Sequences

Every modeling can be decomposed into countably many connected components with non-zero measure and a union of connected components with (individual) zero measure. A residual modeling is a modeling, all components of which have zero measure.

Lemma 14.

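A modeling $\mathbf{A}$ is residual if and only if, for every $d\in\mathbb{N}$, it holds

$$\sup_{v\in A}\nu_{\mathbf{A}}\bigl(B_d(\mathbf{A},v)\bigr)=0.$$

Proof.

Assume that for every $d\in\mathbb{N}$ it holds $\sup_{v\in A}\nu_{\mathbf{A}}(B_d(\mathbf{A},v))=0$. For $v\in A$, the connected component of $v$ is $\bigcup_{d\in\mathbb{N}}B_d(\mathbf{A},v)$. As all these balls are first-order definable (hence measurable) we deduce

$$\nu_{\mathbf{A}}\Bigl(\bigcup_{d\in\mathbb{N}}B_d(\mathbf{A},v)\Bigr)\leq\sum_{d\in\mathbb{N}}\nu_{\mathbf{A}}\bigl(B_d(\mathbf{A},v)\bigr)=0.$$

It follows that every connected component of $\mathbf{A}$ has zero measure, hence $\mathbf{A}$ is residual.

Conversely, assume that there exist $v\in A$ and $d\in\mathbb{N}$ such that $\nu_{\mathbf{A}}(B_d(\mathbf{A},v))>0$. Then the connected component of $v$ does not have zero measure, hence $\mathbf{A}$ is not residual. ∎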
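The quantity appearing in this characterization, the largest measure of a ball of fixed radius, can be computed directly for finite graphs with the uniform measure. The sketch below compares paths, where this quantity tends to zero as the number of vertices grows, with stars, where the ball around the centre always has full measure. The helper names `path`, `star`, `ball_size` and `max_ball_measure` are hypothetical and chosen only for this illustration.

```python
from collections import deque
from fractions import Fraction


def path(n):
    """Adjacency sets of the path on vertices 0, ..., n-1."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


def star(n):
    """Adjacency sets of the star with centre 0 and leaves 1, ..., n-1."""
    adj = {0: set(range(1, n))}
    adj.update({i: {0} for i in range(1, n)})
    return adj


def ball_size(adj, v, d):
    """Number of vertices at distance at most d from v (breadth-first search)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == d:
            continue  # do not explore beyond radius d
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return len(dist)


def max_ball_measure(adj, d):
    """Largest uniform measure of a ball of radius d in the finite graph."""
    return max(Fraction(ball_size(adj, v, d), len(adj)) for v in adj)


for n in (10, 100, 1000):
    print(n, max_ball_measure(path(n), 2), max_ball_measure(star(n), 2))
```

For paths the value is $\min(2d+1,n)/n$, which vanishes as $n\to\infty$ for each fixed $d$, whereas for stars it stays equal to $1$; this is the finite counterpart of the dichotomy between zero-measure and positive-measure components.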

This equivalence justifies the following notion of residual sequence.

Definition 15 (Residual sequence).

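A sequence $(\mathbf{A}_n)_{n\in\mathbb{N}}$ of modelings is residual if, for every $d\in\mathbb{N}$, it holds

$$\limsup_{n\to\infty}\ \sup_{v\in A_n}\nu_{\mathbf{A}_n}\bigl(B_d(\mathbf{A}_n,v)\bigr)=0.$$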

Lemma 16.

Let be -local, and define the formula

Then there exist -local formulas such that it holds

Proof.

According to Lemma 4 it holds

According to the -locality of there exist -local formulas such that is logically equivalent to (where denotes the formula with free variable renamed ). Thus, according to Lemma 4, it holds

As the formulas use no common free variables, it holds (according to Lemma 5):

Hence the result. ∎

Corollary 2.

Let be -local.

Then there exist -local formulas such that for every modeling it holds

Proof.

Let be defined as in Lemma 16. By union bound, we get

and the result follows from Lemma 16. ∎

Lemma 17.

For a residual sequence, $\mathrm{FO}_1^{\mathrm{local}}$-convergence is equivalent to $\mathrm{FO}^{\mathrm{local}}$-convergence.

Proof.

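Let $(\mathbf{A}_n)_{n\in\mathbb{N}}$ be a residual sequence.

If $(\mathbf{A}_n)_{n\in\mathbb{N}}$ is $\mathrm{FO}^{\mathrm{local}}$-convergent, it is $\mathrm{FO}_1^{\mathrm{local}}$-convergent.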

Assume is -convergent, let be an -local formula, and let be the formula .

As is residual, it holds

According to Lemma 16, there exist -local formulas such that for every it holds

Hence

Hence for residual sequences, $\mathrm{FO}_1^{\mathrm{local}}$-convergence implies $\mathrm{FO}^{\mathrm{local}}$-convergence. ∎

To every formula and integer we associate the formula defined as

Definition 18.

A modeling is clean if for every formula it holds

(Note that the right-hand side condition is equivalent to existence of such that .)

Lemma 19.

Let be a residual clean modeling and let .

If is not empty, then it is uncountable.

Proof.

Assume is not empty. As is clean, there exists such that , that is: . Clearly . Assume for contradiction that is countable. Then

As is residual, for every it holds , which contradicts the assumption . ∎

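Let $X$ be a fragment of $\mathrm{FO}$. Two modelings $\mathbf{A}$ and $\mathbf{B}$ are $X$-equivalent if, for every $\phi\in X$, it holds $\langle\phi,\mathbf{A}\rangle=\langle\phi,\mathbf{B}\rangle$. We shall now show how any modeling can be transformed into a residual clean modeling, which is $\mathrm{FO}_1^{\mathrm{local}}$-equivalent.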

Lemma 20.

Let be a modeling. Then there exists a residual modeling that is -equivalent to .

Proof.

Consider the modeling with universe , measure (where is standard Borel measure of ) and relations defined as follows: for every relation of arity it holds

Then is residual and for every it holds

Lemma 21.

Let be a residual modeling. Then there exists a residual clean modeling obtained from by removing a union of connected components with global -measure zero.

Proof.

Let be such that and . For , denote by the connected component of that contains , that is: . Note that if and if then but .

Note that the assumption on rewrites as “ while for every it holds ”.

Then

Denote by the set of all