First order limits of sparse graphs: Plane trees and path-width

Jakub Gajarský, Petr Hliněný, Tomáš Kaiser, Daniel Král’, Martin Kupec, Jan Obdržálek, Sebastian Ordyniak, Vojtěch Tůma

Jakub Gajarský, Petr Hliněný, Jan Obdržálek, Sebastian Ordyniak: Faculty of Informatics, Masaryk University, Botanická 68a, 602 00 Brno, Czech Republic. E-mail: {xgajar,hlineny,obdrzalek,ordyniak}@fi.muni.cz. JG, PH and JO have been supported by project 14-03501S of the Czech Science Foundation. SO has been supported by the European Social Fund and the state budget of the Czech Republic under project CZ.1.07/2.3.00/30.0009 (POSTDOC I).

Tomáš Kaiser: Department of Mathematics, Institute for Theoretical Computer Science (CE-ITI), and European Centre of Excellence NTIS (New Technologies for the Information Society), University of West Bohemia, Univerzitní 8, 306 14 Pilsen, Czech Republic. E-mail: kaisert@kma.zcu.cz. Supported by project GA14-19503S of the Czech Science Foundation.

Daniel Král’: Mathematics Institute, DIMAP and Department of Computer Science, University of Warwick, Coventry CV4 7AL, UK. E-mail: d.kral@warwick.ac.uk. This author’s work was supported by the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no. 259385 and by the Engineering and Physical Sciences Research Council Standard Grant number EP/M025365/1.

Martin Kupec: Computer Science Institute, Faculty of Mathematics and Physics, Charles University, Malostranské náměstí 25, 118 00, Prague, Czech Republic. E-mail: magon@iuuk.mff.cuni.cz.

Vojtěch Tůma: Department of Applied Mathematics, Faculty of Mathematics and Physics, Charles University, Malostranské náměstí 25, 118 00, Prague, Czech Republic. E-mail: voyta@kam.mff.cuni.cz.

Abstract

Nešetřil and Ossona de Mendez introduced the notion of first order convergence as an attempt to unify the notions of convergence for sparse and dense graphs. It is known that there exist first order convergent sequences of graphs with no limit modeling (an analytic representation of the limit). On the positive side, every first order convergent sequence of trees or graphs with no long path (graphs with bounded tree-depth) has a limit modeling. We strengthen these results by showing that every first order convergent sequence of plane trees (trees with embeddings in the plane) and every first order convergent sequence of graphs with bounded path-width has a limit modeling.

1 Introduction

The theory of combinatorial limits has quickly become an important area of combinatorics. The most developed is the theory of graph limits, which is the subject of a recent monograph by Lovász [13]. Graph convergence evolved to a large extent differently and independently for dense and sparse graphs. The case of dense graphs was developed in a series of papers by Borgs, Chayes, Lovász, Sós, Szegedy and Vesztergombi [5, 6, 7, 14, 15] and is considered to be well understood. In the case of sparse graphs (such as those with bounded maximum degree), the most widely used notion of convergence, known as Benjamini-Schramm convergence and studied e.g. in [1, 2, 10], comes with substantial disadvantages. Several alternative notions were proposed [3, 4, 11]; however, each of them also comes with certain drawbacks.

As an attempt to unify the existing notions of convergence for dense and sparse graphs, Nešetřil and Ossona de Mendez [16, 17] proposed a notion of convergence based on first order properties of graphs, the first order convergence (a formal definition is given in Section 2.2). This notion applies to all relational structures, and it implies the standard notion of convergence in the case of dense graphs and the Benjamini-Schramm convergence in the case of sparse graphs. A first order convergent sequence of graphs can be associated with an analytic representation, known as a limit modeling. Unfortunately, not every first order convergent sequence of graphs has a limit modeling [17]; e.g., the sequence of Erdős-Rényi random graphs is first order convergent with probability one but has no limit modeling.

The existence of a limit modeling of a first order convergent sequence is one of the central problems related to first order convergence. Nešetřil and Ossona de Mendez [17] conjectured that every first order convergent sequence of sparse graphs has a limit modeling:

Conjecture 1.

Let $\mathcal{C}$ be a nowhere-dense class of graphs. Every first order convergent sequence of graphs from $\mathcal{C}$ has a limit modeling.

Recall that nowhere-dense classes of graphs [19] include all minor-closed classes of graphs (in particular, trees, planar graphs, etc.) as well as some more general classes of sparse graphs. However, little is known towards proving Conjecture 1. Nešetřil and Ossona de Mendez [17] showed that every first order convergent sequence of trees of bounded depth has a limit modeling, and they used this result to show that every first order convergent sequence of graphs with bounded tree-depth has a limit modeling. Three of the authors (DK, MK and VT) extended this result and showed that every first order convergent sequence of trees has a limit modeling. This is also implied by a more general result of Nešetřil and Ossona de Mendez [18], who developed a framework for building limit modelings based on residual and non-dispersive first order convergent sequences.

In this paper, we make another step towards a proof of Conjecture 1. We show that every first order convergent sequence of trees embedded in the plane has a limit modeling (Theorem 1) and that every first order convergent sequence of graphs with bounded path-width has a limit modeling (Theorem 14). While the first result can be viewed as a small extension of the result on the existence of limit modelings of first order convergent sequences of trees, it turned out that embedding the trees in the plane, which essentially corresponds to fixing the cyclic order among the neighbors of each vertex, gave us enough power to prove the (more important) result on the existence of limit modelings of first order convergent sequences of graphs with bounded path-width. Note that the class of graphs of bounded path-width is significantly richer than the class of trees, which do not have cycles at all, or the classes of graphs with bounded tree-depth, which do not have long paths. In a certain sense, this is the first class of graphs with rich internal structure for which Conjecture 1 is proven.

The proof of Theorem 1 on the existence of limit modelings of first order convergent sequences of plane trees consists of two steps: a decomposition step described in Section 3.1 and a composition step described in Section 3.2. The decomposition step aims at analyzing first order properties of the graphs in the sequence and describing them through quantities that we refer to as Stone measure and discrete Stone measure. The composition step then uses these quantities to build a limit modeling of the sequence. The decomposition step follows the line of our original proof of the existence of limit modelings of first order convergent sequences of trees. However, we decided to replace the composition step of our original proof with the arguments from the analogous part of the proof given in [18], which we found more elegant and simpler than the composition step of our original proof using methods from [11]. We then employ first order interpretation schemes to encode graphs with bounded path-width by plane trees, which allows us to prove our result on first order convergent sequences of graphs with bounded path-width (Theorem 14). We also note that the modelings constructed in Theorems 1 and 14 satisfy the strong finitary mass transport principle, which, vaguely speaking, forbids the existence of a small and a large subset of vertices with a matching between them.

2 Notation

We mostly follow the standard graph theory terminology and the standard model theory terminology as found e.g. in [8] and in [9], respectively. Still, we want to specify some less standard notation and to introduce some non-standard notation related to graphs with bounded path-width. In what follows, all graphs, trees, etc. are finite unless specified otherwise, and the order of a graph is the number of its vertices. The set of positive integers is denoted by $\mathbb{N}$ and the set of non-negative integers by $\mathbb{N}_0$. The set of integers from $a$ to $b$ (inclusively) is denoted by $[a,b]$, and $[a]$ stands for $[1,a]$. If $x$ is a real number and $z$ is a positive real, $x \bmod z$ denotes the unique real $x' \in [0,z)$ such that $x = x' + kz$ for some $k \in \mathbb{Z}$.

2.1 Path-width and semi-interval graphs

If $G$ is a graph, then a semi-interval representation of $G$ is an assignment of intervals $J_v$ of the form $[a,b)$, $a,b \in \mathbb{Z}$, $a < b$, to vertices $v$ of $G$ such that the intervals $J_v$ and $J_{v'}$ of any two adjacent vertices $v$ and $v'$ intersect (however, the intervals of non-adjacent vertices may also intersect). A graph with a fixed semi-interval representation is called a semi-interval graph. The intervals of the form $[k,k+1)$, $k \in \mathbb{Z}$, are called segments. If $G$ is a semi-interval graph, then the first segment is the leftmost segment intersected by an interval assigned to a vertex of $G$ and the last segment is the rightmost such segment.

The path-width of a graph $G$ is the smallest integer $k$ such that $G$ has a semi-interval representation such that each segment is contained in at most $k+1$ intervals assigned to vertices of $G$. Note that this definition coincides with the usual definition of the path-width. In particular, given a semi-interval graph $G$ such that each segment is contained in at most $k+1$ intervals, one can construct its path-decomposition of width (at most) $k$ by taking a path with vertices corresponding to the segments between the first and the last segment and assigning each vertex of the path a bag consisting of the vertices of $G$ whose intervals contain the segment corresponding to that vertex. Likewise, a path-decomposition of width $k$ naturally yields a semi-interval graph such that each segment is contained in at most $k+1$ intervals.
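The passage from a semi-interval representation to a path-decomposition can be sketched in a few lines. The code below is only an illustration under the assumption that intervals are half-open with integer endpoints and that a segment is a unit interval starting at an integer; the function names are ours and do not come from the paper.

```python
from collections import defaultdict

def segment_bags(intervals):
    """Map the left endpoint k of each segment [k, k+1) to the vertices whose interval contains it.

    `intervals` maps a vertex to a pair (a, b) of integers with a < b,
    read as the half-open interval [a, b) (an assumed encoding).
    """
    bags = defaultdict(set)
    for v, (a, b) in intervals.items():
        for k in range(a, b):  # [a, b) contains the segments starting at a, ..., b-1
            bags[k].add(v)
    return bags

def is_representation(edges, intervals):
    """Check that the intervals of any two adjacent vertices intersect."""
    for u, v in edges:
        (a, b), (c, d) = intervals[u], intervals[v]
        if max(a, c) >= min(b, d):  # the two half-open intervals are disjoint
            return False
    return True

def path_decomposition(intervals):
    """Bags of the segments between the first and the last segment, left to right."""
    bags = segment_bags(intervals)
    first, last = min(bags), max(bags)
    return [sorted(bags.get(k, set())) for k in range(first, last + 1)]

def width(intervals):
    """Largest bag size minus one."""
    return max(len(bag) for bag in path_decomposition(intervals)) - 1

# A 4-cycle a-b-c-d-a with a representation using two segments.
cycle_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
cycle_intervals = {"a": (0, 2), "b": (0, 1), "c": (0, 2), "d": (1, 2)}
```

For the 4-cycle above, the two bags are `['a', 'b', 'c']` and `['a', 'c', 'd']`, so the representation witnesses path-width at most 2.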

2.2 First order convergence

The notion of first order convergence applies to all relational structures (and even further, e.g., to matroids [12]). However, we limit our exposition to graphs for simplicity. The extensions to rooted graphs, vertex-colored graphs, etc., are straightforward. If $\psi$ is a first order formula with $k$ free variables and $G$ is a (finite) graph, then the Stone pairing $\langle\psi, G\rangle$ is the probability that a uniformly chosen $k$-tuple of vertices of $G$ satisfies $\psi$. A sequence $(G_n)_{n\in\mathbb{N}}$ of graphs is first order convergent if the limit $\lim_{n\to\infty}\langle\psi, G_n\rangle$ exists for every first order formula $\psi$. We note that every sequence of graphs has a first order convergent subsequence.
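For a finite graph, the Stone pairing can be computed by brute force, which may help to fix the definition. In the sketch below a formula is modelled, purely for illustration, as a Python predicate that receives the adjacency test, the vertex list (so that it can quantify over vertices) and the values of its free variables; this encoding and all names are our assumptions.

```python
from fractions import Fraction
from itertools import product

def stone_pairing(formula, arity, vertices, adjacent):
    """Fraction of `arity`-tuples of vertices (chosen with repetition) satisfying `formula`."""
    vertices = list(vertices)
    hits = sum(
        1
        for tup in product(vertices, repeat=arity)
        if formula(adjacent, vertices, *tup)
    )
    return Fraction(hits, len(vertices) ** arity)

# The path 0 - 1 - 2.
V = [0, 1, 2]
E = {(0, 1), (1, 2)}
adj = lambda u, v: (u, v) in E or (v, u) in E

# "x and y are adjacent" (two free variables).
edge = lambda adjacent, vertices, x, y: adjacent(x, y)

# "x has degree exactly one" (one free variable, quantifying over all vertices).
degree_one = lambda adjacent, vertices, x: sum(adjacent(x, y) for y in vertices) == 1
```

On the path with three vertices, `stone_pairing(edge, 2, V, adj)` is 4/9 (four of the nine ordered pairs are adjacent) and `stone_pairing(degree_one, 1, V, adj)` is 2/3.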

A modeling $M$ is a (finite or infinite) graph with a standard Borel space on its vertex set equipped with a probability measure such that the set of all $k$-tuples of vertices of $M$ satisfying a formula $\psi$ is measurable in the product measure for every first order formula $\psi$ with $k$ free variables. In analogy to the graph case, the Stone pairing $\langle\psi, M\rangle$ is the probability that a randomly chosen $k$-tuple of vertices satisfies $\psi$. If a finite graph is viewed as a modeling with a uniform discrete probability measure on its vertex set, then the Stone pairings for the graph and the modeling obtained in this way coincide. A modeling $M$ is a limit modeling of a first order convergent sequence $(G_n)_{n\in\mathbb{N}}$ if

$$\lim_{n\to\infty}\langle\psi, G_n\rangle = \langle\psi, M\rangle$$

for every first order formula $\psi$. The definitions of a modeling and a limit modeling readily generalize from the case of graphs to the case of general relational structures, which include directed graphs, graphs with colored edges, etc. as particular cases.

Every limit modeling $M$ of a first order convergent sequence of graphs satisfies the so-called finitary mass transport principle (see [18] for further details), which requires that for any two first order formulas $\psi$ and $\psi'$, each with one free variable, such that every vertex $v$ satisfying $\psi(v)$ has at least $a$ neighbors satisfying $\psi'$ and every vertex $v$ satisfying $\psi'(v)$ has at most $b$ neighbors satisfying $\psi$, it holds that

$$a\,\langle\psi, M\rangle \le b\,\langle\psi', M\rangle.$$

We are interested in a stronger variant of this principle. We say that a modeling $M$ satisfies the strong finitary mass transport principle if every two measurable subsets $A$ and $B$ of the vertices of $M$ such that each vertex of $A$ has at least $a$ neighbors in $B$ and each vertex of $B$ has at most $b$ neighbors in $A$ satisfy that

$$a\,\mu(A) \le b\,\mu(B),$$

where $\mu$ is the probability measure of $M$. Note that the assertion of the finitary mass transport principle requires this inequality to hold only for first order definable subsets of vertices. The strong finitary mass transport principle is satisfied by any finite graph when viewed as a modeling but it need not hold for every limit modeling. The importance of the strong finitary mass transport principle comes from its relation to graphings, which are limit representations of Benjamini-Schramm convergent sequences of bounded degree graphs: a limit modeling $M$ of a first order convergent sequence of bounded degree graphs is a limit graphing of the sequence if and only if $M$ satisfies the strong finitary mass transport principle.
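The claim that every finite graph satisfies the strong principle follows from a short double count. In the sketch below, $G$ is a finite graph with the uniform measure $\mu$, the sets $A$ and $B$ are such that each vertex of $A$ has at least $a$ neighbors in $B$ and each vertex of $B$ has at most $b$ neighbors in $A$; the symbols $a$, $b$, $A$, $B$, $\mu$ and the auxiliary count $e(A,B)$ are our choice of notation.

```latex
\begin{align*}
e(A,B) &:= \bigl|\{(u,v) \in A \times B : uv \in E(G)\}\bigr|, \\
a\,|A| &\le \sum_{u \in A} |N(u) \cap B| = e(A,B) = \sum_{v \in B} |N(v) \cap A| \le b\,|B|, \\
a\,\mu(A) &= \frac{a\,|A|}{|V(G)|} \le \frac{b\,|B|}{|V(G)|} = b\,\mu(B).
\end{align*}
```

The count uses ordered pairs in $A \times B$, so it remains valid when $A$ and $B$ overlap.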

2.3 Hintikka chains

Hintikka sentences are maximally expressive sentences with a certain quantifier depth. We alter the definition to local formulas with a single free variable. Consider a signature that includes the signature of graphs and is finite except that it may contain countably many constants, which are labelled by natural numbers. A first order formula is local if each quantifier is restricted to the neighbors of one of the vertices, i.e., it is of the form $\forall_z x$ or $\exists_z x$ where $x$ is required to be a neighbor of $z$. For example, the formula $\psi(z) \equiv \exists_z x\, \forall_x y\; y = z$ is true iff the vertex $z$ has a neighbor of degree exactly one. The quantifier depth of a local formula is defined in the usual way. A formula is a $d$-formula if its quantifier depth is at most $d$ and it does not involve any constants except for the first $d$ constants; a $d$-formula with no free variables is referred to as a $d$-sentence.

The same argument as in the textbook case of first order sentences yields that there are only finitely many non-equivalent local $d$-formulas with one free variable for every $d$. Let $\mathrm{FO}_1^{\mathrm{local}}$ be a maximal set of non-equivalent local formulas with one free variable, i.e., a set containing one representative from each equivalence class of local formulas with one free variable. If $d$ is an integer, the $d$-Hintikka type of a vertex $v$ of a (not necessarily finite) graph $G$ is the set of all $d$-formulas $\psi \in \mathrm{FO}_1^{\mathrm{local}}$ such that $G \models \psi(v)$. A formula $\psi \in \mathrm{FO}_1^{\mathrm{local}}$ with quantifier depth at most $d$ is called a $d$-Hintikka formula if there exist a (not necessarily finite) graph $G$ and a vertex $v$ of $G$ such that $\psi$ is equivalent to the conjunction of the $d$-Hintikka type of the vertex $v$ of the graph $G$ (note that $\psi$ must actually be equivalent to one of the formulas in the $d$-Hintikka type of $v$).

Fix a $d$-Hintikka formula $\psi$. Observe that if $v$ is a vertex of a graph $G$ and $v'$ is a vertex of a graph $G'$ such that $G \models \psi(v)$ and $G' \models \psi(v')$, then the $d$-Hintikka types of $v$ and $v'$ are the same. So, we can speak of the $d$-Hintikka type of $\psi$. A Hintikka chain is a sequence $(\psi_d)_{d\in\mathbb{N}}$ such that $\psi_d$ is a $d$-Hintikka formula and the $d$-Hintikka type of $\psi_d$ contains $\psi_{d-1}$. If $\psi$ is a $d$-Hintikka formula, the set $\{(\psi_i)_{i\in\mathbb{N}} : \psi_d = \psi\}$ of Hintikka chains is called basic. The set of Hintikka chains (formed by Hintikka formulas with a fixed signature) can be equipped with the topology with the base formed by basic sets; basic sets are clopen in this topology. This defines a Polish space on the set of Hintikka chains. In the proof of Lemma 8, we will define a measure on the $\sigma$-algebra of Borel sets of Hintikka chains. Let us remark that the just defined topological space of Hintikka chains is homeomorphic to the Stone space of $\mathrm{FO}_1^{\mathrm{local}}$ studied in [16, 17].

A graph theory inspired view of Hintikka chains can be the following. Consider a rooted infinite tree where the root node corresponds to the tautology and the nodes at depth $d$ one-to-one correspond to $d$-Hintikka formulas. The parent of the node $u$ corresponding to a $d$-Hintikka formula $\psi$ is the unique node $u'$ at depth $d-1$ such that the formula corresponding to $u'$ is contained in the $d$-Hintikka type of $\psi$. Note that the constructed tree is locally finite. The Hintikka chains one-to-one correspond to infinite paths from the root, and the just defined topology is the most often considered topology on such paths in infinite rooted trees.

3 Limits of plane trees

A plane tree is a rooted tree embedded in the plane. Having in mind our application to graphs with bounded path-width, we will refer to vertices of plane trees as nodes (to distinguish them from the vertices of graphs with bounded path-width, which we consider later). The signature we use to describe plane trees consists of two binary relational symbols and : the relation describes the child-parent relation and the relation determines a linear order among the children of nodes. More precisely, if is a plane tree, then if is a child of . The embedding of in the plane determines a linear order among the children of each node and if and are children of the same node that are consecutive in this linear order. Note that determines a linear order on the children of each node but this order is not first order definable using . Finally, since plane trees are rooted, we can speak about subtrees of their nodes, i.e. the subtree of a node of a plane tree is the subtree induced by all descendants of .
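To make the two-relation encoding concrete, the sketch below turns an ordered rooted tree into a child-parent relation and a relation on consecutive siblings. The names `parent_rel` and `succ_rel`, the orientation of the sibling pairs (left sibling first) and the input format are our assumptions, not the notation of the paper.

```python
def plane_tree_relations(children, root):
    """Return (parent_rel, succ_rel) for an ordered rooted tree.

    `children` maps a node to the list of its children in left-to-right order.
    parent_rel contains (u, v) when u is a child of v;
    succ_rel contains (u, w) when u and w are consecutive children of one node.
    """
    parent_rel, succ_rel = set(), set()
    stack = [root]
    while stack:
        v = stack.pop()
        kids = children.get(v, [])
        for u in kids:
            parent_rel.add((u, v))
            stack.append(u)
        for left, right in zip(kids, kids[1:]):
            succ_rel.add((left, right))
    return parent_rel, succ_rel

def sibling_order(succ_rel, first):
    """Recover the order of a sibling group by following succ_rel from its first child."""
    nxt = dict(succ_rel)
    order = [first]
    while order[-1] in nxt:
        order.append(nxt[order[-1]])
    return order

tree = {"r": ["a", "b", "c"], "b": ["d", "e"]}
parent_rel, succ_rel = plane_tree_relations(tree, "r")
```

Recovering the full order of a sibling group requires following the successor relation an unbounded number of times, as the loop in `sibling_order` does; this is the reason why the order itself is not first order definable from the successor relation.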

The main result of this section is the following theorem.

Theorem 1.

Every first order convergent sequence of plane trees has a limit modeling satisfying the strong finitary mass transport principle.

As we have already mentioned, the proof of Theorem 1 has two parts. First, we represent a first order convergent sequence of plane trees through a measure on Hintikka chains. This will require identifying nodes in the sequence that are in the proximity of a non-zero fraction of all the nodes. The second step of the proof consists of constructing a tree on a measurable set of nodes that satisfies the same first order formulas and has the same Stone pairings as the considered sequence of plane trees. On a high level, one can think of our decomposition step as analogous to comb structure results given for trees of bounded depth in [17, Theorems 27–29] and for general trees in [18, Theorem 36]. However, we do not use comb structure to capture first order properties of plane trees and we employ another technique, bringing a different view on constructing limits of first order convergent sequences. We believe that our decomposition technique involves fewer tools from analysis and model theory and is more combinatorial in nature, which might be useful when extending the results to wider classes of graphs. On the other hand, the composition step developed by some of the authors when constructing modelings of first order convergent sequences of trees was more technical and less elegant than the one used in [18]. So, we adapt ideas from the proof of Lemma 39 in [18] to prove Lemma 8; however, it is not possible to use Lemma 39 from [18] in our setting directly since it concerns trees/forests only and it is designed to be used in conjunction with comb structure results presented in [18].

3.1 Decomposition

Our first aim is to identify nodes that are in the proximity of a non-zero fraction of other nodes. A node of a tree is -major if the sum of the sizes of the two largest components of is at most . Every node in the proximity of a non-zero fraction of other nodes is also close to a major node, as given in the next lemma. However, the converse need not be true, i.e. the -neighborhoods of major nodes can be small.

Lemma 2.

Let be a tree, a positive real and the set of -major nodes of . For every node of and every integer , the number of nodes at distance at most from in is at most .

Proof.

Fix the node of . We will color some nodes in the -neighborhood of as green. We start with coloring the node green and proceed as follows as long as we can. If is a green node at distance at most from , we color the neighbors of in the two largest components of green. In this way, at most nodes are colored green (each node at distance from is adjacent to at most two green nodes at distance ). Next, we recolor all green nodes that are -major to red. We will refer to the nodes that are not red or green as black nodes.

A node can be at distance at most from in only if it is joined to by a path consisting of green and black nodes only. If is a green node, then is not -major and the sum of the orders of the components of such that the neighbor of in the component is black is at most (the neighbors of in the two largest components are green or red). Hence, the number of nodes reachable from through a path consisting of green and black nodes only such that is the last green node on the path is at most (here, we also count the node ). Since there are at most green nodes, we conclude that the -neighborhood of in has at most nodes. ∎

On the other hand, each tree can have only a bounded number of -major nodes.

Lemma 3.

For every , the number of -major nodes of any tree is at most .

Proof.

If the tree has at most nodes, then there is nothing to prove. So, we assume that the tree has more than nodes. We can also assume that is rooted and all the edges are oriented towards the root. Let for a node of be the number of nodes in the subtree of (including itself) divided by . We claim that if is -major, then for every child of . Consider an -major node and suppose that it has a child with . The number of nodes in the component of containing is and in the component containing the parent of is . Since these two components together have more than nodes, cannot be -major.

Partition the interval into intervals , . Note that there is no -major node with since has more than nodes. Since any two -major nodes and with the values and from the same interval , , cannot be joined by an oriented path, the subtrees of all such nodes are disjoint. Consequently, the number of -major nodes with , , is at most (recall that there is no -major node with ). We conclude that the number of -major nodes does not exceed . ∎
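The notion of a major node can be explored computationally. The sketch below assumes that a node is major, for a parameter `eps`, when the two largest components left after deleting it together contain at most `(1 - eps) * n` of the `n` nodes; this threshold is our reading of the definition and should be treated as an assumption, as should the function name and the input format.

```python
from collections import defaultdict

def major_nodes(n, edges, eps):
    """Major nodes of a tree on the vertices 0, ..., n-1.

    Assumed criterion: the two largest components of the tree minus v
    have at most (1 - eps) * n nodes in total.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Root the tree at 0; `order` lists the nodes in breadth-first order.
    parent = {0: None}
    order = [0]
    for v in order:
        for u in adj[v]:
            if u != parent[v]:
                parent[u] = v
                order.append(u)
    # Subtree sizes, accumulated from the leaves upwards.
    size = [1] * n
    for v in reversed(order):
        if parent[v] is not None:
            size[parent[v]] += size[v]
    result = []
    for v in range(n):
        # Components below v, plus the component containing the parent of v.
        comps = [size[u] for u in adj[v] if u != parent[v]]
        if parent[v] is not None:
            comps.append(n - size[v])
        if sum(sorted(comps, reverse=True)[:2]) <= (1 - eps) * n:
            result.append(v)
    return result

star = [(0, i) for i in range(1, 10)]          # center 0 with nine leaves
path = [(0, 1), (1, 2), (2, 3), (3, 4)]        # path on five nodes
```

Under this reading, the center of the star is the only major node for `eps = 0.5`, while the path has no major node at all for the same parameter, since deleting any node of a path leaves at most two components that cover all the remaining nodes.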

We will enhance the signature of plane trees by countably many constants , , to capture major nodes. We will require that these constants are interpreted by different nodes of trees, and some of the constants may not be interpreted at all. A plane tree with some nodes being constants will be called a plane -tree. The considered signature contains countably many constants but only finitely many constants can be interpreted in a finite plane -tree since we require that all constants are different. A sequence of plane -trees is null-partitioned if the following two conditions hold:

  • if for every , there exists such that all constants are interpreted in , , and

  • if for every , there exist integers and such that all the -major nodes of every tree , , are among the constants .

Note that the first condition in the definition above guarantees that for every first order formula , there exists such that can be evaluated in every , .

Lemma 4.

Let be a first order convergent sequence of plane trees such that the orders tend to infinity. There exists a first order convergent null-partitioned sequence of plane -trees obtained from a subsequence of by interpreting some of the constants .

Proof.

We start with constructing first order convergent sequences of plane -trees, . Exactly the constants will be interpreted in for , and all -major nodes of every , , will be among the constants. Let for every . Consider and assume that we have constructed the sequence . Let be such that every , , has at least nodes. Assign -major nodes of , , that are not already some of the constants, to different constants ; since the number of -major nodes does not exceed by Lemma 3, this is possible. Next assign the constants that are not interpreted yet to the remaining nodes of in a way that all constants are assigned to different nodes. Let be the resulting plane -tree. We set to be a first order convergent subsequence of this sequence .

Set for every . Let be a first order formula and let be the largest index of a constant appearing in (if none of the constants appears in , let ). Let be a positive integer such that . Since the reduct of obtained by omitting all constants , , is a subsequence of , and the sequence is first order convergent, the sequence of Stone pairings converges. Hence, the sequence is first order convergent.

We now show that the sequence is null-partitioned. Let and let be a positive integer such that . All -major nodes are among the constants in every tree , . Set and . Since all the -major nodes of every tree , , are among the constants and the choice of was arbitrary, the sequence is null-partitioned. ∎

First order properties are closely linked with Ehrenfeucht-Fraïssé games [9]. It is well-known that two structures satisfy the same first order sentences with quantifier depth $d$ if and only if the duplicator has a winning strategy for the $d$-round Ehrenfeucht-Fraïssé game played on the two structures. We slightly alter the standard notion of Ehrenfeucht-Fraïssé games to fit our setting of plane -trees in the way we now present. The $d$-round Ehrenfeucht-Fraïssé game is played by two players, called the spoiler and the duplicator. In the $i$-th round, the spoiler chooses a node of one of the trees (the spoiler can choose a different tree in different rounds) and places the $i$-th pebble on that node. The duplicator responds by placing the $i$-th pebble on a node of the other tree. At the end of the game, the duplicator wins if the mapping between the two plane -trees that maps the node with the $i$-th pebble in the first tree to the node with the $i$-th pebble in the other tree is an isomorphism preserving both binary relations of the signature and the first $d$ constants. The standard argument used in the classical setting yields that the two plane -trees satisfy the same $d$-sentences if and only if the duplicator has a winning strategy for the $d$-round Ehrenfeucht-Fraïssé game.
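For very small structures, the game can be decided by exhaustive search. The sketch below treats the classical game on plain undirected graphs, not the variant with two relations and constants described above, so it is a simplification for illustration only; the graph encoding and the function names are our assumptions.

```python
def partial_isomorphism(A, B, pa, pb):
    """Check that the pebbled vertices induce the same equalities and adjacencies."""
    (_, ea), (_, eb) = A, B
    for i in range(len(pa)):
        for j in range(len(pa)):
            if (pa[i] == pa[j]) != (pb[i] == pb[j]):
                return False
            if (frozenset((pa[i], pa[j])) in ea) != (frozenset((pb[i], pb[j])) in eb):
                return False
    return True

def duplicator_wins(rounds, A, B, pa=(), pb=()):
    """Decide whether the duplicator wins the game with `rounds` rounds left.

    A graph is a pair (vertices, edges) where edges is a set of two-element frozensets.
    """
    if not partial_isomorphism(A, B, pa, pb):
        return False
    if rounds == 0:
        return True
    (va, _), (vb, _) = A, B
    # The spoiler plays in A; the duplicator needs a good answer in B.
    for x in va:
        if not any(duplicator_wins(rounds - 1, A, B, pa + (x,), pb + (y,)) for y in vb):
            return False
    # The spoiler plays in B; the duplicator needs a good answer in A.
    for y in vb:
        if not any(duplicator_wins(rounds - 1, A, B, pa + (x,), pb + (y,)) for x in va):
            return False
    return True

single_edge = ([0, 1], {frozenset((0, 1))})
two_isolated = ([0, 1], set())
```

With one round, the duplicator wins on the single edge versus two isolated vertices; with two rounds, the spoiler wins by pebbling both ends of the edge. The search is exponential in the number of rounds, so it is only suitable for toy examples.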

The next lemma allows us to prove an extension of Hanf’s theorem to plane -trees with unbounded degrees. Before stating the lemma, we need an additional definition. The relative position of two nodes and in a plane -tree can be described by a path in Gaifman’s graph between them. Let be such path, i.e., and . This path can be associated with a sequence of words up, down, left and right as follows: if , if , if , and if . The path is called strongly canonical if it is associated with a sequence of the form or with , and it is weakly canonical if the associated sequence is . We will refer to a path as canonical if it is either strongly canonical or weakly canonical.

For , the -position of two nodes and is the sequence of words up, down, left and right associated with the shortest strongly canonical path from to if such a path has length at most ; otherwise, if there is a weakly canonical path from to of length at most , then it is the sequence associated with it, and if such a path does not exist, then it is undefined. We say that two pairs of nodes have the same -position if either their -positions are the same sequence or they are both undefined.

Lemma 5.

For all pairs of integers and , , there exist integers and that satisfy the following. Suppose that the -round Ehrenfeucht-Fraïssé game is played on two (not necessarily finite) plane -trees and and the first rounds have already been played. Further suppose that if one of the plane -trees and has less than nodes of some -Hintikka type, then the number of nodes of this -Hintikka type in the trees and are the same. If for every and , , the -Hintikka types of the nodes with the -th pebble are the same and the nodes with the -th and the -th pebbles have the same -position in and , then the duplicator has a winning strategy.

Proof.

Fix . We prove the lemma by induction on the number of the remaining rounds of the game, i.e., on . We show that the lemma holds with the choice . During the proof, we will invoke various lower bounds on and we eventually take to be the largest of these lower bounds.

Suppose that . Since the -positions of the pairs of the corresponding pebbled nodes are the same in and , the relations and induced by the pebbled nodes are isomorphic through the mapping that maps a pebbled node in to the corresponding pebbled node of . Since the -Hintikka types of the nodes with the -th pebble in and , , are the same, the first constants appear on the corresponding pebbled nodes (if they appear on any of them). We conclude that the duplicator has won the game. Note that we can choose arbitrarily, e.g., equal to .

Suppose that and that we have proven the existence of and . By symmetry, we can assume that the spoiler has placed the -th pebble on a node of . Let be the nodes with the first pebbles in , and let be the nodes with the first pebbles in . We distinguish several cases based on the existence of a short canonical path from one of the nodes to .

Suppose first that there exists a canonical path from , , to that has length at most and let be the sequence associated with the path. Let be the string of the form or associated with the path. Among all choices of such that a canonical path from to of length at most exists, choose the one with minimal. We first assume that . Let be the node of such that there exists a canonical path from to that is also associated with . The existence of follows from that the nodes and have the same -Hintikka type and . Note that the node is uniquely determined and that the -position between and is the same as the -position between and for every . Finally, since , the -Hintikka types of and are the same. We conclude that the duplicator will have a winning strategy for the remaining rounds of the game, if the duplicator places the -th pebble on .

We now assume that . Note that . Let be the node at distance on the path from to the root and let be the node such that the -position of and is the same as the -position of and . Note that must exist and is uniquely determined since the nodes and have the same -Hintikka type. Consider the -formula that is satisfied if a node has a descendant at depth with the same -Hintikka type as . Let be the number of children of satisfying and let be the number of such children of . Since , it holds that or both and are at least .

We say that a child of is -close if there is a canonical path from to associated with or with for some . Similarly, a child of is -close if such a path exists from to . Let be the number of -close children of satisfying , and let be the number of -close children of satisfying . Note that every and is at most . Since the -positions of all the pairs of pebbled nodes are the same in and , is a descendant of at depth at most if and only if is a descendant of at the same depth. Since and and have the same -Hintikka type, it holds for every .

Let be the number of children of that satisfy and are -close for some , and let be the number of children of that satisfy and are -close for some . Observe that if the same child of is counted both in and for some and , then the nodes and are joined by a strongly canonical path of length at most and the corresponding child of is also counted both in and . This yields that . The choice of as small as possible implies that the node is not a descendant of a -close child of . Since , we conclude that and thus . Consequently, there exists a child of satisfying that is not -close for any . Let be the descendant of this at depth that has the same -Hintikka type as . The duplicator now places the -th pebble on . Since the -position of to any , is the same as the -position of to , the duplicator has a winning strategy for the remaining rounds of the game by induction.

It remains to consider the case that there is no canonical path from any to of length at most . If has a node with the same -Hintikka type as with no canonical path of length at most to any of the nodes , the duplicator places the -th pebble on and the existence of the winning strategy follows by induction. Otherwise, there exist , , such that every node of the same -Hintikka type as is a descendant of for some , at depth at most , where is a function that assigns a node its parent.

First suppose that is at distance less than from the root in and consider a node in of the same -Hintikka type as (which exists since ). Note that the distance of from the root in is the same as that of in . The node is a descendant of for some , . Hence, is at distance at most from the root. It follows that is at the same distance from the root in because they have the same -Hintikka type. This implies that there is a canonical path from to of length at most , which contradicts our assumption.

We now assume that is at distance at least from the root. Let be a local -formula expressing that has a descendant at depth of the same -Hintikka type as . Note that any node of the same -Hintikka type as (in or in ) is at distance at least from the root and must thus be contained in the subtree of a node satisfying at depth . Moreover, every node of that satisfies is one of the nodes , and . Hence, contains at most nodes satisfying . If , then the number of such nodes is the same in . If , all such nodes in are equal to for some and (because the same holds in ). Since every node of of the same -Hintikka type as is contained in the subtree of a node satisfying at depth , all nodes of the same -Hintikka type as are joined to some by a canonical path of length at most . In particular, there is a canonical path from to of length at most , which is impossible. ∎

An immediate corollary of Lemma 5 is the following variant of Hanf’s theorem for plane trees.

Theorem 6.

For every integer , there exist integers and such that for any two (not necessarily finite) plane -trees and , if the trees and have the same number of nodes of each -Hintikka type or the number of the nodes of this type is at least in both and , then the sets of -sentences satisfied by and are the same.
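The hypothesis of Theorem 6 compares, for each Hintikka type, the node counts of the two trees up to a threshold. The following is a minimal sketch of that comparison for finite trees only, assuming that every node has already been labelled with its Hintikka type (computing the types is not shown). The names `type_counts_agree` and `threshold` are hypothetical and do not come from the paper.

```python
from collections import Counter


def type_counts_agree(types_a, types_b, threshold):
    """Check the counting condition from the hypothesis of the theorem.

    types_a, types_b: iterables of (hypothetical) type labels, one per node.
    threshold: the cut-off above which exact counts no longer matter.
    """
    counts_a = Counter(types_a)
    counts_b = Counter(types_b)
    for t in set(counts_a) | set(counts_b):
        a, b = counts_a[t], counts_b[t]
        # Either the counts coincide, or both reach the threshold.
        if a != b and not (a >= threshold and b >= threshold):
            return False
    return True
```

Equivalently, the two trees agree when the counts truncated at the threshold coincide for every type.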

We are now ready to distill the essence of first order properties of a first order convergent sequence of plane trees. Fix a first order convergent null-partitioned sequence of plane -trees . We associate the sequence with two functions, and , which we will refer to as the discrete Stone measure and the Stone measure of the sequence (strictly speaking, the Stone measure as we define it is not a measure in the sense of measure theory; however, we believe that there is no danger of confusion). If , then is the limit of the number of nodes such that , and is the limit of . If is a plane -tree modeling, then it is also possible to speak about its discrete Stone measure and its Stone measure by setting and for .
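For a single finite tree, the two quantities whose limits define the discrete Stone measure and the Stone measure are a count and a fraction, respectively. As a minimal sketch, assuming that a formula with one free variable is represented by a Python predicate on nodes (the name `count_and_fraction` and this representation are hypothetical), they can be computed as follows.

```python
from fractions import Fraction


def count_and_fraction(nodes, satisfies):
    """Return (number of nodes satisfying the predicate, their fraction).

    nodes: the nodes of one finite tree of the sequence.
    satisfies: a predicate standing in for a formula with one free variable.
    """
    nodes = list(nodes)
    if not nodes:
        raise ValueError("the tree must have at least one node")
    count = sum(1 for u in nodes if satisfies(u))
    # The fraction is the probability that a uniformly random node
    # satisfies the predicate.
    return count, Fraction(count, len(nodes))
```

Along a sequence of trees whose orders tend to infinity, a bounded count forces the fraction to tend to zero, which is the situation exploited later when the support of the measure is analysed.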

Let be the ring formed by finite unions of basic sets of Hintikka chains. Note that can be equivalently defined as the set containing finite unions of disjoint basic sets of Hintikka chains. Let be the union of disjoint basic sets corresponding to Hintikka formulas and define . Clearly, the mapping is additive. Since every countable union of non-empty pairwise disjoint sets from that is contained in is finite, is a premeasure. By Carathéodory’s Extension Theorem, the premeasure extends to a measure on the -algebra formed by Borel sets of Hintikka chains. We will also use to denote this measure, which is uniquely determined by the Stone measure . Let us remark that the existence of on the -algebra formed by Borel sets of Hintikka chains can be derived from [17, Theorem 8], but we prefer to give a simple direct argument for the clarity and completeness of our exposition.

The following lemma relates first order convergent sequences of plane trees and their limit modelings. In particular, the lemma reduces the problem of constructing a limit modeling of a first order convergent null-partitioned sequence of plane -trees, i.e., a modeling that agrees on all Stone pairings, to constructing a limit modeling that agrees on Stone pairings involving formulas from only.

Lemma 7.

Let be a first order convergent null-partitioned sequence of plane -trees with increasing orders and let and be its discrete Stone measure and Stone measure, respectively. If is a plane -tree modeling such that

  • the discrete Stone measure of is ,

  • the Stone measure of is ,

  • the -neighborhood in of each node in has zero measure for every ,

then is a limit modeling of .

Proof.

By Theorem 6, the discrete Stone measure fully determines which first order sentences are satisfied by , in particular, is equal to the limit of for every first order sentence . The situation is trickier for first order formulas with free variables.

Fix such a formula with free variables and with quantifier depth . Let and be the quantities from Lemma 5 applied for and , and let be the number of -Hintikka formulas. Lemma 5 yields that the truth value of for a particular evaluation of its free variables is determined by the -Hintikka types of the values of the free variables and by the statistics of -Hintikka types (possibly truncated at the value of ) of all the nodes, if none of the pairs of the values of the free variables is joined by a canonical path of length at most . If the distance between the values of the free variables is smaller, then their distance can also come into play. Since there exist first order convergent sequences of plane -trees such that the probability of two random nodes being joined by a canonical path of length at most is bounded away from zero for every tree in the sequence, we have to account for this possibility, and we do so using the major nodes we have introduced.

Fix . Since the sequence is null-partitioned, there exists such that all the -major nodes of each of the trees , , are among the constants . We assume that and are large enough such that

  • the largest index of a constant that appears in does not exceed ,

  • all trees , , contain the same set of the constants among (the existence of such follows from the first order convergence of the sequence ), and

  • for every -Hintikka formula and (the existence of such follows from the fact that there are only different -Hintikka formulas and the sequence is first order convergent).

The probability that one of the randomly chosen nodes of , , is among the constants or two such nodes are at distance at most (in ) is at most , since every -neighborhood of a node in the tree has at most nodes by Lemma 2.
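The estimate above is a union bound over pairs of sampled nodes. The sketch below illustrates only this generic step and does not reproduce the exact expression of the paper, which also accounts for the constants. It assumes that the nodes are sampled independently and uniformly, uses a path as a hypothetical stand-in for the tree, and the names `close_pair_probability` and `union_bound` are ours.

```python
from fractions import Fraction
from itertools import product
from math import comb


def close_pair_probability(n, k, d):
    """Exact probability (by brute force) that some two of k independent
    uniform nodes of the path 0 - 1 - ... - (n-1) are at distance at most d."""
    hits = 0
    for sample in product(range(n), repeat=k):
        if any(abs(sample[i] - sample[j]) <= d
               for i in range(k) for j in range(i + 1, k)):
            hits += 1
    return Fraction(hits, n ** k)


def union_bound(n, k, ball_size):
    """Union bound: each of the C(k, 2) pairs is close with probability
    at most ball_size / n, where ball_size bounds every d-neighborhood."""
    return Fraction(comb(k, 2) * ball_size, n)
```

On a path, every d-neighborhood has at most 2d + 1 nodes, so `union_bound(n, k, 2 * d + 1)` bounds `close_pair_probability(n, k, d)` from above; the bound tends to zero as the order n grows while k and d stay fixed.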

Let us now look at the modeling . The -neighborhood of any node in has zero measure but the same need not be true for the -neighborhoods in . However, since it is possible to express by a local first order formula that a node is at distance at most from a constant and the -neighborhood of each node , , contains at most nodes in , , by Lemma 2, the measure of the -neighborhood of , , is at most . Let be a node of . If the -neighborhood of in does not contain any of the constants , , then its measure is zero by the assumption of the lemma. If the -neighborhood of in contains one of the constants , , it is contained in the -neighborhood of in and its measure is at most , which is the upper bound on the measure of the -neighborhood of . We conclude that the probability that any two of randomly chosen nodes of are at distance at most in is at most .

For any -tuple of -Hintikka formulas , the probabilities that a random -tuple of nodes of , , and a random -tuple of nodes of have -Hintikka types containing (in this order) differ by at most since for every -Hintikka formula . Note that there are choices of -tuples of -Hintikka formulas. Also note that if two nodes are at distance larger than in and have the same -Hintikka type, then either they are not joined by a canonical path of length at most or their -position is uniquely determined by their -Hintikka types (since a canonical path of length at most must pass through one of the constants ). The same holds for any two nodes of . Hence, the Stone pairings and for , which are the probabilities that a random -tuple of nodes of , , and a random -tuple of nodes of satisfy , differ by at most

Since the choice of was arbitrary and the orders of the trees tend to infinity, we conclude that for every , there exists such that for every . Since this holds for every first order formula , the modeling is a limit modeling of . ∎

3.2 Composition

In this section, we complement Lemma 7 by constructing a modeling with the properties stated in that lemma.

Lemma 8.

Let be a first order convergent null-partitioned sequence of plane -trees and let and be its discrete Stone measure and Stone measure, respectively. There exists a plane -tree modeling such that

  • the discrete Stone measure of is ,

  • the Stone measure of is ,

  • the -neighborhood of each node in has zero measure for every , and

  • the modeling satisfies the strong finitary mass transport principle.

Proof.

We first extend the mapping to Hintikka chains as follows. If is a Hintikka chain, then

Observe that the support of the measure is a subset of . Indeed, if , then there exists a -Hintikka chain such that . Hence, there exists such that every , contains exactly nodes such that . Since the sequence is null-partitioned (in particular, the orders of tend to infinity), it follows that . Consequently, the measure of the basic set of Hintikka chains corresponding to is zero and is not in the support of .

The node set of the modeling that we construct consists of two sets: contains all pairs where is a Hintikka chain such that and , and . Note that the set is countable.

A Hintikka chain encodes many properties of a node. In particular, it uniquely determines the Hintikka chains of the parent and the successor (if they exist) and we will refer to these chains as the parent Hintikka chain and the successor Hintikka chain. The Hintikka chain of a node also determines whether the node is one of the constants, whether it is the root, and how many children satisfying a particular local first order formula the node can have. In a slightly informal way, we will speak about these properties by saying that the node of the Hintikka chain is a constant, it is the root, etc.

We now continue the construction of the modeling by setting the constants. For each constant , there is at most one Hintikka chain with such that the node of is the constant . If such a Hintikka chain exists, then and we set the constant to be the node .

We next define the child-parent relation and the successor relation . Let . If the node of the Hintikka chain is the root, then has no parent and no successor. Otherwise, let be the parent Hintikka chain of . The definition of the discrete Stone measure and the first order convergence of imply that is a non-zero integer which divides . The parent of the node is the node where . If the node of the Hintikka chain has no successor, then has no successor. Otherwise, let be the successor Hintikka chain of . Since is a discrete Stone measure of a first order convergent sequence, it must hold that . We set the successor of to be the node .

Let . Since every tree has a unique root and , the node of the Hintikka chain is not the root. Let be the parent Hintikka chain. If is a non-zero integer, the parent of is the node . If the node of the Hintikka chain has no successor, then has no successor. Otherwise, let be the successor Hintikka chain. Since is a discrete Stone measure of a first order convergent sequence, it must hold that , and we can set the successor of the node to be the node