First-order interpretations of bounded expansion classes
J. Nešetřil and P. Ossona de Mendez are
supported by CE-ITI P202/12/G061 of GACR and European
Associated Laboratory (LEA STRUCO).
J. Gajarský and S. Kreutzer are supported by the
European Research Council (ERC) under the European Union’s Horizon
2020 research and innovation programme (ERC Consolidator Grant
DISTRUCT, grant agreement No 648527).
M. Pilipczuk and S. Siebertz are supported by the National
Science Centre of Poland (NCN) via POLONEZ grant agreement
UMO-2015/19/P/ST6/03998, which has received funding from
the European Union’s Horizon 2020 research and innovation
programme (Marie Skłodowska-Curie grant agreement No. 665778).
Sz. Toruńczyk is supported by the NCN grant
The notion of bounded expansion captures uniform sparsity of graph classes and renders various algorithmic problems that are hard in general tractable. In particular, the model-checking problem for first-order logic is fixed-parameter tractable over such graph classes. With the aim of generalizing such results to dense graphs, we introduce classes of graphs with structurally bounded expansion, defined as first-order interpretations of classes of bounded expansion. As a first step towards their algorithmic treatment, we provide their characterization analogous to the characterization of classes of bounded expansion via low treedepth decompositions, replacing treedepth by its dense analogue called shrubdepth.
The interplay of methods from logic and graph theory has led to many important results in theoretical computer science, notably in algorithmics and complexity theory. The combination of logic and algorithmic graph theory is particularly fruitful in the area of algorithmic meta-theorems. Algorithmic meta-theorems are results of the form: every computational problem definable in a logic $\mathcal{L}$ can be solved efficiently on any class of structures satisfying a property $\mathcal{P}$. In other words, these theorems show that the model-checking problem for the logic $\mathcal{L}$ on any class $\mathcal{C}$ satisfying $\mathcal{P}$ can be solved efficiently, where efficiency usually means fixed-parameter tractability.
The archetypal example of an algorithmic meta-theorem is Courcelle’s theorem [1, 2], which states that model-checking a formula $\varphi$ of monadic second-order logic can be solved in time $f(|\varphi|) \cdot n$ on any graph with $n$ vertices which comes from a fixed class of graphs of bounded treewidth, for some computable function $f$. Seese proved an analogue of Courcelle’s result for the model-checking problem of first-order logic on any class of graphs of bounded degree. Following this result, the complexity of first-order model-checking on specific classes of graphs has been studied extensively in the literature. See e.g. [20, 9, 22, 25, 5, 11, 12, 33, 10, 26, 34, 19, 6, 7, 15, 24]. One of the main goals of this line of research is to find a structural property which precisely defines those graph classes for which model checking of first-order logic is tractable.
So far, research on algorithmic meta-theorems has focused predominantly on sparse classes of graphs, such as classes that have bounded treewidth, exclude a minor, have bounded expansion, or are nowhere dense. The concepts of bounded expansion and nowhere denseness were introduced by Nešetřil and Ossona de Mendez with the goal of capturing the intuitive notion of sparseness. See  for an extensive coverage of these notions. The large number of equivalent ways in which they can be defined, using notions from combinatorics, theoretical computer science or logic, indicates that these two concepts capture some very natural limits of “well-behavedness” and algorithmic tractability. For instance, Grohe et al.  proved that if $\mathcal{C}$ is a class of graphs closed under taking subgraphs then model checking first-order logic on $\mathcal{C}$ is tractable if, and only if, $\mathcal{C}$ is nowhere dense (the lower bound was proved in ). As far as algorithmic meta-theorems for fixed-parameter tractability of first-order model-checking are concerned, this result completely solves the case for graph classes which are closed under taking subgraphs, which is a reasonable requirement for sparse but not for dense graph classes.
Consequently, research in this area has shifted towards studying the dense case, which is much less understood. While there are several examples of algorithmic meta-theorems on dense classes, such as for monadic second-order logic on classes of bounded cliquewidth  or for first-order logic on interval graphs, partial orders, classes of bounded shrubdepth and other classes, see e.g. [15, 13, 17, 14], a general theory of meta-theorems for dense classes is still missing. Moreover, unlike in the sparse case, there is no canonical hierarchy of dense graph classes which could guide research on algorithmic meta-theorems in the dense world.
Hence, the main research challenge for dense model-checking is not only to prove tractability results and to develop the necessary logical and algorithmic tools. It is at least as important to define and analyze promising candidates for “structurally simple” graph classes which are not necessarily sparse. This is the main motivation for the research in this paper. Since bounded expansion and nowhere denseness form the limits for tractability of certain problems in the sparse case, any extension of the theory should provide notions which collapse to bounded expansion or nowhere denseness, under the additional assumption that the classes are closed under taking subgraphs. Therefore, a natural way of seeking such notions is to base them on the existing notions of bounded expansion or nowhere denseness.
In this paper, we take bounded expansion classes as a starting point and study two different ways of generalizing them towards dense graph classes preserving their good properties. In particular, we define and analyze classes of graphs obtained from bounded expansion classes by means of first-order interpretations and classes of graphs obtained by generalizing another, more combinatorial characterization of bounded expansion in terms of low treedepth colorings into the dense world. Our main structural result shows that these two very different ways of generalizing bounded expansion into the dense setting lead to the same classes of graphs. This is explained in greater detail below.
Interpretations and transductions. One possible way of constructing “well-behaved” and “structurally simple” classes of graphs is to use logical interpretations, or the related concept of transductions studied in formal language and automata theory. For our purpose, transductions are more convenient and we will use them in this paper. Intuitively, a transduction is a logically defined operation which takes a structure as input and nondeterministically produces as output a target structure. In this paper we use first-order transductions, which involve first-order formulas (see section 2 for details). Two examples of such transductions are graph complementation, and the squaring operation which, given a graph $G$, adds an edge between every pair of vertices at distance $2$ from each other.
We postulate that if we start with a “structurally simple” class $\mathcal{C}$ of graphs, e.g. a class of bounded expansion or a nowhere dense class, and then study the graph classes $\mathcal{D}$ which can be obtained from $\mathcal{C}$ by first-order transductions, then the resulting classes should still have a simple structure and thus be well-behaved algorithmically as well as in terms of logic. In other words, the resulting classes are interesting graph classes with good algorithmic and logical properties, and which are certainly not sparse in general. For instance, a useful feature of transductions is that they provide a canonical way of reducing model-checking problems from the generated classes $\mathcal{D}$ to the original class $\mathcal{C}$, provided that given a graph $H \in \mathcal{D}$, we can effectively compute some graph $G \in \mathcal{C}$ that is mapped to $H$ by the transduction. In general, this is a hard problem, requiring a combinatorial understanding of the structure of the resulting classes $\mathcal{D}$.
The above principle has so far been successfully applied in the setting of graph classes of bounded treewidth and monadic second-order transductions: it was shown by Courcelle, Makowsky and Rotics  that transductions of classes of bounded treewidth can be combinatorially characterized as classes of bounded cliquewidth. This, combined with Oum’s result  gives a fixed-parameter algorithm for model-checking monadic second-order logic on classes of bounded cliquewidth. More recently, the same principle, but for first-order logic, has been applied to graphs of bounded degree , leading to a combinatorial characterization of first-order transductions of such classes, and to a model-checking algorithm.
Applying our postulate to bounded expansion classes yields the central notion of this paper: a class of graphs has structurally bounded expansion if it is the image of a class of bounded expansion under some fixed first-order transduction. This paper is a step towards a combinatorial, algorithmic, and logical understanding of such graph classes.
Low Shrubdepth Covers. The method of transductions is one way of constructing complex graphs out of simple graphs. A more combinatorial approach is the method of decompositions (or colorings) , which we reformulate below in terms of covers. This method can be used to provide a characterization of bounded expansion classes in terms of very simple graph classes, namely classes of bounded treedepth. A class of graphs has bounded treedepth if there is a bound on the length of simple paths in the graphs in the class (see section 2 for a different but equivalent definition). A class $\mathcal{C}$ has low treedepth covers if for every number $p$ there is a number $N$ and a class $\mathcal{T}$ of bounded treedepth such that for every $G \in \mathcal{C}$, the vertex set $V(G)$ can be covered by $N$ sets $U_1, \ldots, U_N$ so that every set of at most $p$ vertices is contained in some $U_i$, and for each $i$, the subgraph of $G$ induced by $U_i$ belongs to $\mathcal{T}$. A consequence of a result by Nešetřil and Ossona de Mendez  on a related notion of low treedepth colorings is that a graph class has bounded expansion if, and only if, it has low treedepth covers.
The decomposition method allows one to lift algorithmic, logical, and structural properties from classes of bounded treedepth to classes of bounded expansion. For instance, this was used to show tractability of first-order model-checking on bounded expansion classes [8, 21].
An analogue of treedepth in the dense world is the concept of shrubdepth, introduced in . Shrubdepth shares many of the good algorithmic and logical properties of treedepth. This notion is defined combinatorially, in the spirit of the definition of cliquewidth, but can be also characterized by logical means, as first-order transductions of classes of bounded treedepth. Applying the method of decompositions to the notion of shrubdepth leads to the following definition. A class $\mathcal{C}$ of graphs has low shrubdepth covers if for every number $p$ there is a number $N$ and a class $\mathcal{B}$ of bounded shrubdepth such that for every $G \in \mathcal{C}$, there is a $p$-cover of $G$ consisting of $N$ sets $U_1, \ldots, U_N \subseteq V(G)$, so that every set of at most $p$ vertices is contained in some $U_i$ and for each $i$, the subgraph of $G$ induced by $U_i$ belongs to $\mathcal{B}$. Shrubdepth properly generalizes treedepth and consequently classes admitting low shrubdepth covers properly extend bounded expansion classes.
It was observed earlier  that for every fixed $r$ and every class $\mathcal{C}$ of bounded expansion, the class of $r$th power graphs of graphs from $\mathcal{C}$ (taking the $r$th power of a graph is a simple first-order transduction) admits low shrubdepth colorings.
Our contributions. Our main result, theorem 15, states that the two notions introduced above are the same: a class of graphs has structurally bounded expansion if, and only if, it has low shrubdepth covers. That is, transductions of classes of bounded expansion are the same as classes with low shrubdepth covers (cf. Figure 1). This gives a combinatorial characterization of structurally bounded expansion classes, which is an important step towards their algorithmic treatment.
One of the key ingredients of our proof is a quantifier-elimination result (theorem 16) for transductions on classes of structurally bounded expansion. This result strengthens in several ways similar results for bounded expansion classes due to Dvořák, Král’, and Thomas , Grohe and Kreutzer  and Kazana and Segoufin . Our assumption is more general, as they assume that the class $\mathcal{C}$ has bounded expansion, and here $\mathcal{C}$ is only required to have low shrubdepth covers. Also, our conclusion is stronger, as their results provide quantifier-free formulas involving some unary functions and unary predicates which are computable algorithmically, whereas our result shows that these functions can be defined using very restricted transductions. Quantifier-elimination results of this type proved to be useful for the model-checking problem on bounded expansion classes [8, 21, 26], and this is also the case here.
As explained earlier, the transduction method allows one to reduce the model-checking problem to the problem of finding inverse images under transductions, which is a hard problem in general and depends very much on the specific transduction. On the other hand, as we show, the cover method allows one to reduce the model-checking problem for classes with low shrubdepth covers to the problem of computing a bounded shrubdepth cover of a given graph. In fact, as a consequence of our proof, in theorem 40 we show that it is enough to compute a $2$-cover of a given graph from a structurally bounded expansion class, in order to obtain an algorithm for the model-checking problem for such classes. We conjecture that such an algorithm exists and that therefore first-order model-checking is fixed-parameter tractable on any class of graphs of structurally bounded expansion. We leave this problem for future work.
Organization. In section 2 we collect basic facts about logic, transductions, treedepth, shrubdepth and the notion of bounded expansion. In section 3 we provide the formal definitions of structurally bounded expansion classes and classes with low shrubdepth covers, and state the main results and their proofs using lemmas which are proved in the following three sections. We consider algorithmic aspects in section 7 and conclude in section 8. We aim to present an easy to follow proof of our main result. For this reason, we present proofs of the key lemmas in the main body of the paper, while rather technical results that disturb the flow of ideas are presented in full detail in the appendix.
We use standard graph notation. All graphs considered in this paper are undirected, finite, and simple; that is, we do not allow loops or multiple edges with the same pair of endpoints. We follow the convention that the composition of an empty sequence of (partial) functions is the identity function. For an integer $k$, we denote $[k] = \{1, \ldots, k\}$.
2.1 Structures, logic, and transductions
Structures and logic.
A signature $\Sigma$ is a finite set of relation symbols, each with prescribed arity that is a non-negative integer, and unary function symbols. A structure $\mathbf{A}$ over $\Sigma$ consists of a finite universe $V(\mathbf{A})$ and interpretations of symbols from the signature: each relation symbol $R \in \Sigma$, say of arity $k$, is interpreted as a $k$-ary relation $R^{\mathbf{A}} \subseteq V(\mathbf{A})^k$, whereas each function symbol $f \in \Sigma$ is interpreted as a partial function $f^{\mathbf{A}} \colon V(\mathbf{A}) \rightharpoonup V(\mathbf{A})$. We drop the superscript when the structure is clear from the context, thus identifying each symbol with its interpretation. If $\mathbf{A}$ is a structure and $X \subseteq V(\mathbf{A})$ then we define the substructure of $\mathbf{A}$ induced by $X$ in the usual way except that a unary function $f$ in $\mathbf{A}$ becomes undefined on all $x \in X$ for which $f(x) \notin X$. The Gaifman graph of a structure $\mathbf{A}$ is the graph with vertex set $V(\mathbf{A})$ where two elements $u, v$ are adjacent if and only if either $u$ and $v$ appear together in some tuple in some relation in $\mathbf{A}$, or $u = f(v)$ or $v = f(u)$ for some partial function $f$ in $\mathbf{A}$.
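The Gaifman graph construction can be sketched in a few lines. The encoding below is an assumption made for illustration only: relations are sets of tuples, partial unary functions are dictionaries, and the name `gaifman_graph` is ours rather than the paper's.

```python
from itertools import combinations

def gaifman_graph(relations, functions):
    """Return the Gaifman graph edges as a set of 2-element frozensets.

    relations: dict mapping a relation name to a set of tuples (hypothetical encoding)
    functions: dict mapping a function name to a dict encoding a partial unary function
    """
    edges = set()
    # Two elements are adjacent if they occur together in some tuple of some relation.
    for tuples in relations.values():
        for t in tuples:
            for u, v in combinations(set(t), 2):
                edges.add(frozenset((u, v)))
    # An element is adjacent to its image under any partial function.
    for f in functions.values():
        for x, y in f.items():
            if x != y:
                edges.add(frozenset((x, y)))
    return edges

relations = {"R": {(1, 2, 3)}}   # one ternary tuple
functions = {"f": {4: 1}}        # f(4) = 1, undefined elsewhere
edges = gaifman_graph(relations, functions)
```

Here the ternary tuple contributes a triangle on the elements 1, 2, 3, and the function contributes the edge between 4 and 1.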
For a signature $\Sigma$, we consider standard first-order logic over $\Sigma$. Let us clarify the usage of function symbols. A term $\tau(x)$ is a finite composition of function symbols applied to a variable $x$. In a structure $\mathbf{A}$, given an evaluation of $x$, the term $\tau(x)$ either evaluates to some element of $\mathbf{A}$ in the natural sense, or is undefined if during the evaluation we encounter an element that does not belong to the domain of the function that is to be applied next. In first-order logic over $\Sigma$ we allow usage of atomic formulas of the following form:
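$R(\tau_1(x_1), \ldots, \tau_k(x_k))$ for a relation symbol $R$ of arity $k$, terms $\tau_1, \ldots, \tau_k$, and variables $x_1, \ldots, x_k$;
$\tau_1(x_1) = \tau_2(x_2)$ for terms $\tau_1, \tau_2$ and variables $x_1, x_2$; and
$\mathrm{dom}(\tau(x))$ for term $\tau$ and variable $x$.
Here, the predicate $\mathrm{dom}(\tau(x))$ checks whether $x$ belongs to the domain of $\tau$. The semantics are defined as usual, except that an atomic formula is false if any of the terms involved is undefined. Based on these atomic formulas, the syntax and semantics of first-order logic are defined in the expected way.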
Graphs, colored graphs and trees.
Graphs can be viewed as finite structures over the signature consisting of a binary relation symbol $E$, interpreted as the edge relation, in the usual way. For a finite label set $\Lambda$, by a $\Lambda$-colored graph we mean a graph enriched by a unary predicate $U_\lambda$ for every $\lambda \in \Lambda$. We will follow the convention that if $\mathcal{C}$ is a class of colored graphs, then we implicitly assume that all graphs in $\mathcal{C}$ are over the same fixed finite signature. A rooted forest is an acyclic graph $F$ together with a unary predicate $R \subseteq V(F)$ selecting one root in each connected component of $F$. A tree is a connected forest. The depth of a node $x$ in a rooted forest $F$ is the distance between $x$ and the root in the connected component of $x$ in $F$. The depth of a forest is the largest depth of any of its nodes. The least common ancestor of nodes $x$ and $y$ in a rooted tree is the common ancestor of $x$ and $y$ that has the largest depth.
We now define the notion of transduction used in the sequel. A transduction is a special type of first-order interpretation with set parameters, which we see here (from a computational point of view) as a nondeterministic operation that maps input structures to output structures. Transductions are defined as compositions of atomic operations listed below.
An extension operation is parameterized by a first-order formula $\varphi(x_1, \ldots, x_k)$ and a relation symbol $R$. Given an input structure $\mathbf{A}$, it outputs the structure $\mathbf{A}$ extended by the relation $R$ interpreted as the set of $k$-tuples of elements satisfying $\varphi$ in $\mathbf{A}$. A restriction operation is parameterized by a unary formula $\psi(x)$. Applied to a structure $\mathbf{A}$ it outputs the substructure of $\mathbf{A}$ induced by all elements satisfying $\psi$. A reduct operation is parameterized by a relation symbol $R$, and results in removing the relation $R$ from the input structure. Copying is an operation which, given a structure $\mathbf{A}$, outputs a disjoint union of two copies of $\mathbf{A}$ extended with a new unary predicate which marks the newly created vertices, and a symmetric binary relation which connects each vertex with its copy. A function extension operation is parameterized by a binary formula $\varphi(x, y)$ and a function symbol $f$, and extends a given input structure by a partial function $f$ defined as follows: $f(x) = y$ if $y$ is the unique vertex such that $\varphi(x, y)$ holds. Note that if there is no such $y$ or more than one such $y$, then $f(x)$ is undefined. Finally, suppose $\sigma$ is a function that maps each structure $\mathbf{A}$ to a nonempty family $\sigma(\mathbf{A})$ of subsets of its universe. A unary lift operation, parameterized by $\sigma$, takes as input a structure $\mathbf{A}$ and outputs the structure $\mathbf{A}$ enriched by a unary predicate $X$ interpreted by a nondeterministically chosen set $U \in \sigma(\mathbf{A})$.
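To make the deterministic atomic operations concrete, here is a minimal sketch in Python. It assumes a structure is a dictionary with a `universe` set and a `relations` dictionary, and it models formulas as Python predicates; the names `extend`, `restrict`, `reduct`, and `at_distance_two` are hypothetical. The example realizes the squaring operation mentioned in the introduction as a single extension operation.

```python
from itertools import product

def extend(A, name, arity, phi):
    """Extension: add relation `name`, the set of `arity`-tuples satisfying phi in A."""
    rel = {t for t in product(A["universe"], repeat=arity) if phi(A, *t)}
    return {"universe": set(A["universe"]), "relations": {**A["relations"], name: rel}}

def restrict(A, psi):
    """Restriction: keep the substructure induced by the elements satisfying psi."""
    keep = {x for x in A["universe"] if psi(A, x)}
    rels = {n: {t for t in r if set(t) <= keep} for n, r in A["relations"].items()}
    return {"universe": keep, "relations": rels}

def reduct(A, name):
    """Reduct: forget the relation `name`."""
    rels = {n: r for n, r in A["relations"].items() if n != name}
    return {"universe": set(A["universe"]), "relations": rels}

def at_distance_two(A, x, y):
    """phi(x, y): x and y are distinct, non-adjacent, and have a common neighbor."""
    E = A["relations"]["E"]
    return (x != y and (x, y) not in E
            and any((x, z) in E and (z, y) in E for z in A["universe"]))

# The path 1 - 2 - 3 - 4, with the edge relation stored symmetrically.
path = {"universe": {1, 2, 3, 4},
        "relations": {"E": {(1, 2), (2, 1), (2, 3), (3, 2), (3, 4), (4, 3)}}}
squared = extend(path, "E2", 2, at_distance_two)
trimmed = restrict(path, lambda A, x: x <= 3)
```

Note that `at_distance_two` quantifies over a middle vertex, so this particular extension is a transduction but not an almost quantifier-free one.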
We remark that function extension operations can be simulated by extension operations, defining the graphs of the functions in the obvious way. They are, however, useful as a means of extending the expressive power of transductions in which only quantifier-free formulas are allowed, as defined below.
Transductions are defined inductively: every atomic transduction is a transduction, and the composition of two transductions $\mathsf{I}$ and $\mathsf{J}$ is the transduction that, given a structure $\mathbf{A}$, first applies $\mathsf{I}$ to $\mathbf{A}$ and then $\mathsf{J}$ to the output $\mathsf{I}(\mathbf{A})$. A transduction is deterministic if it does not use unary lifts. In this case, for every input structure there is exactly one output structure. A transduction is almost quantifier-free if all formulas that parameterize atomic operations comprising it are quantifier-free (we use the adverb “almost” to indicate that such transductions can still access elements that are not among their free variables via functions), and is deterministic almost quantifier-free if it additionally does not use unary lifts.
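If $\mathcal{C}$ is a class of structures, we write $\mathsf{I}(\mathcal{C})$ for the class which contains all possible outputs of $\mathsf{I}(\mathbf{A})$ for $\mathbf{A} \in \mathcal{C}$. We say that two transductions $\mathsf{I}$ and $\mathsf{J}$ are equivalent on a class $\mathcal{C}$ of structures if every possible output of $\mathsf{I}(\mathbf{A})$ is also a possible output of $\mathsf{J}(\mathbf{A})$, and vice versa, for every $\mathbf{A} \in \mathcal{C}$.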
It may happen that an atomic operation $\mathsf{I}$ is undefined for a given input structure $\mathbf{A}$. For example, for an extension operation $\mathsf{I}$ parameterized by a first-order formula using a relation symbol $R$, if the input structure $\mathbf{A}$ does not carry the symbol $R$, then $\mathsf{I}(\mathbf{A})$ is undefined according to the above definition. This will never occur in our constructions. However, for completeness, we may define $\mathsf{I}(\mathbf{A})$ as a fixed structure in such situations.
When considering a composition of atomic operations, we avoid overriding symbols by later operations, i.e., we always assume that subsequent atomic operations create relation symbols which are distinct from previously created relation symbols and also from symbols in the original signature. Since every transduction $\mathsf{I}$ is a composition of finitely many atomic operations, the result of $\mathsf{I}$ applied to a structure over a finite signature $\Sigma$ will be again a structure over a finite signature $\Sigma'$, which depends on $\mathsf{I}$ and $\Sigma$ only (unless the result is undefined).
Example 1.
Let $\mathcal{F}$ be the class of rooted forests of depth at most $d$, for some fixed $d$. We describe an almost quantifier-free transduction which defines the parent function in $\mathcal{F}$. First, using unary lifts introduce unary predicates $D_0, \ldots, D_d$, where $D_i$ marks the vertices of the input tree which are at distance $i$ from a root. Next, using a function extension, define a partial function $f$ which maps a vertex $v$ in the input tree to its parent, or is undefined in case of a root. This can be done by a quantifier-free formula $\varphi(x, y)$, which selects those pairs $x, y$ such that $x$ and $y$ are adjacent and, for each $i$, $D_i(x)$ implies $D_{i-1}(y)$.
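The two steps of this example can be sketched as follows. This is an illustration under our own encoding: the dictionary `depth` plays the role of the guessed depth predicates, and `parent_function` (a hypothetical name) evaluates the quantifier-free condition "adjacent, and one level closer to the root" for each pair, keeping the value only when the witness is unique, as a function extension requires.

```python
def parent_function(edges, depth):
    """Return the partial parent function as a dict; roots are left undefined."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    parent = {}
    for x in depth:
        # Candidates y with: x and y adjacent, and depth(y) = depth(x) - 1.
        candidates = [y for y in adj.get(x, ()) if depth[y] == depth[x] - 1]
        if len(candidates) == 1:   # defined only for a unique witness
            parent[x] = candidates[0]
    return parent

# A rooted tree with root r, children a and b, and a grandchild c below a.
edges = [("r", "a"), ("r", "b"), ("a", "c")]
depth = {"r": 0, "a": 1, "b": 1, "c": 2}
parent = parent_function(edges, depth)
```

In a rooted forest with correct depth labels every non-root vertex has exactly one neighbor one level up, so the uniqueness check never discards a genuine parent.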
It will sometimes be convenient to work with the encoding of bounded-depth trees and forests as node sets endowed with the parent function, rather than graphs with prescribed roots. As seen in example 1, these two encodings can be translated to each other by means of almost quantifier-free transductions, which render them essentially equivalent.
It will sometimes be useful to assume a certain normal form of transductions. We will need two similar, yet slightly different normal forms: one for general transductions and one for almost quantifier-free transductions. The proofs are standard; for completeness, we give them in the appendix.
Lemma 2.
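Let $\mathsf{I}$ be a transduction. Then $\mathsf{I}$ is equivalent to a transduction obtained by composing, in this order, transductions $\mathsf{L}$, $\mathsf{C}$, $\mathsf{F}$, $\mathsf{E}$, $\mathsf{X}$, and $\mathsf{R}$, where
$\mathsf{L}$ is a sequence of unary lifts;
$\mathsf{C}$ is a sequence of copying operations;
$\mathsf{F}$ is a sequence of function extension operations, one for each function on the output;
$\mathsf{E}$ is a sequence of extension operations, one for each relation on the output;
$\mathsf{X}$ is a single restriction operation; and
$\mathsf{R}$ is a sequence of reduct operations.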
Moreover, formulas parameterizing atomic operations in use only relations and functions that appeared originally on input or were introduced by . In particular, none of these formulas uses any function or relation introduced by an atomic operation in .
Lemma 3.
Every almost quantifier-free transduction is equivalent to an almost quantifier-free transduction that first applies a sequence of unary lifts and then applies a deterministic almost quantifier-free transduction.
2.2 Treedepth and shrubdepth
The treedepth of a graph $G$ is the minimal depth of a rooted forest $F$ with the same vertex set as $G$, such that for every edge $uv$ of $G$, $u$ is an ancestor of $v$, or $v$ is an ancestor of $u$ in $F$. A class $\mathcal{C}$ of graphs has bounded treedepth if there is a bound $d$ such that every graph in $\mathcal{C}$ has treedepth at most $d$. Equivalently, $\mathcal{C}$ has bounded treedepth if there is some number $k$ such that no graph in $\mathcal{C}$ contains a simple path of length $k$. The notion of treedepth lifts to structures: a class of structures has bounded treedepth if the class of their Gaifman graphs has bounded treedepth.
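As a small sanity check of this definition, the sketch below verifies that a given rooted forest witnesses a treedepth bound for a graph. The forest is encoded by a parent dictionary (roots map to `None`); this encoding and the name `witness_depth` are assumptions made for illustration. The function only checks a proposed forest and does not search for an optimal one.

```python
def witness_depth(parent, graph_edges):
    """Return the depth of the forest if every graph edge joins an ancestor
    with a descendant, and None otherwise."""
    def ancestors(v):
        chain = []
        while parent[v] is not None:
            v = parent[v]
            chain.append(v)
        return chain

    for u, v in graph_edges:
        if u not in ancestors(v) and v not in ancestors(u):
            return None
    # Depth of a node is its distance to the root of its component.
    return max(len(ancestors(v)) for v in parent)

# The path 1 - 2 - 3 - 4 and two candidate forests on the same vertex set.
path_edges = [(1, 2), (2, 3), (3, 4)]
good_forest = {2: None, 1: 2, 3: 2, 4: 3}   # root 2, children 1 and 3, then 4 below 3
bad_forest = {1: None, 2: 1, 3: 1, 4: 1}    # a star rooted at 1
```

The first forest has depth 2 and respects all three edges; the star fails because the edge between 2 and 3 joins two siblings.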
The following notion of shrubdepth has been proposed in  as a dense analogue of treedepth. Originally, shrubdepth was defined using the notion of tree-models. We present an equivalent definition based on the notion of connection models, introduced in  under the name of $m$-partite cographs of bounded depth.
A connection model with labels from $\Lambda$ is a rooted labeled tree $T$ where each leaf $x$ is labeled by a label $\lambda(x) \in \Lambda$, and each non-leaf node $v$ is labeled by a (symmetric) binary relation $C(v) \subseteq \Lambda \times \Lambda$. Such a model defines a graph $G$ on the leaves of $T$, in which two distinct leaves $x$ and $y$ are connected by an edge if and only if $(\lambda(x), \lambda(y)) \in C(v)$, where $v$ is the least common ancestor of $x$ and $y$. We say that $T$ is a connection model of the resulting graph $G$.
Fix $n \in \mathbb{N}$, and let $G_n$ be the bi-complement of a matching of order $n$, i.e., the bipartite graph with nodes $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$, such that $a_i$ is adjacent to $b_j$ if and only if $i \neq j$. A connection model for $G_n$ is shown below:
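One possible connection model of depth 2 for the bi-complement of a matching can be written down and checked in code. The encoding (dictionaries `parent`, `leaf_label`, `node_relation`) and the function names are our own assumptions for illustration. The root relates the two labels `a` and `b`, while each middle node groups a matched pair and carries the empty relation, which removes exactly the matching edges.

```python
from itertools import combinations

def graph_from_connection_model(parent, leaf_label, node_relation):
    """Leaves x, y are adjacent iff (label(x), label(y)) lies in the relation
    attached to their least common ancestor."""
    def ancestors(v):   # path from v up to the root, starting with v itself
        path = [v]
        while parent.get(path[-1]) is not None:
            path.append(parent[path[-1]])
        return path

    edges = set()
    for x, y in combinations(sorted(leaf_label), 2):
        above_x = set(ancestors(x))
        lca = next(v for v in ancestors(y) if v in above_x)
        if (leaf_label[x], leaf_label[y]) in node_relation[lca]:
            edges.add(frozenset((x, y)))
    return edges

def bicomplement_model(n):
    """Depth-2 model: root -> middle node m_i -> leaves a_i, b_i."""
    parent = {"root": None}
    leaf_label = {}
    node_relation = {"root": {("a", "b"), ("b", "a")}}
    for i in range(1, n + 1):
        parent[f"m{i}"] = "root"
        node_relation[f"m{i}"] = set()      # a_i and b_i stay non-adjacent
        for side in "ab":
            parent[f"{side}{i}"] = f"m{i}"
            leaf_label[f"{side}{i}"] = side
    return parent, leaf_label, node_relation

edges = graph_from_connection_model(*bicomplement_model(3))
```

The model uses only two labels and depth 2 regardless of the order of the matching, which is what makes this family a class of bounded shrubdepth.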
We can naturally extend the definition above to structures with unary functions by regarding each unary function $f$ as a binary relation selecting all pairs $(x, f(x))$.
A class $\mathcal{C}$ of graphs has bounded shrubdepth if there is a number $h$ and a finite set of labels $\Lambda$ such that every graph $G \in \mathcal{C}$ has a connection model of depth at most $h$ using labels from $\Lambda$.
Shrubdepth can be equivalently defined in terms of another graph parameter, as follows. Given a graph $G$ and a set of vertices $W \subseteq V(G)$, the graph obtained by flipping the adjacency within $W$ is the graph with vertices $V(G)$ and edge set which is the symmetric difference of the edge set of $G$ and the edge set of the clique on $W$.
The subset-complementation depth, or SC-depth, of a graph is defined inductively as follows:
a graph with one vertex has SC-depth $0$, and
a graph $G$ has SC-depth at most $d$, where $d \geq 1$, if there is a set of vertices $W \subseteq V(G)$ such that in the graph obtained from $G$ by flipping the adjacency within $W$ all connected components have SC-depth at most $d - 1$.
A star has SC-depth at most $2$: flipping the adjacency within the set consisting of the vertices of degree $1$ yields a clique, which in turn has SC-depth at most $1$.
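The inductive definition translates directly into a brute-force test, sketched below for very small graphs only (it tries every vertex subset at every level). Edges are encoded as 2-element frozensets; this encoding and the names `flip`, `components`, and `sc_depth_at_most` are assumptions for illustration.

```python
from itertools import combinations

def flip(edges, W):
    """Symmetric difference of the edge set with the clique on W."""
    return edges ^ {frozenset(p) for p in combinations(W, 2)}

def components(vertices, edges):
    """Connected components, each returned as a set of vertices."""
    remaining, comps = set(vertices), []
    while remaining:
        stack, comp = [remaining.pop()], set()
        while stack:
            v = stack.pop()
            comp.add(v)
            for e in edges:
                if v in e:
                    (w,) = e - {v}
                    if w in remaining:
                        remaining.discard(w)
                        stack.append(w)
        comps.append(comp)
    return comps

def sc_depth_at_most(vertices, edges, d):
    """Decide by exhaustive search whether the graph has SC-depth at most d."""
    vertices = frozenset(vertices)
    if len(vertices) == 1:
        return True
    if d == 0:
        return False
    for k in range(len(vertices) + 1):
        for W in combinations(sorted(vertices), k):
            flipped = flip(edges, W)
            if all(sc_depth_at_most(c, {e for e in flipped if e <= c}, d - 1)
                   for c in components(vertices, flipped)):
                return True
    return False

# The star with center 0 and leaves 1, 2, 3, and the triangle on 0, 1, 2.
star = {frozenset((0, i)) for i in (1, 2, 3)}
triangle = {frozenset(p) for p in combinations(range(3), 2)}
```

On these inputs the search confirms the bound for the star with `d = 2`, rejects `d = 1` for it (a single flip would have to turn the star into an edgeless graph, which requires the star to be a clique), and accepts the triangle with `d = 1`.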
The notion of SC-depth leads to a natural notion of decompositions. An SC-decomposition of a graph $G$ of SC-depth at most $d$ is a rooted tree $T$ of depth $d$ with leaf set $V(G)$, equipped with unary predicates $W_1, \ldots, W_d$ on the leaves. Each child $s$ of the root in $T$ corresponds to a connected component $C_s$ of the graph obtained from $G$ by flipping the adjacency within $W_1$, such that the subtree of $T$ rooted at $s$, together with the unary predicates $W_2, \ldots, W_d$ restricted to $V(C_s)$, form an SC-decomposition of $C_s$.
We will make use of the following properties, where the first one follows from the definition of shrubdepth, and the remaining ones follow from .
Let $\mathcal{C}$ be a class of graphs. Then:
If $\mathcal{C}$ has bounded shrubdepth then the class of all induced subgraphs of graphs from $\mathcal{C}$ also has bounded shrubdepth.
$\mathcal{C}$ has bounded shrubdepth if and only if for some $d$ all graphs in $\mathcal{C}$ have SC-depth at most $d$.
If $\mathcal{C}$ has bounded treedepth then $\mathcal{C}$ has bounded shrubdepth.
If $\mathcal{C}$ has bounded shrubdepth and $\mathsf{I}$ is a transduction that outputs colored graphs, then $\mathsf{I}(\mathcal{C})$ has bounded shrubdepth.
It is well-known (see ) that in the absence of large bi-cliques (complete bipartite graphs) a graph of bounded cliquewidth has in fact bounded treewidth. The same holds also for shrubdepth and treedepth. The lemma is proved by an easy induction on the depth of the connection models.
Lemma 7.
A class $\mathcal{C}$ of graphs has bounded treedepth if and only if graphs in $\mathcal{C}$ have bounded shrubdepth and exclude some fixed bi-clique as a subgraph.
2.3 Bounded expansion
A graph $H$ is a depth-$r$ minor of a graph $G$ if $H$ can be obtained from a subgraph of $G$ by contracting mutually disjoint connected subgraphs of radius at most $r$. A class $\mathcal{C}$ of graphs has bounded expansion if there is a function $f \colon \mathbb{N} \to \mathbb{N}$ such that $|E(H)| / |V(H)| \leq f(r)$ for every $r \in \mathbb{N}$ and every depth-$r$ minor $H$ of a graph from $\mathcal{C}$. Examples include the class of planar graphs, or any class of graphs with bounded maximum degree.
We will use the following lemma.
Let $\mathcal{C}$ be a class of (colored) graphs of bounded expansion and let $\mathsf{C}$ be a copy operation. Then $\mathsf{C}(\mathcal{C})$ is a class of colored graphs of bounded expansion.
Let $G \in \mathcal{C}$. The Gaifman graph of $\mathsf{C}(G)$ is a subgraph of the so-called lexicographic product of $G$ with $K_2$, i.e., of the graph constructed from $G$ by replacing every vertex with two clones of it. It is known that if a class of graphs has bounded expansion, then the class of lexicographic products of graphs from this class with any fixed graph also has bounded expansion; see e.g., [31, Proposition 4.6].
The connection between treedepth and graph classes of bounded expansion can be established via $p$-treedepth colorings. For an integer $p$, a function $c \colon V(G) \to C$ is a $p$-treedepth coloring if, for every $i \leq p$ and set $X \subseteq C$ with $|X| = i$, the induced graph $G[c^{-1}(X)]$ has treedepth at most $i$. A graph class $\mathcal{C}$ has low treedepth colorings if for every $p \in \mathbb{N}$ there is a number $N_p$ such that for every $G \in \mathcal{C}$ there exists a $p$-treedepth coloring $c \colon V(G) \to C$ with $|C| \leq N_p$.
Theorem 9.
A class of graphs has bounded expansion if, and only if, it has low treedepth colorings.
3 Main results
In this section we introduce two notions which generalize the concept of bounded expansion. Then we state the main results and outline the proof. First, we introduce classes of structurally bounded expansion. This notion arises from closing bounded expansion graph classes under transductions.
A class $\mathcal{C}$ of graphs has structurally bounded expansion if there exists a class $\mathcal{D}$ of graphs of bounded expansion and a transduction $\mathsf{I}$ such that $\mathcal{C} \subseteq \mathsf{I}(\mathcal{D})$.
The second notion, low shrubdepth covers, arises from the low treedepth coloring characterization of bounded expansion (see theorem 9) by replacing treedepth by its dense counterpart, shrubdepth. For convenience, we formally define this in terms of covers.
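A cover of a graph $G$ is a family $\mathcal{U}_G$ of subsets of $V(G)$ such that $\bigcup \mathcal{U}_G = V(G)$. A cover $\mathcal{U}_G$ is a $p$-cover, where $p \in \mathbb{N}$, if every set of at most $p$ vertices is contained in some $U \in \mathcal{U}_G$. If $\mathcal{C}$ is a class of graphs, then a ($p$-)cover of $\mathcal{C}$ is a family $\mathcal{U} = (\mathcal{U}_G)_{G \in \mathcal{C}}$, where $\mathcal{U}_G$ is a ($p$-)cover of $G$. The cover $\mathcal{U}$ is finite if $\sup \{ |\mathcal{U}_G| : G \in \mathcal{C} \}$ is finite. Let $\mathcal{C}[\mathcal{U}]$ denote the class of graphs $\{ G[U] : G \in \mathcal{C},\ U \in \mathcal{U}_G \}$. We say that the cover $\mathcal{U}$ has bounded treedepth (respectively, bounded shrubdepth) if the class $\mathcal{C}[\mathcal{U}]$ has bounded treedepth (respectively, shrubdepth).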
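Let $\mathcal{T}$ be the class of trees and let $p \in \mathbb{N}$. We construct a finite $p$-cover $\mathcal{U}$ of $\mathcal{T}$ which has bounded treedepth. Given a rooted tree $T$, let $\mathcal{U}_T = \{U_0, \ldots, U_p\}$, where $U_i$ is the set of vertices of $T$ whose depth is not congruent to $i$ modulo $p + 1$. Note that $T[U_i]$ is a forest of height at most $p$, and that $\mathcal{U}_T$ is a $p$-cover of $T$: any set of at most $p$ vertices misses at least one of the $p + 1$ residues. Hence $\mathcal{U}$ is a finite $p$-cover of $\mathcal{T}$ of bounded treedepth.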
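The cover from this example can be checked mechanically on a small tree. In the sketch below the tree is given only through a dictionary of vertex depths, and the names `depth_cover` and `is_p_cover` are hypothetical; the check enumerates all small vertex sets, so it is meant for tiny inputs.

```python
from itertools import combinations

def depth_cover(depth, p):
    """Return the p + 1 sets of vertices whose depth avoids a fixed residue
    modulo p + 1."""
    return [{v for v, d in depth.items() if d % (p + 1) != i} for i in range(p + 1)]

def is_p_cover(vertices, cover, p):
    """Check that every nonempty set of at most p vertices lies in some cover set."""
    return all(any(set(S) <= U for U in cover)
               for k in range(1, p + 1)
               for S in combinations(vertices, k))

# A path on 7 vertices rooted at one end, so vertex v has depth v.
depth = {v: v for v in range(7)}
cover = depth_cover(depth, 2)
```

With parameter 2 the construction yields three sets, and every pair of vertices lies in one of them. The same three sets do not form a 3-cover: the vertices at depths 0, 1, 2 hit all three residues, which illustrates why the modulus must grow with the parameter.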
In analogy to low treedepth colorings, we can now characterize graph classes of bounded expansion using covers. We say that a class $\mathcal{C}$ of graphs has low treedepth covers if for every $p \in \mathbb{N}$ there is a finite $p$-cover $\mathcal{U}$ of $\mathcal{C}$ with bounded treedepth. The following lemma follows easily from theorem 9.
Lemma 13.
A class of graphs has bounded expansion if, and only if, it has low treedepth covers.
We now define the second notion generalizing the concept of bounded expansion. The idea is to use low shrubdepth covers instead of low treedepth covers.
A class $\mathcal{C}$ of graphs has low shrubdepth covers if for every $p \in \mathbb{N}$ there is a finite $p$-cover $\mathcal{U}$ of $\mathcal{C}$ with bounded shrubdepth.
Theorem 15.
A class of graphs has structurally bounded expansion if, and only if, it has low shrubdepth covers.
As a byproduct of our proof of theorem 15 we obtain the following quantifier-elimination result, which we believe is of independent interest.
Theorem 16.
Let $\mathcal{C}$ be a class of colored graphs which has low shrubdepth covers. Then every transduction $\mathsf{I}$ is equivalent to some almost quantifier-free transduction $\mathsf{J}$ on $\mathcal{C}$.
We start with the following lemma, which intuitively shows that covers commute with almost quantifier-free transductions.
Lemma 17.
If a class $\mathcal{C}$ of graphs has low shrubdepth covers and $\mathsf{I}$ is an almost quantifier-free transduction that outputs colored graphs, then $\mathsf{I}(\mathcal{C})$ also has low shrubdepth covers.
The idea is that for any almost quantifier-free transduction $\mathsf{I}$ there is a constant $c$ such that any induced substructure of $\mathsf{I}(G)$ on $p$ elements depends only on an induced substructure of $G$ of size $c \cdot p$. In particular, a $(c \cdot p)$-cover of $G$ induces a $p$-cover of $\mathsf{I}(G)$. Moreover, as having bounded shrubdepth is preserved by transductions, a low shrubdepth cover of $G$ induces a low shrubdepth cover of $\mathsf{I}(G)$. The details are presented in section 4.
The main novel ingredient in our proof of theorem 15 and theorem 16 is the following result, which intuitively states that classes with low shrubdepth covers are bi-definable with classes of bounded expansion, using almost quantifier-free transductions.
Suppose is a class of graphs with low shrubdepth covers. Then there is a pair of transductions and , where is almost quantifier-free and is deterministic almost quantifier-free, such that is a class of colored graphs of bounded expansion and for each .
Clearly, proposition 18 implies that has structurally bounded expansion, since it can be obtained as a result of transduction to a class of bounded expansion. Thus, the right-to-left implication of theorem 15 is a corollary of the proposition. The proof of proposition 18 is presented in section 5. We sketch the rough idea below.
First, in lemma 31 of section 5.2, we prove the special case where is a class of graphs of bounded shrubdepth, and for those we prove bi-definability with classes of trees of bounded depth. In particular, if is a class of graphs of bounded shrubdepth, then there is a pair of almost quantifier-free transductions such that is a class of colored trees of bounded depth and such that for all . Lemma 31 is the combinatorial core of this paper.
To prove proposition 18, we lift lemma 31 to the general case using covers, as follows. Let be a class with low shrubdepth covers and let be a -cover of of bounded shrubdepth, and let be such that for . We apply the bounded shrubdepth case to the class , yielding almost quantifier-free transductions and as above. The transduction works as follows: given a graph , introduce unary predicates marking the cover of , and for each , apply to the induced subgraph of , yielding a colored tree . Define as the union of the trees , for . As is a -cover of , is the union of the induced graphs for . As each graph can be recovered from the tree using the inverse transduction , it follows that can be recovered from the union . This yields the inverse transduction such that . As is almost quantifier-free by construction, it follows from lemma 17 that is a class with low shrubdepth covers. Moreover, each graph in is a union of at most trees, so it does not contain as a subgraph. It follows from lemma 7 that the low shrubdepth cover of is in fact a low treedepth cover. Hence, has low treedepth covers, i.e., has bounded expansion.
Let be a class of graphs of bounded expansion and let be a transduction. Then is equivalent to an almost quantifier-free transduction on .
We note that proposition 19 is a strengthening of similar statements provided by Dvořák et al. and by Grohe and Kreutzer, and could be derived by a careful analysis of their proofs. In section 6 we provide a self-contained proof, which we believe is simpler than the previous proofs, and is sketched below.
We use the characterization of bounded expansion classes as those which have low treedepth covers. We first prove proposition 19 for forests of bounded depth. This can be handled by a direct (although slightly cumbersome) combinatorial argument, similarly as in . In Appendix F.2 we present an argument using tree automata.
The statement for classes of forests of bounded depth then easily lifts to classes of bounded treedepth. Here we use the fact that in a graph of bounded treedepth it is possible to encode a depth-first search forest of bounded depth, by using unary predicates marking the depth of each node in the spanning forest.
We then lift the result from classes of bounded treedepth using covers. Specifically, suppose for simplicity that the transduction is a single extension operation, parametrized by a formula . We then proceed by induction on the structure of the formula and show that it can be replaced by a quantifier-free formula, at the cost of introducing unary functions defined by an almost quantifier-free transduction.
In the inductive step, the only nontrivial case is the one of existential quantification, i.e., of formulas of the form
where may be assumed to be a quantifier-free formula involving unary functions, by inductive assumption. We consider a -cover of where is a constant such that there are at most different terms occurring in . Since has bounded expansion, we may assume that the cover has bounded treedepth, and that there is a constant such that for all . For a fixed graph , the existentially quantified variable must be in one of the sets . Therefore, the formula is equivalent to a disjunction of at most formulas , for , where each formula performs existential quantification restricted to the th set in (where is ordered arbitrarily). By the special case of the proposition proved for classes of bounded treedepth, is equivalent to a quantifier-free formula on (the quantifier-free formula uses unary functions introduced by almost quantifier-free transductions). Summing up, is equivalent on to a disjunction of quantifier-free formulas involving unary functions that are introduced by almost quantifier-free transductions. This completes the inductive step.
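The case split used in the inductive step can be written as a single equivalence. The notation below is hypothetical, since the symbols of the original are not reproduced in this text: $\psi(\bar x, y)$ stands for the quantifier-free formula under the quantifier, $U_1, \dots, U_N$ for the sets of the cover of a fixed graph $G$, and $\psi_i$ for the formula that quantifies only over the $i$-th set.

```latex
\[
  G \models \exists y\, \psi(\bar x, y)
  \quad\Longleftrightarrow\quad
  G \models \bigvee_{i=1}^{N} \psi_i(\bar x),
  \qquad
  \psi_i(\bar x) \;:=\; \exists y \in U_i\;\, \psi(\bar x, y).
\]
```

The equivalence only uses the fact that the sets of the cover jointly contain every vertex of $G$; the choice of the cover parameter matters afterwards, when each $\psi_i$ is handled by the bounded treedepth case.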
Proof (of theorem 15).
Let be a class of bounded expansion and let be a transduction that outputs colored graphs. We show that has low shrubdepth covers.
Proof (of theorem 16).
It remains to provide the details of the proofs of lemma 17, proposition 18 and proposition 19. This is done in section 4, section 5 and section 6, respectively. After that, in section 7 we conclude with a preliminary algorithmic result concerning the model-checking problem for first-order logic on classes with structurally bounded expansion.
4 Proof of lemma 17 (almost quantifier-free transductions commute with covers)
In this section we prove lemma 17, which we restate for convenience.
If a class of graphs has low shrubdepth covers and is an almost quantifier-free transduction that outputs colored graphs, then also has low shrubdepth covers.
We start with formulating the following lemma which states that almost quantifier-free transductions are, in a certain sense, local.
For every deterministic almost quantifier-free transduction there is a constant such that the following holds. For every structure and every element of there is a set of size at most such that for any sets with and , if , then
In order to prove the lemma, we define the following notions of dependency and support.
Suppose that is a term. For a structure carrying partial functions , we say that an element -depends with respect to on itself and all elements of the form for , whenever defined. For a quantifier-free formula , an element -depends on all elements on which -depends, for any term appearing in . For an element , the set of elements on which -depends in will be denoted by ; note that the size of this set is always bounded by a constant depending only on . Observe also that given elements , to check whether holds in it suffices to check whether it holds in the substructure of induced by all elements on which -depend.
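The notion of dependency can be illustrated on a toy structure. The following is a minimal sketch, assuming that a term is a composition of unary partial functions and that an element depends on the values of all its subterms; the representation of terms as lists of function names, and the names `parent` and `mate`, are hypothetical and chosen only for illustration.

```python
def dependency_set(functions, term, v):
    """Elements on which v depends for `term`, a list of unary function names
    applied left to right: v, f1(v), f2(f1(v)), ...; stops once a value is undefined."""
    deps = [v]
    current = v
    for name in term:
        current = functions[name].get(current)  # partial function stored as a dict
        if current is None:
            break
        deps.append(current)
    return deps

# A toy structure with two partial unary functions (hypothetical names).
functions = {
    "parent": {2: 1, 3: 1, 4: 2},
    "mate": {1: 3},
}
term = ["parent", "parent", "mate"]  # reads as mate(parent(parent(x)))

deps_of_4 = dependency_set(functions, term, 4)
deps_of_3 = dependency_set(functions, term, 3)
```

In this representation the dependency set has at most one element more than the number of function symbols in the term, which matches the remark that its size is bounded by a constant depending only on the term.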
With the auxiliary notion of dependency defined we can come to the definition of support.
Suppose is a deterministic almost quantifier-free transduction, and let be an input structure. For an element and a subset , we now define what it means that is -supported by . We first define this for atomic operations (note that unary lifts are excluded since is assumed to be deterministic):
If is a reduct operation or a copy operation, then is -supported by if and only if .
If is a restriction or an extension operation, say parameterized by a formula , then is -supported by if and only if .
Suppose is a function extension operation, say introducing a partial function using a binary formula . Then is -supported by if and only if and the following holds:
if there exists exactly one for which holds, then .
if there are at least two elements for which holds, then for at least two distinct such elements .
Finally, for non-atomic deterministic almost quantifier-free transductions the notion of -supporting is defined by induction on the structure of the transduction. Suppose is the composition of two transductions. Then is -supported by if there exists a subset and, for each , a subset such that is -supported by and each is -supported by .
The notion of supporting is trivially closed under taking supersets: if is -supported by , then is also -supported by any superset of .
Proof (of lemma 20).
By induction on the definition of an almost quantifier-free transduction it is easy to see that for every there is a set such that is -supported by and is bounded by a constant, possibly depending on .
By induction we also observe that if and are such that every is -supported by then
This proves the lemma.
We can now prove lemma 17.
Proof (of lemma 17).
Let be a class with low shrubdepth covers and let be an almost quantifier-free transduction that outputs colored graphs. We show that has low shrubdepth covers. By normalizing as described in lemma 3, we may assume that is of the form , where is a sequence of unary lifts and is deterministic almost quantifier-free. As has low shrubdepth covers, the class also has low shrubdepth covers (this is implied by proposition 6(4)). Moreover, . Therefore, it suffices to focus on the deterministic almost quantifier-free transduction applied to the class . Note that is a class of colored graphs, i.e., graphs with unary predicates on their vertices.
Let be the constant provided by lemma 20 for the transduction . We need to find, for every , a finite -cover of of bounded shrubdepth, so let us fix . Let be a finite -cover of of bounded shrubdepth. For a graph and , let be the set of those elements of such that , where is as obtained from lemma 20 applied to the deterministic almost quantifier-free transduction .
Define a cover of by letting
Clearly , so is finite as well. We need to verify that is a -cover and that it has bounded shrubdepth.
To see that is a -cover, take any elements of . Let . Then , hence there exists with . We conclude that .
To see that is a bounded shrubdepth cover, observe that by assumption has bounded shrubdepth, hence by proposition 6(4) we find that also has bounded shrubdepth. By lemma 20, for each and , the induced substructure is equal to . Now it suffices to note that , hence belongs to the hereditary closure of , which also has bounded shrubdepth by proposition 6(1).
5 Proof of proposition 18 (bi-definability of classes with low shrubdepth covers and classes of bounded expansion)
In this section we prove proposition 18, which we repeat for convenience.
Suppose is a class of graphs with low shrubdepth covers. Then there is a pair of transductions and , where is almost quantifier-free and is deterministic almost quantifier-free, such that is a class of colored graphs of bounded expansion and for each .
Clearly, proposition 22 implies that has structurally bounded expansion, since it can be obtained as a result of transduction to a class of bounded expansion. Thus, the right-to-left implication of theorem 15 is a corollary of the proposition.
The idea of the proof of proposition 22 is as follows. We first prove in lemma 23 of section 5.1 that connected components in graphs of bounded shrubdepth are definable by almost quantifier-free transductions. We use lemma 23 to first prove proposition 22 for the special case where is a class of graphs of bounded shrubdepth, and for those we prove bi-definability with classes of trees of bounded depth. This is done in lemma 31 of section 5.2. Then, we conclude the general case in section 5.3, by lifting lemma 31 using covers.
5.1 Defining connected components in graphs of bounded shrubdepth
The following lemma is the combinatorial core of our proof of proposition 22.
Let be a class of graphs of bounded shrubdepth. There is an almost quantifier-free transduction such that for a given , every output of on is equal to enriched by a function such that if and only if and are in the same connected component of .
We first introduce the notions of guidance systems and of functions guided or guidable by them. This is a combinatorial abstraction for functions computable by almost quantifier-free transductions.
Let be a graph. A guidance system in is any family of subsets of the vertex set of . The size of a guidance system is the cardinality of the family . We say that a partial function is guided by the guidance system if for every for which is defined and different from , there is some such that is the unique neighbor of in . Finally, a partial function is -guidable, where , if there is a guidance system of size at most in such that is guided by .
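The definition of guidance can be turned into a direct check. The following is a minimal sketch in which graphs are dictionaries of adjacency sets and partial functions are dictionaries; the function name `is_guided` and the example graph are illustrative assumptions, not notation from the paper.

```python
def is_guided(adj, f, system):
    """True if the partial function f (a dict) is guided by the family `system` of vertex sets.

    For every v with f(v) defined and f(v) != v, some U in `system` must contain
    f(v) as the unique neighbor of v, i.e. N(v) & U == {f(v)}.
    """
    for v, w in f.items():
        if w == v:
            continue
        if not any(adj[v] & U == {w} for U in system):
            return False
    return True

# Path a - b - c - d, given by adjacency sets (an arbitrary test input).
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
f = {"a": "b", "b": "c", "c": "d", "d": "d"}  # move one step towards d

guided_by_two_sets = is_guided(adj, f, [{"b"}, {"c", "d"}])
guided_by_one_set = is_guided(adj, f, [{"a", "b", "c", "d"}])
```

On this path the function is guided by a system of size two, whereas the single set of all vertices fails because the vertex b has two neighbors in it.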
Observe that an -guidable partial function maps each vertex from its domain to a vertex in the same connected component as . The following lemmas will be useful for operating on guidable functions.
Lemma 24 ().
Let be a graph and suppose is a partial function such that the restriction of to each connected component of is -guidable. Then is -guidable.
Lemma 25 ().
Let be a graph and let be partial functions, where is -guidable for each . If is a partial function such that for every there is some such that , then is -guidable.
Finally, guidable functions can be computed using almost quantifier-free transductions.
Lemma 26 ().
Let be a class of graphs and let be fixed. Suppose that each is equipped with an -guidable function . Then there exists an almost quantifier-free transduction which given has exactly one output: the graph enriched with .
We will use the following fact stating that graphs of bounded shrubdepth do not admit long induced paths.
Lemma 27 ().
For every class of graphs of bounded shrubdepth there exists a constant such that no graph from contains a path on more than vertices as an induced subgraph. Consequently, for every graph every connected component of has diameter at most .
For a graph and a function , we say that defines a spanning forest of depth on if is guarded by and the -fold composition is constant when restricted to each connected component of . In particular, two vertices are in the same connected component of if and only if .
The following lemma states that guidance systems can define shallow spanning forests in graph classes of bounded shrubdepth.
For every class of graphs of bounded shrubdepth there exist constants such that for every there is a function which is -guidable as a partial function on and defines a spanning forest of depth on .
Proof (of lemma 23).
By lemma 26, there is an almost quantifier-free transduction which, given a graph on input, constructs the function obtained from lemma 28. Now let be the -fold composition of . Clearly, can be computed by an almost quantifier-free transduction using a single function extension operation, making use of the function constructed by . As is constant on every connected component of , lemma 23 follows.
It remains to prove lemma 28.
Constructing guidable choice functions.
Lemma 28 will follow easily from the fact that connected components of graphs of bounded shrubdepth have bounded diameter by lemma 27, and from the following lemma, essentially stating that every total binary relation whose graph has bounded shrubdepth contains a guidable choice function.
For every class of graphs of bounded shrubdepth there exists a constant such that the following holds. Suppose and and are two disjoint subsets of vertices of such that every vertex of has a neighbor in . Then there is a function which is -guidable as a partial function on .
We found two conceptually different proofs of this result. We believe that both proofs describe complementary viewpoints on the problem, so we present both of them. To keep the presentation concise, in the main body of the paper we give only one proof, using the characterization of classes of bounded shrubdepth using connection models, and their close connection to bi-cographs. We present the second proof in Appendix D.2, which provides an explicit greedy procedure leading to the construction of .
We first prove a special case of lemma 29 for graphs which have a connection model using two different labels and , where one part of has label and the other part has label . Such graphs are called bi-cographs (cf. ).
Let be a bi-cograph with parts and with a connection model of height where vertices in have label and vertices in have label . Suppose further that every vertex in has a neighbor in . Then there is a function which is -guidable as a partial function on .
By lemma 24, it is enough to consider the case when is connected. Let be the assumed connection model of height .
We prove that there is an -guidable function . The proof proceeds by induction on . The base case, when is trivial, because then every vertex of is adjacent to every vertex of , so picking any the function which maps every to is guided by the guidance system consisting only of .
In the inductive step, assume that and the statement holds for height . Since is connected, either the label of the root contains the pair , or has only one child . In the latter case, the subtree of rooted at is a connection model of of height , so the conclusion holds by inductive assumption. Hence, we assume that .
Let be the set of bipartite induced subgraphs of such that is defined by the connection model rooted at some child of in . As , it follows that if are two distinct graphs, then every vertex with label in is connected to every vertex with label in . We consider two cases, depending on whether contains more than one graph containing a vertex with label , or not.
In the first case, there are at least two graphs such that and both contain a vertex with label . Pick and , both with label . Then every vertex in is adjacent either to or to . Let be a function which maps a vertex to if is adjacent to , and to otherwise. Then is guided by the guidance system consisting of and .
In the second case, there is only one graph which contains a vertex with label . Pick an arbitrary vertex with label in . Notice that every vertex in is adjacent to . The graph has a connection model of height , so by inductive assumption, there is a guidance system of size at most and a function which is guided by . Then the function which extends by mapping every vertex in to is guided by . In either case, we have constructed a -guidable function , as required.
We now prove lemma 29 in the general case.
Proof (of lemma 29).
Let be a class of graphs of bounded shrubdepth. Hence, there is a finite set of labels and a number such that every graph has a connection model of height using labels from . For , let denote the set of vertices of which are labeled .
Define a function as follows: for every vertex define as , where is the label of , and is an arbitrary label such that has a neighbor in with label .
For every pair of labels , consider the bipartite graph which is the subgraph of consisting of on one side and on the other side, and all edges between these sets; note that they are disjoint, as one is contained in and the other in . Observe that is a bi-cograph with a connection model of height , such that every vertex in has a neighbor in . By lemma 30 there is a function which is -guidable in . Observe that is also -guidable when treated as a partial function on ; it suffices to take the same guidance system, but with all its sets restricted to .
Constructing guidable spanning forests.
We are ready to complete the proof of lemma 28 stating that shallow spanning forests on classes of bounded shrubdepth are definable by guidance systems.
Proof (of lemma 28).
Let be a class of graphs of bounded shrubdepth, and let and be constants provided by lemma 27 and lemma 29, respectively, for the class . Let be a set of vertices which contains exactly one vertex in each connected component of . By lemma 27, we may assume that every vertex in is at distance at most from a unique vertex in . For , let be the set of vertices of whose distance to some vertex in is equal to . Then the sets form a partition of the vertex set of . Furthermore, observe that for , every vertex of has a neighbor in .
Fix a number . Apply lemma 29 to as and as . This yields a function which is -guidable in . In particular, is also a -guidable partial function . Let be a partial function from to that fixes every vertex of and is undefined otherwise. Then is guided by the guidance system , hence it is -guidable in .
Consider now the function such that for , if is defined for some . By the first item of lemma 25 we find that is -guidable. By construction, is guarded, and maps every vertex to the unique vertex in which lies in the connected component of . This proves that defines a spanning forest of depth on .
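The layering argument of this proof can be sketched as follows. This is only an illustration of the combinatorial skeleton, assuming that "guarded" means each vertex is mapped to itself or to a neighbor: it picks one root per component, maps every other vertex to a neighbor one layer closer to its root, and checks that iterating the function identifies connected components. It does not address guidability, which is the part supplied by lemma 29. All names and the test graph are hypothetical.

```python
from collections import deque

def layered_parent_function(adj, roots):
    """Map each root to itself and every other vertex to a neighbor one BFS layer closer to a root.

    Returns the function as a dict together with the largest layer index.
    """
    f = {r: r for r in roots}
    dist = {r: 0 for r in roots}
    queue = deque(roots)
    while queue:
        u = queue.popleft()
        for w in sorted(adj[u]):
            if w not in dist:
                dist[w] = dist[u] + 1
                f[w] = u  # u lies in the previous layer and is adjacent to w
                queue.append(w)
    return f, max(dist.values())

def iterate(f, v, r):
    """Apply f to v exactly r times."""
    for _ in range(r):
        v = f[v]
    return v

# Two components: the path 0-1-2-3 and the triangle 4-5-6 (an arbitrary test input).
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}, 4: {5, 6}, 5: {4, 6}, 6: {4, 5}}
roots = [0, 4]  # exactly one vertex per connected component
f, r = layered_parent_function(adj, roots)
top = {v: iterate(f, v, r) for v in adj}
```

Two vertices receive the same value in `top` exactly when they lie in the same connected component, which mirrors the way the spanning forest is used to define components.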
This completes the proof of lemma 23.
5.2 proposition 22 for classes of bounded shrubdepth
In this section, we prove proposition 22 in the special case when is a class of graphs of bounded shrubdepth:
Let be a class of graphs of bounded shrubdepth. Then there is a class of colored trees of bounded height and a pair of transductions and such that is almost quantifier-free, is deterministic almost quantifier-free, , , and
Moreover, for any , every is an SC-decomposition of .
We remark that in lemma 31, every output of the transduction is an SC-decomposition of the input graph of bounded depth, whereas the transduction recovers the graph from its SC-decomposition.
In other words, the lemma allows one to construct the SC-decomposition of a graph from a class of graphs of bounded shrubdepth using an almost quantifier-free transduction. This argument is the combinatorial cornerstone of our approach. Conceptually, it shows that bounded-height decompositions of graphs from classes of bounded shrubdepth can be defined in a very weak logic, as essentially the whole information about the decomposition can be pushed to unary predicates on vertices (added using unary lifts), and from this information the decomposition can be reconstructed using only deterministic almost quantifier-free formulas.
We need one more auxiliary lemma, which allows us to apply a transduction in parallel to a disjoint union of structures. Suppose is a set of structures over the same signature. The bundling of is a structure obtained by taking the disjoint union of the structures in , extended with a set disjoint from and a function