Shrubdepth: Capturing Height of Dense Graphs
Abstract
The recent increase of interest in the graph invariant called treedepth and in its applications in algorithms and logic on graphs led to a natural question: is there an analogously useful “depth” notion also for dense graphs (say, one which is stable under graph complementation)? To this end, a new notion of shrubdepth was introduced in a 2012 conference paper; it is related to the established notion of cliquewidth in a similar way as treedepth is related to treewidth. Since then shrubdepth has been successfully used in several research papers. Here we provide an in-depth review of the definition and basic properties of shrubdepth, and we focus on its logical aspects, which turned out to be most useful. In particular, we use shrubdepth to give a characterization of the lower levels of the MSO transduction hierarchy of simple graphs.
Logical Methods in Computer Science, Vol. 15(1:7). Submitted Jul. 19, 2017; published Jan. 31, 2019.
Robert Ganian, Petr Hliněný, Jaroslav Nešetřil, Jan Obdržálek, Patrice Ossona de Mendez
1 Introduction
In this paper, we are interested in a structural graph parameter that is intermediate between cliquewidth and treedepth, sharing the nice properties of both. Cliquewidth, introduced by Courcelle et al. in [6, 8], is the older of the two notions. In several aspects, the theory of graphs of bounded cliquewidth is similar to that of bounded treewidth. Indeed, bounded treewidth implies bounded cliquewidth. However, unlike for treewidth, graphs of bounded cliquewidth include arbitrarily large cliques and other dense graphs, and the value of cliquewidth does not change much when complementing the edge set of a graph. Cliquewidth is not closed under taking subgraphs or minors, only under taking induced subgraphs. As we will see later, cliquewidth is also closely related to trees and monadic second-order logic of graphs.
The notion of treedepth of a graph, coined by Nešetřil and Ossona de Mendez [31], is equivalent or similar to some older notions such as the vertex ranking number and the minimum height of an elimination tree [3, 9, 34]. Graphs of small treedepth are related to trees of small height, and they enjoy strong “finiteness” properties (finiteness of cores, existence of non-trivial automorphisms if the graph is large, well-quasi-ordering by subgraph inclusion). The treedepth notion received almost immediate attention, as it plays a central role in the theory of graph classes of bounded expansion [29, 30]. However, graphs of small treedepth are necessarily very sparse and the notion behaves badly with respect to, say, graph complementation.
Our search for a structural concept “between cliquewidth and treedepth” [20] was originally inspired by algorithmic considerations: graphs of bounded parameters such as cliquewidth allow efficient solvability of various problems which are difficult (e.g., NP-hard) in general, see e.g. [7, 13, 22, 21]. Highly regarded results in this area are those which, instead of solving one problem, give a solution to a whole class of problems (called algorithmic metatheorems). Perhaps the most famous result of this kind is Courcelle’s theorem [4], which states that every graph property $\varphi$ expressible in the $\mathrm{MSO}_2$ logic of graphs can be decided in time $f(\varphi, k)\cdot n$ on $n$-vertex graphs of treewidth $k$, where $f$ is a computable function, meaning that the problem is fixed-parameter tractable (FPT for short). For cliquewidth, a result similar to Courcelle’s theorem holds; $\mathrm{MSO}_1$ model checking is FPT on graphs parameterized by cliquewidth [7].
However, an issue with these results is that, as shown by Frick and Grohe [14] for MSO model checking on the class of all trees, the function $f$ of Courcelle’s algorithm is, unavoidably, non-elementary in the parameter $\varphi$ (unless P=NP). This raises the following question: are there interesting graph classes on which the runtime dependency on the formula is better? For instance, in 2010, Lampis [27] gave an FPT algorithm for MSO model checking on graphs of bounded vertex cover with elementary (doubly-exponential) dependence on the formula. Subsequently, in 2012, Gajarský and Hliněný showed [16] that there exists a linear-time FPT algorithm for $\mathrm{MSO}_2$ model checking of graphs of bounded treedepth, again with elementary dependence on the formula. Their result is essentially best possible, as shown soon after by Lampis [28]. In order to extend that result towards $\mathrm{MSO}_1$ model checking of (some classes of) dense graphs, one would first need to adjust the cliquewidth concept towards “bounded depth” (as with treedepth), which is not a simple task.
The aforementioned paper [16] was not the first one in the literature to explicitly raise the issue of restricting cliquewidth towards bounded depth. For example, in 2012, Elberfeld, Grohe and Tantau independently made the following remark regarding the expressive power of graph FO logic [12]: “One idea is to develop an adjusted notion of cliquewidth that has the same relation to cliquewidth as treedepth has to treewidth.” Our concept of shrubdepth [20] has thus also provided a quick positive answer to the question of [12]. Cliquewidth-like graph decompositions of limited depth have also been used as a tool by Blumensath and Courcelle in [2] (under the name “decompositions”). However, some of their technical results which may be interesting in our context have not been published anywhere.
In [20], two new structural depth parameters of graphs have been introduced: shrubdepth (Definition 3.1) and SCdepth (Definition 4), which are asymptotically equivalent to each other. Since their emergence these have been successfully used in several research papers, and shrubdepth in particular is a subject of ongoing interest in the finite model theory of graphs.
For instance, the aforementioned [16] (its full journal version, to be precise) has also extended $\mathrm{MSO}_2$ model checking tractability on graphs of bounded treedepth to $\mathrm{MSO}_1$ model checking on graph classes of bounded shrubdepth, again with an elementary runtime dependence on the checked formula. Furthermore, [16] has generalized the result of [12] to prove that the expressive power of FO and MSO is the same on classes of bounded shrubdepth.
In a recent paper by Gajarský, Kreutzer, Nešetřil, Ossona de Mendez, Pilipczuk, Siebertz and Toruńczyk [17], the concept of shrubdepth has been successfully used to obtain an analog of low treedepth decompositions for transductions of bounded expansion classes.
On another topic, Hliněný, Kwon, Obdržálek and Ordyniak [24] have shown that the treedepth and shrubdepth concepts of graphs are tightly related to each other via the so-called vertex-minors. Regarding alternative and generalized views of shrubdepth, DeVos, Kwon and Oum [unpublished], in ongoing work, elaborate on the concept of branchdepth of matroids, and prove that a derived new concept of rankdepth of graphs is asymptotically equivalent to shrubdepth.
Paper organization.
Since the core initial paper on shrubdepth [20] has appeared only as a short conference version, we take the opportunity here to give a detailed review of this concept and to provide full proofs of the results of [20], enhanced in light of the current state of the art. After preliminary definitions in Section 2, this overview of shrubdepth and its structural properties (such as Theorems 3.2, 3.3 and 3.4) constitutes Section 3 of this paper. The subsequent Section 4 focuses on logical aspects of shrubdepth, which have so far been of greatest interest, and presents our main results with their proofs. We start by proving that the concept of shrubdepth of a graph class is stable, meaning that the shrubdepth value does not grow, under MSO interpretations (Theorem 4.1) and also under non-copying MSO transductions (Theorem 4.2). From that we derive (Theorem 4.3) that the integer values of shrubdepth define the lower levels of the MSO transduction hierarchy of simple graphs, which partially answers an open question raised by Blumensath and Courcelle in [1]. We conclude with some remarks and open questions in Section 5.
2 Common Definitions
We assume the reader is familiar with the standard notation of graph theory. In particular, our graphs are finite, undirected and simple (i.e., without loops or multiple edges). For a graph $G$ we use $V(G)$ to denote its vertex set and $E(G)$ to denote the set of its edges. We write $G \simeq H$ to say that graphs $G$ and $H$ are isomorphic, and similarly we use $G \subseteq H$ to say that $G$ is a subgraph of $H$ (not necessarily induced). An isomorphism of a graph to itself is also called an automorphism. We will also use labelled graphs, where each vertex is assigned one or more of a fixed finite set of labels (in this case, isomorphism implicitly preserves the labels).
A forest is a graph without cycles, and a tree is a forest with a single connected component. We will consider mainly rooted forests (trees), in which every connected component has a designated vertex called the root. The height of a vertex $x$ in a rooted forest $F$ is the length of a path from the root (of the component of $F$ to which $x$ belongs) to $x$. The height of the rooted forest $F$ is the maximum height of the vertices of $F$. (There is a conflict in the literature about whether the height of a rooted tree should be measured by the “root-to-leaves distance” or by the “number of levels”, a difference of $1$ on finite trees. We adopt the convention that the height of a single-node tree is $0$, i.e., the former view.) Let $x, y$ be vertices of $F$. The vertex $x$ is an ancestor of $y$, and $y$ is a descendant of $x$, in $F$ if $x$ belongs to the path of $F$ linking $y$ to the corresponding root. If $x$ is an ancestor of $y$ and $xy \in E(F)$, then $x$ is called a parent of $y$, and $y$ is a child of $x$. We will also use the least common ancestor of $x$ and $y$ in $F$, that is, their common ancestor of maximum height.
2.1 Width and depth measures
The so-called width measures play an important role in structural graph theory and in its algorithmic applications. A prototypical width parameter is the treewidth of a graph [33], introduced by Robertson and Seymour together with the related pathwidth. We refer to [10] for missing definitions and basic properties.
The primary interest of our paper is in two other, seemingly unrelated, structural width measures which we define now.
Definition (Cliquewidth [6, 8]). A $k$-expression is an algebraic expression having the following four operations on vertex-labelled graphs using $k$ labels:
• create a new vertex with a single label $i$;
• take the disjoint union of two labelled graphs;
• add all edges between vertices of label $i$ and label $j$ ($i \neq j$); and
• relabel all vertices with label $i$ to label $j$.
The cliquewidth of a graph $G$ equals the minimum $k$ such that (some labelling of) $G$ is the value of a $k$-expression.
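To make the four operations concrete, the following is a minimal sketch of an evaluator for such expressions. The representation (a pair of a label dictionary and an edge set) and all function names are assumptions made for this illustration only; they are not taken from [6, 8].

```python
def vertex(name, label):
    """Create a one-vertex labelled graph, stored as (labels, edges)."""
    return ({name: label}, set())

def union(g, h):
    """Disjoint union of two labelled graphs (vertex names assumed distinct)."""
    labels = {**g[0], **h[0]}
    assert len(labels) == len(g[0]) + len(h[0]), "vertex names must be disjoint"
    return (labels, g[1] | h[1])

def join(g, i, j):
    """Add all edges between vertices labelled i and vertices labelled j."""
    assert i != j
    labels, edges = g
    new = {frozenset((u, v)) for u in labels for v in labels
           if labels[u] == i and labels[v] == j}
    return (labels, edges | new)

def relabel(g, i, j):
    """Relabel all vertices carrying label i to label j."""
    labels, edges = g
    return ({v: (j if l == i else l) for v, l in labels.items()}, edges)

def clique(n):
    """Evaluate an expression using only the labels 1 and 2 whose value is K_n."""
    g = vertex(0, 1)
    for v in range(1, n):
        g = union(g, vertex(v, 2))   # new vertex gets the temporary label 2
        g = join(g, 1, 2)            # connect it to all earlier vertices
        g = relabel(g, 2, 1)         # merge it into the label-1 group
    return g
```

Each step of `clique` unions in a single new vertex, so the expression tree is a caterpillar; the same two labels are reused for every `n`.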
Cliquewidth may be low even on graph classes for which the treewidth is unbounded, such as complete graphs or complete bipartite graphs (the cliquewidth of which is 2). Note that Definition 2.1 requires each vertex to carry only one label, while one can allow multiple labels as well. Another possible modification is to allow $i = j$ in the third step. Both these relaxations, while changing values of cliquewidth for some particular graphs, are nevertheless asymptotically equivalent to the standard cliquewidth notion of Definition 2.1.
One can, furthermore, define linear cliquewidth (see, e.g., [23]) which has the additional restriction that the union operator is allowed to take only a single vertex as the right-hand operand (i.e., the expression tree is a caterpillar; this is conceptually related to pathwidth).
A close alternative to cliquewidth is represented by the NLC classes introduced by Wanke [35]. $\mathrm{NLC}_k$ consists of all graphs that can be obtained from single vertices with single labels in $\{1, \dots, k\}$ using the following two operations:
• disjoint union of two graphs $G_1$ and $G_2$, with addition of all edges between vertices of $G_1$ with label $i$ and vertices of $G_2$ with label $j$ whenever $(i, j)$ belongs to a given fixed subset of $\{1, \dots, k\}^2$;
• relabelling of the vertices according to some map $\{1, \dots, k\} \to \{1, \dots, k\}$.
The NLCwidth $\mathrm{nlcw}(G)$ of a graph $G$ is the minimum $k$ such that the graph belongs to $\mathrm{NLC}_k$. It has been proved in [26] that the NLCwidth and the cliquewidth (cw) of a graph $G$ are related by $\mathrm{nlcw}(G) \le \mathrm{cw}(G) \le 2\,\mathrm{nlcw}(G)$.
Finally, we briefly mention that another graph measure asymptotically equivalent to cliquewidth is rankwidth [32]. Similarly, linear cliquewidth is asymptotically equivalent to linear rankwidth [18].
The second structural measure of our interest is treedepth.
Definition (Treedepth [31]). The closure $\mathrm{clos}(F)$ of a rooted forest $F$ is the graph obtained from $F$ by making every vertex adjacent to all of its ancestors. The treedepth $\mathrm{td}(G)$ of a graph $G$ is one more than the minimum height of a rooted forest $F$ such that $G \subseteq \mathrm{clos}(F)$.
Definition 2.1 is illustrated in Figure 1. For a proof of the following proposition, as well as for a more extensive study of treedepth, we refer the reader to [30].
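Proposition. Let $G$ and $H$ be graphs. Then the following are true:
a) If $H$ is a minor of $G$, then $\mathrm{td}(H) \le \mathrm{td}(G)$.
b) If $L$ is the length of a longest path in $G$, then $\lceil \log_2(L+2) \rceil \le \mathrm{td}(G) \le L+1$.
c) If $\mathrm{tw}(G)$ and $\mathrm{pw}(G)$ denote the treewidth and pathwidth of a graph $G$, then $\mathrm{tw}(G) \le \mathrm{pw}(G) \le \mathrm{td}(G) - 1$.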
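For small graphs, the treedepth definition can be checked by brute force. The sketch below uses the standard recursive characterization (the treedepth of a disconnected graph is the maximum over its components, and that of a connected graph is one plus the minimum over all single-vertex deletions), which corresponds to choosing the root of an optimal forest. The adjacency-dictionary representation and the function names are assumptions of this example, and the running time is exponential.

```python
from functools import lru_cache

def components(vertices, adj):
    """Connected components of the subgraph induced on `vertices`."""
    remaining = set(vertices)
    comps = []
    while remaining:
        stack = [remaining.pop()]
        comp = set(stack)
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w in remaining:
                    remaining.remove(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(frozenset(comp))
    return comps

def treedepth(adj):
    """Brute-force treedepth of the graph adj (dict: vertex -> set of neighbours)."""
    @lru_cache(maxsize=None)
    def td(vs):
        if not vs:
            return 0
        comps = components(vs, adj)
        if len(comps) > 1:
            return max(td(c) for c in comps)
        # Connected case: some vertex becomes the root of the forest.
        return 1 + min(td(vs - {v}) for v in vs)
    return td(frozenset(adj))

def path(n):
    """Path on n vertices 0, 1, ..., n-1."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}
```

With the convention used here, a single vertex has treedepth 1, and a path on $n$ vertices has treedepth $\lceil \log_2(n+1) \rceil$.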
2.2 MSO logic on graphs
We now briefly introduce monadic second-order logic (MSO) over graphs and the concepts of MSO interpretation and transduction. We refer interested readers to, e.g., Courcelle and Engelfriet [5] for further reading. In general, MSO is the extension of first-order logic by quantification over sets. In our paper we deal with the following particular flavour:
Definition ($\mathrm{MSO}_1$ and $\mathrm{CMSO}_1$ logic of graphs). The language of $\mathrm{MSO}_1$ consists of expressions built from the following elements:
• variables $x, y, \dots$ for vertices, and $X, Y, \dots$ for sets of vertices,
• equality for variables, quantifiers $\forall, \exists$ ranging over vertices and vertex sets, and the standard Boolean connectives,
• the predicates $x \in X$ and $\mathrm{edge}(x, y)$ with their standard meaning.
One may also use an arbitrary number of unary predicates on the vertex set (as vertex labels). The language of $\mathrm{CMSO}_1$ (counting $\mathrm{MSO}_1$), moreover, adds the predicates $\mathrm{mod}_{a,b}(X)$, such that $\mathrm{mod}_{a,b}(X)$ holds true if and only if $|X| \bmod b = a$.
$\mathrm{MSO}_1$ logic can be used to express many interesting graph properties, such as 3-colourability and dominating set. We also briefly mention $\mathrm{MSO}_2$ logic of graphs, which additionally includes quantification over edge sets and can express properties which are not definable in $\mathrm{MSO}_1$ (e.g., Hamiltonicity).
From an algorithmic perspective, MSO logic is particularly useful as the language for describing tractable problems in algorithmic metatheorems (e.g., for the aforementioned graphs of bounded cliquewidth [7] or treewidth [4]). In this respect, we consider the model checking problem in which the input is a graph $G$, the parameter is a formula $\varphi$ of the considered logic (such as $\mathrm{MSO}_1$), and the question is whether $G \models \varphi$.
A powerful tool, both in theory and in algorithmic metatheorems, is the ability to “efficiently translate” an instance of the model checking problem over a given class, into an instance of the problem over another class (for which we, perhaps, already have an efficient model checking algorithm). We start with simple interpretations of undirected graphs.
A simple $\mathrm{MSO}_1$ graph interpretation is a pair $I = (\nu, \mu)$ of $\mathrm{MSO}_1$ formulae (with $1$ and $2$ free first-order variables, respectively), such that $\mu$ is symmetric (i.e., $G \models \mu(x, y) \leftrightarrow \mu(y, x)$ in every graph $G$). (We remark that while the question whether $\mu$ is symmetric is generally undecidable, we may simply force it to be symmetric, e.g., by using $\mu(x, y) \vee \mu(y, x)$.) To each graph $G$, the interpretation associates a graph $I(G)$ which is defined as follows:
• the vertex set of $I(G)$ (the domain of $I$ in $G$) is the set of all vertices $v$ of $G$ such that $G \models \nu(v)$;
• the edge set of $I(G)$ is the set of all the pairs $\{u, v\}$ of vertices of $I(G)$ such that $G \models \mu(u, v)$.
A simple $\mathrm{CMSO}_1$ graph interpretation is defined analogously.
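For example, a complete graph can be interpreted in any graph (with the same number of vertices) by letting both formulae of the interpretation be true, i.e., $\nu \equiv \mu \equiv \mathit{true}$, and the complement of a graph has an interpretation using $\nu \equiv \mathit{true}$ and $\mu(x, y) \equiv \neg\,\mathrm{edge}(x, y)$.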
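The two examples can be mimicked in code. In the sketch below, ordinary Python predicates stand in for the vertex formula and the edge formula; this is only an illustration of how an interpretation produces a new graph, not an MSO evaluator, and the names `interpret`, `nu` and `mu` are assumptions of this example.

```python
def interpret(adj, nu, mu):
    """Apply the pair (nu, mu) to the graph adj (dict: vertex -> set of neighbours).

    nu(adj, v) decides whether v is kept; mu(adj, u, v) decides adjacency.
    Symmetry is forced by taking mu(u, v) or mu(v, u); loops are never created.
    """
    domain = [v for v in adj if nu(adj, v)]
    result = {v: set() for v in domain}
    for u in domain:
        for v in domain:
            if u != v and (mu(adj, u, v) or mu(adj, v, u)):
                result[u].add(v)
    return result

# Stand-ins for the formula "true" and for the negated edge predicate.
always = lambda adj, v: True
complete = (always, lambda adj, u, v: True)
complement = (always, lambda adj, u, v: v not in adj[u])

path3 = {0: {1}, 1: {0, 2}, 2: {1}}
co_path3 = interpret(path3, *complement)   # only the pair {0, 2} is adjacent
```

Forcing symmetry inside `interpret` mirrors the remark that a non-symmetric edge formula can always be symmetrized.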
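Note that, to each formula $\psi$, an interpretation $I$ naturally and efficiently assigns a formula $\psi^I$ such that $G \models \psi^I \iff I(G) \models \psi$ holds. Having classes $\mathcal{C}, \mathcal{D}$ of finite graphs, we say that $I$ is a simple interpretation of $\mathcal{D}$ in $\mathcal{C}$ if the following holds: for every $H \in \mathcal{D}$ there is $G \in \mathcal{C}$ such that $H \simeq I(G)$, and for every $G \in \mathcal{C}$ it holds that $I(G) \in \mathcal{D}$.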
A more general concept of a “logical translation” is that of transductions. Briefly, in addition to a simple interpretation, this allows one to add to a graph arbitrary “parameters” (as unary predicates) and to make several disjoint copies of the graph. A thorough discussion of this concept can be found in [5], but we prefer to keep the paper simple and accessible to a wide audience of graph theorists, and so we give a simplified version of the definition from [1].
Still, before proceeding to Definition 2.2, we have to briefly extend the notion of interpretation towards finite relational structures with finite signatures. A relational structure $\mathbf{A}$ of the signature $\sigma = (R_1, \dots, R_a)$ consists of a universe $U$ (a finite set) and a (finite) list of relations $R_1^{\mathbf{A}}, \dots, R_a^{\mathbf{A}}$ over $U$. For instance, for a graph $G$, $U = V(G)$ is the vertex set and $\mathrm{edge}$ is the binary symmetric relation of edges of $G$. The language of $\mathrm{MSO}_1$ logic of relational structures of the signature $\sigma$ is as in Definition 2.2 with the predicates $R_1, \dots, R_a$ (instead of $\mathrm{edge}$). The scope of Definition 2.2 of a simple graph interpretation is then naturally generalized by allowing its two formulae to be formulae over relational structures of the signature $\sigma$. For each structure $\mathbf{A}$ of the signature $\sigma$, the interpretation $I(\mathbf{A})$ is, in our case, a simple graph (again possibly with arbitrarily assigned vertex labels).
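Definition ($\mathrm{MSO}_1$ and $\mathrm{CMSO}_1$ transduction). A basic $\mathrm{MSO}_1$ graph transduction $\tau_0$ is a triple $(\chi, \nu, \mu)$ such that $I = (\nu, \mu)$ is a simple $\mathrm{MSO}_1$ graph interpretation, and $\chi$ is an $\mathrm{MSO}_1$ sentence. The transduction maps a relational structure $\mathbf{A}$ to the graph $I(\mathbf{A})$, denoted here by $\tau_0(\mathbf{A})$, if $\mathbf{A} \models \chi$, and is undefined if $\mathbf{A} \not\models \chi$.
The $k$-copy operation maps a graph $G$ to the relational structure $G^{(k)}$ such that $V(G^{(k)}) = V(G) \times \{1, \dots, k\}$, the subset $V(G) \times \{i\}$ for $i = 1, \dots, k$ induces a copy of the graph $G$ (there are no edges between distinct copies), and $G^{(k)}$ is additionally equipped with a binary relation $\sim$ and unary relations $Q_1, \dots, Q_k$ such that $(u, i) \sim (v, j)$ for $u, v \in V(G)$ iff $u = v$, and $Q_i = V(G) \times \{i\}$.
The $m$-expansion of a graph by unary predicates maps $G$ to the set of all structures obtained by an expansion of $G$ by $m$ new unary predicates (as vertex labels).
Altogether, a many-valued map $\tau$ is an $\mathrm{MSO}_1$ graph transduction if it can be written as $\tau = \tau_0 \circ \gamma \circ \varepsilon$, where $\tau_0$ is a basic $\mathrm{MSO}_1$ graph transduction, $\gamma$ is a $k$-copy operation for some $k$, and $\varepsilon$ is the $m$-expansion by unary predicates for some $m$. In the special case $k = 1$, we call $\tau$ a non-copying transduction.
A $\mathrm{CMSO}_1$ transduction is defined analogously.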
Note, once again, that the result of a transduction of one graph is generally a set of graphs, due to the involved expansion map. For a graph class $\mathcal{C}$, the transduction $\tau$ of the class is the union of the particular transduction results, precisely, $\tau(\mathcal{C}) := \bigcup_{G \in \mathcal{C}} \tau(G)$.
3 Capturing Height of Dense Graphs
The concept of treedepth is commonly used to capture the “height” of graphs other than just trees. Actually, treedepth can be seen as a bounded-height analogue of treewidth. However, as discussed already in the introduction, the main drawback of treedepth (as well as of treewidth) is its inability to handle dense graphs and some simple graph operations like the complement. Since, on the other hand, cliquewidth handles dense “uniform” graphs and the complement operation smoothly, it makes good sense to try to modify its definition towards capturing “height” in addition to “width”.
Unfortunately, such a direct modification of cliquewidth does not seem possible, and one has to look at other related width measures, namely the so-called neighbourhood diversity and the aforementioned NLCwidth, for inspiration. (For example, simply trying to restrict the underlying expression tree in Definition 2.1 brings the necessity of disjoint unions of arbitrary arity which, in turn, “weakens” the definition too much. This is precisely the point at which the NLC approach of Subsection 2.1, which explicitly adds edges only between the graphs participating in a disjoint union operation, turns out better.)
Before we continue, notice that the requirement to smoothly handle dense graphs and the graph complement operation naturally means that a new measure cannot be stable under taking non-induced subgraphs.
3.1 Shrubdepth
To motivate the coming definition of shrubdepth, we recall the neighbourhood diversity parameter introduced by Lampis [27] in an algorithmic context: Two vertices $x, y$ are twins in a graph $G$ if $N_G(x) \setminus \{y\} = N_G(y) \setminus \{x\}$. The neighbourhood diversity of $G$ is the smallest $m$ such that $V(G)$ can be partitioned into $m$ sets such that in each part the vertices are pairwise twins. This basically means that $V(G)$ can be coloured by $m$ exclusive labels such that the existence of an edge $uv$ depends solely on the colours of $u$ and $v$.
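As a small illustration, the twin classes (and hence the neighbourhood diversity) of a graph can be computed greedily, because the twin relation is an equivalence relation and it suffices to compare each vertex with one representative per class. The adjacency-dictionary representation and the function name are assumptions of this sketch.

```python
def twin_classes(adj):
    """Partition the vertices of adj (dict: vertex -> set of neighbours) into twin classes.

    Two vertices u, v are twins if N(u) minus {v} equals N(v) minus {u}.
    """
    classes = []
    for v in adj:
        for cls in classes:
            u = cls[0]                       # one representative per class suffices
            if adj[u] - {v} == adj[v] - {u}:
                cls.append(v)
                break
        else:
            classes.append([v])
    return classes

def neighbourhood_diversity(adj):
    """Number of twin classes of the graph."""
    return len(twin_classes(adj))
```

For instance, a complete bipartite graph has two twin classes (its two sides), while every vertex of a path on four vertices forms its own class.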
To stress that the considered labels are exclusive, we shall instead call them colours. Inspired by attempts to generalize neighbourhood diversity, e.g., in [19, 15], we come to the idea of enriching the diversity colouring with a bounded number of “layers”. This results in the following formalization:
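Definition (Treemodel). Let $m$ and $d$ be non-negative integers. A treemodel of $m$ colours and depth $d$ for a graph $G$ is a pair $(T, S)$ of a rooted tree $T$ (of height $d$) and a set $S \subseteq \{1, 2, \dots, m\}^2 \times \{1, 2, \dots, d\}$ (called a signature of the treemodel) such that
(1) the length of each root-to-leaf path in $T$ is exactly $d$,
(2) the set of leaves of $T$ is exactly the set $V(G)$ of vertices of $G$,
(3) each leaf of $T$ is assigned one of the colours $\{1, 2, \dots, m\}$, and
(4) for any $i, j, \ell$ it holds $(i, j, \ell) \in S$ iff $(j, i, \ell) \in S$ (symmetry in the colours), and for any two vertices $u, v \in V(G)$ such that $u$ is coloured $i$ and $v$ is coloured $j$ and the distance between $u, v$ in $T$ is $2\ell$, the edge $uv$ exists in $G$ if and only if $(i, j, \ell) \in S$.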
Note that point (4) effectively says that the existence of an edge between $u, v$ depends solely on the colours of $u, v$ and the depth of their least common ancestor in $T$. We hence, for convenience, call $T$ itself a treemodel of $G$, assuming that the signature set $S$ is implicitly associated with $T$.
The class of all graphs having such a treemodel of $m$ colours and depth $d$ is denoted by $\mathcal{TM}_m(d)$.
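The way a treemodel determines its graph can be sketched as follows. The encoding is an assumption of this example: the tree is given by a parent map, the leaf colours by a dictionary, and the signature by a set of triples (colour, colour, $\ell$), where $2\ell$ is the distance of the two leaves in the tree.

```python
def graph_from_treemodel(parent, colour, signature):
    """Build the graph described by a treemodel.

    parent:    dict node -> parent node (the root maps to None)
    colour:    dict leaf -> colour; its keys are the vertices of the graph
    signature: set of triples (i, j, l); checked in both colour orders
    All leaves are assumed to lie at the same depth.
    """
    def chain(v):
        # chain(v)[l] is the ancestor of v at distance l (chain(v)[0] is v itself).
        nodes = [v]
        while parent[nodes[-1]] is not None:
            nodes.append(parent[nodes[-1]])
        return nodes

    leaves = list(colour)
    chains = {v: chain(v) for v in leaves}
    adj = {v: set() for v in leaves}
    for a, u in enumerate(leaves):
        for v in leaves[a + 1:]:
            # Smallest l such that u and v share their ancestor at distance l.
            l = next(l for l, (x, y) in enumerate(zip(chains[u], chains[v])) if x == y)
            cu, cv = colour[u], colour[v]
            if (cu, cv, l) in signature or (cv, cu, l) in signature:
                adj[u].add(v)
                adj[v].add(u)
    return adj

# A depth-1 treemodel with two colours describing a complete bipartite graph.
parent = {"r": None, "a1": "r", "a2": "r", "b1": "r", "b2": "r"}
colour = {"a1": 1, "a2": 1, "b1": 2, "b2": 2}
k22 = graph_from_treemodel(parent, colour, {(1, 2, 1), (2, 1, 1)})
```

Keeping the same coloured tree and changing only the signature yields a different graph on the same vertex set, for example the complement.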
For instance, $K_n \in \mathcal{TM}_1(1)$ and $K_{n,n} \in \mathcal{TM}_2(1)$. More generally, $\mathcal{TM}_m(1)$ is exactly the class of graphs of neighbourhood diversity at most $m$. For a more involved example, imagine an arbitrarily large collection of graphs $H_1, H_2, \dots, H_p \in \mathcal{TM}_m(1)$, such that $\{H_1, \dots, H_p\}$ is partitioned into $k$ groups. Let $G$ be a graph obtained from the disjoint union $H_1 \,\dot\cup\, \dots \,\dot\cup\, H_p$ by adding, say, all edges between distinct graphs from the groups 1 and 3, all edges from graphs in the group 2 to graphs in the groups 5 and 7, etc. Then $G \in \mathcal{TM}_{km}(2)$. This “hierarchical” example can be easily generalized to higher values of $d$. Further illustrations can be found in Figures 2 and 4.
It is easy to see that each class $\mathcal{TM}_m(d)$ is closed under complements and induced subgraphs (which is our desire), but neither under disjoint unions, nor under subgraphs. If $G$ has a treemodel $T$ and $H$ is any induced subgraph of $G$, then the corresponding induced subtree of $T$ immediately gives a treemodel for $H$. Note also that one coloured tree can be a treemodel of several graphs (on the same vertex set), depending on the associated signatures.
Another interesting observation is the relation of a treemodel to a certain generalization of the NLC classes from Subsection 2.1: imagine that the definition of NLC (with $m$ labels) is allowed to make disjoint union of an arbitrary number of graphs (but still with a uniform rule for adding edges between them), and the depth of the construction tree is bounded by $d$. If we, furthermore, forbid the relabelling operation, then the result coincides with the class $\mathcal{TM}_m(d)$. Even if relabellings are allowed in NLC, we can encode all label changes in the leaf colours thanks to the bounded depth of the construction (at the price of increasing $m$).
The depth of a treemodel generalizes treedepth of a graph as follows (while the other direction is obviously unbounded, e.g., for cliques):
If is of treedepth , then . If, moreover, is connected, then also .
Let be an inclusionminimal rooted forest of height such that , and let be a rooted tree obtained by adding a new root connected to the former roots of , and . If is connected, then already is a tree, and then we set and .
For we set a colour such that and , where denotes the ancestor of in at distance from . Notice that (because of the height of ), and so the total number of distinct over all is . Let be obtained from as follows: For every node such that , we add to a new path with the other end denoted by such that , and set .
We claim that this with the colours in the leaves of is the desired treemodel of . Let be the graph defined on the leaves of as follows; is an edge of iff, for , and , it holds and . Then clearly .
When dealing with treemodels of graph classes (e.g., in model checking or in transductions), the depth parameter $d$ is asymptotically much more important than the number of colours $m$. With this in mind, it is useful to work with a more streamlined notion which only requires a single parameter $d$, and to this end, we introduce the following:
Definition (Shrubdepth). A class $\mathcal{G}$ of graphs has shrubdepth $d$ if there exists $m$ such that $\mathcal{G} \subseteq \mathcal{TM}_m(d)$, while for all natural $m$ it is the case that $\mathcal{G} \not\subseteq \mathcal{TM}_m(d-1)$. In a wider sense, $\mathcal{G}$ is of bounded shrubdepth if there exist integers $d, m$ such that $\mathcal{G} \subseteq \mathcal{TM}_m(d)$.
Note that Definition 3.1 is asymptotic as it makes sense only for infinite graph classes; the shrubdepth of a single finite graph is always at most one ($0$ for empty or one-vertex graphs). Furthermore, it makes no sense to say “the class of all graphs of shrubdepth $d$”.
For instance, the class of all cliques has shrubdepth $1$. On the other hand, it will follow from Theorem 3.3 that the class of all paths has unbounded (infinite) shrubdepth. Now we argue that this new notion is indeed “intermediate” between treedepth and cliquewidth (and even linear cliquewidth).
Let $\mathcal{G}$ be a graph class and $d$ an integer. Then:
a) If $\mathcal{G}$ is of treedepth $\le d$, then $\mathcal{G}$ is of shrubdepth $\le d$.
b) If $\mathcal{G}$ is of bounded shrubdepth, then $\mathcal{G}$ is of bounded linear cliquewidth.
The converse statements are not true in general.
a) This follows from Proposition 3.1, and the converse cannot be true in general because of, e.g., the class of all cliques.
b) We remark that it is trivial to see that is of bounded cliquewidth. Here we even show how to straightforwardly translate a treemodel with colours and depth into a linear (caterpillarshaped) expression: Let be any (usual) lefttoright ordering of the leaves of a treemodel of some . The expression is constructed inductively for as follows:

a vertex is created and added with a (currently unique) colour where is its colour in ,

whenever colour is to be adjacent to colour at distance in the model , the expression adds all edges between the colours and , and

for being the distance from to in , the expression changes all colours with to .
A counterexample to the converse claim is, e.g., the class of all paths by Theorem 3.3.
The relation between classes of bounded shrubdepth and of bounded treedepth is even deeper than shown above. The operation of a local complementation in a graph takes any vertex and replaces the subgraph induced on the neighbours of with its edgecomplement. A graph is a vertexminor of a graph if is an induced subgraph of a graph such that is obtained from by a sequence of local complementations. As shown in [24], the class of vertexminors of all graphs of treedepth at most has shrubdepth at most , and every class of shrubdepth can be constructed as vertexminors of graphs of treedepth where depends (only) on .
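Local complementation is easy to state in code. The following sketch assumes an adjacency-dictionary representation; the function name is chosen for this example only.

```python
def local_complement(adj, v):
    """Return the graph obtained from adj by a local complementation at v.

    The subgraph induced on the neighbours of v is replaced by its
    edge-complement; all other adjacencies are left unchanged.
    """
    new = {u: set(nb) for u, nb in adj.items()}
    nbrs = list(adj[v])
    for i, a in enumerate(nbrs):
        for b in nbrs[i + 1:]:
            if b in new[a]:
                new[a].discard(b)
                new[b].discard(a)
            else:
                new[a].add(b)
                new[b].add(a)
    return new

# A star with centre 0: complementing locally at the centre gives a clique.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
k4 = local_complement(star, 0)
```

The example shows how a sparse graph (a star, of small treedepth) turns into a dense one (a clique) by a single local complementation, and applying the operation twice at the same vertex restores the original graph.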
3.2 SCdepth
A significant drawback of the notion of shrubdepth is the aforementioned fact that it does not make sense to ask about the shrubdepth of a single finite graph. Here we propose a remedy for this problem in the form of another, very simple and single-parameter based, definition of a depth-like parameter which turns out to be asymptotically equivalent to shrubdepth. (That said, several years of research experience since [20] have also shown many clear advantages of the shrubdepth notion.)
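Let $G$ be a graph and let $X \subseteq V(G)$. We denote by $\overline{G}^X$ the graph $G'$ with vertex set $V(G)$ where $x \neq y$ are adjacent in $G'$ if (i) either $\{x, y\} \in E(G)$ and $\{x, y\} \not\subseteq X$, or (ii) $\{x, y\} \notin E(G)$ and $\{x, y\} \subseteq X$. In other words, $\overline{G}^X$ is the graph obtained from $G$ by complementing the edges on $X$.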
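Definition (SCdepth, named after the “Subset-Complementation” depth). We define inductively the class $\mathcal{SC}(k)$ as follows:
• we let $\mathcal{SC}(0) = \{K_1\}$;
• if $G_1, \dots, G_p \in \mathcal{SC}(k)$ and $H = G_1 \,\dot\cup\, \dots \,\dot\cup\, G_p$ denotes the disjoint union of the $G_i$’s, then for every subset $X$ of vertices of $H$, we have $\overline{H}^X \in \mathcal{SC}(k+1)$.
The SCdepth of $G$ is the minimum integer $k$ such that $G \in \mathcal{SC}(k)$.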
The SCdepth of a graph $G$ is thus the minimum height of a rooted tree $T$, such that the leaves of $T$ form the vertex set of $G$, and each internal node $v$ is assigned a subset $X_v$ of the descendant leaves of $v$. Then the graph $G_v$ corresponding to $v$ in $T$ is the complement on $X_v$ of the disjoint union of the graphs corresponding to the children of $v$ (see Figure 3).
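The tree view of SCdepth translates directly into a recursive construction. In the sketch below, a tree is encoded (as an assumption of this example) by nesting: a leaf is a vertex name, and an internal node is a pair consisting of its complement set and the list of its children. Vertex names must therefore not be tuples.

```python
def complement_on(adj, X):
    """Complement the edges of adj (dict: vertex -> set of neighbours) inside X."""
    X = set(X)
    new = {u: set(nb) for u, nb in adj.items()}
    for u in X:
        for v in X:
            if u != v:
                if v in adj[u]:
                    new[u].discard(v)
                else:
                    new[u].add(v)
    return new

def sc_build(node):
    """Evaluate a subset-complementation tree to the graph it describes."""
    if not isinstance(node, tuple):
        return {node: set()}               # a leaf is a single isolated vertex
    X, children = node
    union = {}
    for child in children:
        union.update(sc_build(child))      # disjoint union of the children's graphs
    return complement_on(union, X)

# Height 1: complementing on all three isolated vertices gives a triangle.
triangle = sc_build(({"a", "b", "c"}, ["a", "b", "c"]))

# Height 2: first join a and c, then complement on everything to get the path a-b-c.
path_abc = sc_build(({"a", "b", "c"}, [({"a", "c"}, ["a", "c"]), (set(), ["b"])]))
```

The second example uses an internal node with an empty complement set to lift the single vertex `b` to the same level as the edge on `a` and `c`.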
The reason we introduce both the asymptotically equivalent SCdepth and shrubdepth measures here is that each brings a unique perspective on the classes of graphs we are interested in (see e.g. [24]).
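Let $\mathcal{G}$ be a class of graphs. Then the following are equivalent:
a) There exist integers $d, m$ such that $\mathcal{G} \subseteq \mathcal{TM}_m(d)$ ($\mathcal{G}$ has bounded shrubdepth).
b) There exists an integer $k$ such that $\mathcal{G} \subseteq \mathcal{SC}(k)$ ($\mathcal{G}$ has bounded SCdepth).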
More precisely, and .
We prove the forward implication by induction on . In the degenerate base case , it is trivially the case that . Assume now for some . By Definition 3.1, there exist an integer and graphs (actually subgraphs of induced by the leaf sets of the rootsubtrees in the respective treemodel of ) such that the following holds: results from the disjoint union by adding those edges for which and belong to distinct graphs among , and for the pair of colours of , belongs to the signature .
By the induction assumption, we have got for some integer . For each of these graphs , , we successively complement the edges on the following subsets of vertices:

for each such that , on the set of the vertices of of colour ,

for each such that , on the set (defined as above), then on the set itself and then on itself.
Observe that at most complement operations are applied to each , and this number can be reduced down to by skipping possible repeated complements. Denoting by the graph obtained in this way from we get, by Definition 4, that where .
Effectively, in each we have complemented the edges whose colour pairs (together with third ) belong to . In the next step we make the disjoint union and repeat the same complementation procedure on this global level. Namely:

for each such that , on the set of the vertices of of colour ,

for each such that , on the set , then on and then on .
Denoting the resulting graph by , we similarly get where . It remains to routinely verify that .
As for the backward implication, we directly construct a treemodel for each graph $G \in \mathcal{SC}(k)$. By Definition 4, $G$ can be constructed along a rooted tree $T$ such that the leaf set of $T$ is $V(G)$ and each internal node $w$ of $T$ is associated with a complement set $X_w$ (which is a subset of the descendant leaves). We assign the leaf colours as follows. Let $u$ be a leaf of $T$, and $(u = p_0, p_1, \dots, p_k)$ be the path from $u$ to the root of $T$. We colour $u$ with the binary vector $(c_1, \dots, c_k)$ such that $c_i = 1$ iff $u \in X_{p_i}$.
By Definition 4, $uv$ forms an edge of $G$ if and only if the pair $\{u, v\}$ belongs to an odd number of the complement sets over the whole $T$. This can easily be determined from the colours of $u$ and $v$, and from the depth of their least common ancestor in $T$. Consequently, $G \in \mathcal{TM}_{2^k}(k)$.
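The parity argument can be made explicit. The sketch below assumes the following indexing, which is a choice made for this example: each leaf carries a binary vector whose $i$-th entry records membership in the complement set of its ancestor at distance $i$, and the least common ancestor of the two leaves is at distance $\ell$ from both. Only common ancestors, i.e. those at distance at least $\ell$, can complement the pair.

```python
def edge_from_colours(cu, cv, l):
    """Decide adjacency of two leaves from their colours and the level of their
    least common ancestor.

    cu, cv: tuples (c_1, ..., c_k) of 0/1 entries, where c_i = 1 iff the leaf
            belongs to the complement set of its ancestor at distance i
    l:      distance from either leaf to their least common ancestor (1 <= l <= k)
    """
    k = len(cu)
    # The pair is complemented once for every common ancestor whose set contains both leaves.
    flips = sum(cu[i - 1] * cv[i - 1] for i in range(l, k + 1))
    return flips % 2 == 1
```

For a tree of height 2 in which the leaves `a` and `c` share a parent with complement set {a, c}, the leaf `b` has its own parent with an empty complement set, and the root complements on all of {a, b, c}, the colours are (1, 1) for `a` and `c` and (0, 1) for `b`. The function then reports the edges `ab` and `bc` but not `ac`, which is the path on three vertices.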
3.3 Long paths
For graphs of small treedepth a characteristic property is the absence of long paths as subgraphs, cf. Proposition 2.1 b). This is obviously false for classes of small shrubdepth since those, in particular, include all cliques and bicliques. However, one can restrict induced paths in every class $\mathcal{TM}_m(d)$, as follows.
Let and denote the path of length , i.e., on vertices. Then , but for any we have . In particular, there exist no such that would contain all paths.
We start with the construction of that is, of an appropriate treemodel of , by induction on . We shall maintain a special property that each end of is represented in by a leaf which has no siblings, i.e., its parent is of degree . As the base case, we use the treemodel of colours and depth from the lefthand side of Figure 4. (Note that although , we use an extra level in to achieve our property.)
We now construct for . Let and be the ends of , and recall that each of has no siblings in . We create a sibling of in and assign a new colour . This intermediate treemodel can represent with the ends and, see Figure 4 right, the desired model follows:

for and its disjoint copy , add a common ancestor of their roots,

create a rooted path of length , with the root and the only leaf of colour , and make another son of .
Clearly, is a treemodel of colours and depth , and it can represent the edges and but not . Thus makes a treemodel of for .
In the converse direction we start with an easy observation for ; for any (this follows from the folklore fact that the path on vertices is not a cograph, too). The proof can then be finished by induction over , provided that we establish the following contrapositive claim: if for any , then .
So fix and , and assume and is a corresponding treemodel of colours and minimum possible height . In this proof we denote by the subtree of rooted at a node . As is minimum and is connected, there exist distinct sons of the root of and colours (possibly equal), such that includes at least one leaf with colour and at least one leaf with colour , and the colour pair at distance determines an edge.
We let denote the subgraph formed only by those edges which are determined by the colour pair at distance in , i.e., iff the colours of are in and the only common ancestor of is the root of .
If , then we claim that there cannot be two nonincident edges in . Indeed, this would necessarily mean that contains , but . Hence is or and there exist at most three vertices of colour altogether, and in either case one subpath in is of length at least . Hence gives a treemodel of of labels.
We now examine the other possibility . First, we observe that if are nonincident edges of such that are of the same colour, then the only common ancestor of is the root of . Otherwise, we would get a contradiction that . Second, we argue that there cannot be three pairwise nonincident edges in (where are of the same colour). If this happened, then (say) for the vertex at least two of the vertices would have only one common ancestor with , the root of . Consequently, would have at least two neighbours in the set , and the same would symmetrically hold for all the members of this set, contradicting the fact that is acyclic.
Therefore, is a path of length at most , or consists of two components isomorphic to or . Moreover, if there exists a leaf of colour or in which is not incident to an edge of , then has no two nonincident edges and all such leaves (of colour or ) not incident to are of the same colour, as can be easily checked.
We first consider the case that has one component. If it is or then, by the previous arguments, all the leaves of coloured (say) are incident to the one or two edges of . As above (in the case of ) we can now argue that gives a treemodel of of labels. If, on the other hand, is or , then all the leaves of coloured or are incident to the edges of . We form a new treemodel by removing from all the leaves of colours (i.e., incident to the edges of ) and adding arbitrarily one new leaf of colour . Then of labels models a path (or ).
We are left with the case of consisting of two components, such that all the leaves of coloured or are incident to the edges of . If any of the subpaths of is of length at least , then we are again done. Otherwise, we can choose one component of such that contains a subpath of length at least . We denote by the other component of (presumably ), and form a new treemodel by restricting to the leaves from , removing the leaves of and adding arbitrarily one leaf of colour (recall that no vertex of has colour or ). Hence of labels models a path (or ).
The combinatorial result in Theorem 3.3 is also related to logical questions (see Section 4). For instance, with respect to the study of orderable graphs by Blumensath and Courcelle [2], note that in the class of all finite paths one can easily define a linear ordering by an formula. Hence it immediately follows from a characterization given in [2, Theorem 5.31] that the class of all finite paths cannot have bounded shrubdepth. The advantage of our Theorem 3.3 (occurring already in [20]) is that it gives exact combinatorial bounds. Furthermore, Theorem 3.3 together with Theorem 4.1 implies the result [2, Theorem 5.31] that infinite graph classes of bounded shrubdepth are not orderable.
Note, however, that graph classes of bounded shrubdepth are not asymptotically related to those excluding long induced subpaths; in the opposite direction the situation here is very different from that in Proposition 2.1 b). As an example, we mention the graph class from Figure 5 which contains no induced subpaths of length . One can give a direct combinatorial proof that this class is of unbounded shrubdepth (similarly as for Theorem 3.3), but we skip it here since this fact follows from the aforementioned result of [2] (the graph of Figure 5 is FO-orderable) or, alternatively, from a combination of results of [24].
3.4 Induced subgraphs characterization
Lastly in this section, we provide yet another characterization of the classes defined previously. In a nutshell, we are going to show that each of these classes can be characterized by a finite list of forbidden induced subgraphs. A nice consequence of this finding is that membership in each of the classes can be tested in polynomial time. The tool we use here is wellquasiordering.
A class or property is said to be hereditary if it is closed under taking induced subgraphs. A wellquasiordering (or wqo) of a set is a quasiordering on such that for any infinite sequence of elements of there exist with . In other words, a wqo is a quasiordering that does not contain an infinite strictly decreasing sequence or an infinite set of incomparable elements. We are going to use the following wellknown result:
[Ding [11]] Let be an integer and be a finite set of colours. The class of the graphs not containing a path on vertices as a subgraph and with vertices coloured by is wellquasiordered under the colourpreserving induced subgraph order.
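The defining condition of a wqo used here (every infinite sequence contains an index pair whose earlier element is below the later one) can be illustrated on finite prefixes. The sketch below is only an illustration under stated assumptions: the helper name `find_good_pair` is hypothetical, and paths and cycles are encoded by their number of vertices, using the facts that a path is an induced subgraph of another path iff it has no more vertices, while a cycle is an induced subgraph of another cycle only if the two have equally many vertices. A finite prefix can of course neither prove nor refute the wqo property.

```python
def find_good_pair(seq, leq):
    """Return the first (i, j) with i < j and leq(seq[i], seq[j]), or None."""
    for j in range(len(seq)):
        for i in range(j):
            if leq(seq[i], seq[j]):
                return (i, j)
    return None

def path_leq(m, n):
    # path on m vertices is an induced subgraph of the path on n vertices
    return m <= n

def cycle_leq(m, n):
    # cycle on m vertices is an induced subgraph of the cycle on n vertices
    return m == n

paths = [7, 5, 3, 4]       # vertex counts of a sequence of paths
cycles = [3, 4, 5, 6, 7]   # vertex counts of pairwise different cycles

print(find_good_pair(paths, path_leq))    # (2, 3)
print(find_good_pair(cycles, cycle_leq))  # None
```

Paths form a chain under the induced subgraph order, whereas pairwise different cycles form an antichain, so the class of all cycles is not wqo under this order.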
Let be a graph class of bounded shrubdepth, such that the vertices of the graphs in are coloured from a finite set of colours. Then is wellquasiordered under the colourpreserving induced subgraph order.
Proof. Consider an infinite sequence , and the corresponding treemodels . Let , , denote the rooted tree with leaf labels composed of the colours of and the colours of . By Theorem 3.4, of bounded diameter is wqo under the rooted coloured subtree relation, and, consequently, so are the coloured graphs , as desired.
The advertised result now follows by a simple twist as follows.
For all integers , there exists a finite set of graphs (the forbidden subgraphs) such that a graph belongs to if and only if has no induced subgraph isomorphic to a member of .
Similarly, for every there exists a finite set of graphs such that if and only if has no induced subgraph isomorphic to one of .
We let be the (isomorphismfree) set of graphs such that but for every . By this definition, no member of is a proper induced subgraph of another member. Hence it is enough to argue that is wqo to conclude that is finite.
The latter follows from an easy observation: if for some , then . Indeed, we take a treemodel of , add arbitrarily a new leaf of a unique new colour for and annotate with an extra bit the colours of all leaves which are neighbours of in . The result is a treemodel for with colours. Consequently, and the wqo property follows from Corollary 3.4.
The second claim is proved analogously. We let be the (isomorphismfree) set of graphs such that but for every . By Theorem 3.2, , and so by the previous paragraph. The wqo property again follows from Corollary 3.4.
The “obstacle” sets and of Theorem 3.4 are not only of mathematical interest, but also have algorithmic consequences. Namely, in connection with established algorithms they allow for efficient membership testing of these classes. Note, however, that we do not provide an algorithmic construction of the sets and , and so we only prove the existence of the respective algorithms for each specific value of and (in parameterized complexity theory this is formally called nonuniform FPT).
The problems to decide, for a given graph , whether and whether , are fixedparameter tractable with respect to the parameters and , respectively.
Proof. We provide a proof for the problem of , while that of is very similar. As mentioned before, the class is of bounded cliquewidth (namely, is a trivial upper bound). Therefore, one can use [25] to compute in FPT time an approximate expression of of cliquewidth depending only on or to correctly conclude that . In the former case, one can then call the algorithm of [7] to test whether any member of is an induced subgraph of . Based on the outcome, the correct decision about is easily made.
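The final step of the proof, testing a graph against a finite list of forbidden induced subgraphs, can be sketched by brute force. This is not the algorithm of [7] used above (which works on a cliquewidth expression); it is a naive stand-in that runs in polynomial time only because the forbidden list is fixed. The function names are hypothetical, and since the obstacle sets are not constructed in the paper, the example uses the path on four vertices as a placeholder obstruction (the graphs without this induced subgraph are exactly the cographs).

```python
from itertools import combinations, permutations

def has_induced_copy(G, H):
    """Brute force: does G (dict vertex -> set of neighbours) contain H as an induced subgraph?"""
    hv = list(H)
    for subset in combinations(list(G), len(hv)):
        for image in permutations(subset):
            f = dict(zip(hv, image))
            # adjacency and non-adjacency must both be preserved
            if all((b in H[a]) == (f[b] in G[f[a]]) for a, b in combinations(hv, 2)):
                return True
    return False

def avoids_all(G, forbidden):
    """Membership test for a class given by a finite list of forbidden induced subgraphs."""
    return not any(has_induced_copy(G, H) for H in forbidden)

# Placeholder obstruction list: the path on four vertices.
P4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
K4 = {i: {j for j in range(4) if j != i} for i in range(4)}
C5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}

print(avoids_all(K4, [P4]))  # True
print(avoids_all(C5, [P4]))  # False
```

For a fixed obstruction on h vertices the search inspects at most n^h vertex subsets of an n-vertex graph, which is why finiteness of the obstacle sets already gives a polynomial test for each fixed choice of the parameters.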
4 Shrubdepth and Transductions
While in the previous section we have focused on establishing basic combinatorial properties of shrubdepth and SCdepth, now we shift our attention towards their logical aspects. The final outcome will be the finding that (a slight technical adjustment of) treemodels of depth precisely capture the th finite level of the transduction hierarchy of simple undirected graphs, for all . For that, we start by showing that shrubdepth indeed goes well with simple interpretations.
4.1 Stability under interpretations
We again turn to classical cliquewidth for an inspiration: graph classes of bounded cliquewidth have interpretations into the class of all coloured rooted trees and, in turn, graph classes having an interpretation into those of bounded cliquewidth still have bounded cliquewidth (although the bound on their cliquewidth is generally much higher).
In one direction, shrubdepth has been defined using (Definition 3.1) a very special form of a simple interpretation. In the other direction, we can go even further than with cliquewidth itself (cf. also Section 4.3): the bound on shrubdepth is preserved exactly (and not only asymptotically) under any interpretations. In other words, the precise height of a tree is absolutely essential for interpretability. The full formal statement follows.
A class of graphs has a simple interpretation in a class of finite coloured rooted trees of height at most , if, and only if, has shrubdepth at most .
The ‘if’ direction of Theorem 4.1 follows immediately from Definition 3.1: for any , the class has a simple interpretation (or even FO interpretation) in the class of coloured treemodels of depth . Hence we now give a proof of the ‘only if’ direction of Theorem 4.1 consisting of the following sequence of three technical claims.
[Gajarský and Hliněný [16]] There exists a function over the positive integers such that the following holds. (Here stands for the iterated (“tower of height ”) exponential, i.e., and .)
Let be a rooted tree with each vertex assigned one of at most colours, and let be any sentence with quantifiers, such that the least common multiple of the values of all predicates in equals . Take any node such that the subtree rooted at is of height , and denote by the connected components of (their roots are thus all the sons of ).
Assume that there exists a (sufficiently large) subset of indices , where , such that there are colourpreserving isomorphisms from to each , . Choose any , , and take the subtree . Then behaves the same with respect to as , precisely, .
Lemma 4.1 and, in particular, the operation of obtaining from as in the lemma, will be useful in the following generalized setting of a reduction. Assume that is an arbitrary nondecreasing function and is a positive integer (for use with Lemma 4.1, we can have ), and is a coloured rooted tree of height . Inductively for , we do the following: for every such that is of height , consider the components of partitioned into equivalence classes according to the existence of a colourpreserving isomorphism. In each of these classes whose cardinality is at least , we repeatedly remove tuples of components until the cardinality reaches where . Let be the resulting “reduced” subtree of . In this situation we say that is reduced (modulo ) to . Observe that is of bounded size depending only on , and , and independent of the size of .
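The reduction just described can be sketched as a bottom-up procedure. The exact thresholds are not legible in the text above, so the sketch makes explicit assumptions: `a` stands for the nondecreasing function (assumed to take values at least 1), `q` for the positive integer, a class at a node of height `i` is trimmed only when it has at least `a(i) + q` members, and members are removed `q` at a time until fewer than `a(i) + q` remain. The tuple representation `(colour, children)` and all function names are hypothetical.

```python
from collections import defaultdict

def height(tree):
    colour, kids = tree
    return 0 if not kids else 1 + max(height(k) for k in kids)

def canon(tree):
    """Canonical form: two coloured rooted trees are isomorphic iff their forms are equal."""
    colour, kids = tree
    return (colour, tuple(sorted(canon(k) for k in kids)))

def reduce_tree(tree, a, q):
    """Reduce bottom-up; each large isomorphism class of components is trimmed modulo q."""
    colour, kids = tree
    if not kids:
        return tree
    i = height(tree)
    classes = defaultdict(list)
    for k in kids:
        r = reduce_tree(k, a, q)      # lower levels are reduced first
        classes[canon(r)].append(r)
    kept = []
    for key in sorted(classes):
        members = classes[key]
        n = len(members)
        if n >= a(i) + q:             # assumed threshold
            n = a(i) + (n - a(i)) % q # remove q-tuples; n stays >= a(i), same residue mod q
        kept.extend(members[:n])
    return (colour, kept)

# Root with ten leaves coloured "x" and one leaf coloured "y".
star = ("r", [("x", [])] * 10 + [("y", [])])
reduced = reduce_tree(star, a=lambda i: 2, q=3)
print(len(reduced[1]))
```

With `a(1) = 2` and `q = 3`, the class of ten isomorphic leaves shrinks to four (ten and four agree modulo three), the single leaf of the other colour is kept, and the reduced root has five children regardless of how many further copies of the first leaf the input contained in the same residue class.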
We continue with the technical claims leading to Theorem 4.1. Imagine a situation in which we have a graph (tree) automorphism taking a vertex to a vertex , and similarly an automorphism taking to . Then it is not true in general that there exists an automorphism taking the pair to the pair . The next lemma establishes a simple additional condition under which such an automorphism always exists. We need the notion of an orbit. The binary relation on the vertex set of a graph defined as ‘ iff there is an automorphism taking to ’ is an equivalence, and its equivalence classes are called the vertex automorphism orbits.
Note that all automorphisms in this section are colourpreserving.
Let be a coloured rooted tree. Assume that are vertex automorphism orbits of , and and are chosen arbitrarily. Let , , denote the least common ancestor of in . If and , then there is an automorphism of taking the pair onto
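The failure for pairs mentioned above is easy to observe on a small rooted tree. The following brute-force sketch enumerates all automorphisms of a seven-vertex tree in which every vertex has the same colour; the parent-map representation and the function name are assumptions made for illustration. All four leaves lie in one orbit, yet a pair of sibling leaves cannot be mapped onto a pair of leaves with different parents, since the two pairs have least common ancestors at different depths.

```python
from itertools import permutations

# Rooted tree given by a parent map; "r" is the root (hypothetical example).
parent = {"r": None, "a": "r", "b": "r",
          "a1": "a", "a2": "a", "b1": "b", "b2": "b"}

def automorphisms(parent):
    """Yield all bijections of the vertex set that commute with the parent map."""
    nodes = sorted(parent)
    for image in permutations(nodes):
        f = dict(zip(nodes, image))
        if all((None if parent[v] is None else f[parent[v]]) == parent[f[v]]
               for v in nodes):
            yield f

autos = list(automorphisms(parent))

# Vertex-wise: a1 can stay in place, and a2 can be moved to b1 ...
fix_a1 = any(f["a1"] == "a1" for f in autos)
a2_to_b1 = any(f["a2"] == "b1" for f in autos)
# ... but no single automorphism does both at once.
pair_mixed = any(f["a1"] == "a1" and f["a2"] == "b1" for f in autos)
# A pair of siblings can be moved onto another pair of siblings.
pair_siblings = any(f["a1"] == "b1" and f["a2"] == "b2" for f in autos)

print(len(autos), fix_a1, a2_to_b1, pair_mixed, pair_siblings)
```

The tree has eight automorphisms (swap the two subtrees, and independently swap the leaves inside each), which is small enough for exhaustive checking.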