# Shrub-depth: Capturing Height of Dense Graphs

Algorithms and Complexity Group, TU Wien, Vienna, Austria; Faculty of Informatics, Masaryk University, Brno, Czech Republic; Computer Science Institute of Charles University (IUUK), Praha, Czech Republic; Faculty of Informatics, Masaryk University, Brno, Czech Republic; CNRS UMR 8557, École des Hautes Études en Sciences Sociales, Paris, France
###### Abstract

The recent increase of interest in the graph invariant called tree-depth and in its applications in algorithms and logic on graphs has led to a natural question: is there an analogously useful “depth” notion also for dense graphs (say, one which is stable under graph complementation)? To this end, a 2012 conference paper introduced the new notion of shrub-depth, which is related to the established notion of clique-width in a similar way as tree-depth is related to tree-width. Since then shrub-depth has been successfully used in several research papers. Here we provide an in-depth review of the definition and basic properties of shrub-depth, and we focus on its logical aspects, which have turned out to be the most useful. In particular, we use shrub-depth to give a characterization of the lower levels of the MSO transduction hierarchy of simple graphs.

Keywords: tree-depth; clique-width; shrub-depth; MSO logic; transduction
Logical Methods in Computer Science, Volume 15, Issue 1, Paper 7. Submitted Jul. 19, 2017; published Jan. 31, 2019.

Robert Ganian, Petr Hliněný, Jaroslav Nešetřil, Jan Obdržálek, Patrice Ossona de Mendez

Acknowledgements: R. Ganian, P. Hliněný, J. Nešetřil and J. Obdržálek have been supported by the Institute for Theoretical Computer Science (CE-ITI), Czech Science Foundation project No. P202/12/G061. J. Nešetřil and P. Ossona de Mendez have been supported by the project LL1201 CORES of the Ministry of Education of the Czech Republic. Robert Ganian also acknowledges support from the Austrian Science Fund (FWF, project P31336).

## 1 Introduction

In this paper, we are interested in a structural graph parameter that is intermediate between clique-width and tree-depth, sharing the nice properties of both. Clique-width, originated by Courcelle et al. in [6, 8], is the older of the two notions. In several aspects, the theory of graphs of bounded clique-width is similar to that of graphs of bounded tree-width. Indeed, bounded tree-width implies bounded clique-width. However, unlike in the case of tree-width, graphs of bounded clique-width include arbitrarily large cliques and other dense graphs, and the value of clique-width does not change much when complementing the edge set of a graph. Bounded clique-width is not preserved under taking subgraphs or minors, only under taking induced subgraphs. As we will see later, clique-width is also closely related to trees and monadic second-order logic of graphs.

The notion of tree-depth of a graph, coined by Nešetřil and Ossona de Mendez [31], is equivalent or similar to some older notions such as the vertex ranking number and the minimum height of an elimination tree [3, 9, 34], etc. Graphs of small tree-depth are related to trees of small height, and they enjoy strong “finiteness” properties (finiteness of cores, existence of non-trivial automorphisms if the graph is large, well-quasi-ordering by subgraph inclusion). The tree-depth notion received almost immediate attention, as it plays a central role in the theory of graph classes of bounded expansion [29, 30]. However, graphs of small tree-depth are necessarily very sparse and the notion behaves badly with respect to, say, graph complementation.

Our search for a structural concept “between clique-width and tree-depth” [20] was originally inspired by algorithmic considerations: graphs with bounded parameters such as clique-width allow efficient solution of various problems which are difficult (e.g. NP-hard) in general, see e.g. [7, 13, 22, 21]. Highly regarded results in this area are those which, instead of solving one problem, give a solution to a whole class of problems (called algorithmic metatheorems). Perhaps the most famous result of this kind is Courcelle’s theorem [4], which states that every graph property expressible in the MSO$_2$ logic of graphs can be decided in time $\mathcal{O}(f(\varphi, \mathrm{tw}(G)) \cdot |V(G)|)$, where $\varphi$ is the formula expressing the property, $\mathrm{tw}(G)$ is the tree-width of $G$ and $f$ is a computable function, meaning that the MSO$_2$ model checking problem is fixed-parameter tractable (FPT for short). For clique-width, a result similar to Courcelle’s theorem holds: MSO$_1$ model checking is FPT on graphs parameterized by clique-width [7].

However, an issue with these results is that, as shown by Frick and Grohe [14] for MSO model checking on the class of all trees, the runtime dependence of Courcelle’s algorithm on the formula is, unavoidably, non-elementary (unless P=NP). This raises the following question: are there interesting graph classes on which the runtime dependency on the formula is better? For instance, in 2010, Lampis [27] gave an FPT algorithm for MSO model checking on graphs of bounded vertex cover with elementary (doubly-exponential) dependence on the formula. Subsequently, in 2012, Gajarský and Hliněný showed [16] that there exists a linear-time FPT algorithm for MSO model checking of graphs of bounded tree-depth, again with elementary dependence on the formula. Their result is essentially best possible, as shown soon after by Lampis [28]. In order to extend that result towards MSO model checking of (some classes of) dense graphs, one would first need to adjust the clique-width concept towards “bounded depth” (as with tree-depth), which is not a simple task.

The aforementioned paper [16] was not the first one in the literature to explicitly raise the issue of restricting clique-width towards bounded depth. For example, in 2012 Elberfeld, Grohe and Tantau independently made the following remark regarding the expressive power of graph FO logic [12]: “One idea is to develop an adjusted notion of clique-width that has the same relation to clique-width as tree-depth has to tree-width.” Our concept of shrub-depth [20] has also provided a quick positive answer to the question of [12]. Clique-width-like graph decompositions of limited depth have also been used as a tool by Blumensath and Courcelle in [2] (under the name “-decompositions”). However, some of their technical results which may be interesting in our context have not been published anywhere.

In [20], two new structural depth parameters of graphs have been introduced: shrub-depth (Definition 3.1) and SC-depth (Definition 4), which are asymptotically equivalent to each other. Since their emergence these have been successfully used in several research papers, and shrub-depth in particular is a subject of ongoing interest in the finite model theory of graphs.

For instance, the aforementioned [16] (its full journal version, to be precise) has also extended the tractability of MSO model checking from graphs of bounded tree-depth to MSO$_1$ model checking on graph classes of bounded shrub-depth, again with an elementary runtime dependence on the checked formula. Furthermore, [16] has generalized the result of [12] to prove that the expressive power of FO and MSO is the same on classes of bounded shrub-depth.

In a recent paper by Gajarský, Kreutzer, Nešetřil, Ossona de Mendez, Pilipczuk, Siebertz and Toruńczyk [17], the concept of shrub-depth has been successfully used to obtain an analog of low tree-depth decompositions for transductions of bounded expansion classes.

On another topic, Hliněný, Kwon, Obdržálek and Ordyniak [24] have shown that the tree-depth and shrub-depth concepts of graphs are tightly related to each other via the so-called vertex-minors. Regarding alternative and generalized views of shrub-depth, DeVos, Kwon and Oum [unpublished] elaborate, in ongoing work, on the concept of branch-depth of matroids, and prove that a derived new concept of rank-depth of graphs is asymptotically equivalent to shrub-depth.

#### Paper organization.

Since the core initial paper on shrub-depth [20] has appeared only as a short conference version, we take the opportunity here to give a detailed review of this concept and to provide full proofs of the results of [20], enhanced in light of the current state of the art. After preliminary definitions in Section 2, this overview of shrub-depth and its structural properties (such as Theorems 3.2, 3.3 and 3.4) constitutes Section 3 of this paper. The subsequent Section 4 focuses on logical aspects of shrub-depth, which have so far been of greatest interest, and presents our main results with their proofs. We start by proving that the shrub-depth of a graph class is stable, meaning that the shrub-depth value does not grow, under MSO interpretations (Theorem 4.1) and also under non-copying MSO transductions (Theorem 4.2). From that we derive (Theorem 4.3) that the integer values of shrub-depth define the lower levels of the MSO transduction hierarchy of simple graphs, which partially answers an open question raised by Blumensath and Courcelle in [1]. We conclude with some remarks and open questions in Section 5.

## 2 Common Definitions

We assume the reader is familiar with the standard notation of graph theory. In particular, our graphs are finite, undirected and simple (i.e. without loops or multiple edges). For a graph $G$ we use $V(G)$ to denote its vertex set and $E(G)$ to denote the set of its edges. We write $G \simeq H$ to say that graphs $G$ and $H$ are isomorphic, and similarly we use $G \subseteq H$ to say that $G$ is a subgraph of $H$ (not necessarily induced). An isomorphism of a graph to itself is also called an automorphism. We will also use labelled graphs, where each vertex is assigned one or more of a fixed finite set of labels (in this case, isomorphism implicitly preserves the labels).

A forest is a graph without cycles, and a tree is a forest with a single connected component. We will consider mainly rooted forests (trees), in which every connected component has a designated vertex called the root. The height of a vertex $x$ in a rooted forest $F$ is the length of a path from the root (of the component of $F$ to which $x$ belongs) to $x$. The height of the rooted forest $F$ is the maximum height of the vertices of $F$. (Footnote 1: There is a conflict in the literature about whether the height of a rooted tree should be measured by the “root-to-leaves distance” or by the “number of levels”, a difference of $1$ on finite trees. We adopt the convention that the height of a single-node tree is $0$, i.e., the former view.) Let $x, y$ be vertices of $F$. The vertex $x$ is an ancestor of $y$, and $y$ is a descendant of $x$, in $F$ if $x$ belongs to the path of $F$ linking $y$ to the corresponding root; we denote this as $x \le_F y$. If $x$ is an ancestor of $y$ and $xy \in E(F)$, then $x$ is called a parent of $y$, and $y$ is a child of $x$. The least common ancestor of $x$ and $y$ in $F$ is denoted by $x \wedge_F y$.

### 2.1 Width and depth measures

The so called width measures play an important role in structural graph theory and in its algorithmic applications. A prototypical width parameter is the tree-width of a graph [33] introduced by Robertson and Seymour together with the related path-width. We refer to [10] for missing definitions and basic properties.

The primary interest of our paper is in two other, seemingly unrelated, structural width measures which we define now.

Definition (Clique-width [6, 8]). A $k$-expression is an algebraic expression built from the following four operations on vertex-labelled graphs using $k$ labels:

• create a new vertex with a single label $i$;

• take the disjoint union of two labelled graphs;

• add all edges between vertices of label $i$ and label $j$ ($i \neq j$); and

• relabel all vertices with label $i$ to label $j$.

The clique-width $\mathrm{cw}(G)$ of a graph $G$ equals the minimum $k$ such that (some labelling of) $G$ is the value of a $k$-expression.
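The four operations can be made concrete with a small sketch. The Python below is an illustration only and is not taken from the paper: the representation (a dictionary of vertex labels together with a set of edges) and the helper names `new_vertex`, `disjoint_union`, `add_edges`, `relabel` and `clique` are assumptions chosen for this example. It evaluates an expression that builds a complete graph while only ever using the labels 1 and 2.

```python
def new_vertex(name, label):
    """Create a one-vertex labelled graph as (labels, edges)."""
    return ({name: label}, set())


def disjoint_union(g, h):
    """Disjoint union of two labelled graphs with distinct vertex names."""
    labels_g, edges_g = g
    labels_h, edges_h = h
    assert not set(labels_g) & set(labels_h), "vertex names must be distinct"
    return ({**labels_g, **labels_h}, edges_g | edges_h)


def add_edges(g, i, j):
    """Add all edges between vertices labelled i and vertices labelled j."""
    assert i != j
    labels, edges = g
    new = {frozenset((u, v)) for u in labels for v in labels
           if labels[u] == i and labels[v] == j}
    return (labels, edges | new)


def relabel(g, i, j):
    """Change every label i to label j."""
    labels, edges = g
    return ({v: (j if lab == i else lab) for v, lab in labels.items()}, edges)


def clique(n):
    """Build the complete graph on n >= 1 vertices with labels 1 and 2 only."""
    g = new_vertex(0, 1)
    for v in range(1, n):
        g = disjoint_union(g, new_vertex(v, 2))  # the new vertex gets label 2
        g = add_edges(g, 1, 2)                   # join it to all older vertices
        g = relabel(g, 2, 1)                     # merge it into label 1
    return g
```

For instance, `clique(5)` returns a labelling in which all five vertices carry label 1, together with all 10 edges of the complete graph, which matches the remark that cliques have clique-width at most 2.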

Clique-width may be low even on graph classes for which the tree-width is unbounded, such as complete graphs or complete bipartite graphs (the clique-width of which is 2). Note that Definition 2.1 demands each vertex to carry only one label, while one can allow multiple labels as well. Another possible modification is to allow $i = j$ in the third step. Both these relaxations, while changing values of clique-width for some particular graphs, are nevertheless asymptotically equivalent to the standard clique-width notion of Definition 2.1.

One can, furthermore, define linear clique-width (see, e.g., [23]) which has the additional restriction that the union operator is allowed to take only a single vertex as the right-hand operand (i.e., the expression tree is a caterpillar—this is conceptually related to path-width).

A close alternative to clique-width is represented by the NLC classes introduced by Wanke [35]. The class $\mathrm{NLC}_k$ consists of all graphs that can be obtained from single vertices with single labels in $\{1, \dots, k\}$ using the following two operations:

• disjoint union of two graphs $G_1$ and $G_2$, with addition of all edges between vertices of $G_1$ with label $i$ and vertices of $G_2$ with label $j$ whenever $(i, j)$ belongs to a given fixed subset of $\{1, \dots, k\}^2$;

• relabelling of the vertices according to some map $\{1, \dots, k\} \to \{1, \dots, k\}$.

The NLC-width of a graph is the minimum $k$ such that the graph belongs to $\mathrm{NLC}_k$. It has been proved in [26] that the NLC-width ($\mathrm{nlcw}$) and the clique-width ($\mathrm{cw}$) of a graph $G$ are related by $\mathrm{nlcw}(G) \le \mathrm{cw}(G) \le 2 \cdot \mathrm{nlcw}(G)$.

At last, we briefly mention that another graph measure asymptotically equivalent to clique-width is rank-width [32]. Similarly, linear clique-width is asymptotically equivalent to linear rank-width [18].

The second structural measure of our interest is tree-depth.

Definition (Tree-depth [31]). The closure $\mathrm{clos}(F)$ of a forest $F$ is the graph obtained from $F$ by making every vertex adjacent to all of its ancestors. The tree-depth $\mathrm{td}(G)$ of a graph $G$ is one more than the minimum height of a rooted forest $F$ such that $G \subseteq \mathrm{clos}(F)$.
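The definition can be checked mechanically on small examples. The following Python sketch is an illustration under stated assumptions: a rooted forest is encoded as a child-to-parent dictionary (roots map to `None`), and the function names `closure_edges`, `forest_height` and `certified_tree_depth_bound` are hypothetical. It verifies that a given forest certifies an upper bound on the tree-depth; it does not search for an optimal forest.

```python
def closure_edges(parent):
    """Edges of the closure: every vertex is joined to all of its ancestors."""
    edges = set()
    for v in parent:
        a = parent[v]
        while a is not None:
            edges.add(frozenset((v, a)))
            a = parent[a]
    return edges


def forest_height(parent):
    """Maximum number of edges on a path from a root to a vertex."""
    def height_of(v):
        h = 0
        while parent[v] is not None:
            v = parent[v]
            h += 1
        return h
    return max(height_of(v) for v in parent)


def certified_tree_depth_bound(graph_edges, parent):
    """Return height + 1 if the graph is a subgraph of the closure, else None."""
    closure = closure_edges(parent)
    if all(frozenset(e) in closure for e in graph_edges):
        return forest_height(parent) + 1
    return None


# The path on 7 vertices 0-1-...-6 and a balanced rooted tree on the same vertices.
path_edges = [(i, i + 1) for i in range(6)]
balanced = {3: None, 1: 3, 5: 3, 0: 1, 2: 1, 4: 5, 6: 5}
```

Here `certified_tree_depth_bound(path_edges, balanced)` evaluates to 3, since the balanced tree has height 2 and its closure contains every edge of the path.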

Definition 2.1 is illustrated in Figure 1. For a proof of the following proposition, as well as for a more extensive study of tree-depth, we refer the reader to [30].

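Proposition. Let $G$ and $H$ be graphs. Then the following are true:

a) If $H$ is a minor of $G$, then $\mathrm{td}(H) \le \mathrm{td}(G)$.

b) If $L$ is the length of a longest path in $G$, then $\lceil \log_2(L+2) \rceil \le \mathrm{td}(G) \le L+1$.

c) If $\mathrm{tw}(G)$ and $\mathrm{pw}(G)$ denote the tree-width and path-width of a graph $G$, then $\mathrm{tw}(G) \le \mathrm{pw}(G) \le \mathrm{td}(G) - 1$.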

### 2.2 MSO logic on graphs

We now briefly introduce monadic second-order logic (MSO) over graphs and the concepts of MSO interpretation and transduction. We refer interested readers to, e.g., Courcelle and Engelfriet [5] for further reading. In general, MSO is the extension of first-order logic (FO) by quantification over sets. In our paper we deal with the following particular flavour:

Definition (MSO$_1$ and CMSO$_1$ logic of graphs). The language of MSO$_1$ consists of expressions built from the following elements:

• variables $x, y, \dots$ for vertices, and $X, Y, \dots$ for sets of vertices,

• equality for variables, quantifiers $\forall, \exists$ ranging over vertices and vertex sets, and the standard Boolean connectives,

• the predicates $x \in X$ and $\mathrm{edge}(x, y)$ with their standard meaning.

One may also use an arbitrary number of unary predicates on the vertex set (as vertex labels). The language of CMSO$_1$ (counting MSO$_1$), moreover, adds the predicates $\mathrm{mod}_{a,b}(X)$, such that $\mathrm{mod}_{a,b}(X)$ holds true if and only if $|X| \bmod b = a$.

MSO$_1$ logic can be used to express many interesting graph properties, such as 3-colourability and dominating set. We also briefly mention MSO$_2$ logic of graphs, which additionally includes quantification over edge sets and can express properties which are not definable in MSO$_1$ (e.g., Hamiltonicity).

From an algorithmic perspective, MSO logic is particularly useful as the language for describing tractable problems in algorithmic metatheorems (e.g., for the aforementioned graphs of bounded clique-width [7] or tree-width [4]). In this respect, we consider the model checking problem in which the input is a graph $G$, the parameter is a formula $\varphi$ of the considered logic (such as MSO$_1$), and the question is whether $G \models \varphi$.

A powerful tool, both in theory and in algorithmic metatheorems, is the ability to “efficiently translate” an instance of the model checking problem over a given class, into an instance of the problem over another class (for which we, perhaps, already have an efficient model checking algorithm). We start with simple interpretations of undirected graphs.

Definition. A simple MSO$_1$ graph interpretation is a pair $I = (\nu, \mu)$ of MSO$_1$ formulae (with $1$ and $2$ free first-order variables, respectively), such that $\mu$ is symmetric (i.e., $G \models \mu(x, y) \leftrightarrow \mu(y, x)$ in every graph $G$). (Footnote 2: We remark that while the question whether $\mu$ is symmetric is generally undecidable, we may simply force it to be symmetric, e.g., by using $\mu(x, y) \vee \mu(y, x)$.) To each graph $G$, the interpretation associates a graph $I(G)$ which is defined as follows:

• The vertex set of $I(G)$ (the domain of $I$ in $G$) is the set of all vertices $v$ of $G$ such that $G \models \nu(v)$;

• the edge set of $I(G)$ is the set of all the pairs $\{u, v\}$ of vertices of $G$ such that $G \models \nu(u) \wedge \nu(v) \wedge \mu(u, v)$.

A simple CMSO$_1$ graph interpretation is defined analogously.

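For example, a complete graph can be interpreted in any graph (with the same number of vertices) by letting the vertex formula $\nu$ and the edge formula $\mu$ both be $\mathrm{true}$, and the complement of a graph has an interpretation using $\nu \equiv \mathrm{true}$ and $\mu(x, y) \equiv \neg\,\mathrm{edge}(x, y)$.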
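The two examples can be imitated in code. The sketch below is an assumption-laden illustration, not the paper's formalism: the two formulae of an interpretation are replaced by plain Python predicates (so nothing here evaluates MSO logic), the name `interpret` is hypothetical, and symmetry of the edge predicate is forced by taking the disjunction of both argument orders.

```python
from itertools import combinations


def interpret(vertices, edges, nu, mu):
    """Apply an interpretation given by a vertex predicate nu and an edge
    predicate mu to the graph (vertices, edges).

    Both predicates receive an adjacency test adj(u, v) for the input graph.
    """
    def adj(u, v):
        return frozenset((u, v)) in edges

    new_vertices = {v for v in vertices if nu(adj, v)}
    new_edges = {
        frozenset((u, v))
        for u, v in combinations(sorted(new_vertices), 2)  # distinct pairs only
        if mu(adj, u, v) or mu(adj, v, u)                   # forced symmetry
    }
    return new_vertices, new_edges


# Input graph: the path 0 - 1 - 2.
V = {0, 1, 2}
E = {frozenset((0, 1)), frozenset((1, 2))}

keep_all = lambda adj, v: True
complete = interpret(V, E, keep_all, lambda adj, u, v: True)
complement = interpret(V, E, keep_all, lambda adj, u, v: not adj(u, v))
```

With this input, `complete` has all three possible edges and `complement` has the single edge between 0 and 2.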

Note that, to each formula , an interpretation naturally and efficiently assigns a formula such that holds. Having classes of finite graphs, we say that is a simple interpretation of in if the following holds: for every there is such that , and for every it holds that .

A more general concept of a “logical translation” is that of transductions. Briefly saying, in an addition to a simple interpretation, this allows to add to a graph arbitrary “parameters” (as unary predicates) and to make several disjoint copies of the graph. A thorough discussion of this concept can be found in [5], but we prefer to keep the paper simple and accessible to a wide audience of graph theorists, and so we give a simplified version of the definition from [1].

Still, before proceeding to Definition 2.2, we have to briefly extend the notion of interpretation towards finite relational structures with finite signatures. A relational structure of the signature consists of a universe (a finite set) and a (finite) list of relations over . For instance, for graphs, is the vertex set and is the binary symmetric relation of edges of . The language of logic of relational structures of the signature is as in Definition 2.2 with the predicates (instead of ). The scope of Definition 2.2 of a simple graph interpretation is then naturally generalized by allowing and to be formulae over relational structures of the signature . For each structure of the signature , the interpretation is, in our case, a simple graph (again possibly with arbitrarily assigned vertex labels).

{defi}

[ and transduction] A basic graph transduction is a triple such that is a simple graph interpretation, and is an sentence. The transduction maps a relational structure to a graph , denoted here by , if , and is undefined if .

The -copy operation maps a graph to the relational structure such that , the subset for induces a copy of the graph  (there are no edges between distinct copies), and is additionally equipped with a binary relation and unary such that; for iff , and .

The expansion of a graph by unary predicates maps to the set of all structures obtained by an expansion of by new unary predicates (as vertex labels).

Altogether, a many-valued map is an graph transduction if it can be written as , where is a basic graph transduction, is a -copy operation for some , and is the expansion by unary predicates for some . Specially, if , then we call a non-copying transduction.

A transduction is defined analogously.

Note, once again, that the result of a transduction of one graph is generally a set of graphs, due to the involved expansion map. For a graph class , the transduction of the class is the union of the particular transduction results, precisely, .

## 3 Capturing Height of Dense Graphs

The concept of tree-depth is commonly used to capture the “height” of other graphs than just trees. Actually, tree-depth can be seen as a bounded-height analogue of tree-width. However, as discussed already in the introduction, the main drawback of tree-depth (as well as of tree-width) is its incapability to handle dense graphs and some simple graph operations like the complement. Since, on the other hand, clique-width handles dense “uniform” graphs and the complement operation smoothly, it makes good sense to try to modify its definition towards capturing “height” in addition to “width”.

Unfortunately, such a direct modification of clique-width seems not possible,333For example, simply trying to restrict the underlying expressing tree in Definition 2.1 brings the necessity of disjoint unions of an arbitrary arity which, in turn, “weakens” the definition too much. This is precisely the point at which the NLC approach (Subsection 2.1) with explicitly adding edges only between the graphs participating in a disjoint union operation turns out better. and one has to look at other related width measures, namely to the so called neighbourhood diversity and the aforementioned NLC-width for an inspiration.

Before we continue, notice that the requirement to smoothly handle dense graphs and the graph complement operation, naturally means that a new measure cannot be stable under taking non-induced subgraphs.

### 3.1 Shrub-depth

To motivate the coming definition of shrub-depth, we recall the neighbourhood diversity parameter introduced by Lampis [27] in an algorithmic context: Two vertices $u, v$ are twins in a graph $G$ if $N_G(u) \setminus \{v\} = N_G(v) \setminus \{u\}$. The neighbourhood diversity of $G$ is the smallest $m$ such that $V(G)$ can be partitioned into $m$ sets such that in each part the vertices are pairwise twins. This basically means that $V(G)$ can be coloured by $m$ exclusive labels such that the existence of an edge $uv$ depends solely on the colours of $u$ and $v$.
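As a small illustration, the twin classes of a graph can be computed directly. The sketch assumes the usual twin condition (two vertices are twins when their neighbourhoods agree after removing the two vertices themselves); the function name `twin_classes` and the edge-list input format are choices made for this example only.

```python
def twin_classes(vertices, edges):
    """Partition the vertices into classes of pairwise twins.

    Two vertices u, v are treated as twins when N(u) - {v} == N(v) - {u}.
    The number of returned classes is the neighbourhood diversity.
    """
    nbrs = {v: set() for v in vertices}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)

    def twins(u, v):
        return nbrs[u] - {v} == nbrs[v] - {u}

    classes = []
    for v in vertices:
        for cls in classes:
            # Being twins is an equivalence relation, so one representative suffices.
            if twins(v, cls[0]):
                cls.append(v)
                break
        else:
            classes.append([v])
    return classes


k4 = [(u, v) for u in range(4) for v in range(u + 1, 4)]
k23 = [(u, v) for u in (0, 1) for v in (2, 3, 4)]
p4 = [(0, 1), (1, 2), (2, 3)]
```

On these inputs the complete graph on 4 vertices has a single class, the complete bipartite graph with parts of sizes 2 and 3 has two classes, and the path on 4 vertices has four.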

To stress that the considered labels are exclusive, we shall instead call them colours. Inspired by attempts to generalize neighbourhood diversity, e.g., in [19, 15], we come to the idea of enriching the diversity colouring with a bounded number of “layers”. This results in the following formalization:

Definition (Tree-model). Let $m$ and $d$ be non-negative integers. A tree-model of $m$ colours and depth $d$ for a graph $G$ is a pair $(T, S)$ of a rooted tree $T$ (of height $d$) and a set $S \subseteq \{1, 2, \dots, m\}^2 \times \{1, 2, \dots, d\}$ (called a signature of the tree-model) such that

1. the length of each root-to-leaf path in $T$ is exactly $d$,

2. the set of leaves of $T$ is exactly the set $V(G)$ of vertices of $G$,

3. each leaf of $T$ is assigned one of the colours $\{1, 2, \dots, m\}$, and

4. for any $i, j, \ell$ it holds $(i, j, \ell) \in S$ iff $(j, i, \ell) \in S$ (symmetry in the colours), and for any two vertices $u, v \in V(G)$ such that $u$ is coloured $i$ and $v$ is coloured $j$ and the distance between $u, v$ in $T$ is $2\ell$, the edge $uv$ exists in $G$ if and only if $(i, j, \ell) \in S$.

Note that point (4) effectively says that the existence of a $G$-edge between $u, v \in V(G)$ depends solely on the colours of $u, v$ and the depth of their least common ancestor in $T$. We hence, for convenience, call $T$ itself a tree-model of $G$, assuming that the signature set $S$ is implicitly associated with $T$.

The class of all graphs having such a tree-model of $m$ colours and depth $d$ is denoted by $\mathcal{TM}_m(d)$.
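To see how a tree-model determines its graph, consider the following sketch. It rests on assumptions made only for this illustration: the signature is taken to be a set of triples (colour, colour, half of the tree distance), each leaf is encoded by the tuple of child indices along its root-to-leaf path, and the name `graph_of_tree_model` is hypothetical.

```python
from itertools import combinations


def graph_of_tree_model(leaves, signature, d):
    """Return the edge set of the graph defined by a tree-model of depth d.

    leaves maps each vertex to (path, colour), where path is a tuple of d
    child indices from the root down to the leaf.
    signature is a set of triples (i, j, l), symmetric in i and j.
    """
    assert all((j, i, l) in signature for (i, j, l) in signature)
    edges = set()
    for u, v in combinations(leaves, 2):
        (path_u, col_u), (path_v, col_v) = leaves[u], leaves[v]
        assert len(path_u) == len(path_v) == d
        common = 0
        while common < d and path_u[common] == path_v[common]:
            common += 1
        half_distance = d - common  # the two leaves are at tree distance 2 * half_distance
        if (col_u, col_v, half_distance) in signature:
            edges.add(frozenset((u, v)))
    return edges


# A clique: depth 1, one colour, every pair of leaves meets at the root.
clique_model = {v: ((v,), 1) for v in range(4)}
# A complete bipartite graph: depth 1, two colours.
biclique_model = {0: ((0,), 1), 1: ((1,), 1), 2: ((2,), 2), 3: ((3,), 2)}
# Two disjoint edges: depth 2, edges only between leaves with a common parent.
pairs_model = {0: ((0, 0), 1), 1: ((0, 1), 1), 2: ((1, 0), 1), 3: ((1, 1), 1)}
```

The same coloured tree yields different graphs for different signatures: with `pairs_model`, the signature `{(1, 1, 1)}` gives two disjoint edges, while `{(1, 1, 2)}` gives the four edges between the two pairs.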

For instance, and . More generally, is exactly the class of graphs of neighbourhood diversity at most . For a more involved example, imagine an arbitrarily large collection of graphs , such that is partitioned into groups. Let be a graph obtained from a disjoint union by adding, say, all edges between distinct graphs from the groups 1 and 3, all edges from graphs in the group 2 to graphs in the groups 5 and 7, etc. Then . This “hierarchical” example can be easily generalized to higher values of . Yet another illustrations can be found in Figures 2 and 4.

It is easy to see that each class is closed under complements and induced subgraphs (which is our desire), but neither under disjoint unions, nor under subgraphs. If has a tree-model and is any induced subgraph of , then the corresponding induced subtree of immediately gives a tree-model for . Note also that one coloured tree can be a tree-model of several graphs (on the same vertex set), depending on the associated signatures.

Another interesting observation is the relation of a tree-model to a certain generalization of the NLC classes from Subsection 2.1: imagine that the definition of NLC is allowed to make disjoint union of an arbitrary number of graphs (but still with a uniform rule for adding edges between them), and the depth of the construction tree is bounded by . If we, furthermore, forbid the relabelling operation, then the result coincides with the class . Even if relabellings are allowed in NLC, we can encode all label changes in the leaf colours thanks to the bounded depth of the construction (at the price of increasing ).

The depth of a tree-model generalizes tree-depth of a graph as follows (while the other direction is obviously unbounded, e.g., for cliques):

Proposition.

If is of tree-depth , then . If, moreover, is connected, then also .

Proof.

Let be an inclusion-minimal rooted forest of height such that , and let be a rooted tree obtained by adding a new root connected to the former roots of , and . If is connected, then already is a tree, and then we set and .

For we set a colour such that and , where denotes the ancestor of in at distance from . Notice that (because of the height of ), and so the total number of distinct over all is . Let be obtained from as follows: For every node such that , we add to a new path with the other end denoted by such that , and set .

We claim that this with the colours in the leaves of is the desired tree-model of . Let be the graph defined on the leaves of as follows; is an edge of iff, for , and , it holds and . Then clearly .

When dealing with tree-models of graph classes (e.g., in model checking or in transductions), the depth parameter is asymptotically much more important than the number of colours . With this in mind, it is useful to work with a more streamlined notion which only requires a single parameter , and to this end, we introduce the following:

Definition (Shrub-depth). A class of graphs $\mathcal{G}$ has shrub-depth $d$ if there exists $m$ such that $\mathcal{G} \subseteq \mathcal{TM}_m(d)$, while for all natural $m$ it is the case that $\mathcal{G} \not\subseteq \mathcal{TM}_m(d-1)$. In a wider sense, $\mathcal{G}$ is of bounded shrub-depth if there exist integers $d, m$ such that $\mathcal{G} \subseteq \mathcal{TM}_m(d)$. Note that Definition 3.1 is asymptotic as it makes sense only for infinite graph classes; the shrub-depth of a single finite graph is always at most one ($0$ for empty or one-vertex graphs). Furthermore, it makes no sense to say “the class of all graphs of shrub-depth $d$”.

For instance, the class of all cliques has shrub-depth $1$. On the other hand, it will follow from Theorem 3.3 that the class of all paths has unbounded (infinite) shrub-depth. Now we argue that this new notion is indeed “intermediate” between tree-depth and clique-width (and even linear clique-width).

Proposition. Let $\mathcal{G}$ be a graph class and $d$ an integer. Then:

a) If $\mathcal{G}$ is of tree-depth at most $d$, then $\mathcal{G}$ is of shrub-depth at most $d$.

b) If $\mathcal{G}$ is of bounded shrub-depth, then $\mathcal{G}$ is of bounded linear clique-width.

The converse statements are not true in general.

Proof.

a) This follows from Proposition 3.1, and the converse cannot be true in general because of, e.g., the class of all cliques.

b) We remark that it is trivial to see that is of bounded clique-width. Here we even show how to straightforwardly translate a tree-model with colours and depth into a linear (caterpillar-shaped) -expression: Let be any (usual) left-to-right ordering of the leaves of a tree-model of some . The expression is constructed inductively for as follows:

• a vertex is created and added with a (currently unique) colour where is its colour in ,

• whenever colour is to be adjacent to colour at distance in the model , the expression adds all edges between the colours and , and

• for being the distance from to in , the expression changes all colours with to .

A counterexample to the converse claim is, e.g., the class of all paths by Theorem 3.3.

The relation between classes of bounded shrub-depth and of bounded tree-depth is even deeper than shown above. The operation of a local complementation in a graph takes any vertex and replaces the subgraph induced on the neighbours of with its edge-complement. A graph is a vertex-minor of a graph if is an induced subgraph of a graph such that is obtained from by a sequence of local complementations. As shown in [24], the class of vertex-minors of all graphs of tree-depth at most has shrub-depth at most , and every class of shrub-depth can be constructed as vertex-minors of graphs of tree-depth where depends (only) on .

### 3.2 SC-depth

A significant drawback of the notion of shrub-depth is the aforementioned fact that it does not make sense to ask about the shrub-depth of a single finite graph. Here we propose a remedy for this problem in the form of another, very simple and single-parameter based, definition of a depth-like parameter which turns out to be asymptotically equivalent to shrub-depth. (Although, several years of research experience since [20] have also shown many clear advantages of the shrub-depth notion.)

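Let $G$ be a graph and let $X \subseteq V(G)$. We denote by $\overline{G}^X$ the graph with vertex set $V(G)$ where $x \neq y$ are adjacent in $\overline{G}^X$ if (i) either $\{x, y\} \not\subseteq X$ and $xy \in E(G)$, or (ii) $\{x, y\} \subseteq X$ and $xy \notin E(G)$. In other words, $\overline{G}^X$ is the graph obtained from $G$ by complementing the edges on $X$.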
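Complementation on a subset is a symmetric difference of edge sets, which the following minimal sketch makes explicit. The function name `complement_on` and the representation of edges as two-element frozensets are assumptions of this example.

```python
from itertools import combinations


def complement_on(edges, X):
    """Complement the edges among the vertices of X and keep all other pairs.

    edges is a set of two-element frozensets; X is a collection of vertices.
    """
    pairs_inside_X = {frozenset(p) for p in combinations(X, 2)}
    # The symmetric difference flips exactly the pairs lying inside X.
    return set(edges) ^ pairs_inside_X


# The path 0 - 1 - 2.
path = {frozenset((0, 1)), frozenset((1, 2))}
```

Complementing `path` on all three vertices leaves only the edge between 0 and 2, complementing on `{0, 1}` leaves only the edge between 1 and 2, and applying the same complementation twice restores the original graph.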

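Definition (SC-depth). (Footnote 4: As the “Subset-Complementation” depth.) We define inductively the class $\mathcal{SC}(k)$ as follows:

• We let $\mathcal{SC}(0) = \{K_1\}$;

• if $G_1, \dots, G_p \in \mathcal{SC}(k)$ and $H = G_1 \,\dot\cup\, \dots \,\dot\cup\, G_p$ denotes the disjoint union of the $G_i$’s, then for every subset $X$ of vertices of $H$, we have $\overline{H}^X \in \mathcal{SC}(k+1)$.

The SC-depth of $G$ is the minimum integer $k$ such that $G \in \mathcal{SC}(k)$.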

The SC-depth of a graph is thus the minimum height of a rooted tree , such that the leaves of form the vertex set of , and each internal node is assigned a subset of the descendant leaves of . Then the graph corresponding to in is the complement on , of the disjoint union of the graphs corresponding to the children of (see Figure 3).

The reason we introduce both the asymptotically equivalent SC-depth and shrub-depth measures here is that each brings a unique perspective on the classes of graphs we are interested in (see e.g. [24]).

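Theorem. Let $\mathcal{G}$ be a class of graphs. Then the following are equivalent:

• There exist integers $d$, $m$ such that $\mathcal{G} \subseteq \mathcal{TM}_m(d)$ ($\mathcal{G}$ has bounded shrub-depth).

• There exists an integer $k$ such that $\mathcal{G} \subseteq \mathcal{SC}(k)$ ($\mathcal{G}$ has bounded SC-depth).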

More precisely, and .

Proof.

We prove the forward implication by induction on . In the degenerate base case , it is trivially the case that . Assume now for some . By Definition 3.1, there exist an integer and graphs (actually subgraphs of induced by the leaf sets of the root-subtrees in the respective tree-model of ) such that the following holds: results from the disjoint union by adding those edges for which and belong to distinct graphs among , and for the pair of colours of ,  belongs to the signature .

By the induction assumption, we have got for some integer . For each of these graphs , , we successively complement the edges on the following subsets of vertices:

• for each such that , on the set of the vertices of of colour ,

• for each such that , on the set (defined as above), then on the set itself and then on itself.

Observe that at most complement operations are applied to each , and this number can be reduced down to by skipping possible repeated complements. Denoting by the graph obtained in this way from we get, by Definition 4, that where .

Effectively, in each we have complemented the edges whose colour pairs (together with third ) belong to . In the next step we make the disjoint union and repeat the same complementation procedure on this global level. Namely:

• for each such that , on the set of the vertices of of colour ,

• for each such that , on the set , then on and then on .

Denoting the resulting graph by , we similarly get where . It remains to routinely verify that .

As for the backward implication, we directly construct a tree-model for each graph . By Definition 4, can be constructed along a rooted tree such that the leaf set of is and each internal node of is associated with a complement set (which is a subset of the descendant leaves). We assign the leaf colours as follows. Let be a leaf of , and be the path from to the root of . We colour with the binary vector such that iff .

By Definition 4, forms an edge of , if and only if the pair belongs to an odd number of the complement sets over the whole . This can easily be determined from the colours of and , and from the depth of their least common ancestor in . Consequently, .
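The parity argument of the backward implication can be tested on a small example. The sketch below is illustrative only: a leaf is encoded as a bare vertex name, an internal node as a pair (complement set, list of children), and the names `evaluate`, `all_complement_sets` and `edge_by_parity` are hypothetical. It builds the graph bottom-up by repeated complementation and compares the result with the parity rule.

```python
from itertools import combinations


def evaluate(node):
    """Return (vertices, edges) of the graph constructed at this node."""
    if not isinstance(node, tuple):  # a leaf is a bare vertex name
        return {node}, set()
    X, children = node
    vertices, edges = set(), set()
    for child in children:           # disjoint union of the children's graphs
        child_vertices, child_edges = evaluate(child)
        vertices |= child_vertices
        edges |= child_edges
    assert set(X) <= vertices        # X consists of descendant leaves only
    flipped = {frozenset(p) for p in combinations(X, 2)}
    return vertices, edges ^ flipped  # complement the edges on X


def all_complement_sets(node):
    """Collect the complement sets of all internal nodes of the tree."""
    if not isinstance(node, tuple):
        return []
    X, children = node
    return [set(X)] + [S for c in children for S in all_complement_sets(c)]


def edge_by_parity(u, v, sets):
    """Parity rule: u and v are adjacent iff an odd number of sets contain both."""
    return sum(1 for S in sets if u in S and v in S) % 2 == 1


# A tree of height 2 on the leaves 1, 2, 3, 4.
tree = ({1, 2, 3}, [({1, 2}, [1, 2]), ({3}, [3, 4])])
```

For this tree, `evaluate(tree)` produces exactly the edges between 1 and 3 and between 2 and 3: the pair 1, 2 lies in two complement sets and is therefore flipped back to a non-edge, in agreement with `edge_by_parity`.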

### 3.3 Long paths

For graphs of small tree-depth a characteristic property is the absence of long paths as subgraphs, cf. Proposition 2.1 b). This is obviously false for classes of small shrub-depth since those, in particular, include all cliques and bicliques. However, one can restrict induced paths in every class , as follows.

{thm}

Let and denote the path of length , i.e., on vertices. Then , but for any we have . In particular, there exist no such that would contain all paths.

{proof}

We start with the construction of that is, of an appropriate tree-model of , by induction on . We shall maintain a special property that each end of is represented in by a leaf which has no siblings, i.e., its parent is of degree . As the base case, we use the tree-model of colours and depth from the left-hand side of Figure 4. (Note that although , we use an extra level in to achieve our property.)

We now construct for . Let and be the ends of , and recall that each of has no siblings in . We create a sibling of in and assign a new colour . This intermediate tree-model can represent with the ends and, see Figure 4 right, the desired model follows:

• for and its disjoint copy , add a common ancestor of their roots,

• create a rooted path of length , with the root and the only leaf of colour , and make another son of .

Clearly, is a tree-model of colours and depth , and it can represent the edges and but not . Thus makes a tree-model of for .

In the converse direction we start with an easy observation for ; for any  (this follows from the folklore fact that the path on vertices is not a cograph, too). The proof can then be finished by induction over , provided that we establish the following contrapositive claim: if for any , then .

So fix and , and assume and is a corresponding tree-model of colours and minimum possible height . In this proof we denote by the subtree of rooted at a node . As is minimum and is connected, there exist distinct sons of the root of and colours (possibly equal), such that includes at least one leaf with colour and at least one leaf with colour , and the colour pair at distance determines an edge.

We let denote the subgraph formed only by those edges which are determined by the colour pair at distance in , i.e., iff the colours of are in and the only common ancestor of is the root of .

If , then we claim that there cannot be two non-incident edges in . Indeed, this would necessarily mean that contains , but . Hence is or and there exist at most three vertices of colour altogether, and in either case one subpath in is of length at least . Hence gives a tree-model of of labels.

We now examine the other possibility . First, we observe that if are non-incident edges of such that are of the same colour, then the only common ancestor of is the root of . Otherwise, we would get a contradiction that . Second, we argue that there cannot be three pairwise non-incident edges in (where are of the same colour). If this happened, then (say) for the vertex at least two of the vertices  would have only one common ancestor with , the root of . Consequently, would have at least two neighbours in the set , and the same would symmetrically hold for all the members of this set, contradicting the fact that is acyclic.

Therefore, is a path of length at most , or consists of two components isomorphic to or . Moreover, if there exists a leaf of colour or in which is not incident to an edge of , then has no two non-incident edges and all such leaves (of colour or ) not incident to are of the same colour, as can be easily checked.

We first consider the case that has one component. If it is or then, by the previous arguments, all the leaves of coloured (say) are incident to the one or two edges of . As above (in the case of ) we can now argue that gives a tree-model of of labels. If, on the other hand, is or , then all the leaves of coloured or are incident to the edges of . We form a new tree-model by removing from all the leaves of colours (i.e., incident to the edges of ) and adding arbitrarily one new leaf of colour . Then of labels models a path (or ).

We are left with the case of consisting of two components, such that all the leaves of coloured or are incident to the edges of . If any of the subpaths of is of length at least , then we are again done. Otherwise, we can choose one component of such that contains a subpath of length at least . We denote by the other component of (presumably ), and form a new tree-model by restricting to the leaves from , removing the leaves of and adding arbitrarily one leaf of colour (recall that no vertex of has colour or ). Hence of labels models a path (or ).

The combinatorial result in Theorem 3.3 is also related to logical questions (see Section 4). For instance, with respect to the research on MSO-orderable graphs by Blumensath and Courcelle [2], note that in the class of all finite paths one can easily define a linear ordering by an MSO formula. Hence it immediately follows from a characterization given in [2, Theorem 5.31] that the class of all finite paths cannot have bounded shrub-depth. The advantage of our Theorem 3.3 (occurring already in [20]) is that it gives exact combinatorial bounds. Furthermore, Theorem 3.3 together with Theorem 4.1 implies the result [2, Theorem 5.31] that infinite graph classes of bounded shrub-depth are not MSO-orderable.

Note, however, that graph classes of bounded shrub-depth are not asymptotically related to those excluding long induced subpaths; in the opposite direction the situation here is very different from that in Proposition 2.1 b). As an example, we mention the graph class from Figure 5 which contains no induced subpaths of length . One can give a direct combinatorial proof that this class is of unbounded shrub-depth (similarly as for Theorem 3.3), but we skip it here since this fact follows from the aforementioned result of [2] (the graph of Figure 5 is FO-orderable) or, alternatively, from a combination of results of [24].

### 3.4 Induced subgraphs characterization

Lastly in this section, we provide yet another characterization of the classes defined previously. In a nutshell, we are going to show that each of these classes can be characterized by a finite list of forbidden induced subgraphs. A nice consequence of this finding is that membership in each of the classes can be tested in polynomial time. The tool we use here is well-quasi-ordering.

A class or property is said to be hereditary if it is closed under taking induced subgraphs. A well-quasi-ordering (or wqo) of a set is a quasi-ordering on such that for any infinite sequence of elements of there exist with . In other words, a wqo is a quasi-ordering that does not contain an infinite strictly decreasing sequence or an infinite set of incomparable elements. We are going to use the following well-known result:

{thm}

[Ding [11]] Let be an integer and be a finite set of colours. The class of the graphs not containing a path on vertices as a subgraph and with vertices coloured by is well-quasi-ordered under the colour-preserving induced subgraph order.

{cor}

Let be a graph class of bounded shrub-depth, such that the vertices of the graphs in are coloured from a finite set of colours. Then is well-quasi-ordered under the colour-preserving induced subgraph order.

{proof}

Consider an infinite sequence , and the corresponding tree-models . Let , , denote the rooted tree with leaf labels composed of the colours of and the colours of . By Theorem 3.4, of bounded diameter is wqo under the rooted coloured subtree relation, and, consequently, so are the coloured graphs , as desired.

The advertised result now follows by a simple twist as follows.

{thm}

For all integers , there exists a finite set of graphs (the forbidden subgraphs) such that a graph belongs to if and only if has no induced subgraph isomorphic to a member of .

Similarly, for every there exists a finite set of graphs such that if and only if has no induced subgraph isomorphic to one of .

{proof}

We let be the (isomorphism-free) set of graphs such that but for every . By this definition, no member of is a proper induced subgraph of another member. Hence it is enough to argue that is wqo to conclude that is finite.

The latter follows from an easy observation: if for some , then . Indeed, we take a tree-model of , add arbitrarily a new leaf of a unique new colour for  and annotate with an extra bit the colours of all leaves which are neighbours of in . The result is a tree-model for with colours. Consequently, and the wqo property follows from Corollary 3.4.

The second claim is proved analogously. We let be the (isomorphism-free) set of graphs such that but for every . By Theorem 3.2, , and so by the previous paragraph. The wqo property again follows from Corollary 3.4.

The “obstacle” sets and of Theorem 3.4 are not only of mathematical interest, but also have algorithmic consequences. Namely, in connection with established algorithms they allow for efficient membership testing of these classes. Note, however, that we do not provide an algorithmic construction of the sets and , and so we only prove the existence of the respective algorithms for each specific value of and (in parameterized complexity theory this is formally called nonuniform FPT).

{cor}

The problems to decide, for a given graph , whether and whether , are fixed-parameter tractable with respect to the parameters and , respectively.

{proof}

We provide a proof for the problem of , while that of is very similar. As mentioned before, the class is of bounded clique-width (namely, is a trivial upper bound). Therefore, one can use [25] to compute in FPT time an approximate expression of of clique-width depending only on or to correctly conclude that . In the former case, one can then call the algorithm of [7] to test whether any member of is an induced subgraph of . Based on the outcome, the correct decision about is easily made.
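To make the role of a finite obstacle set concrete, the following sketch tests whether a graph avoids every member of a given finite list as an induced subgraph. It is a plain brute-force search and not the clique-width based algorithm of [7]. Since the obstacle sets of Theorem 3.4 are not constructed here, the sketch uses the path on four vertices (the forbidden induced subgraph of cographs) as a stand-in. The function names and the adjacency-dictionary encoding are our own assumptions.

```python
from itertools import combinations, permutations

def has_induced_copy(G, H):
    """True iff H is isomorphic to an induced subgraph of G.
    Graphs are dicts mapping each vertex to its set of neighbours."""
    hv = list(H)
    for subset in combinations(G, len(hv)):
        for image in permutations(subset):
            m = dict(zip(hv, image))
            # Induced copy: adjacency and non-adjacency must both agree.
            if all((m[b] in G[m[a]]) == (b in H[a])
                   for a, b in combinations(hv, 2)):
                return True
    return False

def avoids_all(G, forbidden):
    """True iff G has no induced subgraph isomorphic to a listed graph."""
    return not any(has_induced_copy(G, H) for H in forbidden)

def cycle(n):
    """Cycle on vertices 0..n-1 (helper for the example only)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

# Stand-in obstacle: the path on four vertices.
P4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

For a fixed finite list of obstacles, each with at most h vertices, this search inspects on the order of n^h vertex tuples of an n-vertex input graph, which is polynomial for fixed h but not fixed-parameter tractable in the sense of the corollary.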

## 4 Shrub-depth and MSO Transductions

While in the previous section we have focused on establishing basic combinatorial properties of shrub-depth and SC-depth, now we shift our attention towards their logical aspects. The final outcome will be the finding that (a slight technical adjustment of) tree-models of depth precisely capture the -th finite level of the transduction hierarchy of simple undirected graphs, for all . For that, we start by showing that shrub-depth indeed goes well with simple interpretations.

### 4.1 Stability under interpretations

We again turn to classical clique-width for inspiration: graph classes of bounded clique-width have interpretations into the class of all coloured rooted trees and, in turn, graph classes having an interpretation into those of bounded clique-width still have bounded clique-width (although the bound on their clique-width is generally much higher).

In one direction, shrub-depth has been defined using (Definition 3.1) a very special form of a simple interpretation. In the other direction, we can go even further than with clique-width itself (cf. also Section 4.3): the bound on shrub-depth is preserved exactly (and not only asymptotically) under any interpretations. In other words, the precise height of a tree is absolutely essential for interpretability. The full formal statement follows.

{thm}

A class of graphs has a simple interpretation in a class of finite coloured rooted trees of height at most , if, and only if,  has shrub-depth at most .

The ‘if’ direction of Theorem 4.1 follows immediately from Definition 3.1: for any , the class has a simple interpretation (or even FO interpretation) in the class of -coloured tree-models of depth . Hence we now give a proof of the ‘only if’ direction of Theorem 4.1 consisting of the following sequence of three technical claims.

{lem}

[Gajarský and Hliněný [16]] There exists a function over the positive integers such that the following holds. (Footnote 5: Here stands for the iterated (“tower of height ”) exponential, i.e., and .)

Let be a rooted tree with each vertex assigned one of at most colours, and let be any sentence with quantifiers, such that the least common multiple of the values of all predicates in equals . Take any node such that the subtree rooted at  is of height , and denote by the connected components of (their roots are thus all the sons of ).

Assume that there exists a (sufficiently large) subset of indices , where , such that there are colour-preserving isomorphisms from to each , . Choose any , , and take the subtree . Then behaves the same with respect to as , precisely, .

Lemma 4.1 and, in particular, the operation of obtaining from as in the lemma, will be useful in the following generalized setting of a reduction. Assume that is an arbitrary non-decreasing function and is a positive integer (for use with Lemma 4.1, we can have ), and is a coloured rooted tree of height . Inductively for , we do the following: For every such that is of height , consider the components of partitioned into equivalence classes according to the existence of a colour-preserving isomorphism. In each of these classes whose cardinality is at least , we repeatedly remove -tuples of components until the cardinality reaches where . Let be the resulting “reduced” subtree of . In such a situation we say that is -reduced (modulo ) to . Observe that is of bounded size depending only on , and , and is independent of the size of .
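The reduction just described can be sketched in a few lines. The sketch rests on assumptions of ours: the exact thresholds are not legible in the text above, so we assume that a class of pairwise isomorphic components below a node of height i is trimmed when it has at least f(i) + q members, and that q members are removed at a time until fewer than f(i) + q remain. The names `f`, `q`, `canon`, `height` and `reduce_tree`, as well as the encoding of a tree as a pair (colour, list of subtrees), are hypothetical.

```python
from collections import defaultdict

def canon(tree):
    """Canonical form of a coloured rooted tree (colour, [subtrees]);
    two trees get equal forms iff they are colour-preserving isomorphic."""
    colour, children = tree
    return (colour, tuple(sorted(canon(c) for c in children)))

def height(tree):
    """Height of a rooted tree; a single node has height 0."""
    _, children = tree
    return 1 + max(map(height, children)) if children else 0

def reduce_tree(tree, f, q):
    """Bottom-up reduction modulo q (thresholds are assumed, see above).
    We assume f(i) >= 1, so no isomorphism class is removed entirely."""
    colour, children = tree
    reduced = [reduce_tree(c, f, q) for c in children]
    i = height((colour, reduced))
    classes = defaultdict(list)
    for c in reduced:
        classes[canon(c)].append(c)
    kept = []
    for members in classes.values():
        n = len(members)
        while n >= f(i) + q:   # remove q isomorphic components at a time
            n -= q
        kept.extend(members[:n])
    return (colour, kept)

# Hypothetical example: a star with ten leaves of colour "a", three of "b".
star = ("r", [("a", [])] * 10 + [("b", [])] * 3)
small = reduce_tree(star, f=lambda i: 2, q=3)
```

In the example the class of ten leaves of colour "a" shrinks in steps of three to four leaves, while the class of three leaves of colour "b" is below the assumed threshold and stays untouched.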

We continue with the technical claims leading to Theorem 4.1. Imagine a situation in which we have a graph (tree) automorphism taking a vertex to a vertex , and similarly an automorphism taking to . Then it is generally not true that there would exist an automorphism taking the pair to the pair . The next lemma establishes a simple additional condition under which such an automorphism always exists. We need the notion of an orbit. The binary relation on the vertex set of a graph defined as ‘ iff there is an automorphism taking to ’ is an equivalence, and its equivalence classes are called the vertex automorphism orbits.
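The phenomenon described above is easy to observe on a tiny tree. The following brute-force sketch enumerates all colour-preserving automorphisms of a rooted tree and tests whether some single automorphism maps a given pair of vertices onto another pair. The encoding (a `parent` dictionary with the root mapped to `None`) and the function names are our own assumptions, and the enumeration over all permutations is only meant for very small examples.

```python
from itertools import permutations

def automorphisms(parent, colour):
    """All colour-preserving automorphisms of a rooted tree given by a
    parent map (the root has parent None); brute force over permutations."""
    nodes = sorted(parent)
    result = []
    for image in permutations(nodes):
        m = dict(zip(nodes, image))
        # m.get(None) is None, which forces the root to be mapped to itself.
        if all(colour[m[v]] == colour[v]
               and parent[m[v]] == m.get(parent[v]) for v in nodes):
            result.append(m)
    return result

def can_map_pair(auts, x1, y1, x2, y2):
    """True iff one automorphism takes x1 to x2 and y1 to y2 at once."""
    return any(m[x1] == x2 and m[y1] == y2 for m in auts)

# Hypothetical example: root r with sons u, v; leaf a below u, b below v.
parent = {"r": None, "u": "r", "v": "r", "a": "u", "b": "v"}
colour = {x: 0 for x in parent}
auts = automorphisms(parent, colour)
```

Here the identity takes u to u and the swap of the two branches takes a to b, yet no automorphism takes the pair (u, a) to the pair (u, b). Note that the least common ancestor of u and a is u, whereas that of u and b is the root.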

Note that all automorphisms in this section are colour-preserving. {lem} Let be a coloured rooted tree. Assume that are vertex automorphism orbits of , and and are chosen arbitrarily. Let , , denote the least common ancestor of in . If and , then there is an automorphism of taking the pair onto