Dedicated to Jianer Chen on the occasion of his 60th birthday.

# Linear Recognition of Almost Interval Graphs

## Abstract

Let , , and denote the classes of graphs that can be obtained from some interval graph by adding vertices, adding edges, and deleting edges, respectively. When is small, these graph classes are called almost interval graphs. They are well motivated from computational biology, where the data ought to be represented by an interval graph while we can only expect an almost interval graph at best. For any fixed , we give linear-time algorithms for recognizing all these classes, and in the case of membership, our algorithms also provide a specific interval graph as evidence. When is part of the input, these problems are also known as graph modification problems, all NP-complete. Our results imply that they are fixed-parameter tractable parameterized by , thereby resolving the long-standing open problem on the parameterized complexity of recognizing , first asked by Bodlaender et al. [Bioinformatics, 11:49–57, 1995]. Moreover, our algorithms for recognizing and run in times and (where and stand for the numbers of vertices and edges, respectively, in the input graph), significantly improving the -time algorithm of Heggernes et al. [STOC 2007] and the -time algorithm of Cao and Marx [SODA 2014], respectively.

## 1 Introduction

A graph is an interval graph if its vertices can be assigned to intervals on the real line such that there is an edge between two vertices if and only if their corresponding intervals intersect. This set of intervals is called an interval model for the graph. The study of interval graphs has been closely associated with (computational) biology [4, 91]. For example, in physical mapping of DNA, which asks for reconstructing the relative positions of clones along the target DNA based on their pairwise overlap information [65, 2], the input data can be easily represented by a graph, where each clone is a vertex, and two clones are adjacent if and only if they overlap [91, 97, 54]; the result should hence be an interval graph. A wealth of literature has been devoted to algorithms on interval graphs, which include a series of linear-time recognition algorithms [13, 73, 61, 48, 50, 43, 24]. Ironically, however, these recognition algorithms are never used as they are intended to be. Biologists never need to roll up their sleeves and feed their data into any recognition algorithm before claiming the answer is “NO” with full confidence, i.e., their data would not give an interval graph even though they ought to. The reason is that biological data, obtained mainly by experimental methods, are destined to be flawed.
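The definition of an interval model can be made concrete with a minimal sketch. The function name `interval_graph`, the closed-interval convention, and the toy clone data are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations

def interval_graph(model):
    """Return the edge set of the interval graph defined by `model`.

    `model` maps each vertex to a closed interval (left, right);
    two vertices are adjacent iff their intervals intersect.
    """
    edges = set()
    for u, v in combinations(sorted(model), 2):
        (lu, ru), (lv, rv) = model[u], model[v]
        if max(lu, lv) <= min(ru, rv):  # the two closed intervals share a point
            edges.add((u, v))
    return edges

# Hypothetical clones placed along a target DNA strand.
clones = {"a": (0, 2), "b": (1, 4), "c": (3, 5), "d": (6, 7)}
edges = interval_graph(clones)
```

Recognition is the inverse task: given only the overlap graph, decide whether some such model exists.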

More often than not, biologists are also confident that their data, though not perfect, are of reasonably good quality: there are only a few errors hidden in the data. This leads us naturally to consider graphs that are not interval graphs, but close to one in some sense. We say that a graph is an almost interval graph if it can be obtained from an interval graph by a small number of modifications; it may or may not be an interval graph itself. Different applications are afflicted with different types of errors, e.g., there might be outliers, false-positive overlaps, and/or false-negative overlaps. We can accordingly define different measures for closeness. For any given nonnegative integer , we use , , and to denote the classes of graphs that can be obtained from some interval graph by adding at most vertices, adding at most edges, and deleting at most edges, respectively. We remark that this definition can be easily generalized to any hereditary graph class (i.e., closed under taking induced subgraphs). Interval graphs and all other graph classes to be mentioned in this paper are hereditary [39, 14, 85].

The first task is of course to efficiently decide whether a given graph is an almost interval graph or not, and more importantly, identify an object interval graph if one exists. Computationally, finding an object interval graph is equivalent to pinpointing the few but crucial errors in the data. For any fixed , this can be trivially done in polynomial time: given a graph on vertices, we can in time try every subset of vertices, edges, or missing edges of . Such an algorithm is nevertheless inefficient even for very small , as is usually large. The main results of this paper are linear-time recognition algorithms for all three classes of almost interval graphs.
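The trivial subset-enumeration strategy can be sketched for the vertex version as follows. This is only an illustration under stated assumptions: all function names are hypothetical, and the interval test used here is a naive exponential one (it looks for a linear order of the maximal cliques in which the cliques containing each vertex are consecutive) that stands in for the linear-time recognition algorithms cited in the text.

```python
from itertools import combinations, permutations

def maximal_cliques(vertices, adj):
    """Enumerate maximal cliques with a plain Bron-Kerbosch recursion."""
    cliques = []
    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    expand(set(), set(vertices), set())
    return cliques

def is_interval(vertices, adj):
    """Naive test: some ordering of the maximal cliques must keep the
    cliques containing each vertex consecutive."""
    cliques = maximal_cliques(vertices, adj)
    for order in permutations(cliques):
        consecutive = True
        for v in vertices:
            pos = [i for i, c in enumerate(order) if v in c]
            if pos[-1] - pos[0] + 1 != len(pos):
                consecutive = False
                break
        if consecutive:
            return True
    return False

def vertex_deletion_brute_force(vertices, edges, k):
    """Try every set of at most k vertices; return one whose removal
    leaves an interval graph, or None if no such set exists."""
    vertices = list(vertices)
    for size in range(k + 1):
        for removed in combinations(vertices, size):
            keep = [v for v in vertices if v not in removed]
            adj = {v: set() for v in keep}
            for u, v in edges:
                if u in adj and v in adj:
                    adj[u].add(v)
                    adj[v].add(u)
            if is_interval(keep, adj):
                return set(removed)
    return None

cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # a cycle on four vertices
solution = vertex_deletion_brute_force(range(4), cycle_edges, 1)
```

The number of candidate subsets grows with the graph size raised to the power of the budget, which is exactly the inefficiency the paragraph above points out.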

###### Theorem 1.1.

Let be any fixed nonnegative integer. Given a graph on vertices and edges, the membership of in each of , , and can be decided in time. Moreover, in the affirmative case, an object interval graph can be produced in the same time.

Thm. 1.1 extends the line of linear-time algorithms for recognizing interval graphs. In the running times of all the three algorithms, needless to say, the constants hidden by big-Oh depend on . Since all the problems are NP-hard when , instead of being constant, is part of the input [67, 55, 38], the dependence on is necessarily super-polynomial (assuming P ≠ NP). Now that the linear dependence on the graph size is already optimum, we would like to minimize the factor of . We are thus brought into the framework of parameterized computation. Recall that a problem, associated with some parameter, is fixed-parameter tractable (FPT) if it admits a polynomial-time algorithm where the exponent on the input size ( in this paper) is a global constant independent of the parameter . Through the lens of parameterized computation, the recognition of almost interval graphs is conventionally defined as graph modification problems, where the parameter is , and the task is to transform a graph to an interval graph by at most modifications. For the classes , , and , the modifications are vertex deletions, edge deletions, and completions (i.e., edge additions) respectively, which are the most commonly considered on hereditary graph classes. The parameterized problems are accordingly named interval vertex deletion, interval edge deletion, and interval completion. Our results can then be more specifically stated as:

###### Theorem 1.2.

Given a graph on vertices and edges and a nonnegative parameter , the problems interval vertex deletion, interval edge deletion, and interval completion can be solved in time , , and , respectively.

In particular, we show that interval edge deletion is FPT, thereby resolving a long-standing open problem first asked by Bodlaender et al. Further, our algorithms for interval vertex deletion and interval completion significantly improve the -time algorithm of Heggernes et al. and the -time algorithm of Cao and Marx, respectively. We remark that an -time approximation algorithm of ratio 8 for the minimum interval vertex deletion problem can also be derived.

We feel obliged to point out that computational biologists cannot claim all credit for the discovery and further study of interval graphs. Independent of , Hajós formulated the class of interval graphs out of nothing but coffee. Since its inception in the 1950s, its natural structure has earned it a position in many other applications, among which the most cited ones include job scheduling in industrial engineering, temporal reasoning, and seriation in archeology. All these applications involve some temporal structure, which is understandable: before the final invention of time-traveling vehicles, a graph representing the relationship of temporal activities has to be an interval graph. With errors involved, almost interval graphs arise naturally.

### 1.1 Notation

All graphs discussed in this paper shall always be undirected and simple. The order and size of a graph are defined to be the cardinalities of its vertex set and its edge set respectively. We assume without loss of generality that is connected and nontrivial (containing at least two vertices); thus . We sometimes use the customary notation to mean , and to mean . The degree of a vertex is denoted by . A vertex is simplicial if induces a clique; let denote the set of simplicial vertices of . The length of a path or a cycle is defined to be the number of edges in it. Standard graph-theoretical and algorithmic terminology can be found in [27, 39].

A cycle induced by vertices, where , is called a -hole, or simply a hole if is irrelevant. In other words, a hole is an induced cycle that is not a triangle. A graph is chordal if it contains no holes. Lekkerkerker and Boland  showed that a graph is an interval graph if and only if it is chordal and does not contain a structure called asteroidal triple (at for short), i.e., three vertices such that each pair of them is connected by a path avoiding neighbors of the third one. They went further to list all minimal chordal graphs that contain an at. These graphs, reproduced in Fig. 1, are called chordal asteroidal witnesses (caws for short).
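The definition of an asteroidal triple translates directly into a naive search, sketched here for illustration. The function names are hypothetical, and this exhaustive approach tries every triple, so it is far from the linear-time detection discussed later in the paper. The example graph is a long claw, built as a center with three legs of two edges each.

```python
from collections import deque
from itertools import combinations

def connected_avoiding(adj, s, t, banned):
    """Breadth-first search from s to t that never enters `banned`."""
    if s in banned or t in banned:
        return False
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for w in adj[u]:
            if w not in seen and w not in banned:
                seen.add(w)
                queue.append(w)
    return False

def is_asteroidal_triple(adj, x, y, z):
    """Each pair must be joined by a path avoiding the third vertex
    and all of its neighbors."""
    for a, b, c in ((x, y, z), (x, z, y), (y, z, x)):
        if not connected_avoiding(adj, a, b, adj[c] | {c}):
            return False
    return True

def find_asteroidal_triple(adj):
    """Return the first asteroidal triple in sorted order, or None."""
    for triple in combinations(sorted(adj), 3):
        if is_asteroidal_triple(adj, *triple):
            return triple
    return None

long_claw = {
    "c": {"a1", "b1", "d1"},
    "a1": {"c", "a2"}, "a2": {"a1"},
    "b1": {"c", "b2"}, "b2": {"b1"},
    "d1": {"c", "d2"}, "d2": {"d1"},
}
triple = find_asteroidal_triple(long_claw)
```

In the long claw the three leaves form the only asteroidal triple, whereas a simple path contains none.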

Let denote the set of minimal forbidden induced subgraphs of interval graphs, i.e., all holes and caws. Let be the set {net, sun, rising sun, long claw, whipping top, -hole, -hole} (see the first row of Fig. 1). An important ingredient of our algorithms is a comprehensive study of the following graph class. Clearly, , and thus all interval graphs satisfy this definition.

###### Definition 1.

Locally interval graphs are defined by forbidding all subgraphs in .

An induced interval subgraph of is an interval subgraph induced by a set of vertices. An interval graph (resp., ) is called a spanning interval subgraph (resp., an interval supergraph) of if it has the same vertex set as and (resp., ). An induced interval subgraph (resp., a spanning interval subgraph or an interval supergraph ) of is maximum (resp., maximum or minimum) if (resp., or ) is maximum (resp., maximum or minimum) among all induced interval subgraphs (resp., spanning interval subgraphs or interval supergraphs) of ; in other words, the number of modifications (resp., or ) is minimum.

A subset of vertices forms a module of if all vertices in have the same neighborhood outside of . In other words, for any pair of vertices , a vertex is adjacent to if and only if it is adjacent to as well. The set and all singleton vertex sets are modules, called trivial. A graph on less than three vertices has only trivial modules, while a graph on three vertices always has a nontrivial module. A graph on at least four vertices is prime if it contains only trivial modules, e.g., all holes of length at least five and all caws are prime. Two disjoint modules are either nonadjacent or completely adjacent. Given any partition of such that for every is a module of , we can associate a quotient graph , where each vertex represents a module of , and for any pair of distinct with , the th and th vertices of are adjacent if and only if and are adjacent in . From and for all (their total sizes are bounded by ), the original graph can be easily and efficiently retrieved.
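A short sketch may help fix the notions of module and quotient graph. The names `is_module` and `quotient_graph` are hypothetical; the quotient construction relies on the fact stated above that two disjoint modules are either nonadjacent or completely adjacent, so testing one representative per module suffices.

```python
from itertools import combinations

def is_module(adj, subset):
    """A set is a module iff all its vertices see the same outside vertices."""
    subset = set(subset)
    outside_views = {frozenset(adj[v] - subset) for v in subset}
    return len(outside_views) <= 1

def quotient_graph(adj, partition):
    """Edges of the quotient graph over a partition into modules.

    Vertex i of the quotient stands for partition[i]; one representative
    per module decides adjacency.
    """
    reps = [sorted(module)[0] for module in partition]
    edges = set()
    for i, j in combinations(range(len(partition)), 2):
        if reps[j] in adj[reps[i]]:
            edges.add((i, j))
    return edges

# A path on three vertices: the two end vertices form a nontrivial module.
path3 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
partition = [{"a", "c"}, {"b"}]
quotient_edges = quotient_graph(path3, partition)
```

Here the quotient is a single edge, and the original path is recovered by substituting the two nonadjacent end vertices for one of its endpoints.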

### 1.2 Our major results

We state here the major results of this paper (besides Thms. 1.1 and 1.2) that are of independent interest. Our first result is a straightforward observation on modules of locally interval graphs and interval graphs.

###### Proposition 1.3.

Let  be the class of interval graphs or the class of locally interval graphs. A graph is in  if and only if a quotient graph of is in  and

1. every non-simplicial vertex of represents a clique module; and

2. in any pair of adjacent vertices of , at least one represents a clique module.

Our second major result comprises a set of theorems. They characterize the minimum modifications with respect to modules of the input graph. Note that after replacing a module by another subgraph, we add edges between every vertex in the new subgraph and .

###### Theorem 1.4.

Let be a maximum induced interval subgraph of graph . For any module of intersecting , the set is a module of , and if is -hole-free, then replacing by any maximum induced interval subgraph of in gives a maximum induced interval subgraph of .

###### Theorem 1.5.

Let be a -hole-free graph. There is a maximum spanning interval subgraph  of such that the following hold for every module of : i) is a module of ; and ii) replacing by any maximum spanning interval subgraph of in gives a maximum spanning interval subgraph of .

###### Theorem 1.6.

For any graph , there is a minimum interval supergraph  of such that the following hold for every module of : i) is a module of ; and ii) if is not a clique, then replacing by any minimum interval supergraph of in gives a minimum interval supergraph of .

These results hold regardless of , and thus can be used for any algorithmic approach, e.g., Thm. 1.6 has already been used in . We remark that there has been a long relationship between modules and interval graphs. Indeed, the algorithm of , based on a characterization of prime interval graphs by Hsu , is arguably the simplest among all known recognition algorithms for interval graphs.

Let be a connected graph whose vertices, called bags, are the set of all maximal cliques of . We say that is a clique decomposition of if for any , the set of bags containing induces a connected subgraph of . A caterpillar is a tree that consists of a main path together with leaves attached to it. An olive ring is a uni-cyclic graph that consists of a hole (called the main cycle) together with pendant vertices (having degree 1) attached to this hole. The deletion of any edge from the main cycle of an olive ring results in a caterpillar. Our third result is on the clique decomposition of prime locally interval graphs.
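The caterpillar shape is easy to test on a given tree, as the following sketch shows. The function name `is_caterpillar` is hypothetical. It uses the standard observation that a tree is a caterpillar exactly when its non-leaf vertices form a path, which for a tree reduces to a degree condition on those vertices.

```python
def is_caterpillar(adj):
    """Check that `adj` (vertex -> set of neighbors) is a tree whose
    non-leaf vertices form a path."""
    n = len(adj)
    if n == 0:
        return False
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    if m != n - 1:
        return False  # a tree has exactly n - 1 edges
    # Connectivity check by depth-first search.
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    if len(seen) != n:
        return False
    # In a tree the non-leaf vertices induce a subtree; it is a path
    # iff none of them has more than two non-leaf neighbors.
    spine = {v for v in adj if len(adj[v]) > 1}
    return all(len(adj[v] & spine) <= 2 for v in spine)

star = {"c": {"x", "y", "z"}, "x": {"c"}, "y": {"c"}, "z": {"c"}}
spider = {
    "c": {"a1", "b1", "d1"},
    "a1": {"c", "a2"}, "a2": {"a1"},
    "b1": {"c", "b2"}, "b2": {"b1"},
    "d1": {"c", "d2"}, "d2": {"d1"},
}
```

The star is a caterpillar with a one-vertex main path, while the spider with three legs of two edges each is not.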

###### Theorem 1.7.

A prime locally interval graph  has a clique decomposition that is either a caterpillar when it is chordal; or an olive ring otherwise. This decomposition can be constructed in time.

Indeed, given a prime graph that does not have such a decomposition, our algorithm is able to identify a subgraph of in . The following statement is stronger than Thm. 1.7 and implies it.

###### Theorem 1.8.

Given a prime graph , we can in time either build an olive-ring/caterpillar decomposition for or find a subgraph of in .

In addition to the above listed concrete results, our algorithms also suggest a meta approach for designing fixed-parameter algorithms for vertex deletion problems (where modules are trivially preserved): If the object graph class can be characterized by a set of forbidden induced subgraphs of which only a finite number are not prime, then we may break them first and then use divide-and-conquer, i.e., solve the quotient graph and subgraphs induced by modules individually. This extends the result of Cai , and might also be applicable to some edge modification problems, on which, however, the preservation of modules needs to be checked case by case. The main advantage of this approach is that it enables us to concentrate on prime graphs and use their structural properties.

### 1.3 Motivation and background

The aforementioned physical mapping of DNA is a central problem in computational biology [65, 2]. In a utopia where experimental data were perfect, they should define an interval graph. Then the problem is equivalent to constructing an interval model for the graph, which can be done in linear time. In the real world we live in, however, data are always inconsistent and contaminated by a few but crucial errors, which have to be detected and fixed. In particular, on the detection of false-positive errors that correspond to fake edges, Goldberg et al. formulated the minimum interval edge deletion problem and showed its NP-hardness. Likewise, the deletion of vertices can be used to formulate the detection of outliers (i.e., elements participating in many false overlaps, both positive and negative), and the minimum interval vertex deletion problem has long been known to be NP-hard [64, 67].

Solving the minimum interval vertex deletion problem and the minimum interval edge deletion problem is equivalent to finding the maximum induced interval subgraph [31, 9] and the maximum spanning interval subgraph  respectively. In light of the importance of interval graphs, it is not surprising that some natural combinatorial problems can be formulated as, or computationally reduced to the interval deletion problems. For instance, Narayanaswamy and Subashini  recently solved the maximum consecutive ones sub-matrix problem and the minimum convex bipartite deletion problem by a reduction to minimum interval vertex deletion. Oum et al.  showed that an induced interval subgraph can be used to find a special branch decomposition, which can be in turn used to devise FPT algorithms for a large number of problems, namely, locally checkable vertex subset and vertex partitioning problems. They both used our previous algorithm  as a subroutine, and thus will benefit from an improved algorithm directly.

The minimum interval completion problem is also a classic NP-hard problem [55, 96]. Besides computational biology, its most important application should be sparse matrix computations . The profile method is an extension of the bandwidth method [83, 81], and their purpose is to minimize the storage used during Gaussian elimination for a symmetric sparse matrix. Both methods attempt to reorder the rows and columns of the input matrix such that all eliminations are confined within a band or an envelope around the main diagonal, while all entries outside remain zero during the whole computation. Therefore, we only need to store the elements in the band or envelope, whose sizes are accordingly called the bandwidth and profile . Rose  correlated bandwidth with graphs. Tarjan  showed that a symmetric matrix has a reordering such that its profile coincides with non-zero entries if and only if it defines an interval graph (there is an edge between vertices and if and only if the -element is non-zero), and finding the minimum profile is equivalent to solving the minimum interval completion problem.

A very similar problem is the minimum pathwidth problem, which also asks for an interval supergraph of but the objective is to minimize the size of the maximum clique in . This problem was also known to be NP-hard . In light of the hardness of both problems, people turned to finding minimal interval completions, which can be viewed as a relaxation of both of them. Ohtsuki et al.  designed an algorithm that finds a minimal interval completion in time. Very recently, Crespelle and Todinca  proposed an improved algorithm that runs in time. This is the best known, and it remains open to develop a linear-time algorithm for finding a minimal interval completion. See also Heggernes et al.  for a characterization of minimal interval completions.

Möhring  showed that if a graph is free of ats, then any minimal chordal supergraph of it is an interval graph. The converse was later shown to be true as well . Since the minimum chordal completion problem (also known as minimum fill-in) is known to be NP-hard on at-free graphs , the minimum interval completion problem remains NP-hard on at-free graphs. Other graph classes on which the minimum interval completion problem remains NP-hard include chordal graphs , permutation graphs , and cocomparability graphs . On the positive side, see  for some polynomially solvable special cases.

### 1.4 Graph modification problems and their fixed-parameter tractability

Many classical graph-theoretic problems can be formulated as graph modification problems to specific graph classes. For example, Garey and Johnson [35, section A1.2] listed 18 NP-complete graph modification problems (two of which are indeed large collections of problems; see also [67, 95]). Graph modification problems are also among the earliest problems whose parameterized complexity was considered, e.g., Kaplan et al.  and Cai  devised FPT algorithms for completion problems to chordal graphs and related graphs. Indeed, since graph modification problems are a natural computational method for detecting a few errors in experimental data, they were an important motivation behind parameterized computation. In the special case when the desired graph class  can be characterized by a finite number of forbidden (induced) subgraphs, their fixed-parameter tractability follows from a basic bounded search tree algorithm . However, many important graph classes, e.g., forests, bipartite graphs, and chordal graphs, have minimal obstructions of arbitrarily large size (cycles, odd cycles, and holes, respectively). It is much more challenging to obtain fixed-parameter tractability results for such classes.

Besides holes, has another infinite set of obstructions (caws), which is far less understood [23, 63]. Since adding or deleting a single edge is sufficient to fix an arbitrarily large caw, the modification problems to interval graphs are more complicated than those to chordal graphs. Their fixed-parameter tractability was frequently posed as an important open problem [53, 30, 10]. Only after about two decades were interval completion and interval vertex deletion shown to be FPT [90, 21]. Both algorithms use a two-phase approach, where the first phase breaks all (problem-specifically) small forbidden induced subgraphs and the second one takes care of the remaining ones with the help of combinatorial properties that hold only in graphs without those small subgraphs. Nevertheless, neither approach of [90, 21] generalizes to interval edge deletion in a natural way, whose parameterized complexity has remained open to date. Moreover, both algorithms of [90, 21] suffer from high time complexity.

In passing let us point out that the vertex deletion version can be considered as the most robust variant, as it encompasses both edge modifications in the following sense: if a graph can be made an interval graph by edge deletions and edge additions, then it can also be made an interval graph by at most vertex deletions (e.g., one vertex from each added/deleted edge). In other words, the graph class contains both classes and . The similar fact holds for all hereditary graph classes. On the other hand, and are incomparable in general, e.g., a -hole is in interval and a is in interval but not the other way.

### 1.5 Efficient detection of (small) forbidden induced subgraphs

As said, if the object graph class has only a finite number of forbidden induced subgraphs, then the modification problem is trivially FPT. This observation can be extended to a family of forbidden induced subgraphs that, though infinite, can be detected in polynomial time and destroyed by a bounded number of ways; the most remarkable example is chordal completion [53, 17]. For the purpose of contrast, let us call this one-phase approach. In carrying out the aforementioned two-phase approach, one usually focuses on the second phase, on the ground that the first phase seems to be the same as the one-phase approach. This ground is, nevertheless, shaky: more often than not, algorithms based on the one-phase approach run in linear time, but all previous algorithms [90, 89, 21] based on this two-phase approach have high polynomial factors in their running times, which are mainly determined by the time required to detect small forbidden induced subgraphs in the first phase. As we will see, the detection of a small forbidden induced subgraph is usually far more demanding than an arbitrary one.

Kratsch et al.  presented a linear-time algorithm for detecting a hole or an at from a non-interval graph. It first calls the hole-detection algorithm of Tarjan and Yannakakis , which either returns a hole, or reduces to finding an at in a chordal graph. The additional chordal condition for the detection of an at is crucial: we do not know how to find an at in a general graph in linear time. The best known recognition algorithm for at-free graphs takes time , and Kratsch and Spinrad  showed that this algorithm can be used to find an at in the same time if the graph contains one. A more important result of  is that recognizing at-free graphs is at least as difficult as finding a triangle. The detection of an at cannot be easier than the recognition of at-free graphs, and hence a linear-time algorithm for it is very unlikely to exist. (See also .) When an at is detected, the algorithm of Kratsch et al.  also provides in the same time a witness for it. This witness, although not necessarily minimal itself, can be used to easily retrieve a minimal one, i.e., a caw (see also  for another approach).

Obviously, for any hereditary graph class, the detection of a forbidden induced subgraph is never easier than the recognition of this graph class. On the other hand, we have seen that the detection of a hole, an at with witness, and a subgraph in can be done in the same asymptotic time as the recognition of chordal graphs, at-free graphs, and interval graphs, respectively. From these examples one may surmise that the requirement of explicit evidence does not seem to pose an extra burden to the recognition algorithms. This is known to be true for almost all polynomial-recognizable graph classes with known characterization by forbidden induced subgraphs.

However, the situation changes drastically when the evidence is further required to have a small or minimum number of vertices. The most famous example should be the detection of cycles: while an arbitrary cycle can be trivially found in linear time, the detection of a shortest cycle, which includes triangle detection as a special case, is very unlikely to be done in linear time. Even finding a short cycle in linear time seems to be out of the question (see, e.g., ). Assuming that triangles cannot be detected in linear time, we can also rule out the possibility of linear-time detection of a minimum subgraph in or a shortest hole. Let be the graph obtained by subdividing a graph (i.e., for each edge , adding a new vertex , connecting it to both and , and deleting ), then contains a triangle if and only if the minimum subgraph of in is a -hole. Since has vertices and edges, a linear-time algorithm for finding a minimum subgraph in can be used to detect a triangle in linear time. With a similar reduction, we can show that a linear-time algorithm for detecting subgraphs in (recall that they are small graphs in ) is unlikely to exist, as it can be used to detect a claw in linear time, and further to detect a triangle in time, which would have groundbreaking consequences (see [85, Open problem 8.3, page 103]). A similar phenomenon has been observed in detecting minimum Tucker submatrices, i.e., a minimal matrix that does not have the consecutive-ones property , and shortest even holes .
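The subdivision used in this reduction is simple to write down. The sketch below is illustrative only: the function name `subdivide` and the tuple labels for the new vertices are assumptions. On a triangle it produces a cycle on six vertices, with every cycle of the original graph doubling in length.

```python
def subdivide(vertices, edges):
    """Replace each edge (u, v) by a path u - x - v through a new vertex x."""
    new_vertices = list(vertices)
    new_edges = []
    for u, v in edges:
        x = ("sub", u, v)  # hypothetical label for the subdividing vertex
        new_vertices.append(x)
        new_edges.append((u, x))
        new_edges.append((x, v))
    return new_vertices, new_edges

triangle_vertices = [0, 1, 2]
triangle_edges = [(0, 1), (1, 2), (0, 2)]
sub_vertices, sub_edges = subdivide(triangle_vertices, triangle_edges)
```

The construction adds one vertex and one edge per original edge, so the subdivided graph has size linear in that of the input, which is what allows a linear-time algorithm on it to be translated back.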

Another crucial step of our algorithm is to find all simplicial vertices of a graph. Again, it is unlikely to be done in linear time: Kratsch and Spinrad  showed that counting the number of simplicial vertices is already at least as hard as detecting a triangle. Indeed, there is even no known algorithm that can detect a single simplicial vertex in linear time. The only known way of finding a simplicial vertex is either enumerating all vertices or using fast matrix multiplication. Kloks et al.  showed that in the same time one can actually list all simplicial vertices. This is the best known in general graphs. See also [94, open problems 4.3 and 4.4].
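The enumeration approach mentioned above can be sketched directly from the definition of a simplicial vertex. The function name is hypothetical, and the check inspects every pair of neighbors of every vertex, so its running time is not linear; it only illustrates what has to be verified.

```python
from itertools import combinations

def simplicial_vertices(adj):
    """Return all vertices whose neighborhood induces a clique.

    `adj` maps each vertex to the set of its neighbors.
    """
    result = set()
    for v, nbrs in adj.items():
        if all(u in adj[w] for u, w in combinations(nbrs, 2)):
            result.add(v)
    return result

path3 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
```

On a path with three vertices the two ends are simplicial and the middle vertex is not; an induced cycle on four vertices has no simplicial vertex.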

### 1.6 Main challenges and our techniques

We describe here the main challenges and intuitions behind the techniques that we use to address them. They can be roughly put into two categories: for the linear dependence on the graph size and for the smaller exponential dependence on the parameter. Also sketched here is why known techniques from previous work will not suffice. We basically take the two-phase approach, subgraphs in first and then the rest (large ones). We say that caws and holes in are small and short respectively; other caws, namely, s and s, are large, and holes of length six or more are long. It is worth noting that the thresholds are chosen by structural properties instead of sizes.

#### Linear dependence on the graph size.

The biggest challenge is surely the efficient detection of a subgraph in , or more specifically, the detection of a short hole or small caw. As explained above, we do not expect a linear-time algorithm for this task. Instead, we relax it to the following: either find a subgraph in or build a structural decomposition (Thm. 1.8) that is sufficient for the second phase. For the disposal of large forbidden induced subgraphs in the second phase, the algorithm of  breaks long holes first, and then large caws in a chordal graph. There is no clear way to implement this tactic in linear time: the disposal of holes introduces a factor , while finding a caw gives another factor . Neither of them seems to be improvable to . We are thus forced to consider an alternative approach, i.e., we may have to deal with large caws in a non-chordal graph. Hence completely new techniques are required. Overcoming these two difficulties enables us to deliver linear-time algorithms.

#### Exponential dependence on the parameter.

To claim the fixed-parameter tractability of interval edge deletion and better dependence on for interval completion, we still have some major concerns to address. Since fixing holes by edge additions is well understood, the algorithm of Heggernes et al.  for interval completion assumes the input graph to be chordal, and focuses on the disposal of caws. However, holes pose a nontrivial challenge to us in the deletion problems, and thus the techniques of  do not apply. On the other hand, the algorithm in  heavily relies on the fact that the deletion of vertices leaves an induced subgraph. Essentially, it looks for a minimum set of vertices intersecting all subgraphs in , a so-called hitting set. Deleting any vertex from a subgraph in breaks this subgraph once and for all, but adding/deleting an edge to break an erstwhile subgraph in might introduce new one(s). As a result, the “hitting set” observation does not apply to edge modification problems.

• The first difficulty that presented itself at this point is on the preservation of modules, which is trivial for vertex deletions, but not true for edge modifications in general. Simple examples tell us that not all maximum spanning interval subgraphs and minimum interval supergraphs preserve all modules. What we do here is to identify appropriate technical conditions, under which there exists some maximum spanning interval subgraph or minimum interval supergraph that preserves all modules, and make them satisfied at the onset of the second phase.

• The other difficulty is why it suffices to consider a bounded number of modifications to fix a special caw, for which we need to argue that most possible modifications are local to it and can be decided locally. In , we studied, in a chordal graph with no small caws, how a caw interacts with others; similar arguments are obviously inapplicable to edge variations. Even for vertex deletions, as we have had to compromise and work on non-chordal graphs, we need a new argument that does not assume chordality.

## 2 Outline

The purpose of this section is to describe the main steps of our algorithm at a high level. A quotient graph is isomorphic to an induced subgraph of , e.g., we can pick an arbitrary vertex from each module of the module partition and take the induced subgraph. Therefore, whenever a forbidden induced subgraph of is detected, it can be translated into a forbidden induced subgraph of directly.

### 2.1 Maximal strong modules

Behind Prop. 1.3 and Thms. 1.4-1.6 is a very simple observation: -holes are the only non-prime graphs in and . Note that for any induced subgraph intersecting a module , their intersection is a (possibly trivial) module of . Therefore, if is prime and , then it intersects in at most one vertex. Fix any module partition and accordingly a quotient graph . If is in or but not a -hole, then either contains at most one vertex from each module, and is thus isomorphic to an induced subgraph of , or is fully contained in some module from the given partition. On the other hand, a -hole may contain precisely two vertices of a module , and then the other two vertices must be neighbors of this module. We have two cases: the other two vertices belong to the same module that is adjacent to , or they belong to two different (nonadjacent) modules. In other words, either two non-clique modules are adjacent, or a non-clique module is not simplicial in . This concludes Prop. 1.3.

However, Prop. 1.3 has no direct algorithmic use: a graph might have an exponential number of modules and quotient graphs. A module is strong if for every other module that intersects , one of and is a proper subset of the other. All trivial modules are strong. We say that a strong module , different from , is maximal if the only strong module properly containing is . Using the definition, it is easy to verify that maximal strong modules of are disjoint and every vertex of appears in one of them. Therefore, they partition , and define a special quotient graph . If is not connected, then each maximal strong module is a component of it, and has no edge. Recall that the complement graph of is defined on the same vertex set , where a pair of vertices and is adjacent if and only if in . Thus, the complement of has the same set of modules as ; in particular, if it is not connected, then its components are the maximal strong modules of , and hence is complete. If both the graph and its complement are connected, then must be prime . Note that this is the only case in which a quotient graph can be prime; in other words, a prime quotient graph must be defined by maximal strong modules.
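As a concrete illustration of the definitions above, the following minimal Python sketch checks whether a vertex set is a module and builds the quotient graph of a given module partition. The adjacency-dictionary representation and the helper names `is_module` and `quotient_graph` are assumptions made for this example only; the check is brute force and is not the linear-time procedure referred to in the text.

```python
from itertools import combinations

def is_module(adj, M):
    """Return True if every vertex outside M is adjacent to all of M or none of M.

    `adj` maps each vertex to the set of its neighbors (hypothetical layout).
    """
    M = set(M)
    for v in adj:
        if v in M:
            continue
        seen = len(adj[v] & M)
        if seen not in (0, len(M)):
            return False
    return True

def quotient_graph(adj, partition):
    """Quotient graph of a partition of the vertices into nonempty modules.

    Vertex i of the result stands for partition[i]; i and j are adjacent
    when the two modules are completely connected, which can be read off
    from one representative of each module.
    """
    assert all(is_module(adj, M) for M in partition)
    rep = [next(iter(M)) for M in partition]
    q = {i: set() for i in range(len(partition))}
    for i, j in combinations(range(len(partition)), 2):
        if rep[j] in adj[rep[i]]:
            q[i].add(j)
            q[j].add(i)
    return q

# A hole on four vertices a-b-c-d-a: opposite pairs are modules.
hole4 = {'a': {'b', 'd'}, 'b': {'a', 'c'}, 'c': {'b', 'd'}, 'd': {'a', 'c'}}
print(is_module(hole4, {'a', 'c'}))
print(quotient_graph(hole4, [{'a', 'c'}, {'b', 'd'}]))
```

Picking one representative per module mirrors the remark that a quotient graph is isomorphic to an induced subgraph of the original graph.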

Hereafter, the quotient graph is always decided by maximal strong modules of ; when itself is prime, they are isomorphic. There are at most maximal strong modules, which can be found in linear time . Therefore, the following corollary of Prop. 1.3 will be more useful for algorithmic purpose. Recall that a vertex is universal in if . It is easy to verify that a prime graph is necessarily connected, and its simplicial vertices are pairwise nonadjacent.
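The two vertex types recalled above can be tested directly from an adjacency structure. The sketch below uses the standard definitions (a universal vertex is adjacent to every other vertex; a simplicial vertex has a neighborhood that induces a clique); the function names and the dictionary representation are hypothetical choices for illustration.

```python
def is_universal(adj, v):
    """v is universal if it is adjacent to every other vertex (simple graph assumed)."""
    return len(adj[v]) == len(adj) - 1

def is_simplicial(adj, v):
    """v is simplicial if its neighbors are pairwise adjacent."""
    nbrs = list(adj[v])
    return all(nbrs[j] in adj[nbrs[i]]
               for i in range(len(nbrs))
               for j in range(i + 1, len(nbrs)))

def simplicial_vertices(adj):
    """Set of all simplicial vertices, found by brute force."""
    return {v for v in adj if is_simplicial(adj, v)}

# Induced path a-b-c-d, a prime graph: its simplicial vertices are the two ends.
path4 = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b', 'd'}, 'd': {'c'}}
print(simplicial_vertices(path4))
print(any(is_universal(path4, v) for v in path4))
```

On this path the two simplicial vertices are nonadjacent and no vertex is universal, consistent with the remarks preceding the corollary.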

###### Corollary 2.1.

Let  be the class of interval graphs or the class of locally interval graphs. A graph having no universal vertices is in  if and only if

1. the quotient graph decided by maximal strong modules of is in  but not a clique;

2. for every module represented by a simplicial vertex of ; and

3. is a clique for every module represented by a non-simplicial vertex of .

Every parameterized modification problem has an equivalent optimization version, which asks for a minimum set of modifications; the resulting interval graph is called an optimum solution to this problem. Clearly, a graph is in the class interval, interval, or interval if and only if the minimum number of vertex deletions, edge deletions, or edge additions respectively that transform into an interval graph is no more than . Although the recognition/modification problems we are working on do not explicitly ask for an optimum solution, an optimum one will serve our purpose. We have stated in Thms. 1.4-1.6 that there are always optimum solutions well aligned with modules of the input graph. Again, for algorithmic purpose, the following variations formulated on maximal strong modules are more convenient for our divide-and-conquer approach. As we will see shortly, they are indeed equivalent to Thms. 1.4-1.6 respectively.

###### Theorem 2.2.

Let be a graph of which every -hole is contained in some maximal strong module, and let be a maximum induced interval subgraph of . For any maximal strong module of intersecting , the set is a module of , and replacing by any maximum induced interval subgraph of in gives a maximum induced interval subgraph of .

###### Theorem 2.3.

Let be a graph of which every -hole is contained in some maximal strong module. There exists a maximum spanning interval subgraph  of such that every maximal strong module of is a module of , and replacing by any maximum spanning interval subgraph of in gives a maximum spanning interval subgraph of .

We may assume without loss of generality that the input graph contains no universal vertices. According to Cor. 2.1, the condition of Thms. 2.2 and 2.3 is satisfied if (i) is not a clique, (ii) contains no -hole, and (iii) every non-simplicial vertex of represents a clique module of . In this paper cliques are required to be nonempty. It is easy to verify that the maximum induced interval subgraph or maximum spanning interval subgraph of a graph is a clique if and only if the graph itself is a clique; thus, under the condition of Thms. 2.2 and 2.3, a maximal strong module is a clique of the object interval graph if and only if it is a clique of .

###### Theorem 2.4.

There is a minimum interval supergraph  of such that every maximal strong module of is a module of , and if is not a clique, then replacing by any minimum interval supergraph of in gives a minimum interval supergraph of .

### 2.2 Characterization and decomposition of locally interval graphs

Prop. 1.3 reduces the main task of the first phase, the detection of a subgraph of in , to two simpler tasks, namely, finding a subgraph of in and finding all simplicial vertices of when it is a locally interval graph. Both tasks are trivial when is an interval graph (including cliques and edgeless graphs), and hence we concentrate on prime non-interval graphs. If such a graph contains no subgraph in , i.e., being a locally interval graph, then it must contain some large caw or some long hole. Therefore, we start from characterizing large caws and long holes in prime locally interval graphs. A glance at Fig. 1 tells us that each caw contains precisely three simplicial vertices, which form the unique asteroidal triple (AT) of this caw; they are called the terminals of this caw. Each large caw (the second row of Fig. 1) contains a unique terminal , called the shallow terminal, such that the deletion of from this caw leaves an induced path.

###### Theorem 2.5.

Let be a large caw of a prime graph . We can in time find a subgraph of in if the shallow terminal of is non-simplicial in .

If a prime locally interval graph  is chordal, then by Thm. 2.5, every caw contains a simplicial vertex (its shallow terminal), and thus must be an interval graph. In a chordal graph, can be easily found, and then a caterpillar decomposition for can be obtained by adding to a clique path decomposition for (Section 5.3). This settles the chordal case of Thm. 1.8; we may hence assume that is not chordal and has a long hole .

###### Theorem 2.6.

Let be a hole of a prime graph . We can in time find a subgraph of in if there exists a vertex satisfying one of the following: (1) the neighbors of in are not consecutive; (2) is adjacent to or more vertices in ; or (3) is non-simplicial and nonadjacent to .

If is a prime locally interval graph, then for any vertex of the hole , the subgraph must be chordal; otherwise, and any hole of will satisfy Thm. 2.6(3). Therefore, combining Thms. 2.5 and 2.6, we conclude that must be an interval subgraph, and has a linear structure. These observations inspire the definition of the auxiliary graph (with respect to ), which is the main technical tool for analyzing prime non-chordal graphs. Here we need a special vertex of satisfying some local properties, which can be found in linear time (Section 5.1). We number vertices in such that is this special vertex and define . We designate the ordering of traversing as clockwise, and the other counterclockwise. The local properties enable us to assign a direction to each edge between and , i.e., , in accordance with the direction of itself. We use  and  to denote the set of clockwise and counterclockwise edges from , respectively; partitions .

###### Definition 2.

The vertex set of consists of , where and are distinct copies of , i.e., for each , there is a vertex in and another vertex in , and is a new vertex distinct from . For each edge , we add to the edge set of

• an edge if neither nor is in ;

• two edges and if both and are in ; or

• an edge or if and or respectively.

Finally, we add an edge for every .

It is easy to see that the order and size of are upper bounded by and respectively. We will show in Section 5.1 that an adjacency list representation of can be constructed in linear time. The auxiliary graph carries all structural information of useful for us and is easy to manipulate; in particular, the new vertex is introduced to memorize the connection between and the right end of . The shape of the symbol is a good hint for understanding the structure of the auxiliary graph. Suppose has an olive-ring structure; then has a caterpillar structure, which is obtained by unfolding the olive ring as follows. The subgraph has a caterpillar structure, to the ends of which we append two copies of . The two copies of , namely, and , are identical, and every edge between and is carried by only one copy of it, depending on whether it is in  or . Furthermore, properties stated in the following theorem allow us to fold (the reverse of the “unfolding” operation) the caterpillar structure of back to produce the olive-ring decomposition for . Note that is different from .

###### Theorem 2.7.

A vertex different from is simplicial in if and only if it is derived from some simplicial vertex of . Moreover, we can in time find a subgraph of in if 1) is not chordal; or 2) is not an interval graph.

We may assume that the graph is chordal, whose simplicial vertices can be identified easily. As a result of Thm. 2.7, we can retrieve and obtain the graph . If it is not an interval graph, then we are done with Thm. 1.8. Otherwise, we apply the following operation to sequentially build a hole decomposition for and an olive-ring decomposition for . Noting that all holes of are also in , once the decomposition for is produced, we can use it to find a shortest hole of . We proceed only when this hole is long.

###### Lemma 2.8.

Given a clique path decomposition for , we can in time build a clique decomposition for that is a hole. Moreover, we can find in time a shortest hole of .

###### Theorem 2.9.

Given a clique hole decomposition for , we can in time construct a clique decomposition for that is an olive ring.

Putting together these steps, we get the decomposition algorithm in Fig. 2, from which Thm. 1.8 follows. This concludes the proof of the characterization and decomposition of prime locally interval graphs.

### 2.3 Recognition of almost interval graphs

In lieu of general solutions, we may consider only those optimum solutions satisfying Thms. 2.2-2.4, which focus us on the quotient graph defined by maximal strong modules of . If is a clique, then we have either a -hole or a smaller instance (by removing all universal vertices). Otherwise is prime and we call Thm. 1.8 with it, which has two possible outcomes; there are only a constant number of modifications applicable to a small caw, and thus we may assume that the outcome is an olive-ring decomposition . For the completion problem, as holes can be easily filled, we can always assume that the graph is chordal and is a caterpillar. With decomposition , whether the input instance satisfies the conditions of Thms. 2.2 and 2.3 can be easily checked. If some non-simplicial vertex in represents a non-clique module, then we have a -hole. Otherwise, we work on all maximal strong modules and find an optimum solution for each of them, for which it suffices to consider those represented by simplicial vertices in . Using the definition, it is easy to verify that the resulting graph has the same set of maximal strong modules as , and hence remains the prime quotient of it. With inductive reasoning, we may assume that every simplicial vertex in now represents an interval subgraph. In summary, the only condition of Cor. 2.1 that might remain unsatisfied is whether itself is an interval graph. Therefore, this section is devoted to the disposal of , which is prime and has a caterpillar/olive-ring decomposition .

Allow us to use some informality in explaining the intuition behind our algorithms for deletion problems. Recall that clique path decompositions are characteristic of interval graphs . With a bird’s-eye view, what we have is an olive ring, while what we want is a path; it may help to mention that the maximal cliques of the graph may change, and the bags of the latter are not necessarily subsets of those of the former. Toward this end, we need to cut the main cycle of the olive ring and strip off its leaves, and there are immediately two options based on which action is taken first. Interestingly, they correspond to the disposal of holes and caws, respectively. From we can observe that every hole of is global in the sense that it dominates all holes. In contrast, every caw is local, and with diameter at most four, so it sees only a part of the main cycle. The structural difference of holes and caws suggests that different techniques are required to handle them. As explained in Section 1.6, we strip the leaves off the olive ring first to make it a hole.

Let () be a large caw in , possibly (see the second row of Fig. 1). We consider its terminals as well as their neighbors, i.e., . It is observed that if all of them are retained and their adjacencies—except of and , which are adjacent in a but not a —are not changed, then in an interval model of the object interval graph, they must be arranged in the way depicted in Fig. 3. As indicated by the dashed extensions, the interval for (resp., ) might or might not extend to the left (resp., right) to intersect the interval for (resp., ). Our main observation is on the position of the interval for : it has to lie between and , which are nonadjacent—this explains why we single out net and (rising) sun from and respectively. Recall that is originally adjacent to no vertex in the - path . Therefore, we need to delete some vertex or edge to break , or add an edge to connect to some inner vertex of .

In the discussion above, what matters is only the terminals and their neighbors, while the particular - path becomes irrelevant. Indeed, any induced - path in can be used in place of to give a caw of the same type (though not necessarily the same size), which has the same set of terminals. A similar operation is thus needed for all of them, and the particular base is immaterial, inspiring us to consider the following two sets of vertices. Of a large caw (), the frame is denoted by (), and the set of inner vertices is composed of all vertices that can be used to make a caw with frame (); they are denoted by and respectively. Without a specific path in sight, it would be more convenient to use () to denote a frame.

We can find a frame that is minimal in a sense. Its definition, given in Section 6.1, is essentially the same as what is used in our previous work . The major concern here is how to find a minimal frame in linear time; considering that the graph might still contain holes and small caws, it is far more complicated than . This is achieved using the olive-ring decomposition (Section 6.1). The rest is then devoted to the disposal of (caws with) this minimal frame.

Consider first vertex deletions. We show that any optimum solution deletes either some vertex of or a minimum - separator in the subgraph induced by . The second case is our main concern, for which we manage to show that any minimum - separator will suffice; it can be found in linear time. This case can be informally explained as follows. All vertices in and must reside in a consecutive part of the main cycle of the olive-ring decomposition, and we need to find some “place” in between to accommodate . We show that it suffices to “cut any thinnest place” between and , and use this space for . Recalling that has at most seven vertices, we have then an -way branching for disposing of this frame.
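To make the notion of a minimum vertex separator concrete, here is a minimal Python sketch that computes one between two nonadjacent vertices by the textbook vertex-splitting reduction to unit-capacity maximum flow. This is an assumption-laden illustration: the function name `min_vertex_separator`, the adjacency-dictionary input, and the augmenting-path method are choices made for this example, and the sketch does not attempt to match the linear running time claimed in the text, which relies on the olive-ring decomposition.

```python
from collections import deque

def min_vertex_separator(adj, s, t):
    """Minimum set of vertices (excluding s and t) whose removal separates s from t.

    Each vertex v is split into an "in" copy (v, 0) and an "out" copy (v, 1)
    joined by an arc of capacity 1; arcs between vertices are uncuttable.
    """
    assert t not in adj[s], "s and t must be nonadjacent"
    INF = len(adj) + 1          # larger than any possible separator
    cap = {}

    def add(u, v, c):
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)  # residual (reverse) arc

    for v in adj:
        add((v, 0), (v, 1), INF if v in (s, t) else 1)
        for w in adj[v]:
            add((v, 1), (w, 0), INF)

    src, snk = (s, 0), (t, 1)

    def bfs():
        """Nodes reachable from src in the residual network, with parents."""
        parent = {src: None}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = bfs()
        if snk not in parent:
            break
        v = snk                  # push one unit of flow along the path found
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] += 1
            v = u

    # A vertex is cut when its "in" copy is reachable but its "out" copy is not.
    return {v for v in adj
            if v not in (s, t) and (v, 0) in parent and (v, 1) not in parent}

# Two routes from s merge at c before reaching t, so {c} is the thinnest place.
g = {'s': {'a', 'b'}, 'a': {'s', 'c'}, 'b': {'s', 'c'},
     'c': {'a', 'b', 't'}, 't': {'c'}}
print(min_vertex_separator(g, 's', 't'))
```

The small example mirrors the informal picture of cutting a thinnest place between two parts of the graph.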

The basic idea for edge deletions is similar to that for vertex deletions, i.e., we delete either one of a bounded number of edges or a minimum edge - separator, but we are now confronted with more complex situations. First, the assumption that no edge in is deleted does not suffice, so instead we find a shortest - path with all inner vertices from . If has a bounded length, then we branch on deleting every edge in it. Otherwise, we argue that either one of the first or last edges of is deleted, or it suffices to find a minimum set of edges whose deletion separates and in , which can also be viewed as the “thinnest place” (in another sense) between and . This gives an -way branching.

After all caws are destroyed as above, if is chordal, then the problems are solved; otherwise, it has a clique hole decomposition. Since every simplicial vertex of represents an interval subgraph, this decomposition can be extended to a clique hole decomposition for . As a result, all its holes can be broken in one fell swoop in linear time, which solves the problems.

###### Lemma 2.10.

Given a clique hole decomposition for graph , the problems interval vertex deletion and interval edge deletion can be solved in time and respectively.

The “thinnest places” are also crucial for completions, though the argument becomes even more delicate. Our focus is on a minimum interval supergraph  that contains no edge in ; in particular, and remain nonadjacent in . As mentioned, we attend to caws only when the graph is already chordal, which means that the clique decomposition is a caterpillar. Therefore, and can be used to decide a left-right relation for both the caterpillar decomposition of and an interval model of . After adding edges, an interval for a vertex that is to the right of in might intersect part or all intervals between and . We argue that such an interval either reaches , or is to the right of some position (informally speaking, the “rightmost thinnest place”). A symmetric argument works for a vertex to the left of . As a result, we have two points such that all structures between them are totally decided by and ; in particular, it suffices to put in any “thinnest” place in between. This gives a -way branching.

Putting together these steps, a high-level outline of our algorithms is given in Fig. 4. This concludes Thms. 1.1 and 1.2.

#### Organization.

The rest of the paper is organized as follows. Section 3 relates modules to optimum solutions of all the three problems, and proves Thms. 2.2-2.4 as well as Thms. 1.4-1.6. Section 4 gives the characterization of large caws and long holes in prime locally interval graphs, and proves Thms. 2.5 and 2.6. Section 5 presents the details of decomposing prime graphs and proves Thms. 2.7-2.9. Section 6 presents the details on the disposal of large caws. Section 7 uses all these results to complete the algorithms. Section 8 closes this paper by describing some follow-up work and discussing some possible improvements and new directions.

## 3 Modules

This section is devoted to the proof of Thms. 1.4-1.6 and Thms. 2.2-2.4. Each of these theorems comprises two assertions on modules of . The module preservation asserts that a (maximal strong) module (or its remnant after partial deletion) remains a module of the optimum solution, and local optimum asserts that the optimum solution restricted to is an optimum solution of itself, and can be replaced by any optimum solution of . Thms. 1.4 and 2.2 (vertex deletions) turn out to be quite straightforward. As a matter of fact, a weaker version of them has been proved and used in , and a similar argument, which is based on the characterization of forbidden induced subgraphs and the hereditary property, also works here. On the other hand, this approach does not seem to be adaptable to the edge modification problems.

Simple examples tell us that not all maximum spanning interval subgraphs or minimum interval supergraphs preserve all modules. For example, consider the graph in Fig. 3 with only solid edges, which is obtained from a as follows: the center is replaced by a clique of vertices, and the shallow terminal is replaced by two nonadjacent vertices and . A minimum completion to this graph must be adding for each of and an edge to connect it to some vertex at the bottom. These two vertices do not need to be the same, e.g., the dashed edges in Fig. 3; however, is not a module of the resulting minimum interval supergraph. On the one hand, the subgraph induced by the maximal strong module  is not connected; on the other hand, we may alternatively connect both and to the same vertex so that we obtain another minimum interval supergraph that preserves as a module. These two observations turn out to be general: 1) if a module of a graph is not a module of some minimum interval supergraph  of , then must be disconnected, and 2) we can always modify to another minimum interval supergraph  of such that is a module of .

Figure: The module is not preserved by a minimum interval supergraph (dashed edges are added; number in a circle means a clique of vertices).

To make it worse, a maximum spanning interval subgraph may have to break some maximal strong modules. The simplest example is a -hole graph, which has two nontrivial modules, but its maximum spanning interval subgraph must be a simple path, which is prime. Even the connectedness does not help here. For example, consider the graph in Fig. 4(a) (all edges, both solid and dashed), which is obtained by completely connecting two induced paths and . A maximum spanning interval subgraph of it has to be isomorphic to Fig. 4(a) after the dashed edges are deleted, which is again prime. Both examples contain some -hole, which urges us to study -hole-free graphs. We show that any -hole-free graph has a maximum spanning interval subgraph that preserves all its modules. It is worth stressing that not all maximum spanning interval subgraphs of a -hole-free graph preserve all its modules, e.g., the graph (with both solid and dashed edges) and its maximum spanning interval subgraph (after the dashed edges are deleted) in Fig. 4(b).

Figure, panel (a): The only two nontrivial modules $\{v_1,v_2,v_3,v_4\}$ and $\{u_1,u_2,u_3,u_4\}$, both connected, are not preserved by any maximum spanning interval subgraph.

The way we prove Thms. 2.3 and 2.4 is using interval models: we construct an interval graph satisfying the claimed conditions by explicitly giving an interval model for it. For this purpose we need more notation on interval models. In an interval model, each vertex corresponds to a closed interval , where and are the left and right endpoints of , respectively, and . An interval model is called normalized if no pair of distinct intervals in it shares an endpoint; every interval graph has a normalized interval model. All interval models in this section are normalized. For a subset of vertices, we define and . Observe that if induces a connected subgraph, then the interval is exactly the union of . Let be a set of points that are in an interval . By projecting from to another interval we mean the following operation:

$$\rho \to \frac{\beta' - \alpha'}{\beta - \alpha}\,(\rho - \alpha) + \alpha' \quad \text{for each } \rho \in P.$$

In other words, each point in is proportionally shifted to a point in . It is easy to verify that all new points are in and this operation retains relations between every pair of points. In particular, if we project the endpoints of all intervals for , the set of new intervals defines the same interval graph.
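The projection is an order-preserving affine map, which is why the intersection pattern of the intervals survives. The sketch below is a small check of this under stated assumptions: intervals are closed, the model is stored as a dictionary from vertex to an endpoint pair, and the names `project`, `project_model`, and `interval_graph` are hypothetical helpers introduced for this example. Exact rationals are used so that distinct endpoints stay distinct.

```python
from fractions import Fraction

def project(points, src, dst):
    """Map each point of `points` proportionally from interval src to interval dst."""
    alpha, beta = src
    alpha2, beta2 = dst
    scale = Fraction(beta2 - alpha2) / Fraction(beta - alpha)
    return [scale * (p - alpha) + alpha2 for p in points]

def project_model(model, src, dst):
    """Project both endpoints of every interval in the model."""
    return {v: tuple(project(iv, src, dst)) for v, iv in model.items()}

def interval_graph(model):
    """Edge set of the interval graph defined by closed intervals."""
    vs = sorted(model)
    edges = set()
    for i, u in enumerate(vs):
        for v in vs[i + 1:]:
            if model[u][0] <= model[v][1] and model[v][0] <= model[u][1]:
                edges.add((u, v))
    return edges

# A normalized model (no two endpoints coincide) inside [0, 8].
model = {'a': (0, 3), 'b': (2, 5), 'c': (4, 8), 'd': (6, 7)}
shrunk = project_model(model, (0, 8), (10, 11))
print(interval_graph(model) == interval_graph(shrunk))
```

Because the scale factor is positive, the relative order of any two endpoints is unchanged, so the projected model defines the same interval graph.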

The following simple observation will be crucial for our arguments. For any point , we can find a positive value such that the only possible endpoint of in is . Here the value of should be understood as a function—depending on the interval model as well as the point —instead of a constant.

### 3.1 Modules in maximum induced interval subgraphs

For any module and vertex set of , the set , if not empty, is a module of the subgraph . This property implies the preservation of modules in all maximum induced interval subgraphs. Therefore, for Thms. 1.4 and 2.2, it suffices to prove their second assertions, which follow from the following statement. Recall that if a -hole contains precisely two vertices from some module , then neither nor induces a clique.

###### Lemma 3.1.

Let be a maximum induced interval subgraph of a graph . Let be a module of such that at least one of and induces a clique. If , then replacing by any maximum induced interval subgraph of in gives a maximum induced interval subgraph of .

###### Proof.

Suppose, for contradiction, that the new graph is not an interval graph. From we can find a subgraph in , which must intersect both and . Since at least one of and induces a clique, contains exactly one vertex of ; let it be . By assumption, there exists a vertex (possibly ); let . Clearly, , but is isomorphic to , hence in , contradicting that is an interval graph. ∎

### 3.2 Modules in maximum spanning interval subgraphs

Before the proof of Thm. 2.3, we show a stronger result on clique modules.

###### Lemma 3.2.

A clique module of a graph is also a clique module of any maximum spanning interval subgraph  of .

###### Proof.

Let be the component of such that attains the maximum value among all components of . We modify a given normalized interval model for as follows. Let and . For each , we set to a distinct value in , and set to a distinct value in . For each , we set . Let be the interval graph defined by . By construction, a vertex is adjacent to in if and only if it is in . Since , we have . For each , it holds that

$$|N_{\underline{G}}(v) \setminus M| \le |N_{\underline{G}}(C) \setminus M| = |N_{\underline{G}}(C)| = |N_{\underline{G}'}(v) \setminus M|.$$

On the other hand, induces a clique in . They together imply , while the equality is only attainable when is a clique, hence , and is completely connected to . Therefore, , and this verifies the lemma. ∎

#### Theorem 2.3 (restated).

Let be a graph of which every -hole is contained in some maximal strong module. There exists a maximum spanning interval subgraph  of such that every maximal strong module of is a module of , and replacing by any maximum spanning interval subgraph of in gives a maximum spanning interval subgraph of .

###### Proof.

As a consequence of Lem. 3.2, it suffices to consider the case when is not a clique, and then must induce a clique of . Let be a normalized interval model for .

###### Claim 1.

For any component of , the set induces a clique of .

###### Proof.

Supposing the contrary, we construct an interval graph with as follows. Let be a pair of vertices in such that , i.e., in . Without loss of generality, assume that is to the left of , and let be an arbitrary point in between, i.e., . For every , we extend the interval to include : if is to the left of , we set to be a distinct point in (); if is to the right of , we set to be a distinct point in (). We use the graph defined by the set of new intervals as . To see , note that all intervals are extended only; to see , note that is an edge in but not in . Every edge in is always incident to , which is a subset of , and hence exists also in . Therefore, is an interval subgraph of with strictly more edges than , which is impossible. ∎

Let be any maximum spanning interval subgraph of . We modify first to make satisfy the claimed condition. Let be the component of such that attains the maximum value among all components of . We have seen that induces a clique in . The intersection of all intervals is thus nonempty; let it be . Since is connected, . The interval intersects , and we can choose a common point in them. We construct another graph by projecting an interval model for into . The graph represented by the set of new intervals will be the sought-after interval graph .

We now verify and . On the one hand, induces the same subgraph in and . On the other hand, by assumption, is an interval subgraph of and has no more edges than . Therefore, it suffices to consider edges between and . For , these edges are . Since is a subset of , it holds that . By selection of (i.e., has the largest size), .

We have now constructed a maximum spanning interval subgraph of where satisfies the claimed conditions. Only intervals for vertices in are changed, and thus this operation can be successively applied on the maximal strong modules of one by one. If a module already satisfies the conditions, then it remains true after modifying other modules. Therefore, repeating this process will derive a claimed maximum spanning interval subgraph of . ∎

As explained below, this settles Thm. 1.5 as well. Thm. 1.5 will not be directly used in this paper, and thus the reader may safely skip the following proof without losing track of the development of our algorithms.

###### Lemma 3.3.

Thms. 1.5 and 2.3 are equivalent.

###### Proof.

We show first that Thm. 1.5 implies Thm. 2.3. Let be a graph of which every -hole is contained in some maximal strong module. Let be the set of maximal strong modules of , and let be the quotient graph defined by them. Let be a maximum spanning interval subgraph of ; for each , we replace by , which is an interval subgraph; let denote the obtained graph. Clearly, is a subgraph of , and is a maximum spanning interval subgraph of . Moreover, every remains a module of and thus is a quotient graph of . By Lem. 3.2, is not a clique if and only if is not a clique. Thus, is -hole-free, and by Thm. 1.5, there is a maximum spanning interval subgraph  of such that for each . This implies that is a maximum spanning interval subgraph of , and the substitutability follows from Cor. 2.1. Moreover, since and are both maximum spanning interval subgraphs of , they have the same size, which implies that is a maximum spanning interval subgraph of as well. This verifies that satisfies the claimed conditions of Thm. 2.3, and concludes this direction.

We now verify the other direction. Let be a -hole-free graph. Note that every strong module different from is a subset of some maximal strong module , and a strong module in . We first use inductive reasoning to show that the assertions i) and ii) of Thm. 1.5 hold for every strong module of . The base case is trivial: the largest strong module is . The inductive steps follow from Thm. 2.3: since every strong module induces a -hole-free subgraph , its quotient graph trivially satisfies the condition of Thm. 2.3. This settles all strong modules, and then we consider modules that are not strong. Such a module is composed of more than one strong module, and they are either pairwise adjacent or pairwise nonadjacent. In the first case (noting that the graph contains no -hole), at most one of these strong modules is nontrivial. In the second case, they are different components of . Both cases are straightforward. ∎

### 3.3 Modules in minimum interval supergraphs

Before the proof of Thm. 2.4, we show a stronger result on connected modules, i.e., modules inducing connected subgraphs, of a graph with respect to its minimum interval supergraphs. For a subset of vertices, we denote by the set of common neighbors of , i.e., . Note that , and the equality is attained if and only if it is a module of .

###### Theorem 3.4.

Let be a minimum interval supergraph of a graph . Every connected module of is a module of , and if is not a clique, then replacing by any minimum interval supergraph of in gives a minimum interval supergraph of .

###### Proof.

The statement holds vacuously if consists of a single vertex or a component; hence we may assume and . Let be a normalized interval model for . We define