Linear Recognition of Almost Interval Graphs^{1}
Abstract
Let interval$+kv$, interval$+ke$, and interval$-ke$ denote the classes of graphs that can be obtained from some interval graph by adding $k$ vertices, adding $k$ edges, and deleting $k$ edges, respectively. When $k$ is small, these graph classes are called almost interval graphs. They are well motivated by computational biology, where the data ought to be represented by an interval graph while we can only expect an almost interval graph at best. For any fixed $k$, we give linear-time algorithms for recognizing all these classes, and in the case of membership, our algorithms also provide a specific interval graph as evidence. When $k$ is part of the input, these problems are also known as graph modification problems, all NP-complete. Our results imply that they are fixed-parameter tractable parameterized by $k$, thereby resolving the long-standing open problem on the parameterized complexity of recognizing interval$+ke$, first asked by Bodlaender et al. [Bioinformatics, 11:49–57, 1995]. Moreover, our algorithms for recognizing and run in times and , (where $n$ and $m$ stand for the numbers of vertices and edges respectively in the input graph,) significantly improving the time algorithm of Heggernes et al. [STOC 2007] and the time algorithm of Cao and Marx [SODA 2014] respectively.
1 Introduction
A graph is an interval graph if its vertices can be assigned to intervals on the real line such that there is an edge between two vertices if and only if their corresponding intervals intersect. This set of intervals is called an interval model for the graph. The study of interval graphs has been closely associated with (computational) biology [4, 91]. For example, in physical mapping of DNA, which asks for reconstructing the relative positions of clones along the target DNA based on their pairwise overlap information [65, 2], the input data can be easily represented by a graph, where each clone is a vertex, and two clones are adjacent if and only if they overlap [91, 97, 54], hence an interval graph. A wealth of literature has been devoted to algorithms on interval graphs, which include a series of linear-time recognition algorithms [13, 73, 61, 48, 50, 43, 24]. Ironically, however, these recognition algorithms are never used as they are intended to be. Biologists never need to roll up their sleeves and feed their data into any recognition algorithm before claiming the answer is “NO” with full confidence, i.e., their data would not give an interval graph though they ought to. The reason is that biological data, obtained mainly by experimental methods, are destined to be flawed.
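The definition above can be made concrete with a small sketch. It builds the graph defined by an interval model; the function name `interval_graph`, the use of closed intervals, and the clone labels are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations

def interval_graph(model):
    """Return the edge set of the graph defined by an interval model.

    `model` maps each vertex to a closed interval (left, right); two
    vertices are adjacent iff their intervals share at least one point.
    """
    edges = set()
    for u, v in combinations(sorted(model), 2):
        (lu, ru), (lv, rv) = model[u], model[v]
        if max(lu, lv) <= min(ru, rv):  # the two intervals intersect
            edges.add((u, v))
    return edges

# Hypothetical clones placed along a target DNA strand.
clones = {"a": (0, 3), "b": (2, 5), "c": (4, 7), "d": (6, 8)}
print(sorted(interval_graph(clones)))
```

With error-free overlap data the resulting graph is an interval graph by construction; a single spurious or missing overlap may already destroy that property.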
More often than not, biologists are also confident that their data, though not perfect, are of reasonably good quality: there are only a few errors hidden in the data [65]. This leads us naturally to consider graphs that are not interval graphs, but close to one in some sense. We say that a graph is an almost interval graph if it can be obtained from an interval graph by a small number of modifications; it may or may not be an interval graph itself. Different applications are afflicted with different types of errors, e.g., there might be outliers, false-positive overlaps, and/or false-negative overlaps. We can accordingly define different measures for closeness. For any given nonnegative integer $k$, we use interval$+kv$, interval$+ke$, and interval$-ke$ to denote the classes of graphs that can be obtained from some interval graph by adding at most $k$ vertices, adding at most $k$ edges, and deleting at most $k$ edges, respectively.
The first task is of course to efficiently decide whether a given graph is an almost interval graph or not, and more importantly, identify an object interval graph if one exists. Computationally, finding an object interval graph is equivalent to pinpointing the few but crucial errors in the data. For any fixed $k$, this can be trivially done in polynomial time: given a graph $G$ on $n$ vertices, we can in $n^{O(k)}$ time try every subset of $k$ vertices, edges, or missing edges of $G$. Such an algorithm is nevertheless inefficient even for very small $k$, as $n$ is usually large. The main results of this paper are linear-time recognition algorithms for all three classes of almost interval graphs.
Theorem 1.1.
Let $k$ be any fixed nonnegative integer. Given a graph $G$ on $n$ vertices and $m$ edges, the membership of $G$ in each of interval$+kv$, interval$+ke$, and interval$-ke$ can be decided in $O(n+m)$ time. Moreover, in the affirmative case, an object interval graph can be produced in the same time.
Thm. 1.1 extends the line of linear-time algorithms for recognizing interval graphs. In the running times of all the three algorithms, needless to say, the constants hidden by big-Oh rely on $k$. Since all the problems are NP-hard when $k$, instead of being constant, is part of the input [67, 55, 38], the dependence on $k$ is necessarily super-polynomial (assuming P $\neq$ NP). Now that the linear dependence on the graph size is already optimum, we would like to minimize the factor of $k$. We are thus brought into the framework of parameterized computation. Recall that a problem, associated with some parameter, is fixed-parameter tractable (FPT) if it admits a polynomial-time algorithm where the exponent on the input size ($n+m$ in this paper) is a global constant independent of the parameter [30]. From the lens of parameterized computation, the recognition of almost interval graphs is conventionally defined as graph modification problems, where the parameter is $k$, and the task is to transform a graph to an interval graph by at most $k$ modifications [17]. For the classes interval$+kv$, interval$+ke$, and interval$-ke$, the modifications are vertex deletions, edge deletions, and completions (i.e., edge additions) respectively, which are the most commonly considered on hereditary graph classes. The parameterized problems are accordingly named interval vertex deletion, interval edge deletion, and interval completion. Our results can then be more specifically stated as:
Theorem 1.2.
Given a graph $G$ on $n$ vertices and $m$ edges and a nonnegative parameter $k$, the problems interval vertex deletion, interval edge deletion, and interval completion can be solved in time , , and , respectively.
In particular, we show that interval edge deletion is FPT, thereby resolving a long-standing open problem first asked by Bodlaender et al. [10]. Further, our algorithms for interval completion and interval vertex deletion significantly improve the time algorithm of Heggernes et al. [90] and the time algorithm of Cao and Marx [21], respectively. We remark that one can also derive an time approximation algorithm of ratio 8 for the minimum interval vertex deletion problem.
We feel obliged to point out that computational biologists cannot claim all credit for the discovery and further study of interval graphs. Independent of [4], Hajós [46] formulated the class of interval graphs out of nothing but coffee. Since their inception in the 1950s, their natural structure has earned them a position in many other applications, among which the most cited ones include job scheduling in industrial engineering [3], temporal reasoning [42], and seriation in archeology [57]. All these applications involve some temporal structure, which is understandable: before the final invention of time-traveling vehicles, a graph representing relationships among temporal activities has to be an interval graph. With errors involved, almost interval graphs arise naturally.
1.1 Notation
All graphs discussed in this paper are undirected and simple. The order and size of a graph are defined to be the cardinalities of its vertex set and its edge set respectively. We assume without loss of generality that $G$ is connected and nontrivial (containing at least two vertices); thus $n = O(m)$. We sometimes use the customary notation to mean , and to mean . The degree of a vertex is denoted by . A vertex $v$ is simplicial if $N[v]$ induces a clique; let denote the set of simplicial vertices of $G$. The length of a path or a cycle is defined to be the number of edges in it. Standard graph-theoretical and algorithmic terminology can be found in [27, 39].
A cycle induced by $\ell$ vertices, where $\ell \geq 4$, is called an $\ell$-hole, or simply a hole if $\ell$ is irrelevant. In other words, a hole is an induced cycle that is not a triangle. A graph is chordal if it contains no holes. Lekkerkerker and Boland [66] showed that a graph is an interval graph if and only if it is chordal and does not contain a structure called asteroidal triple (at for short), i.e., three vertices such that each pair of them is connected by a path avoiding neighbors of the third one. They went further to list all minimal chordal graphs that contain an at. These graphs, reproduced in Fig. 1, are called chordal asteroidal witnesses (caws for short).
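The characterization of Lekkerkerker and Boland can be turned into a direct, brute-force test, sketched below. This is only an illustration of the characterization: it runs in polynomial but far from linear time and is not one of the recognition algorithms cited in this paper. The graph representation (a dictionary from each vertex to its set of neighbors) and all function names are assumptions made for the example.

```python
from itertools import combinations

def graph(edges):
    """Build an adjacency dictionary (vertex -> set of neighbors)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def is_chordal(adj):
    """Chordal iff we can repeatedly delete a simplicial vertex until none is left."""
    adj = {v: set(ns) for v, ns in adj.items()}
    while adj:
        simp = next((v for v in adj
                     if all(y in adj[x] for x, y in combinations(adj[v], 2))), None)
        if simp is None:
            return False
        for u in adj.pop(simp):
            adj[u].discard(simp)
    return True

def component_labels(adj, removed):
    """Label every vertex outside `removed` with a representative of its component."""
    label = {}
    for s in adj:
        if s in removed or s in label:
            continue
        label[s] = s
        stack = [s]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in removed and w not in label:
                    label[w] = s
                    stack.append(w)
    return label

def has_asteroidal_triple(adj):
    # comp[z] describes the components left after deleting z and its neighbors.
    comp = {z: component_labels(adj, adj[z] | {z}) for z in adj}

    def linked(x, y, z):
        c = comp[z]
        return x in c and y in c and c[x] == c[y]

    return any(linked(x, y, z) and linked(x, z, y) and linked(y, z, x)
               for x, y, z in combinations(adj, 3))

def is_interval(adj):
    return is_chordal(adj) and not has_asteroidal_triple(adj)

path = graph([("a", "b"), ("b", "c"), ("c", "d")])
hole = graph([("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])
long_claw = graph([("c", "a1"), ("a1", "a2"), ("c", "b1"),
                   ("b1", "b2"), ("c", "c1"), ("c1", "c2")])
print(is_interval(path), is_interval(hole), is_interval(long_claw))
```

The path is an interval graph, the 4-hole fails the chordality test, and the long claw is chordal but contains an at formed by its three leaves.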
Let $\mathcal{F}_I$ denote the set of minimal forbidden induced subgraphs of interval graphs, i.e., all holes and caws. Let $\mathcal{F}_{LI}$ be the set {net, sun, rising sun, long claw, whipping top, $4$-hole, $5$-hole} (see the first row of Fig. 1). An important ingredient of our algorithms is a comprehensive study of the following graph class. Clearly, $\mathcal{F}_{LI} \subset \mathcal{F}_I$, and thus all interval graphs satisfy this definition.
Definition 1.
Locally interval graphs are defined by forbidding all subgraphs in $\mathcal{F}_{LI}$.
An induced interval subgraph of is an interval subgraph induced by a set of vertices. An interval graph (resp., ) is called a spanning interval subgraph (resp., an interval supergraph) of if it has the same vertex set as and (resp., ). An induced interval subgraph (resp., a spanning interval subgraph or an interval supergraph ) of is maximum (resp., maximum or minimum) if (resp., or ) is maximum (resp., maximum or minimum) among all induced interval subgraphs (resp., spanning interval subgraphs or interval supergraphs) of ; in other words, the number of modifications (resp., or ) is minimum.
A subset $M$ of vertices forms a module of $G$ if all vertices in $M$ have the same neighborhood outside of $M$. In other words, for any pair of vertices $u, v \in M$, a vertex $x \notin M$ is adjacent to $u$ if and only if it is adjacent to $v$ as well. The set $V(G)$ and all singleton vertex sets are modules, called trivial. A graph on fewer than three vertices has only trivial modules, while a graph on three vertices always has a nontrivial module. A graph on at least four vertices is prime if it contains only trivial modules, e.g., all holes of length at least five and all caws are prime. Two disjoint modules are either nonadjacent or completely adjacent. Given any partition $\{M_1, \dots, M_p\}$ of $V(G)$ such that $M_i$ for every $1 \leq i \leq p$ is a module of $G$, we can associate a quotient graph $Q$, where each vertex represents a module of $G$, and for any pair of distinct $i, j$ with $1 \leq i, j \leq p$, the $i$th and $j$th vertices of $Q$ are adjacent if and only if $M_i$ and $M_j$ are adjacent in $G$. From $Q$ and $G[M_i]$ for all $1 \leq i \leq p$ (their total sizes are bounded by ), the original graph $G$ can be easily and efficiently retrieved.
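As a minimal sketch of these definitions, the following code tests whether a vertex set is a module and builds the quotient graph of a module partition. The adjacency-dictionary representation and the names `is_module` and `quotient` are assumptions made for illustration; the check is a direct transcription of the definition and makes no claim to efficiency.

```python
def is_module(adj, M):
    """True iff all vertices of M have the same neighbors outside M."""
    M = set(M)
    outside = [adj[v] - M for v in M]
    return all(o == outside[0] for o in outside)

def quotient(adj, partition):
    """Quotient graph of a module partition, on the indices of the parts.

    Assumes every part is a module, so two parts are either completely
    adjacent or nonadjacent, and a single edge decides which.
    """
    index = {v: i for i, M in enumerate(partition) for v in M}
    q = {i: set() for i in range(len(partition))}
    for u in adj:
        for v in adj[u]:
            if index[u] != index[v]:
                q[index[u]].add(index[v])
    return q

# A 4-hole a-b-c-d-a: opposite vertices form nontrivial modules.
hole4 = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
print(is_module(hole4, {"a", "c"}), is_module(hole4, {"a", "b"}))
print(quotient(hole4, [{"a", "c"}, {"b", "d"}]))
```

Here the quotient graph is a single edge, which matches the remark that two disjoint modules are either nonadjacent or completely adjacent.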
1.2 Our major results
We state here the major results of this paper (besides Thms. 1.1 and 1.2) that are of independent interest. Our first result is a straightforward observation on modules of locally interval graphs and interval graphs.
Proposition 1.3.
Let be the class of interval graphs or the class of locally interval graphs. A graph is in if and only if a quotient graph of is in and

every nonsimplicial vertex of represents a clique module; and

in any pair of adjacent vertices of , at least one represents a clique module.
Our second major result comprises a set of theorems. They characterize the minimum modifications with respect to modules of the input graph. Note that after replacing a module by another subgraph, we add edges between every vertex in the new subgraph to .
Theorem 1.4.
Let be a maximum induced interval subgraph of graph . For any module of intersecting , the set is a module of , and if is holefree, then replacing by any maximum induced interval subgraph of in gives a maximum induced interval subgraph of .
Theorem 1.5.
Let be a holefree graph. There is a maximum spanning interval subgraph of such that the following hold for every module of : i) is a module of ; and ii) replacing by any maximum spanning interval subgraph of in gives a maximum spanning interval subgraph of .
Theorem 1.6.
For any graph , there is a minimum interval supergraph of such that the following hold for every module of : i) is a module of ; and ii) if is not a clique, then replacing by any minimum interval supergraph of in gives a minimum interval supergraph of .
These results hold regardless of $k$, and thus can be used for any algorithmic approach, e.g., Thm. 1.6 has already been used in [8]. We remark that there has been a long relationship between modules and interval graphs. Indeed, the algorithm of [50], based on a characterization of prime interval graphs by Hsu [49], is arguably the simplest among all known recognition algorithms for interval graphs.
Let $\mathcal{K}$ be a connected graph whose vertices, called bags, are the set of all maximal cliques of $G$. We say that $\mathcal{K}$ is a clique decomposition of $G$ if for any $v \in V(G)$, the set of bags containing $v$ induces a connected subgraph of $\mathcal{K}$. A caterpillar is a tree that consists of a main path together with leaves attached to it. An olive ring is a unicyclic graph that consists of a hole (called the main cycle) and all other vertices are pendant (having degree $1$) and connected to this hole. The deletion of any edge from the main cycle of an olive ring results in a caterpillar. Our third result is on the clique decomposition of prime locally interval graphs.
Theorem 1.7.
A prime locally interval graph has a clique decomposition that is either a caterpillar when it is chordal; or an olive ring otherwise. This decomposition can be constructed in time.
Indeed, given a prime graph $G$ that does not have such a decomposition, our algorithm is able to identify a subgraph of $G$ in $\mathcal{F}_{LI}$. The following statement is stronger than Thm. 1.7 and implies it.
Theorem 1.8.
Given a prime graph $G$, we can in time either build an olive-ring/caterpillar decomposition for $G$ or find a subgraph of $G$ in $\mathcal{F}_{LI}$.
In addition to the concrete results listed above, our algorithms also suggest a meta approach for designing fixed-parameter algorithms for vertex deletion problems (where modules are trivially preserved): if the object graph class can be characterized by a set of forbidden induced subgraphs of which only a finite number are not prime, then we may break them first and then use divide-and-conquer, i.e., solve the quotient graph and subgraphs induced by modules individually.
1.3 Motivation and background
The aforementioned physical mapping of DNA is a central problem in computational biology [65, 2]. In a utopia where experimental data were perfect, they should define an interval graph. Then the problem is equivalent to constructing an interval model for the graph, which can be done in linear time. In the real world in which we live, however, data are always inconsistent and contaminated by a few but crucial errors, which have to be detected and fixed. In particular, on the detection of false-positive errors that correspond to fake edges, Goldberg et al. [38] formulated the minimum interval edge deletion problem and showed its NP-hardness. Likewise, the deletion of vertices can be used to formulate the detection of outliers (i.e., elements participating in many false overlaps, both positive and negative), and the minimum interval vertex deletion problem has long been known to be NP-hard [64, 67].
Solving the minimum interval vertex deletion problem and the minimum interval edge deletion problem is equivalent to finding the maximum induced interval subgraph [31, 9] and the maximum spanning interval subgraph [80] respectively. In light of the importance of interval graphs, it is not surprising that some natural combinatorial problems can be formulated as, or computationally reduced to the interval deletion problems. For instance, Narayanaswamy and Subashini [76] recently solved the maximum consecutive ones submatrix problem and the minimum convex bipartite deletion problem by a reduction to minimum interval vertex deletion. Oum et al. [79] showed that an induced interval subgraph can be used to find a special branch decomposition, which can be in turn used to devise FPT algorithms for a large number of problems, namely, locally checkable vertex subset and vertex partitioning problems. They both used our previous algorithm [21] as a subroutine, and thus will benefit from an improved algorithm directly.
The minimum interval completion problem is also a classic NP-hard problem [55, 96]. Besides computational biology, its most important application should be sparse matrix computations [86]. The profile method is an extension of the bandwidth method [83, 81], and their purpose is to minimize the storage used during Gaussian elimination for a symmetric sparse matrix. Both methods attempt to reorder the rows and columns of the input matrix such that all eliminations are limited within a band or an envelope around the main diagonal, while all entries outside are always zeroes during the whole computation. Therefore, we only need to store the elements in the band or envelope, whose sizes are accordingly called the bandwidth and profile [37]. Rose [83] correlated bandwidth with graphs. Tarjan [86] showed that a symmetric matrix has a reordering such that its profile coincides with nonzero entries if and only if it defines an interval graph (there is an edge between vertices $i$ and $j$ if and only if the $(i, j)$ element is nonzero), and finding the minimum profile is equivalent to solving the minimum interval completion problem.
A very similar problem is the minimum pathwidth problem, which also asks for an interval supergraph of $G$ but the objective is to minimize the size of the maximum clique in the supergraph. This problem was also known to be NP-hard [56]. In light of the hardness of both problems, people turned to finding minimal interval completions, which can be viewed as a relaxation of both of them. Ohtsuki et al. [78] designed an algorithm that finds a minimal interval completion in time. Very recently, Crespelle and Todinca [25] proposed an improved algorithm that runs in time. This is the best known, and it remains open to develop a linear-time algorithm for finding a minimal interval completion. See also Heggernes et al. [47] for a characterization of minimal interval completions.
Möhring [75] showed that if a graph is free of ats, then any minimal chordal supergraph of it is an interval graph. The converse was later shown to be true as well [23]. Since the minimum chordal completion problem (also known as minimum fill-in) is known to be NP-hard on at-free graphs [1], the minimum interval completion problem remains NP-hard on at-free graphs. Other graph classes on which the minimum interval completion problem remains NP-hard include chordal graphs [82], permutation graphs [11], and cocomparability graphs [44]. On the positive side, see [59] for some polynomially solvable special cases.
1.4 Graph modification problems and their fixed-parameter tractability
Many classical graph-theoretic problems can be formulated as graph modification problems to specific graph classes. For example, Garey and Johnson [35, section A1.2] listed 18 NP-complete graph modification problems (two of which are indeed large collections of problems; see also [67, 95]). Graph modification problems are also among the earliest problems whose parameterized complexity was considered, e.g., Kaplan et al. [53] and Cai [17] devised FPT algorithms for completion problems to chordal graphs and related graphs. Indeed, since the graph modification problems are a natural computational method for detecting few errors in experimental data, they were an important motivation behind parameterized computation. In the special case when the desired graph class can be characterized by a finite number of forbidden (induced) subgraphs, their fixed-parameter tractability follows from a basic bounded search tree algorithm [17]. However, many important graph classes, e.g., forests, bipartite graphs, and chordal graphs, have minimal obstructions of arbitrarily large size (cycles, odd cycles, and holes, respectively). It is much more challenging to obtain fixed-parameter tractability results for such classes.
Besides holes, $\mathcal{F}_I$ has another infinite set of obstructions (caws), which is far less understood [23, 63]. Since adding or deleting a single edge is sufficient to fix an arbitrarily large caw, the modification problems to interval graphs are more complicated than those to chordal graphs. Their fixed-parameter tractability was frequently posed as an important open problem [53, 30, 10]. Only after about two decades were interval completion and interval vertex deletion shown to be FPT [90, 21]. Both algorithms use a two-phase approach, where the first phase breaks all (problem-specifically) small forbidden induced subgraphs and the second one takes care of the remaining ones with the help of combinatorial properties that hold only in graphs without those small subgraphs. Nevertheless, neither approach of [90, 21] generalizes in a natural way to interval edge deletion, whose parameterized complexity remained open to date. Moreover, both algorithms of [90, 21] suffer from high time complexity.
In passing let us point out that the vertex deletion version can be considered as the most robust variant, as it encompasses both edge modifications in the following sense: if a graph can be made an interval graph by $k_1$ edge deletions and $k_2$ edge additions, then it can also be made an interval graph by at most $k_1 + k_2$ vertex deletions (e.g., one vertex from each added/deleted edge). In other words, the graph class interval$+kv$ contains both classes interval$+ke$ and interval$-ke$. A similar fact holds for all hereditary graph classes. On the other hand, interval$+ke$ and interval$-ke$ are incomparable in general, e.g., a hole is in interval and a is in interval but not the other way.
1.5 Efficient detection of (small) forbidden induced subgraphs
As said, if the object graph class has only a finite number of forbidden induced subgraphs, then the modification problem is trivially FPT. This observation can be extended to a family of forbidden induced subgraphs that, though infinite, can be detected in polynomial time and destroyed in a bounded number of ways; the most remarkable example is chordal completion [53, 17]. For the purpose of contrast, let us call this the one-phase approach. In carrying out the aforementioned two-phase approach, one usually focuses on the second phase, on the ground that the first phase seems to be the same as the one-phase approach. This ground is, nevertheless, shaky: more often than not, algorithms based on the one-phase approach run in linear time, but all previous algorithms [90, 89, 21] based on the two-phase approach have high polynomial factors in their running times, which are mainly determined by the time required to detect small forbidden induced subgraphs in the first phase. As we will see, the detection of a small forbidden induced subgraph is usually far more demanding than that of an arbitrary one.
Kratsch et al. [62] presented a linear-time algorithm for detecting a hole or an at from a non-interval graph. It first calls the hole-detection algorithm of Tarjan and Yannakakis [87], which either returns a hole, or reduces to finding an at in a chordal graph. The additional chordal condition for the detection of an at is crucial: we do not know how to find an at in a general graph in linear time. The best known recognition algorithm for at-free graphs takes time [60], and Kratsch and Spinrad [63] showed that this algorithm can be used to find an at in the same time if the graph contains one. A more important result of [63] is that recognizing at-free graphs is at least as difficult as finding a triangle. The detection of an at cannot be easier than the recognition of at-free graphs, and hence a linear-time algorithm for it is very unlikely to exist. (See also [85].)
Obviously, for any hereditary graph class, the detection of a forbidden induced subgraph is never easier than the recognition of this graph class. On the other hand, we have seen that the detection of a hole, an at with witness, and a subgraph in $\mathcal{F}_I$ can be done in the same asymptotic time as the recognition of chordal graphs, at-free graphs, and interval graphs, respectively. From these examples one may surmise that the requirement of explicit evidence does not seem to pose an extra burden to the recognition algorithms. This is known to be true for almost all polynomial-time recognizable graph classes with known characterization by forbidden induced subgraphs.
However, it changes drastically when the evidence is further required to have a small or minimum number of vertices. The most famous example should be the detection of cycles: while an arbitrary cycle can be trivially found in linear time, the detection of a shortest cycle, which includes triangle detection as a special case, is very unlikely to be done in linear time. Even finding a short cycle in linear time seems to be out of the question (see, e.g., [51]). Assuming that triangles cannot be detected in linear time, we can also rule out the possibility of linear-time detection of a minimum subgraph in $\mathcal{F}_I$ or a shortest hole. Let $G'$ be the graph obtained by subdividing a graph $G$ (i.e., for each edge $uv$, adding a new vertex $x$, connecting it to both $u$ and $v$, and deleting $uv$); then $G$ contains a triangle if and only if the minimum subgraph of $G'$ in $\mathcal{F}_I$ is a $6$-hole. Since $G'$ has $n + m$ vertices and $2m$ edges, a linear-time algorithm for finding a minimum subgraph in $\mathcal{F}_I$ can be used to detect a triangle in linear time. With a similar reduction, we can show that a linear-time algorithm for detecting subgraphs in $\mathcal{F}_{LI}$ (recall that they are small graphs in $\mathcal{F}_I$) is unlikely to exist, as it can be used to detect a claw in linear time, and further to detect a triangle in time, which would have groundbreaking consequences (see [85, Open problem 8.3, page 103]). A similar phenomenon has been observed in detecting minimum Tucker submatrices, i.e., minimal matrices that do not have the consecutive-ones property [7], and shortest even holes [22].
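The subdivision argument can be checked on small inputs with the sketch below: it subdivides every edge once and compares shortest cycle lengths. The helper names `subdivide` and `girth` and the adjacency-dictionary representation are assumptions for illustration, and the girth routine is a plain breadth-first search from every vertex, which is not linear-time.

```python
from collections import deque

def subdivide(adj):
    """Replace every edge uv by a path u - x - v through a new vertex x."""
    sub = {v: set() for v in adj}
    edges = {tuple(sorted((u, v))) for u in adj for v in adj[u]}
    for u, v in edges:
        x = ("sub", u, v)          # new vertex standing for the edge uv
        sub[x] = {u, v}
        sub[u].add(x)
        sub[v].add(x)
    return sub

def girth(adj):
    """Length of a shortest cycle (infinity for a forest), by BFS from each vertex."""
    best = float("inf")
    for s in adj:
        dist, parent = {s: 0}, {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w], parent[w] = dist[u] + 1, u
                    queue.append(w)
                elif parent[u] != w:   # non-tree edge closes a cycle
                    best = min(best, dist[u] + dist[w] + 1)
    return best

triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
square = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
print(girth(subdivide(triangle)), girth(subdivide(square)))
```

Subdividing doubles the length of every cycle, and a shortest cycle is always induced, so the subdivided graph has a hole of length six exactly when the original graph has a triangle.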
Another crucial step of our algorithm is to find all simplicial vertices of a graph. Again, it is unlikely to be done in linear time: Kratsch and Spinrad [63] showed that counting the number of simplicial vertices is already at least as hard as detecting a triangle. Indeed, there is not even a known algorithm that can detect a single simplicial vertex in linear time. The only known ways of finding a simplicial vertex are enumerating all vertices and using fast matrix multiplication. Kloks et al. [58] showed that in the same time one can actually list all simplicial vertices. This is the best known in general graphs. See also [94, open problems 4.3 and 4.4].
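The enumeration approach mentioned above amounts to testing, for every vertex, whether its neighborhood is a clique. A minimal sketch follows; the function name and the adjacency-dictionary representation are assumptions for illustration, and the running time is clearly not linear.

```python
from itertools import combinations

def simplicial_vertices(adj):
    """Return the vertices whose neighborhoods induce cliques."""
    return {v for v in adj
            if all(y in adj[x] for x, y in combinations(adj[v], 2))}

# A path a - b - c: the two ends are simplicial, the middle vertex is not.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(sorted(simplicial_vertices(path)))
```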
1.6 Main challenges and our techniques
We describe here the main challenges and intuitions behind the techniques that we use to address them. They can be roughly put into two categories: for the linear dependence on the graph size and for the smaller exponential dependence on the parameter. Also sketched here is why known techniques from previous work will not suffice. We basically take the two-phase approach, subgraphs in $\mathcal{F}_{LI}$ first and then the rest (large ones). We say that caws and holes in $\mathcal{F}_{LI}$ are small and short respectively; other caws, namely, s and s, are large, and holes of length six or more are long. It is worth noting that the thresholds are chosen by structural properties instead of sizes.
Linear dependence on the graph size.
The biggest challenge is surely the efficient detection of a subgraph in $\mathcal{F}_{LI}$, or more specifically, the detection of a short hole or small caw. As explained above, we do not expect a linear-time algorithm for this task. Instead, we relax it to the following: either find a subgraph in $\mathcal{F}_{LI}$ or build a structural decomposition (Thm. 1.8) that is sufficient for the second phase. For the disposal of large forbidden induced subgraphs in the second phase, the algorithm of [21] breaks long holes first, and then large caws in a chordal graph. There is no clear way to implement this tactic in linear time: the disposal of holes introduces a factor , while finding a caw gives another factor . Neither of them seems to be improvable to . We are thus forced to consider an alternative approach, i.e., we may have to deal with large caws in a non-chordal graph. Hence completely new techniques are required. Overcoming these two difficulties enables us to deliver linear-time algorithms.
Exponential dependence on the parameter.
To claim the fixed-parameter tractability of interval edge deletion and better dependence on $k$ for interval completion, we still have some major concerns to address. Since fixing holes by edge additions is well understood, the algorithm of Heggernes et al. [90] for interval completion assumes the input graph to be chordal, and focuses on the disposal of caws. However, holes pose a nontrivial challenge to us in the deletion problems, and thus the techniques of [90] do not apply. On the other hand, the algorithm in [21] heavily relies on the fact that the deletion of vertices leaves an induced subgraph. Essentially, it looks for a minimum set of vertices intersecting all subgraphs in $\mathcal{F}_I$, a so-called hitting set. Deleting any vertex from a subgraph in $\mathcal{F}_I$ breaks this subgraph once and for all, but adding/deleting an edge to break an erstwhile subgraph in $\mathcal{F}_I$ might introduce new one(s). As a result, the “hitting set” observation does not apply to edge modification problems.

The first difficulty that presented itself at this point is on the preservation of modules, which is trivial for vertex deletions, but not true for edge modifications in general. Simple examples tell us that not all maximum spanning interval subgraphs and minimum interval supergraphs preserve all modules. What we do here is to identify appropriate technical conditions, under which there exists some maximum spanning interval subgraph or minimum interval supergraph that preserves all modules, and make them satisfied at the onset of the second phase.

The other difficulty is why it suffices to consider a bounded number of modifications to fix a special caw, for which we need to argue that most possible modifications are local to it and can be decided locally. In [21], we studied how a caw interacts with others in a chordal graph with no small caws; similar arguments are obviously inapplicable to the edge variations. Even for vertex deletions, as we had to make a compromise and work on non-chordal graphs, we need a new argument that does not assume chordality.
2 Outline
The purpose of this section is to describe the main steps of our algorithm at a high level. A quotient graph $Q$ is isomorphic to an induced subgraph of $G$, e.g., we can pick an arbitrary vertex from each module of the module partition and take the induced subgraph. Therefore, whenever a forbidden induced subgraph of $Q$ is detected, it can be translated into a forbidden induced subgraph of $G$ directly.
2.1 Maximal strong modules
Behind Prop. 1.3 and Thms. 1.4–1.6 is a very simple observation: $4$-holes are the only non-prime graphs in $\mathcal{F}_I$ and $\mathcal{F}_{LI}$. Note that for any induced subgraph $F$ intersecting a module $M$, their intersection is a (possibly trivial) module of $F$. Therefore, if $F$ is prime and $V(F) \not\subseteq M$, then it intersects $M$ in at most one vertex. Fix any module partition and accordingly a quotient graph $Q$. If $F$ is in $\mathcal{F}_I$ or $\mathcal{F}_{LI}$ but not a $4$-hole, then $F$ either contains at most one vertex from each module, and is thus isomorphic to an induced subgraph of $Q$, or is fully contained in some module from the given partition. On the other hand, a $4$-hole may contain precisely two vertices of a module $M$, and then the other two vertices must be neighbors of this module. We have two cases: the other two vertices belong to the same module that is adjacent to $M$, or they belong to two different (nonadjacent) modules. In other words, either two non-clique modules are adjacent, or a non-clique module is not simplicial in $Q$. This concludes Prop. 1.3.
However, Prop. 1.3 has no direct algorithmic use: a graph might have an exponential number of modules and quotient graphs. A module $M$ is strong if for every other module $M'$ that intersects $M$, one of $M$ and $M'$ is a proper subset of the other. All trivial modules are strong. We say that a strong module $M$, different from $V(G)$, is maximal if the only strong module properly containing $M$ is $V(G)$. Using the definition, it is easy to verify that maximal strong modules of $G$ are disjoint and every vertex of $G$ appears in one of them. Therefore, they partition $V(G)$, and define a special quotient graph $Q$. If $G$ is not connected, then each maximal strong module is a component of it, and $Q$ has no edge. Recall that the complement graph of $G$ is defined on the same vertex set $V(G)$, where a pair of vertices $u$ and $v$ is adjacent if and only if they are nonadjacent in $G$. Thus, the complement of $G$ has the same set of modules as $G$; in particular, if it is not connected, then its components are the maximal strong modules of $G$, and hence $Q$ is complete. If both the graph $G$ and its complement are connected, then $Q$ must be prime [34]. Note that this is the only case that a quotient graph can be prime; in other words, a prime quotient graph must be defined by maximal strong modules.
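The two easy cases of this trichotomy can be sketched as follows. The code returns the maximal strong modules when the graph or its complement is disconnected, and returns `None` in the remaining case, where the quotient graph is prime and a full modular decomposition algorithm such as the one cited as [45] would be needed. All names and the adjacency-dictionary representation are assumptions for illustration, and forming the complement explicitly takes quadratic time.

```python
def components(adj):
    """Vertex sets of the connected components of a graph."""
    seen, parts = set(), []
    for s in adj:
        if s in seen:
            continue
        part, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in part:
                    part.add(w)
                    stack.append(w)
        seen |= part
        parts.append(part)
    return parts

def complement(adj):
    """Complement graph on the same vertex set."""
    vertices = set(adj)
    return {v: vertices - adj[v] - {v} for v in adj}

def maximal_strong_modules(adj):
    parts = components(adj)
    if len(parts) > 1:
        return parts      # disconnected graph: edgeless quotient graph
    parts = components(complement(adj))
    if len(parts) > 1:
        return parts      # disconnected complement: complete quotient graph
    return None           # both connected: prime quotient, not handled here

hole4 = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
path4 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(maximal_strong_modules(hole4), maximal_strong_modules(path4))
```

For the 4-hole the complement splits into the two pairs of opposite vertices, while the path on four vertices and its complement are both connected.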
Hereafter, the quotient graph is always decided by maximal strong modules of ; when itself is prime, they are isomorphic. There are at most maximal strong modules, which can be found in linear time [45]. Therefore, the following corollary of Prop. 1.3 will be more useful for algorithmic purposes. Recall that a vertex is universal in if . It is easy to verify that a prime graph is necessarily connected, and its simplicial vertices are pairwise nonadjacent.
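As a small illustration of the two vertex types used in the corollary below, the following sketch tests whether a vertex is universal (its closed neighborhood is the whole vertex set) or simplicial (its neighbors are pairwise adjacent). The dictionary-of-sets graph representation and the function names are assumptions of this sketch.

```python
def is_universal(adj, v):
    """v is universal if its closed neighborhood is the whole vertex set."""
    return adj[v] | {v} == set(adj)

def is_simplicial(adj, v):
    """v is simplicial if its neighbors are pairwise adjacent."""
    nbrs = sorted(adj[v])
    return all(b in adj[a] for i, a in enumerate(nbrs) for b in nbrs[i + 1:])

# A triangle {0, 1, 2} with a pendant vertex 3 attached to vertex 0.
g = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
# An induced path on four vertices, which is a prime graph.
p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}

universal_in_g = [v for v in g if is_universal(g, v)]
simplicial_in_p4 = [v for v in p4 if is_simplicial(p4, v)]
```

In the path example the simplicial vertices are the two ends, which are nonadjacent, in line with the remark on prime graphs.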
Corollary 2.1.
Let be the class of interval graphs or the class of locally interval graphs. A graph having no universal vertices is in if and only if

the quotient graph decided by maximal strong modules of is in but not a clique;

for every module represented by a simplicial vertex of ; and

is a clique for every module represented by a nonsimplicial vertex of .
Every parameterized modification problem has an equivalent optimization version, which asks for a minimum set of modifications; the resulting interval graph is called an optimum solution to this problem. Clearly, a graph is in the class interval, interval, or interval if and only if the minimum number of vertex deletions, edge deletions, or edge additions respectively that transform into an interval graph is no more than . Although the recognition/modification problems we are working on do not explicitly ask for an optimum solution, an optimum one will serve our purpose. We have stated in Thms. 1.4-1.6 that there are always optimum solutions well aligned with modules of the input graph. Again, for algorithmic purposes, the following variations formulated on maximal strong modules are more convenient for our divide-and-conquer approach. As we will see shortly, they are indeed equivalent to Thms. 1.4-1.6 respectively.
Theorem 2.2.
Let be a graph of which every hole is contained in some maximal strong module, and let be a maximum induced interval subgraph of . For any maximal strong module of intersecting , the set is a module of , and replacing by any maximum induced interval subgraph of in gives a maximum induced interval subgraph of .
Theorem 2.3.
Let be a graph of which every hole is contained in some maximal strong module. There exists a maximum spanning interval subgraph of such that every maximal strong module of is a module of , and replacing by any maximum spanning interval subgraph of in gives a maximum spanning interval subgraph of .
We may assume without loss of generality that the input graph contains no universal vertices. According to Cor. 2.1, the condition of Thms. 2.2 and 2.3 is satisfied if (i) is not a clique, (ii) contains no hole, and (iii) every nonsimplicial vertex of represents a clique module of . In this paper cliques are required to be nonempty. It is easy to verify that the maximum induced interval subgraph or maximum spanning interval subgraph of a graph is a clique if and only if the graph itself is a clique; thus, under the condition of Thms. 2.2 and 2.3, a maximal strong module is a clique of the object interval graph if and only if it is a clique of .
Theorem 2.4.
There is a minimum interval supergraph of such that every maximal strong module of is a module of , and if is not a clique, then replacing by any minimum interval supergraph of in gives a minimum interval supergraph of .
2.2 Characterization and decomposition of locally interval graphs
Prop. 1.3 reduces the main task of the first
phase, the detection of a subgraph of in , to two
simpler tasks, namely, finding a subgraph of in
and finding all simplicial vertices of when it is a locally interval graph.
Both tasks are trivial when is an interval graph (including
cliques and edgeless graphs), and hence we concentrate on prime
noninterval graphs. If such a graph contains no subgraph in
, i.e., being a locally interval graph, then it must contain some large caw or
some long hole. Therefore, we start from characterizing large caws
and long holes in prime locally interval graphs. A glance at
Fig. 1 tells us that each caw contains precisely three
simplicial vertices, which form the unique asteroidal triple (AT) of this caw; they are
called the terminals of this caw.
Theorem 2.5.
Let be a large caw of a prime graph . We can in time find a subgraph of in if the shallow terminal of is nonsimplicial in .
If a prime locally interval graph is chordal, then by Thm. 2.5, every caw contains a simplicial vertex (its shallow terminal), and thus must be an interval graph. In a chordal graph, can be easily found, and then a caterpillar decomposition for can be obtained by adding to a clique path decomposition for (Section 5.3). This settles the chordal case of Thm. 1.8; we may hence assume that is not chordal and has a long hole .
Theorem 2.6.
Let be a hole of a prime graph . We can in time find a subgraph of in if there exists a vertex satisfying one of the following: (1) the neighbors of in are not consecutive; (2) is adjacent to or more vertices in ; and (3) is nonsimplicial and nonadjacent to .
If is a prime locally interval graph, then for any vertex of the hole , the subgraph must be chordal; otherwise, and any hole of will satisfy Thm. 2.6(3). Therefore, combining Thms. 2.5 and 2.6, we conclude that must be an interval subgraph, and has a linear structure. These observations inspire the definition of the auxiliary graph (with respect to ), which is the main technical tool for analyzing prime nonchordal graphs. Here we need a special vertex of satisfying some local properties, which can be found in linear time (Section 5.1). We number vertices in such that is this special vertex and define . We designate the ordering of traversing as clockwise, and the other counterclockwise. The local properties enable us to assign a direction to each edge between and , i.e., , in accordance with the direction of itself. We use and to denote the set of clockwise and counterclockwise edges from , respectively; partitions .
Definition 2.
The vertex set of consists of , where and are distinct copies of , i.e., for each , there are a vertex in and another vertex in , and is a new vertex distinct from . For each edge , we add to the edge set of

an edge if neither nor is in ;

two edges and if both and are in ; or

an edge or if and or respectively.
Finally, we add an edge for every .
It is easy to see that the order and size of are upper bounded by and respectively. We will show in Section 5.1 that an adjacency list representation of can be constructed in linear time. The auxiliary graph carries all structural information of useful for us and is easy to manipulate; in particular, the new vertex is introduced to memorize the connection between and the right end of . The shape of the symbol is a good hint for understanding the structure of the auxiliary graph. Suppose has an olive-ring structure; then has a caterpillar structure, which is obtained by unfolding the olive ring as follows. The subgraph has a caterpillar structure, to the ends of which we append two copies of . The two copies of , namely, and , are identical, and every edge between and is carried by only one copy of it, depending on whether it is in or . Furthermore, properties stated in the following theorem allow us to fold (the reverse of the “unfolding” operation) the caterpillar structure of back to produce the olive-ring decomposition for . Note that is different from .
Theorem 2.7.
A vertex different from is simplicial in if and only if it is derived from some simplicial vertex of . Moreover, we can in time find a subgraph of in if 1) is not chordal; or 2) is not an interval graph.
We may assume that the graph is chordal, whose simplicial vertices can be identified easily. As a result of Thm. 2.7, we can retrieve and obtain the graph . If it is not an interval graph, then we are done with Thm. 1.8. Otherwise, we apply the following operation to sequentially build a hole decomposition for and an olive-ring decomposition for . Noting that all holes of are also in , once the decomposition for is produced, we can use it to find a shortest hole of . We proceed only when this hole is long.
Lemma 2.8.
Given a clique path decomposition for , we can in time build a clique decomposition for that is a hole. Moreover, we can find in time a shortest hole of .
Theorem 2.9.
Given a clique hole decomposition for , we can in time construct a clique decomposition for that is an olive ring.
Putting together these steps, we get the decomposition algorithm in Fig. 2, from which Thm. 1.8 follows. This concludes the proof of the characterization and decomposition of prime locally interval graphs.
Algorithm decompose()
input: a prime graph .
output: a caterpillar/olive-ring decomposition for or a subgraph of in .
1 if is chordal then
    if is an interval graph then return a clique path decomposition for ;
    if is an interval graph then return a caterpillar decomposition for ;
    find a caw ; if is small then return , else call Thm. 2.5;
2 find a hole of ; build ;
3 if is not chordal then call Thm. 2.7(1);
4 find and ; construct ;
5 if is not an interval graph then call Thm. 2.7(2);
6 call Lem. 2.8 and Thm. 2.9 to build an olive-ring decomposition for .
2.3 Recognition of almost interval graphs
In lieu of general solutions, we may consider only those optimum solutions satisfying Thms. 2.2-2.4, which focus us on the quotient graph defined by maximal strong modules of . If is a clique, then we have either a hole or a smaller instance (by removing all universal vertices). Otherwise is prime and we call Thm. 1.8 with it, which has two possible outcomes; there are only a constant number of modifications applicable to a small caw, and thus we may assume that the outcome is an olive-ring decomposition . For the completion problem, as holes can be easily filled, we can always assume that the graph is chordal and is a caterpillar. With decomposition , whether the input instance satisfies the conditions of Thms. 2.2 and 2.3 can be easily checked. If some nonsimplicial vertex in represents a nonclique module, then we have a hole. Otherwise, we work on all maximal strong modules and find each of them an optimum solution, for which it suffices to consider those represented by simplicial vertices in . Using the definition, it is easy to verify that the resulting graph has the same set of maximal strong modules as , and hence remains the prime quotient of it. With inductive reasoning, we may assume that every simplicial vertex in now represents an interval subgraph. In summary, the only condition of Cor. 2.1 that might remain unsatisfied is whether itself is an interval graph. Therefore, this section is devoted to the disposal of , which is prime and has a caterpillar/olive-ring decomposition .
Allow us some informality in explaining the intuition behind our algorithms for the deletion problems. Recall that clique path decompositions are characteristic of interval graphs [33]. With a bird’s-eye view, what we have is an olive ring, while what we want is a path; it may help to mention that the maximal cliques of the graph may change, and the bags of the latter are not necessarily subsets of those of the former. Toward this end, we need to cut the main cycle of the olive ring and strip off its leaves, and there are immediately two options based on which action is taken first. Interestingly, they correspond to the disposal of holes and caws, respectively. From we can observe that every hole of is global in the sense that it dominates all holes. In contrast, every caw is local, with diameter at most four, so it sees only a part of the main cycle. The structural difference between holes and caws suggests that different techniques are required to handle them. As explained in Section 1.6, we strip the leaves off the olive ring first to make it a hole.
Let () be a large caw in , possibly (see the second row of Fig. 1). We consider its terminals as well as their neighbors, i.e., . It is observed that if all of them are retained and their adjacencies (except that of and , which are adjacent in a but not a ) are not changed, then in an interval model of the object interval graph, they must be arranged in the way depicted in Fig. 3. As indicated by the dashed extensions, the interval for (resp., ) might or might not extend to the left (resp., right) to intersect the interval for (resp., ). Our main observation is on the position of the interval for : it has to lie between and , which are nonadjacent; this explains why we single out net and (rising) sun from and respectively. Recall that is originally adjacent to no vertex in the  path . Therefore, we need to delete some vertex or edge to break , or add an edge to connect to some inner vertex of .
In the discussion above, what matters is only the terminals and their neighbors, while the particular  path becomes irrelevant. Indeed, any induced  path in can be used in place of to give a caw of the same type (though not necessarily the same size), which has the same set of terminals. A similar operation is thus needed for all of them, and the particular base is immaterial, inspiring us to consider the following two sets of vertices. Of a large caw (), the frame is denoted by (), and the set of inner vertices is composed of all vertices that can be used to make a caw with frame (); they are denoted by and respectively. Without a specific path in sight, it would be more convenient to use () to denote a frame.
We can find a frame that is minimal in a sense. Its definition, given in Section 6.1, is essentially the same as what is used in our previous work [21]. The major concern here is how to find a minimal frame in linear time; considering that the graph might still contain holes and small caws, it is far more complicated than in [21]. This is achieved using the olive-ring decomposition (Section 6.1). The rest is then devoted to the disposal of (caws with) this minimal frame.
Consider first vertex deletions. We show that any optimum solution deletes either some vertex of or a minimum  separator in the subgraph induced by . The second case is our main concern, for which we manage to show that any minimum  separator will suffice; it can be found in linear time. This case can be informally explained as follows. All vertices in and must reside in a consecutive part of the main cycle of the olive-ring decomposition, and we need to find some “place” in between to accommodate . We show that it suffices to “cut any thinnest place” between and , and use this space for . Recalling that has at most seven vertices, we have then an way branching for disposing of this frame.
The basic idea for edge deletions is similar to that for vertex deletions, i.e., we delete either one of a bounded number of edges or a minimum edge  separator, but we are now confronted with more complex situations. First, the assumption that no edge in is deleted does not suffice, so instead we find a shortest  path with all inner vertices from . If has a bounded length, then we branch on deleting every edge in it. Otherwise, we argue that either one of the first or last edges of is deleted, or it suffices to find a minimum set of edges whose deletion separates and in , which can also be viewed as the “thinnest place” (in another sense) between and . This gives an way branching.
After all caws are destroyed as above, if is chordal, then
the problems are solved; otherwise, it has a clique hole
decomposition.
Lemma 2.10.
Given a clique hole decomposition for graph , the problems interval vertex deletion and interval edge deletion can be solved in time and respectively.
The “thinnest places” are also crucial for completions, though the argument becomes even more delicate. Our focus is on a minimum interval supergraph that contains no edge in ; in particular, and remain nonadjacent in . As said, we attend to caws only when the graph is already chordal, which means that the clique decomposition is a caterpillar. Therefore, and can be used to decide a left-right relation for both the caterpillar decomposition of and an interval model of . After adding edges, an interval for a vertex that is to the right of in might intersect some or all of the intervals between and . We argue that such an interval either reaches , or is to the right of some position (informally speaking, the “rightmost thinnest place”). A symmetric argument works for a vertex to the left of . As a result, we have two points such that the entire structure between them is totally decided by and ; in particular, it suffices to put in any “thinnest” place in between. This gives a way branching.
Putting together these steps, a high-level outline of our algorithms is given in Fig. 4. This concludes Thms. 1.1 and 1.2.
input: a graph and a nonnegative integer .
output: a set of at most modifications that transforms into an interval graph; or “NO.”
0 if then return “NO”; if is an interval graph then return ;
1 [only for interval completion] fill all holes of ;
2 if the quotient graph defined by maximal strong modules of is edgeless then solve each component individually;
3 if is a clique then
    if there are two nonclique modules then find a hole and branch on disposing of it;
    else solve the subgraph induced by the only nonclique module;
  // is not an interval graph.
4 call decompose();
5 if a small caw or short hole is found then branch on disposing of it;
  // We have hereafter a caterpillar/olive-ring decomposition.
6 if a nonsimplicial vertex of represents a nonclique module then find a hole and branch on disposing of it;
7 for each module represented by a simplicial vertex of do solve the subgraph ;
8 if the clique decomposition is not a hole then find a minimal frame and branch on disposing of it;
9 [not for interval completion] call Lem. 2.10.
Organization.
The rest of the paper is organized as follows. Section 3 relates modules to optimum solutions of all three problems, and proves Thms. 2.2-2.4 as well as Thms. 1.4-1.6. Section 4 gives the characterization of large caws and long holes in prime locally interval graphs, and proves Thms. 2.5 and 2.6. Section 5 presents the details of decomposing prime graphs and proves Thms. 2.7-2.9. Section 6 presents the details on the disposal of large caws. Section 7 uses all these results to complete the algorithms. Section 8 closes this paper by describing some follow-up work and discussing some possible improvements and new directions.
3 Modules
This section is devoted to the proof of Thms. 1.4-1.6 and Thms. 2.2-2.4. Each of these theorems comprises two assertions on modules of . The module preservation asserts that a (maximal strong) module (or its remnant after partial deletion) remains a module of the optimum solution, and local optimum asserts that the optimum solution restricted to is an optimum solution of itself, and can be replaced by any optimum solution of . Thms. 1.4 and 2.2 (vertex deletions) turn out to be quite straightforward. As a matter of fact, a weaker version of them has been proved and used in [21], and a similar argument, which is based on the characterization of forbidden induced subgraphs and the hereditary property, also works here. On the other hand, this approach does not seem to be adaptable to the edge modification problems.
Simple examples tell us that not all maximum spanning interval subgraphs or minimum interval supergraphs preserve all modules. For example, consider the graph in Fig. 3 with only solid edges, which is obtained from a as follows: the center is replaced by a clique of vertices, and the shallow terminal is replaced by two nonadjacent vertices and . A minimum completion of this graph must add, for each of and , an edge to connect it to some vertex at the bottom. These two vertices do not need to be the same, e.g., the dashed edges in Fig. 3; however, is not a module of the resulting minimum interval supergraph. On the one hand, the subgraph induced by the maximal strong module is not connected; on the other hand, we may alternatively connect both and to the same vertex so that we obtain another minimum interval supergraph that preserves as a module. These two observations turn out to be general: 1) if a module of a graph is not a module of some minimum interval supergraph of , then must be disconnected, and 2) we can always modify to another minimum interval supergraph of such that is a module of .
To make it worse, a maximum spanning interval subgraph may have to break some maximal strong modules. The simplest example is a hole graph, which has two nontrivial modules, but its maximum spanning interval subgraph must be a simple path, which is prime. Even connectedness does not help here. For example, consider the graph in Fig. 4(a) (all edges, both solid and dashed), which is obtained by completely connecting two induced paths and . A maximum spanning interval subgraph of it has to be isomorphic to Fig. 4(a) after the dashed edges are deleted, which is again prime. Both examples contain some hole, which urges us to study hole-free graphs. We show that any hole-free graph has a maximum spanning interval subgraph that preserves all its modules. It is worth stressing that not all maximum spanning interval subgraphs of a hole-free graph preserve all its modules, e.g., the graph (with both solid and dashed edges) and its maximum spanning interval subgraph (after the dashed edges are deleted) in Fig. 4(b).
The way we prove Thms. 2.3 and 2.4 is using interval models: we construct an interval graph satisfying the claimed conditions by explicitly giving an interval model for it. For this purpose we need more notation on interval models. In an interval model, each vertex corresponds to a closed interval , where and are the left and right endpoints of , respectively, and . An interval model is called normalized if no pair of distinct intervals in it shares an endpoint; every interval graph has a normalized interval model. All interval models in this section are normalized. For a subset of vertices, we define and . Observe that if induces a connected subgraph, then the interval is exactly the union of . Let be a set of points that are in an interval . By projecting from to another interval we mean the following operation:
In other words, each point in is proportionally shifted to a point in . It is easy to verify that all new points are in and this operation retains relations between every pair of points. In particular, if we project the endpoints of all intervals for , the set of new intervals defines the same interval graph.
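The displayed formula for the projection did not survive in this text. A minimal sketch of a proportional shift that is consistent with the description above follows; the affine map, the names `project`, `project_model`, and `interval_edges`, and the toy model are all assumptions made for illustration.

```python
from fractions import Fraction

def project(points, src, dst):
    """Shift each point of src = (a, b) proportionally into dst = (c, d).

    Assumed formula: p -> c + (p - a) * (d - c) / (b - a), with integer endpoints.
    """
    (a, b), (c, d) = src, dst
    scale = Fraction(d - c, b - a)
    return [c + (p - a) * scale for p in points]

def project_model(model, src, dst):
    """Project both endpoints of every interval in the model."""
    return {v: tuple(project(iv, src, dst)) for v, iv in model.items()}

def interval_edges(model):
    """Edges of the interval graph defined by closed intervals."""
    names = sorted(model)
    return {(u, v) for i, u in enumerate(names) for v in names[i + 1:]
            if model[u][0] <= model[v][1] and model[v][0] <= model[u][1]}

# A normalized toy model: no two intervals share an endpoint.
model = {"a": (0, 4), "b": (3, 7), "c": (6, 10), "d": (8, 9)}
squeezed = project_model(model, (0, 10), (20, 25))
```

Because the map is strictly increasing, the relative order of all endpoints is kept, so `interval_edges(squeezed)` equals `interval_edges(model)`.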
The following simple observation will be crucial for our arguments. For any point , we can find a positive value such that the only possible endpoint of in is . Here the value of should be understood as a function—depending on the interval model as well as the point —instead of a constant.
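One way to compute such a value for a finite interval model is sketched below. It assumes that the neighborhood in question is the closed interval of radius ε around the point (the exact notation is missing in this text) and simply takes half the distance to the nearest other endpoint; the function name and the toy model are hypothetical.

```python
def safe_radius(model, rho):
    """Return eps > 0 such that [rho - eps, rho + eps] contains no endpoint other than rho."""
    endpoints = {x for iv in model.values() for x in iv}
    gaps = [abs(x - rho) for x in endpoints if x != rho]
    # Half the smallest gap keeps every other endpoint strictly outside.
    return min(gaps) / 2 if gaps else 1.0

model = {"a": (0, 4), "b": (3, 7), "c": (6, 10), "d": (8, 9)}
eps_mid = safe_radius(model, 5)   # nearest endpoints are 4 and 6
eps_end = safe_radius(model, 4)   # 4 is itself an endpoint; nearest other is 3
```

The returned value depends on both the model and the chosen point, matching the remark that it should be read as a function rather than a constant.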
3.1 Modules in maximum induced interval subgraphs
For any module and vertex set of , the set , if not empty, is a module of the subgraph . This property implies the preservation of modules in all maximum induced interval subgraphs. Therefore, for Thms. 1.4 and 2.2, it suffices to prove their second assertions, which follow from the following statement. Recall that if a hole contains precisely two vertices from some module , then neither nor induces a clique.
Lemma 3.1.
Let be a maximum induced interval subgraph of a graph . Let be a module of such that at least one of and induces a clique. If , then replacing by any maximum induced interval subgraph of in gives a maximum induced interval subgraph of .
Proof.
Suppose, for contradiction, that the new graph is not an interval graph. From we can find a subgraph in , which must intersect both and . Since at least one of and induces a clique, contains exactly one vertex of ; let it be . By assumption, there exists a vertex (possibly ); let . Clearly, , but is isomorphic to , hence in , contradicting that is an interval graph. ∎
3.2 Modules in maximum spanning interval subgraphs
Before the proof of Thm. 2.3, we show a stronger result on clique modules.
Lemma 3.2.
A clique module of a graph is also a clique module of any maximum spanning interval subgraph of .
Proof.
Let be the component of such that attains the maximum value among all components of . We modify a given normalized interval model for as follows. Let and . For each , we set to a distinct value in , and set to a distinct value in . For each , we set . Let be the interval graph defined by . By construction, a vertex is adjacent to in if and only if it is in . Since , we have . For each , it holds that
On the other hand, induces a clique in . They together imply , while the equality is only attainable when is a clique, hence , and is completely connected to . Therefore, , and this verifies the lemma. ∎
Theorem 2.3 (restated).
Let be a graph of which every hole is contained in some maximal strong module. There exists a maximum spanning interval subgraph of such that every maximal strong module of is a module of , and replacing by any maximum spanning interval subgraph of in gives a maximum spanning interval subgraph of .
Proof.
As a consequence of Lem. 3.2, it suffices to consider the case when is not a clique, and then must induce a clique of . Let be a normalized interval model for .
Claim 1.
For any component of , the set induces a clique of .
Proof.
Supposing the contrary, we construct an interval graph with as follows. Let be a pair of vertices in such that , i.e., in . Without loss of generality, assume that is to the left of , and let be an arbitrary point in between, i.e., . For every , we extend the interval to include : if is to the left of , we set to be a distinct point in (); if is to the right of , we set to be a distinct point in (). We use the graph defined by the set of new intervals as . To see , note that all intervals are extended only; to see , note that is an edge in but not in . Every edge in is always incident to , which is a subset of , and hence exists also in . Therefore, is an interval subgraph of with strictly more edges than , which is impossible. ∎
Let be any maximum spanning interval subgraph of . We modify first to make satisfy the claimed condition. Let be the component of such that attains the maximum value among all components of . We have seen that induces a clique in . The intersection of all intervals is thus nonempty; let it be . Since is connected, . The interval intersects , and we can choose a common point in them. We construct another graph by projecting an interval model for into . The graph represented by the set of new intervals will be the sought-after interval graph .
We now verify and . On the one hand, induces the same subgraph in and . On the other hand, by assumption, is an interval subgraph of and has no more edges than . Therefore, it suffices to consider edges between and . For , these edges are . Since is a subset of , it holds that . By selection of (i.e., has the largest size), .
We have now constructed a maximum spanning interval subgraph of where satisfies the claimed conditions. Only intervals for vertices in are changed, and thus this operation can be successively applied to the maximal strong modules of one by one. If a module already satisfies the conditions, then this remains true after modifying other modules. Therefore, repeating this process yields the claimed maximum spanning interval subgraph of . ∎
As explained below, this settles Thm. 1.5 as well. Thm. 1.5 will not be directly used in this paper, and thus the reader may safely skip the following proof without losing track of the development of our algorithms.
Proof.
We show first that Thm. 1.5 implies Thm. 2.3. Let be a graph of which every hole is contained in some maximal strong module. Let be the set of maximal strong modules of , and let be the quotient graph defined by them. Let be a maximum spanning interval subgraph of ; for each , we replace by , which is an interval subgraph; let denote the obtained graph. Clearly, is a subgraph of , and is a maximum spanning interval subgraph of . Moreover, every remains a module of and thus is a quotient graph of . By Lem. 3.2, is not a clique if and only if is not a clique. Thus, is hole-free, and by Thm. 1.5, there is a maximum spanning interval subgraph of such that for each . This implies that is a maximum spanning interval subgraph of , and the substitutability follows from Cor. 2.1. Moreover, since and are both maximum spanning interval subgraphs of , they have the same size, which implies that is a maximum spanning interval subgraph of as well. This verifies that satisfies the claimed conditions of Thm. 2.3, and concludes this direction.
We now verify the other direction. Let be a hole-free graph. Note that every strong module different from is a subset of some maximal strong module , and a strong module in [45]. We first use inductive reasoning to show that the assertions i) and ii) of Thm. 1.5 hold for every strong module of . The base case is trivial: the largest strong module is . The inductive steps follow from Thm. 2.3: since every strong module induces a hole-free subgraph , its quotient graph trivially satisfies the condition of Thm. 2.3. This settles all strong modules, and then we consider modules that are not strong. Such a module is composed of more than one strong module, and they are either pairwise adjacent or pairwise nonadjacent. In the first case (noting that the graph contains no hole), at most one of these strong modules is nontrivial. In the second case, they are different components of . Both cases are straightforward. ∎
3.3 Modules in minimum interval supergraphs
Before the proof of Thm. 2.4, we show a stronger result on connected modules, i.e., modules inducing connected subgraphs, of a graph with respect to its minimum interval supergraphs. For a subset of vertices, we denote by the set of common neighbors of , i.e., . Note that , and the equality is attained if and only if it is a module of .
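The relation between common neighbors and modules can be checked on a toy graph. The symbols in the sentence above were lost, so the sketch assumes the usual reading: the common neighborhood of a nonempty set is contained in its open neighborhood, with equality exactly when the set is a module. All names below are hypothetical.

```python
def open_neighborhood(adj, U):
    """Vertices outside U that are adjacent to at least one vertex of U."""
    U = set(U)
    return set().union(*(adj[u] for u in U)) - U

def common_neighbors(adj, U):
    """Vertices adjacent to every vertex of the nonempty set U."""
    return set.intersection(*(set(adj[u]) for u in U))

def is_module_by_neighbors(adj, U):
    """Assumed characterization: U is a module iff the two sets coincide."""
    return common_neighbors(adj, U) == open_neighborhood(adj, U)

# A hole on four vertices.
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
opposite = {0, 2}   # both vertices see exactly {1, 3}
adjacent = {0, 1}   # the two vertices have no common neighbor
```

Here `opposite` passes the test while `adjacent` does not, since its common neighborhood is empty but its open neighborhood is {2, 3}.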
Theorem 3.4.
Let be a minimum interval supergraph of a graph . Every connected module of is a module of , and if is not a clique, then replacing by any minimum interval supergraph of in gives a minimum interval supergraph of .
Proof.
The statement holds vacuously if consists of a single vertex or a component; hence we may assume and . Let be a normalized interval model for . We define