Sparsification Upper and Lower Bounds for Graph Problems and Not-All-Equal SAT
This work was supported by NWO Veni grant “Frontiers in Parameterized Preprocessing” and NWO Gravity grant “Networks”.
We present several sparsification lower and upper bounds for classic problems in graph theory and logic. For the problems $4$-Coloring, (Directed) Hamiltonian Cycle, and (Connected) Dominating Set, we prove that there is no polynomial-time algorithm that reduces any $n$-vertex input to an equivalent instance, of an arbitrary problem, with bitsize $O(n^{2-\varepsilon})$ for $\varepsilon > 0$, unless $\mathsf{NP} \subseteq \mathsf{coNP/poly}$ and the polynomial-time hierarchy collapses. These results imply that existing linear-vertex kernels for $k$-Nonblocker and $k$-Max Leaf Spanning Tree (the parametric duals of (Connected) Dominating Set) cannot be improved to have $O(k^{2-\varepsilon})$ edges, unless $\mathsf{NP} \subseteq \mathsf{coNP/poly}$. We also present a positive result and exhibit a non-trivial sparsification algorithm for $d$-Not-All-Equal-SAT. We give an algorithm that reduces an $n$-variable input with clauses of size at most $d$ to an equivalent input with $O(n^{d-1})$ clauses, for any fixed $d$. Our algorithm is based on a linear-algebraic proof of Lovász that bounds the number of hyperedges in critically $3$-chromatic $d$-uniform $n$-vertex hypergraphs by $\binom{n}{d-1}$. We show that our kernel is tight under the assumption that $\mathsf{NP} \not\subseteq \mathsf{coNP/poly}$.
Bart M. P. Jansen and Astrid Pieterse
Subject classification: F.2.2 Nonnumerical Algorithms and Problems, G.2.2 Graph Theory
Sparsification refers to the method of reducing an object such as a graph or CNF-formula to an equivalent object that is less dense, that is, an object in which the ratio of edges to vertices (or clauses to variables) is smaller. The notion is fruitful in both theoretical and practical settings when working with (hyper)graphs and formulas. The theory of kernelization, originating from the field of parameterized complexity theory, can be used to analyze the limits of polynomial-time sparsification. Using tools developed in the last five years, it has become possible to address questions such as: “Is there a polynomial-time algorithm that reduces an $n$-vertex instance of my favorite graph problem to an equivalent instance with a subquadratic number of edges?”
The impetus for this line of analysis was given by an influential paper by Dell and van Melkebeek (conference version in 2010). One of their main results states that if there is an $\varepsilon > 0$ and a polynomial-time algorithm that reduces any $n$-vertex instance of Vertex Cover to an equivalent instance, of an arbitrary problem, that can be encoded in $O(n^{2-\varepsilon})$ bits, then $\mathsf{NP} \subseteq \mathsf{coNP/poly}$ and the polynomial-time hierarchy collapses. Since any nontrivial input of $k$-Vertex Cover has $k \leq n$, their result implies that the number of edges in the $2k$-vertex kernel for $k$-Vertex Cover cannot be improved to $O(k^{2-\varepsilon})$ unless $\mathsf{NP} \subseteq \mathsf{coNP/poly}$.
Using related techniques, Dell and van Melkebeek also proved important lower bounds for $d$-cnf-sat problems: testing the satisfiability of a propositional formula in CNF form, where each clause has at most $d$ literals. They proved that for every fixed integer $d \geq 3$, the existence of a polynomial-time algorithm that reduces any $n$-variable instance of $d$-cnf-sat to an equivalent instance, of an arbitrary problem, with $O(n^{d-\varepsilon})$ bits, for some $\varepsilon > 0$, implies $\mathsf{NP} \subseteq \mathsf{coNP/poly}$. Their lower bound is tight: there are $O(n^d)$ possible clauses of size $d$ over $n$ variables, allowing an instance to be represented by a vector of $O(n^d)$ bits that specifies for each clause whether or not it is present.
We continue this line of investigation and analyze sparsification for several classic problems in graph theory and logic. We obtain several sparsification lower bounds that imply that the quadratic number of edges in existing linear-vertex kernels is likely to be unavoidable. When it comes to problems from logic, we give, to the best of our knowledge, the first example of a problem that does admit nontrivial sparsification: $d$-Not-All-Equal-SAT. We also provide a matching lower bound.
The first problem we consider is -Coloring, which asks whether the input graph has a proper vertex coloring with colors. Using several new gadgets, we give a cross-composition  to show that the problem has no compression of size unless . To obtain the lower bound, we give a polynomial-time construction that embeds the logical or of a series of size- inputs of an NP-hard problem into a graph with vertices, such that has a proper -coloring if and only if there is a yes-instance among the inputs. The main structure of the reduction follows the approach of Dell and Marx : we create a table with two rows and columns and vertices in each cell. For each way of picking one cell from each row, we aim to embed one instance into the edge set between the corresponding groups of vertices. When the NP-hard starting problem is chosen such that the inputs each decompose into two induced subgraphs with a simple structure, one can create the vertex groups and their connections such that for each pair of cells , the subgraph they induce represents the -th input. If there is a yes-instance among the inputs, this leads to a pair of cells that can be properly colored in a structured way. The challenging part of the reduction is to ensure that the edges in the graph corresponding to no-inputs do not give conflicts when extending this partial coloring to the entire graph.
The next problem we attack is Hamiltonian Cycle. We rule out compressions of size $O(n^{2-\varepsilon})$ for the directed and undirected variant of the problem, assuming $\mathsf{NP} \not\subseteq \mathsf{coNP/poly}$. The construction is inspired by kernelization lower bounds for Directed Hamiltonian Cycle parameterized by the vertex-deletion distance to a directed graph whose underlying undirected graph is a path.
By combining gadgets from kernelization lower bounds for two different parameterizations of Red Blue Dominating Set, we prove that there is no compression of size $O(n^{2-\varepsilon})$ for Dominating Set unless $\mathsf{NP} \subseteq \mathsf{coNP/poly}$. The same construction rules out subquadratic compressions for Connected Dominating Set. These lower bounds have implications for the kernelization complexity of the parametric duals Nonblocker and Max Leaf Spanning Tree of (Connected) Dominating Set. For both Nonblocker and Max Leaf there are kernels with $O(k)$ vertices [6, 11] that have $O(k^2)$ edges. Our lower bounds imply that the number of edges in these kernels cannot be improved to $O(k^{2-\varepsilon})$, unless $\mathsf{NP} \subseteq \mathsf{coNP/poly}$.
The final family of problems we consider is $d$-Not-All-Equal-SAT for fixed $d \geq 4$. The input consists of a formula in CNF form with at most $d$ literals per clause. The question is whether there is an assignment to the variables such that each clause contains both a literal that evaluates to true and one that evaluates to false. There is a simple linear-parameter transformation from $d$-cnf-sat to $(d+1)$-nae-sat that consists of adding one variable that occurs as a positive literal in all clauses. By the results of Dell and van Melkebeek discussed above, this implies that $d$-nae-sat does not admit compressions of size $O(n^{d-1-\varepsilon})$ unless $\mathsf{NP} \subseteq \mathsf{coNP/poly}$. We prove the surprising result that this lower bound is tight! A linear-algebraic result due to Lovász, concerning the size of critically $3$-chromatic $d$-uniform hypergraphs, can be used to give a kernel for $d$-nae-sat with $O(n^{d-1})$ clauses for every fixed $d$. The kernel is obtained by computing the basis of an associated matrix and removing the clauses that can be expressed as a linear combination of the basis clauses.
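The transformation from cnf-sat to nae-sat mentioned above can be checked by brute force on tiny formulas. The following is a minimal sketch, assuming a DIMACS-like encoding (a clause is a list of nonzero integers, where `-i` denotes the negation of variable `i`); the function names are hypothetical and chosen for illustration only.

```python
from itertools import product


def cnf_to_nae(clauses, num_vars):
    """Add one fresh variable as a positive literal to every clause.

    Returns the new clause list and the new number of variables.
    Each clause grows by exactly one literal.
    """
    fresh = num_vars + 1
    return [list(clause) + [fresh] for clause in clauses], num_vars + 1


def literal_value(literal, assignment):
    """Evaluate a literal under a tuple of booleans (variable i at index i-1)."""
    value = assignment[abs(literal) - 1]
    return value if literal > 0 else not value


def is_satisfiable(clauses, num_vars):
    """Brute-force CNF satisfiability: every clause has a true literal."""
    return any(
        all(any(literal_value(l, a) for l in c) for c in clauses)
        for a in product([False, True], repeat=num_vars)
    )


def is_nae_satisfiable(clauses, num_vars):
    """Brute-force NAE satisfiability: every clause has a true and a false literal."""
    return any(
        all(len({literal_value(l, a) for l in c}) == 2 for c in clauses)
        for a in product([False, True], repeat=num_vars)
    )
```

The equivalence rests on the fact that the complement of a NAE-satisfying assignment is again NAE-satisfying, so one may assume the fresh variable is false, which forces a true literal among the original literals of every clause.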
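A parameterized problem $\mathcal{Q}$ is a subset of $\Sigma^* \times \mathbb{N}$, where $\Sigma$ is a finite alphabet. Let $\mathcal{Q}, \mathcal{Q}' \subseteq \Sigma^* \times \mathbb{N}$ be parameterized problems and let $h \colon \mathbb{N} \to \mathbb{N}$ be a computable function. A generalized kernel for $\mathcal{Q}$ into $\mathcal{Q}'$ of size $h(k)$ is an algorithm that, on input $(x,k) \in \Sigma^* \times \mathbb{N}$, takes time polynomial in $|x| + k$ and outputs an instance $(x',k')$ such that:
$|x'|$ and $k'$ are bounded by $h(k)$, and
$(x',k') \in \mathcal{Q}'$ if and only if $(x,k) \in \mathcal{Q}$.
The algorithm is a kernel for $\mathcal{Q}$ if $\mathcal{Q}' = \mathcal{Q}$. It is a polynomial (generalized) kernel if $h(k)$ is a polynomial.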
Since a polynomial-time reduction to an equivalent sparse instance yields a generalized kernel, we will use the concept of generalized kernels in the remainder of this paper to prove the non-existence of such sparsification algorithms. We employ the cross-composition framework by Bodlaender et al. , which builds on earlier work by several authors [1, 8, 13].
[Polynomial equivalence relation] An equivalence relation $\mathcal{R}$ on $\Sigma^*$ is called a polynomial equivalence relation if the following conditions hold.
There is an algorithm that, given two strings $x, y \in \Sigma^*$, decides whether $x$ and $y$ belong to the same equivalence class in time polynomial in $|x| + |y|$.
For any finite set $S \subseteq \Sigma^*$ the equivalence relation $\mathcal{R}$ partitions the elements of $S$ into a number of classes that is polynomially bounded in the size of the largest element of $S$.
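[Cross-composition] Let $L \subseteq \Sigma^*$ be a language, let $\mathcal{R}$ be a polynomial equivalence relation on $\Sigma^*$, let $\mathcal{Q} \subseteq \Sigma^* \times \mathbb{N}$ be a parameterized problem, and let $f \colon \mathbb{N} \to \mathbb{N}$ be a function. An or-cross-composition of $L$ into $\mathcal{Q}$ (with respect to $\mathcal{R}$) of cost $f(t)$ is an algorithm that, given $t$ instances $x_1, \ldots, x_t \in \Sigma^*$ of $L$ belonging to the same equivalence class of $\mathcal{R}$, takes time polynomial in $\sum_{i=1}^{t} |x_i|$ and outputs an instance $(y,k) \in \Sigma^* \times \mathbb{N}$ such that:
the parameter $k$ is bounded by $O(f(t) \cdot (\max_i |x_i|)^c)$, where $c$ is some constant independent of $t$, and
$(y,k) \in \mathcal{Q}$ if and only if there is an $i \in [t]$ such that $x_i \in L$.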
[] Let be a language, let be a parameterized problem, and let be positive reals. If is NP-hard under Karp reductions, has an -cross-composition into with cost , where denotes the number of instances, and has a polynomial (generalized) kernelization with size bound , then .
For we will refer to an -cross-composition of cost as a degree- cross-composition. By Theorem 2, a degree- cross-composition can be used to rule out generalized kernels of size . We frequently use the fact that a polynomial-time linear-parameter transformation from problem to implies that any generalized kernelization lower bound for , also holds for (cf. [3, 4]). Let be defined as .
In this section we analyze the $4$-Coloring problem, which asks whether it is possible to assign each vertex of the input graph one out of 4 possible colors, such that there is no edge whose endpoints share the same color. We show that $4$-Coloring does not have a generalized kernel of size $O(n^{2-\varepsilon})$, by giving a degree-$2$ cross-composition from a tailor-made problem that will be introduced below. Before giving the construction, we first present and analyze some of the gadgets that will be needed.
A treegadget is the graph obtained from a complete binary tree by replacing each vertex by a triangle on vertices , and . Let be connected to the parent of and let and be connected to the left and right subtree of . An example of a treegadget with leaves is shown in Figure 1. If vertex is the root of the tree, then is named the root of the treegadget. If does not have a left subtree, then is a leaf of this gadget, similarly, if does not have a right subtree then we refer to as a leaf of the gadget. Let the height of a treegadget be equal to the height of its corresponding binary tree.
It is easy to see that a treegadget is $3$-colorable. The important property of this gadget is that if there is a color that does not appear on any leaf in a proper $3$-coloring, then this must be the color of the root. See Figure (a)a for an illustration.
Let $T$ be a treegadget with root $r$ and let $c$ be a proper $3$-coloring of $T$. If some color $k$ satisfies $c(v) \neq k$ for every leaf $v$ of $T$, then $c(r) = k$.
This will be proven using induction on the structure of a treegadget. For a single triangle, the result is obvious. Suppose we are given a treegadget of height and that the statement holds for all treegadgets of smaller height. Consider the top triangle where is the root. Then, by the induction hypothesis, the roots of the left and right subtree are colored using . Hence and do not use color . Since is a triangle, has color in the -coloring. ∎
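The root property can be confirmed exhaustively for small treegadgets. The sketch below is an illustration under stated assumptions: triangle vertices are encoded as hypothetical pairs `(node_id, corner)`, where corner `0` is the vertex attached to the parent and corners `1` and `2` attach to the left and right subtrees, and colors are `0, 1, 2`. It enumerates all proper 3-colorings, so it is only practical for heights 1 and 2.

```python
from itertools import product


def build_treegadget(height):
    """Build the treegadget of a complete binary tree with `height` levels.

    Returns (edges, root, leaves). Every tree node becomes a triangle;
    corner 1 / corner 2 of a node is joined to corner 0 of its left / right
    child, and the free corners of bottom-level triangles are the leaves.
    """
    edges, leaves = [], []

    def build(node_id, level):
        top, left, right = (node_id, 0), (node_id, 1), (node_id, 2)
        edges.extend([(top, left), (top, right), (left, right)])
        if level == height:
            leaves.extend([left, right])
        else:
            edges.append((left, build(2 * node_id, level + 1)))
            edges.append((right, build(2 * node_id + 1, level + 1)))
        return top

    root = build(1, 1)
    return edges, root, leaves


def check_root_lemma(height):
    """Return True if, in every proper 3-coloring, a color that is absent
    from all leaves is the color of the root (and a proper coloring exists)."""
    edges, root, leaves = build_treegadget(height)
    vertices = sorted({v for edge in edges for v in edge})
    found_proper = False
    for colors in product(range(3), repeat=len(vertices)):
        coloring = dict(zip(vertices, colors))
        if any(coloring[u] == coloring[v] for u, v in edges):
            continue  # not a proper coloring
        found_proper = True
        missing = set(range(3)) - {coloring[leaf] for leaf in leaves}
        if any(coloring[root] != color for color in missing):
            return False
    return found_proper
```

Such a check does not replace the inductive proof; it only guards against misreading the gadget definition.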
The following lemma will be used in the correctness proof of the cross-composition to argue that the existence of a single yes-input is sufficient for -colorability of the entire graph.
Let be a treegadget with leaves and root . Any -coloring that is proper on can be extended to a proper -coloring of . If there is a leaf such that , then such an extension exists with .
We will prove this by induction on the height of the treegadget. For a single triangle, the result is obvious. Suppose the lemma is true for all treegadgets up to height and we are given a treegadget of height with root triangle and with coloring of the leaves . Let one of the leaves be colored using . Without loss of generality assume this leaf is in the left subtree, which is connected to . By the induction hypothesis, we can extend the coloring restricted to the leaves of the left subtree to a proper -coloring of the left subtree such that . We assign color to . Since restricted to the leaves in the right subtree is a proper -coloring of the leaves in the right subtree, by induction we can extend that coloring to a proper -coloring of the right subtree. Suppose the root of this subtree gets color . We now color with a color , which must exist. Finally, choose . By definition, the vertices , , and are now assigned a different color. Both and have a different color than the root of their corresponding subtree, thereby is a proper coloring. We obtain that the defined coloring is a proper coloring extending with . ∎
A triangular gadget is a graph on vertices depicted in Figure (c)c. Vertices , and are the corners of the gadget, all other vertices are referred to as inner vertices.
It is easy to see that a triangular gadget is always $3$-colorable in such a way that every corner gets a different color. Moreover, we make the following observation.
In any proper $3$-coloring of a triangular gadget, the three corners receive pairwise distinct colors. Furthermore, every partial coloring that assigns distinct colors to the three corners of a triangular gadget can be extended to a proper $3$-coloring of the entire gadget.
Having presented all the gadgets we use in our construction, we now define the source problem for the cross-composition. It is a variant of the problem that was used to prove kernel lower bounds for Chromatic Number parameterized by vertex cover .
2-3-Coloring with Triangle Split Decomposition Input: A graph whose vertex set is partitioned into two sets, one inducing an edgeless graph and one inducing a disjoint union of triangles. Question: Is there a proper $3$-coloring of the graph that uses only two fixed colors on the vertices of the edgeless part? We will refer to such a coloring as a 2-3-coloring of the graph.
2-3-Coloring with Triangle Split Decomposition is NP-complete.
It is easy to verify the problem is in NP. We will show that it is NP-hard by giving a reduction from -nae-sat, which is known to be NP-complete . Suppose we are given formula over set of variables . Construct graph in the following way. For every variable , construct a gadget as depicted in Figure (a)a. For every clause , construct a gadget as depicted in Figure (b)b. Let for , connect vertex for to vertex in gadget in .
It is easy to verify that has a triangle split decomposition. In Figure 2, triangles are shown with white vertices and the independent set is shown in black.
Suppose is --colorable with color function and let for all in the independent set. Note that in each of the pairs , , and the two vertices have distinct colors in any proper --coloring of . To satisfy , let if and only if . To show that this results in a satisfying assignment, consider any clause for . Note that . Since and we obtain . Therefore, and are colored using colors and .
Suppose . Thereby, , implying the first literal of is set to . By , we know and . Thereby, , so either or . If , then which implies that literal is in . Similarly, if , then which implies that literal is in . In both cases it follows that clause is NAE-satisfied.
When , we can use the same argument with the colors and swapped, to show that is in and or is , which implies that is NAE-satisfied.
Suppose is a yes-instance, with satisfying truth assignment . Define color function as and if is set to false in , define and otherwise. Color the remainder of the variable gadgets consistently. We now need to show how to color the clause gadgets. Consider any clause . At least one of the literals is true and one is set to false, by symmetry we only consider four cases. The corresponding colorings are depicted in Figure 3, where red corresponds to , green corresponds to and blue corresponds to color . It is easy to verify that this leads to a proper -coloring that only uses colors and on vertices in the independent set.
$4$-Coloring parameterized by the number of vertices $n$ does not have a generalized kernel of size $O(n^{2-\varepsilon})$ for any $\varepsilon > 0$, unless $\mathsf{NP} \subseteq \mathsf{coNP/poly}$.
By Theorem 2 and Lemma 3 it suffices to give a degree-2 cross-composition from the 2-3-coloring problem defined above into $4$-Coloring parameterized by the number of vertices. For ease of presentation, we will actually give a cross-composition into the $4$-List Coloring problem, whose input consists of a graph and a list function that assigns every vertex a list of allowed colors. The question is whether there is a proper coloring of the graph in which every vertex is assigned a color from its list. The $4$-List Coloring problem reduces to ordinary $4$-Coloring by a simple transformation that adds a $4$-clique to enforce the color lists, which will prove the theorem. For now, we focus on giving a cross-composition into $4$-List Coloring.
We start by defining a polynomial equivalence relation on inputs of --Coloring with Triangle Split Decomposition. Let two instances of --Coloring with Triangle Split Decomposition be equivalent under equivalence relation when they have the same number of triangles and the independent sets also have the same size. It is easy to see that is a polynomial equivalence relation. By duplicating one of the inputs, we can ensure that the number of inputs to the cross-composition is an even power of two; this does not change the value of or, and increases the total input size by at most a factor four. We will therefore assume that the input consists of instances of --Coloring with Triangle Split Decomposition such that for some integer , implying that and are integers. Let . Enumerate the instances as for . Each input consists of a graph and a partition of its vertex set into sets and , such that is an independent set of size and consists of vertex-disjoint triangles. Enumerate the vertices in and as and , such that vertices and form a triangle, for . We will create an instance of the -List-Coloring problem, which consists of a graph and a list function that assigns each vertex a subset of the color palette . Refer to Figure 4 for a sketch of .
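The padding step, which makes the number of inputs an even power of two, can be sketched as follows. This is an illustrative helper with hypothetical names, treating instances as opaque Python objects; it duplicates the first input, which leaves the logical or of the inputs unchanged and increases the number of inputs by less than a factor four.

```python
import math


def pad_to_power_of_four(instances):
    """Duplicate the first instance until the count is a power of four."""
    if not instances:
        raise ValueError("need at least one instance")
    target = 1
    while target < len(instances):
        target *= 4
    return list(instances) + [instances[0]] * (target - len(instances))


def side_length(padded):
    """Return the square root of the number of padded instances.

    For a power of four this is an integer power of two, so the instances
    can be indexed by pairs (i, j) with both indices below this value.
    """
    root = math.isqrt(len(padded))
    if root * root != len(padded):
        raise ValueError("number of instances is not a perfect square")
    return root
```

With the count padded this way, the instances can be arranged in a square table, matching the enumeration by pairs of indices used in the construction.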
Initialize as the graph containing sets of vertices each, called for . Label the vertices in each of these sets as for , and let .
Add sets of triangular gadgets each, labeled for . Label the corner vertices in as for , such that vertices and are the corner vertices of one of the gadgets for . Let and for any inner vertex of a triangular gadget, let .
Connect vertex to vertex if in graph vertex is connected to , for and . By this construction, the subgraph of induced by is isomorphic to the graph obtained from by replacing each triangle with a triangular gadget.
Add a treegadget with leaves to and enumerate these leaves as ; recall that is a power of two. Connect the ’th leaf of to every vertex in . Let the root of be and define . For every other vertex in let .
Add a treegadget with leaves to and enumerate these leaves as . For , connect every inner vertex of a triangular gadget in group to leaf number of . For every leaf with an even index let and let the root have list . For every other vertex of gadget let .
The constructed graph is $4$-list-colorable if and only if some input instance is 2-3-colorable.
Suppose we are given a -list coloring for . By definition, . From Lemma 3 it follows that there is a leaf of such that . This leaf is connected to all vertices in some , which implies that none of the vertices in are colored using . Therefore all vertices in are colored using and . Similarly the gadget has at least one leaf such that , note that this must be a leaf with an odd index. Therefore there exists where all vertices are colored using , or . Thereby in only three colors are used, such that is colored using only two colors. Using Observation 3 and the fact that is isomorphic to the graph obtained from by replacing triangles by triangular gadgets, we conclude that has a proper --coloring.
Suppose is a proper --coloring for . We will construct a -list coloring for . For , in instance let and for for let . Let for and , furthermore let for and . For triangular gadgets in the coloring defines all corners to have distinct colors; by Observation 3 we can color the inner vertices consistently using . For with and , the corners of triangular gadgets have color and we can now consistently color the inner vertices using .
The leaf of gadget that is connected to can be colored using . Every other leaf can use both and , so we can properly -color the leaves such that one leaf has color . From Lemma 3 it follows that we can consistently -color such that the root does not receive color , as required by . Similarly, in triangular gadgets in the inner vertices do not have color . As such, leaf of can be colored using and we color leaf with . For with color leaf with and leaf using . Now the leaves of are properly -colored and one is colored . It follows from Observation 3 that we can color such that the root is not colored . This completes the -list coloring of . ∎
The claim shows that the construction serves as a cross-composition into -List Coloring. To prove the theorem, we add four new vertices to simulate the list function. Add a clique on vertices . If for any vertex in , some color is not contained in , connect to the vertex corresponding to this color. As proper colorings of the resulting graph correspond to proper list colorings of , the resulting graph is -colorable if and only if there is a yes-instance among the inputs. It remains to bound the parameter of the problem, i.e., the number of vertices. Observe that a treegadget has at least as many leaves as its corresponding binary tree, therefore the graph has at most vertices. Theorem 3 now follows from Theorem 2 and Lemma 3. ∎
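The simulation of color lists by a clique can be illustrated on a toy graph. The sketch below is not the construction's actual vertex naming: it assumes a palette `(1, 2, 3, 4)` and hypothetical palette vertices `("palette", c)`, adds a clique on them, and joins each original vertex to the palette vertices of the colors missing from its list. A brute-force checker then compares both problems.

```python
from itertools import product

PALETTE = (1, 2, 3, 4)  # assumed color names, for illustration only


def list_coloring_to_coloring(vertices, edges, lists):
    """Reduce a 4-List Coloring instance to a plain 4-Coloring instance."""
    palette_vertex = {c: ("palette", c) for c in PALETTE}
    new_vertices = list(vertices) + [palette_vertex[c] for c in PALETTE]
    new_edges = list(edges)
    # Clique on the four palette vertices: they use all four colors.
    for i, c1 in enumerate(PALETTE):
        for c2 in PALETTE[i + 1:]:
            new_edges.append((palette_vertex[c1], palette_vertex[c2]))
    # Forbid color c on v by making v adjacent to the palette vertex of c.
    for v in vertices:
        for c in PALETTE:
            if c not in lists[v]:
                new_edges.append((v, palette_vertex[c]))
    return new_vertices, new_edges


def has_proper_coloring(vertices, edges, allowed):
    """Brute force: is there a proper coloring with each vertex's color
    taken from allowed[vertex]?"""
    vertices = list(vertices)
    for colors in product(*(sorted(allowed[v]) for v in vertices)):
        coloring = dict(zip(vertices, colors))
        if all(coloring[u] != coloring[v] for u, v in edges):
            return True
    return False
```

For plain 4-Coloring, `allowed` maps every vertex to the full palette. The reduction adds only four vertices, so the bound on the number of vertices is unaffected up to an additive constant.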
4 Hamiltonian cycle
In this section we prove a sparsification lower bound for Hamiltonian Cycle and its directed variant by giving a degree-$2$ cross-composition. The starting problem is Hamiltonian path on bipartite graphs.
Hamiltonian path on bipartite graphs Input: An undirected bipartite graph with partite sets and such that , together with two distinguished vertices and that have degree . Question: Does have a Hamiltonian path from to ?
It is known that Hamiltonian path is NP-complete on bipartite graphs and it is easy to see that it remains NP-complete when fixing a degree-$1$ start and endpoint.
(Directed) Hamiltonian Cycle parameterized by the number of vertices $n$ does not have a generalized kernel of size $O(n^{2-\varepsilon})$ for any $\varepsilon > 0$, unless $\mathsf{NP} \subseteq \mathsf{coNP/poly}$.
By a suitable choice of polynomial equivalence relation, and by padding the number of inputs, it suffices to give a cross-composition from the problem on bipartite graphs when the input consists of instances for (i.e., is an integer), such that each instance encodes a bipartite graph with partite sets and with and , for some . For each instance, label all elements in as and all elements in as such that and have degree .
The construction makes extensive use of the path gadget depicted in Figure (a)a. Observe that if contains a path gadget as an induced subgraph, while the remainder of the graph only connects to its terminals and , then any Hamiltonian cycle in traverses the path gadget in one of the two ways depicted in Figure (a)a. We create an instance of Directed Hamiltonian Cycle that acts as the logical or of the inputs.
First of all construct groups of path gadgets each. Refer to these groups as , for , and label the gadgets within group as . Let the union of all created sets be named . Similarly, construct groups of path gadgets each. Refer to these groups as , for , and label the gadgets within group as . Let be the union of all for .
For every input instance , for each edge in with , , add an arc from of to of and an arc from of to of .
If some has a Hamiltonian path, it can be mimicked by the combination of and , where for each vertex in we traverse its path gadget in , following Path . The following construction steps are needed to extend such a path to a Hamiltonian cycle in .
Add an arc from the terminal of to the terminal of for all and all . Similarly add an arc from the terminal of to the of for all and all .
Add a vertex start and a vertex end and the arc .
Let , add tuples of vertices, for and connect start to . Furthermore, add the arcs for .
For we add arcs from to the terminal of the gadgets . Furthermore we add an arc from of to for all and . When add arcs from to the terminal of for and connect of to .
Add a vertex next and the arc and an arc from next to the terminal of all gadgets for .
Furthermore, add arcs from of all gadgets to end for . So for each , exactly one vertex has an outgoing arc to end and one has an incoming arc from next.
This completes the construction of . A sketch of is shown in Figure (b)b. In order to prove that the created graph acts as a logical or of the given input instances, we first establish a number of auxiliary lemmas.
Any Hamiltonian cycle in traverses any path gadget in via directed Path 0 or Path 1, as shown in Figure (a)a.
Any Hamiltonian cycle in should visit the center vertex of the path gadget. Since and are its only two neighbors in , the only option is to visit them consecutively, Path and Path are the only two options to do this. ∎
When any Hamiltonian cycle in enters path gadget at for some , the cycle then visits the gadgets in order without visiting other vertices in between. Similarly, if any Hamiltonian cycle in enters path gadget at , the cycle then visits the gadgets in order without visiting other vertices in between.
Consider a Hamiltonian cycle in that enters path gadget at . By Lemma 4 the cycle follows Path and continues to the terminal of the path gadget. Since that terminal has only one out-neighbor outside the gadget, which leads to the terminal of , it follows that the cycle continues to that path gadget. As the adjacency structure around the other path gadgets is similar, the lemma follows by repeating this argument. The proof when entering group at the vertex of is equivalent. ∎
Let be a directed Hamiltonian cycle in , such that its first arc is . There are indices such that subpath of the cycle between and contains exactly the vertices
where contains all vertices of all gadgets in for , and similarly contains all vertices of all gadgets in for .
We will first show that when the cycle reaches any for , it traverses exactly one group with and continues to and for some , without visiting other vertices in between. Similarly, when the cycle reaches any for , it traverses exactly one group with and continues to for some . For , the cycle then continues to , for the cycle reached , which is the last vertex of this subpath.
By Step 4 in the construction, all outgoing arcs of any for lead to gadgets for some . So for any in the cycle there must be a unique such that the arc from to the terminal of is in . By Lemma 4 the cycle visits all vertices in , and no other vertices, before reaching gadget , which is traversed by Path to get to of this gadget. The only neighbors of of gadget lying outside this gadget are of type for . As such, the cycle must visit some next, and its only outgoing arc goes to .
The proof for is similar. As such, visiting for results in visiting all vertices of exactly one group in before continuing via to some without visiting any vertices in between. Visiting for results in visiting all vertices of exactly one group in and returning via to either the end of the subpath () or some .
Every vertex for must be visited by , it remains to show that it is visited in subpath . Suppose there exists an for such that is not visited in the subpath from to . As we have seen above, visiting some results in visiting all vertices in some group in or , continued by visiting some for . Note that no other vertices are visited in between. Hereby, is not in subpath . This implies and thus the next vertex in the cycle is . So, for not in subpath , one can find a new vertex (where ), such that is also not in subpath . Note that we can not create a loop, by visiting a vertex seen earlier, as this would not yield a Hamiltonian cycle in . For example, the vertex start would never be visited. This is however a contradiction since we only have finitely many vertices .
Thus in subpath , exactly groups of are visited and exactly groups of are visited, and no other vertices than specified. This leaves exactly one group and one group unvisited in . ∎
In Step 4 we create a selection mechanism that leaves one group in and one in unvisited. The following lemma formalizes this idea.
Let be a Hamiltonian cycle in , such that its first arc is . Let and satisfy the conditions of Lemma 4. Then cycle visits before . Moreover, the subpath of the cycle between terminal of and of (inclusive) contains all vertices of the gadgets in and and no others.
Vertex next is visited directly after , since it is the only out-neighbor of . Furthermore, the arc from next to gadget must be in the cycle for some , since next only has outgoing arcs of this type. By Lemma 4, all gadgets in all for are visited in the path from to , and thus should not be visited after vertex next. Therefore, the arc from next to gadget is in the cycle, which also implies that is visited before .
It is easy to see that is the last arc in . By considering the incoming arcs of end it follows that some arc from terminal of to end for is in the cycle. Since the vertices in gadgets for are already visited in by Lemma 4, it follows that is in .
By Lemma 4, none of the terminals of gadgets in and are visited in the subpath or equivalently in the subpath . Since is a Hamiltonian cycle these vertices must therefore be visited in , which is equivalent to saying that must contain all vertices in . It is easy to see that this subpath cannot contain any other vertices, as all other vertices are present in or . ∎
Using the lemmas above, we can now prove that has a Hamiltonian cycle if and only if one of the instances has a Hamiltonian path.
Graph has a directed Hamiltonian cycle if and only if at least one of the instances has a Hamiltonian -path.
() Suppose has a Hamiltonian cycle . By Lemma 4 there exist such that the subpath of from gadget to visits exactly the gadgets in . Since gadget is entered at terminal , it is easy to see that all gadgets are traversed using Path . We now construct a Hamiltonian path for instance . Let if the arc from of to of is in . Similarly let if the arc from of to of is in , where and . Using that every gadget is visited exactly once via Path in , we see that is a Hamiltonian path.
() Suppose has a Hamiltonian path . Then we create a Hamiltonian cycle , for each vertex from instance in we add Path in path gadget to and for each vertex we add Path in path gadget to . Let be ordered such that is its first vertex. Now if is followed by in , the arc from terminal of to of is added to . Similarly, if a vertex is followed by in , the arc from terminal of to of will be added to . Now the subpath contains all terminals in all gadgets in .
From the cycle goes to end, then to start and to . To visit all groups for and for , do the following.
From where , the cycle continues to gadgets , then to following Path , and continue to .
From where it goes to and continues with .
Similarly, from where , go through gadgets and continue to .