Surjective $H$-Colouring: New Hardness Results
Supported by the Research Council of Norway via the project “CLASSIS” and the Leverhulme Trust (RPG-2016-258).
Abstract
A homomorphism from a graph $G$ to a graph $H$ is a vertex mapping $f$ from the vertex set of $G$ to the vertex set of $H$ such that there is an edge between vertices $f(u)$ and $f(v)$ of $H$ whenever there is an edge between vertices $u$ and $v$ of $G$. The $H$-Colouring problem is to decide if a graph $G$ allows a homomorphism to a fixed graph $H$. We continue a study on a variant of this problem, namely the Surjective $H$-Colouring problem, which requires the homomorphism to be vertex-surjective. We build upon previous results and show that this problem is NP-complete for every connected graph $H$ that has exactly two vertices with a self-loop as long as these two vertices are not adjacent. As a result, we can classify the computational complexity of Surjective $H$-Colouring for every graph $H$ on at most four vertices.
1 Introduction
The well-known Colouring problem is to decide if the vertices of a given graph can be properly coloured with at most $k$ colours for some given integer $k$. If we exclude $k$ from the input and assume it is fixed, we obtain the $k$-Colouring problem. A homomorphism from a graph $G=(V_G,E_G)$ to a graph $H=(V_H,E_H)$ is a vertex mapping $f: V_G \to V_H$, such that there is an edge between $f(u)$ and $f(v)$ in $H$ whenever there is an edge between $u$ and $v$ in $G$. We observe that $k$-Colouring is equivalent to the problem of asking if a graph allows a homomorphism to the complete graph $K_k$ on $k$ vertices. Hence, a natural generalization of the $k$-Colouring problem is the $H$-Colouring problem, which asks if a given graph allows a homomorphism to an arbitrary fixed graph $H$. We call this fixed graph $H$ the target graph. Throughout the paper we consider undirected graphs with no multiple edges. We assume that an input graph $G$ contains no vertices with self-loops (we call such vertices reflexive), whereas a target graph $H$ may contain such vertices. We call $H$ reflexive if all its vertices are reflexive, and irreflexive if all its vertices are irreflexive.
For a survey on graph homomorphisms we refer the reader to the textbook of Hell and Nešetřil [12]. Here, we will discuss the $H$-Colouring problem, a number of its variants and their relations to each other. In particular, we will focus on the surjective variant: a homomorphism $f$ from a graph $G$ to a graph $H$ is (vertex-)surjective if $f$ is surjective, that is, if for every vertex $x \in V_H$ there exists at least one vertex $u \in V_G$ with $f(u)=x$.
The computational complexity of $H$-Colouring has been determined completely. The problem is trivial if $H$ contains a reflexive vertex $u$ (we can map each vertex of the input graph to $u$). If $H$ has no reflexive vertices, then the Hell-Nešetřil dichotomy theorem [11] tells us that $H$-Colouring is solvable in polynomial time if $H$ is bipartite and that it is NP-complete otherwise.
The List $H$-Colouring problem takes as input a graph $G$ and a function $L$ that assigns to each $u \in V_G$ a list $L(u) \subseteq V_H$. The question is whether $G$ allows a homomorphism $f$ to the target $H$ with $f(u) \in L(u)$ for every $u \in V_G$. Feder, Hell and Huang [4] proved that List $H$-Colouring is polynomial-time solvable if $H$ is a bi-arc graph and NP-complete otherwise (we refer to [4] for the definition of a bi-arc graph). A homomorphism $f$ from $G$ to an induced subgraph $H$ of $G$ is a retraction if $f(x)=x$ for every $x \in V_H$, and we say that $G$ retracts to $H$. A retraction from $G$ to $H$ can be viewed as a list-homomorphism: choose $L(u)=\{u\}$ if $u \in V_H$, and $L(u)=V_H$ if $u \in V_G \setminus V_H$. The corresponding decision problem is called $H$-Retraction. The computational complexity of $H$-Retraction has not yet been classified. Feder et al. [5] determined the complexity of the $H$-Retraction problem whenever $H$ is a pseudoforest (a graph in which every connected component has at most one cycle). They also showed that $H$-Retraction is NP-complete if $H$ contains a connected component in which the reflexive vertices induce a disconnected graph.
As mentioned, we impose a (vertex-)surjectivity condition on the graph homomorphism. Such a condition can be imposed locally or globally. If we require a homomorphism $f$ from a graph $G$ to a graph $H$ to be surjective when restricted to the open neighbourhood of every vertex $u$ of $G$, we say that $f$ is an $H$-role assignment. The corresponding decision problem is called $H$-Role Assignment and its computational complexity has been fully classified [8]. We refer to the survey of Fiala and Kratochvíl [7] for further details on locally constrained homomorphisms and from here on only consider global surjectivity.
It has been shown that deciding whether a given graph $G$ allows a surjective homomorphism to a given graph $H$ is NP-complete even if $G$ and $H$ both belong to one of the following graph classes: disjoint unions of paths; disjoint unions of complete graphs; trees; connected cographs; connected proper interval graphs; and connected split graphs [9]. Hence it is natural, just as before, to fix $H$, which yields the following problem:
Surjective $H$-Colouring
Instance: a graph $G$.
Question: does there exist a surjective homomorphism from $G$ to $H$?
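The problem statement can be made concrete with a minimal brute-force sketch. It is only an illustration under stated assumptions: the function name `find_surjective_homomorphism`, the edge-list encoding (a self-loop on `x` is the pair `(x, x)`) and the small example target are hypothetical and not taken from the paper, and the search is exponential, so it is usable on tiny inputs only.

```python
from itertools import product


def find_surjective_homomorphism(g_vertices, g_edges, h_vertices, h_edges):
    """Return a vertex-surjective homomorphism G -> H as a dict, or None.

    Edges are pairs (u, v); a self-loop on x is the pair (x, x).
    Exhaustive search over all |V_H|^|V_G| mappings.
    """
    h_edge_set = {frozenset(e) for e in h_edges}
    for images in product(h_vertices, repeat=len(g_vertices)):
        if set(images) != set(h_vertices):
            continue  # some vertex of H has no preimage
        f = dict(zip(g_vertices, images))
        # Homomorphism condition: every edge of G lands on an edge of H.
        if all(frozenset((f[u], f[v])) in h_edge_set for u, v in g_edges):
            return f
    return None


# Hypothetical target: the path p - x - q with self-loops on p and q.
H_VERTICES = ["p", "x", "q"]
H_EDGES = [("p", "p"), ("q", "q"), ("p", "x"), ("x", "q")]

PATH3 = (["a", "b", "c"], [("a", "b"), ("b", "c")])
SINGLE_EDGE = (["a", "b"], [("a", "b")])
```

On this example target, a path on three vertices admits a surjective homomorphism, whereas a single edge cannot, simply because it has fewer vertices than the target.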
We emphasize that we are considering vertex-surjectivity and that being vertex-surjective is a different condition than being edge-surjective. A homomorphism $f$ from a graph $G$ to a graph $H$ is called edge-surjective or a compaction if for any edge $xy \in E_H$ with $x \neq y$ there exists an edge $uv \in E_G$ with $f(u)=x$ and $f(v)=y$. Note that the edge-surjectivity condition is not required for any self-loops $xx \in E_H$. If $f$ is a compaction from $G$ to $H$, we say that $G$ compacts to $H$. The corresponding decision problem is known as the $H$-Compaction problem. A full classification of this problem is still wide open. However, partial results are known, for example when $H$ is a reflexive cycle, an irreflexive cycle, or a graph on at most four vertices [16, 18, 19], or when $G$ is restricted to some special graph class [15]. Vikas also showed that whenever $H$-Retraction is polynomial-time solvable, then so is $H$-Compaction [18]. Whether the reverse implication holds is not known. A complete complexity classification of Surjective $H$-Colouring is also still open. Below we survey the known results.
We first consider irreflexive target graphs $H$. The Surjective $H$-Colouring problem is NP-complete for every such graph $H$ if $H$ is non-bipartite, as observed by Golovach et al. [10]. The straightforward reduction is from the corresponding $H$-Colouring problem, which is NP-complete due to the aforementioned Hell-Nešetřil dichotomy theorem. However, the complexity classifications of $H$-Colouring and Surjective $H$-Colouring do not coincide: there exist bipartite graphs $H$ for which Surjective $H$-Colouring is NP-complete, for instance when $H$ is the graph obtained from a 6-vertex cycle by adding to each of its vertices a path of length 3 [1], or when $H$ is the 6-vertex cycle itself [17].
We now consider target graphs with at least one reflexive vertex. Unlike the $H$-Colouring problem, the presence of a reflexive vertex does not make the Surjective $H$-Colouring problem trivial to solve. We call a connected graph loop-connected if all its reflexive vertices induce a connected subgraph. Golovach, Paulusma and Song [10] showed that if $H$ is a tree (in this context, a connected graph with no cycles of length at least 3) then Surjective $H$-Colouring is polynomial-time solvable if $H$ is loop-connected and NP-complete otherwise. As such the following question is natural:
Is Surjective $H$-Colouring NP-complete for every connected graph $H$ that is not loop-connected?
The reverse statement is not true (if P $\neq$ NP): Surjective $H$-Colouring is NP-complete when $H$ is the 4-vertex cycle $C_4^*$ with a self-loop in each of its vertices. This result has been shown by Martin and Paulusma [13] and independently by Vikas, as announced in [15]. Recall also that Surjective $H$-Colouring is NP-complete if $H$ is irreflexive (and thus loop-connected) and non-bipartite.
It is known that Surjective $H$-Colouring is polynomial-time solvable whenever $H$-Compaction is [1]. Recall that $H$-Compaction is polynomial-time solvable whenever $H$-Retraction is [18]. Hence, for instance, the aforementioned result of Feder, Hell and Huang [4] implies that Surjective $H$-Colouring is polynomial-time solvable if $H$ is a bi-arc graph. We also recall that $H$-Retraction is NP-complete whenever $H$ is a connected graph that is not loop-connected [5]. Hence, an affirmative answer to the above question would mean that for these target graphs $H$ the complexities of $H$-Retraction, $H$-Compaction and Surjective $H$-Colouring coincide.
In Figure 1 we display the relationships between the different problems discussed. In particular, it is a major open problem whether the computational complexities of $H$-Compaction, $H$-Retraction and Surjective $H$-Colouring coincide for each target graph $H$. Even showing this for specific cases, such as the case where $H$ is the reflexive 4-vertex cycle $C_4^*$, has proven to be non-trivial. If it is true, it would relate the Surjective $H$-Colouring problem to a well-known conjecture of Feder and Vardi [6], which states that the $H$-Constraint Satisfaction problem has a dichotomy when $H$ is some fixed finite target structure and which is equivalent to conjecturing that $H$-Retraction has a dichotomy [6]. We refer to the survey of Bodirsky, Kara and Martin [1] for more details on the Surjective $H$-Colouring problem from a constraint satisfaction point of view.
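To illustrate the difference between vertex-surjectivity and edge-surjectivity, the following sketch checks whether a given mapping is a compaction. The function name `is_compaction`, the edge-list encoding (self-loops as pairs `(x, x)`) and the example graphs are assumptions made for this illustration only.

```python
def is_compaction(f, g_edges, h_edges):
    """Check that f is a homomorphism hitting every non-loop edge of H.

    f is a dict from vertices of G to vertices of H.
    Edges are pairs (u, v); a self-loop on x is the pair (x, x).
    """
    h_edge_set = {frozenset(e) for e in h_edges}
    image = {frozenset((f[u], f[v])) for u, v in g_edges}
    if not image <= h_edge_set:
        return False  # f is not even a homomorphism
    # Self-loops of H are exempt from the edge-surjectivity requirement.
    non_loop_edges = {e for e in h_edge_set if len(e) == 2}
    return non_loop_edges <= image


# Hypothetical target: a triangle on 1, 2, 3 with a self-loop on every vertex.
TRIANGLE_EDGES = [(1, 2), (2, 3), (1, 3), (1, 1), (2, 2), (3, 3)]
PATH_EDGES = [("a", "b"), ("b", "c")]
CYCLE_EDGES = [("a", "b"), ("b", "c"), ("a", "c")]
F = {"a": 1, "b": 2, "c": 3}
```

Here `F` is vertex-surjective on both inputs, but it is a compaction only from the 3-cycle: from the path, no edge is mapped onto the edge between 1 and 3.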
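Loop-connectedness is easy to test directly. The sketch below is a minimal illustration, assuming the target graph is given as an edge list in which a self-loop on `x` is the pair `(x, x)`; the function name `is_loop_connected` is hypothetical. Following the convention used in the text, a graph with no reflexive vertex counts as loop-connected.

```python
from collections import deque


def is_loop_connected(edges):
    """True if the reflexive vertices induce a connected subgraph."""
    reflexive = {u for u, v in edges if u == v}
    if len(reflexive) <= 1:
        return True
    # Adjacency restricted to the subgraph induced by the reflexive vertices.
    adj = {v: set() for v in reflexive}
    for u, v in edges:
        if u != v and u in reflexive and v in reflexive:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(reflexive))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u] - seen:
            seen.add(w)
            queue.append(w)
    return seen == reflexive


# Path p - x - q with loops on p and q: not loop-connected.
LOOPED_PATH = [("p", "p"), ("q", "q"), ("p", "x"), ("x", "q")]
# Reflexive 4-vertex cycle: loop-connected.
REFLEXIVE_C4 = [(1, 1), (2, 2), (3, 3), (4, 4), (1, 2), (2, 3), (3, 4), (4, 1)]
```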
1.1 Our Results
We present further progress on the research question of whether Surjective $H$-Colouring is NP-complete for every connected graph $H$ that is not loop-connected. We first consider the case where the target graph $H$ is a connected graph with exactly two reflexive vertices that are non-adjacent. In Section 2 we prove that Surjective $H$-Colouring is indeed NP-complete for every such target graph $H$. In the same section we slightly generalize this result by showing that it holds even if the reflexive vertices of $H$ can be partitioned into two non-adjacent sets of twin vertices. This enables us to classify in Section 3 the computational complexity of Surjective $H$-Colouring for every graph $H$ on at most four vertices, just as Vikas [19] did for the $H$-Compaction problem. A classification of Surjective $H$-Colouring for target graphs $H$ on at most four vertices has also been announced by Vikas in [15]. As we will illustrate for one particular case, it is interesting to note that the NP-hardness proofs for $H$-Compaction in [19] may lift to NP-hardness for Surjective $H$-Colouring. However, this is not true for the reflexive cycle $C_4^*$ on four vertices, where a totally new proof was required.
1.2 Future Work
Conjecturing a dichotomy of Surjective $H$-Colouring between P and NP-complete still seems difficult. Our first goal is to prove that Surjective $H$-Colouring is NP-complete for every connected graph $H$ that is not loop-connected. However, doing this with our current techniques does not seem straightforward and we may need new hardness reductions. Another way forward is to prove polynomial equivalence between the three problems Surjective $H$-Colouring, $H$-Compaction and $H$-Retraction. However, completely achieving this goal also seems far from trivial. Our classification for target graphs $H$ up to four vertices does show such an equivalence for these cases (see Section 3).
2 Two Non-Adjacent Reflexive Vertices
We say that a graph is $k$-reflexive if it contains exactly $k$ reflexive vertices that are pairwise non-adjacent. In this section we will prove that Surjective $H$-Colouring is NP-complete whenever $H$ is connected and 2-reflexive. The problem is readily seen to be in NP. Our NP-hardness reduction uses similar ingredients as the reduction of Golovach, Paulusma and Song [10] for proving NP-hardness when $H$ is a tree that is not loop-connected. There are, however, a number of differences. For instance, we will reduce from a factor cut problem instead of the less general matching cut problem used in [10]. We will explain these two problems and prove NP-hardness for the former one in Section 2.1. Then in Section 2.2 we give our hardness reduction, and in Section 2.3 we extend our result to be valid for target graphs with more than two reflexive vertices as long as these reflexive vertices can be partitioned into two non-adjacent sets of twin vertices.
2.1 Factor Cuts
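Let $G=(V,E)$ be a connected graph. For $v \in V$ and $E' \subseteq E$, let $d_{E'}(v)$ denote the number of edges of $E'$ incident with $v$. For a partition $(V_1,V_2)$ of $V$, let $E_G(V_1,V_2)$ denote the set of edges between $V_1$ and $V_2$ in $G$.
Let $i$ and $j$ be positive integers, $i \geq j$. Let $(V_1,V_2)$ be a partition of $V$ and let $M=E_G(V_1,V_2)$. Then $(V_1,V_2)$ is an $(i,j)$-factor cut of $G$ if, for all $v \in V_1$, $d_M(v) \leq i$, and, for all $v \in V_2$, $d_M(v) \leq j$. Observe that if a vertex $v$ exists with degree at most $j$, then there is a trivial $(i,j)$-factor cut $(V \setminus \{v\}, \{v\})$. Two distinct vertices $s$ and $t$ in $V$ are $(i,j)$-factor roots of $G$ if, for each $(i,j)$-factor cut $(V_1,V_2)$ of $G$, $s$ and $t$ belong to different parts of the partition and, if $i>j$, $s \in V_1$ and $t \in V_2$ (of course, if $i=j$, we do not require the latter condition as $(V_2,V_1)$ is also an $(i,j)$-factor cut). We note that when no $(i,j)$-factor cut exists, every pair of vertices is a pair of $(i,j)$-factor roots. We define the following decision problem.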
$(i,j)$-Factor Cut with Roots
Instance: a connected graph $G$ with roots $s$ and $t$.
Question: does $G$ have an $(i,j)$-factor cut?
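The factor cut condition can be checked mechanically for a given partition. The sketch below is an illustration only: the function name `is_factor_cut` and the symbols `i` and `j` for the two degree bounds are assumptions of this example, as is the requirement that both parts of the partition are non-empty and together contain every endpoint of the edge list.

```python
def is_factor_cut(edges, part1, part2, i, j):
    """Check the factor cut condition for the partition (part1, part2).

    Every vertex of part1 may be incident with at most i cut edges and
    every vertex of part2 with at most j cut edges.
    """
    part1, part2 = set(part1), set(part2)
    if not part1 or not part2 or part1 & part2:
        return False  # not a partition into two non-empty parts
    cut_degree = {v: 0 for v in part1 | part2}
    for u, w in edges:
        if (u in part1) != (w in part1):  # the edge crosses the partition
            cut_degree[u] += 1
            cut_degree[w] += 1
    return (all(cut_degree[v] <= i for v in part1)
            and all(cut_degree[v] <= j for v in part2))


# A path a - b - c - d and a star with centre c and leaves x, y, z.
PATH = [("a", "b"), ("b", "c"), ("c", "d")]
STAR = [("c", "x"), ("c", "y"), ("c", "z")]
```

With both bounds equal to 1, the condition says that the cut edges form a matching; the path split in the middle satisfies it, while the star split at its centre needs a bound of 3 on the centre's side.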
We emphasize that the $(i,j)$-factor roots are given as part of the input. That is, the problem asks whether or not an $(i,j)$-factor cut exists, but we know already that if it does, then $s$ and $t$ belong to different parts of the partition. That is, we actually define $(i,j)$-Factor Cut with Roots to be a promise problem in which we assume that if an $(i,j)$-factor cut exists then it has the property that $s$ and $t$ belong to different parts of the partition. The promise class may not itself be polynomially recognizable but one may readily find a subclass of it that is polynomially recognizable and includes all the instances we need for NP-hardness. In fact this will become clear when reading our proof but we refer also to [10] where such a subclass is given for the case $i=j=1$. A $(1,1)$-factor cut $(V_1,V_2)$ of $G$ is also known as a matching cut, as no two edges in $E_G(V_1,V_2)$ have a common end-vertex, that is, $E_G(V_1,V_2)$ is a matching. Similarly $(1,1)$-Factor Cut with Roots is known as Matching Cut with Roots and was proved NP-complete by Golovach, Paulusma and Song [10] (by making an observation about the proof of the result of Patrignani and Pizzonia [14] that deciding whether or not any given graph has a matching cut is NP-complete).
We will prove the NP-completeness of $(i,j)$-Factor Cut with Roots after first presenting a helpful lemma (a clique is a subset of vertices of $G$ that are pairwise adjacent to each other).
Lemma 1
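Let $i$, $j$ and $k$ be positive integers where $i \geq j$ and $k \geq i+j+1$. Let $G$ be a graph that contains a clique $K$ on $k$ vertices. Then, for every $(i,j)$-factor cut $(V_1,V_2)$ of $G$, either $K \subseteq V_1$ or $K \subseteq V_2$.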
Proof
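If the lemma is false, then for some $(i,j)$-factor cut $(V_1,V_2)$, we can choose $v_1 \in K \cap V_1$ and $v_2 \in K \cap V_2$. Let $M=E_G(V_1,V_2)$. Since every vertex in $K \cap V_2$ is linked by an edge of $M$ to $v_1$ and every vertex in $K \cap V_1$ is linked by an edge of $M$ to $v_2$, we have $d_M(v_1)+d_M(v_2) \geq k \geq i+j+1$, contradicting the definition of an $(i,j)$-factor cut.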
Theorem 2.1
Let $i$ and $j$ be positive integers, $i \geq j$. Then $(i,j)$-Factor Cut with Roots is NP-complete.
Proof
If $i=j=1$, then the problem is Matching Cut with Roots which, as we noted, is known to be NP-complete [10]. We split the remaining cases in two according to whether or not $j=1$. In each case, we construct a polynomial-time reduction from Matching Cut with Roots. In particular, we take an instance $(G,s,t)$ of Matching Cut with Roots, construct a graph $G'$ that is a supergraph of $G$, and show that

(1) $(G',s,t)$ is an instance of $(i,j)$-Factor Cut with Roots (that is, if $G'$ has an $(i,j)$-factor cut $(V_1',V_2')$, then $s \in V_1'$ and $t \in V_2'$ or, possibly, vice versa if $i=j$),

(2) if $G'$ has an $(i,j)$-factor cut, then $G$ has a matching cut, and

(3) if $G$ has a matching cut, then $G'$ has an $(i,j)$-factor cut.
We note that (1) is an atypical feature of an NPcompleteness proof as, unusually for Factor Cut with Roots, it is not immediate to recognize a problem instance. We let .
Case 1: .
Let .
Construct from by first adding a complete graph on vertices and adding edges from to every vertex of . Then, for each , add edges from to vertices of in such a way that no vertex of has more than one neighbour in .
Let be a factor cut of . The vertices of induce a clique on vertices. So, by Lemma 1, or .
Suppose that . Then must contain vertices of both (otherwise would be empty) and (at least ). Thus, as is connected, we can find a vertex that has a neighbour in . But also has neighbours in and so has at least 2 neighbours in , contradicting the definition of a factor cut.
So we must have that . Let and be a partition of , and let and and notice that is the union of and, for each , the edges from to . For each , . For each , . So is a matching cut of ; this proves (2). And as , we have, by the definition of factor roots, ; this proves (1).
To prove (3), we note that if is a matching cut of , then we can assume that and (else relabel them for the purpose of constructing ), and then is a factor cut of .
Case 2: .
Let .
Construct from by first adding a complete graph on vertices and adding edges from to every vertex of , and then adding a complete graph on vertices and adding edges from to every vertex of . Then, for each , add edges from to vertices of in such a way that no vertex of has more than one neighbour in . Afterwards, for each , add edges from to vertices of in such a way that no vertex of has more than one neighbour in .
Let be an factor cut of . The vertices of induce a clique on at least vertices. So, by Lemma 1, or . Similarly or .
Suppose that and are both subsets of . Then must contain vertices of both (at least and ) and (else it would be empty). Thus, as is connected, we can find a vertex that has a neighbour in . But also has neighbours in and neighbours in and so has at least neighbours in , contradicting the definition of an factor. By an analogous argument and cannot both be subsets of .
Suppose that and . As is connected and contains vertices of both and , we can find a vertex that has a neighbour in . But also has neighbours in and so has more than neighbours in , contradicting the definition of a factor.
Thus we have that and are subsets of separate parts and, moreover, either or . Thus (1) is proved, and we have, in either case, that each vertex in is joined by edges to vertices in , and each vertex in is joined by edges to vertices in . Therefore each vertex in is joined to at most one vertex in , and each vertex in is joined to at most one vertex in . Thus is a matching cut of . This proves (2).
To prove (3), we note that if is a matching cut of , then we can assume that and (else relabel them for the purpose of constructing ), and then is an factor cut of .
2.2 The Hardness Reduction
Let $H$ be a connected 2-reflexive target graph. Let $p$ and $q$ be the two (non-adjacent) reflexive vertices of $H$. The length of a path is its number of edges. The distance between two vertices $u$ and $v$ in a graph $G$ is the length of a shortest path between them and is denoted $\mathrm{dist}_G(u,v)$. We define two induced subgraphs $H_1$ and $H_2$ of $H$ whose vertex sets partition $V_H$. First $H_1$ contains those vertices of $H$ that are closer to $p$ than to $q$; and $H_2$ contains those vertices that are at least as close to $q$ as to $p$ (so $H_2$ contains any vertex equidistant to $p$ and $q$). That is, $V_{H_1} = \{v \in V_H : \mathrm{dist}_H(v,p) < \mathrm{dist}_H(v,q)\}$ and $V_{H_2} = \{v \in V_H : \mathrm{dist}_H(v,q) \leq \mathrm{dist}_H(v,p)\}$. See Figure 2 for an example. The following lemma follows immediately from our assumption that $H$ is connected.
Lemma 2
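Both $H_1$ and $H_2$ are connected. Moreover, $\mathrm{dist}_{H_1}(x,p) = \mathrm{dist}_H(x,p)$ for every $x \in V_{H_1}$ and $\mathrm{dist}_{H_2}(x,q) = \mathrm{dist}_H(x,q)$ for every $x \in V_{H_2}$.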
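The split of the target graph into the part closer to one reflexive vertex and the part at least as close to the other can be computed with two breadth-first searches. The sketch below is only an illustration: the names `p` and `q` for the two reflexive vertices, the function names, and the adjacency-dictionary encoding are assumptions of this example, and the target is assumed to be connected.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distances (in edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def split_by_reflexive_vertices(adj, p, q):
    """Return (part1, part2): vertices strictly closer to p, and all the rest."""
    dist_p = bfs_distances(adj, p)
    dist_q = bfs_distances(adj, q)
    part1 = {v for v in adj if dist_p[v] < dist_q[v]}
    part2 = set(adj) - part1  # includes vertices equidistant to p and q
    return part1, part2


# Hypothetical target: path p - a - b - c - q (self-loops do not affect distances).
ADJ = {"p": ["a"], "a": ["p", "b"], "b": ["a", "c"], "c": ["b", "q"], "q": ["c"]}
```

In this example the middle vertex `b` is equidistant to both ends and therefore lands in the second part.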
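Let $\omega$ denote the size of a largest clique in $H$. From graphs $H_1$ and $H_2$ we construct graphs $F_1$ and $F_2$, respectively, in the following way:

for each $x \in V_H \setminus \{p,q\}$, create a vertex $t_x$;

for $p$, create $\omega$ vertices $t_p^1, \ldots, t_p^{\omega}$;

for $q$, create $\omega$ vertices $t_q^1, \ldots, t_q^{\omega}$;

for $a \in \{1,2\}$, add an edge in $F_a$ between any two vertices $t_x$ and $t_y$ (ignoring superscripts) if and only if $xy$ is an edge of $H_a$.
We note that $F_1$ is the graph obtained by taking $H_1$ and replacing $p$ by a clique of size $\omega$. Similarly, $F_2$ is the graph obtained by taking $H_2$ and replacing $q$ by a clique of size $\omega$. We say that $t_p^1, \ldots, t_p^{\omega}$ are the roots of $F_1$ and that $t_q^1, \ldots, t_q^{\omega}$ are the roots of $F_2$. Figure 3 shows an example of the graphs $F_1$ and $F_2$ obtained from the graph $H$ in Figure 2.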
Let denote the distance between and . Let be the set of neighbours of that are each on some shortest path (thus of length ) from to in . Let be the size of a largest clique in . We define and similarly. We will reduce from Factor Cut with Roots, which is NPcomplete due to Theorem 2.1. Hence, consider an instance of Factor Cut with Roots, where is a connected graph and and form the (ordered) pair of factor roots of . Recall that we assume that is irreflexive.
We say that we identify two vertices and of a graph when we remove them from the graph and replace them with a single vertex that we make adjacent to every vertex that was adjacent to or . From , , and we construct a new graph as follows:

For each edge , we do as follows. We create four vertices, , , and . We also create two paths and , each of length , between and , and between and , respectively. If we identify and and and to get paths of length 0.

For each vertex , we do as follows. First we construct a clique on vertices. We denote these vertices by . We then make every vertex in adjacent to both and for every edge incident to ; we call and a red and blue neighbour of , respectively; if , then the vertex obtained by identifying two vertices and , or and is simultaneously a red neighbour of one clique and a blue neighbour of another one. Finally, for every two edges and incident to , we make and adjacent, that is, the set of red neighbours of form a clique, whereas the set of blue neighbours form an independent set.

We add by identifying and for , and we add by identifying and for . We denote the vertices in and in by their label in or .
See Figure 4 for an example of a graph .
The next lemma describes a straightforward property of graph homomorphisms that will prove useful.
Lemma 3
If there exists a homomorphism $h: G' \to H$ then $\mathrm{dist}_{G'}(u,v) \geq \mathrm{dist}_H(h(u),h(v))$ for every pair of vertices $u,v \in V_{G'}$.
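Reading the lemma as the standard fact that a homomorphism never increases distances, the property can be checked on small examples. The sketch below is only a sanity check under that reading; the function names, the adjacency-dictionary encoding and the example graphs are assumptions made for illustration.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distances (in edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def distances_never_increase(g_adj, h_adj, h):
    """True if dist_H(h(u), h(v)) <= dist_G(u, v) for all connected pairs u, v."""
    for u in g_adj:
        dist_g = bfs_distances(g_adj, u)
        dist_h = bfs_distances(h_adj, h[u])
        for v, d in dist_g.items():
            if dist_h.get(h[v], float("inf")) > d:
                return False
    return True


# Input graph: path a - b - c. Target: path p - x - q (loops on p and q omitted,
# since self-loops do not change distances).
G_ADJ = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
H_ADJ = {"p": ["x"], "x": ["p", "q"], "q": ["x"]}
GOOD = {"a": "p", "b": "x", "c": "q"}  # a homomorphism
BAD = {"a": "p", "b": "q", "c": "x"}   # maps the edge ab to the non-edge pq
```

The check passes for the homomorphism `GOOD` and fails for `BAD`, which stretches an edge of the input graph to distance 2 in the target.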
We now prove the key property of our construction.
Lemma 4
For every homomorphism from to , there exists at least one clique with and at least one clique with .
Proof
Since for each and any edge incident to , every clique in is of size at least , we find that must map at least two of its vertices to a reflexive vertex, so either to or . Hence, for every , we find that maps at least one vertex of to either or .
We prove the lemma by contradiction. We will assume that does not map any vertex of any to , thus for all . We will note later that if instead for all we can obtain a contradiction in the same way.
We consider two vertices and such that . Without loss of generality let . We shall refer to these vertices as and respectively. We now consider a vertex . By Lemma 3, and . In other words:
In fact by applying Lemma 3 we can generalize this further to any vertex mapped to by :
(1) 
For every we define a value as follows:
Claim 1
for all .
We prove Claim 1 by showing that , which suffices due to (1). First suppose . We may assume, without loss of generality, that . So , as .
Now suppose . Then either belongs to a clique or is a vertex of a path or between two cliques. If belongs to a clique or is an endvertex of such a path, then is either in or adjacent to a vertex in (since at least one vertex in maps to ). Hence . Finally, suppose is an inner vertex of a path or . By definition, such a path has length . Then is at most distance from a vertex in a clique, which we know is either in or adjacent to a vertex in . Hence . This proves Claim 1.
Claim 2
If there exists a surjective homomorphism from to , then for any integer :
We prove Claim 2 as follows. Using the fact that with a surjective homomorphism every vertex must be mapped to, we see from Lemma 3 that if there are vertices in which are at a distance from , there must be at least vertices in that are at distance at least from every vertex that maps to . This means we can say for any distance :
Combining this inequality with Claim 1 yields, for every distance :
Now let . Then we only have to consider vertices in . Hence, for every :
By construction, for any with we have that and thus . Therefore, no vertex with is involved in the equation above, so we can write:
Hence Claim 2 is proven.
We first present the intuition behind the final part of the proof. Consider the graphs , and in the example shown in Figure 5. We recall that every vertex (other than or ) has a single corresponding vertex in or . We may naturally want to map the vertices of onto the vertices of , which is possible by definition of . However, when we try to map the vertices of onto the vertices of , with (for some ), we will prove that there is at least one vertex in which is further from in than it is from and that cannot be mapped to and thus violates the surjectivity constraint. In Figure 5 this vertex, which will play a special role in our proof, is shown in red. In the example of this figure, and we observe that there are ten vertices in (including ) with but only nine vertices (excluding ) in with which could be mapped to these vertices. This contradicts Claim 2.
We now formally prove that our initial assumption that for all contradicts Claim 2. For every vertex in there is a corresponding vertex such that , where the latter equality follows from the construction of . From Lemma 2 we find that for every . Hence , and for all :
(2) 
Now let . Using the same arguments, we see that , and thus by definition. Note that, had we instead supposed that it was to which everything mapped, we would instead have a strict inequality. As it turns out, we only need the weaker inequality.
We now look for a vertex in , such that is as far from as possible, subject to the condition that . Let . We see that for any vertex in such that , it is the case that . Note that there may be no vertices with in which case is simply the farthest vertex from within . We also observe that is possible. So <