On the Computational Complexity of
Multi-Agent Pathfinding on Directed Graphs
Abstract
The computational complexity of multi-agent pathfinding on directed graphs has been an open problem for many years. For undirected graphs, solvability can be decided in polynomial time, as was shown already in the eighties. Recently, it has also been shown that a special case on directed graphs is solvable in polynomial time. In this paper, we show that the problem is NP-hard in the general case. In addition, some upper bounds are proven.
1 Introduction
The multi-agent pathfinding (MAPF) problem is the problem of deciding the existence of a movement plan for a set of agents moving on a graph, most often a graph generated from a grid [5]. An example is provided in Figure 1.
Here, the circular agent wants to move to and the square agent wants to move to . Both want to reach their destination and then stay there. So, could move to and then to . After that could move to its destination . So, in this case, a movement plan does exist. Note that for this graph, regardless of how we place the agents and the destinations, there is always a movement plan, provided the destinations are on different grid fields. When removing , however, there are configurations for which no movement plan is possible.
Kornhauser et al. [4] showed already in the eighties that deciding solvability is a polynomial-time problem. Later on, variations of the problem have been studied, such as using parallel movements and considering optimal movement plans [10, 12, 6, 3]. However, in almost all cases, the results apply to undirected graphs only. A notable exception is the paper by Botea et al. [1], which shows polynomial-time decidability for MAPF on directed graphs, provided the graph is strongly biconnected and there are at least two unoccupied vertices. The general case, however, has been open so far.
In a similar vein, Wu and Grumbach [11] generalized the robot movement problem on an undirected graph, as introduced by Papadimitriou et al. [9], to directed graphs. The robot movement problem is the problem of finding a plan to move a robot from one vertex to another, whereby mobile obstacles on vertices can be moved around but are not allowed to collide. Wu and Grumbach showed that solvability can be decided in polynomial time if the graph is either acyclic or strongly connected. In their conclusion they suggested studying the more difficult problem in which all mobile obstacles are themselves agents, which again is the MAPF problem on directed graphs.
We address this open problem by showing that the MAPF problem on directed graphs, which we will call diMAPF, is NP-hard. Interestingly, proving completeness for this problem seems to be quite non-trivial, and we will only provide a general upper bound, a result for the special case of acyclic directed graphs, and a conditional result.
2 Notation and Terminology
A graph $G$ is a tuple $(V, E)$ with $E \subseteq \{\{u, v\} \mid u, v \in V\}$. The elements of $V$ are called vertices and the elements of $E$ are called edges. A directed graph or digraph $D$ is a tuple $(V, A)$ with $A \subseteq V \times V$. The elements of $V$ are called vertices, the elements of $A$ arcs. Given a digraph $D = (V, A)$, the underlying graph of $D$, in symbols $\mathcal{G}(D)$, is the graph resulting from ignoring the direction of the arcs, i.e., $\mathcal{G}(D) = (V, \{\{u, v\} \mid (u, v) \in A\})$. We assume all graphs and digraphs to be simple, i.e., not containing any self-loops of the form $\{u\}$, resp. $(u, u)$.
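Given a digraph $D = (V, A)$ (or a graph $G = (V, E)$), the digraph $D' = (V', A')$ (resp. graph $G' = (V', E')$) is called a subdigraph of $D$ (resp. subgraph of $G$) if $V' \subseteq V$ and $A' \subseteq A$ (resp. $E' \subseteq E$). Let again $D = (V, A)$ be a digraph (or $G = (V, E)$ a graph) and let $X \subseteq V$. Then by $D - X$ (resp. $G - X$) we refer to the subdigraph (resp. subgraph) obtained by deleting the vertices in $X$ together with all arcs (resp. edges) incident to them.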
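A path in a digraph $D = (V, A)$ (or a graph $G = (V, E)$) is a non-empty sequence of vertices and arcs (resp. edges) of the form $v_0, e_1, v_1, \ldots, e_k, v_k$ such that $v_i \in V$ for all $0 \le i \le k$, $v_i \ne v_j$ for all $0 \le i < j \le k$, and $e_i = (v_{i-1}, v_i) \in A$ (resp. $e_i = \{v_{i-1}, v_i\} \in E$) for all $1 \le i \le k$. A cycle in a digraph $D = (V, A)$ (or a graph $G = (V, E)$) is a non-empty sequence of vertices $v_0, v_1, \ldots, v_k$ with $k \ge 2$ (resp. $k \ge 3$) such that $v_0 = v_k$, $(v_i, v_{i+1}) \in A$ (resp. $\{v_i, v_{i+1}\} \in E$) for all $0 \le i < k$, and $v_i \ne v_j$ for all $0 \le i < j < k$. If a digraph does not contain any cycle, it is called a directed acyclic graph (DAG).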
A graph $G = (V, E)$ is said to be connected if there is a path between each pair of distinct vertices. It is biconnected if $G - \{v\}$ is connected for each $v \in V$. Similarly, a digraph $D$ is weakly connected if the underlying graph $\mathcal{G}(D)$ is connected. It is strongly connected if for every pair of distinct vertices $u, v$, there is a path in $D$ from $u$ to $v$ and one from $v$ to $u$. The smallest strongly connected digraph is the one with one vertex and no arcs. A digraph is called strongly biconnected if it is strongly connected and the underlying graph is biconnected.
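The connectivity notions above can be checked directly from their definitions. The following is a minimal sketch, assuming a non-empty vertex set and arcs given as pairs; the function names are hypothetical and the biconnectivity test is the naive one (delete each vertex in turn), not an optimized algorithm.

```python
def reachable(start, adjacency, removed=frozenset()):
    """Return the set of vertices reachable from `start`, avoiding `removed`."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for w in adjacency.get(node, ()):
            if w not in seen and w not in removed:
                seen.add(w)
                stack.append(w)
    return seen


def is_strongly_connected(vertices, arcs):
    """Every vertex reaches, and is reached from, an arbitrary root."""
    vertices = set(vertices)  # assumed non-empty
    succ, pred = {}, {}
    for u, v in arcs:
        succ.setdefault(u, []).append(v)
        pred.setdefault(v, []).append(u)
    root = next(iter(vertices))
    return reachable(root, succ) == vertices and reachable(root, pred) == vertices


def is_biconnected(vertices, edges):
    """Connected, and still connected after deleting any single vertex.
    Pairs in `edges` are treated as undirected."""
    vertices = set(vertices)
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)

    def connected(removed):
        rest = vertices - removed
        if not rest:
            return True
        return reachable(next(iter(rest)), adj, removed) == rest

    return connected(frozenset()) and all(
        connected(frozenset([v])) for v in vertices
    )


def is_strongly_biconnected(vertices, arcs):
    """Strongly connected, with a biconnected underlying graph."""
    return is_strongly_connected(vertices, arcs) and is_biconnected(vertices, arcs)
```

For instance, a directed triangle is strongly biconnected, whereas two directed triangles sharing a single vertex are strongly connected but not strongly biconnected, because deleting the shared vertex disconnects the underlying graph.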
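The strongly connected components of a digraph $D$ are the maximal subdigraphs of $D$ that are strongly connected. The condensation of a digraph $D$ is the digraph $C(D)$ whose vertices are the strongly connected components $C_1, \ldots, C_m$ of $D$, with an arc from $C_i$ to $C_j$ ($i \ne j$) whenever $D$ contains an arc from a vertex of $C_i$ to a vertex of $C_j$. Note that $C(D)$ is a DAG.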
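To make the notion of a condensation concrete, the following sketch computes strongly connected components and the condensation of a small digraph. It uses Kosaraju's two-pass algorithm, which is one standard choice and not something prescribed by the text; all function names are hypothetical.

```python
from collections import defaultdict


def strongly_connected_components(vertices, arcs):
    """Kosaraju's algorithm: map each vertex to the id of its component."""
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v in arcs:
        succ[u].append(v)
        pred[v].append(u)

    # Pass 1: iterative DFS on the digraph, recording finishing order.
    order, seen = [], set()
    for root in vertices:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(succ[root]))]
        while stack:
            node, it = stack[-1]
            nxt = next((w for w in it if w not in seen), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                seen.add(nxt)
                stack.append((nxt, iter(succ[nxt])))

    # Pass 2: DFS on the reversed digraph in decreasing finishing order.
    comp, count = {}, 0
    for root in reversed(order):
        if root in comp:
            continue
        comp[root] = count
        stack = [root]
        while stack:
            node = stack.pop()
            for w in pred[node]:
                if w not in comp:
                    comp[w] = count
                    stack.append(w)
        count += 1
    return comp


def condensation(vertices, arcs):
    """Return (component map, component ids, arcs between distinct components)."""
    comp = strongly_connected_components(vertices, arcs)
    c_vertices = set(comp.values())
    c_arcs = {(comp[u], comp[v]) for u, v in arcs if comp[u] != comp[v]}
    return comp, c_vertices, c_arcs
```

On the digraph with arcs a→b, b→a, b→c, c→d, d→c, the components are {a, b} and {c, d}, and the condensation consists of two vertices joined by a single arc.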
A multi-agent pathfinding (MAPF) instance is given by a graph $G = (V, E)$, a set of agents $R$ with $|R| \le |V|$, an injective initial state function $s_0 \colon R \to V$, and an injective goal state function $s_* \colon R \to V$. The vertex $s_*(r)$ is called the destination of agent $r$. Given a state function $s$, one possible successor state is the function $s'$ such that one agent moves from one vertex to an adjacent vertex: If $s(r) = u$, $\{u, v\} \in E$, and there is no $r'$ such that $s(r') = v$, then the successor state $s'$ is identical to $s$ except at the point $r$, where $s'(r) = v$. The MAPF problem is then to decide whether there exists a sequence of moves that transforms $s_0$ into $s_*$.
Multi-agent pathfinding on directed graphs (diMAPF) is similar to MAPF, except that we have a directed graph $D = (V, A)$ and the moves have to follow the direction of an arc, i.e., if there is an arc $(u, v) \in A$ but $(v, u) \notin A$, then an agent can move from $u$ to $v$ but not vice versa.
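The successor relation and the decision problem can be illustrated by a brute-force search. The sketch below is an assumption-laden illustration, not an algorithm from the literature: states are tuples indexed by agent, the helper names are hypothetical, and the breadth-first search explores a state space that is exponential in the number of agents, so it is only usable on tiny instances.

```python
from collections import deque


def successors(state, arcs):
    """Yield every state obtained by moving one agent along one arc
    into an unoccupied vertex. state[i] is the vertex of agent i."""
    occupied = set(state)
    for agent, u in enumerate(state):
        for a, b in arcs:
            if a == u and b not in occupied:
                yield state[:agent] + (b,) + state[agent + 1:]


def solve_dimapf(arcs, init, goal):
    """Breadth-first search over states; returns a shortest sequence of
    states from init to goal, or None if the instance is unsolvable."""
    init, goal = tuple(init), tuple(goal)
    parent = {init: None}
    queue = deque([init])
    while queue:
        state = queue.popleft()
        if state == goal:
            plan = []
            while state is not None:
                plan.append(state)
                state = parent[state]
            return plan[::-1]
        for nxt in successors(state, arcs):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


def undirected(edges):
    """Encode an undirected graph as a digraph with arcs in both directions."""
    return [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
```

With the single arc (0, 1), one agent on vertex 1, and destination 0, `solve_dimapf` returns `None`, whereas the same instance on `undirected([(0, 1)])` is solved in one move. This is the smallest case in which the direction of arcs changes solvability.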
We assume that the reader is familiar with basic notions from computational complexity theory [8].
3 A Lower Bound for diMAPF
As mentioned above, Kornhauser et al. [4] have shown that deciding MAPF (on undirected graphs) is a polynomial-time problem and that movement plans have only cubic length in the number of vertices. Botea et al. [1] have shown that deciding solvability of diMAPF is again a polynomial-time problem and that plans have cubic length, provided the digraph is strongly biconnected and there are at least two empty vertices. One intuitive reason for these positive results is that on graphs and strongly biconnected digraphs one can usually restore earlier sub-configurations. This means that agents can move out of the way and then back to where they were earlier. In a digraph without strong connectivity, moves are not necessarily reversible and an agent might paint itself into a corner. Given that in every state there are different possible moves for one agent, it might be hard to decide which move will not end up blocking another agent in the future. As a matter of fact, this is the case in the reduction from 3SAT that we use in the proof of the following theorem.
Theorem 1.
The diMAPF problem is NP-hard.
Proof.
We prove NP-hardness by a reduction from the 3SAT problem, the problem of deciding satisfiability for a formula in conjunctive normal form with 3 literals in each clause. Let us assume a 3SAT instance, consisting of variables and clauses with 3 literals each.
Now we construct a diMAPF instance as follows (this reduction draws inspiration from a reduction that has been used to show PSPACE-hardness for a generalized version of MAPF [7]). The set of agents is:
The ’s are called variable agents, the ’s are named shadow agents, the ’s are called clause agents, and the ’s are called filler agents. The set of vertices of the digraph is constructed as follows:
We proceed by constructing three gadgets, which we call sequencer, clause evaluator, and collector, respectively. We illustrate the construction using the example in Figure 2. In this visualization, vertices occupied by an agent are shown as squares containing the name of the occupying agent. Black circles symbolize empty vertices. Each vertex is labelled by its identifier, perhaps followed by a colon and the name of an agent in order to symbolize the destination for this agent. For example, is the destination for agent .
The task of the sequencer is to enforce first the sequence of truth-value choices of the variable agents . Each of the variable agents has to go to one of the vertices or , and these are the only vertices can go to. After that the filler and clause agents can move to the left and the clause agents can start to go through the clause evaluator. The clause evaluator is created in a way so that a clause agent can move through it from right to left, provided one of the literals of the corresponding clause is true according to the truth-value choices made by the variable agents. Finally, the collector contains the destination vertices for all clause agents and for the shadow agents . First the clause agents need to get to their destinations, then the shadow agents can arrive at their goals, making room for the variable agents to move to their final destinations.
The sequencer consists of a subgraph with vertices, which are named to . These vertices are connected linearly, i.e., there is an arc from to . The vertices to are occupied by variable agents named to . In addition we have clause agents on the vertices , respectively. The rest of the vertices are filled with filler agents for all the not yet occupied vertices. The destination for each filler agent is the vertex with an index lower than the one is starting from. These filler agents are necessary to enforce that the clause agents enter the clause evaluator only after the variable agents have made their choices.
The clause evaluator contains for each variable one pair of vertices: and These vertices represent the truth assignment choices false and true, respectively, for . In addition, there exists an additional vertex , which can be reached from both and and which is the destination for agent and initially occupied by the shadow agent . This enforces the variable agent to move to or once it has reached .
Once all the agents have reached their vertices or , the remaining agents in the sequencer can move vertices to the left, i.e., from to bringing all the filler agents to their respective destinations. Further, all clause agents have to go from to , whereby these latter vertices are connected to the clause evaluator in the following way. The vertex , which will hold clause agent after all agents moved steps to the left, is connected to iff the clause contains positively and it is connected to iff contains negated. This means that the clause agent can pass to if and only if one of the variable agents participating in the clause made the “right” choice.
Finally, the collector gadget provides the destinations for all the clause agents and the shadow agents . The vertices , , and all lead to the vertex , which is the destination of the shadow agent . Starting at this node, we have a linearly connected path up to vertex from which can be reached, which in turn is a linear path to . This implies that first all clause agents have to reach their destination vertices, after which the shadow agents can move to their destinations. Only after all this has happened, the variable agents can move to their destinations .
By the construction, a successful movement plan will contain the following phases:

1. In the first phase the variable agents will move to the vertices or . Which vertex moves to can be interpreted as making a choice on the truth value of the variable. Note that no other vertices are possible, because then the final destination would not be reachable any more for .
2. In the second phase, all filler and clause agents move vertices to the left in the sequencer gadget.
3. After phase 2 has finished, all clause agents occupy vertices , from which they can pass through the clause evaluator gadget. By construction, they can pass through it if and only if for one of the variables occurring in clause , the variable agent has made a choice in phase 1 corresponding to making the clause true. Note that no other group of agents can move in this phase, since otherwise they would not be able to reach their destinations or would block the clause agents. The phase ends when all clause agents have reached their destinations.
4. After the end of phase 3, the shadow agents move to their respective destinations, enabling the variable agents to go to their destinations.
5. Finally, all variable agents can move to their destinations, finalizing the movement plan.
Note that in a successful plan, some of the phases could overlap. However, one could easily disentangle them. The critical phases are evidently phase 1 and phase 3. Phase 3 is only successful if in phase 1 the variable agents made their choices in such a way that all clauses are satisfied. In other words, the existence of a successful movement plan implies that there is a satisfying truth-value assignment for the CNF formula. Conversely, if there exists a satisfying truth-value assignment, then it can be used to generate a successful movement plan by making the corresponding choices in phase 1. Since the construction is clearly polynomial in the size of the 3SAT instance, it is a polynomial many-one reduction, proving that diMAPF is NP-hard. ∎
4 Upper Bounds for diMAPF
While the result of the previous section demonstrates that diMAPF is more difficult than MAPF (provided $\mathrm{P} \ne \mathrm{NP}$), it leaves open how much more difficulty is introduced by moving from undirected to directed graphs. Although one might suspect that diMAPF is just NP-complete, this is by no means obvious. The main obstacle in proving this is the fact that the state space of the diMAPF problem is exponential. Nevertheless, it cannot be more complex than the propositional STRIPS planning problem [2], which has a similar state space.
Proposition 2.
The diMAPF problem is in PSPACE.
Proof.
A movement plan from the initial state to a goal state, if one exists, can be generated nondeterministically using for each step only polynomial space. This means that the problem is in NPSPACE, which is identical to PSPACE, which proves the claim. ∎
However, it is by no means obvious that one has to go through a significant part of the state space in order to arrive at the goal configuration, if this is possible at all. In particular, in cases similar to the one used in the proof of Theorem 1, it seems obvious that the number of moves is bounded polynomially.
Proposition 3.
The diMAPF problem on DAGs is NP-complete.
Proof.
In a DAG, each agent can make at most $|V|$ moves, since the agent can never visit a vertex twice. As there are at most $|V|$ agents, this means that overall no more than $|V|^2$ moves are possible. This implies that all solutions have a length bounded by a polynomial in the input size, implying that the problem is in NP. Together with Theorem 1, this implies the claim. ∎
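Membership in NP rests on the fact that a polynomially long movement plan can be checked efficiently. The following sketch shows such a certificate check under stated assumptions: the plan is given as a list of (agent, target vertex) pairs, states are dictionaries from agents to vertices, and the function name `verify_plan` is hypothetical.

```python
def verify_plan(arcs, init, goal, moves):
    """Check a diMAPF certificate in time linear in the plan length
    (after building the arc set). `init` and `goal` map agents to
    vertices; `moves` is a list of (agent, target_vertex) pairs."""
    arc_set = set(arcs)
    position = dict(init)
    occupied = set(position.values())
    if len(occupied) != len(position):
        return False  # initial state is not injective
    for agent, target in moves:
        source = position.get(agent)
        if source is None:
            return False  # unknown agent
        if (source, target) not in arc_set:
            return False  # move does not follow an arc
        if target in occupied:
            return False  # target vertex is not empty
        occupied.remove(source)
        occupied.add(target)
        position[agent] = target
    return position == dict(goal)
```

On the path a→b→c with agent r1 on b and agent r2 on a, the plan "r1 to c, then r2 to b" is accepted, while the same two moves in the opposite order are rejected because b is still occupied when r2 tries to enter it.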
When looking at what stops us from proving a general NP-completeness result, we notice that strongly connected components are the culprits. They allow agents to reach the same location twice, with the other agents in a possibly different configuration. This may imply that a particular configuration can only be reached when agents walk through exponentially many distinct configurations. We know from Botea et al. [1] that for all strongly biconnected digraphs with at least two empty vertices, all configurations can be reached using only cubically many moves. If we allow for only one empty vertex, solution existence can no longer be guaranteed [1], and it is no longer clear whether a polynomially long sequence suffices, if the instance is solvable at all. If we further weaken the requirement to only strongly connected digraphs, it is clear neither whether solvability can be decided in polynomial time nor whether movement sequences can be bounded polynomially, although the latter sounds very plausible. For this reason, we will assume it for now and call it the short solution hypothesis for strongly connected digraphs: “For each solvable diMAPF instance on strongly connected digraphs, there exists a movement plan of polynomial length.”
Theorem 4.
If the short solution hypothesis for strongly connected digraphs is true, then diMAPF is NP-complete.
Proof.
NP-hardness follows from Theorem 1.
Assume a diMAPF instance on a digraph $D$ that is solvable, which implies that there exists a movement plan for the agents on $D$. This plan may be arbitrarily long. Consider now each strongly connected component in isolation and focus on the events when an agent enters the component, leaves the component, or moves to its final destination in the component without moving afterwards. In each component there can only be polynomially many such events because the condensation of $D$ is a DAG, so an agent that has left a component can never return to it. Between two such events, exponentially (in the size of the component) many movements of agents in this component may occur in the original plan. However, since we assumed the short solution hypothesis to be true, there must also be a plan of polynomial length connecting the two events. Since there are at most $|V|$ strongly connected components, there must be a plan whose overall length is polynomial. This implies that the problem is in NP. ∎
5 Conclusion and Outlook
We gave a first answer to a long-standing open problem, namely, what the computational complexity of MAPF on digraphs is. In contrast to solvability on undirected graphs, which is a polynomial-time problem, solvability on digraphs turns out to be NP-hard in the general case. While we also provide an NP upper bound for DAGs and a PSPACE upper bound in general, we were only able to show a conditional upper bound of NP for the general problem, provided the short solution hypothesis for strongly connected digraphs is true.
While the result in itself may not have a high relevance for practical purposes, it is still significant in ruling out the possibility of a polynomial-time algorithm similar to the one developed by Kornhauser et al. [4]. Furthermore, the short solution hypothesis could be taken as a suggestion that the result by Botea et al. [1] could be strengthened to general strongly connected digraphs.
References
 [1] A. Botea, D. Bonusi, and P. Surynek. Solving multi-agent path finding on strongly biconnected digraphs. Journal of Artificial Intelligence Research, 62:273–314, 2018.
 [2] T. Bylander. The computational complexity of propositional STRIPS planning. Artificial Intelligence, 69(1–2):165–204, 1994.
 [3] A. Felner, R. Stern, S. E. Shimony, E. Boyarski, M. Goldenberg, G. Sharon, N. R. Sturtevant, G. Wagner, and P. Surynek. Search-based optimal solvers for the multi-agent pathfinding problem: Summary and challenges. In Proceedings of the Tenth International Symposium on Combinatorial Search (SOCS-17), pages 29–37, 2017.
 [4] D. Kornhauser, G. L. Miller, and P. G. Spirakis. Coordinating pebble motion on graphs, the diameter of permutation groups, and applications. In 25th Annual Symposium on Foundations of Computer Science (FOCS-84), pages 241–250, 1984.
 [5] H. Ma and S. Koenig. AI buzzwords explained: multi-agent path finding (MAPF). AI Matters, 3(3):15–19, 2017.
 [6] H. Ma, C. A. Tovey, G. Sharon, T. K. S. Kumar, and S. Koenig. Multi-agent path finding with payload transfers and the package-exchange robot-routing problem. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), pages 3166–3173, 2016.
 [7] B. Nebel, T. Bolander, T. Engesser, and R. Mattmüller. Implicitly coordinated multi-agent path finding under destination uncertainty: Success guarantees and computational complexity. Journal of Artificial Intelligence Research, 64:497–527, 2019.
 [8] C. H. Papadimitriou. Computational complexity. Addison-Wesley, 1994.
 [9] C. H. Papadimitriou, P. Raghavan, M. Sudan, and H. Tamaki. Motion planning on a graph (extended abstract). In 35th Annual Symposium on Foundations of Computer Science, Santa Fe, New Mexico, USA, 20–22 November 1994, pages 511–520, 1994.
 [10] P. Surynek. An optimization variant of multi-robot path planning is intractable. In Proceedings of the Twenty-Fourth Conference on Artificial Intelligence (AAAI-10), 2010.
 [11] Z. Wu and S. Grumbach. Feasibility of motion planning on acyclic and strongly connected directed graphs. Discrete Applied Mathematics, 158(9):1017–1028, 2010.
 [12] J. Yu and S. M. LaValle. Structure and intractability of optimal multi-robot path planning on graphs. In Proceedings of the Twenty-Seventh Conference on Artificial Intelligence (AAAI-13), 2013.