Constant Factor Time Optimal Multi-Robot Routing on High-Dimensional Grids in Mostly Sub-Quadratic Time

Jingjin Yu

Jingjin Yu is with the Department of Computer Science, Rutgers University at New Brunswick. E-mail: jingjin.yu@cs.rutgers.edu.
Abstract

Let be an grid. Assuming that each is occupied by a robot and a robot may move to a neighboring vertex in a step via synchronized rotations along cycles of , we first establish that the arbitrary reconfiguration of labeled robots on can be performed in makespan and requires running time in the worst case and when is non-degenerate (i.e., nearly one dimensional). The resulting algorithm, iSaG, provides an average-case -approximate (i.e., constant factor) time optimality guarantee. In the case when all dimensions are of similar size , the running time of iSaG approaches a linear . Defining as the largest distance between individual initial and goal configurations over all robots for a given problem instance , and building on iSaG, we develop the PartitionAndFlow (PaF) algorithm that computes makespan solutions for using mostly running time. PaF provides a worst-case -approximation with respect to solution time optimality. We note that the worst case running time for the problem is .

1 Introduction

In this work, we study the time optimal multi-robot routing or path planning problem on dimensional grids and grid-like settings, with the assumption that each vertex of the grid is occupied by a labeled robot, i.e., the robot density is maximal. Our work brings several technical breakthroughs:

  • On a (assuming is a constant) dimensional grid , our algorithm, iSaG, improves the running time of the average case -approximate (makespan) time optimal SplitAndGroup (SaG) algorithm from [1] from to a sub-quadratic for most cases and in the worst case (when is degenerate and nearly one dimensional). The problem has a worst case time complexity lower bound of .

  • Defining as the largest distance between individual initial and goal configurations over all robots for a given problem instance , and building on iSaG, we develop the PartitionAndFlow (PaF) algorithm that computes makespan solutions for two and three dimensions in mostly time and time in the worst case. PaF provides a worst-case -approximate guarantee on time optimality. We note that PaF was developed independently of a key result from [2] that achieves the same effect for two dimensions only.

  • Certain techniques in our work, which help enable the near optimal running time for iSaG and PaF, may be of independent interest, including:

    • We provide a shuffling procedure based on bipartite matching that allows the arbitrary redistribution of a group of unlabeled robots on arbitrary-dimensional grids (Theorem 5).

    • We provide an efficient procedure, also based on matching, that decouples an circulation into unit circulations on arbitrary graphs (Theorem 13).

From a practical standpoint, our results are significant in multiple application domains, including robotics and network routing. In particular, in robotics, our results imply that even in highly dense settings, if the maximum distance between any robot in a group of labeled robots and its goal is , then it is possible to compute, in only quadratic time, a routing plan that solves the entire problem with makespan , assuming that the robots travel no faster than unit speed. Further exploration of the algorithmic insights from our work may lead to coordination algorithms with stronger optimality guarantees for applications including warehousing [3], automated container port management [4], and coordinated aerial flight [5]. As noted in [2], algorithms like PaF also help resolve open questions regarding routing strategies for inter-connected mesh networks. Indeed, solving multi-robot routing on grid and grid-like structures is equivalent to finding vertex-disjoint paths in the underlying network, extended over discrete time steps.

Related work. Multi-robot path planning, from both the algorithmic and the application perspectives, has been studied extensively [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20], covering many application domains [21, 22, 23, 24, 25, 26, 27, 28, 29, 5]. Multi-robot path and motion planning is known to be computationally hard under continuous settings [30, 31], even when the robots are unlabeled [32, 33]. While the general multi-robot motion planning problem seems rather difficult to tackle, relaxed unlabeled continuous problems are solvable in polynomial time even near optimally [15, 34].

Restricting our attention to the discrete and labeled setting, in contrast to the continuous setting, feasible solutions are more readily computable. Seminal work by Kornhauser et al. [35], which builds on the work by Wilson [36], establishes that a discrete instance can be checked and solved in time on a graph . Feasibility testing can in fact be completed in linear time [37, 38, 39]. Optimal solutions remain difficult to compute in the discrete setting, however, even on planar graphs [40, 2]. Whereas many algorithms have been proposed toward optimally solving the discrete labeled multi-robot path planning problems [11, 41, 42, 43, 44, 45, 46, 47, 48], few provide simultaneous guarantees on solution optimality and (polynomial) running time. This has led to the development of polynomial-time methods that provide both of these desirable guarantees [1, 2].

2 Preliminaries

Let be a simple, undirected, and connected graph. A set of robots labeled - may move synchronously on in a collision-free manner described as follows. At integer (time) steps starting from , each robot must reside on a unique vertex , inducing a configuration of the robots as an injective map , specifying which robot occupies which vertex at step (see Fig. 1). From step to step , a robot may move from its current vertex to an adjacent one under two collision avoidance constraints: (i) is injective, i.e., each robot occupies a unique vertex, and (ii) for , , , i.e., no two robots may swap locations in a single step. If all individual robot moves between some and are valid (i.e., collision-free), then is a valid move for all robots. Multiple such moves can be chained together to form a sequence of moves, e.g., .
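The two collision-avoidance constraints can be checked mechanically for a single synchronous step. The following minimal Python sketch is not from the paper; the function name `is_valid_move` and the dictionary-based encoding of configurations and adjacency are our own conventions, used only to make the constraints concrete.

```python
def is_valid_move(cfg, nxt, adjacency):
    """Check one synchronous step from configuration cfg to nxt.
    cfg, nxt: dict robot -> vertex; adjacency: dict vertex -> set of vertices."""
    robots = list(cfg)
    # Constraint (i): the new configuration is injective (no shared vertex).
    if len(set(nxt[r] for r in robots)) != len(robots):
        return False
    # Each robot either stays put or moves to an adjacent vertex.
    for r in robots:
        if nxt[r] != cfg[r] and nxt[r] not in adjacency[cfg[r]]:
            return False
    # Constraint (ii): no two robots swap locations along a single edge.
    for r in robots:
        for s in robots:
            if r != s and nxt[r] == cfg[s] and nxt[s] == cfg[r]:
                return False
    return True

# Example: four robots on a 4-cycle rotate in one synchronous move (valid),
# while a head-on exchange of two adjacent robots is rejected.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
cfg = {"a": 0, "b": 1, "c": 2, "d": 3}
assert is_valid_move(cfg, {"a": 1, "b": 2, "c": 3, "d": 0}, cycle4)
assert not is_valid_move(cfg, {"a": 1, "b": 0, "c": 2, "d": 3}, cycle4)
```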


Figure 1: Graph-theoretic formulation of the multi-robot path planning problem. (a) A configuration of 12 robots on a grid. (b) A configuration that is reachable from (a) in a single synchronous move through simultaneous rotations of robots along two disjoint cycles.

Under this model, a multi-robot path planning problem (MPP) instance is fully specified by a 3-tuple in which and are the initial and goal configurations, respectively. To handle the most difficult case, we assume that , i.e., the number of robots is the maximum possible under the model. We note that the case of may be reduced to the case by arbitrarily placing “virtual” robots on vertices that are empty as indicated by and . An algorithm for the case is then also an algorithm for the case via the reduction.

For this study, is assumed to be a -dimensional () grid graph, i.e., is an grid with . Without loss of generality, throughout the paper we always assume that and (note that constant-sized problems can be solved in makespan by first performing a brute-force search and then a direct solution lookup, which takes constant time). Such a grid graph is also meant whenever the term grid is used in the paper without further specification. We say is degenerate if , which implies that all other dimensions are of constant size, i.e., is mostly one-dimensional. Otherwise, is non-degenerate. Since the most interesting cases are due to their relevance in applications, these cases are treated with additional attention.

Given an MPP instance and a feasible solution, as a sequence of moves that takes to , we define the solution’s makespan as the length of the sequence. For an MPP instance , let denote the distance between two vertices , assuming each edge has unit length. We define the distance gap between and as

which is an underestimate of the minimum makespan for . The main aim of this work is to establish a polynomial-time algorithm that computes solutions with makespan for an arbitrary instance whose underlying grid is of dimensions . In other words, the algorithm produces, in the worst case, -approximate makespan optimal solutions. Note that, on an grid, .
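As a concrete illustration, the distance gap can be computed with a single pass over the robots. The sketch below is hypothetical (the function name `distance_gap` and the dictionary encoding of configurations are ours); it assumes a grid where every edge has unit length, so graph distances reduce to Manhattan distances.

```python
def distance_gap(start, goal):
    """Largest distance between a robot's initial and goal vertices.
    start, goal: dict robot -> tuple of grid coordinates (any dimension)."""
    return max(
        sum(abs(a - b) for a, b in zip(start[r], goal[r]))  # Manhattan distance
        for r in start
    )

# Example on a 2D grid: robot 1 is 5 + 2 = 7 steps from its goal, which
# dominates and lower-bounds the optimal makespan of the instance.
start = {1: (0, 0), 2: (3, 3)}
goal = {1: (5, 2), 2: (3, 4)}
assert distance_gap(start, goal) == 7
```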

3 Improved Average Case -Approximate Makespan Algorithm

Our worst case -approximate algorithm makes use of, as a subroutine, an average case -approximate algorithm for the same problem that improves over the SplitAndGroup (SaG) algorithm from [1]. Main properties of SaG are summarized in the following theorem.

Theorem 1 ([1]).

Let be an MPP instance with being an grid. Then, a solution with makespan can be computed in time.

To be able to state our improvements over SaG, we briefly describe how SaG operates on an grid . SaG recursively splits into halves along a longer dimension. During the first iteration, is split into two grids (assuming without loss of generality that is even), and . Then, all robots whose goals belong to will be routed to . This will also force all robots whose goals belong to to be moved to because is fully occupied. This effectively partitions all robots on into two equivalence classes (those that should be in and those that should be in ); there is no need to distinguish the robots within each class during the current iteration. This is the grouping operation in SaG. Fig. 2 illustrates graphically what is to be achieved in the grouping operation in an iteration of SaG.


Figure 2: On a grid, the shaded robots have goals on the right grid. The grouping operation of an SaG iteration seeks to move the shaded robots on the left grid to exchange with the unshaded robots marked with dashed boundaries on the right grid.

To be able to move the robots to the desired halves of , it was noted [48] that an exchange of two robots can be realized on a grid using a constant number of moves (Fig. 3).


Figure 3: Robots and may be “swapped” using three synchronous moves on a grid. This implies that an arbitrary configuration on a grid can be realized in a constant number of moves.

The local “swapping” primitives can be executed in parallel on , which implies Lemma 2 as follows. An illustration of the operation is provided in Fig. 4.

Lemma 2 (Lemma 6 in [1]).

On a length path embedded in a grid, a group of indistinguishable robots may be arbitrarily rearranged using makespan. Multiple such rearrangements on vertex disjoint paths can be carried out in parallel.


Figure 4: Assuming a length path is embedded in a grid, Lemma 2 guarantees that the arbitrary distribution of a group of robots can be performed using makespan.

Lemma 2 further implies Lemma 3. Fig. 5 illustrates graphically the operation realized by Lemma 3.

Lemma 3 (Lemma 7 in [1]).

On a length path embedded in a grid, two groups of robots, equal in number and initially located on two disjoint portions of the path, may exchange locations in makespan. Multiple such exchanges on vertex disjoint paths can be carried out in parallel.


Figure 5: Assuming the grid-embedded path has a length of , Lemma 3 guarantees that the swapping of the two separated groups of robots, up to per group, can be done in makespan without any net movement of other robots on the line.

Lemma 2 and Lemma 3 both demand a running time of . We note that some problems require time simply to write down the solution, e.g., when robots need to be moved on a path of length . Several additional results were developed over Lemma 3 in [1] to complete the grouping operation, which involves complicated routing of robots on trees, embedded in a grid, that may overlap. We provide an alternative method that not only simplifies the process with better running time but also allows easy generalization to high dimensions. We note that, to complete the grouping operation, using the example from Fig. 2 for illustration, we only need to reconfigure robots on the left grid so that for each row, robots to be exchanged across the split line are equal in number (see Fig. 6). Lemma 3 then takes care of the rest.


Figure 6: We would like to reconfigure robots on the left half of Fig. 2 to the configuration as shown. The right portion will not be touched in the operation. In this configuration, robots do not need to move between different rows to complete the grouping operation, using Lemma 3.

To perform the reconfiguration, we begin by assigning labels to the robots as illustrated in Fig. 7 (see the description in the figure on how the labels are assigned in a straightforward manner, which takes linear time with respect to the size of the grid). These labels are only for pairing up robots for the reconfiguration; keep in mind that the shaded robots are in fact indistinguishable in the execution of the grouping operation.


Figure 7: (a) and (b) correspond to the left grids from Fig. 2 and Fig. 6, respectively. We would like to reconfigure the shaded robots to go from (a) to (b) (ignoring the labels). In (a), shaded robots are assigned labels based on the column they belong to. In (b), from top to bottom and left to right, we sequentially assign each shaded labeled robot from (a) a goal. The same is done to the unshaded robots.

With the labeling, we set up a bipartite graph as follows. One of the partite sets (e.g., in Fig. 8) represents the initial columns and the other set (e.g., in Fig. 8) the goal columns. We draw an edge between and if a shaded robot labeled ends up at a goal column . For example, in Fig. 7, shaded robots with label in (a) end up at columns and in (b), yielding the edges and in Fig. 8. If a goal column contains multiple shaded robots with label , then multiple edges between and are added. Note that, if we also add the edges for the unshaded robots in Fig. 7 in a similar manner, the bipartite graph will be -regular, where is the number of rows in the original grid ( in the provided example).


Figure 8: A bipartite graph constructed for rearranging robots. The colorings of the edges indicate a possible set of matchings, which are (red), (orange), (green), (cyan).

With the bipartite graph constructed, we proceed to obtain a set of up to maximum matchings. We note that this is always possible because our bipartite graph is a subgraph of a -regular bipartite graph (by Hall’s theorem [49], a perfect matching may be obtained on a -regular bipartite graph, the removal of which leaves a -regular bipartite graph). From the obtained set of matchings (e.g., using Hopcroft-Karp [50]), we permute with Lemma 2 to distribute the robots vertically so that a robot matched in the -th matching gets moved to the -th row. In our example, the first set is , which means that a set of three shaded robots labeled , and should be moved to the first row. Doing this for all matching sets shown in Fig. 7(a) yields the configuration in Fig. 9(a). Then, in a second round, the robots are permuted within their row, again using the matching result. In the example, the first matching set says that robots , and on the first row should be moved to columns , and . We note that going from to is possible with Lemma 2 because the labels are nominal; we only need to move the four indistinguishable robots to columns , and . For the configuration in Fig. 9(a), this round yields the configuration in Fig. 9(b). We note that the bipartite matching technique mentioned here is due to [51], in which a variation of it is used for a different reconfiguration problem.
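The matching step can be made concrete with a short sketch. The code below is hypothetical: `decompose_regular_bipartite` is our name, and Kuhn's augmenting-path algorithm is used in place of Hopcroft-Karp [50] purely for brevity. It peels a regular bipartite multigraph (parallel edges allowed) into perfect matchings, one per row, as in the row-assignment step described above.

```python
from collections import Counter

def decompose_regular_bipartite(edges, left, right, degree):
    """Decompose a `degree`-regular bipartite multigraph into `degree`
    perfect matchings.  edges: list of (u, v) pairs with u in left, v in
    right; parallel edges are allowed."""
    residual = Counter(edges)                      # multiplicity of each (u, v) pair

    def try_augment(u, match_of_right, visited):   # Kuhn's augmenting-path search
        for v in right:
            if residual[(u, v)] > 0 and v not in visited:
                visited.add(v)
                if match_of_right[v] is None or try_augment(
                        match_of_right[v], match_of_right, visited):
                    match_of_right[v] = u
                    return True
        return False

    matchings = []
    for _ in range(degree):
        match_of_right = {v: None for v in right}
        for u in left:
            try_augment(u, match_of_right, set())
        matching = [(u, v) for v, u in match_of_right.items()]
        for e in matching:                         # peel this matching off
            residual[e] -= 1
        matchings.append(matching)
    return matchings

# Toy 2-regular example with three initial columns and three goal columns,
# one edge per robot (initial column, goal column):
edges = [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]
for row, m in enumerate(decompose_regular_bipartite(edges, [0, 1, 2], [0, 1, 2], 2)):
    print(f"robots sent to row {row}: {sorted(m)}")
```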


Figure 9: (a) The initial permutation of columns of Fig. 7(a) using the bipartite matching result. (b) A second row-based permutation of (a) using the bipartite matching result. Our procedure operates following the sequence Fig. 7(a) Fig. 9(a) Fig. 9(b) Fig. 7(b).

We observe that the labeled robots that need to be moved are now all in the correct columns. One last column permutation then moves the robots into place. In the example, this is going from Fig. 9(b) to Fig. 7(b). We summarize the discussion in the following lemma.

Lemma 4.

On an grid, the reconfiguration of a group of indistinguishable robots between two arbitrary configurations can be completed using makespan in time.

Proof.

The procedure is already fully described; here, we analyze its performance. The procedure operates in three phases, each requiring a makespan of either or (because only one dimension of the grid is involved in each phase). The overall makespan is then . Regarding the computation time, each invocation of the procedure from Lemma 2 or Lemma 3 on an grid takes or time; doing these in parallel on the grid then takes time. For the bipartite matching, we may invoke an time matching algorithm [52] times to get a running time where and are the degree and the number of edges of the -regular bipartite graph. The total time spent on matching is . The overall running time is then . ∎

We now generalize Lemma 4 to dimensions.

Theorem 5.

On an grid, the reconfiguration of a group of indistinguishable robots between two arbitrary configurations can be completed using

makespan and requires time

(1)
Proof.

Since the case of is of practical importance, we first provide the proof for this case, which also outlines the inductive proof approach for general . On an grid, we partition the grid into columns of size , in the natural way. To build the bipartite graph, the robots to be moved are labeled based on the columns they belong to, yielding labels. The goals for these robots are assigned sequentially, similar to how it is done in the 2D case. After building the bipartite graph as before and performing the matching, the robots to be moved are partitioned into layers (a layer in the 3D case corresponds to a row in the 2D case) with each layer being an grid.

Then, as in the 2D case, a column permutation is done for each of the columns, in parallel. To be able to move the robots on each layer which is a grid, we invoke Lemma 4 in parallel on all layers. This is then followed by a final parallel column permutation.

To count the makespan, the initial and final column permutations require makespan and working with the layers requires makespan, yielding a total makespan of . For running time, at the top layer, the bipartite matching process creates a bipartite graph with . The time for doing matchings is then . The initial and final column permutation takes time (because we need to arrange columns of size each). For handling the layers of grids, by Lemma 4, it takes time . The overall running time is then .

For constructing the inductive proof, suppose for dimension , our makespan hypothesis for reconfiguration is . The running time hypothesis is as given in the theorem statement. For dimension , the problem is first approached at the top level to generate “layers” of size each (corresponding to a grid). After permuting columns of size , -dimensional problems are then solved via the induction hypothesis. Lastly, another column permutation is performed to complete the reconfiguration.

To count the makespan required, we note that at dimension , the initial and final column permutations require a makespan of as all columns of size can be operated on in parallel. By the induction hypothesis, the total makespan is then , which actually does not directly depend on the dimension. The first matching operation takes time. The initial and final column permutations require calling the routine (Lemma 2) times, taking the same amount of time. By the induction hypothesis, handling the layers takes time a multiple of (1). Putting these together again yields (1) with replaced by . ∎

A case of special interest is when all , , are about the same.

Corollary 6.

On a -dimensional grid with all sides having lengths , the reconfiguration of a group of indistinguishable robots between two arbitrary configurations can be completed using makespan and time.

Replacing the tree-routing based grouping operation in SaG with the updated, staged grouping routine, we obtain the following improved result.

Theorem 7.

Let be an MPP instance with being an grid for some . Then, a solution with

makespan can be computed in time

(2)
Proof.

Similar to SaG, standard divide-and-conquer is applied that iteratively divides and subsequent partitions into equal halves; the grouping operation is then applied. For the grouping operation, after reconfiguration on a half grid, a parallel invocation of Lemma 3 is needed to move the robots across the splitting boundary, which takes at most time. Because is already a term in (1), this additional operation does not contribute to more computation time in an iteration of SaG.

For a -dimensional grid, in the first iterations, we may choose the -th round to divide dimension into two halves (i.e., ). Following this scheme, for the -th round, the makespan is

For computation time, we need to operate on subproblems with each subproblem requiring time no more than

That is, each of the first iterations takes no more time than (1). Tallying up, the first rounds require makespan and running time as given in the theorem statement.

After rounds of division, all dimensions are halved. To complete the next rounds, the required makespan is halved and the computation time shrinks even more (since it is quadratic in at least one of the dimensions and superlinear in the rest). Consequently, the makespan and the running time for the first rounds dominate. ∎

To distinguish our modification from SaG, we denote the improved algorithm as iSaG. We mention that iSaG runs in quadratic time if we allow to be degenerate, i.e., . To see that this is true, we observe that the term inside (2) is bounded by

because . The last term is since is bounded by some small constant. As noted, the quadratic bound is sometimes necessary when is degenerate (see discussion following Lemma 3). We note that in this case, the running time lower bound can also be . When is non-degenerate, iSaG runs in a sub-quadratic time that approaches .

We conclude this section with a corollary, which will be useful later, that directly follows Corollary 6 and Theorem 7.

Corollary 8.

When all dimensions of the underlying grid are of similar magnitude, the makespan and computation time for solving an MPP instance are and , respectively, for two dimensions. For three dimensions, these are and , respectively. For general , these are and , respectively.

4 From Average Case to Worst Case: A Solution Sketch for Two Dimensions

In this section, we highlight, at a high level, why solutions produced by iSaG can be rather undesirable in practice and how its shortcomings can be addressed with the PartitionAndFlow (PaF) algorithm. In sketching PaF, we resort to the frequent use of figures to illustrate the important steps. We emphasize that the steps explained using these pictorial examples are also rigorously proved correct later in Section 5. The full optimality and running time analysis is also deferred until then.

4.1 The Difficulty

Given an MPP instance , let the makespan computed by iSaG be denoted as . From an algorithmic perspective, iSaG delivers -approximate makespan optimal solutions on average, i.e., for a fixed , let all instances of MPP on be , then iSaG ensures the quantity (as a sum of ratios)

is a constant. A key assumption in the average case analysis is that all instances for a fixed are equally likely, implying a uniform distribution of problem instances. When this assumption does not hold, as is the case in many practical scenarios, iSaG no longer guarantees approximation. Such cases may be illustrated with a simple example. On an grid, let an MPP instance be constructed so that to reach the goal configuration, all robots on the outer boundary must rotate synchronously once in the clockwise direction (see Fig. 10). The minimum makespan of the instance is but iSaG will incur a makespan of due to its divide-and-conquer approach that agnostically divides the grid in the middle.


Figure 10: An MPP instance on an grid. Solving the instance requires all robots on the outside perimeter to move clockwise once. iSaG will first cause the two (red, darker shaded) robots to exchange locations, which induces a makespan of .

On the other hand, if a polynomial-time algorithm can be constructed that always produces makespan for an arbitrary MPP instance , then -approximate optimal solutions can always be guaranteed. Naturally, such an algorithm will necessarily require some form of divide-and-conquer, on top of which the flow of robots at the global scale must also be dealt with. The key to establishing such an algorithm is to be able to recognize the global flow to generate appropriate local routing plans. In terms of the example illustrated in Fig. 10, the two darker shaded (red) robots must be routed locally across the thick dashed (green) boundary lines. This implies that all the shaded robots must move more or less synchronously around the cycle. A main challenge is how to realize such local-global coordination when many such cyclic flows are entangled under maximum robot density.

Here, we mention that the special case of can be easily handled for an arbitrary dimension .

Proposition 9.

Let be dimensional grid with and let be an MPP instance with . Then an makespan plan for solving can be computed in time.

Proof.

In this case, for a given robot , if , its goal is just one edge away. Starting from any robot , the vertices induce a cycle on . When such a cycle has two vertices, this represents an exchange of two robots. Using parallel swapping operations, such exchanges can be completed in makespan, which leaves only simple cycles on that are all disjoint. Robots on these simple cycles can then move to their goals in a single synchronous move. The total makespan is then , and computing the plan amounts to simply writing down the cycles, which takes time linear in the size of the grid. The factor comes from the search branching factor. ∎
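A minimal sketch of the cycle bookkeeping in this proof is given below. It is hypothetical: `decompose_into_cycles` and the dictionary encoding of the goal assignment are our own. Two-vertex cycles are resolved with the constant-makespan swap primitive of Fig. 3, and all remaining cycles are vertex-disjoint and rotate in a single synchronous move.

```python
def decompose_into_cycles(target):
    """target: dict mapping each vertex to the vertex its robot must reach
    (at most one edge away).  Returns the non-trivial cycles robots trace."""
    seen, cycles = set(), []
    for v in target:
        if v in seen or target[v] == v:
            continue
        cycle, u = [], v
        while u not in seen:
            seen.add(u)
            cycle.append(u)
            u = target[u]
        cycles.append(cycle)
    return cycles

# Example: on a 2x2 grid, all four robots rotate clockwise; the single
# 4-cycle below is executed in one synchronous move.
goals = {(0, 0): (0, 1), (0, 1): (1, 1), (1, 1): (1, 0), (1, 0): (0, 0)}
cycles = decompose_into_cycles(goals)
swaps = [c for c in cycles if len(c) == 2]       # handled by the swap primitive
rotations = [c for c in cycles if len(c) > 2]    # one synchronous rotation each
assert rotations == [[(0, 0), (0, 1), (1, 1), (1, 0)]]
```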

4.2 Sketch of PartitionAndFlow

In sketching the PaF algorithm, we remark that PaF essentially works on a problem by gradually updating . That is, it first creates some intermediate based on and and solves the problem , leaving a new problem . Then, it repeats the process to create and solve another problem , resulting in a new problem . The process continues until is updated to eventually match . It is important to keep this in mind when reading the sketch of PaF.

In our description of PaF in this section, a two-dimensional, grid will be assumed. The generalization to a -dimensional grid will use the same general approach but require more involved treatment. As the name suggests, PaF partitions an MPP instance on a grid into small pieces and organizes the flow of robots through these pieces globally. The partition is essentially a form of decoupling that includes, and is more general than, iSaG’s half-half splitting scheme.

For a given MPP instance with being an grid, PaF starts by computing , the distance gap for the problem (henceforth, we use in place of because the instance is always fixed but arbitrary; is otherwise only used in theorem statements when a problem is being specified). In the main case, . That is, for any robot , . This means that may be partitioned into square cells of size each. This is the partition operation in PaF (see Fig. 11 for an illustration). For the moment, we assume that a perfect partition can be achieved, i.e., and are both integer multiples of ; the assumption is justified in Section 5.


Figure 11: Partitioning of an grid into cells. Each cell has a size of . Within a cell (the figure on the right), only robots located at a distance of no more than from the border may have goals outside the cell.

The partition scheme, as a refinement to the splitting scheme from iSaG, has the property that only robots of distance from a cell boundary may have goals outside the cell by the definition of (for more details, see Fig. 12). This means that between two cells that share a vertical or horizontal boundary, at most robots need to cross that boundary. If we only count the net exchange, then the number reduces to .
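For illustration, the sketch below (hypothetical; `cell_of` and `boundary_crossings` are our names) assigns each robot to its cell of the partition and tallies how many robots must cross from one cell into another. Diagonal cell pairs may appear in the tally; the diagonal rerouting operation described next converts those into crossings between boundary-sharing cells.

```python
from collections import defaultdict

def cell_of(pos, d_g):
    """Index of the d_g x d_g cell containing a grid coordinate."""
    return tuple(c // d_g for c in pos)

def boundary_crossings(start, goal, d_g):
    """Count robots whose start and goal cells differ, per ordered cell pair.
    start, goal: dict robot -> (row, col)."""
    demand = defaultdict(int)
    for r in start:
        a, b = cell_of(start[r], d_g), cell_of(goal[r], d_g)
        if a != b:
            demand[(a, b)] += 1
    return dict(demand)

# Example with d_g = 2: one robot must cross to the cell on its right, another
# to a diagonally adjacent cell (later handled by diagonal rerouting).
start = {1: (0, 1), 2: (1, 1), 3: (3, 3)}
goal = {1: (0, 2), 2: (2, 2), 3: (3, 3)}
print(boundary_crossings(start, goal, 2))
# {((0, 0), (0, 1)): 1, ((0, 0), (1, 1)): 1}
```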


Figure 12: An illustration of the thick boundary areas of four adjacent cells. Any net robot exchange between two cells must happen in this region by the definition of .

Over the partition, PaF will build a flow between the cells treating each cell as a node in a graph. To be able to translate the flow into feasible robot movements, the flow should only happen between adjacent cells that share a boundary. However, as illustrated in Fig. 12, it is possible for a robot to have initial and goal configurations that are separated into diagonally adjacent cells which do not share boundaries. To resolve this, we may update the goals for these robots using robots from another cell that is adjacent to both of the involved cells. Fig. 13 illustrates how one such robot can be processed. We call this operation diagonal rerouting, which will create a new configuration of the robots on . iSaG is then invoked to solve . iSaG will do so locally on regions that span equal parts of four adjacent cells.


Figure 13: (a) At the boundary between four cells, robot has initial and goal configurations (vertices) spanning two diagonally adjacent cells. In the top right cell which is adjacent to both the top left and bottom right cells, there exists a robot that has its goal vertex in the same cell. (b) By swapping the robots and using iSaG, no robot needs to cross cell boundaries diagonally.

Then, PaF creates another intermediate configuration for moving robots between each vertical or horizontal cell boundary so that between any two cells, robots will only need to move in a single direction when crossing a cell boundary. That is, for each cell boundary, iSaG is called to “cancel out” non-net robot movements, as illustrated in Fig. 14, leaving only uni-directional robot movements across cells. We call this operation flow cancellation.


Figure 14: (a) There are four robots in the top cell and two robots in the bottom cell that need to move across the horizontal boundary. (b) Through an arbitrary matching (indicated with double sided arrows) of two pairs of robots’ initial configurations and applying iSaG to swap them, the robot movements across the boundary are now unidirectional.

The net robot movement across cell boundaries induces a flow over the cells (see Fig. 15(a)). Because each cell contains a fixed number of robots, the incoming and outgoing flows at each cell (node) must be equal. This means that all such flows must form a valid circulation (a circulation is essentially a valid flow over a network without source and sink nodes, i.e., the incoming and outgoing flows at every node of the network are equal in magnitude) over the graph formed by the cells as nodes. The flow between two adjacent cells is no more than (to be established later). The circulation can then be decomposed into unit circulations (Fig. 15(b)). These unit circulations can be translated into coordinated global robot movements that require any robot to travel only locally, at most a distance of . The translation amounts to creating another configuration . is also solved using iSaG.


Figure 15: (a) Induced circulation (network) from required robot movements. The numbers denote the total flow on a given edge. The edges without numbers have unit flows. (b) After decomposition, the circulation can be turned into unit circulations on simple cycles.

After the preparation phase is done, the scheduled global robot movements can be directly executed, yielding a new configuration . The configuration has the property that every robot is now in the partitioned cell where its goal resides. iSaG can then be invoked to solve (iSaG is invoked at the cell level). Throughout the process, each robot only needs to move a distance of and calls to iSaG can be performed in parallel, yielding an overall makespan of . Before presenting the details of PaF in Section 5, we outline the steps of PaF in Algorithm 1. We emphasize that the outline is provided at a very high level that summarizes the sketch of PaF and only covers the main case in 2D.

Input: : an grid graph; : initial configuration; : goal configuration
Output: : a sequence of moves
1 Partition ( represents the partition)
2 Orient flows on
3 Flow decomposition and global route preparation ( are the routes)
4 Global robot routing
5 Final local robot routing
6 return
Algorithm 1 PafMainCase2D(, , )
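The outline can also be written as a structural Python skeleton. The sketch below is only for showing the sequencing of the main 2D case: every helper (`partition_grid`, `diagonal_reroute`, `cancel_flows`, `decompose_flow`, `route_global`, `solve_cells_with_isag`) is a hypothetical placeholder for the corresponding operation described in this section, not a function defined by the paper, and the function is therefore defined but never invoked here. `distance_gap` refers to the sketch given in Section 2.

```python
def paf_main_case_2d(grid, start, goal):
    # Hypothetical skeleton only; helper names are placeholders.
    moves = []
    d_g = distance_gap(start, goal)                       # size of the square cells
    cells = partition_grid(grid, d_g)                     # step 1: partition

    cfg, m = diagonal_reroute(grid, cells, start, goal)   # step 2: remove diagonal crossings
    moves += m
    cfg, m = cancel_flows(grid, cells, cfg, goal)         # step 2: make boundary flows uni-directional
    moves += m

    routes = decompose_flow(cells, cfg, goal)             # step 3: circulation -> unit circulations
    cfg, m = route_global(grid, cells, cfg, routes)       # step 4: rotate robots along cell cycles
    moves += m

    moves += solve_cells_with_isag(grid, cells, cfg, goal)  # step 5: final local routing
    return moves
```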

In closing this section, we note that in providing the details of PaF in Section 5, objects of minor importance, including the temporary configurations (e.g., ’s) and actual robot movement plans (e.g., ’s), will be omitted in the description. However, sufficient details are provided if a reader is interested in deriving these objects.

5 PartitionAndFlow in 2D: the Details

At this point, we make the assumption that for the rest of the paper (unless stated explicitly otherwise), for a given problem with being an grid, . Otherwise, and we may simply invoke iSaG to solve the problem. We note that this is a different condition than requiring being non-degenerate.

We now proceed to provide the full description of how to piece together PaF. The goal of this section is to establish the following main result on the existence of a polynomial-time algorithm (PaF) for computing worst-case -approximate makespan optimal solutions for MPP in two dimensions.

Theorem 10.

Let be an arbitrary MPP instance with being an grid. A solution for with makespan can be computed in deterministic time or expected time.

Besides the main case outlined in Section 4, there is also a special case that needs to be analyzed in proving Theorem 10, depending on the magnitude of relative to and . The cases for are divided into two disjoint cases: (i) and (ii) . The first case can be readily addressed.

Lemma 11.

Let be an arbitrary MPP instance in which is an grid with and . The instance admits a solution with a makespan of , computable in time.


Figure 16: Partitioning of an grid along the dimension into cells of roughly the same size of with . Three partitioned cells and are shown. Four robots need to move from to and three robots need to move from to . An equal number of robots must move in the opposite direction. The goals of the robots are not illustrated in the drawing.
Proof.

When , we compute and (note that ). Partition into grid cells along the direction of ; each cell is of size or (see Fig. 16). Assuming that is oriented such that its longer dimension is aligned horizontally, from left to right we label these cells . By the definition of , a robot initially located in cell may only have its goal in either , , or (for applicable , and ). This further implies that for any applicable , the number of robots that need to move from to is the same as the number of robots that need to move from to . The MPP instance can then be solved in two rounds by first invoking iSaG on the combined cells for all applicable odd . This round finishes all robot exchanges between and for odd . In the second round, iSaG is invoked again to do the same, now for all applicable even . Since both parallel applications of iSaG incur a makespan of , the total makespan is . For running time, each round of iSaG application requires . Since and , this yields a total time of . ∎
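The odd/even pairing of cells used in this proof is simple to spell out. The sketch below (hypothetical helper name, 1-indexed cell labels) lists which adjacent cell pairs are merged and handed to iSaG in each of the two parallel rounds.

```python
def two_round_cell_pairs(num_cells):
    """Adjacent cell pairs processed in each of the two parallel iSaG rounds."""
    round_one = [(i, i + 1) for i in range(1, num_cells, 2)]   # odd-indexed left cells
    round_two = [(i, i + 1) for i in range(2, num_cells, 2)]   # even-indexed left cells
    return round_one, round_two

# With five cells: round one handles (1,2) and (3,4); round two handles
# (2,3) and (4,5), completing all exchanges between consecutive cells.
assert two_round_cell_pairs(5) == ([(1, 2), (3, 4)], [(2, 3), (4, 5)])
```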

The rest of this section is devoted to the case . Because , without loss of generality, we assume that . Furthermore, we may assume without loss of generality that and are multiples of . If that is not the case, then, assuming that PaF is correct, we can apply PaF up to four times without incurring additional makespan or running time penalties. To execute this, we first compute and . We note that . Then, PaF is applied to the top-left portion of . This fully solves the problem for the top-left cells of size . Doing the same three more times, with each application on a different section of as illustrated in Fig. 17, solves the entire problem.


Figure 17: For , if or are not multiples of , we may apply PaF to a cell partition of up to four times to cover .

Henceforth in this section, we assume and in which and are integers. is partitioned into a skeleton grid with its nodes being cells. We remind the reader that after the partition, by the definition of , robot exchanges between cells can only happen in a wide border of any cell, as explained earlier and illustrated in Fig. 12. Our immediate goal is to make sure that between neighboring cells, the movement of robots is uni-directional and does not happen between diagonally adjacent cells. That is, we would like to realize what is illustrated in Fig. 15(a) from a raw partition, in polynomial time and makespan, using diagonal rerouting (Fig. 13) and flow cancellation (Fig. 14) operations.

Lemma 12 (Flow Orientation).

In time and makespan, the flow of robots on the skeleton grid may be arranged to be only vertical or horizontal between adjacent cells and uni-directional. The largest total incoming flow through a cell boundary is no more than .

Proof.

We first show how to carry out the diagonal rerouting operation. For convenience, and to provide more detail, we reproduce Fig. 13 in Fig. 18 and let the four involved cells be through as illustrated. By the definition of , if a robot in has its goal in , then the robot must be in the bottom-right region of and its goal must be in the top-left region of . For each such robot, we pick an arbitrary robot from in the diagonal-line shaded region. Any robot in this region will have its goal in (by definition of ). If we swap the initial configurations of and , then the diagonal movement of is eliminated. Going in a clockwise fashion, for any robot in that needs to move to , we can swap it with a robot from in the diagonal-line shaded region. Within the region, we create a (temporary) MPP problem containing only these swaps. For each such meeting of four cells, such an MPP instance is created. Then, all these disjoint instances can be solved with iSaG in parallel using only makespan. For computation time, constructing the instance requires a single linear scan of the region and solving each MPP instance takes time, by Corollary 8. There are such instances, demanding a total time of .


Figure 18: Illustration of four cells meeting at corners. Each small square region is of size . The entire region is of size . Swapping and eliminates the need for to directly cross into a diagonally adjacent cell.

The flow cancellation operation is carried out using a mechanism similar to that for diagonal rerouting. Referring to Fig. 19 as an updated version of Fig. 14(a), for a horizontal boundary between two adjacent cells and , there can be robots that are more than away from the boundary that need to cross the boundary. This is due to the diagonal rerouting step. Suppose that there are robots that need to move from to and from to . We may pick robots from each group and create an MPP problem that swaps them on the region shown in Fig. 19.


Figure 19: A horizontal boundary between two adjacent cells. Some potential robot movements across the boundary are illustrated. Among these, three pairs of robots, as indicated with double sided arrows, may be matched to make the flow across the boundary uni-directional.

Applying iSaG to the instance then renders the flow of robots across the boundary uni-directional. By applying iSaG in parallel on all such instances over horizontal boundaries, and then in another round over vertical boundaries, the flows of robots across all cell boundaries become uni-directional. Following the analysis of the diagonal rerouting step, the flow cancellation operation also induces makespan because each MPP instance is on an grid region. The running time is also the same as that of the diagonal rerouting step, at . It is clear that the total flow through any boundary is no more than . ∎

After the flow cancellation operation, we are left with only unidirectional flows on the skeleton grid that are either vertical or horizontal between adjacent cells. To route the robots these flows represent, closed disjoint cycles must be constructed for moving the robots synchronously across multiple cell boundaries. To achieve this, we will first decompose the flow into unit circulations (i.e., describing a procedure for going from Fig. 15(a) to Fig. 15(b)). Then, we will show how the cycles on the skeleton grid (e.g., Fig. 15(b)) can be grouped into a constant number of sized batches and turned into actual cycles on the original grid . Our flow decomposition result, outlined below, works for arbitrary graphs.

Theorem 13 (Circulation Decomposition).

Let be a circulation on a graph with the largest total incoming flow for any vertex being . can be decomposed into unit circulations on in time or expected time.

Proof.

We proceed to build a bipartite graph over two copies of . For a vertex , we denote one of the copies (belonging to the first partite set) and the other (belonging to the second partite set). For any two adjacent vertices , if there is a flow of magnitude from to , then we add edges between and . Because the largest total incoming flow to any vertex is , the maximum degree for any , , is also . Also, due to flow conservation at vertices, for fixed , and have the same degree . For all with , we add edges between and . This brings the degrees of all vertices in the bipartite graph to , yielding a regular bipartite graph. The bipartite graph has vertices and edges. An illustration of the bipartite graph construction is given in Fig. 20.


Figure 20: (a) A graph with five vertices and a valid circulation of largest total incoming degree being . The flow on each edge with non-unit flow is marked on the edge. (b) The constructed bipartite graph. The dashed edges are the edges added to make the graph regular.

With the regular bipartite graph of degree , by Hall’s theorem [49], there exists a perfect matching that can be computed in time [52]. The matching corresponds to a unit circulation on , which translates to either a single cycle or multiple vertex disjoint cycles. In the example, a perfect matching may be , which translates to the cycle . An application of the perfect matching algorithm reduces the degree of the bipartite graph by , resulting in another regular bipartite graph. We may repeat the procedure times to obtain unit circulations on . The total running time to obtain the unit circulations is . Alternatively, we may use the randomized perfect matching algorithm [53], which yields a total expected running time of . ∎
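The construction in this proof can be sketched compactly. The code below is hypothetical: `circulation_to_unit_circulations` is our name, it reuses the `decompose_regular_bipartite` helper sketched in Section 3, and it pads each vertex with self-edges to reach the regular degree, exactly as the proof does.

```python
def circulation_to_unit_circulations(flow, vertices, d_max):
    """flow: dict (u, v) -> positive integer flow from u to v, forming a
    circulation (flow is conserved at every vertex); d_max is the largest
    total incoming flow at any vertex.  Returns d_max unit circulations,
    each a list of unit-flow (u, v) edges.  Requires the
    decompose_regular_bipartite helper from the earlier sketch."""
    left = [("out", v) for v in vertices]
    right = [("in", v) for v in vertices]
    edges, out_deg = [], {v: 0 for v in vertices}
    for (u, v), f in flow.items():
        edges += [(("out", u), ("in", v))] * f            # f parallel edges
        out_deg[u] += f
    for v in vertices:                                     # pad to d_max-regular
        edges += [(("out", v), ("in", v))] * (d_max - out_deg[v])
    matchings = decompose_regular_bipartite(edges, left, right, d_max)
    # Each perfect matching is one unit circulation; padding self-edges are dropped.
    return [[(u[1], v[1]) for u, v in m if u[1] != v[1]] for m in matchings]

# Toy circulation on four vertices: flow 2 on edge a->b splits into the two
# unit cycles a->b->c->a and a->b->d->a (in some order).
flow = {("a", "b"): 2, ("b", "c"): 1, ("c", "a"): 1, ("b", "d"): 1, ("d", "a"): 1}
for cyc in circulation_to_unit_circulations(flow, ["a", "b", "c", "d"], 2):
    print(cyc)
```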

For our setting, Theorem 13 implies the following (note that and .)

Corollary 14 (Flow Decomposition on Skeleton Grid).

An circulation on a skeleton grid can be decomposed into unit circulations in time or expected time.

Because at most flows can pass through a cell boundary, at most flow can pass through a cell (two incoming, two outgoing). Corollary 14 gives us unit circulations over the skeleton grid . With the decomposed circulation, we may group them into batches and translate these into actual robot movements on . To start, we handle a batch.

Lemma 15 (Single Batch Global Flow Routing).

A batch of up to unit circulations on the skeleton grid may be translated into actual cyclic paths for robots on to complete in a single step, using time.

Proof.

For a fixed cell, there are many possible orientations for the incoming and outgoing flows. However, we only need to analyze the case where all four boundaries of a cell have flows passing through. If we can handle these, other cases are degenerate ones with some flows crossing the boundaries being zero. Among all possible flows that go through all sides of a cell, there are only three possible orientations for the incoming and outgoing flows after considering rotation symmetries and flow direction symmetries, as illustrated in Fig. 21. For example, the case with one incoming flow and three outgoing flows is the same as reversing the directions of the arrows in the case shown in Fig. 21(a). Therefore, establishing how a amount of flow may be translated into feasible robot movements for the three cases in Fig. 21 encompasses all possible scenarios. We will establish how up to