Scalable AsymptoticallyOptimal MultiRobot Motion Planning
Abstract
Discovering highquality paths for multirobot problems can be achieved, in principle, through asymptoticallyoptimal data structures in the composite space of all robots, such as a samplingbased roadmap or a tree. The hardness of motion planning, however, which depends exponentially on the number of robots, renders the explicit construction of such structures impractical. This work proposes a scalable, samplingbased planner for coupled multirobot problems that provides desirable pathquality guarantees. The proposed is an informed, asymptoticallyoptimal extension of a prior method , which introduced the idea of building roadmaps for each robot and implicitly searching the tensor product of these structures in the composite space. The paper describes the conditions for convergence to optimal paths in multirobot problems. Moreover, simulated experiments indicate converges to highquality paths and scales to higher numbers of robots where various alternatives fail. It can also be used on highdimensional challenges, such as planning for robot manipulators.
I Introduction and Prior Work
Many multirobot planning applications [22, 4, 3] require highdimensional platforms to simultaneously move in a shared workspace, where highquality paths must be computed quickly as sensing input is updated, as in Fig. 1. Preprocessing given knowledge of the static scene can help the online computation of highquality paths. Samplingbased roadmaps can help with such highdimensional challenges and provide primitives for preprocessing a static scene [8, 9]. These methods converge to optimal solutions given sufficient density [7], i.e., at least edges are needed for a roadmap with vertices, while nearoptimal solutions are achieved after finite computation time [2, 6].
Naïvely constructing a samplingbased roadmap or tree in the robots’ composite configuration space provides asymptotic optimality but does not scale well. In particular, memory requirements depend exponentially on the problem’s dimension [15]. The alternative is decoupled planing, where paths for robots are computed independently and then coordinated [10]. These methods, however, typically lack completeness and optimality guarantees. Hybrid approaches can achieve optimal decoupling to retain guarantees [21]. The problem is more complex when the robots exhibit nontrivial dynamics [12]. Collision avoidance or control methods can scale to large numbers of robots, but typically lack global path quality guarantees [20, 19].
The previously proposed approach [16] is a scalable samplingbased approach, which is probabilistically complete. It searches an implicit tensor product roadmap of roadmaps constructued for each robot individually [18]. The current work proposes and shows that it is an efficient asymptotically optimal variant of the prior method. Simulations show the method practically generates highquality paths while scaling to complex, highdimensional problems, where alternatives fail. ^{1}^{1}1Additional material is provided in the appendices of this paper as well as the accompanying video of the original MRS submission.
Ii Problem Setup and Notation
Consider a shared workspace with robots, each operating in configuration space for . Let be each robot’s free space, where it is free from collision with the static scene, and is the forbidden space for robot . The composite space is the Cartesian product of each robot’s space. A composite configuration is an tuple of robot configurations. For two distinct robots , denote by the set of configurations where and collide. Then, the composite free space consists of configurations subject to:

for every ;

for every .
Each requires robots to not collide with obstacles, and each pair to not collide with each other. The composite forbidden space is defined as .
Given , where , a trajectory is a continuous curve in , such that , where the robots move simultaneously. is an tuple of robot paths such that .
The objective is to find a trajectory which minimizes a cost function . The analysis assumes the cost is the sum of robot path lengths, i.e., , where denotes the standard arc length of . The arguments also work for .^{2}^{2}2The types of distances the arguments hold are more general, but proofs for alternative metrics are left as future work. Section IV shows sufficient conditions for to converge to optimal trajectories over the cost function .
Iii Methods for Composite Space Planning
For a fixed , define for every robot the PRM roadmap constructed over , such that with connection radius . Then, is the tensor product roadmap in space (for an illustration, see Figure 2). Formally, is the Cartesian product of the nodes from each roadmap . For two vertices the edge set contains edge if for every it is that or .^{3}^{3}3Notice this slight difference from [16] so as to allow edges where some robots remain motionless while others move.
As shown in Algorithm 1, grows a tree over , rooted at the start configuration and initializes path (line 1). The method stores the node added each iteration (Line 2), as part of an informed process to guide the expansion of towards the goal. The method iteratively expands given a time budget (Line 3), as detailed by Algorithm 2, storing the newly added node (Line 4). After expansion, the method traces the path which connects the source with the target (Line 5). If such a path is found, it is stored in if it improves upon the cost of the previous solution (Lines 6, 7). Finally, the best path found is returned (Line 8).
The expansion step is given in Alg. 2. The default initial step of the method is given in Lines 14, i.e., when no is passed (Line 1), which corresponds to an exploration step similar to RRT: a random sample is generated in (Line 2), its nearest neighbor in is found (Line 3) and the oracle function returns the implicit graph node that is a neighbor of on the implicit graph in the direction of (Line 4). If a , however, is provided (Line 5)—which happens when the last iteration managed to generate a node closer to the goal relative to its parent—then the is greedily generated so as to be a neighbor of in the direction of the goal (Line 6).
In either case, the method next finds neighbors , which are adjacent to in and have also been added to (Line 7). Among , the best node is chosen, for which the local path is collisionfree and that the total path cost to is minimized (Line 8). If no such parent can be found (Line 9), the expansion fails and no node is returned (Line 10). Then, if is not in , it is added (Lines 1113). Otherwise, if it exists, the tree is rewired so as to contain edge , and the cost of the ’s subtree (if any) is updated (Lines 14, 15). Then, for all nodes in (Line 16), the method tests should be rewired through to reach this neighbor. Given that is collisionfree and is of lower cost than the existing path to (Line 17), the tree is rewired to make be the parent of (line 18).
Finally, if in this iteration the heuristic value of is lower than its parent node (line 19), the method returns (Line 20), causing the next iteration to greedily expand . Otherwise, is returned so as to do an exploration step. Note that the approach is implemented with helpful branchandbound pruning after an initial solution is found, though this is not reflected in the algorithmics.
is determined via an oracle function. Using this oracle function and a simple rewiring scheme is sufficient for showing asymptotic optimality for (see Section IV). The oracle function for a tworobot case is illustrated in Figure 3. First, let be the ray from configuration terminating at . Then, denote as the minimum angle between and . When is drawn in , its nearest neighbor in is found. Then, project the points and into each robot space , i.e., ignore the configurations of other robots.
The method separately searches the singlerobot roadmaps to discover . Denote . For every robot , let be the neighborhood of , and identify . The oracle function returns node .
As in the standard as well as in , the approach has a Voronoibias property [11]. It is, however, slightly more involved to observe as shown in Figure 4. To generate an edge , random sample must be drawn within the Voronoi cell of , denoted (A) and in the general direction of , denoted (B). The intersection of these two volumes is the volume to be sampled generate via .
Iv Analysis
In this section, the theoretical properties of are examined, beginning with a study of the asymptotic convergence of the implicit roadmap to containing a path in whose cost converges to the optimum. Then, it is shown eventually discovers the shortest path in , and that the combination of these two facts proves the asymptotic optimality of .
For simplicity, the analysis is restricted to the setting of robots operating in Euclidean space, i.e. is a dimensional Euclidean hypercube for fixed . ^{4}^{4}4For simplicity, it is assumed that all the robots have the same number of degrees of freedom . Additionally, the analysis is restricted to the specific cost function of total distance, i.e., . Discussion on lifting these restrictions is provided in Section VI.
Iva Optimal Convergence of
For each robot, an asymptotically optimal roadmap is constructed having samples and using a connection radius necessary for asymptotic convergence to the optimum [7]. By the nature of samplingbased algorithms, each graph cannot converge to the true optimum with finite computation, as such a solution may have clearance of exactly . Instead, this work focuses on the notion of a robust optimum ^{5}^{5}5Note that the given definition of robust optimum is similar to that in previous work [17]., showing that the tensor product roamdap converges to this value.
Definition 1
A trajectory is robust if there exists a fixed such that for every it holds that , where denotes the standard Euclidean distance.
Definition 2
A value which denotes a path cost is robust if for every fixed there exists a robust path such that . The robust optimum, denoted by , is the infimum over all such values.
For any fixed , and a specific instance of constructed from roadmaps, having samples each, denote by the shortest path from to over .
Definition 3
is asymptotically optimal (AO) if for every fixed it holds that a.a.s.^{6}^{6}6Let be random variables in some probability space and let be an event depending on . We say that occurs asymptotically almost surely (a.a.s.) if ., where the probability is over all the instantiations of with samples for each PRM.
Using this definition, the following theorem is proven. Recall that denotes the dimension of a singlerobot configuration space.
Theorem 1
is AO when
where is any constant larger than .
Remark. Note that was developed in [6, Theorem 4.1], and guarantees AO of for a single robot. The proof technique described in that work will be one of the ingredients used to prove Theorem 1. ^{7}^{7}7Note that can be refined to incorporate the proportion of , which would reduce this expression.
By definition of , for any given there exists robust trajectory , and fixed , such that the cost of is at most and for every it holds that . Next, it is shown that contains a trajectory such that
(1) 
a.a.s.. This immediately implies that , which will finish the proof of Theorem 1.
Thus, it remains to show that there exists a trajectory on which satisfies Equation 1 a.a.s.. As a first step, it will be shown that the robustness of in the composite space implies robustness in the singlerobot setting, i.e., robustness along .
For define the forbidden space parameterized by as
Claim 1
For every robot , , and , .
Fix a robot , and fix some and a configuration . Next, define the following composite configuration
Note that it differs from only in the th robot’s configuration. By the robustness of it follows that
The result of claim 1 is that the paths are robust in the sense that there is sufficient clearance for the individual robots to not collide with each other given a fixed location of a single robot. A Lemma is derived using proof techniques from the literature [6], and it implies every contains a singlerobot path that converges to
Lemma 1
For every robot , constructed with samples and a connection radius contains a path with the following attributes a.a.s.:

, ;

;

, s.t. .
The first property (i) follows from the fact that are directly added to . The rest follows from the proof of Theorem 4.1 in [6], which is applicable here since .
Lemma 1 also implies that contains a path in , that represents robottoobstacle collisionfree motions, and minimizes the multirobot metric cost. In particular, define , where are obtained from Lemma 1. Then
However, it is not clear whether this ensures the existence of a path where robotrobot collisions are avoided. That is, although , it might be the case that . Next it is shown that can be reparametrized to induce a compositespace path whose image is fully contained in , with length equivalent to .
For each robot , denote by the chain of vertices traversed by . For every assign a timestamp of the closest configuration along , i.e.,
Also, define and denote by the ordered list of , according to the timestamp values. Now, for every , define a global timestamp function which assigns to each global timestamp in a singlerobot configuration from . It thus specifies in which vertex robot resides at time . For , let be the largest index such that . Then simply assign . From property (iii) in Lemma 1 and Claim 1 it follows that no robotrobot collisions are induced by the reparametrization, concluding the proof of Theorem 1.
IvB Asymptotic Optimality of
Finally, is shown to be AO. Denote by the time budget in Algorithm 1, i.e., the number of iterations of the loop. Denote by the solution returned by for and .
Theorem 2
If then for every fixed it holds that
Since is AO (Theorem 1), it suffices to show that for any fixed , and a fixed instance of , defined over PRMs with samples each, eventually (as tends to infinity), finds the optimal trajectory over . This can be shown using the properties of a Markov chain with absorbing states [5, Theorem 11.3]. While a full proof is omitted here, the highlevel idea is similar to what is presented in previous work [16, Theorem 3], and expanded upon in Appendix A. By restricting the states of the Markov chain to being the graph vertices along the optimal path, setting the target vertex to be an absorbing vertex, and showing that the probability of transitioning along any edge in this path is nonzero (i.e. the probability is proportional to ), then the probability that this process does not reach the target state along the optimal path converges to as the number of iterations tends to infinity. The final step is to show that the above statements hold when both and tend to . A proof for this phenomenon can be found in [16, Theorem 6].
V Experimental Validation
This section provides an experimental evaluation of by demonstrating practical convergence, scalability, and applicability to dualarm manipulation. The approach and alternatives are executed on a cluster with Intel(R) Xeon(R) CPU E54650 @ 2.70GHz processors, and 128GB of RAM. ^{8}^{8}8Additional data are provided in Appendix B.
2 Disk Robots among 2D Polygons: This basecase test involves disks () of radius , in a region, as in Figure 5. The disks have to swap positions between and . This is a setup where it is possible to compute the explicit roadmap, which is not practical in more involved scenarios. In particular, is tested against: a) running on the implicit tensor roadmap (referred to as “Implicit ”) defined over the same individual roadmaps with nodes each as those used by ; and b) an explicitly constructed roadmap with nodes in the composite space.
Results are shown in Figure 6. converges to the optimal path over , similar to the one discovered by Implicit , while quickly finding an initial solution of high quality. Furthermore, the implicit tensor product roadmap is of comparable quality to the explicitly constructed roadmap.
Table I presents running times. and implicit construct sized roadmaps (row 3), which are faster to construct than the roadmap in (row 1). becomes very costly as increases. For , the explicit roadmap contains vertices, taking GB of RAM to store, which was the upper limit for the machine used. When the roadmap can be constructed, it is quicker to query (row 2). quickly returns an initial solution (row 5), and converges within of the optimum length (row 6) well before Implicit returns a solution as increases (row 4). The next benchmark further emphasizes this point.
Number of nodes: =  50  100  200 

PRM* construction  3.427  13.293  69.551 
PRM* query  0.002  0.004  0.023 
2 size PRM* construction  0.1351  0.274  0.558 
Implicit A* search over  0.684  2.497  10.184 
over (initial)  0.343  0.257  0.358 
over (converged)  3.497  4.418  5.429 
Many Disk Robots among 2D Polygons: In the same environment as above, the number of robots is increased to evaluate scalability. Each robot starts on the perimeter of the environment and is tasked with reaching the opposite side. An roadmap is constructed for every robot. It quickly becomes intractable to construct a roadmap in the composite space of many robots.
Figure 7 shows the inability of alternatives to compete with in scalability. Solution costs are normalized by an optimistic estimate of the path cost for each case, which is the sum of the optimal solutions for each robot, disregarding robotrobot interactions. Implicit fails to return solutions even for 3 robots. Directly executing in the composite space fails to do so for . The original method (without the informed search component) starts suffering in success ratio for and returns worse solutions than . The average solution times for may decrease as increases but this is due to the decreasing success ratio, i.e., begins to only succeed at easy problems.
Dualarm manipulator: This test shows the benefits of when planing for two dimensional arms. Figure 8 shows that fails to return solutions within iterations. Using small roadmaps is also insufficient for this problem. Both and Implicit require larger roadmaps to begin succeeding. But with , Implicit always fails, while maintains a success ratio. As expected, roadmaps of increasing size result in higher quality path. The informed nature of also allows to find initial solutions fast, which together with the branchandbound primitive allows for good convergence.
Vi Discussion
This work studies the asymptotic optimality of samplingbased multirobot planning over implicit structures. The objective is to efficiently search the composite space of multirobot systems while also achieving formal guarantees. This requires construction of asymptotically optimal individual robot roadmaps and appropriately searching their tensor product. Performance is further improved by informed search in the composite space.
These results may extend to more complex settings involving kinodynamic constraints by relying on recent work [14, 13]. Furthermore, the analysis may be valid for different cost functions other than total distance. The tools presented here can prove useful in the context of simultaneous task and motion planning for multiple robots, especially manipulators [1].
References
 [1] A. Dobson and K. E. Bekris. Planning Representations and Algorithms for Prehensile MultiArm Manipulation. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015.
 [2] A. Dobson, G. Moustakides, and K. E. Bekris. Geometric Probability Results For Bounding Path Quality In SamplingBased Roadmaps After Finite Computation. In IEEE Inter. Conf. on Robotics and Automation (ICRA), 2015.
 [3] M. Gharbi, J. Cortés, and T. Siméon. Roadmap Composition for MultiArm Systems Path Planning. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2009.
 [4] F. Gravot and R. Alami. A Method for Handling Multiple Roadmaps and Its Use for Complex Manipulation Planning. In IEEE Internation Conference on Robotics and Automation (ICRA), 2003.
 [5] C. Grinstead and J. Snell. Introduction to Probability. American Mathmatical Society, Providence, RI, 2012.
 [6] L. Janson, A. Schmerling, A. Clark, and M. Pavone. Fast Marching Tree: a Fast marching SamplingBased Method for Optimal Motion Planning in Many Dimensions. International Journal of Robotics Research (IJRR), 34(7):883–921, 2015.
 [7] S. Karaman and E. Frazzoli. Samplingbased Algorithms for Optimal Motion Planning. International Journal of Robotics Research (IJRR), 30(7):846–894, June 2011.
 [8] L. E. Kavraki, P. Svestka, J.C. Latombe, and M. Overmars. Probabilistic Roadmaps for Path Planning in HighDimensional Configuration Spaces. IEEE Transactions on Robotics and Automation, 12(4):566–580, 1996.
 [9] S. M. LaValle and J. J. Kuffner. Randomized Kinodynamic Planning. International Journal of Robotics Research (IJRR), 20:378–400, May 2001.
 [10] S. Leroy, J. Laumond, and Siméon. Multiple Path Coordination for Mobile Robots: A Geometric Algorithm. Barcelona, Catalonia, Spain, 1999. International Joint Conference on Artificial Intelligence (IJCAI).
 [11] S. R. Lindemann and S. M. LaValle. Incrementally reducing dispersion by increasing voronoi bias in rrts. In Proceedings of the 2004 IEEE International Conference on Robotics and Automation, ICRA 2004, April 26  May 1, 2004, New Orleans, LA, USA, pages 3251–3257, 2004.
 [12] J. Peng and S. Akella. Coordinating Multiple Robots with Kinodynamic Constraints Along Specified Paths. The International Journal of Robotics Research, 24(4):295–310, 2005.
 [13] E. Schmerling, L. Janson, and M. Pavone. Optimal samplingbased motion planning under differential constraints: The drift case with linear affine dynamics. In 54th IEEE Conference on Decision and Control, CDC 2015, Osaka, Japan, December 1518, 2015, pages 2574–2581, 2015.
 [14] E. Schmerling, L. Janson, and M. Pavone. Optimal samplingbased motion planning under differential constraints: The driftless case. In IEEE International Conference on Robotics and Automation, ICRA 2015, Seattle, WA, USA, 2630 May, 2015, pages 2368–2375, 2015.
 [15] J. T. Schwartz and M. Sharir. On the Piano Movers’ Problem: III. Coordinating the Motion of Several Independent Bodies: The Special Case of Circular Bodies Moving Amidst Polygonal Barriers. The International Journal of Robotics Research, 2(3):46–75, 1983.
 [16] K. Solovey, O. Salzman, and D. Halperin. Finding a needle in an exponential haystack: Discrete RRT for exploration of implicit roadmaps in multirobot motion planning. I. J. Robotics Res., 35(5):501–513, 2016.
 [17] K. Solovey, O. Salzman, and D. Halperin. New Perspective on Samplingbased Motion Planning via Random Geometric Graphs. In Robotics Science and Systems (RSS), 2016.
 [18] P. Svestka and M. Overmars. Coordinated Path Planning for Multiple Robots. Robotics and Autonomous Systems, 23:125–152, 1998.
 [19] S. Tang and V. Kumar. A Complete Algorithm for Generating Safe Trajectories for MultiRobot Teams. In International Symposium on Robotics Research (ISRR), Sestri Levante, Italy, September 2015.
 [20] J. Van Den Berg, S. Guy, M. Lin, and D. Manocha. Reciprocal nbody collision avoidance, volume 70 of Springer Tracts in Advanced Robotics, pages 3–19. Star edition, 2011.
 [21] J. van den Berg, J. Snoeyink, M. Lin, and D. Manocha. Centralized Path Planning for Multiple Robots: Optimal Decoupling into Sequential Plans. In Robotics: Science and Systems (RSS), 2009.
 [22] G. Wagner and H. Choset. Subdimensional Expansion for Multirobot Path Planning. Artificial Intelligence Journal, 219:1024, 2015.
a Proof of Convergence
This appendix examines the result of Theorem 2, and formally proves the convergence of the tree toward containing all optimal paths.
Lemma 2 (Optimal Tree Convergence of )
Consider an arbitrary optimal path originating from and ending at , then let be the event such that after iterations of , the search tree contains the optimal path up to segment . Then,
Proof. This property will be proven using a theorem from Markov chain literature [5, Theorem 11.3]. Specifically, the properties of absorbing Markov chains can be exploited to show that will eventually contain the optimal path over for a given query. An absorbing Markov chain is one such that there is some subset of states in which the transition matrix only allows that state to transition to itself.
The proof follows by showing that the method can be described as an absorbing Markov chain, where the target state of a query is represented as an absorbing state in a Markov chain, restated here.
Theorem 3 (Thm 11.3 in Grinstead & Snell)
In an absorbing Markov chain, the probability that the process will be absorbed is 1 (i.e., as ), where is the transition submatrix for all nonabsorbing states.
There are two steps to using this proof. First, that the search can be cast as an absorbing Markov chain, and second, that the probability of transition for from each state to the next in this chain is nonzero (i.e. that each state can eventually be connected to the target).
For query , let the sequence of length represent the vertices of corresponding to the optimal path through the graph which connects these points, where corresponds to the target vertex, and furthermore, let be an absorbing state. Theorem 3 operates under the assumption that each vertex is connected to an absorbing state ( in this case).
Then, let the transition probability for each state have two values, one for each state transitioning to itself, which corresponds to the search expanding along some other arbitrary path. The other value is a transition probability from to . This corresponds to the method sampling within the volume .
Then, as the second step, it must be shown that this volume has a positive probability of being sampled in each iteration. It is sufficient then to argue that . Fortunately, for any finite , previous work has already shown that this is the case given general position assumptions [16, Lemma 2].
Given these results, the is cast as an absorbing Markov chain which satisfies the assumptions of 3, and therefore, the matrix . This implies that the optimal path to the goal has been expanded in the tree, and therefore
B More Experimental Data
This appendix presents additional experimental data omitted from Section V.
B1 2Robot Benchmark
B2 RRobot Benchmark
For the robot benchmark, additional data is presented in Figure 10, showing query resolution times for the various methods.
To emphasize the lack of scalability for alternate methods, additional experiments were run in this setup using a minimal roadmap. The tests use a node roadmap for each robot as illustrated in Figure 12. Each roadmap is constructed with slight perturbations to the nodes within the shaded regions indicated in the figure.
The data for this modified benchmark (shown in Figure 11) indicate that even using a very small roadmap does not allow alternate methods to scale. While the methods scale better, Implict does time out for , and times out for .