Collision-aware Task Assignment for Multi-Robot Systems
We propose a novel formulation of the collision-aware task assignment (CATA) problem and a decentralized auction-based algorithm that solves it with a guaranteed optimality bound. Using a collision cone, we predict potential collisions and introduce a binary decision variable into the local reward function for task bidding. We further improve CATA by implementing a receding collision horizon to address the stopping robot scenario, i.e., when robots are confined to their task locations and become static obstacles to other moving robots. The auction-based algorithm encourages the robots to bid for tasks with collision mitigation in mind. We validate the improved task assignment solution with both simulation and experimental results, which show a significant reduction in overlapping paths as well as deadlocks.
A successful multi-robot mission generally requires the robots in a team to collaborate towards a common goal, with each robot performing its task in coordination with its teammates. Distributing tasks among robots is commonly referred to as the task assignment problem, and it is generally followed by a multi-robot path planning problem that routes every robot to its assigned task. In any realistic scenario, path planning should consider collision avoidance [3, 11, 2].
Task assignment is a well-known combinatorial optimization problem: tasks need to be assigned to robots so as to minimize a global cost function. This global cost function depends on the specific mission, and customized constraints are usually added to the optimization problem. The task assignment problem is NP-hard and therefore requires heuristic approaches [12, 15]. The search in the solution space is usually performed in a centralized setting, where the robots need to communicate with a central planner. This reduces the computational load on individual robots, but introduces a single point of failure in the mission. Data transmission towards the central planner also increases with the number of robots, causing congestion. Some decentralized methods instantiate a task planner on each robot and make use of consensus algorithms to reach a consistent representation of the environment before task assignment. Other methods let individual robots conduct local planning first and then exploit a consensus algorithm to reach agreement on the assignment.
Task assignment procedures generate a collection of tuples $(i, j)$, where $i$ refers to the robot ID and $j$ refers to a task identifier. The subsequent multi-robot path planning problem aims at finding the optimal path from the location of robot $i$ to the location of task $j$. Global planners use prior information about the environment to predict potential collisions between robots, and typically assume perfect motion execution and minimal external disturbance. In contrast, reactive collision avoidance strategies naturally work in a decentralized system and are more robust to a dynamic environment. However, all these solutions deal with a predefined task assignment, despite the clear indication that task assignment and path planning are inherently coupled. Recent work has confirmed that higher level reasoning is necessary to improve the collision avoidance performance as the number of robots increases. Integrating collision awareness with the task assignment process could potentially boost the performance of path planning as well as collision avoidance algorithms.
Existing studies that integrate task assignment and path planning are very limited. Cons and Edison et al. [7, 8] explored the coupled nature of task assignment and path optimization by considering the actuation constraints of fixed-wing UAVs, but assumed altitude layering for multi-robot path planning, which means each aircraft flies at a different altitude. This quickly becomes unrealistic as the number of aircraft increases. Yao et al. [22] implemented a reestimation mechanism that rejects the task assignment results when an unrealistic task completion time occurs during the path planning stage. To the best of our knowledge, no existing work has considered collision mitigation within the task assignment problem. This is due to the common assumption that tasks are distributed sparsely and that robot dimensions are small relative to path lengths. This assumption becomes invalid for operations in task-dense environments, such as those involving automatic construction or collective transportation. In a congested scenario where multiple robots attempt to cross paths with each other to reach their assignments, neglecting collision mitigation during the task assignment stage causes the mission completion time and travel distance to increase without an upper bound or, even worse, causes robots to collide with each other as the scenario becomes too complex for local collision avoidance strategies.
Our work formulates a task assignment problem with collision mitigation terms and develops a decentralized task assignment method that makes use of consensus algorithms. In this paper, we present an analytical derivation that provides a guaranteed optimality lower bound, and validate our results with simulation and experimental campaigns.
The remainder of this paper is organized as follows: Section II presents the classical task assignment problem, collision cones, and Buzz [13] (a language designed for robot swarms) as preliminaries to the following discussion; Section III provides the mathematical formulation of the problem and proposes a solution to the collision-aware task assignment problem; Sections IV and V present simulation and experimental results on the performance of our method; finally, Section VI offers some concluding remarks.
II-A Classical task assignment problem
The objective of a task assignment problem is to maximize a global scoring function or to minimize a global cost function while enforcing a set of constraints. In this work, we consider the maximization problem and express the global scoring function as a sum of local reward functions:
$$\max \sum_{i=1}^{N_r} \sum_{j=1}^{N_t} c_{ij}\, x_{ij} \qquad (1)$$
$$\text{subject to} \quad \sum_{i=1}^{N_r} x_{ij} \le 1 \;\; \forall j, \qquad \sum_{j=1}^{N_t} x_{ij} \le L_t \;\; \forall i, \qquad x_{ij} \in \{0, 1\}$$
where $c_{ij}$ is the local reward that occurs when assigning robot $i$ to task $j$; $x_{ij}$ indicates whether robot $i$ is assigned to, or in market-based strategy parlance, has won task $j$; $N_t$ refers to the total number of tasks, and $N_r$ refers to the total number of robots. The constraint $\sum_{i} x_{ij} \le 1$ indicates that no two robots are assigned to the same task. $L_t$ represents the maximum number of assignments for each robot. A special case of the formulation above is when $L_t = 1$, which is commonly referred to as the single-assignment problem.
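As a small illustration of the single-assignment special case, the sketch below enumerates every one-task-per-robot assignment for a tiny reward matrix and keeps the best one. It is only a brute-force reference, not the auction algorithm discussed in this paper; the function name `solve_single_assignment` and the numbers in `reward` are hypothetical, and the code assumes non-negative rewards and no more robots than tasks.

```python
from itertools import permutations


def solve_single_assignment(reward):
    """Brute-force the single-assignment problem.

    reward[i][j] is the local reward for assigning robot i to task j.
    Assumes non-negative rewards and len(reward) <= len(reward[0]),
    so every robot receives exactly one distinct task.
    """
    n_robots, n_tasks = len(reward), len(reward[0])
    if n_robots > n_tasks:
        raise ValueError("this sketch assumes no more robots than tasks")
    best_score, best_tasks = float("-inf"), None
    # Each permutation gives robot i the task tasks[i]; tasks never repeat.
    for tasks in permutations(range(n_tasks), n_robots):
        score = sum(reward[i][j] for i, j in enumerate(tasks))
        if score > best_score:
            best_score, best_tasks = score, tasks
    return best_score, best_tasks


# Hypothetical 3-robot, 3-task reward matrix.
reward = [
    [9.0, 2.0, 1.0],
    [8.0, 7.0, 1.0],
    [1.0, 3.0, 4.0],
]
score, assignment = solve_single_assignment(reward)
```

The enumeration grows factorially with the number of robots, which is why practical methods rely on heuristics or auctions instead.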
Choi et al. [6] proposed a consensus-based decentralized auction algorithm for both the single-assignment and multi-assignment problems. Under the diminishing marginal gain assumption on the global scoring function, the algorithm guarantees convergence and solution optimality with a lower bound. The diminishing marginal gain condition assumes that the local reward of a task does not increase as other tasks are assigned before it, which is true for many reward functions used in search and exploration problems. However, in [6] and its follow-up work [4], the local reward of robot $i$ winning task $j$ only depends on its own previously won tasks in a multi-assignment problem. For the single-assignment problem, the local reward function is static. In this paper, we formulate the single-assignment problem with collision mitigation, meaning that the local reward of a robot winning any task also depends on which tasks its neighbors have won.
II-B Collision cone
The collision cone has been widely used to predict collisions between two moving robots based on their current locations and velocities. As depicted in Fig. 1, when robot $i$ at location $\mathbf{p}_i$ is moving with velocity $\mathbf{v}_i$ and robot $k$ at location $\mathbf{p}_k$ is moving with velocity $\mathbf{v}_k$, a corresponding collision cone can be generated with a predefined safety distance $d_s$, which is indicated by the grey circular area. By determining whether the relative velocity $\mathbf{v}_{ik}$ lies inside or outside the collision cone, we can predict potential future collisions. The task locations of the two robots are also indicated in Fig. 1, with the underlying assumption that the robots are capable of moving towards the task locations in a straight line regardless of their orientations. Several studies [21, 18, 10] have integrated actuation constraints into collision avoidance methods based on collision cones. In this paper, the robots are assumed to be holonomic for the sake of simplicity, although the formulation proposed in Section III does not rely on this assumption.
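The geometric test described above can be sketched in a few lines. The version below is a minimal planar example under stated assumptions: positions and velocities are 2D tuples, the relative velocity is taken as the velocity of the first robot minus that of the second, and the function name `in_collision_cone` and its parameters are hypothetical rather than taken from the paper.

```python
import math


def in_collision_cone(p_i, v_i, p_k, v_k, safety_distance):
    """Return True if the relative velocity of robot i with respect to
    robot k points inside the collision cone around robot k."""
    rx, ry = p_k[0] - p_i[0], p_k[1] - p_i[1]  # relative position
    vx, vy = v_i[0] - v_k[0], v_i[1] - v_k[1]  # relative velocity
    if math.hypot(rx, ry) <= safety_distance:
        return True  # already inside the safety disc
    speed = math.hypot(vx, vy)
    if speed == 0.0:
        return False  # no relative motion, the separation stays constant
    if rx * vx + ry * vy <= 0.0:
        return False  # the robots are not closing in on each other
    # Perpendicular miss distance between robot k and the line along the
    # relative velocity; inside the cone means it is below the safety distance.
    miss = abs(rx * vy - ry * vx) / speed
    return miss < safety_distance


head_on = in_collision_cone((0, 0), (1, 0), (10, 0), (-1, 0), 1.0)
wide_pass = in_collision_cone((0, 0), (1, 0), (10, 5), (0, 0), 1.0)
moving_away = in_collision_cone((0, 0), (-1, 0), (10, 0), (0, 0), 1.0)
```

Comparing the miss distance with the safety distance is equivalent to comparing the angle between the relative velocity and the line of sight with the half-angle of the cone, as long as the robots are approaching each other.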
III Collision-Aware Task Assignment
III-A Problem statement
Given a set of tasks and a set of robots, the goal is to find an optimal assignment that maximizes a global scoring function while minimizing potential collision incidents. Each robot can be assigned to one task at most and no two robots should be assigned to the same task. The mathematical form of the problem can be written as below:
where $x_{kl}$ indicates the assignment status of robot $k$, a neighbor of robot $i$, and $c_{ijkl}$ is the cost of assigning robot $i$ to task $j$ when robot $k$ is already assigned to task $l$. $c_{ij}$, similar to the term in (1), is the local reward that occurs when assigning robot $i$ to task $j$, which is independent of the assignment of robot $k$. It is worth noting that (2) resembles the general form of the quadratic assignment problem [5], where the inequality constraints become equality constraints. The quadratic assignment problem is one of the fundamental combinatorial optimization problems (from the facilities location family) and, to the best of our knowledge, this analogy has not been drawn before.
By rearranging the terms in (2), the optimization function can be written as
To simplify the notation, we replace the term with a collision weight that is determined by the collision status when robot $i$ is assigned to task $j$ and robot $k$ to task $l$.
The collision weights can assume different values for each robot-task pair, and it can be deduced that their sum over the assigned neighbors lies between zero and one, as negative scores do not have real-world meaning: the worst case scenario is that path overlapping prevents the robot from reaching its assigned task, leading to a null reward. Therefore, we can approximate the sum in (5) with a binary decision variable,
This binary variable is determined by the collision status when robot $i$ is assigned to task $j$:
where $\mathbf{v}_{ik}$ refers to the relative velocity as depicted in Fig. 1, and $CC_{ik}$ refers to the collision cone determined by the robots' locations and the predefined safety distance.
By comparing (4) with the classical task assignment formulation (1), it is apparent that the optimization function has a similar form, while the reward function is multiplied by a binary decision variable. As more tasks are assigned before robot $i$ bids for task $j$, the binary decision variable can only increase from zero to one, which can only further reduce the local reward. We can then draw a conclusion: when the local reward function satisfies the diminishing marginal gain condition, the reward function that considers collision mitigation also satisfies the diminishing marginal gain condition. This allows us to take advantage of properties that are already proven in [6], which guarantees 50% optimality assuming all the robots have accurate knowledge of the situation. We give a detailed proof of the optimality lower bound in the Appendix.
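For orientation, the chain of steps described in this subsection can be written compactly as follows. This is only one reading that is consistent with the prose, not a reproduction of the original equations: the symbols $c_{ijkl}$, $x_{kl}$, $w_{ijkl}$, $\delta_{ij}$, $\tilde{c}_{ij}$, $\mathbf{v}_{ik}$ and $CC_{ik}$ are assumed names, the original equation numbers are deliberately omitted, and the single-assignment constraints (at most one task per robot, at most one robot per task) are taken to hold throughout.

```latex
\begin{align*}
&\max_{x} \; \sum_{i=1}^{N_r}\sum_{j=1}^{N_t}
   \Big( c_{ij} - \sum_{k \ne i}\sum_{l=1}^{N_t} c_{ijkl}\, x_{kl} \Big)\, x_{ij}
   && \text{(quadratic form)} \\
&= \max_{x} \; \sum_{i=1}^{N_r}\sum_{j=1}^{N_t}
   c_{ij} \Big( 1 - \sum_{k \ne i}\sum_{l=1}^{N_t} w_{ijkl}\, x_{kl} \Big)\, x_{ij},
   \qquad w_{ijkl} = \frac{c_{ijkl}}{c_{ij}}
   && \text{(rearranged, assuming } c_{ij} > 0\text{)} \\
&0 \le \sum_{k \ne i}\sum_{l=1}^{N_t} w_{ijkl}\, x_{kl} \le 1
   && \text{(no negative reward)} \\
&\sum_{k \ne i}\sum_{l=1}^{N_t} w_{ijkl}\, x_{kl} \approx \delta_{ij}, \qquad
 \delta_{ij} =
 \begin{cases}
   1 & \text{if } \mathbf{v}_{ik} \in CC_{ik} \text{ for some assigned neighbor } k \\
   0 & \text{otherwise}
 \end{cases}
   && \text{(binary approximation)} \\
&\tilde{c}_{ij} = c_{ij}\,(1 - \delta_{ij})
   && \text{(collision-aware local reward)}
\end{align*}
```

Under this reading, $\delta_{ij}$ can only switch from zero to one as more neighbors win tasks, so $\tilde{c}_{ij}$ never increases during the auction, which is the property the diminishing marginal gain argument relies on.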
III-B Auction and consensus strategy implemented in Buzz
Here we implement the auction and consensus strategy in Buzz [13] and use the virtual stigmergy structure as the information propagation infrastructure for consensus agreements. Virtual stigmergy [14] is a (key, value) pair based shared memory that allows the robots to globally agree on the values of a set of variables, which is a good fit for the operation here.
The consensus-based auction algorithm consists of two phases: the auction process and the consensus process. During the auction process, each robot first fetches the assignment set from the virtual stigmergy. The assignment set consists of tuples $(i, j)$ that link the robot ID to its assigned task identifier. This set can be searched with either the robot ID or the task identifier. Then, the robot computes its own bid for every task that has not been assigned, while considering the respective collision status with each of its neighbors that have already won a task. Finally, each robot determines its own highest bid as a tuple and puts this tuple in the virtual stigmergy, which is akin to broadcasting this tuple to its neighbors. Algorithm 1 shows the procedure of robot $i$'s auction phase. The time-discounted reward function is used to compute the local reward before any collision mitigation consideration:
$$c_{ij} = \lambda_j^{\tau_{ij}}\, \bar{c}_j$$
where $\lambda_j$ is the discounting factor for task $j$ and $\tau_{ij}$ is the estimated travel time for robot $i$ to reach task $j$. $\bar{c}_j$ refers to the inherent value of task $j$, which usually depends on the importance of the specific task to the whole mission. For the sake of simplicity, this paper uses the same discounting factor and inherent value for all the tasks.
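To make the time-discounted reward concrete, the sketch below computes it for one robot-task pair and then applies the collision mitigation by zeroing the bid when a collision is predicted. It assumes the usual form in which the inherent value is multiplied by the discounting factor raised to the travel time; the function names, the parameter names and the numbers are hypothetical.

```python
def time_discounted_reward(inherent_value, discount, travel_time):
    """Inherent task value discounted geometrically by travel time."""
    if not 0.0 < discount < 1.0:
        raise ValueError("the discounting factor is assumed to lie in (0, 1)")
    if travel_time < 0.0:
        raise ValueError("travel time cannot be negative")
    return inherent_value * discount ** travel_time


def collision_aware_bid(inherent_value, discount, travel_time, collision_predicted):
    """Bid submitted for a task: the base reward, or zero when the
    collision cone predicts a conflict with an already assigned neighbor."""
    base = time_discounted_reward(inherent_value, discount, travel_time)
    return 0.0 if collision_predicted else base


# Hypothetical values: task worth 10, discount 0.9, two time units away.
free_bid = collision_aware_bid(10.0, 0.9, 2.0, collision_predicted=False)
blocked_bid = collision_aware_bid(10.0, 0.9, 2.0, collision_predicted=True)
```

With these numbers the unobstructed bid is 10 x 0.9^2 = 8.1, while the obstructed bid drops to zero, so a closer but conflicting task can lose to a farther, conflict-free one.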
In the event of simultaneous modification of the global bidding tuple in the virtual stigmergy, we resolve the conflict by accepting the tuple with the higher bid. After every robot has submitted its bid tuple, each robot updates the assignment set accordingly. Algorithm 2 details the update policies on robot $i$.
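The two phases can be emulated in a single process as a rough sketch. The code below is not the Buzz implementation and does not use the virtual stigmergy API: a plain dictionary named `shared` stands in for the shared memory, `put_bid` mimics the keep-the-higher-bid conflict rule, and the reward table is static and hypothetical, whereas CATA would recompute the bids every round from the current collision status.

```python
def put_bid(shared, candidate):
    """Write candidate = (bid, robot_id, task_id) into the shared store,
    keeping the existing entry unless the new bid is strictly higher."""
    current = shared.get("highest_bid")
    if current is None or candidate[0] > current[0]:
        shared["highest_bid"] = candidate


def auction_round(rewards, assignment, shared):
    """Run one auction round followed by the consensus update.

    rewards[robot][task] is the bid of an unassigned robot for a task;
    assignment maps robot_id -> task_id and is updated in place.
    Returns the winning (bid, robot_id, task_id) tuple, or None.
    """
    taken = set(assignment.values())
    for robot, row in rewards.items():
        if robot in assignment:
            continue  # robots that already won a task do not bid again
        open_bids = [(bid, robot, task) for task, bid in row.items()
                     if task not in taken]
        if open_bids:
            put_bid(shared, max(open_bids))  # submit the local highest bid
    winner = shared.pop("highest_bid", None)
    if winner is not None and winner[0] > 0.0:
        _, robot, task = winner
        assignment[robot] = task  # every robot applies the same update
    return winner


# Hypothetical static bids for two robots and two tasks.
rewards = {0: {0: 5.0, 1: 3.0}, 1: {0: 4.0, 1: 1.0}}
assignment, shared = {}, {}
first = auction_round(rewards, assignment, shared)
second = auction_round(rewards, assignment, shared)
```

Each round commits exactly one robot-task pair, the globally highest bid, which is the property used in the optimality argument of the Appendix.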
III-C Receding collision horizon
By introducing a binary decision variable into the local reward function, robot $i$ is discouraged from bidding for task $j$ when robot $k$ has already been assigned to task $l$ and a potential collision is predicted. This effectively reduces the crossing path incidents during mission execution. However, there exists another collision scenario that is often neglected in existing works: the stopping robot scenario. To better explain this scenario, we show a task assignment problem with 25 robots and 25 tasks, illustrated in Fig. 2.
The robots are uniformly distributed within a square-shaped arena, referred to as the robot arena, and the tasks are arranged in three layers within a circular area, referred to as the task area. The number of robots and tasks as well as their formations are chosen for illustration purposes only and do not impose any assumption on the following discussion. A jet colormap is used for coloring. The color of each robot depends on the distance between that robot and the center of the task area, while the color of each task depends on the distance between that task and the center of the robot arena. This paper refers to the robots that are closer to the task area as front-row robots and to the robots that are farther from the task area as back-row robots. The same terms are also used to describe tasks with respect to their distance to the center of the robot arena. When using a time-discounted reward function during the auction process without a collision mitigation term, the front-row robots naturally win the front-row tasks and the back-row robots end up winning the back-row tasks, as shown in Fig. 2(a). The bidding results after six iterations are indicated with lines connecting each robot with its assigned task.
Although the actual content of the task is outside the scope of this paper, it is reasonable to assume that the robots will be bound to the task location for a certain duration of time, which means the front-row robots become static obstacles after arriving at the front-row tasks. This significantly complicates the collision scenarios that back-row robots have to face when they reach the task area. Unfortunately, introducing a binary decision variable into the local reward function cannot effectively mitigate this issue due to the locality of the collision cone. Therefore, we propose a receding collision horizon: instead of a static safety distance, we use a diminishing safety distance during the bidding process.
When the collision cone is used in collision avoidance algorithms, the safety distance is usually determined by the robot size as well as its locomotion capability. This parameter is represented by $d_s$ here. In the collision mitigation context, the interpretation of the safety distance can be further extended. Intuitively, it represents how far the robot will risk going into the collision horizon to win a task. Consider a specific front-row robot $i$ and its closest neighbor $k$, which has been assigned to task $l$: when $d_s$ is increased so that it is no smaller than the distance between robots $i$ and $k$, a collision will be predicted between robots $i$ and $k$ regardless of robot $i$'s assignment. Therefore, robot $i$ is discouraged from bidding for any task. If we start with a large safety distance and reduce it only when a zero bid is submitted by all robots, the back-row robots are encouraged to bid for front-row tasks even when the front-row robots are not fully assigned. Fig. 2(b) shows the results after six bidding iterations when using this scheme.
From the perspective of the optimization formulation, introducing the receding collision horizon segments a large problem into smaller problems. Each smaller problem still uses the local reward function in the form of (6), which satisfies the diminishing marginal gain condition; thus the 50% optimality guarantee still holds.
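The receding collision horizon described in this subsection can be sketched as a centralized loop. This is a simplified emulation under several assumptions that are not taken from the paper: robots move in straight lines at unit speed towards their tasks, the horizon shrinks by a fixed factor `shrink`, the loop stops at a minimum distance `d_min`, and all function and parameter names are hypothetical.

```python
import math


def heading(src, dst):
    """Unit vector from src to dst (zero vector if they coincide)."""
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    norm = math.hypot(dx, dy)
    return (dx / norm, dy / norm) if norm > 0.0 else (0.0, 0.0)


def cone_hit(p_i, v_i, p_k, v_k, d_s):
    """Collision cone test with safety distance d_s."""
    rx, ry = p_k[0] - p_i[0], p_k[1] - p_i[1]
    if math.hypot(rx, ry) <= d_s:
        return True  # neighbor already inside the collision horizon
    vx, vy = v_i[0] - v_k[0], v_i[1] - v_k[1]
    speed = math.hypot(vx, vy)
    if speed == 0.0 or rx * vx + ry * vy <= 0.0:
        return False
    return abs(rx * vy - ry * vx) / speed < d_s


def bid(i, j, assigned, robots, tasks, d_s, discount):
    """Time-discounted bid, zeroed when any assigned neighbor conflicts."""
    v_i = heading(robots[i], tasks[j])
    for k, l in assigned.items():
        v_k = heading(robots[k], tasks[l])
        if cone_hit(robots[i], v_i, robots[k], v_k, d_s):
            return 0.0
    return discount ** math.dist(robots[i], tasks[j])


def assign_receding(robots, tasks, d_start, d_min, shrink=0.5, discount=0.95):
    """Greedy auction that shrinks d_s only when every bid is zero."""
    assigned, d_s = {}, d_start
    while len(assigned) < min(len(robots), len(tasks)):
        taken = set(assigned.values())
        bids = [(bid(i, j, assigned, robots, tasks, d_s, discount), i, j)
                for i in range(len(robots)) if i not in assigned
                for j in range(len(tasks)) if j not in taken]
        best_bid, i, j = max(bids)
        if best_bid > 0.0:
            assigned[i] = j
        elif d_s > d_min:
            d_s = max(d_min, d_s * shrink)  # all bids were zero: recede
        else:
            break  # nothing can be assigned even at the minimum distance
    return assigned


robots = [(0.0, 0.0), (0.0, 10.0)]
tasks = [(5.0, 0.0), (5.0, 10.0)]
result = assign_receding(robots, tasks, d_start=20.0, d_min=0.5)
```

In this toy run the first winner blocks the other robot while the safety distance is at least 10, so the horizon is halved twice before the second assignment is made.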
IV Simulation Results
We developed a greedy algorithm to sequentially find the global bid tuple that renders the highest bid given prior assignments. This centralized procedure generates the same task assignment solution as CATA, assuming a fully connected network in which every robot is able to submit its bid before any task selection is made. Another distributed auction algorithm for single-assignment with only time-discounted rewards, named the consensus-based auction algorithm (CBAA) in [6], is also simulated as a control group. We chose the distributed reactive collision avoidance (DRCA) method as the local collision avoidance strategy due to its capability of handling multi-robot interaction and of keeping the robots collision-free even before they enter the collision cone. Deconfliction maintenance is triggered when a robot detects one or multiple neighbors entering its safety zone but remaining outside the collision cone, and a deconfliction maneuver is triggered when a robot detects a neighbor entering its collision cone.
Table I: Deadlock occurrence for the grid setup and the line setup.
Here we assume $N_r = N_t$ and use $N$ to replace both notations for the sake of simplicity, although this is not required by the proposed algorithm. Regarding the robots' initial locations, we used two different patterns for comparison. Taking $N = 25$ as an example, twenty-five robots are uniformly located on a 5x5 grid in Fig. 3(a), represented by the colored squares and referred to as the grid setup in the remainder of the paper; robots are uniformly located on a horizontal line in Fig. 3(b), referred to as the line setup. These two patterns are chosen instead of randomly generated robot locations because real-life task assignment scenarios often deal with initial configurations like these. This comparison also provides more insight into the relation between collision mitigation performance and the initial setup. Task locations are randomly generated from two normal distributions (black squares).
We simulate 100 trials for each setup with two different numbers of robots. We monitor three types of collision incidents throughout the mission: 1) any robot enters the collision cone of any other robot; 2) two robots enter each other's safety zone without entering the collision cone; 3) multiple robots enter each other's safety zone without entering the collision cone. We refer to these scenarios as avoidance, maintain-one, and maintain-multi, respectively. The occurrence rate of these collision incidents is a good indicator of the complexity that local collision avoidance needs to handle. Figure 4 shows the simulation results, where CATA substantially reduced all three types of collision incidents. It is worth noting that as the number of robots increases, the occurrence of collision incidents increases substantially when using CBAA, whereas CATA limits this increase. In addition, the reduction of maintain-multi incidents is of particular interest because most existing local collision avoidance methods perform poorly when handling multi-robot interaction in real-life scenarios. We also observed deadlocks in some trials, where collision avoidance simply failed. The deadlock occurrence rates for both team sizes are summarized in Table I. As the total number of robots increases, the grid setup generates more deadlocks than the line setup because of the stopping robot scenario explained in Section III-C. It might be tempting to always choose a line setup over a grid setup; however, that quickly becomes unrealistic for practical applications. Here we demonstrate that CATA significantly reduced deadlocks for both the grid setup and the line setup. Out of one hundred trials, fewer than 10 trials failed using CATA, while 52 trials failed using CBAA for the grid setup.
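The three incident types can be told apart with a simple rule applied to each robot at each simulation step. The sketch below is one possible bookkeeping scheme, not the code used for the reported results; the function name `classify_incident` and the per-neighbor flag pairs are hypothetical.

```python
def classify_incident(neighbors):
    """Label the incident seen by one robot at one time step.

    neighbors is a list of (in_safety_zone, in_collision_cone) booleans,
    one pair per neighboring robot. Returns "avoidance", "maintain-one",
    "maintain-multi", or None when no incident occurs.
    """
    # Any neighbor inside the collision cone requires an avoidance maneuver.
    if any(in_cone for _, in_cone in neighbors):
        return "avoidance"
    # Otherwise count neighbors inside the safety zone but outside the cone.
    zone_only = sum(1 for in_zone, in_cone in neighbors
                    if in_zone and not in_cone)
    if zone_only == 1:
        return "maintain-one"
    if zone_only > 1:
        return "maintain-multi"
    return None


labels = [
    classify_incident([(True, True), (True, False)]),
    classify_incident([(True, False)]),
    classify_incident([(True, False), (True, False), (False, False)]),
    classify_incident([(False, False)]),
]
```

Counting these labels over all robots and steps of a trial gives occurrence rates of the kind compared between CBAA and CATA.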
Box plots of the mission completion steps of successful trials are presented in Fig. 5. The average completion steps remained approximately the same for CBAA and CATA. Intuitively, this indicates that CATA successfully reduced collision incidents without lengthening the overall mission.
V Experimental Validation
The performance of CATA was studied using a small team of 8 KheperaIV [19] robots. Our experimental platform consists of an IR-camera-based optical tracking system (Optitrack), a central communication hub emulating the inter-robot communications, and 8 KheperaIV robots. The central communication hub obtained the positions from the tracking system and emulated situated communication [20], where receivers of a message are aware of the senders' positions in their own reference frames. During the experiments, all the robots ran an instance of the Buzz Virtual Machine (BVM) [13] and executed identical scripts. The script includes task assignment, velocity control and local collision avoidance algorithms. Different task assignment schemes are used for comparison, including CATA, manual optimal assignment, and random assignment. We used a simple integrator controller for velocity control, which receives a target position and applies a piecewise function to determine the left and right wheel velocities of the differential drive robot. When the robots move too close to each other, a light-weight collision avoidance algorithm (LCA) [16] exerts a virtual force on the robots and deflects them away from other moving robots and obstacles.
With eight predefined task locations in Fig. 6, we report the task assignment results (top row) and robot trajectories (bottom row) with CATA (middle), optimal assignment (left) and random assignment (right). The optimal assignment was specified by a human operator and the random assignment was obtained by randomly associating tasks to robots. It can be observed that CATA is capable of taking the potential collisions into account and provides reasonably detangled assignments. The non-holonomic nature of the robots resulted in spiral-shaped trajectories and, due to imperfections in position estimation, some robots, such as R5 in Fig. 6(d), turned on the spot and executed a straight trajectory. In all three sub-figures of the second row, we have marked the local collision avoidance activity with red lines, which means that at least one of the robots was too close to its neighbor and triggered the local collision avoidance. We repeated the CATA-based task assignment experiment three times and obtained roughly identical assignments, except for small changes in the trajectories as a result of communication delays, positioning errors, and asynchronous script execution.
We proposed a collision-aware task assignment strategy that considers potential collisions during the bidding process. By shaping the local rewards of tasks with collision cones and addressing the stopping robot problem with a receding collision horizon, we successfully mitigated inter-robot collisions during the task assignment stage. As a result, the local collision avoidance method handles fewer and simpler collision incidents. We empirically evaluated the approach with simulations and reported significantly improved results under various configurations. We also implemented the algorithm in Buzz on real robots and presented the trajectories obtained with different task assignment schemes. As KheperaIV robots are differential wheeled robots, the actuation constraints introduced more complicated collision scenarios than those in the simulations. For future work, we plan to extend our work by adapting the approach to nonholonomic robots and eventually to heterogeneous robotic systems.
Here we show that CATA guarantees 50% optimality when the local reward function satisfies the diminishing marginal gain condition.
Proof: Each round of auction produces one globally highest bid. For notational convenience, we use the same symbol for both the round identifier and the ID of the robot that wins the auction at the corresponding round. In other words, robot $i$ won the auction at round $i$ and robot $j$ won the auction at round $j$, each with its own winning bid. We assume $i < j$ for the rest of this section.
Because only the globally highest bid wins the auction, the following condition holds:
where is an assignment set that can be searched with either the robot ID or the task identifier, gives the task that robot has won at round i. Because each robot only submits its local highest bid and its local reward function satisfies the diminishing marginal gain condition, which means the bid that any robot can submit for any task monotonically decreases as the auction proceeds, the following condition holds:
And the largest improvement of the combined bid of robot and is achieved when
Now consider CATA provides a solution so that the objective value is
where refers to the maximum number of tasks that can be assigned. Since each robot can only take one task, any possible variation of the assignment should be in the form of a sequence of task swapping. The largest possible improvement of the objective value can be achieved after a specific sequence of task swapping while every task swapping satisfies the condition (12). Therefore the optimal objective value (OOV) should satisfy
where refers to the number of tasks that need to be swapped to achieve the largest possible improvement, and refers to the rest of the assignments.
Thus, the objective value achieved by CATA is at least half of the OOV, and the 50% optimality is guaranteed.
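The bound can also be checked numerically on small instances. The sketch below is a sanity check rather than part of the proof: it uses static, randomly drawn rewards (the single-assignment case without collision terms), compares a sequential highest-bid-first assignment with the brute-force optimum, and records the worst ratio observed. All names, the instance size and the random seed are hypothetical choices.

```python
import random
from itertools import permutations


def greedy_value(reward):
    """Repeatedly commit the globally highest remaining bid."""
    n = len(reward)
    free_robots, free_tasks = set(range(n)), set(range(n))
    total = 0.0
    while free_robots:
        bid, i, j = max((reward[i][j], i, j)
                        for i in free_robots for j in free_tasks)
        total += bid
        free_robots.remove(i)
        free_tasks.remove(j)
    return total


def optimal_value(reward):
    """Brute-force optimum over all one-to-one assignments."""
    n = len(reward)
    return max(sum(reward[i][p[i]] for i in range(n))
               for p in permutations(range(n)))


def worst_ratio(trials=200, n=4, seed=0):
    """Smallest greedy/optimal ratio over random positive reward matrices."""
    rng = random.Random(seed)
    worst = 1.0
    for _ in range(trials):
        reward = [[rng.random() for _ in range(n)] for _ in range(n)]
        worst = min(worst, greedy_value(reward) / optimal_value(reward))
    return worst


# A hand-made instance that comes close to the bound: the greedy choice
# takes the single best bid and leaves the second robot with nothing.
near_worst = [[1.0, 0.99], [0.99, 0.0]]
ratio = greedy_value(near_worst) / optimal_value(near_worst)
observed = worst_ratio()
```

The hand-made instance yields a ratio just above one half, which illustrates why the factor of two in the bound cannot be improved in general for this kind of sequential selection.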
We would like to thank NSERC for supporting this work under the NSERC Strategic Partnership Grant (479149). Simulations were performed using the computing clusters managed by Calcul Québec and Compute Canada.
- [1] J. Alonso-Mora, P. Beardsley, and R. Siegwart. Cooperative Collision Avoidance for Nonholonomic Robots. IEEE Transactions on Robotics, 34(2):404–420, 2018.
- [2] J. van den Berg, S. J. Guy, M. Lin, and D. Manocha. Reciprocal n-Body Collision Avoidance. Proceedings of the 14th International Symposium of Robotics Research, pages 3–19, 2011.
- [3] A. Breitenmoser and A. Martinoli. On combining multi-robot coverage and reciprocal collision avoidance. Distributed Autonomous Robotic Systems, 112:49–64, 2016.
- [4] N. Buckman, H.-L. Choi, and J. How. Partial replanning for decentralized dynamic task allocation. June 2018.
- [5] R. E. Burkard, E. Çela, P. M. Pardalos, and L. S. Pitsoulis. The Quadratic Assignment Problem. In Handbook of Combinatorial Optimization, pages 241–338. 1998.
- [6] H.-L. Choi, L. Brunet, and J. P. How. Consensus-Based Decentralized Auctions for Robust Task Allocation. IEEE Transactions on Robotics, 25(4):912–926, 2009.
- [7] M. S. Cons, T. Shima, and C. Domshlak. Integrating Task and Motion Planning for Unmanned Aerial Vehicles. Unmanned Systems, 2(1):19–38, 2014.
- [8] E. Edison and T. Shima. Integrated task assignment and path optimization for cooperating uninhabited aerial vehicles using genetic algorithms. Computers & Operations Research, pages 340–356, 2011.
- [9] W. Honig, J. A. Preiss, T. K. Kumar, G. S. Sukhatme, and N. Ayanian. Trajectory Planning for Quadrotor Swarms. IEEE Transactions on Robotics, 34(4):856–869, 2018.
- [10] E. Lalish and K. A. Morgansen. Decentralized reactive collision avoidance for multivehicle systems. In Proceedings of the IEEE Conference on Decision and Control, pages 1218–1224, 2009.
- [11] J. Leonard, A. Savvaris, and A. Tsourdos. Distributed reactive collision avoidance for a swarm of quadrotors. Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering, 231(6):1035–1055, 2017.
- [12] V. Pillac, M. Gendreau, C. Guéret, and A. L. Medaglia. A review of dynamic vehicle routing problems. European Journal of Operational Research, 225(1):1–11, 2013.
- [13] C. Pinciroli, A. Lee-Brown, and G. Beltrame. Buzz: An Extensible Programming Language for Self-Organizing Heterogeneous Robot Swarms. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 3794–3800, 2015.
- [14] C. Pinciroli, A. Lee-Brown, and G. Beltrame. A Tuple Space for Data Sharing in Robot Swarms. In Proceedings of the 9th EAI International Conference on Bio-inspired Information and Communications Technologies (formerly BIONETICS), pages 287–294. ACM, 2016.
- [15] U. Ritzinger, J. Puchinger, and R. F. Hartl. A survey on dynamic and stochastic vehicle routing problems. International Journal of Production Research, 54(1), Jan. 2016.
- [16] M. Shahriari, I. Švogor, D. St-Onge, and G. Beltrame. Lightweight collision avoidance for resource-constrained robots. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1–9, Oct. 2018.
- [17] T. Shima, S. J. Rasmussen, and P. Chandler. UAV Team Decision and Control Using Efficient Collaborative Estimation. Journal of Dynamic Systems, Measurement, and Control, 129(5):609–619, Apr. 2007.
- [18] J. Snape, J. van den Berg, S. J. Guy, and D. Manocha. Smooth and collision-free navigation for multiple robots under differential-drive constraints. In 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 4584–4589, 2010.
- [19] J. M. Soares, I. Navarro, and A. Martinoli. The Khepera IV Mobile Robot: Performance Evaluation, Sensory Data and Software Toolbox. In Robot 2015: Second Iberian Robotics Conference, pages 767–781, 2016.
- [20] K. Støy. Using situated communication in distributed autonomous mobile robots. Proceedings of the 7th Scandinavian Conference on Artificial Intelligence, pages 44–52, 2001.
- [21] D. Wilkie, J. van den Berg, and D. Manocha. Generalized velocity obstacles. In 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 5573–5578, 2009.
- [22] W. Yao, N. Qi, N. Wan, and Y. Liu. An iterative strategy for task assignment and path planning of distributed multiple unmanned aerial vehicles. Aerospace Science and Technology, 86:455–464, 2019.