Multi-Vehicle Collision Avoidance via Hamilton-Jacobi Reachability and Mixed Integer Programming
Abstract
Multi-agent differential games are important and useful tools for analyzing many practical problems. With the recent surge of interest in using UAVs for civil purposes, the importance and urgency of developing tractable multi-agent analysis techniques that provide safety and performance guarantees are at an all-time high. Hamilton-Jacobi (HJ) reachability has successfully provided safety guarantees for small-scale systems and is flexible in terms of system dynamics. However, the exponential complexity scaling of HJ reachability prevents its direct application to large-scale problems in which the number of vehicles is greater than two. In this paper, we overcome the scalability limitations of HJ reachability by using a mixed integer program that exploits the properties of HJ solutions to provide higher-level control logic. Our proposed method provides a safety guarantee for three-vehicle systems, a previously intractable task for HJ reachability, without incurring significant additional computation cost. Furthermore, our method is scalable beyond three vehicles and performs significantly better by several metrics than an extension of pairwise collision avoidance to multi-vehicle collision avoidance. We demonstrate our proposed method in simulations.
I Introduction
Projects such as Amazon Prime Air and Google Project Wing, along with other recent uses of unmanned aerial vehicles (UAVs), show that there is immense interest in using UAVs for civil purposes [1, 2, 3, 4]. Potential uses of UAVs include package delivery, aerial surveillance, and disaster response [5]; future applications of UAVs are limited only by imagination. As a result, government agencies such as the Federal Aviation Administration (FAA) and the National Aeronautics and Space Administration (NASA) are urgently working on UAV-related regulations [6, 7, Kopardekar16].
Much research has gone into the area of multi-agent systems, which involve aspects of cooperation and asymmetric goals among the agents. In [8, 9], the authors assume that the vehicles employ certain simple control strategies, which induce velocity obstacles that must be avoided in order to maintain safety. Other approaches use potential functions to ensure collision avoidance while multiple agents maintain formation and travel along pre-specified trajectories [10, 11]. Although approaches like these provide valuable insight into multi-agent systems, they do not flexibly offer the safety guarantees that are desirable in safety-critical systems.
Multi-agent systems have also been studied in the context of differential games, which are well suited to safety-critical problems such as those involving UAVs, because of the safety and performance guarantees that differential game approaches can provide. The HJ formulation of differential games has been studied extensively and successfully applied to small-scale problems involving one or two vehicles [12, 13, 14, 15]. Besides providing safety guarantees, perhaps the most appealing feature of HJ-based methods is their flexibility in terms of the system dynamics. Unfortunately, the computational complexity of HJ-based methods scales exponentially with the number of vehicles in the system, making their direct application to multi-vehicle problems intractable.
Many attempts have also been made to use differential games to analyze larger-scale problems. For example, in works such as [16, 17, 18], the authors discuss various classes of three-player differential games with different assumptions on the role of each agent in noncooperative settings. For even larger systems, [19, 20, 21, 22] provide promising results when varying degrees of structural assumptions can be made. However, none of these attempts at providing guarantees address the problem of unstructured flight, which may be important in some situations. In addition, having stronger safety guarantees in unstructured environments has the potential to make structured flight of UAVs more resilient to unforeseen circumstances.
In this paper, we build on the HJ-based method for guaranteeing safety when no more than two vehicles are present. We augment the HJ-based method with a higher-level joint cooperative control strategy using a mixed integer program (MIP) inspired by the properties of the pairwise safety guarantee. Our proposed MIP scales well with the number of vehicles, provides safety guarantees for three vehicles, and results in significantly better performance for multi-vehicle systems in general than when the higher-level control logic is not used. We provide a proof of the safety guarantee in a three-vehicle system, and illustrate the safety guarantee and performance benefits through simulations of multi-vehicle systems in various configurations.
II Problem Formulation
Consider vehicles, denoted , described by the following ordinary differential equation (ODE)
(1) 
where is the state of the th vehicle , and is the control of . Each of the vehicles may have some objective, such as getting to a set of goal states. Whatever the objective may be, each vehicle must at all times avoid the danger zone with respect to each of the other vehicles . In general, the danger zones may represent any relative configuration between and that is considered undesirable, such as collision. In this paper, we make the assumption that , the interpretation of which is that between a pair of vehicles, an unsafe configuration is one in which either of the vehicles is in the danger zone of the other.
If possible and desired, each vehicle would use a “liveness controller” that helps complete its objective. However, sometimes a “safety controller” must be used in order to prevent the vehicle from entering any danger zones with respect to any other vehicles. Since the danger zones are sets of joint configurations, it is convenient to derive the set of relative dynamics between every vehicle pair from the dynamics of each vehicle specified in (1). Let the relative dynamics between and be specified by the ODE
(2)  
We assume the functions and are uniformly continuous, bounded, and Lipschitz continuous in arguments and respectively for fixed and respectively. In addition, the control functions are drawn from the set of measurable functions. (Footnote 1: A function between two measurable spaces, each equipped with a σ-algebra, is said to be measurable if the preimage of every measurable set in the codomain is a measurable set in the domain.)
Given the vehicle dynamics in (1), some joint objective, the derived relative dynamics in (2), and the danger zones , we propose a cooperative safety control strategy that performs the following:

detect potential conflict based on the joint configuration of all vehicles;

allow vehicles that are not in potential conflict to complete their objective using a liveness controller;

among the vehicles in potential conflict, attempt to minimize the number of instances in which a vehicle gets into another vehicle’s danger zone.
For the case of three vehicles, we prove that our proposed control strategy guarantees that all vehicles will be able to stay out of all the danger zones with respect to the other vehicles, thus guaranteeing safety. For all initial configurations in our simulations, all vehicles also complete their objectives.
III Methodology
Our proposed method builds on HJ reachability theory, which in the two-vehicle case guarantees that no vehicle will enter another vehicle’s danger zone and that the vehicles will eventually complete their joint objective [13]. HJ reachability becomes computationally intractable for more than two vehicles. To provide the same guarantees for three vehicles, we propose an MIP, motivated by the properties of the HJ pairwise solution, that specifies a higher-level control logic. While unable to provide hard guarantees for more than three vehicles, our proposed method remains computationally tractable for much larger numbers of vehicles, and performs significantly better than an extension of the pairwise HJ reachability solution in that setting.
III-A Hamilton-Jacobi Reachability
HJ reachability has been studied extensively [13, 14, 23, 24, 25] and has found many successful applications [13, 15, 20, 26]. Here, we give a brief overview of how to apply HJ reachability to solve a pairwise collision avoidance problem such as the one in [13]. Given the relative dynamics (2), we define the target set to be the danger zone , and compute the following backward reachable set (BRS)
(3)  
If , the relative state of and is outside of for all , then is free to use a liveness controller to make progress towards its objective. If is on the boundary of for a single , then danger can be averted, regardless of the action of , by using the optimal control denoted , which can be obtained from the gradient of the value function representing . For details on obtaining , see [13]; for this paper, it is sufficient to note that where we assume and write . The interpretation is that is guaranteed to be able to avoid collision with over an infinite time horizon as long as the optimal control is applied as soon as the potential conflict occurs.
If is in for more than one , then the pairwise optimal controls alone cannot guarantee safety. However, in this case, our proposed cooperative control strategy, which uses an MIP to provide a higher-level control logic, can provide safety guarantees for three vehicles.
III-B The Mixed Integer Program
For the case, we use an MIP to provide higher level control logic to synthesize a cooperative safety controller. We first note two properties of the pairwise solution:

If every vehicle pair stays out of each other’s danger zones, then the entire set of vehicles would be out of each other’s danger zones.

Since the solution is pairwise, the safety controller derived from HJ reachability can only guarantee that some vehicle can avoid the danger zone with respect to a single other vehicle .
Intuitively, a higher level control logic is needed to provide a farsighted avoidance maneuver; without this higher level logic, pairwise avoidance maneuvers between two vehicles and may lead to unavoidable dangerous configurations with respect to a third vehicle .
Definition 1
Control logic matrix: Let be the control logic matrix specifying the joint cooperative control of the vehicles. Denote the element of in position to be . If , then the control logic stipulates that vehicle must execute the pairwise optimal control to avoid vehicle .
Definition 2
Reward coefficient matrix: Let be the reward coefficient matrix with elements . Each specifies the “reward” for choosing to have vehicle avoiding vehicle , or in other words, choosing .
Motivated by the above two properties, and using the above definitions, we arrive at the following MIP:
(4)  
subject to  
At a given time, the vehicles’ joint state determines , which forms the objective of (4). Thus, the interpretation of the objective of (4) depends on the choice of the reward coefficient matrix . A large encourages to be , causing vehicle to avoid . The decision variables consist of the elements of , which provides the high level control logic. This is captured by constraint (4c).
The pairwise HJ optimal control guarantees that a vehicle can remain safe with respect to another vehicle regardless of the action of . Therefore, in every pair , if either or is avoiding the other, there is no need for the other vehicle to also be avoiding the first. The constraint (4a) states that out of every vehicle pair, at most one vehicle should avoid the other so that no control authority is wasted by having both vehicles avoid each other. The other vehicle then could use its control authority to avoid a third vehicle with whom it may come into conflict.
Finally, since the control logic ultimately results in vehicles performing pairwise optimal controls, each vehicle is only guaranteed to be able to avoid at most one other vehicle. The constraint (4b) encodes this limitation.
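For reference, the program described by the three constraints above can be written out as follows. This is a reconstruction from the prose only: the symbols $N$ (number of vehicles), $u_{ij}$ (elements of the control logic matrix, with $u_{ij}=1$ taken to mean that vehicle $i$ avoids vehicle $j$), and $c_{ij}$ (elements of the reward coefficient matrix) are assumed names, and the exact index ranges are an assumption as well.

```latex
\begin{align}
\max_{u_{ij}} \quad & \sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij}\, u_{ij} \tag{4} \\
\text{subject to} \quad
& u_{ij} + u_{ji} \le 1 \quad \forall\, i \ne j, \tag{4a} \\
& \sum_{j=1}^{N} u_{ij} \le 1 \quad \forall\, i, \tag{4b} \\
& u_{ij} \in \{0, 1\} \quad \forall\, i, j. \tag{4c}
\end{align}
```

Under this reading, (4a) says that at most one vehicle in each pair avoids the other, (4b) says that each vehicle avoids at most one other vehicle, and (4c) makes the control logic binary.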
III-C Design of the Objective Function
The objective function in (4) can be designed by choosing the reward coefficient matrix . In general, there may be many choices for , and the general guiding principle in choosing is that it should depend on the vehicles’ safety levels and avoidance priority; both concepts are defined below. In this paper, we propose one particular choice of that allows us to prove safety guarantees for three vehicles.
Given the form of the objective function, the first obvious choice for some of the elements of would be . This forces , which states that a vehicle does not need to avoid itself. Before designing the rest of , we need to define the notion of a safety level.
Definition 3
Safety level: Given , the state of vehicle with respect to vehicle , define the safety level to be . For convenience, let .
Proposition 1
Suppose at some time . If chooses the control , then .
Based on the definitions of , , and , we have that if at , then the control guarantees collision avoidance over an infinite time horizon. This implies for all time.
Corollary 1
Between the pair , if or , then there exists a joint control strategy to ensure neither vehicle enters the danger zone of the other.
If , then safety is guaranteed if chooses to avoid . If , then . In this case, simply swap the indices and .
Let be a safety level threshold. We say is in potential conflict with if . Based on this safety level threshold, we set whenever . So far, we have whenever and whenever . The rest of the values of are derived from the priority matrix, defined below.
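The potential conflict test described above can be sketched in Python. This is only an illustration: the matrix layout (entry `[i][j]` holding the safety level of vehicle `i` with respect to vehicle `j`), the function name, and the non-strict comparison against the threshold are assumptions, not details taken from the paper.

```python
from typing import List, Tuple


def potential_conflicts(safety: List[List[float]],
                        threshold: float) -> List[Tuple[int, int]]:
    """Return ordered pairs (i, j), i != j, in potential conflict.

    safety[i][j] is assumed to hold the safety level of vehicle i with
    respect to vehicle j; a pair is flagged when that level is at or
    below the threshold (the non-strict comparison is an assumption).
    """
    n = len(safety)
    return [(i, j)
            for i in range(n)
            for j in range(n)
            if i != j and safety[i][j] <= threshold]
```

Vehicles that appear in no returned pair would be free to use their liveness controllers.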
Definition 4
Priority matrix: Let be a priority matrix with elements . The priority matrix establishes an avoidance order for the vehicles.
The diagonal elements of can be arbitrarily set (denoted ). The rest of the elements are assigned in descending order according to Sarrus’ rule [28] (for determining cross products). For example, in the case of ,
(5) 
A large value of indicates that should avoid with a high priority. In order to impose such a priority when constructing a joint cooperative safety control strategy, we set whenever . For example, if and , then we would have
(6) 
As another example, if except , then we would have
(7) 
Remark 1
Avoidance priority is an important notion for guaranteeing safety even when . Consider the scenario where vehicle applies the control to avoid , but does not try to avoid . As long as continues to avoid , the two vehicles can avoid each other’s danger zones.
While is avoiding , is guaranteed to remain positive; however, since is not avoiding , could become negative. If , safety can only be guaranteed if keeps avoiding . The avoidance priority ensures that some never tries to avoid when . Instead, the responsibility of avoidance would remain with , which continues to avoid to ensure .
IV Safety Guarantee For Three Vehicles
The method for constructing a joint safety controller described in Section III guarantees safety for three vehicles. We now formally state this guarantee and prove the result.
Theorem 1
It suffices to show that at implies .
Suppose . Then . From the objective of (4), would be chosen to be unless another feasible solution in which results in a higher objective value. Due to (4a) and (4b), the only way for the optimal solution to have is to have or .
There are several cases of to go through, with each case having different elements of being equal to . We show one case here; the rest of the cases follow a similar logic. Assume is given in (6). Then, since , the optimal solution would have as many elements of being as possible (except for diagonal elements).
Suppose , then by (4a), and by (4b), . This leaves us with the freedom to choose . Since , choosing would maximize the objective. This gives us the candidate solution and the rest of the being , with an objective value of . However, choosing alone would already result in an objective value of at least ; therefore, .
Next, suppose , then by (4a), and by (4b), . This leaves us with the freedom to choose . Since , choosing would maximize the objective. This gives us the candidate solution and the rest of the being , with an objective value of . However, choosing alone would already result in an objective value of at least ; therefore, .
This leaves us with whenever . By a similar argument, one can show that whenever , and whenever .
Remark 2
Alternatively, one could enumerate all feasible solutions for every possible choice of , and discover the same result stated in Theorem 1. We have also taken this brute force approach to verify the above proof.
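The brute-force check mentioned in this remark can be sketched as an exhaustive search over binary control logic matrices. The sketch below is an illustration under assumptions, not the authors' verification code: entry `[i][j]` of the logic matrix equal to 1 is taken to mean that vehicle `i` avoids vehicle `j`, diagonal entries are fixed to 0, and the reward values used in the usage note are hypothetical.

```python
from itertools import product
from typing import List, Tuple


def solve_control_logic(reward: List[List[float]]) -> Tuple[float, List[List[int]]]:
    """Maximize sum(reward[i][j] * logic[i][j]) by enumeration.

    Feasibility follows the constraints described for (4):
    at most one vehicle per pair avoids the other (4a), and each
    vehicle avoids at most one other vehicle (4b). Entries are binary (4c).
    """
    n = len(reward)
    off_diag = [(i, j) for i in range(n) for j in range(n) if i != j]
    best_value = float("-inf")
    best_logic: List[List[int]] = [[0] * n for _ in range(n)]
    for bits in product((0, 1), repeat=len(off_diag)):
        logic = [[0] * n for _ in range(n)]
        for (i, j), bit in zip(off_diag, bits):
            logic[i][j] = bit
        # (4a): no pair of vehicles avoids each other simultaneously.
        if any(logic[i][j] + logic[j][i] > 1
               for i in range(n) for j in range(i + 1, n)):
            continue
        # (4b): each vehicle avoids at most one other vehicle.
        if any(sum(row) > 1 for row in logic):
            continue
        value = sum(reward[i][j] * logic[i][j] for i, j in off_diag)
        if value > best_value:
            best_value, best_logic = value, logic
    return best_value, best_logic
```

For three vehicles there are only 64 candidate matrices, so enumeration is immediate. With the hypothetical reward matrix `[[0, 3, 2], [1, 0, 3], [3, 2, 0]]`, the search returns the cyclic assignment in which vehicle 0 avoids 1, vehicle 1 avoids 2, and vehicle 2 avoids 0. The search grows exponentially with the number of vehicles, so it is suitable only for small cases.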
V Numerical Simulations
In this section, we illustrate our proposed method through simulations and compare our method with a baseline pairwise method that uses solely the HJ pairwise optimal control solution in which each agent avoids the agent in the potential conflict set with the smallest pairwise safety value . Compared with our MIP formulation (4), the baseline can be thought of as a different MIP that

omits constraint (4a), making the vehicles unable to coordinate among each other, and

assumes if has the lowest safety value with respect to , and otherwise, making the vehicles lack a notion of global avoidance priority.
Such a baseline is chosen to illustrate the benefits of the above design considerations, which are important features of our proposed method. For illustration purposes, we assumed that the dynamics of each vehicle is given by
(8)  
where the state variables represent the x position, y position, and heading of vehicle . Each vehicle travels at a constant speed of , and chooses its turn rate , constrained by some maximum .
(9) 
whose interpretation is that and are considered to be in each other’s danger zone if their positions are within of each other. In our examples, we chose . Here, represents their joint state. For notational convenience, we define , , and .
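The vehicle model and danger zone described above (constant speed, bounded turn rate, positions within a fixed distance) can be sketched as follows. The forward-Euler discretization, the function names, the non-strict distance comparison, and all numeric values are assumptions chosen for illustration; the paper's actual parameter values are not reproduced here.

```python
import math
from typing import Tuple

State = Tuple[float, float, float]  # (x position, y position, heading)


def dubins_step(state: State, turn_rate: float, speed: float,
                max_turn_rate: float, dt: float) -> State:
    """Advance a constant-speed vehicle by one forward-Euler step.

    The requested turn rate is clipped to [-max_turn_rate, max_turn_rate].
    """
    px, py, heading = state
    omega = max(-max_turn_rate, min(max_turn_rate, turn_rate))
    return (px + dt * speed * math.cos(heading),
            py + dt * speed * math.sin(heading),
            heading + dt * omega)


def in_danger_zone(a: State, b: State, radius: float) -> bool:
    """True when the positions of the two vehicles are within `radius`."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) <= radius
```

Because the danger zone depends only on the distance between positions, the check is symmetric in the two vehicles, which matches the symmetry assumption made in the problem formulation.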
To obtain safety levels and the optimal pairwise safety controller, we compute the BRS (3) with the relative dynamics
(10)  
In our examples, we chose . Whenever , applies the optimal control to reach its destination. (Footnote 2: This optimal control can be computed by solving a reachability problem using the dynamics (8), but for brevity we will not go into the details here.) Otherwise, uses the control specified by the joint cooperative safety controller that we propose in this paper.
Simulations for three and eight vehicles are presented in detail for our method and the baseline method. Each vehicle aims to reach the circular target of matching color while avoiding other vehicles’ danger zones. The vehicles keep traveling at constant speed even if they enter the danger zones of other vehicles until they reach their targets. The safety level sets are plotted for some pairs of vehicles. When a vehicle is inside the safety level set (outer boundary), plotted in the same color as the vehicle, it is in potential conflict with the vehicle around which the level set is plotted. However, as long as the vehicle stays outside of the safety level set (inner boundary), the pair of vehicles will be able to avoid entering each other’s danger zones.
Fig. 1 illustrates how our joint collision avoidance method cooperatively resolves conflicts for three vehicles. The vehicles start outside of each other’s safety level sets. Each of them applies the optimal control to reach its target. On the way, (green) and (blue) come in conflict with each other. Cooperatively, avoids while heads to the target since is already resolving the pairwise conflict. At time , all vehicles come in conflict with each other, and our proposed algorithm advises that (red) avoids , avoids , and avoids , efficiently utilizing their control authorities for avoidance. At time , the conflicts are resolved as each vehicle’s safety level rises to above with respect to the others. Eventually, all vehicles reach their targets without any of them entering another vehicle’s danger zone.
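Looking up a pairwise safety level requires the relative state of two vehicles. One common convention, assumed here and not confirmed by the text, expresses the second vehicle's position in a frame centered on the first vehicle and aligned with its heading, with the relative heading wrapped to the interval from -π to π. The function name and state layout are hypothetical.

```python
import math
from typing import Tuple

State = Tuple[float, float, float]  # (x position, y position, heading)


def relative_state(own: State, other: State) -> State:
    """Express `other` in the body frame of `own`.

    The position difference is rotated by minus the heading of `own`,
    and the heading difference is wrapped to [-pi, pi).
    """
    dx = other[0] - own[0]
    dy = other[1] - own[1]
    c, s = math.cos(own[2]), math.sin(own[2])
    x_rel = c * dx + s * dy
    y_rel = -s * dx + c * dy
    heading_rel = (other[2] - own[2] + math.pi) % (2.0 * math.pi) - math.pi
    return (x_rel, y_rel, heading_rel)
```

For example, a vehicle two units directly ahead with the same heading maps to a relative state of roughly (2, 0, 0), regardless of the absolute heading.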
Fig. 2 illustrates the pitfall of using the baseline method. Here, each vehicle avoids the vehicle with the smallest pairwise safety value. At , all vehicles come in conflict with each other, and without higher level logic, (red) avoids (blue), (green) avoids , and avoids . By avoiding each other, and waste control authority that can be used to prevent and from going closer to each other. When and come closer to each other, they begin avoiding each other, leading to and coming closer to each other. The lack of coordination causes this behavior to repeat, bringing them closer and closer together (), and eventually leading them into each other’s danger zones at . This alternating avoidance behavior also highlights the importance of imposing avoidance priority.
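The baseline rule, in which each vehicle avoids the vehicle in its potential conflict set with the smallest pairwise safety value, can be sketched as below. The matrix layout, the function name, and the non-strict threshold comparison are assumptions, and the safety values in the usage note are hypothetical.

```python
from typing import List, Optional


def baseline_logic(safety: List[List[float]],
                   threshold: float) -> List[Optional[int]]:
    """For each vehicle i, return the index of the vehicle it avoids.

    Vehicle i avoids the vehicle j (j != i) with the smallest safety[i][j]
    among those at or below the threshold, or None if there is no such j.
    """
    n = len(safety)
    targets: List[Optional[int]] = []
    for i in range(n):
        candidates = [j for j in range(n)
                      if j != i and safety[i][j] <= threshold]
        if candidates:
            targets.append(min(candidates, key=lambda j: safety[i][j]))
        else:
            targets.append(None)
    return targets
```

With the hypothetical safety matrix `[[0, 0.2, 0.5], [0.2, 0, 0.9], [0.5, 0.9, 0]]` and threshold 1.0, the rule returns `[1, 0, 0]`: vehicles 0 and 1 avoid each other, which is the kind of redundant mutual avoidance that constraint (4a) rules out.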
Fig. 3 illustrates a difficult eight-vehicle scenario that our cooperative algorithm successfully resolves. The safety level sets are plotted for each avoidance pair. At , multiple vehicles are in conflict with each other. Notice that no redundant control is used (a pair of vehicles avoiding each other). Instead, one vehicle in a given conflict pair can free up its control to avoid another agent. Fig. 4 shows the result of applying the baseline approach, which is unable to resolve the multiple conflicts. In particular, at (top right), multiple vehicle pairs avoid each other during the conflicts. In addition, at (bottom right), two vehicles end up in a “limbo” state where they alternate between avoiding each other and trying to get closer to their targets, continually going in a direction that is further from their targets.
Additionally, we compare our method with the baseline method for vehicles by performing 200 simulations with randomized initial conditions for each case, and show that our algorithm performs significantly better than the baseline pairwise approach. We initialized the vehicles by placing them symmetrically on a circle of radius , facing the center of the circle, and then adding random perturbations to their initial states. We define the two performance metrics below. The averages over the 200 trials for each case are presented in Fig. 5.

Success ratio = fraction of vehicles that reach their targets without ever entering others’ danger zones

Aggregate conflict ratio = . The denominator is the maximum possible number of danger zone violations that could occur, which is the number of time steps times the number of distinct vehicle pairs.
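The two metrics can be computed as in the sketch below. The denominator of the aggregate conflict ratio follows the description above; the numerator is assumed to be the total count of pairwise danger zone violations summed over all time steps, which is an interpretation rather than a formula taken from the text. Function and argument names are hypothetical.

```python
from math import comb
from typing import List


def success_ratio(reached_target: List[bool],
                  ever_in_danger: List[bool]) -> float:
    """Fraction of vehicles that reached their targets without ever
    entering another vehicle's danger zone."""
    successes = sum(1 for reached, danger in zip(reached_target, ever_in_danger)
                    if reached and not danger)
    return successes / len(reached_target)


def aggregate_conflict_ratio(violations_per_step: List[int],
                             num_vehicles: int) -> float:
    """Observed pairwise violations divided by the maximum possible number,
    i.e. (number of time steps) * (number of vehicle pairs)."""
    max_violations = len(violations_per_step) * comb(num_vehicles, 2)
    return sum(violations_per_step) / max_violations
```

Here `violations_per_step[t]` is the number of vehicle pairs that are inside each other's danger zone at time step `t`.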
With our proposed method, the average computation time per simulation is 4.1 seconds for and 25.5 seconds for ; this time includes the time needed to solve the MIP (4). With the baseline method, the average computation time for the same simulations is 5.9 seconds for and seconds for . Both methods require the same BRS, which takes approximately 1 minute to compute. All computations were done on a MacBook Pro 11.2 laptop with an Intel Core i7-4750 processor.
VI Conclusions
By exploiting properties of pairwise optimal collision avoidance, our proposed mixed integer program method guarantees collision avoidance for three-vehicle systems and performs well for larger multi-vehicle systems.
References
 [1] W. M. Debusk, “Unmanned aerial vehicle systems for disaster relief: Tornado alley,” in Infotech@Aerospace Conferences, 2010.
 [2] AUVSI News. (2016) UAS aid in South Carolina tornado investigation. [Online]. Available: http://www.auvsi.org/blogs/auvsinews/2016/01/29/tornado
 [3] Amazon.com, Inc. (2016) Amazon Prime Air. [Online]. Available: http://www.amazon.com/b?node=8037720011
 [4] BBC Technology. (2016) Google plans drone delivery service for 2017. [Online]. Available: http://www.bbc.com/news/technology34704868
 [5] B. P. Tice, “Unmanned aerial vehicles – the force multiplier of the 1990s,” Airpower Journal, 1991.
 [6] Joint Planning and Development Office (JPDO), “Unmanned aircraft systems (UAS) comprehensive plan – a report on the nation’s UAS path forward,” Federal Aviation Administration, Tech. Rep., 2013.
 [7] National Aeronautics and Space Administration. (2016) Challenge is on to design sky for all. [Online]. Available: http://www.nasa.gov/feature/challengeisontodesignskyforall
 [8] P. Fiorini and Z. Shiller, “Motion planning in dynamic environments using velocity obstacles,” International Journal of Robotics Research, vol. 17, pp. 760–772, 1998.
 [9] J. van den Berg, M. C. Lin, and D. Manocha, “Reciprocal velocity obstacles for real-time multi-agent navigation,” in IEEE International Conference on Robotics and Automation, May 2008, pp. 1928–1935.
 [10] R. Olfati-Saber and R. M. Murray, “Distributed cooperative control of multiple vehicle formations using structural potential functions,” in IFAC World Congress, 2002.
 [11] Y.-L. Chuang, Y. Huang, M. R. D’Orsogna, and A. L. Bertozzi, “Multi-vehicle flocking: Scalability of cooperative control algorithms using pairwise potentials,” in IEEE International Conference on Robotics and Automation, April 2007, pp. 2292–2299.
 [12] E. M. Vaisbord and V. I. Zhukovskii, Introduction to Multiplayer Differential Games and Their Applications. Routledge, 1988.
 [13] I. Mitchell, A. Bayen, and C. Tomlin, “A time-dependent Hamilton-Jacobi formulation of reachable sets for continuous dynamic games,” IEEE Transactions on Automatic Control, vol. 50, no. 7, pp. 947–957, 2005.
 [14] J. F. Fisac, M. Chen, C. J. Tomlin, and S. S. Sastry, “Reach-avoid problems with time-varying dynamics, targets and constraints,” in 18th International Conference on Hybrid Systems: Computation and Control, 2015.
 [15] J. Ding, J. Sprinkle, S. S. Sastry, and C. J. Tomlin, “Reachability calculations for automated aerial refueling,” in IEEE Conference on Decision and Control, Cancun, Mexico, 2008.
 [16] S. Tanimoto, “On a class of three-player differential games,” Journal of Optimization Theory and Applications, vol. 25, no. 3, pp. 469–473, 1978.
 [17] M. Su, Y. ji Wang, and L. Liu, “Bounded guidance law based on differential game for threeplayer conflict,” in IEEE Conference on Modeling, Identification, and Control, 2014.
 [18] J. F. Fisac and S. S. Sastry, “The pursuit-evasion-defense differential game in dynamic constrained environments,” in IEEE Conference on Decision and Control, 2015.
 [19] W. Lin, “Differential games for multiagent systems under distributed information,” Ph.D. dissertation, University of Central Florida, 2013.
 [20] M. Chen, Z. Zhou, and C. J. Tomlin, “Multiplayer reach-avoid games via low dimensional solutions and maximum matching,” in Proceedings of the American Control Conference, 2014.
 [21] M. Chen, J. Fisac, C. J. Tomlin, and S. Sastry, “Safe sequential path planning of multi-vehicle systems via double-obstacle Hamilton-Jacobi-Isaacs variational inequality,” in European Control Conference, 2015.
 [22] M. Chen, Q. Hu, C. Mackin, J. Fisac, and C. J. Tomlin, “Safe platooning of unmanned aerial vehicles via reachability,” in IEEE Conference on Decision and Control, 2015.
 [23] E. N. Barron, “Differential Games with Maximum Cost,” Nonlinear analysis: Theory, methods & applications, pp. 971–989, 1990.
 [24] O. Bokanowski, N. Forcadel, and H. Zidani, “Reachability and minimal times for state constrained nonlinear problems without any controllability assumption,” SIAM Journal on Control and Optimization, pp. 1–24, 2010.
 [25] K. Margellos and J. Lygeros, “Hamilton-Jacobi Formulation for Reach-Avoid Differential Games,” IEEE Transactions on Automatic Control, vol. 56, no. 8, Aug 2011.
 [26] ——, “Toward 4D Trajectory Management in Air Traffic Control: A Study Based on Monte Carlo Simulation and Reachability Analysis,” IEEE Transactions on Control Systems Technology, vol. 21, no. 5, Sept 2013.
 [27] M. G. Crandall and P.-L. Lions, “Viscosity solutions of Hamilton-Jacobi equations,” Transactions of the American Mathematical Society, vol. 277, no. 1, pp. 1–42, 1983.
 [28] D. Khattar, The Pearson Guide to Complete Mathematics for AIEEE, 3rd ed. Pearson Education India, 2010.