Nonlinear Model Predictive Control for Multi-Micro Aerial Vehicle Robust Collision Avoidance
Abstract
Multiple multirotor Micro Aerial Vehicles (MAVs) sharing the same airspace require a reliable and robust collision avoidance technique. In this paper we address the problem of multi-MAV reactive collision avoidance. A model-based controller is employed to achieve reference trajectory tracking and collision avoidance simultaneously. Moreover, we account for the uncertainty of the state estimator and of the other agents' positions and velocities to achieve a higher degree of robustness. The proposed approach is decentralized, does not require a collision-free reference trajectory and accounts for the full MAV dynamics. We validated our approach in simulation and experimentally.
1 Introduction
As miniaturization technology advances, low-cost and reliable MAVs with powerful onboard computers are becoming available on the market. Many applications can benefit from such low-cost systems, including inspection and exploration [1, 2], surveillance for security [3], mapping [4] and crop monitoring [5]. However, MAVs are limited to short flight times due to battery and size constraints. Given this limitation, creating a team of MAVs that can safely share the airspace to execute a specific mission will widen the range of applications where MAVs can be used and will be beneficial for time-critical missions such as search and rescue operations [6].
A crucial problem when multiple MAVs share the same airspace is the risk of mid-air collision; a robust method to avoid multi-MAV collisions is therefore necessary. Typically, this problem is solved by planning a collision-free trajectory for each agent in a centralized manner. However, this binds the MAVs to the pre-planned trajectories and limits the adaptivity of the team during the mission: any change in the task will require trajectory re-planning for the whole team.
In this work we present a unified framework to achieve reference trajectory tracking and multi-agent reactive collision avoidance. The proposed approach exploits the full MAV dynamics and takes into account the physical platform limitations. In this way we fully exploit the MAV capabilities and achieve agile and natural avoidance maneuvers compared to classic approaches, where planning is decoupled from trajectory tracking control. To this end, we formulate the control problem as a constrained optimization problem that we solve in a receding horizon fashion. The cost function of the optimization problem includes a potential-field-like term that penalizes collisions between agents. Since potential field methods do not provide any guarantee and are sensitive to tuning parameters, we introduce additional tight hard constraints to guarantee that no mid-air collisions will occur. The proposed method assumes that each agent broadcasts its position and velocity on a common network. Additionally, to increase the avoidance robustness, we use the state estimator uncertainty to shape the collision term in the cost function and the optimization constraints.
The contribution of this paper can be summarized as follows:

A unified framework for multi-agent control and collision avoidance.

The incorporation of state estimator uncertainty and communication delay for robust collision avoidance.
This paper is organized as follows: In Section 2 we present an overview of existing methods for multiagent collision avoidance. In Section 3 we briefly present the MAV model that will be considered in the controller formulation. In Section 4 we present the controller and discuss the state estimator uncertainty propagation. Finally, in Section 5 we present simulation and experimental results of the proposed approach.
2 Related Work
Many researchers have demonstrated successful trajectory generation and navigation on MAVs in controlled environments where obstacles are static, using an external motion capture system [7] or onboard sensing [8]. Sampling-based planning techniques can be used to generate global collision-free trajectories for a single agent, taking into account the agent dynamics [9] as well as static and dynamic obstacles in the environment.
One way to generate collision-free trajectories for a team of robots is to solve a mixed-integer quadratic program in a centralized fashion, as shown in [10]. A similar approach was presented in [11], where sequential quadratic programming techniques are employed to generate collision-free trajectories for a team of MAVs. The aforementioned methods lack real-time performance and do not consider unforeseen changes in the environment.
Global collision-free trajectory generation methods limit the versatility of a team of robots. In real missions, where multiple agents are required to cooperate, the task assigned to each agent might change according to the current situation, and reactive local trajectory planning methods become crucial.
One of the earliest works to achieve reactive collision avoidance for a team of flying robots in a Nonlinear Model Predictive Control (NMPC) framework is presented in [12], where a decentralized NMPC is employed to control multiple helicopters in a complex environment, with an artificial potential field used to achieve reactive collision avoidance. This approach does not provide any guarantees and has been evaluated only in simulation. In [13] the authors present various algorithms based on the Velocity Obstacles (VO) concept to select collision-free trajectories from a set of candidate trajectories. The method has been experimentally evaluated on MAVs flying in close proximity, including around humans. However, the MAV dynamics are not considered in this method, and decoupling trajectory generation from control has various limitations, as shown in the experimental section of [13].
Among the attempts to unify trajectory optimization and control is the work presented in [14]. The robot control problem is formulated as a finite horizon optimal control problem, and an unconstrained optimization is performed at every time step to generate time-varying feedback gains and feedforward control inputs simultaneously. The approach has been successfully applied to MAVs and a ball balancing robot.
In this work, we unify trajectory tracking and collision avoidance into a single optimization problem in a decentralized manner. In this way, trajectories generated by a global planner can be sent directly to the trajectory tracking controller without modification, leaving the local avoidance task to the tracking controller.
3 Model
In this section we present the MAV model employed in the controller formulation. We first introduce the full vehicle model and explain the forces and moments acting on the system. Next, we briefly discuss the closed-loop attitude model employed in the trajectory tracking controller.
System model
We define the world-fixed inertial frame W and the body-fixed frame B attached to the MAV at the Center of Gravity (CoG) as shown in Figure 2. The vehicle configuration is described by the position p of the CoG in the inertial frame, the vehicle velocity v in the inertial frame, the vehicle orientation, parameterized by the Euler angles φ, θ and ψ, and the body angular rate ω.
The main forces acting on the vehicle are generated by the propellers. Each propeller generates a thrust proportional to the square of the propeller rotation speed ω_i and an angular moment due to the drag force. The thrust and moment generated by the i-th propeller are given by:
F_i = k_n \omega_i^2 e_z,  (1a)
M_i = (-1)^{d_i} k_m \omega_i^2 e_z,  (1b)

where k_n and k_m are positive constants, e_z is a unit vector in the body z direction and d_i ∈ {0, 1} encodes the rotation direction of the i-th propeller. Moreover, we consider two important effects that appear in the case of dynamic maneuvers: blade flapping and induced drag. The importance of these effects stems from the fact that they introduce additional forces in the rotor plane, adding some damping to the MAV velocity as shown in [15]. It is possible to combine these effects as shown in [16, 17] into one lumped drag coefficient k_D.
This leads to the aerodynamic force F_aero:

F_{aero} = T K_{drag} v,  (2)

where K_drag = diag(k_D, k_D, 0) and T is the z component of the thrust force.
The motion of the vehicle can be described by the following equations:

\dot{p} = v,  (3a)
\dot{v} = \frac{1}{m}\left(R_{WB} F_T - F_{aero} + F_{ext}\right) - g e_z,  (3b)
\dot{R}_{WB} = R_{WB} \lfloor\omega\rfloor_\times,  (3c)
J\dot{\omega} = -\omega \times J\omega + A\,[\omega_1^2, \ldots, \omega_{n_p}^2]^T,  (3d)

where m is the mass of the vehicle, F_ext represents the external forces acting on the vehicle (e.g. wind), R_WB is the rotation matrix from the body frame to the inertial frame, F_T is the total thrust, g is the gravitational acceleration, ⌊·⌋_× denotes the skew-symmetric matrix operator, J is the inertia matrix, A is the allocation matrix and n_p is the number of propellers.
Attitude model
We follow a cascaded approach as described in [18] and assume that the vehicle attitude is controlled by an attitude controller. For completeness we quickly summarize their findings in the following paragraph.
To achieve accurate trajectory tracking, it is crucial for the high-level controller to consider the inner-loop system dynamics. Therefore, it is necessary to consider a simple model of the attitude closed-loop response. These dynamics can either be calculated by simplifying the closed-loop dynamic equations (if the controller is known) or obtained by a simple system identification procedure in case of an unknown attitude controller (on commercial platforms, for instance). In this work we used the system identification approach to identify a first-order closed-loop attitude response.
The inner-loop attitude dynamics are then expressed as follows:

\dot{\phi} = \frac{1}{\tau_\phi}\left(k_\phi \phi_{cmd} - \phi\right),  (4a)
\dot{\theta} = \frac{1}{\tau_\theta}\left(k_\theta \theta_{cmd} - \theta\right),  (4b)
\dot{\psi} = \dot{\psi}_{cmd},  (4c)

where k_φ, τ_φ and k_θ, τ_θ are the gains and time constants of the roll and pitch closed loops respectively, φ_cmd and θ_cmd are the commanded roll and pitch angles and ψ̇_cmd is the commanded angular velocity of the vehicle heading.
The aforementioned model will be employed in the subsequent trajectory tracking controller to account for the attitude inner-loop dynamics. Note that the vehicle heading angular rate is assumed to track the command instantaneously. This assumption is reasonable as the MAV heading angle has no effect on the MAV position.
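As a minimal sketch, the first-order closed-loop attitude response of Equation (4a) can be simulated with forward-Euler integration. The gain and time constant values below are illustrative placeholders, not identified parameters of any specific platform:

```python
# Sketch of the first-order closed-loop roll response (4a):
#   d(phi)/dt = (k_phi * phi_cmd - phi) / tau_phi,
# integrated with forward Euler. k_phi and tau_phi are illustrative values.

def simulate_roll(phi0, phi_cmd, k_phi=1.0, tau_phi=0.2, dt=0.01, steps=200):
    """Integrate the identified first-order attitude model for `steps` steps."""
    phi = phi0
    for _ in range(steps):
        phi += dt * (k_phi * phi_cmd - phi) / tau_phi
    return phi

# After many time constants the roll angle converges to k_phi * phi_cmd.
final_roll = simulate_roll(phi0=0.0, phi_cmd=0.3)
```

With the defaults, 200 steps of 0.01 s cover ten time constants, so the response has essentially settled at k_φ · φ_cmd.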
4 Controller Formulation
In this section we present the unified trajectory tracking and multi-agent collision avoidance NMPC controller. First, we present the Optimal Control Problem (OCP). Afterwards, we discuss the choice of cost function and the state estimator uncertainty propagation used to achieve robust collision avoidance. Next, we present the optimization constraints, and finally we discuss the approach adopted to solve the OCP in real time onboard the MAV.
4.1 Optimal Control Problem
To formulate the OCP, we first define the system state vector x and control input u as follows:

x = [p^T, v^T, \phi, \theta, \psi]^T,  (5)
u = [\phi_{cmd}, \theta_{cmd}, \dot{\psi}_{cmd}, T]^T.  (6)
At every time step, we solve the following OCP online:

\min_{x(\cdot),\,u(\cdot)} \int_{t_0}^{t_0+T_h} \Big( J_x(x(\tau), x_{ref}(\tau)) + J_u(u(\tau), u_{ref}(\tau)) + \sum_j J_c^j(x(\tau)) \Big)\, d\tau + J_T(x(t_0+T_h))
subject to:  \dot{x} = f(x, u), \quad G(x) \le 0, \quad u(\tau) \in \mathcal{U}, \quad x(t_0) = \hat{x}(t_0),  (7)

where f is composed of Equations (3a), (3b) and (4). J_x, J_u and J_c^j are the cost terms for reference trajectory tracking, control input penalty and collision avoidance respectively, and J_T is the terminal cost function. G is a function that represents the state constraints, and 𝒰 is the set of admissible control inputs. In the rest of this section, we discuss the details of the aforementioned OCP and present a method to solve it efficiently in real time.
4.2 Cost Function
In this subsection we discuss the components of the cost function presented in (7). The first term penalizes the deviation of the predicted state x from the desired state x_ref in a quadratic sense:

J_x = (x - x_{ref})^T Q_x (x - x_{ref}),  (8)

where Q_x ⪰ 0 is a tuning parameter. The state reference x_ref is obtained from the desired trajectory. The second term penalizes the control input:

J_u = (u - u_{ref})^T R_u (u - u_{ref}),  (9)

where R_u ≻ 0 is a tuning parameter. The control input reference u_ref is chosen based on the desired trajectory acceleration to achieve better tracking performance, as described in [19].
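The quadratic stage cost of (8) and (9) can be evaluated as follows; this is an illustrative sketch, and the weight matrices Q_x and R_u passed in the example are arbitrary placeholders rather than the tuning used in the paper:

```python
import numpy as np

# Illustrative quadratic stage cost (8)-(9):
#   J = e_x^T Q_x e_x + e_u^T R_u e_u,
# with e_x = x - x_ref and e_u = u - u_ref.

def stage_cost(x, x_ref, u, u_ref, Q_x, R_u):
    """Quadratic penalty on state and input deviations from their references."""
    e_x = x - x_ref
    e_u = u - u_ref
    return float(e_x @ Q_x @ e_x + e_u @ R_u @ e_u)
```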
The collision cost that penalizes proximity to agent j is given by:

J_c^j = \frac{Q_c}{1 + \exp\left(\kappa_j (d_j - r_{th})\right)},  (10)

where d_j = ‖p − p_j‖ is the Euclidean distance to agent j, Q_c is a tuning parameter, κ_j is a parameter that defines the smoothness of the cost function and r_th is a threshold distance between the agents at which the collision cost drops to Q_c/2. Equation (10) is based on the logistic function; the main motivation behind this choice is to achieve a smooth and bounded collision cost function. Figure 3 shows the cost function for different parameters.
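The logistic collision cost can be sketched in a few lines. The parameter values Q_c, κ and r_th below are illustrative, not the tuning used in the experiments:

```python
import math

# Sketch of the logistic collision cost (10): smooth, bounded above by Q_c,
# and equal to Q_c/2 exactly at the threshold distance r_th.

def collision_cost(d, Q_c=10.0, kappa=4.0, r_th=1.5):
    """Bounded potential-field-like penalty on the distance d to another agent."""
    return Q_c / (1.0 + math.exp(kappa * (d - r_th)))
```

The cost decreases monotonically with distance and vanishes quickly beyond r_th, which is what makes it both smooth and bounded, unlike a classic 1/d potential field.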
4.3 Constraints
The first constraint in (7) guarantees that the state evolution respects the MAV dynamics. To compensate for external disturbances and modeling errors and achieve offset-free tracking, we employ a model-based filter to estimate external disturbances, as described in detail in [19]. The second constraint addresses limitations on the control input:

u_{min} \le u \le u_{max}.  (11)
The third constraint guarantees collision avoidance by setting tight hard constraints on the distance between two agents. The j-th row of G represents the collision constraint with agent j:

G_j(x) = r_{min} - \lVert p(t) - p_j(t) \rVert \le 0,  (12)

where p_j(t) is the position of agent j at time t. These constraints are nonconvex, but continuous and smooth. r_min is chosen to always be strictly less than r_th, to guarantee that the hard constraints become active only if the potential field in Equation (10) is not able to maintain the threshold distance r_th to agent j.
Finally, the last constraint in the optimization problem fixes the initial state to the current state estimate x̂(t₀).
4.4 Agents Motion Prediction
Given that the approach presented in this work is based on Model Predictive Control, it is beneficial to employ a simple model of the other agents and use it to predict their future behavior. The model we employ in this work is a constant velocity model; it can be replaced with a more sophisticated model, which we will consider in future work. Given the current position p_j(t₀) and velocity v_j of agent j, we can predict the agent's future positions along the prediction horizon as follows:

p_j(t) = p_j(t_0) + v_j (t - t_0 + t_d),  (13)

where t_d is the communication delay, which we compensate for to achieve better prediction. t_d is calculated as the difference between the timestamp on the message and its arrival time. This is possible thanks to clock synchronization between the agents and a time server; the delay compensation can be omitted if there is no clock synchronization between agents. Additionally, to reduce noise sensitivity, we consider the velocity to be zero if its magnitude is below a certain threshold v_thresh.
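The constant-velocity prediction of Equation (13), including the delay compensation and the velocity-noise threshold, can be sketched as follows. The default threshold value is an illustrative assumption:

```python
# Constant-velocity prediction (13) with communication-delay compensation.
# t_d is the measured communication delay; v_thresh (illustrative value)
# suppresses velocity measurement noise by clamping slow agents to zero.

def predict_agent_positions(p_j, v_j, horizon_steps, dt, t_d=0.0, v_thresh=0.05):
    """Predict agent j's position at each discretization step of the horizon."""
    speed = sum(v ** 2 for v in v_j) ** 0.5
    v = v_j if speed >= v_thresh else (0.0, 0.0, 0.0)
    predictions = []
    for k in range(horizon_steps + 1):
        t = t_d + k * dt  # elapsed time since the stamped measurement
        predictions.append(tuple(p + v_i * t for p, v_i in zip(p_j, v)))
    return predictions
```

Note that even the first predicted position is shifted by v_j · t_d, which is exactly the delay compensation described above.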
4.5 Uncertainty Propagation
To account for the uncertainty of the state estimator and the uncertainty of the other agents, and thereby achieve a higher level of robustness, we propagate the estimated state uncertainty to calculate the minimum allowed distance r_min and the threshold distance r_th to agent j. In other words, if the state is highly uncertain, we should be more conservative about allowing agents to get close to each other, by increasing r_min and r_th at each time t along the prediction horizon. The uncertainty of the agent's position is propagated using the model described in Equation (13), while the self-uncertainty can be propagated with higher accuracy by employing the system model described in Equations (3a), (3b) and (4). In many previous works, uncertainty propagation for nonlinear systems is performed using the unscented transformation. In our case, given that we need real-time performance, we choose to perform uncertainty propagation as in an Extended Kalman Filter (EKF). Given the current predicted state with covariance P(t₀), we propagate the uncertainty by solving the following differential equation:

\dot{P}(t) = F(t) P(t) + P(t) F(t)^T,  (14)

with boundary condition P(t₀), where F is the state transition Jacobian matrix. Using Equation (14) we compute the agent's uncertainty and the self-uncertainty at time t, namely P_j(t) and P(t). These values are employed to calculate r_min and r_th according to the following equations:
r_{min}(t) = r_0 + \lambda_s(t) + \lambda_{a,j}(t), \quad r_{th}(t) = r_{th,0} + \lambda_s(t) + \lambda_{a,j}(t),  (15)

where λ_s is the square root of the maximum eigenvalue of the self-uncertainty P(t), λ_{a,j} is the square root of the maximum eigenvalue of the agent's uncertainty P_j(t), and r₀ and r_{th,0} are constant parameters. We use the maximum eigenvalue to reduce the problem of computing the distance between two ellipsoids, which is complex and time consuming, to the computation of the distance between two spheres. Approximating the uncertainty ellipsoid by its enclosing sphere makes the bounds more conservative, especially if the uncertainty is disproportionately large in one particular direction. Figure 4 illustrates the concept for 2 agents.
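The eigenvalue-based inflation of Equation (15) can be sketched as below. The base radii r₀ and r_th,0 are illustrative placeholders (only their ordering r₀ < r_th,0 matters, as discussed for the hard constraints):

```python
import numpy as np

# Sketch of (15): inflate the hard-constraint radius r_min and the cost
# threshold r_th by the square roots of the largest eigenvalues of the two
# position covariances (enclosing-sphere approximation of the ellipsoids).

def inflated_radii(P_self, P_agent, r0=0.9, r_th0=1.5):
    """Return (r_min, r_th) inflated by both agents' position uncertainty."""
    lam_s = np.sqrt(np.max(np.linalg.eigvalsh(P_self)))   # self-uncertainty bound
    lam_a = np.sqrt(np.max(np.linalg.eigvalsh(P_agent)))  # agent-uncertainty bound
    return r0 + lam_s + lam_a, r_th0 + lam_s + lam_a
```

Because both radii are inflated by the same amount, r_min stays strictly below r_th, preserving the layering of soft cost and hard constraint.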
4.6 Implementation
A multiple shooting technique [20] is employed to solve (7). The system dynamics and constraints are discretized over a coarse discrete time grid within the prediction horizon. For each interval, a Boundary Value Problem (BVP) is solved, where additional continuity constraints are imposed. An implicit Runge-Kutta integrator of order 4 is employed to forward simulate the system dynamics along each interval. At this point, the OCP can be expressed as a Nonlinear Program (NLP) that can be solved using a Sequential Quadratic Programming (SQP) technique, where an active set or interior point method is used to solve the resulting Quadratic Programs (QPs). For the controller to behave as a local planner and to guarantee problem feasibility, a long prediction horizon is necessary. To achieve this without significantly increasing the computational effort, the time step of the grid over which the system dynamics are discretized is chosen to be larger than the controller sampling time. Smooth predictions are obtained thanks to the implicit Runge-Kutta integrator of order 4 and because the system dynamics are not represented by stiff differential equations.
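The forward simulation inside each shooting interval can be illustrated with a single Runge-Kutta step. The paper uses an implicit RK scheme; the classical explicit RK4 shown here is a simplified stand-in for non-stiff dynamics, with f(x, u) standing for the model of Equations (3) and (4):

```python
# One classical explicit RK4 step for forward simulating x' = f(x, u) over
# a shooting interval of length dt (a simplified sketch; the paper uses an
# implicit RK integrator of the same order).

def rk4_step(f, x, u, dt):
    """Advance the state list x by one step of size dt under input u."""
    k1 = f(x, u)
    k2 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k1)], u)
    k3 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k2)], u)
    k4 = f([xi + dt * ki for xi, ki in zip(x, k3)], u)
    return [xi + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
```

Chaining such steps across the grid, with continuity constraints tying interval boundaries together, is exactly the multiple shooting discretization described above.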
4.7 Priority
In many situations, a particular agent may have a high priority to follow its reference trajectory, for instance when a MAV is delivering an object or capturing footage for a mapping task. The proposed method can handle priorities simply by changing the connectivity graph between agents. The highest priority MAV will not perform any avoidance maneuver and therefore does not need access to the other agents' states. A lower priority agent will perform avoidance maneuvers only to avoid higher priority agents.
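The priority scheme amounts to building a directed connectivity graph: each agent only listens to (and avoids) agents of strictly higher priority. A minimal sketch, with agent identifiers and priority ranks as hypothetical inputs:

```python
# Sketch of the priority-based connectivity graph: each agent must avoid
# only agents with strictly higher priority (lower rank = higher priority).

def avoidance_sets(priorities):
    """priorities: dict mapping agent id -> priority rank (0 = highest).

    Returns, for each agent id, the set of agent ids it must avoid.
    """
    return {
        agent: {other for other, p in priorities.items() if p < rank}
        for agent, rank in priorities.items()
    }
```

The top-priority agent ends up with an empty avoidance set, matching the behavior described above: it tracks its reference without reacting to anyone.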
5 Evaluation
In this section we evaluate the proposed approach in simulation and experimentally. The proposed controller has been implemented in C++ and integrated into the Robot Operating System (ROS). The ACADO toolkit [21] is employed to generate a fast C solver for the OCP. First we show two simulation studies in a high-fidelity MAV simulator, RotorS [22], where MAVs are commanded to swap positions simultaneously. Then we present two experimental evaluations of the method with two MAVs.
5.1 Simulation Results
In this simulation study we evaluate our method on six MAVs commanded to exchange their initial positions. The reference trajectory for each agent is not collision-free. Figure 11 shows the evolution of the MAV trajectories when all agents share position and velocity information in a fully connected graph. In this case, avoidance maneuvers are reciprocal. The average computation time during this simulation is 1.0 ms, while the worst case computation time is around 4.0 ms on an Intel i7 2.8 GHz CPU.
In another simulation study, also depicted in Figure 11, we assigned a hierarchical avoidance scheme, where the first MAV (blue sphere) has top priority to follow the reference trajectory, the second MAV (red sphere) avoids only the first MAV, the third MAV avoids the first two, and so on.
Clearly, assigning a priority scheme makes the problem simpler to solve, resulting in better position swapping trajectories as shown in Figure 11.
5.2 Experimental Results
We experimentally evaluated the proposed approach on two AscTec NEO hexacopters. In a first experiment, one MAV is commanded to hover in place, while a second MAV with high priority is commanded manually to fly in the vicinity of the first MAV. In a second experiment, crossing reference trajectories are planned for both MAVs and executed. The purpose of these experiments is to show how the proposed method exploits the system dynamics and responds abruptly to changes in the other agent's behavior in a robust manner.
Experimental Setup
The AscTec NEO hexacopter is equipped with an Intel i7 2.8 GHz computer with 8 GB of RAM running ROS, as well as an onboard flight controller with a tuned attitude controller. The onboard computer communicates with the flight controller over a serial port and exchanges information at 100 Hz. An external motion capture system (Vicon) is used to measure each MAV pose. This measurement is fused with the onboard Inertial Measurement Unit (IMU) to achieve accurate state estimation using the multi-sensor fusion framework [23]. The estimated position and velocity of each agent is shared over the network. The controller runs onboard each MAV at 100 Hz, while the prediction horizon of the controller is chosen to be 2 s.
Experiments
Figure 8 shows the distance between the two agents over time. The distance is almost always maintained above the threshold distance r_th, except during moments of aggressive maneuvers; however, the hard constraint was never activated, as the distance remained above r_min at all times.
In the second experiment, intersecting reference trajectories with a maximum velocity of 2 m/s are planned and sent as references to the respective MAVs. Figure 5 shows a sequence of images taken 1 s apart during the experiment.
6 Conclusions
In this paper we presented a multi-MAV collision avoidance strategy based on Nonlinear Model Predictive Control. The approach accounts for state estimator uncertainty by propagating the uncertainty along the prediction horizon to increase the minimum acceptable distance between agents, providing robust collision avoidance. Tight hard constraints on the distance between agents guarantee no collisions if the prediction horizon is sufficiently long. Moreover, by changing the connectivity graph, it is possible to assign priority to certain agents to follow their reference trajectories. The approach has been evaluated in simulation with 6 agents and in real experiments with 2 agents. Our experiments showed that this collision avoidance approach results in agile and dynamic avoidance maneuvers while maintaining system stability at reasonable computational cost.
Acknowledgment
This work was supported by the European Union’s Horizon 2020 Research and Innovation Programme under the Grant Agreement No.644128, AEROWORKS.
Abbreviations
MAV: Micro Aerial Vehicle
NMPC: Nonlinear Model Predictive Control
MPC: Model Predictive Controller
CoG: Center of Gravity
OCP: Optimal Control Problem
SQP: Sequential Quadratic Programming
NLP: Nonlinear Program
BVP: Boundary Value Problem
VO: Velocity Obstacles
EKF: Extended Kalman Filter
QP: Quadratic Program
ROS: Robot Operating System
IMU: Inertial Measurement Unit
Footnotes
 Video available at http://goo.gl/RWRhmJ
References
 K. Steich, M. Kamel, P. Beardsley, M. K. Obrist, R. Siegwart, and T. Lachat, "Tree cavity inspection using aerial robots," in 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2016.
 A. Bircher, M. Kamel, K. Alexis, H. Oleynikova, and R. Siegwart, "Receding horizon "next-best-view" planner for 3d exploration," in 2016 IEEE International Conference on Robotics and Automation (ICRA), May 2016, pp. 1462-1468.
 A. Girard, A. Howell, and J. Hedrick, “Border patrol and surveillance missions using multiple unmanned air vehicles,” in Decision and Control, 2004. CDC. 43rd IEEE Conference on, 2004.
 P. Oettershagen, T. J. Stastny, T. A. Mantel, A. S. Melzer, K. Rudin, G. Agamennoni, K. Alexis, and R. Siegwart, "Long-endurance sensing and mapping using a hand-launchable solar-powered uav," in Field and Service Robotics, 10th Conference on, June 2015, (accepted).
 E. R. Hunt, W. D. Hively, S. J. Fujikawa, D. S. Linden, C. S. T. Daughtry, and G. W. McCarty, "Acquisition of nir-green-blue digital photographs from unmanned aircraft for crop monitoring," Remote Sensing, vol. 2, no. 1, pp. 290-305, 2010.
 P. Rudol and P. Doherty, “Human body detection and geolocalization for uav search and rescue missions using color and thermal imagery,” in Aerospace Conference, 2008 IEEE, 2008, pp. 1–8.
 N. Michael, D. Mellinger, Q. Lindsey, and V. Kumar, "The grasp multiple micro-uav testbed," IEEE Robotics & Automation Magazine, vol. 17, no. 3, pp. 56-65, 2010.
 M. Burri, H. Oleynikova, M. W. Achtelik, and R. Siegwart, "Real-time visual-inertial mapping, relocalization and planning onboard mavs in unknown environments," in Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on. IEEE, 2015, pp. 1872-1878.
 E. Frazzoli, M. A. Dahleh, and E. Feron, "Real-time motion planning for agile autonomous vehicles," Journal of Guidance, Control, and Dynamics, vol. 25, no. 1, pp. 116-129, 2002.
 A. Kushleyev, D. Mellinger, C. Powers, and V. Kumar, “Towards a swarm of agile micro quadrotors,” Autonomous Robots, vol. 35, no. 4, pp. 287–300, 2013.
 F. Augugliaro, A. P. Schoellig, and R. D'Andrea, "Generation of collision-free trajectories for a quadrocopter fleet: A sequential convex programming approach," in Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on. IEEE, 2012, pp. 1917-1922.
 D. H. Shim, H. J. Kim, and S. Sastry, “Decentralized nonlinear model predictive control of multiple flying robots,” in Decision and control, 2003. Proceedings. 42nd IEEE conference on, vol. 4. IEEE, 2003, pp. 3621–3626.
 J. Alonso-Mora, T. Naegeli, R. Siegwart, and P. Beardsley, "Collision avoidance for aerial vehicles in multi-agent scenarios," Autonomous Robots, vol. 39, no. 1, pp. 101-121, 2015.
 M. Neunert, C. de Crousaz, F. Furrer, M. Kamel, F. Farshidian, R. Siegwart, and J. Buchli, “Fast nonlinear model predictive control for unified trajectory optimization and tracking,” in Robotics and Automation (ICRA), 2016 IEEE International Conference on. IEEE, 2016, pp. 1398–1404.
 R. Mahony, V. Kumar, and P. Corke, “Multirotor aerial vehicles: Modeling, estimation, and control of quadrotor,” IEEE Robotics Automation Magazine, vol. 19, no. 3, pp. 20–32, Sept 2012.
 S. Omari, M. D. Hua, G. Ducard, and T. Hamel, “Nonlinear control of vtol uavs incorporating flapping dynamics,” in 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Nov 2013, pp. 2419–2425.
 M. Burri, J. Nikolic, H. Oleynikova, M. W. Achtelik, and R. Siegwart, “Maximum likelihood parameter identification for mavs,” in 2016 IEEE International Conference on Robotics and Automation (ICRA), May 2016, pp. 4297–4303.
 M. Blösch, S. Weiss, D. Scaramuzza, and R. Siegwart, “Vision based mav navigation in unknown and unstructured environments,” in Robotics and automation (ICRA), 2010 IEEE international conference on. IEEE, 2010, pp. 21–28.
 M. Kamel, T. Stastny, K. Alexis, and R. Siegwart, "Model predictive control for trajectory tracking of unmanned aerial vehicles using robot operating system," in Robot Operating System (ROS) The Complete Reference, A. Koubaa, Ed. Springer Press, 2017, (to appear).
 C. Kirches, The Direct Multiple Shooting Method for Optimal Control. Wiesbaden: Vieweg+Teubner Verlag, 2011, pp. 13-29. [Online]. Available: http://dx.doi.org/10.1007/978-3-8348-8202-8_2
 B. Houska, H. Ferreau, and M. Diehl, “ACADO Toolkit – An Open Source Framework for Automatic Control and Dynamic Optimization,” Optimal Control Applications and Methods, vol. 32, no. 3, pp. 298–312, 2011.
 F. Furrer, M. Burri, M. Achtelik, and R. Siegwart, Robot Operating System (ROS): The Complete Reference (Volume 1). Cham: Springer International Publishing, 2016, ch. RotorS: A Modular Gazebo MAV Simulator Framework, pp. 595-625. [Online]. Available: http://dx.doi.org/10.1007/978-3-319-26054-9_23
 S. Lynen, M. W. Achtelik, S. Weiss, M. Chli, and R. Siegwart, “A robust and modular multisensor fusion approach applied to mav navigation,” in Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on. IEEE, 2013, pp. 3923–3929.