Integrated Motion Planner for Real-time Aerial Videography
with a Drone in a Dense Environment
Abstract
This letter proposes an integrated approach for a drone (or multirotor) to perform an autonomous videography task in a 3D obstacle environment by following a moving object. The proposed system includes 1) a target motion prediction module which can be applied to dense environments and 2) a hierarchical chasing planner based on a proposed metric for visibility. In the prediction module, we minimize the observation error under the premise that the target object does not collide with obstacles; the estimated future trajectory of the target is obtained by covariant optimization. The other module, the chasing planner, has a bilevel structure composed of a preplanner and a smooth planner. In the first phase, we leverage a graph-search method to preplan a chasing corridor which incorporates safety and visibility of the target during a time window. In the subsequent phase, we generate a smooth and dynamically feasible path within the corridor using quadratic programming (QP). We validate our approach in multiple complex scenarios and actual experiments. The source code can be found at https://github.com/icslJeon/traj_gen_vis.
I. Introduction
Video filming has been one of the most popular applications of unmanned aerial vehicles equipped with vision sensors, exploiting their maneuverability and the improvements in technologies such as visual odometry [qin2018vins] and mapping [hornung2013octomap, oleynikova2018safe]. For example, drones have been employed in various cinematographic tasks from personal use to broadcasting sports events, and the corresponding research has received great interest over the past decade [nageli2017real, penin2018vision, bonatti2018autonomous]. Still, the automation of videographic tasks using drones remains an open challenge, especially in general dense environments.
This letter addresses an online motion strategy developed for more realistic situations where multiple obstacles have arbitrary shapes and the future trajectory of the target is not exactly known a priori to the filming drone, except for the locations of sparse via-points which are preselected for filming purposes. Also, we do not assume that the arrival time at each point is known to the drone. For example, a drone can be deployed to shoot a ski game or a race where players are supposed to pass defined spots on a track. As another common example, consider an event where an important person (or actor) passes through defined locations in a crowded place while the arrival times at those spots are not exactly predetermined.
I-A Technical challenges
In our problem, the following can be identified as the main challenges, which should be handled jointly.
A1) Smooth transition
First of all, the smoothness of the drone's flight path is essential for flight efficiency: jerky motion increases actuation inputs and degrades shooting quality.
A2) Flight safety
The recording agent should be able to maintain its safety against obstacles of arbitrary shape, not only simple ones (e.g., ellipsoids or spheres), for broad applicability.
A3) Occlusion against obstacles of general shape
Occlusion should be handled carefully in obstacle environments. It degrades the aesthetic quality of the video, which can be one of the top priorities in cinematographic tasks. More practically, a prolonged occlusion of a dynamic object might interrupt the autonomous mission if the cinematographer drone fails to re-detect the target after losing it from the field of view.
A4) Trade-off between global optimality and fast computation
As described in A1)-A3), a motion strategy for videography in the considered cases aims to achieve multiple objectives simultaneously. Such a multi-objective problem is subject to local minima and might yield a poor solution if it relies entirely on numerical optimization. On the contrary, if one relies only on sampling or discrete search algorithms such as RRT* [karaman2011sampling] and A* [duchovn2014path] to pursue global optimality at the cost of online computation, the drone might not respond fast enough to the uncertain motion of the target on the fly. Therefore, the trade-off between optimality and fast computation should be balanced.
A5) Target prediction considering obstacles
For operation in obstacle environments with incomplete information about the target trajectory, another challenge is reliable motion prediction of a dynamic object with consideration of obstacles. For example, if the chaser makes an infeasible prediction that does not properly reflect obstacle locations, planning based on that wrong prediction also becomes unreliable. Thus, accounting for obstacles is crucial to enhance the chasing performance in our scenario.
I-B Related works
The previous works [nageli2017real], [penin2018vision] and [bonatti2018autonomous] addressed a similar target-following problem with consideration of flight efficiency, safety and visibility (A1-A3), using continuous optimization formulations to deal with A1-A4 in their problem settings. [nageli2017real] and [penin2018vision] developed receding-horizon motion planners to yield dynamically feasible paths in real time for dynamic situations. They assume an ellipsoidal shape for obstacles, which is not applicable to more general cases, making it difficult to fully satisfy A2 and A3. Also, the objective function contains multiple non-convex terms such as trigonometric functions and products of vectors. This formulation might not produce a satisfactory solution in a short time due to local minima, as discussed in A4.
In [bonatti2018autonomous], occlusion and collision were handled in a general environment represented with an octomap. Nevertheless, the approach relies on numerical optimization of the entire objective, which contains complex terms such as the integration of a signed distance field over a manifold. Such an approach might not guarantee satisfactory optimality, similar to the case of [nageli2017real, penin2017vision] (A4). In [chen2016tracking], the authors addressed target following with consideration of A1, A2, A4 and A5. They designed a hierarchical planner to handle the trade-off between optimality and online computation, where a corridor is preplanned to ensure safety and a smooth path minimizing high-order derivatives is then generated in the following phase. In particular, [chen2016tracking] predicts target movement with polynomial regression over past observations. Still, the prediction does not consider obstacles (A5), and occlusion of the target is not included in their multi-layer planner, making A3 difficult to handle.
Regarding target prediction in target-following tasks, [vsvec2014target] included obstacles in the formulation of the prediction, directly tackling A5. The authors performed Monte-Carlo sampling to estimate the distribution of future target positions. In that work, however, the dynamics and the set of possible inputs of the target were assumed to be known a priori, which is difficult to apply directly to general cinematic scenarios. Also, the authors restricted the homotopy of the robot's solution path by assuming a discrete selection of actuation inputs. This method might not achieve sufficient travel efficiency, as pointed out in A1.
To the best of our knowledge, little research effectively handles A1 to A5 simultaneously for drones employed in the considered cinematic or chasing scenarios. In this letter, we make the following contributions as an extension of our previous work [jeon2019online]:

- An integrated framework for a motion strategy, from the prediction module to the chasing planner, which achieves the desired performance on A1-A5 in our cinematic problem setting. In particular, we validate the newly developed prediction module by examining its effectiveness for the proposed motion planner.

- A validation of our method in multiple challenging scenarios and a real-world experiment. Notably, the tested platform operates fully onboard, handling target detection, localization, motion planning and control.
The remainder of this letter is structured as follows: we first describe the problem setting and overall approach in Section II. A method for target prediction over a future time window is proposed in Section III, followed by the hierarchical chasing planner design in Sections IV and V.
II. Overview
Here we outline and formalize the given information and the desired capabilities of the proposed method. We assume that the filming scene is available in the form of an octomap before filming. It is also assumed that the actor will pass a set of via-points in sequence for filming purposes; the via-points are known a priori, while the arrival time at each point is not. As an additional specification of the target's behavior, it is assumed to move along a trajectory that minimizes high-order derivatives such as acceleration, as assumed in [chen2016tracking]. Additionally, we assume that the target object is visible from the drone at the start of the mission, within the drone's limited field of view (FOV).
Based on these settings, we focus on a chasing planner and a target prediction module which can handle A1-A5 simultaneously, as mentioned in Section I. Additionally, the chasing strategy optimizes the total travel distance and the effort to maintain a desired relative distance between the drone and the object.
III. Target future trajectory prediction
This section describes the estimation of the target's future motion utilizing the observation history and prior map information. Here the terms path and trajectory are differentiated for clarity, as described in [gasparetto2015path]: a path refers to a geometric path, while a trajectory is a time-parameterized path. The proposed prediction module first generates a geometric prediction path, followed by time estimation for each point on the path.
III-A Target path prediction
As mentioned in Section II, we assume that the sequence of via-points of the target is available, that the target is supposed to pass them in order, and that the arrival time at each point is not preset. We also make use of the Euclidean signed distance field (ESDF) of the prior map, which returns the signed distance at a queried position, and we track the position of the object over time.
Now, let us assume that the drone has gathered target observations at discrete time steps, and consider a situation where the target heads to the next via-point after passing the previous one. For a prediction callback time and a future horizon, we want to forecast the future target trajectory. To obtain it, a geometric path is generated first, providing a prediction up to the point where the target reaches the next via-point, by solving the following optimization.
(1)  
where the first term penalizes the observation error with a positive weighting constant that emphasizes recent observations. The second term encodes the assumption that the target minimizes its derivatives for the sake of actuation efficiency. The function in the last term is a non-convex cost reflecting the assumption of safe target behavior (see [ratliff2009chomp] for more details of the functional), computed from the ESDF of the prior map. Eq. (1) can be rearranged into the standard form for covariant optimization [ratliff2009chomp].
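As a minimal sketch, the optimization in (1) can be reproduced in one dimension with plain gradient descent (a simplified stand-in for the covariant update; the function name, weights and the hinge-type obstacle cost here are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def predict_path(obs_p, n_pred, esdf, gamma=1.2, w_deriv=1.0,
                 w_obs=0.5, d_safe=0.5, iters=2000, lr=0.02):
    """1-D sketch of the prediction optimization (1): weighted
    observation error + derivative penalty + ESDF-based obstacle cost,
    minimized by plain gradient descent."""
    obs_p = np.asarray(obs_p, float)
    n_obs = len(obs_p)
    # decision variable: a fit of the observed segment plus the
    # predicted segment, initialized at the last observation
    x = np.concatenate([obs_p, np.full(n_pred, obs_p[-1])])
    w = gamma ** np.arange(n_obs)        # weight recent observations more
    for _ in range(iters):
        g = np.zeros_like(x)
        # 1) weighted observation error over the observed indices
        g[:n_obs] += 2.0 * w * (x[:n_obs] - obs_p)
        # 2) smoothness: squared second differences (acceleration)
        acc = x[:-2] - 2.0 * x[1:-1] + x[2:]
        g[:-2] += 2.0 * w_deriv * acc
        g[1:-1] += -4.0 * w_deriv * acc
        g[2:] += 2.0 * w_deriv * acc
        # 3) obstacle cost: squared hinge on clearance max(0, d_safe - esdf)
        eps = 1e-4
        d = np.array([esdf(xi) for xi in x])
        dgrad = np.array([(esdf(xi + eps) - esdf(xi - eps)) / (2 * eps)
                          for xi in x])
        g += w_obs * 2.0 * np.maximum(0.0, d_safe - d) * (-dgrad)
        x -= lr * g
    return x[n_obs:]                      # the predicted future points
```

With obstacles far away, the smoothness term makes the prediction extrapolate the observed motion nearly linearly, matching the constant-derivative assumption on the target.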
III-B Time prediction
In this subsection, we allocate time knots for each point of the predicted path under a constant-velocity assumption, for simplicity. For the points used to regress on the past history of the target, we simply assign the observation time stamps. For the predicted positions, the following recursion is used to allocate times.
(4)  
where the average speed over the collected observations is used. The passing times for the points obtained in Eq. (1) are estimated with the constant-velocity assumption based on this average speed. With this allocation, the future trajectory of the target over a time window is predicted with the following interpolation:
(5) 
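The constant-velocity time allocation of (4) and the interpolation of (5) can be sketched as follows (the function names and the piecewise-linear interpolation are assumptions for illustration):

```python
import numpy as np

def allocate_times(obs_t, obs_p, pred_p):
    """Time knots as in (4): observed points keep their stamps, and
    each predicted point gets the previous time plus its segment
    length divided by the average observed speed."""
    obs_p, pred_p = np.asarray(obs_p, float), np.asarray(pred_p, float)
    dist = np.sum(np.linalg.norm(np.diff(obs_p, axis=0), axis=1))
    v_bar = dist / (obs_t[-1] - obs_t[0])   # average speed of the target
    times = list(obs_t)
    prev = obs_p[-1]
    for p in pred_p:
        times.append(times[-1] + np.linalg.norm(p - prev) / v_bar)
        prev = p
    return np.array(times)

def target_at(t, times, points):
    """Piecewise-linear interpolation of the trajectory, as in (5)."""
    points = np.asarray(points, float)
    return np.array([np.interp(t, times, points[:, d])
                     for d in range(points.shape[1])])
```

Querying `target_at` inside the prediction horizon then gives the forecast position used by the chasing planner.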
In our videography setting, introduced in Sec. VI, a single prediction optimization routine runs at 30-50 Hz, demonstrating real-time performance. This allows us to re-trigger prediction on the fly whenever the estimation error exceeds a threshold.
IV. Preplanning for chasing corridor
This section introduces a method for generating a chasing corridor that bounds the region of the chasing drone's trajectory. We first explain a metric encoding the safety and visibility of the chaser's position, which is utilized as an objective in computing the corridor. Then, corridor generation by means of graph search is described. Several notations are defined as follows:
- Position of the chaser (drone).
- Position of the target.
- The line segment connecting the chaser and target positions.
- Configuration space.
- Free space: the set of points whose probability of occupancy obtained from the octomap is small enough.
- Space occupied by obstacles.
- The set of visible vantage points for a given target position.
- The set of occluded vantage points for a given target position.
IV-A Metric for safety and visibility
For the safe flight of the camera-drone, we reuse the ESDF, as it measures the risk of collision with nearby obstacles. Here, it is used as a constraint in graph construction so that the drone maintains a safe clearance over the entire planning horizon. Now, the visibility metric is introduced to encode how robustly the drone can maintain its sight of the target against occluding obstacles and unexpected motion of the target in the near future. For a target position seen from a chaser position along the line of sight (LOS), we define visibility as:
(6) 
In the actual implementation, (6) is calculated over the grid field. That is, we can evaluate (6) with a simple min operation while iterating through the voxels along the LOS, with linear time complexity. Because (6) is the minimum distance between obstacles and the LOS connecting the object and the drone, a small value implies that the target could be lost more easily than a higher value, as illustrated in Fig. 5(b). The proposed metric has multiple advantages. First, it can be computed directly by reusing the ESDF that was already utilized for target prediction and the drone's safety constraint, without further complex calculation. Second, it can be defined without restrictions on obstacle shape, in contrast to works such as [nageli2017real], [penin2018vision] and [bonatti2018autonomous]. A detailed explanation of the advantages and properties of the proposed metric is given in our previous work [jeon2019online].
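A possible implementation of (6), assuming the ESDF is exposed as a callable and approximating the voxel iteration by uniform sampling along the LOS at the grid resolution:

```python
import numpy as np

def visibility(esdf, target, chaser, res=0.1):
    """Visibility of (6): the minimum ESDF value along the line of
    sight between target and chaser, approximated by sampling the
    segment at the grid resolution `res`."""
    target, chaser = np.asarray(target, float), np.asarray(chaser, float)
    length = np.linalg.norm(chaser - target)
    n = max(2, int(np.ceil(length / res)) + 1)   # ~one sample per voxel
    return min(esdf(target + s * (chaser - target))
               for s in np.linspace(0.0, 1.0, n))
```

The cost of one evaluation grows linearly with the LOS length, consistent with the linear time complexity noted above.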
IV-B Corridor generation
Based on the proposed metric for safety and visibility, the computation of the sequence of chasing corridors is explained here. Beforehand, we plan a sequence of viewpoints over a time horizon as its skeleton. Let us assume that the drone's current position and the target prediction are given. Over the planning window, time is discretized into steps, and a sequence of viewpoints is generated, one per step, where each viewpoint is selected from the set of discrete grid points whose distance from the predicted target position lies between the minimum and maximum tracking distances. The discrete path is obtained from the following discrete optimization.
(7)  
subject to  
The objective function in (7) penalizes the interval distance between consecutive viewpoints and rewards a high visibility score along the path. The second term is defined by
(8)  
The last term in (7) aims to keep the relative distance between the drone and the object at a desired value; one weight scales the visibility term and another the relative-distance term. Among the constraints, the second enforces a safe clearance along each line segment, the third requires each viewpoint to be visible from the predicted target position at the corresponding step, and the last bounds the maximum connectable distance between viewpoints of subsequent steps. More details on building the directed graph used to solve this discrete optimization are given in our previous research [jeon2019online] and Fig. 6(a).
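A simplified version of the layered graph search behind (7) can be sketched with dynamic programming; here the relative-distance term is dropped, safety and target-visibility filtering are assumed to have been applied when building the candidate sets, and all names are hypothetical:

```python
import numpy as np

def plan_viewpoints(cands, vis, w_v=5.0, d_max=1.0):
    """Layered-graph search in the spirit of (7): pick one viewpoint
    per time step, trading travel distance against the visibility
    reward, via dynamic programming over the layers.

    cands[t] : (N_t, 2) candidate viewpoints at step t (pre-filtered
               for safety and visibility of the predicted target)
    vis[t]   : visibility scores of the candidates at step t
    """
    T, INF = len(cands), float("inf")
    cost = [None] * T
    back = [None] * T
    cost[0] = -w_v * np.asarray(vis[0], float)
    for t in range(1, T):
        cost[t] = np.full(len(cands[t]), INF)
        back[t] = np.zeros(len(cands[t]), int)
        for j in range(len(cands[t])):
            for i in range(len(cands[t - 1])):
                d = np.linalg.norm(np.asarray(cands[t][j], float)
                                   - np.asarray(cands[t - 1][i], float))
                if d > d_max:                 # max connection distance
                    continue
                c = cost[t - 1][i] + d - w_v * vis[t][j]
                if c < cost[t][j]:
                    cost[t][j], back[t][j] = c, i
    # backtrack the optimal viewpoint index sequence
    path = [int(np.argmin(cost[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

Because each layer only connects to the next, the dynamic program finds the globally optimal sequence over the discretized candidates in time proportional to the number of edges.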
From the viewpoints computed by (7), we generate a set of corridors connecting consecutive viewpoints, as visualized in Fig. 6(b). Once the width of the corridor is chosen, the box region connecting two consecutive viewpoints can be written as a linear inequality. The corridor is depicted with red rectangles in Fig. 6(b). Due to the formulation of (7), every point in the corridor maintains a safe margin, and from the viewpoints the predicted target can be observed without occlusion by obstacles. Also, for a sufficiently large visibility weight and a small enough corridor width, we empirically found that every point in the corridor maintains visibility of the prediction.
V. Smooth path generation
In the previous section, the procedure to select viewpoints and corridors was proposed, optimally trading off visibility and travel distance while ensuring safety. In this section, we generate a dynamically feasible trajectory for position and yaw within the corridors. The position trajectory is represented with piecewise polynomials as below:
(9) 
where the polynomial coefficients and the order of the polynomial parameterize each piece. The polynomial coefficients of the chaser's trajectory are computed from the optimization (10) below. Yaw is planned so that the drone heads toward the target at each time step whenever an observation of the target is acquired.
(10)  
subject to  
Our optimization minimizes the magnitude of jerk along the trajectory and the weighted deviation from the viewpoints. In the constraints, the state of the drone when planning was triggered is used as the initial condition of the optimization. Additionally, we enforce continuity conditions at the knots. The last constraint acts as a box constraint so that the smooth path stays within the chasing corridors for the purpose of safety and visibility. As investigated in [mellinger2011minimum], the resulting trajectory can be executed by virtue of the differential flatness of quadrotors. (10) can be solved efficiently with algorithms such as interior point methods [mehrotra1992implementation]. The overall algorithm is summarized in Algorithm 1. During the mission, we predict the target's future trajectory over a time window from several recent observations by solving (1). If the observation becomes unreliable, i.e., the accumulated estimation error exceeds a defined threshold, prediction is re-triggered. Based on the prediction, the chasing planner yields a desired trajectory for the chaser by preplanning and then generating a smooth path to be executed over the corresponding horizon. This loop continues until the end of the videographic mission.
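To illustrate the flavor of (10), the following sketch fits a single jerk-minimizing polynomial segment with soft viewpoint and initial-state terms. The corridor box constraints are omitted since they require a proper QP solver (such as qpOASES, used by the authors), so this is a soft-constrained stand-in under assumed names and weights, not the paper's formulation:

```python
import numpy as np

def smooth_segment(T, waypt_t, waypt_x, p0, v0, order=5, w=100.0):
    """Minimize the integral of squared jerk over [0, T] plus weighted
    deviation from viewpoints, with the initial position p0 and
    velocity v0 enforced by a large weight. Returns polynomial
    coefficients, lowest order first."""
    n = order + 1
    # Gramian of the jerk cost for the monomial basis t^k
    Q = np.zeros((n, n))
    for i in range(3, n):
        for j in range(3, n):
            ci = i * (i - 1) * (i - 2)
            cj = j * (j - 1) * (j - 2)
            Q[i, j] = ci * cj * T ** (i + j - 5) / (i + j - 5)
    rows, rhs, wts = [], [], []
    for t, x in zip(waypt_t, waypt_x):        # viewpoint attraction
        rows.append([t ** k for k in range(n)]); rhs.append(x); wts.append(w)
    # initial position and velocity rows evaluated at t = 0
    rows.append([0.0 ** k for k in range(n)]); rhs.append(p0); wts.append(1e6)
    rows.append([k * (0.0 ** (k - 1)) if k >= 1 else 0.0
                 for k in range(n)]); rhs.append(v0); wts.append(1e6)
    A, b, W = np.array(rows), np.array(rhs), np.diag(wts)
    # normal equations of  c^T Q c + (A c - b)^T W (A c - b)
    return np.linalg.solve(Q + A.T @ W @ A, A.T @ W @ b)
```

Adding the corridor inequalities turns this linear solve into a QP with the same quadratic cost, which is exactly the structure interior-point solvers handle efficiently.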
VI. Results
VI-A Simulations
We validated the proposed algorithm in a dense environment with multiple target trajectories. For simulation, we used the complex city environment (see our previous work [jeon2019online] for the 3D models), where five target via-points are defined as green circles in Fig. 8. Complex city includes multiple non-convex obstacles, and the target was operated to hide behind obstacles at the moments marked by orange boxes (see Fig. 8). In the simulation, we used the RotorS simulator [furrer2016rotors] for the chaser drone, and the target (a TurtleBot) was operated manually with a keyboard. A vision sensor is fixed on the drone, pitched 13° down; accordingly, the elevation of the LOS was limited when selecting candidate viewpoints. All virtual platforms operated in a Gazebo environment, and the simulations were performed on a laptop with an Intel i7 CPU and 16 GB RAM. The Boost Graph Library was used for preplanning, while qpOASES [ferreau2014qpoases] was used to solve the quadratic program in the smooth planning phase. For the four target trajectories, chasing strategies with two different visibility weights were tested, totaling 8 simulations. Other than the visibility weight, all parameters were set to the same values in all tests. For the simulation, we directly fed the current position of the TurtleBot to the drone. The results are summarized in Fig. 8 (A)-(D) and Table I. For each target scenario, the visibility history is plotted in the bottom row of Fig. 8, where a small value implies difficulty in securing visibility due to the proximity of the target to obstacles. The planner with the high visibility weight secures a higher visibility score than the planner with the low weight: on average, its visibility score was 24% higher, and the duration of occlusion was 42% lower across the four target trajectories. In contrast, the planner with the low visibility weight traveled a shorter distance, the high-visibility setting increasing the travel distance by 34% on average.
In all simulations, the safety of the drone chaser was strictly maintained during the entire mission. The average computation times are summarized in Fig. 9. The entire pipeline of the receding-horizon planner ran at 5-6 Hz, showing the capability to replan fast enough in response to unexpected target motion.
Table I: Simulation results for target scenarios A-D, each with visibility weights 1.0 and 5.0.

                      |        A        |        B        |        C        |        D
target speed [m/s]    |      0.36       |      0.57       |      0.67       |      0.75
visibility weight     |  1.0      5.0   |  1.0      5.0   |  1.0      5.0   |  1.0      5.0
avg. visibility [m]   | 0.6084  0.8433  | 0.4817  0.5330  | 0.5543  0.6051  | 0.5566  0.7791
occ. duration [sec]   | 0.099   0       | 7.59    4.323   | 4.29    2.145   | 9.768   6.435
flight dist. [m]      | 40.7761 49.9106 | 34.8218 47.5424 | 36.3040 50.6000 | 55.7393 77.2377
Table II: Parameters.

Type            | Name                                       | Value
Common          | time window [s]                            | -
Prediction      | obsrv. temporal weight                     | -
                | weight on prior term                       | -
                | obsrv. pnts. / pred. pnts.                 | -
                | pred. accum. err. tol. [m]                 | 1.0
Preplanning     | tracking distance weight                   | -
                | desired tracking dist. [m]                 | -
                | maximum connection [m]                      | -
                | lower / upper bounds of relative dist. [m] | -
                | resolution [m]                             | -
                | time step                                  | -
                | safe margin [m]                            | -
Smooth planning | waypoint weight                            | -
                | polynomial order                           | -
                | safe tol. [m]                              | -
                | tracking elev.                             | -
VI-B Real-world experiment
We tested the proposed method in an actual experiment in an indoor classroom without GPS. The drone is equipped with a ZED stereo vision sensor and a Pixhawk 2 autopilot for flight control. For visual odometry (VO), we used ZEDfu, the internal VO algorithm of the ZED. The vision-related algorithms ran on a Jetson TX2, while planning and control were processed on the onboard computer (NUC7i7BNH) (see Fig. 7). The target is a TurtleBot Waffle Pi, and a green disk was attached on top of it to simplify detection from the drone. The target was operated manually by a human operator at a linear velocity of 0.2-0.3 m/s. To obtain the position of the target, we thresholded HSV (hue, saturation, value) color to segment the target into an ellipsoid, extracted the point cloud at the center of the ellipsoid, and applied a smoothing filter to finalize the target's position. The experimental environment and the paths taken by the target and the chaser are shown in Fig. 1 with stamps of the bearing vector. The whole pipeline runs at 10 Hz with our parameter settings for the experiment. The grid size used for the octomap and the computation of the visibility score is 0.1 m. The entire path history of the drone and the target is plotted in Fig. 1. The result of each planning trigger can be found in Fig. 10.
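The color-based detection step can be sketched as follows; this numpy-only centroid extraction is an illustrative stand-in for the actual onboard pipeline (the function name and threshold values are assumptions):

```python
import numpy as np

def green_target_centroid(hsv, h_range=(40, 80), s_min=80, v_min=60):
    """Pixel centroid of a green marker by HSV thresholding.
    `hsv` is an (H, W, 3) array with OpenCV-style ranges
    (hue 0-179, saturation and value 0-255); the thresholds here are
    illustrative, not the values used in the experiment."""
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    mask = ((h >= h_range[0]) & (h <= h_range[1]) &
            (s >= s_min) & (v >= v_min))
    if not mask.any():
        return None                # target lost: trigger re-detection
    rows, cols = np.nonzero(mask)
    return float(rows.mean()), float(cols.mean())
```

The resulting pixel centroid would then be mapped to a 3D position via the stereo point cloud before smoothing, as described above.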
VII. Conclusion and future works
In this letter, we proposed a chasing planner that handles safety and occlusion against obstacles. The preplanning phase provides a chasing corridor in which objectives such as visibility, safety and travel distance are optimally incorporated. In the smooth planning phase, a dynamically feasible path is generated within the corridor. We also proposed a prediction module that allows the camera-drone to forecast the target's future motion over a time horizon in the presence of obstacles. The whole pipeline was validated in various simulation scenarios, and we implemented a real drone that operates fully onboard to perform autonomous videography. We also explored the effect of the visibility weight on the two conflicting objectives: travel distance and visibility. From the validations, we found that the chaser handled multiple hiding behaviors of the target effectively by optimizing visibility. In the future, we will extend the proposed algorithm to multi-target chasing scenarios. We also plan to extend the algorithm to unknown maps, where the drone has to explore to gather information to generate a more efficient trajectory.