Decentralized Multi-target Tracking in Urban Environments: Overview and Challenges
In multi-target tracking, sensor control involves dynamically configuring sensors to achieve improved tracking performance. Many of these techniques focus on sensors with memoryless states (e.g., waveform adaptation, beam scheduling, and sensor selection), lending themselves to computationally efficient control strategies. Mobile sensor control for multi-target tracking, however, is significantly more challenging due to the complexity of the platform state dynamics. This platform complexity necessitates high-fidelity, non-myopic control strategies in order to achieve strong tracking performance while maintaining safe operation. These sensor control techniques are particularly important in non-cooperative urban surveillance applications including person of interest, vehicle, and unauthorized UAV interdiction. In this overview paper, we highlight the current state of the art in mobile sensor control for multi-target tracking in urban environments. We use this application to motivate the need for closer collaboration between the information fusion, tracking, and control research communities across three challenge areas relevant to the urban surveillance problem.
An accurate and scalable multi-target tracking solution is a critical component of many wide-area urban surveillance systems. For example, human and vehicle detection with closed-circuit television (CCTV) networks leverages multiple spatially separated bearing-only sensors to uniquely track targets throughout a city [1, 2, 3, 4]. Another important area involves tracking unauthorized unmanned aerial vehicles (UAVs) using heterogeneous and spatially distributed sensors [5, 6, 7]. For commercial UAVs that stream video and telemetry data, passive RF detection mechanisms have also been suggested [8, 9, 10]. Across all of these applications, the deployment and positioning of the sensors over time has a major impact on multi-target tracking performance. This is especially true when tracking with passive sensors, which requires fusing multiple sensors to unambiguously resolve target position and velocity. Examples of these passive sensor types include received signal strength indicator (RSSI), time difference of arrival (TDOA), frequency difference of arrival (FDOA), and angle of arrival (AOA).
Sensor deployment and path planning for multi-target tracking fall under the broad research area of sensor control. Sensor control began receiving considerable attention from the information fusion community in the late 1990s [11, 12]. Many of the techniques in the area initially focused on dynamic reconfiguration of individual sensors in order to maintain strong target tracking performance (e.g., beam scheduling [13], waveform selection [14, 15]). In the early 2000s, however, the focus shifted to include sensor selection for wireless sensor network (WSN) applications [16, 17]. The size, weight, and power (SWaP) requirements of these systems necessitated sensor control techniques that could balance tracking performance with the energy cost of obtaining and communicating sensor measurements across the network [18, 19, 20, 21]. Decentralized multi-target tracking techniques were proposed to maintain communication bandwidth scalability and resilience to sensor failure [22, 23, 24, 25, 26, 27]. The majority of WSN applications focused on stationary installations, allowing offline solutions to the problem of sensor deployment optimization. A number of WSN deployment optimization solvers were proposed by drawing analogies to the NP-hard art gallery problem from computational geometry. Metaheuristics formed the basis of many of these solvers, including techniques such as particle swarm optimization and genetic algorithms [29, 30, 31, 32]. The sensor control problem for online path planning in the context of multi-target tracking, however, is significantly more challenging and less studied. As opposed to existing offline techniques for path planning in mobile sensor networks, the multi-target tracking variation of the problem necessitates an online solution due to the lack of a priori information on target trajectories.
Very few online mobile sensor control techniques in the current state-of-the-art are capable of addressing the unique challenges associated with non-cooperative target surveillance in urban environments. This is primarily because the urban environment highly constrains sensor coverage and maneuverability based on terrain elevation and building geometries. The same shadowing issues that make target sensing difficult also introduce challenges in maintaining end-to-end network connectivity, thus rendering centralized fusion approaches impractical. Strong performance and safe operation of a mobile sensor network in this scenario requires an understanding of how the terrain impacts the relevant tracking and sensor control algorithms, all while maintaining decentralized operation.
The goal of this paper is to provide a brief summary of current multi-target multi-sensor tracking approaches that use a mobile sensor network. We use this summary to highlight the main limitations that prevent immediate application of these architectures to the urban environment. In the sections that follow, we briefly summarize the model generally assumed for the mobile sensor network control problem. Following this, we provide a brief literature review of the current state of the art in decentralized tracking with mobile sensor networks in non-urban environments. We then conclude by discussing three open challenges related to urban surveillance with commercial-off-the-shelf (COTS) UAVs, or more specifically, quadcopters.
II Problem Overview
II-A Integrated Sensing and Control Architecture
Figure 1 shows a typical architecture used for decentralized target tracking and mobile sensor control for a single platform. A sensor interface provides derived target measurements, such as time of arrival, Doppler shift, TDOA/FDOA, RSSI, or AOA. The measurements are usually obtained under measurement origin uncertainty. That is, it is not known a priori which sensor measurements correspond to clutter and which to existing targets. In addition, measurements obtained from targets may be miss-detected at a given time step. The posterior distributions for each target from the previous time step are propagated forward in time under known target birth and survival dynamics. The mechanism for performing this forward prediction is usually a variant of the Chapman-Kolmogorov integral [34, 35]. A data association process uses these predicted posterior distributions to generate a mapping from the measurements to newborn and persisting targets. The association map, sensor measurements, and platform telemetry are then used to apply the Bayes update for each target’s predicted posterior distribution. If the tracker update step is decentralized, a consensus process is used to jointly process the sensor log likelihood messages over the network with one-hop neighbors. The updated posterior distribution per target is then used to perform state extraction, which generates state estimates and covariance ellipses. The sensor control policy then uses the updated posterior distribution to determine which platform control actions (e.g., heading, acceleration, target waypoint) to use for the next time step. A separate consensus procedure may also occur in order to synchronize agent control actions.
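The prediction and update steps described above can be sketched for a single target on a one-dimensional grid. This is a minimal illustration under stated assumptions, not the architecture of Figure 1: the grid discretization, the function names `predict` and `update`, the motion kernel, and the per-cell detection probability are all introduced here for illustration, and clutter, target birth, and data association are omitted.

```python
def predict(belief, kernel):
    """Chapman-Kolmogorov step on a grid: spread each cell's mass with a motion kernel.

    `kernel` is an odd-length list of transition probabilities centered on
    "stay in place" (an assumed motion model). Mass is clamped at the grid edges.
    """
    n = len(belief)
    half = len(kernel) // 2
    out = [0.0] * n
    for i, mass in enumerate(belief):
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), n - 1)
            out[j] += mass * w
    return out


def update(belief, likelihood, p_detect, detected):
    """Bayes update with a possible missed detection.

    If a measurement was received, each cell is weighted by
    p_detect[x] * likelihood[x]; otherwise by 1 - p_detect[x].
    Assumes the resulting weights are not all zero.
    """
    if detected:
        post = [pd * l * b for pd, l, b in zip(p_detect, likelihood, belief)]
    else:
        post = [(1.0 - pd) * b for pd, b in zip(p_detect, belief)]
    total = sum(post)
    return [p / total for p in post]
```

Note that a missed detection is still informative when the detection probability varies over the state space: probability mass shifts toward cells where the sensor is unlikely to see the target.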
II-B POMDP Formulation for Mobile Sensor Control
Control of mobile sensor networks for multi-target tracking typically follows a partially observable Markov decision process (POMDP) formulation [36, Chapter 5.6]. In a POMDP, the target states (i.e., positions and velocities) evolve according to a Markov process and are observed indirectly through sensor measurements. Sensor states also evolve according to a Markov process based on the control action applied at the current time step. Depending on the inertial sensor and kinematic models, the corresponding relationship between platform state dynamics and control may be deterministic or stochastic with directly or partially observable states. The relationship between the sensor measurements and the target and platform states at the current time step is given by a set of likelihood functions. The reward function is designed to capture target tracking goals (e.g., minimized cumulative uncertainty in state estimates), obstacle and inter-agent collision avoidance requirements, and constraints on platform control actions. Given this reward function defined over target states, platform states, and platform actions, the goal is to construct a closed-loop control policy that maximizes the infinite-horizon expected reward-to-go (i.e., the discounted cumulative reward). The information available for a global control policy at the current time step is the measurement history of all target and platform states and the control inputs used at each platform. For simplicity, the following discussion will assume that the sensor platform states are completely observable.
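The formulation above can be written compactly as follows. The notation is an assumption introduced here for illustration and does not appear in the original text: $x_k$ denotes the target states, $s_k$ the (fully observed) platform states, $u_k$ the control action, $z_k$ the sensor measurements, $r$ the reward function, and $\gamma \in (0, 1)$ the discount factor.

```latex
\begin{aligned}
\pi^{*} &= \arg\max_{\pi}\;
  \mathbb{E}\!\left[\, \sum_{k=0}^{\infty} \gamma^{k}\, r(x_k, s_k, u_k) \right],
  \qquad u_k = \pi\!\left(z_{0:k},\, s_{0:k},\, u_{0:k-1}\right), \\
x_{k+1} &\sim p(x_{k+1} \mid x_k), \qquad
s_{k+1} \sim p(s_{k+1} \mid s_k, u_k), \qquad
z_k \sim p(z_k \mid x_k, s_k).
\end{aligned}
```

The three conditional densities correspond, in order, to the target Markov process, the controlled platform Markov process, and the measurement likelihood functions.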
To prevent the growth of the control and state space dimensionality as new measurements are obtained, the POMDP model is typically reformulated into an equivalent Markov decision process (MDP). This is accomplished using a sufficient statistic that subsumes the measurements up until the current time step [37, Chapter 4.3]. The corresponding sufficient statistic is termed the belief state of the system. The belief state for the sensor control application here is the posterior probability of the target states given the observed measurements up until the current time step. In general, the belief state per target is estimated using application-specific variations of the recursive Bayes filter. For the multi-target case, extensions exist for the joint target probability under soft associations or for the multi-target probability distribution under the random finite sets (RFS) formalism. The sensor control reward function at each time step is then mapped to the belief states through an information theoretic measure of the quality of the current target state estimate. The most commonly used measure is the mutual information between the future states per target and the predicted sensor measurements obtained over a finite horizon lookahead. Each plausible control action affects the locations of the platforms at future time steps, which in turn affects the posterior belief state for each target. The core idea is that the mutual information metric quantifies how the sharpness of the belief state per target changes over a finite horizon lookahead under each control action.
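As a concrete illustration of the mutual information reward, consider a one-step (myopic) lookahead with a Gaussian belief and a single range measurement linearized about the belief mean. Under those assumptions, which are made here only to keep the sketch short, the mutual information has the closed form $\tfrac{1}{2}\ln\left(1 + H P H^{\top} / R\right)$, where $H$ is the unit vector from the sensor to the target mean, $P$ is the belief covariance, and $R$ is the range noise variance. The function names and the candidate-position action set are hypothetical.

```python
import math


def range_mutual_information(sensor, mean, cov, noise_var):
    """Mutual information (nats) between a 2-D Gaussian target position and
    one linearized range measurement taken from `sensor`.

    Assumes the sensor is not located exactly at the belief mean.
    """
    dx, dy = mean[0] - sensor[0], mean[1] - sensor[1]
    dist = math.hypot(dx, dy)
    hx, hy = dx / dist, dy / dist  # Jacobian of range w.r.t. target position
    # Predicted measurement variance contributed by the belief: H P H^T
    hph = hx * (cov[0][0] * hx + cov[0][1] * hy) + hy * (cov[1][0] * hx + cov[1][1] * hy)
    return 0.5 * math.log(1.0 + hph / noise_var)


def best_action(candidates, mean, cov, noise_var):
    """Pick the candidate sensor position with the largest mutual information."""
    return max(candidates, key=lambda c: range_mutual_information(c, mean, cov, noise_var))
```

With a belief that is most uncertain along one axis, the selected position is the one whose range measurement looks along that axis, which is the behavior the sharpness argument above predicts.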
Despite this simplification, the resulting belief MDP state space is a subset of the space of multivariate probability distributions. These belief states are always continuous, even if the partially observable states are discrete. As a result, very few closed-form optimal policies for belief MDPs exist. The most well known solution is for the case of a linear Gaussian POMDP with quadratic cost. Here, the optimal solution reduces to a Kalman update of the belief state, and the control solution results from solving the discrete-time algebraic Riccati equation [41, 42]. In all other cases, the policies must be determined using approximate online dynamic programming techniques for infinite dimensional state spaces, such as model predictive control (MPC) or stochastic rollout [37, Chapter 6.4-5]. For these techniques, achieving real-time implementation of the control policy involves an application-specific treatment of computational complexity.
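For the linear-quadratic case mentioned above, the control gain follows from the discrete-time algebraic Riccati equation. The sketch below handles only the scalar system $x_{k+1} = a x_k + b u_k$ with stage cost $q x_k^2 + r u_k^2$, and solves the equation by fixed-point iteration; the scalar restriction, the function names, and the tolerance are assumptions chosen for brevity rather than anything specified in the text.

```python
def solve_scalar_dare(a, b, q, r, tol=1e-12, max_iter=10000):
    """Fixed-point iteration on the scalar Riccati equation
    p = q + a^2 p - (a b p)^2 / (r + b^2 p)."""
    p = q
    for _ in range(max_iter):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("Riccati iteration did not converge")


def lqr_gain(a, b, p, r):
    """State-feedback gain k for the control law u = -k * x."""
    return (b * p * a) / (r + b * b * p)
```

In the partially observed setting, the same gain is applied to the Kalman filter's state estimate instead of the true state.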
III Related Work in Non-Urban Environments
III-A Myopic Mobility Control for Centralized Trackers
Ristic and Vo used the RFS formalism to derive a myopic sensor control policy for a single integrator (i.e., velocity controlled) plant. The tracking algorithm used was a particle filter approximation of the multi-target Bayes filter. This control policy used the Rényi divergence between the predicted and the future expected multi-object posterior after obtaining measurements from range-only mobile sensors. Ristic et al. provided a similar myopic scheme for range-only tracking, but specialized the multi-object Rényi divergence measures of Ristic and Vo to the more computationally tractable probability hypothesis density (PHD) filter.
Beard et al. used the generalized labeled multi-Bernoulli (GLMB) filter to apply the Cauchy-Schwarz divergence for Poisson point processes as the sensor control reward function. The authors also proposed the use of RFS void probabilities to achieve collision avoidance with targets. An example of controlling a single range-bearing sensor tracking multiple targets under measurement origin uncertainty was presented. A finite horizon controller was simulated that assumed a constant velocity plant with instantaneously controllable heading. Closed-form equations of the Cauchy-Schwarz divergence were provided for the case when the GLMB single object posterior densities are modeled as Gaussian mixtures.
Koohifar et al. provided a single sensor myopic control solution based on the steepest descent direction of the predicted posterior Cramér-Rao lower bound (PCRLB). This sensor control solution generalized their previous work by specifically deriving the sensor update likelihoods for an RSSI-only sensor model observing a source obeying a Bernoulli transmission process. The plant model assumed was a single integrator moving at a constant speed. The heading of the plant at each control step was chosen from a fixed quantized set.
Hoffman and Tomlin leveraged a particle filtering solution and myopic control policy for a constrained double integrator (i.e., acceleration controlled) plant obtaining bearing-only or range-only measurements. The double integrator plant was used to simulate the STARMAC quadcopter moving at slow speeds. The sensor control reward function considered was the mutual information between the predicted future target states and measurements. To maintain computational tractability, mutual information was evaluated using either local contributions per node, or limited pair-wise contributions between nodes.
III-B Non-myopic Mobility Control for Centralized Trackers
Dames and Kumar in [52, 53] used unmanned ground vehicles (UGVs) to propose a non-myopic tracking and control solution. The tracking and control algorithms included a particle filter implementation of the PHD filter and an online estimate of mutual information. In contrast to , the non-myopic policy was achieved by evaluating the mutual information between the predicted target states and the binary event of any agent observing an empty measurement set at subsequent time steps.
Atanasov et al. proposed a reduced value iteration (RVI) algorithm based on linearized target dynamics and a linearized range-bearing measurement model for a single sensor. The specific technique did not require linearized sensor platform dynamics, and as such, it was demonstrated in simulation for a single target tracking scenario under differential drive dynamics. This work was later generalized by Schlotfeldt et al., where RVI was extended to an anytime planning algorithm. The resulting technique, denoted Anytime-RVI (ARVI), was also decentralized and tested on a set of quadcopters attempting to localize ground-based robots using range and bearing estimates.
Ragi and Chong assumed linear-Gaussian state and measurement dynamics and applied a joint-probabilistic data association (JPDA) tracker. The sensor control technique used in this work is known as nominal belief-state optimization (NBO). NBO is a POMDP approach that assumes the associated belief-states (i.e., per target posteriors) are completely characterized by a normal distribution (presumably through a Kalman update). A certainty-equivalent principle was applied to remove the expectation across belief states. Both single and multi-step lookahead rollout approaches were provided. The approach additionally considered forward acceleration thrust and heading dynamics for the platform under wind force disturbances. Inter-agent collision and obstacle avoidance constraints were considered by including a scaled regularization parameter in the cost-to-go function.
III-C Decentralized Tracking and Mobility Control
Grocholsky et al. assumed a fixed wing aircraft with constant forward velocity and controllable yaw rate to implement a decentralized control rule for bearing-only sensors. Decentralized data fusion was achieved by combining the mean and covariance matrices of the estimated targets. The control law was based on the expected mutual information gain between subsequent update steps. This law was made computationally feasible by linearizing the measurement and sensor state evolution dynamics and solving the resulting linear-quadratic-Gaussian (LQG) optimal control problem. A simpler myopic strategy based on a gradient following algorithm approximating mutual information was also proposed by Chung et al.
Meyer et al. proposed a myopic gradient descent algorithm for a class of decentralized multi-target tracking algorithms based on loopy belief propagation (BP). The cost function used was the conditional entropy of the target states at the next time step given the expected sensor measurements at the next time step. The relevant BP messages and conditional entropy gradient were approximated using multiple particle systems under perfect knowledge of the number of targets and the target-to-measurement association. Although the simulations presented were for single integrator dynamics, the corresponding technique is general enough to accommodate non-linear sensor and target dynamic models.
IV Challenges Specific to Urban Surveillance
IV-A Terrain-aware Tracking and Sensor Control
The primary sensor control challenge in urban surveillance involves understanding how the terrain and building geometries affect tracking performance. In a camera based solution, for example, the observed measurements are AOAs where the probability of detection is dependent on the platform’s ability to maintain line-of-sight (LOS) on targets. A similar argument can be used for passive RF observation of low power transmitters, where the detectability of multi-path effects is negligible. (Examples of localization in multi-path rich environments have been proposed based on pattern recognition techniques; these are outside the scope of this review.) If targets maneuver into a non-line-of-sight (NLOS) region to all sensors, the uncertainty on the target’s position and velocity increases due to the lack of measurement updates. Thus, platform maneuvers that keep as many targets within LOS to their corresponding sensors will lead to an increase in the mutual information between predicted future target states and measurements. Consequently, the number of sensors that have LOS to a given target and their sensing geometry in the LOS region is also important.
There are a number of related studies that provide tracking functionality for targets constrained to road networks. Ulmke and Koch describe a particle filtering technique for tracking a single target maneuvering on partially obstructed road networks. The authors showed that improved tracking performance results when conditioning the measurement detection process on LOS/NLOS information. Ulmke et al. extended this work to the RFS formalism using the Gaussian-mixture cardinalized PHD filter. Within these efforts, the detectable regions were constrained by the road network as observed from an overhead sensor. In the general urban surveillance case, the sensors may not necessarily be overhead. Furthermore, many practical target types will not be constrained to road networks (e.g., unauthorized UAVs).
The conditioning of the sensor control policy on LOS/NLOS sensing regions as described above necessitates a non-parametric approximation of the belief state per target. Across all applications, these approximation techniques are computationally complex and make implementing non-myopic policies under high update rates very challenging. A number of point-based value iteration (PBVI) approaches [65, 66, 67] have been proposed to solve loosely related target surveillance problems [68, 69, 70]. This computational complexity is made worse by the requirement to perform moderate to high fidelity ray-casting under each sensor action to identify the LOS/NLOS regions. To keep the computational complexity of the PBVI approaches manageable, these shadowing computations necessitate some form of GPU-based acceleration from the computational geometry literature.
Another important consideration is the incorporation of safety-guaranteed operation with respect to inter-agent and obstacle collision avoidance. Some studies attempt to address the collision avoidance constraints for sensor control by directly penalizing the reward function estimate when targets are too close to other agents or other obstacles. A central issue, however, is how the safety constraint penalization term should be weighted when estimating the discounted cost-to-go. A regularization weight that is too small may not be capable of preventing a collision under the assumed dynamics of the platform. Conversely, a penalization that is too large may over-constrain the system and unnecessarily degrade the optimality of the policy. A better solution would be to select a POMDP solver that is capable of guaranteeing satisfiability of the safety constraints given an accurate map of the environment and agent positions. Minimum-norm controllers that modify the planned action from sensor control using safety barrier functions are one option [72, 73], but can potentially over-compensate for safety when the optimization reward is not quantifiable by a control Lyapunov function. Computationally tractable implementations of this technique also require a plant model that is affine in the control actions. Path planning and graph traversal techniques, such as A* and RRT, provide another option, but require a discretization of the platform state space that may not be kinematically feasible. Relevant work by the controls community applying such graph search techniques to the trajectory planning problem is given in [76, 77, 78].
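The minimum-norm idea can be illustrated for the simplest control-affine plant, a planar single integrator $\dot{p} = u$ avoiding one circular obstacle. With the barrier function $h(p) = \lVert p - p_o \rVert^2 - \rho^2$, safety is maintained if $\nabla h(p)^{\top} u \ge -\alpha\, h(p)$, and with a single such constraint the minimum-norm correction to a nominal action has a closed form. The plant, the barrier function, the gain `alpha`, and the function name are assumptions chosen for this sketch and are not taken from [72, 73].

```python
def safe_velocity(pos, u_nom, obstacle, radius, alpha=1.0):
    """Minimally modify `u_nom` so that grad_h . u >= -alpha * h holds.

    Solves min ||u - u_nom||^2 subject to one linear constraint by projecting
    onto the constraint boundary when the nominal action violates it.
    Assumes `pos` is not exactly at the obstacle center.
    """
    dx, dy = pos[0] - obstacle[0], pos[1] - obstacle[1]
    h = dx * dx + dy * dy - radius * radius
    gx, gy = 2.0 * dx, 2.0 * dy  # gradient of h with respect to position
    slack = gx * u_nom[0] + gy * u_nom[1] + alpha * h
    if slack >= 0.0:
        return u_nom  # nominal action already satisfies the barrier condition
    scale = -slack / (gx * gx + gy * gy)
    return (u_nom[0] + scale * gx, u_nom[1] + scale * gy)
```

Unlike a penalty term in the reward, the correction here is applied only when the nominal sensor control action would violate the constraint, so no regularization weight has to be tuned.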
When digital surface models (DSM) of the terrain and buildings are not available a priori, an online estimate is usually computed via a simultaneous localization and mapping (SLAM) technique. The use of online estimated map data necessitates sensor control robustness under uncertainty. That is, the LOS/NLOS sensor control techniques should be designed to maintain strong performance up to a pre-specified level of error in the estimated map data. Similarly, the obstacle avoidance techniques should guarantee collision avoidance up to the same pre-specified level of map error.
IV-B Control Space Fidelity for Quadcopters
A major contributing factor to the current interest in urban surveillance with mobile sensor networks is the abundance of commercially available UAVs. Quadcopters, for example, provide vertical takeoff and landing functionality in addition to high agility maneuvers. These platforms and their flexible APIs for flight control tasking [79, 80, 81] make real-time experimentation of mobile sensor network applications very attractive. The sensor control methods that exist in the current state-of-the-art, however, make overly simplifying assumptions in the platform kinematics to further reduce computational complexity (e.g., first or second order integrator dynamics). As a result, the platforms are forced to maneuver at slower velocities so that the actions generated by the sensor control algorithms are representative of the POMDP state dynamics. A more critical flaw in this approach is that, under the dynamics of the urban environment, platforms may attempt to delay necessary actions for maintaining collision-free flight until it is too late.
Quadcopter platform dynamics have been studied extensively by the controls and aeronautics communities. The quadcopter is a six degree-of-freedom system consisting of position and orientation in 3D Euclidean space. However, it provides only four actuation points consisting of the total upward thrust force and the roll, pitch, and yaw moments. This makes the quadcopter underactuated, implying that its position and orientation cannot be accelerated in any arbitrary direction. Instead, translational and rotational acceleration are achieved by applying time-varying attitude control. A naive incorporation of these plant state and action dynamics under a fixed discretization in a POMDP-based sensor control algorithm is not computationally feasible.
Early attempts at quadcopter control applied small-angle approximation techniques to linearize the flight dynamics around the hover state. An important finding was made by Mellinger and Kumar in [84, 85], where the quadcopter was determined to be differentially flat in terms of its 3D Euclidean position and yaw angle. Differential flatness of a system implies that the original states and inputs can be rewritten as algebraic functions of (potentially fewer) state variables and their derivatives. These algebraic functions define a diffeomorphism that ensures any trajectories of sufficient smoothness in the flat outputs will be sufficiently smooth in the original state and control space.
For the quadcopter, the highest degree derivative of the flat position outputs in their expressions for the original control inputs is four (i.e., trajectory snap). Similarly, the highest degree derivative of the flat yaw output in the expressions for the original control inputs is two (i.e., yaw acceleration). Using this insight, Mellinger and Kumar [84, 85] provided a series of waypoint-based quadcopter trajectory generation techniques that minimize the control effort under trajectory snap and yaw acceleration (i.e., minimum snap trajectories). These waypoint generation methods assume a concatenation of piecewise polynomial functions that pass through pre-defined waypoints. Solving for the trajectory polynomial coefficients is done by solving a computationally tractable quadratic program (QP). Regulating the original state dynamics of the quadcopter according to this trajectory can then be achieved through the use of a backstepping controller [82, 86].
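The multi-waypoint QP is beyond the scope of a short example, but the single-segment special case has a closed form that conveys the idea. For one axis moving between two waypoints with zero velocity, acceleration, and jerk at both ends, the trajectory minimizing the integral of squared snap is the unique seventh-degree polynomial meeting those eight boundary conditions. The sketch below evaluates it; the rest-to-rest boundary conditions and the function names are assumptions made here, and this is not the waypoint method of [84, 85].

```python
def min_snap_position(t, x0, x1, duration):
    """Rest-to-rest minimum-snap position along one axis at time t in [0, duration]."""
    s = t / duration
    blend = 35 * s**4 - 84 * s**5 + 70 * s**6 - 20 * s**7
    return x0 + (x1 - x0) * blend


def min_snap_velocity(t, x0, x1, duration):
    """Time derivative of min_snap_position; zero at both endpoints."""
    s = t / duration
    # d(blend)/ds = 140 s^3 (1 - s)^3, so acceleration and jerk also vanish at the ends
    return (x1 - x0) * 140 * s**3 * (1 - s) ** 3 / duration
```

The triple roots of the velocity profile at both endpoints are what guarantee continuity through the third derivative when segments of this kind are joined at hover waypoints.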
The key takeaway from the above discussion is that the accuracy of the quadcopter control space in a mobile sensor control algorithm can be maintained provided that the actions commanded to the platform generate trajectories that are smooth up to the fourth derivative of position and the second derivative of yaw. For sensor control with a quadcopter platform, a natural solution is to assume that the plant consists of a fourth order differential equation on the flat outputs, with an input consisting of the trajectory snap at each time step. This trajectory snap input is termed a motion primitive [87, 76]. Under polynomial trajectories, these motion primitives induce a resolution-complete discretization in the flat outputs. Sikang et al. derive this discretization and suggest optimal search techniques for trajectory generation between waypoints using A*. For dynamic environments, Sikang et al. proposed a receding horizon control technique based on Lifelong Planning A* (LPA*). These techniques were shown to provide collision avoidance guarantees between static and dynamic obstacles, and generate robust paths with respect to random platform disturbances.
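A motion primitive of this kind can be sketched for a single flat-output axis. Holding the snap constant over a short interval integrates the fourth-order plant exactly, so each primitive maps a (position, velocity, acceleration, jerk) state to its successor through a polynomial in the interval length. The state layout, the function names, and the finite snap set are assumptions introduced for illustration.

```python
def apply_snap(state, snap, tau):
    """Propagate (position, velocity, acceleration, jerk) under constant snap for tau seconds."""
    p, v, a, j = state
    return (
        p + v * tau + a * tau**2 / 2 + j * tau**3 / 6 + snap * tau**4 / 24,
        v + a * tau + j * tau**2 / 2 + snap * tau**3 / 6,
        a + j * tau + snap * tau**2 / 2,
        j + snap * tau,
    )


def expand(state, snaps, tau):
    """Successor states reachable from `state`, one per candidate snap input."""
    return [apply_snap(state, s, tau) for s in snaps]
```

A graph search such as A* would call `expand` at each node, discarding successors that collide with the map or violate velocity, acceleration, or jerk limits.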
IV-C Tracking and Control Algorithm Decentralization
Decentralized operation is a critical requirement of an urban surveillance system. As discussed in Section IV-A, the terrain and building geometries present a strongly RF shadowed propagation environment. This poses a significant challenge for inter-agent communication, and thus renders centralized tracking and sensor control techniques impractical. In general, decentralization of multi-target tracking and sensor control algorithms is very challenging. The BP tracking approaches discussed by Meyer et al. provide an intuitive framework for performing average consensus on the relevant belief state parameters with one-hop neighbors. For particle filtering approaches to the multi-target tracking problem, the BP approaches are decentralized using a consensus-over-weights approach. Consensus-over-weights assumes that the particle systems sampled at each agent are identical, which necessitates the use of synchronized random number generators. Likelihood consensus is a slightly less restrictive approach that overcomes the need for synchronized random number generators by projecting the sensor likelihood functions onto a common set of basis functions. Other alternatives to the consensus-over-weights scheme include fusion via Gaussian mixture approximations and kernel-based methods.
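The average consensus step underlying these schemes can be sketched as follows. Each agent repeatedly replaces its local value (for example, one particle's log-likelihood) with a weighted combination of its own value and those of its one-hop neighbors. The Metropolis weight rule, the dictionary-based graph representation, and the function names are assumptions made for this sketch; the cited approaches may use different weights.

```python
def metropolis_weights(neighbors):
    """Build symmetric, doubly stochastic mixing weights from an undirected graph.

    `neighbors` maps each agent id to the set of its one-hop neighbor ids.
    """
    degree = {i: len(nbrs) for i, nbrs in neighbors.items()}
    weights = {}
    for i, nbrs in neighbors.items():
        row = {j: 1.0 / (1 + max(degree[i], degree[j])) for j in nbrs}
        row[i] = 1.0 - sum(row.values())  # self-weight so each row sums to one
        weights[i] = row
    return weights


def consensus_step(values, weights):
    """One synchronous round: every agent mixes values from itself and its neighbors."""
    return {i: sum(w * values[j] for j, w in row.items()) for i, row in weights.items()}
```

On a connected graph the local values converge to the network-wide average, and multiplying by the number of agents recovers the sum of log-likelihoods needed for a joint Bayes update.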
Although these techniques work well when decentralizing target tracking algorithms, it is difficult to apply them to mobile sensor control policies for the urban environment. Since the techniques suggested in Section IV-A necessitate a stochastic rollout approach, it is not immediately clear how a consensus algorithm should be constructed. One approach to circumvent this challenge is to implement the centralized mobile sensor control policy in a high-fidelity simulation and perform imitation learning to generate a decentralized policy. Imitation learning is a variation of reinforcement learning, where the goal is to make observations on a set of oracle control decisions and determine a non-parametric representation of the policy. This type of learning has been applied regularly in robotics to learn specific robotic manipulator movements via kinesthetic examples [94, 95, 96]. In these efforts, a convolutional neural network (CNN) is commonly used as the approximation architecture for the state-action value function.
A recent study by Gama et al. has shown how the convolution and pooling operations used in CNNs can be generalized to support learning with signals supported over graphs. The resulting learning architecture, termed a graph neural network (GNN), may be capable of supporting an imitation learning procedure where the one-hop features that may be relevant to consensus are analogous to those signals supported over a communication network graph. A more thorough investigation of imitation learning of decentralized policies from centralized ones using GNNs is an ongoing and open area of research.
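One common way to generalize convolution to graphs, assumed here for illustration and not necessarily the exact construction of the cited study, is a polynomial in a graph shift operator $S$ such as the adjacency matrix: $y = \sum_{k} h_k S^{k} x$. Each multiplication by $S$ requires only one exchange with one-hop neighbors, which is why such filters map naturally onto a communication network. The sparse row representation and the function name below are hypothetical.

```python
def graph_filter(shift, signal, taps):
    """Apply y = sum_k taps[k] * S^k x using repeated one-hop exchanges.

    `shift[i]` maps neighbor index j to the entry S[i][j]; `signal[i]` is the
    scalar held by node i; `taps` are the filter coefficients h_0, h_1, ...
    """
    out = [taps[0] * x for x in signal]
    z = list(signal)
    for h in taps[1:]:
        # One communication round: each node aggregates its neighbors' current values.
        z = [sum(s_ij * z[j] for j, s_ij in row.items()) for row in shift]
        out = [o + h * zi for o, zi in zip(out, z)]
    return out
```

In a learned architecture the taps would be trainable parameters shared by all nodes, followed by a pointwise nonlinearity, so the same policy can be executed by every agent using only local communication.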
In this paper, we presented an overview of the mobile sensor control problem for multi-target tracking with a specific emphasis on urban surveillance problems. In addition to providing a brief background on the sensor control POMDP formulation, we provided a literature review of the current state of the art and suggested three challenge areas that have yet to receive considerable attention from the community. These three areas were terrain-aware tracking and sensor control, control space fidelity for quadcopters, and joint estimation and control algorithm decentralization. A number of these challenges are addressed separately in the information fusion, tracking, and control communities. We suggest a coordinated effort amongst these communities in order to arrive at solutions that are capable of addressing these challenges together in a computationally tractable and bandwidth efficient manner.
-  Z. Chen, W. Liao, B. Xu, H. Liu, Q. Li, H. Li, C. Xiao, H. Zhang, Y. Li, W. Bao, and D. Yang, “Object tracking over a multiple-camera network,” in IEEE International Conference on Multimedia Big Data, 2015.
-  L. Hou, W. Wan, K. Han, R. Muhammad, and M. Yang, “Human detection and tracking over camera networks: A review,” in International Conference on Audio, Language and Image Processing (ICALIP), 2016.
-  L. Anuj and M. T. G. Krishna, “Multiple camera based multiple object tracking under occlusion: A survey,” in International Conference on Innovative Mechanisms for Industry Applications (ICIMIA), 2017.
-  A. Y. Yang, S. Maji, C. M. Christoudias, T. Darrell, J. Malik, and S. S. Sastry, “Multiple-view object recognition in band-limited distributed camera networks,” in Third ACM/IEEE International Conference on Distributed Smart Cameras, 2009.
-  D. Poullin, “Countering illegal UAV flights: Passive DVB radar potentiality,” in 19th International Radar Symposium (IRS), 2018.
-  S. Jovanoska, M. Brötje, and W. Koch, “Multisensor data fusion for UAV detection and tracking,” in 19th International Radar Symposium (IRS), 2018.
-  S. R. Ganti and Y. Kim, “Implementation of detection and tracking mechanism for small UAS,” in International Conference on Unmanned Aircraft Systems (ICUAS), June 2016.
-  W. D. Watson, “3D active and passive geolocation and tracking of unmanned aerial systems,” in IEEE International Symposium on Technologies for Homeland Security (HST), 2017.
-  W. D. Watson and T. McElwain, “4D CAF for localization of co-located, moving, and RF coincident emitters,” in IEEE Military Communications Conference (MILCOM), 2016.
-  P. Scerri, R. Glinton, S. Owens, D. Scerri, and K. Sycara, “Geolocation of RF emitters by many UAVs,” in AIAA Infotech@Aerospace Conference and Exhibit, 2007.
-  A. O. Hero and D. Cochran, “Sensor management: Past, present, and future,” IEEE Sensors Journal, vol. 11, no. 12, pp. 3064–3074, December 2011.
-  G. W. Ng and K. H. Ng, “Sensor management – What, why and how,” Information Fusion, vol. 1, no. 2, pp. 67–75, 2000.
-  V. Krishnamurthy and R. J. Evans, “Hidden Markov model multiarm bandits: A methodology for beam scheduling in multitarget tracking,” IEEE Transactions on Signal Processing, vol. 49, no. 12, pp. 2893–2908, December 2001.
-  D. J. Kershaw and R. J. Evans, “Optimal waveform selection for tracking systems,” IEEE Transactions on Information Theory, vol. 40, no. 5, pp. 1536–1550, September 1994.
-  S. P. Sira, Y. Li, A. Papandreou-Suppappola, D. Morrell, D. Cochran, and M. Rangaswamy, “Waveform-agile sensing for tracking,” IEEE Signal Processing Magazine, vol. 26, no. 1, pp. 53–64, January 2009.
-  K. Ramya, K. P. Kumar, and V. S. Rao, “A survey on target tracking techniques in wireless sensor networks,” International Journal of Computer Science and Engineering Survey, vol. 3, no. 4, 2012.
-  N. Cao, S. Choi, E. Masazade, and P. K. Varshney, “Sensor selection for target tracking in wireless sensor networks with uncertainty,” IEEE Transactions on Signal Processing, vol. 64, no. 20, pp. 5191–5204, July 2016.
-  D. Guo and X. Wang, “Dynamic sensor collaboration via sequential Monte Carlo,” IEEE Journal on Selected Areas in Communications, vol. 22, pp. 1037–1047, August 2004.
-  L. Zuo, R. Niu, and P. K. Varshney, “A sensor selection approach for target tracking in sensor networks with quantized measurements,” in IEEE International Conference on Acoustics, Speech and Signal Processing, 2008.
-  E. Masazade and P. K. Varshney, “A market based dynamic bit allocation scheme for target tracking in wireless sensor networks,” in IEEE International Conference on Acoustics, Speech and Signal Processing, 2013.
-  R. Niu, A. Vempaty, and P. K. Varshney, “Received-signal-strength-based localization in wireless sensor networks,” Proceedings of the IEEE, vol. 106, no. 7, pp. 1166–1182, June 2018.
-  O. Hlinka, F. Hlawatsch, and P. M. Djuric, “Distributed particle filtering in agent networks: A survey, classification, and comparison,” IEEE Signal Processing Magazine, vol. 30, no. 1, pp. 61–81, January 2013.
-  ——, “Consensus-based distributed particle filtering with distributed proposal adaptation,” IEEE Transactions on Signal Processing, vol. 62, no. 12, pp. 3029–3041, June 2014.
-  F. Meyer, T. Kropfreiter, J. L. Williams, R. Lau, F. Hlawatsch, P. Braca, and M. Z. Win, “Message passing algorithms for scalable multitarget tracking,” Proceedings of the IEEE, vol. 106, no. 2, pp. 221–259, February 2018.
-  E. J. Msechu, S. I. Roumeliotis, A. Ribeiro, and G. B. Giannakis, “Decentralized quantized Kalman filtering with scalable communication cost,” IEEE Transactions on Signal Processing, vol. 56, no. 8, pp. 3727–3741, August 2008.
-  A. Ribeiro, G. B. Giannakis, and S. I. Roumeliotis, “SOI-KF: Distributed Kalman filtering with low-cost communications using the sign of innovations,” IEEE Transactions on Signal Processing, vol. 54, no. 12, pp. 4782–4795, December 2006.
-  L. Zuo, “Conditional posterior Cramér-Rao lower bound and distributed target tracking in sensor networks,” Ph.D. dissertation, Syracuse University, 2010.
-  A. Efrat, S. Har-Peled, and J. S. B. Mitchell, “Approximation algorithms for two optimal location problems in sensor networks,” in 2nd International Conference on Broadband Networks, 2005.
-  Z. Bojkovic and B. Bakmaz, “A survey on wireless sensor networks deployment,” WSEAS Transactions on Communications, vol. 7, no. 12, pp. 1172–1181, 2008.
-  R. V. Kulkarni and G. K. Venayagamoorthy, “Particle swarm optimization in wireless-sensor networks: A brief survey,” IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 41, no. 2, pp. 262–267, March 2011.
-  F. Domingo-Perez, J. L. Lazaro-Galilea, I. Bravo, E. Martin-Gorostiza, D. Salido-Monzu, A. Llana, and F. Govaers, “Sensor deployment for motion trajectory tracking with a genetic algorithm,” in IEEE International Conference on Industrial Technology (ICIT), 2015.
-  J. Hu, J. Song, M. Zhang, and X. Kang, “Topology optimization for urban traffic sensor network,” Tsinghua Science and Technology, vol. 13, no. 2, pp. 229–236, April 2008.
-  A. Singh, A. Krause, C. Guestrin, and W. J. Kaiser, “Efficient informative sensing using multiple robots,” Journal of Artificial Intelligence Research, vol. 34, pp. 707–755, 2009.
-  S. Ross, “Chapter 4.2: Chapman-Kolmogorov equations,” in Introduction to Probability Models, 11th ed. Academic Press, 2014.
-  Z. Chen, “Bayesian filtering: From Kalman filters to particle filters, and beyond,” Statistics, vol. 182, no. 1, pp. 1–69, 2003.
-  D. P. Bertsekas, Dynamic Programming and Optimal Control: Approximate Dynamic Programming. Athena Scientific, 2012.
-  ——, Dynamic Programming and Optimal Control. Athena Scientific, 2017.
-  Y. Bar-Shalom and T. E. Fortmann, Tracking and Data Association. Academic Press, 1988.
-  R. Mahler, Advances in Statistical Multisource-Multitarget Information Fusion. Artech House, 2014.
-  T. Cover and J. Thomas, Elements of Information Theory, 2nd ed. Wiley-Interscience, 2006.
-  T. T. Georgiou and A. Lindquist, “The separation principle in stochastic control, redux,” IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2481–2494, October 2013.
-  K. J. Åström, Introduction to Stochastic Control Theory. Courier Corporation, 2012.
-  B. Ristic and B.-N. Vo, “Sensor control for multi-object state-space estimation using random finite sets,” Automatica, vol. 46, no. 11, pp. 1812–1818, 2010.
-  B. Ristic, B.-N. Vo, and D. Clark, “A note on the reward function for PHD filters with sensor control,” IEEE Transactions on Aerospace and Electronic Systems, vol. 47, no. 2, pp. 1521–1529, April 2011.
-  B.-N. Vo, S. Singh, and A. Doucet, “Sequential Monte Carlo methods for multitarget filtering with random finite sets,” IEEE Transactions on Aerospace and Electronic Systems, vol. 41, no. 4, pp. 1224–1245, 2005.
-  M. Beard, B. Vo, B. Vo, and S. Arulampalam, “Void probabilities and Cauchy–Schwarz divergence for generalized labeled multi-Bernoulli models,” IEEE Transactions on Signal Processing, vol. 65, no. 19, pp. 5047–5061, October 2017.
-  H. G. Hoang, B. Vo, B. Vo, and R. Mahler, “The Cauchy–Schwarz divergence for Poisson point processes,” IEEE Transactions on Information Theory, vol. 61, no. 8, pp. 4475–4485, August 2015.
-  F. Koohifar, I. Guvenc, and M. L. Sichitiu, “Autonomous tracking of intermittent RF source using a UAV swarm,” IEEE Access, vol. 6, pp. 15 884–15 897, 2018.
-  F. Koohifar, A. Kumbhar, and I. Guvenc, “Receding horizon multi-UAV cooperative tracking of moving RF source,” IEEE Communications Letters, vol. 21, no. 6, pp. 1433–1436, 2017.
-  G. M. Hoffmann and C. J. Tomlin, “Mobile sensor network control using mutual information methods and particle filters,” IEEE Transactions on Automatic Control, vol. 55, no. 1, pp. 32–47, January 2010.
-  G. Hoffmann, H. Huang, S. Waslander, and C. Tomlin, “Quadrotor helicopter flight dynamics and control: Theory and experiment,” in AIAA Guidance, Navigation and Control Conference and Exhibit, 2007.
-  P. Dames and V. Kumar, “Autonomous localization of an unknown number of targets without data association using teams of mobile sensors,” IEEE Transactions on Automation Science and Engineering, vol. 12, no. 3, pp. 850–864, July 2015.
-  P. Dames and V. Kumar, “Cooperative multi-target localization with noisy sensors,” in IEEE International Conference on Robotics and Automation (ICRA), 2013.
-  N. Atanasov, J. Le Ny, K. Daniilidis, and G. J. Pappas, “Information acquisition with sensing robots: Algorithms and error bounds,” in IEEE International Conference on Robotics and Automation (ICRA), 2014.
-  B. Schlotfeldt, D. Thakur, N. Atanasov, V. Kumar, and G. J. Pappas, “Anytime planning for decentralized multirobot active information gathering,” IEEE Robotics and Automation Letters, vol. 3, no. 2, pp. 1025–1032, 2018.
-  S. Ragi and E. K. P. Chong, “UAV path planning in a dynamic environment via partially observable Markov decision process,” IEEE Transactions on Aerospace and Electronic Systems, vol. 49, no. 4, pp. 2397–2412, October 2013.
-  S. A. Miller, Z. A. Harris, and E. K. P. Chong, “Coordinated guidance of autonomous UAVs via nominal belief-state optimization,” in American Control Conference, 2009.
-  B. Grocholsky, A. Makarenko, and H. Durrant-Whyte, “Information-theoretic coordinated control of multiple sensor platforms,” in IEEE International Conference on Robotics and Automation (ICRA), 2003.
-  T. H. Chung, J. W. Burdick, and R. M. Murray, “A decentralized motion control strategy for dynamic target tracking,” in IEEE International Conference on Robotics and Automation (ICRA), 2006.
-  F. Meyer, H. Wymeersch, M. Frohle, and F. Hlawatsch, “Distributed estimation with information-seeking control in agent networks,” IEEE Journal on Selected Areas in Communications, vol. 33, no. 11, pp. 2439–2456, 2015.
-  E. Tsalolikhin, I. Bilik, and N. Blaunstein, “A single-base-station localization approach using a statistical model of the NLOS propagation conditions in urban terrain,” IEEE Transactions on Vehicular Technology, vol. 60, no. 3, pp. 1124–1137, March 2011.
-  M. Ulmke and W. Koch, “Road-map assisted ground moving target tracking,” IEEE Transactions on Aerospace and Electronic Systems, vol. 42, no. 4, pp. 1264–1274, October 2006.
-  M. Ulmke, O. Erdinc, and P. Willett, “GMTI tracking via the Gaussian mixture cardinalized probability hypothesis density filter,” IEEE Transactions on Aerospace and Electronic Systems, vol. 46, no. 4, pp. 1821–1833, October 2010.
-  B. Vo, B. Vo, and A. Cantoni, “Analytic implementations of the cardinalized probability hypothesis density filter,” IEEE Transactions on Signal Processing, vol. 55, no. 7, pp. 3553–3567, July 2007.
-  H. Kurniawati, D. Hsu, and W. S. Lee, “SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces,” in Robotics: Science and Systems, 2008.
-  T. Smith and R. G. Simmons, “Point-based POMDP algorithms: Improved analysis and implementation,” in 21st Conference on Uncertainty in Artificial Intelligence (UAI), 2005.
-  M. Kochenderfer, Decision Making Under Uncertainty: Theory and Application. MIT Press, 2015.
-  M. Egorov, M. J. Kochenderfer, and J. J. Uudmae, “Target surveillance in adversarial environments using POMDPs,” in 13th AAAI Conference on Artificial Intelligence, 2016.
-  D. Hsu, W. S. Lee, and N. Rong, “A point-based POMDP planner for target tracking,” in IEEE International Conference on Robotics and Automation (ICRA), 2008.
-  R. He, A. Bachrach, and N. Roy, “Efficient planning under uncertainty for a target-tracking micro-aerial vehicle,” in IEEE International Conference on Robotics and Automation, 2010.
-  L. J. Tomczak, “GPU ray marching of distance fields,” Ph.D. dissertation, Technical University of Denmark, 2012.
-  A. D. Ames, X. Xu, J. W. Grizzle, and P. Tabuada, “Control barrier function based quadratic programs for safety critical systems,” IEEE Transactions on Automatic Control, vol. 62, no. 8, pp. 1–17, 2016.
-  R. A. Freeman and P. V. Kokotovic, “Inverse optimality in robust stabilization,” SIAM Journal on Control and Optimization, vol. 34, no. 4, pp. 1365–1391, 1996.
-  P. E. Hart, N. J. Nilsson, and B. Raphael, “A formal basis for the heuristic determination of minimum cost paths,” IEEE Transactions on Systems Science and Cybernetics, vol. 4, no. 2, pp. 100–107, 1968.
-  S. M. LaValle and J. J. Kuffner Jr., “Randomized kinodynamic planning,” The International Journal of Robotics Research (IJRR), vol. 20, no. 5, pp. 378–400, 2001.
-  S. Liu, “Motion planning for aerial vehicles,” Ph.D. dissertation, University of Pennsylvania, 2018.
-  J. Fink, A. Ribeiro, and V. Kumar, “Robust control for mobility and wireless communication in cyber–physical systems with application to robot teams,” Proceedings of the IEEE, vol. 100, no. 1, pp. 164–178, January 2012.
-  ——, “Robust control of mobility and communications in autonomous robot teams,” IEEE Access, vol. 1, pp. 290–309, 2013.
-  Dronecode. Micro Air Vehicle Communication Protocol (MAVLINK). [Online]. Available: https://mavlink.io/en/
-  Dà-Jiāng Innovations (DJI). DJI Onboard-SDK. [Online]. Available: https://developer.dji.com/onboard-sdk/
-  Open Source Robotics Foundation. Robot Operating System (ROS). [Online]. Available: http://www.ros.org/
-  N. Michael, D. Mellinger, Q. Lindsey, and V. Kumar, “The GRASP multiple micro-UAV testbed,” IEEE Robotics and Automation Magazine, vol. 17, no. 3, pp. 56–65, 2010.
-  G. Hoffmann, S. Waslander, and C. Tomlin, “Quadrotor helicopter trajectory tracking control,” in AIAA Guidance, Navigation and Control Conference and Exhibit, 2008.
-  D. Mellinger and V. Kumar, “Minimum snap trajectory generation and control for quadrotors,” in IEEE International Conference on Robotics and Automation, 2011.
-  D. Mellinger, N. Michael, and V. Kumar, “Trajectory generation and control for precise aggressive maneuvers with quadrotors,” The International Journal of Robotics Research, vol. 31, no. 5, pp. 664–674, 2012.
-  T. Lee, M. Leok, and N. H. McClamroch, “Geometric tracking control of a quadrotor UAV on SE(3),” in 49th IEEE Conference on Decision and Control (CDC), 2010.
-  S. Liu, N. Atanasov, K. Mohta, and V. Kumar, “Search-based motion planning for quadrotors using linear quadratic minimum time control,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017.
-  S. Liu, K. Mohta, N. Atanasov, and V. Kumar, “Towards search-based motion planning for micro aerial vehicles,” in International Conference on Robotics and Automation (ICRA), 2019.
-  S. Farahmand, S. I. Roumeliotis, and G. B. Giannakis, “Set-membership constrained particle filter: Distributed adaptation for sensor networks,” IEEE Transactions on Signal Processing, vol. 59, no. 9, pp. 4122–4138, 2011.
-  O. Hlinka, O. Sluciak, F. Hlawatsch, P. M. Djuric, and M. Rupp, “Likelihood consensus and its application to distributed particle filtering,” IEEE Transactions on Signal Processing, vol. 60, no. 8, pp. 4334–4349, 2012.
-  J. Li and A. Nehorai, “Distributed particle filtering via optimal fusion of Gaussian mixtures,” IEEE Transactions on Signal and Information Processing over Networks, vol. 4, no. 2, pp. 280–292, June 2018.
-  O. Tslil, O. Aharon, and A. Carmi, “Distributed estimation using particles intersection,” in 21st International Conference on Information Fusion (FUSION), 2018.
-  S. Schaal, A. Ijspeert, and A. Billard, “Computational approaches to motor learning by imitation,” Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, vol. 358, no. 1431, pp. 537–547, 2003.
-  J. Kober and J. Peters, “Imitation and reinforcement learning,” IEEE Robotics and Automation Magazine, vol. 17, no. 2, pp. 55–62, June 2010.
-  A. Pervez, Y. Mao, and D. Lee, “Learning deep movement primitives using convolutional neural networks,” in IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids), 2017.
-  C. Zhang, H. Zhang, and L. E. Parker, “Feature space decomposition for effective robot adaptation,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015.
-  F. Gama, A. G. Marques, G. Leus, and A. Ribeiro, “Convolutional neural network architectures for signals supported on graphs,” IEEE Transactions on Signal Processing, vol. 67, no. 4, pp. 1034–1049, February 2019.