Measurementwise Occlusion in Multiobject Tracking
Abstract
Handling object interaction is a fundamental challenge in practical multiobject tracking, even for simple interactive effects such as one object temporarily occluding another. We formalize the problem of occlusion in tracking with two different abstractions. In objectwise occlusion, objects that are occluded by other objects do not generate measurements. In measurementwise occlusion, a previously unstudied approach, all objects may generate measurements but some measurements may be occluded by others. While the relative validity of each abstraction depends on the situation and sensor, measurementwise occlusion fits into probabilistic multiobject tracking algorithms with much looser assumptions on object interaction. Its value is demonstrated by showing that it naturally derives a popular approximation for lidar tracking, and by an example of visual tracking in image space.
I Introduction
Recent applications of robotics, such as intelligent consumer vehicles, require an understanding of their surroundings on par with a human’s. This is currently achieved by maximizing information intake at all times, combining high-resolution sensors like multi-laser rotational lidars with powerful computers and substantial context such as 3D maps. Lower-resolution sensors and weaker computation could perhaps achieve the necessary level of understanding at a lower cost, but require a system that accurately and completely handles any uncertainties. The framework of multiobject tracking achieves this by modeling the environment as a set of objects whose presence, location, and characteristics follow potentially interdependent probability distributions. A carefully designed model can intrinsically perform complex tasks such as combining information from different points of view, correctly reasoning about yet-undetected objects, and quantifying uncertainty in its predictions.
Not every property of real multiobject systems can be easily formulated in this framework. For example, the majority of models treat the motion of each object as independently distributed, though many tracking applications feature objects that dynamically interact, for instance by following each other. Similarly, these models do not always enforce inter-object constraints, such as that two objects cannot occupy the same space, though there are some ways to implement such constraints [1]. Multiobject tracking models also typically assume that sensory information is the accumulation of individual information from each object within the sensor’s view. In practice, measurements may be a more complex result of several nearby objects. The clearest example of this is termed occlusion: sensors relying on line-of-sight will not receive information from objects that are behind other objects.
Occlusion is a simple concept but has no standard treatment in multiobject tracking. Offline visual tracking techniques often treat occlusion as an unavoidable source of failure and focus on correctly identifying objects upon reappearance [2, 3]. Alternatively, they utilize features that distinguish each object and rely on warning signs to detect occlusion in advance [4]. Occupancy grids are a class of multiobject tracking algorithms that forgo representation of distinct objects and instead model a region of space [1]. A grid of adequate resolution is usually more computationally expensive than a similar multiobject tracker, but grids have the advantage of easily incorporating occlusion and other interaction effects. Recent research has applied theory from object tracking to grids [5] and learned grid trackers with techniques from computer vision [6]. Finally, occlusion has been incorporated into the framework of set-theoretic multiobject tracking. Prior work has focused on one representation of occlusion and has run into limitations, typically resorting to hand-crafted approximations. Section III covers the framework of multiobject tracking, and Section IV discusses ways to incorporate occlusion into this framework, with the final sections providing two use cases. But first, we differentiate approaches to modeling occlusion with a simple example.
II Four-Square Example
This example uses a discrete space with up to two objects and measurements. As shown in Figure 1, one object is guaranteed to be present and has an equal chance of being in either the bottom left or bottom right square. The other object has an equal chance of being present or not present, and if present it has an equal chance of being in the top right or the top left square. A present object has a 50% chance of generating a measurement in the same square, a 25% chance of generating a measurement in the wrong square due to hypothetical sensor error, and a 25% chance of generating no measurement due to sensor failure. There are no false positives in this example, i.e. a row without an object will not have any measurements. Figure 1 displays this model, with measurements denoted as red boxes. Because objects and their measurements are confined to separate rows, if there is no occlusion then the prior, measurement, and posterior distributions can be handled separately for each object. Several possible measurement outcomes are shown in Figure 1, and the posterior estimate of the objects given each outcome is shown in the “No Occlusion” rows of Table I.
We next assume that the object in the bottom row may occlude the top one. This example displays a common motivation for tracking under occlusion: to determine the presence and rough location of objects behind currently tracked objects. We first follow the traditional representation of occlusion: if the top object is behind the bottom object, it cannot generate any measurement. This naturally leads to a different posterior estimate, not only for the top object’s existence but also for the expected positions of both objects. For instance, the probability of the bottom object being in the left square given outcome D is much lower, because an object in the bottom left square would occlude the object creating a measurement in the top left square.
In the second representation of occlusion, the placement of objects is irrelevant, but a measurement in the bottom row renders a top-row measurement in the same column invisible. We refer to the first representation as objectwise occlusion, and the second as measurementwise occlusion. Despite sharing a similar basic concept, the two can ultimately have distinct effects on the posterior estimate of either object. Figure 2 lists the outcomes of this example that are considered impossible by either representation. Table I includes results from both types of occlusion, which can lead to significantly different conclusions. Note that a posterior cannot be derived for measurement set E with measurementwise occlusion, because such a measurement set is considered impossible.
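The posterior computations for this example can be reproduced by enumerating the joint distribution under each occlusion model. A sketch, conditioning on generic measurement patterns rather than the lettered outcomes of Figure 1 (which are not reproduced here):

```python
from collections import defaultdict

# Enumerate the four-square example. The bottom object is in 'L' or 'R'
# (p = 0.5 each); the top object is in 'L' or 'R' (p = 0.25 each) or absent
# (p = 0.5). A present, unoccluded object yields a measurement in its own
# square (p = 0.5), the wrong square (p = 0.25), or none (p = 0.25).

def measurement_options(pos):
    other = 'R' if pos == 'L' else 'L'
    return [(pos, 0.5), (other, 0.25), (None, 0.25)]

def joint(model):
    """P(visible bottom meas., visible top meas., bottom pos., top pos.)."""
    dist = defaultdict(float)
    for b in ('L', 'R'):
        for t, p_t in (('L', 0.25), ('R', 0.25), (None, 0.5)):
            for mb, p_mb in measurement_options(b):
                if t is None or (model == 'objectwise' and t == b):
                    # absent, or an occluded object: no measurement generated
                    top_opts = [(None, 1.0)]
                else:
                    top_opts = measurement_options(t)
                for mt, p_mt in top_opts:
                    if model == 'measurementwise' and mb is not None and mt == mb:
                        mt = None  # top measurement hidden behind the bottom one
                    dist[(mb, mt, b, t)] += 0.5 * p_t * p_mb * p_mt
    return dist

def posterior_top_exists(model, mb, mt):
    """P(top object exists | visible measurements), or None if impossible."""
    d = joint(model)
    total = sum(p for (mb_, mt_, b, t), p in d.items() if (mb_, mt_) == (mb, mt))
    if total == 0.0:
        return None
    exists = sum(p for (mb_, mt_, b, t), p in d.items()
                 if (mb_, mt_) == (mb, mt) and t is not None)
    return exists / total
```

For instance, conditioning on a bottom-left measurement and no visible top measurement gives a posterior existence probability of 0.2 without occlusion but roughly 0.38 under either occlusion model, while a bottom-left plus top-left measurement pattern is impossible under measurementwise occlusion.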
TABLE I: Posterior estimates given the measurements from Figure 1.

P(top object exists)
                             A    B    C    D    E
  No Occlusion               1    1    1
  Objectwise Occlusion       1    1    1
  Measurementwise Occlusion  1    1

P(top object on left, if it exists)
  No Occlusion
  Objectwise Occlusion
  Measurementwise Occlusion

P(bottom object on left)
  No Occlusion
  Objectwise Occlusion
  Measurementwise Occlusion
Which representation is more valid? For a highly accurate sensor, the outcomes for which objectwise and measurementwise occlusion differ would rarely occur and the difference becomes trivial. Sensors for which the objectwise representation is better suited include:

Sensors that generate a small number of point measurements per object, such as post-processed radar. Even if clustering is used to match one measurement group per object, the definition of a measurementwise occlusion would be complex and case-specific. Radar, however, requires a complex formulation of occlusion in the first place due to its reflective tendency [7, 8].

Computer vision algorithms that can infer the overall position of an object based on individual parts, especially when the occluding objects are nonconvex shapes such as humans. Deformable part-based models are an example. Figure 3 shows an example image where a moderately overlapping person was detected distinctly. Once again, the nature of occlusion for this type of sensor is quite complex.
Sensors for which the measurementwise representation may be more valid include:

Sensors that give unprocessed, high-resolution information, such as scanning lasers (lidar). These sensors give a fixed number of measurements at known angles, so any hypothetical measurement can only be occluded by a measurement at the same angle. The value of measurementwise occlusion is especially clear for sensors whose sight is not parallel to the plane in which the objects move, such as rotational lidars placed on drones or the tops of vehicles [9]. The probability of occlusion for each laser will depend on the height of each object, as well as any elevation or sensor tilt, whereas measurementwise occlusion can be reasoned about with only a measured range value. We show in Section V that some objectwise approximations for lidar tracking can be handled directly with measurementwise occlusion.

Computer vision algorithms that utilize non-maximum suppression (NMS). Many computer vision techniques give multiple small or overlapping detection responses for a single object. NMS removes or merges overlapping detections to address this problem, at the cost of potentially removing detections of different, nearby objects. In other words, it is an intentional implementation of measurementwise occlusion. Occlusion-sensitive versions of NMS have been studied [10], but to our knowledge have not been heavily adopted. The right side of Figure 3 shows detections from a deep-learning vision algorithm that has utilized NMS.
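For a scanning sensor with fixed beam angles, the measurementwise occlusion check described above reduces to a per-beam range comparison. A minimal sketch, with a hypothetical helper and simplified beam indexing:

```python
import math

def occluded_mask(scan_ranges, hypothetical, angle_res=math.radians(0.5),
                  max_range=100.0):
    """For each hypothetical (angle, range) return, report whether the actual
    scan would hide it: a return is occluded exactly when the measured range
    on its beam is shorter. scan_ranges[i] holds the range of beam i, with
    max_range standing in for 'no return'."""
    mask = []
    for angle, rng in hypothetical:
        beam = int(round(angle / angle_res)) % len(scan_ranges)
        mask.append(scan_ranges[beam] < rng <= max_range)
    return mask
```

Only the measured range on the matching beam enters the check; no object heights, elevations, or sensor tilts are needed.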
Ultimately, either approach is a simplification of the complex or possibly unknown true behavior of a sensor. The next sections show how these occlusion methods can be implemented for multiobject tracking.
III Tracking Framework
This section briefly describes multiobject tracking, omitting steps that are unaffected by occlusion, such as prediction and object creation and removal. X = {x_1, …, x_n} is a set of objects that is distributed according to a set probability density function f(X). Similarly, Z = {z_1, …, z_m} is a set of measurements generated from X by the likelihood function g(Z | X). The goal of multiobject tracking is to determine, or approximate, the posterior distribution f(X | Z). This parallels the goal of single-object tracking to determine:
p(x | z) = g(z | x) p(x) / ∫ g(z | x′) p(x′) dx′   (1)
and in fact multiobject models are designed to utilize similar pairwise objectmeasurement relationships. We adopt the disjoint union notation of [12], in which the probability of a finite, unordered set can be written as a sum of permutations across a fixedsize, ordered list of disjoint subsets.
f(X) = ∑_{X_1 ⊎ ⋯ ⊎ X_n = X} f_1(X_1) ⋯ f_n(X_n)   (2)
The notation X_1 ⊎ ⋯ ⊎ X_n = X means that X_1 ∪ ⋯ ∪ X_n = X and X_i ∩ X_j = ∅ for i ≠ j. This notation has not been widely adopted but offers several conveniences. For instance, probabilities over the superposition of two sets can be cleanly written.
f(Z) = ∑_{X ⊎ Y = Z} f_1(X) f_2(Y)   (3)
III-A Object Models
The distribution f(X) is chosen based on descriptive power, as well as conjugacy with the measurement likelihood. For instance, the multibernoulli distribution [13] (and the equivalent classical filter JIPDA) describes a set of potential objects with independent probabilities of existence r_i and independent state distributions p_i(x).
f(X) = ∑_{X_1 ⊎ ⋯ ⊎ X_n = X} ∏_{i=1}^n f_i(X_i),   f_i(∅) = 1 − r_i,   f_i({x}) = r_i p_i(x),   f_i(X_i) = 0 otherwise   (4)
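As a sanity check on this form, the multibernoulli set density can be verified to normalize by brute-force enumeration of the per-component outcomes on a small discrete state space; the parameter values here are arbitrary:

```python
from itertools import product

# Two Bernoulli components over a discrete state space {0, 1, 2};
# existence probabilities r_i and state distributions p_i(x) are arbitrary.
r = [0.7, 0.3]
p = [[0.5, 0.3, 0.2],
     [0.1, 0.1, 0.8]]

# Each component contributes either nothing (weight 1 - r_i) or exactly one
# state x (weight r_i * p_i(x)); the set density is the disjoint-union
# combination of the component densities, so summing over all per-component
# outcomes must give 1.
total = 0.0
for outcome in product([None, 0, 1, 2], repeat=len(r)):
    w = 1.0
    for i, x in enumerate(outcome):
        w *= (1 - r[i]) if x is None else r[i] * p[i][x]
    total += w

print(round(total, 10))  # prints 1.0
```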
We use the multibernoulli distribution as an example for the rest of the paper, on the grounds that other distributions have similar forms and reach similar posterior distributions (in the respects that are relevant to occlusion). For instance, the multibernoulli mixture filter uses a mixture of multibernoulli distributions, the labeled MB and GLMB filter have similar forms, and all of the above can be combined with an independent poisson point process to smoothly handle object appearances [14, 12].
III-B Measurement Model
Many sensors return a single measurement corresponding to each successfully detected object. This is represented by a single-measurement likelihood g(z | x) and an object-dependent detection probability p_D(x). Additionally, sensors may return false positive measurements, which are typically assumed to be Poisson distributed with a generation rate λ and spatial distribution c(z). These assumptions are referred to as the standard measurement model and can be fully written as
g(Z | X) = ∑_{Z_0 ⊎ Z_1 ⊎ ⋯ ⊎ Z_n = Z} C(Z_0) ∏_{i=1}^n g(Z_i | x_i)   (5)
g(Z_i | x_i) = 1 − p_D(x_i) if Z_i = ∅;   p_D(x_i) g(z | x_i) if Z_i = {z};   0 if |Z_i| > 1   (6)
C(Z_0) = e^{−λ} ∏_{z ∈ Z_0} λ c(z)   (7)
Note that any number of objects may be assigned to the null measurement (undetected), and likewise any number of measurements may be false positives. The joint probability of the multibernoulli object model and the standard measurement model can be factored into a convenient form by rearranging the association variables.
(8)  
(9)  
(10)  
(11) 
Here A is a matrix-shaped association variable between bernoulli components and measurements.
The posterior distribution of X is a mixture of multibernoulli distributions. The number of components in the mixture is equal to the number of possible associations, so in practice approximations of this form are used. The marginal distribution of each object–measurement association is also evident from (8), and can be thought of as the marginal of a function over the associations A. Calculation or approximation of this marginal probability can be performed in several ways, for instance using graphical techniques [15].
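For small problems, these marginal association probabilities can be computed exactly by brute force, which is the quantity that graphical techniques such as [15] approximate. A sketch with generic per-association weights (in the standard model, a pairing weight would come from terms like r_i p_D g(z_j | x), a miss weight from (1 − r_i) + r_i(1 − p_D), and a clutter weight from λ c(z_j); the numbers below are hypothetical):

```python
def association_marginals(W, miss, clutter):
    """W[i][j]: weight of object i generating measurement j; miss[i]: weight
    of object i being undetected; clutter[j]: weight of measurement j being
    a false positive. Returns P(object i generated measurement j) by
    enumerating every valid association event."""
    n, m = len(W), len(clutter)
    marg = [[0.0] * m for _ in range(n)]
    total = 0.0

    def rec(j, used, weight, pairs):
        nonlocal total
        if j == m:
            w = weight
            for i in range(n):
                if i not in used:
                    w *= miss[i]           # unassigned objects are undetected
            total += w
            for (i_, j_) in pairs:
                marg[i_][j_] += w
            return
        rec(j + 1, used, weight * clutter[j], pairs)   # measurement j is clutter
        for i in range(n):
            if i not in used:              # each object matches at most once
                rec(j + 1, used | {i}, weight * W[i][j], pairs + [(i, j)])

    rec(0, frozenset(), 1.0, [])
    return [[marg[i][j] / total for j in range(m)] for i in range(n)]
```

The enumeration grows combinatorially, which is exactly why belief propagation or similar approximations are used in practice.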
Some sensors, such as scanning lidars or computer vision techniques that collect simple features, instead generate a fixed number of measurements, with an arbitrary number detecting any one object. These sensors could be described by applying the standard measurement model to each measurement separately, and assuming that at most one object is viewed by any given measurement. The separable likelihood model [16, 17] combines this framework with the assumption that objects are easily separable in the measurement space; it can thus consider the measurement–object matchings as predetermined. Other non-standard models parametrize the rate at which an object creates measurements [18]. These models are not covered further because, as mentioned before, measurementwise occlusion is difficult to formulate for such sensors. Intuitively, the standard and separable likelihood models enforce that the set of measurements is a collection of separate pieces of information about individual objects, with uncertainty only in the completeness and association of this information. Certain formulations of occlusion can threaten this assumption.
IV Occlusion
While discussed heavily in the design of practical multi-target trackers, the phenomenon of occlusion has not (to our knowledge) been formally defined for random sets. We start with a random set W which follows some distribution f(W). Occlusion divides the original set into two disjoint sets: the visible set V and the occluded set O. At its most general, a probabilistic occlusion model could be written
(12)  
(13) 
where the values indicate whether or not a particular element was occluded.¹ We next define restricted occlusion, in which only visible objects impact the occlusion of other objects:

¹We do not strictly define O as a random variable, just as a useful symbol.
(14) 
This assumption may not always be realistic: for instance, some sensors may miss objects that are partially occluded even as those objects occlude others, as illustrated in Figure 4. This is however a reasonable assumption in many cases, and is useful for straightforward inference. An even stricter form of occlusion is static occlusion:
(15) 
In this case, no object affects another object’s probability of occlusion. This is valid when the causes of occlusion are known rather than being tracked, and is approximately valid when they are tracked very accurately.
IV-A Objectwise Occlusion
Objectwise occlusion dictates that from the tracked object set , only a subset of objects are actually capable of generating measurements. Static occlusion in particular can be incorporated into the multibernoulli distribution.
(16)  
The joint probability given the standard measurement model is of the same form as (8), with the following modifications.
(17)  
(18) 
It is clear that incorporating static objectwise occlusion in a tracking model is equivalent to modifying the probability of detection to p_D(x) p_V(x), where p_V(x) is the probability that the object at x is visible. However, the general and even restricted occlusion models are difficult to formulate in such a way: the likelihood will no longer simply be a product of individual likelihoods for each permutation.
Thus trackers use the static occlusion model and alter each object’s detection probability, even when the probability of occlusion is highly dependent on nearby objects. The marginal probability of occlusion is the logical choice for a static occlusion term. [19] accurately solves for the probability of occlusion between two rectangular objects tracked by a line-of-sight sensor, by calculating both the mean and variance of each object’s angular span and assuming the spans are independently distributed. However, this method cannot handle an object that is partially occluded by multiple objects, jointly resulting in a full occlusion. In such situations, they estimate the joint probability of visibility for a given object as the product of the pairwise visibility probabilities. [20] handles approximately ellipsoidal objects in a similar way. [21] uses the mean position of each object to approximate a static occlusion model, but calculates the joint probability of occlusion by making a miniature grid across the visible perimeter of the rectangle. For a sensor that can handle partial occlusions well, the probability of visibility for the object is the maximum probability of visibility in this grid. [21] also uses an exponential weighting to calculate the probability of occlusion for each grid point, to mitigate the inaccuracy of the expected-value approximation. Other practical algorithms such as [22] perform deterministic checks for occlusion, assuming that the high accuracy of their sensory data keeps approximation error low.
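The pairwise-product style of approximation described above can be sketched as follows. The angular-span geometry and the uniform-overlap occlusion probability are our own simplifications, not the exact calculation of [19], and the occluders are assumed to lie closer to the sensor than the target:

```python
import math

def angular_span(x, y, half_width):
    """Angular interval subtended at a sensor at the origin by an object
    centered at (x, y), crudely modeled as +/- half_width of arc."""
    theta = math.atan2(y, x)
    return (theta - half_width, theta + half_width)

def overlap_fraction(target, occluder):
    """Fraction of the target's angular span covered by the occluder's span,
    used here as a crude stand-in for the pairwise occlusion probability."""
    lo = max(target[0], occluder[0])
    hi = min(target[1], occluder[1])
    return max(0.0, hi - lo) / (target[1] - target[0])

def visibility(target, occluders):
    """Expected-value style approximation: treat pairwise occlusions as
    independent and multiply the pairwise visibility probabilities."""
    v = 1.0
    for occ in occluders:
        v *= 1.0 - overlap_fraction(target, occ)
    return v
```

The independence assumption is visible in the last function: two occluders that each half-cover the target yield a visibility of 0.25 even if together they fully cover it, which is exactly the failure mode the grid-based approximation of [21] addresses.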
IV-B Measurementwise Occlusion
Intuitively, measurementwise occlusion should only affect the probability that an object did not generate one of the visible measurements. Specifically, each object term in the standard measurement model (6) could be modified to:
(1 − p_D(x)) + p_D(x) ∫ P_O(z | V) g(z | x) dz   (19)
adding in the probability that the object generates a measurement that was occluded by the visible measurements V, where P_O(z | V) denotes the probability that a measurement z is occluded given the visible set. This result is in fact obtained under restricted occlusion, regardless of the visibility model. The proof uses a convenient property of integration on disjoint sets, proven in [23] section 3.5.3.
∫ ∑_{V ⊎ O = W} f(V, O) δW = ∫∫ f(V, O) δV δO   (20)
The standard measurement model with restricted measurementwise occlusion can be written:
(21)  
We are only interested in the probability of the observed measurements V, and so integrate the occluded set O out of each term.
Functionally, the only change to the multibernoulli joint distribution is an addition to term (10).
(22) 
In addition to this change, the measurement model is multiplied by a constant exponential term corresponding to occluded false positives, and by the probability of the observed pattern of visibility. In restricted objectwise occlusion, the analogous term would complicate inference by adding inter-object dependencies. In measurementwise occlusion, the visible measurements are known, so this term is irrelevant to the calculation of the posterior.
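The modified miss term of (19) can be estimated generically by sampling the object's predicted measurement and checking it against the visible set. This is a sketch under our own assumptions (a one-dimensional range sensor in which any closer visible measurement on the same bearing occludes); the function names are hypothetical:

```python
import random

def effective_miss_prob(p_d, sample_meas, visible, occludes, n=20000, seed=0):
    """(1 - p_d) plus p_d times the estimated probability that a generated
    measurement would be hidden by one of the visible measurements.
    sample_meas(rng) draws from the object's predicted measurement
    distribution; occludes(v, z) says whether visible measurement v hides
    hypothetical measurement z."""
    rng = random.Random(seed)
    hidden = 0
    for _ in range(n):
        z = sample_meas(rng)
        if any(occludes(v, z) for v in visible):
            hidden += 1
    return (1 - p_d) + p_d * hidden / n
```

For an object whose predicted range lies mostly beyond a visible return, the effective miss probability approaches one, so an absent detection carries little negative evidence, which is the intended behavior.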
V Separable Likelihood Application
Section II argued that measurementwise occlusion is a realistic choice for scanning lineofsight sensors. Here the potential simplicity of its application is demonstrated. For these sensors, the standard measurement likelihood for each measurement can be written separately:
(23)  
This method has been utilized by [7, 17, 24] to track vehicles using horizontally scanning lidar. Each work designed a measurement likelihood that was resistant to occlusion between well-separated objects. As shown in Figure 5, measurements near the hypothesized vehicle were highly likely, measurements slightly farther away were highly unlikely, and measurements significantly closer to the sensor were given a moderate, uniform likelihood. Alternatively, consider a deterministic restricted measurementwise occlusion model in which any measurement occludes all measurements at a greater distance. If objects are sufficiently separated in distance that any given measurement is much more likely to have been generated by one object (or be a false positive) than by the others, then the multibernoulli separable-measurement joint distribution can be simplified greatly.
(24)  
where the terms were defined in (9) and (10). Measurementwise occlusion gives the properties desired by [7, 17, 24] without their constraints on the measurement likelihood. This permits, for instance, separable-likelihood tracking using Kalman or Rao-Blackwellized filters. Relaxing some of the assumptions, such as separable false positives or the deterministic nature of the occlusion, will still result in a tractable multibernoulli mixture posterior, though not necessarily a single multibernoulli.
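The occlusion-resistant likelihood shape described above (high near the hypothesized surface, very low just beyond it, moderate and uniform well in front of it) can be sketched as a piecewise function of measured range. The weights and the two-sigma breakpoints are illustrative assumptions of ours, not the exact models of [7, 17, 24]:

```python
import math

def range_likelihood(z, surface_range, sigma=0.2,
                     beyond_weight=0.01, closer_weight=0.1):
    """Piecewise likelihood of a measured range z given a hypothesized object
    surface at surface_range: Gaussian-shaped near the surface, nearly zero
    just beyond it (the beam should not pass through the object), and a
    moderate uniform value well in front of it (a possible occluder)."""
    if z > surface_range + 2 * sigma:
        return beyond_weight
    if z < surface_range - 2 * sigma:
        return closer_weight
    return math.exp(-0.5 * ((z - surface_range) / sigma) ** 2)
```

Under the deterministic measurementwise occlusion model, the moderate closer-range plateau falls out automatically as the probability that the object's measurement was occluded, rather than being a hand-designed part of the likelihood.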
VI Visual Tracking Application
TABLE II: Labeled and unlabeled tracking performance on video 4 of MOT17.

Occlusion  MOTA↑  MOTP↑  IDF1↑  Mostly Tracked↑  Mostly Lost↓  FP↓    FN↓  # Switches↓  GOSPA↓  Cardinality↓
None       .351   .408   .755   86               0             18721  12   8            14002   .39
MWO        .426   .418   .777   78               0             17339  0    13           13947   .36
OWO        .427   .4     .777   76               0             17317  0    9            13826   .36
To demonstrate the value of occlusion-aware tracking beyond simple line-of-sight sensors, we track pedestrians in the fourth video from the 2017 Multi-Object Tracking Benchmark using the supplied bounding boxes from the Faster-RCNN detector [11]. These detections have a very low false positive rate but can miss partially occluded people, possibly due to heavy non-maximum suppression. This video is a challenging test of occlusion reasoning: there are many cases of pedestrians occluded by single other pedestrians, by groups of other pedestrians, and by street lights and other stationary objects whose existence is not known to the tracker.
The bounding boxes of each person are tracked in image space, in which horizontal and vertical location and size are the features. Occlusion is likely if the overlap between boxes, for instance measured by the intersection area over total area, is high. This representation provides no natural ordering of occlusion for objects, unlike in a ground-plane setting where the relative distance to the sensor distinguishes occluding and occluded objects/measurements. We use two techniques to determine the order of occlusion. First, we assume that measurement boxes can only be occluded by measurement boxes whose bottom is lower than theirs; for right-side-up cameras detecting grounded objects, this emulates a distance-based ordering. Second, to promote stability in the order of occlusion, each object is given a fifth feature, occludability. An object with 95% occludability has a 95% chance of generating an occludable measurement, which may or may not actually be occluded by another measurement, and a 5% chance of generating a measurement which cannot be occluded no matter where it is. Given that only occludable measurements can be occluded, the posterior occludability inherently increases for undetected objects and is unchanged for detected objects. In the prediction step, occludability is slowly mixed toward its equilibrium value. This approach to occlusion is applied to a measurementwise tracker and to an objectwise tracker using an expected-value approximation. The same tracker is also run without occlusion reasoning.
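The occludability update described above can be sketched as a two-hypothesis Bayes update. Here p_occ stands in for the geometry-dependent probability that an occludable measurement is actually hidden, and all parameter values are hypothetical:

```python
def occludability_update(q, detected, p_d=0.9, p_occ=0.5):
    """q: prior probability that the object is 'occludable'. A visible
    detection was by definition not occluded, and following the behavior
    described in the text we leave q unchanged in that case. A missed
    detection is more likely under the occludable hypothesis, whose
    measurement may also have been hidden with probability p_occ."""
    if detected:
        return q
    miss_occludable = (1 - p_d) + p_d * p_occ
    miss_plain = 1 - p_d
    return q * miss_occludable / (q * miss_occludable + (1 - q) * miss_plain)

def occludability_predict(q, q_eq=0.95, alpha=0.05):
    """Prediction step: slowly mix occludability toward its equilibrium."""
    return (1 - alpha) * q + alpha * q_eq
```

As the text notes, a run of missed detections drives occludability upward, so the tracker attributes the absence of measurements to occlusion rather than to the object having left the scene.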
The object state (sans occludability) is normally distributed, with single-object tracking carried out by a standard Kalman filter. The poisson multibernoulli filter [14] was used as the multiobject framework, with merging by track so that object labels were kept consistent. The data association step was achieved with the loopy belief propagation technique from [15]. For implementation, a fixed array of 2048 normal components and an array of 72 object labels was used. The most likely 2048 components from each update step were kept. Likewise, the most likely 72 objects were kept while the others were ‘recycled’ as unlabeled, poisson-distributed components. Highly similar components in the same object were located via k-d trees and trivially merged by pooling their existence probability. New pedestrians entering the scene are assumed to be poisson-generated at the edges of the image.
Table II shows the accuracy and precision scores used by the MOT benchmark for labeled tracking evaluation, as well as the generalized optimal subpattern assignment metric (GOSPA) [25] and the ratio of difference in total cardinality as unlabeled performance indicators. Arrows by each metric name indicate the direction of better performance. Both labeled and unlabeled multiobject metrics require a base single-object metric: bounding box intersection-over-union was chosen, as in the MOT15–17 benchmarks, but with a looser cutoff such that any degree of intersection is considered a possible match. The bounding boxes in video 4 are smaller than most in MOT17, and occluded individuals moving in crowded areas would be extremely difficult to match with the standard requirement of 0.5 IoU. As the primary application of the MOT benchmark is consistent post-processed labeling, its standard scoring code removes a significant number of individuals that are heavily occluded or unmoving at each time. We include all of these individuals, as our goal is to track temporarily occluded objects.
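The unlabeled metric can be sketched as follows: GOSPA with α = 2 and p = 1 over bounding boxes, using 1 − IoU as the base distance so that the loose cutoff (any overlap permits a match) corresponds to c = 1. This brute-force version is only practical for small sets:

```python
from itertools import permutations

def iou(a, b):
    """Intersection-over-union of axis-aligned boxes (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def gospa(truth, est, c=1.0):
    """GOSPA with alpha = 2, p = 1: best-assignment sum of cutoff base
    distances plus c/2 per missed or false object. With alpha = 2, matching
    a pair at distance >= c costs the same as leaving both unmatched, so the
    cutoff can be folded into the assignment as min(c, d)."""
    if len(truth) < len(est):
        truth, est = est, truth
    best = float('inf')
    for perm in permutations(range(len(truth)), len(est)):
        d = sum(min(c, 1.0 - iou(truth[i], est[j])) for j, i in enumerate(perm))
        d += (c / 2.0) * (len(truth) - len(est))
        best = min(best, d)
    return best
```

A practical implementation would replace the permutation search with an optimal assignment solver, but the definition is unchanged.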
While no tracker has excellent results, the occlusion-equipped models outperform the baseline model on most metrics. The two snapshots of the video in Figure 6 show the raw FRCNN detections in magenta and the hypothesized objects in blue. The crowd in the upper left is not resolved (some individuals there are not detected throughout the video), but the two occlusion cases in the center are easily resolved based on the past positions of these individuals. The approximate objectwise tracker outperforms the measurementwise tracker, especially in identity switches. It is possible that violation of the restricted occlusion assumption by the undetected stationary obstacles significantly impacts the measurementwise tracker. Figure 7 shows a case where one person is occluded by another, who proceeds to be occluded by a light pole.
VII Conclusion
The traditional formulation of occlusion in multiobject tracking is that objects block other objects from the sensor’s view, and that occluded objects generate no measurement. This is intuitive but creates object dependencies that make tracking intractable, so a variety of approximations have been proposed. We instead formally define occlusion as an operation on a random set and show that this operation can be applied to measurements as well as objects. This new approach, termed measurementwise occlusion, is equally intuitive and fits tractably into the standard multiobject model with a loose restriction. It can be implemented with a simple additional step in any given multiobject tracking technique. We highlighted the practical value of this approach in two tracking applications where occlusion is a significant problem.
Acknowledgment
This work was supported by the Texas Department of Transportation under Project 06877 entitled “Communications and RadarSupported Transportation Operations and Planning (CARSTOP).”
References
 [1] M. Schreier, V. Willert, and J. Adamy, “Compact representation of dynamic driving environments for ADAS by parametric free space and dynamic object maps,” IEEE Trans. Intell. Transp. Syst., vol. 17, no. 2, pp. 367–384, Feb. 2016.
 [2] M. Betke and Z. Wu, “Data association for multi-object visual tracking,” Synthesis Lectures on Computer Vision, vol. 6, no. 2, pp. 1–120, Oct. 2016.
 [3] J. Scharcanski, A. B. de Oliveira, P. G. Cavalcanti, and Y. Yari, “A particle-filtering approach for vehicular tracking adaptive to occlusions,” IEEE Trans. Veh. Technol., vol. 60, no. 2, pp. 381–389, Feb. 2011.
 [4] A. Yilmaz, X. Li, and M. Shah, “Contour-based object tracking with occlusion handling in video acquired using mobile cameras,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 11, Nov. 2004.
 [5] D. Nuss, S. Reuter, M. Thom, T. Yuan, G. Krehl, M. Maile, A. Gern, and K. Dietmayer, “A random finite set approach for dynamic occupancy grid maps with real-time application,” arXiv preprint, May 2016.
 [6] F. Piewak, T. Rehfeld, M. Weber, and J. M. Zöllner, “Fully convolutional neural networks for dynamic object detection in grid maps,” in 2017 IEEE Intelligent Vehicles Symposium (IV), Jun. 2017, pp. 392–398.
 [7] A. Petrovskaya and S. Thrun, “Model based vehicle detection and tracking for autonomous urban driving,” Auton. Robots, vol. 26, no. 2–3, pp. 123–139, Apr. 2009.
 [8] A. Scheel and K. Dietmayer, “Tracking multiple vehicles using a variational radar model,” arXiv preprint arXiv:1711.03799, 2017.
 [9] T. Chen, R. Wang, B. Dai, D. Liu, and J. Song, “Likelihood-field-model-based dynamic vehicle detection and tracking for self-driving,” IEEE Trans. Intell. Transp. Syst., vol. 17, no. 11, Nov. 2016.
 [10] S. Tang, M. Andriluka, and B. Schiele, “Detection and tracking of occluded people,” Int. J. Comput. Vis., vol. 110, no. 1, pp. 58–69, Oct. 2014.
 [11] A. Milan, L. Leal-Taixé, I. D. Reid, S. Roth, and K. Schindler, “MOT16: A benchmark for multi-object tracking,” CoRR, vol. abs/1603.00831, 2016. [Online]. Available: http://arxiv.org/abs/1603.00831
 [12] Á. F. García-Fernández, J. L. Williams, K. Granström, and L. Svensson, “Poisson multi-Bernoulli mixture filter: Direct derivation and implementation,” IEEE Trans. Aerosp. Electron. Syst., 2018.
 [13] B. T. Vo and B. N. Vo, “Labeled random finite sets and multi-object conjugate priors,” IEEE Trans. Signal Process., vol. 61, no. 13, pp. 3460–3475, Jul. 2013.
 [14] J. L. Williams, “Hybrid Poisson and multi-Bernoulli filters,” in 2012 15th International Conference on Information Fusion (FUSION), 2012, pp. 1103–1110.
 [15] J. Williams and R. Lau, “Approximate evaluation of marginal association probabilities with belief propagation,” IEEE Trans. Aerosp. Electron. Syst., vol. 50, no. 4, pp. 2942–2959, Oct. 2014.
 [16] B. N. Vo, B. T. Vo, N. T. Pham, and D. Suter, “Joint detection and estimation of multiple objects from image observations,” IEEE Trans. Signal Process., vol. 58, no. 10, pp. 5129–5141, Oct. 2010.
 [17] A. Scheel, S. Reuter, and K. Dietmayer, “Using separable likelihoods for laser-based vehicle tracking with a labeled multi-Bernoulli filter,” in 2016 19th International Conference on Information Fusion (FUSION), Jul. 2016, pp. 1200–1207.
 [18] C. Adam, R. Schubert, and G. Wanielik, “Radar-based extended object tracking under clutter using generalized probabilistic data association,” in 16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013), Oct. 2013, pp. 1408–1415.
 [19] K. Wyffels and M. Campbell, “Negative information for occlusion reasoning in dynamic extended multiobject tracking,” IEEE Trans. Rob., vol. 31, no. 2, pp. 425–442, Apr. 2015.
 [20] L. Lamard, R. Chapuis, and J. P. Boyer, “Dealing with occlusions with multi targets tracking algorithms for the real road context,” in 2012 IEEE Intelligent Vehicles Symposium, Jun. 2012.
 [21] K. Granström, S. Reuter, D. Meissner, and A. Scheel, “A multiple model PHD approach to tracking of cars under an assumed rectangular shape,” in 17th International Conference on Information Fusion (FUSION), Jul. 2014, pp. 1–8.
 [22] F. Liu, J. Sparbert, and C. Stiller, “IMMPDA vehicle tracking system using asynchronous sensor fusion of radar and vision,” in 2008 IEEE Intelligent Vehicles Symposium, Jun. 2008, pp. 168–173.
 [23] R. Mahler, Advances in Statistical Multisource-Multitarget Information Fusion. Artech House, 2014.
 [24] A. Scheel, S. Reuter, and K. Dietmayer, “Vehicle tracking using extended object methods: An approach for fusing radar and laser,” in 2017 IEEE International Conference on Robotics and Automation (ICRA), 2017, pp. 231–238.
 [25] A. S. Rahmathullah, Á. F. García-Fernández, and L. Svensson, “Generalized optimal sub-pattern assignment metric,” in 2017 20th International Conference on Information Fusion (Fusion), Jul. 2017, pp. 1–8.
Appendix A: Highway Simulations
We also create a simple simulated highway to assess occlusion handling for tracking vehicles across multiple lanes.² The highway has four lanes, and in each lane vehicles move at a constant velocity along the center line, much like in the classic arcade game Frogger. A point sensor at the side of the highway views these vehicles. The vehicles’ widths are neglected, so their visibility depends entirely on their lane and relative angle from the sensor. For example, say there is a vehicle in the lane nearest the sensor with its back end directly in front of the sensor and its front end at an angle θ ahead of the sensor. Under objectwise occlusion, any vehicle in a farther lane whose front and back ends both lie between these two angles will be completely occluded. The sensor is assumed to recognize contiguous shapes, so measurementwise occlusion operates similarly. Missed detections, false positives, and Gaussian noise are applied to the sensor output in addition to occlusion. Figure 8 visualizes a single timestep of this highway, with two possible random measurement sets corresponding to the two occlusion types.

²This section is not in the published version of this paper.
A particle filter version of the track-oriented multibernoulli filter is used so that closed-form updates can be performed even for partially occluded measurements. Measurementwise occlusion probabilities can also be determined exactly, while objectwise occlusion is approximated in two different ways. The first takes the expected value of potentially occluding objects and calculates the probability that each individually occludes the target object, then combines the individual probabilities with the softmax function as in [21]. The second stores a grid representation of the sensor’s field of view and updates the visibility of each cell in the grid based on vehicle positions. Simulation parameters such as the magnitude of measurement noise are known to the tracker. The tracker is run for 10000 timesteps, representing over half an hour of traffic at 5 timesteps per second.
Table III shows the performance of each occlusion model in terms of average GOSPA per timestep. Euclidean distance in position and length is used as the base metric. The approximate objectwise occlusion trackers work equally well under either simulated form of occlusion, with the grid approximation outperforming the expected-value approximation. The measurementwise occlusion tracker scores slightly lower (better) than the grid approximation when the simulated occlusion type matches its assumptions, and slightly higher when objectwise occlusion is simulated. It is worth noting that this simulation is simple enough that an accurate grid approximation can be applied in real time, while more complex applications may not be able to apply it as quickly. Expected-value approximations are fast, but perform worse than the measurementwise tracker in both simulations. Code for the simulated tests and for the pedestrian tracking tests is available at https://github.com/utexasghoshgroup/carstop/tree/master/MWO.
TABLE III: Average GOSPA per timestep in the highway simulation.

Simulated Occlusion  Tracker     GOSPA↓
OWO                  OWO-expval  4.94
OWO                  OWO-grid    4.79
OWO                  MWO         4.90
MWO                  OWO-expval  4.94
MWO                  OWO-grid    4.80
MWO                  MWO         4.67