A metric on the space of finite sets of trajectories for evaluation of multi-target tracking algorithms

# A metric on the space of finite sets of trajectories for evaluation of multi-target tracking algorithms

Abu Sajana Rahmathullah, Ángel F. García-Fernández, Lennart Svensson A.S. Rahmathullah is with Volvo Cars Corporation, Gothenburg, Sweden (email: abusajana@gmail.com). Á. F. García-Fernández is with the Department of Electrical Engineering and Automation, Aalto University, 02150 Espoo, Finland (email: angel.garciafernandez@aalto.fi). L. Svensson is with the Department of Signals and Systems, Chalmers University of Technology, SE-412 96 Gothenburg, Sweden (email: lennart.svensson@chalmers.se).
July 5, 2019
###### Abstract

In this paper, we propose a metric on the space of finite sets of trajectories for assessing multi-target tracking algorithms in a mathematically sound way. The metric can be used, e.g., to compare estimates from algorithms with the ground truth. It includes intuitive costs associated to localization, missed and false targets and track switches. The metric computation is based on multi-dimensional assignments, which is an NP hard problem. Therefore, we also propose a lower bound for the metric, which is also a metric for sets of trajectories and is computable in polynomial time using linear programming (LP). The LP metric can be implemented using alternating direction method of multipliers such that the complexity scales linearly with the length of the trajectories.

Metric, sets of trajectories, track switches, random finite sets, multiple target tracking

## I Introduction

The main goal of multiple target tracking (MTT) is to estimate a collection of trajectories, which represent the evolution of target states over time, from noisy sensor observations [1]. To evaluate the quality of an estimate, one needs a distance function that quantifies the error between the ground truth, which represents the true trajectories, and the estimate. In order to design such a distance function, first we need a space, where both the ground truth and the estimate lie. In the typical MTT models, targets are born, move and die [1]. Therefore, a natural and minimal representation of the ground truth and its estimates is a set of trajectories [2], where a trajectory is a sequence of target states with a time of appearance and a certain length. Second, the distance function should be a mathematically consistent metric on the selected space and, therefore, it must meet the properties of non-negativity, identity, symmetry and triangle inequality [3, Sec. 6.2.1].

Besides the above fundamental properties, there are MTT-specific features that should be quantified in the metric on the space of sets of trajectories. For the closely related problem of multi-target filtering, which aims to estimate the current set of targets without forming trajectories, the optimal sub-pattern assignment (OSPA) metric [4] has played an important role over the past years. Given two sets of targets, OSPA matches all the targets in the smallest set to different targets in the other set to define a localization error. The rest of the targets in the largest set are penalized as cardinality error. However, traditional MTT performance assessment has been based on different concepts such as localization error for properly detected targets and costs for missed targets and false targets [5, Sec. 13.6], [6, 7, 8, 9]. These aspects have been quantified in a mathematically consistent way by the generalized OSPA (GOSPA) metric [10].

For sets of trajectories, besides the above mentioned features at the individual time instants, we have an additional challenge posed by the temporal dimension of the trajectories. For instance, it is possible that for a single trajectory in the ground truth we get multiple estimated trajectories that form a good match across time. This is referred to as track switching in the literature and should also be penalised [5, Sec. 13.6]. In the following, we proceed to review several distance functions to evaluate MTT algorithms [11, 12, 4, 13, 14, 15, 16, 17]. Though these distance functions are not defined on the space of sets of trajectories [2], it is straightforward to extend the ideas to sets of trajectories, and we discuss these distances in the context of the space of sets of trajectories for comparison.

The OSPA for tracks (OSPA-T) [13] distance function was proposed as an extension of OSPA to sequences of sets of labeled targets, whose states include unique labels besides the physical state. However, OSPA-T returns counter-intuitive results [11, 12] and is not a metric [11]. Another distance function that handles track switches is OSPA with track swaps (OSPA-TS), proposed in [12, Sec. IV], but it is limited to a fixed and known number of trajectories with equal lengths. Another related distance function is the OSPA for multiple tracks (OSPA-MT) [11], but this function does not have a clear interpretation in terms of track switches, localization error, and missed and false targets. In computer vision, distance functions that penalize track switches are commonly used [14, 15, 16]. The most popular distance function in this field is called the classification of events, activities and relationships for multi-object tracking (CLEAR MOT) [14]. CLEAR MOT can handle track switches, but it does so in a heuristic manner that can return counter-intuitive results and is not a metric [17].

The metrics proposed in [17] by Bento solve many problems of the aforementioned cost functions and are mathematically consistent. To be precise, the author first proposes a family of ’natural’ metrics [17, Theorem 3] and then a ’natural and computable’ metric [17, Theorem 9], which is computable in polynomial time using linear programming (LP) and alternating direction method of multipliers (ADMM) [18]. In all Bento’s metrics, dummy trajectories are added to both the ground truth and the estimated set so that both sets have the same cardinality. Then, all trajectories in one set are assigned to trajectories in the other set via permutations and the switches are based on how these permutations change over time. Considering switches based on permutations of real and dummy trajectories is inherent in Bento’s metrics and can provide counter-intuitive results, as illustrated in Section II-C. Another limitation of Bento’s metrics is the removal of parameter of the OSPA distance, which is important to design optimal estimators [19, 20].

In this paper, we first propose a metric on the space of finite sets of trajectories that does not have the shortcomings of the previously discussed distance functions. Our metric penalizes localization error, false and missed targets and track switches in a clear fashion, without the addition of dummy trajectories. We do so by extending the GOSPA metric [10] to trajectories where, contrary to OSPA and Bento’s metrics, targets in one set are assigned to targets in the other set if they are properly detected, but they are left unassigned if they correspond to a missed or a false target. Leaving missed and false targets unassigned is natural as there is no correspondence in the other set [9]. Now, we have track switches from assigned to unassigned or assigned to assigned trajectories. We will see that, in order to define a metric based on assignments/unassignments, an assigned/non-assigned switch must be penalised half the cost of an assigned/assigned switch, so these switches are referred to as half switches. This is an important difference with Bento’s metrics, which define switches based on permutations defined over real and dummy trajectories. From the assignments, one can also compute quantities of interest in MTT such as localisation error, the number of track switches and missed and false targets. In addition, like the OSPA distance, the proposed metric contains a parameter , which is useful in optimal estimation. For example, the choice of is necessary to minimise the mean square OSPA error, as indicated in [19].

To compute the proposed metric, we need to solve an NP-hard multi-dimensional assignment problem [21, 22], which can be solved by Viterbi algorithm [23] for problems with few trajectories. Inspired by Bento’s LP metric, we propose a lower bound on our metric, which is also a metric and can be computed in polynomial time using LP [24], which makes it applicable to sets with more trajectories. As in Bento’s LP metric, we derive an efficient implementation of the LP relaxation metric based on ADMM that scales linearly with the batch duration.

A characteristic of all the above-mentioned metrics is that they assume that the ground truth and the estimate are known sets of trajectories. However, if we evaluate algorithms using the Bayesian framework via Monte Carlo simulations, these quantities are modelled as random variables. The final contribution of this paper is to extend the proposed metric to random finite sets (RFSs) of trajectories [2], as was done in GOSPA for RFSs of targets [10]. This is important for sound evaluation of algorithms using Monte Carlo simulation and, in principle, to design optimal estimators, as in RFSs of targets.

The outline of the paper is as follows. In Section II, we formulate the problem and discuss the challenges in designing a metric for sets of trajectories. Section III presents the proposed metric based on multi-dimensional assignments and, in Section IV, we present the LP relaxation metric and its ADMM implementation. We extend the metric to RFS of trajectories in Section V and, in Section VI, we analyze the proposed metric implementations via simulations. Finally, conclusions are drawn in Section VII.

## Ii Problem formulation and background

In this section, we formulate the problem, discuss the challenges in designing a metric for MTT and present some background work.

### Ii-a Fundamental properties

Our objective is to design a metric on the space of finite sets of trajectories that has an intuitive interpretation and is computable in polynomial time. Below we unfold the problem.

In MTT, the ground truth and its estimate are collections of trajectories, where each trajectory is a sequence of states representing the evolution of the target states over time where the start and end times of the individual trajectories can vary. Both the ground truth and the estimates can be represented as sets of trajectories [2]. In the set of trajectories representation, each trajectory is of the form , where is the initial time of the trajectory, is its length and denotes a sequence that contains target states at consecutive time steps starting from . Given a single trajectory , the set is the state of the trajectory at time [2]:

 τk(X)≜{{xk+1−ω}ω≤k≤ω+ν−1∅otherwise. (1)

In order to design the metric, we only consider trajectories in the time interval from time to . We therefore consider trajectories that meet and , and let be the set of all finite sets of such trajectories.

The above-mentioned trajectory representation is designed to fit the standard multi-target tracking (MTT) models, which contain birth and death events, but no possibility to resurrect [1]. Though not required for standard MTT models, which are the main focus of this paper, the representation could be easily generalised to handle trajectories with gaps, by representing single trajectories as , where and is the number of trajectory segments. The metric that we propose is valid also for such representations without major modifications and, like Bento’s metrics, could then handle sets of trajectories with gaps.

Any metric must satisfy the non-negativity, symmetry, identity and the triangle inequality [25, Sec. 2.15]. We emphasize here that the triangle inequality property, despite its abstractness, has a major practical importance in algorithm assessment [3, Sec. 6.2.1]. Suppose, for instance, that there are two estimates and for a ground truth , and that, according to the metric, the estimate is close to both the ground truth and the other estimate . Then, according to intuition, the estimate should also be close to the ground truth . This property is ensured by the triangle inequality.

### Ii-B Challenges

Besides the fundamental properties, there are specific features to be considered in metrics for sets of trajectories. The properties that apply to metrics for sets of targets, such as localization error and cost due to missed and false targets, are also relevant for metrics on sets of trajectories [10]. However, there are additional challenges posed by the temporal connection of the target states in trajectories, which should also be addressed. Below, we discuss the challenges in detail using examples. We use the notation for the ground truth set and for the estimated set.

In the space of finite sets of targets, the concepts of localization error, missed targets and false targets are important [9], and are quantified in the GOSPA metric [10]. Assume we have a cut-off distance , which is the maximum allowable error for a single target to be considered properly detected [10]. Given this, some of the targets in the ground truth can be assigned to certain targets in the estimated set if the distance between the pairs is less than . The localization error is the sum of the distances between all such assignments. The remaining targets in the ground truth and the estimate are left unassigned and represent missed and false targets, respectively.

The above discussion on localization error and missed and false targets can be extended to sets of trajectories by considering the target states of the trajectories at each time instant. Let us study this method using the examples in Figures 1 and 1, in which states are uni-dimensional and there are two different estimates of target states for the same ground truth . Assuming , it can be observed that from time to , the states of the ground truth in both the examples have identical localization costs. However, at time step , the estimate in Figure 1 has been properly detected as before, whereas the estimate in Figure 1 has missed the state. If , the interpretation is that the target in has been missed at all the time steps and has false targets at all time steps in Figure 1 and from time to in Figure 1.

It is clear that the concepts of localization error for properly detected targets, missed and false targets are relevant to sets of trajectories and they can be quantified by the GOSPA cost at each time step [10]. However, it is not sufficient to use the sum of GOSPA costs across time as a metric. We also need to take the temporal dimension of the trajectories into account, which poses the major challenge of track switches [26, 27, 28]. Below, we provide two examples to illustrate the need of penalizing track switches.

Consider the examples in Figures 1 and 1. We argue that the estimate in Figure 1 is better than the one in Figure 1 as the latter has estimated the trajectory in two parts. However, the sum of GOSPA costs across time yields the same localization error for both estimates. The problem is that the localization cost does not consider how trajectories are connected across time, which prevents it from penalizing for splitting tracks.

Now, consider the examples in Figure 2. Assuming , where is the distance indicated in the figure, and , we argue that the estimate in Figure 2 is better than the estimate in Figure 2. Both estimates provide the same localization costs at all time steps. However, in Figure 2, both and change association of estimates across time while, in Figure 2, only changes association.

There are other desirable properties, besides the above, that the metric should have. First, in each of these examples we have discussed, if we create a new scenario that repeats both and in isolated places of the space at the same or different time window, both the switching cost and the localization cost should scale accordingly with the repetition. Second, if we flip the time axis, move all trajectories ahead or delay them in time, or move the trajectories in space without changing the distances between them, the costs should remain the same. For example, assuming and , the cost in Figure 1 should be double the cost in Figure 1.

### Ii-C Bento’s family of metrics

Among the distance functions for sets of trajectories that are available in the literature [11, 12, 4, 13, 14, 16, 15, 17], only Bento’s metrics [17] address the problem of track switches for unknown number of trajectories while being mathematically consistent.

Bento’s metrics are able to penalise localization cost for properly detected targets, missed and false targets, and track switches. The main difference of Bento’s metrics and the proposed metric is how they handle track switches so we proceed to explain Bento’s track switches.

Bento’s metrics add dummy trajectories, with no physical meaning, to both the ground truth and the estimate so that both sets have the same cardinality. They also add dummy states to all real trajectories so that all trajectories, including dummy states, have the same length. Then, at each time step, targets in the ground truth are optimally associated with targets in the estimate. This association is performed by permuting the indices of the trajectories in one set (including dummy trajectories), and the switching cost only depends on how this permutation changes over time.

Bento’s paper highlights the use of a switching cost that equally penalises if there is one or more switches at a given time step, see Theorem 4 and Section V in [17]. We refer to the resulting metric as the B1 metric. The B1 metric is very important in Bento’s paper as it is the base for Bento’s LP metrics, which are computable in polynomial time and therefore suitable for large problems. Another alternative, which we refer to as the B2 metric and also admits an LP version111Note that the LP version of B1 uses the matrix 1-norm and the LP version of B2 uses the component-wise matrix 1-norm. The LP version of B2 is used in the simulation section in [17], is to sum over the number of switches in Bento’s metrics.

An important drawback of B1 is that it does not meet the desirable property that both switching and localization costs scale according to repetition of and in different locations. B2 solves this problem but it cannot handle other cases such as the one in Figure 1 and Figure 1, where the switching cost in Figure 1 should be twice the switching cost in Figure 1. As we explain next, according to the B2 metric the switching cost in Figure 1 is 3/2 times the switching cost in Figure 1. Therefore, neither of B1 and B2 provide the desired output.

In Figure  1, Bento’s metrics add two dummy trajectories, and , to and, one, , to . The optimal permutation from times 1 to 3 is [1,2,3] and from times 4 to 5 is [2,1,3], where the th component of this permutation vector indicates the index of the element in associated with . In Figure 1, Bento’s metrics add three dummy trajectories to and three more to . Then, the optimal permutation from times 1 to 3 is [1,3,2,4,5,6] and from times 4 to 5 is [2,1,3,4,5,6]. That is, in Figure  1 two elements of the optimal permutation change but, in Figure 1 three elements of the optimal permutation change. The B1 metric provides the same switching cost to both cases. In the B2 metric, Figure 1 has a switching cost that is 3/2 the switching cost of Figure 1.

Bento’s family of metrics contains a wide range of other metrics apart from B1 and B2. However, a considerable drawback with the other metrics is that it is not clear if it is computationally efficient to compute them, e.g., using LP. In any case, all of Bento’s metrics involve dummy trajectories and permutations to define the switching cost. We argue that this implies certain weaknesses in some cases, as illustrated next.

For example, according to the case in Figure 2, Bento’s metrics add two dummy trajectories to and, two more to . Assuming that the switching cost is small, the optimal permutations at times 1 and 2 correspond to [1,2,3,4] and, at times 3 and 4, to [2,1,3,4]. We can see that there is a change in the permutation [1,2,3,4] to [2,1,3,4] that penalises this switch, as desired. However, let us consider Figure 2. Now, Bento’s metrics add three dummy trajectories to and two to . The optimal permutations are [1,2,3,4,5] at times 1 and 2 and [1,3,2,4,5] at times 3 and 4. It turns out that, despite the fact that the situation in Figure 2 is arguably better than in Figure 2, Bento’s metrics penalise both situations in the same way as in both cases there is a change in two adjacent elements in the permutation.

In the case of Figure 2, both and change assignment. On the other hand, in Figure 2, does not change assignment and among the real trajectories, only in the ground truth changes assignment. However, as Bento’s metrics consider permutations, there is an additional change in the third component of the permutation vector, which represents dummy trajectory . We can then see that in the second case, there is a change induced by a dummy trajectory, which is not there in reality, and should not be penalised. Penalising it produces a counter-intuitive behaviour by indicating that the estimates in Figures  2 and Figure 2 are equally accurate.

## Iii Assignment metric

In this section, we present a metric for sets of trajectories, based on the multi-dimensional assignment problem [22], that penalises localization costs for properly detected targets, missed and false targets, and switches between trajectories that actually exist.

### Iii-a Overview

Before we present the mathematical formulation, we provide an overview of the proposed metric using the examples we discussed in Section II. The proposed metric allows a trajectory to be assigned to different trajectories at different times and the switching cost depends on how the multiple assignments change over time. With assignments, there is no need for the sets to be of equal size, as required by permutations, and hence no need for dummy trajectories, which can create counterintuitive switching penalties, as seen before.

The proposed metric allows for the possibility for trajectories to be unassigned, which is denoted using a zero index, at any time step. Leaving trajectories unassigned is natural as, if there is no correspondence in the other set at a time step, the trajectory should not be associated to another trajectory. We will see that the possibility of leaving trajectories unassigned has an important consequence in how the track switches are defined. In the next examples, we use () to denote assignment sequences across time, not to be confused with the notation [], which was used for permutations of the indices at one time instant in Bento’s metrics in the previous section.

Let us first consider Figure 1. The optimal assignments at each time step for are (1,1,1,1,1), as is always associated to . There is no change in the assignments so there is no track switch, as desired. In Figure 1, the assignments for are (1,1,1,2,2), as is associated to during the first three time steps and to later on. Clearly, there is a switch from assigned to assigned (indices different from zero), which should be penalised.

Due to the symmetry property of metrics, the switching penalty must be the same if we consider the assignments from to or from to . Interestingly, if we consider the assignments to in Figure 1, we have (1,1,1,0,0) for and (0,0,0,1,1) for . Now, we have two switches from assigned to unassigned (or the other way round). As the value of the metric must be the same if we penalise the assignments of or , the cost of a switch that considers an unassignment must be half the cost of a switch from assigned to assigned and it is therefore referred to as half-switch. It should be noted that half-switching penalties are inherent in the definition of a metric based on assignment functions.

In our metric, the assignments for in Figure 1 are (1,1,1,1,1). It should be noted that, as in Bento’s family, the metric can assign trajectories that no longer exist. We argue that this represents the case that an estimated trajectory has a delay or starts before the ground truth and is therefore not considered a track switch. In this case, our metric penalises localization error for the first time steps, plus a missed target at time 5, but no track switch.

Let us now consider Figure 2, which Bento’s metrics cannot handle properly. In Figure 2, the assignments for and are (1,1,2,2) and (2,2,1,1). Thus, there are two switches, one for and another for . In Figure 2, the assignments for and are (1,1,1,1) and (2,2,3,3). Thus, there is only one switch for . Consequently, the proposed metric based on assignments indicates that the estimate in Figure 2 is more accurate than the one in Figure 2, as desired.

Finally, in Figure 1, the assignments for , and are (1,1,1,2,2), (3,3,3,0,0) and (0,0,0,3,3). We now have a switch and two half-switches, which make a total of two switches, which is double the switching cost in Figure 1, as desired.

We have provided a brief overview of how the proposed metric works and how it can handle cases in which Bento’s metrics provide counterintuitive results. The main benefits of the proposed metric come from the fact that the switching cost is based on assignments/unassignments while Bento’s metrics are based on permutations and dummy trajectories. We now proceed to present the mathematical formulation of the metric.

### Iii-B Multi-dimensional assignment metric

Before we present the metric in Definition 5, we define the base metric and the assignment functions that compose it. Using these, we describe the localization and switching costs, which finally make up the multi-dimensional assignment metric.

###### Definition 1.

Given a metric on , a scalar and a scalar such that , we define the following base metric for , which has at most one element, as

 d(X,Y) ≜⎧⎪ ⎪⎨⎪ ⎪⎩min(c,db(x,y))X={x},Y={y}0X=Y=∅c21potherwise. (2)

The above metric can be viewed as a simple case of the GOSPA metric () [10] where the cardinalities of the finite sets can be at the most . The role of the parameters and are similar to their roles in the OSPA [4] and GOSPA metrics [10]. The larger the value of is, the more the outliers are penalized. The parameter plays the role of the cut-off distance.

###### Definition 2.

Let be the set of all possible assignment vectors between the index sets and . That is, any is a vector such that its component implies that .

In the multi-dimensional assignment metric, is the assignment sequence that we discussed at the beginning of the section, where is the component of the assignment vector at time . Here, implies that trajectory in is assigned to trajectory in at time and implies that trajectory in is unassigned at time . We use to denote the assignment vector at time . The above definition of assignment vectors ensures that no two distinct indexes in are assigned to the same . However, multiple indexes in can be assigned to the index implying that the corresponding trajectories are unassigned. Let denote the set of indexes of that are left unassigned, according to .

###### Definition 3.

For and , the localization cost and the cost for missed and false targets at each time for a given assignment is

 dk(X,Y,πk)≜nX∑i=1dkX,Y(i,πki)p+∑j∈~πkd(∅,τk(Yj))p (3)

where

 dkX,Y(i,j)≜⎧⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪⎩d(τk(Xi),∅)i=1,…,nX,j=0d(∅,τk(Yj))i=0,j=1,…,nYd(∅,∅)i=0,j=0d(τk(Xi),τk(Yj))otherwise. (4)

Note that the and cases in (4) are not necessary for (3) but will be used later in Section IV-A.

###### Definition 4.

For and , we define the switching cost between two assignment vectors as follows:

 sX,Y(πk,πk+1)≜γpnX∑i=1s(πki,πk+1i) (5)

where

 s(πki,πk+1i) ≜⎧⎪ ⎪⎨⎪ ⎪⎩0πki=πk+1i1πki≠πk+1i,πki≠0,πk+1i≠012otherwise. (6)

The terms correspond to no switch when there are no changes in the assignments, a full switch when there is change from one non-zero to another non-zero assignment, and a half switch when there is a change from a zero to a non-zero assignment or vice versa. The parameter is the switching penalty. The larger the value of is, the higher a track switch costs. We also want to recall that half switches are an intrinsic part of a metric based on assignments/unassignments, as was explained in Section III-A.

We now present the multi-dimensional metric that is formed by the sum of the localization cost in (3) and the switching cost in (5) across all time instants .

###### Definition 5.

For , and , the multi-dimensional assignment distance for any is defined as

 d(c,γ)p(X,Y) ≜minπk∈ΠX,Yk=1,…,T(T∑k=1dkX,Y(X,Y,πk) +T−1∑k=1sX,Y(πk,πk+1))1p. (7)

As we formalize in the below proposition, the above distance function is indeed a metric, which is parameterized by the cut-off , switching penalty and the exponent . The base metric is also a design choice.

###### Proposition 1.

For , and , the distance function in (7) is a metric for any .

The proof for the above proposition is provided as a special case of the proof of the LP relaxation metric in Proposition 3 presented in Section IV.

### Iii-C Computation

The metric proposed in (7) is computed by solving a multi-dimensional assignment problem which belongs to the NP-hard family of problems [21, 22]. This problem can be solved using the Viterbi algorithm [29, 23], but it is only efficient for small problems (roughly in MATLAB). We would like to point out here that the Viterbi solution scales linearly with the duration , which means that it is tractable to compute (7) also for long trajectories, as long as and are small. One can also use methods such as the dual decomposition [30] to compute the assignment metric sub-optimally. Nevertheless, in the next section, we show that it is possible to get an accurate lower bound on the metric using linear programming which can be computed in polynomial time. It turns out that this lower bound is also a metric for sets of trajectories.

## Iv LP relaxation metric

In this section, inspired by Bento’s LP metrics [17], we show that the metric in (7) can be reformulated as a mixed binary-linear programming problem [31]. When the binary constraints are relaxed, we get an LP problem which provides a lower bound for the metric and is also a metric on the space of finite sets of trajectories. This LP metric is computable in polynomial time. We also describe an implementation of the LP relaxation metric using ADMM, in which the computation scales linearly with the batch duration compared to a direct implementation.

### Iv-a Binary linear programming formulation

Before we present the binary LP formulation of the metric , we introduce an equivalent representation of the assignments vectors presented in Definition 2 using binary weight matrices.

Let be the set of all binary matrices of dimension , representing associations between X and Y, such that satisfies the following properties:

 nX+1∑i=1W(i,j)=1, j=1,…,nY (8) nY+1∑j=1W(i,j)=1, i=1,…,nX (9) W(nX+1,nY+1)=0, (10) W(i,j)∈{0,1}, ∀i,j, (11)

where represents the component in the row and column of matrix . The indexes and correspond to unassigned index . The first two properties ensure that no two target indexes in are assigned to the same and vice versa.

It is immediate to see that there is a bijection between the sets and , such that for , , and :

 πi=j≠0 ⇔ W(i,j)=1 (12) πi=0 ⇔ W(i,nY+1)=1 (13) ∄i∈{1,…,nX},πi=j≠0 ⇔ W(nX+1,j)=1. (14)

To illustrate the above bijection, let us consider Figure 1, where the assignment sequence of is . The corresponding weight matrices for time are for , for and everywhere else. For the example in Figure 1, is such that for , for and everywhere else.

###### Lemma 2.

The multi-dimensional assignment metric in (7) can be written as

 d(c,γ)p(X,Y)=minWk∈WX,Yk=1,…,T(T∑k=1tr[(DkX,Y)†Wk] +γp2T−1∑k=1nX∑i=1nY∑j=1|Wk(i,j)−Wk+1(i,j)|)1p, (15)

where component of matrix is

 (16)

is given by (4), is the matrix trace operator and denotes the matrix transpose.

The proof of the above lemma follows immediately from the bijection defined between the sets and in (12), (13) and (14), using which (7) and (15) give identical localization and switching costs.

### Iv-B LP relaxation metric

In this section, we relax the binary constraints of matrices in Lemma 2 and show that the result is a metric that is computable in polynomial time using linear programming.

Let be the set of all matrices of dimension such that satisfies (8), (9), (10) and

 W(i,j)≥0, ∀i,j. (17)

The main difference to is the relaxation of the constraint in (11) and so . The relaxation of the binary constraints can be interpreted as making soft assignments of trajectories from one set to the other. Below, we define a new distance function, where the only difference to in (15) is that the optimization is over in instead of . Therefore it follows immediately that .

###### Definition 6.

For , and , the LP relaxation metric between sets of trajectories and is

 ¯d(c,γ)p(X,Y)≜minWk∈¯¯¯¯¯WX,Yk=1,…,T(T∑k=1tr[(DkX,Y)†Wk] +γp2T−1∑k=1nX∑i=1nY∑j=1|Wk(i,j)−Wk+1(i,j)|)1p, (18)

where the component of matrix , , is given by (16) and is given by (8), (9), (10) and (17).

###### Proposition 3.

The distance function in (18) is a metric on the space of finite sets of trajectories and is computable in polynomial time using LP [24].

The proof of the above proposition is provided in Appendix A. For the examples in Figure 1 and Figure 2, both the assignment metric in (7) and the LP metric in (18) take the same values.

### Iv-C Implementation of the LP relaxation metric using ADMM

In this section, we discuss an implementation of the LP relaxation metric using alternating direction method of multipliers (ADMM) [18]. This implementation is important because its computational complexity scales linearly with . Thus, it is faster than a direct implementation of an LP solver, such as the function “linprog” in MATLAB. The method also has the advantage that its sub-problems can be parallelized easily.

The LP relaxation metric in (18) is computable using LP in (22) and thus has a polynomial time complexity in the number of variables [24], which is in the order of . Therefore, the LP solutions for the metric computation has a computational complexity that scales polynomially in the batch duration . This may be undesirable when the batch duration is large, and may be slower than the Viterbi solution, for small values of and , whose complexity scales linearly with the batch duration [23, 29]. To address this problem, we propose to use ADMM [18] to compute the metric such that the complexity scales linearly with the batch duration .

Using redundant variables for the weight matrices and indicator functions for the constraints, we formulate the optimization problem for computing the LP relaxation metric in the consensus optimization form in [18, Sec. 7.2], and use an iterative ADMM procedure to compute the metric efficiently. The details of the reformulation of the problem are provided in Appendix B. For the LP relaxation metric, the iteration of ADMM for computing (22) is given by (62) to (67) in Appendix B.

We now briefly discuss the computational aspects of (62) to (67). First, it is important to observe that in each iteration, the updates for the different variables are performed at individual time instants, which is key to making the complexity of the algorithm grow linearly with . As discussed in [18, Sec. 3.2.2], the ADMM iterations in (62) to (67) converge in few iterations to reasonable accuracy, which suffices for practical purposes. Also, the update of , and in (65), (66) and (57) are straightforward and simple. The optimization problems in (62), (63) and (64) are quadratic programming (QP) problems, which are in fact convex. Therefore, each of these sub-problems is solvable in polynomial time [32]. Besides, in ADMM, it is enough to solve these sub-problems approximately to modest accuracy [18, Sec. 3.4.4]. In our case, we can adjust the accuracy of the sub-problem solutions by adjusting the number of iterations inside the QP solver. In addition to these features, these sub-problems can be easily parallelized. Overall, in principle, the ADMM iterations in (62) to (67) should provide a good trade-off between accuracy and computation time compared to a straightforward implementation of the LP relaxation metric when is large.

## V Extension to random sets of trajectories

In the previous sections, we studied metrics between deterministic finite sets of trajectories. However, in the Bayesian formulation of MTT, the ground truth is a random quantity and the estimates are sets that depend deterministically on the observed data [1]. For performance evaluation, using, e.g., Monte Carlo simulation, the metric values are averaged over several realizations of the data. In such scenarios, both the estimates and the ground truth can be interpreted as random finite sets. It has been argued in [10, Sec. V] that, in MTT, when estimating the target states, both the ground truth and the estimates are to be viewed as random sets of targets for performance evaluation and algorithm design. Therefore, it is important to have a metric on the space of random sets of targets. These arguments are valid for random sets of trajectories as well. That is, it is important to have a metric on the space of random sets of trajectories such that one can compute mean or root mean squared value of the metric over several instances of the random sets. We proceed to extend the metric proposed in (18) to random sets of trajectories using set integrals [2].

Let be the joint probability density function on the random sets of trajectories and [3]. We define a distance function between two random sets of trajectories using set integrals [3, Sec. 3.5.3] as follows:

 p√E[¯d(c,γ)p(X,Y)p] = (∫∫¯d(c,γ)p(X,Y)p (19) ×ψ(X,Y)δXδY)1p,

where the set of trajectory integrals are in the trajectory space from time step to [2]. In the above definition, the integral goes through all possible values of the cardinalities, start times, lengths and target states of trajectories.

###### Lemma 4.

in (19) is a metric on the space of random sets of trajectories.

The proof of the above proposition is immediately obtained by using the set integral on sets of trajectories instead of the set integral on sets of targets in the proof in [10, App. C].

It should be noted that it usually aids to select in (19) to obtain computable optimal estimators. In this case, when we set the Euclidean metric as the base metric on in Definition 1, we get a sum of squares form inside the expectation. For the OSPA/GOSPA metrics with known target number, we can obtain the best estimator for [19]. Similarly, choosing in the proposed metric should in principle help in the calculation of the optimal estimator.

## Vi Simulation results

In this section, we present the simulation results of how the ADMM method discussed in Section IV-C for computing the LP relaxation metric scales linearly with the batch duration . We also present simulation results of the metric for varying parameter values and also compare them with Bento’s metrics.

### Vi-a Comparison between ADMM and “linprog” implementations

In this section, we present the simulation results on how the ADMM implementation in Section IV-C and “linprog” implementation in MATLAB scale with varying values of and the number of tracks.

The set-up for the simulations is as follows: The number of trajectories and in the two sets and is chosen uniformly between and . The state dimension is , describing 2-dimensional position components. The start time of the individual trajectories are generated uniformly between and . The initial state of the trajectories are randomly generated from Gaussian densities whose means are generated uniformly in the region and whose covariance matrix is . Each existing trajectory survives into the next time instant with probability . The state transition density is given by

 g(xk|xk−1)=N(xk;[1001]xk−1,10−2[10001]). (20)

Besides these, the metric parameters are set as , and and the augmented Lagrangian parameter for the ADMM implementation.

We have used the “linprog” and "quadprog" commands in MATLAB-R2014b for a direct implementation of the LP relaxation metric and for the QP in ADMM implementation, respectively. With all other parameters fixed, we have varied and and have averaged the computation time and the absolute difference in the metric values between the two algorithms over Monte Carlo iterations. The maximum number of iterations for QP in MATLAB has been set to and the maximum number of ADMM iterations to . The ADMM iterations are forestalled if the algorithm has converged to reasonable accuracy. In our implementation, we have checked this convergence by checking if the values of the primal residual and the dual residual are smaller than [18, Sec. 3.3.1].

The computation time results are presented in Figure 3. The simulations have been run on a Windows desktop with Intel i7 processor and RAM. In our simulations, the “linprog” implementation of MATLAB ran out of memory for the case when and and, therefore, this point is not plotted in the figure. The linear trend of the run time of the ADMM for varying values of is apparent in Figure 3. In the simulation, we have noticed that the difference between the values returned by the two implementations is in essence negligible. To give a reference on the size of the error, the maximum of the average error across all the simulations was of the value of the metric.

### Vi-B Example in MTT using sets of trajectories

Here, we illustrate the behavior of the LP relaxation metric, given by Definition 6, for varying values of and using an MTT example. We also compare the values returned by the LP relaxation metric to Bento’s metrics with switching cost based on 1-norm and component-wise 1-norm. For computation, we have used the “linprog” implementation, as we get the same results from both “linprog” and the ADMM methods. We have set and as Euclidean distance in the metric for these simulations.

We consider a multiple target tracking scenario, where we use the notation, models and the Bayesian closed form solution for sets of trajectories in [2]. We consider a target state that consists of one-dimensional position and velocity for ease of illustration. The targets can be born from similar PDFs,

 β1(x)=β2(x)=N(x;[00],[25001]).

The probabilities that there are , (from either of the PDFs) or new born targets at each time are , and , respectively. The probability for a target to survive to the next time instant is , and the corresponding state is governed by the state transition density

 g(xk|xk−1)=N(xk;[1101]xk−1,110[1/31/21/21]).

We consider a batch duration of .

We consider the standard measurement model [1]. We obtain positional measurements of the targets from the sensors with probability with the target measurements generated according to . We observe Poisson clutter, which is uniformly distributed in the interval and there is an average of clutter measurement per scan.

The position components of the targets in the ground truth and the observed measurements are shown in Figures 4 and Figure 4, respectively. The circles and crosses in the figure indicate the appearance and end times of the tracks respectively. It is shown in [2] that the posterior PDF is a mixture with multiple hypotheses. Some of the possible hypotheses for the ground truth in Figure 4 according to the measurements in Figure 4 are shown in Figure 5.We have plotted the posterior mean of the trajectories conditioned on the different hypotheses. We have considered these hypotheses as we think they are insightful to illustrate the behavior of the metric.

We first analyze the LP relaxation metric between the ground truth and the posterior mean of these hypotheses. We want to point out that the LP relaxation metric and the multi-dimensional assignment metric, computed by Viterbi algorithm, returned the same values for these scenarios. The results are presented in Table I. As can be seen from the tables and the figures, for fixed , when the switching cost parameter is increased, the metric values increase for the cases with full or half switches in Figures 5,  5, and 5. Similarly, for fixed , when is increased, the metric values increase for the cases with missed and/or false targets in Figures 5,  5 and  5. For the case in Figure 5 which has a track switch and a false track, the metric value increases for increase in both and . It can also be observed that the case in Figure 5 is always returned as the most accurate one irrespective of the parameters choice, which agrees with intuition.

Table I also contains the results for Bento’s metric with 1-norm and component-wise 1-norm denoted and , respectively. It can be immediately observed that for the cases in Figures 5,  5 and  5 which have no track switches, the values are identical between the Bento’s metrics and the proposed LP relaxation metric.

One can note the differences in the values for the scenarios in Figure 5,  5 and  5. For the case in Figure 5, there is one full switch (or two half switches from the perspective of ) between time steps and . This switch contributes as the switching cost to the proposed LP metric, and Bento’s metric with 1-norm , thus resulting in identical metric values. However, also includes the switches in the dummy trajectories, similar to the case in Figure 2, and therefore has a higher switching cost leading to a higher metric value.

For the cases in Figures 5 and  5, there are two full switches between time steps and , which according to our metric contributes as the switching cost to our metric . This value also matches with , as the switches are only between the real trajectories and it counts all the switches. However, in , the switching cost contribution is regardless of the number of switches (if non-zero) at a particular time, which is counter-intuitive.

## Vii Conclusion

In this paper, we have proposed a metric that quantifies the distance between a pair of sets of trajectories. This metric captures the localization error, missed or false targets and track switches in a way that agrees with intuition without the addition of dummy trajectories.

When the number of trajectories is small, the metric can be computed using Viterbi algorithm. For larger problems, we have proposed a bound, which is also a metric and can be computed using linear programming. For all the results presented in the paper, the lower bound is identical to the original metric. To compute the LP relaxation metric efficiently, we have proposed an ADMM based algorithm whose complexity scales linearly with the number of time steps and enables parallelization. We have also extended this metric to the space of random sets of trajectories.

## Appendix A Proof for Proposition 3

### A-a Proof for computability using LP

The proof for the computability of the metric in (18) using LP is along the same lines as in [17, Theorem 10]. First, note that to compute the metric in (18), it is enough to solve the following optimization problem:

 argminWk∈¯¯¯¯¯WX,Yk=1,…,TT∑k=1tr[(DkX,Y)†Wk] +γp2T−1∑k=1nX∑i=1nY∑j=1|Wk(i,j)−Wk+1(i,j)|. (21)

The objective function in the above problem can be written in linear form as

 (22)

by introducing variables for to the optimization problem with equality constraints

 ek=nX∑i=1nY∑j=1|Wk(i,j)−Wk+1(i,j)|. (23)

Note that except the above set of constraints, all the other constraints in (8), (9), (10) and (17) are linear. The above set of equality constraints can be written in an equivalent linear form by introducing additional variables for to the optimization problem as follows:

 ek ≥nX∑i=1nY∑j=1Hk(i,j), (24) Hk(i,j) ≥Wk(i,j)−Wk+1(i,j),i=1,…,nXj=1,…,nY, (25) Hk(i,j) ≥Wk+1(i,j)−Wk(i,j),i=1,…,nXj=1,…,nY. (26)

### A-B Proof for metric properties

The non-negativity, identity and symmetry properties of the metric in (18) are immediate from the definition. Below we prove the triangle inequality. The proof in this section is done for the LP relaxation metric, where the optimization is over . The proof is analogous for the multi-dimensional assignment metric in (15), where the optimisation is over , and therefore also holds for Proposition 1.

We denote the objective function in (18) as as a function of the matrices. The outline of the proof is as follows: We assume that we have three sets of trajectories , and . Let , and be the weight matrices that minimize , and respectively. We construct a matrix from and as

 WkX,Y(i,j) (27)

and show that

 ¯d(c,γ)p(X,Y,W1:TX,Y)≤¯d(c,γ)p(X,Z)+¯d(c,γ)p(Z,Y). (28)

Combining the above result with the fact that , we get the triangle inequality

 ¯d(c,γ)p(X,Y) ≤¯d(c,γ)p(X,Z)+¯d(c,γ)p(Z,Y). (29)

To prove (28), we show that for any and , and constructed according to (27), the following result holds:

 ¯d(c,γ)p(X,Y,W1:TX,Y) ≤¯d(c,γ)p(X,Z,W1:TX,Z)+¯d(c,γ)p(Z