A Distributed Particle-PHD Filter with Arithmetic-Average PHD Fusion

# A Distributed Particle-PHD Filter with Arithmetic-Average PHD Fusion

Tiancheng Li and Franz Hlawatsch,  T. Li is with the School of Automation, Northwestern Polytechnical University, Xi’an 710129, China and also with the BISITE group, University of Salamanca, 37007 Salamanca, Spain; e-mail: t.c.li@usal.es, t.c.li@mail.nwpu. edu.cn.F. Hlawatsch is with the Institute of Telecommunications, TU Wien, Vienna, Austria and also with Brno University of Technology, Brno, Czech Republic; e-mail: franz.hlawatsch@tuwien.ac.at.This work was partially supported by the Marie Skłodowska-Curie Individual Fellowship (H2020-MSCA-IF-2015) under Grant 709267 and by the Austrian Science Fund (FWF) under Grant P27370-N30.
###### Abstract

We propose a particle-based distributed PHD filter for tracking an unknown, time-varying number of targets. To reduce communication, the local PHD filters at neighboring sensorscommunicate Gaussian mixture (GM) parameters. In contrast to most existing distributed PHD filters, our filter employs an “arithmetic average” fusion. For particles–GM conversion, we use a method that avoids particle clustering and enables a significance-based pruning of the GM components. For GM–particles conversion, we develop an importance sampling based method that enables a parallelization of filtering and dissemination/fusion operations. The proposed distributed particle-PHD filter is able to integrate GM-based local PHD filters. Simulations demonstrate the excellent performance and small communication and computation requirements of our filter.

Distributed multitarget tracking, distributed PHD filter, average consensus, flooding, probability hypothesis density, random finite set, Gaussian mixture, sequential Monte Carlo, importance sampling, arithmetic average fusion.

## I Introduction

The probability hypothesis density (PHD) filter is a popular method for tracking an unknown, time-varying number of targets in the presence of clutter and missed detections [1, 2, 3]. In decentralized sensor networks, a distributed extension of the PHD filter can be employed where each sensor runs a local PHD filter and exchanges relevant information with neighboring sensors. For the local PHD filters, a Gaussian mixture (GM) implementation [4, 5, 6, 7, 8, 9, 10] or a particle-based implementation [11, 12, 13, 14] is typically used. For distributed data fusion, most existing distributed PHD filters perform a “geometric average” (GA) fusion of the local posterior PHDs [11, 12, 4, 5, 6, 7]; this type of fusion is also known as (generalized) covariance intersection [15, 16, 17, 18, 19, 20]. However, GA fusion of PHDs has been observed to suffer from certain deficiencies: it performs poorly in the case of closely spaced targets [9, 10]; it incurs a delay in detecting new targets [6, 10]; it is sensitive to missing measurements [8, 7]; and it does not lead to consistent fusion of cardinality distributions and thus tends to underestimate the number of targets [21].

In this paper, we propose a distributed PHD filter method that performs an “arithmetic average” (AA) fusion of the localposterior PHDs. AA fusion of PHDs first appeared indirectly in the context of centralized multisensor PHD filtering, as an implicit consequence of AA fusion of the generalized likelihood functions of multiple sensors [22]. It was used explicitly and in the context of distributed PHD filtering in [8, 10] (based on a GM implementation of the local PHD filters) and in [13, 14] (based on a particle implementation of the local PHD filters and a straightforward particle-based dissemination/fusion scheme). AA fusion of PHDs was demonstrated in [10, 13, 14] to outperform GA fusion of PHDs in the sense of better filtering accuracy, higher reliability in scenarios with strong clutter and/or frequent missed detections, and lower computational complexity.

The proposed distributed PHD filter employs a particle implementation of the local PHD filters for the sake of maximum suitability for nonlinear and/or non-Gaussian system models. Straightforward fusion of particle representations of the local and fused PHDs imposes high communication requirements [13, 14]. By contrast, our filter has low communication requirements because GM parameters are communicated. This also allows our particle-based local PHD filters to be easily combined with GM-based local PHD filters within a heterogeneous network architecture.

For converting particle representations into GM representations, we propose a data-driven method that avoids a clustering of the particles. This method generates from the particle representation one Gaussian component for each measurement that has a significant impact on the particle weights. The overall approach is inspired by a scheme for estimate extraction proposed in [23, 24, 25]. For converting the GMs produced by AA fusion into particle representations, we propose a method that is based on importance sampling (IS) [26, Ch. 3.3]. This method does not require sampling from the fused GM, thereby enabling a parallelization of filtering and dissemination/fusion operations. This allows more dissemination/fusion iterations to be performed compared to protocols where the filtering and dissemination/fusion operations must be performed serially. Overall, the main contribution of this paper is to devise an AA fusion-based distributed particle-PHD filter that has low communication requirements and allows for a parallelization of filtering and dissemination/fusion operations.

The paper is organized as follows. The system model is described in Section II. Section III discusses the basic operation of the particle-based local PHD filters and presents a measurement-based weight decomposition. Section IV provides a motivation and outline of the proposed distributed PHD filter. Section V describes a method for converting particle representations into GM representations. Section VI discusses two schemes for GM dissemination and fusion. An IS method for converting the fused GM into a particle representation is proposed in Section VII. Section VIII presents two further stages of the proposed distributed PHD filter. Section IX provides a summary of the overall method, discusses the parallelization of filtering and fusion, and analyzes the communication cost. Simulation results are reported in Section X.

## Ii System Model

We consider targets with random states , at discrete time . The number of targets, , is unknown, time-varying, and considered random. Accordingly, the collection of target states is modeled by a random finite set (RFS) with random cardinality [27]. The cardinality distribution is the probability mass function of . A target with state at time continues to exist at time with probability (“survival probability”) or disappears with probability . In the former case, its new state is distributed according to a transition probability density function (pdf) . There may also be newborn targets, whose states are modeled by a Poisson RFS with intensity function [28].

There are sensors indexed by . At time , each sensor observes an RFS of measurements , where is the number of measurements observed by sensor at time . We denote by the set of sensors that are connected to sensor by a communication link, and we refer to these sensors as the neighbors of sensor . We assume that the sensor network is connected, i.e., each sensor can be reached from each other sensor by one or multiple communication hops. A target with state is “detected” by sensor with probability (“detection probability”) or “missed” by sensor with probability . In the former case, the target generates a measurement , which is distributed according to the pdf . There may also be clutter measurements, which are modeled by a Poisson RFS with intensity function (PHD) . The multitarget state evolution and measurement processes are assumed to satisfy the independence assumptions discussed in [1].

## Iii Local Particle-PHD Filters

Each sensor runs a local PHD filter that uses the local measurement set and communicates with its neighbors to exchange relevant information. Let us, at first, consider the local PHD filter without any cooperation.

### Iii-a Propagation of the Local Posterior PHD

The local PHD filter propagates the local posterior PHD over time . Let comprise the local measurements observed by sensor up to time . Furthermore, for a region , let denote the number of those targets whose states are in . Then, the local posterior PHD at sensor , , is defined as the function of whose integral over a region equals the posterior expectation of , i.e. [27]

 (1)

In particular, for , we have , and thus (1) becomes

 ∫RdDs,k(x|Zs,1:k)dx=E[Nk|Zs,1:k]=∞∑n=0nρ(n|Zs,1:k),

where . The posterior expectation of , , is equal to the minimum mean square error (MMSE) estimate of from [29], denoted . Thus, Eq. (LABEL:eq:Cardinality) implies

 (3)

This is also known as the expected a posteriori (EAP) estimate of [1, 27].

The local PHD filter performs a time-recursive calculation of an approximation of the local posterior PHD . In a prediction step, it converts the preceding approximate local posterior PHD into a “predicted” PHD, denoted , via an expression involving , , and [1]. In a subsequent update step, it converts into the approximate local posterior PHD via anexpression involving , , and [1, 2].

### Iii-B Particle-Based Implementation

We use the particle-based implementation of the prediction and update steps proposed in [2]. The approximate local posterior PHD is represented by the weighted particle set , which consists of particles and weights , . The sum of the weights,

 Ws,k≜Js,k∑j=1w(j)s,k, (4)

approximates and, hence, . Thus, it further follows with (3) that

 Ws,k≈^NMMSEs,k. (5)

Propagating the approximate local posterior PHD (i.e., ) is now approximated by propagating the weighted particle set, i.e., . The time-recursive calculation of is done as follows [2]. For each previous particle , , a current particle is drawn from a proposal pdf . In addition, “newborn” particles , are drawn from a proposal pdf . Then, for each particle , , a “predicted” weight is calculated as

Note that . A simple choice of the first proposal pdf is , in which case for .

For the calculation of the current weights , , we formally introduce a “pseudo-measurement” representing the case of a missed detection at sensor , and, accordingly, we consider an extended measurement set . Then, the weight expression in [2, Eq. (22)] can be formulated as the sum [23, 24, 25]

 (7)

where

 (8)

with . Expression (7) provides an expansion of into components , each of which corresponds to one of the measurements . We also introduce

 Ωs,k(z)≜Js,k∑j=1ω(j)s,k(z),z∈Z0s,k.\vspace−.5mm (9)

For , , which provides an estimate of the probability that measurement originates from a target. For , provides an estimate of the number of missed detections. Note that

 ∑z∈Z0s,kΩs,k(z)=Js,k∑j=1∑z∈Z0s,kω(j)s,k(z)=Js,k∑j=1w(j)s,k=Ws,k, (10)

where (7) and (4) were used.

## Iv Motivation and Outline of the Proposed PHD Fusion Scheme

The proposed distributed PHD filter uses information fused among the sensors to “re-weight” the particles of the local PHD filters such that the resulting new PHD approximates the AA of the local PHDs. Forming the AA can be motivated as follows. Suppose sensor wishes to estimate the number of targets in a region , , via the estimator (cf. (1)) . Since is affected by clutter and missed detections, may be quite different from . For example, if one target is in , i.e., , sensor may fail to detect that target, resulting in ; or if no target is in , i.e., , a false alarm (clutter) at sensor may lead to . On the other hand, because the clutter and the missed detections at different sensors are not identical—in fact, they are assumed independent across the sensors—one can expect that the AA of the , , is a more robust estimate of . This AA can be expressed as

 ^NR1:S,k=1SS∑s=1∫R^Ds,k(x|Zs,1:k)dx=∫R^Dk(x|Z1:S,1:k)dx,\vspace.5mm

with the AA of the local PHDs

 ^Dk(x|Z1:S,1:k)≜1SS∑s=1^Ds,k(x|Zs,1:k).\vspace−.5mm (11)

Thus, is obtained by integrating the AA of the local PHDs over . This motivates a fusion of the local PHDs —thereby combining all the local measurements , —by calculating the AA of the : we can expect that this compensates the effects of clutter and missed detections to some extent. In addition, the AA fusion of the local PHDs can be motivated theoretically by the fact that the fused PHD minimizes the sum of the Cauchy-Schwarz divergences relative to the local PHDs [30, 13].

To reduce the amount of intersensor communication, the information exchanged between neighboring sensors in our approach consists of GM parameters rather than particles and weights. This necessitates conversions between particle and GM representations. The proposed AA-based fusion scheme thus consists of the following steps:

1. Each sensor converts its weighted particle set into a GM (see Section V) and broadcasts the GM parameters to the neighboring sensors .

2. Each sensor broadcasts its local cardinality estimate (see (4), (5)) to the neighboring sensors .

3. The GM parameters of each sensor are fused with those received from the other sensors via a distributed dissemination/fusion scheme; see Section VI.

4. The local cardinality estimate of each sensor is fused with those received from the other sensors via a distributed dissemination/fusion scheme; see Section VIII-A [31].

5. At each sensor , the local particle weights are modified based on the fused GM parameters and the fused cardinality estimate; see Sections VII and VIII-A.

## V Particles–GM Conversion

In Step 1, the local weighted particle set is converted into a GM representation. Our conversion method differs from previous methods [32, 33, 34, 35, 12, 36, 37]in that it is based on the weight expansion in (7), i.e., . In our method, each of the extended measurements potentially corresponds to one Gaussian component (GC) . Here, denotes a Gaussian pdf with mean vector and covariance matrix . The GC is meant to represent the weighted particle set . The mean vector and covariance matrix are derived from the respective weight components and the particles as

 μs,k(z) =Js,k∑j=1¯ω(j)s,k(z)x(j)s,k, (12) Σs,k(z)

where with given by (8). In the overall GM-based PHD (briefly referred to as GM-PHD), the GC is multiplied by the weight (see (9)). Thus, there is one weighted GC for each measurement .

The overall GM-PHD is meant to represent the local weighted particle set . Because , the overall GM-PHD is ideally taken to be the sum of all the weighted GCs, i.e.,

 (14)

This provides an approximate GM representation of . However, to further reduce the communication cost, we restrict the sum (14) to the GCs corresponding to “significant” measurements. (We note that a similar restriction was used previously for estimate extraction in [23, 24, 25].) The subset of significant measurements, , is defined as the set of those for which in (9) is above a threshold , where . In other words, the GM at sensor contains a GC for if the estimated probability that the measurement originates from a target (given by ) is above , and it contains a GC for if the estimated number of missed detections (given by ) is above . Thus, the local GM-PHD is taken to be

 (15)

This can be interpreted as the GM-PHD corresponding to theparticle set whose weights are defined by summing the only over the significant measurements, i.e., . We note that an alternative definition of a significant measurement subset and, thus, of is to choose the GCs with the largest , . Here, according to (10), and we recall from (5) that approximates the MMSE estimate .

The suppression of GCs in (15) is motivated by the notion that “insignificant” measurements are likely to be false alarms (clutter). However, if an insignificant measurement is not a false alarm after all, we can expect that it is not suppressed at most of the other sensors, and thus the erroneous suppression at sensor will be compensated by the subsequent AA fusion. This is an advantage of AA fusion over GA fusion.

The GM parameter set underlying the local GM-PHD in (15) is

 Gs,k≜{(Ωs,k(z),μs,k(z),Σs,k(z))}z∈ZSs,k. (16)

All the further operations of our distributed PHD filter are based on ; the GM-PHD itself is never calculated. These further operations comprise a distributed fusion of the local GM parameter sets and of the local cardinality estimates, the conversion of the fused GM representations into particle representations, a scaling of the particle weights, and the calculation of state estimates. A detailed presentation of these steps will be given in Sections VIIX.

## Vi Two GM Dissemination/Fusion Schemes

Once the local GM parameter sets are available at the respective sensors , they are disseminated and fused via a distributed scheme. The goal of this scheme is to obtain, at each sensor , a GM parameter set that approximately corresponds to the AA of all the local GM-PHDs,

 ¯DGMk(x)≜1SS∑s=1DGMs,k(x). (17)

Note that this equals (11) except that is replaced by . Next, we discuss two alternative schemes for disseminating and fusing the local GM parameter sets.

### Vi-a GM Flooding

In the flooding scheme [36], each sensor first broadcasts to its neighbors its GM parameter set and receives their GM parameter sets , . Each sensor then augments its own GM parameter set by the neighbor GM parameter sets , , resulting in . In the subsequent flooding iteration , each sensor broadcasts to its neighbors the augmented set with the exception of the elements already broadcast (the sensor keeps track of all the elements it already broadcast [36]) and receives the new elements of the neighbors’ augmented sets . This results in the new augmented set

 GF[i]s,k=⋃r∈{s}∪SsGF[i−1]r,k.\vspace−1.5mm (18)

This recursion is initialized with .

After the final flooding iteration (the choice of will be discussed in Section IX-A), the augmented parameter set at sensor is equal to

 GF[I]s,k=⋃r∈S[I]sGr,k,\vspace−1.5mm (19)

where denotes the set of all those sensors that are at most hops away from sensor , including sensor itself. At this point, sensor would be able to calculate the AA of all the GM-PHDs whose GM parameters are contained in , i.e.,

 DGM[I]s,k(x)=1∣∣S[I]s∣∣∑r∈S[I]sDGMr,k(x). (20)

If , where is the network diameter [38, 36], then contains the GM parameters of all the sensors, and thus equals the total GM-PHD average in (17). (This presupposes that the sensor network is connected, which we assumed in Section II.) If , then provides only an approximation of .

A drawback of the flooding scheme is that as the flooding iteration proceeds, the sets grow in size since the GM parameters of additional sensors are included. Indeed, in iteration , sensor receives the GM parameters , where comprises all sensors that are exactly hops away from sensor ; note that . These GM parameters are added to the previous GM parameter set ofsensor , . Thus, Eq. (18) can be reformulated as

 (21)

The total number of real values that have to be broadcast in iteration by all the sensors in the network is equal to the number of real values needed to specify all the elements of the set .

### Vi-B GM Average Consensus

To limit the growth of the GM parameter sets and to reduce the communication cost, we may emulate a part of the averaging in (20) in each iteration . To this end, we consider a formal application of the average consensus algorithm [39, 38] to the local GM-PHDs. According to that algorithm, the iterated GM-PHD at sensor —denoted by —would be updated in iteration as

 (22)

with appropriately chosen weights , where . A popular choice is given by the Metropolis weights [39] defined as if and . The recursion (22) is initialized as (see (15)). Since the network is connected, is guaranteed to converge for to the total GM-PHD average in (17) [39]. For a finite number of iterations, provides only an approximation of .

A direct implementation of the update (22) is impossible because the iterated GM-PHDs are functions, rather than numbers. Therefore, we will emulate (22) through operations involving the GM parameters of the iterated local GM-PHDs and , involved in (22). First, as in the flooding scheme discussed in Section VI-A, each sensor broadcasts to its neighbors its GM parameter set (see(16)) and receives their GM parameter sets . Then, sensor scales each GM weight with the corresponding consensus weight , resulting in the scaled weights , for , . Thus, sensor now disposes of the “scaled GM parameter sets”

for all . The GM-PHD generated in analogy to (15) from the union of all these GM parameter sets, , would be

 DGM,∪s,k(x) =∑r∈{s}∪Ssαs,r∑z∈ZSr,kΩr,k(z)N(x;μr,k(z),Σr,k(z)) (23)

where (15) was used in the last step. A comparison with (22) shows that we have emulated the first GM-PHD average consensus iteration () by operating at the level of the GM parameters [10]. Note, however, that (or any other PHD) is not actually computed by the proposed algorithm.

Just as the flooding scheme, this scheme suffers from the fact that the fused GM parameter set at sensor , , is much larger than the original GM parameter set . Therefore, we apply mixture reduction [40, 41, 10] to , resulting in a reduced GM parameter set , where is some reduced index set. The GM-PHD corresponding to , i.e.,

 DGM[1]s,k(x)≜∑ℓ∈L[1]s,kΩ[1]s,k,ℓN(x;μ[1]s,k,ℓ,Σ[1]s,k,ℓ),\vspace−1.8mm (24)

is then only an approximation of . Mixture reduction usually consists of merging GCs that are “close” with respect to an appropriate metric, and pruning GCs with small weights. In our case, the weights are not small because they survived the thresholding performed in Section V, and thus we only perform a merging operation.

These union and merging operations are repeated in all the further iterations. In iteration , sensor broadcasts to its neighbors the set and receives their sets , . It then scales each GM weight , , with the corresponding consensus weight . This results in the “scaled GM parameter sets”

 G[i−1](α)s,r,k≜{(Ω[i−1](α)s,r,k,ℓ,μ[i−1]r,k,ℓ,Σ[i−1]r,k,ℓ)}ℓ∈L[i−1]r,k,r∈{s}∪Ss,

with . Let denote the GM-PHD corresponding to the union of all these GM parameter sets, , i.e.,

 DGM[i−1],∪s,k(x)≜∑r∈{s}∪Ss∑ℓ∈L[i−1]r,kΩ[i−1](α)s,r,k,ℓN(x;μ[i−1]r,k,ℓ,Σ[i−1]r,k,ℓ).\vspace−1.7mm

Using (24) with obvious modifications, i.e., , we obtain (cf. (23))

 (25)

Hence, we have emulated the GM-PHD average consensus iteration (22) by operating at the level of the GM parameters. Finally, a merging step reduces to a smaller GM parameter set

The GM-PHD corresponding to , i.e.,

 DGM[i]s,k(x)≜∑ℓ∈L[i]s,kΩ[i]s,k,ℓN(x;μ[i]s,k,ℓ,Σ[i]s,k,ℓ),\vspace−.5mm (26)

approximates in (25). The recursion described above is initialized with .

Thus, after iterations, we have converted the original local GM parameter set into a fused GM parameter set that approximately emulates average consensus iterations (22). The choice of will be discussed in Section IX-A. In conclusion, we have developed an approximate implementation of the GM-PHD average consensus scheme (22) that operates at the level of the GM parameters. Note that here—in contrast to the distributed flooding scheme discussed in Section VI-A—the iterated GM parameter sets do not systematically grow with progressing iteration . Furthermore, our experimental results reported in Section X suggest that the proposed GM average consensus scheme with GC merging can outperform the GM flooding scheme in terms of tracking accuracy.

## Vii IS Method for GM–Particles Conversion

The dissemination/fusion schemes discussed in the previous section effectively provide each sensor with a fused GM-PHD , which is given by (20) if the GM flooding scheme of Section VI-A is used and by (26) (with replaced by ) if the GM average consensus scheme of Section VI-B is used. (We say “effectively” because is not actually calculated.) In what follows, we will denote by

 G[I]s,k≜{(Ω[I]s,k,ℓ,μ[I]s,k,ℓ,Σ[I]s,k,ℓ)}ℓ∈L[I]s,k\vspace−.5mm (27)

the set of GM parameters involved in , i.e., we have

 DGM[I]s,k(x)=∑ℓ∈L[I]s,kΩ[I]s,k,ℓN(x;μ[I]s,k,ℓ,Σ[I]s,k,ℓ).\vspace−1.3mm (28)

Here, in the case of GM flooding, is obtained from in (19) by scaling all the weights in with the factor ; this accounts for the factor in (20).

In order to use the fused GM-PHD in the local particle-PHD filter at sensor , it is necessary to find a particle representation of . The standard method is to sample directly from . However, we here propose a methodbased on the importance sampling (IS) principle [26, Ch. 3.3], which will be seen in Section IX-A to enable a parallelization of filtering and fusion operations. We start by recalling from Section III that the local PHD filter propagates over time a weighted particle set providing an approximate representation of . Let us now consider an alternative particle representation of using a uniformly weighted particle set