# On a representation of partially-distinguishable populations

Jeremie Houssineau (Department of Statistics and Applied Probability, National University of Singapore, Singapore, 117546) and Daniel E. Clark (School of Engineering and Physical Sciences, Heriot-Watt University, Edinburgh, EH14 4AS, UK)

###### Abstract

A way of representing heterogeneous stochastic populations that are composed of sub-populations with different levels of distinguishability is introduced, together with an analysis of its properties. In particular, it is demonstrated that any instance of this representation where individuals are independent can be related to a point process on the set of probability measures on the individual state space. The introduction of the proposed representation is fully constructive, which guarantees the meaningfulness of the approach.

###### Keywords

Stochastic population, Point process

###### MSC 2010

60A10, 62C10

## Introduction

Stochastic populations such as probabilistic multi-object systems are of central importance in many areas, including systems biology (Chenouard2014), robotics (Mullane2011) and computer vision (Okuma2004). In some cases, the sole interest is in their global characteristics, such as when only their cardinality is studied, e.g. in population dynamics (Hofbauer1998; Turchin2003), or when spatial information is meant to be unspecific, as with point processes (Geyer1994; Green1995). In other cases, all the individuals of the population can be clearly identified and the way the population is represented becomes less fundamental, since the problem can be recast as a collection of individual-wise representations. Except in these specific cases, the representation of stochastic populations remains mostly unexplored, in spite of their ubiquity. In general, the population might be only partially distinguishable, i.e. some individuals might be identified while another sub-population might only be described by unspecific representations, e.g. by its cardinality. The objective of this article is to find a natural way of representing these stochastic partially-distinguishable populations. The underlying motivation is that a natural representation should be useful not only in theory, when expressing different results and properties, but also in practice, when devising approximation algorithms for the induced probability laws. Figure 1 shows examples of samples drawn from distributions with different degrees of distinguishability, hinting at the possible drawbacks of using indistinguishable representations for distinguishable populations.

###### Example 1.

In the context of Bayesian data assimilation for stochastic populations, sub-populations that have never been observed are often well modelled by indistinguishable representations. For instance, if individuals live on the real line and new individuals are known to appear at either of two distinct points, then it is not unnatural that several individuals might appear at the same one of these points. However, if one individual has been observed at the first point and another one at the second, then using a representation that allows these two individuals to both be at the same point would often be inappropriate. Overall, different sub-populations require different levels of distinguishability and a suitable stochastic representation should be able to deal with this modelling aspect.

One of the main application areas for the type of representation introduced in this article is the engineering discipline called multi-target tracking (Blackman1986; Mahler2007); see e.g. (Caron2011; Pace2013; DelMoral2015) or (DelMoral2013, Chapt. 6) for a point-process-based formulation and analysis. In this context, the limitation of point processes lies in their inability to represent and propagate specific information about targets, or tracks. Since this is often the objective, heuristics are usually applied to the output of the point-process-based algorithm in order to produce tracks. However, since tracks themselves are often not only displayed to the operator but also used for further processing steps, the addition of an ad-hoc step at this stage of the algorithm prevents these steps from being performed in a principled and integrated way. For instance, specific data assimilation (Houssineau2016_dataAssimilation) based on the proposed representation can be easily extended to include classification (Pailhas2016) or sensor management (Delande2014_SensorControl). Existing applications of the proposed approach include space situational awareness (Delande2016_Space), harbour surveillance (Pailhas2016) as well as multi-target tracking from radar data (Houssineau2015_SMC).

In order to build a natural representation of stochastic populations, it is convenient to start with an idealised case in which the notion of partial distinguishability can be formalised, which is done in Section 1. The concepts and notations introduced in Section 1 are then used as a basis for the introduction of a full representation in Section 2. An alternative formulation is finally introduced in Section 3, where simplifications are made in order to make the representation more practical.

Throughout the article, random variables will be implicitly assumed to be defined on a common complete probability space. For any set $A$, denote by $\Pi(A)$ the set of equivalence relations on $A$, and denote by $O$ and $I$ the minimal and maximal equivalence relations respectively, i.e. $x \,O\, x'$ is false and $x \,I\, x'$ is true for any distinct $x, x' \in A$.

## 1 Describing a population

We consider a representative set , i.e. a set in which individuals of interest can be uniquely characterised. Because of this characterisation, a population, which can be intuitively understood as a collection of individuals, is formally defined as a subset of . The set of all possible populations is then defined as the set of all countable subsets of . In this way, the set is itself a representative set for populations.

An important aspect is that in practice, a more realistic set needs to be considered for the representation of individuals. This set is seen as being a projection of the set and we define as the associated projection map. Such a simplification is required for most applications since the full characterisation of an individual is not usually considered accessible. For instance, the observation might not account for the shape, mass or composition of a given solid, so that only its centre of mass/volume can be inferred. One of the consequences of this simplified representation is that individuals might have the same state in . In the context of point process theory (Daley2003), processes that never have two individuals at the same point are called simple. Borrowing this term, we require that representations should not rely on simplicity in in general. A practical example of the meaning of the sets introduced so far is given in Figure 2.

The ability to obtain specific information, or observability, might not be sufficient to tell some of the individuals apart. Individuals that are in this situation are said to be strongly indistinguishable, i.e. they cannot be distinguished in their current states even with the best possible sources of information. Strongly indistinguishable individuals can be related through a relation $\tau$ defined as follows: two individuals $x$ and $x'$ are strongly indistinguishable if and only if $x \,\tau\, x'$ holds. The set

$$\mathbb{Y} \doteq \{ (\mathcal{X},\tau) \;\text{s.t.}\; \mathcal{X} \in \mathbb{X},\ \tau \in \Pi(\mathcal{X}) \}$$

is introduced in order to represent partially-indistinguishable populations. When individuals are not strongly indistinguishable, they are said to be weakly distinguishable. Even when some individuals are weakly distinguishable, it could happen that the available information is not sufficient to tell them apart. We then say that these individuals are weakly indistinguishable. This concept clearly depends on the knowledge about the population and might evolve if additional information is made available. To sum up, strong indistinguishability is a state-dependent concept while weak indistinguishability is a probabilistic concept.

The description of the uncertainty on a given population can be performed by associating every individual in with a random variable on . This solution, however, does not describe the relation between the different distributions related to different individuals, in particular with strongly indistinguishable ones. A global representation of uncertainty is thus sought. One of the most usual ways of describing multiple spatial entities as a whole is given by the theory of point processes. However, this theory is built on the following principle:

“We talk of the probability of finding a given number of points in a set : we do not give names to the individual points and ask for the probability of finding specified individuals within the set . Nevertheless, this latter approach is quite possible (indeed, natural) in contexts where the points refer to individual particles, animals, plants and so on.” (Daley2003, p. 124)

Yet, we wish to model the partially-indistinguishable nature of the individuals without assuming that they are all strongly indistinguishable, i.e. without assuming that $\tau = I$. The study of populations composed of indistinguishable individuals is already challenging due to the difficulty in finding a consistent way of describing multiple individuals within a single stochastic object. Examples of questions arising from this issue are: Should the individuals be ordered even though there is no natural way of defining the order? Should the individuals be assumed to be represented at different points of the state space in order to enable a set representation? Should the population be assumed finite in order to proceed to the analysis? There are different ways of answering these questions and each way has to be proved equivalent in some sense to the others (Moyal1962; Macchi1975). The representation of partially indistinguishable populations raises many additional and equally difficult questions. Alternative representations of stochastic populations have to be found in order to tackle this issue.

## 2 Representing a population

Based on the set of all possible populations and on the set on which all individuals are represented, we describe a versatile way of introducing randomness in the states of the individuals in which conveys the concept of strong indistinguishability. This is first achieved for a fixed population in Section 2.1 before tackling the full generality of the problem in Section 2.2.

### 2.1 For a given population

We assume that the set can be written as the union of a Euclidean space and an isolated point . The latter can be viewed as an empty state and is used to provide an image to individuals that cannot be represented on , such as individuals that are outside of the zone of interest.

#### 2.1.1 Construction

Let be a partially-distinguishable population of interest, i.e. a set of individuals characterised in that is equipped with an equivalence relation connecting strongly indistinguishable individuals. The objective is to include the relation between the individuals of in the probabilistic modelling of the population. We first introduce the set

$$F_{\mathcal{Y}} \doteq \{ f : \mathcal{X} \to \mathbf{X} \;\text{s.t.}\; |f^{-1}[\mathbf{X}^{\bullet}]| < \infty \}$$

that is composed of mappings that map finitely many individuals to . This condition facilitates the definition of various types of operations on individuals but can be relaxed without inducing major changes in the following results. The set is used as a way of indexing the states in and the actual knowledge of the full individual characteristics is not used. Otherwise, the state of an individual could be directly obtained from the projection . At the end of this section, we will derive a formulation that ensures that cannot be used to hold information on the state of individuals.

A suitable $\sigma$-algebra of subsets of , denoted , can be introduced as follows. There is a natural topology on that is generated by open sets of the form

$$A = \{ f \;\text{s.t.}\; (\forall x \in \mathcal{X})\ f(x) \in A_x \},$$

where is an open set in that differs from for finitely many only. Note that is indeed open as an isolated point. This topology is denoted and is defined as the corresponding Borel $\sigma$-algebra. Representations of the population can thus be given as random variables in the measurable space of mappings .

A random variable on represents all the individuals in on and is equivalent to a collection of possibly correlated random variables, since indistinguishability has not been taken into account yet.

When two individuals in are strongly indistinguishable, we expect that individual characterisations would not be available, even when considering a specific outcome . Random variables on that do not respect this constraint would be mistakenly distinguishing individuals that are strongly indistinguishable, as shown in Figure 3. The space is then not fully satisfactory as it does not ensure that indistinguishable individuals are well represented.

A natural way of circumventing this incomplete representation of the structured population is to make the $\sigma$-algebra coarser by “gluing” together functions that distinguish indistinguishable individuals.

###### Example 2.

Suppose that , i.e. is made of two indistinguishable individuals so that . Additionally suppose that , i.e. there are only two possible states for the individuals and , and assume that is also representative so that and must have different states in . There are only two different distinguishable outcomes in , defined by their respective graphs as and . To ensure that the individuals are indistinguishable, one can glue together these two symmetrical outcomes and define a new set of functions as (note the additional curly brackets). There is now only one outcome, which does not allow for distinguishing the individuals and , as required.
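
The gluing step of Example 2 can be reproduced numerically. The short Python sketch below is an illustration only: the individual names `x1`, `x2` and the state names `a`, `b` are hypothetical placeholders, and each function is encoded by its graph, i.e. a tuple of (individual, state) pairs.

```python
from itertools import permutations

individuals = ("x1", "x2")   # hypothetical names for the two individuals
states = ("a", "b")          # hypothetical names for the two states

# Injective maps individuals -> states, each encoded by its graph.
outcomes = [tuple(zip(individuals, image)) for image in permutations(states)]

def orbit(graph):
    """All graphs obtained by permuting the (indistinguishable) individuals."""
    f = dict(graph)
    return frozenset(
        tuple((x, f[s]) for x, s in zip(individuals, sigma))
        for sigma in permutations(individuals)
    )

# Gluing: outcomes that differ only by a permutation are merged into one set.
glued = {orbit(g) for g in outcomes}
print(len(outcomes), len(glued))  # prints "2 1"
```

The single element of `glued` is a set containing both graphs, which mirrors the additional curly brackets mentioned in the example.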

Following Example 2 and denoting the subgroup of permutations on agreeing with the equivalence relation , i.e. the ones permuting indistinguishable individuals only, we introduce a binary relation on as follows.

###### Definition 1.

A binary relation on is said to be induced by the equivalence relation if it holds that

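$$(\forall f, f' \in F_{\mathcal{Y}}) \quad f \,\rho\, f' \Leftrightarrow \exists \sigma \in \mathrm{Sym}(\mathcal{X},\tau)\ (f = f' \circ \sigma). \tag{1}$$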

Intuitively, elements of are related through a binary relation whenever they only differ by a permutation of indistinguishable individuals. A representation of the elements of the quotient space is given in Figure 4. This binary relation can be proved to have additional properties.

###### Proposition 1.

The equivalence relation induces a unique binary relation on , and this binary relation is an equivalence relation.

The proof of Proposition 1 relies mostly on the group nature of , as a subgroup of . Consequently, only the specific group properties of will be invoked when proving that the induced binary relation is an equivalence relation.

###### Proof.

(Uniqueness) Let and be two binary relations induced by . We want to prove that holds for any . Let be the two permutations in satisfying (1) for and respectively. There exists in such that , proving the uniqueness.
(Reflexivity) The identity is in .
(Symmetry) Existence of an inverse element in .
(Transitivity) Closure of . ∎
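
Proposition 1 can be checked by brute force on a small case. The sketch below is only a sanity check under stated assumptions, not part of the proof: the population `(0, 1, 2)`, the partition `classes` standing for the relation of strong indistinguishability, and the two-point state space are all hypothetical choices, and a function is encoded as the tuple of its values.

```python
from itertools import permutations, product

population = (0, 1, 2)              # hypothetical individuals
classes = [{0, 1}, {2}]             # tau: 0 and 1 are strongly indistinguishable
states = ("a", "b")                 # hypothetical finite state space

def same_class(x, y):
    return any(x in c and y in c for c in classes)

# Sym(X, tau): permutations moving individuals only within their own class.
sym = [s for s in permutations(population)
       if all(same_class(x, s[x]) for x in population)]

# A function f is encoded as the tuple (f(0), f(1), f(2)).
functions = list(product(states, repeat=len(population)))

def compose(f, s):
    """Return f o s, i.e. the function x -> f(s(x))."""
    return tuple(f[s[x]] for x in population)

def rho(f, g):
    """Relation (1): f = g o sigma for some sigma in Sym(X, tau)."""
    return any(f == compose(g, s) for s in sym)

reflexive = all(rho(f, f) for f in functions)
symmetric = all(rho(g, f) for f in functions for g in functions if rho(f, g))
transitive = all(rho(f, h) for f in functions for g in functions
                 for h in functions if rho(f, g) and rho(g, h))
quotient = {frozenset(compose(f, s) for s in sym) for f in functions}
print(reflexive, symmetric, transitive, len(functions), len(quotient))
```

In this toy case the 8 functions collapse to 6 equivalence classes, since only the values taken on the pair of indistinguishable individuals can be swapped.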

Let denote the unique equivalence relation on induced by and let be the quotient map from to induced by . We introduce a $\sigma$-algebra of subsets of , denoted , which does not allow for distinguishing strongly indistinguishable individuals: let denote the initial topology on induced by the quotient map . We can verify that holds, meaning that there are fewer open subsets in when compared to . The Borel $\sigma$-algebra induced by is denoted . A reference measure on can be easily deduced from the reference measure on , e.g. the Lebesgue measure. Random variables on characterise subsets of indistinguishable individuals rather than individuals themselves, as required.

#### 2.1.2 Independence and weak indistinguishability

Now equipped with suitable spaces for considering the representation of partially-indistinguishable populations, we study the properties of probability measures on . Since populations have an intrinsic multivariate nature, it is natural to introduce a notion of independence for probability measures on as in the following definition.

###### Definition 2.

The individuals in are said to be independent if the law on verifies

 (2)

for any , where $\{p_x\}_{x \in \mathcal{X}}$ is a family of probability measures on .

The expression (2) of Definition 2 is a convolution of measures based on the operation of creating a function in out of a value in for each of the individuals in . This notion of independence will be useful as an example of concepts and operations that will be defined in the general case.

The notion of weak indistinguishability that was introduced in Section 1 has not been translated into practical terms yet. As opposed to strongly indistinguishable individuals that are bound through the events in , it just happens that there is no specific knowledge about weakly indistinguishable individuals. As a result, weak indistinguishability is a fully probabilistic concept. In order to formally define it, we introduce a mapping from into itself for any given defined by

$$T_{\sigma} : f \mapsto f \circ \sigma. \tag{3}$$

Mappings of this form describe the changes induced by swapping individuals. They are therefore suitable for expressing properties of symmetry for probability measures as in the following definition.

###### Definition 3.

Let be a probability measure on . The relation of weak indistinguishability induced by on is defined as

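$$\eta = \sup \{ \eta' \in \Pi(\mathcal{X}) \;\text{s.t.}\; (\forall \sigma \in \mathrm{Sym}(\mathcal{X},\eta'))\ P = (T_{\sigma})_{*} P \},$$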

where is the pushforward of by the measurable mapping .

The relation of weak indistinguishability is an equivalence relation by definition. Since is only a partially ordered set, the greatest element of a given subset might not exist, but it is necessarily unique if it exists. We can show that the relation of weak indistinguishability exists by verifying that any element in the considered subset can only identify fewer symmetries than . In other words, denoting the partition of induced by , there exist at least two subsets in whose union is a subset of , so that holds for any in the subset of of interest. Some of the properties of the relation of weak indistinguishability are given here using the notations of Definition 3.

###### Proposition 2.

It holds that $\tau \leq \eta$.

###### Proof.

Sets in the $\sigma$-algebra of subsets of do not allow for distinguishing individuals related by . Thus, for any given and , it holds that for any so that is always true by construction. As a result, the equivalence relation is always in the set of which is the greatest element. ∎

###### Example 3.

Reusing the notations of Definition 2 and assuming that the individuals in are independent under and that is the relation of weak indistinguishability induced by , then for any pair of individuals in , it holds that

$$(x \,\eta\, x') \Leftrightarrow (p_x = p_{x'}).$$
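
The statement of Example 3 can be illustrated on a finite case. The sketch below is a hedged illustration: the individuals, the two-point state space and the individual laws in `marginals` are hypothetical. It also assumes a finite population, in which case the permutations agreeing with a partition are generated by transpositions within its classes, so that the relation of Definition 3 can be found by testing the invariance of the law under each transposition.

```python
from itertools import product

population = ("x1", "x2", "x3")       # hypothetical individuals
states = (0, 1)                        # hypothetical finite state space
marginals = {                          # hypothetical individual laws p_x
    "x1": {0: 0.5, 1: 0.5},
    "x2": {0: 0.5, 1: 0.5},
    "x3": {0: 0.25, 1: 0.75},
}

# Joint law of independent individuals; f is encoded as (f(x1), f(x2), f(x3)).
law = {}
for f in product(states, repeat=len(population)):
    prob = 1.0
    for x, s in zip(population, f):
        prob *= marginals[x][s]
    law[f] = prob

def swap(f, i, j):
    """f o sigma for the transposition sigma exchanging positions i and j."""
    g = list(f)
    g[i], g[j] = g[j], g[i]
    return tuple(g)

def invariant_under_swap(law, i, j, tol=1e-12):
    """Check P = (T_sigma)_* P for the transposition of positions i and j."""
    return all(abs(p - law[swap(f, i, j)]) <= tol for f, p in law.items())

n = len(population)
eta = {(population[i], population[j])
       for i in range(n) for j in range(n)
       if i == j or invariant_under_swap(law, i, j)}
same_law = {(x, y) for x in population for y in population
            if marginals[x] == marginals[y]}
print(eta == same_law)  # prints "True"
```

Here `x1` and `x2` share the same individual law and are found to be weakly indistinguishable, whereas `x3` is weakly distinguishable from both.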

The representation of strongly indistinguishable individuals by random variables on can be considered as satisfactory. Yet, the true population has been assumed to be known so far; even though it is only used as an indexing set, this cannot be assumed in general. It is thus necessary to find a way of dealing with unknown populations.

### 2.2 Stochastic representation

It is natural to reuse the same mechanisms as before to bypass the necessity of knowing the true population when describing it, i.e. by defining an appropriate equivalence relation and working on the $\sigma$-algebras induced by the corresponding quotient spaces. However, we will see that the approach that seems the most natural at first does not lead to a satisfactory result. Nonetheless, this approach is detailed here as it motivates the introduction of a more advanced construction.

#### 2.2.1 Naive attempt

The most natural way to extend the results of the previous section to unknown populations is to consider the union of the sets and to simplify it using an equivalence relation as previously. Let the set be defined as

$$F \doteq \bigcup_{\mathcal{Y} \in \mathbb{Y}} F_{\mathcal{Y}}.$$

###### Definition 4.

Let be two populations equipped with a relation of strong indistinguishability defined via and . The binary relations on and on are defined as follows

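$$\mathcal{X} \sim \mathcal{X}' \Leftrightarrow |\mathcal{X}| = |\mathcal{X}'| \qquad \text{and} \qquad \mathcal{Y} \approx \mathcal{Y}' \Leftrightarrow \exists \nu : \mathcal{Y} \overset{\sim}{\longleftrightarrow} \mathcal{Y}',$$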

where indicates a relation-preserving bijection. Also, for any and any , let the binary relation on be defined as

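$$f \,\rho^{*}\, f' \Leftrightarrow \exists \nu : \mathcal{Y} \overset{\sim}{\longleftrightarrow} \mathcal{Y}'\ (f = f' \circ \nu). \tag{4}$$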

It is easy to prove that the binary relations , and on the respective sets , and are equivalence relations. Note that the relation on can be equivalently defined as

$$\mathcal{X} \sim \mathcal{X}' \Leftrightarrow \exists \nu : \mathcal{X} \leftrightarrow \mathcal{X}',$$

where indicates a bijection. This alternative definition highlights the parallel with the equivalence relations and also introduced in Definition 4.

Equivalence classes in do not allow for distinguishing functions that give the same values in and have different domains. As before, an appropriate $\sigma$-algebra of subsets of can be deduced from the quotient space . A first clue that the equivalence relation is over-simplifying the space is that

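$$\mathrm{Sym}(\mathcal{X},\tau) \subseteq \{ \nu \;\text{s.t.}\; \nu : (\mathcal{X},\tau) \overset{\sim}{\longleftrightarrow} (\mathcal{X},\tau) \},$$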

for any , with the inclusion being strict for and ; if for instance then will make all individuals indistinguishable although they were initially weakly distinguishable . We can still verify that the space is suitable in cases where all the individuals are strongly indistinguishable by showing the relation between the subset

$$F_{I} \doteq \bigcup_{\mathcal{X} \in \mathbb{X}} F_{(\mathcal{X},I)}$$

of endowed with the $\sigma$-algebra induced by and the set of integer-valued measures, or counting measures, on equipped with its Borel $\sigma$-algebra . Such a relation will ensure that random variables on will be equivalent to point processes on as expected. In the next theorem, will denote the domain of a given function .

###### Theorem 1.

The mapping defined as

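$$\begin{aligned} \xi : F_{I} &\to \mathbf{N}(\mathbf{X}) \\ f &\mapsto \sum_{x \in \mathrm{dom}(f)} \delta_{f(x)}, \end{aligned}$$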

is -bi-measurable.

###### Proof.

We show that is measurable and then that for any :

1. A generating family for the $\sigma$-algebra of subsets of is found to be made of subsets of the form

$$C = \{ \mu \in \mathbf{N}(\mathbf{X}) \;\text{s.t.}\; \mu(B) = i \},$$

for some and some . The inverse image of by the mapping is of the form

$$\xi^{-1}[C] = \Big\{ f \in F_{I} \;\text{s.t.}\; \sum_{x \in \mathrm{dom}(f)} \mathbf{1}_{B}(f(x)) = i \Big\}.$$

To verify that , we check that

$$(\forall f \in \xi^{-1}[C],\ \forall f' \in F_{I}) \quad f \,\rho^{*}\, f' \Rightarrow f' \in \xi^{-1}[C].$$

By definition we have that

$$f \,\rho^{*}\, f' \Leftrightarrow \exists \nu : \mathrm{dom}(f) \leftrightarrow \mathrm{dom}(f')\ (f = f' \circ \nu),$$

so that

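$$\sum_{x \in \mathrm{dom}(f)} \mathbf{1}_{B}(f(x)) = \sum_{x \in \mathrm{dom}(f)} \mathbf{1}_{B}(f'(\nu(x))) = \sum_{x \in \mathrm{dom}(f')} \mathbf{1}_{B}(f'(x)) = i,$$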

and as required.

2. To identify a generating family for the $\sigma$-algebra , consider a subset of the form

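$$A_{X} = \{ f \in F_{I} \;\text{s.t.}\; \mathrm{dom}(f) \supseteq X,\ \forall x \in \mathrm{dom}(f)\ (x \in X \Leftrightarrow f(x) \in B) \},$$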

for some and a Borel subset of , which includes all the functions based on populations having as a sub-population that map the individuals in into and all the other individuals outside of . Then, enlarge the subset by all the functions that are related by to any function in it, that is

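$$C = \bigcup_{f \in A_{X}} [f] = \{ f \in F_{I} \;\text{s.t.}\; \exists X' \subseteq \mathrm{dom}(f)\ (\exists \nu : X \leftrightarrow X'\ (f \in A_{\nu[X]})) \}$$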

which, denoting , can also be expressed as

$$C = \Big\{ f \in F_{I} \;\text{s.t.}\; \sum_{x \in \mathrm{dom}(f)} \mathbf{1}_{B}(f(x)) = i \Big\}.$$

It follows easily that

$$\xi[C] = \{ \mu \in \mathbf{N}(\mathbf{X}) \;\text{s.t.}\; \mu(B) = i \} \in \mathcal{N}(\mathbf{X}).$$

We conclude from 1 and 2 that is bi-measurable. ∎
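
The key step of the proof, namely that the counting measure built from a function does not change when the individuals are relabelled through a bijection, can be illustrated as follows. This is a minimal sketch with hypothetical individuals and states; the counting measure is stored as a `collections.Counter` and the helper names `xi` and `count_in` are introduced here for illustration only.

```python
from collections import Counter

def xi(f):
    """Counting measure sum_{x in dom(f)} delta_{f(x)}, stored as a Counter."""
    return Counter(f.values())

def count_in(mu, B):
    """mu(B) for a counting measure mu and a subset B of the state space."""
    return sum(n for s, n in mu.items() if s in B)

# Hypothetical realisation: three individuals with states on the real line.
f = {"x1": 0.3, "x2": 0.3, "x3": 1.7}

# The same states carried by another population, related to f by a bijection
# nu from the new individuals to the original ones.
nu = {"y1": "x3", "y2": "x1", "y3": "x2"}
f_prime = {y: f[x] for y, x in nu.items()}   # f' = f o nu

B = {0.3}
print(xi(f) == xi(f_prime), count_in(xi(f), B))  # prints "True 2"
```

Note that two individuals share the state `0.3` here, so the resulting counting measure is not simple, in line with the remark on simplicity made in Section 1.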

Theorem 1 shows that a stochastic population where all individuals are strongly indistinguishable is essentially equivalent to a point process. Obtaining the full equivalence would require defining on , in which case it would become an isomorphism. This demonstrates that stochastic representations adequately model strongly-indistinguishable populations. Yet, the objective is to be able to represent partially-distinguishable populations and therefore events about specific individuals should also be in the $\sigma$-algebra . However, considering a random variable on , it appears that there is no way of recognising individuals between different realisations, even when these realisations relate to the same population size and structure. In other words, this approach makes all individuals indistinguishable as there would be no way of assessing events based on specific individuals without a means of indexing the distinguished ones.

###### Example 4.

Considering, as in Example 2, a representative set as a state space, assuming that , i.e. that populations are made of exactly two individuals, and supposing that individuals are always distinguishable, we obtain that

$$F = \{ f : \{x,x'\} \to \{x,x'\} \;\text{s.t.}\; x, x' \in \mathbf{X}_{a},\ f(x) \neq f(x') \}.$$

We can check that holds for any , so that and is a singleton that can be seen as equivalent to the counting measure , so that the realisations for the individuals and cannot be distinguished from a random variable on .

#### 2.2.2 Second attempt

Since weak indistinguishability is a probabilistic concept, an alternative is to work directly on the set

$$\mathbf{P}_{F} \doteq \bigcup_{\mathcal{Y} \in \mathbb{Y}} \mathbf{P}(F_{\mathcal{Y}}),$$

where denotes the set of probability measures on a given set with its underlying $\sigma$-algebra. It is then possible to simplify the set while preserving the relations of indistinguishability between individuals. For any and in and any bijection between and , we introduce the mapping defined by

$$T_{\nu} : f \mapsto f \circ \nu.$$

The mapping defined in (3) can be seen as a special case when .

###### Definition 5.

For any populations , any and any , let the binary relation on be defined as

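$$P \,\rho\, P' \Leftrightarrow \exists \nu : \mathcal{Y} \overset{\sim}{\longleftrightarrow} \mathcal{Y}'\ (P = (T_{\nu})_{*} P'). \tag{6}$$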

Since each probability measure in is defined on a single population in , the latter can be recovered and will be denoted or . If individuals are independent under a given probability measure then the equivalence class of probability measures related to via is found to be

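$$[P] = \{ P' \;\text{s.t.}\; \exists \nu : \mathcal{Y}_{P} \overset{\sim}{\longleftrightarrow} \mathcal{Y}_{P'}\ (\forall x \in \mathcal{X}_{P}\ (p_{x} = p'_{\nu(x)})) \}.$$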

This result highlights the structure of the equivalence relation and of the mapping in (6).

Note that the definition of does not depend on the relation of weak indistinguishability. Indeed, weak indistinguishability is more an observed property of a representation rather than a building block that would impose some sort of structure on the mathematical construction of it.

A given equivalence class in allows for describing the randomness of a population of a given size and structure without knowing the actual population state in as required. Such an equivalence class is referred to as a population representation or simply as a representation, and when individuals are independent, the induced probability measures on are called individual representations.

###### Example 5.

Considering again the case of Example 4 it follows that

$$\mathbf{P}_{F} = \{ P \in \mathbf{P}(F_{(\{x,y\},O)}) \;\text{s.t.}\; x, y \in \mathbf{X}_{a} \}.$$

Focusing on the subset of for which individuals are weakly distinguishable and independent, for the sake of simplicity, we find that

$$P \,\rho\, P' \Leftrightarrow \exists \nu : \{x,y\} \leftrightarrow \{x',y'\}\ ((p_{x} = p'_{\nu(x)}) \wedge (p_{y} = p'_{\nu(y)})),$$

for any , where and . In this setup, a point in , which is in fact an equivalence class, corresponds to the configuration where the uncertainty about one individual is described by a given law and the uncertainty about the other individual is described by a given law , the individuals being weakly indistinguishable if . In other words, individuals are labelled by the probability measures describing the uncertainty about them, these labels being shared by indistinguishable individuals by definition.
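
In the setting of Example 5, with finitely many independent individuals and no strong indistinguishability, two laws are related whenever some bijection between the populations matches the individual laws, which amounts to comparing the multisets of individual laws. The sketch below illustrates this under stated assumptions: the individual names and the discrete laws are hypothetical, and the helpers `freeze` and `representation` are introduced for illustration only.

```python
from collections import Counter

def freeze(p):
    """Hashable encoding of a discrete individual law given as {state: prob}."""
    return tuple(sorted(p.items()))

def representation(individual_laws):
    """Multiset of individual laws, forgetting the names of the individuals."""
    return Counter(freeze(p) for p in individual_laws.values())

# Hypothetical laws for two independent, weakly distinguishable individuals.
P = {"x": {0: 0.9, 1: 0.1}, "y": {0: 0.2, 1: 0.8}}
# The same laws carried by differently named individuals, in the other order.
P_prime = {"x'": {0: 0.2, 1: 0.8}, "y'": {0: 0.9, 1: 0.1}}
# A different configuration: both individuals share the same law.
Q = {"u": {0: 0.9, 1: 0.1}, "v": {0: 0.9, 1: 0.1}}

print(representation(P) == representation(P_prime))  # prints "True"
print(representation(P) == representation(Q))        # prints "False"
```

The multiset returned for `Q` contains a single law with multiplicity two, which corresponds to the case where the two individuals are weakly indistinguishable.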

The set is not however a full answer to the question of the representation of populations since elements of it correspond to a given size and a given structure, i.e. a given type of strong indistinguishability. Yet, the size and structure of a population are generally unknown and possibly random, and there might be second-order uncertainties on the probability measures in themselves. Indeed, in general, there are many possible distinct configurations for each given cardinality and structure, which can only be represented by random probability measures for each choice of . The set also has to be equipped with a suitable $\sigma$-algebra: even when a topology on is available, the corresponding topology on would not be suitable for our purpose since it would allow for distinguishing representations based on a given population . Instead, we consider the initial topology induced by the quotient map of and we denote the corresponding Borel $\sigma$-algebra. There is no natural reference measure on , but we assume that such a measure is given case by case via a countable subset or a parametric family of probability measures. Similarly, the $\sigma$-algebra on is assumed to be induced by the discrete topology on .

A random variable on describes all the uncertainties about the system of interest and is referred to as a stochastic representation. The interpretation of can be made easier by separating its law into a marginal and a conditional as

$$(\forall B \in \mathcal{P}_{F}) \quad P(B) = \mathbb{E}[P(B \mid Y)],$$

where is the random population induced by on , and is a version of the conditional law of given , i.e. the law representing the second-order uncertainties given the size and structure of the population. This separation of the randomness is straightforward but helps to interpret the behaviour of : first a size and a structure are randomly selected for the population, then a probability measure on is drawn, where is any element of , describing the uncertainty about the considered type of population and ensuring that there is no specific knowledge about strongly indistinguishable individuals. Which population has been chosen from is irrelevant since the mapping has to be measurable for any .

###### Remark 1.

The probability only depends on the size of and on the size of the subsets in . For instance, we can evaluate the probability for a realisation of to contain exactly 3 strongly indistinguishable individuals and 2 weakly distinguishable ones; however, we cannot assess the probability of any event regarding the states of these individuals in . Likewise, only allows for evaluating the probability of events about some individuals being represented by some probability measures, for instance, for the 3 strongly indistinguishable individuals to be independent and associated with the individual law and for the 2 weakly distinguishable individuals to be dependent and associated with a joint law (that will be non-symmetrical if they have been distinguished).

### 2.3 Statistics

The stochastic representation is a random element of inducing a random size and a random structure via the randomly selected law. Because the realisations of are probability measures on different spaces, these realisations are not directly summable, yet statistics for some aspects of can be defined. In particular, the th-order moment for the number of individuals, if it exists, is equal to . Similarly, equivalence relations are not summable, yet the th-order moment for the number of strongly indistinguishable sub-populations is found to be equal to and is finite by construction.

By definition, for any , the measurable space excludes events regarding a subset of individuals specifically when this subset does not form a union of elements of . Henceforth, the set of integer-valued measures is equipped with its Borel $\sigma$-algebra .

###### Proposition 3.

Let and let be a subset of , then the mapping defined as

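$$\begin{aligned} T_{X} : F_{\mathcal{Y}} &\to \mathbf{N}(\mathbf{X}) &\qquad& \text{(7a)} \\ f &\mapsto \sum_{x \in X} \delta_{f(x)}, &\qquad& \text{(7b)} \end{aligned}$$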

is -measurable if and only if is an element of a partition that is coarser than .

###### Proof.

As mentioned before, the $\sigma$-algebra is generated by subsets of the form , for some and some , and it holds that

$$T_{X}^{-1}[C] = \Big\{ f \in F_{\mathcal{Y}} \;\text{s.t.}\; \sum_{x \in X} \mathbf{1}_{B}(f(x)) = i \Big\}.$$

The mapping is measurable if and only if

$$(\forall f \in T_{X}^{-1}[C],\ \forall f' \in F_{\mathcal{Y}}) \quad f \,\rho\, f' \Rightarrow f' \in T_{X}^{-1}[C],$$

which is equivalent to

$$(\forall f \in T_{X}^{-1}[C],\ \forall \sigma \in \mathrm{Sym}(\mathcal{X},\tau)) \quad \sum_{x \in X} \mathbf{1}_{B}(f(\sigma(x))) = i.$$

This last statement holds if and only if for all , i.e. if and only if there exists a partition of containing and being coarser than . ∎
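
The criterion of Proposition 3 can be checked exhaustively on a small case. The sketch below is an illustration under stated assumptions, not a substitute for the proof: the population, the partition `classes` standing for the relation of strong indistinguishability, the two-point state space and the test set `B` are hypothetical. It compares, for every non-empty subset of individuals, the invariance of the count over that subset under permutations of indistinguishable individuals with the property of being a union of classes.

```python
from itertools import combinations, permutations, product

population = (0, 1, 2, 3)            # hypothetical individuals
classes = [{0, 1}, {2}, {3}]         # tau: only 0 and 1 are strongly indistinguishable
states = ("a", "b")                  # hypothetical finite state space
B = {"a"}                            # hypothetical test set in the state space

def same_class(x, y):
    return any(x in c and y in c for c in classes)

# Sym(X, tau): permutations moving individuals only within their own class.
sym = [s for s in permutations(population)
       if all(same_class(x, s[x]) for x in population)]
functions = list(product(states, repeat=len(population)))

def count(f, X):
    """Value on B of the counting measure sum_{x in X} delta_{f(x)}."""
    return sum(1 for x in X if f[x] in B)

def invariant(X):
    """True if the count over X is unchanged when f is replaced by f o sigma."""
    return all(count(f, X) == count(tuple(f[s[x]] for x in population), X)
               for f in functions for s in sym)

def union_of_classes(X):
    """True if every class of tau is either contained in X or disjoint from X."""
    return all(c <= X or not (c & X) for c in classes)

subsets = [set(c) for r in range(1, len(population) + 1)
           for c in combinations(population, r)]
print(all(invariant(X) == union_of_classes(X) for X in subsets))  # prints "True"
```

For instance, the subset `{0}` fails the test because swapping the individuals `0` and `1` can change the number of points in `B`, whereas `{0, 1}` passes.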

One can then study the law for any , which is a point-process distribution. The mapping does not retain information about the structure of the subset but distinguishability will be lost anyway when taking the expectation over . For instance, the mean number of induced point-process laws within a measurable subset of given by indistinguishable sub-populations of fixed size can be expressed as with

$$\chi^{m}_{B}(P) \doteq \sum_{\substack{X \in \mathcal{X}_{P}/\tau_{P} \\ |X| = m}} \mathbf{1}_{B}((T_{X})_{*} P).$$

This is only an example; statistics about sub-populations can be defined based on the characteristics of the induced point-process laws or on the characteristics of the induced point processes themselves, by studying moments corresponding to the probability mass induced by on given subsets of .

###### Example 6.

In the case where all individuals are weakly distinguishable via , that is when holds almost surely, other summable quantities that are induced by the stochastic representation are the number of marginal individual laws in a given measurable subset of as well as the marginal individual laws themselves, since they are defined on the same space. In this specific case, the mapping defined in (7) can be simplified as follows: for any such that , a measurable mapping on is introduced as

$$
T_x : F_Y \to X, \qquad f \mapsto f(x)
$$

defined for any . This mapping can be seen as a special case of (7) with and being used directly instead of . Then, for a given such that and a given , the marginal individual law of by is the pushforward . The quantities of interest are then defined for any measurable subset of and any measurable subset of as

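$$
\begin{aligned}
\chi_B(P) &\doteq \sum_{x \in X_P} \mathbf{1}_B\big((T_x)_* P\big), \\
\bar{\chi}_{B'}(P) &\doteq \sum_{x \in X_P} (T_x)_* P(B').
\end{aligned}
$$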

The corresponding th-order moments, if they exist, are thus and . For instance, is the expected number of individuals with marginal law within while is the probability mass given by the average marginal individual law to the subset .
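The two quantities of Example 6 can be computed explicitly when the state space is finite. The sketch below is a minimal illustration under assumptions that are not in the original text: a single realisation of the stochastic representation is stored as a list of `(probability, configuration)` pairs, the set of laws is given as a predicate, and the names `marginal`, `chi` and `chi_bar` are hypothetical.

```python
from fractions import Fraction

def marginal(P, x):
    """Pushforward of the joint law P by f -> f(x), as a dict state -> mass."""
    law = {}
    for prob, f in P:
        law[f[x]] = law.get(f[x], Fraction(0)) + prob
    return law

def chi(P, population, in_B):
    """Number of individuals whose marginal law belongs to B (a predicate)."""
    return sum(1 for x in population if in_B(marginal(P, x)))

def chi_bar(P, population, B_prime):
    """Sum over individuals of the mass their marginal law gives to B_prime."""
    return sum(
        mass
        for x in population
        for state, mass in marginal(P, x).items()
        if state in B_prime
    )

half = Fraction(1, 2)
# Joint law of two individuals "a" and "b" on the state space {0, 1}.
P = [(half, {"a": 0, "b": 1}), (half, {"a": 1, "b": 1})]
population = ["a", "b"]
# B: laws giving mass at least 3/4 to state 1 (an arbitrary choice).
near_one = lambda law: law.get(1, Fraction(0)) >= Fraction(3, 4)

print(chi(P, population, near_one))  # 1
print(chi_bar(P, population, {1}))   # 3/2
```

Only individual "b" has a marginal law in the chosen set of laws, while the summed marginal mass on state 1 is 1/2 + 1, which is the expected number of individuals in that state under this realisation.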

Many applications are concerned with the study of populations where the individuals are independent. The simplifications induced by such an assumption are important enough to justify studying this case specifically, which is done in the next section.

## 3 Alternative formulation

The objective is now to show that the problem can be formulated on more standard sets than . We focus on one alternative formulation which relies on integer-valued measures; however, other formulations are possible, e.g. with product measures on suitably defined spaces. These types of formulation already exist for point processes as described in Moyal1962 () and Ito2013 (). The following assumption will henceforth be considered:

1. Individuals are independent.

The subset of composed of probability measures for which all individuals are independent is denoted and is equipped with the -algebra induced by . For a given , we denote the population on which is based and the corresponding family of individual laws on .

One of the most direct alternative formulations uses the concept of integer-valued measures or counting measures. A connection between the specific notion of population representation and the more common concept of counting measure is established in the following theorem. Since is a Polish space when equipped with the topology induced by the Prokhorov metric Prokhorov1956 (), the set can also be made Polish Daley2003 () and is therefore equipped with its Borel -algebra denoted . Also, the Borel -algebra of is denoted by .

###### Theorem 2.

The mapping , defined as

$$
\zeta : P \mapsto \sum_{x \in X_P} \delta_{p_x}, \qquad \text{(10)}
$$

is -measurable.

###### Proof.

The Borel -algebra on is the one generated by subsets of the form

$$
C = \big\{ \mu \in N(P(X)) \ \text{s.t.}\ \mu(B) = i \big\},
$$

for some and . The inverse image of by is found to be

$$
\zeta^{-1}[C] = \Big\{ P \in P_F \ \text{s.t.}\ \sum_{x \in X_P} \mathbf{1}_B(p_x) = i \Big\},
$$

where is the population on which is defined and is the indexed family of probability measures on induced by . Following the same route as in the proof of Theorem 1, we can verify that . ∎

Theorem 2 shows that stochastic representations can be expressed as random counting measures, or point processes, on the set of probability measures on , and such will be the understanding in this section. The transformation introduced in this theorem does not preserve the representation of strong indistinguishability and is not bi-measurable as a consequence. This can be seen as beneficial in practice since the observability of strong indistinguishability is often out of reach. The only individuals that are known to be strongly indistinguishable in this case are the ones that are almost surely at the same point of the state space, i.e., the ones whose law is known to be of the form for some .
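The effect of the mapping (10) can be made concrete when individual laws are discrete. The following Python sketch is only an illustration under stated assumptions: individual laws are stored as hashable tuples of probability masses, a population is a dictionary from hypothetical labels to laws, and the names `zeta` and `mass` are introduced for this example.

```python
from collections import Counter

def zeta(individual_laws):
    """Counting measure sum_x delta_{p_x} on the set of individual laws."""
    return Counter(individual_laws.values())

def mass(mu, in_B):
    """Value mu(B) of the counting measure on a set B of laws (a predicate)."""
    return sum(count for law, count in mu.items() if in_B(law))

# Two hypothetical populations that differ only by the labels of individuals.
laws = {"a": (0.5, 0.5), "b": (0.5, 0.5), "c": (1.0, 0.0)}
relabelled = {"u": (1.0, 0.0), "v": (0.5, 0.5), "w": (0.5, 0.5)}

mu = zeta(laws)
print(mu[(0.5, 0.5)])                       # 2
print(mu == zeta(relabelled))               # True
print(mass(mu, lambda law: law[0] >= 0.5))  # 3
```

Only the multiplicity of each law is retained: the labels, and any information on whether the two individuals sharing a law are also strongly indistinguishable, are not recoverable from the counting measure.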

###### Remark 2.

It is possible to relax Assumption 1 to: individuals that are not strongly indistinguishable are independent. In this case, the corresponding subset of stochastic representations could be mapped to , with the set defined as

 X×≐{ψ∞}∪⋃k≥1Xk, (11)

where the point state denoted represents the case where infinitely many individuals are at point . In this configuration, the relation of strong indistinguishability can be preserved, but at the expense of a more complex set of counting measures.

As a point process, can be characterised by its probability-generating functional (p.g.fl.) , defined for any non-negative bounded measurable function on as (Daley2008)

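$$
\begin{aligned}
G(h) &\doteq \mathbb{E}\Big[ \exp\Big( \int \log h(p)\, M(\mathrm{d}p) \Big) \Big] \\
&= p(0) + \sum_{n \geq 1} p(n) \int \prod_{i=1}^{n} h(p_i)\, P(\mathrm{d}(p_1, \dots, p_n) \mid n),
\end{aligned}
$$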

where is defined as for any and is the distribution of on conditioned on , for any .

The simplicity of this integer-valued measure formulation comes from the fact that the state space does not actually appear in the equations, allowing for more flexibility in the expressed quantity. This formulation has been used in Delande2016_DISP () in the context of Bayesian data assimilation for multi-object systems. Yet, it is sometimes necessary to assess events for the stochastic population at the level of the individual state space . A second alternative would be to express the stochastic representation on a product space based on , by collapsing the two levels of probabilistic structures considered so far. This formulation has been used for expressing Bayesian data-assimilation algorithms in Delande2014_SensorControl (); Delande2016_Space () and for deriving approximate solutions Houssineau2016_HISP () used in Pailhas2016 (); Houssineau2015_SMC ().

### 3.1 Parametrised family of probability measures

A special case of interest is found when the support of the considered stochastic representations is within a family of probability measures parametrised by a set for some . This enables some of the properties of stochastic representations to be studied on the simpler set . The following additional assumption is henceforth considered:

1. Stochastic representations take values in a parametrised family of probability measures.

Under Assumption 1, let be an identifiable family of probability measures on encompassing the support of . In this context, identifiability means that whenever the parameters are different. The point process induces a point process on in the following way:

$$
N = (F^{-1})_* M
$$

where is assumed to be bi-measurable. Straightforwardly, any point process on induces a point process on defined as . One of the consequences of this relation is the ability to recover statistics for from the ones for ; for instance, the $n$th-order moment evaluated at the measurable subset of can be recovered via

$$
\mathbb{E}\big[ M(B)^n \big] = \mathbb{E}\big[ N(F^{-1}(B))^n \big].
$$

The p.g.fl. of can now be equivalently expressed as

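$$
\begin{aligned}
G(h) &= \mathbb{E}\Big[ \exp\Big( \int \log h(p_\theta)\, N(\mathrm{d}\theta) \Big) \Big] \\
&= p(0) + \sum_{n \geq 1} p(n) \int \prod_{i=1}^{n} h(p_{\theta_i})\, Q(\mathrm{d}(\theta_1, \dots, \theta_n) \mid n),
\end{aligned}
$$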

where is the distribution of on conditioned on , for any . If the population under consideration is fully distinguishable almost surely then the point process is simple and admits a density w.r.t. the Lebesgue measure on .

### 3.2 Discrete set of probability measures

We also formulate an assumption that is of interest when devising practical estimation algorithms:

1. The set is countable.

As a consequence of Assumption 1, the point process induced by is equivalent to a random variable on the set , with . Then can be expressed as

$$
M = \sum_{\theta \in \Theta} N_\theta\, \delta_{p_\theta}.
$$

Note that verifies for any such that . A realisation of can be denoted with the corresponding realisation of in order to underline the multiplicity of each atom in . The law of on can then be expressed as

$$
P(B) = \int c(\mathrm{d}n)\, \mathbf{1}_B(\mu_n)
$$

for any Borel subset of , where is the induced probability measure on . This way of representing stochastic populations is useful when performing filtering (Houssineau2015, Chapt. 3) since finite collections of individual representations are often available in practice, so that Assumption 1 is verified.

###### Example 7.

If a population is known to contain exactly three individuals and if the only available individual representations for these individuals are the ones in the set $\{p_1, p_2\}$, in which case and elements of can be seen as pairs of integers, then the population representation can be any of the following:

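$$
\mu_{3,0} = 3\delta_{p_1}, \qquad \mu_{2,1} = 2\delta_{p_1} + \delta_{p_2}, \qquad \mu_{1,2} = \delta_{p_1} + 2\delta_{p_2}, \qquad \mu_{0,3} = 3\delta_{p_2}.
$$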

For instance, describes the case where the uncertainty about two of the individuals is described by , so that these two individuals are indistinguishable, and the uncertainty about the other individual is described by . In this form it is not known whether the two weakly indistinguishable individuals are also strongly indistinguishable or not. Any corresponding stochastic representation is a point process on verifying

$$
M\big( P(X) \setminus \{p_1, p_2\} \big) = 0 \quad \text{a.s.},
$$

so that can be simply described by the multiplicities it assigns to probability measures in .
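The list of admissible representations in Example 7 can be enumerated mechanically. The following Python sketch assumes, as read off from the multiplicities in the display above, a total of 3 individuals and 2 available laws; the function name `multiplicities` is hypothetical.

```python
from itertools import product

def multiplicities(total, k):
    """All k-tuples of non-negative integers summing to total."""
    return [n for n in product(range(total + 1), repeat=k) if sum(n) == total]

# Pairs (n_1, n_2): n_1 individuals described by p_1 and n_2 by p_2.
configs = sorted(multiplicities(3, 2), reverse=True)
print(configs)  # [(3, 0), (2, 1), (1, 2), (0, 3)]
```

Each pair indexes one of the four counting measures listed in the example, so a point process supported on these measures is equivalent to a random pair of multiplicities.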

Identifying a countable family of probability measures and additionally assuming that even if enables a simplification of the expression of the p.g.fl. of to

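$$
G(h) = \mathbb{E}\Big[ \prod_{\theta \in \Theta} h(p_\theta)^{N_\theta} \Big] = \sum_{n \in \mathbb{N}^{\Theta}} c(n) \prod_{\theta \in \Theta} h(p_\theta)^{n_\theta},
$$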

which is related to the probability-generating functional of as expected. For instance, if , then , with the probability-generating function of defined as

$$
G'(z_1, \dots, z_k) \doteq \sum_{n \in \mathbb{N}^{\Theta}} c(n)\, z_1^{n_1} \cdots z_k^{n_k}.
$$
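When the law on multiplicities has finite support, the sum defining the p.g.fl. can be evaluated directly. The sketch below is a hedged illustration: the law is stored as a dictionary from multiplicity tuples to probabilities, the test function is represented only through its values at the finitely many laws, and the name `pgfl` as well as the uniform weights are assumptions of this example.

```python
from math import prod

def pgfl(c, h_values):
    """Sum over n of c(n) * prod_theta h(p_theta)^{n_theta}."""
    return sum(
        weight * prod(z ** n_t for z, n_t in zip(h_values, n))
        for n, weight in c.items()
    )

# Hypothetical uniform law on the four configurations of three individuals
# described by two laws p_1 and p_2.
c = {(3, 0): 0.25, (2, 1): 0.25, (1, 2): 0.25, (0, 3): 0.25}

print(pgfl(c, (1.0, 1.0)))  # 1.0
print(pgfl(c, (0.5, 1.0)))  # 0.46875
```

Evaluating at the constant function 1 returns the total probability mass, and the second value is the probability-generating function of the multiplicities evaluated at (0.5, 1.0), in line with the stated relation between the two objects.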

Assumption 1 also yields a simpler expression of the statistics induced by a stochastic representation on . Of particular interest are the mean and variance for the number of individual laws within a measurable subset of , characterised by

$$
M(B) \doteq \mathbb{E}[M(B)], \qquad V(B) \doteq \mathbb{E}\big[M(B)^2\big] - M(B)^2,
$$

whenever they exist. These quantities are well defined since is a random measure. If the quantities of interest are the mean and variance on the state space , then the mapping

$$
\Phi_B : P(X) \to \mathbb{R}, \qquad p \mapsto p(B),
$$

can be introduced for any and is -measurable by (Daley2003, Proposition A2.5.IV). The collapsed first moment and variance , describing the number of individuals within can then be defined as Delande2016_DISP ()

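$$
\begin{aligned}
M'(B) &\doteq \mathbb{E}[M(\Phi_B)] = \sum_{\theta \in \Theta} p_\theta(B), \\
V'(B) &\doteq \mathbb{E}\big[M(\Phi_B)^2\big] - M'(B)^2 = \sum_{\theta, \theta' \in \Theta} \mathrm{cov}_{\theta,\theta'}\, p_\theta(B)\, p_{\theta'}(B).
\end{aligned}
$$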

where and . These relations between and are connected to the relation between the p.g.fl.  and the probability-generating function . Even in the simple configuration induced by Assumption 1, the structure of the proposed representation of stochastic populations enables more diverse types of statistics to be computed when compared to point processes on the state space, which is practically relevant for describing filtering algorithms for multi-object dynamical systems Delande2014_Var ().
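The collapsed statistics can be computed by brute force when the law on multiplicities has finite support. The sketch below is an illustration under assumptions that are not in the original text: the integral of the mapping against the point process is read as the sum of the multiplicities weighted by the mass each law gives to the subset, the covariance is taken to be that of the multiplicities, and the names `collapsed_moments` and `cov` and all numerical values are hypothetical.

```python
def collapsed_moments(c, masses):
    """Mean and variance of sum_theta N_theta * p_theta(B) under the law c."""
    values = [(w, sum(n_t * q for n_t, q in zip(n, masses))) for n, w in c.items()]
    mean = sum(w * v for w, v in values)
    second = sum(w * v ** 2 for w, v in values)
    return mean, second - mean ** 2

def cov(c, i, j):
    """Covariance of the multiplicities N_i and N_j under the law c."""
    mean_i = sum(w * n[i] for n, w in c.items())
    mean_j = sum(w * n[j] for n, w in c.items())
    return sum(w * n[i] * n[j] for n, w in c.items()) - mean_i * mean_j

# Hypothetical law on multiplicities and masses p_1(B), p_2(B).
c = {(3, 0): 0.5, (1, 2): 0.5}
masses = (0.5, 0.25)

mean, variance = collapsed_moments(c, masses)
via_cov = sum(
    cov(c, i, j) * masses[i] * masses[j] for i in range(2) for j in range(2)
)
print(mean, variance, via_cov)  # 1.25 0.0625 0.0625
```

The variance obtained from the definition coincides with the double sum over covariances, which is the form given for the collapsed variance.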

## Conclusion

Starting from general considerations about the concepts of individual and population and about the partially-indistinguishable knowledge that may be available about them, we moved through increasingly general notions in an attempt to faithfully describe the multi-faceted nature of the corresponding uncertainties. Once a suitable level of generality was reached, an alternative way of expressing the uncertainty about these complex systems was introduced. This alternative expression highlights the nature of the proposed representation by identifying it with a point process on the set of probability measures on the individual state space, under the assumption of independence between individuals.

## References

• (1) N. Chenouard, et al., Objective comparison of particle tracking methods, Nature methods 11 (3) (2014) 281.
• (2) J. Mullane, B.-N. Vo, M. D. Adams, B.-T. Vo, A random-finite-set approach to Bayesian SLAM, IEEE Transactions on Robotics 27 (2) (2011) 268–282.
• (3) K. Okuma, A. Taleghani, N. De Freitas, J. J. Little, D. G. Lowe, A boosted particle filter: Multitarget detection and tracking, in: Computer Vision-ECCV 2004, Springer, 2004, pp. 28–39.
• (4) J. Hofbauer, K. Sigmund, Evolutionary games and population dynamics, Cambridge university press, 1998.
• (5) P. Turchin, Complex population dynamics: a theoretical/empirical synthesis, Vol. 35, Princeton University Press, 2003.
• (6) C. J. Geyer, J. Møller, Simulation procedures and likelihood inference for spatial point processes, Scandinavian journal of statistics (1994) 359–373.
• (7) P. J. Green, Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, Biometrika 82 (4) (1995) 711–732.
• (8) S. S. Blackman, Multiple-target tracking with radar applications, Vol. 1, Artech House, Norwood, 1986.
• (9) R. P. S. Mahler, Statistical multisource-multitarget information fusion, Artech House, Boston, 2007.
• (10) F. Caron, P. Del Moral, A. Doucet, M. Pace, On the conditional distributions of spatial point processes, Advances in Applied Probability 43 (2) (2011) 301–307.
• (11) M. Pace, P. Del Moral, Mean-field PHD filters based on generalized Feynman-Kac flow, IEEE Journal of Selected Topics in Signal Processing 7 (3) (2013) 484–495.
• (12) P. Del Moral, J. Houssineau, Particle association measures and multiple target tracking, in: Theoretical Aspects of Spatial-Temporal Modeling, Springer, 2015, pp. 1–30.