# Unstructured sequential testing in sensor networks

Georgios Fellouris and Alexander Tartakovsky. G. Fellouris is with the Department of Statistics, University of Illinois at Urbana-Champaign, 119 Illini Hall, 725 South Wright Street, Champaign, IL 61820 USA (fellouri@illinois.edu). A. Tartakovsky is with the Department of Statistics, University of Connecticut, 215 Glenbrook Road U-4120, Storrs, CT 06269-4120 (a.tartakov@uconn.edu). This paper was originally submitted when both authors were with the Department of Mathematics at the University of Southern California.
###### Abstract

We consider the problem of quickly detecting a signal in a sensor network when the subset of sensors in which signal may be present is completely unknown. We formulate this problem as a sequential hypothesis testing problem with a simple null hypothesis (signal is absent everywhere) and a composite alternative (signal is present somewhere). We introduce a novel class of scalable sequential tests which, for any subset of affected sensors, minimize the expected sample size for a decision asymptotically, that is, as the error probabilities go to 0. Moreover, we propose sequential tests that require minimal transmission activity from the sensors to the fusion center, while preserving this asymptotic optimality property.

## I Introduction

### I-A Problem formulation

Consider $K$ sources of observations (sensors) which transmit their data to a global decision maker (fusion center). We assume that observations from different sensors are independent and that, for each $k \in \{1, \ldots, K\}$, sensor $k$ observes a sequence $\{X_t^k\}_{t \in \mathbb{N}}$ of independent and identically distributed (i.i.d.) random variables with common density $f_k$ with respect to a dominating, $\sigma$-finite measure $\nu_k$. We denote by $\{\mathcal{F}_t^k\}$ the filtration generated by the observations at sensor $k$, i.e., $\mathcal{F}_t^k = \sigma(X_s^k;\ 1 \le s \le t)$ for every $t \in \mathbb{N}$. For each density $f_k$, we consider two possibilities, $f_k^0$ and $f_k^1$, so that the corresponding Kullback-Leibler information numbers,

$$I_k^1 := \int \log\Big(\frac{f_k^1(x)}{f_k^0(x)}\Big) f_k^1(x)\, \nu_k(dx) \quad \text{and} \quad I_k^0 := \int \log\Big(\frac{f_k^0(x)}{f_k^1(x)}\Big) f_k^0(x)\, \nu_k(dx),$$

are positive and finite. The goal is to distinguish at the fusion center between the following two hypotheses:

$$H_0: f_k = f_k^0, \ 1 \le k \le K, \qquad H_1: f_k = f_k^0, \ k \notin A \ \text{ and } \ f_k = f_k^1, \ k \in A,$$

where $A$ is a subset of sensors that belongs to some class $\mathcal{P}$. The interpretation is that signal is present (resp. absent) at sensor $k$ when its observations are distributed according to $f_k^1$ (resp. $f_k^0$). Thus, the null hypothesis, $H_0$, represents the situation in which all sensors observe noise, whereas the alternative hypothesis, $H_1$, corresponds to the case that signal is present in some subset $A$ of sensors. In what follows, we denote by $\mathsf{P}_1^A$ and $\mathsf{E}_1^A$ the probability measure and the expectation, respectively, under $H_1$ when the subset of affected sensors is $A$, whereas the corresponding notation under $H_0$ will be $\mathsf{P}_0$ and $\mathsf{E}_0$.
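As a concrete illustration (our example, not one from the paper), consider the Gaussian mean-shift model $f_k^0 = \mathcal{N}(0,1)$ and $f_k^1 = \mathcal{N}(\mu_k, 1)$, for which both information numbers equal $\mu_k^2/2$. The following sketch checks this closed form by Monte Carlo; the densities and parameter values are illustrative assumptions.

```python
import random

def kl_monte_carlo(mu, n=200_000, seed=1):
    # Estimate I_1 = E_{f1}[log(f1(X)/f0(X))] for f0 = N(0,1), f1 = N(mu,1).
    # Here log(f1(x)/f0(x)) = mu*x - mu^2/2.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(mu, 1.0)           # sample from f1
        total += mu * x - mu * mu / 2.0  # LLR evaluated at the sample
    return total / n

mu = 1.0
print(abs(kl_monte_carlo(mu) - mu * mu / 2.0) < 0.02)  # closed form: mu^2/2
```

The same routine with the roles of $f_k^0$ and $f_k^1$ exchanged estimates $I_k^0$, which equals $I_k^1$ in this symmetric model.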

We will be interested in the sequential version of this hypothesis testing problem. Thus, we assume that observations at the sensors and the fusion center are acquired sequentially, and we want to select the correct hypothesis at the fusion center as soon as possible. This means that the goal is to find a sequential test, $(T, d_T)$, which consists of an $\{\mathcal{F}_t\}$-stopping time $T$ and an $\mathcal{F}_T$-measurable random variable $d_T$ that takes values in $\{0, 1\}$, so that $H_i$ is selected on $\{d_T = i\}$, $i = 0, 1$, where $\{\mathcal{F}_t\}$ is the filtration generated by the observations of all sensors, i.e.,

$$\mathcal{F}_t := \sigma(X_s^k;\ 1 \le s \le t,\ 1 \le k \le K), \quad t \in \mathbb{N}.$$

An ideal sequential test should have small detection delay under both hypotheses, while controlling its error probabilities below prescribed levels. Specifically, given $\alpha, \beta \in (0,1)$ and a class $\mathcal{P}$ of subsets of $\{1, \ldots, K\}$, we set

$$\mathcal{C}_{\alpha,\beta}(\mathcal{P}) := \big\{(T, d_T):\ \mathsf{P}_0(d_T = 1) \le \alpha \ \text{ and } \ \max_{A \in \mathcal{P}} \mathsf{P}_1^A(d_T = 0) \le \beta\big\},$$

i.e., $\mathcal{C}_{\alpha,\beta}(\mathcal{P})$ is the class of sequential tests whose probabilities of type-I and type-II error are bounded above by $\alpha$ and $\beta$, respectively. Then, the problem is to find a sequential test that attains

$$\inf_{(T, d_T) \in \mathcal{C}_{\alpha,\beta}(\mathcal{P})} \mathsf{E}_0[T] \tag{1}$$

and, for every set $A \in \mathcal{P}$,

$$\inf_{(T, d_T) \in \mathcal{C}_{\alpha,\beta}(\mathcal{P})} \mathsf{E}_1^A[T]. \tag{2}$$

This is indeed possible when $\mathcal{P} = \{A\}$, that is, when the subset of sensors in which signal may be present is known in advance. In this case, $\mathcal{C}_{\alpha,\beta}(\mathcal{P})$ reduces to

$$\mathcal{C}_{\alpha,\beta}(A) := \big\{(T, d_T):\ \mathsf{P}_0(d_T = 1) \le \alpha \ \text{ and } \ \mathsf{P}_1^A(d_T = 0) \le \beta\big\},$$

and from Wald and Wolfowitz [1] it follows that, for any $\alpha, \beta$ so that $\alpha + \beta < 1$, both (1) and (2) are attained by Wald's [2] Sequential Probability Ratio Test (SPRT):

$$S^A := \inf\{t : Z_t^A \notin (-A, B)\}, \qquad d_{S^A} := \begin{cases} 1 & \text{if } Z_{S^A}^A \ge B \\ 0 & \text{if } Z_{S^A}^A \le -A \end{cases} \tag{3}$$

where $A, B$ are positive thresholds selected so that $\mathsf{P}_0(d_{S^A} = 1) = \alpha$ and $\mathsf{P}_1^A(d_{S^A} = 0) = \beta$, whereas $Z^A = \{Z_t^A\}$ is the log-likelihood ratio process of $H_1$ over $H_0$. Since we have assumed that observations coming from different sensors are independent (an assumption that is not required for the optimality of the SPRT), it is clear that $Z_t^A = \sum_{k \in A} Z_t^k$, where

$$Z_t^k = Z_{t-1}^k + \log \frac{f_k^1(X_t^k)}{f_k^0(X_t^k)}; \qquad Z_0^k := 0$$

is the log-likelihood ratio of the observations acquired by sensor $k$ up to time $t$.
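The SPRT recursion above is simple to sketch in code. The block below runs it on a single stream of observations; the Gaussian mean-shift increments and the threshold values are illustrative assumptions of ours, not choices made in the paper.

```python
import random

def sprt(samples, llr_inc, A, B):
    """Run an SPRT: accumulate LLR increments until Z exits (-A, B).

    Returns (stopping time, decision), with decision 1 for H1 and 0 for H0,
    or None if the stream ends before a threshold is crossed.
    """
    Z = 0.0
    for t, x in enumerate(samples, start=1):
        Z += llr_inc(x)
        if Z >= B:
            return t, 1
        if Z <= -A:
            return t, 0
    return None

# Illustrative model: f0 = N(0,1), f1 = N(1,1), so the increment is x - 1/2.
rng = random.Random(7)
result = sprt((rng.gauss(1.0, 1.0) for _ in range(10_000)),
              lambda x: x - 0.5, A=5.0, B=5.0)
print(result)  # under H1 the test typically selects 1 quickly
```

The multi-sensor version for a known subset $A$ simply feeds the sum of the sensors' increments into the same loop.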

While the optimality of the SPRT holds for any given $\alpha, \beta$ so that $\alpha + \beta < 1$, closed-form expressions for its operating characteristics are, in general, available only in an asymptotic setup, that is, as $\alpha, \beta \to 0$. In what follows, whenever $\alpha$ and $\beta$ go to 0 simultaneously, we will assume implicitly that $\log\alpha / \log\beta$ converges to some positive constant, and we will write $x \sim y$ when $x / y \to 1$ and $x \succ y$ (resp. $x \prec y$) when $\liminf x / y \ge 1$ (resp. $\limsup x / y \le 1$). Then, it is well known that

$$\mathsf{E}_0[S^A] \sim \frac{|\log\beta|}{I_0^A} \quad \text{and} \quad \mathsf{E}_1^A[S^A] \sim \frac{|\log\alpha|}{I_1^A}, \tag{4}$$

where $I_0^A$ and $I_1^A$ are the Kullback-Leibler information numbers between $H_0$ and $H_1$, which, due to the assumption of independence across sensors, take the form $I_0^A = \sum_{k \in A} I_k^0$ and $I_1^A = \sum_{k \in A} I_k^1$.

When the class $\mathcal{P}$ is not a singleton, i.e., when the alternative hypothesis is composite, it is not possible to find a sequential test that attains (2) for every subset $A \in \mathcal{P}$. For this reason, we need to restrict ourselves to sequential tests that are optimal in an asymptotic sense. Therefore, given a class $\mathcal{P}$ of subsets of $\{1, \ldots, K\}$, we will say that a sequential test $(\tilde T, d_{\tilde T}) \in \mathcal{C}_{\alpha,\beta}(\mathcal{P})$ is asymptotically optimal under $H_0$ if

$$\mathsf{E}_0[\tilde T] \sim \inf_{(T, d_T) \in \mathcal{C}_{\alpha,\beta}(\mathcal{P})} \mathsf{E}_0[T]$$

and under $H_1$ if, for every $A \in \mathcal{P}$,

$$\mathsf{E}_1^A[\tilde T] \sim \inf_{(T, d_T) \in \mathcal{C}_{\alpha,\beta}(\mathcal{P})} \mathsf{E}_1^A[T].$$

A number of asymptotically optimal (under both hypotheses) sequential tests have been proposed and studied in the case that signal may be present in at most one sensor, that is, when

$$\mathcal{P} = \{A : |A| = 1\} = \{\{k\},\ 1 \le k \le K\},$$

where $|A|$ represents the cardinality of $A$. An example of such a test is given by the SPRT-bank, according to which each sensor runs an SPRT locally, transmits its decision to the fusion center, and the latter stops and selects $H_1$ the first time that any sensor makes a selection in favor of the alternative, whereas it stops and selects $H_0$ when all sensors have made a decision in favor of the null (see, e.g., [3]). Another asymptotically optimal sequential test in this setup can be obtained if $Z_t^A$ in (3) is replaced by the generalized log-likelihood ratio statistic, $\max_{1 \le k \le K} Z_t^k$, or more generally by any statistic of the form

$$\log\Big(\sum_{k=1}^K p_k\, e^{Z_t^k}\Big),$$

where each $p_k$ is a positive constant [3], [4].

The latter approach can in principle be applied to the case that signal may be present in more than one sensor. Indeed, given any class $\mathcal{P}$, it can be shown that replacing $Z_t^A$ in (3) with either

$$\log\Big(\sum_{B \in \mathcal{P}} p_B\, e^{Z_t^B}\Big) \quad \text{or} \quad \log\Big(\max_{B \in \mathcal{P}} p_B\, e^{Z_t^B}\Big),$$

where $Z_t^B = \sum_{k \in B} Z_t^k$ and each $p_B$ is a positive constant, leads to an asymptotically optimal sequential test. However, this test may not be implementable in practice, even for a moderate number of sensors. Consider, for example, the completely unstructured case, where there is absolutely no prior information regarding the set of affected sensors and $\mathcal{P}$ is given by

$$\mathcal{P} = \{A : 1 \le |A| \le K\}. \tag{5}$$

Then, the implementation of the above sequential tests demands summing/maximizing $2^K - 1$ statistics at every time $t$, a requirement that may be prohibitive in practice.

### I-B Main contributions

In the present paper, we focus on the case that $\mathcal{P}$ is given by (5), i.e., we assume that signal may be present in any subset of sensors under the alternative hypothesis. In this context, we propose a class of sequential tests whose implementation at any time requires $O(K)$ (instead of $O(2^K)$) operations, and we establish their asymptotic optimality. Specifically, we set

$$T^* := \min\{\hat T_B, \check T_A\}, \qquad d^* := \begin{cases} 1 & \text{if } \hat T_B \le \check T_A \\ 0 & \text{if } \hat T_B > \check T_A \end{cases}$$

where $\hat T_B$ and $\check T_A$ are one-sided stopping times of the form

$$\check T_A := \min\{t : \check Z_t \le -A\}; \quad \check Z_t := \max_{1 \le k \le K} Z_t^k, \qquad \hat T_B := \min\{t : \hat Z_t \ge B\}; \quad \hat Z_t := \sum_{k=1}^K \hat Z_t^k,$$

and each $\hat Z^k = \{\hat Z_t^k\}$ is an $\{\mathcal{F}_t^k\}$-adapted statistic that should be chosen appropriately. Our main contribution in this work is that we show how to select these statistics, as well as the thresholds $A$ and $B$, in order to guarantee the asymptotic optimality of the proposed sequential test. Thus, in Section II we show that $(T^*, d^*)$ is asymptotically optimal under both hypotheses when, for every $1 \le k \le K$ and $t \in \mathbb{N}$,

$$\hat Z_t^k \le M_t^k := \max_{1 \le s \le t} Z_s^k, \tag{6}$$

there is a constant $\Delta_k$ so that

$$\hat Z_t^k \ge \max\{Z_t^k - \Delta_k,\ 0\}, \tag{7}$$

and the thresholds $A$ and $B$ are selected so that

$$A = A_\beta := |\log\beta| \quad \text{and} \quad B = B_\alpha := F^{-1}(\alpha), \tag{8}$$

where $F^{-1}$ is the inverse of the survival function $F$ of the Erlang distribution with parameters 1 and $K$, i.e.,

$$F(x) := e^{-x} \sum_{j=0}^{K-1} \frac{x^j}{j!}, \quad x > 0. \tag{9}$$
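Since $F$ in (9) is continuous and strictly decreasing on $(0, \infty)$, the threshold $B_\alpha = F^{-1}(\alpha)$ in (8) can be computed numerically, e.g., by bisection. A minimal sketch; the bracketing interval and tolerance are implementation choices of ours:

```python
import math

def erlang_survival(x, K):
    # F(x) = e^{-x} * sum_{j=0}^{K-1} x^j / j!, the survival function in (9).
    return math.exp(-x) * sum(x ** j / math.factorial(j) for j in range(K))

def threshold_B(alpha, K, hi=1e6, tol=1e-10):
    # Bisection for B_alpha = F^{-1}(alpha); F decreases from 1 to 0.
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if erlang_survival(mid, K) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# For K = 1, F(x) = e^{-x}, so B_alpha = |log alpha| exactly.
print(abs(threshold_B(0.01, K=1) - math.log(100)) < 1e-6)  # True
```

For $K > 1$, $B_\alpha$ exceeds $|\log\alpha|$, which is the price paid for not knowing the affected subset; by (14) below this price is asymptotically negligible.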

Conditions (6)-(7) are clearly satisfied when each $\hat Z_t^k$ is chosen as the positive part of $Z_t^k$, $(Z_t^k)^+$, in which case (7) holds with $\Delta_k = 0$. In Section III, we show that if each sensor communicates with the fusion center only when its local log-likelihood ratio $Z^k$ increases by $\Delta_k$ since the previous communication time, then selecting $\hat Z_t^k$ as the value of $Z^k$ at the most recent communication time also satisfies conditions (6)-(7). Furthermore, we show that the asymptotic optimality of $(T^*, d^*)$ remains valid in this context, even with an asymptotically low rate of communication.
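To fix ideas, here is a minimal simulation sketch of $(T^*, d^*)$ with $\hat Z_t^k = (Z_t^k)^+$, under an illustrative Gaussian mean-shift model; the densities, threshold values, and the `rng` interface are our assumptions, not specifications from the paper.

```python
import random

def run_test(K, affected, mu, A, B, rng, max_t=100_000):
    """Sketch of the proposed test with hat-Z^k = max(Z^k, 0).

    Decide 1 once sum_k (Z^k)^+ >= B; decide 0 once max_k Z^k <= -A.
    Model: f_k^0 = N(0,1), f_k^1 = N(mu,1); signal present for k in `affected`.
    """
    Z = [0.0] * K
    for t in range(1, max_t + 1):
        for k in range(K):
            x = rng.gauss(mu if k in affected else 0.0, 1.0)
            Z[k] += mu * x - mu * mu / 2.0    # LLR increment at sensor k
        if sum(max(z, 0.0) for z in Z) >= B:  # hat-T_B fires: select H1
            return t, 1
        if max(Z) <= -A:                      # check-T_A fires: select H0
            return t, 0
    return max_t, None  # no decision within the horizon

t, d = run_test(K=5, affected={0, 2}, mu=1.0, A=4.6, B=10.0,
                rng=random.Random(3))
print((t, d))  # the stopping time and decision depend on the sample path
```

Note that each time step costs $O(K)$ operations, in contrast to the $2^K - 1$ statistics required by the mixture or generalized likelihood ratio tests above.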

This infrequent communication is a very important property in applications characterized by limited bandwidth, where it is necessary to design schemes that require minimal transmission activity from the sensors to the fusion center (see, e.g., [5], [6]). Such communication constraints have motivated the problem of decentralized sequential hypothesis testing (see, e.g., [7]-[13]), where each sensor is required to transmit a small number of bits whenever it communicates with the fusion center. However, in this literature, it is typically assumed that the set of affected sensors is known in advance (i.e., $\mathcal{P} = \{A\}$) and asymptotically optimal decentralized sequential tests have been proposed only under this assumption (see [11], [12], [13]). Our second main contribution in the present work is that we construct a decentralized sequential test which requires infrequent transmission of one-bit messages from the sensors, and we establish its asymptotic optimality when $\mathcal{P}$ is given by (5).

The remainder of the paper is organized as follows: in Section II we state and prove the main results of the paper, and in Section III we consider the decentralized setup. In Section IV we discuss certain extensions of this work, which will be presented elsewhere.

## II Main results

In what follows, $\mathcal{P}$ is given by (5). We start by obtaining an asymptotic lower bound for the optimal performance under each hypothesis.

###### Theorem II.1

As $\alpha, \beta \to 0$,

$$\inf_{(T, d_T) \in \mathcal{C}_{\alpha,\beta}(\mathcal{P})} \mathsf{E}_0[T] \succ \frac{|\log\beta|}{\min_{1 \le k \le K} I_k^0} \tag{10}$$

and, for every $A \in \mathcal{P}$,

$$\inf_{(T, d_T) \in \mathcal{C}_{\alpha,\beta}(\mathcal{P})} \mathsf{E}_1^A[T] \succ \frac{|\log\alpha|}{I_1^A}. \tag{11}$$
**Proof:**

Since $\mathcal{C}_{\alpha,\beta}(\mathcal{P}) \subseteq \mathcal{C}_{\alpha,\beta}(A)$ for any $A \in \mathcal{P}$,

$$\inf_{(T, d_T) \in \mathcal{C}_{\alpha,\beta}(\mathcal{P})} \mathsf{E}_1^A[T] \ge \inf_{(T, d_T) \in \mathcal{C}_{\alpha,\beta}(A)} \mathsf{E}_1^A[T] \sim \frac{|\log\alpha|}{I_1^A},$$

where the asymptotic equality follows from (4). This proves (11). In a similar way we can show that

$$\inf_{(T, d_T) \in \mathcal{C}_{\alpha,\beta}(\mathcal{P})} \mathsf{E}_0[T] \ge \inf_{(T, d_T) \in \mathcal{C}_{\alpha,\beta}(A)} \mathsf{E}_0[T] \sim \frac{|\log\beta|}{I_0^A}$$

and optimizing the asymptotic lower bound over $A \in \mathcal{P}$ we obtain

$$\inf_{(T, d_T) \in \mathcal{C}_{\alpha,\beta}(\mathcal{P})} \mathsf{E}_0[T] \succ \max_{A \in \mathcal{P}} \frac{|\log\beta|}{I_0^A} = \frac{|\log\beta|}{\min_{A \in \mathcal{P}} I_0^A}.$$

Since $\{k\} \in \mathcal{P}$ for every $1 \le k \le K$ and $I_0^A = \sum_{k \in A} I_k^0 \ge \min_{1 \le k \le K} I_k^0$ for every $A \in \mathcal{P}$, it is clear that $\min_{A \in \mathcal{P}} I_0^A = \min_{1 \le k \le K} I_k^0$, which proves (10).

In the following theorem we show that selecting $A$ and $B$ according to (8) guarantees that $(T^*, d^*) \in \mathcal{C}_{\alpha,\beta}(\mathcal{P})$, as long as each statistic $\hat Z^k$ satisfies (6).

###### Theorem II.2

If $A$ and $B$ are selected according to (8) and each $\hat Z^k$ satisfies (6), then $(T^*, d^*) \in \mathcal{C}_{\alpha,\beta}(\mathcal{P})$.

**Proof:**

For any $A > 0$ we have

$$\mathsf{P}_0(\hat T_B \le \check T_A) \le \mathsf{P}_0(\hat T_B < \infty) = \lim_{t \to \infty} \mathsf{P}_0(\hat T_B \le t)$$

and for any $t \in \mathbb{N}$

$$\mathsf{P}_0(\hat T_B \le t) = \mathsf{P}_0\Big(\max_{0 \le s \le t} \sum_{k=1}^K \hat Z_s^k \ge B\Big) \le \mathsf{P}_0\Big(\sum_{k=1}^K M_t^k \ge B\Big),$$

where the inequality is due to (6). Now, for any given $k$ and $t$, it is clear that

$$\mathsf{P}_0(M_t^k \ge B) = \mathsf{P}_0(S_B^k \le t) \le \mathsf{P}_0(S_B^k < \infty),$$

where $S_B^k := \inf\{t : Z_t^k \ge B\}$, and from Wald's likelihood ratio identity it follows that

$$\mathsf{P}_0(S_B^k < \infty) = \mathsf{E}_1^k\big[e^{-Z^k_{S_B^k}}\big] \le e^{-B},$$

where $\mathsf{E}_1^k$ is expectation with respect to $\mathsf{P}_1^k$, the probability measure under which $f_k = f_k^1$ and $f_j = f_j^0$ for $j \ne k$. The last two relationships imply that, for any given $k$ and $t$, $\mathsf{P}_0(M_t^k \ge B) \le e^{-B}$, which means that the random variable $M_t^k$ is stochastically dominated by an exponential random variable with rate 1. Since, due to the assumed independence across sensors, $M_t^1, \ldots, M_t^K$ are independent, this implies that $\sum_{k=1}^K M_t^k$ is stochastically dominated by an Erlang random variable with parameters 1 and $K$, i.e.,

$$\mathsf{P}_0\Big(\sum_{k=1}^K M_t^k \ge B\Big) \le F(B),$$

where $F$ is defined in (9). From the latter observation and the definition of $B_\alpha$ it follows that for any $A > 0$:

$$\mathsf{P}_0(\hat T_{B_\alpha} \le \check T_A) \le F(B_\alpha) = \alpha.$$

Furthermore, for any given $A \in \mathcal{P}$, from Wald's likelihood ratio identity it follows that for any $B > 0$

$$\mathsf{P}_1^A(\check T_A < \hat T_B) \le \mathsf{P}_1^A(\check T_A < \infty) = \mathsf{E}_0\big[e^{\sum_{k \in A} Z^k_{\check T_A}}\big] \le e^{-|A| A} \le e^{-A},$$

where the second inequality holds because $Z^k_{\check T_A} \le -A$ on $\{\check T_A < \infty\}$ for every $k$, and the third one because $|A| \ge 1$ for any $A \in \mathcal{P}$. Consequently,

$$\max_{A \in \mathcal{P}} \mathsf{P}_1^A(\check T_A < \hat T_B) \le e^{-A}$$

and from the definition of $A_\beta$ it follows that for any $B > 0$

$$\max_{A \in \mathcal{P}} \mathsf{P}_1^A(\check T_{A_\beta} < \hat T_B) \le e^{-A_\beta} = \beta,$$

which completes the proof.

In the following theorem we show that if $A$ and $B$ are selected according to (8) and each statistic $\hat Z^k$ satisfies (7), then $T^*$ attains the asymptotic lower bounds in (10) and (11).

###### Theorem II.3

(i) As $\beta \to 0$,

$$\mathsf{E}_0[T^*] \prec \frac{A}{\min_{1 \le k \le K} I_k^0} \tag{12}$$

and $T^*$ attains the asymptotic lower bound in (10) when $A = A_\beta$.

(ii) If each $\hat Z^k$ satisfies (7), then as $\alpha \to 0$

$$\mathsf{E}_1^A[T^*] \prec \frac{B + \sum_{k \in A} \Delta_k}{I_1^A} \quad \text{for every } A \in \mathcal{P} \tag{13}$$

and $T^*$ attains the asymptotic lower bound in (11) when $B = B_\alpha$.

**Proof:**

The proof of (i) is a direct consequence of Theorem 2 in [3]. In order to prove (ii), we observe that for any $t \in \mathbb{N}$ and $A \in \mathcal{P}$

$$\sum_{k=1}^K \hat Z_t^k = \sum_{k \in A} \hat Z_t^k + \sum_{k \notin A} \hat Z_t^k \ge \sum_{k \in A} (Z_t^k - \Delta_k) = Z_t^A - \sum_{k \in A} \Delta_k,$$

where the inequality is due to (7). As a result,

$$T^* \le \hat T_B \le \inf\Big\{t : Z_t^A \ge B + \sum_{k \in A} \Delta_k\Big\}$$

and taking expectations we obtain (13). From this relationship and (11) it is clear that it suffices to show that $B_\alpha \sim |\log\alpha|$ as $\alpha \to 0$. Indeed, taking logarithms in the definition of $B_\alpha$ in (8)-(9) we have

$$|\log\alpha| = B_\alpha - \log\Big(\sum_{j=0}^{K-1} \frac{B_\alpha^j}{j!}\Big) \sim B_\alpha, \tag{14}$$

which completes the proof.

From the previous theorems it follows that selecting $A$ and $B$ according to (8) and the statistics $\hat Z^k$ so that (6)-(7) hold guarantees the asymptotic optimality of $(T^*, d^*)$ under both hypotheses, when $\mathcal{P}$ is given by (5). Let us add a few remarks to this statement:

1. Conditions (6)-(7) are clearly satisfied when $\hat Z_t^k = (Z_t^k)^+$. An alternative specification that satisfies these conditions is presented in the next section.

2. Condition (7) is not needed for $(T^*, d^*)$ to belong to $\mathcal{C}_{\alpha,\beta}(\mathcal{P})$ and to be asymptotically optimal under $H_0$.

3. The asymptotic optimality of $T^*$ remains valid even if $\Delta_k \to \infty$ for one or more $k$, as long as $\Delta_k = o(|\log\alpha|)$. In the next section we will show that, with a particular specification for $\hat Z^k$, this property has an interesting interpretation in terms of the communication requirements of the proposed scheme.

## III The decentralized setup

Let us first note that the one-sided sequential test $\check T_A$ is a one-shot scheme; it requires that each sensor communicate with the fusion center at most once, as soon as its local log-likelihood ratio statistic $Z^k$ takes a value smaller than $-A$, at which time it simply needs to transmit a one-bit message to the fusion center, informing it about this development.

On the other hand, the implementation of the stopping rule $\hat T_B$ can be much more demanding from a communication point of view. For example, if we set $\hat Z_t^k = (Z_t^k)^+$, sensor $k$ needs to transmit the actual value of $Z_t^k$ at every time $t$ (or at least whenever it is positive). As we discussed in the Introduction, this may not be possible in applications characterized by bandwidth constraints.

In what follows, we assume that the thresholds $A$ and $B$ are selected according to (8), and our goal is to suggest specifications for the statistics $\hat Z^k$ that induce low transmission activity, while preserving the asymptotic optimality of the sequential test $(T^*, d^*)$.

In order to achieve this, we require that each sensor $k$ communicate with the fusion center only at an increasing sequence of $\{\mathcal{F}_t^k\}$-stopping times, $\{\tau_n^k\}_{n \in \mathbb{N}}$, which are finite under $H_1$. In other words, each sensor should communicate with the fusion center only at some particular time instances and, at any given time, the decision to communicate or not should depend exclusively on the observations that have been acquired locally at the sensor until this time.

Given such a sequence of communication times, we denote by $\tau^k(t)$ the instance of the most recent transmission and by $N_t^k$ the number of transmitted messages up to time $t$, i.e.,

$$\tau^k(t) := \max\{\tau_n^k : \tau_n^k \le t\}, \qquad N_t^k := \max\{n : \tau_n^k \le t\}.$$

At any given time $t$, the value of $Z^k$ at the most recent communication instance,

$$Z^k_{\tau^k(t)} = \sum_{n=1}^{N_t^k} \ell_n^k, \qquad \ell_n^k := Z^k_{\tau_n^k} - Z^k_{\tau_{n-1}^k}, \tag{15}$$

cannot be larger than $M_t^k$, the maximum value of $Z^k$ up to time $t$. Indeed, note that $Z^k_{\tau^k(t)}$ coincides with $Z^k_t$ at the communication times and stays flat in between. Therefore, selecting $\hat Z_t^k$ according to (15) satisfies condition (6) and, consequently, it guarantees that $(T^*, d^*)$ belongs to $\mathcal{C}_{\alpha,\beta}(\mathcal{P})$ and is asymptotically optimal under $H_0$.

When, in particular, the communication times are described by the recursion

$$\tau_n^k := \inf\{t \ge \tau_{n-1}^k : Z_t^k - Z^k_{\tau_{n-1}^k} \ge \Delta_k\}, \quad n \in \mathbb{N}, \tag{16}$$

where $\tau_0^k := 0$ and $\Delta_k$ is a positive constant, then it is straightforward to see that, for any time $t$, $Z^k_{\tau^k(t)} \ge Z_t^k - \Delta_k$ and $Z^k_{\tau^k(t)} \ge 0$. Therefore, in the case of the communication scheme (16), setting $\hat Z_t^k$ equal to $Z^k_{\tau^k(t)}$ satisfies condition (7) as well and implies that $T^*$ is asymptotically optimal also under $H_1$. Furthermore, the final remark at the end of Section II suggests that the latter asymptotic optimality property is preserved even with an asymptotically low rate of communication from one or more sensors, as the constant $\Delta_k$ in this setup controls the average period of communication at sensor $k$.
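A short sketch of the level-triggered scheme (16), checking along a sample path that the transmitted statistic stays within the band required by conditions (6)-(7); the increments and the value $\Delta = 1$ are illustrative choices of ours:

```python
def communication_path(increments, delta):
    """Simulate scheme (16): the sensor transmits whenever its LLR has grown
    by at least delta since the last transmission. Returns, per time step,
    (Z_t, value of Z at the latest communication time, running max M_t)."""
    Z = last_comm_Z = M = 0.0
    out = []
    for inc in increments:
        Z += inc
        M = max(M, Z)
        if Z - last_comm_Z >= delta:  # communication time tau_n
            last_comm_Z = Z
        out.append((Z, last_comm_Z, M))
    return out

path = communication_path([0.4, 0.7, -0.2, 0.9, 0.3, -1.5, 2.0], delta=1.0)
# Conditions (6)-(7): max(Z_t - delta, 0) <= hat-Z_t <= M_t at every step.
ok = all(max(Z - 1.0, 0.0) <= hatZ <= M for Z, hatZ, M in path)
print(ok)  # True
```

The middle coordinate is exactly the statistic $\hat Z_t^k = Z^k_{\tau^k(t)}$ discussed above: it is non-negative, never exceeds the running maximum, and can lag the current LLR by at most $\Delta$.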

From the right-hand side in (15) it is clear that selecting $\hat Z_t^k$ as $Z^k_{\tau^k(t)}$ requires that at each communication time $\tau_n^k$ sensor $k$ transmit to the fusion center (with an "infinite-bit" message) the exact value of $\ell_n^k$, the "realized" local log-likelihood ratio between $\tau_{n-1}^k$ and $\tau_n^k$. However, if one insists that a small number of bits be transmitted at each communication, which is the main requirement in decentralized sequential testing [10], then this selection is no longer acceptable. Nevertheless, in the case of the communication scheme (16), it is intuitively clear that the value of each $\ell_n^k$ should be close to $\Delta_k$, at least when the likelihood ratio increments do not have "heavy tails" and/or $\Delta_k$ is "large". This suggests selecting each $\hat Z_t^k$ according to

$$\hat Z_t^k := \sum_{n=1}^{N_t^k} \Delta_k = \Delta_k N_t^k, \tag{17}$$

a selection that requires transmission of a single bit from each sensor at each communication time. Moreover, for every time $t$ it is clear that

$$\Delta_k N_t^k \le Z^k_{\tau^k(t)} \le M_t^k, \tag{18}$$

therefore, selecting $\hat Z_t^k$ according to (17) satisfies condition (6) and, consequently, it guarantees that $(T^*, d^*)$ belongs to $\mathcal{C}_{\alpha,\beta}(\mathcal{P})$ and is asymptotically optimal under $H_0$. On the other hand, for every $t$ we have

$$Z_t^k - \Delta_k N_t^k = Z_t^k - Z^k_{\tau^k(t)} + \sum_{n=1}^{N_t^k} \eta_n^k \le \Delta_k + \sum_{n=1}^{N_t^k} \eta_n^k, \tag{19}$$

where $\eta_n^k := \ell_n^k - \Delta_k$ is the random, non-negative overshoot associated with the $n$-th transmission from sensor $k$. This means that selecting each $\hat Z^k$ according to (17) does not satisfy condition (7); therefore, Theorem II.3(ii) can no longer be applied to establish the asymptotic optimality of $T^*$ under $H_1$.
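The one-bit variant (17) can be sketched the same way: the fusion center only counts messages and multiplies by $\Delta_k$. The check below verifies the ordering (18) on an illustrative path (our choice of increments and $\Delta$):

```python
def one_bit_statistic(increments, delta):
    """Fusion-center view under scheme (16) with one-bit messages:
    hat-Z_t^k = delta * N_t^k as in (17). Returns, per time step,
    (delta * N_t, Z at the latest communication time, running max M_t)."""
    Z = last_comm_Z = M = 0.0
    N = 0
    out = []
    for inc in increments:
        Z += inc
        M = max(M, Z)
        if Z - last_comm_Z >= delta:  # a one-bit message is sent
            last_comm_Z = Z
            N += 1
        out.append((delta * N, last_comm_Z, M))
    return out

path = one_bit_statistic([0.4, 0.7, -0.2, 0.9, 0.3, -1.5, 2.0], delta=1.0)
print(all(dN <= z_tau <= m for dN, z_tau, m in path))  # True: ordering (18)
```

The gap between the first two coordinates is exactly the accumulated overshoot $\sum_n \eta_n^k$ in (19), which the fusion center never observes.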

Nevertheless, in the following theorem we show that this property remains valid if two additional conditions are satisfied. The first is that, for every $k$, $Z_1^k$ must have a finite second moment under $\mathsf{P}_1^k$, i.e.,

$$\mathsf{E}_1^k[(Z_1^k)^2] = \int \Big(\log \frac{f_k^1(x)}{f_k^0(x)}\Big)^2 f_k^1(x)\, \nu_k(dx) < \infty, \tag{20}$$

a condition that guarantees that

$$C_A := \max_{k \in A}\ \sup_{\Delta_k > 0} \mathsf{E}_1^k[\eta_1^k] \tag{21}$$

is a finite quantity, and consequently an $O(1)$ term as $\alpha, \beta \to 0$, for any given $A \in \mathcal{P}$. The second is that, now, we must let $\Delta_k \to \infty$ so that $\Delta_k = o(|\log\alpha|)$ for every $k$, so that each sensor does not communicate with the fusion center very frequently and the (unobserved) overshoots do not accumulate very fast.

In what follows, we denote by $O(\bar\Delta)$ a term that is bounded above when divided by $\bar\Delta$ as $\bar\Delta \to \infty$, where

$$\underline\Delta := \min_{1 \le k \le K} \Delta_k, \qquad \bar\Delta := \max_{1 \le k \le K} \Delta_k.$$
###### Theorem III.1

Suppose that each sensor $k$ communicates with the fusion center at the sequence of times described by (16) and that each $\hat Z^k$ is selected according to (17).

(i) If $A = A_\beta$ and $B = B_\alpha$, then $(T^*, d^*)$ belongs to $\mathcal{C}_{\alpha,\beta}(\mathcal{P})$ and $T^*$ attains the asymptotic lower bound in (10).

(ii) If (20) holds, then for any $A \in \mathcal{P}$ and $B > 0$

$$\mathsf{E}_1^A[T^*] \le \frac{1}{I_1^A}\Big[O(\bar\Delta) + \Big(1 + \frac{C_A}{\underline\Delta}\Big) B\Big], \tag{22}$$

and $T^*$ attains the asymptotic lower bound in (11) when $B = B_\alpha$, as long as $\Delta_k \to \infty$ so that $\Delta_k = o(|\log\alpha|)$ for every $k$.

**Proof:**

The proof of (i) follows from (18) and Theorems II.1(i), II.2, and II.3(i). In order to prove (ii), we start with the observation that for any thresholds $A$ and $B$ and for any subset $A \in \mathcal{P}$ we have

$$I_1^A\, \mathsf{E}_1^A[\hat T_B] = \mathsf{E}_1^A\big[Z^A_{\hat T_B}\big] = \sum_{k \in A} \mathsf{E}_1^A\big[Z^k_{\hat T_B} - \hat Z^k_{\hat T_B}\big] + \sum_{k \in A} \mathsf{E}_1^A\big[\hat Z^k_{\hat T_B}\big] \le \sum_{k \in A} \mathsf{E}_1^A\big[Z^k_{\hat T_B} - \hat Z^k_{\hat T_B}\big] + \mathsf{E}_1^A\big[\hat Z_{\hat T_B}\big], \tag{23}$$

where the first equality follows from Wald's identity, whereas the inequality holds because $\sum_{k \in A} \hat Z_t^k \le \hat Z_t$ whenever every $\hat Z^k$ is non-negative, as is the case with (17).

For any $k \in A$, setting $t = \hat T_B$ in (19) and strengthening the inequality we have

$$Z^k_{\hat T_B} - \hat Z^k_{\hat T_B} \le \Delta_k + \sum_{n=1}^{N^k_{\hat T_B}+1} \eta_n^k. \tag{24}$$

Moreover, setting $\mathcal{G}_n^k := \mathcal{F}^k_{\tau_n^k}$, $n \in \mathbb{N}$, we can see that $N^k_{\hat T_B} + 1$ is an integrable, $\{\mathcal{G}_n^k\}$-adapted stopping time and $\{\eta_n^k\}$ a sequence of $\{\mathcal{G}_n^k\}$-adapted, i.i.d. random variables with finite expectation, $\mathsf{E}_1^k[\eta_1^k]$. As a result, from Wald's first identity it follows that for every $k \in A$:

$$\mathsf{E}_1^A\Big[\sum_{n=1}^{N^k_{\hat T_B}+1} \eta_n^k\Big] = \mathsf{E}_1^A\big[N^k_{\hat T_B} + 1\big]\, \mathsf{E}_1^k[\eta_1^k].$$

Therefore, taking expectations in (24) and recalling the definition of $C_A$ in (21) we have

$$\mathsf{E}_1^A\big[Z^k_{\hat T_B} - \hat Z^k_{\hat T_B}\big] \le \Delta_k + C_A + C_A\, \mathsf{E}_1^A\big[N^k_{\hat T_B}\big].$$

Then, summing over $k \in A$ and setting $N_t := \sum_{k \in A} N_t^k$, we obtain

$$\sum_{k \in A} \mathsf{E}_1^A\big[Z^k_{\hat T_B} - \hat Z^k_{\hat T_B}\big] \le \sum_{k \in A} [\Delta_k + C_A] + C_A\, \mathsf{E}_1^A\big[N_{\hat T_B}\big].$$

However, since each $\hat Z^k$ is selected according to (17), it is clear that $N_t \le \hat Z_t / \underline\Delta$ for every $t$. Therefore,

$$\sum_{k \in A} \mathsf{E}_1^A\big[Z^k_{\hat T_B} - \hat Z^k_{\hat T_B}\big] \le \sum_{k \in A} [\Delta_k + C_A] + \frac{C_A\, \mathsf{E}_1^A\big[\hat Z_{\hat T_B}\big]}{\underline\Delta}$$

and from (23) it follows that $I_1^A\, \mathsf{E}_1^A[\hat T_B]$ is bounded above by

$$\sum_{k \in A} [\Delta_k + C_A] + \Big(1 + \frac{C_A}{\underline\Delta}\Big) \mathsf{E}_1^A\big[\hat Z_{\hat T_B}\big]. \tag{25}$$

But since each $\hat Z^k$ is selected according to (17), it is clear that the overshoot $\hat Z_{\hat T_B} - B$ cannot take a value larger than $K \bar\Delta$. As a result, $\mathsf{E}_1^A[\hat Z_{\hat T_B}] \le B + K \bar\Delta$ and the upper bound (25) takes the form

$$\Big[\sum_{k \in A} (\Delta_k + C_A) + \Big(1 + \frac{C_A}{\underline\Delta}\Big) K \bar\Delta\Big] + \Big(1 + \frac{C_A}{\underline\Delta}\Big) B.$$

Then, in order to prove (22), it suffices to note that the first two terms in the latter expression are $O(\bar\Delta)$, since $C_A$ is an $O(1)$ term, due to assumption (20).

Finally, since $B_\alpha \sim |\log\alpha|$ as $\alpha \to 0$ (recall (14)), from (22) it follows that $T^*$ attains the asymptotic lower bound in (11) when $B = B_\alpha$, as long as $\Delta_k \to \infty$ so that $\Delta_k = o(|\log\alpha|)$ for every $k$, which completes the proof.

## IV Extensions

Theorems II.1, II.2, and II.3 do not rely heavily on the assumed i.i.d. structure of the sensor observations. Thus, it can be shown that the asymptotic optimality of $(T^*, d^*)$ remains valid for any statistical model (in discrete or continuous time) that preserves the asymptotic optimality of the SPRT. Moreover, the above results can be generalized to the case that a lower bound and an upper bound on the number of affected sensors are available, that is, when $\mathcal{P} = \{A : \underline{m} \le |A| \le \overline{m}\}$ for some $1 \le \underline{m} \le \overline{m} \le K$.

Furthermore, it is straightforward to generalize the decentralized sequential test described in Section III so that more than one bit is transmitted per communication. These additional bits can be utilized for the quantization of the unobserved overshoots and can improve the performance of the proposed test in the case of high rates of communication.

Finally, we should note that all these extensions, which will be presented elsewhere, require the assumption of independence across sensors. Removing this assumption remains an open problem.

## Acknowledgments

This work was supported by the U.S. Air Force Office of Scientific Research under MURI grant FA9550-10-1-0569, by the U.S. Defense Threat Reduction Agency under grant HDTRA1-10-1-0086, by the U.S. Defense Advanced Research Projects Agency under grant W911NF-12-1-0034, by the U.S. National Science Foundation under grants CCF-0830419, EFRI-1025043, and DMS-1221888 and by the U.S. Army Research Office under grant W911NF-13-1-0073 at the University of Southern California, Department of Mathematics.

## References

• [1] A. Wald and J. Wolfowitz, “Optimum character of the sequential probability ratio test,” Ann. Math. Statist., vol. 19, pp. 326-339, 1948.
• [2] A. Wald, Sequential analysis. Wiley, New York, 1947.
• [3] A. G. Tartakovsky, X. R. Li, and G. Yaralov, “Sequential detection of targets in multichannel systems,” IEEE Trans. Inform. Theory, vol. 49, pp. 425-445, 2003.
• [4] G. Fellouris and A. G. Tartakovsky, “Almost optimal sequential tests for discrete composite hypotheses,” Statistica Sinica, to appear.
• [5] J.N. Tsitsiklis, “Decentralized detection,” Advances in Statistical Signal Processing, Greenwich, CT: JAI Press, 1990.
• [6] V. Raghunathan, C. Schurgers, S. Park and M. B. Srivastava, “Energy-aware wireless microsensor networks,” IEEE Sig. Proc. Mag., vol. 19, no. 2, pp. 40-50, Mar. 2002.
• [7] V.N.S. Samarasooriya and P.K. Varshney, “Sequential approach to asynchronous decision fusion,” Opt. Eng., vol. 35, no. 3, pp. 625-633, 1996.
• [8] V.V. Veeravalli, T. Basar and H.V. Poor, “Decentralized sequential detection with a fusion center performing the sequential test,” IEEE Trans. Inf. Th., vol. 39, pp. 433-442, 1993.
• [9] A.M. Hussain, “Multisensor distributed sequential detection,” IEEE Trans. Aer. Elect. Syst., vol. 30, no. 3, pp. 698-708, 1994.
• [10] V.V. Veeravalli, “Sequential decision fusion: theory and applications,” J. Franklin Inst., vol. 336, pp. 301-322, 1999.
• [11] Y. Mei, “Asymptotic optimality theory for sequential hypothesis testing in sensor networks,” IEEE Trans. Inf. Th., vol. 54, pp. 2072-2089, 2008.
• [12] G. Fellouris and G. V. Moustakides, “Decentralized sequential hypothesis testing using asynchronous communication”, IEEE Trans. Inf. Th., vol. 57, no. 1, pp. 534–548, 2011.
• [13] Y. Yilmaz, G. V. Moustakides, and X. Wang, “Cooperative sequential spectrum sensing based on level-triggered sampling,” IEEE Trans. Signal Process., vol. 60, no. 9, pp. 4509-4524, 2012.