Unstructured sequential testing in sensor networks
Abstract
We consider the problem of quickly detecting a signal in a sensor network when the subset of sensors in which signal may be present is completely unknown. We formulate this problem as a sequential hypothesis testing problem with a simple null (signal is absent everywhere) and a composite alternative (signal is present somewhere). We introduce a novel class of scalable sequential tests which, for any subset of affected sensors, minimize the expected sample size for a decision asymptotically, that is, as the error probabilities go to 0. Moreover, we propose sequential tests that require minimal transmission activity from the sensors to the fusion center, while preserving this asymptotic optimality property.
I Introduction
I-A Problem formulation
Consider $K$ sources of observations (sensors) which transmit their data to a global decision maker (fusion center). We assume that observations from different sensors are independent and that, for each $1 \le k \le K$, sensor $k$ observes a sequence $(X^k_t)_{t \in \mathbb{N}}$ of independent and identically distributed (i.i.d.) random variables with common density $f^k$ with respect to a dominating, $\sigma$-finite measure $\nu^k$. We denote by $\{\mathcal{F}^k_t\}$ the filtration generated by the observations at sensor $k$, i.e., $\mathcal{F}^k_t = \sigma(X^k_s; \, 1 \le s \le t)$ for every $t \in \mathbb{N}$. For each density $f^k$, we consider two possibilities, $f^k_0$ and $f^k_1$, so that the corresponding Kullback-Leibler information numbers,
\[ I^k_0 = \int \log\left(\frac{f^k_0}{f^k_1}\right) f^k_0 \, d\nu^k, \qquad I^k_1 = \int \log\left(\frac{f^k_1}{f^k_0}\right) f^k_1 \, d\nu^k, \]
are positive and finite. The goal is to distinguish at the fusion center between the following two hypotheses:
\[ H_0: \; f^k = f^k_0 \;\; \text{for every } 1 \le k \le K, \qquad H_1: \; f^k = \begin{cases} f^k_1, & k \in A, \\ f^k_0, & k \notin A, \end{cases} \]
where $A \subset \{1, \ldots, K\}$ is a subset of sensors that belongs to some class $\mathcal{P}$. The interpretation is that signal is present (resp. absent) at sensor $k$ when its observations are distributed according to $f^k_1$ (resp. $f^k_0$). Thus, the null hypothesis, $H_0$, represents the situation in which all sensors observe noise, whereas the alternative hypothesis, $H_1$, corresponds to the case that signal is present in some subset of sensors. In what follows, we denote by $P_A$ and $E_A$ the probability measure and the expectation, respectively, under $H_1$ when the subset of affected sensors is $A$, whereas the corresponding notation under $H_0$ will be $P_0$ and $E_0$.
We will be interested in the sequential version of this hypothesis testing problem. Thus, we assume that observations at the sensors and the fusion center are acquired sequentially and we want to select the correct hypothesis at the fusion center as soon as possible. This means that the goal is to find a sequential test, $(T, d_T)$, which consists of an $\{\mathcal{F}_t\}$-stopping time $T$ and an $\mathcal{F}_T$-measurable random variable $d_T$ that takes values in $\{0, 1\}$, so that $H_j$ is selected on $\{d_T = j\}$, $j = 0, 1$, where $\{\mathcal{F}_t\}$ is the filtration generated by the observations of all sensors, i.e.,
\[ \mathcal{F}_t = \sigma(X^k_s; \, 1 \le s \le t, \; 1 \le k \le K). \]
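An ideal sequential test should have small detection delay under both hypotheses, while controlling its error probabilities below prescribed levels. Specifically, given $\alpha, \beta \in (0, 1)$ and a class $\mathcal{P}$ of subsets of $\{1, \ldots, K\}$, we set
\[ C_{\alpha,\beta}(\mathcal{P}) = \Big\{ (T, d_T) : \; P_0(d_T = 1) \le \alpha \;\; \text{and} \;\; \max_{A \in \mathcal{P}} P_A(d_T = 0) \le \beta \Big\}, \]
i.e., $C_{\alpha,\beta}(\mathcal{P})$ is the class of sequential tests whose probabilities of type-I and type-II error are bounded above by $\alpha$ and $\beta$, respectively. Then, the problem is to find a sequential test that attains
\[ \inf_{(T, d_T) \in C_{\alpha,\beta}(\mathcal{P})} E_0[T] \tag{1} \]
and, for every set $A \in \mathcal{P}$,
\[ \inf_{(T, d_T) \in C_{\alpha,\beta}(\mathcal{P})} E_A[T]. \tag{2} \]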
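This is indeed possible when $\mathcal{P} = \{A\}$, that is when the subset of sensors in which signal may be present is known in advance. In this case, $C_{\alpha,\beta}(\mathcal{P})$ reduces to
\[ C_{\alpha,\beta}(A) = \big\{ (T, d_T) : \; P_0(d_T = 1) \le \alpha \;\; \text{and} \;\; P_A(d_T = 0) \le \beta \big\} \]
and from Wald and Wolfowitz [1] it follows that, for any $\alpha, \beta$ so that $\alpha + \beta < 1$, both (1) and (2) are attained by Wald’s [2] Sequential Probability Ratio Test (SPRT):
\[ S^A = \inf\{ t \in \mathbb{N} : Z^A_t \notin (-a, b) \}, \qquad d_{S^A} = \begin{cases} 1, & Z^A_{S^A} \ge b, \\ 0, & Z^A_{S^A} \le -a, \end{cases} \tag{3} \]
where $a, b$ are positive thresholds selected so that $P_0(d_{S^A} = 1) = \alpha$ and $P_A(d_{S^A} = 0) = \beta$, whereas $Z^A$ is the log-likelihood ratio process of $P_A$ over $P_0$. Since we have assumed that observations coming from different sensors are independent (an assumption that is not required for the optimality of the SPRT), it is clear that $Z^A = \sum_{k \in A} Z^k$, where
\[ Z^k_t = \sum_{s=1}^{t} \log \frac{f^k_1(X^k_s)}{f^k_0(X^k_s)} \]
is the log-likelihood ratio of the observations acquired by sensor $k$ up to time $t$.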
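While the optimality of the SPRT holds for any given $\alpha, \beta$ so that $\alpha + \beta < 1$, closed-form expressions for its operating characteristics are, in general, available only in an asymptotic setup, that is as $\alpha, \beta \to 0$. In what follows, whenever $\alpha$ and $\beta$ go to 0 simultaneously, we will assume implicitly that $|\log \alpha| / |\log \beta|$ converges to some positive constant and we will write $x \sim y$ when $x / y \to 1$ and $x \gtrsim y$ (resp. $x \lesssim y$) when $\liminf (x/y) \ge 1$ (resp. $\limsup (x/y) \le 1$). Then, it is well known that
\[ E_0[S^A] \sim \frac{|\log \beta|}{I^A_0}, \qquad E_A[S^A] \sim \frac{|\log \alpha|}{I^A_1}, \tag{4} \]
where $I^A_0$ and $I^A_1$ are the Kullback-Leibler information numbers between $P_0$ and $P_A$, which, due to the assumption of independence across sensors, take the form $I^A_0 = \sum_{k \in A} I^k_0$, $I^A_1 = \sum_{k \in A} I^k_1$.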
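As a rough illustration of the SPRT in (3), the following Python sketch accumulates log-likelihood ratio increments and stops at the first exit from the interval $(-a, b)$. The function names, the unit-variance Gaussian mean-shift model and the pooling helper (which sums the local increments over a known affected subset, as independence across sensors allows) are assumptions made for this example only.

```python
from typing import Iterable, Iterator, Optional, Sequence, Tuple


def sprt(llr_increments: Iterable[float], a: float, b: float) -> Tuple[Optional[int], Optional[int]]:
    """Run an SPRT on a stream of log-likelihood ratio increments.

    Returns (stopping_time, decision): decision is 1 if the cumulative
    log-likelihood ratio reaches b, 0 if it reaches -a, and (None, None)
    if the stream ends before either threshold is crossed.
    """
    if a <= 0 or b <= 0:
        raise ValueError("thresholds a and b must be positive")
    z = 0.0
    for t, increment in enumerate(llr_increments, start=1):
        z += increment
        if z >= b:
            return t, 1
        if z <= -a:
            return t, 0
    return None, None


def gaussian_llr_increment(x: float, mu: float) -> float:
    """Log-likelihood ratio of N(mu, 1) against N(0, 1) for one observation x.

    The Gaussian model is an illustrative assumption, not taken from the text.
    """
    return mu * x - 0.5 * mu * mu


def pooled_increments(rows: Iterable[Sequence[float]], mus: Sequence[float]) -> Iterator[float]:
    """Sum the local increments over the sensors of a known affected subset.

    Each row holds one observation per affected sensor at a given time.
    """
    for row in rows:
        yield sum(gaussian_llr_increment(x, mu) for x, mu in zip(row, mus))
```

For instance, `sprt(pooled_increments(rows, mus), a, b)` runs the test on the pooled statistic of a hypothetical known subset whose per-sensor means are listed in `mus`.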
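When the class $\mathcal{P}$ is not a singleton, i.e., when the alternative hypothesis is composite, it is not possible to find a sequential test that attains (2) for every subset $A \in \mathcal{P}$. For this reason, we need to restrict ourselves to sequential tests that are optimal in an asymptotic sense. Therefore, given a class $\mathcal{P}$ of subsets of $\{1, \ldots, K\}$, we will say that a sequential test $(T^*, d^*) \in C_{\alpha,\beta}(\mathcal{P})$ is asymptotically optimal under $H_0$, if
\[ E_0[T^*] \sim \inf_{(T, d_T) \in C_{\alpha,\beta}(\mathcal{P})} E_0[T] \quad \text{as } \alpha, \beta \to 0, \]
and under $H_1$, if for every $A \in \mathcal{P}$
\[ E_A[T^*] \sim \inf_{(T, d_T) \in C_{\alpha,\beta}(\mathcal{P})} E_A[T] \quad \text{as } \alpha, \beta \to 0. \]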
A number of asymptotically optimal (under both hypotheses) sequential tests have been proposed and studied in the case that signal may be present in at most one sensor, that is when
\[ \mathcal{P} = \{ A \subset \{1, \ldots, K\} : |A| = 1 \}, \]
where $|A|$ represents the cardinality of $A$. An example of such a test is given by the SPRT-bank, according to which each sensor runs an SPRT locally and transmits its decision to the fusion center; the latter stops and selects $H_1$ the first time that any sensor makes a selection in favor of the alternative, whereas it stops and selects $H_0$ when all sensors have made a decision in favor of the null (see, e.g., [3]). Another asymptotically optimal sequential test in this setup can be obtained if $Z^A$ in (3) is replaced by the generalized log-likelihood ratio statistic, $\max_{1 \le k \le K} Z^k$, or more generally by any statistic of the form
The latter approach can in principle be applied to the case that signal may be present in more than one sensor. Indeed, given any class $\mathcal{P}$, it can be shown that replacing $Z^A$ in (3) with a statistic that either sums or maximizes over $A \in \mathcal{P}$ the likelihood ratios $e^{Z^A_t}$, each weighted by a positive constant, leads to an asymptotically optimal sequential test. However, this test may not be implementable in practice, even for a moderate number of sensors. Consider, for example, the completely unstructured case, where there is absolutely no prior information regarding the set of affected sensors and $\mathcal{P}$ is given by
\[ \mathcal{P} = \{ A \subset \{1, \ldots, K\} : A \neq \emptyset \}. \tag{5} \]
Then, the implementation of the above sequential tests demands summing/maximizing $2^K - 1$ statistics at every time $t$, a requirement that may be prohibitive in practice.
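To make the growth concrete, the short Python sketch below enumerates the non-empty subsets of a set of sensors and counts them; the function names are hypothetical and the enumeration is only meant to illustrate why a per-subset statistic quickly becomes impractical.

```python
from itertools import combinations
from typing import Iterator, Tuple


def nonempty_subsets(num_sensors: int) -> Iterator[Tuple[int, ...]]:
    """Yield every non-empty subset of {0, ..., num_sensors - 1}."""
    sensors = range(num_sensors)
    for size in range(1, num_sensors + 1):
        yield from combinations(sensors, size)


def count_subset_statistics(num_sensors: int) -> int:
    """Number of statistics needed if one is kept per candidate subset."""
    return sum(1 for _ in nonempty_subsets(num_sensors))
```

With 10 sensors the count is already 1023, and it doubles (plus one) with every additional sensor, whereas a scheme that keeps one statistic per sensor needs only as many statistics as there are sensors.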
I-B Main contributions
In the present paper, we focus on the case that $\mathcal{P}$ is given by (5), i.e., we assume that signal may be present in any subset of sensors under the alternative hypothesis. In this context, we propose a class of sequential tests, whose implementation at any time $t$ requires $O(K)$ (instead of $O(2^K)$) operations, and we establish their asymptotic optimality. Specifically, we set
where and are one-sided stopping times of the form
and each is an adapted statistic that should be chosen appropriately. Our main contribution in this work is that we show how to select these statistics, as well as the thresholds and , in order to guarantee the asymptotic optimality of the proposed sequential test. Thus, in Section II we show that is asymptotically optimal under both hypotheses, when for every and
(6) 
there is a constant so that
(7) 
and thresholds and are selected so that
(8) 
where is the inverse of the survival function of the Erlang distribution with parameters and , i.e.,
(9) 
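A threshold of this kind can be computed numerically. The sketch below assumes that the Erlang distribution in question has shape equal to the number of sensors $K$ and unit rate, so that its survival function is $e^{-x} \sum_{j=0}^{K-1} x^j / j!$, and that the threshold is the point at which this survival function equals $\alpha$; the function names and the bisection routine are illustrative choices, not part of the paper.

```python
import math


def erlang_survival(x: float, shape: int) -> float:
    """P(Y >= x) for Y ~ Erlang(shape, rate=1): exp(-x) * sum_{j<shape} x^j / j!."""
    if x <= 0:
        return 1.0
    term, total = 1.0, 1.0  # j = 0 term of the sum
    for j in range(1, shape):
        term *= x / j  # term now equals x^j / j!
        total += term
    return math.exp(-x) * total


def alarm_threshold(alpha: float, num_sensors: int, tol: float = 1e-12) -> float:
    """Solve erlang_survival(b, num_sensors) = alpha for b by bisection."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    lo, hi = 0.0, 1.0
    while erlang_survival(hi, num_sensors) > alpha:
        hi *= 2.0  # expand until the root is bracketed
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if erlang_survival(mid, num_sensors) > alpha:
            lo = mid
        else:
            hi = mid
    return hi
```

With a single sensor the survival function reduces to $e^{-x}$ and the threshold is exactly $|\log \alpha|$; with more sensors the threshold is larger, but only by a term of logarithmic order.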
Conditions (6)-(7) are clearly satisfied when each statistic is chosen as the positive part of the local log-likelihood ratio $Z^k$, in which case the constant in (7) can be taken to be zero. In Section III, we show that if each sensor communicates with the fusion center only when $Z^k$ has increased by a fixed positive amount since the previous communication time, then selecting each statistic as the value of $Z^k$ at the most recent communication time also satisfies conditions (6)-(7). Furthermore, we show that the asymptotic optimality of the proposed test remains valid in this context, even with an asymptotically low rate of communication.
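The following Python sketch shows how a test of this type could be run with a number of operations per time step that is linear in the number of sensors. It rests on several assumptions that are not confirmed by the text above: the alarm rule stops the first time the sum of the positive parts of the local log-likelihood ratios reaches $b$; the null is accepted once every sensor has, at some point, seen its local log-likelihood ratio fall to $-a$ or below (in line with the one-shot description in Section III); and ties are resolved in favor of the alternative. All names are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def run_two_sided_test(
    llr_paths: Sequence[Sequence[float]], a: float, b: float
) -> Tuple[Optional[int], Optional[int]]:
    """Sketch of a scalable two-sided test over K sensors.

    llr_paths[k][t - 1] is the local log-likelihood ratio of sensor k at time t.
    Returns (stopping_time, decision), or (None, None) if no rule fires.
    """
    num_sensors = len(llr_paths)
    horizon = min(len(path) for path in llr_paths)
    reported_low = [False] * num_sensors  # one-shot flags, never reset
    for t in range(horizon):
        current = [path[t] for path in llr_paths]
        # Assumed alarm statistic: sum of positive parts, O(K) per step.
        alarm_statistic = sum(max(z, 0.0) for z in current)
        for k, z in enumerate(current):
            if z <= -a:
                reported_low[k] = True
        if alarm_statistic >= b:
            return t + 1, 1  # assumed tie-break: alarm is checked first
        if all(reported_low):
            return t + 1, 0
    return None, None
```

Each time step touches every sensor once, so the cost per step grows linearly with the number of sensors rather than with the number of candidate subsets.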
This infrequent communication is a very important property in applications characterized by limited bandwidth, where it is necessary to design schemes that require minimal transmission activity from the sensors to the fusion center (see, e.g., [5], [6]). Such communication constraints have motivated the problem of decentralized sequential hypothesis testing (see, e.g., [7]-[13]), where each sensor is required to transmit a small number of bits whenever it communicates with the fusion center. However, in this literature, it is typically assumed that the set of affected sensors is known in advance (i.e., $\mathcal{P} = \{A\}$) and asymptotically optimal decentralized sequential tests have been proposed only under this assumption (see [11], [12], [13]). Our second main contribution in the present work is that we construct a decentralized sequential test which requires infrequent transmission of one-bit messages from the sensors and we establish its asymptotic optimality when $\mathcal{P}$ is given by (5).
II Main results
In what follows, $\mathcal{P}$ is given by (5). We start by obtaining an asymptotic lower bound for the optimal performance under each hypothesis.
Theorem II.1
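As $\alpha, \beta \to 0$,
\[ \inf_{(T, d_T) \in C_{\alpha,\beta}(\mathcal{P})} E_0[T] \gtrsim \frac{|\log \beta|}{\min_{1 \le k \le K} I^k_0} \tag{10} \]
and, for every $A \in \mathcal{P}$,
\[ \inf_{(T, d_T) \in C_{\alpha,\beta}(\mathcal{P})} E_A[T] \gtrsim \frac{|\log \alpha|}{I^A_1}. \tag{11} \]
Proof. Since $C_{\alpha,\beta}(\mathcal{P}) \subset C_{\alpha,\beta}(A)$ for any $A \in \mathcal{P}$,
\[ \inf_{(T, d_T) \in C_{\alpha,\beta}(\mathcal{P})} E_A[T] \ge \inf_{(T, d_T) \in C_{\alpha,\beta}(A)} E_A[T] = E_A[S^A] \sim \frac{|\log \alpha|}{I^A_1}, \]
where the asymptotic equality follows from (4). This proves (11). In a similar way we can show that, for every $A \in \mathcal{P}$,
\[ \inf_{(T, d_T) \in C_{\alpha,\beta}(\mathcal{P})} E_0[T] \gtrsim \frac{|\log \beta|}{I^A_0}, \]
and optimizing the asymptotic lower bound over $A$ we obtain
\[ \inf_{(T, d_T) \in C_{\alpha,\beta}(\mathcal{P})} E_0[T] \gtrsim \frac{|\log \beta|}{\min_{A \in \mathcal{P}} I^A_0}. \]
Since $I^A_0 = \sum_{k \in A} I^k_0$ and $I^k_0 > 0$ for every $k$, it is clear that $\min_{A \in \mathcal{P}} I^A_0 = \min_{1 \le k \le K} I^k_0$, which proves (10).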
In the following theorem we show that selecting $a$ and $b$ according to (8) guarantees that the proposed sequential test belongs to $C_{\alpha,\beta}(\mathcal{P})$, as long as each statistic satisfies (6).
For any we have
and for any
where the inequality is due to (6). Now, for any given and , it is clear that
where , and from Wald’s likelihood ratio identity it follows that
where is expectation with respect to , the probability measure under which and for . The last two relationships imply that, for any given and , , which means that the random variable is stochastically dominated by an exponential random variable with rate 1. Since, due to the assumed independence across sensors, are independent, this implies that is stochastically dominated by an Erlang random variable with parameters 1 and , i.e.,
where is defined in (9). From the latter observation and the definition of it follows that for any :
Furthermore, for any given , from Wald’s likelihood ratio identity it follows that for any
where the second inequality holds because on for every and the third one because for any . Consequently,
and from the definition of it follows that for any
which completes the proof.
In the following theorem we show that if $a$ and $b$ are selected according to (8) and each statistic satisfies (7), then the proposed sequential test attains the asymptotic lower bounds in (10) and (11).
Theorem II.3
The proof of (i) is a direct consequence of Theorem 2 in [3]. In order to prove (ii), we observe that for any and
where the inequality is due to (7). As a result,
and taking expectations we obtain (13). From this relationship and (11) it is clear that it suffices to show that $b \sim |\log \alpha|$ as $\alpha \to 0$. Indeed, taking logarithms in the definition of $b$ in (8)-(9) we have
(14) 
which completes the proof.
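For completeness, here is a sketch of why such a threshold grows like $|\log \alpha|$. It assumes that (9) refers to the survival function of an Erlang distribution with shape $K$ and unit rate, $G(x) = e^{-x} \sum_{j=0}^{K-1} x^j / j!$, and that $b$ solves $G(b) = \alpha$; the symbol $G$ is notation introduced here for the purpose of this illustration.

```latex
\begin{align*}
|\log \alpha| &= -\log G(b) = b - \log \sum_{j=0}^{K-1} \frac{b^j}{j!}, \\
0 &\le b - |\log \alpha| = \log \sum_{j=0}^{K-1} \frac{b^j}{j!}
   \le \log K + (K-1) \log b \qquad (b \ge 1), \\
\frac{b}{|\log \alpha|} &\to 1 \quad \text{as } \alpha \to 0.
\end{align*}
```

The lower bound uses the fact that the $j = 0$ term of the sum equals 1, the upper bound uses $b^j / j! \le b^{K-1}$ for $b \ge 1$ and $j \le K - 1$, and the limit follows because $b \ge |\log \alpha| \to \infty$ while $(\log b) / b \to 0$.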
From the previous theorems it follows that selecting $a$ and $b$ according to (8) and the statistics so that (6)-(7) hold guarantees the asymptotic optimality of the proposed test under both hypotheses, when $\mathcal{P}$ is given by (5). Let us add a few remarks to this statement:

Condition (7) is not needed for the proposed test to belong to $C_{\alpha,\beta}(\mathcal{P})$ and to be asymptotically optimal under $H_0$.

The asymptotic optimality of remains valid even if for one or more , as long as . In the next section we will show that, with a particular specification for , this property has an interesting interpretation in terms of the communication requirements of the proposed scheme.
III The decentralized setup
Let us first note that the one-sided sequential test that decides in favor of the null hypothesis is a one-shot scheme; it requires that each sensor communicate with the fusion center at most once, as soon as its local log-likelihood statistic takes a value smaller than $-a$, at which time it simply needs to transmit a one-bit message to the fusion center, informing it about this development.
On the other hand, the implementation of the other stopping rule can be much more demanding from a communication point of view. For example, if each statistic is set equal to the positive part of $Z^k$, sensor $k$ needs to transmit the actual value of $Z^k_t$ at every time $t$ (or at least whenever it is positive). As we discussed in the Introduction, this may not be possible in applications characterized by bandwidth constraints.
In what follows, we assume that thresholds $a$ and $b$ are selected according to (8) and our goal is to suggest specifications for the local statistics that induce low transmission activity, while preserving the asymptotic optimality of the proposed sequential test.
In order to achieve this, we require that each sensor communicate with the fusion center only at an increasing sequence of stopping times, , which are finite under . In other words, each sensor should communicate with the fusion center only at some particular time instances and, at any given time, the decision to communicate or not should depend exclusively on the observations that have been acquired locally at the sensor until this time.
Given such a sequence of communication times, we denote by the instance of the most recent transmission and by the number of transmitted messages up to time , i.e.,
At any given time , the value of at the most recent communication instance,
(15) 
cannot be larger than , the maximum value of up to time . Indeed, note that coincides with at the communication times and stays flat in between. Therefore, selecting according to (15) satisfies condition (6) and, consequently, it guarantees that belongs in and is asymptotically optimal under .
When, in particular, the communication times are described by the recursion
(16) 
where and is a positive constant, then it is straightforward to see that, for any time , and . Therefore, in the case of the communication scheme (16), setting equal to satisfies condition (7) as well and implies that is asymptotically optimal also under . Furthermore, the final remark in the end of Section II suggests that the latter asymptotic optimality property is preserved even with an asymptotically low rate of communication from one or more sensors, as the constant in this setup controls the average period of communication at sensor .
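A level-triggered communication rule of this kind can be sketched as follows, assuming that a sensor transmits whenever its local log-likelihood ratio has risen by at least a fixed amount `delta` since the value it had at its previous transmission (taken to be 0 before the first one). The function names and this exact form of the recursion are assumptions made for illustration.

```python
from typing import List, Sequence


def communication_times(llr_path: Sequence[float], delta: float) -> List[int]:
    """Times at which the local LLR has risen by at least delta since the last message."""
    times = []
    reference = 0.0  # LLR value at the previous communication (0 before any)
    for t, z in enumerate(llr_path, start=1):
        if z - reference >= delta:
            times.append(t)
            reference = z
    return times


def last_reported_values(llr_path: Sequence[float], delta: float) -> List[float]:
    """Value of the local LLR at the most recent communication time, for each time t."""
    reported = []
    reference = 0.0
    for z in llr_path:
        if z - reference >= delta:
            reference = z  # the fusion center learns the exact current value
        reported.append(reference)
    return reported
```

By construction, the value held at the fusion center stays flat between transmissions and the current local value never exceeds it by `delta` or more, which is the kind of bounded gap the discussion above relies on.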
From the right-hand side in (15) it is clear that selecting as requires that at each time sensor transmit to the fusion center (with an “infinite-bit” message) the exact value of , the “realized” local log-likelihood ratio between and . However, if one insists that a small number of bits be transmitted at each communication, which is the main requirement in decentralized sequential testing [10], then this selection is no longer acceptable. Nevertheless, in the case of the communication scheme (16), it is intuitively clear that the value of each should be close to , at least when does not have “heavy tails” and/or is “large”. This suggests selecting each according to
(17) 
a selection that requires transmission of a single bit from each sensor at each communication time. Moreover, for every time it is clear that
(18) 
therefore, selecting according to (17) satisfies condition (6) and, consequently, it guarantees that belongs in and is asymptotically optimal under . On the other hand, for every we have
(19) 
where is the random, nonnegative overshoot associated with the transmission from sensor . This means that selecting each according to (17) does not satisfy condition (7), therefore Theorem II.3(ii) can no longer be applied to establish the asymptotic optimality of under .
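The effect of replacing exact values by one-bit messages can be sketched as follows. The code assumes the same level-triggered rule as in scheme (16), with a hypothetical increment `delta`, and assumes that the fusion center credits each received bit with exactly `delta`; the difference between the true local value at the last transmission and the credited value is then the accumulated, unobserved overshoot.

```python
from typing import List, Sequence, Tuple


def one_bit_statistic(llr_path: Sequence[float], delta: float) -> List[Tuple[float, float]]:
    """For each time t, return (value credited at the fusion center, accumulated overshoot).

    The credited value is delta times the number of one-bit messages received so far.
    """
    reference = 0.0  # true local LLR at the most recent transmission
    messages = 0     # number of one-bit messages sent so far
    out = []
    for z in llr_path:
        if z - reference >= delta:
            messages += 1
            reference = z
        credited = messages * delta
        out.append((credited, reference - credited))
    return out
```

Since every transmission is triggered by an increase of at least `delta`, the credited value never exceeds the true value at the last transmission, and the accumulated overshoot is non-negative and can only grow with the number of messages.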
Nevertheless, in the following theorem we show that this property remains valid if two additional conditions are satisfied. The first is that, for every , each must have a finite second moment under , i.e.,
(20) 
a condition that guarantees that
(21) 
is a finite quantity for any given and an term as for every . The second is that, now, we must let so that for every , so that each sensor does not communicate with the fusion center very frequently and the (unobserved) overshoots do not accumulate very fast.
In what follows, we denote by a term that is bounded above when divided by as , where
Theorem III.1
Suppose that each sensor communicates with the fusion center at the sequence of times described by (16) and that each is selected according to (17).
(i) If and , then belongs to and attains the asymptotic lower bound in (10).
The proof of (i) follows from (18) and Theorems II.1(i), II.2 and II.3(i). In order to prove (ii), we start with the observation that for any thresholds and that for any subset we have
(23) 
where the equality follows from Wald’s identity, whereas the inequality holds because whenever every is nonnegative, as it is the case with (17).
For any , setting in (19) and strengthening the inequality we have
(24) 
Moreover, setting , , we can see that is a integrable, adapted stopping time and a sequence of adapted, i.i.d. random variables with finite expectation, . As a result, from Wald’s first identity it follows that for every :
Therefore, taking expectations in (24) and recalling the definition of in (21) we have
Then, summing over and setting , we obtain
However, since each is selected according to (17), then it is clear that for every . Therefore,
and from (23) it follows that is bounded above by
(25) 
But since each is selected according to (17), it is clear that the overshoot cannot take a value larger than . As a result, and the upper bound (25) takes the form
Then, in order to prove (22), it suffices to note that the first two terms in the latter expression are , since is an term as , due to assumption (20).
IV Extensions
Theorems II.1, II.2 and II.3 do not rely heavily on the assumed i.i.d. structure of the sensor observations. Thus, it can be shown that the asymptotic optimality of remains valid for any statistical model (in discrete or continuous time) that preserves the asymptotic optimality of the SPRT. Moreover, the above results can be generalized to the case that a lower bound, , and an upper bound, , are available on the number of affected sensors, that is when .
Furthermore, it is straightforward to generalize the decentralized sequential test described in Section III, so that more than one bit is transmitted per communication. These additional bits can be utilized for the quantization of the unobserved overshoots and can improve the performance of the proposed test in the case of high rates of communication.
Finally, we should note that all these extensions, which will be presented elsewhere, require the assumption of independence across sensors. Removing this assumption remains an open problem.
Acknowledgments
This work was supported by the U.S. Air Force Office of Scientific Research under MURI grant FA9550-10-1-0569, by the U.S. Defense Threat Reduction Agency under grant HDTRA1-10-1-0086, by the U.S. Defense Advanced Research Projects Agency under grant W911NF-12-1-0034, by the U.S. National Science Foundation under grants CCF-0830419, EFRI-1025043, and DMS-1221888 and by the U.S. Army Research Office under grant W911NF-13-1-0073 at the University of Southern California, Department of Mathematics.
References
 [1] A. Wald and J. Wolfowitz, “Optimum character of the sequential probability ratio test,” Ann. Math. Statist., vol. 19, pp. 326–339, 1948.
 [2] A. Wald, Sequential analysis. Wiley, New York, 1947.
 [3] A. G. Tartakovsky, X. R. Li, and G. Yaralov, “Sequential detection of targets in multichannel systems,” IEEE Trans. Inform. Theory, vol. 49, pp. 425–445, 2003.
 [4] G. Fellouris and A. G. Tartakovsky, “Almost optimal sequential tests for discrete composite hypotheses,” to appear in Statistica Sinica.
 [5] J. N. Tsitsiklis, “Decentralized detection,” Advances in Statistical Signal Processing, Greenwich, CT: JAI Press, 1990.
 [6] V. Raghunathan, C. Schurgers, S. Park, and M. B. Srivastava, “Energy-aware wireless microsensor networks,” IEEE Sig. Proc. Mag., vol. 19, no. 2, pp. 40–50, Mar. 2002.
 [7] V. N. S. Samarasooriya and P. K. Varshney, “Sequential approach to asynchronous decision fusion,” Opt. Eng., vol. 35, no. 3, pp. 625–633, 1996.
 [8] V. V. Veeravalli, T. Basar, and H. V. Poor, “Decentralized sequential detection with a fusion center performing the sequential test,” IEEE Trans. Inf. Th., vol. 39, pp. 433–442, 1993.
 [9] A. M. Hussain, “Multisensor distributed sequential detection,” IEEE Trans. Aer. Elect. Syst., vol. 30, no. 3, pp. 698–708, 1994.
 [10] V. V. Veeravalli, “Sequential decision fusion: theory and applications,” J. Franklin Inst., vol. 336, pp. 301–322, 1999.
 [11] Y. Mei, “Asymptotic optimality theory for sequential hypothesis testing in sensor networks,” IEEE Trans. Inf. Th., vol. 54, pp. 2072–2089, 2008.
 [12] G. Fellouris and G. V. Moustakides, “Decentralized sequential hypothesis testing using asynchronous communication,” IEEE Trans. Inf. Th., vol. 57, no. 1, pp. 534–548, 2011.
 [13] Y. Yilmaz, G. V. Moustakides, and X. Wang, “Cooperative sequential spectrum sensing based on event-triggered sampling,” IEEE Trans. Signal Process., vol. 60, no. 9, pp. 4509–4524, 2012.