# Active Hypothesis Testing for Quickest Anomaly Detection

###### Abstract

The problem of quickest detection of a single anomalous process among a finite number of processes is considered. At each time, a subset of the processes can be observed, and the observations from each chosen process follow one of two different distributions, depending on whether the process is normal or abnormal. The objective is to design a sequential search strategy that minimizes the expected detection time subject to an error probability constraint. This problem can be considered as a special case of active hypothesis testing first considered by Chernoff in 1959, where a randomized strategy, referred to as the Chernoff test, was proposed and shown to be asymptotically (as the error probability approaches zero) optimal. For the special case considered in this paper, we show that a simple deterministic test achieves asymptotic optimality and offers better performance in the finite regime. We further extend the problem to the case where multiple anomalous processes are present. In particular, we examine the case where only an upper bound on the number of anomalous processes is known.

Index Terms— Sequential detection, anomaly detection, dynamic search, active hypothesis testing, controlled sensing.

## I Introduction

We consider the problem of detecting a single anomalous process among M processes. Borrowing terminology from target search, we refer to these processes as cells and to the anomalous process as the target, which can be located in any of the cells. The decision maker is allowed to search K cells at a time (1 ≤ K ≤ M). The observations from searching a cell are i.i.d. realizations drawn from two different distributions f and g, depending on whether the target is absent or present. The objective is to design a sequential search strategy that dynamically determines which cells to search at each time and when to terminate the search so that the expected detection time is minimized under a constraint on the probability of declaring a wrong location of the target.

The problem under study applies to intrusion detection in cyber-systems when an intrusion into a subnet has been detected and the objective is to locate the abnormal component in the subnet (since the probability of each component being compromised is small, with high probability there is only one abnormal component). It also finds applications in target search, fraud detection, and spectrum scanning in cognitive radio networks.

### I-A A Case of Active Hypothesis Testing

The above problem is a special case of the sequential experiment design problem first studied by Chernoff in 1959 [1]. Compared with the classic sequential hypothesis testing pioneered by Wald [2] where the observation model under each hypothesis is predetermined, the sequential design of experiments has a control aspect that allows the decision maker to choose the experiment to be conducted at each time. Different experiments generate observations from different distributions under each hypothesis. Intuitively, as more observations are gathered, the decision maker becomes more certain about the true hypothesis, which in turn leads to better choices of experiments. Chernoff focused on the case of binary hypotheses and showed that a randomized strategy, referred to as the Chernoff test, is asymptotically optimal as the maximum error probability diminishes. Specifically, the Chernoff test chooses the current experiment based on a distribution that depends on past actions and observations. Variations and extensions of the problem and the Chernoff test were studied in [3, 4, 5, 6, 7, 8], where the problem was referred to as controlled sensing for hypothesis testing in [4, 5, 6] and active hypothesis testing in [7, 8] (see a more detailed discussion in Section I-C).

It is not difficult to see that the quickest anomaly detection problem considered in this paper is a special case of the active hypothesis testing problem considered in [1, 3, 4, 5, 7, 8]. In particular, under each hypothesis that the target is located in a particular cell, the distribution (either f or g) of the next observation depends on the cells chosen to be searched. The Chernoff test and its variations proposed in [3, 4, 5, 7, 8] thus directly apply to our problem. However, in contrast to the randomized nature of the Chernoff test and its variations, we show in this paper that a simple deterministic test achieves asymptotic optimality and offers better performance in the finite regime.

### I-B Main Results

Similar to [1, 3, 4, 5, 7], we focus on asymptotically optimal policies in terms of minimizing the detection time as the error probability approaches zero. The asymptotic optimality of the Chernoff test as shown in [1] requires that under any experiment, any pair of hypotheses be distinguishable (i.e., have positive Kullback-Leibler (KL) divergence). This assumption does not hold in the anomaly detection problem considered in this paper. For instance, under the experiment of searching the m-th cell, the hypotheses of the target being in the i-th (i ≠ m) and the j-th (j ≠ m) cells yield the same observation distribution f. Nevertheless, we show in Theorem 2 that the Chernoff test preserves its asymptotic optimality for the problem at hand even without this positivity assumption on all KL divergences. As a result, it serves as a benchmark for comparison.

The Chernoff test, when applied directly to the anomaly detection problem, leads to a randomized cell selection rule: the cells to be searched at the current time are drawn randomly according to a distribution determined by past observations and actions. The main result of this paper is to show that a simple deterministic policy offers the same asymptotic optimality yet with significant performance gain in the finite regime and considerable reduction in implementation complexity. Specifically, under the proposed policy, the cells to be searched at each time are selected deterministically according to the ordering of the sum log-likelihood ratios (LLRs) collected from the cells up to the current time, where σ̂(t) denotes the cell index with the highest sum LLRs at time t, and the choice among the leading cells is governed by the KL divergence D(g‖f) between the two distributions. Since D(g‖f) is the key quantity in the cell selection rule, we refer to the proposed deterministic policy as the DGF policy.

This deterministic selection rule is intuitively satisfying. Consider, for example, the case of K = 1. In this case, the DGF policy selects, at each time, either the cell with the largest sum LLRs or the cell with the second largest sum LLRs, depending on the order of D(g‖f) and D(f‖g)/(M−1). The intuition behind this selection rule is that D(g‖f) and D(f‖g) determine, respectively, the rates at which the state of the cell with the target and the states of the cells without the target can be accurately inferred. Based on the order of these two rates, the DGF policy aims at identifying either the cell with the target or those cells without the target. The selection rule is thus clear by noticing that searching the cell with the second largest sum LLRs will lead to sufficient exploration of all cells without the target, since the less explored cells tend to have higher sum LLRs among these cells. A more detailed discussion of the DGF policy and a rigorous proof of its asymptotic optimality are given in Section III.
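The rate comparison behind this rule can be sketched in a few lines of code. The sketch below is our own illustration (not the paper's code): for K = 1 it compares the two candidate asymptotic detection times discussed above, where `D_gf` and `D_fg` stand for D(g‖f) and D(f‖g).

```python
import numpy as np

def dgf_select(sum_llr, D_gf, D_fg):
    """Pick the single cell (K = 1) to probe under the DGF-style rule.

    Probing the current leader resolves the target cell at rate D(g||f);
    probing the runner-up explores the M-1 cells without the target,
    each resolved at rate D(f||g).  We pick whichever strategy yields
    the smaller asymptotic detection time.
    """
    M = len(sum_llr)
    order = np.argsort(-sum_llr)      # cell indices by sum LLR, descending
    time_leader = 1.0 / D_gf          # ~ (-log c)/D(g||f), up to the -log c factor
    time_others = (M - 1) / D_fg      # ~ (M-1)(-log c)/D(f||g)
    return int(order[0] if time_leader <= time_others else order[1])
```

For instance, with sum LLRs (3, 1, 2) and D(g‖f) clearly dominant, the rule probes the leading cell; when D(g‖f) is very small relative to D(f‖g)/(M−1), it probes the runner-up instead.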

We then extend the problem to the case where multiple anomalous processes are present. In particular, we examine the case where only an upper bound on the number of anomalous processes is known. Interestingly, we show that the Chernoff test may not be practically appealing under the latter setting. We thus consider a modified Bayes risk that better captures the design objective of practical systems and develop a deterministic policy that is again asymptotically optimal.

### I-C Related Work

Chernoff’s pioneering work on active hypothesis testing focused on sequential binary composite hypothesis testing [1]. The extension to M-ary hypotheses was given by Bessler in [3]. In [5], Nitinawarat et al. considered M-ary active hypothesis testing in both fixed sample size and sequential settings. Under the sequential setting, they developed a modified Chernoff test that is asymptotically optimal without the positivity assumption on all KL divergences required in [1, 3]. Furthermore, they examined the asymptotic optimality of the Chernoff test under constraints on decision risks, a stronger condition than the error probability constraint, and developed a modified Chernoff test to meet hard constraints on the decision risks. In [6], a more general model with Markovian observations and non-uniform control cost was considered. In [7], in addition to the asymptotic optimality criterion adopted by Chernoff in [1], Naghshvar and Javidi examined active sequential hypothesis testing under the notion of non-zero information acquisition rate, by letting the number of hypotheses approach infinity, and under a stronger notion of asymptotic optimality. They further studied in [8] the roles of sequentiality and adaptivity in active hypothesis testing by characterizing the gain of sequential tests over fixed sample size tests and the gain of closed-loop policies over open-loop policies.

Target search or target whereabouts problems have been widely studied under various scenarios. Results under the sequential setting can be found in [9, 10, 11, 12], all assuming single-process observations (i.e., K = 1). Specifically, optimal policies were derived in [9, 10, 11] for the problem of quickest search over Wiener processes. In [12], an optimal search strategy was established under the constraint that switching to a new process is allowed only when the state of the currently probed process is declared. Optimal policies under general distributions or with general multi-process probing strategies remain an open question. In this paper we address these questions under the asymptotic regime as the error probability approaches zero. Target search with a fixed sample size was considered in [13, 14, 15, 16]. In [13, 14, 15], searching in a specific location provides a binary-valued measurement regarding the presence or absence of the target. Similar to this paper, Castanon considered in [16] continuous observations: the observations from a location without the target and with the target have distributions f and g, respectively. Different from this paper, where we consider sequential settings and obtain an asymptotically optimal policy that applies to general distributions, [16] focused on the fixed sample size setting and required a symmetry assumption on the distributions f and g for the optimality of the proposed policy. The problem of universal outlier hypothesis testing was studied in [17]. Under this setting, a vector of observations, in which some coordinates follow an outlier distribution, is observed at each given time. The goal is to detect the coordinates with the outlier distribution based on a sequence of i.i.d. vectors of observations.

Another set of related work is concerned with sequential detection over multiple independent processes [18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]. In particular, in [24], the problem of identifying the first abnormal sequence among an infinite number of i.i.d. sequences was considered. An optimal cumulative sum (CUSUM) test was established under this setting. Further studies on this model can be found in [25, 26, 27]. While the objective of finding rare events or a single target considered in [24, 25, 26, 27] is similar to that of this paper, the main difference is that in [24, 25, 26, 27] the search is done over an infinite number of i.i.d. processes, where the state of each process (normal or abnormal) is independent of other processes. Under this independence assumption, the structure of the solution is to perform an independent sequential test without memory for each process. At each time when the decision maker decides to switch to a different process, the new process is chosen arbitrarily, and a sequential test starts afresh. In this paper, however, the number of the processes is finite and the number of the abnormal ones is known (or an upper bound is known). As a result, the process states are correlated. Under this model, the selection rule that governs which process to observe at each time is crucial in minimizing the detection delay, whereas in [24, 25, 26, 27] the order in which the processes are observed is irrelevant. Furthermore, in our model, the sequential tests for the processes have memory. When a process is revisited, all the observations obtained during previous visits are taken into consideration in decision making.

Another related problem considered recently deals with detecting the first disorder of a system involving multiple processes [28, 29, 30, 31]. In this problem, multiple sensors take observations sequentially from the environment and communicate with a fusion center, which determines whether there is a change in the statistical behavior of the observations. The asymptotic optimality of the multi-chart CUSUMs in detecting the first change-point was studied as the mean time between false alarms approaches infinity. In [28], asymptotic optimality was shown under one-shot schemes, in which the sensors communicate with the fusion center only when they signal an alarm. A Bayesian version of this problem was considered in [29] under the assumption that the fusion center has perfect information about the observations and a priori knowledge of the statistics of the change process. In [30], the problem was examined for the case where an unknown subset of sensors is compromised, and a fully distributed low-complexity detection scheme was proposed to mitigate the performance degradation and recover the log scaling. In [31], asymptotic optimality of the multi-chart CUSUMs was shown under a coupled system, where observations at one sensor can affect the observations at another. In this paper, however, the goal is to detect the abnormal processes (and not a change point), where the process states are fixed during the detection process.

### I-D Organization

In Section II we describe the system model and problem formulation. In Section III we propose the deterministic DGF policy and establish its asymptotic optimality. We also provide a comparison of DGF with the randomized Chernoff test. In Section IV we extend the problem to the case where multiple anomalous processes are present and consider both cases of known and unknown number of anomalous processes. In Section V we provide numerical examples to illustrate the performance of the proposed policy as compared with the Chernoff test. Section VI concludes the paper.

## II System Model and Problem Formulation

### II-A System Model

Consider the following anomaly detection problem. A decision maker is required to detect the location of a single anomalous object (referred to as a target) located in one of M cells. If the target is in cell m, we say that hypothesis H_m is true. The a priori probability that H_m is true is denoted by π_m, where Σ_{m=1}^{M} π_m = 1. To avoid trivial solutions, it is assumed that π_m > 0 for all m.

At each time, only K (1 ≤ K ≤ M) cells can be observed. When cell m is observed at time t, an observation y_m(t) is drawn independently across time, one observation at a time. If hypothesis H_m is false, y_m(t) follows distribution f; if hypothesis H_m is true, y_m(t) follows distribution g. Let P_m be the probability measure under hypothesis H_m and E_m the operator of expectation with respect to the measure P_m.

We define the stopping rule τ as the time when the decision maker finalizes the search by declaring the location of the target. Let δ be a decision rule, where δ = m if the decision maker declares that H_m is true. Let φ(t) be a selection rule indicating which K cells are chosen to be observed at time t. The time series vector of selection rules is denoted by φ = (φ(t), t = 1, 2, …). Let y(t) be the vector of observations obtained from the chosen cells at time t and let I(t) be the set of all cell selections and observations up to time t. A deterministic selection rule at time t is a mapping from I(t−1) to a subset of K cells. A randomized selection rule is a mapping from I(t−1) to probability mass functions over the subsets of K cells.

###### Definition 1

An admissible strategy Γ for the sequential anomaly detection problem is given by the tuple Γ = (τ, δ, φ).

### II-B Objective

Let P_e(Γ) = Σ_{m=1}^{M} π_m α_m(Γ) be the probability of error under strategy Γ, where α_m(Γ) is the probability of declaring δ ≠ m when H_m is true. Let E(τ | Γ) = Σ_{m=1}^{M} π_m E_m(τ | Γ) be the average detection delay under Γ.

We adopt a Bayesian approach as in [1, 4] by assigning a cost of c for each observation and a loss of 1 for a wrong declaration. The Bayes risk under strategy Γ when hypothesis H_m is true is given by:

R_m(Γ) ≜ α_m(Γ) + c E_m(τ) .   (1)

Note that c represents the ratio of the sampling cost to the cost of wrong detections.

The average Bayes risk is given by:

R(Γ) = Σ_{m=1}^{M} π_m R_m(Γ) .   (2)

The objective is to find a strategy Γ that minimizes the Bayes risk R(Γ):

inf_Γ R(Γ) .   (3)

### II-C Notations

Let 1_m(t) be the indicator function, where 1_m(t) = 1 if cell m is observed at time t, and 1_m(t) = 0 otherwise. Let

ℓ_m(t) ≜ log ( g(y_m(t)) / f(y_m(t)) )   (4)

and

S_m(t) ≜ Σ_{t'=1}^{t} ℓ_m(t') 1_m(t')   (5)

be the log-likelihood ratio (LLR) and the observed sum LLRs of cell m at time t, respectively. We then define σ̂(t) ≜ argmax_m S_m(t) as the index of the cell with the highest observed sum LLRs at time t. Let

ΔS(t) ≜ S_{σ̂(t)}(t) − max_{m ≠ σ̂(t)} S_m(t)   (6)

denote the difference between the highest and the second highest observed sum LLRs at time t.
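For concreteness, the bookkeeping in (4)-(6) can be sketched as follows under a hypothetical Bernoulli observation model (f = Bern(0.1) in a normal cell, g = Bern(0.5) in the abnormal cell); the model and all names are our own illustration.

```python
import numpy as np

# Illustrative Bernoulli observation model: f = Bern(F_P), g = Bern(G_P).
F_P, G_P = 0.1, 0.5

def llr(y):
    """Log-likelihood ratio log g(y)/f(y) of one observation, as in (4)."""
    return np.log(G_P / F_P) if y == 1 else np.log((1 - G_P) / (1 - F_P))

def leader_and_gap(sum_llr):
    """sigma_hat(t) and Delta S(t) of (5)-(6): the index of the cell with
    the highest sum LLRs and its gap to the second highest."""
    order = np.sort(sum_llr)[::-1]
    return int(np.argmax(sum_llr)), float(order[0] - order[1])
```

A single observation y = 1 from cell 0, say, raises that cell's sum LLRs by log(0.5/0.1) = log 5 and makes it the leader with a gap of log 5.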

Finally, we define

I* ≜ max ( D(g‖f) + (K−1) D(f‖g)/(M−1) , K D(f‖g)/(M−1) ) .   (7)

In subsequent sections we show that I* plays the role of the rate function, which determines the asymptotically optimal performance of the test. Increasing I* decreases the asymptotic lower bound on the Bayes risk. It is intuitive that I* increases with the observation capability K and decreases with the number of cells M.

## III The Deterministic DGF Policy

In this section we propose a deterministic policy, referred to as the DGF policy, to solve (3). Theorem 1 shows that the DGF policy is asymptotically optimal in terms of minimizing the Bayes risk (2) as c → 0.

### III-A The DGF Policy

At each time t, the selection rule of the DGF policy chooses K cells according to the order of their sum LLRs. Specifically, based on the relative order of D(g‖f) and D(f‖g)/(M−1), either the cells with the top K highest sum LLRs or those with the second to the (K+1)-th highest sum LLRs are chosen, i.e.,¹

φ(t) = { the cells with the K largest sum LLRs,              if D(g‖f) ≥ D(f‖g)/(M−1),
       { the cells with the 2nd,…,(K+1)-th largest sum LLRs,  otherwise.   (8)

¹ Cells with the same sum LLRs can be ordered arbitrarily.

The stopping rule and decision rule under the DGF policy are given by:

τ = inf { t : ΔS(t) ≥ −log c }   (9)

and

δ = σ̂(τ) .   (10)

The deterministic selection rule of the DGF policy can be intuitively explained as follows. Consider the case where K = 1. If cell σ̂(t) is selected at each given time t, the asymptotic detection time approaches (−log c)/D(g‖f), since the cell with the target (say cell m) is observed at each given time with high probability (in the asymptotic regime) and the test is finalized once sufficient information is gathered from this cell (for a detailed asymptotic analysis see Appendix VII-A). In this case, D(g‖f) determines the asymptotically optimal performance of the test. On the other hand, if the cell with the second largest sum LLRs is selected at each given time, the asymptotic detection time approaches (M−1)(−log c)/D(f‖g), since one of the cells without the target is observed at each given time with high probability and the test is finalized once sufficient information is gathered from all these cells. Since the M−1 cells without the target share the probing resource, the asymptotically optimal performance of the test is determined by D(f‖g)/(M−1). Therefore, the selection rule selects the strategy that minimizes the asymptotic detection time: the rates at which the state of the cell with the target and the states of the remaining cells can be accurately inferred are given by D(g‖f) and D(f‖g)/(M−1), respectively. Since (−log c)/D(g‖f) ≤ (M−1)(−log c)/D(f‖g) is equivalent to D(g‖f) ≥ D(f‖g)/(M−1), the selection rule of DGF is thus clear.
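Putting the selection, stopping, and decision rules together, the following Monte-Carlo sketch (our own illustration, not the paper's code) runs the K = 1 case under a Bernoulli observation model, stopping when the gap between the two largest sum LLRs exceeds −log c as discussed above.

```python
import numpy as np

def simulate_dgf(M, c, f_p, g_p, target, rng):
    """Run a DGF-style policy (K = 1) with f = Bern(f_p), g = Bern(g_p)
    until the gap between the two largest sum LLRs exceeds -log c.
    Returns (stopping time, declared cell)."""
    # KL divergences of the Bernoulli model
    D_gf = g_p * np.log(g_p / f_p) + (1 - g_p) * np.log((1 - g_p) / (1 - f_p))
    D_fg = f_p * np.log(f_p / g_p) + (1 - f_p) * np.log((1 - f_p) / (1 - g_p))
    probe_leader = 1.0 / D_gf <= (M - 1) / D_fg   # rate comparison above
    S = np.zeros(M)
    t = 0
    while True:
        t += 1
        order = np.argsort(-S)                    # cells by sum LLR, descending
        cell = int(order[0] if probe_leader else order[1])
        y = rng.random() < (g_p if cell == target else f_p)   # draw from g or f
        S[cell] += np.log(g_p / f_p) if y else np.log((1 - g_p) / (1 - f_p))
        top = np.sort(S)[::-1]
        if top[0] - top[1] >= -np.log(c):         # stopping rule
            return t, int(np.argmax(S))           # decision rule
```

With f = Bern(0.1), g = Bern(0.5), and c = 10⁻³, the declared cell matches the target in almost every run, at a detection time roughly proportional to −log c.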

### III-B Performance Analysis

The following main theorem shows that the DGF policy is asymptotically optimal in terms of minimizing the Bayes risk as c approaches zero:

###### Theorem 1 (asymptotic optimality of the DGF policy)

Let R(Γ_DGF) and R(Γ) be the Bayes risks under the DGF policy and any other policy Γ, respectively. Then,²

R(Γ_DGF) ∼ inf_Γ R(Γ) ∼ c(−log c)/I* as c → 0 .   (11)

² The notation x ∼ y as c → 0 refers to lim_{c→0} x/y = 1.

For a detailed proof see Appendix VII-A. We provide here a sketch of the proof. In App. VII-A1, we show that c(−log c)/I* is an asymptotic lower bound on the achievable Bayes risk. Then, we show in App. VII-A2 that the Bayes risk under the DGF policy approaches this asymptotic lower bound as c → 0. Specifically, the asymptotic behavior of R(Γ_DGF) is established based on Lemma 11, showing that the asymptotic expected detection time approaches (−log c)/I*, while the error probability is O(c) following Lemma 5.

The basic idea in establishing the asymptotic expected detection time under DGF in Lemma 11 is to upper bound the stopping time of DGF by analyzing three last passage times (given in Lemmas 7, 8 and 10). Specifically, if the stopping rule is disregarded and sampling is continued indefinitely, then three last passage times, say n₁, n₂, n₃, can be defined, where, roughly speaking, n₁ is the time after which the sum LLRs of the true cell (say m) is the highest among all the cells at all subsequent times; n₂ is the time when sufficient information for distinguishing hypothesis H_m from at least one false hypothesis has been gathered; and n₃ is the time when sufficient information for distinguishing hypothesis H_m from all false hypotheses has been gathered. It should be noted that n₁, n₂, n₃ are not stopping times and the decision maker does not know whether they have arrived (since the true cell is unknown and the last passage times also depend on the future by definition). However, by the definition of n₃ (see Appendix VII-A for details), the actual stopping time under DGF is upper bounded by n₃ (i.e., for every t < τ, the decision maker does know that n₃ surely has not arrived). As a result, E_m(n₃) is an upper bound on E_m(τ).

To show the asymptotic behavior of E_m(n₃), write n₃ = n₁ + (n₂ − n₁) + (n₃ − n₂); thus, E_m(n₃) = E_m(n₁) + E_m(n₂ − n₁) + E_m(n₃ − n₂). Lemma 8 shows that E_m(n₂ − n₁) ∼ (−log c)/I* as c → 0. Lemma 7 shows that E_m(n₁) = O(1), i.e., n₁ does not affect the asymptotic detection time. Note that differing from [5], where only polynomial decay of P_m(n₁ > t) was shown under the extended Chernoff test developed to handle indistinguishable hypotheses under some actions, Lemma 7 shows exponential decay of P_m(n₁ > t) under DGF. Lemma 10 shows that E_m(n₃ − n₂) = o(−log c). Combining Lemmas 7, 8 and 10, we can conclude that E_m(τ) ≤ E_m(n₃) ∼ (−log c)/I*. Since the error probability is O(c) following Lemma 5, the proof thus completes by noticing that the resulting upper bound on the Bayes risk under DGF coincides with the lower bound on the achievable Bayes risk.

### III-C Comparison with the Chernoff Test

Next, we analyze the classic randomized Chernoff test proposed in [1] when it is applied to the anomaly detection problem. We then compare the performance of the proposed DGF policy with the Chernoff test.

#### III-C1 The Chernoff Test

The Chernoff test has a randomized selection rule. Specifically, let q = (q(a), a ∈ A) be a probability mass function over a set A of available experiments that the decision maker can choose from, where q(a) is the probability of choosing experiment a. For a general M-ary active hypothesis testing problem, the action at time t under the Chernoff test is drawn from a distribution q*(t) that depends on the past actions and observations:

q*(t) = argmax_q min_{j ≠ î(t)} Σ_{a ∈ A} q(a) D( p_{î(t)}^{a} ‖ p_{j}^{a} ) ,   (12)

where j ranges over the set of hypotheses, î(t) is the ML estimate of the true hypothesis at time t based on past actions and observations, and p_j^a is the observation distribution under hypothesis j when action a is taken. The stopping rule and decision rule are given in (9) and (10), respectively.

It can be shown that when applied to the anomaly detection problem, the Chernoff test works as follows. When D(g‖f) ≥ D(f‖g)/(M−1), the Chernoff test selects cell σ̂(t) and draws the remaining K−1 cells randomly with equal probability from the other M−1 cells. Otherwise, all K cells are drawn randomly with equal probability from the M−1 cells excluding σ̂(t).
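This specialized selection rule can be sketched as follows (our own illustration; the flag `probe_leader` encodes which of the two conditions above holds):

```python
import numpy as np

def chernoff_select(sum_llr, K, probe_leader, rng):
    """Randomized cell selection of the Chernoff test as specialized to
    the anomaly detection problem: keep the ML cell sigma_hat when
    probing the target is preferred (probe_leader=True), drawing the
    remaining K-1 cells uniformly from the other M-1 cells; otherwise
    draw all K cells uniformly from the M-1 cells excluding sigma_hat."""
    M = len(sum_llr)
    sigma_hat = int(np.argmax(sum_llr))
    rest = np.array([m for m in range(M) if m != sigma_hat])
    if probe_leader:
        extra = rng.choice(rest, size=K - 1, replace=False)
        return sorted([sigma_hat, *extra.tolist()])
    return sorted(rng.choice(rest, size=K, replace=False).tolist())
```

Note the contrast with DGF: here only membership of σ̂(t) is deterministic, while the remaining probed cells are drawn at random regardless of how much evidence has already been collected from them.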

Even though the positivity assumption on KL divergences required in the proof of the asymptotic optimality of the Chernoff test in [1] no longer holds for the anomaly detection problem, we show in Theorem 2 below that the Chernoff test preserves its asymptotic optimality in this case. Note that in [5], a modified Chernoff test was developed in order to handle hypotheses that are indistinguishable under some (but not all) actions. The basic idea of the modified test is to replace the action distribution given in (12) with a uniform distribution on a subsequence of time instants that grows at a sublinear rate with time. This subsequence of arbitrary actions is independent of past observations and affects the finite-time performance. In Theorem 2 below we show that this modification is unnecessary for the anomaly detection problem.

###### Theorem 2

Let R(Γ_C) and R(Γ) be the Bayes risks under the Chernoff test and any other policy Γ, respectively. Then,

R(Γ_C) ∼ inf_Γ R(Γ) as c → 0 .   (13)

#### III-C2 Comparison

Although both the Chernoff test and the DGF policy are asymptotically optimal, simulation results demonstrate significant performance gain of DGF over the Chernoff test in the finite regime (see Section V). Next, we provide an intuitive argument for the better finite-time performance of DGF by drawing an analogy between the anomaly detection problem and the makespan scheduling problem.

Consider the problem of scheduling n jobs over K parallel machines (n > K). Each job requires a deterministic processing time. The objective is to minimize the makespan, which is defined as the completion time of all jobs. Note that when n > K, processing each job continuously until it is completed can be highly suboptimal, since a certain number of machines are left idle when there are fewer than K unfinished jobs. Note also that keeping machines idle during the scheduling process increases the makespan. The optimal solution to this problem is given by the LRPT (longest remaining processing time first) scheduler [32, Theorem 5.2.7], which schedules, at any given time, the K jobs with the longest remaining processing times.
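The LRPT rule can be illustrated with a unit-time simulation (assuming integer processing times for simplicity; this sketch is ours, not the book's algorithm statement):

```python
def lrpt_makespan(jobs, K):
    """Simulate LRPT on K machines in unit time steps: at each step, the
    (up to) K jobs with the longest remaining processing times each
    receive one unit of work.  Returns the makespan."""
    remaining = list(jobs)
    t = 0
    while any(r > 0 for r in remaining):
        remaining.sort(reverse=True)   # longest remaining first
        for i in range(min(K, sum(r > 0 for r in remaining))):
            remaining[i] -= 1
        t += 1
    return t
```

For three jobs of length 3 on two machines, LRPT finishes in 5 steps (the preemptive optimum, ⌈9/2⌉), whereas running each job to completion on one machine leaves a machine idle at the end and takes 6.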

The anomaly detection problem can be viewed as a problem of scheduling M−1 jobs (each being the detection process of distinguishing one of the M−1 false hypotheses from the true hypothesis) over K machines (K being the number of cells that the decision maker can probe simultaneously). Consider first the case where D(g‖f) < D(f‖g)/(M−1). In this case, DGF probes the cells with the 2nd,…,(K+1)-th largest sum LLRs at each time, while the Chernoff test selects K cells randomly among the cells other than σ̂(t). Both tests terminate once ΔS(t) ≥ −log c occurs. Assume that hypothesis H_m is true. Roughly speaking, following Lemma 5, once S_m(t) − S_j(t) ≥ −log c, the decision maker has sufficient evidence to distinguish false hypothesis H_j from the true hypothesis H_m. Except during an asymptotically insignificant initial stage of the detection process, the cells with the 2nd,…,(K+1)-th largest sum LLRs are cells without the target (see Lemma 7 for a detailed analysis of the last passage time after which σ̂(t) is the cell with the target at all subsequent times). In this case, the cells selected by DGF can be viewed as the jobs with the longest remaining processing times. The randomized Chernoff test, however, may lead to inefficient exploitation of the probing capacity, as explained above for the makespan scheduling problem. Furthermore, randomly selecting K cells from the cells other than σ̂(t) may result in probing a cell whose state can already be inferred with sufficient accuracy (as detailed in Appendix VII-A), which can be viewed as scheduling a job that is already completed or, equivalently, leaving a machine idle in the makespan problem. Such actions, however, will not occur under DGF. The argument for the case of D(g‖f) ≥ D(f‖g)/(M−1) is similar by viewing the problem as scheduling M−1 jobs over K−1 machines. Note that in this case both DGF and the Chernoff test dedicate one machine to probing cell σ̂(t), since under the condition D(g‖f) ≥ D(f‖g)/(M−1), probing the cell with the target is preferred to accelerate the detection process.

## IV Extension to Multiple Anomalous Processes

In this section we extend the results reported in previous sections to the case where multiple processes are abnormal. In Section IV-A we consider the detection of L abnormal processes, where L is known. In Section IV-B we consider the case where an unknown number of abnormal processes are present and only an upper bound on this number is known.

Throughout this section, we define H_S as the hypothesis indicating that the locations of all the targets are given by the set S, and π_S as the a priori probability that H_S is true. Here, the decision rule δ declares a set of target locations (i.e., a hypothesis H_S), and the error probability under policy Γ is defined as P_e(Γ) = Σ_S π_S α_S(Γ), where α_S(Γ) is the probability of declaring δ ≠ S when H_S is true.

### IV-A Known Number of Abnormal Processes

Consider the case where L abnormal processes are located among the M cells and L is known. In this case, the detection problem involves (M choose L) hypotheses. We show below that a variation of the DGF policy, dubbed the DGF(L) policy, is asymptotically optimal under this setting.

The stopping rule and decision rule under the DGF(L) policy are similar to those under the DGF policy:

(14)

where σ̂_ℓ(t) denotes the index of the cell with the ℓ-th largest sum LLRs at time t and

(15)

The selection rule under the DGF(L) policy is more involved and depends on the relative order of D(g‖f) and D(f‖g) (suitably scaled). Specifically,

(16)

where

(17)

and

(18)

It is not difficult to see that when L = 1, the DGF(L) policy degenerates to the DGF policy.

Next, we analyze the performance of the DGF(L) policy. Let

(19)

where

(20)

and

(21)

The following theorem shows the asymptotically optimal performance of the DGF(L) policy:

###### Theorem 3

Let R(Γ_DGF(L)) and R(Γ) be the Bayes risks under the DGF(L) policy and any other policy Γ, respectively. Then,

R(Γ_DGF(L)) ∼ inf_Γ R(Γ) as c → 0 .   (22)

See Appendix VII-C.

Note that in the DGF(L) policy, all L targets are declared simultaneously at the termination time of the detection. A modification to DGF(L) leads to a policy where abnormal processes are declared sequentially during the detection. Consider, for example, the case of K = 1 when identifying the cells with the target first is preferred (i.e., D(g‖f) is the dominant rate). It can be shown (with minor modifications to Theorem 3) that an asymptotically optimal policy is to test the cell with the largest sum LLRs and declare the first target once the largest sum LLRs exceeds the threshold −log c. The same procedure is then applied to the remaining cells. This repeats until L abnormal processes have been declared, at which point the detection terminates. The asymptotic expected termination time is of order L(−log c)/D(g‖f). Even though the total detection time remains of the same order as under the DGF(L) policy, this modified version may be more appealing from a practical point of view. In particular, actions can be taken to fix each abnormal process the moment it is identified; the total impact on the system by these abnormal processes can thus be reduced. If identifying the cells without the target first is preferred (i.e., D(f‖g) is the dominant rate), it can be shown that an asymptotically optimal policy is to test the cell with the smallest sum LLRs and declare the first normal process once the smallest sum LLRs drops below log c. The same procedure is then applied to the remaining processes and is repeated until M − L objects have been declared as normal (thus, the remaining ones are declared as abnormal). The asymptotic expected termination time is of order (M−L)(−log c)/D(f‖g). Even though in this case the modified version also declares all targets simultaneously at the termination time of the detection, the difference is that this modified version incurs far fewer switches among processes than the DGF(L) policy. This may be advantageous in practical scenarios where switching among tested processes results in additional cost or delay.
To see that the modified version incurs fewer switches, note that it tests the process that the decision maker is most sure to be normal based on past observations, while DGF(L) tests the process that the decision maker is least sure to be normal, excluding the processes currently considered to be the targets (see the second line in (18), which shows that DGF(L) chooses the cell with the largest sum LLRs; the processes with larger sum LLRs are the current maximum-likelihood target locations). It should be noted that these modified DGF(L) schemes are expected to achieve the same performance as DGF(L) in both the finite and asymptotic regimes, following an argument similar to that in Section III-C.
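The sequential-declaration variant described above can be sketched as follows (a hedged illustration of our own for K = 1 under a Bernoulli observation model, not the paper's exact scheme): the cell with the largest sum LLRs among the undeclared cells is probed, and a cell is declared a target once its sum LLRs crosses −log c.

```python
import numpy as np

def sequential_declare(M, L, c, f_p, g_p, targets, rng):
    """Probe the undeclared cell with the largest sum LLRs (f = Bern(f_p),
    g = Bern(g_p)); declare it a target once its sum LLRs exceeds -log c.
    Stops after L declarations; returns (detection time, declared cells)."""
    S = np.zeros(M)
    active = set(range(M))
    declared, t = [], 0
    while len(declared) < L:
        t += 1
        cell = max(active, key=lambda m: S[m])   # largest sum LLRs among undeclared
        y = rng.random() < (g_p if cell in targets else f_p)
        S[cell] += np.log(g_p / f_p) if y else np.log((1 - g_p) / (1 - f_p))
        if S[cell] >= -np.log(c):                # enough evidence: declare a target
            declared.append(cell)
            active.remove(cell)
    return t, declared
```

Because each target is declared as soon as its own evidence crosses the threshold, remedial action on a declared process can start while the test is still running on the remaining cells.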

### IV-B Unknown Number of Abnormal Processes

In this section we consider the interesting case where the number of abnormal processes (or targets) is unknown; it is only known that this number is upper bounded by a given value. We also assume that the number of cells satisfies:

(23)

Throughout this section, we allow the decision maker to declare the target locations sequentially during the test (similar to the modified DGF(L) policy discussed at the end of Section IV-A). We refer to the detection time as the time when the last target has been declared, and to the termination time as the time when the decision maker terminates the test. Note that the detection time and the termination time coincide when the number of targets is known (as discussed in previous sections). When the number of targets is unknown, however, the termination time is in general larger, since the decision maker does not know whether it has already identified all targets at the detection time. In general, the termination time increases linearly with M under any policy with a vanishing error probability. This is due to the fact that even if the targets have been detected with sufficient reliability, the decision maker must verify whether there are other targets in the remaining cells before terminating the test. On the other hand, following the modified DGF(L) policy, one would expect to achieve a detection time that does not grow with the total number of processes.

In scenarios with a large number of processes, a policy that focuses on minimizing the termination time, which grows linearly with the number of processes, may not be practically appealing. It is desirable to have a policy that allows each abnormal process to be identified and fixed as quickly as possible during the test; in other words, a policy that minimizes the detection time rather than the termination time. In this case, even though the test continues after the detection time to ensure there are no other targets, all abnormal processes have been fixed by the detection time and cease to incur cost to the system. We thus modify the objective function to the following Bayes risk:

(24)

and we are interested in finding a strategy that minimizes the Bayes risk (24). This design objective is similar to that considered in [21, 22, 23].
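One plausible form of such a Bayes risk, assuming a cost $c$ per observation up to the detection time $\tau_D$ and a unit loss for each erroneous declaration (the symbols $c$, $\tau_D$, and $\Gamma$ are assumptions for illustration, not taken from (24)):

```latex
% Plausible form of the Bayes risk in (24); symbols assumed for illustration.
R(\Gamma) \;\triangleq\; c\,\mathbb{E}\left[\tau_D\right]
  \;+\; \mathbb{P}\left(\text{at least one declared location is wrong}\right).
```

Under this form, minimizing the risk trades off the expected detection time against the probability of a wrong declaration, with $c$ controlling the trade-off.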

Before presenting the desired solution for this case, we demonstrate with a specific example that even though the Chernoff test is asymptotically optimal in terms of minimizing the termination time, it is highly suboptimal in terms of minimizing the detection time. Assume that the number of targets is upper bounded by two, and that the targets can be located in any of three cells. As a result, the detection problem includes six hypotheses. The observation model under every hypothesis and cell selection is given in Table I.

| Hypothesis (target cells) | cell 1 | cell 2 | cell 3 |
|---|---|---|---|
| {1} | g | f | f |
| {2} | f | g | f |
| {3} | f | f | g |
| {1, 2} | g | g | f |
| {1, 3} | g | f | g |
| {2, 3} | f | g | g |

Assume that a single-target hypothesis is true, and consider the ML estimate of the true hypothesis at each time. Consider a deterministic policy that selects the cells in decreasing order of their sum LLRs and declares a process to be a target or normal once its sum LLR crosses the corresponding upper or lower threshold. This policy achieves a small detection time, since the target cell is identified first with high probability; because the number of targets is unknown, the decision maker must then continue testing the normal processes before terminating the test. On the other hand, the Chernoff test (which aims to minimize the termination time) will not select the target cell initially, since doing so does not maximize (12). It can be verified that randomly selecting among the normal cells with equal probability maximizes (12), which results in a detection time greater than that achieved by the above deterministic policy. Intuitively speaking, once the normal cells are identified as normal, the remaining cell is identified as abnormal (because at least one target is present). Therefore, the Chernoff test observes the normal cells to minimize the termination time (by not testing the target cell), while increasing the detection time.
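The behavior described above can be sketched in code. The snippet below is a simplified illustration, not the paper's exact test: it assumes Gaussian observations, f = N(0, 1) when a cell is normal and g = N(1, 1) when it is abnormal, and implements the deterministic policy that always probes the undeclared cell with the largest sum LLR, declaring a cell abnormal or normal when its sum LLR crosses a threshold +B or −B.

```python
import random

def llr(x, mu=1.0):
    # log-likelihood ratio log(g(x)/f(x)) for f = N(0,1), g = N(mu,1)
    return mu * x - mu * mu / 2.0

def run_deterministic_policy(abnormal, n_cells=3, B=8.0, seed=0):
    """Probe the undeclared cell with the largest sum LLR; declare a cell
    abnormal when its sum LLR exceeds +B and normal when it falls below -B.
    Returns (detection_time, termination_time)."""
    rng = random.Random(seed)
    s = [0.0] * n_cells      # running sum LLR per cell
    declared = {}            # cell index -> True (target) / False (normal)
    t = detection_time = 0
    while len(declared) < n_cells:
        t += 1
        # deterministic selection: the undeclared cell that looks most abnormal
        cell = max((m for m in range(n_cells) if m not in declared),
                   key=lambda m: s[m])
        x = rng.gauss(1.0 if cell in abnormal else 0.0, 1.0)
        s[cell] += llr(x)
        if s[cell] >= B:
            declared[cell] = True
            detection_time = t   # time the most recent target was declared
        elif s[cell] <= -B:
            declared[cell] = False
    return detection_time, t     # termination time = t

dt, tt = run_deterministic_policy(abnormal={0})
print("detection:", dt, "termination:", tt)
```

Typically the target cell is declared well before the remaining cells are verified as normal, so the detection time is much smaller than the termination time, which is the gap the modified Bayes risk (24) is designed to exploit.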

Next, we present a deterministic policy to minimize the Bayes risk (24). Let be the set of cells satisfying at time . Define

(25)

The selection rule is given by:

(26)

The stopping rule and decision rule are given by:

(27)

and

(28)

Note that denotes the target locations and the complete set is declared at time . Since the number of targets is unknown, the decision maker continues taking observations to verify that there is no other target. The test is terminated at time .

The following theorem shows the asymptotically optimal performance of the proposed policy:

###### Theorem 4

See Appendix VII-D.

## V Numerical Examples

In this section we present numerical examples to illustrate the performance of the proposed deterministic policy as compared to the Chernoff test. We simulated a single anomalous object (i.e., target) located in one of cells with the following parameters: The a priori probability that the target is present in cell was set to for all . When cell is observed at time , an observation is independently drawn from a distribution or , depending on whether the target is absent or present, respectively. It can be verified that:

Let be the Bayes risks under the DGF policy and the Chernoff test, respectively. Let be the asymptotic lower bound on the Bayes risk as . We define:

as the relative losses in terms of Bayes risk under the DGF policy and the Chernoff test, respectively, as compared to the asymptotic lower bound. Following Theorems 1 and 2, we expect both relative losses to approach zero as the cost per observation approaches zero. They serve as performance measures of the tests in the finite regime.

First, we consider the case where and . Note that when and , the Chernoff test coincides with the DGF policy: they both select cell . When , however, the proposed policy selects cell , while the Chernoff test selects a cell randomly at each given time . We set and obtain . As a result, the Chernoff test and the DGF policy have different cell-selection rules. The performance of the algorithms is presented in Figs. 1(a) and 1(b), where trials were performed. In Fig. 1(a), the asymptotic lower bound on the expected sample size and the average sample sizes achieved by the algorithms are presented as a function of (log scale). In Table II we present the sample standard deviations and the standard-deviation multipliers for 95% confidence intervals, where is the average detection delay. In Fig. 1(b), the relative losses are presented as a function of . Although both schemes approach the asymptotic lower bound, it can be seen that the DGF policy significantly outperforms the Chernoff test in the finite regime for all values of .
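The 95% confidence intervals reported above can be computed from the sample mean and sample standard deviation in the usual way. The snippet below is a generic sketch using the normal approximation; the sample values are illustrative, not taken from the paper's trials.

```python
import math

def confidence_interval_95(samples):
    """Return (mean, half_width) of a normal-approximation 95% CI for the
    mean of i.i.d. samples: mean +/- 1.96 * s / sqrt(n)."""
    n = len(samples)
    mean = sum(samples) / n
    # unbiased sample variance (divide by n - 1)
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    half_width = 1.96 * math.sqrt(var / n)
    return mean, half_width

mean, hw = confidence_interval_95([10.2, 9.8, 10.5, 10.1, 9.9])
print(f"{mean:.2f} +/- {hw:.2f}")   # → 10.10 +/- 0.24
```

With the large trial counts used in Monte Carlo studies like this one, the half-width shrinks as 1/√n, which is why the reported multipliers are small relative to the average detection delay.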

[Table II: sample standard deviations and 95% confidence-interval multipliers under the DGF policy and the Chernoff test.]

Next, we consider the case where and (i.e., two cells are observed at a time). In this case, the DGF policy selects cells and at each given time only if . Otherwise, it selects cells and . The Chernoff test selects cells and (randomly) at each given time only if . Otherwise, it selects cells randomly. First, we set and obtain . The performance of the algorithms is presented in Figs. 2(a) and 2(b). Next, we set and obtain . The performance of the algorithms in this case is presented in Figs. 3(a) and 3(b). In Figs. 2(a) and 3(a), the asymptotic lower bound on the expected sample size and the average sample sizes achieved by the algorithms are presented as a function of the cost per observation. In Figs. 2(b) and 3(b), the relative losses are presented as a function of the cost per observation.

It can be seen that the DGF policy significantly outperforms the Chernoff test in the finite regime for all values of under all cases. These results demonstrate the advantage of using the deterministic selection rule applied by the DGF policy instead of the randomized Chernoff test for the anomaly detection problem.

## VI Conclusion

The problem of quickest detection of an anomalous process (i.e., target) among processes (i.e., cells) was investigated. Due to resource constraints, only a subset of the cells can be observed at a time. The objective is a search strategy that minimizes the expected search time subject to an error probability constraint. The observations from searching a cell are realizations drawn from one of two distributions, or , depending on whether the target is absent or present, respectively. A simple deterministic policy was established to solve the Bayesian formulation of the search problem, where a cost of per observation and a loss of for wrong decisions are assigned. The proposed index policy was shown to be asymptotically optimal in terms of minimizing the Bayes risk as the cost per observation approaches zero.

The problem was further extended to handle the case where multiple anomalous processes are present. In particular, the interesting case where only an upper bound on the number of anomalous processes is known was considered. We showed that existing methods may not be practically appealing under the latter setting. Hence, we proposed a modified optimization problem for this case. Asymptotically optimal deterministic policies were developed for these cases as well.

## VII Appendix

### VII-A Proof of Theorem 1

In this appendix we prove the asymptotic optimality of the DGF policy. In App. VII-A1, we show that is an asymptotic lower bound on the Bayes risk of any policy . Then, we show in App. VII-A2 that the Bayes risk under the DGF policy approaches this asymptotic lower bound. Specifically, the asymptotic optimality of DGF is based on Lemma 11, which shows that the asymptotic expected search time approaches the lower bound, while the bound on the error probability follows from Lemma 5.

Throughout the appendix we use the following notation. Let

(30)

be the number of times that cell has been observed up to time .

We define

(31)

as the difference between the observed sum of LLRs of cells and . Let

(32)

Thus,

(33)

Without loss of generality we prove the theorem when hypothesis is true. For convenience, we define

(34)

Note that is a zero-mean random variable under hypothesis .
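As a rough guide to this notation, the quantities in (30)–(32) plausibly take the following form, with the symbol names $N_m(t)$, $y_m(i)$, $\ell_m(i)$, and $S_m(t)$ assumed for illustration rather than taken from the source:

```latex
% Plausible reconstruction of the notation; symbol names are assumptions.
N_m(t) \triangleq \sum_{i=1}^{t} \mathbf{1}\{\text{cell } m \text{ is observed at time } i\},
  \qquad \text{(number of observations of cell } m \text{ up to time } t\text{)} \\
\ell_m(i) \triangleq \log\frac{g\big(y_m(i)\big)}{f\big(y_m(i)\big)},
  \qquad
S_m(t) \triangleq \sum_{i \le t:\ m \text{ observed at } i} \ell_m(i)
  \qquad \text{(sum LLR of cell } m\text{)}.
```

Under this reading, (31) is the difference $S_m(t) - S_n(t)$ between the sum LLRs of two cells, and under the hypothesis that cell $m$ is normal each $\ell_m(i)$ has negative mean $-D(f\|g)$, which is what makes the centered quantity in (34) a zero-mean random variable.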

#### VII-A1 The Asymptotic Lower Bound on the Bayes Risk

The asymptotic lower bound on the Bayes risk is established in Theorem 5 below and is mainly based on Lemmas 1 and 4, provided below. Throughout this section, denotes a generic stopping time that can be determined by any policy . In Section VII-A2, however, we will refer to as the specific stopping time under the DGF policy. Lemma 1 shows that , defined in (32), must be large enough to obtain a sufficiently small error probability. Lemma 4 implies that the stopping time must be large enough to obtain a sufficiently large .

###### Lemma 1

Assume that for all . Let . Then:

(35)

for all .

Note that:

(36)

Note that this holds as conditioned by the lemma. Next, we upper bound the remaining term by a change of measure, as in [1, Lemma ].

Let be the subset of the sample space in which for some , and is accepted at time . Let be the observation collected from cell at time (note that only a subset of the cells is observed at a time; an observation is meaningful only when the cell is probed, and otherwise it can be assigned an arbitrary value). Let be the set of all observations up to time , and let be the set of time indices at which cell was probed. Thus, for all there exists such that:

(37)

Thus,

(38)

As a result, by (32)

(39)

Finally,

(40)
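The change-of-measure step in (37)–(40) follows a standard pattern, which can be sketched as follows (the event $A$, threshold $C$, and sum-LLR symbol $S_m(\tau)$ are assumptions for illustration): on an event where the sum LLR of cell $m$ at the stopping time is below $-C$, the likelihood ratio between the two measures is at most $e^{-C}$, so

```latex
% Sketch of the change-of-measure bound; symbols assumed for illustration.
\mathbb{P}_g(A)
  \;=\; \int_A \frac{d\mathbb{P}_g}{d\mathbb{P}_f}\, d\mathbb{P}_f
  \;=\; \int_A e^{S_m(\tau)}\, d\mathbb{P}_f
  \;\le\; e^{-C}\, \mathbb{P}_f(A)
  \;\le\; e^{-C},
\qquad A \subseteq \{S_m(\tau) \le -C\}.
```

This is what forces the sum LLRs at the stopping time to be large (of order $\log(1/\text{error})$) for the error probability to be small, as Lemma 1 asserts.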

###### Lemma 2

Assume that and