Scheduling Resource-Bounded Monitoring Devices for Event Detection and Isolation in Networks

Waseem Abbas, Aron Laszka, Yevgeniy Vorobeychik, and Xenofon Koutsoukos

W. Abbas is with the Institute for Software Integrated Systems, Vanderbilt University, Nashville, TN 37212, USA (email: waseem.abbas@vanderbilt.edu). A. Laszka is with the Department of Electrical Engineering and Computer Science at the University of California, Berkeley, CA 94720, USA (email: laszka@berkeley.edu). Y. Vorobeychik and X. Koutsoukos are with the Department of Electrical Engineering and Computer Science, Vanderbilt University, and also with the Institute for Software Integrated Systems, Vanderbilt University, Nashville, TN 37212, USA (emails: yevgeniy.vorobeychik@vanderbilt.edu, xenofon.koutsoukos@vanderbilt.edu).

Abstract

In networked systems, monitoring devices such as sensors are typically deployed to monitor various target locations. Targets are the points in the physical space at which events of some interest, such as random faults or attacks, can occur. Most often, these devices have limited energy supplies, and they can operate for a limited duration. As a result, energy-efficient monitoring of various target locations through a set of monitoring devices with limited energy supplies is a crucial problem in networked systems. In this paper, we study optimal scheduling of monitoring devices to maximize network coverage for detecting and isolating events on targets for a given network lifetime. The monitoring devices considered could remain active only for a fraction of the overall network lifetime. We formulate the problem of scheduling of monitoring devices as a graph labeling problem, which unlike other existing solutions, allows us to directly utilize the underlying network structure to explore the trade-off between coverage and network lifetime. In this direction, first we propose a greedy heuristic to solve the graph labeling problem, and then provide a game-theoretic solution to achieve optimal graph labeling. Moreover, the proposed setup can be used to simultaneously solve the scheduling and placement of monitoring devices, which yields improved performance as compared to separately solving the placement and scheduling problems. Finally, we illustrate our results on various networks, including real-world water distribution networks.

Index Terms: scheduling, networked systems, network coverage, graph labeling, potential games, dominating sets.


I Introduction

Detection and isolation of unwanted events such as faults, failures, and malicious intrusions is a fundamental concern in a variety of practical networks. For example, leakage detection in water distribution networks can reduce physical damage as well as financial losses [1]. For this purpose, monitoring devices, such as sensors, are typically deployed strategically throughout the network. Spatially distributed systems over large areas may often be monitored only by battery-powered devices, as wired deployment can be prohibitively expensive or impossible. If the power supply provided by batteries is insufficient for continuous monitoring during the intended lifetime of a system, batteries must be replaced regularly. Since the cost of battery replacement for a large number of devices can be very expensive, one of the primary design concerns for such systems is increasing the time until the batteries of the sensors are depleted. At the same time, it is desired to maintain a certain level of monitoring in terms of the number of targets covered throughout the network lifetime. Here, targets are the points in the physical space at which events of interest can occur. For instance, in water distribution networks, events can be the pipe bursts, and so targets can be the water pipes, which need to be monitored through sensors such as battery operated pressure sensors.

One of the primary approaches for conserving battery power is “sleep scheduling.” The idea is to have only a subset of the sensors activated at any given time, and to turn off (i.e., “sleep”) the remaining ones, thereby conserving power. By activating different sets of devices one after another, the overall lifetime of a system can be substantially increased. Previous works have mostly focused on finding schedules that ensure complete coverage, that is, guaranteeing that every target is monitored by some device at any given moment in time (e.g., [2, 3]). However, complete coverage is a very strict requirement, which severely limits the sets of devices that may be asleep at the same time. In fact, coverage (i.e., ratio of monitored targets to the total number of targets) is a submodular function of the set of active devices in most models (e.g., [4, 5]), which roughly means that attaining complete coverage is disproportionately expensive as compared to achieving reasonably good coverage. Managing energy resources of monitoring devices via their scheduling to achieve an appropriate coverage of targets is a significant issue in networks where extended network lifetime is a critical requirement.

In this paper, we design efficient scheduling schemes for a set of monitoring devices with limited battery supplies to achieve maximum target coverage for a given network lifetime. Scheduling of such devices to achieve complete network coverage is a special case of this general formulation. We model the network as a graph, in which monitoring devices could be deployed at a subset of nodes, and the targets could be nodes and/or edges. Each monitoring device has a limited active time, and covers a subset of targets within its range during its active time. For a given network lifetime, the objective is to determine the maximum possible coverage, both in terms of the detection and isolation of (events at) targets, and a schedule of monitoring devices to obtain the optimal coverage.

In this direction the main contributions of the paper are:

(1) We show that the optimal scheduling of monitoring devices is an APX-hard problem, that is, there is no polynomial-time approximation scheme (PTAS) for the problem unless P=NP.

(2) We provide a graph-theoretic formulation of the scheduling problem by showing that it is equivalent to a unique graph labeling problem, which allows us to directly exploit the network structure to obtain optimal schedules.

(3) To solve the graph labeling problem, and hence the scheduling problem, we propose two solutions. The first is a greedy heuristic that runs in polynomial time and, as we illustrate, gives near-optimal solutions for many networks. However, in general, performance guarantees of the heuristic in terms of the optimality of the solution remain unknown. The second is a game-theoretic solution, in which we show that the labeling problem can be posed as a potential game, for which efficient algorithms, such as binary log-linear learning (BLLL), are known that asymptotically give globally optimal solutions with arbitrarily high probability.

(4) Moreover, we illustrate that the game-theoretic solution allows simultaneously optimizing the placement and scheduling of monitoring devices, which gives better results as compared to separately solving the placement and scheduling. Note that the placement problem involves selecting optimal locations to deploy a given set of monitoring devices to maximize the target coverage within networks.

(5) We analyze the performance of the approach through simulations on various networks including real-world water distribution networks and random networks. For random networks, we also provide analytical results to determine the performance of random scheduling, which does not require any information about the network structure.

(6) Finally, we consider some practically relevant special cases of the problem, such as scheduling to maximize network lifetime while ensuring complete coverage of the targets within the network.

The rest of the paper is organized as follows: In Section II, we explain our system model and define the scheduling problem. Section III addresses the complexity of the problem. In Section IV, we present a graph labeling based formulation of the scheduling problem, and in Section V, we propose solutions to the graph labeling problem. Section VI discusses the simultaneous placement and scheduling of monitoring devices. Section VII illustrates simulation results, and Section VIII presents a particular case of interest of the scheduling problem. In Section IX, we provide an overview of related work, and we conclude the paper in Section X.

II System Model and Problem Formulation

In this section, first, we present the system model by describing all the major components involved, and then we formulate the problem of optimal scheduling of resource bounded monitoring devices in networks.

(a) Network Graph – We model the network as an undirected graph, $G(V, E)$, in which $V$ is the set of nodes, and $E$ is the set of edges given by unordered pairs of nodes. Two nodes are adjacent if there exists an edge between them. The neighborhood of a node $v$, denoted by $N(v)$, is the set of all nodes that are adjacent to $v$, i.e., $N(v) = \{u \in V : (u, v) \in E\}$, and the neighborhood of a subset of nodes $U \subseteq V$, denoted by $N(U)$, is $\bigcup_{u \in U} N(u)$. The degree of a node $v$, represented by $\deg(v)$, is simply $|N(v)|$. Moreover, a path is a sequence of nodes such that any two consecutive nodes in the path are adjacent, and the number of edges included in the path is the length of the path. Any two nodes are said to be connected if there exists a path between them. The distance between connected nodes $u$ and $v$, denoted by $d(u, v)$, is the length of the shortest path between them. Similarly, the distance between a node $u$ and an edge $e$ is denoted by $d(u, e)$. The network graph abstracts interactions among various nodes within the network.

(b) Targets – They are a subset of nodes and/or edges, denoted by $Y \subseteq V \cup E$, that could be subjected to an abnormal activity (or event), such as pipe failure, and therefore need to be monitored by monitoring devices.

(c) Monitoring Devices – These are the devices that are deployed at a subset of nodes in the network, and can monitor the other nodes and/or links within the network for any unusual activity, for instance, link failures such as pipe bursts in water networks. We refer to any such abnormal activity on a target as an event. A monitoring device can monitor all nodes and edges for events that lie within some pre-specified distance, referred to as the range, of the device. If $v$ is the node at which a monitoring device with the range $\lambda$ is deployed, then the device covers (monitors) all the nodes and edges in the set

$$\{x \in V \cup E : d(v, x) \le \lambda\}.$$

In other words, a target is covered if and only if it lies within the range of some monitoring device. Each device is resource-bounded in terms of the available battery supply, denoted by $\tau_b$, which means that a device can be active (or operational) only for a duration of $\tau_b$. Furthermore, a monitoring device has only two output states: event detected at some target without knowing the exact location of the target, and no event detected.

II-A Network Performance Measures

We are interested in measuring the quality of monitoring of targets through a set of monitoring devices, both from the detection and isolation perspectives. In detection, the objective is just to detect any abnormal activity on some target without determining its exact location, whereas in isolation, the goal is to uniquely identify the target at which the abnormal activity occurs. Moreover, we refer to the overall lifetime of the network, i.e., the duration for which monitoring of targets for detection (isolation) is considered, as the network lifetime $\tau$. To simplify, we divide the time into time slots of equal length. The battery supply of a monitoring device can be represented by the number of time slots, say $k$, in which the device can remain active. Moreover, the network lifetime can be represented by the total number of time slots, say $T$, for which the detection (isolation) of targets is considered. Note that $\tau$ and $\tau_b$ represent the actual durations of the overall network lifetime and the battery lifetime of an individual monitoring device, respectively, whereas $T$ and $k$, which are chosen to be positive integers, represent, respectively, the total number of time slots and the number of time slots in which each device can remain active.

(a) Detection Measure – Let there be a total of $m$ targets, and let $c_i$ be the number of targets that are covered by the monitoring devices that are active in the $i^{th}$ time slot. We define the average detection performance, denoted by $D$, as

$$D = \frac{1}{T} \sum_{i=1}^{T} \frac{c_i}{m}. \tag{1}$$

(b) Isolation Measure – We observe that an event at target $y_a$ can be distinguished from an event at target $y_b$ if and only if there exists a monitoring device that gives different outputs in case of events at $y_a$ and $y_b$. In other words, there exists a monitoring device at some node such that exactly one of the targets (either $y_a$ or $y_b$, but not both) is covered by the monitoring device. If such a monitoring device exists, we say that the target-pair $(y_a, y_b)$ is covered. The event at target $y_a$ can be uniquely detected (or can be distinguished from events at all other targets) if all target-pairs $(y_a, y_b)$ with $y_b \neq y_a$ are covered. If $m$ is the total number of targets, then there is a total of $\binom{m}{2}$ target-pairs. In the $i^{th}$ time slot, let $p_i$ be the number of target-pairs that are covered by the active sensors. Then, we define the average isolation performance, denoted by $I$, as

$$I = \frac{1}{T} \sum_{i=1}^{T} \frac{p_i}{\binom{m}{2}}, \tag{2}$$

where $T$ is the total number of time slots. A list of symbols used throughout the paper is given in Table I.
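As a minimal sketch of how the measures in (1) and (2) can be evaluated for a given schedule, the following Python snippet assumes a hypothetical dictionary `covers` that maps each device to the set of targets within its range. The function names, variable names, and the toy instance are illustrative assumptions and do not come from the paper.

```python
from itertools import combinations

def detection_measure(schedule, covers, targets):
    """Average fraction of targets covered per time slot, as in (1)."""
    target_set = set(targets)
    total = 0.0
    for active in schedule:
        # Union of the targets covered by the devices active in this slot.
        covered = set().union(*(covers[s] for s in active)) & target_set
        total += len(covered) / len(target_set)
    return total / len(schedule)

def isolation_measure(schedule, covers, targets):
    """Average fraction of target-pairs covered per time slot, as in (2)."""
    pairs = list(combinations(targets, 2))
    total = 0.0
    for active in schedule:
        # A pair is covered if some active device covers exactly one of the two.
        hits = sum(
            any((a in covers[s]) != (b in covers[s]) for s in active)
            for a, b in pairs
        )
        total += hits / len(pairs)
    return total / len(schedule)

# Hypothetical toy instance: two devices, three targets, two time slots,
# each device active in one slot.
covers = {"s1": {"t1", "t2"}, "s2": {"t2", "t3"}}
targets = ["t1", "t2", "t3"]
schedule = [{"s1"}, {"s2"}]
print(detection_measure(schedule, covers, targets))
print(isolation_measure(schedule, covers, targets))
```

In this toy instance each slot covers two of the three targets and two of the three target-pairs, so both measures come out at two thirds.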

Symbol        Description
$G(V, E)$     network graph
$S$           set of monitoring devices ($S \subseteq V$)
$Y$           set of targets
$\lambda$     range of monitoring device
$N(v)$        neighborhood of a node $v$
$N(U)$        neighborhood of a subset of nodes $U$
$\tau$        network lifetime in terms of actual time duration
$\tau_b$      duration for which a device can remain active
$T$           network lifetime in terms of the total number of time slots
$k$           number of time slots in which a device can remain active
$D$           average detection measure (1)
$I$           average isolation measure (2)
$S_i$         nodes at which devices are active in the $i^{th}$ time slot
$B$           bi-partite graph representation of the network
TABLE I: List of Symbols

II-B Problem Formulation

Consider a network $G(V, E)$ in which $S \subseteq V$ is the subset of nodes at which monitoring devices with range $\lambda$ are deployed, and $Y$ is the set of targets. Each monitoring device can remain active in at most $k$ of the total of $T$ time slots due to battery supply constraints. In each time slot $i \in \{1, \dots, T\}$, let $S_i \subseteq S$ be the subset of nodes with active monitoring devices. Thus, we get a schedule of (active) monitoring devices as $\mathcal{S} = (S_1, S_2, \dots, S_T)$.

The objective is to determine the maximum average detection performance $D$ (or average isolation performance $I$) for a given network lifetime, represented by $T$ time slots, under the battery constraints of monitoring devices, represented by $k$ time slots, and also a schedule $\mathcal{S}$ of monitoring devices that achieves the maximum $D$ (or $I$).

It is obvious that, for a fixed $k$, as $T$ increases, the maximum values of $D$ (or $I$) decrease. So, in a way, our goal is to understand the relationship between $T$ and $D$ (or $I$), and to design a systematic scheme to obtain a schedule for activating monitoring devices with limited battery supplies to obtain the desired network performance. Note that the scheduling problem for complete coverage of targets, in which the objective is to determine a schedule that ensures $D = 1$ throughout the network lifetime, is a special case of the above problem.

III Problem Complexity

In this section, we show that the problem of finding a schedule that maximizes the average detection performance for a given network lifetime and battery supplies, as discussed in Section II-B, is APX-hard. APX-hardness implies that, unless P=NP, there is no polynomial-time approximation scheme for the problem, that is, no polynomial-time algorithm that can approximate the optimum to within an arbitrarily small multiplicative factor.

In our case, for a target $y \in Y$, if $\rho_y$ represents the fraction of the total number of time slots in which an event on $y$ can be detected (i.e., $y$ is covered), then the expected value of detecting an event on an arbitrary target, denoted by $\bar{\rho}$, is

$$\bar{\rho} = \frac{1}{|Y|} \sum_{y \in Y} \rho_y. \tag{3}$$

Note that $D$ and $\bar{\rho}$ have exactly the same values for a given schedule $\mathcal{S}$, and therefore, they both measure the average detection performance of the schedule. We formulate finding a schedule that maximizes detection performance as the following optimization problem:

  • (Maximum Average Detection): Given a graph $G(V, E)$, a set of monitoring devices $S$, a set of targets $Y$, the range $\lambda$ of the monitoring devices, a network lifetime represented by $T$ time slots, and a battery supply represented by $k$ time slots, find a schedule $\mathcal{S}$ that maximizes the average detection performance $D$.

Theorem 3.1

The Maximum Average Detection Problem is APX-hard.

We show APX-hardness by reducing a well-known APX-hard problem, the Maximum Cut Problem [6] to the detection problem. The Maximum Cut Problem is defined as follows:

  • (Maximum Cut Problem): Given a graph $G(V, E)$, find a partition of $V$ into two disjoint sets $V_1$ and $V_2$ that maximizes the number of edges between $V_1$ and $V_2$.

Proof (Theorem 3.1) – We prove APX-hardness by showing that there exists a PTAS-reduction from the Maximum Cut Problem to the Maximum Average Detection Problem. First, we define a polynomial-time mapping from an instance of the cutting problem to an instance of the detection problem:

  • let the network of the Maximum Average Detection Problem be the graph $G(V, E)$ of the Maximum Cut Problem;

  • let the set of monitoring devices be $S = V$;

  • let the set of targets be $Y = E$;

  • let the range of the monitoring devices be such that a device at node $v$ covers exactly the edges incident to $v$;

  • let the network lifetime be $T = 2$ time slots;

  • and let the battery supply be $k = 1$ time slot.

Second, we define a polynomial-time mapping from a solution of an instance of the detection problem (i.e., a schedule) to a solution of the corresponding instance of the cutting problem (i.e., a cut):

(4)

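Next, observe that if an edge $e$ is cut by the cut corresponding to a schedule $(S_1, S_2)$, then the corresponding target is covered by both $S_1$ and $S_2$, which implies $\rho_e = 1$. On the other hand, if an edge is not cut, then the corresponding target is covered in only one time slot, which implies $\rho_e = 1/2$. Consequently, for any schedule and its corresponding cut $C$, we have

$$D = \frac{1}{|E|} \left( |C| \cdot 1 + \left( |E| - |C| \right) \cdot \frac{1}{2} \right) \tag{5}$$
$$\phantom{D} = \frac{1}{2} + \frac{|C|}{2|E|}, \tag{6}$$

where $|C|$ denotes the number of edges cut by $C$.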

Using the same argument, we can also show that if a schedule is an optimal solution to the detection problem, then the corresponding cut is also an optimal solution to the cutting problem, and vice versa. Therefore, if a schedule attains at least a $(1 - \epsilon)$ fraction of the optimal detection performance, then the corresponding cut contains at least a $(1 - 3\epsilon)$ fraction of the edges in the optimal cut, since a maximum cut always contains at least $|E|/2$ edges. Consequently, there is a PTAS-reduction from the Maximum Cut Problem to the Maximum Average Detection Problem.
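The relation between cuts and schedules used in the proof can be checked numerically on a small instance. The sketch below assumes the reduction described above (two time slots, one device per node active in exactly one slot, each device covering exactly its incident edges) and verifies that the detection performance equals $\frac{1}{2} + \frac{|C|}{2|E|}$ for every two-way split of the nodes. The graph and all function names are hypothetical.

```python
from itertools import product

def cut_size(edges, side):
    """Number of edges whose endpoints lie on different sides."""
    return sum(side[u] != side[v] for u, v in edges)

def detection_of_schedule(edges, side):
    """Detection measure when node v hosts a device active only in slot side[v]
    and every device covers exactly the edges incident to its node."""
    covered_slots = 0
    for u, v in edges:
        # The edge is covered in each slot in which one of its endpoints is active.
        covered_slots += len({side[u], side[v]})
    return covered_slots / (2 * len(edges))

# Hypothetical graph: a triangle with one pendant edge.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
nodes = range(4)

best = 0
for bits in product([0, 1], repeat=len(nodes)):
    side = dict(zip(nodes, bits))
    c = cut_size(edges, side)
    d = detection_of_schedule(edges, side)
    # Detection performance and cut size are related affinely.
    assert abs(d - (0.5 + c / (2 * len(edges)))) < 1e-12
    best = max(best, c)
print("maximum cut size:", best)
```

Because the relation is affine and increasing in the cut size, a schedule that maximizes the detection measure on this instance also yields a maximum cut.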

As a consequence, unless P=NP, we cannot solve the maximum average detection problem optimally in polynomial time. Hence, we need efficient heuristics that can provide reasonably good solutions with acceptable time complexities. In this regard, it becomes crucial to maximally exploit the structure of the problem in a systematic way. To achieve this objective, we first provide a graph-theoretic formulation of the scheduling problem in the next section, and then provide efficient solutions to the problem, including one based on a game-theoretic setting, in Section V.

IV A Graph-Theoretic Formulation of the Scheduling Problem

In this section, using various graph-theoretic notions, we formulate the scheduling problem as a graph labeling problem. In the next section, a solution approach is presented to solve the corresponding graph labeling problem, thus solving the original scheduling problem.

Our approach is to first obtain a bi-partite graph, denoted by $B$, from a given graph. This bi-partite graph illustrates targets and the monitoring devices with given ranges covering those targets. We then formulate the scheduling problem on the original network as a graph labeling problem on the bi-partite graph $B$.

IV-A Bi-partite Graphs in the Cases of Detection and Isolation

IV-A1 Case 1 – Detection

When scheduling of monitoring devices is required with the objective of maximizing the average detection score $D$, as described in Section II-A, the bi-partite graph $B$ is simply obtained as follows: the vertex set is the union $X \cup Y$, where $X$ is the set of nodes corresponding to the set of monitoring devices, and $Y$ is the set of targets in the original network $G$. Moreover, each $x \in X$ is adjacent to the vertices in $Y$ that are at most distance $\lambda$ away from $x$ in $G$. An example is shown in Figure 1.

IV-A2 Case 2 – Isolation

If maximizing the average isolation measure $I$, as in Section II-A, is the objective of scheduling, then $B$ is obtained as follows: As in the case of detection, the vertex set of the bi-partite graph is $X \cup Y$, where $X$ corresponds to the set of monitoring devices. To obtain $Y$, we define a node for every pair of targets in $G$. There will be $\binom{m}{2}$ such nodes in $Y$, where $m$ is the number of targets. As for the edge set of the bi-partite graph, let $y_{ab} \in Y$ correspond to the (unordered) target pair $(y_a, y_b)$. Then, each $x \in X$ is adjacent to $y_{ab}$ in $B$ if and only if exactly one of the targets $y_a$ or $y_b$ is within distance $\lambda$ from (the monitoring device corresponding to) $x$ in the original network $G$. In other words, in the bi-partite graph $B$, there will be no edge between $x$ and the node $y_{ab}$ that corresponds to the target pair $(y_a, y_b)$ if and only if the monitoring device $x$ covers both targets $y_a$ and $y_b$ in $G$, or covers neither of them. An example is illustrated in Figure 1.

Example

Consider a graph in Figure 1. Let be the set of monitoring devices and edges in the set be the targets. Moreover, each monitoring device has the range . The bi-partite graphs for the scheduling of monitoring devices to maximize the detection and isolation measures are shown in Figures 1(b) and 1(c) respectively. The vertex set of bi-partite graphs in both cases is , where . For the detection case, , and for the isolation case, , where corresponds to the pair of edges in . Note that an edge between and indicates that the monitoring device at covers the target pair , or in other words, can distinguish between events at and .

Fig. 1: (a) An example network graph $G$. Bi-partite graph representations for (b) detection and (c) isolation.
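The two constructions can be sketched in a few lines of Python. The snippet assumes that the coverage sets of the devices have already been computed from the ranges and are given as a hypothetical dictionary `covers`; the names and the toy instance are illustrative and are not the instance of Figure 1.

```python
from itertools import combinations

def detection_bipartite(covers, targets):
    """Adjacency of the detection graph: device -> targets it covers."""
    target_set = set(targets)
    return {x: set(cov) & target_set for x, cov in covers.items()}

def isolation_bipartite(covers, targets):
    """Adjacency of the isolation graph: device -> target-pairs it distinguishes.
    A pair is adjacent to a device iff the device covers exactly one of the two."""
    return {
        x: {
            frozenset(pair)
            for pair in combinations(targets, 2)
            if (pair[0] in cov) != (pair[1] in cov)
        }
        for x, cov in covers.items()
    }

# Hypothetical coverage sets for two devices and three targets.
covers = {"x1": {"t1", "t2"}, "x2": {"t2", "t3"}}
targets = ["t1", "t2", "t3"]
det = detection_bipartite(covers, targets)
iso = isolation_bipartite(covers, targets)
print(det)
print(iso)
```

Here device `x1` covers both `t1` and `t2`, so it is not adjacent to the pair `{t1, t2}` in the isolation graph, but it is adjacent to the two pairs that contain `t3`.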

IV-B A Graph Labeling Problem and its Equivalence to the Scheduling Problem

After obtaining the bi-partite graph $B$ from a given network $G$, we can re-write the detection and isolation scores in (1) and (2), respectively, in terms of $B$. Note that if $S_i \subseteq X$ is the subset of active monitoring devices in the $i^{th}$ time slot, then for detection (isolation), the set of targets (target-pairs) covered by $S_i$ is simply the neighborhood of the set $S_i$ in $B$, i.e., $N(S_i) = \bigcup_{x \in S_i} N(x)$. Here, $N(x)$ is the neighborhood of node $x$ as defined in Section II. Hence, for a given schedule $(S_1, S_2, \dots, S_T)$, where $T$ is the total number of time slots, the average detection (isolation) measure is simply $\frac{1}{T |Y|} \sum_{i=1}^{T} |N(S_i)|$. Thus, given a bi-partite graph $B$, the network lifetime in terms of $T$ time slots, and the battery supply constraint in terms of $k$ time slots, the problem of finding an optimal schedule that maximizes the average detection (isolation) measure as described in Section II-B becomes equivalent to finding a set of subsets $\{S_1, S_2, \dots, S_T\}$, where $S_i \subseteq X$, such that

$$\sum_{i=1}^{T} |N(S_i)| \ \text{ is maximized}, \tag{7}$$

and each node $x \in X$ is included in at most $k$ such subsets.

The above problem can be cast as a graph labeling problem as described below.

Graph Labeling Problem: Let $R = \{1, 2, \dots, T\}$ be the set of labels, and $\mathcal{L}$ be the set of all $k$-subsets of $R$ (that is, subsets of cardinality $k$, where $k$ is some positive integer). Note that $|\mathcal{L}| = \binom{T}{k}$. Moreover, we define

$$f : X \longrightarrow \mathcal{L}, \tag{8}$$

i.e., $f$ is a set function that assigns an element of $\mathcal{L}$ to each vertex in $X$, or in other words, assigns a subset of $k$ labels from $R$ to each $x \in X$. Also, for $y \in Y$, we define $c(y)$ as follows:

$$c(y) = \left| \bigcup_{x \in N(y)} f(x) \right|. \tag{9}$$

Note that $c(y)$ is simply the number of distinct labels available in the neighborhood of $y$. The objective is to obtain an assignment of labels to the nodes in $X$ (i.e., $f$ in (8)) such that

$$\sum_{y \in Y} c(y) \ \text{ is maximized}. \tag{10}$$

Here, the objective is to assign $k$ labels to each node in $X$ such that the sum of the number of distinct labels available in the neighborhood of $y$, over all $y \in Y$, is maximized. The scheduling problem in (7) and Section II-B is equivalent to the graph labeling problem described above.

Proposition 4.1

The problem of obtaining an optimal schedule that maximizes the average detection (isolation) measures of a set of monitoring devices with limited battery supplies that cover a set of targets (target-pairs) for a given network lifetime, which is divided into time slots, is equivalent to the graph labeling problem as defined in Equations (8)–(10).

Proof – In the graph labeling problem, let the subset of labels assigned to the vertex $x \in X$, i.e., $f(x)$, correspond to the indices of the time slots in which the monitoring device corresponding to $x$ is active. Since $f(x)$ has at most $k$ distinct labels by the definition of $f$, the monitoring device corresponding to node $x$ can be active in at most $k$ time slots. Hence, the battery supply condition, which requires a monitoring device to be active in at most $k$ time slots, is always satisfied. Moreover, $c(y)$ indicates the number of time slots in which the target (target-pair) $y$ remains covered by some $x \in X$. Then, $\frac{1}{T |Y|} \sum_{y \in Y} c(y)$ is simply the average detection (isolation) measure. The set of vertices that have label $i$ corresponds to the monitoring devices active in the $i^{th}$ time slot, i.e., $S_i = \{x \in X : i \in f(x)\}$. Thus, finding a labeling (8) that maximizes (10) is equivalent to finding a schedule that maximizes the average detection (isolation) measure.

An illustration of the graph labeling for the scheduling problem is given below.

Example

In Figure 2, instances of optimal labelings of the graphs in Figures 1(b) and 1(c) are shown for $T = 5$ and $k = 2$. Here, $T = 5$ means that the given network lifetime spans five time slots. Each node has at most two labels, which represents that, owing to the battery constraint, a node can be active in at most two of the time slots. The node labels indicate the time slots in which the nodes remain active, thus giving us optimal schedules. Here, the optimal detection score is 0.75, and the optimal isolation score is 0.633; both are attained by the schedules given by the corresponding labelings in Figure 2.

Fig. 2: Graph labelings for $T = 5$ and $k = 2$. Node labels, i.e., $f(x)$, are shown in colors.
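The correspondence between labelings and schedules can be illustrated with a short Python sketch. It assumes the bi-partite graph is given as a hypothetical adjacency dictionary from source nodes to target nodes; the instance below is a toy example and not the labeling of Figure 2.

```python
def labeling_to_schedule(labeling, num_slots):
    """Slot i contains exactly the source nodes whose label set contains i."""
    return [
        {x for x, labels in labeling.items() if slot in labels}
        for slot in range(1, num_slots + 1)
    ]

def labeling_objective(labeling, adj):
    """Sum over targets of the number of distinct labels in their neighborhood."""
    available = {}
    for x, nbrs in adj.items():
        for y in nbrs:
            available.setdefault(y, set()).update(labeling[x])
    return sum(len(labels) for labels in available.values())

def schedule_objective(schedule, adj):
    """Sum over slots of the number of targets covered by the active nodes."""
    return sum(len(set().union(*(adj[x] for x in active))) for active in schedule)

# Hypothetical instance: two source nodes, three targets, three labels, two per node.
adj = {"x1": {"y1", "y2"}, "x2": {"y2", "y3"}}
labeling = {"x1": {1, 2}, "x2": {2, 3}}
schedule = labeling_to_schedule(labeling, num_slots=3)
print(schedule)
print(labeling_objective(labeling, adj), schedule_objective(schedule, adj))
```

Both objectives count the same (target, slot) incidences, once per target and once per slot, which is why they coincide. Dividing either value by the number of slots times the number of targets gives the average measure.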

V Solutions to the Graph Labeling Problem

In this section, we provide two solution approaches to the graph labeling problem. The first one is a simple greedy heuristic, whereas, in the second approach, we utilize game-theoretic concepts. The greedy heuristic runs in polynomial time, and gives a near optimal solution for many practical networks as illustrated in the next section. However, in general, the approximation ratio of the algorithm is not known. On the other hand, the game-theoretic solution returns a graph labeling that is globally optimal with high probability.

V-A Greedy Heuristic

The graph labeling problem closely resembles the set covering problem, since we have to ‘cover’ the set of targets using a set of monitoring nodes, each of which can cover a given subset of the targets. Since the straightforward greedy algorithm is known to be an efficient approximation algorithm for the set covering problem, we can expect it to perform well for the graph labeling problem also. Hence, we formulate a simple greedy heuristic for the graph labeling problem as follows (Algorithm 1): For a given labeling set $R$ and $k$, iteratively select a combination of a label in $R$ and a source node in $X$ that maximizes the sum of the number of distinct labels available in the neighborhoods of all target nodes in $Y$. Note that in each iteration, only a source node with fewer than $k$ labels can be selected.

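Given: $B$ with vertex partition $(X, Y)$, label set $R = \{1, \dots, T\}$, $k$
Initialization: $f(x) \leftarrow \emptyset$ for all $x \in X$, $X' \leftarrow X$
While $X' \neq \emptyset$ do
       $(x^*, \ell^*) \leftarrow \arg\max_{x \in X',\ \ell \in R \setminus f(x)} \sum_{y \in Y} c(y)$, with $c(y)$ evaluated after adding $\ell$ to $f(x)$
       $f(x^*) \leftarrow f(x^*) \cup \{\ell^*\}$
       If $|f(x^*)| = k$ do
              $X' \leftarrow X' \setminus \{x^*\}$
       End If
End While
Algorithm 1 Greedy Heuristic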
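A minimal Python sketch of the greedy heuristic is given below. It follows the description in the text (repeatedly pick the node and label pair with the largest increase in the objective, and skip nodes that already hold $k$ labels). The adjacency-dictionary input, the tie-breaking rule (first pair found in sorted order), and all names are assumptions made for illustration.

```python
def greedy_labeling(adj, num_labels, k):
    """Greedy sketch: adj maps each source node to the set of targets it covers.
    Returns a dict mapping each source node to a set of at most k labels."""
    cap = min(k, num_labels)  # a node cannot hold more labels than exist
    labeling = {x: set() for x in adj}
    # available[y] = labels already present in the neighborhood of target y
    available = {y: set() for nbrs in adj.values() for y in nbrs}
    open_nodes = [x for x in sorted(adj) if cap > 0]
    while open_nodes:
        best_gain, best_x, best_label = -1, None, None
        for x in open_nodes:
            for label in range(1, num_labels + 1):
                if label in labeling[x]:
                    continue
                # Increase in the sum of distinct labels over all targets.
                gain = sum(label not in available[y] for y in adj[x])
                if gain > best_gain:
                    best_gain, best_x, best_label = gain, x, label
        labeling[best_x].add(best_label)
        for y in adj[best_x]:
            available[y].add(best_label)
        if len(labeling[best_x]) == cap:
            open_nodes.remove(best_x)  # node has used up its battery budget
    return labeling

# Hypothetical instance: two source nodes, three targets, two labels, one per node.
adj = {"x1": {"y1", "y2"}, "x2": {"y2", "y3"}}
result = greedy_labeling(adj, num_labels=2, k=1)
print(result)
```

On this toy instance the heuristic gives the two nodes different labels, so the shared target `y2` sees both labels. Tracking the labels already available at each target keeps the gain computation local to the neighborhood of the candidate node.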

If $n$ is the total number of source nodes, $m$ is the number of target nodes, and $T$ is the total number of labels in the labeling set, then the greedy heuristic runs in time polynomial in $n$, $m$, and $T$, since both the number of iterations and the cost of each iteration are polynomial in these quantities. The greedy heuristic gives a simple strategy to solve the labeling problem; however, we do not know the quality of the solution it returns, that is, how far the greedy solution is from the optimal one. Therefore, we present a game-theoretic solution by posing the labeling problem as a potential game, for which algorithms are known that attain a globally optimal solution with high probability as time goes to infinity, as discussed below.

V-B Game Theoretic Solution to the Graph Labeling Problem

Game theory concepts have been extensively employed to solve locational optimization problems, such as maximizing coverage on graphs (e.g., [7, 8]) and distributed control of multiagent systems (e.g., [9, 10]). In a particular approach, the idea is to determine a potential function that captures the overall global objective. The players’ individual utility functions are then appropriately aligned with the global objective such that the change in the utility of the player as a result of unilateral change in strategy equals the change in the global utility represented by the potential function. The players’ strategies are then designed to ensure that local actions lead to the global objective. It turns out that this problem formulation and design can be realized using a class of non-cooperative games known as potential games, which are now extensively used for various distributed control optimization problems.

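A finite strategic game $\Gamma = (\mathcal{P}, \mathcal{A}, \mathcal{U})$ consists of a set of players $\mathcal{P} = \{1, 2, \dots, n\}$, an action space $\mathcal{A} = A_1 \times A_2 \times \dots \times A_n$, where $A_i$ is a finite action set of the player $i$, and a set of utility functions $\mathcal{U} = \{U_1, U_2, \dots, U_n\}$, where $U_i : \mathcal{A} \rightarrow \mathbb{R}$ is the utility function of the player $i$. If $a = (a_1, a_2, \dots, a_n) \in \mathcal{A}$ denotes the joint action profile, we let $a_{-i}$ denote the actions of the players other than the player $i$. Using this notation, we can also represent $a$ as $(a_i, a_{-i})$.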

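A game is a potential game if there exists a potential function, $\phi : \mathcal{A} \rightarrow \mathbb{R}$, such that the change in the utility of the player $i$ as a result of a unilateral deviation from an action profile $(a_i, a_{-i})$ to $(a'_i, a_{-i})$ is equal to the corresponding change in the potential function. More precisely, for every player $i$, every $a_i, a'_i \in A_i$, and every joint action $a_{-i}$ of the other players, we get

$$U_i(a_i, a_{-i}) - U_i(a'_i, a_{-i}) = \phi(a_i, a_{-i}) - \phi(a'_i, a_{-i}). \tag{11}$$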

In the case of potential games, there exist algorithms, such as log-linear learning (LLL) [11, 12] and binary log-linear learning (BLLL) [13], that can be utilized to drive the players to action profiles that maximize the potential function. These algorithms embody the notion of convergence of such games to the most efficient Nash equilibrium, particularly in scenarios where utility functions are designed to ensure that the action profiles that maximize the global objective of the system coincide with the potential function maximizers [11, 13]. More precisely, in potential games, the LLL and BLLL algorithms guarantee that only the joint action profiles that maximize the potential function are stochastically stable [13]. LLL and BLLL are, in fact, noisy best-response algorithms that induce a Markov chain over the action space with a unique limiting distribution that depends on the noise parameter. As the noise parameter reduces to zero, the limiting distribution has a large part of its mass over the set of potential maximizers (see e.g., [13, 14] for details).

The basic idea behind these algorithms is to have noisy best-response dynamics, in which the noise parameter allows the players to occasionally select a suboptimal action. The probability of selecting a suboptimal action depends on the pay-off difference between the optimal and suboptimal cases. Thus, formulating the graph labeling problem as a potential game allows us to use the above-mentioned learning algorithms to find the most efficient solutions to the graph labeling problem. Our objective now is to design a potential game corresponding to the labeling problem on graphs, and to incorporate learning algorithms for potential games to achieve the desired labeling.

V-B1 A Potential Game for the Graph Labeling

We design a potential game to obtain a labeling of a graph that achieves the objective in (10), thus solving the scheduling problem. In our game, the set of players is the vertex set $X$ in the vertex partition $(X, Y)$ of the bipartite graph $B$, i.e., $\mathcal{P} = X$. For each player $x \in X$, the action set $A_x$ is the set of all $k$-subsets of the labeling set $R$. We also need to have a potential function that captures the global objective. For this, we define $X_\ell$ as the set of vertices with the label $\ell$, i.e.,

$$X_\ell = \{x \in X : \ell \in a_x\}. \tag{12}$$

A potential function is then defined as

$$\phi(a) = \sum_{\ell \in R} |N(X_\ell)|. \tag{13}$$

Note that $|N(X_\ell)|$ is simply the total number of nodes in $Y$ having the label $\ell$ in their neighborhoods, so $\phi(a)$ is this count summed over all the labels, which is equivalent to $\sum_{y \in Y} c(y)$ in (10). Thus, $\phi$ indeed captures the global objective.

Moreover, we define the utility function of the player as the total number of labels made available by to the nodes in that otherwise would not have been available to the nodes in . For instance, in Figure 2(a), node has labels , which represents the action . Moreover, for the two neighbors of node , i.e., and , node is the only one with the label ; and for the node , node is the only one with the label . Thus, . More precisely, we define as

(14)

where,

Next, we show that with the potential function as defined in (13), and the utility function as in (14), the game designed above is indeed a potential game.

Theorem 5.1

is a potential game if utilities are defined as in (14).

Proof – The potential function, as defined in (13) can be written as,

(15)

Similarly, for , we get

(16)

Subtracting (16) from (15) gives us the desired result, i.e.,

 

Since our graph labeling problem can be formulated as a potential game, using the results in [13] we deduce that if players adhere to the binary log-linear learning algorithm (stated below), then the objective in (10) is maximized. In other words, if unique labels from a total of labels are assigned to each node according to the algorithm below, then the number of distinct labels in the neighborhood of every node is likely to converge to the maximum value.

1:Initialization: Pick a small , an , and total number of iterations.
2:While do
3:       Pick a random node , and a random .
4:       Compute .
5:       Set with probability .
6:      
7:End While
Algorithm 2 Binary Log-Linear Learning [13]

Note that initially the nodes are assigned -element subsets of labels randomly. Afterwards, in each iteration, a node is selected at random, and a randomly chosen -subset of labels is adopted for that node with a probability that grows with the improvement it brings toward the objective in (10).
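The iteration described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `cover`, `labels_of`, `num_labels`, `k`, and `tau` are hypothetical, the noise is modeled with a temperature `tau` in the standard logit form, and the utility is taken to be the marginal contribution of a player, i.e., the number of labels that only this player supplies to each target it covers.

```python
import math
import random


def potential(cover, labels_of):
    """Count (target, label) pairs where some device covering the target holds the label."""
    total = 0
    for devices in cover.values():
        seen = set()
        for d in devices:
            seen |= labels_of[d]
        total += len(seen)
    return total


def utility(cover, labels_of, player):
    """Marginal contribution: labels that only `player` supplies to each target it covers."""
    count = 0
    for devices in cover.values():
        if player not in devices:
            continue
        others = set()
        for d in devices:
            if d != player:
                others |= labels_of[d]
        count += len(labels_of[player] - others)
    return count


def switch_probability(u_current, u_trial, tau):
    """Logit rule e^{u'/tau} / (e^{u/tau} + e^{u'/tau}), written to avoid overflow."""
    x = (u_current - u_trial) / tau
    if x > 700:  # exp(x) would overflow; the probability is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def blll(cover, players, num_labels, k, tau=0.2, iterations=3000, seed=0):
    """Binary log-linear learning sketch: one random player tries one random k-subset per step."""
    rng = random.Random(seed)
    # Assumption: labels are the integers 0..num_labels-1, one per time slot.
    labels_of = {p: frozenset(rng.sample(range(num_labels), k)) for p in players}
    for _ in range(iterations):
        p = rng.choice(players)
        trial = frozenset(rng.sample(range(num_labels), k))
        u_current = utility(cover, labels_of, p)
        u_trial = utility(cover, {**labels_of, p: trial}, p)
        if rng.random() < switch_probability(u_current, u_trial, tau):
            labels_of[p] = trial
    return labels_of
```

Here `cover` maps each target to the list of devices that can monitor it, for example `{"t1": ["a", "b"], "t2": ["b", "c"]}`. With this utility, a unilateral change of labels alters the utility and the potential by the same amount, which is the property that makes the sketch an instance of a potential game.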

VI Simultaneous Placement and Scheduling of Monitoring Devices

So far, we have considered the optimal scheduling of resource bounded monitoring devices, assuming that their placement is fixed, i.e., locations at which monitoring devices are deployed are given. If is the set of all such nodes at which monitoring devices could be deployed, then the placement problem is to select a subset with the given cardinality such that the number of targets (pair-wise targets) that are covered, i.e., lie within the range of at least one such device, is maximized. Typically, to maximize the coverage of targets for a given network lifetime, the placement problem is first solved, followed by the determination of efficient schedules for the monitoring devices.

However, for a given network lifetime and a fixed number of resource-bounded monitoring devices, simultaneously optimizing their placement and scheduling can yield a higher average detection (isolation) measure than optimizing them one after the other. For instance, consider the network in Fig. 3, in which three monitoring devices with and are deployed to cover the maximum number of nodes for . Fixing the placement of devices at nodes , an optimal schedule (for instance, ) gives , whereas the maximum possible under the conditions is , which could be obtained by placing the devices at nodes and with a schedule .

Fig. 3: (a) Optimal schedule for a given placement. (b) Optimal placement and schedule of three monitoring devices with , for .

The BLLL based algorithm to schedule a set of monitoring devices with fixed locations, presented in Section V-B, can be modified to simultaneously optimize the placement as well as scheduling of such devices to maximize the average detection (isolation) measure. This modification is presented as Algorithm 3 below. Fixing the number of monitoring devices , the objective is to select , and assign at most labels to each node from a labeling set so that the average detection measure (or the isolation measure ) is maximized. The labeling of nodes selected in will then give the schedule.

In this case, the players are the monitoring devices, for which we need to find the locations, i.e., the nodes at which they are deployed, as well as the schedules, i.e., the time slots in which they become active. Using the same notation as in Section V-B, here, the action of a player is the selection of , where is the set of all such nodes at which a monitoring device could be placed, and is the set of all possible -subsets of the labeling set . Previously, the choice of was fixed for a monitoring device and the player’s action consisted only of selecting . Similarly, the utility of a player for the choice of an action here is simply the number of new labels that become available in the neighborhood of node as a result of assigning labels in to . In search of a better solution, in each iteration, a new action is selected with a certain probability for a randomly selected player. This simply means that, with a certain probability, either new labels are assigned to the node at which the (randomly selected) player is located, or a new node as well as a new set of labels (selected at random) are chosen for the player.

1:Initialization: Pick a small and the number of iterations. Select randomly a subset of nodes , and assign labels to nodes in , i.e, select .
2:While do
3:       Randomly select a node .
4:       Randomly select a node , and .
5:       Compute .
6:       With probability , set , and select for node .
7:      
8:End While
Algorithm 3 Simultaneous Placement and Scheduling
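A compact Python sketch of this joint update is given below. It is an illustration under assumptions rather than the authors' code: `reach` (a hypothetical name) maps each candidate node to the set of targets within range of a device placed there, the trial node is drawn from the unoccupied candidates plus the player's current node, and the acceptance rule uses the same logit form with temperature `tau` as in standard binary log-linear learning.

```python
import math
import random


def covered_pairs(reach, placement):
    """Set of (target, label) pairs covered; placement maps node -> frozenset of labels."""
    pairs = set()
    for node, labels in placement.items():
        for target in reach[node]:
            for label in labels:
                pairs.add((target, label))
    return pairs


def placement_and_schedule(reach, num_devices, num_labels, k,
                           tau=0.2, iterations=3000, seed=0):
    """Jointly pick device locations and k-slot schedules with a BLLL-style update."""
    rng = random.Random(seed)
    candidates = sorted(reach)
    # Random initial placement, each device with a random k-subset of labels.
    placement = {node: frozenset(rng.sample(range(num_labels), k))
                 for node in rng.sample(candidates, num_devices)}
    for _ in range(iterations):
        node = rng.choice(sorted(placement))  # randomly selected player
        # Assumption: a device may stay where it is or move to an unoccupied node.
        free = [c for c in candidates if c not in placement or c == node]
        new_node = rng.choice(free)
        new_labels = frozenset(rng.sample(range(num_labels), k))
        rest = {n: l for n, l in placement.items() if n != node}
        base = len(covered_pairs(reach, rest))
        # Utility = number of (target, label) pairs that only this device provides.
        u_current = len(covered_pairs(reach, placement)) - base
        u_trial = len(covered_pairs(reach, {**rest, new_node: new_labels})) - base
        x = (u_current - u_trial) / tau
        p_switch = 0.0 if x > 700 else 1.0 / (1.0 + math.exp(x))
        if rng.random() < p_switch:
            placement = {**rest, new_node: new_labels}
    return placement
```

Dividing `len(covered_pairs(reach, placement))` by the number of targets times `num_labels` gives the fraction of (target, time slot) pairs that are covered, which is how this sketch interprets the average detection measure.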

Simulation results for the above algorithm are presented in Section VII-C. Using various networks, it is shown that simultaneously selecting the locations of monitoring devices and scheduling them using Algorithm 3 gives improved average detection as compared to solving the placement and scheduling problems separately.

VII Numerical Results

Fig. 4: Schematics for water networks 1 and 2.
Fig. 5: Plots of as a function of network lifetime for scheduling on water networks and random geometric networks, assuming that each monitoring device has a battery lifetime of time slots.
Fig. 6: Plots of as a function of (BLLL) iterations to illustrate the convergence of BLLL algorithm for the scheduling of monitoring devices with and .

In this section, we present numerical results on the simple greedy and BLLL based algorithms for the scheduling and placement of monitoring devices on urban water distribution networks and random geometric networks as explained below.

VII-A Scheduling Monitoring Devices in Water Distribution Networks

Water distribution networks can be modeled as undirected graphs in which edges represent the pipes and nodes represent the junctions. To detect pipe bursts and leakages, pressure sensors are deployed at junctions; a sensor can sense the pressure transient generated by a pipe burst within a certain distance (range) from the sensor. The distance-threshold-based model has been used in water networks in the context of sensor placement problems, e.g., [15, 16]. The pressure sensors are battery-operated devices with limited battery lifetime. Thus, to operate these sensors for an extended period of time, they need to be scheduled. Here, we simulate scheduling algorithms, including the simple greedy and BLLL-based algorithms, for the efficient scheduling of monitoring devices, which are pressure sensors in this case, to obtain high values of in two different water distribution networks. The details of these networks, referred to as Water Network 1 and Water Network 2, are as follows:

Water Network 1 [17, 18] has 126 nodes, 168 pipes, one reservoir, one pump, and two storage tanks. This benchmark water distribution network has been extensively studied in the context of sensor placement problems for water quality. Water Network 2 [19] is a grid system in Kentucky with 366 pipes, 270 nodes, three tanks, and five pumps. The layouts of both networks are illustrated in Figure 4. For both networks, we consider the junctions, at which the sensors are deployed, as source nodes (monitoring devices), i.e., , and the set of pipes, which are edges in the corresponding network graph, as targets, i.e., . Moreover, for each sensing device, we assume , and compute for a network lifetime, given by time slots, using the greedy and BLLL algorithms. For each BLLL instance, we perform 20,000 iterations by selecting to be . The plots of as a function of for various ranges of sensing devices (as defined in Section II) are given in Figure 5.

We can see that greedy and BLLL give approximately the same results. However, BLLL has an advantage over the greedy algorithm in that it allows us to simultaneously solve the placement and scheduling problems (as discussed in Section VI), which gives an improved as compared to solving the placement and scheduling problems individually. Moreover, if BLLL is run for a sufficiently large number of iterations, the algorithm converges to the optimal solution. Similar plots can be obtained for the scheduling of monitoring devices to maximize the average isolation measure by first obtaining the appropriate network representation as outlined in Section IV-A. In Figure 6, the convergence of the BLLL algorithm is illustrated. For both water networks, as a function of iterations is shown for , , and and . We observe that the algorithm converges quickly to a near-optimal value, within about 5000 iterations, and that the improvements thereafter are quite small.

VII-B Scheduling Monitoring Devices in Random Geometric Networks

Random geometric networks are a form of spatial networks in which nodes are deployed uniformly at random in a certain area. An edge exists between two nodes if the Euclidean distance between them is at most , which is often referred to as the radius of the sensing footprint. Owing to a wide variety of applications in various domains, such as modeling of wireless sensor networks, these networks have been extensively studied. For our simulations, we consider a network with nodes, deployed uniformly at random over an area of , and . The set of targets here is the set of all nodes. Moreover, a certain fraction of nodes (either or ) are selected at random as source nodes, i.e., nodes with monitoring devices. A monitoring device has a battery lifetime of at most time slots, and can monitor targets that are at a Euclidean distance of at most from it. (Footnote 2: In terms of the (graph) distances as defined in Section II, the range of each monitoring device is , as the Euclidean distance of at most 2 between two nodes and implies .) In Figure 5, as functions of are illustrated using the greedy and BLLL algorithms. Each point on the plots is an average of fifty randomly generated graph instances. In Figure 6, the convergence of the BLLL algorithm is shown for some instances of random geometric graphs with 100 nodes, out of which 20 randomly selected nodes contain monitoring devices.
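For readers who want to reproduce this kind of instance, a random geometric graph can be generated as in the short sketch below. The function name and the parameters `n`, `side`, and `radius` are placeholders chosen for illustration; they are not the values used in the simulations above.

```python
import math
import random


def random_geometric_graph(n, side, radius, seed=0):
    """Place n points uniformly in a side-by-side square; join pairs within `radius`."""
    rng = random.Random(seed)
    points = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            # Edge rule: Euclidean distance at most `radius`.
            if math.dist(points[i], points[j]) <= radius:
                adj[i].add(j)
                adj[j].add(i)
    return points, adj
```

The returned adjacency map can be turned into a target-to-device cover map by choosing a random subset of nodes as source nodes and listing, for every target, the source nodes within range.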

VII-B1 Random Scheduling in Random Networks

Another special case of interest is related to the quality of random scheduling: given a total of time slots, if each node remains active in time slots chosen randomly, what is the average detection performance of such a random schedule? Here, we discuss this question for random networks, including random geometric networks and networks that could be modeled by Erdős-Rényi random graphs. Though random scheduling is inferior to the BLLL-based scheduling in terms of the detection (or isolation) performance, it is useful in many scenarios since it requires neither any sort of information regarding the network structure nor any coordination between the monitoring devices. The average detection measure of random scheduling in random geometric networks is given below.

Proposition 7.1

Let be a random geometric graph in which each node contains a monitoring device that remains active in time slots that are randomly chosen from a total of time slots, which correspond to the overall lifetime of the network. If each node in a graph is also a target, then the average detection performance of this random scheduling is

(17)

where is the radius of the sensing footprint of a node, and is the number of nodes per unit area.

A proof of the above proposition is given in the Appendix. As above, it can be shown that in the case of Erdős-Rényi random graphs with nodes, denoted by , in which any two nodes are adjacent with some probability , this random scheduling scheme results in an average detection performance given by

(18)

Note that in (18), it is assumed that all the nodes have monitoring devices and all the nodes need to be covered.
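Random scheduling is easy to check numerically on a fixed graph. The sketch below is an illustration under assumptions, not a restatement of (17) or (18): it assumes that every node carries a device of range one, so a node is covered in a slot whenever it or one of its neighbors is active. Since each device is active in a given slot with probability k/num_slots independently of the others, a node with d neighbors is covered in a slot with probability 1 - (1 - k/num_slots)^(d+1). The names `num_slots`, `k`, and the helper functions are hypothetical.

```python
import random


def erdos_renyi(n, p, seed=0):
    """Sample an Erdos-Renyi graph: each pair of nodes is adjacent with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def expected_detection(adj, num_slots, k):
    """Exact per-graph expectation: average of 1 - (1 - k/num_slots)^(deg(v) + 1) over nodes."""
    q = 1.0 - k / num_slots
    return sum(1.0 - q ** (len(nbrs) + 1) for nbrs in adj.values()) / len(adj)


def simulated_detection(adj, num_slots, k, trials=2000, seed=0):
    """Monte Carlo estimate: fraction of (node, slot) pairs covered under random schedules."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(trials):
        # Each device independently picks k of the num_slots time slots.
        active = {v: set(rng.sample(range(num_slots), k)) for v in adj}
        for v, nbrs in adj.items():
            slots = set(active[v])
            for u in nbrs:
                slots |= active[u]
            covered += len(slots)
    return covered / (trials * len(adj) * num_slots)
```

Comparing `expected_detection` with `simulated_detection` on a sampled graph gives a quick sanity check of a random schedule before it is compared against a BLLL-based one.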

VII-C Simultaneous Placement and Scheduling of Monitoring Devices Using Algorithm 3

We illustrate Algorithm 3 on Water Network 1 and a random geometric graph. For Water Network 1, we set the number of monitoring devices to be , where each device has a range . The set of pipes (or edges in the corresponding network graph) are the targets that need to be covered by these devices. We simulate two scenarios: in the first, we use Algorithm 3 to simultaneously select the nodes and schedules for the monitoring devices; in the second, we first solve the placement problem by selecting the 25 nodes, say , that maximize the number of edges that are at most distance 2 from some node in , and then solve the scheduling problem using Algorithm 2. We note here that placement problems, in this context, are typically solved using some variant of the minimum set cover problem, or the maximum coverage problem in case the number of monitoring devices is fixed (e.g., [4, 5, 20]). Since the number of devices is fixed here, and the targets to be covered are edges, we use the maximum coverage problem to place (a given number of) monitoring devices at nodes that maximize the number of edges that are at most distance from at least one of the selected nodes. Moreover, since the maximum coverage problem is NP-hard, we use a greedy heuristic, which gives the best approximation ratio, to solve it.
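The greedy heuristic for maximum coverage used as the placement baseline can be sketched as follows. This is the textbook greedy rule (repeatedly pick the node that covers the most still-uncovered targets), which is known to achieve an approximation ratio of 1 - 1/e; the names `reach` and `budget` are hypothetical, and the sketch is not claimed to match the authors' implementation in its details.

```python
def greedy_max_coverage(reach, budget):
    """Pick up to `budget` nodes, each time taking the one covering most uncovered targets.

    reach maps each candidate node to the set of targets within its range.
    """
    chosen, covered = [], set()
    remaining = dict(reach)
    for _ in range(budget):
        if not remaining:
            break
        # Marginal gain of a node = number of its targets not yet covered.
        best = max(remaining, key=lambda n: len(remaining[n] - covered))
        if not remaining[best] - covered:
            break  # no candidate adds coverage any more
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered
```

In the two-step baseline, the nodes returned by such a placement routine would then be handed to the scheduling step, whereas the joint approach searches over locations and schedules together.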

The results are illustrated in Figure 7. It can be seen that Algorithm 3 (simultaneously solving placement and scheduling) always gives a higher average detection . For the random geometric graph, we simulate instances consisting of 50 nodes deployed at random in an area of , out of which 10 could contain monitoring devices capable of covering nodes within a Euclidean distance of 100 units. The targets here are nodes, and the objective is to maximize the average detection for a given network lifetime. As with the water network example, average detection is improved if placement and scheduling are solved simultaneously using Algorithm 3 as compared to optimizing placement and scheduling separately. For all cases, the battery lifetime of each monitoring device is assumed to be time slots. In Figure 8, we illustrate the convergence of Algorithm 3 for the Water Network 1 and random geometric graph examples, given the network lifetime time slots.

Fig. 7: Comparison of simultaneously optimizing scheduling and placement using Algorithm 3 versus separately optimizing placement and scheduling in terms of as a function of .
Fig. 8: Plots of as a function of iterations of Algorithm 3 showing the convergence of the algorithm.

VIII Special Case: Scheduling to Maximize Network Lifetime While Ensuring Complete Coverage

An important special case of the scheduling problem is to control the activity of monitoring devices such that the overall network lifetime is maximized while ensuring complete coverage, i.e., . In a basic setting, we consider that all nodes in a graph need to be covered at all times, and each node is equipped with a monitoring device that can remain active in at most time slots. Then, the objective is to schedule these monitoring devices such that the number of time slots , in which all of the nodes remain covered through a subset of active devices, is maximized.

The problem is related to the notion of dominating sets in graphs.

  • A dominating set is a subset of vertices in a graph , such that for every , either , or there exists some such that .

    In other words, considering the targets to be the set of nodes (i.e., ) and the ranges of monitoring devices to be 1, the network is guaranteed to be completely covered whenever the set of nodes with active monitoring devices forms a dominating set in the network graph. Moreover, in the case of targets being edges (i.e., ), a dominating set of active monitoring devices with ranges is also sufficient for the complete coverage of targets within the network. Thus, to maximize the overall network lifetime while ensuring complete coverage of targets, the problem of finding distinct dominating sets in a graph is of great importance. The problem of finding distinct dominating sets under certain constraints has been of great interest owing to its wide variety of applications (e.g., [21, 22, 23, 24]). There are two approaches to maximizing the number of distinct dominating sets under a constraint on the number of times a node can appear in a dominating set: disjoint dominating sets and non-disjoint dominating sets.

    VIII-A Disjoint Dominating Set Based Approach

    One way to approach this problem is to partition the vertex set such that each set in the partition is a dominating set, and all dominating sets are pair-wise disjoint. Such a partition is known as the domatic partition, and the maximum number of (disjoint dominating) sets that can be obtained is known as the domatic number, denoted by . Since dominating sets are pair-wise disjoint in such a partition, each vertex belongs to only one of the dominating sets. Moreover, since each node can be active for time slots, each dominating set can remain active for time slots. If only one dominating set is active at any time instant, which is sufficient for the complete coverage, then the lifetime of the network achievable through this approach is given by

    (19)

    time slots, where is the domatic number of a graph. The domatic partition problem is known to be NP-hard [25]. Various sensor scheduling schemes that utilize domatic partitions have been proposed to maximize the network lifetime while ensuring complete coverage (e.g., [24, 26, 27]).

    VIII-B Non-Disjoint Dominating Set Based Approach

    Another way to approach network lifetime maximization while maintaining complete coverage is to use non-disjoint dominating sets of active nodes. Using this approach, it is possible to obtain a better lifetime as compared to the disjoint dominating sets based approach [2, 28]. As an illustration, consider the network in Fig. 9, which has a domatic number . Assuming that each node can be active for two time slots, i.e., , the disjoint dominating sets approach gives a network lifetime of time slots. However, it is possible to obtain five distinct dominating sets such that each node appears in at most two such sets, as shown in Fig. 9(b), thus yielding a network lifetime of time slots.

    Fig. 9: (a) Two disjoint dominating sets are shown. (b) Five non-disjoint dominating sets, indicated by the nodes with the same labels, are shown. Each node belongs to two distinct dominating sets.

    The problem of finding the maximum number of dominating sets under the constraints on the number of times a node can be included in a dominating set is related to the notion of -configurations [29, 30] as defined below.

    • (-Configurations in Graphs) Let , be two positive integers, and let be the set of labels. Then a -configuration of a graph is an assignment of distinct labels from the set to each node in the graph such that for every and every node in , the label is assigned to or one of its neighbors.

    An example of -configuration is shown in Fig. 9(b). Note that the set of nodes corresponding to a particular label in constitute a dominating set. So, if a graph has an -configuration, it is possible to have distinct (possibly non-disjoint) dominating sets such that each node can be included in at most such dominating sets. Thus, for a given , the maximum value of , say , for which -configuration exists, is of particular interest as it provides a scheduling scheme based on the non-disjoint dominating sets to maximize network lifetime while ensuring complete coverage.
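The definition above is straightforward to verify mechanically. The following sketch checks whether a given labeling is a configuration in this sense and extracts the dominating set associated with each label. The function names, the parameter names `num_labels` and `k`, and the use of integer labels starting at zero are assumptions made for illustration. The 5-cycle used in the usage note is an example chosen here and is not the graph of Fig. 9.

```python
def is_configuration(adj, labels_of, num_labels, k):
    """Check that every node holds k distinct labels and sees all labels in its closed neighborhood."""
    label_set = set(range(num_labels))
    for v in adj:
        own = set(labels_of[v])
        if len(own) != k or not own <= label_set:
            return False
    for v, nbrs in adj.items():
        available = set(labels_of[v])
        for u in nbrs:
            available |= set(labels_of[u])
        if available != label_set:
            return False
    return True


def dominating_sets(labels_of, num_labels):
    """Group nodes by label; in a valid configuration each group is a dominating set."""
    return [{v for v, labels in labels_of.items() if label in labels}
            for label in range(num_labels)]


# 5-cycle: node v is adjacent to v-1 and v+1 (mod 5) and holds labels v and v-2 (mod 5).
cycle = {v: {(v - 1) % 5, (v + 1) % 5} for v in range(5)}
labeling = {v: {v, (v - 2) % 5} for v in range(5)}
```

With these definitions, `is_configuration(cycle, labeling, 5, 2)` holds, and `dominating_sets(labeling, 5)` yields five distinct dominating sets in which every node appears exactly twice.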

    Obviously, for , the maximum for which -configuration of a graph exists, is equal to the domatic number of . Thus, given a -configuration of with the labeling set , a -configuration could be obtained for some and by simply replacing each label by a set of labels . Thus, for a given , if is the maximum value for which -configuration of a graph exists, then

    (20)

    Consequently, the non-disjoint dominating sets approach is always at least as good as the disjoint dominating sets approach, though it often performs better. An interesting question here is under what conditions or specific instances ? In this regard, first we note that every connected graph has , and therefore, for a given , is always at least . However, there exist many graphs for which , but . For instance, many cubic graphs (graphs in which each vertex has degree three) have a domatic number of 2, e.g., the one shown in Figure 9. However, the following theorem asserts that all cubic graphs have for a given .

    Theorem 8.1

    [30] Any cubic graph has an -configuration with , and such a configuration can be found in polynomial time.

    Recently, it has been shown in [29] that the above result is true even for a bigger class of graphs as stated in Theorem 8.2. Here, is a star graph with one central node of degree six, and six end nodes each with a degree one (