# Fast Distance Sensitivity Oracle for Multiple Failures

## Abstract

When a network is prone to failures, it is very expensive to recompute the shortest paths from scratch after every change. A distance sensitivity oracle makes it possible to find the new shortest paths faster and at lower cost by pre-computing an oracle once, in advance. Although several efficient solutions have been proposed in the literature to support a single failure, little effort has been devoted to devising an efficient method for the case of multiple failures.

In this paper, we present a novel distance sensitivity oracle based on Markov Tensor Theory [1] to support replacement path queries in general directed and weighted networks facing a set of failures. In contrast to existing work, there is no limitation on the maximum failure size supported by our oracle, and the failure size need not be known in advance to construct the oracle. The specifications of our oracle are: space size of , pre-process time of , where $\omega$ is the exponent of fast matrix multiplication, and query time of  for answering a replacement path query, which computes the replacement (shortest) paths from all nodes to a target at once. While the computation time of regular shortest path methods, such as Dijkstra's, is  for each query after a failure, our algorithm saves considerable computational time when the size of the failure set is  or less and the network is sparse.

## 1 Introduction

Due to the demand for more flexible algorithms that support changes in the network, several problems have been posed under different names and objectives. In the replacement paths problem, the objective is to answer a query by efficiently computing the shortest replacement path from a fixed source node to a fixed target node while avoiding each of the nodes (or edges) located on the shortest path. More general forms of this problem answer queries for multiple sources, and the all-pairs replacement paths format answers queries by efficiently finding the shortest replacement path for all pairs of source and target nodes, while avoiding an arbitrary failed node (or edge), by constructing a distance sensitivity oracle. A more challenging problem is to find the replacement path in the case of multiple failures and answer the corresponding query, which is still considered an open problem. The main applications of distance sensitivity oracles are routing in failure-prone networks, the Vickrey price problem, and finding shortest simple paths. When a network is prone to failures, it is very expensive to recompute the shortest paths from scratch every time; a distance sensitivity oracle provides the means to compute the new shortest paths faster and at lower cost. As an extension, a fault-tolerant routing protocol is a distributed solution that seeks the shortest route avoiding the set of failures while trying to optimize the amount of memory stored in the routing tables of the nodes (compact routing scheme) [2]. In the Vickrey price problem from auction theory [3], the edges of a network are each owned by a selfish agent, and the objective is to determine the value of an edge according to how difficult routing information in the network becomes if that edge fails. This can be done by using a distance sensitivity oracle to compare the shortest path lengths before and after deleting the edge [4].
This problem is closely related to finding the most damaging or vital node (or edge) in the network [5]. Moreover, the k shortest simple paths can be easily computed by running k executions of a replacement paths algorithm [6].

Our Results. In this paper, we propose a novel and simple-to-implement replacement path algorithm that supports multiple failures of arbitrary size and answers queries efficiently. The algorithm is founded upon two concepts developed in Markov Tensor Theory [7]: the avoidance Markov chain and the evaporation paradigm. The advantages of our algorithm are multi-fold:

1. By leveraging fast matrix multiplication (with exponent $\omega$, whose current best value is given in [8]), the distance sensitivity oracle of size  is constructed in  time. This oracle answers distance and path queries in only  time, where  is the number of nodes,  is the number of edges, and  is the size of the failure set. We consider the cases with failure size , where the query time becomes even more efficient.

2. In contrast to existing work, there is no limitation on the maximum failure size supported by our method. In addition, our distance sensitivity oracle does not depend on the failure size and can be exploited for any size of failure once it is constructed.

3. The algorithm supports general directed networks with arbitrary weights.

4. The algorithm can be simply modified to support edge failures.

5. The algorithm can find alternative longer paths.

Related Work. Distance sensitivity oracle algorithms have been studied extensively for the single-failure case. For weighted and directed networks, Demetrescu et al. [9] proposed an oracle which is constructed in  time and answers shortest-path-length queries in  time. Bernstein and Karger [4] improved on this algorithm with a lower construction time of  but a space size of  and the same query time. The same authors also presented a randomized algorithm [10] whose construction time and storage size improve on their deterministic algorithm by a factor of , with the same query time. Note that in all of these algorithms the query time for retrieving the shortest path itself is proportional to the length of the path. The approximate algorithm proposed by Khanna and Baswana [11] provides a lower storage requirement of  for unweighted and undirected networks. This algorithm answers approximate distance queries in  time for a given integer  and fraction .

As one of the first attempts to support more than one failure, Duan and Pettie [12] proposed a method covering dual failures. Their method requires storage size of  and is constructed in polynomial time. The query time for returning the length of the shortest path is , and for returning the whole path it is . According to the authors, this method cannot be extended to larger failure sets, since it becomes very complex and requires  space. Another multiple-failure distance sensitivity oracle is an approximate algorithm suggested by Chechik et al. [13] to support more than two failures in undirected networks. The oracle is constructed in polynomial time and takes  space to answer distance queries. The query time for this algorithm is , where  is the number of failures,  is the weight of the heaviest edge, and  is the longest distance in the network. Weimann and Yuster [8] propose a randomized algorithm for constructing a distance sensitivity oracle with size , given a trade-off parameter  and conditioned on the failure order being . The notation Õ indicates that polylogarithmic factors have been dropped from the order. This algorithm was originally devised for integer-weighted graphs with edge weights chosen from  [14] and was later extended to real-weighted graphs in a follow-up work [8]. For the case of integer weights, the construction time is  with query time ; the real-weight case becomes possible with construction time  and query time . The authors take advantage of fast matrix multiplication, with $\omega$ as the exponent, in their computations. In the most recent work, Chechik et al. [15] proposed a range of failure-sensitive distance oracles for undirected networks answering queries, conditioned on the failure order being , with  stretch. Among the six proposed oracles, excluding the two for unweighted networks, the best space requirement is  at the cost of  pre-processing time, and the best pre-processing time is  at the cost of  space.
The reviewed works are summarized in Table (1).

## 2 Method Overview

The general idea of the distance sensitivity oracle presented in this paper is founded on our Markov Tensor Theory [1], a unified theoretical platform for solving network problems. We have extended the notions of the fundamental matrix and hitting time/cost in Markov chain methods to more advanced Markov metrics, which we call avoidance metrics, such as the avoidance fundamental matrix and avoidance hitting time/cost. We take one step further, illustrate the behavior of the avoidance metrics in the evaporation paradigm, and show how shortest path information can be neatly derived from these metrics.

In the next two sections, we first provide a preliminary review of Markov chain metrics and introduce the avoidance metrics. Then we show how to construct the evaporation paradigm from the network. We demonstrate that once $\alpha$ goes to 0, the avoidance hitting cost converges to the shortest path distance, and the edges with non-zero probabilities represent the edges on the shortest path tree to target $t$. We find the upper bound on $\alpha$ that makes this convergence to the shortest path happen. Then we illustrate how to devise the distance sensitivity oracle to find the (shortest) replacement paths after multiple failures.

## 3 Preliminaries

Consider a weighted and directed network, where  is the set of nodes,  is the set of edges, and $A$ is the adjacency matrix whose entry $a_{ij}$ indicates the distance from $i$ to $j$ if edge $(i,j)$ exists, and is 0 otherwise. A random walk over the network is modeled by a Markov chain, where the nodes of the network represent the states of the Markov chain, and the Markov chain is fully described by its transition probability matrix $P=D^{-1}A$, where $D$ is the diagonal matrix of the $d_{i}$'s and $d_{i}$ is referred to as the (out-)degree of node $i$. In addition, the target nodes in the network can be represented as absorbing states in the Markov chain, since once they are hit, the random walk stops. Throughout the paper, the words “node” and “state”, and “network” and “Markov chain”, are used interchangeably.

A Markov chain is called absorbing if it has at least one absorbing state that, once entered, cannot be left. The other states of an absorbing chain, which are not traps, are called non-absorbing or transient states. In an absorbing Markov chain, at least one absorbing state must be reachable from each transient state. Assuming that the states are ordered so that the set $T$ of transient states comes first and the set $A$ of absorbing states comes last, the transition matrix of an absorbing Markov chain takes the following block form:

$$P=\begin{bmatrix}P_{TT} & P_{TA}\\ 0 & I_{AA}\end{bmatrix} \tag{1}$$

where $I_{AA}$ is an identity matrix and $P$ is row-stochastic. The fundamental matrix of the absorbing chain is defined as follows:

$$F^{A}=(I-P_{TT})^{-1} \tag{2}$$

where entry $F^{A}_{sm}$ represents the expected number of passages through state $m$, starting from state $s$, before absorption by any of the absorbing states [16]. To be clear about the target set $A$, we show it as a superscript.

Expected absorption time, which is also known as (expected) hitting time or first passage time, is calculated as follows:

$$H^{A}_{s}=\sum_{m}F^{A}_{sm} \tag{3}$$

where $H^{A}_{s}$ represents the expected number of steps before absorption by any of the absorbing states in $A$ when the starting state is $s$.

To generalize the hitting time and account for the cost of edges as well, Fouss et al. [17] introduced the hitting cost metric. The hitting cost for a network with cost matrix  is the average cost incurred by the random walk when traversing the edges before hitting the target node for the first time; it can be computed in a recursive form, where $r_{m}$ is the expected out-going cost of node $m$. We show that the hitting cost can also be computed from the fundamental matrix:

$$U^{A}_{s}=\sum_{m}F^{A}_{sm}\,r_{m} \tag{4}$$

Notice that the hitting time (3) is a special case of the hitting cost (4), obtained when the cost of every edge is equal to 1.

The absorption probability matrix is defined as [16]:

$$Q=F\,P_{TA} \tag{5}$$

$Q$ is a matrix whose entries give the probability of absorption by each absorbing state when the chain starts from a given transient state. We denote such an entry by $Q^{j,\bar S}_{i}$, to be clear about the absorbing state that is hit first, namely $j$, as opposed to the other absorbing states $S$ that are not touched at all. Note that $\sum_{j}Q^{j,\bar S}_{i}=1$, since starting from any state the chain will eventually be absorbed by one of the absorbing states.
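To make the preliminaries concrete, here is a minimal numeric sketch (the toy chain and its numbers are our own illustration, not taken from the paper) computing the fundamental matrix (2), hitting times (3), and absorption probabilities (5):

```python
import numpy as np

# Toy absorbing chain: states 0,1 transient, state 2 absorbing.
# From 0 the walk always moves to 1; from 1 it returns to 0 or is
# absorbed by 2, each with probability 1/2.
P = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0]])

P_TT = P[:2, :2]          # transient-to-transient block
P_TA = P[:2, 2:]          # transient-to-absorbing block

F = np.linalg.inv(np.eye(2) - P_TT)   # fundamental matrix, Eq. (2)
H = F.sum(axis=1)                     # hitting times, Eq. (3)
Q = F @ P_TA                          # absorption probabilities, Eq. (5)

print(F)        # expected visit counts before absorption
print(H)        # [4. 3.]: expected steps to absorption from 0 and 1
print(Q[:, 0])  # [1. 1.]: absorption by state 2 is certain
```

Row sums of `F` recover the hitting times, and with a single absorbing state the absorption probabilities are trivially 1, matching the remark above.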

## 4 Avoidance Metrics

In this section, we introduce three new Markov chain metrics with modified properties and conditions:

###### Definition 1 (Avoidance fundamental matrix).

The avoidance fundamental matrix for source node $s$, middle node $m$, and target node $t$, conditioned on avoiding node $o$, is computed from the classical fundamental matrix and the absorption probabilities:

$$F^{\{t,\bar o\}}_{s,m}=F^{\{t,o\}}_{s,m}\cdot\frac{Q^{\{t,\bar o\}}_{m}}{Q^{\{t,\bar o\}}_{s}} \tag{6}$$
###### Definition 2 (Avoidance hitting time).

The avoidance hitting time from $s$ to $t$ avoiding node $o$ is the conditional expectation of the number of steps required to hit $t$ for the first time when starting from $s$, conditioned on avoiding $o$ along the way, and is obtained from the following equation:

$$H^{\{t,\bar o\}}_{s}=\sum_{m}F^{\{t,o\}}_{s,m}\cdot\frac{Q^{\{t,\bar o\}}_{m}}{Q^{\{t,\bar o\}}_{s}} \tag{7}$$
###### Definition 3 (Avoidance hitting cost).

The avoidance hitting cost from $s$ to $t$ avoiding node $o$ is the conditional expectation of the cost of the steps required to hit $t$ for the first time when starting from $s$, conditioned on avoiding $o$ along the way, and is obtained from the following equation:

$$U^{\{t,\bar o\}}_{s}=\sum_{m}\left(F^{\{t,o\}}_{s,m}\cdot\frac{Q^{\{t,\bar o\}}_{m}}{Q^{\{t,\bar o\}}_{s}}\right)r^{\{t,\bar o\}}_{m}=\sum_{m}F^{\{t,\bar o\}}_{s,m}\,r^{\{t,\bar o\}}_{m}, \tag{8}$$

where $r^{\{t,\bar o\}}_{m}$ is the expected out-going cost of node $m$ under the avoidance-transformed transition probabilities.

In the following, we present a few lemmas that will be required for the network analysis applications later in this paper.

###### Lemma 1 (Incremental Computation of Fundamental Matrix).

The fundamental matrix for target set $S_{1}\cup S_{2}$ can be computed from the fundamental matrix for target set $S_{1}$:

$$F^{S_{1}\cup S_{2}}_{im}=F^{S_{1}}_{im}-F^{S_{1}}_{iS_{2}}\left(F^{S_{1}}_{S_{2}S_{2}}\right)^{-1}F^{S_{1}}_{S_{2}m}, \tag{9}$$

where the subscripts represent the rows and columns selected from the matrix, respectively; e.g. $F^{S_{1}}_{iS_{2}}$ denotes the $i$-th row and the columns corresponding to set $S_{2}$ of the fundamental matrix $F^{S_{1}}$.

###### Lemma 2 (Absorption Probability and Normalized Fundamental Matrix).

The absorption probability for absorbing set $\{j\}\cup S$ can be found from the fundamental matrix for absorbing set $S$:

$$Q^{j,\bar S}_{i}=\frac{F^{S}_{ij}}{F^{S}_{jj}} \tag{10}$$

For proof of Lemmas (1) and (2), please refer to [?].
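As a quick numerical sanity check of Lemmas (1) and (2) (a sketch on a random chain of our own choosing, not part of the paper's proofs), one can verify the incremental identity (9) and the ratio identity (10) directly:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)        # a random row-stochastic chain

def fundamental(P, absorbing):
    """F = (I - P_TT)^{-1}, embedded back into a full-size matrix."""
    T = [v for v in range(len(P)) if v not in absorbing]
    F = np.zeros_like(P)
    F[np.ix_(T, T)] = np.linalg.inv(np.eye(len(T)) - P[np.ix_(T, T)])
    return F

t, k = 4, 3
F_t  = fundamental(P, {t})               # fundamental matrix for S1 = {t}
F_tk = fundamental(P, {t, k})            # for S1 U S2 = {t, k}, directly

# Lemma 1 (Eq. 9) with S2 = {k}:
# F^{S1 U S2}_im = F^{S1}_im - F^{S1}_ik (F^{S1}_kk)^{-1} F^{S1}_km
for i in (0, 1, 2):
    for m in (0, 1, 2):
        inc = F_t[i, m] - F_t[i, k] * F_t[k, m] / F_t[k, k]
        assert abs(inc - F_tk[i, m]) < 1e-10

# Lemma 2 (Eq. 10): Pr[hit k before t] = F^{t}_ik / F^{t}_kk,
# compared against the direct absorption probability Q = F P_TA
T = [0, 1, 2]
Q_direct = F_tk[np.ix_(T, T)] @ P[np.ix_(T, [k])]
Q_lemma = F_t[T, k] / F_t[k, k]
assert np.allclose(Q_direct.ravel(), Q_lemma)
print("Lemmas 1 and 2 verified numerically")
```

Making a transient state absorbing and conditioning on which absorbing state is hit first are exactly the operations these lemmas shortcut; the check above confirms both identities to numerical precision.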

###### Corollary 1.

The avoidance fundamental matrix can be written in terms of the classical fundamental matrix by applying Lemmas (1) and (2) to the definition of the avoidance fundamental matrix (6):

$$F^{\{t,\bar k\}}_{sm}=F^{k}_{mt}\left(\frac{F^{k}_{sm}}{F^{k}_{st}}-\frac{F^{k}_{tm}}{F^{k}_{tt}}\right) \tag{11}$$

We present the applications of the advanced random walk metrics in the next three sections; the new concept of the evaporation paradigm is also introduced and exploited in conjunction with these metrics in the last two of them.

## 5 Evaporation Paradigm

The evaporation paradigm is obtained by multiplying the factor $\alpha^{w_{ij}}$ into the transition probability $P_{ij}$ of every edge $(i,j)$, where $0<\alpha\le 1$, and by adding one (imaginary) node to the network, denoted by $o$, to which every other node is connected with the left-over transition probability:

$$P_{ij}(\alpha)=\begin{cases}P_{ij}\,\alpha^{w_{ij}} & \text{if } i,j\neq o\\[2pt] 1-\sum_{k\in N(i)}\alpha^{w_{ik}}P_{ik} & \text{if } i\neq o \text{ and } j=o\\[2pt] 0 & \text{if } i=o \text{ and } j\neq o\\[2pt] 1 & \text{if } i,j=o\end{cases} \tag{12}$$

Thus the new transition probability matrix $P(\alpha)$, belonging to $G(\alpha)$, is a row-stochastic matrix whose main principal submatrix is $P\circ\alpha^{W}$, where $\circ$ is the element-wise product. Now, with the new transition probability matrix $P(\alpha)$, we compute the avoidance metrics (from (8) and (6), respectively) and generate the routing continuum based on the following theorems.
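A small sketch of constructing the evaporating transition matrix of Eq. (12), with the evaporation node $o$ appended as the last row/column (node labels and weights are illustrative; the target is not yet made absorbing here, which the later algorithms do separately):

```python
import numpy as np

def evaporating_matrix(P, W, alpha):
    """Build P(alpha) of Eq. (12): scale each edge (i,j) by alpha**w_ij
    and route the lost probability mass to an appended evaporation node o."""
    n = P.shape[0]
    Pa = np.zeros((n + 1, n + 1))
    Pa[:n, :n] = P * np.power(alpha, W)        # P_ij * alpha^{w_ij}
    Pa[:n, n] = 1.0 - Pa[:n, :n].sum(axis=1)   # leak to o
    Pa[n, n] = 1.0                             # o is absorbing
    return Pa

# toy weighted digraph: 0->1 (w=1), 0->2 (w=3), 1->2 (w=1)
P = np.array([[0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
W = np.array([[0, 1, 3],
              [0, 0, 1],
              [0, 0, 0]])

Pa = evaporating_matrix(P, W, alpha=0.1)
print(Pa.sum(axis=1))   # every row sums to 1 (row-stochastic)
```

Heavier edges lose exponentially more probability mass to $o$, which is exactly the mechanism that penalizes long walks as $\alpha$ shrinks.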

###### Theorem 1 (Routing Continuum: Path Distances).

Consider weighted network $G$ with at least one path from node $s$ to node $t$. Varying parameter $\alpha$ from 0 to 1 in the avoidance hitting cost $U^{\{t,\bar o\}}_{s}(\alpha)$ of the corresponding evaporating network $G(\alpha)$ yields a continuum from the shortest-path distance to the all-paths distance (hitting cost distance) from node $s$ to node $t$ in $G$:

a) If $\alpha\to 0$, $U^{\{t,\bar o\}}_{s}(\alpha)$ converges to the shortest-path distance from $s$ to $t$ in $G$,

b) If $\alpha=1$, $U^{\{t,\bar o\}}_{s}(\alpha)$ gives the all-paths (hitting cost) distance from $s$ to $t$ in $G$; more precisely, $U^{\{t,\bar o\}}_{s}(1)$ is exactly equal to the hitting cost distance $U^{t}_{s}$,

c) If $0<\alpha_{1}\le\alpha_{2}\le 1$, then $U^{\{t,\bar o\}}_{s}(\alpha_{1})\le U^{\{t,\bar o\}}_{s}(\alpha_{2})$.

The intuition behind Theorem (1) is that as $\alpha$ decreases, the probability of evaporation along paths increases; when $\alpha$ goes to zero, the probabilities of longer paths become negligible compared to the probability of the shortest path, and only the shortest path survives. In addition, the non-zero entries of the avoidance fundamental matrix become indicators of the nodes lying on the shortest path as $\alpha$ goes to zero, which is demonstrated in the next theorem.
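The following self-contained sketch illustrates Theorem 1(a) numerically on a toy weighted digraph (the graph, weights, and $\alpha$ values are our own choices). It builds the evaporating chain of Eq. (12), computes the avoidance hitting cost of Eqs. (6)-(8), and checks that it approaches the shortest path distance from above as $\alpha$ shrinks:

```python
import numpy as np

# toy digraph: 0 -> 2 with weight 3, and 0 -> 1 -> 2 with total weight 2.
# the shortest path distance from 0 to 2 is therefore 2.
edges = {(0, 2): 3.0, (0, 1): 1.0, (1, 2): 1.0}
n, s, t = 3, 0, 2

def avoidance_hitting_cost(edges, n, s, t, alpha):
    # uniform random walk over out-edges, evaporated per Eq. (12)
    out = {i: [j for (a, j) in edges if a == i] for i in range(n)}
    Pa = np.zeros((n, n))
    for (i, j), w in edges.items():
        Pa[i, j] = (1.0 / len(out[i])) * alpha ** w   # P_ij * alpha^{w_ij}
    T = [v for v in range(n) if v != t]               # transient states
    F = np.linalg.inv(np.eye(len(T)) - Pa[np.ix_(T, T)])
    Q = np.ones(n)                 # Q_v = Pr[hit t before evaporating]; Q_t = 1
    Q[T] = F @ Pa[np.ix_(T, [t])].ravel()
    # U = (1/Q_s) * sum_m F_sm * sum_j Pa_mj Q_j w_mj, per Eqs. (6)-(8)
    U = 0.0
    for mi, m in enumerate(T):
        inner = sum(Pa[m, j] * Q[j] * w for (a, j), w in edges.items() if a == m)
        U += F[T.index(s), mi] * inner
    return U / Q[s]

for alpha in (0.5, 0.1, 0.001):
    print(alpha, avoidance_hitting_cost(edges, n, s, t, alpha))
# as alpha -> 0 the value decreases toward the shortest path distance 2
```

On this graph the cost works out to $(1.5\alpha+1)/(0.5\alpha+0.5)$, which is monotone in $\alpha$ and tends to 2, matching parts (a) and (c) of the theorem.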

###### Theorem 2 (Routing Continuum: Node Flows).

Consider weighted network $G$ with at least one path from node $s$ to node $t$. For $\alpha\to 0$ in the corresponding evaporating network $G(\alpha)$, the entries of the $s$-th row of the avoidance fundamental matrix, i.e. $F^{\{t,\bar o\}}_{sm}(\alpha)$ for all nodes $m$, determine the following information regarding the shortest path from $s$ to $t$ in network $G$:

a) If $F^{\{t,\bar o\}}_{sm}(\alpha)\to 0$, no shortest path from $s$ to $t$ passes through node $m$.

b) If $F^{\{t,\bar o\}}_{sm}(\alpha)\to 1$, node $m$ is located on all of the shortest paths from $s$ to $t$.

c) If $F^{\{t,\bar o\}}_{sm}(\alpha)\to c$ for some $0<c<1$, a fraction $c$ of the shortest paths from $s$ to $t$ passes through node $m$.

d) As an immediate result of part (c), there exists more than one shortest path from $s$ to $t$ if and only if $F^{\{t,\bar o\}}_{sm}(\alpha)$ tends to a value strictly between 0 and 1 for some node $m$.

According to this theorem, by computing the $s$-th row of the avoidance fundamental tensor for $\alpha\to 0$, we can find all of the nodes located on the shortest path(s) from $s$ to $t$. In addition, we can compute the shortest path length by summing over this row as in (3). We can also find the routing continuum edge probabilities (i.e., how to choose the next edge when routing) from the matrix $\mathcal{P}$ based on the following theorem:

###### Theorem 3.

Network $G$ with avoided node $o$ and target node $t$ can be transformed into a network $\mathcal{G}$ without node $o$ and with the same target, such that the avoidance metrics in the former network turn into the classical metrics in the latter network. The transformation between transition matrix $P$, belonging to $G$, and $\mathcal{P}$, belonging to $\mathcal{G}$, is as follows:

$$\mathcal{P}_{ij}=P_{ij}\,\frac{Q^{t,\bar o}_{j}}{Q^{t,\bar o}_{i}} \tag{13}$$
###### Corollary 2 (Routing Continuum: Edge Probabilities).

The probabilities assigned to edges for the routing strategy and each choice of can be obtained from:

$$\mathcal{P}_{ij}(\alpha)=P_{ij}\,\alpha^{w_{ij}}\,\frac{Q^{t,\bar o}_{j}(\alpha)}{Q^{t,\bar o}_{i}(\alpha)}=P_{ij}\,\alpha^{w_{ij}}\,\frac{F^{o}_{jt}(\alpha)}{F^{o}_{it}(\alpha)}, \tag{14}$$

where $Q^{t,\bar o}(\alpha)$ is computed from (5) over the evaporating transition probability matrix (12). The second equality results from Lemma (2). Algorithm (1) summarizes our method for computing these three metrics to find the continuum information for each choice of $\alpha$.

## 6 Shortest Path

According to Theorem (1), once $\alpha$ goes to zero, the paths are pruned to the shortest ones and the avoidance hitting cost converges to the shortest path distance. In this section, we prove some theorems for the shortest path case which clarify the behavior of the proposed method for small $\alpha$ and show how it can be exploited to devise a novel method for finding shortest paths.

### 6.1 Avoidance Hitting Cost Convergence Behavior and the Corresponding Error

In this part, we formulate the error in terms of $\alpha$ to study the convergence behavior of the avoidance hitting cost to the shortest path distance as $\alpha$ goes to 0. This formulation enables us, later in this section, to find a bound on $\alpha$ that makes the error smaller than $\delta/d_{max}$, where $\delta$ is the largest value by which all the edge weights are divisible and $d_{max}$ is the maximum out-degree of the nodes. Once the error is below this threshold, the shortest path distance can be found by rounding the avoidance hitting cost down to its closest multiple of $\delta$. We also show that this is the sufficient condition for finding the shortest path from the routing strategy in (14).

Let the $l_{i}$'s from a countable set be the lengths of walks from $s$ to $t$ such that $l_{1}<l_{2}<\cdots$, and let the $Pr_{l_{i}}$'s be the corresponding probabilities (if there is more than one walk with the same length, $Pr_{l_{i}}$ is the aggregated probability of those walks). Since every walk length is divisible by $\delta$, we can assume that any two consecutive walk lengths differ by $\delta$, i.e. $l_{i+1}-l_{i}=\delta$; otherwise we can always add a walk length with zero probability, i.e. $Pr_{l_{i}}=0$. For unweighted networks $\delta=1$. In the evaporating network, every edge is assigned a multiplicative factor of $\alpha^{w_{ij}}$, so walks of length $l_{i}$ have total probability $\alpha^{l_{i}}Pr_{l_{i}}$. Recall that $l_{1}=L_{st}$ is the shortest path distance. Then the avoidance hitting cost can be decomposed into the shortest path distance plus an error term:

$$\begin{aligned}
U^{\{t,\bar o\}}_{s}(\alpha) &= \frac{\sum_{i=1}l_{i}\,\alpha^{l_{i}}Pr_{l_{i}}}{\sum_{i=1}\alpha^{l_{i}}Pr_{l_{i}}}\\
&= \frac{L_{st}\sum_{i=1}\alpha^{l_{i}}Pr_{l_{i}}+\sum_{i=2}(l_{i}-l_{i-1})\sum_{k=i}\alpha^{l_{k}}Pr_{l_{k}}}{\sum_{i=1}\alpha^{l_{i}}Pr_{l_{i}}}\\
&= L_{st}+\delta\,\frac{\sum_{i=2}\sum_{k=i}\alpha^{l_{k}}Pr_{l_{k}}}{\sum_{i=1}\alpha^{l_{i}}Pr_{l_{i}}}\\
&= L_{st}+\delta\sum_{j=1}\alpha^{j\delta}\,\frac{\sum_{i=1}\alpha^{l_{i}}Pr_{l_{i+j}}}{\sum_{i=1}\alpha^{l_{i}}Pr_{l_{i}}}\\
&= L_{st}+\delta\sum_{j=1}\alpha^{j\delta}\gamma_{j}\\
&= L_{st}+\epsilon_{st}(\alpha),
\end{aligned} \tag{15}$$

It can be seen that $\epsilon_{st}(\alpha)$ is a non-negative function of $\alpha$, so $U^{\{t,\bar o\}}_{s}(\alpha)\ge L_{st}$ always; that is, the avoidance hitting cost converges to the shortest path distance from above. In the next part, we show that a bound on $\alpha$ can be found by bounding $\epsilon_{st}(\alpha)$ from above and inverting the resulting function.

### 6.2 Finding the Edges on the Shortest Path

Besides finding the shortest path distance by computing the avoidance hitting cost for small enough $\alpha$, we need to find the path itself. In the following theorem, we show how to find the successor of each node in the shortest path tree and thus specify the edges located on the shortest path.

###### Theorem 4 (Shortest Path Routing Strategy).

Let $\alpha$ be small enough that the distance error is below $\delta/d_{max}$, where $d_{max}$ is the maximum number of out-going neighbors of a node and $\delta$ is the largest value by which all the edge weights are divisible. Then each node's out-going edge with the highest probability, i.e. $\arg\max_{j}\mathcal{P}_{ij}(\alpha)$, is located on the shortest path from that node to $t$.

Since finding the shortest path is a recursive process, the whole path can be obtained by finding the successor of each node via the highest edge probability at each step, starting from $s$ and continuing until reaching $t$. The following algorithm summarizes the shortest path routing strategy based on the proposed method.
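The recursion above can be sketched as follows (toy graph and $\alpha$ are our own illustrative choices): each node's successor is the out-edge maximizing the edge probability of Eq. (14), followed until the target is reached.

```python
import numpy as np

edges = {(0, 2): 3.0, (0, 1): 1.0, (1, 2): 1.0}   # shortest path is 0->1->2
n, t, alpha = 3, 2, 0.01

out = {i: [j for (a, j) in edges if a == i] for i in range(n)}
Pa = np.zeros((n, n))
for (i, j), w in edges.items():
    Pa[i, j] = (1.0 / len(out[i])) * alpha ** w   # P_ij * alpha^{w_ij}

T = [v for v in range(n) if v != t]
F = np.linalg.inv(np.eye(len(T)) - Pa[np.ix_(T, T)])
Q = np.ones(n)
Q[T] = F @ Pa[np.ix_(T, [t])].ravel()             # Pr[hit t | no evaporation]

def successor(i):
    # argmax of the edge probabilities of Eq. (14) over i's out-edges
    return max(out[i], key=lambda j: Pa[i, j] * Q[j] / Q[i])

path, v = [0], 0
while v != t:
    v = successor(v)
    path.append(v)
print(path)   # [0, 1, 2]
```

Even though the direct edge $0\to 2$ exists, its scaled probability $\propto\alpha^{3}$ loses to the two-hop route $\propto\alpha^{2}$, so the strategy correctly picks the lighter path.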

### 6.3 Bound for α

We need a bound on $\alpha$ that makes the distance error small enough for Theorem 4 to hold. The following theorem finds such a bound:

###### Theorem 5.

The distance error shrinks sufficiently if $\alpha$ obeys the following bound:

$$\alpha\le\left(\frac{1}{(d_{max})^{L_{max}+1}-d_{max}+1}\right)^{1/\delta}\;\Rightarrow\;\epsilon<\delta/d_{max}, \tag{16}$$

where $L_{max}$ is the diameter of the network and $d_{max}$ is the maximum out-degree in the network.

###### Proof.

We first find an upper bound for $\gamma_{j}$, from which an upper bound on the distance error follows. Recall that $\gamma_{j}=\frac{\sum_{i=1}\alpha^{l_{i}}Pr_{l_{i+j}}}{\sum_{i=1}\alpha^{l_{i}}Pr_{l_{i}}}$:

$$\begin{aligned}
\gamma_{j} &= \frac{\sum_{i=1}\alpha^{l_{i}}Pr_{l_{i+j}}}{\sum_{i=1}\alpha^{l_{i}}Pr_{l_{i}}}\le\frac{\alpha^{l_{1}}\sum_{i=1}Pr_{l_{i+j}}}{\sum_{i=1}\alpha^{l_{i}}Pr_{l_{i}}}\le\frac{\alpha^{l_{1}}\left(1-\sum_{i=1}^{j}Pr_{l_{i}}\right)}{\alpha^{l_{1}}Pr_{l_{1}}}\\
&\le \frac{\alpha^{l_{1}}\left(1-Pr_{l_{1}}\right)}{\alpha^{l_{1}}Pr_{l_{1}}}=\frac{1-Pr_{l_{1}}}{Pr_{l_{1}}}\le\frac{1-\left(\frac{1}{d_{max}}\right)^{L_{max}}}{\left(\frac{1}{d_{max}}\right)^{L_{max}}}
\end{aligned} \tag{17}$$

The last inequality results from the worst-case scenario, in which the probability of the shortest path is a product of the smallest possible edge probabilities, i.e. $1/d_{max}$, over the longest possible distance, the network diameter $L_{max}$. The upper bound for the distance error is then obtained as follows:

$$\epsilon_{st}\le\delta\sum_{i=1}\alpha^{i\delta}\left(\frac{1-\left(\frac{1}{d_{max}}\right)^{L_{max}}}{\left(\frac{1}{d_{max}}\right)^{L_{max}}}\right)=\frac{\delta\,\alpha^{\delta}}{1-\alpha^{\delta}}\left(\frac{1-\left(\frac{1}{d_{max}}\right)^{L_{max}}}{\left(\frac{1}{d_{max}}\right)^{L_{max}}}\right) \tag{18}$$

To guarantee that the distance error is smaller than $\delta/d_{max}$, we can make its upper bound (18) lower than $\delta/d_{max}$. Now we can find a bound on $\alpha$ in terms of $\delta$, the network diameter $L_{max}$, and the maximum out-degree $d_{max}$, such that the distance error stays below this threshold:

$$\alpha\le\left(\frac{1}{(d_{max})^{L_{max}+1}-d_{max}+1}\right)^{1/\delta}\approx\left(\frac{1}{d_{max}}\right)^{(L_{max}+1)/\delta} \tag{19}$$
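A quick numeric reading of bound (19); the particular values of $d_{max}$, $L_{max}$, and $\delta$ are our own illustrative choices:

```python
# Bound (19) on alpha for d_max = 3, L_max = 4, delta = 1 (illustrative values)
d_max, L_max, delta = 3, 4, 1

exact = (1.0 / (d_max ** (L_max + 1) - d_max + 1)) ** (1.0 / delta)
approx = (1.0 / d_max) ** ((L_max + 1) / delta)

print(exact)    # 1/241, approximately 0.004149
print(approx)   # 1/243, approximately 0.004115
```

The approximation is tight here because $(d_{max})^{L_{max}+1}$ dominates the $-d_{max}+1$ correction; any $\alpha$ below roughly $0.004$ suffices for this instance.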

## 7 Replacement Path After Multiple Failures

###### Theorem 6.

Assume a set $F$ of nodes has failed in weighted network $G$. If $\alpha\to 0$ in the corresponding evaporating network $G(\alpha)$, the avoidance hitting cost in $G(\alpha)$ converges to the shortest-path distance in $G$ with the failure set $F$ discarded from the network:

$$\lim_{\alpha\to 0}U^{\{t,\bar F,\bar o\}}_{s}(\alpha)=L^{\bar F}_{st} \tag{20}$$

Algorithm (2) pre-computes the distance sensitivity oracle and answers replacement shortest path queries efficiently from all nodes to target $t$ while there is a set of failed nodes in the network.

Here, the second equation in the query response results from Theorem (1), and the third one is a substitution of (2) into Theorem (4).

### 7.1 Preprocess time and space

The purpose of the preprocessing part is to compute and store , which can then be used to answer replacement path queries very efficiently. The required space for storing this matrix is , where  is the number of nodes. Regarding time complexity, the matrix inverse computation is the main costly component, with complexity , where , as discussed in the following.

Matrix Inverse: The computational complexity of multiplying two $n\times n$ matrices is sub-cubic: Strassen's algorithm [18] achieves $O(n^{\log_{2}7})\approx O(n^{2.81})$, and this was later reduced further to $O(n^{2.376})$ by the Coppersmith-Winograd algorithm [19]. Cormen et al. [20] proved that inversion is no harder than multiplication (Theorem 28.2). A divide-and-conquer algorithm that uses blockwise inversion to invert a matrix runs with the same time complexity as the matrix multiplication algorithm that is used internally.
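The blockwise-inversion idea (inverting via two half-size inversions plus a few multiplications, so the cost tracks that of the multiplication routine) can be sketched as follows; this is the textbook Schur-complement identity, not code from the paper:

```python
import numpy as np

def block_inverse(M):
    """Invert M via blockwise (Schur complement) inversion:
    M = [[A, B], [C, D]], S = D - C A^-1 B."""
    n = M.shape[0]
    if n == 1:
        return 1.0 / M
    k = n // 2
    A, B = M[:k, :k], M[:k, k:]
    C, D = M[k:, :k], M[k:, k:]
    Ai = block_inverse(A)                 # recurse on half-size blocks
    S = D - C @ Ai @ B                    # Schur complement of A
    Si = block_inverse(S)
    top = np.hstack([Ai + Ai @ B @ Si @ C @ Ai, -Ai @ B @ Si])
    bot = np.hstack([-Si @ C @ Ai, Si])
    return np.vstack([top, bot])

rng = np.random.default_rng(1)
M = rng.random((8, 8)) + 8 * np.eye(8)    # diagonally dominant, safely invertible
assert np.allclose(block_inverse(M) @ M, np.eye(8))
print("blockwise inverse matches")
```

The recursion assumes the leading blocks are invertible (guaranteed here by diagonal dominance); with a fast multiplication routine plugged into the `@` products, the overall cost matches that routine's exponent.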

### 7.2 Query time

For a fast query time, we leverage the incremental computation of Lemma (1). Based on this lemma, only an  computation is required to obtain  from the precomputed matrix  for a given failure set of size . The other most costly component of the query computation is computing the new probabilities for all edges, which takes  time and makes the overall computational time  for each query. Considering the cases with failure size , the overall query time reduces to .

## Acknowledgement

The research was supported in part by US DoD DTRA grants HDTRA1-09-1-0050 and HDTRA1-14-1-0040, and ARO MURI Award W911NF-12-1-0385.

## Appendix A Appendix

### a.1 Proof of Theorems

###### Proof of Theorem 1.

Let the $l_{i}$'s from a countable set be the lengths of walks from $s$ to $t$ such that $l_{1}<l_{2}<\cdots$, and let the $Pr_{l_{i}}$'s be the corresponding probabilities, where $l_{1}=L_{st}$. The avoidance hitting cost (8) in the evaporation network takes the following form:

$$U^{\{t,\bar o\}}_{s}(\alpha)=\frac{\sum_{i=1}l_{i}\,\alpha^{l_{i}}Pr_{l_{i}}}{\sum_{i=1}\alpha^{l_{i}}Pr_{l_{i}}}, \tag{21}$$

Proof of part (a)

When $\alpha\to 0$, the first term of the numerator (and of the denominator), which corresponds to $l_{1}=L_{st}$, dominates the subsequent terms, and $U^{\{t,\bar o\}}_{s}(\alpha)$ converges to $L_{st}$.

Proof of part (b)

For $\alpha=1$, there is no evaporation, and network $G(\alpha)$ splits into two disconnected subgraphs: the original network $G$ with node $t$ as its only absorbing node, and the isolated node $o$. Then $U^{\{t,\bar o\}}_{s}$ reduces to the regular hitting cost from $s$ to $t$ in the original network $G$:

$$U^{\{t,\bar o\}}_{s}(\alpha=1)=\frac{\sum_{i=1}l_{i}Pr_{l_{i}}}{\sum_{i=1}Pr_{l_{i}}}=\sum_{i=1}l_{i}Pr_{l_{i}}=U^{t}_{s} \tag{22}$$

Proof of part (c)

We prove that if $\alpha_{1}\le\alpha_{2}$ then $U^{\{t,\bar o\}}_{s}(\alpha_{1})\le U^{\{t,\bar o\}}_{s}(\alpha_{2})$, i.e.:

$$\frac{\sum_{i=1}l_{i}\,\alpha_{1}^{l_{i}}Pr_{l_{i}}}{\sum_{i=1}\alpha_{1}^{l_{i}}Pr_{l_{i}}}\le\frac{\sum_{i=1}l_{i}\,\alpha_{2}^{l_{i}}Pr_{l_{i}}}{\sum_{i=1}\alpha_{2}^{l_{i}}Pr_{l_{i}}} \tag{23}$$

Cross-multiplying the fractions in (23), we compare the corresponding terms of the left-hand-side and right-hand-side polynomials. Without loss of generality, assume $l_{i}\le l_{j}$:

$$\begin{aligned}
(\alpha_{2}^{l_{i}}Pr_{l_{i}})(l_{j}\alpha_{1}^{l_{j}}Pr_{l_{j}})+(\alpha_{2}^{l_{j}}Pr_{l_{j}})(l_{i}\alpha_{1}^{l_{i}}Pr_{l_{i}}) &\le (\alpha_{1}^{l_{i}}Pr_{l_{i}})(l_{j}\alpha_{2}^{l_{j}}Pr_{l_{j}})+(\alpha_{1}^{l_{j}}Pr_{l_{j}})(l_{i}\alpha_{2}^{l_{i}}Pr_{l_{i}})\\
\Rightarrow\quad Pr_{l_{i}}Pr_{l_{j}}\left(l_{j}\alpha_{2}^{l_{i}}\alpha_{1}^{l_{j}}+l_{i}\alpha_{2}^{l_{j}}\alpha_{1}^{l_{i}}\right) &\le Pr_{l_{i}}Pr_{l_{j}}\left(l_{j}\alpha_{1}^{l_{i}}\alpha_{2}^{l_{j}}+l_{i}\alpha_{1}^{l_{j}}\alpha_{2}^{l_{i}}\right)
\end{aligned}$$

Notice that equality holds in two cases: 1) $Pr_{l_{i}}$ or $Pr_{l_{j}}$ being zero, and 2) $l_{i}=l_{j}$; otherwise:

$$(l_{j}-l_{i})\,\alpha_{2}^{l_{i}}\alpha_{1}^{l_{j}}<(l_{j}-l_{i})\,\alpha_{1}^{l_{i}}\alpha_{2}^{l_{j}}\;\Longrightarrow\;\alpha_{1}^{l_{j}-l_{i}}<\alpha_{2}^{l_{j}-l_{i}},
$$

where the last inequality is obviously correct. ∎

###### Proof of Theorem 2.

The avoidance fundamental matrix in the evaporation network, when the network is weighted, takes the following form:

$$F^{\{t,\bar o\}}_{sm}(\alpha)=\frac{\left(\sum_{l_{i}=L_{sm}}\alpha^{l_{i}}\sum_{\zeta_{j}\in Z_{sm}(l_{i})}Pr_{\zeta_{j}}\right)\cdot\left(\sum_{l_{i}=L_{mt}}\alpha^{l_{i}}\sum_{\zeta_{j}\in Z_{mt}(l_{i})}Pr_{\zeta_{j}}\right)}{\sum_{l_{i}=L_{st}}\alpha^{l_{i}}\sum_{\zeta_{j}\in Z_{st}(l_{i})}Pr_{\zeta_{j}}} \tag{24}$$

When $\alpha\to 0$, the terms with the lowest exponents of $\alpha$ dominate the subsequent terms, and the equation above reduces to:

$$\lim_{\alpha\to 0}F^{\{t,\bar o\}}_{sm}(\alpha)=\lim_{\alpha\to 0}\frac{\alpha^{L_{sm}+L_{mt}}\left(\sum_{\zeta_{j}\in Z_{sm}(L_{sm})}Pr_{\zeta_{j}}\right)\cdot\left(\sum_{\zeta_{j}\in Z_{mt}(L_{mt})}Pr_{\zeta_{j}}\right)}{\alpha^{L_{st}}\sum_{\zeta_{j}\in Z_{st}(L_{st})}Pr_{\zeta_{j}}} \tag{25}$$

Proof of part (a)
If $m$ is not located on any shortest path from $s$ to $t$, then $L_{sm}+L_{mt}>L_{st}$ and the limit in Eq. (25) converges to zero.

Proof of parts (b) & (c)
If $m$ is located on at least one of the shortest paths from $s$ to $t$, then $L_{sm}+L_{mt}=L_{st}$ and the limit (25) has a non-zero value. In the case that $m$ is located on all of the shortest paths from $s$ to $t$, it must be at distance $L_{sm}$ from $s$ and at distance $L_{mt}$ to $t$ on all of these paths (otherwise we could find a shorter path by connecting two shorter pieces), and thus we have: