The mdt algorithm
Abstract
Link state routing protocols such as OSPF or ISIS currently use only best paths to forward IP packets throughout a domain. The optimality of subpaths ensures consistency of hop by hop forwarding although paths, calculated using Dijkstra’s algorithm, are recursively composed. Depending on the link metric, using only best paths can underestimate the diversity of existing paths, which reduces the potential benefits of multipath applications such as load balancing and fast rerouting. In this paper, we propose a low time complexity multipath computation algorithm able to calculate at least two paths with a different first hop between all pairs of nodes in the network if such next hops exist. Using real and generated topologies, we evaluate and compare the complexity of our proposition with several techniques. Simulation results suggest that the path diversity achieved with our proposition is approximately the same as the one obtained using consecutive Dijkstra computations, but with a lower time complexity.
1 Introduction
Routing is one of the key components of the Internet. Despite the potential benefits of multipath routing (e.g. [5] or [6]), most
backbone networks still use unipath routing such as OSPF or ISIS or their ECMP feature (Equal Cost MultiPath). With these routing protocols, the forwarding only changes upon
topology variations and not upon traffic variations. Dynamic multipath routing (e.g. [16], [15], [8] or [3]) is
able to provide several services such as load balancing, to reduce delays and improve throughput, and fast rerouting schemes in case of failures.
The reliability of an IP network against failures and congestion
depends on the reaction time necessary for the convergence of the
underlying routing protocol. Proactive computation of multiple paths shortens this reaction time: precomputed alternate paths can be directly used as backup paths without waiting for the routing protocol convergence.
This proactive mechanism can improve the network response to failures or congestion whenever such backup paths exist. To provide these functionalities, the set of forwarding alternatives has to be large enough to achieve a good path diversity.
However, current routers only support ECMP. This feature
corresponds to a simple variant of Dijkstra where equal cost paths
are inherited along the shortest path tree (SPT).
The optimality condition of subpaths computed with ECMP restricts the number of loopfree paths and so reduces potential advantages of multipath routing.
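As an illustration of the ECMP variant described above, the following minimal Python sketch propagates sets of equal cost first hops along the shortest path tree. The adjacency-dictionary graph representation and the function name `ecmp_next_hops` are assumptions made for this example; they are not taken from any router implementation.

```python
import heapq


def ecmp_next_hops(graph, root):
    """Dijkstra variant that keeps, for every node, the set of first hops
    lying on an equal best cost path from root (hypothetical helper).

    graph: dict mapping node -> dict of neighbor -> strictly positive weight.
    Returns (cost, next_hops).
    """
    cost = {root: 0}
    next_hops = {root: set()}
    heap = [(0, root)]
    done = set()
    while heap:
        c, x = heapq.heappop(heap)
        if x in done:
            continue
        done.add(x)
        for y, w in graph[x].items():
            # First hops usable to reach y through x: y itself when x is the
            # root, otherwise every first hop already inherited by x.
            via = {y} if x == root else next_hops[x]
            new_cost = c + w
            if y not in cost or new_cost < cost[y]:
                cost[y] = new_cost
                next_hops[y] = set(via)
                heapq.heappush(heap, (new_cost, y))
            elif new_cost == cost[y]:
                # Equal cost path: merge the first hops instead of replacing.
                next_hops[y] |= via
    return cost, next_hops


# Square topology with unit weights: two equal cost paths from s to d.
square = {
    "s": {"a": 1, "b": 1},
    "a": {"s": 1, "d": 1},
    "b": {"s": 1, "d": 1},
    "d": {"a": 1, "b": 1},
}
costs, hops = ecmp_next_hops(square, "s")
print(costs["d"], sorted(hops["d"]))  # 2 ['a', 'b']
```

If the weight of the link between s and b is raised to 2, the path via b is no longer an equal cost path and only the first hop a remains, which is the restriction discussed in this paragraph.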
In order to use multiple unequal cost paths between a pair of ingress and egress
routers, there are two forwarding possibilities. On the one hand, source multipath forwarding schemes can use
MPLS with a path signaling protocol (such as RSVP-TE [4]) to establish any desired paths.
With this kind of approach, either the deployment is generalized in the whole
network and does not scale very well (proportional to the square of the number of routers), or
the reaction time can be as long as the notification delay on the
return path.
On the other hand, multipath routing protocols with hop
by hop forwarding need to validate a set of next hops such that the recursive composition between neighbor routers does not create forwarding loops (see [14], [15] and [17]).
The main limitation of these protocols is the complexity in time, space and number of messages exchanged to compute and validate loopfree paths.
In this paper, we propose a simple hop by hop scheme that does not require a signaling protocol to validate loopfree paths. If the
validation procedure, whose goal is to verify the absence of loops,
is local (without exchanging any message) and does not involve all
routers, then the deployment can be incremental.
Our approach is equivalent to ECMP in terms of time, space and
message exchange complexity but computes a greater diversity of forwarding alternatives.
In this paper, we propose the following contributions:

a new graph decomposition analysis.

two variants of the Dijkstra algorithm: DijkstraTransverse (DT) and multiDijkstraTransverse (mDT).

a proof that they compute at least two distinct next hops from the calculating node towards each node of the graph if such next hops exist.

an evaluation of the efficiency and the complexity of our proposition compared to existing techniques.
This paper is organized as follows. Section 2 summarizes basic multipath routing notions and related work. Section 3 introduces our algorithms and their properties. Section 4 presents our simulation results to underline the relevance and the low time complexity of our proposition.
2 Notations and context
Table 1 lists the graph definitions used in the paper. Notations are related to the multipath hop by hop forwarding context: computed paths are loopfree and first hop distinct. We order paths according to an additive metric , and we focus on the best paths having distinct first hops. To distinguish equal cost paths, we consider the lexicographical order of first hops. For simplicity reasons we do not consider the multigraph issue: a first hop is equivalent to a successor node, the next hop. The valuation denotes the weight of each directed link used by the routing protocol. Let us define a safety property for distributed routing policies.
Definition: Loopfree routing property at the router level.
A multipath routing protocol is loopfree if it always converges to a stable state such that when any router s forwards a packet to any next hop v towards any destination d, this packet never comes back to s.
Notations  Definitions 

oriented graph with a set of nodes , a set of  
edges and a strictly positive valuation of edges .  
edge connecting node to node  
we assume that .  
,  incoming and outgoing degrees of node . 
set of neighbors of node ().  
best loopfree path linking to . Recursively,  
this is the best path whose first edge is distinct from  
the first edge of the best paths.  
cost of the path  
.  
best next hop computed on towards . This is  
the first hop of . 
With hop by hop link state multipath routing using multiple unequal
cost paths, two phases may be necessary to ensure loopfree routing: a
path computation algorithm and a validation process.
We do not consider validation processes using a signaling protocol (as can be done with distance vector routing messages, see [15] for example).
With unipath or ECMP routing, the subpath optimality condition guarantees the
correctness of next hop composition. To increase the number of valid
alternatives, the simplest rule to select a next hop on a router
(such that ) is the downstream criterion, which can
be expressed as follows:
(1) 
This rule is referenced in the ISIS standard ISO 8473, is used in OSPF-OMP [14] and is denoted LFI in
[15] (with the particularity of avoiding routing loops even
in transient periods of topology changes). This rule is called
one hop vision in [17], where Yang and Wetherall
introduce a set of rules whose flexibility
increases the number of valid neighbors thanks to a two hops
vision. This set of rules is more complex: the forwarding mechanism
is specific to the incoming interface and allows forwarding loops at the router
level but not at the link level. Thus, a packet is never forwarded through the same link twice, but it can enter the same router twice.
The authors suggest that minimizing the queue level
should be the primary goal; however, delays can increase if paths
traverse the same router several times, and this unnecessarily
consumes more resources (router CPU, link bandwidth, …). We
consider that queue usage is not the only resource to save.
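The downstream criterion can be sketched in a few lines of Python. The sketch assumes the rule takes its usual form: a neighbor v of router s is a valid next hop towards destination d when the best cost from v to d is strictly lower than the best cost from s to d. The graph representation and the names `dijkstra` and `downstream_next_hops` are hypothetical and chosen only for this illustration.

```python
import heapq
from math import inf


def dijkstra(graph, src):
    """Best cost from src to every reachable node (positive weights)."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        c, x = heapq.heappop(heap)
        if c > dist[x]:
            continue  # stale heap entry
        for y, w in graph[x].items():
            if y not in dist or c + w < dist[y]:
                dist[y] = c + w
                heapq.heappush(heap, (c + w, y))
    return dist


def downstream_next_hops(graph, s, d):
    """Neighbors of s that are strictly closer to d than s is
    (assumed form of the downstream criterion)."""
    best_from_s = dijkstra(graph, s).get(d, inf)
    valid = []
    for v in graph[s]:
        if dijkstra(graph, v).get(d, inf) < best_from_s:
            valid.append(v)
    return sorted(valid)


# The path via b costs 3 while the best path via a costs 2, so ECMP would
# keep only a; both neighbors are nevertheless strictly closer to d than s.
example = {
    "s": {"a": 1, "b": 2},
    "a": {"s": 1, "d": 1},
    "b": {"s": 2, "d": 1},
    "d": {"a": 1, "b": 1},
}
print(downstream_next_hops(example, "s", "d"))  # ['a', 'b']
```

Because every validated hop strictly decreases the remaining best cost to the destination, a packet forwarded along validated hops cannot return to a router it has already visited.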
In order to perform loopfree routing, the validation process needs
to compute a set of candidate next hops. A candidate next hop is a first hop of a computed path which is not yet validated for loopfree routing. On a given calculating
node (a root node ), the simplest way to obtain an exhaustive
candidate set is to compute the SPT of all
neighbor nodes. Thus, router can use the best costs information
of its neighborhood. This approach is denoted kD in the
following, and our analysis uses this technique as a reference. The
complexity of kD depends on the number of neighbors:
instances of the Dijkstra algorithm are necessary to compute the local and neighborhood best costs. If a router has a large number of interfaces, the computation time can be too long. Even if this calculation is typically done offline, the router is unable to switch traffic if congestion or a failure occurs while the computation is in progress.
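The kD reference technique can be sketched as follows: one Dijkstra run per neighbor of the root, each producing a row of a cost matrix. This is a minimal sketch under stated assumptions, not the authors' implementation; the adjacency-dictionary format and the names `dijkstra` and `kd_cost_matrix` are introduced here for illustration only.

```python
import heapq


def dijkstra(graph, src):
    """Best cost from src to every reachable node (positive weights)."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        c, x = heapq.heappop(heap)
        if c > dist[x]:
            continue  # stale heap entry
        for y, w in graph[x].items():
            if y not in dist or c + w < dist[y]:
                dist[y] = c + w
                heapq.heappush(heap, (c + w, y))
    return dist


def kd_cost_matrix(graph, s):
    """One Dijkstra run per neighbor v of s (hypothetical helper).

    matrix[v][d] is the weight of link (s, v) plus the best cost from v
    to d, i.e. the cost of reaching d when v is used as first hop.
    """
    matrix = {}
    for v, w_sv in graph[s].items():
        dist_v = dijkstra(graph, v)
        matrix[v] = {d: w_sv + c for d, c in dist_v.items() if d != s}
    return matrix


example = {
    "s": {"a": 1, "b": 2},
    "a": {"s": 1, "d": 1},
    "b": {"s": 2, "d": 1},
    "d": {"a": 1, "b": 1},
}
matrix = kd_cost_matrix(example, "s")
print(matrix["a"]["d"], matrix["b"]["d"])  # 2 3
```

The loop over `graph[s]` makes the cost visible: the running time is that of a single shortest path computation multiplied by the number of neighbors of the root, which is why a router with many interfaces is penalized.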
Another way is to use an enhanced SPT
algorithm to locally compute multiple paths for each destination.
For example, algorithms and implementations presented in
[12] are designed to compute the set of shortest
loopfree paths, but do not guarantee that these paths are first hop
distinct. The shortest loopfree paths problem is not suited for
simple hop by hop forwarding. Indeed, in order to forward packets via these explicit paths,
a signaling protocol is necessary to mark routes from the ingress
router towards each egress router. Here we focus on distinct first
hops computation (), and paths are implicitly stored as
candidate next hops. The objective of our approach is to compute a set of loopfree first hop disjoint paths with a lower complexity than kD.
For this purpose, we calculate a set of costs containing at least
two entries for each destination node in the graph.
With an enhanced SPT algorithm able to compute such a set, rule
(1) becomes:
(2) 
If satisfies rule
(2), then is a valid next hop.
Thus, the next hop can be used by to reach and
it satisfies the loopfree routing property at the router level. Note that: .
Terms  Definitions

branch  subtree of the SPT rooted at a neighbor of
transverse edge  an edge is transverse if it connects two distinct branches and or if it connects the root and a node in a
internal edge  an edge is internal if it connects two nodes and belonging to a given and such that
k-transverse path  a path is k-transverse if it contains exactly k transverse edges and no internal edge
Simple transverse path  a transverse path such that and is a transverse edge ()
Backward transverse path  a transverse path such that for a , and
Forward transverse path  a transverse path such that for a , and
To sum up, our approach follows these three steps:

it uses an unmodified link state routing protocol such as OSPF or ISIS to obtain topological information,

it uses a multipath computation algorithm (see section 3) instead of Dijkstra to compute candidate next hops,

it uses condition (2) to select valid next hops.
3 Candidate next hops computation
This section describes our path computation algorithms and an original edge partition analysis. Given a root node , the set of edges of a graph can be partitioned into four subsets (we consider both directions of each edge):

Edges corresponding to first hops of primary paths.

Edges belonging to subtrees corresponding to branches.

Transverse edges connecting two distinct branches or connecting the root and a branch without being the first hop of a primary path.

Internal edges linking nodes of the same branch without belonging to this branch.
These four subsets exhaustively describe because the set of
branches contains all nodes (except the root node ) in the graph.
Fig. 1 illustrates an edge partition on a simple graph (some nodes are identified with a letter to facilitate the reading of section 3.2). In this
graph (we consider as a constant function), there are three branches (black, gray and white nodes), two
transverse edges (dashed arcs denoted and ) and one internal
edge (dotted arc denoted ). Edges , and
correspond to the three first hops (red arcs) linking to the three
branches.
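The edge partition can be made concrete with a short Python sketch. It assumes symmetric weights and treats each link as a single undirected edge, which simplifies the "both directions" convention used above; ties between equal cost paths are broken by exploration order rather than by the lexicographic order of first hops. The names `shortest_path_tree` and `partition_edges`, and the six-node example graph, are hypothetical.

```python
import heapq


def shortest_path_tree(graph, root):
    """Parent pointers of a shortest path tree rooted at root."""
    dist = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    done = set()
    while heap:
        c, x = heapq.heappop(heap)
        if x in done:
            continue
        done.add(x)
        for y, w in graph[x].items():
            if y not in dist or c + w < dist[y]:
                dist[y] = c + w
                parent[y] = x
                heapq.heappush(heap, (c + w, y))
    return parent


def partition_edges(graph, root):
    """Classify every link as first_hop, branch, transverse or internal.
    Assumes a connected graph."""
    parent = shortest_path_tree(graph, root)

    def branch(x):
        # A branch is identified by the neighbor of the root it hangs from.
        while parent[x] != root:
            x = parent[x]
        return x

    classes = {}
    for x in graph:
        for y in graph[x]:
            edge = tuple(sorted((x, y)))
            if edge in classes:
                continue
            in_tree = parent.get(x) == y or parent.get(y) == x
            if in_tree:
                classes[edge] = "first_hop" if root in edge else "branch"
            elif root in edge or branch(x) != branch(y):
                classes[edge] = "transverse"
            else:
                classes[edge] = "internal"
    return classes


# Unit weights: branches {a, c, e} and {b, f} hang from the root s.
topology = {
    "s": {"a": 1, "b": 1},
    "a": {"s": 1, "c": 1, "e": 1},
    "b": {"s": 1, "f": 1},
    "c": {"a": 1, "e": 1, "f": 1},
    "e": {"a": 1, "c": 1},
    "f": {"b": 1, "c": 1},
}
classes = partition_edges(topology, "s")
print(classes[("c", "e")], classes[("c", "f")])  # internal transverse
```

In this example the link between c and e joins two nodes of the same branch without being a tree edge, so it is internal, whereas the link between c and f joins two distinct branches and is therefore transverse.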
With multipath hop by hop routing, the primary path denotes
the optimal path depending on a given metric and a lexicographic
order to rank equal cost paths. Thus, for a given pair , an
alternate path is a path whose first edge is distinct from the first
one of the primary path . More generally, if the forwarding mechanism is distributed such as with hop by hop routing, then all alternate paths are first hop distinct. Table
2 summarizes all definitions related to
transverse paths terminology.
The path is simple transverse and
the path is backward transverse. Paths
and are both forward transverse. However,
contains a sub path whereas
contains a sub path .
The path is transverse.
The routing information base cannot directly use the set of candidate
next hops corresponding to the first hops of transverse paths to perform forwarding, since routing loops may occur. Our approach needs a validation
mechanism to select valid next hops among candidate next hops in order to guarantee
the safety of forwarding. In this paper, we consider rule (2), introduced in section 2, to
validate candidate next hops. Due to space limitations, we do not
discuss or evaluate rules that allow a higher route diversity (see
[11]).
3.1 DT and mDT algorithms
In [10], we have proposed and
described the DijkstraTransverse algorithm (DT). Here, we focus on DT properties that we have not presented in [10] (see section 3.2) and on a DT improvement that we call multiDT (mDT). However, the basics of DT and mDT are similar.
To sum up, DT and mDT compute a multipath cost matrix on a given root node
(denoted in the following). A multipath cost matrix contains an
overestimation of best costs for all () destinations and via all
possible () neighbors of . The goal of these algorithms is to calculate a set of candidate next hops corresponding to costs associated with each neighbor. The calculation consists of two main stages:

Compute the best path tree and transverse edges.

Compute backward and forward transverse paths.
At each iteration, our algorithms compute the best transverse paths depending on the first hop. Without an optimized structure to implement the best costs vector, the complexity of DT for each calculating node s is in the worst case:
DT adds a time complexity proportional to the outgoing degree of the
given root node compared to Dijkstra.
With a Fibonacci heap [7] to implement the best costs
vector
The set of candidate next hops computed with DT does not always include all next hops corresponding to equal best cost paths. mDT (see algorithm 1) is able to solve this problem. With mDT, only the first computation phase of DT is modified by using a next hop matrix denoted . This matrix represents the existence of a next hop per neighbor for each destination. is updated at each edge exploration. Candidate next hops recording follows a transitive rule: with . Initially, if then . With ECMP, the update of is performed only if . We have chosen to generalize this approach to improve the upper bound on the cost of forward transverse paths composed with a backward transverse path. This generalization increases the number of validated next hops. Indeed, during the exploration of the set of successors of node , if node is not already marked, it inherits all forwarding alternatives of , including when is an internal edge. In this case, the next hop inheritance is not restricted to branches as with DT: is not the son of on a primary path. mDT can use all forwarding alternatives already computed towards . This set of paths is not limited to transverse alternatives: it can contain alternate paths with several internal or transverse edges.
The mDT computation is based on the order of node exploration, which depends on the rank of costs stored in . With mDT, the first computation phase is able to calculate all candidate next hops corresponding to ECMP alternatives. Recursively, the cost inheritance takes into account all the sets of equal best cost paths for all marked nodes. The complexity of mDT is slightly greater than that of DT: for each iteration of the main loop, operations are necessary to execute the inheritance of next hops and their costs. The worst case complexity of mDT is in without an optimized structure for .
3.2 Properties of DT and mDT
In this section, we prove the ability of our algorithms to compute at least two candidate next hops between each pair of nodes in the graph if such next hops exist.
Property 1.
DT computes all transverse paths, and mDT computes all paths computed with DT and all equal best cost paths.
The proof of these properties relies on next hops inheritance performed by DT and mDT (for more details, see [10]).
Now, let us define a major property of transverse paths.
Property 2.
If there exists an alternate path , then there exists a transverse path between and .
The proof of this property relies on two lemmas.
Lemma 1.
If there exists an alternate path from to then there exists a path from to whose cost is not greater than the one of and containing only one transverse edge.
[Proof of Lemma 1] Let be an alternate path from to where is the last transverse edge of and consider the shortest path from to . Then either is empty because and , or is a primary path which is not longer than . Let be the operator representing the path concatenation. In both cases, there exists a path such that is an alternate path with only one transverse edge and which is not longer than .
Figure 1 illustrates lemma 1. The transverse path between and via the neighbor node uses to reach the transverse edge . There exists an alternate simple transverse path . Note that the existence of a path with several transverse edges implies that DT (and mDT) implicitly records a transverse path in the cost matrix with a cost lower than or equal to the cost of .
Lemma 2.
If there exists an alternate path from to with one transverse edge, then there exists a transverse path linking and .
[Proof of Lemma 2]
Let be such an alternate path where
is the unique transverse edge.
Without loss of generality we may assume that
is a primary path (see lemma 1) without any internal edge.
Note that .
To characterize the differences between transverse paths, we use an
“ancestor function”.
An ancestor of a node is a node such that there exists a
primary path included in the SPT rooted at . The
closest common ancestor of nodes and is an ancestor of
and such that for any common ancestor of and ,
is also an ancestor of .
Let be the closest common
ancestor of nodes and .

If then there exists a forward transverse path linking and : a simple transverse path between and and a primary path between and .

Else if then there exists a backward transverse path linking and : a simple transverse path between and and a path in the reverse direction of the primary path between and ^{2}.

Else if , then is the node where the branch including and is subdivided into two subbranches, one containing , the other containing ^{3}. In this case, there exists a forward transverse path linking and which contains a backward transverse path and a primary path .
Thus, in each case, there exists a 1-transverse path that reaches .
Figure 1 illustrates lemma 2. Although the
alternate path is
not transverse because it contains an internal edge ,
there exists a forward transverse path
. In this case, the internal edge is bypassed with a backward
composition followed by a forward composition.
This makes it possible to compute the alternate next hop to reach .
Thanks to the backward and forward composition, if there exists a
transverse path, then DT finds it. These two phases make it possible to use
edges of the SPT in both directions. Moreover, DT considers all
transverse edges because, as is the case for the classical Dijkstra algorithm,
all edges must be explored in order to mark all nodes.
The difference is that DT implicitly stores longer or equal cost
paths in the cost matrix.
Corollary 1.
For any pair of nodes , if there exists an alternate path from to , then DT and mDT compute at least two candidate next hops towards .
Corollary 2.
If the graph contains no bridge edge, then DT and mDT compute at least two candidate next hops between any node and any other node of the graph.
For a given destination, corollary 1 implies that the number of candidate next hops is at least if there exists an alternate path linking and . Corollary 2 is more specific: if the network is 2-edge connected, then corollary 1 extends to all pairs of routers.
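The hypothesis of corollary 2 can be checked on a topology before running DT or mDT. The sketch below uses the classical depth-first low-link technique to list bridge edges; it is a standard graph routine, not part of the algorithms proposed here. It assumes that links are undirected and that there are no parallel links, in line with the simplification made in section 2. The function name `find_bridges` is hypothetical, and the recursive form is only suitable for small graphs.

```python
def find_bridges(graph):
    """Return the sorted list of bridge edges of an undirected graph.

    graph: dict mapping node -> iterable of neighbors (no parallel links).
    """
    index = {}    # discovery order of each node
    low = {}      # lowest discovery index reachable from the subtree
    bridges = []
    counter = 0

    def visit(x, parent):
        nonlocal counter
        index[x] = low[x] = counter
        counter += 1
        for y in graph[x]:
            if y == parent:
                continue  # do not reuse the tree edge we arrived by
            if y in index:
                # Back edge: x can reach an earlier node without its parent.
                low[x] = min(low[x], index[y])
            else:
                visit(y, x)
                low[x] = min(low[x], low[y])
                if low[y] > index[x]:
                    # Nothing below y reaches x or above: (x, y) is a bridge.
                    bridges.append(tuple(sorted((x, y))))

    for node in graph:
        if node not in index:
            visit(node, None)
    return sorted(bridges)


# A triangle s-a-b with a pendant node d attached to b.
topology = {
    "s": ["a", "b"],
    "a": ["s", "b"],
    "b": ["s", "a", "d"],
    "d": ["b"],
}
print(find_bridges(topology))  # [('b', 'd')]
```

An empty result means that the topology is 2-edge connected, which is the condition under which corollary 2 applies to every pair of routers.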
4 Evaluation
We use the Network Simulator 2 (ns2, [2]) to compare several routing approaches. ECMP is already implemented within the link state module of ns2. We have extended ns2 to implement DT, mDT, kD and the downstream criterion, rule (2), in the routing module (see [1] for the implementation).
4.1 Topologies and simulations setup
We present results obtained on three different kinds of topologies.
The first category of networks consists of real topologies with actual IGP
weights (for confidentiality, we approximate their size in Table
3).
Topologies denoted ISP1 and ISP2 are commercial networks covering a
European country. ISP3 and ISP4 are Tier1 ISP networks.
The second category of topologies was chosen among the Rocketfuel
inferred set of maps given in [9].
We have also used the Igen topology generator ([13]) in order
to obtain a set of evaluation topologies of various sizes. We have
generated topologies containing between and nodes using the
medoid parameter, the delaytriangulation heuristic and a
sprint pop design. The parameter that determines the number of
routers per cluster is chosen such that , so that each cluster contains approximately routers for each
generated topology. These parameters offer a great physical
diversity to measure the relevance of our proposition to achieve the
same level of diversity as computed with . The link valuation
used for this third category is the inverse of the link capacity. The mean degree, denoted , is approximately the same for each generated topology: .
These networks represent access backbones and contain two kinds of
links: Mbps for access links and Gbps for backbone links (so
that weights of links are respectively and ).
4.2 Results
4.3 Diversity results
First, we have measured the path diversity (see Fig.2). We have
calculated the total number of candidate next hops obtained with
ECMP (denoted EC), DT, mDT, and multiple Dijkstra computations
(kD). Results are represented as a performance ratio between
the considered technique and kD for all routers of a given network.
kD provides the best diversity but with a higher computation cost.
We observe that DT and mDT are able to compute approximately of
the candidate next hops obtained with kD, while ECMP obtains a
performance ratio only between and .
4.4 Complexity results
Then, we have compared the time complexities of the aforementioned algorithms (see Fig. 3). We have represented the execution time, measured in number of operations, needed by DT, mDT and kD to compute their set of candidate next hops. The number of operations is an average computed for each router. This value takes into account all operations necessary to extract the of and perform update of , and .
Network  Nodes  Edges  Candidate next hops: kD mean  EC/kD (%)  DT/kD (%)  mDT/kD (%)  Validated next hops: kD mean  EC/kD (%)  DT/kD (%)  mDT/kD (%)  Number of operations: kD mean  EC/kD (%)  DT/kD (%)  mDT/kD (%)
ISP1  25  50  1.46  76  97  97  1.10  97  100  100  489  60  66  75 
ISP2  50  200  3.58  43  93  97  1.79  69  89  94  6730  30  32  32.5 
ISP3  110  350  2.70  55  89  92  1.45  82  97  99  8079  38  41  43.5 
ISP4  210  880  3.73  44  86  88  1.81  72  96  99  41747  27  28  31 
Exodus  79  294  3.58  44  88  96  1.73  58  94  99  5569  29  34  37 
Ebone  87  322  3.49  46  90  96  1.76  77  93  99  9698  30  33  36 
Telstra  104  304  2.30  72  92  95  1.30  90  98  99  6526  54  57  59 
Above  141  748  5.29  34  86  97  2.50  58  89  99  40143  18.5  20  23 
Tiscali  161  656  3.68  54  91  97  1.97  74  92  97  31044  27  29  32 
We notice that the time saved with DT or mDT is significant
compared to kD. The number of operations needed by kD is
approximately whereas mDT and DT
need approximately operations. This complexity is
equivalent to the worst case of an ECMP computation. The time complexity upper bound is reached because some routers of Igen topologies
have a high degree of connectivity.
4.5 Loopfree diversity results
Finally, we have compared the number of validated next hops that are selected with the downstream criterion (rule 2) depending on the computation algorithm (see Fig. 4). We observe that mDT validates as many next hops as kD. This result can be explained by the specific valuation function of our set of generated topologies: only two very distant weights are used in these networks.
4.6 General results and discussion
Results given in Table 3 illustrate the same evaluation of
performance ratios and complexity on the set of real and inferred
topologies. For these sets of topologies, Table 3 also shows the average number of candidate and
valid next hops per destination obtained with kD. Diversity ratio results
are similar to the ones obtained with Igen although degree and
weight distributions are completely different. The main difference comes from
the time complexity evaluation. On these topologies, the maximum
degree of nodes is half that of Igen topologies. The
measured complexity is far from the theoretical worst case. More generally, several parameters, such as the valuation function or the degree distribution, may strongly influence complexity measures, and thus the performance of algorithms. For example, if is a constant function, rule (2) is equivalent to ECMP. Thus, in this case, the number of valid next hops is the same for mDT, kD and ECMP. Another key point is that the alternate paths which are not computed with mDT generally have a cost much greater than that of the primary path, which is why the ratio of loopfree alternatives between mDT and kD is close to .
To summarize, although DT and mDT consume less processor resources than
kD, they are able to offer almost the same diversity in terms of
validated next hops.
5 Conclusion
Multipath routing enhances network reachability and allows load balancing to circumvent congestion or failures. However, the overhead imposed by signaling messages and the time and space complexity can hamper its deployment. In this paper, we propose a simple scheme that is able to generate a greater path diversity than ECMP with an equivalent overhead. Our path computation algorithms, DijkstraTransverse and its improvement multiDT, compute at least two candidate next hops between all pairs of routers if such next hops exist. To validate candidate next hops in a distributed manner, we have considered the simplest loopfree routing rule, the downstream criterion. Our evaluations suggest that the time saved is very significant. We show that the number of next hops validated with the downstream criterion is almost the same using mDT or a Dijkstra computation per neighbor. Moreover, our proposition can be integrated in OSPF or ISIS by replacing the path computation algorithm without any change in the protocol. It can be deployed incrementally, with some routers using ECMP and others DT or mDT. Our proposition can be extended to compute backup next hops that are selected only if a failure occurs.
Acknowledgement
The research results presented herein have received support from Trilogy (http://www.trilogyproject.eu), a research project (ICT216372) partially funded by the European Community under its Seventh Framework Programme. The views expressed here are those of the author(s) only. The European Commission is not liable for any use that may be made of the information in this document. The authors would like to gratefully acknowledge Pierre Francois and Olivier Bonaventure for their comments.
Footnotes
1 The minimum extraction has a unit cost whereas the minimum suppression has an amortized cost in . For simplicity, the evaluation results that we present in this paper only rely on array lists.
2 We assume that .
3 Note that we know that and .
References
[1] “Implementation of dt and mdt in ns2,” http://wwwr2.ustrasbg.fr/~merindol/uploads/Research/DT.tar.gz.
[2] “The network simulator ns2,” http://www.isi.edu/nsnam/ns.
[3] D. Applegate and E. Cohen, “Making intradomain routing robust to changing and uncertain traffic demands: understanding fundamental tradeoffs,” in SIGCOMM, 2003.
[4] D. Awduche, L. Berger, D. Gan, T. Li, V. Srinivasan, and G. Swallow, “RSVP-TE: Extensions to RSVP for LSP tunnels,” RFC 3209, 2001.
[5] R. Banner and A. Orda, “Multipath routing algorithms for congestion minimization,” IEEE/ACM Trans. Netw., 2007.
[6] I. Cidon, R. Rom, and Y. Shavitt, “Analysis of multipath routing,” IEEE/ACM Trans. Netw., vol. 7, no. 6, 1999.
[7] T. H. Cormen, C. Stein, R. L. Rivest, and C. E. Leiserson, Introduction to Algorithms. McGraw-Hill Higher Education, 2001.
[8] S. Kandula, D. Katabi, B. Davie, and A. Charny, “Walking the tightrope: Responsive yet stable traffic engineering,” in SIGCOMM, 2005.
[9] R. Mahajan, N. Spring, D. Wetherall, and T. Anderson, “Inferring link weights using end-to-end measurements,” in ACM SIGCOMM Internet Measurement Workshop, 2002.
[10] P. Mérindol, J.-J. Pansiot, and S. Cateloin, “Path computation for incoming interface multipath routing,” in ECUMN, 2007.
[11] P. Mérindol, J.-J. Pansiot, and S. Cateloin, “Improving load balancing with multipath routing,” in ICCCN, 2008.
[12] M. Pascoal, “Implementations and empirical comparison for k shortest loopless path algorithms,” in The Ninth DIMACS Implementation Challenge: The Shortest Path Problem, 2006.
[13] B. Quoitin, “Topology generation through network design heuristics,” http://www.info.ucl.ac.be/~bqu/igen/.
[14] C. Villamizar, “OSPF optimized multipath (OSPF-OMP): draft-ietf-ospf-omp-02.txt,” IETF, Draft, Feb. 1999.
[15] S. N. Vutukury, “Multipath routing mechanisms for traffic engineering and quality of service in the internet,” Ph.D. dissertation, 2001.
[16] H. Wang, H. Xie, L. Qiu, Y. R. Yang, Y. Zhang, and A. Greenberg, “COPE: traffic engineering in dynamic networks,” in SIGCOMM, 2006.
[17] X. Yang and D. Wetherall, “Source selectable path diversity via routing deflections,” in SIGCOMM, vol. 36, 2006.