Active Topology Inference using Network Coding
Our goal is to infer the topology of a network when (i) we can send probes between sources and receivers at the edge of the network and (ii) intermediate nodes can perform simple network coding operations, i.e., additions. Our key intuition is that network coding introduces topology-dependent correlation in the observations at the receivers, which can be exploited to infer the topology. For undirected tree topologies, we design hierarchical clustering algorithms, building on our prior work in (25). For directed acyclic graphs (DAGs), first we decompose the topology into a number of two-source, two-receiver (2-by-2) subnetwork components and then we merge these components to reconstruct the topology. Our approach for DAGs builds on prior work on tomography (37), and improves upon it by employing network coding to accurately distinguish among all different 2-by-2 components. We evaluate our algorithms through simulation of a number of realistic topologies and compare them to active tomographic techniques without network coding. We also make connections between our approach and alternatives, including passive inference, traceroute, and packet marking.
Keywords: Network Tomography, Network Coding, Topology Inference
Knowledge of network topology is important for network management, diagnosis, operation, security and performance optimization. Depending on the context, one may be interested in the topology at different layers, such as the Internet’s router-level topology, an overlay network topology, the topology of an ad-hoc wireless network, etc.
There is a large body of prior work on measurements and inference of network topology. One family of techniques assumes the cooperation of nodes in the middle of the network, and uses traceroute measurements to collect the ids of nodes along paths and use them to reconstruct the topology. Another family of techniques, referred to as network tomography, assumes no cooperation from internal nodes and relies on end-to-end probes to infer internal network characteristics, including topology. More specifically, multicast or unicast probes are sent/received between sets of sources/receivers at the edge of the network, and the topology is inferred based on the number and order of received probes.
In this paper, we revisit the problem of topology inference using end-to-end probes, in networks where intermediate nodes are equipped with simple network coding capabilities. We show how to exploit these capabilities in order to perform active topology inference in a more accurate and efficient way than existing tomographic techniques.
Our key intuition is that network coding introduces topology-dependent correlation in the content of packets observed at the receivers, which can then be exploited to reverse-engineer the topology. For example, a coding point (that combines multiple incoming packets into one or more outgoing packets) introduces correlation between packets coming from different sources, in a similar way that multicast introduces correlation in the packets sent by the same source and observed by several receivers. In fact, the correlation introduced by multicast has been the starting point and the main idea underlying tomographic topology inference. Subsequent schemes made this idea more practical, by emulating multicast with back-to-back unicast probes (11); (38). In contrast, relating probes from different sources to reveal intermediate nodes, also referred to as multiple-source tomography, has been a challenge (4); (38); (37). Using simple network coding operations at coding points solves this problem and allows accurate and fast topology inference.
Our approach is general and can be applied to infer the topology in a range of scenarios, including but not limited to wireless multi-hop networks. Wireless multi-hop networks are able to support simple network coding operations (additions are sufficient for our schemes), as demonstrated in (33), and can therefore benefit from our techniques. Furthermore, there is a good match between some properties and constraints of such networks and our schemes. First, there is natural variability in the delay of wireless links, which (if appropriately used, as explained in later sections) can expedite inference. Second, our schemes keep internal nodes simple (moving processing for inference to dedicated nodes at the edge) and anonymous (revealing the logical topology but not the identities of nodes). Finally, improving the speed of inference may prove important to keep up with changes, e.g., due to mobility.
Our contributions are as follows. First, we consider undirected trees, where leaves can act as sources or receivers of probes, and we design hierarchical clustering algorithms that infer the topology, building on our prior work in (25). Then, we consider directed acyclic graphs (DAGs) with a fixed set of M sources and N receivers and a pre-determined routing scheme. We first decompose the topology into a number of two-source, two-receiver subnetwork components and then we merge these components to reconstruct the topology. Our approach for DAGs builds on prior work on tomography (37), and improves upon it by employing simple network coding operations at intermediate nodes to deterministically distinguish among all possible 2-by-2 subnetwork components, which was impossible without network coding (37); (38). We evaluate our algorithms through simulation over a number of topologies and we show that they can infer the topology accurately and faster than tomographic approaches without network coding. We present our schemes as active probing: special probes are sent by the sources, specifically for the purpose of inference, and are treated in special ways by intermediate nodes and eventually received by the receivers and processed at a fusion center. We believe that our active probing approach with network coding provides one more building block in the already large space of topology inference techniques, whose core strength is the ability to identify joining points. We also compare and make connections between our active probing approach and alternatives, such as passive inference, traceroute, and packet marking.
The rest of the paper is organized as follows. Section 2 discusses related work. Section 3 presents our assumptions, notation, and problem statement. Section 4 summarizes the main results of the paper. Section 5 presents algorithms for inferring tree topologies. Binary trees are discussed in Section 5.1, in the absence (Section 5.1.1) or presence (Section 5.1.2) of packet loss. General trees are discussed in Sections 5.2.1 and 5.2.2. Section 6 presents algorithms for inferring directed acyclic graphs (DAGs). Section 6.2 presents algorithms for inferring 2-by-2 subnetwork components, in the absence (6.2.1) or presence (6.2.2) of packet loss. Section 6.3 explains how to merge these components to reconstruct the topology. Section 7 provides simulation results for some realistic topologies. Section 8 discusses two possible deployment scenarios (one as an active probing scheme and another one using packet marking), and makes connections between our approach and alternative topology inference approaches. Section 9 concludes the paper. Appendices A and B analyze the probability of error of our inference algorithms in trees and DAGs, respectively.
2 Related Work
One body of related work is network tomography in general, and topology inference in particular. A good survey of network tomography can be found in (7). An early work on topology inference using end-to-end measurements is (39), where the correlation between end-to-end multicast packet loss patterns was used to infer the topology of binary trees. The correctness of this idea was rigorously established in (20), and was extended to general trees and to measurements other than loss, such as delay variance (21), or more generally any metric that grows monotonically with the number of traversed links. The idea has also been extended to unicast probes (11); (38). In summary, tomographic schemes for topology inference use end-to-end active probing and feed the number, order, or a monotonic property of received probes as input to statistical signal-processing techniques. Inference of link characteristics (6) can also be combined with topology inference (38). Similar problems have also been studied in the context of phylogenetic trees (23). The work in (2) uses such algorithms (23) for topology inference in sparse random graphs.
In addition, inference of congested links has been studied from the angles of compressive sensing (49); (12); (13) and group testing (8); (17); (36). The work in (8) formulates the problem as graph-constrained group testing, where the items correspond to edges, some of them being defective, and the goal is to identify the defective edges given that the test matrix conforms to constraints imposed by the graph, e.g., the path connectivities. The work in (49) recovers sparse vectors, representing certain parameters of the links over the graph, through minimization. It improves the number of required measurements over (8), as compressive sensing allows real numbers for the link characteristics and measurements instead of the true/false binary values used in group testing problems.
Most tomographic approaches rely on probes sent from a single source in a tree topology (39); (3); (11); (18); (45); (19); (20); (22); (5); (21); (24). Rabbat et al. (37); (38); (14) introduced the multiple-source multiple-destination (M-by-N) tomography problem, by sending probes between sources and receivers. In (37); (38), it was shown that an M-by-N network can be decomposed into a collection of 2-by-2 components. Then, coordinated transmission of multi-packet probes from two sources and packet arrival order measurements at the two receivers were used to infer some information about the 2-by-2 topology. Assuming knowledge of M 1-by-N topologies and all 2-by-2 components, it was also shown how to merge a second source’s 1-by-N tree topology with the first one. The resulting M-by-N topology is not exact, but bounds are provided on the locations of joining points with respect to the branching points. This approach also requires a large number of probes, as do all approaches that need to collect enough probes for statistical significance (21); (18); (22); (11); (47). Our work on DAGs builds on and extends the multiple-source multiple-destination work in (37); (38), but uses network coding to achieve exact and fast topology inference.
A second body of related work is from the network coding literature. It is well known that linear network coding makes a network behave as a linear system, whose transfer function depends on the topology. Based on the source packets and the observations at the receivers, one can then try to passively infer the topology. The following papers consider that random linear network coding is employed for the purpose of information transfer, and they perform passive inference on the side. In (29), passive techniques are used to distinguish among failure patterns. In (32); (31); (43); (50), subspace properties at various nodes are used for topology inference and error localization. In (43), each node passively infers its upstream topology at no cost to throughput, but at high complexity.
In contrast, we propose active probing and a simple coding scheme at intermediate nodes, to achieve low-complexity topology inference at the end nodes. In Section 8, we provide a detailed comparison and make connections between active and passive topology inference. In (26); (35), we revisited link-loss (but not topology) tomography using active probing and network coding. In the first part of this paper, we extend our preliminary work in (25), where we showed that active probes from two sources and XOR at intermediate nodes are sufficient to infer binary tree topologies. This approach generalizes to trees, but not to general graphs. In (41), we used a different approach for general graphs, which builds on (37); (38): we identify 2-by-2 components and merge them together in an M-by-N topology. This journal paper combines and extends our preliminary work in (25); (41).
A practical approach for inferring the network topology is based on traceroute (27); (9); (46); (51); (10); (15); (30); (16); (48); (44). Multiple traceroutes are sent among monitoring hosts; they record node ids along paths, and this information is put together to reconstruct the graph. The traceroute-based approach is discussed in detail in Section 8.5.
Wireless sensor networks and information fusion are considered in (28); (34). Information is collected at sensor nodes and is forwarded towards a fusion center, following a known reverse tree topology. Information is aggregated (28) or network coded (34) at intermediate nodes, and the loss rates of links are inferred based on the observations at the fusion center. In contrast, we are interested in inferring the topology of DAGs.
3 Problem Statement
Assumptions about the Network. We are interested in inferring static topologies.
In the first part of the paper, we consider undirected trees with vertices, edges that can be used in both directions, and exactly one path between any two vertices. We denote by the leaf-vertices of the tree, which correspond to end-hosts that can act as sources or receivers of probe packets.
In the second part, we consider directed acyclic graphs (DAGs) with M sources and N receiver nodes, which we refer to as an M-by-N topology, following the terminology of (37); (38). Without loss of generality (W.l.o.g.), we present most of our discussion in terms of M = 2, i.e., inferring a 2-by-N topology; an M-by-N topology can be constructed by merging smaller structures. Similarly to (37); (38), we also assume that a predetermined routing policy maps each source-destination pair to a unique route from the source to the destination. This implies the following three properties.
There is a unique path from each source to each receiver.
Two paths from the same source to different receivers take the same route until they branch, so that all 1-by-2 components have the “inverted Y” structure; the node where the paths to the two receivers split is called a branching point.
Two paths from different sources to the same receiver use exactly the same set of links after they join, so that all 2-by-1 components have the “Y” structure; the node where the paths from the two sources merge is called a joining point.
These properties are consistent with destination-based routing: the next hop taken by a packet is determined by a routing table lookup on the destination address. Each subnetwork from one source to receivers is a 1-by-N tree; the general graph is called a “multiple-tree” network (37).
Loss and Delay. We consider scenarios with and without packet loss. Each link has a delay with a fixed part, e.g., the propagation and transmission delay, and a variable part, e.g., the queueing delay. Path delay is the sum of delays across the links in the path. We have no control over the delays of the links but we have control over the timing of operations at sources and intermediate nodes. We can make sources and intermediate nodes operate in time slots whose durations can be chosen to be much longer than link delays, as explained later.
Goal. Our goal, in this paper, is to design active probing schemes, i.e., the operation of sources, intermediate nodes and receivers, that will allow us to infer the logical topology from the observations at the receivers. We restrict the space of possible operations to the simple options described in the rest of this section. In later sections, we design schemes based on these simple operations and we show that they are sufficient for topology inference. We will revisit the problem statement and make it more precise in the sections for trees and DAGs.
Operation of Sources. An experiment consists of a pair of sources and sending, at the same time, a multicast probe packet each ( and , or more generally symbols from a finite field) to all receivers. These are special probes sent solely for the purpose of inference, not for regular data transfer, and treated in a special way, specified next, by intermediate nodes. We perform up to experiments. Consecutive experiments are spaced apart by a large time interval , to ensure that only probes in the same experiment are combined together.
Operation of Intermediate Nodes. Intermediate nodes are assumed to support unicast, multicast, and the simplest possible network coding operation, i.e., addition over a finite field. They operate in time slots of pre-determined duration (a window): a node waits for the duration of the window to receive probe packets from its incoming links; if it receives more than one packet, it codes them together and forwards (unicast or multicast) the resulting packet downstream.
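The choice of the window affects where the packets from the two sources meet. Essentially, an intermediate node can act either as a joining point (J), in which case it adds all incoming packets and forwards the output to all outgoing links, or as a branching point, in which case it forwards the single packet it received to all outgoing links.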
Operation of Receivers. Each receiver observes probes that are either one of the two source packets or a linear combination of them, as the result of network coding operations at intermediate nodes. Inference of topology is based only on these observations. We assume that the observations are sent to a fusion center for central processing and inference; consistent with the tomography literature, the communication between the receivers and the fusion center is outside the scope of this paper.
Intuition. Multicast and network coding (which is limited to simple addition in this paper and can thus be thought of as reverse multicast) introduce topology-dependent correlation in the content of packets, which can be used at the receivers to infer the underlying topology. In particular, multicast helps reveal the branching points while network coding helps reveal the joining points.
3.2 Scope and Discussion
Possible deployment scenarios are described in Section 8.1. The first scenario (sending special probes for the sole purpose of topology inference) is used to describe the schemes throughout the paper. Furthermore, and beyond the specific details of the deployment, we believe that our work provides a fundamental building block for exploiting correlation in the content of network coded packets for inference of joining points. Similarly, multicast tomography showed how to exploit correlation in multicast packets for inference of branching points; it was then followed by a series of papers that used unicast traffic to “mimic” multicast probes and reproduce the same functionality while being more practical.
We would like to emphasize that, in this paper, we apply network coding on special probes solely for the purpose of topology inference, and not for improving data transfer (decoding the source messages at the receivers). In data transfer, throughput and delay are indeed important metrics. In our problem, the important metrics are: identifiability (for which we show that network coding is necessary); and efficiency, i.e., the number of probes used and the amount of network resources consumed for a certain level of estimation accuracy (which we show is improved by network coding). Therefore, the delay of the algorithms is not of primary concern in this paper: inferring the topology in the order of seconds as opposed to milliseconds is acceptable in our setup. Multi-path routing, which could increase throughput with network coding, is not considered either.
4 Main Results
The main results obtained in this paper are the following:
For tree networks:
When there is no packet loss in the links, we design the deterministic Alg. 1, which infers the topology in iterations, where is the number of edges in the tree.
When there is packet loss in the links, we design Alg. 2, which infers the topology in iterations, where , and is the minimum probability of success across all paths between a source and a destination.
For DAGs, we decompose the topology into 2-by-2 subnetwork components, and then we merge these components to reconstruct the topology. We design inference algorithms to infer the 2-by-2’s, and we design merging algorithms to merge them back to the original topology:
Assuming knowledge of all the 2-by-2’s and one source’s 1-by-N tree topology, we design Alg. 7 that merges a second source’s topology with the first one by identifying all the joining points in steps.
Assuming knowledge of all the 2-by-2’s, but not the 1-by-N tree topology, we can identify all the joining points if and only if there are no branching points in a row. We merge the two topologies in steps, as we describe in Section 6.3.2.
We also provide a lower bound on the number of 2-by-2’s required by any merging algorithm to uniquely localize all the joining points in a 2-by-N topology, given one source’s 1-by-N topology. In Lemma 6.2, we show that it is .
We also make connections between our approach and alternative topology inference approaches in Section 8.
5 Inferring Trees
Overview. We design algorithms for inferring undirected tree topologies, based only on probes sent between leaf nodes. We follow a hierarchical, top-down approach, by iteratively dividing the tree topology into smaller clusters and revealing how the groups are connected to each other.
Operation of Sources and Receivers. In each iteration (timeslot ) a set of leaves (different across timeslots) are chosen to act as sources and the remaining leaves act as receivers. Each source sends a distinct packet. The receiver stores the first packet it receives, and discards any subsequent packets (in the same iteration).
Operation of Intermediate Nodes. Every intermediate node operates in intervals of duration . If, within , the node receives a single probe from one of its neighbors, it multicasts the probe to all other neighbors. If, within , it receives more than one packet from different neighbors, it adds them and forwards the result to all remaining neighbors. In binary trees, this linear combination is simply XOR. In general trees, we need operations over higher fields.
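The forwarding rule above can be sketched in a few lines. This is a minimal illustration rather than the paper's implementation: the function name `process_window`, the representation of a probe as a tuple of coefficients, and the restriction to a prime field size `q` are all assumptions made for the example.

```python
def process_window(incoming, neighbors, q=2):
    """Decide what a node forwards at the end of one window.

    incoming  : {neighbor: packet} received during the window
    neighbors : all neighbors of the node
    q         : prime field size (assumption; q=2 makes addition an XOR)
    A packet is a tuple of coefficients, e.g. x1 = (1, 0), x2 = (0, 1).
    """
    if not incoming:
        return {}
    # Component-wise addition modulo q. With a single incoming packet
    # this is a plain copy, i.e. the node simply multicasts the probe.
    combined = tuple(sum(col) % q for col in zip(*incoming.values()))
    # Forward to every neighbor that did not send a packet in this window.
    return {n: combined for n in neighbors if n not in incoming}


X1, X2 = (1, 0), (0, 1)
single = process_window({"u": X1}, ["u", "v", "w"])
coded = process_window({"u": X1, "v": X2}, ["u", "v", "w"])
```

With one incoming probe the node copies it to the two other neighbors; with two incoming probes it forwards their sum to the remaining neighbor only.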
Summary of Results. In the rest of this section, we first consider binary trees, with or without packet loss. Then we extend our algorithms to -ary trees. For trees without loss, we design deterministic algorithms that infer the topology in iterations. For trees with loss, just one successfully received probe per network path is sufficient, without the need to collect packet loss statistics, a property that enables rapid discovery of the underlying topology.
5.1 Binary Trees
5.1.1 Lossless Binary Tree
Let us first consider the simplest case: an undirected binary tree without packet loss. The following example illustrates the main idea.
Consider the tree shown in Fig. a, with 7 leaves (1,2, …7) and 5 intermediate nodes (A,B,C,D,E). Assume that nodes act as
sources and send probes , respectively. All other leaves act as receivers. Intermediate node receives
and forwards it to leaf and to node .
Similarly, node receives and
forwards it to node (which in turn forwards it to leaves ) and to node . Probe packets and arrive at node , which adds them, creates the packet , and forwards to node , which in turn forwards it
to leaves .
At the end, leaf receives , leaves receive and leaves receive . Thus, the leaves of the tree can be partitioned into three sets: containing and the leaves that received , i.e., ; containing and the leaves that received ; and containing the leaves that received . From this information observed at the edge of the network, we can deduce that the binary tree has the structure depicted in Fig. b: three components, each seeing a different probe () flowing through it, and connected through three links to the middle node . This concludes the first experiment/iteration.
To infer the structure that connects leaves to node , we need a second experiment. We randomly choose two of these leaves, e.g., , to act as sources . Any probe packet leaving node will be multicast to all remaining leaves of the tree, i.e., nodes observe the same packet. One can think of node as a single “aggregate-receiver” that observes the common packet received at nodes . Following the same procedure as before, assuming that meet at node , nodes and receive . Using this additional information and the fact that the topology is a binary tree, we refine the inferred structure from Fig. b to Fig. c.
Algorithm 1 generalizes the previous example and can infer any binary tree topology. It starts by considering all the leaves . It calls SendTwoProbes and partitions into smaller sets , , . It proceeds by recursively calling SendTwoProbes within each set, until all edges are revealed.
Algorithm 1 terminates in at most iterations and exactly infers the topology of an undirected binary tree.
Consider a particular iteration (call of SendTwoProbes): sources and send exactly one probe packet each to all other leaves. Now consider the intermediate nodes on the path between the two sources. Depending on the link delays, there are two possibilities.
The first possibility is that and meet (arrive within the same ) at one of the intermediate nodes on , e.g., node . Node forwards their XOR to its third link, and the iteration reveals the neighboring edges and nodes to as depicted in Fig. 2(a). Another possibility is that and cross each other while traversing the same link of in opposite directions, i.e., they do not meet at a node. Even if a leaf node receives more than one probe, we design their operation so that they only keep the first one. In this case, we infer the configuration in Fig. 2(b) that reveals one edge.
In summary, the algorithm iteratively divides the binary tree into smaller components until one component has two or less leaves, in which case we know its structure. In each iteration, we reveal three edges or one edge. At the end, we have revealed all edges. Therefore, the algorithm requires between and iterations. ∎
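The two cases in the proof (probes meeting at a node versus crossing on a link) can be reproduced with a small simulation of a single SendTwoProbes call. The sketch below is illustrative only and rests on assumptions that are not part of the paper: every link has unit delay, all nodes share synchronized windows, each node acts only on the first window in which it receives something, and the example tree `TREE` is hypothetical (it is not the tree of the figure). Probes are encoded as bitmasks so that XOR is a plain integer operation.

```python
from collections import defaultdict

X1, X2 = 1, 2  # probes as bitmasks, so X1 ^ X2 == 3


def send_two_probes(adj, s1, s2):
    """Simulate one iteration on an undirected tree (unit link delays).

    adj maps each node to its list of neighbors. Returns the first
    packet observed by every leaf other than the two sources.
    """
    in_flight = [(s1, n, X1) for n in adj[s1]] + [(s2, n, X2) for n in adj[s2]]
    seen = {s1: X1, s2: X2}  # first packet handled by each node
    while in_flight:
        arrivals = defaultdict(dict)  # node -> {sender: packet}
        for sender, node, pkt in in_flight:
            if node not in seen:  # later packets are ignored
                arrivals[node][sender] = pkt
        in_flight = []
        for node, incoming in arrivals.items():
            out = 0
            for pkt in incoming.values():
                out ^= pkt  # XOR of everything received in this window
            seen[node] = out
            for nbr in adj[node]:
                if nbr not in incoming:  # forward to remaining neighbors
                    in_flight.append((node, nbr, out))
    leaves = [v for v in adj if len(adj[v]) == 1]
    return {v: seen[v] for v in leaves if v not in (s1, s2)}


def partition(observed):
    """Group receivers by the packet they observed."""
    groups = defaultdict(list)
    for leaf, pkt in observed.items():
        groups[pkt].append(leaf)
    return dict(groups)


# Hypothetical binary tree: leaves 1..5, intermediate nodes A, B, C.
TREE = {
    1: ["A"], 2: ["A"], 3: ["B"], 4: ["C"], 5: ["C"],
    "A": [1, 2, "B"], "B": ["A", 3, "C"], "C": ["B", 4, 5],
}

meet = partition(send_two_probes(TREE, 1, 4))   # probes meet at node B
cross = partition(send_two_probes(TREE, 1, 3))  # probes cross on link A-B
```

In the first call the path between the sources has an even number of links, so the probes meet at a node and the receivers split into three groups; in the second call the probes cross on a link and only two groups appear.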
Notes. In each iteration, every link is traversed exactly once by a probe. Link delays affect where the probes meet and thus what components are revealed in each iteration. However, they do not affect the correctness of the algorithm.
5.1.2 Lossy Binary Tree
Packet loss may cause confusion when dividing the receivers into components. One solution is to send multiple probes from the same two sources in each iteration, as we discuss next. However, given packet loss and delay variability, this may result in probes meeting at different nodes in the same iteration.
Intermediate Node Operation: Each intermediate node keeps a table of its neighbors. In each iteration, it marks these neighbors as source or sink neighbors. Once this marking is done, it does not change for the duration of the iteration. The first time during an iteration that an intermediate node receives a probe, it waits for a window to receive probes from other neighbors. After this time passes, the node marks all neighbors from which it received packets as sources and all other neighbors as sinks. For the remaining duration of the iteration, the node accepts packets only if they originate from its source neighbors. If the node receives a packet from one of its source neighbors, it forwards it to all its sink neighbors. If it receives more than one packet from different source neighbors, it linearly combines them, and forwards the result to its sink neighbors. The node rejects probes coming from sinks, and does not forward packets towards sources.
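The marking rule can be viewed as a small per-iteration state machine. The following sketch is a hedged illustration: the class name `IntermediateNode`, its method names, and the bitmask encoding of probes (which makes the linear combination an XOR, as in binary trees) are assumptions for the example, and the timing of the window is abstracted into one call per window.

```python
class IntermediateNode:
    """Source/sink marking at one intermediate node for one iteration."""

    def __init__(self, neighbors):
        self.neighbors = list(neighbors)
        self.sources = None  # fixed when the first window closes
        self.sinks = None

    def reset(self):
        """Forget the marking at the start of a new iteration."""
        self.sources = None
        self.sinks = None

    def close_window(self, incoming):
        """incoming: {neighbor: packet} received during one window.

        Returns {neighbor: packet} to forward.
        """
        if self.sources is None:
            if not incoming:
                return {}
            # First window with traffic: senders become sources,
            # every other neighbor becomes a sink.
            self.sources = set(incoming)
            self.sinks = [n for n in self.neighbors if n not in self.sources]
        # Packets arriving from sink neighbors are rejected.
        accepted = [p for n, p in incoming.items() if n in self.sources]
        if not accepted:
            return {}
        out = 0
        for pkt in accepted:
            out ^= pkt  # combine packets from different source neighbors
        # Never forward towards sources.
        return {n: out for n in self.sinks}


node = IntermediateNode(["a", "b", "c"])
first = node.close_window({"a": 1})     # marks a as source, b and c as sinks
rejected = node.close_window({"b": 2})  # b is a sink, so its packet is dropped
repeat = node.close_window({"a": 1})    # later probes from a still flow to sinks
```

Because the marking is frozen after the first window, repeated probes within an iteration always flow in the same direction through the node, regardless of which individual packets are lost.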
Performance: Alg. 2 has an associated probability of error, since a leaf might not receive the “correct” probe packet.
5.2 M-ary Trees
5.2.1 Full M-ary Trees
We first consider full -ary trees, where all intermediate nodes have degree , , without packet loss. Alg. 1 can still accurately infer the topology in less than iterations. However, we can modify the algorithm to infer the topology even faster. The idea is to keep the hierarchical clustering approach but increase the number of components revealed in each iteration, either (i) by changing the intermediate nodes so that they forward different linear combinations of incoming probes to different outgoing links; or (ii) by increasing the number of sources in each iteration.
Modification I: (two sources per iteration, coding points send different combinations to different links). When an intermediate node receives two incoming packets from two different neighbors, it deterministically generates different linear combinations, e.g., and forwards each resulting packet to a different neighbor. Therefore, when and meet at any intermediate node, the leaves of the network will be divided into components, depending on which probe packet they receive. If the probe packets do not meet at a node but cross each other, the leaves of the network will be divided into two components. Once a component has or less leaves, since we have a full -ary tree, we know its structure. Therefore, in each iteration, we reveal edges or one edge, and the total number of iterations is reduced to at least and at most iterations. Note that the operations are performed over in this case.
Modification II: (more than two sources per iteration, coding points send the same combination to all outgoing links). Alternatively, we can use up to sources (as per Lemma 5.2) per iteration. The sources send , respectively. When an intermediate node receives packets from different neighbors within , , it simply adds them up (over ) and forwards the result to all remaining neighbors. Depending on whether the node receives packets or only a single packet, the leaves of the network will be divided into or more components; i.e., in each iteration, we reveal or edges. Therefore, the algorithm requires at least and at most iterations.
The maximum number of sources that can be used to uniquely infer the topology of a full -ary tree is .
We show that if we use sources to infer the topology of a full -ary tree, it cannot be uniquely identified. For example, consider a binary tree with three sources sending , , and respectively, to all other leaves in the tree. Assume that the three probe packets meet at one intermediate node; thus, we divide the leaves into four components, which observe , , , and respectively. Since the degree of intermediate nodes is three, we conclude that two of the three sources must have joined at one intermediate node first, and then their result must have joined with the third source in another intermediate node, so that they result in in the last component. The first two sources can be either or . Therefore, we cannot uniquely infer the underlying binary tree topology by observing these four components. The same discussion applies to larger full -ary trees (). ∎
Note. In the presence of loss, the same argument as in Section 5.1.2 applies, i.e., we can assign directions to edges in each iteration, so that our algorithms are applicable to the lossy case as well.
5.2.2 General M-ary Trees
In general -ary trees, the degree of intermediate nodes varies from three up to a maximum of . We can still apply Alg. 1 and infer the tree topology in iterations. We can also apply Modification I described in Section 5.2.1; the operations are still performed over since probe packets may meet at an intermediate node of degree . However, we cannot apply Modification II here: since probe packets may meet at an intermediate node of degree three, we cannot use more than two sources according to Lemma 5.2, although there exist larger degree nodes in the tree.
6 Inferring Directed Acyclic Graphs (DAGs)
6.1 From a Single-Tree to Multiple-Tree Topologies
So far, we considered undirected trees. Let us now consider directed trees, which are a special case of DAGs.
Assume that we assign directions to the links of the binary tree in Fig. a, all from the top to the bottom. Clearly, we can no longer send probe packets in arbitrary directions in each iteration. However, we can still infer some information about the topology. Assume that we send probes from the source nodes and , and we observe at the receiver nodes , , and . Therefore, we identify three components , , and , together with the intermediate nodes and , and three edges , , and , which connect the three components together. However, we cannot obtain more information about the internal structure of the component or any other part of the tree network.
Next, consider a 2-by-2 network as defined in Section 3, i.e., a directed acyclic graph with two sources, two receivers and predetermined routing. Note that directed trees are only one type among all four possible types of the basic 2-by-2 components of any multiple-tree network, as defined in Section 3. There exist four 2-by-2 topologies, as shown in Fig. 3, which were first defined in (37); (38). Following the same terminology as in (37); (38), we refer to Fig. 3(a), (b), (c) and (d) as type 1, 2, 3 and 4, respectively. Type 1 is called shared (37); (38) since the joining points for both receivers coincide () and the branching points for both sources coincide (). The other three types (types 2, 3 and 4) are called non-shared since they have two distinct joining points and two distinct branching points.
In a directed tree, all 2-by-2 components are of type 1. However, in a general M-by-N topology, several different 2-by-2 types may co-exist. The algorithms described so far can identify type 1 2-by-2 topologies, and thus, trees (either completely or partially, as described above). However, they cannot distinguish between type 1 and type 4 2-by-2’s, as described in the following example.
Consider Fig. 3 (a) and (d). Assume that in both cases, we send from to and that meet (arrive within the same ) at any joining point. Therefore, in both type 1 and type 4, both receivers observe , and we cannot distinguish between the two types.
In general, unlike single-tree networks, the observations do not uniquely characterize the underlying topology in multiple-tree networks. The reason is that once two sources in a tree network transmit their probe packets, they at most meet at one coding point for all the receivers, as we saw in Section 5. On the other hand, in a multiple-tree network, probe packets may meet at different coding points for different receivers, as depicted in Fig. 4. Therefore, we need a different approach.
Problem Statement. Our goal in this section is to infer a multiple-tree topology, or an “M-by-N” topology according to the terminology of Section 3. Similarly to (37), we take two steps. In the first step (Section 6.2), we use several experiments and we exactly identify the type of every 2-by-2 component. In the second step (Section 6.3), we merge these 2-by-2 subnetwork components to reconstruct the M-by-N network.
Operation of Sources. Pairs of sources are selected and send up to coordinated multicast packets to all receivers. As in the general setup, probes are spaced apart by intervals of length . In addition, we introduce a difference in the sending time of the two sources, which we call the offset . W.l.o.g., let send first and second.
The timing parameters are coarsely tuned so as to create observations that can distinguish among different 2-by-2 types. In particular, (i) ensures that only probes within the same experiment are coded together. To be more precise, we choose , where is the maximum number of joining points on any path in the topology. In the worst case, there can be joining points in a row and thus, . However, in practice, is usually a lot smaller. (ii) path delay (between the sources and the joining points) ensures that source packets meet at the joining points despite link delays. (iii) is selected randomly in each iteration, so that it forces probes to meet at different points, or not meet at all, in different iterations. Finally, coarse selection of with rough estimates of upper bounds on link and path delays is sufficient.
Operation of Receivers. For a given 2-by-2 subnetwork, let the observations at the two receivers be , . Based on these observations, we design Inference algorithms that identify the 2-by-2 type (in Section 6.2) and Merging algorithms that build the M-by-N from the 2-by-2’s (in Section 6.3).
Operation of Intermediate Nodes. In DAGs, the operation of an intermediate node, depending on whether it acts as a joining point or a branching point, is summarized in Alg. 3 and Alg. 4, respectively. A joining point (J) adds and forwards packets, while a branching point (B) forwards the single received packet to all “interested” links downstream. A link is “interested” in the routing sense if it is the next hop for at least one source packet in the network coded packet.
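The two node behaviors described above can be sketched as follows. This is a minimal illustration and not a transcription of Alg. 3 or Alg. 4: the packet representation (a tuple of per-source coefficients), the default field order `q = 3`, and the names `join`, `branch` and `next_hops` are all assumptions introduced here.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical representation: packet[i] is the coefficient of source i's probe.
Packet = Tuple[int, ...]


def join(packets: Iterable[Packet], q: int = 3) -> Packet:
    """Joining point: add all packets that arrived in the same window, mod q.

    q is an assumed prime field order, not a value taken from the text.
    """
    packets = list(packets)
    if not packets:
        raise ValueError("a joining point needs at least one incoming packet")
    return tuple(sum(coeffs) % q for coeffs in zip(*packets))


def branch(packet: Packet, next_hops: Dict[str, Set[int]]) -> List[str]:
    """Branching point: return the outgoing links that are "interested".

    next_hops maps a link name to the set of source indices routed over it.
    A link is interested if it is the next hop for at least one source whose
    probe is present (non-zero coefficient) in the coded packet.
    """
    present = {i for i, c in enumerate(packet) if c != 0}
    return sorted(link for link, sources in next_hops.items() if sources & present)
```

For instance, under these assumptions `join([(1, 0), (0, 1)])` yields the coded packet `(1, 1)`, and a branching point with `next_hops = {"a": {0}, "b": {1}}` forwards that packet on both links.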
6.2 Identifying 2-by-2 Components
In this section, we propose an approach to exactly identify a 2-by-2 component, using the same intuition as in trees, i.e., coding operations result in observations that can uniquely characterize the underlying 2-by-2. Our approach builds on (37) and improves over it by uniquely distinguishing among all four 2-by-2 types, while (37) could only distinguish between shared and non-shared types.
First, we provide an algorithm to identify the type of a 2-by-2 component without packet loss. In the first experiment, sources multicast probe packets to . We begin with the assumption that act simultaneously, or in practice within the synchronization offset. A choice of large guarantees that meet at both joining points , which add the incoming probes over . Depending on the underlying 2-by-2 type, observe one of the following pairs:
type 1: : , :
type 2: : , :
type 3: : , :
type 4: : , :
Types 2 and 3 result in unique observations that make them distinguishable from any other type; i.e., one such observation suffices to identify type 2 or type 3. However, types 1 and 4 result in the same pair of observations; therefore, we need to design different experiments to get observations that can uniquely characterize type 1 or type 4.
In the next experiment, we exploit the observation, first made in (37), that type 1 is the only 2-by-2 where the two joining points coincide (). Therefore, the observations at the two receivers are always the same: either when the two packets meet at ; or a single packet ( or ) when the two packets do not meet at . In contrast, type 4 has two different joining points . If we force packets to meet only at one of the joining points but not at the other one, the receivers will have different observations. These are observations #3 and #4 in Table 1 and they can uniquely characterize type 4.
These observations can be achieved by appropriately selecting the offset in the sources’ sending times. needs to be large enough so that after addition to the link delays, it can affect : if represent the delays on the paths from to , respectively, must be in
Table 1: Observations at the two receivers for type 1 and type 4 2-by-2 topologies (columns: Observation, Type 1, Type 4).
Alg. 5 summarizes the experiments we perform in order to infer the type of a 2-by-2 network. Types 2 and 3 are identified in the first observation. Type 4 is identified the first time that the two receivers see different observations. If after trials, we still have not seen any different observations at the two receivers, then we declare the 2-by-2 to be of type 1.
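The decision rule just described can be sketched as follows. This is an illustration under stated assumptions rather than Alg. 5 itself: observations are modeled as pairs of opaque values (one per receiver), and the observation pairs that are unique to types 2 and 3 are not hard-coded but passed in by the caller through the hypothetical argument `unique_signatures`.

```python
from typing import Dict, Hashable, Iterable, Tuple

ObsPair = Tuple[Hashable, Hashable]  # (observation at receiver 1, at receiver 2)


def infer_type_lossless(
    observations: Iterable[ObsPair],
    unique_signatures: Dict[ObsPair, int],
    max_trials: int,
) -> int:
    """Return the inferred 2-by-2 type (1, 2, 3 or 4) in the lossless case.

    observations      : one pair per experiment, the first taken without offset.
    unique_signatures : assumed lookup from an observation pair to type 2 or 3.
    max_trials        : number of experiments after which type 1 is declared.
    """
    trials = list(observations)[:max_trials]
    if not trials:
        raise ValueError("at least one experiment is required")

    # Types 2 and 3 are recognized from the first observation alone.
    if trials[0] in unique_signatures:
        return unique_signatures[trials[0]]

    # Type 4 is declared the first time the two receivers disagree.
    for obs_r1, obs_r2 in trials:
        if obs_r1 != obs_r2:
            return 4

    # No disagreement within max_trials experiments: declare type 1.
    return 1
```

The only way this rule can err is by returning 1 for a type 4 component whose receivers happened to agree in every trial, which is why the number of trials matters.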
Choosing . should be large enough to ensure small probability of error. The probability of error of Alg. 5 can be computed as follows. Let indicate whether the two observations are the same or not; it is a Bernoulli random variable with success probability . The number of required experiments is a geometric random variable. The only possible error is to mistakenly declare type 4 as type 1, which happens with probability:
In type 4, occurs when , i.e., with probability . Thus, Alg. 5 identifies any 2-by-2 topology in experiments with the following error probability:
We can then find by replacing the appropriate values (40). One can calculate that in order to ensure an accuracy of in distinguishing between types 1 and 4 2-by-2’s, needs to be . However, this is a pessimistic upper bound: simulation results in Section 7 show that a much smaller is sufficient in practice.
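The relation between the number of experiments and the error probability can be illustrated numerically. The sketch below assumes independent experiments, each of which yields different observations at the two receivers of a type 4 component with the same probability; the names `p_differ`, `k` and `eps` are introduced here for illustration and are not the paper's notation. Under that assumption, a type 4 component is mistaken for type 1 only if all `k` experiments fail to produce different observations.

```python
import math


def miss_probability(p_differ: float, k: int) -> float:
    """Probability that k independent experiments all show equal observations."""
    if not 0.0 <= p_differ <= 1.0:
        raise ValueError("p_differ must be a probability")
    if k < 0:
        raise ValueError("k must be non-negative")
    return (1.0 - p_differ) ** k


def trials_needed(p_differ: float, eps: float) -> int:
    """Smallest k with miss_probability(p_differ, k) <= eps."""
    if not 0.0 < p_differ <= 1.0:
        raise ValueError("p_differ must be in (0, 1]")
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must be in (0, 1)")
    if p_differ == 1.0:
        return 1  # a single experiment always reveals the difference
    return math.ceil(math.log(eps) / math.log(1.0 - p_differ))
```

For example, with an assumed `p_differ = 0.5` and a target `eps = 0.01`, seven experiments suffice, since 0.5 to the power 7 is below 0.01 while 0.5 to the power 6 is not.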
Let us now consider a 2-by-2 network where packets may be lost on some links. In this case, we can no longer guarantee meetings of and at the joining points and predictable observations at the receivers. There are two differences from the lossless case. First, because of random packet loss, each experiment might result in different outcomes, shown in Table 2. Second, there are common observations across all four types, as opposed to just between types 1 and 4. We divide the observations in Table 2 into three groups: (i) at least one of the receivers does not receive any packet (“-”) due to loss, (ii) both receivers have the same observation , and (iii) the two receivers have different observations .
Table 2: Possible observations at the two receivers in the presence of packet loss (columns: Type 1, Type 2, Type 3, Type 4).
We choose to ignore the observations of group (i) because they can occur in any of the four 2-by-2 types and thus do not help to distinguish among 2-by-2’s in the deterministic way adopted in this paper. Observations of group (ii) can also be the result of any 2-by-2 type: unlike the lossless case, where is unique to type 1 or 4 topologies, any of the four topologies may result in such observations if some packets are lost. We observe that group (ii) observations are the only possibility for a type 1 topology, apart from the group (i) observations that we ignore, while all other three 2-by-2 types may result in either or . Therefore, if after trials, we only have observations of group (ii), we declare the topology to be type 1.
In observations of group (iii), it is , which means that and/or . An important observation is that the difference of the coefficients between the two receivers contains topology-related information. W.l.o.g., we focus on the coefficient of and look at the difference . Table 2 shows that can only occur in type 2 or type 4 topologies; while can only occur in a type 3 or 4 topology. Note that the coefficient is larger on one side (e.g., ) when the probe () goes through two joining points on its way to one receiver (in this case, ) and through one joining point on its way to the other receiver (). By performing several independent experiments and collecting several observations of group (iii), we can distinguish among the candidate topologies. If after experiments, there are only observations of group (ii) or (iii) with , we declare the topology as type 2. If there are only observations of group (ii) or (iii) with , we declare it as type 3. If there are observations of group (ii) or (iii) with both and , we declare it as type 4.
In our experiments, we try to create those observations that reveal the topology. These can occur either naturally, as the result of packet loss, or artificially, when we introduce an offset in ’s sending time with respect to . To help these observations occur, especially for small loss rates, and similarly to the lossless case, we use a random offset . To make these experiments independent, we space apart successive sets of probes by roughly selecting , which is sufficient since there are at most two joining points on any path in a 2-by-2.
Alg. 6 summarizes the 2-by-2 inference for lossy networks. The algorithm is simple and follows a deterministic approach: one observation, or a set of observations, is sufficient to uniquely distinguish among types. For example, at least one observation of group (iii) rules out the type 1 topology; a pair of group (iii) observations with both and indicates type 4; etc. As a result, we require fewer experiments than the thousands of arrival-order measurements required by (37); (38) for statistical significance. In addition, and more importantly, we identify the exact 2-by-2 type, while (37) was only able to distinguish between shared and non-shared types. The following lemma describes the probability of error of Alg. 6 with respect to the number of experiments () more precisely.
Alg. 6 identifies any 2-by-2 topology with in experiments, where , , , is the link loss rate (same for all links), and is the probability that probe packet arrives within at in a type 4 topology, i.e., , .
The proof is provided in Appendix B.
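The deterministic rule behind Alg. 6 can be sketched as follows. This is an illustration under assumptions and not the algorithm as published: each observation is modeled as a tuple of integer coefficients (first entry for the first source's probe), a lost packet is modeled as `None`, and the mapping from the sign of the coefficient difference to type 2 or type 3 is a hypothetical convention exposed as the parameter `positive_type`. Group (iii) observations whose first coefficients are equal are treated as uninformative here.

```python
from typing import Iterable, Optional, Tuple

Obs = Optional[Tuple[int, int]]  # (coeff of source 1, coeff of source 2) or None


def classify_lossy(
    observations: Iterable[Tuple[Obs, Obs]], positive_type: int = 2
) -> Optional[int]:
    """Return the inferred type, or None if group (iii) was seen but undecided.

    positive_type is an assumed convention: the type (2 or 3) declared when
    receiver 1 always shows the larger coefficient for source 1's probe.
    """
    if positive_type not in (2, 3):
        raise ValueError("positive_type must be 2 or 3")
    negative_type = 5 - positive_type  # the other of {2, 3}

    saw_group_iii = saw_positive = saw_negative = False
    for obs_r1, obs_r2 in observations:
        if obs_r1 is None or obs_r2 is None:
            continue  # group (i): a receiver got nothing, ignored
        if obs_r1 == obs_r2:
            continue  # group (ii): identical observations
        saw_group_iii = True  # group (iii): observations differ
        diff = obs_r1[0] - obs_r2[0]
        saw_positive |= diff > 0
        saw_negative |= diff < 0

    if not saw_group_iii:
        return 1
    if saw_positive and saw_negative:
        return 4
    if saw_positive:
        return positive_type
    if saw_negative:
        return negative_type
    return None  # differing observations, but no usable coefficient difference
```

Returning `None` in the last case is a design choice of this sketch: a group (iii) observation already rules out type 1, so declaring type 1 there would contradict the rule stated above.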
Inferring all 2-by-2’s in a 2-by-N Network
Algorithms 5 and 6 can be directly applied to a 2-by-N network, where two sources multicast to receivers. A difference is that intermediate nodes need to perform addition over a larger finite field, of order larger than the maximum number of joining points on a path (g), since a packet may meet itself at all the joining points on the path. In the worst case, there can be joining points in a row and thus, the maximum required field size is the first prime greater than N. Algorithm 5 and Algorithm 6 can be performed on any pair of receivers among all possible pairs. The same set of 2-by-N probes can be used to infer, in parallel and independently, the type of all 2-by-2 topologies. This reduces the number of probes, as we re-use them, instead of sending different sets of probes. The 2-by-N structure is important for the merging algorithm in Section 6.3.
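Two quantities from this paragraph are easy to compute: the first prime greater than a given bound, and the list of receiver pairs on which the 2-by-2 inference is run. The sketch below is illustrative only; the function names are introduced here, and trial division is chosen for brevity since the bound is on the order of the number of receivers.

```python
from itertools import combinations
from typing import Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def first_prime_greater_than(n: int) -> int:
    """Smallest prime strictly greater than n."""
    candidate = n + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate


def receiver_pairs(receivers: Iterable[T]) -> List[Tuple[T, T]]:
    """All unordered pairs of receivers; each pair defines one 2-by-2."""
    return list(combinations(receivers, 2))
```

For a 2-by-9 network, for example, this gives a bound of 11 on the field order and 36 receiver pairs, all of which are inferred from the same set of probes.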
2-by-2’s vs. other Subnetwork Components
We now discuss why we choose to decompose an M-by-N network into 2-by-2 subnetwork components, as opposed to any other subnetwork structure , :
1-by-1: This is the smallest component and corresponds to measuring a single end-to-end path. However, it reveals neither joining nor branching points.
1-by-2 and 2-by-1: These correspond to a 2-leaf multicast or a reverse-multicast tree, respectively. The 2-by-1 consists of 2 sources, one coding point, and 1 receiver. The 2-by-1 cannot identify the branching points while the 1-by-2 cannot identify the joining points. Similar comments apply to M-by-1 and 1-by-N.
2-by-2: This is the smallest structure that gives information about the relative locations of joining and branching points.
m-by-n, with : If we consider larger structures, there is an exponentially larger number of possible types, which requires more complicated inference algorithms. For example, there exist 19 possible types for a 2-by-3 structure.
M-by-N: In the extreme case, we need to enumerate all possible M-by-N topologies as in (43).
The larger the subnetwork component we use as a building block, the fewer components we need to infer and the simpler the merging algorithm. However, as the size of the basic component grows, the number of possible types increases exponentially and the inference step becomes increasingly complex. In this paper, we choose to decompose an M-by-N into 2-by-2 components, inspired by the approach in (37). We note that 2-by-2 is the minimum size building block required to infer both joining and branching points, and that it strikes a good tradeoff between inference and merging complexity.
6.3 Merging Algorithm
Assuming knowledge of all 2-by-2 subnetwork components, from Section 6.2, we now merge them together to reconstruct the M-by-N network. We study merging in two different scenarios: (i) when a 1-by-N tree topology is known, which is the same problem studied in (37); and (ii) without knowledge of any 1-by-N, which is new to our work. Exploiting the accurately identified 2-by-2’s, we can solve (i) exactly, which was previously solved only approximately; and also solve (ii), for which no solution was previously known.
More precisely, our merging algorithm can identify every joining point, in the sense that it can localize it between two branching points. However, note that when there are several joining points in a row, without any branching point in between, it is not possible to identify the relative locations of these joining points with respect to each other. In fact, this is the case in a tree topology.
Merging a 1-by-N and 2-by-2’s into a 2-by-N
This 1-by-N is a tree rooted at and contains only branching points. We also assume that the 2-by-2’s between , a new source , and any pair of receivers are known, using the algorithms of Section 6.2. Our goal is to locate the joining points where paths from to the same receivers join ’s topology. We use the assumptions of Section 3 for routing.
This problem was posed in (37); (14) and was solved there in an approximate way. Bounds on the joining point locations in the topology were provided within a sequence of consecutive logical links. This was because the 2-by-2’s are only identified as shared or non-shared types in (37); (38).
In contrast, we design Algorithm 7, which localizes each joining point for each receiver to a single logical link, between two branching points in the topology. Our algorithm is simpler, faster, and more accurate: it can identify all joining points for any topology and with lower complexity, thanks to our complete knowledge of the 2-by-2 types.
Fig. a depicts a 2-by-9 topology constructed based on the Abilene network (52). Consider : it forms a type 1 2-by-2 with . Therefore, must lie above , so that there exists a unique path from each source to . We then need to localize with respect to : , form a 2-by-2 of type 4; thus, must lie below . is now localized to one link (between and ), and the algorithm ends here for . Other receivers are considered similarly. Note that a joining point can be placed on any link from the receiver to . Therefore, the number of steps required to localize a joining point is at most equal to the height of the tree. Also, when there is a group of receivers within which all pairs are of type 1, the algorithm is run only once and it assigns the same joining point to all of them. For this example, the algorithm in (37) cannot completely resolve all joining points, and provides bounds within a sequence of several logical links instead.
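The walk-up reasoning in this example can be sketched as follows. This is a simplified illustration and not Algorithm 7: it covers only the two cases used in the example (type 1 places the joining point above a branching point, type 4 places it below), and it deliberately refuses to guess for types 2 and 3. The names `ancestors`, `type_at` and `root` are hypothetical.

```python
from typing import Callable, Hashable, Sequence, Tuple


def localize_joining_point(
    receiver: Hashable,
    ancestors: Sequence[Hashable],
    type_at: Callable[[Hashable], int],
    root: Hashable,
) -> Tuple[Hashable, Hashable]:
    """Return (lower, upper): the joining point lies on the link between them.

    ancestors : branching points on the path from the receiver up to the root
                of the known 1-by-N tree, nearest first.
    type_at   : assumed oracle giving the 2-by-2 type between this receiver
                and a receiver in another subtree of the given branching point.
    root      : the source of the known 1-by-N tree.
    """
    lower = receiver
    for branching_point in ancestors:
        pair_type = type_at(branching_point)
        if pair_type == 1:
            # Shared joining point: it must lie above this branching point.
            lower = branching_point
        elif pair_type == 4:
            # Distinct joining points: this one lies below the branching point.
            return (lower, branching_point)
        else:
            raise NotImplementedError("types 2 and 3 are not handled in this sketch")
    # Every pair was type 1: the joining point is above the top branching point.
    return (lower, root)
```

The loop visits each branching point on the receiver's path at most once, which matches the observation that the number of steps is bounded by the height of the tree.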
Merging 2-by-2’s into a 2-by-N
In this section, we infer a 2-by-N without prior knowledge of any 1-by-N. Inference under this relaxed assumption is enabled by our exact knowledge of 2-by-2’s and was not possible before (37); (14). We first send probes over the 2-by-N and then merge all 2-by-2’s, as described next.
We first consider all shared (type 1) 2-by-2 components and assign them the minimum number of branching and joining points required. For example in Fig. a, and are identified in this step. Second, we consider all non-shared 2-by-2 topologies (of type 2, 3, or 4). We use the information about the locations of the branching and joining points in each type to: (i) add the minimum number of branching points required to the ones already identified from the shared pairs; and (ii) assign joining points to those receivers that have not been already assigned one. In the example of Fig. a, an additional branching point is required, which is connected to both joining points and , to satisfy the 2-by-2’s of type 4 between the two shared groups. No additional joining point is required in this example.
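The first step, collecting receivers into shared groups, can be sketched with a union-find structure. This is an illustration under an assumption that the text does not state explicitly: that the type 1 relation is transitive, so that receivers linked by a chain of type 1 pairs share one joining point. The function name and the `pair_types` dictionary are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def shared_groups(
    receivers: Iterable[Hashable],
    pair_types: Dict[Tuple[Hashable, Hashable], int],
) -> List[List[Hashable]]:
    """Group receivers connected by type 1 (shared) 2-by-2 components.

    pair_types maps an unordered receiver pair to its inferred type (1 to 4).
    Each returned group would be assigned a single joining point.
    """
    receivers = list(receivers)
    parent = {r: r for r in receivers}

    def find(r: Hashable) -> Hashable:
        # Path-halving find.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for (a, b), pair_type in pair_types.items():
        if pair_type == 1:
            parent[find(a)] = find(b)

    groups: Dict[Hashable, List[Hashable]] = {}
    for r in receivers:
        groups.setdefault(find(r), []).append(r)
    return sorted(sorted(group) for group in groups.values())
```

The non-shared pairs that remain between groups are then the ones that determine where additional branching points are needed.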
This approach identifies the locations of all joining points, between the and 1-by-N topologies, but it does not identify all the branching points in the tree topology. Only the “minimum” topology is identified, i.e., the tree made by the “necessary” branching points. We define as “necessary” branching points the ones located below a joining point of and in the 2-by-N. An “unnecessary” branching point is the child of another branching point with no joining point in between. For example in Fig. a, this approach does not identify , and directly connects their children () to the upstream branching point ().
Note that the worst case input for this approach is a tree network. Since all 2-by-2’s are of type 1, and the algorithm cannot reconstruct branching points in a row, it can only identify the top-most branching point of the entire tree structure.
From 2-by-N to M-by-N
We can directly extend the 2-by-N inference techniques to the M-by-N case (14). We start from a 2-by-N topology, and add one source at a time, to connect the 1-by-N’s of the remaining sources. Assume that we have constructed a k-by-N topology, . To add the source, we perform experiments, where in each experiment a different one of the sources and the source send and . We then glue these topologies together by following the topological rules of Section 6.3.1 (with single-source trees given) or Section 6.3.2 (without that assumption).
Complexity of Merging
If one source’s 1-by-N tree topology is given, the minimum number of 2-by-2’s required by any merging algorithm to uniquely localize all the joining points (between two branching points) in the 2-by-N topology is .
One can think of checking the types of the 2-by-2 components in the following sense: we divide the receivers in the network into two sets of vertices, in a bipartite graph, and we draw an edge between any two vertices for which we check the 2-by-2 type. The minimum number of required 2-by-2’s is then given by a perfect matching in this bipartite graph; therefore, it is . ∎
Fig. 5 shows two 2-by-N topologies that require exactly 2-by-2’s for their joining points to be uniquely identified by any merging algorithm. In Fig. 5(a), checking the types of and is sufficient for localizing all four joining points. In Fig. 5(b), where all the joining points are the same as , checking the types of and would be sufficient.
Note on Lemma 6.2: If the 2-by-2’s are properly selected, of them can be sufficient in some topologies, as we see in the examples of Fig. 5. Unfortunately, we do not know in advance (without knowledge of the 2-by-N topology) which 2-by-2’s to choose out of all possible 2-by-2’s, so as to uniquely localize the joining points between branching points. Nevertheless, from the given 1-by-N topology, we can give an upper bound on the number of 2-by-2’s required. Since every receiver is checked with other receivers that are children of its upper branching points, up to the location of its joining point, we need to check for 2-by-2’s. This is less than identifying all 2-by-2’s. Note that we still need to multicast to all receivers and monitor all observations, but we can use only the observations of the selected 2-by-2’s for inference, and ignore the rest.
Algorithm 7 takes at most steps.
As mentioned in the note above, Algorithm 7 considers every single receiver and checks the 2-by-2 type of that receiver with other receivers that are children of its upper branching points, up to the location of its joining point. Therefore, it takes at most steps. ∎
This is an improvement over