Active Topology Inference using Network Coding
Abstract
Our goal is to infer the topology of a network when (i) we can send probes between sources and receivers at the edge of the network and (ii) intermediate nodes can perform simple network coding operations, i.e., additions. Our key intuition is that network coding introduces topology-dependent correlation in the observations at the receivers, which can be exploited to infer the topology. For undirected tree topologies, we design hierarchical clustering algorithms, building on our prior work in (25). For directed acyclic graphs (DAGs), we first decompose the topology into a number of two-source, two-receiver (2-by-2) subnetwork components and then merge these components to reconstruct the topology. Our approach for DAGs builds on prior work on tomography (37), and improves upon it by employing network coding to accurately distinguish among all different 2-by-2 components. We evaluate our algorithms through simulation of a number of realistic topologies and compare them to active tomographic techniques without network coding. We also make connections between our approach and alternatives, including passive inference, traceroute, and packet marking.
keywords:
Network Tomography, Network Coding, Topology Inference
1 Introduction
Knowledge of network topology is important for network management, diagnosis, operation, security and performance optimization. Depending on the context, one may be interested in the topology at different layers, such as the Internet’s router-level topology, an overlay network topology, the topology of an ad-hoc wireless network, etc.
There is a large body of prior work on measurements and inference of network topology. One family of techniques assumes the cooperation of nodes in the middle of the network, and uses traceroute measurements to collect the IDs of nodes along paths, from which the topology is reconstructed. Another family of techniques, referred to as network tomography, assumes no cooperation from internal nodes and relies on end-to-end probes to infer internal network characteristics, including topology. More specifically, multicast or unicast probes are sent/received between sets of sources/receivers at the edge of the network, and the topology is inferred based on the number and order of received probes.
In this paper, we revisit the problem of topology inference using end-to-end probes, in networks where intermediate nodes are equipped with simple network coding capabilities. We show how to exploit these capabilities in order to perform active topology inference in a more accurate and efficient way than existing tomographic techniques.
Our key intuition is that network coding introduces topology-dependent correlation in the content of packets observed at the receivers, which can then be exploited to reverse-engineer the topology. For example, a coding point (which combines multiple incoming packets into one or more outgoing packets) introduces correlation between packets coming from different sources, in much the same way that multicast introduces correlation in the packets sent by the same source and observed by several receivers. In fact, the correlation introduced by multicast has been the starting point and the main idea underlying tomographic topology inference. Subsequent schemes made this idea more practical, by emulating multicast with back-to-back unicast probes (11); (38). In contrast, relating probes from different sources to reveal intermediate nodes, also referred to as multiple-source tomography, has been a challenge (4); (38); (37). Using simple network coding operations at coding points solves this problem and allows accurate and fast topology inference.
Our approach is general and can be applied to infer the topology in a range of scenarios, including but not limited to wireless multi-hop networks. Wireless multi-hop networks are able to support simple network coding operations (additions are sufficient for our schemes), as demonstrated in (33), and can therefore benefit from our techniques. Furthermore, there is a good match between some properties and constraints of such networks and our schemes. First, there is natural variability in the delay of wireless links, which (if appropriately used, as explained in later sections) can expedite inference. Second, our schemes keep internal nodes simple (moving processing for inference to dedicated nodes at the edge) and anonymous (revealing the logical topology but not the identities of nodes). Finally, improving the speed of inference may prove important to keep up with changes, e.g., due to mobility.
Our contributions are as follows. First, we consider undirected trees, where leaves can act as sources or receivers of probes, and we design hierarchical clustering algorithms that infer the topology, building on our prior work in (25). Then, we consider directed acyclic graphs (DAGs) with a fixed set of M sources and N receivers and a predetermined routing scheme. We first decompose the topology into a number of two-source, two-receiver subnetwork components and then we merge these components to reconstruct the topology. Our approach for DAGs builds on prior work on tomography (37), and improves upon it by employing simple network coding operations at intermediate nodes to deterministically distinguish among all possible 2-by-2 subnetwork components, which was impossible without network coding (37); (38). We evaluate our algorithms through simulation over a number of topologies and we show that they can infer the topology accurately and faster than tomographic approaches without network coding. We present our schemes as active probing: special probes are sent by the sources, specifically for the purpose of inference, and are treated in special ways by intermediate nodes and eventually received by the receivers and processed at a fusion center. We believe that our active probing approach with network coding provides one more building block in the already large space of topology inference techniques, whose core strength is the ability to identify joining points. We also compare and make connections between our active probing approach and alternatives, such as passive inference, traceroute, and packet marking.
The rest of the paper is organized as follows. Section 2 discusses related work. Section 3 presents our assumptions, notation, and problem statement. Section 4 summarizes the main results of the paper. Section 5 presents algorithms for inferring tree topologies. Binary trees are discussed in Section 5.1, in the absence (Section 5.1.1) or presence (Section 5.1.2) of packet loss. General trees are discussed in Sections 5.2.1 and 5.2.2. Section 6 presents algorithms for inferring directed acyclic graphs (DAGs). Section 6.2 presents algorithms for inferring 2-by-2 subnetwork components, in the absence (6.2.1) or presence (6.2.2) of packet loss. Section 6.3 explains how to merge these components to reconstruct the topology. Section 7 provides simulation results for some realistic topologies. Section 8 discusses two possible deployment scenarios (one as an active probing scheme and another one using packet marking), and makes connections between our approach and alternative topology inference approaches. Section 9 concludes the paper. Appendices A and B analyze the probability of error of our inference algorithms in trees and DAGs, respectively.
2 Related Work
One body of related work is network tomography in general, and topology inference in particular. A good survey of network tomography can be found in (7). An early work on topology inference using end-to-end measurements is (39), where the correlation between end-to-end multicast packet loss patterns was used to infer the topology of binary trees. The correctness of this idea was rigorously established in (20), and was extended to general trees and to measurements other than loss, such as delay variance (21), or more generally any metric that grows monotonically with the number of traversed links. The idea has also been extended to unicast probes (11); (38). In summary, tomographic schemes for topology inference use end-to-end active probing and feed the number, order, or a monotonic property of received probes as input to statistical signal-processing techniques. Inference of link characteristics (6) can also be combined with topology inference (38). Similar problems have been studied in the different context of phylogenetic trees (23). The work in (2) uses such algorithms (23) for topology inference in sparse random graphs.
In addition, inference of congested links has been studied from the angles of compressive sensing (49); (12); (13) and group testing (8); (17); (36). The work in (8) formulates the problem as graph-constrained group testing, where the items correspond to edges, some of them being defective, and the goal is to identify the defective edges given that the test matrix conforms to constraints imposed by the graph, e.g., the path connectivities. The work in (49) recovers sparse vectors, representing certain parameters of the links over the graph, through minimization. It improves the number of required measurements over (8), as compressive sensing allows real numbers for the link characteristics and measurements instead of true/false binary values in group testing problems.
Most tomographic approaches rely on probes sent from a single source in a tree topology (39); (3); (11); (18); (45); (19); (20); (22); (5); (21); (24). Rabbat et al. (37); (38); (14) introduced the multiple-source multiple-destination (M-by-N) tomography problem, by sending probes between sources and receivers. In (37); (38), it was shown that an M-by-N network can be decomposed into a collection of 2-by-2 components. Then, coordinated transmission of multi-packet probes from two sources and packet arrival order measurements at the two receivers were used to infer some information about the 2-by-2 topology. Assuming knowledge of the M 1-by-N topologies and all 2-by-2 components, it was also shown how to merge a second source’s 1-by-N tree topology with the first one. The resulting M-by-N topology is not exact, but bounds are provided on the locations of joining points with respect to the branching points. This approach also requires a large number of probes, as do all approaches that need to collect enough probes for statistical significance (21); (18); (22); (11); (47). Our work on DAGs builds on and extends the multiple-source multiple-destination work in (37); (38), but uses network coding to achieve exact and fast topology inference.
A second body of related work is from the network coding literature. It is well known that linear network coding makes a network behave as a linear system, whose transfer function depends on the topology. Based on the source packets and the observations at the receivers, one can then try to passively infer the topology. The following papers consider that random linear network coding is employed for the purpose of information transfer, and they perform passive inference on the side. In (29), passive techniques are used to distinguish among failure patterns. In (32); (31); (43); (50), subspace properties at various nodes are used for topology inference and error localization. In (43), each node passively infers its upstream topology at no cost to throughput, but at high complexity.
In contrast, we propose active probing and a simple coding scheme at intermediate nodes, to achieve low-complexity topology inference at the end nodes. In Section 8, we provide a detailed comparison and make connections between active and passive topology inference. In (26); (35), we revisited link-loss (but not topology) tomography using active probing and network coding. In the first part of this paper, we extend our preliminary work in (25), where we showed that active probes from two sources and XOR at intermediate nodes are sufficient to infer binary tree topologies. This approach generalizes to trees, but not to general graphs. In (41), we used a different approach for general graphs, which builds on (37); (38): we identify 2-by-2 components and merge them together in an M-by-N topology. This journal paper combines and extends our preliminary work in (25); (41).
A practical approach for inferring the network topology is based on traceroute (27); (9); (46); (51); (10); (15); (30); (16); (48); (44). Multiple traceroutes are sent among monitoring hosts, recording node IDs along paths, and this information is put together to reconstruct the graph. The traceroute-based approach is discussed in detail in Section 8.5.
Wireless sensor networks and information fusion are considered in (28); (34). Information is collected at sensor nodes and is forwarded towards a fusion center, following a known reverse tree topology. Information is aggregated (28) or network coded (34) at intermediate nodes, and the loss rates of links are inferred based on the observations at the fusion center. In contrast, we are interested in inferring the topology of DAGs.
3 Problem Statement
3.1 Model
Assumptions about the Network. We are interested in inferring static topologies.
In the first part of the paper, we consider undirected trees with vertices, edges that can be used in both directions, and exactly one path between any two vertices. We denote by the leaf vertices of the tree, which correspond to end-hosts that can act as sources or receivers of probe packets.
In the second part, we consider directed acyclic graphs (DAGs) with M sources and N receiver nodes, which we refer to as an M-by-N topology, following the terminology of (37); (38). Without loss of generality (w.l.o.g.), we present most of our discussion in terms of M = 2, i.e., inferring a 2-by-N topology; an M-by-N topology can be constructed by merging smaller structures. Similarly to (37); (38), we also assume that a predetermined routing policy maps each source-destination pair to a unique route from the source to the destination.

There is a unique path from each source to each receiver.

Two paths from the same source to different receivers take the same route until they branch, so that all 1-by-2 components have the “inverted Y” structure; the node where the paths to the two receivers split is called a branching point, .

Two paths from different sources to the same receiver use exactly the same set of links after they join, so that all 2-by-1 components have the “Y” structure; the node where the paths from the two sources merge is called a joining point, .
These properties are consistent with destination-based routing: the next hop taken by a packet is determined by a routing table lookup on the destination address. Each subnetwork from one source to receivers is a 1-by-N tree; the general graph is called a “multiple-tree” network (37).
Loss and Delay.
We consider scenarios with and without packet loss. Each link has a delay with a fixed part, e.g., the propagation and transmission delay, and a variable part, e.g., the queueing delay. Path delay is the sum of delays across the links in the path. We have no control over the delays of the links but we have control over the timing of operations at sources and intermediate nodes. We can make sources and intermediate nodes operate in time slots of duration and , respectively, which can be chosen to be considerably longer than link delays, as explained later.
Goal. Our goal, in this paper, is to design active probing schemes, i.e., the operation of sources, intermediate nodes and receivers, that will allow us to infer the logical topology from the observations at the receivers. We restrict the space of possible operations to the simple options described in the rest of this section. In later sections, we design schemes based on these simple operations and we show that they are sufficient for topology inference. We will revisit the problem statement and make it more precise in the sections for trees and DAGs.
Operation of Sources. An experiment consists of a pair of sources and sending, at the same time, a multicast probe packet each ( and , or more generally symbols from a finite field) to all receivers. These are special probes sent solely for the purpose of inference, not for regular data transfer, and treated in a special way, specified next, by intermediate nodes. We perform up to experiments. Consecutive experiments are spaced apart by a large time interval , to ensure that only probes in the same experiment are combined together.
Operation of Intermediate Nodes. Intermediate nodes are assumed to support unicast, multicast, and the simplest possible network coding operation, i.e., addition over a finite field . They operate in time slots of predetermined duration or window : a node waits for the duration of this window to receive probe packets from its incoming links; if it receives more than one packet, it codes them together and forwards (unicast or multicast) the resulting packet downstream.
The choice of affects where the packets from the two sources meet. Essentially, an intermediate node can act either as a joining point (J), in which case it adds all incoming packets and forwards the output to all outgoing links;
Operation of Receivers. Each receiver receives probes , which are the source packets , , or a linear combination of and , as the result of network coding operations at intermediate nodes. Inference of topology is based only on the observations ’s. We assume that these observations are sent to a fusion center for central processing and inference; consistent with the tomography literature, the communication between the receivers and the fusion center is outside the scope of this paper.
Intuition. Multicast as well as network coding (which is limited to simple addition in this paper, and thus can be thought of as reverse multicast) introduce topology-dependent correlation in the content of packets, which can be used at the receivers to infer the underlying topology. In particular, multicast helps reveal the branching points while network coding helps reveal the joining points.
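The slotted operation of intermediate nodes can be illustrated with a minimal sketch. It assumes that packets are represented by their coefficient vectors over a prime field (so the two source probes are (1, 0) and (0, 1)) and that slots are aligned to multiples of the window; the names `code_in_slots`, `window`, and `q` are hypothetical and introduced only for this illustration.

```python
def code_in_slots(arrivals, window, q=2):
    """Add, per time slot, the coefficient vectors of all probes arriving in it.

    arrivals: list of (arrival_time, coefficient_vector) pairs seen by one node.
    window:   slot duration (assumption: slots start at multiples of `window`).
    q:        size of the field used for addition (assumption: q is prime).
    Returns one outgoing coefficient vector per non-empty slot, in time order.
    """
    slots = {}
    for t, vec in arrivals:
        k = int(t // window)                      # index of the slot the probe falls in
        prev = slots.get(k, (0,) * len(vec))      # running sum for that slot
        slots[k] = tuple((a + b) % q for a, b in zip(prev, vec))
    return [slots[k] for k in sorted(slots)]


x1, x2 = (1, 0), (0, 1)                           # probes of the two sources
arrivals = [(0.2, x1), (0.7, x2)]
print(code_in_slots(arrivals, window=1.0))        # both probes share one slot
print(code_in_slots(arrivals, window=0.5))        # probes fall in different slots
```

With the longer window the node adds the two probes and behaves as a joining point; with the shorter window it relays each probe unchanged, which is one way to see why the window choice affects where probes meet.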
3.2 Scope and Discussion
Possible deployment scenarios are described in Section 8.1. The first scenario (sending special probes for the sole purpose of topology inference) is used to describe the schemes throughout the paper. Furthermore, and beyond the specific details of the deployment, we believe that our work provides a fundamental building block for exploiting correlation in the content of network coded packets for inference of joining points. Similarly, multicast tomography showed how to exploit correlation in multicast packets for inference of branching points; it was then followed by a series of papers that used unicast traffic to “mimic” multicast probes and the whole functionality while being more practical.
We would like to emphasize that, in this paper, we apply network coding on special probes solely for the purpose of topology inference, and not for improving data transfer (decoding the source messages at the receivers). In data transfer, throughput and delay are indeed important metrics. In our problem, the important metrics are identifiability (for which we show that network coding is necessary) and efficiency, i.e., the number of probes used and the amount of network resources consumed for a certain level of estimation accuracy (which we show is improved by network coding). Therefore, the delay of the algorithms is not of primary concern in this paper: inferring the topology in the order of seconds as opposed to milliseconds is acceptable in our setup. Multipath routing, which could increase throughput with network coding, is not considered either.
4 Main Results
The main results obtained in this paper are the following:

For tree networks:

When there is no packet loss in the links, we design the deterministic Alg. 1, which infers the topology in iterations, where is the number of edges in the tree.

When there is packet loss in the links, we design Alg. 2, which infers the topology in iterations, where , and is the minimum probability of success across all paths between a source and a destination.


For DAGs, we decompose the topology into 2-by-2 subnetwork components, and then we merge these components to reconstruct the topology. We design inference algorithms to infer the 2-by-2 components, and we design merging algorithms to merge them back to the original topology:

Assuming knowledge of all the 2-by-2 components and one source’s 1-by-N tree topology, we design Alg. 7 that merges a second source’s topology with the first one by identifying all the joining points in steps.

Assuming knowledge of all the 2-by-2 components, but not the 1-by-N tree topology, we can identify all the joining points if and only if there are no branching points in a row. We merge the two topologies in steps, as we describe in Section 6.3.2.

We also provide a lower bound on the number of 2-by-2 components required by any merging algorithm to uniquely localize all the joining points in a 2-by-N topology, given one source’s 1-by-N topology. In Lemma 6.2, we show that it is .

We also make connections between our approach and alternative topology inference approaches in Section 8.
5 Inferring Trees
Overview. We design algorithms for inferring undirected tree topologies, based only on probes sent between leaf nodes. We follow a hierarchical, top-down approach, by iteratively dividing the tree topology into smaller clusters and revealing how the groups are connected to each other.
Operation of Sources and Receivers. In each iteration (timeslot ) a set of leaves (different across timeslots) are chosen to act as sources and the remaining leaves act as receivers. Each source sends a distinct packet. The receiver stores the first packet it receives, and discards any subsequent packets (in the same iteration).
Operation of Intermediate Nodes. Every intermediate node operates in intervals of duration . If, within , the node receives a single probe from one of its neighbors, it multicasts the probe to all other neighbors. If, within , it receives more than one packet from different neighbors, it adds them and forwards the result to all remaining neighbors. In binary trees, this linear combination is simply XOR. In general trees, we need operations over higher fields.
Summary of Results. In the rest of this section, we first consider binary trees, with or without packet loss. Then we extend our algorithms to M-ary trees. For trees without loss, we design deterministic algorithms that infer the topology in iterations. For trees with loss, just one successfully received probe per network path is sufficient, without the need to collect packet loss statistics, a property that enables rapid discovery of the underlying topology.
5.1 Binary Trees
5.1.1 Lossless Binary Tree
Let us first consider the simplest case: an undirected binary tree without packet loss. The following example illustrates the main idea.
Example 1.
Consider the tree shown in Fig. a, with 7 leaves (1, 2, …, 7) and 5 intermediate nodes (A, B, C, D, E). Assume that nodes act as sources and send probes , respectively. All other leaves act as receivers. Intermediate node receives and forwards it to leaf and to node . Similarly, node receives and forwards it to node (which in turn forwards it to leaves ) and to node . Probe packets and arrive at node , which adds them, creates the packet , and forwards to node , which in turn forwards it to leaves .
At the end, leaf receives , leaves receive and leaves receive . Thus, the leaves of the tree can be partitioned into three sets: containing and the leaves that received , i.e., ; containing and the leaves that received ; and containing the leaves that received . From this information observed at the edge of the network, we can deduce that the binary tree has the structure depicted in Fig. b: three components, each seeing a different probe () flowing through it, and connected through three links to the middle node . This concludes the first experiment/iteration.
To infer the structure that connects leaves to node , we need a second experiment. We randomly choose two of these leaves, e.g., , to act as sources . Any probe packet leaving node will be multicast to all remaining leaves of the tree, i.e., nodes observe the same packet. One can think of node as a single “aggregatereceiver” that observes the common packet received at nodes . Following the same procedure as before, assuming that meet at node , nodes and receive . Using this additional information and the fact that the topology is a binary tree, we refine the inferred structure from Fig. b to Fig. c.
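One iteration of this procedure can be simulated with the following sketch. The tree below is a hypothetical 7-leaf, 5-node binary tree chosen to be consistent with the description in Example 1; it is not taken from the figure. The sketch also assumes unit link delays, XOR as the coding operation, and that every node acts only on the first slot in which it receives probes. The names `TREE`, `send_two_probes`, and `components` are introduced for this illustration only.

```python
from collections import defaultdict

# Hypothetical binary tree: leaves 1..7, intermediate nodes A..E (assumption).
TREE = {
    "A": [1, 2, "C"], "B": [3, 4, "C"], "C": ["A", "B", "D"],
    "D": ["C", "E", 7], "E": ["D", 5, 6],
}

def adjacency(tree):
    """Build an undirected adjacency map from the node -> neighbors listing."""
    adj = defaultdict(set)
    for u, nbrs in tree.items():
        for v in nbrs:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def send_two_probes(adj, s1, s2):
    """Simulate one iteration; return {receiver leaf: set of sources XORed in}."""
    received, done = {}, {s1, s2}
    inbox = defaultdict(list)
    for s in (s1, s2):
        (nbr,) = adj[s]                            # a leaf has exactly one neighbor
        inbox[nbr].append((s, frozenset([s])))
    while inbox:
        nxt = defaultdict(list)
        for node, msgs in inbox.items():
            if node in done:                       # later arrivals are discarded
                continue
            done.add(node)
            packet = frozenset()
            for _, p in msgs:
                packet ^= p                        # XOR of probes in the same slot
            if len(adj[node]) == 1:                # leaf: store the first packet
                received[node] = packet
                continue
            senders = {u for u, _ in msgs}
            for nbr in adj[node] - senders:        # forward to all other neighbors
                nxt[nbr].append((node, packet))
        inbox = nxt
    return received

def components(received):
    """Group receiver leaves by the probe content they observed."""
    groups = defaultdict(list)
    for leaf, packet in received.items():
        groups[tuple(sorted(packet))].append(leaf)
    return {k: sorted(v) for k, v in groups.items()}

adj = adjacency(TREE)
print(components(send_two_probes(adj, 1, 7)))      # probes meet at a node
print(components(send_two_probes(adj, 5, 7)))      # probes cross on a link
```

In this assumed tree, sources 1 and 7 produce three receiver groups (one per distinct probe content), whereas sources 5 and 7 produce only two, because their probes cross on a link instead of meeting at a node.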
Algorithm 1 generalizes the previous example and can infer any binary tree topology. It starts by considering all the leaves . It calls SendTwoProbes and partitions into smaller sets , , . It proceeds by recursively calling SendTwoProbes within each set, until all edges are revealed.
Lemma 5.1.
Algorithm 1 terminates in at most iterations and exactly infers the topology of an undirected binary tree.
Proof.
Consider a particular iteration (call of SendTwoProbes): sources and send exactly one probe packet each to all other leaves. Now consider the intermediate nodes on the path between the two sources. Depending on the link delays, there are two possibilities.
The first possibility is that and meet (arrive within the same ) at one of the intermediate nodes on , e.g., node . Node forwards their XOR to its third link, and the iteration reveals the neighboring edges and nodes to as depicted in Fig. 2(a). The second possibility is that and cross each other while traversing the same link of in opposite directions, i.e., they do not meet at a node. Leaf nodes are designed to keep only the first probe they receive, even if more than one arrives. In this case, we infer the configuration in Fig. 2(b) that reveals one edge.
In summary, the algorithm iteratively divides the binary tree into smaller components until one component has two or fewer leaves, in which case we know its structure. In each iteration, we reveal three edges or one edge. At the end, we have revealed all edges. Therefore, the algorithm requires between and iterations. ∎
Notes. In each iteration, every link is traversed exactly once by a probe. Link delays affect where the probes meet and thus what components are revealed in each iteration. However, they do not affect the correctness of the algorithm.
5.1.2 Lossy Binary Tree
Packet loss may cause confusion when dividing the receivers into components. One solution is to send multiple probes from the same two sources in each iteration, as we discuss next. However, given packet loss and delay variability, this may result in probes meeting at different nodes in the same iteration.
Intermediate Node Operation: Each intermediate node keeps a table of its neighbors. In each iteration, it marks these neighbors as source or sink neighbors. Once this marking is done, it does not change for the duration of the iteration. The first time during an iteration that an intermediate node receives a probe, it waits for a window to receive probes from other neighbors. After this time passes, the node marks all neighbors from which it received packets as sources and all other neighbors as sinks. For the remaining duration of the iteration, the node accepts packets only if they originate from its source neighbors. If the node receives a packet from one of its source neighbors, it forwards it to all its sink neighbors. If it receives more than one packet from different source neighbors, it linearly combines them, and forwards the result to its sink neighbors. The node rejects probes coming from sinks, and does not forward packets towards sources.
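The source/sink marking described above can be sketched as a small state machine. This is an illustration under stated assumptions rather than the paper's implementation: time is given as explicit timestamps, a probe arriving exactly at the end of the window is still counted as inside it, and the class and method names (`IntermediateNode`, `accept`, `sinks`) are hypothetical.

```python
class IntermediateNode:
    """Per-iteration neighbor marking for a lossy tree (illustrative sketch)."""

    def __init__(self, neighbors, window):
        self.neighbors = set(neighbors)
        self.window = window
        self.start_iteration()

    def start_iteration(self):
        """Forget the marking; called at the beginning of each iteration."""
        self.window_start = None   # time of the first probe in this iteration
        self.sources = set()       # neighbors marked as sources

    def accept(self, sender, now):
        """Return True if a probe from `sender` at time `now` is accepted."""
        if self.window_start is None:
            self.window_start = now                 # first probe opens the window
        if now - self.window_start <= self.window:
            self.sources.add(sender)                # still marking source neighbors
            return True
        return sender in self.sources               # marking is frozen afterwards

    def sinks(self):
        """Neighbors that accepted probes are forwarded to (valid once the window closed)."""
        return self.neighbors - self.sources


node = IntermediateNode({"a", "b", "c"}, window=5)
print(node.accept("a", 0), node.accept("b", 3), node.accept("c", 9))
print(sorted(node.sinks()))
```

Probes from `a` and `b` arrive within the window, so both become source neighbors; the late probe from `c` is rejected, and `c` remains the only sink for the rest of the iteration.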
Alg. 2 presents the modifications required for Alg. 1 to be able to infer binary trees with lossy links. The main difference is that, in each iteration, each source sends multiple probes instead of one.
Performance: Alg. 2 has an associated probability of error, since a leaf might not receive the “correct” probe packet.
5.2 M-ary Trees
5.2.1 Full M-ary Trees
We first consider full M-ary trees, where all intermediate nodes have degree , , without packet loss. Alg. 1 can still accurately infer the topology in fewer than iterations. However, we can modify the algorithm to infer the topology even faster. The idea is to keep the hierarchical clustering approach but increase the number of components revealed in each iteration, either (i) by changing the intermediate nodes so that they forward different linear combinations of incoming probes to different outgoing links; or (ii) by increasing the number of sources in each iteration.
Modification I: (two sources per iteration, coding points send different combinations to different links). When an intermediate node receives two incoming packets from two different neighbors, it deterministically generates different linear combinations, e.g., and forwards each resulting packet to a different neighbor. Therefore, when and meet at any intermediate node, the leaves of the network will be divided into components, depending on which probe packet they receive. If the probe packets do not meet at a node but cross each other, the leaves of the network will be divided into two components. Once a component has or fewer leaves, since we have a full M-ary tree, we know its structure. Therefore, in each iteration, we reveal edges or one edge, and the total number of iterations is reduced to at least and at most iterations. Note that the operations are performed over in this case.
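As a hedged illustration of Modification I, the sketch below shows one possible way a coding point could choose a different combination for each outgoing link so that receivers can tell the resulting components apart. The specific coefficients (1, k) and the prime field size `p` are assumptions of this example, not necessarily the combinations or field used by the scheme; `outgoing_combinations` and `pairwise_independent` are hypothetical helper names.

```python
def outgoing_combinations(num_links, p):
    """Coefficient pairs (a, b) of packets a*x1 + b*x2 over GF(p), one per link.

    Assumption: p is prime and there are at most p - 1 outgoing links, so that
    the pairs (1, 1), (1, 2), ... are all distinct and nonzero in the field.
    """
    if num_links > p - 1:
        raise ValueError("field too small for this many outgoing links")
    return [(1, k) for k in range(1, num_links + 1)]


def pairwise_independent(vectors, p):
    """True if no vector is a scalar multiple of another over GF(p)."""
    return all(
        (a * d - b * c) % p != 0                  # 2x2 determinant is nonzero
        for i, (a, b) in enumerate(vectors)
        for (c, d) in vectors[i + 1:]
    )


p = 5
combos = outgoing_combinations(3, p)              # three outgoing links
observed = [(1, 0), (0, 1)] + combos              # x1, x2 and the coded packets
print(combos, pairwise_independent(observed, p))
```

Because every observed coefficient vector differs from the others even up to scaling, each receiver can identify which outgoing link of the coding point it sits behind, which is what allows more than three components to be revealed in one iteration.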
Modification II: (more than two sources per iteration, coding points send the same combination to all outgoing links). Alternatively, we can use up to sources (as per Lemma 5.2) per iteration. The sources send , respectively. When an intermediate node receives packets from different neighbors within , , it simply adds them up (over ) and forwards the result to all remaining neighbors. Depending on whether the node receives packets or only a single packet, the leaves of the network will be divided into or more components; i.e., in each iteration, we reveal or edges. Therefore, the algorithm requires at least and at most iterations.
Lemma 5.2.
The maximum number of sources that can be used to uniquely infer the topology of a full M-ary tree is .
Proof.
We show that if we use sources to infer the topology of a full M-ary tree, it cannot be uniquely identified. For example, consider a binary tree with three sources sending , , and respectively, to all other leaves in the tree. Assume that the three probe packets meet at one intermediate node; thus, we divide the leaves into four components, which observe , , , and respectively. Since the degree of intermediate nodes is three, we conclude that two of the three sources must have joined at one intermediate node first, and then their result must have joined with the third source in another intermediate node, so that they result in in the last component. The first two sources can be either or . Therefore, we cannot uniquely infer the underlying binary tree topology by observing these four components. The same discussion applies to larger full M-ary trees (). ∎
Note. In the presence of loss, the same argument as in Section 5.1.2 applies, i.e., we can assign directions to edges in each iteration, so that our algorithms are applicable to the lossy case as well.
5.2.2 General M-ary Trees
In general M-ary trees, the degree of intermediate nodes varies from three up to a maximum of . We can still apply Alg. 1 and infer the tree topology in iterations. We can also apply Modification I described in Section 5.2.1; the operations are still performed over since probe packets may meet at an intermediate node of degree . However, we cannot apply Modification II here: since probe packets may meet at an intermediate node of degree three, we cannot use more than two sources according to Lemma 5.2, although there exist larger degree nodes in the tree.
6 Inferring Directed Acyclic Graphs (DAGs)
6.1 From a Single-Tree to Multiple-Tree Topologies
So far, we considered undirected trees. Let us now consider directed trees, which are a special case of DAGs.
Example 2.
Assume that we assign directions to the links of the binary tree in Fig. a, all from the top to the bottom. Clearly, we can no longer send probe packets in arbitrary directions in each iteration. However, we can still infer some information about the topology. Assume that we send probes from the source nodes and , and we observe at the receiver nodes , , and . Therefore, we identify three components , , and , together with the intermediate nodes and , and three edges , , and , which connect the three components together. However, we cannot obtain more information about the internal structure of the component or any other part of the tree network.
Next, consider a 2by2 network as defined in Section 3, i.e., a directed acyclic graph with two sources, two receivers and predetermined routing. Note that directed trees are only one type among all four possible types of the basic 2by2 components of any multipletree network, as defined in Section 3. There exist four 2by2 topologies, as shown in Fig. 3, which were first defined in (37); (38). Following the same terminology as in (37); (38), we refer to Fig. 3(a), (b), (c) and (d) as type 1, 2, 3 and 4, respectively. Type 1 is called shared (37); (38) since the joining points for both receivers coincide () and the branching points for both sources coincide (). The other three types (types 2, 3 and 4) are called nonshared since they have two distinct joining points and two distinct branching points.
In a directed tree, all 2by2 components are of type 1. However, in a general MbyN topology, several different 2by2 types may coexist. The algorithms described so far can identify type 1 2by2 topologies, and thus, trees (either completely or partially, as described above). However, they cannot distinguish between type 1 and type 4 2by2’s, as described in the following example.
Example 3.
Consider Fig. 3 (a) and (d). Assume that in both cases, we send from to and that meet (arrive within the same ) at any joining point. Therefore, in both type 1 and type 4, both receivers observe , and we cannot distinguish between the two types.
In general, unlike singletree networks, the observations do not uniquely characterize the underlying topology in multipletree networks. The reason is that once two sources in a tree network transmit their probe packets, the packets meet at no more than one coding point for all the receivers, as we saw in Section 5. In a multipletree network, on the other hand, probe packets may meet at different coding points for different receivers, as depicted in Fig. 4. Therefore, we need a different approach.
Problem Statement. Our goal in this section is to infer a multipletree topology, or an “MbyN” topology according to the terminology of Section 3. Similarly to (37), we take two steps. In the first step (Section 6.2), we use several experiments and we exactly identify the type of every 2by2 component. In the second step (Section 6.3), we merge these 2by2 subnetwork components to reconstruct the MbyN network.
Operation of Sources. Pairs of sources are selected and send up to coordinated multicast packets to all receivers. As in the general setup, probes are spaced apart by intervals of length . In addition, we introduce a difference in the sending time of the two sources, which we call the offset . W.l.o.g., let send first and second.
The timing parameters are coarsely tuned so as to create observations that can distinguish among different 2by2 types. In particular, (i) ensures that only probes within the same experiment are coded together. To be more precise, we choose , where is the maximum number of joining points on any path in the topology. In the worst case, there can be joining points in a row and thus, . However, in practice, is usually a lot smaller. (ii) path delay (between the sources and the joining points) ensures that source packets meet at the joining points despite link delays. (iii) is selected randomly in each iteration, so that it forces probes to meet at different points, or not meet at all, in different iterations. Finally, coarse selection of with rough estimates of upper bounds on link and path delays is sufficient.
Operation of Receivers. For a given 2by2 subnetwork, let the observations at the two receivers be , . Based on these observations, we design Inference algorithms that identify the 2by2 type (in Section 6.2) and Merging algorithms that build the MbyN from the 2by2’s (in Section 6.3).
Operation of Intermediate Nodes. In DAGs, the operation of an intermediate node, depending on whether it acts as a joining point or a branching point, is summarized in Alg. 3 and Alg. 4, respectively. A joining point (J) adds and forwards packets, while a branching point (B) forwards the single received packet to all “interested” links downstream. A link is “interested” in the routing sense if it is the next hop for at least one source packet in the network coded packet.
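The two node behaviors described above can be illustrated with a minimal sketch. It is not the paper's Alg. 3 or Alg. 4: the field size `Q`, the representation of a packet as a tuple of coefficients, the fixed time slots used to approximate "arriving within the same window", and the `next_hops` routing table are all assumptions made for illustration.

```python
from collections import defaultdict

Q = 5  # hypothetical prime field size; the text only requires a finite field


def joining_point(arrivals, window, q=Q):
    """Add packets that arrive in the same time slot of length `window`.

    arrivals: list of (arrival_time, coeffs), coeffs being a tuple of ints mod q.
    Returns a list of (slot_index, summed_coeffs), one entry per non-empty slot.
    Fixed slots are a simplification of the windowing described in the text.
    """
    slots = defaultdict(list)
    for t, coeffs in arrivals:
        slots[int(t // window)].append(coeffs)
    out = []
    for slot in sorted(slots):
        # Component-wise addition over the (assumed) field of size q.
        summed = tuple(sum(col) % q for col in zip(*slots[slot]))
        out.append((slot, summed))
    return out


def branching_point(coeffs, next_hops):
    """Return the links "interested" in a packet.

    next_hops: dict mapping link -> set of source indices routed over that link.
    A link is interested if it is the next hop for at least one source whose
    coefficient in the packet is non-zero.
    """
    present = {i for i, c in enumerate(coeffs) if c != 0}
    return [link for link, sources in next_hops.items() if sources & present]
```

For instance, two packets arriving in the same slot are summed, while a packet arriving in a later slot is forwarded unchanged.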
6.2 Identifying 2by2 Components
In this section, we propose an approach to exactly identify a 2by2 component, using the same intuition as in trees, i.e., coding operations result in observations that can uniquely characterize the underlying 2by2. Our approach builds on (37) and improves over it by uniquely distinguishing among all four 2by2 types, while (37) could only distinguish between shared and nonshared types.
Lossless 2by2
First, we provide an algorithm to identify the type of a 2by2 component without packet loss. In the first experiment, sources multicast probe packets to . We begin with the assumption that act simultaneously, or in practice within the synchronization offset. A choice of large guarantees that meet at both joining points , which add the incoming probes over . Depending on the underlying 2by2 type, observe one of the following pairs:

type 1: : , :

type 2: : , :

type 3: : , :

type 4: : , :
Types 2 and 3 result in unique observations that make them distinguishable from any other type; i.e., one such observation suffices to identify type 2 or type 3. However, types 1 and 4 result in the same pair of observations; therefore, we need to design different experiments to get observations that can uniquely characterize type 1 or type 4.
In the next experiment, we exploit the observation, first made in (37), that type 1 is the only 2by2 where the two joining points coincide (). Therefore, the observations at the two receivers are always the same: either when the two packets meet at ; or a single packet ( or ) when the two packets do not meet at . In contrast, type 4 has two different joining points . If we force packets to meet only at one of the joining points but not at the other one, the receivers will have different observations. These are observations #3 and #4 in Table 1 and they can uniquely characterize type 4.
These observations can be achieved by appropriately selecting the offset in the sources’ sending times. needs to be large enough so that after addition to the link delays, it can affect : if represent the delays on the paths from to , respectively, must be in
Observation Number  Type 1  Type 4
1  
2  
3  
4 
Alg. 5 summarizes the experiments we perform in order to infer the type of a 2by2 network. Types 2 and 3 are identified in the first observation. Type 4 is identified the first time that the two receivers see different observations. If after trials, we still have not seen any different observations at the two receivers, then we declare the 2by2 to be of type 1.
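The type 1 versus type 4 decision described above can be sketched as follows. This is a minimal illustration, not the paper's Alg. 5: the function name, the representation of each observation as a comparable Python value, and the `max_trials` parameter (standing in for the trial budget) are assumptions. The identification of types 2 and 3 from the first observation is omitted.

```python
def classify_type1_vs_type4(observations, max_trials):
    """Decide between type 1 and type 4 from successive receiver observations.

    observations: iterable of (obs_r1, obs_r2) pairs, one per experiment.
    Returns (declared_type, experiments_used).
    Type 4 is declared the first time the two receivers disagree;
    type 1 is declared if no disagreement is seen within max_trials.
    """
    used = 0
    for obs_r1, obs_r2 in observations:
        if used == max_trials:
            break
        used += 1
        if obs_r1 != obs_r2:
            return 4, used
    return 1, used
```

The early return mirrors the text: a single pair of differing observations is enough to rule out type 1, whereas type 1 is only declared by exhausting the budget.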
Choosing . should be large enough to ensure small probability of error. The probability of error of Alg. 5 can be computed as follows. Let indicate whether the two observations are the same or not; it is a Bernoulli random variable with success probability . The number of required experiments is a geometric random variable. The only possible error is to mistakenly declare type 4 as type 1, which happens with probability:
(1) 
In type 4, occurs when , i.e., with probability . Thus, Alg. 5 identifies any 2by2 topology in experiments with the following error probability:
(2) 
We can then find by substituting the appropriate values (40). One can calculate that, to ensure an accuracy of in distinguishing between types 1 and 4 2by2’s, needs to be . However, this is a pessimistic upper bound: simulation results in Section 7 show that a much smaller is sufficient in practice.
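To make the trade-off between the number of trials and the error concrete, the following sketch assumes that experiments are independent and that, in a type 4 topology, each experiment yields differing observations with a fixed probability. Under that assumption the only error (type 4 declared as type 1) occurs when all trials agree. The names `p_differ`, `trials`, and `target_error` are introduced here and are not the paper's notation.

```python
import math


def miss_probability(p_differ, trials):
    """Probability that a type 4 component shows no differing pair in `trials`
    independent experiments, each differing with probability p_differ."""
    return (1.0 - p_differ) ** trials


def trials_for_accuracy(p_differ, target_error):
    """Smallest number of trials whose miss probability is at most target_error."""
    if not (0.0 < p_differ < 1.0) or not (0.0 < target_error < 1.0):
        raise ValueError("p_differ and target_error must lie strictly in (0, 1)")
    return math.ceil(math.log(target_error) / math.log(1.0 - p_differ))
```

A small per-experiment probability drives the required budget up quickly, which is consistent with the remark that the analytical bound is pessimistic.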
Lossy 2by2
Let us now consider a 2by2 network where packets may be lost on some links. In this case, we can no longer guarantee meetings of and at the joining points and predictable observations at the receivers. There are two differences from the lossless case. First, because of random packet loss, each experiment might result in different outcomes, shown in Table 2. Second, there are common observations across all four types, as opposed to just between types 1 and 4. We divide the observations in Table 2 into three groups: (i) at least one of the receivers does not receive any packet (“”) due to loss, (ii) both receivers have the same observation , and (iii) the two receivers have different observations .
Type 1  Type 2  Type 3  Type 4  


grp  grp  grp  grp  
1  i      i      i      i     
2          
3          
4          
5          
6          
7          
8  ii      ii  
9  ii  ii  
10  
11  iii  
12  iii  iii  
13  
14  
15  
16 
We choose to ignore the observations of group (i) because they can occur in any of the four 2by2 types and thus do not help to distinguish among 2by2’s in the deterministic way adopted in this paper. Observations of group (ii) can also be the result of any 2by2 type: unlike the lossless case, where is unique to type 1 or 4 topologies, any of the four topologies may result in such observations if some packets are lost. However, group (ii) observations are the only possibility for a type 1 topology, apart from the group (i) observations that we ignore, while the other three 2by2 types may result in either or . Therefore, if after trials, we only have observations of group (ii), we declare the topology to be type 1.
In observations of group (iii), it is , which means that and/or . An important observation is that the difference of the coefficients between the two receivers contains topologyrelated information. W.l.o.g., we focus on the coefficient of and look at the difference . Table 2 shows that can only occur in type 2 or type 4 topologies; while can only occur in a type 3 or 4 topology. Note that the coefficient is larger on one side (e.g., ) when the probe () goes through two joining points on its way to one receiver (in this case, ) and through one joining point on its way to the other receiver (). By performing several independent experiments and collecting several observations of group (iii), we can distinguish among the candidate topologies. If after experiments, there are only observations of group (ii) or (iii) with , we declare the topology as type 2. If there are only observations of group (ii) or (iii) with , we declare it as type 3. If there are observations of group (ii) or (iii) with both and , we declare it as type 4.
In our experiments, we try to create those observations that reveal the topology. These can occur either naturally, as the result of packet loss, or artificially, by introducing an offset in ’s sending time with respect to . To help these observations occur, especially for small loss rates, and similarly to the lossless case, we use a random offset . To make these experiments independent, we space apart successive sets of probes by roughly selecting , which is sufficient since there are at most two joining points on any path in a 2by2.
Alg. 6 summarizes the 2by2 inference for lossy networks. The algorithm is simple and follows a deterministic approach: one observation, or a set of observations, is sufficient to uniquely distinguish among types. For example, at least one observation of group (iii) rules out the type 1 topology; a pair of group (iii) observations with both and indicates type 4; etc. As a result, we require fewer experiments than the thousands of arrival order measurements required by (37); (38) for statistical significance. In addition, and more importantly, we identify the exact 2by2 type, while (37) was only able to distinguish between shared and nonshared types. The following lemma describes the probability of error of Alg. 6 with respect to the number of experiments () more precisely.
Lemma 6.1.
Alg. 6 identifies any 2by2 topology with in experiments, where , , , is the link loss rate (same for all links), and is the probability that probe packet arrives within at in a type 4 topology, i.e., , .
The proof is provided in Appendix B.
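The grouping of observations and the decision rule for the lossy case can be sketched as follows. This is an illustration under stated assumptions, not the paper's Alg. 6: an observation is modeled as a tuple of integer coefficients, a lost packet as `None`, and the tracked coefficient is selected by the hypothetical `coeff_index`. In particular, the mapping of a positive difference to type 2 and a negative difference to type 3 is an assumption about the sign convention; only the structure of the rule is taken from the text.

```python
def group_of(obs_r1, obs_r2):
    """Group (i): some receiver saw nothing; (ii): same; (iii): different."""
    if obs_r1 is None or obs_r2 is None:
        return "i"
    return "ii" if obs_r1 == obs_r2 else "iii"


def classify_lossy(observations, coeff_index=0):
    """Declare a 2-by-2 type from a batch of (obs_r1, obs_r2) pairs.

    Group (i) pairs are ignored. Returns 1, 2, 3 or 4, or None when the batch
    is inconclusive (no usable pair, or group (iii) pairs whose tracked
    coefficient does not differ).
    """
    usable = seen_iii = seen_pos = seen_neg = False
    for obs_r1, obs_r2 in observations:
        group = group_of(obs_r1, obs_r2)
        if group == "i":
            continue
        usable = True
        if group == "iii":
            seen_iii = True
            diff = obs_r1[coeff_index] - obs_r2[coeff_index]
            # Sign convention below is an assumption (see lead-in).
            if diff > 0:
                seen_pos = True
            elif diff < 0:
                seen_neg = True
    if seen_pos and seen_neg:
        return 4
    if seen_pos:
        return 2
    if seen_neg:
        return 3
    if usable and not seen_iii:
        return 1
    return None
```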
Inferring all 2by2’s in a 2byN Network
Algorithms 5 and 6 can be directly applied to a 2byN network, where two sources multicast to receivers. A difference is that intermediate nodes need to perform addition over a larger finite field, of order larger than the maximum number of joining points on a path (g), since a packet may meet itself at all the joining points on the path. In the worst case, there can be joining points in a row and thus, the maximum required field size is the first prime greater than N. Algorithm 5 and Algorithm 6 can be performed on any pair of receivers among all possible pairs. The same set of 2byN probes can be used to infer, in parallel and independently, the type of all 2by2 topologies. This reduces the number of probes, as we reuse them, instead of sending different sets of probes. The 2byN structure is important for the merging algorithm in Section 6.3.
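The worst-case field size mentioned above, the first prime greater than the number of receivers, can be computed with a short helper. The function names are illustrative, and trial division is used only because the values involved are small.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def first_prime_greater_than(n):
    """Smallest prime strictly greater than n."""
    candidate = n + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate
```

For the 2by9 example discussed later, this gives a worst-case field of size 11.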
2by2’s vs. other Subnetwork Components
We now discuss why we choose to decompose an MbyN network into 2by2 subnetwork components, as opposed to any other subnetwork structures , :

1by1: This is the smallest component and corresponds to measuring a single endtoend path. However, it reveals neither joining nor branching points.

1by2 and 2by1: These correspond to a 2leaf multicast or a reversemulticast tree, respectively. The 2by1 consists of 2 sources, one coding point, and 1 receiver. The 2by1 cannot identify the branching points while the 1by2 cannot identify the joining points. Similar comments apply to Mby1 and 1byN.

2by2: This is the smallest structure that gives information about the relative locations of joining and branching points.

mbyn, with : If we consider larger structures, there is an exponentially larger number of possible types, which requires more complicated inference algorithms. For example, there exist 19 possible types for a 2by3 structure.

MbyN: In the extreme case, we need to enumerate all possible MbyN topologies as in (43).
The larger the subnetwork component we use as a building block, the fewer components we need to infer and the simpler the merging algorithm. However, as the size of the basic component grows, the number of possible types increases exponentially and the inference step becomes increasingly complex. In this paper, we choose to decompose an MbyN into 2by2 components, inspired by the approach in (37). We note that 2by2 is the smallest building block required to infer both joining and branching points, and that it strikes a good tradeoff between inference and merging complexity.
6.3 Merging Algorithm
Assuming knowledge of all 2by2 subnetwork components, from Section 6.2, we now merge them together to reconstruct the MbyN network. We study merging in two different scenarios: (i) when a 1byN tree topology is known, which is the same problem studied in (37); and (ii) without knowledge of any 1byN, which is new to our work. Exploiting the accurately identified 2by2’s, we can solve (i) exactly, whereas it was previously solved only approximately; and we can also solve (ii), for which no solution was previously known.
More precisely, our merging algorithm can identify every joining point, in the sense that it can localize it between two branching points. However, note that when there are several joining points in a row, without any branching point in between, it is not possible to identify the relative locations of these joining points with respect to each other. In fact, this is the case in a tree topology.
Merging a 1byN and 2by2’s into a 2byN
In this section, we assume that the 1byN from to receivers is known using either the classic methods for singletree topology inference (7) or our algorithms in Section 5 for tree networks.
This 1byN is a tree rooted at and contains only branching points. We also assume that the 2by2’s between , a new source , and any pair of receivers are known, using the algorithms of Section 6.2. Our goal is to locate the joining points where paths from to the same receivers join ’s topology. We use the assumptions of Section 3 for routing.
This problem was posed in (37); (14) and was solved there in an approximate way. Bounds on the joining point locations in the topology were provided within a sequence of consecutive logical links. This was because the 2by2’s are only identified as shared or nonshared types in (37); (38).
In contrast, we design Algorithm 7, which localizes each joining point for each receiver to a single logical link, between two branching points in the topology. Our algorithm is simpler, faster, and more accurate: it can identify all joining points for any topology and with lower complexity, thanks to our complete knowledge of the 2by2 types.
Example 4.
Fig. a depicts a 2by9 topology constructed based on the Abilene network (52). Consider : it forms a type 1 2by2 with . Therefore, must lie above , so that there exists a unique path from each source to . We then need to localize with respect to : , form a 2by2 of type 4; thus, must lie below . is now localized to one link (between and ), and the algorithm ends here for . Other receivers are considered similarly. Note that a joining point can be placed on any link from the receiver to . Therefore, the number of steps required to localize a joining point is at most equal to the height of the tree. Also, when there is a group of receivers within which all pairs are of type 1, the algorithm is run only once and it assigns the same joining point to all of them. For this example, the algorithm in (37) cannot completely resolve all joining points, and provides bounds within a sequence of several logical links instead.
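The bottom-up walk illustrated in Example 4 can be sketched as follows. This is an illustration of the idea, not the paper's Algorithm 7. It assumes that the known 1byN tree is given as a `parent` map whose root is the source, that a type 1 pair places the joining point above the pair's common branching point, and that any other type places it below. The helper names and the choice of which sibling receiver to test are hypothetical.

```python
def leaves_under(parent):
    """Map every node to the set of leaves (receivers) in its subtree."""
    internal = {p for p in parent.values() if p is not None}
    under = {node: set() for node in parent}
    for leaf in (n for n in parent if n not in internal):
        node = leaf
        while node is not None:
            under[node].add(leaf)
            node = parent[node]
    return under


def localize_joining_point(receiver, parent, pair_type):
    """Return (lower, upper): the logical link of the known tree on which the
    joining point for `receiver` is placed.

    parent: dict node -> parent node, with the root (source) mapped to None.
    pair_type: function (r_a, r_b) -> 2-by-2 type in {1, 2, 3, 4}.
    """
    under = leaves_under(parent)
    lower, upper = receiver, parent[receiver]
    # Climb one branching point per step, so at most tree-height steps.
    while parent[upper] is not None:
        others = under[upper] - under[lower]
        probe = min(others)  # any receiver on the other side; min for determinism
        if pair_type(receiver, probe) != 1:
            return lower, upper  # joining point lies below this branching point
        lower, upper = upper, parent[upper]
    return lower, upper  # joining point lies above the top branching point
```

As a usage example, consider a tree with source `S1`, top branching point `B1`, a lower branching point `B2` with receivers `R1` and `R2`, and a receiver `R3` attached to `B1`. If the pair (`R1`, `R2`) is of type 1 and (`R1`, `R3`) is of type 4, the walk for `R1` stops after two steps and returns the link between `B2` and `B1`.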
Merging 2by2’s into a 2byN
In this section, we infer a 2byN without prior knowledge of any 1byN. Inference under this relaxed assumption is enabled by our exact knowledge of 2by2’s and was not possible before (37); (14). We first send probes over the 2byN and then merge all 2by2’s, as described next.
Example 5.
We first consider all shared (type 1) 2by2 components and assign them the minimum number of branching and joining points required. For example in Fig. a, and are identified in this step. Second, we consider all nonshared 2by2 topologies (of type 2, 3, or 4). We use the information about the locations of the branching and joining points in each type to: (i) add the minimum number of branching points required to the ones already identified from the shared pairs; and (ii) assign joining points to those receivers that have not been already assigned one. In the example of Fig. a, an additional branching point is required, which is connected to both joining points and , to satisfy the 2by2’s of type 4 between the two shared groups. No additional joining point is required in this example.
This approach identifies the locations of all joining points, between the and 1byN topologies, but it does not identify all the branching points in the tree topology. Only the “minimum” topology is identified, i.e., the tree made by the “necessary” branching points. We define as “necessary” branching points the ones located below a joining point of and in the 2byN. An “unnecessary” branching point is the child of another branching point with no joining point in between. For example in Fig. a, this approach does not identify , and directly connects their children () to the upstream branching point ().
Note that the worst case input for this approach is a tree network. Since all 2by2’s are of type 1, and the algorithm cannot reconstruct branching points in a row, it can only identify the topmost branching point of the entire tree structure.
From 2byN to MbyN
We can directly extend the 2byN inference techniques to the MbyN case (14). We start from a 2byN topology and add one source at a time, connecting the 1byN’s of the remaining sources. Assume that we have constructed a kbyN topology, . To add the source, we perform experiments, where in each experiment a different one of the sources and the source send and . We then glue these topologies together by following the topological rules of Section 6.3.1 (with singlesource trees given) or Section 6.3.2 (without that assumption).
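The incremental procedure can be illustrated by enumerating which source pairs probe together at each stage. The sketch assumes that adding a new source requires one experiment with each source already in the topology; sources are labeled 1 to M purely for illustration, and the function name is hypothetical.

```python
def pairing_schedule(num_sources):
    """List, stage by stage, the (existing_source, new_source) pairs that probe
    together. Stage 0 is the initial 2-by-N; each later stage adds one source."""
    schedule = []
    for new in range(2, num_sources + 1):
        schedule.append([(old, new) for old in range(1, new)])
    return schedule
```

Under this assumption the total number of source pairs used is M(M-1)/2 for M sources, that is, every pair of sources probes together exactly once.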
Complexity of Merging
Lemma 6.2.
If one source’s 1byN tree topology is given, the minimum number of 2by2’s required by any merging algorithm to uniquely localize all the joining points (between two branching points) in the 2byN topology is .
Proof.
One can think of checking the types of the 2by2 components in the following sense: we divide the receivers in the network into two sets of vertices, in a bipartite graph, and we draw an edge between any two vertices for which we check the 2by2 type. The minimum number of required 2by2’s is then given by a perfect matching in this bipartite graph; therefore, it is . ∎
Example 6.
Fig. 5 shows two 2byN topologies that require exactly 2by2’s for their joining points to be uniquely identified by any merging algorithm. In Fig. 5(a), checking the types of and is sufficient for localizing all four joining points. In Fig. 5(b), where all the joining points are the same as , checking the types of and would be sufficient.
Note on Lemma 6.2: If the 2by2’s are properly selected, of them can be sufficient in some topologies, as we see in the examples of Fig. 5. Unfortunately, we do not know in advance (without knowledge of the 2byN topology) which 2by2’s to choose out of all possible 2by2’s, so as to uniquely localize the joining points between branching points. Nevertheless, from the given 1byN topology, we can give an upper bound on the number of 2by2’s required. Since every receiver is checked with other receivers that are children of its upper branching points, up to the location of its joining point, we need to check for 2by2’s. This is less than identifying all 2by2’s. Note that we still need to multicast to all receivers and monitor all observations, but we can use only the observations of the selected 2by2’s for inference, and ignore the rest.
Lemma 6.3.
Algorithm 7 takes at most steps.
Proof.
As mentioned in the note above, Algorithm 7 considers every single receiver and checks the 2by2 type of that receiver with other receivers that are children of its upper branching points, up to the location of its joining point. Therefore, it takes at most steps. ∎
This is an improvement over