On Algebraic Traceback in Dynamic Networks
This paper introduces the concept of incremental traceback for determining changes in the trace of a network as it evolves over time. A distributed algorithm, based on the methodology of algebraic traceback developed by Dean et al., is proposed which can completely determine a path of nodes/routers () using marked packets, and subsequently determine the changes in its topology using marked packets with high probability. The algorithm is shown to be order-wise optimal, i.e., no other distributed algorithm can determine changes in the path topology using a smaller order of bits (i.e., marked packets). The algorithm is shown to have a computational complexity of , which is significantly less than that of any existing non-incremental algorithm of algebraic traceback. Extensions of this algorithm to settings with node identity spoofing and network coding are also presented.
Incremental traceback, MANETs.
Given the increasing number and forms of attacks on networks in recent years, developing efficient counter-measures, such as traceback, is of significant value. In this paper, we focus on determining efficient traceback mechanisms for networks with time-varying topologies. Settings such as mobile ad-hoc networks (MANETs) are of particular interest, where we wish to use traceback for network management and for countering attacks such as denial-of-service (DoS) attacks. The DoS attack is arguably one of the most common forms of attack on both wire-line and wireless networks, where either a single attacker or multiple distributed attackers “flood” a victim’s link with random packets to disrupt the delivery of legitimate packets. For the Internet, IP traceback is one of the possible mechanisms for determining the source of this attack. Similarly, generalized (not necessarily IP-based) traceback proves useful in determining the origin of attacks for MANETs. An important point to note is that traceback may prove useful for purposes other than countering distributed DoS attacks. For instance, it can be used for network maintenance, for source/route verification and to determine the location of faulty nodes in the network.
Traceback mechanisms have been traditionally studied for IP-based networks under the name of IP traceback. The common goal in traceback literature is to perform a post-attack traceback for an IP-based network to determine the source(s) of the attack. Our paper’s focus is on dynamic networks (which may or may not be IP-based) where traceback is preemptively performed to manage the network and deter possible attacks. To this end, we desire that the traceback mechanism be efficient and be able to track changes in the traces quickly with minimal computation. In this paper, we develop an incremental traceback mechanism which, after initialization, requires a low packet and computational overhead to detect and determine changes in traces of the network.
I-A Background on Traceback
As mentioned earlier, a large body of literature on traceback focuses on IP traceback. However, regardless of the setting, good traceback mechanisms share some common properties – they should (a) be partially deployable in the network, (b) result in little or no change in the router hardware, (c) provide accurate traceback using a small number of packets, (d) require as little ISP involvement as possible, (e) perform well in the presence of multiple attack sources and forms, and (f) have a low-complexity mechanism for identifying attackers. These properties also serve as the evaluation metrics when comparing different traceback approaches.
The importance of the IP traceback problem has led to a large body of research in the field, resulting in the development of many interesting traceback mechanisms and methodologies to date. We briefly describe some of them:
Savage et al. proposed one of the earliest probabilistic traceback mechanisms, where routers randomly mark packets with their partial path information during the process of packet-forwarding. The main disadvantage of the scheme is the combinatorial computational complexity of the traceback process.
Song and Perrig proposed an improved and authenticated packet-marking scheme with the ability to cope with multiple attacks. However, the traceback process at any workstation needs knowledge of its current upstream router map to all attackers.
Bellovin et al. developed iTrace, a traceback scheme where routers randomly send their IP addresses in the form of special packets to the source or destination IP address of the data packets. The use of special packets generates additional traffic; moreover, every workstation has to wait long enough to receive a sufficient number of special packets before it can carry out traceback.
Dean et al. suggested a novel algebraic approach to the IP traceback problem – encoding the IP addresses of the routers a packet passes through into a polynomial. This allows reconstruction of the entire path in one go after receiving a sufficient number of packets.
Adler gave a detailed theoretical analysis of the traceback problem, described the tradeoffs of probabilistic packet-marking schemes and proposed a -bit packet marking method to counter DoS attacks.
Snoeren et al. proposed SPIE, a mechanism which tracks every packet by querying the states of the upstream routers. However, this requires the routers to store a large amount of state information.
Thing and Lee showed that the performance of a traceback process in a wireless ad-hoc network depends on the routing protocol and network size.
In this paper, we perform traceback in a continuous manner, with the goal of ensuring that the destination(s) in a network stay well informed of the path(s) traversed by the packets received by them. We desire that the technique used for traceback is such that each node in the network remains blind to the global network topology and the changes in it. Essentially, when a change in topology occurs, we require that the destination(s) alone detect this change and initiate an incremental traceback analysis while the remaining nodes (including the source(s)) remain oblivious to the change.
Towards the end of developing an incremental traceback mechanism with desired qualities, we use the framework of algebraic traceback as developed by Dean et al. . Once the algebraic traceback process is initialized using the algorithm in , we show that marked packets and a traceback algorithm with a computational complexity of operations per execution are sufficient to track the change (node addition and deletion) in a path involving nodes (). Note that, if the non-incremental algebraic traceback process were repeated each time there is a change in the path, marked packets would be required to perform traceback. Next, we argue that our incremental traceback process is order-wise optimal in terms of the number of marked packets required and has a lower computational complexity compared to the conventional non-incremental traceback processes.
The rest of this paper is organized as follows. Sections II and III give the system model and a detailed review of the algebraic traceback mechanism respectively. The incremental traceback schemes based on different path encoding versions of algebraic traceback are presented in Sections IV and V. We describe the traceback procedure for systems employing network-coding in Section VI. The numerical results are shown in Section VII and the paper concludes with Section VIII.
II System Model
We consider a network represented by a directed graph. The nodes in the graph (identifiable with routers in the network) have unique identifiers (IDs) that come from the finite field , for some suitable prime number . A directed edge between a pair of nodes in the graph represents an error-free channel. We assume that the transmissions across different edges do not interfere with each other in any way.
Each node can act as a source, a destination or an intermediate packet-forwarding node, depending on the communication pattern in the network. We focus our attention on one such source and destination, represented in the graph by nodes and respectively. The source transmits data to the destination via the path . However, this path may change over the course of the transmission due to the dynamic nature of the network/graph. We want to develop an incremental algebraic traceback mechanism that enables destination to figure out this change in path .
We assume that there is the possibility of node-ID spoofing, i.e., a malicious node in path misreporting its ID to avoid detection by destination . We also limit our incremental traceback approach to track single node addition and deletion in path . This is deliberate, as conventionally, in wireless networks, the timescale at which routes/paths change (of the order of seconds) is many orders of magnitude greater than the timescale of data transmission (of the order of milliseconds or less). Thus, any one change can be detected before additional changes occur in a path. Our algorithm and analysis framework can be naturally extended to scenarios when multiple nodes can enter or leave path . The assumption also makes the algorithm description and proofs much more intuitive and concise, and therefore we focus on this simple case.
III Review: Algebraic Traceback
In this section, we present certain relevant aspects of algebraic traceback as developed by Dean et al. . The idea behind this traceback scheme is that a polynomial of degree in is completely determinable using of its evaluations at distinct points in . Though originally designed for IP traceback to counter DoS attack, the approach can be generalized to traceback in non-IP based networks.
III-A Deterministic Path Encoding
The deterministic path encoding scheme is used when no node-ID spoofing is suspected. The packet marking process is initiated by the first node that encounters the packet (source node, which is for path ). We include a flag-bit field and hop-count field (with initial values ) in each packet in the network – the flag-bit and hop-count values are set to when a packet is marked, otherwise the flag-bit value remains unchanged and each node following the source node just increments the hop-count by 1. In path , when node initiates the process of marking a packet (with some probability, say ), it encodes a value-pair into it, where is chosen randomly from and . If node () encounters a marked packet, it uses the values to update the value of as follows:
Hence, any marked packet received by destination has a value-pair of the form encoded in it, where
If destination receives value-pairs , where , path can be reconstructed by solving the following matrix equation:
The value of is obtained from the hop-count field of the marked packets. The resulting matrix in the equation is a full-rank Vandermonde matrix, and thus the system of equations can be solved in operations. Thus, path is determinable using marked packets, provided the -values encoded in them are distinct. This can be ensured with high probability by making the source keep a record of the -values it has used while marking packets, thereby avoiding re-use of the -values until the marking of at least packets. Therefore, choosing a large enough can ensure that marked packets are sufficient for retrieval of path .
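The marking and reconstruction steps of this subsection can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the per-hop update is the Horner-style rule of Dean et al. (multiply the carried value by the evaluation point and add the node ID, modulo the field prime), and the names `mark_along_path` and `solve_vandermonde_mod_p` are hypothetical. For brevity it solves the Vandermonde system by plain Gauss-Jordan elimination over the prime field, which is slower than the specialized Vandermonde solvers the complexity statement above refers to.

```python
def mark_along_path(path_ids, x, p):
    """Value carried by a packet marked at the source with evaluation point x.

    Assumption: each hop applies y <- y * x + node_id (mod p), so the source
    starts with its own ID and the destination sees a polynomial in x whose
    coefficients are the node IDs in path order.
    """
    y = 0
    for node_id in path_ids:
        y = (y * x + node_id) % p
    return y


def solve_vandermonde_mod_p(xs, ys, p):
    """Recover the d node IDs from d value-pairs with distinct x-values."""
    d = len(xs)
    # Augmented rows [x^(d-1), ..., x, 1 | y], all reduced modulo p.
    rows = [[pow(x, d - 1 - i, p) for i in range(d)] + [y % p]
            for x, y in zip(xs, ys)]
    for col in range(d):
        # Find a row with a non-zero pivot; one exists when the xs are distinct.
        pivot = next(r for r in range(col, d) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], p - 2, p)  # Fermat inverse, p prime
        rows[col] = [(v * inv) % p for v in rows[col]]
        for r in range(d):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p
                           for a, b in zip(rows[r], rows[col])]
    return [rows[i][d] for i in range(d)]


if __name__ == "__main__":
    p = 101                      # small illustrative prime
    path = [5, 9, 4]             # hypothetical node IDs, source first
    xs = [2, 3, 5]               # distinct evaluation points
    ys = [mark_along_path(path, x, p) for x in xs]
    print(solve_vandermonde_mod_p(xs, ys, p))
```

The number of value-pairs must equal the path length, which the destination reads from the hop-count field; repeated x-values would make the system singular, which is why the source avoids re-using them.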
III-B Randomized Path Encoding
The deterministic path encoding scheme may be infeasible if node-ID spoofing is possible and/or the first node to receive a packet is unsure if it is indeed the source node (for example, if does not know it is the source node in path ). Then we require a probabilistic traceback mechanism to address this situation. For path , node initiates marking of the packet as before (with probability ), but now each intermediate node () clears an existing marking, if any, and re-marks a packet with probability . Else, with probability , each node just follows the update mechanism as given by (1). The following pseudo-code summarizes this procedure:
Marking scheme at node :
for each packet
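The per-node procedure described in the paragraph above can be sketched as follows. This is an illustrative sketch only: the packet is modeled as a dictionary with hypothetical field names (`flag`, `hops`, `x`, `y`), the hop-count is assumed to reset to zero when a packet is marked, and the update step is assumed to be the Horner-style rule of Dean et al.

```python
import random


def process_packet(packet, node_id, q, p, rng=random):
    """Apply one node's randomized marking step to a packet (a dict).

    With probability q the node clears any existing mark and starts a new
    one; otherwise it increments the hop-count and, if the packet is already
    marked, folds its own ID into the carried value.
    """
    if rng.random() < q:
        packet["flag"] = 1
        packet["hops"] = 0                    # assumed reset value
        packet["x"] = rng.randrange(1, p)     # random non-zero evaluation point
        packet["y"] = node_id % p
    else:
        packet["hops"] += 1
        if packet["flag"] == 1:
            packet["y"] = (packet["y"] * packet["x"] + node_id) % p
    return packet


if __name__ == "__main__":
    pkt = {"flag": 0, "hops": 0, "x": None, "y": None}
    for node in [5, 9, 4]:                    # hypothetical node IDs
        process_packet(pkt, node, q=0.3, p=101)
    print(pkt)
```

Unmarked packets are left untouched apart from the hop-count, so only the marked ones contribute value-pairs at the destination.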
We assign non-trivial values to the marking probabilities such that the traceback process remains accurate while not requiring a very large overhead. For example,  examines the case when . Then, apart from marked packets with value-pairs corresponding to path , there are marked packets with value-pairs corresponding to sub-paths , as well. A marked packet received by destination has a value-pair of the form where
These marked packets can be segregated, in terms of the sub-paths their value-pairs correspond to, on the basis of their hop-count values, as a hop count of implies that the value-pair is for and, consequently, a hop-count of implies that the value-pair is for path . (For simplicity, we assume that the hop-count field is not attacked. If this field is attackable, then alternate mechanisms for path reconstruction exist, such as the Guruswami-Sudan algorithm based mechanism presented in .) Using this, the sub-paths and therefore, the entire path can be reconstructed after getting a sufficient number of marked packets, in a manner similar to deterministic path encoding. The -values across nodes can be maintained as distinct values (to ensure invertibility of the resulting matrix at the destination) by requiring that the nodes with non-zero marking probabilities keep track of the -values they use while marking packets and only reuse values when all elements in have been exhausted.
Suppose be defined as the fraction of packets marked by node and received by destination , then can be expressed in terms of , as
with the fraction of unmarked packets given by . This makes the fraction of marked packets coming from source to be , i.e., one out of marked packets is from node on an average. Since marked packets from node with distinct -values are needed for determining path , an average of marked packets needs to be received by destination to ensure that packets among them have value-pairs corresponding to path .
If , we have and , which gives the average number of marked packets as
As , the above quantity goes to . Hence, if is chosen reasonably small, an average of marked packets are sufficient for determining path . But is large for small , which is inefficient as then destination has to wait for a longer time to receive sufficient number of marked packets for performing traceback. Thus, there is a tradeoff in the value of . Even for the general case of marking probabilities, becomes smaller as and become large. But cannot be very large, causing a tradeoff. Regardless of this tradeoff, an average of marked packets is necessary.
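The packet-count argument above can be checked numerically. The following is a rough sketch under an explicit assumption: every one of the nodes on the path (written `d` in the code, a hypothetical name) marks independently with the same probability `q`, and a mark survives only if no later node re-marks the packet. The function names are illustrative and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def received_fractions(q, d):
    """Fraction of all received packets whose surviving mark came from node i.

    Entry i-1 is q * (1 - q)^(d - i): node i marks, and none of the d - i
    nodes after it re-marks.
    """
    q = Fraction(q)
    return [q * (1 - q) ** (d - i) for i in range(1, d + 1)]


def expected_marked_packets(q, d):
    """Average number of marked packets needed so that d of them come from
    the first node, under the uniform-probability assumption."""
    alphas = received_fractions(q, d)
    unmarked = (1 - Fraction(q)) ** d
    source_share = alphas[0] / (1 - unmarked)  # share among marked packets
    return d / source_share


if __name__ == "__main__":
    for q in (Fraction(1, 2), Fraction(1, 10), Fraction(1, 100)):
        print(q, expected_marked_packets(q, 5))
```

Under these assumptions the returned value shrinks toward the square of the path length as `q` becomes small, while the unmarked fraction grows, which is the tradeoff discussed above.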
IV Incremental Traceback: Deterministic Path Encoding
In this section, we present an incremental traceback approach, based on the methodology of deterministic path encoding. We adopt the same encoding/marking procedure i.e., the source node initiates the packet marking process. As discussed earlier, path can be ascertained using marked packets with a computational complexity of . Our interest is in the case when this initial process has occurred, and then path changes due to node addition or deletion. A conventional traceback mechanism would repeat the traceback procedure again, i.e., destination would wait until it receives marked packets again, reconstruct the modified path and then determine where the change has occurred. This scheme proves to be inefficient – the number of marked packets and computational load incurred remains the same. The proposed incremental traceback method makes use of the fact that path is known to destination (due to an initial traceback process) to determine the change using marked packets with a computational complexity of .
The change in topology of path involves either addition or deletion of a single node, which can be detected using the hop-count value of a marked packet – it changes from to for node addition and to for node deletion. We examine these two cases separately.
IV-A Node Addition
Note again that the encoding process remains the same as before (as in Section III-A). In incremental traceback, all that changes is the decoding algorithm at the destination . Suppose a node with ID gets added to path in the th position, (st position refers to the position before node and th position refers to the position after node ). Then the new packets have value-pairs of the form encoded in them, where
and are polynomials given by
for . These polynomials are known to destination from the usual traceback performed previously, which gives . The polynomials also satisfy
, where refers to the -value of the marked packet received by destination prior to addition of node .
Suppose , are the value-pairs encoded in marked packets received after the addition of in path . We consider the following set of equations:
From (2), the set of equations is consistent for . For , the set of equations is not consistent with high probability (this is established by Theorem 1 below). We make use of this property to design an incremental traceback algorithm for destination as follows:
Construct a matrix where
If there exists a unique row in with equal elements, say the th row, declare that the new node is in th position with ID .
If there exists more than one row in with equal elements, declare that an error has occurred. Wait for more value-pairs to arrive through marked packets, say , where is an integer of smaller order compared to . Repeat the algorithm using the value-pairs . Theorem 1 below shows that the algorithm terminates with high probability while obtaining the correct node ID.
Theorem 1: A newly added node in path can be identified by destination using marked packets and Algorithm I, with a computational complexity of .
Proof: From (5), it is clear that all elements of the th row of will be equal. If this is the only such row, we have the correct new node position and ID . An error occurs if there exists another row such that all elements of the th row are equal as well. To determine the probability of this happening, we note that is chosen uniformly over . This makes uniform for any , since each is purely a function of . So, is an i.i.d. uniform random process. This gives
for any and . Let be the event that all elements of the th row of are same. Then we have for , since there are elements in each row. The probability of error is
where the inequality above is due to the union bound. can be made arbitrarily small if can be made as negative as possible. If we require that , then this can be satisfied. Thus, we choose , where is a small constant. Then gets upper-bounded as
where the second inequality follows from the fact that . By choosing a large enough value for , can be bounded above by any arbitrary small positive value. In other words, is sufficient for determining the newly added node correctly.
Since the algorithm relies on the computation of which has entries, we get a complexity of (since ). This completes our proof.
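The row-equality test used for node addition can be illustrated with a short Python sketch. The matrix entries below are a reconstruction from the description in this subsection and should be read as an assumption, not as the paper's exact formula: for each candidate position, the contribution of the known prefix and suffix of the old path is subtracted from the received value, and the remainder is divided by the power of the evaluation point that would multiply the new ID. The names `poly_eval` and `locate_added_node` are hypothetical, and evaluation points are assumed non-zero.

```python
def poly_eval(ids, x, p):
    """Horner evaluation of the path polynomial (first ID = leading term)."""
    y = 0
    for node_id in ids:
        y = (y * x + node_id) % p
    return y


def locate_added_node(path, pairs, p):
    """Return (position, new_id) if exactly one candidate row is constant.

    path  : node IDs of the previously known path, source first
    pairs : (x, y) value-pairs received after the change, x non-zero
    Returns None when zero or several rows are constant (wait for more pairs).
    """
    d = len(path)
    hits = []
    for k in range(1, d + 2):                 # candidate positions 1..d+1
        prefix, suffix = path[:k - 1], path[k - 1:]
        e = len(suffix)                       # power of x on the new ID
        row = []
        for x, y in pairs:
            known = (poly_eval(prefix, x, p) * pow(x, e + 1, p)
                     + poly_eval(suffix, x, p)) % p
            inv = pow(pow(x, e, p), p - 2, p)  # inverse of x^e, p prime
            row.append((y - known) * inv % p)
        if all(v == row[0] for v in row):
            hits.append((k, row[0]))
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    p = 2147483647                            # the prime 2^31 - 1
    old_path = [5, 9, 4]                      # hypothetical IDs
    new_path = [5, 7, 9, 4]                   # ID 7 added in position 2
    pairs = [(x, poly_eval(new_path, x, p)) for x in (2, 3, 5)]
    print(locate_added_node(old_path, pairs, p))
```

This direct version recomputes every prefix and suffix evaluation from scratch, so it does not attain the complexity stated in Theorem 1; caching those evaluations per value-pair would be the natural refinement. With a single value-pair every row is trivially constant and the function returns None, matching the error branch of the algorithm.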
IV-B Node Deletion
Suppose node () gets deleted from path , leaving behind nodes. Then the new marked packets carry value-pairs of the form , where
Suppose be the received value-pairs from marked packets received after deletion of node . We consider the following set of equations:
From (6), the set of equations is consistent for . For , the set of equations is not consistent with high probability (proved in Theorem 2). We make use of this property to design an incremental traceback algorithm for destination , for the case of node deletion, as follows:
Construct a matrix where
If there exists a unique row in with equal elements, say the th row, declare that the deleted node was in th position with ID .
If there exists more than one row in with equal elements, declare that an error has occurred. Wait to receive more value-pairs through marked packets, say , where is an integer of smaller order compared to . Repeat the algorithm using the value-pairs . Theorem 2 below shows that the algorithm terminates with high probability while obtaining the correct node ID.
Theorem 2: A deleted node in path can be identified by destination using marked packets and Algorithm II, with a computational complexity of .
Proof: From (7), all elements of the th row of will be equal. If this is the only such row, we have the correct deleted node ID . An error occurs if there exists another row such that all elements of the th row are equal as well. Using the same argument as in the proof of Theorem 1, we get to be an i.i.d. uniform random process. This gives
for and . Let be the event that all elements of the th row of are same. Then for , and the probability of error is
where the inequality is again due to union bound. Since the upper-bound of is same as that for the case of node addition, using the same approach as in the proof of Theorem 1, we conclude that can be bounded above by any arbitrary small positive value and is sufficient for determining the deleted node’s location and ID with high probability. Since the algorithm makes use of , which has entries, this results in a computational complexity of (). This completes our proof.
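For node deletion, the same idea can be sketched with a simpler consistency check. Note that this is a deliberate simplification and not the matrix construction of Algorithm II: for each candidate position, the candidate path with that node removed is re-evaluated at every received evaluation point and compared against the received values. The Horner-style update and the names `poly_eval` and `locate_deleted_node` are assumptions made for illustration.

```python
def poly_eval(ids, x, p):
    """Horner evaluation of the path polynomial (first ID = leading term)."""
    y = 0
    for node_id in ids:
        y = (y * x + node_id) % p
    return y


def locate_deleted_node(path, pairs, p):
    """Return (position, deleted_id) if exactly one candidate is consistent.

    path  : node IDs of the previously known path, source first
    pairs : (x, y) value-pairs received after the deletion
    Returns None when zero or several candidates fit (wait for more pairs).
    """
    hits = []
    for k in range(1, len(path) + 1):
        candidate = path[:k - 1] + path[k:]   # path with node k removed
        if all(poly_eval(candidate, x, p) == y % p for x, y in pairs):
            hits.append((k, path[k - 1]))
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    p = 101
    old_path = [5, 9, 4, 7]                   # hypothetical IDs
    new_path = [5, 4, 7]                      # node in position 2 removed
    pairs = [(x, poly_eval(new_path, x, p)) for x in (2, 3)]
    print(locate_deleted_node(old_path, pairs, p))
```

As with the addition sketch, this version favors clarity over the stated complexity, and it cannot distinguish between two adjacent nodes that share the same ID.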
Thus, be it node addition or deletion, marked packets are always sufficient for destination to determine the change in path accurately. Before we proceed to randomized traceback algorithms, a quick note on the order-wise optimality of Algorithms I and II. Note that, from principles of information theory , it is well known that the entropy of a uniform source with an alphabet of size is bits. Thus, even if a centralized mechanism existed to communicate the location of the node being inserted/deleted, it would require bits to do so, as there are equally likely places for the change. Our distributed mechanism uses packets or approximately bits. Thus, in terms of the order of growth of network overhead in , the incremental traceback mechanism is order-wise optimal.
V Incremental Traceback: Randomized Path Encoding
In this section, we present an incremental traceback approach, useful when node-ID spoofing is suspected, utilizing the randomized path encoding framework. In this setup, each node decides to clear any existing marks on a packet and re-initiate the marking process with some probability . As multiple nodes on path now act as source nodes, we receive different (sub) polynomial evaluations across time. The marked packets carry value-pairs corresponding to both sub-paths and of the entire path . As described in Section III-B, path can be initially determined using an average of marked packets with a computational complexity of at least . Once path is known to the destination, we show that it is possible to track its changes using fewer marked packets and with lower complexity.
Due to the random nature of packet-marking, one cannot immediately ascertain if node addition or node deletion has occurred from the hop-count value of the marked packets. So, we need to consider both the possibilities jointly in our analysis. If a node with ID gets added to path , the value-pair of a new marked packet has information about encoded in it, provided it has traversed a sub-path containing node . Similarly, if node is removed from path , only those marked packets that traverse sub-paths that contained node prior to its deletion can provide information about .
Note that the number of marked packets required to detect a change (addition or deletion) in path is highest when the change occurs in the first position of the path, i.e., either when node gets deleted or a new node gets added before it. In such a situation, the marked packets that are useful in tracking this change are the ones that are marked by the first node and by no other node along the new path, which we call . Let denote the fraction of packets received by the destination and marked by the th node in path . Then, the fraction of marked packets originating at the first node along path is where is the fraction of unmarked packets. This implies that, from an average of new marked packets received by the destination after a change (addition or deletion in the path), marked packets with the highest hop-counts are likely to come from the node in the first position on path . In the following sections, we show that is sufficient to determine the ID, position and nature of the change in the path , given that the destination already has knowledge of the path .
Let us start with the assumption that a new node gets added at the th position in path (). Now, a marked packet with hop-count , where , contains information that includes the ID . Therefore, the value-pair for this packet can be rewritten as
is defined as in (3) and is defined as
for and for . Similarly, if node () is deleted from path , then a marked packet with hop-count , where contains value-pair such that
Depending on whether a node gets added or deleted in path , path has or nodes respectively. Note that, if there is no change in , we have . So, and can take three possible values: one is the unchanged and , and the other two values result from a change in (node addition and node deletion). Let and denote those values of and that maximize among these three choices. Suppose are the value-pairs of the marked packets with the highest hop-count values, say , among marked packets received by the destination. Then, by an expected/average value argument, these packets are marked by nodes close to node and possess information about the change in path . If for some , it means there has been node addition, but if , we cannot conclude anything and have to consider both the possibilities of node addition and node deletion. We propose the following incremental traceback algorithm for destination to determine the change in path :
Construct a matrix where
for and otherwise.
If there exists a unique row in , say the th row, such that all non-zero elements (there should be at least two non-zero elements) of the row are equal, declare that there is a new node added in th position with ID equal to the non-zero element value.
If there exists more than one row in with equal non-zero elements, declare that an error has occurred. Wait to get more value-pairs with high hop-count values through marked packets. Repeat (i), (ii) using these and some of the earlier value-pairs ( value-pairs in all).
If there exists no row in with equal non-zero elements, construct a matrix where
for and otherwise.
If there exists a unique row in , say the th row, such that all non-zero elements of the row are equal, declare that the node in th position has been deleted with ID equal to the non-zero element value.
If there exists more than one row in with equal non-zero elements, declare that an error has occurred. Wait to get more value-pairs with high hop-count values through marked packets. Repeat (iv), (v) using these and some of the earlier value-pairs ( value-pairs in all).
If there exists no row in with equal non-zero elements, declare that there has been no change in .
Theorem 3: Any change in path can be identified by destination using marked packets, containing information about the change encoded in them, and Algorithm III with a computational complexity of .
Proof: The cases of node addition and node deletion cannot return positive results simultaneously i.e., both and cannot have unique rows with their non-zero elements equal. Since the value-pairs from the marked packets are assumed to possess information about the change in , equality of all the elements, not the non-zero elements alone, of some row of or would confirm the change (from (8) and (9)). So, we need to show that, for node addition (node deletion), the existence of more than one row in () with equal elements is highly improbable for . Note that this is exactly what we have already established as part of the proofs of Theorems 1 and 2. Also, Algorithm III requires evaluating both and in the worst-case situation, each of which has a computational complexity of . This gives an overall complexity of . This completes our proof.
Thus, marked packets, with the information of path change encoded in them, and an average of marked packets in general, are sufficient to determine the correct change in topology of .
V-A Reducing the requirement on number of marked packets
In this section, we develop two schemes that enable us to reduce the average order of marked packets needed to perform probabilistic traceback. If , then , and
Since the quantity in (10) increases with , we have , which approaches as . So, if is chosen arbitrarily small, an average of marked packets are sufficient for determining any change in . However, a small implies a larger value for , and thus there is a tradeoff between the two parameters.
To reduce the average number of marked packets, we must attempt to make each of the values comparable to one another for this. One way this can be done is through requiring that the marking probability of a packet be dependent on the hop-count, i.e., higher the hop-count value of a packet, lesser is the probability that a node marks it. So, we have where is the hop-count of a packet and is a non-increasing function in . This gives and for . Next, we present two packet marking schemes with the aim of reducing the average number of marked packets needed for incremental probabilistic traceback.
V-A1 Scheme 1
We consider a constant and the following marking-probability function:
This gives , and
for . As , the quantity in (11) goes to . So, the average order of marked packets becomes for . Next, we substitute and get:
for . As increases, the numerator and denominator of (12) approach and respectively. This makes . Also i.e., about of the packets remain unmarked in this scheme.
V-A2 Scheme 2
We consider the same constant and the following marking-probability function:
This gives , and
for . As , the ratio in (13) goes to and the average number of marked packets in the system is for . Note that there is a tradeoff in the choice of - if it is small, then the fraction of unmarked packets is large. For and , we get
As varies from very small to , the quantity in (14) varies from to and the fraction of unmarked packets changes from close to to around .
Thus, with an intelligent choice of marking probabilities, we can reduce the overall network overhead incurred.
VI Traceback for Network Coding
In the previous sections, we have focused only on a single path with source node and destination . However, a general graph can have a multicast set-up with a source communicating to more than one destination. In such a situation, adopting schemes such as network coding can help increase the set of rates achievable by the sources in the network. We use the algebraic traceback framework in this paper to develop a non-incremental (and incremental) mechanism of performing traceback in network coded systems.
To better motivate our traceback mechanism, we start with a simple unicast communication setup without network coding. Here, one source communicates with only one destination through a number of paths (Sections I through V have considered the case where there is just one path that is being traced). Note that, for unicast communication, network coding is not required and the Ford-Fulkerson algorithm gives us routes that achieve capacity. For a network with unit capacity links and a mincut of , Ford-Fulkerson returns distinct paths from source to destination. We label these paths as and the goal of traceback is to determine the identities of the nodes involved along each path at the destination. Note that, if the network mincut is , the destination receives at least packets at every time instant. Here, we assume that the destination can determine which path a particular packet traversed. For example, if each path were along a different OFDM sub-channel (in a MANET), then our assumption implies that the destination can identify the sub-channel through which each packet is received. Now, both the non-incremental and incremental traceback schemes described in Sections III, IV and V can be performed on each of the ’s separately, and nodes along all paths between source and destination can be identified.
Next, consider a multicast setup where in-network coding is used. In other words, there are nodes which generate (random) linear combinations of packets which they receive, and forward these combinations. We desire to develop a marking scheme that will enable us to trace the path taken by the source packet even after being linearly combined at the intermediate node with other packets. To make our strategy concrete, we take the well-known ‘butterfly’ network as an example for our graph (Figure 2). Note that our traceback procedure is in no way limited to this butterfly network and can be generalized to other multicast networks employing network coding.
In Figure 2, is the source node and and are the destination nodes. The paths which are used by packets originating from to are , and from to are , for communicating with . Note that the min-cut for this network is 2 bits, so a rate of for both and is achievable using network coding. To develop our traceback procedure, consider the virtual network in Figure 2-b, where nodes and are split into two new node-pairs and . In this virtual network, the same rate of is achievable for both and without network coding. Moreover, Ford-Fulkerson (routing) is sufficient to achieve capacity, and a traditional algebraic packet marking scheme is sufficient to perform traceback. Thus, for the original network in Figure 2-a, we desire to “mimic” the virtual network in Figure 2-b. Say and are the value-pairs received by from and respectively. Then chooses one of the value-pairs with some probability, say , and updates it using its own ID , to get , where . To ensure that the same path is not chosen every time, node may change the probability of selection in every time-slot. When the chosen value-pair is received by the other nodes, the same policy as in traditional marking is followed. In this way, a destination can determine the paths to all the sources. For example, destination can determine the paths , and . Thus, every destination can recreate the network subgraph corresponding to the packets it observes.
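The selection step at a coding node can be sketched as follows. This is only an illustration under stated assumptions: the update is taken to be the Horner-style rule of the traditional algebraic marking scheme, and the prime `P`, the function name `coding_node_mark`, and the parameter `q` (probability of picking the first incoming value-pair) are hypothetical.

```python
import random

P = 257  # hypothetical prime field order, for illustration only


def coding_node_mark(pair_a, pair_b, node_id, q, rng, p=P):
    """Pick one incoming (x, y) value-pair, then apply the usual marking update.

    pair_a is chosen with probability q, pair_b otherwise. The Horner update
    y -> y*x + node_id (mod p) is an assumption borrowed from standard
    algebraic marking, not a formula taken from the text.
    """
    x, y = pair_a if rng.random() < q else pair_b
    return x, (y * x + node_id) % p


rng = random.Random(0)
# q may be changed from one time-slot to the next so that both upstream
# branches are eventually reported to the destinations.
marks = [coding_node_mark((3, 10), (5, 20), 42, q, rng) for q in (1.0, 0.0)]
```

With `q = 1.0` the node always forwards the first branch and with `q = 0.0` always the second, so varying `q` across time-slots lets a destination collect marks for every path through the coding node.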
VI-A Faulty/Malicious Nodes in Network-Coded Systems
As described above, a destination in a network-coded system traces a subgraph instead of a path traversed by a packet. Here, we describe an approach to identify a malicious/faulty node in such a network. We restrict our attention to the case in which a single node in the network is faulty or malicious; this approach can be extended to the more general case.
The broad idea is that routing can be performed in such a way that the subgraph traversed by packets from a set of sources to a given destination evolves over time. More precisely, if at time , the subgraph traversed by packets originating at sources and and ending at a destination is different from the subgraph traversed between sources and destination at time , then the intersection of and is small. So, if this subgraph evolves so that it differs across time-slots, then for each time-slot in which decoding fails (due to some node in the subgraph being malicious or faulty), the subgraph traversed during that time-slot can be isolated and intersected with the subgraphs of the other time-slots in which decoding failed. This enables the receiver to identify a small set of nodes (those in the intersection) as candidates for the faulty/malicious node.
The subgraph creation needs to be done carefully, so that every subgraphs (for some chosen ) have a nonempty but not too large intersection. We defer the details of such a construction to a future version of the paper.
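The intersection step described above can be sketched in a few lines of Python. The sketch assumes a single faulty node, that the destination has already traced the set of node IDs in each time-slot's subgraph, and that it knows which time-slots failed to decode. The function name, the `trusted` parameter, and the example node names are hypothetical.

```python
def candidate_faulty_nodes(slot_subgraphs, failed_slots, trusted=()):
    """Intersect the node sets of all time-slots in which decoding failed.

    slot_subgraphs: dict mapping time-slot -> iterable of node IDs traced
    failed_slots:   time-slots in which decoding failed
    trusted:        nodes excluded from suspicion (e.g. the endpoints)
    """
    failed = [set(slot_subgraphs[t]) for t in failed_slots]
    if not failed:
        return set()
    return set.intersection(*failed) - set(trusted)


# Hypothetical traced subgraphs for three time-slots; slots 0 and 1 failed.
slot_subgraphs = {
    0: {"s1", "a", "b", "t"},
    1: {"s1", "a", "c", "t"},
    2: {"s1", "d", "c", "t"},
}
suspects = candidate_faulty_nodes(slot_subgraphs, [0, 1], trusted={"s1", "t"})
```

Each additional failed time-slot can only shrink the candidate set, which is why the subgraphs need small but nonempty pairwise intersections.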
VII Numerical Results
In this section, we present some numerical results on the number of marked packets required to successfully perform algebraic traceback. We consider a network where the nodes have -bit long IDs. This means the order of the prime field, where the identities come from, should be greater than . We assume , which is the smallest prime greater than . Then for deterministic path encoding, for a dynamic path of length the number of marked packets needed for determining the path initially is . As derived in Section IV, the number of marked packets needed for determining the change in path , once its topology is known, is given by , where is a constant which determines the rate with which the (union) upper-bound of the probability of error decays with . We choose , which upper-bounds the probability of error by , which is approximately for our case. Figure 3 compares the number of marked packets needed for the usual non-incremental traceback and the incremental version for deterministic path encoding. As observed, the incremental version of traceback performs better: the number of marked packets is far smaller, and the number of marked packets needed also grows more slowly with increasing than in non-incremental traceback.
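The field-size choice above can be reproduced with a short helper. This is a sketch that assumes the field order is taken as the smallest prime exceeding two raised to the ID length in bits; the function names and the bit lengths used in the example are hypothetical and are not the values used for the figures.

```python
def is_prime(n):
    """Trial division; adequate for the modest ID lengths used in an example."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def smallest_prime_above(m):
    """Smallest prime strictly greater than m."""
    candidate = m + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate


def field_order_for_id_bits(bits):
    """Assumed rule: the field must have more than 2**bits elements."""
    return smallest_prime_above(2 ** bits)


# Hypothetical ID lengths, for illustration only.
orders = {bits: field_order_for_id_bits(bits) for bits in (8, 10, 16)}
```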
The average number of marked packets needed for randomized path encoding for both the non-incremental and incremental traceback versions is also shown in Figure 3. Here, we consider the case when the nodes mark packets independently of each other with probability . This gives and . The average number of marked packets needed by the conventional traceback is and the average number of packets needed by the incremental traceback is . In this case, the average number of marked packets needed for incremental traceback increases significantly compared to the deterministic path encoding case, but it is still less than the number needed by conventional randomized path encoding version of traceback.
We next analyze the performance of Schemes 1 and 2 (Section V-A) in reducing the average order of marked packets needed and compare it with the scheme in  i.e., where all nodes mark packets with the same probability (let us call this Scheme 0). For both Schemes 1 and 2, we assume i.e., once a node sees a marked packet of hop-count or more, it does not mark it. We consider for Schemes 0 and 1, for Scheme 2. Then for , the fractions of unmarked packets are and for Schemes 1 and 2 respectively, which seems reasonable. Figure 4 depicts the variation of with . Clearly, for Schemes 1 and 2 the value becomes a constant, while for Scheme 0 it continues to grow. Thus, Schemes 1 and 2 reduce the average order of the number of marked packets needed to perform traceback.
VIII Conclusion and Remarks
In this paper, we present a mechanism for performing incremental algebraic traceback in networks whose topology changes much more slowly than their rate of communication. We initialize the system using an established algebraic traceback mechanism, and then track the network as it evolves using an efficient incremental traceback mechanism. The decoding process is altered from that of a traditional traceback scheme: it actively searches the incoming packets for a change in network topology and, when one is detected, determines what the change is (insertion or deletion), where it has occurred in the network and what the ID, if any, of the inserted node is. We also show that, for the case with no ID spoofing among nodes, the resulting algorithm requires marked packets and a complexity of before it can declare success in determining the ID of the change in a path of nodes. Finally, we show by a straightforward argument that this packet overhead is order-wise optimal.
Note that our proof mechanisms closely resemble random coding proofs in information theory for discrete additive memoryless channels. Algorithms I through III can be viewed as “achievability” proofs from conventional information theory, while, in this case, the converse is straightforward. A final remark is that, when we swap the more stringent probability 1 (zero error) requirement for tracking the changing path in a dynamic network for an arbitrarily small error constraint, the time taken and the complexity of the incremental traceback algorithm decrease substantially.
-  A. Belenky and N. Ansari, “On IP Traceback,” IEEE Communications Magazine, Vol. 41, Issue 7, pp. 142-153, July 2003.
-  H. Burch and B. Cheswick, “Tracing Anonymous Packets to their Approximate Source,” Unpublished paper, Dec. 1999.
-  I.Y. Kim and K.C. Kim, “A Resource-Efficient IP Traceback Technique for Mobile Ad-hoc Networks Based on Time-Tagged Bloom Filter,” ACM International Conference on Convergence and Hybrid Information Technology (ICCIT), Vol. 2, pp. 549-554, 2008.
-  S. Savage, D. Wetherall, A. Karlin and T. Anderson, “Practical Network Support for IP Traceback,” ACM SIGCOMM, Aug. 2000.
-  D. Song and A. Perrig, “Advanced and Authenticated Marking Schemes for IP Traceback,” IEEE INFOCOM, Vol. 2, pp. 878-886, Apr. 2001.
-  S.M. Bellovin, M. Leech and T. Taylor, “The ICMP Traceback Message,” Internet draft available at http://www.cs.columbia.edu/smb/papers/draft-ietf-itrace-04.txt (work in progress), Oct. 2001.
-  D. Dean, M. Franklin and A. Stubblefield, “An Algebraic Approach to IP Traceback,” ACM Transactions on Information and System Security (TISSEC), Vol. 5, Issue 2, pp. 119-137, May 2002.
-  M. Adler, “Tradeoffs in Probabilistic Packet Marking for IP Traceback,” ACM Symposium on Theory of Computing (STOC), 2002.
-  A.C. Snoeren, C. Partridge, L.A. Sanchez, C.E. Jones, F. Tchakountio, S.T. Kent, and W.T. Strayer, “Hash-Based IP Traceback,” IEEE/ACM Transactions on Networking (TON), Vol. 10, Issue 6, Dec. 2002.
-  V.L.L. Thing and H.C.J. Lee, “IP Traceback for Wireless Ad-hoc Networks,” IEEE VTC, Vol. 5, pp. 3286-3290, Sept. 2004.
-  T.M. Cover and J.A. Thomas, Elements of Information Theory, 2nd Edition, Wiley Series in Telecommunications and Signal Processing.
-  T.H. Cormen, C.E. Leiserson, R.L. Rivest and C. Stein, Introduction to Algorithms, 2nd Edition, MIT Press and McGraw-Hill.