# Network error correction with unequal link capacities

###### Abstract

This paper studies the capacity of single-source single-sink noiseless networks under adversarial or arbitrary errors on no more than $z$ edges. Unlike prior papers, which assume equal capacities on all links, arbitrary link capacities are considered. Results include new upper bounds, network error correction coding strategies, and examples of network families where our bounds are tight. An example is provided of a network where the capacity is 50% greater than the best rate that can be achieved with linear coding. While coding at the source and sink suffices in networks with equal link capacities, in networks with unequal link capacities, it is shown that intermediate nodes may have to do coding, nonlinear error detection, or error correction in order to achieve the network error correction capacity.

Adversarial errors, Byzantine adversary, network coding, network error correction, nonlinear coding

## I Introduction

Network coding allows intermediate nodes in a network to mix the information content from different packets. This mixing can increase throughput and reliability in networks of error-free or stochastically failing channels [1, 2]. Unfortunately, it can also potentially increase the impact of malicious links or nodes that wish to corrupt data transmissions. A single corrupted packet, mixed with other packets in the network, can potentially corrupt all of the information reaching a particular destination. To combat this problem, network error correction was first studied by Yeung and Cai [3, 4] who investigated correction of errors in multicast network coding [1, 2, 5] on networks with unit-capacity links. In that work, the authors showed that for any network of unit-capacity links, the Singleton bound is tight and linear network error-correcting codes suffice to achieve the capacity, which equals $C-2z$, where $C$ is the min-cut of the network and $z$ is a bound on the number of corrupted links [4, Theorem 4]. The problem of network coding under Byzantine attack was also investigated in [6], which gave an approach for detecting adversarial errors under random network coding. Construction of codes that can correct errors up to the full error-correction capability specified by the Singleton bound was presented in [7]. A variety of alternative models of adversarial attack and strategies for detecting and correcting such errors appear in the literature. Examples include [8, 9, 10, 11, 12, 13, 14, 15].

Specifically, the network error correction problem concerns reliable information transmission in a network with an adversary who arbitrarily corrupts the packets sent on some set of links. The location of the adversarial links is fixed for all time but unknown to the network user. We define a $z$-error correcting code for a single-source and single-sink network to be a code that can recover the source message at the sink node if there are at most $z$ adversarial links in the network. The $z$-error correcting network capacity, henceforth simply called the capacity, is the supremum over all rates achievable by $z$-error correcting codes.

In this work, we consider network error correction when links in the network may have unequal link capacities. (A related model, where adversaries control a fixed number of nodes rather than a fixed number of edges was studied in [16], independently and concurrently with our initial conference paper [17].) The unequal link capacity problem is substantially different from the equal link capacity problem studied by Yeung and Cai in [3, 4] since the rate controlled by the adversary varies with his edge choice. In the error-free case, any link in the network with capacity can be represented by edges of capacity one without loss of generality. However, in the case with errors there is a loss of generality in using a similar representation and assuming that errors have uniform rate, since this does not capture potential trade-offs that the adversary faces in choosing whether to attack strategically positioned or larger capacity links. The error-correction capacity in the equal link capacity case has a simple cut-set characterization since the adversary always finds it optimal to attack links on a minimum cut; as a result, coding only at the source and forwarding at intermediate nodes suffices to achieve the capacity for any single-source and single-sink network. In contrast, for networks with unequal link capacities, we show that network error correction coding operations at intermediate nodes are needed even in the single-source single-sink case.

The cut-set approach is a simple yet powerful tool for bounding the capacity of a large network. This approach partitions the nodes into two subsets, say and , and then bounds the rate that can be transmitted from nodes in to nodes in . (See, for example, [18, Section 15.10].) The maximum information transmission across the “cut” occurs when the nodes within can collaborate perfectly among themselves and the nodes within can collaborate perfectly among themselves. In this case, and each act as “super-nodes” in a simple point-to-point network. All that is needed for collaboration is sufficient information exchange among the nodes on each side of the cut. Thus, the “cut-set bound” equals that rate that would be achieved in transmitting information from to if we added reliable, infinite-capacity links between each pair of nodes in and reliable, infinite-capacity links between each pair of nodes in , as shown in Figure 1. Given a network of capacitated error-free links with a source node and a sink node , minimizing over all choices of that contain but exclude gives a tight bound on the unicast capacity from to [19].
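
As a small illustration of the error-free cut-set bound described above, the following sketch enumerates every partition that keeps the source on one side and the sink on the other, and minimizes the total forward capacity. The function name `min_cut_capacity` and the four-node example network are hypothetical and are not taken from the paper; the brute-force enumeration is only meant for very small networks.

```python
from itertools import combinations

def min_cut_capacity(nodes, capacity, source, sink):
    """Brute-force cut-set bound for an error-free network.

    capacity maps a directed edge (u, v) to its capacity. Every node set
    that contains `source` and excludes `sink` is tried, and the smallest
    total capacity of edges leaving that set is returned.
    """
    others = [v for v in nodes if v not in (source, sink)]
    best = float("inf")
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            side = {source, *extra}
            # Only forward edges (source side -> sink side) count.
            forward = sum(c for (u, v), c in capacity.items()
                          if u in side and v not in side)
            best = min(best, forward)
    return best

# Hypothetical four-node network with unequal link capacities.
example_capacity = {
    ("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1,
    ("a", "t"): 2, ("b", "t"): 3,
}
example_min_cut = min_cut_capacity(["s", "a", "b", "t"], example_capacity, "s", "t")
print(example_min_cut)
```

Note that edges pointing from the sink side back to the source side are ignored here, which is exactly the simplification that stops being harmless once adversarial errors are present.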

In contrast, this traditional cut-set bounding approach is not tight in general when it comes to the error-correction capacity of networks with unequal link capacities, even in the case of unicast demands. In this case, two new issues arise. We next describe each of these issues in turn.

The first issue concerns the role of feedback across – i.e. links from to . While feedback never increases the capacity across a cut in a network of reliable links, it can increase the error-correction capacity. Intuitively, this is because feedback allows us to inform nodes in about what was received by nodes in , thereby aiding in the discovery of adversarial links. (Footnote 1: This process of discovery is complicated by the fact that the feedback links themselves may be corrupted, but feedback is, nonetheless, clearly useful.) Treating all nodes in as one super-node and all nodes in as another super-node, as in the traditional cut-set bounding approach, makes all feedback information available to all nodes in and all feedforward information available to all nodes in . This may give them considerably more insight into the adversary’s location than is available to them in the original network.

We can obtain tighter bounds by taking into account limitations on which nodes in can influence the values on each feedback edge and which nodes in have access to the feedback information. This is important in the unequal link capacity case, because it captures trade-offs faced by the adversary in choosing whether to attack links based on their capacity or whether they are upstream of feedback links that may give clues about the adversary’s actions. Specifically, given an acyclic network , we construct an acyclic network by adding a reliable infinite capacity connection from a node to a node only if contains a directed path from node to node via nodes in , and adding a reliable infinite capacity connection from a node to a node only if contains a directed path from node to node via nodes in . Figure 1(c) shows an example. Limiting the added connections in this way creates what we call a “zig-zag” network, as shown in Figure 1. We draw only those nodes in and with incoming or outgoing edges that cross between and , and draw the nodes on each side of the cut in topologically increasing order. The “forward” edges across the cut point downwards in the diagram, while “feedback” edges point upwards. By studying the capacity of these zig-zag networks, we develop upper bounds on error-correction capacity that apply to general acyclic networks. We also illustrate the usefulness of these bounds by giving examples where they improve upon previously known bounds and showing that they are tight for families of networks that are special cases of zig-zag networks.

The second issue with the cut-set approach to bounding network capacities is the notion of a cut itself. Reference [20] shows, for the more general case where only a subset of links are potentially adversarial, the existence of networks for which no partition yields a tight bound on the error-correction capacity. This is proven by example using a network whose minimal cut (which has no feedback links) yields a capacity bound that is proven to be unachievable. As a result, knowledge of the capacity of the network’s minimal cut is insufficient to determine the capacity of all possible networks, and we cannot hope to derive cut-set bounds that are tight in general. Nonetheless, given the complexity of taking into account the full network topology, we proceed to study the cut-set approach, deriving general bounds and demonstrating that those bounds are tight in some cases.

Specifically, in Section III we begin with the cut-set upper bound given by the capacity of the two-node network shown in Fig. 2, which is the only cyclic network we consider in this paper. In this network, the source node can transmit packets to the sink node along the forward links and the sink node can send information back to the source node along the feedback links. As mentioned above, this cut-set bound can be quite loose since it assumes that all feedback is available to the source node and all information crossing the cut in the forward direction is available to the sink. We therefore develop a new cut-set upper bound for general acyclic networks by taking into account more details of the topological relationships among links on the cut, as in the zig-zag network construction shown in Figure 1.

In Section IV, we consider a variety of linear and nonlinear coding strategies useful for achieving the capacity of various example networks. We prove the insufficiency of linear network codes to achieve the capacity by providing an example of a network where the capacity is 50% greater than the linear coding capacity and is achieved using a nonlinear error detection strategy. A similar example for the problem with Byzantine attack on nodes rather than edges appears in [16]. We also give examples of single-source and single-sink networks for which intermediate nodes must perform coding, nonlinear error detection or error correction in order to achieve the network capacity. We describe a simple greedy algorithm for error correction at intermediate nodes. We then introduce a new coding strategy called “guess-and-forward.” In this strategy, an intermediate node which receives some redundant information from multiple paths guesses which of its upstream links are controlled by the adversary. The intermediate node forwards its guess to the sink, which tests the hypothesis of the guessing node. In Section V, we show that guess-and-forward achieves network capacity on the two-node network with feedback links of Fig. 2, as well as the family of four-node acyclic networks in Fig. 3 when the capacity of each feedback link is not too small (i.e. above a value given by a linear optimization). (Footnote 2: After the submission of this paper, we obtained a new result that improves upon the bound in Section III for the special case of small-capacity feedback links. We mention the idea briefly as a footnote in Section III and will present it formally in an upcoming paper.) Finally, we apply the guess-and-forward strategy to zig-zag networks, deriving achievable rates and presenting conditions under which our upper bound is tight. We conclude in Section VI with a discussion of future work.

## II Preliminaries

Consider a directed acyclic communication network with unequal link capacities. Let denote the capacity of edge . A source node transmits information to a sink node over the network . Transmissions occur on the links according to their topological order, i.e. a link transmits after all its incident incoming links, and we regard a link error as being applied upon transmission. A link (or node) is said to be upstream of another link (or node) iff there is a directed path starting from the former and ending with the latter. A link (or node) is said to be downstream of another link (or node) iff there is a directed path starting from the latter and ending with the former.
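
The upstream and downstream relations defined above can be checked with a simple reachability search. The sketch below is only an illustration: the helper names `reachable_nodes` and `link_is_upstream` are hypothetical, links are represented as `(tail, head)` pairs, and the network is assumed to be acyclic as in the text.

```python
def reachable_nodes(edges, start):
    """Return all nodes reachable from `start` along at least one directed edge."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
    seen, stack = set(), list(adjacency.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(adjacency.get(node, []))
    return seen

def link_is_upstream(edges, link_a, link_b):
    """True if some directed path starts with link_a and ends with link_b."""
    head_a, tail_b = link_a[1], link_b[0]
    if link_a == link_b:
        return False
    # Either link_b leaves the node that link_a enters, or that node
    # can reach the tail of link_b through further edges.
    return head_a == tail_b or tail_b in reachable_nodes(edges, head_a)

def link_is_downstream(edges, link_a, link_b):
    """True if link_a is downstream of link_b."""
    return link_is_upstream(edges, link_b, link_a)
```

For example, with edges `("s","a")`, `("a","b")`, `("b","t")`, the link `("s","a")` is upstream of `("b","t")`, and `("b","t")` is downstream of `("s","a")`.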

In this paper, we consider the problem of correcting arbitrary adversarial errors on up to $z$ links. The location of error links is fixed for all time but unknown to the network user.

###### Definition 1

A network code is -error link-correcting if the source message can be recovered by the sink node provided that the adversary controls at most links. Thus a -error link-correcting network code can correct any adversarial links for .

Let be a partition of , and define the cut for the partition by

The cut separates nodes and if and . We use to denote the set of cuts between and . Given a cut , we call any link in a forward link, and we call any link from to a feedback link.
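
To make the distinction between forward and feedback links concrete, the following sketch splits the edges crossing a partition into the two groups. The function name `classify_cut_links` and the argument `source_side` (the node subset containing the source) are hypothetical names introduced for illustration.

```python
def classify_cut_links(edges, source_side):
    """Split the edges that cross a partition into forward and feedback links.

    edges is an iterable of (tail, head) pairs and source_side is the set of
    nodes on the source's side of the partition.
    """
    source_side = set(source_side)
    # Forward links leave the source side; they form the cut itself.
    forward = [(u, v) for u, v in edges
               if u in source_side and v not in source_side]
    # Feedback links point from the sink side back to the source side.
    feedback = [(u, v) for u, v in edges
                if u not in source_side and v in source_side]
    return forward, feedback
```

Edges with both endpoints on the same side of the partition appear in neither list.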

For the achievable strategies in Sections IV and V, we assume that coding occurs in the finite field for some prime power . An error on any link is specified by a vector containing symbols in . The output of link equals the sum in of the input to link and the error applied to link , i.e., . We say that an error occurs on the link if .
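
The additive error model above can be sketched in a few lines. For simplicity this illustration assumes the field size is a prime, so that field addition is integer addition modulo `Q`; a general prime power would need extension-field arithmetic. The constant `Q` and the function names are hypothetical.

```python
Q = 7  # assumed prime field size, chosen only for illustration

def link_output(link_input, error):
    """Output of a link: symbol-wise sum of its input and the error, in F_Q."""
    return [(x + e) % Q for x, e in zip(link_input, error)]

def error_occurs(error):
    """An error occurs on a link when its error vector is nonzero."""
    return any(e % Q != 0 for e in error)

def error_to_force(link_input, target_output):
    """Error vector an adversary would add to turn the input into target_output."""
    return [(t - x) % Q for x, t in zip(link_input, target_output)]
```

The last helper reflects why the model captures arbitrary corruption: for any input and any desired output there is an error vector that produces it.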

As in [3, 4], we can consider a linear network code that assigns a set of vectors , called global coding vectors, to each link in the network. Let

denote the error-free output of link when the network input is where denotes the inner product of row vectors and . We use to denote the vector of errors on the entire network. The output of a link is a function of both the network input and the error vector , which we denote by . For each node , we use and to denote the sets of incoming and outgoing edges respectively for node . With this notation, a sink node cannot distinguish between the case where is the network input and error occurs and the case where is the network input and error occurs if and only if

(1)

Let denote the number of links in which an error occurs. We say that any pair of input vectors and are -links separable at sink node if (1) does not hold for any pair of error vectors and such that and . Lemma 1 of [4] establishes the linear properties of for networks with unit link capacities. This result extends directly to networks with arbitrary link capacities.

###### Lemma 1

For all , all network inputs and , error vectors and , and ,

and

From Lemma 1,

where for any link . Thus can be written as the sum of a linear function of and a linear function of .

## III Upper bounds

In this section, we consider upper bounds on network error correction capacity. Let denote the source alphabet and the size of the (arbitrary) link alphabet. The corresponding network transmission rate is given by

We first derive the cut-set upper bound obtained from coalescing all nodes on each side of the cut into a super-node, resulting in a two-node network as shown in Fig. 2.

###### Lemma 2

Consider the two-node network shown in Fig. 2 with arbitrary link capacities. Let denote the sum of the smallest forward link capacities. The network error correction capacity of this network is upper bounded by

Case 1) .

Suppose that and we show a contradiction. Since , there are two codewords and in that can be sent reliably. When is sent along the forward links and the leftmost links are adversarial, the adversary changes to so that the outputs of the leftmost links of are the same as that of . Similarly, when is sent along the forward links and the rightmost links are adversarial, the adversary changes to so that the outputs of the rightmost links of are the same as that of . Then the two codewords cannot be distinguished and this contradicts .

Case 2) .

When the sink knows adversarial links are the largest capacities forward links, the maximum achievable capacity is . When and all feedback links are adversarial, there are adversarial forward links whose locations are unknown. In this scenario, we show that the best achievable rate is , which is the sum of smallest forward link capacities. We assume that , and show that this leads to a contradiction. denotes the set of forward links such that the links indexed in increasing capacity order, i.e., . Since and is sum of the smallest forward link capacities, there exist two distinct codewords such that . So we can write

where denotes the error-free vector of symbols on when codeword is transmitted.

We can construct -error links that changes to the value as follows. We apply an error of value ( on links for . Since this does not change the output value of other links, we obtain . For , we can follow a similar procedure to construct error links that change the value of to . Thus, sink node cannot reliably distinguish between the source symbol and , which gives a contradiction.

Therefore, the upper bound on achievable capacity is .

In Section V we show that this bound is the actual capacity of the two-node network. Thus, the super-node construction gives the following cut-set upper bound for general acyclic networks.

###### Lemma 3

Given any cut with forward links and feedback links, let denote the sum of the smallest forward link capacities. The network error correction capacity is upper bounded by

However, we can show that the above upper bound is not tight using the following generalized Singleton bound, which was presented in our conference paper [17]. A similar upper bound for the problem of adversarial attack on nodes rather than edges was given in independent work [16].

###### Definition 2

Any set of links on a cut is said to satisfy the downstream condition on if none of the links in are downstream of any link in .

###### Lemma 4

(A generalized Singleton bound) Consider any -error correcting network code with source alphabet in an acyclic network . Consider any set consisting of links on a source-sink cut that satisfies the downstream condition on . Let be the total capacity of the links in . Then

The proof is similar to that of the network Singleton bound for the equal link capacity case in [3]. We assume that , and show that this leads to a contradiction.

Given a cut , denotes the number of links in . For brevity, let where and links in are ordered topologically, i.e., is not downstream of for any . Since and is the capacity of , there exist two distinct codewords such that . So we can write

where denotes the error-free vector of symbols on when codeword is transmitted.

We will show that it is possible for the adversary to produce exactly the same outputs on all the channels in when errors occur on at most links in .

Assume that the true network input is . The adversary will inject errors on links in this order as follows. First the adversary applies an error on link to change the output from to . The output of links may be affected by this change, but the outputs of links will not. Let and denote the outputs of links and , respectively after the adversary has injected errors on link , where with . Then the adversary injects errors on link to change its output from to . This process continues until the adversary finishes injecting errors on links and the output of this channel changes from to . Now suppose the input is . We can follow a similar procedure by injecting errors on links . Then the adversary can produce the outputs

Thus, sink node cannot reliably distinguish between the source symbol and , which gives a contradiction.

Consider the example four-node network shown in Fig. 4. When , the two-node bound of Lemma 3 gives the upper bound 22. The generalized Singleton bound gives upper bound 2.

However, the generalized Singleton bound is also not tight. Building on ideas from the above bounds, we proceed to derive tighter bounds.

Let denote the set of feedback links across cut . Given a set of feedback links and a set of forward links , we use to denote the upper bound obtained from Lemma 4 (generalized Singleton bound) when evaluated for adversarial links on the cut after erasing and from the graph . Let

Then we define as follows.

For instance, consider the 2-layer zig-zag network in Fig. 6. If , and , by choosing , , and removing in the application of the Singleton bound after erasing and . By taking the minimum over and , we can show that .

###### Lemma 5

(Cut-set upper bound 1) Consider any -error correcting network code with source alphabet in an acyclic network.

For any cut , the adversary can erase a set of feedback links and a set of forward links where and . Applying Lemma 4 on after erasing and gives the upper bound . By taking the minimum over all cuts , we obtain the above bound.

The following examples illustrate how the bound in Lemma 5 tightens the generalized Singleton bound. We first consider a four-node acyclic network as shown in Fig. 5. In each example, unbounded reliable communication is allowed from source to its neighbor on one side of the cut and from node to sink on the other side of the cut. There are feedback links with arbitrary capacities from to .

When we compute the generalized Singleton bound, for any cut , we choose and erase links in the cut such that none of the remaining links in the cut are downstream of the chosen links. Then we sum the remaining link capacities and take the minimum over all cuts. Because of the downstream condition, when the link capacities between and are much larger than the link capacities between and , the Singleton bound may not be tight. For example, in the network in Fig. 5 (a), if , then the generalized Singleton bound gives upper bound 20. However, when the adversary declares that he will use two forward links between and , we obtain the erasure bound 4.
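
The procedure just described can be sketched for a single cut as follows. This is an illustrative brute-force search, not an efficient algorithm. It assumes that the number of erased links is twice the number of adversarial links, which is consistent with the Fig. 6 example in this section, where two adversarial links correspond to four erased links. The function name, the `downstream_of` map, and the five-link example are hypothetical.

```python
from itertools import combinations

def generalized_singleton_bound(capacity, downstream_of, z):
    """Evaluate the generalized Singleton bound on one cut by enumeration.

    capacity maps each link on the cut to its capacity. downstream_of maps a
    link to the set of cut links that are downstream of it. Every set of 2*z
    links is tried; a set is admissible when none of the remaining links is
    downstream of a chosen link. Returns the smallest remaining capacity over
    admissible sets, or None if no admissible set exists.
    """
    links = list(capacity)
    best = None
    for chosen in combinations(links, 2 * z):
        chosen_set = set(chosen)
        remaining = [l for l in links if l not in chosen_set]
        # All cut links that lie downstream of some chosen link.
        affected = set().union(*(downstream_of.get(l, set()) for l in chosen))
        if any(l in affected for l in remaining):
            continue  # downstream condition violated
        total = sum(capacity[l] for l in remaining)
        best = total if best is None else min(best, total)
    return best

# Hypothetical cut: two large links whose outputs can influence three small ones.
example_capacity = {"e1": 5, "e2": 5, "e3": 1, "e4": 1, "e5": 1}
example_downstream = {"e1": {"e3", "e4", "e5"}, "e2": {"e3", "e4", "e5"}}
example_bound = generalized_singleton_bound(example_capacity, example_downstream, z=1)
```

In this hypothetical cut the two large links cannot be chosen, because the small links that would remain are downstream of them, so the bound stays well above the value obtained by simply removing the two largest capacities. Taking the minimum over all cuts, as the text describes, would wrap this function in an outer loop over partitions.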

As another example, consider the network in Fig. 5 (b) when . Applying the generalized Singleton bound gives upper bound 16. If the adversary erases one of the forward links between and and we apply the generalized Singleton bound on the remaining network, then our upper bound is improved to 15. The intuition behind this example is that when the adversary erases large-capacity links which do not satisfy the downstream condition, applying the generalized Singleton bound on the remaining network with adversarial links can give a tighter bound.

For the 2-layer zig-zag network in Fig. 6, when , the min-cut is 37 and the generalized Singleton bound gives upper bound 27. Suppose that the adversary declares that he will use the feedback link between and and the forward link with capacity 6 between and . By applying the generalized Singleton bound on the remaining network with two adversarial links, we obtain 37-6-(3+3+3+3)=19. The intuition behind this example is that the links between and and the links between and have the same topological order once the single feedback link between and is erased. Since the generalized Singleton bound is obtained by erasing links on the cut such that none of the remaining links on the cut is downstream of any erased links, erasing the single feedback link between and yields a tighter Singleton bound even with fewer adversarial links. Moreover, before applying the Singleton bound, we first erase the link with capacity 6, which is the largest link between and , as we did in the example of Fig. 5(b).

Next, we introduce another upper bounding approach which considers confusion between two possible sets of adversarial links, each containing some forward links as well as the corresponding downstream feedback links required to prevent error propagation. Consider any cut and sets . We say that a feedback link is directly downstream of a forward link (and that is directly upstream of ) if there is a directed path starting from and ending with that does not include other links in or . Let be the set of links in which are directly downstream of a link in and upstream of a link in . Let be the set of links in which are directly downstream of a link in and upstream of a link in .

###### Lemma 6

(Cut-set upper bound 2) Let denote the total capacity of the remaining links on . If for , then

We assume that , and show that this leads to a contradiction. Let denote the number of links on the cut . Since , from the definition of , there exist two distinct codewords such that error-free outputs on the links in are the same. Let and . Then we can write

where denotes the error-free outputs on the links in for and ; and denote the error-free outputs on the links in for and respectively; and and denote the error-free outputs on the links in for and respectively. We will show that it is possible for the adversary to produce exactly the same outputs on all the channels in under and when errors occur on at most links. When codeword is sent, we use to denote the error-free symbols on feedback link .

Assume the input of network is . The adversary chooses feedback links set and forward links set as its adversarial links. First the adversary applies errors on to change the output from to for and to cause each feedback link to transmit . Since all feedback links which are directly downstream of a link in and upstream of a link in transmit the error-free symbols, the outputs on links in are not affected. The outputs on links in may be affected, and we denote their new values by . Thus, the sink observes .

When codeword is transmitted, the adversary chooses feedback links set and forward links set as its adversarial links. The adversary applies errors on them to change to and to cause each feedback link to transmit . Since all feedback links which are directly downstream of a link in and upstream of a link in transmit the error-free symbols, the outputs on any other links are not affected. Therefore, the output is changed from to . Thus, the sink node cannot reliably distinguish between the codewords and , which gives a contradiction.

Given a cut , we consider all possible sets on satisfying the condition of Lemma 6. We choose sets among them that have the maximum total link capacities and define to be the sum of the capacities of the links in . This gives the upper bound

The following example shows that we can obtain a tighter upper bound using Lemma 6. For the example network in Fig. 7, when , Lemma 5 gives upper bound 9. However, Lemma 6 gives a tighter upper bound 8 when , and .

Now we derive a generalized cut-set upper bound that unifies Lemma 5 and Lemma 6. Given a cut , consider a set of forward links and a set of feedback links such that . Let denote the upper bound obtained from Lemma 6 when evaluated for adversarial links on the cut after erasing and from the original graph . Then

is an upper bound on the error correction capacity of . This includes the bound of Lemma 5 as a special case, since the generalized Singleton bound is a special case of the upper bound in Lemma 6 corresponding to the case where is a set of links satisfying the downstream condition. It is also clear that is the same as the bound in Lemma 6 when . Note however that any bound obtainable with a nonempty set of erased feedback links is also obtainable by including those links in the sets and of Lemma 6 instead of erasing them. Thus, we define

and state our upper bound as follows. (Footnote 3: After submitting this paper, we found a way to tighten the above bound for the case of small feedback link capacity. Briefly, the key idea is to note that instead of choosing all the links in as adversarial links as in the proof of Lemma 6, another possibility is to choose only a subset as adversarial links, as long as the values on links in and links in that are directly upstream of links in are the same under the two confusable codewords and . The capacities of these links then appear as part of the upper bound; thus, this bound is useful for cases where feedback links have small capacity. This result will be presented formally in an upcoming paper.)

###### Theorem 1

(A generalized cut-set upper bound) Consider any -error correcting network code with source alphabet in an acyclic network. Then

## IV Coding strategies

We consider a variety of linear and nonlinear coding strategies useful for achieving the capacity of various example networks. We show the insufficiency of linear network codes for achieving the capacity in general. We also demonstrate examples of networks with a single source and a single sink where, unlike the equal link capacity case, it is necessary for intermediate nodes to do coding, nonlinear error detection or error correction in order to achieve the capacity. We then introduce a new coding strategy, guess-and-forward.

### IV-A Error detection at intermediate nodes and insufficiency of linear codes

Here we show that there exists a network where the capacity is 50% greater than the best rate that can be achieved with linear coding. We consider the single source and the single sink network in Fig. 8, where source aims to transmit the information to a sink node . We index the links and assume the capacities of links as shown in Fig. 8. For a single adversarial link, our upper bound from Theorem 1 is 2.

###### Lemma 7

For the network in Fig. 8 with a single adversarial link, rate 2 is asymptotically achievable with a nonlinear error detection strategy, whereas scalar linear network coding achieves at most 4/3.

We first illustrate the nonlinear error detection strategy as follows. Source wants to transmit two packets . We send them in channel uses, but each packet has only bits. We use one bit as a signaling bit. We send down all links in the top layer. In the middle layer, we do the following operations:

1. Send the linear combination of and , , down link .

2. Send down both links and .

3. Send down both links and .

4. Send a different linear combination of and , , down link .

At the bottom layer, we do the following operations:

1. Forward the received packet on link .

2. Send a 1 followed by on link if the two copies of match, send a 0 otherwise.

3. Send a 1 followed by on link if the two copies of match, send a 0 otherwise.

4. Forward the received packet on link .
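
The comparison performed in steps 2 and 3 of the bottom layer can be sketched as follows. This is a minimal illustration under assumptions of our own: packets are modeled as lists of bits, and the function names `detect_and_forward` and `read_flagged_link` are hypothetical. The sink-side helper only parses the signaling bit; it does not by itself protect against the bottom-layer link being the adversarial one.

```python
def detect_and_forward(copy_a, copy_b):
    """Bottom-layer operation at a node that receives two copies of one packet.

    If the copies agree, send signaling bit 1 followed by the packet.
    If they disagree, an upstream error has been detected, so send only 0.
    """
    if list(copy_a) == list(copy_b):
        return [1] + list(copy_a)
    return [0]

def read_flagged_link(received):
    """Sink-side parsing: return the packet, or None if nothing was sent down."""
    if received and received[0] == 1:
        return received[1:]
    return None
```

The operation is nonlinear because the transmitted packet depends on an equality test between the two inputs rather than on a fixed linear combination of them. The signaling bit occupies one channel use per block, which is why each packet is one bit shorter than the block length.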

We can show that the above nonlinear error detection strategy allows the sink node to decode (). Suppose that and are independent. Then the coding vectors on any two links on the bottom layer are independent, so they satisfy the MDS (maximum distance separable) property. If nothing was sent down both and , the decoder can recover from the information received on links and . If nothing was sent down only on , then the outputs of and should not be corrupted and the decoder can recover . Similarly, the decoder can decode correctly when nothing was sent down only on . If all the links in the bottom layer received symbols, there is at most one erroneous link on the bottom layer, whose links carry an MDS code. Thus we can achieve rate with this error detection strategy.

Now we show that a scalar linear network code can achieve at most rate 4/3. Suppose that we want to achieve the linear coding capacity by transmitting symbols reliably by using a scalar linear network code over the finite field in rounds. To show the insufficiency of linear coding for achieving this capacity, from (1), it is sufficient to prove that there exist pairs and for a linear network code such that

and . Since the above equation is equivalent to

by linearity, it suffices to find a source vector and error vector such that and

(2)

where is the source alphabet. We will show that there exists satisfying the above equation when errors occur on the links and in error vector .

Let denote the transfer matrix between and in the rounds. Its rows are the global coding vectors assigned on , , , and in the rounds. Note that to transmit symbols reliably, should have rank .

Let and denote the transfer matrices between and , and between and during rounds respectively. To transmit symbols reliably, both and should have rank at least , i.e., and . Otherwise, when the adversarial link is on the top layer, the maximum achievable rate is at most from the data processing inequality, which gives a contradiction.

Let and denote the errors occurring on links and , respectively. Error propagates to and , and error propagates to and .

From (2), we have the following set of equations

Since and ,