# Network error correction with unequal link capacities

Sukwon Kim, Tracey Ho, Michelle Effros, and Amir Salman Avestimehr

Sukwon Kim, Tracey Ho, and Michelle Effros are with the Department of Electrical Engineering, California Institute of Technology, Pasadena, CA 91125, USA, e-mail: {sukwon,tho,effros}@caltech.edu. A. S. Avestimehr is with the School of Electrical and Computer Engineering, Cornell University, Ithaca, NY, 14853, USA, e-mail: avestimehr@ece.cornell.edu.

This work was supported in part by subcontract #069153 issued by BAE Systems National Security Solutions, Inc. and supported by the Defense Advanced Research Projects Agency (DARPA) and the Space and Naval Warfare System Center (SPAWARSYSCEN), San Diego under Contract No. N66001-08-C-2013, NSF grant CNS 0905615, and Caltech’s Lee Center for Advanced Networking. Part of this work was performed while A. S. Avestimehr was with the Center for Mathematics of Information, Caltech. The work of A. S. Avestimehr was partly supported by NSF CAREER award 0953117.
###### Abstract

This paper studies the capacity of single-source single-sink noiseless networks under adversarial or arbitrary errors on no more than z edges. Unlike prior papers, which assume equal capacities on all links, arbitrary link capacities are considered. Results include new upper bounds, network error correction coding strategies, and examples of network families where our bounds are tight. An example is provided of a network where the capacity is 50% greater than the best rate that can be achieved with linear coding. While coding at the source and sink suffices in networks with equal link capacities, it is shown that in networks with unequal link capacities, intermediate nodes may have to perform coding, nonlinear error detection, or error correction in order to achieve the network error correction capacity.


## I Introduction

Network coding allows intermediate nodes in a network to mix the information content from different packets. This mixing can increase throughput and reliability in networks of error-free or stochastically failing channels [1, 2]. Unfortunately, it can also potentially increase the impact of malicious links or nodes that wish to corrupt data transmissions. A single corrupted packet, mixed with other packets in the network, can potentially corrupt all of the information reaching a particular destination. To combat this problem, network error correction was first studied by Yeung and Cai [3, 4], who investigated correction of errors in multicast network coding [1, 2, 5] on networks with unit-capacity links. In that work, the authors showed that for any network of unit-capacity links, the Singleton bound is tight and linear network error-correcting codes suffice to achieve the capacity, which equals m − 2z, where m is the min-cut of the network and z is a bound on the number of corrupted links [4, Theorem 4]. The problem of network coding under Byzantine attack was also investigated in [6], which gave an approach for detecting adversarial errors under random network coding. Construction of codes that can correct errors up to the full error-correction capability specified by the Singleton bound was presented in [7]. A variety of alternative models of adversarial attack and strategies for detecting and correcting such errors appear in the literature. Examples include [8, 9, 10, 11, 12, 13, 14, 15].

Specifically, the network error correction problem concerns reliable information transmission in a network with an adversary who arbitrarily corrupts the packets sent on some set of z links. The location of the adversarial links is fixed for all time but unknown to the network user. We define a z-error correcting code for a single-source and single-sink network to be a code that can recover the source message at the sink node if there are at most z adversarial links in the network. The z-error correcting network capacity, henceforth simply called the capacity, is the supremum over all rates achievable by z-error correcting codes.

The cut-set approach is a simple yet powerful tool for bounding the capacity of a large network. This approach partitions the nodes into two subsets, say S and S^c, and then bounds the rate that can be transmitted from nodes in S to nodes in S^c. (See, for example, [18, Section 15.10].) The maximum information transmission across the “cut” occurs when the nodes within S can collaborate perfectly among themselves and the nodes within S^c can collaborate perfectly among themselves. In this case, S and S^c each act as “super-nodes” in a simple point-to-point network. All that is needed for collaboration is sufficient information exchange among the nodes on each side of the cut. Thus, the “cut-set bound” equals the rate that would be achieved in transmitting information from S to S^c if we added reliable, infinite-capacity links between each pair of nodes in S and reliable, infinite-capacity links between each pair of nodes in S^c, as shown in Figure 1. Given a network of capacitated error-free links with a source node s and a sink node t, minimizing over all choices of S that contain s but exclude t gives a tight bound on the unicast capacity from s to t [19].

In contrast, this traditional cut-set bounding approach is not tight in general when it comes to the error-correction capacity of networks with unequal link capacities, even in the case of unicast demands. In this case, two new issues arise. We next describe each of these issues in turn.

The first issue concerns the role of feedback across the cut, i.e. links from S^c to S. While feedback never increases the capacity across a cut in a network of reliable links, it can increase the error-correction capacity. Intuitively, this is because feedback allows us to inform nodes in S about what was received by nodes in S^c, thereby aiding in the discovery of adversarial links. (This process of discovery is complicated by the fact that the feedback links themselves may be corrupted, but feedback is, nonetheless, clearly useful.) Treating all nodes in S as one super-node and all nodes in S^c as another super-node, as in the traditional cut-set bounding approach, makes all feedback information available to all nodes in S and all feedforward information available to all nodes in S^c. This may give them considerably more insight into the adversary’s location than is available to them in the original network.

However, the second issue with the cut-set approach to bounding network capacities is the notion of a cut itself. Reference [20] shows, for the more general case where only a subset of links are potentially adversarial, the existence of networks for which no partition yields a tight bound on the error-correction capacity. This is proven by example using a network whose minimal cut (which has no feedback links) yields a capacity bound that is proven to be unachievable. As a result, knowledge of the capacity of the network’s minimal cut is insufficient to determine the capacity of all possible networks, and we cannot hope to derive cut-set bounds that are tight in general. Nonetheless, given the complexity of taking into account the full network topology, we proceed to study the cut-set approach, deriving general bounds and demonstrating that those bounds are tight in some cases.

Specifically, in Section III we begin with the cut-set upper bound given by the capacity of the two-node network shown in Fig. 2, which is the only cyclic network we consider in this paper. In this network, the source node can transmit packets to the sink node along the forward links and the sink node can send information back to the source node along the feedback links. As mentioned above, this cut-set bound can be quite loose since it assumes that all feedback is available to the source node and all information crossing the cut in the forward direction is available to the sink. We therefore develop a new cut-set upper bound for general acyclic networks by taking into account more details of the topological relationships among links on the cut, as in the zig-zag network construction shown in Figure 1.

In Section IV, we consider a variety of linear and nonlinear coding strategies useful for achieving the capacity of various example networks. We prove the insufficiency of linear network codes to achieve the capacity by providing an example of a network where the capacity is 50% greater than the linear coding capacity and is achieved using a nonlinear error detection strategy. A similar example for the problem with Byzantine attack on nodes rather than edges appears in [16]. We also give examples of single-source and single-sink networks for which intermediate nodes must perform coding, nonlinear error detection, or error correction in order to achieve the network capacity. We describe a simple greedy algorithm for error correction at intermediate nodes. We then introduce a new coding strategy called “guess-and-forward.” In this strategy, an intermediate node which receives some redundant information from multiple paths guesses which of its upstream links are controlled by the adversary. The intermediate node forwards its guess to the sink, which tests the hypothesis of the guessing node. In Section V, we show that guess-and-forward achieves network capacity on the two-node network with feedback links of Fig. 2, as well as on the family of four-node acyclic networks in Fig. 3 when the capacity of each feedback link is not too small (i.e. above a value given by a linear optimization). (After the submission of this paper, we obtained a new result that improves upon the bound in Section III for the special case of small-capacity feedback links. We mention the idea briefly as a footnote in Section III and will present it formally in an upcoming paper.) Finally, we apply the guess-and-forward strategy to zig-zag networks, deriving achievable rates and presenting conditions under which our upper bound is tight. We conclude in Section VI with a discussion of future work.

Portions of this work have appeared in our earlier work [17, 21], which introduced the network error correction problem with unequal link capacities and presented a subset of the results.

## II Preliminaries

Consider a directed acyclic communication network G with unequal link capacities. Let r(l) denote the capacity of edge l. A source node s transmits information to a sink node t over the network G. Transmissions occur on the links according to their topological order, i.e. a link transmits after all its incident incoming links, and we regard a link error as being applied upon transmission. A link (or node) is said to be upstream of another link (or node) iff there is a directed path starting from the former and ending with the latter. A link (or node) is said to be downstream of another link (or node) iff there is a directed path starting from the latter and ending with the former.

In this paper, we consider the problem of correcting arbitrary adversarial errors on up to z links. The location of the error links is fixed for all time but unknown to the network user.

###### Definition 1

A network code is z-error link-correcting if the source message can be recovered by the sink node provided that the adversary controls at most z links. Thus a z-error link-correcting network code can correct any j adversarial links for j ≤ z.

Let (S, S^c) be a partition of the nodes of G, and define the cut for the partition by

 cut(S, S^c) = {(a, b) ∈ E : a ∈ S, b ∈ S^c}.

The cut Q = cut(S, S^c) separates nodes s and t if s ∈ S and t ∈ S^c. We use CS(s, t) to denote the set of cuts between s and t. Given a cut Q, we call any link in Q a forward link, and we call any link from S^c to S a feedback link.
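To make the partition notation concrete, the forward and feedback links of a cut can be enumerated directly from the edge list. The sketch below uses a hypothetical toy graph; the node names and edges are illustrative only, not taken from the paper.

```python
# Illustrative sketch of cut(S, S^c): forward links cross from S to S^c,
# feedback links cross from S^c back into S. The toy graph is hypothetical.

def cut_links(edges, S):
    """Forward links of the cut: tail in S, head outside S."""
    return [(a, b) for (a, b) in edges if a in S and b not in S]

def feedback_links(edges, S):
    """Feedback links across the cut: tail outside S, head in S."""
    return [(a, b) for (a, b) in edges if a not in S and b in S]

# Toy network: source 's', sink 't', intermediate nodes 'u' and 'v',
# with one backward edge ('t', 'u') acting as feedback.
edges = [('s', 'u'), ('s', 'v'), ('u', 't'), ('v', 't'), ('t', 'u')]
S = {'s', 'u'}

print(cut_links(edges, S))       # [('s', 'v'), ('u', 't')]
print(feedback_links(edges, S))  # [('t', 'u')]
```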

For the achievable strategies in Sections IV and V, we assume that coding occurs in the finite field F_q for some prime power q. An error on any link l is specified by a vector e_l containing r(l) symbols in F_q. The output of link l equals the sum in F_q of the input to link l and the error e_l applied to link l. We say that an error occurs on the link l if e_l ≠ 0.
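The additive error model can be illustrated with a short sketch over a prime field, where a link's output is the component-wise field sum of its input symbols and the adversary's error vector. The field size and vectors below are arbitrary illustrative choices.

```python
# Sketch of the additive link-error model over a prime field F_q:
# output = input + error, computed symbol-by-symbol mod q.
# q = 7 and the example vectors are illustrative assumptions.

q = 7  # prime, so arithmetic mod q is the field F_q

def link_output(x, e):
    """Component-wise sum in F_q of link input x and error vector e."""
    assert len(x) == len(e)
    return [(xi + ei) % q for xi, ei in zip(x, e)]

x = [3, 5, 1]        # symbols carried by a link of capacity r(l) = 3
e_none = [0, 0, 0]   # no error: e = 0, so no error "occurs" on the link
e_adv = [4, 0, 6]    # adversarial error on the link

print(link_output(x, e_none))  # [3, 5, 1]
print(link_output(x, e_adv))   # [0, 5, 0]
```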

As in [3, 4], we can consider a linear network code that assigns a set of vectors {v_i^{F(l)} : 1 ≤ i ≤ r(l)}, called global coding vectors, to each link l in the network. Let

 φ̃_l(w) = {⟨w, v_i^{F(l)}⟩ : 1 ≤ i ≤ r(l)}

denote the error-free output of link l when the network input is w, where ⟨w, v⟩ denotes the inner product of row vectors w and v. We use e to denote the vector of errors on the entire network. The output of a link l is a function of both the network input w and the error vector e, which we denote by ψ_l(w, e). For each node u, we use Γ^+(u) and Γ^−(u) to denote the sets of incoming and outgoing edges respectively for node u. With this notation, a sink node t cannot distinguish between the case where w is the network input and error e occurs and the case where w′ is the network input and error e′ occurs if and only if

 (ψ_l(w, e) : l ∈ Γ^+(t)) = (ψ_l(w′, e′) : l ∈ Γ^+(t)). (1)

Let ‖e‖ denote the number of links in which an error occurs. We say that any pair of input vectors w and w′ are z-links separable at sink node t if (1) does not hold for any pair of error vectors e and e′ such that ‖e‖ ≤ z and ‖e′‖ ≤ z. Lemma 1 of [4] establishes the linear properties of ψ_l for networks with unit link capacities. This result extends directly to networks with arbitrary link capacities.

###### Lemma 1

For all l ∈ E, all network inputs w and w′, error vectors e and e′, and μ ∈ F_q,

 ψ_l(w + w′, e + e′) = ψ_l(w, e) + ψ_l(w′, e′)

and

 ψ_l(μw, μe) = μ ψ_l(w, e).

From Lemma 1,

 ψ_l(w, e) = ψ_l(w, 0) + ψ_l(0, e) = φ̃_l(w) + θ_l(e),

where θ_l(e) = ψ_l(0, e) for any link l. Thus ψ_l(w, e) can be written as the sum of a linear function of w and a linear function of e.
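This decomposition can be checked numerically for any linear code by modeling φ̃_l and θ_l as matrix maps over F_q, so that the link output splits into a source part plus an error part. The matrices below are hypothetical stand-ins for transfer matrices, not derived from any particular network.

```python
# Numerical check of psi_l(w, e) = psi_l(w, 0) + psi_l(0, e) for a linear
# code over F_q. A models the source-to-link map (phi-tilde), B the
# error-to-link map (theta); both are arbitrary illustrative matrices.

q = 5

def mat_vec(M, v):
    """Matrix-vector product over F_q."""
    return [sum(m * x for m, x in zip(row, v)) % q for row in M]

A = [[1, 2], [0, 3]]   # stand-in for the source transfer map
B = [[4, 1], [2, 0]]   # stand-in for the error transfer map

def psi(w, e):
    """Link output: linear in the source input plus linear in the error."""
    return [(a + b) % q for a, b in zip(mat_vec(A, w), mat_vec(B, e))]

w, e = [1, 3], [2, 4]
lhs = psi(w, e)
rhs = [(a + b) % q for a, b in zip(psi(w, [0, 0]), psi([0, 0], e))]
print(lhs == rhs)  # True: the output superposes source and error parts
```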

## III Upper bounds

In this section, we consider upper bounds on the network error correction capacity. Let X denote the source alphabet and q the size of the (arbitrary) link alphabet. The corresponding network transmission rate is given by log|X| / log q.

We first derive the cut-set upper bound obtained from coalescing all nodes on each side of the cut into a super-node, resulting in a two-node network as shown in Fig. 2.

###### Lemma 2

Consider the two-node network shown in Fig. 2 with arbitrary link capacities, n forward links, and m feedback links. Let D_i denote the sum of the i smallest forward link capacities. The network error correction capacity of this network is upper bounded by

 0                               if n ≤ 2z
 min{D_{n−z}, D_{n−2(z−m)^+}}   if n > 2z
{proof}

Case 1) n ≤ 2z.

Case 2) n > 2z.

 O(x) = {y_1, .., y_{n−2z}, p_1, .., p_z, w_1, .., w_z},
 O(x′) = {y_1, .., y_{n−2z}, p′_1, .., p′_z, w′_1, .., w′_z},

where O(x) denotes the error-free vector of symbols on the forward links when codeword x is transmitted.

We can construct z link errors that change O(x) to the value {y_1, .., y_{n−2z}, p′_1, .., p′_z, w_1, .., w_z} as follows. We apply an error of value (p′_i − p_i) on the link carrying p_i for 1 ≤ i ≤ z. Since this does not change the output value of the other links, we obtain the claimed value. For O(x′), we can follow a similar procedure to construct z link errors that change the value of O(x′) to the same vector. Thus, the sink node cannot reliably distinguish between the source symbols x and x′, which gives a contradiction.

Therefore, the upper bound on achievable capacity is min{D_{n−z}, D_{n−2(z−m)^+}}.
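Under the reading that D_i denotes the sum of the i smallest forward link capacities, the bound of Lemma 2 can be evaluated mechanically; the sketch below assumes that interpretation, and the capacity values are illustrative.

```python
# Sketch of the two-node cut-set bound of Lemma 2, assuming D_i is the sum
# of the i smallest forward-link capacities, n the number of forward links,
# m the number of feedback links, and z the number of adversarial links.
# The capacity lists below are illustrative.

def two_node_bound(forward_caps, m, z):
    n = len(forward_caps)
    if n <= 2 * z:
        return 0
    caps = sorted(forward_caps)       # ascending, so prefixes give D_i
    D = lambda i: sum(caps[:i])       # sum of the i smallest capacities
    return min(D(n - z), D(n - 2 * max(z - m, 0)))

print(two_node_bound([1, 2, 3, 4], m=0, z=1))  # min(D_3, D_2) = min(6, 3) = 3
print(two_node_bound([1, 2, 3, 4], m=1, z=1))  # min(D_3, D_4) = min(6, 10) = 6
print(two_node_bound([1, 2, 3, 4], m=0, z=2))  # n <= 2z, so the bound is 0
```

Note how a single feedback link (m = 1) relaxes the second term from D_{n−2z} to D_n, reflecting the discussion of feedback aiding error correction.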

In Section V we show that this bound is the actual capacity of the two-node network. Thus, the super-node construction gives the following cut-set upper bound for general acyclic networks.

###### Lemma 3

Given any cut Q with k forward links and r feedback links, let D_i denote the sum of the i smallest forward link capacities. The network error correction capacity is upper bounded by

 0                               if k ≤ 2z
 min{D_{k−z}, D_{k−2(z−r)^+}}   if k > 2z

However, we can show that the above upper bound is not tight using the following generalized Singleton bound, which was presented in our conference paper [17]. A similar upper bound for the problem of adversarial attack on nodes rather than edges was given in independent work [16].

###### Definition 2

Any set Z of links on a cut Q is said to satisfy the downstream condition on Q if none of the links in Q ∖ Z are downstream of any link in Z.

###### Lemma 4

(A generalized Singleton bound) Consider any z-error correcting network code with source alphabet X in an acyclic network G. Consider any set Z consisting of 2z links on a source-sink cut Q that satisfies the downstream condition on Q. Let M be the total capacity of the links in Q ∖ Z. Then

 log|X| ≤ M · log q.
{proof}

The proof is similar to that of the network Singleton bound for the equal link capacity case in [3]. We assume that log|X| > M · log q, and show that this leads to a contradiction.

Given a cut Q, K(Q) denotes the number of links in Q. For brevity, assume that the links in Q are ordered topologically, i.e., l_i is not downstream of l_j for any i < j. Since log|X| > M · log q and M is the capacity of Q ∖ Z, there exist two distinct codewords x and x′ whose error-free outputs agree on the links in Q ∖ Z. So we can write

 O(x) = {y_1, .., y_{K(Q)−2z}, p_1, .., p_z, w_1, .., w_z},
 O(x′) = {y_1, .., y_{K(Q)−2z}, p′_1, .., p′_z, w′_1, .., w′_z},

where O(x) denotes the error-free vector of symbols on Q when codeword x is transmitted.

We will show that it is possible for the adversary to produce exactly the same outputs on all the channels in Q, under both x and x′, when errors occur on at most z links in Z.

 {y_1, .., y_{K(Q)−2z}, p′_1, .., p′_z, w′_1(z), .., w′_z(z)}.

Thus, the sink node cannot reliably distinguish between the source symbols x and x′, which gives a contradiction.

Consider the example four-node network shown in Fig. 4. In this example, the two-node bound of Lemma 3 gives the upper bound 22, whereas the generalized Singleton bound gives the upper bound 2.

However, the generalized Singleton bound is also not tight. Building on ideas from the above bounds, we proceed to derive tighter bounds.

Let Q^R denote the set of feedback links across cut Q. Given a set of feedback links W ⊂ Q^R and a set of forward links F ⊂ Q, we use N^{F,W}_{z,k,m}(Q) to denote the upper bound obtained from Lemma 4 (generalized Singleton bound) when evaluated for z − k − m adversarial links on the cut Q after erasing F and W from the graph G. Let

 N_{z,k,m}(Q) = min_{F⊂Q, |F|=k≤z−m} min_{W⊂Q^R, |W|=m≤z} N^{F,W}_{z,k,m}(Q).

Then we define N_z(Q) as follows.

 N_z(Q) = min_{0≤m≤z} min_{0≤k≤z−m} N_{z,k,m}(Q).

For instance, consider the 2-layer zig-zag network in Fig. 6. If , and , by choosing , , and removing in the application of the Singleton bound after erasing and . By taking the minimum over and , we can show that .

###### Lemma 5

(Cut-set upper bound 1) Consider any z-error correcting network code with source alphabet X in an acyclic network.

 log|X| ≤ min_{Q∈CS(s,t)} N_z(Q) · log q
{proof}

For any cut Q, the adversary can erase a set of feedback links W and a set of forward links F where |W| = m ≤ z and |F| = k ≤ z − m. Applying Lemma 4 on G after erasing F and W gives the upper bound N^{F,W}_{z,k,m}(Q). By taking the minimum over all cuts Q ∈ CS(s, t), we obtain the above bound.

The following examples illustrate how the bound in Lemma 5 tightens the generalized Singleton bound. We first consider a four-node acyclic network as shown in Fig. 5. In each example, unbounded reliable communication is allowed from the source s to its neighbor on one side of the cut and from the other intermediate node to the sink t on the other side of the cut. There are feedback links with arbitrary capacities across the cut.

When we compute the generalized Singleton bound, for any cut Q, we choose and erase 2z links in the cut such that none of the remaining links in the cut are downstream of the chosen links. Then we sum the remaining link capacities and take the minimum over all cuts. Because of the downstream condition, when the link capacities on the upstream side of the cut are much larger than those on the downstream side, the Singleton bound may not be tight. For example, in the network in Fig. 5 (a), the generalized Singleton bound gives the upper bound 20. However, when the adversary declares that he will use two forward links on the middle cut, we obtain the erasure bound 4.

As another example, consider the network in Fig. 5 (b). Applying the generalized Singleton bound gives the upper bound 16. If the adversary erases one of the forward links on the middle cut and we apply the generalized Singleton bound on the remaining network, then our upper bound is improved to 15. The intuition behind this example is that when the adversary erases large-capacity links which do not satisfy the downstream condition, applying the generalized Singleton bound on the remaining network with the remaining adversarial links can give a tighter bound.

###### Lemma 6

(Cut-set upper bound 2) Let M denote the total capacity of the remaining links on the cut Q. Then

 log|X| ≤ M · log q.
{proof}

We assume that log|X| > M · log q, and show that this leads to a contradiction. Let K(Q) denote the number of links on the cut Q. Since log|X| > M · log q, from the definition of M, there exist two distinct codewords x and x′ such that the error-free outputs on the links contributing to M are the same. Then we can write

 O(x) = {y_1, .., y_{K(Q)−c−d}, u_1, .., u_c, w_1, .., w_d},
 O(x′) = {y_1, .., y_{K(Q)−c−d}, u′_1, .., u′_c, w′_1, .., w′_d},

where the y_i denote the error-free outputs that agree under x and x′; the u_i and u′_i denote the error-free outputs on the first set of differing links for x and x′ respectively; and the w_i and w′_i denote the error-free outputs on the second set of differing links for x and x′ respectively. We will show that it is possible for the adversary to produce exactly the same outputs on all the channels in Q under x and x′ when errors occur on at most z links. When codeword x is sent, we also consider the error-free symbols on each feedback link.

Given a cut Q, we consider all possible sets on Q satisfying the condition of Lemma 6. We choose the sets among them that have the maximum total link capacities and define M_z(Q) to be the sum of the capacities of the corresponding remaining links. This gives the upper bound

 log|X| ≤ min_{Q∈CS(s,t)} M_z(Q) · log q.

The following example shows that we can obtain a tighter upper bound using Lemma 6. For the example network in Fig. 7, Lemma 5 gives the upper bound 9, whereas Lemma 6 gives the tighter upper bound 8.

Now we derive a generalized cut-set upper bound that unifies Lemma 5 and Lemma 6. Given a cut Q, consider a set of forward links F ⊂ Q and a set of feedback links W ⊂ Q^R such that |F| + |W| ≤ z. Let C^{F,W}_z(Q) denote the upper bound obtained from Lemma 6 when evaluated for z − |F| − |W| adversarial links on the cut Q after erasing F and W from the original graph G. Then

 min_{F⊂Q, W⊂Q^R, |F|+|W|≤z} C^{F,W}_z(Q)

is an upper bound on the error correction capacity of G. This includes the bound of Lemma 5 as a special case, since the generalized Singleton bound is a special case of the upper bound in Lemma 6 corresponding to the case where the erased set satisfies the downstream condition. It is also clear that C^{∅,∅}_z(Q) is the same as the bound in Lemma 6. Note however that any bound obtainable with a nonempty set W of erased feedback links is also obtainable by including those links in the sets of Lemma 6 instead of erasing them. Thus, we define

 C_z(Q) = min_{F⊂Q, |F|≤z} C^{F,∅}_z(Q)

###### Theorem 1

(A generalized cut-set upper bound) Consider any z-error correcting network code with source alphabet X in an acyclic network. Then

 log|X| ≤ min_{Q∈CS(s,t)} C_z(Q) · log q.

## IV Coding strategies

We consider a variety of linear and nonlinear coding strategies useful for achieving the capacity of various example networks. We show the insufficiency of linear network codes for achieving the capacity in general. We also demonstrate examples of networks with a single source and a single sink where, unlike the equal link capacity case, it is necessary for intermediate nodes to do coding, nonlinear error detection or error correction in order to achieve the capacity. We then introduce a new coding strategy, guess-and-forward.

### IV-A Error detection at intermediate nodes and insufficiency of linear codes

Here we show that there exists a network where the capacity is 50% greater than the best rate that can be achieved with linear coding. We consider the single-source and single-sink network in Fig. 8, where the source s aims to transmit information to the sink node t. We index the links and assume the link capacities shown in Fig. 8. For a single adversarial link, our upper bound from Theorem 1 is 2.

###### Lemma 7

For the network in Fig. 8 with a single adversarial link, rate 2 is asymptotically achievable with a nonlinear error detection strategy, whereas scalar linear network coding achieves at most rate 4/3.

{proof}

We first illustrate the nonlinear error detection strategy as follows. The source s wants to transmit two packets, a and b. We send them over a block of channel uses, reserving one bit of each block as a signaling bit. We send the packets down all links in the top layer. In the middle layer, we do the following operations:

1. Send a linear combination of a and b down the first link.

2. Send a down both of the next two links.

3. Send b down both of the following two links.

4. Send a different linear combination of a and b down the last link.

At the bottom layer, we do the following operations:

2. Send a 1 followed by a on the corresponding link if the two received copies of a match; send a 0 otherwise.

3. Send a 1 followed by b on the corresponding link if the two received copies of b match; send a 0 otherwise.

We can show that the above nonlinear error detection strategy allows the sink node to decode (a, b). Suppose that the two linear combinations are linearly independent. Then the coding vectors on any two links in the bottom layer are independent, so the bottom layer carries a code with the MDS (maximum distance separable) property. If nothing was sent down both comparison links, the decoder can recover (a, b) from the information received on the two combination links. If nothing was sent down only the comparison link for a, then the outputs of the remaining links are not corrupted and the decoder can recover (a, b). Similarly, the decoder can decode correctly when nothing was sent down only the comparison link for b. If all the links in the bottom layer received symbols, there is at most one erroneous link on the bottom layer, which the MDS code corrects. Thus we can achieve rate 2 with this error detection strategy.
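A simplified simulation of this detect-and-signal idea is sketched below. It models the sink's view as four bottom-layer values: two linear combinations of the packets (a + b and a + 2b are illustrative choices, not necessarily the paper's) and two comparison links that forward a packet only when its two middle-layer copies agree, with the 0 signaling bit modeled as None. The topology and field size are assumptions for illustration.

```python
# Simplified model of the nonlinear error detection strategy. The sink sees
# four bottom-layer values for packets (a, b):
#   y[0] = a + b        (combination link)
#   y[1] = a or None    (comparison link; None models signaling bit 0)
#   y[2] = b or None    (comparison link)
#   y[3] = a + 2b       (a second, independent combination link)
# Any two of the coding vectors below are linearly independent, which gives
# the MDS property used in the decoding argument.

q = 97  # illustrative prime field size
COLS = [(1, 1), (1, 0), (0, 1), (1, 2)]  # coding vectors of the four outputs

def encode(a, b):
    return [(ca * a + cb * b) % q for (ca, cb) in COLS]

def decode(y):
    """Brute-force decoder: with at most one corrupt link, the true (a, b)
    is the unique pair consistent with all but at most one received symbol."""
    present = sum(1 for yi in y if yi is not None)
    for a in range(q):
        for b in range(q):
            agree = sum(1 for yi, (ca, cb) in zip(y, COLS)
                        if yi is not None and yi == (ca * a + cb * b) % q)
            if agree >= max(present - 1, 2):
                return (a, b)
    return None

def simulate(a, b, attack):
    """attack 0-3 corrupts a bottom-layer value; 'mid_a'/'mid_b' model a
    corrupted middle-layer copy, which the comparison node turns into None."""
    y = encode(a, b)
    if attack == 'mid_a':
        y[1] = None
    elif attack == 'mid_b':
        y[2] = None
    else:
        y[attack] = (y[attack] + 1) % q  # guaranteed symbol change
    return decode(y)

# Any single-link attack still lets the sink recover both packets.
print(all(simulate(17, 42, atk) == (17, 42)
          for atk in [0, 1, 2, 3, 'mid_a', 'mid_b']))  # True
```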

Now we show that scalar linear network coding can achieve at most rate 4/3. Suppose that we want to achieve a higher rate by transmitting the corresponding number of symbols reliably using a scalar linear network code over the finite field F_q over a given number of rounds. To show the insufficiency of linear coding for achieving this capacity, from (1), it is sufficient to prove that for any such linear network code there exist pairs (w, e) and (w′, e′) such that

 (ψ_l(w, e) : l ∈ Γ^+(t)) = (ψ_l(w′, e′) : l ∈ Γ^+(t)),

with w ≠ w′ and at most one error link in each of e and e′. Since the above equation is equivalent to

 (φ̃_l(w − w′) : l ∈ Γ^+(t)) = (θ_l(−e + e′) : l ∈ Γ^+(t)),

by linearity, it suffices to find a nonzero source vector x and an error vector e′′ with at most two error links such that

 (φ̃_l(x) : l ∈ Γ^+(t)) = (θ_l(e′′) : l ∈ Γ^+(t)), (2)

where x is drawn from the source alphabet. We will show that there exists x satisfying the above equation when the errors in e′′ occur on two particular links.

Let G_t denote the transfer matrix to t over the rounds. Its rows are the global coding vectors assigned to the incoming links of t over the rounds. Note that to transmit the source symbols reliably, G_t should have rank equal to the number of source symbols.

Let M_1 and M_2 denote the transfer matrices to t along the two sides of the network during the rounds. To transmit the symbols reliably, both M_1 and M_2 should have sufficiently large rank; otherwise, when the adversarial link is on the top layer, the maximum achievable rate is too small by the data processing inequality, which gives a contradiction.

Let e_1 and e_2 denote the errors occurring on the two chosen links, respectively. Error e_1 propagates to the links downstream of the first, and error e_2 propagates to the links downstream of the second.

From (2), we have the following set of equations:

 G_t x = diag(M_1, M_2) · (e_1, e_2)^τ = M · e′′.

Since and ,