# Universal Secure Error-Correcting Schemes for Network Coding

Danilo Silva and Frank R. Kschischang
Department of Electrical and Computer Engineering, University of Toronto
Toronto, Ontario M5S 3G4, Canada, {danilo, frank}@comm.utoronto.ca

###### Abstract

This paper considers the problem of securing a linear network coding system against an adversary that is both an eavesdropper and a jammer. The network is assumed to transport $n$ packets from source to each receiver, and the adversary is allowed to eavesdrop on $\mu$ arbitrarily chosen links and also to inject up to $t$ erroneous packets into the network. The goal of the system is to achieve zero-error communication that is information-theoretically secure from the adversary. Moreover, this goal must be attained in a universal fashion, i.e., regardless of the network topology or the underlying network code. An upper bound on the achievable rate under these requirements is shown to be $n-\mu-2t$ packets per transmission. A scheme is proposed that can achieve this maximum rate, for any $n$ and any field size $q$, provided the packet length $m$ is at least $n$ symbols. The scheme is based on rank-metric codes and admits low-complexity encoding and decoding. In addition, the scheme is shown to be optimal in the sense that the required packet length is the smallest possible among all universal schemes that achieve the maximum rate.

## I Introduction

Consider a network implementing linear network coding for multicast [1]. The network may be subject to two types of attacks: a malicious user injects corrupt packets into the network in order to disrupt communication; an unauthorized eavesdropper intercepts packet transmissions in order to obtain as much information as possible about the transmitted messages. The linear mixing performed by network coding presents challenges to coding schemes in both scenarios, and has motivated a significant amount of research.

This paper considers the problem of dealing with the aforementioned attacks in a universal fashion, i.e., in a way that is completely independent of the network topology and the specific network code. This has the advantage of producing schemes that are compatible with noncoherent (random) network coding [2]. Also, we focus on the most stringent requirements of zero error probability and zero information leakage, i.e., perfectly reliable and perfectly secure (in the information-theoretic sense) communication.

Most of the previous work on this problem deals with the special cases where only error control or only security is required. A dividing assumption among these works refers to the constraints on the packet length $m$. For a system that is required to work under any packet length (in particular, under $m=1$), the error control problem has been extensively discussed in [3, 4, 5] (see references therein) and the security problem has also received significant attention [6, 7, 8]. In all of these works, the proposed solutions require knowledge of the network code, and therefore are not universal. On the other hand, universal schemes have been proposed for the case where $m$ is required to be sufficiently large; this is the approach taken in [9, 10] for error control and in [11] for security.

When both requirements of error control and security are combined, the problem becomes harder, and a simple concatenation of an error control scheme and a security scheme may not necessarily work. The reason is that, if error control coding is followed by security coding, the overall codeword may not be robust to errors and, similarly, if security coding is followed by error control coding, the overall codeword may not be robust to eavesdropping. Previous work on this problem has been limited (except for an earlier, suboptimal version of this work; see [11, 12]) to non-universal schemes [13, 14], which require knowledge of the network code.

In this paper, we propose a universal scheme that achieves perfectly reliable and perfectly secure communication. Namely, in a network with a maxflow of $n$ packets, if at most $t$ error packets are injected in the network, and at most $\mu$ packets are observed by an eavesdropper, then our scheme can provide perfectly secure and reliable communication while achieving a rate of $n-\mu-2t$ packets per transmission. This rate is shown to be optimal. Note that a similar upper bound on rate has been shown [14] in the context of non-universal network coding with $m=1$, but it does not apply to the problem considered here (since it ignores the possibility of exploiting $m>1$ in the coding scheme).

A requirement of our scheme is that the packet length $m$ must be at least $n$ symbols. We show that this value is optimal, in the sense that it is the smallest packet length of a universal scheme achieving the maximum rate.

A main tool in the design and analysis of our scheme is the theory of rank-metric codes [15]. We show that our scheme can benefit from existing efficient algorithms for rank-metric codes [10, 16], and therefore can be encoded and decoded with low complexity.

It is worth mentioning that there is another line of work that relaxes the assumption of zero error probability (requiring, instead, vanishingly small error probability) [17, 18]. In this case, even higher rates can be achieved [18]; however, the packet length must be asymptotically large.

The remainder of the paper is organized as follows. Section II establishes the notation used and reviews background material on rank-metric codes and linear network coding. In Section III, we define the problem of combined error control and security. In Section IV, we review existing techniques for the special cases of either error control or security only. We also provide new results and insights for these scenarios, which will be useful for our proposed scheme. In Section V, we present our scheme and show that it achieves the desired goals. In Section VI, we prove that our scheme is optimal both in the sense of maximal rate and smallest packet length. In Section VII, we discuss how the scheme can be extended to the case of noncoherent network coding. Finally, Section VIII presents our conclusions.

Some proofs are omitted due to lack of space. The full version of this work is being incorporated in the revised version of [11].

## II Background

### II-A Notation

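Let $\mathbb{F}_q$ be a finite field. Let $\mathbb{F}_q^{n\times m}$ denote the set of all $n\times m$ matrices over $\mathbb{F}_q$, and set $\mathbb{F}_q^{n}=\mathbb{F}_q^{n\times 1}$. Let $\mathbb{F}_{q^m}$ be an extension field of $\mathbb{F}_q$. Recall that $\mathbb{F}_{q^m}$ is an $m$-dimensional vector space over $\mathbb{F}_q$. Thus, by fixing a basis for $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$, elements of $\mathbb{F}_{q^m}$ may be viewed as (row) vectors in $\mathbb{F}_q^{1\times m}$ and vice-versa. This identification will be used extensively throughout the paper. In particular, we may view a column vector in $\mathbb{F}_{q^m}^{n}$ as a matrix in $\mathbb{F}_q^{n\times m}$ and vice-versa.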

### II-B Rank-Metric Codes

Let $X,Y\in\mathbb{F}_q^{n\times m}$ be matrices. The rank distance between $X$ and $Y$ is defined as $d_R(X,Y)=\operatorname{rank}(Y-X)$. As observed in [15], the rank distance is indeed a metric.

A rank-metric code $\mathcal{C}\subseteq\mathbb{F}_q^{n\times m}$ is a matrix code (i.e., a nonempty set of matrices) used in the context of the rank metric. The minimum rank distance of $\mathcal{C}$, denoted $d_R(\mathcal{C})$, is the minimum rank distance between all pairs of distinct codewords of $\mathcal{C}$.
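
The definitions above can be checked numerically on a toy code. The following is a minimal sketch, assuming $q=2$ and matrices stored as lists of 0/1 rows; the function names and the four-codeword example are illustrative choices and are not taken from the paper.

```python
from itertools import combinations

def gf2_rank(rows):
    """Rank over GF(2) of a nonempty matrix given as a list of 0/1 rows."""
    # Pack each row into an integer bitmask so that row addition is XOR.
    vecs = [int("".join(map(str, r)), 2) for r in rows]
    rank = 0
    while vecs:
        pivot = vecs.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        # Clear that bit from every remaining row.
        vecs = [v ^ pivot if v & low else v for v in vecs]
    return rank

def rank_distance(X, Y):
    """d_R(X, Y) = rank(Y - X); over GF(2) subtraction is XOR."""
    diff = [[a ^ b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]
    return gf2_rank(diff)

def min_rank_distance(code):
    """Minimum rank distance over all pairs of distinct codewords."""
    return min(rank_distance(X, Y) for X, Y in combinations(code, 2))

# Hypothetical 2 x 2 binary code: zero, identity, and two more invertible matrices.
code = [
    [[0, 0], [0, 0]],
    [[1, 0], [0, 1]],
    [[0, 1], [1, 1]],
    [[1, 1], [1, 0]],
]
d = min_rank_distance(code)
```

In this example every difference of two distinct codewords is invertible, so the minimum rank distance equals 2.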

There is a rich coding theory for rank-metric codes that is analogous to the classical coding theory in the Hamming metric. In particular, the Singleton bound for the rank metric [15, 10] states that every rank-metric code $\mathcal{C}\subseteq\mathbb{F}_q^{n\times m}$ with minimum rank distance $d$ must satisfy

$$|\mathcal{C}| \le q^{\max\{n,m\}(\min\{n,m\}-d+1)}. \tag{1}$$

Codes that achieve this bound are called maximum-rank-distance (MRD) codes and they are known to exist for all choices of parameters $q$, $n$, $m$ and $d\le\min\{n,m\}$ [15].
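
As a quick numerical illustration of the Singleton bound (1), the sketch below evaluates the right-hand side and tests whether a code of a given size meets it. The helper names are hypothetical, introduced only for this example.

```python
def singleton_bound(q, n, m, d):
    """Largest possible size of a rank-metric code of n x m matrices over
    GF(q) with minimum rank distance d, according to bound (1)."""
    return q ** (max(n, m) * (min(n, m) - d + 1))

def meets_singleton_bound(code_size, q, n, m, d):
    """True when a code of this size attains the bound, i.e. is MRD."""
    return code_size == singleton_bound(q, n, m, d)

# A binary code of 2 x 2 matrices with d = 2 can have at most 4 codewords.
bound_small = singleton_bound(2, 2, 2, 2)
# For 3 x 3 binary matrices and d = 2 the bound is 2^(3 * 2) = 64.
bound_large = singleton_bound(2, 3, 3, 2)
```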

In the context of the bijection between $\mathbb{F}_{q^m}^{n}$ and $\mathbb{F}_q^{n\times m}$, a rank-metric code $\mathcal{C}\subseteq\mathbb{F}_q^{n\times m}$ may be described as a block code of length $n$ over $\mathbb{F}_{q^m}$. (Note that, differently from classical coding theory, here we treat each codeword as a column vector. However, to avoid confusion, we will keep the standard notation on generator and parity-check matrices of linear codes.)

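It is particularly useful to consider linear block codes over $\mathbb{F}_{q^m}$. For $m\ge n$, an important family of such codes was proposed by Gabidulin [15]. A Gabidulin code is an $(n,k)$ linear code over $\mathbb{F}_{q^m}$ defined by the generator matrix

$$G=\begin{bmatrix} g_0^{q^0} & g_1^{q^0} & \cdots & g_{n-1}^{q^0} \\ g_0^{q^1} & g_1^{q^1} & \cdots & g_{n-1}^{q^1} \\ \vdots & \vdots & \ddots & \vdots \\ g_0^{q^{k-1}} & g_1^{q^{k-1}} & \cdots & g_{n-1}^{q^{k-1}} \end{bmatrix} \tag{2}$$

where the elements $g_0,\ldots,g_{n-1}\in\mathbb{F}_{q^m}$ are linearly independent over $\mathbb{F}_q$. It is shown in [15] that the minimum rank distance of a Gabidulin code is $d=n-k+1$, so the code is MRD.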
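
The construction in (2) can be tried out on a very small field. The sketch below is an illustration under stated assumptions rather than the authors' implementation: it takes $q=2$, $m=n=3$, represents $\mathbb{F}_{2^3}$ as 3-bit integers modulo $x^3+x+1$, and uses the basis $1, x, x^2$ as the elements $g_j$. It then verifies by exhaustive search that every nonzero codeword has rank at least $n-k+1$.

```python
from itertools import product

M = 3              # extension degree m (assumed), so the field is GF(2^3)
MODULUS = 0b1011   # x^3 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Multiply two elements of GF(2^M), each stored as an M-bit integer."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> M:          # reduce as soon as the degree reaches M
            a ^= MODULUS
    return result

def frobenius(a, i):
    """Return a^(2^i) by squaring i times."""
    for _ in range(i):
        a = gf_mul(a, a)
    return a

def gabidulin_generator(g, k):
    """Rows i = 0..k-1 of matrix (2): entry (i, j) is g_j^(2^i)."""
    return [[frobenius(gj, i) for gj in g] for i in range(k)]

def encode(G, u):
    """Codeword with entries sum_i u_i * G[i][j], a length-n vector over GF(2^M)."""
    word = [0] * len(G[0])
    for ui, row in zip(u, G):
        for j, gij in enumerate(row):
            word[j] ^= gf_mul(ui, gij)
    return word

def rank_over_gf2(word):
    """Rank of the binary matrix whose rows are the bit expansions of word."""
    vecs, rank = list(word), 0
    while vecs:
        pivot = vecs.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            vecs = [v ^ pivot if v & low else v for v in vecs]
    return rank

g = [1, 2, 4]      # 1, x, x^2: linearly independent over GF(2)
n, k = len(g), 2
G = gabidulin_generator(g, k)
# For a linear code the minimum distance is the minimum rank of a nonzero codeword.
min_rank = min(rank_over_gf2(encode(G, u))
               for u in product(range(2 ** M), repeat=k) if any(u))
```

For these parameters the search returns a minimum rank of $n-k+1=2$, in agreement with the MRD property stated above.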

### II-C Linear Network Coding

The basic model for a (multicast) communication system using linear network coding is that of a finite-field matrix channel. At each channel use (generation) a source node transmits a batch of $n$ packets, each consisting of $m$ symbols from a finite field $\mathbb{F}_q$, which can be regarded as the rows of a matrix $X\in\mathbb{F}_q^{n\times m}$. Each link in the network transports a packet free of errors, and each node creates outgoing packets as $\mathbb{F}_q$-linear combinations of incoming packets. The specification of all such linear combinations defines the network code. The packets received by a (specific) destination node can be regarded as the rows of an $N\times m$ matrix $Y=AX$, where $A\in\mathbb{F}_q^{N\times n}$ is the transfer matrix that describes the linear transformations incurred by packets on route to the destination. The system is said to be coherent if $A$ is known to each corresponding destination; otherwise, it is said to be noncoherent. The linear network code is said to be feasible if every transfer matrix to a destination has rank $n$ (so that, in a coherent system, each destination is able to recover $X$).

The system described above is referred to as an linear coded network, where denotes the minimum rank among all transfer matrices. Thus, an linear coded network contains a feasible network code.

## III Problem Statement

For simplicity, we restrict attention to a single destination, since all the results in this paper can be immediately extended to multiple destinations. In addition, we focus on the fundamental case of coherent network coding; extensions to noncoherent network coding are described in Section VII.

The basic model for linear network coding described in Section II-C can be extended to incorporate packet errors. Suppose that at most $t$ errors can occur in any of the links, causing the corresponding packets to become corrupted. In this case, we will say that the network is subject to $t$ errors. Assuming, without loss of generality, an additive error model, the matrix received by the destination can be expressed as

$$Y=AX+DZ$$

where $Z\in\mathbb{F}_q^{t\times m}$ is a matrix consisting of the error packets injected and $D\in\mathbb{F}_q^{N\times t}$ is the transfer matrix from the affected links to the destination. Note that $D$ depends on the set of links in error.

This model can be further extended to include an eavesdropper adversary, in the spirit of the wiretap channel II of Ozarow and Wyner [19]. The eavesdropper is assumed to have access to the packets transmitted on any $\mu$ arbitrarily chosen links in the network. In this case, we will say that the network is subject to $\mu$ observations. Let $W\in\mathbb{F}_q^{\mu\times m}$ be a matrix consisting of the packets observed by the eavesdropper. Then $W$ can be expressed as

$$W=BX$$

where $B\in\mathbb{F}_q^{\mu\times n}$ is the transfer matrix from the source node to the eavesdropper. Note that $B$ depends on the set of intercepted links.
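
To make the two channel equations concrete, the following sketch simulates one channel use over $\mathbb{F}_2$. It is only an illustration: the dimensions, the random seed, and the helper names are assumptions chosen for the example, and the matrices are drawn at random rather than derived from an actual network topology.

```python
import random

def matmul_gf2(P, Q):
    """Matrix product over GF(2) for matrices stored as lists of 0/1 rows."""
    return [[sum(p & q for p, q in zip(row, col)) % 2 for col in zip(*Q)]
            for row in P]

def matadd_gf2(P, Q):
    """Entrywise sum over GF(2), i.e. XOR."""
    return [[a ^ b for a, b in zip(rp, rq)] for rp, rq in zip(P, Q)]

def random_matrix(rows, cols, rng):
    return [[rng.randrange(2) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)           # fixed seed, purely for reproducibility
n, m, N, t, mu = 4, 6, 4, 1, 2   # hypothetical parameters

X = random_matrix(n, m, rng)     # transmitted packets (rows)
A = random_matrix(N, n, rng)     # source-to-destination transfer matrix
D = random_matrix(N, t, rng)     # transfer matrix from the corrupted links
Z = random_matrix(t, m, rng)     # injected error packets
B = random_matrix(mu, n, rng)    # source-to-eavesdropper transfer matrix

Y = matadd_gf2(matmul_gf2(A, X), matmul_gf2(D, Z))  # destination: Y = AX + DZ
W = matmul_gf2(B, X)                                # eavesdropper: W = BX
```

Since $DZ$ has at most $t$ rows in $Z$, the received matrix $Y$ differs from $AX$ by a matrix of rank at most $t$, which is why the rank metric is the natural tool here.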

To ensure secure and reliable communication, the source node chooses the matrix $X$ as the (possibly stochastic) encoding of some message $S$ (which should be recovered by the destination but not by the eavesdropper). The coding scheme is said to be zero-error if $S$ can be uniquely determined from $Y$, i.e., $H(S|Y)=0$. Here we assume that $A$ is a constant known to all, while $D$ and $Z$ are unknown random variables with unknown distributions (which may depend on $X$). A zero-error scheme, in this context, may also be called a $t$-error-correcting scheme. A scheme is said to be universally $t$-error-correcting if it satisfies

$$H(S|Y)=0,\quad \forall A:\ \operatorname{rank}A=n \tag{3}$$

for any arbitrary distributions on $D$ and $Z$. In other words, a universally $t$-error-correcting scheme must provide reliable communication for any choice of the (feasible) linear network code.

The coding scheme is said to be (perfectly) secret if the eavesdropper gets no information about the message, i.e., if $I(S;W)=0$. Note that this requirement depends on the choice of $B$. A scheme is said to be universally (perfectly) secret under $\mu$ observations if it satisfies

$$I(S;W)=0,\quad \forall B\in\mathbb{F}_q^{\mu\times n}. \tag{4}$$

In other words, a universally secret scheme must guarantee secrecy for any choice of the linear network code.

In this paper, we are interested in schemes that are both universally $t$-error-correcting and universally secret under $\mu$ observations, i.e., schemes that satisfy both (3) and (4).

## IV Special Cases

### IV-A Error Control Only

Consider an linear network subject to errors but observations. In this case, condition (4) can be ignored.

In the case of a deterministic encoding, the following characterization is given in [20].

###### Theorem 1 ([20])

Consider a deterministic encoder mapping $S$ to $X$ whose image is given by $\mathcal{C}\subseteq\mathbb{F}_q^{n\times m}$. There exists a universally $t$-error-correcting scheme with this encoder if and only if $d_R(\mathcal{C})>2t$.

From the Singleton bound (1), it can be seen that the maximum rate achievable by a universally -error-correcting scheme is given by symbols per transmission, and it is achieved by an MRD code. In particular, the rate of packets per transmission is achievable only if .

In the case of a stochastic encoding, the result above does not necessarily hold, since it is conceivable that recovering from does not necessarily enable the receiver to recover . Still, it is possible to obtain the following equivalence result, which will be very useful in the sequel.

###### Theorem 2

Consider a stochastic encoding from to . The encoding admits a universally -error-correcting scheme if and only if it admits a zero-error scheme for the coherent channel , for all full-rank .

###### Proof

Omitted due to lack of space.

Essentially, Theorem 2 shows that any coding scheme that corrects packet errors can be modified at the decoder to instead correct “packet erasures” (i.e., rank deficiency), and vice-versa.

### IV-B Security Only

Consider an linear coded network subject to observations but errors. In this case, ; thus, condition (3) can be replaced by .

It is shown in [11] that the maximum number of symbols per transmission that can be reliably communicated with a universally secret scheme is upper bounded by . Moreover, this rate is achievable only if .

A scheme is proposed in [11] that is able to achieve this maximum rate. The scheme uses Ozarow-Wyner coset coding [19] based on linear MRD codes. In order to describe the scheme, it is convenient to use the bijection described in Section II-A and think of vectors in as elements of the extension field . Note that this is used solely to perform the encoding and decoding operations at the source and destination nodes, and has no impact in the -linear network coding operations performed at the internal nodes.

Let be an linear code over with parity-check matrix , where . Let the message be given by . Encoding is performed by choosing uniformly at random such that . In other words, is viewed as a syndrome specifying a coset of , and is chosen as a random word from that coset. Decoding is performed simply by computing . It is shown in [11] that this scheme is universally secret if and only if is an MRD code and .

We now describe a convenient way to perform the encoding process. Let be an invertible matrix such that corresponds to the first rows of . Given a message , the encoder chooses uniformly at random and independently from , and produces by computing

$$X=T\begin{bmatrix} S \\ V \end{bmatrix}.$$

Note that . It is easy to show that , i.e., is chosen uniformly at random given . Thus, this encoder indeed implements a coset coding approach.

We now give a security condition based directly on the matrix rather than its inverse.

###### Proposition 3

The encoder described above is universally secure under observations if the last rows of form a generator matrix of an linear MRD code over with .

###### Proof

Let and be such that . Then

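$$\begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix} = T^{-1}T = \begin{bmatrix} H \\ H_1 \end{bmatrix}\begin{bmatrix} G_1^T & G^T \end{bmatrix} = \begin{bmatrix} HG_1^T & HG^T \\ H_1G_1^T & H_1G^T \end{bmatrix}.$$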

Thus, . Since both and are full-rank, it follows that and are generator and parity-check matrices, respectively, for exactly the same code.

## V Proposed Scheme

In this section, we propose a scheme that is universally $t$-error-correcting and universally secret under $\mu$ observations. The scheme achieves a rate of $n-\mu-2t$ packets per transmission and requires the packet length $m$ to be at least $n$ symbols. The scheme can be seen as a combination of the strategies for error control and security described in Section IV, designed in such a way that they can be coupled without violating conditions (3) and (4). In what follows we make use of the identification between $\mathbb{F}_q^{1\times m}$ and $\mathbb{F}_{q^m}$ described in Section II-A.
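
As a worked illustration of how the parameters fit together, consider hypothetical values $n=6$, $t=1$ and $\mu=2$ (chosen only for this example). The arithmetic below assumes the relations used in this section: a rate of $k=n-\mu-2t$ packets, an outer $(n,k+\mu)$ MRD code with minimum rank distance $n-(k+\mu)+1$, and an inner $(n,\mu)$ MRD sub-code.

```latex
\begin{align*}
k &= n - \mu - 2t = 6 - 2 - 2 = 2
  && \text{(message packets per transmission)} \\
k + \mu &= 4
  && \text{(dimension of the outer MRD code, an $(n,k+\mu)=(6,4)$ code)} \\
d_R &= n - (k+\mu) + 1 = 6 - 4 + 1 = 3 = 2t + 1
  && \text{(enough to correct $t = 1$ error)} \\
(n,\mu) &= (6,2)
  && \text{(MRD sub-code used for secrecy)} \\
m &\ge n = 6
  && \text{(minimum packet length in symbols)}
\end{align*}
```

Under these assumptions, each transmission of six packets carries two message packets, with two packets' worth of randomness spent on secrecy and two on error correction.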

Assume that and . Let be a generator matrix of an linear MRD code over . Suppose that the last rows of form a generator matrix of an linear MRD code over .

Encoding proceeds as follows. Given a message , the encoder first produces an auxiliary variable

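$$U=\begin{bmatrix} S \\ V \end{bmatrix}$$

by choosing $V\in\mathbb{F}_{q^m}^{\mu}$ uniformly at random and independently from $S$. Then, the encoder computes

$$X=G_0^T U.$$

Note that the mapping from $U$ to $X$ is a deterministic mapping whose image is (a subset of)

$$\mathcal{C}_0=\{G_0^T u,\ u\in\mathbb{F}_{q^m}^{k+\mu}\}.$$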

It follows from Theorem 1 that, when is transmitted over an linear coded network subject to errors, the receiver can uniquely determine (and therefore ) if . Since is an linear MRD code over , with , we have that . Thus, the scheme is universally -error-correcting.

In particular, decoding can be performed in two steps: first, applying a decoder for in order to find ; then, extracting the message as the first rows of .

In order to prove the secrecy of the scheme, consider first an alternative interpretation. Let be an invertible matrix such that the last rows of correspond to the matrix . Then, we have

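$$X=G_0^T U=T\begin{bmatrix} 0 \\ U \end{bmatrix}=T\begin{bmatrix} S' \\ V \end{bmatrix}$$

where

$$S'=\begin{bmatrix} 0 \\ S \end{bmatrix}.$$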

In other words, the encoder is identical to the encoder described in Section IV-B if is taken as the message. Furthermore, we have that the last rows of correspond to , which is the generator matrix of an linear MRD code over . Thus, by Proposition 3 (which holds regardless of the message distribution), we have that the scheme is universally secret under observations.

The above analysis proves the following result.

###### Theorem 4

The scheme described above is universally -error-correcting and universally secret under observations.

Our proposed scheme relies on the assumption that a generator matrix for an linear MRD code exists such that its last rows form a generator matrix for another linear MRD code. It is easy to see that, if is taken as a generator matrix of a Gabidulin code given in the form (2), then any consecutive rows of (in particular the last ones) indeed form a generator matrix of an MRD sub-code. In this case, decoding of can be efficiently performed using the methods in [10, 16, 12].

## VI Converse Results

In this section, we prove that our proposed scheme is optimal, both in the sense of achieving the maximum possible rate and in the sense of requiring the minimum possible packet length among all schemes that achieve this maximum rate.

###### Theorem 5

Consider an linear coded network. Assume that the source message has entropy of packets. There exists a scheme that is universally -error-correcting and universally secure under observations only if . Moreover, this maximum rate can be attained only if .

###### Proof

Let . Let be a full-rank matrix and let be a full-rank matrix such that for some (necessarily full-rank) . Let and . If the encoder admits a scheme that is universally -error-correcting then, by Theorem 2, it also admits a scheme that is zero-error for the coherent channel . Thus, there is a function such that . In particular, there is also a function such that . Thus, we may write . Now,

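$$
\begin{aligned}
k &= H(S) \\
  &= H(S|Y_A,W_B)+I(S;Y_A,W_B) \\
  &= I(S;Y_A,W_B) &\qquad (5) \\
  &= I(S;W_B)+I(S;Y_A|W_B) \\
  &= I(S;Y_A|W_B) &\qquad (6) \\
  &= H(Y_A|W_B)-H(Y_A|S,W_B) \\
  &\le H(Y_A|W_B) &\qquad (7) \\
  &\le n'-\operatorname{rank}P = n'-\mu &\qquad (8)
\end{aligned}
$$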

where (5) follows since is a function of and (6) follows since . This proves the first statement. Now consider the second statement. Since (8) holds with equality, we must have and . Note that these conditions hold for all full-rank and all , where

$$\mathcal{A}_B=\{A\in\mathbb{F}_q^{n'\times n}:\ \operatorname{rank}A=n',\ \langle B\rangle\subseteq\langle A\rangle\}$$

and denotes the row space of a matrix. This implies that and therefore , where and is the matrix consisting of the vertical stacking of all matrices in . It is not hard to see that, as long as , . (In fact, contains every nonzero vector of as one of its rows.) It follows that , for all full-rank . Thus, must be uniquely determined given and the indication that . From Theorem 1, this implies that each must be a rank-metric code with .

On the other hand, we have seen that for all full-rank where and . By the chain rule of entropy, it is not hard to see that this implies that is uniform (for instance, by choosing some ’s that are submatrices of an identity matrix, as in the wiretap channel II). Thus, , which implies that . Since , we have that . Thus, there must be some such that , which implies that . Together with the fact that , we can see, from the Singleton bound (1), that this can only happen if .

## VII Extension to Noncoherent Network Coding

The scheme described in the paper is suitable for coherent network coding and is indeed optimal. In the case of noncoherent network coding, the scheme can be adapted by including appropriate packet headers. More precisely, the transmission matrix should be $\begin{bmatrix} I & X \end{bmatrix}$, where $X$ is the transmission matrix of the original scheme. Clearly, including packet headers does not affect security, but it allows the scheme to be decoded when the transfer matrix is unknown. It is shown in [10] that such adaptation preserves the error-correcting capability of the code, so the universally $t$-error-correcting property is maintained. Although the rate achieved in this case is no longer optimal, it is very close to optimal for all practical packet lengths [10].
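
The role of the packet headers can be illustrated in the error-free case. The sketch below assumes the header construction of [10], in which an identity matrix is prepended to the payload, and works over $\mathbb{F}_2$ with a square, invertible transfer matrix; the function names are hypothetical. It does not perform rank-metric error correction, and only shows how the unknown transfer matrix is learned from the headers.

```python
def matmul_gf2(P, Q):
    """Matrix product over GF(2) for matrices stored as lists of 0/1 rows."""
    return [[sum(p & q for p, q in zip(row, col)) % 2 for col in zip(*Q)]
            for row in P]

def add_headers(X):
    """Prepend an n x n identity matrix to the n payload packets (rows of X)."""
    n = len(X)
    return [[1 if i == j else 0 for j in range(n)] + list(row)
            for i, row in enumerate(X)]

def recover_payload(Y, n):
    """Gauss-Jordan elimination on Y = [A | AX] over GF(2).

    Assumes A is n x n and invertible; a singular A raises StopIteration
    when no pivot can be found."""
    rows = [list(r) for r in Y]
    for col in range(n):
        pivot = next(r for r in range(col, len(rows)) if rows[r][col])
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[col])]
    # The left block is now the identity, so the right block equals X.
    return [row[n:] for row in rows[:n]]

X = [[1, 0, 1], [0, 1, 1]]      # payload packets
A = [[1, 1], [0, 1]]            # transfer matrix, unknown to the receiver
Y = matmul_gf2(A, add_headers(X))
X_hat = recover_payload(Y, len(X))
```

The receiver never uses `A` directly: the first $n$ columns of `Y` reveal it, at the cost of $n$ extra symbols per packet, which is the rate loss mentioned above.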

## VIII Conclusion

In this paper, we have proposed a universal end-to-end coding scheme that can guarantee perfectly secure and perfectly reliable communication over a linear coded network subject to malicious interference and eavesdropping. The scheme is optimal both in the sense of achieving the maximum possible rate and in the sense of requiring the smallest possible packet length. The scheme is based on rank-metric codes and admits efficient encoding and decoding algorithms.

## References

• [1] R. Koetter and M. Médard, “An algebraic approach to network coding,” IEEE/ACM Trans. Netw., vol. 11, no. 5, pp. 782–795, Oct. 2003.
• [2] T. Ho, M. Médard, R. Koetter, D. R. Karger, M. Effros, J. Shi, and B. Leong, “A random linear network coding approach to multicast,” IEEE Trans. Inf. Theory, vol. 52, no. 10, pp. 4413–4430, Oct. 2006.
• [3] R. W. Yeung and N. Cai, “Network error correction, part I: Basic concepts and upper bounds; part II: Lower bounds,” Commun. Inform. Syst., vol. 6, no. 1, pp. 19–54, 2006.
• [4] Z. Zhang, “Linear network error correction codes in packet networks,” IEEE Trans. Inf. Theory, vol. 54, no. 1, pp. 209–218, 2008.
• [5] S. Yang, R. W. Yeung, and Z. Zhang, “Weight properties of network codes,” European Transactions on Telecommunications, vol. 19, no. 4, pp. 371–383, 2008.
• [6] N. Cai and R. W. Yeung, “Secure network coding,” in Proc. IEEE Int. Symp. Information Theory, Lausanne, Switzerland, Jun. 30–Jul. 5, 2002, p. 323.
• [7] J. Feldman, T. Malkin, C. Stein, and R. A. Servedio, “On the capacity of secure network coding,” in Proc. 42nd Annual Allerton Conf. on Commun., Control, and Computing, Sep. 2004.
• [8] S. Y. E. Rouayheb and E. Soljanin, “On wiretap networks II,” in Proc. IEEE Int. Symp. Information Theory, Nice, France, Jun. 24–29, 2007, pp. 551–555.
• [9] R. Kötter and F. R. Kschischang, “Coding for errors and erasures in random network coding,” IEEE Trans. Inf. Theory, vol. 54, no. 8, pp. 3579–3591, Aug. 2008.
• [10] D. Silva, F. R. Kschischang, and R. Kötter, “A rank-metric approach to error control in random network coding,” IEEE Trans. Inf. Theory, vol. 54, no. 9, pp. 3951–3967, 2008.
• [11] D. Silva and F. R. Kschischang, “Universal secure network coding via rank-metric codes,” IEEE Trans. Inf. Theory, 2008, submitted for publication. [Online]. Available: http://arxiv.org/abs/0809.3546
• [12] D. Silva, “Error control for network coding,” Ph.D. dissertation, University of Toronto, Toronto, Canada, 2009.
• [13] C.-K. Ngai and S. Yang, “Deterministic secure error-correcting (sec) network codes,” in Proc. IEEE Information Theory Workshop, Tahoe City, CA, Sep. 2–6, 2007, pp. 96–101.
• [14] C.-K. Ngai and R. W. Yeung, “Secure error-correcting (sec) network codes,” in Proc. Workshop on Network Coding Theory and Applications, Lausanne, Switzerland, Jun. 15-16, 2009, pp. 98–103.
• [15] E. M. Gabidulin, “Theory of codes with maximum rank distance,” Probl. Inform. Transm., vol. 21, no. 1, pp. 1–12, 1985.
• [16] D. Silva and F. R. Kschischang, “Fast encoding and decoding of Gabidulin codes,” in Proc. IEEE Int. Symp. Information Theory, Seoul, Korea, Jun. 28–Jul. 3, 2009, pp. 2858–2862.
• [17] S. Jaggi, M. Langberg, S. Katti, T. Ho, D. Katabi, M. Médard, and M. Effros, “Resilient network coding in the presence of Byzantine adversaries,” IEEE Trans. Inf. Theory, vol. 54, no. 6, pp. 2596–2603, Jun. 2008.
• [18] S. Jaggi and M. Langberg, “Resilient network codes in the presence of eavesdropping Byzantine adversaries,” in Proc. IEEE Int. Symp. Information Theory, 24–29 June 2007, pp. 541–545.
• [19] L. H. Ozarow and A. D. Wyner, “Wire-tap channel II,” in Proc. EUROCRYPT 84 workshop on Advances in cryptology: theory and application of cryptographic techniques. New York, NY, USA: Springer-Verlag New York, Inc., 1985, pp. 33–51.
• [20] D. Silva and F. R. Kschischang, “On metrics for error correction in network coding,” IEEE Trans. Inf. Theory, vol. 55, no. 12, pp. 5479–5490, 2009.