Multiple-Access Network Information-Flow and Correction Codes
Abstract
This work considers the multiple-access multicast error-correction scenario over a packetized network with $z$ malicious edge adversaries. The network has min-cut $m$ and packets of length $n$, and each sink demands all information from the set of sources $\mathcal{S}$. The capacity region is characterized for both a “side-channel” model (where sources and sinks share some random bits that are secret from the adversary) and an “omniscient” adversarial model (where no limitations on the adversary’s knowledge are assumed). In the “side-channel” adversarial model, the use of a secret channel allows higher rates to be achieved compared to the “omniscient” adversarial model, and a polynomial-complexity capacity-achieving code is provided. For the “omniscient” adversarial model, two capacity-achieving constructions are given: the first is based on random subspace code design and has complexity exponential in $n$, while the second uses a novel multiple-field-extension technique and has complexity that is polynomial in the network size. Our code constructions are “end-to-end” in that all nodes except the sources and sinks are oblivious to the adversaries and may simply implement predesigned linear network codes (random or otherwise). Also, the sources act independently, without knowledge of the data from other sources.
I Introduction
Information dissemination can be optimized with the use of network coding, which maximizes the network throughput in multicast transmission scenarios [ACLY00]. For this scenario, it was shown in [LYC03] that linear network coding suffices to achieve the max-flow capacity from the source to each receiving node. An algebraic framework for linear network coding was presented in [KM03]. Furthermore, the linear combinations employed at network nodes can be selected randomly in a distributed manner; if the coding field size is sufficiently large, the max-flow capacity is achieved with high probability [Ho_etal06].
However, network coding is vulnerable to malicious attacks from rogue users. Due to the mixing operations at internal nodes, the presence of even a small number of adversarial nodes can contaminate the majority of packets in a network, preventing sinks from decoding. In particular, an error on even a single link might propagate to multiple downstream links via network coding, which might lead to the extreme case in which all incoming links at the sink are in error. This is shown in Fig. 1, where the action of a single malicious node contaminates all incoming links of the sink node due to packet mixing at downstream nodes.
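The propagation effect can be seen in a small numerical sketch. The model below is a minimal GF(2) illustration with hand-picked transfer matrices, not the exact network of Fig. 1: one adversarial link that feeds both of the sink's incoming paths corrupts every packet the sink receives.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy version of the error-propagation effect over GF(2): two source
# packets are linearly mixed on the way to the sink, and a single
# adversarial link adds an error packet Z.  A, B are illustrative choices.
n = 8                                   # packet length in bits
X = rng.integers(0, 2, size=(2, n))     # the two source packets (rows)

A = np.array([[1, 0],                   # source-to-sink transform matrix
              [1, 1]])
B = np.array([[1],                      # the corrupted link feeds BOTH
              [1]])                     # incoming links of the sink
Z = np.ones((1, n), dtype=int)          # the injected error packet

Y = (A @ X + B @ Z) % 2                 # the sink's observation
clean = (A @ X) % 2                     # what the sink would see error-free

corrupted = [i for i in range(2) if not np.array_equal(Y[i], clean[i])]
print(corrupted)                        # -> [0, 1]: every incoming link is in error
```

Because $B$ has no zero rows, a single injected packet reaches every incoming link of the sink, which is exactly the worst case described above.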
In such a case, network error-correction (introduced in [RaymondNECC2002]) rather than classical forward error-correction (FEC) is required, since the former exploits the fact that the errors at the sinks are correlated, whereas the latter assumes independent errors.
A number of papers, e.g., [YC06, CY06, SKK08], have characterized the set of achievable communication rates over networks containing hidden malicious jamming and eavesdropping adversaries, and have given corresponding communication schemes. The latest code constructions (for instance, [SKK08] and [Jaggi_etal08]) have excellent parameters: they have low computational complexity, are distributed, and are asymptotically rate-optimal. However, these papers focus on single-source multicast problems, where a single source wishes to communicate all its information to all sinks.
In this work we examine the problem of multiple-access multicast, where multiple sources wish to communicate all their information to all sinks. We characterize the optimal rate region for several variants of the multiple-access network error-correction problem and give matching code constructions, which have low computational complexity when the number of sources is small.
We are unaware of any straightforward application of existing single-source network error-correcting subspace codes that achieves the optimal rate regions. This is because single-source network error-correcting codes such as those of [Jaggi_etal08] and [SKK08] require the source to judiciously insert redundancy into the transmitted codeword; in the distributed-source case, however, the codewords are constrained by the independence of the sources.
II Background and related work
For a single-source, single-sink network with min-cut $m$, the capacity of the network under arbitrary errors on up to $z$ links is given by
(1)  $m - 2z,$
and can be achieved by a classical end-to-end error-correction code over multiple disjoint paths from the source to the sink. This result is a direct extension of the Singleton bound (see, e.g., [Roth06]). Since the Singleton bound can be achieved by a maximum distance separable (MDS) code, such as a Reed-Solomon code, such a code also suffices to achieve the capacity in the single-source, single-sink case.
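As a toy illustration of this point (not one of the constructions used later in the paper), the sketch below sends a message at rate $m - 2z = 1$ over $m = 3$ disjoint unit-capacity paths with $z = 1$ corrupted path, using a length-3 repetition code over GF(7), which is MDS; the brute-force minimum-distance decoder is for illustration only.

```python
# Toy end-to-end code over GF(7): m = 3 disjoint unit-capacity paths and
# z = 1 adversarial path, so the Singleton bound gives rate m - 2z = 1.
q, m, z = 7, 3, 1

def encode(msg):
    """Length-3 repetition code over GF(7); it is MDS with distance 2z + 1."""
    return (msg[0],) * m

codebook = {encode((u,)): (u,) for u in range(q)}

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def decode(received):
    """Brute-force minimum-Hamming-distance decoding over the codebook."""
    return codebook[min(codebook, key=lambda c: hamming(c, received))]

sent = encode((4,))
received = (sent[0], 5, sent[2])   # the adversary rewrites the middle path
print(decode(received))            # -> (4,)
```

Since the code's minimum distance is $3 = 2z + 1$, any single corrupted path leaves the transmitted codeword as the unique nearest codeword, so decoding succeeds.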
In the network multicast scenario, the situation is more complicated. For single-source multicast, the capacity region was shown in [RaymondNECC2002, YC06, CY06] to be the same as (1), with $m$ now representing the minimum of the sink min-cuts [YC06]. However, unlike single-source, single-sink networks, in the case of single-source multicast, network error correction is required: network coding is required in general for multicast even in the error-free case [ACLY00], and with the use of network coding, errors in the sink observations become dependent and cannot be corrected by end-to-end codes.
Two flavors of the network error-correction problem are often considered. In the coherent case, it is assumed that there is centralized knowledge of the network topology and the network code. Network error correction for this case was first addressed by Cai and Yeung [RaymondNECC2002, YC06, CY06] for the single-source scenario by generalizing classical coding theory to the network setting. However, their scheme has decoding complexity that is exponential in the network size.
In the harder noncoherent case, the network topology and/or network code are not known a priori to any of the honest parties. In this setting, [Jaggi_etal08, KK08] provided network error-correcting codes whose design and implementation complexity is only polynomial in the network parameters. Reference [KK08] introduced an elegant approach in which information is transmitted via the subspace spanned by the received packets/vectors; hence any generating set of the same subspace is equivalent to the sink [KK08]. Error-correction techniques for this setting were proposed in [KK08] and [SKK08] in the form of constant-dimension and rank-metric codes, respectively, where the codewords are defined as subspaces of some ambient space. These works considered only the single-source case.
For the noncoherent multi-source multicast scenario without errors, the scheme of [Ho_etal06] achieves any point inside the rate region. An extension of subspace codes to multiple sources, for a noncoherent multiple-access channel model without errors, was provided in [SFD08], which gave practical (but not rate-optimal) algebraic code constructions, and in [FragouliITW09MultiSource], which derived the capacity region and gave a rate-optimal scheme for two sources. For the multi-source case with errors, [FragouliMibiHoc09MultiSource] provided an efficient code construction achieving a strict subregion of the capacity region.
III Challenges
In this work we address the capacity region and the corresponding code design for the multiple-source multicast communication problem under different adversarial scenarios. The issues that arise in this problem are best explained with a simple single-sink example, shown in Fig. 2. Suppose that the sources $s_1$ and $s_2$ encode their information independently of each other. We can allocate one part of the network to carry only information from $s_1$, and another part to carry only information from $s_2$. In this case, only one source is able to communicate reliably under one link error. However, if coding at the two middle nodes is employed, the two sources are able to share network capacity to send redundant information, and each source is able to communicate reliably at capacity under a single link error. This shows that, in contrast to the single-source case, coding across multiple sources is required, so that sources can simultaneously use shared network capacity to send redundant information, even for a single sink.
In Section LABEL:The_Non_Coherent_Case we show that for the example network in Fig. 2, the capacity region is given by
(2)  $R_i \le \operatorname{mincut}(s_i, t) - 2z, \quad i = 1, 2,$
  $R_1 + R_2 \le \operatorname{mincut}(\{s_1, s_2\}, t) - 2z,$
where for $i = 1, 2$, rate $R_i$ is the information rate of $s_i$, $\operatorname{mincut}(s_i, t)$ is the minimum cut capacity between $s_i$ and sink $t$, $\operatorname{mincut}(\{s_1, s_2\}, t)$ is the minimum cut capacity between $\{s_1, s_2\}$ and $t$, and $z$ is the known upper bound on the number of link errors. Hence, similarly to single-source multicast, the capacity region of a multi-source multicast network is described by the cut-set bounds. From that perspective, one may draw a parallel with point-to-point error-correction. However, for multi-source multicast networks, point-to-point error-correcting codes do not suffice, and careful network code design is required. For instance, the work of [FragouliMibiHoc09MultiSource], which applies single-source network error-correcting codes to this problem, achieves a rate region that is strictly smaller than the capacity region (2) [fragouli_personal].
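Checking a rate pair against such cut-set bounds is mechanical. The helper below is a hypothetical sketch with illustrative min-cut values (not values read off Fig. 2):

```python
def in_capacity_region(R1, R2, m1, m2, m12, z):
    """Check a rate pair against the cut-set bounds of (2): each source is
    limited by its own min-cut to the sink, and the pair jointly by the
    joint min-cut, all reduced by 2z under omniscient errors."""
    return (R1 <= m1 - 2 * z and
            R2 <= m2 - 2 * z and
            R1 + R2 <= m12 - 2 * z)

# Illustrative values: individual min-cuts 3, joint min-cut 4, z = 1.
print(in_capacity_region(1, 1, 3, 3, 4, 1))   # -> True
print(in_capacity_region(2, 1, 3, 3, 4, 1))   # -> False
```

With these illustrative numbers, the symmetric point $(1, 1)$ sits exactly on the joint cut-set bound, while $(2, 1)$ violates the individual bound of the first source.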
IV Our results
In this paper we consider a “side-channel” model and an “omniscient” adversarial model. In the former, the adversary does not have access to all the information available in the network; for example, as in [Jaggi_etal08, landberg08], the sources share a secret with the sink(s) in advance of the network communication. Let $\mathcal{S}$ be the set of sources in the network, $|\mathcal{S}|$ be the number of sources, $R_i$ be the multicast transmission rate from source $i$, $i \in \{1, \dots, |\mathcal{S}|\}$, to every sink, and for any nonempty subset $\mathcal{S}' \subseteq \mathcal{S}$ let $m_{\mathcal{S}'}$ be the minimum min-cut capacity between any sink and $\mathcal{S}'$.
In Section LABEL:The_Random_Secret_Model we prove the following theorem:
Theorem 1.
Consider a multiple-source multicast network error-correction problem on a network, possibly with unknown topology, where each source shares a random secret with each of the sinks. For any errors on up to $z$ links, the capacity region is given by:
(3)  $\sum_{i \in \mathcal{S}'} R_i \le m_{\mathcal{S}'} - z \quad \text{for every nonempty } \mathcal{S}' \subseteq \mathcal{S},$
and every point in the rate region can be achieved with a polynomial-time code.
By capacity region we mean the closure of all rate tuples $(R_1, \dots, R_{|\mathcal{S}|})$ for which there exists a sequence of codes of length $n$, with message sets $\mathcal{M}_1, \dots, \mathcal{M}_{|\mathcal{S}|}$ and encoding and decoding functions for every node in the network and every sink, such that for every $\epsilon > 0$ and $\delta > 0$ there is an integer $N$ such that for every $n > N$ we have $\frac{1}{n} \log |\mathcal{M}_i| \ge R_i - \delta$ for each $i$, and the probability of decoding error at any sink is less than $\epsilon$, regardless of the message.
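For intuition on how a shared secret helps, the sketch below implements a standard polynomial hash of the kind used in secret-based schemes such as [Jaggi_etal08]: the source appends a short hash computed with a secret evaluation point $r$, and an adversary who does not know $r$ can forge a packet that passes the check only with small probability. The prime, packet layout, and variable names are illustrative assumptions, not the paper's actual construction.

```python
# Minimal sketch of a secret polynomial hash in the spirit of the
# side-channel model.  The prime p and packet layout are illustrative.
p = 2**31 - 1             # a prime field size (illustrative choice)

def poly_hash(x, r):
    """Evaluate sum_i x[i] * r**(i+1) mod p via Horner's rule."""
    acc = 0
    for sym in reversed(x):
        acc = (acc + sym) * r % p
    return acc

secret_r = 123456789       # shared via the side channel, hidden from adversary
packet = [17, 42, 7, 99]
tag = poly_hash(packet, secret_r)

# The sink recomputes the hash and keeps only consistent candidates.
tampered = [17, 42, 8, 99]
print(poly_hash(tampered, secret_r) == tag)   # -> False
```

Changing one symbol shifts the hash by a nonzero multiple of a power of $r$ modulo $p$, so the tampered packet is always rejected here; in general, a forgery escapes detection with probability at most (packet length)/$p$.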
In the “omniscient” adversarial model, we do not assume any limitation on the adversary’s knowledge, i.e., decoding should succeed for arbitrary error values. In Section LABEL:General_approach we derive the multiple-access network error-correction capacity for both the coherent and the noncoherent case. We show that network error-correction coding allows redundant network capacity to be shared among multiple sources, enabling the sources to simultaneously communicate reliably at their individual cut-set capacities under adversarial errors. Specifically, we prove the following theorem:
Theorem 2.
Consider a multiple-source multicast network error-correction problem on a network whose topology may be unknown. For any errors on up to $z$ links, the capacity region is given by:
(4)  $\sum_{i \in \mathcal{S}'} R_i \le m_{\mathcal{S}'} - 2z \quad \text{for every nonempty } \mathcal{S}' \subseteq \mathcal{S}.$
The rate regions are, perhaps not surprisingly, larger for the side-channel model than for the omniscient adversarial model.
Finally, in Section LABEL:Polynomial_time_construction we provide computationally efficient distributed schemes for the noncoherent case (and therefore for the coherent case as well) that are rate-optimal for the correction of network errors injected by computationally unbounded adversaries. In particular, our code construction achieves decoding success probability at least $1 - \mathcal{O}(1/q)$, where $q$ is the size of the finite field over which coding is performed, with complexity that is polynomial in the network size.
The remainder of the paper is organized as follows. In Section V we formally introduce our problem and give some mathematical preliminaries. In Section LABEL:The_Random_Secret_Model we derive the capacity region and construct multi-source multicast error-correcting codes for the side-channel model. In Section LABEL:The_Non_Coherent_Case we present two network error-correction schemes for the omniscient adversary model that achieve the full capacity region in both the coherent and noncoherent cases. In particular, we provide a general approach based on minimum-distance decoding, and then refine it to a practical code construction and decoding algorithm with polynomial complexity (in all parameters except the number of sources). Furthermore, our codes are fully distributed, in the sense that different sources require no knowledge of the data transmitted by their peers, and end-to-end, i.e., all nodes are oblivious to the adversaries present in the network and simply implement random linear network coding [RandCode0]. A remaining bottleneck is that while the implementation complexity (in terms of packet length, field size, and computational complexity) of our codes is polynomial in most network parameters, it increases exponentially with the number of sources. Thus, the design of efficient schemes for a large number of sources remains open. Portions of this work were presented in [Svit_rate_regions] and [hongyi_ted].
V Preliminaries
V-A Model
We consider a delay-free acyclic network $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is the set of nodes and $\mathcal{E}$ is the set of edges. The capacity of each edge is normalized to one symbol of the finite field $\mathbb{F}_q$ per unit time, where $q$ is a power of a prime. Edges with non-unit capacity are modeled as parallel edges.
There are two subsets of nodes: a set $\mathcal{S} \subseteq \mathcal{V}$ of sources and a set $\mathcal{T} \subseteq \mathcal{V}$ of sinks within the network. Let $R_i$ be the multicast transmission rate from source $i$, $i \in \{1, \dots, |\mathcal{S}|\}$, to every sink. For any nonempty subset $\mathcal{S}' \subseteq \mathcal{S}$, we identify $\mathcal{S}'$ with the set of indices of the source nodes that belong to it. Let $m_{\mathcal{S}'}$ be the minimum min-cut capacity between $\mathcal{S}'$ and any sink. For each $i$, let $\mathcal{C}_i$ be the code used by source $i$. Let $\mathcal{C}_{\mathcal{S}'}$ be the Cartesian product of the individual codes of the sources in $\mathcal{S}'$.
Within the network there is a computationally unbounded adversary who can observe all transmissions and inject its own packets on up to $z$ links; these injections may be chosen as a function of its knowledge of the network, the message, and the communication scheme. (Note that since each transmitted symbol in the network is drawn from a finite field, modifying a symbol $x$ into a symbol $y$ on a link is equivalent to injecting/adding the symbol $y - x$ on that link.) The locations of the adversarial links are fixed but unknown to the communicating parties. In the case of the side-channel model, there additionally exists a random secret shared between all sources and each of the sinks, as in [Jaggi_etal08, landberg08].
The sources, on the other hand, do not have any knowledge of each other’s transmitted information or of the links compromised by the adversary. Their goal is to judiciously add redundancy to their transmitted packets so that they can achieve any rate tuple within the capacity region.
V-B Random Linear Network Coding
In this paper, we consider the following well-known distributed random linear coding scheme [RandCode0].
Sources: All sources have incompressible data that they wish to deliver to all destinations over the network. Source $i$ arranges its data into batches of $R_i$ packets and inserts these packets as the rows of a message matrix $X_i$ over $\mathbb{F}_q$ (the packet length $n$ is a network design parameter). Each source then takes independent and uniformly random linear combinations over $\mathbb{F}_q$ of the rows of $X_i$ to generate the packets transmitted on each of its outgoing edges.
Network nodes: Each internal node similarly takes uniformly random linear combinations of the packets on its incoming edges to generate the packets transmitted on its outgoing edges.
Adversary: The adversarial packets are defined as the differences between the received and transmitted packets on each link. They are similarly arranged into a matrix $Z$ of size $z \times n$.
Sink: Each sink $t$ constructs a matrix $Y_t$ over $\mathbb{F}_q$ by treating the received packets as consecutive length-$n$ row vectors of $Y_t$. Since all the operations in the network are linear, each sink has an incoming matrix that is given by
(5)  $Y_t = \sum_{i=1}^{|\mathcal{S}|} A_{i,t} X_i + B_t Z,$

where $A_{i,t}$, $i \in \{1, \dots, |\mathcal{S}|\}$, is the overall transform matrix from $X_i$ to $t$ and $B_t$ is the overall transform matrix from the adversary to sink $t$.
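A minimal simulation of this observation model, over GF(2) and with illustrative dimensions (two sources, one adversarial link, five incoming links at the sink), can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                        # packet length
k1, k2 = 2, 2                # batch sizes of the two sources
z = 1                        # number of adversarial links
recv = 5                     # number of the sink's incoming links

X = rng.integers(0, 2, size=(k1 + k2, n))   # stacked source message matrices
Z = rng.integers(0, 2, size=(z, n))         # adversarial packets

# Overall transform matrices induced by the random mixing coefficients;
# in a real network they arise from the per-node linear combinations.
A = rng.integers(0, 2, size=(recv, k1 + k2))
B = rng.integers(0, 2, size=(recv, z))

Y = (A @ X + B @ Z) % 2                     # the sink's observation, as in (5)
print(Y.shape)                              # -> (5, 6)
```

Here the two sources' transform matrices are stacked into a single $A$, which is equivalent to the per-source sum in (5).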
V-C Finite Field Extensions
In the analysis below, we denote by $\mathbb{F}_q^{a \times b}$ the set of all $a \times b$ matrices with elements from $\mathbb{F}_q$. The identity matrix of dimension $a$ is denoted by $I_a$, and the zero matrix of any dimension is denoted by $\mathbf{0}$; its dimensions will be clear from the context. For clarity of notation, vectors are in boldface (e.g., $\mathbf{x}$).
Every finite field $\mathbb{F}_q$, where $q$ is a prime power, can be algebraically extended to a larger finite field $\mathbb{F}_{q^n}$ for any positive integer $n$ [Algebra_Martin]. (Let $\mathbb{F}_q[x]$ be the set of all polynomials over $\mathbb{F}_q$ and $p(x)$ an irreducible polynomial of degree $n$. Then $\mathbb{F}_q[x]/p(x)$ defines an algebraic extension field via a homomorphic mapping [Algebra_Martin].) Note that $\mathbb{F}_{q^n}$ includes $\mathbb{F}_q$ as a subfield; thus any matrix over $\mathbb{F}_q$ is also a matrix over $\mathbb{F}_{q^n}$. Hence, throughout the paper, multiplication of matrices from different fields (one from the base field and the other from the extension field) is allowed and is computed over the extension field.
The above extension operation defines a bijective mapping between $\mathbb{F}_q^{a \times n}$ and $\mathbb{F}_{q^n}^{a \times 1}$ as follows:

For each $X \in \mathbb{F}_q^{a \times n}$, the folded version of $X$ is a vector $\bar{X} \in \mathbb{F}_{q^n}^{a \times 1}$ given by $\bar{X} = X \boldsymbol{\beta}$, where $\boldsymbol{\beta} = (\beta_1, \dots, \beta_n)^T$ and $\{\beta_1, \dots, \beta_n\}$ is a basis of the extension field $\mathbb{F}_{q^n}$ with respect to $\mathbb{F}_q$. Here we treat the $i$-th row of $X$ as a single element of $\mathbb{F}_{q^n}$ to obtain the $i$-th element of $\bar{X}$.

For each $\bar{X} \in \mathbb{F}_{q^n}^{a \times 1}$, the unfolded version of $\bar{X}$ is a matrix $X \in \mathbb{F}_q^{a \times n}$. Here we treat the $i$-th element of $\bar{X}$ as a row vector in $\mathbb{F}_q^{1 \times n}$ (its coordinate vector in the basis $\{\beta_1, \dots, \beta_n\}$) to obtain the $i$-th row of $X$.
We can also extend these operations to more general scenarios. Specifically, any matrix $X \in \mathbb{F}_q^{a \times ln}$ can be written as a concatenation of $l$ matrices, $X = [X_1 \; X_2 \; \cdots \; X_l]$, where $X_j \in \mathbb{F}_q^{a \times n}$. The folding operation is then defined as $\bar{X} = [\bar{X}_1 \; \bar{X}_2 \; \cdots \; \bar{X}_l] \in \mathbb{F}_{q^n}^{a \times l}$. Similarly, the unfolding operation can be applied to a number of submatrices of a large matrix.
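A minimal sketch of folding and unfolding over GF(2): each row of $n$ bits is read as the coordinate vector of one GF($2^n$) element in a fixed polynomial basis, represented here simply as an integer in $\{0, \dots, 2^n - 1\}$. The integer representation and basis ordering are illustrative choices.

```python
import numpy as np

def fold(M):
    """a x n binary matrix -> length-a list of GF(2^n) elements (as ints)."""
    return [int("".join(map(str, row)), 2) for row in M]

def unfold(v, n):
    """Length-a list of ints -> a x n binary matrix (inverse of fold)."""
    return np.array([[(e >> (n - 1 - j)) & 1 for j in range(n)] for e in v])

M = np.array([[1, 0, 1, 1],
              [0, 1, 1, 0]])
v = fold(M)
print(v)                                   # -> [11, 6]
assert np.array_equal(unfold(v, 4), M)     # the mapping is a bijection
```

The block-wise extension described above amounts to applying `fold` to each $a \times n$ sub-block of a wider matrix and concatenating the resulting columns.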
In this paper, double algebraic extensions are also considered. More precisely, let $\mathbb{F}_{q^{nk}}$ be an algebraic extension of $\mathbb{F}_{q^n}$ for any positive integer $k$. Table LABEL:Tab:FieldNotation summarizes the notation for the fields considered.
Field  $\mathbb{F}_q$  $\mathbb{F}_{q^n}$  $\mathbb{F}_{q^{nk}}$
Size  $q$  $q^n$  $q^{nk}$