Multiple-access Network Information-flow and Correction Codes
This work considers the multiple-access multicast error-correction scenario over a packetized network with malicious edge adversaries. The network has a given min-cut and packet length, and each sink demands all information from all sources. The capacity region is characterized for both a “side-channel” model (where the sources and sinks share some random bits that are secret from the adversary) and an “omniscient” adversarial model (where no limitations are placed on the adversary’s knowledge). In the “side-channel” adversarial model, the use of a secret channel allows higher rates to be achieved than in the “omniscient” adversarial model, and a polynomial-complexity capacity-achieving code is provided. For the “omniscient” adversarial model, two capacity-achieving constructions are given: the first is based on random subspace code design and has exponential complexity, while the second uses a novel multiple-field-extension technique and has complexity polynomial in the network size. Our code constructions are “end-to-end” in that all nodes except the sources and sinks are oblivious to the adversaries and may simply implement predesigned linear network codes (random or otherwise). Moreover, the sources act independently, without knowledge of the data from the other sources.
Information dissemination can be optimized with the use of network coding, which maximizes network throughput in multicast transmission scenarios [ACLY00]. For this scenario, it was shown in [LYC03] that linear network coding suffices to achieve the max-flow capacity from the source to each receiving node. An algebraic framework for linear network coding was presented in [KM03]. Furthermore, the linear combinations employed at network nodes can be selected randomly in a distributed manner; if the coding field size is sufficiently large, the max-flow capacity is achieved with high probability [Ho_etal06].
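As a concrete illustration of the distributed random mixing of [Ho_etal06], the sketch below forms random linear combinations of packets over a prime field. The field size, packet length, and batch size are illustrative assumptions, not parameters from this paper.

```python
# Minimal sketch of random linear network coding over a prime field GF(p).
# The field size P, packet length, and batch size below are illustrative
# assumptions, not parameters from the paper.
import random

P = 257  # prime field size; in general any prime power works

def mix(packets, p=P):
    """Return one uniformly random linear combination of the packets over GF(p)."""
    coeffs = [random.randrange(p) for _ in packets]
    out = [0] * len(packets[0])
    for c, pkt in zip(coeffs, packets):
        for j, sym in enumerate(pkt):
            out[j] = (out[j] + c * sym) % p
    return out

# A source holds a batch of message packets and emits random mixtures
# on its outgoing edges; internal nodes re-mix in the same way.
random.seed(1)
messages = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
edge_packets = [mix(messages) for _ in range(4)]
# With high probability (for large p), any 3 received mixtures have a
# full-rank coefficient matrix, so the sink can recover the batch.
```

The same `mix` operation models both source encoding and internal-node re-encoding, which is what makes the scheme fully distributed.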
However, network coding is vulnerable to malicious attacks from rogue users. Due to the mixing operations at internal nodes, the presence of even a small number of adversarial nodes can contaminate the majority of packets in a network, preventing sinks from decoding. In particular, an error on even a single link might propagate to multiple downstream links via network coding, which might lead to the extreme case in which all incoming links at the sink are in error. This is shown in Fig. 1, where the action of a single malicious node contaminates all incoming links of the sink node due to packet mixing at downstream nodes.
In such a case, network error correction (introduced in [RaymondNECC2002]), rather than classical forward error correction (FEC), is required, since the former exploits the fact that the errors at the sinks are correlated, whereas the latter assumes independent errors.
A number of papers, e.g., [YC06, CY06, SKK08], have characterized the set of achievable communication rates over networks containing hidden malicious jamming and eavesdropping adversaries, and given corresponding communication schemes. The latest code constructions (for instance [SKK08] and [Jaggi_etal08]) have excellent parameters: they have low computational complexity, are distributed, and are asymptotically rate-optimal. However, these papers focus on single-source multicast problems, where a single source wishes to communicate all its information to all sinks.
In this work we examine the problem of multiple-access multicast, where multiple sources wish to communicate all their information to all sinks. We characterize the optimal rate-region for several variants of the multiple-access network error-correction problem and give matching code constructions, which have low computational complexity when the number of sources is small.
We are unaware of any straightforward application of existing single-source network error-correcting subspace codes that achieve the optimal rate regions. This is because single-source network error-correcting codes such as those of [Jaggi_etal08] and [SKK08] require the source to judiciously insert redundancy into the transmitted codeword; however, in the distributed source case the codewords are constrained by the independence of the sources.
II Background and related work
For a single-source single-sink network with min-cut $m$, the capacity of the network under arbitrary errors on up to $z$ links is given by
$$C = m - 2z, \qquad (1)$$
and can be achieved by a classical end-to-end error-correction code over multiple disjoint paths from the source to the sink. This result is a direct extension of the Singleton bound (see, e.g., [Roth06]). Since the Singleton bound can be achieved by a maximum distance separable code, such as a Reed-Solomon code, such a code also suffices to achieve capacity in the single-source single-sink case.
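The Singleton-type argument can be made concrete: across $m$ disjoint paths, a Reed-Solomon code of length $m$ and dimension $m - 2z$ corrects any $z$ corrupted paths. The sketch below uses brute-force minimum-distance decoding; the field size and cut values are illustrative assumptions.

```python
# Cut-set capacity m - 2z on m disjoint source-sink paths: a Reed-Solomon
# code of length m and dimension k = m - 2z corrects any z path errors.
# The parameters (p, m, z) are illustrative assumptions.
from itertools import product

p = 13          # prime field size
m = 6           # min-cut: number of disjoint paths
z = 2           # number of adversarial links
k = m - 2 * z   # achievable rate per the Singleton-type bound

def rs_encode(msg):
    # Evaluate the message polynomial at m distinct points of GF(p).
    return [sum(c * pow(x, i, p) for i, c in enumerate(msg)) % p
            for x in range(m)]

def md_decode(received):
    # Brute-force minimum-distance decoding (fine for tiny parameters).
    best, best_d = None, m + 1
    for msg in product(range(p), repeat=k):
        d = sum(a != b for a, b in zip(rs_encode(msg), received))
        if d < best_d:
            best, best_d = list(msg), d
    return best

msg = [3, 7]
codeword = rs_encode(msg)
corrupted = list(codeword)
corrupted[0] = (corrupted[0] + 5) % p   # adversary alters z = 2 paths
corrupted[4] = (corrupted[4] + 1) % p
assert md_decode(corrupted) == msg      # z errors are corrected
```

Since the code has minimum distance $m - k + 1 = 2z + 1$, any pattern of $z$ errors leaves the transmitted codeword as the unique nearest one.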
In the network multicast scenario, the situation is more complicated. For single-source multicast, the capacity region was shown ([RaymondNECC2002, YC06, CY06]) to be the same as (1), with $m$ now representing the minimum of the sink min-cuts [YC06]. However, unlike single-source single-sink networks, in the case of single-source multicast, network error correction is required: network coding is required in general for multicast even in the error-free case [ACLY00], and with the use of network coding, errors in the sink observations become dependent and cannot be corrected by end-to-end codes.
Two flavors of the network error correction problem are often considered. In the coherent case, it is assumed that there is centralized knowledge of the network topology and network code. Network error correction for this case was first addressed by the work of Cai and Yeung [RaymondNECC2002, YC06, CY06] for the single source scenario by generalizing classical coding theory to the network setting. However, their scheme has decoding complexity which is exponential in the network size.
In the harder non-coherent case, the network topology and/or network code are not known a priori to any of the honest parties. In this setting, [Jaggi_etal08, KK08] provided network error-correcting codes with design and implementation complexity that is only polynomial in the network parameters. Reference [KK08] introduced an elegant approach in which information is transmitted via the space spanned by the received packets/vectors; hence any generating set for the same space is equivalent from the sink's perspective [KK08]. Error-correction techniques for this case were proposed in [KK08] and [SKK08] in the form of constant-dimension and rank-metric codes, respectively, where the codewords are defined as subspaces of some ambient space. These works considered only the single-source case.
For the non-coherent multi-source multicast scenario without errors, the scheme of [Ho_etal06] achieves any point inside the rate-region. An extension of subspace codes to multiple sources, for a non-coherent multiple-access channel model without errors, was provided in [SFD08], which gave practical achievable (but not rate-optimal) algebraic code constructions, and in [FragouliITW09MultiSource], which derived the capacity region and gave a rate-optimal scheme for two sources. For the multi-source case with errors, [FragouliMibiHoc09MultiSource] provided an efficient code construction achieving a strict subregion of the capacity region.
In this work we address the capacity region and the corresponding code design for the multiple-source multicast communication problem under different adversarial scenarios. The issues that arise in this problem are best explained with a simple single-sink example, shown in Fig. 2. Suppose that the two sources $s_1$ and $s_2$ encode their information independently of each other. We can allocate one part of the network to carry only information from $s_1$, and another part to carry only information from $s_2$; in this case only one source is able to communicate reliably under one link error. However, if coding at the middle nodes is employed, the two sources are able to share network capacity to send redundant information, and each source is able to communicate reliably at capacity under a single link error. This shows that, in contrast to the single-source case, coding across multiple sources is required so that the sources can simultaneously use shared network capacity to send redundant information, even for a single sink.
In Section LABEL:The_Non_Coherent_Case we show that for the example network in Fig. 2, the capacity region is given by
$$R_i \le m_i - 2z \;\; (i = 1, 2), \qquad R_1 + R_2 \le m_{12} - 2z, \qquad (2)$$
where for $i = 1, 2$, $R_i$ is the information rate of source $s_i$, $m_i$ is the minimum cut capacity between $s_i$ and the sink, $m_{12}$ is the minimum cut capacity between $\{s_1, s_2\}$ and the sink, and $z$ is the known upper bound on the number of link errors. Hence, similarly to single-source multicast, the capacity region of a multi-source multicast network is described by the cut-set bounds. From that perspective, one may draw a parallel with point-to-point error correction. However, for multi-source multicast networks, point-to-point error-correcting codes do not suffice and a careful network code design is required. For instance, the work of [FragouliMibiHoc09MultiSource], which applies single-source network error-correcting codes to this problem, achieves a rate region that is strictly smaller than the capacity region (2) [fragouli_personal].
IV Our results
In this paper we consider a “side-channel” model and an “omniscient” adversarial model. In the former, the adversary does not have access to all the information available in the network; for example, as in [Jaggi_etal08, landberg08], the sources share a secret with the sink(s) in advance of the network communication. Let $\mathcal{S}$ be the set of sources in the network and $|\mathcal{S}|$ the number of sources, let $R_i$ be the multicast transmission rate from source $i$, $i \in \mathcal{S}$, to every sink, and for any non-empty subset $S \subseteq \mathcal{S}$ let $m_S$ be the minimum min-cut capacity between any sink and $S$.
In Section LABEL:The_Random_Secret_Model we prove the following theorem:
Consider a multiple-source multicast network error-correction problem on a network, possibly with unknown topology, where each source shares a random secret with each of the sinks. For any errors on up to $z$ links, the capacity region is given by
$$\sum_{i \in S} R_i \le m_S - z \quad \text{for every non-empty } S \subseteq \mathcal{S},$$
and every point in the rate region can be achieved with a polynomial-time code.
By capacity region we mean the closure of all rate tuples $(R_1, \dots, R_{|\mathcal{S}|})$ for which there is a sequence of codes of length $n$, with message sets and encoding and decoding functions for every node in the network and every sink, such that for every $\epsilon > 0$ and $\delta > 0$ there is an integer $N$ such that for every $n > N$ each source $i$ transmits at rate at least $R_i - \delta$ and the probability of decoding error at any sink is less than $\epsilon$, regardless of the message.
In the “omniscient” adversarial model, we do not assume any limitation on the adversary’s knowledge, i.e., decoding should succeed for arbitrary error values. In Section LABEL:General_approach we derive the multiple-access network error-correction capacity for both the coherent and non-coherent cases. We show that network error-correction coding allows redundant network capacity to be shared among multiple sources, enabling the sources to simultaneously communicate reliably at their individual cut-set capacities under adversarial errors. Specifically, we prove the following theorem:
Consider a multiple-source multicast network error-correction problem on a network whose topology may be unknown. For any errors on up to $z$ links, the capacity region is given by
$$\sum_{i \in S} R_i \le m_S - 2z \quad \text{for every non-empty } S \subseteq \mathcal{S}.$$
The rate-regions are, perhaps not surprisingly, larger for the side-channel model than for the omniscient adversarial model.
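Under the cut-set characterization of the omniscient model (achievable rate tuples satisfy $\sum_{i \in S} R_i \le m_S - 2z$ for every non-empty source subset $S$), feasibility of a rate tuple can be checked by enumerating subsets. The helper and the example min-cut values below are illustrative, not taken from the paper.

```python
# Feasibility check for the omniscient capacity region: a rate tuple
# (R_1,...,R_k) is achievable iff for every non-empty subset S of sources,
# sum_{i in S} R_i <= m_S - 2z, where m_S is the minimum min-cut from S
# to any sink. Function name and example numbers are illustrative.
from itertools import combinations

def in_capacity_region(rates, mincut, z):
    """mincut maps each non-empty frozenset of source indices to m_S."""
    n = len(rates)
    for r in range(1, n + 1):
        for S in combinations(range(n), r):
            if sum(rates[i] for i in S) > mincut[frozenset(S)] - 2 * z:
                return False
    return True

# Two-source example in the spirit of Fig. 2: per-source min-cut 3,
# joint min-cut 4, one adversarial link (z = 1).
m = {frozenset({0}): 3, frozenset({1}): 3, frozenset({0, 1}): 4}
assert in_capacity_region([1, 1], m, z=1)       # 1 + 1 <= 4 - 2
assert not in_capacity_region([2, 1], m, z=1)   # 2 > 3 - 2 for source 0
```

For the side-channel model the same check applies with `2 * z` replaced by `z`, which is exactly why its region is larger.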
Finally, in Section LABEL:Polynomial_time_construction we provide computationally efficient distributed schemes for the non-coherent case (and therefore for the coherent case too) that are rate-optimal for correction of network errors injected by computationally unbounded adversaries. In particular, our code construction achieves a decoding success probability that tends to one as the size $q$ of the finite field over which coding is performed grows, with complexity polynomial in the network size.
The remainder of the paper is organized as follows: In Section V we formally introduce our problem and give some mathematical preliminaries. In Section LABEL:The_Random_Secret_Model we derive the capacity region and construct multi-source multicast error-correcting codes for the side-channel model. In Section LABEL:The_Non_Coherent_Case, we consider two network error-correction schemes for omniscient adversary models which are able to achieve the full capacity region in both the coherent and non-coherent case. In particular, we provide a general approach based on minimum distance decoding, and then refine it to a practical code construction and decoding algorithm which has polynomial complexity (in all parameters except the number of sources). Furthermore, our codes are fully distributed in the sense that different sources require no knowledge of the data transmitted by their peers, and end-to-end, i.e. all nodes are oblivious to the adversaries present in the network and simply implement random linear network coding [RandCode0]. A remaining bottleneck is that while the implementation complexity (in terms of packet-length, field-size, and computational complexity) of our codes is polynomial in the size of most network parameters, it increases exponentially with the number of sources. Thus, the design of efficient schemes for a large number of sources is still open. Portions of this work were presented in [Svit_rate_regions] and in [hongyi_ted].
We consider a delay-free acyclic network $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is the set of nodes and $\mathcal{E}$ is the set of edges. The capacity of each edge is normalized to one symbol of the finite field $\mathbb{F}_q$ per unit time, where $q$ is a power of a prime. Edges with non-unit capacity are modeled as parallel edges.
There are two subsets of nodes: a set $\mathcal{S} \subseteq \mathcal{V}$ of sources and a set $\mathcal{T} \subseteq \mathcal{V}$ of sinks within the network. Let $R_i$ be the multicast transmission rate from source $i$, $i \in \mathcal{S}$, to every sink. For any non-empty subset $S \subseteq \mathcal{S}$, let $m_S$ be the minimum min-cut capacity between $S$ and any sink. For each $i \in \mathcal{S}$, let $\mathcal{C}_i$ be the code used by source $i$, and let $\mathcal{C}_S$ be the Cartesian product of the individual codes of the sources in $S$.
Within the network there is a computationally unbounded adversary who can observe all the transmissions and inject its own packets on up to $z$ links (note that since each transmitted symbol in the network is from a finite field, modifying a symbol $x$ to a symbol $y$ is equivalent to adding the symbol $y - x$ to $x$). The injected packets may be chosen as a function of the adversary's knowledge of the network, the message, and the communication scheme. The location of the adversarial links is fixed but unknown to the communicating parties. In the side-channel model, there additionally exists a random secret shared between all sources and each of the sinks, as in [Jaggi_etal08, landberg08].
The sources on the other hand do not have any knowledge about each other’s transmitted information or about the links compromised by the adversary. Their goal is to judiciously add redundancy into their transmitted packets so that they can achieve any rate-tuple within the capacity region.
V-B Random Linear Network Coding
In this paper, we consider the following well-known distributed random linear coding scheme [RandCode0].
Sources: All sources have incompressible data which they wish to deliver to all destinations over the network. Source $i$ arranges its data into batches of packets and inserts these packets as the rows of a message matrix $X_i$ over $\mathbb{F}_q$ (the packet length $n$ is a network design parameter). Each source then takes independent and uniformly random linear combinations over $\mathbb{F}_q$ of the rows of $X_i$ to generate the packets transmitted on each outgoing edge.
Network nodes: Each internal node similarly takes (uniformly) random linear combinations of the packets on its incoming edges to generate packets transmitted on its outgoing edges.
Adversary: The adversarial packets are defined as the difference between the received and transmitted packets on each link. They are similarly arranged into a matrix $Z$ over $\mathbb{F}_q$, with one length-$n$ row per adversarial link.
Sink: Each sink $t$ constructs a matrix $Y_t$ over $\mathbb{F}_q$ by treating the received packets as consecutive length-$n$ row vectors of $Y_t$. Since all the operations in the network are linear, each sink has an incoming matrix that is given by
$$Y_t = \sum_{i \in \mathcal{S}} A_{i,t} X_i + B_t Z,$$
where $A_{i,t}$, $i \in \mathcal{S}$, is the overall transform matrix from $X_i$ to $Y_t$ and $B_t$ is the overall transform matrix from the adversary to sink $t$.
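The sink's observation, a linear mixture of all source batches plus the adversarial injection, can be sketched numerically. All matrices, the field size, and the dimensions below are illustrative assumptions.

```python
# Sketch of the sink observation Y = sum_i A_i X_i + B Z over GF(p):
# A_i is the network transfer matrix from source i to the sink and B is
# the transfer matrix from the adversarial links. Values are illustrative.
p = 11

def matmul(A, B, mod=p):
    """Multiply two matrices (lists of rows) over GF(mod)."""
    return [[sum(a * b for a, b in zip(row, col)) % mod
             for col in zip(*B)] for row in A]

def matadd(A, B, mod=p):
    return [[(a + b) % mod for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

X1 = [[1, 2, 3], [4, 5, 6]]        # source 1: two length-3 packets
X2 = [[7, 8, 9], [10, 0, 1]]       # source 2: two length-3 packets
Z  = [[3, 3, 3]]                   # one adversarial packet
A1 = [[1, 0], [0, 1], [1, 1]]      # transfer matrices to a 3-edge sink
A2 = [[2, 0], [0, 2], [1, 2]]
B  = [[0], [1], [1]]

Y = matadd(matadd(matmul(A1, X1), matmul(A2, X2)), matmul(B, Z))
# Each row of Y is one received packet: a GF(p)-linear mixture of all
# source packets plus the adversary's injection.
```

In the non-coherent setting the sink knows neither the transfer matrices nor `Z`, which is precisely what the error-correcting code must compensate for.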
V-C Finite Field Extensions
In the analysis below, denote by $\mathbb{F}_q^{a \times b}$ the set of all $a \times b$ matrices with elements from $\mathbb{F}_q$. The identity matrix of dimension $a$ is denoted by $I_a$, and the zero matrix of any dimension is denoted by $\mathbf{0}$; the dimension of the zero matrix will be clear from the context. For clarity of notation, vectors are in bold-face (e.g., $\mathbf{x}$).
Every finite field $\mathbb{F}_q$, where $q$ is a prime power, can be algebraically extended to a larger finite field $\mathbb{F}_Q$, where $Q = q^s$ for any positive integer $s$ (let $\mathbb{F}_q[x]$ be the set of all polynomials over $\mathbb{F}_q$ and let $f(x)$ be an irreducible polynomial of degree $s$; then $\mathbb{F}_q[x]/\langle f(x) \rangle$ defines an algebraic extension field [Algebra_Martin]). Note that $\mathbb{F}_Q$ includes $\mathbb{F}_q$ as a subfield; thus any matrix over $\mathbb{F}_q$ is also a matrix over $\mathbb{F}_Q$. Hence throughout the paper, multiplication of matrices from different fields (one from the base field and the other from the extension field) is allowed and is computed over the extension field.
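The quotient construction above can be sketched for $q = 2$: elements of GF$(2^s)$ are polynomials of degree less than $s$, stored as bit-packed integers, and multiplication reduces modulo an irreducible polynomial. The choice $s = 4$ and the polynomial $x^4 + x + 1$ are illustrative assumptions.

```python
# Construction of GF(2^s) as F_2[x] modulo an irreducible polynomial,
# with elements stored as s-bit integers (coefficient i in bit i).
# The choice s = 4 and the polynomial x^4 + x + 1 are illustrative.
S = 4
IRRED = 0b10011  # x^4 + x + 1, irreducible over F_2

def gf_mul(a, b):
    """Multiply two GF(2^4) elements: carry-less product, then reduce."""
    prod = 0
    while b:
        if b & 1:
            prod ^= a
        a <<= 1
        b >>= 1
    # reduce modulo the irreducible polynomial
    for shift in range(prod.bit_length() - 1, S - 1, -1):
        if prod >> shift & 1:
            prod ^= IRRED << (shift - S)
    return prod

# x * x^3 = x^4 = x + 1 (i.e., 2 * 8 = 3) modulo x^4 + x + 1.
assert gf_mul(0b0010, 0b1000) == 0b0011
# Every nonzero element has a multiplicative inverse (field property).
assert all(any(gf_mul(a, b) == 1 for b in range(1, 16)) for a in range(1, 16))
```

Addition in this representation is simply XOR of the packed integers, since coefficients live in $\mathbb{F}_2$.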
The above extension operation defines a bijective mapping between $\mathbb{F}_q^{n \times s}$ and $\mathbb{F}_Q^{n}$ as follows:

For each $X \in \mathbb{F}_q^{n \times s}$, the folded version of $X$ is a vector $\bar{X} \in \mathbb{F}_Q^{n}$ given by $\bar{X} = X (\alpha_1, \dots, \alpha_s)^T$, where $\{\alpha_1, \dots, \alpha_s\}$ is a basis of the extension field $\mathbb{F}_Q$ with respect to $\mathbb{F}_q$. Here we treat the $i$-th row of $X$ as a single element in $\mathbb{F}_Q$ to obtain the $i$-th element of $\bar{X}$.

For each $\bar{X} \in \mathbb{F}_Q^{n}$, the unfolded version of $\bar{X}$ is a matrix $X \in \mathbb{F}_q^{n \times s}$. Here we treat the $i$-th element of $\bar{X}$ as a row in $\mathbb{F}_q^{s}$ to obtain the $i$-th row of $X$.
We can also extend these operations to more general scenarios. Specifically, any matrix $X \in \mathbb{F}_q^{n \times ks}$ can be written as a concatenation of matrices $X = [X_1 \; X_2 \; \cdots \; X_k]$, where $X_j \in \mathbb{F}_q^{n \times s}$. The folding operation is then applied blockwise: $\bar{X} = [\bar{X}_1 \; \bar{X}_2 \; \cdots \; \bar{X}_k]$. Similarly, the unfolding operation can be applied to a number of submatrices of a larger matrix.
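Under the common polynomial-basis representation of GF$(2^s)$ (elements as $s$-bit integers), folding and unfolding reduce to packing and unpacking each length-$s$ bit row. The sketch below assumes that representation; $s$ and the example matrix are illustrative.

```python
# Folding/unfolding between F_2 matrices and vectors over GF(2^s), with
# GF(2^s) elements represented as s-bit integers in a polynomial basis.
# The representation choice and s = 4 below are illustrative assumptions.
s = 4

def fold(rows):
    """Pack each length-s bit row into one GF(2^s) element (an integer)."""
    assert all(len(r) == s for r in rows)
    return [sum(bit << j for j, bit in enumerate(r)) for r in rows]

def unfold(elems):
    """Expand each GF(2^s) element back into its length-s bit row."""
    return [[(e >> j) & 1 for j in range(s)] for e in elems]

M = [[1, 0, 1, 1],
     [0, 1, 0, 0],
     [1, 1, 1, 0]]
v = fold(M)
assert unfold(v) == M          # fold/unfold is a bijection
# Addition in GF(2^s) is componentwise XOR of packed integers, matching
# row-wise addition of the unfolded matrices over F_2.
assert fold([[a ^ b for a, b in zip(M[0], M[1])]])[0] == v[0] ^ v[1]
```

This is why working over the folded (extension-field) view loses no information: base-field linear operations commute with folding.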
In this paper, double algebraic extensions are also considered. More precisely, let $\mathbb{F}_{Q'}$ be an algebraic extension of $\mathbb{F}_Q$, where $Q' = Q^{s'}$ for any positive integer $s'$. Table LABEL:Tab:Field-Notation summarizes the notation for the fields considered.