A Unified Approach for Network Information Theory
Abstract
In this paper, we take a unified approach to network information theory and prove a coding theorem that can recover most of the achievability results in network information theory that are based on random coding. The final single-letter expression has a very simple form, which was made possible by several novel elements: a unified framework that represents various network problems in a simple and unified way, a unified coding strategy that consists of a few basic ingredients but can emulate many known coding techniques if needed, and new proof techniques beyond the use of the standard covering and packing lemmas. For example, in our framework, sources, channels, states, and side information are treated in a unified way, and various constraints such as cost and distortion constraints are unified as a single joint-typicality constraint.
Our theorem can be useful in proving many new achievability results easily and in some cases gives simpler rate expressions than those obtained using conventional approaches. Furthermore, our unified coding scheme can strictly outperform existing schemes. For example, we obtain a generalized decode-compress-amplify-and-forward bound as a simple corollary of our main theorem and show that it strictly outperforms previously known coding schemes. Using our unified framework, we formally define and characterize three types of network duality based on channel input-output reversal and network flow reversal combined with packing-covering duality.
1 Introduction
In network information theory, we study the fundamental limits of information flow and processing in a network and develop coding strategies that can approach the limits closely. Instead of studying a fully general network, however, we often study simple canonical models such as the multiple-access channel [2], the relay channel [3], and distributed source coding [4], because they are easier to study and, more importantly, because we can get useful insights from studying them. Once such insights are obtained, one can try to develop a more general theory that is applicable to general networks.
However, such a task is challenging, and only partial results have been obtained so far [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15], in which the network model and/or the applied coding technique is limited. For example, network coding [5] and compress-and-forward (CF) [3] were unified as noisy network coding in [8, 9], which, however, does not include decode-and-forward (DF) [3]. DF and partial DF [3] were generalized for single-source multiple-relay single-destination networks [6] and for multicast and broadcast networks [13, 14], respectively. In [10], noisy network coding was combined with network DF [6], but the combination does not allow a relay to perform both partial DF and CF simultaneously. For joint source-channel coding problems, a hybrid analog/digital coding strategy [12] was proposed that recovers and generalizes many previously known results. Such a hybrid coding scheme was applied to some relay networks and was shown to unify both amplify-and-forward (AF) [16] and CF [3]. In [15], a novel framework for proving achievability was proposed based on output statistics of random binning and source-channel duality. One important feature of this framework is that the addition of secrecy is free, i.e., once an achievability result is obtained for a network model using this framework, an achievability result with an additional secrecy constraint is immediately obtained. We note that [12] and [15] took a bottom-up approach in the sense that achievability results are obtained separately for each of various network models.
In this paper, we take a top-down approach and prove a unified achievability theorem for a general network scenario with arbitrarily many nodes. Our setup is general enough that any combination of source coding, channel coding, joint source-channel coding, and coding for computing problems can be treated. Our result recovers most of the existing achievability results in network information theory as long as they are based on random coding. Some examples of known results recovered by our theorem are listed as follows:

Channel coding: Gelfand-Pinsker coding [17], Marton's inner bound for the broadcast channel [18], the Han-Kobayashi inner bound for the interference channel [19], [20], coding for channels with action-dependent states [21], interference decoding for the 3-user interference channel [22], [23], the Cover-Leung inner bound for the multiple-access channel with feedback [24], a combination of partial DF and CF for the relay channel [3], network DF [6], noisy network coding [8, 9], short-message noisy network coding with a DF option [10], and offset encoding for the multiple-access relay channel [25].
In Table 1, we compare several approaches that attempted to unify various coding strategies.
Previous Results  SWC [8]  DDF [13, 14]  NNC-DF [10]  HC [12]  Our result
Gelfand-Pinsker coding [17]
Marton coding [18]  ✓  ✓
Han-Kobayashi coding [19]  ✓
Interference decoding [22]
Cover-Leung coding [24]  ✓  ✓
DF [3]  ✓  ✓
Partial DF [3]  ✓
AF [16]
CF [3]
Combination of partial DF and CF [3]  ✓
Network coding [5]  ✓  ✓  ✓
NNC [8, 9]  ✓
Wyner-Ziv coding [26]
Slepian-Wolf coding [4]  ✓
Berger-Tung coding [27], [28]  ✓
Zhang-Berger coding [29]  ✓  ✓
Joint source-channel coding over multiple-access channels [30], broadcast channels [31], and interference channels [32, 33]  ✓  ✓
Hybrid coding [12]
Coding for computing [34]  ✓
Cascade coding for computing [35]  ✓
[Abbreviations] SWC: Slepian-Wolf coding over networks, DDF: distributed decode-and-forward, NNC-DF: noisy network coding with a DF option, HC: hybrid coding
Our theorem can be useful in proving new achievability results easily and in some cases gives simpler rate expressions than those obtained using conventional approaches. Furthermore, our unified coding scheme can strictly outperform existing schemes. To illustrate this, we show that a generalized decode-compress-amplify-and-forward bound for acyclic networks can be obtained as a simple corollary of our main theorem and that it strictly outperforms previously known coding schemes. As another special case of our main theorem, we derive a generalized decode-compress-and-forward bound for a discrete memoryless network (DMN) in [36], which recovers both the noisy network coding [9] and the distributed decode-and-forward [13] bounds. This is the first time the partial-decode-compress-and-forward bound (Theorem 7) by Cover and El Gamal [3] has been generalized for DMNs in such a way that each relay performs both partial DF and CF simultaneously.
Our unified coding theorem enables us to state various types of duality arising in network information theory. Specifically, we formally define and characterize three types of network duality based on channel input-output reversal and network flow reversal combined with packing-covering duality. Our duality results include as special cases many known duality relationships in network information theory, e.g., the duality between coding for the multiple-access channel [2] and for distributed sources [27], [28] (type-I duality), the duality between Gelfand-Pinsker coding [17] and Wyner-Ziv coding [26] (type-II duality), and the duality between coding for the multiple-access channel [2] and the broadcast channel [18] (type-III duality).
Our unified achievability result is enabled by several novel elements: a unified framework that represents various network problems in a simple and unified way, a unified coding strategy that consists of a few basic ingredients but can emulate known coding techniques if needed, and new proof techniques beyond the use of the standard covering and packing lemmas. In our framework, sources, channels, states, and side information are treated in a unified way, and various constraints such as cost and distortion constraints are combined into a joint-typicality constraint, which is specified by a single joint distribution. Furthermore, we mainly consider acyclic discrete memoryless networks (ADMNs) in this paper, in which information flows in an acyclic manner. However, we also show that our coding theorem can be applied to general DMNs by unfolding the network. Graph unfolding was first used in [5] for network coding.
Our coding scheme has four main ingredients: superposition coding, simultaneous nonunique decoding, simultaneous compression, and symbol-by-symbol mapping. We note that our coding scheme does not explicitly include binning and multicoding, but it is still general enough to emulate them if needed. Although none of these coding ingredients is new, they are tweaked and combined in a special way to enable the unification of many previous approaches. In our coding scheme, covering codebooks are used to compress the information that each node observes and decodes. These covering codebooks are generated so as to permit superposition coding [37]. Each node operates according to the following three steps. The first step is simultaneous nonunique decoding [20, 38, 39], in which a node uniquely decodes some covering codewords of other nodes together with some other covering codewords that do not need to be decoded uniquely. The next step is simultaneous compression, in which the node simultaneously finds covering codewords that carry information about the received channel output sequence and the decoded codewords. Since we allow a general superposition relationship among the covering codebooks, a more general analysis beyond the multivariate covering lemma [40, 41] is needed. The last step is a symbol-by-symbol mapping from the received channel output sequence and the decoded and covered codewords to a channel input sequence. The technique of using a symbol-by-symbol mapping was introduced in [42] and is referred to as the Shannon strategy. Our symbol-by-symbol mapping from all three, i.e., the channel output sequence and the decoded and covered codewords, was first used in [43] for a three-node noncausal relay channel. We note that such a use of symbol-by-symbol mapping results in correlation between a channel input sequence and the non-chosen covering codewords, and thus the standard packing lemma [41] cannot be applied for the error analysis. Such correlation was problematic in many previous works and was resolved for some simple networks in [44, 12, 43]. Our proof technique completely resolves this correlation issue in a fully general network setup.
This paper is organized as follows. In Section 2, we present our unified framework. In Section 3, we propose a unified coding scheme and present the main theorem of this paper, along with various examples that illustrate how to utilize our results. In Section 4, we characterize three types of network duality. To demonstrate the usefulness of our unified coding theorem, in Section 5 we derive a generalized decode-compress-amplify-and-forward bound as a simple corollary of our theorem and show that it strictly outperforms previously known coding schemes. In Section 6, we present a unified coding theorem for the Gaussian case. We conclude the paper in Section 7.
1.1 Notation
The following notation is used throughout the paper.
For two integers $i$ and $j$, $[i:j]$ denotes the set $\{i, i+1, \ldots, j\}$. For a set $S$ of real numbers, $S(k)$ denotes the $k$th smallest element in $S$ and $S(k:l)$ denotes $(S(k), S(k+1), \ldots, S(l))$. For constants $x_1, \ldots, x_N$ and $S \subseteq [1:N]$, $x_S$ denotes the vector $(x_j : j \in S)$ and $x_j^k$ denotes $(x_j, \ldots, x_k)$, where the subscript is omitted when $j = 1$, i.e., $x^k = (x_1, \ldots, x_k)$. For random variables $X_1, \ldots, X_N$, $X_S$ and $X_j^k$ are defined similarly. For sets $\mathcal{A}_1, \ldots, \mathcal{A}_N$ and $S \subseteq [1:N]$, $\mathcal{A}_S$ denotes $\prod_{j \in S} \mathcal{A}_j$ and $\mathcal{A}_j^k$ denotes $\prod_{i=j}^{k} \mathcal{A}_i$, where the subscript is omitted when $j = 1$. Consider two real vectors $a$ and $b$ of length $k$. We say that $a$ is smaller than $b$ and write $a \prec b$ if there exists $j \in [1:k]$ such that $a_i = b_i$ for all $i < j$ and $a_j < b_j$. Furthermore, we say that $a$ is componentwise smaller than $b$ and write $a \leq b$ if $a_j \leq b_j$ for all $j \in [1:k]$. $\mathbf{1}$ denotes an all-one vector and $I$ denotes an identity matrix. When $X$ is a Gaussian random vector with mean $\mu$ and covariance matrix $\Sigma$, we write $X \sim \mathcal{N}(\mu, \Sigma)$. $1\{A\}$ is the indicator function, i.e., it is 1 if $A$ holds and 0 otherwise. $\delta(\epsilon)$ denotes a function of $\epsilon$ that tends to zero as $\epsilon$ tends to zero.
We follow the notion of typicality in [45], [41]. Let $\pi(x|x^n)$ denote the number of occurrences of $x$ in the sequence $x^n$. Then, $x^n$ is said to be $\epsilon$-typical (or just typical) for $X \sim p(x)$ if for every $x \in \mathcal{X}$,
$$\left| \frac{\pi(x|x^n)}{n} - p(x) \right| \leq \epsilon \, p(x).$$
The set of all $\epsilon$-typical $x^n$ is denoted as $\mathcal{T}_\epsilon^{(n)}(X)$, which is shortly denoted as $\mathcal{T}_\epsilon^{(n)}$. A jointly $\epsilon$-typical set (or just a typical set) such as $\mathcal{T}_\epsilon^{(n)}(X, Y)$ for multiple random variables, which will also be denoted as $\mathcal{T}_\epsilon^{(n)}$, is naturally defined in the same manner.
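Since the definition above is stated abstractly, a small computational sketch may help. The following Python function (the names and the tolerance parameter `eps` are ours, not the paper's notation) tests the robust-typicality condition for a single random variable:

```python
from collections import Counter

def is_typical(seq, pmf, eps):
    """Robust (epsilon-) typicality: for every symbol x,
    |pi(x | seq)/n - p(x)| <= eps * p(x), where pi(x | seq) is the
    number of occurrences of x in seq and n = len(seq)."""
    n = len(seq)
    counts = Counter(seq)
    # symbols of zero probability must not occur in a typical sequence
    if any(x not in pmf for x in counts):
        return False
    return all(abs(counts.get(x, 0) / n - p) <= eps * p
               for x, p in pmf.items())

# A balanced binary sequence is typical for Bern(1/2); a constant one is not.
print(is_typical([0, 1] * 50, {0: 0.5, 1: 0.5}, 0.1))   # True
print(is_typical([0] * 100, {0: 0.5, 1: 0.5}, 0.1))     # False
```

Note that the tolerance is multiplicative in p(x), which is what distinguishes this notion from weak typicality.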
2 Unified Framework
In this section, we build a unified framework for proving the achievability of many network information theory problems, including channel coding, source coding, joint source-channel coding, and coding for computing. Let us first construct a unified framework for point-to-point scenarios and then generalize it to general network scenarios.
2.1 Point-to-point scenarios
Consider the standard channel coding and source coding problems [46] illustrated in Fig. 1. These two problems can be stated with the following elements: the information to be communicated, the node interaction and node processing functions, and the definition of achievability. Let us investigate the differences between these two coding problems for each element and discuss how we can unify them into a single framework. In the following, $n$ denotes the number of channel uses for channel coding and the number of source symbols for source coding, and $R$ denotes the rate in each problem.

Information to be communicated: In channel coding, a message $M$, uniformly distributed over $[1:2^{nR}]$, is communicated from node 1 to node 2. In source coding, a discrete memoryless source (DMS) is given to node 1 and is reconstructed at node 2 (up to a prescribed distortion level in the case of lossy source coding). We can observe that a message can be regarded as a DMS with a uniform distribution. Hence, in both the channel coding and source coding problems, we can say that a DMS is given to node 1 and is reconstructed at node 2.

Node interaction and node processing functions: In channel coding, node 1 communicates with node 2 through a discrete memoryless channel (DMC) , . Node 1 maps to a channel input sequence and node 2 receives a channel output sequence and maps it to . In source coding, node 1 maps to an index and node 2 receives exactly and maps it to . The noiseless communication of an index in source coding can be regarded as a DMC such that . Hence, in both channel coding and source coding problems, we can say that node 1 communicates with node 2 through a DMC , , the processing function at node 1 is a mapping from to , and the processing function at node 2 is a mapping from to . By denoting by and by , we can further unify the notation for sequences and the node processing functions, i.e., a sequence received by node is denoted by , the resultant sequence from processing at node is denoted by , and the node processing function at node is a mapping from to .

Achievability: In channel coding and lossless source coding problems, a rate is said to be achievable if there exists a sequence of node processing functions such that , where denotes the probability of error event given as . In lossy source coding problem, a rate–distortion pair is said to be achievable if there exists a sequence of node processing functions such that , where is a distortion measure between two arguments.
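The "message as a DMS" observation in the first item above can be made concrete in code: a message that is uniform over {0, ..., 2^k - 1} is interchangeable with k i.i.d. Bernoulli(1/2) source bits. The sketch below is our own illustration, not the paper's construction:

```python
import random

def message_to_bits(m, k):
    """View a message m, uniform over {0, ..., 2^k - 1}, as k i.i.d.
    Bernoulli(1/2) source bits -- the 'message as a DMS' viewpoint."""
    return [(m >> i) & 1 for i in range(k)]

def bits_to_message(bits):
    """Inverse map: the k source bits determine the message."""
    return sum(b << i for i, b in enumerate(bits))

rng = random.Random(0)
k = 8  # k plays the role of nR for blocklength n and rate R
m = rng.randrange(2 ** k)
bits = message_to_bits(m, k)
print(len(bits) == k and bits_to_message(bits) == m)  # True
```

Because the bijection preserves the uniform distribution, any achievability statement about the message translates directly into one about the equivalent DMS.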
Now, let us introduce a new definition of achievability from which we can show the achievability of both channel coding and source coding problems in a unified way. We say a joint distribution , shortly denoted as , is achievable if there exists a sequence of node processing functions such that for any , where denotes the probability
in which the typical set is defined with respect to . Then, the achievability of an appropriately chosen implies the achievability of or in the channel coding and source coding problems. For the channel coding and lossless source coding problems, is achievable if such that is achievable. For the lossy source coding problem, is achievable if such that , is achievable, by the typical average lemma [41] and the continuity of the rate-distortion function in .
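The unified error criterion asks that all sequences be jointly typical with respect to the target distribution with probability approaching one. For i.i.d. sequences this is just the law of large numbers, as the following Monte Carlo sketch illustrates (the pmf and the tolerance are arbitrary choices of ours):

```python
import random
from collections import Counter

def typicality_fraction(pmf, n, eps, trials, rng):
    """Estimate the probability that a length-n i.i.d. sequence from pmf
    is eps-typical (empirical frequency within eps*p(x) of p(x))."""
    symbols = list(pmf)
    weights = [pmf[x] for x in symbols]
    hits = 0
    for _ in range(trials):
        counts = Counter(rng.choices(symbols, weights=weights, k=n))
        if all(abs(counts.get(x, 0) / n - p) <= eps * p
               for x, p in pmf.items()):
            hits += 1
    return hits / trials

rng = random.Random(1)
p = {'a': 0.5, 'b': 0.5}
short = typicality_fraction(p, n=10, eps=0.2, trials=500, rng=rng)
long_ = typicality_fraction(p, n=1000, eps=0.2, trials=500, rng=rng)
print(short < long_, long_ > 0.99)  # longer blocks are typical more often
```

This is why the definition quantifies over all blocklengths: the probability of the atypical event can be driven below any fixed threshold by taking the blocklength large enough.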
To see whether the aforementioned unification approach is general enough for point-to-point scenarios, let us consider more general point-to-point scenarios in Fig. 2. First, in channels with noncausal states [17] illustrated in Fig. 2(a), node 1 observes a message of rate and a state sequence and encodes as . Then, node 2 receives and estimates as . Achievability is defined in the same way as in the channel coding problem. Let us apply the aforementioned unification approach to this problem. Since represents all the information node 1 receives, we let such that and and are independent, where corresponds to the message of rate . However, we cannot use the channel form of to capture the dependency of the channel output on the state . This indicates that a more general channel form of is needed in the unified framework. Then, we can let be equal to . If we choose such that , the achievability of implies the achievability of of the original problem.
Next, in lossy source coding with side information [26], represented in Fig. 2(b), node 1 receives a source sequence and encodes it as an index . Then, node 2 receives the index and side information and reconstructs as up to some distortion level. Achievability is defined in the same way as in the lossy source coding problem. For this problem, we apply the unification approach as follows. We let . Since node 2 has two channel inputs, we let and let the channel be decomposed as , where the channel corresponds to the communication of of rate and hence its capacity is given as , i.e., , and the channel captures the correlation between and the side information . We choose the target distribution in the same way as in the lossy source coding problem. Furthermore, the coding for computing problem [34], in which node 2 wishes to reconstruct a function of and up to distortion with respect to a distortion measure , can also be included in this framework by choosing such that , where .
In summary, the achievability of the aforementioned point-to-point coding problems can be shown by considering the following unified framework. The network model is given by , as illustrated in Fig. 3, and the objective is specified by a target distribution . is said to be achievable if there exists a sequence of node processing functions, , , such that for any .
2.2 General scenarios
In this subsection, we generalize the unified framework of Section 2.1 to general $N$-node networks. In our unified framework for $N$ nodes, we define an $N$-node acyclic discrete memoryless network (ADMN) , , which consists of a set of alphabet pairs , and a collection of conditional pmfs , . Here, and represent any information that comes into and goes out of node , respectively. can be a channel output, a message, a source, noncausal state information, or any combination of these. can be a channel input, a message estimate, a reconstructed source, an action for generating a channel state, or any combination of these. Next, signifies the correlation between information prior to node and information received at node . It can capture a channel distribution possibly with states, correlation between distributed sources, and complicated network-wide correlation among sources and channels.
In this network, information flows in one direction and the node operations are sequential. Let $n$ denote the number of channel uses. First, is generated according to and then node 1 processes based on . Next, is generated according to and then node 2 encodes based on . Similarly, is generated according to and node encodes based on for . Clearly, any layered network [7] or noncausal network (without an infinite loop) [47], possibly with noncausal state or side information, can be represented as an ADMN. Furthermore, any strictly causal network (a usual discrete memoryless network with relay functions having a one-sample delay) or causal network (relays without delay [47]) with blockwise operations can be represented as an ADMN by unfolding the network. Note that our unified achievability theorem (Theorem 1) still applies to the unfolded network. Therefore, considering only acyclic DMNs (ADMNs) in our unified approach is without loss of generality, while greatly simplifying our unification approach. In the following subsection, we show several known examples represented by an ADMN.
Achievability is specified using a target joint distribution , which is shortly denoted as . For a set of node processing functions , , the probability of error is defined as , where the typical set is defined with respect to . We say that the target distribution is achievable if there exists a sequence of node processing functions , , such that for any . We note that the target distribution unifies diverse network demands and constraints: it can be used to designate the source-destination relationship and to impose distortion and cost constraints.
2.3 Examples
In this subsection, we represent some network information theory problems by an ADMN and a target distribution such that the achievability of implies the achievability of the original problem. Let us first consider some examples of single-hop networks.
Example 1 (Multiple access channels [2])
For the multiple-access channel problem with rates and , we choose , , , , , and such that .
Example 2 (Distributed lossy compression [27], [28])
For the distributed lossy compression problem with rate–distortion pairs and , we let , , , such that and , and such that for , where is a distortion measure between two arguments and .
Example 3 (Broadcast channels [18])
For the broadcast channel problem with rates and , we choose , , , , , , , and such that , .
Example 4 (Multiple description coding [48])
For multiple description coding with rates and distortion triples , and , we choose , , such that , such that , , and such that for , where is a distortion measure between two arguments and .
Next, we show an example of a multi-hop network.
Example 5 (Relay channels)
Consider a three-node relay channel , illustrated in Fig. 4(a), in which node 1 wishes to send a message to node 3 with the help of node 2. Let and denote the rate and the number of channel uses, respectively, and let and denote the message of rate at node 1 and the estimated message at node 3, respectively. Then, the node processing function at node 1 is a mapping from to , the node processing function at node 2 at time is a mapping from to , and the node processing function at node 3 is a mapping from to . The probability of error is defined as , and a rate is said to be achievable if there exists a sequence of node processing functions such that .
If we assume a blockwise operation at each node, we can represent this network as an ADMN by unfolding the network. Assume $b$ transmission blocks, each consisting of $n$ channel uses. In the unfolded network illustrated in Fig. 4(b), we have nodes and the operation of node corresponds to that of node of the original network at the end of block . To reflect the fact that node is originally the same node as node , we assume that node has an orthogonal link of sufficiently large rate to node , which is represented as a dashed line in Fig. 4(b). Because this unfolded network is acyclic, it can be represented as an ADMN and can be chosen accordingly.
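One way to carry out this unfolding programmatically is sketched below. The node labeling, the single copies of the source and the destination, and the orthogonal-link bookkeeping are our own simplifications of the construction in Fig. 4(b); the paper may index the unfolded nodes differently:

```python
def unfold_relay(num_blocks):
    """Unfold a 3-node relay channel operated over num_blocks blocks
    into an ordered acyclic node list: the source, one relay copy per
    block, and the destination. Consecutive relay copies are joined by
    an orthogonal link of large rate (they are the same physical node)."""
    nodes = [('src', 0)]
    edges = []
    for b in range(1, num_blocks + 1):
        nodes.append(('relay', b))
        if b > 1:
            edges.append((('relay', b - 1), ('relay', b)))
    nodes.append(('dst', num_blocks))
    order = {v: i for i, v in enumerate(nodes)}
    # acyclicity: every orthogonal link points forward in the ordering
    assert all(order[u] < order[v] for u, v in edges)
    return nodes, edges

nodes, edges = unfold_relay(3)
print(len(nodes), len(edges))  # 5 nodes, 2 orthogonal links
```

The key point is that the unfolded graph admits a topological order, so the sequential node operations required by an ADMN are well defined.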
2.4 Introduction of a virtual node
The following two propositions are obtained by introducing a virtual node in an ADMN, which turn out to be useful in recovering some known achievability results in Section 3.
Proposition 1
Consider an $N$-node ADMN
and target distribution . For some and finite set , assume for . Then, we have
Now, consider an $(N+1)$-node ADMN
and target distribution such that
and
Then, if is achievable for the $(N+1)$-node ADMN, is achievable for the $N$-node ADMN.
The proof is straightforward from the observation that the $(N+1)$-node ADMN is obtained by introducing a virtual node, whose channel output is and whose channel input is null, between nodes and in the $N$-node ADMN and reindexing the nodes.
Proposition 2
Consider an $N$-node ADMN
such that and for some , , . Let denote the common part of two random variables and , where the common part of two discrete memoryless sources is defined in [49], [50].
Now, consider an $(N+1)$-node ADMN
and target distribution such that
and
where for or and can be arbitrarily large.
Then, if is achievable for the $(N+1)$-node ADMN, is achievable for the $N$-node ADMN.
Note that in the $(N+1)$-node ADMN, both nodes and observe the common part and hence can share any function of . Thus, we can introduce a virtual node whose channel output is and whose channel input is , and assume that is available at nodes and .
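For concreteness, the common part of (X, Y) in the sense of [49], [50] can be computed from the support of p(x, y): link x to y whenever p(x, y) > 0, and label the connected components of the resulting bipartite graph; the component label is then a function of X alone and of Y alone. A sketch with a small union-find helper (our own implementation, not code from the paper):

```python
def common_part_labels(support):
    """Label each (x, y) pair in the support of p(x, y) by the connected
    component of the bipartite graph linking x and y whenever
    p(x, y) > 0. The label is the (Gacs-Korner) common part: it is
    computable from x alone and from y alone."""
    parent = {}
    def find(a):
        while parent.setdefault(a, a) != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a
    for x, y in support:
        parent[find(('x', x))] = find(('y', y))  # union the two symbols
    roots = sorted({find(('x', x)) for x, _ in support})
    index = {r: i for i, r in enumerate(roots)}
    return {(x, y): index[find(('x', x))] for (x, y) in support}

# Two disconnected blocks give a binary common part.
support = [(0, 'a'), (0, 'b'), (1, 'c'), (2, 'c')]
labels = common_part_labels(support)
print(sorted(set(labels.values())))  # [0, 1]
```

If the bipartite graph is connected, the common part is trivial (a single label), in which case the virtual node of Proposition 2 carries no information.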
3 Unified Coding Theorem
In this section, we propose a unified coding scheme and present the main theorem of this paper, followed by various examples that show how to utilize our results. Our scheme consists of the following ingredients: 1) superposition, 2) simultaneous nonunique decoding, 3) simultaneous compression, and 4) symbol-by-symbol mapping. These are tweaked and combined in a special way to enable the unification of many previous approaches. Let us first briefly explain the proposed scheme and introduce the related coding parameters. A detailed description of our scheme is given in the proof of Theorem 1.

Codebook generation: In our coding scheme, covering codebooks are used to compress information that each node observes and decodes. We generate covering codebooks . Let for denote the alphabet for the codeword symbol of . For indexing of codewords, we consider index sets , where for some for each . We denote by the set of indices of ’s associated with in a way that each codeword in is indexed by the vector and hence consists of codewords, i.e., . Each codebook is constructed allowing superposition coding. Let denote the set of the indices of ’s on which is constructed by superposition.

Node operation: Node operates according to the following three steps as illustrated in Fig. 5.

Simultaneous nonunique decoding: After receiving , node decodes some covering codewords of previous nodes simultaneously, where some are decoded uniquely and the others are decoded nonuniquely. We denote by and the sets of the indices of ’s whose codewords are decoded uniquely and nonuniquely, respectively, at node .

Simultaneous compression: After decoding, node finds covering codewords simultaneously according to a conditional pmf that carry some information about the received channel output sequence and uniquely decoded codewords , where denotes the set of the indices of ’s used for compression.

Symbol-by-symbol mapping: After decoding and compression, node generates by a symbol-by-symbol mapping from the uniquely decoded codewords , the covered codewords , and the received channel output sequence . Let denote the function used for the symbol-by-symbol mapping.
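As a minimal illustration of the codebook-generation step, the sketch below implements the classical two-layer special case of superposition coding: cloud centers drawn i.i.d. from a pmf `p_u` and, for each center, satellite codewords drawn symbol by symbol from a conditional pmf `p_x_given_u`. The general scheme above allows an arbitrary superposition relation among the covering codebooks; the variable names here are ours, not the paper's:

```python
import random

def superposition_codebooks(n, nR1, nR2, p_u, p_x_given_u, rng):
    """Two-layer superposition random codebook: 2**nR1 cloud centers
    u^n drawn i.i.d. from p_u; for each center, 2**nR2 satellite
    codewords x^n drawn symbol by symbol from p_x_given_u[u_t]."""
    def draw(pmf):
        symbols = list(pmf)
        return rng.choices(symbols, weights=[pmf[s] for s in symbols])[0]

    clouds = [[draw(p_u) for _ in range(n)] for _ in range(2 ** nR1)]
    satellites = [[[draw(p_x_given_u[u_t]) for u_t in u]
                   for _ in range(2 ** nR2)] for u in clouds]
    return clouds, satellites

rng = random.Random(0)
p_u = {0: 0.5, 1: 0.5}
p_x_given_u = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}
clouds, sats = superposition_codebooks(8, 2, 1, p_u, p_x_given_u, rng)
print(len(clouds), len(sats[0]), len(sats[0][0]))  # 4 2 8
```

Deeper superposition hierarchies repeat the same conditional-drawing step along the partial order of the covering codebooks.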

In summary, our scheme requires the following set of coding parameters, where some constraints are added to make the aforementioned codebook generation and node operation proper:

positive integers and

alphabets

rate tuple

sets , , , , and for and that satisfy

’s are disjoint,

and if ,

, , and .


a set of conditional pmfs and functions for such that induced by
(1) is the same as the target distribution .
Now, we are ready to present our main theorem, which gives a sufficient condition for achievability using the aforementioned scheme. For an ADMN and target distribution , let , shortly denoted as or , denote the set of all possible ’s.
Theorem 1
For an node ADMN, is achievable if there exists such that for
(2)  
(3) 
for all such that and for all such that , where