Successive Refinement with Decoder Cooperation and its Channel Coding Duals
Abstract
We study cooperation in multiterminal source coding models involving successive refinement. Specifically, we study the case of a single encoder and two decoders, where the encoder provides a common description to both decoders and a private description to only one of them. The decoders cooperate via cribbing, i.e., the decoder with access only to the common description is additionally allowed to observe a deterministic function of the reconstruction symbols produced by the other. We characterize the fundamental performance limits in the respective settings of noncausal, strictly causal and causal cribbing. We use a new coding scheme, referred to as Forward Encoding and Block Markov Decoding, which is a variant of one recently used by Cuff and Zhao for coordination via implicit communication. Finally, we use the insight gained to introduce and solve some dual channel coding scenarios involving Multiple Access Channels with cribbing.
Block Markov Decoding, Conferencing, Cooperation, Coordination, Cribbing, Double Binning, Duality, Forward Encoding, Joint Typicality, Successive Refinement.
I Introduction
Cooperation can dramatically boost the performance of a network. The literature abounds with models for cooperation when communication between nodes of a network is over a noisy channel. In multiple access channels, the setting of cribbing was introduced by Willems and van der Meulen in [1], where one encoder obtains the channel input symbols of the other encoder (referred to as the “crib”) and uses them for coding over a multiple access channel (MAC). This was further generalized to deterministic function cribbing (where an encoder obtains a deterministic function of the channel input symbols of another encoder) and to cribbing with actions (where one encoder can control the quality and availability of the “crib” by taking cost-constrained actions) by Permuter and Asnani in [2]. Cooperation can also be modeled as information exchange among the transmitters and receivers via rate-limited links, generally referred to as conferencing in the literature. Such a model was introduced in the context of the MAC by Willems in [3], and subsequently studied by Bross, Lapidoth and Wigger [4], Wiese et al. [5], Simeone et al. [6], and Maric, Yates and Kramer [7]. Cooperation has also been modeled via conferencing/cribbing in cognitive interference channels, such as the settings in Bross, Steinberg and Tinguely [8] and Prabhakaran and Vishwanath [9][10]. We refer to Ng and Goldsmith [11] for a survey of various cooperation strategies and their fundamental limits in wireless networks.
In multiterminal source coding, cooperation is generally modeled as a rate-limited link, as in the cascade source coding setting of Yamamoto [12], Cuff, Su and El Gamal [13], Permuter and Weissman [14], and Chia, Permuter and Weissman [15], as well as the triangular source coding problems of Yamamoto [16] and Chia, Permuter and Weissman [15]. In cascade source coding (Fig. 1), Decoder 1 sends a description () to Decoder 2, which does not receive any direct description from the encoder, while in triangular source coding (Fig. 2), Decoder 1 provides a description () to Decoder 2 in addition to the direct description () from the encoder.
The contribution of this paper is to introduce new models of cooperation in multiterminal source coding, inspired by the cribbing of Willems and van der Meulen [1] and by the implicit communication model of Cuff and Zhao [17]. Specifically, we consider cooperation between decoders in a successive refinement setting (introduced in Equitz and Cover [18]). In successive refinement, a single encoder sends a common description to both decoders and a private description to only one of the decoders. We generalize this model to accommodate cooperation among the decoders as follows :

Cooperation via Conferencing : One such cooperation model is shown in Fig. 3, where the encoder provides a common description () to both decoders and a refined description () to Decoder 1. Decoder 1 cooperates with Decoder 2 by providing an additional description, which is a function of its own private description () as well as the common description (). This setting is inspired by the conferencing problem in channel coding described earlier. The region of achievable rates and distortions for this problem is given by,
(1) (2) for some joint probability distribution such that , for , where refers to the distortion function and are the distortion constraints, as is formally explained in Section II. The direct part of this characterization, namely that this region is achievable, follows standard arguments that generalize those used in the original successive refinement problem [18] (cf. Appendix A).

Cooperation via Cribbing : The main setting analyzed in this paper is shown in Fig. 4. A single encoder describes a common message to both decoders and a refined message to only Decoder 1. Instead of cooperating via a rate-limited link, as in Fig. 3, Decoder 2 “cribs” (in the spirit of Willems and van der Meulen [1]) a deterministic function of the reconstruction symbols of Decoder 1, noncausally, strictly causally, or causally. Note that a trivial function corresponds to the original successive refinement setting characterized in Equitz and Cover [18]. The goal is to find the optimal encoding and decoding strategy and to characterize the optimal encoding rate region, which is defined as the set of achievable rate tuples such that the distortion constraints are satisfied at both decoders. Cuff and Zhao [17] considered the problem of characterizing the coordination region (noncausal, strictly causal and causal coordination) in our setting of Fig. 4, for a specific function, , such that and for a specific rate tuple , that is, Decoder 1 has access to the source sequence while Decoder 2 uses the reconstruction symbols of Decoder 1 (noncausally, strictly causally or causally) to estimate the source. We use a new source coding scheme, which we refer to as Forward Encoding and Block Markov Decoding, and show that it achieves the optimal rate region for strictly causal and causal cribbing. It draws on the achievability ideas (for causal coordination) introduced in Cuff and Zhao [17]. This scheme operates in blocks: in the current block, the encoder encodes for the source sequence of the future block (hence the name Forward Encoding), and the decoders rely on the decodability in the previous block to decode in the current block (hence the name Block Markov Decoding). More details about this scheme are deferred to Section III.
The general motivation for our work is an attempt to understand fundamental limits in source coding scenarios involving the availability of side information in the form of a lossily compressed version of the source. This is a departure from the standard and well studied models where side information is merely a correlated “noisy version” of the source, and is challenging because the effective ‘channel’ from source to side information is now induced by a compression scheme. Thus, rather than dictated by nature, the side information is now another degree of freedom in the design. There is no shortage of practical scenarios that motivate our models.
One such scenario may arise in the context of video coding, as considered by Aaron, Varodayan and Girod in [19]. Consider two consecutive frames in a video file, denoted by Frame 1 and Frame 2, respectively. The video encoder starts by encoding Frame 1, and then it encodes the difference between Frame 1 and Frame 2. Decoder 1 represents decoding of Frame 1, while Decoder 2 uses the knowledge of decoded Frame 1 (via cribbing) to estimate the next frame, Frame 2.
Our problem setting is equally natural for capturing noncooperation as it is for capturing cooperation, by requiring the relevant distortions to be bounded from below rather than above (which, in turn, can be converted to our standard form of an upper bound on the distortion by changing the sign of the distortion criterion). For instance, Decoder 1 can represent an end user with refined information (common and private rate) about a secret document, the source in our problem, while Decoder 2 has crude information about the document (via the common rate). Decoder 1 is required to publicly announce an approximate version of the document, but due to privacy issues would like to remain somewhat cryptic about the source (as measured in terms of distortion with respect to the source) while also helping (via conferencing or cribbing) Decoder 2 to better estimate the source. For example, Decoder 1 can represent a government agency required by law to publicly reveal features of the data, while on the other hand there are agents who make use of this publicly announced information, along with crude information about the source that they too, not only the government, are allowed to access, to decipher or get a good estimate of the classified information (the source).
The contribution of this paper is twofold. First, we introduce new models of decoder cooperation in source coding problems such as successive refinement, where decoders cooperate via cribbing, and we characterize the fundamental limits on performance for these problems using new classes of schemes for the achievability part. Second, we leverage the insights gained from these problems to introduce and solve a new class of channel coding scenarios that are dual to the source coding ones. Specifically, we consider the MAC with cribbing and a common message: two encoders want to communicate messages over the MAC, one of them has access to its own private message, there is a common message between the two encoders, and the encoders cooperate via cribbing (noncausally, strictly causally or causally).
The paper is organized as follows. Section II gives a formal description of the problem and the main results. Section III presents achievability and converses, with noncausal, strictly causal and causal cribbing. Some special cases of our setting and numerical examples are studied in Section IV. Channel coding duals are considered in Section V. Finally, the paper is concluded in Section VI.
II Problem Definitions and Main Results
We begin by explaining the notation to be used throughout this paper. Let upper case, lower case, and calligraphic letters denote, respectively, random variables, specific or deterministic values which random variables may assume, and their alphabets. For two jointly distributed random variables, and , let , and respectively denote the marginal of , joint distribution of and conditional distribution of given . is a shorthand for the tuple . We impose the assumption of finiteness of cardinality on all alphabets, unless otherwise indicated.
In this section we formally define the problem considered in this paper (cf. Fig. 4). The source sequence is drawn i.i.d. . Let and denote the reconstruction alphabets, and , for denote single letter distortion measures. Distortion between sequences is defined in the usual way,
(3) 
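As a concrete reading of "the usual way", the following minimal Python sketch assumes the standard additive definition, in which the distortion between two length-n sequences is the average of the single-letter distortions. The names `sequence_distortion` and `hamming` are hypothetical and are not taken from the paper.

```python
from typing import Callable, Sequence, TypeVar

S = TypeVar("S")  # source symbol type
R = TypeVar("R")  # reconstruction symbol type


def sequence_distortion(
    x: Sequence[S], x_hat: Sequence[R], d: Callable[[S, R], float]
) -> float:
    """Average per-letter distortion between a source sequence and a reconstruction.

    Assumption: the sequence distortion is the arithmetic mean of the
    single-letter distortions d(x_i, x_hat_i).
    """
    if len(x) != len(x_hat):
        raise ValueError("sequences must have the same length")
    if len(x) == 0:
        raise ValueError("sequences must be non-empty")
    return sum(d(a, b) for a, b in zip(x, x_hat)) / len(x)


def hamming(a: int, b: int) -> float:
    """Example single-letter distortion: 0 if the symbols agree, 1 otherwise."""
    return 0.0 if a == b else 1.0
```

For instance, with Hamming distortion, a reconstruction that differs from a length-4 source sequence in one position has distortion 1/4.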
Definition 1.
A () rate-distortion code consists of the following,

Encoder, , .

Decoder 1, .
Definition 2.
A rate-distortion tuple is said to be achievable if , and () rate-distortion code such that the expected distortions at the decoders are bounded as,
(7) 
Definition 3.
The rate-distortion region is defined as the closure of the set of all achievable rate-distortion tuples .
Our main results for this setting are presented in Table I. Note that in all the rate regions in the table, we use the notation for , and we omit the distortion condition , for the sake of brevity. These results will be derived later in Section III. As another contribution, in Section V, we establish duality between the problem of successive refinement with cribbing decoders and communication over multiple access channels with cribbing encoders and a common message. We establish a complete duality between the settings (in a sense that is detailed in Section V), and the rate regions of one can be obtained from those of the other by the listed transformations.
                | Perfect Cribbing | Deterministic Function Cribbing
Noncausal       | (Theorem 1)      | (Theorem 2)
Strictly Causal | (Theorem 3)      | (Theorem 4)
Causal          | (Theorem 5)      | (Theorem 6)

Lemma 1 (Equivalence to Cascade Source Coding with Cribbing Decoders).
The setup in Fig. 4 is equivalent to a cascade source coding setup with cribbing decoders as in Fig. 5 in the following way : fix a distortion pair and let denote the rate region for the problem of successive refinement with cribbing with achievable rate pairs . Let denote the closure of rate pairs, and denote the rate region for the problem of cascade source coding with cribbing (closure of achievable rate pairs ). We then have the equivalence, .
Proof.
We use certain standard techniques such as Typical Average Lemma, Covering Lemma and Packing Lemma which are stated and established in [21]. Herein, we state them for the sake of quick reference. For typical sets we use the definition as in chapter 2 of [21]. Henceforth, we omit the alphabets from the notation of typical set when it is clear from context, e.g. is denoted by .
Lemma 2 (Typical Average Lemma, Chapter 2, [21]).
Let . Then for any nonnegative function on ,
(8) 
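The statement can be checked numerically with a small sketch. It assumes the robust typicality definition, under which a sequence is typical when every empirical symbol frequency is within a multiplicative factor (1 ± eps) of its probability; whether chapter 2 of [21] uses exactly this definition cannot be confirmed from the text here. Under that assumption, the empirical average of a nonnegative function over a typical sequence lies between (1 - eps) and (1 + eps) times its expectation. All function names are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, Sequence, Tuple


def is_typical(xs: Sequence[Hashable], pmf: Dict[Hashable, float], eps: float) -> bool:
    """Robust typicality (assumed): |freq(a) - p(a)| <= eps * p(a) for every symbol a."""
    n = len(xs)
    counts = Counter(xs)
    if any(a not in pmf for a in counts):
        return False  # a symbol outside the alphabet can never be typical
    return all(abs(counts.get(a, 0) / n - p) <= eps * p for a, p in pmf.items())


def empirical_average(xs: Sequence[Hashable], g: Callable[[Hashable], float]) -> float:
    """(1/n) * sum of g over the symbols of the sequence."""
    return sum(g(a) for a in xs) / len(xs)


def typical_average_bounds(
    pmf: Dict[Hashable, float], g: Callable[[Hashable], float], eps: float
) -> Tuple[float, float]:
    """Lower and upper bounds (1 - eps) E[g(X)] and (1 + eps) E[g(X)] for nonnegative g."""
    mean = sum(p * g(a) for a, p in pmf.items())
    return (1 - eps) * mean, (1 + eps) * mean
```

The bound follows directly from the definition: the empirical average is a frequency-weighted sum of g, and each frequency is sandwiched between (1 - eps) p(a) and (1 + eps) p(a).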
Lemma 3 (Covering Lemma, Chapter 3, [21]).
Let . Let be a pair of arbitrarily distributed random sequences such that as and let , where , be random sequences, conditionally independent of each other and of given , each distributed according to . Then, there exists such that as , if .
Lemma 4 (Packing Lemma, Chapter 3, [21]).
Let . Let be a pair of arbitrarily distributed random sequences (not necessarily according to ). Let , where , be random sequences, each distributed according to . Assume that , is pairwise conditionally independent of given , but is arbitrarily dependent on other sequences. Then, there exists such that as , if .
III Successive Refinement with Cribbing Decoders
In this section we analyze the main settings considered in this paper and derive rate regions. In the various subsections to follow we will respectively study the problem of successive refinement with noncausal, strictly causal and causal cribbing. For clarity, in each subsection, we will first study the setting of “perfect” cribbing where and then generalize it to cribbing with any deterministic function .
III-A Noncausal Cribbing
III-A1 Perfect Cribbing
Theorem 1.
The rate region for the setting in Fig. 6 with perfect (noncausal) cribbing is given as the closure of the set of all the rate tuples such that,
(9)  
(10) 
for some joint probability distribution such that , for .
Proof.
Achievability :
“Double Binning” scheme
Before delving into the details, we first provide a high-level understanding of the achievability scheme. Consider the simplified setup where , that is, only Decoder 1 has access to the description of the source, and Decoder 2 gets the reconstruction symbols of Decoder 1 (the “crib”). The intuition is to reveal a lossy description of the source to Decoder 2 through the “crib”. So we first generate codewords, and index them as bins. In each bin, we generate a superimposed codebook of codewords. Thus a total rate of is needed to describe to Decoder 1. Decoder 2 knows via the crib; it then needs to infer the unique bin index which was sent, as it would then infer . The only issue to verify is that the codeword known via cribbing should not lie in two bins. We upper bound the probability of occurrence of such an event by , as there are overall codewords, and the probability that a particular lies in two bins is . This event has a vanishing probability so long as . Thus the achieved rate region is such that the constraint and distortion constraints are satisfied.
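The shape of this union-bound argument can be illustrated with a short sketch. It uses generic placeholders rather than the paper's quantities: it assumes roughly 2^(n·R) codewords in total and a per-codeword probability of about 2^(-n·H) of also appearing in another bin, so that the bound is 2^(n(R - H)). Here `rate_bits` (R) and `entropy_bits` (H) are hypothetical stand-ins for the specific rate and entropy terms of the scheme.

```python
def collision_union_bound(n: int, rate_bits: float, entropy_bits: float) -> float:
    """Union bound min(1, 2^(n*R) * 2^(-n*H)) on the probability of a bin collision.

    Assumptions (placeholders, not the paper's exact expressions):
      - about 2^(n * rate_bits) codewords could collide with the crib codeword,
      - each does so with probability about 2^(-n * entropy_bits).
    """
    exponent = n * (rate_bits - entropy_bits)
    # A probability bound above 1 is vacuous, so it is clipped to 1.
    return 1.0 if exponent >= 0 else 2.0 ** exponent
```

The bound decays exponentially in the block length n whenever R is strictly below H, and is vacuous otherwise, which mirrors the "vanishing probability so long as" condition in the text.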
The general coding scheme when is depicted in Fig. 7 and has a “doubly-binned” structure. Nonzero helps reduce by providing an extra dimension of binning. We first generate codewords, the indices of which are the rows (or horizontal bins), and then in each row, we generate codewords. For each row, these codewords are then binned uniformly into vertical bins, which are the columns of our “doubly-binned” structure. Thus each bin is “doubly-indexed” (row and column index) and has a uniform number of codewords (as in Fig. 7). Note that this extra or independent dimension of vertical binning was not there when . The intuition is that column indexing with common rate is independent of, or orthogonal to, the row indexing, and hence it helps to reduce the private rate . The column or vertical bin index is described to both the decoders via common rate and thus reduces to to describe to Decoder 1. Here again, from knowledge of the crib and the column index, Decoder 2 infers the unique row index, which now will require .
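The doubly-indexed structure and Decoder 2's lookup can be sketched as a toy data structure. This is only an illustration under simplifying assumptions: codewords are drawn uniformly at random from a small alphabet rather than from the conditional distribution used in the actual scheme, and all names (`build_doubly_binned_codebook`, `decoder2_row`) are hypothetical.

```python
import random
from typing import Dict, Optional, Sequence, Tuple

Codeword = Tuple[int, ...]
Index = Tuple[int, int, int]  # (row, column, position within the bin)


def build_doubly_binned_codebook(
    num_rows: int,
    num_cols: int,
    per_bin: int,
    length: int,
    alphabet: Sequence[int],
    rng: random.Random,
) -> Dict[Index, Codeword]:
    """Toy codebook: every (row, column) bin holds `per_bin` random codewords.

    Assumption: codewords are i.i.d. uniform over `alphabet`, a stand-in for
    the distribution used in the scheme described in the text.
    """
    codebook: Dict[Index, Codeword] = {}
    for row in range(num_rows):
        for col in range(num_cols):
            for k in range(per_bin):
                codebook[(row, col, k)] = tuple(
                    rng.choice(alphabet) for _ in range(length)
                )
    return codebook


def decoder2_row(
    codebook: Dict[Index, Codeword], crib: Codeword, col: int
) -> Optional[int]:
    """Decoder 2 lookup: given the crib and the column index, find the row.

    Returns the row if exactly one row has the crib in column `col`;
    returns None if no row or more than one row matches (a decoding error).
    """
    rows = {r for (r, c, _k), word in codebook.items() if c == col and word == crib}
    return rows.pop() if len(rows) == 1 else None
```

Restricting the search to a single column is what shrinks the set of candidate codewords, so fewer rows can collide with the crib than in the singly-binned case.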
We now describe the achievability in full detail.

Codebook Generation : Fix the distribution , such that and . Generate a codebook consisting of codewords generated i.i.d. , . For each , first generate a codebook consisting of codewords generated i.i.d. , then bin them all uniformly in vertical bins , and in each bin index them accordingly with . As outlined earlier, corresponds to the row or horizontal index and corresponds to the column or vertical index in our “doubly-binned” structure, while indexes codewords within a “doubly-indexed” bin. Thus for each row and column index pair, , there are codewords. can therefore be indexed by the triple . The codebooks are revealed to the encoder and both the decoders.

Encoding : Given source sequence , first the encoder finds from such that . Then the encoder finds pair such that . Thus . Encoder describes column or vertical bin index as to both the decoders, and the tuple to the Decoder 1 as rate . Thus
(11) 
Decoding : Decoder 1 knows all the indices , and it constructs . Decoder 2 receives from the noncausal cribbing and it also knows the column index through rate . It then checks inside the column or vertical bin of index , to find the unique row or horizontal bin index such that for some . The reconstruction of the Decoder 2 is then .

Distortion Analysis : Consider the following events :

No is jointly typical to a given (12) (13) The probability of this event vanishes as there are codewords. (cf. Covering Lemma, Lemma 3).

No is jointly typical to a typical pair The probability of this event vanishes as corresponding to each there are codewords, (cf. Covering Lemma, Lemma 3). Without loss of generality, now suppose that encoder does the encoding, . Decoder 2 receives via noncausal cribbing. The next two events are with respect to Decoder 2.

does not lie in bin indexed by and (16) (17) But the probability of this event goes to zero, because due to our encoding procedure, .

lies in a bin with row index, and column index . Since , this event is equivalent to finding lying in two different rows or horizontal bins, but with the same column or vertical bin index (). The probability of a single codeword occurring repeatedly in two horizontal bins indexed with different row index is , while knowing the column index, , total number of codewords with a particular column index are, , so the probability of event vanishes so long as,
(20)
Thus consider the event, , using the rate constraints from Eq. (11) and Eq. (30), the probability of the event vanishes if,
(21) (22) We will now bound the distortion. Assume without loss of generality that, , for . For both the decoders, (),
(23) (24) (25) where is via the Typical Average Lemma (cf. Lemma 2). The proof is completed by letting when .

Converse : Converse for this setting follows by substituting in the converse for the deterministic function cribbing in the next subsection.
Note 1 (Joint Typicality Decoding).
Note that here our decoding for Decoder 2 relies on finding a unique bin index in which (obtained via cribbing) lies, and there is an error if two different bins have the same . An alternative based on joint typicality decoding can also be used to achieve the same region as follows : Decoder 2 receives via noncausal cribbing and it also knows the column index through rate . It then finds the unique row or horizontal bin index such that for some . The reconstruction of the Decoder 2 is then . We analyze the following two events, assuming without loss of generality that encoder does the encoding .

Decoder 2 finds no jointly typical indexed by and (26) (27) But the probability of this event goes to zero, because due to our encoding procedure, with high probability, . As this implies,