# Extended Gray–Wyner System with Complementary Causal Side Information

## Abstract

We establish the rate region of an extended Gray–Wyner system for a 2-DMS with two additional decoders having complementary causal side information. This extension is interesting because, in addition to the operationally significant extreme points of the Gray–Wyner rate region, which include Wyner's common information, the Gács–Körner common information and the information bottleneck, the rate region of the extended system also includes the Körner graph entropy, the privacy funnel and the excess functional information, as well as three new quantities of potential interest, as extreme points. To simplify the investigation of the 5-dimensional rate region of the extended Gray–Wyner system, we establish an equivalence of this region to a 3-dimensional mutual information region that consists of the set of all triples of the form $(I(W;X),\, I(W;Y),\, I(W;X,Y))$ for some conditional pmf $p_{W|X,Y}$. We further show that projections of this mutual information region yield the rate regions for many settings involving a 2-DMS, including lossless source coding with causal side information, distributed channel synthesis, and lossless source coding with a helper.

## 1 Introduction

The lossless Gray–Wyner system [1] is a multi-terminal source coding setting for a two-component discrete memoryless source (2-DMS) with one encoder and two decoders. This setup draws some of its significance from providing operational interpretations for several information theoretic quantities of interest, namely Wyner's common information [2], the Gács–Körner common information [3], the necessary conditional entropy [4], and the information bottleneck [5].

In this paper, we consider an extension of the Gray–Wyner system (henceforth called the EGW system), which includes two new individual descriptions and two decoders with causal side information, as depicted in Figure ?. The encoder maps sequences from a 2-DMS into five indices , . Decoders 1 and 2 correspond to those of the Gray–Wyner system, that is, decoder 1 recovers from and decoder 2 recovers from . At time , decoder 3 recovers *causally* from and decoder 4 similarly recovers causally from . Note that decoders 3 and 4 correspond to those of the complementary delivery setup studied in [6], but with causal (instead of noncausal) side information and with two additional private indices and . The EGW system is lossless, that is, the decoders recover their respective source sequences with a probability of error that vanishes as the block length $n$ approaches infinity. The rate region of the EGW system is defined in the usual way as the closure of the set of achievable rate tuples .

The first contribution of this paper is to establish the rate region of the EGW system. Moreover, to simplify the study of this rate region and its extreme points, we show that it is equivalent to the 3-dimensional *mutual information region* for defined as

in the sense that we can express using and vice versa. As a consequence and of particular interest, the extreme points of the rate region (and its equivalent mutual information region ) for the EGW system include, in addition to the aforementioned extreme points of the Gray–Wyner system, the Körner graph entropy [8], privacy funnel [9] and excess functional information [10], as well as three new quantities with interesting operational meaning, which we refer to as the *maximal interaction information*, the *asymmetric private interaction information*, and the *symmetric private interaction information*. These extreme points can be cast as maximizations of the interaction information [11] under various constraints. They can be considered as distances from extreme dependency, as they are equal to zero only under certain conditions of extreme dependency. In addition to providing operational interpretations to these information theoretic quantities, projections of the mutual information region yield the rate regions for many settings involving a 2-DMS, including lossless source coding with causal side information [12], distributed channel synthesis [13], and lossless source coding with a helper [15].

A related extension of the lossy Gray–Wyner system with two decoders having causal side information was studied by Timo and Vellambi [18]. If we consider only decoders 3 and 4 of the EGW system, our setting can be viewed as a special case of theirs (where the side information need not be complementary). Other source coding setups related to the EGW system can be found in [19]. A related 3-dimensional region, called the region of tension, was investigated by Prabhakaran and Prabhakaran [23]. We show that the region of tension can be obtained from the mutual information region, but the converse does not hold in general.

In the following section, we establish the rate region of the EGW system, relate it to the mutual information region, and show that both the region of the original Gray–Wyner system and the region of tension can be obtained from the mutual information region. In Section ?, we study the extreme points of the mutual information region. In Section ?, we establish the rate region for the same setup as the EGW system but with noncausal instead of causal side information at decoders 3 and 4. We show that the rate region of the noncausal EGW system can be expressed in terms of the Gray–Wyner region, and hence does not contain as many interesting extreme points as the causal EGW system. Moreover, we show that this region is equivalent to the closure of the limit of the mutual information region for blocks of length $n$ as $n$ approaches infinity.

### 1.1 Notation

Throughout this paper, we assume that $\log$ is base 2 and that entropy is measured in bits. We use the notation: , and .
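Since every quantity below is an entropy or mutual information in bits computed from a joint pmf, the following minimal Python helpers make these conventions concrete. This is an illustrative sketch, not part of the paper; the function names `entropy` and `mutual_information` are our own.

```python
import math

def entropy(pmf):
    """Shannon entropy in bits of a pmf given as an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

def mutual_information(joint):
    """I(X;Y) in bits from a joint pmf given as a dict {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)
```

For example, a pair of identical fair bits has one bit of mutual information, while an independent pair has zero.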

For discrete , we write the probability mass function as . For , we write the closure of as and the convex hull as . We write the support function as

We write the one-sided directional derivative of the support function as

Note that if is compact and convex, then
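Numerically, both objects can be approximated when the region is represented by a finite point cloud: the support function is a maximum of inner products, and for a compact convex set the one-sided directional derivative at $a$ in direction $b$ equals the maximum of $\langle b, x\rangle$ over the face attaining the support function at $a$. The sketch below is our illustration under that finite-sample assumption; the helper names are hypothetical.

```python
def support_function(points, a):
    """h_K(a) = max_{x in K} <a, x> over a finite point cloud approximating K."""
    return max(sum(ai * xi for ai, xi in zip(a, x)) for x in points)

def directional_derivative(points, a, b, tol=1e-9):
    """One-sided directional derivative h'_K(a; b): the maximum of <b, x>
    over the face of K that attains h_K(a)."""
    h = support_function(points, a)
    face = [x for x in points
            if abs(sum(ai * xi for ai, xi in zip(a, x)) - h) < tol]
    return max(sum(bi * xi for bi, xi in zip(b, x)) for x in face)

# Unit square: h((1,0)) = 1 is attained on the whole edge x1 = 1,
# so the derivative in direction (0,1) picks out the corner (1,1).
square = [(0, 0), (1, 0), (0, 1), (1, 1)]
```

The face-based formula is what makes the extreme-point representations via at most two vectors (a direction and a tie-breaking direction) possible.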

## 2 Rate region of EGW and the mutual information region

The rate region of the EGW system is given in the following.

Note that if we ignore decoders 3 and 4, i.e., let be sufficiently large, then this region reduces to the Gray–Wyner region.

Although is 5-dimensional, the bounds on the rates can be expressed in terms of three quantities: , and together with other constant quantities that involve only the given . This leads to the following equivalence of to the mutual information region defined in . We denote the components of a vector by .
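To get a feel for the 3-dimensional mutual information region, one can sample random conditional pmfs of an auxiliary variable given the source pair and compute the resulting triple of mutual informations. The Python sketch below is our own illustration (helper names `mi` and `mi_triple` are hypothetical); it assumes the region consists of triples $(I(W;X), I(W;Y), I(W;X,Y))$, and checks the basic ordering $I(W;X,Y) \ge \max\{I(W;X), I(W;Y)\}$ on each sample.

```python
import math, random

def mi(pairs):
    """I(A;B) in bits from a joint pmf given as a dict {(a, b): p}."""
    pa, pb = {}, {}
    for (a, b), p in pairs.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in pairs.items() if p > 0)

def mi_triple(p_xy, p_w_given_xy):
    """(I(W;X), I(W;Y), I(W;X,Y)) for a joint pmf p_xy = {(x, y): p}
    and a channel p_w_given_xy = {(x, y): {w: prob}}."""
    wx, wy, wxy = {}, {}, {}
    for (x, y), p in p_xy.items():
        for w, q in p_w_given_xy[(x, y)].items():
            wx[(w, x)] = wx.get((w, x), 0.0) + p * q
            wy[(w, y)] = wy.get((w, y), 0.0) + p * q
            wxy[(w, (x, y))] = wxy.get((w, (x, y)), 0.0) + p * q
    return mi(wx), mi(wy), mi(wxy)

# Doubly symmetric binary source with crossover probability 0.2.
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
random.seed(0)
for _ in range(200):
    ch = {}
    for xy in p_xy:
        a, b = random.random() + 1e-9, random.random() + 1e-9
        ch[xy] = {0: a / (a + b), 1: b / (a + b)}
    ix, iy, ixy = mi_triple(p_xy, ch)
    assert ixy >= max(ix, iy) - 1e-9  # chain rule: I(W;X,Y) dominates both
```

Setting the auxiliary variable equal to one of the sources recovers the corresponding corner of the region.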

In the following we list several properties of .

The proof of this proposition is given in Appendix ?.

## 3 Extreme Points of the Mutual Information Region

Many interesting information theoretic quantities can be expressed as optimizations over (and ). Since is convex and compact, some of these quantities can be represented in terms of the support function and its one-sided directional derivative, which provides a representation of these quantities using at most 6 coordinates. To avoid conflicts and for consistency, our notation for some of these quantities differs from that of the original literature. We use semicolons, e.g., , for symmetric quantities, and arrows, e.g., , for asymmetric quantities.

Figures ?, ? illustrate the mutual information region and its extreme points, and Table 1 lists the extreme points and their corresponding optimization problems and support function representations.

We first consider the extreme points of that correspond to previously known quantities.

can be expressed as

can be expressed as

[8]. Let $G$ be a graph with vertex set $\mathcal{X}$ and edges between symbols that are confusable upon observing $Y$, i.e., there is an edge $\{x_1, x_2\}$ if $p(x_1, y)\,p(x_2, y) > 0$ for some $y$. The Körner graph entropy

can be expressed as

In the Gray–Wyner system with causal complementary side information, corresponds to the setting with only decoders 1, 3 and , and we restrict the sum rate . This is in line with the lossless source coding setting with causal side information [12], where the optimal rate is also given by . An intuitive reason for this equality is that and the recovery requirement of decoder 1 forces and to contain negligible information outside , hence the setting is similar to the case in which the encoder has access only to . This corresponds to the lossless source coding with causal side information setting.
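The confusability-graph construction above can be illustrated directly from a joint pmf. The sketch below is our own (the name `confusability_edges` is hypothetical) and assumes exact zero/nonzero probabilities.

```python
def confusability_edges(p_xy):
    """Edges of the confusability graph on the X alphabet: x1 and x2 are
    connected iff p(x1, y) * p(x2, y) > 0 for some y."""
    xs = sorted({x for x, _ in p_xy})
    ys = sorted({y for _, y in p_xy})
    edges = set()
    for i, x1 in enumerate(xs):
        for x2 in xs[i + 1:]:
            if any(p_xy.get((x1, y), 0) > 0 and p_xy.get((x2, y), 0) > 0
                   for y in ys):
                edges.add((x1, x2))
    return edges
```

An empty edge set means each $y$ is compatible with at most one $x$, so $X$ is determined by $Y$ and no description rate is needed.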

[4] (also see in [27], in [28], private information in [29] and [30])

can be expressed as

can be expressed as

Note that the same tradeoff also appears in common randomness extraction from a 2-DMS with one-way communication [31], lossless source coding with a helper [15], and a quantity studied by Witsenhausen and Wyner [32]. It is shown in [33] that its slope is given by the chordal slope of the hypercontractivity of the Markov operator [34]

[9] (also see the rate-privacy function defined in [29])

can be expressed as

In particular, the maximum for perfect privacy (written as in [29], also see [35]) is

The optimal privacy-utility coefficient [35] is

is closely related to one-shot channel simulation [36] and lossy source coding, and can be expressed as

In the EGW system, corresponds to the setting with only decoders 2, 4 and (since it is better to allocate the rate to instead of ), and we restrict . The value of is the rate of the additional information that decoder 2 needs in order to compensate for its lack of side information compared to decoder 4.

can be expressed as

### 3.1 New information theoretic quantities

We now present three new quantities that arise as extreme points of . These extreme points concern the cases in which decoders 3 and 4 are active in the EGW system. Note that all three are maximizations of the interaction information under various constraints. They can be viewed as distances from extreme dependency, in the sense that each equals zero only under a certain condition of extreme dependency.

is defined as

It can be shown that

The maximal interaction information concerns the sum rate of the EGW system with only decoders 3 and 4. Note that it is always better to allocate the rates to instead, hence we can assume (which corresponds to ). The quantity is the maximum rate in the lossless causal version of the complementary delivery setup [7].
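The interaction information being maximized can be computed directly from a joint pmf of $(W, X, Y)$. The sketch below is our own (helper names are hypothetical, and sign conventions for interaction information differ across the literature); it uses the identity $I(W;X) + I(W;Y) - I(W;X,Y) = I(X;Y) - I(X;Y|W)$.

```python
import math

def mi_cond(p_wxy):
    """I(X;Y|W) in bits from a joint pmf {(w, x, y): p}."""
    pw, pwx, pwy = {}, {}, {}
    for (w, x, y), p in p_wxy.items():
        pw[w] = pw.get(w, 0.0) + p
        pwx[(w, x)] = pwx.get((w, x), 0.0) + p
        pwy[(w, y)] = pwy.get((w, y), 0.0) + p
    return sum(p * math.log2(p * pw[w] / (pwx[(w, x)] * pwy[(w, y)]))
               for (w, x, y), p in p_wxy.items() if p > 0)

def interaction_information(p_wxy):
    """I(X;Y) - I(X;Y|W), equal to I(W;X) + I(W;Y) - I(W;X,Y) under this
    sign convention; negative values indicate synergy, positive redundancy."""
    pxy, px, py = {}, {}, {}
    for (w, x, y), p in p_wxy.items():
        pxy[(x, y)] = pxy.get((x, y), 0.0) + p
    for (x, y), p in pxy.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    i_xy = sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in pxy.items() if p > 0)
    return i_xy - mi_cond(p_wxy)
```

For $W = X \oplus Y$ with $X, Y$ independent fair bits the value is $-1$ (pure synergy), while for $W = X = Y$ it is $+1$.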

is defined as

It can be shown that

The asymmetric private interaction information is the opposite of the excess functional information defined in [10], in which is maximized instead. Another operational meaning of is the generation of random variables under a privacy constraint. Suppose Alice observes and wants to generate . However, she does not have any private randomness and can only access public randomness , which is also available to Eve. Her goal is to generate as a function of and while minimizing Eve's knowledge of , measured by . The minimum is .

is defined as

It can be shown that

Intuitively, captures the maximum amount of information one can disclose about , such that an eavesdropper who has only one of or would know nothing about the disclosed information. Another operational meaning of is the generation of random variables under a privacy constraint (similar to that for ). Suppose Alice observes and wants to generate . She has access to public randomness , which is also available to Eve, as well as to private randomness. Her goal is to generate using , and her private randomness such that Eve gains no knowledge of (i.e., ), while minimizing the amount of private randomness used, measured by (note that if Alice can flip fair coins for the private randomness, then by the Knuth–Yao algorithm [37] the expected number of flips is bounded by ). The minimum is .
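The role of fair coin flips here can be illustrated with an interval-splitting sampler in the spirit of the Knuth–Yao scheme. This is a simplified floating-point sketch of our own, not the paper's construction; each flip halves a dyadic interval, and sampling stops once the interval fits inside one cell of the CDF.

```python
import random

def sample_with_coin_flips(pmf, flip=lambda: random.randint(0, 1)):
    """Draw an index distributed according to pmf using only fair coin flips.
    Returns (index, number_of_flips)."""
    cdf, acc = [], 0.0
    for p in pmf:
        acc += p
        cdf.append(acc)
    lo, hi, flips = 0.0, 1.0, 0
    while True:
        mid = (lo + hi) / 2
        if flip():          # bit 1: keep the upper half
            lo = mid
        else:               # bit 0: keep the lower half
            hi = mid
        flips += 1
        left = 0.0
        for i, right in enumerate(cdf):
            if left <= lo and hi <= right:
                return i, flips
            left = right
```

For the exact Knuth–Yao scheme, the expected number of flips is within 2 bits of the entropy of the pmf.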

We now list several properties of , and .

The proof of this proposition is given in Appendix ?.

| Information quantity | Objective and constraints in EGW |
| --- | --- |

## 4 Extended Gray–Wyner System with Noncausal Complementary Side Information

In this section, we establish the rate region for the EGW system with complementary noncausal side information at decoders 3 and 4 (the noncausal EGW system), that is, decoder 3 recovers from and decoder 4 similarly recovers from . We show that can be expressed in terms of the Gray–Wyner region , hence it contains fewer interesting extreme points than . This is the reason we emphasize causal side information in this paper. We further show that is related to the *asymptotic mutual information region* defined as

where is i.i.d. with . Note that may not be closed (unlike which is always closed).

The following gives the rate region for the noncausal EGW.

The proof is given in Appendix ?. We then characterize the closure of . We show that , and the Gray–Wyner region can be expressed in terms of each other.

The proof is given in Appendix ?. Note that Proposition ? does not characterize completely since it does not specify which boundary points are in .

### A.1 Proof of the converse of Theorem

To prove the converse, let . Consider where the last inequality follows by Fano’s inequality. Similarly . Next, consider where the last inequality follows by Fano’s inequality since is a function of . Similarly . Hence the point is in the convex hull of for any . From , is the increasing hull of an affine transformation of , and thus is convex.

To prove the cardinality bound, we apply the Fenchel–Eggleston–Carathéodory theorem [38] to the -dimensional vectors with entries , , and for , ; see [16].

### A.2 Proof of Proposition

To see that is convex, for any and , let be independent of , and let . Then (and similarly for the other two quantities). Compactness will be proved later.

The outer bound follows directly from the properties of entropy and mutual information.

For the inner bound, the first four points can be obtained by substituting , respectively. For the last point, by the functional representation lemma [40], let be such that . Again by the functional representation lemma, let be such that . Let ; then , , and

Hence there exists such that (by substituting ). Taking a convex combination of this point and , we have .

The existence of such that can be proved by substituting and invoking the strong functional representation lemma [10].

The superadditivity property can be obtained by considering , where .

The data processing property can be obtained by considering , where .

The cardinality bound can be proved using the Fenchel–Eggleston–Carathéodory theorem, following the same arguments as in the converse proof of Theorem ?. Compactness follows from the fact that mutual information is a continuous function and that the set of conditional pmfs with is compact.

The relation to the Gray–Wyner region and the region of tension follows from the definitions of the regions.

### A.3 Proof of Proposition

To prove the bound, note that , hence , .

We first prove that if the bipartite graph contains no path of length 3, then . Let achieve the Gács–Körner common information, i.e., represents the connected component in which the edge lies. If the bipartite graph does not contain a path of length 3, then every connected component is a star, i.e., for each , either or . Then , and for any . Hence .
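The common part in this argument can be computed explicitly: the Gács–Körner common information is the entropy of the connected-component index of the bipartite graph on the support of the joint pmf. The sketch below is our own illustration (the helper name is hypothetical, and exact zero/nonzero probabilities are assumed), using union-find to label components.

```python
import math

def gk_common_information(p_xy):
    """Entropy (bits) of the connected component of the bipartite graph
    with an edge (x, y) whenever p(x, y) > 0; this equals the Gács-Körner
    common information of the pair."""
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    for (x, y), p in p_xy.items():
        if p > 0:
            parent[find(('x', x))] = find(('y', y))
    mass = {}
    for (x, y), p in p_xy.items():
        if p > 0:
            c = find(('x', x))
            mass[c] = mass.get(c, 0.0) + p
    return -sum(p * math.log2(p) for p in mass.values())
```

In the star case above every component collapses to a single symbol of one alphabet, which is why the common part captures everything shared between the two sources.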

We then prove that if there exists a path of length 3 in the bipartite graph, then . Assume . Let ,

where is small enough such that the above is a valid conditional pmf. One can verify that . Since , and are not conditionally independent given . Hence .

We then prove that if , then there exists a cycle in the bipartite graph. Let satisfy , and . Since is not independent of , there exists such that . Since , there exists such that . Since , there exists such that . Continue this process until we return to a visited pair, i.e., for . Then forms a cycle.

We then prove that if there exists a cycle in the bipartite graph, then . Let be a cycle. Let ,