# On the Capacity of Networks with Correlated Sources

Satyajit Thakor, Terence Chan and Alex Grant

S. Thakor is with the Institute of Network Coding, Chinese University of Hong Kong. A. Grant and T. Chan are with the Institute for Telecommunications Research, University of South Australia. This work was performed in part while S. Thakor was with the Institute for Telecommunications Research, University of South Australia. T. Chan and A. Grant are supported in part by the Australian Research Council under Discovery Projects DP088022 and DP1094571.

###### Abstract

Characterizing the capacity region for a network can be extremely difficult. Even with independent sources, determining the capacity region can be as hard as the open problem of characterizing all information inequalities. The majority of computable outer bounds in the literature are relaxations of the Linear Programming bound which involves entropy functions of random variables related to the sources and link messages. When sources are not independent, the problem is even more complicated. Extension of linear programming bounds to networks with correlated sources is largely open. Source dependence is usually specified via a joint probability distribution, and one of the main challenges in extending linear programming bounds is the difficulty (or impossibility) of characterizing arbitrary dependencies via entropy functions. This paper tackles the problem by answering the question of how well entropy functions can characterize correlation among sources. We show that by using carefully chosen auxiliary random variables, the characterization can be fairly “accurate”.

## I Introduction

The fundamental question in network coding is to determine the required link capacities to transmit the sources to the sinks. Characterizing the network coding capacity region is extremely hard [1]. When the sources are independent, the capacity region depends only on the source entropy rates. However, when the sources are dependent, the capacity region depends on the detailed structure of the joint source distribution.

Following [2], a linear programming outer bound was developed for dependent sources [3] (see also [4]). This bound is specified by a set of information inequalities and equalities, and source dependence is represented by the entropy function

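$$h(\alpha) \triangleq H(X_s^i,\ s \in \alpha), \quad \alpha \subseteq \mathcal{S} \tag{1}$$

where $\mathcal{S}$ is an index set for the sources and $\{X_s^i,\ s \in \mathcal{S}\}$ are independent and identically distributed copies of the dependent sources. Thus each copy has the same joint distribution as the sources, but the copies are independent across different $i$.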

However, (1) fails to properly characterize source dependence. We also note that the capacity regions (or best known achievable regions) for many classic multiterminal problems are also expressed as optimizations of linear combinations of joint entropies, subject to linear constraints (e.g. Markov constraints) on joint entropies. If it were not for the specified joint distributions on the sources, side information, etc. typically present in such problems, a numerical solution could be obtained by a linear program. Again, if it were possible to somehow accurately capture the dependence of random variables using entropies, it would lead to a convenient computational approach.

A natural question arises: How accurately can arbitrary dependencies be specified via entropies alone? We will show that by using auxiliary random variables, entropies can in fact be sufficient.

### Organization

This work on characterizing correlation between random variables using entropy functions was mainly motivated by the problem of characterizing outer bounds on the capacity of networks with correlated sources. In Section II we review known outer bounds characterized using a graph-theoretic approach (referred to as graphical bounds) as well as outer bounds using a geometric approach (referred to as geometric bounds). These bounds are not tight and can be tightened by introducing new auxiliary random variables which more accurately describe the correlation between the source random variables. In Section III, we give a general framework for improving outer bounds through the introduction of auxiliary random variables. In Section III-A, we demonstrate by an example that our LP bound can in fact be tightened via the use of auxiliary random variables. In Section III-B and Section III-C, we present two approaches to construct auxiliary random variables to tighten the outer bounds. The constructions via these two approaches are direct generalizations of the auxiliary random variables designed in Example 1, Section III-A. In Section IV, we deal with the more general problem of characterizing a probability distribution using entropy functions.

## II Background

Despite its importance, the maximal gain that can be obtained by network coding is still largely unknown, except in a few scenarios [5, 6]. One example is the single-source scenario where the capacity region is characterized by the max-flow bound [5] (see also [7, Chapter 18]) and linear network codes maximize throughput [8]. However, when it involves more than one source, the problem can become quite difficult.

The problem becomes even more complex when the sources are correlated. In the classical literature, the problem of communicating correlated sources over a network is called distributed source compression. For networks of error-free channels with edge capacity constraints, the distributed source compression problem is a feasibility problem: given a network with edge capacity constraints and the joint probability distribution of correlated sources available at certain nodes, is it feasible to communicate the correlated sources to demanding nodes?

A related and important problem is the separation of distributed source coding and network coding [9]. Specifically, distributed source coding and network coding are separable if and only if optimality is not sacrificed by separately designing source and network codes. It has been shown in [9] that separation holds for two-source two-sink networks; however, it has been shown by examples that separation fails for two-source three-sink and three-source two-sink networks.

In this section, we present known outer bounds on the capacity of networks with correlated sources. We first describe network model and define network code and achievable rate. We then present known graphical and geometric outer bounds.

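A network is modelled as a graph $(\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is the set of nodes and $\mathcal{E}$ is the set of directed edges between certain pairs of nodes. Associated with each edge $e \in \mathcal{E}$ is a non-negative real number $C_e$ called the capacity of the edge $e$. For edges $f, e \in \mathcal{E}$, we write $f \to e$ as a shorthand for $\mathrm{head}(f) = \mathrm{tail}(e)$. Similarly, for an edge $f \in \mathcal{E}$ and a node $u \in \mathcal{V}$, the notations $f \to u$ and $u \to f$ respectively denote $\mathrm{head}(f) = u$ and $\mathrm{tail}(f) = u$. Let $\mathcal{S}$ be an index set for a number of multicast sessions, and let $\{Y_s : s \in \mathcal{S}\}$ be the set of source variables. These sources are available at the nodes identified by the mapping

$$a : \mathcal{S} \mapsto \mathcal{V}. \tag{2}$$

Each source may be demanded by multiple sink nodes, identified by the mapping

$$b : \mathcal{S} \mapsto 2^{\mathcal{V}}, \tag{3}$$

where $2^{\mathcal{V}}$ is the set of all subsets of $\mathcal{V}$. Each edge $e \in \mathcal{E}$ carries a random variable $U_e$ which is a function of incident edge random variables and source random variables.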

For a given network and connection requirements $a$ and $b$, a network code is a set of mappings from input random variables (sources and incoming edges) to output random variables (outgoing edges) at a network node. The mappings must obey the constraints implied by the topology. The alphabets of the source random variables and edge random variables are denoted by $\mathcal{Y}_s$ and $\mathcal{U}_e$, respectively.

###### Definition 1 (Network code)

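A network code for a given network is described by the sets of its encoding functions $\Phi$ and decoding functions $\Psi$.

\begin{align}
\Phi &= \Big\{\phi_e : \prod_{s \in \mathcal{S}:\, s \to e} \mathcal{Y}_s \times \prod_{f \in \mathcal{E}:\, f \to e} \mathcal{U}_f \longmapsto \mathcal{U}_e,\ e \in \mathcal{E}\Big\} \tag{4}\\
\Psi &= \Big\{\psi_u : \prod_{f \in \mathcal{E}:\, f \to u} \mathcal{U}_f \longmapsto \mathcal{Y}_s,\ u \in b(s),\ s \in \mathcal{S}\Big\} \tag{5}
\end{align}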

Now we define an achievable rate tuple. The definition below is different from the usual definition of an achievable rate [7, Definition 21.2] in that the source rates are fixed and the link capacity constraints are variable.

###### Definition 2 (Achievable rate tuple)

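Consider a given network with discrete memoryless sources $\{Y_s,\ s \in \mathcal{S}\}$ and a given underlying joint probability distribution. A link capacity tuple $(C_e : e \in \mathcal{E})$ is called achievable if there exists a sequence of network codes $(\Phi^{(n)}, \Psi^{(n)})$ such that for every $e \in \mathcal{E}$ and every $s \in \mathcal{S}$

\begin{align}
\lim_{n \to \infty} n^{-1} \log |\mathcal{U}_e^{(n)}| &\le C_e \tag{6}\\
\lim_{n \to \infty} \Pr\{\psi_u^{(n)}(U_f^{(n)} : f \to u) \ne Y_s^{(n)}\} &= 0, \quad \forall u \in b(s) \tag{7}
\end{align}

where $\psi_u^{(n)}(U_f^{(n)} : f \to u)$ is the decoded estimate of $Y_s^{(n)}$ at node $u$ from $(U_f^{(n)} : f \to u)$ via the mapping $\psi_u^{(n)}$.

The set of all achievable link capacity tuples is denoted by $\mathcal{R}_{cs}$, where the subscript indicates the correlated source case.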

### II-A Graphical Bounds

In [10], the author gave a necessary and sufficient condition for the case in which each sink requires all the sources. (The results were generalized for networks with noisy channels. However, in this paper we are mainly concerned with networks with error-free channels.) This result includes, as a special case, the necessary and sufficient condition [11], [12] for networks in which every source is demanded by a single sink.

###### Theorem 1 (Theorem 3.1, [10])

For networks of error free channels, the transmission of sources is feasible if and only if

$$H(Y_W \mid Y_{W^c}) \le \min_{T} \sum_{e:\, \mathrm{head}(e) \in T,\ \mathrm{tail}(e) \in T^c} C_e \tag{8}$$

where source sessions are available at some nodes in and all source sessions are demanded by at least one node in , i.e., this is the min-cut of the graph.

As mentioned above, for a few special cases a necessary and sufficient condition for reliable transmission of correlated sources over a network is given in [12], [9] and [10]. However, the problem is largely uncharted in general. Until recently the literature did not even contain a nontrivial necessary condition for reliable transmission of correlated sources in general multicast networks. In [4], we made the first attempt to address this problem by characterizing a graph-based bound, called the “functional dependence bound”, for networks with correlated sources and arbitrary sink demands. The functional dependence bound was initially characterized for networks with independent sources [13]. Later in [3], we showed that the functional dependence bound is also an outer bound for networks with correlated sources.

In [4] we gave an abstract definition of a functional dependence graph, which expressed a set of local dependencies between random variables. In particular, we described a test for functional dependence, and gave a basic result relating local and global dependence. Below is the functional dependence bound based on the implications of functional dependence.

###### Theorem 2 (Functional dependence bound [4])

Consider a functional dependence graph on the (source and edge) random variables. Let $\mathcal{M}$ be the collection of all maximal irreducible sets [4, Definition 25]. Then

$$h(Y_W \mid Y_{W^c}) \le \min_{\{U_A, Y_{W^c}\} \in \mathcal{M}} \sum_{e \in A} C_e, \tag{9}$$

where and .

The functional dependence region is defined as follows.

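$$\mathcal{R}_{FD} \triangleq \bigcap_{W \subseteq \mathcal{S}} \Big\{ h \in \mathbb{R}_+^{2^{|\mathcal{S}|}-1} : h(Y_W \mid Y_{W^c}) \le \min_{\{U_A, Y_{W^c}\} \in \mathcal{M}} \sum_{e \in A} C_e \Big\} \tag{10}$$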

where edge-sets are subsets of maximal irreducible sets.

We also generalized existing bounding techniques that characterize geometric bounds for multicast networks with independent sources for networks with correlated sources.

### II-B Geometric Bounds

In this section, we focus on outer bounds on the achievable rate region for networks with correlated sources using a geometric approach. We present the outer bound $\mathcal{R}_{cs}(\bar{\Gamma}^*)$, defined using the set of almost entropic vectors $\bar{\Gamma}^*$, and the outer bound $\mathcal{R}_{cs}(\Gamma)$ (again called the LP bound), defined using the set of polymatroids $\Gamma$, similar to the bounds given for independent sources in [2, Chapter 15].

###### Definition 3

Consider a network coding problem for a set of correlated source random variables . Let be the set of all link capacity tuples such that there exists a function (over the set of variables ) satisfying the following constraints:

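\begin{align}
h(X_W : W \subseteq \mathcal{S}) &= H(Y_W) \tag{11}\\
h(U_e \mid \{X_s : a(s) \to e\}, \{U_f : f \to e\}) &= 0 \tag{12}\\
h(X_s : u \in b(s) \mid U_e : e \to u) &= 0, \quad \forall u \in b(s) \tag{13}\\
h(U_e) &\le C_e \tag{14}
\end{align}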

for all and .

Taking as and in Definition 3 gives us regions and respectively.

###### Theorem 3 (Outer bound [4])

$$\mathcal{R}_{cs} \subset \mathcal{R}_{cs}(\bar{\Gamma}^*)$$

It is well known that the region $\bar{\Gamma}^*$ is closed and convex [14]. Moreover, the regions defined by the constraints (11)-(14) are also closed and convex. Replacing $\bar{\Gamma}^*$ by $\Gamma$ in Theorem 3, we obtain an outer bound $\mathcal{R}_{cs}(\Gamma)$ for the capacity of networks with correlated sources. This bound (a linear programming bound) is an outer bound for the achievable rate region since $\bar{\Gamma}^* \subset \Gamma$, so that Theorem 3 implies the following.

###### Theorem 4 (Outer bound [4])

$$\mathcal{R}_{cs} \subset \mathcal{R}_{cs}(\bar{\Gamma}^*) \subset \mathcal{R}_{cs}(\Gamma). \tag{15}$$

The outer bounds $\mathcal{R}_{cs}(\bar{\Gamma}^*)$ and $\mathcal{R}_{cs}(\Gamma)$ given above, in terms of the region of almost entropic vectors $\bar{\Gamma}^*$ and the region of polymatroid vectors $\Gamma$, may not be tight, since the regions $\bar{\Gamma}^*$ or $\Gamma$ together with constraints (11)-(14) do not capture the exact correlation of the source random variables, i.e., the exact joint probability distribution. This is because the same entropy vector induced by the correlated sources may be induced by more than one probability distribution. The importance of incorporating knowledge of the source correlation (joint distribution) to improve the cut-set bound was also recently and independently investigated in [15].

## III Improved Outer Bounds

In this section, we give a general framework for improved outer bounds using auxiliary random variables. In Section III-A we will demonstrate by an example that the outer bound $\mathcal{R}_{cs}(\Gamma)$ is not tight and also give an explicit improved outer bound which is strictly better than the outer bound $\mathcal{R}_{cs}(\Gamma)$. In Sections III-B and III-C, we present two generalizations of Example 1 to construct auxiliary random variables to obtain improved bounds.

###### Definition 4

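Consider a set of correlated sources $\{Y_s,\ s \in \mathcal{S}\}$ with a given underlying joint probability distribution. Construct any auxiliary random variables $\{K_i,\ i \in \mathcal{L}\}$ by choosing a conditional probability distribution of the auxiliary random variables given the sources.

Let $\mathcal{R}'_{cs}(\bar{\Gamma}^*)$ be the set of all link capacity tuples such that there exists an almost entropic function $h \in \bar{\Gamma}^*$ satisfying the following constraints:

\begin{align}
h(X_W : W \subseteq \mathcal{S},\ Z_{\mathcal{Z}} : \mathcal{Z} \subseteq \mathcal{L}) &= H(Y_W, K_{\mathcal{Z}}) \tag{16}\\
h(U_e \mid \{Y_s : a(s) \to e\}, \{U_f : f \to e\}) &= 0 \tag{17}\\
h(Y_s : u \in b(s) \mid U_e : e \to u) &= 0, \quad \forall u \in b(s) \tag{18}\\
h(U_e) &\le C_e \tag{19}
\end{align}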

for all and .

Similarly, an outer bound $\mathcal{R}'_{cs}(\Gamma)$ can be defined in terms of a polymatroid function $h \in \Gamma$.

###### Theorem 5 (Improved Outer bounds)
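
\begin{align}
\mathcal{R}_{cs} &\subseteq \mathcal{R}'_{cs}(\bar{\Gamma}^*) \subseteq \mathcal{R}_{cs}(\bar{\Gamma}^*) \subseteq \mathcal{R}_{cs}(\Gamma) \tag{20}\\
\mathcal{R}_{cs} &\subseteq \mathcal{R}'_{cs}(\Gamma) \subseteq \mathcal{R}_{cs}(\Gamma) \tag{21}
\end{align}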

An improved functional dependence bound can also be obtained from the functional dependence bound by introducing auxiliary random variables. The improvement of the bounds of the form in Definition 4 over the bound without using auxiliary random variables solely depends on the construction of auxiliary random variables.

### III-A Looseness of the Outer Bounds

In this section, we demonstrate by an example that

1. the LP bound presented in Section II-B is in fact loose and

2. the bounds derived in Section II-B can be tightened by introducing auxiliary random variables.

###### Example 1

In Figure 1, three correlated sources are available at node 1 and are demanded at nodes respectively. The edges from node to nodes have sufficient capacity to carry the random variable available at node 2. The correlated sources are defined as follows.

\begin{align}
Y_1 &= (b_0, b_1) \tag{22}\\
Y_2 &= (b_0, b_2) \tag{23}\\
Y_3 &= (b_1, b_2) \tag{24}
\end{align}

where $b_0, b_1, b_2$ are independent, uniform binary random variables.

###### Definition 5

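The LP bound $\mathcal{R}_{cs}(\Gamma)$ for the network in Figure 1 is the set of all link capacity tuples $(C_e,\ e = 1, \dots, 4)$ such that there exists $h \in \Gamma$ satisfying the following constraints.

\begin{align}
h(X_s) &= 2, \quad s = 1, 2, 3 \tag{25}\\
h(X_i, X_j) &= 3, \quad i \ne j,\ i, j \in \{1, 2, 3\} \tag{26}\\
h(U_e \mid X_1, X_2, X_3) &= 0, \quad e = 1, 2, 3, 4 \tag{27}\\
h(X_1 \mid U_2 U_1) &= 0 \tag{28}\\
h(X_2 \mid U_3 U_1) &= 0 \tag{29}\\
h(X_3 \mid U_4 U_1) &= 0 \tag{30}\\
h(U_e) &\le C_e, \quad e = 1, \dots, 4 \tag{31}
\end{align}

Note that the link capacity tuple $(C_e = 1,\ e = 1, \dots, 4)$ is in the region $\mathcal{R}_{cs}(\Gamma)$ by choosing $h$ as the entropy function of the following random variables:

\begin{align*}
X_1 &= (b_0, b_1), & U_1 &= b_0\\
X_2 &= (b_0, b_2), & U_2 &= b_1\\
X_3 &= (b_0, b_1 \oplus b_2), & U_3 &= b_2\\
& & U_4 &= b_1 \oplus b_2
\end{align*}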

which satisfies (25)-(31) and polymatroidal axioms since these are random variables.
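The claim above can be checked numerically. The following is a minimal sketch, assuming entropies are measured in bits and that every link capacity equals 1; the helper names `entropy`, `h` and `h_cond` are illustrative and not part of the paper. It enumerates the eight equally likely values of the three independent bits and evaluates the constraints for the random variables listed above.

```python
import itertools
import math
from collections import Counter

def entropy(samples):
    """Shannon entropy in bits of equally weighted samples."""
    counts = Counter(samples)
    total = len(samples)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# The 8 equally likely outcomes of the independent uniform bits (b0, b1, b2).
outcomes = list(itertools.product([0, 1], repeat=3))

# The random variables chosen in the text, as functions of (b0, b1, b2).
var = {
    "X1": lambda b: (b[0], b[1]),
    "X2": lambda b: (b[0], b[2]),
    "X3": lambda b: (b[0], b[1] ^ b[2]),
    "U1": lambda b: b[0],
    "U2": lambda b: b[1],
    "U3": lambda b: b[2],
    "U4": lambda b: b[1] ^ b[2],
}

def h(*names):
    """Joint entropy of the named variables."""
    return entropy([tuple(var[n](b) for n in names) for b in outcomes])

def h_cond(target, given):
    """Conditional entropy h(target | given) = h(target, given) - h(given)."""
    return h(*target, *given) - h(*given)

singles = [h(x) for x in ("X1", "X2", "X3")]                         # (25)
pairs = [h(x, y) for x, y in itertools.combinations(("X1", "X2", "X3"), 2)]  # (26)
decoding = [
    h_cond(["X1"], ["U2", "U1"]),                                    # (28)
    h_cond(["X2"], ["U3", "U1"]),                                    # (29)
    h_cond(["X3"], ["U4", "U1"]),                                    # (30)
]
link_entropies = [h(u) for u in ("U1", "U2", "U3", "U4")]            # (31)
```

Constraint (27) holds by construction in this sketch, because every edge variable is defined as a function of the three bits, which are themselves recoverable from $(X_1, X_2, X_3)$.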

Now, we will characterize an improved LP bound by constructing auxiliary random variables $Z_0, Z_1, Z_2$.

###### Definition 6

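An improved LP bound $\mathcal{R}'_{cs}(\Gamma)$ for the network in Figure 1 is the set of all link capacity tuples $(C_e,\ e = 1, \dots, 4)$ such that there exists $h \in \Gamma$ satisfying the following constraints.

\begin{align}
h(X_s) &= 2, \quad s = 1, 2, 3 \tag{32}\\
h(X_i, X_j) &= 3, \quad i \ne j,\ i, j \in \{1, 2, 3\} \tag{33}\\
h(Z_\alpha) &= |\alpha|, \quad \alpha \subseteq \{0, 1, 2\} \tag{34}\\
h(X_1 \mid Z_0, Z_1) &= 0 \tag{35}\\
h(X_2 \mid Z_0, Z_2) &= 0 \tag{36}\\
h(X_3 \mid Z_1, Z_2) &= 0 \tag{37}\\
h(Z_0 Z_1) &= h(X_1) \tag{38}\\
h(Z_0 Z_2) &= h(X_2) \tag{39}\\
h(Z_1 Z_2) &= h(X_3) \tag{40}\\
h(U_e \mid X_1, X_2, X_3) &= 0, \quad e = 1, 2, 3, 4 \tag{41}\\
h(X_1 \mid U_2 U_1) &= 0 \tag{42}\\
h(X_2 \mid U_3 U_1) &= 0 \tag{43}\\
h(X_3 \mid U_4 U_1) &= 0 \tag{44}\\
h(U_e) &\le C_e, \quad e = 1, \dots, 4. \tag{45}
\end{align}

Note that, by Definition 6, the link capacity tuple $(C_e = 1,\ e = 1, \dots, 4)$ is in the improved LP bound $\mathcal{R}'_{cs}(\Gamma)$ if and only if there exists a polymatroid $h$ satisfying (32)-(45). In the following, we prove that this link capacity tuple is indeed not in $\mathcal{R}'_{cs}(\Gamma)$ of Definition 6, and hence is not achievable.

Suppose to the contrary that $(C_e = 1,\ e = 1, \dots, 4)$ is in $\mathcal{R}'_{cs}(\Gamma)$. Then by definition, there exists a polymatroid $h$ satisfying (32)-(45). From these constraints, it is easy to prove that

\begin{align}
h(U_1 \mid Z_0 Z_1) &= 0 \tag{46}\\
h(U_1 \mid Z_0 Z_2) &= 0 \tag{47}\\
h(U_1 \mid Z_1 Z_2) &= 0 \tag{48}\\
h(Z_0 Z_1 Z_2) &= h(Z_0) + h(Z_1) + h(Z_2). \tag{49}
\end{align}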

As $h(U_1 \mid Z_0 Z_2) = 0$ by (47), it follows that

$$I_h(U_1; Z_1 \mid Z_0 Z_2) = 0. \tag{50}$$

On the other hand, by (49), we have

$$I_h(Z_1; Z_2 \mid Z_0) = 0. \tag{51}$$

Therefore,

$$I_h(Z_1; Z_2, U_1 \mid Z_0) = 0 \tag{52}$$

and consequently,

$$I_h(Z_1; U_1 \mid Z_0) = 0. \tag{53}$$

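Together with (46), this implies $h(U_1 \mid Z_0) = 0$. Similarly, we can also prove that

$$h(U_1 \mid Z_2) = h(U_1 \mid Z_0) = 0. \tag{54}$$

Using the same argument once again, we can prove that these equalities imply $h(U_1) = 0$.

Finally, $h(X_1 \mid U_2 U_1) = 0$ implies

\begin{align}
2 = h(X_1) &\le h(U_1, U_2) \tag{55}\\
&\le h(U_1) + h(U_2) \tag{56}\\
&= h(U_2) \tag{57}\\
&\le 1. \tag{58}
\end{align}

This is a contradiction, and hence there exists no polymatroid $h$ which satisfies (32)-(45). In other words, the link capacity tuple $(C_e = 1,\ e = 1, \dots, 4)$ is not in $\mathcal{R}'_{cs}(\Gamma)$ of Definition 6. (End of Example 1)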
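To illustrate what the auxiliary constraints add, the sketch below fixes the auxiliary variables to the three underlying bits ($Z_0 = b_0$, $Z_1 = b_1$, $Z_2 = b_2$) and assumes that the constraint on the third source conditions on $(Z_1, Z_2)$, the two bits that make up $Y_3 = (b_1, b_2)$. Under these assumptions the true sources satisfy the decoding-from-auxiliaries constraints, while the substitute variable $(b_0, b_1 \oplus b_2)$ used to show looseness of the LP bound does not. This checks a single choice of auxiliary variables only; the proof above covers every polymatroid. The helper names are illustrative.

```python
import itertools
import math
from collections import Counter

# The 8 equally likely outcomes of the independent uniform bits (b0, b1, b2).
outcomes = list(itertools.product([0, 1], repeat=3))

def H(f):
    """Entropy in bits of f(b0, b1, b2) under the uniform distribution."""
    counts = Counter(f(b) for b in outcomes)
    n = len(outcomes)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def H_cond(f, g):
    """Conditional entropy H(f | g) = H(f, g) - H(g)."""
    return H(lambda b: (f(b), g(b))) - H(g)

# Assumed auxiliary variables: Z0 = b0, Z1 = b1, Z2 = b2, taken in pairs.
z01 = lambda b: (b[0], b[1])
z02 = lambda b: (b[0], b[2])
z12 = lambda b: (b[1], b[2])

# True sources from (22)-(24).
y1 = lambda b: (b[0], b[1])
y2 = lambda b: (b[0], b[2])
y3 = lambda b: (b[1], b[2])

# Substitute for the third source used in the looseness example.
x3_loose = lambda b: (b[0], b[1] ^ b[2])

true_gaps = [H_cond(y1, z01), H_cond(y2, z02), H_cond(y3, z12)]
loose_gap = H_cond(x3_loose, z12)  # b0 stays unknown given (b1, b2)
```

With this choice of auxiliary variables, `true_gaps` contains only zeros, whereas `loose_gap` equals one bit, so the substitute variable is ruled out for this particular choice of $Z_0, Z_1, Z_2$ even though it has the same single and pairwise entropies as $Y_3$.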

In Definition 4, we presented new improved outer bounds on the capacity region of networks with correlated sources using auxiliary random variables. However, one problem remains to be solved: how should auxiliary random variables be constructed so that they tighten the bounds or, more generally, lead to a characterization of the capacity region for networks with correlated sources? While this appears to be a hard problem to answer in general, we propose two approaches to construct auxiliary random variables. First, we propose to construct auxiliary random variables from common information.

### III-B Auxiliary Random Variables from Common Information

The first approach is to construct an auxiliary random variable which is almost the common information of two random variables. This approach is a direct generalization of Example 1 in the previous section in a sense that the auxiliary random variables in Example 1 are precisely the common information between pairs of source random variables. This fact also implies that the approach leads to characterization of improved bounds.

###### Definition 7 (Common Information [16])

For any random variables $X$ and $Y$, the common information of $X$ and $Y$ is the random variable which has the maximal entropy among all random variables $K$ such that

\begin{align}
H(K \mid X) &= 0 \tag{59}\\
H(K \mid Y) &= 0. \tag{60}
\end{align}

In many cases, it is not easy to find the common information between two random variables. For example, let be a binary random variable such that and . Suppose is another binary random variable independent of and . Then if [16] ( see also [17])

\begin{align}
H(K \mid X) &= 0 \tag{61}\\
H(K \mid Y) &= 0, \tag{62}
\end{align}

then even if and are almost the same for sufficiently small .
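The fragility of common information can be seen numerically. The sketch below relies on a standard characterization that is not stated in this paper: a random variable that is a function of both $X$ and $Y$ must be constant on each connected component of the bipartite graph joining $x$ and $y$ whenever $P_{XY}(x, y) > 0$, so the maximal entropy in Definition 7 is the entropy of the component index. The function name, the example distributions and the value `eps = 0.01` are illustrative assumptions.

```python
import math
from collections import defaultdict

def common_information_bits(pxy):
    """Entropy (bits) of the connected-component index of the support graph.

    pxy maps (x, y) pairs to probabilities.
    """
    parent = {}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for (x, y), p in pxy.items():
        if p > 0:
            parent.setdefault(("x", x), ("x", x))
            parent.setdefault(("y", y), ("y", y))
            parent[find(("x", x))] = find(("y", y))  # merge the two components

    mass = defaultdict(float)
    for (x, y), p in pxy.items():
        if p > 0:
            mass[find(("x", x))] += p
    return -sum(p * math.log2(p) for p in mass.values())

# X = (b0, b1), Y = (b0, b2): the shared bit b0 is common information.
shared_bit = {((b0, b1), (b0, b2)): 1 / 8
              for b0 in (0, 1) for b1 in (0, 1) for b2 in (0, 1)}

# X uniform binary, Y equal to X except with a small flip probability eps.
eps = 0.01
noisy_copy = {(0, 0): (1 - eps) / 2, (0, 1): eps / 2,
              (1, 0): eps / 2, (1, 1): (1 - eps) / 2}

ci_shared = common_information_bits(shared_bit)
ci_noisy = common_information_bits(noisy_copy)
```

In the first case the support graph splits into two components indexed by the shared bit, giving one bit of common information. In the second case every pair has positive probability, the graph is connected, and the common information vanishes even though the two variables agree with probability 0.99.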

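To address this issue, we propose a different way to construct auxiliary random variables. Consider any pair of random variables $(X, Y)$ with probability distribution $P_{XY}$. For any $\delta \ge 0$, let

$$\mathcal{P}(\delta) \triangleq \left\{ P_{K|XY}(\cdot) : H(K \mid X) \le \delta,\ H(K \mid Y) \le \delta,\ I(X; Y \mid K) \le \delta \right\} \tag{63}$$

where the probability distribution of $(X, Y, K)$ is given by

$$\Pr(X = x, Y = y, K = k) \triangleq P_{XY}(X = x, Y = y)\, P_{K|XY}(K = k \mid X = x, Y = y). \tag{64}$$

Note that the “smaller” $\delta$ is, the more similar the random variable $K$ (associated with the conditional distribution $P_{K|XY}$) is to the common information.

Our constructed random variable will be selected from $\mathcal{P}(\delta^*)$ to formulate an improved LP bound, where

$$\delta^* = \min_{\delta:\, \mathcal{P}(\delta) \ne \emptyset} \delta. \tag{65}$$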
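Finding $\delta^*$ requires a search over all conditional distributions, which is not attempted here. As a smaller illustration, the sketch below takes one candidate conditional distribution of $K$ given $(X, Y)$ and returns the smallest $\delta$ for which that candidate belongs to $\mathcal{P}(\delta)$, namely the maximum of $H(K|X)$, $H(K|Y)$ and $I(X;Y|K)$. The function names and the two candidates are illustrative assumptions; entropies are in bits.

```python
import math
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(joint, idx):
    """Marginal of a joint distribution over the coordinates listed in idx."""
    out = defaultdict(float)
    for key, p in joint.items():
        out[tuple(key[i] for i in idx)] += p
    return out

def smallest_delta(pxy, pk_given_xy):
    """Smallest delta such that the candidate P_{K|XY} lies in P(delta)."""
    joint = {(x, y, k): p * q
             for (x, y), p in pxy.items()
             for k, q in pk_given_xy[(x, y)].items()}
    h = lambda idx: entropy(marginal(joint, idx))
    h_k_given_x = h((0, 2)) - h((0,))
    h_k_given_y = h((1, 2)) - h((1,))
    # I(X;Y|K) = H(X,K) + H(Y,K) - H(X,Y,K) - H(K)
    i_xy_given_k = h((0, 2)) + h((1, 2)) - h((0, 1, 2)) - h((2,))
    return max(h_k_given_x, h_k_given_y, i_xy_given_k)

# Sources X = (b0, b1) and Y = (b0, b2) with independent uniform bits.
pxy = {((b0, b1), (b0, b2)): 1 / 8
       for b0 in (0, 1) for b1 in (0, 1) for b2 in (0, 1)}

# Candidate 1: K equals the shared bit b0 (a deterministic function).
k_shared = {xy: {xy[0][0]: 1.0} for xy in pxy}
# Candidate 2: K is a constant and carries no information.
k_constant = {xy: {0: 1.0} for xy in pxy}

delta_shared = smallest_delta(pxy, k_shared)
delta_constant = smallest_delta(pxy, k_constant)
```

For the first candidate all three quantities vanish, so it lies in $\mathcal{P}(0)$. For the constant candidate the binding term is $I(X;Y|K) = I(X;Y)$, which is one bit for these sources.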

For a multi-source multicast network with source random variables one can construct random variables from the family of distributions

 (66)

An improved LP bound for a multi-source multicast network with source random variables can be computed by constructing the random variables and then taking inequalities

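\begin{align}
H(K_{ij} \mid Y_i) &\le \delta_{ij}, \tag{67}\\
H(K_{ij} \mid Y_j) &\le \delta_{ij}, \tag{68}\\
I(Y_i; Y_j \mid K_{ij}) &\le \delta_{ij} \tag{69}
\end{align}

into consideration, in addition to constraints (11)-(14) and the elemental inequalities.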

### III-C Linearly Correlated Random Variables

In some scenarios, source random variables are “linearly correlated”. In this section we present a construction method for auxiliary random variables describing linear correlation between random variables. This approach is also a direct generalization of Example 1 in the previous section in a sense that the source random variables are linearly correlated.

###### Definition 8

A set of random variables is called linearly correlated if

1. for any , the support of the probability distribution of is a vector subspace and

2. is uniformly distributed.

Let be a set of linearly correlated random variables with support vector subspaces

$$V_i \subseteq \mathbb{F}_q^m \tag{70}$$

and

$$\langle V_i : i \in n \rangle = \langle Y_i : i \in m \rangle \tag{71}$$

where are linearly independent. That is, is a basis for the subspaces . It can be noticed that, there exists a set of linearly independent random variables uniformly distributed over the support induced from a basis of the vector spaces . That is,

 H(Ka)=alog2m. (72)

The random variable can be written as a function of random variables as follows

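$$Y_i = [K_1 \ \dots \ K_m]\, A_i \tag{73}$$

where

$$A_i = \begin{bmatrix}
a^i_{1,1} & a^i_{1,2} & \cdots & a^i_{1,\dim(V_i)}\\
a^i_{2,1} & a^i_{2,2} & \cdots & a^i_{2,\dim(V_i)}\\
\vdots & \vdots & \ddots & \vdots\\
a^i_{m,1} & a^i_{m,2} & \cdots & a^i_{m,\dim(V_i)}
\end{bmatrix} \tag{74}$$

is an $m \times \dim(V_i)$ coefficient matrix.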

Thus, the random variables are linear functions of the random variables . In particular, a random variable is a function of the random variables such that the coefficient of is non-zero. Then we have the following equalities.

$$H(Y_i \mid K_j : a^i_{jl} \ne 0,\ \forall l \in \{1, \dots, \dim(V_i)\},\ a^i_{jl} \in A_i) = 0 \tag{75}$$

 (76)

An improved LP bound can be computed by taking equalities (72), (75) and (76) into consideration, in addition to constraints (11)-(14) and the elemental inequalities.
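The sources of Example 1 give a small instance of this construction. The sketch below is a hedged illustration that assumes the field is GF(2), takes three independent uniform basis variables, and uses hypothetical coefficient matrices with 0/1 entries that reproduce $Y_1 = (K_1, K_2)$, $Y_2 = (K_1, K_3)$ and $Y_3 = (K_2, K_3)$. The conditioning set in (75) is read here as the basis variables whose row of $A_i$ is not entirely zero; that reading is an assumption.

```python
import itertools
import math
from collections import Counter

Q = 2  # field size; GF(2) is an illustrative assumption
M = 3  # number of basis variables K_1, ..., K_M

# Hypothetical coefficient matrices A_i with M rows and dim(V_i) columns.
A = {
    1: [[1, 0], [0, 1], [0, 0]],  # Y_1 = (K_1, K_2)
    2: [[1, 0], [0, 0], [0, 1]],  # Y_2 = (K_1, K_3)
    3: [[0, 0], [1, 0], [0, 1]],  # Y_3 = (K_2, K_3)
}

# All equally likely values of the basis vector (K_1, ..., K_M).
basis_outcomes = list(itertools.product(range(Q), repeat=M))

def source(i, k):
    """Row vector k times the matrix A_i, with arithmetic modulo Q."""
    cols = len(A[i][0])
    return tuple(sum(k[j] * A[i][j][l] for j in range(M)) % Q
                 for l in range(cols))

def used_basis(i, k):
    """Basis variables whose row of A_i has at least one nonzero entry."""
    return tuple(k[j] for j in range(M) if any(A[i][j]))

def entropy(values):
    """Shannon entropy in bits of equally weighted samples."""
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

h_single = {i: entropy([source(i, k) for k in basis_outcomes]) for i in A}
h_pair = entropy([(source(1, k), source(2, k)) for k in basis_outcomes])
h_cond = {
    i: entropy([(source(i, k), used_basis(i, k)) for k in basis_outcomes])
       - entropy([used_basis(i, k) for k in basis_outcomes])
    for i in A
}
```

Each source has entropy of two bits, any pair has three bits, and each source is a deterministic function of the basis variables it uses, which is the kind of equality that can be added to the linear program.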

## IV Probability Distribution using Entropy Functions

The basic question is: how accurately can entropy functions specify the correlation among random variables? We partly answer the question by showing that the joint probability distribution of random variables can be completely specified by entropy functions, subject to some moderate constraints. First, we introduce some notation.

#### Notations

Let $N = \{1, \dots, n\}$ and $X^*$ be a random variable. Assume without loss of generality that $X^*$ has a positive probability distribution over $N$. Let $P(N)$ be the set of all nonempty subsets of $N$. The size of the support of a random variable (roughly speaking, the number of possible values that it can take with positive probability) will be denoted by . For notational simplicity, we will not distinguish between a set $\{i\}$ with a single element and the element $i$. Two random variables are regarded as equivalent if they are functions of each other. Therefore, and are regarded as equivalent.

Let

$$h_b(q) \triangleq -q \log q - (1 - q) \log(1 - q). \tag{77}$$

The function $h_b$ is not one-to-one over the interval $[0, 1]$. Yet, we will use $h_b^{-1}(\delta)$ to denote the unique $q \in [0, 1/2]$ such that

$$h_b(q) = \delta. \tag{78}$$
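A minimal numerical sketch of the binary entropy function and of its inverse follows. It assumes logarithms to base 2 and assumes that the inverse is taken on the interval from 0 to 1/2, where the function is strictly increasing; the bisection routine and its tolerance are illustrative choices.

```python
import math

def hb(q):
    """Binary entropy in bits, with hb(0) = hb(1) = 0 by continuity."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def hb_inverse(delta, tol=1e-12):
    """The q in [0, 1/2] with hb(q) = delta, found by bisection.

    Assumes 0 <= delta <= 1; hb is strictly increasing on [0, 1/2].
    """
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if hb(mid) < delta:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# hb is symmetric about 1/2, so 0.3 and 0.7 share the same entropy and the
# inverse returns the smaller of the two candidates.
q_from_small = hb_inverse(hb(0.3))
q_from_large = hb_inverse(hb(0.7))
```

Both calls return approximately 0.3, which reflects the fact that the entropy of a binary random variable identifies its distribution only up to a permutation of the two outcomes.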

### IV-A Single Random Variable Case

First we consider the problem of characterizing distribution of single random variable via entropy functions. To understand the idea, consider a binary random variable such that and . While the entropy of does not determine exactly what the probabilities of are, it essentially determines the probability distribution (up to permutations). To be precise, let such that where Then either or . Furthermore, the two possible distributions are in fact permutations of each other.

When is not binary, the entropy alone is not sufficient to characterize the probability distribution of . However, by using auxiliary random variables, it turns out that the distribution of can still be determined.

The idea is best demonstrated by an example. Suppose is ternary, taking values from the set . Suppose also that for all . Define random variables , and such that

$$A_i = \begin{cases} 1 & \text{if } X = i\\ 0 & \text{otherwise.} \end{cases} \tag{79}$$

Clearly,

$$H(A_i \mid X) = 0 \tag{80}$$

and

$$H(A_i) = h_b(p_X(i)). \tag{81}$$

Let us further assume that $p_X(i) \le 1/2$ for all $i$. Then by (81) and the strict monotonicity of $h_b$ in the interval $[0, 1/2]$, it seems at first glance that the distribution of $X$ is uniquely specified by the entropies of the auxiliary random variables. However, this is only half of the story and there is a catch in the argument: the auxiliary random variables chosen are not arbitrary. When we “compute” the probabilities of $X$ from the entropies of the auxiliary random variables, we assume that we know how the random variables are constructed. Without knowing the “construction”, it is unclear how to find the probabilities of $X$ from entropies. More precisely, suppose we only know that there exist auxiliary random variables such that (80) holds and their entropies are given by (81) (without knowing that the random variables are specified by (79)). Then we cannot determine precisely what the distribution of $X$ is. Having said that, in this paper we will show that the distribution of $X$ can in fact be fully characterized by the “joint entropies” of the auxiliary random variables.
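The two identities for the indicator variables can be checked directly on a small case. The sketch below uses a hypothetical ternary distribution with probabilities 0.5, 0.3 and 0.2 (chosen for illustration and not taken from the paper) and measures entropy in bits. It builds the joint distribution of the source and each indicator, then compares the indicator entropy with the binary entropy of the corresponding probability.

```python
import math
from collections import defaultdict

# Hypothetical ternary distribution, with every probability at most 1/2.
p_x = {1: 0.5, 2: 0.3, 3: 0.2}

def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def hb(q):
    """Binary entropy in bits."""
    return entropy({0: 1 - q, 1: q})

h_indicator = {}          # H(A_i)
h_indicator_given_x = {}  # H(A_i | X) = H(X, A_i) - H(X)
for i in p_x:
    # Joint distribution of (X, A_i), where A_i = 1 exactly when X = i.
    joint = {(x, int(x == i)): p for x, p in p_x.items()}
    indicator = defaultdict(float)
    for (x, a), p in joint.items():
        indicator[a] += p
    h_indicator[i] = entropy(indicator)
    h_indicator_given_x[i] = entropy(joint) - entropy(p_x)
```

Here each `h_indicator[i]` matches `hb(p_x[i])` and each conditional entropy is zero. As the paragraph above stresses, inverting these entropies recovers the probabilities only because the construction of the indicators is known.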

#### IV-A1 Construction of auxiliary random variables

###### Definition 9 (Constructing auxiliary random variables $X^*_a$)

For any $a \in P(N)$, let $X^*_a$ be the auxiliary random variable such that

$$X^*_a = \begin{cases} 1 & \text{if } X^* \in a\\ 0 & \text{otherwise.} \end{cases} \tag{82}$$

Notice that when .

###### Proposition 1 (Property 1: Distinct)

For any distinct $a, b \in P(N)$,

\begin{align}
H(X^*_a \mid X^*_b) &> 0, \text{ and} \tag{83}\\
H(X^*_b \mid X^*_a) &> 0. \tag{84}
\end{align}

Proof.

First note that and hence

$$\Pr(X^*_a = 0, X^*_b = 0) > 0. \tag{85}$$

Since are nonempty and distinct, there are two possible cases. In the first case, is nonempty. In this case, it can be checked easily that either , , or both must be nonempty. In the second case, . Then clearly both and must be nonempty. Finally, since has strictly positive probability distribution, then we can easily check that the theorem holds.

###### Proposition 2 (Property 2: Subset)

Suppose . Then

$$H(X^*_a \mid X^*_i,\ i \in b) > 0 \tag{86}$$

if and only if is nonempty.

Proof.

By direct verification.

###### Proposition 3 (Property 3: Partition)

For any $a \in P(N)$, there exist random variables $X^*_{b_1}, \dots, X^*_{b_{n-2}}$ such that

$$H(X^*_{b_k} \mid X^*_a, X^*_{b_1}, \dots, X^*_{b_{k-1}}) > 0 \tag{87}$$

for all .

Proof.

Assume without loss of generality that . Let

$$X^*_{b_1} = X^*_2,\ X^*_{b_2} = X^*_3,\ \dots,\ X^*_{b_{n-2}} = X^*_{n-1}. \tag{88}$$

We can verify directly that (87) is satisfied.

In the rest of the paper, we will assume without loss of generality that

$$p_X(1) \ge p_X(2) \ge \cdots \ge p_X(n) > 0. \tag{89}$$

###### Proposition 4 (Property 4: The smallest atom)

$X^*_n$ has the minimum entropy among all $X^*_a$, $a \in P(N)$. In other words,

$$\min_{a \in P(N)} H(X^*_a) = H(X^*_n). \tag{90}$$

Proof.

Consider . First notice that for all and hence On the other hand,

$$\sum_{i \in a} p_X(i) \le \sum_{i \in a} p_X(i) + p_X(1) - p_X(n) \le 1 - p_X(n). \tag{91}$$

Therefore, and consequently . The proposition is thus proved.

###### Proposition 5 (Property 5: Singleton Xi)

Suppose . Then for any ,

$$H(X^*_i \mid X^*_{i+1}, \dots, X^*_n) > 0. \tag{92}$$

In addition, for all such that ,

\begin{align}
H(X^*_i \mid X^*_{i+1}, \dots, X^*_n) &\le H(X^*_a \mid X^*_{i+1}, \dots, X^*_n) \tag{93}\\
H(X^*_i) &\le H(X^*_a). \tag{94}
\end{align}

Proof.

Inequality (92) can be directly verified. Now, suppose . To prove (93) and (94), first notice that

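$$\begin{aligned}
H(X^*_a \mid X^*_{i+1}, \dots, X^*_n) &= \sum_{a_{i+1}, \dots, a_n \in \{0, 1\}} p(X^*_{i+1} = a_{i+1}, \dots, X^*_n = a_n)\, H(X^*_a \mid X^*_{i+1} = a_{i+1}, \dots, X^*_n = a_n)\\
&\overset{(a)}{=} p(X^*_{i+1} = 0, \dots, X^*_n = 0)\, H(X^*_a \mid X^*_{i+1} = 0, \dots, X^*_n = 0).
\end{aligned}$$

Here, $(a)$ follows from the fact that $H(X^*_a \mid X^*_{i+1} = a_{i+1}, \dots, X^*_n = a_n) = 0$ if $a_j = 1$ for some $j$, since the value of $X^*$ is then known.

Consider any $a$ which is not a subset of $i^+$, where $i^+ \triangleq \{i+1, \dots, n\}$, and let

$$q_a \triangleq \frac{\sum_{k \in a \setminus i^+} p_X(k)}{\sum_{j \notin i^+} p_X(j)}. \tag{95}$$

It can be verified easily that

$$H(X^*_a \mid X^*_{i+1} = 0, \dots, X^*_n = 0) = h_b(q_a). \tag{96}$$

As $a$ is not a subset of $i^+$, there exists $k \in a$ such that $k \le i$. In this case,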

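$$\sum_{j \in a \setminus i^+} p_X(j) \ge p_X(k) \ge p_X(i). \tag{97}$$

Hence, $q_a \ge q_i$, where $q_i$ denotes (95) evaluated at $a = \{i\}$. On the other hand,

$$\sum_{j:\, j \in a \setminus i^+} p_X(j) \le \sum_{j:\, j \in a \setminus i^+} p_X(j) + p_X(1) - p_X(i) \le \sum_{j:\, j \notin i^+} p_X(j) - p_X(i). \tag{98}$$

Hence, by dividing both (97) and (98) by $\sum_{j \notin i^+} p_X(j)$, we can prove that

$$1 - q_i \ge q_a \ge q_i,$$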

and thus (93) holds. Now it remains to prove (94).

Notice again from (97) and (98) that and

$$\sum_{j:\, j \in a} p_X(j) \le \sum_{j:\, j \in a} p_X(j) + p_X(1) - p_X(i) \le 1 - p_X(i). \tag{99}$$

Consequently, and the proposition is proved.

#### IV-A2 Uniqueness

In the previous subsection, we defined how to construct a set of auxiliary random variables from $X^*$, and identified properties of these random variables in relation to the underlying probability distribution. In the following, we will show that the constructed set of auxiliary random variables is in fact sufficient to fully characterize the underlying probability distribution of $X^*$.

Let $Y$ be a random variable such that there exist auxiliary random variables

$$\{Y_a,\ a \in P(N)\} \tag{100}$$

such that

\begin{align}
H(X^*_a,\ a \in \alpha) &= H(Y_a,\ a \in \alpha), \quad \forall \alpha \subseteq P(N) \tag{101}\\
H(X^*, X^*_a,\ a \in \alpha) &= H(Y, Y_a,\ a \in \alpha), \quad \forall \alpha \subseteq P(N). \tag{102}
\end{align}

In other words, $Y$ is a random variable such that there exist auxiliary random variables $\{Y_a,\ a \in P(N)\}$ such that the entropy function of $(Y, Y_a,\ a \in P(N))$ is essentially the same as that of