The Three-User Finite-Field Multi-Way Relay Channel with Correlated Sources
This paper studies the three-user finite-field multi-way relay channel, where the users exchange messages via a relay. The messages are arbitrarily correlated, and the finite-field channel is linear and is subject to additive noise of arbitrary distribution. The problem is to determine the minimum achievable source-channel rate, defined as channel uses per source symbol needed for reliable communication. We combine Slepian-Wolf source coding and functional-decode-forward channel coding to obtain the solution for two classes of source and channel combinations. Furthermore, for correlated sources that have their common information equal their mutual information, we propose a new coding scheme to achieve the minimum source-channel rate.
We study the three-user multi-way relay channel (MWRC) with correlated sources, where each user transmits its data to the other two users via a single relay, and where the users’ messages can be correlated. The MWRC is a canonical extension of the extensively studied two-way relay channel (TWRC), where two users exchange data via a relay. Adding users to the TWRC can change the problem significantly. The MWRC has been studied from the point of view of channel coding and source coding.
In channel coding problems, the sources are assumed to be independent, and the channel noisy. The problem is to find the capacity, defined as the region of all achievable channel rate triplets (bits per channel use at which the users can encode/send on average). For the Gaussian MWRC with independent sources, Gündüz et al.  obtained asymptotic capacity results for the high SNR and the low SNR regimes. For the finite-field MWRC with independent sources, Ong et al.  constructed the functional-decode-forward coding scheme, and obtained the capacity region. For the general MWRC with independent sources, however, the problem remains open to date.
In source coding problems, the sources are assumed to be correlated, but the channel noiseless. The problem is to find the region of all achievable source rate triplets (bits per message symbol at which the users can encode/send on average). The source coding problem for the three-user MWRC was solved by Wyner et al., using cascaded Slepian-Wolf source coding.
In this paper, we study both source and channel coding in the same network, i.e., transmitting correlated sources through noisy channels (cf. our recent work on the MWRC with correlated sources and orthogonal uplinks). For most communication scenarios, the source correlation is fixed by the natural occurrence of the phenomena, and the channel is the part that engineers are “unwilling or unable to change”. Given the source and channel models, we are interested in finding the limit of how fast we can feed the sources through the channel. To this end, define the source-channel rate (also known as bandwidth ratio) as the average channel transmissions used per source tuple. Our aim is then to derive the minimum source-channel rate required such that each user can reliably and losslessly reconstruct the other two users’ messages.
In the multi-terminal network, it is well known that separating source and channel coding, i.e., designing them independently, is not always optimal (see, e.g., the multiple-access channel). Designing good joint source-channel coding schemes is difficult, let alone finding an optimal one. Gündüz et al. considered a few networks with two senders and two receivers, and showed that source-channel separation is optimal for certain classes of source structure. In this paper, we approach the MWRC in a similar direction. We show that source-channel separation is optimal for three classes of source/channel combinations, by constructing coding schemes that achieve the minimum source-channel rate.
Recently, Mohajer et al. solved the problem of linear deterministic relay networks with correlated sources. They constructed an optimal coding scheme, where each relay injectively maps its received channel output to its transmitted channel input. While this scheme is optimal for deterministic networks, such a scheme (e.g., the amplify-forward scheme in the additive white Gaussian noise channel) suffers from noise propagation in noisy channels and has been shown to be suboptimal for the MWRC with independent sources.
2.1 Source and Channel Models
We consider the MWRC depicted in Figure ?, where three users (denoted by 1, 2, and 3) exchange messages through a noisy channel with the help of a relay (denoted by 0). For each node , we denote its source by , its input to the channel by , and its received channel output by . We let , as the relay has no source.
We consider correlated and discrete-memoryless sources for the users, where , , and are generated according to some joint probability mass function
The channel consists of a finite-field uplink from the users to the relay, which takes the form
and a finite-field downlink from the relay to each user , which takes the form
where , for all , for some finite field of cardinality with the associated addition . Here, can be any prime power. We assume that the noise is not uniformly distributed, i.e., its entropy ; otherwise, it will randomize the channel, and no information can be sent through.
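As a concrete illustration, the uplink and downlink can be simulated directly. The following Python sketch assumes a prime field (so the field addition is modulo-q) and uses placeholder names for the field size and the noise distribution, since the paper's own symbols are not fixed here; it also checks the non-uniform-noise condition, i.e., that the noise entropy is strictly below the log of the field size.

```python
import numpy as np

# Sketch of the finite-field MWRC channel model. Assumption: q is prime, so
# the field addition is mod-q; all names here are illustrative, not the paper's.
q = 5
rng = np.random.default_rng(0)

def uplink(x1, x2, x3, noise_pmf):
    """Relay's received symbol: field sum of the users' inputs plus additive noise."""
    n0 = rng.choice(q, size=x1.shape, p=noise_pmf)
    return (x1 + x2 + x3 + n0) % q

def downlink(x0, noise_pmf):
    """Each user's received symbol: the relay's input plus additive noise."""
    n = rng.choice(q, size=x0.shape, p=noise_pmf)
    return (x0 + n) % q

def entropy(pmf):
    """Shannon entropy in bits."""
    p = np.asarray(pmf, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# The noise must not be uniform: H(N) < log2(q); a uniform N would
# randomize the channel and no information could be sent through.
noise_pmf = np.array([0.6, 0.1, 0.1, 0.1, 0.1])
assert entropy(noise_pmf) < np.log2(q)
```

The same two functions can be reused for any prime q by changing the modulus and the noise pmf.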
Each user sends source symbols to the other two users (simultaneously) in channel uses. We refer to the source symbols of user as its message, denoted by , where each symbol triplet for is generated independently according to . The channel is memoryless in the sense that the channel noise for all nodes and all channel uses are independent, and the distribution is fixed for all channel uses. The source-channel rate, i.e., the number of channel uses per source triplet, is denoted by .
We assume that each user has all its source symbols prior to the channel uses.
The -th transmitted channel symbol of each node depends on its message and its previously received channel symbols, i.e., , for all and for all .
Each user estimates the messages of the other users from its own message and all its received channel symbols, i.e., user decodes the messages from users and as , for all distinct . We denote .
Note that utilizing feedback is permitted in our system model. This is commonly referred to as the unrestricted MWRC (cf. the restricted MWRC). We will see later that for the classes of source/channel combinations for which we find the minimum source-channel rate, feedback is not used. This means that feedback provides no improvement to the source-channel rate for these cases.
User makes a decoding error if . We define as the probability that one or more users make a decoding error, and say that source-channel rate is achievable if the following is true: for any , there exists at least one block code of source-channel rate with . The aim of this paper is to find the infimum of achievable source-channel rates, denoted by . For the rest of the paper, we refer to as the minimum source-channel rate.
We will now state the main result of this paper. The technical terms (in italics) in the theorem will be defined in Section 2.3 following the theorem.
For Cases 1 and 2, we derive the achievability (upper bound) of using existing (i) Slepian-Wolf source coding and (ii) functional-decode-forward channel coding for independent sources. We abbreviate this pair of source and channel coding schemes as SW/FDF-IS. We derive a lower bound using cut-set arguments. While the achievability for these two cases is rather straightforward, what we find interesting is that using the scheme for independent messages is actually optimal for two classes of source/channel combinations. Furthermore, although the source-channel rates achievable using SW/FDF-IS cannot be expressed in a closed form, we are able to derive closed-form conditions for two classes of sources where the achievability of SW/FDF-IS matches the lower bound.
In SW/FDF-IS, the source coding—while compressing—destroys the correlation among the sources, and hence channel coding for independent sources is used. For Case 3, the sources have their common information equal their mutual information, meaning that each source is able to identify the parts of the messages it has in common with the other source(s). For this case, we again use Slepian-Wolf source coding, but we preserve the parts that the sources have in common. We then design a new channel coding scheme that takes the common parts into account. Here, the challenge is to optimize the functions of different parts that the relay should decode. We show that the new coding scheme is able to achieve .
For all three cases, the coding schemes are derived based on the separate source-channel coding architecture. Also, for Cases 1 and 3, is found under conditions on the sources alone, independent of the underlying finite-field channel, i.e., for any and any noise distribution.
In this section, we define the technical terms in Theorem ?.
We can think of as the noise level on the downlink from the relay to user . So, a symmetrical channel requires that the downlinks from the relay to all the users are equally noisy. We do not impose any condition on the uplink noise level, .
Almost-Balanced Conditional Mutual Information
Put another way, for unbalanced sources, we can always find a user , such that
for some and distinct .
Skewed Conditional Entropies
Common Information Equals Mutual Information
Lastly, we define common information in the same spirit as Gács and Körner. For two users, Gács and Körner defined common information as a value on which two users can agree (using the terminology of Witsenhausen). The common information between two random variables can be as large as mutual information (in the Shannon sense), but no larger.
The concept of common information was extended to multiple users by Tyagi et al., who considered a value on which all users can agree. In this paper, we further extend common information to values on which different subsets of users can agree. We now formally define a class of sources whose common information equals their mutual information.
We give graphical interpretations using information diagrams for sources that have ABCMI and SCE in Appendix Section 8, and examples of sources that have ABCMI and their common information equal their mutual information in Section 6.
Definitions ? and ? are mutually exclusive, but Definitions ? and ? (or ? and ?) are not. This means correlated sources that have their common information equal their mutual information must also have either ABCMI, SCE, or unbalanced conditional mutual information without SCE. This leads to the graphical summary of the results of Theorem ? in Figure 1.
The rest of this paper is organized as follows: We show a lower bound and an upper bound (achievability) to in Section 3. In Section 4, we show that for Cases 1 and 2 in Theorem ?, the lower bound is achievable. In Section 5, we propose a coding scheme that takes common information into account, and show the source-channel rate achievable using this new scheme matches the lower bound. We conclude the paper with some discussion in Section 6.
3 Lower and Upper Bounds to
Denote the RHS of as
We first show that is a lower bound to . Using cut-set arguments, we can show that if source-channel rate is achievable, then
for all distinct . Here follows from Mohajer et al. and follows from Ong et al. Rearranging the equation gives the following lower bound to all achievable source-channel rates —and hence also to :
We now present the result of the SW/FDF-IS coding scheme, which first uses Slepian-Wolf source coding for the noiseless MWRC with correlated sources, followed by functional-decode-forward for independent sources (FDF-IS) channel coding for the MWRC. This scheme achieves the following source-channel rates:
The proof is based on random coding arguments and can be found in Appendix Section 9.
From Lemmas ? and ?, we have the following result:
Next, we will show that for Cases 1 and 2 in Theorem ?, SW/FDF-IS achieves all source-channel rates .
4 Proof of Cases 1 and 2 in Theorem
4.1 Proof of Case 1 in Theorem
In this subsection, we will show that if the sources have ABCMI, then . Since any relies on the existence of channel code rates , we first show the following proposition:
With this result, we now prove Case 1 of Theorem ?. We need to show that any source-channel rate is achievable, i.e., the source-channel rate
for any , lies in . Here, is independent of and .
For a source-channel rate in , we choose as in . Substituting – into , the second inequality in is satisfied. Also, – imply the first inequality in . Hence, . This proves Case 1 in Theorem ?.
4.2 Proof of Case 2 in Theorem
We need to show that if the sources have SCE and the channel is symmetrical, then the source-channel rate in is achievable for any . Recall that sources that have SCE must have unbalanced conditional mutual information, for which we can always re-index the users as , , and satisfying for some fixed .
For achievability in Lemma ?, we first show the existence of satisfying the following conditions:
Furthermore, for a symmetrical channel, we can define
So, for SCE and for symmetrical channels imply that the source-channel rate in equals
Hence, we only need to show that the source-channel rate is achievable for any .
We first choose and as in , , and , respectively. From –, we get
where and follow from ; , , and follow from .
4.3 A Numerical Example Showing that SW/FDF-IS is Not Always Optimal
In this section, we give an example showing that SW/FDF-IS can be suboptimal. Consider the following sources: , , and , where is uniformly distributed in , is uniformly distributed in , and are each uniformly distributed in . In addition, all and are mutually independent. Here, each represents common information between and .
For the channel, let the finite field be and be modulo- addition, i.e., . Furthermore, let for , and for ; let for , and for , for all .
For this source/channel combination, we have , , , , , , for all . One can verify that these sources have unbalanced conditional mutual information and do not have SCE.
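The entropy bookkeeping behind such an example can be checked mechanically. The sketch below rebuilds the stated structure, in which each pair of users shares one independent uniform component, using illustrative alphabet sizes that are assumed rather than taken from this example, and verifies that each pairwise mutual information equals the entropy of the shared component.

```python
import numpy as np

# Hedged sketch of the example's source structure: W1 = (V12, V13),
# W2 = (V12, V23), W3 = (V13, V23), with mutually independent uniform
# components. The alphabet sizes below are placeholders, NOT the paper's.
sizes = {"V12": 4, "V13": 2, "V23": 2}  # assumed illustrative sizes

# Entropy of a uniform component is log2 of its alphabet size.
H = {k: np.log2(v) for k, v in sizes.items()}

# Independence of the components makes the entropies decompose additively:
H_W1 = H["V12"] + H["V13"]
H_W2 = H["V12"] + H["V23"]
H_W3 = H["V13"] + H["V23"]
H_W12 = H["V12"] + H["V13"] + H["V23"]  # joint entropy of (W1, W2)

# Each pair's mutual information is exactly the entropy of its shared
# component, so here the common information equals the mutual information.
I_12 = H["V12"]
assert abs(I_12 - (H_W1 + H_W2 - H_W12)) < 1e-9
```

Running the same decomposition with the example's actual alphabet sizes reproduces the entropy values quoted in the text.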
In this example, . Suppose that is achievable using SW/FDF-IS. From Lemma ?, there must exist three positive real numbers , , and such that
From , we must have that and . These imply . Hence, and cannot be simultaneously true. This means the source-channel rate 1.05 is not achievable using SW/FDF-IS.
The sources described here have their common information equal their mutual information. We will next propose an alternative scheme that is optimal for this class of sources. The following proposed scheme achieves all source-channel rates for this source/channel combination, meaning that the minimum source-channel rate for this example is . So SW/FDF-IS is strictly suboptimal for this source/channel combination.
5 Proof of Case 3 in Theorem
While the achievability for Cases 1 and 2 uses existing source and channel coding schemes, for Case 3 (i.e., sources that have their common information equal their mutual information), we will use an existing source coding scheme and design a new channel coding scheme to achieve all source-channel rates .
In this section, without loss of generality,
This means we can re-write as follows:
As mentioned earlier, we will use a separate-source-channel-coding architecture, where we first perform source coding and then channel coding. We will again use random coding arguments. More specifically, we will use random linear block codes for channel coding.
We encode each to , which is a length- finite-field (of size ) vector, for all (see Definition ? for the definition of ). We also encode each to , which is a length- finite-field vector. So, each message is encoded into four subcodes, e.g., is encoded into . Some subcodes—the common parts—are shared among multiple sources.
Using the results of distributed source coding , if is sufficiently large and if
then we can decode to , to , and to with an arbitrarily small error probability. We show the proof in Appendix Section 10.
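The subset-sum form of such decodability conditions can be checked with a small feasibility routine. The sketch below is a generic Slepian-Wolf-style check, not the paper's exact inequalities: it tests whether every non-empty subset of rates covers the corresponding conditional entropy, with a toy entropy oracle standing in for an actual joint pmf.

```python
from itertools import combinations

# Generic Slepian-Wolf-type feasibility check: a rate tuple is admissible if
# every non-empty subset S of sources satisfies
#     sum_{i in S} R_i >= H(W_S | W_{S^c}).
# The cond_entropy callable is a stand-in for whatever joint pmf one has.
def sw_admissible(rates, cond_entropy):
    """rates: dict source -> rate; cond_entropy(S, Sc) -> H(W_S | W_Sc) in bits."""
    sources = list(rates)
    for r in range(1, len(sources) + 1):
        for S in combinations(sources, r):
            Sc = tuple(s for s in sources if s not in S)
            if sum(rates[s] for s in S) < cond_entropy(S, Sc) - 1e-12:
                return False
    return True

# Toy oracle: W1 and W2 are each one uniform bit with W2 = W1 (fully
# correlated), so conditioning on either one removes all uncertainty.
def cond_entropy(S, Sc):
    return 1.0 if not Sc else 0.0

assert sw_admissible({"W1": 0.5, "W2": 0.6}, cond_entropy)      # sum rate >= 1
assert not sw_admissible({"W1": 0.2, "W2": 0.3}, cond_entropy)  # sum rate < 1
```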
After source coding, user 1 has . In order for it to decode , it must receive from the other users through the channel. Similarly, users 2 and 3 must each obtain subcodes that they do not already have through the channel.
In contrast to the source coding used for Cases 1 and 2, here, we have generated source codes where the users share some subcodes. So, instead of using existing FDF-IS channel codes (designed for independent sources), we will design channel codes that take the common subcodes into account.
[Table 1: uplink message arrangement per column; the bottom row gives the total channel uses.]
After source coding, the users now send to the relay on the uplink. The common subcode known to all three users, i.e., , need not be transmitted. Similar to FDF-IS, we will design channel codes for the relay to decode functions of the transmitted messages. This can be realized using linear block codes of the following form:
where is the message vector, is the code generator matrix, is a random dither, and is the channel codeword. All elements are in , and is the multiplication in . Each element in and in is independently and uniformly chosen over , and is known to the relay.
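The defining property exploited here, that codewords generated with the same matrix are additive, can be verified numerically. The following sketch assumes a prime field so that the arithmetic is modulo-q, and uses illustrative dimensions; it confirms that the sum of two users' codewords is the codeword (under the summed dither) of the sum of their messages, which is what lets the relay decode the function rather than the individual messages.

```python
import numpy as np

# Sketch of the linear block code x = u.G + d over F_q. Assumptions: q prime
# (so the arithmetic is mod-q), and illustrative dimensions k, n.
q, k, n = 5, 3, 6
rng = np.random.default_rng(1)
G = rng.integers(0, q, size=(k, n))   # random generator matrix over F_q
d1 = rng.integers(0, q, size=n)       # user 1's dither
d2 = rng.integers(0, q, size=n)       # user 2's dither

def encode(u, G, d):
    """Linear block encoding with dither over F_q."""
    return (u @ G + d) % q

u1 = rng.integers(0, q, size=k)
u2 = rng.integers(0, q, size=k)
x1, x2 = encode(u1, G, d1), encode(u2, G, d2)

# Because both users use the SAME G, the (noiseless) sum of their codewords
# is itself a codeword of G for the sum of their messages: the relay can
# therefore decode u1 + u2 without decoding u1 or u2 individually.
assert np.array_equal((x1 + x2) % q, encode((u1 + u2) % q, G, (d1 + d2) % q))
```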
We now state the following lemma as a direct result of using linear block codes :
From , –, and noting we choose
We consider the following two cases, where the relay decodes different functions in each case:
When (chosen when )
Uplink: We split the message into three different disjoint parts and , and the message into and . The uplink message transmission is arranged as shown in Table 1.
The messages in each column are transmitted simultaneously using linear block codes, with the message length specified in the first row and the codeword length in the second-to-last row. From , we know that and , meaning that the message length for each column is non-negative. For each column, both messages use the same code generator matrix but different dithers. The relay decodes the finite-field addition of the messages in each column. Take the first column for example: and are transmitted by users 1 and 2, respectively. Note that the second codeword can also be transmitted by user 3, since it knows . Using Lemma ?, if is sufficiently large and if
where , then the relay can reliably decode . Using the same coding scheme for the other columns, we can show that if holds, then the relay can decode the summation of the messages in every column.
Downlink: Assume that the relay has successfully decoded the functions for all columns . Note that is a finite-field vector of length . Generate codewords of length , where each codeletter is independently generated according to the uniform distribution . Index the codewords by . After decoding , the relay transmits on the downlink. By reducing the decoding space of each user—since each user has some side information about —we can show the following (see, e.g., Ong and Johnson):
Note that random codes are used on the downlink instead of linear codes.
Knowing of length , user 1 can reliably decode if
From and knowing its own subcodes and , user 1 can then obtain and .
Knowing of length , user 2 can reliably decode if
From and knowing its own subcodes and , user 2 can then obtain and .
Similarly, we can show that user 3 can reliably decode if
It can then proceed to obtain .
Recovering Other Users’ Messages: We have shown that from , each user can obtain all other users’ subcodes. If – are satisfied and if is sufficiently large, each user can reliably decode the messages of the other users, i.e., and from the subcodes.
Achievability: We now combine the above results. For any , we can choose (where , which is possible because ) and a sufficiently large so that the lengths of the subcodes for source coding satisfy
This means – are satisfied. It follows that
for all . The length is chosen to satisfy .
Now, for any in , i.e., for some , we can choose a much smaller for such that , , , and are simultaneously satisfied. With a sufficiently large (which also implies a large ), each user can reliably decode the messages of the two other users. Hence, the source-channel rate in is achievable. This proves the achievability of Theorem ? for Case 3 when .
When (chosen when )
The coding scheme for this case is similar to . The uplink transmission is shown in Table 2.
[Table 2: uplink message arrangement per column; the bottom row gives the total channel uses.]
The messages in each column are transmitted simultaneously. Note that in the third row of messages in the table, is split into two parts if and only if . Otherwise, the entire message is transmitted in the first column, i.e., together with . Since , the message can always fit into the first and third columns, with the remaining “space” padded with zeros, denoted by . The message is transmitted in a similar way.
The relay decodes the finite-field addition of the messages in each column. If is sufficiently large and if is satisfied, then the relay can reliably decode its intended messages, i.e., . The relay broadcasts on the downlink. Using Lemma ?, we can show that if , , and are satisfied, then each user can reliably decode , from which it can recover the messages of the other two users.
This completes the proof for the achievability of Case 3 in Theorem ?.
6.1 Other Coding Schemes
Note that the lower bound in Lemma ? and the achievable source-channel rates in Lemma ? are applicable to the general finite-field MWRC with correlated sources, in addition to Cases 1 and 2 in Theorem ?. However, the coding technique in Section 5 is useful only for sources that have their common information equal their mutual information.
Besides the coding schemes considered in this paper, one could treat the uplink (as a multiple-access channel with correlated sources) and the downlink (as a broadcast channel with receiver side information) separately to get potentially different achievable source-channel rates. In this case, on the uplink, we let the relay decode all three users’ messages . An achievable channel rate region for the two-sender multiple-access channel with correlated sources was found by Cover et al. Then, on the downlink, we can use the result by Tuncel for the relay to transmit to the users, taking into account that each user has side information.
Extending the results of Cover et al. to three senders, for the relay to be able to reliably decode on the uplink, must satisfy the following:
For the case where each user has a non-zero message, for all . Comparing to , we see that this coding strategy is strictly suboptimal for all Cases 1–3 in Theorem ? when . However, this strategy may achieve better (i.e., lower) source-channel rates than SW/FDF-IS in general. We leave the derivation of the rates to the reader—it is straightforward given the results by Cover et al. and Tuncel.
In this paper, we have only investigated the separate-source-channel coding architecture without feedback. Though we have identified source/channel combinations where the minimum source-channel rate is found, the problem remains open in general. Two directions which one could explore are joint source-channel coding and feedback.
6.2 Optimality of Source-Channel Separation
We now give a numerical example where none of the coding schemes in this paper achieves the lower bound . Let the sources be , , and where all are independent random variables, and denotes modulo-two addition. We choose , , , and . For this choice, we have , , .
For the channel, let , which gives . We choose and for , giving . For each , we set , , and for , giving . This means, , and for all .
In this example, . If is achievable, then (from Lemma ?) we must be able to find some and satisfying , and . Since the conditions cannot be simultaneously met, is not achievable using SW/FDF-IS. As the sources’ common information does not equal their mutual information, we cannot use the coding scheme derived in Section 5.
This example shows that the separation schemes derived in this paper cannot achieve the lower bound in some cases. However, to show that separation is suboptimal, one has to explore all possible separation schemes and show that some joint source-channel scheme achieves a better source-channel rate.
6.3 Examples of Sources in Cases 1 and 3
In this paper, we have identified three classes of source/channel combinations where the minimum source-channel rate is found. The first class is sources that have almost-balanced conditional mutual information (ABCMI). An example of sources that have ABCMI is interchangeable random variables in the sense of Chernoff and Teicher, where “every subcollection of the random variables has a joint distribution which is a symmetric function of its arguments.” This can model sensor networks where the sensors are equally spaced on a circle to detect a phenomenon occurring at the center of the circle. However, the ABCMI conditions are looser than those of interchangeable random variables, as the former only require that the mutual information between any two sources has roughly the same value (see Appendix Section 8); also, the marginal distributions of the variables can be vastly different.
Another class for which we have derived the minimum source-channel rate is when the sources have their common information equal their mutual information. An example is correlated sources in the sense of Han, where the sources can be written as , , and , where and are mutually independent random variables. Using sensor networks as an example again, each node here has multiple sensing capabilities, e.g., temperature, light, sound. As these measurements display different behavior spatially, some remain constant across subsets of sensors, e.g., nodes 1 and 2 always sense the same temperature but different light intensity.
As for the class of sources with skewed conditional entropies, the conditions appear to be purely mathematical in nature.
The authors would like to thank Roy Timo for discussions on Gács and Körner’s common information and other helpful comments.
8 Graphical Interpretation of Sources that Have ABCMI and SCE
Figure ? shows the relationship among the entropies and mutual information for the three source messages , , and for the cases described above. Referring to Figure ?, the shaded areas represent the mutual information between any two source messages given the third source message. For ABCMI, none of the three shaded areas may be larger than the sum of the other two shaded areas. If the sources do not have ABCMI, then they must have unbalanced conditional mutual information, i.e., we can find a user where is larger than the sum of and by an amount (see Figure ?). In addition, for sources that have SCE, consider the two messages, and , whose mutual information conditioned on , i.e., , is larger than the sum of the other two pairs by the amount ; their entropy conditioned on , i.e., , is also greater than that of any other pair (conditioned on the message of the third user) by at least . The information diagram for SCE is depicted in Figure ?.
9 Proof of Lemma
We first quote two existing results of (i) channel coding for the three-user MWRC with independent messages and (ii) source coding with side information.
9.1 Functional-Decode-Forward for Independent Sources (FDF-IS) Channel Coding
The following channel-coding setting assumes that the source messages are independent:
9.2 Slepian-Wolf Source Coding
The following source-coding setting assumes that the channel is noiseless:
Note that Wyner et al. derived a similar result with an additional constraint on the relay. In their setup, the users present their indices to a relay; the relay in turn re-encodes and presents its index to the users.
9.3 Proof of Lemma
We use Slepian-Wolf source coding. Each user encodes its length- message to an index , satisfying and , for . Each user randomly generates a dither uniformly distributed in , and forms its encoded message . The dithers are made known to all nodes. Now, , , and are mutually independent, and each is uniformly distributed in .
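The role of the dither can be made concrete: whatever the distribution of a Slepian-Wolf index, adding an independent uniform dither over the same alphabet yields a uniformly distributed result, because the cyclic convolution of any pmf with the uniform pmf is uniform. The sketch below verifies this with an arbitrary pmf over an illustrative alphabet size (a placeholder, not the paper's notation).

```python
import numpy as np

# Why the random dither works: for ANY index distribution, adding an
# independent uniform dither modulo M gives a uniform result.
M = 8  # illustrative alphabet size
p_index = np.array([0.5, 0.2, 0.1, 0.1, 0.05, 0.03, 0.01, 0.01])  # arbitrary pmf

# pmf of (index + dither) mod M: cyclic convolution with the uniform pmf.
p_dithered = np.zeros(M)
for w in range(M):
    for d in range(M):
        p_dithered[(w + d) % M] += p_index[w] * (1.0 / M)

# Every output value collects total mass sum_w p_index[w] / M = 1 / M.
assert np.allclose(p_dithered, np.full(M, 1.0 / M))
```

This is exactly why the dithered indices , , and are mutually independent and uniform, which is the assumption the FDF-IS channel code needs.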
We then use FDF-IS channel coding for the users to exchange the encoded independent messages , , and via the relay in channel uses. From Lemma ?, if is satisfied, then each user can reliably recover and . Knowing the dithers, it can also recover and . From Lemma ?, if and are satisfied, then each user can reliably recover .
Noting that and defining , the conditions for achievability, i.e., , , and , can be expressed as .
10 Coding for Sources That Have Their Common Information Equal Their Mutual Information
We perform source coding for correlated sources . Consider the source message . Clearly, , since are deterministic functions of . Now since captures all information that and have in common (because ), we have . Similarly .
So, we can reliably reconstruct from if is sufficiently large, and if the following inequalities hold:
Here, we need to consider all possible non-empty subsets of