The Compound Multiple Access Channel with Partially Cooperating Encoders
The goal of this paper is to provide a rigorous information-theoretic analysis of subnetworks of interference networks. We prove two coding theorems for the compound multiple-access channel with an arbitrary number of channel states. The channel state information at the transmitters is such that each transmitter has a finite partition of the set of states and knows which element of the partition the actual state belongs to. The receiver may have arbitrary channel state information. The first coding theorem is for the case that both transmitters have a common message and that each has an additional private message. The second coding theorem is for the case where rate-constrained, but noiseless, transmitter cooperation is possible. This cooperation may be used to exchange information about channel state information as well as the messages to be transmitted. The cooperation protocol used here generalizes Willems’ conferencing. We show how this models base station cooperation in modern wireless cellular networks used for interference coordination and capacity enhancement. In particular, the coding theorem for the cooperative case shows how much cooperation is necessary in order to achieve maximal capacity in the network considered.
In modern cellular systems, interference is one of the main factors which limit the communication capacity. In order to further enhance performance, methods to better control interference have recently been investigated intensively. One of the principal techniques to achieve this is cooperation among neighboring base stations. This will be part of the forthcoming LTE-Advanced cellular standard. It is seen as a means of achieving the desired spectral efficiency of mobile networks. In addition, it may enhance the performance of cell-edge users, a very important performance metric of future wireless cellular systems. Finally, fairness issues are expected to be resolved more easily with base station cooperation.
In the standardization-oriented literature, the assumptions are generally very strict. The cooperation backbones, i.e. the wires linking the base stations, are assumed to have infinite capacity. Full channel state information (CSI) is assumed to be present at all cooperating base stations. Then, multiple-input multiple-output (MIMO) optimization techniques can be used for designing the system . However, while providing a useful theoretical benchmark, the results thus obtained are not accepted by the operators as reliably predicting the performance of actual networks.
In order to obtain a more realistic assessment of the performance of cellular networks with base station cooperation, the above assumptions need to be adapted to reality. First, it is well known that one cannot really assume perfect CSI in mobile communication networks. Second, glass fibers, or any other medium used for the backbones, never have infinite capacity. The assumption of finite cooperation capacity will also lead to a better understanding of the amount of cooperation necessary to achieve a certain performance. Conversely, we would like to know which capacity can be achieved with the backhaul found in heterogeneous networks using microwave, optical fibers and other media. Such insights would get lost when assuming infinite cooperation capacity.
The question arises how much cooperation is needed in order to achieve the same performance as would be achievable with infinite cooperation capacity. For general interference networks with multiple receivers, the analysis is very difficult. Thus it is natural to start by taking a closer look at component networks which together form a complete interference network. Such components are those subnetworks formed by the complete set of base stations, but with only one receiving mobile. Then there is no more interference, so one can concentrate on finding out by how much the capacity increases by limited base station cooperation. This result can be seen as a first step towards a complete rigorous analysis of general interference networks.
A situation which is closely related can be phrased in the cooperation setting as well. Usually, there is only one data stream intended for one receiver. Assume that a central node splits this data stream into two components. Each of these components is then forwarded to one of two base stations. Using the cooperation setting, one can address the question of how much overhead needs to be transmitted by the splitter along with the data component, i.e. how much information about the data component and the CSI intended for one base station needs to be known at the other base station in order to achieve a high, possibly maximal, data rate.
In , the cooperation of base stations in an uplink network is analyzed. A turbo-like decoding scheme is proposed, and different degrees of cooperation and different cooperation topologies are compared in numerical simulations. In , work has also been done on the practical level to analyze cooperative schemes: the implementation of a real-time distributed cooperative system for the downlink of the fourth-generation standard LTE-Advanced was presented. In that system, the CSI at the transmitters was imperfect, and the limited-capacity glass fibers between the transmitting base stations were used to exchange CSI and data. A feeder distributed the data among the transmitting base stations.
A question which is not addressed in this work but which will be considered in the future is what rates can be achieved if there are two networks as described above which belong to different providers and which hence do not jointly optimize their coding, to say nothing of active cooperation. In that case, uncontrolled interference heavily disturbs each network, and challenges different from those considered here need to be faced by the system designer.
The rigorous analysis of such cellular wireless systems as described above using information-theoretic methods should provide useful insights. The ultimate performance limits as well as the optimal cooperation protocols can be derived from such an analysis. The first information-theoretic approach to schemes with cooperating encoders goes back to Willems [20, 21], long before this issue was relevant for practical networks. As practical relevance was still lacking, the topic received little attention over the following two decades. Willems considers a protocol where, before transmission, the encoders of a discrete memoryless Multiple Access Channel (MAC) may exchange information about their messages via noiseless finite-capacity links (one in each direction). This may be done in a causal and iterative fashion, so the protocol is called a conferencing protocol.
For the reasons mentioned at the beginning, Willems’ conferencing protocol has attracted interest in recent years. Gaussian MACs using Willems conferencing between the encoders were analyzed in  and . Moreover, in these two works, it was shown that interference which is known non-causally at the encoders does not reduce capacity. For a compound MAC, both discrete and Gaussian, with two possible channel realizations and full CSI at the receiver, the capacity region was found in . In the same paper, the capacity region was found for the interference channel if only one transmitter can send information to the other (unidirectional cooperation) and if the channel is in the strong interference regime. Another variant of unidirectional cooperation was investigated in , where the three encoders of a Gaussian MAC can cooperate over a ring of unidirectional links. However, only lower and upper bounds were found for the maximum achievable equal rate.
Further literature exists for Willems conferencing on the decoding side of a multi-user network. For degraded discrete broadcast channels, the capacity region was found in  if the receivers can exchange information about the received codewords in a single conference step. For the general broadcast and multicast channels, achievability regions were determined. For the Gaussian relay channel, the dependence of the performance on the number of conferencing iterations between the receiver and the relay was investigated in . For the Gaussian -interference channel, outer and inner bounds to the capacity region where the decoders can exchange information about the channel outputs are provided in . Finally, for discrete and Gaussian memoryless interference channels with conferencing decoders and where the senders have a common message,  determines achievable regions. Exact capacity regions are determined if the channel is physically degraded. If the encoders can conference instead of having a common message, the situation is the same.
The discrete MAC with conferencing encoders is closely related to the discrete MAC with common message. Intuitively, the messages exchanged between the encoders in the cooperative setting form a common message, so the results known for the corresponding non-cooperative channel with common message can be applied to find the achievable rates of the cooperative setting. This transition was used in [20, 21, 3, 19], and . The capacity region of the MAC with common message was determined in , a simpler proof was found in .
The goal of this paper is to generalize the original setting considered by Willems even further. We treat a compound discrete memoryless MAC with an arbitrary number of channel realizations. The receiver’s CSI (CSIR) may be arbitrary between full and absent. The transmitters’ CSI (CSIT) may differ from CSIR and may be asymmetric at the two encoders. It is restricted to a finite number of instances, even though the number of actual channel realizations may be infinite. For this channel, we consider two cases. First, we characterize the capacity region of this channel where the transmitters have a common message. Then, we determine the capacity region of the channel where there is no common message anymore. Instead, the encoders have access to the output of a rate-constrained noiseless two-user MAC. Each input node of the noiseless MAC corresponds to one of the transmitters of the compound MAC. Each input to the noiseless MAC consists of the pair formed by the message which is to be transmitted and the CSIT present at the corresponding transmitter. This generalizes Willems’ conferencing to a non-causal conferencing protocol, where the conferencing capacities considered by Willems correspond to the rate constraints of the noiseless MAC in the generalized model. It turns out that this non-causal conferencing does not increase the capacity region, and as in [20, 21], every rate contained in the capacity region can be achieved using a one-shot Willems “conference”. We determine how large the conferencing capacities need to be in order to achieve the full-cooperation sum rate and the full-cooperation capacity region, respectively. The latter is particularly interesting because it shows that forming a “virtual MIMO system” as mentioned in Subsection I-A and considered in  does not require infinite cooperation capacity.
I-C Organization of the Paper
In Section II, we address the problems presented above. We present the two basic channel models underlying our analysis: the compound MAC with common message and partial CSI and the compound MAC with conferencing encoders and partial CSI. We also introduce the generalized conferencing protocol used in the analysis of the conferencing MAC. We state the main results concerning the capacity regions of the two models. We also derive the minimal amount of cooperation needed in the conferencing setting in order to achieve the optimal (i.e. full-cooperation) sum rate and the optimal, full-cooperation rate region. The achievability of the rate regions claimed in the main theorems is shown in Section III. The weak converses are shown in Section IV. Only the converse for the conferencing MAC is presented in detail, because the converse for the MAC with common message is similar to part of the converse for the MAC with conferencing encoders. We address the application of the MAC with conferencing encoders to the analysis of cellular systems where one data stream is split up and sent using different base stations in Section V. In the same section, in a simple numerical example, the capacity regions of a MAC with conferencing encoders is plotted for various amounts of cooperation. In the final section, we sum up the paper and discuss the directions of future research. In the Appendix several auxiliary lemmata concerning typical sequences are collected.
For real numbers and , we set and .
For any positive integer , we write for the set . The complement of a set in is denoted by . The function is the indicator function of , i.e. equals 1 if and 0 otherwise. For a set , we write . For a mapping , define to be the cardinality of the range of .
Denote the set of probability measures on a discrete set by . The -fold product of a is denoted by . By , we denote the set of stochastic matrices with rows indexed by and columns indexed by . The -fold memoryless extension of a is defined as
Let be a finite set. For , define the type of by . For and , define to be the set of those such that for all and such that if .
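For concreteness, the type and typicality definitions above can be sketched in a few lines of Python. The function names and the exact typicality test (componentwise deviation at most a tolerance, with exact support matching where the reference distribution vanishes) follow the standard definition recalled above; the names themselves are ours, not part of the formal development.

```python
from collections import Counter

def type_of(x, alphabet):
    """Empirical distribution (type) of the sequence x over the alphabet."""
    n = len(x)
    counts = Counter(x)
    return {a: counts[a] / n for a in alphabet}

def in_typical_set(x, p, alphabet, delta):
    """delta-typicality: |type(a) - p(a)| <= delta for every letter a,
    and type(a) = 0 whenever p(a) = 0."""
    t = type_of(x, alphabet)
    for a in alphabet:
        if abs(t[a] - p[a]) > delta:
            return False
        if p[a] == 0 and t[a] > 0:
            return False
    return True
```

For example, the sequence "aab" has type (2/3, 1/3) over {a, b} and is 0.2-typical, but not 0.1-typical, for the uniform distribution.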
II Channel Model and Main Results
II-A The Channel Model
Let be finite sets. A compound discrete memoryless MAC with input alphabets and and output alphabet is determined by a set of stochastic matrices . may be finite or infinite. Every corresponds to a different channel state, so we will also call the elements the states of the compound MAC . The transmitter using alphabet will be called transmitter (sender, encoder) 1 and the transmitter with alphabet will be called transmitter (sender, encoder) 2. If transmitter 1 sends a word and transmitter 2 sends a word , and if the channel state is , then the receiver will receive the word with probability
The compound channel model does not include a change of state in the middle of a transmission block.
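Because the state is fixed for the whole block and the channel is memoryless, the probability of a received word factorizes over the positions. A minimal sketch (our own notation: a state is represented as a dictionary mapping an input pair to a distribution over output symbols):

```python
def channel_prob(W, x1, x2, y):
    """Probability that the MAC in a fixed state W (dict: input pair
    (a, b) -> distribution over output symbols) emits the word y when
    the senders transmit the words x1 and x2. Memorylessness makes
    this a product over the positions of the block."""
    assert len(x1) == len(x2) == len(y)
    p = 1.0
    for a, b, c in zip(x1, x2, y):
        p *= W[(a, b)].get(c, 0.0)
    return p
```

For instance, for a binary state whose output equals the XOR of the inputs with probability 0.9 per symbol, a length-2 block received correctly has probability 0.81.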
The goal is to find codes that are “good” (in a sense to be specified later) universally for all those channel states which might be the actual one according to CSI. In our setting, CSI at sender is given by a finite CSIT partition
for . The sets are finite, and the satisfy
Before encoding, transmitter knows which element of the partition the actual channel state is contained in, i.e. if is the channel state, then it knows . With this knowledge, it can adjust its codebook to the channel conditions to some degree. For , we denote by
the set of channel states which is possible according to the combined channel knowledge of both transmitters. Note that every function from into a finite set induces a finite partition as in (1), so this is a very general concept of CSIT. At the receiver side, the knowledge about the channel state is given by a not necessarily finite CSIR partition
is an arbitrary set and the sets satisfy
If the channel state is , then the receiver knows . Thus it can adjust its decision rule to this partial channel knowledge. This concept includes any kind of deterministic CSIR, because any function from into an arbitrary set induces a partition as in (2). Note that if is infinite, the transmitters can never have full CSI, whereas this is possible for the receiver if .
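The observation that every function on the state set induces a partition, and that the transmitters' combined knowledge is the intersection of their respective cells, can be illustrated as follows (a sketch with names of our choosing):

```python
def partition_from_function(states, f):
    """CSI partition induced by a function f on the state set: two
    states lie in the same cell iff f maps them to the same value.
    Returned as a dict keyed by the CSI value."""
    cells = {}
    for s in states:
        cells.setdefault(f(s), set()).add(s)
    return cells

def combined_csit(cell1, cell2):
    """States compatible with the combined knowledge of both
    transmitters: the intersection of the cells containing the
    actual state."""
    return cell1 & cell2
```

For example, with twelve states and CSIT functions "state mod 2" and "state mod 3", the combined knowledge for a state in cell 1 (mod 2) and cell 0 (mod 3) narrows the state down to {3, 9}.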
The compound discrete memoryless MAC together with the CSIT partitions and the CSIR partition is denoted by the quadruple .
There are several communication situations which are appropriately described by a compound MAC. One case is where information is to be sent from two transmitting terminals to one receiving terminal through a fading channel. If the channel remains constant during one transmission block, one obtains a compound channel. Usually, CSIT is not perfect. It might be, however, that the transmitters have access to partial CSI, e.g. by using feedback. This will not determine an exact channel state, but only an approximation. Coding must then be done in such a way that it is good for all those channel realizations which are possible according to CSIT.
Another situation to be modeled by compound channels occurs if there are two transmitters each of which would like to send one message to several receivers at the same time. The channels to the different receivers differ from each other because all the terminals are at different locations. Now, the following meaning can be given to the above variants of channel knowledge. If CSIT is given as , this describes that the information is not intended for all receivers, but only for those contained in . Knowledge about the intended receivers may be asymmetric at the senders. If every receiver has its own decoding procedure, full CSIR (i.e. ) would be a natural assumption. If the receivers must all use the same decoder, there is no CSIR. Non-trivial CSIR could mean that independently of the decision at the transmitters where data are to be sent (modeled by CSIT), a subset of receivers is chosen as the set which the data are intended for without informing the transmitters about this decision.
II-B The MAC With Common Message
Let the channel be given. We now present the first of the problems treated in this paper, the capacity region of the compound MAC with common message. It is an interesting information-theoretic model in itself. However, its main interest, at least in this paper, is that it provides a basis for the solution of the problem presented in the next section, which is the capacity region of the compound MAC with conferencing encoders.
Assume that each transmitter has a set of private messages , , and that both transmitters have an additional set of common messages for the receiver (Fig. 1). Let be a positive integer.
A code is a triple of functions satisfying
is called the blocklength of the code.
Clearly, the codes are in one-to-one correspondence with the families
where , , and where the satisfy
(The sets are obtained from by setting
In the following, we will use the description of codes as families as in (3). The functional description of codes will be of use when we are dealing with transmitter cooperation. We say more on that in Remark 3.
The and are the codewords and the are the decoding sets of the code. Let the transmitters have the common message . Suppose that transmitter additionally has the private message and knows that . Then it uses the codeword . If transmitter additionally has the private message and knows that , it uses the codeword . Suppose that the receiver knows that . If the channel output is contained in , the receiver decides that the message triple has been sent.
For , a code is a code if
That means that for every instance of channel knowledge at the transmitters and at the receiver, the encoding/decoding chosen for this instance must yield a small average error for every channel state that may occur according to the CSI. In other words, the code chosen for a particular instance of CSI must be universally good for the class of channels .
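This worst-case criterion can be phrased as a simple check. In the sketch below, `avg_error` is a hypothetical map from channel states to the average error probability of the code chosen for the relevant CSI instance, and each CSI cell is the set of states compatible with one instance of channel knowledge:

```python
def is_lambda_code(avg_error, csi_cells, lam):
    """Worst-case criterion from the definition above: for every CSI
    instance (a cell of compatible states), the code chosen for that
    instance must keep the average error at or below lam for *every*
    state in the cell."""
    return all(max(avg_error[s] for s in cell) <= lam
               for cell in csi_cells)
```

The point is that a code is judged by its worst state within each cell, not by an average over states: no prior distribution on the states is assumed.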
The first goal in this paper is to characterize the capacity region of the compound MAC with common message. That means that we will characterize the set of achievable rate triples and prove a weak converse.
A rate triple is achievable for the compound channel with common message if for every and and for large enough, there is a code with
We denote the set of achievable rate triples by .
Before stating the theorem on the capacity region, we need to introduce some new notation. We set to be the set of families
of probability distributions, where is a distribution on a finite subset of the integers, and where for each . Every defines a family of probability measures on , where is the set corresponding to . This family consists of the probability measures (), where
and where is such that . Let the quadruple of random variables take values in with joint probability . Then, define the set to be the set of , where every and where
we are able to state the first main result.
For the compound MAC , one has
and there is a weak converse. More exactly, for every in and for every , there is a such that there exists a sequence of codes fulfilling
if is large, i.e. one has exponential decay of the error probability with increasing blocklength.
is convex. The cardinality of the auxiliary set can be restricted to be at most .
A weak converse states that if a code has rates which are further than from the capacity region and if its blocklength is large, then the average error of this code must be larger than a constant only depending on . A moment’s thought reveals that this is a stronger statement than just saying that the rates outside of the capacity region are not achievable.
is independent of the CSIR partition . That means that given a certain CSIT, the capacity region does not vary as CSIR varies. A heuristic explanation of this phenomenon is given in [22, Section 4.5] for the case of single-user compound channels. It builds on the fact that the receiver can estimate the channel from a pilot sequence with a length which is negligible compared to the blocklength.
Note that first taking a union and then an intersection of sets in the definition of is similar to the max-min capacity expression for the classical single-user discrete memoryless compound channel . We write two intersections instead of one in order to make clear the difference that remains between the two expressions. Recall that the are families of probability measures. Every choice activates a certain element of such a family . The union and the first intersection are thus related in a more complex manner than in the single-user expression.
As CSIT increases, the capacity region grows, and in principle, one can read off from this how the region scales with increasing channel knowledge at the transmitters. More precisely, assume that there are pairs and of CSIT partitions,
such that is finer than (). That means that for every there is a with , so one can assume that . Observe that the corresponding to , which we call only in this remark, can naturally be considered a subset of , which denotes the corresponding to only for this remark. Thus
and it follows that .
II-C The MAC with Conferencing Encoders
Again let the channel be given. Here we assume that each transmitter only has a set of private messages () for the receiver. Encoding is done in three stages. In the first stage, each encoder transmits its message and CSIT to a central node, a “switch”, over a noiseless rate-constrained discrete MAC. The rate constraints are part of the problem setting and thus fixed, but the noiseless MAC is not given, it is part of the code. For reasons that will become clear soon, we call it a “conferencing MAC”. In the second stage, the information gathered by the switch is passed on to each encoder over channels without incurring noise or loss. The codewords are chosen in the third stage. Each encoder chooses its codewords using three parameters: the message it wants to transmit, its CSIT, and the output of the conferencing MAC. This is illustrated in Fig. 2.
The conferencing MAC can be chosen freely within the constraints, so it can be seen as a part of the encoding process. Assume that the blocklength of the codes used for transmission is set to be . The rate constraints are such that is the maximal number of bits transmitter can communicate to the receiving node of the conferencing MAC. Thus if transmitter 1, say, has message and CSIT , then transmitter 2, who knows neither nor , can use at most additional bits from transmitter 1 to encode its own message. Consequently, there is a limited degree of cooperation between the encoders enhancing the reliability of transmission. As the constraints on the noiseless MAC are measured in terms of , one can interpret the communication over this channel as taking place during the transmission over of the codeword preceding the one constructed with the help of the conferencing MAC.
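The rate constraints are cardinality bounds: with the other sender's input held fixed, each sender can steer the output of the conferencing MAC through only a limited number of values. A hedged sanity-check sketch (the function name and the representation of the one-shot conferencing MAC as a Python function are our own):

```python
def respects_rate_constraints(conf, V1, V2, n, C1, C2):
    """Check cardinality-type rate constraints on a one-shot
    conferencing MAC `conf`: with the other input held fixed, sender i
    may influence the output through at most 2^(n*C_i) distinct
    values. V1, V2 are the senders' input sets, n the blocklength."""
    max1 = max(len({conf(v1, v2) for v1 in V1}) for v2 in V2)
    max2 = max(len({conf(v1, v2) for v2 in V2}) for v1 in V1)
    return max1 <= 2 ** (n * C1) and max2 <= 2 ** (n * C2)
```

For example, a conferencing MAC that forwards sender 1's input mod 2 and sender 2's input mod 4 respects the constraints (C1, C2) = (1, 2) at blocklength 1, but not (1, 1).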
Example 2 below shows how this kind of coding generalizes coding using Willems conferencing functions as defined in [20, 21]. From Theorem 2 below it follows that Willems conferencing is more than just a special case. In fact, it suffices to achieve the capacity region. In Section V-A, we give an application where it is useful to have the more general notion of conferencing which is used here.
We now come to the formal definitions. Recall that a noiseless MAC is nothing but a function from a Cartesian product to some other space.
A code is a quadruple of functions which satisfy
where is a finite set and where satisfies
for the functions and defined by . The number is called the blocklength of the code. is called a conferencing MAC or alternatively a generalized conferencing function. The latter name is justified by Example 2.
Analogous to the situation for the MAC with common message described in Remark 1, the code given by the quadruple uniquely determines a family
For the elements of this family, (not necessarily different!), (not necessarily different!), and the satisfy
For every , the family (7) must satisfy
Thus an alternative definition of codes would be families like the family (7) together with conferencing MACs as in (5) and (6). This is the form we will mostly use in the paper because of its shorter notation. However, the original Definition 5 is more constructive and gives more insight into the practical use of such codes. It will be used in the converse, where the way in which the codewords depend on the messages will be exploited.
Note that (5) and (6) really are rate constraints. Indeed, let be a rate triple achievable by the MAC defined by , where the average error criterion is used. (Even though the channel is noiseless, this does make a difference: Dueck showed in  that the maximal and the average error criteria differ for MACs, using the example of a noiseless channel!) Then by the characterization of the MAC with non-cooperating encoders without common message (cf. [4, Theorem 3.2.3]), there must be independent random variables on and on such that
But by the constraints (5) and (6), one knows that the right side of (10) must be smaller than and the right side of (11) must be smaller than . Clearly, the sum rate then must be smaller than . Moreover, as the bounds in (10)-(12) are achievable, it even follows for every admissible choice of and .
With the above definition, the coding scheme is obvious: if the message pair is to be transmitted and if the pair of CSIT instances is , then the senders use the codewords and , respectively. If CSIR is and if the channel output is contained in the decoding set , then the receiver decides that the message pair has been transmitted.
For , a code is a code if
In the following example, we prove our claim that using generalized conferencing in the encoding process generalizes Willems’ conferencing encoders. We fix the notation
Example 2 (Willems Conferencing Functions).
Let positive integers and be given which can be written as products
for some positive integer which does not depend on . Assume that
We first give a formal definition of a pair of Willems conferencing functions . Such a pair is determined in an iterative manner via sequences of functions and , where for and ,
For and , one recursively defines functions
The functions are then obtained by setting
Note that not every conferencing MAC with output alphabet can be obtained through Willems conferencing. The most trivial example to see this is where is prime and where the conferencing function mapping into depends on . However, this setting can be given an interpretation in terms of MACs. Every pair of Willems’ conferencing functions is nothing but the -fold use of a non-stationary noiseless MAC with feedback. The above description of a transmission block of length over such a “Willems channel” as the one-shot use of a noiseless MAC as above is possible because noise plays no role here.
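The iterative construction of Willems conferencing functions can be simulated in one shot, as the example above indicates. In the following sketch (our own names; the per-round functions stand in for the maps of the formal definition), in round k each sender applies its round-k function to its own input and to everything heard from the other sender in rounds 1 through k-1, and both messages of a round are exchanged simultaneously:

```python
def willems_conference(v1, v2, h1_rounds, h2_rounds):
    """One-shot simulation of a K-round Willems conference. h1_rounds
    and h2_rounds are lists of K per-round functions, each taking a
    sender's own input and the tuple of messages heard so far.
    Returns the two conferencing outputs: everything sender 2 heard
    from sender 1, and vice versa."""
    heard_by_1, heard_by_2 = [], []
    for h1, h2 in zip(h1_rounds, h2_rounds):
        m1 = h1(v1, tuple(heard_by_1))  # sender 1 -> sender 2
        m2 = h2(v2, tuple(heard_by_2))  # sender 2 -> sender 1
        heard_by_2.append(m1)
        heard_by_1.append(m2)
    return tuple(heard_by_2), tuple(heard_by_1)
```

This makes the causality of the protocol explicit: a round-k message may depend on earlier rounds only, which is exactly why a Willems conference is a special case of the one-shot generalized conferencing MAC, while the converse inclusion fails.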
For achievability and weak converse, we adapt the definitions from II-B to the conferencing setting. Let be nonnegative real numbers at least one of which is strictly greater than 0.
A rate pair is achievable for the compound channel with conferencing encoders with conferencing capacities if for every and and for large enough, there is a code with
We denote the set of achievable rate pairs by .
To state the result, we need to define the sets . We denote by the set of families
of probability distributions, where is a distribution on a finite subset of the integers and where for every (cf. the definition of in Subsection II-B). Every defines a family of probability measures () on , where is the set corresponding to . This family consists of the probability measures () defined by
where is such that . Finally we define subsets and of . consists of those where the do not depend on and consists of those where the do not depend on .
For , let be a quadruple of random variables which is distributed according to . The set is defined as the set of those pairs of non-negative reals which satisfy
If , define the set
If (the reverse case is analogous with replacing ), define the set
For the channel and the pair of nonnegative real numbers, one has
This set can already be achieved using one-shot Willems conferencing functions, i.e. functions as defined in Example 2 with . More exactly, for every and for every , there is a such that there exists a sequence of codes