A New Achievability Scheme for the Relay Channel
(This work was supported by NSF Grants CCR , CCF  and CCF , and was presented in part at the IEEE Information Theory Workshop, Lake Tahoe, CA, September 2007.)
Abstract
In this paper, we propose a new coding scheme for the general relay channel. This coding scheme is in the form of a block Markov code. The transmitter uses a superposition Markov code. The relay compresses the received signal and maps the compressed version of the received signal into a codeword conditioned on the codeword of the previous block. The receiver performs joint decoding after it has received all of the blocks. We show that this coding scheme can be viewed as a generalization of the well-known Compress-And-Forward (CAF) scheme proposed by Cover and El Gamal. Our coding scheme provides options for preserving the correlation between the channel inputs of the transmitter and the relay, which is not possible in the CAF scheme. Thus, our proposed scheme may potentially yield a larger achievable rate than the CAF scheme.
1 Introduction
As the simplest model for cooperative communications, the relay channel has attracted considerable attention since 1971, when it was first introduced by van der Meulen [1]. In 1979, Cover and El Gamal proposed two major coding schemes for the relay channel [2]. These two schemes are widely known today as Decode-And-Forward (DAF) and Compress-And-Forward (CAF); see [3] for a recent review. They represent two different types of cooperation. In DAF, the cooperation is relatively obvious: the relay decodes the message from the transmitter, and the transmitter and the relay cooperatively transmit the constructed common information to the receiver in the next block. In CAF, the cooperation is harder to recognize, as the message is sent by the transmitter only once. However, the relay cooperates with the transmitter by compressing its received signal and sending it to the receiver. The rate gains in these achievable schemes are due to the fact that the channel from the transmitter to the relay creates correlation between the transmitter and the relay, and this correlation is utilized to improve the rates.
In the DAF scheme, correlation is created and then utilized in a block Markov coding structure. More specifically, full correlation is created by decoding the message fully at the relay, which enables the transmitter and the relay to create any kind of joint distribution for the channel inputs in the next block. The shortcoming of the DAF scheme is that, by forcing the relay to decode the message in its entirety, it limits the overall achievable rate to the rate from the transmitter to the relay. In contrast, by not forcing full decoding at the relay, the CAF scheme does not limit the overall rate to the rate from the transmitter to the relay, and may yield higher overall rates. The shortcoming of the CAF scheme, on the other hand, is that the correlation offered by the block coding structure is not utilized effectively, since in each block the channel inputs from the transmitter and the relay are independent, as the transmitter sends the message only once.
However, the essence of good coding schemes in multiuser systems with correlated sources (e.g., [4, 5]) is to preserve the correlation of the sources in the channel inputs. Motivated by this basic observation, in this paper, we propose a new coding scheme for the relay channel, that is based on the idea of preserving the correlation in the channel inputs from the transmitter and the relay. We will show that our new coding scheme may be viewed as a more general version of the CAF scheme, and therefore, our new coding scheme may potentially yield larger rates than the CAF scheme. Our proposed scheme can be further combined with the DAF scheme to yield rates that are potentially larger than those offered by both DAF and CAF schemes, similar in spirit to [2, Theorem 7].
Our new achievability scheme for the relay channel may be viewed as a variation of the coding scheme of Ahlswede and Han [5] for the multiple access channel with a correlated helper. In our work, we view the relay as the helper because the receiver does not need to decode the information sent by the relay. Also, we note that the relay is a correlated helper, as the communication channel from the transmitter to the relay provides the relay, for free, with a correlated version of the signal sent by the transmitter. The key aspects of the Ahlswede-Han [5] scheme are: to preserve the correlation between the channel inputs of the transmitter and the helper (relay), and for the receiver to decode a “virtual” source, a compressed version of the helper, but not the entire signal of the helper.
Our new coding scheme is in the form of block Markov coding. The transmitter uses a superposition Markov code, similar to the one used in the DAF scheme [2], except that in the random codebook generation stage, a method similar to the one in [4] is used in order to preserve the correlation between the blocks. Thus, in each block, the fresh information message is mapped into a codeword conditioned on the codeword of the previous block. Therefore, the overall codebook at the transmitter has a tree structure, where the codewords in each block emanate from the codewords in the previous block. The depth of the tree equals the number of blocks. A similar strategy is applied at the relay side, where the compressed version of the received signal is mapped into a two-block-long codeword conditioned on the codeword of the previous block. Therefore, the overall codebook at the relay has a tree structure as well. As a result of this coding strategy, we successfully preserve the correlation between the channel inputs of the transmitter and the relay. However, unlike the DAF scheme, where full correlation is acquired through decoding at the relay, our scheme provides only a partially correlated helper at the relay by not trying to decode the transmitter’s signal fully. From [4, 5], we note that the channel inputs are correlated through the virtual sources in our case, and therefore, the channel inputs of consecutive blocks are correlated. This correlation between the blocks will hurt the achievable rate. It is the price we pay for preserving the correlation between the channel inputs of the transmitter and the relay within any given block.
At the decoding stage, we perform joint decoding over all of the blocks after all of them have been received, which differs from the DAF and CAF schemes. The reason for performing joint decoding at the receiver is that, due to the correlation between the blocks, decoding at any time before the end of all the blocks would decrease the achievable rate. We note that joint decoding increases the decoding complexity and the delay as compared to DAF and CAF, though neither of these is a major concern in an information-theoretic context. The only problem with the joint decoding strategy is that it makes the analysis difficult, as it requires the evaluation of some mutual information expressions involving the joint probability distributions of codes spanning up to all of the transmitted blocks, whose number is very large.
The analysis of the error events provides us with three conditions containing mutual information expressions involving infinitely many letters of the underlying random process. Evaluation of these mutual information expressions is very difficult, if not impossible. To obtain a computable result, we lower bound these mutual informations by noting some Markov structure in the underlying random process. This operation gives us three conditions to be satisfied by the achievable rates. These conditions involve eleven variables: the two channel inputs from the transmitter and the relay, the two channel outputs at the relay and the receiver, and the compressed version of the channel output at the relay, in two consecutive blocks, and the channel input from the transmitter in the previous block.
We finish our analysis by revisiting the CAF scheme. We develop an equivalent representation for the achievable rates given in [2] for the CAF scheme. We then show that this equivalent representation is a special case of the achievable rates in our new coding scheme, obtained by a special selection of the eleven variables mentioned above. We therefore conclude that our proposed coding scheme yields potentially larger rates than the CAF scheme. More importantly, our new coding scheme creates more possibilities, and therefore a spectrum of new achievable schemes for the relay channel, through the selection of the underlying probability distribution, and yields the well-known CAF scheme as a special case, corresponding to a particular selection of the underlying probability distribution.
2 The Relay Channel
Consider a relay channel with finite input alphabets , and finite output alphabets , , characterized by the transition probability . An length block code for the relay channel consists of encoders , and a decoder
where the encoder at the transmitter sends into the channel, where ; the encoder at the relay at the th channel instance sends into the channel; the decoder outputs . The average probability of error is defined as
(1) 
A rate is achievable for the relay channel if for every , , and every sufficiently large , there exists an length block code with and .
3 A New Achievability Scheme for the Relay Channel
We adopt a block Markov coding scheme, similar to the DAF and CAF schemes. We have overall blocks. In each block, we transmit codewords of length . We denote the variables in the th block with a subscript of . We denote letter codewords transmitted in each block with a superscript of . Following the standard relay channel literature, we denote the (random) signals transmitted by the transmitter and the relay by and , the signals received at the receiver and the relay by and , and the compressed version of at the relay by . The realizations of these random signals will be denoted by lowercase letters. For example, the letter signals transmitted by the transmitter and the relay in the th block will be represented by and .
Consider the following discrete time stationary Markov process for , with the transition probability distribution
(2) 
The codebook generation and the encoding scheme for the th block, , are as follows.
Random codebook generation: Let denote the transmitted and the received signals in the previous block, where is the message sent by the transmitter in the previous block. An illustration of the codebook structure is shown in Figure 1.

For each sequence, generate sequences, where , the th sequence, is generated independently according to . Here, every codeword in the previous block expands into a codebook in the current block. This expansion is indicated by a directed cone from to in Figure 1.

For each sequence, generate sequences independently uniformly distributed in the conditional strong typical set with respect to the distribution . If , for any given sequence, there exists one sequence with high probability when is sufficiently large such that are jointly typical according to the probability distribution . Denote this as . Here, the quantization from to , parameterized by , is indicated in Figure 1 by a directed cone from to , with a straight line from for the parameterization. (Footnote 1: Strong typical set and conditional strong typical set are defined in [6, Definition 1.2.8, 1.2.9]. For the sake of simplicity, we omit the subscript which is used to indicate the underlying distribution in [6].)

For each , generate one sequence according to . This one-to-one mapping is indicated by a straight line between and in Figure 1.
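The tree structure of the transmitter codebook described above can be illustrated with a small simulation. The following Python sketch covers only the transmitter side and is a toy under stated assumptions: the alphabet is binary, the conditional distribution is a letter-wise bit flip with probability `flip_prob`, and the names `expand_codeword` and `build_tree_codebook` are hypothetical. It is not the construction analyzed in the paper, only a picture of how each codeword of one block expands into a codebook for the next block.

```python
import random


def expand_codeword(prev_codeword, num_messages, flip_prob, rng):
    """Draw num_messages codewords conditioned on prev_codeword.

    Toy conditional distribution (an assumption): each binary letter is the
    corresponding letter of prev_codeword, flipped with probability flip_prob.
    """
    return [
        [x ^ int(rng.random() < flip_prob) for x in prev_codeword]
        for _ in range(num_messages)
    ]


def build_tree_codebook(root, depth, num_messages, flip_prob, seed=0):
    """Map each message path (w_1, ..., w_i) to the codeword sent in block i."""
    rng = random.Random(seed)
    # The empty path stands for the known codeword of the virtual initial block.
    codebook = {(): list(root)}
    frontier = [()]
    for _ in range(depth):
        next_frontier = []
        for path in frontier:
            children = expand_codeword(codebook[path], num_messages, flip_prob, rng)
            for message, codeword in enumerate(children):
                codebook[path + (message,)] = codeword
                next_frontier.append(path + (message,))
        frontier = next_frontier
    return codebook


codebook = build_tree_codebook(root=[0] * 8, depth=3, num_messages=2, flip_prob=0.1)
# Codeword transmitted in block 3 when the messages so far are 1, 0, 1:
block3_codeword = codebook[(1, 0, 1)]
```

With two messages per block and three blocks, the tree has 1 + 2 + 4 + 8 nodes. A small `flip_prob` keeps consecutive codewords strongly correlated, which mirrors the correlation between blocks that the scheme deliberately introduces.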
Encoding: Let be the message to be sent in this block. If are sent and is received in the previous block, we choose according to the code generation method described above and transmit . In the first block, we assume a virtual th block, where , as well as , are known by the transmitter, the relay and the receiver. In the th block, the transmitter randomly generates one sequence according to and sends it into the channel. The relay, after receiving , randomly generates one sequence according to . We assume that the transmitter and the relay reliably transmit and to the receiver using the next blocks, where is some finite positive integer. We note that blocks are used in our scheme, while only the first blocks carry the message. Thus, the final achievable rate is which converges to for sufficiently large since is finite.
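The rate loss caused by the extra blocks can be checked numerically. The sketch below assumes that the final rate is the per-block rate scaled by the fraction of blocks that carry the message; the function name and parameter names are hypothetical and do not come from the paper.

```python
def effective_rate(per_block_rate, message_blocks, overhead_blocks):
    """Overall rate when only message_blocks of the transmitted blocks carry data.

    Assumes every block has the same length, so the overall rate is the
    per-block rate times message_blocks / (message_blocks + overhead_blocks).
    """
    if message_blocks <= 0 or overhead_blocks < 0:
        raise ValueError("need message_blocks > 0 and overhead_blocks >= 0")
    return per_block_rate * message_blocks / (message_blocks + overhead_blocks)


# With a fixed, finite overhead the loss vanishes as the number of blocks grows.
small = effective_rate(1.0, message_blocks=10, overhead_blocks=2)
large = effective_rate(1.0, message_blocks=10**6, overhead_blocks=2)
```

Here `small` is noticeably below the per-block rate, while `large` is within a few parts per million of it, which is the convergence claimed in the text.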
Decoding: After receiving blocks of sequences, i.e., , and assuming , and are known at the receiver, we seek , , such that
according to the stationary distribution of the Markov process in (2).
The differences between our scheme and the CAF scheme are as follows. At the transmitter side, in our scheme, the fresh message is mapped into the codeword conditioned on the codeword of the previous block , while in the CAF scheme, is mapped into , which is generated independent of . At the relay side, in our scheme, the compressed received signal is mapped into the codeword , which is generated according to , while in the CAF scheme, is generated independent of . The aim of our design is to preserve, in the channel inputs of the current block, the correlation built in the previous block. At the decoding stage, we perform joint decoding over all of the blocks after all of them have been received, while in the CAF scheme, the message of the previous block is decoded at the end of the current block.
Probability of error: When is sufficiently large, the probability of error can be made arbitrarily small when the following conditions are satisfied.

For all such that ,
(3) 
For all such that ,
(4) 
For all such that ,
(5)
where the subscript on the left hand sides of (3), (4) and (5) indicates that the corresponding random variables belong to a generic sample of the underlying random process in (2). The details of the calculation of the probability of error, from which these conditions are obtained, can be found in Appendix A.1. The derivation uses standard techniques from information theory, such as counting error events.
In the above conditions, we used the notation as a shorthand to denote the sequence of random variables . Consequently, we note that the mutual informations on the right hand sides of (3), (4) and (5) contain vectors of random variables whose lengths go up to , where is very large. In order to simplify the conditions in (3), (4) and (5), we lower bound the mutual information expressions on the right hand sides of (3), (4) and (5) by those that involve random variables that belong to up to three blocks. The detailed derivation of the following lower bounding operation can be found in Appendix A.2. The derivation uses standard techniques from information theory, such as the chain rule of mutual information, and exploiting the Markov structure of the involved random variables.

For all such that ,
(6) 
For all such that ,
(7) 
For all such that ,
(8)
We can further derive sufficient conditions for the above three conditions in (6), (7) and (8) as follows. We define the following quantities:
(9)  
(10)  
(11)  
(12)  
(13)  
(14) 
Then, the sufficient conditions in (6), (7) and (8) can also be written as,

For all such that ,
(15) 
For all such that ,
(16) 
For all such that ,
(17)
We note that the above conditions are implied by the following three conditions,
(18)  
(19)  
(20) 
or in other words, by,
(21)  
(22)  
(23) 
The expressions in (21), (22) and (23) give sufficient conditions to be satisfied by the rate in order for the probability of error to become arbitrarily close to zero. We note that these conditions depend on variables used in three consecutive blocks, , and . With this development, we obtain the main result of our paper which is stated in the following theorem.
Theorem 1
The rate is achievable for the relay channel, if the following conditions are satisfied
(24)  
(25)  
(26) 
where
(27)  
(28)  
(29) 
In the above theorem, the notations and are used to denote the signals belonging to the previous block and the block before the previous block, respectively, with respect to a reference block. Therefore, we see that the achievable rate in the relay channel, using our proposed coding scheme, needs to satisfy three conditions that involve mutual information expressions calculated using eleven variables which satisfy the Markov chain constraint in (27), the marginal distribution constraint in (28), and the additional inter-block probability distribution constraint in (29).
In the next section, we will revisit the well-known CAF scheme proposed in [2]. First, we will develop an equivalent representation for the well-known representation of the achievable rate in the CAF scheme. We will then show that the rates achievable by the CAF scheme can be achieved with our proposed scheme by choosing a certain special structure for the joint probability distribution of the eleven random variables in Theorem 1 while still satisfying the three conditions in (27), (28) and (29).
4 Revisiting the Compress-And-Forward (CAF) Scheme
In [2], the achievable rates for the CAF are characterized as in the following theorem.
Theorem 2 ([2])
The rate is achievable for the relay channel, if the following conditions are satisfied
(30)  
(31) 
where
(32) 
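As a numerical illustration, the following Python sketch evaluates compress-and-forward conditions on a toy channel. It assumes that Theorem 2 takes the standard form reported by Cover and El Gamal [2, Theorem 6]: the rate is bounded by I(X1; Y, Yhat1 | X2), subject to I(X2; Y) >= I(Y1; Yhat1 | X2, Y), with independent inputs X1 and X2. The symbol names, the coordinate layout, the helper functions, and the noiseless toy channel are all assumptions made for illustration.

```python
from collections import defaultdict
from math import log2


def marginal(pmf, idx):
    """Marginalize a joint pmf {outcome tuple: prob} onto coordinates idx."""
    out = defaultdict(float)
    for outcome, prob in pmf.items():
        out[tuple(outcome[i] for i in idx)] += prob
    return out


def cond_mutual_info(pmf, a, b, c=()):
    """Return I(A; B | C) in bits; a, b, c are tuples of coordinate indices."""
    p_abc = marginal(pmf, a + b + c)
    p_ac, p_bc, p_c = marginal(pmf, a + c), marginal(pmf, b + c), marginal(pmf, c)
    total = 0.0
    for key, prob in p_abc.items():
        if prob <= 0.0:
            continue
        ka, kb, kc = key[:len(a)], key[len(a):len(a) + len(b)], key[len(a) + len(b):]
        total += prob * log2(prob * p_c[kc] / (p_ac[ka + kc] * p_bc[kb + kc]))
    return total


# Coordinates (assumed layout): 0 = X1, 1 = X2, 2 = Y, 3 = Y1, 4 = Yhat1.
X1, X2, Y, Y1, YHAT1 = 0, 1, 2, 3, 4

# Toy channel: independent uniform binary inputs, Y1 = X1, Y = X2, Yhat1 = Y1.
pmf = {(x1, x2, x2, x1, x1): 0.25 for x1 in (0, 1) for x2 in (0, 1)}

rate_bound = cond_mutual_info(pmf, (X1,), (Y, YHAT1), (X2,))
relay_to_receiver = cond_mutual_info(pmf, (X2,), (Y,))
compression_cost = cond_mutual_info(pmf, (Y1,), (YHAT1,), (X2, Y))
constraint_holds = relay_to_receiver >= compression_cost - 1e-12
```

In this toy channel the receiver hears only the relay, the relay hears the transmitter perfectly, and the relay forwards its observation uncompressed. All three quantities equal one bit, so the constraint is met with equality and the full bit per channel use is supported.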
In the following theorem, we present three equivalent forms for the rate achievable by the CAF scheme.
Theorem 3
The following three conditions are equivalent.

For some
(33) (34) 
For some
(35) (36) 
For some
(37) (38) (39)
The proof of the above theorem is given in Appendix A.3.
5 Comparison of the Achievable Rates with Our Scheme and with the CAF Scheme
We note that the conditions on the achievable rates with our scheme given in Theorem 1, i.e., (24), (25), (26), are very similar to the final equivalent form for the conditions on the achievable rates with the CAF scheme, i.e., (40), (41), (42), except for two differences. First, the channel inputs of the transmitter and the relay, i.e., and , in our proposed scheme can be correlated, while in the CAF scheme they are independent. Second, in our scheme there are some extra random variables on which the mutual information expressions are conditioned, e.g., . These two differences come from our coding scheme, where we introduced correlation between the channel inputs of the transmitter and the relay in a block, and between the variables across the blocks. The correlation between the channel inputs from the transmitter and the relay in any block is an advantage, as for channels which favor correlation, this translates into higher rates. However, the correlation across the blocks is a disadvantage, as it decreases the efficiency of transmission, and therefore the achievable rates. In fact, the price we pay for the correlation between the channel inputs in any given block is precisely the correlation we have created across the blocks. For a given correlation structure, it is not clear which of these two opposing effects will dominate. That is, the rate of our scheme for a certain correlated distribution may be lower or higher than the rate of the CAF scheme. However, we note that the CAF scheme can be viewed as a special case of our proposed scheme by choosing an independent distribution, i.e., by choosing the following conditional distribution in (29)
(43) 
In this case, the expressions in Theorem 1, i.e., (24), (25), (26), degenerate into the third equivalent form for the CAF scheme in Theorem 3, i.e., (40), (41), (42). The above observation implies that the maximum achievable rate with our proposed scheme over all possible distributions is not less than the achievable rate of the CAF scheme. Thus, we can claim that this paper offers more choices in the achievability scheme than the CAF scheme, and that these choices may potentially yield larger achievable rates than those offered by the CAF scheme.
Appendix A
A.1 Probability of Error Calculation
The average probability of decoding error can be expressed as follows,
(44) 
where
(45)  
(46) 
where is another codeword that is generated according to the rules of our scheme.
From (2), we note the following Markov properties:

conditioned on , is independent of and ;

conditioned on , is independent of .
Here, and in the sequel, subscript refers to a generic block within overall blocks.
can be upper bounded as follows:
(47) 
From the way the code is generated, we have
(48) 
The compression from to is a conditional version of a rate-distortion code. If , then, when is sufficiently large, we have