
# On the Capacity of the Noncausal Relay Channel

Lele Wang and Mohammad Naghshvar. This paper was presented in part at the IEEE International Symposium on Information Theory, 2011. L. Wang is jointly with the Department of Electrical Engineering, Stanford University, Stanford, CA, USA, and the Department of Electrical Engineering - Systems, Tel Aviv University, Tel Aviv, Israel (e-mail: wanglele@stanford.edu). M. Naghshvar was with the Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA 92093 USA. He is now with Qualcomm Technologies Inc., San Diego, CA 92121 USA (e-mail: mnaghshv@qti.qualcomm.com).
###### Abstract

This paper studies the noncausal relay channel, also known as the relay channel with unlimited lookahead, introduced by El Gamal, Hassanpour, and Mammen. Unlike the standard relay channel model, where the relay encodes its signal based on the previous received output symbols, the relay in the noncausal relay channel encodes its signal as a function of the entire received sequence. In the existing coding schemes, the relay uses this noncausal information solely to recover the transmitted message or part of it and then cooperates with the sender to communicate this message to the receiver. However, it is shown in this paper that by applying the Gelfand–Pinsker coding scheme, the relay can take further advantage of the noncausally available information and achieve rates strictly higher than those of the existing coding schemes. This paper also provides a new upper bound on the capacity of the noncausal relay channel that strictly improves upon the existing cutset bound. These new lower and upper bounds on the capacity coincide for the class of degraded noncausal relay channels and establish the capacity for this class.

Index Terms: Relay channel, Gelfand–Pinsker coding, decode–forward relaying, compress–forward relaying, cutset bound.

## I Introduction

The relay channel was first introduced by van der Meulen [1]. In their classic paper [2], Cover and El Gamal established the cutset upper bound and the decode–forward, partial decode–forward, compress–forward, and combined lower bounds for the relay channel. Furthermore, they established the capacity for the classes of degraded and reversely degraded relay channels, and relay channels with feedback.

The relay channel with lookahead was introduced by El Gamal, Hassanpour, and Mammen [3], who mainly studied the following two classes:

• Causal relay channel (also known as relay-without-delay) in which the relay has access only to the past and present received sequence. This model is usually considered when the delay from the sender-receiver link is sufficiently longer than the delay from the sender-relay link so that the relay can depend on the “present” in addition to the past received sequence. A lower bound for the capacity of this channel was established by combining partial decode–forward and instantaneous relaying coding schemes. The cutset upper bound for the causal relay channel was also established.

• Noncausal relay channel (also known as relay-with-unlimited-lookahead) in which the relay knows its entire received sequence in advance and hence the relaying functions can depend on the whole received block. This model provides a limit on the extent to which relaying can help communication. Lower bounds on the capacity were established by extending the (partial) decode–forward coding scheme to the noncausal case. The cutset upper bound for the noncausal relay channel was also established.

The focus of this paper is on the noncausal relay channel. The existing lower bounds on the capacity of this channel are derived using the (partial) decode–forward coding scheme. In particular, the relay recovers the transmitted message from the received sequence (available noncausally at the relay) and then cooperates with the sender to coherently transmit this message to the receiver. Therefore, the noncausally available information is used solely to recover the transmitted message at the relay. However, it is shown in this paper that the relay can take further advantage of the received sequence by considering it as noncausal side information to help the relay’s communication to the receiver. Based on this observation, we establish in this paper several improved lower bounds on the capacity of the noncausal relay channel by combining the Gelfand–Pinsker coding scheme [4] with (partial) decode–forward and compress–forward at the relay. Moreover, we establish a new upper bound on the capacity that improves upon the cutset bound [5, Theorem 16.6]. The new upper bound is shown to be tight for the class of degraded noncausal relay channels and is achieved by the Gelfand–Pinsker decode–forward coding scheme.

The rest of the paper is organized as follows. In Section II, we formulate the problem and provide a brief overview of the existing literature. In Section III, we establish three improved lower bounds, the Gelfand–Pinsker decode–forward (GP-DF) lower bound, the Gelfand–Pinsker compress–forward lower bound, and the Gelfand–Pinsker partial decode–forward compress–forward lower bound. We show through Example 1 that the GP-DF lower bound can be strictly tighter than the existing lower bound. In Section IV, we establish a new upper bound on the capacity, which is shown through Example 2 to strictly improve upon the cutset bound. This improved upper bound together with the GP-DF lower bound establishes the capacity for the class of degraded noncausal relay channels.

Throughout the paper, we follow the notation in [5]. In particular, a random variable is denoted by an upper case letter (e.g., $X$, $Y$) and its realization is denoted by a lower case letter (e.g., $x$, $y$). By convention, $X = \emptyset$ means that $X$ is a degenerate random variable (unspecified constant) regardless of its support. Let $X^n = (X_1, X_2, \ldots, X_n)$ denote a sequence of length $n$. We say that $X \to Y \to Z$ form a Markov chain if $p(x,y,z) = p(x)p(y|x)p(z|y)$. For $a > 0$, $[a] = \{1, 2, \ldots, \lceil a \rceil\}$, where $\lceil a \rceil$ is the smallest integer greater than or equal to $a$. For any set $\mathcal{A}$, $|\mathcal{A}|$ denotes its cardinality. The probability of an event $\mathcal{E}$ is denoted by $P(\mathcal{E})$.

## II Problem Formulation and Known Results

### II-A Noncausal Relay Channel

Consider the 3-node point-to-point communication system with a relay depicted in Figure 1. The sender (node 1) wishes to communicate a message $M$ to the receiver (node 3) with the help of the relay (node 2). The discrete memoryless (DM) relay channel with lookahead is described as

$$(\mathcal{X}_1 \times \mathcal{X}_2,\; p(y_2|x_1)\,p(y_3|x_1,x_2,y_2),\; \mathcal{Y}_2 \times \mathcal{Y}_3,\; d), \qquad (1)$$

where the parameter $d$ specifies the amount of lookahead. The channel is memoryless in the sense that $p(y_2^n \,|\, x_1^n) = \prod_{i=1}^n p_{Y_2|X_1}(y_{2i}|x_{1i})$ and $p(y_3^n \,|\, x_1^n, x_2^n, y_2^n) = \prod_{i=1}^n p_{Y_3|X_1,X_2,Y_2}(y_{3i}|x_{1i},x_{2i},y_{2i})$.

A $(2^{nR}, n)$ code for the relay channel with lookahead $d$ consists of

• a message set $[2^{nR}]$,

• an encoder that assigns a codeword $x_1^n(m)$ to each message $m \in [2^{nR}]$,

• a relay encoder that assigns a symbol $x_{2i}(y_2^{i+d}) \in \mathcal{X}_2$ to each sequence $y_2^{i+d}$ for $i \in [n]$, where the symbols $y_{2j}$ that have nonpositive time indices ($j \le 0$) or time indices greater than $n$ ($j > n$) are arbitrary, and

• a decoder that assigns an estimate $\hat m \in [2^{nR}]$ or an error message $\mathrm{e}$ to each received sequence $y_3^n$.

We assume that the message $M$ is uniformly distributed over $[2^{nR}]$. The average probability of error is defined as $P_e^{(n)} = P\{\hat M \neq M\}$. A rate $R$ is said to be achievable for the DM relay channel with lookahead $d$ if there exists a sequence of $(2^{nR}, n)$ codes such that $\lim_{n\to\infty} P_e^{(n)} = 0$. The capacity $C_d$ of the DM relay channel with lookahead $d$ is the supremum of all achievable rates.

The standard DM relay channel corresponds to lookahead parameter $d = -1$, or equivalently, a delay of $1$. (Note that here we define the relay channel with lookahead by $p(y_2|x_1)\,p(y_3|x_1,x_2,y_2)$, since the conditional pmf $p(y_2,y_3|x_1,x_2)$ depends on the code due to the instantaneous or lookahead dependency of $X_2$ on $Y_2$.) The causal relay channel corresponds to lookahead parameter $d = 0$, i.e., the relaying function at time $i$ can depend only on the past and present relay received sequence $y_2^i$ (instead of $y_2^{i-1}$ as in the standard relay channel). The noncausal relay channel, which we focus on in this paper, is the case where $d = \infty$, i.e., the relaying functions can depend on the entire received sequence $y_2^n$. The purpose of studying this extreme case is to quantify the limit on the potential gain from relaying.
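To make the model concrete, the following sketch (our own illustration; the particular channels are hypothetical placeholders) simulates one block through the factorization $p(y_2|x_1)\,p(y_3|x_1,x_2,y_2)$, with the relay choosing $x_2^n$ only after observing the entire block $y_2^n$:

```python
import random

# One-block simulation of the noncausal relay model: the relay sees the whole
# received block y2^n before choosing x2^n. Hypothetical channels: a BEC(1/2)
# from sender to relay, and a relay-receiver link that outputs x2 when y2 was
# not erased and 0 otherwise.
def simulate_block(x1, relay_fn, ch_y2, ch_y3):
    y2 = [ch_y2(a) for a in x1]                           # y2_i ~ p(y2|x1_i)
    x2 = relay_fn(y2)                                     # noncausal: x2^n = f(y2^n)
    y3 = [ch_y3(a, b, c) for a, b, c in zip(x1, x2, y2)]  # y3_i ~ p(y3|x1_i,x2_i,y2_i)
    return y2, y3

bec = lambda a: a if random.random() < 0.5 else 'e'       # erasure symbol 'e'
out = lambda a, b, c: b if c != 'e' else 0
# One possible relay strategy: forward the received bit, send 0 on erasures.
forward = lambda y2: [b if b != 'e' else 0 for b in y2]

x1 = [0, 1, 1, 0]
y2, y3 = simulate_block(x1, forward, bec, out)
```

Under this relay strategy, the receiver recovers the sender's bit exactly on the positions the relay received cleanly, the kind of behavior exploited in Example 1 below.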

### Ii-B Prior Work

The noncausal relay channel was initially studied by El Gamal, Hassanpour, and Mammen [3], who established the following lower bounds on the capacity $C_\infty$. The cutset upper bound on the capacity is due to [5, Theorem 16.6].

• Decode–forward (DF) lower bound:

$$C_\infty \ge R_{\mathrm{DF}} = \max_{p(x_1,x_2)} \min\{\, I(X_1;Y_2),\; I(X_1,X_2;Y_3) \,\}. \qquad (2)$$
• Partial decode–forward (PDF) lower bound:

$$C_\infty \ge R_{\mathrm{PDF}} = \max_{p(v,x_1,x_2)} \min\{\, I(V;Y_2) + I(X_1;Y_3|X_2,V),\; I(X_1,X_2;Y_3) \,\}. \qquad (3)$$
• Cutset bound for the noncausal relay channel (there is a small typo in [3, Theorem 1] in the set of pmfs over which the maximum is taken):

$$C_\infty \le R_{\mathrm{CS}} = \max_{p(x_1,x_2)} \min\{\, I(X_1;Y_2) + I(X_1;Y_3|X_2,Y_2),\; I(X_1,X_2;Y_3) \,\}. \qquad (4)$$
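The bounds above are single-letter expressions that can be evaluated numerically. For instance (our own sketch, not from the paper; the function names and coarse simplex grid are ours), the DF lower bound (2) can be brute-forced over joint input pmfs $p(x_1,x_2)$ for binary alphabets:

```python
import numpy as np

def mutual_info(pxy):
    """I(X;Y) in bits for a joint pmf given as a 2-D array."""
    px = pxy.sum(1, keepdims=True)
    py = pxy.sum(0, keepdims=True)
    mask = pxy > 0
    return float((pxy[mask] * np.log2(pxy[mask] / (px @ py)[mask])).sum())

def df_bound(p_y2_x1, p_y3_x1x2, grid=20):
    """Brute-force the DF bound (2): max over p(x1,x2) on a simplex grid of
    min{ I(X1;Y2), I(X1,X2;Y3) }. p_y2_x1[x1,y2]; p_y3_x1x2[x1,x2,y3]."""
    best = 0.0
    for a in range(grid + 1):
        for b in range(grid + 1 - a):
            for c in range(grid + 1 - a - b):
                pm = np.array([a, b, c, grid - a - b - c], float) / grid
                px1x2 = pm.reshape(2, 2)
                p_x1y2 = px1x2.sum(1)[:, None] * p_y2_x1      # p(x1, y2)
                p_xy3 = px1x2.reshape(4, 1) * p_y3_x1x2.reshape(4, -1)
                best = max(best, min(mutual_info(p_x1y2), mutual_info(p_xy3)))
    return best

# Sanity check on noiseless links: Y2 = X1 and Y3 = X2 give one bit per use.
p_y2 = np.eye(2)
p_y3 = np.zeros((2, 2, 2))
p_y3[:, 0, 0] = 1.0
p_y3[:, 1, 1] = 1.0
print(round(df_bound(p_y2, p_y3, grid=8), 4))   # 1.0
```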

## III Lower Bounds

In this section, we establish three lower bounds by considering the received sequence at the relay as noncausal side information to help communication. In Subsection III-A, we first establish the Gelfand–Pinsker decode–forward (GP-DF) lower bound by incorporating Gelfand–Pinsker coding with the decode–forward coding scheme. Then we show that the GP-DF lower bound can be strictly tighter than the decode–forward lower bound and achieves the capacity in Example 1. In Subsection III-B, we establish the Gelfand–Pinsker compress–forward (GP-CF) lower bound in two different ways, one via Wyner–Ziv binning with Gelfand–Pinsker coding and another via the recently developed hybrid coding techniques [6], [7], [8]. In Subsection III-C, we further combine the hybrid coding techniques with the partial decode–forward coding scheme.

### III-A Gelfand–Pinsker Decode–Forward Lower Bound

We first incorporate Gelfand–Pinsker coding with the decode–forward coding scheme.

###### Theorem 1 (Gelfand–Pinsker decode–forward (GP-DF) lower bound).

The capacity of the noncausal relay channel is lower bounded as

$$C_\infty \ge R_{\mathrm{GP\text{-}DF}} = \max \min\{\, I(X_1;Y_2),\; I(X_1,U;Y_3) - I(U;Y_2|X_1) \,\}, \qquad (5)$$

where the maximum is over all pmfs $p(x_1)\,p(u|x_1,y_2)$ and functions $x_2(u,y_2)$.

###### Remark 1.

Taking $U$ conditionally independent of $Y_2$ given $X_1$ and setting $X_2 = U$ reduces the GP-DF lower bound to the DF lower bound in (2).

###### Proof:

The GP-DF coding scheme uses multicoding and joint typicality encoding and decoding. For each message $m \in [2^{nR}]$, we generate a sequence $x_1^n(m)$ and a subcodebook $\mathcal{C}(m)$ of $2^{n\tilde R}$ sequences $u^n(l|m)$. To send message $m$, the sender transmits $x_1^n(m)$. Upon receiving $y_2^n$ noncausally, the relay first finds a message estimate $\tilde m$. It then finds a $u^n(l|\tilde m) \in \mathcal{C}(\tilde m)$ that is jointly typical with $(x_1^n(\tilde m), y_2^n)$ and transmits $x_{2i} = x_2(u_i(l|\tilde m), y_{2i})$ at time $i$. The receiver declares $\hat m$ to be the message estimate if $(x_1^n(\hat m), u^n(l|\hat m), y_3^n)$ are jointly typical for some $u^n(l|\hat m) \in \mathcal{C}(\hat m)$. We now provide the details of the proof.

Codebook generation: Fix a pmf $p(x_1)\,p(u|x_1,y_2)$ and a function $x_2(u,y_2)$ that attain the lower bound. Randomly and independently generate $2^{nR}$ sequences $x_1^n(m)$, $m \in [2^{nR}]$, each according to $\prod_{i=1}^n p_{X_1}(x_{1i})$. For each message $m \in [2^{nR}]$, randomly and conditionally independently generate $2^{n\tilde R}$ sequences $u^n(l|m)$, $l \in [2^{n\tilde R}]$, each according to $\prod_{i=1}^n p_{U|X_1}(u_i|x_{1i}(m))$, which form the subcodebook $\mathcal{C}(m)$. This defines the codebook $\mathcal{C} = \{x_1^n(m), u^n(l|m) : m \in [2^{nR}],\, l \in [2^{n\tilde R}]\}$. The codebook is revealed to all parties.

Encoding: To send message $m$, the encoder transmits $x_1^n(m)$.

Relay encoding: Upon receiving $y_2^n$ noncausally, the relay first finds the unique message $\tilde m$ such that $(x_1^n(\tilde m), y_2^n) \in \mathcal{T}_{\epsilon'}^{(n)}$. Then, it finds a sequence $u^n(l|\tilde m) \in \mathcal{C}(\tilde m)$ such that $(u^n(l|\tilde m), x_1^n(\tilde m), y_2^n) \in \mathcal{T}_{\epsilon'}^{(n)}$. If there is more than one such index, it selects one of them uniformly at random. If there is no such index, it selects an index from $[2^{n\tilde R}]$ uniformly at random. The relay transmits $x_{2i} = x_2(u_i(l|\tilde m), y_{2i})$ at time $i \in [n]$.

Decoding: Let $\epsilon > \epsilon'$. Upon receiving $y_3^n$, the decoder declares that $\hat m$ is sent if it is the unique message such that $(x_1^n(\hat m), u^n(l|\hat m), y_3^n) \in \mathcal{T}_{\epsilon}^{(n)}$ for some $u^n(l|\hat m) \in \mathcal{C}(\hat m)$; otherwise, it declares an error.

Analysis of the probability of error: We analyze the probability of error averaged over codes. Assume without loss of generality that $M = 1$. Let $\tilde M$ be the relay's message estimate and let $L$ denote the index of the chosen codeword $U^n(L|\tilde M)$ for $\tilde M$ and $Y_2^n$. The decoder makes an error only if one of the following events occurs:

$$\begin{aligned} \tilde{\mathcal{E}} &= \{\tilde M \neq 1\}, \\ \tilde{\mathcal{E}}_1 &= \{(X_1^n(1), Y_2^n) \notin \mathcal{T}_{\epsilon'}^{(n)}\}, \\ \tilde{\mathcal{E}}_2 &= \{(X_1^n(m), Y_2^n) \in \mathcal{T}_{\epsilon'}^{(n)} \text{ for some } m \neq 1\}, \\ \tilde{\mathcal{E}}_3 &= \{(U^n(l|\tilde M), X_1^n(\tilde M), Y_2^n) \notin \mathcal{T}_{\epsilon'}^{(n)} \text{ for all } U^n(l|\tilde M) \in \mathcal{C}(\tilde M)\}, \\ \mathcal{E}_1 &= \{(X_1^n(1), U^n(L|\tilde M), Y_3^n) \notin \mathcal{T}_{\epsilon}^{(n)}\}, \\ \mathcal{E}_2 &= \{(X_1^n(m), U^n(l|m), Y_3^n) \in \mathcal{T}_{\epsilon}^{(n)} \text{ for some } m \neq 1,\; U^n(l|m) \in \mathcal{C}(m)\}. \end{aligned}$$

Thus, the probability of error is upper bounded as

$$\begin{aligned} P(\mathcal{E}) &= P\{\hat M \neq 1\} \le P(\tilde{\mathcal{E}} \cup \tilde{\mathcal{E}}_3 \cup \mathcal{E}_1 \cup \mathcal{E}_2) \\ &\le P(\tilde{\mathcal{E}}) + P(\tilde{\mathcal{E}}_3 \cap \tilde{\mathcal{E}}^c) + P(\mathcal{E}_1 \cap \tilde{\mathcal{E}}^c \cap \tilde{\mathcal{E}}_3^c) + P(\mathcal{E}_2) \\ &\le P(\tilde{\mathcal{E}}_1) + P(\tilde{\mathcal{E}}_2) + P(\tilde{\mathcal{E}}_3 \cap \tilde{\mathcal{E}}^c) + P(\mathcal{E}_1 \cap \tilde{\mathcal{E}}^c \cap \tilde{\mathcal{E}}_3^c) + P(\mathcal{E}_2). \end{aligned}$$

By the law of large numbers (LLN), the first term tends to zero as $n \to \infty$. By the packing lemma [5], the second term tends to zero as $n \to \infty$ if $R < I(X_1;Y_2) - \delta(\epsilon')$. Therefore, $P(\tilde{\mathcal{E}})$ tends to zero as $n \to \infty$ if $R < I(X_1;Y_2) - \delta(\epsilon')$. Given $\tilde{\mathcal{E}}^c$, i.e., $\tilde M = 1$, by the covering lemma [5], the third term tends to zero as $n \to \infty$ if $\tilde R > I(U;Y_2|X_1) + \delta(\epsilon')$. By the conditional typicality lemma, the fourth term tends to zero as $n \to \infty$. Finally, note that once the message $m$ is wrong, the codeword $U^n(l|m)$ is also wrong. By the packing lemma, the last term tends to zero as $n \to \infty$ if $R + \tilde R < I(X_1,U;Y_3) - \delta(\epsilon)$. Combining the bounds and eliminating $\tilde R$, we have shown that $P(\mathcal{E})$ tends to zero as $n \to \infty$ if $R < I(X_1;Y_2) - \delta(\epsilon')$ and $R < I(X_1,U;Y_3) - I(U;Y_2|X_1) - \delta(\epsilon) - \delta(\epsilon')$, where $\delta(\epsilon) \to 0$ as $\epsilon \to 0$. This completes the proof.
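In compact form, the two surviving constraints on the auxiliary rate combine by eliminating $\tilde R$ (a bookkeeping step consistent with the covering and packing conditions above):

```latex
\left.
\begin{aligned}
\tilde R &> I(U;Y_2 \mid X_1) + \delta(\epsilon') \\
R + \tilde R &< I(X_1,U;Y_3) - \delta(\epsilon)
\end{aligned}
\right\}
\;\Longrightarrow\;
R < I(X_1,U;Y_3) - I(U;Y_2 \mid X_1) - \delta(\epsilon) - \delta(\epsilon').
```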

###### Remark 2.

Unlike the coding schemes for the regular relay channel, we do not need block Markov coding for the noncausal relay channel for the following two reasons. First, from the channel statistics $p(y_2|x_1)\,p(y_3|x_1,x_2,y_2)$, $Y_2$ does not depend on $X_2$ and hence there is no need to make the relay codeword correlated with the codeword of the previous block. Second, $Y_2^n$ is available noncausally at the relay and hence the signals from the sender and the relay arrive at the receiver in the same block.

The GP-DF lower bound can be strictly tighter than the DF lower bound as shown in the following example.

###### Example 1.

Consider the degraded noncausal relay channel depicted in Figure 2. The channel from the sender to the relay is a binary erasure channel $\mathrm{BEC}(1/2)$, while the channel from the relay to the receiver is clean ($Y_3 = X_2$) if $Y_2 \neq \mathrm{e}$ and stuck at $0$ ($Y_3 = 0$) if $Y_2$ is an erasure.

Note that the state of the channel from the relay to the receiver, namely, whether we get an erasure or not, is independent of $X_1$. The first term in both the DF lower bound and the GP-DF lower bound is easy to compute as

$$\max_{p(x_1)} I(X_1;Y_2) = 1/2.$$

Consider the second term in the DF lower bound. Here $X_2$ is chosen such that $X_2 \to X_1 \to Y_2$ form a Markov chain. By carefully computing the conditional pmf $p(y_3|x_1,x_2)$ in this specific channel, we can show that $X_1 \to X_2 \to Y_3$ form a Markov chain. Thus,

$$\max_{p(x_1,x_2)} I(X_1,X_2;Y_3) \stackrel{(a)}{=} \max_{p(x_1,x_2)} I(X_2;Y_3) \stackrel{(b)}{=} \max_{p(x_2)} I(X_2;Y_3) \stackrel{(c)}{=} H(1/5) - 2/5 = 0.3219,$$

where (a) follows since $X_1 \to X_2 \to Y_3$ form a Markov chain, (b) follows since $I(X_2;Y_3)$ is fully determined by the marginal distribution $p(x_2)$, and (c) follows since the channel from $X_2$ to $Y_3$ is a Z-channel with crossover probability 1/2 regardless of $X_1$. Thus,

$$R_{\mathrm{DF}} = \min\{1/2,\; 0.3219\} = 0.3219.$$
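The value in (c) can be double-checked numerically; the following sketch (ours) maximizes $I(X_2;Y_3)$ for a Z-channel with crossover probability $1/2$ over a fine grid of input pmfs:

```python
import math

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mi(q):
    """I(X2;Y3) for the Z-channel with crossover 1/2 when P(X2 = 1) = q."""
    return h(q / 2) - q * h(0.5)     # H(Y3) - H(Y3|X2), since P(Y3 = 1) = q/2

best = max(mi(k / 10000) for k in range(10001))
print(round(best, 4))                # 0.3219
print(round(h(1 / 5) - 2 / 5, 4))    # 0.3219, attained at P(X2 = 1) = 2/5
```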

Now consider the second term in the GP-DF lower bound (5), namely $I(X_1,U;Y_3) - I(U;Y_2|X_1)$.

Let $U = X_2$, where $X_2 \sim \mathrm{Bern}(1/2)$ if $Y_2 \neq \mathrm{e}$, and $X_2 = 0$ if $Y_2 = \mathrm{e}$. Note that here we always have $Y_3 = X_2$ and $X_1 \to Y_2 \to X_2$ form a Markov chain. Thus,

$$\begin{aligned} I(X_1,U;Y_3) - I(U;Y_2|X_1) &= I(X_1,X_2;X_2) - I(X_2;Y_2|X_1) \\ &= H(X_2) - H(X_2|X_1) + H(X_2|Y_2,X_1) \\ &\ge H(X_2|Y_2) \\ &= 1/2, \end{aligned}$$

where the inequality follows since $H(X_2) \ge H(X_2|X_1)$ and $H(X_2|Y_2,X_1) = H(X_2|Y_2)$ by the Markov chain $X_1 \to Y_2 \to X_2$.

Therefore,

$$R_{\mathrm{GP\text{-}DF}} = 1/2 > R_{\mathrm{DF}} = 0.3219.$$

Moreover, it is easy to see from the cutset bound (4) that the rate $1/2$ is also an upper bound and hence $C_\infty = 1/2$.
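The choice above can be verified by direct enumeration of the joint pmf; a minimal sketch (ours; the coordinate bookkeeping and helper names are not from the paper):

```python
import math
from collections import defaultdict

# Example 1 with the GP-DF choice U = X2: X1 ~ Bern(1/2) through a BEC(1/2);
# X2 ~ Bern(1/2) when Y2 != 'e' and X2 = 0 when Y2 = 'e'; Y3 = X2 always.
p = defaultdict(float)
for x1 in (0, 1):
    for y2, py2 in ((x1, 0.5), ('e', 0.5)):
        choices = ((0, 0.5), (1, 0.5)) if y2 != 'e' else ((0, 1.0),)
        for x2, px2 in choices:
            p[(x1, y2, x2, x2)] += 0.5 * py2 * px2   # last coordinate: y3 = x2

def H(idx):
    """Joint entropy (bits) of the coordinates listed in idx."""
    marg = defaultdict(float)
    for k, v in p.items():
        marg[tuple(k[i] for i in idx)] += v
    return -sum(v * math.log2(v) for v in marg.values() if v > 0)

X1, Y2, U, Y3 = 0, 1, 2, 3                     # U = X2 is coordinate 2
term = (H([X1, U]) + H([Y3]) - H([X1, U, Y3])) \
     - (H([U, X1]) + H([Y2, X1]) - H([U, Y2, X1]) - H([X1]))
print(round(term, 6))                          # 0.5  (= I(X1,U;Y3) - I(U;Y2|X1))
```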

### Iii-B Gelfand–Pinsker Compress–Forward Lower Bound

In this subsection, we first propose a two-stage coding scheme that incorporates Gelfand–Pinsker coding with the compress–forward coding scheme. Then we show an equivalent lower bound can be established directly by applying the recently developed hybrid coding scheme at the relay node.

###### Theorem 2 (Gelfand–Pinsker compress–forward (GP-CF) lower bound).

The capacity of the noncausal relay channel is lower bounded as

$$C_\infty \ge R_{\mathrm{GP\text{-}CF}} = \max \min\{\, I(X_1;U,Y_3),\; I(X_1,U;Y_3) - I(U;Y_2|X_1) \,\}, \qquad (6)$$

where the maximum is over all pmfs $p(x_1)\,p(u|y_2)$ and functions $x_2(u,y_2)$.

###### Proof:

The coding scheme is illustrated in Figure 3. We use Wyner–Ziv binning, multicoding, and joint typicality encoding and decoding. A description $\hat y_2^n$ of $y_2^n$ is constructed at the relay. Since the receiver has side information $y_3^n$ about $\hat y_2^n$, we use binning as in Wyner–Ziv coding to reduce the rate necessary to send $\hat y_2^n$. Since the relay has side information $y_2^n$ of the channel $p(y_3|x_1,x_2,y_2)$, we use multicoding as in Gelfand–Pinsker coding to send the bin index of $\hat y_2^n$ via $x_2^n$. The decoder first decodes the bin index from $y_3^n$. It then uses $y_3^n$ and the bin index to decode $\hat y_2^n$ and the message simultaneously.

We now provide the details of the coding scheme.

Codebook generation: Fix a pmf $p(x_1)\,p(\hat y_2|y_2)\,p(u|y_2)$ and a function $x_2(u, \hat y_2, y_2)$ that attain the lower bound. Randomly and independently generate $2^{nR}$ sequences $x_1^n(m)$, $m \in [2^{nR}]$, each according to $\prod_{i=1}^n p_{X_1}(x_{1i})$. Randomly and independently generate $2^{nR_2}$ sequences $\hat y_2^n(k)$, $k \in [2^{nR_2}]$, each according to $\prod_{i=1}^n p_{\hat Y_2}(\hat y_{2i})$. Partition $[2^{nR_2}]$ into $2^{nR_b}$ equal-size bins $\mathcal{B}(j)$, $j \in [2^{nR_b}]$. For each $j \in [2^{nR_b}]$, randomly and independently generate $2^{n\tilde R}$ sequences $u^n(l|j)$, $l \in [2^{n\tilde R}]$, each according to $\prod_{i=1}^n p_U(u_i)$, which form the subcodebook $\mathcal{C}(j)$. This defines the codebook. The codebook is revealed to all parties.

Encoding: To send the message $m$, the encoder transmits $x_1^n(m)$.

Relay encoding and analysis of the probability of error: Upon receiving $y_2^n$, the relay first finds an index $k$ such that $(\hat y_2^n(k), y_2^n) \in \mathcal{T}_{\epsilon'}^{(n)}$. This requires $R_2 > I(\hat Y_2;Y_2) + \delta(\epsilon')$ by the covering lemma. Upon getting the bin index of $k$, i.e., the $j$ such that $k \in \mathcal{B}(j)$, the relay finds a sequence $u^n(l|j) \in \mathcal{C}(j)$ such that $(u^n(l|j), y_2^n) \in \mathcal{T}_{\epsilon'}^{(n)}$. This requires $\tilde R > I(U;Y_2) + \delta(\epsilon')$ by the covering lemma. The relay transmits $x_{2i} = x_2(u_i(l|j), \hat y_{2i}(k), y_{2i})$ at time $i \in [n]$.

Decoding and analysis of the probability of error: Let $\epsilon > \epsilon'$. Upon receiving $y_3^n$, the decoder finds the unique $\hat j$ such that $(u^n(l|\hat j), y_3^n) \in \mathcal{T}_{\epsilon}^{(n)}$ for some $l \in [2^{n\tilde R}]$. This requires $R_b + \tilde R < I(U;Y_3) - \delta(\epsilon)$. The decoder then finds the unique message $\hat m$ such that $(x_1^n(\hat m), \hat y_2^n(k), y_3^n) \in \mathcal{T}_{\epsilon}^{(n)}$ for some $k \in \mathcal{B}(\hat j)$. Let $K$ be the compression index chosen at the relay. If $k = K$ but $\hat m \neq 1$, this requires $R < I(X_1;\hat Y_2, Y_3) - \delta(\epsilon)$. If $k \neq K$ and $\hat m \neq 1$, this requires $R + R_2 - R_b < I(X_1;Y_3) + I(\hat Y_2; X_1, Y_3) - \delta(\epsilon)$. Thus, we establish the following lower bound:

$$C_\infty \ge R'_{\mathrm{GP\text{-}CF}} = \max \min\{\, I(X_1;\hat Y_2,Y_3),\; I(X_1,\hat Y_2;Y_3) - I(\hat Y_2;Y_2|X_1) + I(U;Y_3) - I(U;Y_2) \,\}, \qquad (7)$$

where the maximum is over all pmfs $p(x_1)\,p(\hat y_2|y_2)\,p(u|y_2)$ and functions $x_2(u, \hat y_2, y_2)$.
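For bookkeeping, writing $R_2$ for the compression-codebook rate, $R_b$ for the bin rate, and $\tilde R$ for the multicoding rate (labels ours), the covering and packing constraints of the two-stage scheme combine as:

```latex
\begin{aligned}
&R_2 > I(\hat Y_2;Y_2), \qquad \tilde R > I(U;Y_2), \qquad R_b + \tilde R < I(U;Y_3),\\
&R + R_2 - R_b < I(X_1;Y_3) + I(\hat Y_2;X_1,Y_3)\\
&\;\Longrightarrow\; R < I(X_1,\hat Y_2;Y_3) - I(\hat Y_2;Y_2 \mid X_1) + I(U;Y_3) - I(U;Y_2),
\end{aligned}
```

where the last step uses $I(X_1;Y_3) + I(\hat Y_2;X_1,Y_3) - I(\hat Y_2;Y_2) = I(X_1,\hat Y_2;Y_3) - I(\hat Y_2;Y_2\mid X_1)$, which holds by the Markov chain $\hat Y_2 \to Y_2 \to X_1$.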

Now we show that the two lower bounds (7) and (6) are equivalent. Setting $U = \emptyset$ in (7) and relabeling $\hat Y_2$ as $U$ reduces (7) to (6). Thus,

$$R'_{\mathrm{GP\text{-}CF}} \ge R_{\mathrm{GP\text{-}CF}}. \qquad (8)$$

On the other hand, replacing $U$ by the pair $(U, \hat Y_2)$ in (6), we have

$$\begin{aligned} &I(X_1,U,\hat Y_2;Y_3) - I(U,\hat Y_2;Y_2|X_1) \\ &\quad= I(X_1,\hat Y_2;Y_3) - I(\hat Y_2;Y_2|X_1) + H(U|X_1,\hat Y_2) - H(U|X_1,\hat Y_2,Y_3) - H(U|X_1,\hat Y_2) + H(U|X_1,\hat Y_2,Y_2) \\ &\quad\stackrel{(a)}{\ge} I(X_1,\hat Y_2;Y_3) - I(\hat Y_2;Y_2|X_1) + H(U) - H(U|Y_3) - H(U) + H(U|X_1,\hat Y_2,Y_2) \\ &\quad\stackrel{(b)}{=} I(X_1,\hat Y_2;Y_3) - I(\hat Y_2;Y_2|X_1) + I(U;Y_3) - I(U;Y_2), \end{aligned}$$

where (a) follows since conditioning reduces entropy and (b) follows since $U \to Y_2 \to (X_1, \hat Y_2)$ form a Markov chain. Furthermore, since the maximum in (6) is over a larger set than the set in (7),

$$R_{\mathrm{GP\text{-}CF}} \ge R'_{\mathrm{GP\text{-}CF}}. \qquad (9)$$

Combining (8) and (9) establishes the equivalence.
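The inequality chain above can be sanity-checked numerically; the sketch below (ours; binary alphabets, random pmfs with the Markov structure of (7)) verifies that the substituted second term of (6) dominates that of (7):

```python
import itertools
import math
import random
from collections import defaultdict

random.seed(1)

def rand_cond(n_in, n_out):
    """Random conditional pmf table of shape n_in x n_out."""
    t = [[random.random() for _ in range(n_out)] for _ in range(n_in)]
    return [[x / sum(row) for x in row] for row in t]

def H(p, idx):
    marg = defaultdict(float)
    for k, v in p.items():
        marg[tuple(k[i] for i in idx)] += v
    return -sum(v * math.log2(v) for v in marg.values() if v > 0)

def I(p, a, b, cond=()):
    """Conditional mutual information I(A;B|C) in bits."""
    return H(p, a + cond) + H(p, b + cond) - H(p, a + b + cond) - H(p, cond)

# Joint pmf with the structure of (7): p(x1) p(y2|x1) p(^y2|y2) p(u|y2),
# x2 = f(u, y2), y3 ~ p(y3|x1, x2, y2); all alphabets binary.
px1 = rand_cond(1, 2)[0]
py2, phat, pu = rand_cond(2, 2), rand_cond(2, 2), rand_cond(2, 2)
f = [[random.randint(0, 1) for _ in range(2)] for _ in range(2)]
py3 = rand_cond(8, 2)                      # row index encodes (x1, x2, y2)

p = defaultdict(float)
for x1, y2, hy2, u, y3 in itertools.product(range(2), repeat=5):
    x2 = f[u][y2]
    p[(x1, y2, hy2, u, y3)] += (px1[x1] * py2[x1][y2] * phat[y2][hy2]
                                * pu[y2][u] * py3[4 * x1 + 2 * x2 + y2][y3])

X1, Y2, HY2, U, Y3 = (0,), (1,), (2,), (3,), (4,)
lhs = I(p, X1 + U + HY2, Y3) - I(p, U + HY2, Y2, X1)
rhs = I(p, X1 + HY2, Y3) - I(p, HY2, Y2, X1) + I(p, U, Y3) - I(p, U, Y2)
print(lhs >= rhs - 1e-9)   # True
```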

###### Remark 3.

Taking $U = X_2$ independent of $Y_2$ in (7), we establish the compress–forward lower bound without Gelfand–Pinsker coding as follows:

$$C_\infty \ge R_{\mathrm{CF}} = \max \min\{\, I(X_1;\hat Y_2, Y_3),\; I(X_1,\hat Y_2;Y_3) + I(X_2;Y_3) - I(\hat Y_2;Y_2|X_1) \,\},$$

where the maximum is over all pmfs $p(x_1)\,p(x_2)\,p(\hat y_2|y_2)$.

In the analysis of the probability of error in Theorem 2, there is a technical subtlety in applying the standard packing lemma and joint typicality lemma, since the bin index $J$, the compression index $K$, and the multicoding index $L$ all depend on the random codebook itself. In the following, we show that the GP-CF lower bound (6) can be established directly by applying the recently developed hybrid coding scheme for joint source–channel coding by Lim, Minero, and Kim [6], [7], [8].

###### Proof:

In this coding scheme, we apply hybrid coding at the relay node as depicted in Figure 4. The sequence $y_2^n$ is mapped to one of $2^{n\tilde R}$ sequences $u^n(l)$. The relay generates the codeword $x_2^n$ through a symbol-by-symbol mapping $x_{2i} = x_2(u_i, y_{2i})$. The receiver declares $\hat m$ to be the message estimate if $(x_1^n(\hat m), u^n(l), y_3^n)$ are jointly typical for some $l$. Similar to the hybrid coding scheme for joint source–channel coding [6], [7], [8], the precise analysis of the probability of decoding error involves a technical subtlety. In particular, since $u^n(L)$ is used as a source codeword, the index $L$ depends on the entire codebook. This dependency issue is resolved by the technique developed in [6]. We now provide the details of the coding scheme.

Codebook generation: Fix a pmf $p(x_1)\,p(u|y_2)$ and a function $x_2(u,y_2)$ that attain the lower bound. Randomly and independently generate $2^{nR}$ sequences $x_1^n(m)$, $m \in [2^{nR}]$, each according to $\prod_{i=1}^n p_{X_1}(x_{1i})$. Randomly and independently generate $2^{n\tilde R}$ sequences $u^n(l)$, $l \in [2^{n\tilde R}]$, each according to $\prod_{i=1}^n p_U(u_i)$. This defines the codebook $\mathcal{C} = \{x_1^n(m), u^n(l) : m \in [2^{nR}],\, l \in [2^{n\tilde R}]\}$. The codebook is revealed to all parties.

Encoding: To send message $m$, the encoder transmits $x_1^n(m)$.

Relay encoding: Upon receiving $y_2^n$, the relay finds an index $l$ such that $(u^n(l), y_2^n) \in \mathcal{T}_{\epsilon'}^{(n)}$. If there is more than one such index, it chooses one of them uniformly at random. If there is no such index, it chooses an index uniformly at random from $[2^{n\tilde R}]$. The relay then transmits $x_{2i} = x_2(u_i(l), y_{2i})$ at time $i \in [n]$.

Decoding: Let $\epsilon > \epsilon'$. Upon receiving $y_3^n$, the decoder finds the unique message $\hat m$ such that $(x_1^n(\hat m), u^n(l), y_3^n) \in \mathcal{T}_{\epsilon}^{(n)}$ for some $l \in [2^{n\tilde R}]$.

Analysis of the probability of error: We analyze the probability of decoding error averaged over codes. Let $L$ denote the index of the chosen codeword $U^n$ for $Y_2^n$. Assume without loss of generality that $M = 1$. The decoder makes an error only if one of the following events occurs:

$$\begin{aligned} \tilde{\mathcal{E}} &= \{(U^n(l), Y_2^n) \notin \mathcal{T}_{\epsilon'}^{(n)} \text{ for all } l \in [2^{n\tilde R}]\}, \\ \mathcal{E}_1 &= \{(X_1^n(1), U^n(L), Y_3^n) \notin \mathcal{T}_{\epsilon}^{(n)}\}, \\ \mathcal{E}_2 &= \{(X_1^n(m), U^n(L), Y_3^n) \in \mathcal{T}_{\epsilon}^{(n)} \text{ for some } m \neq 1\}, \\ \mathcal{E}_3 &= \{(X_1^n(m), U^n(l), Y_3^n) \in \mathcal{T}_{\epsilon}^{(n)} \text{ for some } m \neq 1,\; l \neq L\}. \end{aligned}$$

By the union of the events bound, the probability of error is upper bounded as

$$P(\mathcal{E}) = P(\tilde{\mathcal{E}} \cup \mathcal{E}_1 \cup \mathcal{E}_2 \cup \mathcal{E}_3) \le P(\tilde{\mathcal{E}}) + P(\mathcal{E}_1 \cap \tilde{\mathcal{E}}^c) + P(\mathcal{E}_2 \cap \tilde{\mathcal{E}}^c) + P(\mathcal{E}_3).$$

By the covering lemma, the first term tends to zero as $n \to \infty$ if $\tilde R > I(U;Y_2) + \delta(\epsilon')$. By the conditional typicality lemma, the second term tends to zero as $n \to \infty$. By the packing lemma, the third term tends to zero as $n \to \infty$ if $R < I(X_1; U, Y_3) - \delta(\epsilon)$.

The fourth term requires special attention. Consider

$$\begin{aligned} P(\mathcal{E}_3) &= P\{(X_1^n(m), U^n(l), Y_3^n) \in \mathcal{T}_{\epsilon}^{(n)} \text{ for some } m \neq 1,\; l \neq L\} \\ &\stackrel{(a)}{\le} \sum_{m=2}^{2^{nR}} \sum_{l=1}^{2^{n\tilde R}} P\{(X_1^n(m), U^n(l), Y_3^n) \in \mathcal{T}_{\epsilon}^{(n)},\; l \neq L\} \\ &= \sum_{m=2}^{2^{nR}} \sum_{l=1}^{2^{n\tilde R}} \sum_{y_2^n} p(y_2^n)\, P\{(X_1^n(m), U^n(l), Y_3^n) \in \mathcal{T}_{\epsilon}^{(n)},\; l \neq L \mid Y_2^n = y_2^n\} \\ &\stackrel{(b)}{\le} 2^{nR}\, 2^{n\tilde R} \sum_{y_2^n} p(y_2^n)\, P\{(X_1^n(2), U^n(1), Y_3^n) \in \mathcal{T}_{\epsilon}^{(n)},\; L \neq 1 \mid Y_2^n = y_2^n\}, \end{aligned}$$

where (a) follows by the union of events bound and (b) follows by the symmetry of the codebook generation and relay encoding. Let $\bar{\mathcal{C}} = \{X_1^n(m) : m \neq 2\} \cup \{U^n(l) : l \neq 1\}$ denote the random codebook excluding $X_1^n(2)$ and $U^n(1)$. Then, for $n$ sufficiently large,

$$\begin{aligned} &P\{(X_1^n(2), U^n(1), Y_3^n) \in \mathcal{T}_{\epsilon}^{(n)},\; L \neq 1 \mid Y_2^n = y_2^n\} \\ &\quad\le P\{(X_1^n(2), U^n(1), Y_3^n) \in \mathcal{T}_{\epsilon}^{(n)} \mid L \neq 1,\; Y_2^n = y_2^n\} \\ &\quad= \sum_{(x_1^n, u^n, y_3^n) \in \mathcal{T}_{\epsilon}^{(n)}} P\{X_1^n(2) = x_1^n,\; U^n(1) = u^n,\; Y_3^n = y_3^n \mid L \neq 1,\; Y_2^n = y_2^n\} \\ &\quad= \sum_{(x_1^n, u^n, y_3^n) \in \mathcal{T}_{\epsilon}^{(n)}} \sum_{\bar{C}} P\{\bar{\mathcal{C}} = \bar{C} \mid L \neq 1,\; Y_2^n = y_2^n\} \\ &\qquad\qquad \cdot P\{X_1^n(2) = x_1^n,\; U^n(1) = u^n,\; Y_3^n = y_3^n \mid \bar{\mathcal{C}} = \bar{C},\; L \neq 1,\; Y_2^n = y_2^n\} \end{aligned}$$