# Upper Bounds on the Capacity of 2-Layer N-Relay Symmetric Gaussian Network

Satyajit Thakor and Syed Abbas
School of Computing and Electrical Engineering
School of Basic Sciences
Indian Institute of Technology Mandi
email: satyajit@iitmandi.ac.in
###### Abstract

The Gaussian parallel relay network, in which two parallel relays assist a source to convey information to a destination, was introduced by Schein and Gallager. An upper bound on the capacity can be obtained by considering the broadcast cut between the source and the relays and the multiple access cut between the relays and the destination. Niesen and Diggavi derived an upper bound for the Gaussian parallel $N$-relay network by considering all other possible cuts and presented an achievability scheme that attains rates close to the upper bound in different channel gain regimes, thus establishing the approximate capacity. In this paper we consider symmetric layered Gaussian relay networks in which there can be many layers of parallel relays. The channel gains for the channels between two adjacent layers are symmetrical (identical). Relays in each layer broadcast information to the relays in the next layer. For the 2-layer $N$-relay Gaussian network we give upper bounds on the capacity. Our analysis reveals that, for the upper bounds, joint optimization over correlation coefficients is not necessary for obtaining stronger results.

## I Introduction

The capacity of multihop relay networks is largely unknown. Even for the simple relay network set-up [1], in which a relay assists a source to communicate information to a destination in addition to the direct transmission, the capacity is unknown for the discrete memoryless as well as the Gaussian case. In [2], Schein and Gallager introduced a new network setup, called the Gaussian parallel relay network, in which a source communicates information to the destination via two parallel relays. Cut-set outer bounds were derived and a coding scheme called amplify-and-forward (AF) was introduced. In [3], Niesen and Diggavi considered a generalization of the Gaussian parallel relay network, called the Gaussian diamond network, in which communication is via $N$ relays. For the diamond network with symmetric channel gains an outer bound was derived. The authors of [3] also proposed a bursty amplify-and-forward scheme and showed constant additive and multiplicative gaps between the rates achievable by the coding scheme and the upper bound for all regimes (parameter choices). This also established an approximate capacity for the symmetric Gaussian diamond network. These results were further extended to the asymmetric Gaussian diamond network by converting an asymmetric diamond network into (approximately) symmetric parallel diamond networks.

The recent work [4] suggests that a gap that is sub-linear in the number of network nodes between the cut-set bound and an achievable scheme is very unlikely for general relay networks, unless the cut-set bound is tight in general. However, for certain classes of networks a constant factor gap may be a possibility and should be investigated.

Figure 1 depicts a generalization of the diamond network called the Gaussian layered relay network. This general network has multiple layers of parallel relays between the source and the destination. Each relay in a layer receives information from the relays of the previous layer and broadcasts a function of the received information to the relays in the next layer. The source node is at layer 0 and the destination node is at the last layer. Layered deterministic relay networks were considered in [5]. In layered networks the information flowing through different paths to a node in the network arrives at the same time (i.e., with the same delay). This makes both upper bounding using information theoretic arguments and lower bounding using coding schemes somewhat easier, since the causality constraint for information transmission can be dropped.

As noted in [6, p. 39], some of the achievable coding schemes are seen from a better perspective once the converse results are established. In this paper, we focus on converse results (upper bounds) for the network capacity. As a first step towards characterizing better upper bounds on the capacity of Gaussian layered relay networks (and, hopefully, establishing approximate capacity), we consider the 2-layer $N$-relay Gaussian network. The diamond network is a sub-network of this network. For this network we obtain upper bounds on the capacity which involve joint optimization over correlation parameters. Our analysis reveals a surprising result: joint optimization over correlation parameters across different layers is not necessary to obtain the tighter cut-set type bounds (29)-(30) for the 2-layer $N$-relay network. This desirable result may lead to a characterization of the approximate capacity of layered relay networks with small multiplicative and additive gaps between the outer bounds and the achievable rates of coding schemes.

The remaining part of the paper is organized as follows. Section II describes the network model and related work. Section III presents our main results. Conclusion and future directions are discussed in Section IV.

## II Network Model

A 2-layer $N$-relay Gaussian network with $N$ relays in each layer is shown in Figure 2. We adopt most notations from [3]. The sets of relays in layers 1 and 2 are labeled $[{}_1N]$ and $[{}_2N]$. In this context, ${}_11$, ${}_12$, etc., are relays and should not be interpreted as the numbers 11 or 12. Parameters such as $r_1,r_2,r_3$ are denoted $r_{1,2,3}$ for simplicity. The transmitted random variables have a unit average power constraint. The channel gains are real positive constants. The channels introduce i.i.d. additive white Gaussian noise, each with zero mean and variance 1. A random variable without time index may be viewed as a generic copy.

The source message $V$ is encoded into the channel inputs $X_0[t]$. The input to the relay node ${}_1j$ at time $t$ is

$$Y_{{}_1j}[t]=\sqrt{r_1}\,X_0[t]+Z_{{}_1j}[t].$$

Assume that the relays incur a delay of one block length $T$ (or one time instance). In the next block interval, each of these relays broadcasts a relay code to the next layer. Each relay ${}_2j$ in layer 2 receives

$$Y_{{}_2j}[t]=\sqrt{r_2}\sum_{l=1}^{N}X_{{}_1l}[t]+Z_{{}_2j}[t].$$

The coded symbols $X_{{}_2l}[t]$ are transmitted by the layer-2 relays and

$$Y_3[t]=\sqrt{r_3}\sum_{l=1}^{N}X_{{}_2l}[t]+Z_3[t]$$

is received by the destination node, which estimates the transmitted source message from the received signal via the decoding function

$$\hat{V}=g_d\Big(\big(Y_3[t]\big)_{t=2T+1}^{3T}\Big).$$

Note that, due to the layered structure of the network, in any block interval a relay in the second layer receives a corrupted version of the transmitted codes for the source message for the same block. In other words, the signals received by the layer-2 relays in a block interval are all associated with the same transmitted code block from the source. Also, the destination node receives a corrupted version of the transmitted codes for the source message for the same block. In other words, the signal received at the destination in a block interval is associated with a single transmitted code block from the source. As such, we assume no delay in the relay operations and drop the causality constraint for information transmission in the network.

For the symmetric diamond network the capacity is upper bounded as follows. We refer to (1) as the ND bound.

###### Theorem 1 (Lemma 7, [3])

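For the symmetric diamond network with $N$ relays and channel gain parameters $r_1$ at the source broadcast side and $r_2$ at the destination multiple-access side, the capacity satisfies

$$C(N,r_{1,2})\le\sup_{\rho_1\in[0,1)}\ \min_{n\in\{0,\ldots,N\}}\frac{1}{2}\Big(\log\big(1+(N-n)r_1\big)+\log\big(1+\psi(n,\rho_1)\,r_2\big)\Big) \tag{1}$$

where

$$\psi(n,\rho)=n\left(1+(n-1)\rho-\frac{n(N-n)\rho^2}{1+(N-n-1)\rho}\right).$$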

Note that, due to the symmetric gains, only $N+1$ cuts need to be considered out of all $2^N$ possible cuts. This bound is a generalization of the outer bounds for the Gaussian parallel 2-relay (symmetric) network in [6] (see also [2]). In particular, [3, Lemma 7] is a generalization of the broadcast cut-set bound (2.49) and the cross cut-set bounds (2.83-84) in [6] (and the simpler bound (4) in [3] is a generalization of (2.49), the multiple access cut-set bound (2.58) and the pure cross cut-set bound (2.68) in [6]).
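As a numerical sanity check of $\psi$ in Theorem 1, the following Python sketch compares the closed form (written with the denominator $1+(N-n-1)\rho$) against a direct Schur-complement computation for the equicorrelated covariance $\rho\mathbf{1}\mathbf{1}^T+(1-\rho)I$, and evaluates the right-hand side of (1) by a grid search over $\rho_1$. The function names, the grid resolution and the use of base-2 logarithms are assumptions made for illustration, not part of [3].

```python
import numpy as np

def psi(n, rho, N):
    """Closed form of 1^T Q_{S.S^c} 1 for |S| = n (valid for rho in [0, 1))."""
    if n == 0:
        return 0.0
    return n * (1 + (n - 1) * rho
                - n * (N - n) * rho**2 / (1 + (N - n - 1) * rho))

def psi_numeric(n, rho, N):
    """Same quantity computed directly from the Schur complement."""
    Q = rho * np.ones((N, N)) + (1 - rho) * np.eye(N)
    S, Sc = np.arange(n), np.arange(n, N)
    if n == 0:
        return 0.0
    Q_ss = Q[np.ix_(S, S)]
    if Sc.size == 0:
        return float(Q_ss.sum())
    Q_sc = Q[np.ix_(S, Sc)]
    schur = Q_ss - Q_sc @ np.linalg.pinv(Q[np.ix_(Sc, Sc)]) @ Q_sc.T
    return float(schur.sum())

def nd_bound(N, r1, r2, grid=1000):
    """Grid-search approximation of the sup-min in (1), in bits."""
    best = 0.0
    for rho in np.linspace(0.0, 1.0, grid, endpoint=False):
        worst_cut = min(
            0.5 * (np.log2(1 + (N - n) * r1) + np.log2(1 + psi(n, rho, N) * r2))
            for n in range(N + 1)
        )
        best = max(best, worst_cut)
    return best
```

For instance, `nd_bound(2, 1.0, 1.0)` is capped by the broadcast cut ($n=0$), whose value $\tfrac12\log_2 3$ does not depend on $\rho_1$.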

## III An upper bound on the capacity of 2-layer N-relay Gaussian network

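In this section an upper bound on the capacity of the 2-layer $N$-relay network is given. A cut is a set of nodes consisting of the source together with ${}_1S\cup{}_2S$, where ${}_1S\subseteq[{}_1N]$ and ${}_2S\subseteq[{}_2N]$. The cardinalities of ${}_1S$ and ${}_2S$ are denoted $n$ and $m$ respectively. Also,

$${}_iS^c\triangleq[{}_iN]\setminus{}_iS,\qquad i\in\{1,2\}.$$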

The three cuts separating each layer from the previous layer are referred to as the broadcast cut, the multiple broadcast-access cut and the multiple access cut, respectively. The set of channels in every cut in the network is a union of disjoint subsets of the channels in these three cuts. Accordingly, a cut is regarded as having a broadcast part, a multiple broadcast-access part and a multiple access part. We start from the cut-set bound [7]:

$$C(N,r_{1,2,3})\le\sup_{X_{0,[{}_1N],[{}_2N]}}\ \min_{{}_1S\subseteq[{}_1N],\,{}_2S\subseteq[{}_2N]}I\big(X_{0,{}_1S,{}_2S};Y_{{}_1S^c,{}_2S^c,3}\,\big|\,X_{{}_1S^c,{}_2S^c}\big) \tag{2}$$

where the supremum is subject to the unit transmit power constraint at the nodes. Now,

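$$\begin{aligned}
I\big(X_{0,{}_1S,{}_2S};Y_{{}_1S^c,{}_2S^c,3}\,\big|\,X_{{}_1S^c,{}_2S^c}\big)
&=h\big(Y_{{}_1S^c}\big|X_{{}_1S^c,{}_2S^c}\big)-h\big(Y_{{}_1S^c}\big|X_{0,[{}_1N],[{}_2N]}\big)\\
&\quad+h\big(Y_{{}_2S^c}\big|Y_{{}_1S^c},X_{{}_1S^c,{}_2S^c}\big)-h\big(Y_{{}_2S^c}\big|Y_{{}_1S^c},X_{0,[{}_1N],[{}_2N]}\big)\\
&\quad+h\big(Y_3\big|Y_{{}_1S^c,{}_2S^c},X_{{}_1S^c,{}_2S^c}\big)-h\big(Y_3\big|Y_{{}_1S^c,{}_2S^c},X_{0,[{}_1N],[{}_2N]}\big)\\
&\le h\big(Y_{{}_1S^c}\big)+h\big(Y_{{}_2S^c}\big|X_{{}_1S^c}\big)+h\big(Y_3\big|X_{{}_1S^c,{}_2S^c}\big)\\
&\quad-h\big(Z_{{}_1S^c}\big)-h\big(Z_{{}_2S^c}\big)-h\big(Z_3\big)\\
&=h\big(Y_{{}_1S^c}\big)+h\big(Y_{{}_2S^c}\big|X_{{}_1S^c}\big)+h\big(Y_3\big|X_{{}_1S^c,{}_2S^c}\big)\\
&\quad-h\big(Y_{{}_1S^c}\big|X_0\big)-h\big(Y_{{}_2S^c}\big|X_{[{}_1N]}\big)-h\big(Y_3\big|X_{[{}_1N],[{}_2N]}\big)\\
&=I\big(X_0;Y_{{}_1S^c}\big)+I\big(X_{{}_1S};Y_{{}_2S^c}\big|X_{{}_1S^c}\big)\\
&\quad+I\big(X_{{}_1S,{}_2S};Y_3\big|X_{{}_1S^c,{}_2S^c}\big)
\end{aligned} \tag{3}$$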

From (2) and (3),

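$$C(N,r_{1,2,3})\le\sup_{X_0,X_{[{}_1N]},X_{[{}_2N]}}\ \min_{{}_1S\subseteq[{}_1N],\,{}_2S\subseteq[{}_2N]}\Big\{I\big(X_0;Y_{{}_1S^c}\big)+I\big(X_{{}_1S};Y_{{}_2S^c}\big|X_{{}_1S^c}\big)+I\big(X_{{}_1S,{}_2S};Y_3\big|X_{{}_1S^c,{}_2S^c}\big)\Big\}. \tag{4}$$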

Note that in (3) the three mutual information terms are associated with the broadcast, the multiple broadcast-access and the multiple access parts of a cut, respectively. The first term in (3), associated with the broadcast part, can be further upper bounded as

$$I\big(X_0;Y_{{}_1S^c}\big)\le\frac{1}{2}\log\big(1+|{}_1S^c|\,r_1\big). \tag{5}$$

Now, let us focus on the third term.

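$$\begin{aligned}
I\big(X_{{}_1S,{}_2S};Y_3\big|X_{{}_1S^c,{}_2S^c}\big)
&=h\big(Y_3\big|X_{{}_1S^c,{}_2S^c}\big)-h\big(Y_3\big|X_{[{}_1N],[{}_2N]}\big) &&(6)\\
&=h\Big(\sqrt{r_3}\sum_{{}_2j\in{}_2S}\big(X_{{}_2j}-f_{{}_2j}(X_{{}_1S^c,{}_2S^c})\big)+Z_3\,\Big|\,X_{{}_1S^c,{}_2S^c}\Big)-h(Z_3)\\
&\le h\Big(\sqrt{r_3}\sum_{{}_2j\in{}_2S}\big(X_{{}_2j}-f_{{}_2j}(X_{{}_1S^c,{}_2S^c})\big)+Z_3\Big)-h(Z_3) &&(7)
\end{aligned}$$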

The equality (6) can also be written as

$$I\big(X_{{}_1S},X_{{}_2S};Y_3\big|X_{{}_1S^c},X_{{}_2S^c}\big)=h\big(Y_3\big|X_{{}_2S^c}\big)-I\big(Y_3;X_{{}_1S^c}\big|X_{{}_2S^c}\big)-h\big(Y_3\big|X_{[{}_1N],[{}_2N]}\big)$$

and if we ignore the term $I(Y_3;X_{{}_1S^c}|X_{{}_2S^c})$ then the remaining terms can be further upper bounded by the ND bound. We use the additional information available across the cut in an attempt to obtain a tighter bound.

In (7), the $f_{{}_2j}$ can be any functions. But, to obtain the tightest possible upper bound, it is natural to take $f_{{}_2j}$ to be the “best” estimator of $X_{{}_2j}$ given $X_{{}_1S^c,{}_2S^c}$. If we restrict it to be the minimum mean square error (MMSE) estimator for approximating $X_{{}_2j}$ based on $X_{{}_1S^c,{}_2S^c}$, then the covariance matrix of the vector of random variables $\big(X_{{}_2j}-f_{{}_2j}(X_{{}_1S^c,{}_2S^c}):{}_2j\in{}_2S\big)$ (note that these random variables are the MMSE estimation errors [8]) can be represented as [9]

$$Q_{{}_2S\cdot{}_1S^c{}_2S^c}=Q_{{}_2S,{}_2S}-Q_{{}_2S,{}_1S^c{}_2S^c}\,Q^{-}_{{}_1S^c{}_2S^c,{}_1S^c{}_2S^c}\,Q_{{}_1S^c{}_2S^c,{}_2S} \tag{8}$$

where $Q_{A,B}$ is the sub-matrix of the covariance matrix $Q$ (which is, by definition, positive semidefinite) of $X_{[{}_1N],[{}_2N]}$ corresponding to the rows for $A$ and the columns for $B$. Also, $Q^{-}_{{}_1S^c{}_2S^c,{}_1S^c{}_2S^c}$ is the Moore-Penrose generalized inverse of the matrix $Q_{{}_1S^c{}_2S^c,{}_1S^c{}_2S^c}$. The matrix $Q_{{}_2S\cdot{}_1S^c{}_2S^c}$ is the generalized Schur complement of $Q_{{}_1S^c{}_2S^c,{}_1S^c{}_2S^c}$ in the covariance matrix of $X_{{}_2S,{}_1S^c,{}_2S^c}$. If $Q_{{}_1S^c{}_2S^c,{}_1S^c{}_2S^c}$ is invertible, then its generalized inverse is also its inverse, and hence the generalized Schur complement reduces to the Schur complement. The MMSE has been used to obtain upper bounds on the capacity of symmetric channel models in [10] as well as [3]. A pictorial presentation of the correlation matrix and the submatrices associated with a generic cut is given in Figure 3.
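The generalized Schur complement in (8) can be sketched numerically as follows. This is a minimal illustration, assuming NumPy's `numpy.linalg.pinv` as the Moore-Penrose generalized inverse; the function name and the small example covariance matrix are hypothetical and chosen only so that the conditioning block is singular.

```python
import numpy as np

def generalized_schur_complement(Q, rows, cond):
    """Q_{rows.cond} = Q_rr - Q_rc Q_cc^- Q_cr, with Q_cc^- the pseudo-inverse."""
    rows = np.asarray(rows, dtype=int)
    cond = np.asarray(cond, dtype=int)
    Q_rr = Q[np.ix_(rows, rows)]
    if cond.size == 0:
        # Nothing to condition on: the error covariance is Q_rr itself.
        return Q_rr.copy()
    Q_rc = Q[np.ix_(rows, cond)]
    Q_cc = Q[np.ix_(cond, cond)]
    return Q_rr - Q_rc @ np.linalg.pinv(Q_cc) @ Q_rc.T

# Example (assumed values): three unit-variance variables where the last two
# are fully correlated, so the conditioning block Q_cc is singular.
Q = np.array([[1.0, 0.5, 0.5],
              [0.5, 1.0, 1.0],
              [0.5, 1.0, 1.0]])
err = generalized_schur_complement(Q, rows=[0], cond=[1, 2])
```

In this example the two conditioning variables carry the same information, so the error variance `err[0, 0]` equals $1-0.5^2=0.75$, the same value obtained by conditioning on only one of them.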

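Again, we continue with the notation in [3]: $I_N$ denotes the $N\times N$ identity matrix and $\mathbf{1}_{N,N}$ denotes the $N\times N$ matrix of ones. We drop the subscripts when the dimension is clear from context. Then (7) can be further upper bounded as follows.

$$h\Big(\sqrt{r_3}\sum_{{}_2j\in{}_2S}\big(X_{{}_2j}-f_{{}_2j}(X_{{}_1S^c,{}_2S^c})\big)+Z_3\Big)-h(Z_3)\le\frac{1}{2}\log\big(1+r_3\,\mathbf{1}^TQ_{{}_2S\cdot{}_1S^c{}_2S^c}\mathbf{1}\big) \tag{9}$$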

The second term in (3) is associated with the multiple broadcast-access part of a cut.

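$$\begin{aligned}
I\big(X_{{}_1S};Y_{{}_2S^c}\big|X_{{}_1S^c}\big)
&=h\big(Y_{{}_2S^c}\big|X_{{}_1S^c}\big)-h\big(Y_{{}_2S^c}\big|X_{[{}_1N]}\big)\\
&=h\Big(\Big(\sqrt{r_2}\sum_{{}_1i\in{}_1S}\big(X_{{}_1i}-f_{{}_1i}(X_{{}_1S^c})\big)+Z_{{}_2j}\Big):{}_2j\in{}_2S^c\,\Big|\,X_{{}_1S^c}\Big)-h\big(Z_{{}_2j}:{}_2j\in{}_2S^c\big)\\
&\le h\Big(\Big(\sqrt{r_2}\sum_{{}_1i\in{}_1S}\big(X_{{}_1i}-f_{{}_1i}(X_{{}_1S^c})\big)+Z_{{}_2j}\Big):{}_2j\in{}_2S^c\Big)-h\big(Z_{{}_2j}:{}_2j\in{}_2S^c\big)\\
&\le\frac{1}{2}\,|{}_2S^c|\log\big(1+r_2\,\mathbf{1}^TQ_{{}_1S\cdot{}_1S^c}\mathbf{1}\big)
\end{aligned} \tag{10}$$

where $Q_{{}_1S\cdot{}_1S^c}=Q_{{}_1S,{}_1S}-Q_{{}_1S,{}_1S^c}\,Q^{-}_{{}_1S^c,{}_1S^c}\,Q_{{}_1S^c,{}_1S}$ is the Schur complement, $Q_{{}_1S,{}_1S}$, $Q_{{}_1S,{}_1S^c}$ and $Q_{{}_1S^c,{}_1S}$ are sub-matrices of $Q$, and $Q^{-}_{{}_1S^c,{}_1S^c}$ is the Moore-Penrose generalized inverse of $Q_{{}_1S^c,{}_1S^c}$.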

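Note that “most” of the correlation between $X_{{}_1i}$ and $X_{{}_1S^c}$ for all ${}_1i\in{}_1S$ is “taken care of” by subtracting the MMSE estimator. Hence, using the independence inequality for entropies (independence bound) is not the worst way of bounding the term. But (10) is still crude, since the correlation among the received signals $Y_{{}_2j}$ for all ${}_2j\in{}_2S^c$ is not utilized in upper bounding the term. We now obtain a different bound as follows. Let $W\triangleq\sqrt{r_2}\sum_{{}_1i\in{}_1S}\big(X_{{}_1i}-f_{{}_1i}(X_{{}_1S^c})\big)$ and, without loss of generality, assume that the relays in ${}_2S^c$ are labeled ${}_21,{}_22,\ldots$; then

$$\begin{aligned}
I\big(X_{{}_1S};Y_{{}_2S^c}\big|X_{{}_1S^c}\big)
&=h\big(Y_{{}_2S^c}\big|X_{{}_1S^c}\big)-h\big(Y_{{}_2S^c}\big|X_{[{}_1N]}\big)\\
&\le h\big((W+Z_{{}_2j}):{}_2j\in{}_2S^c\big)-h\big(Z_{{}_2S^c}\big)\\
&=\sum_{{}_2j={}_21}^{|{}_2S^c|}h\big(W+Z_{{}_2j}\,\big|\,W+Z_{{}_2(j-1)},\ldots,W+Z_{{}_21}\big)-h\big(Z_{{}_2S^c}\big)\\
&\le\sum_{{}_2j={}_22}^{|{}_2S^c|}h\big(Z_{{}_2j}-Z_{{}_2(j-1)}\big)+h\big(W+Z_{{}_21}\big)-h\big(Z_{{}_2S^c}\big)\\
&=\big(|{}_2S^c|-1\big)\,h\big(Z_{{}_2j}-Z_{{}_2j'}\big)+h\big(W+Z_{{}_2k}\big)-h\big(Z_{{}_2S^c}\big)\\
&=\big(|{}_2S^c|-1\big)\tfrac{1}{2}\log 4\pi e\sigma^2+h\big(W+Z_{{}_2k}\big)-|{}_2S^c|\,\tfrac{1}{2}\log 2\pi e\sigma^2\\
&=\big(|{}_2S^c|-1\big)\tfrac{1}{2}\log 4\pi e\sigma^2+h\big(W+Z_{{}_2k}\big)-\big(|{}_2S^c|-1\big)\tfrac{1}{2}\log 2\pi e\sigma^2-\tfrac{1}{2}\log 2\pi e\sigma^2\\
&=\frac{|{}_2S^c|-1}{2}+h\big(W+Z_{{}_2k}\big)-\frac{1}{2}\log 2\pi e\sigma^2
\end{aligned} \tag{11}$$

where ${}_2j,{}_2j',{}_2k\in{}_2S^c$ and $\tfrac{1}{2}\log 2\pi e\sigma^2$ is the differential entropy of a Gaussian distribution with variance $\sigma^2$. Note that the inequality (11) is valid only for nonempty ${}_2S^c$ (for ${}_2S^c=\emptyset$, $I(X_{{}_1S};Y_{{}_2S^c}|X_{{}_1S^c})=0$).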
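The two ways of bounding the multiple broadcast-access term can be compared numerically. The sketch below assumes base-2 logarithms (so that the entropy difference in (11) is exactly half a bit per noise pair) and writes `snr` for the quantity $r_2\,\mathbf{1}^TQ_{{}_1S\cdot{}_1S^c}\mathbf{1}$; the function names are illustrative only. The chain-rule bound uses $h(W+Z)-\tfrac12\log 2\pi e\sigma^2\le\tfrac12\log(1+\text{snr})$, which is the step that leads from (11) to the second bound of Lemma 1.

```python
import math

def gaussian_entropy_bits(var):
    """Differential entropy (bits) of a Gaussian with the given variance."""
    return 0.5 * math.log2(2 * math.pi * math.e * var)

def bound_independence(k, snr):
    """Bound of the form (10): k = |2S^c| independent-looking branches."""
    return 0.5 * k * math.log2(1 + snr)

def bound_chain_rule(k, snr):
    """Bound obtained from (11); only meaningful for k >= 1."""
    if k < 1:
        raise ValueError("the chain-rule bound needs a nonempty set 2S^c")
    return 0.5 * (k - 1) + 0.5 * math.log2(1 + snr)

# Difference of the two Gaussian entropies used in (11): variance 2 vs 1.
half_bit = gaussian_entropy_bits(2.0) - gaussian_entropy_bits(1.0)
```

The difference between the two bounds is $\tfrac{k-1}{2}\big(\log_2(1+\text{snr})-1\big)$, so under these assumptions the chain-rule form is the smaller one exactly when $k\ge 2$ and $\text{snr}>1$.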

###### Lemma 1

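From (4), (5), (9), (10) and (11), we have (12) for any ${}_2S^c$ and (13) for nonempty ${}_2S^c$.

$$C(N,r_{1,2,3})\le\sup_{Q_{[{}_1N],[{}_2N]}}\ \min_{{}_1S\subseteq[{}_1N],\,{}_2S\subseteq[{}_2N]}\frac{1}{2}\Big(\log\big(1+|{}_1S^c|\,r_1\big)+|{}_2S^c|\log\big(1+r_2\,\mathbf{1}^TQ_{{}_1S\cdot{}_1S^c}\mathbf{1}\big)+\log\big(1+r_3\,\mathbf{1}^TQ_{{}_2S\cdot{}_1S^c{}_2S^c}\mathbf{1}\big)\Big) \tag{12}$$

$$C(N,r_{1,2,3})\le\sup_{Q_{[{}_1N],[{}_2N]}}\ \min_{{}_1S\subseteq[{}_1N],\,{}_2S\subseteq[{}_2N]}\frac{1}{2}\Big(\log\big(1+|{}_1S^c|\,r_1\big)+|{}_2S^c|-1+\log\big(1+r_2\,\mathbf{1}^TQ_{{}_1S\cdot{}_1S^c}\mathbf{1}\big)+\log\big(1+r_3\,\mathbf{1}^TQ_{{}_2S\cdot{}_1S^c{}_2S^c}\mathbf{1}\big)\Big) \tag{13}$$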
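For a fixed covariance matrix, the inner minimization in Lemma 1 can be evaluated by brute force when $N$ is small. The following sketch is only an illustration under stated assumptions: layer-1 relays are indexed $0,\ldots,N-1$ and layer-2 relays $N,\ldots,2N-1$ in `Q`, logarithms are base 2, for each cut the smaller of the (12) and (13) expressions is used (with (13) only for nonempty ${}_2S^c$), and the supremum over covariance matrices is not attempted. All function names are hypothetical.

```python
from itertools import chain, combinations
import numpy as np

def ones_quadratic_schur(Q, rows, cond):
    """1^T (Q_rr - Q_rc Q_cc^- Q_cr) 1 with the Moore-Penrose pseudo-inverse."""
    rows, cond = np.asarray(rows, dtype=int), np.asarray(cond, dtype=int)
    if rows.size == 0:
        return 0.0
    S = Q[np.ix_(rows, rows)]
    if cond.size > 0:
        Q_rc = Q[np.ix_(rows, cond)]
        S = S - Q_rc @ np.linalg.pinv(Q[np.ix_(cond, cond)]) @ Q_rc.T
    return float(S.sum())

def lemma1_cut_value(Q, S1, S2, N, r1, r2, r3):
    """Smaller of the (12) and (13) expressions for the cut (S1, S2)."""
    S1c = [i for i in range(N) if i not in S1]
    S2c = [j for j in range(N, 2 * N) if j not in S2]
    a = np.log2(1 + len(S1c) * r1)
    b = np.log2(1 + r2 * ones_quadratic_schur(Q, S1, S1c))
    c = np.log2(1 + r3 * ones_quadratic_schur(Q, S2, S1c + S2c))
    v12 = 0.5 * (a + len(S2c) * b + c)
    if not S2c:
        return v12  # (13) applies only to nonempty 2S^c
    v13 = 0.5 * (a + len(S2c) - 1 + b + c)
    return min(v12, v13)

def subsets(items):
    items = list(items)
    return chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))

def min_over_cuts(Q, N, r1, r2, r3):
    """Minimum over all 4^N cuts for a fixed covariance matrix Q."""
    return min(
        lemma1_cut_value(Q, list(S1), list(S2), N, r1, r2, r3)
        for S1 in subsets(range(N))
        for S2 in subsets(range(N, 2 * N))
    )
```

With independent inputs (`Q = np.eye(4)`, $N=2$) and unit gains, the minimum is attained by the broadcast cut and by the multiple access cut, both giving $\tfrac12\log_2 3$.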

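Now we focus on the form of the covariance matrix. It can be shown, using a time-sharing argument and the symmetry of the network, that we can restrict our attention to $Q_{[{}_1N],[{}_2N]}$ of the form

$$\begin{pmatrix}\rho_1\mathbf{1}_{N,N}+(1-\rho_1)I_N & \rho_{12}\mathbf{1}_{N,N}\\ \rho_{12}\mathbf{1}_{N,N} & \rho_2\mathbf{1}_{N,N}+(1-\rho_2)I_N\end{pmatrix} \tag{14}$$

without loss of optimality, which is equivalent to (15), where the parameter $\rho_i$, $i\in\{1,2\}$, represents the correlation between $X_{{}_ij}$ and $X_{{}_ik}$, $j\neq k$, and $\rho_{12}$ represents the correlation between $X_{{}_1j}$ and $X_{{}_2k}$, $j,k\in\{1,\ldots,N\}$.

$$\left(\begin{array}{cccc|cccc}
1&\rho_1&\cdots&\rho_1&\rho_{12}&\rho_{12}&\cdots&\rho_{12}\\
\rho_1&1&\cdots&\rho_1&\rho_{12}&\rho_{12}&\cdots&\rho_{12}\\
\vdots&\vdots&\ddots&\vdots&\vdots&\vdots&\ddots&\vdots\\
\rho_1&\rho_1&\cdots&1&\rho_{12}&\rho_{12}&\cdots&\rho_{12}\\ \hline
\rho_{12}&\rho_{12}&\cdots&\rho_{12}&1&\rho_2&\cdots&\rho_2\\
\rho_{12}&\rho_{12}&\cdots&\rho_{12}&\rho_2&1&\cdots&\rho_2\\
\vdots&\vdots&\ddots&\vdots&\vdots&\vdots&\ddots&\vdots\\
\rho_{12}&\rho_{12}&\cdots&\rho_{12}&\rho_2&\rho_2&\cdots&1
\end{array}\right) \tag{15}$$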

### Sketch of proof

First fix a cut under consideration and suppose that a matrix of some other form is optimal. Note that all matrices obtained from it by permuting the relay labels within each layer are also optimal, due to symmetry. Now, time sharing between all such matrices is also optimal. By time sharing over all these matrices for equal time durations, the covariance matrix of the resulting time-sharing scheme has the form (15).
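The averaging step in this sketch of proof can be illustrated numerically: averaging a covariance matrix over all relabelings of the relays within each layer yields a matrix with the structure of (15). The code below is a small demonstration under assumptions (a randomly generated correlation matrix, $N=3$, layer-1 relays indexed first); it only mimics equal-duration time sharing by averaging and does not prove optimality.

```python
import itertools
import numpy as np

def symmetrize(Q, N):
    """Average Q over all permutations of the labels within each layer."""
    perms = list(itertools.permutations(range(N)))
    avg = np.zeros_like(Q)
    for p1 in perms:
        for p2 in perms:
            # Relabel layer-1 relays by p1 and layer-2 relays by p2.
            idx = np.array(list(p1) + [N + j for j in p2])
            avg += Q[np.ix_(idx, idx)]
    return avg / len(perms) ** 2

# Assumed example: a random correlation matrix for 2N = 6 relay inputs.
N = 3
rng = np.random.default_rng(0)
A = rng.standard_normal((2 * N, 2 * N))
C = A @ A.T
d = np.sqrt(np.diag(C))
Q = C / np.outer(d, d)          # unit diagonal, positive semidefinite

Qs = symmetrize(Q, N)
rho1, rho2, rho12 = Qs[0, 1], Qs[N, N + 1], Qs[0, N]
```

After averaging, `Qs` has a constant off-diagonal value within each diagonal block and a constant cross block, so three numbers (`rho1`, `rho2`, `rho12`) describe it completely.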

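The positive semidefinite property of a covariance matrix restricts the range of the correlation coefficients as follows (see Appendix A for details).

$$\rho_1,\rho_2\in[-1,1],\qquad\rho_{12}\in\left[-1,\ \min\left\{\frac{1+(N-1)\rho_1}{N},\frac{1+(N-1)\rho_2}{N}\right\}\right]\triangleq\zeta$$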
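Whether a particular choice of correlation coefficients gives a valid covariance matrix can also be checked numerically. The sketch below builds the matrix (14) and tests positive semidefiniteness through its smallest eigenvalue; the function names and the tolerance are assumptions, and the check is a numerical aid rather than a substitute for the derivation in Appendix A.

```python
import numpy as np

def layered_cov(N, rho1, rho2, rho12):
    """Covariance matrix of the form (14) for N relays per layer."""
    J, I = np.ones((N, N)), np.eye(N)
    return np.block([
        [rho1 * J + (1 - rho1) * I, rho12 * J],
        [rho12 * J, rho2 * J + (1 - rho2) * I],
    ])

def is_psd(Q, tol=1e-10):
    """True if the symmetric matrix Q has no eigenvalue below -tol."""
    return bool(np.linalg.eigvalsh(Q).min() >= -tol)

ok = is_psd(layered_cov(3, 0.2, 0.2, 0.1))    # mild correlations
bad = is_psd(layered_cov(3, 0.0, 0.0, 0.9))   # strong cross-layer correlation only
```

In the second example the layers are internally uncorrelated but strongly correlated with each other, which no joint distribution can realize for $N=3$, so the check fails.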

Note that $\mathbf{1}^TQ_{{}_1S\cdot{}_1S^c}\mathbf{1}=\psi(n,\rho_1)$ is already derived in [3], and the valid range for $\rho_1$ is also given there. Now we turn our attention to deriving an expression for $\mathbf{1}^TQ_{{}_2S\cdot{}_1S^c{}_2S^c}\mathbf{1}$. This involves finding the inverse of the sub-matrices associated with cuts (see Appendix B for details) and the Schur complement (see Appendix C for details). The resulting expression, denoted $\phi(n,m,\rho_{1,2,12})$, is given in (16) (see Appendix D).

Note that, If , then (we take the convention , and )

$$\mathbf{1}^TQ_{{}_2S\cdot{}_1S^c{}_2S^c}\mathbf{1}=0. \tag{17}$$

Since the derivation of $\phi$ involves (or assumes) inverting $Q_{{}_1S^c{}_2S^c,{}_1S^c{}_2S^c}$, we cannot use the function $\phi$ as a representation of $\mathbf{1}^TQ_{{}_2S\cdot{}_1S^c{}_2S^c}\mathbf{1}$ when, for certain values of the correlation coefficients, the submatrices corresponding to cuts are not invertible. In particular, for the following cases the matrix $Q_{{}_1S^c{}_2S^c,{}_1S^c{}_2S^c}$ is not invertible and we use the Moore-Penrose generalized inverse to find the generalized Schur complement.

• At the other extreme (compared to the situation in (17)) if then

$$\mathbf{1}^TQ_{{}_2S\cdot{}_1S^c{}_2S^c}\mathbf{1}=N\big(1+(N-1)\rho_2\big). \tag{18}$$
• If and then

$$\mathbf{1}^TQ_{{}_2S\cdot{}_1S^c{}_2S^c}\mathbf{1}=N\big(1+(N-1)\rho_2-N\rho_{12}^2\big). \tag{19}$$
• If and then

$$\mathbf{1}^TQ_{{}_2S\cdot{}_1S^c{}_2S^c}\mathbf{1}=0 \tag{20}$$
• If and , then

$$\mathbf{1}^TQ_{{}_2S\cdot{}_1S^c{}_2S^c}\mathbf{1}=0. \tag{21}$$

Now, for these values the function $\phi$ is undefined:

$$\rho_{1,2,12}=1 \tag{22}$$

$$\rho_1=-\frac{1}{N-1},\quad n=0 \tag{23}$$

Note that $\phi$ seems undefined if we substitute the value of a cut-size parameter and the value of a correlation coefficient simultaneously. But if we evaluate on a case-by-case (cut-by-cut) basis, i.e., first fix a cut and thus the value of the cut-size parameter, evaluate the function for this value, and then evaluate the function at the value of the correlation coefficient (e.g., 1), then $\phi$ is in fact defined. The same holds for two other such combinations of values.

But, for the parameter values (22), we have

$$\lim_{\rho_{12}\uparrow 1}\phi(n,m,\rho_1=1,\rho_2=1,\rho_{12})=\mathbf{1}^TQ_{{}_2S\cdot{}_1S^c{}_2S^c}\mathbf{1}\,\Big|_{\rho_{1,2,12}=1} \tag{24}$$

which follows from (17)-(21). Now, note that for the situation (23), $\phi$ is undefined (the limit is positive infinity) only when $n=0$ and $m\neq 0$. But $n=0$ implies the broadcast cut, and $m\neq 0$ is practically meaningless since the channels in the broadcast cut already separate the source from the destination (and hence there is no need to add more channels to the cut). As such, we can assume that if $n=0$ then $m=0$, and thus avoid the situations in which $\phi$ is undefined. Actually, it may be assumed that without loss of generality, but this assumption is unnecessary (does not lead to undefined function) for most cases except when .

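There may be still more parameter values for which the matrix is not invertible. To examine these values we first apply a rank-preserving transformation to the matrix (15), as shown in (25). Then we check the parameter values for which the relevant submatrices are non-invertible.

$$\left(\begin{array}{ccccc|ccccc}
1&\rho_1&\rho_1&\cdots&\rho_1&\rho_{12}&\rho_{12}&\rho_{12}&\cdots&\rho_{12}\\
\rho_1-1&1-\rho_1&0&\cdots&0&0&0&0&\cdots&0\\
\rho_1-1&0&1-\rho_1&\cdots&0&0&0&0&\cdots&0\\
\vdots&\vdots&\vdots&\ddots&\vdots&\vdots&\vdots&\vdots&\ddots&\vdots\\
\rho_1-1&0&0&\cdots&1-\rho_1&0&0&0&\cdots&0\\ \hline
0&0&0&\cdots&0&1-\rho_2&\cdots&0&0&\rho_2-1\\
\vdots&\vdots&\vdots&\ddots&\vdots&\vdots&\ddots&\vdots&\vdots&\vdots\\
0&0&0&\cdots&0&0&\cdots&1-\rho_2&0&\rho_2-1\\
0&0&0&\cdots&0&0&\cdots&0&1-\rho_2&\rho_2-1\\
\rho_{12}&\rho_{12}&\rho_{12}&\cdots&\rho_{12}&\rho_2&\cdots&\rho_2&\rho_2&1
\end{array}\right) \tag{25}$$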

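Now, note that the submatrices associated with this new matrix (25) are not invertible (that is, the submatrices are not full-rank) in two further cases. For the first case, in which $\rho_1=\rho_2=1$ and $\rho_{12}\to-1$, the limit of the function $\phi$ is as follows.

$$\lim_{\rho_{12}\downarrow-1}\phi(n,m,\rho_1=1,\rho_2=1,\rho_{12})=0 \tag{26}$$

For the second case, in which

$$Q_{{}_1S^c{}_2S^c,{}_1S^c{}_2S^c}=\begin{pmatrix}1&-1\\-1&1\end{pmatrix},$$

we have

$$\mathbf{1}^TQ_{{}_2S\cdot{}_1S^c{}_2S^c}\mathbf{1}=(N-1)\left(1+(N-2)\rho_2-\frac{(N-1)(1+\rho_2)^2}{4}\right) \tag{27}$$

$$\triangleq\mu(\rho_2) \tag{28}$$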

whereas as except for or for which the limit and (27) are zero. Hence, for this special case, we must rely on (27) to evaluate the bound. Also, the right hand side of (27) is positive when these two conditions are satisfied: and .
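The expression (27) can be reproduced with the Moore-Penrose generalized inverse. The sketch below assumes the configuration implied by the $2\times 2$ matrix above: one relay in each of ${}_1S^c$ and ${}_2S^c$, $N-1$ relays in ${}_2S$, and $\rho_{12}=-1$. It checks only the algebra of (27); it does not check whether these parameter values correspond to a valid (positive semidefinite) covariance matrix. Function names are illustrative.

```python
import numpy as np

def mu(N, rho2):
    """Closed form (27)-(28)."""
    return (N - 1) * (1 + (N - 2) * rho2 - (N - 1) * (1 + rho2) ** 2 / 4)

def mu_numeric(N, rho2):
    """1^T Q_{2S . 1S^c 2S^c} 1 via the pseudo-inverse, for rho12 = -1."""
    k = N - 1                                   # |2S| = N - 1
    Q_ss = rho2 * np.ones((k, k)) + (1 - rho2) * np.eye(k)
    # Correlation of each relay in 2S with the 1S^c relay (rho12 = -1)
    # and with the 2S^c relay (rho2).
    Q_sc = np.column_stack([-np.ones(k), rho2 * np.ones(k)])
    Q_cc = np.array([[1.0, -1.0],
                     [-1.0, 1.0]])              # singular conditioning block
    schur = Q_ss - Q_sc @ np.linalg.pinv(Q_cc) @ Q_sc.T
    return float(schur.sum())
```

For example, with $N=4$ and $\rho_2=0.5$ both functions return $0.9375$, and both vanish at $\rho_2=1$.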

Combining these results with (12)-(13) renders the following bounds.

###### Lemma 2

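The capacity of the 2-layer $N$-relay Gaussian network is upper bounded as

$$C(N,r_{1,2,3})\le\sup_{\rho_1,\rho_2\in[-1,1),\ \rho_{12}\in\zeta}\ \min_{n,m\in\{0,\ldots,N\}}\frac{1}{2}\Big(\log\big(1+(N-n)r_1\big)+(N-m)\log\big(1+\psi(n,\rho_1)r_2\big)+\log\big(1+\phi(n,m,\rho_{1,2,12})r_3\big)\Big) \tag{29}$$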

for and

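$$C(N,r_{1,2,3})\le\sup_{\rho_1,\rho_2\in[-1,1),\ \rho_{12}\in\zeta}\ \min_{n,m\in\{0,\ldots,N\}}\frac{1}{2}\Big(\log\big(1+(N-n)r_1\big)+N-m-1+\log\big(1+\psi(n,\rho_1)r_2\big)+\log\big(1+\phi(n,m,\rho_{1,2,12})r_3\big)\Big). \tag{30}$$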

for where, if then for both (29) and (30) and if and then . For the special case when and we have

 C(N,r1,2,3) ≤12(log(1+r1) \ \ +supρ1