# Second-Order Rate Region of Constant-Composition Codes for the Multiple-Access Channel

## Abstract

This paper studies the second-order asymptotics of coding rates for the discrete memoryless multiple-access channel with a fixed target error probability. Using constant-composition random coding, coded time-sharing, and a variant of Hoeffding’s combinatorial central limit theorem, an inner bound on the set of locally achievable second-order coding rates is given for each point on the boundary of the capacity region. It is shown that the inner bound for constant-composition random coding includes that recovered by i.i.d. random coding, and that the inclusion may be strict. The inner bound is extended to the Gaussian multiple-access channel via an increasingly fine quantization of the inputs.

## 1 Introduction

The channel capacity describes the highest rate of transmission with vanishing error probability in coded communication systems. Further characterizations of the system performance are given by error exponents [1, Ch. 9], moderate deviations results [2], and second-order coding rates [3]. Second-order coding rates have regained significant attention in recent years [4, 5], and are well understood for a variety of settings. For discrete memoryless channels, the maximum number of codewords of length $n$ yielding an error probability not exceeding $\epsilon$, denoted by $M^*(n,\epsilon)$, satisfies [3]

$$\log M^*(n,\epsilon) = nC - \sqrt{nV}\,Q^{-1}(\epsilon) + o(\sqrt{n}), \tag{1}$$

where $C$ is the channel capacity, $Q^{-1}(\cdot)$ is the functional inverse of the standard Gaussian tail probability $Q(\cdot)$, and $V$ is known as the channel dispersion. Expansions of the form (1) provide additional insight into the system performance beyond the capacity alone by quantifying the rate of convergence.
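As a rough numerical illustration of (1), the sketch below evaluates the first two terms of the expansion, ignoring the $o(\sqrt{n})$ remainder. The binary symmetric channel, its crossover probability 0.11, and the function names are assumptions made for this example only; they do not appear in the paper. The capacity and dispersion expressions for the binary symmetric channel are the standard ones, in nats.

```python
from math import log, sqrt
from statistics import NormalDist


def q_inv(eps):
    """Inverse of the standard Gaussian tail probability Q(x) = P[N(0,1) > x]."""
    return NormalDist().inv_cdf(1.0 - eps)


def bsc_capacity_dispersion(p):
    """Capacity C (nats) and dispersion V (nats^2) of a BSC with crossover p."""
    h = -p * log(p) - (1 - p) * log(1 - p)  # binary entropy in nats
    C = log(2) - h
    V = p * (1 - p) * log((1 - p) / p) ** 2
    return C, V


def normal_approx_log_m(n, eps, C, V):
    """First two terms of (1): n*C - sqrt(n*V) * Q^{-1}(eps)."""
    return n * C - sqrt(n * V) * q_inv(eps)


if __name__ == "__main__":
    C, V = bsc_capacity_dispersion(0.11)  # hypothetical example channel
    for n in (100, 1000, 10000):
        rate = normal_approx_log_m(n, 1e-3, C, V) / n
        print(f"n={n:6d}  approx. rate = {rate:.4f} nats/use  (capacity {C:.4f})")
```

For $\epsilon < 1/2$ the approximate rate lies below the capacity and approaches it at speed $1/\sqrt{n}$, which is the convergence behaviour that the dispersion quantifies.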

In this paper, we study the second-order asymptotics of the multiple-access channel (MAC). Achievability results for this problem have previously been obtained using i.i.d. random coding with a random time-sharing sequence [6, 7] and a deterministic time-sharing sequence [8], whereas we demonstrate improved asymptotic bounds via the use of constant-composition random coding [1, Ch. 9]. A key tool in our analysis is a Berry-Esseen theorem associated with a variant of Hoeffding’s combinatorial central limit theorem (CLT) [9]. We consider a local notion of second-order achievability proposed by Nomura and Han [10], in which the second-order coding rates (e.g. $-\sqrt{V}Q^{-1}(\epsilon)$ in (1)) of the users are sought for a fixed point on the boundary of the capacity region.

### 1.1 Notation

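The set of all probability distributions on an alphabet $\mathcal{X}$ is denoted by $\mathcal{P}(\mathcal{X})$, and the set of conditional distributions on $\mathcal{Y}$ given $\mathcal{X}$ is denoted by $\mathcal{P}(\mathcal{Y}|\mathcal{X})$. Given a distribution $Q(x)$ and a conditional distribution $W(y|x)$, the joint distribution $Q(x)W(y|x)$ is denoted by $Q \times W$. The set of all empirical distributions (i.e. types [11, Ch. 2]) for sequences in $\mathcal{X}^n$ is denoted by $\mathcal{P}_n(\mathcal{X})$. The set of all sequences of length $n$ with a given type $P_X$ is denoted by $T^n(P_X)$, and similarly for joint types. Given a sequence $\boldsymbol{x} \in T^n(P_X)$ and a conditional distribution $P_{Y|X}$, we define $T^n_{\boldsymbol{x}}(P_{Y|X})$ to be the set of sequences $\boldsymbol{y}$ such that $(\boldsymbol{x},\boldsymbol{y}) \in T^n(P_X \times P_{Y|X})$.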

Bold symbols are used for vectors and matrices (e.g. $\boldsymbol{x}$), and the corresponding $i$-th entry of a vector is written using a subscript (e.g. $x_i$). The vectors (or matrices) of all zeros and all ones are denoted by $\boldsymbol{0}$ and $\boldsymbol{1}$ respectively, and the identity matrix is denoted by $\mathbb{I}$; the sizes will be clear from the context. The symbols $\prec$, $\preceq$, etc. denote element-wise inequalities for vectors, and inequalities on the positive semidefinite cone for matrices (e.g. $\boldsymbol{V} \succ \boldsymbol{0}$ means $\boldsymbol{V}$ is positive definite). We denote the $\ell_2$-norm of a vector $\boldsymbol{z}$ by $\|\boldsymbol{z}\|_2$, and the maximum absolute value of the entries of a vector or matrix by $\|\cdot\|_\infty$. We denote the transpose of a vector or matrix by $(\cdot)^T$, the inverse of a matrix by $(\cdot)^{-1}$, the positive definite matrix square root by $(\cdot)^{1/2}$, and its inverse by $(\cdot)^{-1/2}$. The multivariate Gaussian distribution with mean $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$ is denoted by $N(\boldsymbol{\mu},\boldsymbol{\Sigma})$.

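We denote the cross-covariance matrix of two random vectors by $\mathrm{Cov}[\boldsymbol{Z}_1,\boldsymbol{Z}_2]$, and we write $\mathrm{Cov}[\boldsymbol{Z}]$ in place of $\mathrm{Cov}[\boldsymbol{Z},\boldsymbol{Z}]$. The variance of a scalar random variable is denoted by $\mathrm{Var}[\cdot]$. Logarithms have base $e$, and all rates are in nats except in the examples, where bits are used. We denote the indicator function by $\mathbb{1}\{\cdot\}$. For a set $\mathcal{A}$ of real numbers (or vectors) and a constant (or vector) $b$, we write $b + \mathcal{A}$ (or $\mathcal{A} + b$) to denote the set $\{a + b : a \in \mathcal{A}\}$. We similarly write $c\mathcal{A}$ for a given constant $c$.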

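For two sequences $f_n$ and $g_n$, we write $f_n = O(g_n)$ if $|f_n| \le c|g_n|$ for some $c$ and sufficiently large $n$, and $f_n = o(g_n)$ if $\lim_{n\to\infty} f_n/g_n = 0$. We write $f_n = \Theta(g_n)$ if $f_n = O(g_n)$ and $g_n = O(f_n)$.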

### 1.2 System Setup and Definitions

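We consider a two-user discrete memoryless MAC (DM-MAC) $W(y|x_1,x_2)$ with input alphabets $\mathcal{X}_1$ and $\mathcal{X}_2$ and output alphabet $\mathcal{Y}$, yielding an $n$-letter transition law given by $W^n(\boldsymbol{y}|\boldsymbol{x}_1,\boldsymbol{x}_2) \triangleq \prod_{i=1}^n W(y_i|x_{1,i},x_{2,i})$. The encoders and decoder operate as follows. Encoder $\nu = 1,2$ takes as input a message $m_\nu$ equiprobable on the set $\{1,\dots,M_\nu\}$, and transmits the corresponding codeword $\boldsymbol{x}_\nu^{(m_\nu)}$ from the codebook $\mathcal{C}_\nu = \{\boldsymbol{x}_\nu^{(1)},\dots,\boldsymbol{x}_\nu^{(M_\nu)}\}$. The decoder forms an estimate $(\hat{m}_1,\hat{m}_2)$ of the message pair using the output sequence $\boldsymbol{y}$ and the two codebooks. An error is said to have occurred if $(\hat{m}_1,\hat{m}_2) \ne (m_1,m_2)$. A rate pair $(R_1,R_2)$ is said to be $(n,\epsilon)$-achievable if there exist codebooks with $M_1 \ge \exp(nR_1)$ and $M_2 \ge \exp(nR_2)$ codewords of length $n$ for users 1 and 2 respectively, such that the average error probability does not exceed $\epsilon$. The capacity region $\mathcal{R}^*$ is defined to be the closure of the set of rate pairs that are $(n,\epsilon)$-achievable for any $\epsilon \in (0,1)$ and sufficiently large $n$.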

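Our results are proved using constant-composition random coding with coded time-sharing [12]. The precise description of the ensemble is postponed until Section 4; here we simply provide the definitions required to state the results. We fix a finite time-sharing alphabet $\mathcal{U}$, as well as the input distributions $Q_U \in \mathcal{P}(\mathcal{U})$, $Q_1 \in \mathcal{P}(\mathcal{X}_1|\mathcal{U})$ and $Q_2 \in \mathcal{P}(\mathcal{X}_2|\mathcal{U})$. We define the joint distribution

$$P_{UX_1X_2Y}(u,x_1,x_2,y) \triangleq Q_U(u)\,Q_1(x_1|u)\,Q_2(x_2|u)\,W(y|x_1,x_2), \tag{2}$$

and denote the induced marginal distributions by $P_{Y|X_2U}$, $P_{Y|U}$, etc. Defining the rate vector

$$\boldsymbol{R} \triangleq \begin{bmatrix} R_1 \\ R_2 \\ R_1+R_2 \end{bmatrix} \tag{3}$$

and the mutual information vector (implicitly dependent on $Q_U$, $Q_1$, and $Q_2$)

$$\boldsymbol{I} \triangleq \begin{bmatrix} I(X_1;Y|X_2,U) \\ I(X_2;Y|X_1,U) \\ I(X_1,X_2;Y|U) \end{bmatrix}, \tag{4}$$

we have [13, 14, 15]

$$\mathcal{R}^* = \bigcup_{\mathcal{U}} \bigcup_{Q_U,Q_1,Q_2} \big\{(R_1,R_2) : \boldsymbol{R} \preceq \boldsymbol{I}\big\}. \tag{5}$$

Moreover, the union over $\mathcal{U}$ may be restricted to satisfy $|\mathcal{U}| \le 2$. The three conditions in the element-wise inequality $\boldsymbol{R} \preceq \boldsymbol{I}$ correspond to a treatment of the error event as a union of three error types: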

- (Type 1) $\hat{m}_1 \ne m_1$ and $\hat{m}_2 = m_2$,
- (Type 2) $\hat{m}_1 = m_1$ and $\hat{m}_2 \ne m_2$,
- (Type 12) $\hat{m}_1 \ne m_1$ and $\hat{m}_2 \ne m_2$.

A key quantity in our analysis is the information density vector [6, 8]

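$$\boldsymbol{i}(u,x_1,x_2,y) \triangleq \begin{bmatrix} i_1(u,x_1,x_2,y) \\ i_2(u,x_1,x_2,y) \\ i_{12}(u,x_1,x_2,y) \end{bmatrix}, \tag{6}$$

where

$$i_1(u,x_1,x_2,y) \triangleq \log\frac{W(y|x_1,x_2)}{P_{Y|X_2U}(y|x_2,u)} \tag{7}$$

$$i_2(u,x_1,x_2,y) \triangleq \log\frac{W(y|x_1,x_2)}{P_{Y|X_1U}(y|x_1,u)} \tag{8}$$

$$i_{12}(u,x_1,x_2,y) \triangleq \log\frac{W(y|x_1,x_2)}{P_{Y|U}(y|u)}. \tag{9}$$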

Averaging with respect to the distribution in (2) yields the mutual information vector in (4).
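The averaging step can be sketched numerically as follows. This is a minimal illustration assuming no time-sharing ($|\mathcal{U}| = 1$); the function name, the list-based channel representation, and the binary example channel (noiseless when the inputs agree, uniformly noisy otherwise) are choices made for this example rather than anything prescribed by the paper.

```python
from math import log


def mac_information_vector(Q1, Q2, W):
    """Return [I1, I2, I12] in nats for a DM-MAC without time-sharing.

    Q1[x1], Q2[x2] are the input distributions and W[x1][x2][y] the channel.
    """
    n1, n2, ny = len(Q1), len(Q2), len(W[0][0])
    # Output distributions induced by the joint distribution (2).
    py_x2 = [[sum(Q1[a] * W[a][b][y] for a in range(n1)) for y in range(ny)]
             for b in range(n2)]
    py_x1 = [[sum(Q2[b] * W[a][b][y] for b in range(n2)) for y in range(ny)]
             for a in range(n1)]
    py = [sum(Q1[a] * py_x1[a][y] for a in range(n1)) for y in range(ny)]
    I = [0.0, 0.0, 0.0]
    for a in range(n1):
        for b in range(n2):
            for y in range(ny):
                w = W[a][b][y]
                p = Q1[a] * Q2[b] * w
                if p == 0.0:
                    continue  # zero-probability points do not contribute
                # Information densities (7)-(9) at this point.
                dens = (log(w / py_x2[b][y]), log(w / py_x1[a][y]), log(w / py[y]))
                for k in range(3):
                    I[k] += p * dens[k]
    return I


if __name__ == "__main__":
    # Hypothetical binary example: y = x1 if x1 == x2, else y is uniform.
    W = [[[1.0, 0.0], [0.5, 0.5]],
         [[0.5, 0.5], [0.0, 1.0]]]
    I1, I2, I12 = mac_information_vector([0.5, 0.5], [0.5, 0.5], W)
    print([v / log(2) for v in (I1, I2, I12)])  # converted to bits
```

Each entry of the result is the expectation of the corresponding information density, so the three numbers are the conditional mutual informations in (4) for the chosen inputs.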

We consider the local notion of second-order asymptotics introduced by Nomura and Han for the Slepian-Wolf problem [10]; see also Hayashi [5] for the analogous definitions in the single-user setting. We proceed by presenting similar definitions for the present setting, albeit in a slightly different form. A pair $(L_1,L_2)$ is said to be $(n,\epsilon,R_1^*,R_2^*)$-achievable if there exist codebooks with $M_\nu \ge \exp(nR_\nu^* + \sqrt{n}L_\nu)$ codewords of length $n$ for $\nu = 1,2$ such that the average error probability does not exceed $\epsilon$. The second-order rate region $\mathcal{L}(\epsilon,R_1^*,R_2^*)$ is defined as the closure of the set of pairs that are $(n,\epsilon,R_1^*,R_2^*)$-achievable for sufficiently large $n$. In other words, $\mathcal{L}(\epsilon,R_1^*,R_2^*)$ is the set of all pairs $(L_1,L_2)$ for which there exists an $\epsilon$-reliable code with $\log M_\nu \ge nR_\nu^* + \sqrt{n}L_\nu + o(\sqrt{n})$ for $\nu = 1,2$. While this definition is valid for any pair $(R_1^*,R_2^*)$, our focus will be on pairs on the boundary of the capacity region $\mathcal{R}^*$; in all other cases we trivially have either $\mathcal{L} = \mathbb{R}^2$ or $\mathcal{L} = \emptyset$. We will see that both negative and positive values of $L_\nu$ arise; the former can be thought of as a backoff from the first-order term, and the latter as an addition to the first-order term.

Finally, we define the set

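$$\mathsf{Q}_{\mathrm{inv}}(\boldsymbol{V},\epsilon) \triangleq \big\{\boldsymbol{z} \in \mathbb{R}^d : \mathbb{P}[\boldsymbol{Z} \preceq \boldsymbol{z}] \ge 1-\epsilon\big\}, \tag{10}$$

where $\boldsymbol{Z} \sim N(\boldsymbol{0},\boldsymbol{V})$, and $\boldsymbol{V}$ is a positive semi-definite matrix. This definition applies for an arbitrary dimension $d$, which is dictated by the first argument.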
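To make the set in (10) concrete, the following sketch tests membership in one and two dimensions. In the scalar case the condition reduces to a threshold on $z$; in two dimensions the probability is estimated by Monte Carlo sampling, which is an approximation chosen here for simplicity and not a method taken from the paper. All function names, the sample count, and the seed are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def in_qinv_scalar(z, V, eps):
    """d = 1: P[Z <= z] >= 1 - eps with Z ~ N(0, V) iff z >= sqrt(V) * Q^{-1}(eps)."""
    return z >= sqrt(V) * NormalDist().inv_cdf(1.0 - eps)


def prob_below_2d(z, V, samples=200_000, seed=0):
    """Monte Carlo estimate of P[Z <= z] (element-wise) for Z ~ N(0, V), d = 2.

    V = [[a, b], [b, c]] must be positive semi-definite with a > 0.
    """
    (a, b), (_, c) = V
    l11 = sqrt(a)
    l21 = b / l11
    l22 = sqrt(max(c - l21 * l21, 0.0))  # equals zero when V is singular
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # (l11*g1, l21*g1 + l22*g2) has covariance matrix V.
        if l11 * g1 <= z[0] and l21 * g1 + l22 * g2 <= z[1]:
            hits += 1
    return hits / samples


def in_qinv_2d(z, V, eps, **kwargs):
    """Approximate membership test for the set in (10) with d = 2."""
    return prob_below_2d(z, V, **kwargs) >= 1.0 - eps
```

The singular case is handled explicitly because (10) only requires positive semi-definiteness. Points close to the boundary of the set may be misclassified by the sampling error.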

### 1.3 Previous Work

The second-order rate region has been characterized for very few multi-user problems [10, 16, 17]. The one most relevant to this paper is the Gaussian MAC with degraded message sets [16], which has the notable feature of having a curved capacity region, giving rise to a non-standard derivative term in the expression for the second-order rate region. Our analysis will yield similar terms using different techniques.

Tan and Kosut [6] and Haim et al. [18] performed second-order asymptotic studies for various multi-user problems using notions of achievability different from those above. In particular, both of these works considered the problem of finding the backoff from the rates when a point is approached from a given angle. As demonstrated in [16], this problem can be solved numerically in a straightforward fashion once $\mathcal{L}(\epsilon,R_1^*,R_2^*)$ is characterized.

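Other previous works on the DM-MAC have taken an alternative approach to characterizing the second-order asymptotics, namely seeking global asymptotic expansions of the following form: For any triplet $(Q_U,Q_1,Q_2)$, rate vectors $\boldsymbol{R}$ satisfying

$$n\boldsymbol{R} \in n\boldsymbol{I} - \sqrt{n}\,\mathsf{Q}_{\mathrm{inv}}(\boldsymbol{V},\epsilon) + g(n)\boldsymbol{1}, \tag{11}$$

are $(n,\epsilon)$-achievable for some dispersion matrix $\boldsymbol{V}$ and function $g(n)$, where $\boldsymbol{I}$ is given in (4).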

The first global result for the DM-MAC was given in [6], where i.i.d. random coding was used to obtain (11) with $\boldsymbol{V} = \mathrm{Cov}[\boldsymbol{i}(U,X_1,X_2,Y)]$ and $g(n) = O(\log n)$ (see also [7]). Expansions of a similar form were given by MolavianJazi and Laneman [7]. By using a constant-composition time-sharing sequence, Huang and Moulin [8] showed that the dispersion matrix can be improved to

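$$\boldsymbol{V}_{\mathrm{iid}} \triangleq \mathbb{E}\Big[\mathrm{Cov}\big[\boldsymbol{i}(U,X_1,X_2,Y)\,\big|\,U\big]\Big]. \tag{12}$$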

As discussed by Haim et al. [18], expansions of the form (11) are more difficult to interpret than the scalar counterpart in (1), as the notion of the convergence of a region is inherently less concrete than that of the convergence of a scalar. While the scalar dispersion $V$ in (1) corresponds to a concrete operational definition [4], it appears difficult to directly give any such meaning to the matrix $\boldsymbol{V}$ based on global results. In fact, non-global asymptotic studies in [6] indicate that, in most cases of interest, entries and of the matrix $\boldsymbol{V}$ do not play a fundamental role in characterizing the performance. Furthermore, it may be difficult to compare two dispersion matrices $\boldsymbol{V}$ and $\boldsymbol{V}'$, since the partial positive definite ordering does not guarantee that at least one of $\boldsymbol{V} \preceq \boldsymbol{V}'$ or $\boldsymbol{V}' \preceq \boldsymbol{V}$ holds. These issues are even more troublesome when one considers the union over all input distributions; for example, standard proofs often yield a non-uniform remainder term in (11). These limitations motivate the study of local asymptotics, such as those defined above. However, global results often prove useful as an intermediate step towards the local results.

### 1.4 Contributions

The main result of this paper is an inner bound on $\mathcal{L}(\epsilon,R_1^*,R_2^*)$ for the discrete memoryless MAC. The result is proved using constant-composition random coding and a variant of Hoeffding’s combinatorial CLT [9, 19]. Since coding with fixed input distributions (not varying with $n$) is not sufficient to achieve all pairs $(L_1,L_2)$ in network information theory problems [16], we apply coded time-sharing [15, Sec. 4.5.3] between input distributions corresponding to two points on the boundary of the capacity region, with one of the points only corresponding to a fraction of the block length. Several examples are provided, including (i) a case where constant-composition random coding yields a strictly larger inner bound than that of i.i.d. random coding, and (ii) an application to the Gaussian MAC via a quantization argument.

## 2 Main Result

### 2.1 Further Definitions

Our main result is written in terms of a dispersion matrix of the form

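$$\boldsymbol{V} \triangleq \mathbb{E}\Big[\mathrm{Cov}\big[\boldsymbol{i}(U,X_1,X_2,Y)\,\big|\,U\big] - \mathrm{Cov}\big[\boldsymbol{i}^{(1)}(U,X_1)\,\big|\,U\big] - \mathrm{Cov}\big[\boldsymbol{i}^{(2)}(U,X_2)\,\big|\,U\big]\Big], \tag{13}$$

where

$$\boldsymbol{i}^{(1)}(u,x_1) \triangleq \mathbb{E}\big[\boldsymbol{i}(U,X_1,X_2,Y)\,\big|\,U=u,X_1=x_1\big] \tag{14}$$

$$\boldsymbol{i}^{(2)}(u,x_2) \triangleq \mathbb{E}\big[\boldsymbol{i}(U,X_1,X_2,Y)\,\big|\,U=u,X_2=x_2\big]. \tag{15}$$

We can interpret (13) as follows: The term $\mathbb{E}[\mathrm{Cov}[\boldsymbol{i}(U,X_1,X_2,Y)\,|\,U]]$ represents the variations in $(X_1,X_2,Y)$ in the i.i.d. case (cf. (12)), and the terms $\mathbb{E}[\mathrm{Cov}[\boldsymbol{i}^{(1)}(U,X_1)\,|\,U]]$ and $\mathbb{E}[\mathrm{Cov}[\boldsymbol{i}^{(2)}(U,X_2)\,|\,U]]$ represent the reduced variations in $X_1$ and $X_2$ respectively, resulting from the codewords having a fixed composition. Since all covariance matrices are positive semidefinite, we clearly have $\boldsymbol{V} \preceq \boldsymbol{V}_{\mathrm{iid}}$. We henceforth write the entries of $\boldsymbol{I}$ (see (4)) and $\boldsymbol{V}$ using subscripts: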
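The relationship between the i.i.d. dispersion matrix (12) and the matrix in (13) can be checked numerically. The sketch below assumes no time-sharing and assumes that $\boldsymbol{i}^{(1)}$ and $\boldsymbol{i}^{(2)}$ are the conditional means of the information density vector given $X_1$ and $X_2$ respectively; the function names and the list-based channel layout `W[x1][x2][y]` are hypothetical.

```python
from math import log


def add_outer(M, v, weight):
    """In-place update M += weight * v v^T for a 3-vector v."""
    for r in range(3):
        for c in range(3):
            M[r][c] += weight * v[r] * v[c]


def dispersion_matrices(Q1, Q2, W):
    """Return (I, V_iid, V) for a DM-MAC without time-sharing (|U| = 1)."""
    n1, n2, ny = len(Q1), len(Q2), len(W[0][0])
    py_x2 = [[sum(Q1[a] * W[a][b][y] for a in range(n1)) for y in range(ny)]
             for b in range(n2)]
    py_x1 = [[sum(Q2[b] * W[a][b][y] for b in range(n2)) for y in range(ny)]
             for a in range(n1)]
    py = [sum(Q1[a] * py_x1[a][y] for a in range(n1)) for y in range(ny)]
    # Support points (x1, x2, probability, information density vector).
    pts = []
    for a in range(n1):
        for b in range(n2):
            for y in range(ny):
                w = W[a][b][y]
                p = Q1[a] * Q2[b] * w
                if p > 0.0:
                    dens = [log(w / py_x2[b][y]), log(w / py_x1[a][y]), log(w / py[y])]
                    pts.append((a, b, p, dens))
    I = [sum(p * d[k] for _, _, p, d in pts) for k in range(3)]
    V_iid = [[0.0] * 3 for _ in range(3)]
    for _, _, p, d in pts:
        add_outer(V_iid, [d[k] - I[k] for k in range(3)], p)
    V = [row[:] for row in V_iid]
    # Subtract the covariance of the conditional mean given X1, then given X2.
    for idx, Q in ((0, Q1), (1, Q2)):
        for x in range(len(Q)):
            if Q[x] == 0.0:
                continue
            mean = [sum(pt[2] * pt[3][k] for pt in pts if pt[idx] == x) / Q[x]
                    for k in range(3)]
            add_outer(V, [mean[k] - I[k] for k in range(3)], -Q[x])
    return I, V_iid, V
```

Calling `dispersion_matrices` with non-uniform inputs typically gives diagonal entries of `V` that are strictly smaller than those of `V_iid`, while inputs for which the conditional means are constant give `V == V_iid`.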

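$$\boldsymbol{I} = \begin{bmatrix} I_1 \\ I_2 \\ I_{12} \end{bmatrix}, \qquad \boldsymbol{V} = \begin{bmatrix} V_1 & V_{1,2} & V_{1,12} \\ V_{1,2} & V_2 & V_{2,12} \\ V_{1,12} & V_{2,12} & V_{12} \end{bmatrix}. \tag{16}$$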

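For a fixed point $(R_1^*,R_2^*)$ on the boundary of $\mathcal{R}^*$, we let $\hat{\boldsymbol{T}}_-$ and $\hat{\boldsymbol{T}}_+$ respectively denote the left and right unit tangent vectors along the boundary of $\mathcal{R}^*$ in $(R_1,R_2)$-space; see Figure b. We let $\hat{\boldsymbol{T}}_-$ (respectively, $\hat{\boldsymbol{T}}_+$) be undefined when $R_1^* = 0$ (respectively, $R_2^* = 0$); in all other cases, the vectors are well-defined due to the convexity of the capacity region. The case $\hat{\boldsymbol{T}}_- = -\hat{\boldsymbol{T}}_+$ corresponds to a curved or straight-line part of the boundary, whereas $\hat{\boldsymbol{T}}_- \ne -\hat{\boldsymbol{T}}_+$ corresponds to a sudden change in slope (e.g. at a corner point).

We construct the following vectors in the same way as (3):

$$\boldsymbol{T}_- \triangleq \begin{bmatrix} \hat{T}_{-,1} \\ \hat{T}_{-,2} \\ \hat{T}_{-,1}+\hat{T}_{-,2} \end{bmatrix}, \qquad \boldsymbol{T}_+ \triangleq \begin{bmatrix} \hat{T}_{+,1} \\ \hat{T}_{+,2} \\ \hat{T}_{+,1}+\hat{T}_{+,2} \end{bmatrix}, \tag{17}$$

where $\hat{T}_{\pm,\nu}$ denotes the $\nu$-th entry of the corresponding unit tangent vector. To ease some of the subsequent discussions, we define the following scalars that correspond to $\hat{\boldsymbol{T}}_-$ and $\hat{\boldsymbol{T}}_+$ in a one-to-one fashion:

$$D_- \triangleq \frac{\hat{T}_{-,2}}{\hat{T}_{-,1}}, \qquad D_+ \triangleq \frac{\hat{T}_{+,2}}{\hat{T}_{+,1}}. \tag{18}$$

These are the left and right derivatives of $R_2$ as a function of $R_1$ along the boundary. They are non-positive, and are understood to equal $-\infty$ when $\hat{T}_{\pm,1} = 0$, corresponding to a vertical part of the capacity region. Observe that $\hat{\boldsymbol{T}}_-$ is obtained by normalizing $[-1,\,-D_-]^T$ and $\hat{\boldsymbol{T}}_+$ is obtained by normalizing $[1,\,D_+]^T$, and hence $\hat{\boldsymbol{T}}_- = -\hat{\boldsymbol{T}}_+$ if and only if $D_- = D_+$.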
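The correspondence between the one-sided derivatives and the tangent vectors can be sketched as follows. The code assumes that the left tangent points towards decreasing $R_1$ and the right tangent towards increasing $R_1$, so that they are the normalized versions of $[-1,-D_-]^T$ and $[1,D_+]^T$, with the vertical case handled as a limit. The function names are hypothetical.

```python
from math import hypot, inf


def unit_tangents(d_minus, d_plus):
    """Left and right unit tangent vectors from the one-sided derivatives in (18).

    Derivatives are non-positive; -inf encodes a vertical boundary segment.
    """
    def normalize(v):
        norm = hypot(v[0], v[1])
        return (v[0] / norm, v[1] / norm)

    # Limits of the normalized vectors as the derivative tends to -inf.
    t_minus = (0.0, 1.0) if d_minus == -inf else normalize((-1.0, -d_minus))
    t_plus = (0.0, -1.0) if d_plus == -inf else normalize((1.0, d_plus))
    return t_minus, t_plus


def extend(t_hat):
    """Build the 3-vector [t1, t2, t1 + t2] as in (17)."""
    return [t_hat[0], t_hat[1], t_hat[0] + t_hat[1]]


if __name__ == "__main__":
    # Straight-line boundary of slope -1: the third entry of T_+ vanishes.
    t_minus, t_plus = unit_tangents(-1.0, -1.0)
    print(extend(t_minus), extend(t_plus))
```

When both derivatives equal $-1$, the sum entry of each extended vector is zero, so a multiple of the tangent vector leaves the sum-rate coordinate unchanged.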

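Given the pairs $(R_1^*,R_2^*)$ and $(L_1,L_2)$, we define

$$\boldsymbol{R}^* \triangleq \begin{bmatrix} R_1^* \\ R_2^* \\ R_1^*+R_2^* \end{bmatrix}, \qquad \boldsymbol{L} \triangleq \begin{bmatrix} L_1 \\ L_2 \\ L_1+L_2 \end{bmatrix}. \tag{19}$$

For a non-empty index set $\mathcal{K} \subseteq \{1,2,12\}$, we let $\boldsymbol{I}^{(\mathcal{K})}$ denote the subvector of $\boldsymbol{I}$ where only the indices corresponding to $\mathcal{K}$ are kept, and similarly for $\boldsymbol{R}^*$, $\boldsymbol{L}$, $\boldsymbol{T}_-$, and $\boldsymbol{T}_+$. Similarly, $\boldsymbol{V}^{(\mathcal{K})}$ denotes the submatrix of $\boldsymbol{V}$ where only the rows and columns indexed by $\mathcal{K}$ are kept.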

### 2.2 Statement of Main Result

We say that the triplet $(Q_U,Q_1,Q_2)$ achieves the rate pair $(R_1,R_2)$ if $\boldsymbol{R} \preceq \boldsymbol{I}$; from (5), every point in $\mathcal{R}^*$ (including those on the boundary) is achieved by at least one such triplet. In the following theorem, $\mathcal{K}$ can be thought of as the set of error types that are active for a given input distribution and boundary point (e.g. if the boundary point is achieved by the corner point of the pentagonal region corresponding to the type-2 and type-12 conditions, then $\mathcal{K} = \{2,12\}$).

###### Theorem 1.

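Fix $\epsilon \in (0,1)$, let $(R_1^*,R_2^*)$ be a point on the boundary of the capacity region in (5), let $(Q_U,Q_1,Q_2)$ be an arbitrary triplet achieving that point, and consider $\boldsymbol{I}$ and $\boldsymbol{V}$ in (4) and (13) respectively. Letting $\mathcal{K}$ be the set of indices of the largest cardinality such that $\boldsymbol{R}^{*(\mathcal{K})} = \boldsymbol{I}^{(\mathcal{K})}$, we have

$$\mathcal{L}(\epsilon,R_1^*,R_2^*) \supseteq \bigg\{(L_1,L_2) : \boldsymbol{L}^{(\mathcal{K})} \in \bigcup_{\beta \ge 0}\Big\{\beta\boldsymbol{T}_-^{(\mathcal{K})} - \mathsf{Q}_{\mathrm{inv}}(\boldsymbol{V}^{(\mathcal{K})},\epsilon)\Big\}\bigg\} \cup \bigg\{(L_1,L_2) : \boldsymbol{L}^{(\mathcal{K})} \in \bigcup_{\beta \ge 0}\Big\{\beta\boldsymbol{T}_+^{(\mathcal{K})} - \mathsf{Q}_{\mathrm{inv}}(\boldsymbol{V}^{(\mathcal{K})},\epsilon)\Big\}\bigg\}, \tag{20}$$

where the first (respectively, second) set is understood to be empty when $R_1^* = 0$ (respectively, $R_2^* = 0$).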

Proof. See Section 4.1.

We make the following remarks on Theorem 1:

1. In the case that $\hat{\boldsymbol{T}}_- = -\hat{\boldsymbol{T}}_+$, or equivalently $D_- = D_+$ (i.e. a curved or straight-line part of the boundary), the two sets in (20) can be combined into a single set containing a coefficient $\beta \in \mathbb{R}$ (with negative values allowed) and the vector $\boldsymbol{T}_+^{(\mathcal{K})}$. In this case, the inner bound on $\mathcal{L}(\epsilon,R_1^*,R_2^*)$ is a half-space. We will see in Section 3.2 that this does not always occur, and combinations other than $(D_-,D_+) = (0,-1)$ and $(D_-,D_+) = (-1,-\infty)$ are possible (these are the combinations that are observed for standard pentagonal regions).

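2. In the case that $\mathcal{K}$ contains only a single entry $\nu$, the unions over $\beta$ can be replaced by $\beta = 0$, yielding a simpler inner bound given by

$$\mathcal{L}(\epsilon,R_1^*,R_2^*) \supseteq \big\{(L_1,L_2) : L_\nu \le -\sqrt{V_\nu}\,Q^{-1}(\epsilon)\big\}, \tag{21}$$

where $L_{12} \triangleq L_1 + L_2$. The fact that $\beta = 0$ suffices is shown in the same way for each $\nu$, so we consider the case $\nu = 12$. Since $(R_1^*,R_2^*)$ lies on the diagonal part of the pentagonal region corresponding to $(Q_U,Q_1,Q_2)$ (and away from the corners), both $(R_1^*-\delta,R_2^*+\delta)$ and $(R_1^*+\delta,R_2^*-\delta)$ are achievable for sufficiently small $\delta > 0$, and hence $D_- \le -1$ and $D_+ \ge -1$. The convexity of the capacity region implies that $D_- \ge D_+$, and it follows that $D_- = D_+ = -1$. From (17), we see that $D_\pm = -1$ implies $\hat{T}_{\pm,1} + \hat{T}_{\pm,2} = 0$, and hence each coefficient $\beta$ in (20) is multiplying zero.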

3. More generally, if the input distribution $(Q_U,Q_1,Q_2)$ achieving $(R_1^*,R_2^*)$ also achieves all of the boundary points in a neighborhood of $(R_1^*,R_2^*)$, then the unions over $\beta$ can be replaced by $\beta = 0$. In particular, this is true when the entire capacity region is achieved by a single input distribution. This will be observed for the Gaussian MAC in Section 3.3.

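4. All non-empty subsets of $\{1,2,12\}$ can occur as $\mathcal{K}$ with the exception of $\{1,2\}$. Focusing on the case that time-sharing is absent, the case $\mathcal{K} = \{1,2\}$ is impossible since

$$I(X_1,X_2;Y) = I(X_1;Y) + I(X_2;Y|X_1) \tag{22}$$

$$\le I(X_1;X_2,Y) + I(X_2;Y|X_1) \tag{23}$$

$$= I(X_1;Y|X_2) + I(X_2;Y|X_1), \tag{24}$$

where (24) follows since $X_1$ and $X_2$ are independent. Whenever $\mathcal{K}$ includes $\{1,2\}$, we have $R_1^* = I_1$ and $R_2^* = I_2$, and it follows from (24) that $R_1^* + R_2^* \ge I_{12}$. Since an achieved rate pair also satisfies $R_1^* + R_2^* \le I_{12}$, we have $\mathcal{K} = \{1,2,12\}$, corresponding to a rectangular achievable rate region.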

5. The inner bound in (20) is of a similar form to the set appearing in [16, Thm. 3] for the Gaussian MAC with degraded message sets. The main differences are: (i) the left and right tangent vectors are treated separately here, since unlike in [16], the two do not have the same slope in general (e.g. see Figure b and the example in Section 3.2); (ii) there are six cases here corresponding to the different subsets of $\{1,2,12\}$ (see the previous item), whereas in [16] there are only two possibilities for the set of active rate conditions.

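6. The proof of Theorem 1 can be followed using i.i.d. codeword distributions, yielding an analogous result with $\boldsymbol{V}_{\mathrm{iid}}$ (see (12)) in place of $\boldsymbol{V}$. Using the fact that $\boldsymbol{V} \preceq \boldsymbol{V}_{\mathrm{iid}}$, it is not difficult to show that the inner bounds on $\mathcal{L}(\epsilon,R_1^*,R_2^*)$ obtained using $\boldsymbol{V}$ include those obtained using $\boldsymbol{V}_{\mathrm{iid}}$ whenever . In Section 3.1, we will see that the inclusion can be strict.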

7. It is also of interest to compare $\boldsymbol{V}$ to a hypothetical dispersion matrix of the form

$$\boldsymbol{V}_{\mathrm{joint}} \triangleq \mathbb{E}\Big[\mathrm{Cov}\big[\boldsymbol{i}(U,X_1,X_2,Y)\,\big|\,U,X_1,X_2\big]\Big]. \tag{25}$$

This is the matrix that would be obtained if the joint composition of the time-sharing sequence and the two codewords were fixed, which is impossible in the absence of cooperation between the users. As we show in [20], we have $\boldsymbol{V}_{\mathrm{joint}} \preceq \boldsymbol{V}$; this is proved using the matrix version of the law of total variance, along with the identity .

8. We make no attempt to present analytical expressions for $\hat{\boldsymbol{T}}_-$ and $\hat{\boldsymbol{T}}_+$, but two numerical approaches to their computation are presented in Section 3. In general, if the capacity region is characterized numerically, then one can easily obtain numerical bounds or approximations for these tangent vectors.
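Two of the ingredients in the remarks above lend themselves to a short numerical sketch: identifying the active index set $\mathcal{K}$ for a given boundary point and input distribution, and evaluating the threshold in (21) when $\mathcal{K}$ has a single entry. The function names, the string labels, and the numerical tolerance are assumptions made for this illustration.

```python
from math import sqrt
from statistics import NormalDist

LABELS = ("1", "2", "12")


def active_set(R_star, I, tol=1e-9):
    """Indices nu in {1, 2, 12} for which the rate condition holds with equality.

    R_star = (R1*, R2*) and I = [I1, I2, I12] as in (4).
    """
    R = [R_star[0], R_star[1], R_star[0] + R_star[1]]
    if any(r > i + tol for r, i in zip(R, I)):
        raise ValueError("rate pair is not achieved by this input distribution")
    return [lab for lab, r, i in zip(LABELS, R, I) if abs(r - i) <= tol]


def single_entry_bound(V_nu, eps):
    """Right-hand side of (21): L_nu <= -sqrt(V_nu) * Q^{-1}(eps)."""
    return -sqrt(V_nu) * NormalDist().inv_cdf(1.0 - eps)


if __name__ == "__main__":
    I = [0.3, 0.3, 0.5]                  # hypothetical mutual information vector
    print(active_set((0.2, 0.3), I))     # a corner point of the pentagon
    print(active_set((0.25, 0.25), I))   # a point on the diagonal segment
    print(single_entry_bound(0.4, 0.1))
```

For $\epsilon < 1/2$ the threshold is negative, which matches the interpretation of negative second-order rates as a backoff from the first-order term.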

## 3 Examples

### 3.1 The Collision Channel

We begin with a simple deterministic example that will permit us to compare i.i.d. and constant-composition random coding, and to discuss the role of the off-diagonal entries of the corresponding dispersion matrices.

Setting and , the channel is given by

$$W(y|x_1,x_2) = \begin{cases} 1 & y = (x_1,x_2) \text{ and } \min\{x_1,x_2\} = 0 \\ 1 & y = c \text{ and } \min\{x_1,x_2\} \ne 0 \\ 0 & \text{otherwise.} \end{cases} \tag{26}$$

In words, if either user transmits the zero symbol then the pair $(x_1,x_2)$ is received noiselessly, whereas if both users transmit a non-zero symbol then the output is $c$, meaning “collision”.

We recall the following observations by Gallager [21]: (i) The capacity region can be obtained without time sharing; (ii) By symmetry, the points on the boundary of the capacity region are achieved by input distributions of the form and , where ; (iii) The achievable rate region corresponding to any such pair is rectangular. The capacity region is shown in Figure 2. The left and right tangent vectors coincide (i.e. $\hat{\boldsymbol{T}}_- = -\hat{\boldsymbol{T}}_+$ and hence $D_- = D_+$) at all boundary points with $R_1^* > 0$ and $R_2^* > 0$, and the case of interest in Theorem 1 is $\mathcal{K} = \{1,2,12\}$.

One approach to computing the inner bound in (20) for a given boundary point is to first find the pair of input distributions achieving that point, and then calculate $D_-$ and $D_+$ (e.g. see the example in Section 3.2). In this example, the reverse approach turns out to be more convenient: We start with a given derivative $D = D_- = D_+$, and perform an optimization over the input distributions to find the corresponding (unique) boundary point $(R_1^*,R_2^*)$.

As stated above, the achievable rate region for a given pair of input distributions is a rectangle with a corner point given by $(I_1,I_2)$. The straight line of slope $D$ passing through this point is given by $R_2 = DR_1 + (I_2 - DI_1)$. Thus, finding the desired boundary point simply amounts to maximizing the intercept $I_2 - DI_1$ with respect to the input distribution parameters, which is a straightforward optimization problem.
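The optimization just described can be sketched as a grid search. The parameterization below is an assumption made for illustration: each user has alphabet $\{0,\dots,m-1\}$ and sends the zero symbol with probability $1-p_\nu$, spreading the remaining mass uniformly over the non-zero symbols. Under that assumption, and because the channel (26) is deterministic, $I_1 = H(Y|X_2)$ has the closed form used in the code (and symmetrically for $I_2$). The function names, the value of `m`, and the grid resolution are hypothetical.

```python
from math import log


def hb(p):
    """Binary entropy in nats."""
    return 0.0 if p in (0.0, 1.0) else -p * log(p) - (1 - p) * log(1 - p)


def corner(p1, p2, m=3):
    """Corner point (I1, I2) of the rectangular region, in nats.

    If X2 = 0 the output reveals X1 (entropy hb(p1) + p1*log(m-1));
    otherwise it only reveals whether X1 = 0 (entropy hb(p1)).
    """
    g = log(m - 1)
    return hb(p1) + (1 - p2) * p1 * g, hb(p2) + (1 - p1) * p2 * g


def boundary_point(D, m=3, steps=200):
    """Grid search maximizing the intercept I2 - D*I1 for a slope D <= 0.

    Returns (intercept, p1, p2, I1, I2) at the best grid point.
    """
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1):
            p1, p2 = i / steps, j / steps
            I1, I2 = corner(p1, p2, m)
            val = I2 - D * I1
            if best is None or val > best[0]:
                best = (val, p1, p2, I1, I2)
    return best


if __name__ == "__main__":
    val, p1, p2, I1, I2 = boundary_point(-1.0)
    print(f"p1={p1:.3f} p2={p2:.3f} R1*={I1 / log(2):.3f} R2*={I2 / log(2):.3f} bits/use")
```

A finer grid, or a local refinement around the best grid point, would be needed to match a search to three decimal places.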

For concreteness, we provide a numerical example for the case that , which corresponds to . Using a brute force search to three decimal places, we found the optimal parameters to be , yielding bits/use. The inner bound on from Theorem 1, and its counterpart for i.i.d. random coding (cf. (12)), are shown in Figure 3, where we set . For comparison, we also plot the weaker inner bounds in which is set to zero and only remains (cf. (20)). The boundaries of these regions are shown, and the regions lie to the bottom-left of these boundaries.

We see that the region resulting from constant-composition random coding is strictly larger than that resulting from i.i.d. random coding. In this example, the strict inclusion holds for all points on the boundary of the capacity region other than the endpoints corresponding to $R_1^* = 0$ or $R_2^* = 0$. This gain is analogous to a similar gain in the error exponent [12]. In contrast, in the single-user setting, the two ensembles yield the same second-order term and error exponent after the optimization of the input distribution [4, 22].

We conclude by discussing the roles of the various entries of the covariance matrices. In this example, the diagonal entries $V_1$ and $V_2$ determine the locations of the vertical and horizontal asymptotes in Figure 3. We see from Figure 3 that the off-diagonal terms also play a role. In particular, the rectangular shape of the region with $\beta = 0$ for constant-composition coding is an extreme case corresponding to a singular dispersion matrix $\boldsymbol{V}$, and this is in fact the most favorable shape possible (in terms of enlarging $\mathcal{L}(\epsilon,R_1^*,R_2^*)$) given fixed locations of the vertical and horizontal asymptotes. In contrast, the off-diagonal terms of $\boldsymbol{V}_{\mathrm{iid}}$ yield a more standard curved region for $\beta = 0$, which is less favorable. Thus, at least in this example, the enlarged second-order region for constant-composition codes is not only due to smaller diagonal entries, but also due to a more favorable covariance matrix structure.

### 3.2 A Non-Deterministic Example

Here we provide an example showing that the two unions in (20) cannot, in general, be combined into one. In other words, it is necessary to consider the left and right tangent vectors separately. We set $\mathcal{X}_1 = \mathcal{X}_2 = \mathcal{Y} = \{0,1\}$, and

$$W(y|x_1,x_2) = \begin{cases} 1 & x_1 = x_2 = y \\ 0.5 & x_1 \ne x_2 \\ 0 & \text{otherwise.} \end{cases} \tag{27}$$

Thus, the channel is noiseless if $x_1 = x_2$, and completely noisy if $x_1 \ne x_2$. We write the input distributions as $Q_1 = (1-p_1,p_1)$ and $Q_2 = (1-p_2,p_2)$.

The capacity region is shown in Figure 4. Observe that there are two “corner points”, but unlike those of standard pentagonal regions, neither of them corresponds to a change in angle of 45 degrees. More precisely, the middle segment shown in the plot has slope $-1$, but the other two segments are neither vertical nor horizontal (in fact, they are not even straight line segments, even though they may appear to be). Both corner points are achieved by $p_1 = p_2 = 0.5$.

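Here we focus on characterizing the set $\mathcal{L}(\epsilon,R_1^*,R_2^*)$ for the upper corner point; identical arguments apply for the lower corner point. The case of interest in Theorem 1 is $\mathcal{K} = \{2,12\}$. Since the middle segment in Figure 4 has slope $-1$, we have $D_+ = -1$. The idea used to compute $D_-$ is to shift the point $(p_1,p_2)$ by a small amount $\Delta$ in the direction $(\cos\theta,\sin\theta)$, and observe the behavior of the corner point $(I_{12}-I_2,\,I_2)$ for $\theta \in [0,2\pi)$, where for each $\theta$ we are interested in the limiting behavior as $\Delta \to 0$. Making the dependence of $I_2$ and $I_{12}$ on $(p_1,p_2)$ explicit, a second-order Taylor expansion yields

$$I_\nu(p_1+\Delta\cos\theta,\,p_2+\Delta\sin\theta) = I_\nu(p_1,p_2) + f_\nu(\theta)\Delta^2 + O(\Delta^3), \tag{28}$$

for $\nu = 2,12$, where

$$f_\nu(\theta) \triangleq \frac{1}{2}\begin{bmatrix}\cos\theta & \sin\theta\end{bmatrix} \begin{bmatrix} \dfrac{\partial^2 I_\nu}{\partial p_1^2} & \dfrac{\partial^2 I_\nu}{\partial p_1\partial p_2} \\ \dfrac{\partial^2 I_\nu}{\partial p_2\partial p_1} & \dfrac{\partial^2 I_\nu}{\partial p_2^2} \end{bmatrix} \begin{bmatrix}\cos\theta \\ \sin\theta\end{bmatrix}, \tag{29}$$

with the second derivatives evaluated at $p_1 = p_2 = 0.5$.

Note that the first-order term in (28) is absent, since the derivatives $\partial I_\nu/\partial p_1$ and $\partial I_\nu/\partial p_2$ are zero at $p_1 = p_2 = 0.5$ for $\nu = 2,12$. We conclude from (28) that for a fixed choice of $\theta$, the corner point moves in the direction $[f_{12}(\theta)-f_2(\theta),\,f_2(\theta)]^T$ in the limit as $\Delta \to 0$. Evaluating the direction for 10000 equally spaced angles over the range , we obtained , corresponding to radians and .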
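The quadratic coefficients in (28) can be estimated by finite differences, as in the sketch below. It assumes the parameterization $p_\nu = \mathbb{P}[X_\nu = 1]$ for the channel (27) and uses a symmetric second difference around $p_1 = p_2 = 0.5$, which is exact up to higher-order terms because the first-order term vanishes. The function names and the step size are hypothetical, and the sketch stops at the coefficients rather than reproducing the search over angles.

```python
from math import cos, sin, log, pi


def info_vector(p1, p2):
    """[I1, I2, I12] in nats for the channel (27) with P[X_nu = 1] = p_nu."""
    Q1, Q2 = (1 - p1, p1), (1 - p2, p2)
    W = (((1.0, 0.0), (0.5, 0.5)), ((0.5, 0.5), (0.0, 1.0)))
    I = [0.0, 0.0, 0.0]
    for a in range(2):
        for b in range(2):
            for y in range(2):
                w = W[a][b][y]
                p = Q1[a] * Q2[b] * w
                if p == 0.0:
                    continue
                py_x2 = sum(Q1[k] * W[k][b][y] for k in range(2))
                py_x1 = sum(Q2[k] * W[a][k][y] for k in range(2))
                py = sum(Q1[k] * Q2[l] * W[k][l][y] for k in range(2) for l in range(2))
                for idx, q in enumerate((py_x2, py_x1, py)):
                    I[idx] += p * log(w / q)
    return I


def f_coeffs(theta, delta=1e-3, p=(0.5, 0.5)):
    """Finite-difference estimates of [f_1, f_2, f_12](theta) in (28)."""
    dx, dy = delta * cos(theta), delta * sin(theta)
    up = info_vector(p[0] + dx, p[1] + dy)
    mid = info_vector(*p)
    down = info_vector(p[0] - dx, p[1] - dy)
    # (I(+) + I(-) - 2 I(0)) / (2 delta^2) approximates the coefficient of delta^2.
    return [(u + d - 2 * m) / (2 * delta ** 2) for u, m, d in zip(up, mid, down)]


if __name__ == "__main__":
    for k in range(4):
        theta = k * pi / 4
        print(f"theta={theta:.3f}", [round(f, 4) for f in f_coeffs(theta)])
```

Relabeling all symbols $0 \leftrightarrow 1$ leaves the channel unchanged, so the mutual informations are symmetric about $(0.5, 0.5)$; this is one way to see why the first-order term is absent.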

Figure 5 shows the resulting inner bound on $\mathcal{L}(\epsilon,R_1^*,R_2^*)$ given in Theorem 1, with . In this example, the region is identical for i.i.d. random coding and constant-composition random coding. It is interesting to observe the different shape of the region compared to the previous example, resulting from the differing left and right tangent vectors. It is only the former that plays a role in enlarging the achievable region, since the set $-\mathsf{Q}_{\mathrm{inv}}(\boldsymbol{V}^{(\mathcal{K})},\epsilon)$ (with $\mathcal{K} = \{2,12\}$) already satisfies the property that if a given point is in the set, so are all points on the right-hand side of the line of slope $-1$ passing through that point.

### 3.3 Gaussian Multiple-Access Channel

We have focused our attention on the DM-MAC, which permits an analysis based on combinatorial arguments. We now discuss how Theorem 1 can be extended to the Gaussian MAC via an increasingly fine quantization of the inputs, similarly to Hayashi [5, Thm. 3] and Tan [23]. Each use of the channel is described by

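$$Y = \sqrt{P_1}X_1 + \sqrt{P_2}X_2 + Z, \tag{30}$$

where $Z \sim N(0,1)$, and where each codeword $\boldsymbol{x}_\nu$ for user $\nu = 1,2$ is constrained to satisfy $\|\boldsymbol{x}_\nu\|_2^2 \le n$. The quantities $P_1$ and $P_2$ represent the signal-to-noise ratios for users 1 and 2 respectively.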

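The capacity region is pentagonal [24, Sec. 15.1], and is achieved using Gaussian input distributions, namely $Q_1 = Q_2 = N(0,1)$. The quantities $\boldsymbol{I}$ and $\boldsymbol{V}$ in (4) and (13) can be written explicitly; for $\nu = 1,2,12$, we have

$$I_\nu = \frac{1}{2}\log(1+P_\nu) \tag{31}$$

with $P_{12} \triangleq P_1 + P_2$. Moreover, for $\nu = 1,2$, we have

$$V_\nu = \frac{P_\nu(2+P_\nu)}{2(1+P_\nu)^2} \tag{32}$$

$$V_{\nu,12} = \frac{P_\nu(2+P_1+P_2)}{2(1+P_\nu)(1+P_1+P_2)}, \tag{33}$$

and the remaining entries of $\boldsymbol{V}$ are given by

$$V_{12} = \frac{(P_1+P_2)(2+P_1+P_2)+2P_1P_2}{2(1+P_1+P_2)^2} \tag{34}$$

$$V_{1,2} = \frac{P_1P_2}{2(1+P_1)(1+P_2)}. \tag{35}$$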

A brief outline of how these expressions are obtained is given in Appendix 7.
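For reference, the closed-form expressions (31)-(35) can be evaluated directly. The sketch below assumes $P_{12} = P_1 + P_2$ in (31) and arranges the entries as in (16); the function name is hypothetical.

```python
from math import log


def gaussian_mac_moments(P1, P2):
    """Return (I, V) for the Gaussian MAC, with I in nats, following (31)-(35)."""
    P12 = P1 + P2
    I = [0.5 * log(1 + P) for P in (P1, P2, P12)]

    def v_single(P):
        # Diagonal entry V_nu for nu = 1, 2, see (32).
        return P * (2 + P) / (2 * (1 + P) ** 2)

    def v_cross(P):
        # Off-diagonal entry V_{nu,12} for nu = 1, 2, see (33).
        return P * (2 + P12) / (2 * (1 + P) * (1 + P12))

    V12 = (P12 * (2 + P12) + 2 * P1 * P2) / (2 * (1 + P12) ** 2)  # (34)
    V1_2 = P1 * P2 / (2 * (1 + P1) * (1 + P2))                     # (35)
    V = [[v_single(P1), V1_2, v_cross(P1)],
         [V1_2, v_single(P2), v_cross(P2)],
         [v_cross(P1), v_cross(P2), V12]]
    return I, V


if __name__ == "__main__":
    I, V = gaussian_mac_moments(1.0, 1.0)
    print(I)
    for row in V:
        print(row)
```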

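We claim that the following inner bound holds using the notation of Theorem 1 (with the additional condition that the codewords must satisfy the above power constraints in the definition of $\mathcal{L}(\epsilon,R_1^*,R_2^*)$ in Section 1.2):

$$\mathcal{L}(\epsilon,R_1^*,R_2^*) \supseteq \big\{(L_1,L_2) : \boldsymbol{L}^{(\mathcal{K})} \in -\mathsf{Q}_{\mathrm{inv}}(\boldsymbol{V}^{(\mathcal{K})},\epsilon)\big\}. \tag{36}$$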

This result was first derived by MolavianJazi and Laneman [25], who used random coding according to the uniform distribution over the surface of a sphere. The techniques of this paper provide an alternative approach to deriving the result. Extending Theorem 1 accordingly is non-trivial, but it is done using well-established techniques; we provide an outline in Appendix 7. In contrast with Theorem 1, no form of time-sharing is used in the proof, and no tangent vectors appear in (36). This is due to the fact that every point on the boundary of the capacity region is simultaneously achieved by Gaussian inputs.

## 4 Proof of Theorem 1

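The random-coding ensemble used in the proof depends on two triplets $(Q_U,Q_1,Q_2)$ and $(Q_U',Q_1',Q_2')$ of probability distributions on the same alphabets. We define the primed counterparts of the quantities in Sections 1.2 and 2.1 (e.g. $\boldsymbol{i}'$, $\boldsymbol{I}'$ and $\boldsymbol{V}'$) in the same way, with $(Q_U',Q_1',Q_2')$ replacing $(Q_U,Q_1,Q_2)$. In particular, we have

$$\boldsymbol{I}' \triangleq \mathbb{E}\big[\boldsymbol{i}'(U',X_1',X_2',Y')\big] \tag{37}$$

$$\boldsymbol{V}' \triangleq \mathbb{E}\Big[\mathrm{Cov}\big[\boldsymbol{i}'(U',X_1',X_2',Y')\,\big|\,U'\big] - \mathrm{Cov}\big[\boldsymbol{i}'^{(1)}(U',X_1')\,\big|\,U'\big] - \mathrm{Cov}\big[\boldsymbol{i}'^{(2)}(U',X_2')\,\big|\,U'\big]\Big], \tag{38}$$

where $(U',X_1',X_2',Y')$ is distributed according to the joint distribution of the form (2) induced by $(Q_U',Q_1',Q_2')$. As an intermediate step towards obtaining our local result, we present the following global result.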

###### Theorem 2.

Fix any finite time-sharing alphabet and the input distributions and