# Lattice Codes Achieve the Capacity of Common Message Gaussian Broadcast Channels with Coded Side Information

## Abstract

Lattices possess elegant mathematical properties which have been previously used in the literature to show that structured codes can be efficient in a variety of communication scenarios, including coding for the additive white Gaussian noise (AWGN) channel, dirty-paper channel, Wyner-Ziv coding, coding for relay networks and so forth. We consider the family of single-transmitter multiple-receiver Gaussian channels where the source transmits a set of common messages to all the receivers (multicast scenario), and each receiver has *coded side information*, i.e., prior information in the form of linear combinations of the messages. This channel model is motivated by applications to multi-terminal networks where the nodes may have access to coded versions of the messages from previous signal hops or through orthogonal channels. The capacity of this channel is known and follows from the work of Tuncel (2006), which is based on random coding arguments. In this paper, following the approach of Erez and Zamir, we design lattice codes for this family of channels when the source messages are symbols from a finite field $\mathbb{F}_p$ of prime size $p$. Our coding scheme utilizes Construction A lattices designed over the same prime field $\mathbb{F}_p$, and uses *algebraic binning* at the decoders to expurgate the channel code and obtain good lattice subcodes, for every possible set of linear combinations available as side information. The achievable rate of our coding scheme is a function of the size $p$ of the underlying prime field, and approaches the capacity as $p$ tends to infinity.

## 1 Introduction

Information-theoretic results often rely on random coding arguments to prove the existence of good codes. Usually, the codebook is constructed by randomly choosing the components of each codeword independently and identically from a judiciously chosen probability distribution. While this technique is powerful, the resulting codebooks do not exhibit any structure that may be of practical interest. One such desirable structure is linearity, which allows complexity reductions at the encoder and decoder by utilizing efficient algebraic processing techniques. Further, in certain communication scenarios, coding schemes based on linear codes yield a larger achievable rate region than random code ensembles, as was shown by Körner and Marton [1] for a distributed source coding problem. Structured coding schemes have been widely studied in the literature, especially for communications in the presence of side information and in multi-terminal networks. For an overview of structured coding schemes we refer the reader to [2] and the references therein.

For communication in the wireless domain, structured codes can be obtained by choosing finite subsets of points from lattices [4]. A lattice is an infinite discrete set of points in the Euclidean space that are regularly arranged and are closed under addition. Codes based on lattices, known as *(nested) lattice codes* or *Voronoi codes*, are the analogues of linear codes in wireless communications. Efficient lattice based strategies are known for a variety of communication scenarios, such as for achieving the capacity of the point-to-point additive white Gaussian noise (AWGN) channel [7], for dirty-paper coding [12], the Wyner–Ziv problem [2] and communication in relay networks [13], to name only a few.

In this paper we present good lattice strategies for communication in common message Gaussian broadcast channels, which we refer to as the *multicast channel*, where receivers have prior side information about the messages being transmitted. In particular, we assume that the transmitter is multicasting message symbols from a finite field $\mathbb{F}_p$, of prime size $p$, to all the receivers, and each receiver may have *coded side information* about the messages: the prior knowledge of the values of (possibly multiple) $\mathbb{F}_p$-linear combinations of the message symbols. The number of linear combinations available as side information and the coefficients of these linear combinations can differ from one receiver to the next. The capacity of this channel is known and follows from the results of Tuncel [17], where the achievability part utilizes an ensemble of codebooks generated using the Gaussian distribution.

The multiuser channel considered in this paper is a noisy version of a simple special case of *index coding* [18]. The index coding problem considers a *noiseless* broadcast link where each receiver demands a subset of the source messages and knows the values of some other subset as side information. A generalization of the index coding problem in which the receivers have access to linear combinations of messages was studied recently in [21]. The specific instance of index coding where each receiver demands all the messages from the source corresponds to a noiseless multicast channel and has a simple optimum solution based on maximum distance separable (MDS) codes [23]. When the channel is noisy, capacity-achieving coding schemes based on structured codes are not available. In this paper we design lattice-based strategies for multicasting over the AWGN channel where the side information at the receivers is in the form of linear combinations of source messages.

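The noiseless special case above is easy to make concrete. The sketch below (toy parameters, brute-force solving; not a construction from the paper) shows a receiver that already knows one linear combination of $K = 3$ messages over $\mathbb{F}_5$ recovering all the messages from $K - 1 = 2$ additional independent combinations broadcast by the source:

```python
# Toy instance of the noiseless special case (index-coded multicast): K = 3
# messages over F_5, one receiver that already knows a single linear
# combination (m = 1), and K - m = 2 further independent combinations sent by
# the source. All equations and values here are made up for illustration.
import itertools

p, K = 5, 3
w = (2, 4, 1)                         # the source messages

A_side = [(1, 1, 1)]                  # receiver side information equation
A_tx = [(1, 2, 3), (1, 4, 4)]         # combinations broadcast by the source

def apply(rows, vec):
    return tuple(sum(a * v for a, v in zip(row, vec)) % p for row in rows)

u_side = apply(A_side, w)             # known at the receiver beforehand
u_tx = apply(A_tx, w)                 # received over the noiseless link

# The receiver solves the combined system (here by brute force over F_5^3).
solutions = [cand for cand in itertools.product(range(p), repeat=K)
             if apply(A_side, cand) == u_side and apply(A_tx, cand) == u_tx]
assert solutions == [w]               # all K messages recovered uniquely
```

The broadcast rows were chosen so that, stacked with the side information row, they form an invertible matrix over $\mathbb{F}_5$; this is exactly the general-position property that MDS codes provide systematically.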
The case of Gaussian multicast channel with coded side information is motivated by applications to multi-terminal communication networks. It is known that signal interference in wireless channels can be harnessed by decoding linear combinations of transmit messages instead of either treating interference as noise or decoding interference along with the intended message [15]. When such a technique is used in a multi-hop communication protocol, one encounters receivers that have coded side information obtained from transmissions in the previous phases. Similarly, in a network that consists of both wired and wireless channels, the symbols received from wired links can be utilized as side information for decoding the wireless signals. If a linear network code is used in the wired part of the network, then the side information is in the form of linear combinations of the source messages.

A special case of coded side information is the Gaussian multicast channel where each receiver has prior knowledge of the values of some subset of the messages. The known capacity-achieving coding schemes for this special case are based on random coding using i.i.d. (independent and identically distributed) codewords [17]. The existence of lattice-based capacity-achieving coding schemes was proved in [28] for the special case where the number of messages and receivers is two and each receiver has knowledge of one of the messages. Constructions of binary codes for this channel were proposed in [33]. Explicit constructions of lattice codes were given in [36] that convert receiver side information into additional apparent coding gain in the AWGN channel. Codes based on quadrature amplitude modulation were constructed in [38]. In [32], explicit codes based on lattices and coded modulation have been designed that perform within a few decibels of capacity when the number of receivers is two and each knows one of the two messages being transmitted.

The objective of this paper is to prove that lattice codes can achieve the capacity of common message Gaussian broadcast channels with coded side information. We use the information-theoretic framework set by Erez and Zamir [8] to this end. The proposed coding scheme uses lattices obtained by applying Construction A to linear codes over the prime field $\mathbb{F}_p$ which is the alphabet of the source messages. The achievable rate of our lattice-based coding scheme is a function of the prime $p$, and approaches the capacity of the common message Gaussian broadcast channel as $p \to \infty$.

Our decoding scheme involves *algebraic binning* [2] where the receiver side information is used to expurgate the channel code and obtain a lower rate subcode. The set of linear equations available as side information may differ from one receiver to another, and hence, each receiver must employ a different binning scheme for the same channel code. The coding scheme ensures that the binning performed at each receiver produces a good lattice subcode of the transmitted code. Following expurgation, each receiver decodes the channel output by minimum mean square error (MMSE) scaling and quantization to an infinite lattice. The algebraic structure of the coding scheme facilitates the performance analysis by decomposing the original channel into multiple independent point-to-point AWGN channels – one corresponding to each receiver – where each of the point-to-point AWGN channels uses a lattice code for communication. Unlike [8], where achievability in a point-to-point AWGN channel was proved using error exponent analysis, we provide a direct proof based only on simple counting arguments.

As a corollary to the main result, we obtain an alternative proof of the goodness of lattice codes in achieving the capacity of the point-to-point AWGN channel. Previous proofs of this result presented in [8] also use ensembles of lattices obtained by applying Construction A to random linear codes over a prime field $\mathbb{F}_p$; see also [40]. While [8] used primes that were exponential in the code length $n$, [9] and [10] improved this result by allowing $p$ to grow much more slowly with $n$. The corollary presented in this paper further improves these results by enabling a choice of the prime $p$ which is independent of the code length $n$ but is a function only of the gap between the desired rate and the channel capacity.

Lattices have been used to design powerful physical-layer coding schemes for wireless networks consisting of multiple sources, relays and destinations [15]. In these networks information from the source nodes is conveyed to the destination nodes through relays over multiple hops and time slots. In each time slot, a set of nodes act as transmitters and every other node in their range observes a linear superposition of the transmitted signals perturbed by AWGN. Lattice coding schemes for these networks are designed such that each receiver can reliably decode the observed noisy superposition to a linear combination of source messages which it then proceeds to transmit in the next time slot. Every destination node decodes its desired messages once it collects sufficiently many linear combinations. In contrast, in this paper we consider a single hop interference-free transmission in a multicast channel consisting of one transmitter and multiple destination nodes that are aided by coded side information. Our objective is to design coding schemes that can utilize prior knowledge at these receivers rather than exploit wireless interference arising from multiple simultaneous transmissions, as often experienced in relay networks.

The organization of this paper is as follows. We introduce the channel model in Section 2.1 and review the relevant background on lattices and lattice codes in Section 2.3. In Section 3, we state the main theorem, and describe the lattice code ensemble and encoding and decoding procedures. We prove the main theorem and state a few corollaries in Section 4, and finally, we discuss some concluding remarks in Section 5.

*Notation:* Matrices and column vectors are denoted by bold upper and lower case letters, respectively. The symbol $\|\cdot\|$ denotes the Euclidean norm of a vector, and $(\cdot)^{\mathsf{T}}$ is the transpose of a matrix or a vector. The Kronecker product of two matrices $\mathbf{A}$ and $\mathbf{B}$ is $\mathbf{A} \otimes \mathbf{B}$, $\mathbf{I}$ is the identity matrix, and $\mathbf{0}$ is the all zero matrix of appropriate dimension. The symbol $\log$ denotes logarithm to the base $2$ and $\ln$ denotes logarithm to the base $e$. The expectation operator is denoted by $\mathbb{E}$. The symbol $A \setminus B$ denotes the elements in the set $A$ that do not belong to the set $B$.

## 2 Channel Model and Lattice Preliminaries

### 2.1 Channel Model and Problem Statement

We consider a (non-fading) common message Gaussian broadcast channel with a single transmitter and finitely many receivers, where all terminals are equipped with single antennas. The transmitter operates under an average power constraint and the receivers are affected by additive white Gaussian noise with possibly different noise powers. There are $K$ independent messages $w_1, \dots, w_K$ at the transmitter that assume values with a uniform probability distribution from a prime finite field $\mathbb{F}_p$. Each receiver desires to decode all the messages while having prior knowledge of the values of some $\mathbb{F}_p$-linear combinations of the messages $w_1, \dots, w_K$. Consider a generic receiver that has access to the values $u_j \in \mathbb{F}_p$, $j = 1, \dots, m$, of the following set of linear equations

$$u_j = \sum_{\ell=1}^{K} a_{j,\ell}\, w_\ell, \qquad j = 1, \dots, m,$$

where the coefficients $a_{j,\ell}$ belong to $\mathbb{F}_p$.

We will denote this side information configuration using the matrix $\mathbf{A} = (a_{j,\ell}) \in \mathbb{F}_p^{m \times K}$, where each row of $\mathbf{A}$ represents one linear equation. Any row of $\mathbf{A}$ that is linearly dependent on the other rows represents redundant information and can be discarded with no loss to the receiver side information, and hence, with no loss to system performance. Hence, without loss in generality, we will assume that the rows of $\mathbf{A}$ are linearly independent over $\mathbb{F}_p$, i.e., $\operatorname{rank}(\mathbf{A}) = m$, and $0 \leq m \leq K$. Note that the values of $m$ and $\mathbf{A}$ can be different across the receivers. A receiver with no side information is represented with an empty matrix for $\mathbf{A}$ (with $m = 0$).

A receiver in the multicast channel is completely characterized by its *(coded) side information matrix* $\mathbf{A}$ and the variance $\sigma^2$ of the additive noise. If we assume that the average transmit power at the source is $P$, then the signal-to-noise ratio at this receiver is $\mathrm{SNR} = P/\sigma^2$. We will denote a receiver by the pair $(\mathbf{A}, \mathrm{SNR})$, where $\mathbf{A}$ is any matrix over $\mathbb{F}_p$ with $K$ columns and linearly independent rows, and $\mathrm{SNR} > 0$. Note that *uncoded* side information, i.e., the prior knowledge of the values of a size-$m$ subset of $\{w_1, \dots, w_K\}$, is a special case, and hence, is contained within the definition of our channel model.

From elementary linear algebra we know that if the values of $m$ linearly independent combinations of the $K$ variables $w_1, \dots, w_K$ are given, then the set of all possible solutions of these equations is a coset of a $(K-m)$-dimensional linear subspace of $\mathbb{F}_p^K$. Since the a priori probability distribution of $(w_1, \dots, w_K)$ is uniform, we conclude that, given the side information values $u_j$, $j = 1, \dots, m$, the probability distribution of $(w_1, \dots, w_K)$ is uniform over this coset. Using the fact that the number of elements in the coset is $p^{K-m}$, we observe that the conditional entropy of $(w_1, \dots, w_K)$ given the side information is

$$H\left( w_1, \dots, w_K \mid u_1, \dots, u_m \right) = (K - m) \log_2 p ~~\text{bits}.$$

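The coset-counting argument above is easy to verify numerically. The following sketch (with illustrative toy parameters) enumerates the solution set of $m$ independent equations in $K$ unknowns over $\mathbb{F}_p$ and checks that it has exactly $p^{K-m}$ elements:

```python
# Sanity check of the coset-counting argument: m independent linear equations
# over F_p in K unknowns leave a solution coset of exactly p^(K-m) elements,
# so the conditional entropy is (K - m) * log2(p) bits. (Toy parameters.)
import itertools, math

p, K = 3, 3
A = [(1, 2, 0), (0, 1, 1)]          # m = 2 linearly independent rows
m = len(A)
w = (1, 0, 2)                       # realized messages
u = [sum(a * x for a, x in zip(row, w)) % p for row in A]

coset = [c for c in itertools.product(range(p), repeat=K)
         if all(sum(a * x for a, x in zip(row, c)) % p == u[i]
                for i, row in enumerate(A))]

assert len(coset) == p ** (K - m)   # coset of a (K-m)-dimensional subspace
H = math.log2(len(coset))           # entropy of a uniform distribution
assert abs(H - (K - m) * math.log2(p)) < 1e-12
```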
Suppose we want to transmit, on the average, one realization of $(w_1, \dots, w_K)$ in every $N$ uses of the broadcast channel. The transmission rate of each message is $\frac{\log_2 p}{N}$ b/dim (bits per real dimension or bits per real channel use).

For the simplicity of exposition, we consider only the symmetric case where all the messages are required to be transmitted at the same rate $R$. The general scenario, where the messages are of different rates, can be reduced to the symmetric case through *rate-splitting*: if there are $K'$ messages with transmission rates $R_1, \dots, R_{K'}$, respectively, then by splitting each of these original sources into multiple virtual sources, one can generate a set of $K$ sources ($K \geq K'$) such that their rates are as close to each other as required.

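The rate-splitting bookkeeping can be sketched in a few lines; the integer rate multiples below are made up for illustration only:

```python
# Rate-splitting sketch: two messages whose rates are integer multiples (3 and
# 1) of a common base rate R are split into K = 4 virtual sources of rate R
# each, reducing the asymmetric problem to the symmetric one considered here.
# (The rates and the splitting granularity are illustrative.)
rates = [3, 1]                          # rates as multiples of the base rate R

virtual_sources = [(idx, part)          # (original message, fragment index)
                   for idx, r in enumerate(rates)
                   for part in range(r)]

assert len(virtual_sources) == sum(rates)            # K = 4 symmetric sources
assert virtual_sources == [(0, 0), (0, 1), (0, 2), (1, 0)]
```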
We will assume that the encoding at the transmitter is performed on a block of $k$ independent realizations of the message symbols, i.e., the source jointly encodes the message vectors $\mathbf{w}_1, \dots, \mathbf{w}_K \in \mathbb{F}_p^k$. The transmitter uses an $n$-dimensional channel code $\mathcal{C} \subset \mathbb{R}^n$ together with a function

$$E : \left( \mathbb{F}_p^{k} \right)^{K} \to \mathcal{C}$$

to jointly encode the message vectors. The number of codewords in $\mathcal{C}$ is $p^{Kk}$, and we will assume that the codebook satisfies the per-codeword power constraint

$$\frac{1}{n} \left\| \mathbf{x} \right\|^2 \leq P \quad \text{for every } \mathbf{x} \in \mathcal{C}.$$

The average number of channel uses to transmit each realization of $(w_1, \dots, w_K)$ is $n/k$. The resulting rate of transmission of each of the messages is

$$R = \frac{k}{n} \log_2 p ~~\text{b/dim}.$$

The sum rate of all the $K$ messages is $\frac{Kk}{n} \log_2 p$ b/dim.

The side information at the receiver $(\mathbf{A}, \mathrm{SNR})$ over a block of $k$ realizations of the message symbols is of the form

$$\mathbf{u}_j = \sum_{\ell=1}^{K} a_{j,\ell}\, \mathbf{w}_\ell, \qquad j = 1, \dots, m,$$

where $\mathbf{u}_j \in \mathbb{F}_p^k$. This side information allows the receiver to conclude that the transmitted codeword must belong to the subcode of $\mathcal{C}$ consisting of the codewords that encode message tuples $(\mathbf{w}_1, \dots, \mathbf{w}_K)$ satisfying all $m$ of these equations.

The optimal decoder at $(\mathbf{A}, \mathrm{SNR})$ decodes the channel output vector to the nearest codeword of this subcode, and the error probability at this receiver is the probability that the estimated message tuple is not equal to the transmitted messages $(\mathbf{w}_1, \dots, \mathbf{w}_K)$. In order to achieve the optimal performance at a given receiver $(\mathbf{A}, \mathrm{SNR})$, we thus require that the expurgated code be a good channel code for the point-to-point AWGN channel. In the multicast channel that consists of multiple receivers, the side information matrix $\mathbf{A}$ can vary from one receiver to the next, and hence, the expurgated codes can be different at each receiver, see Figure 1. Hence, a capacity-achieving channel code is such that the resulting expurgated code at every receiver is a good channel code for the AWGN channel.

### Problem Statement

#### Problem Setup

Consider a common message Gaussian broadcast channel with a single transmitter and $L$ receivers. The transmitter desires to multicast $K$ independent messages from a prime field $\mathbb{F}_p$ subject to the unit power constraint $\frac{1}{n}\|\mathbf{x}\|^2 \leq 1$ on the transmit codeword. Each of the $L$ receivers has coded side information corresponding to the side information matrix $\mathbf{A}_\ell$, $\ell = 1, \dots, L$, and experiences an additive white Gaussian noise of variance $\sigma_\ell^2$, $\ell = 1, \dots, L$. Without loss of generality, we assume that each of the side information matrices has linearly independent rows, i.e., $\operatorname{rank}(\mathbf{A}_\ell) = m_\ell$, where $m_\ell$ is the number of rows of $\mathbf{A}_\ell$. Using the information-theoretic arguments of [17], which are based on the average performance of an ensemble of randomly generated codebooks, it can be shown that the (symmetric) capacity of this multicast channel is

$$C = \min_{\ell = 1, \dots, L} \frac{1}{2\,(K - m_\ell)} \log_2\!\left( 1 + \mathrm{SNR}_\ell \right) ~~\text{b/dim},$$

where $\mathrm{SNR}_\ell = 1/\sigma_\ell^2$: the $\ell$-th receiver must reliably receive the information in the $K - m_\ell$ message components not determined by its side information, and this total rate cannot exceed the point-to-point AWGN capacity $\frac{1}{2}\log_2(1 + \mathrm{SNR}_\ell)$.

The proof of this result is similar to the proof of Theorem 6 of [17], which considers a *discrete memoryless* common message broadcast channel where the side information at each receiver is, in general, a random variable jointly distributed with the source messages. A sketch of the proof that $C$ is the capacity of the Gaussian multicast channel with coded side information at the receivers is given in the appendix.

#### Problem Statement

Let $\mathrm{SNR}_1, \dots, \mathrm{SNR}_L$ be fixed positive real numbers and let $\varepsilon, \delta > 0$. We seek to determine whether there exists a lattice code for the multicast channel with coded side information at the receivers that transmits each of the $K$ messages with rate at least $C - \delta$ such that the probability of decoding error at each of the $L$ receivers is at most $\varepsilon$.

In this paper we answer the above stated problem in the affirmative under the assumption that the prime field $\mathbb{F}_p$ is sufficiently large. In particular, we prove the existence of a lattice code with the said properties when the prime $p$ satisfies an inequality determined by the design rate and the gap $\delta$ (see Section 3.1). Unlike the capacity $C$, which holds for any value of $p$, our result on the optimality of lattice codes requires that $p$ vary with the tolerance $\delta$. The larger the gap $\delta$ to capacity, the smaller is the size requirement on the prime field $\mathbb{F}_p$.

### 2.3 Lattice Preliminaries

We now briefly recall the necessary properties of lattices and lattice codes, and establish our notation and terminology. The material presented in this section consists of standard ingredients used in the literature, and is mainly based on [5].

#### Lattices and Lattice Codes

Throughout this manuscript we consider $n$-dimensional lattices $\Lambda \subset \mathbb{R}^n$ with a full-rank generator matrix. The closest vector lattice quantizer corresponding to $\Lambda$ is denoted by the function $Q_\Lambda$, and the volume of its (fundamental) Voronoi region $\mathcal{V}$ is denoted by $V(\Lambda)$. For any $\boldsymbol{\lambda} \in \Lambda$, $Q_\Lambda^{-1}(\boldsymbol{\lambda})$ is the set of all points in $\mathbb{R}^n$ that are mapped to $\boldsymbol{\lambda}$ under $Q_\Lambda$, and it has the same volume as $\mathcal{V}$. For any two distinct lattice points $\boldsymbol{\lambda}_1, \boldsymbol{\lambda}_2 \in \Lambda$, the sets $Q_\Lambda^{-1}(\boldsymbol{\lambda}_1)$ and $Q_\Lambda^{-1}(\boldsymbol{\lambda}_2)$ are disjoint. The *modulo*-$\Lambda$ operation, defined as $\mathbf{x} \bmod \Lambda = \mathbf{x} - Q_\Lambda(\mathbf{x})$, satisfies the following properties for all $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$ and $\boldsymbol{\lambda} \in \Lambda$

$$\left( \mathbf{x} \bmod \Lambda + \mathbf{y} \right) \bmod \Lambda = \left( \mathbf{x} + \mathbf{y} \right) \bmod \Lambda, \qquad \left( \mathbf{x} + \boldsymbol{\lambda} \right) \bmod \Lambda = \mathbf{x} \bmod \Lambda.$$

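These modulo-lattice identities can be checked numerically for a simple lattice. The sketch below uses the scaled integer lattice $\Lambda = 4\mathbb{Z}^n$, for which the nearest-point quantizer is just componentwise rounding to a multiple of $4$ (an illustrative choice, not one used in the paper's construction):

```python
# Numerical check of the modulo-lattice identities for the scaled integer
# lattice 4*Z^n, whose nearest-point quantizer is componentwise rounding.
import numpy as np

s = 4.0                                          # lattice = s * Z^n

def quantize(x):
    return s * np.floor(x / s + 0.5)             # nearest-point quantizer Q

def mod_lattice(x):
    return x - quantize(x)                       # x mod lattice = x - Q(x)

x = np.array([3.7, -9.2, 14.3])
y = np.array([1.1, 6.8, -2.5])
lam = np.array([4.0, -8.0, 12.0])                # a point of the lattice

# (x mod L + y) mod L = (x + y) mod L, and (x + lam) mod L = x mod L
assert np.allclose(mod_lattice(mod_lattice(x) + y), mod_lattice(x + y))
assert np.allclose(mod_lattice(x + lam), mod_lattice(x))
```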
We will denote the $n$-dimensional ball of radius $r$ with center $\mathbf{c}$ as $\mathcal{B}(\mathbf{c}, r)$, i.e., $\mathcal{B}(\mathbf{c}, r) = \{ \mathbf{x} \in \mathbb{R}^n : \|\mathbf{x} - \mathbf{c}\| \leq r \}$, and the volume of a unit-radius ball in $n$ dimensions by $V_n$. It follows that the volume of $\mathcal{B}(\mathbf{c}, r)$ equals $V_n r^n$. The *covering radius* of the lattice $\Lambda$ is denoted by $r_{\mathrm{cov}}(\Lambda)$ and the *effective radius* of $\Lambda$ by $r_{\mathrm{eff}}(\Lambda)$. We recall that $r_{\mathrm{cov}}(\Lambda)$ is the smallest $r$ such that the balls of radius $r$ centered at the lattice points cover $\mathbb{R}^n$, and

$$V_n\, r_{\mathrm{eff}}^{\,n}(\Lambda) = V(\Lambda),$$

i.e., $r_{\mathrm{eff}}(\Lambda)$ is the radius of a ball whose volume equals that of the Voronoi region.

Rogers [44] showed that for every sufficiently large dimension $n$ there exists a lattice $\Lambda$ such that

$$\frac{r_{\mathrm{cov}}(\Lambda)}{r_{\mathrm{eff}}(\Lambda)} \leq \left( c\, n \log_2 n \right)^{1/n},$$

where $c$ is a constant. Note that the right hand side of the above inequality converges to $1$ as $n \to \infty$. A sequence of lattices of increasing dimension is said to be *Rogers-good* if $\lim_{n \to \infty} r_{\mathrm{cov}}(\Lambda)/r_{\mathrm{eff}}(\Lambda) = 1$. Rogers’ result shows that such a sequence exists (see also [45]).

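Not every familiar lattice sequence is Rogers-good. For contrast, the cubic lattice $\mathbb{Z}^n$ has $r_{\mathrm{cov}} = \sqrt{n}/2$ while $r_{\mathrm{eff}} \approx \sqrt{n/(2\pi e)}$, so the ratio tends to $\sqrt{2\pi e}/2 \approx 2.07$ rather than $1$. A short computation (a sketch using the standard ball-volume formula) illustrates this:

```python
# The cubic lattice Z^n is *not* Rogers-good: r_cov(Z^n) = sqrt(n)/2 while
# r_eff(Z^n) ~ sqrt(n/(2*pi*e)), so the ratio tends to sqrt(2*pi*e)/2 ~ 2.07
# instead of 1. (Uses V_n = pi^(n/2) / Gamma(n/2 + 1) and V(Z^n) = 1.)
import math

def r_eff_Zn(n):
    log_Vn = (n / 2) * math.log(math.pi) - math.lgamma(n / 2 + 1)
    return math.exp(-log_Vn / n)                 # solves V_n * r^n = 1

n = 400
ratio = (math.sqrt(n) / 2) / r_eff_Zn(n)
assert 1.9 < ratio < math.sqrt(2 * math.pi * math.e) / 2   # below the ~2.07 limit
```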
Let $\Lambda_c \subset \Lambda_f$ be a pair of nested lattices and $\mathbf{d} \in \mathbb{R}^n$ be a fixed vector. A *(nested) lattice code* or a *Voronoi code* is the set $(\Lambda_f + \mathbf{d}) \bmod \Lambda_c$ obtained by applying the $\bmod~\Lambda_c$ operation on the points of the lattice translate $\Lambda_f + \mathbf{d}$. The code consists of all the points in $\Lambda_f + \mathbf{d}$ that lie within the Voronoi region $\mathcal{V}_c$ of $\Lambda_c$. The lattice $\Lambda_c$ is called the *coarse lattice* or the *shaping lattice*, $\Lambda_f$ is called the *fine lattice* or the *coding lattice*, and $\mathbf{d}$ is the *dither* vector. The cardinality of this code is $V(\Lambda_c)/V(\Lambda_f)$, and every codeword point $\mathbf{x}$ satisfies $\|\mathbf{x}\| \leq r_{\mathrm{cov}}(\Lambda_c)$. Note that $\Lambda_f \bmod \Lambda_c$ is a lattice code with zero dither.

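A one-dimensional toy example makes the Voronoi code construction concrete (all parameters below are chosen for illustration only):

```python
# Enumerating a one-dimensional Voronoi code: coarse lattice 8*Z, fine
# lattice 2*Z, dither d = 0.5. The codebook (fine + d) mod coarse has
# V(coarse)/V(fine) = 4 points, all inside the Voronoi region [-4, 4).
import math

def mod_coarse(x, s=8.0):
    return x - s * math.floor(x / s + 0.5)       # x mod coarse lattice

d = 0.5
codebook = sorted({mod_coarse(2 * i + d) for i in range(-20, 20)})

assert len(codebook) == 4                        # |code| = V(coarse)/V(fine)
assert all(-4 <= c < 4 for c in codebook)        # inside the Voronoi region
```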
#### Lattice Codes from Linear Codes over a Finite Field

In this subsection we briefly describe the method proposed in [15] to construct a pair of nested lattices, and recall its relevant properties. This construction uses a coarse lattice $\Lambda_c$ and a linear code $C$ over $\mathbb{F}_p$ to generate a fine lattice $\Lambda_f$ such that $\Lambda_c \subset \Lambda_f$.

Let $\phi : \mathbb{F}_p \to \{0, 1, \dots, p-1\} \subset \mathbb{Z}$ denote the natural map that embeds $\mathbb{F}_p$ into $\mathbb{Z}$. When applied to vectors, $\phi$ acts independently on each component of a vector. Let $C \subseteq \mathbb{F}_p^n$ be a linear code of rank $r$, $r \leq n$,

$$C = \left\{ \mathbf{G}\mathbf{w} : \mathbf{w} \in \mathbb{F}_p^{r} \right\},$$

where $\mathbf{G} \in \mathbb{F}_p^{n \times r}$ is the generator matrix with full column rank, and $\mathbf{w}$ is the message encoded to the codeword $\mathbf{G}\mathbf{w}$. The set $\phi(C) + p\mathbb{Z}^n$, obtained by tiling copies of $\phi(C)$ at every vector of $p\mathbb{Z}^n$, is a lattice in $\mathbb{R}^n$ and is known as the *Construction A* lattice $\Lambda_A$ of the linear code $C$ [5]. Note that the number of points in $\Lambda_A$ contained in the Voronoi region of the lattice $p\mathbb{Z}^n$ is $p^r$. We obtain $\Lambda_f$ by scaling down the Construction A lattice by $p$ and transforming it by the generator matrix $\mathbf{B}$ of $\Lambda_c$

$$\Lambda_f = \frac{1}{p}\,\mathbf{B}\,\Lambda_A = \left\{ \frac{1}{p}\,\mathbf{B}\left( \phi(\mathbf{c}) + p\,\mathbf{z} \right) : \mathbf{c} \in C,~ \mathbf{z} \in \mathbb{Z}^n \right\}.$$

Since $C$ contains the all zero codeword, it follows that $\Lambda_c \subset \Lambda_f$. We observe that applying the transformation $\frac{1}{p}\mathbf{B}$ to the lattice $p\mathbb{Z}^n$ (instead of the lattice $\Lambda_A$) generates $\Lambda_c$ (instead of $\Lambda_f$). Hence, $\Lambda_f$ modulo $\Lambda_c$ has the same algebraic structure as that of $\Lambda_A$ modulo $p\mathbb{Z}^n$, which in turn, is equivalent to the linear code $C$. In particular,

$$\frac{V(\Lambda_c)}{V(\Lambda_f)} = \frac{V(p\mathbb{Z}^n)}{V(\Lambda_A)} = |C| = p^{r}.$$

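A minimal Construction A instance illustrates these counts. The sketch below takes $\Lambda_c = \mathbb{Z}^2$ (so the generator matrix is the identity) and a rank-one code over $\mathbb{F}_3$; the parameters are toy values for illustration:

```python
# Minimal Construction A instance: p = 3, n = 2, and a rank-1 linear code C
# over F_3 with generator (1, 2)^T. With the coarse lattice Z^2, the fine
# lattice is (1/3)(phi(C) + 3*Z^2), and it has p^rank = 3 points per unit
# cell of the coarse lattice. Exact rational arithmetic throughout.
from fractions import Fraction

p = 3
codewords = {tuple((g * w) % p for g in (1, 2)) for w in range(p)}   # C = {Gw}
assert codewords == {(0, 0), (1, 2), (2, 1)}

# Fine-lattice points inside the unit cell [0, 1)^2 of the coarse lattice
unit_cell = {tuple(Fraction(c, p) for c in cw) for cw in codewords}
assert len(unit_cell) == p ** 1                  # p^rank points per cell
assert (Fraction(0), Fraction(0)) in unit_cell   # coarse lattice is a sublattice
```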
The following lemma provides an explicit bijection between the message vectors $\mathbf{w}$ encoded by $C$ and the points in the lattice code $\Lambda_f \bmod \Lambda_c$. This result, which is originally from [15], is proved below for completeness.

In order to prove capacity achievability, we will rely on random coding arguments to show the existence of a good choice of $C$. As in [15], we will assume that the generator matrix $\mathbf{G}$ is a random matrix chosen with uniform probability distribution on $\mathbb{F}_p^{n \times r}$. The following result is useful in upper bounding the decoding error probability over the ensemble of random codes.

## 3 Lattice Codes for the Common Message Gaussian Broadcast Channel with Coded Side Information

We will assume that the number of messages $K$ and a design rate $R$ are given, and show that there exist good lattice codes of sufficiently large dimension $n$ that encode the $K$ messages over an appropriately chosen prime field $\mathbb{F}_p$ at rates close to $R$ b/dim. In order to rigorously state the main result, we consider a fixed non-zero tolerance $\delta$ that determines the gap to capacity.

To prove the main theorem, we utilize the lattice code ensemble introduced in [15]; see Section 2.3 of this paper. A Rogers-good lattice is chosen as the coarse lattice $\Lambda_c$. The fine lattice $\Lambda_f$ is obtained from the generator matrix of the coarse lattice and a linear code over a large enough prime field $\mathbb{F}_p$ using the construction described in Section 2.3.

The multicast channel considered in the main theorem reduces to the traditional single-user AWGN channel if the number of messages is $K = 1$, and the multicast channel consists of one receiver with an empty side information matrix $\mathbf{A}$, i.e., $m = 0$. Hence, the theorem provides an alternative proof of the existence of lattice codes that achieve the capacity of the single-user AWGN channel, and we have the following corollary.

The relation of this corollary to existing results on the optimality of Construction A based lattice codes in the single-user AWGN channel is described in detail in Section 4.

In the rest of this section we describe the construction of random lattice codes, and the encoding and decoding operations used to prove the main theorem. The proof itself is given in Section 4.

### 3.1 Random lattice code ensemble

#### Prime $p$

Given the design rate $R$, the number of messages $K$ and the tolerance $\delta$, we require the prime $p$ to satisfy a constraint consisting of two inequalities. The coding schemes of this paper are based on Construction A lattices, which are obtained by lifting linear codes over $\mathbb{F}_p$ to the Euclidean space $\mathbb{R}^n$. The generator matrices of these $\mathbb{F}_p$-linear codes are constructed randomly, and the first of the two inequalities will allow us to show that these randomly constructed generator matrices are full-ranked with probability close to $1$.

The proof of the main theorem given in Section 4 involves the derivation of an upper bound on the probability of decoding error averaged over an ensemble of lattice codes derived from Construction A. We will use the second of the two inequalities on $p$ to show that this upper bound is exponentially small in the dimension $n$: rearranging the terms of that inequality yields a bound that holds for every admissible rank $m$ of the side information matrices.

#### Message length

Once $p$ is fixed, we choose the message length $k$ as the largest integer that satisfies

$$\frac{k \log_2 p}{n} \leq R.$$

The left-hand side in the above inequality is the actual rate at which the lattice code encodes each message, while $R$ is the design rate. The difference between the two is at the most

$$\frac{\log_2 p}{n},$$

which converges to $0$ as $n \to \infty$. It follows that the code rate tends to the design rate as $n \to \infty$, and hence, the actual rate is within any desired margin of $R$ for all sufficiently large $n$.

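The choice of $k$ and the resulting rate gap can be sketched as follows (the values of $R$, $p$ and $n$ are illustrative):

```python
# Choosing the message length k: for a design rate R (b/dim), prime p and
# dimension n, take the largest integer k with k*log2(p)/n <= R. The gap
# between the design rate and the actual code rate is at most log2(p)/n,
# which vanishes as n grows.
import math

def message_length(R, p, n):
    return math.floor(n * R / math.log2(p))

R, p, n = 0.5, 11, 1000
k = message_length(R, p, n)
code_rate = k * math.log2(p) / n

assert code_rate <= R < (k + 1) * math.log2(p) / n   # k is largest feasible
assert R - code_rate <= math.log2(p) / n             # gap bounded by log2(p)/n
```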
#### Coarse Lattice

From the Rogers bound stated in Section 2.3, we know that for all sufficiently large $n$ there exists an $n$-dimensional lattice whose ratio of covering radius to effective radius is arbitrarily close to $1$.

We will choose such a Rogers-good lattice as $\Lambda_c$, and scale it so that

$$r_{\mathrm{cov}}(\Lambda_c) = \sqrt{n}.$$

It follows that $r_{\mathrm{eff}}(\Lambda_c) \leq r_{\mathrm{cov}}(\Lambda_c) = \sqrt{n}$, with near equality for a Rogers-good lattice. Using the definition of the effective radius, $V(\Lambda_c) = V_n\, r_{\mathrm{eff}}^{\,n}(\Lambda_c)$, we arrive at the following lower bound on the volume of the Voronoi region of $\Lambda_c$: for any $\epsilon' > 0$ and all sufficiently large $n$,

$$V(\Lambda_c) = V_n\, r_{\mathrm{eff}}^{\,n}(\Lambda_c) \geq V_n \left( \frac{\sqrt{n}}{1 + \epsilon'} \right)^{\! n}.$$

#### Fine Lattice

The fine lattice $\Lambda_f$ is obtained by the construction of [15] described in Section 2.3. The length of the linear code $C$ is $n$, and its rank $Kk$ is the number of message symbols to be encoded by the lattice code. Note that this requires that $Kk \leq n$ be true. Using the choice of $k$, which gives $k \log_2 p \leq nR$, and the requirement that $\log_2 p$ exceed $KR$, we have $Kk \leq \frac{nKR}{\log_2 p} < n$, which ensures that $Kk < n$. If $\mathbf{G}$ is the generator matrix of $C$, then $\mathbf{G} \in \mathbb{F}_p^{n \times Kk}$. We will choose $\mathbf{G}$ uniformly random over the set of all matrices in $\mathbb{F}_p^{n \times Kk}$, resulting in a random ensemble of fine lattices $\Lambda_f$.

#### Dither vector

We will rely on random coding arguments to prove the existence of a translate $\mathbf{d}$ such that the code $(\Lambda_f + \mathbf{d}) \bmod \Lambda_c$ performs close to capacity. We will assume that $\mathbf{d}$ is distributed uniformly in the Voronoi region $\mathcal{V}_c$ of the coarse lattice and is chosen independently of $\mathbf{G}$. This random dither is usually viewed as a common randomness available at the transmitter and the receivers [8]. Note that, as a consequence of the dither, the transmitted codeword is uniformly distributed in $\mathcal{V}_c$ and is independent of the message.

### 3.2 Encoding

We will now describe the encoding operation at the transmitter that maps the message vectors $\mathbf{w}_1, \dots, \mathbf{w}_K \in \mathbb{F}_p^k$ to a codeword $\mathbf{x}$. The encoder first concatenates the messages into the vector $\mathbf{w} = (\mathbf{w}_1^{\mathsf{T}}, \dots, \mathbf{w}_K^{\mathsf{T}})^{\mathsf{T}} \in \mathbb{F}_p^{Kk}$, encodes $\mathbf{w}$ to a codeword in the linear code $C$, and maps it to a point $\mathbf{t}$ using Construction A as follows

$$\mathbf{t} = \frac{1}{p}\,\mathbf{B}\,\phi\!\left( \mathbf{G}\mathbf{w} \right) \bmod \Lambda_c.$$

From the discussion in Section 3.1, we know that $\mathbf{t} \in \Lambda_f$, and hence, $\mathbf{t} \in \Lambda_f \bmod \Lambda_c$. Finally, the transmit codeword is generated by dithering $\mathbf{t}$,

$$\mathbf{x} = \left( \mathbf{t} + \mathbf{d} \right) \bmod \Lambda_c.$$

This sequence of operations is illustrated in Figure 2. Note that since the transmit codeword $\mathbf{x}$ lies in $\mathcal{V}_c$, each codeword satisfies $\|\mathbf{x}\| \leq r_{\mathrm{cov}}(\Lambda_c) = \sqrt{n}$, and hence, the power constraint $\frac{1}{n}\|\mathbf{x}\|^2 \leq 1$. It is straightforward to show that the dithering operation is a one-to-one correspondence between $\Lambda_f \bmod \Lambda_c$ and the set of transmit codewords. Further, from the lemma of Section 2.3 we know that $\mathbf{w} \mapsto \mathbf{t}$ is a bijection between the message space $\mathbb{F}_p^{Kk}$ and the undithered codewords if $\mathbf{G}$ is full rank. Hence, to ensure that no two messages are mapped to the same codeword, we only require that the random matrix $\mathbf{G}$ be full rank. It can be shown that (see [45])

$$\mathbb{P}\left( \operatorname{rank}(\mathbf{G}) = Kk \right) = \prod_{i=0}^{Kk-1} \left( 1 - p^{\,i-n} \right).$$

We will only require a relaxation based on the above expression. From the union bound over the nonzero messages, the probability that $\mathbf{G}$ is not of full column rank is less than $p^{Kk-n}$. Similarly, since $p$ is a prime integer, we have $p \geq 2$, and hence,

$$\mathbb{P}\left( \operatorname{rank}(\mathbf{G}) = Kk \right) > 1 - p^{\,Kk-n} \geq 1 - 2^{\,Kk-n},$$

which tends to $1$ as $n - Kk \to \infty$.

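The full-rank claim can be checked empirically. The sketch below (toy sizes, with column rank computed by Gaussian elimination over $\mathbb{F}_p$) observes no rank-deficient samples, consistent with the union bound:

```python
# Empirical check: a uniformly random n x r matrix over F_p fails to have
# full column rank with probability < p^(r-n) (union bound over the nonzero
# messages). Toy sizes; rank computed by elimination over F_p.
import random

def rank_mod_p(M, p):
    M = [row[:] for row in M]
    rank = 0
    for c in range(len(M[0])):
        piv = next((r for r in range(rank, len(M)) if M[r][c] % p), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        inv = pow(M[rank][c], p - 2, p)          # inverse in F_p (p prime)
        M[rank] = [(x * inv) % p for x in M[rank]]
        for r in range(len(M)):
            if r != rank and M[r][c] % p:
                f = M[r][c]
                M[r] = [(a - f * b) % p for a, b in zip(M[r], M[rank])]
        rank += 1
    return rank

random.seed(1)
p, n, r, trials = 5, 20, 4, 200
full = sum(rank_mod_p([[random.randrange(p) for _ in range(r)]
                       for _ in range(n)], p) == r
           for _ in range(trials))
assert full == trials        # failure probability < 5^(-16) per trial
```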
### 3.3 Decoding

The receiver employs a two stage decoder: in the first stage the receiver identifies the subcode of $\mathcal{C}$ corresponding to the available side information, and in the second stage it decodes the channel output to a point in this subcode.

#### Using Side Information to Expurgate Codewords

The side information at the receiver $(\mathbf{A}, \mathrm{SNR})$ over a block of $k$ realizations of the messages is of the form

$$\mathbf{u}_j = \sum_{\ell=1}^{K} a_{j,\ell}\, \mathbf{w}_\ell, \qquad j = 1, \dots, m.$$

The receiver desires to identify the set of all possible values of the message vector $\mathbf{w} = (\mathbf{w}_1^{\mathsf{T}}, \dots, \mathbf{w}_K^{\mathsf{T}})^{\mathsf{T}}$ that satisfy these equations. Using the notation $\mathbf{u} = (\mathbf{u}_1^{\mathsf{T}}, \dots, \mathbf{u}_m^{\mathsf{T}})^{\mathsf{T}}$, the side information can be rewritten compactly in terms of $\mathbf{A}$ and $\mathbf{w}$ as

$$\left( \mathbf{A} \otimes \mathbf{I}_k \right) \mathbf{w} = \mathbf{u},$$

where $\otimes$ denotes the Kronecker product of matrices and $\mathbf{I}_k$ is the $k \times k$ identity matrix over $\mathbb{F}_p$. Observe that this is an under-determined system of linear equations, and the set of solutions is a coset of the null space of $\mathbf{A} \otimes \mathbf{I}_k$. Let $\mathbf{M}$ be a rank-$(K-m)k$ matrix such that $(\mathbf{A} \otimes \mathbf{I}_k)\,\mathbf{M} = \mathbf{0}$, i.e., the columns of $\mathbf{M}$ form a basis of the null space of $\mathbf{A} \otimes \mathbf{I}_k$. Then the set of all solutions to the system is

$$\left\{ \mathbf{M}\mathbf{v} + \mathbf{w}_0 : \mathbf{v} \in \mathbb{F}_p^{(K-m)k} \right\},$$

where $\mathbf{w}_0$ is the coset leader. From the above, we conclude that the undithered codeword must be of the form

$$\mathbf{t} = \frac{1}{p}\,\mathbf{B}\,\phi\!\left( \mathbf{G}\mathbf{M}\mathbf{v} + \mathbf{G}\mathbf{w}_0 \right) \bmod \Lambda_c.$$

We will now use the property of the embedding $\phi$ that for any $\mathbf{a}, \mathbf{b} \in \mathbb{F}_p^n$,

$$\phi(\mathbf{a} + \mathbf{b}) \in \phi(\mathbf{a}) + \phi(\mathbf{b}) + p\,\mathbb{Z}^n.$$

Therefore, $\phi(\mathbf{G}\mathbf{M}\mathbf{v} + \mathbf{G}\mathbf{w}_0) = \phi(\mathbf{G}\mathbf{M}\mathbf{v}) + \phi(\mathbf{G}\mathbf{w}_0) + p\,\mathbf{q}$ for some $\mathbf{q} \in \mathbb{Z}^n$. Using this in the expression for $\mathbf{t}$, we obtain

$$\mathbf{t} = \left( \frac{1}{p}\,\mathbf{B}\,\phi(\mathbf{G}\mathbf{M}\mathbf{v}) + \frac{1}{p}\,\mathbf{B}\,\phi(\mathbf{G}\mathbf{w}_0) \right) \bmod \Lambda_c,$$

where we have used the distributive property of the $\bmod~\Lambda_c$ operation, and the fact that $\mathbf{B}\mathbf{q} \in \Lambda_c$. Since the receiver knows $\mathbf{w}_0$, the component of $\mathbf{t}$ unavailable from the side information is

$$\mathbf{t}' = \frac{1}{p}\,\mathbf{B}\,\phi(\mathbf{G}\mathbf{M}\mathbf{v}) \bmod \Lambda_c.$$

Let $C'$ be the subcode of $C$ with generator matrix $\mathbf{G}\mathbf{M}$, and let $\Lambda_f'$ be the lattice obtained by applying Construction A to $C'$ and transforming it by $\frac{1}{p}\mathbf{B}$, i.e.,

$$\Lambda_f' = \frac{1}{p}\,\mathbf{B}\left( \phi(C') + p\,\mathbb{Z}^n \right).$$

Using $\mathbf{G}\mathbf{M}$ instead of $\mathbf{G}$ in the lemma of Section 2.3, we see that $\Lambda_c \subset \Lambda_f'$ and that $\mathbf{v} \mapsto \mathbf{t}'$ is a one-to-one correspondence between $\mathbb{F}_p^{(K-m)k}$ and $\Lambda_f' \bmod \Lambda_c$ as long as $\mathbf{G}\mathbf{M}$ is full rank. Together with the preceding expressions for $\mathbf{t}$ and $\mathbf{x}$, we conclude that the transmit vector belongs to the following lattice subcode of the code $(\Lambda_f + \mathbf{d}) \bmod \Lambda_c$,

$$\left( \Lambda_f' + \frac{1}{p}\,\mathbf{B}\,\phi(\mathbf{G}\mathbf{w}_0) + \mathbf{d} \right) \bmod \Lambda_c.$$

The decoding problem at the second stage is to estimate $\mathbf{t}'$, or equivalently $\mathbf{v}$, from the channel output.

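The first decoding stage can be sketched for the special case $k = 1$. The RREF-based null space routine and the toy side information matrix below are illustrative, not the paper's notation:

```python
# First decoding stage for k = 1: compute a basis M of the null space of the
# side information matrix A over F_p; every message vector consistent with
# A*w = u is then w0 + M*v. (Toy sizes, brute-force verification.)
import itertools

p, K = 3, 3
A = [[1, 1, 1]]                                   # m = 1 side info equation

def null_space_mod_p(A, p):
    A = [row[:] for row in A]
    n_cols, pivots, row = len(A[0]), [], 0
    for c in range(n_cols):
        piv = next((r for r in range(row, len(A)) if A[r][c] % p), None)
        if piv is None:
            continue
        A[row], A[piv] = A[piv], A[row]
        inv = pow(A[row][c], p - 2, p)            # inverse in F_p
        A[row] = [(x * inv) % p for x in A[row]]
        for r in range(len(A)):
            if r != row and A[r][c] % p:
                f = A[r][c]
                A[r] = [(a - f * b) % p for a, b in zip(A[r], A[row])]
        pivots.append(c)
        row += 1
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for f_col in free:                            # one basis vector per free column
        v = [0] * n_cols
        v[f_col] = 1
        for r, c in enumerate(pivots):
            v[c] = (-A[r][f_col]) % p
        basis.append(v)
    return basis                                  # basis vectors (columns of M)

M = null_space_mod_p(A, p)
assert len(M) == K - 1                            # null space dimension K - m
assert all(sum(row[i] * v[i] for i in range(K)) % p == 0
           for row in A for v in M)               # A * M = 0 over F_p

# The coset {w0 + M*v} equals the brute-force solution set of A*w = (2,).
w0 = (2, 0, 0)
coset = {tuple((w0[i] + sum(b[i] * z for b, z in zip(M, zs))) % p
               for i in range(K))
         for zs in itertools.product(range(p), repeat=len(M))}
brute = {w for w in itertools.product(range(p), repeat=K) if sum(w) % p == 2}
assert coset == brute
```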
#### MMSE Scaling and Lattice Decoding

Let the channel output at the receiver be $\mathbf{y} = \mathbf{x} + \mathbf{z}$, where $\mathbf{z}$ is a Gaussian vector with zero mean and variance $\sigma^2$ per dimension. The received vector is scaled by the MMSE coefficient $\alpha$, resulting in

$$\alpha \mathbf{y} = \mathbf{x} + \left( (\alpha - 1)\,\mathbf{x} + \alpha\,\mathbf{z} \right).$$

This MMSE pre-processing improves the effective signal-to-noise ratio of the system beyond the channel signal-to-noise ratio and allows the lattice decoder to perform close to capacity [8]. Let

$$\mathbf{z}_{\mathrm{eff}} = (\alpha - 1)\,\mathbf{x} + \alpha\,\mathbf{z}$$

be the effective noise term in the scaled channel output. Using the facts that $\mathbf{x}$ and $\mathbf{z}$ are independent, $\frac{1}{n}\|\mathbf{x}\|^2 \leq 1$, and $\mathbf{z}$ has zero mean, we have

$$\frac{1}{n}\, \mathbb{E} \left\| \mathbf{z}_{\mathrm{eff}} \right\|^2 \leq (\alpha - 1)^2 + \alpha^2 \sigma^2,$$

where $\mathbb{E}$ is the expectation operator. The choice $\alpha = \frac{1}{1 + \sigma^2}$ minimizes this upper bound and yields the value $\frac{\sigma^2}{1 + \sigma^2}$, which is less than the Gaussian noise power $\sigma^2$. In the rest of the paper we will assume that $\alpha = \frac{1}{1 + \sigma^2}$ and use the notation

$$\sigma_{\mathrm{eff}}^2 = \frac{\sigma^2}{1 + \sigma^2}.$$

Since $\mathrm{SNR} = 1/\sigma^2$, the effective noise power can be rewritten as $\sigma_{\mathrm{eff}}^2 = \frac{1}{1 + \mathrm{SNR}}$, so that any lower bound on the signal-to-noise ratio is equivalent to an upper bound on $\sigma_{\mathrm{eff}}^2$.

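The optimality of the MMSE coefficient is a one-line calculus fact that can also be checked numerically (a sketch with an arbitrary illustrative noise power):

```python
# Check that alpha = 1/(1 + sigma^2) minimizes the effective-noise bound
# (alpha - 1)^2 + alpha^2 * sigma^2, giving sigma^2/(1 + sigma^2) < sigma^2.
sigma2 = 0.5                       # illustrative noise variance

def eff_noise_power(alpha):
    return (alpha - 1) ** 2 + alpha ** 2 * sigma2

alpha_star = 1 / (1 + sigma2)      # the MMSE coefficient
grid_min = min(eff_noise_power(a / 10000) for a in range(10001))

assert abs(eff_noise_power(alpha_star) - sigma2 / (1 + sigma2)) < 1e-12
assert eff_noise_power(alpha_star) <= grid_min + 1e-9   # optimal over [0, 1]
assert eff_noise_power(alpha_star) < sigma2             # better than no scaling
```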
From the first decoding stage, we know that $\mathbf{x} = \left( \mathbf{t}' + \frac{1}{p}\,\mathbf{B}\,\phi(\mathbf{G}\mathbf{w}_0) + \mathbf{d} \right) \bmod \Lambda_c$ for some $\mathbf{t}' \in \Lambda_f' \bmod \Lambda_c$. After MMSE scaling, the decoder removes the contributions of the dither $\mathbf{d}$ and the offset $\frac{1}{p}\,\mathbf{B}\,\phi(\mathbf{G}\mathbf{w}_0)$ from $\alpha\mathbf{y}$ to obtain

$$\mathbf{y}' = \left( \alpha\mathbf{y} - \mathbf{d} - \frac{1}{p}\,\mathbf{B}\,\phi(\mathbf{G}\mathbf{w}_0) \right) \bmod \Lambda_c = \left( \mathbf{t}' + \mathbf{z}_{\mathrm{eff}} \right) \bmod \Lambda_c.$$

The decoder proceeds by quantizing $\mathbf{y}'$ to the lattice $\Lambda_f'$ and reducing the result modulo $\Lambda_c$. If the noise is sufficiently ‘small’, then this sequence of operations will yield

$$Q_{\Lambda_f'}\!\left( \mathbf{y}' \right) \bmod \Lambda_c = \mathbf{t}'.$$

Given the estimate of $\mathbf{t}'$, the receiver uses the side information to obtain the undithered codeword $\mathbf{t}$, and hence the message vector $\mathbf{w}$.
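
The complete decoding chain can be exercised on a one-dimensional toy example (a sketch with $\alpha = 1$, i.e., ignoring MMSE scaling, and with the noise kept below the decision radius; none of these parameters come from the paper):

```python
# One-dimensional toy of the dithered mod-lattice decoding chain, with
# alpha = 1 (no MMSE scaling), coarse lattice 8*Z and fine lattice 2*Z. For
# noise magnitude below 1 (half the minimum distance of the fine lattice),
# quantizing (y - d) mod coarse to the fine lattice and reducing mod coarse
# recovers the fine-lattice point t exactly.
import math, random

def q(x, s):                       # nearest-point quantizer of s*Z
    return s * math.floor(x / s + 0.5)

def mod_l(x, s):                   # x mod s*Z
    return x - q(x, s)

random.seed(7)
t = 2.0                            # transmitted fine-lattice point
for _ in range(100):
    d = random.uniform(-4, 4)      # dither, uniform over the Voronoi region
    z = random.uniform(-0.9, 0.9)  # additive noise below the decision radius
    x = mod_l(t + d, 8)            # transmit codeword, |x| <= 4
    y = x + z                      # channel output
    t_hat = mod_l(q(mod_l(y - d, 8), 2), 8)
    assert t_hat == t              # correct decoding whenever |z| < 1
```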