Error Correction for Index Coding With Coded Side Information


Eimear Byrne and Marco Calderini

School of Mathematical Sciences, University College Dublin, Ireland. E-mail: ebyrne@ucd.ie. Research supported by ESF COST Action IC1104.

Department of Mathematics, University of Trento, Italy. E-mail: marco.calderini@unitn.it. Research supported by ESF COST Action IC1104.

Manuscript received MONTH, YEAR.

Copyright (c) 2013 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org.

Abstract

Index coding is a source coding problem in which a broadcaster seeks to meet the different demands of several users, each of whom is assumed to have some prior information on the data held by the sender. A well-known application is to satellite communications, as described in one of the earliest papers on the subject [6]. It is readily seen that if the sender has knowledge of its clients’ requests and their side-information sets, then the number of packet transmissions required to satisfy all users’ demands can be greatly reduced if the data is encoded before sending. The collection of side-information indices as well as the indices of the requested data is described as an instance $\mathcal{I}$ of the index coding with side-information (ICSI) problem. The encoding function is called the index code of $\mathcal{I}$, and the number of transmissions employed by the code is referred to as its length. The main ICSI problem is to determine the optimal length of an index code for an instance $\mathcal{I}$. As this number is hard to compute, bounds approximating it are sought, as are algorithms to compute efficient index codes. These questions have been addressed by several authors [1, 4, 5, 7, 33, 37], often taking a graph-theoretic approach. Two interesting generalizations of the problem that have appeared in the literature are the subject of this work. The first of these is the case of index coding with coded side information [10, 34], in which linear combinations of the source data are both requested by and held as users’ side-information. This generalization has applications, for example, to relay channels and necessitates algebraic rather than combinatorial methods. The second is the introduction of error-correction in the problem, in which the broadcast channel is subject to noise [11].
In this paper we characterize the optimal length of a scalar or vector linear index code with coded side information (ICCSI) over a finite field in terms of a generalized min-rank and give bounds on this number based on constructions of random codes for an arbitrary instance. We furthermore consider the length of an optimal $\delta$-error correcting code for an instance of the ICCSI problem and obtain bounds analogous to those described in [11], both for the Hamming metric and for rank-metric errors. We describe decoding algorithms for both categories of errors based on those given in [11, 35].

Index Terms: Index coding, min-rank, error correction, minimum distance, network coding, coded side information.

I Introduction

The problem of index coding with side information (ICSI) was introduced by Birk and Kol in [6] under the term informed source coding on demand. In [4] the authors explicitly refer to the problem as index coding. This topic is motivated by applications in broadcast communications such as audio and video on-demand, content delivery, and wireless networking. It relates to a problem of source coding with side information, in which receivers have partial information about the data to be sent prior to its broadcast. The problem for the sender is to exploit knowledge of the users’ side information to encode data optimally, that is to reduce the overall length of the encoding, or equivalently, the number of transmitted packets. The ICSI problem has since become a subject of several studies and generalizations [1, 4, 5, 31, 11, 12, 33].

The scenario of the ICSI problem is the following. A server (sender) has to broadcast some data to a set of clients (receivers or users), with possibly different messages requested by different clients. Before the transmission starts, each receiver already has some data in its possession, its cached packets, called its side-information. These packets may be from a previous broadcast, perhaps sent during lighter data traffic periods, or acquired by some other communication. The receivers let the sender know which messages they have, and which they require. The broadcaster can use this information, along with encoding, to reduce the overall number of packet transmissions required to satisfy all the demands of its clients. If the sender has been successful in this endeavour, then the broadcast data can be used by each user, along with its cached packets, to decode its own specific demand.

The main index coding problem is to determine the minimum number of packet transmissions required by the sender in order to satisfy all users’ requests, if encoding of data is permitted. Given an instance of the ICSI problem, Bar-Yossef et al. [4] proved that finding the best scalar linear binary index code is equivalent to finding the min-rank of a graph, which is known to be an NP-hard problem [30]. The twin problem is to determine an explicit optimal encoding function for an instance. Any encoding function for an instance necessarily gives an upper bound on the optimal length of an index code. A number of papers address this aspect of the problem by finding sub-optimal but feasible solutions, using linear programming methods to obtain partitions of the users into solvable subsets. Such solutions involve obtaining clique covers, partial-clique covers, multicast partitions and some variants of these [7, 8, 33, 34, 37]. Other than these LP approaches, low-rank matrix completion methods may also be applied. This was considered for index coding over the real numbers in [22].

The importance of the index coding problem can also be seen in its equivalences and connections to other problems, such as network coding, coded-caching and interference alignment [15, 31, 16, 29]. These equivalences mean that results in index coding have impact in such other areas, and vice versa.

In [34, 10] the authors give a generalization of the index coding problem in which both demanded packets and locally cached packets may be linear combinations of some set of data packets. We refer to this as the index coding with coded side information problem (ICCSI). This represents a significant departure from the ICSI problem in that an ICCSI instance no longer has an obvious association to a graph, digraph or hypergraph, as in the ICSI case. However, as we show here, it turns out that many of the results for index coding have natural extensions in the ICCSI problem.

One motivation for the ICCSI generalization is related to the coded-caching problem. The method in [16] uses uncoded cache placement, but the authors give an example to show that coded cache placement performs better in general. In [17], it is shown that in a small cache size regime, when the number of users is not less than the number of files, a scheme based on coded cache placement is optimal. Moreover in [18] the authors show that the only way to improve the scheme given in [16] is by coded cache placement.

Another motivation is toward applications for wireless networks with relay helper nodes and cloud storage systems (see [10] and the references therein). Consider the example in Table I. We have a scenario with one sender and four receivers , , and . The source node has four packets and and for user wants packet . The transmitted packet is subject to independent erasures. It is assumed that there are feedback channels from the users, informing the transmitting node which packets are successfully received. At the beginning, in time slot 1, 2, 3 and 4 the source node transmits packets , , and , respectively. After time slot 4 we have the following setting: has packet , has packet , has packet and has packet . Now from the classical ICSI problem we have that receivers and form a clique, in the associated graph, and then we can satisfy their request sending . Similarly for and we can use . So, the source node in time slot 5 and 6 transmits the coded packet and , intending that users receive the respective packet. However, and receive the coded packet and and receive . At this point if only the uncoded packets in their caches are used, we still need to send two packets. If all packets in their caches are used, the source only needs to transmit one coded packet in time slot 7. If all four users can receive this last transmission successfully, then all users can decode the required packets by linearly combining with the packets received earlier.

A second generalization of the ICSI problem was given in [11], where the authors consider error correction. That is, the broadcast channel may be subject to noise during a transmission. Classical coding theory plays a role in several of the results and a number of bounds are given on the optimal length of an error correcting index code (ECIC) that corrects some Hamming errors. A decoding algorithm based on syndrome decoding is also described. We remark that error-correction for network coding has only been addressed for multicast, where rank-metric and subspace codes are proposed. There are numerous papers on this subject after the seminal works [25, 36]. Many of these are based on Gabidulin codes [20].

I-A Our Contribution

In this paper we develop the theory of index coding further along the lines of these latter mentioned generalizations. That is, we consider the ICCSI problem both in the error-free case and with respect to error correction. We assume that the source data is composed of blocks of length $t$ over $\mathbb{F}_q$ (the finite field of $q$ elements), and that encoding involves taking $\mathbb{F}_q$-linear combinations of the data blocks. In particular, we consider both linear and scalar-linear index codes. We describe a generalized min-rank for the ICCSI problem, which we show gives the optimal length for $\mathbb{F}_q$-linear encodings. This quantity is actually shown to be the minimum rank weight of the coset of an $\mathbb{F}_q$-linear matrix code determined by an ICCSI instance. We characterize necessary and sufficient conditions for a matrix $L$ to realize an instance of the ICCSI problem and use this to obtain upper bounds on the length of an optimal $\mathbb{F}_q$-linear index code with coded side information. The first of these may be viewed as a generalization of the bound obtained by the existence of a partial clique in the side-information graph of a classical index coding problem. It requires $q$ to be large although it does not rely on the use of a maximum distance separable (MDS) code. The second of these bounds offers a refinement and relaxation of the constraint on $q$ and is not explicit. Both are based on the probability that an arbitrary matrix realizes a code for an instance.

Following the work of [11], we consider error correction for the ICCSI problem, both for the Hamming and rank metric and address the question of the main index coding problem for error correcting index codes. We establish criteria for error correction for an ICCSI instance and give bounds on the optimal length of a -error correcting ECIC, both for the Hamming metric and the rank metric. These results are extensions of the and sphere-packing and Singleton bounds as described in [11]. Some of these also yield further upper bounds on the optimal length of an ICCSI code for the error-free case.

Finally, we outline decoding strategies for linear ECICs for both the rank and Hamming distance. In the first case we extend the syndrome decoding method to correct Hamming errors for index codes given in [11] to the ICCSI case. In the second, we show that the simple, low-complexity strategy for additive matrix channels given in [35] can be applied to correct rank-metric errors, that is to handle error matrices of rank upper bounded by some .

II Preliminaries

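We establish notation to be used throughout the paper. For any positive integer $n$, we let $[n] := \{1, \ldots, n\}$. We write $\mathbb{F}_q$ to denote the finite field of order $q$ and use $\mathbb{F}_q^{n \times t}$ to denote the vector space of all $n \times t$ matrices over $\mathbb{F}_q$. Given a matrix $X \in \mathbb{F}_q^{n \times t}$ we write $X_i$ and $X^j$ to denote the $i$th row and $j$th column of $X$, respectively. More generally, for subsets $\mathcal{S} \subset [n]$ and $\mathcal{T} \subset [t]$ we write $X_{\mathcal{S}}$ and $X^{\mathcal{T}}$ to denote the $|\mathcal{S}| \times t$ and $n \times |\mathcal{T}|$ submatrices of $X$ comprised of the rows of $X$ indexed by $\mathcal{S}$ and the columns of $X$ indexed by $\mathcal{T}$ respectively. We write $\langle X \rangle$ to denote the row space of $X$.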

In this work we will consider two distance functions, namely the Hamming metric and the rank metric, over the -vector space .

Choosing a basis of the finite field of elements, it is easy to see that and are isomorphic as -vector spaces. Then, given the usual definition of the Hamming distance between a pair of elements :

\[ d_H(x,y) := |\{ i : x_i \neq y_i \}|, \]

we define the Hamming distance between a pair of matrices as the number of coordinates in such that , so the number of differing rows of and .

For two matrices , the rank distance between and is the rank of the matrix over :

\[ d_{rk}(A,B) = \mathrm{rk}(A-B). \]

We write to denote either distance function between and and we write to denote . Given and a set , . In some cases we will specify explicitly which distance function should be understood, otherwise the reader should interpret or as denoting either metric.
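As a small illustration of the two metrics, the following sketch computes the row-wise Hamming distance and the rank distance of two matrices. It assumes $q = 2$ purely for simplicity, and the function names (`gf2_rank`, `hamming_distance`, `rank_distance`) and the example matrices are hypothetical choices made here, not taken from the paper.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    pivots = {}  # leading-bit position -> stored row with that leading bit
    for r in rows:
        while r:
            h = r.bit_length() - 1
            if h not in pivots:
                pivots[h] = r
                break
            r ^= pivots[h]  # cancel the leading bit and keep reducing
    return len(pivots)


def mask(row):
    """Pack a list of 0/1 entries into an integer bitmask."""
    return sum(bit << j for j, bit in enumerate(row))


def hamming_distance(A, B):
    """Number of row positions in which the matrices A and B differ."""
    return sum(1 for a, b in zip(A, B) if a != b)


def rank_distance(A, B):
    """Rank over GF(2) of A - B (subtraction is XOR in characteristic 2)."""
    diff = [[x ^ y for x, y in zip(a, b)] for a, b in zip(A, B)]
    return gf2_rank([mask(r) for r in diff])


A = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
B = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
print(hamming_distance(A, B))  # 3: all three rows differ
print(rank_distance(A, B))     # 2: the third row of A is the sum of the first two
```

In this toy case the rank distance is strictly smaller than the Hamming distance, since the rank of a matrix never exceeds its number of non-zero rows.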

Recall that for any pair of subspaces and , their sum is the subspace and we write to denote the direct sum . Moreover and are isomorphic if and only if is the trivial space. For arbitrary in the ambient space, the coset . We use the standard notation to denote that is a subspace of .

III Index Coding With Coded Side Information

In [34] the authors generalized the index coding problem so that coded packets of a data matrix may be broadcast or part of a user’s cache. As mentioned before, this finds applications in broadcast channels with helper relay nodes.

Before we present the model with coded side information, let us recall the scenario for uncoded side information (see [11, 12]). In that case, the data is a vector possessed by a single sender. There are users or receivers, each of which has an index set , called its side-information. This indicates that the th user possesses the entries of indexed by . The surjection assigns users to indices, indicating that User wants and it is also assumed that . The sender is assumed to be informed of the values and of each user.

We now describe an instance of index coding with coded side information. There is a data matrix $X \in \mathbb{F}_q^{n \times t}$ and a set of $m$ receivers or users. $X$ is thus a list of $n$ blocks of length $t$ over $\mathbb{F}_q$. For each $i \in [m]$, the $i$th user seeks some linear combination of the rows of $X$, say $R_iX$ for some $R_i \in \mathbb{F}_q^n$. We will refer to $R_i$ as the request vector and to $R_iX$ as the request packet of User $i$. A user’s cache denotes locally stored data, which it can freely access. In our model it is represented by a pair of matrices

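\[ V^{(i)} \in \mathbb{F}_q^{d_i \times n} \quad \text{and} \quad \Lambda^{(i)} \in \mathbb{F}_q^{d_i \times t} \]

related by the equation

\[ \Lambda^{(i)} = V^{(i)} X. \]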

While the matrix may be unknown to User , it is assumed that any vector in the row spaces of and can be generated at the th receiver. We denote these respective row spaces by and for each . The side information of the th user is . Similarly, the sender has the pair of row spaces for matrices

\[ V^{(S)} \in \mathbb{F}_q^{d_S \times n} \quad \text{and} \quad \Lambda^{(S)} = V^{(S)} X \in \mathbb{F}_q^{d_S \times t} \]

and does not necessarily possess the matrix $X$ itself.

The th user requests a coded packet with . We denote by the matrix over with each th row equal to . The matrix thus represents the requests of all users. We denote by

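\[ \mathcal{X} := \{ A \in \mathbb{F}_q^{m \times n} : A_i \in \mathcal{X}^{(i)}, \ i \in [m] \}, \]

so that $\mathcal{X}$ is the direct sum of the $\mathcal{X}^{(i)}$ as a vector space over $\mathbb{F}_q$.

We define $\tilde{\mathcal{X}} := \{ A \in \mathbb{F}_q^{m \times n} : A_i \in \mathcal{X}^{(S)}, \ i \in [m] \}$, which may be viewed as the direct sum of $m$ copies of $\mathcal{X}^{(S)}$.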

Remark III.1.

The reader will observe that the classical ICSI problem is indeed a special case of the index coding problem with coded side information (cf. [11, 12]). Setting to be the identity matrix, and to be the matrix with rows for each , yields . Then User has the rows of indexed by and requests .

Remark III.2.

The case where the sender does not necessarily possess the matrix $X$ itself can be applied to the broadcast relay channel, as described in [34]. The authors consider a channel as in Fig. 2, and assume that the relay is close to the users and far away from the source, and in particular that all relay-user links are erasure-free. Each node is assumed to have some storage capacity and stores previously received data in its cache. The packets in the cache of the relay node are obtained as previous broadcasts, hence it may contain both coded and uncoded packets. The relay node, playing the role of the sender, transmits packets obtained by linearly combining the packets in its cache, depending on the requests and coded side information of all users. It seeks to minimize the total number of broadcasts such that all users’ demands are met.

Definition III.3.

An instance of the Index Coding with Coded Side Information (ICCSI) problem is a list for some positive integers , subspaces and of of dimensions for such that and a matrix in .

For the remainder, we let be as described above and we fix to denote an instance of the ICCSI problem for these parameters. We now define what is meant by an index code for an instance : it is essentially a map that encodes any data matrix in such a way that each user, given its side-information and received transmission, can uniquely determine its requested packet .

Definition III.4.

Let $N$ be a positive integer. We say that the map

\[ E : \mathbb{F}_q^{n \times t} \to \mathbb{F}_q^{N \times t}, \]

is an -code for of length if for each th receiver, there exists a decoding map

\[ D_i : \mathbb{F}_q^{N \times t} \times \mathcal{X}^{(i)} \to \mathbb{F}_q^{t}, \]

satisfying

\[ \forall X \in \mathbb{F}_q^{n \times t} : \ D_i(E(X), A) = R_i X, \]

for some vector , in which case we say that is an -IC. is called an -linear -IC if for some , in which case we say that represents the code , or that the matrix realizes . If , we say that represents a scalar linear index code. If we say that the code is vector linear. We write to denote the space

An encoding is sought such that the length of the -IC is as small as possible. We shall be principally concerned with -linear codes for an instance . We assume that the side information matrices of all users are known to the sender, along with the demand vectors . As we’ll see in the next section, this knowledge is sufficient to determine an encoding matrix for an -linear -IC. These assumptions are in keeping with those outlined in [6] for the original informed source coding on demand problem and are based on the existence of a slow error-free reverse channel allowing communication from users to the sender. We also assume that is known to the receivers before the broadcast of the encoded matrix . This knowledge, along with the transmission and its own cache data will be used by each th user in order to compute its demand . These assumptions mean that the gains of encoding an ICCSI instance are greater as increases.

III-A Necessary and Sufficient Conditions for Realization of an $\mathbb{F}_q$-Linear $\mathcal{I}$-IC

In the following we give necessary and sufficient conditions for a matrix to represent a linear code of the instance (in fact the sufficiency of the statement of Lemma III.5 has already been noted in [34]).

Lemma III.5.

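Let $L \in \mathbb{F}_q^{N \times d_S}$. Then $L$ represents an $\mathbb{F}_q$-linear $\mathcal{I}$-IC of length $N$ if and only if for each $i \in [m]$,

\[ R_i \in \mathcal{X}^{(i)} + \langle L V^{(S)} \rangle, \]

where $\langle L V^{(S)} \rangle$ denotes the row space of $L V^{(S)}$.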

Proof.

Let and let . Suppose that has been transmitted. If then there exist such that . Then for any we have

\[ R_i X = A V^{(i)} X + B L V^{(S)} X = A \Lambda^{(i)} + B Y. \]

Therefore, Receiver , knowing and , can compute and and hence acquires .

Conversely, suppose that Then for each , we have

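\[ \operatorname{rank} \begin{bmatrix} R_i & U \\ V^{(i)} & \Lambda^{(i)} \\ L V^{(S)} & Y \end{bmatrix} = 1 + \operatorname{rank} \begin{bmatrix} V^{(i)} & \Lambda^{(i)} \\ L V^{(S)} & Y \end{bmatrix} = 1 + \operatorname{rank} \begin{bmatrix} V^{(i)} \\ L V^{(S)} \end{bmatrix} = \operatorname{rank} \begin{bmatrix} R_i \\ V^{(i)} \\ L V^{(S)} \end{bmatrix}. \]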

In particular, the linear system

\[ R_i X = U, \quad V^{(i)} X = \Lambda^{(i)}, \quad L V^{(S)} X = Y \]

is consistent for each . It follows that

\[ \Pr\left( R_i X = U \mid V^{(i)} X = \Lambda^{(i)}, \ L V^{(S)} X = Y \right) = \frac{1}{q^t}, \tag{1} \]

so the side information of conveys no information about to the th receiver. ∎

Lemma III.5 simply says that the demands of all users can be simultaneously satisfied if and only if for each the smallest vector space containing both and also contains ; in other words extending the side-information spaces by the same space in each case contains the th request vector. This is achieved, for example, if is the space for each , although this is clearly not necessary.
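The criterion can be checked mechanically with a rank test: $R_i$ lies in the sum of the row space of $V^{(i)}$ and the row space of $LV^{(S)}$ exactly when appending $R_i$ to the stacked rows does not increase the rank. The sketch below assumes $q = 2$ and takes $V^{(S)}$ to be the identity; the toy instance and all function names are hypothetical and chosen only for illustration. Each of three users holds the sum of the two packets it does not want.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    pivots = {}  # leading-bit position -> stored row with that leading bit
    for r in rows:
        while r:
            h = r.bit_length() - 1
            if h not in pivots:
                pivots[h] = r
                break
            r ^= pivots[h]
    return len(pivots)


def mask(row):
    """Pack a list of 0/1 entries into an integer bitmask."""
    return sum(bit << j for j, bit in enumerate(row))


def matmul_gf2(A, B):
    """Matrix product over GF(2) for matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*B)]
            for row in A]


def realizes(L, V_S, V_users, R):
    """Rank test for: R_i in rowspace(V^(i)) + rowspace(L V^(S)), for every i."""
    LV = [mask(r) for r in matmul_gf2(L, V_S)]
    for V_i, R_i in zip(V_users, R):
        span = [mask(r) for r in V_i] + LV
        if gf2_rank(span + [mask(R_i)]) != gf2_rank(span):
            return False  # R_i is not in the span, so user i cannot decode
    return True


# Hypothetical instance: n = 3 packets, the sender holds all of them.
V_S = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
V_users = [[[0, 1, 1]],   # user 1 caches X_2 + X_3
           [[1, 0, 1]],   # user 2 caches X_1 + X_3
           [[1, 1, 0]]]   # user 3 caches X_1 + X_2
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # user i requests X_i

print(realizes([[1, 1, 1]], V_S, V_users, R))  # True: one broadcast of X_1+X_2+X_3 suffices
print(realizes([[1, 1, 0]], V_S, V_users, R))  # False: X_1+X_2 does not help user 3
```

Here the single transmission $X_1 + X_2 + X_3$ serves every user because each request vector is the sum of the broadcast vector and that user's cached combination.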

An equivalent formulation of the statement of Lemma III.5 is to say that $L$ represents a linear index code for $\mathcal{I}$ if and only if the row space $\langle L V^{(S)} \rangle$ of $L V^{(S)}$ meets each coset $R_i + \mathcal{X}^{(i)}$. We will use this view to obtain an upper bound on the optimal length of a linear index code in Theorem III.18.

Given an matrix , we write to denote the null space of in . Furthermore, for each we define the sets:

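\[ \mathcal{Y}^{(i)} := \{ Z \in \mathbb{F}_q^{n \times t} : V^{(i)} Z = 0 \} = \left( (V^{(i)})^{\perp} \right)^{t}, \qquad \mathcal{Z}^{(i)} := \{ Z \in \mathbb{F}_q^{n \times t} : V^{(i)} Z = 0, \ R_i Z \neq 0 \}. \]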

To help put these sets in context, if has rows composed of standard basis vectors, say with leading ones indexed by the set (which means the side-information of User is uncoded) then consists of those matrices whose columns indexed by are all-zero. Then can be identified with the set , the complement of the side information of user and can be identified with .

Remark III.6.

In the classical ICSI problem, two data matrices and are called confusable at receiver (cf. [2]) if they yield the same side information for , i.e. for all , and if moreover the packets and are different (here represents the side information of the receiver and the request packet). In the ICCSI problem, two vectors are called confusable at receiver if and , i.e. if they yield the same side information for the th user but the requested data packets are different. Therefore, and are confusable at receiver if and only if lies in the set .

The essential content of next result, which follows from Lemma III.5, is that represents a linear code of if and only if any confusable pair result in different encodings. Therefore, another way of stating Corollary III.7 is:

represents an -linear -IC if and only if for any confusable pair .

Then $L$ realizes an $\mathbb{F}_q$-linear $\mathcal{I}$-IC if and only if no matrix of $\mathcal{Z}^{(i)}$, $i \in [m]$, vanishes after multiplication by $L V^{(S)}$, so the sets $\mathcal{Z}^{(i)}$ may be used to characterize all linear codes of $\mathcal{I}$. Of course $L V^{(S)} Z$ is non-zero if and only if it has positive weight. This result will be generalized further in Theorem IV.2 to give a criterion for error-correction.

Corollary III.7.

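Let $L \in \mathbb{F}_q^{N \times d_S}$. Then $L$ represents an $\mathbb{F}_q$-linear $\mathcal{I}$-IC of length $N$ if and only if $L V^{(S)} Z \neq 0$ for each $Z \in \mathcal{Z}^{(i)}$ and $i \in [m]$.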

Proof.

Fix some and let , let . Suppose that Then as in the proof of Lemma III.5, the linear system

\[ R_i Z = U, \quad V^{(i)} Z = 0, \quad L V^{(S)} Z = W \tag{2} \]

is consistent for every choice of . In particular, (2) has a solution for . Then and . We have shown that if does not represent a linear code for then for some , there exists such that . Applying the contrapositive, this yields that if (i.e. if ) for each and , then represents a linear index code for the instance .

Conversely, if there exist such that then

\[ R_i Z = A V^{(i)} Z + B L V^{(S)} Z = B L V^{(S)} Z \neq 0, \]

for any . ∎
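The confusable-set criterion of Corollary III.7 can also be tested by direct enumeration when the parameters are tiny. The following sketch assumes $q = 2$ and $t = 1$ (so each $Z$ is a single column vector) and takes the product $LV^{(S)}$ as an input called `LV`; the instance and the helper names are hypothetical, introduced only to illustrate the statement.

```python
from itertools import product


def dot(u, v):
    """Inner product over GF(2)."""
    return sum(a * b for a, b in zip(u, v)) % 2


def apply(M, z):
    """Matrix-vector product M z over GF(2), with M given as a list of rows."""
    return [dot(row, z) for row in M]


def confusable_set(V_i, R_i, n):
    """All z in F_2^n with V^(i) z = 0 and R_i z != 0 (the set Z^(i) for t = 1)."""
    return [z for z in product((0, 1), repeat=n)
            if not any(apply(V_i, z)) and dot(R_i, z) == 1]


def realizes_via_confusable(LV, V_users, R, n):
    """True when L V^(S) z != 0 for every z in every confusable set."""
    return all(any(apply(LV, z))
               for V_i, R_i in zip(V_users, R)
               for z in confusable_set(V_i, R_i, n))


# Hypothetical instance: user i wants X_i and caches the sum of the other two packets.
V_users = [[[0, 1, 1]], [[1, 0, 1]], [[1, 1, 0]]]
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(confusable_set(V_users[2], R[2], 3))
print(realizes_via_confusable([[1, 1, 1]], V_users, R, 3))
print(realizes_via_confusable([[1, 1, 0]], V_users, R, 3))
```

For this instance the broadcast vector $(1,1,1)$ separates every confusable pair, whereas $(1,1,0)$ sends the vector $(0,0,1)$ in the third user's confusable set to zero and therefore fails.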

III-B The Optimal Length of an $\mathbb{F}_q$-Linear $\mathcal{I}$-IC

We extend the definition of the min-rank of an instance of the ICSI problem, as given in [12], to the ICCSI problem. We will show that this characterizes the shortest possible length of an -linear -IC.

Definition III.8.

We define the min-rank of the instance of the ICCSI problem over to be

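\[ \kappa(\mathcal{I}) = \min\{ \operatorname{rank}(A + R) : A \in \mathbb{F}_q^{m \times n}, \ A_i \in \mathcal{X}^{(i)} \cap \mathcal{X}^{(S)} \subset \mathbb{F}_q^{n}, \ \forall i \in [m] \}. \]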

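Observe that the quantity $\kappa(\mathcal{I})$ is the rank-distance of $R$ to the $\mathbb{F}_q$-linear code $\mathcal{X} \cap \tilde{\mathcal{X}}$, or equivalently the minimum rank-weight of the coset $R + (\mathcal{X} \cap \tilde{\mathcal{X}})$.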

We now show that given the instance , the minimum length of an -linear -IC is given by its min-rank.

Lemma III.9.

The length of an optimal $\mathbb{F}_q$-linear $\mathcal{I}$-IC is $\kappa(\mathcal{I})$.

Proof.

Let have rank . From Lemma III.5, represents a linear code of length if and only if for each there exist such that

\[ R_i = B_i L V^{(S)} - A_i, \]

(i.e. if and only if for each ). Equivalently this holds if and only if there exist matrices , such that , in which case we have in the coset . In particular, we have shown that every matrix represents an -linear code for only if

\[ B L V^{(S)} \in R + (\mathcal{X} \cap \tilde{\mathcal{X}}) \]

for some , so every such has rank at least .

Now let with for each . Suppose that has rank . Since , there exists of rank satisfying . Furthermore, there exist and such that . Then

\[ R = A - B L V^{(S)} \]

so represents a linear code of length for the instance . The length is minimized for , so there exists some of rank representing a linear code for .

Lemma III.9 gives a naive algorithm for computation of a matrix for an optimal linear -IC: put each element of into row-echelon form and choose one of minimal rank . The non-zero rows of this matrix yield the required matrix . We do not suggest this as a practical approach, since it requires operations, with . We mention this here to give a concrete illustration of the realization problem. As already observed in [11], the min-rank of the instance generalizes the notion of the min-rank of the so-called side-information graph of the classical index coding problem, which is NP-hard to compute. A discussion on the various approaches to obtaining bounds on the optimal length of an index code can be read in [33], where the authors assert that graph-theoretic methods for constructing index coding schemes yield bounds on the optimal length of an index code, which are often out-performed by the min-rank. In fact all of these so-called graph-theoretic methods, which use linear programming methods to obtain (possibly sub-optimal) solutions to the linear index coding problem, can be extended to the ICCSI case. These results have been outlined in a separate forthcoming paper [8].
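To make the exhaustive search concrete, here is a minimal brute-force sketch of the min-rank of Definition III.8. It assumes $q = 2$ and that $V^{(S)}$ is the identity, so that the constraint on each row reduces to membership in the row space of $V^{(i)}$; the function names and the toy instances are hypothetical. As noted above, the search is exponential and only feasible for very small parameters.

```python
from itertools import product


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    pivots = {}  # leading-bit position -> stored row with that leading bit
    for r in rows:
        while r:
            h = r.bit_length() - 1
            if h not in pivots:
                pivots[h] = r
                break
            r ^= pivots[h]
    return len(pivots)


def mask(row):
    """Pack a list of 0/1 entries into an integer bitmask."""
    return sum(bit << j for j, bit in enumerate(row))


def row_space(V):
    """All GF(2) linear combinations of the rows of V, as sorted bitmasks."""
    vecs = {0}
    for r in V:
        m = mask(r)
        vecs |= {v ^ m for v in vecs}
    return sorted(vecs)


def min_rank(V_users, R):
    """Minimum of rank(A + R) over all A whose i-th row lies in rowspace(V^(i))."""
    choices = [[a ^ mask(R_i) for a in row_space(V_i)]
               for V_i, R_i in zip(V_users, R)]
    return min(gf2_rank(rows) for rows in product(*choices))


R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]                   # user i requests X_i
coded = [[[0, 1, 1]], [[1, 0, 1]], [[1, 1, 0]]]          # each user caches one coded packet
empty = [[], [], []]                                     # no side information at all

print(min_rank(coded, R))  # 1
print(min_rank(empty, R))  # 3
```

With the coded caches every row of $A + R$ can be chosen equal to $(1,1,1)$, giving rank one, while without side information the only candidate is $R$ itself, of rank three.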

III-C Upper Bounds on the Optimal Length of an $\mathbb{F}_q$-Linear $\mathcal{I}$-IC

We now give upper bounds on $\kappa(\mathcal{I})$, applying probabilistic arguments. The main results are Corollary III.13 and Theorem III.18, both of which show that with certain constraints on $q$, there exists an $\mathbb{F}_q$-linear $\mathcal{I}$-IC of length $N$. While both results essentially give lower bounds on the probability that a random matrix represents an $\mathbb{F}_q$-linear $\mathcal{I}$-IC, the key point is that these probabilities are positive, so that existence is guaranteed. It is from this observation that upper bounds on $\kappa(\mathcal{I})$ are achieved.

We will use the following theorem proved by Zippel [38] (see also [14, 32]). We state it here for finite fields.

Theorem III.10.

Let be positive integers with and let be a non-zero multivariate polynomial in for which the largest exponent of any variable is at most . If is chosen uniformly at random in then the probability that equals zero is at most .

Remark III.11.

Before proving the following theorem, we note that if are independent uniformly distributed random variables that take their values over a field , then the random variable

\[ Z_\ell = \sum_{i=1}^{\ell} \alpha_i X_i, \]

for some , , has a uniform distribution.

This is easily shown by an inductive argument. Clearly for any since . Moreover, for any ,

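\[ P(Z_\ell = \beta) = P(Z_{\ell-1} = \beta - \alpha_\ell X_\ell) = \sum_{\gamma \in \mathbb{F}_q} P(X_\ell = \gamma) \, P(Z_{\ell-1} = \beta - \alpha_\ell \gamma) = \frac{1}{q}. \]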
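The uniformity claim can be confirmed by exhaustive enumeration in a small case. The sketch below assumes that $q = p$ is prime, so that arithmetic modulo $p$ models the field, and that at least one coefficient $\alpha_i$ is non-zero; the function name `distribution` and the chosen coefficients are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def distribution(alphas, p):
    """Exact distribution of sum(alpha_i * X_i) mod p for independent uniform X_i in F_p."""
    counts = Counter(
        sum(a * x for a, x in zip(alphas, xs)) % p
        for xs in product(range(p), repeat=len(alphas))
    )
    total = p ** len(alphas)
    return {beta: Fraction(counts[beta], total) for beta in range(p)}


# Coefficients (2, 0, 3) over F_5: not all zero, so the sum should be uniform.
dist = distribution((2, 0, 3), 5)
print(dist)
```

Every value of $\beta$ receives probability $1/5$ here, in agreement with the inductive argument; with all coefficients equal to zero the sum would instead be identically zero.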

Let be the number of distinct equivalence classes of under the relation if . Let be a set of representatives for the distinct equivalence classes of .

Theorem III.12.

Let be an instance of an ICCSI problem and let . Suppose that . If the entries of a matrix are chosen uniformly at random in , then the probability that represents a linear code for is at least .

Proof.

From Corollary III.7, if for each then represents a code for . For each , let satisfy and have rank . Write . The matrix represents a code for if is a full-rank matrix for each , which holds if and only if there exists a non-zero minor of . Since the entries of are uniformly distributed, so are the entries of , from Remark III.11. Each such minor has the form Now may be viewed as a polynomial in variables of degree with each variable appearing with multiplicity at most in any term. Then the probability that represents a code for is the probability that is non-zero, which from Theorem III.10 is at least , for . ∎

Corollary III.13.

If then .

Proof.

Theorem III.12 guarantees the existence of some matrix that represents an -linear -IC of length . The result is now immediate since . ∎

Remark III.14.

In fact Schwartz’s result [32] gives the lower bound of on the probability of an matrix representing an -IC, where is the average of the . While this may give a higher lower bound, it places the restriction and so in particular yields a weaker version of Corollary III.13.

Remark III.15.

Note that if for some , is non-zero for any , then it satisfies the decoding criterion for any possible request vector , and hence delivers all possible requests to User . Therefore, Theorem III.12 and Corollary III.13 should be viewed in the context of similar results in [6, 37], which lead to partial clique-cover and partition multicast schemes. There is also a close association with the so-called Main Network Coding Theorem [19, Theorem 2.2] for multicast network coding. All of these results rely on the field size being sufficiently large to invoke Zippel’s theorem and its variants.

Remark III.16.

The approach in [6] to construct a linear IC for a partial clique is based on maximum distance separable (MDS) codes (this can be used also in the more general case of a multicast group, as described in [37]). Any generator matrix of an MDS linear code of length and dimension is such that any columns are linearly independent. Suppose that is an ICSI instance and that is the number of uncoded packets known to the receiver . Let be a generator matrix of an MDS code of length and dimension . Then the sender can broadcast the following linear combination of the columns of :

 X1G1+...+XnGn.

Without loss of generality, suppose that some receiver has , and can thus recover

 X1G1+.