Bounding the optimal rate of the ICSI and ICCSI problem

Eimear Byrne and Marco Calderini

E. Byrne: School of Mathematical Sciences, University College Dublin, Ireland. Email: ebyrne@ucd.ie
M. Calderini: Department of Mathematics, University of Trento, Italy. Email: marco.calderini@unitn.it
Research supported by ESF COST Action IC1104.

Abstract

In this work we study both the index coding with side information (ICSI) problem introduced by Birk and Kol in 1998 and the more general problem of index coding with coded side information (ICCSI), described by Shum et al. in 2012. We estimate the optimal rate of an instance of the index coding problem. In the ICSI case, we characterize those digraphs having min-rank one less than their order and we give an upper bound on the min-rank of a hypergraph whose incidence matrix can be associated with that of a 2-design. Security aspects are discussed in the particular case when the design is a projective plane. For the coded side information case, we extend the graph theoretic upper bounds given by Shanmugam et al. in 2014 on the optimal rate of an index code.

Keywords

Index coding, network coding, coded side information, broadcast with side information, min-rank.

I Introduction

Since its introduction in [6], the problem of index coding has been generalized in a number of directions [1, 3, 8, 13, 14, 16]. It has attracted much interest in recent years; from the theoretical perspective, its equivalence to network coding has established it as an important area of network information theory [18, 17]. In the classical case, a central broadcaster has a data file. There are several users, each of whom already possesses some subset of the components of the file as side information and each of whom requests some component of the file. The index coding problem is to determine the minimum number of transmissions required so that the demands of all users can be met, given that data may be encoded prior to broadcast. This problem can be associated with a directed graph, or with a hypergraph if the number of users is allowed to differ from the number of components of the file.

Several authors have given bounds on the length of an index code, which refers to the number of transmissions used to meet the clients' demands for a given instance of the problem. It is well known that for the case of linear index coding, the min-rank of the associated side-information graph is the minimal number of broadcasts required. In [24], the authors give several graph theoretic upper bounds based on linear programming. In [16] the authors describe the scenario of linear index coding with coded side information. In this model, users may request a linear combination of the data held by the sender and are each assumed to have some set of linear combinations of the data packets. One motivation for this more general model is that it may serve a larger number of applications than the case of uncoded side information, such as broadcast relay networks and wireless distributed storage systems. The set-up in [16] does not have an obvious representation in the form of a side-information hypergraph. However, as we show here, practically all the results of [24] can be extended to this case.

In this paper we present new bounds on the optimal rate for different instances of the index coding problem. For the case of uncoded side information the problem will be referred to as an index coding with side information (ICSI) problem. For the case of encoded side information we will describe this as an ICCSI instance. In the first part we give bounds on the minimum number of transmissions required for particular instances of the ICSI problem where the corresponding side-information hypergraph can be associated with the incidence matrix of a design. This comprises Sections II-V. The remainder of the paper is concerned with upper bounds on the total transmission time for the ICCSI problem and extends the results of [24] for this more general case. In Section II we give relevant definitions and results on incidence structures such as designs. In Section III the ICSI problem is described. In Section IV, extending results of [15], we characterize those digraphs having min-rank one less than their order. In Section V we give an upper bound on the min-rank of a hypergraph whose incidence matrix can be associated with that of a 2-design and discuss a security aspect for such special instances of the ICSI problem. In Section VI we describe the ICCSI problem before finally giving several upper bounds on the transmission time of an ICCSI instance based on linear programming.

II Preliminaries

We establish some notation to be used throughout the paper. We will assume that $q$ is a power of a prime $p$. For any positive integer $n$, we let $[n] := \{1, \dots, n\}$. We write $\mathbb{F}_q$ to denote the finite field of order $q$ and use $\mathbb{F}_q^{n \times t}$ to denote the vector space of all $n \times t$ matrices over $\mathbb{F}_q$.

Given a matrix we write and to denote the th row and th column of , respectively. More generally, for subsets and we write and to denote the and submatrices of comprised of the rows of indexed by and the columns of indexed by respectively. We write to denote the row space of .

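A finite incidence structure $\mathcal{D} = (\mathcal{P}, \mathcal{B}, \mathcal{I})$ consists of a pair of finite sets $\mathcal{P}$ (its points) and $\mathcal{B}$ (its blocks), and an incidence relation $\mathcal{I} \subseteq \mathcal{P} \times \mathcal{B}$. We say that $p$ is contained in $B$ or is incident with $B$ if $(p, B) \in \mathcal{I}$.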

Definition II.1.

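Let $t$, $v$, $k$ and $\lambda$ be positive integers. An incidence structure with point set $\mathcal{P}$ and block set $\mathcal{B}$ is called a $t$-$(v, k, \lambda)$ block design if

  • $|\mathcal{P}| = v$;

  • $|B| = k$ for all $B \in \mathcal{B}$;

  • every $t$-set of points of $\mathcal{P}$ is contained in precisely $\lambda$ blocks of $\mathcal{B}$.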

Often a $t$-$(v, k, \lambda)$ block design is simply referred to as a $t$-design. Designs are well-studied objects in combinatorics with many applications. The interested reader is referred to [27, 11, 10] for further information, but we present sufficient detail here to meet our purposes. The number of blocks of a $t$-$(v, k, \lambda)$ design is $b = \lambda \binom{v}{t} / \binom{k}{t}$ and the number of blocks containing any given point of the design is $r = bk/v$, which is its replication number. In the case of a 2-design we have $r = \lambda (v-1)/(k-1)$. An important parameter of a 2-design is its order, defined to be $n = r - \lambda$.
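As a quick sanity check on these counting formulas, the following Python sketch evaluates the number of blocks, the replication number and the order for given parameters. The function names `design_counts` and `order_of_2_design` are illustrative and do not come from the paper; the code assumes the standard identities $b = \lambda \binom{v}{t} / \binom{k}{t}$, $bk = vr$ and order $= r - \lambda$.

```python
from math import comb


def design_counts(t, v, k, lam):
    """Return (b, r) for a t-(v, k, lam) design.

    b is the number of blocks and r the replication number. The function
    only checks the integrality conditions implied by the counting
    identities; it does not decide whether such a design exists.
    """
    blocks_num = lam * comb(v, t)
    blocks_den = comb(k, t)
    if blocks_num % blocks_den:
        raise ValueError("b would not be an integer")
    b = blocks_num // blocks_den
    if (b * k) % v:
        raise ValueError("r would not be an integer")
    r = b * k // v  # double counting incident point-block pairs: b*k = v*r
    return b, r


def order_of_2_design(v, k, lam):
    """Order r - lam of a 2-(v, k, lam) design."""
    _, r = design_counts(2, v, k, lam)
    return r - lam


if __name__ == "__main__":
    print(design_counts(2, 7, 3, 1))    # parameters of the Fano plane
    print(order_of_2_design(13, 4, 1))  # projective plane on 13 points
```

For the parameters 2-(7, 3, 1) this gives 7 blocks, replication number 3 and order 2.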

Definition II.2.

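Let $\mathcal{D}$ be an incidence structure with $v$ points and $b$ blocks. Let the points be labelled $\{p_1, \dots, p_v\}$ and the blocks be labelled $\{B_1, \dots, B_b\}$. An incidence matrix for $\mathcal{D}$ is a $b \times v$ matrix $A = (a_{i,j})$ with entries in $\{0, 1\}$ such that

$$a_{i,j} = \begin{cases} 1 & \text{if } p_j \text{ is incident with } B_i, \\ 0 & \text{otherwise.} \end{cases}$$

The code of $\mathcal{D}$ over $\mathbb{F}_q$ is the subspace $C_q(\mathcal{D})$ of $\mathbb{F}_q^v$ spanned by the rows of $A$.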

Definition II.3.

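Let $\mathcal{D}$ be an incidence structure and let $q$ be a prime power. The $q$-rank of $\mathcal{D}$ is the dimension of the code $C_q(\mathcal{D})$ of $\mathcal{D}$ over $\mathbb{F}_q$ and is written

$$\mathrm{rank}_q(\mathcal{D}) = \dim C_q(\mathcal{D}).$$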

The following result was proved by Klemm [19]. We will see in Section V that this gives an immediate upper bound on the min-rank of a class of instances of the index coding problem.

Theorem II.4.

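Let $\mathcal{D}$ be a 2-$(v, k, \lambda)$ design of order $n$ with $b$ blocks and let $p$ be a prime dividing $n$. Then

$$\mathrm{rank}_p(\mathcal{D}) \le \frac{b + 1}{2}.$$

Moreover, if $p$ does not divide $\lambda$ and $p^2$ does not divide $n$, then

$$C_p(\mathcal{D})^{\perp} \subseteq C_p(\mathcal{D})$$

and $\mathrm{rank}_p(\mathcal{D}) \ge v/2$.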

A 2-$(n^2 + n + 1, n + 1, 1)$ design, for $n \ge 2$, is called a projective plane of order $n$. A projective plane of order $n$ is an example of a symmetric design, that is, it has the same number of points as blocks, so $b = v$.

The following can be read in [2, Theorem 6.3.1].

Theorem II.5.

Let $\Pi$ be a projective plane of order $n$ and $p$ be a prime such that $p \mid n$. Then the $p$-ary code of $\Pi$, $C_p(\Pi)$, has minimum distance $n + 1$. Moreover the codewords of minimal weight in $C_p(\Pi)$ are the scalar multiples of the rows of the incidence matrix of $\Pi$.
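The smallest case of this statement can be checked by brute force. The sketch below builds the binary code of the Fano plane (the projective plane of order 2) and confirms that its minimum weight is 3 and that the minimum-weight codewords are exactly the seven rows of the incidence matrix. The block labelling `FANO_BLOCKS` (cyclic shifts of {1, 2, 4} modulo 7) and all function names are assumptions made for illustration.

```python
# One standard labelling of the Fano plane: cyclic shifts of {1, 2, 4} mod 7.
FANO_BLOCKS = [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7),
               (5, 6, 1), (6, 7, 2), (7, 1, 3)]


def incidence_rows(blocks, v):
    """0/1 incidence vectors of the blocks, with points labelled 1..v."""
    return [tuple(int(p in blk) for p in range(1, v + 1)) for blk in blocks]


def span_gf2(rows):
    """All GF(2)-linear combinations of the given rows."""
    words = {tuple([0] * len(rows[0]))}
    for row in rows:
        # Adding `row` to every word found so far doubles the span if needed.
        words |= {tuple(a ^ b for a, b in zip(w, row)) for w in words}
    return words


def minimum_weight_words(code):
    """Return (d, set of words of weight d) over the nonzero words of the code."""
    nonzero = [w for w in code if any(w)]
    d = min(sum(w) for w in nonzero)
    return d, {w for w in nonzero if sum(w) == d}


if __name__ == "__main__":
    rows = incidence_rows(FANO_BLOCKS, 7)
    d, lightest = minimum_weight_words(span_gf2(rows))
    print(d, lightest == set(rows))
```

Over the binary field the only nonzero scalar is 1, so "scalar multiples of the rows" reduces to the rows themselves in this check.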

Chouinard, in [9], proved that:

Theorem II.6.

Let $C$ be a code arising from a projective plane of prime order $p$. Then no codeword has weight in the interval $[p + 2, 2p - 1]$.

Definition II.7.

A digraph is a pair $G = (\mathcal{V}, \mathcal{E})$ where:

  • $\mathcal{V}$ is the set of vertices of $G$,

  • $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ is the set of arcs (or directed edges) of $G$.

An arc of $G$ is an ordered pair $e = (u, v)$ for some $u, v \in \mathcal{V}$. In the case that $e = (u, v)$, the vertex $u$ is called the tail of $e$ and $v$ the head of $e$. The arc $e$ is called an out-going arc of $u$ and an in-coming arc of $v$. The out-degree of a vertex is the number of its out-going arcs, and the in-degree of a vertex is the number of its in-coming arcs. $G$ is called an undirected graph, or a graph, if $(u, v) \in \mathcal{E}$ whenever $(v, u) \in \mathcal{E}$. If $G$ is a graph then each pair of arcs $(u, v)$ and $(v, u)$ is represented by the unordered pair $\{u, v\}$, which is called an edge. The number of vertices of a digraph is called its order.

We assume that all digraphs have finite order.

Definition II.8.

A path in a graph (respectively in a digraph) is a sequence of distinct vertices $(v_1, \dots, v_r)$ such that $\{v_i, v_{i+1}\}$ is an edge ($(v_i, v_{i+1})$ is an arc, respectively) for all $1 \le i \le r - 1$. If a path is closed, i.e. $\{v_r, v_1\}$ is an edge ($(v_r, v_1)$ is an arc, respectively), then it is called a circuit. A digraph that is not a graph is called acyclic if it contains no circuits. A graph is acyclic if it has no circuits with at least 3 vertices.

Let $\nu(G)$ be the circuit packing number of $G$, namely, the maximum number of vertex-disjoint circuits in $G$. A feedback vertex set of $G$ is a set of vertices whose removal destroys all circuits in $G$. Let $\tau(G)$ denote the minimum size of a feedback vertex set of $G$. We denote by $\alpha(G)$ the maximum size of a vertex subset whose induced subgraph in $G$ is acyclic. Since such a subset of vertices is the complement of a feedback vertex set, we have $\alpha(G) = n - \tau(G)$, where $n$ is the order of $G$. In the case that $G$ is a graph, $\alpha(G)$ is the maximum size of an independent (pairwise non-adjacent) set of vertices.

Definition II.9.

A clique of a digraph is a set of vertices that induces a complete subgraph of that digraph. A clique cover of a digraph is a set of cliques that partition its vertex set. A minimum clique cover of a digraph is a clique cover having minimum number of cliques. The number of cliques in such a minimum clique cover of a digraph is called the clique cover number of that digraph. We denote by the clique cover number of a digraph .

Definition II.10.

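Let $G$ be a digraph of order $n$ with vertex set $[n] = \{1, \dots, n\}$. A matrix $M = (m_{i,j}) \in \mathbb{F}_q^{n \times n}$ is said to fit $G$ if

$$\begin{cases} m_{i,j} \neq 0 & \text{if } i = j, \\ m_{i,j} = 0 & \text{if } i \neq j \text{ and } (i, j) \text{ is not an arc of } G. \end{cases}$$

The min-rank of $G$ over $\mathbb{F}_q$ is defined to be

$$\mathrm{minrk}_q(G) = \min\{\mathrm{rank}_q(M) : M \text{ fits } G\}.$$

We also have analogous definitions for a graph.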
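For very small digraphs the min-rank over the binary field can be found by exhaustive search. The sketch below is illustrative only: it assumes the usual convention that a fitting matrix has nonzero diagonal, may be arbitrary on positions $(i, j)$ that are arcs, and is zero elsewhere. Vertices are labelled from 0, and the names `rank_gf2` and `min_rank_gf2` are hypothetical.

```python
from itertools import product


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    pivots = {}  # position of leading bit -> reduced row with that leading bit
    for x in rows:
        while x:
            top = x.bit_length() - 1
            if top not in pivots:
                pivots[top] = x
                break
            x ^= pivots[top]
    return len(pivots)


def min_rank_gf2(n, arcs):
    """Minimum GF(2)-rank over all matrices fitting the digraph ([0..n-1], arcs).

    Brute force over the 2**len(arcs) fitting matrices, so only suitable
    for tiny examples.
    """
    free = sorted((i, j) for (i, j) in set(arcs) if i != j)
    best = n
    for bits in product((0, 1), repeat=len(free)):
        rows = [1 << i for i in range(n)]  # diagonal entries are forced to 1
        for (i, j), bit in zip(free, bits):
            if bit:
                rows[i] |= 1 << j  # entry (i, j) may be nonzero only on arcs
        best = min(best, rank_gf2(rows))
    return best


if __name__ == "__main__":
    print(min_rank_gf2(3, [(0, 1), (1, 2), (2, 0)]))  # directed 3-circuit
```

On a directed circuit with three vertices the search returns 2, on a complete digraph it returns 1, and on an acyclic digraph it returns the order.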

Definition II.11.

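A (directed) hypergraph $\mathcal{H}$ is a pair $(\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is a set of vertices and $\mathcal{E}$ is a set of hyperarcs. A hyperarc $e$ is itself an ordered pair $(v, H)$, where $v \in \mathcal{V}$ and $H \subseteq \mathcal{V}$; these represent, respectively, the tail and the head of the hyperarc $e$.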

Definition II.12.

Let and . Let the hyperarcs be labelled
, a matrix fits the hypergraph if

The min-rank of over is defined to be

III Index coding with side information

The Index Coding with Side Information (ICSI) problem is described as follows. There is a unique sender , who has a data matrix . There are also receivers, each with a request for a data packet , and it is assumed that each receiver has some side-information, that is, a client has a subset of messages , where for each . The packet requested by is denoted by , where is a (surjective) demand function. Here we assume that for all . We may assume that each th receiver requests only the message , since a receiver requesting more than one message can be split into multiple receivers, each of whom requests only one message and has the same side information set as the original [1].

For the remainder, let us fix to denote those parameters as described above. Then for any and map , the corresponding instance of the ICSI problem (or the ICSI instance) is denoted by . It can also be conveniently described by a side-information (directed) hypergraph [1].

Definition III.1.

Let be an ICSI instance. The corresponding side information hypergraph has vertex set and hyperarc set , defined by

Remark III.2.

If we have and for all , the corresponding side information hypergraph has precisely hyperarcs, each with a different origin vertex. It is simpler to describe such an ICSI instance as a digraph , the so-called side information digraph [3]. For each hyperarc of , there are arcs of , for . Equivalently, .

Definition III.3.

Let be a positive integer. We say that the map

is an -code of length for the instance if for each there exists a decoding map

satisfying

in which case we say that is an -IC. is called an -linear -IC if for some , in which case we say that represents the code . If , is called scalar linear.

The following well-known results quantify the minimal length of a linear index code in terms of its side-information hypergraph (cf. [13]).

Lemma III.4.

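An index code of length $N$ over $\mathbb{F}_q$ for the instance has a linear encoding map if and only if there exists a matrix $L \in \mathbb{F}_q^{N \times n}$ such that for each receiver $i$, there exists a vector $u^{(i)} \in \mathbb{F}_q^{n}$ satisfying

$$\mathrm{Supp}(u^{(i)}) \subseteq \mathcal{X}_i, \qquad (1)$$

$$u^{(i)} + e_{f(i)} \in \langle L \rangle, \qquad (2)$$

where $\mathcal{X}_i$ is the side-information index set of receiver $i$, $f(i)$ is the index of its requested packet, $e_j$ is the $j$th standard basis vector and $\langle L \rangle$ is the row space of $L$.

Theorem III.5.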

Let be an instance of the ICSI problem, and its hypergraph. Then the optimal length of a -ary linear -IC is .
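The decodability criterion behind these results can be tested directly on small binary examples. The sketch below paraphrases the criterion as follows, which is an assumption about its exact form: a receiver can decode when the row space of the encoding matrix contains a vector that is nonzero at the requested index and whose remaining support lies inside the receiver's side information. All names (`row_space_gf2`, `can_decode`, `is_linear_index_code`) are hypothetical, and indices start at 0.

```python
def row_space_gf2(rows):
    """All GF(2)-linear combinations of the rows of the encoding matrix."""
    words = {tuple([0] * len(rows[0]))}
    for row in rows:
        words |= {tuple(a ^ b for a, b in zip(w, row)) for w in words}
    return words


def can_decode(words, wanted, side_info):
    """True if some word has a 1 at `wanted` and is otherwise supported on side_info."""
    allowed = set(side_info) | {wanted}
    return any(
        w[wanted] == 1 and all(j in allowed for j, bit in enumerate(w) if bit)
        for w in words
    )


def is_linear_index_code(rows, demands, side_infos):
    """Check every receiver; demands[i] and side_infos[i] describe receiver i."""
    words = row_space_gf2(rows)
    return all(can_decode(words, f, x) for f, x in zip(demands, side_infos))


if __name__ == "__main__":
    # Three receivers on a directed circuit: receiver i wants x_i, knows x_{i+1}.
    demands = [0, 1, 2]
    side_infos = [{1}, {2}, {0}]
    print(is_linear_index_code([(1, 1, 0), (0, 1, 1)], demands, side_infos))
    print(is_linear_index_code([(1, 1, 1)], demands, side_infos))
```

In this toy instance two transmissions suffice while the single all-ones transmission does not, which agrees with the min-rank of a directed 3-circuit being 2.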

Achievable schemes based on graph-theoretic models for constructing index codes (i.e. upper bounds for index coding) have been widely studied [1, 3, 8, 24].

One of these methods comes from the well-known fact that all the users forming a clique in the side information digraph can be simultaneously satisfied by transmitting the sum of their packets [6]. This idea shows that the number of cliques required to cover all the vertices of the graph (the clique cover number) is an achievable upper bound.
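The clique argument can be illustrated with a minimal sketch over the binary field, where the sum of packets is a bitwise XOR. It assumes that every user in the clique already holds all packets of the clique except its own; the function names are illustrative.

```python
from functools import reduce
from operator import xor


def clique_broadcast(packets):
    """Single transmission for a clique: the XOR (binary sum) of all packets."""
    return reduce(xor, packets)


def clique_decode(broadcast, known_packets):
    """Recover the missing packet by cancelling the packets the user already holds."""
    return reduce(xor, known_packets, broadcast)


if __name__ == "__main__":
    packets = [0b1011, 0b0110, 0b1110]
    y = clique_broadcast(packets)
    for i, wanted in enumerate(packets):
        known = packets[:i] + packets[i + 1:]
        print(i, clique_decode(y, known) == wanted)
```

One transmission serves the whole clique, so a cover of the side information digraph by cliques needs one transmission per clique.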

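A lower bound on the min-rank of a digraph was given in [3]. An acyclic digraph has min-rank equal to its order (see for instance [3]) and for any induced subgraph $G'$ of a graph $G$ we have

$$\mathrm{minrk}_q(G') \le \mathrm{minrk}_q(G).$$

Indeed, let $M$ be a matrix that fits $G$; the sub-matrix of $M$ restricted to the rows and columns indexed by the vertices of $G'$ is a matrix that fits $G'$. These two results are summarized in the following theorem.

Theorem III.6.

Let $G$ be a digraph. Then

$$\alpha(G) \le \mathrm{minrk}_q(G),$$

where $\alpha(G)$ is the maximum size of a vertex subset inducing an acyclic subgraph of $G$.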

Instead of covering with cliques, one can cover the vertices with circuits. In [8] the circuit-packing bound was implicitly introduced by the authors. Indeed, Chaudhry and Sprintson construct a linear index code by partitioning the graph of the ICSI instance into disjoint circuits. The same bound was explicitly given in the work of Dau et al. [15]. It is based on the observation that the existence of a circuit of length $k$ in the side-information digraph requires at most $k - 1$ transmissions to satisfy the demands of the corresponding users. Therefore a collection of $\nu$ vertex-disjoint circuits corresponds to a ‘saving’ of at least $\nu$ transmissions. The bound is stated as follows: Let $\nu(G)$ be the circuit-packing number of a graph $G$ of order $n$. Then

$$\mathrm{minrk}_q(G) \le n - \nu(G).$$
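The saving of one transmission per circuit can be sketched as follows for binary packets. The sketch assumes a directed circuit on users $0, \dots, k-1$ in which user $i$ requests packet $i$ and holds packet $(i+1) \bmod k$ as side information; the function names are hypothetical.

```python
def circuit_encode(packets):
    """k - 1 transmissions for a circuit of length k: XOR of consecutive packets."""
    return [packets[i] ^ packets[i + 1] for i in range(len(packets) - 1)]


def circuit_decode(i, k, broadcasts, known_next):
    """Recover packet i from the broadcasts and the known packet (i + 1) mod k."""
    if i < k - 1:
        # Transmission i is x_i + x_{i+1}; cancel the known x_{i+1}.
        return broadcasts[i] ^ known_next
    # The last user knows x_0: the XOR of all transmissions telescopes
    # to x_0 + x_{k-1}.
    total = 0
    for y in broadcasts:
        total ^= y
    return total ^ known_next


if __name__ == "__main__":
    packets = [5, 9, 12, 7]
    k = len(packets)
    sent = circuit_encode(packets)
    print(len(sent))
    print([circuit_decode(i, k, sent, packets[(i + 1) % k]) for i in range(k)])
```

Applying this to each circuit of a vertex-disjoint packing, and sending the remaining packets uncoded, gives the circuit-packing bound.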

In [26] the following result is given, leading the authors to introduce the partition multicast scheme, which outperforms the circuit-packing bound.

Proposition III.7.

Let be a graph of order . Then

for any .

The broadcast rate of an IC-instance [1] is defined as follows, with respect to a prime .

Definition III.8.

Let be an IC instance. We denote by the minimal number of symbols required to broadcast the information to all receivers, when the block length is , over all possible extensions of , i.e.

Moreover we denote by the limit

In the following, we will also use the notation to indicate the broadcast rate of any instance that has as side-information graph.

The min-rank completely characterizes the length of an optimal linear index code. Bar-Yossef et al. [3, 4] showed that in various cases linear codes attain the optimal word length, and they conjectured that the minimum broadcast rate of a graph was also equal to its min-rank for non-linear codes. Lubetzky and Stav [20] disproved this conjecture.

In the works of Alon et al. [1] and Shanmugam et al. [23], it was shown that results based on partitioning the vertices of a graph into cliques lead to a family of stronger bounds on the broadcast rate, starting with an LP relaxation called the fractional chromatic number [1] and the stronger fractional local chromatic number [23]. In [24] the authors extended all these schemes to the case of hypergraphs.

IV On directed graphs with min-rank one less than the order

In the work of Dau et al. [15] the authors characterize the undirected graphs of order $n$ having min-rank $n - 1$. Here we extend this result to include directed graphs over a sufficiently large field. Our result relies in part on the following lemma, which constructs a digraph $G'$ of min-rank one less than that of a digraph $G$, obtained from $G$ by contracting an arc.

Lemma IV.1.

Let be a directed graph of order such that there exist with

  • and

  • .

Let with and
. Then

for any .

Proof.

Let be a matrix that fits of minimum rank. We may assume that and so that the first two rows of are

and

If then it is easy to check that deleting the first row and the first column of we obtain of rank that fits .

Now suppose that . We may assume that the rows are linearly independent.

For each vertex , label the corresponding vertex in by . Then construct the matrix whose -th row is obtained from the -th row of in the following way: for let

and for we define

where satisfies for some . The matrix fits , so

Conversely, let be a matrix that fits having rank and suppose the rows are linearly independent. Let be the set of vertices of with outgoing arcs directed to . We construct the matrix such that

for and

for . For we have that the -th row of is given by

for some . If , we put

and hence obtain

where the are the coefficients in the linear combination of , with respect to the first rows of , and . If we set

and we have

where .

Then fits and

Note that the digraph of Lemma IV.1 is the contraction of the digraph along the arc .

Example IV.2.

Let and be the two digraphs shown in Figure 1. The nodes and of satisfy the conditions of Lemma IV.1, so we can reduce to . Consider the matrix

which fits . We have , constructing as in the lemma above we obtain

fits . Conversely, from we obtain , and .

[Figure 1: Contraction graph. Panel (a) shows a digraph on the vertices 1, 2, 3, 4; panel (b) shows a digraph on the vertices 1, 2, 3.]

Lemma IV.3.

Let be a directed graph of order such that . Then , for any .

Proof.

As observed in Theorem III.6, , so we need only to prove that .

We may suppose without loss of generality that there does not exist with out-degree less than , otherwise, from Lemma IV.1 we can delete the node and consider the induced subgraph , which satisfies .

Since , we have . Since , if then we have our claim immediately. Assume then that . We apply Lemma IV.1, iteratively. Note that each time we reduce a graph by an appropriate arc contraction, we obtain with and . Moreover, for each contraction of an arc of the graph, we only shorten the circuits that pass through the node that we delete, and we do not create any new circuit from the fact that the out-degree of the node is .

At the point that Lemma IV.1 is no longer applicable, there are two possible cases:

  1. the out-degree of each node of the reduced graph is at least ,

  2. there exists with out-degree and .

This last case is not possible, in fact if we consider the circuit , from we have that there exists a circuit which remains after deleting . Then, does not pass through otherwise it has to pass through . Then and are disjoint, but this is not possible because .

Therefore, reducing we obtain with fewer nodes and all nodes have out-degree at least . Then from Proposition III.7 and Lemma IV.1 it follows that

Corollary IV.4.

Let be a directed graph of order such that . Then for any , .

We have now our main result of this section.

Corollary IV.5.

Let a graph of order and let . Then if and only if . Moreover in that case we have if and only if .

Proof.

If then and we have .

Conversely towards a contradiction assume that . Then consider a subgraph of with . From Lemma IV.3 we have our claim. ∎

This last result implies that the problem of deciding whether or not a digraph has min-rank one less than its order, over a sufficiently large field, can be solved in polynomial time, using a depth-first search algorithm (see for instance [12]) that verifies in polynomial time whether or not a graph is acyclic.

Corollary IV.6.

Let be a digraph of order and . Then deciding whether can be done in polynomial time ().
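A depth-first search test for acyclicity can be sketched as below. The second function is an assumption about how the decision procedure could be organised, not a transcription of the authors' algorithm: it checks whether the digraph contains a circuit and whether deleting a single vertex leaves it acyclic, by running the acyclicity test once per vertex. Vertices are labelled from 0 and the function names are hypothetical.

```python
def is_acyclic(n, arcs, removed=frozenset()):
    """Iterative DFS: True if the digraph on 0..n-1 minus `removed` has no circuit."""
    adj = [[] for _ in range(n)]
    for u, v in arcs:
        if u not in removed and v not in removed:
            adj[u].append(v)
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    colour = [WHITE] * n
    for root in range(n):
        if colour[root] != WHITE or root in removed:
            continue
        colour[root] = GREY
        stack = [(root, iter(adj[root]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if colour[nxt] == GREY:
                    return False  # back arc closes a circuit
                if colour[nxt] == WHITE:
                    colour[nxt] = GREY
                    stack.append((nxt, iter(adj[nxt])))
                    break
            else:
                colour[node] = BLACK
                stack.pop()
    return True


def one_vertex_breaks_all_circuits(n, arcs):
    """True if the digraph has a circuit and some single vertex meets every circuit."""
    if is_acyclic(n, arcs):
        return False
    return any(is_acyclic(n, arcs, frozenset({v})) for v in range(n))


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (2, 0)]
    two_circuits = [(0, 1), (1, 0), (2, 3), (3, 2)]
    print(one_vertex_breaks_all_circuits(3, triangle))
    print(one_vertex_breaks_all_circuits(4, two_circuits))
```

Each acyclicity test runs in time linear in the number of vertices and arcs, so the whole check is polynomial in the size of the digraph.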

Remark IV.7.

In the final stages of the writing of this paper we learned of Ong’s result [21]. In fact Lemma IV.3 (although obtained independently) and its immediate corollary follow from [21, Theorem 1], which is a stronger result, since it holds without any restriction on the field size. That is,

Theorem IV.8 ([21]).

Let be a directed graph of order satisfying . Then

The proof of Theorem IV.8 relies on showing that contains a particular subgraph and then devising a coding scheme for based on the existence of . The proof given in [21] is a non-trivial graph-theoretic proof and goes through a careful case-by-case analysis. The proof of Lemma IV.3 given here is rather more straightforward, being based on the construction of a new graph obtained by iterative contractions of the original graph , following from Lemma IV.1. Such a result could be helpful also to decrease the size of a graph and thus to optimize the computation of the min-rank of the graph. The hypothesis that follows since we invoke the partition multicast solution (Proposition III.7), therefore requiring the existence of a maximum distance separable code.

In the following table we report the values of the min-rank for graphs and directed graphs with near-extreme min-rank (i.e. and ).

[Figure 2: Forbidden subgraph]

Min-rank | Graph | Digraph
1 | is complete (trivial) | is complete (trivial)
2 | is colorable [22] | for , if is -fair colorable [15]
$n - 2$ | has maximum matching and does not contain the graph in Figure 2 [15] | unknown
$n - 1$ | is a star graph [15] | for , Corollary IV.5; for any , Theorem IV.8
$n$ | has no edges (trivial) | is acyclic (trivial) [3]

V A bound from t-designs

In this section we study the case in which an incidence structure, in particular a block design or a projective plane, arises from the side information. This yields an immediate upper bound on the min-rank of the hypergraph, based on known results on the ranks of incidence matrices. Furthermore, we show that secrecy and privacy are attainable for such configurations. Towards secrecy, we show that if an instance fits a projective plane, then a receiver may recover only its requested data, and no more. On the matter of privacy, we identify a constraint on the side information of an adversary hearing the broadcast such that it cannot access the receivers’ requested data. We may assume without loss of generality that .

Definition V.1.

We say that an instance of the ICSI problem contains an incidence structure if

  • and ;

  • for each there exists such that and .

Moreover, we say that the instance coincides with the incidence structure if the following condition is satisfied.

  • for each there exists such that and .

We immediately obtain the following proposition.

Proposition V.2.

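Consider an instance of the ICSI problem and let $\mathcal{H}$ be the corresponding hypergraph. If the instance contains a 2-$(v, k, \lambda)$ design $\mathcal{D}$ with $b$ blocks, then for all $q$ a power of a prime $p$ such that $p$ divides the order of $\mathcal{D}$ it holds that

$$\mathrm{minrk}_q(\mathcal{H}) \le \frac{b + 1}{2}.$$

Proof.

Let $A$ be the incidence matrix of $\mathcal{D}$. Then, by Theorem II.4, the $p$-rank of $A$ is less than or equal to $(b + 1)/2$.

Now, it is easy to check that $A$ fits $\mathcal{H}$, so

$$\mathrm{minrk}_q(\mathcal{H}) \le \mathrm{rank}_q(A) \le \frac{b + 1}{2},$$

and that concludes the proof. ∎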

Remark V.3.

To compute the min-rank of a hypergraph is an NP-hard problem [22], however, if there exists a -design as in Proposition V.2 it is possible to have a bound on this value and we can use the linearly independent rows of its incidence matrix to decrease the number of transmissions. We remark further that this result does not require to be large, and shows the existence of a class of instances with transmission rate much less than predicted by other bounds. For example, it is known that if an instance fits the incidence matrix of a projective plane of order and then (see, for example [5]), which is significantly greater than the bound , given by Proposition V.2.

Example V.4.

Consider the instance of the ICSI problem given by , and for . Let the side information be

Consider the blocks

These blocks form the Fano plane as in Figure 3. This is a 2-$(7, 3, 1)$ design of order 2 and the design is contained in the side information. The 2-rank of the design is 4. Then we can consider 4 linearly independent rows of the incidence matrix of the Fano plane, and encode the message using those, reducing the number of transmissions from 7 to 4.

It can be checked that distribution of the ranks of the matrices that fit this incidence is given by

thus the bound is sharply met in this instance. Moreover, an optimal encoding matrix for this instance must have row space spanned by the rows of this incidence matrix; there is a unique optimal solution, up to left multiplication by an invertible matrix.
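The rank of the Fano plane over the binary field, and a choice of linearly independent rows to use for encoding, can be reproduced with a short Gaussian elimination. The block labelling below (cyclic shifts of {1, 2, 4} modulo 7) is an assumed standard labelling and need not match the blocks listed in the example; the function name `rank_and_basis_gf2` is illustrative.

```python
# Assumed labelling of the Fano plane: cyclic shifts of {1, 2, 4} mod 7.
FANO_BLOCKS = [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7),
               (5, 6, 1), (6, 7, 2), (7, 1, 3)]


def rank_and_basis_gf2(rows):
    """Return (rank, indices of a maximal GF(2)-independent subset of the rows)."""
    pivots = {}   # position of leading bit -> reduced row with that leading bit
    chosen = []   # indices of rows that enlarged the span
    for idx, row in enumerate(rows):
        x = int("".join(str(bit) for bit in row), 2)
        while x:
            top = x.bit_length() - 1
            if top not in pivots:
                pivots[top] = x
                chosen.append(idx)
                break
            x ^= pivots[top]
    return len(pivots), chosen


if __name__ == "__main__":
    incidence = [[int(p in blk) for p in range(1, 8)] for blk in FANO_BLOCKS]
    rank, chosen = rank_and_basis_gf2(incidence)
    print(rank, chosen)
```

With this labelling the first four rows are independent and span the code, so they can serve as the four rows of an encoding matrix.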

Figure 3: Fano plane

Now we consider the case when an instance of the ICSI problem contains a 2-$(n^2 + n + 1, n + 1, 1)$ design, and the matrix corresponding to the index code is composed of a maximal set of linearly independent rows of the incidence matrix of the design. We recall that a 2-$(n^2 + n + 1, n + 1, 1)$ design has order $n$ and the code of the design over $\mathbb{F}_p$, with $p$ a prime divisor of $n$, has minimum distance equal to $n + 1$ (Theorem II.5).

Theorem V.5.

If the instance of the ICSI problem coincides with the - design, then no receiver can recover a message with .

Proof.

Let be the - design. Suppose that wants to recover with . From Lemma III.4 it is able to do so if and only if there exists a vector , , such that and . If this vector is a codeword of the code, at least positions are different from . Now consider the vector , where is the vector in with ’s in the positions contained in . We have and also there are at least positions of in this intersection that have the same value (we can use only the values of for these positions). Suppose that this value is , then we have . So is not a codeword of , which means that is not able to recover . ∎

Encoding with a matrix whose rowspace contains the blocks of a projective plane guarantees the secrecy of the transmission.

Assume, now, the presence of an adversary who can listen to all transmissions. The adversary is assumed to possess side information . In [13], it is shown that for a transmission matrix for a linear index code representing , if , where is the minimum distance of the code , then is not able to recover an element with .

Consider now an instance of the ICSI problem containing a - design, where is a prime number. Suppose the matrix as above is used as an encoding matrix. Then we obtain the following result.

Theorem V.6.

If and for each block of the design , then is not able to recover for any .

Proof.

If is even, then the result follows from the fact that . Let be odd. We know from Theorem II.6 that in the code generated by the incidence matrix of a - design there are no codewords with weights in . To recover the message , needs a codeword of weight . Such codewords are those corresponding to some block , that is a vector of the form

and its scalar multiples.

So recovers if and only if there exists