On Throughput and Decoding Delay Performance of
Instantly Decodable Network Coding
Abstract
In this paper, a comprehensive study of packet-based instantly decodable network coding (IDNC) for single-hop wireless broadcast is presented. The optimal IDNC solution in terms of throughput is proposed and its packet decoding delay performance is investigated. Lower and upper bounds on the achievable throughput and decoding delay performance of IDNC are derived and assessed through extensive simulations. Furthermore, the impact of receivers’ feedback frequency on the performance of IDNC is studied and optimal IDNC solutions are proposed for scenarios where receivers’ feedback is only available after an IDNC round, composed of several coded transmissions. However, since finding these optimal IDNC solutions is computationally complex, we further propose simple yet efficient heuristic IDNC algorithms. The impact of system settings and parameters such as channel erasure probability, feedback frequency, and the number of receivers is also investigated and simple guidelines for practical implementations of IDNC are proposed.
I Introduction
Instantly decodable network coding (IDNC) is a class of linear network coding schemes [1, 2, 3, 4, 5, 6, 7] which has been widely studied and applied in wireless unicast, multicast and broadcast systems. It has been shown that IDNC schemes can significantly improve data throughput in such systems compared to their uncoded counterparts [1, 2], while offering simple XOR-based encoding and decoding. Furthermore, IDNC provides instant packet decodability at the receivers, which can result in faster delivery of the packets to the application layer compared to other linear network coding schemes [5, 6].
In this paper, we are primarily concerned with investigating the performance limits of IDNC schemes for data dissemination in single-hop wireless broadcast systems in terms of throughput and delay. In such systems, there is a single sender who wishes to broadcast a block of data packets to multiple receivers [4, 5, 6, 7]. Due to packet erasures in wireless fading channels, some transmitted packets are lost at the receivers. Generally, information about received or lost packets is fed back from the receivers to the sender after transmission of one or multiple packets. The sender then determines which data packets from the block to combine and transmit next subject to IDNC constraints. This process is repeated until the broadcast of the block is complete, i.e. until all receivers have decoded all data packets.
In such systems, the time it takes to complete the block, or simply the IDNC block completion time, is a fundamental measure of its throughput performance and will be studied in this paper. Taking the block completion time of random linear network coding (RLNC) [8] as a benchmark, many works in the literature have been concerned with proposing IDNC schemes with good throughput performance [4, 5, 6, 9, 10, 11, 12, 13, 14, 15, 7]. The majority of these schemes collect feedback about the lost packets and determine an online IDNC solution accordingly, which comprises one or more coded packets, such that it can efficiently bring the system closer to block completion.
Although these works differ in their models and assumptions about the frequency and reliability of feedback [11, 12], at their core they run dynamic IDNC algorithms, responding to erasure patterns that have happened along the transmission. The main limitation of such studies is that it is impossible to say a priori how long it will actually take to complete a block starting with a certain system packet reception state at the receivers. Furthermore, such an IDNC solution is in fact the result of a local optimization, and there is no guarantee that the solution is globally optimal. Therefore, the following two fundamental questions still seem unanswered in the literature:

What is the best throughput performance of IDNC?

Which IDNC solution can achieve this best throughput performance?
By best throughput, we refer to the minimum block completion time that is possible by using IDNC starting with a certain system packet reception state, in the absence of any future erasures. It is clear that packet erasures in an actual system can defer block completion. Therefore, our measure of throughput is the best possible performance of the IDNC schemes in terms of throughput and serves as an upper bound on what IDNC can achieve in the presence of erasures. Such a measure of throughput is significant because it disentangles the effects of channel-induced packet erasures and algorithm-induced IDNC coded packet selection on the throughput of the system. Based on this measure of throughput, we propose the concept of an optimal IDNC scheme, which refers to an IDNC scheme that provides a globally optimal IDNC solution and achieves the best throughput performance in the absence of any future erasures.
Besides block completion time, another fundamental performance metric of IDNC is its decoding delay. There are various definitions of decoding delay in the literature [1, 3, 5]. In this work, we consider packet decoding delay, defined as the number of time slots it takes till the data packets are decoded by the receivers. The reasons for our choice are twofold. First, short packet decoding delay is the main advantage of IDNC, which is particularly desirable for applications in which data packets are useful regardless of their order. Second, packet decoding delay is naturally related to the throughput of IDNC and its respective IDNC solution. That is, having investigated the best throughput performance of IDNC and the optimal IDNC solution, decoding delay limits of IDNC schemes can be obtained with relative ease.
The contributions of this paper can be summarized as follows. First, the best achievable throughput performance of IDNC, regardless of packet erasure probabilities and feedback frequency, and its corresponding optimal IDNC solution are rigorously obtained. Furthermore, the concept of IDNC packet diversity in the optimal IDNC solution is introduced. It is a measure of the robustness of an IDNC solution against packet erasures. While ensuring the optimal throughput performance, our proposed IDNC solution enhances packet diversity wherever possible, hence enhancing its robustness against erasures. This feature distinguishes our optimal IDNC solution from other IDNC solutions in the literature, where packet diversity has never been considered, to the best of our knowledge.
Second, the impact of feedback frequency on the performance of the IDNC scheme is investigated, the concept of semi-online feedback is introduced and optimal fully-online and optimal semi-online IDNC schemes are devised.
Third, we derive lower and upper bounds on the throughput and decoding delay performance of IDNC schemes. Furthermore, we design the optimal IDNC coding algorithm, as well as its simplified alternatives that offer efficient performances with much lower computational complexities.
The performance of these algorithms is evaluated via extensive simulations under different settings of system parameters. The results illustrate the interactions among these parameters and can serve as simple implementation guidelines. Plenty of hands-on examples are also designed to demonstrate the proposed concepts, theorems, methodologies, and algorithms. In summary, this work can be a useful reference in the IDNC literature and can motivate further research.
I-A Additional Remarks
IDNC can be divided into two categories, strict IDNC (S-IDNC) [4, 5] and general IDNC (G-IDNC) [9, 7]. Although they have the same system model and use similar dynamic algorithms, they differ in the sense that G-IDNC coded packets are allowed to include two or more new data packets for some receivers. However, this is not allowed in S-IDNC. In this paper, we focus on S-IDNC (or IDNC for short). The relationship and comparisons between S- and G-IDNC will be discussed when necessary.
The S-IDNC problem is, to some extent, related to the index coding problem [16, 17, 18], especially when memoryless decoding is considered in the index coding problem [16, 17]. Nevertheless, their problem formulations are different. A basic assumption in index coding is that a receiver who has successfully received a subset of packets but is still missing multiple packets can be considered as multiple receivers each wanting only one of the missing packets. However, such splitting is prohibited in S-IDNC, for it will violate the instantly decodable property of IDNC coded packets. Therefore, results of index coding and S-IDNC cannot be used interchangeably.
II System Model and Notations
II-A Transmission Setup
We consider a packet-based wireless broadcast scenario from one sender to receivers. Receiver is denoted by and the set of all receivers is . There are a total of binary packets with identical length to be delivered to all receivers. Packet is denoted by and the set of all packets is . Sometimes we will refer to as an original data packet to distinguish it from a coded packet. Time is slotted and in each time slot, one (coded or original data) packet is broadcast. The wireless channel between the sender and each receiver is modeled as a memoryless erasure link with i.i.d. packet erasure probability of . The results proposed in the paper can be generalized, with proper modifications, to non-homogeneous erasure links.
II-B Systematic Transmission Phase and Receivers Feedback
Initially, the packets are transmitted uncoded once using time slots. This is the systematic transmission phase. After this phase, each receiver provides feedback to the sender about the packets it has received or lost. (Footnote 1: We assume that there exists an error-free feedback link from each receiver to the sender that can be used with appropriate frequency.) The number of packets that are not received by at least one receiver due to erasures is denoted by and their set is denoted by , where . The number of receivers that have not received all the packets is denoted by and the set of these receivers is denoted by , where .
The complete state of receivers and packets can be captured by an state feedback matrix (SFM) (also known as receiver-packet incidence matrix [5]), where the element at row and column is denoted by and
(1) 
Based on the SFM, we define the notions of Wants set [10, 9] for each receiver and Targeted receivers for each packet:
Definition 1.
The Wants set of receiver , denoted by , is the subset of packets in which are lost at due to packet erasures. That is, .
Definition 2.
The Target set of a packet , denoted by , is the subset of receivers in who want packet . That is, . The size of is denoted by .
Example 1.
Consider the SFM in Fig. 1(a). There are packets and receivers after the systematic transmission phase. The Wants set of is . The Target set of is and thus .
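The Wants and Target sets of Definitions 1 and 2 can be read directly off the rows and columns of an SFM. The following minimal Python sketch assumes a hypothetical 5-receiver, 6-packet SFM (not necessarily the one in Fig. 1(a)) in which an entry of 1 marks a wanted packet; the names `SFM`, `wants_set` and `target_set` are illustrative only.

```python
# Hypothetical SFM: rows are receivers, columns are packets,
# 1 means the receiver still wants the packet (assumption for illustration).
SFM = [
    [1, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 0],
]

def wants_set(sfm, n):
    """Packets (1-based) still wanted by receiver n (1-based)."""
    return {k + 1 for k, a in enumerate(sfm[n - 1]) if a == 1}

def target_set(sfm, k):
    """Receivers (1-based) that still want packet k (1-based)."""
    return {n + 1 for n, row in enumerate(sfm) if row[k - 1] == 1}

print(wants_set(SFM, 1))        # Wants set of receiver 1
print(target_set(SFM, 6))       # Target set of packet 6
print(len(target_set(SFM, 6)))  # size of that Target set
```

Storing the SFM as a list of rows keeps the Wants set a row scan and the Target set a column scan, which mirrors the two definitions.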
II-C Coded Transmission Phase
In this subsection, we present some basic definitions and performance metrics related to IDNC. Then in the next subsection, we will briefly discuss existing models in the literature to deal with the IDNC problem.
After the systematic transmission phase and collecting receivers’ feedback, the coded transmission phase starts. In this phase, IDNC aims to satisfy the demands of all receivers by sending coded packets under two fundamental restrictions:

1) The sender uses the binary field for linear coding;

2) Receivers do not store received coded packets for future decoding, i.e. memory is not required at the receivers.
More precisely, the first restriction means that the th transmitted coded packet is of the form
(2) 
where and the summation is bitwise XOR . We denote by the set of original data packets that have nonzero coefficients in , namely, . fully represents and is called a coding set. Based on (2), can be one of the following for each receiver:
Definition 3.
A coded packet is instantly decodable for receiver if contains only one original data packet from the Wants set of .
Definition 4.
A coded packet is non-instantly decodable for receiver if contains two or more original data packets from the Wants set of .
Due to restriction 2 above, a non-instantly decodable coded packet will be discarded by upon reception.
Definition 5.
A coded packet is non-innovative for receiver if only contains original data packets not from the Wants set of . Otherwise, it is innovative.
A non-innovative coded packet will also be discarded by upon reception.
Example 2.
Consider the SFM in Fig. 1(a). is instantly decodable for because the corresponding coding set only contains one original data packet () from the Wants set of . Thus, can instantly decode through the operation . is also instantly decodable for . However, is non-instantly decodable for because both and are from the Wants set of . is non-innovative for because has both and already.
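Definitions 3 to 5 amount to counting how many packets of a coding set fall inside a receiver's Wants set. The sketch below illustrates this, together with XOR encoding and instant decoding. The payloads, the receiver's Wants set and the helper names (`classify`, `xor_bytes`, `encode`) are hypothetical and chosen only for illustration.

```python
from functools import reduce

def classify(wants, coding_set):
    """Classify a coded packet for one receiver (Definitions 3 to 5)."""
    overlap = len(wants & coding_set)
    if overlap == 0:
        return "non-innovative"
    if overlap == 1:
        return "instantly decodable"
    return "non-instantly decodable"

def xor_bytes(a, b):
    """Bit-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(packets, coding_set):
    """XOR together all original data packets in the coding set."""
    return reduce(xor_bytes, (packets[k] for k in sorted(coding_set)))

# Hypothetical payloads; the receiver wants packets 1, 5, 6 and holds packet 2.
packets = {1: b"\x0f\x10", 2: b"\xf0\x01", 5: b"\xaa\x55", 6: b"\x33\xcc"}
wants = {1, 5, 6}

print(classify(wants, {1, 2}))  # one wanted packet in the coding set
print(classify(wants, {5, 6}))  # two wanted packets in the coding set
print(classify(wants, {2}))     # no wanted packet in the coding set

# Instant decoding: XOR the coded packet with the packet already held.
coded = encode(packets, {1, 2})
recovered = xor_bytes(coded, packets[2])
print(recovered == packets[1])
```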
The throughput and decoding delay performance of IDNC can be measured by the minimum number of coded transmissions and the average packet decoding delay, where:
Definition 6.
Given an SFM, the minimum number of coded transmissions, or equivalently the minimum block completion time, is the smallest possible number of IDNC coded transmissions required in order to satisfy the demands of all the receivers in the absence of any future packet erasures. This number is denoted by .
In the next section, we will further show that cannot be reduced regardless of feedback frequency. Thus, we claim that is the absolute minimum number of coded transmissions. indicates the best throughput performance, which can be calculated as . Such a measure is important as it disentangles the effect of channel-induced packet erasures and algorithm-induced IDNC coded packet selections on the throughput of the system.
Next, we define the average packet decoding delay, .
Definition 7.
Denote by the time slot in the coded transmission phase when original data packet is decoded by receiver , and let if . Then:
(3) 
Example 3.
Consider the SFM in Fig. 1(a). Assume that four IDNC coded packets , , , and are transmitted. Assuming erasure-free transmissions, all the receivers will be satisfied after four time slots. are summarized in Table I. The block completion time is 4 and the average packet decoding delay is . However, we have not discussed or determined yet if they are the best throughput and decoding delay performance of IDNC.
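Table I: Decoding time slots in Example 3 (rows: receivers; columns: packets; 0 marks a packet that the receiver does not want)
           | Packet 1 | Packet 2 | Packet 3 | Packet 4 | Packet 5 | Packet 6
Receiver 1 |    1     |    0     |    0     |    0     |    4     |    2
Receiver 2 |    0     |    1     |    0     |    0     |    0     |    2
Receiver 3 |    0     |    0     |    2     |    3     |    4     |    0
Receiver 4 |    0     |    0     |    0     |    3     |    0     |    2
Receiver 5 |    0     |    0     |    2     |    0     |    4     |    0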
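The decoding time slots of Definition 7 can be reproduced by replaying an erasure-free schedule against an SFM. The sketch below makes three assumptions: the SFM is inferred from the nonzero pattern of Table I, the four coding sets are inferred from the time slots in the table, and the average is normalized by the number of wanted receiver-packet pairs, since (3) is not legible here. All names are illustrative.

```python
# SFM inferred from the nonzero entries of Table I (assumption).
SFM = [
    [1, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 0],
]
# Coding sets inferred from the time slots in Table I (assumption).
SCHEDULE = [{1, 2}, {3, 6}, {4}, {5}]

def decoding_slots(sfm, schedule):
    """Slot in which each receiver decodes each packet (0 if not wanted)."""
    wants = [{k + 1 for k, a in enumerate(row) if a} for row in sfm]
    slots = [[0] * len(sfm[0]) for _ in sfm]
    for t, coding_set in enumerate(schedule, start=1):
        for n, w in enumerate(wants):
            hit = w & coding_set
            if len(hit) == 1:          # instantly decodable for receiver n
                k = hit.pop()
                w.discard(k)
                slots[n][k - 1] = t
    return slots

def average_delay(slots):
    """Mean decoding slot over wanted (receiver, packet) pairs (assumed form)."""
    values = [u for row in slots for u in row if u > 0]
    return sum(values) / len(values)

table = decoding_slots(SFM, SCHEDULE)
for row in table:
    print(row)
print(average_delay(table))
```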
II-D S-IDNC versus G-IDNC
One main model to capture IDNC constraints and determine coded packets is strict IDNC or S-IDNC [4, 19, 5]. Imposing the S-IDNC constraint means that:
Definition 8.
S-IDNC constraint: for every receiver , the coding set contains at most one original data packet from the Wants set of . In other words, for every receiver, the coded packet in (2) is either instantly decodable or non-innovative, but it is never non-instantly decodable.
In [4], it is shown that the S-IDNC constraint on the coded packets can be represented using an undirected graph with vertices corresponding to wanted original data packets. More details on the graphical representation of S-IDNC will be provided in Section III.
In contrast, in the general IDNC or G-IDNC proposed in [20, 9] the S-IDNC constraint is relaxed by allowing the sender to send coded packets that are non-instantly decodable for a selected subset of receivers. If a receiver receives such a coded packet, it will discard that packet. In other words, in G-IDNC the sender is not restricted to sending IDNC coded packets for all the receivers, but the receivers adhere to the IDNC decoding principle. Recently, a new type of G-IDNC was proposed in [21], which further relaxes this constraint by allowing receivers to store non-instantly decodable packets for future decoding so that they are not wasted.
Example 4.
Consider the SFM in Fig. 1(a). is a valid coded packet for both S-IDNC and G-IDNC. However, is a valid coded packet for G-IDNC but not for S-IDNC, since it is non-instantly decodable for . In G-IDNC, when receives , it will discard it.
The G-IDNC problem can also be modeled using an undirected graph where for each lost packet of each receiver a vertex is added to the graph. The key operation of G-IDNC algorithms is to search for the largest maximal clique(s) [20, 9, 6]. (Footnote 2: In an undirected graph, all vertices in a clique are connected to each other with an edge. A clique is maximal if it is not a subset of a larger clique.) The size of the G-IDNC graph is , while the size of the S-IDNC graph is .
In the rest of this paper, our aim is to better understand and characterize the S-IDNC problem from both theoretical and implementation viewpoints. An important note is that the characteristics of S-IDNC cannot be directly extended to G-IDNC because the two construct and update their graphs in different ways, as will be explained in Section III. Characterizing G-IDNC could be the objective of future research and is out of the scope of this paper. In the rest of the paper, when there is no ambiguity, we will simply refer to S-IDNC as IDNC.
III The Optimal IDNC
IDNC constraints of an SFM can be represented by an undirected graph with vertices. Each vertex represents a wanted original data packet . Two vertices and are connected by an edge if and are not jointly wanted by any receiver [4]. This graph model, however, has only been employed in the literature to heuristically find IDNC coding solutions [4]. In this section, we will first revisit this graph model by constructing its equivalent matrix and set models. Then, we will use these models to rigorously prove some theorems about the minimum block completion time, . The key difference between S-IDNC and G-IDNC will become clear after the proofs. Based on these theorems, we will discuss the effect of feedback frequency on IDNC throughput and propose the optimal IDNC schemes with fully- or semi-online feedback. Although some similar concepts and results exist in the graph theory literature [22, 23], their compilation, presentation and more importantly interpretation in the IDNC context is new, to the best of our knowledge. We will highlight the similarities, differences, and new results as appropriate.
III-A IDNC Modeling
In this subsection, we construct the matrix and set models of IDNC and demonstrate their relationship with its graph model. The construction is based on the concepts of conflicting and nonconflicting original data packets, defined as follows.
Definition 9.
Two original data packets and conflict with each other if both belong to the Wants set, , of at least one receiver such as . Mathematically, we can denote a conflict between and by , where . and do not conflict otherwise.
It is clear that to avoid non-instantly decodable coded packets, two conflicting original data packets and cannot be coded together. The equivalent of such conflict in the graph model is the absence of an edge between and [4, 23]. On the other hand, two non-conflicting data packets and have their respective vertices and connected.
The conflict states of all the original data packets can be fully described by a triangular conflict matrix of size :
Definition 10.
A fully-square conflict matrix of size is a binary-valued matrix with element at row and column denoted by corresponding to the conflict state of packets and . In particular, if and otherwise. Due to the symmetry of conflict between packets and noting that , , we can reduce the fully-square matrix to a triangular matrix of size . From now on, by conflict matrix, we mean the reduced triangular matrix .
The conflict matrix and the graph model of the SFM in Fig. 1(a) are presented in Fig. 1(b) and Fig. 1(c), respectively. Note that in dealing with the conflict matrix, we are not concerned with the receivers who may need a certain packet. In fact, it is not difficult to show that two or more SFMs can have the same conflict matrix. Unless otherwise stated, it suffices to deal with the conflict matrix for design and analysis of IDNC, instead of the SFM .
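As a minimal sketch of Definitions 9 and 10, the code below builds a fully-square conflict matrix from a hypothetical 5-receiver, 6-packet SFM (an assumption, not necessarily the matrix of Fig. 1(a)) and checks that a second SFM with a duplicated receiver row yields the same conflict matrix. The function name `conflict_matrix` is illustrative.

```python
# Hypothetical SFM (rows: receivers, columns: packets, 1 = wanted).
SFM = [
    [1, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 0],
]

def conflict_matrix(sfm):
    """Square 0/1 matrix: entry (i, j) is 1 iff packets i and j are
    jointly wanted by at least one receiver."""
    n_pkt = len(sfm[0])
    c = [[0] * n_pkt for _ in range(n_pkt)]
    for row in sfm:
        wanted = [k for k, a in enumerate(row) if a]
        for i in wanted:
            for j in wanted:
                if i != j:
                    c[i][j] = 1
    return c

C = conflict_matrix(SFM)
# Print only the upper triangle, which carries all the information.
for i, row in enumerate(C):
    print(" " * (2 * (i + 1)) + " ".join(str(v) for v in row[i + 1:]))

# A different SFM (one receiver duplicated) with the same conflict matrix.
SFM_2 = SFM + [SFM[0][:]]
print(conflict_matrix(SFM_2) == C)
```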
We now define the key concept of maximal coding set that is allowed for IDNC transmission.
Definition 11.
A maximal coding set, , is a set of original data packets which simultaneously hold the following two properties: 1) their XOR coded packet satisfies the IDNC constraint in Definition 8, i.e. is either instantly decodable or non-innovative for every receiver ; 2) addition of any other original data packet from to will make the resulting non-instantly decodable for at least one receiver in .
Example 5.
Consider a coding set of the SFM in Fig. 1(a). Its corresponding coded packet is . is instantly decodable for and is non-innovative for . One can verify that adding any other original data packet from to will make non-instantly decodable for at least one receiver. Hence, is a maximal coding set. Through exhaustive search, one can find all remaining maximal coding sets: , , .
The equivalent of maximal coding sets in the IDNC graph model is known as maximal cliques [23]. Therefore, we use to denote both a maximal coding set and a maximal clique.
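The exhaustive search mentioned in Example 5 can be sketched as follows, under the assumption of a hypothetical 5-receiver, 6-packet SFM (not necessarily the one in Fig. 1(a)). The enumeration is exponential in the number of packets and is meant only to illustrate Definition 11 on a toy instance; the function names are illustrative.

```python
from itertools import combinations

# Hypothetical SFM (rows: receivers, columns: packets, 1 = wanted).
SFM = [
    [1, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 0],
]

def is_coding_set(sfm, packets):
    """True if every receiver wants at most one packet of the set."""
    return all(sum(row[k - 1] for k in packets) <= 1 for row in sfm)

def maximal_coding_sets(sfm):
    """All valid coding sets that are not strict subsets of another one."""
    ids = range(1, len(sfm[0]) + 1)
    valid = [frozenset(c)
             for r in range(1, len(sfm[0]) + 1)
             for c in combinations(ids, r)
             if is_coding_set(sfm, c)]
    return [s for s in valid if not any(s < t for t in valid)]

for s in maximal_coding_sets(SFM):
    print(sorted(s))
```

For larger instances a dedicated maximal-clique enumeration on the IDNC graph would be the natural replacement for this brute-force filter.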
For reasons that become clear at the end of this subsection, we ensure that in each IDNC coded transmission, the sender will code all and not a subset of the original data packets in a maximal coding set. To satisfy the demands of all the receivers, the sender has to transmit coded packets from an appropriately chosen collection of maximal coding sets. To achieve this, each original data packet should appear at least once in this collection. This condition can be formally represented as the diversity constraint, where diversity of a packet is defined as:
Definition 12.
The diversity of an original data packet within a collection of maximal coding sets is denoted by and is the number of maximal coding sets in which it appears.
Definition 13.
A collection of maximal coding sets satisfies the diversity constraint iff every original data packet has a diversity of at least one within this collection.
Given all the maximal coding sets of a conflict matrix , there exists at least one collection which satisfies the diversity constraint (in the extreme case all the maximal coding sets include all the original data packets). The size of the collection is the number of maximal coding sets in it. We then define the minimum collection and its size as follows:
Definition 14.
A collection of maximal coding sets is minimum if there does not exist any other collection which satisfies the diversity constraint with a smaller size. The size of the minimum collection is called the minimum collection size.
This number, as we will prove in the next subsection, is exactly the minimum number of coded transmissions, . We thus denote a minimum collection by
If the maximal coding sets in are sent using time slots, in the absence of packet erasures, the demands of all receivers will be satisfied. Here by “sending a maximal coding set”, we mean that the corresponding coded packet is generated and sent.
A problem in graph theory that is somewhat similar to finding a minimum collection of maximal coding sets is the minimum clique cover problem [4, 23, 22]. In this problem, a graph is partitioned into disjoint cliques and the partitioning solution that results in the smallest number of disjoint cliques is referred to as the minimum clique cover solution of the problem. Furthermore, it is worth noting that the cardinality of the minimum clique cover solution is equal to the chromatic number of the complementary graph of , denoted by [23]. (Footnote 3: The complementary graph has opposite vertex connectivity to .) However, there is a difference between the minimum clique cover problem and our minimum collection finding problem, as cliques do not overlap in the minimum clique cover problem. That is, the cliques are not necessarily maximal and each vertex appears in only one clique. This would be equivalent to choosing a minimum collection of coding sets in our IDNC model where all original data packets have a diversity equal to one. This would not change . However, it can have a serious adverse impact on IDNC’s robustness to erasures, which in turn degrades the overall IDNC throughput and decoding delay performance. Consequently, it is desirable to choose a minimum collection of maximal coding sets that, while satisfying , provides as much packet diversity as possible. The minimum collection and the importance of packet diversity are illustrated with the following example.
Example 6.
Having all the maximal coding sets of SFM in Fig. 1 obtained in Example 5, one can easily verify that the only minimum collection is
By sending these three coding sets in using transmissions, the demands of all the receivers will be satisfied in the absence of packet erasures. All the original data packets in have a diversity of one, except which has a diversity of 2, i.e. . Now, let us assume that there is a packet erasure probability of in the transmission links between the sender and the receivers. Under this scenario, with these three coded transmissions, the probability of being lost at its targeted receiver (due to erasures) will be . This probability is much lower than that of the other original data packets, which will be equal to .
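The search for a minimum collection (Definitions 12 to 14) and the effect of diversity on loss probability can be sketched as below. The four maximal coding sets, the packet set and the erasure probability of 0.2 are hypothetical values chosen for illustration, and the brute-force search is only practical for toy instances.

```python
from itertools import combinations

# Hypothetical maximal coding sets over six packets (assumption).
MAXIMAL_SETS = [{1, 2, 3}, {1, 2, 4}, {2, 5}, {3, 6}]
PACKETS = {1, 2, 3, 4, 5, 6}

def minimum_collections(maximal_sets, packets):
    """All smallest collections in which every packet has diversity >= 1."""
    for size in range(1, len(maximal_sets) + 1):
        found = [c for c in combinations(maximal_sets, size)
                 if set().union(*c) >= packets]
        if found:
            return found
    return []

def diversity(collection, k):
    """Number of coding sets in the collection that contain packet k."""
    return sum(1 for s in collection if k in s)

def loss_probability(collection, k, p):
    """Probability that packet k is erased in every set that carries it."""
    return p ** diversity(collection, k)

collections = minimum_collections(MAXIMAL_SETS, PACKETS)
best = collections[0]
print(len(collections), [sorted(s) for s in best])
for k in sorted(PACKETS):
    print(k, diversity(best, k), loss_probability(best, k, 0.2))
```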
Remark 1.
The problem of finding all the maximal cliques and the problem of minimum clique cover for an undirected graph are both NP-complete [23, 22, 24]. This is also true for S-IDNC because the S-IDNC graph does not have any special structural properties. A similar statement can be found in [7] for G-IDNC. Since a minimum collection of an S-IDNC conflict matrix can be reduced to a minimum clique cover solution of an S-IDNC graph by reducing the diversity of all data packets to one, the problem of finding minimum collections in S-IDNC is at least NP-complete. Its exact and simplified algorithms will be presented in Section VI.
III-B The Equivalence of , the Minimum Collection Size, and
In this subsection, we prove that the three numbers: 1) the minimum number of coded transmissions (), 2) the minimum collection size of the conflict matrix, and 3) the chromatic number of the complementary graph (), are identical. Based on this we propose two important remarks.
For a given conflict matrix , the equivalence between its and minimum collection size can be proved by induction using the following theorem:
Theorem 1.
Upon successful reception of a maximal coding set of by all its targeted receivers, the minimum collection size of the updated , denoted by , is at least .
This theorem holds if the following two theorems hold:
Theorem 2.
The minimum collection size of a conflict matrix with a graph model equals the chromatic number of the complementary graph of , .
Theorem 3.
Suppose is a maximal clique in and the chromatic number of is . By removing from we obtain an updated graph . The chromatic number of is at least . More precisely, if belongs to a minimum collection of , , while if does not belong to any minimum collection of , .
Since the in Theorem 3 is indeed the graph model of in Theorem 1, we conclude that Theorem 1 holds if Theorems 2 and 3 hold. The proofs of Theorems 2 and 3 are provided in Appendix A and Appendix B, respectively.
Remark 2.
The above theorems apply to S-IDNC, but not G-IDNC. The reason is that, unlike the S-IDNC graph, the G-IDNC graph is not static. That is, by removing a clique from the G-IDNC graph, new edges may be added to the remaining vertices, which breaks Theorem 3. In contrast, removing any clique from the S-IDNC graph will never change the connectivity of the remaining vertices. In other words, the G-IDNC problem does not have a dual static minimum clique cover problem.
Remark 3.
Heuristic algorithms are suboptimal because they cannot guarantee . They may choose a maximal coding set which is, though large, not included in any minimum collection of . Then Theorem 3 indicates that, even if is successfully received by all its targeted receivers, the chromatic number of the updated is still and thus more transmissions are needed. Below is an example.
Example 7.
Consider the maximal coding sets in Example 5. A suboptimal IDNC algorithm might choose , which does not belong to the only minimum collection . Even if this set is successfully received by all its targeted receivers, still three more transmissions are needed to be able to deliver and to the receivers. In total, there will be at least four transmissions, which is greater than .
This example motivates the concept of optimal IDNC schemes, which will be presented next.
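The behaviour described by Theorem 3 and Remark 3 can be checked numerically on a toy instance. The sketch below assumes a hypothetical 5-receiver, 6-packet SFM and computes the minimum number of coded transmissions as a minimum clique partition by brute-force recursion; it then removes two different maximal coding sets and recomputes the number for the remaining packets. Function names are illustrative.

```python
from itertools import combinations

# Hypothetical SFM (rows: receivers, columns: packets, 1 = wanted).
SFM = [
    [1, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 0],
]

def is_coding_set(sfm, packets):
    """True if every receiver wants at most one packet of the set."""
    return all(sum(row[k - 1] for k in packets) <= 1 for row in sfm)

def min_transmissions(sfm, packets):
    """Smallest number of valid coding sets that cover `packets`."""
    packets = frozenset(packets)
    if not packets:
        return 0
    first = min(packets)
    rest = sorted(packets - {first})
    best = len(packets)  # sending every packet uncoded always works
    for r in range(len(rest) + 1):
        for extra in combinations(rest, r):
            s = frozenset((first, *extra))
            if is_coding_set(sfm, s):
                best = min(best, 1 + min_transmissions(sfm, packets - s))
    return best

ALL = set(range(1, 7))
u = min_transmissions(SFM, ALL)
after_good = min_transmissions(SFM, ALL - {1, 2, 4})  # set in a minimum collection
after_bad = min_transmissions(SFM, ALL - {1, 2, 3})   # set in no minimum collection
print(u, after_good, after_bad)
```

Because the conflicts among the remaining packets do not change when a set is delivered, the same SFM can be reused for the reduced packet set, which is the static property that Remark 2 attributes to the strict variant.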
III-C The Optimal Fully- and Semi-online IDNC Schemes
III-C1 The optimal fully-online IDNC scheme
In a fully-online IDNC scheme, the sender collects feedback from all receivers in every time slot to update the SFM and the corresponding conflict matrix. A coding set is then chosen and its coded packet is generated and broadcast. To minimize the number of coded transmissions and to reduce the decoding delay, this coding set must satisfy the following three conditions:

1) It should be a maximal coding set;

2) It should be chosen from a minimum collection of the updated conflict matrix; and

3) It should target the largest number of receivers among all the maximal coding sets in .
We define such a fully-online IDNC scheme as the optimal fully-online IDNC scheme in terms of throughput.
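The third condition can be sketched as a simple selection rule, assuming that a minimum collection has already been found. The SFM and the minimum collection below are hypothetical, and `targeted_receivers` and `pick_fully_online` are illustrative names.

```python
# Hypothetical SFM (rows: receivers, columns: packets, 1 = wanted).
SFM = [
    [1, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 0],
]
# Hypothetical minimum collection of maximal coding sets for this SFM.
MIN_COLLECTION = [{1, 2, 4}, {2, 5}, {3, 6}]

def targeted_receivers(sfm, coding_set):
    """Number of receivers for which the coded packet is instantly decodable."""
    return sum(1 for row in sfm
               if sum(row[k - 1] for k in coding_set) == 1)

def pick_fully_online(sfm, minimum_collection):
    """Maximal coding set of the collection that targets the most receivers."""
    return max(minimum_collection, key=lambda s: targeted_receivers(sfm, s))

for s in MIN_COLLECTION:
    print(sorted(s), targeted_receivers(SFM, s))
print(sorted(pick_fully_online(SFM, MIN_COLLECTION)))
```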
III-C2 The optimal semi-online IDNC scheme
According to Theorems 1 to 3, collecting fully-online feedback during transmissions cannot reduce the total number of coded transmissions to below , even in the best case scenario of erasure-free packet reception. Hence, as a variation of existing IDNC schemes in the literature, we propose to reduce feedback frequency to semi-online, where the SFM is updated in rounds. For example, feedback is collected after coded packets from a selected minimum collection have been transmitted and so on. We define this scheme as the optimal IDNC scheme in terms of throughput when feedback frequency is semi-online, or simply the optimal semi-online IDNC scheme. We refer to the minimum collection as the optimal semi-online IDNC solution. Maximal coding sets in are properly ordered so that those targeting more receivers are assigned smaller subscripts and sent first. Fig. 2 illustrates the process of the proposed optimal IDNC schemes.
We then define the minimum average packet decoding delay of the proposed IDNC schemes:
Definition 15.
We denote by the minimum average packet decoding delay of the proposed fully- and semi-online IDNC schemes. It is achieved if the maximal coding sets in the optimal semi-online IDNC solution are broadcast in the absence of packet erasures, and is calculated as:
(4) 
where is the index of the first maximal coding set in which contains .
Compared with (3), the decoding delays of an original data packet at its targeted receivers, i.e., , are unified to here because all these receivers can decode in the same time slot if there is no packet erasure. Note that by “minimum” we mean the smallest possible average packet decoding delay of the proposed (throughput) optimal IDNC schemes in the absence of packet erasures. is not necessarily the optimal average packet decoding delay that IDNC can offer; finding the latter is still an open problem. Indeed, as we shall see in Section VII, a suboptimal IDNC scheme in terms of throughput may achieve a better decoding delay.
Note also that, since the initial SFM is an all-one matrix, the systematic transmission phase is a special semi-online IDNC round, which requires the original data packets to be sent uncoded using transmissions.
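The ordering rule of the semi-online solution and the resulting average decoding delay can be sketched as follows. Since (4) is not legible here, the sketch assumes that the delay of each packet is the index of the first coding set containing it, weighted by the size of its Target set and normalized by the total number of wanted receiver-packet pairs. The SFM and the minimum collection are hypothetical and the function names are illustrative.

```python
# Hypothetical SFM (rows: receivers, columns: packets, 1 = wanted).
SFM = [
    [1, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 0],
]
# Hypothetical minimum collection of maximal coding sets for this SFM.
MIN_COLLECTION = [{1, 2, 4}, {2, 5}, {3, 6}]

def targeted_receivers(sfm, coding_set):
    """Number of receivers for which the coded packet is instantly decodable."""
    return sum(1 for row in sfm
               if sum(row[k - 1] for k in coding_set) == 1)

def order_collection(sfm, collection):
    """Coding sets that target more receivers are sent first."""
    return sorted(collection,
                  key=lambda s: targeted_receivers(sfm, s), reverse=True)

def erasure_free_average_delay(sfm, ordered):
    """Assumed form: sum_k T_k * i_k / sum_k T_k, with T_k the Target set size
    and i_k the index of the first coding set that contains packet k."""
    total_delay = total_wants = 0
    for k in range(1, len(sfm[0]) + 1):
        t_k = sum(row[k - 1] for row in sfm)
        if t_k == 0:
            continue  # packet k is not wanted by any receiver
        i_k = next(i for i, s in enumerate(ordered, start=1) if k in s)
        total_delay += t_k * i_k
        total_wants += t_k
    return total_delay / total_wants

ordered = order_collection(SFM, MIN_COLLECTION)
print([sorted(s) for s in ordered])
print(erasure_free_average_delay(SFM, ordered))
```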
III-C3 Comparisons
In addition to making the throughput and delay analysis of IDNC tractable, a lower feedback frequency can be advantageous in practical implementations of IDNC where the use of the reverse link is costly and involves transmission of some control overhead. Another practical attraction is that it also avoids solving the IDNC coding problem in every time slot. However, this comes at the potential cost of degradation in the overall system throughput in semi-online IDNC, as we explain next.
Imagine an IDNC scheme in the presence of erasures. In the fully-online feedback case, is updated before every coded transmission, so the coded packet is chosen from the minimum collection of the actual at the receivers. However, in the semi-online feedback case, the sender does not update until the round for is complete. Here belong to the minimum collection of the last revealed to the sender, but do not necessarily belong to that of the actual at the receivers. If this is the case, these coded packets can become throughput inefficient. Intuitively, we expect the gap between semi- and fully-online schemes to be small when packet erasure probability is low (in the extreme case where the packet erasure probability is zero, the two schemes perform the same). In any case, the throughput and delay analysis of the semi-online IDNC scheme serves as a worst-case scenario for an optimal fully-online IDNC with packet erasures.
IV Throughput Bounds
The findings in the last section are important because they enable theoretical analysis of the achievable throughput and decoding delay of IDNC. For throughput, is equal to the chromatic number of the complementary IDNC graph. For decoding delay, is the average decoding delay of the proposed optimal semi-online IDNC solution. We note, however, that there is no explicit formula to calculate the optimal . It can only be found via algorithmic implementations that can be computationally complex, as will be discussed in Section VI. Therefore, it is desirable to have bounds on that can be more easily calculated or algorithmically found. This is the aim of this section. Such bounds can be applied in, e.g., adaptive network coding systems that choose among IDNC and other network coding techniques to meet throughput and decoding delay requirements. Since the calculation of depends on as indicated by (4), we will first derive bounds on in this section and then on in the next section. We start with a review of existing results in graph theory and then propose useful bounds in the IDNC context.
IV-A Results in Graph Theory
Given a set of system parameters , the complementary IDNC graph after the systematic transmission phase can be modeled as the classic Erdős-Rényi random graph [25]. In this model, there are vertices and any two of them are connected by an undirected edge independently with probability . In the context of IDNC, is the probability that two original data packets conflict with each other and can be calculated as:
(5) 
Lemma 1.
Almost every random graph with vertices and vertex connection probability of has chromatic number of:
(6) 
where approaches zero with increasing .
Since , this lemma could be used to calculate the mean of under any set of . However, it is only asymptotically accurate for large . Since in IDNC systems may not be very large, (6) does not provide sufficient accuracy.
In graph theory, is bounded as:
(7) 
where is called the clique number of and is the size of the maximum (the largest maximal) clique in , and is the largest vertex degree of , i.e., the largest number of edges incident to any vertex in . As we will show later, while is a tight lower bound, the upper bound is very loose and is not useful for the IDNC framework. In the next two subsections, we derive loose bounds and tight bounds on , respectively, each comprising a lower and an upper bound. The loose bounds are easy to calculate and reveal the limits of IDNC, while the tight bounds are more computationally involved but are shown numerically to be accurate estimates of .
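The sandwich relation in (7), clique number at most chromatic number at most largest degree plus one, can be checked numerically on a small graph. The following is a minimal sketch using exhaustive search; the function names and the edge-list representation are illustrative assumptions and are only practical for graphs with a handful of vertices.

```python
from itertools import combinations, product

def clique_number(n, edges):
    """Size of the largest clique, found by exhaustive search."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(n, 0, -1):
        for cand in combinations(range(n), size):
            if all(frozenset(p) in edge_set for p in combinations(cand, 2)):
                return size
    return 0

def max_degree(n, edges):
    """Largest number of edges incident to any vertex."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return max(deg, default=0)

def chromatic_number(n, edges):
    """Smallest number of colours in a proper colouring (exhaustive)."""
    for k in range(1, n + 1):
        for colouring in product(range(k), repeat=n):
            if all(colouring[u] != colouring[v] for u, v in edges):
                return k
    return 0

# A 5-cycle: the lower bound is strict, the upper bound is met.
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
low, chi, high = clique_number(5, cycle), chromatic_number(5, cycle), max_degree(5, cycle) + 1
```

On the 5-cycle the three quantities are 2, 3 and 3, so neither bound is guaranteed to be tight in general.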
IV-B Loose Bounds
In this subsection, we find the smallest and largest possible of all the conflict matrices which have a size of and zero entries, with their set denoted by . The results are our loose lower and upper bounds and are denoted by and , respectively. They reveal the throughput limits of IDNC for any given and and are important references for practical/heuristic IDNC coding algorithm design: any algorithm offering above the upper bound or below the lower bound is throughput-inefficient or non-instantly decodable, respectively.
(9a)  
(9b)  
(9c)  
(9d)  
(9e) 
IV-B1
The general intuition here is to waste, as far as possible, the coding opportunities brought by the zeros. We first note that for any given original data packet, there are entries in about the conflict of that packet with all other packets. When , there are no coding opportunities, so . When , we can assign all zeros to the entries about the same original data packet, say . But remains because have to be transmitted separately. After zeros have been exhausted, there are entries in about every packet other than . Thus when , we can assign these extra zeros to the entries about the same original data packet, say , and remains because have to be transmitted separately. This iterative process indicates that decreases in a staircase fashion with . The relationship can be written as:
(8) 
One can easily verify that the proposed loose upper bound is much tighter than because the largest possible is always when .
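The staircase argument above can be sketched in a few lines. This is an illustration only, under the assumption that zeros are counted once per unordered packet pair, so that the first packet absorbs one fewer zero than the number of packets, the second packet two fewer, and so on; `loose_upper_bound`, `num_packets` and `num_zeros` are hypothetical names.

```python
def loose_upper_bound(num_packets, num_zeros):
    """Largest possible number of transmissions when the zeros are
    concentrated on as few packets as possible.

    Assumption: zeros are counted per unordered packet pair, so
    num_zeros lies between 0 and num_packets * (num_packets - 1) // 2.
    """
    max_zeros = num_packets * (num_packets - 1) // 2
    if not 0 <= num_zeros <= max_zeros:
        raise ValueError("num_zeros out of range")
    bound = num_packets
    remaining = num_zeros
    entries = num_packets - 1  # entries still available for the next packet
    while remaining > 0:
        bound -= 1             # one more packet can share a transmission
        remaining -= entries   # its entries absorb the next batch of zeros
        entries -= 1
    return bound

# Staircase for five packets and 0..10 zeros.
staircase = [loose_upper_bound(5, z) for z in range(11)]
```

For five packets the sketch yields 5 with no zeros, 4 for one to four zeros, 3 for five to seven zeros, 2 for eight or nine zeros, and 1 when all ten entries are zero.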
IV-B2
The intuition here is to make the best of the coding opportunities brought by the zeros. In other words, we should use as few zeros as possible to reduce by one. When , no original data packets can be coded together. Thus and . We then reduce iteratively. In each iteration, can be reduced by 1, i.e., the size of can be reduced by 1, if we can merge two maximal coding sets in together. Since any two packets from different maximal coding sets conflict, to merge two maximal coding sets of size, say and , together, we need zeros. In order to use as few zeros as possible, we always pick the two smallest sets, which minimize . Hence, in each iteration, it is impossible to reduce by 1 until new zeros are added to . This iterative process provides the lower bound . Similar to the upper bound, the lower bound also decreases in a staircase fashion with . Below is an example with :
Example 8.
When we have an all-one conflict matrix (), no packets can be coded together, and thus , as in (9a). Then in the first iteration, the size of can be reduced by one by merging and together, which requires one zero, as in (9b). In the second iteration, the size of can be reduced by one by merging and together, which requires one zero, as in (9c). In the third iteration, zeros are needed to merge the two smallest maximal coding sets and together because we have to resolve the conflicts between and and between and . In the last iteration, zeros are needed to merge and together. After that, all the 10 entries in become zeros and becomes 1. in this example can thus be expressed as:
(10) 
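The merging procedure behind the loose lower bound can be sketched as follows. The sketch assumes that merging two coding sets costs one zero per pair of packets drawn from the two sets, as described above, and that zeros are counted per unordered packet pair; the function name and arguments are hypothetical.

```python
def loose_lower_bound(num_packets, num_zeros):
    """Smallest possible number of transmissions when every zero is
    spent as efficiently as possible.

    Starts from singleton coding sets and repeatedly merges the two
    smallest sets, paying (size_a * size_b) zeros per merge.
    """
    sizes = [1] * num_packets
    remaining = num_zeros
    while len(sizes) > 1:
        sizes.sort()
        cost = sizes[0] * sizes[1]  # conflicts to resolve between the two sets
        if cost > remaining:
            break                   # not enough zeros for another merge
        remaining -= cost
        sizes[:2] = [sizes[0] + sizes[1]]
    return len(sizes)

# Five packets: merges cost 1, 1, 2 and 6 zeros in turn.
staircase = [loose_lower_bound(5, z) for z in range(11)]
```

With five packets, the successive merges cost 1, 1, 2 and 6 zeros, which matches the ten entries exhausted in Example 8.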
IV-C Tight Bounds
Because is well lower bounded by and , our tight lower bound on is defined as . It can be identified by finding the maximum clique in . We then find a tight upper bound, denoted by .
IV-C1
Our tight upper bound on is derived using an iterative operation on a graph , denoted by . It iteratively outputs the maximum clique in and then deletes it from until becomes empty. Mathematically:
(12)  
The resulting cliques form a partition of . Hence, if the initial is an IDNC graph, the resulting cliques form a semi-online IDNC solution, denoted by with cardinality . The minimum block completion time () of this IDNC instance is thus upper bounded by .
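The iterative operation in (12) can be illustrated with a small sketch that repeatedly extracts a maximum clique and deletes it. The exhaustive clique search used here is a stand-in chosen for clarity, not the method of Section VI, and all names are hypothetical.

```python
from itertools import combinations

def maximum_clique(vertices, adjacent):
    """Largest clique among `vertices`, found by exhaustive search."""
    verts = sorted(vertices)
    for size in range(len(verts), 0, -1):
        for cand in combinations(verts, size):
            if all(adjacent(u, v) for u, v in combinations(cand, 2)):
                return set(cand)
    return set()

def clique_partition(vertices, edges):
    """Repeatedly output a maximum clique and delete it from the graph.

    The returned cliques are disjoint and cover every vertex, so their
    number upper-bounds the size of any minimum clique cover.
    """
    edge_set = {frozenset(e) for e in edges}

    def adjacent(u, v):
        return frozenset((u, v)) in edge_set

    remaining = set(vertices)
    cliques = []
    while remaining:
        clique = maximum_clique(remaining, adjacent)
        cliques.append(clique)
        remaining -= clique
    return cliques

# Triangle {1, 2, 3} attached to the path 3-4-5.
example = clique_partition([1, 2, 3, 4, 5], [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)])
```

In this toy graph the triangle is removed first and the remaining edge second, giving two cliques.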
Remark 4.
The derivations of both and rely on finding the maximum clique in an IDNC graph, which is NP-complete [23, 22, 24]. In Section VI, we will propose a heuristic clique-finding algorithm, which will in turn enable finding some heuristic bounds on . These heuristic bounds are still provable, but they are suboptimal because the heuristically found maximal cliques are not necessarily maximum.
IV-D The Average Bounds on and Their Tightness
Based on the bounds we have derived for any instance of conflict matrix, we can calculate the average bounds over all the conflict matrices in so that the average bounds can be explicitly expressed as a function of .
Since the loose bounds are already functions of , the focus is on the average tight bounds, i.e., and . They can be obtained by listing all the conflict matrices in , calculating their bounds, and then averaging. However, this is usually unrealistic, since there are possible conflict matrices, a number which is prohibitively large even when and are not so large. Hence, averaging over “all” conflict matrices is replaced by Monte Carlo averaging, where instances of the conflict matrix are generated by assigning random permutations of zeros and ones to the conflict matrix.
We present the average bounds under original data packets and in Fig. 3. The optimal , obtained using the method in Section VI-A, is also averaged and plotted as a reference. It is denoted by . It decreases gradually with increasing , and so do the tight bounds and . The gap between the tight bounds and the optimal one is marginal, with a value of less than 0.5 transmissions on average for all . They are much tighter than the loose ones, which decrease in a staircase fashion with increasing .
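A minimal sketch of the Monte Carlo averaging step is given below. It assumes a symmetric conflict matrix with a zero diagonal, with zeros counted once per unordered packet pair; these conventions, the function names and the seeding are assumptions made for illustration, and `metric` stands for any bound computed from a conflict matrix.

```python
import random

def random_conflict_matrix(num_packets, num_zeros, rng):
    """Place `num_zeros` zeros and ones elsewhere, uniformly at random,
    on the off-diagonal pairs of a symmetric matrix (assumed layout)."""
    pairs = [(i, j) for i in range(num_packets) for j in range(i + 1, num_packets)]
    if not 0 <= num_zeros <= len(pairs):
        raise ValueError("num_zeros out of range")
    values = [0] * num_zeros + [1] * (len(pairs) - num_zeros)
    rng.shuffle(values)  # random permutation of zeros and ones
    matrix = [[0] * num_packets for _ in range(num_packets)]
    for (i, j), value in zip(pairs, values):
        matrix[i][j] = matrix[j][i] = value
    return matrix

def monte_carlo_average(metric, num_packets, num_zeros, trials, seed=0):
    """Average `metric(conflict_matrix)` over randomly generated instances."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += metric(random_conflict_matrix(num_packets, num_zeros, rng))
    return total / trials
```

Any of the bounds discussed in this section could be passed as `metric` once it is expressed as a function of the conflict matrix.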
V Decoding Delay Bounds
According to its definition in (4), the minimum average packet decoding delay of an SFM is determined by the optimal semi-online IDNC solution of the corresponding conflict matrix and the number of targeted receivers of all the original data packets . Deriving lower/upper bounds on of an SFM is thus equivalent to finding two instances of which offer the best/worst possible decoding delays, respectively. We first discuss what instances will yield such decoding delays.
An instance of is denoted by where is its cardinality. Its average packet decoding delay is denoted by and can be calculated using (4), where is now the index of the first maximal coding set in that contains . Therefore, for the purpose of calculation, can be removed from all the subsequent coding sets. After applying such removal to all the original data packets, the intersection between any two coding sets in becomes empty. These coding sets are not necessarily maximal and we denote them by to distinguish them from maximal ones. Below is an example.
Example 9.
Consider an instance , . After packet removal, the instance becomes: .
Let us denote by the number of targeted receivers of coding set . Without loss of generality we assume that:
(13) 
It holds that . Then, as a variation of (4), the average packet decoding delay under can also be calculated as:
(14) 
The above two equations indicate the conditions that the best/worst possible instances of should satisfy:

C1) Because for all , is minimized if has and for . Since it is rare to have coding sets wanted by only one receiver, we relax this condition as and refer to such as the best;

C2) is maximized with a value of if has and thus is the worst.
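The packet-removal step and the resulting delay calculation can be sketched as below. The averaging convention, a sum of (set index times number of targeted receivers) divided by the total number of targeted receivers, is an assumption made for this illustration, as are the function names and the dictionary `targeted_receivers`.

```python
def remove_repeated_packets(coding_sets):
    """Keep each packet only in the first coding set that contains it,
    so the resulting sets are pairwise disjoint."""
    seen = set()
    disjoint = []
    for coding_set in coding_sets:
        fresh = set(coding_set) - seen
        disjoint.append(fresh)
        seen |= fresh
    return disjoint

def average_decoding_delay(coding_sets, targeted_receivers):
    """Average delay over all (packet, targeted receiver) pairs when the
    i-th coding set is decoded in slot i (erasure-free, assumed form)."""
    weighted = 0
    total = 0
    for index, coding_set in enumerate(remove_repeated_packets(coding_sets), start=1):
        receivers = sum(targeted_receivers[p] for p in coding_set)
        weighted += index * receivers
        total += receivers
    return weighted / total

# Overlapping sets become {1, 2}, {3}, {4} after removal.
sets = [{1, 2}, {2, 3}, {3, 4}]
delay = average_decoding_delay(sets, {1: 2, 2: 1, 3: 1, 4: 2})
```

With three singleton sets and equal receiver counts, the same function returns 2.0, the midpoint of the three slots, which is the uniform case described in the second condition.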
We now propose different instances of and obtain lower/upper bounds on with different tightness.
V-A Loose Bounds
For any given SFM, without loss of generality we assume that its conflict matrix belongs to . By employing the loose bounds on for , we can derive loose bounds on .
V-A1 Loose lower bound
The smallest possible cardinality of the instance is equal to the loose lower bound on , that is, . Thus, is minimized if has:
(15) 
By substituting the above into (14), a loose lower bound on is obtained.
V-A2 Loose upper bound
The largest possible cardinality of the instance is equal to the loose upper bound on , that is, . Thus, is maximized with a value of if has uniform , as discussed in C2 after (14).
V-B Tight Bounds
V-B1 Tight lower bound
In the derivation of the tight lower bound on , we find a size clique in the complementary IDNC graph . Denote the original data packets included in this clique by , and without loss of generality assume that . These original data packets must be sent separately because they are not connected in , i.e., they conflict. In this case, the smallest decoding delay takes place when all the remaining original data packets can be coded together with in the first coding set. The sequence is:
(16) 
By substituting the above into (14), a tight lower bound on the minimum packet decoding delay is obtained and is denoted by .
V-B2 Tight upper bound
We use the IDNC solution found using the operation in (12) as our instance and thus its average decoding delay is our tight upper bound .
V-C The Average Bounds on and Their Tightness
In this subsection, we obtain the average bounds on using a method similar to that used for the average bounds on . For a given set of , conflict matrices are randomly generated, and the numbers of targeted receivers of the original data packets are also randomly generated. Their decoding delay bounds, as well as under the optimal semi-online solution, are calculated and averaged.
Simulation results for and are plotted in Fig. 4. The profiles of the average decoding delay bounds are similar to those of the average throughput bounds. The average loose bounds decrease with increasing in a staircase fashion, while the average tight bounds decrease gradually as .
The main difference is that is much closer to than in the throughput case, and for , the gap becomes negligible. The reason is that the IDNC solution can be viewed as a greedy IDNC solution in terms of decoding delay. It transmits the largest maximal coding set first, which is likely to target the most receivers. This result, together with the small gap between and , indicates that could be modified into a good heuristic IDNC coding algorithm, which will be discussed in the next section.
VI Implementations
In this section, we present the algorithmic implementations of IDNC. We first propose the optimal semi-online and fully-online IDNC coding algorithms and then their heuristic alternatives. We also employ a heuristic clique-finding algorithm to obtain heuristic tight bounds on the throughput and decoding delay performance of IDNC.
VI-A Optimal IDNC Coding Algorithms
Our optimal semionline IDNC coding algorithm finds the minimum collections of the conflict matrix in two steps:

Step 1: Find all the maximal coding sets (cliques): This problem is NP-complete, but it can be solved with an efficient recursive algorithm called the Bron-Kerbosch algorithm [28]. The group of all the maximal coding sets is denoted by .

Step 2: Find minimum collections from : We propose an iterative algorithm in Algorithm 1 for this purpose. The intuition behind this algorithm is that, if an original data packet belongs to maximal coding sets in , one of these maximal coding sets has to be transmitted. In the extreme case of , this maximal coding set must be sent. Below is an example of Algorithm 1.
Example 10.
Consider the graph model in Fig. 5. In Step 1, we find all the maximal cliques: , , , , . Then in Step 2:

Since is empty, none of the original data packets are included. Among them, has a diversity of only one under . Thus in the first iteration, the updated solution is ;

The remaining original data packets are . Among them has a diversity of one under . Thus in the second iteration, the updated solution is ,;

The remaining original data packets are and . They both have a diversity of two under . We pick and then branch: ,, and ,,. Since satisfies the diversity constraint, the algorithm ceases and returns as the minimum collection.
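The two steps can be sketched together for small graphs. The first function is the basic Bron-Kerbosch recursion without pivoting. The second is an exhaustive search over collections of maximal cliques and is a substitute used here for illustration; it is not the diversity-based branching of Algorithm 1. The adjacency-dictionary representation and all names are assumptions.

```python
from itertools import combinations

def maximal_cliques(adjacency):
    """Enumerate all maximal cliques (basic Bron-Kerbosch, no pivoting).

    `adjacency` maps each vertex to the set of its neighbours.
    """
    cliques = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(frozenset(clique))  # cannot be extended further
            return
        for v in list(candidates):
            expand(clique | {v}, candidates & adjacency[v], excluded & adjacency[v])
            candidates = candidates - {v}      # v has been fully explored
            excluded = excluded | {v}

    expand(set(), set(adjacency), set())
    return cliques

def minimum_collections(adjacency):
    """All smallest collections of maximal cliques that cover every vertex
    (exhaustive search; exponential in the number of cliques)."""
    cliques = maximal_cliques(adjacency)
    packets = set(adjacency)
    for size in range(1, len(cliques) + 1):
        found = [c for c in combinations(cliques, size)
                 if set().union(*c) == packets]
        if found:
            return found
    return []

# Triangle {1, 2, 3} attached to the path 3-4-5.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
collections = minimum_collections(graph)
```

In this toy graph the maximal cliques are {1, 2, 3}, {3, 4} and {4, 5}, and the only minimum collection consists of the first and the last.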
If the above two-step coding algorithm outputs several minimum collections, different criteria can be used for selection, such as the smallest average packet decoding delay or the highest average packet diversity. In our simulations, we select the one having larger diversities for data packets wanted by more receivers, i.e., the collection that maximizes will be chosen, where is the diversity of within and is the number of targeted receivers of .
If fully-online feedback is allowed and computational cost at the sender is not an issue, the optimal fully-online IDNC scheme can be applied, where in every time slot, the sender calculates the optimal semi-online solution as above, but sends only the first maximal coding set and then collects feedback.
VI-B Hybrid IDNC Coding Algorithms
Algorithm 1 is optimal because it finds all the possible minimum collections. However, it is also memory-demanding because the number of candidate solutions usually grows exponentially after the branching in every iteration. Thus in this subsection, we propose a simple greedy alternative to it. We first choose the largest maximal coding set in . Then for the remaining original data packets that have not been covered, we look for a maximal coding set which comprises most of them. This iterative algorithm only produces one collection, which may be suboptimal because its cardinality may be greater than . The optimal clique finding in Step 1, together with this heuristic algorithm in Step 2, is referred to as the hybrid semi-online IDNC coding algorithm.
If fully-online IDNC is applied, after finding in Step 1, we can greedily choose the maximal coding set in that targets the maximum number of receivers. This algorithm is referred to as the hybrid fully-online IDNC coding algorithm.
To reduce the computational load due to Step 1, we resort to a fully-heuristic clique-finding algorithm next.
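The greedy selection described above can be sketched as a standard greedy cover. The input is assumed to be the list of maximal coding sets already found in the first step; the function name and the tie-breaking rule (first set in list order wins) are illustrative assumptions.

```python
def greedy_collection(maximal_sets):
    """Greedily pick maximal coding sets until every packet is covered.

    In each iteration, the set covering the most uncovered packets is
    chosen; initially this is simply the largest maximal coding set.
    """
    uncovered = set().union(*maximal_sets)
    collection = []
    while uncovered:
        best = max(maximal_sets, key=lambda s: len(uncovered & s))
        collection.append(best)
        uncovered -= best
    return collection

# Maximal coding sets of a toy instance with five packets.
chosen = greedy_collection([{1, 2, 3}, {3, 4}, {4, 5}])
```

The result is a single collection whose cardinality may exceed the minimum, which is the price paid for avoiding the branching.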
VI-C Heuristic IDNC Coding Algorithms
A simple algorithm that heuristically finds the maximum (the largest maximal) clique in a graph is provided in Algorithm 2. The intuition behind this algorithm is that a vertex is very likely to be in the maximum clique if it has the largest number of edges incident to it. This algorithm has been employed in [4, 20, 9] for fully-online IDNC, so we also refer to it as the heuristic fully-online IDNC coding algorithm. However, it has not been applied to semi-online IDNC and its computational complexity has not been identified yet.
The computational complexity of this algorithm is polynomial in the number of original data packets . The highest computational complexity occurs when the input graph is complete, i.e., all the vertices are connected to each other. Under this scenario, only one vertex can be removed in each iteration (in Step 8) and thus the size of the graph in the th iteration, , will be . As a result, the highest computational complexity is . In practice, the graph size will shrink much faster after each iteration, and the number of iterations is usually smaller than . Hence, the computational complexity of this algorithm is loosely upper-bounded by . In other words, the computational complexity of this algorithm is .
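The degree-based intuition can be sketched as follows. This is an illustration written from the description above, not a transcription of Algorithm 2: it assumes that in each iteration the vertex with the most neighbours in the current graph is selected and that all vertices not adjacent to it are then discarded. The adjacency-dictionary input and the tie-breaking by smallest label are assumptions.

```python
def heuristic_max_clique(adjacency):
    """Greedy clique search: repeatedly take the highest-degree vertex
    of the current graph and keep only its neighbours.

    `adjacency` maps each vertex to the set of its neighbours
    (no self-loops). The result is a maximal clique, not necessarily
    a maximum one.
    """
    candidates = set(adjacency)
    clique = set()
    while candidates:
        # Degree is measured inside the current (shrunken) graph.
        v = max(sorted(candidates), key=lambda u: len(adjacency[u] & candidates))
        clique.add(v)
        candidates &= adjacency[v]  # drops v itself and all non-neighbours
    return clique

# Triangle {1, 2, 3} attached to the path 3-4-5.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
found = heuristic_max_clique(graph)
```

On a complete graph the candidate set loses exactly one vertex per iteration, which is the worst case discussed above.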
VI-C1 Heuristic bounds
Here, we apply Algorithm 2 to heuristically find the proposed tight bounds on the throughput, and then the corresponding tight bounds on the decoding delay. The results are shown in Figs. 3 and 4, respectively. It is observed that the performance degradation due to the heuristic algorithm is marginal for both throughput and decoding delay. Therefore, the heuristic tight bounds could serve as reliable and efficient estimates of the throughput and decoding delay.
VI-C2 Heuristic semi-online IDNC coding algorithm
The operation in (12) can be implemented by using Algorithm 2. The outcome is a heuristic semi-online IDNC solution , which offers good throughput and decoding delay performance in the erasure-free scenario. However, since its cliques are disjoint, all the original data packets have a diversity of only one and thus are vulnerable to packet erasures in real systems.
To overcome this drawback, we propose a heuristic semi-online IDNC coding algorithm in Algorithm 3, which is an extension of . The key idea here is that, in the th iteration, after finding clique , we try to enlarge this clique by adding previously covered vertices to it whenever possible, i.e., the vertices in . By doing so, the diversity of the newly added vertices (packets) is increased by one. Below is an example:
Example 11.
Consider the graph in Fig. 1(c). In the first two iterations, the algorithm will choose and , respectively, without any adding. In the third iteration, we have