On Throughput and Decoding Delay Performance of
Instantly Decodable Network Coding
In this paper, a comprehensive study of packet-based instantly decodable network coding (IDNC) for single-hop wireless broadcast is presented. The optimal IDNC solution in terms of throughput is proposed and its packet decoding delay performance is investigated. Lower and upper bounds on the achievable throughput and decoding delay performance of IDNC are derived and assessed through extensive simulations. Furthermore, the impact of receivers’ feedback frequency on the performance of IDNC is studied and optimal IDNC solutions are proposed for scenarios where receivers’ feedback is only available after an IDNC round, composed of several coded transmissions. However, since finding these optimal IDNC solutions is computationally complex, we further propose simple yet efficient heuristic IDNC algorithms. The impact of system settings and parameters such as channel erasure probability, feedback frequency, and the number of receivers is also investigated and simple guidelines for practical implementations of IDNC are proposed.
Instantly decodable network coding (IDNC) is a class of linear network coding schemes [1, 2, 3, 4, 5, 6, 7] which has been widely studied and applied in wireless unicast, multicast and broadcast systems. It has been shown that IDNC schemes can quite significantly improve data throughput in such systems compared to their uncoded counterparts [1, 2], while offering simple XOR-based encoding and decoding. Furthermore, IDNC provides instant packet decodability at the receivers, which can result in faster delivery of the packets to the application layer compared to other linear network coding schemes [5, 6].
In this paper, we are primarily concerned with investigating the performance limits of IDNC schemes for data dissemination in single-hop wireless broadcast systems in terms of throughput and delay. In such systems, there is a single sender who wishes to broadcast a block of data packets to multiple receivers [4, 5, 6, 7]. Due to packet erasures in wireless fading channels, some transmitted packets are lost at the receivers. Generally, information about received or lost packets is fed back from the receivers to the sender after transmission of one or multiple packets. The sender then determines which data packets from the block to combine and transmit next subject to IDNC constraints. This process is repeated until the broadcast of the block is complete, i.e. until all receivers have decoded all data packets.
In such systems, the time it takes to complete the block, or simply the IDNC block completion time, is a fundamental measure of its throughput performance and will be studied in this paper. Taking the block completion time of random linear network coding (RLNC)  as a benchmark, many works in the literature have been concerned with proposing IDNC schemes with good throughput performance [4, 5, 6, 9, 10, 11, 12, 13, 14, 15, 7]. The majority of these schemes collect feedback about the lost packets and determine an online IDNC solution accordingly, which comprises one or more coded packets, such that it can efficiently bring the system closer to block completion.
Although these works differ in their models and assumptions about the frequency and reliability of feedback [11, 12], at their core they run dynamic IDNC algorithms, responding to erasure patterns that have occurred during transmission. The main limitation of such studies is that it is impossible to say a priori how long it will actually take to complete a block starting with a certain system packet reception state at the receivers. Furthermore, such an IDNC solution is in fact the result of a local optimization, and there is no guarantee that the solution is globally optimal. Therefore, the following two fundamental questions still seem unanswered in the literature:
What is the best throughput performance of IDNC?
Which IDNC solution can achieve this best throughput performance?
By best throughput, we refer to the minimum block completion time that is possible by using IDNC starting with a certain system packet reception state, in the absence of any future erasures. It is clear that packet erasures in an actual system can defer block completion. Therefore, our measure of throughput is the best possible performance of IDNC schemes in terms of throughput and serves as an upper bound on what IDNC can achieve in the presence of erasures. Such a measure of throughput is significant because it disentangles the effects of channel-induced packet erasures and algorithm-induced IDNC coded packet selection on the throughput of the system. Based on this measure of throughput, we propose the concept of an optimal IDNC scheme, which refers to an IDNC scheme that provides a globally optimal IDNC solution and achieves the best throughput performance in the absence of any future erasures.
Besides block completion time, another fundamental performance metric of IDNC is its decoding delay. There are various definitions of decoding delay in the literature [1, 3, 5]. In this work, we consider packet decoding delay, defined as the number of time slots it takes until the data packets are decoded by the receivers. The reasons for our choice are twofold. First, short packet decoding delay is the main advantage of IDNC, which is particularly desirable for applications in which data packets are useful regardless of their order. Second, packet decoding delay is naturally related to the throughput of IDNC and its respective IDNC solution. That is, having investigated the best throughput performance of IDNC and the optimal IDNC solution, decoding delay limits of IDNC schemes can be obtained with relative ease.
The contributions of this paper can be summarized as follows. First, the best achievable throughput performance of IDNC, regardless of packet erasure probabilities and feedback frequency, and its corresponding optimal IDNC solution are rigorously obtained. Furthermore, the concept of IDNC packet diversity in the optimal IDNC solution is introduced. It is a measure of the robustness of an IDNC solution against packet erasures. While ensuring the optimal throughput performance, our proposed IDNC solution enhances packet diversity wherever possible, hence enhancing its robustness against erasures. This feature distinguishes our optimal IDNC solution from other IDNC solutions in the literature, where packet diversity has never been considered, to the best of our knowledge.
Second, the impact of feedback frequency on the performance of the IDNC scheme is investigated, the concept of semi-online feedback is introduced and optimal fully-online and optimal semi-online IDNC schemes are devised.
Third, we derive lower and upper bounds on the throughput and decoding delay performance of IDNC schemes. Furthermore, we design the optimal IDNC coding algorithm, as well as its simplified alternatives that offer efficient performances with much lower computational complexities.
The performance of these algorithms is evaluated via extensive simulations under different settings of system parameters. The results illustrate the interactions among these parameters and can serve as simple implementation guidelines. Plenty of hands-on examples are also designed to demonstrate the proposed concepts, theorems, methodologies, and algorithms. In summary, this work can be a useful reference in the IDNC literature and can motivate further research.
I-A Additional Remarks
IDNC can be divided into two categories, strict IDNC (S-IDNC) [4, 5] and general IDNC (G-IDNC) [9, 7]. Although they have the same system model and use similar dynamic algorithms, they differ in the sense that G-IDNC coded packets are allowed to include two or more new data packets for some receivers. However, this is not allowed in S-IDNC. In this paper, we focus on S-IDNC (or IDNC for short). The relationship and comparisons between S- and G-IDNC will be discussed when necessary.
The S-IDNC problem is, to some extent, related to the index coding problem [16, 17, 18], especially when memoryless decoding is considered in the index coding problem [16, 17]. Nevertheless, their problem formulations are different. A basic assumption in index coding is that a receiver who has successfully received a subset of packets but is still missing multiple packets can be considered as multiple receivers, each wanting only one of the missing packets. However, such splitting is prohibited in S-IDNC, for it would violate the instantly decodable property of IDNC coded packets. Therefore, results of index coding and S-IDNC cannot be used interchangeably.
II System Model and Notations
II-A Transmission Setup
We consider a packet-based wireless broadcast scenario from one sender to receivers. Receiver is denoted by and the set of all receivers is . There are a total of binary packets with identical length to be delivered to all receivers. Packet is denoted by and the set of all packets is . Sometimes we will refer to as an original data packet to distinguish it from a coded packet. Time is slotted and in each time slot, one (coded or original data) packet is broadcast. The wireless channel between the sender and each receiver is modeled as a memoryless erasure link with i.i.d. packet erasure probability of . The results proposed in the paper can be generalized, with proper modifications, to non-homogeneous erasure links.
II-B Systematic Transmission Phase and Receivers Feedback
Initially, the packets are transmitted uncoded once using time slots. This is the systematic transmission phase. After this phase, each receiver provides feedback to the sender about the packets it has received or lost. (We assume that there exists an error-free feedback link between each receiver and the sender that can be used with appropriate frequency.) The number of packets that are not received by at least one receiver due to erasures is denoted by and their set is denoted by , where . The number of receivers that have not received all the packets is denoted by and the set of these receivers is denoted by , where .
The complete state of receivers and packets can be captured by an state feedback matrix (SFM) (also known as receiver-packet incidence matrix ), where the element at row and column is denoted by and
The Wants set of receiver , denoted by , is the subset of packets in which are lost at due to packet erasures. That is, .
The Target set of a packet , denoted by , is the subset of receivers in who want packet . That is, . The size of is denoted by .
Consider the SFM in Fig. 1(a). There are packets and receivers after the systematic transmission phase. The Wants set of is . The Target set of is and thus .
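The Wants and Target sets can be read directly off the rows and columns of an SFM. The following minimal sketch illustrates this on a hypothetical SFM (not the one in Fig. 1(a)). It assumes that an entry of 1 means "the receiver still wants this packet", and it uses 0-based indices; the function names are illustrative only.

```python
# Hypothetical SFM: rows are receivers, columns are packets.
# Assumption: an entry of 1 means the receiver lost (still wants) the packet.
SFM = [
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]


def wants_set(sfm, n):
    """Packets (column indices) that receiver n has lost."""
    return {k for k, a in enumerate(sfm[n]) if a == 1}


def target_set(sfm, k):
    """Receivers (row indices) that want packet k."""
    return {n for n, row in enumerate(sfm) if row[k] == 1}


wants = [wants_set(SFM, n) for n in range(len(SFM))]
targets = [target_set(SFM, k) for k in range(len(SFM[0]))]
print("Wants sets: ", wants)
print("Target sets:", targets)
print("Target set sizes:", [len(t) for t in targets])
```

In this hypothetical instance the second packet (index 1) is wanted by three receivers, so its Target set has size three.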
II-C Coded Transmission Phase
In this subsection, we present some basic definitions and performance metrics related to IDNC. Then in the next subsection, we will briefly discuss existing models in the literature to deal with the IDNC problem.
After the systematic transmission phase and collecting receivers’ feedback, the coded transmission phase starts. In this phase, IDNC aims to satisfy the demands of all receivers by sending coded packets under two fundamental restrictions:
The sender uses the binary field for linear coding;
Receivers do not store received coded packets for future decoding, i.e. memory is not required at the receivers.
More precisely, the first restriction means that the -th transmitted coded packet is of the form
where and the summation is bit-wise XOR . We denote by the set of original data packets that have non-zero coefficients in , namely, . fully represents and is called a coding set. Based on (2), can be one of the following for each receiver:
A coded packet is instantly decodable for receiver if contains only one original data packet from the Wants set of .
A coded packet is non-instantly decodable for receiver if contains two or more original data packets from the Wants set of .
Due to restriction 2 above, a non-instantly decodable coded packet will be discarded by upon receiving.
A coded packet is non-innovative for receiver if only contains original data packets not from the Wants set of . Otherwise, it is innovative.
A non-innovative coded packet will also be discarded by upon receiving.
Consider the SFM in Fig. 1(a). is instantly decodable for because the corresponding coding set only contains one original data packet () from the Wants set of . Thus, can instantly decode through the operation . is also instantly decodable for . However, is non-instantly decodable for because both and are from the Wants set of . is non-innovative for because has both and already.
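The three cases above depend only on how many packets of the coding set fall inside a receiver's Wants set. A minimal sketch of this classification, together with XOR decoding, is given below. The Wants sets and the integer payloads are hypothetical and are not taken from Fig. 1(a).

```python
# Hypothetical Wants sets of five receivers (0-based packet indices).
WANTS = [{0, 1}, {2, 3}, {1, 4}, {0, 3}, {1}]


def classify(coding_set, wants):
    """Classify a coded packet for one receiver by counting wanted packets."""
    hits = len(coding_set & wants)
    if hits == 0:
        return "non-innovative"
    if hits == 1:
        return "instantly decodable"
    return "non-instantly decodable"


for coding_set in ({0, 2}, {0, 1}):
    print(sorted(coding_set), [classify(coding_set, w) for w in WANTS])

# XOR decoding with toy integer payloads (hypothetical values).
payload = {0: 0b1010, 2: 0b0110}
coded = payload[0] ^ payload[2]   # sender combines packets 0 and 2
recovered = coded ^ payload[2]    # receiver 0 already holds packet 2
print(recovered == payload[0])
```

The coding set {0, 1} is non-instantly decodable for the first receiver because both packets are in its Wants set, so that receiver would discard the coded packet.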
The throughput and decoding delay performance of IDNC can be measured by the minimum number of coded transmissions and the average packet decoding delay, where:
Given an SFM, the minimum number of coded transmissions, or equivalently the minimum block completion time, is the smallest possible number of IDNC coded transmissions required in order to satisfy the demands of all the receivers in the absence of any future packet erasures. This number is denoted by .
In the next section, we will further show that cannot be reduced regardless of feedback frequency. Thus, we claim that is the absolute minimum number of coded transmissions. indicates the best throughput performance, which can be calculated as . Such measure is important as it disentangles the effect of channel-induced packet erasures and algorithm-induced IDNC coded packet selections on the throughput of the system.
Next, we define the average packet decoding delay, .
Denote by the time slot in the coded transmission phase when original data packet is decoded by receiver , and let if . Then:
Consider the SFM in Fig. 1(a). Assume that four IDNC coded packets , , , and are transmitted. Assuming erasure-free transmissions, all the receivers will be satisfied after four time slots. are summarized in Table I. The block completion time is 4 and the average packet decoding delay is . However, we have not discussed or determined yet if they are the best throughput and decoding delay performance of IDNC.
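Both metrics can be evaluated for any transmission schedule by replaying it without erasures. The sketch below does this for a hypothetical set of Wants sets (not Fig. 1(a)). It assumes that the average packet decoding delay is taken over all (receiver, wanted packet) pairs, which is how we read the definition above; the function name is illustrative.

```python
# Hypothetical Wants sets of five receivers (0-based packet indices).
WANTS = [{0, 1}, {2, 3}, {1, 4}, {0, 3}, {1}]


def replay(wants, schedule):
    """Replay coding sets without erasures.

    Returns (all_satisfied, completion_time, average_decoding_delay).
    Assumption: the average is over all (receiver, wanted packet) pairs.
    """
    remaining = [set(w) for w in wants]
    decode_slots = []
    completion = 0
    for slot, coding_set in enumerate(schedule, start=1):
        for rem in remaining:
            hit = coding_set & rem
            if len(hit) == 1:          # instantly decodable for this receiver
                rem -= hit
                decode_slots.append(slot)
                completion = slot
    satisfied = all(not rem for rem in remaining)
    return satisfied, completion, sum(decode_slots) / len(decode_slots)


coded = [{1, 3}, {0, 2, 4}]
uncoded = [{0}, {1}, {2}, {3}, {4}]
print(replay(WANTS, coded))
print(replay(WANTS, uncoded))
```

For this instance the coded schedule completes in two slots, whereas uncoded retransmission needs five, and the coded schedule also has the smaller average decoding delay.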
II-D S-IDNC versus G-IDNC
S-IDNC constraint: for every receiver , the coding set contains at most one original data packet from the Wants set of . In other words, for every receiver, the coded packet in (2) is either instantly decodable or non-innovative, but it is never non-instantly decodable.
In , it is shown that the S-IDNC constraint on the coded packets can be represented using an undirected graph with vertices corresponding to wanted original data packets. More details on the graphical representation of S-IDNC will be provided in Section III.
In contrast, in the general IDNC or G-IDNC proposed in [20, 9], the S-IDNC constraint is relaxed by allowing the sender to send coded packets that are non-instantly decodable for a selected subset of receivers. If a receiver receives such a coded packet, it will discard that packet. In other words, in G-IDNC the sender is not restricted to sending coded packets that are instantly decodable for all the receivers, but the receivers adhere to the IDNC decoding principle. Recently, a new type of G-IDNC was proposed in , which further relaxes this constraint by allowing receivers to store non-instantly decodable packets for future decoding so that they are not wasted.
Consider the SFM in Fig. 1(a). is a valid coded packet for both S-IDNC and G-IDNC. However, is a valid coded packet for G-IDNC but not for S-IDNC, since it is non-instantly decodable for . In G-IDNC, when receives , it will discard it.
The G-IDNC problem can also be modeled using an undirected graph in which a vertex is added for each lost packet of each receiver. The key operation of G-IDNC algorithms is to search for the largest maximal clique(s) [20, 9, 6]. (In an undirected graph, all vertices in a clique are connected to each other by an edge. A clique is maximal if it is not a subset of a larger clique.) The size of the G-IDNC graph is , while the size of the S-IDNC graph is .
In the rest of this paper, our aim is to better understand and characterize the S-IDNC problem from both theoretical and implementation viewpoints. An important note is that the characteristics of S-IDNC cannot be directly extended to G-IDNC due to the fact that they construct and update their graphs in different ways, as will be explained in Section III. Characterizing G-IDNC could be the objective of future research and is out of the scope of this paper. In the rest of the paper, when there is no ambiguity, we will simply refer to S-IDNC as IDNC.
III The Optimal IDNC
IDNC constraints of an SFM can be represented by an undirected graph with vertices. Each vertex represents a wanted original data packet . Two vertices and are connected by an edge if and are not jointly wanted by any receiver . This graph model, however, has only been employed in the literature to heuristically find IDNC coding solutions . In this section, we will first revisit this graph model by constructing its equivalent matrix and set models. Then, we will use these models to rigorously prove some theorems about the minimum block completion time, . The key difference between S-IDNC and G-IDNC will become clear after the proofs. Based on these theorems, we will discuss the effect of feedback frequency on IDNC throughput and propose the optimal IDNC schemes with fully- or semi-online feedback. Although some similar concepts and results exist in the graph theory literature [22, 23], their compilation, presentation and more importantly interpretation in the IDNC context is new, to the best of our knowledge. We will highlight the similarities, differences, and new results as appropriate.
III-A IDNC Modeling
In this subsection, we construct the matrix and set models of IDNC and demonstrate their relationship with its graph model. The construction is based on the concepts of conflicting and non-conflicting original data packets, defined as follows.
Two original data packets and conflict with each other if both belong to the Wants set, , of at least one receiver such as . Mathematically, we can denote a conflict between and by , where . and do not conflict otherwise.
It is clear that to avoid non-instantly decodable coded packets, two conflicting original data packets and cannot be coded together. The equivalent of such conflict in the graph model is the absence of an edge between and [4, 23]. On the other hand, two non-conflicting data packets and have their respective vertices and connected.
The conflict states of all the original data packets can be fully described by a triangular conflict matrix of size :
A fully-square conflict matrix of size is a binary-valued matrix with element at row and column denoted by corresponding to the conflict state of packets and . In particular, if and otherwise. Due to the symmetry of conflict between packets and noting that , , we can reduce the fully-square matrix to a triangular matrix of size . From now on, by conflict matrix, we mean the reduced triangular matrix .
The conflict matrix and the graph model of the SFM in Fig. 1(a) are presented in Fig. 1(b) and Fig. 1(c), respectively. Note that in dealing with the conflict matrix, we are not concerned with the receivers who may need a certain packet. In fact, it is not difficult to show that two or more SFMs can have the same conflict matrix. Unless otherwise stated, it suffices to deal with conflict matrix for design and analysis of IDNC, instead of SFM .
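The construction of the conflict matrix from an SFM can be sketched as follows. The SFM is hypothetical (not Fig. 1(a)), an entry of 1 is assumed to mean "wanted", and storing the upper triangle row by row is only one possible layout of the triangular matrix.

```python
from itertools import combinations

# Hypothetical SFM: rows are receivers, columns are packets, 1 = wanted.
SFM = [
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]


def conflict_pairs(sfm):
    """Packet pairs (i, j), i < j, jointly wanted by at least one receiver."""
    num_packets = len(sfm[0])
    return {
        (i, j)
        for i, j in combinations(range(num_packets), 2)
        if any(row[i] == 1 and row[j] == 1 for row in sfm)
    }


def triangular_conflict_matrix(sfm):
    """Upper triangle of the conflict matrix: row i holds pairs (i, i+1..)."""
    k = len(sfm[0])
    pairs = conflict_pairs(sfm)
    return [[1 if (i, j) in pairs else 0 for j in range(i + 1, k)]
            for i in range(k - 1)]


print(sorted(conflict_pairs(SFM)))
print(triangular_conflict_matrix(SFM))
# Dropping the last receiver gives a different SFM with the same conflicts.
print(conflict_pairs(SFM[:4]) == conflict_pairs(SFM))
```

The last line illustrates the remark that distinct SFMs can share one conflict matrix: the fifth receiver wants a single packet and therefore creates no conflict.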
We now define the key concept of maximal coding set that is allowed for IDNC transmission.
A maximal coding set, , is a set of original data packets which simultaneously satisfy the following two properties: 1) their XOR coded packet satisfies the IDNC constraint in Definition 8, i.e. is either instantly decodable or non-innovative for every receiver ; 2) adding any other original data packet from to will make the resulting non-instantly decodable for at least one receiver in .
Consider a coding set of the SFM in Fig. 1(a). Its corresponding coded packet is . is instantly decodable for and is non-innovative for . One can verify that adding any other original data packet from to will make non-instantly decodable for at least one receiver. Hence, is a maximal coding set. Through exhaustive search, one can find all remaining maximal coding sets: , , .
The equivalent of maximal coding sets in the IDNC graph model is known as maximal cliques . Therefore, we use to denote both a maximal coding set and a maximal clique.
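Because maximal coding sets are maximal cliques of the IDNC graph, they can be enumerated with any maximal-clique algorithm. The sketch below uses the basic Bron-Kerbosch recursion without pivoting, which is our choice for illustration and not necessarily the search used in this paper. The conflict pairs are those of a hypothetical five-packet instance.

```python
from itertools import combinations

# Hypothetical conflict pairs among five packets (0-based indices).
CONFLICTS = {(0, 1), (0, 3), (1, 4), (2, 3)}
NUM_PACKETS = 5


def idnc_graph(conflicts, num_packets):
    """Adjacency sets: two packets are adjacent iff they do NOT conflict."""
    adj = {k: set() for k in range(num_packets)}
    for i, j in combinations(range(num_packets), 2):
        if (i, j) not in conflicts:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    found = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            found.append(frozenset(clique))
            return
        for v in list(candidates):
            expand(clique | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return found


cliques = maximal_cliques(idnc_graph(CONFLICTS, NUM_PACKETS))
print(sorted(sorted(c) for c in cliques))
```

The running time of such an enumeration can grow exponentially with the number of packets, so this is only practical for small instances.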
For reasons that become clear at the end of this subsection, we ensure that in each IDNC coded transmission, the sender will code all and not a subset of the original data packets in a maximal coding set. To satisfy the demands of all the receivers, the sender has to transmit coded packets from an appropriately chosen collection of maximal coding sets. To achieve this, each original data packet should appear at least once in this collection. This condition can be formally represented as the diversity constraint, where diversity of a packet is defined as:
The diversity of an original data packet within a collection of maximal coding sets is denoted by and is the number of maximal coding sets in which it appears.
A collection of maximal coding sets satisfies the diversity constraint iff every original data packet has a diversity of at least one within this collection.
Given all the maximal coding sets of a conflict matrix , there exists at least one collection which satisfies the diversity constraint (in the extreme case all the maximal coding sets include all the original data packets). The size of the collection is the number of maximal coding sets in it. We then define the minimum collection and its size as follows:
A collection of maximal coding sets is minimum if there does not exist any other collection which satisfies the diversity constraint with a smaller size. The size of the minimum collection is called the minimum collection size.
This number, as we will prove in the next subsection, is exactly the minimum number of coded transmissions, . We thus denote a minimum collection by
If the maximal coding sets in are sent using time slots, in the absence of packet erasures, the demands of all receivers will be satisfied. Here by “sending a maximal coding set”, we mean that the corresponding coded packet is generated and sent.
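For small instances, a minimum collection can be found by exhaustive search over subsets of the maximal coding sets, in increasing order of size. The sketch below is a brute-force illustration only, not an efficient algorithm. The maximal coding sets listed are those of a hypothetical five-packet instance with conflict pairs (0,1), (0,3), (1,4) and (2,3).

```python
from itertools import combinations

# Maximal coding sets of a hypothetical five-packet instance (0-based).
MAXIMAL_SETS = [
    frozenset({0, 2, 4}),
    frozenset({1, 2}),
    frozenset({1, 3}),
    frozenset({3, 4}),
]
PACKETS = set(range(5))


def minimum_collections(maximal_sets, packets):
    """Return all smallest collections in which every packet appears."""
    for size in range(1, len(maximal_sets) + 1):
        found = [
            collection
            for collection in combinations(maximal_sets, size)
            if set().union(*collection) >= packets   # diversity constraint
        ]
        if found:
            return found
    return []


result = minimum_collections(MAXIMAL_SETS, PACKETS)
for collection in result:
    print([sorted(s) for s in collection])
print("minimum collection size:", len(result[0]))
```

In this instance the minimum collection is unique and has size two, so two erasure-free coded transmissions satisfy all receivers.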
A problem in graph theory which is somewhat similar to finding a minimum collection of maximal coding sets is the minimum clique cover problem [4, 23, 22]. In this problem, a graph is partitioned into disjoint cliques and the partitioning solution that results in the smallest number of disjoint cliques is referred to as the minimum clique cover solution of the problem. Furthermore, it is worth noting that the cardinality of the minimum clique cover solution is equal to the chromatic number of the complementary graph of , denoted by . (The complementary graph has opposite vertex connectivity to .) However, there is a difference between the minimum clique cover problem and our minimum collection finding problem, as cliques do not overlap in the minimum clique cover problem. That is, the cliques are not necessarily maximal and each vertex appears in only one clique. This would be equivalent to choosing a minimum collection of coding sets in our IDNC model where all original data packets have a diversity equal to one. This would not change . However, it can have a serious adverse impact on IDNC’s robustness to erasures, which in turn degrades the overall throughput and decoding delay performance of IDNC. Consequently, it is desirable to choose a minimum collection of maximal coding sets that, while satisfying , provides as much packet diversity as possible. The minimum collection and the importance of packet diversity are illustrated with the following example.
By sending these three coding sets in using transmissions, the demands of all the receivers will be satisfied in the absence of packet erasures. All the original data packets in have a diversity of one, except which has a diversity of 2, i.e. . Now, let us assume that there is a packet erasure probability of in the transmission links between the sender and the receivers. Under this scenario, with these three coded transmissions, the probability of being lost at its targeted receiver (due to erasures) will be . This probability is much lower than that of other original data packets, which will be equal to .
The problem of finding all the maximal cliques and the problem of minimum clique cover for an undirected graph are both computationally hard [23, 22, 24]. This is also true for S-IDNC because the S-IDNC graph does not have any special structural properties. A similar statement can be found in  for G-IDNC. Since a minimum collection of an S-IDNC conflict matrix can be reduced to a minimum clique cover solution of an S-IDNC graph by reducing the diversity of all data packets to one, the problem of finding minimum collections in S-IDNC is at least as hard as the minimum clique cover problem, i.e. it is NP-hard. Its exact and simplified algorithms will be presented in Section VI.
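The effect of diversity on robustness can be quantified with a short calculation. If a packet appears in d coding sets of the collection and erasures are independent across transmissions, a targeted receiver misses it only when all d transmissions are erased. The sketch below uses a hypothetical three-packet collection (not the one in this example) and an assumed erasure probability of 0.2.

```python
from collections import Counter

# Hypothetical minimum collection: packet 1 appears in both coding sets.
COLLECTION = [{0, 1}, {1, 2}]
ERASURE_PROB = 0.2  # assumed value, for illustration only


def diversity(collection):
    """Number of coding sets of the collection in which each packet appears."""
    return Counter(k for coding_set in collection for k in coding_set)


def loss_probability(d, pe):
    """Probability that all d transmissions carrying a packet are erased."""
    return pe ** d


div = diversity(COLLECTION)
for packet in sorted(div):
    print(packet, div[packet], loss_probability(div[packet], ERASURE_PROB))
```

A disjoint clique cover of the same instance, such as {0, 1} and {2}, would also need two transmissions but would leave every packet with a diversity of one.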
III-B The Equivalence of , the Minimum Collection Size, and
In this subsection, we prove that the three numbers: 1) the minimum number of coded transmissions (), 2) the minimum collection size of the conflict matrix, and 3) the chromatic number of the complementary graph (), are identical. Based on this we propose two important remarks.
For a given conflict matrix , the equivalence between its and minimum collection size can be proved by induction using the following theorem:
Upon successful reception of a maximal coding set of by all its targeted receivers, the minimum collection size of the updated , denoted by , is at least .
This theorem holds if the following two theorems hold:
The minimum collection size of a conflict matrix with a graph model equals the chromatic number of the complementary graph of , .
Suppose is a maximal clique in and the chromatic number of is . By removing from we obtain an updated graph . The chromatic number of is at least . More precisely, if belongs to a minimum collection of , , while if does not belong to any minimum collection of , .
Since the in Theorem 3 is indeed the graph model of in Theorem 1, we conclude that Theorem 1 holds if Theorems 2 and 3 hold. The proofs of Theorems 2 and 3 are provided in Appendix A and Appendix B, respectively.
The above theorems apply to S-IDNC, but not to G-IDNC. The reason is that, unlike the S-IDNC graph, the G-IDNC graph is not static. That is, by removing a clique from the G-IDNC graph, new edges may be added between the remaining vertices, which breaks Theorem 3. In contrast, removing any clique from the S-IDNC graph will never change the connectivity of the remaining vertices. In other words, the G-IDNC problem does not have a dual static minimum clique cover problem.
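As a numerical sanity check of Theorems 2 and 3 (not a proof), the chromatic number of the complementary graph can be computed by brute force before and after removing each maximal clique. The instance below is hypothetical: five packets with conflict pairs (0,1), (0,3), (1,4) and (2,3), whose only minimum collection consists of the maximal coding sets {0, 2, 4} and {1, 3}.

```python
from itertools import product

# Conflict pairs are the edges of the complementary graph (hypothetical).
CONFLICTS = {(0, 1), (0, 3), (1, 4), (2, 3)}
MAXIMAL_SETS = [{0, 2, 4}, {1, 2}, {1, 3}, {3, 4}]
MIN_COLLECTION = [{0, 2, 4}, {1, 3}]


def chromatic_number(vertices, conflicts):
    """Smallest number of colors so that conflicting packets differ."""
    vertices = sorted(vertices)
    if not vertices:
        return 0
    edges = [(i, j) for i, j in conflicts if i in vertices and j in vertices]
    for colors in range(1, len(vertices) + 1):
        for assignment in product(range(colors), repeat=len(vertices)):
            color = dict(zip(vertices, assignment))
            if all(color[i] != color[j] for i, j in edges):
                return colors


all_packets = set(range(5))
chi = chromatic_number(all_packets, CONFLICTS)
print("chromatic number:", chi, "minimum collection size:", len(MIN_COLLECTION))
for clique in MAXIMAL_SETS:
    after = chromatic_number(all_packets - clique, CONFLICTS)
    print(sorted(clique), "in minimum collection:", clique in MIN_COLLECTION,
          "chromatic number after removal:", after)
```

In this instance, removing a maximal clique that belongs to the minimum collection lowers the chromatic number by one, while removing any other maximal clique leaves it unchanged, in line with Theorem 3.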
Heuristic algorithms are suboptimal because they cannot guarantee . They may choose a maximal coding set which is, though large, not included in any minimum collection of . Then Theorem 3 indicates that, even if is successfully received by all its targeted receivers, the chromatic number of the updated is still and thus more transmissions are needed. Below is an example.
Consider the maximal coding sets in Example 5. A suboptimal IDNC algorithm might choose , which does not belong to the only minimum collection . Even if this set is successfully received by all its targeted receivers, still three more transmissions are needed to be able to deliver and to the receivers. In total, there will be at least four transmissions, which is greater than .
This example motivates the concept of optimal IDNC schemes, which will be presented next.
III-C The Optimal Fully- and Semi-online IDNC Schemes
III-C1 The optimal fully-online IDNC scheme
In a fully-online IDNC scheme, the sender collects feedback from all receivers in every time slot to update the SFM and the corresponding conflict matrix. A coding set is then chosen and its coded packet is generated and broadcast. To minimize the number of coded transmissions and to reduce the decoding delay, this coding set must satisfy the following three conditions:
It should be a maximal coding set;
It should be chosen from a minimum collection of the updated conflict matrix; and
It should target the largest number of receivers among all the maximal coding sets in .
We define such a fully-online IDNC scheme as the optimal fully-online IDNC scheme in terms of throughput.
III-C2 The optimal semi-online IDNC scheme
According to Theorems 1-3, collecting fully-online feedback during transmissions cannot reduce the total number of coded transmissions to below , even in the best case scenario of erasure-free packet reception. Hence, as a variation of existing IDNC schemes in the literature, we propose to reduce feedback frequency to semi-online, where the SFM is updated in rounds. For example, feedback is collected after coded packets from a selected minimum collection have been transmitted and so on. We define this scheme as the optimal IDNC scheme in terms of throughput when feedback frequency is semi-online, or simply the optimal semi-online IDNC scheme. We refer to the minimum collection as the optimal semi-online IDNC solution. Maximal coding sets in are ordered so that those targeting more receivers are assigned smaller subscripts and sent first. Fig. 2 illustrates the process of the proposed optimal IDNC schemes.
We then define the minimum average packet decoding delay of the proposed IDNC schemes:
We denote by the minimum average packet decoding delay of the proposed fully- and semi-online IDNC scheme. It is achieved if the maximal coding sets in the optimal semi-online IDNC solution are broadcast in the absence of packet erasures, and is calculated as:
where is the index of the first maximal coding set in which contains .
Compared with (3), the decoding delays of an original data packet at its targeted receivers, i.e., , are unified to here because all these receivers can decode in the same time slot if there is no packet erasure. Note that by “minimum” we mean the smallest possible average packet decoding delay of the proposed (throughput) optimal IDNC schemes in the absence of packet erasures. is not necessarily the optimal average packet decoding delay that IDNC can offer, finding which is still an open problem. Indeed, as we shall see in Section VII, a suboptimal IDNC scheme in terms of throughput may achieve a better decoding delay.
Note also that, since the initial SFM is a all-one matrix, the systematic transmission phase is a special semi-online IDNC round, which requires the original data packets to be sent uncoded using transmissions.
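The ordering rule and the resulting delay can be sketched as follows. The instance is hypothetical, and the normalization is an assumption on our part: each packet's first-appearance index is weighted by the size of its Target set and the sum is divided by the total number of wanted packets over all receivers.

```python
# Hypothetical Wants sets and minimum collection (0-based packet indices).
WANTS = [{0, 1}, {2, 3}, {1, 4}, {0, 3}, {1}]
MIN_COLLECTION = [{0, 2, 4}, {1, 3}]


def targeted(coding_set, wants):
    """Number of receivers for which the coded packet is instantly decodable."""
    return sum(1 for w in wants if len(coding_set & w) == 1)


def order_collection(collection, wants):
    """Send coding sets that target more receivers first."""
    return sorted(collection, key=lambda s: targeted(s, wants), reverse=True)


def average_delay(ordered, wants):
    """Average, over all wanted packets, of the first slot carrying them.

    Assumption: erasure-free transmission and one coding set per slot.
    """
    total, count = 0, 0
    for w in wants:
        for k in w:
            total += next(i for i, s in enumerate(ordered, start=1) if k in s)
            count += 1
    return total / count


ordered = order_collection(MIN_COLLECTION, WANTS)
print([sorted(s) for s in ordered])
print(average_delay(ordered, WANTS), average_delay(MIN_COLLECTION, WANTS))
```

Here the coding set {1, 3} targets five receivers and is therefore sent first, which gives a smaller average delay than the reverse order while the completion time stays at two slots.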
In addition to making the throughput and delay analysis of IDNC tractable, a lower feedback frequency can be advantageous in practical implementations of IDNC where the use of reverse link is costly and involves transmission of some control overheads. Another practical attraction is that it also avoids solving the IDNC coding problem in every time slot. However, this comes at the potential cost of degradation in the overall system throughput in semi-online IDNC, as we explain next.
Imagine an IDNC scheme in the presence of erasures. In the fully-online feedback case, is updated before every coded transmission, so the coded packet is chosen from the minimum collection of the actual at the receivers. However, in the semi-online feedback case, the sender does not update until the round for is complete. Here belong to the minimum collection of the last revealed to the sender, but do not necessarily belong to that of the actual at the receivers. If this is the case, these coded packets can become throughput inefficient. Intuitively, we expect the gap between semi- and fully-online schemes to be small when the packet erasure probability is low (in the extreme case where the packet erasure probability is zero, the two schemes perform the same). In any case, the throughput and delay analysis of the semi-online IDNC scheme serves as a worst-case scenario for an optimal fully-online IDNC scheme with packet erasures.
IV Throughput Bounds
The findings in the last section are important because they enable theoretical analysis on the achievable throughput and decoding delay of IDNC. For throughput, is equal to the chromatic number of the complementary IDNC graph. For decoding delay, is the average decoding delay of the proposed optimal semi-online IDNC solution. We note, however, that there is no explicit formula to calculate the optimal . It can only be found via algorithmic implementations that can be computationally complex, as will be discussed in Section VI. Therefore, it is desirable to have some bounds on that can be more easily calculated or algorithmically found. This is the aim of this section. It is particularly useful and can find its application in, e.g., adaptive network coding systems which choose among IDNC and other network coding techniques to meet the throughput and decoding delay requirements. Since the calculation of depends on as indicated by (4), we will first derive bounds on in this section and then on in the next section. We start with the review of existing results in graph theory and then propose useful bounds in IDNC context.
IV-A Results in Graph Theory
Given a set of system parameters, the complementary IDNC graph after the systematic transmission phase can be modeled as the classic Erdős-Rényi random graph . In this model, there are vertices and any two of them are connected by an undirected edge with i.i.d. probability of . In the context of IDNC, is the probability that two original data packets conflict with each other and can be calculated as:
Almost every random graph with vertices and vertex connection probability of has chromatic number of:
where approaches zero with increasing .
Since , this lemma could be used to calculate the mean of under any set of . However, it is only asymptotically accurate for large . Since in IDNC systems may not be very large, (6) does not provide sufficient accuracy.
In graph theory, is bounded as:
where is called the clique number of and is the size of the maximum (the largest maximal) clique in , and is the largest vertex degree of , i.e., the largest number of edges incident to any vertex in . As we will show later, while is a tight lower bound, the upper bound is very loose and is not useful for the IDNC framework. In the next two subsections, we will derive useful loose/tight lower/upper bounds on , respectively. The loose bounds are easy to calculate and they reveal the limits of IDNC, while the tight bounds are more computationally involved, but nevertheless are shown numerically to be accurate estimates of .
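As a small illustration of the graph-theoretic bounds above, the following sketch checks the standard relation (clique number) ≤ (chromatic number) ≤ (largest vertex degree + 1) on a toy graph. It assumes that this standard relation is the one referred to in the text; the function names and the 5-vertex example graph are hypothetical, and the exhaustive searches are only practical for very small graphs.

```python
from itertools import combinations, product

def clique_number(n, edges):
    """Size of the largest clique, by exhaustive search (small graphs only)."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if all(frozenset(pair) in edge_set
                   for pair in combinations(subset, 2)):
                return size
    return 0

def chromatic_number(n, edges):
    """Smallest number of colours in a proper colouring, by exhaustive search."""
    for colours in range(1, n + 1):
        for assignment in product(range(colours), repeat=n):
            if all(assignment[u] != assignment[v] for u, v in edges):
                return colours
    return 0

def max_degree(n, edges):
    """Largest number of edges incident to any single vertex."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return max(degree, default=0)

# Hypothetical complementary graph: five packets whose conflicts form a cycle.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
omega = clique_number(5, edges)
chi = chromatic_number(5, edges)
delta = max_degree(5, edges)
print(omega, chi, delta + 1)
```

In this toy case the three quantities are 2, 3 and 3, so both inequalities hold.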
IV-B Loose Bounds
In this subsection, we find the smallest and largest possible of all the conflict matrices which have a size of and zero entries, with their set denoted by . The results are our loose lower and upper bounds and are denoted by and , respectively. They reveal the throughput limits of IDNC for any given and and are important references for practical/heuristic IDNC coding algorithm design: any algorithm offering above the upper bound or below the lower bound is throughput inefficient or non-instantly decodable, respectively.
The general intuition here is to waste, as far as possible, the coding opportunities brought by the zeros. We first note that for any given original data packet, there are entries in about the conflict of that packet with all other packets. When , there are no coding opportunities, so . When , we can assign all zeros to the entries about the same original data packet, say . But remains because have to be transmitted separately. After zeros have been exhausted, there are entries in about every packet other than . Thus when , we can assign these extra zeros to the entries about the same original data packet, say , and remains because have to be transmitted separately. This iterative process indicates that decreases in a staircase way with . The relationship can be written as:
One can easily verify that the proposed loose upper bound is much tighter than because the largest possible is always when .
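The staircase behaviour of the loose upper bound can be sketched in a few lines. The sketch below assumes that zeros are counted over the unordered packet pairs (so that there are K(K-1)/2 entries in total for K packets, consistent with the 10 entries of the five-packet example in this section); the function and parameter names are hypothetical.

```python
def loose_upper_bound(num_packets, num_zeros):
    """Largest possible number of transmissions for K packets and Z zeros.

    Assumption: zeros are counted over the K*(K-1)/2 unordered packet pairs.
    Zeros are "wasted" by concentrating them on one packet at a time.
    """
    total_pairs = num_packets * (num_packets - 1) // 2
    if not 0 <= num_zeros <= total_pairs:
        raise ValueError("num_zeros must lie in [0, K*(K-1)/2]")
    bound = num_packets
    remaining = num_zeros
    entries = num_packets - 1  # entries about the packet currently absorbing zeros
    while remaining > 0:
        bound -= 1             # the first zero given to a new packet saves one transmission
        remaining -= entries   # its other entries absorb further zeros at no extra saving
        entries -= 1           # the next packet has one fewer entry left to fill
    return bound

print([loose_upper_bound(5, z) for z in range(11)])
```

For five packets the bound stays at 4 for one to four zeros, drops to 3 for five to seven zeros, to 2 for eight or nine zeros, and reaches 1 only when all ten entries are zero.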
The intuition here is to make the best of the coding opportunities brought by the zeros. In other words, we should use as few zeros as possible to reduce by one. When , no original data packets can be coded together. Thus and . We then reduce iteratively. In each iteration, can be reduced by 1, i.e., the size of can be reduced by 1, if we can merge two maximal coding sets in together. Since any two packets from different maximal coding sets conflict, to merge two maximal coding sets of size, say and , together, we need zeros. In order to use as few zeros as possible, we always pick the two smallest sets, which minimize . Hence, in each iteration, it is impossible to reduce by 1 until new zeros are added to . This iterative process provides the lower-bound . Similar to the upper bound, the lower bound also decreases in a staircase way with . Below is an example with :
When we have an all-one conflict matrix (), no packets can be coded together, and thus , as in (9a). Then in the first iteration, the size of can be reduced by 1 by merging and together, which requires one zero, as in (9b). In the second iteration, the size of can be reduced by one by merging and together, which requires 1 zero, as in (9c). In the third iteration, zeros are needed to merge the two smallest maximal coding sets and together because we have to resolve the conflicts between and and between and . In the last iteration, zeros are needed to merge and together. After that, all the 10 entries in become zeros and becomes 1. in this example can thus be expressed as:
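The merging procedure behind the loose lower bound can be sketched with a min-heap of coding-set sizes. This is only an illustration of the iterative process described above (always merge the two smallest sets, paying the product of their sizes in zeros); the function names are hypothetical, and zeros are again assumed to be counted over unordered packet pairs.

```python
import heapq

def merge_thresholds(num_packets):
    """Cumulative zeros needed after each merge of the two smallest coding sets."""
    sizes = [1] * num_packets        # start with every packet in its own set
    heapq.heapify(sizes)
    thresholds = []
    used = 0
    while len(sizes) > 1:
        a = heapq.heappop(sizes)     # smallest set
        b = heapq.heappop(sizes)     # second smallest set
        used += a * b                # every cross pair must become a zero
        thresholds.append(used)
        heapq.heappush(sizes, a + b)
    return thresholds

def loose_lower_bound(num_packets, num_zeros):
    """Number of coding sets left once all affordable merges are done."""
    bound = num_packets
    for needed in merge_thresholds(num_packets):
        if num_zeros >= needed:
            bound -= 1
    return bound

print(merge_thresholds(5))
print([loose_lower_bound(5, z) for z in range(11)])
```

For five packets the merges cost 1, 1, 2 and 6 zeros, giving cumulative thresholds of 1, 2, 4 and 10, which matches the iterations of the example.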
IV-C Tight Bounds
Because is well lower bounded by and , our tight lower bound on is defined as . It can be identified by finding the maximum clique in . We then find a tight upper bound, denoted by .
Our tight upper bound on is derived using an iterative operation on a graph , denoted by . It iteratively outputs the maximum clique in and then deletes it from until becomes empty. Mathematically:
The resulting cliques actually form a partition of . Hence, if the initial is an instance of IDNC graph, the resulting cliques form a semi-online IDNC solution, denoted by with cardinality . The minimum block completion time () of this IDNC instance is thus upper bounded by .
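The iterative operation described above (output the maximum clique, delete it, repeat until the graph is empty) can be sketched as follows. The exhaustive maximum-clique search is used only for clarity and is feasible for small graphs; the function names and the five-packet IDNC graph are hypothetical.

```python
from itertools import combinations

def maximum_clique(vertices, adjacency):
    """Largest clique among `vertices`, by exhaustive search (small graphs only)."""
    ordered = sorted(vertices)
    for size in range(len(ordered), 0, -1):
        for subset in combinations(ordered, size):
            if all(v in adjacency[u] for u, v in combinations(subset, 2)):
                return set(subset)
    return set()

def clique_partition(adjacency):
    """Repeatedly remove a maximum clique until no vertex is left."""
    remaining = set(adjacency)
    cliques = []
    while remaining:
        clique = maximum_clique(remaining, adjacency)
        cliques.append(clique)
        remaining -= clique
    return cliques

# Hypothetical IDNC graph: an edge means the two packets can be coded together.
adjacency = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
partition = clique_partition(adjacency)
print(partition, len(partition))
```

The number of cliques returned, here 2, is the upper bound on the minimum block completion time for this instance.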
The derivations of both and rely on finding the maximum clique in an IDNC graph, which is NP-complete [23, 22, 24]. In Section VI, we will propose a heuristic clique-finding algorithm, which will in turn enable finding some heuristic bounds on . These heuristic bounds are still provable, but they are suboptimal because the heuristically found maximal cliques are not necessarily maximum.
IV-D The Average Bounds on the Minimum Block Completion Time and Their Tightness
Based on the bounds we have derived for any instance of conflict matrix, we can calculate the average bounds over all the conflict matrices in so that the average bounds can be explicitly expressed as a function of .
Since the loose bounds are already functions of , the focus is on the average tight bounds, i.e., and . They can be obtained by listing all the conflict matrices in , calculating their bounds, and then averaging. However, this is usually unrealistic, since there are possible conflict matrices, a number which is prohibitively large even when and are not so large. Hence, averaging over “all” conflict matrices is replaced by Monte Carlo averaging, where instances of the conflict matrix are generated by assigning random permutations of zeros and ones to the conflict matrix.
We present the average bounds under original data packets and in Fig. 3. The optimal , obtained using the method in Section VI-A, is also averaged and plotted as a reference. It is denoted by . It decreases gradually with increasing , and so do the tight bounds and . The gap between the tight bounds and the optimal one is marginal, with a value of less than 0.5 transmissions on average for all . They are much tighter than the loose ones, which decrease in a staircase way with increasing .
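The Monte Carlo averaging described above can be sketched as follows. The sketch assumes that the conflict matrix is symmetric, that zeros are counted over the unordered packet pairs, and that diagonal entries are simply left at zero; these conventions, as well as all function names, are assumptions made for illustration.

```python
import random

def random_conflict_matrix(num_packets, num_zeros, rng):
    """Symmetric 0/1 matrix with `num_zeros` zeros placed on random packet pairs."""
    pairs = [(i, j) for i in range(num_packets)
             for j in range(i + 1, num_packets)]
    if not 0 <= num_zeros <= len(pairs):
        raise ValueError("num_zeros must lie in [0, K*(K-1)/2]")
    values = [0] * num_zeros + [1] * (len(pairs) - num_zeros)
    rng.shuffle(values)              # random permutation of zeros and ones
    matrix = [[0] * num_packets for _ in range(num_packets)]
    for (i, j), value in zip(pairs, values):
        matrix[i][j] = matrix[j][i] = value
    return matrix

def monte_carlo_average(statistic, num_packets, num_zeros, trials, seed=0):
    """Average `statistic(matrix)` over randomly generated conflict matrices."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += statistic(random_conflict_matrix(num_packets, num_zeros, rng))
    return total / trials

def count_conflicts(matrix):
    """Example statistic: number of conflicting packet pairs."""
    n = len(matrix)
    return sum(matrix[i][j] for i in range(n) for j in range(i + 1, n))

print(monte_carlo_average(count_conflicts, 5, 4, trials=20))
```

Any function that maps a conflict matrix to a bound can be passed as `statistic`; `count_conflicts` is used here only because its average is known in advance.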
V Decoding Delay Bounds
According to its definition in (4), the minimum average packet decoding delay of an SFM is decided by the optimal semi-online IDNC solution of the corresponding conflict matrix and the number of targeted receivers of all the original data packets . Deriving lower/upper bounds on of an SFM is thus equivalent to finding two instances of which offer the best/worst possible decoding delays, respectively. We first discuss what instances will yield such decoding delays.
An instance of is denoted by where is its cardinality. Its average packet decoding delay is denoted by and can be calculated using (4), where is now the index of the first maximal coding set in that contains . Therefore, for the purpose of calculation, can be removed from all the subsequent coding sets. After applying such removal to all the original data packets, the intersection between any two coding sets in becomes empty. These coding sets are not necessarily maximal and we denote them by to distinguish them from maximal ones. Below is an example.
Consider an instance , . After packet removal, the instance becomes: .
Let us denote by the number of targeted receivers of coding set . Without loss of generality we assume that:
It holds that . Then, as a variation of (4), the average packet decoding delay under can also be calculated as:
The above two equations indicate the condition that the best/worst possible instances of should satisfy:
C1: Because for all , is minimized if has and for . Since it is rare to have coding sets wanted by only one receiver, we relax this condition as and refer to such as the best;
C2: is maximized with a value of if has and thus is the worst.
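The delay calculation discussed above can be illustrated with a short sketch. It assumes that the average packet decoding delay is the mean, weighted by the number of targeted receivers, of the index of the first coding set that contains each packet; this reading of (4) and (14) is an assumption, and the function names and toy data are hypothetical.

```python
def remove_repeats(solution):
    """Keep each packet only in the first coding set that contains it."""
    seen = set()
    disjoint = []
    for coding_set in solution:
        disjoint.append(set(coding_set) - seen)
        seen |= set(coding_set)
    return disjoint

def average_decoding_delay(solution, targeted_receivers):
    """Receiver-weighted mean of the first slot in which each packet is sent.

    solution: ordered list of coding sets (slot 1 first).
    targeted_receivers: dict mapping each packet to its number of targeted receivers.
    """
    weighted = 0
    total = 0
    for packet, count in targeted_receivers.items():
        slot = next(i for i, coding_set in enumerate(solution, start=1)
                    if packet in coding_set)
        weighted += slot * count
        total += count
    return weighted / total

# Hypothetical instance with overlapping maximal coding sets.
solution = [{"p1", "p2"}, {"p2", "p3"}, {"p3", "p4"}]
targets = {"p1": 2, "p2": 2, "p3": 2, "p4": 2}
print(remove_repeats(solution))
print(average_decoding_delay(solution, targets))
```

When every coding set targets the same number of receivers, the value reduces to the mean slot index; for three single-packet sets with equal weights it is 2.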
We now propose different instances of and obtain lower/upper bounds on with different tightness.
V-A Loose Bounds
For any given SFM, without loss of generality we assume that its conflict matrix belongs to . By employing the loose bounds on for , we can derive loose bounds on .
V-A1 Loose lower bound
The smallest possible cardinality of the instance is equal to the loose lower bound on , that is, . Thus, is minimized if has:
By substituting the above into (14), a loose lower bound on is obtained.
V-A2 Loose upper bound
The largest possible cardinality of the instance is equal to the loose upper bound on , that is, . Thus, is maximized with a value of if has uniform , as discussed in C2 after (14).
V-B Tight Bounds
V-B1 Tight lower bound
In the derivation of the tight lower bound on , we find a size- clique in the complementary IDNC graph . Denote the original data packets included in this clique by , and without loss of generality assume that . These original data packets must be sent separately because they are not connected in , i.e., they conflict. In this case, the smallest decoding delay takes place when all the remaining original data packets can be coded together with in the first coding set. The sequence is:
By substituting the above into (14), a tight lower bound on the minimum packet decoding delay is obtained and is denoted by .
V-B2 Tight upper bound
We use the IDNC solution found using the operation in (12) as our instance and thus its average decoding delay is our tight upper bound .
V-C The Average Bounds on the Minimum Average Packet Decoding Delay and Their Tightness
In this subsection, we obtain the average bounds on using a method similar to that used for the average bounds on . For a given set of , conflict matrices are randomly generated and the numbers of targeted receivers of the original data packets are also randomly generated. Their decoding delay bounds, as well as under the optimal semi-online solution, are calculated and averaged.
Simulation results for and are plotted in Fig. 4. The profiles of the average decoding delay bounds are similar to the average throughput bounds. The average loose bounds decrease with increasing in a staircase way, while the average tight bounds decrease gradually as .
The main difference is that is much closer to than in the throughput case, and for , the gap becomes negligible. The reason is that the IDNC solution can be viewed as a greedy IDNC solution in terms of decoding delay. It transmits the largest maximal coding set first, which is likely to target the most receivers. This result, together with the small gap between and , indicates that could be modified into a good heuristic IDNC coding algorithm, which will be discussed in the next section.
In this section, we present the algorithmic implementations of IDNC. We first propose the optimal semi-online and fully-online IDNC coding algorithms and then their heuristic alternatives. We also employ a heuristic clique-finding algorithm to obtain heuristic tight bounds on the throughput and decoding delay performance of IDNC.
VI-A Optimal IDNC Coding Algorithms
Our optimal semi-online IDNC coding algorithm finds the minimum collections of the conflict matrix in two steps:
Step-1: Find all the maximal coding sets (cliques): This problem is NP-complete but has an efficient recursive algorithm called the Bron-Kerbosch algorithm. The group of all the maximal coding sets is denoted by .
Step-2: Find minimum collections from : We propose an iterative algorithm in Algorithm 1 to achieve it. The intuition behind this algorithm is that, if an original data packet belongs to maximal coding sets in , one of these maximal coding sets has to be transmitted. In the extreme case of , this maximal coding set must be sent. Below is an example of Algorithm 1.
Consider the graph model in Fig. 5. In Step-1, we find all the maximal cliques: , , , , . Then in Step-2:
Since is empty, none of the original data packets are included. Among them, has a diversity of only one under . Thus in the first iteration, the updated solution is ;
The remaining original data packets are . Among them has a diversity of one under . Thus in the second iteration, the updated solution is ,;
The remaining original data packets are and . They both have a diversity of two under . We pick and then branch: ,, and ,,. Since satisfies the diversity constraint, the algorithm ceases and returns as the minimum collection.
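The two steps can be sketched on a toy graph as follows. Step-1 uses the basic Bron-Kerbosch recursion (without pivoting). Step-2 is replaced here by a plain exhaustive search over collections of increasing size, which is not Algorithm 1 but returns the same kind of output, namely all minimum collections that cover every packet. The function names and the example graph are hypothetical.

```python
from itertools import combinations

def bron_kerbosch(adjacency, r=frozenset(), p=None, x=frozenset()):
    """Yield every maximal clique of the graph (basic form, no pivoting)."""
    if p is None:
        p = frozenset(adjacency)
    if not p and not x:
        yield r                      # r cannot be extended any further
        return
    for v in list(p):
        neighbours = adjacency[v]
        yield from bron_kerbosch(adjacency, r | {v}, p & neighbours, x & neighbours)
        p = p - {v}                  # v has been fully explored
        x = x | {v}

def minimum_collections(cliques, packets):
    """All smallest groups of cliques whose union covers every packet."""
    packets = set(packets)
    for size in range(1, len(cliques) + 1):
        found = [combo for combo in combinations(cliques, size)
                 if set().union(*combo) == packets]
        if found:
            return found
    return []

# Hypothetical IDNC graph: an edge means the two packets can be coded together.
adjacency = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
cliques = list(bron_kerbosch(adjacency))
collections = minimum_collections(cliques, adjacency)
print(cliques)
print(collections)
```

In this toy graph packet 5 appears in only one maximal clique, so that clique is part of every minimum collection, which mirrors the diversity-one argument used in the example above.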
If the above two-step coding algorithm outputs several minimum collections, different criteria can be used for selection, such as the smallest average packet decoding delay or the highest average packet diversity. In our simulations, we select the one having larger diversities for data packets wanted by more receivers, i.e., the collection that maximizes will be chosen, where is the diversity of within and is the number of targeted receivers of .
If fully-online feedback is allowed and computational cost at the sender is not an issue, the optimal fully-online IDNC scheme can be applied, where in every time slot, the sender calculates the optimal semi-online solution as above, but sends only the first maximal coding set and then collects feedback.
VI-B Hybrid IDNC Coding Algorithms
Algorithm 1 is optimal because it finds all the possible minimum collections. However, it is also memory demanding because the number of candidate solutions usually grows exponentially after the branching in every iteration. Thus in this subsection, we propose a simple greedy alternative to it. We first choose the largest maximal coding set in . Then for the remaining original data packets that have not been covered, we look for a maximal coding set which comprises most of them. This iterative algorithm only produces one collection, which may be suboptimal because its cardinality may be greater than . The optimal clique finding in Step-1, together with this heuristic algorithm in Step-2, is referred to as the hybrid semi-online IDNC coding algorithm.
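The greedy selection described above can be sketched as a standard greedy cover over the maximal coding sets. The sketch assumes that the maximal coding sets have already been found in Step-1; the function name and the input data are hypothetical, and ties are broken by the order of the input list.

```python
def greedy_collection(cliques, packets):
    """Pick, one at a time, the coding set that covers the most uncovered packets."""
    uncovered = set(packets)
    chosen = []
    while uncovered:
        best = max(cliques, key=lambda clique: len(uncovered & clique))
        if not uncovered & best:
            raise ValueError("some packets are not covered by any coding set")
        chosen.append(best)
        uncovered -= best
    return chosen

# Hypothetical maximal coding sets over five packets.
cliques = [{1, 2, 3}, {3, 4}, {4, 5}]
collection = greedy_collection(cliques, {1, 2, 3, 4, 5})
print(collection)
```

In the first iteration nothing is covered yet, so the rule automatically selects the largest maximal coding set, as in the description above.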
If fully-online IDNC is applied, after finding in Step-1, we can greedily choose the maximal coding set in that targets the maximum number of receivers. This algorithm is referred to as the hybrid fully-online IDNC coding algorithm.
To reduce the computational load due to Step-1, we resort to a fully-heuristic clique-finding algorithm next.
VI-C Heuristic IDNC Coding Algorithms
A simple algorithm that heuristically finds the maximum (the largest maximal) clique in a graph is provided in Algorithm 2. The intuition behind this algorithm is that a vertex is very likely to be in the maximum clique if it has the largest number of edges incident to it. This algorithm has been employed in [4, 20, 9] for fully-online IDNC, so we also refer to it as the heuristic fully-online IDNC coding algorithm. However, it has not been applied to semi-online IDNC and its computational complexity has not been identified yet.
The computational complexity of this algorithm is polynomial in the number of original data packets . The highest computational complexity occurs when the input graph is complete, i.e., all the vertices are connected to each other. Under this scenario, only one vertex could be removed in each iteration (in Step 8) and thus, the size of the graph in the -th iteration, , will be . As a result of this, the highest computational complexity is . In practice, the graph size will shrink much faster after each iteration, and the number of iterations is usually smaller than . Hence, the computational complexity of this algorithm is loosely upper-bounded by . In other words, the computational complexity of this algorithm is .
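A degree-based greedy clique search in the spirit of the heuristic described above can be sketched as follows. It is not a transcription of Algorithm 2: it simply picks the vertex with the most neighbours among the current candidates, keeps only that vertex's neighbours as candidates, and repeats. The function name and the example graph are hypothetical, and ties are broken by vertex label.

```python
def heuristic_max_clique(adjacency):
    """Greedy clique: repeatedly add the candidate with the largest remaining degree."""
    candidates = set(adjacency)
    clique = set()
    while candidates:
        # Degree is counted inside the current candidate subgraph only.
        vertex = max(sorted(candidates),
                     key=lambda u: len(adjacency[u] & candidates))
        clique.add(vertex)
        # Only common neighbours of all chosen vertices can still join the clique.
        candidates = (candidates & adjacency[vertex]) - {vertex}
    return clique

# Hypothetical IDNC graph: an edge means the two packets can be coded together.
adjacency = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
print(heuristic_max_clique(adjacency))
```

On a complete graph the candidate set shrinks by exactly one vertex per iteration, which corresponds to the worst case discussed above; on sparser graphs it shrinks much faster.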
VI-C1 Heuristic bounds
Here, we apply Algorithm 2 to heuristically find the proposed tight bounds on the throughput, and then the corresponding tight bounds on the decoding delay. The results are shown in Figs. 3 and 4, respectively. It is observed that the performance degradation due to the heuristic algorithm is marginal for both throughput and decoding delay. Therefore, the heuristic tight bounds could serve as reliable and efficient estimates of the throughput and decoding delay.
VI-C2 Heuristic semi-online IDNC coding algorithm
The operation in (12) can be implemented by using Algorithm 2. The outcome is a heuristic semi-online IDNC solution , which offers good throughput and decoding delay performance in the erasure-free scenario. However, since its cliques are disjoint, all the original data packets have a diversity of only one and thus are vulnerable to packet erasures in real systems.
To overcome this drawback, we propose a heuristic semi-online IDNC coding algorithm in Algorithm 3, which is an extension of . The key idea here is that, in the -th iteration, after finding clique , we try to enlarge this clique by adding previously covered vertices to it whenever possible, i.e., the vertices in . By doing so, the diversity of the newly added vertices (packets) is increased by one. Below is an example:
Consider the graph in Fig. 1(c). In the first two iterations, the algorithm will choose and , respectively, without any adding. In the third iteration, we have