ARQ with Cumulative Feedback to Compensate for Burst Errors
Abstract
We propose a cumulative feedback-based ARQ (CF ARQ) protocol for a sliding window of size 2 over packet erasure channels with unreliable feedback. We exploit a matrix signal-flow graph approach to analyze the probability-generating functions of transmission and delay times. Contrasting its performance with that of the uncoded baseline scheme for ARQ, developed by Ausavapattanakun and Nosratinia, we demonstrate that CF ARQ can provide significantly less average delay under bursty feedback, and gains up to about 20% in terms of throughput. We also outline the benefits of CF ARQ under burst errors and asymmetric channel conditions. The protocol is more predictable across statistics, and hence more stable. This can help design robust systems when feedback is unreliable. This feature may be preferable for meeting the strict end-to-end latency and reliability requirements of future use cases of ultra-reliable low-latency communications in 5G, such as mission-critical communications and industrial control for critical control messaging.
I Introduction
Ultra reliability and low latency in 5G are key factors for many applications ranging from industrial automation, tactile Internet, remote healthcare, and public safety to mission-critical communications such as autonomous driving and wearable computing devices [1, 2, 3]. 5G will need to support a round-trip time (RTT) of about 1 millisecond, along with necessary overheads for resource allocation and access in 5G networks. Such severe latency constraints introduce a plethora of challenges in terms of the protocol stack design, control/user plane, and the core network [4].
Repetition of a packet over nondeterministic channel conditions, and the use of forward error correction (FEC) codes help repair the loss of the packets. Feedback packets are used to request FEC retransmission for increasing the reliability in packet delivery. The role of feedback is to increase data channel efficiency by limiting the repetitions. However, coding and feedback have been difficult to blend.
Reliable communication over a packet erasure channel can be achieved using Automatic Repeat reQuest (ARQ), when there is full feedback [5]. This simple scheme achieves 100% throughput, in-order delivery and the lowest possible packet delay, and it is composable across links. However, when the network is lossy, i.e., with no idealized feedback, link-by-link ARQ cannot achieve the capacity of a general network.
In the literature, the achievable rate has been optimized using acknowledgments and coding, under the condition that each received packet is either useless or can be immediately decoded by the destination [6]. Feedback and coding over a broadcast erasure channel have been combined in [7] to optimize decoding delay when perfect feedback is available from the receivers. An extension of ARQ for coded networks has been proposed in [5] to minimize the queue size at the transmitter. This approach combines the benefits of network coding and ARQ by acknowledging degrees of freedom (DoF) instead of original packets. It enables the feedback-based control of the tradeoff between throughput and decoding delay [8]. The proposed scheme in [5] is robust to delayed or imperfect feedback. None of these works jointly investigates the delay and throughput when the feedback is imperfect.
For schemes requiring feedback, it is generally assumed that feedback is lossless (perfect) and instantaneous (delay-free) [5], [8], [9]. Inevitable feedback channel impairments may cause unreliability in packet delivery. Burst errors might occur, which can impede stability. The situation becomes worse under RTT fluctuations along with delayed feedback. To the best of our knowledge, the effect of unreliable feedback has not been captured before.
In this paper, we investigate the effect of unreliable feedback in packet erasure channels. Erasure errors can occur in both the forward and reverse channels. However, an acknowledgment (ACK) cannot be decoded as a negative acknowledgment (NACK), and vice versa. Building on the uncoded baseline scheme proposed in [10], we propose a selective-repeat (SR) ARQ scheme with cumulative feedback, termed cumulative feedback-based ARQ (CF ARQ), in order to investigate the role of feedback. We investigate how much we can gain with cumulative feedback and how to compensate for the forward errors with cumulative feedback. Contrasting the throughput and delay performance of CF ARQ with the uncoded ARQ in [10], we demonstrate that with a sliding window of size 2, CF ARQ can provide gains up to 18% in terms of throughput. Cumulative feedback also has benefits under burst errors or high erasure rates.
II Channel Model
We have a point-to-point channel model consisting of a sender and a receiver. In the forward link, the sender attempts to transmit a packet to the receiver, and upon the successful reception of the packet, in the reverse link, the receiver acknowledges the sender by transmitting a feedback. We use a Gilbert-Elliott (GE) model [12], which is a special case of hidden Markov models (HMMs), both for the forward and reverse channels. (Footnote 1: A general finite-state Markov model can be used to represent a physical channel with fading. The received signal-to-noise ratio can be partitioned into a finite number of states, corresponding to different channel qualities [11].) The status of a transmission at time is a random variable taking values in , where denotes an error-free packet, and means the packet is erroneous. This binary-state Markov process , with probability transition matrix , has states G (good) and B (bad), i.e. , with where and are the probabilities of transmitting a packet in error in the respective states. The GE channel , driven by , is characterized by .
The channel state information is not available at the transmitter and the receiver. Hence, the transmitter does not know the status of a transmission (state of the forward link) at time , but it observes the status of the feedback at time , which is a Bernoulli random variable taking values in . Similarly, the receiver does not know the status of the reverse link, but it observes the status of a transmission at time , which is a Bernoulli random variable taking values in . The transmitter and receiver do not observe the process . However, for the GE channel, given the channel state at time , the joint probabilities of channel state and observation at time can be computed using the state-transition probabilities. For a GE channel, the state-transition matrix is
(1) 
where the first and second rows correspond to states G and B. The erasure rate is , where is the stationary vector of , which is found by solving and . Note that represents the average error burst. Hence, burst errors occur when is low.
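The stationary vector and erasure rate described above can be checked numerically. The following is a minimal sketch, assuming the usual GE parametrization in which `q` is the G-to-B transition probability, `r` is the B-to-G transition probability, and `eps = [eps_G, eps_B]` holds the per-state error probabilities; these names and the numerical values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ge_transition_matrix(q, r):
    """2x2 GE state-transition matrix, rows/columns ordered (G, B).
    q = Pr(G -> B), r = Pr(B -> G) (assumed parametrization)."""
    return np.array([[1.0 - q, q],
                     [r, 1.0 - r]])

def stationary_vector(P):
    """Solve pi P = pi together with pi 1 = 1 in the least-squares sense."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def erasure_rate(P, eps):
    """Average erasure rate: stationary state probabilities weighted by eps."""
    return float(stationary_vector(P) @ np.asarray(eps, dtype=float))

# Illustrative values only (hypothetical, not from the paper).
q, r = 0.1, 0.3
eps = [0.0, 1.0]              # eps_G, eps_B
P = ge_transition_matrix(q, r)
pi = stationary_vector(P)
rate = erasure_rate(P, eps)
mean_burst = 1.0 / r          # mean sojourn time in state B (geometric)
```

Under this parametrization the time spent in state B is geometric with mean `1/r`, so a small `r` corresponds to long error bursts.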
The joint probabilities of channel state and observation at time , given the channel state at time , are given as
which can be collected into a matrix of transition probabilities . Similarly, define . The entries in matrices and are state-transition probabilities when viewed jointly with the conditional channel observations [10]. Hence, the HMM is characterized by .
In practice, data packets and acknowledgments typically have different lengths and different coding levels. Therefore, the erasure rates and the parameters and of the forward and reverse channels are not necessarily the same, which is accounted for in our model. Denote by and the state-transition matrices for the forward and reverse channels, respectively. The forward link and the reverse link are mutually independent.
For the GE channel, the probability matrices for the forward and reverse channels and are given as
using the shorthand notation , , and . We can similarly compute and .
The composite channel is characterized by , where are the composite channel states, i.e. the Cartesian product of forward and reverse states, and is the combined observation set. For example, means the forward channel is erroneous and the reverse channel is good. For , the joint probability of the combined observation and the composite state at time , given the composite state at time , is . In compact notation, we have for , where is the Kronecker product of matrices and . For the GE channel, the combined observation probabilities are given by the following matrices: , , , and . The combined state-transition matrix for the GE channel, i.e., , is a 4×4 matrix that is given by the Kronecker product of and , i.e. .
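The Kronecker construction of the composite channel can be sketched as follows. This is an illustration under stated assumptions: the observation matrices are formed as the state-transition matrix times a diagonal matrix of per-state success or error probabilities (the usual way to write the joint state and observation probabilities), and all names (`observation_matrices`, `P_fwd`, `P_rev`) and numerical values are hypothetical.

```python
import numpy as np

def ge(q, r):
    """2x2 GE state-transition matrix, rows/columns ordered (G, B)."""
    return np.array([[1.0 - q, q],
                     [r, 1.0 - r]])

def observation_matrices(P, eps):
    """Return (P0, P1), where Px[i, j] = Pr(next state j, observation x | state i).
    Observation 0 means error-free and 1 means erroneous (assumed labeling)."""
    eps = np.asarray(eps, dtype=float)
    P0 = P @ np.diag(1.0 - eps)   # transition and success in the new state
    P1 = P @ np.diag(eps)         # transition and error in the new state
    return P0, P1

# Illustrative forward and reverse channels (hypothetical parameters).
P_fwd, P_rev = ge(0.1, 0.3), ge(0.2, 0.4)
F0, F1 = observation_matrices(P_fwd, [0.0, 1.0])
R0, R1 = observation_matrices(P_rev, [0.0, 1.0])

# Composite matrices indexed by (forward observation, reverse observation).
composite = {(x, y): np.kron(Fx, Ry)
             for x, Fx in enumerate((F0, F1))
             for y, Ry in enumerate((R0, R1))}

# Summing over all combined observations recovers the combined
# state-transition matrix, i.e. the Kronecker product of the two chains.
P_comb = sum(composite.values())
```

Because the forward and reverse links are independent, each composite matrix factorizes as a Kronecker product, and the four of them sum to a 4×4 stochastic matrix.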
In the rest of the paper, we will drop the superscript and denote the observation probability matrices by , , and . We also let and be the probability matrices of success and error in the forward channel, respectively, and let and be the matrices of success and error in the reverse channel, respectively. Furthermore, we let the matrices , , denote the composite channel matrices.
III ARQ with Cumulative Feedback
We propose a cumulative feedback-based ARQ (CF ARQ) scheme with coding for data transmission. It is an extension of the slotted SR ARQ, which allows the receiver to accept packets out of order; these packets can be stored in a buffer and sorted at the receiver to ensure in-order final delivery. Assume that all packets are available at the sender prior to transmission, the receiver does not have buffer overflows, and there is a synchronous transmission from the sender to the receiver.
We consider minimum coding, i.e., with a sliding window of size 2. The protocol can easily be generalized to packet streams with , which is out of the scope of the current paper. This scheme differs from the uncoded ARQ in [10] in the sense that the transmitted packet stream is MDS coded, and the feedback is cumulative for coded packets. However, the transmission scheme is repetition-based, i.e. the transmission rate is not adjusted based on the cumulative feedback. The receiver needs both coded packets to reconstruct the transmitted packet stream, i.e., the degrees of freedom (DoF) required at the receiver is . We do not assume in-order packet delivery. Hence, the transmitted stream will be successfully decoded when both of the coded packets are successfully received and acknowledged by the receiver.
The feedback, i.e. ACK and NACK messages sent by the receiver indicating if it has correctly received a data packet, acknowledges all correctly received packets, and is cumulative for coded packets. After the start of transmission (I), it takes time slots between the transmission of the second packet and receipt of its feedback. Therefore, the round-trip time (RTT) of CF ARQ is slots. If the feedback were not cumulative, i.e., the first feedback was received slots after the transmission of the first packet, then the RTT would have been slots. A timeout mechanism is used at the transmitter to achieve reliable data transmission. When a packet stream is (re)transmitted, the timeout is set to , which is greater than the RTT. If the sender does not receive an acknowledgment before the timeout, it retransmits the packets until it receives an acknowledgment. Hence, we do not have an upper bound on the number of retransmissions.
An ACK/NACK is sent in each slot. A packet whose ACK is lost will be acknowledged by subsequent ACKs/NACKs. If the succeeding ACKs/NACKs are successfully received before timer expiration, the packet will not be retransmitted. If the timeout expires and no ACK is received, the packet will be retransmitted. When a packet is lost and its NACK is received, the packet will be retransmitted immediately. If the NACK is also lost, the packet will be retransmitted after the timer expires. The transmission protocol is illustrated in Fig. 1.
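The retransmission rules in this paragraph can be summarized as a small decision function. This is only a per-packet sketch of the sender's reaction to one slot of feedback, assuming a three-valued feedback observation; the names `Feedback` and `sender_action` are hypothetical and the sketch does not model the cumulative bookkeeping across the two coded packets.

```python
from enum import Enum

class Feedback(Enum):
    ACK = "ack"     # an error-free ACK covering the packet arrived
    NACK = "nack"   # an error-free NACK for the packet arrived
    LOST = "lost"   # the feedback in this slot was erased on the reverse link

def sender_action(feedback, timer_expired):
    """Return the sender's action for one packet in the current slot."""
    if feedback is Feedback.ACK:
        # Also covers a lost ACK that is recovered by a later cumulative ACK.
        return "done"
    if feedback is Feedback.NACK:
        # A received NACK triggers an immediate retransmission.
        return "retransmit"
    # No usable feedback: keep waiting until the timer expires.
    return "retransmit" if timer_expired else "wait"
```

For example, a packet whose feedback is erased while the timer is still running yields `"wait"`, and the same observation after the timer expires yields `"retransmit"`.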
The combined observation set for CF ARQ with packets is all 3-tuples of , i.e., . For example, means that the forward channel is good for both packets and the reverse channel is erroneous, i.e., the ACK for both packets is lost at time . Since the feedback is cumulative for packets, it is possible that both packets are successfully acknowledged, or they both need to be transmitted or only one of the packets has to be retransmitted.
A hidden Markov model (HMM) is a statistical Markov process with unobserved states. Although the state is not directly observed, the output dependent on the state can be observed. Thus, under unreliable channel conditions, the analysis of ARQ protocol is possible using HMMs. The analysis of finite-state HMMs can be streamlined using flow graphs. Scalar-flow graphs have been used to find the probability-generating functions (PGFs) of transmission and delay times [13, 14, 15].
HMMs can be analyzed by labeling the branches of scalar-flow graphs with observation probability matrices. The nodes of the flow graphs correspond to the states of the transmitter. The input node () represents the start of transmission, and the output node () represents correct reception of acknowledgment. Other nodes represent intermediate states. Upon the start of transmission, the transmitter goes from one state to the other. A state transition is accompanied with a certain value for the random variable , and a probability , which together appear in the branch gain . Hence, the input-output gain of the graph is a polynomial in , whose coefficients are the probabilities of corresponding values of . This polynomial is equivalent to , the PGF for . Flow graphs with matrix branch transmissions and vector node values are called matrix signal-flow graphs (MSFGs) [10]. The matrix gain of the graph is calculated using the basic equivalences known as parallel, series, and self-loop. The matrix-generating function (MGF) gives the input-output relationship for the matrix-flow graph. Then, the PGF is calculated by pre- and post-multiplications of row and column vectors, respectively.
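The three graph equivalences and the vector pre- and post-multiplication can be illustrated on a toy graph. The sketch below is not the CF ARQ graph of Fig. 2: it assumes a single transmitter node with a self-loop of gain `z*P1` (failed attempt) and an exit branch of gain `z*P0` (successful attempt), uses the stationary vector as the initial state distribution, and obtains the mean by a numerical derivative of the PGF at `z = 1`. All function names and parameter values are hypothetical.

```python
import numpy as np

def series(A, B):
    """Two branches in series: gains multiply (row-vector convention)."""
    return A @ B

def parallel(A, B):
    """Two branches in parallel: gains add."""
    return A + B

def self_loop(A):
    """A self-loop with gain A contributes (I - A)^{-1}."""
    return np.linalg.inv(np.eye(A.shape[0]) - A)

def toy_mgf(z, P0, P1):
    """Matrix gain of the toy graph: loop on z*P1, then exit on z*P0."""
    return series(self_loop(z * P1), z * P0)

def pgf(z, pi, P0, P1):
    """Scalar PGF: row vector pi on the left, column of ones on the right."""
    return float(pi @ toy_mgf(z, P0, P1) @ np.ones(P0.shape[0]))

def mean_from_pgf(f, h=1e-5):
    """Central-difference estimate of the derivative of f at z = 1."""
    return (f(1.0 + h) - f(1.0 - h)) / (2.0 * h)

# Memoryless sanity check (illustrative values): same error rate in both states.
q, r, e = 0.1, 0.3, 0.2
P = np.array([[1.0 - q, q], [r, 1.0 - r]])
P0, P1 = (1.0 - e) * P, e * P
pi = np.array([r, q]) / (q + r)          # stationary vector of P

mean_tx = mean_from_pgf(lambda z: pgf(z, pi, P0, P1))
throughput = 1.0 / mean_tx               # reciprocal of the mean transmission count
```

In the memoryless case the toy PGF reduces to that of a geometric random variable, so the mean number of attempts is `1/(1 - e)`, which gives a quick check on the matrix algebra.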
The HMM for delay analysis of CF ARQ is shown in Fig. 2. The states and are the input and output nodes, and nodes , , , represent the hidden states. The possibilities upon the transmission of coded packets are:

Transition to state . Node denotes the reception of the first feedback. The coded packets are retransmitted until the forward link is successful and at least one packet is successfully transmitted. The retransmission is modeled by the self-loop at , where
where is the residual time for timer expiration upon transmission. Upon the reception of the first feedback, the transition probability matrix for the transmission of packets is given by
which models the error-free NACK. It combines the different cases such that the feedback is an error-free NACK, i.e., the forward link was bad for both packets and the reverse link was good (first term), or the forward link was bad for either one of the packets only and the reverse link was good (second and third terms). We assume the cumulative feedback is error-free as long as the reverse link is good before the forward transmission is over.
The transition probability matrix is given by
which models the erroneous NACK feedback. It combines the different cases such that the forward link was bad for both packets and the reverse link (CF) was also bad.
In CF ARQ, unless both packets are successfully acknowledged, we always need retransmissions. Hence, it is suboptimal. Furthermore, the erasure rate of CF ARQ is not the same as the erasure rate of uncoded ARQ. For example, for the case of symmetric memoryless channels, the relationship between the erasure rate for CF ARQ with packets, and the erasure rate of the uncoded ARQ in [10] is computed as . Hence, .

Transition to state . When the first feedback is received at node , if the number of DoFs acknowledged equals , then the system transits to state . The matrix
denotes the transition probability matrix from to . Hence, if the system goes into state , the additional number of DoFs required by the receiver is , i.e., only one packet needs to be retransmitted. The packet retransmission at is modeled by the self-loop, where
where the probability matrices and model the error-free and the erroneous NACK, respectively. At node , as only one packet is retransmitted, the matrices satisfy , where ’s, for are the transition probability matrices for the uncoded ARQ in [10].

Transition to state . If DoF’s are received, the stream can be successfully decoded. If DoF’s are acknowledged (with probability ), the system transits to state .

Transition to state . If DoF’s are received, but the feedback is an erroneous ACK (with probability ), then the system transits to , where the sender waits till it receives an error-free ACK/NACK, modeled by the self-loop at .

Transition to state . If DoF’s are received, but only one packet is successfully acknowledged and the feedback for the other packet is an erroneous ACK (with probability ), then the system transits to , where the sender waits till it receives an error-free ACK/NACK.
Given the transition probabilities, the success and error probability matrices in the reverse channel for CF ARQ are:
respectively, where is the number of DoFs acknowledged by the receiver, i.e., DoFs are needed at the receiver.
The transmission time () is the number of frames transmitted per successfully delivered frame, and the delay time () is the time from when a frame is first transmitted to when its ACK is received. Under the given model, both and are random variables with positive integer outcomes. The matrix gain of the graph in Fig. 2 can be calculated. Using the PGF, the average values for and can be calculated and the throughput is the reciprocal of . We next compute the MGF of the transmission time for packets.
Proposition 1.
For CF ARQ, the MGF of is given by
(2) 
where
and the matrix for that gives the gain of the transition from the state can be computed as
where .
The PGF of of CF ARQ for packets is computed as using the MGF in (2), where is a column vector of ones, and is the probability vector of state . The throughput is the reciprocal of the derivative of at , i.e., .
We next compute the MGF of the delay for packets.
Proposition 2.
For CF ARQ, the MGF of the delay is
(3) 
where for can be computed using relation
The PGF of delay of CF ARQ for coded packets can be computed as using the MGF in (3). Finally, the average delay will be the derivative of at , i.e., .
IV Numerical Results
We numerically investigate the throughput and average delay of the point-to-point GE channel. Our objective is to understand the impacts of feedback and cumulative feedback (CF) under less reliable and bursty channel conditions.
First assume that the forward and reverse channels have the same erasure rates, i.e., . In Fig. 3, we fix the burst rate of the forward channel, i.e., , and vary the burst rate of the reverse channel, i.e., , and vice versa. As timeout increases, both the throughput and the average delay increase. As increases, it is clear that gets lower and increases both for uncoded ARQ and CF ARQ. Sensitivity of to timeout also increases under burst errors (low ). If the feedback erasures are bursty, decreases with feedback delay. We also observe that of CF ARQ is higher than of uncoded ARQ. The difference becomes significant when the feedback is bursty and is high. When is small, since the RTTs of CF ARQ and uncoded ARQ are and , respectively, of CF ARQ is higher. However, of CF ARQ is smaller when the feedback is bursty and is high. Hence, CF ARQ is more robust to burst errors in the feedback. When there is perfect feedback with , uncoded ARQ has lower and higher than CF ARQ. Throughput and delay performance of both schemes degrade as the feedback loss increases. However, CF ARQ outperforms uncoded ARQ under feedback loss.
In Fig. 4, we investigate the role of asymmetry such that either the forward channel is more robust to erasures, i.e., and , or vice versa. We keep fixed and increase . When the forward channel is more robust to erasures, is higher both for uncoded ARQ and CF ARQ, and is significantly less compared to the case when the reverse channel is more robust. In this case, CF ARQ provides significantly better throughput than uncoded ARQ. Thus, even if the forward and reverse channels are asymmetric, CF ARQ performs better than uncoded ARQ under bursty feedback.
From Figs. 3 and 4, we see that erasures of forward channel scale the throughput, and feedback erasures change the shape of the throughput. Hence, forward erasures significantly degrade the throughput of both uncoded ARQ and CF ARQ, and dominate the performance of throughput. Still, CF ARQ throughput gap from uncoded ARQ is higher when is high. In terms of delay, CF ARQ is more stable than uncoded ARQ when increases, and less stable when increases. Still, CF ARQ is more stable when is high.
We next investigate the robustness of CF ARQ to forward erasures. Letting , we illustrate and of uncoded ARQ and CF ARQ in Fig. 5 for different sets of burst rates. From the plots, we see that CF ARQ provides a higher , and even more forward erasures can be compensated (up to ) with CF for packets if the feedback channel is more bursty without sacrificing . CF ARQ can provide a gain of 18% in terms of throughput.
Uncoded ARQ is very sensitive to error bursts. The higher the burst rate, the lower its throughput is and the higher its delay is. To compensate for the forward erasures, CF can be used, which can provide significantly less delay, and better throughput. Similarly, when the feedback erasures dominate, performance of CF is much better in terms of throughput, and CF can provide reductions in delay under bursty feedback.
V Conclusions
We proposed a cumulative feedback-based ARQ with a sliding window of size 2, and computed the MGFs of transmission and delay times. Contrasting its performance with uncoded ARQ, we demonstrated its robustness under burst errors. The following insights should enable more robust design for packet erasure channels with imperfect and bursty feedback:

At high erasure rates, CF ARQ provides considerably lower average delay and higher throughput than uncoded ARQ.

CF ARQ has benefits under burst errors or higher erasure rates in the reverse channel. It is more predictable across statistics, hence is more stable. This can help design robust systems when feedback is unreliable.
Incorporating FEC, the transmission rate can be adaptively adjusted with the cumulative feedback for multiple coded packets. Extensions hence include the study of different coded schemes, and the throughput achievable with coding. While the analysis is prohibitively complex for larger window sizes with an excessive number of hidden states, the technique can easily be evaluated for general window sizes using a network simulator. This can help understand the scalings between the window size and system parameters. This will pave the way for protocol design for 5G with desirable throughput-delay tradeoffs.
References
 [1] P. Popovski et al., “Wireless access for ultra-reliable low-latency communication: Principles and building blocks,” IEEE Network, vol. 32, no. 2, pp. 16–23, Mar. 2018.
 [2] G. P. Fettweis, “The tactile internet: Applications and challenges,” IEEE Veh. Technol. Mag., vol. 9, no. 1, pp. 64–70, Mar. 2014.
 [3] “3GPP TS 23.725 Study on enhancement of URLLC supporting in 5GC,” 3GPP, Tech. Rep., Mar. 2018.
 [4] J. G. Andrews et al., “What will 5G be?” IEEE Journ. on Sel. Areas in Comm., vol. 32, no. 6, pp. 1065–1082, Jun. 2014.
 [5] J. K. Sundararajan, D. Shah, and M. Médard, “ARQ for network coding,” in Proc., IEEE ISIT, 2008.
 [6] S. Katti et al., “XORs in the air: Practical wireless network coding,” in Proc., Sigcomm, 2006.
 [7] L. Keller, E. Drinea, and C. Fragouli, “Online broadcasting with network coding,” in Proc. of NetCod, 2008.
 [8] C. Fragouli, D. Lun, M. Médard, and P. Pakzad, “On feedback for network coding,” in Proc., IEEE Annual Conference on Information Sciences and Systems, Mar. 2007, pp. 248–252.
 [9] P. Pakzad, C. Fragouli, and A. Shokrollahi, “Coding schemes for line networks,” in Proc., IEEE ISIT, 2005.
 [10] K. Ausavapattanakun and A. Nosratinia, “Analysis of selective-repeat ARQ via matrix signal-flow graphs,” IEEE Trans. Commun., vol. 55, no. 1, pp. 198–204, Jan. 2007.
 [11] Q. Zhang and S. A. Kassam, “Finite-state Markov model for Rayleigh fading channels,” IEEE Trans. Commun., vol. 47, no. 11, Nov. 1999.
 [12] E. O. Elliott, “Estimates of error rates for codes on burst-noise channels,” Bell System Technical Journal, vol. 42, pp. 1977–1997, 1963.
 [13] D. L. Lu and J. F. Chang, “Performance of ARQ protocols in nonindependent channel errors,” IEEE Trans. Commun., vol. 41, no. 5, pp. 721–730, May 1993.
 [14] Y. J. Cho and C. K. Un, “Performance analysis of ARQ error controls under Markovian block error pattern,” IEEE Trans. Commun., vol. 42, no. 24, pp. 2051–2061, Feb.–Apr. 1994.
 [15] D.-L. Lu and J.-F. Chang, “Analysis of ARQ protocols via signal flow graphs,” IEEE Trans. Commun., vol. 37, no. 3, pp. 245–251, Mar. 1989.