Balance Queueing and Retransmission: Latency-Optimal Massive MIMO Design
Abstract
One fundamental challenge in 5G URLLC is how to optimize massive MIMO communication systems to achieve both low latency and high reliability. A seemingly reasonable design is to choose the smallest possible target error rate, which achieves the highest link reliability and the minimum retransmission latency. However, the overall system latency is the sum of the latency due to queueing and the latency due to retransmissions, so choosing the smallest target error rate does not always minimize the overall latency. In this paper, we minimize the overall latency by jointly designing the target error rate and the transmission rate, which leads to a sweet tradeoff point between queueing latency and retransmission latency. This design problem can be formulated as a Markov decision process, whose complexity is prohibitively high for real-system deployments. Nonetheless, we develop a low-complexity closed-form policy named Large-arraY Reliability and Rate Control (LYRRC), which is latency-optimal in the large-array regime. In LYRRC, the target error rate is a function of the antenna number, arrival rate, and channel estimation error, and the optimal transmission rate is twice the arrival rate. Using over-the-air channel measurements, our evaluations suggest that LYRRC can satisfy both the latency and reliability requirements of 5G URLLC.
I Introduction
Next-generation cellular systems, labeled as 5G, are targeting low latency and ultra-high reliability to support new forms of applications, e.g., mission-critical communications. One of the key technologies for 5G will be massive MIMO, where the basestations will be equipped with tens to hundreds of antennas [1, 2, 3]. In this paper, we explore how to leverage the large number of spatial degrees of freedom to minimize latency while ensuring high reliability.
Current cellular system design follows a layered approach. The queueing latency is managed at the MAC and higher layers, while the target error rate is managed separately by the physical layer to maximize the physical-layer throughput. For example, the transmission rate is often chosen to meet a fixed target error rate (around %). This decoupled design has been shown to be nearly throughput-optimal [4] for single-antenna systems. However, such a decoupled design may not achieve low latency.
As 5G pushes toward low latency (10 to 100 times lower latency than current systems) and ultra-high reliability, it is of paramount importance to control the latency and service unreliability caused by retransmissions. The 5G Ultra-Reliable Low-Latency Communication (URLLC) has a reliability requirement of % [5], i.e., the probability of successful packet delivery within round of transmissions ( ms5G frame length) should be higher than %. To satisfy such a reliability requirement, the target error rate cannot exceed %. It might seem natural to choose the smallest possible target error rate, as a smaller target error rate results in higher link reliability and less retransmission latency. However, since the overall system latency is the sum of the latency due to queueing and the latency due to retransmissions, choosing the smallest target error rate does not always minimize the overall latency. In this paper, we achieve reliability-guaranteed latency minimization by jointly optimizing the target error rate and the transmission rate policy.
While it is widely known that the error rate reduces with a higher power or a lower transmission rate, the relationship between the error rate and latency is more complex. There is a tradeoff between retransmission latency and queueing latency, and both are impacted by the target error rate. On the one hand, the retransmission latency reduces as the error rate reduces. On the other hand, if the system is fixed to an extremely low error rate, fewer packets can be transmitted in each frame, i.e., the transmission time to send the same number of packets increases, and packets have to wait longer in the queue. Therefore, under a given arrival rate, the queueing latency increases as the error rate and transmission rate reduce. The situation is further complicated by the fact that current mobile users can adapt their transmission powers, which makes the feasible (transmission rate, error rate) tuple time-varying. Fig. 1 depicts an example of the minimum latency achieved at different error rates with the optimal transmission rate policies (developed later in Section III). For the specific example in Fig. 1, an error rate (1%) smaller than the 5G URLLC reliability requirement (error rate of 3.16%) results in the minimum latency. It is clear that we need to find a sweet spot on the error rate that minimizes the overall latency by balancing the queueing latency with the retransmission latency.
Most of the existing work on massive MIMO focuses on physical-layer aspects, a layer at which latency optimization cannot be addressed. Massive MIMO was shown to provide higher spectral efficiency [1, 6, 7, 8], wider coverage [9, 3], and easier network interference management [10, 7, 9] than traditional MIMO. This work differs from previous massive MIMO works in that we provide reliability-guaranteed latency-optimal transmission control. There are also prior works that optimized the retransmission process, either for throughput maximization [4] or for energy-efficiency maximization [11]. In addition, cross-layer optimization [12, 13, 14, 15] is known to have the potential to achieve latency reduction. For a point-to-point system, past studies [16, 17, 18, 19] showed that using the queue-length information for transmission rate control can reduce queueing latency.
To the best of our knowledge, this paper is the first work to identify the tradeoff between retransmission latency and queueing latency. We further optimize the error rate and transmission rate policy to achieve the optimal tradeoff. The main contributions of our work are the following:

We formulate the reliability-constrained latency-minimization problem as the joint control design of the error rate and the transmission rate policy. We cast this optimization problem as a Markov decision process and solve it by value iteration.

Because the Markov decision process does not provide insight into the optimal control, we develop a deterministic control for large-array systems with a constant arrival rate, which is an important type of 5G URLLC traffic (like the time-sensitive VoIP service [20]). The joint error rate and transmission rate policy is labeled as Large-arraY Reliability and Rate Control (LYRRC). LYRRC is a low-complexity, closed-form solution to the latency minimization problem in the large-array regime: The optimal transmission rate is , where is the packet arrival rate. The optimal error rate of the physical layer is , where is the CDF of the effective channel gain (defined later), is the array size, is the number of users, and is the traffic arrival load over link capacity. The is the number of uplink pilots, which captures the impact of interference from imperfect channel knowledge. Furthermore, we discover that the total latency is determined by the array size , the number of pilots , the number of served users , and . In particular, for , we show that the average waiting time diminishes to zero as the array size increases to infinity.

To verify LYRRC’s performance and usefulness in the wild, we measure massive MIMO channels on the GHz with the Rice Argos platform [2], which consists of a antenna basestation and multiple mobile users. Based on the measurements, we find that LYRRC with the 5G self-contained frame [21, 22, 23] can simultaneously meet the ms latency and % reliability criteria. Under the same conditions, the best latency of fixed error rate control policies ( error rate) is more than ms. On the measured channels, we find that LYRRC provides latency reduction compared to current LTE transmission control ( error rate with peak power control). Compared to the best (queue-length-based) rate adaptation with fixed error rate (), LYRRC achieves a latency reduction.
The remainder of this paper is structured as follows. In Section II, we provide a physical-layer abstraction and a new network model for the latency minimization problem of a single-user wideband massive MIMO system with retransmissions. Section III provides an algorithm to solve the proposed latency minimization problem. A simple yet latency-optimal transmission control, LYRRC, is further investigated in the large-array regime in Section IV. Additionally, we capture in closed form how the minimum latency reduces with larger antenna array sizes. In Section V, we extend our single-user analytical results to multiuser massive MIMO systems. We provide numerical results in Section VI and conclude in Section VII.
II System Model and Problem Formulation
In this paper, we consider a multiuser massive MIMO uplink system with retransmission. The base station only has imperfect channel knowledge from pilot estimation. The transmission of each user is under a maximum error rate constraint. Each user can control the average packet latency by designing both the target error rate and the transmission rate policy. We now formulate the reliability constrained latency minimization problem.
II-A System Model
Fig. 2 depicts a discrete time-slotted model that consists of an antenna basestation and a single-antenna user. The extension to multiuser systems is presented in Section V. We take inspiration from the recent 5G proposals [21, 22, 23] and assume that the system operates in self-contained frames, as shown in Fig. 3. A self-contained frame consists of both the data transmission and the immediate ACK/NACK. Without loss of generality, the duration of each frame is of unit and Frame spans the time interval . Within each frame, the user first transmits the encoded uplink data to the basestation. The basestation then feeds back an ACK or NACK over an error-free downlink to signal whether a decoding error occurred.
II-A1 Physical Layer Model
During the (uplink) data transmission, the signal received by the basestation over the wideband channel is
(1) 
where is the subcarrier index and is the total number of subcarriers. Here is the transmitted signal and is a zero-mean circularly symmetric complex Gaussian noise vector. The frequency-independent large-scale fading channel constant is . The frequency-selective channel vector follows Rayleigh fading and is . We adopt the block fading model to capture the channel fading processes. According to the model, the small-scale fading vector is the same during each frame, and is i.i.d. across frames and subcarriers.
During the data transmission in each frame, the user transmits uplink pilots. The basestation estimates the channel with MMSE. The estimated (smallscale) channel vector is then
(2) 
where is the power of the pilots and is a zero-mean circularly symmetric complex Gaussian noise vector. Therefore, after applying the receive beamformer, the received signal is
Here, the first term denotes the signal and the other two terms capture the interference and noise. According to the above signal model, the received on Subcarrier is
(3) 
where is the power of uplink data transmission.
The user is considered aware of only the large-scale channel and the distribution of the small-scale channel via the estimation of a periodic indication signal that is broadcast by the basestation [24]. We consider the transmission power of the user to satisfy a long-term power constraint of . During each frame, the uplink packets to be transmitted are encoded within a single code block that spans all subcarriers. When packets are scheduled, the error rate is a function of the transmission power. We use the following model to capture the interplay between imperfect channel, error rate, transmission rate, and power. (Footnote 1: In this paper, we use for notational simplicity. One can also replace by the term in (4), and the expressions for the effective channel gain (11) and power mapping (12) would change accordingly. Our simulation in Fig. 4 suggests that (4) is sufficiently accurate for LDPC codes at moderate .)
(4) 
where is the number of information bits in each packet and is the number of transmitted packets per frame, which is referred to as the transmission rate. In [25, 26, 27], it was shown that with strong channel coding, (4) closely captures the error rate for sufficiently high . Fig. 4 provides an example of (4) for the case of an LDPC-based system.
During Frame , the transmission power is adapted, based on the transmission rate , and the channel estimation accuracy, to achieve the selected error rate . We let denote the used transmission power, whose expression is provided later in (12).
II-A2 Buffer Dynamics with Retransmission
We assume that there is no packet in the buffer at time . During each frame, new packets arrive in the queue, and each packet contains bits. (Footnote 2: Our model and analysis can be directly generalized to the case where the numbers of newly arriving packets across frames are independent and identically distributed.) After receiving uplink data, as shown in Fig. 3, the basestation notifies the user of whether an error has occurred via an immediate downlink ACK/NACK. In this work, we assume the feedback messages are correctly decoded due to the beamforming gain and the limited downlink ACK/NACK rate (ACK/NACK is bit for each transmission). Upon ACK, the transmitted packets are removed from the buffer. Upon NACK, the transmitted packets remain at the head of the buffer queue. (Footnote 3: It is possible to reduce the power of retransmissions via the joint decoding of failed packets and retransmissions. For mathematical tractability, we consider that the receiver will discard packets which cannot be decoded.) We use the indicator function to represent decoding success: means success and otherwise. The distribution of the is determined by the chosen error rate as and .
At time , let be the queue-length of the buffer, and be the number of packets to be transmitted at Frame as per the control decision. The buffer-length process follows
(5) 
where is the size of the buffer. Then the state space of the buffer, , is . If the number of generated packets is larger than the remaining capacity of the buffer, not all packets can enter the buffer and buffer overflow occurs. Let denote the number of dropped packets due to overflow, which is given by
(6) 
When packet overflow happens, the dropped packets induce significant latency in a realistic system. To capture the buffer overflow latency penalty, we assume that each overflowed packet introduces a large latency penalty of . The average number of dropped packets due to overflow, measured in packets per frame, is . We are interested in minimizing the average packet latency (from arrival to successful delivery). We assume that the stationary policies are complete, i.e., the minimum latency can be achieved by a stationary policy. Under a stationary policy, the queueing latency of successfully served packets is
(7) 
which is by Little’s Law [29]. To summarize, if a packet is dropped, its latency is and if a packet is successfully served (not dropped), its latency is (7). The average latency is then
(8) 
where is the proportion of the packets that are successfully served and is the average queue-length, i.e., .
II-B Single-user Latency Minimization Problem
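The buffer dynamics and the latency metric described above can be illustrated with a short simulation. This is a minimal sketch under an assumed reading of the prose: packets leave the queue only on ACK, a constant number of packets arrives every frame, arrivals that do not fit are dropped, served packets are charged their Little's-law latency, and dropped packets are charged a fixed penalty. The function and parameter names (`simulate_buffer`, `average_latency`, `drop_penalty`, and so on) are hypothetical and do not come from the paper.

```python
import random


def simulate_buffer(arrivals, error_rate, policy, buffer_size, frames, seed=0):
    """Simulate a retransmission buffer with immediate ACK/NACK feedback.

    Assumed update (a reading of the prose, not the paper's exact equations):
    packets leave the queue only on ACK, new packets arrive every frame, and
    arrivals that do not fit in the buffer are dropped.
    Returns (average queue length, average drops per frame).
    """
    rng = random.Random(seed)
    queue = 0                                     # empty buffer at time 0
    queue_sum = 0
    dropped_total = 0
    for _ in range(frames):
        queue_sum += queue
        sent = min(policy(queue), queue)          # cannot send more than queued
        success = rng.random() >= error_rate      # ACK with prob. 1 - error_rate
        remaining = queue - sent if success else queue
        overflow = max(remaining + arrivals - buffer_size, 0)
        dropped_total += overflow
        queue = min(remaining + arrivals, buffer_size)
    return queue_sum / frames, dropped_total / frames


def average_latency(avg_queue, avg_dropped, arrivals, drop_penalty):
    """Mix Little's-law latency of served packets with a penalty for drops."""
    served_rate = arrivals - avg_dropped          # packets per frame admitted
    served_fraction = served_rate / arrivals
    queueing = avg_queue / served_rate            # Little's law
    return served_fraction * queueing + (1 - served_fraction) * drop_penalty


if __name__ == "__main__":
    send_all = lambda q: q                        # hypothetical greedy policy
    q_bar, d_bar = simulate_buffer(3, 0.1, send_all, 100, 10_000)
    print(average_latency(q_bar, d_bar, 3, drop_penalty=50.0))
```

With an error-free link and a policy that sends everything, each packet waits exactly one frame, which is a quick sanity check on the bookkeeping.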
We first formulate and analyze the joint error rate and transmission rate control in a singleuser system that is shown in Fig. 2. The multiuser extension will be presented in Section V.
We define the system state as the queue-length whose state space is . The objective of the transmission controller is to minimize the average packet latency under a long-term average power constraint. Based on the power constraint, arrival rate, large-scale channel, and the array size , the transmission controller chooses an error rate from a finite set . This selected error rate remains constant in all frames. At the beginning of Frame (time ), the transmission rate policy determines the number of packets to send based on the system state of Frame , which is the queue-length . Denote the above stationary transmission rate policy as . Following (3) and (4), the transmission power is adapted based on the designed rate , error rate , and number of pilots , which is denoted as . Both the transmission rate policy and the resulting transmission power are independent of the exact small-scale fading, as it is unknown to the user.
For any error rate and transmission rate policy , we assume that the resulting Markov chain of the system states is ergodic. The associated unique steady state of the system is denoted as . The objective of the joint error rate and transmission rate control is to minimize the average packet latency. For the above-mentioned system, the aim is to find the optimal error rate and an optimal sequence of transmission rates (measured in packets per frame) by solving the following optimization problem:
(9a)  
subject to  (9b)  
(9c)  
(9d) 
where is the maximum error rate (minimum reliability) required by the user service. For 5G URLLC, .
By configuring the basestation with a larger array, the average latency can potentially be reduced due to the increased spatial degrees of freedom. In this work, we want to capture how the minimum latency changes as a function of the array size . For each given pair of large-scale fading and arrival rate, the minimum latency for different array sizes can be found by solving (9). We denote the relationship between the minimum latency and the array size as the function , which is referred to as the array-latency curve.
Notation: We use boldface to denote vectors/matrices. We use to denote the norm and to denote complex space. The space of real values is , whose positive half is denoted as . The following notations are used to compare two non-negative real-valued sequences , : if ; if .
III Latency-Optimal Single-User Transmission Control
In this section, we first formulate the latency minimization problem (9) as a constrained average cost Markov Decision Process (MDP) and solve it by a proposed algorithm.
III-A MDP Formulation
In each frame, the transmission power is adapted based on the channel gain, scheduled transmission rate, and the selected error rate. We now formally quantify the interplay between the power, channel and control actions. Substituting the derived (3) into the channel outage model (4), we have that
(10) 
where is the per-antenna channel gain . Thus, the error rate is modeled as the probability that a random variable is smaller than a constant. The right-hand-side constant (of the inequality) is independent of the small-scale fading channel, and the small-scale fading determines the distribution of the left-hand side. Let denote the random variable as
(11) 
which is referred to as the effective channel gain. Since signals received at different antennas are combined during the linear beamforming, is the arithmetic mean of the small-scale channel gain across the antennas. For a coded system, the total mutual information is the linear sum of the per-subcarrier . Therefore, the effective channel gain is the geometric mean across the subcarriers. We let denote the cumulative distribution function (CDF) of the effective channel , and the inverse CDF of is . With some algebraic manipulations, using (4), we have
(12) 
where is the inverse CDF of the effective channel gain in (11). When increases, the basestation has a more accurate channel estimation and the needed transmission power (at the same rate with the same reliability) reduces. One can observe that the required transmission power increases with the transmission rate and the packet size , and decreases with the array size and the number of subcarriers .
The system state space of the queue length is denoted as . For a selected error rate and a stationary transmission rate policy , based on the definition of average latency (8), we define the induced latency cost mapping on each state-action pair as
where is the number of dropped packets due to buffer overflow as shown in (6). By taking the expectation over the steady state of the system , the average latency is then given by
Similarly, utilizing the transmission power characterization in (12), the average power is
Given an average power constraint , the objective of the joint error rate selection and transmission rate control is restated as a constrained MDP as
(13) 
The constrained MDP (13) is converted to an unconstrained MDP via Lagrangian relaxation as
(14) 
The results in [30, 31] provide a sufficient condition under which the solution of the unconstrained MDP is also optimal for the original constrained problem (9). For all policies such that , the sufficient condition provided by [30, 31] is satisfied. Thus, when the constraint is binding, there is zero duality gap between the original problem (9) and the unconstrained MDP (14), i.e., their optimal solutions coincide.
III-B An Algorithm to Solve for , ,
For each error rate that is smaller than , and , problem (14) is an MDP with an average cost criterion and with infinite horizon. For each and , we thus find the optimal transmission rate policy by considering the corresponding discounted problem [32]. For each system state , define value cost function as
where is the discount factor. For each and , we want to find a stationary policy that is optimal for all discounted problems with , i.e., the Blackwell optimal policy. For the considered finite-state MDP, the Blackwell optimal policy [32] exists and is also optimal for the average cost problem (14). The Bellman equation of the above discounted problem is then
(15) 
whose state transition is described by (4), (5), and (6). Using dynamic programming with value iteration [32] over (15), we can solve the discounted problem. Since the discounted cost is bounded, we have that by updating the cost value using (15), the solved optimal transmission rate control converges to [32].
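The value-iteration step can be sketched on a small queue as follows. This is an illustrative sketch rather than the paper's Algorithm 1: the per-frame cost (queue length plus a drop penalty plus a multiplier times power), the placeholder `power` callable standing in for the power mapping (12), and all names and default values are assumptions introduced here.

```python
def value_iteration(arrivals, error_rate, buffer_size, beta, power,
                    drop_penalty=100.0, discount=0.95, sweeps=500):
    """Discounted value iteration over queue-length states 0..buffer_size.

    Assumed per-frame cost: queue length + drop_penalty * expected drops
    + beta * power(sent). Returns (values, policy), where policy[q] is the
    number of packets to send when q packets are queued.
    """
    values = [0.0] * (buffer_size + 1)
    policy = [0] * (buffer_size + 1)

    def step(remaining):
        # Next queue length and overflowed packets once arrivals are added.
        total = remaining + arrivals
        return min(total, buffer_size), max(total - buffer_size, 0)

    for _ in range(sweeps):
        new_values = []
        for q in range(buffer_size + 1):
            fail_next, fail_drop = step(q)          # NACK: packets stay queued
            best_cost, best_action = None, 0
            for sent in range(q + 1):
                ok_next, ok_drop = step(q - sent)   # ACK: sent packets leave
                cost = (q + beta * power(sent)
                        + (1 - error_rate) * (drop_penalty * ok_drop
                                              + discount * values[ok_next])
                        + error_rate * (drop_penalty * fail_drop
                                        + discount * values[fail_next]))
                if best_cost is None or cost < best_cost:
                    best_cost, best_action = cost, sent
            new_values.append(best_cost)
            policy[q] = best_action
        values = new_values
    return values, policy


if __name__ == "__main__":
    toy_power = lambda sent: 2.0 ** sent - 1.0      # placeholder, not Eq. (12)
    _, pol = value_iteration(2, 0.1, 10, beta=0.5, power=toy_power)
    print(pol)
```

When the multiplier is zero, power is free and the resulting policy empties the queue in every state; as the multiplier grows, the policy holds packets back, which is the lever the multiplier search relies on.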
For each error rate , to find whose solution of (14) satisfies the longterm power constraint , we can use the binary search method due to the following observation. The binary search method guarantees the convergence to the optimal solution for (13) because for each , the average power is monotonically nondecreasing on . Finally, by solving the latency minimization problem for each , we can find the optimal . We summarize the above steps in Algorithm 1.
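The multiplier search can be sketched as a plain bisection. The sketch assumes the multiplier acts as a price on power, so that the average power of the resulting policy is non-increasing in the multiplier; if the multiplier is defined the other way round, the comparison flips. The callable `average_power` is a hypothetical stand-in for "solve the unconstrained MDP at this multiplier and evaluate the average power of its policy".

```python
def search_multiplier(average_power, power_budget, beta_hi=1.0, tol=1e-6):
    """Bisection for the smallest multiplier whose policy meets the budget.

    average_power(beta) is assumed non-increasing in beta. The search does
    not terminate if no multiplier can meet the budget.
    """
    if average_power(0.0) <= power_budget:
        return 0.0                      # constraint is slack; no price needed
    while average_power(beta_hi) > power_budget:
        beta_hi *= 2.0                  # grow the bracket until it is feasible
    beta_lo = 0.0
    while beta_hi - beta_lo > tol:
        mid = 0.5 * (beta_lo + beta_hi)
        if average_power(mid) > power_budget:
            beta_lo = mid               # over budget: raise the price
        else:
            beta_hi = mid               # feasible: try a lower price
    return beta_hi


if __name__ == "__main__":
    toy_curve = lambda beta: 10.0 / (1.0 + beta)    # hypothetical power curve
    print(search_multiplier(toy_curve, power_budget=2.0))
```

Each evaluation of `average_power` hides a full policy solve, so the logarithmic number of bisection steps is what keeps the outer loop over candidate error rates affordable.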
IV Large-Array Latency-Optimal Control
In this section, we evaluate how the optimal solution to the latency minimization problem (9) behaves as the array size . Specifically, we seek to find the minimum achievable latency for systems with large arrays.
We consider the distribution of the per-antenna channel gain (12) to satisfy the following assumptions.

Its mean grows linearly as increases, i.e.,
(16) 
Its variance is inversely proportional to , i.e.,
(17)
It is worthwhile to comment that the above assumptions on the per-antenna gain are reasonable and hold true in many practical systems. For example, for a single-user system with imperfect channel knowledge in a Rayleigh fading environment, it is straightforward to check that the mean condition (16) and the variance condition (17) are both satisfied. In Section V, we will show that conditions (16) and (17) also hold true in uplink multiuser systems with imperfect channel knowledge.
Based on the mean condition (16), the achievable rate of the link (on each subcarrier) converges to as the array size . Here, is a fixed constant that does not increase as increases. Hence, can be viewed as the link “capacity”. In a practical coded system targeting low latency and high reliability, only part of the capacity can be achieved. In the asymptotic analysis, we define the system utilization factor to be a constant as
(18) 
where is the packet arrival rate, is the number of bits in a packet, and is the number of subcarriers. Hence, under (18), the packet arrival rate increases with the array size and equals . Conceptually, the term can be viewed as the total “capacity” of the wideband link and can be viewed as the data load. Thus, the utilization factor can be interpreted as the ratio between the offered data load and the total link “capacity”.
We also make the following assumptions for mathematical tractability. We consider an infinite buffer (i.e., ), thus no buffer overflow or overflow latency occurs. The error rate can be chosen from a continuous set . We also consider that there does not exist a finite positive value such that . (Footnote 4: For the degenerate case where there exists a finite such that , there is a finite array size with which all incoming traffic can be instantly served without any waiting time and without any channel-induced retransmission.)
IV-A Array-Latency Scaling Lower Bound
Notice that a trivial lower bound of is frame, which is the first transmission attempt of a packet. This frame latency lower bound can only be achieved if the transmission power can be arbitrarily high. We now provide a tighter lower bound of the array-latency curve . We will later present a simple yet optimal transmission control policy that achieves this lower bound in the large-array regime.
Theorem 1 (Latency Scaling Lower Bound).
Proof.
The main idea is to first lower bound the average latency by considering only the packet retransmission latency. We then complete the proof by converting the latency lower bound via Jensen’s inequality. Appendix A provides the proof details. ∎
Theorem 1 presents a latency lower bound. Theorem 1 states that selecting any error rate smaller than leads to a very large queueing latency. Both and the latency lower bound increase as the channel knowledge reduces, i.e., as the number of pilots reduces. For example, if the channel estimation error is large (), and the retransmission latency becomes large. Therefore, without good enough channel estimation, neither the reliability target nor the latency target can be met. Given the highly complex nature of the considered lossy channel with retransmissions, it is not immediately clear whether the latency lower bound in Theorem 1 is achievable. We show that there exists a simple transmission control policy that is latency-optimal as in Section IV-B.
IV-B Large-Array Optimal Error Rate and Transmission Rate Control
In this subsection, we first present a simple transmission control and then prove that it is latency-optimal as .
Definition.
We define the Large-arraY Reliability and Rate Control (LYRRC) to be
(20) 
The LYRRC policy contains two parts: an error rate control policy and a transmission rate control policy . Here, the superscript stands for large-array. In the error rate control policy , a smaller error rate is selected for systems with larger array size or if the utilization factor reduces. In addition, the selected error rate increases with the channel estimation error. The transmission rate policy describes a simple thresholding rule: If there are more than packets in the buffer queue, i.e., , packets will be transmitted. If fewer than packets are currently in the buffer, all packets in the queue will be scheduled for transmission in the frame. In each frame, based on the transmission rate of , the user utilizes the power adaptation (12) to achieve the error rate target . The arrival rate scales linearly as . Hence, the error rate and the transmission rate of LYRRC are both determined by the array size .
To provide insights on the reasoning behind , we consider the associated Markov chain of the buffer-length. The buffer-length state transition under any error rate , which is not necessarily equal to , and the transmission rate policy is depicted in Fig. 5. By Little’s Law, the average latency equals the ratio between the average queue-length and the arrival rate . Notice that is the difference between the adjacent states in Fig. 5. Hence, the average queue-length is proportional to (see Appendix B for a rigorous proof). As a result, the average latency depends only on the error rate , but not on .
The transmission rate control policy applies a negative drift with probability towards the minimum queue-length . To minimize the latency as , the queue-length needs to be regulated towards the minimum queue-length . This regulation is achieved by selecting a smaller error rate. As mentioned above, the error rate of LYRRC (20) reduces as the array size increases. We conclude that the achieved latency under LYRRC is a function of the error rate, which reduces as . Next, we will characterize the latency under LYRRC and prove that it is asymptotically latency-optimal.
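The thresholding rule of the transmission rate policy can be written in a few lines. The sketch assumes that the threshold equals twice the arrival rate, following the abstract's statement that the optimal transmission rate is twice the arrival rate; the function name `lyrrc_rate` and its parameters are hypothetical.

```python
def lyrrc_rate(queue_length, arrival_rate):
    """Packets to send this frame under the thresholding rule.

    Assumption: the threshold is twice the per-frame arrival rate, as the
    abstract states for the optimal transmission rate.
    """
    threshold = 2 * arrival_rate
    # Above the threshold, send exactly `threshold` packets;
    # otherwise send everything that is queued.
    return min(queue_length, threshold)


if __name__ == "__main__":
    for queued in (3, 8, 20):
        print(queued, lyrrc_rate(queued, arrival_rate=4))
```

Because the rule needs only the local queue length and the arrival rate, it can be evaluated per frame without any lookup table produced by the MDP solver.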
Lemma 1 (Latency Under Transmission Rate with Thresholding).
Under any error rate and transmission rate policy , the average latency is .
Proof.
The main idea is to compute the steady-state distribution of the queue-length, which is a Markov chain with countably infinite states. Appendix B provides the complete proof. ∎
Lemma 1 provides a closed-form characterization of the transmission rate policy when the maximum buffer-length is infinite. By using Lemma 1, we have that the achieved latency of LYRRC is . We next prove the optimality of LYRRC (20) by comparing the achieved latency to the minimum latency lower bound in Theorem 1.
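The claim that the latency under the thresholding rule depends only on the error rate can be checked numerically. The sketch below assumes constant arrivals of a packets per frame, a threshold of 2a (from the abstract), and a queue sampled at the start of each frame after arrivals. Under these assumptions the queue only visits multiples of a, and the level k performs a birth-death walk whose stationary distribution is geometric, giving a mean latency of (1 - eps) / (1 - 2 eps) frames for error rate eps below 1/2. That expression is derived here from the stated assumptions; it is not quoted from Lemma 1, whose formula is not reproduced in this text.

```python
def thresholding_latency(error_rate, levels=100, iterations=2000):
    """Average latency (in frames) of the thresholding rule, found numerically.

    State k means the queue holds k * a packets at the start of a frame,
    where a is the per-frame arrival count. Assumed transitions:
      k = 1:  ACK -> 1,      NACK -> 2
      k >= 2: ACK -> k - 1,  NACK -> k + 1
    The chain is truncated at `levels` states.
    """
    dist = [0.0] * (levels + 1)         # dist[k] for k = 1..levels
    dist[1] = 1.0
    for _ in range(iterations):
        nxt = [0.0] * (levels + 1)
        for k in range(1, levels + 1):
            mass = dist[k]
            if mass == 0.0:
                continue
            down = 1 if k == 1 else k - 1
            up = min(k + 1, levels)
            nxt[down] += (1 - error_rate) * mass
            nxt[up] += error_rate * mass
        dist = nxt
    mean_level = sum(k * p for k, p in enumerate(dist))
    # Little's law: latency = (a * mean_level) / a, so a cancels out.
    return mean_level


def closed_form_latency(error_rate):
    """Mean of the geometric stationary law implied by the transitions."""
    return (1 - error_rate) / (1 - 2 * error_rate)


if __name__ == "__main__":
    for eps in (0.01, 0.1, 0.3):
        print(eps, thresholding_latency(eps), closed_form_latency(eps))
```

The arrival count a never enters the computation, which mirrors the observation that the average queue-length scales with the arrival rate while the latency does not.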
Theorem 2 (Optimal LargeArray Control).
For any and positive , as , LYRRC (20) guarantees that the average latency is within a vanishing gap from optimal as
(21) 
where is the average latency by LYRRC. And denotes that .
Proof.
Theorem 2 establishes the optimality of LYRRC. In addition, the latency gap between the lower bound and LYRRC increases as the channel estimation error increases ( reduces). Furthermore, Lemma 1 and Theorem 2 suggest that LYRRC reduces latency by selecting a lower target error rate for systems with larger array sizes. Hence, the reliability and low-latency design objectives of 5G URLLC naturally match each other for large . Finally, we note that LYRRC can achieve optimal latency for any , which seems to contradict the transmission rate of . This can be explained by the fact that we are considering a wireless link with power adaptation and the probability of transmitting at reduces as . Therefore, using larger transmission power (over a few frames) can increase the peak transmission rate beyond the long-term average rate. We next combine Theorem 2 and Theorem 1 to characterize the scaling of the array-latency curve in closed form.
Theorem 3 (LargeArray Latency Scaling).
As , for any positive and , the optimum latency converges to frame as
(22) 
where is the CDF function of the effective channel gain . And denotes that .
Proof.
Theorem 3 provides a closed-form characterization of the large-array latency. It describes the minimum latency as a function of the utilization factor , the channel estimation error, and the array size . As , . Thus, both the retransmission and queueing latency converge to frame. Additionally, we observe that serving less load (smaller ) leads to a faster latency convergence rate. Finally, we comment on the impact of imperfect channel state information. For any , the latency converges to the frame as . However, the convergence speed of is determined by the channel estimation error, which reduces as the number of pilots increases. For a practical system with finite , more accurate channel knowledge leads to smaller latency.
During the large-array analysis, we find that the CDF of the effective channel critically determines both the minimum latency and the latency-optimal target error rate. In the real world, spatial channel correlation [33, 34] can exist because of the limited number of scatterers. Due to such spatial correlation, the distribution of can differ from the popular Rayleigh fading model. To evaluate the real-world performance of LYRRC, we conduct numerical experiments with measured over-the-air channels in Section VI.
V Multiuser Extension
In this section, we consider the multiuser latency minimization problem over the lossy channel. As before, the basestation only has imperfect channel knowledge from uplink pilot estimation. Fig. 6 pictures the setup. In this section, the suffix denotes the user index. Each user’s transmission power is subject to an individual long-term power constraint . The multiuser controller decides the error rate and the transmission rate of User . The buffer dynamics of each user are identical to those of the single-user counterpart described in Section II-A2.
To minimize the system latency of the users at the same time, we associate positive weights to users. The multiuser latency minimization problem is then
(23) 
where is the maximum error rate (minimum reliability) of User , and is the receiver of the th subcarrier in Frame for User . Here, the buffer length and buffer overflow of User are given by (5) and (6), respectively.
To jointly detect signals from the users, the basestation applies (receiver) beamforming on the received signal. Let matrix denote the uplink small-scale channel fading between the antenna basestation and the users. The channel of User on Subcarrier is and is the th column of . Throughout this section, we assume that users are in a rich scattering environment and that user channels follow i.i.d. Rayleigh fading.
In practice, the basestation learns the uplink channel matrix by estimating uplink pilots. Denote the estimated channel as . The channel estimation accuracy depends on the number of pilots and the pilot power. With the commonly used MMSE estimator, the estimation error between each basestation antenna and User is a complex Gaussian random variable with zero mean and variance of . Here, and are the number of uplink pilots and the pilot power, respectively. Using the estimated channel, the basestation generates zero-forcing receive beamformers to detect the uplink signal of each user. The beamforming matrix is . On Subcarrier , the corresponding uplink SINR of User is
(24) 
where is the th column of . The second term in the denominator represents the residual inter-beam interference after the receiver beamforming. Due to channel estimation error, the residual inter-beam interference is always positive. Previous work has shown that, with an MMSE channel estimator using pilots, the uplink effective SINR is [7, Eq.]
(25) 
where denotes the th diagonal element of a matrix and captures the inter-beam interference penalty.
Due to the interference, for a practical uplink system where each user is unaware of other users' channel or queue information, the joint error rate and transmission rate policy design appears intractable. To see the difficulty of the joint policy design, let and be the (scheduled) error rate and transmission rate, respectively. Recall from (25) that the inter-beam interference of each user also depends on the power and large-scale fading of the other users. This problem is further complicated by the fact that each user's transmission power changes in each frame based on its current queue length. Thus, it is extremely difficult for each user with only local knowledge (queue length and large-scale fading) to infer the exact value of and hence the proper transmission power. As a result, the error rate and transmission rate policy cannot be designed by each user in a distributed manner, which is undesirable for a practical uplink system.
Here, we proceed with the observation that, in real-world systems, the pilot power is usually required to be higher than the data signal power [24]. Hence, the term is upper bounded by , which can be viewed as a worst-case interference penalty. Each user then adjusts its power based on this upper bound on the loss. Substituting the expression (25) of the multiuser system into (4), the error rate becomes
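The effect of estimation error on zero-forcing detection can be illustrated with a small numerical sketch. The code below is not the paper's implementation: it assumes two users, i.i.d. Rayleigh fading, and an additive complex Gaussian estimation error of a chosen variance as a stand-in for the MMSE error model, and it uses the textbook pseudo-inverse form of the zero-forcing beamformer. All function names and parameters are hypothetical.

```python
import random


def rayleigh_channel(num_antennas, num_users, rng):
    """M x K channel with i.i.d. CN(0, 1) entries, stored as a list of rows."""
    s = 0.5 ** 0.5
    return [[complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(num_users)]
            for _ in range(num_antennas)]


def noisy_estimate(channel, error_var, rng):
    """Assumed error model: add CN(0, error_var) noise to every channel entry."""
    s = (error_var / 2.0) ** 0.5
    return [[h + complex(rng.gauss(0, s), rng.gauss(0, s)) for h in row]
            for row in channel]


def zero_forcing_two_users(est):
    """Return the 2 x M matrix W = (H^H H)^{-1} H^H for a two-column estimate H."""
    a = sum(abs(r[0]) ** 2 for r in est)            # h1^H h1
    d = sum(abs(r[1]) ** 2 for r in est)            # h2^H h2
    b = sum(r[0].conjugate() * r[1] for r in est)   # h1^H h2
    det = a * d - abs(b) ** 2
    # Inverse of the Gram matrix [[a, b], [conj(b), d]] in closed form.
    w1 = [(d * r[0].conjugate() - b * r[1].conjugate()) / det for r in est]
    w2 = [(-b.conjugate() * r[0].conjugate() + a * r[1].conjugate()) / det for r in est]
    return [w1, w2]


def effective_gains(beamformer, channel):
    """K x K matrix W H; off-diagonal entries are residual inter-beam leakage."""
    k = len(beamformer)
    m = len(channel)
    return [[sum(beamformer[i][n] * channel[n][j] for n in range(m)) for j in range(k)]
            for i in range(k)]


if __name__ == "__main__":
    rng = random.Random(0)
    true_h = rayleigh_channel(16, 2, rng)
    est_h = noisy_estimate(true_h, 0.1, rng)
    perfect = effective_gains(zero_forcing_two_users(true_h), true_h)
    imperfect = effective_gains(zero_forcing_two_users(est_h), true_h)
    print("leakage power, perfect CSI:  ", abs(perfect[0][1]) ** 2)
    print("leakage power, estimated CSI:", abs(imperfect[0][1]) ** 2)
```

With a perfect estimate the off-diagonal gain vanishes up to rounding, whereas a beamformer built from the noisy estimate leaves a nonzero leakage term when applied to the true channel, which is the residual interference that the penalty term accounts for.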
(26) 
where the per-antenna gain is
(27) 
Similarly to the single-user case, we also compute the per-frame transmission power as
(28) 
where is the scheduled reliability (error rate) target and is the transmission rate (in units of packets). Here, in (26) arises because each user considers the upper bound of the inter-beam interference. The per-antenna gain (27) is independent of the large-scale channel, transmission power, and hence queue length of the other users. For each user, the distribution of the effective channel in (11) then becomes independent of the channel, queue length, and power of the other users. Therefore, we can decouple the multiuser problem. By adopting a new distribution of the effective channel gain (generated by (27)) and the new power mapping (28), the multiuser problem is decoupled into independent single-user problems (9). Each of the single-user problems can be solved by Algorithm 1. We now further demonstrate that the large-array analytical results in Section IV also apply to the considered multiuser system with imperfect channel knowledge.
Theorem 4.
For multiuser uplink systems, LYRRC becomes
(29) 
As , for positive and , each user operating under LYRRC achieves the minimum latency of
(30) 
Here, denotes that .
Proof.
LYRRC therefore provides the latency-optimal error rate and transmission rate policies for the multiuser massive MIMO system, and Theorem 3 also captures the optimal latency of each user. In conclusion, for any nonnegative weights , we can convert the user latency minimization problem into parallel single-user optimization problems. For finite , Algorithm 1 solves each of the single-user problems and provides the optimal error rate and transmission rate policy. Furthermore, we showed that each user operating under LYRRC in a distributed manner is latency-optimal as . Section VI will use numerical experiments to evaluate the proposed transmission control.
VI Numerical Results
In this section, we utilize measured over-the-air channels to confirm our previous analysis in Section III and Section V. During the numerical evaluation, the basestation still has only imperfect channel estimates obtained from pilots. Latency is reported in seconds, obtained by multiplying the latency measured in frames by the frame duration. We measure the over-the-air channels between mobile clients and a antenna massive MIMO basestation with the Argos system [2] on the campus of Rice University. Fig. (a) and Fig. (b) show the Argos array and the over-the-air measurement setup. We measured the GHz WiFi channel ( MHz, nonempty data subcarriers) for four pedestrian users at different locations, which are shown in Fig. (c). For each location, we take channel measurements over frames of all subcarriers. During measurements, the effective measured between each mobile user and each basestation antenna is higher than dB. In simulations, we treat the measured over-the-air channel traces as the perfect channel.



The basestation adopts an MMSE estimator to estimate uplink pilots, each of power dBm, from the users. Using the imperfect (estimated) channel, the basestation generates zero-forcing receive beamforming vectors to decode the signal of each user. Each user is subject to an average power constraint of dBm with large-scale fading of dB. The maximum buffer length is . Packets arrive at a constant rate of packets per frame, and the packet size is bits per OFDM symbol. The latency penalty of dropped packets from buffer overflow is s, and each self-contained frame has a duration of ms. The state space of the error rate is , , and . Each user is under a maximum error rate constraint of %, which is equivalent to the 5G URLLC reliability constraint of % (over ms).




Fig. 8 compares the latency performance of four different policies over the measured over-the-air channels with different channel estimation accuracies. The blue lines are the optimal array-latency curves under the proposed joint reliability and transmission rate policy, which are obtained by Algorithm 1. The red lines are the proposed low-complexity LYRRC (20), which was discussed in Section IV. The green lines capture the latency under optimal transmission rate adaptation but fixed reliability (error rate of ). The black lines are the latencies of fixed reliability ( error rate) and transmission rate adaptation under a peak power constraint, which is currently deployed in LTE and WiFi systems.
The proposed joint control (blue and red lines) clearly provides better latency performance than the two fixed-reliability counterparts. Allowing the error rate to adapt to the array size turns out to reduce the latency significantly. Compared to the fixed error rate with peak power control, for basestations with array size larger than , a latency reduction is observed. Additionally, when the array size is larger than , we find that the proposed joint control can provide a latency reduction compared to the state-of-the-art control that fixes the error rate and adapts the transmission rate [16, 17, 18, 19] (based on array size and queue length). Our large-array asymptotically latency-optimal control, LYRRC, turns out to be near latency-optimal when the array size is larger than . It is worthwhile to note that the above-described policy impacts on latency are consistent across different channel accuracies, i.e., systems with different numbers of uplink pilots (). Finally, we find that fixed error rate (at ) policies lead to at least ms latency and cannot satisfy the URLLC latency requirement.
Fig. 8 also captures the influence of imperfect channel state information on latency. For a multiuser uplink system, the inter-beam interference (27) decreases as the channel estimation error decreases, i.e., as the number of pilots increases. Achieving the same reliability (error rate) requires more power with larger inter-beam interference. Therefore, under all policies, the latency increases as the number of pilots decreases. For example, a basestation with antennas and perfect channel knowledge can reduce latency to near the frame first transmission time of ms. But with an estimated channel from pilots, achieving the same latency needs a larger array size of . Additionally, even for systems with a single uplink pilot (), the proposed joint control satisfies the latency requirement of URLLC with larger than .


We now comment on the optimal error rate that minimizes the latency. Fig. (a) shows the latency-optimal error rate obtained when solving the latency minimization problems in Fig. 8. The latency-optimal error rate reduces as reduces due to less accurate channel estimation, which agrees with LYRRC. Additionally, due to the reliability constraint, the solved latency-optimal error rates satisfy the 5G reliability requirement (error rate of ).
Finally, we use simulations to verify our structural analysis in Section IV. Fig. 8 confirms that LYRRC (20) is near latency-optimal for larger than a finite number of . One technical contribution independent of the massive MIMO system is a simple transmission rate policy as , which is referred to as the "rule of double" and is part of LYRRC. Lemma 1 captures that, when buffer size , the resulting latency of using and a error rate is . Fig. (b) shows the resulting latency of using with a finite buffer size. The (large-buffer) asymptotic latency turns out to accurately approximate the system latency when is larger than . And as the target reliability increases (error rate reduces), buffer overflow is less likely to happen and the latency approximation in Lemma 1 becomes increasingly accurate.
VII Conclusion
In this work, we study the latency-optimal cross-layer control over wideband massive MIMO channels. By identifying a tradeoff between queueing and retransmission latency, we find that a lower physical layer target error rate does not always guarantee lower latency. We present algorithms that generate the optimal error rate and transmission rate policies. We show that to achieve the minimum latency, the target error rate can no longer be considered fixed and needs to be adapted based on the basestation array size, channel estimation accuracy, and the traffic arrival rate. Our results also demonstrate that massive MIMO systems have the potential to achieve both high reliability and low latency and are a promising candidate for 5G URLLC.
Appendix A Proof of Theorem 1
We use a per-packet argument. Since an infinite buffer is assumed in this section, no packet is dropped and all packets will be successfully received after a variable number of transmissions due to the potential channel-induced error. For any selected error rate , let be the average number of retransmissions. The sum of the average retransmission latency and the transmission time is then
(31) 
which is a lower bound on the total latency. To finish the proof, we now lower bound under the long-term power constraint . In steady state, the average transmission rate must equal the packet arrival rate, that is,
(32) 
Noting that the power function (12) is convex in , we apply Jensen's inequality and have
Here, the second step uses the arrival packet scaling (18). Function is an inverse CDF and is nondecreasing. A lower bound of is computed as
Using the monotonicity of the CDF, a lower bound on the error rate is then
(33) 
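The per-packet argument can be illustrated numerically. The sketch below assumes, for illustration only, that each transmission attempt fails independently with the same error probability, so that the number of attempts per packet is geometric; it does not reproduce (31). The function names are hypothetical.

```python
def expected_transmissions(eps):
    """Mean number of attempts until success when each attempt fails w.p. eps."""
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must lie in [0, 1)")
    return 1.0 / (1.0 - eps)


def expected_retransmissions(eps):
    """Mean number of attempts beyond the first one."""
    return expected_transmissions(eps) - 1.0


def truncated_series_mean(eps, max_attempts=10_000):
    """Direct evaluation of sum_k k * (1 - eps) * eps**(k - 1) as a cross-check."""
    return sum(k * (1.0 - eps) * eps ** (k - 1) for k in range(1, max_attempts + 1))


if __name__ == "__main__":
    for eps in (0.001, 0.01, 0.1):
        print(eps, expected_transmissions(eps), truncated_series_mean(eps))
```

Under this assumption, a smaller error rate always lowers the retransmission component, so any latency penalty from choosing a very small error rate must come from the queueing side.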
Appendix B Proof of Lemma 1
We compute the queueing latency by considering the steady state. Under transmission rate policy , the buffer length process (5) is rewritten as The buffer length process under thus constitutes a Markov chain with countably infinite states [36]. The distribution of is determined by error rate as and . The state transition is shown in Fig. 5. Denote the steady state distribution of the buffer length as . We then have that
where . The steady state distribution is then computed as
(34) 
Using (34), the average latency is then computed as
(35) 
which completes the proof.
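The steady-state argument can be checked with a small simulation. The sketch below is a simplified model and not the paper's exact buffer process (5): it assumes an infinite buffer, a constant arrival of one unit of packets per frame, a transmission of up to two units per frame (twice the arrival rate), and independent frame errors with probability eps, where a failed frame delivers nothing. Under these assumptions the backlog is a birth-death chain with a geometric stationary distribution, and Little's law gives a mean latency of (1 - eps) / (1 - 2 eps) frames. This expression is derived for the sketch model and is not a transcription of (35); all names are hypothetical.

```python
import random


def analytic_latency_frames(eps):
    """Mean latency in frames for the assumed model; requires eps < 1/2."""
    if not 0.0 <= eps < 0.5:
        raise ValueError("the assumed queue is stable only for eps in [0, 0.5)")
    return (1.0 - eps) / (1.0 - 2.0 * eps)


def geometric_buffer_pmf(eps, num_states):
    """Stationary probabilities of backlog levels 1, 2, ... (units of the arrival rate)."""
    rho = eps / (1.0 - eps)
    return [(1.0 - rho) * rho ** n for n in range(num_states)]


def simulate_latency_frames(eps, num_frames, seed=0):
    """Time-averaged backlog after arrivals, which equals mean latency by Little's law."""
    rng = random.Random(seed)
    backlog = 0          # units of the per-frame arrival, measured before arrivals
    total = 0
    for _ in range(num_frames):
        backlog += 1                      # constant arrival each frame
        total += backlog                  # every queued unit waits through this frame
        if rng.random() >= eps:           # frame decoded successfully
            backlog -= min(backlog, 2)    # serve at most twice the arrival rate
    return total / num_frames


if __name__ == "__main__":
    for eps in (0.01, 0.1, 0.3):
        print(eps, analytic_latency_frames(eps), simulate_latency_frames(eps, 200_000))
```

In this model the latency approaches one frame as the error rate goes to zero and grows without bound as the error rate approaches one half, where the service margin provided by doubling the rate is exhausted.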
Appendix C Proof of Theorem 2
We first characterize the gap between latency under and as
(36) 
where the last step is obtained by applying Theorem 1 and (33). Eq. (36) characterizes the latency gap. To finish the proof, it is sufficient to show that the average power constraint is satisfied under the large-array simple control.
With utilization factor (18), the packet arrival rate scales as . Using the per-frame power (12) and the definition of (20), the transmission power with rate is
(37) 
Since we assume an empty buffer at time and a constant arrival rate of , the transmission rate under policy is either or . Based on the queue length steady state characterization (34), we have that and