# Efficient Algorithms for Tandem Queueing System Simulation^{1}

## Abstract

Serial and parallel algorithms for simulation of tandem queueing systems with
infinite buffers are presented, and their performance is examined. It is
shown that the algorithms, which are based on a simple computational
procedure, involve low time and memory requirements.

Key-Words: tandem queueing systems, simulation algorithm, parallel processing.

## 1 Introduction

The simulation of a queueing system is normally an iterative process which involves generation of random variables associated with current events in the system, and evaluation of the system state variables when new events occur [1, 2, 3, 4, 5]. In the system being simulated, the random variables may represent the interarrival and service times of customers and, in systems with non-deterministic routing, determine the random routing of customers within the system. The arrival and departure times of customers and the service initiation times can be taken as state variables.

The methods of generating random variables present one of the main issues in computer simulation, which has been studied intensively in the literature (see, e.g., [3]). In this paper, however, we assume as in [2] that appropriate realizations of the random variables involved in simulating a queueing system are available when required, and we therefore concentrate only on algorithms for evaluating the system state variables from these realizations.

The usual way to represent the dynamics of queueing systems is based on recursive equations describing the evolution of the system state variables. Furthermore, these equations, which determine the overall structure of the consecutive changes in the state variables, have proved useful in designing efficient simulation algorithms [1, 2, 4, 5].

In this paper we apply recursive equations to the development of algorithms for simulation of tandem queueing systems with infinite buffer capacity. A serial algorithm and a parallel one designed for implementation on single instruction, multiple data parallel processors are presented. These algorithms are based on a simple computational procedure which exploits a particular order of evaluating the system state variables from the recursive equations. The analysis of their performance shows that the algorithms involve low time and memory requirements.

The rest of the paper is organized as follows. In Section 2, we give a representation of tandem systems based on recursive equations. This representation is used in Section 3 to design both serial and parallel simulation algorithms. Time and memory requirements of the algorithms are also discussed in that section. Finally, Section 4 includes two lemmas which offer a closer examination of the performance of the parallel algorithm.

## 2 Tandem Queues with Infinite Buffers

To set up the recursive equations that underlie the development and analysis of simulation algorithms in the next sections, consider a series of $M$ single server queues with infinite buffers. Each customer that arrives into the system is initially placed in the buffer at the $1$st server and then has to pass through all the queues one after the other. Upon the completion of his service at server $i$, the customer is instantaneously transferred to queue $i+1$, $i=1,\ldots,M-1$, and occupies the $(i+1)$st server provided that it is free. If the customer finds this server busy, he is placed in its buffer and has to wait until the service of all his predecessors is completed.

Denote the time between the arrivals of the $j$th customer and his predecessor by $\tau_{0j}$, and the service time of the $j$th customer at server $i$ by $\tau_{ij}$, $i=1,\ldots,M$, $j=1,2,\ldots$. Furthermore, let $D_{0j}$ be the $j$th arrival epoch to the system, and $D_{ij}$ be the $j$th departure epoch from the $i$th server. We assume that $\tau_{ij} \geq 0$ are given parameters, whereas $D_{ij}$ are unknown state variables. Finally, we define $D_{i0} \equiv 0$ for all $i=0,1,\ldots,M$, and $D_{-1,j} \equiv 0$ for $j=1,2,\ldots$.

With the condition that the tandem queueing system starts operating at time zero, and it is free of customers at the initial time, the recursive equations representing the system dynamics can readily be written as [1, 2, 4]

$$
D_{ij} = (D_{i-1,j} \vee D_{i,j-1}) + \tau_{ij}, \tag{1}
$$

where $\vee$ denotes the maximum operator, $i=0,1,\ldots,M$, $j=1,2,\ldots$. We can also rewrite (1) in another form intended to provide the basic representation for parallel algorithms, as

$$
B_{ij} = D_{i-1,j} \vee D_{i,j-1}, \qquad D_{ij} = B_{ij} + \tau_{ij}, \tag{2}
$$

where $B_{ij}$ stands for the $j$th initiation of service at server $i$, $i=0,1,\ldots,M$, $j=1,2,\ldots$.
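As a minimal sketch of this recursion, the Python function below fills a full table of epochs: each entry is the larger of the same customer's departure from the previous stage and the previous customer's departure from the same stage, plus the corresponding interarrival or service time. The function name `departure_table` and the nested-list layout of `tau` (row 0 for interarrival times, row `i` for service times at server `i`) are assumptions made for illustration only.

```python
def departure_table(tau):
    """Return D with D[i][j] for i = 0..M and j = 0..N.

    tau[i][j - 1] is assumed to hold the interarrival time (i = 0) or the
    service time at server i (i >= 1) of customer j. This layout is an
    assumption of the sketch, not something fixed by the text.
    """
    rows = len(tau)        # M + 1 rows: the arrival stream plus M servers
    n = len(tau[0])        # N customers
    # Column 0 is all zeros: the system is empty at time zero.
    D = [[0.0] * (n + 1) for _ in range(rows)]
    for j in range(1, n + 1):
        for i in range(rows):
            # The fictitious row "-1" is identically zero.
            prev_stage = D[i - 1][j] if i > 0 else 0.0
            start = max(prev_stage, D[i][j - 1])   # service initiation epoch
            D[i][j] = start + tau[i][j - 1]        # departure epoch
    return D


# Two servers, three customers (illustrative numbers only).
D = departure_table([[1, 1, 1], [2, 2, 2], [1, 3, 1]])
```

This version stores every epoch, so it needs memory proportional to the number of stages times the number of customers; it is meant only as a reference against which more economical orderings can be checked.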

## 3 Algorithms for Tandem Queueing System Simulation

The simulation algorithms presented in this section are based on equations (1) and (2), with the indices varied in a particular order, which is illustrated in Fig. 1(a). At each iteration $k$, the state variables $D_{ij}$ with $i+j=k$, $k=1,2,\ldots$, are evaluated. They form the diagonals depicted in Fig. 1(a) by arrows; for each diagonal, the direction of the arrows indicates the order in which the variables should be evaluated within the associated iteration. Note that to obtain each element of a diagonal, only two elements from the preceding diagonal are required, as Fig. 1(b) shows.

Figure 1: (a) The simulation schematic for a tandem system. (b) The diagram of calculating the state variable $D_{ij}$.

By applying the computational procedure outlined above, both serial and
parallel simulation algorithms which provide considerable savings in time and
memory costs may be readily designed. Specifically, the next serial algorithm
is intended for simulation of the first $N$ customers in a tandem
queueing system with $M$ servers.

Algorithm 1.

For each $k=1,\ldots,N+M$, do

for $j=\max(1,k-M),\ldots,\min(k,N)$,

compute $D_{k-j,\,j}$.

With Fig. 1(a) as an illustration, it is not difficult to calculate the total number of arithmetic operations which one has to perform using Algorithm 1. Since each variable $D_{ij}$ can be obtained using one maximization and one addition, all $(M+1)N$ variables with $i=0,1,\ldots,M$ and $j=1,\ldots,N$ require $2(M+1)N$ operations, not counting index manipulations.

Note that the order in which the variables $D_{ij}$ are evaluated within
each iteration is essential for reducing memory used for computations. One can
easily see that only $M+2$ memory locations are actually
required with this order. To illuminate the memory requirements, let us
represent Algorithm 1 in more detailed form as

Algorithm 2.

Set $d_i = 0$, $i=-1,0,1,\ldots,M$.

For each $k=1,\ldots,N+M$, do

for $j=\max(1,k-M),\ldots,\min(k,N)$,

set $d_{k-j} \leftarrow (d_{k-j-1} \vee d_{k-j}) + \tau_{k-j,\,j}$.

In Algorithm 2, the variable $d_i$ serves all the iterations to store current values of $D_{ij}$ for all $j=1,\ldots,N$. Upon the completion of the algorithm, we have for server $i$ the $N$th departure time saved in $d_i$, $i=1,\ldots,M$.
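The in-place scheme described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the author's code: the function name `simulate_serial` and the layout of `tau` (row 0 for interarrival times, row `i` for service times at server `i`, column `j - 1` for customer `j`) are hypothetical. Within each diagonal the customer index increases, so the stage index decreases and each stored value is still the one from the preceding diagonal when it is read.

```python
def simulate_serial(tau, n_customers):
    """Diagonal-order evaluation using one stored value per stage.

    Returns a list whose entry i is the last arrival epoch (i = 0) or the
    last departure epoch from server i (i >= 1).
    """
    m = len(tau) - 1                       # number of servers
    # d[i + 1] plays the role of the location for stage i, i = -1, 0, ..., M;
    # d[0] is the fictitious stage "-1" and always stays zero.
    d = [0.0] * (m + 2)
    for k in range(1, n_customers + m + 1):
        for j in range(max(1, k - m), min(k, n_customers) + 1):
            i = k - j                      # stage index on diagonal k
            d[i + 1] = max(d[i], d[i + 1]) + tau[i][j - 1]
    return d[1:]


# Two servers, three customers (illustrative numbers only).
print(simulate_serial([[1, 1, 1], [2, 2, 2], [1, 3, 1]], 3))
```

Reversing the inner loop would overwrite the value of stage `i - 1` before stage `i` reads it, which is why the evaluation order within an iteration matters for this storage scheme.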

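Finally, we present a parallel algorithm for tandem system simulation which is
actually a simple modification of Algorithm 1.

Algorithm 3.

For each $k=1,\ldots,N+M$, do

in parallel, for $j=\max(1,k-M),\ldots,\min(k,N)$,

compute $B_{k-j,\,j} = D_{k-j-1,\,j} \vee D_{k-j,\,j-1}$;

in parallel, for $j=\max(1,k-M),\ldots,\min(k,N)$,

compute $D_{k-j,\,j} = B_{k-j,\,j} + \tau_{k-j,\,j}$.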

As in the case of Algorithm 1, we may conclude that Algorithm 3 entails memory locations. Furthermore, it is easy to understand that Algorithm 3 requires the performance of parallel operations provided processors are used. Otherwise, if there are processors available, one has to rearrange computations so as to execute each iteration in several parallel steps. In other words, all operations within an iteration should be sequentially separated into groups of operations, assigned to the sequential steps. We will discuss time requirements and speedup of the algorithm in this case in the next section.
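The two-phase structure of the parallel algorithm can be emulated sequentially, as in the hedged Python sketch below. Each iteration first takes all maxima (service initiation epochs) from the values of the preceding diagonal, and only then performs all additions. The function name `simulate_two_phase` and the layout of `tau` are assumptions for illustration; plain Python runs the phases one element at a time, whereas on a SIMD machine each phase would presumably map to a single vector operation.

```python
def simulate_two_phase(tau, n_customers):
    """Sequential emulation of the max-then-add iteration.

    tau[i][j - 1] is assumed to be the interarrival time (i = 0) or the
    service time at server i (i >= 1) of customer j.
    """
    m = len(tau) - 1                       # number of servers
    d = [0.0] * (m + 2)                    # d[i + 1] stores stage i, i = -1..M
    for k in range(1, n_customers + m + 1):
        js = range(max(1, k - m), min(k, n_customers) + 1)
        # Phase 1: all maximizations; they read only the preceding diagonal,
        # so they are mutually independent.
        b = [max(d[k - j], d[k - j + 1]) for j in js]
        # Phase 2: all additions; each one writes a distinct location.
        for j, start in zip(js, b):
            d[k - j + 1] = start + tau[k - j][j - 1]
    return d[1:]


# Two servers, three customers (illustrative numbers only).
print(simulate_two_phase([[1, 1, 1], [2, 2, 2], [1, 3, 1]], 3))
```

Because the maxima are buffered in `b` before anything is written, the order in which elements are processed inside a phase no longer matters, which is what makes the phase suitable for parallel execution.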

## 4 Performance Study of the Parallel Simulation Algorithm

We now turn to the performance evaluation of Algorithm 3 with respect to the number of parallel processors. Note that the results of this section are obtained by considering the time taken to compute only the state variables . In other words, in our analysis we ignore the time required for computing indices, allocating and moving data, and synchronizing processors, which in general can have an appreciable effect on the performance of parallel algorithms.

###### Lemma 1.

To simulate the first customers in a tandem queue with servers, Algorithm 3 using processors requires the time

(3) |

where , .

###### Proof.

We start our proof by evaluating the exact number of parallel operations to be performed when processors are available. As is easy to see, at each iteration , , the algorithm first carries out in parallel a fixed number of maximizations, and then does the same number of additions. Denote this number by . It follows from the above description of the algorithm that the numbers , , form the sequence with elements

Since parallel operations may be performed using processors in the time , for the entire algorithm we have the total time

(4) |

To calculate , let us first consider the sum

Substitution of this expression into (4), and trivial algebraic manipulations give

(5) |

Finally, since , and as , we conclude that

Note that in two critical cases with and , the order produced by (3) coincides with the exact times, respectively equal to and .

###### Lemma 2.

For a tandem system with servers, Algorithm 3 using processors achieves the speedup

(6) |

###### Proof.

To evaluate the speedup which is defined as

(7) |

first note that .

###### Corollary 3.

For a tandem system with servers, Algorithm 3 using processors achieves linear speedup as the number of customers .

###### Proof.

It follows from (6) that with the speedup as . ∎
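The operation counts discussed in this section can be checked numerically by direct enumeration. The Python sketch below is an illustration under explicit assumptions: a group of `w` independent operations on `p` processors is taken to cost `ceil(w / p)` time units, each iteration consists of one maximization phase and one addition phase over the diagonal `i + j = k`, and all overheads are ignored. The names `parallel_time` and `speedup` and the parameters `m` (servers), `n` (customers) and `p` (processors) are hypothetical.

```python
from math import ceil


def parallel_time(m, n, p):
    """Count time units for n customers, m servers and p processors."""
    total = 0
    for k in range(1, n + m + 1):
        # Number of state variables on diagonal k.
        width = min(k, n) - max(1, k - m) + 1
        # One phase of maxima and one phase of additions per iteration.
        total += 2 * ceil(width / p)
    return total


def speedup(m, n, p):
    """Ratio of the one-processor time to the p-processor time."""
    return parallel_time(m, n, 1) / parallel_time(m, n, p)


# With one processor the count equals two operations per state variable.
print(parallel_time(2, 3, 1))
# With as many processors as stages, the ratio approaches that number
# as the number of customers grows.
print(speedup(4, 10_000, 5))
```

Under these assumptions the single-processor count is twice the number of state variables, and with one processor per stage each iteration costs two time units, so the ratio tends to the number of processors as the number of customers increases.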

### Footnotes

- ^{1} Applied Mathematics Letters, 1994, Vol. 7, no. 6, pp. 45–49.

### References

- [1] L. Chen and C.-L. Chen, “A fast simulation approach for tandem queueing systems,” in 1990 Winter Simulation Conference Proceedings, O. Balci, R. P. Sadowski, and R. E. Nance, eds., pp. 539–546. IEEE, 1990.
- [2] A. G. Greenberg, B. D. Lubachevsky, and I. Mitrani, “Algorithms for unboundedly parallel simulations,” ACM Trans. Comput. Syst. 9 no. 3 (1991) 201–221.
- [3] S. M. Ermakov, Die Monte-Carlo-Methode und verwandte Fragen. VEB Deutscher Verlag der Wissenschaften, Berlin, 1975.
- [4] N. K. Krivulin, Optimization of Discrete Event Dynamic Systems by Using Simulation, PhD Dissertation. St. Petersburg University, St. Petersburg, 1990 (in Russian).
- [5] N. Krivulin, “Unbiased estimates for gradients of stochastic network performance measures,” Acta Appl. Math. 33 (1993) 21–43, arXiv:1210.5418 [math.OC].