On Continuous-space Embedding of Discrete-parameter Queueing Systems
Abstract
Motivated by the problem of discrete-parameter simulation optimization (DPSO) of queueing systems, we consider the problem of embedding the discrete parameter space into a continuous one so that descent-based continuous-space methods can be applied directly for efficient optimization. We show that a randomization of the simulation model itself can be used to achieve such an embedding when the objective function is a long-run average measure. Unlike spatial interpolation, the computational cost of this embedding is independent of the number of parameters in the system, making the approach ideally suited to high-dimensional problems. We describe in detail the application of this technique to discrete-time queues for embedding queue capacities, numbers of servers and server-delay parameters into continuous space, and empirically show that the technique can produce smooth interpolations of the objective function. Through an optimization case study of a queueing network with a large number of design points, we demonstrate that existing continuous optimizers can be effectively applied over such an embedding to find good solutions.
Keywords—Queueing Systems, Discrete-Parameter Simulation Optimization, Interpolation
1 Introduction
1.1 Motivation
The use of simulation is often necessary in the optimization of complex real-life queueing networks. Such queueing networks typically have discrete-valued parameters such as queue capacities, the number of servers and server delays. More concretely, consider a queueing system with a finite set of parameters, each of which can take integer values between fixed bounds. The set of all possible parameter vectors forms a multi-dimensional discrete parameter space. Let the objective be a cost/performance measure of the system that we wish to optimize. In many applications, the objective may be composed of long-run average measures such as average throughput, average waiting time per customer or blocking probabilities. An analytical expression for the objective is rarely available; at a given parameter point it can only be estimated using a simulation of the system, and the estimates are noisy as each simulation run has finite length. We are motivated by the problem of finding a parameter point that minimizes this objective. This is a discrete-parameter simulation optimization (DPSO) problem. The problem is often difficult as the number of parameters can be very large and each function evaluation is computationally expensive.
For small parameter sets, techniques such as Optimal Computing Budget Allocation (OCBA)[1] have been effectively applied. When the number of parameters is large, an exhaustive evaluation of all design points is infeasible. The goal then is to find the best possible solution within a finite computational budget, rather than the global optimum. In such cases, randomized search techniques such as simulated annealing[2, 3] and genetic algorithms[4], or heuristic-based local search methods such as Tabu search[5], have been employed. Discrete-space variants of continuous optimizers such as Simultaneous Perturbation Stochastic Approximation (SPSA)[6] have also been proposed, wherein the parameter estimate is projected back to the discrete space at each iteration [7, 8]. In all of the above methods, the objective function evaluation is limited strictly to points in the original discrete domain.
If the discrete parameter space can be embedded into a larger continuous space by using some form of interpolation, the optimization problem can be solved by directly applying descent-based continuous-space methods. An embedding of the discrete parameter space into a continuous one is essentially an interpolation of the objective defined over the convex hull of the discrete space. If this interpolation can be constructed in a computationally efficient manner and has a suitable structure (that is, it is continuous, piecewise smooth and has few local minima), then continuous optimizers applied directly over it can be expected to perform well, because gradient information can be utilized at each step to converge rapidly to local minima. Random multi-starts can be used in the case of non-convex functions. Most importantly, such methods often scale well with the number of parameters in comparison to enumerative approaches. However, the solution needs to be projected back to the original discrete space carefully. Note that an embedding-based approach is essentially different from methods where the parameter vector is rounded or truncated to integer points at each step: a continuous space offers more pathways along which the optimizer can move aggressively toward a solution.
The continuous-space embedding approach is suited to simulation optimization of queueing, inventory or manufacturing systems, where most discrete parameters have an integer ordering and the objective function has some structure (unlike combinatorial problems). Such an approach has recently been reported in [9] and [10] for the optimization of queueing and inventory systems, where spatial interpolation (piecewise-local interpolation over a simplex) was used to obtain a continuous-space embedding of the objective function.
1.2 Problem Statement
Motivated by the embedding-based approach to the DPSO problem, we consider the problem of obtaining a continuous-space embedding in a computationally efficient manner. Consider a system with a vector of discrete-valued parameters, each taking integer values between fixed bounds. The set of all possible parameter vectors is the discrete parameter space, and the continuous domain of interest is its convex hull.
We consider the problem of finding a function over the continuous domain that agrees with the objective at every discrete parameter point and is continuous everywhere in between. Further, it is desirable that this interpolation be continuously differentiable so that gradient-based methods can be applied. Spatial interpolation as a means of obtaining such an embedding is computationally expensive. Given a point in the continuous domain and a set of discrete points in its neighbourhood, the interpolation is computed as a weighted sum of the objective values at those neighbouring points, where the weights are interpolation coefficients. One simulation is required per neighbouring point to estimate the interpolated value: multilinear interpolation over a surrounding hypercube requires 2^N simulations in N dimensions, while piecewise simplex interpolation requires N + 1. Thus the computational effort required for the embedding grows rapidly with the dimensionality of the parameter space.
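To make the cost argument concrete, the following sketch (a hypothetical illustration, not taken from the paper's scripts) implements multilinear interpolation at a fractional point and counts the simulator calls it needs; the count is 2^N in N dimensions, whereas the randomization technique proposed in this paper needs a single simulation:

```python
import math
from itertools import product

def multilinear_interpolate(simulate, x):
    """Multilinear interpolation of `simulate` at a fractional point x.

    `simulate` stands in for one (expensive) simulation run at an
    integer design point; it is called once per corner of the
    surrounding hypercube, i.e. 2**len(x) times.
    """
    lo = [math.floor(xi) for xi in x]
    total, calls = 0.0, 0
    for corner in product([0, 1], repeat=len(x)):
        point = tuple(lo[i] + corner[i] for i in range(len(x)))
        weight = 1.0
        for i in range(len(x)):
            frac = x[i] - lo[i]
            weight *= frac if corner[i] else 1.0 - frac
        total += weight * simulate(point)
        calls += 1
    return total, calls
```

For a linear objective the interpolation is exact; the relevant observation is the call count, which doubles with every added parameter.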
Instead, we propose an embedding technique which is based on a randomization of the simulation model itself. The interpolated value can be computed using a single simulation of the randomized model, irrespective of the dimensionality of the parameter space. The basic idea behind the embedding technique is described next.
1.3 Embedding via Randomization
Consider a point in the continuous domain at which we wish to compute the interpolated value, so that some parameter must be assigned a non-integer value. To compute the interpolation, we construct a randomized version of the model in which each parameter is perturbed periodically (for example, at the beginning of each time-slot) and assigned values of a discrete random variable. The random variable is chosen in such a way that its moments can be made to vary continuously with respect to the continuous parameter value, and that it equals the parameter value with probability one whenever that value is an integer. The simplest example of such a random variable X takes the continuous parameter value x to its two nearest integers:
P(X = ⌊x⌋) = ⌈x⌉ − x,  P(X = ⌈x⌉) = x − ⌊x⌋. (1)
Let the long-run average measure obtained by simulating such a randomized model define a function of the continuous parameter point. By construction, this function coincides with the original objective at every integer point and is therefore an interpolation of it. As each parameter in the model can be embedded independently, the interpolated value can be computed using a single simulation of the randomized model, irrespective of the number of parameters. In essence, the interpolation technique relies on averaging in time, in contrast to spatial interpolation methods, which perform an averaging in space.
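A minimal sketch of the two-point randomization of Equation (1); the helper name `randomize` is ours, not the paper's:

```python
import math
import random

def randomize(x, rng=random):
    """Draw the instantaneous integer parameter value for one slot:
    ceil(x) with probability x - floor(x), floor(x) otherwise, so the
    draw has mean x and equals x with probability 1 at integers."""
    lo, hi = math.floor(x), math.ceil(x)
    if lo == hi:          # x is already an integer
        return lo
    return hi if rng.random() < (x - lo) else lo
```

Inside the randomized model, such a draw is made once per time-slot for each embedded parameter.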
Such a technique can, in principle, be applied to several types of discrete-event systems where the objective is a long-run average. A randomization-based approach was first reported in [11] for sensitivity measurement in VLSI systems (for producing small real-valued perturbations to discrete-valued parameters) and in [12] for the design optimization of multi-core system architectures. Both works demonstrated specific applications of the technique, however, without a theoretical justification. A theoretical basis for the embedding technique was introduced in [13] in the context of two specific algorithms for solving the DPSO problem. That work proposed variants of two continuous optimizers (SPSA and the Smoothed Functional (SF) algorithm) wherein the parameter estimate at each iteration is projected back to the discrete space probabilistically (rather than deterministically), and showed that this effectively produces a continuous embedding of the underlying discrete-parameter process. The work focused on the optimization algorithms, and the embedding technique itself was not explored in depth.
The focus of the current work is on the randomization-based embedding technique itself and its application to queueing systems (rather than on a specific optimization method). In fact, once a suitable embedding has been obtained, a rich set of existing continuous optimizers becomes directly applicable to the original DPSO problem. The main contributions of this paper are listed below.
1.4 Our Contributions
We present a general scheme for the randomization of discrete-valued parameters in a simulation model and show that such a randomization can indeed produce continuous interpolations of the long-run average objective. The proof (presented in Section 2) uses a result from Markov chain perturbation theory[14] and builds on a proof in [13] by relaxing some of its assumptions. The interpolation is not unique and can be tuned by varying the shape of the weights (probabilities) used during the randomization. We refer to these weights as Stochastic Interpolation Coefficients. In Section 3 we present a parameterized template for generating these stochastic interpolation coefficients. We then discuss the desirable properties of a good interpolation and demonstrate how the template parameters can be varied to improve the interpolation.
We explore in detail the application of the embedding technique to discrete-time queues. Such queues are of importance in many applications such as communication networks, manufacturing lines and transportation systems[15, 16]. Through several concrete examples, we demonstrate the continuous-space embedding of queue-capacity, number-of-servers and server-delay parameters, and observe that the technique can produce smooth interpolations of the objective function. We present detailed simulation results for these discrete-time queues in Section 4. The simulation models and scripts used for all results reported in this paper are available online (see [17]).
As a demonstration of the utility of this embedding technique, we present an optimization case study of a queueing network with seven integer parameters and a large number of design points. We observe that a randomization of the simulation model produces smooth embeddings of the objective function with a low computational overhead. Two well-established continuous-space optimizers (COBYLA[18] and SPSA[6]) applied directly over the embedding perform well and find good solutions in comparison to a direct discrete-space search. We present the observations from this optimization case study in Section 5. In summary, our work establishes a computationally efficient technique for embedding discrete-parameter queueing systems into continuous space, enabling the direct application of continuous optimizers to solve the DPSO problem.
2 A General Randomization Scheme
We describe the randomization scheme with respect to a single parameter; all parameters in the model can be embedded in the same manner simultaneously and independently of each other. Consider a parameter in the model that can take integer values from a finite set. The continuous range of interest is the interval spanned by the smallest and largest values in this set.
To embed the parameter into a continuous space and assign to it a non-integer value, we construct a randomized version of the model where the parameter is perturbed at multiple time-instants within a single simulation trajectory and assigned values of a discrete random variable. The random variable can have a multi-point distribution taking values from the parameter's finite set. We choose, one per value in the set, a family of coefficient functions mapping the continuous range to [0, 1] such that:
the coefficients at any point of the continuous range sum to one, (2)
the coefficient of an integer value equals one when the continuous point coincides with that value and zero at every other integer point, (3)
each coefficient function is continuous at all points in the continuous range. (4)
The random variable then takes each integer value with probability equal to that value's coefficient evaluated at the continuous point. (5)
At a given point of the continuous range, these coefficient values serve as stochastic interpolation coefficients.
There are infinitely many choices for a set of functions satisfying conditions (2) to (4). We illustrate one example of such a set in Figure 1. In general, the shape of these functions affects the resulting interpolation. In the following paragraphs we show that such a randomization can indeed produce continuous interpolations of the long-run average measure. The result is similar to [13, Lemma 3], except that the assumption about the ergodicity of the constituent Markov chains has been relaxed. This makes the analysis applicable to a wider set of parameters and systems (including chains with transient or periodic states, such as those described in Section 4).
2.1 Analysis
Consider the points of the discrete parameter space. At each parameter point we assume that the behavior of the system can be described as a stationary Markov chain with a finite state-space and a transition probability matrix. Such a chain will have a unique stationary distribution (that is, a unique probability distribution left invariant by the transition matrix) unless it contains two or more closed communicating classes. We assume that at each point the corresponding chain contains exactly one closed communicating class and therefore has a unique stationary distribution. The chain is not required to be ergodic and may contain periodic states and/or some transient states. Further, we assume that the chains at all parameter points share a common state-space. This assumption holds when the model behavior is well-defined for dynamically changing parameter values, as demonstrated in Section 4. Note that it is permissible for the subsets of states forming the closed communicating classes at two distinct parameter points to be different or altogether non-overlapping.
Let a cost function assign a fixed cost to every occurrence of a state in the Markov chain. The long-run average cost at a discrete parameter point can then be defined in terms of the stationary distribution as the expected state cost: the sum, over all states, of the cost of a state multiplied by its probability of occurrence under the stationary distribution. (6)
Now consider the randomized model at a point of the continuous domain, where each parameter in the model is assigned values of its randomizing variable. The draws at successive perturbation instants are independent random variables with the distribution given by Equation (5). We assume that all parameters in the model are perturbed at identical time instants. Let all of the stochastic interpolation coefficient functions be of differentiability class C^k (that is, have k continuous derivatives, where k is some non-negative integer). Let the long-run average measure of the randomized model define a function over the continuous domain.
Theorem 1.
The long-run average measure of the randomized model is a C^k interpolation of the original long-run average cost.
Proof.
The vector of instantaneous parameter values at any time instant is itself a random variable whose distribution is a function of the continuous point: since the parameters are perturbed independently, the probability of observing any particular vector of integer parameter values is the product, over all parameters, of the corresponding per-parameter coefficients. (7)
For each point of the discrete parameter space we can therefore define a joint coefficient, obtained as the product of the per-parameter stochastic interpolation coefficients of its components. (8)
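The product form of Equations (7) and (8) can be sketched as follows (a hypothetical helper; the names are ours):

```python
from itertools import product

def joint_coefficients(per_param):
    """Joint stochastic interpolation coefficients.

    per_param: one dict per parameter, mapping an integer value of
    that parameter to its coefficient at the current continuous
    point.  Returns a dict mapping each integer parameter vector to
    its probability, the product of the per-parameter coefficients.
    """
    joint = {}
    for combo in product(*(c.items() for c in per_param)):
        vector = tuple(v for v, _ in combo)
        prob = 1.0
        for _, w in combo:
            prob *= w
        joint[vector] = prob
    return joint
```

Because each per-parameter family sums to one, the joint coefficients also sum to one over all parameter vectors.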
From Equation (8) and conditions (2) to (4) it follows that the joint coefficient functions, which map the continuous domain to the range [0, 1], satisfy the following analogous conditions:
the joint coefficients at any point of the continuous domain sum to one over all discrete parameter points, (9)
the joint coefficient of a discrete parameter point equals one when the continuous point coincides with it and zero at every other discrete point, (10)
the joint coefficient functions are continuous at all points in the continuous domain. (11)
We choose the time instants at which to perturb the parameter values in such a way that the transition probability matrix of the randomized model is the convex combination of the transition matrices at the discrete parameter points, weighted by their joint coefficients. (12)
In a discrete-time system, this can be achieved in a straightforward manner by perturbing the parameter values at the beginning of each time-slot. From Equation (12) and condition (11) it follows that the randomized transition matrix is continuous with respect to the continuous point. We now refer to a result from [14, Section 6] which states that: if P is the transition probability matrix of a finite Markov chain containing a single irreducible subset of states (a single closed communicating class), then for an arbitrary stochastic matrix Q with the same state-space as P, the randomized stationary Markov chain with transition probability matrix (1 − β)P + βQ, for β in [0, 1),
will also possess a single irreducible subset of states. Further, it has a unique stationary distribution which is infinitely differentiable with respect to β for β in [0, 1).
In Equation (12), each constituent transition matrix has a single irreducible set of states. Therefore the stationary distribution of the randomized chain exists, is infinitely differentiable with respect to the joint coefficients, and is consequently k times continuously differentiable with respect to the continuous point. The long-run average cost of the randomized model is the expectation of the state cost under this stationary distribution, and is therefore also C^k with respect to the continuous point. From Equation (7) and condition (10), the randomized model assigns all parameters their exact integer values with probability one whenever the continuous point coincides with a discrete parameter point, so the randomized measure agrees with the original long-run average cost at every such point. Therefore it is a C^k interpolation. ∎
Thus we have shown that a randomization of the simulation model can be used as a means of producing continuous interpolations of the long-run average measure under the listed assumptions. Note that for the interpolation to be of class C^k, it is sufficient but not necessary for the stochastic interpolation coefficients to be C^k functions. For instance, it may be possible to obtain a continuously differentiable interpolation using coefficients that are not differentiable at integer points.
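The mechanism of the proof can be checked numerically on a toy chain. Below, the randomized transition matrix is a two-point mixture, in the spirit of Equation (12), of the matrices at two adjacent integer parameter values; the matrices, costs and mixing weight are invented for illustration, and power iteration suffices only because the example chains are aperiodic:

```python
import numpy as np

def stationary(P, iters=500):
    """Stationary distribution by power iteration (adequate for the
    aperiodic example chains used here)."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iters):
        pi = pi @ P
    return pi

def mixed_cost(frac, P_lo, P_hi, cost):
    """Long-run average cost of the chain whose transition matrix is
    the mixture (1 - frac) * P_lo + frac * P_hi, frac in [0, 1]."""
    P = (1.0 - frac) * P_lo + frac * P_hi
    return float(stationary(P) @ cost)
```

At frac = 0 and frac = 1 the mixture reproduces the constituent chains exactly, and the cost varies smoothly in between, mirroring Theorem 1.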
3 Generating the Stochastic Interpolation Coefficients
Consider the set of functions that satisfy conditions (2) to (4). While an infinite number of choices exist for such a set of functions, we present one possible template that can be used to generate a family of such sets. The template is useful as the final interpolation curves can be tuned by varying the parameters of this template.
For a given point of the continuous range, we refer to the stencil around it as the set of discrete points chosen as the basis to represent it; in other words, the stencil is the set of values that the randomizing variable takes with non-zero probability. For example, a symmetric two-point stencil around a non-integer point consists of the two nearest integers. More generally, when the parameter set consists of consecutive integers, a symmetric stencil of size 2m (where m is a natural number) around any point consists of the 2m integers nearest to it. (13)
Note that the stencil size can be lower than 2m for points near the boundary of the parameter set. Now, corresponding to each point of the parameter set, we define a kernel function that is non-negative, continuous over the continuous range, and approaches zero as its argument approaches any point of the stencil other than the point itself; a simple example is a triangular kernel that decays linearly from one at the point to zero at the neighbouring stencil points. (14)
The stochastic interpolation coefficients are then obtained by normalizing the kernel values over the stencil: the coefficient of each stencil point is its kernel value divided by the sum of the kernel values of all points in the stencil. (15)
Since the kernel values are non-negative and a normalization is performed in Equation (15), it can be seen that the coefficients always lie in the range [0, 1] and satisfy conditions (2) to (4). To obtain a more general template, we introduce two fixed positive constants: a spread constant and a skew constant. The generalized kernel raises the linear decay to the power of the spread constant, with the distances on one side of each point scaled by the skew constant. (16)
The coefficients can then be obtained as given by Equation (15) using the kernel values computed from Equation (16).
The stencil size and the spread and skew constants are the parameters of this template. Figure 2 shows sets of coefficients generated using this template for different values of these parameters. The spread constant controls the spread of the coefficient mass around a point: the decay is linear when the spread constant equals one and becomes sharper as it increases. It is interesting to note that for very large values of the spread constant (with a two-point stencil), the coefficient of the nearest integer is close to one over most of the range, and the interpolation approaches a simple nearest-neighbour rounding. The skew constant controls the skew: the kernel functions are symmetric when it equals one, and skew to one side or the other as it moves above or below one. The template thus provides a simple means of varying the shape of the stochastic interpolation coefficients. In the next section, we demonstrate how this can be used to tune the final interpolation curves for a specific example.
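The template can be sketched in code. The kernel below is our own stand-in for Equations (14) to (16): the weight of integer v decays as (1 − |x − v|) raised to a spread exponent, with the exponent additionally scaled by the skew constant on one side of x, and the weights are normalized as in Equation (15). The exact functional form in the paper may differ.

```python
def coefficients(x, lo, hi, spread=1.0, skew=1.0):
    """Stochastic interpolation coefficients at a continuous point x,
    for integer values v in [lo, hi] (a hypothetical kernel).

    Each weight is supported on |x - v| < 1, equals 1 at x = v and 0
    at every other integer, and is continuous in x, so conditions
    (2) to (4) hold after normalization.  `spread` sharpens the
    decay (spread = 1 is linear); `skew` tilts the mass to one side.
    """
    raw = {}
    for v in range(lo, hi + 1):
        d = abs(x - v)
        if d < 1.0:
            exponent = spread * (skew if v > x else 1.0)
            raw[v] = (1.0 - d) ** exponent
    total = sum(raw.values())
    return {v: w / total for v, w in raw.items()}
```

At integer x the distribution collapses onto x itself, which is exactly the condition ensuring the embedding agrees with the original model at discrete points.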
4 Application to Discrete-time Queues
We now present concrete examples of the application of the embedding technique to discrete-time queues. We use the following notation and assumptions throughout this section. Time is divided into unit-sized slots. Jobs arrive into the system near the beginning of a slot and depart towards the end of the slot. At most one job can arrive within a single slot. The queue state is measured at the end of a slot, as illustrated in Figure 3; the initial state and the states measured at the ends of successive slots define the queue-state process. For job arrivals with geometrically distributed inter-arrival times (denoted Geo), the probability of a job arriving in a slot is fixed. For geometrically distributed (Geo) service times, the probability of the server finishing a job in a slot is fixed and independent of the number of slots for which the job has already received service. For a deterministic server (denoted D), the number of slots taken to process a job is a fixed integer.
4.1 Embedding Queue Capacity
Consider a Geo/Geo/1 queue with finite buffering. The queue state is the number of jobs in the queue at the end of a slot. The queue has a capacity parameter that we wish to embed into a continuous space. We first define the behavior of the queue with respect to the capacity parameter as follows:
Definition 1.
A job arriving in a slot is allowed to enter the queue if the number of jobs currently in the queue is less than the capacity parameter; otherwise the job is lost.
By defining the capacity parameter in this way, we ensure that the Markov chains corresponding to every possible value of the capacity share the same state-space. For any given capacity value, the states above it are transient and unreachable from the states at or below it. As an example, Figure 4 shows the Markov chains for two different capacity values.
Let some long-run average measure of this system be expressed as a function of the capacity parameter. We wish to embed the capacity parameter into continuous space and evaluate its interpolation at a given non-integer point. To do so, we construct a randomized model where the capacity parameter is perturbed at the beginning of each slot and assigned the value of a random variable whose distribution is given by stochastic interpolation coefficients satisfying the conditions described in Section 2. The instantaneous value of the capacity parameter for the duration of a slot is the value drawn for that slot. By the definition of the capacity parameter above, jobs already present in the queue are not disturbed when the parameter is updated; the updated value is used solely to decide whether a new job should be accepted into the queue. Thus the number of jobs in the queue may exceed the instantaneous capacity in some slots.
In Figure 5, we show the simulation results obtained with this randomization scheme. The interpolation coefficients are generated using the template described in Section 3, with tuned spread and skew settings. For all simulation results presented in this section we have used a 2-point stencil. We fix the arrival and service probabilities and sweep the queue capacity parameter in small steps. The interpolated function is the blocking probability (the probability of an arriving job being denied entry into the system). Each point in the plot is the mean value measured over several simulation samples with distinct randomization seeds. The shaded area around the plot represents the standard-deviation interval.
We observe that the randomization scheme produces a smooth interpolation, and the standard deviation values (indicating simulation error) are similar at the discrete and the interpolated points. The computational overhead of the embedding, contributed primarily by the additional calls to a random number generator, was a small fraction of the simulation time. The overhead was computed as the relative difference between the time per simulation for the original discrete-parameter model (at an integer point) and the randomized model (at a non-integer point). Thus a smooth embedding of the queue capacity parameter could be obtained through a randomization of the simulation model.
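A compact sketch of this experiment (our own illustrative code, not the published scripts of [17]): a discrete-time Geo/Geo/1 queue whose capacity is re-drawn each slot by two-point randomization, yielding a blocking-probability estimate at non-integer capacities:

```python
import math
import random

def blocking_probability(capacity, p_arrival, p_service,
                         slots=200_000, seed=1):
    """Estimate the blocking probability of a discrete-time Geo/Geo/1
    queue whose capacity parameter (possibly non-integer) is re-drawn
    at the start of every slot by two-point randomization.  Jobs
    already queued are never evicted; the drawn capacity only gates
    admission of an arriving job."""
    rng = random.Random(seed)
    n = 0                       # jobs currently in the queue
    arrivals = blocked = 0
    lo, hi = math.floor(capacity), math.ceil(capacity)
    for _ in range(slots):
        c = hi if rng.random() < (capacity - lo) else lo
        if rng.random() < p_arrival:        # arrival near slot start
            arrivals += 1
            if n < c:
                n += 1
            else:
                blocked += 1
        if n > 0 and rng.random() < p_service:  # departure near slot end
            n -= 1
    return blocked / max(arrivals, 1)
```

The estimate at capacity 2.5 falls between the estimates at capacities 2 and 3, consistent with the interpolation behaviour seen in Figure 5.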
4.2 Tuning the Interpolation
The interpolation is sensitive to the shape of the coefficient functions used during randomization. To demonstrate this, we consider the problem of embedding the queue capacity parameter in a finite-capacity Geo/D/1 queue. This queue is similar to the Geo/Geo/1 queue except that the server is deterministic with a fixed integer service time. The queue capacity parameter is embedded using a randomization scheme identical to that of the Geo/Geo/1 case, using the template described in Section 3. We show the simulation results and the resulting interpolations obtained using different values of the skew and spread parameters in Figure 6.
We observe that for some settings the interpolation has kinks at the integer points. If the original objective function is integer-convex over some convex subset of the domain, it is desirable that the interpolation also be convex over the convex hull of that subset. Otherwise, a continuous optimizer applied directly over the interpolation can get stuck in a local minimum that does not correspond to any minimizer of the original objective. Further, it is also desirable that the interpolation be smooth so that gradient-based continuous optimizers can converge fast. Considering these criteria, a reasonably good interpolation was obtained for one particular setting of the spread and skew constants. For some systems, it may be possible to arrive at the best values of the randomization parameters analytically; this is an interesting problem, however, it is beyond the scope of the current work. For the examples presented henceforth, we have tuned the interpolation curves along each parameter axis by varying the template parameters.
4.3 Embedding the Number of Servers
Consider a Geo/Geo/K queue. The queue has infinite buffering and K identical, independent servers working in parallel. We wish to embed the parameter K into continuous space. To do so, we first redefine the system behavior as follows:
Definition 2.
The system consists of a single, infinitely fast controller with a parameter K, and a fixed large number of identical, independent servers. In each time-slot, the controller pulls jobs from the head of the queue and assigns each job to a free server, as long as the queue is not empty and the number of jobs currently receiving service is less than K.
Note that for a fixed (integer) value of K, this description is identical to a queue with K independent servers. However, the new definition of the queue behavior makes it natural for the controller parameter to be updated dynamically. To embed K into continuous space and evaluate the interpolation at some non-integer point, we construct a randomized model where K is perturbed at the beginning of each slot and assigned the value of the randomizing variable. By the definition above, whenever the parameter is updated, the jobs already receiving service are not disturbed and the updated parameter value is used solely for deciding whether service can commence for new jobs.
In Figure 7, we show the interpolation obtained using this scheme with tuned template settings. We fix the arrival and service probabilities and sweep the parameter K in small steps (the step-size is chosen to be smaller near the knee region). The interpolated function is the average number of jobs in the system. Each point in the plot is the mean value measured over several simulation samples.
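An analogous sketch for the server-count embedding (again our own illustrative code): the controller parameter is re-drawn each slot by two-point randomization, and jobs already in service are never interrupted:

```python
import math
import random

def avg_jobs(k, p_arrival, p_service, slots=200_000, seed=1):
    """Time-average number of jobs in a discrete-time Geo/Geo/K queue
    (infinite buffer) whose controller parameter k may be
    non-integer: the instantaneous value is re-drawn each slot and
    only gates the start of new service."""
    rng = random.Random(seed)
    waiting = in_service = 0
    total = 0
    lo, hi = math.floor(k), math.ceil(k)
    for _ in range(slots):
        kc = hi if rng.random() < (k - lo) else lo
        if rng.random() < p_arrival:
            waiting += 1
        # controller assigns queued jobs to free servers, up to kc busy
        while waiting > 0 and in_service < kc:
            waiting -= 1
            in_service += 1
        # each busy server completes its job independently
        in_service -= sum(1 for _ in range(in_service)
                          if rng.random() < p_service)
        total += waiting + in_service
    return total / slots
```

The average occupancy at a half-integer controller setting lands between the occupancies at the adjacent integer settings, mirroring Figure 7.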
In Figure 8, we show the interpolation results for the same randomization scheme applied to a Geo/D/K queue (with a fixed deterministic service time).
4.4 Embedding Service Time
Consider a Geo/D/1 queue. The server is deterministic with a fixed integer service time. To embed the service-time parameter into continuous space, we first define the server behavior as follows:
Definition 3.
The server has a service-time parameter. At the end of each slot, the server completes those jobs that have already received a number of slots of service equal to this parameter.
To embed the service-time parameter into continuous space and evaluate the interpolation at some non-integer point, we construct a randomized model where the parameter is perturbed at the beginning of each slot and assigned the value of the randomizing variable. In Figure 9, we show the interpolation obtained using this randomization scheme with tuned template settings. We fix the arrival probability and sweep the service-time parameter in steps of 0.05. The interpolated function is the average number of jobs in the system. Each point in the plot is the mean value measured over several simulation samples. The randomization produces a smooth interpolation of the objective.
Using the approach described in this section, multiple discrete parameters in a model can be embedded simultaneously and independently of each other, and existing continuous optimizers can be effectively applied over such an embedding. We demonstrate this using an optimization case study in the following section.
5 An Optimization Example
5.1 Problem Statement
To demonstrate the utility of the embedding technique, we consider the optimization of the queueing network shown in Figure 10.

The system consists of three nodes. Each node has a queue with a finite capacity.

Jobs arrive at the first node with geometrically distributed inter-arrival times (with a fixed arrival probability). An arriving job that finds the queue full is lost. The first node consists of a single deterministic server with an integer service-time parameter. This server forwards each job to either of the other two nodes with equal probability, and stalls if the destination queue is full.

The second and third nodes each consist of a configurable number of identical servers working in parallel. The servers in the second node have geometrically distributed service times (with a fixed service probability), whereas those in the third node are deterministic, with an integer service-time parameter.

The deterministic servers in the third node are prone to faults. The probability of a job turning out faulty is inversely related to that node's service time. A job that has received faulty service is sent back to the first node to be reprocessed as a fresh job. If the destination queue at the first node is full, the corresponding server stalls.
The parameter set for this system comprises the three queue capacities, the two server counts, and the two deterministic service times, for a total of seven integer parameters, each taking values between fixed bounds. The arrival probability and the geometric service probability are kept fixed. Let the objective involve the expected value of the long-run average throughput of the system (estimated using simulation). The throughput can be improved directly by improving the value of each parameter (that is, increasing the buffer sizes and the number of servers and reducing the service times). However, in most real-world applications improving a parameter value also incurs a certain cost. To model this trade-off, we define a synthetic cost function whose value increases with increasing buffer sizes and numbers of servers and decreases with increasing service times. (17)
The objective function to be minimized is the weighted sum of the normalized cost and the negated normalized throughput. (18)
Since an exhaustive evaluation of the objective over all design points is infeasible, our goal is to find the best solution possible within a fixed computational budget.
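A sketch of how Equations (17) and (18) might look in code. The parameter names (C1–C3 for capacities, K2 and K3 for server counts, D1 and D3 for deterministic service times), the weights and the exact functional forms are our hypothetical stand-ins, since only the qualitative shape of the cost is fixed by the text:

```python
def synthetic_cost(theta, w_buf=1.0, w_srv=2.0, w_spd=1.0):
    """Stand-in for Equation (17): grows with buffer sizes and server
    counts, falls with the deterministic service times."""
    C1, C2, C3, K2, K3, D1, D3 = theta
    return (w_buf * (C1 + C2 + C3) + w_srv * (K2 + K3)
            + w_spd * (1.0 / D1 + 1.0 / D3))

def objective(theta, throughput, cost_scale, thr_scale, w=0.5):
    """Stand-in for Equation (18): weighted sum of the normalized cost
    and the negated normalized throughput (lower is better)."""
    return (w * synthetic_cost(theta) / cost_scale
            - (1.0 - w) * throughput / thr_scale)
```

A higher throughput lowers the objective, while larger buffers and more servers raise it through the cost term, capturing the trade-off described above.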
5.2 Randomization-based Embedding
To solve this optimization problem, we first embed the discrete parameter space into a continuous one by randomizing each parameter in the model using the scheme described in Section 4. The randomization settings (listed in Table 1) were chosen via a rough tuning of the interpolation curves along each parameter axis.
Table 1: Randomization settings (two template constants per parameter) for each of the seven parameters, in order: (1, 2), (1, 1), (1, 4), (1, 1), (1, 1), (1, 1), (1, 2).
Figure 11 shows the resulting interpolation plotted along arbitrary two-dimensional slices of the 7-dimensional parameter space. The plots were obtained by sweeping two parameters at a time (in small steps) while keeping the other parameter values fixed. The plots indicate that in the observed regions of the domain, the interpolation is reasonably smooth and suited to the application of descent-based optimization methods.
In Table 2 we list the computational overheads contributed by the additional calls to the random number generator as successive parameters in the model are embedded. The overheads were measured relative to the fully discrete model by successively randomizing each parameter and assigning it a non-integer value. The values in Table 2 show that the overheads are a fraction of the total simulation time. This is in contrast to spatial interpolation methods, where the computational cost of interpolation grows as integer multiples of the simulation time.

Parameters randomized | Time per simulation | Overhead
0 | 0.2173 | 0
1 | 0.2294 | 5.59 %
2 | 0.2304 | 6.06 %
3 | 0.2302 | 5.96 %
4 | 0.2447 | 12.65 %
5 | 0.2598 | 19.58 %
6 | 0.2701 | 24.31 %
7 | 0.2880 | 32.53 %

Table 2: Computational overhead of the embedding for the 7-parameter queueing network in Figure 10.
5.3 Optimization
Over the randomization-based embedding, we apply two continuous optimizers: COBYLA [18, 19] and SPSA [6]. Both optimizers avoid an explicit computation of numerical derivatives along each parameter axis and are thus suited to high-dimensional problems. Further, both methods tolerate noisy objective function evaluations. COBYLA is a trust-region method that, at each step, builds a linear model of the objective by interpolation at the vertices of a simplex in the parameter space. SPSA is a descent-based method that approximates the gradient at each step using only two objective function evaluations, regardless of the dimensionality of the parameter space.
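SPSA's two-evaluation gradient estimate can be sketched as follows (a minimal self-contained illustration using the practical gain exponents suggested in [23]; the constants and the test function are ours, and this is not the NoisyOpt implementation):

```python
import random

def spsa_minimize(f, x0, iters=300, a=0.2, c=0.1, seed=0):
    """Minimal SPSA: each iteration estimates the gradient from only two
    evaluations of f along a random +/-1 (Rademacher) perturbation,
    independent of the dimension of x."""
    rng = random.Random(seed)
    x = list(x0)
    for k in range(1, iters + 1):
        ak = a / (k + 20) ** 0.602     # decaying gain sequences with the
        ck = c / k ** 0.101            # exponents suggested in [23]
        delta = [rng.choice((-1.0, 1.0)) for _ in x]
        f_plus = f([xi + ck * d for xi, d in zip(x, delta)])
        f_minus = f([xi - ck * d for xi, d in zip(x, delta)])
        ghat = (f_plus - f_minus) / (2.0 * ck)
        # component-wise estimate: divide the common difference by delta_i
        x = [xi - ak * ghat / d for xi, d in zip(x, delta)]
    return x

# Example: a 5-dimensional quadratic with its minimum at (1, ..., 1).
f = lambda x: sum((xi - 1.0) ** 2 for xi in x)
x_opt = spsa_minimize(f, [0.0] * 5)
```

Note that each iteration costs two function evaluations whether x has 2 components or 200, which is what makes the method attractive for high-dimensional problems.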
While discrete-parameter variants of SPSA exist [7], [20, Chapter 9], COBYLA cannot be applied directly in the discrete-parameter case. This illustrates the utility of our embedding technique, which makes it possible to apply existing continuous optimizers directly to discrete-parameter simulation optimization problems. We compare the performance of both optimizers against a simple discrete-space variant of SPSA [7] in which each objective function evaluation is restricted to points in the original discrete domain. While the focus of this work is the embedding technique rather than specific optimization methods, we present this comparison as an evaluation of the embedding's utility.
We use the implementation of COBYLA from Python's SciPy library [21] with the settings and . For SPSA, we use an implementation from the NoisyOpt library [22] with the default step-size schedules (suggested in [23]). The step-size parameter is identical between the continuous and discrete SPSA variants. For each method, we perform optimization runs from distinct, randomly chosen initial points. The set of initial points is fixed and common across all three optimization methods. Each optimization run is limited to objective function evaluations. At each evaluation, the system throughput is measured using a single simulation of the randomized model (of length slots) and the cost is computed analytically using Equation (17). At the end of each run, we round the solution to the nearest integer point and record the objective value at that point.
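A single run over the embedding can be sketched as follows (the objective here is a stand-in quadratic rather than the queueing model, and the tolerance and bound values are illustrative; in the case study, each evaluation would instead run one simulation of the randomized model):

```python
import numpy as np
from scipy.optimize import fmin_cobyla

# Stand-in for the objective over the continuous embedding; assumed to have
# its (continuous) optimum at (2.3, 1.7) purely for illustration.
def objective(x):
    return float(np.sum((np.asarray(x) - np.array([2.3, 1.7])) ** 2))

# Inequality constraints g(x) >= 0 keeping the search inside the box [0, 10]^2.
cons = [lambda x: x[0], lambda x: x[1],
        lambda x: 10.0 - x[0], lambda x: 10.0 - x[1]]

x_cont = fmin_cobyla(objective, [5.0, 5.0], cons,
                     rhobeg=1.0, rhoend=1e-4, maxfun=1000)

# As in the case study, round the continuous solution back to the nearest
# integer design point before reporting it.
x_int = np.rint(x_cont).astype(int)
```

The rounding step is what maps the continuous solution back into the original discrete parameter space; the objective is then re-evaluated at that integer point.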
          COBYLA    SPSA      Discrete-SPSA
best      0.7130    0.7108    0.6842
avg       0.5240    0.1994    0.1964
stddev    0.2930    0.3481    0.3358
          52.7      1000      1000
          0.15      2.94      2.33
Table 3 presents the performance of the three methods measured across the optimization runs. We observe that COBYLA performs best, both in terms of solution quality and convergence rate. The continuous-space SPSA performs better than its discrete-parameter variant. The solutions are well-clustered: among the top solutions found by COBYLA, all have the parameter values , , and . These results indicate that existing continuous optimizers can be applied effectively over the embedding and that their performance compares favourably against a direct discrete-space search.
6 Conclusions
We have presented a simple and computationally efficient technique by which discrete parameters in a simulation optimization problem can be embedded into a continuous space, enabling the direct application of continuous-space optimization methods. The technique is based on a randomization of the simulation model and is applicable to problems where the objective function is a long-run average measure. We described in detail the application of this technique to discrete-time queues for embedding queue capacities, numbers of servers and server-delay parameters, and empirically showed that the technique can produce smooth interpolations. Further, the interpolation can be tuned by varying the shape of the coefficient functions. An analytical means of arriving at the best randomization scheme (rather than tuning) is an interesting problem for future research.
Existing continuous optimizers can be applied effectively over the randomization-based embedding to find good solutions. We demonstrated this via an optimization case study of a queueing network with discrete parameters. A randomization of the model produced smooth embeddings of the objective function with low computational overhead, and two continuous-space optimizers (COBYLA and SPSA) applied over this embedding performed well in comparison to a direct discrete-space search.
In summary, this work establishes the randomization-based technique as an efficient means of embedding discrete-parameter queueing systems into continuous space for simulation-based optimization over a large number of parameters. A detailed comparison across multiple continuous optimizers applicable over such an embedding could be the subject of future investigation, as could the application of the embedding technique to other kinds of systems, such as inventory models.
References
 [1] C. H. Chen and L. H. Lee, Stochastic Simulation Optimization: An Optimal Computing Budget Allocation, 1st ed. River Edge, NJ, USA: World Scientific Publishing Co., Inc., 2010.
 [2] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by Simulated Annealing,” SCIENCE, vol. 220, no. 4598, pp. 671–680, 1983.
 [3] T. M. Alkhamis, M. A. Ahmed, and V. K. Tuan, “Simulated Annealing for Discrete Optimization with Estimation,” European Journal of Operational Research, vol. 116, no. 3, pp. 530 – 544, 1999.
 [4] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, 1st ed. Boston, MA, USA: AddisonWesley Longman Publishing Co., Inc., 1989.
 [5] F. Glover and M. Laguna, Tabu Search. Norwell, MA, USA: Kluwer Academic Publishers, 1997.
 [6] J. C. Spall, “Multivariate Stochastic Approximation Using a Simultaneous Perturbation Gradient Approximation,” IEEE Transactions on Automatic Control, vol. 37, no. 3, pp. 332–341, 1992.
 [7] J. E. Whitney, L. I. Solomon, and S. D. Hill, “Constrained Optimization over Discrete Sets via SPSA with Application to Nonseparable Resource Allocation,” in Proceedings of the 2001 Winter Simulation Conference, vol. 1, 2001, pp. 313–317.
 [8] S. Bhatnagar and H. J. Kowshik, “A Discrete Parameter Stochastic Approximation Algorithm for Simulation Optimization,” SIMULATION, vol. 81, no. 11, pp. 757–772, 2005.
 [9] E. Lim, “Stochastic Approximation over Multidimensional Discrete Sets with Applications to Inventory Systems and Admission Control of Queueing Networks,” ACM Trans. Model. Comput. Simul., vol. 22, no. 4, pp. 19:1–19:23, Nov. 2012.
 [10] H. Wang and B. W. Schmeiser, “Discrete Stochastic Optimization using Linear Interpolation,” in 2008 Winter Simulation Conference. IEEE, Dec. 2008.
 [11] G. Hazari, M. P. Desai, and G. Srinivas, “Bottleneck Identification Techniques Leading to Simplified Performance Models for Efficient Design Space Exploration in VLSI Memory Systems,” in 2010 23rd International Conference on VLSI Design. IEEE, 2010.
 [12] N. V. Karanjkar and M. P. Desai, “An Approach to Discrete Parameter Design Space Exploration of Multicore Systems Using a Novel Simulation Based Interpolation Technique,” in Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), 2015 IEEE 23rd International Symposium on, Oct 2015, pp. 85–88.
 [13] S. Bhatnagar, V. K. Mishra, and N. Hemachandra, “Stochastic Algorithms for Discrete Parameter Simulation Optimization,” IEEE Transactions on Automation Science and Engineering, vol. 8, no. 4, pp. 780–793, Oct 2011.
 [14] P. J. Schweitzer, “Perturbation Theory and Finite Markov Chains,” Journal of Applied Probability, vol. 5, no. 2, pp. 401–413, 1968.
 [15] A. S. Alfa, Applied DiscreteTime Queues, 2nd ed. Springer Publishing Company, Incorporated, 2015.
 [16] H. Bruneel, “Performance of discretetime queueing systems,” Computers & Operations Research, vol. 20, no. 3, pp. 303–320, 1993.
 [17] N. Karanjkar. GitHub repository for the embedding experiments. [Online]. Available: https://github.com/NehaKaranjkar/Embedding
 [18] M. J. D. Powell, A Direct Search Optimization Method That Models the Objective and Constraint Functions by Linear Interpolation. Dordrecht: Springer Netherlands, 1994, pp. 51–67. [Online]. Available: https://doi.org/10.1007/9789401583305_4
 [19] M. Powell, “On Trust Region Methods for Unconstrained Minimization without Derivatives,” Mathematical Programming, vol. 97, no. 3, 2003.
 [20] S. Bhatnagar, H. Prasad, and L. Prashanth, Stochastic Recursive Algorithms for Optimization. London: Springer London, 2013.
 [21] Implementation of COBYLA in Python’s scipy.optimize library. [Online]. Available: http://docs.scipy.org/doc/scipy0.14.0/reference/generated/scipy.optimize.fmin_cobyla.html
 [22] Noisyopt: A Python library for optimizing noisy functions. [Online]. Available: https://noisyopt.readthedocs.io/en/latest/
 [23] J. C. Spall, “Implementation of the simultaneous perturbation algorithm for stochastic optimization,” IEEE Transactions on Aerospace and Electronic Systems, vol. 34, no. 3, pp. 817–823, Jul 1998.