Abstract

We consider a generalized processing system having several queues, where the available service rate combinations are fluctuating over time due to reliability and availability variations. The objective is to allocate the available resources, and corresponding service rates, in response to both workload and service capacity considerations, in order to maintain the long term stability of the system. The service configurations are completely arbitrary, including negative service rates which represent forwarding and service-induced cross traffic. We employ a trace-based trajectory asymptotic technique, which requires minimal assumptions about the arrival dynamics of the system.

We prove that cone schedules, which leverage the geometry of the queueing dynamics, maximize the system throughput for a broad class of processing systems, even under adversarial arrival processes. We study the impact of fluctuating service availability, where resources are available only some of the time, and the schedule must dynamically respond to the changing available service rates, establishing both the capacity of such systems and the class of schedules which will stabilize the system at full capacity. The rich geometry of the system dynamics leads to important insights for stability, performance and scalability, and substantially generalizes previous findings.

The processing system studied here models a broad variety of computer, communication and service networks, including varying channel conditions and cross-traffic in wireless networking, and call centers with fluctuating capacity. The findings have implications for bandwidth and processor allocation in communication networks and workforce scheduling in congested call centers. By establishing a broad class of stabilizing schedules under general conditions, we find that a scheduler can select the schedule from within this class that best meets their load balancing and scalability requirements.

Cone Schedules for Processing Systems in Fluctuating Environments

KEVIN ROSS (School of Engineering, University of California Santa Cruz; kross@soe.ucsc.edu)

NICHOLAS BAMBOS (Electrical Engineering and Management Science & Engineering, Stanford University; bambos@stanford.edu)

GEORGE MICHAILIDIS (Statistics and Electrical Engineering & Computer Science, The University of Michigan; gmichail@umich.edu)


Keywords: random environment, stability, adversarial queueing theory, dynamic scheduling, throughput maximization.

1 Introduction

We consider a processing system comprised of infinite capacity queues, indexed by , operating in a time-varying environment which fluctuates amongst environment states . In each environment state, only a subset of the service configurations are available. The process scheduler selects a service configuration vector from the environment-dependent available set . Upon selection, if then queue is emptied at rate , and if then the queue is filled at the corresponding rate. The available service configurations can be completely arbitrary, including vectors with any combination of positive and negative components.

A key question addressed in this study is which of the available service configurations should be selected, given the system workload and environment state histories, so as to maximize its throughput. We introduce a family of resource allocation policies - called Cone Schedules - which are shown to stabilize the system under the maximal possible traffic load, even if that load is designed by an adversary to destabilize the system whenever possible.

This canonical processing model captures several applications in computing and communication systems, including wireless networks, packet switches and call centers. The main characteristic of these applications is that the service rates across multiple queues are coupled through operational constraints, giving rise to the available service configurations. Service rate availability (corresponding to the environment states) is affected by staff scheduling in call centers, congestion dynamics in wireless networks and scheduled or unscheduled outages due to maintenance or reliability issues in other processing systems.

1.1 Related Work

The trace-based stability analysis technique employed in this paper relates to the study of adversarial queueing networks exemplified in [Andrews et al., 2001] and [Borodin et al., 2001]. This approach avoids imposing a probabilistic framework on the arrival traffic, and instead analyzes the performance of a queueing network under an adversarial arrival traffic trace, designed to stress the system as much as possible. These works describe a queueing network as universally stable when the total workload of the system can be shown to be bounded under any deterministic or stochastic adversary’s arrival trace. In effect, they find the worst-case behavior of a network by treating it as a game between the schedule (protocol) and the worst possible arrival trace (adversary). They limit the absolute arrival volume within a finite interval, but do not require it to follow any stationary distribution or apply any further restrictions. This concept builds upon earlier work called leaky-bucket analysis in [Cruz, 1991a] and [Cruz, 1991b].

Adversarial models have been used in packet networks before, such as [Borodin et al., 2001] which considers a fixed-path packet network. Some more general queueing systems, including multiclass queueing networks, are studied in [Tsaparas, 2000], with generalized service times and heterogeneous customers. Adversarial methods have also been employed to study multi-hop network stability in [Kushner, 2006]. In [Anshelevich et al., 2002], adversarial models are used to analyze load-balancing algorithms in a distributed setting using a token-based system on a network with limited deviations from the average load. While none of these study the same network scheduling setting of this paper (to our knowledge they have only considered fixed-path networks under time-invariant service environments), each example presents a persuasive argument for the value of network stability analysis in the absence of a well-defined probabilistic framework.

A special example of the system described in this paper is a single crossbar packet/cell switch with virtual output queues, used in high speed IP networks. The switch paradigm is the focus of [Ross and Bambos, 2009], and provides a helpful context to develop the cone algorithms. In this switch, cells arriving to each input port get buffered in separate virtual queues, based on the output port they are destined to. The switching fabric can be set to a different connectivity mode in each time slot, matching each input port with a corresponding output port for cell transfer. In this context, Maximum Weight Matching (MWM) has been shown in [McKeown et al., 1999] to maximize the throughput of input queued switches, employing Lyapunov methods for stability analysis, as also in constrained queueing systems studied in [Tassiulas and Ephremides, 1992, Tassiulas, 1995, Tassiulas and Bhattacharya, 2000, Hung and Michailidis, 2011]. In our more general service model, MWM corresponds to maximizing , where the weight is the cell workload of queue or a related congestion measure, and the vectors represent the crossbar configurations.

More general results on the stability of MWM algorithms, using fluid scaling methods, were later obtained in [Dai and Prabhakar, 2000], and on a generalized switch model in [Stolyar, 2004]. Stability in networks of switches was studied in [Marsan et al., 2005] and [Leonardi et al., 2005]. [Dai and Lin, 2005] and [Dai and Lin, 2008] considered maximum pressure policies by modeling fluid flows for types of processing networks. Their work can be seen as a generalization of the policies which maximize where some of the service rates are negative because the available configurations involve forwarding workload from one queue to another downstream queue. [Neely et al., 2003] studied broader optimal controls for generalized (wireless) network models that involve joint scheduling, routing and power allocation. All of these have significantly advanced the theory of the stability of scheduling rules which allocate service to queues based on a weighted-matching approach, and utilize a probabilistic framework to apply fluid limit or heavy-traffic analysis.

Instead of using fluid scaling methods (primarily analytic, involving passage to a limit regime) to establish the results, we opt to use an alternative direct and primarily geometric approach in this work, which seems to have broader applicability to other queueing systems and reveals useful geometric insight regarding their dynamics. The trace-based asymptotic analysis employed here was introduced in [Armony and Bambos, 2003], where the maximum weight matching algorithms were studied and it was shown that maximal throughput is guaranteed under very general arrival process assumptions. The method was also employed in [Bambos and Michailidis, 2004] where randomly fluctuating service levels were studied. In that case the service rate assignments are made without full knowledge of service availability, as opposed to the processing systems studied here where service allocation decisions are made in response to availability. Like the adversarial queueing models, there is no probabilistic framework required, but unlike the traditional adversarial models, there is also no short-term restriction on arrival bursts in finite time, but just a long-term traffic load restriction. This leads to more general stability results, but eliminates the possibility of tighter bounds on other performance metrics. For example, under such general assumptions there can be no guaranteed finite bound on the total workload, or even the expected workload in the system.

1.2 Results Overview

We classify the stability region for these processing systems with fluctuating service availability. We find that rate stability for these general processing systems can be guaranteed by the class of cone schedules, for any arbitrary arrival process that can possibly be stabilized. Cone schedules use the available service vector with maximal projection on the projected workload vector , for every matrix that is positive-definite, has negative or zero off-diagonal elements, and is symmetric. This substantially generalizes a similar result in [Ross and Bambos, 2009], where the same class of algorithms was shown to maximize throughput for the special case of packet switches.
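The three matrix conditions named above (symmetric, non-positive off-diagonal entries, positive-definite) can be checked mechanically. The following is a minimal sketch, not taken from the paper: the function name `is_cone_matrix` and the symbol `B` for the matrix are assumptions made here for illustration, and positive-definiteness is tested by attempting a Cholesky factorization.

```python
import math

def is_cone_matrix(B, tol=1e-12):
    """Return True if the square matrix B (a list of rows) is symmetric,
    has off-diagonal entries <= 0, and is positive-definite.
    The name and the symbol B are illustrative assumptions."""
    n = len(B)
    if any(len(row) != n for row in B):
        return False
    for i in range(n):
        for j in range(n):
            if abs(B[i][j] - B[j][i]) > tol:
                return False  # not symmetric
            if i != j and B[i][j] > tol:
                return False  # positive off-diagonal entry
    # A symmetric matrix is positive-definite exactly when a Cholesky
    # factorization with strictly positive diagonal exists.
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = B[i][i] - s
                if d <= tol:
                    return False  # not positive-definite
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (B[i][j] - s) / L[j][j]
    return True
```

For instance, the identity matrix passes, as does `[[2, -1], [-1, 2]]`, whereas `[[1, 0.5], [0.5, 1]]` fails because of its positive off-diagonal entries.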

In classifying the stability region, we show how the combination of service vectors in each environment impacts the overall capacity of the system, beyond the long term availability of each service vector. The geometric framework for stability aids the intuition and analysis significantly. Because of environment fluctuations, one may expect that a scheduling rule needs to account for future and past states. However, we find that cone schedules, which respond only to the current workload, are able to guarantee stability for any arrival rate within the stability region.

The service rates in this paper are allowed to be completely arbitrary, in contrast to previous results using the trace-based analysis which only applied to positive-service switches. This captures cross-traffic and forwarding between queues, because the selected service vector may induce additional workload to the system, in addition to the external arrival process. Further, in this work time is continuous, and arbitrarily large arrival bursts can be handled at arbitrarily small time intervals. This is more general than previous models where arrivals and decisions were restricted to timeslots.

From an architectural point of view, the geometric approach to the scheduling problem provides key practical design leads. Specifically, the conic representation (Section 4) of cone schedules leads to scalable implementations in switching systems. Further, varying the elements of matrix , we can generate a very rich family of cone schedules that implement a soft coupled priority scheme (and coupled load balancing) across the various queues, managing delay tradeoffs between them. The schedules are also robust to any sublinear perturbation such as delayed or flawed state information.

The remainder of the paper proceeds as follows. In Section 2, we introduce the model and system dynamics. Section 3 describes the throughput capacity or stability region of these networks, and in Section 4 we introduce the family of Cone Schedules and their geometry. Stability and performance implications are discussed in Sections 5 and 6 respectively. We conclude in Section 7.

2 The Processing Structure

Let be the total workload that arrives to queue in the time interval ; that is, is the instantaneous workload arrival rate at time . The traffic trace is a (deterministic) function, which may have discontinuities and even -jumps for each . The overall (vector) instantaneous traffic rate is at time and the traffic trace is . We assume that the (long-term) traffic load of the trace [Footnote 4: Throughout this study we employ the notation .],

(2.1)

is well-defined on the traffic trace . Correspondingly, we define the set of traffic traces of load ,

(2.2)

restricting our attention in this paper to traffic traces of well-defined load. A variety of natural arrival processes are included. For example, models jobs of service requirement arriving at times to queue . In this case, is zero between consecutive -jumps. In general, there could be positive instantaneous workload arrival rate between consecutive -jumps, which would represent a continuous inflow of work.

No further restrictions are placed on the arriving traffic trace. It may be generated by an underlying stochastic process, or even an adversary specifically designed to destabilize the system whenever possible.

The arriving workload is queued up in the queues , which are assumed to be of infinite capacity. Let be the workload (total workload or service requirement) in queue at time and

the overall (vector) workload.

The processing system operates in a fluctuating environment, which can be in one of distinct states at any point in time, indexed by . Let be the environment state at time and the overall environment trace over time. It is assumed that the proportion of time the environment trace spends in each state is well-defined, that is,

with . Correspondingly, we define the set of environment traces with time proportions as

(2.3)

and restrict our attention in this paper to environment traces that have well-defined time proportions. Finally, naturally corresponds to the degenerate case of a constant (non-fluctuating) environment.
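To make the notion of time proportions concrete, the sketch below computes the fraction of a finite horizon that a piecewise-constant environment trace spends in each state. It is only an illustration under assumptions made here: the representation of the trace as a sorted list of `(switch_time, state)` pairs and the function name `time_proportions` are hypothetical, and a finite horizon only approximates the long-run limit required in the text.

```python
def time_proportions(switches, horizon, num_states):
    """Fraction of [0, horizon] spent in each environment state.

    switches: list of (time, state) pairs sorted by time, starting at time 0;
              the environment stays in `state` until the next switch time.
    This trace representation is an illustrative assumption.
    """
    spent = [0.0] * num_states
    for k, (t, state) in enumerate(switches):
        if t >= horizon:
            break
        # The state persists until the next switch or the end of the horizon.
        end = switches[k + 1][0] if k + 1 < len(switches) else horizon
        spent[state] += min(end, horizon) - t
    return [s / horizon for s in spent]
```

For example, a trace that starts in state 0, switches to state 1 at time 3, back to state 0 at time 4, and to state 1 again at time 9 spends 8 of the first 10 time units in state 0 and 2 in state 1.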

When the environment is in state , a (nonempty) set of service vectors becomes available to the system manager, who can select a service vector at any point in time to operate the system. Each is a -dimensional vector

where is the drain (or fill, see below) rate of queue when the service vector is used. For example, in a simple system with two queues (), a service vector would serve (drain) queue 1 at rate 1.35 and queue 2 at rate 2.17 (work units per time unit). This is the standard way of viewing service vectors.

In this general model, however, we also allow for negative ‘service’ rates, actually corresponding to traffic workload ‘feed’ rates, as explained below. In the previous simple example of two queues, a service vector would serve (drain) the first queue at rate 1.2, but feed workload to the second queue at rate 0.8, filling it up.

The motivation to allow for negative components in the service vectors comes from the need to model environmental (background) cross-traffic sharing the queue buffers with the primary (foreground) traffic . This cross-traffic depends explicitly on the service vector used, and implicitly on the environment state through the set where the service vector should be chosen from. When service vector is used with for some queue , this corresponds to cross-traffic workload fed into queue at constant rate , in addition to the primary traffic workload . It is easy to see that can be interpreted as the ‘net’ cross-traffic through the queue; that is, workload could be fed into queue at rate and removed (served) at rate , with the net cross-traffic load fed into the queue being .

One special case related to this model is a feed-forward network. A service vector representing the transfer of workload from one upstream queue to another downstream queue would be represented with and all other . The model here could handle the aggregate of many transfers, as well as gain and loss in the system at any queue. The concept of cross-traffic considered here is more general, requiring no restrictions on the physical structure of the network. Feed-forward networks require some additional assumptions and are not the primary focus of this paper, but have been studied extensively elsewhere, such as [Dai and Lin, 2005].

Note that the above environmental cross-traffic is far less ‘innocuous’ than simply allowing the primary traffic to be modulated by the environment state. [Footnote 5: Actually, the environment could also modulate the primary traffic trace in the following sense. There is a collection of traffic traces one for each environment state . When the environment is in state , the traffic driven into the system is selected from . Therefore, the overall traffic trace is simply . Hence, this basically reverts to the standard model (as long as the limit exists) and this is why we do not treat this case explicitly.] Indeed, the cross-traffic depends on the choice of service vector , hence, the scheduling decisions actively influence it. The environment plays only a secondary role by defining , hence, restricting the range of scheduling choices. Actually, the introduction of cross-traffic is shown to have significant implications on the stability behavior of the scheduling policies studied later.

The sets may be overlapping, that is, a service vector may be available under one or more environment states. Let . It is assumed that each service vector set is complete, that is, for each and any

(2.4)

Hence, any ‘sub-vector’ of a service vector in (i.e. with one or more positive components reduced to zero) is also a service vector in . [Footnote 6: Note that if any service vector in has no negative components, then the zero vector must be in as a sub-vector of the former vector, due to completeness. But if each service vector in has at least one negative component, the zero vector does not necessarily have to be in unless it is by design.] The reason for requiring completeness of each is to accommodate the following situation: when some queues become empty and cease receiving service, the resulting effective service vector is a feasible one. Under the latter perspective, the imposed assumption (2.4) is a natural one indeed. As seen below, it allows us to naturally handle schedules which provide zero service rate to empty queues.
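The completeness requirement can be illustrated by a small routine that closes a set of service vectors under the sub-vector operation described above, i.e. resetting any subset of strictly positive components to zero. This is a sketch under assumptions: the function name `complete` and the representation of vectors as tuples are choices made here, not part of the paper.

```python
from itertools import combinations

def complete(service_vectors):
    """Return the smallest complete set containing the given vectors.

    Every vector obtained by resetting any subset of strictly positive
    components to zero is added; negative (feed) components are never altered.
    The function name and tuple representation are illustrative assumptions.
    """
    closed = set()
    for s in service_vectors:
        s = tuple(s)
        positive = [q for q, rate in enumerate(s) if rate > 0]
        for r in range(len(positive) + 1):
            for zeroed in combinations(positive, r):
                closed.add(tuple(0 if q in zeroed else rate
                                 for q, rate in enumerate(s)))
    return closed
```

Applied to the two example vectors used earlier in this section, `complete([(1.35, 2.17)])` yields four vectors including the zero vector, while `complete([(1.2, -0.8)])` yields only `(1.2, -0.8)` and `(0, -0.8)`; the zero vector is absent in the second case because the negative component cannot be removed.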

The key issue is choosing the service vector at time , when the environment is in state and the vectors are available to choose from. In general, the decision can be based on the observable histories of the workload , the environment , and prior service choices . The scheduling policy is the overall trace of service vector choices . Our primary objective is to design schedules which maximize the system throughput (keep the system stable under the maximum possible load ), while being robust and utilizing minimum information, like the current workload and environment states, with no knowledge of the actual load and the environment time proportions . We elaborate on such issues later.

We are interested in natural schedules that never apply positive service to empty queues. That is, whenever the scheduler chooses a service vector with . This is possible because we have assumed that the sets are complete. Therefore, we can write

(2.5)

without having to explicitly ‘compensate’ for any idling time.
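As a rough illustration of these dynamics, the sketch below advances the workload over one short interval: arriving work is added, positive service components drain a queue, negative components feed it, and no positive service is applied to an empty queue. It is a discrete-time approximation of the continuous-time model, and the function name `step`, its arguments, and the clamp at zero are assumptions introduced here for illustration.

```python
def step(workload, arrivals, service, dt):
    """Advance the workload vector by one short interval of length dt.

    workload: current workload per queue (non-negative floats)
    arrivals: external work arriving to each queue during the interval
    service:  chosen service vector; positive entries drain, negative entries feed
    All names are illustrative assumptions, not notation from the paper.
    """
    new_workload = []
    for x, a, s in zip(workload, arrivals, service):
        # A natural schedule applies no positive service to an empty queue.
        rate = 0.0 if (x <= 0.0 and s > 0.0) else s
        # The clamp at zero is a discretization safeguard, not part of the model.
        new_workload.append(max(0.0, x + a - rate * dt))
    return new_workload
```

With the earlier example vector that drains queue 1 at rate 1.2 and feeds queue 2 at rate 0.8, an empty first queue stays empty while the second queue grows by 0.8 work units per unit of time.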

3 The Stability Issue

In the interest of robustness of the results, we employ the ‘lightest’ possible (see below) concept of stability, that is, rate stability [Bambos and Walrand, 1993]. Specifically, we call the system stable iff

(3.1)

Note that from (2.5) and (2.1), rate-stability implies that . Moreover, when the traffic trace involves pure ‘job-arrivals’ (-jumps) with zero workload arrival rate between them, then rate-stability (3.1) implies that the long-term job departure rate from each queue is equal to the long-term job arrival rate [Armony and Bambos, 2003]. Therefore, there is flow conservation through the system and the inflow at each queue is equal to the outflow. On the contrary, when the system is unstable there is an inflow-to-outflow deficit, which accumulates in the queues. This is consistent with engineering intuition and, in that sense, the concept of rate-stability is quite natural. Of course, it can be further tightened by imposing progressively heavier statistical assumptions on the traffic and environment traces. We resist doing that at this point, in order to preserve the generality of the results and keep them as robust and ‘assumptions-agnostic’ as possible.

Definition 3.1 (Stability Region)

We define formally the stability region of the system as the set of traffic loads for which there exists a scheduling policy under which the system is rate-stable (3.1) for all traffic traces with and all environment traces with .

As shown below, the universal stability region can be characterized as

(3.2)

The intuition is that is in the stability region if it is dominated (covered) by a convex combination of the service vectors , induced under the various service vectors in . Thus, is the ‘weighted sum’ of the various ‘stability regions’ generated by the individual sets for each state of the environment.
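The weighted-sum intuition can be made concrete with a small worked example. Everything in it is a hypothetical illustration: the two-queue service vectors, the equal time proportions, and the symbols $\rho$, $\pi_e$ and $\mathcal{S}_e$ are assumptions chosen here and are not the example treated later in this section.

```latex
% Hypothetical data: two environment states, each occupied half of the time.
\pi_1 = \pi_2 = \tfrac12, \qquad
\mathcal{S}_1 = \{(2,0),\,(0,0)\}, \qquad
\mathcal{S}_2 = \{(0,2),\,(0,0)\}.

% Convex combinations within each state, weighted by the time proportions:
\rho \;\le\; \tfrac12\,(2a,\,0) + \tfrac12\,(0,\,2b) \;=\; (a,\,b),
\qquad a,\,b \in [0,1],

% so a load is covered exactly when
0 \le \rho_1 \le 1, \qquad 0 \le \rho_2 \le 1 .

% If instead both non-zero vectors were available all of the time:
\rho \;\le\; a\,(2,0) + b\,(0,2), \quad a,\,b \ge 0,\; a + b \le 1
\;\;\Longrightarrow\;\; \rho_1 + \rho_2 \le 2 .
```

Under these assumptions the load $(1.5, 0.5)$ is covered in the second setting but not in the first.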

If and were known in advance and could be computed, then selecting each mode for a fraction of the time while the system is in environment state would keep the system stable. This could be achieved through round-robin or randomized algorithms. A scheduling algorithm which maintains stability (3.1) for any is referred to as throughput maximizing. However, we are primarily interested in adaptive scheduling schemes which maintain stability (3.1) for all , without actual prior knowledge of or . The cone schedules defined below are shown to provide such universal stability for any traffic load in , while being agnostic to particulars of the traffic and environment traces and ; they respond only to current workload and environment state.

In general, the stability behavior of scheduling rules could require the arrival trace to satisfy stronger conditions than those above. For example, restricting the study to Markovian or stationary arrival processes, or disallowing mixing, may provide special cases of stability. Instead, we allow the arrival traffic trace and environment trace to be designed by an adversary to stress the system. Consider for example an arrival trace where arrivals to queue are deliberately correlated to the environment states when cannot be served at maximum capacity. Even further, an adversarial trace may push arrivals to queues in a state-dependent way which responds to the scheduling rules themselves. These are very difficult to capture by a natural probabilistic framework, but are simply treated as particular traffic traces here.

To motivate the definition of the stability region for the processing system under consideration, we examine first the case where for all and there is only one environment state (, no environment fluctuation); that is, service is always non-negative and all service vectors are available at every point in time. Under the trace-based perspective employed in this paper, it is known [Armony and Bambos, 2003] that for any load in the region

the system can be made rate stable with an appropriate scheduling rule. The non-negative parameters are essentially proportional weights, which are chosen so that the load vector is component-wise dominated by the weighted linear combination of the service vectors.

Extending this ‘geometric’ stability perspective to allow cross traffic and varying environment states is not a trivial task. Intuition may suggest that the stability region in networks in fluctuating environments should be reduced according to how often each mode is available. Consider the following simple network to illustrate that the distribution of environment states is critical to stability. Take a 2-queue network with three service vectors, . Clearly, if all vectors are available all the time, by employing always the system can accommodate any input vector . On the other hand, if there are two environment states with service vector sets and with , then the system can accommodate any input vector satisfying the conditions , and .

However, a different configuration of the service vector sets, say and with , yields and for stability. Note that although the sets and ensure that each service vector is available for the same portion of time in both scenarios, the relative combinations of the available service vectors change the stability region. We illustrate (and generalize) this perspective in Figure 1.

Figure 1: The stability region. The set of allowable arrival rate vectors is called the stability region . Two separate sets of service vectors are shown in the first two plots, with their respective stability regions if they were the only environment state, and available of the time. The third plot shows the stability region when and . This corresponds to the environment state fluctuating so that of the time, the service vectors from the first group are available, and of the time the service vectors from the second group are available to be scheduled. For any in the region above, there is a convex combination of service modes within the resource sets which would apply a total service rate to each queue which is at least the arrival rate to that queue. For outside there is no such combination. Service modes and are strictly dominated by a convex combination of other service vectors and therefore do not contribute to the stability region (and in fact need not be utilized to maintain stability). Service vectors with negative components such as and above may contribute to the stability region without being inside the stability region itself. The stability region for the combination of environments can be seen to be the weighted sum of the two original stability regions, with care taken to the impact of extreme points with negative components.

We establish first that if , it is impossible to maintain stability and flow conservation in all queues, no matter what scheduling policy one employs. At least one queue will suffer an outflow deficit (compared to its inflow), which will accumulate in the queue and cause its workload to explode linearly in time.

Proposition 3.1 (Instability)

For any arbitrarily fixed traffic trace and environment trace , we have

(3.3)

for at least one queue under any scheduling policy.

Proof: For convenience, we drop the fixed argument from and write its traffic load simply as , and proceed by contradiction. If (3.3) does not hold, then from (2.5) we must have . But then we have

where satisfies and , which satisfies (3.2). We then easily get (arguing by contradiction) that for at least one queue .    

4 Cone Schedules and their Geometry

We focus in this paper on schedules that are workload-aware and resource-aware but not rate-aware; that is, the system’s operator can observe and respond to both the environment state and the workload state , but has no knowledge of the long-term load vector and state probabilities .

In particular, we examine a family of resource allocation policies that are called Cone Schedules and are parameterized by a fixed matrix . These schedules select the service vector that has the maximal projection on , when the workload state is and the environment state is . Specifically:

Definition 4.1 (Cone Schedules)

Given a fixed real matrix , a cone schedule is one that, when the environment is in state and the workload is , selects a service vector in the set

(4.1)

which satisfies whenever . We show that such a vector must be contained in by Proposition 4.1 below. The set is nonempty, but may contain several service vectors in , in which case one is arbitrarily chosen by the cone schedule. Note that

(4.2)

so the chosen is one of maximal projection on amongst those in . Therefore, the service vector chosen by the cone schedule at time is

based on the observed current workload and environment state .

Notice that the maximization ensures that cone schedules follow some important intuition for a scheduling rule. We see that is increasing in and decreasing in for . This will increase whenever comes to dominate other queues. By maximizing this sum, the cone schedules all prefer large positive service rates whenever is large and positive. Thus the schedules will prefer to remove the most workload from the longer queues, and restrict the cross-traffic added to those longer queues. The relationship to performance and load balancing is discussed in Section 6.
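The selection rule can be sketched in a few lines of code. This is a minimal illustration under assumptions, not the authors' implementation: the symbols `B` (the fixed matrix) and the function name `cone_schedule` are introduced here, the available set is passed as a plain list of tuples, and ties are broken in favour of a vector that gives no positive service to an empty queue.

```python
def cone_schedule(available, workload, B, tol=1e-12):
    """Pick a service vector from `available` with maximal inner product
    against the projected workload B @ workload.

    available: list of service vectors (tuples) for the current environment state
    workload:  current workload vector
    B:         fixed square matrix given as a list of rows
    All names are illustrative assumptions.
    """
    n = len(workload)
    # Projected workload: component q is sum_p B[q][p] * workload[p].
    bx = [sum(B[q][p] * workload[p] for p in range(n)) for q in range(n)]

    def score(s):
        return sum(s[q] * bx[q] for q in range(n))

    best = max(score(s) for s in available)
    maximizers = [s for s in available if abs(score(s) - best) <= tol]
    # Among maximizers, prefer one with no positive service on empty queues.
    for s in maximizers:
        if all(not (workload[q] == 0 and s[q] > 0) for q in range(n)):
            return s
    return maximizers[0]
```

As a usage example with two queues, the complete set `[(1, 0), (0, 1), (0, 0)]` and the matrix `[[2, -1], [-1, 2]]`, the workload `(1, 3)` gives projected workload `(-1, 5)`, so the vector serving the second queue is selected; with an empty system the zero vector is returned.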

Proposition 4.1 (Matrices with Negative or Zero Off-Diagonal Elements)

If the cone schedule matrix has negative or zero off-diagonal elements () and the service vector sets are complete for each environment state , then there must exist some for which we have

for each . Thus, for such matrices, the corresponding cone schedules can always select service vectors that provide no positive service rate to an empty queue.

Proof: Given a workload vector such that for some (empty) queue , let us examine the inner product maximized by the cone schedule (4.1) in selecting , that is,

(4.3)

with . Consider the term corresponding to the empty queue in the above sum, that is,

(4.4)

Since , the first term above is automatically zero, irrespective of and . However, since and for each , we see that

(4.5)

Arguing by contradiction, assume that maximizes (4.3) with . But because is assumed to be complete (2.4), the vector also belongs to and has , hence, leads to an equal or greater value of (4.3) because of (4.4) and (4.5). This establishes a contradiction and implies that the set of service vectors that maximize (4.3) must always include one where (provide no positive service rate) for each empty queue (that is, with workload ).

To justify the term ‘cone’ schedule consider the following perspective. Define first the set of workloads for which the cone schedule would choose the service vector when the environment is in state , that is:

for . This is simply the set of workloads that have maximum projection on amongst all other sets in . Note that is a geometric cone because implies that for any positive scalar and . Thus, if belongs to then any up/down-scaling also belongs to it.
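The scaling property behind the cone structure follows from linearity of the inner product. In the short derivation below, the symbols $X$ (workload), $B$ (the fixed matrix), $S$ and $S^{*}$ (service vectors in the available set, written here as $\mathcal{S}_e$) and $\alpha$ (a positive scalar) are notation assumed for illustration.

```latex
% Linearity in the workload argument:
\langle S,\; B(\alpha X)\rangle \;=\; \alpha\,\langle S,\; BX\rangle ,
\qquad \alpha > 0 .

% Multiplying both sides of an inequality by \alpha > 0 preserves it, hence
\langle S^{*}, BX\rangle \;\ge\; \langle S, BX\rangle
\;\;\forall\, S \in \mathcal{S}_e
\quad\Longleftrightarrow\quad
\langle S^{*}, B(\alpha X)\rangle \;\ge\; \langle S, B(\alpha X)\rangle
\;\;\forall\, S \in \mathcal{S}_e .
```

So a service vector that is maximal for a workload remains maximal for every positive rescaling of that workload.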

For each environment state , the cones form a partition of the workload space, that is,

In general, some cones may actually be degenerate (like those corresponding to service vectors in that are fully dominated component-wise by others in ) and several cones may share common boundaries. Observe that the cone schedule can now be geometrically defined as follows:

The cone structure of the sets motivates the name cone schedules.

Figure 2: The cone schedules assign a service vector from by identifying the location of with respect to the cones formed by . This figure shows the cone structure for a system with queues and 4 service vectors for this particular environment. When is in cone , then service vector corresponding to that cone is used. The vector will fluctuate within , switching between service vectors when the arrivals and departures cause to cross a cone boundary, or when the environment state changes. The cone boundaries are influenced by the environment state and the matrix .

When the environment is in state and the workload is in the interior of the non-degenerate cone , then the only service vector that can be used by the cone schedule is . However, if is on the boundary of several adjacent cones (for example, ), then any of the service vectors corresponding to these cones can be used (, or , or ). Therefore, given a workload vector , we want to define the cone it belongs to, which consequently specifies what service vector the cone schedule ought to use. We proceed in this direction below.

To take another perspective, recall that when the environment is in state and the workload is , then the cone schedule chooses a service vector in the set

where any vector may be chosen arbitrarily if there is more than one vector in the set . When is in the interior of the (non-degenerate) cone , then is a singleton and . This follows since the interior of a cone consists of all workload vectors for which the inner product is uniquely maximized by .
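The set of maximizing service vectors can be computed directly. The sketch below is a hedged illustration, assuming the schedule maximizes the inner product of a service vector with the matrix-weighted workload; the function names, the identity matrix `B`, and the three-vector `SERVICE_SET` are hypothetical.

```python
def score(s, b, x):
    """Return <s, B x> for service vector s, matrix b (list of rows), workload x."""
    bx = [sum(b_qr * x_r for b_qr, x_r in zip(row, x)) for row in b]
    return sum(s_q * v_q for s_q, v_q in zip(s, bx))


def maximizing_vectors(service_set, b, x, tol=1e-12):
    """Return every service vector whose score is within tol of the best score."""
    values = [score(s, b, x) for s in service_set]
    best = max(values)
    return [s for s, v in zip(service_set, values) if v >= best - tol]


# Hypothetical two-queue example for a single environment state.
B = [[1.0, 0.0], [0.0, 1.0]]
SERVICE_SET = [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0)]

interior = maximizing_vectors(SERVICE_SET, B, (3.0, 1.0))  # inside one cone
boundary = maximizing_vectors(SERVICE_SET, B, (2.0, 2.0))  # on a shared boundary
print(interior, boundary)
```

For the interior workload the returned list is a singleton, while on the boundary it holds both adjacent service vectors, and the schedule may pick either.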

To cover the general case of being on a cone boundary (perhaps, a common boundary of several cones), we define the ‘surrounding’ cone of the workload vector as

For example, if is on the boundary of and only, then . Note that the above definitions lead to the following equivalence

as well as

for any two workload vectors and environment state . This is illustrated in Fig. 3.

Figure 3: For workload vectors which lie precisely on the boundary of two or more cones, the cone is the union of all of the cones in which include . In contrast to Fig. 2, where was interior to a single cone, the above illustration shows at the boundary of 3 of the cones. In this case includes all the elements of the three different cones. This definition is important in the proof because it captures the workload vectors which share an optimal service vector with .

Note that if then there must exist a service vector for which both is maximized at and is maximized at , and if then no such vector can exist.
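Membership in the surrounding cone reduces to a shared-maximizer test, which can be sketched as follows. This is an illustration under assumptions: the schedule is taken to maximize the inner product of a service vector with the matrix-weighted workload, and the helper names, the matrix `B`, and `SERVICE_SET` are hypothetical.

```python
def score(s, b, x):
    """Return <s, B x> for service vector s, matrix b (list of rows), workload x."""
    bx = [sum(b_qr * x_r for b_qr, x_r in zip(row, x)) for row in b]
    return sum(s_q * v_q for s_q, v_q in zip(s, bx))


def maximizer_indices(service_set, b, x, tol=1e-12):
    """Return the indices of the service vectors with maximal score at x."""
    values = [score(s, b, x) for s in service_set]
    best = max(values)
    return {i for i, v in enumerate(values) if v >= best - tol}


def in_surrounding_cone(y, x, service_set, b):
    """True if some service vector is a maximizer at both x and y."""
    return bool(maximizer_indices(service_set, b, x)
                & maximizer_indices(service_set, b, y))


B = [[1.0, 0.0], [0.0, 1.0]]
SERVICE_SET = [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0)]

# (2, 2) lies on the boundary of two cones, so (5, 1) shares a maximizer with it.
print(in_surrounding_cone((5.0, 1.0), (2.0, 2.0), SERVICE_SET, B))
# (3, 1) is interior to one cone, and (1, 4) lies in the other cone.
print(in_surrounding_cone((1.0, 4.0), (3.0, 1.0), SERVICE_SET, B))
```

The test is symmetric in its two workload arguments, since it only asks whether the two maximizer sets intersect.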

We observe that cannot be on an interior boundary of (the only boundary it could be on is where the cone meets an axis because of the non-negativity constraint). If were on an interior boundary then there must exist a direction vector for which and for an arbitrarily small positive scalar . This means that there exists some service vector for which for all . But since we also have for all . This leads to the inequality

Since the left hand side can be made arbitrarily small this leads directly to a contradiction and we conclude that is indeed on the strict interior of . This observation becomes critical in the proof of stability.

Finally, we define the cone around with respect to all environment states as

The cone , illustrated in Fig. 4, is of course non-empty because belongs to each of the cones . This is the cone of workloads for which, at each environment state , the cone schedule could have selected for the same service vector as for (fixed), that is,

Figure 4: The cone over environments is illustrated. Here, is in the cones and for the two environments and . The cone is the intersection of both of those cones. Since is known to be on the interior of each cone, is also on the interior of .

hence, when , then for each we have , besides of course. Viewed another way,

that is, when , then for each we have that has maximal projection on , besides also having maximal projection on (by definition).

We note that since is strictly on the interior of each cone and there are finitely many environment states in then is strictly on the interior of .
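The cone over all environment states is an intersection, so a membership test only needs to repeat the per-environment check for every state. The sketch below assumes the schedule maximizes the inner product of a service vector with the matrix-weighted workload; the environment labels `"good"` and `"degraded"`, the matrix `B`, and the service sets are hypothetical.

```python
def score(s, b, x):
    """Return <s, B x> for service vector s, matrix b (list of rows), workload x."""
    bx = [sum(b_qr * x_r for b_qr, x_r in zip(row, x)) for row in b]
    return sum(s_q * v_q for s_q, v_q in zip(s, bx))


def maximizer_indices(service_set, b, x, tol=1e-12):
    """Return the indices of the service vectors with maximal score at x."""
    values = [score(s, b, x) for s in service_set]
    best = max(values)
    return {i for i, v in enumerate(values) if v >= best - tol}


def in_cone_over_environments(y, x, service_sets, b):
    """True if, in every environment state, x and y share a maximizing vector."""
    return all(
        maximizer_indices(s_set, b, x) & maximizer_indices(s_set, b, y)
        for s_set in service_sets.values()
    )


B = [[1.0, 0.0], [0.0, 1.0]]
SERVICE_SETS = {
    "good": [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0)],
    "degraded": [(1.0, 0.0), (0.0, 0.0)],
}
X = (1.0, 3.0)

print(in_cone_over_environments((1.0, 5.0), X, SERVICE_SETS, B))
print(in_cone_over_environments((4.0, 1.0), X, SERVICE_SETS, B))
```

The second workload fails the test in the `"good"` state alone, which is enough to exclude it from the intersection.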

The cone turns out to be of key importance in the stability proof below. This completes the geometric picture of cone schedules.

5 Universal Stability of Cone Schedules

We are primarily interested in the throughput maximizing properties of cone schedules for various families of matrices , given the traffic load . The following theorem establishes that stability can be maintained for any by rich families of matrices .

Consider a cone schedule generated by the matrix and operating on any arbitrarily fixed system chosen from the class of processing systems defined by:

  1. some set of queues and some set of environment states ,

  2. some environment trace , as per (2.3),

  3. some (non-empty) service vector sets that are complete, as per (2.4),

  4. some traffic trace with load , as per (2.2).

Theorem 5.1 (Universal Stability of Cone Schedules)

Given the above assumptions, if is positive definite, symmetric, and has negative or zero off-diagonal elements (), then

(5.1)

universally on . That is, each system in is (rate) stable under such a cone schedule, when .
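A small simulation gives a feel for what rate stability means in practice. It is a rough sketch under several assumptions that are not from the paper: time is slotted, arrivals are constant per slot, the workload is clipped at zero, and the environment alternates between two states. The matrix `b`, the service sets, and the arrival rates are hypothetical; the matrix is symmetric and positive definite with nonpositive off-diagonal elements, and the load is chosen inside the region that the two environment states can serve on average.

```python
def cone_schedule(service_set, b, x):
    """Pick a service vector maximizing <s, B x> (the first one found on ties)."""
    bx = [sum(b_qr * x_r for b_qr, x_r in zip(row, x)) for row in b]
    return max(service_set, key=lambda s: sum(s_q * v_q for s_q, v_q in zip(s, bx)))


def simulate(steps):
    """Run the slotted system and return the final workload vector."""
    b = [[1.0, -0.2], [-0.2, 1.0]]
    service_sets = [
        [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0)],  # state 0: either queue can be served
        [(1.0, 0.0), (0.0, 0.0)],              # state 1: only queue 0 can be served
    ]
    arrivals = (0.4, 0.3)
    x = [0.0, 0.0]
    for t in range(steps):
        s = cone_schedule(service_sets[t % 2], b, x)
        # Workload update, clipped at zero (a simplification of the model).
        x = [max(x_q + a_q - s_q, 0.0) for x_q, a_q, s_q in zip(x, arrivals, s)]
    return x


STEPS = 2000
final = simulate(STEPS)
# Rate stability corresponds to workload / time tending to zero.
print([x_q / STEPS for x_q in final])
```

In this toy run the workload stays small while the horizon grows, so the printed ratios are close to zero. A single trace of this kind only illustrates the statement and does not verify it.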

It turns out that being positive definite and having nonpositive off-diagonal elements are both necessary for universal stability, as shown in [Ross and Bambos, 2009].

To see why nonpositive off-diagonal elements are required, consider a simple network with queues and environment state, where is used. If and are the two available service vectors then , and . Since strictly dominates for any nonzero workload, would never be selected and any arrival process with will be unstable.
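The same effect can be reproduced with concrete numbers. The values below are hypothetical and are not the ones used in the paper's example: `B` is symmetric and positive definite but has positive off-diagonal elements, and `S1` and `S2` each serve a single queue. The score is assumed to be the inner product of a service vector with the matrix-weighted workload.

```python
def score(s, b, x):
    """Return <s, B x> for service vector s, matrix b (list of rows), workload x."""
    bx = [sum(b_qr * x_r for b_qr, x_r in zip(row, x)) for row in b]
    return sum(s_q * v_q for s_q, v_q in zip(s, bx))


# Hypothetical matrix with positive off-diagonal elements.
B = [[1.0, 0.5], [0.5, 1.0]]
S1 = (1.0, 0.0)  # serves queue 0 only
S2 = (0.0, 0.4)  # serves queue 1 only

# Every nonzero workload on a small integer grid.
workloads = [(i, j) for i in range(6) for j in range(6) if (i, j) != (0, 0)]
s1_always_wins = all(score(S1, B, x) > score(S2, B, x) for x in workloads)
print(s1_always_wins)  # prints True
```

Here the score of `S1` exceeds that of `S2` by `0.8 * x[0] + 0.1 * x[1]`, which is positive for every nonzero nonnegative workload. Queue 1 is therefore never served, even when queue 0 is empty, and any traffic arriving at queue 1 accumulates without bound.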

To see why positive definiteness is required, consider a simple network with queues and environment state, where is used. Let and be the available service vectors. Then we have , and will never be selected. The effective service rates applied to the queues must then satisfy . Now with is contained within by (3.2), but rate stability cannot possibly be achieved in (2.5). The parameters of the non-positive-definite matrix cause the cone schedule to avoid utilizing , which is critical for rate stability because it lies on the convex hull of .
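A numerical version of this argument is sketched below with hypothetical values that are not the paper's: `B` is symmetric with nonpositive off-diagonal elements but is not positive definite (its eigenvalues are 3 and -1), and `SERVICE_SET` is the completion of three service vectors. The score is assumed to be the inner product of a service vector with the matrix-weighted workload.

```python
def score(s, b, x):
    """Return <s, B x> for service vector s, matrix b (list of rows), workload x."""
    bx = [sum(b_qr * x_r for b_qr, x_r in zip(row, x)) for row in b]
    return sum(s_q * v_q for s_q, v_q in zip(s, bx))


def maximizing_vectors(service_set, b, x, tol=1e-12):
    """Return every service vector whose score is within tol of the best score."""
    values = [score(s, b, x) for s in service_set]
    best = max(values)
    return [s for s, v in zip(service_set, values) if v >= best - tol]


# Hypothetical symmetric matrix that is not positive definite.
B = [[1.0, -2.0], [-2.0, 1.0]]
S3 = (0.6, 0.6)
# S1 = (1, 0), S2 = (0, 1), S3, plus the sub-vectors required by completeness.
SERVICE_SET = [(1.0, 0.0), (0.0, 1.0), S3, (0.6, 0.0), (0.0, 0.6), (0.0, 0.0)]

workloads = [(i, j) for i in range(6) for j in range(6) if (i, j) != (0, 0)]
selected = {s for x in workloads for s in maximizing_vectors(SERVICE_SET, B, x)}

print(S3 in selected)                     # prints False
print(max(sum(s) for s in selected))      # prints 1.0
```

The score of `S3` equals `-0.6 * (x[0] + x[1])`, which is negative for every nonzero workload and so always loses to the zero vector. The total service rate delivered by the schedule is therefore capped at 1, while `S3` alone could deliver 1.2, so a load such as (0.55, 0.55) lies within what the service vectors can support yet cannot be stabilized by this schedule.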

5.1 Proof of Theorem 5.1

We prove rate stability via a sequence of intermediate steps.

Consider any arbitrarily fixed environment trace , such that is complete and

for each . Consider also any arbitrarily fixed traffic trace satisfying

We note that while and are fixed, they can be generated arbitrarily, including by an underlying stochastic process or an adversary. Recall that, by Proposition 4.1, when has negative or zero off-diagonal elements the generated cone schedule applies no positive rate to empty queues. Therefore,

(5.2)

for the workload at time – as in (2.5) – without having to compensate for any idle time.

Proposition 5.1

Under the conditions of Theorem 5.1, the service vectors selected by the cone schedule under various environment states satisfy

(5.3)

for each workload .

Proof: First, choose any workload and fix it. Since , we have according to (3.2), or

(5.4)

for some positive weights such that .

We denote and note that this may be negative for some . We examine the following two cases:

  1. If , we get from (5.4) that

    (5.5)
  2. If , we have

    since .

Combining the two cases, we get

for . Adding the terms up over , we get

(5.6)

where is the vector generated by the service vector by setting to 0 the components for which and .

Now recall that for each and , is a sub-vector of (dropping some positive components to 0) and is also in because the latter set is complete. But the service vector selected by the cone schedule (4.2) has the maximal projection on amongst all those in , so for every . Therefore, (5.6) becomes

where the last inequality holds because for each . Putting back , we get