Choice-memory tradeoff in allocations

N. Alon
School of Mathematics, Tel Aviv University, Tel Aviv 69978, Israel,
and Microsoft-Israel R&D Center, Herzeliya 46725, Israel
E-mail: nogaa@tau.ac.il

O. Gurel-Gurevich and E. Lubetzky
Microsoft Research, One Microsoft Way, Redmond, Washington 98052-6399, USA
E-mail: origurel@microsoft.com; eyal@microsoft.com

October 2009
Abstract

In the classical balls-and-bins paradigm, where $n$ balls are placed independently and uniformly in $n$ bins, typically the number of bins with at least two balls in them is $\Theta(n)$ and the maximum number of balls in a bin is $\Theta(\frac{\log n}{\log\log n})$. It is well known that when each round offers $k$ independent uniform options for bins, it is possible to typically achieve a constant maximal load if and only if $k=\Omega(\log n)$. Moreover, it is possible w.h.p. to avoid any collisions between $n/2$ balls if $k>\log_2 n$.

In this work, we extend this into the setting where only $m$ bits of memory are available. We establish a tradeoff between the number of choices $k$ and the memory $m$, dictated by the quantity $km/n$. Roughly put, we show that for $km\gg n$ one can achieve a constant maximal load, while for $km\ll n$ no substantial improvement can be gained over the case $k=1$ (i.e., a random allocation).

For any $k=\Omega(\log n)$ and $m=\Omega(\log^2 n)$, one can achieve a constant load w.h.p. if $km=\Omega(n)$, yet the load is unbounded if $km=o(n)$. Similarly, if $km>Cn$ then $n/2$ balls can be allocated without any collisions w.h.p., whereas for $km<\epsilon n$ there are typically $\Omega(n)$ collisions. Furthermore, we show that the load is w.h.p. at least $\frac{\log(n/m)}{\log k+\log\log(n/m)}$. In particular, for $k\le\operatorname{polylog}(n)$, if $m=n^{1-\delta}$ the optimal maximal load is $\Theta(\frac{\log n}{\log\log n})$ (the same as in the case $k=1$), while $m=\Omega(n/k)$ suffices to ensure a constant load. Finally, we analyze nonadaptive allocation algorithms and give tight upper and lower bounds for their performance.

DOI: 10.1214/09-AAP656
Ann. Appl. Probab. 20 (2010), no. 4, 1470-1511

AMS 2000 subject classifications: 60C05, 60G50, 68Q25.
Keywords: space/performance tradeoffs; balls and bins paradigm; lower bounds on memory; balanced allocations; online perfect matching.

N. Alon was supported in part by a USA Israeli BSF grant, by a grant from the Israel Science Foundation, by an ERC Advanced Grant and by the Hermann Minkowski Minerva Center for Geometry at Tel Aviv University.

1 Introduction

The balls-and-bins paradigm (see, e.g., Feller [Feller], [JK]) describes the process where $b$ balls are placed independently and uniformly at random in $n$ bins. Many variants of this classical occupancy problem have been intensively studied, with a wide range of applications in computer science.

It is well known that when $b=cn$ for fixed $c>0$, the load of each bin tends to Poisson with mean $c$ and the bins are asymptotically independent. In particular, for $b=n$, the typical number of empty bins at the end of the process is $(e^{-1}+o(1))n$. The typical maximal load in that case is $(1+o(1))\frac{\log n}{\log\log n}$ (cf. Gonnet [Gonnet]). In what follows, we say that an event holds with high probability (w.h.p.) if its probability tends to $1$ as $n\to\infty$.

The extensive study of this model in the context of load balancing was pioneered by the celebrated paper of Azar et al. [ABKU] (see the survey [MRS]) that analyzed the effect of a choice between $k$ independent uniform bins on the maximal load, in an online allocation of $n$ balls to $n$ bins. It was shown in [ABKU] that the Greedy algorithm (choose the least loaded bin of the $k$) is optimal and achieves a maximal load of $\frac{\log\log n}{\log k}+O(1)$ w.h.p., compared to a load of $(1+o(1))\frac{\log n}{\log\log n}$ for the original case $k=1$. Thus, $k=2$ random choices already significantly reduce the maximal load, and as $k$ further increases, the maximal load drops until it becomes constant at $k=\Omega(\log n)$.
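To make the contrast concrete, here is a minimal simulation sketch (ours, not from the paper; the parameters are illustrative) comparing a purely random allocation with the Greedy rule:

    import random

    def max_load(n, k, rng=random):
        """Place n balls into n bins; each ball draws k uniform choices
        (with repetition) and goes to the least loaded of them (Greedy).
        For k = 1 this degenerates to a purely random allocation."""
        loads = [0] * n
        for _ in range(n):
            choices = [rng.randrange(n) for _ in range(k)]
            loads[min(choices, key=loads.__getitem__)] += 1
        return max(loads)

    if __name__ == "__main__":
        n = 100_000
        for k in (1, 2, 8):
            print(k, max_load(n, k))  # the maximal load drops sharply at k = 2

Typical runs show the familiar drop from a $\log n/\log\log n$-type load at $k=1$ to $\log\log n$-type loads already at $k=2$.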

In the context of online bipartite matchings, the process of dynamically matching each client in a group $A$ of size $n/2$ with one of $k$ independent uniform resources in a group $B$ of size $n$ precisely corresponds to the above generalization of the balls-and-bins paradigm: Each ball has $k$ options for a bin, and is assigned to one of them by an online algorithm that should avoid collisions (no two balls can share a bin). It is well known that the threshold for achieving a perfect matching in this case is $k=\log_2 n$: For $k\ge(1+\delta)\log_2 n$, w.h.p. every client can be exclusively matched to a target resource, and if $k\le(1-\delta)\log_2 n$, then $\Omega(n)$ requests cannot be satisfied.

In this work, we study the above models in the presence of a constraint on the memory that the online algorithm has at its disposal. We find that a tradeoff between the choice $k$ and the memory $m$ governs the ability to achieve a perfect allocation as well as a constant maximal load. Surprisingly, the threshold separating the subcritical regime from the supercritical regime takes a simple form, in terms of the product of the number of choices $k$ and the size of the memory in bits $m$:

  • If $km\gg n$, then one can allocate $n/2$ balls in $n$ bins without any collisions w.h.p., and consequently achieve a constant load for $n$ balls.

  • If $km\ll n$, then any algorithm for allocating $n/2$ balls w.h.p. creates $\Omega(n)$ collisions and an unbounded maximal load.

Roughly put, when $km\gg n$ the amount of choice and memory at hand suffices to guarantee an essentially best-possible performance. On the other hand, when $km\ll n$, the memory is too limited to enable the algorithm to make use of the extra choice it has, and no substantial improvement can be gained over the case $k=1$, where no choice is offered whatsoever.

Note that rigorous lower bounds for space, and in particular tradeoffs between space and performance (time, communication, etc.), have been studied intensively in the algorithm analysis literature, and are usually highly nontrivial. See, for example, Ajtai [Ajtai], Beame [Beame], [BJSK], [BFMUW], [BSSV], [BC], Fortnow [Fortnow], [FLMV] for some notable examples.

Our first main result establishes the exact threshold of the choice-memory tradeoff for achieving a constant maximal load. As mentioned above, one can verify that when there is unlimited memory, the maximal load is w.h.p. uniformly bounded iff $k=\Omega(\log n)$. Thus, the assumption $k=\Omega(\log n)$ is a prerequisite for discussing the effect of limited memory on this threshold.

Theorem 1

Consider $n$ balls and $n$ bins, where each ball has $k=\Omega(\log n)$ uniform choices for bins, and $m=\Omega(\log^2 n)$ bits of memory are available. If $km=\Omega(n)$, one can achieve a maximal load of $O(1)$ w.h.p. Conversely, if $km=o(n)$, any algorithm w.h.p. creates a load that exceeds any constant.

Consider the case $k=\Theta(\log n)$. The naïve algorithm for achieving a constant maximal load in this setting requires roughly $n$ bits of memory ($2n$ bits of memory always suffice; see Section 1.3). Surprisingly, the above theorem implies that $\Theta(n/\log n)$ bits of memory already suffice, and this is tight.

As we later show, one can extend the upper bound on the load given in Theorem 1 to a bound depending only on the ratio $n/(km)$ (useful when $km=o(n)$), whereas the lower bound tends to infinity as $n/(km)\to\infty$. This further demonstrates how the quantity $km/n$ governs the value of the optimal maximal load. Indeed, Theorem 1 will follow from Theorems 3 and 4 below, which determine that the threshold for a perfect matching is $km=\Theta(n)$.

Again consider the case of $k=\Theta(\log n)$, where an online algorithm with unlimited memory can achieve an $O(1)$ load w.h.p. While the above theorem settles the memory threshold for achieving a constant load in this case, one can ask what the optimal maximal load would be below the threshold. This is answered by the next theorem, which shows that in this case, for example, $m=n^{1-\delta}$ bits of memory (for any fixed $\delta>0$) yield no significant improvement over an algorithm which makes random allocations.

Theorem 2

Consider $n$ balls and $n$ bins, where each ball has $k$ uniform choices for bins, and $m$ bits of memory are available. Then for any algorithm, the maximal load is at least $\frac{\log(n/m)}{\log k+\log\log(n/m)}$ w.h.p.

In particular, if $m\le n^{1-\delta}$ for some fixed $\delta>0$ and $k\le\operatorname{polylog}(n)$, then the maximal load is $\Theta\big(\frac{\log n}{\log\log n}\big)$ w.h.p.

Recall that a load of order $\frac{\log n}{\log\log n}$ is what one would obtain using a random allocation of $n$ balls in $n$ bins. The above theorem states that, when $k\le\operatorname{polylog}(n)$ and $m\le n^{1-\delta}$, any algorithm would create such a load already after placing only a fraction of the balls.

Before describing our other results, we note that the lower bounds in our theorems in fact apply to a more general setting. In the original model, in each round the online algorithm chooses one of $k$ uniformly chosen bins, thus inducing a distribution on the location of the next ball. Clearly, this distribution has the property that no bin has a probability larger than $k/n$.

Our theorems apply to a relaxation of the model, where the algorithm is allowed to dynamically choose a distribution $Q_t$ for each round $t$, which is required to satisfy the above property (i.e., $\|Q_t\|_\infty\le k/n$). We refer to these distributions as strategies.

Observe that this model indeed gives more power to the online algorithm. For instance, if $k=2$ (and the memory is unlimited), an algorithm in the relaxed model can allocate $n/2$ balls perfectly (by assigning probability $0$ to the occupied bins; the uniform distribution over the at least $n/2$ unoccupied bins has sup-norm at most $2/n=k/n$), whereas in the original model collisions occur already with $g(n)\,n^{2/3}$ balls w.h.p., for any $g(n)$ tending to $\infty$ with $n$.
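A minimal sketch of such a relaxed strategy (ours, for illustration only):

    import random

    def relaxed_perfect_allocation(n, rng=random):
        """Relaxed model with k = 2: put probability 0 on occupied bins and
        spread the mass uniformly over the empty ones.  While at most n/2
        balls are placed, at least n/2 bins are empty, so every bin carries
        probability <= 2/n = k/n -- a legal and collision-free strategy."""
        empty = list(range(n))                    # currently empty bins
        for _ in range(n // 2):
            i = rng.randrange(len(empty))         # sample uniformly among them
            empty[i], empty[-1] = empty[-1], empty[i]
            empty.pop()                           # that bin is now occupied
        return n - len(empty)                     # n//2 occupied bins, load 1 each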

Furthermore, we also relax the memory constraint on the model. Instead of treating the algorithm as an automaton with $2^m$ states, we only impose the restriction that there are at most $2^m$ different strategies to choose from. In other words, at time $t$, the algorithm knows the entire history (the exact location of each ball so far), and needs to choose one of its $2^m$ strategies for the next round. In this sense, our lower bounds hold for the case of limited communication complexity rather than limited space complexity.

We note that all our bounds remain valid when each round offers $k$ choices with repetitions.

1.1 Tradeoff for perfect matching

The next two theorems address the threshold for achieving a perfect matching when allocating $(1-\delta)n$ balls in $n$ bins for some fixed $\delta>0$ [note that for $\delta=0$, even with unlimited memory, one needs $k=\omega(n)$ choices to avoid collisions w.h.p.]. The upper and lower bounds obtained for this threshold are tight up to a multiplicative constant, and again pinpoint its location at $km=\Theta(n)$. The constants below were chosen to simplify the proofs and could be optimized.

Theorem 3

For $\delta>0$ fixed, consider $(1-\delta)n$ balls and $n$ bins: Each ball has $k$ uniform choices for bins, and there are $m$ bits of memory. If
\[
km\le\epsilon n
\]
for a sufficiently small constant $\epsilon=\epsilon(\delta)>0$, then any algorithm has $\Omega(n)$ collisions w.h.p.

Furthermore, if $km=o(n)$, the maximal load is w.h.p. larger than any constant.

Theorem 4

For $\delta>0$ fixed, consider $(1-\delta)n$ balls and $n$ bins, where each ball has $k$ uniform choices for bins, and $m$ bits of memory are available. The following holds for any $k=\Omega(\log n)$ and $m=\Omega(\log^2 n)$. If
\[
km\ge Cn
\]
for a sufficiently large constant $C=C(\delta)>0$, then a perfect allocation (no collisions) can be achieved w.h.p.

In light of the above, for any value of $k$, the online allocation algorithm given by Theorem 4 is optimal with respect to its memory requirements.

1.2 Nonadaptive algorithms

In the nonadaptive case, the algorithm is again allowed to choose a fixed (possibly randomized) strategy for selecting the placement of ball number $t$ in one of the $k$ randomly chosen bins offered in step $t$. Therefore, each such algorithm consists of a sequence $(f_t)$ of predetermined strategies, where $f_t$ is the strategy for selecting the bin in step number $t$.
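The following sketch (ours; the preference rule is an arbitrary stand-in) makes the setting explicit: the strategies are fixed in advance and see only the offered bins, never the current loads.

    import random

    def nonadaptive_allocate(n, strategies, k=2, rng=random):
        """strategies[t] is fixed before the process starts; in round t it
        sees only the k offered bins (not the loads) and picks one of them."""
        loads = [0] * n
        for t in range(n):
            offered = [rng.randrange(n) for _ in range(k)]
            loads[strategies[t](offered)] += 1
        return max(loads)

    n = 10_000
    strategies = [min] * n     # e.g., always prefer the lowest-indexed offer
    print(nonadaptive_allocate(n, strategies))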

Here, we show that even if $k=o(n)$, the maximum load is w.h.p. at least $(1-o(1))\frac{\log n}{\log\log n}$, that is, it is essentially as large as in the case $k=1$. It is also possible to obtain tight bounds for larger values of $k$. We illustrate this by considering the case $k=\Theta(n)$.

Theorem 5

Consider the problem of allocating $n$ balls into $n$ bins, where each ball has $k$ uniform choices for bins, using a nonadaptive algorithm.

(i) The maximum load in any nonadaptive algorithm with $k=o(n)$ is w.h.p. at least $(1-o(1))\frac{\log n}{\log\log n}$.

(ii) Fix $\epsilon>0$. The maximum load in any nonadaptive algorithm with $k=\epsilon n$ is w.h.p. $\Omega(\sqrt{\log n})$. This is tight, that is, there exists a nonadaptive algorithm with $k=\epsilon n$ so that the maximum load in it is $O(\sqrt{\log n})$ w.h.p.

1.3 Range of parameters

In the above theorems and throughout the paper, the parameter $k$ may assume values up to $n$. As for the memory, one may naïvely use $n\log n$ bits to store the status of the $n$ bins, each containing at most $n$ balls. The next observation shows that the $\log n$ factor is redundant.

Observation.

At most $2n$ bits of memory suffice to keep track of the number of balls in each bin when allocating $n$ balls in $n$ bins.

Indeed, one can maintain the number of balls in each bin using a vector in $\{0,1\}^{2n}$, where the $1$-bits stand for separators between the bins and the $0$-bits count the balls in each bin. In light of this, the original case of unlimited memory corresponds to the case $m\ge2n$.
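A short sketch of this encoding (ours) in code:

    def encode(loads):
        """Unary encoding: N_j zeros for bin j, followed by a 1 as separator.
        Length = (number of balls) + (number of bins) <= 2n bits."""
        return "".join("0" * load + "1" for load in loads)

    def decode(bits):
        return [len(group) for group in bits.split("1")[:-1]]

    loads = [3, 0, 1, 0]                      # 4 balls in 4 bins
    assert decode(encode(loads)) == loads
    assert len(encode(loads)) <= 2 * len(loads)

Incrementing the count of a bin amounts to inserting a single $0$ into the vector, so the representation is also easy to maintain online.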

1.4 Main techniques

The key argument in the lower bound on the performance of the algorithm with limited memory is analyzing the expected number of new collisions that a given step introduces. We wish to estimate this value with an error probability smaller than $2^{-m}$, so that it would hold w.h.p. for all of the $2^m$ possible strategies for this step.

To this end, we apply a large deviation inequality, which relates the sum of a sequence of dependent random variables $(X_t)$ with the sum of their "predictions" $(Y_t)$, where $Y_t$ is the conditional expectation of $X_t$ given the history up to time $t$. Proposition 2.1 essentially shows that if the sum of the predictions is large (exceeds some $a>0$), then so is the sum of the actual random variables $X_t$, except with probability exponentially small in $a$. In the application, the variable $X_t$ measures the number of new collisions introduced by the $t$th ball, and $Y_t$ is determined by the strategy $Q_t$ and the history so far.

The key ingredient in proving this proposition is a Bernstein-Kolmogorov type inequality for martingales, which appears in a paper of Freedman [Freedman] from 1975, and bounds the probability of deviation of a martingale in terms of its cumulative variance. We reproduce its elegant proof for completeness. Crucially, that theorem does not require a uniform bound on the individual variances (such as the one that appears in standard versions of Azuma-Hoeffding), and instead treats them as random variables. Consequently, the quality of our estimate in Proposition 2.1 is unaffected by the number of random variables involved.

For the upper bounds, the algorithm essentially partitions the bins into blocks, where for different blocks it maintains an accounting of the occupied bins with varying resolution. Once a block exceeds a certain threshold of occupied bins, it is discarded and a new block takes its place.
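A highly simplified sketch of this idea (ours; the paper's actual algorithm is more refined, and the block size, discard threshold and fallback rule here are illustrative):

    import random

    def block_allocate(n, k, block_size, threshold=0.5, rng=random):
        """Remember a bitmap only for the currently active block of bins;
        prefer an offered empty bin of that block, and discard the block
        (forgetting its bitmap) once too few of its bins remain free.
        The memory used is ~block_size bits, independent of n."""
        starts = iter(range(0, n, block_size))
        lo = next(starts)
        occupied = [False] * block_size       # the only state that is kept
        free = block_size
        loads = [0] * n
        for _ in range(n // 2):
            offered = [rng.randrange(n) for _ in range(k)]
            hits = [b for b in offered
                    if lo <= b < lo + block_size and not occupied[b - lo]]
            target = hits[0] if hits else offered[0]   # fallback: first offer
            loads[target] += 1
            if hits:
                occupied[target - lo] = True
                free -= 1
            if free < threshold * block_size:          # block is exhausted
                lo = next(starts, lo)
                occupied = [False] * block_size
                free = block_size
        return max(loads)

    print(block_allocate(100_000, 32, 4_096))

With $k$ large enough that most rounds hit the active block, almost every ball lands in a fresh bin while the memory stays proportional to the block size.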

1.5 Related work

The problem of balanced allocations with limited memory was proposed by Itai Benjamini. In a recent independent work, Benjamini and Makarychev [BM] studied the special case of the problem for $k=2$ (i.e., when there are two choices for bins at each round). While our focus was mainly the regime $k=\Omega(\log n)$ (where one can readily achieve a constant maximal load when there is unlimited memory), our results also apply for smaller values of $k$. Namely, as a by-product, we improve the lower bound of [BM], as well as extend it from $k=2$ to any $k$.

A different notion of memory was introduced to the problem of load balancing balls into bins in [MPS], where one has the option of placing the current ball in the least loaded bin offered in the previous round. In that setting, one can indeed improve the asymptotics (yet not the order) of the maximal load. Note that in our case we consider the original balls-and-bins model (as studied in [ABKU]) and simply impose restrictions on the space complexity of the algorithm.

See, for example, [MU], Chapter 5, for more on the vast literature on load balancing balls into bins and its applications in computer science.

A modern application of the classical online perfect matching problem has advertisers (or bidders) play the role of the bins and internet search queries (or keywords) play the role of the balls. Upon receiving a search query, the search engine generates the list of related advertisements (revealing the choices for this ball) and must decide which of them to present in response to the query (where to allocate the ball). Note that the classical papers analyzing online perfect matching assume a worst-case graph rather than a random bipartite graph, with the requests randomly permuted; see [KVV] for a fundamental paper in this area.

1.6 Organization

This paper is organized as follows. In Section 2, we prove the large deviation inequality (Proposition 2.1). Section 3 contains the lower bounds on the collisions and load, thus proving Theorem 3. Section 4 provides algorithms for achieving a perfect matching and for achieving a constant load, respectively proving Theorem 4 and completing the proof of Theorem 1. In Section 5, we extend the analysis of the lower bound to prove Theorem 2. Section 6 discusses nonadaptive allocations, and contains the proof of Theorem 5. Finally, Section 7 is devoted to concluding remarks.

2 A large deviation inequality

This section contains a large deviation result, which will later be one of the key ingredients in proving our lower bounds for the load. Our proof will rely on a Bernstein-Kolmogorov-type inequality of Freedman [Freedman], which extends the standard Azuma-Hoeffding martingale concentration inequality. Given a sequence of bounded (possibly dependent) random variables $(X_t)$ adapted to some filter $(\mathcal F_t)$, one can consider the sequence $(Y_t)$ given by $Y_t=\mathbb E[X_t\mid\mathcal F_{t-1}]$, which can be viewed as predictions for the $X_t$'s. The following proposition essentially says that, if the sum of the predictions is large, so is the sum of the actual variables $X_t$.
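A quick numerical illustration of this statement (ours; the history-dependent distribution is arbitrary):

    import random

    def sample_path(T, rng=random):
        """Bernoulli variables whose success probability (the prediction
        Y_t) depends on the history of previous outcomes."""
        xs, ys, total = [], [], 0.0
        for _ in range(T):
            y = 0.25 + 0.5 * ((total * 0.618) % 1.0)  # history-dependent mean
            ys.append(y)
            x = 1.0 if rng.random() < y else 0.0
            xs.append(x)
            total += x
        return xs, ys

    xs, ys = sample_path(100_000)
    print(sum(xs), sum(ys))  # the two sums are close; in particular the
                             # actual sum is rarely far below the predicted one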

Proposition 2.1

Let $(X_t)$ be a sequence of random variables adapted to the filter $(\mathcal F_t)$ so that $0\le X_t\le M$ for all $t$, and let $Y_t=\mathbb E[X_t\mid\mathcal F_{t-1}]$. Then for any $a>0$, except with probability $8\exp(-\frac a{20M})$, every $T$ satisfies
\[
\sum_{t\le T}X_t\ \ge\ \frac12\sum_{t\le T}Y_t-a\qquad\text{as well as}\qquad\sum_{t\le T}X_t\ \le\ 2\sum_{t\le T}Y_t+a .
\]

Proof.

As mentioned above, the proof hinges on a tail inequality for sums of random variables, which appears in the work of Freedman [Freedman] from 1975 (see also Steiger [Steiger]), and extends such inequalities of Bernstein and Kolmogorov to the setting of martingales. See [Freedman] and the references therein for more background on these inequalities, as well as [Burkholder] for similar martingale estimates. We include the short proof of Theorem 2.2 for completeness.

Theorem 2.2 (Freedman [Freedman], Theorem 1.6)

Let $(S_t)$ be a martingale with respect to the filter $(\mathcal F_t)$. Suppose that $S_t-S_{t-1}\le C$ for all $t$, and write $V_t=\sum_{s\le t}\operatorname{Var}(S_s\mid\mathcal F_{s-1})$. Then for any $a,b>0$ we have
\[
\mathbb P\big[S_t\ge a\text{ and }V_t\le b\text{ for some }t\big]\ \le\ \exp\bigg(-\frac{a^2}{2(b+aC)}\bigg).
\]

Proof.

Without loss of generality, suppose $S_0=0$. Re-scaling by $C$ (replacing $S_t$, $a$, $b$ by $S_t/C$, $a/C$, $b/C^2$, respectively), it clearly suffices to treat the case $C=1$. Set
\[
\tau=\min\{t:S_t\ge a\text{ and }V_t\le b\}\qquad(\min\varnothing=\infty),
\]
and for some $\lambda>0$ to be specified later, define
\[
G_t=\exp\big(\lambda S_t-(e^\lambda-\lambda-1)V_t\big).
\]
The next calculation will show that $(G_t)$ is a super-martingale with respect to the filter $(\mathcal F_t)$. First, notice that the function
\[
g(x)=(e^{\lambda x}-\lambda x-1)/x^2
\]
is monotone increasing [as $(e^u-u-1)/u^2$ is increasing in $u$ for all $u\in\mathbb R$], and in particular, $g(x)\le g(1)=e^\lambda-\lambda-1$ for all $x\le1$. Rearranging,
\[
e^{\lambda x}\ \le\ 1+\lambda x+(e^\lambda-\lambda-1)x^2\qquad\text{for all }x\le1 .
\]
Now, since $\mathbb E[S_t-S_{t-1}\mid\mathcal F_{t-1}]=0$ and $S_t-S_{t-1}\le1$ for all $t$, it follows that
\[
\mathbb E\big[e^{\lambda(S_t-S_{t-1})}\mid\mathcal F_{t-1}\big]\ \le\ 1+(e^\lambda-\lambda-1)(V_t-V_{t-1})\ \le\ e^{(e^\lambda-\lambda-1)(V_t-V_{t-1})}.
\]
By definition, this precisely says that $\mathbb E[G_t\mid\mathcal F_{t-1}]\le G_{t-1}$. That is, $(G_t)$ is a super-martingale, and hence by the Optional Stopping Theorem so is $(G_{\tau\wedge t})$, where $t$ is some integer and $\tau$ is as above. In particular,
\[
\mathbb EG_{\tau\wedge t}\ \le\ \mathbb EG_0=1,
\]
and (noticing that $G_{\tau\wedge t}\ge e^{\lambda a-(e^\lambda-\lambda-1)b}$ on the event $\{\tau\le t\}$) Markov's inequality next implies that
\[
\mathbb P[\tau\le t]\ \le\ \exp\big(-\lambda a+(e^\lambda-\lambda-1)b\big).
\]
A choice of $\lambda=\log(1+a/b)$ therefore yields
\[
\mathbb P[\tau\le t]\ \le\ \Big(\frac b{a+b}\Big)^{a+b}e^a\ \le\ \exp\Big(-\frac{a^2}{2(a+b)}\Big),
\]
and taking a limit over $t$ concludes the proof.

Remark.

Note that Theorem 2.2 generalizes the well-known version of the Azuma-Hoeffding inequality, in which each of the increments $S_t-S_{t-1}$ is uniformly bounded by some constant (cf., e.g., McDiarmid [McDiarmid]).
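For concreteness, here is the one-line specialization (ours): if $|S_t-S_{t-1}|\le c$ for all $t$, then $V_n\le nc^2$ deterministically, so Theorem 2.2 applied with $b=nc^2$ gives
\[
\mathbb P[S_n\ge a]\ \le\ \exp\bigg(-\frac{a^2}{2(nc^2+ac)}\bigg),
\]
which recovers the Azuma-Hoeffding bound $e^{-a^2/(2nc^2)}$ up to the lower-order term $ac$ in the denominator.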

We now wish to infer Proposition 2.1 from Theorem 2.2. To this end, define
\[
S_T=\sum_{t\le T}(X_t-Y_t),
\]
and observe that $(S_T)$ is a martingale by the definition $Y_t=\mathbb E[X_t\mid\mathcal F_{t-1}]$. Moreover, as the $X_t$'s are uniformly bounded by $M$, so are the increments of $(S_T)$:
\[
S_T-S_{T-1}=X_T-Y_T\ \le\ X_T\ \le\ M .
\]
Furthermore, crucially, the variances of the increments are bounded as well in terms of the conditional expectations:
\[
\operatorname{Var}(S_T\mid\mathcal F_{T-1})\ \le\ \mathbb E[X_T^2\mid\mathcal F_{T-1}]\ \le\ M\,\mathbb E[X_T\mid\mathcal F_{T-1}]=MY_T,
\]
giving that $V_T\le M\sum_{t\le T}Y_t$.

Finally, for any integer $j\ge0$ let $A_j$ denote the event
\[
A_j=\Big\{\sum_{t\le T}X_t\le\tfrac12\sum_{t\le T}Y_t-a\ \text{ and }\ \sum_{t\le T}Y_t\in[2^ja,2^{j+1}a)\ \text{ for some }T\Big\}.
\]
Note that the event $A_j$ implies that $-S_T\ge2^{j-1}a+a$ while $V_T\le M\sum_{t\le T}Y_t<2^{j+1}aM$. Hence, applying Theorem 2.2 to the martingale $(-S_T)$ (whose increments are at most $M$) along with its cumulative variances, we now get
\[
\mathbb P[A_j]\ \le\ \exp\bigg(-\frac{(2^{j-1}a+a)^2}{2\big(2^{j+1}aM+(2^{j-1}a+a)M\big)}\bigg)\ \le\ \exp\bigg(-\frac{2^ja}{20M}\bigg)
\]
(the remaining case $\sum_{t\le T}Y_t<a$ is handled in the same manner, with the even stronger bound $e^{-a/(4M)}$). Summing over the values of $j$, we obtain that if $a\ge20M$ then the total probability is at most $4\exp(-\frac a{20M})$, while for $a<20M$ this bound exceeds $1$ and thus holds trivially. Hence, for all $a>0$,

(1)
\[
\mathbb P\Big[\sum_{t\le T}X_t\le\tfrac12\sum_{t\le T}Y_t-a\ \text{ for some }T\Big]\ \le\ 4\exp\Big(-\frac a{20M}\Big).
\]

To complete the proof of the proposition, we repeat the above analysis for the event that $\sum_{t\le T}X_t$ is much larger than $\sum_{t\le T}Y_t$. Clearly, we again have $S_T-S_{T-1}\le M$ and $V_T\le M\sum_{t\le T}Y_t$. Defining
\[
B_j=\Big\{\sum_{t\le T}X_t\ge2\sum_{t\le T}Y_t+a\ \text{ and }\ \sum_{t\le T}Y_t\in[2^ja,2^{j+1}a)\ \text{ for some }T\Big\},
\]
it follows that the event $B_j$ implies that $S_T\ge2^ja+a$ while $V_T<2^{j+1}aM$. Therefore, as before, we have that
\[
\mathbb P[B_j]\ \le\ \exp\bigg(-\frac{(2^ja+a)^2}{2\big(2^{j+1}aM+(2^ja+a)M\big)}\bigg)\ \le\ \exp\bigg(-\frac{2^ja}{10M}\bigg),
\]
and thus for all $a>0$,

(2)
\[
\mathbb P\Big[\sum_{t\le T}X_t\ge2\sum_{t\le T}Y_t+a\ \text{ for some }T\Big]\ \le\ 4\exp\Big(-\frac a{10M}\Big).
\]

Summing the probabilities in (1) and (2) yields the desired result.

We note that essentially the same proof yields the following generalization of Proposition 2.1. As before, the constants can be optimized.

Proposition 2.3

Let $(X_t)$, $(Y_t)$ and $M$ be as given in Proposition 2.1. Then for any $\epsilon>0$ and $a>0$, except with probability $C\exp(-c\epsilon a/M)$ (where $C,c>0$ are absolute constants), every $T$ satisfies
\[
(1-\epsilon)\sum_{t\le T}Y_t-a\ \le\ \sum_{t\le T}X_t\ \le\ (1+\epsilon)\sum_{t\le T}Y_t+a .
\]

Remark.

The statements of Propositions 2.1 and 2.3 hold also in conjunction with any stopping time $\tau$ adapted to the filter $(\mathcal F_t)$. That is, we get the same bound on the probability of the mentioned event happening at any time $T\le\tau$. This follows easily, for instance, by altering the sequence of increments $(X_t)$ to be identically $0$ for all $t>\tau$. Such statements become useful when the uniform bound $M$ on the increments is only valid before the stopping time $\tau$.

3 Lower bounds on the collisions and load

In this section, we prove Theorem 3 as well as the lower bound in Theorem 1, by showing that if the quantity $km/n$ is suitably small, then any allocation necessarily produces nearly linearly many bins with arbitrarily large load.

The main ingredient in the proof is a bound for the number of collisions, that is, pairs of balls that share a bin, defined next. Let $N_j(t)$ denote the number of balls in bin $j$ after performing $t$ rounds; the number of collisions at time $t$ is then
\[
\operatorname{Col}(t)=\sum_{j=1}^n\binom{N_j(t)}2 .
\]

The following theorem provides a lower bound on $\operatorname{Col}(t)$, which becomes effective once $t$ is a suitable constant fraction of $n$.

Theorem 3.1

Consider $n$ balls and $n$ bins, where each ball has $k$ uniform choices for bins, and $m$ bits of memory are available. There exist absolute constants $c,C>0$ so that the following holds.

(i) For all $t$, we have
\[
\mathbb E\operatorname{Col}(t)\ \ge\ c\,\frac{t^2}n-C\,km .
\]

(ii) Furthermore, with probability $1-e^{-\Omega(m)}$, for all $t$ and any $L\ge1$, either the maximal load is at least $L$ or
\[
\operatorname{Col}(t)\ \ge\ c\,\frac{t^2}n-C(k+L)m .
\]

Note that the main statement of Theorem 3 immediately follows from the above theorem, by choosing a suitable $t=\Theta(n)$ and load threshold $L$. Indeed, recalling the assumption in Theorem 3 that $km\le\epsilon n$, we obtain that, except with probability $e^{-\Omega(m)}$, either the algorithm creates a load of at least $L$, or it has $\operatorname{Col}(t)=\Omega(n)$. Observing that a load of $L$ immediately induces $\binom L2$ collisions, we deduce that either way there are at least $\Omega(n)$ collisions w.h.p.

We next prove Theorem 3.1; the statement of Theorem 3 on unbounded maximal load will follow from an iterative application of a more general form of this theorem (namely, Theorem 3.4), which appears in Section 3.1.

Proof of Theorem 3.1. As noted in the Introduction, we relax the model by allowing the algorithm to choose any distribution $Q_t$ for the location of the next ball, as long as it satisfies $\|Q_t\|_\infty\le k/n$.

We also relax the memory constraint as follows. The algorithm has a pool of at most $2^m$ different strategies, and may choose any of them at a given step without any restriction (basing its dynamic decision on the entire history).

To summarize, the algorithm has a pool of at most $2^m$ strategies, all of which have an $L^\infty$-norm of at most $k/n$. In each given round, it adaptively chooses a strategy $Q_t$ from this pool based on the entire history, and a ball then falls to a bin distributed according to $Q_t$.

The outline of the proof is as follows: consider the sequence of strategies $Q_1,Q_2,\ldots$, chosen adaptively out of the pool of $2^m$ strategies. The large deviation inequality of Section 2 (Proposition 2.1) will enable us to show that the expected number of collisions encountered in the above process is well approximated by the expected number of collisions between independent balls, placed according to $Q_1,Q_2,\ldots$ (i.e., equivalent to the result of the nonadaptive algorithm with strategies $Q_1,Q_2,\ldots$).

Having reduced the problem to the analysis of a nonadaptive algorithm, we may then derive a lower bound on the number of collisions in the independent setting by analyzing the structure of the above strategies. This bound is then translated into a bound on $\operatorname{Col}(t)$ using another application of the large deviation inequality of Proposition 2.1.

Let $Q$ be an arbitrary probability distribution on $[n]$ satisfying $\|Q\|_\infty\le k/n$, and denote by $Q_t$ the strategy of the algorithm at time $t$. It will be convenient from time to time to treat these distributions as vectors in $\mathbb R^n$.

By the above discussion, $Q_t$ is a random variable whose values belong to some a priori pool of at most $2^m$ strategies. We further let $x_t$ denote the actual position of the ball at time $t$ (drawn according to the distribution $Q_t$).

Given the strategy $Q_t$ at time $t$, the probability of a collision between the next ball and the ball that arrived at time $s<t$, given $\mathcal F_{t-1}$, is $Q_t(x_s)$. We let $Y_t$ be the inner product of $Q_t$ and the load vector $N(t-1)$, which measures the expectation of these collisions:
\[
Y_t=\langle Q_t,N(t-1)\rangle=\sum_{s<t}Q_t(x_s),
\]
and we let $\hat Y_t=\langle Q_t,\sum_{s<t}Q_s\rangle$ denote its counterpart for independent balls placed according to the same strategies.

Further define the cumulative sums of $Y_t$ and $\hat Y_t$ as follows:
\[
S_T=\sum_{t\le T}Y_t,\qquad\hat S_T=\sum_{t\le T}\hat Y_t .
\]

To motivate these definitions, notice that given the history $\mathcal F_{t-1}$ up to time $t$ and any possible strategy $Q$ for the next round, we have
\[
\mathbb E\big[\operatorname{Col}(t)-\operatorname{Col}(t-1)\mid\mathcal F_{t-1},\,Q_t=Q\big]=\langle Q,N(t-1)\rangle,
\]
and so $Y_t$ is the expected number of collisions that will be contributed by the ball at time $t$ given the entire history $\mathcal F_{t-1}$. Summing over $t$, we have that
\[
\mathbb E\operatorname{Col}(T)=\mathbb ES_T,
\]
thus estimating the quantities $S_T$ will provide a bound on the expected number of collisions. Our aim in the next lemma is to show that w.h.p., whenever $\hat S_T$ is large, so is $S_T$. This will reduce the problem to the analysis of the quantities $\hat Y_t$, which are deterministic functions of $Q_1,\ldots,Q_t$. This is the main conceptual ingredient in the lower bound, and its proof will follow directly from the large deviation estimate given in Proposition 2.1.
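In code, the bookkeeping of $Y_t$, $\hat Y_t$ and their cumulative sums looks as follows (a toy sketch of ours; the adaptive rule `pick` is an arbitrary stand-in for the algorithm's choice from its pool):

    import random

    def track_sums(n, T, pool, pick, rng=random):
        """Run the relaxed process, accumulating S (sum of Y_t = <Q_t, N(t-1)>)
        and S_hat (sum of Y_hat_t = <Q_t, sum_{s<t} Q_s>)."""
        loads = [0] * n                   # the load vector N(t-1)
        qsum = [0.0] * n                  # sum of the past strategy vectors
        S = S_hat = 0.0
        for t in range(T):
            Q = pool[pick(t, loads)]      # adaptive choice from the pool
            S += sum(q * v for q, v in zip(Q, loads))
            S_hat += sum(q * v for q, v in zip(Q, qsum))
            for j, q in enumerate(Q):
                qsum[j] += q
            loads[rng.choices(range(n), weights=Q)[0]] += 1
        return S, S_hat

    n = 1_000
    uniform = [1.0 / n] * n               # ||Q||_inf = 1/n <= k/n
    print(track_sums(n, n // 2, [uniform], lambda t, loads: 0))

For the uniform strategy both sums concentrate around roughly $\binom T2/n$, in line with the bound of Claim 3.3 below.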

Lemma 3.2

Let $(Q_t)$ be a sequence of strategies adapted to the filter $(\mathcal F_t)$, let $S_T$ and $\hat S_T$ be defined as above, and set $a_0=C_0km/n$ for a suitably large absolute constant $C_0>0$. Then with probability at least $1-2^{-m}$, every $T$ and every $Q$ in the pool satisfy
\[
\sum_{t\le T}Q(x_t)\ \ge\ \frac12\sum_{t\le T}\langle Q,Q_t\rangle-a_0 .
\]
In particular, whenever $\hat Y_t\ge4a_0$ we also have $Y_t\ge\hat Y_t/4$.

Proof.

Before describing the proof, we wish to emphasize a delicate point. The lemma holds for any sequence of strategies (each $Q_t$ is an arbitrary function of $\mathcal F_{t-1}$). No restrictions are made here on the way each such $Q_t$ is produced (e.g., it does not even need to belong to the pool of $2^m$ strategies), as long as it satisfies $\|Q_t\|_\infty\le k/n$. The reason that such a general statement is possible is the following: Once we specify how each $Q_t$ is determined from $\mathcal F_{t-1}$ (this can involve extra random bits, in case the adaptive algorithm is randomized), the process of exposing the positions of the balls, $x_1,x_2,\ldots$, defines a martingale. Hence, for each fixed $Q$, we would be able to show that the desired event occurs except with probability $4^{-m}$. A union bound over the strategies $Q$ (which, crucially, do belong to the pool of size $2^m$) will then complete the proof.

Fix a strategy $Q$ out of the pool of $2^m$ possible strategies, and consider the variables $Z_t=Q(x_t)$, whose conditional expectations satisfy $\mathbb E[Z_t\mid\mathcal F_{t-1}]=\langle Q,Q_t\rangle$ and for which
\[
0\ \le\ Z_t\ \le\ \|Q\|_\infty\ \le\ k/n .
\]
By applying Proposition 2.1 to the sequence $(Z_t)$ (with the cumulative sums $\sum_{t\le T}Z_t$ and cumulative conditional expectations $\sum_{t\le T}\langle Q,Q_t\rangle$), we obtain that for all $a>0$,
\[
\mathbb P\Big[\sum_{t\le T}Z_t\le\frac12\sum_{t\le T}\langle Q,Q_t\rangle-a\ \text{ for some }T\Big]\ \le\ 8\exp\Big(-\frac{an}{20k}\Big).
\]
Thus, taking $a=a_0=C_0km/n$ with $C_0$ a suitably large absolute constant, we obtain that this probability is at most $4^{-m}$. Summing over the pool of at most $2^m$ predetermined strategies completes the proof.

Having shown that $S_T$ is well approximated by $\hat S_T$, and recalling that we are interested in estimating $\mathbb E\operatorname{Col}(T)=\mathbb ES_T$, we now turn our attention to the possible values of $\hat S_T$.

Claim 3.3

For any sequence of strategies $(Q_t)$, we have that
\[
\hat S_T\ \ge\ \frac{T(T-k)}{2n}\qquad\text{for all }T .
\]

Proof.

By our definitions, for the strategies $Q_1,\ldots,Q_T$ we have

(3)
\[
\hat S_T=\sum_{t\le T}\sum_{s<t}\langle Q_t,Q_s\rangle=\frac12\bigg(\Big\|\sum_{t\le T}Q_t\Big\|_2^2-\sum_{t\le T}\|Q_t\|_2^2\bigg).
\]
Recalling the definition of the strategies, we have that $\|Q_t\|_\infty\le k/n$, and therefore
\[
\sum_{t\le T}\|Q_t\|_2^2\ \le\ \sum_{t\le T}\|Q_t\|_\infty\|Q_t\|_1\ \le\ T\,\frac kn .
\]
On the other hand, by Cauchy-Schwarz,
\[
\Big\|\sum_{t\le T}Q_t\Big\|_2^2\ \ge\ \frac1n\Big\|\sum_{t\le T}Q_t\Big\|_1^2=\frac{T^2}n .
\]
Plugging these two estimates in (3), we deduce that
\[
\hat S_T\ \ge\ \frac12\Big(\frac{T^2}n-\frac{Tk}n\Big)=\frac{T(T-k)}{2n},
\]
as required.

While the above claim tells us that the average size of $\hat Y_t$ is fairly large [of order at least $T/n$ once $T\ge2k$], we wish to obtain bounds corresponding to individual distributions $Q_t$. As we next show, this sum indeed enjoys a significant contribution from indices $t$ where $\hat Y_t\ge4a_0$. More precisely, setting $I=\{t\le T:\hat Y_t\ge4a_0\}$, we claim that for large enough $n$,

(4)
\[
\sum_{t\in I}\hat Y_t\ \ge\ \frac{T(T-k)}{2n}-4C_0\,km .
\]

To see this, observe that if $t\notin I$ then
\[
\hat Y_t<4a_0=4C_0\,\frac{km}n,
\]
and hence
\[
\sum_{t\notin I}\hat Y_t\ \le\ 4C_0\,\frac{km}n\,T\ \le\ 4C_0\,km .
\]
Combining this with Claim 3.3 [while noting that $\sum_{t\in I}\hat Y_t=\hat S_T-\sum_{t\notin I}\hat Y_t$] yields (4) for any sufficiently large $n$.

We may now apply Lemma 3.2, and obtain that, except with probability $2^{-m}$, whenever $\hat Y_t\ge4a_0$ we have $Y_t\ge\hat Y_t/4$, and so

(5)
\[
S_T\ \ge\ \sum_{t\in I}Y_t\ \ge\ \frac14\sum_{t\in I}\hat Y_t\ \ge\ \frac{T(T-k)}{8n}-C_0\,km .
\]

Altogether, since $\mathbb E\operatorname{Col}(T)=\mathbb ES_T$ and $S_T\ge0$, we infer that

(6)
\[
\mathbb E\operatorname{Col}(T)\ \ge\ \big(1-2^{-m}\big)\Big(\frac{T(T-k)}{8n}-C_0\,km\Big)\ \ge\ c\,\frac{T^2}n-C\,km,
\]
where the last inequality holds, with suitable absolute constants $c,C>0$, for large enough $n$ (the case $T\le2k$, in which the right-hand side is nonpositive, being trivial). This proves part (i) of Theorem 3.1.

It remains to establish concentration for $\operatorname{Col}(T)$ under the additional assumption that the maximal load has not yet reached some threshold $L\ge1$. First, set the following stopping-time for reaching a maximal-load of $L$:
\[
\tau=\min\big\{t:\max_jN_j(t)\ge L\big\}.
\]
Next, recall that
\[
\operatorname{Col}(T)=\sum_{t\le T}N_{x_t}(t-1),
\]
and notice that
\[
\mathbb E\big[N_{x_t}(t-1)\mid\mathcal F_{t-1}\big]=\langle Q_t,N(t-1)\rangle=Y_t .
\]
Therefore, we may apply our large deviation estimate given in Section 2 (Proposition 2.1), combined with the stopping-time $\tau$ (see the remark following Proposition 2.3):

  • The sequence of increments is $X_t=N_{x_t}(t-1)\,\mathbf1_{\{t\le\tau\}}$.

  • The sequence of conditional expectations is $(Y_t)$, up to time $\tau$.

  • The bound on the increments is $M=L$, as $N_{x_t}(t-1)<L$ for all $t\le\tau$.

It follows that for $a=C_1mL$ with a suitably large absolute constant $C_1>0$,
\[
\mathbb P\Big[\operatorname{Col}(T\wedge\tau)\le\frac12S_{T\wedge\tau}-a\ \text{ for some }T\Big]\ \le\ 8\exp\Big(-\frac a{20L}\Big)\ \le\ 2^{-m},
\]
where the last inequality is by the choice of $a$. Finally, by (5), we also have that $S_T\ge\frac{T(T-k)}{8n}-C_0km$ for all $T$, except with probability $2^{-m}$. Combining these two statements, we deduce that for any $T$, with probability $1-e^{-\Omega(m)}$, either the maximal load reaches $L$ or
\[
\operatorname{Col}(T)\ \ge\ c\,\frac{T^2}n-C(k+L)m,
\]
concluding the proof of Theorem 3.1.

3.1 Boosting the subcritical regime to unbounded maximal load

While Theorem 3.1 given above provides a careful analysis of the number of $2$-collisions, that is, pairs of balls sharing a bin, one can iteratively apply this theorem, with very few modifications, in order to obtain that the number of $q$-collisions (sets of $q$ balls sharing a bin) is nearly linear in $n$ w.h.p. The proof of this result hinges on Theorem 3.4 below, which is a generalization of Theorem 3.1.

Recall that in the relaxed model studied so far, at any given time $t$ the algorithm adaptively selects a strategy $Q_t$ (based on the entire history $\mathcal F_{t-1}$), after which a ball is positioned in a bin $x_t$ distributed according to $Q_t$. We now introduce an extra set of random variables, in the form of a sequence of increasing subsets $B_1\subset B_2\subset\cdots\subset[n]$. The set $B_t$ is determined by $\mathcal F_{t-1}$, and has the following effect: If $x_t\in B_t$, we add a ball to this bin as usual, whereas if $x_t\notin B_t$, we ignore this ball (all bins remain unchanged). That is, the number of balls in bin $j$ at time $t$ is now given by
\[
N_j(t)=\#\{s\le t:x_s=j\text{ and }x_s\in B_s\},
\]
and as before we are interested in a lower bound for the number of collisions:
\[
\operatorname{Col}(t)=\sum_{j=1}^n\binom{N_j(t)}2 .
\]

The idea here is that, in the application, the set $B_t$ will consist of the bins that already contain a certain number of balls at time $t$. As such, these sets indeed form an increasing sequence of subsets determined by $\mathcal F_{t-1}$. In this case, any collision corresponds to a ball placed in some bin which already holds that many other balls, and thus immediately implies a correspondingly larger load.

Theorem 3.4

Consider the following balls and bins setting:

  1. The online adaptive algorithm has a pool of $2^m$ possible strategies, where each strategy $Q$ satisfies $\|Q\|_\infty\le k/n$. The algorithm selects a (random) sequence of strategies $(Q_t)$ adapted to the filter $(\mathcal F_t)$.

  2. Let $(B_t)$ denote a random increasing sequence of subsets of $[n]$ adapted to the filter $(\mathcal F_t)$, that is, $B_t$ is determined by $\mathcal F_{t-1}$.

  3. There are $T$ rounds, where in round $t$ a new potential location $x_t$ for a ball is chosen according to $Q_t$. If this location belongs to $B_t$, a ball is positioned there (otherwise, nothing happens).

Define $W_T=\sum_{t\le T}Q_t(B_t)$, the expected number of balls actually placed. Then for any $L\ge1$, with probability $1-e^{-\Omega(m)}$, either the maximal load is at least $L$, or
\[
\operatorname{Col}(T)\ \ge\ c\,\frac{W_T^2}n-C(k+L)m .
\]

Proof.

As the proof follows the same arguments as those of Theorem 3.1, we restrict our attention to describing the modifications that are required for the new statement to hold.

Define the following subdistribution of $Q_t$ with respect to $B_t$:
\[
Q_t'(j)=Q_t(j)\,\mathbf1_{\{j\in B_t\}} .
\]