# Self-stabilizing Balls & Bins in Batches: The Power of Leaky Bins

Petra Berenbrink (petra@cs.sfu.ca), Simon Fraser University, Burnaby, B.C., V5A 1S6, Canada

Tom Friedetzky (tom.friedetzky@dur.ac.uk), Durham University, Durham DH1 3LE, U.K.

Peter Kling (pkling@sfu.ca), Simon Fraser University, Burnaby, B.C., V5A 1S6, Canada. Supported in part by the Pacific Institute for the Mathematical Sciences.

Frederik Mallmann-Trenn (mallmann@di.ens.fr)

Lars Nagel (nagell@uni-mainz.de), Johannes Gutenberg-Universität Mainz, 55128 Mainz, Germany. Supported by the German Ministry of Education and Research under Grant 01IH13004.

Chris Wastell (christopher.wastell@dur.ac.uk), Durham University, Durham DH1 3LE, U.K. Supported in part by EPSRC.

###### Abstract

A fundamental problem in distributed computing is the distribution of requests to a set of uniform servers without a centralized controller. Classically, such problems are modelled as static balls into bins processes, where m balls (tasks) are to be distributed to n bins (servers). In a seminal work, Azar et al.  proposed the sequential strategy \textsc{Greedy}[{d}] for n=m. When thrown, a ball queries the load of d random bins and is allocated to a least loaded of these. Azar et al. showed that d=2 yields an exponential improvement compared to d=1. Berenbrink et al.  extended this to m\gg n, showing that the maximal load difference is independent of m for d=2 (in contrast to d=1).

We propose a new variant of an infinite balls into bins process. Each round an expected number of \lambda n new balls arrive and are distributed (in parallel) to the bins. Each non-empty bin deletes one of its balls. This setting models a set of servers processing incoming requests, where clients can query a server’s current load but receive no information about parallel requests. We study the \textsc{Greedy}[{d}] distribution scheme in this setting and show a strong self-stabilizing property: For any arrival rate \lambda=\lambda(n)<1, the system load is time-invariant. Moreover, for any (even super-exponential) round t, the maximum system load is (w.h.p.) \mathrm{O}\bigl{(}\frac{1}{1-\lambda}\cdot\log\frac{n}{1-\lambda}\bigr{)} for d=1 and \mathrm{O}\bigl{(}\log\frac{n}{1-\lambda}\bigr{)} for d=2. In particular, \textsc{Greedy}[{2}] has an exponentially smaller system load for high arrival rates.

## 1 Introduction

One of the fundamental problems in distributed computing is the distribution of requests, tasks, or data items to a set of uniform servers. In order to simplify this process and to avoid a single point of failure, it is often advisable to use a simple, randomized strategy instead of a complex, centralized controller to allocate the requests to the servers. In the most naïve strategy (1-choice), each client chooses the server where to send its request uniformly at random. A more elaborate scheme (2-choice) chooses two (or more) servers, queries their current loads, and sends the request to a least loaded of these. Both approaches are typically modelled as balls-into-bins processes [13, 20, 4, 7, 2, 22, 5], where requests are represented as balls and servers as bins. While the latter approach leads to considerably better load distributions [4, 7], it loses some of its power in parallel settings, where requests arrive in parallel and cannot take each other into account [2, 22].

We propose and study a new infinite and batchwise balls-into-bins process to model the client-server scenario. In a round, each server (bin) consumes one of its current tasks (balls). Afterward, (expectedly) \lambda n tasks arrive and are allocated using a given distribution scheme. The arrival rate \lambda is allowed to be a function of n (e.g., \lambda=1-1/\mathsf{poly}(n)). Standard balls-into-bins results imply that, for high arrival rates, with high probability (w.h.p.) each round there is a bin that receives \Theta(\log n) balls. (An event \mathcal{E} occurs with high probability if \Pr(\mathcal{E})=1-n^{-\Omega(1)}.)

1-choice Process:

The maximum load at an arbitrary time is (w.h.p.) bounded by \mathrm{O}\bigl{(}\frac{1}{1-\lambda}\cdot\log\frac{n}{1-\lambda}\bigr{)}. We also provide a lower bound which is asymptotically tight for \lambda\leq 1-1/\mathsf{poly}(n). While this implies that already the simple 1-choice process is self-stabilizing, the load properties in a “typical” state are poor: even an arrival rate of only \lambda=1-1/n yields a superlinear maximum load.

2-choice Process:

The maximum load at an arbitrary time is (w.h.p.) bounded by \mathrm{O}\bigl(\log\frac{n}{1-\lambda}\bigr). This yields an exponentially better system load compared to the 1-choice process; for any \lambda=1-1/\mathsf{poly}(n) the maximum load remains logarithmic.

### 1.1 Related Work

Let us continue with an overview of related work. We start with classical results for sequential and finite balls-into-bins processes, move on to parallel settings, and then survey infinite and batch-based processes similar to ours. We also briefly mention some results from queuing theory (which is related but studies slightly different quality of service measures and system models).

#### Sequential Setting.

There are many strong, well-known results for the classical, sequential balls-into-bins process. In the sequential setting, m balls are thrown one after another and allocated to n bins. For m=n, the maximum load of any bin is known to be (w.h.p.) (1+\mathrm{o}(1))\cdot\ln(n)/\ln\ln n for the 1-choice process [13, 20] and \ln\ln(n)/\ln d+\Theta(1) for the d-choice process with d\geq 2. If m\geq n\cdot\ln n, the maximum load increases to m/n+\Theta\bigl(\sqrt{m\cdot\ln(n)/n}\bigr) and m/n+\ln\ln(n)/\ln d+\Theta(1), respectively. In particular, note that the number of balls above the average grows with m for d=1 but is independent of m for d\geq 2. This fundamental difference is known as the power of two choices. A similar (if slightly weaker) result was shown by Talwar and Wieder using a quite elegant proof technique (which we also employ and generalize for our analysis in Section 3). Czumaj and Stemann study adaptive allocation processes where the number of a ball’s choices depends on the load of queried bins. The authors subsequently analyze a scenario that allows reallocations.

Berenbrink et al.  adapt the threshold protocol from  (see below) to a sequential setting and m\geq n balls. Here, ball i randomly chooses a bin until it sees a load smaller than 1+i/n. While this is a relatively strong assumption on the balls, this protocol needs only \mathrm{O}(m) choices in total (allocation time) and achieves an almost optimal maximum load of \lceil m/n\rceil+1.

#### Parallel Setting.

Several papers (e.g. [2, 22]) investigated parallel settings of multiple-choice games for the case m=n. Here, all m balls have to be allocated in parallel, but balls and bins might employ some (limited) communication. Adler et al.  consider a trade-off between the maximum load and the number of communication rounds r the balls need to decide for a target bin. Basically, bounds that are close to the classical (sequential) processes can only be achieved if r is close to the maximum load . The authors also give a lower bound on the maximum load if r communication rounds are allowed, and Stemann  provides a matching upper bound via a collision-based protocol.

#### Infinite Processes.

In infinite processes, the number of balls to be thrown is not fixed. Instead, in each of infinitely many rounds, balls are thrown or reallocated and bins possibly delete old balls. Azar et al.  consider an infinite, sequential process starting with n balls arbitrarily assigned to n bins. In each round one random ball is reallocated using the d-choice process. For any t>cn^{2}\log\log n, the maximum load at time t is (w.h.p.) \ln\ln(n)/\ln d+\mathrm{O}(1).

Adler et al.  consider a system where in each round m\leq n/9 balls are allocated. Bins have a FIFO-queue, and each arriving ball is stored in the queue of two random bins. After each round, every non-empty bin deletes its frontmost ball (which automatically removes its copy from the second random bin). It is shown that the expected waiting time is constant and the maximum waiting time is (w.h.p.) \ln\ln(n)/\ln d+\mathrm{O}\mathopen{}\mathclose{{}\left(1}\right). The restriction m\leq n/9 is the major drawback of this process. A differential and experimental study of this process was conducted in . The balls’ arrival times are binomially distributed with parameters n and \lambda=m/n. Their results indicate a stable behaviour for \lambda\leq 0.86. A similar model was considered by Mitzenmacher , who considers ball arrivals as a Poisson stream of rate \lambda n for \lambda<1. It is shown that the 2-choice process reduces the waiting time exponentially compared to the 1-choice process.

Czumaj  presents a framework to study the recovery time of discrete-time dynamic allocation processes. In each round one of n balls is reallocated using the d-choice process. The ball is chosen either by selecting a random bin or by selecting a random ball. From an arbitrary initial assignment, the system is shown to recover to the maximum load from  within \mathrm{O}(n^{2}\ln n) rounds in the former and \mathrm{O}(n\ln n) rounds in the latter case. Becchetti et al.  consider a similar process with only one random choice per ball, also starting from an arbitrary initial assignment of n balls. In each round, one ball is chosen from every non-empty bin and reallocated randomly. The authors define a configuration to be legitimate if the maximum load is \mathrm{O}(\log n). They show that (w.h.p.) any state recovers in linear time to a legitimate state and maintains such a state for \mathsf{poly}(n) rounds.

#### Batch-Processes.

Batch-based processes allocate m balls to n bins in batches of (usually) n balls each, where each batch is allocated in parallel. They lie between (pure) parallel and sequential processes. For m=\tau\cdot n, Stemann  investigates a scenario with n players each having m/n balls. To allocate a ball, every player independently chooses two bins and allocates copies of the ball to both of them. Every bin has two queues (one for first copies, one for second copies) and processes one ball from each queue per round. When a ball is processed, its copy is removed from the system and the player is allowed to initiate the allocation of the next ball. If \tau=\ln n, all balls are processed in \mathrm{O}\mathopen{}\mathclose{{}\left(\ln n}\right) rounds and the waiting time is (w.h.p.) \mathrm{O}\mathopen{}\mathclose{{}\left(\ln\ln n}\right). Berenbrink et al.  study the d-choice process in a scenario where m balls are allocated to n bins in batches of size n each. The authors show that the load of every bin is (w.h.p.) m/n\pm\mathrm{O}\mathopen{}\mathclose{{}\left(\log n}\right). As noted in Lemma 3.5, our analysis can be used to derive the same result by easier means.

#### Queuing Processes.

Batch arrival processes have also been considered in the context of queuing systems. A key motivation for such models stems from the asynchronous transfer mode (ATM) in telecommunication systems. Tasks arrive in batches and are stored in a FIFO queue. Several papers [21, 15, 16, 3] consider scenarios where the number of arriving tasks is determined by a finite state Markov chain. These results study steady state properties of the system to determine quantities of interest (e.g., waiting times or queue lengths). Sohraby and Zhang  use spectral techniques to study a multi-server scenario with an infinite queue. Alfa  considers a discrete-time process for n identical servers and tasks with constant service time s\geq 1. To ensure a stable system, the arrival rate \lambda is assumed to be \leq n/s and tasks are assigned cyclically, which makes it possible to study an arbitrary server (instead of the complete system). Kamal  and Kim et al.  study a system with a finite capacity. Tasks arriving when the buffer is full are lost. The authors study the steady state probability and give empirical results to show the decay of waiting times as n increases.

### 1.2 Model & Preliminaries

We model our load balancing problem as an infinite, parallel balls-into-bins process. Time is divided into discrete, synchronous rounds. There are n bins and n generators, and the initial system is assumed to be empty. At the start of each round, every non-empty bin deletes one ball. Afterward, every generator generates a ball with probability \lambda=\lambda(n)\in[0,1] (the arrival rate), so that \lambda n balls arrive per round in expectation. This generation scheme allows us to consider arrival rates that are arbitrarily close to one (like 1-1/\mathsf{poly}(n)). Generated balls are distributed in the system using a distribution process. In this paper we analyze two specific distribution processes:

1. The 1-choice process \textsc{Greedy}[{1}] assigns every ball to a randomly chosen bin.

2. The 2-choice process \textsc{Greedy}[{2}] assigns every ball to a least loaded among two randomly chosen bins.
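The round structure described above (delete, generate, allocate) can be sketched as a small simulation. This is only an illustrative sketch: the function name `simulate`, the seed handling, the tie-breaking rule, and the choice to sample bins with replacement are assumptions and not taken from the paper. Balls of one batch only see the loads as they were after the deletion phase, reflecting that parallel requests receive no information about each other.

```python
import random

def simulate(n, lam, d, rounds, seed=0):
    """Run the batch process for `rounds` rounds and return the final loads.

    n: number of bins (and generators), lam: arrival rate in [0, 1],
    d: number of random choices per ball (1 or 2 in the paper).
    """
    rng = random.Random(seed)
    loads = [0] * n  # the initial system is empty
    for _ in range(rounds):
        # Deletion phase: every non-empty bin deletes one ball.
        loads = [x - 1 if x > 0 else 0 for x in loads]
        # Generation phase: each generator creates a ball with probability lam.
        arrivals = sum(rng.random() < lam for _ in range(n))
        # Allocation phase: balls of the same batch cannot see each other,
        # so they all compare loads on a snapshot taken after the deletions.
        snapshot = list(loads)
        for _ in range(arrivals):
            choices = [rng.randrange(n) for _ in range(d)]  # with replacement (assumption)
            target = min(choices, key=lambda i: snapshot[i])  # ties: first choice wins
            loads[target] += 1
    return loads
```

Calling, for example, `max(simulate(1000, 0.95, 1, 2000))` and `max(simulate(1000, 0.95, 2, 2000))` gives a rough empirical feel for the difference between the two distribution processes; the concrete values depend on the seed and parameters.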

#### Notation.

The random variable X_{i}(t) denotes the load (number of balls) of the i-th fullest bin at the end of round t. Thus, the load situation (configuration) after round t can be described by the load vector \bm{X}(t)=(X_{i}(t))_{i\in[n]}\in\mathbb{N}^{n}. We define \varnothing(t)\coloneqq\frac{1}{n}\sum_{i=1}^{n}X_{i}(t) as the average load at the end of round t. The value \nu(t) denotes the fraction of non-empty bins after round t and \eta(t)\coloneqq 1-\nu(t) the fraction of empty bins after round t. It will be useful to define \mathbbm{1}_{i}(t)\coloneqq\min\bigl{(}1,X_{i}(t)\bigr{)} and \eta_{i}(t)\coloneqq\mathbbm{1}_{i}(t)-\nu(t) (which equals \eta(t) if i is a non-empty bin and -\nu(t) otherwise).
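To make the notation concrete, the following sketch computes the quantities defined above for a given list of bin loads. The function name `describe` and the return layout are hypothetical; only the formulas are taken from the text.

```python
def describe(loads):
    """Return (X, average, nu, eta, eta_i) for a list of bin loads."""
    n = len(loads)
    x = sorted(loads, reverse=True)           # X_1 >= X_2 >= ... >= X_n
    avg = sum(x) / n                          # average load
    nu = sum(1 for v in x if v > 0) / n       # fraction of non-empty bins
    eta = 1 - nu                              # fraction of empty bins
    indicator = [min(1, v) for v in x]        # 1 if bin is non-empty, else 0
    eta_i = [b - nu for b in indicator]       # eta for non-empty bins, -nu for empty ones
    return x, avg, nu, eta, eta_i
```

For the loads `[3, 0, 1, 0]` this yields the sorted vector `[3, 1, 0, 0]`, average load 1, \nu=\eta=1/2, and \eta_{i} equal to 1/2 for the two non-empty bins and -1/2 for the two empty ones.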

#### Markov Chain Preliminaries.

The evolution of the load vector over time can be interpreted as a Markov chain, since \bm{X}(t) depends only on \bm{X}(t-1) and the random choices during round t. We refer to this Markov chain as \bm{X}. Note that \bm{X} is time-homogeneous (transition probabilities are time-independent), irreducible (every state is reachable from every other state), and aperiodic (path lengths have no period; in fact, our chain is lazy). Recall that such a Markov chain is positive recurrent (or ergodic) if the probability to return to the start state is 1 and the expected return time is finite. In particular, this implies the existence of a unique stationary distribution. Positive recurrence is a standard formalization of the intuitive concept of stability. See  for an excellent introduction into Markov chains and the involved terminology.

## 2 The 1-Choice Process

We present two main results for the 1-choice process: Theorem 2.1 states the stability of the system under the 1-choice process for an arbitrary \lambda, using the standard notion of positive recurrence (cf. Section 1). In particular, this implies the existence of a stationary distribution for the 1-choice process. Theorem 2.2 strengthens this by giving a high probability bound on the maximum load for an arbitrary round t\in\mathbb{N}. Together, both results imply that the 1-choice process is self-stabilizing.

###### Theorem 2.1 (Stability).

Let \lambda=\lambda(n)<1. The Markov chain \bm{X} of the 1-choice process is positive recurrent.

###### Theorem 2.2 (Maximum Load).

Let \lambda=\lambda(n)<1. Fix an arbitrary round t of the 1-choice process. The maximum load of all bins is (w.h.p.) bounded by \mathrm{O}\bigl(\frac{1}{1-\lambda}\cdot\log\frac{n}{1-\lambda}\bigr).

Note that for high arrival rates of the form \lambda(n)=1-\varepsilon(n), the bound given in Theorem 2.2 is inversely proportional to \varepsilon(n). For example, for \varepsilon(n)=1/n the maximal load is \mathrm{O}\mathopen{}\mathclose{{}\left(n\log n}\right). Theorem 2.3 shows that this dependence is unavoidable: the bound given in Theorem 2.2 is tight for large values of \lambda. In Section 3, we will see that the 2-choice process features an exponentially better behaviour for large \lambda.

###### Theorem 2.3.

Let \lambda=\lambda(n)\geq 0.5 and define t\coloneqq 9\lambda\log(n)/(64(1-\lambda)^{2}). With probability 1-\mathrm{o}(1) there is a bin i in step t with load \Omega\bigl(\frac{1}{1-\lambda}\cdot\log n\bigr).

The proofs of these results can be found in the following subsections. We first prove a bound on the maximum load (Theorem 2.2), afterwards we prove stability of the system (Theorem 2.1), and finally we prove the lower bound (Theorem 2.3).

### 2.1 Maximum Load – Proof of Theorem 2.2

###### Proof of Theorem 2.2 (Maximum Load).

We prove Theorem 2.2 using a (slightly simplified) drift theorem from Hajek  (cf. Theorem A.2 in Appendix A). Remember that, as mentioned in Section 1.2, our process is a Markov chain, so we need to condition only on the previous state (instead of the full filtration from Theorem A.2). Our goal is to bound the load of a fixed bin i at time t using Theorem A.2 and, subsequently, to use this with a union bound to bound the maximum load over all bins. To apply Theorem A.2, we have to prove that the maximum load difference of bin i between two rounds is exponentially bounded (Majorization) and that, given a high enough load, the system tends to lose load (Negative Bias). We start with the majorization. The load difference \lvert X_{i}(t+1)-X_{i}(t)\rvert is bounded by \max(1,B_{i}(t))\leq 1+B_{i}(t), where B_{i}(t) is the number of balls bin i receives during round t+1. In particular, we have (\lvert X_{i}(t+1)-X_{i}(t)\rvert\,|\,\bm{X}(t))\prec 1+B_{i}(t). Note that B_{i}(t) is binomially distributed with parameters n and \lambda/n (each of the n generators produces a ball that ends up in bin i with probability \lambda\cdot 1/n). Using standard inequalities we bound

 \Pr\bigl(B_{i}(t)=k\bigr)\leq\binom{n}{k}\cdot\Bigl(\frac{\lambda}{n}\Bigr)^{k}\leq\Bigl(\frac{e\cdot n}{k}\Bigr)^{k}\cdot\Bigl(\frac{1}{n}\Bigr)^{k}=\frac{e^{k}}{k^{k}} (1)

and calculate

 \mathbb{E}\bigl[e^{B_{i}(t)+1}\bigr]\leq e\cdot\sum_{k=0}^{n}e^{k}\cdot\frac{e^{k}}{k^{k}}\leq e\cdot\sum_{k=0}^{\lceil e^{3}-1\rceil}\frac{e^{2k}}{k^{k}}+e\cdot\sum_{k=\lceil e^{3}\rceil}^{\infty}\frac{e^{2k}}{k^{k}}\leq\Theta(1)+\sum_{k=1}^{\infty}e^{-k}=\Theta(1). (2)

This shows that the Majorization condition from Theorem A.2 holds (with \lambda^{\prime}=1 and D=\Theta(1)). To see that the Negative Bias condition also holds, note that if bin i has non-zero load, it is guaranteed to delete one ball and receives in expectation n\cdot\lambda/n=\lambda balls. We get \mathbb{E}[X_{i}(t+1)-X_{i}(t)\,|\,X_{i}(t)>0]\leq\lambda-1<0, establishing the Negative Bias condition (with \varepsilon_{0}=1-\lambda). We can finally apply Theorem A.2 with \eta\coloneqq\min(1,(1-\lambda)/(2D),1/(2-2\lambda))=(1-\lambda)/(2D) and get for b\geq 0

 \Pr\bigl(X_{i}(t)\geq b\bigr)\leq e^{-b\cdot\eta}+\frac{2D}{\eta\cdot(1-\lambda)}\cdot e^{\eta\cdot(1-b)}\leq\frac{2\cdot(2D)^{2}}{(1-\lambda)^{2}}\cdot e^{\frac{(1-\lambda)\cdot(1-b)}{2D}}=\frac{c}{(1-\lambda)^{2}}\cdot e^{-\frac{b\cdot(1-\lambda)}{c}}, (3)

where c denotes a suitable constant. Applying a union bound to all n bins and choosing b\coloneqq\frac{c}{1-\lambda}\cdot\ln\bigl(\frac{c\cdot n^{a+1}}{(1-\lambda)^{2}}\bigr) yields \Pr\bigl(\max_{i\in[n]}X_{i}(t)\geq b\bigr)\leq n^{-a}. The theorem’s statement now follows from

 b=\frac{c}{1-\lambda}\cdot\ln\Bigl(\frac{c\cdot n^{a+1}}{(1-\lambda)^{2}}\Bigr)\leq\frac{c\cdot(a+1)+1}{1-\lambda}\cdot\ln\Bigl(\frac{n}{1-\lambda}\Bigr)=\mathrm{O}\Bigl(\frac{1}{1-\lambda}\cdot\ln\Bigl(\frac{n}{1-\lambda}\Bigr)\Bigr). (4)
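As a cross-check on the constant D from Equation (2), the expectation can also be evaluated exactly via the moment generating function of the binomial distribution. This is not the route taken in the proof above, only a short alternative calculation; it uses nothing beyond the fact that B_{i}(t) is binomially distributed with parameters n and \lambda/n, together with 1+x\leq e^{x} and \lambda\leq 1.

```latex
\mathbb{E}\bigl[e^{B_i(t)+1}\bigr]
  = e\cdot\Bigl(1+\frac{\lambda}{n}\cdot(e-1)\Bigr)^{n}
  \leq e\cdot e^{\lambda\cdot(e-1)}
  \leq e^{e}
  = \Theta(1).
```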

### 2.2 Stability – Proof of Theorem 2.1

In the following, we provide an auxiliary lemma that will prove useful to derive the stability of the 1-choice process.

###### Lemma 2.4.

Let \lambda=\lambda(n)<1. Fix an arbitrary round t of the 1-choice process and a bin i. There is a constant c>0 such that the expected load of bin i is bounded by \frac{6c}{1-\lambda}\cdot\ln\bigl{(}\frac{e\cdot c}{1-\lambda}\bigr{)}.

###### Proof.

To get a bound on the expected load of bin i, note that the probability in Equation (3) (see proof of Theorem 2.2) is 1 for b\leq\gamma\coloneqq\frac{c}{1-\lambda}\cdot\ln\bigl(\frac{e\cdot c}{(1-\lambda)^{2}}\bigr).

Considering time windows of \gamma rounds each, we calculate

 \begin{aligned}
 \mathbb{E}\bigl[X_{i}(t)\bigr]&\leq\sum_{b=1}^{\gamma}b\cdot\Pr\bigl(X_{i}(t)=b\bigr)+\sum_{k=1}^{\infty}\sum_{b=k\cdot\gamma}^{(k+1)\gamma}b\cdot\Pr\bigl(X_{i}(t)=b\bigr)\\
 &\leq\gamma+\sum_{k=1}^{\infty}(k+1)\cdot\gamma\cdot\Pr\bigl(X_{i}(t)\geq k\cdot\gamma\bigr)\leq\gamma+\sum_{k=1}^{\infty}(k+1)\cdot\gamma\cdot e^{-k}\\
 &\leq 3\gamma\leq\frac{6c}{1-\lambda}\cdot\ln\Bigl(\frac{e\cdot c}{1-\lambda}\Bigr).
 \end{aligned} (5)

This finishes the proof. ∎
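The tail estimate \Pr(X_{i}(t)\geq k\cdot\gamma)\leq e^{-k} used in the calculation above can be spelled out by substituting b=k\cdot\gamma into Equation (3). The sketch below assumes c\geq(1-\lambda)^{2} (which holds, for instance, whenever c\geq 1); this assumption is not stated explicitly in the text. The second line evaluates the resulting series, which explains the factor 3 in front of \gamma.

```latex
\Pr\bigl(X_i(t)\geq k\gamma\bigr)
  \leq \frac{c}{(1-\lambda)^{2}}\cdot e^{-\frac{k\gamma\cdot(1-\lambda)}{c}}
  = \frac{c}{(1-\lambda)^{2}}\cdot\Bigl(\frac{e\cdot c}{(1-\lambda)^{2}}\Bigr)^{-k}
  = e^{-k}\cdot\Bigl(\frac{c}{(1-\lambda)^{2}}\Bigr)^{1-k}
  \leq e^{-k}
  \quad\text{for } k\geq 1,
\\[1ex]
\sum_{k=1}^{\infty}(k+1)\cdot e^{-k}
  = \frac{e}{(e-1)^{2}}+\frac{1}{e-1}
  = \frac{2e-1}{(e-1)^{2}}
  < 2 .
```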

###### Proof of Theorem 2.1 (Stability).

We prove Theorem 2.1 using a result from Fayolle et al.  (cf. Theorem A.1 in Appendix A). Note that \bm{X} is a time-homogeneous irreducible Markov chain with a countable state space. For a configuration \bm{x} we define the auxiliary potential \Psi(\bm{x})\coloneqq\sum_{i=1}^{n}x_{i} as the total system load of configuration \bm{x}. Consider the (finite) set C\coloneqq\{\bm{x}\mid\Psi(\bm{x})\leq n^{4}/(1-\lambda)^{2}\} of all configurations with not too much load. To prove positive recurrence, it remains to show that Condition 1 (expected potential drop in a high-load configuration, i.e., outside C) and Condition 2 (finite potential) of Theorem A.1 hold. In the following, let \Delta\coloneqq\frac{n^{3}}{(1-\lambda)^{2}}.

Let us start with Condition 1. Fix a round t and let \bm{x}=\bm{X}(t)\not\in C. By definition of C, we have \Psi(\bm{x})>n^{4}/(1-\lambda)^{2}, such that there is at least one bin i with load x_{i}\geq\Psi(\bm{x})/n>n^{3}/(1-\lambda)^{2}. In particular, note that x_{i}\geq\Delta, such that during each of the next \Delta rounds exactly one ball is deleted from bin i. On the other hand, bin i receives in expectation \Delta\cdot\lambda n\cdot\frac{1}{n}=\lambda\Delta balls during the next \Delta rounds. We get \mathbb{E}[X_{i}(t+\Delta)-x_{i}\,|\,\bm{X}(t)=\bm{x}]=\lambda\Delta-\Delta=-(1-\lambda)\cdot\Delta. For any bin j\neq i, we assume pessimistically that no ball is deleted. Note that the expected load increase of each of these bins can be majorized by the load increase in an empty system running for \Delta rounds. Thus, we can use Lemma 2.4 to bound the expected load increase in each of these bins by \frac{6c}{1-\lambda}\cdot\ln\bigl(\frac{e\cdot c}{1-\lambda}\bigr)\leq\frac{6e\cdot c^{2}}{(1-\lambda)^{2}}\leq\Delta/n^{2}. We get

 \mathbb{E}\bigl[\Psi(\bm{X}(t+\Delta))-\Psi(\bm{x})\,|\,\bm{X}(t)=\bm{x}\bigr]\leq-(1-\lambda)\cdot\Delta+(n-1)\cdot\frac{\Delta}{n^{2}}\leq-\Delta\cdot\Bigl(1-\lambda-\frac{1}{n}\Bigr)\leq-\Delta\cdot\frac{1-\lambda}{2}.

This proves Condition 1 of Theorem A.1. For Condition 2, assume \bm{x}=\bm{X}(t)\in C. We bound the system load after \Delta rounds trivially by

 \mathbb{E}\bigl[\Psi(\bm{X}(t+\Delta))\,|\,\bm{X}(t)=\bm{x}\bigr]\leq\Psi(\bm{x})+\Delta\cdot n\leq\frac{n^{4}}{(1-\lambda)^{2}}+\Delta\cdot n<\infty (6)

(note that the finiteness in Theorem A.1 is with respect to time, not n). This finishes the proof. ∎

### 2.3 Lower Bound on Maximum Load – Proof of Theorem 2.3

###### Proof of Theorem 2.3 (Lower Bound).

To show this result we will use the bound of Theorem A.3, which lower bounds the maximum number of balls a bin receives when m balls are allocated into n bins. The idea of the proof is as follows. We assume that we start with an empty system and apply Theorem A.3 to m=\lambda tn balls. The theorem says that one of the bins is likely to get much more than \lambda t balls, which allows us to show that the load of this bin is large, even if the bin was able to delete a ball during each of the t observed time steps.

Let m(t^{\prime}) be the number of balls allocated during the first t^{\prime} steps and let b_{u}(t^{\prime}) be the number of these balls that are allocated to bin u. Set t=9\lambda\log(n)/(64(1-\lambda)^{2}), assume \lambda>0.5 and assume \lambda nt\leq n\cdot(\log n)^{c} for a constant c. Since the expected number of balls is \lambda nt\geq n\log n, we can use Chernoff bounds to show that w.h.p. at least (1-\epsilon)\cdot\mathbb{E}[m(t)] balls are generated for very small \epsilon.

Here,

 \mathbb{E}\bigl[m(t)\bigr]=t\cdot\lambda n=\tfrac{9\lambda\log n}{64(1-\lambda)^{2}}\cdot\lambda n.

Using Chernoff’s inequality we can show that w.h.p. m(t)\geq(1-\epsilon)\cdot\frac{9\lambda\log n}{64(1-\lambda)^{2}}\cdot\lambda n for an arbitrarily small constant \epsilon. By Theorem A.3 (Case 3) with \alpha=\sqrt{8/9} we get (w.h.p.) for some bin u

 b_{u}(t)\geq(1-\epsilon)\cdot\frac{9\lambda\log n}{64(1-\lambda)^{2}}\cdot\lambda+\sqrt{(1-\epsilon)\cdot\frac{16\cdot 9\lambda(\log n)^{2}}{9\cdot 64(1-\lambda)^{2}}\cdot\lambda}. (7)

Since bin u deletes at most one ball in each of the t rounds, we have X_{u}(t)\geq b_{u}(t)-t and derive

 \begin{aligned}
 X_{u}(t)&\geq(1-\epsilon)\cdot\frac{9}{64}\frac{\lambda^{2}\log n}{(1-\lambda)^{2}}+\sqrt{(1-\epsilon)\cdot\frac{16\cdot 9\lambda^{2}\cdot(\log n)^{2}}{9\cdot 64(1-\lambda)^{2}}}-\frac{9\lambda\log n}{64(1-\lambda)^{2}}\\
 &=\lambda\cdot(1-\epsilon)\cdot\frac{9}{64}\frac{\lambda\log n}{(1-\lambda)^{2}}+\sqrt{\frac{(1-\epsilon)}{4}}\cdot\frac{\lambda\log n}{(1-\lambda)}-\frac{9\lambda\log n}{64(1-\lambda)^{2}}\\
 &=\sqrt{\frac{(1-\epsilon)}{4}}\cdot\frac{\lambda\log n}{(1-\lambda)}+(\lambda\cdot(1-\epsilon)-1)\cdot\frac{9\lambda\log n}{64(1-\lambda)^{2}}\\
 &\geq\sqrt{\frac{(1-\epsilon)}{4}}\cdot\frac{\lambda\log n}{(1-\lambda)}+((1-\epsilon^{\prime})-1)\cdot\frac{9\lambda\log n}{64(1-\lambda)^{2}}\\
 &\geq\sqrt{\frac{(1-\epsilon)}{4}}\cdot\frac{\lambda\log n}{(1-\lambda)}-\epsilon^{\prime}\cdot\frac{9}{64}\frac{\lambda\log n}{(1-\lambda)}\\
 &=\Omega\Bigl(\frac{\lambda\log n}{1-\lambda}\Bigr).
 \end{aligned}

## 3 The 2-Choice Process

We continue with the study of the 2-choice process. Here, new balls are distributed according to \textsc{Greedy}[{2}] (cf. description in Section 1.2). Our main results are the following theorems, which are the counterparts of the corresponding theorems for the 1-choice process.

###### Theorem 3.1 (Stability).

Let \lambda=\lambda(n)\in[1/4,1). The Markov chain \bm{X} of the 2-choice process is positive recurrent.

###### Theorem 3.2 (Maximum Load).

Let \lambda=\lambda(n)\in[1/4,1). Fix an arbitrary round t of the 2-choice process. The maximum load of all bins is (w.h.p.) bounded by \mathrm{O}\bigl(\log\frac{n}{1-\lambda}\bigr).

Note that Theorem 3.2 implies a much better behaved system than we saw in Theorem 2.2 for the 1-choice process. In particular, it allows for an exponentially higher arrival rate: for \lambda(n)=1-1/\mathsf{poly}(n) the 2-choice process maintains a maximal load of \mathrm{O}\mathopen{}\mathclose{{}\left(\log n}\right). In contrast, for the same arrival rate the 1-choice process results in a system with maximal load \Omega\mathopen{}\mathclose{{}\left(\mathsf{poly}(n)}\right).
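To see the gap between the two bounds numerically, the following sketch evaluates the expressions inside the O-notation of Theorem 2.2 and Theorem 3.2. The hidden constants are ignored (effectively set to 1), so the outputs only illustrate the shape of the bounds, not actual maximum loads; the function names are hypothetical.

```python
import math

def bound_shape_one_choice(n, lam):
    """(1/(1-lam)) * ln(n/(1-lam)): shape of the 1-choice bound, constants dropped."""
    return math.log(n / (1 - lam)) / (1 - lam)

def bound_shape_two_choice(n, lam):
    """ln(n/(1-lam)): shape of the 2-choice bound, constants dropped."""
    return math.log(n / (1 - lam))
```

For \lambda=1-1/n the two expressions differ exactly by the factor 1/(1-\lambda)=n: the 2-choice expression is 2\ln n, while the 1-choice expression is 2n\ln n.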

Our analysis of the 2-choice process relies to a large part on a good bound on the smoothness (the maximum load difference between any two bins). This is stated in the following lemma. This result is of independent interest, showing that even if the arrival rate is 1-e^{-n}, where we get a polynomial system load, the maximum load difference is still logarithmic.

###### Lemma 3.3 (Smoothness).

Let \lambda=\lambda(n)\in[1/4,1]. Fix an arbitrary round t of the 2-choice process. The load difference of all bins is (w.h.p.) bounded by \mathrm{O}\mathopen{}\mathclose{{}\left(\ln n}\right).

#### Analysis Overview.

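To prove these results, we combine three different potential functions. For a configuration \bm{x} with average load \varnothing and for a suitable constant \alpha (to be fixed later), we define the smoothness potential \Phi and the total load potential \Psi as

 \Phi(\bm{x})\coloneqq\sum_{i\in[n]}\bigl(e^{\alpha\cdot(x_{i}-\varnothing)}+e^{\alpha\cdot(\varnothing-x_{i})}\bigr)\quad\text{and}\quad\Psi(\bm{x})\coloneqq\sum_{i\in[n]}x_{i}; (8)

the third potential \Gamma combines \Phi and \Psi.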

The potential \Phi measures the smoothness (basically the maximum load difference to the average) of a configuration and is used to prove Lemma 3.3 (Section 3.1). The proof is based on the observation that whenever the load of a bin is far from the average load, it decreases in expectation. The potential \Psi measures the total load of a configuration and is used, in combination with our results on the smoothness, to prove Theorem 3.2 (Section 3.2). The potential \Gamma entangles the smoothness and total load, allowing us to prove Theorem 3.1 (Section 3.3). The proof is based on the fact that whenever \Gamma is large (i.e., the configuration is not smooth or it has a huge total load) it decreases in expectation.

Before we continue with our analysis, let us make a simple but useful observation concerning the smoothness: For any configuration \bm{x} and value b\geq 0, the inequality \Phi(\bm{x})\leq e^{\alpha\cdot b} implies (by definition of \Phi) \max_{i}\lvert x_{i}-\varnothing\rvert\leq b. That is, the load difference of any bin to the average is at most b and, thus, the load difference between any two bins is at most 2b. We capture this in the following observation.

###### Observation 3.4.

Let b\geq 0 and consider a configuration \bm{x} with average load \varnothing. If \Phi(\bm{x})\leq e^{\alpha\cdot b}, then \lvert x_{i}-\varnothing\rvert\leq b for all i\in[n]. In particular, \max_{i}(x_{i})-\min_{i}(x_{i})\leq 2b.
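Observation 3.4 can be checked on concrete configurations with a few lines of code. The sketch below uses the smoothness potential in the form \Phi(\bm{x})=\sum_{i}\bigl(e^{\alpha(x_{i}-\varnothing)}+e^{\alpha(\varnothing-x_{i})}\bigr), i.e., the sum of the upper and lower potentials from Section 3.1. The function names and the value of \alpha in the usage example are hypothetical; the paper fixes \alpha later in the analysis.

```python
import math

def phi(loads, alpha):
    """Smoothness potential: sum of upper and lower exponential deviations."""
    avg = sum(loads) / len(loads)
    return sum(math.exp(alpha * (x - avg)) + math.exp(alpha * (avg - x))
               for x in loads)

def deviation_bound(loads, alpha):
    """Smallest b with phi(loads) <= e^(alpha * b), namely b = ln(phi) / alpha."""
    return math.log(phi(loads, alpha)) / alpha
```

For any configuration, every bin then deviates from the average by at most `deviation_bound(loads, alpha)`, and the gap between the fullest and the emptiest bin is at most twice that value.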

### 3.1 Bounding the Smoothness

The goal of this section is to prove Lemma 3.3. To do so, we show the following bound on the expected smoothness (potential \Phi) at an arbitrary time t:

###### Lemma 3.5.

Let \lambda\in[1/4,1]. Fix an arbitrary round t of the 2-choice process. There is a constant \varepsilon>0 such that

 \mathbb{E}\bigl[\Phi(\bm{X}(t))\bigr]\leq\frac{n}{\varepsilon}. (9)

(Footnote: For \Phi, the condition \lambda\geq 1/4 can be substituted with \lambda=\Omega(1) with only minor changes in the analysis. Moreover, the analysis can be easily adapted for a process that (deterministically) throws \lambda\cdot n balls in each round, even for \lambda>1 as long as it is a constant. Finally, one can easily adapt the analysis to cover the process without deletions by setting \eta_{i}(t)=0 (see Observation 3.6). Using Markov’s inequality, this yields the same result as Berenbrink et al. for the batch setting (cf. Section 1.1) using a simpler analysis.)

Note that Lemma 3.5 together with Observation 3.4 immediately implies Lemma 3.3 by a simple application of Markov’s inequality to bound the probability that \Phi(\bm{X}(t))\geq n^{2}/\varepsilon^{2}.

Our proof of Lemma 3.5 follows the lines of [19, 24], who used the same potential function to analyze variants of the sequential d-choice process without deletions. While the basic idea of showing a relative drop when the potential is high combined with a bounded absolute increase in the general case is the same, our analysis turns out much more involved. In particular, not only do we have to deal with deletions and throwing balls in batches but the size of each batch is also a random variable. Once Lemma 3.5 is proven, Lemma 3.3 emerges by combining Observation 3.4, Lemma 3.5, and Markov’s inequality as follows:

 \Pr\mathopen{}\mathclose{{}\left({\max_{i}X_{i}(t)-\min_{i}X_{i}(t)\geq\frac{4% }{\alpha}\cdot\ln\mathopen{}\mathclose{{}\left(\frac{n}{\varepsilon}}\right)}}% \right)\leq\Pr\mathopen{}\mathclose{{}\left({\Phi(\bm{X}(t))\geq\frac{n^{2}}{% \varepsilon^{2}}}}\right)\leq\frac{\varepsilon}{n}. (10)

It remains to prove Lemma 3.5. Recall the definition of \Phi(\bm{x}) from Equation (8). We split the potential into two parts, \Phi(\bm{x})\coloneqq\Phi_{+}(\bm{x})+\Phi_{-}(\bm{x}). Here, \Phi_{+}(\bm{x})\coloneqq\sum_{i}e^{\alpha\cdot(x_{i}-\varnothing)} denotes the upper potential of \bm{x} and \Phi_{-}(\bm{x})\coloneqq\sum_{i}e^{\alpha\cdot(\varnothing-x_{i})} denotes the lower potential of \bm{x}. For a fixed bin i, we use \Phi_{i,+}(\bm{x})\coloneqq e^{\alpha\cdot(x_{i}-\varnothing)} and \Phi_{i,-}(\bm{x})\coloneqq e^{\alpha\cdot(\varnothing-x_{i})} to denote i’s contribution to the upper and lower potential, respectively. When we consider the effect of a fixed round t+1, we will sometimes omit the time parameter and use prime notation to denote the value of a parameter at the end of round t+1. For example, we write X_{i} and X^{\prime}_{i} for the load of bin i at the beginning and at the end of round t+1, respectively.
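As a small illustration of how the two potentials relate to Observation 3.4, the following sketch computes the upper and lower potential of a load vector and derives the deviation bound b = \ln(\Phi)/\alpha. The function names, the example configuration, and the value of \alpha are hypothetical and chosen only for illustration.

```python
import math

def potentials(loads, alpha):
    """Return (upper, lower) potential of a load vector around its average."""
    avg = sum(loads) / len(loads)
    phi_plus = sum(math.exp(alpha * (x - avg)) for x in loads)
    phi_minus = sum(math.exp(alpha * (avg - x)) for x in loads)
    return phi_plus, phi_minus

def deviation_bound(loads, alpha):
    """Smallest b with Phi(x) <= exp(alpha * b), i.e. b = ln(Phi) / alpha."""
    phi_plus, phi_minus = potentials(loads, alpha)
    return math.log(phi_plus + phi_minus) / alpha

loads = [7, 5, 5, 4, 3, 0]   # hypothetical configuration
alpha = 0.1                  # hypothetical constant
b = deviation_bound(loads, alpha)
avg = sum(loads) / len(loads)
max_dev = max(abs(x - avg) for x in loads)
# Each bin contributes at least exp(alpha * |x_i - avg|) to Phi, so no bin
# can deviate from the average by more than b.
print(max_dev <= b, max(loads) - min(loads) <= 2 * b)
```

The bound is loose for small configurations; it only becomes informative once \Phi is known to be polynomial in n.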

We start with two simple but useful identities regarding the potential drop \Delta_{i,+}(t+1) (and \Delta_{i,-}(t+1)) due to a fixed bin i during round t+1.

###### Observation 3.6.

Fix a bin i, let K denote the number of balls that are placed during round t+1 and let k\leq K be the number of these balls that fall into bin i. Then

1. \Delta_{i,+}(t+1)=\Phi_{i,+}(\bm{X}(t))\cdot\mathopen{}\mathclose{{}\left(e^{% \alpha\cdot(k-\eta_{i}(t)-K/n)}-1}\right) and

2. \Delta_{i,-}(t+1)=\Phi_{i,-}(\bm{X}(t))\cdot\mathopen{}\mathclose{{}\left(e^{-% \alpha\cdot(k-\eta_{i}(t)-K/n)}-1}\right).

We now derive the main technical lemma that states general bounds on the expected upper and lower potential change during a single round. This will be used to derive bounds on the potential change in different situations. For this, let p_{i}\coloneqq(\frac{i}{n})^{2}-(\frac{i-1}{n})^{2}=\frac{2i-1}{n^{2}} (the probability that a ball thrown with \textsc{Greedy}[{2}] falls into the i-th fullest bin). We also define \hat{\alpha}\coloneqq e^{\alpha}-1 and \check{\alpha}\coloneqq 1-e^{-\alpha}. Note that \hat{\alpha}\in(\alpha,\alpha+\alpha^{2}) and \check{\alpha}\in(\alpha-\alpha^{2},\alpha) for \alpha\in(0,1.7). This follows easily from the Taylor approximation e^{x}\leq 1+x+x^{2}, which holds for any x\in(-\infty,1.7] (we will use this approximation several times in the analysis). Finally, let \hat{\delta}_{i}\coloneqq\lambda n\cdot(\sfrac{1}{n}\cdot\check{1}-p_{i}\cdot% \sfrac{\hat{\alpha}}{\alpha}) and \check{\delta}_{i}\coloneqq\lambda n\cdot(\sfrac{1}{n}\cdot\hat{1}-p_{i}\cdot% \sfrac{\check{\alpha}}{\alpha}), where \check{1}\coloneqq 1-\alpha/n<1<\hat{1}\coloneqq 1+\alpha/n. These \hat{\delta}_{i} and \check{\delta}_{i} values can be thought of as upper/lower bounds on the expected difference in the number of balls that fall into bin i under the 1-choice and 2-choice process, respectively (note that \hat{1}, \check{1}, \hat{\alpha}/\alpha, and \check{\alpha}/\alpha are all constants close to 1).
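The quantities introduced above can be sanity-checked numerically. The sketch below (helper names and test values are assumptions made for illustration) confirms that the probabilities p_{i} form a distribution and that \hat{\alpha} and \check{\alpha} lie in the stated intervals for a few values of \alpha below 1.7.

```python
import math

def greedy2_probs(n):
    """p_i = (2i - 1) / n^2 for the i-th fullest bin, i = 1..n."""
    return [(2 * i - 1) / n**2 for i in range(1, n + 1)]

def alpha_hat(alpha):
    """alpha-hat = e^alpha - 1."""
    return math.exp(alpha) - 1.0

def alpha_check(alpha):
    """alpha-check = 1 - e^(-alpha)."""
    return 1.0 - math.exp(-alpha)

n = 16
p = greedy2_probs(n)
total = sum(p)  # telescoping sum of (i/n)^2 - ((i-1)/n)^2

alphas = [0.01, 0.1, 0.5, 1.0, 1.69]
hat_ok = all(a < alpha_hat(a) < a + a * a for a in alphas)
check_ok = all(a - a * a < alpha_check(a) < a for a in alphas)
print(abs(total - 1.0) < 1e-12, hat_ok, check_ok)
```

Note that p_{i} increases with i, so emptier bins (larger rank i) are more likely to receive a ball.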

###### Lemma 3.7.

Consider a bin i after round t and a constant \alpha\leq 1.

1. For the expected change of i’s upper potential during round t+1 we have

 \frac{\mathbb{E}\mathopen{}\mathclose{{}\left[{\Delta_{i,+}(t+1)|\bm{X}(t)}}% \right]}{\Phi_{i,+}(\bm{X}(t))}\leq-\alpha\cdot\mathopen{}\mathclose{{}\left(% \eta_{i}+\hat{\delta}_{i}}\right)+\alpha^{2}\cdot\mathopen{}\mathclose{{}\left% (\eta_{i}+\hat{\delta}_{i}}\right)^{2}. (11)
2. For the expected change of i’s lower potential during round t+1 we have

 \frac{\mathbb{E}\mathopen{}\mathclose{{}\left[{\Delta_{i,-}(t+1)|\bm{X}(t)}}% \right]}{\Phi_{i,-}(\bm{X}(t))}\leq\alpha\cdot\mathopen{}\mathclose{{}\left(% \eta_{i}+\check{\delta}_{i}}\right)+\alpha^{2}\cdot\mathopen{}\mathclose{{}% \left(\eta_{i}+\check{\delta}_{i}}\right)^{2}. (12)
###### Proof.

For the first statement, we use Observation 3.6 to calculate

 \begin{aligned} \frac{\mathbb{E}\left[\Delta_{i,+}(t+1)\mid\bm{X}\right]}{\Phi_{i,+}} &=\sum_{K=0}^{n}\sum_{k=0}^{K}\binom{n}{K}\binom{K}{k}\cdot(p_{i}\lambda)^{k}\cdot\bigl((1-p_{i})\lambda\bigr)^{K-k}\cdot(1-\lambda)^{n-K}\cdot\left(e^{\alpha\cdot(k-\eta_{i}-K/n)}-1\right)\\
 &=\sum_{K=0}^{n}\binom{n}{K}(1-\lambda)^{n-K}\cdot\lambda^{K}\sum_{k=0}^{K}\binom{K}{k}\cdot p_{i}^{k}\cdot(1-p_{i})^{K-k}\cdot\left(e^{\alpha\cdot(k-\eta_{i}-K/n)}-1\right)\\
 &=\sum_{K=0}^{n}\binom{n}{K}(1-\lambda)^{n-K}\cdot\lambda^{K}\cdot\left(e^{-\alpha(\eta_{i}+K/n)}\sum_{k=0}^{K}\binom{K}{k}\cdot(e^{\alpha}\cdot p_{i})^{k}\cdot(1-p_{i})^{K-k}-1\right)\\
 &=\sum_{K=0}^{n}\binom{n}{K}(1-\lambda)^{n-K}\cdot\lambda^{K}\cdot\left(e^{-\alpha(\eta_{i}+K/n)}\cdot\left(1+\hat{\alpha}\cdot p_{i}\right)^{K}-1\right), \end{aligned}

where we first apply the law of total expectation together with Observation 3.6 and afterward the binomial theorem. Applying the binomial theorem once more (to the sum over K) and continuing with the aforementioned Taylor approximation e^{x}\leq 1+x+x^{2} (which holds for any x\in(-\infty,1.7]) and the definition of \hat{\delta}_{i} yields

 \begin{aligned} {}&=e^{-\alpha\eta_{i}}\cdot\bigl(1-\lambda+\lambda e^{-\alpha/n}\cdot(1+\hat{\alpha}\cdot p_{i})\bigr)^{n}-1\leq e^{-\alpha\eta_{i}}\cdot\bigl(1-\lambda(1-e^{-\alpha/n})+\lambda\cdot\hat{\alpha}\cdot p_{i}\bigr)^{n}-1\\
 &\leq e^{-\alpha\eta_{i}}\cdot\left(1-\frac{\lambda\cdot\alpha}{n}\cdot(1-\alpha/n)+\lambda\cdot\hat{\alpha}\cdot p_{i}\right)^{n}-1\leq e^{-\alpha\eta_{i}}\cdot\left(1-\frac{\alpha}{n}\cdot\hat{\delta}_{i}\right)^{n}-1\leq e^{-\alpha\cdot\left(\eta_{i}+\hat{\delta}_{i}\right)}-1. \end{aligned}

Now, the claim follows by another application of the Taylor approximation. The second statement follows similarly. ∎
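The chain of (in)equalities in the proof can be checked numerically for small instances. The following sketch evaluates the double sum over K and k directly, compares it with the closed form obtained from the binomial theorem, and verifies the final exponential bound. All function names and parameter values are hypothetical and serve only as an illustration, assuming \eta_{i} is a 0/1 value.

```python
import math

def exact_relative_change(n, lam, alpha, p_i, eta_i):
    """E[Delta_{i,+}] / Phi_{i,+} evaluated as the double sum over K and k."""
    total = 0.0
    for K in range(n + 1):
        w_K = math.comb(n, K) * lam**K * (1 - lam) ** (n - K)
        for k in range(K + 1):
            w_k = math.comb(K, k) * p_i**k * (1 - p_i) ** (K - k)
            total += w_K * w_k * (math.exp(alpha * (k - eta_i - K / n)) - 1)
    return total

def closed_form(n, lam, alpha, p_i, eta_i):
    """Value after applying the binomial theorem to both sums."""
    a_hat = math.exp(alpha) - 1
    base = 1 - lam + lam * math.exp(-alpha / n) * (1 + a_hat * p_i)
    return math.exp(-alpha * eta_i) * base**n - 1

def upper_bound(n, lam, alpha, p_i, eta_i):
    """Final bound exp(-alpha * (eta_i + delta_hat_i)) - 1."""
    a_hat = math.exp(alpha) - 1
    delta_hat = lam * n * ((1 - alpha / n) / n - p_i * a_hat / alpha)
    return math.exp(-alpha * (eta_i + delta_hat)) - 1

n, lam, alpha = 12, 0.75, 0.1        # hypothetical parameters
rank = 3                             # the 3rd fullest bin
p_i = (2 * rank - 1) / n**2
exact = exact_relative_change(n, lam, alpha, p_i, eta_i=1)
closed = closed_form(n, lam, alpha, p_i, eta_i=1)
bound = upper_bound(n, lam, alpha, p_i, eta_i=1)
print(abs(exact - closed) < 1e-9, closed <= bound)
```

The check covers only the exponential bound; the quadratic form in Lemma 3.7 then follows from one more use of e^{x}\leq 1+x+x^{2}.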

Using Lemma 3.7, we derive different bounds on the potential drop that will be used in the various situations. The proofs for the following statements can all be found in Appendix C.

We start with a result that will be used when the potential is relatively high.

###### Lemma 3.8.

Consider a round t and a constant \alpha\leq\ln(10/9) (<1/8). Let R\in\set{+,-} and \lambda\in[1/4,1]. For the expected upper and lower potential drop during round t+1 we have

 \mathbb{E}\mathopen{}\mathclose{{}\left[{\Delta_{R}(t+1)|\bm{X}(t)}}\right]<2% \alpha\lambda\cdot\Phi_{R}(\bm{X}(t)). (13)

The next lemma derives a bound that is used to bound the upper potential change in reasonably balanced configurations.

###### Lemma 3.9.

Consider a round t and the constants \varepsilon (from Claim B.2) and \alpha\leq\min(\ln(10/9),\varepsilon/4). Let \lambda\in[1/4,1] and assume X_{\frac{3}{4}n}(t)\leq\varnothing(t). For the expected upper potential drop during round t+1 we have

 \mathbb{E}\mathopen{}\mathclose{{}\left[{\Delta_{+}(t+1)|\bm{X}(t)}}\right]% \leq-\varepsilon\alpha\lambda\cdot\Phi_{+}(\bm{X}(t))+2\alpha\lambda n. (14)

The next lemma derives a bound that is used to bound the lower potential drop in reasonably balanced configurations.

###### Lemma 3.10.

Consider a round t and the constants \varepsilon (from Claim B.2) and \alpha\leq\min(\ln(\sfrac{10}{9}),\varepsilon/8). Let \lambda\in[1/4,1] and assume X_{\frac{n}{4}}(t)\geq\varnothing(t). For the expected lower potential drop during round t+1 we have

 \mathbb{E}\mathopen{}\mathclose{{}\left[{\Delta_{-}(t+1)|\bm{X}(t)}}\right]% \leq-\varepsilon\alpha\lambda\cdot\Phi_{-}(\bm{X}(t))+\frac{\alpha\lambda n}{2}. (15)

The next lemma derives a bound that will be used to bound the potential drop in configurations with many balls far below the average to the right.

###### Lemma 3.11.

Consider a round t and constants \alpha\leq 1/46 (<\ln(10/9)) and \varepsilon\leq 1/3. Let \lambda\in[1/4,1] and assume X_{\frac{3}{4}n}(t)\geq\varnothing(t) and \mathbb{E}\mathopen{}\mathclose{{}\left[{\Delta_{+}(t+1)|\bm{X}(t)}}\right]% \geq-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi_{+}(\bm{X}(t)). Then we have either \Phi_{+}(\bm{X}(t))\leq\frac{\varepsilon}{4}\cdot\Phi_{-}(\bm{X}(t)) or \Phi(\bm{X}(t))=\varepsilon^{-8}\cdot\mathrm{O}\mathopen{}\mathclose{{}\left(n% }\right).

The next lemma derives a bound that will be used to bound the potential drop in configurations with many balls far above the average to the left.

###### Lemma 3.12.

Consider a round t and constants \alpha\leq 1/32 (<\ln(10/9)) and \varepsilon\leq 1. Let \lambda\in[1/4,1] and assume X_{\frac{n}{4}}(t)\leq\varnothing(t) and \mathbb{E}\mathopen{}\mathclose{{}\left[{\Delta_{-}(t+1)|\bm{X}(t)}}\right]% \geq-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi_{-}(\bm{X}(t)). Then we have either \Phi_{-}(\bm{X}(t))\leq\frac{\varepsilon}{4}\cdot\Phi_{+}(\bm{X}(t)) or \Phi(\bm{X}(t))=\varepsilon^{-8}\cdot\mathrm{O}\mathopen{}\mathclose{{}\left(n% }\right).

Putting all these lemmas together, we can derive the following bound on the potential change during a single round.

###### Lemma 3.13.

Consider an arbitrary round t+1 of the 2-choice process and the constants \varepsilon (from Claim B.2) and \alpha\leq\min(\ln(10/9),\varepsilon/8). For \lambda\in[1/4,1] we have

 \mathbb{E}\mathopen{}\mathclose{{}\left[{\Phi(\bm{X}(t+1))|\bm{X}(t)}}\right]% \leq\mathopen{}\mathclose{{}\left(1-\frac{\varepsilon\alpha\lambda}{4}}\right)% \cdot\Phi(\bm{X}(t))+\varepsilon^{-8}\cdot\mathrm{O}\mathopen{}\mathclose{{}% \left(n}\right). (16)

We can use this result in a simple induction to prove Lemma 3.5.

###### Proof of Lemma 3.5.

Lemma 3.13 gives us a \gamma<1 and c>0 such that \mathbb{E}\left[\Phi(\bm{X}(t+1))\mid\bm{X}(t)\right]\leq\gamma\cdot\Phi(\bm{X}(t))+c holds for all rounds t\geq 0. Taking the expected value on both sides yields \mathbb{E}\left[\Phi(\bm{X}(t+1))\right]\leq\gamma\cdot\mathbb{E}\left[\Phi(\bm{X}(t))\right]+c. Using induction and the linearity of the expected value, it is easy to check that \mathbb{E}\left[\Phi(\bm{X}(t))\right]\leq\frac{c}{1-\gamma} solves this recursion. Using the values from Lemma 3.13 for \gamma and c (substituting \varepsilon^{\prime} for \varepsilon) we get \mathbb{E}\left[\Phi(\bm{X}(t))\right]\leq\frac{4\varepsilon^{\prime-8}}{\varepsilon^{\prime}\alpha\lambda}\cdot\mathrm{O}(n). The lemma’s statement follows for a constant \varepsilon with 1/\varepsilon=\mathrm{O}\left(\varepsilon^{\prime-9}/(\alpha\lambda)\right). ∎
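For completeness, the induction step behind the recursion can be written out as follows. This is only a sketch: the shorthand a_{t} for \mathbb{E}[\Phi(\bm{X}(t))] is introduced here for illustration, and the base case a_{0}\leq c/(1-\gamma) is assumed to hold for the (balanced) initial configuration.

```latex
% Induction step, assuming a_t := E[\Phi(X(t))] \le c/(1-\gamma):
\begin{align*}
a_{t+1} \;\le\; \gamma \cdot a_t + c
        \;\le\; \gamma \cdot \frac{c}{1-\gamma} + c
        \;=\; \frac{\gamma c + (1-\gamma)\, c}{1-\gamma}
        \;=\; \frac{c}{1-\gamma}.
\end{align*}
% With \gamma = 1 - \varepsilon'\alpha\lambda/4 this gives
% c/(1-\gamma) = 4c/(\varepsilon'\alpha\lambda).
```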

### 3.2 Bounding the Maximum Load

The goal of this section is to prove Theorem 3.2. Remember the definitions of \Phi(\bm{x}) and \Psi(\bm{x}) from Equation (8). For any fixed round t, we will prove that (w.h.p.) \Psi(\bm{X}(t))=\mathrm{O}(n\cdot\ln n), so that the average load is \varnothing=\mathrm{O}(\ln n). Using a union bound and Lemma 3.3, we see that (w.h.p.) the maximum load at the end of round t is bounded by \varnothing+\mathrm{O}(\ln n)=\mathrm{O}(\ln n).

It remains to prove a high probability bound on \Psi(\bm{X}(t)) for arbitrary t. To get an intuition for our analysis, consider the toy case t=\mathsf{poly}(n) and assume that exactly \lambda\cdot n\leq n balls are thrown each round. Here, we can combine Observation 3.4 and Lemma 3.5 to bound (w.h.p.) the load difference between any pair of bins and for all t^{\prime}<t by \mathrm{O}\mathopen{}\mathclose{{}\left(\ln n}\right) (via a union bound over \mathsf{poly}(n) rounds). Using the combinatorial observation that, while the load distance to the average is bounded by some b\geq 0, the bound \Psi\leq 2b\cdot n is invariant under the 2-choice process (Lemma 3.14), we get for b=\mathrm{O}\mathopen{}\mathclose{{}\left(\ln n}\right) that \Psi(\bm{X}(t))\leq 2b\cdot n=\mathrm{O}\mathopen{}\mathclose{{}\left(n\cdot% \ln n}\right), as required. The case for t=\omega\mathopen{}\mathclose{{}\left(\mathsf{poly}(n)}\right) is considerably more involved. In particular, the fact that the number of balls in the system is only guaranteed to decrease when the total load is high and the load distance to the average is low makes it challenging to design a suitable potential function that drops fast enough when it is high. Thus, we deviate from this standard technique and elaborate on the idea of the toy case: Instead of bounding (w.h.p.) the load difference between any pair of bins by \mathrm{O}\mathopen{}\mathclose{{}\left(\ln n}\right) for all t^{\prime}<t (which is not possible for t\gg\mathsf{poly}(n)), we prove (w.h.p.) an adaptive bound of \mathrm{O}\mathopen{}\mathclose{{}\left(\ln(t-t^{\prime})\cdot f(\lambda)}\right) for all t^{\prime}<t, where f is a suitable function (Lemma 3.15). Then we consider the last round t^{\prime\prime}<t with an empty bin. Observation 3.4 yields a bound of \Psi(\bm{X}(t^{\prime\prime}))=2\cdot\mathrm{O}\mathopen{}\mathclose{{}\left(% \ln(t-t^{\prime\prime})\cdot f(\lambda)}\right)\cdot n on the total load at time t^{\prime\prime}. 
Using the same combinatorial observation as in the toy case, we get that (w.h.p.) \Psi(\bm{X}(t))\leq\Psi(\bm{X}(t^{\prime\prime}))=2\cdot\mathrm{O}\left(\ln(t-t^{\prime\prime})\cdot f(\lambda)\right)\cdot n. The final step is to show that the load at time t^{\prime\prime} (which is logarithmic in t-t^{\prime\prime}) decreases linearly in t-t^{\prime\prime}, showing that the time interval t-t^{\prime\prime} cannot be too large (or we would get a negative load at time t). See Figure 1 for an illustration.

Figure 1: To bound the system load at time t, consider the minimum load and our bound on the load difference over time. There was a last time t^{\prime\prime} when there was an empty bin. The system load can only increase if there is an empty bin, and this increase is bounded by our bound on the load difference. Exploiting that the system load decreases linearly in time while every increase is bounded by our logarithmic bound on the load difference, we find a small interval [t^{\prime},t] containing t^{\prime\prime}.

###### Lemma 3.14.

Let b\geq 0 and consider a configuration \bm{x} with \Psi(\bm{x})\leq 2b\cdot n and \Phi(\bm{x})\leq e^{\alpha\cdot b}. Let \bm{x^{\prime}} denote the configuration after one step of the 2-choice process. Then \Psi(\bm{x^{\prime}})\leq 2b\cdot n.

###### Proof.

We distinguish two cases: If there is no empty bin, then all n bins delete one ball. Since the maximum number of new balls is n, the number of balls cannot increase. That is, we have \Psi(\bm{x^{\prime}})\leq\Psi(\bm{x})\leq 2b\cdot n. Now consider the case that there is at least one empty bin. Let \eta\in(0,1] denote the fraction of empty bins (i.e., there are exactly \eta\cdot n>0 empty bins). Since the minimal load is zero, Observation 3.4 implies \max_{i}x_{i}\leq 2b. Thus, the total number of balls in configuration \bm{x} is at most (1-\eta)n\cdot 2b. Exactly (1-\eta)n balls are deleted (one from each non-empty bin) and at most n new balls enter the system. We get \Psi(\bm{x^{\prime}})\leq(1-\eta)n\cdot 2b-(1-\eta)n+n=(1-\eta)n\cdot(2b-1)+n% \leq 2b\cdot n. ∎
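The accounting used in this proof can be illustrated with a small simulation of a single round. The sketch below rests on assumptions that are not spelled out in this section: each of n potential balls arrives independently with probability \lambda, all balls of a batch see the loads from the start of the round, ties are broken arbitrarily, and deletions apply to the bins that were non-empty at the start of the round. The function name and all parameters are hypothetical.

```python
import random

def one_round(loads, lam, rng):
    """Simulate one round and return (new_loads, arrivals).

    Assumed model: n Bernoulli(lam) arrivals, each placed with Greedy[2]
    using the loads from the start of the round, then one deletion per bin
    that was non-empty at the start of the round.
    """
    n = len(loads)
    arrivals = sum(1 for _ in range(n) if rng.random() < lam)
    new_loads = list(loads)
    for _ in range(arrivals):
        a, b = rng.randrange(n), rng.randrange(n)
        # Balls only see the loads from the start of the round (parallel batch).
        target = a if loads[a] <= loads[b] else b
        new_loads[target] += 1
    for i in range(n):
        if loads[i] > 0:          # bin was non-empty at the start of the round
            new_loads[i] -= 1
    return new_loads, arrivals

rng = random.Random(2024)         # fixed seed, purely for reproducibility
loads = [0] * 32
for _ in range(1000):
    nonempty = sum(1 for x in loads if x > 0)
    new_loads, arrivals = one_round(loads, 0.9, rng)
    # Accounting from the proof: deletions equal the number of non-empty
    # bins, and at most n balls arrive.
    assert sum(new_loads) == sum(loads) - nonempty + arrivals
    loads = new_loads
print(sum(loads), max(loads))
```

In particular, whenever no bin is empty, the total load cannot grow in this model, which is the first case of the proof.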

###### Lemma 3.15.

Let \lambda\in[1/4,1). Fix a round t. For i\in\mathbb{N} with t-i\cdot\frac{8\ln n}{1-\lambda}\geq 0 define \mathcal{I}_{i}\coloneqq[t-i\cdot\frac{8\ln n}{1-\lambda},t]. Let Y_{i} be the number of balls which spawn in \mathcal{I}_{i}.

1. Define the (good) smooth event \mathcal{S}_{t}\coloneqq\bigcap_{t^{\prime}<t}\bigl{(}\Phi(\bm{X}(t^{\prime}))% \leq\lvert t-t^{\prime}\rvert^{2}\cdot n^{2}\bigr{)}. Then \Pr\mathopen{}\mathclose{{}\left({\mathcal{S}_{t}}}\right)=1-\mathrm{O}\bigl{(% }n^{-1}\bigr{)}.

2. Define the (good) bounded balls event \mathcal{B}_{t}\coloneqq\bigcap_{i}\bigl{(}Y_{i}\leq\frac{1+\lambda}{2}\cdot% \lvert\mathcal{I}_{i}\rvert\cdot n\bigr{)}. Then \Pr\mathopen{}\mathclose{{}\left({\mathcal{B}_{t}}}\right)=1-\mathrm{O}\bigl{(% }n^{-1}\bigr{)}.

###### Proof.

Consider an arbitrary time t^{\prime}<t. By Lemma 3.5 we have \mathbb{E}\mathopen{}\mathclose{{}\left[{\Phi(t^{\prime})}}\right]\leq n/\varepsilon. Using Markov’s inequality, this implies \mathinner{\Pr({\Phi(t^{\prime})\geq\lvert t-t^{\prime}\rvert^{2}\cdot n^{2}})% }\leq 1/(\varepsilon\cdot\lvert t-t^{\prime}\rvert^{2}\cdot n). Using the union bound over all t^{\prime}<t we calculate

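 \Pr\left(\bar{\mathcal{S}}_{t}\right)\leq\sum_{t^{\prime}<t}\frac{1}{\varepsilon\cdot\lvert t-t^{\prime}\rvert^{2}\cdot n}=\frac{1}{\varepsilon n}\sum_{k=1}^{t}\frac{1}{k^{2}}\leq\frac{\pi^{2}}{6\varepsilon n}=\mathrm{O}\bigl(n^{-1}\bigr), (17)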

where the last inequality applies the solution to the Basel problem. This proves the first statement.

For the second statement, let Z_{i}\coloneqq\lvert\mathcal{I}_{i}\rvert\cdot n-Y_{i} be the number of balls that did not spawn during \mathcal{I}_{i}. Note that Z_{i} is a sum of \lvert\mathcal{I}_{i}\rvert\cdot n independent indicator variables with \mathbb{E}\left[Z_{i}\right]=(1-\lambda)\cdot\lvert\mathcal{I}_{i}\rvert\cdot n=8i\cdot n\cdot\ln n\geq 8i\cdot\ln n. Chernoff yields \Pr\left(Z_{i}\leq(1-\lambda)\cdot\lvert\mathcal{I}_{i}\rvert\cdot n/2\right)\leq e^{-\mathbb{E}\left[Z_{i}\right]/8}\leq e^{-8i\cdot\ln n/8}=n^{-i}. The desired statement follows by applying the identity Z_{i}=\lvert\mathcal{I}_{i}\rvert\cdot n-Y_{i} and taking the union bound over all i. ∎

###### Lemma 3.16.

Fix a round t and assume that both \mathcal{S}_{t} and \mathcal{B}_{t} hold. Then \Psi(\bm{X}(t))\leq\frac{9n}{\alpha}\cdot\ln\bigl{(}\frac{n}{1-\lambda}\bigr{)}.

###### Proof.

Let t^{\prime}<t be the last time when there was an empty bin and set \Delta\coloneqq t-t^{\prime}. Note that t^{\prime} is well defined, as we have X_{i}(0)=0 for all i\in[n]. Since \mathcal{S}_{t} holds, we have \Phi(\bm{X}(t^{\prime}))\leq\Delta^{2}\cdot n^{2}=\exp\left(\ln(\Delta^{2}\cdot n^{2})\right). By choice of t^{\prime} we have \min_{i}X_{i}(t^{\prime})=0. Together with Observation 3.4 we get \max_{i}X_{i}(t^{\prime})\leq 2\ln\bigl(\Delta^{2}\cdot n^{2}\bigr)/\alpha. Summing up over all bins (and pulling out the square), this implies \Psi(\bm{X}(t^{\prime}))\leq 4n\cdot\ln\bigl(\Delta\cdot n\bigr)/\alpha. Applying Lemma 3.14 yields \Psi(\bm{X}(t^{\prime}+1))\leq 4n\cdot\ln\bigl(\Delta\cdot n\bigr)/\alpha. By choice of t^{\prime}, there is no empty bin in \bm{X}(t^{\prime\prime}) for all t^{\prime\prime}\in\set{t^{\prime}+1,t^{\prime}+2,\dots,t-1}. Thus, during each of these rounds exactly n balls are deleted. To bound the number of newly spawned balls, let i be maximal with \mathcal{I}_{i}\subseteq[t^{\prime},t] (as defined in Lemma 3.15). Since \mathcal{B}_{t} holds and using the maximality of i, the number of balls Y that spawn during [t^{\prime},t] is at most (1+\lambda)\lvert\mathcal{I}_{i}\rvert\cdot n/2+\frac{8\ln n}{1-\lambda}\cdot n\leq(1+\lambda)\Delta\cdot n/2+\frac{8\ln n}{1-\lambda}\cdot n. We calculate

 \begin{aligned} \Psi(\bm{X}(t)) &\leq\Psi(\bm{X}(t^{\prime}+1))-\Delta\cdot n+Y\leq\frac{4n}{\alpha}\cdot\ln(\Delta\cdot n)-\frac{1-\lambda}{2}\Delta\cdot n+\frac{8\ln n}{1-\lambda}\cdot n\\
 &=\frac{1-\lambda}{2}\cdot n\cdot\left(\frac{8}{\alpha(1-\lambda)}\cdot\ln(\Delta\cdot n)-\Delta+\frac{16\ln n}{(1-\lambda)^{2}}\right)\\
 &\leq\frac{1-\lambda}{2}\cdot\Delta\cdot n\cdot\left(\frac{24}{\alpha(1-\lambda)^{2}}\cdot\frac{\ln(\Delta\cdot n)}{\Delta}-1\right). \end{aligned} (18)

With f=f(\lambda)\coloneqq 24/\bigl(\alpha(1-\lambda)^{2}\bigr) the last factor becomes f\cdot\ln(\Delta\cdot n)/\Delta-1. It is negative if and only if \Delta>f\cdot\ln(\Delta\cdot n). This inequality holds for any \Delta>-f\cdot W_{-1}(-\frac{1}{f\cdot n}), where W_{-1} denotes the lower branch of the Lambert W function (note that -\frac{1}{f\cdot n}\geq-1/e, so that W_{-1}(-\frac{1}{f\cdot n}) is well defined). This implies that \Delta\leq-f\cdot W_{-1}(-\frac{1}{f\cdot n}), since otherwise we would have \Psi(\bm{X}(t))<0, which is clearly a contradiction. Using the asymptotic approximation W_{-1}(x)=\ln(-x)-\ln\bigl(\ln(-1/x)\bigr)-\mathrm{o}(1) as x\to 0^{-}, we get

 \Delta\leq-f\cdot W_{-1}\mathopen{}\mathclose{{}\left(-\frac{1}{f\cdot n}}% \right)\leq f\cdot\ln(f\cdot n)+f\cdot\ln\bigl{(}\ln(f\cdot n)\bigr{)}+f\leq 2% f\cdot\ln(f\cdot n). (19)

Finally, we use this bound on \Delta to get

 \begin{aligned} \Psi(\bm{X}(t)) &\leq\Psi(\bm{X}(t^{\prime}+1))\leq\frac{4n}{\alpha}\cdot\ln(\Delta\cdot n)\leq\frac{4n}{\alpha}\cdot\ln\bigl(2fn\cdot\ln(fn)\bigr)\\
 &\leq\frac{4n}{\alpha}\cdot\ln\left(\frac{48n}{\alpha(1-\lambda)^{2}}\cdot\ln\left(\frac{24n}{\alpha(1-\lambda)^{2}}\right)\right)\leq\frac{9n}{\alpha}\cdot\ln\left(\frac{n}{1-\lambda}\right). \end{aligned} ∎
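The role of the cap on \Delta in this proof can be checked numerically without evaluating the Lambert W function. The sketch below (function names and parameter values are hypothetical) confirms that the last factor in (18) is already negative at \Delta=2f\cdot\ln(f\cdot n), the closed-form bound from (19), and stays negative for larger \Delta.

```python
import math

def f_of(alpha, lam):
    """f(lambda) = 24 / (alpha * (1 - lambda)^2), as defined in the proof."""
    return 24.0 / (alpha * (1.0 - lam) ** 2)

def last_factor(delta, f, n):
    """The factor f * ln(delta * n) / delta - 1 from the last line of (18)."""
    return f * math.log(delta * n) / delta - 1.0

def delta_cap(f, n):
    """The closed-form cap 2 * f * ln(f * n) from Equation (19)."""
    return 2.0 * f * math.log(f * n)

n, alpha, lam = 1000, 0.1, 0.5            # hypothetical parameters
f = f_of(alpha, lam)
cap = delta_cap(f, n)
beyond = [cap * (1.0 + 0.25 * j) for j in range(0, 40)]
all_negative = all(last_factor(d, f, n) < 0 for d in beyond)
print(all_negative)
```

A negative factor would force \Psi(\bm{X}(t))<0, which is why no \Delta beyond the cap can occur.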

Now, by combining Lemma 3.16 with the fact that the events \mathcal{S}_{t} and \mathcal{B}_{t} hold with high probability (Lemma 3.15), we immediately get that (w.h.p.) \Psi(\bm{X}(t))=\mathrm{O}\mathopen{}\mathclose{{}\left(n\cdot\ln n}\right). As described at the beginning of this section, combining this with Lemma 3.3 proves Theorem 3.2.

### 3.3 Stability

This section proves Theorem 3.1. The first auxiliary lemma states that, for sufficiently high values of \Gamma, this potential decreases in expectation. (Footnote 4: It might look tempting to use \Gamma together with Hajek’s theorem to derive a bound on the maximum load of the system. However, this would require (exponentially) sharper bounds on \Phi.)

###### Lemma 3.17 (Negative Bias \Gamma).

Let \lambda\in[1/4,1). If \Gamma(\bm{X}(t))\geq 2\tfrac{n^{4}}{(1-\lambda)^{2}\lambda}, then

 \mathbb{E}\mathopen{}\mathclose{{}\left[{\Gamma(\bm{X}(t+1))-\Gamma(\bm{X}(t))% |\bm{X}(t)}}\right]\leq-1.
###### Proof.

Assume \bm{X}(t)=x is fixed. By definition of \Gamma(\cdot), we have \Phi(x)\geq\Gamma(x)/2 or \Psi(x)\geq\Gamma(x)/2. We now show that in both cases \mathbb{E}\mathopen{}\mathclose{{}\left[{\Gamma(\bm{X}(t+1))-\Gamma(x)|\bm{X}(% t)=x}}\right]\leq-1.

1. If \Phi(x)\geq\Gamma(x)/2, then we have, by Lemma 3.13, a potential drop of

 \mathbb{E}\mathopen{}\mathclose{{}\left[{\Phi(\bm{X}(t+1))-\Phi(x)|\bm{X}(t)=x% }}\right]\leq-(\varepsilon\alpha\lambda/4)\cdot\Phi(x)+n\log n\leq-(% \varepsilon\alpha\lambda/8)\cdot\Gamma(x)+n\log n.

Note that, by definition of \Psi, \Psi(\bm{X}(t+1))-\Psi(x)\leq n. Together with \Gamma(x)\geq\tfrac{8(n\log n+n^{2}/(1-\lambda)+1)}{\varepsilon\alpha\lambda}, we obtain

 \mathbb{E}\mathopen{}\mathclose{{}\left[{\Gamma(\bm{X}(t+1))-\Gamma(x)|\bm{X}(% t)=x}}\right]\leq-(\varepsilon\alpha\lambda/8)\cdot\Gamma(x)+n\log n+(n/(1-% \lambda))\cdot n\leq-1.
2. Otherwise, i.e., if \Phi(x)<\Gamma(x)/2, we have that

• (i) the load difference is, by Observation 3.4, bounded by 2\ln(\Gamma(x)/2)/\alpha, and

• (ii) \Psi(x)\geq\Gamma(x)/2 must hold. This implies that \varnothing\geq\frac{1}{n}\left(\tfrac{\Gamma(x)/2}{\frac{n}{1-\lambda}}\right)=\tfrac{(1-\lambda)\cdot\Gamma(x)}{2n^{2}}.

From (i) and (ii) we have that the minimum load is at least \tfrac{(1-\lambda)\cdot\Gamma(x)}{2n^{2}}-2\ln(\Gamma(x)/2)/\alpha. From Lemma 3.18 and \Gamma(x)\geq 2\tfrac{n^{4}}{(1-\lambda)^{2}\lambda}, it follows that every bin has load at least 1. Thus each bin will delete one ball and the number of balls arriving is \lambda n in expectation. Hence, \mathbb{E}\left[\Psi(\bm{X}(t+1))-\Psi(x)\mid\bm{X}(t)=x\right]=-\frac{n}{1-\lambda}(1-\lambda)n. Now,

 \displaystyle\begin{split}\displaystyle\mathbb{E}\mathopen{}\mathclose{{}\left% [{\Gamma(\bm{X}(t+1))-\Gamma(x)|\bm{X}(t)=x}}\right]&\displaystyle=\mathbb{E}% \mathopen{}\mathclose{{}\left[{\Phi(\bm{X}(t+1))-\Phi(x)|\bm{X}(t)=x}}\right]-% \frac{n}{1-\lambda}(1-\lambda)n\\ &\displaystyle\leq n\log n-\frac{n}{1-\lambda}(1-\lambda)n\leq-1.\end{split} (20)

Thus, \mathbb{E}\mathopen{}\mathclose{{}\left[{\Gamma(\bm{X}(t+1))-\Gamma(x)|\bm{X}(% t)=x}}\right]\leq-1, which yields the claim. ∎

###### Lemma 3.18.

For all x\geq 2\tfrac{n^{4}}{(1-\lambda)^{2}\lambda} it holds that \tfrac{(1-\lambda)\cdot x}{2n^{2}}-2\ln(x/2)/\alpha\geq 1.

###### Proof.

Define f(x)=\tfrac{(1-\lambda)\cdot x}{2n^{2}}-2\ln(x/2)/\alpha. We have f\left(2\tfrac{n^{4}}{(1-\lambda)^{2}\lambda}\right)\geq\tfrac{n^{2}}{(1-\lambda)\lambda}-\tfrac{2}{\alpha}\ln\left(\frac{n^{4}}{(1-\lambda)^{2}\lambda}\right)\geq 1, where the last inequality holds for large enough n since \alpha is a constant. Moreover, for all x\geq 2\tfrac{n^{4}}{(1-\lambda)^{2}\lambda} we have f^{\prime}(x)=\tfrac{1-\lambda}{2n^{2}}-\tfrac{2}{\alpha x}\geq 0 (again for large enough n). Thus, the claim follows. ∎
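To get a feeling for the phrase "large enough n", the following sketch evaluates f at and beyond the threshold for one moderately large n, and shows a tiny instance where the logarithmic term still dominates. The function names and all parameter values are hypothetical choices for illustration.

```python
import math

def f_lower(x, n, lam, alpha):
    """f(x) = (1 - lam) * x / (2 n^2) - 2 ln(x / 2) / alpha, as in the proof."""
    return (1.0 - lam) * x / (2.0 * n**2) - 2.0 * math.log(x / 2.0) / alpha

def threshold(n, lam):
    """The threshold 2 n^4 / ((1 - lam)^2 * lam) from the lemma statement."""
    return 2.0 * n**4 / ((1.0 - lam) ** 2 * lam)

alpha = 0.1                                   # hypothetical constant
x0 = threshold(100, 0.5)
values = [f_lower(x0 * s, 100, 0.5, alpha) for s in (1, 2, 10, 1000)]
holds_for_large_n = all(v >= 1.0 for v in values)

# With a tiny n and a small alpha the logarithmic term dominates.
fails_for_small_n = f_lower(threshold(2, 0.5), 2, 0.5, 0.01) < 1.0
print(holds_for_large_n, fails_for_small_n)
```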

We are ready to prove Theorem 3.1.

###### Proof of Theorem 3.1.

The proof proceeds by applying Theorem A.1. We now define the parameters of Theorem A.1. Let \zeta(t)=\bm{X}(t) and hence \Omega is the state space of \bm{X}. First we observe that \Omega is countable since there is a constant number of bins (n is considered a constant in this matter), each having a load which is a natural number. We define \phi(\bm{X}(t)) to be \Gamma(\bm{X}(t)). We define C=\{x:\Gamma(x)\leq 2\tfrac{n^{4}}{(1-\lambda)^{2}\lambda}\}. Define \beta(x)=1 and \eta=1. We now show that the preconditions (a) and (b) of Theorem A.1 are fulfilled.

• Let x\not\in C. By definition of C and \phi(\bm{X}(t)), and from Lemma 3.17 we have

 \mathbb{E}\left[\phi(\bm{X}(t+1))-\phi(x)\mid\bm{X}(t)=x\right]\leq\mathbb{E}\left[\Gamma(\bm{X}(t+1))-\Gamma(x)\mid\bm{X}(t)=x\right]\leq-1. (21)

• Let x\in C. Recall that \Gamma(\bm{X}(t))=\Phi(\bm{X}(t))+\Psi(\bm{X}(t)). By Lemma 3.13 and the fact that the number of balls arriving in one round is bounded by n, we derive

 \begin{aligned} \mathbb{E}\left[\phi(\bm{X}(t+1))\mid\bm{X}(t)=x\right] &=\mathbb{E}\left[\Phi(\bm{X}(t+1))\mid\bm{X}(t)=x\right]+\mathbb{E}\left[\Psi(\bm{X}(t+1))\mid\bm{X}(t)=x\right]\\
 &\leq\left(1-\frac{\varepsilon\alpha\lambda}{4}\right)\cdot 2\tfrac{n^{4}}{(1-\lambda)^{2}\lambda}+\frac{n}{1-\lambda}\cdot n<\infty. \end{aligned} (22)

The claim follows by applying Theorem A.1 with Equations (21) and (22). ∎

## References

• Adler et al. [1998a] M. Adler, P. Berenbrink, and K. Schröder. Analyzing an infinite parallel job allocation process. In Proceedings of the 6th Annual European Symposium on Algorithms, ESA ’98, pages 417–428, London, UK, 1998a. Springer-Verlag. ISBN 3-540-64848-8.
• Adler et al. [1998b] M. Adler, S. Chakrabarti, M. Mitzenmacher, and L. Rasmussen. Parallel randomized load balancing. Random Structures & Algorithms, 13(2):159–188, 1998b. ISSN 1098-2418.
• Alfa [2003] A. S. Alfa. Algorithmic analysis of the BMAP/D/k system in discrete time. Adv. in Appl. Probab., 35(4):1131–1152, 12 2003.
• Azar et al. [1999] Y. Azar, A. Z. Broder, A. R. Karlin, and E. Upfal. Balanced allocations. SIAM Journal on Computing, 29(1):180–200, 1999.
• Becchetti et al. [2015] L. Becchetti, A. E. F. Clementi, E. Natale, F. Pasquale, and G. Posta. Self-stabilizing repeated balls-into-bins. In G. E. Blelloch and K. Agrawal, editors, Proceedings of the 27th ACM on Symposium on Parallelism in Algorithms and Architectures, SPAA 2015, Portland, OR, USA, June 13-15, 2015, pages 332–339. ACM, 2015. ISBN 978-1-4503-3588-1.
• Berenbrink et al. [2000] P. Berenbrink, A. Czumaj, T. Friedetzky, and N. D. Vvedenskaya. Infinite parallel job allocation (extended abstract). In Proceedings of the Twelfth Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA ’00, pages 99–108, New York, NY, USA, 2000. ACM. ISBN 1-58113-185-2.
• Berenbrink et al. [2006] P. Berenbrink, A. Czumaj, A. Steger, and B. Vöcking. Balanced allocations: The heavily loaded case. SIAM Journal on Computing, 35(6):1350–1385, 2006.
• Berenbrink et al. [2012] P. Berenbrink, A. Czumaj, M. Englert, T. Friedetzky, and L. Nagel. Multiple-choice balanced allocation in (almost) parallel. In A. Gupta, K. Jansen, J. Rolim, and R. Servedio, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, volume 7408 of Lecture Notes in Computer Science, pages 411–422. Springer Berlin Heidelberg, 2012. ISBN 978-3-642-32511-3.
• Berenbrink et al. [2013] P. Berenbrink, K. Khodamoradi, T. Sauerwald, and A. Stauffer. Balls-into-bins with nearly optimal load distribution. In Proceedings of the Twenty-fifth Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’13, pages 326–335, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-1572-2.
• Czumaj [1998] A. Czumaj. Recovery time of dynamic allocation processes. In Proceedings of the Tenth Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA ’98, pages 202–211, New York, NY, USA, 1998. ACM. ISBN 0-89791-989-0.
• Czumaj and Stemann [1997] A. Czumaj and V. Stemann. Randomized allocation processes. In Foundations of Computer Science, 1997. Proceedings., 38th Annual Symposium on, pages 194–203, Oct 1997.
• Fayolle et al. [1995] G. Fayolle, V. Malyshev, and M. Menshikov. Topics in the Constructive Theory of Countable Markov Chains. Cambridge University Press, 1995. ISBN 9780521461979.
• Gonnet [1981] G. H. Gonnet. Expected length of the longest probe sequence in hash code searching. J. ACM, 28(2):289–304, Apr. 1981. ISSN 0004-5411.
• Hajek B. Hajek. Hitting-time and occupation-time bounds implied by drift analysis with applications. Advances in Applied Probability, 14(3):502–525.
• Kamal [1996] A. Kamal. Efficient solution of multiple server queues with application to the modeling of ATM concentrators. In INFOCOM ’96. Fifteenth Annual Joint Conference of the IEEE Computer Societies. Networking the Next Generation. Proceedings IEEE, volume 1, pages 248–254 vol.1, Mar 1996.
• Kim et al. [2012] N. K. Kim, M. L. Chaudhry, B. K. Yoon, and K. Kim. A complete and simple solution to a discrete-time finite-capacity BMAP/D/c queue. 2012.
• Levin and Peres [2008] D. A. Levin and Y. Peres. Markov Chains and Mixing Times. American Mathematical Society, December 2008. ISBN 978-0-8218-4739-8.
• Mitzenmacher [2001] M. Mitzenmacher. The power of two choices in randomized load balancing. IEEE Trans. Parallel Distrib. Syst., 12(10):1094–1104, 2001. doi: 10.1109/71.963420.
• Peres et al. [2010] Y. Peres, K. Talwar, and U. Wieder. The (1+\beta)-choice process and weighted balls-into-bins. In Proceedings of the 21st Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), SODA ’10, pages 1613–1619, Philadelphia, PA, USA, 2010. Society for Industrial and Applied Mathematics. ISBN 978-0-898716-98-6.
• Raab and Steger [1998] M. Raab and A. Steger. "Balls into bins" - A simple and tight analysis. In M. Luby, J. D. P. Rolim, and M. J. Serna, editors, Randomization and Approximation Techniques in Computer Science, Second International Workshop, RANDOM’98, Barcelona, Spain, October 8-10, 1998, Proceedings, volume 1518 of Lecture Notes in Computer Science, pages 159–170. Springer, 1998. ISBN 3-540-65142-X.
• Sohraby and Zhang [1992] K. Sohraby and J. Zhang. Spectral decomposition approach for transient analysis of multi-server discrete-time queues. In INFOCOM’92. Eleventh Annual Joint Conference of the IEEE Computer and Communications Societies, IEEE, pages 395–404. IEEE, 1992.
• Stemann [1996] V. Stemann. Parallel balanced allocations. In Proceedings of the Eighth Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA ’96, pages 261–269, New York, NY, USA, 1996. ACM. ISBN 0-89791-809-6.
• Talwar and Wieder [2013] K. Talwar and U. Wieder. Balanced allocations: A simple proof for the heavily loaded case. CoRR, abs/1310.5367, 2013.
• Talwar and Wieder [2014] K. Talwar and U. Wieder. Balanced allocations: A simple proof for the heavily loaded case. In J. Esparza, P. Fraigniaud, T. Husfeldt, and E. Koutsoupias, editors, Automata, Languages, and Programming, volume 8572 of Lecture Notes in Computer Science, pages 979–990. Springer Berlin Heidelberg, 2014. ISBN 978-3-662-43947-0.

## Appendix A Auxiliary Results

###### Theorem A.1 (Fayolle et al. [12, Theorem 2.2.4]).

A time-homogeneous irreducible aperiodic Markov chain \zeta with a countable state space \Omega is positive recurrent if and only if there exists a positive function \phi(x),x\in\Omega, a number \eta>0, a positive integer-valued function \beta(x),x\in\Omega, and a finite set C\subseteq\Omega such that the following inequalities hold:

1. E[\phi(\zeta({t+\beta(x)}))-\phi(x)|\zeta(t)=x]\leq-\eta\beta(x), x\not\in C

2. E[\phi(\zeta({t+\beta(x)}))|\zeta(t)=x]<\infty, x\in C

###### Theorem A.2 (Simplified version of Hajek [14, Theorem 2.3]).

Let (Y(t))_{t\geq 0} be a sequence of random variables on a probability space (\Omega,\mathcal{F},P) with respect to the filtration (\mathcal{F}(t))_{t\geq 0}. Assume the following two conditions hold:

1. (Majorization) There exists a random variable Z and a constant \lambda^{\prime}>0, such that \mathbb{E}[e^{\lambda^{\prime}Z}]\leq D for some finite D, and (|Y(t+1)-Y(t)|\mid\mathcal{F}(t))\prec Z for all t\geq 0; and

2. (Negative Bias) There exist a,\varepsilon_{0}>0 such that for all t we have

 \mathbb{E}[Y(t+1)-Y(t)\mid\mathcal{F}(t),Y(t)>a]\leq-\varepsilon_{0}.

Let \eta=\min\{\lambda^{\prime},\varepsilon_{0}\cdot\lambda^{\prime 2}/(2D),1/(2\varepsilon_{0})\}. Then, for all b and t we have

 \Pr\left(Y(t)\geq b\mid\mathcal{F}(0)\right)\leq e^{\eta(Y(0)-b)}+\frac{2D}{\varepsilon_{0}\cdot\eta}\cdot e^{\eta(a-b)}.
###### Proof.

The statement of the theorem provided in [14] requires, besides conditions 1 and 2, choosing constants \eta and \rho such that 0<\eta\leq\lambda^{\prime}, \eta<\varepsilon_{0}/c and \rho=1-\varepsilon_{0}\cdot\eta+c\eta^{2}, where c=\frac{\mathbb{E}[e^{\lambda^{\prime}Z}]-(1+\lambda^{\prime}\mathbb{E}[Z])}{\lambda^{\prime 2}}=\sum_{k=2}^{\infty}\frac{\lambda^{\prime k-2}}{k!}\mathbb{E}[Z^{k}]. With these requirements it then holds that for all b and t

 \Pr\left(Y(t)\geq b\mid\mathcal{F}(0)\right)\leq\rho^{t}e^{\eta(Y(0)-b)}+\frac{1-\rho^{t}}{1-\rho}\cdot D\cdot e^{\eta(a-b)}. (23)

In the following we bound (23) by setting \eta=\min\{\lambda^{\prime},\varepsilon_{0}\cdot\lambda^{\prime 2}/(2D),1/(2\varepsilon_{0})\}. The following upper and lower bounds on \rho follow.

• \rho=1-\varepsilon_{0}\cdot\eta+c\eta^{2}\leq 1-\varepsilon_{0}\cdot\eta+\varepsilon_{0}\cdot\eta\cdot c\cdot\lambda^{\prime 2}/(2D)\leq 1-\varepsilon_{0}\cdot\eta+\varepsilon_{0}\cdot\eta/2=1-\varepsilon_{0}\cdot\eta/2, where we used \eta\leq\varepsilon_{0}\cdot\lambda^{\prime 2}/(2D) and c\leq D/\lambda^{\prime 2}.

• \rho=1-\varepsilon_{0}\cdot\eta+c\eta^{2}\geq 1-\varepsilon_{0}/(2\varepsilon_{0})=1/2\geq 0, where we used \eta\leq 1/(2\varepsilon_{0}) and c\eta^{2}\geq 0.

We derive from (23), using that 0\leq\rho^{t}\leq 1 for any t\geq 0,

 \begin{split}\Pr\left(Y(t)\geq b\mid\mathcal{F}(0)\right)&\leq\rho^{t}e^{\eta(Y(0)-b)}+\frac{1-\rho^{t}}{1-\rho}\cdot D\cdot e^{\eta(a-b)}\leq e^{\eta(Y(0)-b)}+\frac{1}{1-\rho}\cdot D\cdot e^{\eta(a-b)}\\ &\leq e^{\eta(Y(0)-b)}+\frac{2D}{\varepsilon_{0}\cdot\eta}\cdot e^{\eta(a-b)},\end{split} (24)

since \frac{1}{1-\rho}\leq\frac{2}{\varepsilon_{0}\cdot\eta}. This yields the claim. ∎
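To make the parameter choice in Theorem A.2 concrete, the following Python sketch evaluates \eta and the resulting tail bound for given constants. The function names and the sample values are illustrative assumptions, not part of the original analysis; `rho_upper` replaces c by the upper bound D/\lambda^{\prime 2} used in the proof, so it gives a worst-case value of \rho rather than the exact one.

```python
import math


def hajek_eta(lam: float, eps0: float, D: float) -> float:
    """eta = min{lam, eps0 * lam^2 / (2D), 1 / (2 * eps0)} as in Theorem A.2."""
    return min(lam, eps0 * lam**2 / (2 * D), 1 / (2 * eps0))


def rho_upper(lam: float, eps0: float, D: float) -> float:
    """rho = 1 - eps0*eta + c*eta^2 with c replaced by its bound D / lam^2."""
    eta = hajek_eta(lam, eps0, D)
    c_max = D / lam**2
    return 1 - eps0 * eta + c_max * eta**2


def hajek_tail_bound(y0: float, a: float, b: float,
                     lam: float, eps0: float, D: float) -> float:
    """Right-hand side of the bound on Pr(Y(t) >= b | F(0)) in Theorem A.2."""
    eta = hajek_eta(lam, eps0, D)
    return math.exp(eta * (y0 - b)) + 2 * D / (eps0 * eta) * math.exp(eta * (a - b))


if __name__ == "__main__":
    lam, eps0, D = 1.0, 0.5, 2.0  # hypothetical constants
    eta = hajek_eta(lam, eps0, D)
    print("eta =", eta)
    print("worst-case rho =", rho_upper(lam, eps0, D), "<=", 1 - eps0 * eta / 2)
    print("tail bound at b = 100:", hajek_tail_bound(0.0, 1.0, 100.0, lam, eps0, D))
```

Note that the bound does not depend on t, which is what makes it usable for arbitrarily late rounds.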

###### Theorem A.3 (Raab and Steger [20, Theorem 1]).

Let M be the random variable that counts the maximum number of balls in any bin, if we throw m balls independently and uniformly at random into n bins. Then \Pr\left(M>k_{\alpha}\right)=o(1) if \alpha>1 and \Pr\left(M>k_{\alpha}\right)=1-o(1) if 0<\alpha<1, where

 k_{\alpha}=\begin{cases}\tfrac{\log n}{\log\tfrac{n\log n}{m}}\left(1+\alpha\tfrac{\log\log\tfrac{n\log n}{m}}{\log\tfrac{n\log n}{m}}\right)&\text{if }\tfrac{n}{\mathsf{polylog}(n)}\leq m\ll n\log n,\\ (d_{c}-1+\alpha)\log n&\text{if }m=c\cdot n\log n\text{ for some constant }c,\\ \tfrac{m}{n}+\alpha\sqrt{2\tfrac{m}{n}\log n}&\text{if }n\log n\ll m\leq n\,\mathsf{polylog}(n),\\ \tfrac{m}{n}+\sqrt{2\tfrac{m}{n}\log n\left(1-\tfrac{1}{\alpha}\tfrac{\log\log n}{2\log n}\right)}&\text{if }m\gg n(\log n)^{3},\end{cases}

where d_{c} is the largest solution of 1+x(\log c-\log x+1)-c=0. We have d_{1}=e and d_{1.00001}=2.7183.
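As a rough illustration of Theorem A.3, the sketch below throws m balls uniformly into n bins and compares the observed maximum load with the threshold k_{\alpha} from the third case (n\log n\ll m\leq n\,\mathsf{polylog}(n)). The function names, the seed, and the sample sizes are assumptions made for this example, and the logarithm is taken to be the natural one. The theorem is asymptotic, so a single small experiment only indicates the order of magnitude.

```python
import math
import random
from collections import Counter


def k_alpha_heavy(m: int, n: int, alpha: float) -> float:
    """Threshold m/n + alpha * sqrt(2 * (m/n) * log n) (third case of Theorem A.3)."""
    return m / n + alpha * math.sqrt(2 * (m / n) * math.log(n))


def max_load(m: int, n: int, rng: random.Random) -> int:
    """Throw m balls independently and uniformly into n bins; return the maximum load."""
    counts = Counter(rng.randrange(n) for _ in range(m))
    return max(counts.values())


if __name__ == "__main__":
    rng = random.Random(0)      # fixed seed so the run is reproducible
    n, m = 200, 20000           # hypothetical sizes with m well above n log n
    print("observed maximum load:", max_load(m, n, rng))
    print("k_alpha for alpha = 0.5:", k_alpha_heavy(m, n, 0.5))
    print("k_alpha for alpha = 1.5:", k_alpha_heavy(m, n, 1.5))
```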

## Appendix B Auxiliary Tools for the 2-Choice Process

###### Claim B.1.

Consider a bin i and the values \hat{\delta}_{i} and \check{\delta}_{i} as defined before Lemma 3.7. If \alpha\leq\ln(10/9), then \max(\lvert\hat{\delta}_{i}\rvert,\lvert\check{\delta}_{i}\rvert)\leq\frac{5}{4}\lambda.

###### Proof.

Remember that \hat{\delta}_{i}\coloneqq\lambda n\cdot(\sfrac{1}{n}\cdot\check{1}-p_{i}\cdot\sfrac{\hat{\alpha}}{\alpha}) and \check{\delta}_{i}\coloneqq\lambda n\cdot(\sfrac{1}{n}\cdot\hat{1}-p_{i}\cdot\sfrac{\check{\alpha}}{\alpha}), where \check{1}=1-\alpha/n<1<1+\alpha/n=\hat{1} (see proof of Lemma 3.7). Note that if \alpha\leq\ln(10/9), we have \hat{1}<5/4 and \check{1}>8/9. The claims hold trivially for i=1, since then p_{i}=(2i-1)/n^{2}=1/n^{2} and both \lvert\sfrac{1}{n}\cdot\check{1}-p_{i}\cdot\sfrac{\hat{\alpha}}{\alpha}\rvert\leq 1/n and \lvert\sfrac{1}{n}\cdot\hat{1}-p_{i}\cdot\sfrac{\check{\alpha}}{\alpha}\rvert\leq\hat{1}/n. For the other extreme, i=n, we have p_{n}\leq 2/n. The first statement follows from this and the definition of \hat{\alpha}=e^{\alpha}-1 since \frac{2}{n}\cdot\frac{\hat{\alpha}}{\alpha}-\frac{1}{n}\cdot\check{1}\leq\frac{2}{n}\cdot\frac{10/9-1}{\ln(10/9)}-\frac{1}{n}\cdot\check{1}<\frac{5}{4n}. Similarly, the second statement follows together with \frac{2}{n}\cdot\frac{\check{\alpha}}{\alpha}-\frac{1}{n}\cdot\hat{1}<\frac{1}{n} (which holds for any \alpha>0). ∎
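The bound of Claim B.1 can also be checked numerically. The following Python sketch evaluates \hat{\delta}_{i} and \check{\delta}_{i} for every bin, using p_{i}=(2i-1)/n^{2} and \hat{\alpha}=e^{\alpha}-1 as in the proof. The definition \check{\alpha}=1-e^{-\alpha} is an assumption of this sketch (it is not restated in this appendix, but it is consistent with \check{\alpha}>\alpha-\alpha^{2}), and the function name is hypothetical.

```python
import math


def max_abs_delta(n: int, lam: float, alpha: float) -> float:
    """Largest |delta_hat_i| or |delta_check_i| over all bins i = 1..n."""
    alpha_hat = math.exp(alpha) - 1       # as in the proof of Claim B.1
    alpha_check = 1 - math.exp(-alpha)    # assumed definition, see lead-in
    one_check = 1 - alpha / n
    one_hat = 1 + alpha / n
    worst = 0.0
    for i in range(1, n + 1):
        p_i = (2 * i - 1) / n**2          # probability of hitting the i-th bin
        d_hat = lam * n * (one_check / n - p_i * alpha_hat / alpha)
        d_check = lam * n * (one_hat / n - p_i * alpha_check / alpha)
        worst = max(worst, abs(d_hat), abs(d_check))
    return worst


if __name__ == "__main__":
    lam = 0.9                             # hypothetical arrival rate
    alpha = math.log(10 / 9)              # largest alpha allowed by the claim
    print(max_abs_delta(1000, lam, alpha), "<=", 1.25 * lam)
```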

###### Claim B.2.

There is a constant \varepsilon>0 such that

 \sum_{i\leq\frac{3}{4}n}p_{i}\cdot\Phi_{i,+}\leq(1-2\varepsilon)\cdot\frac{\Phi_{+}}{n} (25)

and

 \sum_{i\in[n]}p_{i}\cdot\Phi_{i,-}\geq(1+2\varepsilon)\cdot\frac{\Phi_{-}-\sum_{i\leq\frac{n}{4}}\Phi_{i,-}}{n}. (26)
###### Proof.

The claim follows from comments in . For Equation (25), recall that \sum_{i\leq 3n/4}\Phi_{i,+}\leq\Phi_{+} (by definition). Since \Phi_{i,+} is non-increasing in i (where i denotes the i-th most loaded bin) while p_{i} is increasing in i, the left-hand side of (25) is maximized when \Phi_{i,+}=\frac{4\Phi_{+}}{3n} for all i\leq 3n/4. We also use the following observation, which can be found in :

 \sum_{i\geq 3n/4}p_{i}\geq\frac{1}{4}+\varepsilon\implies\sum_{i\leq 3n/4}p_{i}\leq 1-\frac{1}{4}-\varepsilon=\frac{3}{4}-\varepsilon. (27)

The result follows from combining these two facts:

 \sum_{i\leq\frac{3}{4}n}p_{i}\cdot\Phi_{i,+}\leq\left(\frac{3}{4}-\varepsilon\right)\cdot\frac{4\Phi_{+}}{3n}=\left(1-\frac{4\varepsilon}{3}\right)\cdot\frac{\Phi_{+}}{n}. (28)

This is of the form (1-2\varepsilon^{\prime})\cdot\frac{\Phi_{+}}{n} for the constant \varepsilon^{\prime}=2\varepsilon/3>0, which gives Equation (25) after renaming the constant.

Equation (26) follows similarly. ∎
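The probability gap used in the proof above can be made explicit for the 2-choice probabilities p_{i}=(2i-1)/n^{2} stated in the proof of Claim B.1. The sketch below is only an illustration under that assumption, for n divisible by 4; the function name is hypothetical. Since \sum_{i\leq k}p_{i}=k^{2}/n^{2}, the first 3n/4 bins receive probability mass 9/16, so the observation \sum_{i\leq 3n/4}p_{i}\leq\frac{3}{4}-\varepsilon holds with \varepsilon=3/16 in this setting.

```python
from fractions import Fraction


def prefix_mass(n: int, k: int) -> Fraction:
    """Exact sum of p_i = (2i - 1) / n^2 over i = 1..k (equals k^2 / n^2)."""
    return sum((Fraction(2 * i - 1, n * n) for i in range(1, k + 1)), Fraction(0))


if __name__ == "__main__":
    n = 400                                   # hypothetical number of bins
    low = prefix_mass(n, 3 * n // 4)          # mass on the 3n/4 most loaded bins
    print("mass on i <= 3n/4:", low)
    print("mass on i >  3n/4:", 1 - low)
    print("epsilon in the observation:", Fraction(3, 4) - low)
```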

###### Claim B.3.

Consider a round t and a constant \alpha\geq 0. The following inequalities hold:

1. \sum_{i\in[n]}\alpha\eta_{i}(\alpha\eta_{i}-1)\cdot\Phi_{i,+}(\bm{X}(t))\leq\alpha^{2}\eta\nu\cdot\min\bigl(n,\Phi_{+}(\bm{X}(t))\bigr).

2. \sum_{i\in[n]}\alpha\eta_{i}(\alpha\eta_{i}+1)\cdot\Phi_{i,-}(\bm{X}(t))\leq\alpha^{2}\eta\nu\cdot\Phi_{-}(\bm{X}(t)).

###### Proof.

For the first statement, we calculate

 \begin{split}\sum_{i\in[n]}\alpha\eta_{i}(\alpha\eta_{i}-1)\cdot\Phi_{i,+}(\bm{X}(t))&=\sum_{i\leq\nu n}\alpha\eta_{i}(\alpha\eta_{i}-1)\cdot\Phi_{i,+}(\bm{X}(t))+\sum_{i>\nu n}\alpha\eta_{i}(\alpha\eta_{i}-1)\cdot\Phi_{i,+}(\bm{X}(t))\\ &=\alpha\eta(\alpha\eta-1)\cdot\sum_{i\leq\nu n}\Phi_{i,+}(\bm{X}(t))+\alpha\nu(1+\alpha\nu)\cdot\sum_{i>\nu n}\Phi_{i,+}(\bm{X}(t))\\ &\leq\alpha\eta(\alpha\eta-1)\cdot\nu\cdot\Phi_{+}(\bm{X}(t))+\alpha\nu(1+\alpha\nu)\cdot\eta\cdot\min\bigl(n,\Phi_{+}(\bm{X}(t))\bigr)\\ &\leq\alpha^{2}\eta\nu\cdot\min\bigl(n,\Phi_{+}(\bm{X}(t))\bigr),\end{split} (29)

where the first inequality uses that \Phi_{i,+}(\bm{X}(t)) is non-increasing in i and that \Phi_{i,+}(\bm{X}(t))\leq 1 for all i>\nu n. The claim’s second statement follows by a similar calculation, using that \Phi_{i,-}(\bm{X}(t)) is non-decreasing in i (note that we cannot apply the same trick as above to get \min\bigl(n,\Phi_{-}(\bm{X}(t))\bigr) instead of \Phi_{-}(\bm{X}(t))). ∎
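The first inequality of Claim B.3 can be checked on a small configuration. The sketch below assumes (as the calculation above suggests) that bins are sorted by non-increasing load, that \nu is the fraction of non-empty bins, that \eta=1-\nu, that \eta_{i}=\mathbbm{1}_{i}-\nu, and that \Phi_{i,+}=e^{\alpha(X_{i}-\varnothing)} with \varnothing the average load. The function name and the example loads are hypothetical.

```python
import math


def claim_b3_first(loads, alpha):
    """Return (lhs, rhs) of the first inequality of Claim B.3 for a load vector."""
    X = sorted(loads, reverse=True)            # i-th entry = i-th most loaded bin
    n = len(X)
    avg = sum(X) / n                           # average load
    nu = sum(1 for x in X if x > 0) / n        # fraction of non-empty bins
    eta = 1 - nu                               # fraction of empty bins (assumed)
    phis = [math.exp(alpha * (x - avg)) for x in X]
    lhs = 0.0
    for x, phi in zip(X, phis):
        eta_i = (1 if x > 0 else 0) - nu
        lhs += alpha * eta_i * (alpha * eta_i - 1) * phi
    rhs = alpha**2 * eta * nu * min(n, sum(phis))
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = claim_b3_first([5, 3, 1, 0, 0, 0], alpha=0.1)
    print(lhs, "<=", rhs)
```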

## Appendix C Missing Proofs for the 2-Choice Process

###### Proof of Observation 3.6.

Remember that \mathbbm{1}_{i} is an indicator value which equals 1 if and only if the i-th bin is non-empty in configuration \bm{X}. Bin i loses exactly \mathbbm{1}_{i} balls and receives exactly k balls, such that X^{\prime}_{i}-X_{i}=-\mathbbm{1}_{i}+k. Similarly, we have \varnothing^{\prime}-\varnothing=-\nu+K/n for the change of the average load. With the identity \eta_{i}=\mathbbm{1}_{i}-\nu (see Section 1.2), this yields

 \begin{split}\Delta_{i,+}(t)&=e^{\alpha\cdot(X_{i}^{\prime}-\varnothing^{\prime})}-e^{\alpha\cdot(X_{i}-\varnothing)}\\ &=e^{\alpha\cdot(X_{i}-\varnothing)}\cdot\left(e^{\alpha\cdot(-\mathbbm{1}_{i}+k+\nu-K/n)}-1\right)=\Phi_{i,+}\cdot\left(e^{\alpha\cdot(k-\eta_{i}-K/n)}-1\right),\end{split} (30)

proving the first statement. The second statement follows similarly. ∎
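The identity of Observation 3.6 can be verified on a toy round. The sketch below computes the change of each bin's upper potential once directly from the old and new loads and once via the closed form \Phi_{i,+}\cdot(e^{\alpha(k-\eta_{i}-K/n)}-1). The function name, the load vector, and the arrival vector are hypothetical; K is taken to be the total number of balls arriving in the round and \nu the fraction of non-empty bins.

```python
import math


def upper_potential_change(loads, arrivals, alpha):
    """Return per-bin potential changes (direct, via the formula of Observation 3.6)."""
    n = len(loads)
    avg = sum(loads) / n                        # average load before the round
    nu = sum(1 for x in loads if x > 0) / n     # fraction of non-empty bins
    K = sum(arrivals)                           # total number of new balls
    avg_new = avg - nu + K / n                  # average load after the round
    direct, formula = [], []
    for x, k in zip(loads, arrivals):
        ind = 1 if x > 0 else 0                 # bin deletes a ball iff non-empty
        eta_i = ind - nu
        x_new = x - ind + k
        phi = math.exp(alpha * (x - avg))
        direct.append(math.exp(alpha * (x_new - avg_new)) - phi)
        formula.append(phi * (math.exp(alpha * (k - eta_i - K / n)) - 1))
    return direct, formula


if __name__ == "__main__":
    d, f = upper_potential_change([3, 2, 0, 0], [0, 1, 2, 0], alpha=0.1)
    for a, b in zip(d, f):
        print(f"{a:+.6f}  {b:+.6f}")
```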

###### Proof of Lemma 3.8.

We prove the statement for R=+. The case R=- follows similarly. Using Lemma 3.7 and summing up over all i\in[n] we get

 \begin{split}\mathbb{E}\left[\Delta_{+}(t+1)\mid\bm{X}\right]&\leq\sum_{i\in[n]}\left(-\alpha\cdot(\eta_{i}+\hat{\delta}_{i})+\alpha^{2}\cdot(\eta_{i}+\hat{\delta}_{i})^{2}\right)\cdot\Phi_{i,+}\\ &=\sum_{i\in[n]}\left(\eta_{i}\alpha(\eta_{i}\alpha-1)+\alpha^{2}\cdot(2\eta_{i}\hat{\delta}_{i}+\hat{\delta}_{i}^{2})-\alpha\cdot\hat{\delta}_{i}\right)\cdot\Phi_{i,+}\\ &\leq\sum_{i\in[n]}\left(\eta_{i}\alpha(\eta_{i}\alpha-1)+5\alpha^{2}\lambda+\frac{5}{4}\alpha\lambda\right)\cdot\Phi_{i,+}.\end{split} (31)

Here, the last inequality uses \lambda\leq 1 and \lvert\hat{\delta}_{i}\rvert\leq\frac{5}{4}\lambda (Claim B.1). We now apply Claim B.3, \nu\eta\leq 1/4\leq\lambda, and \alpha<1/8 to get

 \mathbb{E}\left[\Delta_{+}(t+1)\mid\bm{X}\right]\leq\left(\alpha^{2}\lambda+5\alpha^{2}\lambda+\frac{5}{4}\alpha\lambda\right)\cdot\Phi_{+}<2\alpha\lambda\cdot\Phi_{+}. ∎
###### Proof of Lemma 3.9.

To calculate the expected upper potential change, we use Lemma 3.7 and sum up over all i\in[n] (using similar inequalities as in the proof of Lemma 3.8 and the definition of \hat{\delta}_{i}):

 \begin{split}\mathbb{E}\left[\Delta_{+}(t+1)\mid\bm{X}\right]&\leq 6\alpha^{2}\lambda\cdot\Phi_{+}-\sum_{i\in[n]}\alpha\cdot\hat{\delta}_{i}\cdot\Phi_{i,+}\\ &=\left(6\alpha^{2}\lambda-\alpha\lambda\cdot\check{1}\right)\cdot\Phi_{+}+\hat{\alpha}\lambda n\sum_{i\in[n]}p_{i}\cdot\Phi_{i,+}.\end{split} (32)

We now use that \Phi_{i,+}=e^{\alpha\cdot(X_{i}-\varnothing)}\leq 1 for all i>\frac{3}{4}n (by our assumption on X_{\frac{3}{4}n}). This yields

 \mathbb{E}\left[\Delta_{+}(t+1)\mid\bm{X}\right]\leq\left(6\alpha^{2}\lambda-\alpha\lambda\cdot\check{1}\right)\cdot\Phi_{+}+\hat{\alpha}\lambda n\sum_{i\leq\frac{3}{4}n}p_{i}\cdot\Phi_{i,+}+2\alpha\lambda n. (33)

Finally, we apply Claim B.2 and the definitions of \check{1} and \hat{\alpha} to get

 \begin{split}\mathbb{E}\left[\Delta_{+}(t+1)\mid\bm{X}\right]&\leq\left(6\alpha^{2}\lambda-\alpha\lambda\cdot\check{1}+(1-2\varepsilon)\cdot\hat{\alpha}\lambda\right)\cdot\Phi_{+}+2\alpha\lambda n\\ &\leq\left(4\alpha^{2}\lambda-2\varepsilon\cdot\alpha\lambda\right)\cdot\Phi_{+}+2\alpha\lambda n.\end{split} (34)

Using \alpha\leq\varepsilon/4 yields the desired result. ∎

###### Proof of Lemma 3.10.

To calculate the expected lower potential change, we use Lemma 3.7 and sum up over all i\in[n] (as in the proof of Lemma 3.9):

 \begin{split}\mathbb{E}\left[\Delta_{-}(t+1)\mid\bm{X}\right]&\leq 6\alpha^{2}\lambda\cdot\Phi_{-}+\sum_{i\in[n]}\alpha\cdot\check{\delta}_{i}\cdot\Phi_{i,-}\\ &=\left(6\alpha^{2}\lambda+\alpha\lambda\cdot\hat{1}\right)\cdot\Phi_{-}-\check{\alpha}\lambda n\sum_{i\in[n]}p_{i}\cdot\Phi_{i,-}.\end{split} (35)

We now use that \Phi_{i,-}=e^{\alpha\cdot(\varnothing-X_{i})}\leq 1 for all i\leq\frac{n}{4} (by our assumption on X_{\frac{n}{4}}) and apply Claim B.2 to get

 \begin{split}\mathbb{E}\left[\Delta_{-}(t+1)\mid\bm{X}\right]&\leq\left(6\alpha^{2}\lambda+\alpha\lambda\cdot\hat{1}\right)\cdot\Phi_{-}-(1+2\varepsilon)\cdot\check{\alpha}\lambda n\cdot\frac{\Phi_{-}-\frac{n}{4}}{n}\\ &=\left(6\alpha^{2}\lambda+\alpha\lambda\cdot\hat{1}-(1+2\varepsilon)\cdot\check{\alpha}\lambda\right)\cdot\Phi_{-}+(1+2\varepsilon)\cdot\frac{\check{\alpha}\lambda n}{4}\\ &\leq\left(8\alpha^{2}\lambda-2\varepsilon\cdot\alpha\lambda\right)\cdot\Phi_{-}+\frac{\alpha\lambda n}{2},\end{split} (36)

where the last inequality used the definitions of \hat{1}, \check{\alpha}, as well as \check{\alpha}>\alpha-\alpha^{2}. Using \alpha\leq\varepsilon/8 yields the desired result. ∎

###### Proof of Lemma 3.11.

Let L\coloneqq\sum_{i\in[n]}\max(X_{i}-\varnothing,0)=\sum_{i\in[n]}\max(\varnothing-X_{i},0) be the “excess load” above and below the average. First note that the assumption X_{\frac{3}{4}n}\geq\varnothing implies \Phi_{-}\geq\frac{n}{4}\cdot\exp(\frac{\alpha L}{n/4}) (using Jensen’s inequality). On the other hand, we can use the assumption \mathbb{E}\left[\Delta_{+}(t+1)\mid\bm{X}\right]\geq-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi_{+} to show an upper bound on \Phi_{+}. To this end, we use Lemma 3.7 and sum up over all i\in[n] (as in the proof of Lemma 3.9):

 \begin{split}\mathbb{E}\left[\Delta_{+}(t+1)\mid\bm{X}\right]&\leq 6\alpha^{2}\lambda\cdot\Phi_{+}-\sum_{i\in[n]}\alpha\cdot\hat{\delta}_{i}\cdot\Phi_{i,+}\\ &=6\alpha^{2}\lambda\cdot\Phi_{+}-\sum_{i\leq\frac{n}{3}}\alpha\cdot\hat{\delta}_{i}\cdot\Phi_{i,+}-\sum_{i>\frac{n}{3}}\alpha\cdot\hat{\delta}_{i}\cdot\Phi_{i,+}.\end{split} (37)

For i\leq n/3 we have p_{i}=\frac{2i-1}{n^{2}}\leq\frac{2}{3n} and, using the definition of \check{1} and \hat{\alpha}, \hat{\delta}_{i}=\lambda n\cdot\bigl(\sfrac{1}{n}\cdot\check{1}-p_{i}\cdot\sfrac{\hat{\alpha}}{\alpha}\bigr)\geq(1-5\alpha)\lambda/3. Setting \Phi_{\leq n/3,+}\coloneqq\sum_{i\leq n/3}\Phi_{i,+} and \Phi_{>n/3,+}\coloneqq\sum_{i>n/3}\Phi_{i,+}, together with Claim B.1 this yields

 \begin{split}\mathbb{E}\left[\Delta_{+}(t+1)\mid\bm{X}\right]&\leq 6\alpha^{2}\lambda\cdot\Phi_{+}-\frac{\alpha(1-5\alpha)\lambda}{3}\cdot\Phi_{\leq n/3,+}+\frac{5}{4}\alpha\lambda\cdot\Phi_{>n/3,+}\\ &=\left(6\alpha^{2}\lambda-\frac{\alpha(1-5\alpha)\lambda}{3}\right)\cdot\Phi_{+}+\left(\frac{5}{4}\alpha\lambda+\frac{\alpha(1-5\alpha)\lambda}{3}\right)\cdot\Phi_{>n/3,+}\\ &\leq-\frac{\varepsilon\alpha\lambda}{2}\cdot\Phi_{+}+2\alpha\lambda\cdot\Phi_{>n/3,+},\end{split} (38)

where the last inequality uses \alpha\leq 1/46\leq\frac{1}{23}-\frac{3}{46}\varepsilon. With this, the assumption \mathbb{E}\left[\Delta_{+}(t+1)\mid\bm{X}\right]\geq-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi_{+} implies \Phi_{+}\leq\frac{8}{\varepsilon}\cdot\Phi_{>n/3,+}\leq\frac{8}{\varepsilon}\cdot\frac{2n}{3}e^{\frac{\alpha L}{n/3}}=\frac{16n}{3\varepsilon}e^{\frac{3\alpha L}{n}} (the last inequality uses that none of the 2n/3 remaining bins can have a load higher than L/(n/3)). To finish the proof, assume \Phi_{+}>\frac{\varepsilon}{4}\cdot\Phi_{-} (otherwise the lemma holds). Combining this with the upper bound on \Phi_{+} and with the lower bound on \Phi_{-}, we get

 \frac{16n}{3\varepsilon}e^{\frac{3\alpha L}{n}}\geq\Phi_{+}>\frac{\varepsilon}{4}\cdot\Phi_{-}\geq\frac{\varepsilon n}{16}\cdot e^{\frac{4\alpha L}{n}}. (39)

Thus, the excess load can be bounded by L<\frac{n}{\alpha}\cdot\ln\left(\frac{256}{3\varepsilon^{2}}\right). Now, the lemma’s statement follows from \Phi=\Phi_{+}+\Phi_{-}<\frac{5}{\varepsilon}\cdot\Phi_{+}\leq\frac{80n}{3\varepsilon^{2}}e^{\frac{3\alpha L}{n}}=\varepsilon^{-8}\cdot\mathrm{O}\left(n\right). ∎

###### Proof of Lemma 3.12.

Let L\coloneqq\sum_{i\in[n]}\max(X_{i}-\varnothing,0)=\sum_{i\in[n]}\max(\varnothing-X_{i},0) be the “excess load” above and below the average. First note that the assumption X_{\frac{n}{4}}\leq\varnothing implies \Phi_{+}\geq\frac{n}{4}\cdot e^{\frac{\alpha L}{n/4}} (using Jensen’s inequality). On the other hand, we can use the assumption \mathbb{E}\left[\Delta_{-}(t+1)\mid\bm{X}\right]\geq-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi_{-} to show an upper bound on \Phi_{-}. To this end, we use Lemma 3.7 and sum up over all i\in[n] (as in the proof of Lemma 3.10):

 \begin{split}\mathbb{E}\left[\Delta_{-}(t+1)\mid\bm{X}\right]&\leq 6\alpha^{2}\lambda\cdot\Phi_{-}+\sum_{i\in[n]}\alpha\cdot\check{\delta}_{i}\cdot\Phi_{i,-}\\ &=6\alpha^{2}\lambda\cdot\Phi_{-}+\sum_{i\leq\frac{2n}{3}}\alpha\cdot\check{\delta}_{i}\cdot\Phi_{i,-}+\sum_{i>\frac{2n}{3}}\alpha\cdot\check{\delta}_{i}\cdot\Phi_{i,-}.\end{split} (40)

For i\geq 2n/3 we have p_{i}=\frac{2i-1}{n^{2}}\geq\frac{4}{3n}-\frac{1}{n^{2}}. Using this with p_{i}\leq p_{n}\leq 2/n and \check{\alpha}\geq\alpha-\alpha^{2}, we can bound \check{\delta}_{i}=\lambda n\cdot\bigl(\sfrac{1}{n}\cdot\hat{1}-p_{i}\cdot\sfrac{\check{\alpha}}{\alpha}\bigr)\leq\lambda\cdot(-\sfrac{1}{3}+\frac{1+\alpha}{n})+2\alpha\lambda\leq-\lambda/6+2\alpha\lambda. Setting \Phi_{\leq 2n/3,-}\coloneqq\sum_{i\leq 2n/3}\Phi_{i,-} and \Phi_{>2n/3,-}\coloneqq\sum_{i>2n/3}\Phi_{i,-}, together with Claim B.1 this yields

 \begin{split}\mathbb{E}\left[\Delta_{-}(t+1)\mid\bm{X}\right]&\leq 6\alpha^{2}\lambda\cdot\Phi_{-}+\frac{5}{4}\alpha\lambda\cdot\Phi_{\leq 2n/3,-}-\frac{\alpha\lambda}{6}\cdot\Phi_{>2n/3,-}+2\alpha^{2}\lambda\cdot\Phi_{>2n/3,-}\\ &\leq\left(8\alpha^{2}\lambda-\alpha\lambda/6\right)\cdot\Phi_{-}+\left(\frac{5}{4}\alpha\lambda+\alpha\lambda/6\right)\cdot\Phi_{\leq 2n/3,-}\\ &\leq-\frac{\varepsilon\alpha\lambda}{2}\cdot\Phi_{-}+2\alpha\lambda\cdot\Phi_{\leq 2n/3,-},\end{split} (41)

where the last inequality uses \alpha\leq 1/32\leq\frac{1}{16}-\frac{1}{48}\varepsilon. With this, the assumption \mathbb{E}\left[\Delta_{-}(t+1)\mid\bm{X}\right]\geq-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi_{-} implies that \Phi_{-}\leq\frac{8}{\varepsilon}\cdot\Phi_{\leq 2n/3,-}\leq\frac{8}{\varepsilon}\cdot\frac{2n}{3}e^{\frac{\alpha L}{n/3}}=\frac{16n}{3\varepsilon}e^{\frac{3\alpha L}{n}} (the last inequality uses that none of the 2n/3 remaining bins can have a load higher than L/(n/3)). To finish the proof, assume \Phi_{-}>\frac{\varepsilon}{4}\cdot\Phi_{+} (otherwise the lemma holds). Combining this with the upper bound on \Phi_{-} and with the lower bound on \Phi_{+}, we get

 \frac{16n}{3\varepsilon}e^{\frac{3\alpha L}{n}}\geq\Phi_{-}>\frac{\varepsilon}{4}\cdot\Phi_{+}\geq\frac{\varepsilon n}{16}\cdot e^{\frac{4\alpha L}{n}}. (42)

Thus, the excess load can be bounded by L<\frac{n}{\alpha}\cdot\ln\left(\frac{256}{3\varepsilon^{2}}\right). Now, the lemma’s statement follows from \Phi=\Phi_{+}+\Phi_{-}<\frac{5}{\varepsilon}\cdot\Phi_{-}\leq\frac{80n}{3\varepsilon^{2}}e^{\frac{3\alpha L}{n}}=\varepsilon^{-8}\cdot\mathrm{O}\left(n\right). ∎

###### Proof of Lemma 3.13.

The proof is via case analysis.

#### Case 1: x_{\frac{n}{4}}\geq\varnothing and x_{\frac{3n}{4}}\leq\varnothing

In this case the desired bound follows from Lemma 3.9 and Lemma 3.10.

#### Case 2: x_{\frac{n}{4}}\geq x_{\frac{3n}{4}}\geq\varnothing

For \mathbb{E}\left[\Delta_{+}(t+1)\mid\bm{X}(t)\right]\leq-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi_{+} the result follows from Lemma 3.10.

Recall that \mathbb{E}\left[\Delta(t+1)\mid\bm{X}(t)\right]=\mathbb{E}\left[\Delta_{+}(t+1)\mid\bm{X}(t)\right]+\mathbb{E}\left[\Delta_{-}(t+1)\mid\bm{X}(t)\right].

By Lemma 3.10 we can derive the following:

 \mathbb{E}\left[\Delta(t+1)\mid\bm{X}(t)\right]\leq\mathbb{E}\left[\Delta_{+}(t+1)\mid\bm{X}(t)\right]-\varepsilon\alpha\lambda\cdot\Phi_{-}(\bm{X}(t))+\frac{\alpha\lambda n}{2}. (43)

We now show that the right-hand side of (43) is at most the right-hand side of Lemma 3.13. This holds whenever \mathbb{E}\left[\Delta_{+}(t+1)\mid\bm{X}(t)\right] is at most

 \begin{split}&-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi(\bm{X}(t))+\varepsilon^{-8}\cdot\mathrm{O}\left(n\right)+\varepsilon\alpha\lambda\cdot\Phi_{-}(\bm{X}(t))-\frac{\alpha\lambda n}{2}\\ &\geq-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi(\bm{X}(t))+\varepsilon\alpha\lambda\cdot\Phi_{-}(\bm{X}(t))\\ &=-\frac{\varepsilon\alpha\lambda}{4}\cdot\left(\Phi_{+}(\bm{X}(t))+\Phi_{-}(\bm{X}(t))\right)+\varepsilon\alpha\lambda\cdot\Phi_{-}(\bm{X}(t))\\ &\geq-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi_{+}(\bm{X}(t)),\end{split} (44)

where the first inequality holds if the constant hidden in \mathrm{O}\left(n\right) is large enough. Hence the assumed bound \mathbb{E}\left[\Delta_{+}(t+1)\mid\bm{X}(t)\right]\leq-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi_{+}(\bm{X}(t)) suffices.

For \mathbb{E}\left[\Delta_{+}(t+1)\mid\bm{X}(t)\right]\geq-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi_{+} Lemma 3.11 gives two subcases.

#### Case 2.1: \Phi_{+}(\bm{X}(t))\leq\frac{\varepsilon}{4}\cdot\Phi_{-}(\bm{X}(t))

Using Lemma 3.8 and Lemma 3.10 we obtain the following:

 \begin{split}\mathbb{E}\left[\Delta(t+1)\mid\bm{X}(t)\right]&\leq 2\alpha\lambda\cdot\Phi_{+}(\bm{X}(t))-\varepsilon\alpha\lambda\cdot\Phi_{-}(\bm{X}(t))+\frac{\alpha\lambda n}{2}\\ &\leq\frac{2\alpha\lambda\varepsilon}{4}\cdot\Phi_{-}(\bm{X}(t))-\varepsilon\alpha\lambda\cdot\Phi_{-}(\bm{X}(t))+\frac{\alpha\lambda n}{2}\\ &=-\frac{\varepsilon\alpha\lambda}{2}\cdot\Phi_{-}(\bm{X}(t))+\frac{\alpha\lambda n}{2}\\ &\leq-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi(\bm{X}(t))+\varepsilon^{-8}\cdot\mathrm{O}\left(n\right).\end{split} (45)

#### Case 2.2: \Phi(\bm{X}(t))=\varepsilon^{-8}\cdot\mathrm{O}\left(n\right)

Using Lemma 3.8 we get that \mathbb{E}\left[\Delta(t+1)\mid\bm{X}(t)\right]\leq 2\alpha\lambda\varepsilon^{-8}\cdot\mathrm{O}\left(n\right).

It remains to show that

 2\alpha\lambda\varepsilon^{-8}\cdot\mathrm{O}\left(n\right)\leq-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi(\bm{X}(t))+\varepsilon^{-8}\cdot\mathrm{O}\left(n\right). (46)

Since \Phi(\bm{X}(t))=\varepsilon^{-8}\cdot\mathrm{O}\left(n\right), this reads

 2\alpha\lambda\varepsilon^{-8}\cdot\mathrm{O}\left(n\right)\leq\left(1-\frac{\varepsilon\alpha\lambda}{4}\right)\cdot\varepsilon^{-8}\cdot\mathrm{O}\left(n\right). (47)

This holds when 2\alpha\lambda\leq\bigl(1-\frac{\varepsilon\alpha\lambda}{4}\bigr). By definition \alpha\leq 1/8 and \lambda<1. The result follows.

#### Case 3: x_{\frac{3n}{4}}\leq x_{\frac{n}{4}}\leq\varnothing

This case is similar to Case 2. For \mathbb{E}\left[\Delta_{-}(t+1)\mid\bm{X}(t)\right]\leq-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi_{-} the result follows from Lemma 3.9. For \mathbb{E}\left[\Delta_{-}(t+1)\mid\bm{X}(t)\right]\geq-\frac{\varepsilon\alpha\lambda}{4}\cdot\Phi_{-} two subcases are given by Lemma 3.12.

#### Case 3.1: \Phi_{-}(\bm{X}(t))\leq\frac{\varepsilon}{4}\cdot\Phi_{+}(\bm{X}(t))

The result follows from applying Lemma 3.12 and Lemma 3.8.

#### Case 3.2: \Phi(\bm{X}(t))=\varepsilon^{-8}\cdot\mathrm{O}\left(n\right)

This result follows from Lemma 3.8. ∎
