Barak-Erdős graphs and the infinite-bin model

Abstract

A Barak-Erdős graph is a directed acyclic version of an Erdős-Rényi graph. It is obtained by performing independent bond percolation with parameter $p$ on the complete graph with vertices $\{1, \dots, n\}$, in which the edge between two vertices $i < j$ is directed from $i$ to $j$. The length of the longest path in this graph grows linearly with the number of vertices, at rate $C(p)$. In this article, we use a coupling between Barak-Erdős graphs and infinite-bin models to provide explicit estimates on $C(p)$. More precisely, we prove that the front of an infinite-bin model grows at linear speed, and that this speed can be obtained as the sum of a series. Using these results, we prove the analyticity of $C$ for $p > 1/2$, and compute its power series expansion. We also obtain the first two terms of the asymptotic expansion of $C$ as $p \to 0$, using a coupling with branching random walks.

1 Introduction

Random graphs and interacting particle systems have been two active fields of research in probability in the past decades. In 2003, Foss and Konstantopoulos [11] introduced a new interacting particle system called the infinite-bin model and established a correspondence between a certain class of infinite-bin models and Barak-Erdős random graphs, which are a directed acyclic version of Erdős-Rényi graphs.

In this article, we study the speed at which the front of an infinite-bin model drifts to infinity. These results are applied to obtain fine asymptotics for the length of the longest path in a Barak-Erdős graph. In the remainder of the introduction, we first describe Barak-Erdős graphs, then infinite-bin models. We then state our main results on infinite-bin models, and their consequences for Barak-Erdős graphs.

1.1 Barak-Erdős graphs

Barak and Erdős introduced in [3] the following model of a random directed graph with vertex set $\{1, \dots, n\}$ (which we refer to as Barak-Erdős graphs from now on): for each pair of vertices $i < j$, add an edge directed from $i$ to $j$ with probability $p$, independently for each pair. They were interested in the maximal size of strongly independent sets in such graphs.

However, one of the most widely studied properties of Barak-Erdős graphs has been the length of their longest path. It has applications to mathematical ecology (food chains) [9, 22], performance evaluation of computer systems (speed of parallel processes) [14, 15] and queuing theory (stability of queues) [11].

Newman [21] studied the length of the longest path in Barak-Erdős graphs in several settings, when the edge probability $p$ is constant (dense case), but also when it is of the form with (sparse case). In the dense case, he proved that when $n$ gets large, the length of the longest path grows linearly with $n$ in the first-order approximation:

(1.1)

where the linear growth rate is a function of . We plot in Figure 1 an approximation of .
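The linear growth described above can be observed numerically. The following minimal sketch (not taken from the article) samples a Barak-Erdős graph and computes the length of its longest path by dynamic programming over the vertices in increasing order. The function name `longest_path_length`, the 0-based vertex labels and the convention that the length counts edges are assumptions made for this illustration.

```python
import random
from typing import List


def longest_path_length(n: int, p: float, seed: int = 0) -> int:
    """Sample a Barak-Erdos graph on vertices 0..n-1 and return the number
    of edges of its longest directed path (illustrative convention)."""
    rng = random.Random(seed)
    # best[j] = length of the longest path ending at vertex j
    best: List[int] = [0] * n
    for j in range(n):
        for i in range(j):
            # The edge i -> j is present with probability p, independently.
            if rng.random() < p and best[i] + 1 > best[j]:
                best[j] = best[i] + 1
    return max(best, default=0)
```

Dividing `longest_path_length(n, p)` by `n` for a large `n` gives a rough estimate of the growth rate. The quadratic running time limits this direct approach to moderate values of `n`.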

Figure 1: Plot of an approximation of , using iterations of an infinite-bin model, for values of that are integer multiples of .

Newman proved that the function is continuous and computed its derivative at . Foss and Konstantopoulos [11] studied Barak-Erdős graphs under the name of “stochastic ordered graphs” and provided upper and lower bounds for , obtaining in particular that

where denotes the probability of the absence of an edge.

Denisov, Foss and Konstantopoulos [10] introduced the more general model of a directed slab graph and proved a law of large numbers and a central limit theorem for the length of its longest path. Konstantopoulos and Trinajstić [17] looked at a directed random graph with vertices in (instead of for the infinite version of Barak-Erdős graphs) and identified fluctuations following the Tracy-Widom distribution. Foss, Martin and Schmidt [12] added to the original Barak-Erdős model random edge lengths, in which case the problem of the longest path can be reformulated as a last-passage percolation question. Gelenbe, Nelson, Philips and Tantawi [14] studied a similar problem, but with random weights on the vertices rather than on the edges.

Ajtai, Komlós and Szemerédi [1] studied the asymptotic behaviour of the longest path in sparse Erdős-Rényi graphs, which are the undirected version of Barak-Erdős graphs.

1.2 The infinite-bin model

Foss and Konstantopoulos introduced the infinite-bin model in [11] as an interacting particle system which, for the right choice of parameters, gives information about the growth rate of the longest path in Barak-Erdős graphs. Consider a set of bins indexed by the set of integers $\mathbb{Z}$. Each bin may contain any number of balls, finite or infinite. A configuration of balls in bins is called admissible if there exists $m \in \mathbb{Z}$ such that:

  1. every bin with an index smaller than or equal to $m$ is non-empty;

  2. every bin with an index strictly larger than $m$ is empty.

The largest index of a non-empty bin is called the position of the front. From now on, all configurations will implicitly be assumed to be admissible. Given an integer $k \geq 1$, we define the move of type $k$ as a map $\Phi_k$ from the set of configurations to itself. Given an initial configuration $X$, $\Phi_k(X)$ is obtained by adding one ball to the bin of index $b_k + 1$, where $b_k$ is the index of the bin containing the $k$-th ball of $X$ (the balls are counted from right to left, starting from the rightmost nonempty bin).

(a) A configuration , the numbers inside the balls indicate how they are counted from right to left.
(b) The configuration .
(c) The configuration .
Figure 2: Action of two moves on a configuration.
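The action of a move can be sketched in a few lines. This is an illustration under stated assumptions, not the article's notation: a configuration is stored as a list `bins` read from the front leftwards (`bins[0]` is the front bin), and the last stored entry is assumed to stand for a bin holding infinitely many balls, so that every move type is legal. The name `move` is hypothetical.

```python
from typing import List


def move(bins: List[int], k: int) -> List[int]:
    """Return the configuration obtained by a move of type k.

    bins[0] is the ball count of the front bin, bins[1] the bin to its left,
    and so on. Assumption of this sketch: the last stored bin is infinite.
    """
    if k < 1 or not bins:
        raise ValueError("need k >= 1 and a non-empty window")
    new_bins = list(bins)
    seen = 0
    for depth, count in enumerate(bins):
        seen += count
        if k <= seen or depth == len(bins) - 1:
            # The k-th ball (counted from the right) sits at this depth;
            # the new ball goes into the bin immediately to its right.
            if depth == 0:
                new_bins.insert(0, 1)  # a new bin is opened: the front advances
            else:
                new_bins[depth - 1] += 1
            break
    return new_bins
```

For instance, starting from `[2, 1, 3]`, a move of type 1 opens a new front bin, while a move of type 3 adds a ball to the current front bin.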

Given a probability distribution $\mu$ on the set of positive integers and an initial configuration $X_0$, one defines the Markovian evolution of the infinite-bin model with distribution $\mu$ (or IBM($\mu$) for short) as the following stochastic recursive sequence:

$X_{n+1} = \Phi_{\xi_{n+1}}(X_n)$ for $n \geq 0$,

where $\Phi_k$ denotes the move of type $k$ and $(\xi_n)_{n \geq 1}$ is an i.i.d. sequence of law $\mu$. We prove in Theorem 1.1 that the front moves to the right at a speed which tends a.s. to a constant limit $v_\mu$. We call $v_\mu$ the speed of the IBM($\mu$). Note that the model defined in [11] was slightly more general, allowing $(\xi_n)$ to be a stationary-ergodic sequence. We also do not adopt their convention of shifting the indexing of the bins, which forces the front to always be at the same position.

Foss and Konstantopoulos [11] proved that if $\mu$ is the geometric distribution of parameter $p$, then the speed $v_\mu$ of the IBM($\mu$) equals $C(p)$, where $C(p)$ is the growth rate of the length of the longest path in Barak-Erdős graphs with edge probability $p$. They also proved, for distributions with finite mean verifying , the existence of renovation events, which yields a functional law of large numbers and central limit theorem for the IBM($\mu$). Based on a coupling result for the infinite-bin model obtained by Chernysh and Ramassamy [8], Foss and Zachary [13] managed to remove the condition required by [11] to obtain renovation events.
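The correspondence with the geometric distribution suggests a simple Monte Carlo estimate of the growth rate. The sketch below is only an illustration: it assumes the geometric distribution puts mass $p(1-p)^{k-1}$ on each integer $k \geq 1$, starts from a configuration whose front bin is infinite, and uses the hypothetical names `sample_geometric` and `front_speed`.

```python
import random
from collections import deque


def sample_geometric(p: float, rng: random.Random) -> int:
    """Sample k >= 1 with probability p * (1 - p) ** (k - 1)."""
    k = 1
    while rng.random() >= p:
        k += 1
    return k


def front_speed(p: float, steps: int, seed: int = 0) -> float:
    """Empirical speed of the front after `steps` moves of an infinite-bin
    model driven by geometric move types."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    rng = random.Random(seed)
    # bins[0] is the front bin; the last entry stands for an infinite bin.
    bins = deque([1])
    front = 0
    for _ in range(steps):
        k = sample_geometric(p, rng)
        seen = 0
        depth = 0
        for depth, count in enumerate(bins):
            seen += count
            if k <= seen:
                break
        # If the loop ran to the end, the k-th ball lies in the infinite bin.
        if depth == 0:
            bins.appendleft(1)  # new bin opened, the front advances by one
            front += 1
        else:
            bins[depth - 1] += 1
    return front / steps
```

Calling `front_speed(p, steps)` with a large number of steps gives a noisy approximation of the speed; no accuracy guarantee is claimed for this sketch.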

Aldous and Pitman [2] had already studied a special case of the infinite-bin model, namely what happens to the speed of the front when is the uniform distribution on , in the limit when goes to infinity. They were motivated by an application to the running time of local improvement algorithms defined by Tovey [24].

1.3 Speed of infinite-bin models

The remainder of the introduction is devoted to the presentation of the main results proved in this paper. In this section we state the results related to general infinite-bin models, and in the next one we state the results related to the Barak-Erdős graphs.

We first prove that in every infinite-bin model, the front moves at linear speed. Foss and Konstantopoulos [11] had derived a special case of this result, when the distribution has finite expectation.

Theorem 1.1.

Let be an infinite-bin model with distribution , starting from an admissible configuration . For any , we write for the position of the front of . There exists , depending only on the distribution , such that

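In the next result, we obtain an explicit formula for the speed of the IBM($\mu$), as a series. To give this formula we first introduce some notation. Recalling that $\mathbb{N}$ is the set of positive integers, we denote by $\mathcal{A}$ the set of words on the alphabet $\mathbb{N}$, i.e. the set of all finite-length sequences of elements of $\mathbb{N}$. Given a non-empty word $\alpha \in \mathcal{A}$, written $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_n)$ (where the $\alpha_i$ are the letters of $\alpha$), we denote by $L(\alpha) = n$ the length of $\alpha$. The empty word is denoted by $\emptyset$.

Fix an infinite-bin model configuration $X$. We define the subset $\mathcal{P}_X$ of $\mathcal{A}$ as follows: a word $\alpha$ belongs to $\mathcal{P}_X$ if it is non-empty, and if starting from configuration $X$ and applying successively the moves $\Phi_{\alpha_1}, \dots, \Phi_{\alpha_n}$, the last move results in placing a ball in a previously empty bin.

Given a word $\alpha$ which is not the empty word, we set $\varpi\alpha$ to be the word obtained from $\alpha$ by removing the first letter. We also set $\varpi\emptyset = \emptyset$. We define the function $\varepsilon_X : \mathcal{A} \to \{-1, 0, 1\}$ as follows:

$\varepsilon_X(\alpha) = \mathbf{1}_{\{\alpha \in \mathcal{P}_X\}} - \mathbf{1}_{\{\varpi\alpha \in \mathcal{P}_X\}}$.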

Theorem 1.2.

Let be an admissible configuration and a probability distribution on . We define the weight of a word by

If , then

(1.2)

Remark 1.3.

One of the most striking features of (1.2) is that whereas for any , is a non-constant function of , does not depend on this choice of configuration. As a result, Theorem 1.2 gives in fact an infinite number of formulas for the speed of the IBM().

Theorem 1.2 can be extended to prove the following result:

In other words, if we define as the Cesàro mean of its partial sums (on words of finite length), (1.2) holds for any probability distribution and admissible configuration .
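The partial sums over words of bounded length can be computed exactly for a finitely supported distribution. The sketch below is an illustration under explicit assumptions: the function attached to a word is taken to be the indicator that the last move of the word opens a new bin, minus the same indicator for the word without its first letter (consistent with the proof of Lemma 4.1), the weight of a word is the product of the masses of its letters, and a configuration is stored from the front leftwards with the last stored bin treated as infinite. The names `opens_new_bin`, `epsilon` and `partial_sum` are hypothetical.

```python
from fractions import Fraction
from itertools import product
from typing import Dict, Sequence


def opens_new_bin(bins: Sequence[int], word: Sequence[int]) -> bool:
    """True if `word` is non-empty and its last move fills an empty bin."""
    state = list(bins)  # state[0] = front bin, last entry = infinite bin
    advanced = False
    for k in word:
        seen = 0
        depth = 0
        for depth, count in enumerate(state):
            seen += count
            if k <= seen:
                break
        advanced = depth == 0
        if advanced:
            state.insert(0, 1)
        else:
            state[depth - 1] += 1
    return advanced


def epsilon(bins: Sequence[int], word: Sequence[int]) -> int:
    """Indicator for the word minus indicator for the word without its first letter."""
    return int(opens_new_bin(bins, word)) - int(opens_new_bin(bins, word[1:]))


def partial_sum(bins: Sequence[int], mu: Dict[int, Fraction], max_len: int) -> Fraction:
    """Sum of epsilon(word) * weight(word) over non-empty words of length <= max_len."""
    total = Fraction(0)
    for length in range(1, max_len + 1):
        for word in product(mu, repeat=length):
            weight = Fraction(1)
            for k in word:
                weight *= mu[k]
            total += epsilon(bins, word) * weight
    return total
```

Summing over the first letter makes these sums telescope, so `partial_sum(bins, mu, n)` coincides with the probability that the $n$-th move opens a new bin. The enumeration is exponential in `max_len`, which restricts this sketch to short words and small supports.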

1.4 Longest increasing paths in Barak-Erdős graphs

Using the coupling introduced by Foss and Konstantopoulos between Barak-Erdős graphs and infinite-bin models, we use the previous results to extract information on the function defined in (1.1). Firstly, we prove that for large enough (i.e. for dense Barak-Erdős graphs), the function is analytic and we obtain the power series expansion of centered at . Secondly, we provide the first two terms of the asymptotic expansion of as .

We deduce from Theorem 1.2 the analyticity of for close to . For any word , we define the height of to be

For any and admissible configuration , we set

(1.3)

Theorem 1.4.

The function is analytic on and for ,

Similarly to what has been observed in Remark 1.3, this result proves that the value of does not depend on the configuration , justifying a posteriori the notation.

We do not believe the bounds and to be optimal, see Remark 1.7. They are obtained using very rough bounds on the function .

Remark 1.5.

Using (1.3) and Lemma 6.2, it is possible to explicitly compute as many coefficients of the power series expansion as desired, by picking a configuration and computing quantities of the form for finitely many words . For example, we observe that as ,

It is clear from formula (1.3) that is integer-valued. Based on our computations, we conjecture that is non-negative and non-decreasing.

We now turn to the asymptotic behaviour of as , i.e. the length of the longest increasing path in sparse Barak-Erdős graphs. We improve the result obtained by Newman [21].

Theorem 1.6.

We have .

In particular, we note that does not have a finite second derivative at .

Remark 1.7.

Numerical simulations tend to suggest that the power series expansion of at has a radius of convergence larger than but smaller than . Together with the fact that admits no second derivative at , this raises the question of the existence of a phase transition in this process.

Theorem 1.6 is obtained by coupling the infinite-bin model with uniform distribution with a continuous-time branching random walk with selection (as observed by Aldous and Pitman [2]) and by extending to the continuous-time setting the results of Bérard and Gouéré [4] on the asymptotic behaviour of a discrete-time branching random walk. Assuming that a conjecture of Brunet and Derrida [7] on the speed of a branching random walk with selection holds, the next term in the asymptotic expansion should be given by .

Remark 1.8.

With arguments similar to the ones used to prove Theorem 1.6, we expect that one can also obtain the asymptotic behaviour of as and simultaneously, proving that:

as long as . We expect a different behaviour if .

Organisation of the paper

We state more precisely the notation used to study the infinite-bin model in Section 2. We also introduce an increasing coupling between infinite-bin models, which is a key result for the rest of the article.

In Section 3, we prove that the speed of an infinite-bin model with a measure of finite support can be expressed using the invariant measure of a finite Markov chain. This result is then used to prove Theorem 1.1 in the general case. We prove Theorem 1.2 in Section 4 using a method akin to “exact perturbative expansion”.

We review in Section 5 the Foss-Konstantopoulos coupling between Barak-Erdős graphs and the infinite-bin model and use it to provide a sequence of upper and lower bounds converging exponentially fast to . This coupling is used in Section 6, where we prove Theorem 1.4 using Theorem 1.2. Finally, we prove Theorem 1.6 in Section 7, by extending the results of Bérard and Gouéré [4] to compute the asymptotic behaviour of a continuous-time branching random walk with selection.

2 Basic properties of the infinite-bin model

We write for the set of positive integers, , for the set of non-negative integers and . We denote by

the set of admissible configurations for an infinite-bin model. Note that the definition we use here is more restrictive than the one used, as a simplification, in the introduction. Indeed, we impose here that if a bin has an infinite number of balls, every bin to its left also has an infinite number of balls. However, this has no impact on our results, as the dynamics of an infinite-bin model does not affect bins to the left of a bin with an infinite number of balls. One does not create balls in a bin at distance greater than 1 from a non-empty bin.

We wish to point out that our definition of admissible configurations has been chosen out of convenience. Most of the results of this article could easily be generalized to infinite-bin models with a starting configuration belonging to

see e.g. Remark 3.7. They could even be generalized to configurations starting with a finite number of balls, if we adapt the dynamics of the infinite-bin model as follows. For any , if is larger than the number of balls existing at time , then the step is ignored and the IBM configuration is not modified. However, with this definition some trivial cases might arise, for example starting with a configuration with only one ball, and using a measure with .

For any and , we call the number of balls at position in the configuration . Observe that the set of non-empty bins is a semi-infinite interval of . In particular, for any , there exists a unique integer such that and for all . The integer is called the front of the configuration.

Let , and . We denote by

the number of balls to the right of and the leftmost position such that there are fewer than balls to its right, respectively. Note that the position of the front in the configuration is given by . Observe that for any ,

(2.1)
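The two quantities just introduced are easy to compute on a finite example. In the sketch below, which is only an illustration, a configuration with finitely many balls is stored as a dictionary mapping bin indices to ball counts; the names `balls_right_of`, `position_of_ball` and `apply_move` are hypothetical, and "to the right" is read as "in a bin of strictly larger index".

```python
from typing import Dict


def balls_right_of(config: Dict[int, int], j: int) -> int:
    """Number of balls in bins with index strictly larger than j."""
    return sum(count for index, count in config.items() if index > j)


def position_of_ball(config: Dict[int, int], k: int) -> int:
    """Leftmost index j such that fewer than k balls lie strictly to its right,
    i.e. the bin containing the k-th ball counted from the right."""
    if k < 1 or k > sum(config.values()):
        raise ValueError("k must be between 1 and the number of balls")
    j = max(index for index, count in config.items() if count > 0)  # the front
    while balls_right_of(config, j - 1) < k:
        j -= 1
    return j


def apply_move(config: Dict[int, int], k: int) -> Dict[int, int]:
    """Add one ball immediately to the right of the k-th ball."""
    target = position_of_ball(config, k) + 1
    new_config = dict(config)
    new_config[target] = new_config.get(target, 0) + 1
    return new_config
```

With `k = 1`, `position_of_ball` returns the position of the front, in agreement with the remark above.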

For and , we set the transformation that adds one ball to the right of the -th largest ball in . We extend the notation to allow , by setting . We also introduce the shift operator . We observe that and  commute, i.e.

(2.2)

Recall that an infinite-bin model consists of the sequential application of randomly chosen transformations , called moves of type . More precisely, given a probability measure on and i.i.d. random variables with distribution , the IBM() is the Markov process on starting from , such that for any , .

We introduce a partial order on , which is compatible with the infinite-bin model dynamics: for any , we write

The functions are monotone, increasing in and decreasing in for this partial order. More precisely

(2.3)

Moreover, the shift operator dominates every function , i.e.

(2.4)

As a consequence, infinite-bin models can be coupled in an increasing fashion.

Proposition 2.1.

Let and be two probabilities on , and . If for any , we can couple the IBM and the IBM such that for any , a.s.

Proof.

As for any , , we can construct a couple such that has law , has law and a.s. Let be i.i.d. copies of , we set and . By induction, using (2.3), we immediately have for any . ∎
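The construction in the proof can be imitated numerically by driving two models with a common uniform variable. The sketch below is an illustration under its own assumptions: both distributions have finite support, the cumulative distribution function of `mu` dominates that of `nu` pointwise (so the move type sampled for `mu` is never larger than the one sampled for `nu`), and both models start from the same configuration with an infinite front bin. The names `quantile`, `advance` and `coupled_fronts` are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate
from typing import Dict, List, Tuple


def quantile(mu: Dict[int, float], u: float) -> int:
    """Smallest k whose cumulative mass strictly exceeds u (inverse CDF)."""
    keys = sorted(mu)
    cdf = list(accumulate(mu[k] for k in keys))
    return keys[min(bisect_right(cdf, u), len(keys) - 1)]


def advance(bins: List[int], k: int) -> int:
    """Apply a move of type k in place; return 1 if the front advanced."""
    seen = 0
    depth = 0
    for depth, count in enumerate(bins):
        seen += count
        if k <= seen:
            break
    # If no break occurred, the k-th ball lies in the last (infinite) bin.
    if depth == 0:
        bins.insert(0, 1)
        return 1
    bins[depth - 1] += 1
    return 0


def coupled_fronts(mu: Dict[int, float], nu: Dict[int, float],
                   steps: int, seed: int = 0) -> List[Tuple[int, int]]:
    """Front positions of the two coupled models after each step."""
    rng = random.Random(seed)
    bins_mu, bins_nu = [1], [1]
    front_mu = front_nu = 0
    history = []
    for _ in range(steps):
        u = rng.random()  # the same uniform drives both models
        front_mu += advance(bins_mu, quantile(mu, u))
        front_nu += advance(bins_nu, quantile(nu, u))
        history.append((front_mu, front_nu))
    return history
```

Since moves with smaller types push balls further to the right, the front of the model driven by `mu` stays ahead of (or level with) the front of the model driven by `nu` along the whole trajectory.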

We extended in this section the definition of the IBM() to measures with positive mass on . As applying does not modify the ball configuration, the IBM() and the IBM() are straightforwardly connected.

Lemma 2.2.

Let be a probability measure on with . We write for the measure verifying . Let be an IBM() and be an independent random walk with step distribution Bernoulli with parameter . Then the process is an IBM().

In particular, assuming Theorem 1.1 holds, we have .

3 Speed of the infinite-bin model

In this section, we prove the existence of a well-defined notion of speed of the front of an infinite-bin model. We first discuss the case when the distribution is finitely supported and the initial configuration is simple, then we extend it to any distribution and finally we generalize to any admissible initial configuration.

3.1 Infinite-bin models with finite support

Let be a probability measure on with finite support, i.e. such that there exists verifying . Let be an IBM(), we say that is an infinite-bin model with support bounded by . One of the main observations of the subsection is that such an infinite-bin model can be studied using a Markov chain on a finite set. As a consequence, we obtain an expression for the speed of this infinite-bin model.

Given , we introduce the set

For any , we write . We introduce

For any , we write , which encodes the set of balls that are close to the front. As the IBM has support bounded by , the bin to which the -st ball is added depends only on the position of the front and on the value of . This reduces the study of the dynamics of to the study of .

Lemma 3.1.

The sequence is a Markov chain on .

Proof.

For any and , we denote by

For any and , we have . Moreover, we have .

Figure 3: “Commutation” of with and .

Let be i.i.d. random variables with law and . For any , we set . Using the above observation, we have

thus is a Markov chain. ∎

For any , the set of bins that are part of represents the set of “active” bins in , i.e. the bins in which a ball can be added at some time in the future with positive probability. The number of balls in increases by one at each time step, until it reaches . At this time, when a new ball is added, the leftmost bin “freezes”: it will no longer be possible to add balls to this bin, and the “focus” is moved one step to the right.
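The reduction to a finite state space can be illustrated by checking that moves of bounded type only see the balls closest to the front. The sketch below does not reproduce the article's encoding: it assumes a hypothetical projection `window` that keeps the bin-by-bin counts of the `size` rightmost balls, stores configurations from the front leftwards, and checks on random examples that projecting after a move gives the same result as moving after projecting.

```python
import random
from typing import List, Tuple


def move(bins: List[int], k: int) -> List[int]:
    """Move of type k on a list read from the front leftwards
    (the last stored bin is treated as infinite)."""
    new_bins = list(bins)
    seen = 0
    for depth, count in enumerate(bins):
        seen += count
        if k <= seen or depth == len(bins) - 1:
            if depth == 0:
                new_bins.insert(0, 1)
            else:
                new_bins[depth - 1] += 1
            break
    return new_bins


def window(bins: List[int], size: int) -> Tuple[int, ...]:
    """Bin-by-bin counts of the `size` rightmost balls, starting at the front."""
    kept: List[int] = []
    remaining = size
    for count in bins:
        if remaining == 0:
            break
        kept.append(min(count, remaining))
        remaining -= kept[-1]
    return tuple(kept)


def check_commutation(size: int, trials: int, seed: int = 0) -> bool:
    """Compare 'move then project' with 'project then move' on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        bins = [rng.randint(1, 3) for _ in range(size + 2)]
        k = rng.randint(1, size)  # move types bounded by the window size
        full_then_project = window(move(bins, k), size)
        project_then_move = window(move(list(window(bins, size)), k), size)
        if full_then_project != project_then_move:
            return False
    return True
```

Because the projected state takes finitely many values for a fixed `size`, this commutation is what makes a finite Markov chain description plausible when the move types are bounded.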

We introduce a sequence of stopping times defined by

We also set the number of balls in the bin that “freezes” at time . For any , we write for any .

Lemma 3.2.

Let such that , then

  • for any , ,

  • for any and , .

Proof.

By induction, for any , . Consequently, for any , we have . Moreover, as

we have the second equality. ∎

Using the above result, we prove that the speed of an infinite-bin model with finite support does not depend on the initial configuration. We also obtain a formula for the speed , that can be used to compute explicit bounds.

Proposition 3.3.

Let be a probability measure with finite support and be an IBM() with initial configuration . There exists such that for any , we have

Moreover, setting for the invariant measure of we have

(3.1)

Proof.

Let , we can assume that , up to a deterministic shift. At each time , a ball is added in a bin with a positive index, thus for any , we have

Using the notation of Lemma 3.2, we rewrite it . Moreover, as and , we have

yielding a.s. As a.s., we obtain

Moreover and by ergodicity of . Consequently, if we set the constant is well-defined.

Applying Lemma 3.2, we have

Moreover, we have a.s. This yields

(3.2)

Using (2.1), this convergence is extended to a.s. ∎

Remark 3.4.

If the support of is included in , it follows from Lemma 2.2 that the IBM() also has a well-defined speed .

3.2 Extension to arbitrary distributions

We now use Proposition 3.3 to prove Theorem 1.1.

Proposition 3.5.

Let be a probability measure on and an IBM() with initial configuration . There exists such that for any , we have a.s.

Moreover, if is another probability measure we have

(3.3)

Proof.

Let . We write for an i.i.d. sequence of random variables of law . For any , we set . We then define the processes and by and

By induction, we have for any , using (2.3) and (2.4).

As is an infinite-bin model with support included in , by Remark 3.4, there exists such that for any

Moreover, by definition of and (2.2), for any we have

therefore, by the law of large numbers

By Proposition 2.1, we observe immediately that is an increasing sequence, bounded by 1, thus converges. Moreover, . We conclude that a.s. By Proposition 2.1, (3.3) trivially holds. ∎

Remark 3.6.

Let be a probability measure on , we set . We observe from the proof of Proposition 3.5 and Lemma 2.2 that

As is the speed of an IBM with support bounded by , it can be computed explicitly using (3.1). This provides tractable bounds for . For example, we have , where .

Remark 3.7.

Proposition 3.5 can be extended to infinite-bin models starting with a configuration . Let be a probability measure and an IBM() starting with a configuration . If has a support bounded by , then the projection is a Markov chain that will hit the set in finite time. Therefore, applying Proposition 3.3, we have a.s.

If has unbounded support, the IBM() can still be bounded, in the same way as in the proof of Proposition 3.5, by infinite-bin models with bounded support. As a consequence, Theorem 1.1 holds for any starting configuration belonging to .

4 A formula for the speed of the infinite-bin model

In this section, we prove that we can write as the sum of a series, provided that this series converges. A non-rigorous heuristic for the proof goes along the following lines. Let and be a probability measure such that , and be i.i.d. random variables with law . If is small enough then the sequence consists of long time intervals such that on these intervals, separated by short patterns that appear at random. Every move of type 1 makes the front of the infinite-bin model increase by 1, and each pattern induces a delay. Therefore, we expect the value of to be close to minus the sum over every possible pattern of the delay caused by this pattern to the process multiplied by its probability of occurrence.

This sum is an infinite sum and we hope that for small enough, the contributions of the long patterns will decay fast enough so that the series converges and its sum is equal to . It appears that in fact, this series often converges, even when is not close to 1, and whenever it converges its sum is equal to .

We recall some notation from the introduction. We denote by the set of finite words on the alphabet . For any , we define to be the length of .

Let be a probability distribution on and i.i.d. random variables with law . We write

for the weight of the word .

If is a non-empty word, we denote by (respectively ) the word (resp. ) obtained by erasing the last (resp. first) letter of . We use the convention .

Given any , we define the function by

where is the set of non-empty words such that, starting from and applying successively the moves , the last move results in placing a ball in a previously empty bin.

For and , we denote by the configuration of the infinite-bin model obtained after applying successively moves of type to the initial configuration , i.e.

and we set the displacement of the front of the infinite-bin model after performing the sequence of moves in . Using this definition, we obtain an alternative expression for .

Lemma 4.1.

For any , we have

(4.1)

Proof.

Observe that equals (resp. ) if the last move of adds a ball in a previously non-empty (resp. empty) bin. Therefore we have . Similarly, . We conclude that

As a direct consequence of Lemma 4.1, for any we have

(4.2)

i.e., the displacement induced by is the sum of for any consecutive subword of (where the subwords are counted with multiplicity).
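The decomposition of the displacement over consecutive subwords can be checked numerically. The sketch below is an illustration under stated assumptions: the function attached to a word is computed from displacements as in the proof of Lemma 4.1 (displacement of the word, minus those of the word without its last letter and without its first letter, plus that of the word without both), the empty word has displacement zero, and a configuration is stored from the front leftwards with the last stored bin treated as infinite. The names `displacement`, `epsilon` and `sum_over_factors` are hypothetical.

```python
import random
from typing import List, Sequence


def displacement(bins: Sequence[int], word: Sequence[int]) -> int:
    """Number of steps by which the front advances when the moves in `word`
    are applied in order."""
    state = list(bins)  # state[0] = front bin, last entry = infinite bin
    moved = 0
    for k in word:
        seen = 0
        depth = 0
        for depth, count in enumerate(state):
            seen += count
            if k <= seen:
                break
        if depth == 0:
            state.insert(0, 1)
            moved += 1
        else:
            state[depth - 1] += 1
    return moved


def epsilon(bins: Sequence[int], word: Sequence[int]) -> int:
    """Second-order difference of the displacement, as in Lemma 4.1."""
    if not word:
        return 0
    return (displacement(bins, word) - displacement(bins, word[:-1])
            - displacement(bins, word[1:]) + displacement(bins, word[1:-1]))


def sum_over_factors(bins: Sequence[int], word: Sequence[int]) -> int:
    """Sum of epsilon over all consecutive subwords, counted with multiplicity."""
    n = len(word)
    return sum(epsilon(bins, word[i:j]) for i in range(n) for j in range(i + 1, n + 1))
```

For any word, `sum_over_factors(bins, word)` equals `displacement(bins, word)`: the double sum telescopes, which is the content of the identity above.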

Remark 4.2.

One could also go the other way round, start with and define to be the function verifying

where denotes the fact that is a factor of (i.e. a consecutive subword of ) and denotes the number of times appears as a factor of . In that case, one would obtain formula (4.1) for as the result of a Möbius inversion formula (see [23, Sections 3.6 and 3.7] for details on incidence algebras and Möbius inversion formulas).

Using this notation and these results, we prove the following lemma.

Lemma 4.3.

For any probability measure and , we have

This lemma straightforwardly implies Theorem 1.2 by the Stolz-Cesàro theorem.

Proof.

Let be an IBM() starting from the configuration . We have, by definition of , . Moreover, by Theorem 1.1 and dominated convergence,

We easily compute using (4.2), we obtain

which concludes the proof. ∎

In Section 6, we study in more detail the function . In particular, we give sufficient conditions on to have , which allows us to prove that in some cases, the series is absolutely convergent.

5 Length of the longest path in Barak-Erdős graphs

In the rest of the article, we use the results obtained in the previous sections to study the asymptotic behaviour of the length of the longest path in a Barak-Erdős graph. For an edge probability $p$, we write $\mu_p$ for the geometric distribution on $\mathbb{N}$ with parameter $p$, verifying $\mu_p(k) = p(1-p)^{k-1}$ for any $k \geq 1$. In this section, we present a coupling introduced by Foss and Konstantopoulos [11] between an IBM($\mu_p$) and a Barak-Erdős graph of size $n$, used to compute the asymptotic behaviour of the length of the longest path in this graph.

Recall that a Barak-Erdős graph on the vertices $\{1, \dots, n\}$, with edge probability $p$, is constructed by adding an edge from $i$ to $j$ with probability $p$, independently for each pair $1 \leq i < j \leq n$. We write $L_n(p)$ for the length of the longest path in this graph. Newman [21] proved that