Random triangle removal

Tom Bohman
Department of Mathematical Sciences
Carnegie Mellon University
Pittsburgh, PA 15213, USA.

Alan Frieze
Department of Mathematical Sciences
Carnegie Mellon University
Pittsburgh, PA 15213, USA.

Eyal Lubetzky
Theory Group of Microsoft Research
One Microsoft Way
Redmond, WA 98052, USA.

Starting from a complete graph on $n$ vertices, repeatedly delete the edges of a uniformly chosen triangle. This stochastic process terminates once it arrives at a triangle-free graph, and the fundamental question is to estimate the final number of edges (equivalently, the time it takes the process to finish, or how many edge-disjoint triangles are packed via the random greedy algorithm). Bollobás and Erdős (1990) conjectured that the expected final number of edges has order $n^{3/2}$, motivated by the study of the Ramsey number $R(3,t)$. An upper bound of $o(n^2)$ was shown by Spencer (1995) and independently by Rödl and Thoma (1996). Several bounds were given for variants and generalizations (e.g., Alon, Kim and Spencer (1997) and Wormald (1999)), while the best known upper bound for the original question of Bollobás and Erdős was $n^{7/4+o(1)}$ due to Grable (1997). No nontrivial lower bound was available.

Here we prove that with high probability the final number of edges in random triangle removal is equal to $n^{3/2+o(1)}$, thus confirming the exponent conjectured by Bollobás and Erdős and matching the predictions of Spencer et al. For the upper bound, for any fixed $\epsilon>0$ we construct a family of graphs by gluing triangles sequentially in a prescribed manner, and dynamically track all homomorphisms from them, rooted at any two vertices, up to the point where $n^{3/2+\epsilon}$ edges remain. A system of martingales establishes concentration for these random variables around their analogous means in a random graph with corresponding edge density, and a key role is played by the self-correcting nature of the process. The lower bound builds on the estimates at that very point to show that the process will typically terminate with at least $n^{3/2-o(1)}$ edges left.

T. B. is supported in part by NSF grant DMS-1001638. A. F. is supported in part by NSF grant DMS-0721878.

1. Introduction

Consider the following well-known stochastic process for generating a triangle-free graph, and at the same time creating a partial Steiner triple system. Start from a complete graph on $n$ vertices and proceed to repeatedly remove the edges of uniformly chosen triangles. That is, letting $G(0)$ denote the initial graph, $G(i+1)$ is obtained from $G(i)$ by selecting a triangle uniformly at random out of all triangles in $G(i)$ and deleting its 3 edges. The process terminates once no triangles remain, and the fundamental question is to estimate the stopping time

$$\tau_0 = \min\{ i : G(i) \text{ is triangle-free} \}.$$

This is equivalent to estimating the number of edges in the final triangle-free graph, since $G(i)$ has precisely $\binom{n}{2} - 3i$ edges by definition. As the triangles removed are mutually edge-disjoint, this process is precisely the random greedy algorithm for triangle packing.
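The process is simple to simulate for small instances. The following is a minimal sketch, not an optimized implementation: the function names are ours, and the triangle list is re-filtered naively after every removal, which is only practical for small vertex counts.

```python
import itertools
import random


def random_triangle_removal(n, seed=None):
    """Run the removal process on K_n; return (steps, remaining edge set)."""
    rng = random.Random(seed)
    edges = {frozenset(e) for e in itertools.combinations(range(n), 2)}
    # All triangles of K_n, each stored as a sorted vertex triple.
    triangles = list(itertools.combinations(range(n), 3))
    steps = 0
    while triangles:
        a, b, c = rng.choice(triangles)  # uniform over the current triangles
        edges -= {frozenset((a, b)), frozenset((a, c)), frozenset((b, c))}
        # Keep only the triangles whose three edges all survive.
        triangles = [
            t for t in triangles
            if all(frozenset(e) in edges for e in itertools.combinations(t, 2))
        ]
        steps += 1
    return steps, edges


def is_triangle_free(n, edges):
    """Brute-force check that no vertex triple spans three present edges."""
    return not any(
        all(frozenset(e) in edges for e in itertools.combinations(t, 2))
        for t in itertools.combinations(range(n), 3)
    )
```

For example, `random_triangle_removal(12, seed=1)` returns the number of steps taken together with the final edge set; the number of remaining edges always equals the initial edge count minus three times the number of steps.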

Bollobás and Erdős (1990) conjectured that the expected number of edges in the final graph has order $n^{3/2}$ (see, e.g., [Bol1, Bol2]), with the motivation of determining the Ramsey number $R(3,t)$. Behind this conjecture was the intuition that the graph should be similar to an Erdős-Rényi random graph with the same edge density. Indeed, in the latter random graph with $n$ vertices and $\epsilon n^{3/2}$ edges there are typically about $\frac{4}{3}\epsilon^3 n^{3/2}$ triangles, thus, for small $\epsilon$, deleting all of its triangles one by one would still retain all but a negligible fraction of the edges.
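The heuristic can be made explicit with a back-of-the-envelope computation. The sketch below assumes a random graph on $n$ vertices with $m = \epsilon n^{3/2}$ edges and treats edges as independent with probability $p$; the symbols $m$, $p$ and $\epsilon$ are our notation for this illustration.

```latex
\[
  p = \frac{m}{\binom{n}{2}} \approx \frac{2\epsilon}{\sqrt{n}},
  \qquad
  \mathbb{E}[\#\text{triangles}] \approx \binom{n}{3} p^{3}
  \approx \frac{n^{3}}{6}\cdot\frac{8\epsilon^{3}}{n^{3/2}}
  = \frac{4}{3}\,\epsilon^{3} n^{3/2}.
\]
\[
  \frac{\#\text{edges lost}}{\#\text{edges}}
  \le \frac{3\cdot\frac{4}{3}\,\epsilon^{3} n^{3/2}}{\epsilon\, n^{3/2}}
  = 4\epsilon^{2}.
\]
```

Even if every triangle cost three distinct edges, the fraction of edges lost would be at most about $4\epsilon^2$, which is negligible for small $\epsilon$.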

It was shown by Spencer [Spencer] in 1995, and independently by Rödl and Thoma [RT] in 1996, that the final number of edges is $o(n^2)$ with high probability (w.h.p.). In 1997, Grable [Grable] improved this to an upper bound of $n^{11/6+o(1)}$ w.h.p., and further described how similar arguments, using some more delicate calculations, should extend that result to $n^{7/4+o(1)}$. This remained the best upper bound prior to this work. No nontrivial lower bound was available. (See [GKPS] for numerical simulations firmly supporting an answer of $n^{3/2+o(1)}$ to this problem.)

Of the various works studying generalizations and variants of the problem, we mention two here. In his paper from 1999, Wormald [Wormald] demonstrated how the differential equation method can yield a nontrivial upper bound on greedy packing of hyperedges in $k$-uniform hypergraphs. For the special case $k=3$, corresponding to triangle packing, this translated to a bound of . Also in the context of hypergraphs, Alon, Kim and Spencer [AKS] introduced in 1997 a semi-random variant of the aforementioned process (akin to the Rödl nibble [Rodl] yet with some key differences) which, they showed, finds nearly perfect matchings. Specialized to our setting, that process would result in a collection of edge-disjoint triangles on $n$ vertices that covers all but of the edges of the complete graph. Alon et al. [AKS] then conjectured that the simple random greedy algorithm should match those results, and in particular (generalizing the Bollobás-Erdős conjecture) that applying it to find a maximal collection of -tuples with pairwise intersections at most would leave out an expected number of at most uncovered -tuples. They added that “at the moment we cannot prove that this is the case even for ”, the focus of our work here.

Joel Spencer offered $200 for a proof that the answer to the problem is $n^{3/2+o(1)}$ w.h.p. ([Grable, Wormald]). The main result in this work establishes this precise statement, thus confirming the exponent conjectured by Bollobás and Erdős (1990).

Theorem 1.

Let $\tau_0$ be the number of steps it takes the random triangle removal process to terminate starting from a complete graph on $n$ vertices, and let $E(\tau_0)$ be the edge set of the final triangle-free graph. Then with high probability $\tau_0 = n^2/6 - n^{3/2+o(1)}$, or equivalently, $|E(\tau_0)| = n^{3/2+o(1)}$.

We prove Theorem 1 by showing that w.h.p. all variables in a collection of random variables, carefully designed to support the analysis, stay close to their respective trajectories throughout the evolution of this stochastic hypergraph process (see §1.1 for details). Establishing concentration for this large collection of variables hinges on their self-correcting nature: the further a variable deviates from its trajectory, the stronger its drift is back toward its mean. However, turning this into a rigorous proof is quite challenging, given that the various variables interact and errors (deviations from the mean) propagating from other variables may interfere with the attempt of one variable to correct itself. We construct a system of martingales to guarantee that the drift of a variable towards its mean will dominate the errors in our estimates for its peers.

The tools developed here for proving such self-correcting estimates are generic and we believe they will find applications in other settings. In particular, these methods should support an analysis of the random greedy hypergraph matching process as well as that of the asymptotic final number of edges in the graph produced by the triangle-free process (see §1.2). Progress on the latter process would likely yield improvement on the best known lower bound on the Ramsey number $R(3,t)$.

1.1. Methods

Our starting point for the upper bound is a system of martingales for dynamically tracking an ensemble of random variables consisting of the triangle count and all co-degrees in the graph. The self-correcting behavior of these variables is roughly seen as follows: should the co-degree of a given pair of vertices deviate above/below its average, then more/fewer than average triangles could shrink the co-neighborhood if selected for removal in the next round, and this compensation effect would eventually drive the variable back towards its mean. One can exploit this effect to maintain the concentration of all these variables around their analogous means in a corresponding Erdős-Rényi random graph, as long as the number of edges remains above $n^{7/4+o(1)}$. (In the short note [BFL] the authors applied this argument to match this upper bound due to Grable.)

It is no coincidence that various methods break precisely at the exponent 7/4 as it corresponds to the inherent barrier where co-degrees become comparable to the variations in their values that arose earlier in the process. In order to carry out the analysis beyond this barrier, one can for instance enrich the ensemble of tracked variables to address the second level neighborhoods of pairs of vertices. This way one can avoid cumulative worst-case individual errors due to large summations of co-degree variables, en route to co-degree estimates which improve as the process evolves. Indeed, these ideas can push this framework to successfully track the process to the point where $n^{5/3+o(1)}$ edges remain. However, beyond that point (corresponding to an edge density of about $n^{-1/3}$) the size of common neighborhoods of triples would become negligible, foiling the analysis. This example shows the benefit of maintaining control over a large family of subgraphs, yet at the same time it demonstrates how for any family of bounded size subgraphs the framework will eventually collapse.

Figure 1. A subset of the ensemble of about triangular ladders that are tracked for any pair of roots to sustain the analysis of the upper bound to the point of edges for .

In order to prove the upper bound we track an arbitrarily large collection of random variables. A natural prerequisite for understanding the evolution of the number of homomorphisms from any small subgraph to rooted at a given pair of vertices is to have ample control over homomorphisms from subgraphs obtained by “gluing a triangle” to an edge to (as the number of those dictates the probability of losing copies of in the next round on account of losing the mentioned edge). Tracking these would involve additional larger subgraphs etc., and to end this cascade and obtain a system of variables which can sustain itself we do the following. To estimate the probability of losing the “last edge” in the largest tracked , in lieu of counting homomorphisms from a “forward extension” (the result of gluing a new triangle onto the given edge) rooted at , we fix our candidate for the edge in that would play the role of this last edge and then count homomorphisms rooted at and featuring this specific edge. In other words, we replace the requirement for control over “forward extensions” with control over “backward extensions” from any given edge in . Obviously, rooting the subgraphs at as many as 4 vertices can drastically decrease the expected number of homomorphisms, say to , while our analysis must involve error probabilities that tend to $0$ at a super-polynomial rate to account for our polynomial number of variables.

These points lead to a careful definition of the ensemble of graphs we wish to track. We construct what we refer to as triangular ladders which, roughly put, are formed by repeatedly taking an existing element in the ensemble and gluing a triangle on one of its last added two edges. We do so until reaching a certain threshold on the number of vertices, namely for some fixed , amounting to a family of graphs (see Figure 1). Special care needs to be taken so as to avoid certain substructures that would foil our analysis, yet ignoring these for the moment (postponing precise definitions to §4), note that each new step of adding a triangle (a new vertex and two new edges) also increases the density of the ladder under consideration, as long as the edge density is at least . When examining a given ladder in this ensemble, its forward extensions will also be in the ensemble, by construction, unless we exceed the size threshold. In the latter case, if the threshold is taken to be suitably large, we can safely appeal to a backward extension argument and wind up with polynomially many homomorphisms to .

The main challenge in this approach is to come up with a canonical definition of the ensemble such that a uniform analysis could be applied to its arbitrarily many elements. In particular, since as usual the errors in one variable propagate to others, we end up with a system of linear constraints on variables. Fortunately, suitable canonical definitions of the main term in each variable as a function of its predecessor in the ensemble make this constraint system one without cyclic dependencies, allowing for a simple solution to it. Altogether, the triangular ladder ensemble can be maintained as long as there are edges, and while so it sustains the analysis of the number of triangles.

The lower bound builds on the fact that our analysis of the upper bound yielded asymptotic estimates for all co-degrees up to the point of edges, for arbitrarily small . Already a weaker bound on the co-degrees, when combined with an analysis of the evolution of the number of triangles, suffices to show that either the final number of edges has order at least or at some point there are edges and edge-disjoint triangles. These give rise to a subset of edge-disjoint triangles that existed earlier in the process, such that each one could guarantee an edge to survive in the final graph with probability , and these events are mutually independent. Altogether this leads to at least edges in the final graph w.h.p.

1.2. Comparison with the triangle-free process

A different recipe for obtaining a random triangle-free graph is the so-called “triangle-free process”. In that process, the edges arrive one by one according to a uniformly chosen permutation, and each one is added if and only if no triangles are formed with it in the current graph. It was shown in [ESW] that the final number of edges in this process is w.h.p. , and the correct order of , along with its ramifications on the Ramsey number , was recovered in [Bohman] (see also [BK] for generalizations).
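For comparison, the triangle-free process described above can be sketched in a few lines. This is an illustrative simulation under the stated rule (edges arrive in a uniformly random order and are accepted only if they close no triangle); the function name and data layout are ours.

```python
import itertools
import random


def triangle_free_process(n, seed=None):
    """Return (accepted edges, adjacency sets) for one run on n vertices."""
    rng = random.Random(seed)
    candidates = list(itertools.combinations(range(n), 2))
    rng.shuffle(candidates)  # uniform arrival order of all possible edges
    adj = {v: set() for v in range(n)}
    accepted = []
    for u, v in candidates:
        # Adding uv closes a triangle iff u and v share a neighbour.
        if adj[u].isdisjoint(adj[v]):
            adj[u].add(v)
            adj[v].add(u)
            accepted.append((u, v))
    return accepted, adj
```

The output is a maximal triangle-free graph: every rejected pair had a common neighbour when it arrived, and since edges are never deleted it still has one at the end.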

Similarly to that process, the triangle removal process studied here can be presented as taking a uniform ordering of the triangles, then sequentially removing the edges of a triangle if and only if all 3 edges presently belong to the graph. The main result in this work shows that this process too culminates in a triangle-free graph with $n^{3/2+o(1)}$ edges.

Despite the high level similarity between the two protocols, the triangle removal process has proven quite challenging already at the level of acquiring the correct exponent of the final number of edges. One easily seen difference is that, in the triangle-free process, at any given point there is a set of forbidden edges (whose addition would form at least one triangle), yet regardless of its structure the next edge to be added is uniformly distributed over the legal ones. As long as the forbidden set of edges is negligible compared to the remaining legal edges, this process therefore mostly adds uniform edges just as in the Erdős-Rényi random graph. In particular, this is the case as long as edges were added (even based on all the edges that arrived rather than those selected). A coupling to the Erdős-Rényi random graph up to that point immediately gives a lower bound of order on the expected number of edges at the end of the triangle-free process. One could hope for an analogous soft argument to hold for the triangle removal process, where it would translate to an upper bound (as it deletes rather than adds edges).

However, in the triangle removal process studied here, deleting almost all edges save for forms a substantial challenge for the analysis. Already when the edge density is a small constant, still bounded away from , the set of forbidden triangles becomes much larger than the set of legal ones. Thus, in an attempt to analyze the process even up to a density of for some , one is forced to control the geometry of the remaining triangles quite delicately from its very beginning.

2. Upper bound on the final number of edges modulo co-degree estimates

Let be the filtration given by the process. Let denote the neighborhood of a vertex , let be the co-neighborhood of vertices and let denote their co-degree. Our goal is to estimate the number of triangles in which we denote by . The motivation behind tracking co-degrees is immediate from the fact that , yet another important element is their effect on the evolution of . By definition,


where here and in what follows is the one-step change in the variable .
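The relation between co-degrees and the triangle count can be checked directly on small graphs. The sketch below uses our own notation (`Q` for the number of triangles, `Y(u,v)` for the co-degree), which is an assumption since the displayed formulas are not legible here. It verifies two elementary facts: three times the triangle count equals the sum of co-degrees over edges, and removing a triangle `uvw` destroys exactly `Y(u,v) + Y(u,w) + Y(v,w) - 2` triangles. Averaging the latter over a uniformly chosen triangle gives an expected one-step change of `2 - (sum of Y^2 over edges) / Q`.

```python
import itertools
from fractions import Fraction


def complete_graph(n):
    return {v: set(range(n)) - {v} for v in range(n)}


def co_degree(adj, u, v):
    """Number of common neighbours of u and v."""
    return len(adj[u] & adj[v])


def triangles(adj):
    """All triangles, as sorted vertex triples."""
    return [
        (a, b, c)
        for a, b, c in itertools.combinations(sorted(adj), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    ]


def remove_triangle(adj, tri):
    """Return a copy of the graph with the three edges of tri deleted."""
    new = {v: set(nb) for v, nb in adj.items()}
    for x, y in itertools.combinations(tri, 2):
        new[x].discard(y)
        new[y].discard(x)
    return new


def expected_triangle_change(adj):
    """Exact expected change in the triangle count after one uniform removal.

    Assumes the graph contains at least one triangle.
    """
    tris = triangles(adj)
    lost = sum(
        co_degree(adj, u, v) + co_degree(adj, u, w) + co_degree(adj, v, w) - 2
        for u, v, w in tris
    )
    return Fraction(-lost, len(tris))
```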

It is convenient to re-scale the number of steps and introduce a notion of edge density as follows:


With this notation we have


and we see that, up to the negligible linear term, the number of edges corresponding to the edge density is in line with that of the Erdős-Rényi random graph . Under the assumption that indeed resembles we expect that the number of triangles and co-degrees would satisfy

Turning to the evolution of the co-degrees, we see that


and so tight estimates on the co-degrees and can already support the analysis of the process. In the short note [BFL] the authors relied on these variables alone to establish a bound of on the number of edges that survive to the conclusion of the algorithm. However, once the edge density drops to we arrive at . As the variables have order and standard deviations of order while is constant, we see that at the point the deviations that developed early in the process cease to be regarded as negligible and a refined analysis is required.
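To see where the exponent 7/4 comes from, it helps to write out one rescaling that is consistent with the discussion above. The displays are not legible in this copy, so the following is a sketch under assumptions: $i$ counts removal steps, $E(i)$ is the edge set after $i$ steps, $Q$ is the triangle count, $Y_{u,v}$ is a co-degree, and the time and density parameters $t$ and $p$ are chosen so that the edge count matches that of a random graph of density $p$ up to a linear term.

```latex
\[
  t = \frac{i}{n^{2}}, \qquad p = p(t) = 1 - 6t,
  \qquad
  |E(i)| = \binom{n}{2} - 3i = \frac{n^{2}-n}{2} - 3tn^{2} = \frac{n^{2}p - n}{2}.
\]
\[
  Q \approx \binom{n}{3} p^{3} \approx \frac{n^{3}p^{3}}{6},
  \qquad
  Y_{u,v} \approx n p^{2}.
\]
\[
  p = n^{-1/4}
  \;\Longrightarrow\;
  |E(i)| \approx \tfrac{1}{2}\, n^{7/4},
  \qquad
  Y_{u,v} \approx n^{1/2}.
\]
```

Early in the process, when $p$ is constant, a co-degree of order $n$ naturally fluctuates by order $\sqrt{n}$. At density $n^{-1/4}$ the co-degrees themselves have shrunk to order $\sqrt{n}$, so those early fluctuations are no longer negligible.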

One can achieve better precision for the co-degrees by introducing additional variables aimed at decreasing the variation for as the process evolves. For instance, through estimates on the number of edges between the common and exclusive neighborhoods for every pair of vertices, along with some additional ingredients exploiting the self-correction phenomenon, one can gain an improvement on the exponent. However, eventually these arguments break down, soon after some bounded-size subgraphs become too rare, thus foiling the concentration estimates.

We remedy this via the triangular ladder ensemble, defined in §4 and analyzed in §5. To sustain the analysis up to the point where edges remain we must simultaneously control types of random variables, yet our ultimate goal in those sections is simply to estimate all co-degrees . The following corollary is a special case of Theorem 5.4.

Theorem 2.1.

Define to be


and for some (arbitrarily large) absolute constant let

Then for every w.h.p.

for every and all such that and .

The heart of this work is in establishing the above theorem (Sections 3–5), which is complemented by the following result: A multiplicative error of for all co-degree variables can be enhanced to bounds tight up to a factor of on the number of triangles .

Theorem 2.2.

Set and for some fixed let


Then w.h.p. as long as and we have



We define below a narrow critical interval that has its upper endpoint at the bound we aim to establish for . As long as lies in this interval it is subject to a self-correcting drift, which turns out to be just enough to show that . That is, as long as the sequence is a supermartingale. Standard concentration estimates then imply that is unlikely to ever cross this critical interval, leading to the desired estimate on .


and observe that for all we have , whence as long as (2.7) is valid we have . This clearly holds at the beginning of the process, and in what follows we may condition on this event as part of showing that the bound (2.7) will be maintained except with an error probability that tends to 0 at a super-polynomial rate.

We begin with estimates on the one-step expected changes of our variables. Recall that due to (2.1) we have

To bound we thus need an estimate on , achieved by the next simple lemma.
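The kind of two-sided estimate needed here can be sanity-checked numerically. The sketch below assumes a hypothesis of the following form, which may differ from the exact one in the lemma: every value lies within `delta` of the mean. Under that assumption the sum of squares is at least `(sum)^2 / m` by Cauchy-Schwarz and at most `(sum)^2 / m + m * delta^2`, since the sum of squares equals `m * mean^2` plus the sum of squared deviations from the mean. The function name is ours, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction


def sum_of_squares_bounds(values, delta):
    """Return (lower bound, actual sum of squares, upper bound).

    values: a list of integers; delta: an integer or Fraction.
    Raises ValueError if some value is farther than delta from the mean.
    """
    m = len(values)
    total = sum(values)
    mean = Fraction(total, m)
    if any(abs(a - mean) > delta for a in values):
        raise ValueError("some value is farther than delta from the mean")
    lower = Fraction(total * total, m)  # Cauchy-Schwarz
    upper = lower + m * Fraction(delta) ** 2
    return lower, sum(a * a for a in values), upper
```

The upper bound is attained when half of the values sit at `mean + delta` and half at `mean - delta`, for instance `[9, 13, 9, 13]` with `delta = 2`.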

Lemma 2.3.

Let and suppose that for all and some . Then


The lower bound is due to Cauchy-Schwarz. For the upper bound fix and note that the assumption on the ’s implies that for all . Observe that the convex function achieves its maximum over the convex set at an extremal point where for all . Hence, for indices, for indices and if is odd there is a single . In particular

Since and by Eq. (2.6), an application of this lemma together with the fact that gives that

together with the following upper bound:

(In the last equality we absorbed the -term into the error-term factor of the last expression since .) Adding this to our estimate for (where again the additive 2 may be absorbed in our error term) while observing that yields

again incorporating the term into the error. Thus, by the definition of

Now assume that is the first round where raises above , i.e., it enters the interval

Further let

Since and the error-term can easily be increased to as (with much room to spare), the upper bound on gives

It follows that is indeed a supermartingale for large .

Next consider the one-step variation of . Denoting the selected triangle in a given round by , the change in following this round is at most and in light of our co-degree estimate (2.6) this expression deviates from its expected value of by at most . In particular, and letting this ensures that

where the last inequality holds for large . With at most steps remaining until the process terminates, Hoeffding’s inequality establishes that for some fixed ,

Altogether, w.h.p.  for all , as required. ∎

Observe that for all the function defined in Theorem 2.2 satisfies , thus the estimates (2.6)–(2.7) precisely feature a relative error of order in line with the definition of in Theorem 2.1. Specifically, the conclusion of Theorem 2.1 provides the estimate for , thus fulfilling the hypothesis of Theorem 2.2 en route to a bound of for . Altogether, we can apply both theorems in tandem as long as . In particular, at that point (2.7) guarantees that the process is still active with triangles, yet there are merely edges left. This establishes the upper bound in our main result modulo Theorem 2.1.

3. From degrees to extension variables

Our goal in this section is to show that strong control on degrees, co-degrees and the number of triangles implies some level of control on arbitrary subgraph counts. Let denote the degree of the vertex and as before let and denote the co-degree of and total number of triangles respectively. Our precise assumption will be that for some absolute constant

where .

The function begins the process at and then gradually increases with until reaching a value of for some (arbitrarily small) at the final stage of our analysis. However, in order to emphasize that the value of this plays no role in this section we will only make use of the fact that is such that




we will show that as long as we can w.h.p. asymptotically estimate the number of copies of any balanced (a precise definition follows) labeled graph with vertices rooted at a prescribed subset of vertices. These estimates will feature a multiplicative error-term of , where will only depend on the uniform bound on the sizes of the graphs under consideration.

Definition 3.1.

[extensions and subextensions] An extension graph is defined to be a graph paired with a subset of distinguished vertices which form an independent set in . The scaling of an extension graph is defined to be

where , and . The density of is defined to be

A subextension of , denoted , is an extension graph with , , and the same distinguished set (thus ). We say that is a proper subextension, denoted by , whenever . We will denote the trivial subextension (the edgeless graph on ) by . For any subextension , define the quotient to be the extension graph with the distinguished vertex set . Observe that

Let be an extension graph associated with the distinguished subset as in Definition 3.1 and set , and . Let be an injection. We are interested in tracking the number of copies of in with playing the role of the distinguished vertices. That is, we are interested in counting the extensions from to copies of . Formally, an injective map extends to a copy of if is a graph homomorphism that agrees with , that is, and implies that . Define

Note that counts labeled copies of rooted at the given vertex subset .
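The counting variable just described can be computed by brute force on small graphs. The sketch below follows the verbal definition: it counts injective maps from the vertices of the extension graph into the host graph that agree with the given injection on the distinguished vertices and send every edge to an edge. The function name and argument layout are ours, and the enumeration is exponential in the number of non-distinguished vertices.

```python
import itertools


def count_extensions(h_vertices, h_edges, phi, g_adj):
    """Count injective edge-preserving maps of H into G that extend phi.

    h_vertices: vertices of the extension graph H
    h_edges:    edges of H as pairs of its vertices
    phi:        dict sending each distinguished vertex of H to a distinct vertex of G
    g_adj:      adjacency sets of the host graph G
    """
    free = [x for x in h_vertices if x not in phi]
    used = set(phi.values())
    available = [v for v in g_adj if v not in used]
    count = 0
    # Permutations of unused host vertices keep the full map injective.
    for image in itertools.permutations(available, len(free)):
        psi = dict(phi)
        psi.update(zip(free, image))
        if all(psi[y] in g_adj[psi[x]] for x, y in h_edges):
            count += 1
    return count
```

When the extension graph is a path of length two whose endpoints are the distinguished vertices, the count is exactly the co-degree of the two root vertices in the host graph.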

We will track the variable for extension graphs that are dense relative to their subextensions. More precisely, we say that is balanced if its density satisfies for any subextension of . Furthermore, is strictly balanced if its density is strictly larger than the densities of all of its subextensions. Our aim is to control the variable for any balanced extension graph up to the point where the scaling of becomes constant, formulated as follows. Let


where the times being considered here are, as usual, of the form for integer . Recalling that observe that , thus


and in particular any balanced has for any subextension of (a strict inequality holds when is strictly balanced). Thus, the threshold up to which we can track will indeed be dictated by rather than by for some .
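The balance condition can be tested mechanically on small extension graphs. The sketch below assumes that the density of an extension graph is its number of edges divided by its number of non-distinguished vertices; this formula is not legible in this copy, so it is an assumption, as are the function names. Since deleting edges can only lower this ratio, it suffices to compare against subextensions induced on vertex subsets that contain the distinguished set.

```python
import itertools
from fractions import Fraction


def induced_density(vertex_subset, edges, roots):
    """Edges inside vertex_subset divided by its number of non-root vertices."""
    free = len(vertex_subset) - len(roots)
    inside = sum(1 for x, y in edges if x in vertex_subset and y in vertex_subset)
    return Fraction(inside, free)


def is_balanced(vertices, edges, roots, strict=False):
    """Check the (strict) balance condition under the assumed density formula."""
    roots = set(roots)
    others = [x for x in vertices if x not in roots]
    if not others:
        raise ValueError("need at least one non-distinguished vertex")
    full = induced_density(set(vertices), edges, roots)
    # Proper, non-trivial vertex subsets containing all roots.
    for k in range(1, len(others)):
        for extra in itertools.combinations(others, k):
            d = induced_density(roots | set(extra), edges, roots)
            if d > full or (strict and d == full):
                return False
    return True
```

For instance, with roots `u, v`, the two-triangle configuration with edges `ux, vx, xy, vy` is balanced but not strictly balanced under this definition, whereas attaching `y` to `x` alone (edges `ux, vx, xy`) is not balanced.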

Finer precision in tracking the variables will be achievable up to a point slightly earlier than such that is still reasonably large. To this end it will be useful to generalize the quantity as follows: With (2.5) in mind, for any let


(again selecting among all time-points of the form for integer values of ). We will establish these more precise bounds on up to for some suitably small , i.e., up to a point where the scaling is still super-logarithmic in . (Recall that is decreasing in and yet tends to faster than .)

When examining the times and one ought to keep in mind that, for any extension graph on vertices, the value of decreases within a single step of the process by a multiplicative factor of . In particular, in that case at time .

The main result of this section is the following estimate on the variables , where the proportional error will be in terms of , the varying proportional error we have for the variables and .

Theorem 3.2.

For every there is some so that w.h.p. for all and every extension graph on vertices and injection we have:

  1. If is balanced then for all .

  2. If is strictly balanced then with holds for all .

Before proving Theorem 3.2 we will derive the following corollary which will be applicable to extension graphs that are not necessarily balanced.

Corollary 3.3.

Fix and let . Then w.h.p. for every extension graph on vertices with we have:

  1. If for all subextensions then .

  2. If for all subextensions then .


Assume w.h.p. that the estimates in Theorem 3.2 hold simultaneously for all extension graphs on at most vertices throughout time . We may further assume that every vertex has . Indeed, for the statement in Part (1) this is w.l.o.g. since adding any isolated vertex to would contribute a factor of to , balanced by a factor of to . For Part (2) this assumption is implied by the hypothesis , as we can take some subextension with to obtain that .

Define a sequence of strictly increasing sets as follows. Associate each candidate for for , one that contains , with the extension graph where

i.e., the set of edges of the induced subgraph on that do not have both endpoints in , and set the distinguished vertex set of to be (guaranteed to be an independent set by our definition of ). With this definition in mind, let be the subset of vertices that maximizes . If more than one such subset exists, let be one of these which has minimal cardinality.

To see that the sequence of ’s is strictly increasing, let and consider the potential values for building upon some . Taking would yield , strictly smaller than the density that would correspond to (otherwise would consist of isolated vertices in , contradicting our minimal degree assumption). Hence, indeed and overall is strictly balanced by construction. (We note in passing that and that is only a subextension of for since for .)

Motivating these definitions is the fact that for a given sequence as above one can recover (counted by ) iteratively from a sequence of (in which and for all ), hence


(Here and throughout this section, with a slight abuse of notation we use the notation to denote an injective map counted by the corresponding variable .)

To prove Part (1) of the corollary, observe that and so by hypothesis we have , or equivalently, . For any , the fact that implies that is as large as , where is the extension graph with vertex set , distinguished vertices and all edges of between vertices of excluding those whose endpoints both lie in . Since we get

which readily implies that

Equivalently (via (3.4)), and by induction we thus have for all . Recalling that is strictly balanced, we can appeal to Theorem 3.2 and derive that for any injection

where . It is easily seen that by definition and so revisiting (3.6) yields that under the hypothesis of Part (1) we have

The proof of Part (1) is completed by the fact that for all (since the ’s are strictly increasing) and in particular


It remains to prove Part (2). Observe that the hypothesis in this part implies that (by taking ), that is we are considering some time point .

To utilize the assumption on the scaling of quotients of , we go back to the definition of the sequence of ’s and argue that


Indeed, this follows from the fact that selecting would exactly correspond to having , and yet is chosen so as to maximize the value of .

While itself may not be a subextension of (one has iff ), one can modify its distinguished vertex set to remedy this fact. Namely, the extension graph with the distinguished vertex set is a subextension of , and since we get that

where the last inequality follows from our hypothesis on all the quotients of .

From the inequality established above we infer that . At the same time, the combination of (3.4) and (3.8) implies that , and altogether . Now, since