# Faster Spectral Sparsification in Dynamic Streams

*The algorithmic part of this paper has recently been improved by a subset of the authors [KNST19].*

Michael Kapralov
EPFL
Aida Mousavifar
EPFL
Cameron Musco
Microsoft Research
Christopher Musco
Princeton University
Navid Nouri
EPFL
###### Abstract

Graph sketching has emerged over the past few years as a powerful technique for processing massive graphs that change over time (i.e., are presented as a dynamic stream of edge updates), starting with the work of Ahn, Guha and McGregor (SODA’12) on graph connectivity via sketching. In this paper we consider the problem of designing spectral approximations to graphs, or spectral sparsifiers, using a small number of linear measurements, with the additional constraint that the sketches admit an efficient recovery scheme.

Prior to our work, sketching algorithms were known with near optimal space complexity, but $\Omega(n^2)$ decoding time (brute-force over all potential edges of the input graph), or with subquadratic time, but rather large space complexity (due to their reliance on a rather weak relation between connectivity and effective resistances). In this paper we first show how a simple relation between effective resistances and edge connectivity leads to an algorithm with $\tilde{O}(n^{3/2})$ space and decoding time, which we show is a natural barrier for connectivity based approaches. Our main result then gives the first algorithm that achieves subquadratic recovery time, i.e. avoids brute-force decoding, and at the same time nontrivially uses the effective resistance metric, achieving subquadratic space and recovery time simultaneously.

Our main technical contribution is a novel method for ‘bucketing’ vertices of the input graph into clusters that allows fast recovery of edges of high effective resistance: the buckets are formed by performing ball-carving on the input graph using (an approximation to) its effective resistance metric. We feel that this technique is likely to be of independent interest.

Our second technical contribution is a new PRG for graph sketching applications that allows stretching a short seed of truly random bits to polynomial length pseudorandom strings with only polylogarithmic cost per evaluation. In fact, one notes that the aforementioned runtime bounds for graph sketches formally only hold under the assumption of free perfect randomness, and deteriorate substantially if Nisan’s PRG is used, as is standard. Our PRG is the first efficient PRG for graph sketching applications, allowing us to remove the free randomness assumption at only a polylogarithmic factor loss in runtime.

## 1 Introduction

A surprising and extremely useful algorithmic fact is that any graph can be approximated, in a strong sense, by a very sparse graph. In particular, given a graph $G$ on $n$ nodes that has possibly $\Omega(n^2)$ edges, it is possible to find a graph $\tilde{G}$ with just $\tilde{O}(n/\epsilon^2)$ edges such that, for any vector $x \in \mathbb{R}^n$,

$$(1-\epsilon)\, x^\top L_G x \;\le\; x^\top L_{\tilde{G}} x \;\le\; (1+\epsilon)\, x^\top L_G x \qquad (1)$$

Here $L_G$ and $L_{\tilde{G}}$ are the Laplacian matrices of $G$ and $\tilde{G}$, respectively. Any $\tilde{G}$ satisfying (1) is called a spectral sparsifier of $G$. A spectral sparsifier preserves many important structural properties of $G$: the total weight of edges crossing any cut in $\tilde{G}$ is within a $(1 \pm \epsilon)$ factor of the weight crossing the same cut in $G$, each eigenvalue of $L_{\tilde{G}}$ is within a $(1 \pm \epsilon)$ factor of the corresponding eigenvalue of $L_G$, and electrical flows in $\tilde{G}$ well approximate those in $G$.

These properties and more allow $\tilde{G}$ to be used as a surrogate for $G$ in many algorithmic applications. Since it can be stored in less space and operated on more efficiently, using the spectral sparsifier can generically reduce computational costs associated with processing large graphs [BSST13].
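As a concrete numerical illustration of definition (1) (our own toy example, not from the paper): the 4-cycle with all edge weights doubled satisfies (1) with $\epsilon = 1$ relative to the complete graph $K_4$, while a single edge does not. The helper names below are illustrative.

```python
import numpy as np

def laplacian(n, edges, weight=1.0):
    # Graph Laplacian assembled edge by edge (equivalently L = B^T W B).
    L = np.zeros((n, n))
    for (u, v) in edges:
        L[u, u] += weight; L[v, v] += weight
        L[u, v] -= weight; L[v, u] -= weight
    return L

def is_spectral_approx(LG, LH, eps, trials=100):
    # Spot-check (1-eps) x'LGx <= x'LHx <= (1+eps) x'LGx on random vectors;
    # a necessary condition for LH to eps-approximate LG (a full check would
    # examine the eigenvalues of the matrix pencil instead).
    rng = np.random.default_rng(0)
    for _ in range(trials):
        x = rng.standard_normal(LG.shape[0])
        qG, qH = x @ LG @ x, x @ LH @ x
        if not ((1 - eps) * qG - 1e-9 <= qH <= (1 + eps) * qG + 1e-9):
            return False
    return True

n = 4
LG = laplacian(n, [(i, j) for i in range(n) for j in range(i + 1, n)])  # K4
# The doubled 4-cycle spectrally 1-approximates K4 ...
LH_good = laplacian(n, [(0, 1), (1, 2), (2, 3), (3, 0)], weight=2.0)
# ... while a single edge is a poor surrogate.
LH_bad = laplacian(n, [(0, 1)])
```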

### 1.1 Spectral sparsifiers via linear sketching

The first algorithm for computing spectral sparsifiers was introduced by Spielman and Teng [ST11], with improvements offered in a number of subsequent papers. [SS11] gives a simple algorithm based on randomly sampling $G$’s edges and [BSS12] gives the first result achieving sparsifiers with an optimal number of edges, $O(n/\epsilon^2)$. Algorithms in [LS15] and [LS17] offer faster alternatives.

Recently, there has been great interest in algorithms that can recover a sparsifier based on a linear sketch of $G$ [AGM12a, AGM12b, GKP12, AGM13, KLM17]. The idea is to compress some representation of $G$ (usually its edge-vertex incidence matrix $B$) by multiplying that representation by a random sketching matrix $S$ with a small number of rows. We then extract a sparsifier from $SB$, which ideally does not store much more than $\tilde{O}(n)$ bits of information itself.

This approach can be viewed as reframing the sparsification problem as a highly structured sparse recovery problem. In traditional sparse recovery, the goal is to compress a vector $x$ with a linear sketch $Sx$. From $Sx$, we extract a sparse vector that approximates $x$. Here, the object we compress is a graph, and we extract a sparse approximation to the graph. As in vector sparse recovery, we are interested in two central questions:

1. How small can a linear sketch be while still allowing recovery of a spectral sparsifier $\tilde{G}$?

2. How quickly can we extract $\tilde{G}$ from this linear sketch?

We refer to the second cost as the time to “decode” our linear sketch. In traditional sparse recovery, the answer to both of these questions is roughly the sparsity $k$, up to logarithmic factors: we can recover a $k$-sparse vector approximation using space and decoding time that are both nearly linear in $k$.

The case is far less clear for graph sketching. Current methods achieve $\tilde{O}(n)$ space, which is nearly optimal, but use brute-force decoding techniques that run in $\Omega(n^2)$ time. We conjecture that this cost can be improved to $\tilde{O}(n)$. This paper makes the first progress towards that goal.

### 1.2 Why study this problem?

Like algorithms for vector sparse recovery [GI10], linear sketching algorithms for graph sparsification offer powerful tools for distributed or streaming computational environments. In particular, they can be far more flexible than traditional sparsification algorithms.

For example, any linear sketching algorithm for computing a sparsifier immediately yields a dynamic streaming algorithm for sparsification. In the dynamic streaming setting, the algorithm receives a stream of edge updates to a changing graph $G$ (i.e. edge insertions or deletions). The goal is to maintain a small space compression of $G$, e.g. in $\tilde{O}(n)$ space (we use $\tilde{O}(f(n))$ as shorthand for $O(f(n) \log^c n)$, where $c$ is a fixed constant that does not depend on $n$), and to eventually extract a sparsifier from this compression. To apply a linear sketching algorithm to this problem, we simply note that any edge update can be expressed as a linear update to the edge-vertex incidence matrix $B$, so the sketch $SB$ can be maintained dynamically. A sparsifier can then be extracted from $SB$ at any time.
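The fact that each edge update is a rank-one linear update to the sketch can be demonstrated in a few lines. The random sign matrix below is only an illustrative stand-in for the structured sketches the paper actually uses, and all names are our own.

```python
import numpy as np
from itertools import combinations

n = 5
pairs = list(combinations(range(n), 2))   # all possible edges, in a fixed order
idx = {e: i for i, e in enumerate(pairs)}
k = 8                                     # number of sketch rows (illustrative)
rng = np.random.default_rng(1)
S = rng.choice([-1.0, 1.0], size=(k, len(pairs)))  # toy random sign sketch

def b_row(u, v):
    # Row of the edge-vertex incidence matrix for the pair (u, v).
    b = np.zeros(n); b[u], b[v] = 1.0, -1.0
    return b

# Maintain SB under a dynamic stream: each update is (edge, +1 insert / -1 delete),
# applied as a rank-one update to the k x n sketch.
SB = np.zeros((k, n))
stream = [((0, 1), +1), ((1, 2), +1), ((0, 1), -1), ((2, 3), +1)]
for (u, v), sign in stream:
    SB += sign * np.outer(S[:, idx[(u, v)]], b_row(u, v))

# The final graph has edges {1-2, 2-3}; sketching its incidence matrix
# directly gives the same result.
B = np.zeros((len(pairs), n))
for (u, v) in [(1, 2), (2, 3)]:
    B[idx[(u, v)]] = b_row(u, v)
```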

The dynamic streaming setting naturally models computational challenges that arise when processing dynamically changing social networks, web topologies, transportation networks, and other large graphs. Not only does linear graph sketching offer a powerful approach to dealing with these computational problems, but it is the only known approach: all known dynamic streaming algorithms for graph sparsification, and in fact for any other graph problem, are based on linear sketching [McG14]. (In fact, it can be shown formally that any dynamic streaming algorithm can be implemented as a linear sketching algorithm [LNW14].)

### 1.3 Prior work

For a general survey on linear sketching methods and streaming graph algorithms more generally, we refer the reader to [McG14]. We focus on reviewing prior work specifically related to sparsification, which in some sense is the most generic graph compression objective that has been studied.

The idea of linear graph sketching was introduced in a seminal paper by Ahn, Guha, and McGregor [AGM12a]. They present an algorithm for computing a cut sparsifier of $G$, which is a strictly weaker, but still useful, approximation than a spectral sparsifier [BK96]. Their work was improved in [AGM12b] and [GKP12], which use a linear compression of size $\tilde{O}(n/\epsilon^2)$ to compute a cut sparsifier.

The more challenging problem of computing a spectral sparsifier from a linear sketch was addressed in [AGM13], which gives an $\tilde{O}(n^{5/3})$-space solution. An $\tilde{O}(n)$-space solution was obtained in [KLM17] by more explicitly exploiting the connection between graph sketching and vector sparse recovery.

We also mention that spectral sparsifiers have been studied in the insertion-only streaming model, where edges can only be added to $G$ [KL13], and in a dynamic data structure model [ADK16, ACK16, JS18a], where more space is allowed, but the algorithm must quickly output a sparsifier at every step of the stream. While these models are superficially similar to the dynamic streaming model, they seem to allow for different techniques, and in particular do not require linear sketching.

#### Effective resistance, spectral sparsification, and random spanning trees.

The effective resistance metric, i.e. the effective resistance distances induced by an undirected graph, plays a central role in spectral graph theory and has been at the heart of numerous algorithmic breakthroughs over the past decade. Effective resistances are central to obtaining fast algorithms for constructing spectral sparsifiers [SS11, KLP16a], spectral vertex sparsifiers [KLP16b], sparsifiers of the random walk Laplacian [CCL15, JKPS17], and subspace sparsifiers [LS18]. They have played a key role in many advances in solving Laplacian systems [ST04, KMP10, KMP11, PS14, CKM14, KLP16a, KLP16b, KS16] and are critical to the current fastest (weakly)-polynomial time algorithms for maximum flow and minimum cost flow in certain parameter regimes [LS14]. Given their utility, the computation of effective resistances has itself become an area of active research [JS18b, CGP18].

In a line of work particularly relevant to this paper, the effective resistance metric has played an important role in obtaining faster algorithms for generating random spanning trees [KM09, MST15, Sch18]. The result of [MST15] partitions the graph into clusters with bounded diameter in the effective resistance metric in order to speed up simulation of a random walk, whereas [Sch18] proposed a more advanced version of this approach to achieve a nearly linear time simulation. While these results seem superficially related to ours, there does not seem to be any way of using spanning tree generation techniques for our purpose. The main reason is that the objective in spanning tree generation results is quite different from ours: there one would like to find a partition of the graph that in a sense minimizes the number of times a random walk crosses cluster boundaries, which does not correspond to a way of recovering ‘heavy’ effective resistance edges in the graph. In particular, in spanning tree generation algorithms the important parameter is the number of edges crossing the cuts generated by the partitioning, whereas it is easily seen that heavy effective resistance edges cannot be recovered from small cuts.

### 1.4 Our results

In general, we cannot hope to improve on the space complexity of the solution in [KLM17], because any spectral sparsifier extracted from the sketch takes at least $\Omega(n)$ space to represent (for constant $\epsilon$). However, there still remains a major gap in addressing our second question of decoding time. The algorithm in [KLM17] uses $\Omega(n^2)$ decoding time. The method in [AGM13] is faster, running in subquadratic time, but it requires $\tilde{O}(n^{5/3})$ space, which is far from optimal.

We present two results that improve on these bounds. The first, summarized in Theorem 1, gives a simple algorithm that runs in $\tilde{O}(n^{3/2})$ space and time. The second, summarized in Theorem 5, gives a more involved method that improves on both of these bounds, running in subquadratic space and time for any constant $\epsilon$. Both of these algorithms are based on effective resistance sampling, which is a powerful way of constructing spectral sparsifiers in the offline setting [SS11].

We give a detailed technical overview of both methods in Section 3. At a high level, our second algorithm can be viewed as the first successful attempt to apply “bucketing” methods to the graph sparse recovery problem. The most naive way to recover a sparse approximation to a vector $x$ from a sketch $Sx$ is to use the sketch to check whether or not an individual entry in $x$ is large in comparison to $\|x\|_2$. This “brute-force” approach leads to algorithms that run in $\Omega(N)$ time for a length-$N$ vector. To achieve sublinear decoding time, it is necessary to check multiple entries at once, which can be done with hashing or bucketing schemes that divide $x$ into intervals of different sizes, checking the mass of entire intervals at once.

Similarly, the graph sketching algorithm of [KLM17] recovers a sparsifier by using the sketch to find edges with high effective resistance. It does so by checking all $\binom{n}{2}$ possible edges in $G$, leading to an $\Omega(n^2)$ runtime. Our improvement hinges on a method for bucketing $G$’s vertices into node clusters that effectively allows many edges to be checked simultaneously.

### 1.5 Fast pseudorandomness for linear sketching algorithms

Finally, we mention that, besides a better understanding of bucketing methods for graphs, obtaining faster sketching methods for sparsification requires solving a largely orthogonal issue, which we discuss in Section 3.3. In particular, like many streaming algorithms, our methods are developed with the assumption that we have access to a large number of fully random hash functions. To ensure that the algorithms actually run in small space, we need to eliminate this assumption. One potential way of doing so is through the use of a pseudorandom number generator (PRG) for small space computation [Ind00]. However, existing PRGs used in the streaming literature run slowly in our setting, creating another time bottleneck for decoding [Nis92].

We address this issue by describing a much faster, “locally computable” pseudorandom generator based on a construction of Nisan and Zuckerman [NZ96] and a locally computable randomness extractor of De and Vidick [DV10]. We hope this result will be more widely useful in designing faster sketching algorithms for graph problems and other applications.

## 2 Preliminaries

Let $G = (V, E)$ be an unweighted undirected graph with $n$ vertices and $m$ edges. Let $B$ denote the vertex-edge incidence matrix of $G$, with one row for every pair of distinct vertices; for any distinct pair of vertices $u, v$, if $\{u, v\} \notin E$ then the corresponding row in $B$ is zeroed out. Also, for any set of edges $F$ and any vertex $v$, we define $F(v)$ as the set of edges in $F$ incident to $v$. For any vertex $u$, let $\chi_u \in \mathbb{R}^V$ be the indicator vector of $u$, and for any distinct pair of vertices $u, v$, let $b_{uv} = \chi_u - \chi_v$. Let $B_n$ denote the vertex-edge incidence matrix of the unweighted and undirected complete graph, where for any edge $e = (u, v)$, its $e$’th row is equal to $b_{uv}$. In order to avoid repeating trivial conditions, when we say edge $(u, v)$ we implicitly assume $u \neq v$.

For a weighted graph $G = (V, E, w)$, where $w$ denotes the edge weights, let $W$ be the diagonal matrix of weights, where $W_{ee} = w(e)$. Note that $L = B^\top W B$ is the Laplacian matrix of the graph. Also, let $L^+$ denote the Moore-Penrose pseudoinverse of $L$.
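The identity $L = B^\top W B$ and the basic properties of $L$ (symmetry, the all-ones null vector for a connected graph, and the pseudoinverse identity $L L^+ L = L$) can be checked on a toy weighted graph; this is purely an illustration of the definitions above.

```python
import numpy as np

# Toy weighted graph on 4 vertices; the edge order fixes the rows of B.
n = 4
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
weights = [1.0, 2.0, 3.0, 4.0]

B = np.zeros((len(edges), n))                 # edge-vertex incidence matrix
for i, (u, v) in enumerate(edges):
    B[i, u], B[i, v] = 1.0, -1.0
W = np.diag(weights)
L = B.T @ W @ B                               # Laplacian L = B^T W B

# The same matrix assembled directly: weighted degrees on the diagonal,
# minus edge weights off the diagonal.
L_direct = np.zeros((n, n))
for (u, v), w in zip(edges, weights):
    L_direct[u, u] += w; L_direct[v, v] += w
    L_direct[u, v] -= w; L_direct[v, u] -= w

Lplus = np.linalg.pinv(L)                     # Moore-Penrose pseudoinverse
```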

###### Definition 1.

For any unweighted graph $G$ and any $\gamma > 0$, we define $G^\gamma$ via its Laplacian, as follows:

$$L_{G^\gamma} = L_G + \gamma I.$$

This can be seen in the following way. One can think of $G^\gamma$ as the graph $G$ plus a regularization term, and in order to distinguish between the edges of $G$ and the regularization term in $L_{G^\gamma}$, we let $B_{G^\gamma} = B \circ \left(\sqrt{\gamma/n} \cdot B_n\right)$, where $\circ$ is the operation of appending the rows of the second matrix to the first.

### 2.1 Effective Resistance

Suppose that we inject a unit current at vertex $u$ and extract it at vertex $v$. Let the vector $f$ denote the currents induced in the edges. By Kirchhoff’s current law, the sum of the currents entering (exiting) any vertex is zero, except at the source and the sink of the electrical network; hence $B^\top f = \chi_u - \chi_v$. Let the vector $\varphi$ denote the potentials induced at the vertices in this setting. By Ohm’s law, $f = W B \varphi$. Putting these together we get

$$\chi_u - \chi_v = B^\top W B \varphi = L \varphi.$$

Observe that $\chi_u - \chi_v$ is orthogonal to the kernel of $L$, hence $\varphi = L^+ (\chi_u - \chi_v)$.

The effective resistance between vertices $u$ and $v$ in graph $G$, denoted by $R_{uv}$, is defined as the voltage difference between vertices $u$ and $v$ when a unit of current is injected into $u$ and extracted from $v$. Thus,

$$R_{uv} = b_{uv}^\top L^+ b_{uv} \qquad (2)$$

We also let $R_{uu} = 0$ for any $u \in V$, for convenience.
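Formula (2) can be sanity-checked against the familiar rules for resistor networks (resistances in series add; in parallel they combine harmonically). The helper below is our own illustration.

```python
import numpy as np

def effective_resistance(n, edges, u, v):
    # R_uv = b_uv^T L^+ b_uv, with L the Laplacian of the unweighted graph.
    L = np.zeros((n, n))
    for (a, b) in edges:
        L[a, a] += 1; L[b, b] += 1; L[a, b] -= 1; L[b, a] -= 1
    buv = np.zeros(n); buv[u], buv[v] = 1.0, -1.0
    return buv @ np.linalg.pinv(L) @ buv

# Unit resistors in series add: on the path 0-1-2-3, R_{0,3} = 3.
R_path = effective_resistance(4, [(0, 1), (1, 2), (2, 3)], 0, 3)
# In the triangle, the direct edge (resistance 1) is in parallel with the
# two-edge path (resistance 2), so R_{0,1} = (1 * 2) / (1 + 2) = 2/3.
R_tri = effective_resistance(3, [(0, 1), (1, 2), (0, 2)], 0, 1)
```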

Also, for any pair $(w_1, w_2)$, the potential difference induced on this pair can be calculated as follows:

$$\varphi(w_1) - \varphi(w_2) = b_{w_1 w_2}^\top L^+ b_{uv}. \qquad (3)$$

Furthermore, if the graph is unweighted, the flow on edge $f$ is

$$i(f) = b_f^\top L^+ b_{uv}. \qquad (4)$$
###### Lemma 1.

Suppose that in a weighted graph $G$, we inject $\frac{1}{R_{uv}}$ units of flow into $u$ and extract them from $v$. Let the vector $\varphi$ denote the potentials induced on the vertices, i.e., $\varphi = \frac{1}{R_{uv}} L^+ b_{uv}$. Then

$$\sum_{(a,b)=e\in E} w(e)\left(\varphi(a)-\varphi(b)\right)^2 = \frac{1}{R_{uv}}.$$
###### Proof.

Suppose that one injects $\frac{1}{R_{uv}}$ units of flow into $u$ and removes them from $v$; then the potential induced on the vertices is $\varphi = \frac{1}{R_{uv}} L^+ b_{uv}$. Thus,

$$\sum_{(a,b)=e\in E} w(e)\left(\varphi(a)-\varphi(b)\right)^2 = \varphi^\top L \varphi = \frac{1}{R_{uv}^2}\, b_{uv}^\top L^+ L L^+ b_{uv} = \frac{1}{R_{uv}^2}\, b_{uv}^\top L^+ b_{uv} = \frac{1}{R_{uv}}.$$

Moreover,

$$\varphi(u) - \varphi(v) = \frac{1}{R_{uv}}\, b_{uv}^\top L^+ b_{uv} = 1, \qquad (5)$$

which means that instead of having a current source with $\frac{1}{R_{uv}}$ units of current, we can equivalently place a one-unit voltage source across $u$ and $v$, setting $\varphi(u) - \varphi(v) = 1$. ∎
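Lemma 1 is easy to verify numerically: with the potentials normalized so that $\varphi(u) - \varphi(v) = 1$, the total dissipated energy equals $1/R_{uv}$. The toy graph below is our own.

```python
import numpy as np

# Toy unweighted graph (all weights w(e) = 1).
n, edges = 4, [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
L = np.zeros((n, n))
for (a, b) in edges:
    L[a, a] += 1; L[b, b] += 1; L[a, b] -= 1; L[b, a] -= 1
Lplus = np.linalg.pinv(L)

u, v = 0, 2
buv = np.zeros(n); buv[u], buv[v] = 1.0, -1.0
Ruv = buv @ Lplus @ buv
phi = (Lplus @ buv) / Ruv          # potentials with phi(u) - phi(v) = 1
energy = sum((phi[a] - phi[b]) ** 2 for (a, b) in edges)
```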

For the graph $G^\gamma$ (see Definition 1), where $\gamma > 0$ and $G$ is an unweighted graph, we have the following corollary.

###### Corollary 1.

For the graph $G^\gamma$, where $G$ is an unweighted graph and $\gamma > 0$: for any pair of vertices $u, v$, if $\varphi = \frac{1}{R^{G^\gamma}_{uv}}\, L_{G^\gamma}^+ b_{uv}$, then

$$\sum_{(a,b)=e\in E} \left(\varphi(a)-\varphi(b)\right)^2 + \frac{\gamma}{n} \sum_{\{a,b\}\in\binom{V}{2}} \left(\varphi(a)-\varphi(b)\right)^2 = \frac{1}{R^{G^\gamma}_{uv}}.$$

We also have the following characterizations of effective resistance, which we use several times in this paper.

###### Fact 1.

For every weighted graph $G$, the effective resistance $R_{uv}$ can be characterized as

$$\frac{1}{R_{uv}} = \min_{\substack{\varphi\in\mathbb{R}^V \\ \varphi(u)-\varphi(v)=1}} \; \sum_{(a,b)=e\in E} w(e)\left(\varphi(a)-\varphi(b)\right)^2.$$

For regularized graphs, we have the following corollary, for convenience.

###### Corollary 2.

For the graph $G^\gamma$, where $G$ is an unweighted graph and $\gamma > 0$, and for any pair of vertices $u, v$,

$$\frac{1}{R^{G^\gamma}_{uv}} = \min_{\substack{\varphi\in\mathbb{R}^V \\ \varphi(u)-\varphi(v)=1}} \; \sum_{(a,b)=e\in E} \left(\varphi(a)-\varphi(b)\right)^2 + \frac{\gamma}{n} \sum_{\{a,b\}\in\binom{V}{2}} \left(\varphi(a)-\varphi(b)\right)^2.$$
###### Fact 2.

For every weighted graph $G$, the effective resistance $R_{uv}$ can be characterized as

$$R_{uv} = \min_{f:\; B^\top f = \chi_u - \chi_v} \; \sum_{e\in E} \frac{f(e)^2}{w(e)}.$$

Also, we frequently use the following simple fact.

###### Fact 3 (See e.g. [KLM17], Lemma 3).

For any pairs of vertices $(u, v)$ and $(u', v')$, we have

$$b_{uv}^\top L^+ b_{u'v'} = b_{u'v'}^\top L^+ b_{uv} \le b_{uv}^\top L^+ b_{uv} = R_{uv}. \qquad (6)$$

We also use Rayleigh’s monotonicity law throughout the paper.

###### Fact 4 (Rayleigh’s monotonicity law).

For every graph $G$ and every edge $e \in E$, the removal of $e$ from $G$ can only increase the effective resistances between the remaining pairs of vertices.
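Rayleigh monotonicity can be observed directly on a small example (ours): deleting the edge $(1,2)$ from a triangle with a pendant vertex removes a parallel path and strictly increases $R_{01}$.

```python
import numpy as np

def eff_res(n, edges, u, v):
    # Effective resistance R_uv = b_uv^T L^+ b_uv in an unweighted graph.
    L = np.zeros((n, n))
    for (a, b) in edges:
        L[a, a] += 1; L[b, b] += 1; L[a, b] -= 1; L[b, a] -= 1
    x = np.zeros(n); x[u], x[v] = 1.0, -1.0
    return x @ np.linalg.pinv(L) @ x

edges = [(0, 1), (1, 2), (0, 2), (2, 3)]            # triangle plus a pendant
before = eff_res(4, edges, 0, 1)                     # 2/3: two parallel paths
after = eff_res(4, [e for e in edges if e != (1, 2)], 0, 1)  # 1: direct edge only
```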

###### Definition 2.

In any graph $G$, for any $u \in V$ and any $r \ge 0$, we define

$$B_G(u, r) = \{\, v \in V : R^G_{uv} \le r \,\}.$$

Recall that since we defined $R^G_{uu} = 0$, we have $u \in B_G(u, r)$ for every $r \ge 0$.

###### Lemma 2.

Suppose that graphs $G$ and $\tilde{G}$ share the same vertex set and are such that, for any pair of vertices $u, v$, we have $R^{\tilde{G}}_{uv} \le \Gamma \cdot R^{G}_{uv}$. We claim that for any $r \ge 0$ and any vertex $u$, we have the following:

$$B_G(u, r/\Gamma) \subseteq B_{\tilde{G}}(u, r).$$
###### Proof.

For any $v \in B_G(u, r/\Gamma)$ we have $R^G_{uv} \le r/\Gamma$. Therefore, by the assumption of the lemma we get $R^{\tilde{G}}_{uv} \le \Gamma \cdot R^G_{uv} \le r$, which means $v \in B_{\tilde{G}}(u, r)$. ∎

In a graph $G$ with incidence matrix $B$, suppose that $\chi \in \mathbb{R}^V$ is a demand vector satisfying $\mathbf{1}^\top \chi = 0$. A flow vector $f$ is called a $\chi$-flow if $B^\top f = \chi$.

In a graph $G$, for any set of vertices $C \subseteq V$, we denote the graph induced on $C$ by $G[C]$. Also, we write $\mathrm{diam}(G)$ and $\mathrm{diam}(C)$ for the effective resistance diameter (the maximum effective resistance between any pair of vertices) of $G$ and of $C$, respectively.

For symmetric matrices $A$ and $\tilde{A}$, we write $A \preceq \tilde{A}$ if $x^\top A x \le x^\top \tilde{A} x$ for all $x$. We say that $\tilde{A}$ is an $\epsilon$-spectral sparsifier of $A$, written $\tilde{A} \approx_\epsilon A$, if $(1-\epsilon) A \preceq \tilde{A} \preceq (1+\epsilon) A$. Graph $\tilde{G}$ is an $\epsilon$-spectral sparsifier of graph $G$ if $L_{\tilde{G}} \approx_\epsilon L_G$. We also sometimes use a slightly weaker notion, requiring only that $(1-\epsilon)\, x^\top A x \le x^\top \tilde{A} x \le (1+\epsilon)\, x^\top A x$ for any $x$ in the row span of $A$.

## 3 Technical overview

Graph sketching, initiated by the work of Ahn, Guha and McGregor on solving graph connectivity in dynamic streams [AGM12a], is the idea of designing graph algorithms that access the input graph via linear measurements. While graph sketching is a relatively recent development, the idea of linear sketching has been applied to basic statistical estimation problems on vectors (e.g. norm estimation, heavy hitters) for more than a decade, with many efficient algorithms for fundamental problems known.

A very successful approach to designing linear sketches for graphs, originally suggested by Ahn, Guha and McGregor, amounts to applying a classical sketching matrix $S$ to the edge incidence matrix $B$ of the graph, and then designing an offline decoding algorithm that reconstructs useful information about the graph from $SB$. Such sketches turn out to have a very useful composability property: post-multiplying the sketch $SB$ by a vector $x$ lets one obtain a sketch $SBx$ of any given vector $Bx$ in the column space of $B$, i.e. in the cut space of the graph. This property has been exploited in the literature [AGM12a, AGM12b, AGM13, KLM17, KW14] to obtain space efficient sketches for connectivity (where the sketch is an $\ell_0$-sampler) and spectral sparsification (here $S$ is an $\ell_2$-heavy hitters sketch).

### 3.1 Graph sketching vs classical sparse recovery: small sketch size and efficient decoding

A graph sketching algorithm consists of two phases. First, one maintains the sketch $SB$, where $S$ is the sketching matrix and $B$ is the edge incidence matrix of the input graph, under dynamic edge updates (insertions and deletions). Next, at the end of the stream, one runs a (usually nonlinear) decoding algorithm on the sketch to produce a sparsifier. Note that the space complexity of the algorithm is the number of rows in $S$ times $n$, the number of vertices of the graph. We now describe approaches for spectral sparsification through linear sketches that have been developed in the literature. We focus on single-pass dynamic streaming algorithms, i.e. oblivious sketches. In this setting, the only known approach for designing space efficient sketches is to implement the effective resistance sampling approach of Spielman and Srivastava [SS11]. (If two passes over the stream are allowed, one can use a relation between spectral sparsifiers and spanners [KP12] and exploit spanner construction algorithms [KW14], but this approach is not known to extend to the single pass setting.)

In order to construct a spectral sparsifier, as per  [SS11], one samples edges with probabilities proportional to their effective resistances, and gives sampled edges appropriate weights (inverse of the sampling probability) to make the estimate unbiased. Our main problem is to design an oblivious sketch with a small number of rows that allows efficient recovery of such a sample.

It is known from prior work [KLM17] that the main challenge in constructing sketches for spectral sparsification lies in designing a sketch that allows recovery of ‘heavy’ edges of the graph, i.e., edges with large effective resistance (say, effective resistance larger than a constant – such edges need to be included in a sparsifier with constant probability as per the sampling approach of Spielman and Srivastava). For simplicity we focus on this question of finding heavy edges for the purposes of our overview, as a reduction introduced in [KLM17] can be used to convert any such primitive into a full-fledged sparsification routine. We refer to the problem of recovering high effective resistance edges as the HeavyEdges problem. The task is to design a sketch with a small number of rows that allows solving the following problem:

HeavyEdges($SB$, $\tilde{G}$)
Input: sketch $SB$ of graph $G$, a coarse spectral sparsifier $\tilde{G}$ of $G$
Output: a list of size $\tilde{O}(n)$ that contains all edges of $G$ with large effective resistance

In the description above we refer to $\tilde{G}$ as a coarse sparsifier of $G$ if for some parameter $\Gamma \ge 1$ one has

$$\frac{1}{\Gamma} K \preceq \tilde{K} \preceq K,$$

where $K$ is the Laplacian of $G$, $\tilde{K}$ is the Laplacian of $\tilde{G}$, and $\preceq$ stands for the positive semidefinite ordering on matrices.

Two approaches have been designed for this problem in the literature.

#### Spectral sparsifiers through inverse connectivity sampling: suboptimal space but subquadratic decoding time.

The first approach, introduced by Ahn, Guha and McGregor [AGM13], is based on relating effective resistances in graphs to edge connectivity: one proves that an edge of large (e.g., constant) effective resistance must have nontrivially small edge connectivity, and concludes that sampling with probabilities proportional to (overestimates of) inverse connectivities suffices, as long as a large enough number of samples is taken. The latter is possible using a spanning forest sketch [AGM12a]. Unfortunately, the relation between effective resistances and inverse connectivities is rather weak in general, and this approach leads to an algorithm with $\tilde{O}(n^{5/3})$ space and subquadratic time complexity. We note that this approach inherently relies on the idea of recovering high effective resistance edges by using the fact that they must cross (reasonably) small cuts.

#### Spectral sparsifiers through ℓ2 heavy hitters: optimal space but quadratic decoding time.

Another approach, introduced in [KLM17], uses an $\ell_2$-based characterization of edges with high effective resistance: an edge $e = (u, v)$ has effective resistance at least $R$ in the graph if and only if at least an $R$ fraction of the energy of the unit electrical flow from $u$ to $v$ is contributed by edge $e$ itself. This characterization allows one to recover high effective resistance edges by applying an $\ell_2$ heavy hitters sketch to the edge incidence matrix $B$. Specifically, if $\varphi$ is the vector of vertex potentials induced by injecting one unit of flow at $u$ and removing one unit of flow from $v$, then the effective resistance of edge $e$ satisfies

$$R_e = \frac{(B\varphi)_e^2}{\|B\varphi\|_2^2}. \qquad (7)$$

The relation (7) above implies that if one chooses $S$ to be an $\ell_2$-heavy hitters sketch, post-multiplies the sketch $SB$ by the vector of potentials $\varphi$, and decodes the resulting sketch $SB\varphi$ using standard heavy hitters decoding, the resulting list of heavy coordinates will contain $e$. This is a very useful observation, but it does not quite lead to an algorithm, since in order to compute the vector of potentials $\varphi$, one needs to know the entire graph $G$! It turns out, however, that the exact $\varphi$ is not needed. If a coarse (large constant factor) sparsifier $\tilde{G}$ of $G$ is available explicitly, then one can instead compute the corresponding vector of potentials $\tilde{\varphi}$ in $\tilde{G}$ and decode $SB\tilde{\varphi}$ instead. The approximation quality of $G$ by $\tilde{G}$ affects the size of the sketch, but the approach still works – this is the algorithm of [KLM17]. This approach leads to an optimal space complexity, but suffers from a large runtime: the decoding essentially brute-forces over all potential edges $(u, v)$, and hence the runtime is quadratic.
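Identity (7) is easy to confirm numerically for an unweighted graph: with $\varphi = L^+ b_{uv}$ the potentials of the unit $u \to v$ flow, the squared potential drop across edge $e = (u, v)$, divided by the total energy $\|B\varphi\|_2^2$, equals $R_e$. The graph below is our own toy example.

```python
import numpy as np

# 5-cycle plus the chord (1, 3); edge order fixes the rows of B.
n, edges = 5, [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4), (1, 3)]
B = np.zeros((len(edges), n))
for i, (a, b) in enumerate(edges):
    B[i, a], B[i, b] = 1.0, -1.0
L = B.T @ B
Lplus = np.linalg.pinv(L)

e = 5                                # index of the chord (1, 3)
buv = B[e]
phi = Lplus @ buv                    # potentials of the unit 1 -> 3 flow
Re = buv @ Lplus @ buv               # effective resistance of edge e
ratio = (B @ phi)[e] ** 2 / np.linalg.norm(B @ phi) ** 2
```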

#### Our approach to efficient recovery: ‘bucketing’ vertices by ball carving in effective resistance metric.

It is interesting to contrast the state of the art in graph sketching with classical sketching algorithms for heavy hitters. The problem in $\ell_2$ heavy hitters is: design a sketch $S$ such that for every vector $x$ most of whose mass is in the top $k$ coordinates, one can recover a good approximation to those coordinates from the low dimensional vector $Sx$. This problem can be solved in optimal space by hashing into $\tilde{O}(k)$ buckets and then using brute-force decoding over the universe of size $N$ (e.g. the CountSketch algorithm). This is similar in flavor to the result of [KLM17], where the space complexity (or, sketching dimension) is optimal, but the decoding is brute-force over the universe of possible edges, i.e. over $\binom{n}{2}$ pairs. For classical sketching, solutions have been proposed that achieve the optimal bounds on sketch size and also run in sublinear time by careful decoding of the buckets, but no equivalent approach for graphs was known prior to our work. The question that we ask is:

Can one construct a ‘bucketing scheme’ for spectral sparsification via sketching that will allow fast recovery?

Our result is the first to define a notion of ‘buckets’ in graph sketching that admit efficient decoding primitives. Our ‘bucketing scheme’ is based on ball carving in the effective resistance metric of the input graph: our algorithm (Algorithm 1 for sparsification and Algorithm 2 for the HeavyEdges problem) is a recursive procedure that constructs progressively better approximations to the effective resistance metric of the input graph, and at every point partitions the vertices of the graph by ball carving in the effective resistance metric learnt so far in order to speed up recovery of new important edges.

### 3.2 Our techniques

We now present an overview of our techniques.

#### Ensuring a lower bound on minimum degree via peeling.

Our development in this paper starts with the observation that for every $d$, at the cost of $\tilde{O}(nd)$ space and time, a linear sketching algorithm can assume that the input graph has minimum degree lower bounded by $d$. This is due to the fact that we can store a sketch of the edge incidence matrix of $G$ with $\tilde{O}(d)$ rows that allows recovery of all edges incident on any given vertex of degree at most $d$, with high probability, as well as store all vertex degrees exactly (using a sketch or a simple counter). The algorithm can therefore perform the following decoding operation in $\tilde{O}(nd)$ time: iteratively find the smallest degree vertex in the graph, recover all incident edges, subtract these edges from the sketch, and repeat on the residual graph while a vertex of degree at most $d$ exists. Such iterative processes are often hard to implement using sketches due to dependencies that may develop, but in this case this issue does not arise: the algorithm stores the degrees of all vertices exactly, and therefore the execution path of such a peeling process does not depend on the sketches, as long as the high probability success events for the sparse recovery sketches happen at all intermediate iterations. Since a given edge can be subtracted from any sketch that we use in $\tilde{O}(1)$ time, this results in an algorithm with $\tilde{O}(nd)$ runtime. See the proof of Lemma 16 in Section 6 for details.
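The peeling loop above can be sketched in plain Python. Here exact degrees drive the execution path, and "recovering all edges incident on a low-degree vertex" (done via a sparse-recovery sketch in the paper) is simulated by direct access to an adjacency structure; all names are our own illustration, not the paper's.

```python
def peel_low_degree(n, edges, d):
    # Simulate the peeling step: repeatedly remove a minimum-degree vertex of
    # degree <= d, recording its incident edges, until min degree exceeds d.
    adj = {u: set() for u in range(n)}
    for (u, v) in edges:
        adj[u].add(v); adj[v].add(u)
    recovered = []                       # edges removed during peeling
    active = set(range(n))
    while True:
        low = [u for u in active if len(adj[u]) <= d]
        if not low:
            break
        u = min(low, key=lambda x: len(adj[x]))   # smallest-degree vertex
        for v in list(adj[u]):                    # "recover" and subtract edges
            recovered.append((min(u, v), max(u, v)))
            adj[v].discard(u)
        adj[u].clear()
        active.discard(u)
    return recovered, active

# A star on {0..4} plus a 5-clique on {5..9}: with d = 3, the star peels away
# entirely, while every clique vertex keeps degree 4 > 3 and survives.
star = [(0, i) for i in range(1, 5)]
clique = [(i, j) for i in range(5, 10) for j in range(i + 1, 10)]
recovered, residual = peel_low_degree(10, star + clique, 3)
```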

In what follows we assume that our input graph has minimum degree lower bounded by a parameter $d$, at the cost of an additive $\tilde{O}(nd)$ term in space and decoding time. We set $d = \sqrt{n}$ for our warm-up result below (a simple sketch with $\tilde{O}(n^{3/2})$ space and decoding time; see below for an overview and Section 6 for the actual algorithm), and choose $d$ differently for our main result (see below for an overview and Section 4 onwards for details).

#### Warm-up result: $\tilde{O}(n^{3/2})$ space and time via edge connectivities.

We start by noting that if a lower bound on the minimum degree of the graph is assumed, one can prove a stronger relation between edge connectivities and effective resistances than the one used in [AGM13]. Specifically, we show that if the minimum degree in the input graph is lower bounded by $d$, then every edge with effective resistance at least $1/2$, say, necessarily has connectivity at most $O(d)$. More formally, we show:

###### Lemma 3 (Informal version of Lemma 17).

For every graph $G = (V, E)$ and every integer $d \ge 1$, if for every vertex $v \in V$ we have $\deg(v) \ge d$, then every edge $e$ whose edge-connectivity exceeds a sufficiently large constant multiple of $d$ has effective resistance $R_e < 1/2$.

To prove the lemma, for every edge $e = (u, v)$ we consider the line embedding of the graph given by the vertex potentials induced by injecting one unit of flow into $u$ and taking it out at $v$, normalized so that $\varphi(u) - \varphi(v) = 1$. The sum of squared potential differences over the edges of $G$ (i.e. the energy of the embedding $\varphi$) is then the effective conductance between $u$ and $v$ (the inverse of the effective resistance). We then use the min-degree assumption to note that all cuts with few vertices on one side have size $\Omega(d)$, and therefore conclude that any low energy line embedding must map large groups of vertices close together. The latter fact, together with the connectivity assumption, implies a lower bound on the conductance of the edge $e$ (see the proof of Lemma 17 in Section 6 for the details).

Lemma 17 immediately yields a solution to our HeavyEdges problem: one simply uses a sketch that recovers all edges with connectivity at most $O(d)$ and outputs this list. The latter can be done using a result of [AGM12a], but the decoding time for that procedure is quadratic in the connectivity parameter (due to the need to subtract recovered spanning forests from the sketch), which unfortunately does not yield a runtime improvement for our setting. However, we give a simple sketch based on the spanning forest algorithm of [AGM12a] that achieves linear runtime in Section C.1. In Section 6 we show how this outline leads to an algorithm with $\tilde{O}(n^{3/2})$ space and runtime complexity (this part of the analysis follows the ideas developed in [KLM17]). Formally, we prove

###### Theorem 1.

There exists an algorithm that, for any $\epsilon > 0$, processes a list of edge insertions and deletions for an unweighted graph $G$ in a single pass and maintains a set of linear sketches of this input in $\tilde{O}(n^{3/2} \cdot \mathrm{poly}(1/\epsilon))$ space. From these sketches, it is possible to recover, with high probability, a weighted subgraph $H$ with $\tilde{O}(n/\epsilon^2)$ edges, such that $H$ is an $\epsilon$-spectral sparsifier of $G$. The algorithm recovers $H$ in $\tilde{O}(n^{3/2} \cdot \mathrm{poly}(1/\epsilon))$ time.

Unfortunately, the space and time complexity provided by Theorem 1 seems to be the limit of the idea of removing low degree vertices and then recovering low connectivity edges, because Lemma 17 is optimal (consider a union of cliques of size connected by matchings, with the first and the last clique further connected by an edge). This motivates the following goal:

Design a sketching algorithm for spectral sparsification in dynamic streams that achieves better than space and decoding time simultaneously.

Such a result would need to use heavy hitters sketches to go beyond the space complexity, but at the same time must avoid brute-force decoding used in [KLM17]. Our main result achieves exactly that: we give an algorithm with space and decoding time that uses heavy hitters sketches to go beyond the relation between effective resistances and connectivity, but at the same time avoids brute force decoding using a novel scheme for bucketing nodes in a graph based on ball carving in the effective resistance metric of the underlying graph.

#### Main technique: ball-carving in effective resistance metric.

We start by recalling the high level approach of [KLM17], and then outline the main technical ideas involved in implementing the -heavy hitters decoding to run faster than the brute-force approach. The algorithm of [KLM17] is

HeavyEdgesBruteForce()
Input:   Sketch of graph , a coarse spectral sparsifier of
Output:  List of size that contains all edges with effective resistance
  Initialize
  for
      for
          Compute  potentials induced by flow from to in
          Decode , add result to
      end for
  end for

The algorithm above recovers a heavy edge whenever it routes flow from to in the coarse sparsifier , so the natural question is whether one can group vertices into clusters, or buckets, to avoid testing all pairs.
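As a concrete toy illustration of this brute-force loop, the sketch below replaces the linear sketches with direct access to the graph and tests all vertex pairs; the function name `heavy_edges_brute_force` and the example graph are our own illustrative choices, not the paper's primitives.

```python
import numpy as np
from itertools import combinations

# Toy stand-in for HeavyEdgesBruteForce: instead of decoding a linear
# sketch, read the graph directly; still test all Theta(n^2) pairs and
# report every edge whose effective resistance exceeds a threshold.
def heavy_edges_brute_force(n, edges, threshold):
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1; L[v, v] += 1
        L[u, v] -= 1; L[v, u] -= 1
    Lp = np.linalg.pinv(L)
    edge_set = {frozenset(e) for e in edges}
    heavy = []
    for u, v in combinations(range(n), 2):   # brute force over all pairs
        b = np.zeros(n); b[u], b[v] = 1.0, -1.0
        R_uv = b @ Lp @ b                    # eff. resistance between u, v
        if frozenset((u, v)) in edge_set and R_uv >= threshold:
            heavy.append((u, v))
    return heavy

# Two triangles joined by a bridge: the bridge (2, 3) has resistance 1,
# while each triangle edge has resistance 2/3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(heavy_edges_brute_force(6, edges, 0.9))  # only the bridge is heavy
```

The quadratic pair loop is exactly the cost that the bucketing scheme below is designed to avoid.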

#### Group testing heavy edges by bucketing.

The main idea underlying our analysis is the following bucketing scheme. Suppose that we are able to partition the vertex set of into vertex-disjoint subsets such that the effective resistance diameter of every set is smaller than a parameter . Now consider an edge with . Since , the endpoints and must belong to different elements of the partition! This suggests the following approach: instead of sending flow from to for every potential pair of vertices, proceed as follows for every element of the partition . First contract the subset of to a supernode in the coarse sparsifier , obtaining (explicitly) a graph . Then for every node compute the potential induced by unit electrical flow from to in and decode the sketch . We record this informally in the HeavyEdgesFast algorithm below. We note that the HeavyEdgesFast primitive below serves as an approximation to our actual HeavyEdges algorithm, namely Algorithm 2 from Section 4.2.

HeavyEdgesFast()
Input:   Sketch of graph , a coarse spectral sparsifier of , partition
Output:  List of size that contains all edges with effective resistance
  Initialize
  for  to
      , supernode corresponding to
      for all vertices in ,
          Compute  potentials induced by flow from to in
          Decode , add result to
      end for
  end for

Note that this approach amounts to recovering all edges that could go from to any node in at the cost of only one flow computation as opposed to computations, and hence is promising, as long as we can prove that this approach correctly recovers heavy edges with one endpoint in . Our first crucial observation is that this method of recovering edges actually works: if for some vertex one has , then the effective resistance between and in the contracted graph will be large. Formally, this is guaranteed by

###### Lemma 4.

In graph , suppose that vertex belongs to a set of vertices , where . Also assume that and let denote the corresponding super-node , i.e., is the resulting graph after contracting vertices of in . Then for any such that one has

 R^H_{cv}\ \ge\ R^G_{uv}\left(1-\beta R^G_{uv}\right)^2

The proof of the lemma is given in Appendix B.
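The following toy NumPy computation (our own example, not part of the paper's analysis) illustrates the phenomenon behind the lemma: contracting a cluster into a supernode, done here via a quotient map on the Laplacian, preserves the effective resistance of a bridge edge leaving the cluster.

```python
import numpy as np

def laplacian(n, edges):
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1; L[v, v] += 1
        L[u, v] -= 1; L[v, u] -= 1
    return L

def eff_res(L, u, v):
    b = np.zeros(len(L)); b[u], b[v] = 1.0, -1.0
    return b @ np.linalg.pinv(L) @ b

# Two 4-cliques joined by the single bridge (3, 4): the bridge is heavy.
cliques = [(u, v) for u in range(4) for v in range(u + 1, 4)]
edges = cliques + [(u + 4, v + 4) for u, v in cliques] + [(3, 4)]
L = laplacian(8, edges)
R_G = eff_res(L, 3, 4)          # resistance across the bridge in G

# Contract S = {4,5,6,7} (the far clique, which has small effective
# resistance diameter) into a supernode c; node order: 0..3, then c.
keep, S = [0, 1, 2, 3], [4, 5, 6, 7]
P = np.zeros((8, 5))
for i, u in enumerate(keep):
    P[u, i] = 1.0
for u in S:
    P[u, 4] = 1.0
L_H = P.T @ L @ P               # quotient Laplacian of the contracted graph H
R_H = eff_res(L_H, 3, 4)        # resistance from node 3 to the supernode c

assert abs(R_G - 1.0) < 1e-8    # bridge resistance is exactly 1 in G
assert abs(R_H - 1.0) < 1e-8    # ...and is preserved after contraction
```

Here the heavy edge keeps its full resistance after contraction; the lemma quantifies the small loss one incurs in general.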

Our actual HeavyEdges algorithm (Algorithm 2 in Section 4.2) crucially uses this lemma in its correctness analysis. Now that we know that the idea of contracting subsets of vertices of low effective resistance diameter leads to a correct algorithm, we need to understand the runtime. The runtime depends on the number of elements in the partition (we note that in the actual HeavyEdges algorithm the sets in general may not form a partition – see Section 4; they do, however, form a partition if the minimum cut in the input graph is lower bounded by , for example): the contraction can be done in time since the coarse sparsifier is given to us explicitly, and computing all the necessary sketches can also be accomplished in time using standard techniques (see Algorithm 2, line 24, and its analysis in Lemma 6 in Section 4.2), so the overall runtime is . This means that in order to make our approach work, we need to answer the following question:

Can a graph with minimum degree lower bounded by be partitioned into few () clusters of low (e.g., ) effective resistance diameter?

It is not hard to show that one can always partition into such clusters. Unfortunately, however, the bound is essentially tight, and is not sufficiently good for our purposes. For tightness, it is easy to construct graphs with minimum degree lower bounded by where the number of clusters must be – just consider cliques of size , joined by a path to ensure connectivity. This means that without further ideas we cannot ensure that the number of elements in our partition is smaller than , which means that the runtime of the process above is , which together with the time for recovering edges incident on vertices of degree gives overall space and time , which is at least for all choices of .

The next observation that we need is that if, using a sketch with rows, we could not only ensure that our graph has minimum degree lower bounded by , but also that the minimum cut in the graph is , a much better result would follow (in the actual algorithm we are not able to reduce the problem to the setting where the input graph has min-cut lower bounded by at the expense of only space and time complexity; however, we instead ensure a weaker condition that turns out to be sufficient for our purposes – see the analysis of the BallCarving algorithm (Algorithm 3) in Section 5). In the actual algorithm we are not quite able to ensure that the min-cut in our graph is lower bounded by , and instead use a weaker assumption – see Section 5. Nevertheless, it is useful for this overview to consider the following question:

Can a graph with minimum cut lower bounded by be partitioned into few () clusters of low (e.g., ) effective resistance diameter?

The answer to this question turns out to be yes, and the quantitative bounds are sufficient to achieve better than space and time – this is exactly how our algorithm works. We show that every graph with minimum cut lower bounded by can be partitioned into vertex disjoint subsets with effective resistance diameter , say, with the number of parts being much smaller than (in contrast with the previous version of this question, where was the best possible bound). This fact is exactly what underlies our runtime and space complexity of . The algorithm for doing this is simple: we repeatedly pick vertices of the graph (ball centers) and remove balls of a given effective resistance radius, until no vertex remains. The algorithm is given below:

BallCarving()
Input:   A coarse spectral sparsifier of , radius
Output:  Partition into vertex disjoint subsets of effective resistance diameter
  Initialize
  while
      a vertex in
  end while

We note that the BallCarving algorithm above serves as an illustration to our actual BallCarving primitive (Algorithm 3) presented and analyzed in Section 5.
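A minimal Python sketch of this greedy carving process, assuming exact access to the effective resistance metric (in the paper only the coarse sparsifier is available; the function name `ball_carving` and the example graph are ours):

```python
import numpy as np

# Toy version of BallCarving: repeatedly pick an unassigned vertex and
# carve off the ball of effective-resistance radius r around it.
def ball_carving(n, edges, r):
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1; L[v, v] += 1
        L[u, v] -= 1; L[v, u] -= 1
    Lp = np.linalg.pinv(L)
    d = np.diag(Lp)
    # pairwise effective resistances: R[u,v] = Lp[u,u] + Lp[v,v] - 2 Lp[u,v]
    R = d[:, None] + d[None, :] - 2 * Lp
    remaining = set(range(n))
    parts = []
    while remaining:
        center = min(remaining)                    # pick any ball center
        ball = {u for u in remaining if R[center, u] <= r}
        parts.append(ball)                         # carve the ball out
        remaining -= ball
    return parts

# A 6-cycle carved with radius 1 (adjacent vertices are at resistance 5/6).
edges = [(i, (i + 1) % 6) for i in range(6)]
parts = ball_carving(6, edges, 1.0)
assert sorted(u for p in parts for u in p) == list(range(6))  # a partition
print(parts)
```

The interesting question, addressed by Theorem 2 below, is how many balls such a process can produce.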

We show

###### Theorem 2 (Informal).

For every graph with minimum cut lower bounded by , if is a constant factor spectral approximation to , then the procedure BallCarving above outputs a partitioning into .

The fact that the number of parts in the decomposition is only even when is exactly the reason why our ball-carving based approach is able to go beyond the space and time barrier.

The proof of Theorem 2 is never used in the actual analysis, but follows formally from the following two crucial bounds that we prove. Theorem 3, which bounds the number of elements in the collection output by our BallCarving algorithm (see Algorithm 3 in Section 5), is the core tool behind our analysis. Its proof is given in Section 5.

###### Theorem 3.

For every graph and a set of edges , if:

1. The minimum degree in is lower bounded by .

2. For some and integer the vertex set admits a partition such that for every the subgraph induced by has effective resistance diameter bounded by , i.e., such that .

3. Edge set contains all the edges of connectivity no more than in graph .

4. is the set of needed sketches.

Then BallCarving() returns a set of disjoint subsets of vertices with effective resistance diameter bounded by in the metric of , and there are no more than such non-singleton partitions.

We note that the result of Theorem 3 depends on the quality of the clustering whose existence is assumed by the theorem. Such a clustering is provided by Theorem 4 below:

###### Theorem 4.

For any unweighted graph with min-degree at least , the set of vertices admits a partitioning into such that

 \forall i\in[k],\quad \operatorname{diam}^{\mathrm{Ind}}_{\mathrm{eff}}(C_i)\le \frac{10}{n^{0.4}}

and

 k\le c\cdot n\sqrt{\log n}\cdot\sqrt{\frac{1}{n^{0.4}\log^2 n}}=c\cdot n^{0.8}\sqrt{\frac{1}{\log n}}=O(n^{0.8})

We note that Theorem 4 provides a decomposition of the vertex set into vertex-disjoint sets such that every has very low effective resistance diameter, about the inverse of the minimum degree, as an induced subgraph. The proof is a quantitative improvement of the work of [AALG17] on graph clustering using effective resistances and is provided in Appendix A. Qualitatively, the fact that the number of clusters of diameter can be made polynomially smaller than is one of the main observations that enable our algorithm (Algorithm 1, presented in Section 4) to go beyond the space and decoding time barrier simultaneously.

We note that Theorem 2 follows by combining Theorem 4 with Theorem 3, and observing that if the input graph satisfies the minimum cut assumption, then the set passed to BallCarving may be taken to be empty, in which case the sets do form a partition, as required. However, we note that Theorem 2 is just a toy application of our techniques, and we therefore do not provide the full proof, instead referring the reader to the formal analysis of our HeavyEdges primitive (Algorithm 2) in Section 4 as well as the analysis of BallCarving (Algorithm 3) in Section 5.

The proof of Theorem 3 is the technical core of the paper (see Section 5). The main idea behind the proof is a very natural ball-growing process that helps us lower bound the efficiency of ball carving. The main observation is simple: if BallCarving, when run with parameter as the radius, outputs many parts, then balls of radius around the corresponding ball centers do not overlap, which, as we show, is not possible. The proof considers a natural ball-growing process that, starting from any node , keeps growing a ball in the effective resistance metric up to radius . We show that most such balls capture many vertices. Since the balls are disjoint, this implies that there cannot be too many of them, and consequently there cannot be too many elements in the partition that BallCarving outputs. The resistance metric in question can be thought of as the effective resistance metric of itself for the purposes of this outline. In the actual algorithm we use the effective resistance metric of the coarse sparsifier , since we do not have access to the effective resistance metric of , and show that these two metrics are equivalent for our purposes up to a small loss in parameters (see the proof of Theorem 6 in Section 5). We refer the reader to Section 5 for more details.

### 3.3 Reducing the cost of randomness

In the previous section, we discussed the primary challenge in obtaining better space vs. decoding time tradeoffs for spectral sparsification in dynamic graph streams. In particular, faster decoding requires an understanding of how to apply “bucketing” methods to what is essentially a sparse recovery problem involving graphs. Our primary technical contribution is the first substantial progress towards this understanding. However, beyond this contribution, obtaining faster decoding time also requires solving another, mostly orthogonal, problem: we need a faster way to generate pseudorandom bits for use in our randomized sketching algorithms. This issue has largely gone unaddressed, as decoding speed has not previously been an objective of prior work on linear sketching algorithms for sparsification [KLM17].

#### Nisan’s pseudorandom number generator.

Like most streaming algorithms, our methods depend heavily on randomness. We compute sketches of the form , where is a randomly constructed matrix with columns and rows. Naively, just storing after random initialization would take space, dominating the space complexity of our algorithms. Accordingly, to obtain truly space efficient methods, we need to find a more compact way of representing the random matrix . This is not a challenge unique to graph sketching – essentially all linear sketching algorithms require efficient ways of representing the sketch matrix .

While there are a number of ways of handling this issue (e.g., many algorithms build using low-independence hash functions), one of the most generic techniques is to generate using a pseudorandom number generator with a small seed. Indyk first applied this idea to algorithms for estimating vector norms in a streaming setting [Ind00]. He showed that any pseudorandom number generator that can “fool” a small space algorithm can also fool any linear sketching algorithm with a small sketch size (i.e., with few rows in ).

Instantiating Indyk’s result with Nisan’s well known pseudorandom number generator [Nis92] allows for to be generated from a seed of just random bits, as long as , or in our case, can be stored in space and can be generated from random bits. Instead of storing , we just need to store this small random seed and columns of all of can be generated “on-the-fly” as needed. This is a powerful result: since Indyk’s original application, Nisan’s generator has become a central tool in streaming algorithm design.

Unfortunately, when runtime is a concern, Nisan’s pseudorandom number generator is a costly option for graph streaming algorithms. If random bits are required to generate , Nisan’s generator requires time to generate even a single random bit from its length seed. In our setting is polynomial in , but upwards of random bits in need to be accessed during the course of our decoding algorithms. Generating these random bits “on-the-fly” using Nisan’s generator would immediately imply an runtime for decoding.

#### A faster pseudorandom generator.

To deal with the cost of generating in a pseudorandom way, in Section 7 we present a pseudorandom generator that is much faster than Nisan’s. In particular, we show that, at least when is polynomial in , it is possible to construct a generator that still uses a seed of just random bits (i.e., only a factor more than Nisan’s) but can generate any pseudorandom bits in time instead of time.

Our generator can be constructed by carefully combining several results from the literature on pseudorandomness. Ultimately, Nisan’s pseudorandom generator requires time to generate a single pseudorandom bit from its length seed because every output bit depends on every seed bit. To avoid this cost, we need a generator that is inherently local, with each pseudorandom bit depending on only seed bits.

While “locally computable” pseudorandom number generators have not been studied directly, there do exist locally computable constructions of extractors, a closely related object [Vad04, Lu02, DPVR12]. The goal of an extractor is to extract a small string of nearly uniform random bits from a long stream of weakly random bits. In certain cryptographic settings, it is desirable to do so in a way that only bases each output bit on a relatively small number of input bits.

Furthermore, it is actually possible to construct a pseudorandom number generator from an algorithm for randomness extraction. In particular, by plugging a locally computable extractor from [DV10] into a pseudorandom generator of Nisan and Zuckerman [NZ96], we obtain a generator that can compute each pseudorandom bit using just seed bits. Naively, this construction can output up to pseudorandom bits using a seed of . We describe a relatively simple iterative process which further expands the output to pseudorandom bits, while still maintaining a generation time of if is constant.
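To illustrate locality only (this is emphatically not the Nisan–Zuckerman/[DV10] construction, and offers no pseudorandomness guarantee), here is a toy generator in which each output bit reads a small fixed subset of seed positions chosen by a public hash:

```python
import hashlib
import random

# Toy "local" generator: each output bit is the XOR of d seed bits, so
# computing one output bit reads only d of the n seed positions.
def local_prg(seed_bits, num_out, d=8, salt=b"demo"):
    n = len(seed_bits)
    out = []
    for i in range(num_out):
        # the d seed positions for output bit i are fixed by the public salt
        h = hashlib.sha256(salt + i.to_bytes(8, "big")).digest()
        rng = random.Random(int.from_bytes(h, "big"))
        positions = rng.sample(range(n), d)
        bit = 0
        for p in positions:          # only d seed bits are ever read
            bit ^= seed_bits[p]
        out.append(bit)
    return out

rng0 = random.Random(0)
seed = [rng0.randrange(2) for _ in range(256)]
stream = local_prg(seed, num_out=16)
assert all(b in (0, 1) for b in stream)
```

The point is purely structural: per-bit cost depends on d, not on the seed length, which is the property the actual construction achieves with provable guarantees.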

There are likely many possible improvements to our basic construction. We hope that by bringing a broader set of tools from the pseudorandomness literature to the streaming algorithms community, we can initiate an exploration of these improvements, leading to faster linear sketching algorithms.

## 4 The algorithm and its analysis

In this section we present our main sketch-based sparsification algorithm (Sparsify, Algorithm 1), and the main result of the section is

###### Theorem 5.

There exists an algorithm (Algorithm 1) such that for any , processes a list of edge insertions and deletions for an unweighted graph in a single pass and maintains a set of linear sketches of this input in space. From these sketches, it is possible to recover, with high probability, a weighted subgraph with edges, such that is a -spectral sparsifier of . The algorithm recovers in time.

Sparsify (Algorithm 1) generally follows the approach of [KLM17], and the main technical contribution of this section is the HeavyEdges algorithm (Algorithm 2). We now outline the main ideas involved in both algorithms and their analysis.

The Sparsify algorithm has a recursive structure: given (a sketch of) a graph to be sparsified, the algorithm proceeds as follows. First, it adds a regularization term to the graph (essentially a multiple of the complete graph) and recursively calls itself, obtaining a coarse (large factor approximation) sparsifier of , whose Laplacian (together with the regularization term) we denote by (see line 7 of Algorithm 1). One then invokes the HeavyEdges procedure (see line 15 of Algorithm 1) to recover all edges of large (about , where is the quality of the approximation of by ) effective resistance in the sample. One then uses to estimate the effective resistance of every edge recovered by the invocation of HeavyEdges, and keeps those edges that were recovered from the sample corresponding to their actual approximated effective resistance (see line 18 of Algorithm 1). This process is similar to the approach of [KLM17], with one nontrivial difference: our HeavyEdges procedure needs a spectral approximation to the sampled graph in order to recover high effective resistance (or heavy) edges – indeed, this spectral approximation is needed in order to perform ball carving in the effective resistance metric. As a consequence, the overall procedure has a somewhat nontrivial recursive structure that we describe next.

#### Recursive structure of Sparsify and HeavyEdges (recursion tree T).

We represent the recursive structure of Sparsify and HeavyEdges by a recursion tree that is described in detail in Section 4.3. Every node of the recursion tree corresponds to subsampling of the input graph that is formally defined in Section 4.3. When an invocation of HeavyEdges or Sparsify performs a recursive call to itself or the counterpart subroutine, it must pass a collection of sketches of a subsampling of its graph as input to the recursive call. We associate the appropriate set of sketches with the corresponding nodes in the tree , so that invocations of our subroutines simply get a node as their first input, and the sketches that correspond to the subtree of . See Section 4.3 for more details.

#### Recursive chain of sparsifiers.

Our recursion tree is a natural generalization of the chain of coarse sparsifiers used in [KLM17]. In particular, in the recursion tree , for any node that corresponds to a call to Sparsify, and any node that corresponds to a call to Sparsify and is a child of , one has

 K_a \preceq_r K_b \preceq_r \Gamma K_a

where $K_a$ and $K_b$ correspond to the Laplacian matrices of and . If node does not have a child that corresponds to a call to Sparsify, then

 K_a \preceq_r \Gamma\cdot\lambda_u\cdot I \preceq_r \Gamma K_a.

See Remark 2 in Section 4.3 for the details.