Popular conjectures imply strong lower bounds for dynamic problems

Abstract

We consider several well-studied problems in dynamic algorithms and prove that sufficient progress on any of them would imply a breakthrough on one of five major open problems in the theory of algorithms:

  1. Is the 3SUM problem on $n$ numbers solvable in $O(n^{2-\varepsilon})$ time for some $\varepsilon > 0$?

  2. Can one determine the satisfiability of a CNF formula on $n$ variables in $O(2^{(1-\varepsilon)n})$ time for some $\varepsilon > 0$?

  3. Is the All Pairs Shortest Paths problem for graphs on $n$ vertices solvable in $O(n^{3-\varepsilon})$ time for some $\varepsilon > 0$?

  4. Is there a linear time algorithm that detects whether a given graph contains a triangle?

  5. Is there an $O(n^{3-\varepsilon})$ time combinatorial algorithm for Boolean matrix multiplication?

The problems we consider include dynamic versions of bipartite perfect matching, bipartite maximum weight matching, single source reachability, single source shortest paths, strong connectivity, subgraph connectivity, diameter approximation and some non-graph problems such as Pagh's problem defined in a recent paper by Pǎtraşcu [STOC 2010].

1 Introduction

Dynamic algorithms are a natural extension of the typical notion of an algorithm: besides computing a function $f$ on an input $x$, the algorithm needs to be able to update the computed function value as $x$ undergoes small changes, without redoing all of the computation. Dynamic algorithms have a multitude of applications, and their study has evolved into a vibrant research area. Among its many successes are efficient dynamic graph algorithms for graph connectivity [48, 87, 74], minimum spanning tree [37, 51, 49], graph matching [81, 14, 68, 43] and approximate shortest paths in undirected graphs [17, 18, 47]. Graph connectivity and minimum spanning tree, for instance, can be supported in only polylogarithmic time per edge update or query.

Nevertheless, there are some dynamic problems that seem stubbornly difficult. For instance, consider maintaining a reachability tree from a fixed vertex under edge insertions or deletions, i.e. the so-called dynamic single source reachability problem (ss-Reach). The best known dynamic ss-Reach algorithm [80] has an update time that improves on the trivial $O(m)$ recomputation time only for very dense graphs. Moreover, the result uses heavy machinery such as fast matrix multiplication, and is currently not practical. There are many such problems, including dynamic shortest paths, maximum matching, strongly connected components, and some non-graph problems such as Pagh's problem [72] supporting set intersection updates and membership queries. For many of these problems, the only known dynamic algorithms recompute the answer from scratch. (Although there has been some success when only insertions or only deletions are to be supported.)

When there are no good upper bounds, lower bounds are highly sought after. Typically, for dynamic data structure problems, one attempts to prove cell probe lower bounds. Unfortunately, the best known cell probe lower bounds are at best polylogarithmic [73], whereas for these hard dynamic problems we would want higher, polynomial lower bounds, i.e. of the form $n^{\delta}$ where $n$ is the size of the input and $\delta > 0$ is an explicit constant. Pǎtraşcu [72] initiated the study of basing the hardness of dynamic problems on a conjecture about the hardness of the 3SUM problem, a problem solvable in quadratic time with no known "truly" subquadratic ($O(n^{2-\varepsilon})$ for constant $\varepsilon > 0$) solutions. He showed that one can indeed prove conditional polynomial lower bounds for some notable problems such as transitive closure and shortest paths.

Other papers have considered proving conditional lower bounds for specific problems. Roditty and Zwick [79] for instance showed tight lower bounds for decremental and incremental single source shortest paths, based on the conjecture that all pairs shortest paths (APSP) cannot be solved in truly subcubic time. Chan [20] showed that a fast algorithm for subgraph connectivity would imply an unusually fast algorithm for finding a triangle in a graph. Some other works compare the complexity of their dynamic problem of study to the complexity of Boolean matrix multiplication [79, 47]. However, the only systematic study of conditional lower bounds for a larger collection of dynamic problems is Pǎtraşcu’s paper [72].

In this paper we expand on Pǎtraşcu's idea and prove strong conditional lower bounds for a much larger collection of dynamic problems, based on five well-known conjectures: the 3SUM, All Pairs Shortest Paths, Triangle and Boolean Matrix Multiplication conjectures, and the Strong Exponential Time Hypothesis; we define these formally below. In Section 4 we discuss the prior work on these conjectures and some potential relationships between them. As far as we know, any subset of the conjectures below could be false while the rest remain true. Hence it is interesting to have lower bounds based on each one of them.

Conjecture 1 (No truly subquadratic 3SUM).

In the Word RAM model with words of $O(\log n)$ bits, any algorithm requires $n^{2-o(1)}$ time in expectation to determine whether a set $S \subset \{-n^3, \ldots, n^3\}$ of $|S| = n$ integers contains three distinct elements $a, b, c \in S$ with $a + b = c$.

Conjecture 2 (No truly subcubic APSP).

There is a constant $c$, such that in the Word RAM model with words of $O(\log n)$ bits, any algorithm requires $n^{3-o(1)}$ time in expectation to compute the distances between every pair of vertices in an $n$ node graph with edge weights in $\{1, \ldots, n^c\}$.

Conjecture 3 (Strong Exponential Time Hypothesis (SETH)).

For every $\varepsilon > 0$, there exists a $k$, such that SAT on $k$-CNF formulas on $n$ variables cannot be solved in $O^*(2^{(1-\varepsilon)n})$ time.

Conjecture 4 (No almost linear time triangle).

There is a constant $\delta > 0$, such that in the Word RAM model with words of $O(\log m)$ bits, any algorithm requires $m^{1+\delta-o(1)}$ time in expectation to detect whether an $m$ edge graph contains a triangle.

“Conjecture” 5 (No truly subcubic combinatorial BMM).

In the Word RAM model with words of $O(\log n)$ bits, any combinatorial algorithm requires $n^{3-o(1)}$ time in expectation to compute the Boolean product of two $n \times n$ matrices.

This paper is the first study that relates the complexity of any dynamic problem to the exact complexity of Boolean Satisfiability (via the SETH). Our lower bounds hold even for randomized fully dynamic algorithms with (expected) amortized update times. Most of our results also hold for partially dynamic (incremental and decremental) algorithms with worst-case time bounds.

Interestingly, many of our lower bounds (those based on the SETH) hold even when one allows arbitrary polynomial preprocessing time, and achieve essentially optimal guarantees. These are the first lower bounds of this nature.

Most of our lower bounds also hold in the setting when one knows the list of updates and queries in advance, i.e. in the lookahead model. This is of interest since many dynamic problems can be solved faster given sufficient lookahead, e.g. graph transitive closure [83] and matrix rank [59].

Organization.

In Section 2 we discuss our results and the prior work on the problems we address. In Section 3 we describe our techniques. In Section 4 we give an overview of the prior work on the conjectures. In Section 5 we give a formal statement of the theorems we prove. The problems we consider are summarized in Table 1 and the results are summarized in Table 2. In Section 6 we define some useful notation and prove reductions between dynamic problems. In Section 7 we prove lower bounds based on Conjecture 3 (SETH). In Section 8 we prove lower bounds based on Conjectures 4 and 5 (Triangle and BMM). In Section 9 we prove lower bounds based on Conjecture 2 (APSP). And finally, in Section 10 we prove lower bounds based on Conjecture 1 (3SUM).

2 Prior work and our results

Below we define each of the problems we consider and discuss the implications of our results for each problem in turn. The problem definitions are also summarized in Table 1, and our results for each problem are summarized in Table 2.

Maximum cardinality bipartite matching.

The maximum cardinality bipartite matching problem has a long history. In a seminal paper, Hopcroft and Karp [52] designed an $O(m\sqrt{n})$ time algorithm for the problem in bipartite graphs with $m$ edges and $n$ nodes. Mucha and Sankowski [67] (and Harvey [46]) improved their result for dense graphs by giving an $O(n^{\omega})$ time algorithm, where $\omega < 2.373$ is the matrix multiplication exponent [89]. In a breakthrough paper earlier this year, Madry [65] devised the first improvement over the Hopcroft-Karp algorithm for sparse bipartite graphs, with a runtime of $\tilde{O}(m^{10/7})$.

The amazing algorithms for the static case of the problem do not seem to imply efficient dynamic algorithms, however. Since a single edge update can cause the addition of at most one augmenting path, a trivial fully dynamic algorithm for maximum bipartite matching has update time $O(m)$. The only improvement over this is a result by Sankowski [81], who gave a fully dynamic algorithm with an amortized update time of $O(n^{1.495})$. His result uses fast matrix multiplication and is only an improvement for sufficiently dense graphs. Two questions emerge.

(1) Is the use of matrix multiplication inherent?

(2) Can one get an improvement over the trivial algorithm when the graph is sparse?

We first address question (1). We show that any improvement over the trivial algorithm implies a nontrivial algorithm for Boolean matrix multiplication, thus showing that the use of matrix multiplication is indeed inherent. We partially address question (2) by showing three interesting consequences of a dynamic algorithm for maximum bipartite matching that has (amortized) update and query time $O(m^{1-\varepsilon})$ for some $\varepsilon > 0$.

First, we show that such an algorithm with a large enough $\varepsilon$ would imply an improvement on the longstanding running time bound [6, 5] for the triangle detection problem in sparse graphs. In fact, Conjecture 4 implies that there is some $\delta > 0$ for which $m^{\delta}$ (amortized, expected) update or query time is necessary. Second, we show that such an algorithm with a large enough $\varepsilon$ would imply that 3SUM is in truly subquadratic time, thus falsifying Conjecture 1. Finally, we show that any combinatorial algorithm with any $\varepsilon > 0$ (and truly subcubic preprocessing time) falsifies “Conjecture” 5. All of our results apply also to the bipartite perfect matching problem (BPMatch).
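To make the trivial baseline in this discussion concrete, the following is a minimal, self-contained sketch (our own illustration, not code from any of the cited works; all class and method names are hypothetical) of the $O(m)$-per-update approach: after each edge change the matching can change by at most one augmenting path, so a single augmenting-path search restores a maximum matching.

```python
from collections import deque

class TrivialDynamicMatching:
    """Hypothetical sketch: fully dynamic maximum bipartite matching with
    O(m + n) worst-case time per edge update via one augmenting-path search
    after every change (the 'trivial' algorithm discussed in the text)."""

    def __init__(self):
        self.adj = {}        # left vertex -> set of right neighbors
        self.match_l = {}    # left vertex  -> matched right vertex
        self.match_r = {}    # right vertex -> matched left vertex

    def _augment_once(self):
        # Multi-source BFS over the alternating graph: unmatched edges go
        # left -> right, matched edges go right -> left.  Reaching a free
        # right vertex yields an augmenting path, recovered via parents.
        parent = {}
        queue = deque(u for u in self.adj if u not in self.match_l)
        seen = set(queue)
        while queue:
            u = queue.popleft()
            for v in self.adj[u]:
                if ('R', v) in seen:
                    continue
                seen.add(('R', v))
                parent[('R', v)] = u
                w = self.match_r.get(v)
                if w is None:                     # free right vertex: augment
                    while True:
                        u2 = parent[('R', v)]
                        prev = self.match_l.get(u2)
                        self.match_l[u2], self.match_r[v] = v, u2
                        if prev is None:
                            return True
                        v = prev
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        return False

    def insert_edge(self, u, v):                  # u on the left, v on the right
        self.adj.setdefault(u, set()).add(v)
        self._augment_once()                      # size grows by at most one

    def delete_edge(self, u, v):
        self.adj.get(u, set()).discard(v)
        if self.match_l.get(u) == v:              # a matched edge disappeared
            del self.match_l[u]
            del self.match_r[v]
            self._augment_once()

    def matching_size(self):
        return len(self.match_l)
```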

Approximately maximum matching.

In the absence of good dynamic algorithms for maximum matching, recent research has focused on developing efficient algorithms for dynamically maintaining approximately maximum matchings. Ivkovic and Lloyd [56] presented the first such algorithm, maintaining a maximal matching (and hence a $2$-approximate maximum matching) with sublinear (in $m+n$) update time. Baswana, Gupta and Sen [14] developed a randomized dynamic algorithm for maximal matching with $O(\log n)$ expected amortized update time. Neiman and Solomon [68] presented a deterministic $O(\sqrt{m})$ worst case update time algorithm that maintains a $3/2$-approximate maximum matching. Finally, Gupta and Peng [43] showed that with the same update time one can maintain a $(1+\varepsilon)$-approximation for any constant $\varepsilon > 0$.

All of the above papers except [43] obtain an approximate maximum matching by maintaining a matching that does not admit short augmenting paths. It is well known that for any $k \ge 1$, if a matching does not admit augmenting paths of length at most $2k-1$, then it is a $(1+1/k)$-approximate maximum matching (see the calculation below). The algorithms for maximal matching exclude length-$1$ augmenting paths, and the $3/2$-approximation algorithm of [68] excludes length-$1$ and length-$3$ augmenting paths.
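For completeness, here is the standard calculation behind this folklore fact, stated in the usual parametrization (the paper's exact constants may be phrased slightly differently):

```latex
\[
  \text{If } M \text{ admits no augmenting path of length} \le 2k-1, \text{ then}\quad
  |M^*| - |M| \;\le\; \frac{|M|}{k}
  \quad\Longrightarrow\quad
  |M| \;\ge\; \frac{k}{k+1}\,|M^*| .
\]
% Proof sketch: in the symmetric difference M \oplus M^*, every component with
% an excess M^*-edge is an augmenting path for M, hence has length at least
% 2k+1 and contains at least k edges of M per excess M^*-edge; summing the
% excess over all components gives the inequality above.
```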

We show an inherent limitation to this approach for maintaining an approximately maximum matching. In particular, we show that there exists a constant $k$ such that any dynamic algorithm that maintains a matching that excludes augmenting paths of length at most $k$ can be converted into an algorithm for 3SUM, triangle detection and Boolean matrix multiplication. Our results are the same as those for BPMatch: a sufficiently small polynomial update time for the problem falsifies Conjecture 1, Conjecture 4, and, if the algorithm is combinatorial, “Conjecture” 5. In particular, the above results imply that using the augmenting paths approach for dynamic approximate matching is unlikely to yield a result such as Gupta and Peng's algorithm.

Maximum weight bipartite matching.

There are several weighted versions of the bipartite matching problem, all equivalent to each other: find a maximum weight matching, find a maximum weight perfect matching, find a minimum weight perfect matching (also known as the assignment problem). We will refer to the weighted matching problem as MWM. The first polynomial time algorithm for MWM, the Hungarian algorithm, was proposed by Kuhn [60]. Using Fibonacci heaps [38], its runtime is $O(n(m + n\log n))$. When the edge weights are integers bounded by $N$, on a word-RAM with $O(\log(nN))$ bit words, Gabow and Tarjan [40, 41] and a recent improvement by Duan and Su [32] give scaling algorithms for the problem running in $\tilde{O}(m\sqrt{n}\log N)$ time. Sankowski [82] gave an $\tilde{O}(Nn^{\omega})$ time algorithm.

The dynamic case of the problem seems less studied. It is not hard to obtain a fully dynamic algorithm for MWM that answers queries about the weight of the MWM in constant time and performs edge updates in $\tilde{O}(m)$ time. The algorithm is based on the Edmonds-Karp algorithm [33] and performs each update by searching for a shortest augmenting path. No dynamic algorithms for exact MWM are known with update time polynomially below this trivial bound. The only result for the dynamic problem is an algorithm by Anand et al. [7] that maintains an approximate MWM with small expected amortized update time, where the approximation factor and the update time depend on the ratio between the maximum and minimum edge weight.

A natural question is: is it inherently hard to obtain faster dynamic MWM algorithms? We address this question by showing that any dynamic MWM algorithm, even a decremental or incremental one, with amortized update time $O(n^{2-\varepsilon})$ for some constant $\varepsilon > 0$ in dense $n$-node graphs would imply a truly subcubic APSP algorithm, thus explaining the lack of progress on the problem.

Subgraph Connectivity.

The subgraph connectivity problem (SubConn) is as follows: given a graph $G$, maintain a subset $S$ of its vertices, where the updates add/remove a node of $G$ to/from $S$, and the queries ask whether a query node $u$ is reachable from a query node $v$ in the subgraph of $G$ induced by $S$. SubConn is a version of the graph connectivity problem in which one needs to support vertex updates instead of edge updates. As mentioned earlier, graph connectivity has extremely efficient algorithms (e.g. [87]). However, the obvious way of simulating vertex updates using edge updates is to insert/delete all edges incident to the vertex that is being inserted/deleted (a sketch follows below). As the degree of a vertex can be linear, this type of simulation cannot give better than linear update time for SubConn. Thus SubConn seems much more difficult than graph connectivity.
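As an illustration of why this simulation is too slow, here is a minimal sketch (hypothetical interface names; `conn` stands for any fully dynamic edge-connectivity structure) of the obvious vertex-update simulation:

```python
def toggle_vertex(conn, adj, active, v, turn_on):
    """Simulate a SubConn vertex update using edge updates: when v enters or
    leaves the active set S, insert or delete every edge between v and an
    already-active neighbor.  `conn` exposes insert_edge/delete_edge (e.g. a
    structure in the style of [87]); the cost is deg(v) edge updates, which
    can be Theta(n) for a single vertex update."""
    if turn_on:
        active.add(v)
        for u in adj[v]:
            if u in active:
                conn.insert_edge(u, v)
    else:
        active.discard(v)
        for u in adj[v]:
            if u in active:
                conn.delete_edge(u, v)
```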

The SubConn problem was first introduced by Frigioni and Italiano [39] in the context of communication networks where processors may become faulty and later can come back online. They obtained an efficient dynamic algorithm for planar graphs. Later, Chan [20] studied the problem in general graphs because of its applications to geometric connectivity problems. In such problems, one is to maintain a set of axis-parallel boxes in $d$ dimensions under insertions and deletions so that one can answer queries about whether there is a path between any two given points that is contained within the union of the boxes. Chan showed that for any constant $d$, the box connectivity problem can be reduced to subgraph connectivity, so any dynamic algorithm for subgraph connectivity immediately implies an algorithm for geometric connectivity. Chan also showed that subgraph connectivity can be reduced to the box connectivity problem in a constant number of dimensions, thus showing that subgraph connectivity and box connectivity are essentially equivalent problems. Chan, Pǎtraşcu and Roditty [21] further showed that a variety of other geometric connectivity problems are reducible to the SubConn problem.

Chan [20] obtained the first nontrivial algorithm for SubConn, with polynomial preprocessing time and sublinear amortized update and query times. Later, Chan, Pǎtraşcu and Roditty [21] improved these bounds, obtaining an algorithm with $\tilde{O}(m^{4/3})$ preprocessing time, $\tilde{O}(m^{2/3})$ amortized update time and $\tilde{O}(m^{1/3})$ query time. Duan [31] presented algorithms with better space usage.

Pǎtraşcu [72] showed that unless Conjecture 1 is false, there is some constant $\delta > 0$ such that SubConn cannot be solved with truly subquadratic preprocessing time and $O(m^{\delta})$ update and query time. Here we exhibit an explicit such $\delta$ for which Pǎtraşcu's result holds. Moreover, we show that, assuming Conjecture 1, there is a tradeoff lower bound between the query and update times of fully dynamic algorithms for SubConn: unless 3SUM has truly subquadratic algorithms, after truly subquadratic preprocessing one cannot make the update time polynomially smaller without forcing the query time to be polynomially larger.

Chan [20] showed that any dynamic algorithm for SubConn with polynomial preprocessing time and sufficiently small polynomial update and query times would imply a nontrivial algorithm for triangle detection. His result implies that if Conjecture 4 is true, then any such algorithm must have update or query time that is at least a fixed polynomial, i.e. the same kind of conclusion as Pǎtraşcu's under Conjecture 1.

Here we improve Chan’s result slightly. In particular, we show that one can reduce the triangle detection problem on edge, node graphs to dynamic SubConn with updates and only queries. This implies that any combinatorial dynamic algorithm with truly sublinear query time ( for some ) and truly subcubic in preprocessing time, must have update time for all , unless “Conjecture” 1 is false. (Notice that it is trivial to get query time and update time.) Thus, if the algorithm of [21] can be improved to have update time , we would have a new alternative BMM algorithm. Our results hold even for the special case -SubConn of SubConn in which we only care about whether two fixed vertices are connected in the subgraph .

Subgraph Connectedness.

Chan [20] identifies a problem closely related to SubConn that nevertheless seems much more difficult. The problem is Subgraph Connectedness (ConnSub): similarly to SubConn, one has to maintain a subset $S$ of the vertices of a fixed graph under vertex additions and removals, but the query one needs to answer is whether the subgraph induced by $S$ is connected.

The best and only known algorithm for ConnSub is to recompute the connectivity information (via DFS in $O(m+n)$ time) after each update or at each query. Here we explain this lack of progress by showing that unless the SETH (Conjecture 3) is false, any algorithm for ConnSub, even with arbitrary polynomial preprocessing time, must either have essentially linear update time or essentially linear query time. Thus, the trivial algorithm is essentially optimal under the SETH.

Our result holds even for a special case of the problem called SubUnion, originally identified by Chan: given a fixed collection $F$ of sets of total size $N$ over a universe $U$ such that $\bigcup_{X \in F} X = U$, maintain a subcollection $S \subseteq F$ under set insertions and deletions, while answering the query whether the union of the sets in $S$ is exactly $U$. That is, the query is exactly "Is $S$ a set cover?".

Single Source Reachability.

Unlike in undirected graphs, where great results are possible (e.g. [87]), the reachability problem in directed graphs is surprisingly tough in the dynamic setting, even when only the reachability between two fixed vertices $s$ and $t$ is to be maintained ($st$-Reach). The trivial algorithm that recomputes the reachability after each update or at each query is still the best known algorithm in the case of sparse graphs. For dense graphs, Sankowski [80] showed that one can get a better update time, obtaining subquadratic update time and sublinear query time for $st$-Reach, and a corresponding tradeoff for single source reachability (ss-Reach). Sankowski's result improved on the first sublinear-update-time result, by Demetrescu and Italiano [30], for ss-Reach.

Both of these results heavily rely on fast matrix multiplication. Here we show that this is inherent: any algorithm with truly subcubic (in $n$) preprocessing time and truly subquadratic query and update times can be converted, without significant overhead, into a truly subcubic time algorithm for Boolean matrix multiplication. Thus, any such combinatorial algorithm would falsify “Conjecture” 5.

Pǎtraşcu [72] showed that, assuming Conjecture 1, there is some $\delta > 0$ such that fully dynamic transitive closure requires either large polynomial preprocessing time, or update or query time $m^{\delta}$. Here we slightly extend his result, showing that under the 3SUM conjecture the same holds already for $st$-Reach. Similar to our results on SubConn, we also exhibit a tradeoff: under the 3SUM conjecture, making the update time of $st$-Reach polynomially smaller forces the query time to be polynomially larger, even after subquadratic preprocessing.

The single source reachability problem has been studied in the partially dynamic setting as well. In the incremental setting, it is not hard to obtain an algorithm for ss-Reach with constant amortized update time and constant query time. From the work of Even and Shiloach [35] one obtains a decremental ss-Reach algorithm with $O(n)$ amortized update time (and constant query time). For the special case of DAGs, Italiano [55] obtained a decremental ss-Reach algorithm with constant amortized update and query times.

In this paper we show that any combinatorial incremental or decremental algorithm for ss-Reach (and also $st$-Reach) must have polynomially large worst case update or query time, even in the special case of dense DAGs, assuming “Conjecture” 5. Thus, deamortizing Italiano's DAG ss-Reach algorithm, or Even and Shiloach's algorithm for general graphs, would have interesting consequences for matrix multiplication algorithms.

Finally, we consider a version of ss-Reach, #SSR, in which we want to dynamically answer queries about how many nodes are reachable from the fixed source. We note that any algorithm that explicitly maintains a reachability tree can answer this counting query. We show strong lower bounds based on the SETH for #SSR: even after polynomial preprocessing time, a dynamic algorithm cannot beat the trivial recomputation algorithm by any, however small, polynomial factor. Hence, in particular, under the SETH no nontrivial algorithm for ss-Reach can maintain the size of the reachability tree.

Incremental/Decremental Single Source Shortest Paths (SSSP)

Roditty and Zwick [79] showed that any decremental or incremental algorithm for SSSP in $n$-node graphs with truly subcubic preprocessing time, truly subquadratic update time and truly sublinear query time implies a truly subcubic time algorithm for APSP. The trivial algorithm for the problem recomputes the shortest paths from the source in $\tilde{O}(m)$ time after each update, via Dijkstra's algorithm. Hence, [79] showed that any tiny polynomial improvement over this result would falsify Conjecture 2.

Their result, however, did not exclude the possibility of an algorithm in which both the update time and the query time are truly subquadratic. This is exactly what our result excludes, again based on the APSP conjecture. In fact, we show it for the seemingly easier problem of incremental/decremental $st$-shortest path.

Strongly Connected Components.

Dynamic algorithms for maintaining the strongly connected components have many applications. One example is in compilers research, to speed up pointer analysis by finding cyclic relationships; other examples are listed in [44]. In the partially dynamic setting, nontrivial results are known. Haeupler et al. [44] presented an incremental algorithm that maintains the strongly connected components (SCCs) of a graph with $O(\sqrt{m})$ amortized update time, while being able to answer queries of the form "are $u$ and $v$ in the same SCC?". Bender et al. [15, 16] improved the total update time for the case of dense graphs to $O(n^2 \log n)$ (thus getting $O(n^2 \log n / m)$ amortized update time). In the decremental setting, Roditty and Zwick [78], Lacki [61], and Roditty [76] obtained algorithms with $O(n)$ amortized update time. The algorithm of [78] was randomized, whereas Lacki's was deterministic, and Roditty improved the preprocessing time. No nontrivial results are known for the fully dynamic setting.

Our results are manifold. First, we show results similar to those for $st$-Reach. That is, under “Conjecture” 5 any combinatorial fully dynamic algorithm must have either $n^{3-o(1)}$ preprocessing time, or $n^{2-o(1)}$ update or query time. The same bounds apply to partially dynamic algorithms, but for worst-case update times. Thus, if the known algorithmic results can be deamortized, we would have an alternative BMM algorithm. Under Conjecture 1, either the preprocessing time or the update or query time must be at least a certain fixed polynomial.

The above results hold even for the special case of the problem in which we only want to answer the query "Is the graph strongly connected?" (SC). Next, we consider a variation of the problem, which we call SC2, in which the query is "Does the graph have more than 2 strongly connected components?". We note that all known algorithms for dynamic SCC explicitly store the SCCs of the graph and hence can also solve SC2. We show, surprisingly, that SC2 may be a much more difficult problem than SC. In particular, any algorithm for SC2 with arbitrary polynomial preprocessing time must have essentially linear update or query time, unless the SETH fails. That is, either Conjecture 3 is false and we have a breakthrough in the area of SAT algorithms, or the trivial algorithm for SC2 is essentially optimal.

As before, our results also hold for partially dynamic algorithms, but for worst-case update times, implying that deamortizing the results of [44, 15, 78, 61, 76] is SETH-hard.

The same lower bounds under SETH hold for any of the two following variants of dynamic SCC:

  • AppxSCC: approximate the number of SCCs within some constant factor,

  • MaxSCC: determine the size of the largest SCC.

We also consider the dynamic $ST$-Reach problem under edge updates: given fixed node sets $S$ and $T$, determine after each update whether every node in $T$ is reachable from every node in $S$. We are able to prove much stronger update and query lower bounds for it: even after polynomial preprocessing time, the update or query time of any dynamic algorithm must essentially match the trivial update time, even when the graph is sparse.

Pagh’s problem.

Pǎtraşcu [72] introduced a problem that he called Pagh's problem (PP), defined as follows: maintain a collection $X_1, \ldots, X_k$ of at most $k$ sets over a universe $[n]$ under the following operation: given indices $i, j$, add the set $X_i \cap X_j$ to the collection, while answering queries of the form "Does element $z$ belong to $X_i$?". (We can assume that $k$ is polynomial in $n$.) The best known dynamic algorithm for PP is the trivial one: perform the set intersection explicitly in $O(n)$ time at each update and store the sets in a dictionary for which membership tests are efficient (a sketch follows below). We introduce a natural variant of Pagh's problem, which we call $\emptyset$-PP ("Pagh's Problem with Emptiness Queries"), where the query is changed to "Is $X_i$ empty?". There is also no nontrivial algorithm known for $\emptyset$-PP.
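A minimal sketch of this trivial algorithm (illustrative code with hypothetical names), covering both the membership query of PP and the emptiness query of $\emptyset$-PP:

```python
class TrivialPagh:
    """Trivial algorithm for Pagh's problem: sets are stored explicitly, so
    an update (insert X_i intersect X_j) costs O(n) time and both membership
    and emptiness queries cost O(1) expected time."""
    def __init__(self, initial_sets):
        self.sets = [set(s) for s in initial_sets]

    def update(self, i, j):
        self.sets.append(self.sets[i] & self.sets[j])   # add X_i ∩ X_j
        return len(self.sets) - 1                       # index of the new set

    def member(self, i, z):        # PP query: is z in X_i ?
        return z in self.sets[i]

    def empty(self, i):            # emptiness-query variant: is X_i empty ?
        return not self.sets[i]
```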

The reductions in [72] imply (after some work) that if Conjecture 1 is true, then any dynamic algorithm for PP must have either large polynomial preprocessing time or polynomial update or query time. We prove the same type of conditional lower bound for $\emptyset$-PP. Based on Conjecture 4, we show that any algorithm for PP or $\emptyset$-PP must have either superlinear preprocessing time or polynomial update or query time. (The result for PP holds only for a restricted range of the parameters; however, this is still interesting since, as far as we know, there may be no linear time algorithm for Triangle detection.) We obtain that under “Conjecture” 5, any algorithm for PP or $\emptyset$-PP must have either large polynomial preprocessing time or polynomial update or query time. Finally, we also relate $\emptyset$-PP to the SETH (Conjecture 3), making it the only problem for which we can prove lower bounds based on all of the conjectures except Conjecture 2. We show that, under the SETH, there is no nontrivial algorithm for $\emptyset$-PP, even allowing arbitrary polynomial preprocessing time. Thus, if the SETH is true, any algorithm for PP that beats trivial recomputation cannot also answer emptiness queries.

Diameter Approximation.

The graph diameter is the largest distance in the graph. One can compute the diameter in the same time as computing all pairs shortest paths, and no better algorithms are known. There are faster algorithms that achieve constant factor approximations for the diameter, however. A folklore result is that in linear time one can obtain a $2$-approximation. Aingworth et al. [3] improved the approximation factor, obtaining an (almost) $3/2$-approximation for unweighted graphs that runs in $\tilde{O}(m\sqrt{n} + n^2)$ time. Roditty and Vassilevska Williams [77] improved the running time to $\tilde{O}(m\sqrt{n})$ with randomization, and Chechik et al. [22] obtained deterministic $3/2$-approximation algorithms running in $\tilde{O}(m^{3/2})$ time that also work for weighted graphs. Roditty and Vassilevska Williams showed that any $(3/2-\varepsilon)$-approximation algorithm that runs in $O(m^{2-\delta})$ time in undirected unweighted graphs with $m$ edges, for any constants $\varepsilon, \delta > 0$, would violate the SETH.

In some applications, an efficient dynamic algorithm for diameter estimation may be useful. The above result does not exclude the possibility that, after some preprocessing, one can update the estimate of the diameter faster than recomputing it. Here we show that if, for some constant $\delta > 0$, there is an algorithm that after arbitrary polynomial time preprocessing can maintain a sufficiently accurate constant-factor approximation to the diameter of a graph on $m$ edges in $O(m^{1-\delta})$ amortized update time, then the SETH is false. That is, the trivial recomputation of the diameter is essentially optimal.

3 Description of our techniques

Lower bounds based on the SETH.

We begin all of our reductions with an idea used in prior reductions from the SETH in [75, 77, 22].

We assume that the Strong Exponential Time Hypothesis holds. Thus, for every $\varepsilon > 0$, there is some $k$, such that $k$-SAT cannot be solved faster than in $2^{(1-\varepsilon)n}$ time. Using this, for each $\varepsilon$ we work with a carefully chosen $k$. Given an instance of $k$-SAT on $n$ variables, we first use the sparsification lemma of Impagliazzo, Paturi and Zane [53] to convert it to a small number of $k$-CNF formulas with $n$ variables and $O(n)$ clauses each. Now we can assume that the given formula has a linear number of clauses. After this, we construct a graph as follows.

Split the variables into two sets $U_1$ and $U_2$ of size $n/2$ each. We create a set $A$ of $2^{n/2}$ nodes, each corresponding to a partial assignment to the variables in $U_1$. Similarly, we create a set $B$ of $2^{n/2}$ nodes, each corresponding to a partial assignment to the variables in $U_2$. We also create a set $C$ of nodes, one corresponding to each clause.

Suppose now that we add a directed edge from each partial assignment $\alpha \in A$ to a clause $c \in C$ if and only if $\alpha$ does not satisfy $c$, and a directed edge from $c \in C$ to a partial assignment $\beta \in B$ if and only if $\beta$ does not satisfy $c$. Then, there is a satisfying assignment to the formula if and only if there is a pair of nodes $\alpha \in A$ and $\beta \in B$ such that $\beta$ is not reachable from $\alpha$. Hence, any algorithm that can solve this static reachability problem on graphs with $O(2^{n/2})$ nodes and $\tilde{O}(2^{n/2})$ edges in $O(2^{(1-\varepsilon)n})$ time for constant $\varepsilon > 0$ would imply an $O(2^{(1-\varepsilon)n})$ time algorithm for $k$-SAT (for the $k$ obtained from the sparsification lemma). We have chosen $k$, however, so that we obtain a contradiction, and hence the SETH must be false.

Similar constructions to the above are used in prior papers [75, 77, 22]. We adapt the above argument for the case of dynamic #SSR (counting the number of nodes reachable from a source) as follows. (The reductions to the remaining problems use a similar approach with some extra work.)

Instead of having all nodes of $A$ in the above graph, our dynamic graph contains a single node $u$ (together with $B$, $C$, and the fixed edges from clauses to the partial assignments in $B$). We have $2^{n/2}$ stages, one for each partial assignment $\alpha \in A$. In each stage, we add edges from $u$ to $C$, but only to the neighbors of $\alpha$ in $C$, i.e. the clauses that $\alpha$ does not satisfy. Say we have inserted $d$ such edges. After the insertions, we ask the query "Is the number of nodes reachable from $u$ less than $1 + d + |B|$?". If the answer to the query is yes, then the formula is satisfiable, and we can stop. Otherwise, $\alpha$ cannot be completed to a satisfying assignment. We then remove all the edges inserted in this stage and move on to the next partial assignment of $A$.

The graph has $\tilde{O}(2^{n/2})$ edges and $O(2^{n/2})$ vertices, and we perform $\tilde{O}(2^{n/2})$ updates and $O(2^{n/2})$ queries. Hence, any dynamic algorithm with $O(2^{(1-\varepsilon)n})$ preprocessing time and $O(2^{(1/2-\varepsilon)n})$ update and query times, for some constant $\varepsilon > 0$, would yield an $O(2^{(1-\varepsilon)n})$ time algorithm for CNF-SAT and thus violate the SETH.
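The following self-contained toy sketch mirrors the stage structure of this reduction. The dynamic #SSR structure is replaced by a stand-in that recomputes reachability by graph search at every query (exactly the trivial algorithm the lower bound says is hard to beat); all names are ours, and the sparsification step is omitted.

```python
from itertools import product

class RecomputeSSR:
    """Stand-in for a dynamic #SSR structure: recompute reachability from the
    fixed source by graph search on every query (the trivial algorithm)."""
    def __init__(self, nodes, source):
        self.adj = {v: set() for v in nodes}
        self.source = source
    def insert_edge(self, u, v): self.adj[u].add(v)
    def delete_edge(self, u, v): self.adj[u].discard(v)
    def num_reachable(self):
        seen, stack = {self.source}, [self.source]
        while stack:
            for w in self.adj[stack.pop()]:
                if w not in seen:
                    seen.add(w); stack.append(w)
        return len(seen)

def cnf_sat_via_dynamic_ssr(num_vars, clauses):
    """Decide CNF-SAT with one stage (a few edge insertions, one counting
    query, then deletions) per partial assignment to the first half of the
    variables.  A literal is a signed integer; a clause is a tuple of literals."""
    half = num_vars // 2
    first, second = range(1, half + 1), range(half + 1, num_vars + 1)

    def satisfies(assign, clause):            # assign: partial dict var -> bool
        return any((lit > 0 and assign.get(lit) is True) or
                   (lit < 0 and assign.get(-lit) is False) for lit in clause)

    # Static part: source u, one node per clause, one node per partial
    # assignment beta to the second half; fixed edge clause -> beta whenever
    # beta does NOT satisfy the clause.
    betas = [dict(zip(second, bits))
             for bits in product([False, True], repeat=len(second))]
    nodes = ['u'] + [('c', i) for i in range(len(clauses))] \
                  + [('b', j) for j in range(len(betas))]
    D = RecomputeSSR(nodes, 'u')
    for i, clause in enumerate(clauses):
        for j, beta in enumerate(betas):
            if not satisfies(beta, clause):
                D.insert_edge(('c', i), ('b', j))

    # One stage per partial assignment alpha to the first half.
    for bits in product([False, True], repeat=len(first)):
        alpha = dict(zip(first, bits))
        inserted = [('c', i) for i, cl in enumerate(clauses)
                    if not satisfies(alpha, cl)]
        for c in inserted:
            D.insert_edge('u', c)
        # Some beta unreachable  <=>  alpha extends to a satisfying assignment.
        if D.num_reachable() < 1 + len(inserted) + len(betas):
            return True
        for c in inserted:
            D.delete_edge('u', c)
    return False
```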

Now, suppose that we could only achieve update and query time $O(N^{1-\varepsilon})$ on $N$-node graphs after $N^{\gamma}$ preprocessing time, for some big constant $\gamma$. Then we could still contradict the SETH by modifying the above construction further. Instead of splitting the variables into two parts of $n/2$ variables each, we split them into a part $U_1$ of size $(1-\alpha)n$ and a part $U_2$ of size $\alpha n$, for some small constant $\alpha > 0$. Then we apply exactly the same construction as above, where $A$ is the set of partial assignments to $U_1$ and $B$ is the set of partial assignments to $U_2$.

The number of vertices and edges of the graph is now $\tilde{O}(2^{\alpha n})$. Hence the preprocessing takes only $2^{O(\gamma \alpha n)}$ time, which is at most $2^{n/2}$ for small enough $\alpha$. The number of updates and queries we perform is still $\tilde{O}(2^{(1-\alpha)n})$, but since the graph is much smaller, update and query times of $O(N^{1-\varepsilon})$ imply a runtime of roughly

$$2^{(1-\alpha)n} \cdot 2^{\alpha n (1-\varepsilon)} \;=\; 2^{(1-\alpha\varepsilon)n}$$

(excluding polynomial factors) for solving the SAT instance. Hence we again violate the SETH.

Lower bounds from Triangle Detection and BMM.

To obtain lower bounds based on “Conjecture” 5, we first obtain lower bounds from Conjecture 4 that hold for an arbitrary constant $\delta > 0$ and an arbitrary number of edges $m$, and then apply them with $m = \Theta(n^2)$ and a carefully chosen $\delta$ to obtain the lower bounds from BMM. For instance, if Conjecture 4 with constant $\delta$ implies that a problem cannot have a dynamic algorithm with certain preprocessing, update and query times (as functions of $m$ and $\delta$), then “Conjecture” 5 implies that the problem cannot have a dynamic algorithm with the corresponding times obtained by substituting $m = n^2$ and choosing $\delta$ so that a faster algorithm would give a truly subcubic combinatorial triangle (and hence BMM) algorithm.

Our reductions from Triangle Detection typically begin with the following construction. Given a graph $G$ on $m$ edges and $n$ vertices, we create a constant number of copies $V_1, V_2, \ldots$ of its vertex set $V$, and for each edge $(u,v)$ of $G$ we add the directed edges $(u_i, v_{i+1})$ and $(v_i, u_{i+1})$ between consecutive copies, where $u_i$ is the copy of $u$ in $V_i$. Now $G$ contains a triangle if and only if, for some node $x$, there is a path from the first copy of $x$ to its last copy. Since the new graph has $O(m)$ edges and $O(n)$ vertices, it suffices to simulate these reachability queries with dynamic algorithms for the problem at hand.

For $st$-Reach, for instance, we add two additional nodes $s$ and $t$ to the above graph and proceed in stages, one for each node $x$. In each stage, we add an edge from $s$ to the first copy of $x$ and an edge from the last copy of $x$ to $t$, and ask whether $t$ is reachable from $s$. This will be the case iff $x$ appears in a triangle in $G$. If the answer to the query is no, we remove the edges incident to $s$ and $t$ and move on to the next stage. The number of queries and updates is $O(n)$ overall, and hence any dynamic algorithm with small enough polynomial preprocessing, update and query times would imply a nontrivial triangle detection algorithm for dense graphs. We then apply a high-degree/low-degree argument as in [6] to show that this also implies a nontrivial triangle detection algorithm for sparse graphs.
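A toy sketch of this stage structure (our own illustration): to keep it self-contained we use a trivial recompute-on-query stand-in for dynamic $st$-Reach, and we realize the layered construction with four vertex copies so that an $s$-to-$t$ path forces three distinct triangle vertices.

```python
class RecomputeSTReach:
    """Stand-in dynamic st-Reach structure: graph search from s on every query."""
    def __init__(self, s, t):
        self.adj, self.s, self.t = {}, s, t
    def insert_edge(self, u, v):
        self.adj.setdefault(u, set()).add(v)
    def delete_edge(self, u, v):
        self.adj.get(u, set()).discard(v)
    def reachable(self):
        seen, stack = {self.s}, [self.s]
        while stack:
            for w in self.adj.get(stack.pop(), ()):
                if w not in seen:
                    seen.add(w); stack.append(w)
        return self.t in seen

def has_triangle_via_st_reach(vertices, edges):
    """Triangle detection with one stage per vertex: O(1) updates plus one
    st-Reach query per stage.  Layers 1..4 of the vertex set are connected by
    the edges of G, so s -> x_1 -> a_2 -> b_3 -> x_4 -> t exists iff x, a, b
    form a triangle (the three vertices are forced to be distinct)."""
    D = RecomputeSTReach('s', 't')
    for (u, v) in edges:                          # undirected edge {u, v}
        for layer in (1, 2, 3):
            D.insert_edge((layer, u), (layer + 1, v))
            D.insert_edge((layer, v), (layer + 1, u))
    for x in vertices:                            # stage for vertex x
        D.insert_edge('s', (1, x))
        D.insert_edge((4, x), 't')
        if D.reachable():
            return True
        D.delete_edge('s', (1, x))
        D.delete_edge((4, x), 't')
    return False
```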

To obtain the lower bounds for Strong Connectivity and Bipartite Perfect Matching, we prove general reductions from $st$-Reach to SC and BPMatch showing that if the latter two problems can be solved with preprocessing time $P$, update time $U$ and query time $Q$, then $st$-Reach can be solved with essentially the same preprocessing, update and query times. We show a separate reduction from Triangle Detection to $st$-SubConn (similar to the one to $st$-Reach) that performs $O(m)$ updates and $O(n)$ queries, giving a lower bound on the update time and a higher lower bound on the query time.

Our lower bound for PP is more involved than the rest of the lower bounds based on Conjecture 4. We will explain the main ideas. Given an $n$-node, $m$-edge graph, first let us look for triangles containing a node of high degree (at least some threshold $\Delta$). We begin by creating, for every node $w$ of high degree, a set $S_w$ containing node $x$ iff $x$ is not a neighbor of $w$. The number of such sets is $O(m/\Delta)$ and constructing them takes $O(nm/\Delta)$ time. Now, for each node $y$, using $O(\deg(y))$ updates, we create the intersection of the sets $S_w$ over all high-degree neighbors $w$ of $y$. Then, for every edge $(y,z)$, we query whether $z$ belongs to this intersection. Notice that $z$ is in the intersection if and only if $z$ is not a neighbor of any of the high-degree neighbors of $y$. Thus, if any one of the queries returns "no", we have detected a triangle.

Suppose now that no triangle with a high-degree node is found. Then all nodes of any triangle have degree below $\Delta$. We can attempt to do exactly the same reduction as above. The only problem is that the number of sets that we would have to create could now be linear in $n$, and thus just creating the sets would take quadratic time. This is sufficient for a reduction from triangle detection in dense graphs; however, it is too costly for a reduction from sparse graphs. Fortunately, we can avoid the high cost. Before we create the sets, we pick a universal hash function and hash all nodes with it into a universe of size polynomial in $\Delta$. We are guaranteed that, with constant probability, if we take two adjacent nodes $y$ and $z$ of low degree, then their neighborhoods won't contain any two nodes hashing to the same value. Thus, we can simulate the search for a triangle with an edge $(y,z)$ where both $y$ and $z$ have low degree, just as before, except that we create one set per hash value. The creation time is now much smaller, and everything else works out with constant probability. We can obtain correctness with high probability by repeating with $O(\log n)$ independent hash functions. Picking $\Delta$ appropriately, we obtain an extra additive term in our reduction which is negligible if we are trying to contradict Conjecture 4 for a small enough constant $\delta$.

Lower bounds from 3SUM.

Pǎtraşcu [72] showed that 3SUM on $n$ numbers can be reduced, in truly subquadratic time, to the problem of listing triangles in a certain tripartite graph with parts $A$, $B$ and $C$ whose sizes depend on a parameter of the reduction. Then, he reduced this triangle listing problem to "the multiphase problem", which in turn can be reduced to several dynamic problems. We examine Pǎtraşcu's reduction in more detail and show that by directly reducing the triangle listing problem to dynamic problems like $st$-SubConn, we can overcome some inefficiencies incurred by "the multiphase problem" and get improved lower bounds.

A first approach is to use the known reductions from triangle listing to triangle finding [93, 57] to directly apply our hardness results based on triangle finding. However, using the currently best reductions, even a linear time algorithm for triangle finding would not give an algorithm for listing the triangles that is fast enough to imply subquadratic 3SUM.

Instead, we reason about Pǎtraşcu's construction directly. First, we observe that to falsify Conjecture 1, it is enough to list, in subquadratic time, all pairs of nodes that participate in a triangle. To do this, note that in Pǎtraşcu's construction, every node of $A$ has few neighbors in $B$. Thus, once the pairs of nodes that appear in triangles are known, one can go through each such pair and check each of the few relevant neighbors to find all triangles going through the pair. For a suitable setting of the parameters, the resulting 3SUM algorithm runs in truly subquadratic time.

Thus, to obtain lower bounds for our dynamic problems, we show how to list the pairs of nodes that appear in triangles using a small number of queries and updates. We first reduce $st$-Reach to $st$-SubConn, thus also showing that $st$-SubConn is at least as hard as SC and BPMatch. Then we focus on $st$-SubConn. Given Pǎtraşcu's graph $G$ for some choice of the parameter, we create an instance $G'$ of $st$-SubConn. $G'$ is a copy of $G$ in which all the edges between the parts $A$ and $B$ are removed; thus $G'$ has relatively few edges for any choice of the parameter. We also add a node $s$ that is connected to all the nodes in $A$ and a node $t$ that is connected to all the nodes in $B$. Initially, $s$, $t$ and all nodes in $C$ are activated, while the nodes in $A$ and $B$ are deactivated.

We preprocess this graph, which takes subquadratic time for a suitable choice of the parameter. Then, we have a stage for each of the edges between $A$ and $B$ in $G$. In the stage for an edge $(a, b)$, we activate the nodes $a$ and $b$ and query whether $s$ and $t$ are connected. They are connected iff there is a node in $C$ that is a neighbor of both $a$ and $b$, i.e. iff the pair $(a,b)$ participates in a triangle. Then we deactivate $a$ and $b$ and move on to the next edge. This way, we can list all the pairs that are in triangles with $O(1)$ updates and one query per edge, which is subquadratic time overall whenever the update and query times of the $st$-SubConn algorithm are sufficiently small polynomials.

This type of approach is insufficient to prove a tradeoff between the query and update times, however. To obtain such a tradeoff, we need to be able to reduce the search for triangle edges to $st$-SubConn in a way where the number of queries is very different from the number of updates. To achieve this, on the same underlying graph as before, we use $st$-SubConn to binary search for the nodes in $B$ that participate in a triangle with a given node $a \in A$ (instead of simply trying each neighbor of $a$ as we did above). This allows us to reduce the number of queries in the reduction while keeping the number of updates essentially the same. This lets us pick different parameters and trade off the lower bounds for the query and the update times.

In the binary search for a fixed $a \in A$, we use the queries to check whether there is a node in a certain contiguous subset (interval) of $B$ that participates in a triangle with $a$. This can be done by activating all neighbors of $a$ in the interval at once and asking the connectivity query. We start the search with an interval that contains all of $B$. If we discover that an interval contains a node that participates in a triangle with $a$, we proceed to search within both subintervals of half the size. (Thus, we only search in an interval if its parent interval returned "yes".) Since no node that appears in a triangle with $a$ appears in more than $O(\log n)$ of the queried intervals, the number of queries to $st$-SubConn is only bigger than the number of reported pairs by a logarithmic factor. The number of updates for a fixed $a$ is at most $O(\log n)$ times the number of neighbors of $a$ in $B$, hence the total number of updates is larger than in the previous reduction by at most a logarithmic factor (see the sketch of the interval-splitting search below).
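The query-counting pattern behind this binary search can be illustrated in isolation: report every "marked" index in a range using only interval queries, splitting an interval only when its query answers yes. In the reduction, "marked" means "participates in a triangle with $a$" and the interval query is the $st$-SubConn connectivity query after activating the neighbors of $a$ in the interval; the helper below is purely illustrative.

```python
def report_marked(lo, hi, has_marked):
    """Report every index i in [lo, hi) for which the hidden predicate holds,
    using only interval queries has_marked(l, r) = 'is some index in [l, r)
    marked?'.  An interval is split only if its query answers yes, so the
    number of queries is O((1 + #reported) * log(hi - lo))."""
    out = []
    def rec(l, r):
        if not has_marked(l, r):
            return
        if r - l == 1:
            out.append(l)
            return
        mid = (l + r) // 2
        rec(l, mid)
        rec(mid, r)
    if lo < hi:
        rec(lo, hi)
    return out

# Toy usage: three marked indices found with O(log n) queries each.
marked = {3, 7, 8}
print(report_marked(0, 16, lambda l, r: any(l <= i < r for i in marked)))
```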

Lower bounds on partially dynamic algorithms.

Notice that our reductions almost always have the following form (with the exception of those for PP and $\emptyset$-PP). They proceed in stages, and each stage has the following structure: some insertions are performed, then some number of queries are asked, and finally the insertions are undone.

We can simulate this type of reduction with an incremental algorithm as follows. During each stage, we perform the insertions and queries, and while we do so, we record the sequence of all changes to the data structure that the insertions (and queries) cause. This makes our reduction no longer black box (it was black box for fully dynamic algorithms). It also increases the space usage to be on the order of the time it takes to perform the insertions. However, once we have recorded all the changes, we can undo them in reverse order in roughly the same time as they originally took, bringing the data structure back to the state it was in before the beginning of the stage. We thus obtain lower bounds on the preprocessing, update and query times of incremental algorithms. However, since we undo changes, the lower bounds only hold for worst case running times.
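The following toy sketch (our own illustration, not part of the paper's formal reduction) shows the bookkeeping this requires: all writes to the simulated structure's memory go through a log, so an entire stage can be rolled back write-by-write.

```python
_ABSENT = object()          # sentinel: the cell did not exist before the write

class UndoLog:
    """Sketch of the rollback technique: every write to the simulated data
    structure's memory is logged, so a whole stage of insertions (and the work
    triggered by queries) can be undone in reverse order, in time proportional
    to the work it caused.  This is why the resulting partially dynamic lower
    bounds hold only for worst-case update times."""

    def __init__(self):
        self.mem = {}       # the data structure's cells
        self.log = []       # (cell, previous value) records, in write order

    def write(self, cell, value):
        self.log.append((cell, self.mem.get(cell, _ABSENT)))
        self.mem[cell] = value

    def read(self, cell, default=None):
        return self.mem.get(cell, default)

    def checkpoint(self):
        return len(self.log)            # call at the start of a stage

    def rollback(self, mark):
        # Undo every write made since the checkpoint, newest first.
        while len(self.log) > mark:
            cell, old = self.log.pop()
            if old is _ABSENT:
                self.mem.pop(cell, None)
            else:
                self.mem[cell] = old
```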

Simulating the above reductions with decremental algorithms is more challenging, since it would seem that we need to simulate insertions with roughly as many deletions, and this is not always possible. We develop some techniques that work for many of our reductions. For instance, we are able to simulate the following with only a near-linear number of deletions (and undeletions) over all stages: in each stage, a node has an edge to only the $i$-th node from a fixed set. This is useful for our proof that efficient worst-case decremental $st$-Reach implies faster triangle algorithms.

Lower bounds based on APSP.

To show our lower bounds from APSP for incremental or decremental $st$-SP and BWMatch, we first reduce $st$-SP to BWMatch, thus showing that we only have to concentrate on $st$-SP. Then, we combine Roditty and Zwick's [79] original reduction with Vassilevska Williams and Williams' [93] proof that negative triangle detection is equivalent to APSP. In particular, we show that the number of shortest path queries can be reduced to $O(n)$ (from $O(n^2)$), since we only need to simulate determining, for each vertex, whether there is a short path of suitable weight from the vertex back to itself.

For each problem we list what is maintained, the update operation, and the query.

$st$-Subgraph Connectivity ($st$-SubConn).
Maintain: a fixed undirected graph $G$, a subset $S$ of its vertices, and fixed vertices $s, t$.
Update: insert/remove a node of $G$ into/from $S$.
Query: are $s$ and $t$ connected in the subgraph induced by the nodes in $S$?

Bipartite Perfect Matching (BPMatch).
Maintain: an undirected bipartite graph.
Update: edge insertions/deletions.
Query: does the graph have a perfect matching?

Bipartite Maximum Weight Matching (BWMatch).
Maintain: an undirected bipartite graph with integer edge weights.
Update: edge insertions/deletions.
Query: what is the weight of the maximum weight matching?

Bipartite matching without short augmenting paths ($k$-BPM).
Maintain: an undirected bipartite graph.
Update: edge insertions/deletions.
Query: what is the size of a matching that does not admit augmenting paths of length at most $k$?

Single Source Reachability (ss-Reach).
Maintain: a directed graph and a fixed vertex $s$.
Update: edge insertions/deletions.
Query: given a vertex $v$, is $v$ reachable from $s$?

$st$-Reachability ($st$-Reach).
Maintain: a directed graph and fixed vertices $s, t$.
Update: edge insertions/deletions.
Query: is $t$ reachable from $s$?

$st$-shortest path ($st$-SP).
Maintain: an undirected weighted graph and fixed vertices $s, t$.
Update: edge insertions/deletions.
Query: what is the distance between $s$ and $t$?

Strong Connectivity (SC).
Maintain: a directed graph.
Update: edge insertions/deletions.
Query: is the graph strongly connected?

2 Strong Components (SC2).
Maintain: a directed graph.
Update: edge insertions/deletions.
Query: are there more than 2 strongly connected components?

Approximate number of Strong Components (AppxSCC).
Maintain: a directed graph.
Update: edge insertions/deletions.
Query: approximate the number of SCCs within a constant factor.

Maximum SCC size (MaxSCC).
Maintain: a directed graph.
Update: edge insertions/deletions.
Query: what is the size of the largest SCC?

Single Source Reachability Count (#SSR).
Maintain: a directed graph with a fixed source $s$.
Update: edge insertions/deletions.
Query: what is the number of nodes reachable from $s$?

Connected Subgraph (ConnSub).
Maintain: a fixed undirected graph $G$ and a vertex subset $S$.
Update: insert/remove a node of $G$ into/from $S$.
Query: is the subgraph induced by $S$ connected?

$ST$-Reachability ($ST$-Reach).
Maintain: a directed graph and fixed node subsets $S$ and $T$.
Update: edge insertions/deletions.
Query: are there some $s \in S$, $t \in T$ such that $t$ is unreachable from $s$?

Approximate Diameter (Approx-Diam).
Maintain: an undirected graph.
Update: edge insertions/deletions.
Query: is the diameter at most a given value, or larger than it by a constant factor?

Chan's Subset Union Problem (SubUnion).
Maintain: a subcollection $S$ of a fixed collection $F$ of subsets over a universe $U$, with $\bigcup_{X \in F} X = U$.
Update: insert/remove a set of $F$ into/from $S$.
Query: is $\bigcup_{X \in S} X = U$?

Pagh's Problem (PP).
Maintain: a collection of subsets $X_1, \ldots, X_k$ of a universe.
Update: given $i, j$, insert $X_i \cap X_j$ into the collection.
Query: given an index $i$ and an element $z$, is $z \in X_i$?

Pagh's Problem with Emptiness Queries ($\emptyset$-PP).
Maintain: a collection of subsets $X_1, \ldots, X_k$ of a universe.
Update: given $i, j$, insert $X_i \cap X_j$ into the collection.
Query: given an index $i$, is $X_i = \emptyset$?

Table 1: The problems we consider.
Problem Best Upper Bounds Lower Bounds
Conjecture
-Reach (*) SUM
(*) Triangle
 [80] BMM
SC (*) SUM
(*) Triangle
BMM
SubConn (*) BMM
(*) Triangle
 [21] SUM
BPMatch BM (*) SUM
BM (*) Triangle
 [81] BMM
Dec/Inc BWMatch WM (*) APSP
Dec/Inc -SP (*) APSP
(*)
SC2, SSR, ConnSub, (*)
AppxSCC, SubUnion (*) SETH
-PP over (*) SETH
a universe of size (*) Triangle
and sets BMM
SUM
PP over (*) Triangle with
a universe of size (*) BMM
and sets  [72] SUM
-Reach or -Diam (*) SETH
in sparse graphs (*)
Table 2: The table includes the current best upper bounds for the listed problems, together with bounds for which a listed conjecture would be falsified. In the above, WM refers to , i.e. asymptotically the fastest known time to compute a weighted matching, BM refers to , i.e. asymptotically the fastest known time to compute a bipartite perfect matching, is an arbitrarily small constant, and is some constant for which Triangle detection is not in time. (*) denotes the trivial algorithm. Dec/Inc means that the upper and lower bounds apply to fully dynamic, and also to partially dynamic, i.e. decremental and incremental, algorithms. All lower bounds can be amortized and expected. All above lower bounds also hold in the case of partially dynamic algorithms, however then the lower bounds are assumed to be worst-case (unless they are already listed in the table).

4 The conjectures

3SUM.

The 3SUM problem is the problem of determining whether a set of $n$ integers contains three integers $a, b, c$ such that $a + b = c$. The problem has a simple $\tilde{O}(n^2)$ time solution: sort the integers, and for every pair, check whether their sum is in the list using binary search (a sketch follows below). There are faster algorithms as well. Baran, Demaine and Pǎtraşcu [12] showed that in the Word RAM model with $O(\log n)$ bit words, 3SUM can be solved in roughly $n^2/\log^2 n$ time (up to $\log\log n$ factors). However, there are no known $O(n^{2-\varepsilon})$ time (so called "truly subquadratic") algorithms for the problem for any $\varepsilon > 0$. The lack of progress on the problem has led to the following conjecture [72, 42].
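A direct transcription of that simple algorithm (illustrative code; the conjecture, of course, concerns asymptotically faster algorithms):

```python
from bisect import bisect_left

def has_3sum(nums):
    """The simple 3SUM check described above: sort, then for every pair
    (a, b) binary-search the sorted list for a value c with a + b = c,
    insisting on three distinct positions.  Runs in O(n^2 log n) time."""
    a = sorted(nums)
    n = len(a)
    for i in range(n):
        for j in range(i + 1, n):
            target = a[i] + a[j]
            k = bisect_left(a, target)
            while k < n and a[k] == target:
                if k != i and k != j:
                    return True
                k += 1
    return False

print(has_3sum([1, 9, 4, 5]))   # True: 1 + 4 = 5
print(has_3sum([2, 3, 11]))     # False
```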

Conjecture 1 (No truly subquadratic 3SUM).

In the Word RAM model with words of $O(\log n)$ bits, any algorithm requires $n^{2-o(1)}$ time in expectation to determine whether a set $S \subset \{-n^3, \ldots, n^3\}$ of $|S| = n$ integers contains three distinct elements $a, b, c \in S$ with $a + b = c$.

(By standard hashing arguments, one can assume that the integers in the 3SUM instance are polynomially bounded in $n$, and so the conjecture is not about a restricted version of the problem.)

Many researchers believe this conjecture. Besides Pǎtraşcu's paper [72] on dynamic lower bounds, 3SUM is often used to prove conditional hardness for non-dynamic problems. Gajentaan and Overmars [42] formed a theory of "3SUM-hard problems" by showing that one can reduce 3SUM to many static problems in computational geometry, showing that unless 3SUM has a truly subquadratic time algorithm, none of them do. One example of a 3SUM-hard problem is testing whether a given set of points in the plane contains three collinear points. Following [42], many other papers proved the 3SUM hardness of geometric problems [29, 64, 34, 2, 8, 10, 23, 13]. Vassilevska and Williams [88, 90] showed that a certain weighted graph triangle problem cannot be solved efficiently unless Conjecture 1 is false, relating 3SUM to problems in weighted graphs. Their work was recently extended [1] to other weighted subgraph problems.

APSP.

The second conjecture concerns the all pairs shortest paths problem (APSP): given a directed or undirected graph with integer edge weights, determine the distances between every pair of vertices in the graph. Classical algorithms such as Dijkstra's or Floyd-Warshall's provide $O(n^3)$ running times for APSP in $n$-node graphs. Just as with 3SUM, there are improvements over this cubic runtime. Until 2013, the fastest such runtime was roughly $n^3/\log^2 n$, by Han and Takaoka [45]. Williams [91] has recently designed an algorithm that runs in $n^3/2^{\Theta(\sqrt{\log n})}$ time, i.e. faster than $n^3/\log^c n$ for all constants $c$. Nevertheless, no truly subcubic time ($O(n^{3-\varepsilon})$ for $\varepsilon > 0$) algorithm for APSP is known. This led to the following conjecture, assumed in many papers, e.g. [79, 93].

Conjecture 2 (No truly subcubic APSP).

There is a constant $c$, such that in the Word RAM model with words of $O(\log n)$ bits, any algorithm requires $n^{3-o(1)}$ time in expectation to compute the distances between every pair of vertices in an $n$ node graph with edge weights in $\{1, \ldots, n^c\}$.

Vassilevska Williams and Williams [93] showed that many other graph problems are equivalent to APSP under subcubic reductions, and as a consequence any truly subcubic algorithm for them would violate Conjecture 2. Some examples of these problems include detecting a negative weight triangle in a graph, computing replacement paths and finding the minimum cycle in the graph.

One could ask: is there a relationship between Conjectures 1 and 2? The answer is unknown. However, there is a problem that is in a sense at least as hard as both 3SUM and APSP, and may be equivalent to either one of them. The problem, Exact Triangle, is: given a graph with integer edge weights, determine whether it contains a triangle whose total edge weight is $0$. The work of Vassilevska Williams and Williams [90], based partially on [72], shows that if Exact Triangle can be solved in truly subcubic time, then both Conjectures 1 and 2 are false.

The Strong Exponential Time Hypothesis.

The next conjecture is about the exact complexity of an NP-hard problem, namely Boolean Satisfiability in Conjunctive Normal Form (CNF-SAT). The best known algorithm for CNF-SAT is the $O^*(2^n)$ time exhaustive search algorithm which tries all possible assignments to the $n$ variables, and it has been a major open problem to obtain an improvement. There are faster algorithms for $k$-SAT for constant $k$. Their running times are typically of the form $O(2^{(1 - c/k)n})$ for some constant $c$ independent of $k$ (e.g. [50, 66, 70, 69, 84, 85]). That is, as $k$ grows, the base of the exponent of the best known algorithms goes to $2$.

Impagliazzo, Paturi, and Zane [53, 54] introduced the Strong Exponential Time Hypothesis (SETH) to address the question of how fast one can solve $k$-SAT as $k$ grows. They define

$$s_k = \inf\{\delta \;:\; k\text{-SAT can be solved in } O(2^{\delta n}) \text{ time}\}.$$

The sequence $\{s_k\}_k$ is clearly nondecreasing. The SETH hypothesizes that $\lim_{k \to \infty} s_k = 1$.

Conjecture 3 (SETH).

For every $\varepsilon > 0$, there exists an integer $k$, such that SAT on $k$-CNF formulas on $n$ variables cannot be solved in $O^*(2^{(1-\varepsilon)n})$ time.

The SETH is an extremely popular conjecture in the exact exponential time algorithms community. For instance, Cygan et al. [25] showed that the SETH is also equivalent to the assumption that several other NP-hard problems cannot be solved faster than by exhaustive search, and the best algorithms for these problems are the exhaustive search ones. Some other work that proves conditional lower bounds based on the SETH for NP-hard problems includes [25, 19, 28, 63, 27, 92, 71, 24, 26, 36].

Assuming the SETH, one can prove tight conditional lower bounds on the complexity of some polynomial time problems as well. Pǎtraşcu and Williams [75] give several tight lower bounds for problems such as $k$-dominating set (for any constant $k$), SAT with two extra unrestricted length clauses, and HornSAT with extra unrestricted length clauses. Roditty and Vassilevska Williams [77] and the follow-up work of Chechik et al. [22] related the complexity of approximating the diameter of a graph to the SETH. In this paper we prove the first lower bounds for dynamic problems based on the SETH. The lower bounds we obtain are surprisingly tight: any polynomial improvement over the trivial algorithm would falsify Conjecture 3. In addition, all our lower bounds based on the SETH also hold with arbitrary polynomial preprocessing time.

Triangle.

The next conjecture is on the complexity of finding a triangle in a graph. The best known algorithm for triangle detection relies on fast matrix multiplication and runs in $O(m^{2\omega/(\omega+1)}) \le O(m^{1.41})$ time in $m$-edge graphs [6]. Even if there were an optimal, quadratic time matrix multiplication algorithm, this approach would at best yield an $O(m^{4/3})$ time algorithm for triangle detection. The lack of alternative algorithms leads to the conjecture that there may not be a linear time algorithm for triangle finding. (In fact, one may even conjecture that $O(m^{4/3-\varepsilon})$ time is not possible, but we will be conservative.)

Conjecture 4 (No almost linear time triangle).

There is a constant $\delta > 0$, such that in the Word RAM model with words of $O(\log m)$ bits, any algorithm requires $m^{1+\delta-o(1)}$ time in expectation to detect whether an $m$ edge graph contains a triangle.

One may ask whether Conjecture 4 is related to Conjectures 1 and 2. Although there is no known strong relationship between Conjectures 2 and 4, the relationship between 3SUM and Triangle detection has been explored. For instance, Pǎtraşcu [72] showed that one can reduce 3SUM on $n$ numbers to the problem of listing triangles in a graph, so that any algorithm that lists the triangles of an $m$-edge graph fast enough would falsify Conjecture 1.

However, is there a relationship between triangle listing and triangle detection? Vassilevska Williams and Williams [93] proved that for dense graphs, any truly subcubic algorithm for triangle detection implies a truly subcubic algorithm for listing any truly subcubic number of triangles. Jafargholi and Viola [57] extended this result to the case of sparse graphs, showing that a fast triangle detection algorithm yields a correspondingly fast triangle listing algorithm. Unfortunately, their reduction always produces a listing algorithm whose running time is too large to falsify Conjecture 1. The authors of [57] also show that Triangle detection on a graph with $m$ edges can be reduced to 3SUM, which implies that if the Triangle Conjecture is true for some $\delta > 0$, then 3SUM requires superlinear time. Beyond this, the 3SUM conjecture and the Triangle Conjecture may be unrelated.

We state our lower bounds in terms of the exponent in Conjecture 4. Thus any sufficiently large improvement on the complexity of our dynamic problems would yield a new algorithm for triangle detection.

Boolean matrix multiplication (BMM).

The Boolean product of two $n \times n$ Boolean matrices $A$ and $B$ is the $n \times n$ matrix $C$ with entries $C[i][j] = \bigvee_{k} (A[i][k] \wedge B[k][j])$. Many important problems can not only be solved using a fast BMM routine, but are also equivalent to BMM [62, 93]. Hence an efficient and practical BMM algorithm is highly desirable.
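In code, the definition is simply the following cubic-time computation (any faster BMM algorithm must produce the same matrix):

```python
def boolean_product(A, B):
    """Naive cubic Boolean matrix multiplication of n x n 0/1 matrices:
    C[i][j] = OR over k of (A[i][k] AND B[k][j])."""
    n = len(A)
    return [[any(A[i][k] and B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```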

The Boolean product of two matrices can be computed using any algorithm for integer matrix multiplication, and hence the problem is in $O(n^{\omega})$ time [89], where $\omega < 2.373$. However, the theoretically efficient matrix multiplication algorithms (except possibly Strassen's [86]) use mathematical machinery that causes them to have high constant factors, and they are thus currently impractical. Because of this, alternative, so called "combinatorial", methods for BMM are sought after.

The current best combinatorial algorithm for BMM, by Bansal and Williams [11], runs in $\hat{O}(n^3/\log^{2.25} n)$ time, improving on the well-known Four-Russians algorithm [9], which runs in $O(n^3/\log^2 n)$ time. Because it has been such a longstanding open problem to obtain an $O(n^{3-\varepsilon})$ time (for constant $\varepsilon > 0$) combinatorial algorithm for BMM, the following conjecture has been floating around the community; many papers base lower bounds for problems on it (e.g. [79, 62, 58, 4]). (We place "conjecture" in quotes, mainly because "combinatorial" is not a well-defined term.)

“Conjecture” 5 (No truly subcubic combinatorial BMM).

In the Word RAM model with words of $O(\log n)$ bits, any combinatorial algorithm requires $n^{3-o(1)}$ time in expectation to compute the Boolean product of two $n \times n$ matrices.

The only known relationship between the complexity of BMM and the rest of the conjectures in this paper is a result from [93] showing that any truly subcubic (in $n$) combinatorial algorithm for finding a triangle can be converted into a truly subcubic combinatorial algorithm for BMM. Hence “Conjecture” 5 is equivalent to the conjecture that combinatorial triangle finding in $n$ node graphs requires $n^{3-o(1)}$ time. However, that does not necessarily imply Conjecture 4, since it could be that there is a linear time algebraic triangle finding algorithm but no combinatorial one. Furthermore, “Conjecture” 5 could be false while Conjecture 4 is still true: according to our current knowledge, even an optimal algorithm for BMM would at best imply an $O(m^{4/3})$ time algorithm for triangle detection.

5 Formal statement of our results

The problems we study are defined in Table 1. We prove the following theorems. Most of our results are summarized in Table 2.

Theorem 5.1.

If we can solve any of

  • fully dynamic #SSR, SC2, AppxSCC, MaxSCC, SubUnion, $\emptyset$-PP, or ConnSub, with arbitrary polynomial preprocessing time and truly sublinear amortized update and query times, or

  • incremental or decremental #SSR, SC2, AppxSCC, MaxSCC, SubUnion, $\emptyset$-PP, or ConnSub, with arbitrary polynomial preprocessing time and truly sublinear worst case update and query times, or

  • fully dynamic $ST$-Reach or approximate diameter with arbitrary polynomial preprocessing time and amortized update and query times polynomially smaller than the trivial bounds, or

  • incremental or decremental $ST$-Reach or approximate diameter with arbitrary polynomial preprocessing time and worst case update and query times polynomially smaller than the trivial bounds,

then Conjecture 3 is false.

Theorem 5.2.

If, for some constants, we can solve any of

  • fully dynamic -SubConn, -Reach, BPMatch, or SC, with preprocessing time , amortized update time , and amortized query time , or

  • incremental -SubConn, -Reach, BPMatch, or SC, with preprocessing time , worst case update time , and worst case query time , or

  • decremental -Reach, BPMatch, or SC, with preprocessing time , worst case update time , and worst case query time , or

  • PP or -PP over sets and a universe of size with preprocessing time , amortized update and query time ,

then Conjecture 1 is false.

Theorem 5.3.

If, for some constant, we can solve any of

  • fully dynamic -Reach, BPMatch, -BPM, or SC, with preprocessing time , amortized update and query times , or

  • incremental or decremental -Reach, BPMatch, -BPM, or SC, with preprocessing time , worst case update and query times , or

  • fully dynamic -SubConn or -BPM with preprocessing time