Tighter Connections Between Formula-SAT and Shaving Logs

(Footnote 1: Part of the work was performed while visiting the Simons Institute for the Theory of Computing, Berkeley, CA. The work was completed when A.A. was at Stanford University and was supported by Virginia Vassilevska Williams’ NSF Grants CCF-1417238 and CCF-1514339, and BSF Grant BSF:2012338.)

Amir Abboud IBM Almaden Research Center, abboud@cs.stanford.edu    Karl Bringmann Max Planck Institute for Informatics, Saarland Informatics Campus, Germany, kbringma@mpi-inf.mpg.de
Abstract

A noticeable fraction of Algorithms papers in the last few decades improve the running time of well-known algorithms for fundamental problems by logarithmic factors. For example, the $O(n^2)$ dynamic programming solution to the Longest Common Subsequence problem (LCS) was improved to $O(n^2/\log^2 n)$ in several ways and using a variety of ingenious tricks. This line of research, also known as the art of shaving log factors, lacks a tool for proving negative results. Specifically, how can we show that it is unlikely that LCS can be solved in time $O(n^2/\log^3 n)$?

Perhaps the only approach for such results was suggested in a recent paper of Abboud, Hansen, Vassilevska W. and Williams (STOC’16). The authors blame the hardness of shaving logs on the hardness of solving satisfiability on boolean formulas (Formula-SAT) faster than exhaustive search. They show that an $O(n^2/\log^{1000} n)$ algorithm for LCS would imply a major advance in circuit lower bounds. Whether this approach can lead to tighter barriers was unclear.

In this paper, we push this approach to its limit and, in particular, prove that a well-known barrier from complexity theory stands in the way of shaving five additional log factors for fundamental combinatorial problems. For LCS, regular expression pattern matching, as well as the Fréchet distance problem from Computational Geometry, we show that an $O(n^2/\log^{7+\varepsilon} n)$ runtime would imply new Formula-SAT algorithms.

Our main result is a reduction from SAT on formulas of size $s$ over $n$ variables to LCS on sequences of length $N = 2^{n/2} \cdot s^{1+o(1)}$. Our reduction is essentially as efficient as possible, and it greatly improves the previously known reduction for LCS with $N = 2^{n/2} \cdot s^{c}$, for some $c \geq 100$.

1 Introduction

Since the early days of Algorithms research, a noticeable fraction of papers each year shave log factors for fundamental problems: they reduce the best known upper bound on the time complexity from to , for some . While in some cases a cynic would call such results “hacks” and “bit tricks”, there is no doubt that they often involve ingenious algorithmic ideas and suggest fundamental new ways to look at the problem at hand. In his survey, Timothy Chan calls this kind of research “The Art of Shaving Logs” [37]. In many cases, we witness a race of shaving logs for some problem, in which a new upper bound is found every few months, without giving any hints on when this race is going to halt. For example, in the last few years, the upper bound for combinatorial Boolean Matrix Multiplication dropped from [16], to [20], to [38], and most recently to [102]. Perhaps the single most important missing technology for this kind of research is a tool for proving lower bounds.

Consider the problem of computing the Longest Common Subsequence (LCS) of two strings of length $n$. LCS has a simple $O(n^2)$ time dynamic programming algorithm [95, 47]. Several approaches have been utilized in order to shave log factors such as the “Four Russians” technique [16, 63, 76, 23, 60], utilizing bit-parallelism [10, 48, 64], and working with compressed strings [49, 56]. The best known upper bounds are $O(n^2/\log^2 n)$ for constant size alphabets [76], and $O(n^2 \log\log n/\log^2 n)$ for large alphabets [60]. But can we do better? Can we solve LCS in $O(n^2/\log^3 n)$ time? While the mathematical intrigue is obvious, we remark that even such mild speedups for LCS could be significant in practice. Besides its use as the diff operation in Unix, LCS is at the core of highly impactful similarity measures between biological data. A heuristic algorithm called BLAST for a generalized version of LCS (namely, the Local Alignment problem [89]) has been cited more than sixty thousand times [14]. While such heuristics are much faster than the near-quadratic time algorithms above, they are not guaranteed to return an optimal solution and are thus useless in many applications, and biologists often fall back to (highly optimized implementations of) the quadratic solutions, see, e.g. [73, 74].
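For concreteness, the simple quadratic-time dynamic program mentioned above can be sketched as follows. This is a minimal textbook version, not one of the log-shaving algorithms cited; the function name `lcs_length` is chosen here for illustration only.

```python
def lcs_length(a, b):
    """Length of a longest common subsequence of sequences a and b.

    Textbook dynamic program: prev[j] stores the LCS length of the
    already processed prefix of `a` and the prefix b[:j].
    Runs in O(len(a) * len(b)) time and O(len(b)) space.
    """
    prev = [0] * (len(b) + 1)
    for ai in a:
        cur = [0]  # LCS with the empty prefix of b is 0
        for j, bj in enumerate(b, start=1):
            if ai == bj:
                # Extend the best alignment of the two shorter prefixes.
                cur.append(prev[j - 1] + 1)
            else:
                # Drop the last symbol of one of the two prefixes.
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]
```

Every table entry is computed with a constant number of word operations, which is exactly the quadratic work that the Four-Russians and bit-parallel techniques reduce by logarithmic factors.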

How would one show that it is hard to shave logs for some problem? A successful line of work, inspired by NP-hardness, utilizes “fine-grained reductions” to prove statements of the form: a small improvement over the known runtime for problem A implies a breakthrough algorithm for problem B, refuting a plausible hypothesis about the complexity of B. For example, it has been shown that if LCS can be solved in $O(n^{2-\varepsilon})$ time, where $\varepsilon > 0$, then there is a breakthrough algorithm for CNF-SAT, and the Strong Exponential Time Hypothesis (SETH, defined below) is refuted [2, 29]. Another conjecture that has been used to derive interesting lower bounds states that the 3-SUM problem cannot be solved in $O(n^{2-\varepsilon})$ time. (Footnote 2: 3-SUM asks, given a list of $n$ numbers, to find three that sum to zero. The best known upper bound is $O(n^2 (\log\log n)^{O(1)}/\log^2 n)$ for real numbers [61, 55, 58, 39] and $O(n^2 (\log\log n)^2/\log^2 n)$ for integers [21].) It is natural to ask: can we use these conjectures to rule out log-factor improvements for problems like LCS? And even more optimistically, one might hope to base the hardness of LCS on a more standard assumption like $\mathsf{P} \neq \mathsf{NP}$. Unfortunately, we can formally prove that these assumptions are not sharp enough to lead to any consequences for log-factor improvements, if only Turing reductions are used. In Section 3 we prove the following theorem which also shows that an time algorithm for problem A cannot imply, via a fine-grained reduction, an algorithm for problem B, unless B is (unconditionally) solvable in time.

Theorem 1.1 (Informally).

If for some there is a fine-grained reduction proving that LCS is not in time unless SETH fails, then SETH is false.

Note that it also does not suffice to simply make SETH stronger by postulating a higher running time lower bound for CNF-SAT, since superpolynomial improvements are known for this problem [83, 34, 50, 8]. Similarly, we cannot base a study of log-factor improvements on the APSP conjecture, since superlogarithmic improvements are known for APSP [99]. (However, 3SUM could be a candidate to base higher lower bounds on, since only log-factor improvements are known [61, 55, 58, 21], see Section A for a discussion.)

Thus, in a time when super-linear lower bounds for problems like LCS are far out of reach, and our only viable approach to obtaining such negative results is reductions-based, we are left with two options. We could either leave the study of log-factor improvements in limbo, without a technology for proving negative results, or we could search for natural and convincing assumptions that are more fine-grained than SETH that could serve as the basis for the negative results we desire. Such assumptions were recently proposed by Abboud, Hansen, Vassilevska Williams and Williams [3]. The authors blame the hardness of shaving logs on the hardness of solving satisfiability on boolean formulas (Formula-SAT) faster than exhaustive search, by polynomial factors (which are log-factors in the runtime), a task for which there are well known “circuit lower bound” barriers in complexity theory. (Footnote 3: In [3] the authors focus on SAT on Branching Programs (BPs) rather than formulas, but due to standard transformations between BPs and formulas, the two problems are equivalent up to polynomial factors. Focusing on Formula-SAT will be crucial to the progress we make in this paper.) They show that an algorithm for LCS would imply a major advance in circuit lower bounds. In the final section of this paper, we give a more detailed argument in favor of this approach. Whether one should expect it to lead to tight barriers, i.e. explaining the lack of algorithms for LCS or any other natural problem, was completely unclear.

The Machine Model

We use the Word-RAM model on words of size , where there is a set of operations on words that can be performed in time . Most papers do not fix the concrete set of allowed operations, and instead refer to “typical Boolean and arithmetic operations”. In this paper, we choose a set of operations that is robust with respect to changing the word size: For any operation , given two words (of size ) we can compute in time on a Word RAM with word size and operation set . In other words, if we split into words of size then can still be computed very efficiently.

This robustness in particular holds for the following standard set of operations: initializing a cell with a constant, bitwise AND, OR, NOT, shift, addition, subtraction, multiplication, and division with remainder (since multiplication and division have near-linear time algorithms).
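To illustrate the robustness requirement for one of the listed operations, the sketch below adds two large words that have been split into smaller words, using only additions, shifts and masks on the small words. It is an illustration under our own assumptions: the names `to_limbs` and `add_split` and the little-endian limb layout are hypothetical choices, not taken from the paper.

```python
def to_limbs(value, w, count):
    """Split a non-negative integer into `count` little-endian limbs of w bits."""
    mask = (1 << w) - 1
    return [(value >> (w * i)) & mask for i in range(count)]


def add_split(x_limbs, y_limbs, w):
    """Add two numbers given as equal-length little-endian lists of w-bit limbs.

    Each loop iteration uses a constant number of operations on values that
    fit into w + 1 bits, so the total cost is linear in the number of limbs.
    Returns the limbs of the sum and the final carry bit.
    """
    mask = (1 << w) - 1
    out, carry = [], 0
    for xl, yl in zip(x_limbs, y_limbs):
        total = xl + yl + carry
        out.append(total & mask)  # low w bits stay in this limb
        carry = total >> w        # overflow moves to the next limb
    return out, carry
```

For example, `add_split(to_limbs(0xABCD, 8, 2), to_limbs(0x1234, 8, 2), 8)` processes two 16-bit values with 8-bit limbs. Multiplication and division need more care, which is why the text appeals to their near-linear time algorithms.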

The results in this paper will get gradually weaker as we relax the restriction on near-linear time per operation to higher runtimes. However, even with this restriction, to the best of our knowledge this model captures all log shaving results in the literature (on the “standard” Word RAM model without fancy word operations).

Formula-SAT

A boolean formula over input variables can be viewed as a tree in which every leaf is marked by an input variable or its negation and every internal node or gate represents some basic boolean operation. Throughout this introduction we will only talk about deMorgan formulas, in which every gate is from the set . The size of the formula is defined to be the number of leaves in the tree.

In the Formula-SAT problem we are given a formula of size $s$ over $n$ inputs, and we have to decide whether there is an input that makes it output $1$. A naive algorithm takes $O(2^n \cdot s)$ time, since evaluating the formula on some input takes $O(s)$ time. Can we do better? We will call a SAT algorithm non-trivial if it has a runtime at most $O(2^n/n^{\varepsilon})$, for some $\varepsilon > 0$. (Footnote 4: Some works on SAT algorithms used this term for runtimes of the form $2^n/n^{\omega(1)}$. In our context, we need to be a bit more fine-grained.)
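The naive exhaustive-search algorithm can be sketched as follows. The nested-tuple encoding of deMorgan formulas (`("var", i)`, `("neg", i)`, `("and", l, r)`, `("or", l, r)`) and the function names are assumptions made for this illustration only.

```python
from itertools import product


def evaluate(formula, assignment):
    """Evaluate a deMorgan formula tree on a tuple of booleans.

    Leaves are ("var", i) or ("neg", i); internal gates are
    ("and", left, right) or ("or", left, right). One pass over the
    tree, so the cost is linear in the formula size.
    """
    op = formula[0]
    if op == "var":
        return assignment[formula[1]]
    if op == "neg":
        return not assignment[formula[1]]
    left = evaluate(formula[1], assignment)
    right = evaluate(formula[2], assignment)
    if op == "and":
        return left and right
    if op == "or":
        return left or right
    raise ValueError(f"unknown gate type: {op!r}")


def brute_force_sat(formula, n):
    """Try all 2^n assignments; return a satisfying one or None."""
    for bits in product([False, True], repeat=n):
        if evaluate(formula, bits):
            return bits
    return None
```

A non-trivial algorithm in the sense above has to beat this loop by a polynomial factor in the number of inputs, without enumerating every assignment and re-evaluating the whole tree each time.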

It seems like a clever algorithm must look at the given formula and try to gain a speedup by analyzing it. The more complicated can be, the harder the problem becomes. Indeed, Dantsin and Hirsch [50] survey dozens of algorithms for SAT on CNF formulas which exploit their structure. For -CNF formulas of size there are time algorithms (e.g. [83]), and for general CNF formulas the bound is where is the clause-to-variable ratio [34, 50, 8]. The popular SETH [68, 35] essentially says that this is close to optimal, and that there is no algorithm for CNF-SAT. For arbitrary deMorgan formulas, the upper bounds are much worse. A FOCS’10 paper by Santhanam [86] and several recent improvements [42, 44, 43, 72, 93] solve Formula-SAT on formulas of size in time , which is non-trivial only for , and going beyond cubic seems extremely difficult. This leads us to the first barrier which we will transform into a barrier for shaving logs.

Hypothesis 1.2.

There is no algorithm that can solve SAT on deMorgan formulas of size in time, for some , in the Word-RAM model.

Perhaps the main reason to believe this hypothesis is that despite extensive algorithmic attacks on variants of SAT (perhaps the most extensively studied problem in computer science) over decades, none of the ideas that anyone has ever come up with seem sufficient to refute it. Recent years have been particularly productive in non-trivial algorithms designed for special cases of Circuit-SAT [86, 88, 66, 35, 101, 22, 41, 69, 65, 45, 85, 59] (in addition to the algorithms for deMorgan formulas above) and this hypothesis still stands.

A well-known “circuit lower bounds” barrier seems to be in the way for refuting Hypothesis 1.2: can we find an explicit boolean function that cannot be computed by deMorgan formulas of cubic size? Functions that require formulas of size [91] and [71] have been known since the 60’s and 70’s, respectively. In the late 80’s, Andreev [15] proved an which was later gradually improved to by Nisan and Wigderson [67] and to by Paterson and Zwick [81] until Håstad proved his lower bound in FOCS’93 [62] (a recent result by Tal improves the term [92]). All these lower bound results use the “random restrictions” technique, first introduced in this context by Subbotovskaya in 1961 [91], and it is known that a substantially different approach must be taken in order to go beyond the cubic barrier. What does this have to do with Formula-SAT algorithms? Interestingly, this same “random restrictions” technique was crucial to all the non-trivial Formula-SAT algorithms mentioned above. This is not a coincidence, but only one out of the many examples of the intimate connection between the task of designing non-trivial algorithms for SAT on a certain class of formulas or circuits and the task of proving lower bounds against . This connection is highlighted in many recent works and in several surveys [87, 80, 98]. The intuition is that both of these tasks seem to require identifying a strong structural property of functions in . There is even a formal connection shown by Williams [97], which in our context implies that solving Formula-SAT on formulas of size in time (which is only slightly stronger than refuting Hypothesis 1.2) is sufficient in order to prove that there is a function in the class that cannot be computed by formulas of size (see [3] for more details). This consequence would be the first polynomial progress on the fundamental question of worst case formula lower bounds since Håstad’s result.

1.1 Our Results: New Reductions

Many recent papers have reduced CNF-SAT to fundamental problems in to prove SETH-based lower bounds (e.g. [82, 84, 6, 4, 27, 18, 7, 1, 33, 5, 19, 77, 40]). Abboud et al. [3] show that even SAT on formulas, circuits, and more, can be efficiently reduced to combinatorial problems in . In particular, they show that Formula-SAT on formulas of size over inputs can be reduced to an instance of LCS on sequences of length . This acts as a barrier for shaving logs as follows. A hypothetical time algorithm for LCS can be turned into an

time algorithm for Formula-SAT, which for a large enough would refute Hypothesis 1.2. The first factor in the runtime comes from the jump from to and our Word-RAM machine model: whenever the LCS algorithm wants to perform a unit-cost operation on words of size (this is much more than the word size of our SAT algorithm which is only ), the SAT algorithm can simulate it in time in the Word-RAM model with words of size .

Our main result is a much more efficient reduction to LCS. For large but constant size alphabets, we get a near-linear dependence on the formula size, reducing the factor to just .

Theorem 1.3.

Formula-SAT on formulas of size on inputs can be reduced to an instance of LCS on two sequences over an alphabet of size of length , in time.

Thus, if LCS on sequences of length and alphabet of size can be solved in time, then Formula-SAT can be solved in time. Recall that the known upper bound for LCS is for any constant alphabet size, with , and we can now report that the barrier of cubic formulas stands in the way of improving it to (see Corollary 1.6 below).

The novelty in the proof of Theorem 1.3 over [3] is discussed in Section 2. As an alternative to Theorem 1.3, in Section D we present another reduction to LCS which is much simpler than all previously known reductions, but uses a larger alphabet.

Fréchet Distance

An important primitive in computational geometry is to judge how similar two basic geometric objects are, such as polygonal curves, represented as sequences of points in $d$-dimensional Euclidean space. Such curves are ubiquitous, since they arise naturally as trajectory data of moving objects, or as time-series data of stock prices and other measures. The most popular similarity measure for curves in computational geometry is the Fréchet distance, also known as dog-leash-distance. For formal definitions see Section F. The Fréchet distance has found many applications (see, e.g., [78, 26, 30]) and developed into a rich field of research with many generalizations and variants (see, e.g., [11, 17, 13, 53, 36, 46, 32, 52, 75, 70]).

This distance measure comes in two variants: the continuous and the discrete. A classic algorithm by Alt and Godau [12, 57] computes the continuous Fréchet distance in time for two given curves with vertices. The fastest known algorithm runs in time (on the Word RAM) [31]. If we only want to decide whether the Fréchet distance is at most a given value , this algorithm runs in time . For the discrete Fréchet distance, the original algorithm has running time  [54], which was improved to by Agarwal et al. [9]. Their algorithm runs in time for the decision version. It is known that both versions of the Fréchet distance are SETH-hard [27]. However, this does not rule out log factor improvements. In particular, no reduction from versions of SETH on formulas or branching programs is known.
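As a point of reference for the discrete variant, the standard quadratic-time dynamic program can be sketched as follows. This is a generic textbook formulation under our own naming (`discrete_frechet`), not the improved algorithm of Agarwal et al.; curves are assumed to be non-empty lists of coordinate tuples.

```python
from math import dist


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two non-empty point sequences.

    dp[i][j] is the smallest leash length needed to traverse p[:i+1]
    and q[:j+1], where in each step one or both walkers advance by
    exactly one vertex. Runs in O(len(p) * len(q)) time.
    """
    n, m = len(p), len(q)
    dp = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                dp[i][j] = d
            elif i == 0:
                dp[i][j] = max(dp[0][j - 1], d)
            elif j == 0:
                dp[i][j] = max(dp[i - 1][0], d)
            else:
                best_prev = min(dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1])
                dp[i][j] = max(best_prev, d)
    return dp[n - 1][m - 1]
```

The decision version considered in the remainder of the paper corresponds to testing whether the returned value is at most a given threshold.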

In this paper we focus on the decision version of the discrete Fréchet distance (which we simply call “Fréchet distance” from now on). We show that Fréchet distance suffers from the same barriers for shaving logs as LCS. In particular, this reduction allows us to base the usual lower bound on a weaker assumption than SETH, such as NC-SETH (see the discussion in [3]). This is the first NC-SETH hardness for a problem that does not admit alignment gadgets (as in [29]).

Theorem 1.4.

Formula-SAT on formulas of size on inputs can be reduced to an instance of the Fréchet distance on two curves of length , in time.

Regular Expression Pattern Matching

Our final example is the fundamental Regular Expression Pattern Matching problem: Decide whether a given regular expression of length matches a substring of a text of length . Again, there is a classical algorithm [94], and the applicability and interest in this problem resulted in algorithms shaving log factors; the first one by Myers [79] was improved by Bille and Thorup [24] to time . Recently, Backurs and Indyk proved an SETH lower bound [19], and performed an impressive study of the exact time complexity of the problem with respect to the complexity of the regular expression. This study was essentially completed by Bringmann, Grønlund, and Larsen [28], up to factors. In Section E we show that this problem is also capable of efficiently simulating formulas and thus has the same barriers as LCS and Fréchet distance.

Theorem 1.5.

Formula-SAT on formulas of size on inputs can be reduced to an instance of Regular Expression Pattern Matching on text and pattern of length over a constant size alphabet, in time.

Consequences of the Cubic Formula Barrier

We believe that SAT on formulas can be tightly connected to many other natural problems in P. As we discuss in the next section, such reductions seem to require problem-specific engineering and are left for future work. The main point of this paper is to demonstrate the possibility of basing such ultra fine-grained lower bounds on one common barrier. Our conditional lower bounds are summarized in the following corollary, which shows that current log-shaving algorithms are very close to the well-known barrier from complexity theory of cubic formula lower bounds.

Corollary 1.6.

For all , solving any of the following problems in time refutes Hypothesis 1.2, and solving them in time implies that cannot be computed by non-uniform formulas of cubic size:

  • LCS over alphabets of size 

  • The Fréchet distance on two curves in the plane

  • Regular Expression Pattern Matching over constant size alphabets.

The main reason that our lower bounds above are not tight (the gap between and ) is that we need to start from SAT on cubic size formulas rather than linear size ones, due to the fact that clever algorithms do exist for smaller formulas. We remark that throughout the paper we will work with a class of formulas we call (see Section B), also known as bipartite formulas, that are more powerful than deMorgan formulas yet our reduction to LCS can support them as well. This makes our results stronger, since -Formula-SAT could be a harder problem than SAT on deMorgan formulas. In fact, in an earlier version of the paper we had suggested the hypothesis that -Formula-SAT does not have non-trivial algorithms even on linear size formulas. This stronger hypothesis would give higher lower bounds. However, Avishay Tal (personal communication) told us about such a non-trivial algorithm for formulas of size up to using tools from quantum query complexity. We are optimistic that one could borrow such ideas or the “random restrictions” technique from SAT algorithms in order to shave more logs for combinatorial problems such as LCS. This is an intriguing direction for future work.

2 Technical Overview and the Reduction to LCS

All the reductions from SAT to problems in P mentioned above start with a split-and-list reduction to some “pair finding” problem. In the SETH lower bounds, CNF-SAT is reduced to the Orthogonal-Vectors problem of finding a pair that are orthogonal [96]. When starting from Formula-SAT, we get a more complex pair-finding problem. In Section B we show a simple reduction from SAT on formulas from the class (which contains deMorgan formulas) to the following problem.

Definition 2.1 (Formula-Pair Problem).

Given a deMorgan formula over variables (each appearing once in ), and two sets of vectors of size , decide if there is a pair such that .

In Section B we show a Four-Russians type algorithm that solves Formula-Pair in time, and even when no upper bound is known. By our reduction, such an upper bound would imply a non-trivial algorithm for SAT on formulas from . Moreover, Hypothesis 1.2 implies that we cannot solve Formula-Pair in time, for . In the next sections, we reduce Formula-Pair to LCS, from which Theorem 1.3 follows. A simpler reduction using much larger alphabet size can be found in Section D.
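The Formula-Pair problem can be made concrete with the following brute-force baseline, which simply evaluates the formula on every pair of vectors. It is not the Four-Russians type algorithm of Section B. The tuple encoding of the formula (leaves `("x", i)` and `("y", i)` reading from the first and second vector, with `("not", leaf)` for negated leaves) is an assumption made for this sketch.

```python
def eval_pair(formula, a, b):
    """Evaluate a formula whose leaves read bits of vector a or vector b."""
    op = formula[0]
    if op == "x":
        return bool(a[formula[1]])
    if op == "y":
        return bool(b[formula[1]])
    if op == "not":
        return not eval_pair(formula[1], a, b)
    left = eval_pair(formula[1], a, b)
    right = eval_pair(formula[2], a, b)
    if op == "and":
        return left and right
    if op == "or":
        return left or right
    raise ValueError(f"unknown gate type: {op!r}")


def formula_pair(formula, A, B):
    """Return some pair (a, b) from A x B satisfying the formula, or None.

    Quadratic in the list size, times the formula size per evaluation.
    """
    for a in A:
        for b in B:
            if eval_pair(formula, a, b):
                return a, b
    return None
```

The reduction to LCS described next has to encode exactly this search over all pairs into the alignment of two strings.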

Theorem 2.2.

Formula-Pair on formulas of size and lists of size can be reduced to an instance of LCS on two strings over alphabet of size of length , in linear time.

The reduction constructs strings and a number such that holds if and only if the given Formula-Pair instance is satisfiable. The approach is similar to the reductions from Orthogonal-Vectors to sequence alignment problems (e.g. [6, 27, 18, 2, 29]). The big difference is that our formula can be much more complicated than a CNF, and so we will need more powerful gadgets. Sequence gadgets that are able to simulate the evaluation of deMorgan formulas were (implicitly) constructed in [3] with a recursive approach. Our main contribution is an extremely efficient implementation of such gadgets with LCS.

The main part of the reduction is to construct gate gadgets: for any vectors and any gate of , we construct strings and whose LCS determines whether gate evaluates to true for input to (see Section 2.1). Once we have this, to find a pair of vectors satisfying , we combine the strings , constructed for the root of , using a known construction of so-called alignment gadgets [2, 29] from previous work (see Section C.1).

Let us quickly explain how [3] constructed gate gadgets and the main ideas that go into our new construction. There are two kinds of gadgets, corresponding to the two types of gates in : AND and OR gates. Since the AND gadgets will be relatively simple, let us consider the OR gadgets. Fix two inputs , and let be an OR gate, and assume that we already constructed gate gadgets for , namely so that for we have that is large if the gate outputs true on input , and it is smaller otherwise. In [3], these gadgets were combined as follows. Let be an upper bound on the total length of the gadgets . We add a carefully chosen padding of ’s and ’s, so that any optimal matching of the two strings will have to match either or but not both.

One then argues that, in any optimal LCS matching of , the block of must be matched either left or right. If it’s matched left, then the total score will be equal to while if it’s matched right, we will get . Thus, is determined by the OR of . The blowup of this construction is a multiplicative factor of with every level of the formula, and the length of the gadget of the root will end up roughly . To obtain our tight lower bounds, we will need to decrease this blowup to at every level, where goes to 0 when the alphabet size tends to infinity. With the above construction, decreasing the length of the padding will allow the optimal LCS matching to cheat, e.g. by matching to both and , and no longer corresponding to the OR of .

Our first trick is an ultra-efficient OR gadget in case we are allowed unbounded alphabet size. We take and transform all their letters into a new alphabet , and we take and transform their letters into a disjoint alphabet . Then our OR gadget does not require any padding at all:

The crossing structure of this construction means that any LCS matching that matches letters from cannot also match letters from , and vice versa, while the disjoint alphabets make sure that there can be no matches between or . With such gadgets we can encode a formula of size with letters, for details see Section D.
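The crossing idea can be checked experimentally. The displayed construction is not reproduced in this copy of the text, so the sketch below assumes the natural layout in which the first string is the concatenation of the two re-lettered first-string gadgets and the second string lists the two re-lettered second-string gadgets in the opposite order. The helper names and the tagging of letters with an alphabet index are our own illustration.

```python
def lcs_length(a, b):
    """Textbook quadratic-time LCS length for two sequences."""
    prev = [0] * (len(b) + 1)
    for ai in a:
        cur = [0]
        for j, bj in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if ai == bj else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def crossing_or(x1, y1, x2, y2):
    """Build the crossing OR gadget (assumed layout: x1 x2 versus y2 y1).

    Letters of the first sub-gadget are tagged 1 and letters of the
    second are tagged 2, which makes the two alphabets disjoint.
    """
    x = [(1, c) for c in x1] + [(2, c) for c in x2]
    y = [(2, c) for c in y2] + [(1, c) for c in y1]
    return x, y


x, y = crossing_or("abca", "acba", "ab", "ba")
# Matches inside alphabet 1 and matches inside alphabet 2 would cross,
# so the LCS equals the larger of the two sub-gadget LCS values.
assert lcs_length(x, y) == max(lcs_length("abca", "acba"), lcs_length("ab", "ba"))
```

Under this layout the gadget adds no padding at all; the price is a fresh alphabet for every sub-gadget, which is what the constant-alphabet construction below has to avoid.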

But how would such an idea work for constant size alphabets? Once we allow and to share even a single letter, this argument breaks. Natural attempts to simulate this construction with smaller alphabets, e.g. by replacing each letter with a random sequence, do not seem to work, and we do not know how to construct such an OR gadget with a smaller alphabet in a black box way. The major part of our proof will be a careful examination of the formula and the sub-gadgets in order to reduce the alphabet size to a large enough constant, while using padding that is only times the length of the sub-gadgets. We achieve this by combining this crossing gadget with a small padding that will reuse letters from alphabets that were used much deeper in the formula, and we will argue that the noise we get from recycling letters is dominated by our paddings, in any optimal matching.

We remark that the reduction of [3] can be implemented in a generic way with any problem that admits alignment gadgets as defined in [29], giving formula-gadgets of size . The list of such problems includes LCS and Edit-Distance on binary strings. However, to get gadgets of length it seems that problem-specific reductions are necessary. A big open question left by our work is to find the most efficient reduction from Formula-SAT to Edit-Distance. A very efficient OR gadget, even if the alphabet is unbounded, might be (provably) impossible. Can we use this intuition to shave more log factors for Edit-Distance?

Fréchet Distance falls outside the alignment gadgets framework of [29] and no reduction from Formula-SAT was known before. In Section F we prove such a reduction by a significant boosting of the SETH-lower bound construction of [27]. In order to implement recursive AND/OR gadgets, our new proof utilizes the geometry of the curves, in contrast to [27] which only used ten different points in the plane.

In the remainder of this section we present the details of the reduction to LCS. Some missing proofs can be found in Section C.

2.1 Implementing Gates

Fix vectors (where is the number of inputs to ). In this section we prove the following lemma which demonstrates our main construction.

Lemma 2.3.

For any sufficiently large let . We can inductively construct, for each gate of , strings and over alphabet size and a number such that for we have (1) and (2) if and only if gate evaluates to true on input to . Moreover, we have , where is the subformula of below .

In this construction, we use disjoint size-5 alphabets , determining the total alphabet size as . Each gate is assigned an alphabet . We fix the function later.

In the following, consider any gate of , and write the gate alphabet as . For readability, we write and similarly define . If has fanin 2, write for the children of . Moreover, let and similarly define and .

Input Gate

The base case is an input bit to (input bits are symmetric). Interpreting as a string of length 1 over alphabet , note that . Hence, the strings and , with , trivially simulate the input bit .

AND Gates

Consider an AND gate and let . We construct strings as

Lemma 2.4.

If and the symbols appear at most times in each of , and , then we have .

Later we will choose the gate alphabets such that the precondition of the above lemma is satisfied. Setting we thus inductively obtain (1) and (2) if and only if and both evaluate to true. Thus, we correctly simulated the AND gate . It remains to prove the lemma.

Proof.

Clearly, we have . For the other direction, consider any LCS of . If does not match any symbol of the left half of , , with any symbol of the right half of , , and it does not match any symbol of the right half of , , with any symbol of the left half of , , then we can split both strings in the middle and obtain

Greedy suffix/prefix matching now yields

In the remaining case, there is a matching from some left half to some right half. By symmetry, we can assume that there is a matching from the left half of to the right half of . We can moreover assume that matches a symbol of with a symbol of , since the case that matches a symbol of with a symbol of is symmetric. Now no symbol in in can be matched with a symbol in in . We obtain a rough upper bound on by summing up the LCS length of all remaining pairs of a part in and a part in . This yields , finishing the proof. ∎

OR Gates

Consider an OR gate and again let . We first make the LCS target values equal by adding to the shorter of and , i.e., we set and similarly , , . Note that the resulting strings satisfy and if and only if evaluates to true, and similarly and if and only if evaluates to true. We construct the strings as

Lemma 2.5.

If and the symbols appear at most times in each of , and , then .

Later we will choose the gate alphabets such that the precondition of the above lemma is satisfied. Setting we thus inductively obtain (1) and (2) if and only if at least one of and evaluates to true, so we correctly simulated the OR gate . The proof of the Lemma is in Section C.

Analyzing the Length

Note that the above constructions inductively yield strings simulating each gate . We inductively prove bounds for and . See Section C.

Lemma 2.6.

We have and for any gate , where is the subformula of below .

Fixing the Gate Alphabets

Now we fix the gate alphabet for any gate . Again let , , be disjoint alphabets of size 5, and let . For any gate of , we call its distance to the root the height . For any , order the gates with height from left to right, and let be the index of gate in this order, for any gate with height . Note that is a unique identifier of gate . We define , i.e., we set the gate alphabet of to . Note that the overall alphabet has size . Recall that we set .

It remains to show that the preconditions of Lemmas 2.4 and 2.5 are satisfied. Specifically, consider a gate with children . As before, let be the strings and string length constructed for gate , and let be the corresponding objects for , . We need to show:

  1. , and

  2. each symbol appears at most times in each of , and .

We call a gate in the subformula -deep if , and -shallow otherwise. For each symbol in or we can trace our construction to find the gate in at which we introduced to or . In other words, each symbol in stems from some gate below .

First consider (2). Observe that all symbols in stemming from -shallow gates do not belong to the gate alphabet , since the function has as the first component, which repeats only every levels. Thus, if a symbol occurs in or , then this occurrence stems from a -deep gate. We now argue that only a few symbols in stem from deep gates. For any , let be the number of symbols in (or, equivalently, ) stemming from -deep gates. Note that is equal to the total string length , summed over all gates in with height . Observe that our construction increases the string lengths in each step by at least a factor , i.e., holds for any . It follows that . Hence, each symbol in appears at most times in each of . Since , we have for sufficiently large . This proves (2).

For (1), remove all -deep symbols from and to obtain strings . Note that we removed exactly symbols from each of . This yields . For , we claim that any -shallow gates in have disjoint alphabets . Indeed, if then since the first component of repeats only every levels we have . If , then note that each gate in height has a unique label , since there are such labels and there are at most gates with height in . Hence, and use disjoint alphabets, and we obtain . Thus, . As above, we bound , so that . Since , we have for sufficiently large . This yields (1), since the strings are symmetric. This finishes the proof of Lemma 2.3.

Finalizing the Proof

Let us sketch how we complete the proof of Theorem 2.2. The full details are in Section C.1. First, for all vectors we construct gate gadgets for the output gate of the formula, i.e. formula gadgets, by invoking Lemma 2.3. Then we combine all these gadgets by applying a standard alignment gadget [2, 29] to get our final sequences of length and with alphabet of size . The LCS of the final sequences will be determined by the existence of a satisfying pair. Since a priori the depth of could be as large as , the factor in our length bound is not yet satisfactory. Thus, as a preprocessing step before the above construction, we decrease the depth of using a depth-reduction result of Bonet and Buss [90, 25]: for all there is an equivalent formula with depth at most and size . Choosing the parameters correctly, we get final sequences of length .

3 On the Limitations of Fine-Grained Reductions

With the increasingly complex web of reductions and conjectures used in the “Hardness in P” research, one might object to our use of nonstandard assumptions. Why can’t we base the hardness of shaving logs on one of the more established assumptions such as SETH, or even better, on ? We conclude the paper with a proof that such results are not possible if one is restricted to fine-grained reductions, which is essentially the only tool we have in this line of research.

Let be a problem with best known upper bound of on inputs of size , and let be a problem with best known upper bound of on inputs of size . Throughout this section we assume that these runtimes are non-decreasing functions, such as or . A fine-grained reduction from “solving in time ” to “solving in time ” proves that improving to improves to . Formally, it is an algorithm that solves and it is allowed to call an oracle for problem , as long as the following bound holds. Let be the size of the instance in the call to problem that our algorithm performs, where for some value , and let be the runtime of excluding the time it takes to answer all the instances of problem . It must be that . This is a natural adaptation of the definition of fine-grained reductions from previous works, where the improvements were restricted to be by polynomial factors.

We can now give a formal version of Theorem 1.1 from the introduction. Note that -SAT on variables and clauses can be solved in time .

Theorem 3.1.

If for some and all there is a fine-grained reduction from solving -SAT in time to solving LCS in time , then SETH is false.

Proof.

Assume there was a fine-grained reduction from -SAT to LCS as above. This means that there is an algorithm for -SAT that makes calls to LCS with instances of size such that:

But then consider algorithm which simulates and whenever makes a call to the LCS oracle with an instance of size , our algorithm will execute the known quadratic time solution for LCS that takes time. Let be the size of the largest instance we call, and note that . Simple calculations show that solves -SAT and has a running time of

for all , refuting SETH. ∎
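For concreteness, the "known quadratic time solution for LCS" that the simulating algorithm runs in place of each oracle call can be sketched as the textbook dynamic program below. This is a minimal illustration, not the paper's own code: the function name `lcs_length` is ours, and the sketch returns only the length of a longest common subsequence while keeping a single previous row of the table.

```python
def lcs_length(a, b):
    """Length of a longest common subsequence of sequences a and b.

    Textbook dynamic program: O(len(a) * len(b)) time and
    O(len(b)) space, since only the previous row is kept.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend a common subsequence ending just before x and y.
                cur.append(prev[j - 1] + 1)
            else:
                # Drop the last symbol of either a's prefix or b's prefix.
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(lcs_length("ABCBDAB", "BDCABA"))
```

Replacing every oracle call by this routine is what turns the hypothetical reduction into a stand-alone algorithm whose cost is the reduction's own runtime plus a quadratic term per call.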

Acknowledgements

We are grateful to Avishay Tal for telling us about his algorithm for SAT on bipartite formulas, and for very helpful discussions. We also thank Mohan Paturi, Rahul Santhanam, Srikanth Srinivasan, and Ryan Williams for answering our questions about the state of the art of Formula-SAT algorithms, and Arturs Backurs, Piotr Indyk, Mikkel Thorup, and Virginia Vassilevska Williams for helpful discussions regarding regular expressions. We also thank an anonymous reviewer for ideas leading to shaving off a second log-factor for Formula-Pair, and other reviewers for helpful suggestions.

References

  • [1] A. Abboud, A. Backurs, T. D. Hansen, V. Vassilevska Williams, and O. Zamir. Subtree isomorphism revisited. In Proc. of 27th SODA, pages 1256–1271, 2016.
  • [2] A. Abboud, A. Backurs, and V. Vassilevska Williams. Tight Hardness Results for LCS and other Sequence Similarity Measures. In Proc. of 56th FOCS, pages 59–78, 2015.
  • [3] A. Abboud, T. D. Hansen, V. V. Williams, and R. Williams. Simulating branching programs with edit distance and friends: or: a polylog shaved is a lower bound made. In Proc. of the 48th STOC, pages 375–388, 2016.
  • [4] A. Abboud and V. Vassilevska Williams. Popular conjectures imply strong lower bounds for dynamic problems. In Proc. of 55th FOCS, pages 434–443, 2014.
  • [5] A. Abboud, V. Vassilevska Williams, and J. R. Wang. Approximation and fixed parameter subquadratic algorithms for radius and diameter in sparse graphs. In Proc. of 27th SODA, pages 377–391, 2016.
  • [6] A. Abboud, V. Vassilevska Williams, and O. Weimann. Consequences of faster sequence alignment. In Proc. of 41st ICALP, pages 39–51, 2014.
  • [7] A. Abboud, V. Vassilevska Williams, and H. Yu. Matching triangles and basing hardness on an extremely popular conjecture. In Proc. of 47th STOC, pages 41–50, 2015.
  • [8] A. Abboud, R. Williams, and H. Yu. More applications of the polynomial method to algorithm design. In Proc. of 26th SODA, pages 218–230, 2015.
  • [9] P. Agarwal, R. B. Avraham, H. Kaplan, and M. Sharir. Computing the discrete Fréchet distance in subquadratic time. In Proc. 24th ACM-SIAM Symposium on Discrete Algorithms (SODA’13), pages 156–167, 2013.
  • [10] L. Allison and T. I. Dix. A bit-string longest-common-subsequence algorithm. Information Processing Letters, 23(5):305–310, 1986.
  • [11] H. Alt and M. Buchin. Can we compute the similarity between surfaces? Discrete & Computational Geometry, 43(1):78–99, 2010.
  • [12] H. Alt and M. Godau. Computing the Fréchet distance between two polygonal curves. International Journal of Computational Geometry & Applications, 5(1-2):78–99, 1995.
  • [13] H. Alt, C. Knauer, and C. Wenk. Comparison of distance measures for planar curves. Algorithmica, 38(1):45–58, 2004.
  • [14] S. F. Altschul, T. L. Madden, A. A. Schäffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. Gapped blast and psi-blast: a new generation of protein database search programs. Nucleic acids research, 25(17):3389–3402, 1997.
  • [15] A. E. Andreev. About one method of obtaining more than quadratic effective lower bounds of complexity of pi-schemes, 1987.
  • [16] V. Arlazarov, E. Dinic, I. Faradzev, and M. Kronrod. On economic construction of the transitive closure of a direct graph. In Sov. Math (Doklady), volume 11, pages 1209–1210, 1970.
  • [17] B. Aronov, S. Har-Peled, C. Knauer, Y. Wang, and C. Wenk. Fréchet distance for curves, revisited. In Proc. 14th Annual European Symposium on Algorithms (ESA’06), volume 4168 of LNCS, pages 52–63. Springer, 2006.
  • [18] A. Backurs and P. Indyk. Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false). In Proc. of 47th STOC, pages 51–58, 2015.
  • [19] A. Backurs and P. Indyk. Which regular expression patterns are hard to match? In FOCS, 2016.
  • [20] N. Bansal and R. Williams. Regularity lemmas and combinatorial algorithms. In Proc. of 50th FOCS, pages 745–754, 2009.
  • [21] I. Baran, E. D. Demaine, and M. Patrascu. Subquadratic algorithms for 3sum. Algorithmica, 50(4):584–596, 2008.
  • [22] P. Beame, R. Impagliazzo, and S. Srinivasan. Approximating AC^0 by small height decision trees and a deterministic algorithm for #AC^0SAT. In Computational Complexity (CCC), 2012 IEEE 27th Annual Conference on, pages 117–125. IEEE, 2012.
  • [23] P. Bille and M. Farach-Colton. Fast and compact regular expression matching. Theoretical Computer Science, 409(3):486 – 496, 2008.
  • [24] P. Bille and M. Thorup. Faster regular expression matching. In International Colloquium on Automata, Languages, and Programming, pages 171–182. Springer, 2009.
  • [25] M. L. Bonet and S. R. Buss. Size-depth tradeoffs for boolean formulae. Inf. Process. Lett., 49(3):151–155, 1994.
  • [26] S. Brakatsoulas, D. Pfoser, R. Salas, and C. Wenk. On map-matching vehicle tracking data. In Proc. 31st International Conference on Very Large Data Bases (VLDB’05), pages 853–864, 2005.
  • [27] K. Bringmann. Why walking the dog takes time: Frechet distance has no strongly subquadratic algorithms unless seth fails. In Proc. of 55th FOCS, pages 661–670, 2014.
  • [28] K. Bringmann, A. Grønlund, and K. G. Larsen. A dichotomy for regular expression membership testing. CoRR, abs/1611.00918, 2016.
  • [29] K. Bringmann and M. Künnemann. Quadratic Conditional Lower Bounds for String Problems and Dynamic Time Warping. In Proc. of 56th FOCS, pages 79–97, 2015.
  • [30] K. Buchin, M. Buchin, J. Gudmundsson, M. Löffler, and J. Luo. Detecting commuting patterns by clustering subtrajectories. International Journal of Computational Geometry & Applications, 21(3):253–282, 2011.
  • [31] K. Buchin, M. Buchin, W. Meulemans, and W. Mulzer. Four soviets walk the dog - with an application to Alt’s conjecture. In Proc. 25th ACM-SIAM Symposium on Discrete Algorithms (SODA’14), pages 1399–1413, 2014.
  • [32] K. Buchin, M. Buchin, and Y. Wang. Exact algorithms for partial curve matching via the Fréchet distance. In Proc. 20th ACM-SIAM Symposium on Discrete Algorithms (SODA’09), pages 645–654, 2009.
  • [33] M. Cairo, R. Grossi, and R. Rizzi. New bounds for approximating extremal distances in undirected graphs. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 363–376, 2016.
  • [34] C. Calabro, R. Impagliazzo, and R. Paturi. A duality between clause width and clause density for SAT. In Proc. of 21st CCC, pages 252–260, 2006.
  • [35] C. Calabro, R. Impagliazzo, and R. Paturi. The complexity of satisfiability of small depth circuits. In Proc. of 4th IWPEC, pages 75–85, 2009.
  • [36] E. W. Chambers, É. Colin de Verdière, J. Erickson, S. Lazard, F. Lazarus, and S. Thite. Homotopic Fréchet distance between curves or, walking your dog in the woods in polynomial time. Computational Geometry, 43(3):295–311, 2010.
  • [37] T. M. Chan. The art of shaving logs. In Proc. of the 13th WADS, page 231, 2013.
  • [38] T. M. Chan. Speeding up the four russians algorithm by about one more logarithmic factor. In Proc. of 26th SODA, pages 212–217, 2015.
  • [39] T. M. Chan. More logarithmic-factor speedups for 3sum, (median,+)-convolution, and some geometric 3sum-hard problems. In 29th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’18), pages 881–897, 2018.
  • [40] K. Chatterjee, W. Dvorák, M. Henzinger, and V. Loitzenbauer. Model and objective separation with conditional lower bounds: Disjunction is harder than conjunction. CoRR, abs/1602.02670, 2016.
  • [41] R. Chen. Satisfiability algorithms and lower bounds for boolean formulas over finite bases. In International Symposium on Mathematical Foundations of Computer Science, pages 223–234. Springer, 2015.
  • [42] R. Chen and V. Kabanets. Correlation bounds and #sat algorithms for small linear-size circuits. In Computing and Combinatorics - 21st International Conference, COCOON 2015, Beijing, China, August 4-6, 2015, Proceedings, pages 211–222, 2015.
  • [43] R. Chen, V. Kabanets, A. Kolokolova, R. Shaltiel, and D. Zuckerman. Mining circuit lower bound proofs for meta-algorithms. computational complexity, 24(2):333–392, 2015.
  • [44] R. Chen, V. Kabanets, and N. Saurabh. An improved deterministic #SAT algorithm for small de morgan formulas. In International Symposium on Mathematical Foundations of Computer Science, pages 165–176. Springer, 2014.
  • [45] R. Chen and R. Santhanam. Satisfiability on mixed instances. In Proceedings of the 2016 ACM Conference on Innovations in Theoretical Computer Science, Cambridge, MA, USA, January 14-16, 2016, pages 393–402, 2016.
  • [46] A. F. Cook and C. Wenk. Geodesic Fréchet distance inside a simple polygon. ACM Transactions on Algorithms, 7(1):193–204, 2010.
  • [47] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to algorithms, volume 6. MIT press Cambridge, 2001.
  • [48] M. Crochemore, C. S. Iliopoulos, Y. J. Pinzon, and J. F. Reid. A fast and practical bit-vector algorithm for the longest common subsequence problem. Information Processing Letters, 80(6):279–285, 2001.
  • [49] M. Crochemore, G. M. Landau, and M. Ziv-Ukelson. A subquadratic sequence alignment algorithm for unrestricted scoring matrices. SIAM journal on computing, 32(6):1654–1673, 2003.
  • [50] E. Dantsin and E. A. Hirsch. Worst-case upper bounds. In Handbook of Satisfiability, pages 403–424. 2009.
  • [51] E. Dantsin and A. Wolpert. Exponential complexity of satisfiability testing for linear-size boolean formulas. In International Conference on Algorithms and Complexity, pages 110–121. Springer, 2013.
  • [52] A. Driemel and S. Har-Peled. Jaywalking your dog: computing the Fréchet distance with shortcuts. SIAM Journal on Computing, 42(5):1830–1866, 2013.
  • [53] A. Driemel, S. Har-Peled, and C. Wenk. Approximating the Fréchet distance for realistic curves in near linear time. Discrete & Computational Geometry, 48(1):94–127, 2012.
  • [54] T. Eiter and H. Mannila. Computing discrete Fréchet distance. Technical Report CD-TR 94/64, Christian Doppler Laboratory for Expert Systems, TU Vienna, Austria, 1994.
  • [55] A. Freund. Improved subquadratic 3sum. Algorithmica, 77(2):440–458, 2017.
  • [56] P. Gawrychowski. Faster algorithm for computing the edit distance between slp-compressed strings. In International Symposium on String Processing and Information Retrieval, pages 229–236. Springer, 2012.
  • [57] M. Godau. A natural metric for curves - computing the distance for polygonal chains and approximation algorithms. In Proc. 8th Symposium on Theoretical Aspects of Computer Science (STACS’91), volume 480 of LNCS, pages 127–136. Springer, 1991.
  • [58] O. Gold and M. Sharir. Improved Bounds for 3SUM, k-SUM, and Linear Degeneracy. In 25th Annual European Symposium on Algorithms (ESA 2017), volume 87, pages 42:1–42:13, 2017.
  • [59] A. Golovnev, A. S. Kulikov, A. Smal, and S. Tamaki. Circuit size lower bounds and #SAT upper bounds through a general framework. In Electronic Colloquium on Computational Complexity (ECCC), volume 23, page 22, 2016.
  • [60] S. Grabowski. New tabulation and sparse dynamic programming based techniques for sequence similarity problems. In Stringology, pages 202–211, 2014.
  • [61] A. Grønlund and S. Pettie. Threesomes, degenerates, and love triangles. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, Philadelphia, PA, USA, October 18-21, 2014, pages 621–630, 2014.
  • [62] J. Håstad. The shrinkage exponent of de morgan formulas is 2. SIAM J. Comput., 27(1):48–64, 1998.
  • [63] J. Hopcroft, W. Paul, and L. Valiant. On time versus space. Journal of the ACM (JACM), 24(2):332–337, 1977.
  • [64] H. Hyyrö. Bit-parallel lcs-length computation revisited. In Proc. 15th Australasian Workshop on Combinatorial Algorithms (AWOCA 2004), pages 16–27. Citeseer, 2004.
  • [65] R. Impagliazzo, S. Lovett, R. Paturi, and S. Schneider. 0-1 integer linear programming with a linear number of constraints. Electronic Colloquium on Computational Complexity (ECCC), 21:24, 2014.
  • [66] R. Impagliazzo, W. Matthews, and R. Paturi. A satisfiability algorithm for AC^0. In Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete Algorithms, pages 961–972. SIAM, 2012.
  • [67] R. Impagliazzo and N. Nisan. The effect of random restrictions on formula size. Random Structures & Algorithms, 4(2):121–133, 1993.
  • [68] R. Impagliazzo and R. Paturi. On the complexity of k-sat. Journal of Computer and System Sciences, 62(2):367–375, 2001.
  • [69] R. Impagliazzo, R. Paturi, and S. Schneider. A satisfiability algorithm for sparse depth two threshold circuits. In 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2013, 26-29 October, 2013, Berkeley, CA, USA, pages 479–488, 2013.
  • [70] P. Indyk. Approximate nearest neighbor algorithms for Fréchet distance via product metrics. In Proc. 18th Annual Symposium on Computational Geometry (SoCG’02), pages 102–106, 2002.
  • [71] V. M. Khrapchenko. Method of determining lower bounds for the complexity of p-schemes. Mathematical Notes, 10(1):474–479, 1971.
  • [72] I. Komargodski, R. Raz, and A. Tal. Improved average-case lower bounds for demorgan formula size. In Foundations of Computer Science (FOCS), 2013 IEEE 54th Annual Symposium on, pages 588–597. IEEE, 2013.
  • [73] I. T. Li, W. Shum, and K. Truong. 160-fold acceleration of the smith-waterman algorithm using a field programmable gate array (fpga). BMC bioinformatics, 8(1):1, 2007.
  • [74] Y. Liu, A. Wirawan, and B. Schmidt. Cudasw++ 3.0: accelerating smith-waterman protein database search by coupling cpu and gpu simd instructions. BMC bioinformatics, 14(1):1, 2013.
  • [75] A. Maheshwari, J.-R. Sack, K. Shahbaz, and H. Zarrabi-Zadeh. Fréchet distance with speed limits. Computational Geometry, 44(2):110–120, 2011.
  • [76] W. J. Masek and M. S. Paterson. A faster algorithm computing string edit distances. Journal of Computer and System sciences, 20(1):18–31, 1980.
  • [77] D. Moeller, R. Paturi, and S. Schneider. Subquadratic algorithms for succinct stable matching. In Computer Science - Theory and Applications - 11th International Computer Science Symposium in Russia, CSR 2016, St. Petersburg, Russia, June 9-13, 2016, Proceedings, pages 294–308, 2016.
  • [78] M. E. Munich and P. Perona. Continuous dynamic time warping for translation-invariant curve alignment with applications to signature verification. In Proc. 7th IEEE International Conference on Computer Vision, volume 1, pages 108–115, 1999.
  • [79] G. Myers. A four russians algorithm for regular expression pattern matching. Journal of the ACM (JACM), 39(2):432–448, 1992.
  • [80] I. C. Oliveira. Algorithms versus circuit lower bounds. arXiv preprint arXiv:1309.0249, 2013.
  • [81] M. S. Paterson and U. Zwick. Shrinkage of de morgan formulae under restriction. Random Structures & Algorithms, 4(2):135–150, 1993.
  • [82] M. Patrascu and R. Williams. On the possibility of faster SAT algorithms. In Proc. of 21st SODA, pages 1065–1075, 2010.
  • [83] R. Paturi, P. Pudlák, M. E. Saks, and F. Zane. An improved exponential-time algorithm for k-sat. J. ACM, 52(3):337–364, 2005.
  • [84] L. Roditty and V. Vassilevska Williams. Fast approximation algorithms for the diameter and radius of sparse graphs. In Proc. of 45th STOC, pages 515–524, 2013.
  • [85] T. Sakai, K. Seto, S. Tamaki, and J. Teruyama. A satisfiability algorithm for depth-2 circuits with a symmetric gate at the top and and gates at the bottom. In Electronic Colloquium on Computational Complexity (ECCC), 2015.
  • [86] R. Santhanam. Fighting perebor: New and improved algorithms for formula and QBF satisfiability. In Proc. of the 51st FOCS, pages 183–192, 2010.
  • [87] R. Santhanam et al. Ironic complicity: Satisfiability algorithms and circuit lower bounds. Bulletin of EATCS, 1(106), 2013.
  • [88] K. Seto and S. Tamaki. A satisfiability algorithm and average-case hardness for formulas over the full binary basis. computational complexity, 22(2):245–274, 2013.
  • [89] T. F. Smith and M. S. Waterman. Identification of common molecular subsequences. Journal of molecular biology, 147(1):195–197, 1981.
  • [90] P. M. Spira. On time-hardware complexity tradeoffs for boolean functions. In Proceedings of the 4th Hawaii Symposium on System Sciences, pages 525–527, 1971.
  • [91] B. A. Subbotovskaya. Realizations of linear functions by formulas using+. Doklady Akademii Nauk SSSR, 136(3):553–555, 1961.
  • [92] A. Tal. Shrinkage of de morgan formulae by spectral techniques. In Foundations of Computer Science (FOCS), 2014 IEEE 55th Annual Symposium on, pages 551–560. IEEE, 2014.
  • [93] A. Tal. #sat algorithms from shrinkage. Electronic Colloquium on Computational Complexity (ECCC), 22:114, 2015.
  • [94] K. Thompson. Programming techniques: Regular expression search algorithm. Communications of the ACM, 11(6):419–422, 1968.
  • [95] R. A. Wagner and M. J. Fischer. The string-to-string correction problem. Journal of the ACM (JACM), 21(1):168–173, 1974.
  • [96] R. Williams. A new algorithm for optimal 2-constraint satisfaction and its implications. Theoretical Computer Science, 348(2):357–365, 2005.
  • [97] R. Williams. Improving exhaustive search implies superpolynomial lower bounds. SIAM Journal on Computing, 42(3):1218–1244, 2013.
  • [98] R. Williams. Algorithms for Circuits and Circuits for Algorithms: Connecting the Tractable and Intractable. In Proceedings of the International Congress of Mathematicians, 2014.
  • [99] R. Williams. Faster all-pairs shortest paths via circuit complexity. In Proc. of 46th STOC, pages 664–673, 2014.
  • [100] R. Williams. New algorithms and lower bounds for circuits with linear threshold gates. In Proceedings of the 46th Annual ACM Symposium on Theory of Computing, pages 194–202. ACM, 2014.
  • [101] R. Williams. Nonuniform ACC circuit lower bounds. J. ACM, 61(1):2:1–2:32, 2014.
  • [102] H. Yu. An improved combinatorial algorithm for boolean matrix multiplication. In Proc. of 42nd ICALP, pages 1094–1105, 2015.

Appendix A Discussion

As shown above, the popular conjectures are not fine-grained enough for our purposes and our only viable option is to start from assumptions about the hardness of shaving logs for some problem. The approach taken in this paper and in [3] is to start with variants of SAT. Another option would have been to conjecture that -SUM cannot be solved in time, but SAT has several advantages. First, SAT is deeply connected to fundamental topics in complexity theory, which allows us to borrow barriers that complexity theorists have faced for decades. Moreover, there is a vast number of combinatorial problems that we can reduce SAT to, whereas -SUM seems more useful in geometric contexts, e.g. -SUM-hardness for LCS and Fréchet might be impossible [31]. Thus, for the task of proving barriers for shaving logs, our approach seems as good as any.

Conditional lower bounds can even lead to better algorithms, by suggesting regimes of possible improvements. Phrased this way, our results leave the open problem of finding a time algorithm for LCS, and perhaps more interestingly, shaving many more logs for the related-but-different Edit-Distance problem. The longstanding upper bound for Edit-Distance is [76] and our approach does not give barriers higher than .

Finally, regardless of the consequences of our reductions, we think that the statements themselves are intrinsically interesting as they reveal a surprisingly close connection between Formula-SAT (a problem typically studied by complexity theorists) and combinatorial problems that are typically studied by stringologists, computational biologists, and computational geometers, which are a priori completely different creatures. The runtime of the standard algorithm for SAT can be recovered almost exactly by encoding the formula into an LCS, Fréchet, or Pattern Matching instance and running the standard dynamic programming algorithms!

Appendix B From Formula-SAT to Formula-Pair

In this section we show a chain of simple reductions starting from variants of Formula-SAT, which have time complexity, and ending at time variants of a problem we call Formula-Pair.

A formula of size over variables is in the class iff it has the following properties. The gates in the first layer (nodes in the tree whose children are all leaves) compute arbitrary functions , as long as can be computed in time and all children of a gate are marked with variables in or with variables in but not with both. W.l.o.g. we can assume that the inputs are only connected to nodes in the first layer. The gates in the other layers compute deMorgan gates, i.e., OR and AND gates. The size of is considered to be the number of gates in the first layer. Since is a formula and thus has fanout 1, our size measure is up to constant factors equal to the total number of all gates except the inputs. Note that the complexity of the functions in the first layer and their number of incoming wires, i.e. the number of leaves in the tree, do not count towards the size of .
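To make the shape of this formula class concrete, the following is a small sketch of exhaustive search over such a formula. Everything about the encoding is an assumption made for illustration: formulas are nested tuples, inner gates have fan-in two, a first-layer gate is `("LEAF", side, fn)` where `side` is `"A"` or `"B"` and `fn` is an arbitrary Python predicate on that half of the variables, and `half` is the number of variables in each half. The names `evaluate` and `brute_force_sat` are hypothetical.

```python
from itertools import product


def evaluate(node, a, b):
    """Evaluate a formula on the two half-assignments a and b."""
    kind = node[0]
    if kind == "LEAF":
        # A first-layer gate reads variables from exactly one half.
        _, side, fn = node
        return bool(fn(a if side == "A" else b))
    _, left, right = node
    if kind == "AND":
        return evaluate(left, a, b) and evaluate(right, a, b)
    if kind == "OR":
        return evaluate(left, a, b) or evaluate(right, a, b)
    raise ValueError(f"unknown gate type: {kind}")


def brute_force_sat(formula, half):
    """Try all pairs of half-assignments; return a satisfying pair or None."""
    for a in product((0, 1), repeat=half):
        for b in product((0, 1), repeat=half):
            if evaluate(formula, a, b):
                return a, b
    return None


if __name__ == "__main__":
    f = ("AND",
         ("LEAF", "A", lambda a: a[0] ^ a[1]),     # parity of the first half
         ("LEAF", "B", lambda b: b[0] and b[1]))   # conjunction on the second half
    print(brute_force_sat(f, 2))
```

The sketch enumerates all pairs of half-assignments and evaluates the formula on each pair, which is the exhaustive-search baseline that faster algorithms would have to beat.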


-Formula-SAT
Input: Formula of size with inputs from the class
Question: Exist such that ?
Complexity: , even restricted to