SETH-Based Lower Bounds for Subset Sum and Bicriteria Path

Amir Abboud, Department of Computer Science,
Stanford University, CA, USA
abboud@cs.stanford.edu
   Karl Bringmann, Max Planck Institute for Informatics,
Saarland Informatics Campus, Germany
kbringma@mpi-inf.mpg.de
   Danny Hermelin, Department of Industrial Engineering and Management,
Ben-Gurion University, Israel
hermelin@bgu.ac.il
   Dvir Shabtay, Department of Industrial Engineering and Management,
Ben-Gurion University, Israel
dvirs@bgu.ac.il
Abstract

Subset Sum and $k$-SAT are two of the most extensively studied problems in computer science, and conjectures about their hardness are among the cornerstones of fine-grained complexity. One of the most intriguing open problems in this area is to base the hardness of one of these problems on the other.

Our main result is a tight reduction from $k$-SAT to Subset Sum on dense instances, proving that Bellman's 1962 pseudo-polynomial $O(Tn)$-time algorithm for Subset Sum on $n$ numbers and target $T$ cannot be improved to time $T^{1-\varepsilon} \cdot 2^{o(n)}$ for any $\varepsilon > 0$, unless the Strong Exponential Time Hypothesis (SETH) fails. This is one of the strongest known connections between any two of the core problems of fine-grained complexity.

As a corollary, we prove a “Direct-OR” theorem for Subset Sum under SETH, offering a new tool for proving conditional lower bounds: It is now possible to assume that deciding whether one out of $N$ given instances of Subset Sum is a YES instance requires time $(N \cdot T)^{1-o(1)}$. As an application of this corollary, we prove a tight SETH-based lower bound for the classical Bicriteria $s,t$-Path problem, which is extensively studied in Operations Research. We separate its complexity from that of Subset Sum: On graphs with $m$ edges and edge lengths bounded by $L$, we show that the $O(Lm)$ pseudo-polynomial time algorithm by Joksch from 1966 cannot be improved to $\tilde{O}(L + m)$, in contrast to a recent improvement for Subset Sum (Bringmann, SODA 2017).

1 Introduction

The field of fine-grained complexity is anchored around certain hypotheses about the exact time complexity of a small set of core problems. Due to dozens of ingenious reductions, we now know that the current algorithms for many important problems are optimal unless breakthrough algorithms for the core problems exist. A central challenge in this field is to understand the connections and relative difficulties among these core problems. In this work, we discover one of the strongest connections to date between two core problems: a tight reduction from $k$-SAT to Subset Sum.

In the first part of the introduction we discuss this new reduction and how it affects the landscape of fine-grained complexity. Then, in Section 1.2, we highlight a corollary of this reduction which gives a new tool for proving conditional lower bounds. As an application, in Section 1.3, we prove the first tight bounds for the classical Bicriteria $s,t$-Path problem from Operations Research.

Subset Sum.

Subset Sum is one of the most fundamental problems in computer science. Its most basic form is: given $n$ integers $x_1, \dots, x_n$ and a target value $T$, decide whether there is a subset of the numbers that sums to $T$. The two most classical algorithms for the problem are the pseudo-polynomial $O(Tn)$ algorithm using dynamic programming [29], and the $O^*(2^{n/2})$ algorithm via “meet-in-the-middle” [66]. A central open question in Exact Algorithms [111] is whether faster algorithms exist, e.g. can we combine the two approaches to get a $T^{1/2} \cdot \mathrm{poly}(n)$ time algorithm? Such a bound was recently found in a Merlin-Arthur setting [91].
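To make these two classical baselines concrete, here is a minimal Python sketch of both approaches; the function names and interfaces are ours, not from the cited works.

```python
from bisect import bisect_left

def subset_sum_dp(xs, T):
    """Bellman-style O(T*n) dynamic program: reach[s] is True iff some
    subset of the numbers seen so far sums to exactly s."""
    reach = [False] * (T + 1)
    reach[0] = True
    for x in xs:
        # iterate downwards so each number is used at most once
        for s in range(T, x - 1, -1):
            if reach[s - x]:
                reach[s] = True
    return reach[T]

def subset_sum_mitm(xs, T):
    """Meet-in-the-middle in O*(2^(n/2)): enumerate all subset sums of
    each half, sort one side, and binary-search for complements."""
    half = len(xs) // 2

    def all_sums(arr):
        sums = [0]
        for x in arr:
            sums += [s + x for s in sums]
        return sums

    right = sorted(all_sums(xs[half:]))
    for s in all_sums(xs[:half]):
        i = bisect_left(right, T - s)
        if i < len(right) and right[i] == T - s:
            return True
    return False
```

Open Question 1 below asks, in essence, whether the dependence on $T$ in the first routine or on $2^{n/2}$ in the second can be beaten.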

Open Question 1

Is Subset Sum in time $T^{1-\varepsilon} \cdot \mathrm{poly}(n)$ or $O^*(2^{(1/2-\varepsilon)n})$, for some $\varepsilon > 0$?

The status of Subset Sum as a major problem has been established due to many applications, deep connections to other fields, and educational value. The $O(Tn)$ algorithm from 1962 is an illuminating example of dynamic programming that is taught in every undergraduate algorithms course, and the NP-hardness proof (from Karp's original paper [75]) is a prominent example of a reduction to a problem on numbers. Interestingly, one of the earliest cryptosystems, by Merkle and Hellman, was based on Subset Sum [89], and was later extended to a host of Knapsack-type cryptosystems (cryptographers usually refer to Subset Sum as Knapsack; see [103, 32, 94, 46, 68] and the references therein).

The version of Subset Sum where we ask for $k$ numbers that sum to zero (the $k$-SUM problem) is conjectured to have $n^{\lceil k/2 \rceil \pm o(1)}$ time complexity. Most famously, the $k = 3$ case is the 3-SUM conjecture highlighted in the seminal work of Gajentaan and Overmars [56]. It has been shown that this problem lies at the core and captures the difficulty of dozens of problems in computational geometry. Searching in Google Scholar for “3sum-hard” reveals more than 250 papers (see [77] for a highly partial list). More recently, these conjectures have become even more prominent as core problems in fine-grained complexity, since their interesting consequences have expanded beyond geometry into purely combinatorial problems [99, 110, 38, 71, 12, 6, 80, 7, 61]. Note that $k$-SUM inherits its hardness from Subset Sum, by a simple reduction: to answer Open Question 1 positively it is enough to solve $k$-SUM in $O(n^{\lceil k/2 \rceil - \varepsilon})$ or $T^{1-\varepsilon} \cdot \mathrm{poly}(n)$ time.

Entire books [88, 76] are dedicated to the algorithmic approaches that have been used to attack Subset Sum throughout many decades, and, quite astonishingly, major algorithmic advances are still being discovered in our days, e.g. [85, 67, 27, 51, 14, 105, 74, 60, 44, 16, 17, 82, 55, 23, 91, 79, 34], not to mention the recent developments on generalized versions (see [25]) and other computational models (see [104, 41]). Only at this upcoming STOC, an algorithm is presented that beats the trivial $2^n$ bound while using polynomial space [23]. Slightly earlier, at SODA'17, we have seen the first improvements (beyond log factors [97]) over the $O(Tn)$ algorithm, reducing the bound to $\tilde{O}(T + n)$ [79, 34]. And a few years earlier, a surprising result celebrated by cryptographers [67, 27] showed that $O(2^{0.291n})$ time algorithms are possible on random instances. All this progress leads to the feeling that a positive resolution to Open Question 1 might be just around the corner.

SETH.

$k$-SAT is an equally fundamental problem (if not more so), but of a Boolean rather than additive nature: we are given a $k$-CNF formula on $n$ variables and $m$ clauses, and the task is to decide whether it is satisfiable. All known algorithms have a running time of the form $O(2^{(1-c/k)n})$ for some constant $c > 0$ [96, 49, 8], and the Strong Exponential Time Hypothesis (SETH) of Impagliazzo and Paturi [69, 70, 39] states that there is no $\varepsilon > 0$, independent of $k$, such that $k$-SAT can be solved in $O(2^{(1-\varepsilon)n})$ time for all $k$. Refuting SETH implies advances in circuit complexity [72], and is known to be impossible with popular techniques like resolution [26].

A seminal paper of Cygan, Dell, Lokshtanov, Marx, Nederlof, Okamoto, Paturi, Saurabh, and Wahlström [48] strives to classify the exact complexity of important NP-hard problems under SETH. The authors design a large collection of ingenious reductions and conclude that $O(2^{(1-\varepsilon)n})$ time algorithms for problems like Hitting Set, Set Splitting, and Not-All-Equal SAT are impossible under SETH. Notably, Subset Sum is not in this list, nor is any problem for which the known algorithms are non-trivial (e.g. require dynamic programming). As the authors point out: “Of course, we would also like to show tight connections between SETH and the optimal growth rates of problems that do have non-trivial exact algorithms.”

Since the work of Cygan et al. [48], SETH has enjoyed great success as a basis for lower bounds in Parameterized Complexity [84] and for problems within P [109]. Some of the most fundamental problems on strings (e.g. [9, 18, 2, 36, 19, 35]), graphs (e.g. [83, 101, 6, 57]), curves (e.g. [33]), vectors [107, 108, 20, 30] and trees [1] have been shown to be SETH-hard: a small improvement to the running time of these problems would refute SETH. Despite the remarkable quantity and diversity of these results, we are yet to see a (tight) reduction from SAT to any problem like Subset Sum, where the complexity comes from the hardness of analyzing a search space defined by addition of numbers. In fact, all hardness results for problems of a more number theoretic or additive combinatoric flavor are based on the conjectured hardness of Subset Sum itself.

In this paper, we address one of the most intriguing open questions in the field of fine-grained complexity: Can we prove a tight SETH-based lower bound for Subset Sum?

The standard NP-hardness proofs imply loose lower bounds under SETH (in fact, under the weaker ETH), stating that $T^{o(1)} \cdot 2^{o(n)}$ time algorithms are impossible. A stronger but still loose result of Patrascu and Williams [95] shows that if we solve $k$-SUM in $n^{o(k)}$ time, then ETH is false. These results leave the possibility of, say, $O(T^{0.1})$ time algorithms, and are not satisfactory from the viewpoint of fine-grained complexity. While it is completely open whether such algorithms imply new SAT algorithms, it has been shown that they would imply new algorithms for other famous problems. Bringmann [34] recently observed that an $O(T^{1-\varepsilon})$ time algorithm for Subset Sum implies a new algorithm for $k$-Clique, via a reduction of Abboud, Lewi, and Williams [5]. To get a tight lower bound ruling out $T^{1-\varepsilon}$ time algorithms, while overcoming the difficulty of reducing SETH to NP-hard problems with dynamic programming algorithms, Cygan et al. [48] were forced to introduce a new untested conjecture, essentially stating that the Set Cover problem on $m$ sets over a universe of size $n$ cannot be solved in $O^*(2^{(1-\varepsilon)n})$ time. More than five years after its introduction, this Set Cover Conjecture has not found other major consequences besides this lower bound for Subset Sum, and a lower bound for the Steiner Tree problem; other corollaries of the conjecture include tight lower bounds for less famous problems like Connected Vertex Cover and Set Partitioning [48]. Whether this conjecture can be based on or replaced by the much more popular SETH remains a major open question.

1.1 Main Result

The dream theorem in this context would be that SETH implies a negative resolution to Open Question 1. Our main result accomplishes half of this statement, showing a completely tight reduction from SAT to Subset Sum on instances where $T = 2^{\Theta(n)}$, also known as dense instances (the density of an instance is the ratio $n / \log_2 T$), ruling out $T^{1-\varepsilon} \cdot 2^{o(n)}$ time algorithms under SETH.

Theorem 1.1

Assuming SETH, for any $\varepsilon > 0$ there exists a $\delta > 0$ such that Subset Sum is not in time $O(T^{1-\varepsilon} \cdot 2^{\delta n})$, and $k$-Sum is not in time $O(T^{1-\varepsilon} \cdot n^{\delta k})$.

Thus, Subset Sum is yet another SETH-hard problem. This is certainly a major addition to this list, and some people might even consider it the most exciting member. This also adds many other problems that have reductions from Subset Sum, e.g. the famous Knapsack problem, or from $k$-SUM (e.g. [54, 31, 4, 43, 78]). For some of these problems, to be discussed shortly, this even leads to better lower bounds.

Getting a completely tight reduction that also rules out $O^*(2^{(1/2-\varepsilon)n})$ algorithms under SETH is still a fascinating open question. Notably, the strongest possible reduction, ruling out $O(n^{\lceil k/2 \rceil - \varepsilon})$ time algorithms for $k$-SUM, is provably impossible under the Nondeterministic SETH of Carmosino et al. [42], but there is no such barrier for an $n^{\delta k}$ lower bound, for a small constant $\delta > 0$.

Theorem 1.1 shows that the Subset Sum lower bound of Cygan et al. [48] can be based on SETH rather than the Set Cover Conjecture, which, in some sense, makes the latter conjecture a little less interesting. As a result, the only major problems that enjoy tight lower bounds under this conjecture, but not under SETH, are Steiner Tree and Set Cover itself. Our work highlights the following open question: Can we either prove SETH-based lower bounds for Set Cover and Steiner Tree, or find faster exact algorithms for them?

A substantial technical barrier that we had to overcome when designing our reduction is the fact that there was no clear understanding of what the hard instances of Subset Sum should look like. Significant effort has been put into finding and characterizing the instances of Subset Sum and Knapsack that are hard to solve. This is challenging both from an experimental viewpoint (see the study of Pisinger [98]) and from the worst-case analysis perspective (see the discussion of Nederlof et al. [17]). Recent breakthroughs refute the common belief that random instances are maximally hard [67, 27], and show that better upper bounds are possible for various classes of inputs. Our reduction is able to generate hard instances by crucially relying on a deep result on the combinatorics of numbers: the existence of dense sum-free sets. A surprising construction of these sets from 1946 due to Behrend [28] (see also [52, 93]) has already led to breakthroughs in various areas of theoretical computer science [45, 47, 10, 64, 50, 3]. These are among the most non-random-like structures in combinatorics, and therefore allow our instances to bypass the easiness of random inputs. This leads us to a candidate distribution of hard instances for Subset Sum, which could be of independent interest: Start from hard instances of SAT (e.g. random formulas around the threshold) and map them with our reduction (the obtained distribution over numbers will be highly structured).

Very recently, it was shown that the security of certain cryptographic primitives can be based on SETH [21, 22]. We hope that our SETH-hardness for an already popular problem in cryptography will lead to further interaction between fine-grained complexity and cryptography. In particular, it would be exciting if our hard instances could be used for a new Knapsack-type cryptosystem. Such schemes tend to be much more computationally efficient than popular schemes like RSA [32, 94, 68], but almost all known ones are not secure (as famously shown by Shamir [103]). Even more recently, and independently, Bennett, Golovnev, and Stephens-Davidowitz [30] proved SETH hardness for another central problem from cryptography, the Closest-Vector-Problem (CVP). While CVP is a harder problem than Subset Sum, their hardness result addresses a different regime of parameters, and rules out $2^{(1-\varepsilon)n}$ time algorithms, where $n$ is the dimension (when the numbers involved can be large). It would be exciting to combine the two techniques and get a completely tight lower bound for CVP.

1.2 A Direct-OR Theorem for Subset Sum

Some readers might find the above result unnecessary: What is the value in a SETH-based lower bound if we already believe the Set Cover Conjecture of Cygan et al.? The rest of this introduction discusses new lower bound results that, to our knowledge, would not have been possible without our new SETH lower bound. To clarify what we mean, consider the following “Direct-OR” version of Subset Sum: Given $N$ different and independent instances of Subset Sum, each on at most $n$ numbers and each with a different target $T_i \le T$, decide whether any of them is a YES instance. It is natural to expect the time complexity of this problem to be roughly $N \cdot T$, but how do we formally argue that this is the case? If we could assume that this holds, it would be a very useful tool for conditional lower bounds (as we show in Section 1.3).

Many problems, like SAT, have a simple self-reduction proving that the direct-OR version is hard, assuming the problem itself is hard: To solve a SAT instance on $n$ variables, it is enough to solve two instances on $n - 1$ variables. This is typically the case for problems where the best known algorithm is essentially matched with a brute force algorithm. But what about Subset Sum or Set Cover? Can we show that an algorithm solving the OR of $N$ instances of Subset Sum in time $O((NT)^{1-\varepsilon})$ implies an algorithm for a single instance of Subset Sum in time $O(T^{1-\varepsilon'} \cdot \mathrm{poly}(n))$? We cannot prove such statements; however, we can prove that such algorithms would refute SETH.

Corollary 1

Assuming SETH, for any $\varepsilon > 0$ there exists a $\delta > 0$ such that no algorithm can solve the OR of $N$ given instances of Subset Sum on target values $T_1, \dots, T_N \le T$ and at most $\delta \log(NT)$ numbers each, in total time $O((NT)^{1-\varepsilon})$.

1.3 The Fine-Grained Complexity of Bicriteria Path

The Bicriteria $s,t$-Path problem is the natural bicriteria variant of the classical shortest $s,t$-path problem, where edges have two types of weights and we seek an $s,t$-path which meets given demands on both criteria. More precisely, we are given a directed graph $G = (V, E)$, where each edge $e \in E$ is assigned a pair of non-negative integers $\ell(e)$ and $c(e)$, respectively denoting the length and cost of $e$, and two non-negative integers $L$ and $C$ representing our budgets. The goal is to determine whether there is a path $P$ in $G$, between a given source $s$ and a target vertex $t$, such that $\sum_{e \in P} \ell(e) \le L$ and $\sum_{e \in P} c(e) \le C$.

This natural variant of -Path has been extensively studied in the literature, by various research communities, and has many diverse applications in several areas. Most notable of these are perhaps the applications in the area of transportation networks [59], and the quality of service (QoS) routing problem studied in the context of communication networks [86, 112]. There are also several applications for Bicriteria -Path in Operations Research domains, in particular in the area of scheduling [37, 81, 90, 102], and in column generation techniques [65, 113]. Additional applications can be found in road traffic management, navigation systems, freight transportation, supply chain management and pipeline distribution systems [59].

A simple reduction proves that Bicriteria $s,t$-Path is at least as hard as Subset Sum (see Garey and Johnson [58]). In 1966, Joksch [73] presented a dynamic programming algorithm with pseudo-polynomial running time $O(Lm)$ (or $O(Cm)$) on graphs with $m$ edges. Extensions of this classical algorithm have appeared in abundance since then, see e.g. [13, 62, 100] and the various FPTASs for the optimization variant of the problem [53, 59, 63, 87, 106]. The reader is referred to the survey by Garroppo et al. [59] for further results on Bicriteria $s,t$-Path.
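For intuition, the following is a minimal sketch of a Joksch-style $O(Lm)$ dynamic program, in our own simplified rendering; it assumes integer edge lengths of at least 1 (zero-length edges would need an extra shortest-path pass per layer, which we omit).

```python
def bicriteria_st_path(n, edges, s, t, L, C):
    """Sketch of an O(L*m) dynamic program for Bicriteria s,t-Path.

    edges: list of (u, v, length, cost) with length >= 1 and cost >= 0.
    best[l][v] = minimum cost of an s->v walk of total length exactly l;
    since weights are non-negative, a feasible walk can be shortcut to a
    feasible path, so checking walks suffices for the decision question.
    """
    INF = float("inf")
    best = [[INF] * n for _ in range(L + 1)]
    best[0][s] = 0
    for l in range(L + 1):
        for (u, v, length, cost) in edges:
            if length <= l and best[l - length][u] + cost < best[l][v]:
                best[l][v] = best[l - length][u] + cost
    return any(best[l][t] <= C for l in range(L + 1))
```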

Our SETH-based lower bound for Subset Sum easily transfers to show that an $L^{1-\varepsilon} \cdot 2^{o(m)}$ time algorithm for Bicriteria $s,t$-Path refutes SETH. However, after the $O(Tn)$ algorithm for Subset Sum from 1962 was improved last year to $\tilde{O}(T + n)$, it is natural to wonder if the similar $O(Lm)$ algorithm for Bicriteria $s,t$-Path from 1966 can also be improved to $\tilde{O}(L + m)$, or even just to $O(L^{1-\varepsilon} \cdot m)$ or $O(L \cdot m^{1-\varepsilon})$. Such an improvement would be very interesting since the pseudo-polynomial algorithm is commonly used in practice, and since it would speed up the running time of the approximation algorithms. We prove that Bicriteria $s,t$-Path is in fact a harder problem than Subset Sum, and an improved algorithm would refute SETH. The main application of Corollary 1 that we report in this paper is a tight SETH-based lower bound for Bicriteria $s,t$-Path, which (conditionally) separates the time complexity of Bicriteria $s,t$-Path and Subset Sum.

Theorem 1.2

Assuming SETH, for any $\varepsilon > 0$ and $\delta > 0$, no algorithm solves Bicriteria $s,t$-Path on sparse $n$-node graphs and budgets $L = C \le 2^{\delta n}$ in time $O((nL)^{1-\varepsilon})$.

Intuitively, our reduction shows how a single instance of Bicriteria $s,t$-Path can simulate multiple instances of Subset Sum and solve the “Direct-OR” version of it.

Our second application of Corollary 1 concerns the number of different edge-lengths and/or edge-costs in the given input graph. Let $L_\#$ denote the former parameter, and $C_\#$ denote the latter. Note that $L_\#$ and $C_\#$ are different from $L$ and $C$, and each can be quite small in comparison to the size of the entire input. In fact, in many of the scheduling applications of Bicriteria $s,t$-Path discussed above it is natural to assume that one of these parameters is quite small. We present a SETH-based lower bound that almost matches the upper bound for the problem.

Theorem 1.3

Bicriteria $s,t$-Path can be solved in $O(m \cdot n^{\min\{L_\#, C_\#\}})$ time. Moreover, assuming SETH, for any constant $\varepsilon > 0$ and any sufficiently large constant value of $\min\{L_\#, C_\#\}$, there is no $O(n^{(1-\varepsilon)\min\{L_\#, C_\#\}})$ time algorithm for the problem.

Finally, we consider the case where we are searching for a path that uses only $k$ internal vertices. This parameter is naturally small in comparison to the total input length in several applications of Bicriteria $s,t$-Path, for example the packet routing application discussed above. We show that this problem is equivalent to the $k$-Sum problem, up to logarithmic factors. For this, we consider an intermediate exact variant of Bicriteria $s,t$-Path, the Zero-Weight $k$-Path problem, and utilize the known bounds for this variant to obtain the first improvement over the $O(n^k)$-time brute-force algorithm, as well as a matching lower bound.

Theorem 1.4

Bicriteria $s,t$-Path with $k$ internal vertices can be solved in $\tilde{O}(n^{\lceil k/2 \rceil})$ time. Moreover, for any $\varepsilon > 0$, there is no $O(n^{\lceil k/2 \rceil - \varepsilon})$-time algorithm for the problem, unless $k$-Sum has an $\tilde{O}(n^{\lceil k/2 \rceil - \varepsilon})$-time algorithm.

2 Preliminaries

For a fixed integer $n \ge 1$, we let $[n]$ denote the set of integers $\{1, \dots, n\}$. All graphs in this paper are, unless otherwise stated, simple, directed, and without self-loops. We use standard graph-theoretic notation, e.g. for a graph $G$ we let $V(G)$ and $E(G)$ denote the set of vertices and edges of $G$, respectively. Throughout the paper, we use the $O^*(\cdot)$ and $\tilde{O}(\cdot)$ notations to suppress polynomial and logarithmic factors, respectively.

Hardness Assumptions:

The Exponential Time Hypothesis (ETH) and its strong variant (SETH) are conjectures about the running time of any algorithm for the $k$-SAT problem: Given a boolean CNF formula $\varphi$, where each clause has at most $k$ literals, determine whether $\varphi$ has a satisfying assignment. Let $s_k := \inf\{\delta : k\text{-SAT can be solved in } O(2^{\delta n}) \text{ time}\}$. The Exponential Time Hypothesis, as stated by Impagliazzo, Paturi and Zane [70], is the conjecture that $s_3 > 0$. It is known that $s_3 > 0$ if and only if there is a $k \ge 3$ such that $s_k > 0$ [70], and that if ETH is true, the sequence $\{s_k\}_{k \ge 3}$ increases infinitely often [69]. The Strong Exponential Time Hypothesis, coined by Impagliazzo and Paturi [40, 69], is the conjecture that $\lim_{k \to \infty} s_k = 1$. In our terms, this can be stated in the following more convenient manner:

Conjecture 1

For any $\varepsilon > 0$ there exists $k \ge 3$ such that $k$-SAT on $n$ variables cannot be solved in time $O(2^{(1-\varepsilon)n})$.

We use the following standard tool by Impagliazzo, Paturi and Zane:

Lemma 1 (Sparsification Lemma [70])

For any $\varepsilon > 0$ and $k \ge 3$, there exists $c = c(\varepsilon, k)$ and an algorithm that, given a $k$-SAT instance $\varphi$ on $n$ variables, computes $k$-SAT instances $\varphi_1, \dots, \varphi_s$ with $s \le 2^{\varepsilon n}$ such that $\varphi$ is satisfiable if and only if at least one $\varphi_i$ is satisfiable. Moreover, each $\varphi_i$ has $n$ variables, each variable in $\varphi_i$ appears in at most $c$ clauses, and the algorithm runs in time $2^{\varepsilon n} \cdot \mathrm{poly}(n)$.

The $k$-SUM Problem:

In $k$-SUM we are given sets $Z_1, \dots, Z_k$, each containing at most $n$ non-negative integers, and a target $T$, and we want to decide whether there are $z_1 \in Z_1, \dots, z_k \in Z_k$ such that $z_1 + \cdots + z_k = T$. This problem can be solved in $O(n^{\lceil k/2 \rceil})$ time [66], and it is somewhat standard by now to assume that this is essentially the best possible [4]. This assumption, which generalizes the more popular assumption for the $k = 3$ case [56, 99], remains believable despite recent algorithmic progress [15, 24, 44, 74, 105].
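A compact sketch of the meet-in-the-middle approach (using hashing instead of sorting, so the bound holds up to logarithmic factors; the interface is ours):

```python
from itertools import product

def k_sum(sets, T):
    """Meet-in-the-middle for k-SUM: enumerate all sums over the first
    ceil(k/2) sets into a hash set, then search the remaining sets for a
    complement. Time and space roughly n^ceil(k/2)."""
    half = (len(sets) + 1) // 2
    left = {sum(tup) for tup in product(*sets[:half])}
    return any(T - sum(tup) in left for tup in product(*sets[half:]))
```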

Conjecture 2

$k$-Sum cannot be solved in $O(n^{\lceil k/2 \rceil - \varepsilon})$ time for any $k \ge 3$ and $\varepsilon > 0$.

3 From SAT to Subset Sum

In this section we present our main result, the hardness of Subset Sum and $k$-Sum under SETH. Our reduction goes through three main steps: We start with a $k$-SAT formula that is the input to our reduction. This formula is then reduced to subexponentially many Constraint Satisfaction Problems (CSPs) with a restricted structure. The main technical part is then to reduce these CSP instances to equivalent Subset Sum instances. The last part of our construction, reducing Subset Sum to $k$-Sum, is rather standard. In the final part of the section we provide a proof for Corollary 1, showing that Subset Sum admits the “Direct-OR” property discussed in Section 1.2.

3.1 From $k$-SAT to Structured CSP

We first present a reduction from $k$-SAT to certain structured instances of Constraint Satisfaction Problems (CSPs). This is a standard combination of the Sparsification Lemma with well-known tricks.

Lemma 2

Given a $k$-SAT instance $\varphi$ on $n$ variables and $m$ clauses, for any $\varepsilon > 0$ and $\mu \ge 1$ we can compute in time $2^{\varepsilon n} \cdot \mathrm{poly}(n)$ CSP instances $\psi_1, \dots, \psi_s$, with $s \le 2^{\varepsilon n}$, such that $\varphi$ is satisfiable if and only if some $\psi_i$ is satisfiable. Each $\psi_i$ has $\lceil n/\mu \rceil$ variables over universe $\{0,1\}^{\mu}$ and at most $\lceil n/\mu \rceil$ constraints. Each variable is contained in at most $\Delta$ constraints, and each constraint contains at most $\Delta$ variables, for some constant $\Delta$ depending only on $k$, $\varepsilon$, and $\mu$.

Proof

Let $\varphi$ be an instance of $k$-SAT with $n$ variables and $m$ clauses. We start by invoking the Sparsification Lemma (Lemma 1). This yields $k$-SAT instances $\varphi_1, \dots, \varphi_s$ with $s \le 2^{\varepsilon n}$ such that $\varphi$ is satisfiable if and only if some $\varphi_i$ is satisfiable, and where each $\varphi_i$ has $n$ variables, and each variable in $\varphi_i$ appears in at most $c = c(\varepsilon, k)$ clauses of $\varphi_i$. In particular, the number of clauses of each $\varphi_i$ is at most $cn$.

We combine multiple variables into a super-variable and multiple clauses into a super-constraint, which yields a certain structured CSP. Specifically, fix $\varphi_i$, and partition its variables into $\lceil n/\mu \rceil$ blocks of length at most $\mu$. We replace each block of variables by one super-variable over universe $\{0,1\}^{\mu}$. Similarly, we partition the clauses into $\lceil n/\mu \rceil$ blocks, each containing at most $\lceil c\mu \rceil$ clauses. We replace each block $b$ of clauses by one super-constraint that depends on all super-variables containing variables appearing in $b$.

Clearly, the resulting CSP $\psi_i$ is equivalent to $\varphi_i$. Since each variable appears in at most $c$ clauses in $\varphi_i$, and we combine $\mu$ variables to obtain a variable of $\psi_i$, each variable of $\psi_i$ appears in at most $c\mu$ constraints. Similarly, each clause in $\varphi_i$ contains at most $k$ variables, and each super-constraint consists of at most $\lceil c\mu \rceil$ clauses, so each super-constraint contains at most $\Delta$ variables for $\Delta := \lceil c\mu \rceil k \ge c\mu$. This finishes the proof. ∎

3.2 From Structured CSP to Subset Sum

Next we reduce these structured CSPs to Subset Sum. Specifically, we show the following.

Theorem 3.1

For any $\varepsilon > 0$, given a $k$-SAT instance $\varphi$ on $n$ variables we can in time $2^{\varepsilon n} \cdot \mathrm{poly}(n)$ construct $2^{\varepsilon n}$ instances of Subset Sum on at most $\lambda n$ items and target at most $2^{(1+\varepsilon)n}$ such that $\varphi$ is satisfiable iff at least one of the Subset Sum instances is a YES-instance. Here $\lambda$ is a constant depending only on $\varepsilon$ and $k$.

As discussed in Section 1.1, our reduction crucially relies on a construction of sum-free sets. For any $k \ge 1$, a set $S$ of integers is $k$-sum-free iff for all $j \in [k]$ and $a_1, \dots, a_j, b \in S$ (not necessarily distinct) with $a_1 + \cdots + a_j = j \cdot b$ we have $a_1 = \cdots = a_j = b$. A surprising construction by Behrend [28] has been slightly adapted in [5], showing the following.

Lemma 3

For any $k$, there exists a constant $c = c(k)$ such that a $k$-sum-free set $S$ of size $n$, with $S \subseteq [n^{1 + c/\sqrt{\log n}}]$, can be constructed in $\mathrm{poly}(n)$ time.
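The following small brute-force checker pins down exactly the sum-free property as stated above; it is exponential in $k$ and meant only for sanity-testing tiny sets.

```python
from itertools import combinations_with_replacement

def is_k_sum_free(S, k):
    """Check the k-sum-free property: for every j <= k and elements
    a_1, ..., a_j, b of S with a_1 + ... + a_j = j*b, all a_i equal b."""
    members = set(S)
    for j in range(1, k + 1):
        for tup in combinations_with_replacement(sorted(members), j):
            total = sum(tup)
            if total % j == 0 and total // j in members:
                b = total // j
                if any(a != b for a in tup):
                    return False
    return True
```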

While it seems natural to use this lemma when working with an additive problem like Subset Sum, the only other example of a reduction using this lemma that we are aware of is a reduction from $k$-Clique to $k$-SUM on small numbers in [5]. Our result can be viewed as a significant boosting of this reduction, where we exploit the power of Subset Sum further. Morally, $k$-Clique is like MAX-2-SAT, since faster algorithms for $k$-Clique imply faster algorithms for MAX-2-SAT [107]. We show that even MAX-$d$-SAT, for any $d$, can be reduced to $k$-SUM, which corresponds to a reduction from Clique on hyper-graphs to $k$-SUM.

Proof (of Theorem 3.1)

We let $\mu$ be a sufficiently large constant depending only on $\varepsilon$ and $k$. We need a $(\Delta+1)$-sum-free set of size $2^{\mu}$, where $\Delta$ is the constant from Lemma 2. Lemma 3 yields a $(\Delta+1)$-sum-free set $S$ of size $2^{\mu}$ consisting of non-negative integers bounded by $2^{(1+o(1))\mu}$, where the $o(1)$ is with respect to $\mu$ and the hidden constant depends only on $\Delta$. We let $f$ be any injective function mapping $\Sigma := \{0,1\}^{\mu}$ to $S$. Note that since $\Delta$ and $\mu$ are constants, constructing $f$ takes constant time.

Run Lemma 2 (with parameters $\varepsilon$ and $\mu$) to obtain CSP instances $\psi_1, \dots, \psi_s$ with $s \le 2^{\varepsilon n}$, each with $n' := \lceil n/\mu \rceil$ variables over universe $\Sigma$ and at most $n'$ constraints, such that each variable is contained in at most $\Delta$ constraints and each constraint contains at most $\Delta$ variables. Fix a CSP $\psi := \psi_i$. We create an instance of Subset Sum, i.e., a set of positive integers $Z$ and a target value $\tau$. We define these integers by describing blocks of their bits, from highest to lowest. The items in $Z$ are naturally partitioned into types, as for each variable $x$ of $\psi$ there will be items of type $x$, and for each constraint $c$ of $\psi$ there will be items of type $c$.

We first ensure that any correct solution picks exactly one item of each type. Let $t \le 2n'$ be the number of types. To this end, we start with a block of $O(\log n)$ bits where each item has value 1, and the target value is $t$, which ensures that we pick exactly $t$ items. This is followed by $t$ bits, where each position is associated to one type, and each item of that type has a 1 at this position and 0s at all other positions. The target has all these bits set to 1. Together, these bits ensure that we pick exactly one item of each type.

The remaining blocks of bits correspond to the variables of $\psi$. For each variable $x$ we have a block consisting of $\beta := \lceil \log((2\Delta+2) \cdot \hat{s}) \rceil$ bits, where $\hat{s} := \max(S)$. The target number $\tau$ has bits forming the number $(\Delta+1) \cdot \hat{s}$ in each block of each variable.

Now we describe the items of type $x$, where $x$ is a variable. For each assignment $\alpha \in \Sigma$ of $x$, there is an item $v(x, \alpha)$ of type $x$. In the block corresponding to variable $x$, the bits of $v(x,\alpha)$ form the number $(\Delta + 1 - d_x) \cdot \hat{s} + d_x \cdot f(\alpha)$, where $d_x \le \Delta$ is the number of constraints containing $x$. In all blocks corresponding to other variables, the bits of $v(x,\alpha)$ are 0.

Next we describe the items of type $c$, where $c$ is a constraint. Let $x_1, \dots, x_l$ be the variables that are contained in $c$. For any assignments $\alpha_1, \dots, \alpha_l$ of $x_1, \dots, x_l$ that jointly satisfy the constraint $c$, there is an item $w(c, \alpha_1, \dots, \alpha_l)$ of type $c$. In the block corresponding to variable $x_i$ the bits of $w(c, \alpha_1, \dots, \alpha_l)$ form the number $\hat{s} - f(\alpha_i)$, for any $i \in [l]$. In all blocks corresponding to other variables, the bits are 0.
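One concrete way to tabulate the per-block values described above (identifiers are ours, and the value scheme is the one stated in the two preceding paragraphs):

```python
def block_values(d_x, Delta, f, s_hat):
    """Values contributed to the block of a variable x with degree d_x.
    The variable item for assignment a contributes
    (Delta+1-d_x)*s_hat + d_x*f[a]; each of the d_x constraint items for
    assignment a contributes s_hat - f[a]; the block target is
    (Delta+1)*s_hat. The target is hit iff f[a_1]+...+f[a_dx] = d_x*f[a],
    which the sum-free property forces to the coherent choice."""
    var_item = {a: (Delta + 1 - d_x) * s_hat + d_x * f[a] for a in f}
    con_item = {a: s_hat - f[a] for a in f}
    target = (Delta + 1) * s_hat
    return var_item, con_item, target
```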

Correctness:

Recall that the first blocks of bits ensure that we pick exactly one item of each type. Consider any variable $x$ and the corresponding block of $\beta$ bits. The item of type $x$ picks an assignment $\alpha$, contributing the number $(\Delta+1-d_x) \cdot \hat{s} + d_x \cdot f(\alpha)$, where $d_x$ is the degree of $x$. The constraints containing $x$ pick assignments $\alpha_1, \dots, \alpha_{d_x}$ for $x$ and contribute $(\hat{s} - f(\alpha_1)) + \cdots + (\hat{s} - f(\alpha_{d_x}))$. Hence, the total contribution in the block is

$(\Delta+1) \cdot \hat{s} + d_x \cdot f(\alpha) - \sum_{i=1}^{d_x} f(\alpha_i),$

which equals the target value $(\Delta+1) \cdot \hat{s}$ if and only if $f(\alpha_1) + \cdots + f(\alpha_{d_x}) = d_x \cdot f(\alpha)$. Since $f$ maps to a $(\Delta+1)$-sum-free set, we can only obtain the target if $\alpha = \alpha_1 = \cdots = \alpha_{d_x}$. Since $f$ is injective, this shows that any correct solution picks a coherent assignment for variable $x$. Finally, this coherent choice of assignments for all variables satisfies all constraints, since constraint items only exist for assignments satisfying the constraint. Hence, we obtain an equivalent Subset Sum instance.

Note that the length $\beta$ of blocks corresponding to variables is set so that there are no carries between blocks, which is necessary for the above argument. Indeed, the degree of any variable $x$ is at most $\Delta$, so the constraints containing $x$ can contribute at most $\Delta \cdot \hat{s}$ to its block, while the item of type $x$ contributes at most $(\Delta+1) \cdot \hat{s}$, which gives a number in $[0, (2\Delta+1) \cdot \hat{s}] \subseteq [0, 2^{\beta} - 1]$.
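The no-carry argument corresponds to a simple encoding discipline: every block is padded to a fixed bit-width that its maximum possible sum cannot reach. A minimal sketch:

```python
def pack_blocks(block_values, width):
    """Concatenate block values (highest block first) into one integer.
    If every reachable block sum stays below 2**width, the blocks of
    distinct items add up independently, i.e., without carries between
    blocks, so the per-block analysis above is valid."""
    x = 0
    for val in block_values:
        assert 0 <= val < (1 << width)
        x = (x << width) | val
    return x
```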

Size Bounds:

Let us count the number of bits in the constructed numbers. We have $O(\log n) + t \le O(\log n) + 2\lceil n/\mu \rceil$ bits from the first part ensuring that we pick one item of each type, and $n' \cdot \beta$ bits from the second part ensuring that we pick coherent and satisfying assignments. Plugging in $n' = \lceil n/\mu \rceil$ and $\beta = (1+o(1))\mu + O(\log \Delta)$ yields a total number of bits of

$n \cdot \Big(1 + o(1) + O\big(\tfrac{\log \Delta}{\mu}\big)\Big) + O(\log n),$

where the hidden constants depend only on $\varepsilon$ and $k$. Since the error term tends to 0 for $\mu \to \infty$, we can choose $\mu$ sufficiently large, depending on $\varepsilon$ and $k$, to obtain at most $(1+\varepsilon)n$ bits, i.e., target $\tau \le 2^{(1+\varepsilon)n}$.

Let us also count the number of constructed items. We have one item for each variable $x$ and each assignment $\alpha \in \Sigma$, amounting to $n' \cdot 2^{\mu}$ items. Moreover, we have one item for each constraint $c$ and all assignments $\alpha_1, \dots, \alpha_l$ that jointly satisfy the constraint $c$, where $l \le \Delta$ is the number of variables contained in $c$. This amounts to up to $n' \cdot 2^{\mu \Delta}$ items. Note that both factors $2^{\mu}$ and $2^{\mu\Delta}$ only depend on $\varepsilon$ and $k$, since $\mu$ and $\Delta$ only depend on $\varepsilon$ and $k$. Thus, the number of items is bounded by $\lambda n$, where $\lambda$ only depends on $\varepsilon$ and $k$.

In total, we obtain a reduction that maps an instance $\varphi$ of $k$-SAT on $n$ variables to $2^{\varepsilon n}$ instances of Subset Sum with target at most $2^{(1+\varepsilon)n}$ on at most $\lambda n$ items. The running time of the reduction is clearly $2^{\varepsilon n} \cdot \mathrm{poly}(n)$. ∎

Our main result (Theorem 1.1) now follows.

Proof (of Theorem 1.1)

Subset Sum: For any $\varepsilon > 0$ set $\varepsilon' := \varepsilon/4$ and let $k$ be sufficiently large so that $k$-SAT has no $O(2^{(1-\varepsilon/4)n})$ time algorithm; this exists assuming SETH. Set $\delta := \varepsilon/(4\lambda)$, where $\lambda = \lambda(\varepsilon', k)$ is the constant from Theorem 3.1. Now assume that Subset Sum can be solved in time $O(T^{1-\varepsilon} \cdot 2^{\delta n})$. We show that this contradicts SETH. Let $\varphi$ be a $k$-SAT instance on $n$ variables, and run Theorem 3.1 with $\varepsilon'$ to obtain $2^{\varepsilon' n}$ Subset Sum instances on at most $\lambda n$ items and target at most $2^{(1+\varepsilon')n}$. Using the assumed algorithm on each Subset Sum instance yields a total time for $k$-SAT of

$2^{\varepsilon' n} \cdot \big(2^{(1+\varepsilon')n}\big)^{1-\varepsilon} \cdot 2^{\delta \lambda n} \le 2^{(1-\varepsilon/4)n},$

where we used the definitions of $\varepsilon'$ and $\delta$, as well as $(1+\varepsilon')(1-\varepsilon) \le 1 - \varepsilon + \varepsilon'$. This running time contradicts SETH, yielding the lower bound for Subset Sum.

$k$-SUM: The lower bound for $k$-Sum now follows easily from the lower bound for Subset Sum. Consider a Subset Sum instance on items $x_1, \dots, x_n$ and target $T$. Partition the items into $k$ sets $X_1, \dots, X_k$ of equal size, up to $\pm 1$. For each set $X_i$, enumerate all subset sums of $X_i$, ignoring the subsets summing to larger than $T$; this yields a set $Z_i$ of size at most $N := 2^{\lceil n/k \rceil}$. Consider the $k$-Sum instance $(Z_1, \dots, Z_k, T)$, where the task is to pick $z_1 \in Z_1, \dots, z_k \in Z_k$ with $z_1 + \cdots + z_k = T$. Since $N^{\delta k} \le 2^{\delta(n+k)} = O(2^{\delta n})$, an $O(T^{1-\varepsilon} \cdot N^{\delta k})$ time algorithm for $k$-Sum now implies an $O(T^{1-\varepsilon} \cdot 2^{\delta n})$ time algorithm for Subset Sum, thus contradicting SETH. ∎
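The splitting step is short enough to state in code; a minimal sketch, to be combined with any $k$-Sum solver such as the one sketched in Section 2:

```python
def subset_sum_via_k_sum(xs, T, k, k_sum_solver):
    """Reduce Subset Sum to k-Sum: partition the items into k near-equal
    groups, enumerate each group's subset sums (capped at T, which is
    safe since all numbers are non-negative), and hand the k resulting
    sets to a k-Sum solver with the same target T."""
    groups = [xs[i::k] for i in range(k)]

    def sums(group):
        out = {0}
        for x in group:
            out |= {s + x for s in out if s + x <= T}
        return sorted(out)

    return k_sum_solver([sums(g) for g in groups], T)
```

For example, `subset_sum_via_k_sum(xs, T, 4, k_sum)` runs the meet-in-the-middle routine from Section 2 on the four enumerated sets.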

3.3 Direct-OR Theorem for Subset Sum

We now provide a proof for Corollary 1. We show that deciding whether at least one of $N$ given instances of Subset Sum is a YES-instance requires time $(N \cdot T)^{1-o(1)}$, where $T$ is a common upper bound on the targets. Here we crucially use our reduction from $k$-SAT to Subset Sum, since the former has an easy self-reduction allowing us to tightly reduce one instance to multiple subinstances, while such a self-reduction is not known for Subset Sum.

Proof (of Corollary 1)

Let $\varepsilon > 0$ and set $\varepsilon' := \varepsilon/4$; we will fix $\delta$ later. Assume that the OR of $N$ given instances of Subset Sum on target values at most $T$ and at most $\delta \log(NT)$ numbers each can be solved in total time $O((NT)^{1-\varepsilon})$. We will show that SETH fails.

Let $\varphi$ be an instance of $k$-SAT on $n$ variables. Split the set of variables into $V_1$ and $V_2$ of size $n_1$ and $n_2 := n - n_1$, such that $2^{n_1}$ matches the desired number of subinstances (up to rounding). Enumerate all $2^{n_1}$ assignments of the variables in $V_1$. For each such assignment $\alpha$ let $\varphi_\alpha$ be the resulting $k$-SAT instance after applying the partial assignment $\alpha$.

For each $\varphi_\alpha$, run the reduction from Theorem 3.1 with $\varepsilon'$, resulting in $2^{\varepsilon' n_2}$ instances of Subset Sum on at most $\lambda n_2$ items and target at most $2^{(1+\varepsilon')n_2}$. In total, we obtain at most $2^{n_1 + \varepsilon' n_2}$ instances of Subset Sum, and $\varphi$ is satisfiable iff at least one of these Subset Sum instances is a YES-instance. Set $N := 2^{n_1 + \varepsilon' n_2}$ and $T := 2^{(1+\varepsilon')n_2}$, and note that the number of instances is at most $N$, and that the target bound is $T$. Thus, we constructed at most $N$ instances of Subset Sum on target at most $T$, each having at most $\lambda n_2 \le \lambda \log(NT)$ items, so $\delta := \lambda$ suffices.

Using the assumed algorithm, the OR of these instances can be solved in total time $O((NT)^{1-\varepsilon})$. Since $N = 2^{n_1 + \varepsilon' n_2}$ and $T = 2^{(1+\varepsilon')n_2}$, this running time is

$O\big(2^{(1-\varepsilon)(n_1 + (1+2\varepsilon')n_2)}\big) \le O\big(2^{(1-\varepsilon)(1+2\varepsilon')n}\big) \le O\big(2^{(1-\varepsilon/2)n}\big),$

which contradicts SETH. Specifically, for some sufficiently large $k$ this running time is less than the time required for $k$-SAT. Setting $\delta := \lambda$ finishes the proof. ∎

4 The Bicriteria $s,t$-Path Problem

In this section we apply the results of the previous section to the Bicriteria $s,t$-Path problem. We will show that the Bicriteria $s,t$-Path problem is in fact harder than Subset Sum, by proving that the classical $O(Lm)$ pseudo-polynomial time algorithm for the problem cannot be improved on sparse graphs assuming SETH. We also prove Theorem 1.3, concerning a bounded number of different edge-lengths and edge-costs in the input network, and Theorem 1.4, concerning a bounded number of internal vertices in a solution path.

4.1 Sparse networks

We begin with the case of sparse networks, i.e. input graphs on $n$ vertices and $m = O(n)$ edges. We embed multiple instances of Subset Sum into one instance of Bicriteria $s,t$-Path to prove Theorem 1.2, namely that there is no algorithm for Bicriteria $s,t$-Path on sparse graphs that is significantly faster than the well-known $O(Lm)$-time algorithm.

Proof (of Theorem 1.2)

We show that for any $\varepsilon > 0$, an algorithm solving Bicriteria $s,t$-Path on sparse $n$-node graphs in time $O((nL)^{1-\varepsilon})$ contradicts SETH. As in Corollary 1, let $I_1, \dots, I_N$ be instances of Subset Sum on targets $T_1, \dots, T_N \le T$ and sets $Z_1, \dots, Z_N$ of items, with $|Z_j| \le \delta \log(NT)$ for all $j \in [N]$. Without loss of generality, we can assume that all sets have the same size $n_0$ (e.g., by making each $Z_j$ a multiset containing the number 0 multiple times).

Fix an instance $I_j$ and let $Z_j = \{z_1, \dots, z_{n_0}\}$ and $\sigma_j := z_1 + \cdots + z_{n_0}$. We construct a graph $G_j$ with nodes $u_0, \dots, u_{n_0}, t$. Writing $v := u_{i-1}$ and $w := u_i$ for simplicity, for each $i \in [n_0]$ we add an edge from $v$ to $w$ with length $z_i$ and cost 0, and we add another (note that parallel edges can be avoided by subdividing all constructed edges) edge from $v$ to $w$ with length 0 and cost $z_i$. Finally, we add an edge from $u_{n_0}$ to $t$ with length $T - T_j$ and cost $C - \sigma_j + T_j$, where $C := \max_{j'} \sigma_{j'}$. Then the set of $u_0,t$-paths corresponds to the power set of $Z_j$, and the path corresponding to $S \subseteq Z_j$ has total length $(T - T_j) + \sum_{z \in S} z$ and cost $(C - \sigma_j + T_j) + \sum_{z \in Z_j \setminus S} z$. Hence, setting the upper bound on the length to $T$ and on the cost to $C$, there is a $u_0,t$-path respecting these bounds iff there is a subset of $Z_j$ summing to exactly $T_j$, i.e., iff $I_j$ is a YES-instance.

We combine the graphs $G_1, \dots, G_N$ into one graph $G$ by identifying all source nodes $u_0$ (yielding the source $s$), identifying all target nodes $t$, and then taking the disjoint union of the remainder. With the common length bound $L := T$ and cost bound $C$, there is an $s,t$-path respecting these bounds in $G$ iff some instance $I_j$ is a YES-instance. Furthermore, note that $G$ has $O(N \cdot n_0)$ vertices, is sparse, and can be constructed in time $\tilde{O}(N \cdot n_0)$. Hence, an $O((nL)^{1-\varepsilon})$ time algorithm for Bicriteria $s,t$-Path would imply an $O((N n_0 T)^{1-\varepsilon}) = \tilde{O}((NT)^{1-\varepsilon})$ time algorithm for deciding whether at least one of $N$ Subset Sum instances is a YES instance, a contradiction to SETH by Corollary 1.

Finally, let us ensure that $L = C$. Note that the budgets $L$ and $C$ are both bounded by $n_0 \cdot T$. If $L < C$, then add a supersource $s'$ and one edge from $s'$ to $s$ with length and cost equal to $C - L$ and 0, respectively, and add $C - L$ to $L$. This results in an equivalent instance, and the new bounds satisfy $L = C$. If $L > C$, then do the same where the length and cost from $s'$ to $s$ is 0 and $L - C$, respectively, and then add dummy vertices to the graph. Again we obtain budgets $L = C$. In both cases, the same running time analysis as in the last paragraph goes through. This completes the proof of Theorem 1.2. ∎
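A sketch of the whole construction from this proof, returning an edge list on which any Bicriteria $s,t$-Path solver can be run with the common budgets (identifiers are ours):

```python
def build_or_gadget(instances, T):
    """Sketch of the Theorem 1.2 construction: one chain per Subset Sum
    instance (Z_j, T_j) with T_j <= T, all chains sharing source s = 0
    and sink t = 1. Along chain j, each item z is either taken (length z,
    cost 0) or skipped (length 0, cost z); a final edge equalizes the
    budgets across instances. Parallel edges can be subdivided."""
    C = max(sum(Z) for Z, _ in instances)  # common cost budget
    s, t = 0, 1
    edges, next_node = [], 2
    for Z, Tj in instances:
        prev = s
        for z in Z:
            cur = next_node
            next_node += 1
            edges.append((prev, cur, z, 0))  # take item z
            edges.append((prev, cur, 0, z))  # skip item z
            prev = cur
        # equalizing edge: length T - Tj, cost C - sum(Z) + Tj
        edges.append((prev, t, T - Tj, C - sum(Z) + Tj))
    return edges, next_node, s, t, T, C  # feasible: length <= T, cost <= C
```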

4.2 Few different edge-lengths or edge-costs

We next consider the parameters $L_\#$ (the number of different edge-lengths) and $C_\#$ (the number of different edge-costs). We show that Bicriteria $s,t$-Path can be solved in $O(m \cdot n^{\min\{L_\#, C_\#\}})$ time, while it is unlikely to be solvable in $O(n^{(1-\varepsilon)\min\{L_\#, C_\#\}})$ time for any $\varepsilon > 0$, providing a complete proof for Theorem 1.3. The upper bound of this theorem is quite easy, and is given in the following lemma.

Lemma 4

Bicriteria $s,t$-Path can be solved in $O(m \cdot n^{\min\{L_\#, C_\#\}})$ time.

Proof

It suffices to give an $O(m \cdot n^{L_\#})$ time algorithm, as an $O(m \cdot n^{C_\#})$ time algorithm is symmetric, and a combination of these two algorithms yields the claim. Let $\lambda_1 < \cdots < \lambda_{L_\#}$ be all different edge-length values. We compute a table $D[v; j_1, \dots, j_{L_\#}]$, where $v \in V$ and $j_1, \dots, j_{L_\#} \in \{0, \dots, n-1\}$, which stores the minimum cost of any $s,v$-path that has exactly $j_i$ edges of length $\lambda_i$, for each $i \in [L_\#]$. For the base case of our computation, we set $D[s; 0, \dots, 0] := 0$ and $D[v; j_1, \dots, j_{L_\#}] := \infty$ for entries with some $j_i < 0$. The remaining entries are computed via the following recursion:

$D[v; j_1, \dots, j_{L_\#}] = \min_{e = (u,v) \in E} \big( D[u; j_1, \dots, j_{i(e)} - 1, \dots, j_{L_\#}] + c(e) \big),$

where $i(e)$ denotes the index with $\ell(e) = \lambda_{i(e)}$. It is easy to see that the above recursion is correct, since if $P$ is an optimal $s,v$-path corresponding to an entry in $D[v; j_1, \dots, j_{L_\#}]$, with last edge $e = (u,v)$ of length $\lambda_i$ for some $i \in [L_\#]$, then $P - e$ is an optimal $s,u$-path corresponding to the entry $D[u; j_1, \dots, j_i - 1, \dots, j_{L_\#}]$. Thus, after computing table $D$, we can determine whether there is a feasible $s,t$-path in $G$ by checking whether there is an entry $D[t; j_1, \dots, j_{L_\#}]$ with $\sum_i j_i \lambda_i \le L$ and $D[t; j_1, \dots, j_{L_\#}] \le C$. As there are $O(n^{L_\# + 1})$ entries in $D$ in total, and each entry can be computed in time proportional to the in-degree of the corresponding node, the entire algorithm requires $O(m \cdot n^{L_\#})$ time. ∎
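A memoized rendering of this dynamic program (a sketch for small instances; it relaxes paths to walks, which does not change the feasibility answer here since all lengths and costs are non-negative):

```python
from functools import lru_cache
from itertools import product

def few_lengths_feasible(edges, lengths, s, t, L, C, n):
    """Sketch of the Lemma 4 dynamic program: D(v, js) is the minimum
    cost of reaching v from s using exactly js[i] edges of length
    lengths[i]; feasibility is read off at t over all count vectors."""
    idx = {l: i for i, l in enumerate(lengths)}
    incoming = {}
    for (u, v, l, c) in edges:
        incoming.setdefault(v, []).append((u, idx[l], c))

    @lru_cache(maxsize=None)
    def D(v, js):
        if v == s and sum(js) == 0:
            return 0
        best = float("inf")
        for (u, i, c) in incoming.get(v, ()):
            if js[i] > 0:
                js2 = list(js)
                js2[i] -= 1
                best = min(best, D(u, tuple(js2)) + c)
        return best

    return any(
        sum(j * l for j, l in zip(js, lengths)) <= L and D(t, js) <= C
        for js in product(range(n), repeat=len(lengths))
    )
```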

We now turn to proving the lower bound given in Theorem 1.3. The starting point is our lower bound for $k$-Sum ruling out $O(T^{1-\varepsilon} \cdot n^{\delta k})$ time algorithms (Theorem 1.1). We present a reduction from $k$-Sum to Bicriteria $s,t$-Path, where the resulting graph in the Bicriteria $s,t$-Path instance has few different edge-lengths and edge-costs.

Let $(Z_1, \dots, Z_k, T)$ be an instance of $k$-Sum with $|Z_i| \le n$ and $z \le T$ for all $z \in Z_i$, and we want to decide whether there are $z_1 \in Z_1, \dots, z_k \in Z_k$ with $z_1 + \cdots + z_k = T$. We begin by constructing an acyclic multigraph $H$, using similar ideas to those used for proving Theorem 1.2. The multigraph $H$ has vertices $v_0, v_1, \dots, v_k$, and is constructed as follows: For each $i \in [k]$, we add at most $n$ edges from $v_{i-1}$ to $v_i$, one for each element in $Z_i$. The length of an edge corresponding to element $z \in Z_i$ is set to $z$, and its cost is set to $T - z$.

Lemma 5

$(Z_1, \dots, Z_k, T)$ has a solution iff $H$ has a $v_0,v_k$-path of length at most $T$ and cost at most $(k-1) \cdot T$.

Proof

Suppose there are $z_1 \in Z_1, \dots, z_k \in Z_k$ that sum to $T$. Consider the $v_0,v_k$-path $P = (e_1, \dots, e_k)$ in $H$, where $e_i$ is the edge from $v_{i-1}$ to $v_i$ corresponding to $z_i$. Then $\ell(P) = \sum_i z_i = T$, and $c(P) = \sum_i (T - z_i) = (k-1) \cdot T$. Conversely, any $v_0,v_k$-path in $H$ has $k$ edges $e_1, \dots, e_k$, where $e_i$ is an edge from $v_{i-1}$ to $v_i$. If such a path is feasible, meaning that $\ell(P) \le T$ and $c(P) \le (k-1) \cdot T$, then these two inequalities must be tight because $\ell(e) + c(e) = T$ for each edge $e$. This implies that the integers corresponding to the edges of $P$ sum to $T$. ∎

Let $\gamma > 0$ be any constant, and let $q := \lceil T^{\gamma} \rceil$ and $d := \lceil 1/\gamma \rceil$. We next convert $H$ into a graph $G'$ which has $L_\# = C_\# = d + 1$ different edge-lengths and different edge-costs, both taken from the set $\{0, q^0, q^1, \dots, q^{d-1}\}$. Recall that $q^d \ge T$, and the length and cost of each edge in $H$ is non-negative and bounded by $T$. The vertex set of $G'$ will include all vertices of $H$, as well as additional vertices.

For an edge $e$, write its length as $\ell(e) = \sum_{i=0}^{d-1} a_i q^i$, and its cost as $c(e) = \sum_{i=0}^{d-1} b_i q^i$, for integers $a_i, b_i \in \{0, \dots, q-1\}$. We replace the edge $e$ of $H$ with a path in $G'$ between the endpoints of $e$ that has $\sum_i (a_i + b_i) - 1 \le 2dq$ internal vertices. For each $i$, we set $a_i$ edges in this path to have length $q^i$ and cost 0, and $b_i$ edges to have length 0 and cost $q^i$. Replacing all edges of $H$ by paths in this way, we obtain the graph $G'$ which has $O(n \cdot T^{\gamma})$ vertices and edges (since $k$ and $\gamma$ are constant). As any edge in $H$ between $u$ and $v$ corresponds to a path between these two vertices in $G'$ with the same length and cost, we have:

Lemma 6

Any $v_0,v_k$-path in $H$ corresponds to a $v_0,v_k$-path in $G'$ with the same length and cost, and vice-versa.
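The digit replacement used to build $G'$ is easy to state in code; a minimal sketch (assuming $0 \le \mathrm{length}, \mathrm{cost} < q^d$):

```python
def expand_edge(length, cost, q, d):
    """Replace one edge of given (length, cost) by a chain of pieces:
    for each base-q digit a_i of the length, a_i edges of length q**i and
    cost 0; symmetrically for the cost digits. Only lengths/costs from
    {0} and powers of q appear in the output."""
    pieces = []
    for i in range(d):
        a_i = (length // q ** i) % q
        b_i = (cost // q ** i) % q
        pieces += [(q ** i, 0)] * a_i + [(0, q ** i)] * b_i
    return pieces or [(0, 0)]
```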

Lemma 7

Assuming SETH, for any constant $\varepsilon > 0$ there is no $O(n^{(1-\varepsilon)\min\{L_\#, C_\#\}})$ time algorithm for Bicriteria $s,t$-Path, for any sufficiently large constant value of $\min\{L_\#, C_\#\}$.

Proof

Suppose Bicriteria $s,t$-Path has an $O(n^{(1-\varepsilon)\min\{L_\#, C_\#\}})$ time algorithm for some $\varepsilon > 0$. We use this algorithm to obtain a fast algorithm for $k$-Sum, contradicting SETH by Theorem 1.1. On a given input $(Z_1, \dots, Z_k, T)$ of $k$-Sum on at most $n$ items per set, for $\gamma := \varepsilon/2$ we construct the instance $G'$ described above. Then $G'$ is a directed acyclic graph with $L_\# = C_\# = d+1$ different edge-lengths and edge-costs, where $d = \lceil 1/\gamma \rceil$. Moreover, due to Lemmas 5 and 6, there are $z_1 \in Z_1, \dots, z_k \in Z_k$ summing to $T$ iff $G'$ has a feasible $s,t$-path. Thus, we can use our assumed Bicriteria $s,t$-Path algorithm on $G'$ to solve the given $k$-Sum instance. As $G'$ has $O(n \cdot T^{\gamma})$ vertices and edges, the assumed algorithm runs in time $O((n T^{\gamma})^{(1-\varepsilon)(d+1)})$ on $G'$. For $\delta$ from Theorem 1.1 (applied with $\varepsilon^2$ in place of $\varepsilon$) and $k$ set to $\lceil (d+1)/\delta \rceil$, this running time is $O(T^{1-\varepsilon^2} \cdot n^{\delta k})$ and thus contradicts SETH by Theorem 1.1. ∎

4.3 Solution paths with few vertices

In this section we investigate the complexity of Bicriteria $s,t$-Path with respect to the number $k$ of internal vertices in a solution path. Assuming $k$ is fixed and bounded, we obtain a tight classification of the time complexity of the problem, up to sub-polynomial factors, under Conjecture 2.

Our starting point is the Exact $k$-Path problem: Given an integer $T$ and a directed graph $G$ with edge weights, decide whether there is a simple path in $G$ on $k$ vertices in which the sum of the weights is exactly $T$. Thus, this is the “exact” variant of Bicriteria $s,t$-Path on graphs with a single edge criterion, and no source and target vertices. The Exact $k$-Path problem can be solved in $\tilde{O}(n^{\lceil k/2 \rceil})$ time by a “meet-in-the-middle” algorithm [4], where the $\tilde{O}$ notation suppresses poly-logarithmic factors in the input size. It is also known that Exact $k$-Path has no $O(n^{\lceil k/2 \rceil - \varepsilon})$ time algorithm, for any $\varepsilon > 0$, unless the $k$-Sum conjecture is false [4]. We will show how to obtain similar bounds for Bicriteria $s,t$-Path by implementing a very efficient reduction between the two problems.

To show that Exact $k$-Path can be used to solve Bicriteria $s,t$-Path, we will combine multiple ideas. The first is the observation that Exact $k$-Path can easily solve the Exact Bicriteria