On Integer Programming and Convolution
Research was supported by German Research Foundation (DFG) project JA 612/20-1.
Integer programs with a fixed number of constraints can be solved in pseudo-polynomial time. We present a surprisingly simple algorithm and matching conditional lower bounds. Consider an IP in standard form $\max\{c^T x : Ax = b, x \in \mathbb{Z}^n_{\ge 0}\}$, where $A \in \mathbb{Z}^{m \times n}$, $b \in \mathbb{Z}^m$ and $c \in \mathbb{Z}^n$. Let $\Delta$ be an upper bound on the absolute values in $A$. We show that this IP can be solved in time $O(m\Delta)^{2m} \cdot \log(\|b\|_\infty)$. The previous best algorithm has a running time of $n \cdot O(m\Delta)^{2m} \cdot \|b\|_1^2$.
The hardness of (min, +)-convolution has been used to prove conditional lower bounds on a number of polynomially solvable problems. We show that improving our algorithm for IPs of any fixed number of constraints is equivalent to improving (min, +)-convolution. More precisely, for any fixed $m$ there exists an algorithm for solving IPs with $m$ constraints in time $O(m(\Delta + \|b\|_\infty))^{2m-\delta}$ for some $\delta > 0$, if and only if there is a truly sub-quadratic algorithm for (min, +)-convolution.
For the feasibility problem, where the IP has no objective function, we improve the running time to $O(m\Delta)^m \cdot \log(\Delta) \cdot \log(\Delta + \|b\|_\infty)$. We also give a matching lower bound here: For every fixed $m$ and $\delta > 0$ there is no algorithm for testing feasibility of IPs with $m$ constraints in time $n^{O(1)} \cdot O(m(\Delta + \|b\|_\infty))^{m-\delta}$ unless the SETH is false.
Vectors $v_1, \dots, v_n \in \mathbb{R}^m$ that sum up to $0$ can be seen as a circle in $\mathbb{R}^m$ that walks from $0$ to $v_1$ to $v_1 + v_2$, etc. until it reaches $v_1 + \dots + v_n = 0$ again. The Steinitz Lemma says that if each of the vectors is small with respect to some norm, we can reorder them in a way that each point in the circle is not far away from $0$ w.r.t. the same norm.
Recently Eisenbrand and Weismantel found a beautiful application of this lemma in the area of integer programming. They looked at IPs of the form $\max\{c^T x : Ax = b, x \in \mathbb{Z}^n_{\ge 0}\}$, where $A \in \mathbb{Z}^{m \times n}$, $b \in \mathbb{Z}^m$ and $c \in \mathbb{Z}^n$, and obtained a pseudo-polynomial algorithm in $\Delta$, the biggest absolute value of an entry in $A$, when $m$ is fixed. The running time they achieve is $n \cdot O(m\Delta)^{2m} \cdot \|b\|_1^2$ for finding the optimal solution and $n \cdot O(m\Delta)^m \cdot \|b\|_1$ for finding only a feasible solution. This improves on the running time of a classic algorithm by Papadimitriou. The basic idea in their work is that a solution $x^*$ for the IP above can be viewed as a walk in $\mathbb{Z}^m$ starting at $0$ and ending at $b$. Every step is a column of the matrix $A$: For every $i \in \{1, \dots, n\}$ we step $x^*_i$ times in the direction of $A_i$ (see left picture in Figure 1). By applying the Steinitz Lemma they show that there is an ordering of these steps such that the walk never strays far from the direct line between $0$ and $b$ (see right picture in Figure 1). They construct a directed graph with one vertex for every integer point near the line between $0$ and $b$ and create an edge from $u$ to $v$, if $v - u$ is a column in $A$. The weight of the edge is the same as the $c$-value of the column. An optimal solution to the IP can now be obtained by finding a longest path from $0$ to $b$. This can be done in the mentioned time, if one is careful with circles.
In this paper, we present an alternative way to apply the Steinitz Lemma to the same problem. Our approach does not reduce to a longest path problem, but rather solves the IP in a divide and conquer fashion. Using the Steinitz Lemma and the intuition of a walk from $0$ to $b$, we notice that this walk has to visit a vector $b'$ near $b/2$ at some point. We guess this vector and solve the problem with $Ax = b'$ and $Ax = b - b'$ independently. Both results can be merged to a solution for $Ax = b$. In the sub-problems the norm of $b$ and the norm of the solution are roughly divided in half. We use this idea in a dynamic program and speed up the process of merging solutions using algorithms for convolution problems. This approach gives us better running times for both the problem of finding optimal solutions and for testing feasibility only. We complete our study by giving (almost) tight conditional lower bounds on the running time in which such IPs can be solved.
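The case where the number of variables $n$ is fixed, and not $m$ as in this paper, behaves somewhat differently. There is a $2^{O(n \log n)} \cdot |I|^{O(1)}$ time algorithm, whereas an algorithm of the kind $f(m) \cdot |I|^{O(1)}$ is impossible unless $\mathrm{P} = \mathrm{NP}$, where $|I|$ is the encoding length of the input and $f$ an arbitrary function. The $2^{O(n \log n)} \cdot |I|^{O(1)}$ time algorithm is due to Kannan, improving over a $2^{O(n^2)} \cdot |I|^{O(1)}$ time algorithm by Lenstra. It is a long-standing open problem whether $2^{O(n)} \cdot |I|^{O(1)}$ is possible instead.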
1.1 Detailed description of results
In the following results we assume that $A$ has no duplicate columns. In the problem we consider, we can completely ignore a column $A_i$, if there is another identical column $A_{i'}$ with $c_{i'} \ge c_i$. In an instance without duplicate columns, we have that $n \le (2\Delta + 1)^m$.
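The preprocessing described above can be sketched as follows. This is a minimal illustration, assuming columns are given as tuples of integers and `costs` holds the matching objective coefficients; the function name `drop_dominated_columns` is ours and does not appear in the paper.

```python
def drop_dominated_columns(columns, costs):
    """Keep, for each distinct column of A, only the copy with the largest c-value."""
    best = {}  # column (as tuple) -> best objective coefficient seen so far
    for col, c in zip(columns, costs):
        key = tuple(col)
        if key not in best or c > best[key]:
            best[key] = c
    kept = list(best)  # dicts preserve first-insertion order
    return kept, [best[k] for k in kept]


columns = [(1, 0), (1, 0), (0, 1), (-1, 1)]
costs = [3, 5, 2, 4]
kept, kept_costs = drop_dominated_columns(columns, costs)
```

After this step every remaining column is a distinct integer vector with entries of absolute value at most $\Delta$, which is why the number of columns can be bounded in terms of $\Delta$ and $m$ alone.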
In the running times we give, we frequently use logarithmic factors like $\log(k)$ for some parameter $k$. To handle the values $k \in \{0, 1\}$ formally correctly, we would need to write $\log(k + 1) + 1$ instead of $\log(k)$ everywhere. This is ignored for simplicity of notation.
Optimal solutions for IPs.
We show that a solution to $\max\{c^T x : Ax = b, x \in \mathbb{Z}^n_{\ge 0}\}$ can be found in time $O(m\Delta)^{2m} \cdot \log(\|b\|_\infty)$. If given a vertex solution to the fractional relaxation, we can even get to $O(m\Delta)^{2m}$. The running time can be improved if there exists a truly sub-quadratic algorithm for (min, +)-convolution (see Section 4.1 for details on the problem). However, it has been conjectured that no such algorithm exists and this conjecture is the basis of several lower bounds in fine-grained complexity [7, 13, 3]. We show that for every $m$ the running time above is essentially the best possible unless the (min, +)-convolution conjecture is false. More formally, for every $m$ there exists no algorithm that solves $\max\{c^T x : Ax = b, x \in \mathbb{Z}^n_{\ge 0}\}$ in time $f(m) \cdot (n^{2-\delta} + (\Delta + \|b\|_\infty)^{2m-\delta})$ for some $\delta > 0$ unless there exists a truly sub-quadratic algorithm for (min, +)-convolution. Indeed, this means there is an equivalence between improving algorithms for (min, +)-convolution and for IPs of fixed number of constraints. We put $\|b\|_\infty$ along with $\Delta$ so that it is obvious the lower bound also holds when the entries of $b$ are small. Our lower bound does leave open potential running times like $n \cdot O(m(\Delta + \|b\|_\infty))^{m}$, which would be an interesting improvement for sparse instances, i.e., when $n \ll (2\Delta + 1)^m$.
Feasibility of IPs.
Testing only feasibility of an IP is easier than finding an optimal solution. It can be done in time $O(m\Delta)^m \cdot \log(\Delta) \cdot \log(\Delta + \|b\|_\infty)$ by solving a boolean convolution problem that has a more efficient algorithm than the (min, +)-convolution problem that arises in the optimization version. Under the Strong Exponential Time Hypothesis (SETH) this running time is tight except for logarithmic factors. If this conjecture holds, there is no $n^{O(1)} \cdot O(m(\Delta + \|b\|_\infty))^{m-\delta}$ time algorithm for any $\delta > 0$.
Theorem 1 (Steinitz Lemma).
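Let $\|\cdot\|$ be a norm in $\mathbb{R}^m$ and let $v_1, \dots, v_n \in \mathbb{R}^m$ such that $\|v_i\| \le 1$ for all $i$ and $v_1 + \dots + v_n = 0$. Then there exists a permutation $\pi \in S_n$ such that for all $j \in \{1, \dots, n\}$
$$\Big\| \sum_{i=1}^{j} v_{\pi(i)} \Big\| \le m.$$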
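Corollary 2.
Let $v^{(1)}, \dots, v^{(t)}$ denote columns of $A$ with $\sum_{i=1}^{t} v^{(i)} = b$. Then there exists a permutation $\pi \in S_t$ such that for all $j \in \{1, \dots, t\}$
$$\Big\| \sum_{i=1}^{j} v^{(\pi(i))} - j \cdot \frac{b}{t} \Big\|_\infty \le 2m\Delta.$$
This is rather straightforward: We simply insert $v_i = (v^{(i)} - b/t) / (2\Delta)$, $i = 1, \dots, t$, in the Steinitz Lemma. Note that $\|v^{(i)} - b/t\|_\infty \le 2\Delta$.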
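To make the statement of the Steinitz Lemma concrete, the following sketch searches for a good ordering by brute force. It assumes the classical form of the lemma (vectors of norm at most 1 summing to zero admit an ordering whose prefix sums have norm at most $m$) and uses the maximum norm; the function names are ours, and the exhaustive search over permutations is only meant for tiny inputs.

```python
from itertools import permutations


def max_prefix_norm(vectors):
    """Largest infinity-norm among all prefix sums of the given sequence."""
    prefix = [0] * len(vectors[0])
    worst = 0
    for v in vectors:
        prefix = [p + x for p, x in zip(prefix, v)]
        worst = max(worst, max(abs(p) for p in prefix))
    return worst


def steinitz_order(vectors):
    """Brute-force search for an ordering whose prefix sums stay within norm m."""
    m = len(vectors[0])
    for order in permutations(vectors):
        if max_prefix_norm(order) <= m:
            return list(order)
    return None  # not expected when the lemma's preconditions hold


# Six vectors in dimension m = 2, each of infinity-norm 1, summing to zero.
vectors = [(1, 1)] * 3 + [(-1, -1)] * 3
reordered = steinitz_order(vectors)
```

In the given order the walk drifts to $(3, 3)$, whereas the reordered walk stays within distance $m = 2$ of the origin.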
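Lemma 3.
Let $\max\{c^T x : Ax = b, x \in \mathbb{Z}^n_{\ge 0}\}$ be bounded and feasible. Then there exists an optimal solution $x^*$ with $\|x^*\|_1 \le O(m\Delta)^m \cdot (\|b\|_\infty + 1)$.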
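Similar bounds are known from the literature. However, we can also give a proof via the Steinitz Lemma.
Let $x^*$ be an optimal solution of minimal 1-norm. Let $v^{(1)}, \dots, v^{(t)}$ denote the multiset of columns of $A$ that represent $x^*$. Assume w.l.o.g. these vectors are ordered as in the previous corollary. There cannot be a circle of positive value in $v^{(1)}, \dots, v^{(t)}$ or else the IP would be unbounded. By circle we mean a non-empty subset that sums up to $0$ and we consider the value of the columns with regard to $c$. In fact, there cannot be a circle of nonpositive value either, since the 1-norm of the solution is minimal. Hence, each vector in $\mathbb{Z}^m$ is visited at most once by the walk $v^{(1)}, v^{(1)} + v^{(2)}, \dots, v^{(1)} + \dots + v^{(t)}$. The number of integer points $d \in \mathbb{Z}^m$ with
$$\|d - \beta \cdot b\|_\infty \le 2m\Delta \qquad (1)$$
for some $\beta \in [0, 1]$ is at most $O(m\Delta)^m \cdot (\|b\|_\infty + 1)$ and this upper bounds the 1-norm of $x^*$: Assume w.l.o.g. $\|b\|_\infty > 0$ as the case $b = 0$ is trivial. Take $\|b\|_\infty + 1$ many points evenly distributed along the line from $0$ to $b$, i.e., $0 \cdot b / \|b\|_\infty$, $1 \cdot b / \|b\|_\infty$, …, $\|b\|_\infty \cdot b / \|b\|_\infty$. Then the distance between two consecutive points is small:
$$\Big\| \frac{b}{\|b\|_\infty} \Big\|_\infty = 1.$$
In particular, for every vector of the form $\beta \cdot b$, $\beta \in [0, 1]$, there is a point $p$ among them that is not further away than $1/2$. Thus, for every $d$ that satisfies (1), we have a point $p$ with
$$\|d - p\|_\infty \le \|d - \beta \cdot b\|_\infty + \|\beta \cdot b - p\|_\infty \le 2m\Delta + 1/2.$$
To upper bound the number of vectors of type (1), we count the number of integer vectors within distance at most $2m\Delta + 1/2$ to each of the $\|b\|_\infty + 1$ points. This number is at most $(\|b\|_\infty + 1) \cdot (4m\Delta + 2)^m$. This concludes the proof. ∎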
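Corollary 4.
By adding a zero column, we can assume w.l.o.g. that if the IP is feasible and bounded, then there exists an optimal solution $x^*$ with $\|x^*\|_1 = U$, where $U$ is the upper bound for $\|x^*\|_1$. By scaling the bound of Lemma 3 to the next power of $2$, we can assume that $\|x^*\|_1 = 2^K$, where $K \in \mathbb{N}$ and $K \le O(m \log(m\Delta) + \log(\|b\|_\infty))$.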
3 Dynamic Program
In this section we will show how to compute the best solution $x^*$ to an IP with the additional constraint $\|x^*\|_1 = 2^K$. If the IP is bounded, then with $K = O(m \log(m\Delta) + \log(\|b\|_\infty))$ and an extra zero column this is the optimum to the IP (Corollary 4). In Section 3.2 we discuss how to cope with unbounded IPs. For every $i = K, K - 1, \dots, 0$ and every $b' \in \mathbb{Z}^m$ with
$$\|b' - 2^{-i} \cdot b\|_\infty \le 4m\Delta$$
we solve $\max\{c^T x : Ax = b', \|x\|_1 = 2^{K-i}, x \in \mathbb{Z}^n_{\ge 0}\}$. In other words, we store whether there is a feasible solution and the optimal value that can be achieved. We start by computing these for $i = K$, and then iteratively derive solutions for smaller values of $i$ using the bigger ones. Ultimately, we will compute a solution for $i = 0$ and $b' = b$.
If $i = K$, then every solution must consist of exactly one column ($\|x\|_1 = 1$). We can compute this solution by finding the column that equals $b'$, should there exist one, and mark the entry as infeasible otherwise.
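Fix some $i < K$ and $b'$ and let $v^{(1)}, \dots, v^{(2^{K-i})}$ be columns of $A$ that correspond to an optimal solution to $\max\{c^T x : Ax = b', \|x\|_1 = 2^{K-i}, x \in \mathbb{Z}^n_{\ge 0}\}$. In particular, $v^{(1)} + \dots + v^{(2^{K-i})} = b'$ and the $c$-values of these columns sum up to the optimum. Assume w.l.o.g. that the $v^{(j)}$ are ordered such that for all $j \in \{1, \dots, 2^{K-i}\}$
$$\Big\| \sum_{k=1}^{j} v^{(k)} - j \cdot \frac{b'}{2^{K-i}} \Big\|_\infty \le 2m\Delta.$$
Note that $v^{(1)}, \dots, v^{(2^{K-i-1})}$ is a solution to $Ax = b''$, $\|x\|_1 = 2^{K-(i+1)}$, where $b'' = v^{(1)} + \dots + v^{(2^{K-i-1})}$. Likewise, $v^{(2^{K-i-1}+1)}, \dots, v^{(2^{K-i})}$ is obviously a solution to $Ax = b' - b''$, $\|x\|_1 = 2^{K-(i+1)}$. We claim that $\|b'' - 2^{-(i+1)} \cdot b\|_\infty \le 4m\Delta$ and $\|(b' - b'') - 2^{-(i+1)} \cdot b\|_\infty \le 4m\Delta$. This implies that we can look up solutions for $b''$ and $b' - b''$ in the dynamic table and their sum is a solution for $b'$. Clearly it is also optimal. We do not know $b''$, but we can guess it: There are only $(8m\Delta + 1)^m$ candidates.
Proof of claim.
We have that
$$\|b'' - 2^{-(i+1)} \cdot b\|_\infty \le \Big\|b'' - \frac{1}{2} b'\Big\|_\infty + \frac{1}{2} \|b' - 2^{-i} \cdot b\|_\infty \le 2m\Delta + 2m\Delta = 4m\Delta.$$
In a similar way, we can show that
$$\|(b' - b'') - 2^{-(i+1)} \cdot b\|_\infty \le \Big\|\frac{1}{2} b' - b''\Big\|_\infty + \frac{1}{2} \|b' - 2^{-i} \cdot b\|_\infty \le 4m\Delta.$$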
3.1 Naive running time
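The dynamic table has $(K + 1) \cdot (8m\Delta + 1)^m = O(m\Delta)^m \cdot (K + 1)$ entries. To compute an entry, $O(m \cdot n) \le O(m\Delta)^m$ operations are necessary during initialization and $O(m) \cdot (8m\Delta + 1)^m = O(m\Delta)^m$ in the iterative calculations. This gives a total running time of
$$O(m\Delta)^{2m} \cdot (K + 1) = O(m\Delta)^{2m} \cdot (m \log(m\Delta) + \log(\|b\|_\infty)) = O(m\Delta)^{2m} \cdot \log(\Delta + \|b\|_\infty).$$
Note that $O(m\Delta)^{2m}$ hides factors polynomial in $m$.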
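The dynamic program of this section can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes that level $i$ stores the best value for every right-hand side $b'$ within maximum-norm distance $4m\Delta$ of $2^{-i} b$ using exactly $2^{K-i}$ columns, that the IP is bounded, and that the caller supplies a $K$ with $2^K$ at least the 1-norm of some optimal solution. The name `solve_ip_dp` is ours, and the merge step is the naive quadratic one.

```python
def solve_ip_dp(columns, costs, b, K):
    """Max c^T x subject to Ax = b, x >= 0 integral, using 2**K columns.

    A zero column of cost 0 is added as padding, so this covers all solutions
    with 1-norm at most 2**K. Returns the best value, or None if none exists.
    """
    m = len(b)
    delta = max([abs(a) for col in columns for a in col] + [1])
    radius = 4 * m * delta  # assumed window radius around b / 2**i

    def in_window(vec, i):
        # |vec_k - b_k / 2**i| <= radius, checked in exact integer arithmetic
        return all(abs(v * 2**i - bk) <= radius * 2**i for v, bk in zip(vec, b))

    # Level i = K: a solution is exactly one column (or the padding zero column).
    level = {}
    for col, c in list(zip(columns, costs)) + [((0,) * m, 0)]:
        key = tuple(col)
        if in_window(key, K) and c > level.get(key, float("-inf")):
            level[key] = c

    # Levels i = K-1, ..., 0: merge two half-solutions taken from level i + 1.
    for i in range(K - 1, -1, -1):
        merged = {}
        entries = list(level.items())
        for u, value_u in entries:
            for w, value_w in entries:
                s = tuple(x + y for x, y in zip(u, w))
                total = value_u + value_w
                if in_window(s, i) and total > merged.get(s, float("-inf")):
                    merged[s] = total
        level = merged

    return level.get(tuple(b))
```

For example, `solve_ip_dp([(2,), (3,)], [3, 4], (12,), K=3)` maximizes $3x_1 + 4x_2$ subject to $2x_1 + 3x_2 = 12$; every feasible solution uses at most six columns, so $K = 3$ suffices.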
3.2 Unbounded solutions
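In the previous dynamic program there is no mechanism for detecting when the IP is unbounded. We follow the approach of Eisenbrand and Weismantel to handle unbounded IPs. The IP $\max\{c^T x : Ax = b, x \in \mathbb{Z}^n_{\ge 0}\}$ is unbounded, if and only if $\{x : Ax = b, x \in \mathbb{Z}^n_{\ge 0}\}$ has a solution and $\max\{c^T x : Ax = 0, x \in \mathbb{Z}^n_{\ge 0}\}$ has any solution with positive objective value. After running the dynamic program (thereby verifying that there exists any solution) we have to check if the latter condition holds. We can simply run the algorithm again on $\max\{c^T x : Ax = 0, x \in \mathbb{Z}^n_{\ge 0}\}$ with $K = \lceil m \log_2(4m\Delta + 1) \rceil$. If it returns a positive value, the IP is unbounded. Let us argue why this is enough. We need to understand that when there is a positive solution to $\max\{c^T x : Ax = 0, x \in \mathbb{Z}^n_{\ge 0}\}$, then there is also a positive solution with 1-norm at most $(4m\Delta + 1)^m$. Let $x^*$ be a positive solution to the former IP with minimal 1-norm, i.e., $Ax^* = 0$, $c^T x^* > 0$ and $\|x^*\|_1$ minimal. Let $v^{(1)}, \dots, v^{(t)}$ be the multiset of columns representing $x^*$. We assume that they are ordered as in Corollary 2, so every partial sum has maximum norm at most $2m\Delta$. If $t > (4m\Delta + 1)^m$, then there must be two identical partial sums $v^{(1)} + \dots + v^{(j)} = v^{(1)} + \dots + v^{(k)}$ with $j < k$. In other words, the circle $v^{(1)}, \dots, v^{(t)}$ can be decomposed into two circles $v^{(1)}, \dots, v^{(j)}, v^{(k+1)}, \dots, v^{(t)}$ and $v^{(j+1)}, \dots, v^{(k)}$. One of these must be a positive solution or else their sum would be nonpositive. This means the 1-norm of $x^*$ is not minimal. We conclude that $t \le (4m\Delta + 1)^m$.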
4 Improvements to the running time
4.1 Applying convolution
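Can we speed up the computation of entries in the dynamic table? Let $D^{(i)}$ be the set of vectors $b' \in \mathbb{Z}^m$ with $\|b' - 2^{-i} \cdot b\|_\infty \le 4m\Delta$. Recall that the dynamic program computes values for each element in $D^{(K)}, D^{(K-1)}, \dots, D^{(0)}$. More precisely, for the value of $b' \in D^{(i)}$ we consider vectors $b'' \in D^{(i+1)}$ such that $b' - b'' \in D^{(i+1)}$ and take the maximum sum of the values for $b''$ and $b' - b''$ among all. First consider only the case of $m = 1$. Here we have that $b' - b'' \in D^{(i+1)}$ is equivalent to $|b' - b'' - 2^{-(i+1)} \cdot b| \le 4\Delta$.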
This problem is well studied. It is a variant of (min, +)-convolution.
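(min, +)-convolution Input: $r_1, \dots, r_n \in \mathbb{R}$ and $s_1, \dots, s_n \in \mathbb{R}$. Output: $t_1, \dots, t_n \in \mathbb{R}$, where $t_k = \min_{i + j = k} r_i + s_j$.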
(max, +)-convolution is the counterpart where the maximum is taken instead of the minimum. The two problems are equivalent. Each of them can be transformed to the other by negating the elements. We construct an instance of (max, +)-convolution of size . We set and , both to the value for in the dynamic table. Set the remaining values of and to . Then for , the correct result will be at .
(min, +)-convolution admits a trivial $O(n^2)$ time algorithm and it has been conjectured that there exists no truly sub-quadratic algorithm. There does, however, exist an $O(n^2 / \log(n))$ time algorithm, which we are going to use. In fact, there is a slightly faster algorithm with running time $n^2 / 2^{\Omega(\sqrt{\log n})}$.
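For reference, the trivial quadratic algorithm and the equivalence between the two variants by negation can be written down directly. This is a plain sketch with 0-based indices and an output covering every index sum; it does not implement the faster algorithms mentioned above, and the function names are ours.

```python
def min_plus_convolution(r, s):
    """Trivial quadratic algorithm: t[k] = min over i + j = k of r[i] + s[j]."""
    t = [float("inf")] * (len(r) + len(s) - 1)
    for i, ri in enumerate(r):
        for j, sj in enumerate(s):
            t[i + j] = min(t[i + j], ri + sj)
    return t


def max_plus_convolution(r, s):
    """(max, +)-convolution via negation of inputs and outputs."""
    negated = min_plus_convolution([-x for x in r], [-x for x in s])
    return [-x for x in negated]
```

Entries equal to `float("-inf")` can serve as the "no solution" marker in the (max, +) variant: after negation they become $+\infty$ and never win a minimum.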
We can reduce the problem for arbitrary to a (max, +)-convolution instance of size . To do so, project a vector to
It is easy to see that for all , it holds that , if and only if . This means when we write the value of each to and , where , the correct solutions will be in . More precisely, we can read the result for some at where .
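The idea behind such a projection can be illustrated with a mixed-radix encoding. The sketch below is not the paper's exact map: it assumes every coordinate, after subtracting a lower bound `lower[k]`, is a digit `d` with `2 * d < base`, so that adding two codes never produces a carry. The names `project` and `unproject_sum` are ours.

```python
def project(vec, lower, base):
    """Encode an integer vector as one integer, one base-`base` digit per coordinate."""
    code = 0
    for k, (v, lo) in enumerate(zip(vec, lower)):
        digit = v - lo
        if digit < 0 or 2 * digit >= base:
            raise ValueError("coordinate outside the range the base allows")
        code += digit * base**k
    return code


def unproject_sum(code, lower, base):
    """Recover u + w from project(u) + project(w); valid because no carries occur."""
    out = []
    for lo in lower:
        out.append(code % base + 2 * lo)  # each digit is (u_k - lo) + (w_k - lo)
        code //= base
    return tuple(out)
```

Because the sum of two codes determines the sum of the two vectors, a one-dimensional convolution over the codes combines exactly those pairs of table entries whose right-hand sides add up to the target vector.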
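With an algorithm for (min, +)-convolution with running time $T(n)$ we get an algorithm with running time $O(T(O(m\Delta)^m) \cdot (m \log(m\Delta) + \log(\|b\|_\infty)))$. Inserting $T(n) = n^2 / \log(n)$ we get:
Theorem 5.
There exists an algorithm that finds the optimum of $\max\{c^T x : Ax = b, x \in \mathbb{Z}^n_{\ge 0}\}$ in time $O(m\Delta)^{2m} \cdot \log(\|b\|_\infty)$.
Clearly, a sub-quadratic algorithm, where $T(n) = n^{2-\delta}$ for some $\delta > 0$, would directly improve the exponent.
Next, we will consider the problem of only testing feasibility of an IP. The convolution problem in this case reduces to the following.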
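Boolean convolution Input: $r_1, \dots, r_n \in \{0, 1\}$ and $s_1, \dots, s_n \in \{0, 1\}$. Output: $t_1, \dots, t_n \in \{0, 1\}$, where $t_k = \bigvee_{i + j = k} r_i \wedge s_j$.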
This problem can be solved very efficiently via fast Fourier transform. We compute the $(+, \cdot)$-convolution of the input. It is well known that this can be done using FFT in time $O(n \log(n))$. The $(+, \cdot)$-convolution of $r$ and $s$ is the vector $t$, where $t_k = \sum_{i + j = k} r_i \cdot s_j$. To get the boolean convolution instead, we simply replace each $t_k$ by $\min\{t_k, 1\}$. Using $T(n) = O(n \log(n))$ for the convolution algorithm we obtain the following.
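Theorem 6.
There exists an algorithm that finds an element in $\{x : Ax = b, x \in \mathbb{Z}^n_{\ge 0}\}$, if there is one, in time $O(m\Delta)^m \cdot \log(\Delta) \cdot \log(\Delta + \|b\|_\infty)$.
This can be seen from the calculation below. First we scrape off factors polynomial in $m$:
$$O(m\Delta)^m \cdot \log\big(O(m\Delta)^m\big) \cdot \big(m \log(m\Delta) + \log(\|b\|_\infty)\big) \le O(m\Delta)^m \cdot \log(\Delta) \cdot \big(\log(\Delta) + \log(\|b\|_\infty)\big).$$
Next, we use that $\log(\Delta) + \log(\|b\|_\infty) \le 2 \log(\Delta + \|b\|_\infty)$.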
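The reduction from boolean convolution to an ordinary $(+, \cdot)$-convolution is easy to demonstrate. The sketch below swaps the FFT for a different technique: it packs each sequence into one large integer and lets Python's built-in integer multiplication compute all coefficients at once (Kronecker substitution). This keeps the example free of external libraries; it assumes non-negative integer inputs, and the function names are ours.

```python
def plus_times_convolution(r, s):
    """t[k] = sum over i + j = k of r[i] * s[j], for non-negative integers."""
    if not r or not s:
        return []
    # No output coefficient exceeds this bound, so slots of `width` bits never overflow.
    bound = min(len(r), len(s)) * max(r) * max(s) + 1
    width = bound.bit_length()

    def pack(seq):
        return sum(x << (width * i) for i, x in enumerate(seq))

    product = pack(r) * pack(s)
    mask = (1 << width) - 1
    return [(product >> (width * k)) & mask for k in range(len(r) + len(s) - 1)]


def boolean_convolution(r, s):
    """t[k] is True iff some i + j = k has r[i] and s[j] both true."""
    counts = plus_times_convolution([int(bool(x)) for x in r],
                                    [int(bool(x)) for x in s])
    return [c > 0 for c in counts]  # same as replacing each count by min(count, 1)
```

In the feasibility version of the dynamic program, `r` and `s` would mark which right-hand sides of the previous level are reachable, and the output marks which sums are reachable at the current level.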
4.2 Use of proximity
Eisenbrand and Weismantel gave the following bound on the proximity between continuous and integral solutions.
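Theorem 7 (Eisenbrand and Weismantel).
Let $\max\{c^T x : Ax = b, x \in \mathbb{Z}^n_{\ge 0}\}$ be feasible and bounded. Let $x^*$ be an optimal vertex solution of the fractional relaxation. Then there exists an optimal solution $z^*$ with
$$\|x^* - z^*\|_1 \le m (2m\Delta + 1)^m.$$
We briefly explain how they use this theorem to reduce the right-hand side at the expense of computing the optimum of the fractional relaxation: Note that $z^*_i \ge \ell_i := \max\{0, \lceil x^*_i \rceil - m(2m\Delta + 1)^m\}$. Since $x^*$ is a vertex solution, it has at most $m$ non-zero components. By setting $y = x - \ell$ we obtain the equivalent IP $\max\{c^T y : Ay = b - A\ell, y \in \mathbb{Z}^n_{\ge 0}\}$. Indeed, this IP has a bounded right-hand side:
$$\|b - A\ell\|_\infty = \|A(x^* - \ell)\|_\infty \le \Delta \cdot \|x^* - \ell\|_1 \le m^2 \Delta (2m\Delta + 1)^m = O(m\Delta)^{m+1}.$$
Here, we use that $x^*$ and $\ell$ differ only in non-zero components of $x^*$ and in those by at most $m(2m\Delta + 1)^m$. Like in earlier bounds, the O-notation hides polynomial terms in $m$. Using the $n \cdot O(m\Delta)^{2m} \cdot \|b\|_1^2$ time algorithm of Eisenbrand and Weismantel, this gives a running time of $n \cdot O(m\Delta)^{4m+2} + \mathrm{LP}$, where $\mathrm{LP}$ is the time to solve the relaxation. The logarithmic dependence on $\|b\|_\infty$ in our new algorithm leads to a much smaller exponent: Using Theorem 5 and the construction above, the IP can be solved in time $O(m\Delta)^{2m} + \mathrm{LP}$. Feasibility can be tested in time $O(m\Delta)^m \cdot \log^2(\Delta) + \mathrm{LP}$ using Theorem 6.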
4.3 Heterogeneous matrices
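Let $\Delta_1, \dots, \Delta_m \le \Delta$ denote the largest absolute values of each row in $A$. When some of these values are much smaller than $\Delta$, the maximum among all, we can do better than $O(m\Delta)^{2m} \cdot \log(\|b\|_\infty)$. An example for a highly heterogeneous matrix is Unbounded Knapsack with cardinality constraints. Consider the norm $\|v\| = \max_k |v_k| / \Delta_k$ and let $v^{(1)}, \dots, v^{(t)}$ be the multiset of columns corresponding to an optimal solution of the IP. Using the Steinitz Lemma on this norm, it follows that there exists a permutation $\pi$ such that for all $j \in \{1, \dots, t\}$ and $k \in \{1, \dots, m\}$
$$\Big| \sum_{i=1}^{j} v^{(\pi(i))}_k - j \cdot \frac{b_k}{t} \Big| \le 2m\Delta_k.$$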
This means the number of states we have to consider reduces from to at each level of the dynamic program. Hence, we obtain the running time . When the objective function has small coefficients, it is more efficient to perform a binary search for the optimum and encode the objective function as an additional constraint. We can bound the optimum by using the bound on the 1-norm of the solution. Hence, the binary search takes at most iterations. For a guess the following feasibility IP tests if there is a solution of value at least .
Using boolean convolution this gives the running time
5 Lower bounds
5.1 Optimization problem
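We use an equivalence between Unbounded Knapsack and (min, +)-convolution regarding sub-quadratic algorithms.
Unbounded Knapsack Input: $C \in \mathbb{N}$, $w_1, \dots, w_n \in \mathbb{N}$, and $p_1, \dots, p_n \in \mathbb{N}$. Output: Multiplicities $x_1, \dots, x_n \in \mathbb{Z}_{\ge 0}$, such that $\sum_{i=1}^{n} w_i x_i = C$ and $\sum_{i=1}^{n} p_i x_i$ is maximal.
Note that when we instead require $\sum_{i=1}^{n} w_i x_i \le C$ in the problem above, we can transform it to this form by adding an item of profit zero and weight $1$.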
Theorem 8.
For any $\delta > 0$ there exists no $O((n + C)^{2-\delta})$ time algorithm for Unbounded Knapsack unless there exists a truly sub-quadratic algorithm for (min, +)-convolution.
When using this theorem, we assume that the input already consists of the at most $C$ relevant items only. In particular, $n \le C$ and $w_i \le C$ for all $i$. This preprocessing can be done in time $O(n + C)$.
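The two transformations mentioned here, together with a simple baseline solver for checking small instances, can be sketched as follows. The code assumes positive integer weights; the dynamic program is the textbook $O(n \cdot C)$ one and is not the algorithm the lower bound refers to. All function names are ours.

```python
def relevant_items(weights, profits, capacity):
    """Keep one item per weight in 1..capacity, namely the most profitable one."""
    best = {}
    for w, p in zip(weights, profits):
        if 1 <= w <= capacity and p > best.get(w, float("-inf")):
            best[w] = p
    kept = sorted(best)
    return kept, [best[w] for w in kept]


def with_slack_item(weights, profits):
    """Turn the '<= C' variant into the '= C' variant: add weight 1, profit 0."""
    return list(weights) + [1], list(profits) + [0]


def unbounded_knapsack_eq(weights, profits, capacity):
    """Best profit with total weight exactly `capacity`, or None if impossible."""
    neg = float("-inf")
    best = [neg] * (capacity + 1)
    best[0] = 0
    for cap in range(1, capacity + 1):
        for w, p in zip(weights, profits):
            if w <= cap and best[cap - w] != neg:
                best[cap] = max(best[cap], best[cap - w] + p)
    return None if best[capacity] == neg else best[capacity]
```

With weights 3 and 4 and capacity 5 the equality version has no solution, while the version with the slack item fills the remaining capacity with copies of the weight-1 item.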
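Theorem 9.
For every fixed $m$ there does not exist an algorithm that solves IPs with $m$ constraints in time $f(m) \cdot (n^{2-\delta} + (\Delta + \|b\|_\infty)^{2m-\delta})$ for some $\delta > 0$ unless there exists a truly sub-quadratic algorithm for (min, +)-convolution.
Let $m \in \mathbb{N}$ and $\delta > 0$. Assume that there exists an algorithm that solves IPs of the form $\max\{c^T x : Ax = b, x \in \mathbb{Z}^n_{\ge 0}\}$ where $A \in \mathbb{Z}^{m \times n}$, $b \in \mathbb{Z}^m$, and $c \in \mathbb{Z}^n$ in time $f(m) \cdot (n^{2-\delta} + (\Delta + \|b\|_\infty)^{2m-\delta})$, where $\Delta$ is the greatest absolute value in $A$. We will show that this implies an $O((n + C)^{2-\delta'})$ time algorithm for the Unbounded Knapsack Problem for some $\delta' > 0$. Let $(C, (w_i)_{i=1}^{n}, (p_i)_{i=1}^{n})$ be an instance of this problem. Let us first observe that the claim holds for $m = 1$. Clearly the Unbounded Knapsack Problem (with equality) can be written as the following IP (UKS1).
$$\max \sum_{i=1}^{n} p_i x_i \quad \text{subject to} \quad \sum_{i=1}^{n} w_i x_i = C, \quad x \in \mathbb{Z}^n_{\ge 0}$$
Since $w_i \le C$ for all $i$, we can solve this IP by assumption in time $f(1) \cdot (n^{2-\delta} + (C + C)^{2-\delta}) = O((n + C)^{2-\delta})$.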
Now consider the case where . We want to reduce by exploiting the additional rows. The central idea here is that for some we can consider almost independently the sums over and , where . Let be minimal such that . Then and therefore . Consider the following IP (UKSm).
We claim that (UKSm) is equivalent to (UKS1) and therefore solves Unbounded Knapsack.
Let be a solution to (UKS1). Then for all ,
It follows that there exists an such that . We choose exactly like this. The first constraint follows directly. Now let . By choice of and we have that
The right-hand side of (2) equals
Likewise, for all
By substitution in (2) and division by we get
This implies that constraints are met. Finally consider the special case of the last constraint. By choice of we have that
Thus, and (5) is the last constraint (with ).
Let be a solution to (UKSm) and . We show by induction that for all
With this implies the claim as and for all . For the claim is exactly the first constraint in (UKSm). Now let and assume that the claim above holds. We will show that it also holds for . From (UKSm) we have
Multiplying each side by we get
By inserting the induction hypothesis we conclude
Solving the IP.
All entries of the matrix in (UKSm) and the right-hand side are bounded by . Therefore, by assumption this IP can be solved in time
where is some constant dependent only on . This would therefore yield a truly sub-quadratic algorithm for the Unbounded Knapsack Problem. ∎
5.2 Feasibility problem
We will show that our algorithm for solving feasibility of IPs is optimal (except for log factors).
We use a recently discovered lower bound for k-SUM based on the SETH.