A new Lenstra-type Algorithm for Quasiconvex Polynomial Integer Minimization with Complexity $2^{O(n \log n)}$
We study the integer minimization of a quasiconvex polynomial with quasiconvex polynomial constraints. We propose a new algorithm that is an improvement upon the best known algorithm due to Heinz (Journal of Complexity, 2005). This improvement is achieved by applying a new modern Lenstra-type algorithm, finding optimal ellipsoid roundings, and considering sparse encodings of polynomials. For the bounded case, our algorithm attains a time-complexity of when is a bound on the number of monomials in each polynomial and is the binary encoding length of a bound on the feasible region. In the general case, . In each we assume is a bound on the total degree of the polynomials and bounds the maximum binary encoding size of the input.
We study the integer minimization of a quasiconvex polynomial with quasiconvex polynomial constraints. That is, given quasiconvex polynomials with integer coefficients, we wish to solve the following problem
A function $F \colon \mathbb{R}^n \to \mathbb{R}$ is called quasiconvex if for every $\alpha \in \mathbb{R}$, the lower level set $\{x \in \mathbb{R}^n : F(x) \le \alpha\}$ is a convex subset of $\mathbb{R}^n$. Some quasiconvex programs reduce nicely to convex programs, see for instance, , but this is not likely to be the case in general. Studying quasiconvex integer minimization opens up a larger class of functions that we can optimize over.
We approach the optimization problem by setting and solving the feasibility problem over , where
and applying binary search on objective values until we find an optimal solution. Strict inequalities are used to ensure that if is non-empty, then it is full dimensional in . Since for all , problem (1) can be easily formulated by weak inequalities. This follows from the observation that the inequalities and are equivalent for .
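The binary search over objective values can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation analyzed in this paper: `find_point` is a hypothetical feasibility oracle that returns an integer point whose objective value is at most `alpha` (for an integer-valued objective this is the same as being strictly less than `alpha + 1`) or `None`, and `lo` and `hi` are assumed integer bounds with no feasible value below `lo`.

```python
def minimize_by_bisection(find_point, lo, hi):
    """Binary search on integer objective values using a feasibility oracle.

    find_point(alpha) is a hypothetical oracle: it returns an integer point x
    with objective value <= alpha, or None if no such point exists.
    Assumption: no feasible point has objective value below lo.
    Returns (optimal_value, witness_point), or None if nothing is feasible at hi.
    """
    best = find_point(hi)
    if best is None:
        return None
    # Invariant: a point with value <= hi exists (witness: best),
    # and no point has value < lo.
    while lo < hi:
        mid = (lo + hi) // 2
        x = find_point(mid)
        if x is not None:
            best, hi = x, mid      # optimum is at most mid
        else:
            lo = mid + 1           # optimum is larger than mid
    return hi, best
```

The number of oracle calls is logarithmic in `hi - lo`, which is why the encoding length of a bound on the objective, rather than the bound itself, enters the complexity.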
We use a modern Lenstra-type algorithm to solve the integer feasibility problem. Lenstra’s algorithm was the first algorithm to solve integer linear optimization in polynomial time when the dimension is fixed. It can be applied to any family of convex sets in provided that we can solve the ellipsoid rounding problem over sets in . Khachiyan and Porkolab  showed that Lenstra’s algorithm could be generalized to operate on convex semialgebraic sets, having time-complexity of . For the specific case of quasiconvex polynomial minimization, the current best algorithm is due to Heinz and has time-complexity of , where is an upper bound on the total degree of the polynomials and is the maximum binary encoding size of all coefficients.
Our improvement over Heinz’s algorithm comes primarily from the modern Lenstra-type algorithm that we present. Heinz developed a shallow cut separation oracle to show that Lenstra’s original algorithm applies to the quasiconvex minimization problem (1). We generalize Heinz’s shallow cut separation oracle to show that the modern Lenstra-type algorithm works for the quasiconvex minimization problem (1). We also provide a scheme for evaluating polynomials that exploits sparsity, which allows us to state a more precise complexity of the algorithm based on the number of monomials given in the input.
Let be sparsely encoded quasiconvex polynomials. Let be an upper bound for the degree of the polynomials , let be the maximum number of monomials in each, and let the binary length of the coefficients be bounded by . Then there exists an algorithm for the minimization problem (1) which computes a minimum point or confirms that such a point does not exist.
If the continuous relaxation of the feasible region is bounded such that is the binary encoding length of a bound on that region with , then the algorithm has time-complexity of and output-complexity of .
Otherwise, the algorithm has time-complexity of and output-complexity of .
For , this complexity is .
If for some , then the complexity becomes .
Lenstra’s algorithm solves the integer feasibility problem for a convex set by first finding a pair of concentric ellipsoids, such that , where is a scaled version of with respect to the center. If we can determine that is non-empty, then we are done. Otherwise, we find a direction of minimal width and branch into integer hyperplanes
that cover , creating lower dimensional subproblems to solve. The same approach is then applied to solving each lower dimensional subproblem.
The complexity of Lenstra’s algorithm is guided primarily by the number of subcases that it must evaluate. As shown in section , the number of subcases in each step is bounded by where is called the lattice width direction of the inner ellipsoid and is the rounding radius or scaling between the concentric ellipsoids that are obtained. The lattice width will be used to determine if . This leads to a total number of subcases bounded by
The idea of the modern Lenstra-type algorithm is not new. Kannan was the first to reduce the number of subcases by solving lattice problems. Recently, extraordinary new lattice algorithms and bounds from the geometry of numbers reveal a better complexity. A new idea we present is finding an ellipsoid rounding with an optimal rounding radius of . This is done by using results for the maximal radius of a ball inscribed in the convex hull of points on the sphere of radius 1. We will explain these improvements and how they affect the complexity of our algorithm.
In section 2 we will discuss ellipsoid rounding from the point of view of the shallow cut ellipsoid method and show how we improve from to , which has never been done before in Lenstra’s algorithm. This improvement allows for a better exponential coefficient in the resulting complexity.
In section 3 we will explain how Kannan’s improvement of Lenstra’s algorithm reduces the number of subcases exponentially. This section will begin with a discussion of lattice theory, flatness directions, and the geometry of numbers, and we will reveal that can be improved from as used in the original Lenstra algorithm  along with Khachiyan and Porkolab and in Heinz , to as stated in Eisenbrand . The important feature that we point out is how new lattice algorithms allow this computation to be done deterministically in time, as opposed to , creating a better overall complexity.
In section 4 we state our modern Lenstra-type algorithm more precisely and explain its complexity.
In section 5 we will discuss polynomial encoding and quasiconvex polynomials and then show how to make shallow cuts for sets given by quasiconvex polynomials. This allows Lenstra’s algorithm to be applied to our problem.
In section 6 we discuss the proof of Theorem 1.1.
2 Ellipsoid Rounding
The first step of Lenstra’s algorithm is to find a pair of concentric ellipsoids, one inside and one containing the feasible region. We will write ellipsoids in the form where , is a positive definite matrix and . For example, is the ball of radius centered at the origin. Let be a convex set. The ellipsoid is a -rounding of if
where is called the radius of the rounding. John  showed there exists a -rounding for any convex set. Conversely, for a simplex, a -rounding is the best possible rounding. Finding an optimal rounding will reduce the number of subcases that need to be analyzed in Lenstra’s algorithm. In section 4 we will show precisely where affects the complexity of integer optimization. In this section we will explain how to construct a -rounding.
There are several methods for ellipsoid rounding. Nesterov describes an algorithm to obtain a -rounding, for an arbitrary convex set and also how to obtain a -rounding for centrally symmetric convex sets , but each is based on the assumption that a difficult optimization problem can be solved. For a specific case, Nesterov uses linear programming, whereas we would need to maximize over nonlinear polynomials. In our model, no supplementary optimization problem need be solved. Ellipsoid roundings have also been studied by Khachiyan , which was improved by  and . Some other methods use a volumetric barrier [2, 3]. Unfortunately, none of these have been shown to round general convex sets.
We will use the original approach, which is to employ the shallow cut ellipsoid method. This can be applied to any class of convex sets for which there exists a shallow cut separation oracle.
Definition 2.1 (Shallow cut separation oracle ).
A shallow cut separation oracle for a convex set is an oracle which, for an input and a rational positive definite matrix , outputs one of the following:
verification that is a -rounding of or,
a vector such that the half-space
Theorem 2.2 (Shallow Cut Ellipsoid Method).
There exists an oracle-polynomial time algorithm that for any number and for any circumscribed closed convex set given a shallow cut separation oracle finds a positive definite matrix and a point such that one of the following holds:
(i) is a -rounding of .
This algorithm runs in time oracle-polynomial in .
The main difficulty in generalizing Lenstra’s algorithm to different classes of convex sets is creating shallow cuts. Khachiyan and Porkolab show that for an arbitrary semialgebraic set, a shallow cut can be computed in time [23, Lemma 4.1]. In our algorithm, we intend to do much better for the specific case of quasiconvex polynomials; the complexity discussion is deferred to section 5, where we discuss quasiconvex polynomials.
Suppose that we have an ellipsoid . Ellipsoids are affine transformations of the unit ball; that is, there exists an affine map such that . The standard method of creating a shallow cut separation oracle is to observe that points in will often admit a shallow cut. For the case when is a polyhedron, finding such a point directly admits a shallow cut, whereas in the case of quasiconvex minimization, Heinz showed that with a little more work, we could find a shallow cut. This will be explained in detail in section 5. The remainder of this section will focus on realizing a -rounding.
On the other hand, if we find a set of points within the ball of radius that are in , then any ball will admit a rounding since . The rounding radius is then dependent on the maximum inscribed sphere inside , as seen in Figure 3.
Using the cross-polytope
where is the unit vector,  obtains a -rounding of a polytope, which is also used in . Heinz used this idea to obtain a rounding of a convex region given by quasiconvex polynomials . In order to overcome numerical issues of requiring exact arithmetic, Heinz chose the points
where . Heinz’s choice also obtains an -rounding.
We will generalize and improve Heinz’s method by applying sphere approximating polytopes of Kochol  that attain an optimal bound within a constant factor. A note from Kochol, modified to give more detail, shows the following result.
Theorem 2.3 (Theorem 3 in ).
Let be positive integers, , where is a constant. Define as the maximal radius of a ball (with center at the origin) contained in the convex hull of points chosen from the -dimensional sphere of radius . Then there exist constants and such that
Furthermore, there exists a polynomial time algorithm in and to construct a set of vectors with such that the polytope with extreme points is symmetric across all axes and attains such bounds.
Kochol notes that choosing points improves the -rounding to and still allows a polynomial time rounding, and improves upon the complexity for Lenstra’s algorithm given in . Theorem 2.3 demonstrates that using an exponential number of points is necessary to obtain an -rounding via the shallow cut ellipsoid method. A tighter rounding is advantageous, even if it requires an exponential number of evaluations. In our new algorithm we choose a single exponential number of points ( will suffice) to obtain an -rounding. This improves the exponential coefficient in the final complexity.
We will now show that numerical approximations of Kochol’s approximating spheres will still allow for an optimal rounding. We denote the sphere of radius as . A -net of is a set of points such that for any point , there exists a point such that .
Let be a -net of , let and let . Suppose that is an -approximation of , that is to say that for all there exists a such that
Suppose there exists a point that does not belong to . Then, separating from by a hyperplane , we get a cap of which is disjoint from and its top where is perpendicular to . Since is in , there exists a point from the 1-net that satisfies . See Figure 5 for the geometry.
Letting be the minimum distance between and the hyperplane , we can see that , which is a contradiction since is an -approximation of . ∎
Let be the set of points given by Kochol’s construction for an approximation of the unit sphere and let be an -approximation of with . Then there exists a constant such that the ball of radius is contained in .
This is now the template for our separation oracle. We will choose test points according to an approximation of Kochol’s set of points. If all the test points are feasible, we obtain an -rounding. Otherwise, we find an infeasible test point and generate a shallow cut. The specific algorithm for finding a shallow cut for quasiconvex polynomials will be presented in section 5.
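The test-point template can be sketched in a few lines. This is a hedged illustration only: it uses the vertices of a scaled cross-polytope as test points (the simpler choice attributed to Heinz above) in place of Kochol’s construction, it parametrizes the ellipsoid as the image of the unit ball under a hypothetical affine map `u -> M u + a`, and it stops at reporting an infeasible test point rather than producing the shallow cut itself. The names `rounding_test`, `is_feasible`, `M`, `a` and `r` are assumptions made for this example.

```python
def rounding_test(is_feasible, M, a, r):
    """Check test points of an ellipsoid against a feasibility predicate.

    The ellipsoid is assumed to be {M u + a : ||u|| <= 1}, with M given as a
    list of rows. The test points are the images of +/- r * e_i, i.e. the
    vertices of a cross-polytope of radius r inside the unit ball.

    Returns None if every test point is feasible (the scaled inner ellipsoid
    is then covered by the convex hull argument in the text), otherwise the
    first infeasible test point, from which a shallow cut would be generated.
    """
    n = len(a)
    for i in range(n):
        for sign in (1.0, -1.0):
            # a + M (sign * r * e_i) = a + sign * r * (column i of M)
            x = tuple(a[k] + sign * r * M[k][i] for k in range(n))
            if not is_feasible(x):
                return x
    return None
```

Replacing the `2n` cross-polytope vertices by a larger set of directions changes only the loop over test points; the rest of the template stays the same.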
3 Integer Feasibility and Subcases
We assume now to have an ellipsoid rounding
The next step in Lenstra’s algorithm is to determine if the inner ellipsoid contains an integer point. A simple, yet powerful, way to do that is Khinchin’s Flatness Theorem, which roughly states that if the minimum width of a convex body is greater than some constant , then the convex body contains an integer point. If the minimum width is less than , then we can branch into integer hyperplanes perpendicular to the minimum width direction. Since we must branch on the larger ellipsoid, we will have fewer than subcases to branch into. We will first review lattice theory and flatness directions, and present theorems for reducing the complexity of Lenstra’s algorithm.
Given linearly independent vectors , the lattice generated by is the linear transformation of
The set of vectors , or similarly , is called a basis for the lattice .
Lattices naturally arise when looking for integer points in ellipsoids, since an ellipsoid is an affine transformation of . We will need the following related properties of a lattice. The dual lattice is given by
In particular, one can show that when is full rank, . We will use a transference bound later, which is an inequality relating the properties of a lattice and its dual.
The covering radius is the smallest number such that the set of closed balls of radius centered at each lattice point completely covers all of .
The shortest vector problem (SVP) is a well-studied and important problem in lattice theory with many applications. The goal is to find a non-zero lattice vector with minimal length . For our purposes, we will find the shortest vector with respect to the Euclidean norm; therefore,
3.2 Lattice Widths and the Shortest Vector Problem
Kannan first observed that SVP could be used to minimize the number of branching directions in Lenstra’s algorithm . We follow Eisenbrand in presenting this in the context of flatness directions . Let be a non-empty closed subset of and let be a vector. The width of along is the number
The lattice width of is defined as
and any that minimizes is called a flatness direction of .
Theorem 3.1 (Khinchin’s flatness theorem ).
There exists a function , depending only on the dimension, such that if is convex and , then contains an integer point.
The best known bound for is and it is conjectured that . We will see in the next subsection that, for the specific case of ellipsoids, we can obtain this bound.
Flatness directions are invariant under dilations. This is easily shown for the case of ellipsoids.
Let be a flatness direction for . Then for any , is a flatness direction for with
For an ellipsoid, a flatness direction can be computed by solving the shortest vector problem in the lattice . To see this, consider the width along a direction of the ellipsoid ,
We have where and are contained in the unit ball if and only if . Thus properly choosing and on the boundary of , we see that
Finding the minimum lattice width is reduced to solving a SVP over the lattice .
After this transformation, integer points from now live in where .
There are two computations arising here:
1. We find a shortest vector in to determine a flatness direction. If , then we will project into a minimal number of subcases.
2. If then we have confirmed that contains an integer point. To recover this lattice point, we solve and then set . Since is then a closest integer point to with respect to the ellipsoidal norm and we know that contains an integer point, we have .
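To make the reduction from lattice width to a shortest vector computation concrete, here is a small sketch under stated assumptions. The ellipsoid is written as the image of the unit ball under a hypothetical map `u -> M u + a`, so its width along an integer direction `d` is twice the Euclidean norm of `M^T d`, independent of the center. The shortest vector is found by brute-force enumeration over a small box of integer directions, which stands in for a real SVP algorithm and is only correct when the box contains a minimizer.

```python
from itertools import product
from math import sqrt


def width(M, d):
    """Width of the ellipsoid {M u + a : ||u|| <= 1} along direction d.

    Equals 2 * ||M^T d||, since max/min of d.(M u) over the unit ball
    is +/- ||M^T d||. M is a list of rows.
    """
    n = len(M)
    v = [sum(M[i][j] * d[i] for i in range(n)) for j in range(n)]  # M^T d
    return 2.0 * sqrt(sum(c * c for c in v))


def flatness_direction(M, box=3):
    """Brute-force a flatness direction among nonzero integer d in [-box, box]^n.

    Assumption: the box is large enough to contain a shortest vector of the
    lattice {M^T d : d integer}; this replaces a genuine SVP solver.
    Returns (lattice_width, direction).
    """
    n = len(M)
    best = None
    for d in product(range(-box, box + 1), repeat=n):
        if not any(d):
            continue  # skip the zero vector
        w = width(M, d)
        if best is None or w < best[0]:
            best = (w, d)
    return best
```

For an axis-aligned ellipse with semi-axes 3 and 0.5, the sketch returns a vertical direction with width 1, matching the intuition that the ellipse is thin in that direction.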
3.3 Results from the Geometry of Numbers
The geometry of numbers produces a small bound on the lattice width of an ellipsoid not containing an integer point. Using properties of LLL reduced bases, Lenstra originally observed that this value did not exceed . By considering the specific case of ellipsoids, we can achieve an bound. For an arbitrary lattice, the product of the length of a shortest vector in a lattice and the covering radius of the dual lattice is bounded by a constant dependent only on dimension. Using the Fourier transform applied to a probability measure on a lattice, Banaszczyk showed that this function is bounded by a linear factor in the dimension .
Theorem 3.4 (Theorem 2.2 in ).
Let be a lattice with . Then .
If we assume that a specific ellipsoid does not contain a lattice point, then the covering radius of the associated lattice is greater than one. Since the lattice width of an ellipsoid is simply twice the length of a shortest vector, we obtain the following inequality for ellipsoids.
Theorem 3.5 (Theorem 14.26 in ).
If is an ellipsoid that does not contain an integer point, then .
Let be an ellipsoid not containing an integer point. Then .
Hence, if does not contain an integer point, then .
3.4 Complexity of the Shortest Vector Problem and the Closest Vector Problem
The shortest vector problem (SVP) has been shown to be NP-hard, even to approximate within a constant factor . Until recently, the best known deterministic solution to SVP was by Kannan, with time-complexity . The well-known Ajtai, Kumar, and Sivakumar  sieving method is a probabilistic method that solves SVP with very high probability and achieved the first single exponential time-complexity, which was shown by  to be . Micciancio and Voulgaris improved this type of method to achieve a run time of .
The closest vector problem (CVP) has also been shown to be NP-hard, even to approximate it within a polynomial factor . Kannan presented an algorithm to solve CVP in time , and although there have been some improvements [14, 16], none have been able to achieve a single exponential time-complexity.
Micciancio and Voulgaris discovered the first deterministic single exponential algorithms for CVP and SVP . This is an exciting and impressive result.
Theorem 3.7 (Corollaries 4.3 and 4.4 in ).
There are deterministic algorithms to solve SVP and CVP, with the Euclidean norm, that both have time-complexity .
This result, for the first time, allows the complexity of SVP to be much smaller than the complexity of Lenstra’s algorithm.
We will need the following simple lemma, which can, for instance, be found in . We indicate it here with a proof to give a precise complexity. The proof uses the common fact that if is a basis for a lattice , and is a unimodular matrix, then is also a basis for .
Suppose is a lattice with basis and . Suppose is primitive (i.e., for all ), and let for such that . Then there exists an algorithm that computes vectors such that is a basis for . This algorithm has time-complexity where is an upper bound on the absolute values of and the entries of for all .
Let and let . Without loss of generality, we assume has rank ; otherwise we can simply reorder the basis vectors. Let such that . We now decompose into Hermite normal form, which can be done in polynomial time in the input size and the dimension . That is, we find a unimodular matrix and an upper triangular matrix such that ; therefore, . There are several algorithms to compute Hermite normal form. For a worst-case complexity bound, we use Storjohann and Labahn  with run time where is a bound on the maximum binary encoding length of each entry of , and is the time required to multiply two numbers of size . The entries of are all ’s, ’s, and ’s. Since is unimodular, is a basis for . Since is upper triangular, we find that , and because is primitive and , we have . Thus is a basis for . ∎
For the case where , we can choose . For Lenstra’s algorithm, , hence the time complexity simply becomes .
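The idea behind the lemma can be illustrated for the standard integer lattice. The sketch below is an assumption-laden stand-in for the proof above: instead of the Hermite normal form algorithm of Storjohann and Labahn, it uses pairwise extended Euclidean steps to build a unimodular matrix `U` with `d^T U = (1, 0, ..., 0)`. The columns of `U` after the first then span the integer points of the hyperplane orthogonal to `d`, and all columns together form a basis of the integer lattice. The function names are hypothetical.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with s*a + t*b == g == gcd(a, b) and g >= 0."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t


def unimodular_completion(d):
    """Build an integer matrix U (list of rows), det(U) = +/-1, with d^T U = e_1^T.

    d must be a primitive integer vector (gcd of its entries equal to 1).
    Each step is a unimodular column operation on columns 0 and j.
    """
    n = len(d)
    U = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    row = list(d)  # invariant: row == d^T U
    for j in range(1, n):
        a, b = row[0], row[j]
        if b == 0:
            continue
        g, s, t = extended_gcd(a, b)
        for i in range(n):
            u0, uj = U[i][0], U[i][j]
            U[i][0] = s * u0 + t * uj                  # new column 0
            U[i][j] = (-b // g) * u0 + (a // g) * uj   # new column j
        row[0], row[j] = g, 0
    if row[0] < 0:  # only possible if no Euclidean step was taken
        for i in range(n):
            U[i][0] = -U[i][0]
        row[0] = -row[0]
    if row[0] != 1:
        raise ValueError("d is not primitive")
    return U
```

Each 2-by-2 column operation has determinant `(s*a + t*b) / g = 1`, so `U` stays unimodular throughout.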
After choosing a flatness direction , if the width of the inner ellipsoid is smaller than , we will project into hyperplanes perpendicular to . According to Lemma 3.8, we compute vectors such that is a basis for the lattice . We now consider the projection of into the hyperplane ,
This step must ensure the set is also a convex set in class to allow for the recursion to work. Since is a flatness direction of , we know that for every and it suffices to choose in the set
which has fewer than elements. This means that if the algorithm runs to its full extent, the total number of subcases it will have to evaluate is
Both Heinz and Khachiyan and Porkolab follow Lenstra’s original algorithm, which uses and , leading to a total number of subcases
Our new algorithm has three important features. First, it applies the SVP algorithm of Micciancio and Voulgaris to obtain a flatness direction in time as opposed to the previous time of Kannan. Second, we use the best known flatness constant for ellipsoids, . Third, we achieve an -rounding by using test points for our separation oracle. To our knowledge, this is the first time that this choice has been made, and it improves the exponential coefficient in the final complexity. In our algorithm, the worst case number of subcases reduces to
4 Lenstra-type Algorithm
Here we state a modern Lenstra-type algorithm for a class of convex sets and we comment on the overall complexity. This algorithm requires that be closed under projection into affine hyperplanes in .
Input: A convex set in class .
Output: A point or confirmation that no such point exists.
Bounds: Determine and an such that and if , then .
Ellipsoid Rounding: Compute an ellipsoid for such an such that either
(a) and , or
(b) is a -rounding of .
If we are in case (a), then no such point exists.
Otherwise, we are in case (b) and proceed.
Flatness Direction: Compute a flatness direction of . Either
, then there exists a point , which we can compute by solving a closest vector problem, or
proceed knowing that .
Sublattice: Compute vectors such that
is a lattice basis for .
Project: For each solve the dimensional integer feasibility subproblem on the set
Considering Lenstra’s algorithm in the form presented here, we find it has time complexity of
where we have left the shallow cut and the feasibility test as unknowns since they are dependent on the class .
5 Quasiconvex Shallow Cuts
In this section we will show that the modern Lenstra-type algorithm can be applied to convex sets given by quasiconvex polynomial inequalities. We will begin with a contribution on efficiently encoding polynomials to exploit sparsity. We will then review properties of quasiconvex polynomials that will be useful for making shallow cuts and present our shallow cut oracle.
5.1 Polynomial Encoding
In this paper, we allow our complexity results to vary based on the encoding scheme chosen for the polynomials. Multivariate polynomials can be represented as a list of the coefficients of all the monomials up to degree , which requires a large storage space. This is typically referred to as a dense encoding. Under this scheme, the following remark holds.
Remark 5.1 (Remark 2.1 in ).
Let be a polynomial of total degree at most with integer coefficients of binary length bounded by . Moreover, let be a fixed point with binary encoding size . Then there is an algorithm with time complexity and output-complexity which computes the value of the function and the gradient at the point .
This time-complexity, however, is too pessimistic; for example, it seems to require time to evaluate a monomial of degree .
An alternative is sparse encoding, where monomials are listed with their non-zero exponents and their coefficients, allowing for a more concise representation for short polynomials and a more refined time-complexity analysis. Polynomials and their gradients can then be evaluated in time, where is a bound on the number of monomials in the polynomial. This scheme is potentially problematic in Lenstra’s algorithm because each subproblem is realized by intersecting our region with a hyperplane, which would cause a loss of sparsity (fill-in). For instance, if the given polynomial is and our hyperplane is , then in the reduced dimension it becomes . We note that in the algorithm, we never expand these expressions, allowing sparse encoding to continue to be useful. We instead leave the polynomials alone and store coordinate transformation matrices at each step and then compute the coordinates in the original space to input into the polynomials. Gradients are computed via the chain rule. For context, this is discussed in more detail in Remark 5.6.
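A minimal sketch of this evaluation strategy is given below, under stated assumptions. A sparse polynomial is stored as a list of `(coefficient, {variable_index: exponent})` pairs, the accumulated coordinate change is a hypothetical affine map `x = T y + t`, and the polynomial is never expanded in the new coordinates: it is evaluated at the original-space point and the gradient is pulled back with the chain rule, `grad_y = T^T grad_x`. The example polynomial and hyperplane in the test are illustrative and not taken from the text.

```python
from math import prod


def eval_sparse(poly, x):
    """Evaluate a sparse polynomial: list of (coeff, {var_index: exponent})."""
    return sum(c * prod(x[i] ** e for i, e in mono.items()) for c, mono in poly)


def grad_sparse(poly, x):
    """Gradient of a sparse polynomial at x; only listed variables contribute."""
    g = [0] * len(x)
    for c, mono in poly:
        for i, e in mono.items():
            rest = prod(x[k] ** ek for k, ek in mono.items() if k != i)
            g[i] += c * e * x[i] ** (e - 1) * rest
    return g


def eval_transformed(poly, T, t, y):
    """Value and gradient of y -> poly(T y + t) without expanding the polynomial.

    T is a list of rows (n rows, k columns), t has length n, y has length k.
    """
    n, k = len(T), len(y)
    x = [sum(T[i][j] * y[j] for j in range(k)) + t[i] for i in range(n)]
    value = eval_sparse(poly, x)
    gx = grad_sparse(poly, x)
    # Chain rule: gradient in the new coordinates is T^T times the original one.
    gy = [sum(T[i][j] * gx[i] for i in range(n)) for j in range(k)]
    return value, gy
```

The cost per evaluation stays proportional to the number of stored monomials plus one matrix-vector product, regardless of how many terms the expanded polynomial would have.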
5.2 Quasiconvex Polynomials
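A function $F \colon \mathbb{R}^n \to \mathbb{R}$ is called quasiconvex if all the lower level sets $\{x \in \mathbb{R}^n : F(x) \le \alpha\}$, $\alpha \in \mathbb{R}$, are convex subsets of $\mathbb{R}^n$. Although quasiconvex functions are not necessarily convex, all convex functions are quasiconvex. We follow  for a review on quasiconvex polynomials.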
Lemma 5.2 (Section 4.1, Remark 1 in ).
Let F be a quasiconvex polynomial, a fixed point and , a fixed vector. If the polynomial in is strongly decreasing (or constant, respectively), then is strongly decreasing (or constant, respectively) for all .
This lemma does not necessarily hold if the function is not a polynomial. Consider an example from , This is quasiconvex because all the lower level sets are subspaces, for example . This is a counterexample since (a constant), but can vary with the remaining input.
Lemma 5.2 can be used to determine if a quasiconvex polynomial is constant.
Corollary 5.3 (Corollary 2.3 in ).
Let be a quasiconvex polynomial of degree at most, a point, and let the set be a basis of . If for every , there are pairwise distinct real numbers satisfying for all then the polynomial is constant.
The following lemma is important for generating shallow cuts.
Lemma 5.4 (Lemma 2.4 in ).
Let be a quasiconvex polynomial and let be a fixed point. If and , for every other that satisfies , we have that
As mentioned before, the class of convex sets used must be closed under intersections with affine hyperplanes. We will also require that an ellipsoid bound reduces to a similar ellipsoid bound.
Remark 5.5 (Within the proof of Theorem 4.2 in ).
Let be quasiconvex polynomials, , a positive definite matrix, and a polynomial defined by , for . Moreover, let the binary length of the coefficients be bounded by , let be an upper bound for the degree of the polynomials. Let be nonsingular, , with entries of and of binary length at most . Let
Consider the set and the new coordinates induced by ,
fixing the last coordinate and rewriting the quasiconvex polynomials in terms of the new coordinates.
The maximum binary length of all coefficients belonging to the new polynomials
, is .
Furthermore, all new polynomials are quasiconvex since the transformation is linear and preserves its form for a new suitable . The degree bound and the number of polynomials remain unchanged, but the number of coordinates reduces by one.
Following the notation of Remark 5.5, we will explain here how we evaluate the polynomials and their gradients under the sparse encoding scheme. Suppose of such coordinate transformations are done to produce the variable . Each is a block diagonal matrix where the last block is an identity matrix of size . In each transformation , we are restricting the last variable to be . We denote a polynomial transformed into the new coordinates by . For a given , we can compute as
With the product computed ahead of time, a depth-first search allows us to store at most of these products at any given time. The partial derivatives of then have a simple representation as
where is the column of .
5.3 Shallow Cuts
The main result of  is derived from Heinz’s shallow cut separation oracle for quasiconvex polynomials. The following is an adaptation of Heinz’s proof to allow for stronger ellipsoid roundings. Specifically, we show that his calculation for a basis direction to admit a shallow cut generalizes to having any direction admit a shallow cut. Recall that we are solving the feasibility problem over the set
where all the are quasiconvex polynomials with integer coefficients. Consider an ellipsoid and let be an orthogonal basis of according to the matrix (where the inner product is given by ). Define the affine map such that
Thus if and only if .
Theorem 5.7 (Shallow Cut Separation Oracle).
Let and let , such that . Then there exists a function where and an algorithm with the following input:
sparsely encoded quasiconvex polynomials of total degree , at most monomials in each, and whose coefficients’ binary encoding lengths are bounded by ,
an ellipsoid containing as defined in (2), where the binary encoding length of the columns of and of are bounded by ,
and outputs one of the following answers:
confirmation that the ellipsoid is a -rounding of , or
a vector , with the property
This algorithm runs in time-complexity
First compute an orthogonal basis according to . Let (Heinz used ). Next construct a polytope approximating according to Theorem 2.3 using vertices and let denote the set of non-normalized vertices. For every , define