# On Polynomial Kernels for Sparse Integer Linear Programs

###### Abstract

Integer linear programs (ILPs) are a widely applied framework for dealing with combinatorial problems that arise in practice. It is known, e.g., by the success of CPLEX, that preprocessing and simplification can greatly speed up the process of optimizing an ILP. The present work seeks to further the theoretical understanding of preprocessing for ILPs by initiating a rigorous study within the framework of parameterized complexity and kernelization.

A famous result of Lenstra (Mathematics of Operations Research, 1983) shows that feasibility of any ILP with $n$ variables and $m$ constraints can be decided in time $\mathcal{O}(c^{n^3} \cdot m^{c'})$. Thus, by a folklore argument, any such ILP admits a kernelization to an equivalent instance of size $\mathcal{O}(c^{n^3})$. It is known that, unless NP $\subseteq$ coNP/poly and the polynomial hierarchy collapses, no kernelization with size bound polynomial in $n$ is possible. However, this lower bound only applies when constraints may include an arbitrary number of variables, since it follows from lower bounds for SAT and Hitting Set, whose bounded-arity variants admit polynomial kernelizations.

We consider the feasibility problem for ILPs $Ax \le b$ where $A$ is an $r$-row-sparse matrix, parameterized by the number of variables. We show that the kernelizability of this problem depends strongly on the range of the variables. If the range is unbounded, then this problem does not admit a polynomial kernelization unless NP $\subseteq$ coNP/poly. If, on the other hand, the range of each variable is polynomially bounded in $n$, then we do get a polynomial kernelization. Additionally, this also holds for the more general case when the maximum range $d$ is an additional parameter, i.e., the size obtained is polynomial in $n + d$.

## 1 Introduction

The present work seeks to initiate a study of the preprocessing properties of integer linear programs (ILPs) within the framework of parameterized complexity. Generally, preprocessing (or data reduction) is a universal strategy for coping with combinatorially hard problems and can be combined with other strategies like approximation, brute-force, exact exponential-time algorithms, local search, or heuristics. Unlike those other approaches, preprocessing itself incurs only a polynomial-time cost and is error free (or, in rare cases, with negligible error); recall that under standard assumptions we do not expect to exactly solve any NP-hard problem in polynomial time. Thus, preprocessing before applying other paradigms is essentially free and saves solution quality and/or runtime on parts of the input that are sufficiently easy to handle in polynomial time (see e.g. [23]). For a long time, preprocessing has been neglected in theoretical research for lack of appropriate tools (in fact, it has been observed that no polynomial-time algorithm can shrink all instances of some NP-hard problem unless P = NP [16]; this issue can be avoided in parameterized complexity), and research was limited to experimental evaluation of preprocessing strategies. The introduction of parameterized complexity and its notion of kernelization has sparked a strong interest in theoretically studying preprocessing with proven upper and lower bounds on its performance.

Integer linear programs are widely applied in theory and practice. There is a huge body of scientific literature on ILPs both as a topic of research itself and as a tool for solving other problems. From a theoretical perspective, many fundamental problems that revolve around ILPs are hard, e.g., checking feasibility of a 0/1-ILP is NP-hard by an easy reduction from the classic Satisfiability problem [14]. Similarly, it is easy to express Vertex Cover or Independent Set, thus showing that simple covering and packing ILPs are NP-hard to optimize. Thus, for worst-case complexity considerations, the high expressive power of ILPs comes at the price of encompassing plenty of hard problems and, effectively, inheriting all their lower bounds (e.g., approximability).

In practice, the expressive power of ILPs makes them a versatile framework for encoding and solving many combinatorially hard problems. Coupled with powerful software packages for optimizing ILPs this has created a viable way for solving many practical problems on real-world instances. We refer to a survey of Atamtürk and Savelsbergh [1] for an explanation of the capabilities of modern ILP solvers; this includes techniques such as probing and coefficient reduction. One of the most well-known solvers is the CPLEX package, which is, in particular, known for its extensive preprocessing options and parameters (the interested reader is referred to the online documentation and manual of ILOG CPLEX 12.4 at http://pic.dhe.ibm.com/infocenter/cosinfoc/v12r4/index.jsp; see “presolve”, “preprocessing”). It is known that appropriate preprocessing and simplification of ILPs can lead to strong improvements in running time, e.g., reducing the range of variables or eliminating them altogether, or reducing the number of constraints.
Given the large number of options that a user has for controlling the preprocessing in CPLEX, e.g., the number of substitution rounds to reduce rows and columns, this involves some amount of engineering and has a more heuristic flavor. In particular, there are no performance guarantees for the effect of the preprocessing.

Naturally, this leads to the question of whether there are theoretical performance guarantees for the viability of preprocessing for ILPs. To pursue this question in a rigorous and formal way, we take the perspective of parameterized complexity and its notion of (polynomial) kernelization. Parameterized complexity studies classical problems in a more fine-grained way by introducing one or more additional parameters and analyzing time- and space-usage as functions of input size and parameter. In particular, by formalizing a notion of fixed-parameter tractability, which requires efficient algorithms when the parameter is small, this makes the parameter a quantitative indicator of the hardness of a given instance (see Section 2 for formal definitions). This in turn permits us to formalize preprocessing as a reduction to an equivalent instance of size bounded in the parameter, a so-called kernelization. The intuition is that relatively easy instances should be reducible to a computationally hard, but small core, and we do not expect to reduce instances that are already fairly hard compared to their size (e.g., instances that are already reduced). While classically, no efficient algorithm can shrink each instance of an NP-hard problem [16], the notion of kernelization has been successfully applied to a multitude of problems (see recent surveys by Guo and Niedermeier [15] and Bodlaender [3]). Due to many interesting upper bound results (e.g., [5, 12, 20]) but also the fairly recent development of a lower bound framework for polynomial kernels [16, 13, 4, 8], the existence or non-existence of polynomial kernels (which reduce to size polynomial in the parameter) is receiving high interest.

In this work, we focus on the effect that the dimension, i.e., the number of variables, has on the preprocessing properties of ILPs. Feasibility and optimization of ILPs with low dimension have been studied extensively already, see e.g. [18, 17, 21, 22, 19, 7, 10, 11]. The most important result for our purpose is a well-known work of Lenstra [21], who showed that feasibility of an ILP with $n$ variables and $m$ constraints can be decided in time $\mathcal{O}(c^{n^3} \cdot m^{c'})$; this also means that the problem is fixed-parameter tractable with respect to $n$. This has been improved further, amongst others by Kannan [19] to $n^{\mathcal{O}(n)}$ dependence on the dimension and by Clarkson [7] to (expected) dependence. We take these results as our starting point and consider the problem of determining feasibility of a given ILP parameterized by the number of variables, formally defined as follows.

Integer Linear Program Feasibility($n$) – ILPF($n$)

Input: A matrix $A \in \mathbb{Q}^{m \times n}$ and a vector $b \in \mathbb{Q}^{m}$.

Parameter: $n$.

Output: Is there a vector $x \in \mathbb{Z}^{n}$ such that $Ax \le b$?

It is known by a simple folklore argument that any parameterized problem is fixed-parameter tractable if and only if it admits a kernelization; unfortunately, the implied size guarantee is usually impractical as it is exponential in the parameter. As an example, using the runtime given by Kannan [19] we only get a kernel size of $n^{\mathcal{O}(n)}$. (If the instance is larger than $n^{\mathcal{O}(n)}$, then Kannan’s algorithm runs in polynomial time and we may simply return the answer or a trivial yes- or no-instance. Otherwise, the claimed bound trivially holds.) Unsurprisingly, we are more interested in what kernel sizes can be achieved by nontrivial preprocessing rules. In particular, we are interested in the conditions under which an ILP with $n$ variables can be reduced to size polynomial in $n$, i.e., in the existence of polynomial kernels for Integer Linear Program Feasibility($n$).
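The folklore argument (fixed-parameter tractability implies a kernelization) can be sketched in a few lines. This is only an illustration under stated assumptions: `decide`, `f`, `trivial_yes`, and `trivial_no` are hypothetical placeholders for an FPT decision procedure with running time $f(k) \cdot |x|^{c}$, its parameter dependence, and two fixed constant-size instances; none of these names come from the paper.

```python
from typing import Callable, Tuple

Instance = Tuple[str, int]  # (encoded input x, parameter k)


def kernel_from_fpt(
    instance: Instance,
    decide: Callable[[str, int], bool],  # hypothetical FPT decider, time f(k) * |x|^c
    f: Callable[[int], int],             # hypothetical bound on the parameter dependence
    trivial_yes: Instance,               # fixed constant-size yes-instance
    trivial_no: Instance,                # fixed constant-size no-instance
) -> Instance:
    """Return an equivalent instance of size at most max(f(k), constant)."""
    x, k = instance
    if len(x) > f(k):
        # Here f(k) < |x|, so f(k) * |x|^c < |x|^(c+1): deciding is polynomial in |x|.
        return trivial_yes if decide(x, k) else trivial_no
    # Otherwise the instance is already small: |x| <= f(k).
    return instance
```

With Kannan's algorithm in the role of `decide`, `f` would be of the form $n^{\mathcal{O}(n)}$, which is why the resulting size bound is exponential in the parameter.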

### Related work.

Regarding the existence of polynomial kernels for Integer Linear Program Feasibility($n$), only little is known. In general, parameterized by the number of variables, ILPF($n$) admits no polynomial kernelization unless NP $\subseteq$ coNP/poly and the polynomial hierarchy collapses. This follows, for example, from the results of Dell and van Melkebeek [8] regarding lower bounds for the compressibility of the satisfiability problem, since there is an immediate reduction from SAT to ILPF($n$). Similarly, it also follows from earlier results of Dom et al. [9], who showed that Hitting Set parameterized by the universe size admits no polynomial kernelization under the same assumption.

We note that both ways of excluding polynomial kernels for Integer Linear Program Feasibility($n$) use reductions from problems with unbounded arity. Crucially, both $r$-Hitting Set and $r$-SAT admit polynomial kernels of size roughly $\mathcal{O}(n^{r})$, where $n$ is the number of elements and variables, respectively, which can be obtained trivially by discarding duplicate sets or clauses. Surprisingly perhaps, the work of Dell and van Melkebeek [8] shows that these bounds are tight, assuming NP $\nsubseteq$ coNP/poly, i.e., there are no reductions to size $\mathcal{O}(n^{r-\epsilon})$ for any $\epsilon > 0$. We emphasize that this also implies the lower bound for Integer Linear Program Feasibility($n$) since it can express, e.g., Hitting Set with sets of unbounded size (exceeding any constant $r$).
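The trivial kernelization by discarding duplicates can be illustrated as follows. This is a minimal sketch; the function name `discard_duplicate_sets` and the representation of an instance as a list of Python sets are assumptions made for the example.

```python
from typing import FrozenSet, Hashable, Iterable, List


def discard_duplicate_sets(sets: Iterable[Iterable[Hashable]]) -> List[FrozenSet]:
    """Keep one copy of each distinct set, preserving first-occurrence order."""
    seen = set()
    kept = []
    for s in sets:
        fs = frozenset(s)  # order and repetition inside a set are irrelevant
        if fs not in seen:
            seen.add(fs)
            kept.append(fs)
    return kept
```

Over a universe of $n$ elements there are only $\mathcal{O}(n^{r})$ distinct sets of size at most $r$ (for constant $r$), so the deduplicated instance has at most that many sets.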

Motivated by these facts about the kernelization lower bound for Integer Linear Program Feasibility($n$) and the existing straightforward polynomial kernels for $r$-Hitting Set and $r$-SAT, we study the influence of arity on the existence of polynomial kernels for ILPF($n$). Regarding the considered integer linear programs with constraints $Ax \le b$, this translates to $A$ being $r$-row-sparse, i.e., having at most $r$ nonzero entries in each row. (This is equivalent to requiring that each constraint has at most $r$ variables with nonzero coefficients.)
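As a small illustration of the definition, row-sparseness of a constraint matrix can be checked directly. The dense list-of-rows representation and the name `is_row_sparse` are assumptions for this sketch.

```python
from typing import Sequence


def is_row_sparse(A: Sequence[Sequence[int]], r: int) -> bool:
    """True if every row of A has at most r nonzero entries."""
    return all(sum(1 for a in row if a != 0) <= r for row in A)
```

For example, the matrix with rows `[1, -2, 0, 5]` and `[0, 0, 3, 0]` is 3-row-sparse but not 2-row-sparse.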

### Our results.

We study Integer Linear Program Feasibility($n$) for the case that the constraint matrix is $r$-row-sparse; we call this problem $r$-Sparse Integer Linear Program Feasibility($n$) ($r$-SILPF($n$)). Note that $r$ is a constant that is fixed as a part of the problem (it makes no sense to study $r$ as an additional parameter since we already know that constraints involve at most all $n$ variables, but already for SAT parameterized by the number of variables this is not enough to avoid a kernelization lower bound).

Our main result is that $r$-SILPF($n$) admits no polynomial kernelization for any $r \ge 3$, unless NP $\subseteq$ coNP/poly. Thus we see that, unlike the simpler problems $r$-Hitting Set and $r$-SAT, a restriction on the arity (or row-sparseness) is not enough to ensure a polynomial kernelization. For this result we give a cross-composition (introduced by Bodlaender et al. [6]; see Section 2) from Clique to $r$-SILPF($n$). Concretely, we encode $t$ instances of Clique into a single instance of $r$-SILPF($n$) with parameter value bounded polynomially in the largest Clique instance plus $\log t$, such that our obtained instance is yes if and only if at least one of the Clique instances is yes. This is presented in Section 3. The lower bound can be seen to also apply to the case of parameterization by $n + \log C$, where $C$ is the largest absolute value of any coefficient (this refers to integer coefficients which can be obtained by scaling, or, alternatively, one could use the binary encoding size in place of $\log C$); this is interesting since an ILP with $n$ variables and $m$ constraints can be trivially encoded in space $\mathcal{O}(nm \log C)$.

Unlike other proofs via compositions or cross-compositions, the parameterization by the number of variables combined with the row-sparseness restriction prevent many standard tricks. For example, without the row-sparseness we could simply encode the selection of an instance number of one of the Clique instances. Then we could add constraints that encode all the edges of the input graphs, but which are only valid when the binary encoding of the instance number matches the constraint. Unfortunately, this involves constraints with variables.^{4}^{4}4We can emulate a few such constraints by use of auxiliary variables, but we cannot afford to do this for the constraints corresponding to all instances. (Of course without row-sparseness, a lower bound is known already.) Similarly, if we could use slack variables we could very easily control the constraints and have only those for a single instance of Clique be relevant; however, we cannot afford this.

Our solution goes by using a significantly larger domain for the variables that encode the selection of a clique in one of the input graphs. We use a variable for the instance number, and add (linear) constraints that enforce . This permits us to use indicator variables for the desired clique whose feasible values depend quadratically on the chosen instance number. Accordingly, we can arrange the constraints for the edges of all input graphs , such that they intersect this feasible region when . In this way, depending on , only the constraints from one instance will restrict the choice of values for the indicator variables (beyond the restriction imposed directly by and ).

Complementing our lower bound, and recalling the large domain required for the construction, we analyze the effect of the maximum variable range on the preprocessing. It turns out that we can efficiently reduce row-sparse ILPs of form $Ax \le b$ to a size that is polynomial in $n + d$, where $n$ is the number of variables and $d$ is the maximum range of any variable. In other words, $r$-Sparse Integer Linear Program Feasibility admits a polynomial kernelization with respect to the combined parameter $n + d$, or when $d$ is polynomially bounded in $n$; this is shown in Section 4. Together, our upper and lower bounds show that the existence of polynomial kernels for $r$-Sparse Integer Linear Program Feasibility depends strongly on the permitted range for the variables. Note that our lower bound proof (Section 3) allows the conclusion that parameterization by $n + \log d$ is not enough to allow a polynomial kernelization: The maximum value of any variable is polynomial in , implying that which suffices for a cross-composition (see Definition 2). We emphasize that small range without row-sparseness does not suffice by the mentioned reductions from SAT and Hitting Set.

Furthermore, let us point out that for the case of an ILP of form $Ax = b$, $x \ge 0$, with $n$ variables, Gaussian elimination suffices to reduce the number of constraints to $n$, but still leaves the remaining problem of reducing the size of the coefficients in order to obtain a polynomial kernelization. Note that, while in general there are trivial transformations between $Ax \le b$ and $Ax = b$, $x \ge 0$, going from $Ax \le b$ to $Ax = b$ uses one slack variable per constraint and hence would increase our parameter (the number of variables) by the number of constraints; this would make any further reduction arguments pointless.

## 2 Preliminaries

### Parameterized complexity and kernelization.

A parameterized problem over some finite alphabet $\Sigma$ is a language $\mathcal{Q} \subseteq \Sigma^{*} \times \mathbb{N}$. The problem $\mathcal{Q}$ is fixed-parameter tractable if $(x,k) \in \mathcal{Q}$ can be decided in time $f(k) \cdot |x|^{\mathcal{O}(1)}$, where $f$ is an arbitrary computable function. A polynomial-time algorithm $K$ is a kernelization for $\mathcal{Q}$ if, given input $(x,k)$, it computes an equivalent instance $(x',k')$ with $|x'| + k' \le h(k)$, where $h$ is some computable function; $K$ is a polynomial kernelization if $h$ is polynomially bounded (in $k$). By relaxing the restriction that the created instance must be of the same problem and allowing the output to be an instance of any classical decision problem, we get the notion of (polynomial) compression.

For our lower bound proof we use the concept of an (or-)cross-composition of Bodlaender et al. [6] which builds on a series of earlier results [13, 4, 8] that created a framework for ruling out polynomial kernelizations for certain problems.

###### Definition 1 ([6]).

An equivalence relation $\mathcal{R}$ on $\Sigma^{*}$ is called a polynomial equivalence relation if the following two conditions hold:

1. There is a polynomial-time algorithm that decides whether two strings belong to the same equivalence class (time polynomial in $|x| + |y|$ for $x, y \in \Sigma^{*}$).

2. For any finite set $S \subseteq \Sigma^{*}$ the equivalence relation $\mathcal{R}$ partitions the elements of $S$ into a number of classes that is polynomially bounded in the size of the largest element of $S$.

###### Definition 2 ([6]).

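Let $L \subseteq \Sigma^{*}$ be a language, let $\mathcal{R}$ be a polynomial equivalence relation on $\Sigma^{*}$, and let $\mathcal{Q} \subseteq \Sigma^{*} \times \mathbb{N}$ be a parameterized problem. An or-cross-composition of $L$ into $\mathcal{Q}$ (with respect to $\mathcal{R}$) is an algorithm that, given $t$ instances $x_1, \ldots, x_t \in \Sigma^{*}$ of $L$ belonging to the same equivalence class of $\mathcal{R}$, takes time polynomial in $\sum_{i=1}^{t} |x_i|$ and outputs an instance $(y,k) \in \Sigma^{*} \times \mathbb{N}$ such that:

1. The parameter value $k$ is polynomially bounded in $\max_i |x_i| + \log t$.

2. The instance $(y,k)$ is yes for $\mathcal{Q}$ if and only if at least one instance $x_i$ is yes for $L$.

We then say that $L$ or-cross-composes into $\mathcal{Q}$.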

###### Theorem 1 ([6]).

If an NP-hard language $L$ or-cross-composes into the parameterized problem $\mathcal{Q}$, then $\mathcal{Q}$ does not admit a polynomial kernelization or polynomial compression unless NP $\subseteq$ coNP/poly and the polynomial hierarchy collapses.

## 3 A kernelization lower bound for sparse ILP Feasibility

In this section we show our main result, namely that a restriction to row-sparse matrices is not enough to ensure a polynomial kernelization for Integer Linear Program Feasibility parameterized by the number of variables. The problem is defined as follows.

$r$-Sparse Integer Linear Program Feasibility($n$) – $r$-SILPF($n$)

Input: An $r$-row-sparse matrix $A \in \mathbb{Q}^{m \times n}$ and a vector $b \in \mathbb{Q}^{m}$.

Parameter: $n$.

Output: Is there a vector $x \in \mathbb{Z}^{n}$ such that $Ax \le b$?

To prove the kernelization lower bound for $r$-SILPF($n$) we give an or-cross-composition from the NP-hard Clique problem, i.e., a reduction of many Clique instances into a single instance of $r$-SILPF($n$). The idea behind the construction is to use a fairly large domain in order to recycle the same variables for the constraints that correspond to many different instances.

As a first step we state two propositions which together allow us to “compute” the square of a variable inside an ILP, i.e., to add constraints such that some variable is exactly the square of another in all feasible solutions.

###### Proposition 1.

Let , , , and denote integer variables with range each. Then any feasible assignment for satisfies . Conversely, for any choice of , , and such that , there is a choice of such that holds.

###### Proposition 2.

Let with and let denote the binary expansion of , i.e., . Then

Together the two propositions provide a way of forcing some variable in an ILP to take a value exactly equal to the square of another value. If this requires auxiliary variables and constraints. Now we will give our construction.
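The squaring idea can be checked numerically. The sketch below is an assumption-laden illustration, not a transcription of Propositions 1 and 2: it uses the standard linearization of a product $u = q \cdot p$ of a binary variable $q$ and a bounded integer $p \in \{0, \ldots, P\}$, and the identity $p^{2} = \sum_i 2^{i} \cdot (p_i \cdot p)$ for the binary expansion $p = \sum_i 2^{i} p_i$. The names `product_values`, `square_via_bits`, and the bound `P` are hypothetical.

```python
from typing import List


def product_values(p: int, bit: int, P: int) -> List[int]:
    """All integers u in [0, P] satisfying the linear constraints
    u <= P * bit,  u <= p,  u >= p - P * (1 - bit),
    which is a standard linearization of u = bit * p for bit in {0, 1}, 0 <= p <= P."""
    return [
        u
        for u in range(P + 1)
        if u <= P * bit and u <= p and u >= p - P * (1 - bit)
    ]


def square_via_bits(p: int, num_bits: int) -> int:
    """Compute p^2 as sum_i 2^i * (p_i * p), where p_i are the bits of p."""
    bits = [(p >> i) & 1 for i in range(num_bits)]
    return sum((1 << i) * (b * p) for i, b in enumerate(bits))
```

Each product term involves only a constant number of variables per linear constraint, and a number with $\ell$ bits needs $\ell$ such product terms, which matches the logarithmic overhead in auxiliary variables and constraints that the construction relies on.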

###### Theorem 2.

Let $r \ge 3$ be an integer. The $r$-SILPF($n$) problem does not admit a polynomial kernelization or compression unless NP $\subseteq$ coNP/poly and the polynomial hierarchy collapses.

###### Proof.

We give an or-cross-composition from the NP-hard Clique problem. Let instances of Clique be given. By a polynomial equivalence relation that partitions instances according to number of vertices and requested clique size it suffices to consider instances that ask for the same clique size and such that each input graph has vertices. We denote the instances ; for convenience, assume that all graphs have the same vertex set and edge sets for . We will create a single instance of -Sparse Integer Linear Program Feasibility() that is yes if and only if at least one instance is yes for Clique. Without loss of generality, we assume that ; otherwise we could copy some instance sufficiently often (at most doubling the input size).

### Construction–essential part.

For the sake of readability we first describe the matrix by writing down the constraints in a succinct way ignoring the sparsity requirement; there will be a small number of constraints on more than three variables which will be converted later. We also specify explicit ranges for the variables which can be enforced by the obvious constraints. Note that , , , , , and are constants in the ILP; and are used in sums but the expansion of each sum is a constraint where and have constant values.

The first group of variables, namely and serve to pick an instance number and enforce the variables to equal the binary expansion of .

(1)

(2)

(3)

Next we create a variable and auxiliary variables and with the sole purpose of enforcing but using only linear constraints.

(4)

for all (5)

for all (6)

(7)

We introduce variables for all which will encode a -clique in instance . These variables are restricted to take one of two values that depend on in a quadratic way (using ; recall that is a constant).

(8)

(9)

That is, we restrict to .

Now we get to the central piece of the ILP, namely the constraints which will enforce the non-edges of the graph . However, we of course need to add those constraints for all input graphs . It is crucial that only the constraints for have an effect on the -variables (beyond the restriction already imposed by (8) and (9)). We add the following for all and instance numbers if is not an edge of .

(10)

Finally, we take the sum over all , deduct times the minimum value and check that this is at least as large as the specified target value .

(11)

This completes the essential part of the construction. Formally we still need to convert all constraints into form and to use only three variables in each constraint. However, the proof will be given regarding the more accessible constraints stated above.

### Construction–formal part.

We use to refer to the vector of all variables used above, e.g., . Thus, at this point, we use variables.

To formally complete the construction one now needs to translate all constraints to form . Furthermore, using auxiliary variables, one needs to convert this to such that has at most three non-zero entries in each row. It is clear that all range constraints, namely (1), (2), (4), and (5) can be expressed by two linear inequalities with one variable each. Also the constraints (8), (9), and (10) need no further treatment since they are already linear inequalities with at most three variables each (that is, it suffices to rearrange them to have all variables on one side when transforming to ).

For the remaining constraints, namely (3), (6), (7), and (11) we need to use auxiliary variables to replace them by small sets of linear inequalities with at most three variables each. We sketch this for (3), which requires expressing a sum using partial sums. We introduce new variables and replace as follows; the intuition is that .
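The partial-sum technique can be sketched generically. The code below is an illustration under assumptions: constraints are represented as `(row, rhs)` pairs meaning `sum(row[v] * value[v]) == rhs`, the auxiliary names `s1, s2, ...` are hypothetical, the variable names in the sum are assumed distinct, and the sum is assumed nonempty. Each resulting equality can afterwards be replaced by two inequalities, as described above.

```python
from typing import Dict, List, Tuple

Constraint = Tuple[Dict[str, int], int]  # (coefficients by variable name, right-hand side)


def split_sum(coeffs: List[Tuple[str, int]], rhs: int, prefix: str = "s") -> List[Constraint]:
    """Rewrite sum_i a_i * x_i == rhs as equalities with at most three variables each.

    Introduces partial sums s_i with s_1 = a_1*x_1 and s_i = s_{i-1} + a_i*x_i,
    and finally requires s_n == rhs.
    """
    constraints: List[Constraint] = []
    prev = None
    for i, (var, a) in enumerate(coeffs, start=1):
        s = f"{prefix}{i}"
        row = {s: 1, var: -a}
        if prev is not None:
            row[prev] = -1
        constraints.append((row, 0))  # s_i - s_{i-1} - a_i * x_i == 0
        prev = s
    constraints.append(({prev: 1}, rhs))  # last partial sum equals the target
    return constraints


def holds(constraints: List[Constraint], value: Dict[str, int]) -> bool:
    """Check an assignment against all equalities."""
    return all(sum(c * value[v] for v, c in row.items()) == rhs for row, rhs in constraints)
```

A sum over $k$ variables yields $k$ auxiliary variables and $k + 1$ equalities, so the number of variables only grows linearly in the length of the sum.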

We use variables for constraint (3), variables for constraints (6), variables for constraint (7), and variables for constraint (11). Altogether we use additional variables. In total our ILP uses variables, which is consistent with the definition of a cross-composition (polynomial in the largest input instance plus the logarithm of the number of instances).

### Completeness.

To show correctness, let us first assume that some instance is yes for Clique, and let be some -clique in . We will determine a value such that (this is the system obtained by transforming all constraints to inequalities in at most three variables). Again, for clarity, we will simply pick values only for all variables used in the succinct representation (i.e., all variables occurring in (1)–(11)) and check that all (in-)equalities are satisfied. It is obvious how to extend this to the auxiliary variables that are required for formally writing down all constraints as .

First of all, we set and set the variables such that they match the binary expansion of . Clearly, this satisfies constraint as well as the range of each encountered variable. It follows from Proposition 1 that we can set and also find feasible values for all such that all constraints (6) are satisfied. Hence, by Proposition 2 we can set while satisfying constraint (7).

Now, let us assign values to variables for as follows

It is easy to see that this choice satisfies both constraints (9) and (11), since .

Finally, we have to check that the (non-)edge constraints (10) are satisfied for all and all edges . There are two cases, namely and , i.e., we have to satisfy constraints for (using the fact that is a clique) but also constraints created for graphs with .

Let us first consider the case ; concretely, we take the maximum value for , namely , and compare it to the value of constraint (10), namely , using that and :

Since the last inequality holds if , which is exactly what we assumed. Thus all non-edge constraints for graphs with are satisfied.

We now consider the non-edge constraints for . We compute the difference between the bound of constraint (10) and the minimum value of , namely , to check that our assignment to -variables is feasible. Note that and :

Thus, if then at most one of and can take value without violating constraint (10). Otherwise, if , then, from the perspective of this edge, both variables may take value . Clearly, this is consistent with our assignment to the -variables, since the larger value is assigned to all variables that correspond to the vertices of the -clique .

### Soundness.

For soundness, let us assume that we have a feasible solution such that . Again, we consider only the variables of constraints (1)–(11). Recall that . We claim that the graph must have a clique of size at least .

Observe that all variables for have value or in due to constraints (8) and (9). We define a vertex subset by stating that it contains exactly those vertices with . The goal is to show that is a clique in .

As for the converse direction, feasible solutions are required to have , which follows from Propositions 1 and 2; note that obviously the variables need to equal the binary expansion of due to constraint (3).

Now, we consider the non-edge constraints (10) for and compare them to the lower bound of for variables ; we already did this computation earlier, again we have and :

Hence, for every non-edge of among and at most one of the two variables can take the larger value . Therefore, when , then is an edge of . Thus, is a clique in . It follows from that constraint (11) enforces that for at least vertices . Therefore, is of size . This completes the or-cross-composition from Clique.

The cross-composition in the proof of Theorem 2 uses variables of range polynomial in and coefficients of absolute value bounded polynomially in . We will discuss the aspect of variable range in the following section. The size of the coefficients is also interesting since an ILP with integer coefficients (like the one we create) can be easily encoded in space where is the absolute value of the largest coefficient. As the given cross-composition has we see that space polynomial in suffices, and hence the lower bound applies also to -Sparse Integer Linear Program Feasibility(); regarding parameters , , and this is a maximal negative case since parameterization by trivially gives a polynomial kernel (by the mentioned encoding). Put differently, the obstacle established in the lower bound proof is the large number of coefficients; coefficients of value polynomial in are required to make this work, but it is not their encoding size that is the obstacle for polynomial kernels.

## 4 A polynomial kernelization for sparse ILP with bounded range

We have seen that for $r$-Sparse Integer Linear Program Feasibility($n$) there is no polynomial kernelization unless NP $\subseteq$ coNP/poly. The proof relies strongly on having variables of high range in order to encode the constraints of $t$ instances of Clique. It is natural to ask whether a similar result can be proven when the maximum range of any variable is small, e.g., polynomial in the number of variables. We show that this is not the case by presenting a polynomial kernelization for the variant where the maximum range is an additional parameter. The problem is defined as follows.

$r$-Sparse Bounded Integer Linear Program Feasibility($n$,$d$)

Input: An $r$-row-sparse matrix $A \in \mathbb{Q}^{m \times n}$ and a vector $b \in \mathbb{Q}^{m}$.

Parameter: $n + d$.

Output: Is there a vector $x \in \{0, \ldots, d-1\}^{n}$ such that $Ax \le b$?

Note that we restrict to the seemingly special case where each variable is not only restricted to $d$ different consecutive values, but in fact all variables must take values from $\{0, \ldots, d-1\}$. It can be easily checked that this is as general as allowing any $d$ consecutive integers, since we could shift variables to range $\{0, \ldots, d-1\}$ without changing feasibility (by changing $b$).
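The shifting argument can be made concrete: substituting $x = y + \ell$ for a vector $\ell$ of lower bounds turns $Ax \le b$ into $Ay \le b - A\ell$. The sketch below assumes integer data in dense list form; the name `shift_rhs` and the vector `lower` are hypothetical.

```python
from typing import List, Sequence


def shift_rhs(A: Sequence[Sequence[int]], b: Sequence[int], lower: Sequence[int]) -> List[int]:
    """Right-hand side b' = b - A * lower, so that A(y + lower) <= b  iff  A y <= b'."""
    return [bi - sum(a * l for a, l in zip(row, lower)) for row, bi in zip(A, b)]
```

Only the right-hand side changes; the matrix $A$, and hence its row-sparseness, is untouched.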

###### Theorem 3.

-Sparse Bounded Integer Linear Programming Feasibility() admits a polynomial kernelization with size .

###### Proof.

We assume that since otherwise the problem can be solved in time by work of Bar-Yehuda and Rawitz [2] and the theorem follows trivially. Recall that for the problem is NP-hard by a reduction from -SAT.

The kernelization works by considering all choices of of the variables and replacing the constraints (i.e., inequalities) in which contain only those variables. The starting observation is that there are choices of picking values for variables, and the considered constraints prevent some of those from being feasible. It can be efficiently checked which of the assignments are feasible. For each infeasible point we show how to give a small number of constraints that exactly exclude this point. Together, all those new constraints have the same effect as the original ones, allowing the latter to be discarded.
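The enumeration step of this strategy can be sketched as follows. The code only computes, for every choice of $r$ variables, the points excluded by the constraints on those variables; it does not build the replacement constraints (12) and (13). The dense integer representation, the variable domain `range(d)`, and the names `infeasible_points` and `avoids_all` are assumptions made for the illustration.

```python
from itertools import combinations, product
from typing import Dict, List, Sequence, Tuple


def infeasible_points(
    A: Sequence[Sequence[int]], b: Sequence[int], r: int, d: int
) -> Dict[Tuple[int, ...], List[Tuple[int, ...]]]:
    """For each r-subset S of variable indices, list the points in {0..d-1}^r
    that violate some constraint whose nonzero coefficients all lie in S."""
    n = len(A[0])
    forbidden = {}
    for S in combinations(range(n), r):
        # Constraints that involve only variables from S.
        rows = [
            (row, bi)
            for row, bi in zip(A, b)
            if all(a == 0 for j, a in enumerate(row) if j not in S)
        ]
        bad = [
            pt
            for pt in product(range(d), repeat=r)
            if any(sum(row[j] * v for j, v in zip(S, pt)) > bi for row, bi in rows)
        ]
        if bad:
            forbidden[S] = bad
    return forbidden


def avoids_all(x: Sequence[int], forbidden: Dict[Tuple[int, ...], List[Tuple[int, ...]]]) -> bool:
    """True if the projection of x onto every subset S avoids the points forbidden for S."""
    return all(tuple(x[j] for j in S) not in pts for S, pts in forbidden.items())
```

For constant $r$ there are $\mathcal{O}(n^{r})$ subsets and $d^{r}$ candidate points per subset, so the enumeration is polynomial in the input size, in line with the argument above.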

Let be any of variables and let denote the set of all points that are infeasible for constraints only involving . (Note that the whole ILP might be infeasible, but locally we only care for an equivalent replacement of the constraints.) We show constraints that enforce :

(12)

(13)

This requires variables and constraints; a few more variables and constraints are required to transform the constraints into an equivalent set of inequalities with at most variables each: For constraint (13) it suffices to flip the sign since it is already an inequality on variables. For constraints (12) we can replace each equality by two equalities using a new auxiliary variable (in fact this is only needed when ) and replacing both equalities in turn by two inequalities. We use variables and constraints total. Note that all coefficients have values in and can be encoded by bits (in fact two bits suffice easily for four values).

Assume first that . Thus , which implies that (taking into account the domains of and ) for all . Thus constraint (13) is violated, making infeasible. On the other hand, if , then there is a position with . It follows that (due to the range of ) which in turn implies that since the contribution of to the equality is a multiple of . Thus constraint (13) is fulfilled.

It follows that we are able to add constraints which exclude any desired point for . Let us complete the proof. Clearly, if a vector fulfills then any choice of variables from fulfills all constraints that contain only these variables. This in turn means that those variables avoid the points that are excluded by the constraints, which implies that they satisfy all our new constraints (since avoiding those points is all that is needed).

Conversely, assume that a vector fulfills all new constraints and hence any choice of variables avoids all forbidden points. Since any of the original constraints contains at most variables, it comes down to forbidding some set of points. Since fulfills our new constraints it also avoids all infeasible points for . Thus, satisfies also all original constraints.

Summarizing, we are able to replace all constraints by new constraints with small coefficients, which have the same outcome. Clearly the computations can be performed in polynomial time (the input size dominates , , and the encodings of all coefficients in and ). Since for any variables there are at most infeasible points, we need at most constraints and variables. The generated equivalent instance can be encoded by bits, by encoding each constraint (on variables) as the binary encoded names of the variables with nonzero coefficients followed by the values of the coefficients. ∎

## 5 Conclusion

We prove that the existence of polynomial kernels for -Sparse Integer Linear Program Feasibility with respect to the number of variables depends strongly on the maximum range of the variables. If the range is unbounded, then there is no polynomial kernelization under standard assumptions. Otherwise, if the range of each variable is polynomially bounded in then we establish a polynomial kernelization. This holds also for the more general case of using the maximum range as an additional parameter.

Future work will be directed at more restricted cases of ILPs in order to obtain more positive kernelization results. Similarly, structural parameters of ILPs seem largely unexplored.

## References

- [1] Alper Atamtürk and Martin W. P. Savelsbergh. Integer-programming software systems. Annals OR, 140(1):67–124, 2005.
- [2] Reuven Bar-Yehuda and Dror Rawitz. Efficient algorithms for integer programs with two variables per constraint. Algorithmica, 29(4):595–609, 2001.
- [3] Hans L. Bodlaender. Kernelization: New upper and lower bound techniques. In IWPEC, volume 5917 of LNCS, pages 17–37. Springer, 2009.
- [4] Hans L. Bodlaender, Rodney G. Downey, Michael R. Fellows, and Danny Hermelin. On problems without polynomial kernels. J. Comput. Syst. Sci., 75(8):423–434, 2009.
- [5] Hans L. Bodlaender, Fedor V. Fomin, Daniel Lokshtanov, Eelko Penninkx, Saket Saurabh, and Dimitrios M. Thilikos. (Meta) Kernelization. In FOCS, pages 629–638, 2009.
- [6] Hans L. Bodlaender, Bart M. P. Jansen, and Stefan Kratsch. Cross-composition: A new technique for kernelization lower bounds. In STACS, volume 9 of LIPIcs, pages 165–176. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2011.
- [7] Kenneth L. Clarkson. Las Vegas algorithms for linear and integer programming when the dimension is small. J. ACM, 42(2):488–499, 1995.
- [8] Holger Dell and Dieter van Melkebeek. Satisfiability allows no nontrivial sparsification unless the polynomial-time hierarchy collapses. In STOC, pages 251–260. ACM, 2010.
- [9] Michael Dom, Daniel Lokshtanov, and Saket Saurabh. Incompressibility through colors and ids. In ICALP (1), volume 5555 of LNCS, pages 378–389. Springer, 2009.
- [10] Friedrich Eisenbrand. Fast integer programming in fixed dimension. In ESA, volume 2832 of LNCS, pages 196–207. Springer, 2003.
- [11] Friedrich Eisenbrand and Gennady Shmonin. Parametric integer programming in fixed dimension. Math. Oper. Res., 33(4):839–850, 2008.
- [12] Fedor V. Fomin, Daniel Lokshtanov, Saket Saurabh, and Dimitrios M. Thilikos. Bidimensionality and kernels. In SODA, pages 503–510. SIAM, 2010.
- [13] Lance Fortnow and Rahul Santhanam. Infeasibility of instance compression and succinct PCPs for NP. J. Comput. Syst. Sci., 77(1):91–106, 2011.
- [14] M. R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.
- [15] Jiong Guo and Rolf Niedermeier. Invitation to data reduction and problem kernelization. SIGACT News, 38(1):31–45, 2007.
- [16] Danny Harnik and Moni Naor. On the compressibility of instances and cryptographic applications. SIAM J. Comput., 39(5):1667–1713, 2010.
- [17] Ravi Kannan. Improved algorithms for integer programming and related lattice problems. In STOC, pages 193–206. ACM, 1983.
- [18] Ravindran Kannan. A polynomial algorithm for the two-variable integer programming problem. J. ACM, 27(1):118–122, 1980.
- [19] Ravindran Kannan. Minkowski’s convex body theorem and integer programming. Mathematics of Operations Research, 12(3):415–440, 1987.
- [20] Stefan Kratsch and Magnus Wahlström. Representative sets and irrelevant vertices: New tools for kernelization. In FOCS, pages 450–459. IEEE Computer Society, 2012.
- [21] Hendrik W. Lenstra. Integer programming with a fixed number of variables. Mathematics of Operations Research, 8:538–548, 1983.
- [22] Nimrod Megiddo. Linear programming in linear time when the dimension is fixed. J. ACM, 31(1):114–127, 1984.
- [23] Karsten Weihe. Covering trains by stations or the power of data reduction. In Proceedings of ALENEX, pages 1–8, 1998.