Limits of Treewidth-based tractability in Optimization

Yuri Faenza (Industrial Engineering and Operations Research, Columbia University), Gonzalo Muñoz (IVADO Fellow, Canada Excellence Research Chair in Data Science for Real-Time Decision-Making), Sebastian Pokutta (Industrial and Systems Engineering, Georgia Institute of Technology)
Abstract

Sparse structures are frequently sought when pursuing tractability in optimization problems. They are exploited from both theoretical and computational perspectives to handle complex problems that become manageable when sparsity is present. An example of this type of structure is given by treewidth: a graph-theoretical parameter that measures how “tree-like” a graph is. This parameter has been used for decades to analyze the complexity of various optimization problems and to obtain tractable algorithms for problems where this parameter is bounded. In this work we study the limits of treewidth-based tractability in optimization by proving that, in a certain sense, the already known positive results based on low treewidth are best possible. More specifically, we show the existence of 0/1 sets that nearly meet the best treewidth-based upper bound on their extension complexity. Additionally, under mild assumptions, we prove that treewidth is the only graph-theoretical parameter that yields tractability in a wide class of optimization problems, a fact well known in Graphical Models in Machine Learning that we here extend to Optimization.

1 Introduction

Treewidth is a graph-theoretical parameter used to measure, roughly speaking, how far a graph is from being a tree. The treewidth concept was explicitly defined by Robertson and Seymour [47] (also see [48]), but there are many equivalent definitions. An earlier discussion is found in [37], and closely related concepts have been used by many authors under different names, e.g., the “running intersection” property and the notion of “partial k-trees”. Here we will use the following definition; recall that a chordal graph is a graph where every induced cycle has exactly three vertices.

Definition 1.1.

An undirected graph $G$ has treewidth at most $\omega$ if there exists a chordal graph $H$ with $V(H) = V(G)$, $E(G) \subseteq E(H)$, and clique number at most $\omega + 1$. We denote by $\operatorname{tw}(G)$ the treewidth of $G$.

Note that $H$ in the definition above is sometimes referred to as a chordal completion of $G$. It can be shown that a graph has treewidth 1 if and only if it is a forest. On the other extreme, a complete graph on $n$ vertices has treewidth $n-1$. An important fact is that an $n$-vertex graph with treewidth $\omega$ has $O(n\omega)$ edges, and thus low-treewidth graphs are sparse, although the converse is not true. This sparsity is accompanied by a compact decomposition of low-treewidth graphs that allows one to efficiently address various combinatorial problems.
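The two extreme cases above are easy to check computationally. The following minimal sketch (assuming the treewidth heuristics in networkx's approximation module, which return an upper bound together with a tree decomposition) illustrates that a path has treewidth 1, the complete graph $K_6$ has treewidth 5, and the $5\times 5$ grid has treewidth 5:

```python
# Sketch: upper bounds on treewidth via a greedy heuristic (networkx is an assumption here).
import networkx as nx
from networkx.algorithms.approximation import treewidth_min_degree

examples = {
    "path on 20 vertices (a tree)": nx.path_graph(20),   # treewidth 1
    "complete graph K_6": nx.complete_graph(6),          # treewidth 5
    "5x5 grid": nx.grid_2d_graph(5, 5),                  # treewidth 5
}
for name, G in examples.items():
    width, decomposition = treewidth_min_degree(G)       # (upper bound, tree decomposition)
    print(f"{name}: treewidth upper bound {width}")
```

The heuristic is exact on the first two examples; in general it only certifies an upper bound, since computing treewidth exactly is NP-hard.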

Bounded treewidth has long been recognized as a measure of complexity for all kinds of problems involving graphs, and there is an extensive literature on polynomial-time algorithms for combinatorial problems on graphs with bounded treewidth. One of the earliest references is [3]; see also [2, 4, 20, 10, 14, 11]. These approaches typically rely on dynamic programming over a tree decomposition and yield algorithms whose running time depends non-polynomially on the treewidth. A similar paradigm has been presented for inference problems in Graphical Models (see, e.g., [43]), where it is well known that an underlying graph with bounded treewidth yields tractable inference problems; see [46, 31, 25, 54, 21, 55] and references therein.
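As a concrete illustration of this dynamic programming paradigm in its simplest setting (treewidth 1), the following self-contained sketch computes a maximum-weight stable set on a tree in linear time; the tree and weights below are illustrative assumptions:

```python
# Sketch: DP for maximum-weight stable set on a tree (the treewidth-1 case of the paradigm).
def max_weight_stable_set_on_tree(adj, weights, root=0):
    """adj: node -> list of neighbors of an undirected tree; weights: node -> weight."""
    dp = {}  # dp[v] = (best value in v's subtree with v excluded, best value with v included)
    stack = [(root, None, False)]  # iterative post-order traversal
    while stack:
        v, parent, processed = stack.pop()
        if not processed:
            stack.append((v, parent, True))
            stack.extend((u, v, False) for u in adj[v] if u != parent)
        else:
            exclude, include = 0, weights[v]
            for u in adj[v]:
                if u != parent:
                    exclude += max(dp[u])
                    include += dp[u][0]  # if v is in the stable set, its children are not
            dp[v] = (exclude, include)
    return max(dp[root])

# Path 0 - 1 - 2 - 3 with weights 1, 5, 5, 1: the optimum is 6 (e.g., vertices 1 and 3).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
weights = {0: 1, 1: 5, 2: 5, 3: 1}
print(max_weight_stable_set_on_tree(adj, weights))  # 6
```

On graphs of treewidth $\omega$ the same idea runs over the bags of a tree decomposition, with a table of size $2^{O(\omega)}$ per bag instead of the two states per vertex used here.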

In a more general optimization context, treewidth-based sparsity has been studied using the concept of the intersection graph (also called primal constraint graph or Gaifman graph), which provides a representation of the variable interactions in a system of constraints. The intersection graph of a system of constraints was originally introduced in [32] and has been used by many authors, sometimes under different terminology.

Definition 1.2.

The intersection graph of a system of constraints is the undirected graph which has a vertex for each variable and an edge for each pair of variables that appear in a common constraint. If an optimization problem instance or its system of constraints is denoted $\mathcal{I}$, we call its intersection graph $\Gamma[\mathcal{I}]$.

As has been observed before (see [12, 13, 42, 41, 56, 54]), the combination of intersection graph and treewidth makes it possible to define a notion of structured sparsity in an optimization context. One example of a research stream that has made use of treewidth-based sparsity via intersection graphs is that of constraint satisfaction problems (CSPs): one can obtain efficient algorithms for CSPs whenever the intersection graph of the constraints exhibits low treewidth. Moreover, one can find compact linear extended formulations (i.e., linear formulations with a polynomial number of constraints) in such cases [39, 40]. In the Integer Programming context, extended formulations for binary problems whose constraints present a sparsity pattern with small treewidth have been developed as well; see [13, 54, 42]. A different use of treewidth in Integer Programming is given in [24]. An alternative perspective on structured sparsity in optimization problems, which does not rely on an intersection graph, is taken in [17].
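To make Definition 1.2 concrete, the following small sketch (the constraint data are hypothetical) builds the intersection graph of a constraint system from the variable supports of its constraints and then estimates its treewidth with a networkx heuristic:

```python
# Sketch: building the intersection graph of a constraint system (Definition 1.2).
import itertools
import networkx as nx
from networkx.algorithms.approximation import treewidth_min_degree

def intersection_graph(num_vars, supports):
    """supports: list of iterables, each giving the variable indices of one constraint."""
    G = nx.Graph()
    G.add_nodes_from(range(num_vars))
    for support in supports:
        # every pair of variables appearing in a common constraint is joined by an edge
        for u, v in itertools.combinations(sorted(set(support)), 2):
            G.add_edge(u, v)
    return G

# Chain-like system: constraint i couples variables i and i+1 (only the supports matter).
supports = [(i, i + 1) for i in range(7)]
G = intersection_graph(8, supports)
width, _ = treewidth_min_degree(G)
print(G.number_of_edges(), "edges; treewidth upper bound:", width)  # 7 edges; bound 1
```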

Contribution

In this article we focus on two questions related to the tractability induced by treewidth. While we provide precise statements of these questions in each corresponding section, roughly speaking these questions and our contributions can be summarized as follows:

  1. In general, whenever an optimization problem exhibits an intersection graph with bounded treewidth, it can be solved (or approximated, depending on the nature of the problem) in polynomial time (see [12, 40, 42]). As such it is natural to ask the following question:

    Is there any other graph-theoretical structure that yields tractability?

    We provide a negative answer to this question, in the sense that a family of graphs with unbounded treewidth can yield intractable optimization problems, regardless of any other structure present in the family. This is based on a result by Chandrasekaran et al. [21] in the context of Graphical Models, where it is proven that a family of graphs with unbounded treewidth can yield intractable inference problems. Here we follow the same strategy and prove that the result holds in a different optimization setting as well, bridging results from Machine Learning and Polynomial Optimization.

  2. For sets in $\{0,1\}^n$ defined using a set of constraints whose intersection graph has treewidth $\omega$, it is known that there exists a linear programming reformulation of their convex hull of size $O(2^{\omega} n)$. This yields the following question:

    For any given treewidth $\omega$, is there any 0/1 set that (nearly) meets this bound?

    We provide a positive answer to this question. Furthermore, we prove that this bound is tight even if we allow semidefinite programming formulations. This establishes that there is little to be gained from semidefinite programs over linear programs in general when exploiting low treewidth. Our analysis is based on the result of Briët et al. [18], where the existence of 0/1 sets with exponential semidefinite extension complexity is proved. We also prove a similar result for the stable set polytopes, making use of the treewidth of the underlying graph directly instead of relying on a particular formulation, and discuss related results.

    It is worth mentioning that the extension complexity upper bound is obtained by enumerating locally feasible vectors along with a gluing argument. Moreover, the upper bound is oblivious to any structure present in the constraints other than their sparsity pattern. Our result shows that, surprisingly, one cannot do much better than this seemingly straightforward approach, even if semidefinite formulations are allowed.

    Typically, one finds treewidth-based upper bounds on the extension complexity of certain polytopes [40, 39, 17], or extension complexity lower bounds on specific families of problems [15, 17, 30, 7] parameterized by the problem size. To the best of our knowledge, much less attention has been devoted to extension complexity lower bounds parameterized by other features of the problem. As a matter of fact, we are only aware of two other articles in this domain: the work of Gajarskỳ et al. [33], where the authors analyze the extension complexity of the stable set polytope based on the expansion of the underlying graph, and the work of Aboulker et al. [1] which, independently of this work, provided extension complexity lower bounds for the correlation polytope parameterized by the treewidth of the underlying graph. Our work contributes to this line of research, showing the existence of polytopes whose extension complexity lower bound depends on the treewidth parameter and nearly meets the aforementioned upper bound. We discuss the main differences between our approach and that of Aboulker et al. below.

We believe that addressing these two questions provides valuable insights into the limitations of exploiting treewidth and provides strong lower bounds that allow for assessing the performance of current approaches. In fact, our results show that the existing approaches are, in some sense, the best possible and that further improvement is only possible if further structure is considered.

We emphasize that these two questions are very different in nature. The extension complexity is a concept that does not necessarily depend on whether a problem is easy or hard from an algorithmic perspective, nor on the assumption that $P \neq NP$. For example, there are instances of the matching polytope with exponential extension complexity [51], whereas finding a maximum weight matching can be done in polynomial time for any graph. An example in the other direction is given by the stable set problem: for each $\epsilon > 0$, an $n^{1-\epsilon}$-approximate solution cannot be attained in polynomial time [38, 59] unless $P = NP$, but there exists a formulation of polynomial size of the stable set polytope with the property that, for each objective function, its optimal solution is within a guaranteed factor of the maximum weight stable set ([8, 9], building on results from [28]).

Remark 1.3.

It came to our attention that, independently of this work, Aboulker et al. [1] recently proved that for any minor-closed family of graphs there exists a constant $c > 0$ such that the correlation polytope of each $n$-vertex graph in the family has linear extension complexity at least

(1)

where $\operatorname{tw}$ denotes the treewidth of the graph. While this result is in the same spirit as one of our results (the lower bound in the extension complexity case), we highlight a few key differences:

  1. The results in [1] study the important question of the linear extension complexity of the correlation polytope for various graphs, providing (almost) optimal bounds, while we allow ourselves more freedom in the choice of the polytope family.

  2. The constant in (1) is at most , and the correlation polytope of a graph with treewidth has ambient dimension —the number of edges of the graph. If additionally , the lower bound in (1) satisfies

    The polytopes we construct here have a lower bound with a leading term as compared to . This is due to the fact that we rely on the stronger existential counting arguments in [50, 18, 19] along with a polytope composition procedure.

  3. The technique we employ is drastically different: rather than reducing to a face of the correlation polytope, we provide a general method to construct high-extension-complexity polytopes from any seed polytope (under appropriate assumptions).

  4. Our results apply to both the semidefinite and the linear case. Moreover, we specialize our construction to Stable Set polytopes where the gluing operation that we use has a natural representation in terms of graph-theoretic operations.

Outline

The rest of the article is organized as follows. In Section 2 we provide the basic notation used in this article. The main contributions are divided into two sections. In Section 3 we provide the answer to the first question above, i.e., we prove that unbounded treewidth can yield intractable optimization problems, and in Section 4 we provide the answer to the second question, i.e., we show the existence of sparse problems with high extension complexity. Both sections are organized similarly: we begin by providing the necessary background, along with the known positive treewidth-exploiting results, and then move to the respective proofs. Section 5 provides additional results complementing Section 4.

2 Notation

We mostly follow standard linear algebra and graph theory notation. For $n \in \mathbb{N}$, we use $[n]$ to denote the set of integers $\{1, \ldots, n\}$. Further, we denote by $\mathbb{R}^n$ the $n$-dimensional vector space over the reals and by $\mathbb{Z}^n$ the $n$-dimensional free module over the integers. If we restrict vectors to have non-negative entries, we use $\mathbb{R}^n_+$ and $\mathbb{Z}^n_+$. We call $e_i$ with $i \in [n]$ the canonical vectors in $\mathbb{R}^n$, i.e., $(e_i)_j = 1$ if and only if $i = j$, and $(e_i)_j = 0$ otherwise. The space of $n \times n$ symmetric positive semidefinite matrices is denoted by $\mathbb{S}^n_+$. The standard inner product between two vectors $x, y$ is denoted by $\langle x, y \rangle$. Given two matrices $A, B$ (of compatible dimension), the Frobenius inner product is denoted by $\langle A, B \rangle$. Given two sets $A, B$, we denote their Cartesian product by $A \times B$. The convex hull of a set $S$ is denoted by $\operatorname{conv}(S)$, and its affine hull by $\operatorname{aff}(S)$. For a graph $G$, we use $V(G)$ and $E(G)$ to denote its vertices and edges, respectively. For $v \in V(G)$, we use $N(v)$ to denote the set of neighbors of $v$ in $G$, that is, $N(v) = \{u \in V(G) : \{u, v\} \in E(G)\}$. Given two graphs $H$ and $G$, we say that $H$ is a subgraph of $G$ if $V(H) \subseteq V(G)$ and $E(H) \subseteq E(G)$, and that $H$ is a minor of $G$ if $H$ can be obtained from $G$ using vertex deletions, edge deletions, and edge contractions. Lastly, for a polynomial $p$, we denote by $\|p\|_1$ the sum of the absolute values of its coefficients, i.e., if $p(x) = \sum_{\alpha} c_\alpha x^\alpha$ with $c_\alpha$ rational and $x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$, then $\|p\|_1 = \sum_{\alpha} |c_\alpha|$.

The degree of $p$ is defined as $\deg(p) = \max\{\alpha_1 + \cdots + \alpha_n : c_\alpha \neq 0\}$.
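A tiny sketch (with an illustrative polynomial, stored as a dictionary mapping exponent tuples to coefficients) of the two quantities just defined:

```python
# Sketch: ||p||_1 and deg(p) for p(x) = 3*x1^2*x2 - 0.5*x2 + 2 (illustrative data).
p = {(2, 1): 3.0, (0, 1): -0.5, (0, 0): 2.0}

one_norm = sum(abs(c) for c in p.values())                    # ||p||_1 = 5.5
degree = max(sum(alpha) for alpha, c in p.items() if c != 0)  # deg(p) = 3
print(one_norm, degree)
```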

3 Unbounded treewidth can yield intractability

Our first goal is to study the question of whether low treewidth is the only graph-theoretical structure that yields tractability in optimization. Here we work with the general Polynomial Optimization framework, i.e., we consider problems of the form:

$\min\; c^{\top} x$  (2a)
s.t. $f_i(x) \ge 0, \quad i \in [m],$  (2b)
$x_j \in \{0,1\}, \quad j \in I,$  (2c)
$x_j \in [0,1], \quad j \in [n] \setminus I,$  (2d)

where each $f_i$ is a polynomial of degree at most $d$. When $d = 2$ we also use the term QCQP to refer to PO.

Remark 3.1.

Any problem with polynomial objective and constraints, and defined over a compact set, can be cast as a PO. This can be done by appropriately rescaling variables and by using an epigraph formulation to move the non-linear terms of the objective to the constraints.
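As a small worked illustration of Remark 3.1 (the instance is illustrative, and we assume the rescaling has been carried out so that the objective takes values in $[0,1]$ over the feasible box), the epigraph trick reads:

```latex
% Epigraph reformulation: move a polynomial objective f_0 into the constraints,
% assuming f_0 has been rescaled so that f_0(x) \in [0,1] for all x in the box.
\min_{x \in [0,1]^n} f_0(x)
\qquad\Longleftrightarrow\qquad
\min_{x \in [0,1]^n,\ t \in [0,1]} \; t
\quad \text{s.t.} \quad t - f_0(x) \ge 0 .
```

The resulting problem has a linear objective and one additional polynomial constraint, so it fits the PO template above.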

As mentioned above, it is known that tractability of an instance $\mathcal{I}$ of PO is implied by an intersection graph of low treewidth. In the pure binary case, an exact optimal solution of $\mathcal{I}$ can be computed in polynomial time whenever $\Gamma[\mathcal{I}]$ has bounded treewidth (see [12, 40, 42]). However, if continuous variables are present, exact solutions might not be computable in finite time, as shown by the following simple example:

$\min\; x_1 \quad \text{s.t.} \quad 2x_1^2 - 1 \ge 0,\; x_1 \in [0,1]$

has an irrational optimal solution ($x_1 = 1/\sqrt{2}$). As such, approximation is unavoidable from a computational perspective, and therefore we make use of the following definition:

Definition 3.2.

Given an instance $\mathcal{I}$ of PO, we say $\hat{x}$ is $\epsilon$-feasible if $\max_{i \in [m]} e_i(\hat{x}) \le \epsilon$, where

$e_i(\hat{x}) = \max\{0,\, -f_i(\hat{x})\} / \|f_i\|_1.$  (3)

Given an instance $\mathcal{I}$ of PO, an LP formulation that takes advantage of low treewidth of $\Gamma[\mathcal{I}]$ was proposed by Bienstock and Muñoz [12] in order to approximate $\mathcal{I}$. More specifically:

Theorem 3.3 (Bienstock and Muñoz [12]).

Consider a feasible instance $\mathcal{I}$ of PO and $\epsilon > 0$. Assume each $f_i$ has degree at most $d$. If $\Gamma[\mathcal{I}]$ has treewidth at most $\omega$, then there is an LP formulation, whose numbers of variables and constraints are polynomial in the size of $\mathcal{I}$ for fixed $\epsilon$, $d$, and $\omega$ (see [12] for the precise dependence), such that

  1. all feasible solutions to the LP are $\epsilon$-feasible for $\mathcal{I}$;

  2. every optimal LP solution $\hat{x}$ satisfies

    (4)

    where $x^*$ is an optimal solution to $\mathcal{I}$ and $\hat{x}_C$ is the sub-vector of $\hat{x}$ corresponding to the continuous variables.

Moreover, given a so-called tree-decomposition of width $\omega$ (which exists whenever the treewidth is at most $\omega$), the LP can be constructed in time polynomial in its size [12].

Here we phrased the theorem in a slightly different way compared to [12]:

(a) we assume that $\mathcal{I}$ is feasible, and (b) the result in [12] only considers continuous variables, whereas we allow for binary variables as well. This can be done while ensuring that the error term in (4) only involves coefficients associated with continuous variables; see [45] for details.

We would like to stress that the approximation provided by Theorem 3.3 is different from the traditional notion of approximation used in approximation algorithms: we allow for $\epsilon$-feasibility, i.e., we allow (slightly) infeasible solutions, which is usually not the case in approximation algorithms.

For $d = 2$, i.e., for QCQPs, we obtain the following immediate corollary of Theorem 3.3.

Corollary 3.4.

There is an algorithm $\mathcal{A}$ such that, for all $\epsilon > 0$ and for any feasible instance $\mathcal{I}$ of QCQP such that $\Gamma[\mathcal{I}]$ has bounded treewidth, $\mathcal{A}$ can compute an $\epsilon$-feasible solution satisfying (4) in time $p(|\mathcal{I}|, 1/\epsilon)$, where $p$ is some polynomial.

We establish an (almost) matching lower bound for Theorem 3.3 by providing an (almost) matching lower bound for Corollary 3.4. For this we use the strategy of Chandrasekaran et al. [21] adapted to the general optimization setting. More precisely, we prove the following; we discuss the complexity theoretic assumption in Section 3.2:

Main Theorem 3.5.

Let $\{G_\omega\}_{\omega \in \mathbb{N}}$ be an arbitrary family of graphs indexed by treewidth, i.e., $G_\omega$ has treewidth $\omega$. Let $\mathcal{A}$ be an algorithm such that for all $\epsilon > 0$ and for all instances $\mathcal{I}$ of QCQP with $\Gamma[\mathcal{I}] = G_\omega$, algorithm $\mathcal{A}$ computes an $\epsilon$-feasible solution satisfying (4). If $T(\mathcal{I}, \epsilon)$ is the running time of $\mathcal{A}$ on instance $\mathcal{I}$, then the assumption $NP \not\subseteq BPP$ implies that $T$ grows super-polynomially in the treewidth $\omega$.

3.1 Intractability in the 0/1 case

In the context of inference in graphical models the following problem was studied by Chandrasekaran et al. [21]: given a graph $G$, a collection of (binary) random variables $X_1, \ldots, X_n$ with $n = |V(G)|$, and for each clique $C$ of $G$ a potential function $\psi_C$ which only involves the variables $X_v$ with $v \in C$, the inference problem involves computing the partition function defined as

$Z = \sum_{x \in \{0,1\}^n} \prod_{C \in \mathcal{C}(G)} \psi_C(x),$

where $\mathcal{C}(G)$ is the set of all cliques in $G$. It is known that if the underlying graph has bounded treewidth, then the inference problem can be solved in polynomial time (see [55]). One of the main results in [21] provides a converse to this statement: given any family of graphs indexed by treewidth, and under the complexity assumptions of Theorem 3.5, there exist instances defined over that family of graphs such that inference requires time super-polynomial in the treewidth.

The proof can be directly adapted to state the same result regarding computing an optimal solution of a 0/1 PO problem. Hence, the result by Chandrasekaran et al. [21] can be viewed as the 0/1 version of Theorem 3.5, which does not involve $\epsilon$-feasible solutions, as in that context exact solutions can be computed.

Remark 3.6.

The original proof in [21] makes use of the hypothesis $NP \not\subseteq P/\mathrm{poly}$ and the so-called Grid-minor hypothesis. Since then, the latter was shown to be true by Chekuri and Chuzhoy [22], along with an algorithmic result allowing the use of the assumption $NP \not\subseteq BPP$ instead of $NP \not\subseteq P/\mathrm{poly}$.

Here we extend these results to include continuous variables and show that even approximately solving the problem remains intractable. The proof is along the lines of [21] and we follow their overall strategy. Our contribution here is to replace reductions between distributions and potential functions with reductions involving Polynomial Optimization problems and their approximations, as well as to make use of randomized algorithms instead of non-uniform algorithms. To avoid confusion, we would like to stress that the notion of Approximate Inference presented in [21] is a different concept from that of finding an $\epsilon$-feasible solution to a PO problem.

3.2 Complexity-theoretic Assumptions and Graph-theoretic Tools

For a precise definition of $BPP$ and the commonly believed hypothesis $NP \not\subseteq BPP$, we refer the reader to [5]. Simplifying here, $BPP$ is the class of languages $L$ for which there exists a polynomial-time probabilistic Turing machine which, given an input $x$, provides a wrong answer to the decision of whether $x \in L$ with probability at most 1/3. In our context, it is sufficient to know that this complexity-theoretic assumption implies that MAX-2SAT in planar graphs (an NP-hard problem; see [36]) does not belong to $BPP$.

The second important tool we will use stems from work on the famous graph minor theorem. We briefly recall relevant results here, phrased to match the language in [21].

Theorem 3.7 (Robertson et al. [49]).

There exist universal constants $c_1$ and $c_2$ such that the following holds. Let $H_g$ be the $g \times g$ grid. Then, (a) $H_g$ is a minor of all planar graphs with treewidth greater than $c_1\, g$. Further, (b) all planar graphs of size (number of vertices) less than $c_2\, g$ are minors of $H_g$.

The next theorem relaxes the planarity assumption however only for one of the directions of Theorem 3.7.

Theorem 3.8 (Robertson et al. [49]).

Let $H_g$ be the $g \times g$ grid. There exists a finite $f(g)$ such that $H_g$ is a minor of all graphs with treewidth greater than $f(g)$. Further, $f(g) \le c_3^{\,g^{c_4}}$, where $c_3$ and $c_4$ are universal constants (i.e., they are independent of $g$).

The last theorem provides bounds on how large the treewidth of a graph must be in order for it to have the $g \times g$ grid as a minor. The function $f(g)$ was conjectured to be polynomial in $g$, and this conjecture was used as an assumption (under the name Grid-minor hypothesis) in [21]. Since then, a recent breakthrough by Chekuri and Chuzhoy [22] resolved the conjecture in the positive.

Theorem 3.9 (Chekuri and Chuzhoy [22]).

The function $f(g)$ in Theorem 3.8 can be taken to be polynomial in $g$. Moreover, there is a polynomial-time randomized algorithm that, given a graph $G$ with treewidth at least $f(g)$, with high probability (i.e., with probability at least $1 - |V(G)|^{-c}$ for some constant $c > 0$) outputs the sequence of grid-minor operations transforming $G$ into the $g \times g$ grid.

Remark 3.10.

In [22], the output of the randomized algorithm is a model of the minor. Such a model can be directly turned into a sequence of minor operations.

Remark 3.11.

There has been some considerable progress recently regarding the exponent of the polynomial dependency in Theorem 3.9. We refer the reader to [23] for these improvements. Nonetheless, these newer results are non-algorithmic, which is undesirable for our purposes.

Theorem 3.9, together with Theorem 3.7 and Theorem 3.8, yields the following corollary:

Corollary 3.12.

Let $P$ be a planar graph on $p$ nodes. There exists a polynomial $q(\cdot)$ such that $P$ is a minor of all graphs of treewidth at least $q(p)$.

The above in particular implies that $P$ is a minor of $G_\omega$ for all $\omega \ge q(p)$, where $\{G_\omega\}$ is the family of graphs in Theorem 3.5.

3.3 Proof of Theorem 3.5

The outline of the proof of Main Theorem 3.5 is as follows. We start from an NP-hard instance $\mathcal{I}$ of QCQP whose intersection graph is planar. Recall that we assume we are given an arbitrary family of graphs $\{G_\omega\}$ indexed by treewidth. Due to Corollary 3.12, $\Gamma[\mathcal{I}]$ is a minor of $G_\omega$ for $\omega$ large enough. We then construct an instance $\mathcal{I}'$ of QCQP equivalent to $\mathcal{I}$ whose intersection graph is exactly $G_\omega$. This makes it possible to use algorithm $\mathcal{A}$ on $\mathcal{I}'$, which yields the conclusion. The key ingredient is the following: having a family with unbounded treewidth allows us to embed the graph defining the NP-hard problem into a graph of the given family, even if this family is otherwise arbitrary.

3.3.1 Formulating MAX-2SAT as a special PO problem

Consider the NP-hard problem of planar MAX-2SAT with underlying planar graph $G$: boolean variables correspond to the vertices of $G$ and (two-literal) clauses to its edges. Denote by $C_1, \ldots, C_m$ the clauses, corresponding to the edges $e_1, \ldots, e_m$ of $G$, and let the boolean variables be $y_1, \ldots, y_p$ with $p = |V(G)|$. The goal is to find an assignment maximizing the number of satisfied clauses.

We can formulate MAX-2SAT directly as a QCQP:

(5a)
s.t. (5b)
(5c)
(5d)
(5e)

where

thus a clause variable taking value 1 implies that the corresponding clause is satisfied. Let $\mathcal{I}_1$ denote the resulting instance. Note that, using this formulation, the graph $G$ is a subgraph of the intersection graph $\Gamma[\mathcal{I}_1]$. It is also not hard to see that $\Gamma[\mathcal{I}_1]$ is planar, as we only need to add one vertex per clause, each connected to the endpoints of one particular edge of $G$. We would like to emphasize that the constraints on the clause variables are equivalent to simply requiring them to be binary, so we could formulate MAX-2SAT as a pure binary problem; however, since we are aiming for statements about the complexity of approximating PO problems, we deliberately chose a formulation using variables that can be continuous in nature; this will become clear soon.
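Since the displayed formulation (5) did not survive extraction, the following sketch shows one standard way to write MAX-2SAT in the form described above (one clause variable per edge, with constraints only between a clause variable and the endpoints of its edge); the exact constraints used in (5) may differ:

```latex
% Hypothetical sketch in the spirit of (5). For the clause on edge e = {u,v}, let
% \ell_u denote y_u if the literal of u is positive and 1 - y_u otherwise (same for v).
\begin{aligned}
\max \quad & \textstyle\sum_{e \in E(G)} z_e \\
\text{s.t.} \quad & z_e \le \ell_u + \ell_v \qquad \text{for each clause/edge } e = \{u, v\} \in E(G), \\
& z_e \in [0, 1] \;\; \text{for } e \in E(G), \qquad y_v \in \{0, 1\} \;\; \text{for } v \in V(G).
\end{aligned}
```

At an optimal solution $z_e = \min\{1, \ell_u + \ell_v\}$, so the objective counts exactly the satisfied clauses, and each $z_e$ interacts only with the two endpoints of its edge.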

The above formulation of MAX-2SAT is straightforward; however, for technical reasons we use the following equivalent alternative. The advantage of this formulation is that all constraints involve only one or two variables, simplifying the later analysis.

(6a)
s.t. (6b)
(6c)
(6d)
(6e)
(6f)
(6g)

where

and is similarly defined.

3.3.2 Graph Minor Operations

Let $\mathcal{I}$ be an instance of MAX-2SAT, formulated as above, with planar intersection graph $\Gamma[\mathcal{I}]$. Given a target graph $G$ which has $\Gamma[\mathcal{I}]$ as a minor, in this section we show how to construct a QCQP instance $\mathcal{I}'$ equivalent to $\mathcal{I}$. The complexity of this reduction is polynomial in the number of minor operations (vertex deletion, edge deletion, edge contraction), assuming that we know in advance which these operations are. We first show this for the case in which $G$ can be transformed into $\Gamma[\mathcal{I}]$ using a single minor operation, and then argue that this is without loss of generality by repeating the argument. We distinguish the following cases:

  1. Vertex Deletion. If the minor operation is a vertex deletion of a vertex $v$, we define $\mathcal{I}'$ as $\mathcal{I}$ plus a new variable $x_v$ with objective coefficient 0. Additionally, for all neighbors $u$ of $v$ in $G$ we add a redundant constraint involving only $x_v$ and $x_u$.

  2. Edge Deletion. If the minor operation is an edge deletion of an edge $\{u, v\}$, we define $\mathcal{I}'$ as $\mathcal{I}$ plus a redundant constraint involving only $x_u$ and $x_v$.

  3. Edge contraction. If the minor operation is an edge contraction of $\{u, v\}$ to form a vertex $w$, then we proceed as follows.

    Let $N$ be the set of neighbors of $w$ in $\Gamma[\mathcal{I}]$. Note that in $\mathcal{I}$ all constraints involve at most two variables; hence there is a one-to-one correspondence between edges of $\Gamma[\mathcal{I}]$ and constraints involving two variables in $\mathcal{I}$, and all these constraints are linear. Such constraints have the form $a_t x_w + b_t x_t \ge c_t$ for $t \in N$,

    where the variables $x_t$ can be either literal variables or clause variables of MAX-2SAT, depending on the node $t$. Using this, we define $\mathcal{I}'$ from $\mathcal{I}$ by removing the variable $x_w$, adding variables $x_u$ and $x_v$, and adding the following constraints

    If the objective coefficient of $x_w$ was 1, then we ensure $x_u$ to have objective coefficient 1 and $x_v$ to have objective coefficient 0. If $x_w$ was a continuous variable we add the constraints $x_u \in [0,1]$ and $x_v \in [0,1]$, and if $x_w$ was a binary variable we enforce $x_u$ and $x_v$ to be binary as well.

Clearly, in any case we obtain that $\Gamma[\mathcal{I}'] = G$ and that $\mathcal{I}'$ is equivalent to $\mathcal{I}$. Note that constraints in $\mathcal{I}'$ involve at most two variables and the ones with exactly two variables are linear. This invariant makes it possible to iterate the procedure using any sequence of minor operations; a small sketch of how these operations act on the graph side is given below.
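The following sketch (graphs and operation sequence are illustrative) shows how the three minor operations act on the graph side using networkx; the instance-level bookkeeping is exactly the one described in the three cases above:

```python
# Sketch: applying a sequence of minor operations with networkx (illustrative data).
import networkx as nx

def apply_minor_operations(G, operations):
    H = G.copy()
    for op, arg in operations:
        if op == "delete_vertex":
            H.remove_node(arg)
        elif op == "delete_edge":
            H.remove_edge(*arg)
        elif op == "contract_edge":
            H = nx.contracted_edge(H, arg, self_loops=False)  # merge the two endpoints
        else:
            raise ValueError(f"unknown operation {op}")
    return H

# Example: delete one row and one column of the 3x3 grid, then contract one edge.
G = nx.grid_2d_graph(3, 3)
ops = [("delete_vertex", (2, 0)), ("delete_vertex", (2, 1)), ("delete_vertex", (2, 2)),
       ("delete_vertex", (0, 2)), ("delete_vertex", (1, 2)),
       ("contract_edge", ((0, 0), (0, 1)))]
H = apply_minor_operations(G, ops)
print(H.number_of_nodes(), H.number_of_edges())  # 3 3: a triangle is a minor of the 3x3 grid
```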

Let $\omega$ be large enough so that Corollary 3.12 implies that $\Gamma[\mathcal{I}]$ is a minor of $G_\omega$; thus, assuming the sequence of minor operations is known, we can use the procedure above to construct an instance $\mathcal{I}'$ which is equivalent to $\mathcal{I}$ and whose intersection graph is exactly $G_\omega$. It is not hard to see that $\mathcal{I}'$ has the following form (after relabeling variables):

(7a)
s.t. (7b)
(7c)
(7d)
(7e)
(7f)

for some appropriately defined , and where , .

Remark 3.13.

Each constraint (7b) is either a redundant constraint (introduced with the vertex or edge deletion operation) or it involves at least one integer variable. This will be important in the next section.

3.3.3 From approximations to exact solutions

We will now show how to construct a (truly) feasible solution from an $\epsilon$-feasible solution to $\mathcal{I}'$. This links the hardness of approximately solving $\mathcal{I}'$ to the hardness of solving it exactly.

Lemma 3.14.

Let $\hat{x}$ be an $\epsilon$-feasible solution to $\mathcal{I}'$ satisfying (4), for $\epsilon$ sufficiently small. Then, from $\hat{x}$, we can construct $x^*$ such that $x^*$ is feasible and optimal for $\mathcal{I}'$.

Proof.

Since is an -feasible solution, we have

where the arises as the -norm of the coefficients. Thus either or : is decreasing in , increasing in , , and

as . Thus . From here we conclude if . The case for is symmetric.

Now from $\hat{x}$ we construct $x^*$ by rounding each component to the nearest integer, and we argue the feasibility and optimality of $x^*$.

  1. Constraints (7d) are clearly satisfied as $x^*$ is a binary vector.

  2. For constraints (7c), being -feasible implies for all and thus, using the above and that are binary, we have

    The left-hand side is an integer and , from where we conclude .

  3. For constraints (7b) fix such that the corresponding constraint is not redundant. By Remark 3.13 either or is integer. Without loss of generality assume , and thus . To make the argument clear, we rewrite the inequality as

    The left hand side is an integer, therefore rounding will keep the inequality valid where we use that :

This proves that $x^*$ is feasible. On the other hand, $\hat{x}$ satisfies (4), and only integer variables have non-zero objective coefficient:

therefore $x^*$ is optimal. ∎

3.3.4 Bringing it all together

Proof of Main Theorem 3.5.

Suppose we are given a sequence of graphs $\{G_\omega\}_{\omega \in \mathbb{N}}$, each $G_\omega$ having treewidth $\omega$. We show that, under the conditions of Theorem 3.5, the existence of an algorithm $\mathcal{A}$ as in Theorem 3.5 (i.e., one that can approximately solve QCQP problems with intersection graph $G_\omega$), with running time polynomial in $\omega$, implies that planar MAX-2SAT belongs to $BPP$, contradicting the assumption $NP \not\subseteq BPP$.

  1. Consider an instance of planar MAX-2SAT. We construct an instance $\mathcal{I}$ of a QCQP as in Section 3.3.1, whose intersection graph $\Gamma[\mathcal{I}]$ is planar. We denote by $p$ its number of vertices.

  2. From Corollary 3.12 we know that $\Gamma[\mathcal{I}]$ is a minor of $G_\omega$ for all $\omega \ge q(p)$. Moreover, $q(p)$ is polynomial in $p$ and, from the discussion in Section 3.3.2, $\mathcal{I}$ is equivalent to a QCQP problem $\mathcal{I}'$ with $\Gamma[\mathcal{I}'] = G_\omega$.

  3. The minor operations transforming $G_\omega$ into $\Gamma[\mathcal{I}]$, which are needed to construct $\mathcal{I}'$, can be obtained as follows:

    1. Since $\Gamma[\mathcal{I}]$ is planar, it is a minor of an $O(p) \times O(p)$ grid. This sequence of minor operations can be found in linear time using the results in [52].

    2. The grid is a minor of $G_\omega$. We can find the corresponding sequence of minor operations (with high probability) in polynomial time using the algorithm by Chekuri and Chuzhoy [22] mentioned in Theorem 3.9.

  4. Using the point above, we can construct (with high probability) the instance $\mathcal{I}'$.

  5. Using $\mathcal{A}$ and a fixed, sufficiently small $\epsilon$, find an $\epsilon$-feasible solution satisfying (4) for $\mathcal{I}'$ in time $T(\mathcal{I}', \epsilon)$.

  6. Given an $\epsilon$-feasible solution of $\mathcal{I}'$, we construct an optimal solution for $\mathcal{I}'$ as in Section 3.3.3.

  7. From the optimal solution to $\mathcal{I}'$, we can find an optimal solution to $\mathcal{I}$ in polynomial time, using the minor operations described in Section 3.3.2.

Using the optimal solution, we can solve the decision problem associated with planar MAX-2SAT directly. The only place where our algorithm can make a mistake is in the sequence of minor operations, which happens with low probability. If $T$ is polynomial in $\omega$, we obtain that planar MAX-2SAT is in $BPP$, a contradiction. ∎

4 Treewidth-based Extension Complexity Lower Bounds

In this section we analyze the tightness of the linear extension complexity results that exploit treewidth. While we provide precise definitions in Section 4.1, the linear extension complexity of a problem is the smallest number of inequalities needed to represent the given problem as a linear program. In fact, our lower bounds will also hold for semidefinite programs, showing that there is little to be gained from semidefinite programs over linear programs in terms of exploiting low treewidth.

To this end, we consider a set $S \subseteq \{0,1\}^n$ defined as

$S = \{\, x \in \{0,1\}^n : g_j(x) = 1,\ j \in [m] \,\},$  (8)

where each $g_j : \{0,1\}^n \to \{0,1\}$ is a boolean function. Note that the intersection graph does not only depend on the set $S$, but also on how it is formulated; we denote the intersection graph of formulation (8) by $\Gamma$.

Remark 4.1.

Given the generality of the functions defining the constraints in (8), one could formulate $S$ using a single membership oracle. However, such a formulation would consist of a single constraint involving all variables, which would yield a very dense formulation of $S$, so that we could not exploit low treewidth.

Any pure binary PO can be formulated as (8). We have already seen in the previous section that unbounded treewidth of the intersection graph can yield intractability in the algorithmic sense. In this section, in contrast, we focus on studying how hard a sparse problem can be, using extension complexity as the measure of complexity.

4.1 Background on Extended Formulations

We will now briefly recall basic concepts from extended formulations needed for our discussion. Extended formulations aim at representing an optimization problem in an extended space, where auxiliary variables are used with the goal of obtaining an overall smaller formulation compared to formulations in the original space involving only the problem-inherent variables. Note that optimizing a linear objective over an extended formulation is no harder than over the original formulation, which makes extended formulations appealing. For a more detailed discussion we refer the reader to [29, 27].

Definition 4.2 (Linear Extended Formulation).

Given a polytope $P \subseteq \mathbb{R}^n$, a linear extended formulation of $P$ is a linear system

$Ax + By \le b, \qquad x \in \mathbb{R}^n,\; y \in \mathbb{R}^d,$  (9)

with the property that $x \in P$ if and only if there exists $y$ such that $(x, y)$ satisfies (9). The size of the linear extension is given by the number of inequalities in (9), and the linear extension complexity of $P$ is the minimum size of a linear extended formulation of $P$, which we denote by $\operatorname{xc}(P)$.
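A standard textbook example (not from the text above) illustrating why extended formulations can be dramatically smaller: the cross-polytope, i.e., the unit $\ell_1$-ball, needs $2^n$ inequalities in the original space but only $2n + 1$ in an extended space:

```latex
% The cross-polytope P = { x : \sum_i |x_i| \le 1 } has the 2^n facet inequalities
% \sum_i \varepsilon_i x_i \le 1, \varepsilon \in \{-1,1\}^n, yet
P \;=\; \Bigl\{\, x \in \mathbb{R}^n \;:\; \exists\, y \in \mathbb{R}^n,\;\;
        -y_i \le x_i \le y_i \;\ (i \in [n]), \;\; \textstyle\sum_{i=1}^n y_i \le 1 \,\Bigr\},
% so xc(P) \le 2n + 1.
```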

Remark 4.3.

In the previous definition, system (9) can be made more general. We can also consider

and define the size the same way as before. However, this more general definition does not affect the extension complexity of a polytope; see e.g., [58].

In Yannakakis’ ground-breaking paper [58], it is proved that the linear extension complexity of a polytope is strongly related to the concepts of slack matrix and non-negative rank:

Definition 4.4 (Slack Matrix).

Let $P$ be a polytope that can be formulated as

$P = \{\, x \in \mathbb{R}^n : a_i^{\top} x \le b_i,\ i \in [m] \,\}.$

Consider a set of points $X = \{x^1, \ldots, x^N\}$ such that $P = \operatorname{conv}(X)$. Then, the slack matrix of $P$ associated to this description and $X$ is given by $S_{ij} = b_i - a_i^{\top} x^j$ for $i \in [m]$, $j \in [N]$.

Definition 4.5 (Non-negative Factorization).

Given a non-negative matrix $S \in \mathbb{R}^{m \times N}_+$, a rank-$r$ non-negative factorization of $S$ is given by two non-negative matrices $U \in \mathbb{R}^{m \times r}_+$ (of $r$ columns) and $V \in \mathbb{R}^{r \times N}_+$ (of $r$ rows) such that $S = UV$.

The non-negative rank of $S$, denoted $\operatorname{rank}_+(S)$, is the minimum $r$ for which a rank-$r$ non-negative factorization of $S$ exists.

Theorem 4.6 (Yannakakis [58]).

Let $P$ be a polytope with $P = \{x : a_i^{\top} x \le b_i,\ i \in [m]\} = \operatorname{conv}(X)$ and let $S$ be the slack matrix of $P$ associated to this description and $X$. Then $\operatorname{xc}(P) = \operatorname{rank}_+(S)$.
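A tiny numerical sketch of Definition 4.4 and Theorem 4.6 on a toy polytope of our choosing (the unit square): the slack matrix is non-negative, and the trivial factorization $S = I \cdot S$ certifies $\operatorname{rank}_+(S) \le 4$, consistent with the square being described by 4 inequalities.

```python
# Sketch: slack matrix of the unit square [0,1]^2 and a trivial non-negative factorization.
import numpy as np

# Facet description a_i^T x <= b_i:  -x1 <= 0, -x2 <= 0, x1 <= 1, x2 <= 1.
A = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 0.0], [0.0, 1.0]])
b = np.array([0.0, 0.0, 1.0, 1.0])
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)  # the vertices

S = b[:, None] - A @ X.T  # S[i, j] = b_i - a_i^T x^j, entrywise non-negative

U, V = np.eye(4), S       # trivial rank-4 non-negative factorization S = U V
assert np.allclose(U @ V, S) and (U >= 0).all() and (V >= 0).all()
print(S)
```

Lower-bounding $\operatorname{rank}_+$ (and hence the extension complexity) is the hard direction, and it is what the counting arguments used later in this section provide.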

In the linear case the auxiliary variables in the extended formulation are, in essence, required to lie in the cone given by the non-negative orthant. This was generalized to other cones, allowing for more expressiveness in the extended space. Of particular interest to this work is the generalization to semidefinite extended formulations; see [29, 35] for details on the following concepts and results.

Definition 4.7 (Semidefinite Extended Formulations).

Given a convex set $C \subseteq \mathbb{R}^n$, a semidefinite extended formulation of $C$ is a system

$\langle A_i, Y \rangle + a_i^{\top} x = b_i \;\; (i \in I), \qquad Y \in \mathbb{S}^r_+,$  (10)

where $I$ is an index set, $A_i \in \mathbb{S}^r$, $a_i \in \mathbb{R}^n$, $b_i \in \mathbb{R}$, with the property that $x \in C$ if and only if there exists $Y$ such that $(x, Y)$ satisfies (10). The size of the semidefinite extension is given by the size $r$ of the matrices in (10), and the semidefinite extension complexity of $C$ is the minimum size of a semidefinite extended formulation of $C$. It is denoted $\operatorname{xc}_{\mathrm{SDP}}(C)$.

Definition 4.8 (Semidefinite Factorization).

Given a non-negative matrix $S \in \mathbb{R}^{m \times N}_+$, a rank-$r$ semidefinite factorization of $S$ is given by a set of pairs $\{(U_i, V_j) : i \in [m],\ j \in [N]\}$ with $U_i, V_j \in \mathbb{S}^r_+$ such that $S_{ij} = \langle U_i, V_j \rangle$ for all $i, j$.

The semidefinite rank of $S$, denoted $\operatorname{rank}_{\mathrm{psd}}(S)$, is the minimum $r$ for which a rank-$r$ semidefinite factorization of $S$ exists.

Theorem 4.9 (Yannakakis’ Factorization Theorem for SDPs, [35]).

Let $P$ be a polytope with $P = \{x : a_i^{\top} x \le b_i,\ i \in [m]\} = \operatorname{conv}(X)$ and let $S$ be the slack matrix of $P$ associated to this description and $X$. Then $\operatorname{xc}_{\mathrm{SDP}}(P) = \operatorname{rank}_{\mathrm{psd}}(S)$.

Note that every linear extended formulation is a semidefinite extended formulation using diagonal matrices, so that $\operatorname{xc}_{\mathrm{SDP}}(P) \le \operatorname{xc}(P)$.

4.2 Low treewidth implies small extension complexity

We will now state the known upper bound on the linear extension complexity of low-treewidth problems, which we prove to be nearly optimal. The following strong result is well known; see e.g., [12, 40, 42]:

Theorem 4.10.

Let $S \subseteq \{0,1\}^n$ be a set that exhibits a formulation as in (8), namely

$S = \{\, x \in \{0,1\}^n : g_j(x) = 1,\ j \in [m] \,\}.$  (11)

If the intersection graph of (11) has treewidth $\omega$, then $\operatorname{conv}(S)$ has linear extension complexity at most

$O(2^{\omega}\, n).$  (12)

We will construct sets that (a) can be formulated using sparse constraints (given by some treewidth $\omega$) and which (b) exhibit extension complexity essentially matching (12). By building on recent lower bounds on semidefinite extension complexity [18], we show the existence of such 0/1 sets, whose semidefinite extension complexity (nearly) meets the bound (12) (see Main Theorem 4.23). In fact, for those hard instances, we show a stronger result. The extension complexity does not take into account techniques that are routinely adopted to solve integer programs, such as, e.g., reformulations or parallelization of separable sets. These techniques can be used to modify the original instance into an equivalent integer programming problem, which may be computationally more attractive. We show that the hard instances we construct cannot be reformulated to have lower extension complexity or to be separable.
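The following minimal sketch (with a hypothetical toy instance) illustrates the generic construction behind the bound of Theorem 4.10: enumerate the locally feasible 0/1 assignments of each bag of a tree decomposition, introduce one variable per (bag, assignment) pair, and glue adjacent bags via marginalization equalities; since a bag of a width-$\omega$ decomposition has at most $\omega + 1$ variables, this produces at most $2^{\omega+1}$ variables per bag, i.e., $O(2^{\omega} n)$ overall.

```python
# Sketch: counting the variables and gluing constraints of the treewidth-based LP.
import itertools

def locally_feasible(bag, constraints):
    """All 0/1 assignments of the bag satisfying every constraint supported inside the bag."""
    bag = tuple(sorted(bag))
    feasible = []
    for bits in itertools.product([0, 1], repeat=len(bag)):
        a = dict(zip(bag, bits))
        if all(check(a) for support, check in constraints if set(support) <= set(bag)):
            feasible.append(tuple(sorted(a.items())))
    return feasible

def build_formulation(bags, tree_edges, constraints):
    lam = [(t, a) for t, bag in enumerate(bags) for a in locally_feasible(bag, constraints)]
    glue = []
    for s, t in tree_edges:
        shared = sorted(set(bags[s]) & set(bags[t]))
        # one marginalization equality per assignment of the shared variables
        for bits in itertools.product([0, 1], repeat=len(shared)):
            glue.append((s, t, tuple(zip(shared, bits))))
    return lam, glue

# Toy instance: x_i + x_{i+1} <= 1 on 5 binary variables, a width-1 decomposition of a path.
constraints = [((i, i + 1), (lambda a, i=i: a[i] + a[i + 1] <= 1)) for i in range(4)]
bags = [(i, i + 1) for i in range(4)]
tree_edges = [(i, i + 1) for i in range(3)]
lam, glue = build_formulation(bags, tree_edges, constraints)
print(len(lam), "lambda variables,", len(glue), "gluing equalities")  # 12 and 6
```

The hard instances constructed in this section show that, up to lower-order terms, no extended formulation (linear or semidefinite) can do substantially better than this enumeration.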

The careful reader might have noticed an important fact: the extension complexity in (12) does not depend on a particular formulation of the set $S$, as opposed to the treewidth. To overcome this disparity, and for simplicity in the upcoming discussion, we focus on the “best possible” treewidth of a formulation, which we refer to as the treewidth (or treewidth complexity) of $S$. This definition prevents the results from depending on a particular formulation, or on the type of constraints (e.g., linear, boolean, or polynomial).

Definition 4.11.

Given $S \subseteq \{0,1\}^n$, we denote by $\operatorname{tw}(S)$ the smallest treewidth of the intersection graph of any formulation of $S$ as in (11).

4.3 Binary optimization problems with high extension complexity

In this section we analyze how high semidefinite extension complexity can be used to derive characteristics of the formulation of sets and their treewidth. Consider a family of sets with such that

(13)

for some . For technical reasons we further assume that satisfies

(14)
Remark 4.12.

Every family of sets such that for some satisfies (14). In such case, asymptotically and (14) can be easily verified.

Assuming (14) only excludes sets with linear or sub-linear semidefinite extension complexity (w.r.t. ), which are of little interest here. Moreover, by [18], we know there exist 0/1 sets whose semidefinite extension complexity satisfies (14).

Lemma 4.13.

Any formulation of has intersection graph with treewidth and at most . In particular, is and .

Proof.

The upper bound is immediate, since has variables. For the lower bound, we know from Theorem 4.10 there exists such that

(15)

where is the treewidth obtained from a formulation (11). And since

(16)

for some , we obtain

If this implies

a contradiction with (14). We conclude . ∎

4.4 Composition of Polytopes

The techniques in this section allow us to manipulate the sets in a convenient way. Here we drop the index for ease of notation as all definitions and results apply for any 0/1 set. We use the notation with to denote the set ; in particular for all .

Definition 4.14.

For , we define as

In particular, for all and . We obtain the following lemma:

Lemma 4.15.
(17)
Proof.

Inclusion is direct, as the right-hand set is convex, and the inclusion can be directly verified for the extreme points.

Now consider an extreme point of the right-hand set in (17). We first claim . Otherwise, we can write

where is the -th canonical vector. By assumption thus and . This contradicts being an extreme point.

As such and we can easily verify that which proves the remaining inclusion. ∎

Definition 4.16.

A polytope $Q$ is called a pyramid with base $P$ and apex $v$ if

$Q = \operatorname{conv}(P \cup \{v\})$

and $v$ is not contained in the affine hull of $P$.

In Tiwary et al. [53] the extension complexity of the Cartesian product of polytopes is analyzed and it is shown:

Theorem 4.17.

Let be non-empty polytopes such that one of the two polytopes is a pyramid. Then

This result provides us with a tool to combine polytopes in a way that their extension complexities add up. Unfortunately, the result is limited to linear extended formulations. We generalize it to the SDP case here:

Theorem 4.18.

Let be non-empty polytopes such that one of them is a pyramid. Then

Proof.

This result follows directly from combining the analysis by Tiwary et al. [53] with a result from Fawzi et al. [27]. We assume w.l.o.g. that is a pyramid and thus we may assume the slack matrix of has the form

with a slack matrix of the base of . This implies

(18)

(see e.g., [27, Theorem 2.10]). On the other hand, it also implies that there is a slack matrix of of the following form (see [53]):

where each corresponds to a column of and is a slack matrix of . Further, the following matrix is a sub-matrix of :

Since this is a block-triangular matrix, by [27, Theorem 2.10] we know that

Using the factorization theorem for semidefinite extended formulations (Theorem 4.9) and (18) we obtain