Size-Degree Trade-Offs for Sums-of-Squares and Positivstellensatz Proofs

Albert Atserias   Tuomas Hakoniemi
Universitat Politècnica de Catalunya
{atserias,hakoniemi}@cs.upc.edu
Abstract

We show that if a system of degree-$k$ polynomial constraints on $n$ Boolean variables has a Sums-of-Squares (SOS) proof of unsatisfiability with at most $s$ many monomials, then it also has one whose degree is of the order of the square root of $n \log s$ plus $k$. A similar statement holds for the more general Positivstellensatz (PS) proofs. This establishes size-degree trade-offs for SOS and PS that match their analogues for weaker proof systems such as Resolution, Polynomial Calculus, and the proof systems for the LP and SDP hierarchies of Lovász and Schrijver. As a corollary to this, and to the known degree lower bounds, we get optimal integrality gaps for exponential size SOS proofs for sparse random instances of the standard NP-hard constraint optimization problems. We also get exponential size SOS lower bounds for Tseitin and Knapsack formulas. The proof of our main result relies on a zero-gap duality theorem for pre-ordered vector spaces that admit an order unit, whose specialization to PS and SOS may be of independent interest.

1 Introduction

A key result in semialgebraic geometry is the Positivstellensatz [33, 20], whose weak form gives a version of the Nullstellensatz for semialgebraic sets: A system of polynomial equations $p_1 = 0, \ldots, p_m = 0$ and polynomial inequalities $q_1 \geq 0, \ldots, q_\ell \geq 0$ on $n$ commuting variables $x_1, \ldots, x_n$ has no solution over the reals if and only if

$$-1 = \sum_{J \subseteq [\ell]} s_J \prod_{j \in J} q_j + \sum_{j \in [m]} t_j p_j, \qquad (1)$$

where the $s_J$ are sums of squares of polynomials, and the $t_j$ are arbitrary polynomials. Based on this, Grigoriev and Vorobjov [16] defined the Positivstellensatz (PS) proof system for certifying the unsatisfiability of systems of polynomial inequalities, and initiated the study of its proof complexity.

For most cases of interest, the statement of the Positivstellensatz stays true even if the first sum in (1) ranges only over singleton sets [31]. This special case of PS yields a proof system called Sums-of-Squares (SOS). Starting with the work in [3], SOS has received a good deal of attention for its applications in algorithms and complexity theory. For the former, through the connection with the hierarchies of SDP relaxations [21, 27, 26, 8]. For the latter, through the lower bounds on the sizes of SDP lifts of combinatorial polytopes [11, 24, 23]. We refer the reader to the introduction of [26] for a discussion on the history of these proof systems and their relevance for combinatorial optimization.

In this paper we concentrate on the proof complexity of PS and SOS when their variables range over the Boolean hypercube, i.e., the variables come in pairs of twin variables $x_i$ and $\bar{x}_i$, and are restricted through the axioms $x_i^2 - x_i = 0$, $\bar{x}_i^2 - \bar{x}_i = 0$ and $x_i + \bar{x}_i - 1 = 0$. This case is most relevant in combinatorial contexts. It is also the starting point for a direct link with the traditional proof systems for propositional logic, such as Resolution, through the realization that monomials represent Boolean disjunctions, i.e., clauses. In return, this link brings concepts and methods from the area of propositional proof complexity to the study of PS and SOS proofs.

In analogy with the celebrated size-width trade-off for Resolution [6] or the size-degree trade-off for Polynomial Calculus [17], a question that is suggested by this link is whether the monomial size of a PS proof can be traded for its degree. For a proof as in (1), the monomial size of the proof is the number of monomials in an explicit representation of the summands of the right-hand side. The degree of the proof is the maximum of the degrees of those summands. These are the two most natural measures of complexity for PS proofs (precise definitions of both measures are given in Section 2). The importance of this question stems from the fact that, at the time of writing, the complexity of PS and SOS proofs is relatively well understood when it is measured by degree, but rather poorly understood when it is measured by monomial size. If size could be traded for degree, then strong lower bounds on degree would transfer to strong lower bounds on monomial size. The converse, namely that strong lower bounds on monomial size transfer to strong lower bounds on degree, has long been known by elementary linear algebra.

In this paper we answer the size-degree trade-off question for SOS, and for PS proofs of bounded product width, i.e., the number of inequalities that are multiplied together in (1). We show that if a system of degree-$k$ polynomial constraints on $n$ pairs of twin variables has a PS proof of unsatisfiability of product width $w$ and no more than $s$ many monomials in total, then it also has one of degree $O(\sqrt{n \log s} + kw)$. By taking $w = 1$, this yields a size-degree trade-off for SOS as a special case.

Our result matches its analogues for weaker proof systems that were considered before. Building on the work of [4] and [9], a size-width trade-off theorem was established for Resolution: a proof with $s$ many clauses can be converted into one in which all clauses have size $O(\sqrt{n \log s} + k)$, where $k$ is the size of the largest initial clause [6]. The same type of trade-off was later established for monomial size and degree for the Polynomial Calculus (PC) in [17], and for proof length and rank for LS and LS$^+$ [29], i.e., the proof systems that come out of the Lovász-Schrijver LP and SDP hierarchies [25]. To date, the question for PS and SOS had remained open, and is answered here. (Footnote 1: Besides the proofs of the trade-off results for LS and LS$^+$, the conference version of [29] claims the result for the stronger Sherali-Adams and Lasserre/SOS proof systems, but the claim is made without proof. The very last section of the journal version [29] includes a sketch of a proof that, unfortunately, is an oversimplification of the LS/LS$^+$ argument that cannot be turned into a correct proof. The forthcoming discussion clarifies how our proof is based on, and generalizes, the one for LS/LS$^+$ in [29].)

Our proof of the trade-off theorem for PS follows the standard pattern of such previous proofs with one new key ingredient. Suppose $Q$ is a system of equations and inequalities that has a size $s$ refutation. Going back to the main idea from [9], the argument for getting a degree $d$ refutation goes in four steps: (1) find a variable $x$ that appears in many large monomials, (2) set it to a value $b \in \{0,1\}$ to kill all monomials where it appears, (3) induct on the number of variables to get refutations of $Q[x=b]$ and $Q[x=1-b]$ which, if $s$ is small enough, are of degrees $d-1$ and $d$, respectively, and (4) compose these refutations together to get a degree $d$ refutation of $Q$. The main difficulty in making this work for PS is step (4), for two reasons.

The first difficulty is that, unlike Resolution and the other proof systems, whose proofs are deductive, the proofs of PS are formal identities, also known as static. This means that, for PS, the reasoning it takes to refute $Q$ from the degree $d-1$ refutation of $Q[x=b]$ and the degree $d$ refutation of $Q[x=1-b]$ needs to be witnessed through a single polynomial identity, without exceeding the bound on the degree. This is challenging because the general simulation of a deductive proof by a static one incurs a degree loss. The second difficulty comes from the fact that, for establishing this identity, one needs to use a duality theorem that is not obviously available for degree-bounded PS proofs. What is needed is a zero-gap duality theorem for PS proofs of non-negativity that, in addition, holds tight at each fixed degree of proofs. For SOS, the desired zero-gap duals are provided by the levels of the Lasserre hierarchy. This was established in [19] under the sole assumption that the inequalities include a ball constraint $R^2 - \sum_{i} x_i^2 \geq 0$ for some $R$. In the Boolean hypercube case, this can be assumed without loss of generality. For PS, we are not aware of any published result that establishes what we need, so we provide our own proof. At any rate, one of our contributions is the observation that a zero-gap duality theorem for PS-degree is a key tool for completing step (4) in the proof of the trade-off theorem. We reached this conclusion from trying to generalize the proofs for LS and LS$^+$ from [29] to SOS. In those proofs, the corresponding zero-gap duality theorems are required only for the very special case where $d = 2$ and for deriving linear inequalities from linear constraints. The fact that these hold goes back to the work of Lovász and Schrijver [25].

In the end, the zero-gap duality theorem for PS-degree turned out to follow from very general results in the theory of ordered vector spaces. Using a result from [28] that whenever a pre-ordered vector space has an order unit a zero-gap duality holds, we are able to establish the following general fact: for any convex cone $C$ of provably non-negative polynomials and its restriction $C_{2d}$ to proofs of some even degree $2d$, if the ball constraints $R - x_i^2$ belong to $C_{2d}$ for all variables $x_i$ and some $R \in \mathbb{R}_{>0}$, then a zero-gap duality holds for $C_{2d}$ in the sense that

$$\sup\{r \in \mathbb{R} : p - r \in C_{2d}\} = \inf\{E(p) : E \in E_{2d}\},$$

where $E_{2d}$ is an appropriate dual space for $C_{2d}$. The conditions are easily seen to hold for PS-degree and SOS-degree in the Boolean hypercube case, and we have what we want. We use this in Section 3, where we prove the trade-off lemma, but defer its proof to Section 5.

In Section 4 we list some of the applications of the size-degree trade-off for PS that follow from known degree lower bounds. Among these we include exponential size SOS lower bounds for Tseitin formulas, Knapsack formulas, and optimal integrality gaps for sparse random instances of MAX-3-XOR and MAX-3-SAT. Except for Knapsack formulas, for which size lower bounds follow from an easy random restriction argument applied to the degree lower bounds in [13, 15], these size lower bounds for SOS appear to be new.

2 Preliminaries

For a natural number $n$ we use the notation $[n]$ for the set $\{1, \ldots, n\}$. We write $\mathbb{R}_{\geq 0}$ and $\mathbb{R}_{>0}$ for the sets of non-negative and positive reals, respectively, and $\mathbb{N}$ for the set of natural numbers. The natural logarithm is denoted $\log$, and $\exp$ denotes base $e$ exponentiation.

2.1 Polynomials and the Boolean ideal

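Let $x_1, \ldots, x_n$ and $\bar{x}_1, \ldots, \bar{x}_n$ be two disjoint sets of variables. Each $x_i, \bar{x}_i$ is called a pair of twin variables, where $x_i$ is the basic variable and $\bar{x}_i$ is its twin. We consider polynomials over the ring of polynomials with real coefficients and commuting variables $\{x_i, \bar{x}_i : i \in [n]\}$, which we write simply as $\mathbb{R}[x]$. The intention is that all the variables range over the Boolean domain $\{0,1\}$, and that $\bar{x}_i = 1 - x_i$. Accordingly, let $I_n$ be the Boolean ideal, i.e., the ideal of polynomials generated by the following set $B_n$ of Boolean axioms on the $n$ pairs of twin variables:

$$B_n = \{ x_i^2 - x_i,\ \bar{x}_i^2 - \bar{x}_i,\ x_i + \bar{x}_i - 1 : i \in [n] \}. \qquad (2)$$

We write $p \equiv q$ if $p - q$ is in $I_n$.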

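A monomial is a product of variables. A term is the product of a non-zero real and a monomial. A polynomial is a sum of terms. For $\alpha \in \mathbb{N}^{2n}$, we write $x^\alpha$ for the monomial $\prod_{i=1}^{n} x_i^{\alpha_i} \bar{x}_i^{\alpha_{n+i}}$, so polynomials take the form $\sum_{\alpha \in I} a_\alpha x^\alpha$ for some finite $I \subseteq \mathbb{N}^{2n}$. The monomial size of a polynomial $p$ is the number of terms, and is denoted $\mathrm{size}(p)$. A sum-of-squares polynomial is a polynomial of the form $s = \sum_{i=1}^{k} r_i^2$, where each $r_i$ is a polynomial in $\mathbb{R}[x]$. For a polynomial $p$ we write $\deg(p)$ for its degree. We think of $\mathbb{R}[x]$ as an infinite dimensional vector space, and we write $\mathbb{R}[x]_d$ for the subspace of polynomials of degree at most $d$.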
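As an illustration only, the two measures just defined can be computed on a small explicit polynomial. The sketch below assumes a dictionary encoding of polynomials (monomial to coefficient); the encoding and the names `monomial_size` and `degree` are choices made for this example and are not notation from the paper. The zero polynomial is given degree 0 here purely by convention.

```python
from typing import Dict, Tuple

# Assumed encoding: a monomial is a tuple of (variable name, exponent) pairs,
# sorted by variable name; the empty tuple is the constant monomial 1.
# A polynomial maps monomials to real coefficients.
Monomial = Tuple[Tuple[str, int], ...]
Polynomial = Dict[Monomial, float]


def monomial_size(p: Polynomial) -> int:
    """Number of terms, i.e. monomials with a non-zero coefficient."""
    return sum(1 for coeff in p.values() if coeff != 0)


def degree(p: Polynomial) -> int:
    """Largest total degree among the terms (0 for the zero polynomial)."""
    degrees = [sum(exp for _, exp in mono) for mono, coeff in p.items() if coeff != 0]
    return max(degrees, default=0)


# p = 3 * x1^2 * xbar2 - x1 + 5
p: Polynomial = {
    (("x1", 2), ("xbar2", 1)): 3.0,
    (("x1", 1),): -1.0,
    (): 5.0,
}

size_of_p = monomial_size(p)
degree_of_p = degree(p)
```

Here `p` has three terms and its largest term has total degree three.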

2.2 Sums-of-Squares proofs

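Let $Q = \{q_1, \ldots, q_\ell, p_1, \ldots, p_m\}$ be an indexed set of polynomials. We think of the $q_j$ polynomials as inequality constraints, and of the $p_j$ polynomials as equality constraints:

$$q_1 \geq 0, \ldots, q_\ell \geq 0, \quad p_1 = 0, \ldots, p_m = 0. \qquad (3)$$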

Let be another polynomial. A Sums-of-Squares (SOS) proof of from is a formal identity of the form

(4)

where and are sums of squares of polynomials, for , and and all are arbitrary polynomials. The proof is of degree at most if , , for each , and for each . The proof is of monomial size at most  if

This definition of size corresponds to the number of monomials of an explicit SOS proof given in the form , where each is given in the form , and all the and polynomials are represented as explicit sums of terms. Accordingly, the monomials of the ’s and the ’s are called the explicit monomials of the proof.

Note that the polynomials are not considered in the definition we have chosen of an explicit SOS proof, so they do not contribute to its monomial size or its degree. The rationale for this is that typically one thinks of the identity in (4) as an equivalence

and we want proof size and degree to not depend on how the computations modulo the Boolean ideal  are performed. For degree this choice is further justified from the fact that one may always assume that the degrees of the products do not surpass the degree in a proof of degree . This follows from the fact that is a Gröbner basis for with respect to any monomial ordering – one can see this quite easily using Buchberger’s Criterion (see e.g. [10]). In particular upper and lower bounds for the restricted definition of degree imply the same upper and lower bounds for our liberal definition of degree, and vice versa. For monomial size, this goes only in one direction: lower bounds on our liberal definition of monomial size translate into lower bounds for a restricted definition of monomial size that takes  also into account. Since our aim is to prove lower bounds on the number of monomials in a proof, proving our results for our more liberal definition of monomial size makes our results only stronger.

2.3 Positivstellensatz proofs

This will be an extension of SOS. Let be an indexed set of polynomials interpreted as in (3). A Positivstellensatz proof (PS) of from is a formal identity of the form

(5)

where is a collection of non-empty subsets of , each is a sum-of-squares polynomial, , and each and is an arbitrary polynomial. The proof is of degree at most  if , , for each , and  for each . The proof is of monomial size at most  if

The proof has product-width at most if each has cardinality at most . The explicit monomials of the proof are the monomials of the ’s and the ’s. It should be noted that PS applied to a that contains at most one inequality constraint (i.e., ) is literally equivalent to SOS: any power of a single inequality is either a square, or the lift of that inequality by a square.

As in SOS proofs, the definitions of monomial size and degree of a proof do not take into account the polynomials. Likewise, the monomials in the products do not contribute to the definition of monomial size. As above, this liberal definition plays in favour of lower bounds in the case of monomial size. For degree, ignoring the ’s does not really matter, again, because is a Gröbner basis for .

2.4 More on the definition of monomial size

Starting at [9, 1], counting monomials in algebraic proof systems such as the Polynomial Calculus (PC) is a well-established practice in propositional proof complexity. One motivation for it comes from the fact that PC with twin variables, called PCR in [1], polynomially simulates Resolution, and the natural transformation that is given by the proof turns the clauses of the Resolution proof into monomials. Another motivation comes from the fact that, in the area of computational algebra, the performance of the Gröbner bases method appears to depend significantly on how the polynomials are represented. In this respect, the sum of monomials representation of polynomials features among the first and most natural choices to be used in practice. That said, for the natural static version of PC called Nullstellensatz (NS) [5], let alone for SOS and PS, counting monomials does not appear to have such a well-established tradition. Note that in the presence of twin variables, SOS monomial size is known to polynomially simulate Resolution (see Lemma 4.6 in [2], where this is proved with a slightly different definition of SOS and monomial size from the one above; the difference is minor). It follows that the first of the two motivations for counting monomials in PC carries over to SOS, and hence to PS.

In the original papers of Beame et al. and Grigoriev-Vorobjov [5, 16], where NS and PS were first defined, size is never considered, only degree. Grigoriev's subsequent papers on SOS [13, 14] did not consider size either. To the best of our knowledge, the first reference that defines a notion of size for (the version of) PS proofs (with ) appears to be [15], where the size of a proof is defined as “the length of a reasonable bit representation of all polynomials” in the proof. The same paper proves lower bounds on the “number of monomials” of an SOS proof (see Lemma 9.1 in [15]) without being precise as to whether it is counting monomials in the polynomials (in the notation of (4)), or in the expansion of as a sum of terms. Note, however, that , hence the difference between these two possibilities is not terribly critical. As with the squares , the definitions in [15] are not explicit as to whether the monomials in the polynomials (in the notation of (4) again) contribute to the monomial size by themselves, or whether one is to take into account the expansions of the products . Unlike ours, the definitions in [15] do not distinguish between the polynomials that multiply the Boolean axioms and the rest.

The difference between counting the monomials of the (or the ) polynomials versus counting those in the expansions of the products and is again not critical if one is satisfied with a notion of size up to a polynomial factor that depends on the size of the input. If one is to care about such refinements of monomial size that take into account polynomial factors, then a natural size measure for, say,  could well be  or even , instead of . Note that  corresponds to the number of monomials that one would encounter while expanding the product in the naive way before merging terms with the same monomial, and in particular, before any potential cancelling of terms occurs. In [2], the monomial size of (their slightly different version of) Lasserre/SOS is defined in terms of the expanded summands, which in the notation of (4), would correspond to . In [22] the same convention for defining monomial size is used but the last sum over is omitted since they work mod  by default. For PS proofs as in (5) that have large product-width , whether we count the monomials in the  polynomials or in the expansions of the products  could make a significant difference, i.e., exponential in . If we think of the proof in (5) as given by the indexed sequences and , then counting only the monomials in the  polynomial, or even better in the polynomials, looks like the natural choice.

3 Size-Degree Trade-Off

In this section we prove the following.

Theorem 1.

For every two natural numbers $n$ and $k$, every indexed set $Q$ of polynomials of degree at most $k$ with $n$ pairs of twin variables, and every two positive integers $s$ and $w$, if there is a PS refutation from $Q$ of product-width at most $w$ and monomial size at most $s$, then there is a PS refutation from $Q$ of product-width at most $w$ and degree at most $4\sqrt{2(n+1)\log(s)} + kw + 4$.

An immediate consequence is a degree criterion for size lower bounds:

Corollary 1.

Let $Q$ be an indexed set of polynomials of degree at most $k$ with $n$ pairs of twin variables, and let $w$ be a positive integer. If $d$ is the minimum degree and $s$ is the minimum monomial size of PS refutations from $Q$ of product-width at most $w$, and $d \geq kw + 4$, then $s \geq \exp((d - kw - 4)^2 / (32(n+1)))$.

The proof of Theorem 1 will follow the standard structure of proofs for degree-reduction lemmas for other proof systems, except for some complications in the unrestricting lemmas. These difficulties come from the fact that PS proofs are static. The main tool around these difficulties is a tight Duality Theorem for degree-bounded proofs with respect to so-called cut-off functions as defined next.

3.1 Duality modulo cut-off functions

Let be an indexed set of polynomials interpreted as constraints as in (3). A cut-off function for is a function with  for each , and for each . A PS proof as in (5) has degree mod at most if , , for each , and for each .

Let denote the set of all polynomials of degree at most such that has a PS proof from of degree mod at most and product-width at most . We write  if . A pseudo-expectation for of degree mod at most  and product-width at most is a linear functional from the set of all polynomials of degree at most such that and for all . We denote by  the set of pseudo-expectations for the indicated parameters.

Theorem 2.

Let be a positive integer, let be an indexed set of polynomials, let be a cut-off function for , let be a positive integer, and let be a polynomial of degree at most . Then

Moreover, if the set is non-empty, then there is a pseudo-expectation achieving the infimum; i.e., is well-defined.

Note that the statement of Theorem 2 applies only to even degrees. This comes as an artifact of the proof but is in no way a severe restriction for the applications that we have in mind. The definitions of degree for SOS and PS proofs as defined in Section 2 are special cases of the definitions above for appropriate choices of and . Thus, Theorem 2 gives Duality Theorems for them. The role of the cut-off function in our application below will be explained in due time; i.e., after its use in the unrestricting Lemma 3 below. It is important for the lemmas that follow that these duality theorems are tight in two ways: that they have zero duality gap and that they respect the degree; i.e., the degree bound is the same for proofs and pseudo-expectations. We defer the proof of Theorem 2 to Section 5 where a more general statement is proved.

3.2 Unrestricting lemmas

For this section, fix three positive integers , and for the numbers of pairs of twin variables, degree, and product width. We also fix an indexed set of polynomials on the pairs of twin variables, and a cut-off function for .

Lemma 1.

Let and be polynomials of degree at most . If , then for any .

Proof.

The assumption that implies that both and belong to . Hence for any . ∎

Lemma 2.

Let be one of the variables and let be a monomial of degree at most . Then implies for any .

Proof.

Let and be two monomials of degree at most and , respectively, such that . Note first that , since and all degrees are at most . Hence, by Lemma 1. Let then  and note that . For every positive integer we have

where in both cases the equalities follow from and . Since and the inequalities hold for every it must be that and the lemma is proved. ∎

For a polynomial on the pairs of twin variables, an index, and a Boolean value, we denote by the polynomial that results from assigning to and to in . We extend the notation to indexed sets of such polynomials through  to mean . Note that and are polynomials on  pairs of twin variables, and their degrees are at most those of and , respectively.
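The restriction operation just described can be sketched in a few lines of Python. This is an illustration under assumptions: polynomials are encoded as dictionaries from monomials to coefficients, the variables of the $i$-th twin pair are named `x{i}` and `xbar{i}`, and the function name `restrict` is hypothetical. Since the assigned values are 0 or 1, a monomial containing a variable set to 0 vanishes, and a variable set to 1 simply drops out.

```python
from typing import Dict, Tuple

# Assumed encoding: monomial = tuple of (variable name, exponent) pairs,
# polynomial = dict from monomials to real coefficients.
Monomial = Tuple[Tuple[str, int], ...]
Polynomial = Dict[Monomial, float]


def restrict(p: Polynomial, i: int, b: int) -> Polynomial:
    """Assign b to the basic variable x_i and 1 - b to its twin xbar_i."""
    values = {f"x{i}": b, f"xbar{i}": 1 - b}
    out: Polynomial = {}
    for mono, coeff in p.items():
        rest = []
        killed = False
        for var, exp in mono:
            if var in values:
                if values[var] == 0:
                    killed = True  # the whole monomial evaluates to 0
                    break
                # a variable set to 1 contributes the factor 1 and is dropped
            else:
                rest.append((var, exp))
        if killed:
            continue
        key = tuple(rest)
        out[key] = out.get(key, 0.0) + coeff
    # remove terms whose coefficients cancelled after merging
    return {mono: coeff for mono, coeff in out.items() if coeff != 0}


# p = x1 * x2 + xbar1 * x3 - x2
p: Polynomial = {
    (("x1", 1), ("x2", 1)): 1.0,
    (("xbar1", 1), ("x3", 1)): 1.0,
    (("x2", 1),): -1.0,
}

p_one = restrict(p, 1, 1)   # x1 := 1, xbar1 := 0
p_zero = restrict(p, 1, 0)  # x1 := 0, xbar1 := 1
```

In the example, setting `x1` to 1 makes the two surviving terms cancel, while setting it to 0 leaves `x3 - x2`. In both cases the result no longer mentions the first twin pair and its degree does not exceed that of `p`.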

Lemma 3.

Let , let and be the extensions of with the polynomials and , respectively, and let be the extension of that maps to . The following hold:

  1. The function is a cut-off function for both and ,

  2. If , then .

  3. If , then .

Proof.

The first item is obvious. By symmetry between the other two items, we prove only the second. Suppose that , say:

(6)

For , write , let and and note that

Therefore where

Note that since for and for . Now

(7)
(8)

Because is a cut-off function for and , we have . Likewise for every , we have:

The second inequality follows from the facts that and for all , the third inequality follows from the fact that is a cut-off function for , and the equality follows from the definition of . Hence, . A similar and easier argument with and in place of and shows that . This gives proofs for all terms in the right-hand side of (6), and the proof of the lemma is complete. ∎

Some comments are in order about the role of the cut-off function in the above proof. First note that, at the semantic level, the constraint is equivalent to the pair of constraints and . At the level of syntactic proofs, though, these two representations of the same constraint behave differently: although a lift of the restriction  of may have its degree bounded by , the degree of its direct simulation through could exceed . The role of the cut-off function is to restrict the lifts in such a way that their simulation through remains a valid lift of degree at most ; this is the case if, indeed, the allowed lifts of are those satisfying , where . This is why is designed to depend only on the index (or ) and not on the polynomial indexed by (or ).

Lemma 4.

Let , let and be the extensions of with the polynomials and , respectively, and let be the extension of that maps to . The following hold:

  • The function is a cut-off function for both and .

  • If , then for any .

  • If , then for any .

Proof.

The first item is obvious. We prove the second; the proof of the third is symmetric. Suppose towards a contradiction that there is such that . We want to show that is also in . This contradicts the assumption that . Let

(9)

be a proof from of degree mod at most and product-width at most . First note that . Therefore, Lemma 2 applies to all the monomials of , so . The rest of (9) will get a non-negative value through , since by assumption is in and is restricted to . Thus, is in . ∎

Lemma 5.

Let and assume that . The following hold:

  • If and , then .

  • If and , then

Proof.

Since in this proof and remain fixed, we write instead of and instead of , and act similarly for degree . First note that , and , so

(10)

We prove (i); the proof of (ii) is entirely analogous.

Assume . By Lemmas 3 and 4 and we have for any . Then, by the Duality Theorem, there exist such that . To see this, let . If is empty, then and any  serves the purpose. If is non-empty, then the Duality Theorem says that the infimum is achieved, hence  for some  in , and serves the purpose. Using again, , so

(11)

Assume also . By Lemmas 3 and 4 we have for any , and this time suffices. By the same argument as before, by the Duality Theorem there exist such that . Now suffices to get

(12)

Adding (10), (11) and (12) gives , i.e., . ∎

3.3 Inductive proof

We need one more technical concept: a PS proof as in (5) is multilinear if and are sums-of-squares of multilinear polynomials for each , and is a multilinear polynomial for each .

Lemma 6.

For every two positive integers and and every indexed set of polynomials, if there is a PS refutation from of monomial size at most and product-width at most , then there is a multilinear PS refutation from of monomial size at most and product-width at most .

Proof.

Assume that and that there is a refutation from as in (5), with and for , where the total number of monomials among the , and is at most . For each polynomial let be its direct multilinearization; i.e., each power with that appears in is replaced by . It is obvious that and also , where is the number of pairs of twin variables in . Moreover, the number of monomials in does not exceed that of . Thus, setting , and we get

(13)

It follows that has a multilinear refutation of monomial size at most . ∎
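The direct multilinearization used in this proof can be illustrated with a short Python sketch. It assumes the same dictionary encoding of polynomials as a map from monomials to coefficients, and the function name `multilinearize` is hypothetical. Every exponent of at least 2 is replaced by 1, and terms that collapse onto the same monomial are merged, so the number of monomials can only stay the same or decrease.

```python
from typing import Dict, Tuple

# Assumed encoding: monomial = tuple of (variable name, exponent) pairs sorted
# by variable name, polynomial = dict from monomials to real coefficients.
Monomial = Tuple[Tuple[str, int], ...]
Polynomial = Dict[Monomial, float]


def multilinearize(p: Polynomial) -> Polynomial:
    """Replace every power of a variable with exponent >= 2 by the variable itself."""
    out: Polynomial = {}
    for mono, coeff in p.items():
        key = tuple((var, 1) for var, exp in mono if exp >= 1)
        out[key] = out.get(key, 0.0) + coeff
    # merged terms may cancel; keep only non-zero coefficients
    return {mono: coeff for mono, coeff in out.items() if coeff != 0}


# p = x1^3 * x2 + 2 * x1 * x2^2 - x1^2
p: Polynomial = {
    (("x1", 3), ("x2", 1)): 1.0,
    (("x1", 1), ("x2", 2)): 2.0,
    (("x1", 2),): -1.0,
}

p_ml = multilinearize(p)
```

On the example, the first two terms merge into `3 * x1 * x2` and the last becomes `-x1`, so three monomials become two. Over Boolean values the two polynomials agree, which is the content of the equivalence modulo the Boolean ideal used in the proof.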

Theorem 1 will be a consequence of the following lemma for a suitable choice of and :

Lemma 7.

For every natural number , every indexed set of polynomials with pairs of twin variables, every cut-off function for , every real and every two positive integers  and , if there is a multilinear PS refutation from of product-width at most with at most many explicit monomials of degree at least (counted with multiplicity), then there is a PS refutation from of product-width at most and degree mod at most  where and .

Proof.

The proof is an induction on . Let be an indexed set of polynomials with pairs of twin variables, let be a cut-off function for , let be a real, let and be positive integers, and let be a multilinear refutation from of product-width at most  and at most many explicit monomials of degree at least . For the statement is true because . Assume now that . Let be the exact number of explicit monomials of degree at least in . The total number of variable occurrences in such monomials is at least . Therefore, there exists one among the  variables that appears in at least of the explicit monomials of degree at least . Let  be the index of such a variable, basic or twin. If it is basic, let . If it is twin, let . Our goal is to show that

(14)

for and as stated in the lemma. If we achieve so, then because and , so Lemma 5 applies on (14) to give , which is what we are after.

Consider first. This is a set of polynomials on pairs of twin variables, and  is a multilinear refutation from it of product-width at most that has at most  explicit monomials of degree at least . Moreover is a cut-off function for it. We distinguish the cases and . If , then all explicit monomials in have degree at most . Since , this refutation has degree mod  at most . This gives the first part of (14). If , then first note that . Moreover, the induction hypothesis applied to and , and the same , and , gives that there is a refutation from of product-width at most and degree mod at most , where

(15)

Here we used the inequality which holds true for every real , and the fact that . This gives the first part of (14) since .

Consider next. In this case, the best we can say is that is still a cut-off function for it, and that is a multilinear refutation from it of product-width at most , that still has at most  many explicit monomials of degree at least . But has at most pairs of twin variables, so the induction hypothesis applies to it. Applied to the same , , and , it gives that there is a refutation from of degree mod  at most , where

(16)

This gives the second part of (14) since . The proof is complete. ∎
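The counting step at the start of the proof of Lemma 7 is a pigeonhole argument, which the following Python sketch illustrates. It assumes that multilinear monomials are encoded as sets of variable names, and the function name `most_frequent_variable` is hypothetical. If there are `t` explicit monomials of degree at least `d` over `2n` variables, they contain at least `d * t` variable occurrences in total, so some variable occurs in at least `d * t / (2n)` of them.

```python
from collections import Counter
from typing import FrozenSet, List, Optional, Tuple


def most_frequent_variable(
    monomials: List[FrozenSet[str]], d: int
) -> Tuple[Optional[str], int, int]:
    """Return (variable, hits, t): a variable occurring in the largest number
    of monomials of degree at least d, that number, and the number t of such
    monomials. Monomials are multilinear, so degree equals number of variables."""
    large = [m for m in monomials if len(m) >= d]
    counts = Counter(var for m in large for var in m)
    if not counts:
        return None, 0, len(large)
    var, hits = counts.most_common(1)[0]
    return var, hits, len(large)


# Four explicit monomials over n = 3 pairs of twin variables.
n = 3
d = 2
monomials = [
    frozenset({"x1", "x2", "xbar3"}),
    frozenset({"x1", "x3"}),
    frozenset({"x1", "xbar2", "x3"}),
    frozenset({"x2"}),
]

var, hits, t = most_frequent_variable(monomials, d)
```

In this example three monomials have degree at least 2 and `x1` occurs in all of them, comfortably above the pigeonhole guarantee `d * t / (2n) = 1`. Setting the selected variable to 0 (or its twin to 0, if the selected variable is a twin) removes all of those monomials from the restricted refutation.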

Proof of Theorem 1.

Assume that has a refutation of product-width at most and monomial size at most . Applying Lemma 6 we get a multilinear refutation with at most many explicit monomials, and hence with at most many explicit monomials of degree at least , for any of our choice. We choose

(17)

By assumption and we chose in such a way that . Thus, Lemma 7 applies to any cut-off function for , in particular for the cut-off function that is everywhere. This gives a refutation of product-width at most and degree mod at most