Lower Bounds for Polynomials based on Geometric Programming

Lower Bounds for Polynomials with Simplex Newton Polytopes Based on Geometric Programming

Sadik Iliman and Timo de Wolff

Sadik Iliman, Goethe-Universität, FB 12 – Institut für Mathematik, Postfach 11 19 32, 60054 Frankfurt am Main, Germany
iliman@math.uni-frankfurt.de

Timo de Wolff, Texas A&M University, Department of Mathematics, College Station, TX 77843-3368, USA
dewolff@math.tamu.edu

Abstract.

In this article, we propose a geometric programming method in order to compute lower bounds for real polynomials. We provide new sufficient conditions for polynomials to be nonnegative as well as to have a sum of binomial squares representation. These criteria rely on the coefficients and the support of a polynomial and generalize all previous ones by Lasserre, Ghasemi, Marshall, Fidalgo and Kovacec to polynomials with arbitrary simplex Newton polytopes.

This generalization yields a geometric programming approach for computing lower bounds for polynomials that significantly extends the geometric programming method proposed by Ghasemi and Marshall. Furthermore, it shows that geometric programming is strongly related to nonnegativity certificates based on sums of nonnegative circuit polynomials, which were recently introduced by the authors.

Key words and phrases:
Geometric programming, lower bound, nonnegative polynomial, semidefinite programming, simplex, sparsity, sum of nonnegative circuit polynomials, sum of squares
2010 Mathematics Subject Classification:
12D15, 14P99, 52B20, 90C25

1. Introduction

Finding lower bounds for real polynomials is a central problem in polynomial optimization. For polynomials with few variables or low degree, and for polynomials with additional structural properties, several approaches to this problem work well. The best known lower bounds are provided by Lasserre relaxations using semidefinite programming. Although the optimal value of a semidefinite program can be computed in polynomial time (up to an additive error), the size of such programs grows rapidly with the number of variables or the degree of the polynomials. Hence, there is much recent interest in finding lower bounds for polynomials using alternative approaches such as geometric programming (see (4.1) for a formal definition). Geometric programs can be solved in polynomial time using interior point methods [NN94]; see also [BKVH07, Page 118]. In this article, we provide new lower bounds for polynomials using geometric programs. These bounds extend results in [GM12] by Ghasemi and Marshall.

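Let $\mathbb{R}[\mathbf{x}] = \mathbb{R}[x_1,\ldots,x_n]$ and let $\mathbb{R}[\mathbf{x}]_{2d}$ be the space of polynomials of degree at most $2d$. A global polynomial optimization problem for some $f \in \mathbb{R}[\mathbf{x}]_{2d}$ is given by

$$f^* = \inf\{f(\mathbf{x}) : \mathbf{x} \in \mathbb{R}^n\} = \sup\{\lambda \in \mathbb{R} : f - \lambda \geq 0\}.$$

It is well-known that in general computing $f^*$ is NP-hard [DG14]. By relaxing the nonnegativity condition to a sum of squares condition, a lower bound for $f^*$ based on semidefinite programming is given by

$$f_{sos} = \sup\Big\{\lambda \in \mathbb{R} : f - \lambda = \sum_{i=1}^{k} q_i^2 \text{ for some } q_i \in \mathbb{R}[\mathbf{x}]\Big\}$$

and hence $f_{sos} \leq f^*$ [Las10]. A central open problem in polynomial optimization is to analyze the gap $f^* - f_{sos}$. Very little is known about this gap beyond the cases where it always vanishes. This happens exactly for $(n, 2d) \in \{(1, 2d), (n, 2), (2, 4)\}$ by Hilbert’s Theorem [Hil88]. For an overview of the topic see [BPT13, Las10, Lau09].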

Let $e_1,\ldots,e_n$ denote the standard basis of $\mathbb{R}^n$. In [FK11], Fidalgo and Kovacec consider the class of polynomials whose Newton polytope is a scaling of the standard simplex $\operatorname{conv}\{0, e_1,\ldots,e_n\}$. For these polynomials they provide certificates, i.e., sufficient conditions, both for nonnegativity and for being a sum of squares. In [GM12] Ghasemi and Marshall show that these certificates can be translated into checking feasibility of a geometric program. Moreover, in their recent works [GM12, GM13] Ghasemi and Marshall show several important further facts for polynomial optimization via geometric programming. Two key observations are the following:

  1. For general polynomials lower bounds based on geometric programming are seemingly not as good as bounds obtained by semidefinite programming.

  2. Even higher dimensional examples can often be solved quite fast via geometric programming. In contrast, semidefinite programs often do not provide an output for problems involving polynomials with many variables or of high degree (at least with the current SDP solvers).

The extension of Ghasemi and Marshall’s results, which we provide in this article, relies on the following key observation. In addition to the sum of squares approach, one can use nonnegative circuit polynomials to certify nonnegativity. Nonnegative circuit polynomials were recently introduced by the authors in [IdW14]. In particular, the authors’ results in [IdW14] imply as a special case the sufficient condition for nonnegativity by Fidalgo and Kovacec [FK11], which was used by Ghasemi and Marshall. Therefore, it is natural to ask whether the translation into geometric programs can also be generalized. The purpose of this article is to show that this is indeed the case.

Let $f$ be a real polynomial with simplex Newton polytope such that all its vertices are in $(2\mathbb{N})^n$ and the coefficients of the terms corresponding to the vertices are nonnegative. The main theoretical results we contribute in Section 3 are easily checkable criteria on the coefficients of such a polynomial $f$, which imply that $f$ is nonnegative. More precisely, these criteria imply that $f$ is a sum of nonnegative circuit polynomials (sonc), and, as a consequence of results by the authors in [IdW14], every sonc is nonnegative. See Theorems 3.1 and 3.4, and see Section 2 for a formal definition of a sonc. Moreover, we provide a second criterion on the support of a sonc polynomial, which implies that the sonc additionally is a sum of binomial squares.

The key observation is that, as in [GM12], these criteria can be translated into a geometric optimization problem (Corollary 4.2) in order to find a lower bound for a polynomial. As a surprising fact we show in Corollary 3.6 that for very rich classes of polynomials with simplex Newton polytope, the optimal value $f_{gp}$ of the corresponding geometric program is at least as good as the bound $f_{sos}$. In fact, $f_{gp} = f^*$ in these cases. This is in sharp contrast to the general observation by Ghasemi and Marshall [GM12, GM13], which we outlined above in (1). Additionally, based on similar examples, we can see that the computation of $f_{gp}$ is much faster than solving the corresponding semidefinite optimization problem, as was already shown numerically in [GM12].

Using the geometric programming software package gpposy for Matlab, we demonstrate the capabilities of our results on the basis of different examples. A major observation is that the bounds $f_{gp}$ and $f_{sos}$ are not comparable in general, since the convex cones of sonc’s and sums of squares do not contain each other, see [IdW14]. This observation, again, is in sharp contrast to the one in [GM12], where the bound $f_{sos}$ and the bound given by Ghasemi and Marshall’s geometric program are comparable.

Furthermore, we show in Section 5 that our methods are not only applicable to global polynomial optimization problems, but also to constrained ones, using methods similar to those of Ghasemi and Marshall in [GM13]. A further discussion of the constrained case and generalizations of the results in Section 5 and in [GM13] is the content of the follow-up article [DIdW16].

2. Preliminaries

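We consider a polynomial $f \in \mathbb{R}[\mathbf{x}]$ of the form $f(\mathbf{x}) = \sum_{\alpha \in \mathbb{N}^n} f_\alpha \mathbf{x}^\alpha$ with $f_\alpha \in \mathbb{R}$, $\mathbf{x}^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$, and Newton polytope $\operatorname{New}(f) = \operatorname{conv}\{\alpha \in \mathbb{N}^n : f_\alpha \neq 0\}$. We call a lattice point even if it is in $(2\mathbb{N})^n$. A binomial is an expression of the form $a\mathbf{x}^\alpha + b\mathbf{x}^\beta$ with $a, b \in \mathbb{R}$; if at least one of $a, b$ is 0, it is (also) a monomial. If a polynomial is a sum of squares of binomials, then it is customary to abbreviate this by saying it is a sobs, meaning it is a “sum of binomial squares”. We denote by $\operatorname{conv}(S)$ the convex hull of a subset $S$ of $\mathbb{R}^n$. Our interest lies foremost in ST-polynomials, which we define as follows.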

Definition 2.1.

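An ST-polynomial written in standard form is a polynomial of the form

$$f = f_{\alpha(0)} + \sum_{j=1}^{n} f_{\alpha(j)} \mathbf{x}^{\alpha(j)} + \sum_{\beta \in \Delta(f)} f_\beta \mathbf{x}^\beta$$

with exponents $\alpha(j)$ and $\beta \in \mathbb{N}^n$, coefficients $f_{\alpha(j)}, f_\beta \in \mathbb{R}$, and a set $\Delta(f) \subset \mathbb{N}^n$ for which the following hold:

(ST1):

The points $\alpha(0) = 0$ and $\alpha(1),\ldots,\alpha(n)$ define a set of affinely independent, even points in $\mathbb{N}^n$.

(ST2):

$\Delta(f)$ is the set of exponents in $f$ not defining monomial squares; i.e. $\beta \in \Delta(f)$ iff $\beta \notin (2\mathbb{N})^n$, or $f_\beta < 0$ and $\beta \in (2\mathbb{N})^n$.

(ST3):

There holds the inclusion $\Delta(f) \subseteq \operatorname{conv}\{\alpha(0),\ldots,\alpha(n)\}$ or, equivalently, every $\beta \in \Delta(f)$ can be written uniquely as $\beta = \sum_{j=0}^{n} \lambda_j^{(\beta)} \alpha(j)$ with $\lambda_j^{(\beta)} \geq 0$ and $\sum_{j=0}^{n} \lambda_j^{(\beta)} = 1$.

(ST4):

The $f_{\alpha(j)}$ are nonnegative, and if $f_{\alpha(j)} = 0$, then for all $\beta \in \Delta(f)$ there holds $\lambda_j^{(\beta)} = 0$.

The “ST” in “ST-polynomial” is short for “simplex tail”. The tail part is given by the sum $\sum_{\beta \in \Delta(f)} f_\beta \mathbf{x}^\beta$, while the other terms define the simplex part.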

Note that hypothesis (ST1) implies that is the vertex set of a full dimensional simplex. It consists of even lattice points and has one vertex at the origin. Hypothesis (ST3) implies that the Newton polytope of is a face of this simplex, possibly the simplex itself. We will have if and only if all are positive. Otherwise, if for some , then is a proper face of Hypotheses (ST2), (ST3) and (ST4) together imply that So, is uniquely defined by and may be referred to by

The $\lambda_j^{(\beta)}$ denote the barycentric coordinates of $\beta$ relative to the vertices $\alpha(j)$. An ST-polynomial is homogeneous only if and for all occurring in (associated to nonzero coefficients) we have where In this case holds for all ; the converse need not be true.
Given a polynomial $f$, well-established algorithms from convex geometry allow one to determine whether $f$ is ST and, if so, to rewrite it in this form.
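The barycentric coordinates used throughout can be computed by solving one small linear system. The following is a minimal sketch, assuming the vertices $\alpha(0),\ldots,\alpha(n)$ are affinely independent as in (ST1); the function names `barycentric` and `in_simplex` are illustrative and do not come from the article. Exact rational arithmetic is used so that membership tests on the boundary of the simplex are not affected by rounding.

```python
from fractions import Fraction

def barycentric(vertices, point):
    """Solve sum_j lam_j * vertices[j] = point, sum_j lam_j = 1 exactly.

    vertices: n + 1 integer points in Z^n (assumed affinely independent).
    point:    an integer point in Z^n.
    """
    m = len(vertices)
    if m != len(point) + 1:
        raise ValueError("need exactly n + 1 vertices in dimension n")
    # One row per coordinate, plus the affine row of ones; last column is the right-hand side.
    rows = [[Fraction(v[i]) for v in vertices] + [Fraction(point[i])]
            for i in range(len(point))]
    rows.append([Fraction(1)] * m + [Fraction(1)])
    # Gauss-Jordan elimination on the m x (m + 1) augmented matrix.
    for col in range(m):
        pivot = next((r for r in range(col, m) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("vertices are not affinely independent")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = rows[col][col]
        rows[col] = [x / scale for x in rows[col]]
        for r in range(m):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [rows[j][m] for j in range(m)]

def in_simplex(vertices, point):
    """A point lies in the simplex iff all barycentric coordinates are nonnegative."""
    return all(lam >= 0 for lam in barycentric(vertices, point))

# Illustrative data: the point (2, 2) inside conv{(0,0), (2,4), (4,2)}.
print(barycentric([(0, 0), (2, 4), (4, 2)], (2, 2)))
# -> [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]
```

For an exponent that is not in the convex hull of the vertices, at least one coordinate is negative, which is how a violation of (ST3) would show up in such a check.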

Definition 2.2.

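An ST-polynomial $f$ is a circuit polynomial if $\Delta(f)$ is empty or a singleton. We fix the standard notation $f = \sum_{j=0}^{n} f_{\alpha(j)} \mathbf{x}^{\alpha(j)} + f_\beta \mathbf{x}^\beta$ for a circuit polynomial and define the associated circuit number $\Theta_f$, which was first defined by the authors in [IdW14], as follows:

(2.1)   $\Theta_f = \prod_{j \in nz(\beta)} \left( \frac{f_{\alpha(j)}}{\lambda_j} \right)^{\lambda_j}$

with $nz(\beta) = \{ j \in \{0,\ldots,n\} : \lambda_j \neq 0 \}$. In the uninteresting case i.e. define .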
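As a numerical illustration, the circuit number can be evaluated directly from the vertex coefficients and the barycentric coordinates of the tail exponent. This is a minimal sketch, assuming that (2.1) has the form $\Theta_f = \prod_{j \in nz(\beta)} (f_{\alpha(j)}/\lambda_j)^{\lambda_j}$ as in [IdW14]; the function name `circuit_number` and the sample data are hypothetical.

```python
import math

def circuit_number(vertex_coeffs, lambdas):
    """Assumed form of (2.1): product over j with lambda_j != 0
    of (f_alpha(j) / lambda_j) ** lambda_j."""
    if not math.isclose(sum(lambdas), 1.0):
        raise ValueError("barycentric coordinates must sum to 1")
    theta = 1.0
    for coeff, lam in zip(vertex_coeffs, lambdas):
        if lam != 0:  # indices outside nz(beta) do not contribute
            theta *= (coeff / lam) ** lam
    return theta

# Hypothetical data: vertex coefficients (1, 1, 1), lambda = (1/3, 1/3, 1/3).
theta = circuit_number([1.0, 1.0, 1.0], [1 / 3, 1 / 3, 1 / 3])
print(round(theta, 9))  # -> 3.0
```

Floating-point arithmetic is used here because the exponents $\lambda_j$ are in general fractional, so the circuit number is usually irrational.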

Much of our work will be centered around writing an ST-polynomial as a sum of nonnegative circuit polynomials (sonc).

Let be the -th standard vector. The class of ST-polynomials covers in essence the class of polynomials of degree considered by Ghasemi and Marshall, see, e.g., [GM12, Theorem 2.3 and Corollary 2.5]. This can be seen by putting for and noting that an with lies in the convex hull of Note that Ghasemi and Marshall admit in the definition of their polynomials larger sets of exponents than . However, the difference is associated to terms that are monomial squares. All their criteria for the sum of squares property and algorithms for lower bounds of polynomials use virtually without exception only the information on the coefficients and with The same will hold in this paper. Everything we could say on basis of the present investigation for more general classes of polynomials would be a trivial consequence of what we find here for ST-polynomials. This is the reason why we concentrate on the class of polynomials defined above.

A fundamental fact is that nonnegativity of a circuit polynomial can be completely decided by comparing its tail coefficient $f_\beta$ with its circuit number $\Theta_f$.

Theorem 2.3 ([IdW14], Theorem 3.8).

Let be a circuit polynomial in standard form and its circuit number, as defined in (2.1). Then the following are equivalent:

  1. is nonnegative.

  2. and  or   and  or  

Note that (2) can be equivalently stated as: or or is a sum of monomial squares. At this point we remark that the definition of the circuit number slightly differs from the one used in [IdW14]. The reason is that is not necessarily assumed to be an interior point as it is in [IdW14]. The difference between the two definitions is explained by [IdW14, Lemma 3.7].
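The criterion of Theorem 2.3 amounts to a single comparison once the circuit number is known. The sketch below assumes the statement as in [IdW14, Theorem 3.8]: a circuit polynomial is nonnegative if and only if $|f_\beta| \leq \Theta_f$ when $\beta$ is not even, or $f_\beta \geq -\Theta_f$ when $\beta$ is even. The function name, the tolerance parameter, and the sample polynomials are illustrative assumptions.

```python
def is_nonnegative_circuit(vertex_coeffs, lambdas, beta, f_beta, tol=1e-9):
    """Decide nonnegativity of a circuit polynomial via its circuit number.

    vertex_coeffs: coefficients f_alpha(0), ..., f_alpha(n) (nonnegative).
    lambdas:       barycentric coordinates of beta w.r.t. the vertices.
    beta:          tail exponent as a tuple of nonnegative integers.
    f_beta:        tail coefficient.
    tol:           slack for floating-point comparison (an assumption of this sketch).
    """
    theta = 1.0
    for coeff, lam in zip(vertex_coeffs, lambdas):
        if lam != 0:
            theta *= (coeff / lam) ** lam
    if all(b % 2 == 0 for b in beta):
        # Even tail exponent: only a negative coefficient can destroy nonnegativity.
        return f_beta >= -theta - tol
    # Non-even tail exponent: the sign of the monomial can flip, so bound |f_beta|.
    return abs(f_beta) <= theta + tol

# 1 + x^2 + c*x has circuit number 2, so it is nonnegative exactly for |c| <= 2.
print(is_nonnegative_circuit([1.0, 1.0], [0.5, 0.5], (1,), -2.0))  # -> True
print(is_nonnegative_circuit([1.0, 1.0], [0.5, 0.5], (1,), 2.5))   # -> False
```

The univariate case agrees with the elementary identity $1 \pm 2x + x^2 = (1 \pm x)^2$.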

Writing a polynomial as a sum of nonnegative circuit polynomials is a certificate of nonnegativity. Let us denote by sonc the class of polynomials that are sums of nonnegative circuit polynomials, as well as the property of a polynomial to be in this class. We show that this class brings new insights to polynomial optimization and hence deserves a place among the well-established classes sos (sums of squares) and sobs (sums of binomial squares). For further details about sonc’s see [dW15, IdW14].

Example 2.4.

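Consider the family $f_a = 1 + x^2y^4 + x^4y^2 + a\,x^2y^2$ with $a \in \mathbb{R}$. There exists a smallest negative $a$ such that $f_a$ is nonnegative. For negative $a$ the corresponding polynomial is a circuit polynomial, since $\Delta(f_a) = \{(2,2)\}$ and $(2,2) = \frac{1}{3}(0,0) + \frac{1}{3}(2,4) + \frac{1}{3}(4,2)$. We obtain the circuit number $\Theta_{f_a} = 3$.

Thus, $f_a$ is nonnegative for $a \geq -3$ but for no smaller real number. The polynomial obtained for $a = -3$ is the well-known Motzkin polynomial, which is known not to be a sum of squares.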
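The computation behind this example can be written out in full. The block below is a worked sketch, assuming the family is the standard Motzkin family $1 + x^2y^4 + x^4y^2 + a\,x^2y^2$ and that the circuit number has the product form of (2.1); the last line recalls the arithmetic-geometric mean inequality, which explains why the threshold is exactly the circuit number.

```latex
\begin{align*}
(2,2) &= \tfrac{1}{3}(0,0) + \tfrac{1}{3}(2,4) + \tfrac{1}{3}(4,2),
  \qquad \lambda_0 = \lambda_1 = \lambda_2 = \tfrac{1}{3}, \\
\Theta &= \prod_{j=0}^{2} \left(\frac{1}{1/3}\right)^{1/3}
        = 3^{1/3} \cdot 3^{1/3} \cdot 3^{1/3} = 3, \\
\tfrac{1}{3}\left(1 + x^2y^4 + x^4y^2\right)
      &\geq \left(1 \cdot x^2y^4 \cdot x^4y^2\right)^{1/3} = x^2y^2
  \quad\Longrightarrow\quad 1 + x^2y^4 + x^4y^2 - 3\,x^2y^2 \geq 0 .
\end{align*}
```

Equality in the last inequality holds at $|x| = |y| = 1$, so the bound $a \geq -3$ cannot be improved.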

Example 2.5.

In [FK11, Theorem 2.3] forms (homogeneous polynomials) denoted by with are analyzed concerning nonnegativity. is an instance of a circuit polynomial (in [FK11] called elementary diagonal minus tail form) with in the sense of Theorem 2.3. Since we obtain

as the threshold value for nonnegativity.

In the earlier paper [IdW14], a criterion for a polynomial with simplex Newton polytope to be a sonc was established.

Theorem 2.6.

[IdW14, Corollary 7.4] Let be a nonnegative ST-polynomial in standard notation and . If there exists a point such that for all , then is a sonc and the circuit polynomials entering in this sonc decomposition will all have the same Newton polytope as .

The relation between sonc’s and sobs is clarified by using definitions and results from Reznick [Rez89]. Note that we find it convenient to remain near to Reznick’s own notation in this discussion till the end of the proof of Proposition 2.7. He defines, given a set the sets of averages

Given a set , is -mediated if see [Rez89, p. 433,438]. He then shows that there exists, given a maximal -mediated set that contains every -mediated set. Reznick explains things in the context of “frameworks”, which are sets of even lattice points that all have the same 1-norm. But his algorithmic construction of works literally for any finite set of even lattice points. To find one begins with and constructs via inductively a contracting sequence of sets . As is finite, this sequence becomes stationary at a set which is shown to be the required If the convex hull of is a simplex and then Reznick calls an $H$-trellis [Rez89, p436c1]. The “$H$” is borrowed from the fact that the Hurwitz form has the standard simplex as its Newton polytope. The standard simplex has the property that its corresponding satisfies . We shall speak of an $H$-simplex.
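Reznick's iterative construction is easy to carry out for small examples. The following is a minimal sketch for triangles in the plane, assuming the usual definitions from [Rez89]: the set of averages of a set $L$ consists of the midpoints of pairs of distinct even points of $L$, the iteration starts with all lattice points in the convex hull of the vertices, and each step keeps the vertices together with the averages of the current set. The function names and the restriction to dimension two are assumptions of this sketch.

```python
from itertools import product

def cross(o, a, b):
    """Twice the signed area of the triangle (o, a, b)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def lattice_points_in_triangle(v0, v1, v2):
    """All integer points in the closed, nondegenerate triangle with the given vertices."""
    xs, ys = (v0[0], v1[0], v2[0]), (v0[1], v1[1], v2[1])
    points = set()
    for p in product(range(min(xs), max(xs) + 1), range(min(ys), max(ys) + 1)):
        signs = (cross(v0, v1, p), cross(v1, v2, p), cross(v2, v0, p))
        if all(s >= 0 for s in signs) or all(s <= 0 for s in signs):
            points.add(p)
    return points

def averages(points):
    """Midpoints of pairs of distinct even lattice points of the given set."""
    even = [p for p in points if all(c % 2 == 0 for c in p)]
    return {tuple((a + b) // 2 for a, b in zip(s, t))
            for i, s in enumerate(even) for t in even[i + 1:]}

def maximal_mediated_set(v0, v1, v2):
    """Iterate L -> averages(L) united with the vertices until the sequence is stationary."""
    vertices = {v0, v1, v2}
    current = lattice_points_in_triangle(v0, v1, v2)
    while True:
        following = averages(current) | vertices
        if following == current:
            return current
        current = following

# Scaled standard simplex: every lattice point survives.
print(len(maximal_mediated_set((0, 0), (4, 0), (0, 4))))  # -> 15
# Newton polytope of the Motzkin polynomial: the interior point (2, 2) is removed.
print(sorted(maximal_mediated_set((0, 0), (2, 4), (4, 2))))
# -> [(0, 0), (1, 2), (2, 1), (2, 4), (3, 3), (4, 2)]
```

The second output consists only of the vertices and the edge midpoints; the exponent $(2,2)$ of the Motzkin tail term is not in the maximal mediated set, which is consistent with the Motzkin polynomial not being a sum of squares.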

We will apply Reznick’s construction to the vertex set of the Newton polytope of an ST-polynomial and write, with slight abuse of notation, for As mentioned, the papers [FK11, GM12] deal exclusively with polynomials in which consists of the scaled standard vectors and, possibly, To put the results of those papers into perspective, we show that the inhomogeneous simplices generated by the origin and the standard vectors, just as their homogeneous counterparts, are also $H$-simplices.

Proposition 2.7.

is a nonhomogeneous -simplex; that is, Every subset of again defines an -simplex.

Proof.

We adapt Reznick’s proof who shows an analogous fact for the case It is clear that is a nonhomogeneous trellis in the sense that the vertices define a simplex which in our case is full dimensional. By Reznick’s criterion [Rez89, p438c-6] and notation we have to show that . (The reader should note that Reznick’s criterion is actually false in the context said there - frameworks - but it is true for trellises, in particular .) The points in have at most one nonzero entry. If this happens for a point in , then the two distinct points in with average must also have at most one nonzero entry at the same position as . But then distinctness implies and so So Now consider a point

Case 1: has only one nonzero coordinate. Then for some integer with Points in whose average might yield are necessarily points of the form If is odd then and are even integers in and is the desired representation. If is even, then we may conclude analogously using instead.

Case 2: has two or more nonzero coordinates. Then we may assume by symmetry and for ease of future notation, that Since we have and We define and find the unique such that Define if is odd and if is even.

Now we let and Assume Then and hence also since otherwise Thus, If is odd, then .

Therefore, in all cases and So, are even lattice points whose average is Finally note that we have as a convex combination of points in yielding . Thus, showing Similarly, one shows that Thus and we have proven the first part.

It is clear by deleting zeros in the coordinates which are not equal to that subsets of the form will again define -simplices. From this and Reznick’s own result the claim concerning subsets follows. ∎

Example 2.8.

Figure 1 shows a scaled standard simplex for the case

Figure 1. The $H$-simplex of Example 2.8. All lattice points are contained in the corresponding maximal mediated set.

Using [Rez89, Corollary 4.9], which gives a necessary and sufficient criterion when “simplicial agiforms” are sobs, we developed in [IdW14] the following theorem.

Theorem 2.9 ([IdW14], Theorem 5.2).

Let be a nonnegative circuit polynomial in standard notation, as defined in (2.1). Then the following statements are equivalent:

  1. is a sum of squares,

  2. is a sum of binomial squares,

  3. .

From Theorems 2.6 and 2.9 we obtain the following corollary.

Corollary 2.10 ([IdW14], Corollary 7.4).

Let be an ST-polynomial in standard form such that . If there exists with for all and , then is nonnegative if and only if is a sum of binomial squares.

Thus, by Theorem 2.9, a circuit polynomial with $H$-simplex Newton polytope is nonnegative if and only if it is a sum of squares. For the very special case of the scaled standard simplex and expressed in the homogeneous case this is the main result of [FK11].

In [IdW14, Theorem 5.9], sufficient conditions based on 2-normality of polytopes and toric geometry are given for a simplex to be an $H$-simplex. In particular, every sufficiently large simplex with even vertices is an $H$-simplex in ; see [IdW14, Section 5.1] for details.

Example 2.11.

Let with . Note that the interior lattice point has the barycentric coordinates in terms of the vertices of . By Theorem 2.3 is nonnegative if and only if Once it is determined that is nonnegative, the question whether or not is a sum of squares depends solely on the lattice point configuration of New It is easy to check that Hence, in particular the “inner” term has an exponent . Therefore, by Theorem 2.9, is a sum of binomial squares if is nonnegative.

As mentioned before, the results of the preliminaries are, up to slight variations, taken from the article [IdW14]. For further background on the results used here, the interested reader should particularly focus on Section 3 for nonnegativity of circuit polynomials, Section 5 for the relation to sos, and Section 7 for the structure of the sonc cone in [IdW14]. Moreover, a short overview of these results can be found in the Oberwolfach report [dW15].

3. Main Results

In this section, we provide sufficient criteria on the coefficients of a polynomial which imply that is a sum of nonnegative circuit polynomials or a sum of (binomial) squares and therefore that is nonnegative. We introduce a new lower bound for nonnegativity, which we will later relate to geometric programs.

Theorem 3.1.

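Let $f$ be an ST-polynomial in standard form. Assume that for every pair $(\beta, j)$ with $\beta \in \Delta(f)$ and $j \in nz(\beta)$ there exists an $a_{\beta,j} \geq 0$ such that the following holds:

  1. $|f_\beta| \leq \prod_{j \in nz(\beta)} \left( \frac{a_{\beta,j}}{\lambda_j^{(\beta)}} \right)^{\lambda_j^{(\beta)}}$ for every $\beta \in \Delta(f)$,

  2. $f_{\alpha(j)} \geq \sum_{\beta \in \Delta(f),\, j \in nz(\beta)} a_{\beta,j}$ for all $0 \leq j \leq n$.

Then $f$ is a sum of nonnegative circuit polynomials (sonc), which all have faces of $\operatorname{New}(f)$ as Newton polytopes. If in addition then $f$ is a sum of binomial squares.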

Proof.

Fix an Then and the condition (1) guarantees that if , then Together with the nonnegativity of the this guarantees that is a circuit polynomial; see (ST4) in Definition 2.1. The circuit number of the polynomial is As by hypothesis (ST2) or condition (1) implies by Theorem 2.3 that is nonnegative and by summing over all we obtain the sonc

For each the expression is a monomial square and hence nonnegative by condition (2). Adding these expressions to one of the circuit polynomials, call it we get a new circuit polynomial Then Evidently, the Newton polytopes of the circuit polynomials in this sum are all faces of the polytope and we are done with the first part.

Finally, if the are in , then by Theorem 2.9 the polynomials are sums of binomial squares. Thus, by construction is also a sum of binomial squares. Therefore, is a sum of binomial squares. ∎

From this theorem we can deduce the following sufficient condition for the existence of a sonc-decomposition of an ST-polynomial. This condition depends on the coefficients of the polynomial alone.

Corollary 3.2.

Let be an ST-polynomial in standard notation. Assume that for each Then is a sonc. If in addition then is sobs.

Proof.

Choose in Theorem 3.1 Then the product in condition (1) of Theorem 3.1 is and hence, that condition is satisfied. The corollary follows. ∎
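The coefficient criterion of Corollary 3.2 can be checked mechanically. The sketch below assumes the criterion reads $f_{\alpha(j)} \geq \sum_{\beta \in \Delta(f)} \lambda_j^{(\beta)} |f_\beta|$ for each $0 \leq j \leq n$, which is what the choice $a_{\beta,j} = \lambda_j^{(\beta)} |f_\beta|$ in the proof yields and which matches the shape of [GM12, Corollary 2.5]. The function name and the input layout (a list of pairs of tail coefficient and barycentric coordinates) are illustrative assumptions.

```python
from fractions import Fraction

def satisfies_coefficient_criterion(vertex_coeffs, tail):
    """Check f_alpha(j) >= sum over beta of lambda_j^(beta) * |f_beta| for every j.

    vertex_coeffs: coefficients f_alpha(0), ..., f_alpha(n).
    tail:          list of (f_beta, lambdas) with lambdas the barycentric
                   coordinates of beta relative to alpha(0), ..., alpha(n).
    """
    for j, f_alpha_j in enumerate(vertex_coeffs):
        required = sum(lambdas[j] * abs(f_beta) for f_beta, lambdas in tail)
        if f_alpha_j < required:
            return False
    return True

third = Fraction(1, 3)
# Motzkin polynomial: the criterion holds with equality at every vertex.
print(satisfies_coefficient_criterion([1, 1, 1], [(-3, (third, third, third))]))  # -> True
# A slightly more negative tail coefficient violates it.
print(satisfies_coefficient_criterion([1, 1, 1], [(-4, (third, third, third))]))  # -> False
```

Since the criterion is only sufficient, a `False` result does not by itself show that the polynomial fails to be a sonc; it only means that this particular choice of the $a_{\beta,j}$ does not certify it.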

This corollary yields a result in [GM12] as a consequence.

Corollary 3.3 ([GM12], Corollary 2.5).

Let be a polynomial of degree If

  1. and

  2. , for all ,

then is a sum of squares.

Since the proof of the corollary illustrates some points of the paper [GM12], we give it here for the convenience of the reader.

Proof.

Write the polynomial in the form considered in [GM12] as Of course, this polynomial is a sum of squares if the truncated polynomial obtained by deleting the monomial square terms is a sum of squares. The truncated polynomial is ST. Then has a face of the simplex as its Newton polytope. Since we find and for . Hence, the hypothesis of the present corollary translates into By Proposition 2.7 is an -simplex. So The corollary follows from the previous one. ∎
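The translation step in this proof rests on the explicit barycentric coordinates for the scaled standard simplex. The following worked block is a sketch, assuming the vertices are $\alpha(0) = 0$ and $\alpha(j) = 2d\,e_j$ and that the criterion of Corollary 3.2 has the form $f_{\alpha(j)} \geq \sum_{\beta} \lambda_j^{(\beta)} |f_\beta|$; here $|\beta| = \beta_1 + \cdots + \beta_n$.

```latex
\begin{align*}
\beta &= \frac{2d - |\beta|}{2d} \cdot 0 + \sum_{j=1}^{n} \frac{\beta_j}{2d}\,(2d\, e_j),
  \qquad |\beta| \leq 2d, \\
\lambda_0^{(\beta)} &= \frac{2d - |\beta|}{2d}, \qquad
\lambda_j^{(\beta)} = \frac{\beta_j}{2d} \quad (1 \leq j \leq n), \\
f_{\alpha(0)} &\geq \sum_{\beta \in \Delta(f)} |f_\beta|\, \frac{2d - |\beta|}{2d}, \qquad
f_{\alpha(j)} \geq \sum_{\beta \in \Delta(f)} |f_\beta|\, \frac{\beta_j}{2d} \quad (1 \leq j \leq n).
\end{align*}
```

The two inequalities in the last line have the same shape as the conditions on the constant term and on the coefficients of the pure powers $x_j^{2d}$ in the criterion of Ghasemi and Marshall.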

Theorem 3.1 yields new sufficient criteria for a polynomial to be a sonc as well as to be a sum of (binomial) squares. These criteria depend on the coefficients and the support of the polynomial alone. They significantly extend previous sum of squares criteria given in [FK11, GM12, Las07] since we do not require the assumption that is a scaled standard simplex anymore. All the polynomials treated in the cited literature are covered by the above theorems.

An important step to connect Theorem 3.1 to geometric programming is given in the following theorem.

Theorem 3.4.

Assume again that is an ST-polynomial in standard form and let Suppose that for every there exists an such that:

  1. If , then

Then is a sum of nonnegative circuit polynomials (sonc) whose Newton polytopes are faces of .

Proof.

First, note that the condition (0) is dispensable, provided that we assume the to be chosen such that the other conditions are well-defined. In fact, by supposing this, condition (0) can be deduced from conditions (1) and (3). Note that is an ST-polynomial again, since the right hand side of condition (3) is nonnegative. It is sufficient to show that the conditions given in the present theorem, which only considers with , imply the existence of as in Theorem 3.1 (which also includes an ), when the latter is formulated for in place of As and differ in their constant term alone we observe that and .

We fix an and investigate two cases:

Case : Then . Hence, we have to consider condition (1) of Theorem 3.4. Since the additional assumption under the product in condition (1) is obsolete. This means that the under consideration satisfies condition (1) of Theorem 3.1 independent of the choice of .

Case : Then note that criterion (1) of Theorem 3.1 is via solving for equivalent to the inequality

(3.1)

Once we have found satisfying the conditions (2), (3) of the present theorem, there exists evidently the additional to satisfy this inequality and hence (1) of Theorem 3.1 for this The present conditions (2) coincide for all cases except with the conditions (2) in Theorem 3.1. Condition (2) of Theorem 3.1 for the current polynomial and requires us to find such that Now, as the have to satisfy the inequality (3.1) but are subject to no other conditions, we see that condition (3) guarantees what is required. ∎
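The step "solving for" the constant-term variable can be made explicit. The derivation below is a sketch under the assumption that condition (1) of Theorem 3.1 bounds $|f_\beta|$ by the circuit number built from the $a_{\beta,j}$, and that we are in the case $\lambda_0^{(\beta)} > 0$; the superscript $(\beta)$ on the $\lambda_j$ is dropped for readability.

```latex
\begin{align*}
|f_\beta| &\leq \left(\frac{a_{\beta,0}}{\lambda_0}\right)^{\lambda_0}
  \prod_{\substack{j \in nz(\beta) \\ j \geq 1}} \left(\frac{a_{\beta,j}}{\lambda_j}\right)^{\lambda_j} \\
\Longleftrightarrow\quad
\left(\frac{a_{\beta,0}}{\lambda_0}\right)^{\lambda_0}
  &\geq |f_\beta| \prod_{\substack{j \in nz(\beta) \\ j \geq 1}} \left(\frac{\lambda_j}{a_{\beta,j}}\right)^{\lambda_j} \\
\Longleftrightarrow\quad
a_{\beta,0} &\geq \lambda_0\, |f_\beta|^{1/\lambda_0}
  \prod_{\substack{j \in nz(\beta) \\ j \geq 1}} \left(\frac{\lambda_j}{a_{\beta,j}}\right)^{\lambda_j/\lambda_0} .
\end{align*}
```

Both equivalences use that all $a_{\beta,j}$ with $j \in nz(\beta)$ are positive. The right-hand side of the last line depends only on the $a_{\beta,j}$ with $j \geq 1$, which is why the variable $a_{\beta,0}$ can be eliminated and its smallest admissible value summed over $\beta$ into a single condition on the constant term.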

Finally, we prove the following theorem, which connects the conditions in the previous theorems to the convex cone of sonc’s.

Theorem 3.5.

Let be an ST-polynomial in standard notation. We define:

  • as the supremum of all such that for every there exist nonnegative reals such that the conditions (0) to (3) of Theorem 3.4 are satisfied; and

  • as the supremum of all such that there exist nonnegative circuit polynomials whose Newton polytopes are faces of such that

Then these quantities are equal, i.e.,

Proof.

Consider a real satisfying the conditions defining Then we know by Theorem 3.4 that there exist nonnegative circuit polynomials whose Newton polytopes are faces of such that It follows that and hence

Now consider a real satisfying the conditions defining Let be circuit polynomials as occurring in the equation. With denoting the tail monomial of and obvious definitions of the equation reads in more detail

As and are disjoint and also and are disjoint, a comparison of coefficients of both sides of the equation implies that

We can assume that is minimal. Namely, if in this representation there would exist such that then is a new nonnegative circuit polynomial with its Newton polytope being a face of We can use to replace the subsum above and obtain a representation with less than summands. This is a contradiction.

So, we henceforth assume that and that the circuit polynomials are indexed by If we set , then we have also and we get

From this equality, the nonnegativity of circuit polynomials and Theorem 2.3 we get

(3.2)

From these equations and inequality, we arrange for each reals with satisfying the conditions of Theorem 3.4: we define for the cases If , then Since we have . Thus, by the inequality in (3.2), also This shows that condition (0) is satisfied. In the case we have In this case again the inequality guarantees (1). We see that the condition (2) in Theorem 3.4 is satisfied with equality.

Finally, we investigate condition (3) of Theorem 3.4. For the case we can solve the inequality for obtaining that

So, the equation for above implies condition (3) of Theorem 3.4.

Hence, a real satisfying the conditions defining is also an such that there exist satisfying the conditions of Theorem 3.4. Thus, and consequently This proves the theorem. ∎

In the following we make use of the unified notation corresponding to the equality Ghasemi and Marshall observed a trade-off between the fast solvability of geometric programs in comparison to semidefinite programs and the fact that bounds obtained by geometric programs are worse than $f_{sos}$ for general polynomials $f$ [GM12]. Here, we conclude that this trade-off does not occur for polynomials with simplex Newton polytope satisfying the conditions of Theorem 2.6. Surprisingly, in this case the bound will be at least as good as the bound . Note that the special instance and being the standard simplex with edge length 2d was already observed by Ghasemi and Marshall; see [GM12, Corollary 3.4].

Corollary 3.6.

Let be an ST-polynomial in standard form and . Suppose there exists such that for all

  1. Then

  2. If additionally , then

Proof.

The polynomial is an ST-polynomial again which is nonnegative and has infimum Clearly, . Thus, for Hence, satisfies the hypotheses of Theorem 2.6. Theorem 2.6 guarantees a sonc-decomposition for . Since is a polynomial that attains negative values for the polynomial cannot have a representation as a sonc. In other words, from the definitions of and we get

This shows the first statement.

For the second statement the proof is quite similar. Under the given additional hypothesis, we have Therefore, is a sum of binomial squares by Corollary 2.10. If , then assumes negative values and thus cannot be a sum of squares. Hence, from the definitions of and we get

4. Geometric Programming

In this section, we prove that the number can indeed be obtained by a geometric program, which we introduce first.

Definition 4.1.

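A function $p : \mathbb{R}^n_{>0} \to \mathbb{R}$ of the form $p(\mathbf{z}) = c\, z_1^{\alpha_1} \cdots z_n^{\alpha_n}$ with $c > 0$ and $\alpha_i \in \mathbb{R}$ is called a monomial (function). A sum $\sum_{i=0}^{k} c_i\, z_1^{\alpha_1(i)} \cdots z_n^{\alpha_n(i)}$ of monomials with $c_i > 0$ is called a posynomial (function).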
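The reason posynomials lead to tractable optimization problems is that they become convex after the substitution $z_i = e^{y_i}$ and taking logarithms. The following minimal sketch evaluates a posynomial and its logarithmic form; it assumes the standard definitions from [BKVH07], and the function names as well as the sample data are illustrative.

```python
import math

def monomial(c, exponents, z):
    """Evaluate c * z_1**a_1 * ... * z_n**a_n for c > 0 and all z_i > 0."""
    if c <= 0 or any(zi <= 0 for zi in z):
        raise ValueError("monomial functions need c > 0 and positive arguments")
    return c * math.prod(zi ** ai for zi, ai in zip(z, exponents))

def posynomial(terms, z):
    """Evaluate a sum of monomials; terms is a list of (c, exponents) pairs."""
    return sum(monomial(c, a, z) for c, a in terms)

def log_posynomial(terms, y):
    """log of the posynomial at z = exp(y): a log-sum-exp of affine functions of y."""
    affine = [math.log(c) + sum(ai * yi for ai, yi in zip(a, y)) for c, a in terms]
    shift = max(affine)  # shift for numerical stability
    return shift + math.log(sum(math.exp(t - shift) for t in affine))

# Hypothetical posynomial 2 * z1 * z2**(-0.5) + 0.5 * z2**2 at z = (1.5, 2.0).
terms = [(2.0, (1.0, -0.5)), (0.5, (0.0, 2.0))]
z = (1.5, 2.0)
direct = math.log(posynomial(terms, z))
transformed = log_posynomial(terms, [math.log(zi) for zi in z])
print(math.isclose(direct, transformed))  # -> True
```

Real exponents, including negative and fractional ones, are allowed; only the coefficients and the arguments have to be positive.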

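A geometric program has the following form.

(4.1)   minimize $p_0(\mathbf{z})$ subject to $p_i(\mathbf{z}) \leq 1$ for $1 \leq i \leq m$ and $q_j(\mathbf{z}) = 1$ for $1 \leq j \leq l$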