Certifying polynomial nonnegativity via hyperbolic optimization
We describe a new approach to certifying the global nonnegativity of multivariate polynomials by solving hyperbolic optimization problems, a class of convex optimization problems that generalize semidefinite programs. Building on the work of Kummer, Plaumann, and Vinzant (‘Hyperbolic polynomials, interlacers, and sums of squares’, Math. Prog. 153(1):223–245, 2015), we show how to produce families of nonnegative polynomials (which we call hyperbolic certificates of nonnegativity) from any hyperbolic polynomial. We investigate the pairs $(n,d)$ for which there is a hyperbolic polynomial of degree $d$ in $n$ variables such that an associated hyperbolic certificate of nonnegativity is not a sum of squares. If $d\geq 4$ we show that this occurs whenever $n\geq 4$. In the degree three case, we find an explicit hyperbolic cubic in $43$ variables that gives hyperbolic certificates that are not sums of squares. As a corollary, we obtain the first known hyperbolic cubic no power of which has a definite determinantal representation. Our approach also allows us to show that given a cubic $p$ and a direction $e$, the decision problem “Is $p$ hyperbolic with respect to $e$?” is co-NP hard.
The problem of deciding nonnegativity of a multivariate polynomial is central to solving optimization and feasibility problems expressed in terms of polynomials. These, in turn, arise naturally in a wide range of applications including control systems and robotics, combinatorial optimization, game theory, and quantum information (see, e.g., [BPT12] for an introduction to these ideas and a discussion of numerous applications).
One of the benefits of a polynomial formulation of an optimization problem is that one can then construct a hierarchy of more tractable ‘relaxations’ of the problem, based on the fact that a sufficient condition for a multivariate polynomial to be nonnegative is that it is a sum of squares of polynomials. Deciding whether a polynomial is a sum of squares can be formulated as a semidefinite programming feasibility problem [Sho87, Nes00, Par03, Las01]. While such semidefinite programming-based relaxations of semialgebraic problems have proven very useful for a range of problems, there has been notable recent progress (both qualitative [Sch18] and quantitative [LRS15]) demonstrating the limitations of semidefinite programming-based relaxations of polynomial optimization problems.
Hyperbolic optimization problems (or hyperbolic programs) are a family of convex optimization problems that generalize semidefinite optimization problems [Gül97]. These involve maximizing a linear functional over the intersection of an affine subspace and a hyperbolicity cone (a convex cone constructed from a hyperbolic polynomial, which is a multivariate polynomial with certain real-rootedness properties that we define precisely in Section 2). Hyperbolic programs enjoy many of the good algorithmic properties of semidefinite programs [Gül97]. Despite this, algorithms for hyperbolic programming are less mature than those for semidefinite programming. Indeed, much of the recent research on hyperbolic programming has focused on trying to reformulate various classes of hyperbolic programs as semidefinite programs, or related geometric questions associated with the ‘generalized Lax conjecture’ and its variants [Ami16, Brä14, KPV15, SP15, Sau18]. This lack of algorithmic development may be the result of not having generic ways to produce hyperbolic programming-based formulations and/or relaxations of polynomial optimization problems, beyond those that are already captured by sum-of-squares relaxations.
This paper introduces ways to construct families of nonnegative polynomials that can be searched over via hyperbolic optimization. We call these hyperbolic certificates of nonnegativity. By appropriately choosing the data that specify such a family of nonnegative polynomials, we can recover sums of squares certificates of nonnegativity. As such, using the ideas in this paper, we could construct hyperbolic programming-based relaxations of polynomial feasibility and optimization problems by replacing sums of squares relaxations with relaxations based on hyperbolic certificates of nonnegativity. One significant challenge, not addressed in this paper, is that many choices need to be made to specify a family of hyperbolic certificates of nonnegativity. Given a specific structured class of polynomial optimization problems, it is currently unclear which choices might be appropriate to obtain strong hyperbolic programming-based relaxations using the framework presented in this paper.
A main focus of the paper is the construction of hyperbolic polynomials for which the associated hyperbolic certificates of nonnegativity are not sums of squares. These are of interest because they have the potential to form the basis of hyperbolic programming-based relaxations of polynomial optimization problems that are stronger than semidefinite programming-based relaxations of comparable complexity.
We now discuss the contributions of the paper in more technical detail, and indicate where the main results appear in the paper.
Hyperbolic certificates of nonnegativity
Given a hyperbolic polynomial $p$ of degree $d$ in $n$ variables and a direction of hyperbolicity $e$ (see Section 2.2 for the definition of these terms), we construct a polynomial map
$$\phi_{p,e}:\mathbb{R}^n\times\mathbb{R}^d\to(\mathbb{R}^n)^*$$
from $\mathbb{R}^n\times\mathbb{R}^d$ to linear functionals on $\mathbb{R}^n$ such that $\phi_{p,e}(x,y)[u]\geq 0$ for all $(x,y)$ if and only if $u$ is in the hyperbolicity cone associated with $p$ and $e$ (Theorem 3.7). This slightly extends a related construction due to Kummer, Plaumann, and Vinzant [KPV15].
If and are polynomial maps then is a nonnegative polynomial in whenever is in the hyperbolicity cone associated with and . If a polynomial can be written this way for a choice of hyperbolic polynomial , direction of hyperbolicity , and polynomial maps and we say it has a hyperbolic certificate of nonnegativity (see Definition 3.12). We can search for such a description of a polynomial (for fixed , , , and ) by solving a hyperbolic optimization problem. Moreover, by appropriately specifying these data, we can recover sums of squares certificates of nonnegativity (Proposition 3.13).
Hyperbolic and SOS-hyperbolic polynomials
If all nonnegative polynomials of the form are sums of squares, we say that is SOS-hyperbolic with respect to (see Definition 4.2). From the point of view of this paper, we are most interested in hyperbolic polynomials that are not SOS-hyperbolic, since these give rise to tractable families of nonnegative polynomials that go beyond sums of squares.
In Sections 4 and 5 we investigate the degrees , and numbers of variables , for which there is a hyperbolic polynomial that is not SOS-hyperbolic. In Section 4 we give a complete characterization of when this happens for by showing that the specialized Vámos polynomial,
is hyperbolic but not SOS-hyperbolic (Proposition 4.9), and showing how to take a hyperbolic but not SOS-hyperbolic polynomial and increase its number of variables, or its degree, and maintain this property (Propositions 4.10 and 4.13). Variations of the specialized Vámos polynomial have been studied by Brändén [Brä11], Kummer [Kum16], Kummer, Plaumann, and Vinzant [KPV15], Burton, Vinzant, and Youm [BVY14], and Amini and Brändén [AB18]. Notably, [KPV15, Example 5.11] shows that a closely related quartic in five variables is, in our language, hyperbolic but not SOS-hyperbolic.
In Section 5 we study the case $d=3$ and obtain a partial classification of when hyperbolicity and SOS-hyperbolicity coincide. We show how to construct, from a simple graph, a one-parameter family of cubic polynomials that are hyperbolic if and only if the parameter is bounded by the clique number of the graph. This observation allows us to establish co-NP hardness of deciding hyperbolicity of cubic polynomials (Theorem 5.4). By choosing the graph to be the icosahedral graph (which has 12 vertices and 30 edges) we obtain a cubic
in variables that is hyperbolic but not SOS-hyperbolic.
The known cases where all hyperbolic polynomials are SOS-hyperbolic occur because if a power of a polynomial has a definite determinantal representation then it is SOS-hyperbolic (Proposition 4.7). This is a common generalization of results due to Kummer, Plaumann, and Vinzant [KPV15], and Netzer, Plaumann, and Thom [NPT13]. This means that any hyperbolic polynomial that is not SOS-hyperbolic gives an example of a hyperbolic polynomial for which no power has a definite determinantal representation. In particular, the cubic polynomial (1) associated with the icosahedral graph appears to be the first explicit hyperbolic cubic with this property reported in the literature.
Parameterizing the dual cone
The dual cones of hyperbolicity cones are, in general, not well understood. In Section 7 we show that the polynomial map almost (i.e., up to closure) parameterizes the dual of the hyperbolicity cone associated with and (Theorem 7.2). We then discuss specific situations in which the map exactly parameterizes the closed dual cone. These can be thought of as new ‘hidden convexity’ results, where the image of a polynomial map is, in fact, a convex set.
2.1 Basic notation
Let denote polynomials with real coefficients, homogeneous of degree in the indeterminates . Let denote real symmetric matrices and let denote symmetric matrices with entries in . If and then is the directional derivative of in the direction . For brevity we write for second-order derivatives.
If is a symmetric matrix, we write to mean that is positive semidefinite. We use to denote the -dimensional real vector space and to denote the dual space of linear functionals on . If and we use the notation for the image of under . Throughout, we let denote the standard basis vectors, so that is zero except for the th entry, which is one. Occasionally it is convenient to use as the standard basis for . If we use to denote the Euclidean norm .
2.2 Hyperbolic polynomials, hyperbolic eigenvalues, and hyperbolicity cones
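A homogeneous polynomial $p$ of degree $d$ in $n$ variables is hyperbolic with respect to $e\in\mathbb{R}^n$ if $p(e)>0$ and, for all $x\in\mathbb{R}^n$, the univariate polynomial $t\mapsto p(x+te)$ has only real roots. (Requiring that $p(e)>0$, rather than the usual requirement that $p(e)\neq 0$, is merely a convenient normalization.) Throughout we let $\mathcal{H}_{n,d}(e)$ denote the set of polynomials homogeneous in $n$ variables of degree $d$ that are hyperbolic with respect to $e$.
If $p\in\mathcal{H}_{n,d}(e)$ and $x\in\mathbb{R}^n$ we denote by $\lambda_1(x)\geq\cdots\geq\lambda_d(x)$ the roots of $t\mapsto p(te-x)$, and often refer to these as the hyperbolic eigenvalues of $x$. These depend on the choice of $e$, but we will usually suppress this in our notation. Associated with a hyperbolic polynomial is the closed hyperbolicity cone
$$\Lambda_+(p,e)=\{x\in\mathbb{R}^n:\lambda_i(x)\geq 0\ \text{for}\ i=1,\ldots,d\}.$$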
This is a convex cone, a result due to Gårding [Går59]. A hyperbolic polynomial is complete if . The hyperbolicity cones of complete hyperbolic polynomials are pointed, in the sense that .
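As a concrete illustration of hyperbolic eigenvalues and the hyperbolicity cone, the following minimal sketch works with the Lorentz quadratic $x_1^2-x_2^2-\cdots-x_n^2$ and the direction $e=(1,0,\ldots,0)$. The choice of example, the function names, and the convention that the eigenvalues are the roots of $t\mapsto p(te-x)$ are assumptions made for illustration; they are not taken from the paper.

```python
import math

def lorentz(x):
    """Illustrative polynomial p(x) = x_1^2 - x_2^2 - ... - x_n^2."""
    return x[0] ** 2 - sum(v * v for v in x[1:])

def hyperbolic_eigenvalues(x):
    """Roots of t -> p(t*e - x) with e = (1, 0, ..., 0), largest first.

    Here p(t*e - x) = (t - x_1)^2 - ||x_rest||^2, so the two roots are
    x_1 + ||x_rest|| and x_1 - ||x_rest||.
    """
    r = math.sqrt(sum(v * v for v in x[1:]))
    return (x[0] + r, x[0] - r)

def in_hyperbolicity_cone(x, tol=1e-12):
    """x lies in the closed cone iff every hyperbolic eigenvalue is >= 0."""
    return min(hyperbolic_eigenvalues(x)) >= -tol

if __name__ == "__main__":
    for point in [(2, 1, 1), (1, 1, 1), (1, 0, 0)]:
        print(point, hyperbolic_eigenvalues(point), in_hyperbolicity_cone(point))
```

For this example the cone is the second-order cone $\{x : x_1\geq \|(x_2,\ldots,x_n)\|\}$, and the product of the two eigenvalues recovers $p(x)$.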
The following result describes how the eigenvalue functions change, and appears in a number of slightly different formulations in the literature [Gül97, HLJ13, ABG70]. Note that the functions appearing below are just a particular choice of ordering of the eigenvalues of .
Theorem 2.1 ([ABG70, Lemma 3.27]).
If and then
where the functions are real analytic functions of with the property that if then for all .
The nonnegativity of the derivatives of the whenever is in the hyperbolicity cone is the key property from which essentially all nonnegativity statements in this paper can be derived.
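The monotonicity property can be checked numerically on a small example. The sketch below again uses the Lorentz quadratic with $e=(1,0,\ldots,0)$ (an illustrative choice, with the eigenvalues taken to be the roots of $t\mapsto p(te-x)$) and samples the eigenvalues of $x+tu$ on a grid of values of $t$. It is a sanity check under these assumptions, not a proof, and the specific points and helper names are hypothetical.

```python
import math

def eigenvalues(x):
    """Hyperbolic eigenvalues of x for the Lorentz quadratic, e = (1, 0, ..., 0)."""
    r = math.sqrt(sum(v * v for v in x[1:]))
    return (x[0] + r, x[0] - r)

def eigenvalue_paths(x, u, ts):
    """Sample (lambda_max, lambda_min) along the line t -> x + t*u."""
    lams = [eigenvalues([xi + t * ui for xi, ui in zip(x, u)]) for t in ts]
    return [l[0] for l in lams], [l[1] for l in lams]

def is_nondecreasing(values, tol=1e-12):
    """True if the sampled sequence never decreases (up to tol)."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))

x = (0.3, -1.0, 2.0)
ts = [k / 10 for k in range(-30, 31)]
u_in_cone = (2.0, 1.0, -1.0)   # 2 >= sqrt(2), so u is in the cone
u_outside = (0.0, 1.0, 0.0)    # 0 < 1, so u is not in the cone

if __name__ == "__main__":
    hi, lo = eigenvalue_paths(x, u_in_cone, ts)
    print("u in cone:", is_nondecreasing(hi), is_nondecreasing(lo))
    hi, lo = eigenvalue_paths(x, u_outside, ts)
    print("u outside cone:", is_nondecreasing(hi), is_nondecreasing(lo))
```

In this example both eigenvalue paths are nondecreasing when $u$ is in the cone, while the smallest eigenvalue fails to be monotone for the direction outside the cone.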
2.3 Hyperbolic programming
If is hyperbolic with respect to , and is a linear functional, a convex optimization problem of the form
is known as a hyperbolic optimization problem. Since the work of Güler [Gül97] it has been known that $-\log p$ is a self-concordant barrier function for the hyperbolicity cone associated with $p$ and $e$. As such, hyperbolic optimization problems can be solved using interior point methods as long as the polynomial $p$ can be evaluated efficiently. More recently, other algorithmic approaches to solving hyperbolic optimization problems have been developed, including primal-dual interior point methods [MT14], affine scaling methods [RS14], first-order methods based on applying a subgradient method to a transformation of the problem [Ren16], and accelerated modifications tailored for hyperbolic programs [Ren17].
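To make the role of the barrier concrete, the following sketch evaluates the logarithmic barrier $-\log p$ and its gradient for the Lorentz quadratic. The example polynomial and the function names are assumptions for illustration only; an interior point method would use such evaluations inside Newton-type iterations, which are not shown here.

```python
import math

def p(x):
    """Illustrative hyperbolic polynomial: the Lorentz quadratic (degree 2)."""
    return x[0] ** 2 - sum(v * v for v in x[1:])

def barrier(x):
    """F(x) = -log p(x) on the interior of the cone, +inf elsewhere.

    For this example the interior is {x : x_1 > 0 and p(x) > 0}.
    """
    if x[0] <= 0 or p(x) <= 0:
        return math.inf
    return -math.log(p(x))

def barrier_gradient(x):
    """grad F(x) = -grad p(x) / p(x), with grad p = (2 x_1, -2 x_2, ..., -2 x_n)."""
    value = p(x)
    grad_p = [2 * x[0]] + [-2 * v for v in x[1:]]
    return [-g / value for g in grad_p]

if __name__ == "__main__":
    print(barrier((1, 0.5, 0)), barrier((1, 0.999, 0)), barrier((1, 1, 0)))
    print(barrier_gradient((2, 1, 0)))
```

The barrier grows without bound as the boundary of the cone is approached. Since $p$ is homogeneous of degree $d=2$, Euler's identity gives $\langle \nabla F(x), x\rangle = -2$ at every interior point, which is a convenient check on the gradient.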
2.4 Definite determinantal representations
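If $A_1,\ldots,A_n$ are $d\times d$ symmetric matrices and $e\in\mathbb{R}^n$ satisfies $\sum_{i=1}^n A_ie_i\succ 0$ then the polynomial
$$p(x)=\det\Big(\sum_{i=1}^n A_ix_i\Big)\qquad (2)$$
is homogeneous of degree $d$ and is hyperbolic with respect to $e$. We say that a polynomial has a definite determinantal representation if it can be expressed in the form (2) for some symmetric matrices $A_1,\ldots,A_n$ such that $\sum_{i=1}^n A_ie_i\succ 0$. If $p$ has a definite determinantal representation of the form (2) then its hyperbolicity cone is a spectrahedron, and has the form $\{x\in\mathbb{R}^n:\sum_{i=1}^n A_ix_i\succeq 0\}$.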
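A small worked instance may help. The sketch below uses three hypothetical $2\times 2$ symmetric matrices (chosen for illustration, not taken from the paper) whose determinantal polynomial is $x_1^2-x_2^2-x_3^2$, with $e=(1,0,0)$ mapped to the identity matrix. Membership in the hyperbolicity cone is then tested as a positive semidefiniteness condition on the matrix pencil.

```python
import math

# Hypothetical data: A(x) = x_1*A1 + x_2*A2 + x_3*A3 = [[x1+x2, x3], [x3, x1-x2]].
A = [
    [[1, 0], [0, 1]],
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
]

def pencil(x):
    """Evaluate the symmetric matrix pencil sum_k x_k * A_k."""
    return [[sum(x[k] * A[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def det2(M):
    """Determinant of a 2x2 matrix; here it equals x1^2 - x2^2 - x3^2."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def sym_eigs_2x2(M):
    """Eigenvalues of a symmetric 2x2 matrix in closed form, largest first."""
    mean = (M[0][0] + M[1][1]) / 2
    rad = math.sqrt(((M[0][0] - M[1][1]) / 2) ** 2 + M[0][1] ** 2)
    return (mean + rad, mean - rad)

def in_spectrahedron(x, tol=1e-12):
    """x is in the cone iff the pencil evaluated at x is positive semidefinite."""
    return min(sym_eigs_2x2(pencil(x))) >= -tol

if __name__ == "__main__":
    for point in [(2, 1, 1), (1, 1, 1), (1, 0, 0)]:
        print(point, det2(pencil(point)), in_spectrahedron(point))
```

In this example the eigenvalues of the pencil at $x$ are $x_1\pm\sqrt{x_2^2+x_3^2}$, so the spectrahedron is the three-dimensional second-order cone.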
2.5 Bézoutians and Hankel matrices
The nonnegative polynomials we construct from hyperbolic polynomials will come from the positive semidefiniteness of Bézoutians of certain pairs of polynomials, or of Hankel matrices associated with certain rational functions. In this section we summarize some basic facts about these objects and the relationships between them. These can be found, for instance, in [BMO11]. We give proofs of some of these results in Appendix A to make the paper more self-contained.
If and are univariate polynomials with , define the Hankel matrix
If and are univariate polynomials with , the Bézoutian is the matrix defined via the identity
Note that if then the Bézoutian is zero except in the upper left block.
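The following sketch computes a Bézoutian from coefficient lists. It assumes the convention $\frac{a(s)b(t)-a(t)b(s)}{s-t}=\sum_{i,j}B_{ij}\,s^{i}t^{j}$ with indices starting at zero; the sign and indexing conventions in the text may differ, and all function names are hypothetical. For a polynomial $a$ with real roots $r_1,\ldots,r_d$, this convention gives $B(a,a')=\sum_k v_kv_k^T$, where $v_k$ is the coefficient vector of $a(t)/(t-r_k)$, which makes the positive semidefiniteness explicit.

```python
def poly_from_roots(roots):
    """Coefficients (lowest degree first) of prod_k (t - r_k)."""
    coeffs = [1]
    for r in roots:
        shifted = [0] + coeffs                   # t * p(t)
        scaled = [r * c for c in coeffs] + [0]   # r * p(t)
        coeffs = [s - m for s, m in zip(shifted, scaled)]
    return coeffs

def derivative(a):
    """Coefficients of a'(t), lowest degree first."""
    return [k * a[k] for k in range(1, len(a))]

def bezoutian(a, b):
    """d x d Bezoutian of a (degree d) and b (degree <= d), assumed convention.

    N[i][j] is the coefficient of s^i t^j in a(s)b(t) - a(t)b(s). Dividing by
    (s - t) gives the recursion B[i-1][j] = N[i][j] + B[i][j-1].
    """
    d = len(a) - 1
    b = list(b) + [0] * (d + 1 - len(b))
    N = [[a[i] * b[j] - a[j] * b[i] for j in range(d + 1)] for i in range(d + 1)]
    B = [[0] * d for _ in range(d)]
    for j in range(d):
        for i in range(d, 0, -1):
            carry = B[i][j - 1] if (i < d and j >= 1) else 0
            B[i - 1][j] = N[i][j] + carry
    return B

def gram_from_roots(roots):
    """sum_k v_k v_k^T, where v_k holds the coefficients of a(t) / (t - r_k)."""
    d = len(roots)
    G = [[0] * d for _ in range(d)]
    for k in range(d):
        v = poly_from_roots(roots[:k] + roots[k + 1:])
        for i in range(d):
            for j in range(d):
                G[i][j] += v[i] * v[j]
    return G

if __name__ == "__main__":
    a = poly_from_roots([1, 2, 3])
    print(bezoutian(a, derivative(a)))
    print(gram_from_roots([1, 2, 3]))
    print(bezoutian([1, 0, 1], [0, 2]))  # a(t) = t^2 + 1 has no real roots
```

For $a(t)=t^2+1$ the resulting matrix has a negative diagonal entry, consistent with the fact that $a$ is not real-rooted.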
Under appropriate assumptions on and , the Bézoutian and the Hankel matrix are related by a unimodular congruence transformation.
Suppose is a monic polynomial of degree and is a polynomial of degree at most . If then there exists a unimodular matrix with entries that are linear in the coefficients of , such that
See Appendix A. ∎
Certain linear transformations on polynomials give rise to particularly nice congruence transformations on Bézoutians.
Let and be univariate polynomials of degree at most . Let and be shifted versions of those polynomials. Then
with the convention that if .
3 Hyperbolicity cones as sections of nonnegative polynomials
In this section we show how to construct, from a hyperbolic polynomial , a subspace of polynomials for which the cone of nonnegative polynomials in the subspace is linearly isomorphic to the hyperbolicity cone . Consequently, we can optimize over nonnegative polynomials from this subspace by solving hyperbolic optimization problems.
Our approach is closely related to the following result of Kummer, Plaumann, and Vinzant.
Theorem 3.1 (Kummer, Plaumann, Vinzant [KPV15]).
If is square-free then if and only if
This shows that the hyperbolicity cone is linearly isomorphic to the intersection of the cone of nonnegative polynomials in variables of degree with the subspace spanned by the polynomials
Our variation on Theorem 3.1 is most naturally expressed in terms of the Bézoutian, or alternatively the corresponding Hankel matrix, associated with a polynomial and its directional derivative.
If and let and let . The parameterized Bézoutian is the symmetric matrix with polynomial entries given by
The parameterized Hermite matrix is the symmetric Hankel matrix with polynomial entries given by
Note that and are both linear in . Moreover,
This is (up to an unimportant choice of sign) exactly the parameterized Hermite matrix from [NPT13].
Example 3.3 (Parameterized Hermite matrix for the determinant).
Suppose that is the determinant restricted to symmetric matrices and . If is a symmetric matrix then
This follows from the fact that
The following relationship between the parameterized Bézoutian and Hermite matrices allows positivity statements about parameterized Bézoutians to be transferred to corresponding positivity statements about parameterized Hermite matrices, and vice versa.
If and then where and are both matrices with polynomial entries.
First assume that so that is monic. The result then follows from Proposition 2.2 and the fact that the coefficients of are polynomials in . Moreover, in this case has determinant one. For the general case, write where is monic. Then and so we have from which we see that which has polynomial entries. ∎
The following result, which can be found, for instance, in [NPT13], gives a characterization of hyperbolic polynomials in terms of the parameterized Hermite matrix.
If and then if and only if for all .
By using Proposition 3.4, this characterization of hyperbolicity can also be expressed in terms of the parameterized Bézoutian.
If and then if and only if for all .
Our main result for this section shows that these tests for hyperbolicity can be extended to give a description of the full hyperbolicity cone. We defer the proof until Section 6.
In what follows, it is sometimes convenient to use the following variations on Theorems 3.5 and 3.7, respectively. In some arguments they allow us to reduce the number of variables in certain polynomials by one.
Let , , and let be a codimension one subspace such that . Then if and only if for all , which holds if and only if for all .
If and is a codimension one subspace such that then
Suppose that and . Then there exists a unimodular polynomial matrix such that
It follows immediately that for all if and only if for all whenever is a codimension one subspace of and . Corollary 3.9 then follows from Theorem 3.5 and Corollary 3.6. Similarly Corollary 3.10 then follows from Theorem 3.7.
3.1 Hyperbolic certificates of nonnegativity
One consequence of Theorem 3.7 is that if and then both of the following polynomials
are globally nonnegative in and . By composing the polynomials or with other polynomial maps we obtain further nonnegative polynomials.
We say that a polynomial in variables has a hyperbolic certificate of nonnegativity with respect to if there exists and polynomial maps and such that
Since the parameterized Bézoutian and Hermite matrix are the same up to a unimodular congruence transformation (see Proposition 3.4), there is no difference between using or in Definition 3.12. We can transform from one representation to another by changing appropriately. From now on, we will often write instead of unless we specifically want to work with the Bézoutian formulation.
Any polynomial that has a hyperbolic certificate of nonnegativity is nonnegative due to Theorem 3.7. Moreover, given a polynomial , the problem of searching for a hyperbolic certificate of nonnegativity of can be cast as a hyperbolic feasibility problem:
which aims to find a point in the intersection of the hyperbolicity cone and the affine subspace defined by, for instance, equating coefficients in the polynomial identity (6).
Recovering sums of squares certificates
A homogeneous polynomial is a sum of squares if there is a positive integer and homogeneous polynomials such that for all . Clearly any sum of squares is nonnegative. Furthermore, it is well known that if is the vector of all monomials that are homogeneous of degree in variables then is a sum of squares if and only if
This allows one to search for a sum of squares certificate of the nonnegativity of via solving a semidefinite feasibility problem.
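The Gram matrix characterization can be illustrated on a small example. The sketch below takes the quartic $q(x,y)=x^4+x^2y^2+y^4$, the monomial vector $(x^2,xy,y^2)$, and a hand-picked Gram matrix $Q$; these choices and all names are assumptions for illustration. In practice $Q$ would be found by a semidefinite programming solver, whereas here it is only verified, using exact rational arithmetic.

```python
from fractions import Fraction

half = Fraction(1, 2)

def q(x, y):
    """Illustrative quartic form to be certified."""
    return x ** 4 + x ** 2 * y ** 2 + y ** 4

def monomials(x, y):
    """Monomials of degree 2 in two variables."""
    return [x * x, x * y, y * y]

# Candidate Gram matrix with q = m^T Q m.
Q = [[1, 0, half], [0, 0, 0], [half, 0, 1]]

# Q = sum_k w_k * l_k l_k^T with w_k >= 0 certifies that Q is PSD.
factors = [[1, 0, half], [0, 0, 1]]
weights = [1, Fraction(3, 4)]

def weighted_gram(factors, weights):
    """Assemble sum_k w_k * l_k l_k^T."""
    n = len(factors[0])
    return [[sum(w * l[i] * l[j] for l, w in zip(factors, weights))
             for j in range(n)] for i in range(n)]

def gram_form(Q, x, y):
    """Evaluate m(x, y)^T Q m(x, y)."""
    m = monomials(x, y)
    return sum(Q[i][j] * m[i] * m[j] for i in range(3) for j in range(3))

def sos_value(x, y):
    """Evaluate the explicit decomposition sum_k w_k * (l_k . m(x, y))^2."""
    m = monomials(x, y)
    return sum(w * sum(c * mi for c, mi in zip(l, m)) ** 2
               for l, w in zip(factors, weights))

if __name__ == "__main__":
    grid = [(x, y) for x in range(-2, 3) for y in range(-2, 3)]
    print(Q == weighted_gram(factors, weights))
    print(all(gram_form(Q, x, y) == q(x, y) == sos_value(x, y) for x, y in grid))
```

The decomposition read off from the factors is $q=(x^2+\tfrac12y^2)^2+\tfrac34(y^2)^2$. Because both sides have degree at most four in each variable, agreement on the $5\times 5$ grid implies that the identity holds for all $(x,y)$.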
If is a sum of squares, we will now show how to choose the data (, , and ) in Definition 3.12 to give a hyperbolic certificate of nonnegativity for . This shows that our notion of hyperbolic certificates of nonnegativity captures sums of squares as a special case.
Let be a sum of squares, and let be such that . Then has a hyperbolic certificate of nonnegativity as
From our choice of and , we see that . From Example 3.3,
We have now seen that every sum of squares has a hyperbolic certificate of nonnegativity. In Section 4 we will show that there are polynomials that have hyperbolic certificates of nonnegativity, but that are not sums of squares.
4 Hyperbolic certificates and sums of squares
In this section we study conditions under which polynomials with hyperbolic certificates of nonnegativity are, or are not, sums of squares. We will often phrase this in terms of sums of squares certificates of the positive semidefiniteness of the parameterized Bézoutian (and Hermite matrix), which are matrices with polynomial entries.
A symmetric matrix with polynomial entries is a matrix sum of squares if there exists a positive integer and a matrix with polynomial entries such that .
It is well known that is a matrix sum of squares if and only if the polynomial is a sum of squares in and . We will freely pass between these two equivalent definitions.
The following is the central definition of this section and Section 5. It specifies a class of hyperbolic polynomials for which we have not just a sum of squares certificate of their hyperbolicity, but also a sum of squares description of the hyperbolicity cone.
If we say that is SOS-hyperbolic with respect to if is a matrix sum of squares for all .
We will use the shorthand notation for the collection of polynomials that are homogeneous of degree in variables and SOS-hyperbolic with respect to . If and has a hyperbolic certificate of nonnegativity via an identity of the form , then is a sum of squares. As such, we are particularly interested in hyperbolic polynomials that are not SOS-hyperbolic, since these give us new certificates of nonnegativity.
Table 1 describes our understanding of the values of and for which the sets and coincide. In this section we develop some preparatory results and establish the equality cases of the table, together with the remainder of the first and third columns of the table (corresponding to and ). In Section 5 we focus on the case.
Before giving proofs establishing the entries of the table, we give a number of equivalent characterizations for polynomials that are SOS-hyperbolic with respect to some direction .
Suppose that and let and . Let be an matrix such that the matrix has full rank. Then the following are equivalent
is a matrix sum of squares
is a matrix sum of squares
is a matrix sum of squares
is a matrix sum of squares
If then the hyperbolicity cone can be expressed as the projection of a spectrahedron, i.e., the image of a spectrahedron under a linear map. This is because the cone of matrix sums of squares in is a linear image of the positive semidefinite cone.
If we say that is weakly SOS-hyperbolic with respect to if
is a sum of squares for all .
Clearly if is SOS-hyperbolic with respect to then it is weakly SOS-hyperbolic with respect to . Under what additional assumptions on (if any) are these two notions actually equivalent?
In Proposition 4.7, to follow, we show that if a power of has a definite determinantal representation then is SOS-hyperbolic. This is (at least formally) a slight strengthening of a result of Kummer, Plaumann, and Vinzant [KPV15, Corollary 4.3], which can be rephrased as saying that if a power of has a definite determinantal representation then is weakly SOS-hyperbolic. Proposition 4.7 also generalizes a result of Netzer, Plaumann, and Thom [NPT13, Theorem 1.6], which can be rephrased as saying that if a power of has a definite determinantal representation then is a matrix sum of squares. The proof presented here generalizes the argument of Netzer, Plaumann, and Thom.
If and there is a positive integer such that has a definite determinantal representation then .
We will show that is a matrix sum of squares whenever . It then follows from Proposition 4.3 that is a matrix sum of squares whenever .
Let , and note that
In particular, for all . As such, it suffices to show that is a matrix sum of squares.
Since has a definite determinantal representation, we can write where
for real symmetric matrices such that . If then and so has a positive semidefinite square root . In Example 3.3 we explicitly computed the matrix and obtained
Here the inner product is . This is clearly a Gram matrix with factors that are polynomials in , and so shows that is a matrix sum of squares whenever . ∎
All of the equality signs in Table 1 follow directly from known results about when powers of hyperbolic polynomials have definite determinantal representations. In what follows, for the cases and we give direct proofs, avoiding arguments about determinantal representations. It would be interesting to find a similar direct argument in the case .
If or or then .
The case $(n,d)=(4,3)$ follows from a result of Buckley and Košir [BK07, Theorem 6.4], which says that the square of any smooth hyperbolic cubic form in four variables has a definite determinantal representation. Combining this with the fact that smooth hyperbolic polynomials are dense in all hyperbolic polynomials, it follows by a limiting argument [PV13, Proof of Corollary 4.10] that the square of any hyperbolic cubic in four variables has a definite determinantal representation.
The case follows from the celebrated Helton-Vinnikov theorem [HV07] that (in its homogeneous form [LPR05]) says that has a definite determinantal representation. Here we give an alternative direct argument. If we choose a basis for then, by Proposition 3.11,
for some polynomial matrix . As such, it suffices to show that if then is a matrix sum of squares. This is a positive semidefinite matrix-valued polynomial for which each entry is a homogeneous form in . From [BSV16, Remark 5.10] it is known that all such polynomial matrices are matrix sums of squares.
In the case , we again give a direct argument. First note that if then for all . Moreover, since there is a positive constant , and polynomials and homogeneous of degree one and two respectively, such that
As such, is a matrix sum of squares if and only if the nonnegative quadratic form is a sum of squares. Since any nonnegative quadratic form is a sum of squares, we are done. ∎
4.1 Hyperbolic certificates that are not sums of squares:
In this section we give an explicit example of a polynomial of degree four in four variables that is hyperbolic, but not SOS-hyperbolic, with respect to . The example is the specialized Vámos polynomial
which is hyperbolic with respect to and has hyperbolicity cone that contains the nonnegative orthant. This is one of a much larger class of hyperbolic polynomials constructed from -uniform hypergraphs by Amini and Brändén [AB18, Theorem 9.4]. The name arises because the basis generating polynomial of the Vámos matroid is
where . The specialized Vámos polynomial is obtained as the restriction,