
A new look at nonnegativity on closed sets and polynomial optimization

Jean B. Lasserre LAAS-CNRS and Institute of Mathematics
University of Toulouse
LAAS, 7 avenue du Colonel Roche
31077 Toulouse Cédex 4, France
Abstract.

We first show that a continuous function $f$ is nonnegative on a closed set $K\subseteq\mathbb{R}^n$ if and only if (countably many) moment matrices of some signed measure $d\nu=f\,d\mu$ with $\mathrm{supp}\,\mu=K$ are all positive semidefinite (if $K$ is compact, $\mu$ is an arbitrary finite Borel measure with $\mathrm{supp}\,\mu=K$). In particular, we obtain a convergent explicit hierarchy of semidefinite (outer) approximations, with no lifting, of the cone of nonnegative polynomials of degree at most $d$. When used in polynomial optimization on certain simple closed sets $K$ (e.g., the whole space $\mathbb{R}^n$, the positive orthant, a box, a simplex, or the vertices of the hypercube), it provides a nonincreasing sequence of upper bounds which converges to the global minimum by solving a hierarchy of semidefinite programs with only one variable (in fact, a generalized eigenvalue problem). In the compact case, this convergent sequence of upper bounds complements the convergent sequence of lower bounds obtained by solving a hierarchy of semidefinite relaxations as in e.g. [12].

Key words and phrases:
closed sets; nonnegative functions; nonnegative polynomials; semidefinite approximations; moments
90C25 28C15

1. Introduction

This paper is concerned with a concrete characterization of continuous functions that are nonnegative on a closed set $K\subseteq\mathbb{R}^n$ and its application for optimization purposes. By concrete we mean a systematic procedure, e.g. a numerical test that can be implemented by an algorithm, at least in some interesting cases. For polynomials, Stengle’s Nichtnegativstellensatz [22] provides a certificate of nonnegativity (or absence of nonnegativity) on a semi-algebraic set $K\subseteq\mathbb{R}^n$. Moreover, in principle, this certificate can be obtained by solving a single semidefinite program (although the size of this semidefinite program is far beyond the capabilities of today’s computers). Similarly, for compact basic semi-algebraic sets, Schmüdgen’s and Putinar’s Positivstellensätze [20, 18] provide certificates of strict positivity that can be obtained by solving finitely many semidefinite programs (of increasing size). Extensions of those certificates to some algebras of non-polynomial functions have been recently proposed in Lasserre and Putinar [14] and in Marshall and Netzer [16]. However, to the best of our knowledge, there is still no hierarchy of explicit (outer or inner) semidefinite approximations (with or without lifting) of the cone of polynomials nonnegative on a closed set $K$, except if $K$ is compact and basic semi-algebraic (in which case outer approximations exist). Another exception is the convex cone of quadratic forms nonnegative on $K=\mathbb{R}^n_+$, for which inner and outer approximations are available; see e.g. Anstreicher and Burer [1], and Dür [7].

Contribution: In this paper, we present a different approach based on a new (at least to the best of our knowledge) and simple characterization of continuous functions that are nonnegative on a closed set $K\subseteq\mathbb{R}^n$. This characterization involves a single (but known) measure $\mu$ with $\mathrm{supp}\,\mu=K$, and sums of squares of polynomials. Namely, our contribution is twofold:

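(a) We first show that a continuous function $f$ is nonnegative on a closed set $K\subseteq\mathbb{R}^n$ if and only if $\int h^2 f\,d\mu$ is nonnegative for all polynomials $h\in\mathbb{R}[x]$, where $\mu$ is a finite Borel measure (see Footnote 1) with $\mathrm{supp}\,\mu=K$. The measure $\mu$ is arbitrary if $K$ is compact. If $K$ is not compact then one may choose for $\mu$ the finite Borel measure:

- $d\mu=\exp(-\sum_i|x_i|)\,d\varphi$ if $f$ is a polynomial, and

- $d\mu=(1+f^2)^{-1}\exp(-\sum_i|x_i|)\,d\varphi$, if $f$ is not a polynomial,

where $\varphi$ is any finite Borel measure with support exactly $K$. But many other choices are possible.

Footnote 1: A finite Borel measure $\mu$ on $\mathbb{R}^n$ is a nonnegative set function defined on the Borel $\sigma$-algebra $\mathcal{B}$ of $\mathbb{R}^n$ (i.e., the $\sigma$-algebra generated by the open sets), such that $\mu(\emptyset)=0$, $\mu(\mathbb{R}^n)<\infty$, and $\mu(\bigcup_{i=1}^\infty E_i)=\sum_{i=1}^\infty\mu(E_i)$ for any collection of disjoint measurable sets $E_i\in\mathcal{B}$. Its support (denoted $\mathrm{supp}\,\mu$) is the smallest closed set $K$ such that $\mu(\mathbb{R}^n\setminus K)=0$; see e.g. Royden [19].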

Equivalently, $f$ is nonnegative on $K$ if and only if every element of the countable family of moment matrices associated with the signed Borel measure $f\,d\mu$ is positive semidefinite. The absence of nonnegativity on $K$ can be certified by exhibiting a polynomial $h\in\mathbb{R}[x]$ such that $\int h^2 f\,d\mu<0$, or equivalently, by exhibiting a moment matrix in the family that is not positive semidefinite. And so, interestingly, as for nonnegativity or positivity, our certificate for absence of nonnegativity is also in terms of sums of squares. When $f$ is a polynomial, these moment matrices are easily obtained from the moments of $\mu$, and this criterion for absence of nonnegativity complements Stengle’s Nichtnegativstellensatz [22] (which provides a certificate of nonnegativity on a semi-algebraic set $K$) or Schmüdgen’s and Putinar’s Positivstellensätze [20, 18] (for certificates of strict positivity on compact basic semi-algebraic sets). Last but not least, we obtain a convergent explicit hierarchy of semidefinite (outer) approximations, with no lifting, of the cone $C_d$ of nonnegative polynomials of degree at most $2d$. That is, we obtain a nested sequence $C^0_d\supset\cdots\supset C^k_d\supset\cdots\supset C_d$ such that each $C^k_d$ is a spectrahedron defined solely in terms of the vector of coefficients of the polynomial, with no additional variable (i.e., no projection is needed). Similar explicit hierarchies can be obtained for the cone of polynomials nonnegative on a closed set $K$ (neither necessarily basic semi-algebraic nor compact), provided that all moments of an appropriate measure $\mu$ (with support exactly $K$) can be obtained. To the best of our knowledge, this is the first result of this kind.

(b) As a potential application, we consider the problem of computing the global minimum $f^*$ of a polynomial $f$ on a closed set $K$, a notoriously difficult problem. In nonlinear programming, a sequence of upper bounds on $f^*$ is usually obtained from a sequence of feasible points $x_k\in K$, e.g., via some (local) minimization algorithm. But it is important to emphasize that for nonconvex problems, providing a sequence of upper bounds $f(x_k)$, $k\in\mathbb{N}$, that converges to $f^*$ is in general impossible, unless one computes points on a grid whose mesh size tends to zero.

We consider the case where $K$ is a closed set for which one may compute all moments of a measure $\mu$ with $\mathrm{supp}\,\mu=K$. Typical examples of such sets are $K=\mathbb{R}^n$ or $K=\mathbb{R}^n_+$ in the non compact case, and a box, a ball, an ellipsoid, a simplex, or the vertices of a hypercube (or hyper rectangle) in the compact case. We then provide a hierarchy of semidefinite programs (with only one variable!) whose optimal values form a monotone nonincreasing sequence of upper bounds which converges to the global minimum $f^*$. In fact, each semidefinite program is a very specific one as it reduces to solving a generalized eigenvalue problem for a pair of real symmetric matrices. (Therefore, for efficiency one may use specialized software packages instead of an SDP solver.) However, the convergence to $f^*$ is in general only asymptotic and not finite (except when $K$ is a discrete set, in which case finite convergence takes place). This is in contrast with the hierarchy of semidefinite relaxations defined in Lasserre [12, 13], which provides a nondecreasing sequence of lower bounds that also converges to $f^*$, very often in finitely many steps. Hence, for compact basic semi-algebraic sets these two convergent hierarchies of upper and lower bounds complement each other and make it possible to locate the global minimum $f^*$ in smaller and smaller intervals.

Notice that convergence of the hierarchy of convex relaxations in [12] is guaranteed only for compact basic semi-algebraic sets, whereas for the new hierarchy of upper bounds, the only requirement on $K$ is to know all moments of a measure $\mu$ with $\mathrm{supp}\,\mu=K$. On the other hand, in general computing such moments is possible only for relatively simple (but not necessarily compact nor semi-algebraic) sets.

Last but not least, the nonincreasing sequence of upper bounds converges to $f^*$ even if $f^*$ is not attained, which when $K=\mathbb{R}^n$ could provide an alternative and/or a complement to the hierarchy of convex relaxations provided in Schweighofer [21] (based on gradient tentacles) and in Hà and Pham [8] (based on the truncated tangency variety), both of which again provide a monotone sequence of lower bounds.

Finally, we also give a very simple interpretation of the hierarchy of dual semidefinite programs, which provides some information on the location of global minimizers.

2. Notation, definitions and preliminary results

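A Borel measure on $\mathbb{R}^n$ is understood as a positive Borel measure, i.e., a nonnegative set function $\mu$ on the Borel $\sigma$-algebra $\mathcal{B}$ (i.e., the $\sigma$-algebra generated by the open sets of $\mathbb{R}^n$) such that $\mu(\emptyset)=0$, and with the countably additive property

\[
\mu\Big(\bigcup_{i=1}^{\infty}E_i\Big)=\sum_{i=1}^{\infty}\mu(E_i),
\]

for any collection of disjoint measurable sets $(E_i)\subset\mathcal{B}$; see e.g. Royden [19, pp. 253–254].

Let $\mathbb{R}[x]$ be the ring of polynomials in the variables $x=(x_1,\ldots,x_n)$, and $\Sigma[x]$ its subset of polynomials that are sums of squares (s.o.s.). Denote by $\mathbb{R}[x]_d$ the vector space of polynomials of degree at most $d$, which has dimension $s(d)=\binom{n+d}{d}$, with e.g. the usual canonical basis $(x^\alpha)$ of monomials. Also, denote by $\Sigma[x]_d\subset\Sigma[x]$ the convex cone of s.o.s. polynomials of degree at most $2d$. If $f\in\mathbb{R}[x]_d$, write $f(x)=\sum_{\alpha\in\mathbb{N}^n_d}f_\alpha x^\alpha$ in the canonical basis and denote by $\mathbf{f}=(f_\alpha)\in\mathbb{R}^{s(d)}$ its vector of coefficients. Let $\mathcal{S}^p$ denote the vector space of $p\times p$ real symmetric matrices. For a matrix $A\in\mathcal{S}^p$ the notation $A\succeq 0$ (resp. $A\succ 0$) stands for $A$ is positive semidefinite (resp. definite).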

Moment matrix

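With $y=(y_\alpha)$ being a sequence indexed in the canonical basis $(x^\alpha)$ of $\mathbb{R}[x]$, let $L_y:\mathbb{R}[x]\to\mathbb{R}$ be the linear functional

\[
f\ \Big(=\sum_\alpha f_\alpha x^\alpha\Big)\ \mapsto\ L_y(f)=\sum_\alpha f_\alpha y_\alpha,
\]

and let $M_d(y)$ be the symmetric matrix with rows and columns indexed in the canonical basis $(x^\alpha)$, and defined by:

\[
M_d(y)(\alpha,\beta):=L_y(x^{\alpha+\beta})=y_{\alpha+\beta},\qquad \alpha,\beta\in\mathbb{N}^n_d, \tag{2.1}
\]

with $\mathbb{N}^n_d:=\{\alpha\in\mathbb{N}^n:\ |\alpha|\ (=\sum_i\alpha_i)\le d\}$.

If $y$ has a representing measure $\mu$, i.e., if $y_\alpha=\int x^\alpha\,d\mu$ for every $\alpha\in\mathbb{N}^n$, then

\[
\langle\mathbf{f},M_d(y)\mathbf{f}\rangle=\int f(x)^2\,d\mu(x)\ \ge 0,\qquad\forall f\in\mathbb{R}[x]_d,
\]

and so $M_d(y)\succeq 0$. A measure $\mu$ is said to be moment determinate if there is no other measure with the same moments. In particular, and as an easy consequence of the Stone-Weierstrass theorem, every measure with compact support is determinate (see Footnote 2).

Footnote 2: To see this note that (a) two measures $\mu_1,\mu_2$ on a compact set $K\subset\mathbb{R}^n$ are identical if and only if $\int f\,d\mu_1=\int f\,d\mu_2$ for all continuous functions $f$ on $K$, and (b) by Stone-Weierstrass, the polynomials are dense in the space of continuous functions on $K$ for the sup-norm.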
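To make definition (2.1) concrete, here is a minimal Python sketch that assembles $M_d(y)$ for $n=2$ and checks positive semidefiniteness in exact rational arithmetic. It assumes, purely for illustration, that $y$ is the moment sequence of the standard Gaussian product measure (whose moments are recalled in Section 3.3); the helper names `gaussian_moment`, `monomials`, `moment_matrix` and `is_psd` are hypothetical and not part of the paper.

```python
from fractions import Fraction
from itertools import product

def gaussian_moment(p):
    """p-th moment of the standard normal: 0 for odd p, 1*3*...*(p-1) for even p."""
    if p % 2:
        return Fraction(0)
    value = Fraction(1)
    for j in range(1, p, 2):
        value *= j
    return value

def y(alpha):
    """Moment y_alpha of the product Gaussian measure (illustrative choice of mu)."""
    result = Fraction(1)
    for a in alpha:
        result *= gaussian_moment(a)
    return result

def monomials(n, d):
    """Exponents alpha in N^n with |alpha| <= d, ordered by total degree."""
    alphas = [a for a in product(range(d + 1), repeat=n) if sum(a) <= d]
    return sorted(alphas, key=lambda a: (sum(a), a))

def moment_matrix(n, d):
    """M_d(y)(alpha, beta) = y_{alpha + beta}, as in (2.1)."""
    basis = monomials(n, d)
    return [[y(tuple(a + b for a, b in zip(al, be))) for be in basis] for al in basis]

def is_psd(matrix):
    """Exact PSD test by symmetric elimination (successive Schur complements)."""
    a = [row[:] for row in matrix]
    size = len(a)
    for k in range(size):
        pivot = a[k][k]
        if pivot < 0:
            return False
        if pivot == 0:
            # A zero diagonal entry of a PSD matrix forces a zero row.
            if any(a[k][j] != 0 for j in range(k + 1, size)):
                return False
            continue
        for i in range(k + 1, size):
            factor = a[i][k] / pivot
            for j in range(k + 1, size):
                a[i][j] -= factor * a[k][j]
    return True

M1 = moment_matrix(2, 1)   # rows/columns indexed by 1, x2, x1
M2 = moment_matrix(2, 2)   # 6 x 6, indexed by all monomials of degree <= 2
print(is_psd(M1), is_psd(M2))
```

Since $y$ has a representing measure here, both checks succeed, in line with the inequality displayed above; a sequence without a representing measure may fail the test at some $d$.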

Not every sequence $y$ satisfying $M_d(y)\succeq 0$, $d\in\mathbb{N}$, has a representing measure. However:

Proposition 2.1 (Berg [3]).

Let $y=(y_\alpha)$ be such that $M_d(y)\succeq 0$, for every $d\in\mathbb{N}$. Then:

(a) The sequence $y$ has a representing measure whose support is contained in the ball $[-a,a]^n$ if there exist $a,c>0$ such that $|y_\alpha|\le c\,a^{|\alpha|}$ for every $\alpha\in\mathbb{N}^n$.

(b) The sequence $y$ has a representing measure on $\mathbb{R}^n$ if

\[
\sum_{t=1}^{\infty}L_y(x_i^{2t})^{-1/2t}=+\infty,\qquad\forall i=1,\ldots,n. \tag{2.2}
\]

In addition, in both cases (a) and (b) the measure is moment determinate.

Condition (b) is an extension to the multivariate case of Carleman’s condition in the univariate case and is due to Nussbaum [17]. For more details see e.g. Berg [3] and/or Maserick and Berg [11].

Localizing matrix

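Similarly, with $y=(y_\alpha)$ and $f\in\mathbb{R}[x]$ written

\[
x\mapsto f(x)=\sum_{\gamma\in\mathbb{N}^n}f_\gamma x^\gamma,
\]

let $M_d(f\,y)$ be the symmetric matrix with rows and columns indexed in the canonical basis $(x^\alpha)$, and defined by:

\[
M_d(f\,y)(\alpha,\beta):=L_y\big(f(x)\,x^{\alpha+\beta}\big)=\sum_{\gamma}f_\gamma\,y_{\alpha+\beta+\gamma},\qquad\forall\alpha,\beta\in\mathbb{N}^n_d. \tag{2.3}
\]

If $y$ has a representing measure $\mu$, then $\langle\mathbf{g},M_d(f\,y)\mathbf{g}\rangle=\int g^2 f\,d\mu$, and so if $\mu$ is supported on the set $\{x: f(x)\ge 0\}$, then $M_d(f\,y)\succeq 0$ for all $d=0,1,\ldots$ because

\[
\langle\mathbf{g},M_d(f\,y)\mathbf{g}\rangle=\int g(x)^2 f(x)\,d\mu(x)\ \ge 0,\qquad\forall g\in\mathbb{R}[x]_d. \tag{2.4}
\]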

3. Nonnegativity on closed sets

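Recall that if $X$ is a separable metric space with Borel $\sigma$-field $\mathcal{B}$, the support $\mathrm{supp}\,\mu$ of a Borel measure $\mu$ on $X$ is the (unique) smallest closed set $B$ such that $\mu(X\setminus B)=0$. Given a Borel measure $\mu$ on $X$ and a measurable function $f:X\to\mathbb{R}$, the mapping $B\mapsto\nu(B):=\int_B f\,d\mu$, $B\in\mathcal{B}$, defines a set function on $\mathcal{B}$. If $f$ is nonnegative then $\nu$ is a Borel measure (which is finite if $f$ is $\mu$-integrable); see e.g. Royden [19, p. 276 and p. 408]. If $f$ is not nonnegative then setting $B_1:=\{x: f(x)\ge 0\}$ and $B_2:=\{x: f(x)<0\}$, the set function $\nu$ can be written as the difference

\[
\nu(B)=\nu_1(B)-\nu_2(B),\qquad B\in\mathcal{B}, \tag{3.1}
\]

of the two positive Borel measures $\nu_1,\nu_2$ defined by

\[
\nu_1(B)=\int_{B_1\cap B}f\,d\mu,\qquad\nu_2(B)=-\int_{B_2\cap B}f\,d\mu,\qquad\forall B\in\mathcal{B}. \tag{3.2}
\]

Then $\nu$ is a signed Borel measure provided that either $\nu_1(B_1)$ or $\nu_2(B_2)$ is finite; see e.g. Royden [19, p. 271]. We first provide the following auxiliary result, which is also of independent interest.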

Lemma 3.1.

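Let $X$ be a separable metric space, $K\subseteq X$ a closed set, and $\mu$ a Borel measure on $X$ with $\mathrm{supp}\,\mu=K$. A continuous function $f:X\to\mathbb{R}$ is nonnegative on $K$ if and only if the set function $B\mapsto\nu(B)=\int_{K\cap B}f\,d\mu$, $B\in\mathcal{B}$, is a positive measure.

Proof.

The only if part is straightforward. For the if part, if $\nu$ is a positive measure then $f(x)\ge 0$ for $\mu$-almost all $x\in K$. That is, there is a Borel set $G\subset K$ such that $\mu(G)=0$ and $f(x)\ge 0$ on $K\setminus G$. Indeed, otherwise suppose that there exists a Borel set $B_0$ with $\mu(B_0)>0$ and $f<0$ on $B_0$; then one would get the contradiction that $\nu$ is not positive because $\nu(B_0)=\int_{B_0}f\,d\mu<0$. In fact, $f$ is called the Radon-Nikodym derivative of $\nu$ with respect to $\mu$; see Royden [19, Theorem 23, p. 276].

Next, observe that $\overline{K\setminus G}\subset K$ and $\mu(\overline{K\setminus G})=\mu(K)$. Therefore $\overline{K\setminus G}=K$ because $\mathrm{supp}\,\mu\ (=K)$ is the unique smallest closed set such that $\mu(X\setminus K)=0$. Hence, let $x\in K$ be fixed, arbitrary. As $K=\overline{K\setminus G}$, there is a sequence $(x_k)\subset K\setminus G$, $k\in\mathbb{N}$, with $x_k\to x$ as $k\to\infty$. But since $f$ is continuous and $f(x_k)\ge 0$ for every $k\in\mathbb{N}$, we obtain the desired result $f(x)\ge 0$. ∎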

Lemma 3.1 itself (of which we have not been able to find a trace in the literature) is a characterization of nonnegativity on $K$ for a continuous function $f$ on $X$. However, one goal of this paper is to provide a more concrete characterization. To do so we first consider the case of a compact set $K\subset\mathbb{R}^n$.

3.1. The compact case

Let $K$ be a compact subset of $\mathbb{R}^n$. For simplicity, and with no loss of generality, we may and will assume that $K\subseteq[-1,1]^n$.

Theorem 3.2.

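Let $K\subseteq[-1,1]^n$ be compact and let $\mu$ be an arbitrary, fixed, finite Borel measure on $K$ with $\mathrm{supp}\,\mu=K$, and with vector of moments $y=(y_\alpha)$, $\alpha\in\mathbb{N}^n$. Let $f$ be a continuous function on $\mathbb{R}^n$. Then:

(a) $f$ is nonnegative on $K$ if and only if

\[
\int_K g^2 f\,d\mu\ \ge 0,\qquad\forall g\in\mathbb{R}[x], \tag{3.3}
\]

or, equivalently, if and only if

\[
M_d(z)\succeq 0,\qquad d=0,1,\ldots \tag{3.4}
\]

where $z=(z_\alpha)$, $\alpha\in\mathbb{N}^n$, with $z_\alpha=\int x^\alpha f(x)\,d\mu(x)$, and with $M_d(z)$ as in (2.1).

If in addition $f\in\mathbb{R}[x]$ then (3.4) reads $M_d(f\,y)\succeq 0$, $d=0,1,\ldots$, where $M_d(f\,y)$ is the localizing matrix defined in (2.3).

(b) If, in addition to being continuous, $f$ is also concave on $\mathrm{co}(K)$, then $f$ is nonnegative on the convex hull $\mathrm{co}(K)$ of $K$ if and only if (3.3) holds.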

Proof.

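The only if part is straightforward. Indeed, if $f\ge 0$ on $K$ then $K\subseteq\{x: f(x)\ge 0\}$ and so for any finite Borel measure $\mu$ on $K$, $\int_K g^2 f\,d\mu\ge 0$ for every $g\in\mathbb{R}[x]$. Next, if $f$ is concave and $f\ge 0$ on $\mathrm{co}(K)$ then $f\ge 0$ on $K$ and so the “only if” part of (b) also follows.

If part. The set function $B\mapsto\nu(B)=\int_B f\,d\mu$, $B\in\mathcal{B}$, can be written as the difference $\nu=\nu_1-\nu_2$ of the two positive finite Borel measures $\nu_1,\nu_2$ described in (3.1)-(3.2), where $B_1:=\{x\in K: f(x)\ge 0\}$ and $B_2:=\{x\in K: f(x)<0\}$. As $K$ is compact and $f$ is continuous, both $\nu_1,\nu_2$ are finite, and so $\nu$ is a finite signed Borel measure; see Royden [19, p. 271]. In view of Lemma 3.1 it suffices to prove that in fact $\nu$ is a finite and positive Borel measure. So let $z=(z_\alpha)$, $\alpha\in\mathbb{N}^n$, be the sequence defined by:

\[
z_\alpha=\int_K x^\alpha\,d\nu(x):=\int_K x^\alpha f(x)\,d\mu(x),\qquad\forall\alpha\in\mathbb{N}^n. \tag{3.5}
\]

Every $z_\alpha$, $\alpha\in\mathbb{N}^n$, is finite because $K$ is compact and $f$ is continuous. So the condition

\[
\int_K g(x)^2 f(x)\,d\mu(x)\ \ge 0,\qquad\forall g\in\mathbb{R}[x]_d,
\]

reads $\langle\mathbf{g},M_d(z)\mathbf{g}\rangle\ge 0$ for all $\mathbf{g}\in\mathbb{R}^{s(d)}$, that is, $M_d(z)\succeq 0$, where $M_d(z)$ is the moment matrix defined in (2.1). And so (3.3) implies $M_d(z)\succeq 0$ for every $d\in\mathbb{N}$. Moreover, as $K\subseteq[-1,1]^n$,

\[
|z_\alpha|\le c:=\int_K|f|\,d\mu,\qquad\forall\alpha\in\mathbb{N}^n.
\]

Hence, by Proposition 2.1(a) with $a=1$, $z$ is the moment sequence of a finite (positive) Borel measure $\psi$ on $[-1,1]^n$, that is, as $\mathrm{supp}\,\nu\subseteq K\subseteq[-1,1]^n$,

\[
\int_{[-1,1]^n}x^\alpha\,d\nu(x)=\int_{[-1,1]^n}x^\alpha\,d\psi(x),\qquad\forall\alpha\in\mathbb{N}^n. \tag{3.6}
\]

But then using (3.1) and (3.6) yields

\[
\int_{[-1,1]^n}x^\alpha\,d\nu_1(x)=\int_{[-1,1]^n}x^\alpha\,d(\nu_2+\psi)(x),\qquad\forall\alpha\in\mathbb{N}^n,
\]

which in turn implies $\nu_1=\nu_2+\psi$ because measures on a compact set are determinate. Next, this implies $\psi=\nu_1-\nu_2\ (=\nu)$ and so $\nu$ is a positive Borel measure on $K$. Hence by Lemma 3.1, $f\ge 0$ on $K$.

If in addition $f\in\mathbb{R}[x]$, the sequence $z$ is obtained as a linear combination of $(y_\alpha)$. Indeed if $f(x)=\sum_\beta f_\beta x^\beta$ then

\[
z_\alpha=\sum_{\beta\in\mathbb{N}^n}f_\beta\,y_{\alpha+\beta},\qquad\forall\alpha\in\mathbb{N}^n,
\]

and so in (3.4), $M_d(z)$ is nothing less than the localizing matrix $M_d(f\,y)$ associated with $y$ and $f$, defined in (2.3), and (3.4) reads $M_d(f\,y)\succeq 0$ for all $d=0,1,\ldots$

Finally, if $f$ is concave then $f\ge 0$ on $K$ implies $f\ge 0$ on $\mathrm{co}(K)$, and so the “if” part of (b) also follows. ∎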

Therefore, to check whether a polynomial $f\in\mathbb{R}[x]$ is nonnegative on $K$, it suffices to check if every element of the countable family of real symmetric matrices $(M_d(f\,y))$, $d\in\mathbb{N}$, is positive semidefinite.
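As a hedged illustration of this test, the following Python sketch takes $K=[-1,1]$ with $\mu$ the Lebesgue measure on $K$ (an illustrative choice; Theorem 3.2 allows any finite Borel measure with support $K$), builds the localizing matrices $M_d(f\,y)$ of (2.3) in exact rational arithmetic, and searches for the first $d$ at which positive semidefiniteness fails. The function names are hypothetical, and a search capped at `max_d` can only certify absence of nonnegativity, never nonnegativity itself.

```python
from fractions import Fraction

def lebesgue_moment(k):
    """k-th moment of the Lebesgue measure on [-1, 1]: 2/(k+1) if k is even, else 0."""
    return Fraction(2, k + 1) if k % 2 == 0 else Fraction(0)

def localizing_matrix(f, d):
    """M_d(f y)(i, j) = sum_g f[g] * y_{i+j+g}, where f[g] is the coefficient of x^g."""
    return [[sum(Fraction(c) * lebesgue_moment(i + j + g) for g, c in enumerate(f))
             for j in range(d + 1)] for i in range(d + 1)]

def is_psd(matrix):
    """Exact PSD test by symmetric elimination (successive Schur complements)."""
    a = [row[:] for row in matrix]
    size = len(a)
    for k in range(size):
        pivot = a[k][k]
        if pivot < 0:
            return False
        if pivot == 0:
            # A zero diagonal entry of a PSD matrix forces a zero row.
            if any(a[k][j] != 0 for j in range(k + 1, size)):
                return False
            continue
        for i in range(k + 1, size):
            factor = a[i][k] / pivot
            for j in range(k + 1, size):
                a[i][j] -= factor * a[k][j]
    return True

def first_violation(f, max_d):
    """Smallest d <= max_d with M_d(f y) not PSD, or None if no violation is found."""
    for d in range(max_d + 1):
        if not is_psd(localizing_matrix(f, d)):
            return d
    return None

negative_somewhere = [Fraction(-1, 4), 0, 1]   # x^2 - 1/4, negative near x = 0
nonnegative_on_K = [1, 0, -1]                  # 1 - x^2, nonnegative on [-1, 1]
print(first_violation(negative_somewhere, 6))  # → 2
print(first_violation(nonnegative_on_K, 6))    # → None
```

For $x^2-1/4$ the matrices $M_0$ and $M_1$ are still positive semidefinite; the violation only appears at $d=2$, which illustrates why the whole countable family has to be considered.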

Remark 3.3.

An informal alternative proof of Theorem 3.2 which does not use Lemma 3.1 is as follows. If $f$ is not nonnegative on $K$ there exists $a\in K$ such that $f(a)<0$, and so as $K$ is compact, there is a continuous function $h$, e.g., close to $1$ in some open neighborhood $B(a,\delta)$ of $a$, and very small in the rest of $K$. By the Stone-Weierstrass theorem, one may choose $h$ to be a polynomial. Next, the complement $B(a,\delta)^c\ (:=\mathbb{R}^n\setminus B(a,\delta))$ of $B(a,\delta)$ is closed, and so $K\cap B(a,\delta)^c$ is a closed set contained in $K$ (hence smaller than $K$). Therefore $\mu(B(a,\delta))>0$ because otherwise $\mu(K\cap B(a,\delta)^c)=\mu(K)$, which would imply that $K\cap B(a,\delta)^c$ is a support of $\mu$ smaller than $K$, in contradiction with $\mathrm{supp}\,\mu=K$. Hence, we would get the contradiction

\[
\int h^2 f\,d\mu\ \approx\ h(a)^2 f(a)\,\mu(B(a,\delta))<0.
\]

However, in the non compact case described in the next section, this argument is not valid.

3.2. The non-compact case

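We now consider the more delicate case where $K$ is a closed set of $\mathbb{R}^n$, not necessarily compact. To handle arbitrary non compact sets $K$ and arbitrary continuous functions $f$, we need a reference measure $\mu$ with $\mathrm{supp}\,\mu=K$ and with nice properties so that integrals such as $\int g^2 f\,d\mu$, $g\in\mathbb{R}[x]$, are well-behaved.

So, let $\varphi$ be an arbitrary finite Borel measure on $\mathbb{R}^n$ whose support is exactly $K$, and let $\mu$ be the finite Borel measure defined by:

\[
\mu(B):=\int_B\exp\Big(-\sum_{i=1}^n|x_i|\Big)\,d\varphi(x),\qquad\forall B\in\mathcal{B}(\mathbb{R}^n). \tag{3.7}
\]

Observe that $\mathrm{supp}\,\mu=K$ and $\mu$ satisfies Carleman’s condition (2.2). Indeed, let $z=(z_\alpha)$, $\alpha\in\mathbb{N}^n$, be the sequence of moments of $\mu$. Then for every $i=1,\ldots,n$, and every $k=0,1,\ldots$, using $x_i^{2k}\le(2k)!\,\exp|x_i|$,

\[
L_z(x_i^{2k})=\int_K x_i^{2k}\,d\mu(x)\le(2k)!\int_K e^{|x_i|}\,d\mu(x)\le(2k)!\,\varphi(K)=:(2k)!\,M. \tag{3.8}
\]

Therefore for every $i=1,\ldots,n$, using $(2k)!\le(2k)^{2k}$ for every $k$, yields

\[
\sum_{k=1}^{\infty}L_z(x_i^{2k})^{-1/2k}\ \ge\ \sum_{k=1}^{\infty}M^{-1/2k}\big((2k)!\big)^{-1/2k}\ \ge\ \sum_{k=1}^{\infty}\frac{M^{-1/2k}}{2k}=+\infty,
\]

i.e., (2.2) holds. Notice also that all the moments of $\mu$ (defined in (3.7)) are finite, and so every polynomial is $\mu$-integrable.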

Theorem 3.4.

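Let $K\subseteq\mathbb{R}^n$ be closed and let $\varphi$ be an arbitrary finite Borel measure whose support is exactly $K$. Let $f$ be a continuous function on $\mathbb{R}^n$. If $f\in\mathbb{R}[x]$ (i.e., $f$ is a polynomial) let $\mu$ be as in (3.7), whereas if $f$ is not a polynomial let $\mu$ be defined by

\[
\mu(B):=\int_B\frac{\exp\big(-\sum_{i=1}^n|x_i|\big)}{1+f(x)^2}\,d\varphi(x),\qquad\forall B\in\mathcal{B}(\mathbb{R}^n). \tag{3.9}
\]

Then (a) and (b) of Theorem 3.2 hold.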

For a detailed proof see §6.

It is important to emphasize that in Theorems 3.2 and 3.4, the set $K$ is an arbitrary closed set of $\mathbb{R}^n$, and to the best of our knowledge, the characterization of nonnegativity of $f$ in terms of positive semidefiniteness of the moment matrices $M_d(z)$ is new. But of course, this characterization becomes even more interesting when one knows how to compute the moment sequence $z=(z_\alpha)$, $\alpha\in\mathbb{N}^n$, which is possible in a few special cases only.

Important particular cases of such nice sets $K$ include boxes, hyper rectangles, ellipsoids, and simplices in the compact case, and the positive orthant or the whole space $\mathbb{R}^n$ in the non compact case. For instance, for the whole space $K=\mathbb{R}^n$ one may choose for $\mu$ in (3.7) the multivariate Gaussian (or normal) probability measure

\[
\mu(B):=(2\pi)^{-n/2}\int_B\exp\Big(-\tfrac{1}{2}\|x\|^2\Big)\,dx,\qquad B\in\mathcal{B}(\mathbb{R}^n),
\]

which is the $n$-times product of the one-dimensional normal distribution

\[
\mu_i(B):=\frac{1}{\sqrt{2\pi}}\int_B\exp(-x_i^2/2)\,dx_i,\qquad B\in\mathcal{B}(\mathbb{R}),
\]

whose moments are all easily available in closed form. In Theorem 3.4 this corresponds to the choice

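\[
\varphi(B)=(2\pi)^{-n/2}\int_B\exp(-\|x\|^2/2)\,\exp\Big(\sum_{i=1}^n|x_i|\Big)\,dx,\qquad B\in\mathcal{B}(\mathbb{R}^n), \tag{3.10}
\]

where the sign in the second exponential is the one for which (3.7) returns the Gaussian measure $\mu$ above.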

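When $K$ is the positive orthant $\mathbb{R}^n_+$ one may choose for $\mu$ the exponential probability measure

\[
\mu(B):=\int_B\exp\Big(-\sum_{i=1}^n x_i\Big)\,dx,\qquad B\in\mathcal{B}(\mathbb{R}^n_+), \tag{3.11}
\]

which is the $n$-times product of the one-dimensional exponential distribution

\[
\mu_i(B):=\int_B\exp(-x_i)\,dx_i,\qquad B\in\mathcal{B}(\mathbb{R}_+),
\]

whose moments are also easily available in closed form. In Theorem 3.4 this corresponds to the choice

\[
\varphi(B)=2^n\int_B\exp\Big(-\sum_{i=1}^n x_i\Big)\,dx,\qquad B\in\mathcal{B}(\mathbb{R}^n_+).
\]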

3.3. The cone of nonnegative polynomials

The convex cone $C_d\subset\mathbb{R}[x]_{2d}$ of nonnegative polynomials of degree at most $2d$ (a nonnegative polynomial has necessarily even degree) is much harder to characterize than its subcone $\Sigma[x]_d$ of sums of squares. Indeed, while the latter has a simple semidefinite representation with lifting (i.e. $\Sigma[x]_d$ is the projection in $\mathbb{R}^{s(2d)}$ of a spectrahedron (see Footnote 3) in a higher dimensional space), so far there is no such simple representation for the former. In addition, when $d$ is fixed, Blekherman [4] has shown that after proper normalization, the “gap” between $C_d$ and $\Sigma[x]_d$ increases unboundedly with the number of variables.

Footnote 3: A spectrahedron is the intersection of the cone of positive semidefinite matrices with an affine-linear space. Its algebraic representation is called a Linear Matrix Inequality (LMI).

We next provide a convergent hierarchy of (outer) semidefinite approximations $(C^k_d)$, $k\in\mathbb{N}$, of $C_d$ where each $C^k_d$ has a semidefinite representation with no lifting (i.e., no projection is needed and $C^k_d$ is a spectrahedron). To the best of our knowledge, this is the first result of this kind.

Recall that with every $f\in\mathbb{R}[x]_{2d}$ is associated its vector of coefficients $\mathbf{f}=(f_\alpha)$, $\alpha\in\mathbb{N}^n_{2d}$, in the canonical basis of monomials, and conversely, with every $\mathbf{f}\in\mathbb{R}^{s(2d)}$ is associated a polynomial $f\in\mathbb{R}[x]_{2d}$ with vector of coefficients $\mathbf{f}$ in the canonical basis. Recall that for every $k=0,1,\ldots$,

\[
\gamma_p:=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}x^p e^{-x^2/2}\,dx=
\begin{cases}
0 & \text{if } p=2k+1,\\
\prod_{j=1}^{k}(2j-1) & \text{if } p=2k,
\end{cases}
\]

as $\gamma_{2k}=(2k-1)\,\gamma_{2(k-1)}$ for every $k\ge 1$.

Corollary 3.5.

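Let $\mu$ be the probability measure on $\mathbb{R}^n$ which is the $n$-times product of the normal distribution on $\mathbb{R}$, and so with moments $y=(y_\alpha)$, $\alpha\in\mathbb{N}^n$,

\[
y_\alpha=\int_{\mathbb{R}^n}x^\alpha\,d\mu=\prod_{i=1}^{n}\Big(\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}x^{\alpha_i}e^{-x^2/2}\,dx\Big),\qquad\forall\alpha\in\mathbb{N}^n. \tag{3.12}
\]

For every $k\in\mathbb{N}$, let $C^k_d:=\{\mathbf{f}\in\mathbb{R}^{s(2d)}: M_k(f\,y)\succeq 0\}$, where $M_k(f\,y)$ is the localizing matrix in (2.3) associated with $y$ and $f\in\mathbb{R}[x]_{2d}$. Each $C^k_d$ is a closed convex cone and a spectrahedron.

Then $C^0_d\supset C^1_d\supset\cdots\supset C^k_d\supset\cdots\supset C_d$, and $f\in C_d$ if and only if its vector of coefficients $\mathbf{f}\in\mathbb{R}^{s(2d)}$ satisfies $\mathbf{f}\in C^k_d$, for every $k=0,1,\ldots$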

Proof.

Following its definition (2.3), all entries of the localizing matrix $M_k(f\,y)$ are linear in $\mathbf{f}\in\mathbb{R}^{s(2d)}$, and so $M_k(f\,y)\succeq 0$ is an LMI. Therefore $C^k_d$ is a spectrahedron and a closed convex cone. Next, let $K:=\mathbb{R}^n$ and let $\mu$ be as in Corollary 3.5 and so of the form (3.7) with $\varphi$ as in (3.10). Then $\mu$ satisfies Carleman’s condition (2.2). Hence, by Theorem 3.4 with $K=\mathbb{R}^n$, $f$ is nonnegative on $K$ if and only if (3.4) holds, which is equivalent to stating that $M_k(f\,y)\succeq 0$, $k=0,1,\ldots$, which in turn is equivalent to stating that $\mathbf{f}\in C^k_d$, $k=0,1,\ldots$ ∎

So the nested sequence of convex cones $C^0_d\supset C^1_d\supset\cdots\supset C_d$ defines arbitrarily close outer approximations of $C_d$. In fact $\bigcap_{k=0}^{\infty}C^k_d$ is closed and $C_d=\bigcap_{k=0}^{\infty}C^k_d$. It is worth emphasizing that each $C^k_d$ is a spectrahedron with no lifting, that is, $C^k_d$ is defined solely in terms of the vector of coefficients $\mathbf{f}$ with no additional variable (i.e., no projection is needed).

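For instance, the first approximation $C^0_d$ is just the set $\{\mathbf{f}\in\mathbb{R}^{s(2d)}:\int f\,d\mu\ge 0\}$, which is a half-space of $\mathbb{R}^{s(2d)}$. And with $n=2$,

\[
C^1_d=\left\{\mathbf{f}\in\mathbb{R}^{s(2d)}:\ 
\begin{bmatrix}
\int f\,d\mu & \int x_1 f\,d\mu & \int x_2 f\,d\mu\\
\int x_1 f\,d\mu & \int x_1^2 f\,d\mu & \int x_1x_2 f\,d\mu\\
\int x_2 f\,d\mu & \int x_1x_2 f\,d\mu & \int x_2^2 f\,d\mu
\end{bmatrix}\succeq 0\right\},
\]

or, equivalently, $C^1_d$ is the convex basic semi-algebraic set:

\[
\left\{\mathbf{f}\in\mathbb{R}^{s(2d)}:\ 
\begin{array}{l}
\int f\,d\mu\ge 0,\\[2pt]
\big(\int x_i^2 f\,d\mu\big)\big(\int f\,d\mu\big)\ge\big(\int x_i f\,d\mu\big)^2,\quad i=1,2,\\[2pt]
\big(\int x_1^2 f\,d\mu\big)\big(\int x_2^2 f\,d\mu\big)\ge\big(\int x_1x_2 f\,d\mu\big)^2,\\[2pt]
\big(\int f\,d\mu\big)\Big[\big(\int x_1^2 f\,d\mu\big)\big(\int x_2^2 f\,d\mu\big)-\big(\int x_1x_2 f\,d\mu\big)^2\Big]\\[2pt]
\quad-\big(\int x_1 f\,d\mu\big)^2\big(\int x_2^2 f\,d\mu\big)-\big(\int x_2 f\,d\mu\big)^2\big(\int x_1^2 f\,d\mu\big)\\[2pt]
\quad+2\big(\int x_1 f\,d\mu\big)\big(\int x_2 f\,d\mu\big)\big(\int x_1x_2 f\,d\mu\big)\ge 0
\end{array}
\right\},
\]

where we have just expressed the nonnegativity of all principal minors of $M_1(f\,y)$.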
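The membership test behind these outer approximations can be sketched in a few lines of Python for $n=2$, using the Gaussian moments (3.12) and exact rational arithmetic. This is only an illustrative sketch: the function names (`in_outer_cone`, `localizing_matrix`, and so on) and the representation of a polynomial as a dictionary from exponent pairs to coefficients are assumptions made here, not notation from the paper.

```python
from fractions import Fraction
from itertools import product

def gaussian_moment(p):
    """gamma_p: 0 for odd p, 1*3*...*(p-1) for even p."""
    if p % 2:
        return Fraction(0)
    value = Fraction(1)
    for j in range(1, p, 2):
        value *= j
    return value

def y(alpha):
    """Product Gaussian moment y_alpha, as in (3.12)."""
    result = Fraction(1)
    for a in alpha:
        result *= gaussian_moment(a)
    return result

def monomials(n, k):
    """Exponents alpha in N^n with |alpha| <= k, ordered by total degree."""
    alphas = [a for a in product(range(k + 1), repeat=n) if sum(a) <= k]
    return sorted(alphas, key=lambda a: (sum(a), a))

def localizing_matrix(f, k, n=2):
    """M_k(f y)(alpha, beta) = sum_gamma f_gamma * y_{alpha+beta+gamma}, as in (2.3)."""
    basis = monomials(n, k)
    def entry(al, be):
        return sum(Fraction(c) * y(tuple(a + b + g for a, b, g in zip(al, be, ga)))
                   for ga, c in f.items())
    return [[entry(al, be) for be in basis] for al in basis]

def is_psd(matrix):
    """Exact PSD test by symmetric elimination (successive Schur complements)."""
    a = [row[:] for row in matrix]
    size = len(a)
    for k in range(size):
        pivot = a[k][k]
        if pivot < 0:
            return False
        if pivot == 0:
            # A zero diagonal entry of a PSD matrix forces a zero row.
            if any(a[k][j] != 0 for j in range(k + 1, size)):
                return False
            continue
        for i in range(k + 1, size):
            factor = a[i][k] / pivot
            for j in range(k + 1, size):
                a[i][j] -= factor * a[k][j]
    return True

def in_outer_cone(f, k):
    """True when the coefficient vector of f lies in the k-th outer approximation."""
    return is_psd(localizing_matrix(f, k))

f_indefinite = {(1, 1): 1}                        # x1 * x2, takes negative values
f_square = {(2, 0): 1, (1, 1): -2, (0, 2): 1}     # (x1 - x2)^2, nonnegative

print([in_outer_cone(f_indefinite, k) for k in range(2)])
print([in_outer_cone(f_square, k) for k in range(3)])
```

The polynomial $x_1x_2$ passes the half-space test ($k=0$) but is rejected at $k=1$, whereas $(x_1-x_2)^2$ passes every level that is checked, as the corollary predicts for a nonnegative polynomial.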

A very similar result holds for the convex cone $C_d(K)$ of polynomials of degree at most $2d$, nonnegative on a closed set $K\subseteq\mathbb{R}^n$.

Corollary 3.6.

Let $K\subseteq\mathbb{R}^n$ be a closed set and let $\mu$ be defined in (3.7), where $\varphi$ is an arbitrary finite Borel measure whose support is exactly $K$, and let $y=(y_\alpha)$ be the moment sequence of $\mu$.

For every $k\in\mathbb{N}$, let $C^k_d(K):=\{\mathbf{f}\in\mathbb{R}^{s(2d)}: M_k(f\,y)\succeq 0\}$, where $M_k(f\,y)$ is the localizing matrix in (2.3) associated with $y$ and $f\in\mathbb{R}[x]_{2d}$. Each $C^k_d(K)$ is a closed convex cone and a spectrahedron.

Then $C^0_d(K)\supset C^1_d(K)\supset\cdots\supset C^k_d(K)\supset\cdots\supset C_d(K)$, and $f\in C_d(K)$ if and only if its vector of coefficients $\mathbf{f}\in\mathbb{R}^{s(2d)}$ satisfies $\mathbf{f}\in C^k_d(K)$, for every $k=0,1,\ldots$

The proof, which mimics that of Corollary 3.5, is omitted. Of course, for practical computation, one is restricted to sets $K$ where one may compute effectively the moments of the measure $\mu$. An example of such a set is the positive orthant, in which case one may choose the measure in (3.11), for which all moments are explicitly available. For compact sets let us mention balls, boxes, ellipsoids, and simplices. But again, any compact set $K$ where one knows how to compute all moments of some measure with support exactly $K$ is fine.

To the best of our knowledge this is the first characterization of an outer approximation of the cone $C_d(K)$ in a relatively general context. Indeed, for the basic semi-algebraic set

\[
K=\{x\in\mathbb{R}^n:\ g_j(x)\ge 0,\ j=1,\ldots,m\}, \tag{3.13}
\]

Stengle’s Nichtnegativstellensatz [22] states that $f\in\mathbb{R}[x]$ is nonnegative on $K$ if and only if

\[
p\,f=f^{2s}+q, \tag{3.14}
\]

for some integer $s$ and polynomials $p,q\in P(g)$, where $P(g)$ is the preordering (see Footnote 4) associated with the $g_j$’s. In addition, there exist bounds on the integer $s$ and the degree of the s.o.s. weights in the definition of $p,q\in P(g)$, so that in principle, when $f$ is known, checking whether $f\ge 0$ on $K$ reduces to solving a single SDP to compute $p,q$ in the nonnegativity certificate (3.14). However, the size of this SDP is potentially huge, which makes it impractical. Moreover, the representation of $C_d(K)$ in (3.14) is not convex in the vector of coefficients of $f$ because it involves $f^{2s}$ as well as the product $p\,f$.

Footnote 4: The preordering associated with the $g_j$’s is the set of polynomials of the form $\sum_{J\subseteq\{1,\ldots,m\}}\sigma_J\prod_{j\in J}g_j$, where $\sigma_J\in\Sigma[x]$ for each $J$.

Remark 3.7.

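If in Corollary 3.6 one replaces the finite-dimensional convex cone $C_d(K)\subset\mathbb{R}[x]_{2d}$ with the infinite-dimensional convex cone $C(K)\subset\mathbb{R}[x]$ of all polynomials nonnegative on $K$, and $C^k_d(K)\subset\mathbb{R}[x]_{2d}$ with $C^k(K)=\{\mathbf{f}\in\mathbb{R}^{s(2k)}: M_k(f\,y)\succeq 0\}$, then the nested sequence of (increasing but finite-dimensional) convex cones $C^k(K)$, $k\in\mathbb{N}$, provides finite-dimensional approximations of $C(K)$.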

4. Application to polynomial optimization

Consider the polynomial optimization problem

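\[
\mathbf{P}:\quad f^*=\inf_x\{f(x):\ x\in K\}, \tag{4.1}
\]

where $K\subseteq\mathbb{R}^n$ is closed and $f\in\mathbb{R}[x]$.

If $K$ is compact let $\mu$ be a finite Borel measure with $\mathrm{supp}\,\mu=K$, and if $K$ is not compact, let $\varphi$ be an arbitrary finite Borel measure with $\mathrm{supp}\,\varphi=K$ and let $\mu$ be as in (3.7). In both cases, the sequence of moments $y=(y_\alpha)$, $\alpha\in\mathbb{N}^n$, is well-defined, and we assume that $y_\alpha$ is available or can be computed, for every $\alpha\in\mathbb{N}^n$.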

Consider the sequence of semidefinite programs:

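\[
\lambda_d=\sup_{\lambda\in\mathbb{R}}\{\lambda:\ M_d(f_\lambda\,y)\succeq 0\} \tag{4.2}
\]

where $f_\lambda\in\mathbb{R}[x]$ is the polynomial $x\mapsto f(x)-\lambda$. Notice that (4.2) has only one variable!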

Theorem 4.1.

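Consider the hierarchy of semidefinite programs (4.2) indexed by $d\in\mathbb{N}$. Then:

(i) (4.2) has an optimal solution $\lambda_d\ge f^*$ for every $d\in\mathbb{N}$.

(ii) The sequence $(\lambda_d)$, $d\in\mathbb{N}$, is monotone nonincreasing and $\lambda_d\downarrow f^*$ as $d\to\infty$.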

Proof.

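(i) Since $f-f^*\ge 0$ on $K$, by Theorem 3.2, $\lambda:=f^*$ is a feasible solution of (4.2) for every $d$. Hence $\lambda_d\ge f^*$ for every $d\in\mathbb{N}$. Next, let $d\in\mathbb{N}$ be fixed, and let $\lambda$ be an arbitrary feasible solution of (4.2). From the condition $M_d(f_\lambda\,y)\succeq 0$, the diagonal entry $M_d(f_\lambda\,y)(1,1)$ must be nonnegative, i.e., $\lambda\,y_0\le\sum_\alpha f_\alpha y_\alpha$, and so, as we maximize $\lambda$ and $y_0>0$, (4.2) must have an optimal solution $\lambda_d$.

(ii) Obviously $\lambda_d\le\lambda_m$ whenever $d\ge m$, because $M_d(f_\lambda\,y)\succeq 0$ implies $M_m(f_\lambda\,y)\succeq 0$. Therefore, the sequence $(\lambda_d)$, $d\in\mathbb{N}$, is monotone nonincreasing and, being bounded below by $f^*$, converges to some $\lambda^*\ge f^*$. Next, suppose that $\lambda^*>f^*$; fix $k\in\mathbb{N}$, arbitrary. The convergence $\lambda_d\downarrow\lambda^*$ implies $M_k(f_{\lambda^*}\,y)\succeq 0$. As $k$ was arbitrary, we obtain that $M_d(f_{\lambda^*}\,y)\succeq 0$ for every $d\in\mathbb{N}$. But then by Theorem 3.2 or Theorem 3.4, $f-\lambda^*\ge 0$ on $K$, and so $\lambda^*\le f^*$, in contradiction with $\lambda^*>f^*$. Therefore $\lambda^*=f^*$. ∎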
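To see the hierarchy (4.2) at work on a toy instance, the following Python sketch computes approximations of $\lambda_d$ for $f(x)=x^2$ on $K=[-1,1]$, with $\mu$ the Lebesgue measure on $K$ (both are illustrative assumptions). Since $M_d(f_\lambda\,y)=M_d(f\,y)-\lambda M_d(y)$ is linear in the single variable $\lambda$, the sketch simply bisects on $\lambda$ with an exact positive semidefiniteness test; this stands in for a dedicated generalized eigenvalue routine and is not the method advocated in the text. All function names are hypothetical.

```python
from fractions import Fraction

def moment(k):
    """Moments of the Lebesgue measure on K = [-1, 1] (illustrative choice of mu)."""
    return Fraction(2, k + 1) if k % 2 == 0 else Fraction(0)

def localizing_matrix(f, d):
    """M_d(f y)(i, j) = sum_g f[g] * y_{i+j+g}; f = [1] gives the moment matrix M_d(y)."""
    return [[sum(Fraction(c) * moment(i + j + g) for g, c in enumerate(f))
             for j in range(d + 1)] for i in range(d + 1)]

def is_psd(matrix):
    """Exact PSD test by symmetric elimination (successive Schur complements)."""
    a = [row[:] for row in matrix]
    size = len(a)
    for k in range(size):
        pivot = a[k][k]
        if pivot < 0:
            return False
        if pivot == 0:
            # A zero diagonal entry of a PSD matrix forces a zero row.
            if any(a[k][j] != 0 for j in range(k + 1, size)):
                return False
            continue
        for i in range(k + 1, size):
            factor = a[i][k] / pivot
            for j in range(k + 1, size):
                a[i][j] -= factor * a[k][j]
    return True

def upper_bound(f, d, iterations=50):
    """Approximate lambda_d = sup{lam : M_d(f y) - lam * M_d(y) is PSD} by bisection."""
    m_f = localizing_matrix(f, d)
    m_y = localizing_matrix([1], d)
    lo = -sum(abs(Fraction(c)) for c in f)   # f >= lo on [-1, 1], so lo is feasible
    hi = m_f[0][0] / m_y[0][0] + 1           # exceeds lambda_0, so hi is infeasible
    for _ in range(iterations):
        mid = (lo + hi) / 2
        shifted = [[m_f[i][j] - mid * m_y[i][j] for j in range(d + 1)]
                   for i in range(d + 1)]
        if is_psd(shifted):
            lo = mid
        else:
            hi = mid
    return lo

f = [0, 0, 1]   # f(x) = x^2, with global minimum f* = 0 on [-1, 1]
bounds = [float(upper_bound(f, d)) for d in range(4)]
print(bounds)
```

On this instance the first two bounds coincide with the mean value $\int f\,d\mu/\mu(K)=1/3$, and the bound for $d=2$ drops to $(15-\sqrt{120})/35\approx 0.1156$; the values are nonincreasing and stay above $f^*=0$, consistent with Theorem 4.1. In floating-point practice one would more likely call a symmetric generalized eigenvalue solver on the pair $(M_d(f\,y),M_d(y))$.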

For each $d\in\mathbb{N}$, the semidefinite program (4.2) provides an upper bound on the optimal value $f^*$ only. We next show that the dual contains some information on global minimizers, at least when $d$ is sufficiently large.

4.1. Duality

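Let $\mathcal{S}_d$ be the space of real symmetric $s(d)\times s(d)$ matrices. One may write the semidefinite program (4.2) as

\[
\lambda_d=\sup_\lambda\{\lambda:\ \lambda\,M_d(y)\preceq M_d(f\,y)\}, \tag{4.3}
\]

which in fact is a generalized eigenvalue problem for the pair of matrices $M_d(y)$ and $M_d(f\,y)$. Its dual is the semidefinite program

\[
\inf_{X\in\mathcal{S}_d}\{\langle X,M_d(f\,y)\rangle:\ \langle X,M_d(y)\rangle=1;\ X\succeq 0\},
\]

or, equivalently,

\[
\lambda^*_d=\inf_\sigma\Big\{\int_K f\sigma\,d\mu:\ \int_K\sigma\,d\mu=1;\ \sigma\in\Sigma[x]_d\Big\}. \tag{4.4}
\]

So the dual problem (4.4) is to find a sum of squares polynomial $\sigma$ of degree at most $2d$ (normalized to satisfy $\int\sigma\,d\mu=1$) that minimizes the integral $\int f\sigma\,d\mu$, and a simple interpretation of (4.4) is as follows:

With $M(K)$ being the space of Borel probability measures on $K$, we know that $f^*=\inf_{\psi\in M(K)}\int_K f\,d\psi$. Next, let $M_d(\mu)\subset M(K)$ be the space of probability measures on $K$ which have a density $\sigma\in\Sigma[x]_d$ with respect to $\mu$. Then (4.4) reads $\inf_{\psi\in M_d(\mu)}\int_K f\,d\psi$, which clearly shows why one obtains an upper bound on $f^*$. Indeed, instead of searching in $M(K)$ one searches in its subset $M_d(\mu)$. What is not obvious at all is whether the upper bound obtained in (4.4) converges to $f^*$ when the degree of $\sigma$ is allowed to increase!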
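The upper-bound property can be spelled out in one line. As a sketch, assume only that $\sigma$ is feasible for (4.4) and write $d\psi:=\sigma\,d\mu$ (a notation introduced here for convenience); since $\sigma\ge 0$ and $f\ge f^*$ on $K$,

```latex
\int_K f\,\sigma\,d\mu \;=\; \int_K f\,d\psi \;\ge\; f^*\int_K d\psi \;=\; f^*\int_K \sigma\,d\mu \;=\; f^* .
```

Taking the infimum over all feasible $\sigma$ gives $\lambda^*_d\ge f^*$ for every $d$; the nontrivial part, addressed by the convergence results, is that no gap remains in the limit.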

Theorem 4.2.

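Suppose that $f^*>-\infty$ and $K$ has nonempty interior. Then:

(a) There is no duality gap between (4.2) and (4.4), and (4.4) has an optimal solution $\sigma^*\in\Sigma[x]_d$ which satisfies $\int_K(f(x)-\lambda_d)\,\sigma^*(x)\,d\mu(x)=0$.

(b) If $K$ is convex and $f$ is convex, let $x^*_d:=\int_K x\,\sigma^*(x)\,d\mu(x)$. Then $x^*_d\in K$ and $f^*\le f(x^*_d)\le\lambda_d$, so that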