Convex Hulls of Algebraic Sets


João Gouveia Department of Mathematics, University of Washington, Box 354350, Seattle, WA 98195, USA, and CMUC, Department of Mathematics, University of Coimbra, 3001-454 Coimbra, Portugal jgouveia@math.washington.edu    Rekha Thomas Department of Mathematics, University of Washington, Box 354350, Seattle, WA 98195, USA thomas@math.washington.edu
July 6, 2019
Abstract

This article describes a method to compute successive convex approximations of the convex hull of a set of points in $\mathbb{R}^n$ that are the solutions to a system of polynomial equations over the reals. The method relies on sums of squares of polynomials and the dual theory of moment matrices. The main feature of the technique is that all computations are done modulo the ideal generated by the polynomials defining the set to be convexified. This work was motivated by questions raised by Lovász concerning extensions of the theta body of a graph to arbitrary real algebraic varieties, and hence the relaxations described here are called theta bodies. The convexification process can be seen as an incarnation of Lasserre’s hierarchy of convex relaxations of a semialgebraic set in $\mathbb{R}^n$. When the defining ideal is real radical, the results become especially nice. We provide several examples of the method and discuss convergence issues. Finite convergence, especially after the first step of the method, can be described explicitly for finite point sets.

1 Introduction

An important concern in optimization is the complete or partial knowledge of the convex hull of the set of feasible solutions to an optimization problem. Computing convex hulls is in general a difficult task, and a classical example is the construction of the integer hull of a polyhedron, which drives many algorithms in integer programming. In this article we describe a method to convexify, at least approximately, an algebraic set using semidefinite programming.

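By an algebraic set we mean a subset $S \subseteq \mathbb{R}^n$ described by a finite list of polynomial equations of the form $f_1(x) = f_2(x) = \cdots = f_t(x) = 0$ where $f_i(x)$ is an element of $\mathbb{R}[x] := \mathbb{R}[x_1, \ldots, x_n]$, the polynomial ring in $n$ variables over the reals. The input to our algorithm is the ideal generated by $f_1, \ldots, f_t$, denoted as $I = \langle f_1, \ldots, f_t \rangle$, which is the set $\{\sum_{i=1}^t g_i f_i : g_i \in \mathbb{R}[x]\}$. An ideal $I \subseteq \mathbb{R}[x]$ is a group under addition and is closed under multiplication by elements of $\mathbb{R}[x]$. Given an ideal $I \subseteq \mathbb{R}[x]$, its real variety, $V_{\mathbb{R}}(I) = \{x \in \mathbb{R}^n : f(x) = 0 \ \forall f \in I\}$, is an example of an algebraic set. Given $I$, we describe a method to produce a nested sequence of convex relaxations of the closure of $\mathrm{conv}(V_{\mathbb{R}}(I))$, the convex hull of $V_{\mathbb{R}}(I)$, called the theta bodies of $I$. The $k$-th theta body $\mathrm{TH}_k(I)$ is obtained as the projection of a spectrahedron (the feasible region of a semidefinite program), and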

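$$\mathrm{TH}_1(I) \supseteq \mathrm{TH}_2(I) \supseteq \cdots \supseteq \mathrm{TH}_k(I) \supseteq \mathrm{TH}_{k+1}(I) \supseteq \cdots \supseteq \mathrm{cl}(\mathrm{conv}(V_{\mathbb{R}}(I))).$$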

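Of special interest to us are real radical ideals. We define the real radical of an ideal $I$, denoted as $\sqrt[\mathbb{R}]{I}$, to be the set of all polynomials $f \in \mathbb{R}[x]$ such that $f^{2m} + \sum g_i^2 \in I$ for some $g_i \in \mathbb{R}[x]$ and $m \in \mathbb{N}$. We say that $I$ is a real radical ideal if $I = \sqrt[\mathbb{R}]{I}$. Given a set $S \subseteq \mathbb{R}^n$, the vanishing ideal of $S$, denoted as $I(S)$, is the set of all polynomials in $\mathbb{R}[x]$ that vanish on $S$. The Real Nullstellensatz (see Theorem 2.2) says that for any ideal $I$, $I(V_{\mathbb{R}}(I)) = \sqrt[\mathbb{R}]{I}$.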

The construction of theta bodies for arbitrary ideals was motivated by a problem posed by Lovász. In ShannonCapacity () Lovász constructed the theta body of a graph, a convex relaxation of the stable set polytope of a graph which was shown later to have a description in terms of semidefinite programming. An important result in this context is that the theta body of a graph coincides with the stable set polytope of the graph if and only if the graph is perfect. Lovász observed that the theta body of a graph could be described in terms of sums of squares of real polynomials modulo the ideal of polynomials that vanish on the incidence vectors of stable sets. This observation naturally suggests the definition of a theta body for any ideal in $\mathbb{R}[x]$. In fact, an easy extension of his observation leads to a hierarchy of theta bodies for all ideals as above. In (Lovasz, , Problem 8.3), Lovász asked for a characterization of all ideals whose first theta body coincides with $\mathrm{conv}(V_{\mathbb{R}}(I))$, which was the starting point of our work. For defining ideals of finite point sets we answer this question in Section 4.

This article is organized as follows. In Section 2 we define theta bodies of an ideal in $\mathbb{R}[x]$ in terms of sums of squares polynomials. For a general ideal $I$, we get that $\mathrm{TH}_k(I)$ contains the closure of the projection of a spectrahedron which is described via combinatorial moment matrices from $I$. When the ideal $I$ is real radical, we show that $\mathrm{TH}_k(I)$ coincides with the closure of the projected spectrahedron, and when $I$ is the defining ideal of a set of points in $\{0,1\}^n$, the closure is not needed. We establish a general relationship between the theta body sequence of an ideal $I$ and that of its real radical ideal $\sqrt[\mathbb{R}]{I}$.

Section 3 gives two examples of the construction described in Section 2. As our first example, we look at the stable sets in a graph and describe the hierarchy of theta bodies that result. The first member of the hierarchy is Lovász’s theta body of a graph. This hierarchy converges to the stable set polytope in finitely many steps, as is always the case when we start with a finite set of real points. The second example is a cardioid in the plane, in which case the algebraic set that is being convexified is infinite.

In Section 4 we discuss convergence issues for the theta body sequence. When $V_{\mathbb{R}}(I)$ is compact, the theta body sequence is guaranteed to converge to the closure of $\mathrm{conv}(V_{\mathbb{R}}(I))$ asymptotically. We prove that when $V_{\mathbb{R}}(I)$ is finite, $\mathrm{TH}_k(I) = \mathrm{conv}(V_{\mathbb{R}}(I))$ for some finite $k$. In the case of finite convergence, it is useful to know the specific value of $k$ for which $\mathrm{TH}_k(I) = \mathrm{conv}(V_{\mathbb{R}}(I))$. This notion is called exactness and we characterize exactness for the first theta body when the set to be convexified is finite. There are examples in which the theta body sequence does not converge to $\mathrm{cl}(\mathrm{conv}(V_{\mathbb{R}}(I)))$. While a full understanding of when convergence occurs is still elusive, we describe one obstruction to finite convergence in terms of certain types of singularities of $V_{\mathbb{R}}(I)$.

The last section gives more examples of theta bodies and their exactness. In particular we consider cuts in a graph and polytopes coming from the graph isomorphism question.

The core of this paper is based on results from GPT () and GLPT () which are presented here with a greater emphasis on geometry, avoiding some of the algebraic language in the original results. Theorems 2.5, 4.2 and their corollaries are new while Theorem 4.4 is from GouNet (). The application of theta bodies to polytopes that arise in the graph isomorphism question is taken from DHMO ().

Acknowledgments. Both authors were partially supported by the NSF Focused Research Group grant DMS-0757371. J. Gouveia was also partially supported by Fundação para a Ciência e Tecnologia and R.R. Thomas by the Robert R. and Elaine K. Phelps Endowed Professorship.

2 Theta bodies of polynomial ideals

To describe the convex hull of an algebraic set, we start with a simple observation about any convex hull. Given a set $S \subseteq \mathbb{R}^n$, $\mathrm{cl}(\mathrm{conv}(S))$, the closure of $\mathrm{conv}(S)$, is the intersection of all closed half-spaces containing $S$:

$$\mathrm{cl}(\mathrm{conv}(S)) = \{p \in \mathbb{R}^n : l(p) \geq 0 \ \ \forall\, l \in \mathbb{R}[x]_1 \text{ s.t. } l|_S \geq 0\}.$$

From a computational point of view, this observation is useless, as the right hand side is hopelessly cumbersome. However, if $S$ is the zero set of an ideal $I \subseteq \mathbb{R}[x]$, we can define nice relaxations of the above intersection of infinitely many half-spaces using a classical strengthening of the nonnegativity condition $l|_S \geq 0$. We describe these relaxations in this section. In Section 2.1 we introduce our method for arbitrary ideals in $\mathbb{R}[x]$. In Section 2.2 we specialize to real radical ideals, which occur frequently in applications, and show that in this case, much stronger results hold than for general ideals.

2.1 General ideals

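Recall that given an ideal $I \subseteq \mathbb{R}[x]$, two polynomials $f$ and $g$ are defined to be congruent modulo $I$, written as $f \equiv g$ mod $I$, if $f - g \in I$. The relation $\equiv$ is an equivalence relation on $\mathbb{R}[x]$ and the equivalence class of a polynomial $f$ is denoted as $f + I$. The set of all congruence classes of polynomials modulo $I$ is denoted as $\mathbb{R}[x]/I$ and this set is both a ring and an $\mathbb{R}$-vector space: given $f, g \in \mathbb{R}[x]$ and $\lambda \in \mathbb{R}$, $(f + I) + (g + I) = (f + g) + I$, $\lambda (f + I) = \lambda f + I$ and $(f + I)(g + I) = fg + I$. Note that if $f \equiv g$ mod $I$, then $f(s) = g(s)$ for all $s \in V_{\mathbb{R}}(I)$.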

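We will say that a polynomial $h \in \mathbb{R}[x]$ is a sum of squares (sos) modulo $I$ if there exist polynomials $g_1, \ldots, g_r \in \mathbb{R}[x]$ such that $h \equiv \sum_{j=1}^r g_j^2$ mod $I$. If $h$ is sos modulo $I$ then we immediately have that $h$ is nonnegative on $V_{\mathbb{R}}(I)$. In practice, it is important to control the degree of the $g_j$ in the sos representation of $h$, so we will say that $h$ is $k$-sos mod $I$ if $g_1, \ldots, g_r \in \mathbb{R}[x]_k$, where $\mathbb{R}[x]_k$ is the set of polynomials in $\mathbb{R}[x]$ of degree at most $k$. The set of polynomials that are $k$-sos mod $I$, considered as a subset of $\mathbb{R}[x]_{2k}/I$, will be denoted as $\Sigma_k(I)$.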

Definition 1

Let $I \subseteq \mathbb{R}[x]$ be a polynomial ideal. We define the $k$-th theta body of $I$ to be the set

$$\mathrm{TH}_k(I) := \{p \in \mathbb{R}^n : l(p) \geq 0 \text{ for all } l \in \mathbb{R}[x]_1 \text{ s.t. } l \text{ is } k\text{-sos mod } I\}.$$

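Since any polynomial $f$ that is sos mod $I$ satisfies $f \geq 0$ on $V_{\mathbb{R}}(I)$, and $\mathrm{TH}_k(I)$ is closed, we have $\mathrm{cl}(\mathrm{conv}(V_{\mathbb{R}}(I))) \subseteq \mathrm{TH}_k(I)$. Also, $\mathrm{TH}_{k+1}(I) \subseteq \mathrm{TH}_k(I)$, since as $k$ increases, we are potentially intersecting more half-spaces. Thus, the theta bodies of $I$ create a nested sequence of closed convex relaxations of $\mathrm{conv}(V_{\mathbb{R}}(I))$.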

We now present a related semidefinite programming relaxation of $\mathrm{conv}(V_{\mathbb{R}}(I))$ using the theory of moments.

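For $I \subseteq \mathbb{R}[x]$ an ideal, let $\mathcal{B}$ be a basis for the $\mathbb{R}$-vector space $\mathbb{R}[x]/I$. We assume that the polynomials representing the elements of $\mathcal{B}$ are minimal degree representatives of their equivalence classes $f + I$. This makes the set $\mathcal{B}_k := \{f + I \in \mathcal{B} : \deg(f) \leq k\}$ well-defined. In this paper we will restrict ourselves to a special type of basis $\mathcal{B}$.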

Definition 2

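Let $I \subseteq \mathbb{R}[x]$ be an ideal. A basis $\mathcal{B} = \{f_0 + I, f_1 + I, \ldots\}$ of $\mathbb{R}[x]/I$ is a $\theta$-basis if it satisfies the following conditions:

1. $f_0(x) = 1$;

2. $f_i(x) = x_i$, for $i = 1, \ldots, n$;

3. $f_i$ is monomial for all $i$;

4. if $\deg(f_i), \deg(f_j) \leq k$ then $f_i f_j + I$ is in the real span of $\mathcal{B}_{2k}$.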

We will also always assume that $\mathcal{B}$ is ordered and that $\deg(f_i) \leq \deg(f_{i+1})$.

Using Gröbner basis theory, one can see that if $V_{\mathbb{R}}(I)$ is not contained in any proper affine space, then $\mathbb{R}[x]/I$ always has a $\theta$-basis. For instance, take the $f_i$ in $\mathcal{B}$ to be the standard monomials of an initial ideal of $I$ with respect to some total degree monomial ordering on $\mathbb{R}[x]$ (see for example CLO ()). The methods we describe below work with non-monomial bases of $\mathbb{R}[x]/I$ as explained in GPT (). We restrict to a $\theta$-basis in this survey for ease of exposition and since the main applications we will discuss only need this type of basis.

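Fix a $\theta$-basis $\mathcal{B} = \{f_i + I\}$ of $\mathbb{R}[x]/I$ and define $f_k[x]$ to be the column vector formed by all the elements of $\mathcal{B}_k$ in order. Then $f_k[x] f_k[x]^T$ is a square matrix indexed by $\mathcal{B}_k$ with $(i,j)$-entry equal to $f_i f_j + I$. By hypothesis, the entries of $f_k[x] f_k[x]^T$ lie in the $\mathbb{R}$-span of $\mathcal{B}_{2k}$. Let $\{\lambda^l_{i,j}\}$ be the set of real numbers such that $f_i f_j + I = \sum_{f_l + I \in \mathcal{B}_{2k}} \lambda^l_{i,j} f_l + I$. We now linearize $f_k[x] f_k[x]^T$ by replacing each element of $\mathcal{B}$ by a new variable.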

Definition 3

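Let $I$, $\mathcal{B}$ and $\{\lambda^l_{i,j}\}$ be as above. Let $y$ be a real vector indexed by $\mathcal{B}_{2k}$. The $k$-th combinatorial moment matrix $M_{\mathcal{B}_k}(y)$ of $I$ is the real matrix indexed by $\mathcal{B}_k$ whose $(i,j)$-entry is

$$[M_{\mathcal{B}_k}(y)]_{i,j} = \sum_{f_l + I \in \mathcal{B}_{2k}} \lambda^l_{i,j}\, y_l.$$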

Example 1

Let be the ideal . Then a -basis for would be . Let us construct the matrix . Consider the vector , then

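$$f_1[x] f_1[x]^T = \begin{pmatrix} 1 & x_1 & x_2 \\ x_1 & x_1^2 & x_1 x_2 \\ x_2 & x_1 x_2 & x_2^2 \end{pmatrix} \equiv \begin{pmatrix} 1 & x_1 & x_2 \\ x_1 & 2x_2 - x_1 & 0 \\ x_2 & 0 & x_2^2 \end{pmatrix} \ \mathrm{mod}\ I.$$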

We now linearize the resulting matrix using $y = (y_0, y_1, y_2, \ldots)$, where $y_i$ indexes the $i$th element of $\mathcal{B}$, and get

$$M_{\mathcal{B}_1}(y) = \begin{pmatrix} y_0 & y_1 & y_2 \\ y_1 & 2y_2 - y_1 & 0 \\ y_2 & 0 & y_3 \end{pmatrix}.$$

The matrix $M_{\mathcal{B}_k}(y)$ will allow us to define a relaxation of $\mathrm{cl}(\mathrm{conv}(V_{\mathbb{R}}(I)))$ that will essentially be the theta body $\mathrm{TH}_k(I)$.

Definition 4

Let $I \subseteq \mathbb{R}[x]$ be an ideal and $\mathcal{B}$ a $\theta$-basis of $\mathbb{R}[x]/I$. Then

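$$Q_{\mathcal{B}_k}(I) := \pi_{\mathbb{R}^n}\left(\{y \in \mathbb{R}^{\mathcal{B}_{2k}} : y_0 = 1,\ M_{\mathcal{B}_k}(y) \succeq 0\}\right)$$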

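where $\pi_{\mathbb{R}^n}$ is the projection of $y \in \mathbb{R}^{\mathcal{B}_{2k}}$ to $(y_1, \ldots, y_n)$, its coordinates indexed by $x_1 + I, \ldots, x_n + I$.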

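The set $Q_{\mathcal{B}_k}(I)$ is a relaxation of $\mathrm{conv}(V_{\mathbb{R}}(I))$. Pick $s \in V_{\mathbb{R}}(I)$ and define $y^s := (f_i(s) : f_i + I \in \mathcal{B}_{2k})$. Then $y^s_0 = 1$, $M_{\mathcal{B}_k}(y^s) = f_k[s] f_k[s]^T \succeq 0$ and $\pi_{\mathbb{R}^n}(y^s) = s$. We now show the connection between $Q_{\mathcal{B}_k}(I)$ and $\mathrm{TH}_k(I)$.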

Theorem 2.1

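For any ideal $I \subseteq \mathbb{R}[x]$ and any $\theta$-basis $\mathcal{B}$ of $\mathbb{R}[x]/I$, we get $\mathrm{cl}(Q_{\mathcal{B}_k}(I)) \subseteq \mathrm{TH}_k(I)$.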

Proof

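We start with a general observation concerning $k$-sos polynomials. Suppose $h \equiv \sum_{j=1}^r g_j^2$ mod $I$ where $g_j \in \mathbb{R}[x]_k$. Each $g_j$ can be identified with a real row vector $\hat{g}_j$ such that $g_j(x) \equiv \hat{g}_j f_k[x]$ mod $I$, and so $g_j^2 \equiv f_k[x]^T \hat{g}_j^T \hat{g}_j f_k[x]$ mod $I$. Denoting by $P_h$ the positive semidefinite matrix $\sum_{j=1}^r \hat{g}_j^T \hat{g}_j$ we get $h(x) \equiv f_k[x]^T P_h f_k[x]$ mod $I$. In general, $P_h$ is not unique. Let $\hat{h}$ be the real row vector such that $h(x) \equiv \hat{h} f_{2k}[x]$ mod $I$. Then check that for any column vector $y \in \mathbb{R}^{\mathcal{B}_{2k}}$, $\hat{h} y = P_h \cdot M_{\mathcal{B}_k}(y)$, where $\cdot$ stands for the usual entry-wise inner product of matrices.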

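Suppose $p \in Q_{\mathcal{B}_k}(I)$, and $y \in \mathbb{R}^{\mathcal{B}_{2k}}$ such that $y_0 = 1$, $M_{\mathcal{B}_k}(y) \succeq 0$ and $\pi_{\mathbb{R}^n}(y) = p$. Since $\mathrm{TH}_k(I)$ is closed, we just have to show that for any $h \in \mathbb{R}[x]_1$ that is $k$-sos modulo $I$, $h(p) \geq 0$. Since $y_0 = 1$ and $\pi_{\mathbb{R}^n}(y) = p$, $h(p) = \hat{h} y = P_h \cdot M_{\mathcal{B}_k}(y) \geq 0$ since $M_{\mathcal{B}_k}(y) \succeq 0$ and $P_h \succeq 0$.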
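The identity $\hat{h} y = P_h \cdot M_{\mathcal{B}_k}(y)$ used in the proof can be checked by hand on a toy case. The sketch below is an illustration chosen here, not an example from the text: it assumes the one-variable ideal $I = \langle x^2 - x \rangle$ with $\theta$-basis $\{1 + I, x + I\}$ and $k = 1$, and all function names are hypothetical.

```python
# Toy check of  h_hat . y = <P_h, M_{B_1}(y)>  for the assumed ideal I = <x^2 - x>.
# Assumed theta-basis: B = {1 + I, x + I}; since x^2 = x mod I, B_2 = B_1 = B.

def moment_matrix(y):
    """M_{B_1}(y): linearize [[1, x], [x, x^2]] = [[1, x], [x, x]] mod I via 1 -> y0, x -> y1."""
    y0, y1 = y
    return [[y0, y1], [y1, y1]]

def frobenius(P, M):
    """Entry-wise inner product of two square matrices of equal size."""
    n = len(P)
    return sum(P[i][j] * M[i][j] for i in range(n) for j in range(n))

# h = (1 - x)^2 = 1 - 2x + x^2 = 1 - x mod I: a linear polynomial that is 1-sos mod I.
g_hat = [1, -1]                                   # coefficients of g = 1 - x in the basis (1, x)
P_h = [[a * b for b in g_hat] for a in g_hat]     # P_h = g_hat^T g_hat, positive semidefinite
h_hat = [1, -1]                                   # coefficients of h = 1 - x mod I

def identity_holds(y):
    """Compare h_hat . y with <P_h, M_{B_1}(y)> for a given vector y = (y0, y1)."""
    lhs = sum(c * v for c, v in zip(h_hat, y))
    rhs = frobenius(P_h, moment_matrix(y))
    return lhs == rhs

def in_Q(y1):
    """y1 lies in Q_{B_1}(I) iff [[1, y1], [y1, y1]] is PSD: diagonal and determinant nonnegative."""
    return y1 >= 0 and y1 - y1 * y1 >= 0
```

In this toy case the positive semidefiniteness test reduces to $0 \leq y_1 \leq 1$, so the relaxation is the interval $[0,1] = \mathrm{conv}(\{0,1\})$.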

In the next subsection we will see that when $I$ is a real radical ideal, $\mathrm{cl}(Q_{\mathcal{B}_k}(I))$ coincides with $\mathrm{TH}_k(I)$.

The idea of computing approximations of the convex hull of a semialgebraic set in $\mathbb{R}^n$ via the theory of moments and the dual theory of sums of squares polynomials is due to Lasserre Lasserre1 (); Lasserre2 (); Lasserre3 () and Parrilo Parrilo:phd (); Parrilo:spr (). In Lasserre's original setup, the moment relaxations obtained are described via moment matrices that rely explicitly on the polynomials defining the semialgebraic set. In Lasserre2 (), the focus is on semialgebraic subsets of $\{0,1\}^n$ where the equations $x_i^2 - x_i = 0$ are used to simplify computations. This idea was generalized in Laurent () to arbitrary real algebraic varieties and studied in detail for zero-dimensional ideals. Laurent showed that the moment matrices needed in the approximations of $\mathrm{conv}(V_{\mathbb{R}}(I))$ could be computed modulo the ideal defining the variety. This greatly reduces the size of the matrices needed, and removes the dependence of the computation on the specific presentation of the ideal in terms of generators. The construction of the set $Q_{\mathcal{B}_k}(I)$ is taken from Laurent ().

Since an algebraic set is also a semialgebraic set (defined by equations), we could apply Lasserre’s method to $V_{\mathbb{R}}(I)$ to get a sequence of approximations of $\mathrm{conv}(V_{\mathbb{R}}(I))$. The results are essentially the same as theta bodies if the generators of $I$ are picked carefully. However, by restricting ourselves to real varieties, instead of allowing inequalities, and concentrating on the sum of squares description of theta bodies, as opposed to the moment matrix approach, we can prove some interesting theoretical results that are not covered by the general theory for Lasserre relaxations. Many of the usual results for Lasserre relaxations rely either on the existence of a non-empty interior for the semialgebraic set to be convexified, which is never the case for a real variety, or on compactness of the semialgebraic set, which we do not want to impose.

2.2 Real radical ideals

Recall from the introduction that given an ideal $I \subseteq \mathbb{R}[x]$ its real radical is the ideal

$$\sqrt[\mathbb{R}]{I} = \left\{f \in \mathbb{R}[x] : f^{2m} + \sum g_i^2 \in I,\ m \in \mathbb{N},\ g_i \in \mathbb{R}[x]\right\}.$$

The importance of this ideal arises from the Real Nullstellensatz.

Theorem 2.2 (Bounded degree Real Nullstellensatz Lombardi ())

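Let $I \subseteq \mathbb{R}[x]$ be an ideal. Then there exists a function $\Gamma : \mathbb{N} \rightarrow \mathbb{N}$ such that, for all polynomials $f$ of degree at most $d$ that vanish on $V_{\mathbb{R}}(I)$, $f^{2m} + \sum g_i^2 \in I$ for some polynomials $g_i$ such that $\deg(g_i)$ and $m$ are all bounded above by $\Gamma(d)$. In particular $I(V_{\mathbb{R}}(I)) = \sqrt[\mathbb{R}]{I}$.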
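A small worked instance may help fix the definition of the real radical and the identity $I(V_{\mathbb{R}}(I)) = \sqrt[\mathbb{R}]{I}$. The ideal below is an illustration chosen here and does not appear in the text.

```latex
\begin{align*}
  I &= \langle x_1^2 + x_2^2 \rangle \subset \mathbb{R}[x_1,x_2],
    & V_{\mathbb{R}}(I) &= \{(0,0)\}, \\
  % take m = 1 and g_1 = x_2 in the definition of the real radical
  x_1^{2} + x_2^{2} &\in I
    & &\Longrightarrow\ x_1 \in \sqrt[\mathbb{R}]{I}
      \quad (\text{and, symmetrically, } x_2 \in \sqrt[\mathbb{R}]{I}), \\
  \sqrt[\mathbb{R}]{I} &= \langle x_1, x_2 \rangle = I(V_{\mathbb{R}}(I)) \supsetneq I .
\end{align*}
```

Here $I$ is not real radical: the polynomials $x_1$ and $x_2$ vanish on the real variety but do not lie in $I$.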

When $I$ is a real radical ideal, the sums of squares approach and the moment approach for theta bodies of $I$ coincide, and we get a stronger version of Theorem 2.1.

Theorem 2.3

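For any $I \subseteq \mathbb{R}[x]$ real radical and any $\theta$-basis $\mathcal{B}$ of $\mathbb{R}[x]/I$, $\mathrm{cl}(Q_{\mathcal{B}_k}(I)) = \mathrm{TH}_k(I)$.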

Proof

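By Theorem 2.1 we just have to show that $\mathrm{TH}_k(I) \subseteq \mathrm{cl}(Q_{\mathcal{B}_k}(I))$. By (PowSchei, , Prop 2.6), the set $\Sigma_k(I)$ of elements of $\mathbb{R}[x]_{2k}/I$ that are $k$-sos modulo $I$ is closed when $I$ is real radical. Let $f \in \mathbb{R}[x]_1$ be nonnegative on $\mathrm{cl}(Q_{\mathcal{B}_k}(I))$ and suppose $f + I \notin \Sigma_k(I)$. By the separation theorem, we can find $y \in \mathbb{R}^{\mathcal{B}_{2k}}$ such that $\hat{f} y < 0$ but $\hat{g} y \geq 0$ for all $g + I \in \Sigma_k(I)$, or equivalently, $P_g \cdot M_{\mathcal{B}_k}(y) \geq 0$. Since $P_g$ runs over all possible positive semidefinite matrices of size $|\mathcal{B}_k| \times |\mathcal{B}_k|$, and the cone of positive semidefinite matrices of a fixed size is self-dual, we have $M_{\mathcal{B}_k}(y) \succeq 0$. Let $r$ be any real number and consider $g_r + I := (f + r)^2 + I \in \Sigma_k(I)$. Then $\hat{g}_r y = \widehat{f^2}\, y + 2r \hat{f} y + r^2 y_0 \geq 0$. Since $\hat{f} y < 0$ and $r$ can be arbitrarily large, $y_0$ is forced to be positive. So we can scale $y$ to have $y_0 = 1$, so that $\pi_{\mathbb{R}^n}(y) \in Q_{\mathcal{B}_k}(I)$. By hypothesis we then have $f(\pi_{\mathbb{R}^n}(y)) \geq 0$, but by the linearity of $f$, $f(\pi_{\mathbb{R}^n}(y)) = \hat{f} y < 0$, which is a contradiction, so $f$ must be $k$-sos modulo $I$. This implies that any linear inequality valid for $\mathrm{cl}(Q_{\mathcal{B}_k}(I))$ is valid for $\mathrm{TH}_k(I)$, which proves $\mathrm{TH}_k(I) \subseteq \mathrm{cl}(Q_{\mathcal{B}_k}(I))$.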

We now have two ways of looking at the relaxations $\mathrm{TH}_k(I)$: one by a characterization of the linear inequalities that hold on them and the other by a characterization of the points in them. These two perspectives complement each other. The inequality version is useful to prove (or disprove) convergence of theta bodies to $\mathrm{cl}(\mathrm{conv}(V_{\mathbb{R}}(I)))$ while the description via semidefinite programming is essential for practical computations. All the applications we consider use real radical ideals, in which case the two descriptions of $\mathrm{TH}_k(I)$ coincide up to closure. In some cases, as we now show, the closure can be omitted.

Theorem 2.4

Let $I \subseteq \mathbb{R}[x]$ be a real radical ideal and $k$ be a positive integer. If there exists some linear polynomial $g$ that is $k$-sos modulo $I$ with a representing matrix $P_g$ that is positive definite, then $Q_{\mathcal{B}_k}(I)$ is closed and equals $\mathrm{TH}_k(I)$.

Proof

For this proof we will use a standard result from convex analysis: Let $V$ and $W$ be finite dimensional vector spaces, $H \subseteq W$ be a cone and $A : V \rightarrow W$ be a linear map such that $A(V) \cap \mathrm{int}(H) \neq \emptyset$. Then $(A^{-1} H)' = A'(H')$ where $A'$ is the adjoint operator to $A$. In particular, $A'(H')$ is closed in $V'$, the dual vector space to $V$. The proof of this result follows, for example, from Corollary 3.3.13 in BorweinLewis () by setting $K = V$.

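Throughout the proof we will identify $\mathbb{R}[x]_l/I$, for all $l$, with the space $\mathbb{R}^{\mathcal{B}_l}$ by simply considering the coordinates in the basis $\mathcal{B}_l$. Consider the inclusion map $A : \mathbb{R}^{\mathcal{B}_1} \rightarrow \mathbb{R}^{\mathcal{B}_{2k}}$, and let $H$ be the cone in $\mathbb{R}^{\mathcal{B}_{2k}}$ of polynomials that can be written as a sum of squares of polynomials of degree at most $k$. The interior of this cone is precisely the set of sums of squares $g$ with a positive definite representing matrix $P_g$. Our hypothesis then simply states that $A(\mathbb{R}^{\mathcal{B}_1}) \cap \mathrm{int}(H) \neq \emptyset$, which implies by the above result that $A'(H')$ is closed. Note that $H'$ is the set of elements $y$ in $\mathbb{R}^{\mathcal{B}_{2k}}$ such that $\hat{h} y$ is nonnegative for all $h \in H$ and this is the same as demanding $P_h \cdot M_{\mathcal{B}_k}(y) \geq 0$ for all positive semidefinite matrices $P_h$, which is equivalent to demanding that $M_{\mathcal{B}_k}(y) \succeq 0$. So $A'(H')$ is just the set $\{(y_0, y_1, \ldots, y_n) : y \in \mathbb{R}^{\mathcal{B}_{2k}},\ M_{\mathcal{B}_k}(y) \succeq 0\}$ and by intersecting it with the plane $\{y_0 = 1\}$ we get $Q_{\mathcal{B}_k}(I)$, which is therefore closed. By Theorem 2.3 it then equals $\mathrm{TH}_k(I)$.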

One very important case where the conditions of Theorem 2.4 hold is when $I$ is the vanishing ideal of a set of $0/1$ points. This is precisely the case of most interest in combinatorial optimization.

Corollary 1

If $S \subseteq \{0,1\}^n$ and $I = I(S)$, then $Q_{\mathcal{B}_k}(I) = \mathrm{TH}_k(I)$.

Proof

It is enough to show that there is a linear polynomial $g$ such that $g \equiv f_k[x]^T A f_k[x]$ mod $I$ for a positive definite matrix $A$ and some $\theta$-basis $\mathcal{B}$ of $\mathbb{R}[x]/I$.

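Let $A$ be the matrix

$$A = \begin{pmatrix} l+1 & c^t \\ c & D \end{pmatrix},$$

where $l = |\mathcal{B}_k| - 1$, $c \in \mathbb{R}^l$ is the vector with all entries equal to $-2$, and $D \in \mathbb{R}^{l \times l}$ is the diagonal matrix with all diagonal entries equal to $4$. This matrix is positive definite since $D$ is positive definite and its Schur complement $(l+1) - c^t D^{-1} c = 1$ is positive.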

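Since $x_i^2 \equiv x_i$ mod $I$ for $i = 1, \ldots, n$ and $\mathcal{B}$ is a monomial basis, for any $f + I \in \mathcal{B}$, $f^2 \equiv f$ mod $I$. Therefore, the constant (linear polynomial) $l + 1 \equiv f_k[x]^T A f_k[x]$ mod $I$.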
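The cancellation behind this argument can be verified numerically. The sketch below assumes the choice $c = (-2, \ldots, -2)$ and $D = 4 \cdot \mathrm{Id}$ (one choice for which the computation goes through) and uses hypothetical helper names. It checks that the quadratic form equals $l + 1$ whenever the non-constant basis monomials take values in $\{0, 1\}$, which is what $f^2 \equiv f$ mod $I$ encodes at points of $S$.

```python
from itertools import product

def build_A(l):
    """(l+1) x (l+1) matrix [[l+1, c^T], [c, D]] with c = -2 * ones and D = 4 * identity (assumed values)."""
    A = [[0] * (l + 1) for _ in range(l + 1)]
    A[0][0] = l + 1
    for i in range(1, l + 1):
        A[0][i] = A[i][0] = -2
        A[i][i] = 4
    return A

def quad_form(A, f):
    """Evaluate f^T A f for a vector f given as a tuple or list."""
    n = len(f)
    return sum(f[i] * A[i][j] * f[j] for i in range(n) for j in range(n))

def schur_complement(l):
    """(l+1) - c^T D^{-1} c for c = -2 * ones in R^l and D = 4 * identity."""
    return (l + 1) - l * ((-2) * (-2)) / 4

def constant_on_01_values(l):
    """Check f^T A f == l + 1 for every f = (1, f_1, ..., f_l) with all f_i in {0, 1}."""
    A = build_A(l)
    return all(quad_form(A, (1,) + v) == l + 1 for v in product((0, 1), repeat=l))
```

The check ranges over all $0/1$ value patterns, which is more than the patterns actually realized by monomials on $S$, so it is sufficient for the claim but not tight.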

The assumption that $I$ is real radical seems very strong. However, we now establish a relationship between the theta bodies of an ideal and those of its real radical, showing that $\sqrt[\mathbb{R}]{I}$ determines the asymptotic behavior of $\mathrm{TH}_k(I)$. We start by proving a small technical lemma.

Lemma 1

Given an ideal and a polynomial of degree such that for some , the polynomial for every .

Proof

First, note that for any $l \geq m$ and any $\xi > 0$ we have

$$f^l + \xi = \frac{1}{\xi}\left(\frac{f^l}{2} + \xi\right)^2 + \frac{1}{4\xi}\,(-f^{2m})\, f^{2l-2m}$$

and so is -sos modulo . For , define the polynomial to be the truncation of the Taylor series of at degree i.e.,

$$p(x) = \sum_{n=0}^{2m-1} \frac{(-1)^n (2n)!}{(n!)^2 (1-2n)\, \sigma^{n-1}}\, x^n.$$

When we square $p(x)$ we get a polynomial whose terms of degree at most $2m-1$ are exactly the first terms of $\sigma^2 + 4\sigma x$, and by checking the signs of each of the coefficients of $p(x)$ we can see that the remaining terms of $p(x)^2$ will be negative for odd powers and positive for even powers. Composing $p$ with $f$ we get

$$(p(f(x)))^2 = \sigma^2 + 4\sigma f(x) + \sum_{i=0}^{m-1} a_i f(x)^{2m+2i} - \sum_{i=0}^{m-2} b_i f(x)^{2m+2i+1}$$

where the $a_i$ and $b_i$ are positive numbers. In particular

$$\sigma^2 + 4\sigma f(x) = p(f(x))^2 + \sum_{i=0}^{m-1} a_i f(x)^{2i}\left(-f(x)^{2m}\right) + \sum_{i=0}^{m-2} b_i f(x)^{2m+2i+1}.$$

On the right hand side of this equality the only term who is not immediately a sum of squares is the last one, but by the above remark, since , by adding an arbitrarily small positive number, it becomes -sos modulo . By checking the degrees throughout the sum, one can see that for any , is -sos modulo . Since and are arbitrary positive numbers we get the desired result.

Lemma 1, together with the Real Nullstellensatz, gives us an important relationship between the theta body hierarchy of an ideal and that of its real radical.

Theorem 2.5

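Fix an ideal $I$. Then, there exists a function $\Psi : \mathbb{N} \rightarrow \mathbb{N}$ such that $\mathrm{TH}_{\Psi(k)}(I) \subseteq \mathrm{TH}_k(\sqrt[\mathbb{R}]{I})$ for all $k$.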

Proof

Fix some , and let be a linear polynomial that is -sos modulo . This means that there exists some sum of squares such that . Therefore, by the Real Nullstellensatz (Theorem 2.2), for , where depends only on the ideal . By Lemma 1 it follows that is -sos modulo for every . Let . Then we have that is -sos modulo for all . This means that for every , the inequality is valid on for linear and -sos modulo . Therefore, is also valid on , and hence, .

Note that the function $\Psi$ of Theorem 2.5, whose existence we just proved, can be notoriously bad in practice, as it can be much higher than necessary. The best theoretical bounds on $\Psi$ come from quantifier elimination and so increase very fast. However, if we are only interested in convergence of the theta body sequence, as is often the case, Theorem 2.5 tells us that we might as well assume that our ideals are real radical.

3 Computing theta bodies

In this section we illustrate the computation of theta bodies on two examples. In the first example, $V_{\mathbb{R}}(I)$ is finite and hence $\mathrm{conv}(V_{\mathbb{R}}(I))$ is a polytope, while in the second example $V_{\mathbb{R}}(I)$ is infinite. Convex approximations of polytopes via linear or semidefinite programming have received a great deal of attention in combinatorial optimization where the typical problem is to maximize a linear function $c \cdot x$ as $x$ varies over the characteristic vectors $\chi^T \in \{0,1\}^n$ of some combinatorial objects $T$. Since this discrete optimization problem is equivalent to the linear program in which one maximizes $c \cdot x$ over all $x \in \mathrm{conv}(\{\chi^T\})$ and $\mathrm{conv}(\{\chi^T\})$ is usually unavailable, one resorts to approximations of this polytope over which one can optimize in a reasonable way. A combinatorial optimization problem that has been tested heavily in this context is the maximum stable set problem in a graph, which we use as our first example. In ShannonCapacity (), Lovász constructed the theta body of a graph, which was the first example of a semidefinite programming relaxation of a combinatorial optimization problem. The hierarchy of theta bodies for an arbitrary polynomial ideal is a natural generalization of Lovász’s theta body for a graph, which explains its name. Our work on theta bodies was initiated by two problems that were posed by Lovász in (Lovasz, , Problems 8.3 and 8.4) suggesting this generalization and its properties.

3.1 The maximum stable set problem

Let $G = ([n], E)$ be an undirected graph with vertex set $[n] = \{1, \ldots, n\}$ and edge set $E$. A stable set in $G$ is a set $U \subseteq [n]$ such that for all $i, j \in U$, $\{i,j\} \notin E$. The maximum stable set problem seeks the stable set of largest cardinality in $G$, the size of which is the stability number of $G$, denoted as $\alpha(G)$.

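The maximum stable set problem can be modeled as follows. For each stable set $U \subseteq [n]$, let $\chi^U \in \{0,1\}^n$ be its characteristic vector defined as $(\chi^U)_i = 1$ if $i \in U$ and $(\chi^U)_i = 0$ otherwise. Let $S_G \subseteq \{0,1\}^n$ be the set of characteristic vectors of all stable sets in $G$. Then $\mathrm{STAB}(G) := \mathrm{conv}(S_G)$ is called the stable set polytope of $G$ and the maximum stable set problem is, in theory, the linear program $\max\{\sum_{i=1}^n x_i : x \in \mathrm{STAB}(G)\}$ with optimal value $\alpha(G)$. However, $\mathrm{STAB}(G)$ is not known a priori, and so one resorts to relaxations of it over which one can optimize $\sum_{i=1}^n x_i$.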

In order to compute theta bodies for this example, we first need to view $S_G$ as the real variety of an ideal. The natural ideal to take is $I(S_G)$, the vanishing ideal of $S_G$. It can be checked that this real radical ideal is

$$I_G := \langle x_i^2 - x_i \ \ \forall\, i \in [n],\ \ x_i x_j \ \ \forall\, \{i,j\} \in E \rangle \subset \mathbb{R}[x_1, \ldots, x_n].$$

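For $U \subseteq [n]$, let $x^U := \prod_{i \in U} x_i$. From the generators of $I_G$ it follows that if $f \in \mathbb{R}[x]$, then $f \equiv g$ mod $I_G$ where $g$ is in the $\mathbb{R}$-span of the set of monomials $\{x^U : U \text{ is a stable set in } G\}$. Check that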

$$\mathcal{B} := \{x^U + I_G : U \text{ stable set in } G\}$$

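is a $\theta$-basis of $\mathbb{R}[x]/I_G$ containing $1 + I_G, x_1 + I_G, \ldots, x_n + I_G$. This implies that $\mathcal{B}_k = \{x^U + I_G : U \text{ stable set in } G,\ |U| \leq k\}$, and for $x^U + I_G, x^{U'} + I_G \in \mathcal{B}_k$, their product is $x^{U \cup U'} + I_G$, which is $0 + I_G$ if $U \cup U'$ is not a stable set in $G$. This product formula allows us to compute $M_{\mathcal{B}_k}(y)$ where we index the element $x^U + I_G \in \mathcal{B}_k$ by the set $U$. Since $S_G \subseteq \{0,1\}^n$ and $I_G$ is real radical, by Corollary 1, we have that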

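$$\mathrm{TH}_k(I_G) = \left\{ y \in \mathbb{R}^n : \begin{array}{l} \exists\, M \succeq 0,\ M \in \mathbb{R}^{|\mathcal{B}_k| \times |\mathcal{B}_k|} \text{ such that} \\ M_{\emptyset\emptyset} = 1, \\ M_{\emptyset\{i\}} = M_{\{i\}\emptyset} = M_{\{i\}\{i\}} = y_i, \\ M_{UU'} = 0 \text{ if } U \cup U' \text{ is not stable in } G, \\ M_{UU'} = M_{WW'} \text{ if } U \cup U' = W \cup W' \end{array} \right\}.$$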

In particular, indexing the one element stable sets by the vertices of $G$,

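$$\mathrm{TH}_1(I_G) = \left\{ y \in \mathbb{R}^n : \begin{array}{l} \exists\, M \succeq 0,\ M \in \mathbb{R}^{(n+1) \times (n+1)} \text{ such that} \\ M_{00} = 1, \\ M_{0i} = M_{i0} = M_{ii} = y_i \ \ \forall\, i \in [n], \\ M_{ij} = 0 \ \ \forall\, \{i,j\} \in E \end{array} \right\}.$$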
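The description of $\mathrm{TH}_1(I_G)$ above can be tested on a concrete graph. The following sketch takes $G$ to be the 5-cycle and certifies that the point $y = (0.44, \ldots, 0.44)$ lies in $\mathrm{TH}_1(I_G)$ by exhibiting a positive definite matrix $M$ of the required form. The values $0.44$ and $0.268$ (the common entry placed on non-edges) and the function names are choices made here for illustration. Since every stable set of the 5-cycle has at most two vertices, $\sum_i x_i \leq 2$ is valid on the stable set polytope, yet this point has coordinate sum $2.2$, so for this graph the first theta body is strictly larger than the polytope.

```python
import math

EDGES = {(1, 2), (2, 3), (3, 4), (4, 5), (1, 5)}  # edges of the 5-cycle, stored with i < j

def pentagon_matrix(t, z):
    """Candidate 6x6 matrix for TH_1(I_G): M00 = 1, M0i = Mi0 = Mii = t, Mij = 0 on edges, z on non-edges."""
    M = [[0.0] * 6 for _ in range(6)]
    M[0][0] = 1.0
    for i in range(1, 6):
        M[0][i] = M[i][0] = M[i][i] = t
        for j in range(i + 1, 6):
            if (i, j) not in EDGES:
                M[i][j] = M[j][i] = z
    return M

def is_positive_definite(M, tol=1e-12):
    """Plain Cholesky factorization; returns False as soon as a pivot is not safely positive."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

# A positive definite M of the required shape certifies membership in TH_1(I_G).
certificate_found = is_positive_definite(pentagon_matrix(0.44, 0.268))
coordinate_sum = 5 * 0.44
```

A failed Cholesky test for one particular value of the non-edge entry, such as `pentagon_matrix(0.5, 0.25)`, only rules out that candidate matrix; deciding non-membership in general requires solving the semidefinite feasibility problem.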
Example 2

Let $G = ([5], E)$ be a pentagon. Then

$$I_G = \langle x_i^2 - x_i \ \ \forall\, i = 1, \ldots, 5,\ \ x_1 x_2,\ x_2 x_3,\ x_3 x_4,\ x_4 x_5,\ x_1 x_5 \rangle$$

and

$$\mathcal{B} = \{1, x_1, x_2, x_3, x_4, x_5, x_1 x_3, x_1 x_4, x_2 x_4, x_2 x_5, x_3 x_5\} + I_G.$$

Let $y \in \mathbb{R}^{11}$ be a vector whose coordinates are indexed by the elements of $\mathcal{B}$ in the given order. Then

 \textupTH1(IG)=⎧⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎨⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩y∈R5:∃y6,…,y10\textups.t.⎛⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ ⎜⎝1y1y2y3y4y5y1y10y6y70y20y20y8