# A refined error analysis for fixed-degree polynomial optimization over the simplex

Zhao Sun
Tilburg University, PO Box 90153, 5000 LE Tilburg
Tel.: +31-13-4663313
Fax: +31-13-4663280
email: z.sun@uvt.nl
###### Abstract

We consider the problem of minimizing a fixed-degree polynomial over the standard simplex. This problem is well known to be NP-hard, since it contains the maximum stable set problem in combinatorial optimization as a special case. In this paper, we revisit a known upper bound obtained by taking the minimum value on a regular grid, and a known lower bound based on Pólya’s representation theorem. More precisely, we consider the difference between these two bounds and we provide upper bounds for this difference in terms of the range of function values. Our results refine the known upper bounds in the quadratic and cubic cases, and they asymptotically refine the known upper bound in the general case.

###### Keywords:
Polynomial optimization over the simplex · Global optimization · Nonlinear optimization
Mathematics Subject Classification: 90C30 · 90C60

## 1 Introduction and preliminaries

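Consider the problem of minimizing a homogeneous polynomial $f$ of degree $d$ on the (standard) simplex

$$\Delta_n := \Big\{x \in \mathbb{R}^n_+ : \sum_{i=1}^n x_i = 1\Big\}.$$

That is, the global optimization problem:

$$\underline{f} := \min_{x \in \Delta_n} f(x), \quad \text{or} \quad \overline{f} := \max_{x \in \Delta_n} f(x). \tag{1}$$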

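Here we focus on the problem of computing the minimum $\underline{f}$ of $f$ over $\Delta_n$. This problem is well known to be NP-hard, as it contains the maximum stable set problem as a special case (when $f$ is quadratic). Indeed, given a graph $G = (V, E)$ with adjacency matrix $A$, Motzkin and Straus [MS] show that the stability number $\alpha(G)$ can be obtained by

$$\frac{1}{\alpha(G)} = \min_{x \in \Delta_{|V|}} x^T (I + A) x,$$

where $I$ denotes the identity matrix. Moreover, one can w.l.o.g. assume that $f$ is homogeneous. Indeed, if $f = \sum_{s=0}^{d} f_s$, where $f_s$ is homogeneous of degree $s$, then $\min_{x \in \Delta_n} f(x) = \min_{x \in \Delta_n} f'(x)$, setting $f' := \sum_{s=0}^{d} f_s \big(\sum_{i=1}^n x_i\big)^{d-s}$, which is homogeneous of degree $d$ and coincides with $f$ on $\Delta_n$.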
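The Motzkin-Straus identity can be checked by brute force on a tiny instance. The sketch below is only an illustration under stated assumptions: the graph (a 4-cycle), the helper names and the grid resolution `r = 4` are hypothetical choices, not taken from the paper. The resolution is chosen so that the minimizer $(1/2, 0, 1/2, 0)$ is a grid point, which makes the grid minimum coincide with the minimum over the simplex.

```python
from fractions import Fraction
from itertools import combinations


def compositions(total, parts):
    """Yield all tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


# Hypothetical test graph: the 4-cycle 0-1-2-3-0, edges stored as (i, j) with i < j.
n = 4
edges = {(0, 1), (1, 2), (2, 3), (0, 3)}


def stability_number():
    """Brute-force alpha(G): size of the largest vertex set containing no edge."""
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if not any(pair in edges for pair in combinations(subset, 2)):
                return size
    return 0


def motzkin_straus_form(x):
    """Evaluate x^T (I + A) x, where A is the adjacency matrix of the graph."""
    return sum(xi * xi for xi in x) + 2 * sum(x[i] * x[j] for i, j in edges)


r = 4  # grid resolution (assumption): the point (1/2, 0, 1/2, 0) lies on this grid
grid_min = min(motzkin_straus_form([Fraction(a, r) for a in alpha])
               for alpha in compositions(r, n))
alpha_G = stability_number()
print(alpha_G, grid_min)
```

For this graph the stability number is 2 and the quadratic form attains $1/2$ on the grid, in agreement with the identity.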

For problem (1), many approximation algorithms have been studied in the literature. In fact, when $f$ has fixed degree $d$, there is a polynomial time approximation scheme (PTAS) for this problem, see [BK02] for the case $d = 2$ and [KLP06, KLS13] for $d \ge 3$. For more results on its computational complexity, we refer to [EDK08, KHE08].

We consider the following two bounds for $\underline{f}$: an upper bound $f_{\Delta(n,r)}$ obtained by taking the minimum value on a regular grid and a lower bound $f^{(r-d)}_{\min}$ based on Pólya’s representation theorem. They both have been studied in the literature, see e.g. [BK02, KLP06, KLS13] for $f_{\Delta(n,r)}$ and [KLP06, SY13, YEA12] for $f^{(r-d)}_{\min}$. The two ranges $f_{\Delta(n,r)} - \underline{f}$ and $\underline{f} - f^{(r-d)}_{\min}$ have been studied separately and upper bounds for each of them have been shown in the above mentioned works.

In this paper, we study these two ranges at the same time. More precisely, we analyze the larger range $f_{\Delta(n,r)} - f^{(r-d)}_{\min}$ and provide upper bounds for it in terms of the range of function values $\overline{f} - \underline{f}$. Of course, upper bounds for the range $f_{\Delta(n,r)} - f^{(r-d)}_{\min}$ can be obtained by combining the known upper bounds for each of the two ranges $f_{\Delta(n,r)} - \underline{f}$ and $\underline{f} - f^{(r-d)}_{\min}$. Our new upper bound for $f_{\Delta(n,r)} - f^{(r-d)}_{\min}$ refines these known bounds in the quadratic and cubic cases and provides an asymptotic refinement for general degree $d$.

### Notation

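Throughout, $\mathcal{H}_{n,d}$ denotes the set of all homogeneous polynomials in $n$ variables with degree $d$. We let $[n] := \{1, 2, \dots, n\}$. We denote $\mathbb{R}^n_+$ as the set of all nonnegative real vectors, and $\mathbb{N}^n$ as the set of all nonnegative integer vectors. For $\alpha \in \mathbb{N}^n$, we define $|\alpha| := \sum_{i=1}^n \alpha_i$ and $\alpha! := \alpha_1! \alpha_2! \cdots \alpha_n!$. We denote $I(n,d) := \{\alpha \in \mathbb{N}^n : |\alpha| = d\}$. We let $e$ denote the all-ones vector and $e_i$ denote the $i$-th standard unit vector. We denote $\mathbb{R}[x]$ as the set of all multivariate polynomials in $n$ variables (i.e. $x_1, \dots, x_n$). For $\alpha \in \mathbb{N}^n$, we denote $x^\alpha := \prod_{i=1}^n x_i^{\alpha_i}$, while for $I \subseteq [n]$, we let $x^I := \prod_{i \in I} x_i$. Moreover, we denote $x^{\underline{d}} := x(x-1)(x-2)\cdots(x-d+1)$ for integer $d \ge 0$ and $x^{\underline{\alpha}} := \prod_{i=1}^n x_i^{\underline{\alpha_i}}$ for $\alpha \in \mathbb{N}^n$. Thus, $x^{\underline{d}} = 0$ if $x$ is an integer with $0 \le x \le d-1$.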

### Upper bounds using regular grids

One can construct an upper bound for $\underline{f}$ by taking the minimum of $f$ on the regular grid

$$\Delta(n,r) := \{x \in \Delta_n : rx \in \mathbb{N}^n\},$$

for an integer $r \ge 1$. We define

$$f_{\Delta(n,r)} := \min_{x \in \Delta(n,r)} f(x).$$

Obviously, $f_{\Delta(n,r)} \ge \underline{f}$, and $f_{\Delta(n,r)}$ can be computed by $|\Delta(n,r)| = \binom{n+r-1}{r}$ evaluations of $f$. In fact, when considering polynomials $f$ of fixed degree $d$, the parameters $f_{\Delta(n,r)}$ (with increasing values of $r$) provide a PTAS for (1), as was proved by Bomze and de Klerk [BK02] (for $d = 2$), and by de Klerk et al. [KLP06] (for $d \ge 3$). Recently, de Klerk et al. [KLS13] provide an alternative proof for this PTAS and refine the error bound for $f_{\Delta(n,r)} - \underline{f}$ from [KLP06] for cubic $f$.

In addition, some researchers study the properties of the regular grid $\Delta(n,r)$. For instance, given a point $x \in \Delta_n$, Bomze et al. [BGY] show a scheme to find the closest point to $x$ on $\Delta(n,r)$ with respect to some class of norms including $\ell_p$-norms for $p \ge 1$.
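The grid bound can be computed by plain enumeration. The following is a minimal sketch, assuming a hypothetical test polynomial $f(x) = (x_1 - x_2)^2 + x_3^2$ and hypothetical helper names (`compositions`, `grid_min`); it enumerates the grid through the integer vectors $\alpha$ with $|\alpha| = r$ and uses exact rational arithmetic.

```python
from fractions import Fraction
from math import comb


def compositions(total, parts):
    """Yield all tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def grid_min(f, n, r):
    """Minimum of f over the regular grid {alpha / r : alpha in N^n, |alpha| = r}."""
    return min(f([Fraction(a, r) for a in alpha]) for alpha in compositions(r, n))


def f(x):
    # Hypothetical example: f(x) = (x1 - x2)^2 + x3^2, homogeneous of degree 2.
    return (x[0] - x[1]) ** 2 + x[2] ** 2


n = 3
for r in (3, 4):
    size = sum(1 for _ in compositions(r, n))
    assert size == comb(n + r - 1, r)  # number of grid points
    print(r, size, grid_min(f, n, r))
```

For this example the grid minimum is $1/9$ for $r = 3$ and $0$ for $r = 4$; the latter equals the minimum over the simplex, attained at $(1/2, 1/2, 0)$.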

### Lower bounds based on Pólya’s representation theorem

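Given a polynomial $f \in \mathcal{H}_{n,d}$, Pólya [Pol74] shows that if $f$ is positive over the simplex $\Delta_n$, then the polynomial $\big(\sum_{i=1}^n x_i\big)^r f$ has nonnegative coefficients for any $r$ large enough (see [PR01] for an explicit bound for $r$). Based on this result of Pólya, an asymptotically converging hierarchy of lower bounds for $\underline{f}$ can be constructed as follows: for any integer $r \ge d$, we define the parameter $f^{(r-d)}_{\min}$ as

$$f^{(r-d)}_{\min} := \max \lambda \quad \text{s.t.} \quad \Big(\sum_{i=1}^n x_i\Big)^{r-d} \Big(f - \lambda \Big(\sum_{i=1}^n x_i\Big)^{d}\Big) \ \text{has nonnegative coefficients.} \tag{2}$$

Notice that $\underline{f}$ can be equivalently formulated as

$$\underline{f} = \max \lambda \quad \text{s.t.} \quad f(x) - \lambda \Big(\sum_{i=1}^n x_i\Big)^{d} \ge 0 \quad \forall x \in \mathbb{R}^n_+.$$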

Then, one can easily check the following inequalities:

$$f^{(0)}_{\min} \le f^{(1)}_{\min} \le \cdots \le \underline{f} \le f_{\Delta(n,r)} \le \overline{f}.$$

Parrilo [Par00, Par03] first introduces the idea of applying Pólya’s representation theorem to construct hierarchical approximations in copositive optimization. De Klerk et al. [KLP06] consider $f^{(r-d)}_{\min}$ and show upper bounds for $\underline{f} - f^{(r-d)}_{\min}$ in terms of $\overline{f} - \underline{f}$. Furthermore, Yildirim [YEA12] and Sagol and Yildirim [SY13] analyze error bounds for $f^{(r-2)}_{\min}$ for quadratic $f$.

Now we give an explicit formula for the parameter $f^{(r-d)}_{\min}$, which follows from [PR01, relation (3)]; note that the quadratic case of this formula has also been observed in [PVZ07, SY13, YEA12].

###### Lemma 1

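For $f = \sum_{\beta \in I(n,d)} f_\beta x^\beta \in \mathcal{H}_{n,d}$ and $r \ge d$, one has

$$f^{(r-d)}_{\min} = \min_{\alpha \in I(n,r)} \sum_{\beta \in I(n,d)} f_\beta \frac{\alpha^{\underline{\beta}}}{r^{\underline{d}}}. \tag{3}$$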
###### Proof

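By using the multinomial theorem $\big(\sum_{i=1}^n x_i\big)^d = \sum_{\alpha \in I(n,d)} \frac{d!}{\alpha!} x^\alpha$, we obtain

$$\begin{aligned}
\Big(\sum_{i=1}^n x_i\Big)^{r-d} f - \lambda \Big(\sum_{i=1}^n x_i\Big)^{r}
&= \Big(\sum_{\gamma \in I(n,r-d)} \frac{(r-d)!}{\gamma!} x^\gamma\Big) \Big(\sum_{\beta \in I(n,d)} f_\beta x^\beta\Big) - \lambda \Big(\sum_{\alpha \in I(n,r)} \frac{r!}{\alpha!} x^\alpha\Big) \\
&= \sum_{\alpha \in I(n,r)} \Big(\sum_{\beta \in I(n,d)} f_\beta \alpha^{\underline{\beta}} \frac{1}{r^{\underline{d}}}\Big) \frac{r!}{\alpha!} x^\alpha - \lambda \Big(\sum_{\alpha \in I(n,r)} \frac{r!}{\alpha!} x^\alpha\Big) \\
&= \sum_{\alpha \in I(n,r)} \Big(\sum_{\beta \in I(n,d)} f_\beta \alpha^{\underline{\beta}} \frac{1}{r^{\underline{d}}} - \lambda\Big) \frac{r!}{\alpha!} x^\alpha.
\end{aligned}$$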

Hence, by definition (2), we obtain

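$$\begin{aligned}
f^{(r-d)}_{\min} &= \max \lambda \quad \text{s.t.} \quad \sum_{\beta \in I(n,d)} f_\beta \alpha^{\underline{\beta}} \frac{1}{r^{\underline{d}}} - \lambda \ge 0 \quad \forall \alpha \in I(n,r) \\
&= \min \sum_{\beta \in I(n,d)} f_\beta \alpha^{\underline{\beta}} \frac{1}{r^{\underline{d}}} \quad \text{s.t.} \quad \alpha \in I(n,r).
\end{aligned}$$

Similarly to $f_{\Delta(n,r)}$, by (3), the computation of $f^{(r-d)}_{\min}$ requires $|I(n,r)| = \binom{n+r-1}{r}$ evaluations of the polynomial $\sum_{\beta \in I(n,d)} f_\beta x^{\underline{\beta}}$.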
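Formula (3) translates directly into an enumeration over $I(n,r)$. The sketch below is an illustration only: the dictionary representation of $f$ (exponent tuple to coefficient), the helper names and the test polynomial $f = x_1^2 + x_2^2 - 2x_1x_2 + x_3^2$ are assumptions made for this example.

```python
from fractions import Fraction


def compositions(total, parts):
    """Yield all tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def falling(x, k):
    """Falling factorial x (x-1) ... (x-k+1); equals 1 for k = 0."""
    out = 1
    for j in range(k):
        out *= x - j
    return out


def polya_lower_bound(coeffs, n, d, r):
    """Formula (3): min over alpha in I(n,r) of sum_beta f_beta * alpha^(falling beta) / r^(falling d)."""
    denom = falling(r, d)
    best = None
    for alpha in compositions(r, n):
        value = Fraction(0)
        for beta, c in coeffs.items():
            term = Fraction(c)
            for a, b in zip(alpha, beta):
                term *= falling(a, b)
            value += term
        value /= denom
        best = value if best is None else min(best, value)
    return best


# Hypothetical example: f = x1^2 + x2^2 - 2 x1 x2 + x3^2 = (x1 - x2)^2 + x3^2.
coeffs = {(2, 0, 0): 1, (0, 2, 0): 1, (1, 1, 0): -2, (0, 0, 2): 1}
for r in (2, 4):
    print(r, polya_lower_bound(coeffs, 3, 2, r))
```

For this polynomial the minimum over the simplex is $0$, and the sketch returns the lower bounds $-1$ (for $r = 2$) and $-1/3$ (for $r = 4$), so the bound improves as $r$ grows.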

### Bernstein coefficients

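For any polynomial $f = \sum_{\beta \in I(n,d)} f_\beta x^\beta \in \mathcal{H}_{n,d}$, we can write it as

$$f = \sum_{\beta \in I(n,d)} f_\beta x^\beta = \sum_{\beta \in I(n,d)} \Big(f_\beta \frac{\beta!}{d!}\Big) \frac{d!}{\beta!} x^\beta. \tag{4}$$

For any $\beta \in I(n,d)$, we call $f_\beta \frac{\beta!}{d!}$ the Bernstein coefficients of $f$ (this terminology has also been used in [KL10, KLS13]), since they are the coefficients of the polynomial $f$ when $f$ is expressed in the Bernstein basis $\big\{\frac{d!}{\beta!} x^\beta : \beta \in I(n,d)\big\}$ of $\mathcal{H}_{n,d}$. Applying the multinomial theorem together with (4), one can obtain that when evaluating $f$ at a point $x \in \Delta_n$, $f(x)$ is a convex combination of the Bernstein coefficients $f_\beta \frac{\beta!}{d!}$. Therefore, we have

$$\min_{\beta \in I(n,d)} f_\beta \frac{\beta!}{d!} \le \underline{f} \le f_{\Delta(n,r)} \le \overline{f} \le \max_{\beta \in I(n,d)} f_\beta \frac{\beta!}{d!}. \tag{5}$$

For the analysis in Section 5, we need the following result of [KLP06], which bounds the range of the Bernstein coefficients of $f$ in terms of its range of values $\overline{f} - \underline{f}$.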

###### Theorem 1.1

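[KLP06, Theorem 2.2] For any polynomial $f = \sum_{\beta \in I(n,d)} f_\beta x^\beta \in \mathcal{H}_{n,d}$, one has

$$\max_{\beta \in I(n,d)} f_\beta \frac{\beta!}{d!} - \min_{\beta \in I(n,d)} f_\beta \frac{\beta!}{d!} \le \binom{2d-1}{d} d^d (\overline{f} - \underline{f}).$$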
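As a small sanity check of (5) and of the theorem above, the following sketch computes the Bernstein coefficients of the polynomial $f = x_1^2 + x_2^2 + x_3^2$ (the polynomial of Example 1 with $n = 3$, whose extreme values $1/n$ and $1$ are taken from that example). The dictionary representation and helper names are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb, factorial


def compositions(total, parts):
    """Yield all tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


n, d = 3, 2
# f = x1^2 + x2^2 + x3^2, stored as {exponent tuple: coefficient}.
coeffs = {(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 2): 1}


def multi_factorial(beta):
    """beta! = beta_1! * ... * beta_n!"""
    out = 1
    for b in beta:
        out *= factorial(b)
    return out


# Bernstein coefficient f_beta * beta! / d! for every beta in I(n, d)
# (the coefficient is zero when the monomial does not occur in f).
bernstein = {beta: Fraction(coeffs.get(beta, 0) * multi_factorial(beta), factorial(d))
             for beta in compositions(d, n)}
b_min, b_max = min(bernstein.values()), max(bernstein.values())


def f(x):
    total = 0
    for beta, c in coeffs.items():
        term = c
        for xi, bi in zip(x, beta):
            term *= xi ** bi
        total += term
    return total


# Every value of f on a grid of the simplex lies between the extreme Bernstein coefficients.
r = 5
values = [f([Fraction(a, r) for a in alpha]) for alpha in compositions(r, n)]
assert all(b_min <= v <= b_max for v in values)

# Range of f over the simplex, as stated in Example 1 of the text (1/n and 1).
f_low, f_high = Fraction(1, n), Fraction(1)
assert b_max - b_min <= comb(2 * d - 1, d) * d ** d * (f_high - f_low)
print(b_min, b_max)
```

Here the Bernstein coefficients range over $[0, 1]$, while the right-hand side of the theorem evaluates to $8$, so the inequality holds with a large margin for this instance.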

### Contribution of the paper

In this paper, we consider upper bounds for $f_{\Delta(n,r)} - f^{(r-d)}_{\min}$ in terms of $\overline{f} - \underline{f}$. More precisely, we provide tighter upper bounds in the quadratic, cubic, and square-free (aka multilinear) cases and, in the general case of arbitrary fixed degree $d$, our upper bounds are asymptotically tighter when $r$ is large enough. We will apply the formula (3) directly for the quadratic, cubic and square-free cases, while for the general case we will use Theorem 1.1.

There are some relevant results in the literature. De Klerk et al. [KLP06] give upper bounds for $f_{\Delta(n,r)} - \underline{f}$ (the upper bound for cubic $f$ has been refined by de Klerk et al. [KLS13]) and for $\underline{f} - f^{(r-d)}_{\min}$ in terms of $\overline{f} - \underline{f}$, and by adding them up one can easily derive upper bounds for $f_{\Delta(n,r)} - f^{(r-d)}_{\min}$. Furthermore, for a quadratic polynomial $f$, Yildirim [YEA12] considers the upper bound $\min_{k \le r} f_{\Delta(n,k)}$ for $\underline{f}$ (for $r \ge 2$) and bounds the range $\min_{k \le r} f_{\Delta(n,k)} - f^{(r-2)}_{\min}$ from above in terms of $\overline{f} - \underline{f}$. Our results in this paper refine the results in [KLP06, KLS13, YEA12] for the quadratic and cubic cases (see Sections 2 and 3 respectively), while for the general case our result refines the result of [KLP06] when $r$ is sufficiently large (see Section 5).

### Structure

The paper is organized as follows. In Sections 2 and 3, we consider the quadratic and cubic cases respectively, and refine the relevant results obtained from [KLP06, KLS13, YEA12]. Then, we look at the square-free (aka multilinear) case in Section 4. Moreover, in Section 5, we consider general (fixed-degree) polynomials and compare our new result with the one of [KLP06].

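## 2 The quadratic case

For any quadratic polynomial $f = x^T Q x$, we consider the range $f_{\Delta(n,r)} - f^{(r-2)}_{\min}$ and derive the following upper bound in terms of $\overline{f} - \underline{f}$.

###### Theorem 2.1

For any quadratic polynomial $f = x^T Q x$ and $r \ge 2$, one has

$$f_{\Delta(n,r)} - f^{(r-2)}_{\min} \le \frac{1}{r-1} \big(Q_{\max} - f_{\Delta(n,r)}\big) \le \frac{1}{r-1} (\overline{f} - \underline{f}), \tag{6}$$

where $Q_{\max} := \max_{i \in [n]} Q_{ii}$.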

###### Proof

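By (3), we have

$$f^{(r-2)}_{\min} = \min_{\alpha \in I(n,r)} \frac{1}{r(r-1)} \Big[f(\alpha) - \sum_{i=1}^n Q_{ii} \alpha_i\Big].$$

Hence, we obtain

$$\frac{r-1}{r} f^{(r-2)}_{\min} \ge \min_{\alpha \in I(n,r)} f\Big(\frac{\alpha}{r}\Big) - \max_{\alpha \in I(n,r)} \frac{1}{r} \sum_{i=1}^n Q_{ii} \frac{\alpha_i}{r} = f_{\Delta(n,r)} - \frac{1}{r} Q_{\max}. \tag{7}$$

Multiplying (7) by $r$ and rearranging gives $(r-1)\big(f_{\Delta(n,r)} - f^{(r-2)}_{\min}\big) \le Q_{\max} - f_{\Delta(n,r)}$, which is the first inequality in (6). For the second inequality in (6), we use the fact that $Q_{\max} \le \overline{f}$ (since $Q_{ii} = f(e_i) \le \overline{f}$ for $i \in [n]$), as well as the fact that $f_{\Delta(n,r)} \ge \underline{f}$. ∎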
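The first inequality in (6) involves only quantities that can be computed by enumeration, so it can be checked numerically. The sketch below does this for a hypothetical symmetric matrix `Q` (an arbitrary choice for illustration, not from the paper), using the expression for $f^{(r-2)}_{\min}$ derived at the start of the proof.

```python
from fractions import Fraction


def compositions(total, parts):
    """Yield all tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


# Hypothetical symmetric matrix defining f(x) = x^T Q x.
Q = [[2, -1, 0],
     [-1, 1, 3],
     [0, 3, -2]]
n = len(Q)
q_max = max(Q[i][i] for i in range(n))


def quad(x):
    """Evaluate x^T Q x."""
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def bounds(r):
    """Return (grid upper bound, Polya-type lower bound) for resolution r >= 2."""
    grid = list(compositions(r, n))
    f_grid = min(quad([Fraction(a, r) for a in alpha]) for alpha in grid)
    f_polya = min(
        Fraction(quad(alpha) - sum(Q[i][i] * alpha[i] for i in range(n)), r * (r - 1))
        for alpha in grid
    )
    return f_grid, f_polya


for r in range(2, 8):
    f_grid, f_polya = bounds(r)
    gap = f_grid - f_polya
    # First inequality of (6); the gap is nonnegative since f_polya <= f_grid.
    assert 0 <= gap <= (q_max - f_grid) / (r - 1)
    print(r, f_grid, f_polya, gap)
```

Exact rational arithmetic is used so that the comparison is not affected by rounding.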

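Now we point out that our result (6) refines the relevant result of [KLP06]. De Klerk et al. [KLP06] show the following theorem.

###### Theorem 2.2

[KLP06, Theorem 3.2] Suppose $f \in \mathcal{H}_{n,2}$ and $r \ge 2$. Then

$$\underline{f} - f^{(r-2)}_{\min} \le \frac{1}{r-1} (\overline{f} - \underline{f}), \tag{8}$$

$$f_{\Delta(n,r)} - \underline{f} \le \frac{1}{r} (\overline{f} - \underline{f}). \tag{9}$$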

By adding up (8) and (9), one gets

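$$f_{\Delta(n,r)} - f^{(r-2)}_{\min} \le \Big(\frac{1}{r-1} + \frac{1}{r}\Big) (\overline{f} - \underline{f}),$$

which is implied by our result (6).

Moreover, in [YEA12], Yildirim considers a hierarchical upper bound for $\underline{f}$ (when $f$ is quadratic), which is defined by $\min_{k \le r} f_{\Delta(n,k)}$. One can easily verify that

$$f^{(r-2)}_{\min} \le \underline{f} \le \min_{k \le r} f_{\Delta(n,k)} \le f_{\Delta(n,r)}.$$

In [YEA12, Theorem 4.1], Yildirim shows an upper bound for the range $\min_{k \le r} f_{\Delta(n,k)} - f^{(r-2)}_{\min}$, which is also implied by our result (6).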

The following example shows that the upper bound (6) can be tight.

###### Example 1

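[KLS13, Example 2] Consider the quadratic polynomial $f = \sum_{i=1}^n x_i^2$. As $f$ is convex, one can check that $\underline{f} = \frac{1}{n}$ (attained at $x = \frac{1}{n} e$) and $\overline{f} = 1$ (attained at any standard unit vector). To compute $f_{\Delta(n,r)}$, we write $r$ as $r = kn + s$, where $k \ge 0$ and $0 \le s < n$. Then one can check that

$$f_{\Delta(n,r)} = \frac{1}{n} + \frac{1}{r^2} \frac{s(n-s)}{n}.$$

By (3), we have

$$f_{\Delta(n,r)} - f^{(r-2)}_{\min} = \frac{1}{r-1} (\overline{f} - \underline{f}) - \frac{1}{r^2 (r-1)} \frac{s(n-s)}{n}.$$

Hence, for this example, the upper bound (6) is tight when $s = 0$, i.e., when $r$ is a multiple of $n$.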
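The two closed-form expressions of this example can be confirmed by enumeration. The following sketch does so for $n = 3$ and $r = 2, \dots, 9$ (both ranges are arbitrary choices for illustration); the helper names are hypothetical.

```python
from fractions import Fraction


def compositions(total, parts):
    """Yield all tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


n = 3
f_low, f_high = Fraction(1, n), Fraction(1)  # extreme values of f stated in the example


def check(r):
    """Compare brute-force values with the closed forms for f = sum of x_i^2."""
    grid = list(compositions(r, n))
    f_grid = min(sum(Fraction(a, r) ** 2 for a in alpha) for alpha in grid)
    # Formula (3) for this f: sum_i alpha_i (alpha_i - 1) / (r (r - 1)).
    f_polya = min(Fraction(sum(a * (a - 1) for a in alpha), r * (r - 1)) for alpha in grid)
    s = r % n
    closed_grid = Fraction(1, n) + Fraction(s * (n - s), n * r * r)
    closed_gap = (f_high - f_low) / (r - 1) - Fraction(s * (n - s), n * r * r * (r - 1))
    return f_grid == closed_grid and f_grid - f_polya == closed_gap


assert all(check(r) for r in range(2, 10))
print("closed forms confirmed for n = 3, r = 2..9")
```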

## 3 The cubic case

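For any cubic polynomial $f$, we consider the difference $f_{\Delta(n,r)} - f^{(r-3)}_{\min}$ and show the following result.

###### Theorem 3.1

For any cubic polynomial $f$ and $r \ge 3$, one has

$$f_{\Delta(n,r)} - f^{(r-3)}_{\min} \le \frac{4r}{(r-1)(r-2)} (\overline{f} - \underline{f}). \tag{10}$$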
###### Proof

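We can write any cubic polynomial $f$ as

$$f = \sum_{i=1}^n f_i x_i^3 + \sum_{i<j} \big(f_{ij} x_i x_j^2 + g_{ij} x_i^2 x_j\big) + \sum_{i<j<k} f_{ijk} x_i x_j x_k.$$

Then, by (3) one can check that

$$\frac{(r-1)(r-2)}{r^2} f^{(r-3)}_{\min} = \min_{\alpha \in I(n,r)} \Big\{ f\Big(\frac{\alpha}{r}\Big) - \frac{1}{r^3} \Big(3 \sum_{i=1}^n f_i \alpha_i^2 - 2 \sum_{i=1}^n f_i \alpha_i + \sum_{i<j} (f_{ij} + g_{ij}) \alpha_i \alpha_j\Big) \Big\}. \tag{11}$$

Evaluating $f$ at $e_i$ and $(e_i + e_j)/2$ yields, respectively, the relations:

$$\underline{f} \le f_i \le \overline{f}, \tag{12}$$

$$f_i + f_j + f_{ij} + g_{ij} \le 8 \overline{f}. \tag{13}$$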

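Using (13) and the fact that $\sum_{i=1}^n \alpha_i = r$, one can obtain

$$\sum_{i<j} (f_{ij} + g_{ij}) \alpha_i \alpha_j \le \sum_{i<j} (8 \overline{f} - f_i - f_j) \alpha_i \alpha_j = 4 \overline{f} \Big(r^2 - \sum_{i=1}^n \alpha_i^2\Big) - r \sum_{i=1}^n f_i \alpha_i + \sum_{i=1}^n f_i \alpha_i^2. \tag{14}$$

By (11), (12), (14) and the fact that $\frac{\alpha}{r} \in \Delta(n,r) \subseteq \Delta_n$ for $\alpha \in I(n,r)$, one can get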

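$$(r-1)(r-2) f^{(r-3)}_{\min} \ge r^2 f_{\Delta(n,r)} - 4r \overline{f} + (r+2) \min_{x \in \Delta_n} \sum_{i=1}^n f_i x_i \ge r^2 f_{\Delta(n,r)} - 4r \overline{f} + (r+2) \underline{f}.$$

Hence, one has

$$(r-1)(r-2) \big(f_{\Delta(n,r)} - f^{(r-3)}_{\min}\big) \le 4r \overline{f} - (3r-2) f_{\Delta(n,r)} - (r+2) \underline{f} \le 4r (\overline{f} - \underline{f}).$$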

Now we observe that our result (10) refines the relevant upper bound obtained from [KLP06, KLS13]. De Klerk et al. [KLP06] show the following result.

###### Theorem 3.2

[KLP06, Theorem 3.3] Suppose $f \in \mathcal{H}_{n,3}$ and $r \ge 3$. Then

$$\underline{f} - f^{(r-3)}_{\min} \le \frac{4r}{(r-1)(r-2)} (\overline{f} - \underline{f}), \tag{15}$$

$$f_{\Delta(n,r)} - \underline{f} \le \frac{4}{r} (\overline{f} - \underline{f}). \tag{16}$$

Recently, de Klerk et al. [KLS13, Corollary 2] refine (16) to

$$f_{\Delta(n,r)} - \underline{f} \le \Big(\frac{4}{r} - \frac{4}{r^2}\Big) (\overline{f} - \underline{f}). \tag{17}$$

Similar to the quadratic case (in Section 2), our new upper bound (10) implies the upper bound obtained by adding up (15) and (17). However, we have not found any example showing that the upper bound (10) is tight. Thus, whether the upper bound (10) is tight remains an open question.

## 4 The square-free case

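Consider the square-free (aka multilinear) polynomial $f = \sum_{I \subseteq [n], |I| = d} f_I x^I$. We have the following result for the difference $f_{\Delta(n,r)} - f^{(r-d)}_{\min}$.

###### Theorem 4.1

For any square-free polynomial $f = \sum_{I \subseteq [n], |I| = d} f_I x^I$ and $r \ge d$, one has

$$f_{\Delta(n,r)} - f^{(r-d)}_{\min} \le \Big(\frac{r^d}{r^{\underline{d}}} - 1\Big) (\overline{f} - \underline{f}). \tag{18}$$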
###### Proof

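From (3), one can easily check that

$$f^{(r-d)}_{\min} = \min_{\alpha \in I(n,r)} \sum_{I \subseteq [n], |I| = d} f_I \frac{\alpha^I}{r^{\underline{d}}} = \frac{1}{r^{\underline{d}}} \min_{\alpha \in I(n,r)} f(\alpha).$$

As a result, since $f(\alpha) = r^d f(\alpha / r)$, one can obtain

$$f^{(r-d)}_{\min} = \frac{r^d}{r^{\underline{d}}} f_{\Delta(n,r)}.$$

For $d = 1$, the result (18) is clear.

Now we assume $d \ge 2$. Considering $\overline{f} \ge 0$ (as $f(e_i) = 0$ for any $i \in [n]$), we obtain

$$f_{\Delta(n,r)} - f^{(r-d)}_{\min} = \Big(1 - \frac{r^d}{r^{\underline{d}}}\Big) f_{\Delta(n,r)} \le \Big(1 - \frac{r^d}{r^{\underline{d}}}\Big) \underline{f} \le \Big(\frac{r^d}{r^{\underline{d}}} - 1\Big) (\overline{f} - \underline{f}). \tag{19}$$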

The following example shows that our upper bound (18) can be tight.

###### Example 2

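[KLS13, Example 4] Consider the square-free polynomial $f = -x_1 x_2$. One can check $\overline{f} = 0$, $\underline{f} = -\frac{1}{4}$, and

$$f_{\Delta(2,r)} = \begin{cases} -\frac{1}{4} & \text{if } r \text{ is even}, \\ -\frac{1}{4} + \frac{1}{4r^2} & \text{if } r \text{ is odd}. \end{cases}$$

By (3), we have

$$f_{\Delta(2,r)} - f^{(r-2)}_{\min} = \begin{cases} \frac{1}{r-1} (\overline{f} - \underline{f}) & \text{if } r \text{ is even}, \\ \big(\frac{1}{r} + \frac{1}{r^2}\big) (\overline{f} - \underline{f}) & \text{if } r \text{ is odd}. \end{cases}$$

For this example, the upper bound (18) is tight when $r$ is even. In fact, from (19), one can easily see that the upper bound (18) is tight as long as $f_{\Delta(n,r)} = \underline{f}$ and $\overline{f} = 0$ hold.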
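The case distinction in this example can be reproduced by enumerating the grid $\Delta(2, r)$ directly. The sketch below (helper name and range of $r$ are illustrative assumptions) checks the stated value of the grid bound, the stated gap, and whether the bound (18) is attained.

```python
from fractions import Fraction

f_low, f_high = Fraction(-1, 4), Fraction(0)  # extreme values of f = -x1*x2 on the simplex


def check(r):
    """Return (grid value matches, gap matches, bound (18) attained) for f = -x1*x2."""
    # Grid points of Delta(2, r) are (a/r, (r-a)/r) for a = 0, ..., r.
    f_grid = min(-Fraction(a * (r - a), r * r) for a in range(r + 1))
    # Formula (3) for a square-free quadratic: alpha_1 * alpha_2 / (r (r - 1)).
    f_polya = min(-Fraction(a * (r - a), r * (r - 1)) for a in range(r + 1))
    gap = f_grid - f_polya
    if r % 2 == 0:
        expected_grid = f_low
        expected_gap = Fraction(1, r - 1) * (f_high - f_low)
    else:
        expected_grid = f_low + Fraction(1, 4 * r * r)
        expected_gap = (Fraction(1, r) + Fraction(1, r * r)) * (f_high - f_low)
    bound = (Fraction(r * r, r * (r - 1)) - 1) * (f_high - f_low)  # right-hand side of (18)
    return f_grid == expected_grid, gap == expected_gap, gap == bound


for r in range(2, 10):
    print(r, check(r))
```

For every $r$ in this range the first two entries are `True`, and the third entry is `True` exactly when $r$ is even.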

## 5 The general case

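Now, we consider an arbitrary polynomial $f = \sum_{\beta \in I(n,d)} f_\beta x^\beta \in \mathcal{H}_{n,d}$. We need the following notation to formulate our result. Consider the univariate polynomial $t^d - t^{\underline{d}}$ (in the variable $t$), which can be written as

$$t^d - t^{\underline{d}} = \sum_{k=1}^{d-1} (-1)^{d-k-1} a_{d-k} t^k, \tag{20}$$

for some positive scalars $a_1, \dots, a_{d-1}$. Moreover, one can easily check that

$$\sum_{k=1}^{d-1} a_{d-k} t^k = (t+d-1)^{\underline{d}} - t^d. \tag{21}$$

We can show the following error bound for the range $f_{\Delta(n,r)} - f^{(r-d)}_{\min}$.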

###### Theorem 5.1

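For any polynomial $f \in \mathcal{H}_{n,d}$ and $r \ge d$, one has

$$f_{\Delta(n,r)} - f^{(r-d)}_{\min} \le \frac{(r+d-1)^{\underline{d}} - r^d}{r^{\underline{d}}} \binom{2d-1}{d} d^d (\overline{f} - \underline{f}). \tag{22}$$

Note that when $f$ is quadratic, cubic or square-free, we have shown better upper bounds in Theorems 2.1, 3.1 and 4.1.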
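To make the comparison with the special cases concrete, the sketch below evaluates the coefficient of $\overline{f} - \underline{f}$ in (22) for $d = 2$ and $d = 3$ and compares it with the coefficients $\frac{1}{r-1}$ from (6) and $\frac{4r}{(r-1)(r-2)}$ from (10). The function names and the range of $r$ are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def falling(x, k):
    """Falling factorial x (x-1) ... (x-k+1)."""
    out = 1
    for j in range(k):
        out *= x - j
    return out


def general_factor(r, d):
    """Coefficient of (f_high - f_low) on the right-hand side of (22)."""
    return Fraction(falling(r + d - 1, d) - r ** d, falling(r, d)) * comb(2 * d - 1, d) * d ** d


for r in range(4, 9):
    quadratic = Fraction(1, r - 1)                 # coefficient in (6)
    cubic = Fraction(4 * r, (r - 1) * (r - 2))     # coefficient in (10)
    assert general_factor(r, 2) == Fraction(12, r - 1)
    assert quadratic < general_factor(r, 2)
    assert cubic < general_factor(r, 3)
    print(r, general_factor(r, 2), general_factor(r, 3))
```

For $d = 2$ the general coefficient simplifies to $\frac{12}{r-1}$, which is twelve times the coefficient obtained in the quadratic case.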

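In the proof we will need the following Vandermonde-Chu identity (see [PR01] for a proof, or alternatively use induction on $d$):

$$\Big(\sum_{i=1}^n x_i\Big)^{\underline{d}} = \sum_{\alpha \in I(n,d)} \frac{d!}{\alpha!} x^{\underline{\alpha}} \quad \forall x \in \mathbb{R}^n, \tag{23}$$

which is an analogue of the multinomial theorem

$$\Big(\sum_{i=1}^n x_i\Big)^{d} = \sum_{\alpha \in I(n,d)} \frac{d!}{\alpha!} x^{\alpha} \quad \forall x \in \mathbb{R}^n.$$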
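The Vandermonde-Chu identity (23) is easy to test on integer vectors, where both sides can be evaluated exactly. The sketch below is only a numerical spot check on a few arbitrarily chosen test points (including a negative entry), not a proof; the helper names are hypothetical.

```python
from math import factorial


def compositions(total, parts):
    """Yield all tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def falling(x, k):
    """Falling factorial x (x-1) ... (x-k+1); equals 1 for k = 0."""
    out = 1
    for j in range(k):
        out *= x - j
    return out


def lhs(x, d):
    """Left-hand side of (23): falling factorial of the coordinate sum."""
    return falling(sum(x), d)


def rhs(x, d):
    """Right-hand side of (23): multinomial-weighted products of falling factorials."""
    total = 0
    for alpha in compositions(d, len(x)):
        multinomial = factorial(d)
        term = 1
        for xi, ai in zip(x, alpha):
            multinomial //= factorial(ai)  # exact at every step
            term *= falling(xi, ai)
        total += multinomial * term
    return total


# Arbitrary integer test points (assumption for illustration).
for x, d in [((3, -1, 2), 3), ((0, 0, 5), 4), ((2, 7), 5)]:
    assert lhs(x, d) == rhs(x, d)
    print(x, d, lhs(x, d))
```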

Now we prove Theorem 5.1.

###### Proof

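(of Theorem 5.1) From (3), we have

$$\frac{r^{\underline{d}}}{r^d} f^{(r-d)}_{\min} = \min_{\alpha \in I(n,r)} \Big\{ \sum_{\beta \in I(n,d)} f_\beta \frac{\alpha^\beta}{r^d} - \sum_{\beta \in I(n,d)} f_\beta \frac{\alpha^\beta - \alpha^{\underline{\beta}}}{r^d} \Big\}.$$

From this we obtain the inequality:

$$\frac{r^{\underline{d}}}{r^d} f^{(r-d)}_{\min} \ge f_{\Delta(n,r)} - \max_{\alpha \in I(n,r)} \sum_{\beta \in I(n,d)} f_\beta \frac{\alpha^\beta - \alpha^{\underline{\beta}}}{r^d}. \tag{24}$$

We now focus on the summation $\sum_{\beta \in I(n,d)} f_\beta (\alpha^\beta - \alpha^{\underline{\beta}})$.

For any $\beta \in I(n,d)$ and $x \in \mathbb{R}^n$, we can write the polynomial $x^\beta - x^{\underline{\beta}}$ as

$$x^\beta - x^{\underline{\beta}} = \sum_{\gamma : |\gamma| \le d-1} (-1)^{d-|\gamma|-1} c^{\beta}_{\gamma} x^\gamma, \tag{25}$$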

for some nonnegative scalars $c^{\beta}_{\gamma}$ (which is an analogue of (20)). We now claim that, for any fixed $k \in [d-1]$, the following identity holds:

For this, observe that the polynomials at both sides of (26) are homogeneous of degree $k$. Hence (26) will follow if we can show that the equality holds after summing each side over $k \in [d-1]$. In other words, it suffices to show the identity: