# Dines theorem for inhomogeneous quadratic functions and nonconvex optimization

Fabián Flores-Bazán, Departamento de Ingeniería Matemática, CIMA, Universidad de Concepción, Casilla 160-C, Concepción, Chile (fflores@ing-mat.udec.cl). The research of the first author was supported in part by CONICYT-Chile through FONDECYT and BASAL Projects, CMM, Universidad de Chile.

Felipe Opazo, Departamento de Ingeniería Matemática, Universidad de Concepción, Casilla 160-C, Concepción, Chile (felipeopazo@udec.cl).

###### Abstract

We establish various extensions of the Dines convexity theorem for the joint range of a pair of inhomogeneous quadratic functions. If convexity fails, we describe those rays for which the sum of the joint range and the ray is convex. These results are suitable for dealing with nonconvex inhomogeneous quadratic optimization problems under one quadratic equality constraint. As applications of our main results, different sufficient conditions for the validity of the S-lemma (a nonstrict version of Finsler’s theorem) for inhomogeneous quadratic functions are presented. In addition, a new characterization of strong duality under a Slater-type condition is established.

Key words. Dines theorem, Nonconvex optimization, hidden convexity, Quadratic programming, S-lemma, nonstrict version of Finsler’s theorem, Strong duality.

Mathematics subject classification 2000. Primary: 90C20, 90C46, 49N10, 49N15, 52A10.

## 1 Introduction

Quadratic functions have proved to be very important in mathematics because of their consequences in various subjects such as calculus of variations, mathematical programming, matrix theory (related to matrix pencils), geometry and special relativity [24, 21, 43, 22, 20, 26, 4, 14], among others, and because of their applications in applied sciences: telecommunications, robust control [33, 40], and trust region problems [19, 41].

The lack of convexity always offers a nice challenge in mathematics, but sometimes, as occurs in the quadratic world, hidden convexity is present. It seems that one of the first results for quadratic forms is due to Finsler [15], known as the (strict) Finsler’s theorem, which refers to the positive definiteness of a matrix pencil. The same result was proved, independently, by the Chicago School under the guidance of Bliss. We quote Albert [1], Reid [39], Dines [13], Calabi [11], Hestenes [23].

Perhaps the first beautiful results for a pair of quadratic forms are due to Dines [13] and Brickman [10], who proved the convexity, respectively, of

 {(⟨Ax,x⟩,⟨Bx,x⟩)∈R2: x∈Rn}, (1)
 {(⟨Ax,x⟩,⟨Bx,x⟩)∈R2: ⟨x,x⟩=1, x∈Rn}  (n≥3), (2)

provided $A$ and $B$ are real symmetric matrices. Actually Dines, motivated by the above result due to Finsler, investigated the convexity of (1). This convexity property inspired many researchers to search for hidden convexity in the quadratic framework. Generalizations to more than two matrices were developed in [4, 37, 21, 25, 12, 36], and references therein, although the theory remains incomplete. It is well known that, in general, the joint range $\{(f(x),g(x)):\ x\in\mathbb{R}^n\}$ is nonconvex if $f$ and $g$ are inhomogeneous quadratic functions.
Precisely, our interest in the present paper is to consider a pair of inhomogeneous quadratic functions $f$ and $g$, and to describe completely when the convexity of their joint range occurs (the only result we are aware of is Theorem 2.2 in [37], which is contained in our Theorem 4.6 below). In addition, we also answer the question of which directions must be added to the joint range in order to get convexity; in other words, for which directions $d$ the sum of the joint range and the ray $\mathbb{R}_+d$ is convex. As a consequence of our main result we recover the Dines theorem. We exploit the hidden convexity to derive some sufficient conditions for the validity of an S-lemma with an equality constraint (a nonstrict version of Finsler’s theorem for inhomogeneous quadratic functions), which are expressed in a different way than those established in [48], and which are suitable for dealing with the problem

 inf{f(x): g(x)=0, x∈Rn}. (3)

The latter S-lemma is also useful for dealing with bounded generalized trust region subproblems, that is, with constraints , as shown in [48].

Moreover, a new strong duality result for this problem as well as necessary and sufficient optimality conditions are established, covering situations where no result in [34, 28, 48] is applicable. In [48], by using a completely different approach, a characterization of the convexity of , when is affine, is given.

A complete description (besides the convexity) of the set , where

 μ≐inf{f(x): g(x)≤0, x∈Rn}, (4)

for any pair of inhomogeneous quadratic functions $f$ and $g$, is given in [16] by assuming $\mu$ to be finite; and when the set is considered. When $f$ and $g$ are any real-valued functions, strong duality for (4) implies the convexity of as shown in [17].

It is worthwhile mentioning that the existence of solutions for (4) was fully analyzed in [5] under simultaneous diagonalizability (SD).

We point out that the convexity of (proved in Theorem 4.17 below) was stated in Corollary 10 of [45], but its proof is not correct since the set is not closed in general: Examples 3.5 and 5.15 show this fact. On the other hand, we mention the recent paper [29] where it is proved, under suitable assumptions, the convexity of with being any quadratic function, (quadratic) strictly convex and all the other functions affine linear. Another joint-range convexity result involving -matrices may be found in [28].

Apart from the characterizations of strong duality, several sufficient conditions of the zero duality gap for convex programs have been established in the literature, see [18, 2, 3, 52, 7, 8, 9, 44, 35].

The paper is structured as follows. Section 2 provides the necessary notations, definitions and some preliminaries to be used throughout the paper: in particular, the Dines theorem is recalled. Some characterizations of the two-dimensional Simultaneous Diagonalization (SD) and Non Degenerate (ND) properties for a pair of matrices are established in Section 3. Section 4 contains our main results, all of them related to extensions of the Dines theorem. Applications of those extensions to nonconvex quadratic optimization under a single equality constraint are presented in Section 5: they include a new S-lemma (a nonstrict version of Finsler’s theorem for inhomogeneous quadratic functions), strong duality results, as well as necessary and sufficient optimality conditions. Finally, Section 6 presents, for the reader’s convenience, a brief historical note about the appearance, in chronological order, of the several properties arising in the study of quadratic forms. Some relationships between those properties are also outlined.

## 2 Basic notations and some preliminaries

In this section we introduce the basic definitions, notations and some preliminary results.

Given any nonempty set $K\subseteq\mathbb{R}^n$, its closure is denoted by $\overline{K}$; its convex hull by $\mathrm{co}\,K$, which is the smallest convex set containing $K$; its topological interior by $\mathrm{int}\,K$, whereas its relative interior by $\mathrm{ri}\,K$, which is the interior with respect to its affine hull; the (topological) boundary of $K$ is denoted by $\mathrm{bd}\,K$. We denote the complement of $K$ by $\mathcal{C}(K)$. We set , being the smallest cone containing $K$, and In case , we denote and , where . Furthermore, $K^*$ stands for the (non-negative) polar cone of $K$, which is defined by

 K∗≐{ξ∈Rn: ⟨ξ,a⟩≥0  ∀ a∈K},

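where $\langle\cdot,\cdot\rangle$ means the scalar or inner product in $\mathbb{R}^n$, whose elements are considered column vectors. Thus, $\langle a,b\rangle=a^\top b$ for all $a,b\in\mathbb{R}^n$. By $K^\perp$ we mean the orthogonal subspace to $K$, given by $K^\perp=\{u\in\mathbb{R}^n:\ \langle u,v\rangle=0\ \ \forall\, v\in K\}$; in case $K=\{u\}$, we simply put $u^\perp$; $\mathbb{R}_+u$ stands for the ray starting at the origin along the direction $u$. We say $P$ is a cone if $tP\subseteq P$ for all $t\ge 0$, and it is pointed if $P\cap(-P)=\{0\}$.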

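Throughout this paper the matrices are always with real entries. Given any matrix $A$ of order $m\times n$, $A^\top$ stands for the transpose of $A$; whereas if $A$ is a symmetric square matrix of order $n$, we say it is positive semidefinite, denoted by $A\succeq 0$, if $\langle Ax,x\rangle\ge 0$ for all $x\in\mathbb{R}^n$; it is positive definite, denoted by $A\succ 0$, if $\langle Ax,x\rangle>0$ for all $x\in\mathbb{R}^n$, $x\ne 0$. The set of symmetric square matrices of order $n$ is denoted by $\mathcal{S}^n$. Given any quadratic function

$$f(x)=\langle Ax,x\rangle+\langle a,x\rangle+k_1,$$

for some $A\in\mathcal{S}^n$, $a\in\mathbb{R}^n$ and $k_1\in\mathbb{R}$, we set

$$f_H(x)\doteq\langle Ax,x\rangle,\qquad f_L(x)\doteq\langle a,x\rangle.$$

Given another quadratic function

$$g(x)=\langle Bx,x\rangle+\langle b,x\rangle+k_2,$$

for some $B\in\mathcal{S}^n$, $b\in\mathbb{R}^n$ and $k_2\in\mathbb{R}$, we define $g_H$ and $g_L$ analogously and set

$$F(x)\doteq(f(x),g(x)),\quad F_H(x)\doteq(f_H(x),g_H(x)),\quad F_L(x)\doteq(f_L(x),g_L(x)). \tag{5}$$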

An important property in matrix analysis and in the study of nonconvex quadratic programming is the Simultaneous Diagonalization property. We say that two symmetric matrices $A$ and $B$ of order $n$ have the Simultaneous Diagonalization (SD) property, or simply that they are simultaneously diagonalizable, if there exists a nonsingular matrix $C$ such that both $C^\top AC$ and $C^\top BC$ are diagonal [26, Section 7.6], that is, if there are linearly independent (LI) vectors $u_i\in\mathbb{R}^n$, $i=1,\ldots,n$, such that $\langle Au_i,u_j\rangle=0=\langle Bu_i,u_j\rangle$ for all $i\ne j$. Such an assumption, for instance, allowed the authors in [6] to rewrite the original problem as a more tractable one. The symbol LD stands for linear dependence.

It is said that $A$ and $B$ are Non Degenerate (ND) if
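
As a hedged illustration of the SD definition, the following Python sketch builds a matrix $C$ in one special case only: when $A$ is positive definite, which is a standard sufficient condition for SD and is an assumption of this sketch, not a requirement of the definition. The function names `simultaneously_diagonalize` and `is_diagonal` and the sample matrices are hypothetical choices made for the example.

```python
import numpy as np

def simultaneously_diagonalize(A, B):
    """Return a nonsingular C with C.T @ A @ C = I and C.T @ B @ C diagonal.

    Assumption of this sketch: A is symmetric positive definite and B is
    symmetric. This is a sufficient condition for SD, not a necessary one.
    """
    L = np.linalg.cholesky(A)          # A = L @ L.T with L lower triangular
    Linv = np.linalg.inv(L)
    M = Linv @ B @ Linv.T              # symmetric matrix congruent to B
    _, Q = np.linalg.eigh(M)           # Q orthogonal, Q.T @ M @ Q diagonal
    return Linv.T @ Q

def is_diagonal(M, tol=1e-10):
    """Check that all off-diagonal entries of M vanish up to tol."""
    return np.allclose(M, np.diag(np.diag(M)), atol=tol)

# Hypothetical sample data: A is positive definite, B is indefinite.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[0.0, 1.0], [1.0, -1.0]])
C = simultaneously_diagonalize(A, B)
```

The columns of `C` play the role of the LI vectors $u_i$ in the definition: `C.T @ A @ C` and `C.T @ B @ C` can be passed to `is_diagonal` to confirm that the mixed terms vanish.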

 ⟨Au,u⟩=0=⟨Bu,u⟩⟹u=0. (6)

One of the most important results concerning quadratic functions is Dines’ theorem [13], perhaps motivated by Finsler’s theorem [15].

###### Theorem 2.1.

[13, Theorem 1, Theorem 2][23, Theorem 2] The set $F_H(\mathbb{R}^n)$ is a convex cone. In addition, if (6) holds then either $F_H(\mathbb{R}^n)=\mathbb{R}^2$ or $F_H(\mathbb{R}^n)$ is closed and pointed.

The convexity may fail for $F(\mathbb{R}^n)$ if $F=(f,g)$ with $f$ and $g$ being not necessarily homogeneous quadratic functions, as the next example shows.

###### Example 2.2.

Consider , , and define the set . Clearly and , but . One can actually see that

$$F_H(\mathbb{R}^2)=\mathbb{R}_+(-1,1),\quad\text{and}\quad F(\mathbb{R}^2)=\{(t-t^2,\ t^2-1):\ t\in\mathbb{R}\}.$$

Another instance is Example 4.3, where

$$F(\mathbb{R}^2)=\{(0,0)\}\cup[\mathbb{R}^2\setminus(\mathbb{R}\times\{0\})],\qquad F_H(\mathbb{R}^2)=\mathbb{R}\times\{0\}.$$
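
The failure of convexity in Example 2.2 can be checked numerically from the parametrization $F(\mathbb{R}^2)=\{(t-t^2,\ t^2-1):\ t\in\mathbb{R}\}$ alone. The sketch below is only an illustration: the helper names `F_curve` and `in_joint_range` and the choice of the two points $t=0$ and $t=1$ are assumptions made here, not taken from the example.

```python
import numpy as np

def F_curve(t):
    """Point (t - t^2, t^2 - 1) of the joint range in Example 2.2."""
    return np.array([t - t**2, t**2 - 1.0])

def in_joint_range(point, tol=1e-12):
    """Membership test: solve t - t^2 = p, then compare t^2 - 1 with q."""
    p, q = point
    disc = 1.0 - 4.0 * p               # discriminant of t^2 - t + p = 0
    if disc < 0:
        return False
    roots = [(1.0 + s * np.sqrt(disc)) / 2.0 for s in (1.0, -1.0)]
    return any(abs(t**2 - 1.0 - q) <= tol for t in roots)

P0 = F_curve(0.0)                      # the point (0, -1)
P1 = F_curve(1.0)                      # the point (0, 0)
mid = 0.5 * (P0 + P1)                  # the midpoint (0, -1/2)
```

Both endpoints lie on the curve, while `in_joint_range(mid)` returns `False`: the first coordinate forces $t\in\{0,1\}$, and neither value gives second coordinate $-1/2$.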

We now state a simple result which will be used in the next sections. For any , set , so that and .

###### Proposition 2.3.

Let . The following hold

• is LI.

• Assume that is LI. Then

Finally, the next lemma, which is important in itself, will play a key role in the subsequent sections.

###### Lemma 2.4.

Let be a nonempty subset of and , , be any elements in such that

 F(X)+Rh+R+h1⊆F(Rn). (7)

Then is convex under any of the following circumstances:

• is LI and

• is LD and .

###### Proof.

Let and with . The desired result is obtained by showing that .
: By assumption , and therefore, from (7) and Proposition 2.3 one gets, for all ,

 H(x0)≐{u: ⟨h⊥,h1⟩⟨h⊥,u−F(x0)⟩>0}⊆F(Rn). (8)

The desired result is obtained by showing that for some . We distinguish two cases.
: (the case “” is entirely similar). Since

 ⟨h⊥,h1⟩⟨h⊥,ft−F(x)⟩>0,

by density and continuity, we get close to such that , and so by (8).
: . Let us consider the functions and defined by

 q1(λ)≐F(λx+(1−λ)y),  q(λ)≐⟨h⊥,h1⟩⟨h⊥,q1(λ)−F(x)⟩.

Clearly is quadratic satisfying . Let us consider first that . Due to continuity is a connected set contained in the line passing through and . Thus, .
We now consider . Then there exists satisfying , i. e.,

 ⟨h⊥,h1⟩⟨h⊥,F(λ1x+(1−λ1)y)−ft⟩=⟨h⊥,h1⟩⟨h⊥,F(λ1x+(1−λ1)y)−F(x)⟩<0.

Hence by taking near , we obtain , and so by (8).
: As is LD, then (7) means that for all ,

 H0(x0)≐{u∈R2: ⟨h⊥,u−F(x0)⟩=0}⊆F(Rn).

Let . Then is continuous and , . We observe that either or . In the first case for all , and so . In case of opposite sign, we get such that , which implies that . ∎

## 3 Characterizing SD and ND in two dimensional spaces

This section is devoted to characterizing the simultaneous diagonalization and non degenerate properties for a pair of matrices in terms of their homogeneous quadratic forms. As one may find in the literature, the study in $\mathbb{R}^2$ deserves a special treatment, different from that in $\mathbb{R}^n$, $n\ge 3$, and to the best of the authors’ knowledge the following characterizations are new. As said before, here .

We start with a simple proposition that appears elsewhere; its proof is presented here just for the reader’s convenience.

###### Proposition 3.1.

Let us consider the assertions:

• ND holds for and

• is closed.

Then .

###### Proof.

: Let satisfying . If on the contrary , then by taking such that is linearly independent, we obtain for ,

$$F_H(\alpha u+\beta v)=\alpha^2F_H(u)+\beta^2F_H(v)+2\alpha\beta\, z_{u,v}.$$

Thus , which is impossible if .
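: Let $x_k$ be a sequence such that $F_H(x_k)\to z$. In case $x_k$ is bounded, there is nothing to prove. If $x_k$ is unbounded, up to a subsequence, we may suppose that $\|x_k\|\to+\infty$ and $x_k/\|x_k\|\to u$. Thus $\|u\|=1$ and

$$\frac{1}{\|x_k\|^2}F_H(x_k)=F_H\Big(\frac{x_k}{\|x_k\|}\Big)\to F_H(u)=0,$$

which yields, by assumption, $u=0$, a contradiction. ∎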
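
The expansion of $F_H(\alpha u+\beta v)$ used in the proof can be verified numerically. In the sketch below, $z_{u,v}$ is taken to be $(\langle Au,v\rangle,\langle Bu,v\rangle)$, which is the choice that makes the identity hold for symmetric $A$ and $B$; this reading of the symbol, the random test data and the helper names are assumptions of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_symmetric(n):
    """Random symmetric matrix of order n (test data only)."""
    M = rng.standard_normal((n, n))
    return (M + M.T) / 2.0

n = 4
A, B = random_symmetric(n), random_symmetric(n)

def FH(x):
    """Homogeneous pair (<Ax, x>, <Bx, x>)."""
    return np.array([x @ A @ x, x @ B @ x])

def z(u, v):
    """Assumed definition of z_{u,v}: the pair (<Au, v>, <Bu, v>)."""
    return np.array([(A @ u) @ v, (B @ u) @ v])

u, v = rng.standard_normal(n), rng.standard_normal(n)
alpha, beta = 0.7, -1.3
lhs = FH(alpha * u + beta * v)
rhs = alpha**2 * FH(u) + beta**2 * FH(v) + 2 * alpha * beta * z(u, v)
```

Here `lhs` and `rhs` agree up to rounding error; when $F_H(u)=0$ the first term on the right drops out, which is the situation exploited in the proof.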

Example 3.2 below shows that may fail in higher dimension. However, for , one obtains that implies the existence of , , such that , as Corollary in [23, p. 401] shows. We also point out the proof for proving implies remains valid for any dimension, see also Theorem 6 in [23].

###### Example 3.2.

Take

Then, , but ND does not hold for and .

The next result provides a new characterization of SD in two dimensions.

###### Theorem 3.3.

The following statements are equivalent:

• SD holds for and

• such that

• is closed and .

###### Proof.

: By assumption, there exist LI vectors , such that . Thus, . From the equality the desired conclusion is obtained.
: it is straightforward.
: We already know that is a convex cone. We first check that cannot be a halfspace. Indeed, suppose that for some , . Then there exist such that and , which imply that is LI. Since for all , we get for all . Hence , and therefore .
Thus, the set may be the origin ; a ray; a pointed cone, a straightline.
: We simply take any two LI vectors and . Indeed, since , we obtain .
: Assume that , and take such that , and choose so that is LI. In case , we proceed as follows. Since , we obtain , which implies that for some . It follows that with being LI, and therefore SD holds.
: We have, for some LI vectors (see Proposition 2.3)

 FH(R2)=R+p+R+q={z∈R2:⟨p⊥,q⟩⟨p⊥,z⟩≥0, ⟨q⊥,p⟩⟨q⊥,z⟩≥0}, (9)

with the property . Take in satisfying , . It follows that and are LI. From (9), we get in particular, , for all . This implies that . Similarly one obtains . Thus , which is the desired result.
: This case is similar to . Take such that , , which imply that is LI. Hence is LI for some and . ∎

Next example illustrates that and need not to be LI in the previous theorem; Example 3.2 shows that does not imply in higher dimension, since we get , and clearly SD holds for and ; whereas Example 3.5 exhibits an instance where without the closedness of the implication may fail.

###### Example 3.4.

Take

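$$A=\begin{pmatrix}0&1\\ 1&0\end{pmatrix},\qquad B=0.$$

Then, by choosing

$$C=\begin{pmatrix}1&1\\ -1&1\end{pmatrix},$$

we get that $C^\top AC$ is diagonal. It is easy to see that

$$F_H(\mathbb{R}^2)=\mathbb{R}_+(1,0)+\mathbb{R}_+(-1,0)=\mathbb{R}\times\{0\}.$$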
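
The computation in Example 3.4 is small enough to check directly. The sketch below assumes the matrix entries are listed row by row, that is, $A$ has rows $(0,1)$ and $(1,0)$ and $C$ has rows $(1,1)$ and $(-1,1)$; the variable names `DA` and `DB` are hypothetical.

```python
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.zeros((2, 2))
C = np.array([[1.0, 1.0], [-1.0, 1.0]])

DA = C.T @ A @ C          # congruence transform of A
DB = C.T @ B @ C          # congruence transform of B (zero matrix)

def FH(x):
    """Homogeneous pair (<Ax, x>, <Bx, x>) = (2 x1 x2, 0)."""
    return np.array([x @ A @ x, x @ B @ x])

right = FH(np.array([1.0, 1.0]))     # a point on the ray R_+(1, 0)
left = FH(np.array([1.0, -1.0]))     # a point on the ray R_+(-1, 0)
```

The product `DA` equals the diagonal matrix with entries $-2$ and $2$, and the two sample values $(2,0)$ and $(-2,0)$ lie on the opposite rays whose sum is $\mathbb{R}\times\{0\}$.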
###### Example 3.5.

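Consider

$$A=\begin{pmatrix}1&1\\ 1&1\end{pmatrix},\qquad B=\begin{pmatrix}1&0\\ 0&-1\end{pmatrix}.$$

Then, even if

$$F_H(x_1,x_2)=(x_1+x_2)^2(1,0)+(x_1^2-x_2^2)(0,1),$$

one obtains $F_H(\mathbb{R}^2)=\{(0,0)\}\cup\{(p,q)\in\mathbb{R}^2:\ p>0\}$, which is not closed, and clearly SD does not hold for $A$ and $B$.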
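
The lack of closedness in Example 3.5 can be seen on a concrete sequence. The sketch below reads the symmetric matrices off the displayed formula for $F_H$ (an inference made here) and uses a hypothetical helper `point_for` that picks $x$ with $x_1+x_2=1/k$ and $x_1-x_2=k$, so that $F_H(x)=(1/k^2,\,1)$.

```python
import numpy as np

# Symmetric matrices read off from FH(x1, x2) = ((x1 + x2)^2, x1^2 - x2^2).
A = np.array([[1.0, 1.0], [1.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, -1.0]])

def FH(x):
    """Homogeneous pair (<Ax, x>, <Bx, x>)."""
    return np.array([x @ A @ x, x @ B @ x])

def point_for(k):
    """Return x with x1 + x2 = 1/k and x1 - x2 = k (hypothetical helper)."""
    s, d = 1.0 / k, float(k)
    return np.array([(s + d) / 2.0, (s - d) / 2.0])

# Values FH(x_k) = (1/k^2, 1) approach (0, 1) as k grows.
values = [FH(point_for(k)) for k in (1, 10, 100, 1000)]

# Any x with first coordinate of FH equal to 0 satisfies x1 + x2 = 0,
# which forces the second coordinate to vanish as well.
on_kernel = FH(np.array([3.0, -3.0]))
```

The values converge to $(0,1)$, yet a zero first coordinate forces $x_1+x_2=0$ and hence $F_H(x)=(0,0)$, so the limit point is not attained.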

We are now in a position to establish a new characterization of ND in $\mathbb{R}^2$.

###### Theorem 3.6.

The following assertions are equivalent:

• ND holds for and

• and is a closed set different from a line.

###### Proof.

: The first part of is straightforward, and the closedness of is a consequence of Proposition 3.1. It remains only to prove that is different from a line. In case , we are done; thus suppose that . By Theorem 3.3, we have SD, that is, there exist , LI, such that . This means for all . Obviously , and if for some , then . This implies that which is impossible, therefore for all . Thus is not a line.
: Since is closed, by Theorem 3.3, either or SD holds. In the first case, Proposition 3.1 implies that is satisfied. Assume that SD holds; as before, there exist , LI, such that . Let satisfying ; we claim that . By writing for some , , we get . We distinguish various situations. If (resp. ), then and (resp. and ), which along with and , allow us to infer (resp. ). It follows that (resp. ), which is impossible.
We now consider . Suppose, on the contrary, that for . Then, from , we obtain for some . This yields that is a line, a contradiction. Hence for , and so , completing the proof. ∎

The same proof of the previous theorem allows us to obtain the next result which establishes a relationship between ND and SD.

###### Corollary 3.7.

The following assertions are equivalent:

• and ND holds

• , is different from a line and SD holds;

• such that .

###### Proof.

follows from Theorems 3.3 and 3.6; whereas the reverse implication is derived from the proof of the previous theorem. The equivalence between and is Corollary 1 in [13, page 498], valid for all . ∎

## 4 Dines-type theorem for inhomogeneous quadratic functions and relatives

This section is devoted to proving a generalization of the Dines theorem for inhomogeneous quadratic functions. Set

 f(x)≐⟨Ax,x⟩+⟨a,x⟩, g(x)≐⟨Bx,x⟩+⟨b,x⟩, (10)

and, as before, $F\doteq(f,g)$, so that $F=F_H+F_L$.

We first deal with the one-dimensional case and afterwards with the general situation.

### 4.1 The one-dimensional case

We begin with the following useful simple result.

###### Proposition 4.1.

Let , . Then

• .

###### Proof.

is straightforward and is a consequence of the following equalities:

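$$\begin{aligned}
tF(\alpha u)+(1-t)F(\beta u) &= [t\alpha^2+(1-t)\beta^2]F_H(u)+[t\alpha+(1-t)\beta]F_L(u)\\
&= [(t\alpha+(1-t)\beta)^2+(t-t^2)(\alpha-\beta)^2]F_H(u)+[t\alpha+(1-t)\beta]F_L(u)\\
&= F((t\alpha+(1-t)\beta)u)+(t-t^2)(\alpha-\beta)^2F_H(u).
\end{aligned}\tag{11}$$

The one-dimensional version of the (inhomogeneous) Dines-type theorem is stated in the following lemma.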
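
The chain of equalities (11) rests on the scalar identity $t\alpha^2+(1-t)\beta^2-(t\alpha+(1-t)\beta)^2=(t-t^2)(\alpha-\beta)^2$. A quick numerical check is sketched below, with $f$ and $g$ as in (10) (no constant terms); the random data and the particular values of $\alpha$, $\beta$, $t$ are arbitrary choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3

def random_symmetric(n):
    """Random symmetric matrix of order n (test data only)."""
    M = rng.standard_normal((n, n))
    return (M + M.T) / 2.0

A, B = random_symmetric(n), random_symmetric(n)
a, b = rng.standard_normal(n), rng.standard_normal(n)

def FH(x):
    """Homogeneous part (<Ax, x>, <Bx, x>)."""
    return np.array([x @ A @ x, x @ B @ x])

def FL(x):
    """Linear part (<a, x>, <b, x>)."""
    return np.array([a @ x, b @ x])

def F(x):
    """Pair (f, g) as in (10), without constant terms."""
    return FH(x) + FL(x)

u = rng.standard_normal(n)
alpha, beta, t = 1.5, -0.4, 0.3
lhs = t * F(alpha * u) + (1 - t) * F(beta * u)
rhs = F((t * alpha + (1 - t) * beta) * u) + (t - t**2) * (alpha - beta)**2 * FH(u)
```

Since $t-t^2\ge 0$ for $t\in[0,1]$, the check also illustrates why every convex combination of two points of $F(\mathbb{R}u)$ lies in $F(\mathbb{R}u)+\mathbb{R}_+F_H(u)$.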

###### Lemma 4.2.

Let , and . The following hold:

• Assume that is LD then is convex.

• Assume that is LI. Then

1. if one has

2. if then

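$$F(\mathbb{R}u)+\mathbb{R}_+d=F(\mathbb{R}u)\cup\mathcal{C}(\mathrm{co}\,F(\mathbb{R}u))=\overline{\mathcal{C}(\mathrm{co}\,F(\mathbb{R}u))}\ne\mathrm{co}\,F(\mathbb{R}u)+\mathbb{R}_+d;$$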
3. if is LI, one has

Similar results hold for the set $F(x+\mathbb{R}u)$ for any fixed $x\in\mathbb{R}^n$, since

$$F(x+tu)=t^2F_H(u)+t\,\big(\langle 2Ax+a,u\rangle,\ \langle 2Bx+b,u\rangle\big)+F(x).$$

###### Proof.

We write .
: In this case the set is either a point or ray or a line, so convex.
: From Proposition 4.1, we obtain

 co(F(Ru)+R+d)=co F(Ru)+R+d=F(Ru)+R+FH(u)+R+d, (12)

from which the convexity of follows if .
: We obtain the following equalities, thanks to the LI of :

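$$\begin{aligned}
\mathrm{co}\,F(\mathbb{R}u) &= \Big\{\sum_{i=1}^{3}\lambda_iF(\alpha_iu):\ \sum_{i=1}^{3}\lambda_i=1,\ \lambda_i\ge 0,\ \alpha_i\in\mathbb{R}\Big\}\\
&= \Big\{\sum_{i=1}^{3}\lambda_i\alpha_i^2F_H(u)+\sum_{i=1}^{3}\lambda_i\alpha_iF_L(u):\ \lambda_i\ge 0,\ \sum_{i=1}^{3}\lambda_i=1,\ \alpha_i\in\mathbb{R}\Big\}\\
&= \{\alpha F_H(u)+\beta F_L(u):\ \alpha\ge\beta^2,\ \alpha,\beta\in\mathbb{R}\}.
\end{aligned}\tag{13}$$

Thus

$$\mathcal{C}(\mathrm{co}\,F(\mathbb{R}u))=\{\alpha F_H(u)+\beta F_L(u):\ \alpha<\beta^2,\ \alpha,\beta\in\mathbb{R}\},$$

and so

$$\overline{\mathcal{C}(\mathrm{co}\,F(\mathbb{R}u))}=\{\alpha F_H(u)+\beta F_L(u):\ \alpha\le\beta^2,\ \alpha,\beta\in\mathbb{R}\}=F(\mathbb{R}u)+\mathbb{R}_+d,$$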

since by Proposition 4.1.
: We write with . By virtue of (12), we need to check that . This requires to solve a quadratic equation, which is always possible. Indeed, take , , , we must find and such that

 αλ2+λ+=βλ2+r+,  α2+αλ1+γ+=β2+βλ1. (14)

We can solve this system by substituting from the first equation of (14) into the second one, proving the convexity of .
Let us check the last assertion. By assumption, we can write with . From (13), if and only if with . By taking sufficiently large such that