Finding Minimum Volume Circumscribing Ellipsoids Using Copositive Programming
We study the problem of finding an ellipsoid with minimum volume that contains a given convex set. We reformulate the problem as a generalized copositive program, and use that reformulation to derive tractable semidefinite programming approximations for instances where the set is defined by affine and quadratic inequalities. We empirically demonstrate that our proposed method generates high-quality solutions faster than solving the problem to optimality. We prove that our method always provides an ellipsoid whose volume is no larger than that of the ellipsoid provided by the well-known S-procedure. Moreover, we demonstrate that the difference in the solutions provided by the two methods can be very significant.
We consider the minimum volume ellipsoid problem (MVEP), i.e., the problem of finding an ellipsoid having minimum volume that contains a given set [1, 2]. This problem arises in many applications studied in the literature, e.g., outlier detection [3, 4], pattern recognition [5], computer graphics [6], and facility location [7]. For compact convex sets, such an ellipsoid is unique and affine invariant, which makes it an attractive candidate for approximating these sets from the outside. For some sets, it is possible to find the minimum volume ellipsoid in polynomial time. For example, if the set is defined as the convex hull of a finite number of points, then the complexity of the MVEP is polynomial in the problem size [8, 9]. When the set is a union of ellipsoids, the MVEP can be solved in polynomial time using the S-lemma [10]. In general, however, solving the MVEP can be difficult. For example, if the set is a polytope defined by affine inequalities, or if the set is an intersection of ellipsoids, then solving the MVEP is NP-hard [1, 11].
In this paper, we present an exact generalized copositive programming reformulation for the MVEP for a general convex set. Solving a generalized copositive program is NP-hard, but this reformulation enables us to leverage state-of-the-art approximation schemes available for such optimization problems. In particular, generalized copositive programs yield a hierarchy of optimization problems which provide an increasingly tight conservative approximation to the original problem [12, 13, 14]. While the exact reformulation that we present holds for general convex sets, we limit the discussion of the approximations to sets defined by affine and quadratic inequalities. For these sets, the optimization problems that approximate the respective copositive program are semidefinite programs (SDPs), which can be solved in polynomial time. To the best of our knowledge, we are the first to propose such a reformulation and approximation method for solving the MVEP. For the rest of the paper, we use the shorter term copositive program(ming) when referring to generalized copositive program(ming).
Another method of solving the MVEP approximately is via the maximum volume inscribed ellipsoid (MVIE). For a $K$-dimensional convex set, scaling the MVIE by a factor of $K$ results in an ellipsoid that contains the set, thereby serving as an approximate solution to the MVEP [15]. Moreover, the MVIE can be computed in polynomial time for sets defined by affine and quadratic inequalities [16]. However, this ellipsoid can be very conservative because of the scaling factor $K$. We can add the scaled MVIE as a redundant constraint to the definition of the set and then use the well-known S-procedure, also known as the approximate S-lemma in other literature, to potentially generate an ellipsoid of lower volume (see [17, Section 3.7] and references therein). In this paper, we prove that the volume of the ellipsoid generated by our method cannot be larger than the one generated by the S-procedure. We empirically demonstrate that the difference between the solutions generated by the two methods can be very significant. Furthermore, we prove that if the set of interest is a polytope, then the S-procedure does not improve upon the scaled MVIE. Our method, on the other hand, generates ellipsoids of significantly lower volume.
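The scaling argument can be made concrete with a short worked relation. The sketch below assumes a convex set $\mathcal{P} \subset \mathbb{R}^K$ whose MVIE is written as the image of the unit ball, with center $c$ and shape matrix $B$; the symbols $\mathcal{P}$, $c$, $B$, $\mathcal{E}_{\mathrm{in}}$ and $\mathcal{E}_{\mathrm{out}}$ are introduced here for illustration only and are not necessarily the paper's notation.

```latex
\mathcal{E}_{\mathrm{in}} = \{ c + B u : \|u\|_2 \le 1 \} \ \text{is the MVIE of } \mathcal{P}
\quad\Longrightarrow\quad
\mathcal{P} \subseteq \mathcal{E}_{\mathrm{out}} = \{ c + K B u : \|u\|_2 \le 1 \},
\qquad
\frac{\mathrm{vol}(\mathcal{E}_{\mathrm{out}})}{\mathrm{vol}(\mathcal{E}_{\mathrm{in}})} = K^{K}.
```

Because the minimum volume circumscribing ellipsoid has volume at least $\mathrm{vol}(\mathcal{P}) \ge \mathrm{vol}(\mathcal{E}_{\mathrm{in}})$, the scaled MVIE can overestimate the optimal volume by a factor of up to $K^K$, which grows quickly with the dimension.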
The MVEP can be solved exactly using a constraint-generation approach. Here, we start with some feasible points in the set and find the ellipsoid of minimum volume containing those points, which can be done in polynomial time. Then we successively generate points in the set which lie outside the current ellipsoid and update the ellipsoid at each step. We repeat the process until a desired optimality tolerance is reached. However, generating a new point at each iteration is very slow, as it entails solving a non-convex optimization problem. Therefore, solving the MVEP using this approach is tractable only in low dimensions. Through our experiments, we demonstrate that our approach yields near-optimal solutions to the MVEP much faster than solving it to optimality using constraint generation.
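The inner step of constraint generation, finding the minimum volume ellipsoid of a finite point set, can be sketched as follows. This is a minimal illustration using Khachiyan's first-order method, which is a technique swapped in here and not necessarily the solver used by the authors; the function name, the tolerance, and the iteration cap are hypothetical choices. The sketch assumes NumPy is available and that the points affinely span the space. The non-convex step that searches for a point of the set outside the current ellipsoid is not shown.

```python
import numpy as np

def min_volume_ellipsoid_of_points(points, tol=1e-7, max_iter=10000):
    """Approximate the minimum volume ellipsoid {x : (x-c)^T A (x-c) <= 1}
    containing the rows of `points` (Khachiyan's method; illustrative only)."""
    pts = np.asarray(points, dtype=float)
    n, d = pts.shape
    lifted = np.hstack([pts, np.ones((n, 1))])    # each point lifted to [x; 1]
    u = np.full(n, 1.0 / n)                       # weights on the points
    for _ in range(max_iter):
        X = lifted.T @ (u[:, None] * lifted)      # weighted second-moment matrix
        # m[i] = q_i^T X^{-1} q_i for every lifted point q_i
        m = np.einsum("ij,jk,ik->i", lifted, np.linalg.inv(X), lifted)
        j = int(np.argmax(m))                     # most violated point
        step = (m[j] - d - 1.0) / ((d + 1.0) * (m[j] - 1.0))
        new_u = (1.0 - step) * u
        new_u[j] += step
        converged = np.linalg.norm(new_u - u) <= tol
        u = new_u
        if converged:
            break
    c = u @ pts                                   # ellipsoid center
    cov = pts.T @ (u[:, None] * pts) - np.outer(c, c)
    A = np.linalg.inv(cov) / d                    # ellipsoid shape matrix
    return A, c

# Corners of a square: by symmetry the result is the circle of radius sqrt(2).
square = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
A, c = min_volume_ellipsoid_of_points(square)
```

In a full constraint-generation loop, this routine would be called again each time a new point outside the current ellipsoid is added to the point set.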
The results in this paper borrow heavily from the copositive programming literature. There has been a significant amount of work on presenting exact copositive programming reformulations for otherwise difficult problems, and using those reformulations to generate tractable approximations [18, 19, 20, 21, 22, 23, 24]. We provide another result in this direction by demonstrating the ability of copositive programs to exactly model the MVEP.
Our main contributions are as follows. 1) We reformulate the MVEP for general convex sets as an exact copositive program. 2) We provide a tractable approximation to the MVEP when the set is defined by affine and quadratic inequalities. 3) We prove that our approximation method yields a solution to the MVEP that is at least as good as the one obtained using the S-procedure. 4) We empirically demonstrate that our method significantly outperforms the S-procedure and provides high-quality solutions to the MVEP much faster than solving it to optimality.
This article is organized as follows. In Section 2, we describe the MVEP and reformulate it as an equivalent copositive program. In Section 3, we use that reformulation to derive a tractable semidefinite programming approximation to the MVEP. We also compare the resulting approximation with the S-procedure. Finally, in Section 4, we present numerical experiments comparing the quality of solutions generated by our method against those found using the S-procedure.
For a positive integer $I$, we use $[I]$ to denote the index set $\{1, \ldots, I\}$. We use $\mathbf{e}$ to denote the vector of ones and $\mathbb{I}$ to denote the identity matrix; their dimensions will be clear from the context. We use $\mathbb{R}^K$ ($\mathbb{R}^K_+$) to denote the set of (non-negative) vectors of length $K$, and $\mathbb{S}^K$ ($\mathbb{S}^K_+$) to denote the set of all symmetric (positive semidefinite) $K \times K$ matrices. We use $\mathrm{tr}(M)$ and $\det(M)$ to denote the trace and the determinant of a matrix $M$, respectively. We use $\mathrm{diag}(v)$ to denote a diagonal matrix with the vector $v$ on its main diagonal. For a vector $v$, we use $\|v\|_1$ and $\|v\|_2$ to denote its $1$-norm and $2$-norm, respectively. For two scalars or vectors $v$ and $w$, we use $[v; w]$ to denote the vector formed by their vertical concatenation. We use $\mathrm{cone}(\mathcal{S})$ to denote the conic hull of the set $\mathcal{S}$.
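We use $\mathcal{COP}(\mathcal{K})$ to denote the set of generalized copositive matrices with respect to the cone $\mathcal{K} \subseteq \mathbb{R}^K$, i.e., $\mathcal{COP}(\mathcal{K}) = \{M \in \mathbb{R}^{K \times K} : M = M^\top,\ x^\top M x \ge 0 \ \text{for all } x \in \mathcal{K}\}$. We use $\mathcal{C}(\mathcal{K})$ to denote the set of generalized completely positive matrices with respect to the cone $\mathcal{K}$, i.e., $\mathcal{C}(\mathcal{K}) = \{\sum_{i=1}^{I} x_i x_i^\top : x_i \in \mathcal{K}\}$, where $I$ is a positive integer. The cones $\mathcal{COP}(\mathcal{K})$ and $\mathcal{C}(\mathcal{K})$ are duals of each other. For any symmetric matrices $P$ and $Q$ and cone $\mathcal{K}$, the conic inequality $P \succeq_{\mathcal{COP}(\mathcal{K})} Q$ indicates that $P - Q$ is an element of $\mathcal{COP}(\mathcal{K})$. We drop the subscript and simply write $P \succeq Q$ when $\mathcal{K} = \mathbb{R}^K$. Finally, the relation $P \succ_{\mathcal{COP}(\mathcal{K})} 0$ indicates that $P$ is strictly copositive, i.e., $x^\top P x > 0$ for all $x \in \mathcal{K} \setminus \{0\}$.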
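The difference between copositivity and positive semidefiniteness can be illustrated numerically. The sketch below takes the cone to be the non-negative orthant, an assumption made purely for illustration since the paper works with general cones. Random sampling is only a necessary check and cannot certify copositivity; all function names are hypothetical.

```python
import math
import random

def quad_form(M, x):
    """Return x^T M x for a square matrix M given as nested lists."""
    n = len(x)
    return sum(M[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def min_eigenvalue_2x2(M):
    """Smallest eigenvalue of a symmetric 2x2 matrix (closed form)."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    return (a + d) / 2.0 - math.sqrt(((a - d) / 2.0) ** 2 + b * b)

def passes_orthant_sampling(M, num_samples=1000, seed=0):
    """Necessary check only: x^T M x >= 0 on random non-negative samples."""
    rng = random.Random(seed)
    n = len(M)
    return all(
        quad_form(M, [rng.random() for _ in range(n)]) >= 0.0
        for _ in range(num_samples)
    )

# x^T M x = 2 * x1 * x2: non-negative on the orthant, but M has eigenvalue -1.
M = [[0.0, 1.0], [1.0, 0.0]]
is_psd = min_eigenvalue_2x2(M) >= 0.0          # False
looks_copositive = passes_orthant_sampling(M)  # True
```

The example matrix is copositive with respect to the non-negative orthant without being positive semidefinite, which is why copositive constraints are weaker, and hence less conservative, than semidefinite ones.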
2 Copositive Reformulation
In this section, we describe the MVEP and develop a copositive reformulation for it. Given a set , the MVEP seeks to find parameters and such that is an ellipsoid with minimum volume that contains . The volume of the ellipsoid is proportional to . The MVEP can be written as the following optimization problem :
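For concreteness, one standard way to write such a problem is sketched here. It assumes the ellipsoid is parametrized as $\mathcal{E}(A, b) = \{x : \|Ax + b\|_2^2 \le 1\}$ with $A$ positive definite, so that its volume is proportional to $\det(A^{-1})$ and minimizing the volume amounts to minimizing $-\log\det(A)$. The symbols $\mathcal{P}$, $A$, $b$ and the choice of a log-determinant objective are assumptions of this sketch and may differ from the paper's exact statement.

```latex
\begin{array}{ll}
\text{minimize}   & -\log\det(A) \\
\text{subject to} & A \in \mathbb{S}^K_{+},\; b \in \mathbb{R}^K, \\
                  & \|Ax + b\|_2^2 \le 1 \quad \forall x \in \mathcal{P}.
\end{array}
```

The difficulty lies in the last constraint, which must hold for every point of the set and is therefore a semi-infinite constraint.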
In this paper, we focus on sets that satisfy the following assumption.
The set is compact, full-dimensional, and convex.
Compactness guarantees the feasibility of (since it implies that is bounded) and is useful for achieving the copositive reformulation. On the other hand, if is not full-dimensional, then it can be contained in an ellipsoid of zero volume. The convexity assumption is made without loss of generality. If the set is not convex, then we can instead consider its convex hull without changing the solution of . Let a cone be defined as
Using this definition and the convexity of , we can write as
Using this representation for the set simplifies the formulations. We are now ready to present the main result of this section. In Theorem 1 presented below, we reformulate as an equivalent copositive program.
Before proving Theorem 1, we present the following technical lemmas which are required for the proof.
If , then . Furthermore, only at the origin.
From the definition (2) of , there exist and coefficients , , such that . By comparing the last element, we get , since . Also, implies that for all , which further implies that .∎
Let be a symmetric matrix, be a vector and be defined as (2). Then there exists a real number such that
We have to show that there exists such that for all , where
Let be an element of such that . From Lemma 1, implies , which contradicts the assumption . Therefore and we have that
Since , the above expression is strictly positive if we choose strictly bigger than
The first equality follows by setting and using the fact that if and only if , since is a cone and . The second equality follows from the definition of . The expression (5) is finite since is compact and is finite for all . Hence, there exists such that . ∎
The following lemma is an immediate extension of a result proved recently [21, Lemma 4]. We include the proof here for the sake of completeness.
Let be a symmetric matrix and be an arbitrary matrix. Then, for any proper cone , the copositive inequality is satisfied if and only if there exists a matrix such that
Next, we present the proof of Theorem 1 using these lemmas.
The dual of this completely positive program can be written as:
In Lemma 2, we show that a Slater point exists in the optimization problem (8). Hence, strong duality holds and . Next, note that if and only if there exists a feasible solution to problem (8) whose objective function value is at most . Therefore, if and only if there exists such that
which in turn holds if and only if
The constraint (9) has non-linearity because of the terms involving the product of the decision variables and . However, by Lemma 3, this constraint is satisfied if and only if there exist variables and such that
Theorem 1 implies that can be reformulated exactly as the copositive program (4), which remains difficult to solve. In the following section, we discuss tractable approximations to (4). First, we extend the copositive reformulation to sets which can be expressed as the union or the Minkowski sum of a finite number of sets satisfying Assumption 1.
Remark 1 (Union of sets).
Let , where the set satisfies Assumption 1 for all . Let , be the corresponding cones defined as (2). The set does not satisfy Assumption 1 since it may not be convex. However, it is possible to derive a copositive reformulation to as follows. Note that
if and only if
Remark 2 (Minkowski sum of sets).
For all , let the set satisfy Assumption 1 and be the corresponding cone defined as (2). Let be the Minkowski sum of these sets, i.e., Although satisfies Assumption 1, it might be difficult to find a concise representation for it. However, we can still reformulate for as a copositive program of polynomial size as follows. Observe that
where By defining the cone as
and repeating the steps of Theorem 1, we get the following copositive reformulation:
In this section, we use the reformulation (4) to present tractable approximations for . There exists a hierarchy of sets , which provide increasingly tight approximations of the set [12, 13, 14]. Replacing with one of these sets yields a conservative approximation to the original problem. Here, we focus on the following sets defined by affine and quadratic inequalities:
where and . This set can be represented in the form (3) by defining the cone as
The simplest inner approximation of for the cone defined above is given by:
For completeness, we show that is indeed a subset of in the following lemma. We use this fact in Theorem 2 to derive a tractable SDP that provides an approximation to the MVEP.
The set is a subset of .
First, observe that for all ,
Also . Next, consider any element . For any , we have that
where the final inequality follows from the fact that every term in the summation is non-negative because , , and . Therefore for all , which implies that . Hence, . ∎
If the set is defined as (12), then the optimal value of the following SDP is an upper bound for the optimal value of :
In what follows, we show that the approximation based on not only has good theoretical guarantees, but also generates high-quality solutions empirically. Although one can get better solutions by using higher order approximations of the set , the size of the resulting SDPs is much larger, making them computationally unattractive.
3.1 Comparison with the S-procedure
For the set (12), applying the well-known S-procedure leads to the following SDP:
We derive this approximation in Appendix A. In the next theorem, we prove that our method always generates a solution at least as good as the one generated by using the S-procedure.
We show that any feasible solution to the S-procedure formulation (16) can be used to generate a feasible solution to (15) having the same objective function value. Let be a feasible solution to (16). Consider a solution to (15), where and are the same as . In addition, and the variables and satisfy
By construction, the left- and the right-hand sides of the first semidefinite constraint of the problem (15) are equal; therefore, the constraint is satisfied. Furthermore, substituting the values of , and from above into the semidefinite constraint of the problem (16), we find that the second semidefinite constraint of (15) also holds. Therefore, this solution is feasible to the problem (15). In addition, both these solutions give the same objective function value. Thus, the claim follows. ∎
Next, we turn to the case when the set is a polytope. The S-procedure requires the presence of at least one quadratic constraint in the definition of the set . This can be achieved by finding an ellipsoid which contains , and adding as a redundant constraint in the definition of . Since the ellipsoid contains , its volume already provides an upper bound to the optimal value of . We can then apply the S-procedure in the hope of finding an ellipsoid with lower volume. However, in the following proposition, we show that applying the S-procedure provides no such improvement and returns the same ellipsoid as its unique optimal solution.
For the set , the S-procedure yields the following approximation:
The Lagrange dual of (17) is given by
Consider the following solution to the primal problem (17):
This solution is feasible since , and
Next, consider the following solution to the dual problem (18):
where is the center of the ellipsoid . We claim that this solution is feasible to (18). Under the assumption that the center of the ellipsoid lies inside the polytope, we get that , which implies that . Next, we have that
where the last inequality uses the fact that . Also,
Therefore, all constraints in the dual problem are satisfied. Finally, both of these solutions give an objective function value of . Therefore, both solutions are optimal to their respective problems. This shows the existence of a primal optimal solution with and . Furthermore, the solution is unique because the feasible region is convex and the objective function is strictly convex in the space of positive definite matrices. ∎
One way to find an ellipsoid covering the polytope is to scale the MVIE by a factor equal to the dimension of the polytope. Since the center of the MVIE lies inside , Proposition 1 implies that the S-procedure will not improve upon the ellipsoid provided by scaling the MVIE. The volume of this ellipsoid can be very conservative because of the scaling factor. On the other hand, our method can generate significantly better solutions, which we demonstrate with the following example.
Example 1 (Chipped Square).
Let be the following set parametrized by : . We compare the solutions generated by our approximation method and the S-procedure for different values of . Figure 1 plots the optimal volume as well as the volumes of the ellipsoids generated by the two approximation methods for . Figure 1 shows the ellipsoids generated by the three methods for . We observe that the set is a simplex when , and both approximation methods are optimal in that case. However, when , the S-procedure generates inferior solutions compared to our approach.
4 Numerical Experiments
In this section, we experimentally compare our proposed approximation method with the constraint generation approach and the S-procedure in terms of the solution quality and the computational time. All optimization problems are solved using the YALMIP interface on a 16-core 3.4 GHz computer with 32 GB RAM. We use MOSEK 8.1 to solve SDPs, and CPLEX 12.8 to solve the non-convex quadratic programs to optimality. The metric that we use to quantify the suboptimality of an approximation method is
where is the volume of the ellipsoid generated by the approximation method and is the true minimum volume found using the constraint generation approach. This expression represents the suboptimality in the “size” of the ellipsoids generated by the approximation methods.
We perform the experiment on polytopes generated randomly as follows. We start with the hyper-rectangle with center . Then we add linear inequalities in the following way. For , we generate a vector uniformly distributed on the surface of the unit hypersphere. We generate a distance uniformly at random from the interval , and add the constraint if and if . Choosing from the specified interval leads to a constraint that cuts the hyper-rectangle (i.e., the constraint is not redundant). Also, the construction of the constraint ensures that is feasible, thereby avoiding the case of an infeasible polytope. In light of Proposition 1, we use the ellipsoid formed by scaling the MVIE by a factor equal to the dimension as the solution provided by the S-procedure.
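The random construction of the cutting constraints can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact generator used in the experiments: each cut is written as $a^\top (x - \text{center}) \le \text{dist}$ with the distance drawn from the open range between zero and the largest value of $a^\top (x - \text{center})$ over the box, which keeps the center feasible and makes every cut non-redundant. The interval, the single-case constraint form, and all function names are hypothetical.

```python
import math
import random

def random_cut(half_widths, rng):
    """Sample one cut a^T (x - center) <= dist through a box with given half-widths."""
    g = [rng.gauss(0.0, 1.0) for _ in half_widths]
    norm = math.sqrt(sum(v * v for v in g))
    a = [v / norm for v in g]          # normalized Gaussian: uniform on the unit sphere
    # Largest value of a^T (x - center) over the box; any smaller dist cuts the box.
    reach = sum(abs(ai) * h for ai, h in zip(a, half_widths))
    dist = rng.uniform(0.0, reach)
    return a, dist

def random_cuts(half_widths, num_cuts, seed=0):
    """Generate `num_cuts` random cuts with a reproducible seed."""
    rng = random.Random(seed)
    return [random_cut(half_widths, rng) for _ in range(num_cuts)]

def is_feasible(x, center, half_widths, cuts, tol=1e-12):
    """Check box membership and every cut for the point x."""
    in_box = all(abs(xi - ci) <= h + tol
                 for xi, ci, h in zip(x, center, half_widths))
    return in_box and all(
        sum(ai * (xi - ci) for ai, xi, ci in zip(a, x, center)) <= dist + tol
        for a, dist in cuts
    )

center = [0.5, 0.5, 0.5]
half_widths = [0.5, 0.5, 0.5]
cuts = random_cuts(half_widths, num_cuts=5, seed=1)
center_ok = is_feasible(center, center, half_widths, cuts)
```

Drawing the direction as a normalized Gaussian vector is a standard way to sample uniformly from the unit sphere; the box vertex farthest along each direction is the point that the corresponding cut removes.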
For several values of and , we solve the problem exactly as well as using the two approximation methods on randomly generated instances. We report the suboptimality results of the two approximation methods in Table 1. For higher values of , for which we were not able to solve the problem exactly, we report the suboptimality of the S-procedure with respect to our approach in Table 2. It can be seen that our method provides significantly better solutions than the S-procedure. We notice that the suboptimality of the S-procedure increases with the dimension of the set . This is because we have to scale the MVIE by a factor equal to the dimension, which becomes very conservative in higher dimensions. The suboptimality of our method, however, remains the same as the dimension increases.
We report the computation times for solving the problem to optimality and for the two approximation methods in Table 3. For the S-procedure, we report the time to find the MVIE. For small problem instances, our method finds a solution faster than solving the problem to optimality. For higher-dimensional problems, where solving the problem exactly becomes intractable, our approximation continues to efficiently provide high-quality solutions.
Appendix A S-procedure
Lemma 5 (S-procedure).
Let . Then the optimal value of the non-convex quadratic optimization problem
is if there exist such that
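For reference, a standard form of this sufficient condition can be sketched as follows. The quadratic functions $q_i(x) = x^\top A_i x + 2 b_i^\top x + c_i$, the index range, and the multipliers $\lambda_i$ are notation assumed for this sketch and need not match the lemma's exact statement.

```latex
\exists\, \lambda_1, \ldots, \lambda_m \ge 0 :\quad
\begin{bmatrix} A_0 & b_0 \\ b_0^\top & c_0 \end{bmatrix}
\preceq
\sum_{i=1}^{m} \lambda_i
\begin{bmatrix} A_i & b_i \\ b_i^\top & c_i \end{bmatrix}
\quad\Longrightarrow\quad
q_0(x) \le 0 \ \ \text{for all } x \text{ with } q_i(x) \le 0,\ i = 1, \ldots, m.
```

The implication follows by multiplying the matrix inequality on both sides by $[x; 1]$, which gives $q_0(x) \le \sum_{i=1}^{m} \lambda_i q_i(x) \le 0$ for every such $x$. The converse does not hold in general when $m > 1$, which is why the condition yields only a conservative approximation.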