Practical volume estimation by a new annealing schedule for cooling convex bodies
We study the problem of estimating the volume of convex polytopes, focusing on H- and V-polytopes, as well as zonotopes. Although a lot of effort has been devoted to practical algorithms for H-polytopes, no such method exists for the latter two representations. We propose a new, practical algorithm for all representations, which is faster than existing methods. It relies on Hit-and-Run sampling, and combines a new simulated annealing method with the Multiphase Monte Carlo (MMC) approach.
Our method introduces the following key features to make it adaptive: (a) It defines the sequence of convex bodies in MMC by introducing a new annealing schedule, whose length is shorter than in previous methods with high probability, and it removes the need to compute an enclosing and an inscribed ball; (b) It exploits statistical properties of rejection sampling and proposes a better empirical convergence criterion for specifying each step; (c) For zonotopes, it may use a sequence of convex bodies in MMC different from balls, where the chosen body adapts to the input. We offer an open-source, optimized C++ implementation, and analyze its performance to show that it outperforms state-of-the-art software for H-polytopes by Cousins-Vempala (2016) and Emiris-Fisikopoulos (2018), while it undertakes volume computations that were intractable until now, as it is the first polynomial-time, practical method for V-polytopes and zonotopes that scales to high dimensions (currently $100$).
We further focus on zonotopes, and characterize them by their order (number of generators over dimension), because this largely determines sampling complexity. The number of phases in MMC tends to a constant as the order increases, while the bodies in (c) are balls. For low orders, the generators' matrix is used to define a centrally symmetric convex polytope in (c). We analyze a related application, where we evaluate methods of zonotope approximation in engineering.
Apostolos Chalkis, Department of Informatics & Telecommunications, National & Kapodistrian University of Athens, Greece, achalkis@di.uoa.gr
Ioannis Z. Emiris, Department of Informatics & Telecommunications, National & Kapodistrian University of Athens, Greece, emiris@di.uoa.gr
Vissarion Fisikopoulos, Department of Informatics & Telecommunications, National & Kapodistrian University of Athens, Greece, firstname.lastname@example.org
\Copyright Apostolos Chalkis, Ioannis Z. Emiris and Vissarion Fisikopoulos
\subjclass Design and analysis of algorithms: Computational geometry, Random walks and Markov chains
As a special case of integration, volume computation is a fundamental problem with many applications in science and engineering. From a computational complexity point of view it is hard even if we restrict to convex sets. In particular, it is #P-hard for H- and V-polytopes, including zonotopes [volZono]. It is even hard to approximate, namely, APX-hard [Elekes1986]. Therefore, a great effort has been devoted to randomized approximation algorithms, starting with the celebrated result of [DyerFrKa91] with complexity $O^*(d^{23})$, where $O^*(\cdot)$ indicates a soft-big-Oh hiding polylog factors and $d$ is the dimension. It introduced the Multiphase Monte Carlo (MMC) technique, which reduced volume approximation to computing a telescoping product of volumes, estimated by uniformly sampling a sequence of convex bodies (by means of random walks). In the following years, improved algorithms successively reduced the exponent of the dimension [LovSim]. The latter led to the first practical implementation [VolEsti] for high dimensions, which highlighted the importance of Coordinate Direction Hit-and-Run (HnR). Further results in [CousinsV14, LovVem] reduced the exponent down to $3$, followed by another practical method [Cousins15].
This paper proposes a new practical volume estimation for convex polytopes, improving upon state-of-the-art methods for H-polytopes, while yielding the first method capable of scaling in the case of V-polytopes and zonotopes. We use an adaptive sequence of convex bodies, simulated annealing, and the statistical properties of the telescoping ratios in order to drastically reduce the number of phases in MMC as well as the sample size required to estimate these ratios. Our aim is to fully exploit probabilistic methods within the current paradigm of the MMC approach, and to optimize the resulting software by careful algorithmic engineering.
Notation. $P$ is a full-dimensional convex polytope lying in $d$-dimensional space $\mathbb{R}^d$. An H-polytope (in H-representation) is $P=\{x\in\mathbb{R}^d : Ax\le b\}$, for $A\in\mathbb{R}^{m\times d}$, $b\in\mathbb{R}^m$. A V-polytope is the convex hull of a pointset in $\mathbb{R}^d$. A zonotope (Z-polytope) is the Minkowski sum of $k$ $d$-dimensional segments or, equivalently, it is given by a generator matrix $G\in\mathbb{R}^{d\times k}$ and seen as a linear map of the hypercube $[-1,1]^k$ to $\mathbb{R}^d$: we call it a Z-representation. The order of a zonotope is the ratio $k/d$. We study sequences of bodies $C_1,\dots,C_m$ intersecting $P$, where the corresponding convex bodies $P_i := C_i\cap P$ define the telescoping product.
Membership oracles. A point $p\in P$ if and only if $Ap\le b$, when $P$ is an H-polytope. If $P$ is a V-polytope with vertices $v_1,\dots,v_n$, we follow [fukunda] and solve the linear program (LP) below:
\[ \max\; z^Tp - z_0 \quad \text{s.t.}\quad z^Tv_j - z_0 \le 0,\; j=1,\dots,n,\qquad z^Tp - z_0 \le 1 . \]
Then $p\notin P$ holds if and only if the optimal value of the LP is strictly positive, and the LP computes a separating hyperplane. If $P$ is a zonotope with generator matrix $G$, we solve the following linear feasibility problem:
\[ \exists\, x\in\mathbb{R}^k:\quad Gx = p,\quad -1\le x_j\le 1,\; j=1,\dots,k . \]
Then $p\in P$ holds if and only if the answer is positive.
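These oracles take only a few lines in code. The following is a minimal Python sketch (not the paper's C++ implementation): the H-polytope check is a direct inequality test, while the zonotope check solves the feasibility problem with `scipy.optimize.linprog`; the tolerance and solver choice are our own assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def in_hpolytope(A, b, p):
    # H-polytope: p lies in P iff A p <= b component-wise.
    return bool(np.all(A @ p <= b + 1e-12))

def in_zonotope(G, p):
    # Zonotope Z = {G x : x in [-1, 1]^k}: membership is the
    # feasibility problem  G x = p, -1 <= x <= 1  (zero objective).
    d, k = G.shape
    res = linprog(c=np.zeros(k), A_eq=G, b_eq=p,
                  bounds=[(-1.0, 1.0)] * k, method="highs")
    return res.status == 0  # 0 = solved (feasible), 2 = infeasible

# The unit square [-1,1]^2, once as an H-polytope and once as a zonotope.
A = np.vstack([np.eye(2), -np.eye(2)]); b = np.ones(4)
G = np.eye(2)
print(in_hpolytope(A, b, np.array([0.5, 0.5])))  # True
print(in_zonotope(G, np.array([0.5, 0.5])))      # True
print(in_zonotope(G, np.array([1.5, 0.0])))      # False
```

The V-polytope separation LP of [fukunda] can be encoded the same way, with variables $(z, z_0)$.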
Paper structure. The rest of the section presents previous work as well as our contributions. Section 2 discusses our method, while Section 3 presents our implementation, evaluates its practical complexity, compares to existing software and offers a concrete application.
1.1 Previous work
The prevalent paradigm in volume approximation relies on MMC and sampling with random walks. We build on the approach which defines a sequence of convex bodies such that rejection sampling efficiently estimates the successive ratios of volumes. Assuming $P$ is well-rounded, i.e. $B_d\subseteq P\subseteq O(\sqrt d)\,B_d$, where $B_d$ is the unit ball, one defines a sequence of scaled copies of $B_d$, namely $B_i := 2^{i/d}B_d$, and $P_i := P\cap B_i$, $i=0,\dots,m$. Then, it suffices to compute $\mathrm{vol}(P_0)=\mathrm{vol}(B_d)$ and apply the following telescopic product from [LovSim]:
\[ \mathrm{vol}(P) \;=\; \mathrm{vol}(P_0)\prod_{i=1}^{m}\frac{\mathrm{vol}(P_i)}{\mathrm{vol}(P_{i-1})}. \]
In practical implementations [VolEsti], assuming $B_d\subseteq P\subseteq \rho B_d$, the construction gives $m=\lceil d\lg\rho\rceil$, see Figure 1 (left). The critical complexity issue is to define a sequence that minimizes $m$ while each ratio remains bounded by a constant. This would permit a larger approximation error per ratio without compromising the overall error, while it would require a smaller sample.
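To make the telescopic product concrete, here is a self-contained toy sketch in Python (our own illustration, not the paper's code): it estimates the volume of the square $[-1,1]^2$ with the ball sequence above, where a single phase suffices, and uses direct uniform sampling in place of a random walk.

```python
import math, random

random.seed(7)
d = 2                       # dimension; P = [-1, 1]^2, vol(P) = 4
# Ball sequence B_i = 2^{i/d} * B_d: the unit ball fits inside P and
# radius sqrt(2) already covers P, so m = 1 phase suffices here.
radii = [1.0, math.sqrt(2)]

def vol_ball(r):            # closed formula for the innermost body (d = 2)
    return math.pi * r * r

# Estimate vol(P_0)/vol(P_1) with P_i = P ∩ B_i; here P_1 = P itself,
# so uniform sampling from P_1 is direct (a real code would use HnR).
N, hits = 20000, 0
for _ in range(N):
    x, y = random.uniform(-1, 1), random.uniform(-1, 1)
    if x * x + y * y <= radii[0] ** 2:
        hits += 1
ratio = hits / N            # ≈ vol(P_0)/vol(P_1) = pi/4
vol_estimate = vol_ball(radii[0]) / ratio
print(vol_estimate)         # ≈ 4
```

The estimate divides the exactly known $\mathrm{vol}(P_0)$ by the sampled ratio, exactly as in the telescopic product with $m=1$.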
Each ratio is estimated by sampling uniform points from the larger body, obtained by random walks; the main approach today is HnR. More recently, a convergence rate was given for the Hamiltonian walk [LeeVem], applicable to H-polytopes only. The Vaidya walk [JMLR:v19:18-158] is even faster when the number of facets is large relative to the dimension. A recent Hamiltonian walk with reflections [ChePioCaz18] reduces mixing time: it can enhance our method for H-polytopes and may offer better stability than Coordinate Direction HnR; it can also be integrated into the two existing implementations discussed below. But its application to V- and Z-polytopes is unclear.
In [LovVem] they construct and estimate the volume of a $(d+1)$-dimensional convex body, called the "pencil", and then use rejection sampling to estimate the volume of the input convex body. Moreover, they generalize the telescopic product to a sequence of functions by fixing a sequence of exponential distributions approximating the uniform; the total complexity is $O^*(d^4)$. In [CousinsV14], which holds the current record in asymptotic analysis for volume approximation of general convex bodies, they consider a sequence of spherical Gaussian distributions on $P$; the total complexity is $O^*(d^3)$. The sequence of spherical Gaussians is not deterministic, but approaches the uniform distribution fast. In [LeeVem] they improve the asymptotic complexity for H-polytopes by combining Hamiltonian Monte Carlo with a sequence of Gibbs distributions instead of Gaussians.
Current state-of-the-art software handles H-polytopes based on the above paradigms and, typically, HnR. The software of [VolEsti] scales up to hundreds of dimensions and uses Coordinate Direction HnR. We also juxtapose the software of [Cousins15] (for H-polytopes), implementing [CousinsV14] with an annealing schedule [LovVem] over a sequence of Gaussians. For each ratio, they use a sliding window and stop sampling when the maximum and minimum values meet a convergence criterion.
The main reason that these implementations cannot efficiently handle Z- or V-polytopes (they cannot scale beyond low dimensions) is that the boundary and membership oracles require solving linear programs (LPs). Moreover, both of them require inscribed balls. If $P$ is a zonotope, checking whether a given ball is inscribed in $P$ lies in co-NP, but it is not known whether it is co-NP-complete. When $P$ is a V-polytope, given a center point, the computation of the largest inscribed ball with that center is NP-hard [Vpolyinsc]. Additionally, the software of [Cousins15] requires the number of facets, which is typically exponential in the dimension for both Z- and V-polytopes (Section 3 and Figure 6).
To sum up the discussion on previous work, we should mention the rich area of implementations of deterministic algorithms; notable examples are VINCI and Qhull, but the list is quite long. As expected, those implementations do not scale to high dimensions for general polytopes.
1.2 Our contribution
Our contribution is a new volume approximation method for H-, V-, and Z-polytopes, also applicable to general convex bodies, though in this paper we focus on convex polytopes. For V- and Z-polytopes, our algorithm requires solving two (related) LPs per step of the random walk (Section 2.3). However, we drastically reduce the number of such steps, hence offering the first practical algorithm for such bodies in high dimensions. Regarding LPs, each step solves two LPs with a common basic feasible solution. In Section 3 we experimentally analyze our method to show it scales up to 100 dimensions for V-polytopes and low-order zonotopes. Hence, it outperforms both implementations in [VolEsti, Cousins15] on V- and Z-polytopes (Figure 6). In fact, it performs volume computations which were intractable until now (Table 2). On H-polytopes our method is faster than [VolEsti] in every dimension tested, and faster than [Cousins15] beyond moderate dimensions (Table 3 and Figure 6). The main algorithmic features follow:
We design a new simulated annealing method for cooling convex bodies in MMC (Section 2.1). We exploit the fact that rejection sampling works efficiently for bounded ratios, which are actually smaller than those in [LovSim] (see Figure 1). To reduce the number of phases, we employ statistical tests to bound each ratio with high probability. Our annealing schedule does not need an enclosing body of $P$, as in [LovSim, VolEsti]. In addition, our method does not require computing an inscribed ball, as older methods did: the ball (or any body we use in MMC) with minimum volume is computed by the annealing schedule, thus further reducing the number of phases in practice. Finally, we prove that the annealing schedule terminates successfully with constant probability (Section 2.1). This adaptive MMC sequence reduces significantly the number of phases: we prove that the number of phases in [LovSim, VolEsti] upper bounds the number of phases in our method with high probability (Section 2.4).
When we sample $N$ uniform points from $P_i$, the number of points in the smaller body $P_{i+1}$ follows the binomial distribution. The main task here is to estimate the ratio $\mathrm{vol}(P_{i+1})/\mathrm{vol}(P_i)$ with minimum $N$. We exploit the binomial proportion confidence interval and modify it by using the standard deviation of a sliding window, in order to specify a new empirical convergence criterion for the ratio as $N$ increases (Section 2.2); this drastically reduces the sample size (Figure 6). An analogous technique was used in [Cousins15], but the window here is of half the length.
We allow other convex bodies besides balls in MMC, the choice being a function of the input, aiming at reducing the number of phases. For H- and V-polytopes we actually use balls as in classical MMC. We leave it as an open question whether there are more suitable bodies, such as those defined in [apx]. For zonotopes we show that, as the order grows, the most suitable body is the ball: our method requires only a small constant number of them (Figure 3 and Table 2). However, for low order, we use the matrix of generators to define a centrally symmetric H-polytope that accelerates the algorithm so as to scale to dimension $100$, which used to be intractable (Section 2.5). For instance, random low-order zonotopes whose volume used to be intractable are now computed within hours (Table 2).
We prove that, in our method, the number of phases is bounded with high probability (Section 2.4) when we use balls in MMC. For some well-rounded polytopes, such as the cross polytopes, this yields a clear improvement over the general bound of [LovSim]. Specifically, for a cross polytope in V-representation our method requires just two balls (Figure 5 and Table 1), while the problem is intractable under the H-representation. Moreover, when applying a rounding step to random V-polytopes, our method requires a very small number of phases in practice, even in high dimensions (Section 3 and Table 1). If $P$ is a zonotope, we experimentally show that, for constant $d$ and an increasing number of generators $k$, the number of phases decreases to 1 (Table 2 and Figure 3). An intuitive property of zonotopes is that, as the order increases for constant $d$, a random zonotope approximates the hypersphere. In [AprBallZon] they prove that the unit ball can be approximated in the Hausdorff distance by a zonotope defined by segments of equal length. This result cannot be used straightforwardly to prove our claim but strengthens it intuitively. For some instances, when the order is large, our method creates just one ball and reduces to just one or two rejection-sampling steps (Table 2).
Last but not least, we offer an open-source, efficient implementation of the new method in C++ (https://github.com/GeomScale/volume_approximation/tree/CoolingBodies).
2 Volume algorithm
Our volume algorithm relies on the Multiphase Monte Carlo (MMC) method and samples from the uniform target distribution with the Hit-and-Run (HnR) random walk. Hence, the first part of the algorithm (Algorithm 1) is to construct a sequence of convex bodies intersecting the given polytope using simulated annealing (Algorithm 2). Then it estimates each ratio in the telescopic product of Equation (1) using HnR and a new empirical convergence criterion (Algorithm 3). A typical choice for the bodies is a sequence of co-centric balls, but other convex bodies may be used (see, e.g., Section 2.5).
The behavior of Algorithm 1 is parameterized by: the approximation error $\epsilon$; the cooling parameters $r$, $\delta$, which are used in the schedule; the significance level (s.l.) $\alpha$ of the statistical tests; the number of ratios $\nu$ used in the t-tests, which fixes the degrees of freedom of the t-student distribution (all in Section 2.1); and a parameter that controls the number of points generated in each body. We generate uniform samples using HnR (Section 2.3).
Following the telescopic product of Equation (1), it is clear that in practical estimations the innermost body has to be a convex body whose volume is obtained much faster than that of $P$ (ideally by a closed formula) and which is easy to sample. The number of convex bodies constructed by Algorithm 1 can be bounded by upper and lower probabilistic bounds (Section 2.4). When the input is a zonotope we study other choices for the sequence of bodies, i.e., sequences of balls and of H-polytopes (Section 2.5).
2.1 Annealing schedule for convex bodies
Given a convex polytope $P$, an error parameter $\epsilon$, and cooling parameters $r,\delta$ s.t. $0<r<r+\delta<1$, the annealing schedule (Algorithm 2) generates the sequence of convex bodies $C_1,\dots,C_m$ defining $P_i := C_i\cap P$ and the telescopic product of Equation (1). The main goal is to restrict each ratio in the interval $[r, r+\delta]$ with high probability. We call $r,\delta$ the cooling parameters and $\alpha$ the significance level (s.l.) of the schedule.
We introduce some notions from statistics needed to define two tests and refer to [Cramer46] for details. Given $\nu$ observations from a r.v. $X$ with unknown variance, the (one-tailed) t-test checks the null hypothesis that the population mean exceeds a specified value $\mu_0$ using the statistic $t = \frac{\bar x - \mu_0}{s/\sqrt{\nu}}$, where $\bar x$ is the sample mean and $s$ the sample standard deviation; under the null hypothesis, $t$ follows the t-student distribution with $\nu-1$ degrees of freedom. Given a s.l. $\alpha$, we test the null hypothesis $H_0:\mu\ge\mu_0$ for the mean value of the population, against $H_1:\mu<\mu_0$. We reject $H_0$ if
\[ t \le -t_{\nu-1,\alpha}, \]
where $t_{\nu-1,\alpha}$ is the upper-$\alpha$ critical value of the t-student distribution with $\nu-1$ degrees of freedom; this implies $\mu<\mu_0$ with confidence $1-\alpha$. Otherwise we fail to reject $H_0$.
The schedule algorithm uses the following two statistical tests:
testL: successful if $H_0:\mu\le r$ is rejected, i.e., the ratio is larger than $r$ with high probability.
testR: successful if $H_0:\mu\ge r+\delta$ is rejected, i.e., the ratio is smaller than $r+\delta$ with high probability.
They are used to restrict each ratio to $[r, r+\delta]$. In the sequel we write testL($\cdot$) and testR($\cdot$).
If we sample $N$ uniform points from a body $P_i$, then the r.v. $X$ that counts the points in $P_{i+1}$ follows the binomial distribution $\mathcal{B}(N, q)$, where $q = \mathrm{vol}(P_{i+1})/\mathrm{vol}(P_i)$, which is approximately Gaussian for large $N$.
This normal approximation suffices when $N$ is large enough, and we adopt the well-known rule of thumb to use it only if $Nq\ge 5$ and $N(1-q)\ge 5$.
Then each sample proportion that counts successes over $n$ points is an unbiased estimator of the mean of $X/n$, which is $q$. So if we sample $\nu n$ points from $P_i$ and split the sample into $\nu$ sublists of length $n$, the corresponding ratios are experimental values that follow an approximately Gaussian distribution and can be used to check both null hypotheses in testL and testR. So, using the mean of the ratios, if both testL and testR succeed,
then the ratio is restricted to $[r, r+\delta]$ with high probability.
Details and bounds on the number of phases are given in Section 2.4.
Perform testR and testL
Input: convex bodies $P_i, P_{i+1}$, cooling parameters $r,\delta$, significance level $\alpha$, and $\nu$, $n$
Sample $\nu n$ uniform points from $P_i$
Partition the points into $\nu$ lists, each of length $n$
Compute the ratios $\hat q_j := \#\{\text{points of the } j\text{-th list in } P_{i+1}\}/n$, for $j=1,\dots,\nu$
Compute the mean, $\bar q$, and standard deviation, $s$, of the ratios
if $\frac{\bar q - (r+\delta)}{s/\sqrt{\nu}} \le -t_{\nu-1,\alpha}$ then testR holds, otherwise testR fails
if $\frac{\bar q - r}{s/\sqrt{\nu}} \ge t_{\nu-1,\alpha}$ then testL holds, otherwise testL fails
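Under the normal approximation, the two tests amount to a few lines. The following Python sketch is our own toy illustration: it hard-codes the one-tailed critical value for $\nu-1=9$ degrees of freedom and $\alpha=0.05$ (a precise implementation would obtain it from `scipy.stats.t.ppf`), and the synthetic ratios stand in for the sample proportions of the schedule.

```python
from statistics import mean, stdev
import random

def t_tests(ratios, r, delta, t_crit):
    """testL / testR of the schedule on nu sample proportions.
    testR succeeds if H0: mu >= r + delta is rejected (ratio small enough);
    testL succeeds if H0: mu <= r       is rejected (ratio large enough)."""
    nu = len(ratios)
    m, s = mean(ratios), stdev(ratios)
    testR = (m - (r + delta)) / (s / nu ** 0.5) <= -t_crit
    testL = (m - r) / (s / nu ** 0.5) >= t_crit
    return testL, testR

# Toy check: proportions concentrated near 0.25 with [r, r+delta] = [0.1, 0.4].
random.seed(1)
ratios = [0.25 + random.gauss(0, 0.02) for _ in range(10)]
t_crit = 1.833  # one-tailed t critical value, 9 dof, alpha = 0.05
print(t_tests(ratios, 0.10, 0.30, t_crit))  # (True, True)
```

Both tests succeed, so the corresponding ratio would be accepted as lying in $[r, r+\delta]$ with high probability.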
Let us now describe the annealing schedule. Each body in the sequence is a scalar multiple of a given body $C$. When $C$ is the unit ball, the body used in each step is determined by a radius. Since our algorithm does not use an inscribed ball, the initialization step computes the body with minimum volume, denoted by $C_{\min}$, which will be the last body in the sequence with high probability. The algorithm employs $C_{\min}$ to decide stopping at the $i$-th step; if the criterion fails, the algorithm computes the next body by a regular step.
Initialization step. The schedule is given the convex body $C$ and an interval of scaling factors. At initialization, one computes a factor s.t. both testL and testR are successful. One endpoint of the interval corresponds to a body that fails testR with probability close to 1, while the other to one for which testL fails with probability arbitrarily close to 1. The computation of this interval is not trivial, but in Section 3 we give practical selections, depending on the body $C$, which are very efficient in practice.
The algorithm performs binary search in the interval. Let $u$ be the midpoint of the current interval; then:
If testL($u$) succeeds and testR($u$) fails, continue to the left half of the interval.
If testL($u$) fails and testR($u$) succeeds, continue to the right half of the interval.
If both testL($u$) and testR($u$) succeed, stop and set $C_{\min} := uC$.
If both testL($u$) and testR($u$) fail (contradiction), then sample a new set of uniform points from $P$ and repeat both tests.
Note that in each step of the binary search, the schedule samples points from $P$ to check both testL and testR. The output is the body denoted by $C_{\min}$ at termination.
Regular step. At step $i$, the algorithm determines the next body by computing a scaling factor of $C$ s.t. the corresponding volume ratio lies in $[r, r+\delta]$ with high probability. The schedule samples points from $P_i$ and binary searches for a factor in an updated interval s.t. both testL and testR are successful; this factor defines $C_{i+1}$. To update the interval, Algorithm 2 uses the factor computed in the initialization step as one endpoint and the factor computed in the previous step as the other. The updated interval implies that $C_{i+1}$ has to lie between $C_{\min}$ and $C_i$.
Stopping criterion. The algorithm uses $C_{\min}$ in the $i$-th step for checking whether the remaining ratio is admissible with high probability, using only testR. Formally, $C_{\min}$ is the body with minimum volume in the sequence:
Stop at step $i$ if testR($C_{\min}$) holds. Then, set $C_m := C_{\min}$ and $P_m := C_m\cap P$.
To perform testR, points are sampled from $P_i$. The schedule stops when the corresponding ratio is accepted by testR. It is clear that, at termination, the last body is $C_{\min}$. Otherwise, the algorithm determines the next convex body in a regular step.
Termination. We demonstrate halting of Algorithm 2 for a given input polytope and a set of parameters. Before stating the theorem, let us introduce the notion of the power of a t-test: the probability that the test correctly rejects the null hypothesis when the alternative is true. The power of a t-test usually cannot be calculated exactly in practice. It is well known that it depends on the s.l. $\alpha$, the sample size $\nu$, and the magnitude of the effect on the mean value of the population. For example, for testR, assuming the true mean lies below $r+\delta$, the power can be written in terms of the quantile function of the t-student distribution with $\nu-1$ degrees of freedom. A similar analysis for the power of testL is straightforward.
In the t-tests of Algorithm 2 we might have errors of type I or II and, thus, binary search in intervals that do not contain values corresponding to ratios in $[r, r+\delta]$. Therefore, there is a probability that Algorithm 2 fails to terminate. The following theorem states that this probability is bounded by a constant when Algorithm 2 performs at least as many steps as the minimum number required for it to terminate: this number is denoted by $m^*$, depends on the inputs, and occurs when there are no errors in the performed t-tests.
If Algorithm 2 fails to terminate after $m^*$ pairs of testL and testR, then some type I or type II error occurred in the t-tests. An error of type I occurs when the null hypothesis is true and the test rejects it, while type II occurs when the null hypothesis is false and the test fails to reject it. The respective probabilities are $\alpha$ and $\beta$, the latter given by the quantile function of the t-student distribution. For the latter probability we write $\beta_L$ and $\beta_R$ for testL and testR, respectively. If, for a pair of tests, both null hypotheses are false, then an error occurs with probability $1-(1-\beta_L)(1-\beta_R)$.
2.2 Empirical ratio estimation
As described in the previous section, the annealing schedule returns bodies $C_1,\dots,C_m$ intersecting $P$; that is, we estimate $m$ ratios in total. In this section we describe how this estimation is performed. First, we bound the error in the estimation of each ratio, in order to use it in the definition of the stopping criterion. For each ratio, we bound its error by $\epsilon_i$ s.t. the per-ratio errors compose to a total error of at most $\epsilon$ in Equation (1).
For a fixed step $i$ of the schedule, and for each new sample point generated in $P_i$, we update and keep the value of the $i$-th ratio. If we assume uniform sampling, then the number of points in $P_{i+1}$ follows the binomial distribution $\mathcal{B}(n, q_i)$, where $n$ is the number of points we have generated in $P_i$. Then a confidence interval for $q_i$ is given by
\[ \hat q \pm z_{\alpha/2}\sqrt{\frac{\hat q(1-\hat q)}{n}}, \]
where $\hat q$ is the proportion of the number of points in $P_{i+1}$ over $n$, and $z_{\alpha/2}$ is the corresponding quantile of the Gaussian distribution. Notice that as $n$ increases the interval becomes tighter around $\hat q$, so a natural choice is to stop sampling when the half-width of the interval drops below $\epsilon_i\hat q$. Then Equation (1) would estimate $\mathrm{vol}(P)$ up to at most $\epsilon$ error with high probability. In practice we generate approximately uniform samples, which makes that criterion completely useless: for a small number of random-walk steps, the number of points we obtain in $P_{i+1}$ does not follow the binomial distribution, and the criterion usually results in false positives.
Recall that the quantity $\sqrt{\hat q(1-\hat q)/n}$ is an estimator of the standard deviation of all the sample proportions of size $n$. In our empirical criterion we replace that quantity with the standard deviation of a sliding window, which stores consecutive estimators of $q_i$. In particular, we store the last $\ell$ ratios in a queue called the sliding window, denoted by $W$, and we say that $W$ has length $\ell$. We update $W$ each time a new sample point is generated, by inserting the new value of the $i$-th ratio and popping out the oldest ratio value in $W$. We stop sampling when $\hat q$ and the st.d. of the values in $W$, called $s_W$, meet the criterion of Equation (4): we then say they meet convergence. Clearly, for the first $\ell$ points sampled in $P_i$, we do not check for convergence. The criterion is as follows:
\[ z_{\alpha/2}\, s_W \le \epsilon_i\,\hat q. \tag{4} \]
The size $\ell$ of the sliding window is determined experimentally (Section 3).
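The sliding-window criterion can be sketched as follows in Python (our own toy illustration): the window stores consecutive running ratios, and sampling stops once the window's standard deviation, scaled by the Gaussian quantile, falls below the target relative error. The window length, quantile, and point budget below are arbitrary choices, and a Bernoulli sampler stands in for HnR points.

```python
from collections import deque
import math, random

def estimate_ratio(sample_pair, eps_i, ell=120, z=1.96, max_points=200000):
    """Estimate q_i = vol(P_{i+1}) / vol(P_i): stop when the st.d. s_W of
    the last `ell` running ratios satisfies z * s_W <= eps_i * q_hat."""
    window = deque(maxlen=ell)   # the sliding window W
    hits = 0
    q_hat = 0.0
    for n in range(1, max_points + 1):
        hits += sample_pair()    # 1 if the new point lands in P_{i+1}
        q_hat = hits / n
        window.append(q_hat)
        if n >= ell:             # no convergence check for the first ell points
            mu = sum(window) / ell
            s_w = math.sqrt(sum((x - mu) ** 2 for x in window) / (ell - 1))
            if z * s_w <= eps_i * q_hat:
                return q_hat
    return q_hat

random.seed(3)
# Toy sampler with true ratio 0.3 (a real code would draw HnR points in P_i).
q = estimate_ratio(lambda: random.random() < 0.3, eps_i=0.05)
print(q)
```

The estimate stabilizes near the true ratio well before the point budget is exhausted, which is exactly the sample-size saving the empirical criterion aims for.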
2.3 Sampling
We use HnR with uniform target distribution for sampling from $P_i$ at step $i$ of the annealing schedule or during ratio estimation. With the direction of the chord chosen as a random vector on the boundary of the $d$-dimensional unit hypersphere, we have Random Direction HnR (RDHR). If the direction is a random vector parallel to one of the axes, we have Coordinate Direction HnR (CDHR). If $P$ is given as a set of inequalities, RDHR costs $O(md)$ and CDHR costs $O(m)$ per step, where $m$ is here the number of inequalities. For zonotopes, each step in both CDHR and RDHR solves the following LP to compute one extreme point of the chord on the boundary of $P$:
\[ \max\; t \quad\text{s.t.}\quad Gx - t v = p,\quad -1\le x_j\le 1,\; j=1,\dots,k, \]
where $p$ is the current point and $v$ the direction of the chord.
For the second extreme point, keep the same constraints and minimize $t$. This LP reuses the basic feasible solution of the first one. For V-polytopes we use the same LP with the constraints $\sum_j x_j = 1$ and $x_j\ge 0$, while the $j$-th column of the matrix equals the $j$-th vertex. In Section 3 we discuss practical choices of the walk step.
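The chord computation can be sketched with `scipy.optimize.linprog` (our own illustration, not the paper's solver; note that scipy does not expose warm starts, so the reuse of the common basic feasible solution between the two LPs is not reproduced here).

```python
import numpy as np
from scipy.optimize import linprog

def chord_endpoints(G, p, v):
    """Intersection of the line p + t v with the zonotope Z = G([-1,1]^k):
    solve max t and min t subject to G x - t v = p, -1 <= x <= 1."""
    d, k = G.shape
    A_eq = np.hstack([G, -v.reshape(-1, 1)])   # variables are (x, t)
    bounds = [(-1.0, 1.0)] * k + [(None, None)]
    ts = []
    for sign in (-1.0, 1.0):                   # -1: maximize t; +1: minimize t
        c = np.zeros(k + 1); c[-1] = sign
        res = linprog(c, A_eq=A_eq, b_eq=p, bounds=bounds, method="highs")
        ts.append(res.x[-1])
    return min(ts), max(ts)

# Unit square as a zonotope, chord through the origin along the x-axis.
G = np.eye(2)
t_min, t_max = chord_endpoints(G, np.array([0.0, 0.0]), np.array([1.0, 0.0]))
print(round(t_min, 6), round(t_max, 6))  # ≈ -1.0 1.0
```

The next HnR point is then drawn uniformly from the segment $p + t v$, $t\in[t_{\min}, t_{\max}]$.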
2.4 Number of phases
In this section we give probabilistic bounds on the number of phases. To do this, we assume that (i) the polytope is sandwiched, $B_d\subseteq P\subseteq \rho B_d$, (ii) we sample $\nu n$ uniform points in each step of Algorithm 2, and (iii) Algorithm 2 terminates successfully. The construction in [LovSim] defines a sequence of convex bodies of length $\lceil d\lg\rho\rceil$, and each ratio is restricted to $[1/2, 1]$. If $P$ is well-rounded, then $\rho=O(\sqrt d)$. In our method we can use any convex body for the sandwiching. Moreover, we give a corollary stating which convex body minimizes the number of phases for a given input $P$.
Given and cooling parameters s.t. and parameters , , , , let be the number of convex bodies in MMC returned by Algorithm 2. Then , where and is the minimum among all the values of the quantile function appearing in testR.
If holds then type I errors of testL occurred in Algorithm 2, i.e. holds but the test rejects it, while testR was successful, i.e. is false and the test rejects it. Type I error occurs with probability and the probability of the success of testR is . Let be the minimum among the values of the quantile function appearing in all instances of testR. Then,
Given and cooling parameters s.t. and parameters , , , , let be the number of convex bodies in MMC returned by Algorithm 2. Then , where , and are the minimum among all the values of quantile function appearing in testL and testR respectively.
Similar to the proof of Proposition 2, , where and is the minimum among all the values of quantile function appearing in testR. If holds then type I errors of testR occurred, while testL was successful with probability . Then, , where and is the minimum among all the values of quantile function appearing in testL. Then
From Proposition 3 it is easily derived that the number of phases in Algorithm 1 is bounded with high probability. If we use balls in MMC under the sandwiching assumption, the resulting bound is always smaller, due to a constant, than the number of phases in [LovSim]. As we mentioned, there are well-rounded convex bodies for which our method improves this bound, e.g. the cross polytope, as $d$ grows. We leave it as an open question whether this result can be extended to other convex bodies. Proposition 3 also shows the importance of the type of body $C$.
2.5 Multiphase Monte Carlo for zonotopes
In this section we study different types of convex bodies used in the MMC sequence to approximate the volume of a zonotope, i.e., the Minkowski sum of segments. Let $k$ be the number of generators of the zonotope $Z\subset\mathbb{R}^d$ and $G\in\mathbb{R}^{d\times k}$ the matrix they define, i.e., $Z=\{Gx : x\in[-1,1]^k\}$. Note that $G$ has a $(k-d)$-dimensional kernel. The intersection of the hypercube $[-1,1]^k$ with the $(k-d)$-dimensional affine subspace defined by $Gx=p$, for $p\in Z$, equals a $(k-d)$-dimensional polytope in $\mathbb{R}^k$. SVD yields an orthonormal basis for the row space of $G$ (the linear constraints) and for its orthogonal complement, the kernel, spanned by the columns of a matrix $N$.
Let $-1\le x_j\le 1$, $j=1,\dots,k$, be an H-representation of the hypercube; then $-1\le (x_0+Ny)_j\le 1$, with $Gx_0=p$, is an H-representation of the $(k-d)$-dimensional polytope above, and, in the coordinates $y$, an H-representation of a full-dimensional polytope with $2k$ facets, which is centrally symmetric when $Z$ is centered at the origin and $x_0=0$. Each body in MMC arises from a parallel shifting of the facets of this polytope. This type of body improves the schedule when the order is low.
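One way to realize this construction is via `numpy.linalg.svd`; the following sketch (our own illustration, under the assumption that the zonotope is centered at the origin) builds an orthonormal kernel basis and the resulting H-representation with $2k$ facets.

```python
import numpy as np

# For a zonotope Z = G([-1,1]^k), the preimage of an interior point p is
# the intersection of the cube with a (k-d)-dimensional affine subspace.
# SVD of G gives an orthonormal basis N of ker(G); substituting
# x = x0 + N y into -1 <= x <= 1 yields an H-representation in R^{k-d}.
rng = np.random.default_rng(0)
d, k = 2, 5
G = rng.standard_normal((d, k))            # random generators (order k/d)

U, S, Vt = np.linalg.svd(G)
N = Vt[d:].T                               # k x (k-d) kernel basis: G @ N = 0
x0 = np.zeros(k)                           # preimage of the center (origin)

# The cube -1 <= x <= 1 becomes +-(x0 + N y) <= 1, i.e. A y <= b, 2k facets.
A = np.vstack([N, -N])
b = np.concatenate([1.0 - x0, 1.0 + x0])
assert np.allclose(G @ N, 0)               # N indeed spans the kernel
print(A.shape, b.shape)                    # (10, 3) (10,)
```

With $x_0=0$ the constraint vector $b$ is symmetric, so the resulting polytope is centrally symmetric, as required for the schedule.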
Let $Ay\le b$ be the H-representation of the polytope defined above. We introduce a second improvement in the schedule: in each step we do not compute a scaling factor in an interval, as suggested in Section 2.1, but we consider two vectors $b_1, b_2$ and compute with binary search a $\lambda\in[0,1]$ s.t. the body defined by $b(\lambda) := \lambda b_1 + (1-\lambda)b_2$ results in successes in both testL and testR. For a given $\lambda$, if testL succeeds and testR fails we continue to the right half of the interval, and if testL fails and testR succeeds we continue to the left half; if both fail (contradiction), we sample a new set of points and repeat both tests.
Let $A$ be the matrix that contains row-wise the normals of the facets of a convex polytope $Q$ containing the $d$-dimensional zonotope $Z$. Let $G$ be the generators' matrix, $g_j$ its $j$-th column, and $a_i$ the $i$-th row of $A$. Then $\max_{p\in Z} a_i^Tp = \sum_{j=1}^{k}|a_i^Tg_j|$ holds. Moreover, $Z\subseteq Q$ holds if and only if, for every facet $a_i^Tx\le b_i$ of $Q$, we have $\sum_{j=1}^{k}|a_i^Tg_j|\le b_i$.
Let $G$ be the matrix of zonotope $Z$ and $H=\{x : a^Tx\le b\}$ a halfspace intersecting $Z$. For all $p\in Z$ there is $y\in[-1,1]^k$ s.t. $p=Gy$, so $a^Tp = \sum_j y_j\, a^Tg_j$. The $y$ that gives the maximum inner product with $a$ is the following: $y_j^\ast = \mathrm{sign}(a^Tg_j)$.
So $\max_{p\in Z}a^Tp = \sum_j |a^Tg_j|$. Then, for the halfspace $H$, $Z\subseteq H$ holds if and only if $\sum_j|a^Tg_j|\le b$.