# Near-Optimal Deterministic Algorithms for Volume Computation and Lattice Problems via M-Ellipsoids

## Abstract

We give a deterministic $2^{O(n)}$ algorithm for computing an M-ellipsoid of a convex body, matching a known lower bound. This has several interesting consequences including improved deterministic algorithms for volume estimation of convex bodies and for the shortest and closest lattice vector problems under general norms.

## 1 Introduction

Ellipsoids have traditionally played an important role in the study of convex bodies. The classical Löwner-John ellipsoid, for instance, is the starting point for many interesting studies. To recall John’s theorem, for any convex body $K$ in $\mathbb{R}^n$, there is an ellipsoid $E$ with centroid $x_0$ such that

$$x_0 + E \subseteq K \subseteq x_0 + nE.$$

In fact, this bound is achieved by the maximum volume ellipsoid contained in $K$.

Ellipsoids have also been critical to the design and analysis of efficient algorithms. The most notable example is the ellipsoid algorithm [1, 2] for linear [3] and convex optimization [4], which represents a frontier of polynomial-time solvability. For the basic problems of sampling and integration in high dimension, the inertial ellipsoid defined by the covariance matrix of a distribution is an important ingredient of efficient algorithms [5, 6, 7]. This ellipsoid also achieves the bounds of John’s theorem for general convex bodies (for centrally-symmetric convex bodies, the max-volume ellipsoid achieves the best possible sandwiching ratio of $\sqrt{n}$ while the inertial ellipsoid could still have a ratio of $n$).

Another ellipsoid that has played a critical role in the development of modern convex geometry is the M-ellipsoid (Milman’s ellipsoid). This object was introduced by Milman as a tool to prove fundamental inequalities in convex geometry (see e.g., Chapter 7 of [8]). An M-ellipsoid $E$ of a convex body $K$ has small covering numbers with respect to $K$. We let $N(A,B)$ denote the number of translations of $B$ required to cover $A$. Then, as shown by Milman, every convex body $K$ has an ellipsoid $E$ for which $N(K,E)N(E,K)$ is bounded by $2^{O(n)}$. This is the best possible bound up to a constant in the exponent. In contrast, the John ellipsoid can have this covering bound as high as $n^{\Omega(n)}$. The existence of M-ellipsoids now has several proofs in the literature: by Milman [9], multiple proofs by Pisier [8], and most recently, by Klartag [10].

The complexity of computing these ellipsoids is interesting for its own sake, but also due to several important consequences that we will discuss presently. John ellipsoids are hard to compute, but their sandwiching bounds can be approximated deterministically to within $O(\sqrt{n})$ in polynomial time. Inertial ellipsoids can be approximated to arbitrary accuracy by random sampling in polynomial time. Algorithms for M-ellipsoids have been considered only recently. The proof of Klartag [10] gives a randomized polynomial-time algorithm [11]. In [12], we give a deterministic $2^{O(n)}$ time and $\mathrm{poly}(n)$-space algorithm to compute the $\ell$-ellipsoid of any convex body. The $\ell$-ellipsoid yields an approximation to the M-ellipsoid, where the product of covering estimates is $(\log n)^{O(n)}$ instead of the best possible bound of $2^{O(n)}$. It has been open to give a deterministic algorithm for constructing an M-ellipsoid that achieves optimal covering bounds. The extent to which randomness is essential for efficiency is a very interesting question in general, and specifically for problems on convex bodies where separations between randomized and deterministic complexity are known in the general oracle model [13, 14]. Here we address the question of deterministic M-ellipsoid construction and consider its algorithmic consequences for volume estimation and for fundamental lattice problems, namely the Shortest Vector Problem (SVP) and Closest Vector Problem (CVP).

The core new result of this paper is a deterministic $2^{O(n)}$ algorithm for computing an M-ellipsoid of a convex body in the oracle model [4]. Moreover, there is a $2^{\Omega(n)}$ lower bound for deterministic algorithms, so this is the best possible up to a constant in the exponent. We state this result formally, then proceed to its consequences and a nearly matching lower bound.

###### Theorem 1.1.

There is a deterministic algorithm that, given any convex body $K \subset \mathbb{R}^n$ specified by a membership oracle, finds an ellipsoid $E$ such that $N(K,E)N(E,K) \le 2^{O(n)}$. The time complexity of the algorithm (oracle calls and arithmetic operations) is $2^{O(n)}$ and its space complexity is polynomial in $n$.

The first consequence is for estimating the volume of a convex body. This is an ancient problem that has led to many insights in algorithmic techniques, high-dimensional geometry and probability theory. On the one hand, the problem can be solved for any convex body presented in the general membership oracle model in randomized polynomial time to arbitrary accuracy [15]. On the other hand, the following lower bound (improving on [16]) shows that deterministic algorithms cannot achieve such approximations.

###### Theorem 1.2.

[13] Suppose there is a deterministic algorithm that takes a convex body $K$ as input and outputs $A(K), B(K)$ such that $A(K) \le \mathrm{vol}(K) \le B(K)$ and makes at most $n^a$ calls to the membership oracle for $K$. Then there is some convex body $K$ for which

$$\frac{B(K)}{A(K)} \ge \left(\frac{cn}{a \log n}\right)^{n/2}$$

where $c$ is an absolute constant.

In particular, this implies that even achieving a $2^{O(n)}$ approximation requires $2^{\Omega(n)}$ oracle calls. Now the volume of an M-ellipsoid $E$ of $K$ is clearly within a factor of $2^{O(n)}$ of the volume of $K$, thus Theorem 1.1 gives a $2^{O(n)}$ algorithm that achieves this approximation. And, as claimed, we have a lower bound of $2^{\Omega(n)}$ for computing an M-ellipsoid deterministically. We state this corollary formally.

###### Theorem 1.3.

There is a deterministic algorithm of time complexity $2^{O(n)}$ (oracle calls and arithmetic operations) and polynomial space complexity that estimates the volume of a convex body given by a membership oracle to within a factor of $2^{O(n)}$.

A natural question is whether this can be generalized to a trade-off between approximation and complexity. Indeed, the following result of Bárány and Füredi [17] gives a lower bound.

###### Theorem 1.4.

[17] For any , any deterministic algorithm that estimates the volume of any input convex body to within a given only a membership oracle to the body, must make at least queries to the membership oracle.

We show that the M-ellipsoid algorithm can be extended to give an algorithm that essentially matches this best possible complexity for centrally symmetric convex bodies.

###### Theorem 1.5.

For any , there is a deterministic algorithm that computes a approximation of the volume of a given centrally symmetric convex body in time and polynomial space.

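Next we turn to lattice problems. For a convex body $K \subseteq \mathbb{R}^n$ with $0 \in \mathrm{int}(K)$, the gauge function of $K$ is

$$\|x\|_K = \inf\{s \ge 0 : x \in sK\}$$

for $x \in \mathbb{R}^n$. For symmetric $K$ (i.e. $K = -K$), $\|\cdot\|_K$ is a usual norm on $\mathbb{R}^n$ (we shall refer to $\|\cdot\|_K$ as the norm induced by $K$ and specify asymmetric whenever relevant).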
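As an illustration of this definition, the gauge can be evaluated from a membership oracle alone by bisection on the scaling factor $s$. The following Python sketch is not taken from the paper: the function name `gauge`, the oracle signature, and the example bodies `in_ball` and `in_cube` are hypothetical, and the code assumes $K$ is convex with the origin in its interior, so that membership of $x/s$ in $K$ is monotone in $s$.

```python
from typing import Callable, Sequence


def in_ball(p: Sequence[float]) -> bool:
    """Membership oracle for the Euclidean unit ball (example body)."""
    return sum(t * t for t in p) <= 1.0


def in_cube(p: Sequence[float]) -> bool:
    """Membership oracle for the cube [-1, 1]^n (example body)."""
    return all(abs(t) <= 1.0 for t in p)


def gauge(x: Sequence[float],
          member: Callable[[Sequence[float]], bool],
          tol: float = 1e-9) -> float:
    """Approximate ||x||_K = inf{s >= 0 : x in sK} with a membership oracle.

    Assumption: K is convex and contains the origin in its interior.
    """
    if all(t == 0 for t in x):
        return 0.0
    hi = 1.0
    # Grow the upper bracket until x / hi lies in K.
    while not member([t / hi for t in x]):
        hi *= 2.0
    lo = 0.0
    # Bisection: x / hi is always in K; x / lo is not (or lo == 0).
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if member([t / mid for t in x]):
            hi = mid
        else:
            lo = mid
    return hi
```

For the two example bodies the gauge reduces to the Euclidean norm and the sup-norm respectively, which gives a quick sanity check of the sketch.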

In recent work, M-ellipsoids were shown to be useful for solving basic lattice problems [11] of SVP and CVP. The Shortest Vector Problem (SVP) is stated as follows: given an -dimensional lattice represented by a basis, and a norm defined by a convex body , find a nonzero such that is minimized. In the Closest Vector Problem (CVP), in addition to a lattice and a norm, we are also given a query point in , and the goal is to find a vector that minimizes . These problems are central to the geometry of numbers and have applications to integer programming, factoring polynomials, cryptography, etc. The fastest known algorithms for solving SVP under general norms, are time randomized algorithms based on the AKS sieve [18, 19]. Finding deterministic algorithms of this complexity for both SVP and CVP has been an important open problem.

In fact, the AKS sieve uses an exponential amount of randomness. Improving on this, [11] gave a Las Vegas algorithm for general norm SVP which uses only a polynomial amount of randomness. For CVP the complexity was assuming the minimum distance of the query point is at most times the length of the shortest vector. In subsequent work [12], we gave a deterministic algorithm for the same results. In this paper, we completely eliminate the randomness. In the statements below, we say that is well-centered if . (every convex body is well-centered with respect to its centroid or a point sufficiently close to its centroid).

###### Theorem 1.6.

Given a basis for a lattice and a well-centered norm specified by a convex body both in , the shortest vector in under the norm can be found deterministically using time and space.

###### Theorem 1.7.

Given a basis for a lattice , any well-centered -dimensional convex body and a query point in , the closest vector in to in the norm defined by can be computed deterministically using time and space, provided that the minimum distance is at most times the length of the shortest nonzero vector of under .

The approach in [11] is to reduce the problem for general norms to the Euclidean norm, or more specifically, to enumerating lattice points in ellipsoids. We summarize the reduction in Section 6. In [11], the M-ellipsoid construction is a randomized polynomial-time algorithm based on the existence proof by Klartag [10]. This approach is based on estimating a covariance matrix and seems inherently difficult to derandomize. In [12], we gave a deterministic algorithm based on computing an approximate minimum mean-width ellipsoid. For this approximation, we get that the covering bound is $(\log n)^{O(n)}$, giving a deterministic algorithm of this complexity. Here we completely algorithmicize Milman’s existence proof, to obtain the best possible deterministic complexity of $2^{O(n)}$. By adjusting the parameters in the resulting algorithm to “slow down” Milman’s iteration, we get the optimal trade-off between approximation and complexity for volume computation.

## 2 Techniques from convex geometry

### 2.1 The Lewis ellipsoid

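Let $\alpha$ be a norm on $n \times n$ matrices. We define the dual norm $\alpha^*$ for any $S \in \mathbb{R}^{n \times n}$ as

$$\alpha^*(S) = \sup\{\mathrm{tr}(SA) : A \in \mathbb{R}^{n \times n},\ \alpha(A) \le 1\}. \tag{2.1}$$

For a matrix $A$, we denote its transpose by $A^T$, and its inverse (when it exists) by $A^{-1}$.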

###### Theorem 2.1.

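[20] For any norm $\alpha$ on $\mathbb{R}^{n \times n}$, there is an invertible linear transformation $A \in \mathbb{R}^{n \times n}$ such that

$$\alpha(A) = 1 \quad \text{and} \quad \alpha^*(A^{-1}) = n.$$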

The proof of the above theorem is based on examining the properties of the optimal solution to the following mathematical program:

$$\max \det(A) \quad \text{s.t.} \quad A \in \mathbb{R}^{n \times n}, \quad \alpha(A) \le 1 \tag{2.2}$$

From here, showing that the optimal $A$ satisfies $\alpha^*(A^{-1}) = n$ is a simple variational argument (reproduced in Lemma 4.1).

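We will be interested in norms of the following form. Let $K \subseteq \mathbb{R}^n$ denote a symmetric convex body with associated norm $\|\cdot\|_K$, and let $\gamma_n$ denote the canonical Gaussian measure on $\mathbb{R}^n$. We define the $\ell$-norm with respect to $K$ for $A \in \mathbb{R}^{n \times n}$ as

$$\ell_K(A) = \left(\int \|Ax\|_K^2 \, d\gamma_n(x)\right)^{1/2}$$

The $\ell$-norm was first defined and studied by Tomczak-Jaegermann and Figiel [21].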
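For intuition only, the Gaussian integral defining $\ell_K(A)$ can be estimated by averaging over random samples. This sampling sketch is an illustrative assumption and plays no role in the deterministic algorithms of this paper; the names `ell_norm_mc`, `norm_K` and `euclidean` are hypothetical, and `norm_K` is assumed to evaluate $\|\cdot\|_K$.

```python
import math
import random


def euclidean(v):
    """Norm induced by the Euclidean unit ball (example choice of K)."""
    return math.sqrt(sum(t * t for t in v))


def ell_norm_mc(A, norm_K, samples=20000, seed=0):
    """Monte Carlo estimate of (E ||A g||_K^2)^(1/2), g standard Gaussian.

    A is a list of rows; norm_K maps a vector to its K-norm.
    """
    rng = random.Random(seed)
    n = len(A[0])
    total = 0.0
    for _ in range(samples):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        Ag = [sum(row[j] * g[j] for j in range(n)) for row in A]
        total += norm_K(Ag) ** 2
    return math.sqrt(total / samples)
```

When $K$ is the Euclidean ball and $A = I_n$, the exact value is $\sqrt{n}$, so the estimate for $n = 3$ should be close to $\sqrt{3}$.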

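The next crucial ingredient is a connection between the dual norm $\ell_K^*$ defined above and the $\ell$-norm with respect to the polar $K^*$, namely,

$$\ell_{K^*}(A) = \left(\int \|Ax\|_{K^*}^2 \, d\gamma_n(x)\right)^{1/2}.$$

For two convex bodies $K, L \subseteq \mathbb{R}^n$ the Banach-Mazur distance between $K$ and $L$ is

$$d_{BM}(K,L) = \inf\{s : s \ge 1,\ TK \subseteq L - x \subseteq sTK,\ x \in \mathbb{R}^n,\ T \in \mathbb{R}^{n \times n} \text{ invertible}\}$$

###### Lemma 2.2.

[8] For $A \in \mathbb{R}^{n \times n}$, we have that

$$\ell_{K^*}(A^T) \le 4\,(1 + \log d_{BM}(K, B_2^n))\,\ell_K^*(A)$$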

### 2.2 Covering numbers and volume estimates

Let $B_2^n$ denote the $n$-dimensional Euclidean ball. Recall that $N(K,D)$ is the number of translates of $D$ required to cover $K$. The following bounds for convex bodies are classical. We use $c$, $C$ and their subscripted variants to denote absolute constants here and later in the paper.

###### Lemma 2.3.

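For any two symmetric convex bodies $K, D \subseteq \mathbb{R}^n$,

$$\frac{\mathrm{vol}(K)}{\mathrm{vol}(K \cap D)} \le N(K,D) \le 3^n \frac{\mathrm{vol}(K)}{\mathrm{vol}(K \cap D)}.$$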

The next lemma is from [22].

###### Lemma 2.4.

Let , . Then,

$$\mathrm{vol}(\mathrm{conv}\{K, D\}) \le 4\alpha n \, N(D,K) \, \mathrm{vol}(K).$$

The following are the Sudakov and dual Sudakov inequalities (see e.g., Section of [23]).

###### Lemma 2.5 (Sudakov Inequality).

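For any $t > 0$, and invertible matrix $A \in \mathbb{R}^{n \times n}$,

$$N(K, tAB_2^n) \le e^{C \ell_{K^*}(A^{-T})^2 / t^2}.$$

###### Lemma 2.6 (Dual Sudakov Inequality).

For any $t > 0$, and $A \in \mathbb{R}^{n \times n}$,

$$N(AB_2^n, tK) \le e^{C \ell_K(A)^2 / t^2}.$$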

The following lemma gives a simple containment relationship (see e.g., [12]).

###### Lemma 2.7.

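For any $A \in \mathbb{R}^{n \times n}$, $A$ invertible, we have that

$$\frac{1}{\ell_{K^*}(A^{-1})} K \subseteq AB_2^n \subseteq \ell_K(A) K$$

###### Proof.

We first show that $AB_2^n \subseteq \ell_K(A) K$. Assume not, then there exists $x \in AB_2^n$ such that $\|x\|_K > \ell_K(A)$. Now pick $y \in K^*$ achieving $\langle x, y \rangle = \|x\|_K$. Then we have that

$$\ell_K(A) < |\langle x, y \rangle| \le \sup_{z \in AB_2^n} |\langle z, y \rangle| = \sup_{z \in B_2^n} |\langle z, A^T y \rangle| = \|A^T y\|_2$$

But now note that, for a standard Gaussian vector $X$,

$$\ell_K(A) = E[\|AX\|_K^2]^{1/2} \ge E[|\langle y, AX \rangle|^2]^{1/2} = \|A^T y\|_2$$

a clear contradiction. Therefore $AB_2^n \subseteq \ell_K(A) K$ as needed. Now applying the same argument on $K^*$ and $A^{-1}$, we get that $A^{-1}B_2^n \subseteq \ell_{K^*}(A^{-1}) K^*$. From here via duality, we get that

$$\frac{1}{\ell_{K^*}(A^{-1})} K = \left(\ell_{K^*}(A^{-1}) K^*\right)^* \subseteq (A^{-1}B_2^n)^* = AB_2^n$$

as needed. ∎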

## 3 Algorithm for computing an M-ellipsoid

In this section, we present the algorithm for computing an M-ellipsoid of an arbitrary convex body given in the oracle model. We first observe that it suffices to give an algorithm for centrally symmetric $K$. For a general convex body $K$, we may replace $K$ by the difference body $K - K$ (which is symmetric). An M-ellipsoid for $K - K$ remains one for $K$, as the covering estimates change by at most a $2^{O(n)}$ factor. To see this, note that for any ellipsoid $E$ we have that $N(K,E) \le N(K-K,E)$ and that

$$N(E,K) \le N(E,K-K)\,N(K-K,K) \le N(E,K-K)\,2^{O(n)},$$

where the last inequality follows from the Rogers-Shephard inequality [24], i.e. $\mathrm{vol}(K-K) \le \binom{2n}{n}\mathrm{vol}(K) \le 4^n\,\mathrm{vol}(K)$.

Our algorithm has two main components: a subroutine to compute an approximate Lewis ellipsoid for a norm given by a convex body, and an implementation of the iteration that makes this ellipsoid converge to an M-ellipsoid of the original convex body.

### 3.1 Approximating the ℓ-norm

Our approximation of the $\ell$-norm is as follows:

$$\tilde\ell_K(A) = \sum_{x \in \{-1,1\}^n} \frac{1}{2^n} \|Ax\|_K.$$
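The sum over sign vectors can be evaluated directly by enumeration, at a cost of $2^n$ norm evaluations. The sketch below is a hypothetical illustration (the names `ell_tilde`, `norm_K` and `euclidean` are not from the paper). With `power=1` it computes the average displayed above; `power=2` gives the root-mean-square variant $E[\|\sum_i u_i A_i\|_K^2]^{1/2}$ that appears in the proofs of this section.

```python
import itertools
import math


def euclidean(v):
    """Norm induced by the Euclidean unit ball (example choice of K)."""
    return math.sqrt(sum(t * t for t in v))


def ell_tilde(A, norm_K, power=1):
    """Average of ||A x||_K over all sign vectors x in {-1, 1}^n.

    power=1: plain average of the norms.
    power=2: square root of the average squared norm.
    Cost: 2^n evaluations of norm_K, so only practical for small n.
    """
    n = len(A[0])
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=n):
        Ax = [sum(row[j] * signs[j] for j in range(n)) for row in A]
        total += norm_K(Ax) ** power
    mean = total / 2 ** n
    return mean ** (1.0 / power)
```

For $A = I_n$ and the Euclidean norm every sign vector has norm $\sqrt{n}$, so both variants return $\sqrt{n}$; for other matrices the two can differ by a constant factor.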

The next lemma is essentially folklore; we give a known proof here.

###### Lemma 3.1.

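For a symmetric convex body $K$ and any $A \in \mathbb{R}^{n \times n}$, we have

$$\ell_K(A) \le 4\sqrt{\tfrac{\pi}{2}}\,(1 + \log d_{BM}(K, B_2^n))\,\tilde\ell_K(A).$$

###### Proof.

Let $g_1, \dots, g_n$ denote i.i.d. $N(0,1)$ Gaussians, let $u_1, \dots, u_n$ denote i.i.d. uniform $\{-1,1\}$ random variables and let $A_1, \dots, A_n$ denote the columns of $A$. Then we have that

$$\begin{aligned}
\ell_K(A) &\le 4\,(1 + \log d_{BM}(K, B_2^n)) \sup\Big\{\sum_i \langle A_i, y_i \rangle : E\Big[\Big\|\sum_i g_i y_i\Big\|_{K^*}^2\Big]^{1/2} \le 1\Big\} \\
&\le 4\sqrt{\tfrac{\pi}{2}}\,(1 + \log d_{BM}(K, B_2^n)) \sup\Big\{\sum_i \langle A_i, y_i \rangle : E\Big[\Big\|\sum_i u_i y_i\Big\|_{K^*}^2\Big]^{1/2} \le 1\Big\} \\
&\le 4\sqrt{\tfrac{\pi}{2}}\,(1 + \log d_{BM}(K, B_2^n))\, E\Big[\Big\|\sum_i u_i A_i\Big\|_K^2\Big]^{1/2} = 4\sqrt{\tfrac{\pi}{2}}\,(1 + \log d_{BM}(K, B_2^n))\,\tilde\ell_K(A)
\end{aligned}$$

Here, the first inequality follows by Lemma 2.2. The next inequality follows from the classical comparison $E[f(u_1, \dots, u_n)] \le E[f(\sqrt{\pi/2}\,g_1, \dots, \sqrt{\pi/2}\,g_n)]$ for any convex function $f$, and setting $f(x_1, \dots, x_n) = \|\sum_i x_i y_i\|_{K^*}^2$. The last inequality follows from the following weak duality relation:

$$\begin{aligned}
\sum_i \langle A_i, y_i \rangle &= E\Big[\Big\langle \sum_i u_i A_i, \sum_j u_j y_j \Big\rangle\Big] \le E\Big[\Big\|\sum_i u_i A_i\Big\|_K \Big\|\sum_j u_j y_j\Big\|_{K^*}\Big] \\
&\le E\Big[\Big\|\sum_i u_i A_i\Big\|_K^2\Big]^{1/2} E\Big[\Big\|\sum_j u_j y_j\Big\|_{K^*}^2\Big]^{1/2} \le \tilde\ell_K(A).
\end{aligned}$$

∎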

The next lemma is a strengthening due to Pisier, using Proposition 8 from [25]. While it is not critical for our results (the difference is only in absolute constants), we use this stronger bound in our analysis.

###### Lemma 3.2.

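For a symmetric convex body $K$ and any $A \in \mathbb{R}^{n \times n}$, we have

$$\frac{1}{\sqrt{\pi/2}}\,\tilde\ell_K(A) \le \ell_K(A) \le c_1\,\tilde\ell_K(A)\sqrt{1 + \log d_{BM}(K, B_2^n)}$$

where $c_1$ is an absolute constant. Furthermore, by duality, we get that

$$\frac{1}{c_1\sqrt{1 + \log d_{BM}(K, B_2^n)}}\,\tilde\ell_K^*(A) \le \ell_K^*(A) \le \sqrt{\tfrac{\pi}{2}}\,\tilde\ell_K^*(A).$$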

### 3.2 A convex program

To compute the approximate $\ell$-ellipsoid we use the following convex program:

$$\max \det(A)^{1/n} \quad \text{s.t.} \quad A \succeq 0, \quad \tilde\ell_K(A) \le 1 \tag{3.1}$$

Here the main change is that we replace the $\ell$-norm with $\tilde\ell_K$. This will suffice for our purposes. We optimize over only positive semidefinite matrices (unlike Lewis’ program (2.2)). This enables us to ensure convexity of the program while maintaining the desired properties for the optimal solution. For convenience we use $\det(A)^{1/n}$ as the objective function; clearly this makes no essential difference.
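To make the feasible region and objective of program (3.1) concrete, the following sketch rescales a positive definite matrix so that the norm constraint holds with equality and then evaluates the objective. It is not a solver for (3.1): the helper names (`ell_tilde`, `det`, `scale_to_feasible`, `objective`) are hypothetical, the norm is computed by brute-force enumeration as in the displayed definition of $\tilde\ell_K$, and the input is assumed to be positive definite.

```python
import itertools
import math


def euclidean(v):
    """Norm induced by the Euclidean unit ball (example choice of K)."""
    return math.sqrt(sum(t * t for t in v))


def ell_tilde(A, norm_K):
    """Average of ||A x||_K over all sign vectors x in {-1, 1}^n."""
    n = len(A[0])
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=n):
        total += norm_K([sum(row[j] * signs[j] for j in range(n)) for row in A])
    return total / 2 ** n


def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [[float(a) for a in row] for row in M]
    n, d = len(M), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[p][c] == 0.0:
            return 0.0
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d


def scale_to_feasible(A, norm_K):
    """Return A / ell_tilde(A), which satisfies the norm constraint tightly."""
    s = ell_tilde(A, norm_K)
    return [[a / s for a in row] for row in A]


def objective(A):
    """Objective det(A)^(1/n) of program (3.1); assumes det(A) >= 0."""
    return det(A) ** (1.0 / len(A))
```

For the Euclidean ball and $A = I_4$, the rescaled matrix is $I_4/2$ with objective value $1/2 = 1/\sqrt{4}$.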

### 3.3 Main algorithm

Given a convex body $K$, we put it in approximate John position using the Ellipsoid algorithm in polynomial time [4], so that $B_2^n \subseteq K \subseteq nB_2^n$. We then use the above procedure, which is essentially an algorithmic version of Milman’s proof of the existence of M-ellipsoids. In the description below, by $\log^{(i)} n$ we mean the $i$’th iterated logarithm, i.e., $\log^{(1)} n = \log n$, $\log^{(2)} n = \log\log n$ and so on.
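The iterated logarithm is simple to compute. The sketch below is illustrative: the function names are hypothetical, base 2 is an assumption (the base only affects constants), and `log_star` is a companion helper added here for illustration that counts how many applications of the logarithm bring the value down to at most 1.

```python
import math


def iterated_log(n: float, i: int) -> float:
    """Return log^{(i)} n, i.e. the base-2 logarithm applied i times."""
    v = float(n)
    for _ in range(i):
        if v <= 0.0:
            raise ValueError("iterated logarithm undefined for this input")
        v = math.log2(v)
    return v


def log_star(n: float) -> int:
    """Number of times log2 must be applied to n to reach a value <= 1."""
    count = 0
    v = float(n)
    while v > 1.0:
        v = math.log2(v)
        count += 1
    return count
```

The values shrink extremely fast: starting from $n = 65536$, successive logarithms give $16, 4, 2, 1$.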

## 4 Analysis

We note that the time complexity of the algorithm is bounded by and the space complexity is polynomial in . In fact, the only step that takes exponential time is the evaluation of the -norm constraint of the SDP. This evaluation happens a polynomial number of times. The rest of computation involves applying the ellipsoid algorithm and computing oracles for successive bodies (for given an oracle for ), both of which are fairly straightforward [4]. In particular, we build an oracle for the intersection of two convex bodies given by oracles and for the convex hull of two convex bodies given by oracles. The oracle for a body consists of a membership test and a bound on the ratio between two balls that sandwich the body. Our analysis below provides sandwiching bounds and the complexity of the oracle grows as in the ’th iteration, for a maximum of .

We begin by showing that Lewis’s bound (Theorem 2.1) is robust to approximation and works when restricted to positive semi-definite transformations. This allows us to establish the desired properties for approximate optimizers of the convex program (3.1).

###### Lemma 4.1.

Let be such that and , be a -approximate optimizer for the convex program (3.1), i.e. . Then for , we have that

$$\tilde\ell_K(A)\,\tilde\ell_K^*(A^{-1}) \le n(1 + 6n^2\sqrt{\epsilon}) \le 2n.$$

###### Proof.

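For simplicity of notation, we write $\alpha$ for $\tilde\ell_K$. Take $T \in \mathbb{R}^{n \times n}$ (not necessarily positive semidefinite) satisfying $\alpha(T) \le 1$. Let $\|T\|_F$ denote the Frobenius norm of $T$, and $\|T\|_2$ denote the operator norm of $T$.

#### Claim: $\alpha(T) \le \|T\|_F \le n\,\alpha(T)$.

###### Proof.

Let $U$ denote a uniform vector in $\{-1,1\}^n$. Since $\|x\|_K \ge \frac{1}{n}\|x\|_2$ for any $x \in \mathbb{R}^n$, we have that

$$\alpha(T) = E[\|TU\|_K^2]^{1/2} \ge \frac{1}{n} E[\|TU\|_2^2]^{1/2} = \frac{1}{n}\|T\|_F.$$

Now using the inequality $\|x\|_K \le \|x\|_2$ for $x \in \mathbb{R}^n$, a similar argument yields $\alpha(T) \le \|T\|_F$. ∎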

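First note that $I_n/\alpha(I_n)$ is a feasible solution to (3.1) satisfying

$$\det\left(\frac{I_n}{\alpha(I_n)}\right)^{1/n} = \frac{1}{\alpha(I_n)} \ge \frac{1}{\|I_n\|_F} = \frac{1}{\sqrt{n}}.$$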

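Let $A_{OPT}$ denote the optimal solution to (3.1). Since $\det(A_{OPT})^{1/n} \ge \frac{1}{\sqrt{n}} > 0$, we clearly have that $A_{OPT} \succ 0$. Therefore for $\delta > 0$ small enough we have that $A_{OPT} + \delta T \succ 0$. From this, we see that $(A_{OPT} + \delta T)/\alpha(A_{OPT} + \delta T)$ is also feasible for (3.1) as its $\alpha$-norm equals $1$. Since $A_{OPT}$ is the optimal solution, we have that

$$\det\left(\frac{A_{OPT} + \delta T}{\alpha(A_{OPT} + \delta T)}\right)^{1/n} \le \det(A_{OPT})^{1/n}.$$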

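Rewriting this and using the triangle inequality,

$$\begin{aligned}
\det(A_{OPT} + \delta T)^{1/n} &\le \det(A_{OPT})^{1/n}\,\alpha(A_{OPT} + \delta T) \le \det(A_{OPT})^{1/n}\,(\alpha(A_{OPT}) + \delta\,\alpha(T)) \\
&\le \det(A_{OPT})^{1/n}\,(1 + \delta).
\end{aligned}$$

Dividing by $\det(A_{OPT})^{1/n}$ on both sides, we get that

$$\det(I_n + \delta A_{OPT}^{-1} T)^{1/n} \le 1 + \delta. \tag{4.1}$$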

Since both sides are equal at $\delta = 0$, we must have the same inequality for the derivatives with respect to $\delta$ at $\delta = 0$. This yields

$$\frac{1}{n}\mathrm{tr}(A_{OPT}^{-1} T) \le 1 \iff \mathrm{tr}(A_{OPT}^{-1} T) \le n \tag{4.2}$$
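The differentiation step can be spelled out as follows. The shorthand $M = A_{OPT}^{-1} T$ is introduced here only for this illustration; the expansion of the determinant to first order in $\delta$ is standard.

```latex
\det(I_n + \delta M) = 1 + \delta\,\mathrm{tr}(M) + O(\delta^2)
\quad\Longrightarrow\quad
\det(I_n + \delta M)^{1/n} = 1 + \frac{\delta}{n}\,\mathrm{tr}(M) + O(\delta^2).

% Inserting this into (4.1), subtracting 1 and dividing by \delta > 0:
\frac{1}{n}\,\mathrm{tr}(M) + O(\delta) \le 1
\quad\xrightarrow{\ \delta \to 0^+\ }\quad
\frac{1}{n}\,\mathrm{tr}(M) \le 1 .
```

Only small positive $\delta$ is needed, which matches the range of $\delta$ for which the perturbed matrix was shown to be feasible.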

Up to this point the proof is essentially the same as Lewis’ proof of Theorem 2.1. We now depart from that proof to account for approximately optimal solutions.

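#### Claim: $\|A_{OPT}^{-1}\|_2 \le n$.

###### Proof.

Let $\sigma$ denote the largest eigenvalue of $A_{OPT}^{-1}$ and $v$ be an associated unit eigenvector. Since $A_{OPT} \succ 0$, we have that $A_{OPT}^{-1} \succ 0$, and hence $\|A_{OPT}^{-1}\|_2 = \sigma$. Now note that $A_{OPT} + \delta vv^T \succ 0$ for any $\delta \ge 0$, and that $\alpha(vv^T) \le \|vv^T\|_F = 1$. Therefore by Equation (4.2), we have that

$$n \ge \mathrm{tr}(A_{OPT}^{-1}(vv^T)) = \mathrm{tr}(\sigma vv^T) = \sigma$$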

as needed. ∎

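#### Claim: $A^{-1} \preceq (1 + 6\sqrt{n\epsilon})\,A_{OPT}^{-1}$.

###### Proof.

Since $A$ is a $(1-\epsilon)$-approximate maximizer to (3.1) we have that

$$\det(A)^{1/n} \ge (1 - \epsilon)\det(A_{OPT})^{1/n} \;\Rightarrow\; \det(A) \ge (1 - n\epsilon)\det(A_{OPT})$$

We begin by proving $A \succeq (1 - 3\sqrt{n\epsilon})\,A_{OPT}$. Now note that

$$A \succeq (1 - 3\sqrt{n\epsilon})\,A_{OPT} \iff A_{OPT}^{-1/2} A A_{OPT}^{-1/2} \succeq (1 - 3\sqrt{n\epsilon})\,I_n$$

Hence letting $B = A_{OPT}^{-1/2} A A_{OPT}^{-1/2}$, it suffices to show that $B \succeq (1 - 3\sqrt{n\epsilon})\,I_n$. From here, we note that $1 - n\epsilon \le \det(B) = \det(A)/\det(A_{OPT}) \le 1$. Now from Equation (4.2), we have that

$$\mathrm{tr}(B) = \mathrm{tr}(A_{OPT}^{-1/2} A A_{OPT}^{-1/2}) = \mathrm{tr}(A_{OPT}^{-1} A) \le n$$

Let $\sigma_1 \ge \dots \ge \sigma_n$ denote the eigenvalues of $B$ in non-increasing order. We first note that $\sigma_n \le 1$ since otherwise

$$\det(B) = \prod_{i=1}^n \sigma_i \ge \sigma_n^n > 1$$

a contradiction. Furthermore, since $B \succ 0$, we have that $\sigma_n > 0$. So we may write $\sigma_n = 1 - \epsilon_0$, for $0 \le \epsilon_0 < 1$. Now since $\sum_{i=1}^{n-1} \sigma_i = \mathrm{tr}(B) - \sigma_n \le n - 1 + \epsilon_0$, by the arithmetic mean - geometric mean inequality we have that

$$\det(B) = \sigma_n \prod_{i=1}^{n-1} \sigma_i = (1 - \epsilon_0) \prod_{i=1}^{n-1} \sigma_i \le (1 - \epsilon_0)\left(\frac{\sum_{i=1}^{n-1} \sigma_i}{n-1}\right)^{n-1} \le (1 - \epsilon_0)\left(1 + \frac{\epsilon_0}{n-1}\right)^{n-1}$$

Using the inequality $1 + x \le e^x \le 1 + x + \frac{e-1}{2}x^2$ for $x \in [0,1]$, we get that

$$\begin{aligned}
(1 - \epsilon_0)\left(1 + \frac{\epsilon_0}{n-1}\right)^{n-1} &\le (1 - \epsilon_0)\,e^{\epsilon_0} \le (1 - \epsilon_0)\left(1 + \epsilon_0 + \frac{e-1}{2}\epsilon_0^2\right) \\
&= 1 - \frac{3-e}{2}\epsilon_0^2 - \frac{e-1}{2}\epsilon_0^3 \le 1 - \frac{3-e}{2}\epsilon_0^2
\end{aligned}$$

From this we get that

$$1 - \frac{3-e}{2}\epsilon_0^2 \ge \det(B) \ge 1 - n\epsilon \;\Rightarrow\; \epsilon_0 \le \sqrt{\frac{2}{3-e}\,n\epsilon} \le 3\sqrt{n\epsilon}$$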
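The numerical constant in the last step can be checked directly (the decimal values below are rounded):

```latex
3 - e \approx 0.2817, \qquad
\frac{2}{3 - e} \approx 7.099, \qquad
\sqrt{\frac{2}{3 - e}} \approx 2.664 \le 3 .
```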

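Therefore $\sigma_n \ge 1 - 3\sqrt{n\epsilon}$ as needed. From here we get that

$$A^{-1} \preceq \left(\frac{1}{1 - 3\sqrt{n\epsilon}}\right) A_{OPT}^{-1} \preceq (1 + 6\sqrt{n\epsilon})\,A_{OPT}^{-1}$$

for $3\sqrt{n\epsilon} \le \frac{1}{2}$, proving the claim. ∎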

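Now take $T \in \mathbb{R}^{n \times n}$ satisfying $\alpha(T) \le 1$. By the first claim, we note that $\|T\|_F \le n$. Now by Equation (4.2), we have that

$$\mathrm{tr}(A^{-1} T) = \mathrm{tr}(A_{OPT}^{-1} T) + \mathrm{tr}((A^{-1} - A_{OPT}^{-1}) T) \le n + \|A^{-1} - A_{OPT}^{-1}\|_F \|T\|_F \le n + n\,\|A^{-1} - A_{OPT}^{-1}\|_F$$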

We bound the second term using the previous claim. Since , we have that , and hence

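$$\|A^{-1} - A_{OPT}^{-1}\|_F \le \sqrt{n}\,\|A^{-1} - A_{OPT}^{-1}\|_2 \le 6n\sqrt{\epsilon}\,\|A_{OPT}^{-1}\|_2 \le 6n^2\sqrt{\epsilon}$$

Using this bound, we get

$$\mathrm{tr}(A^{-1} T) \le n + 6n^3\sqrt{\epsilon} = n(1 + 6n^2\sqrt{\epsilon})$$

for any $T$ satisfying $\alpha(T) \le 1$. Thus we get that $\alpha^*(A^{-1}) \le n(1 + 6n^2\sqrt{\epsilon})$. Together with the constraint $\alpha(A) \le 1$, the conclusion of the lemma follows. ∎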

###### Theorem 4.2.

Let be a -approximate optimizer to the convex program (3.1) for . Then

$$\ell_K(A)\,\ell_{K^*}(A^{-1}) \le Cn \log^{3/2} d_{BM}(K, B_2^n)$$

for an absolute constant $C$.

###### Proof.

Using Lemma 4.1, we have that

$$\tilde\ell_K(A)\,\tilde\ell_K^*(A^{-1}) \le 2n.$$

Next we use the approximation property (Lemma 3.2) of $\tilde\ell_K$ to derive that

$$\ell_K(A)\,\ell_K^*(A^{-1}) \le Cn\sqrt{\log d_{BM}(K, B_2^n)}.$$

Finally, noting that $A^{-1} = A^{-T}$ (by symmetry of $A$), we apply Lemma 2.2 to infer that

$$\ell_{K^*}(A^{-1}) \le C\,\ell_K^*(A^{-1}) \log d_{BM}(K, B_2^n),$$

which completes the proof. ∎

Next we turn to proving that the algorithm produces an M-ellipsoid. While the analysis follows the existence proof to a large extent, we need to handle the various approximations incurred.

To aid in the analysis of Algorithm 1 on input , we make some additional definitions. Let and . Let and denote the sequence of bodies and transformations generated by the algorithm. Set , and for define

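$$K^{in}_{i+1} = \mathrm{conv}\{K^{in}_i,\ r^{in}_i A_i B_2^n\}, \qquad K^{out}_{i+1} = K^{out}_i \cap r^{out}_i A_i B_2^n$$

where $r^{in}_i, r^{out}_i, A_i$ are defined as in the $i$’th iteration of the main loop in Algorithm 1.

By construction, we have the relations

$$K \subseteq K^{in}_1 \subseteq \cdots \subseteq K^{in}_T, \qquad K \supseteq K^{out}_1 \supseteq \cdots \supseteq K^{out}_T, \qquad K^{out}_i \subseteq K_i \subseteq K^{in}_i \quad \forall i \in [T]$$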

The proof of the main theorem will be based on the following inductive lemmas which quantify the properties of the sequences of bodies defined above.

###### Lemma 4.3.

, we have that .

###### Proof.

For the base case, we have that for any constant .

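For the general case, by construction of $K_{i+1}$ we have that

$$r^{in}_i A_i B_2^n \subseteq K_{i+1} \subseteq r^{out}_i A_i B_2^n.$$

Therefore,

$$\begin{aligned}
d_{BM}(K_{i+1}, B_2^n) &\le r^{out}_i / r^{in}_i = a_i^2\,\tilde\ell_{K_i^*}(A_i^{-1})\,\tilde\ell_{K_i}(A_i)/n \\
&\le C_1 a_i^2\,\ell_{K_i^*}(A_i^{-1})\,\ell_{K_i}(A_i)/n && \text{(by Lemma ???)} \\
&\le C_1 (\log^{(i)} n)^2 (\log d_{BM}(K_i, B_2^n))^{3/2}. && \text{(by Lemma ???)}
\end{aligned}$$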

Using the fact that , , a direct computation shows that the above recurrence equation implies the existence of a constant (depending only on ) such that the stated bound on holds. ∎

###### Lemma 4.4.

For , we have that

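$$\max\left\{\frac{\mathrm{vol}(K^{out}_i)}{\mathrm{vol}(K^{out}_{i+1})},\ \frac{\mathrm{vol}(K^{in}_{i+1})}{\mathrm{vol}(K^{in}_i)}\right\} \le e^{Cn/\log^{(i)} n}$$
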
###### Proof.

By Lemma 2.3, the fact that , Lemma 2.5, Lemma 3.2 and Lemma 4.3, we have that

 vol(Kouti)vol(Kouti+1) ≤N(Kouti,rioutAiBn2)≤N(