Approximation to uniform distribution in \mathrm{SO}(3)

# Approximation to uniform distribution in SO(3)

Carlos Beltrán and Damir Ferizović The first author was supported by the Spanish “Ministerio de Economía y Competitividad” under projects MTM2017-83816-P and MTM2017-90682-REDT (Red ALAMA), as well as by the Banco Santander and Universidad de Cantabria under project 21.SI01.64658. The second named author thankfully acknowledges support by the Austrian Science Fund (FWF): F5503 “Quasi-Monte Carlo Methods” and by the NAWI Graz Funding.
###### Abstract

Using the theory of determinantal point processes we give upper bounds for the Green and Riesz energies for the rotation group , with Riesz parameter up to 3. The Green function is computed explicitly, and a lower bound for the Green energy is established, enabling comparison of uniform point constructions on . The variance of rotation matrices sampled by the determinantal point process is estimated, and formulas for the -norm of Gegenbauer polynomials with index 2 are deduced, which might be of independent interest. Also a simple but effective algorithm to sample points in is given.

## 1 Introduction and Results

In this paper we study properties of a finite collection of randomly generated points in , the rotation group of 3-dimensional Euclidean space, sampled by determinantal point processes (dpp). It turns out that they tend to be well distributed, a property that is important for discretization, integration and approximation. Our goal is not to compute actual collections of evenly distributed rotation matrices, but rather to provide a comparison tool that allows to decide the effectiveness of any given method.

If one is given an algorithm to generate finite (but arbitrarily large) collections of matrices, common methods to measure how well distributed these are, include either calculating some discrete energy of them or looking at the speed of convergence of the counting measure towards uniform measure. Most work in this direction has been done on spheres of various dimensions, see for instance [7], etc.; the particular question of finding collections of points with very small energy was posed by Shub and Smale in [24] and is nowadays known as Smale’s 7th problem [25].

In order to extend part of the work done on spheres to the context of rotation matrices, we will obtain bounds on various energies for points generated through the method of dpp, which are technically speaking counting measures where one identifies them with their set of atoms. In few words, such a process is obtained by taking a Hilbert space of an underlying measure space and an -dimensional subspace , with projection kernel onto – then, under mild conditions on , one is guaranteed almost surely the existence of such a process with distinct points in associated to .

The theory of those processes has been developed in [8]; there one also finds a pseudo-code which samples points based on the dpp – which seems hard to implement. A main feature of the underlying points is that they tend to “repel” each other, and hence have become the theoretical basis of construction of well-distributed points on various symmetric spaces, see for instance [2, 6, 7, 22].

Since one can sometimes compute the expected value of the energy of points coming from these processes with high precision, they have been used as a tool to understand the asymptotic properties of the discrete energy in that context; and in particular, for even dimensional spheres with exception of the usual -sphere, the best known bounds for some energies have been proved using this approach.

We will employ the same method for , considering first the (discrete) Riesz -energy for :

 EsR(A):=∑j≠k1∥αj−αk∥sF,

with being thought of as rotation matrices, being the Frobenius or -norm, and . In contrast to this, the continuous Riesz -energy is given by replacing the double sum by the double integral over . We further set

 EsR(N)=inf|A|=NEsR(A).

The investigation of these sums is very popular and results describe the behavior of the two leading terms. This seems particularly interesting in case equals the dimension, where we have following result.

###### Theorem 1.1.

Let for , then the Riesz 3-energy satisfies

 12√2π⋅E3R(N) ≤ N2log(N)+(3γ+log(82⋅6)−214)N2+O(N5/3log(N)).

The right-hand side is the expected value of the Riesz 3-energy with underlying points generated by a dpp. Now, given any particular method of generating finite point sets in , one can compute, numerically, their -energy and compare it to the value above to decide if the points are evenly distributed. This comparison would clearly rise in significance at the presence of lower bounds on the -energy, which do not seem easy to find. For this reason we turn our attention to the Green energy, where we succeeded in this endeavor.

To recap, a Green function for a linear differential operator is an integral kernel to produce solutions for inhomogeneous differential equations and is unique modulo kern(). In our case, we deal with the Laplace-Beltrami operator , and note that kern() is the set of harmonic functions – which are just constants on a compact Riemannian manifold (). We will construct in such a way, that it integrates to zero and speak of the Green function.

The (discrete) Green energy for will be given by

 EG(A):=∑i≠jG(αi,αj),

and we let

 EG(N)=inf|A|=NEG(A).

It is noteworthy that for close to in geodesic distance , and a set of points with small Green energy is hence expected to be well-distributed, which is indeed the main result in [5]: We know that if attains the minimal possible energy, then the associated discrete measure approaches the uniform distribution in as . A set of points with small Green energy is also expected to be well-separated, see [9].

Now, is for any a zero mean function by definition, and if were simply chosen uniformly and independently in , then the expected value of the Green energy would equal , so in particular we have . In this note we prove the following much stronger result.

###### Theorem 1.2.

Let for , then

 −33√πN4/3+O(N)≤ EG(N) ≤−4(34)4/3N4/3+O(N).

The right-hand side is the expected value of the Green energy with underlying points generated by a dpp, and that is where we have the restriction for , as the process is related to subspaces that we can project onto. The lower bound is valid for all .

As mentioned above, another classical measure of the distribution properties of is the speed of convergence to uniform measure, i.e. choosing some range sets measurable w.r.t. Haar measure and investigating the behavior of

 supj∈I∣∣#{k : αk∈Aj}−N⋅μ(Aj)∣∣

as grows large. We will tackle this problem probabilistically, where we turn the count of points in into a random variable.

In analogy to spherical caps on spheres, the range sets for will be chosen to be balls for and being the rotation angle distance introduced in the following sections. For given random points , we define random variables via characteristic functions

Now, for a collection of random uniform points, chosen independently in we have

 E[ηα,ε]=Nμ(B(α,ε))=Nμ(B(\mathds1,ε)),

and the variance can also be computed from the independence of the points:

 Var[ηα,ε]=E[η2α,ε]−E[ηα,ε]2=N(μ(B(\mathds1,ε))−μ(B(\mathds1,ε))2).

We are able to bound the variance of this quantity for our dpp, proving that it is much smaller than in the previous case.

###### Theorem 1.3.

Let for , and be fixed, then the points generated by our determinantal point process satisfy

 E[ηα,ε]=Nμ(B(α,ε))=Nμ(B(\mathds1,ε)),

and moreover

 Var(ηα,ε)=O(ε2cos(ε))⋅N2/3log(N).

From Theorem 1.3 and for any fixed , we then have by Chebyshev’s inequality

 supα∈SO(3)P(∣∣ηα,ε−Nμ(B(\mathds1,ε))∣∣≥T)≤Var(ηα,ε)T−2;

for example, letting and with some little arithmetic we obtain

 supα∈SO(3)P(∣∣1Nηα,ε−μ(B(\mathds1,ε))∣∣≥log(N)N2/3)=O(1log(N)).

In other words, for large the counting and Haar measures are very similar with large probability.

## 2 Introductory Concepts

In this section we collect some definitions and previous results that we will use and that intend to make this manuscript reasonably self-contained. Proofs and definitions of Chebyshev polynomials and alike are postponed to subsection 2.4.

### 2.1 Structure, distances and integration in SO(3)

The special orthogonal group is the compact Lie group of 3 by 3 orthogonal matrices over that represent rotations in , i.e. with determinant equal to one. Its exponential map is given by Rodrigues’ rotation formula, and a closed expression for the Baker-Campbell-Hausdorff formula has been derived in [13]. It is a dimensional manifold and since it is naturally included in it is customary to let it inherit its Riemannian submanifold structure.

Following [16], using Euler angles , every element can be decomposed as where

 sz(φ1):=⎛⎜⎝cos(φ1)−sin(φ1)0sin(φ1)cos(φ1)0001⎞⎟⎠, sx(θ):=⎛⎜⎝1000cos(θ)−sin(θ)0sin(θ)cos(θ)⎞⎟⎠

are rotations around the -axis and -axis respectively. The normalized Haar measure (i.e. the unique left and right invariant probability measure in ) is given by , and it corresponds to the inherited Riemannian submanifold structure of up to the normalizing constant.

The Riemannian distance associated to the structure of is certainly a natural and useful concept, but for us it will be more convenient to use the so called rotation angle distance defined as follows: for ,

 ω(α−1β)=arccos({Trace}(α−1β)−12).

Its convenience stems from following fact, see for example [16, page 173]: Given a function such that we can find with , then

 ∫SO(3)f(x) dμ(x)=2π∫π0~f(t)sin2(t2) dt. (1)

### 2.2 Laplace-Beltrami operator and Green function in SO(3)

The Laplace-Beltrami operator is defined on any Riemannian manifold in terms of the Levi-Civita connection. Following [11], if is a set of geodesics in an -dimensional manifold such that for all , and such that form an orthonormal basis of the tangent space (geodesic normal coordinates), then the action of on -functions at is given by

 Δgf(p)=−n∑j=1d2dt2∣∣t=0f(γj(t)).

Note the convention given by the minus sign in front of the sum, which sometimes leads to confusion given the Laplacian in . The convention we use here is widely accepted, see for example [19]. A Green function is a distributional solution to

 ΔgG(⋅,y)=δ(⋅,y)−1μdV(M).

This way defined it is unique modulo kern() and it is common practice to add a constant in such a way that for all the function has zero mean, see [4]. We use this convention and simply refer to as the Green function.

It further follows from classical Fredholm theory for a linear differential operator that

 GL(x,y)=∞∑j=1ϕj(x)¯ϕj(y)λj, (2)

where is the sequence of eigenvalues for and , is a complete orthonormal set of associated eigenfunctions. This is hence true locally on any manifold, and the expression we obtain will be independent of any particular chart, thus valid globally. In the case , geodesics are dealt with in [21], the eigenvalues and eigenfunctions of are known from the classical theory of continuous groups and have been intensively studied in the physics literature, see [16, 18], [28, §15]:

###### Lemma 2.1.

The eigenvalues of in are for . Moreover, if is the eigenspace associated to , then the dimension of is and an orthonormal basis of is given by where and are Wigner’s -functions.

Moreover we have, see [18, Eq. 4.65] or [27, pp. 40-41] for a nice summary:

 (3)

where is the Chebyshev polynomial of second kind and degree . The following simple form for the Green function is derived, and to the best of our knowledge, this is the first time it has been formulated.

###### Lemma 2.2.

The Green function for the Laplace-Beltrami operator on can be written in terms of the metric , i.e. for with :

### 2.3 Determinantal point processes

We point the reader to the excellent monograph [8] for an introduction to point processes, and we briefly summarize part of this material below. As in [7] and [6], we will use only a fraction of the theory.

A simple point process on a locally compact Polish space with reference measure is a random, integer-valued positive Radon measure , that almost surely assigns at most measure 1 to singletons – we shall think of it as a counting measure

 η=∑j=1δxj,

with for . One usually identifies with a discrete subset of .

The joint intensities of w.r.t. , if they exist, are functions for , such that for pairwise disjoint sets , the expected value of the product of number of points falling into is given by

and in case for some .

A simple point process is determinantal with kernel 111 If is a projection kernel, one ought to say determinantal projection process., iff for every and all ’s

 ρk(y1,…,yk)=det(K(yj,ys))1≤j,s≤k.

Let be a compact Riemannian manifold with measure . Let be any -dimensional subspace in the set of square-integrable functions. It follows from the Macchi-Soshnikov theorem [8, Thm. 4.5.5] that a simple point process with points exists in associated to . Its main property is given by [8, Form. (1.2.2)]: For any measurable function

 E[∑i≠jf(xi,xj)]=∬Mf(x,y)(KH(x,x)KH(y,y)−|KH(x,y)|2) dμ(x,y); (4)

where

means expected value of some function defined from ( copies of ) to , when are chosen from the point process associated to ;

is the (orthogonal) projection kernel on , namely for any the orthogonal projection of onto satisfies:

 ΠH(f)(x)=∫y∈Mf(y)KH(x,y) dμ(y) ∈L2(H).

Note that if is an orthonormal basis of , then we can write

 KH(x,y)=N∑j=1φj(x)¯¯¯¯¯¯¯¯¯¯¯¯φj(y), (5)

and clearly

 ∫SO(3)KH(x,x) dμ(x)=N.

Coming back to the case of interest and following ideas in [7], we choose as subspace the span of the first eigenspaces of .

###### Lemma 2.3.

Let and be the span of the union of eigenspaces for eigenvalues of . Then, we define

 N:=dim(HL)=(2L+33)=C(2)2L(1)=43L3+O(L2).

Moreover222Here, , , is the sequence of Gegenbauer (ultra-spherical) polynomials., the projection kernel is:

 KL(α,β)=C(2)2L(cos(ω(α−1β)2)).

### 2.4 Chebyshev polynomials and proofs of lemmas

The degree Chebyshev polynomials of first and second kind satisfy the recurrence relation

 Pn+1(x)=2xPn(x)−Pn−1(x), (6)

with , and , in their respective notation. Gegenbauer or ultra-spherical polynomials of degree and index appear any time rotation invariance plays a role, and can be defined for integer as multiples of derivatives of Chebyshev polynomials of the second kind; sufficient for us is the one formula in (9). With this said, using (2), (3) and (5), we obtain

 (7)
 (8)

Further we list some equations for later reference and the reader’s convenience.

###### Proof of Lemma 2.3.

Let , then by (7) and (9)

 K(α,β)=ddxL∑ℓ=0T2ℓ+1(x)∣∣y=ddx12U2L+1(x)∣∣y=C(2)2L(y).

The formula for the dimension of can be proved as follows. The eigenspace associated to has dimension since this is the number of elements of its basis . Thus dim() is given by . ∎

###### Proof of Lemma 2.2.

In (8) we apply the equality

 U2ℓ(cos(t))=sin((2ℓ+1)t)sin(t)\@@cite[cite]{[\@@bibref{}{Stegun}{}{}, Eq. 22.3.16]},

and reason, under the assumption , as follows

 G(α,β)=∞∑ℓ=12ℓ+1ℓ(ℓ+1)sin((2ℓ+1)w2)sin(w2)=1sin(w2)∞∑ℓ=1(sin((2ℓ+1)w2)ℓ+1+sin((2ℓ+1)w2)ℓ)=1i(−log(1−eiw)+log(1−e−iw))cot(w2)−1;

where we used the well known fact, that the power series for at 1 converges at the boundary of its disc of convergence (except for ) and equals the logarithm at these values:

 ∞∑ℓ=1sin((2ℓ+1)w2)ℓ+1=12i∞∑ℓ=1eiw2(2ℓ+1)−e−iw2(2ℓ+1)ℓ+1=e−iw22i∞∑ℓ=1eiw(ℓ+1)ℓ+1−eiw22i∞∑ℓ=1e−iw(ℓ+1)ℓ+1=−e−iw22i(log(1−eiw)+eiw)+eiw22i(log(1−e−iw)+e−iw)=−e−iw22ilog(1−eiw)+eiw22ilog(1−e−iw)−sin(w2),

and similarly

 ∞∑ℓ=1sin((2ℓ+1)w2)ℓ=−eiw22ilog(1−eiw)+e−iw22ilog(1−e−iw).

Further, by , we conclude

where we used a property of the complex logarithm: . ∎

## 3 Riesz s-Energy: Proof of Theorem 1.1

Recall if is a real matrix, we have . We set throughout for , and note next a well known fact before we proceed, see for instance [17, Eq. (33)].

For , we have

###### Proof.

We abbreviate , and use the half-angle formula for sine:

 ∥α−β∥2F={Trace}% [(α−β)t(α−β)]=6−2{Trace}(α−1β)=82−({Trace}(α−1β)−1)4=81−cos(ω)2=8[sin(ω2)]2.\qed

Reminding us first of the Beta function for , we are ready to state our first proposition.

###### Proposition 3.2.

For and , we have

 EsR(N) ≤ 28s/2πB(3−s2,12)N2+O(N1+s/3).

If , we have more information on the term : It is respectively

 −√2π(34)4/3N4/3+O(N) and% −415(34)5/3N5/3+O(N4/3).
###### Proof.

We use (4), Lemma 2.3, Lemma 3.1, invariance of Haar measure, and (1):

 ∬SO(3)K(α,α)2−K(α,β)2∥α−β∥sF dμ(α,β)=∬SO(3)[C(2)2L(1)]2−[C(2)2L(cos(ω(α−1β)2))]28s2[sin(ω(α−1β)2)]s dμ(α,β)=28s2π∫π0(N2−[C(2)2L(cos(t2))]2)sin(t2)2−s dt=48s2πN2∫π/20sin(t)2−s dt−48s2π∫10[C(2)2L(t)]2(1−t2)1−s2 dt.

The next line is, apart of the factor , the continuous Riesz -energy:

 ∫π/20sin(t)2−s dt=∫10t1−s⋅t√1−t2 dt=12∫10t1−s2(1−t)−1/2 dt=12B(3−s2,12).

For , we hence obtain

where we inferred [26, Eq. 7.33.6], i.e. for every there is such that

 |C(2)2L(cos(θ))|≤CLθ2,cL≤θ≤π2.

The case is Lemma B.2; the case follows from Lemma B.4:

where with notation as in Lemma B.4. ∎

To use (1) in the next proof, which is valid for functions – we argue as follows: Use Lebesgue’s monotone convergence theorem with (1) on . We will further use the digamma function , see Appendix B.

###### Proof of Theorem 1.1.

We proceed as in the previous proof and use Lemma B.4:

 ∫π/20[C(2)2L(1)]2−[C(2)2L(cos(t))]2sin(t) dt=22L∑r=1∫π/201−cos(2rt)sin(t) dt2L−r∑u=0cr+u,u=42L∑r=1∫π/20Ur−1(cos(t))2sin(t) dt2L−r∑u=0cr+u,u=42L∑r=1∫10Ur−1(t)2 dt2L−r∑u=0cr+u,u=(⋆).

We use (19): and obtain

 (⋆)=2(γ+log(4))2L∑r=12L−r∑u=0cr+u,u+22L∑r=1ψ(r+12)2L−r∑u=0cr+u,u=:S1+S2.

By , we have

 2L−r∑u=0cr+u,u=1615L5+23L2r3−43L3r2−r530+Oa+b<5(Larb),

and hence by well known summation formulas due to Faulhaber:

 S1=2(γ+log(4))(1615L52L+23L24L4−43L383L3−130323L6)+O(L5)=169(γ+log(4))L6+O(L5).

Invoking Lemma 3.3 yields

 12S2=1615L5⋅(2L ψ(2L)−2L)+23L2⋅((2L)44ψ(2L)−(2L)442)−43L3⋅((2L)33ψ(2L)−(2L)332)−130((2L)66ψ(2L)−(2L)662)+O(L5log(L))=89L6⋅ψ(2L)−149L6+O(L5log(L)).

Since , and we see

 13log(34N)=log(L)+O(L−1);

and with (19), using harmonic numbers :

proving the claim when multiplied by . ∎

###### Lemma 3.3.

Let be the digamma function and , then

 n∑k=1kmψ(k+12)=nm+1m+1ψ(n)−nm+1(m+1)2+O(nmlog(n)).
###### Proof.

Since for , we have

 n∑k=1kmψ(k+12)=∫n1tmlog(t) dt+O(nmlog(n));

as the sum can be bounded from above and below by the same integral, apart from integration boundaries, where we obtain the error term. We finish by applying the anti-derivative: . ∎

## 4 Green Energy: Proof of Theorem 1.2

We prove the lower and upper bound separately in the following two sections.

### 4.1 Estimate of the Green Energy: Lower Bound

We follow an exposition due to N. Elkies, found in [20, pp. 149-154]. This has been pointed out to the authors by E. Saff, and his help is thankfully acknowledged.

The idea is to find a function with nice properties smaller than , and to bound its energy from below. For and , the following will do:

 Gt(α,β)=∞∑ℓ=1e−ℓ(ℓ+1)⋅t2ℓ+1ℓ(ℓ+1)U2ℓ(cos(ω(α−1β)2)).

To show that it really is smaller, we infer an adaptation of [20, Lem. 5.2].

###### Lemma 4.1 (N. Elkies).

For all and we have

 G(α,β)≥Gt(α,β)−t.
###### Proof.

Using uniform convergence, we differentiate term by term and define

 ht(α,β):=−∂tGt(α,β)=∞∑ℓ=1e−ℓ(ℓ+1)⋅t(2ℓ+1)ℓ∑m=−ℓℓ∑n=−ℓDℓm,n(α)¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯Dℓm,n(β).

Given a smooth test function , with uniformly converging representation as , where , we set

 u(α,t):=∫SO(3)ht(α,β)ϕ(β) dμ(β)=∞∑ℓ=1e−ℓ(ℓ+1)⋅tϕℓ(α),

where we interchanged integration and summation by uniform convergence and used that is an orthonormal basis. Now we have uniformly

 limt→0u(α,t)=ϕ(α)−∫SO(3)ϕ(β) dμ(β)=ϕ(α)−ϕ0.

For fixed, we can interchange differentiation and integration yielding

 Δgu(α,t)+∂tu(α,t)=0.

By the strong maximum principle Theorem A.2, we have for every :

 minα∈SO(3)u(α,t)≥minα∈SO(3)u(α,0).

The same PDE and estimates hold for

 v(α,t)=u(α,t)+ϕ0.

If , then so is for all by the maximum principle as . Hence