# Averaging on Manifolds by Embedding Algorithm

Petre Birtea*, Dan Comănescu* and Călin-Adrian Popa**
*Department of Mathematics, West University of Timişoara
Bd. V. Pârvan, No 4, 300223 Timişoara, România
birtea@math.uvt.ro, comanescu@math.uvt.ro
**Department of Computer Science, "Politehnica" University of Timişoara
Bd. V. Pârvan, No 2, 300223 Timişoara, România
calin.popa@cs.upt.ro
###### Abstract

We will propose a new algorithm for finding critical points of cost functions defined on a differential manifold. We will lift the initial cost function to a manifold that can be embedded in a Riemannian manifold (a Euclidean space) and will construct a vector field defined on the ambient space whose restriction to the embedded manifold is the gradient vector field of the lifted cost function. The advantage of this method is that it allows us to do computations in Cartesian coordinates instead of using local coordinates and covariant derivatives on the initial manifold. We will exemplify the algorithm in the case of averaging problems and will rediscover a few well-known results that appear in the literature.

MSC: 43A07, 49M05, 53B21, 58A05, 58E05.

Keywords: Averaging, Optimization, Distance functions, Rotations, Quaternions, Metriplectic dissipation, Gradient vector field.

## 1 Introduction

A problem which one frequently encounters in applications is to find an average for a finite set of sample points $y_1,\ldots,y_r$ belonging to a manifold $N$. A method to solve this problem is to write it as an optimization problem, more precisely to construct a cost function $G_N$ associated with this set of sample points; the average is then defined by

$$\operatorname*{argmin}_{y\in N} G_N(y).$$

Of special interest are the cost functions of least squares type $G_N(y) = \sum_{i=1}^r d^2(y, y_i)$, where $d$ is a distance function on $N$. The average of the set of sample points $y_1,\ldots,y_r$ on the manifold $N$ is the set defined by

$$\operatorname*{argmin}_{y\in N} \sum_{i=1}^{r} d^2(y, y_i).$$

When the function $G_N$ is differentiable, this is equivalent with solving the equation $dG_N(y) = 0$ and with testing for which solutions the minimum value is attained. In order to write this equation we need the knowledge of a local system of coordinates on the manifold $N$, a requirement that might be difficult to fulfill in many practical cases. Another problem that we may encounter with this approach is that the set of sample points might not be entirely included in the domain of a single local system of coordinates. One way to overcome these difficulties is to lift the problem to a simpler manifold $S$.

Let $\Pi : S \to N$ be a surjective submersion. The lifted cost function $G_S : S \to \mathbb{R}$ is defined by $G_S := G_N \circ \Pi$. The set equality $\Pi\big(\{x \in S \mid dG_S(x) = 0\}\big) = \{y \in N \mid dG_N(y) = 0\}$ shows that it is sufficient to solve the equation $dG_S(x) = 0$ on the simpler manifold $S$ and project these solutions on the manifold $N$.

A way to solve this new problem is to endow $S$ with a Riemannian metric $\tau$, compute the critical points of the gradient vector field $\nabla_\tau G_S$, and verify which of them are local/global minima. We note that if $x \in S$ is a critical point of the vector field $\nabla_\tau G_S$, then it is a critical point of the vector field $\nabla_{\tau'} G_S$, where $\tau'$ is any other Riemannian metric on $S$. Although the manifold $S$ might have a simpler geometrical structure than the initial manifold $N$, we still need a local system of coordinates or the knowledge of covariant derivatives on the manifold $S$ in order to be able to compute the critical points of the function $G_S$. To further avoid the use of the Riemannian geometry of the manifold $S$, we will embed $S$ in a larger space $M$.

In the current paper we propose a solution for finding the critical points of the lifted cost function $G_S$ by constructing a vector field $v_0$ on the ambient space $M$ (usually a Euclidean space) that is tangent to the submanifold $S$, and by also constructing a Riemannian metric $\tau$ on $S$ such that $v_0|_S = \nabla_\tau G_S$. Consequently, the critical points of the vector field $v_0$ (usually written in Cartesian coordinates) that belong to $S$ are critical points of the lifted cost function $G_S$, and their projection through the surjective submersion $\Pi$ gives the critical points of the initial cost function $G_N$. This construction will be illustrated in detail for the averaging problem on the Lie group $SO(3)$ associated with four different cost functions. We will analyze and compare the resulting averaging problems; some of the cases have been studied before in the literature, and we will show that the remaining cases differ from them in a few important aspects.

In the literature the problem of averaging on Lie groups is often solved by using the exponential map as a tool to introduce local coordinates, thus lifting the problem to the tangent space. This method can be generalized to the context of Riemannian manifolds, as there also exists an exponential map. The exponential map can be further replaced by the more general notion of retraction. Averaging problems coming from real-world applications have been studied by using the notions of covariant derivatives and the geometry of geodesics on various Riemannian manifolds. In this paper we will propose a different algorithm in order to deal with averaging problems on general differentiable manifolds.

Other interesting cost functions are the ones that come from the Fermat–Torricelli problem. This averaging problem has been studied on various spaces, and it would be of interest to apply the techniques developed in this paper to such problems.

## 2 Averaging on a differentiable manifold

We will solve the averaging problem presented in the Introduction by embedding the manifold $S$ as a submanifold of a Riemannian manifold $(M, g)$. Furthermore, we will assume in this paper that $S$ is the preimage of a regular value of a smooth function $F = (F_1, \ldots, F_k) : M \to \mathbb{R}^k$, i.e. $S = F^{-1}(c)$, for a regular value $c$ of $F$.

As we have already stated, our approach is to find a Riemannian metric $\tau$ on the submanifold $S$ and a vector field $v_0$ on $M$ such that

$$v_0|_S = \nabla_\tau G_S. \tag{2.1}$$

Consequently, our initial problem is equivalent with finding the critical points of the vector field $v_0$, which is defined on the ambient space $M$, and with choosing the ones that belong to $S$.

The vector field $v_0$ that we are looking for is the standard control vector field introduced in our previous work, which in turn is a continuation of an earlier study. Let $(M, g)$ be an $m$-dimensional Riemannian manifold and $F_1, \ldots, F_k, G : M \to \mathbb{R}$ be smooth functions. The standard control vector field is defined by

$$v_0 = \sum_{i=1}^{k} (-1)^{i+k+1} \det\Sigma_{(F_1,\ldots,F_k)}\big(F_1,\ldots,\widehat{F_i},\ldots,F_k,G\big)\,\nabla F_i + \det\Sigma_{(F_1,\ldots,F_k)}\big(F_1,\ldots,F_k\big)\,\nabla G, \tag{2.2}$$

where $\widehat{F_i}$ represents the missing term and $\Sigma$ is the Gramian matrix

$$\Sigma_{(f_1,\ldots,f_r)}(g_1,\ldots,g_s) = \begin{pmatrix} \langle\nabla g_1, \nabla f_1\rangle & \ldots & \langle\nabla g_s, \nabla f_1\rangle \\ \vdots & \ddots & \vdots \\ \langle\nabla g_1, \nabla f_r\rangle & \ldots & \langle\nabla g_s, \nabla f_r\rangle \end{pmatrix}, \tag{2.3}$$

generated by the smooth functions $f_1, \ldots, f_r, g_1, \ldots, g_s : M \to \mathbb{R}$.
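In coordinates, the Gramian matrix (2.3) is simply the matrix of pairwise inner products of gradient vectors. A minimal numeric sketch for the Euclidean metric on $\mathbb{R}^m$, where $\langle\nabla g, \nabla f\rangle$ is the ordinary dot product (the helper name `gramian` is ours, for illustration only):

```python
import numpy as np

def gramian(grads_f, grads_g):
    """Matrix (2.3): entry (i, j) is <grad g_j, grad f_i> (Euclidean metric)."""
    return np.array([[np.dot(gg, gf) for gg in grads_g] for gf in grads_f])

# Example: F(x) = <x, x> on R^3, so grad F(x) = 2x
x = np.array([1.0, 2.0, 2.0])
grad_F = 2 * x
Sigma = gramian([grad_F], [grad_F])  # 1x1 Gramian; its determinant is |grad F|^2
```

For a single constraint function the Gramian is the scalar $\langle\nabla F, \nabla F\rangle$, which is the quantity appearing in the last term of (2.2).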

The vector field $v_0$ conserves the regular leaves of the map $F = (F_1, \ldots, F_k)$ and dissipates the function $G$; more precisely, the derivative of the function $G$ along the vector field $v_0$ is given by the Lie derivative $L_{v_0}G = \det\Sigma_{(F_1,\ldots,F_k,G)}(F_1,\ldots,F_k,G)$.

In what follows, we will describe the geometry of the standard control vector field. Let $\Omega^1(M)$ be the real vector space of differential one-forms on the manifold $M$. Let $T$ be the degenerate symmetric contravariant 2-tensor given by

$$T := \sum_{i,j=1}^{k} (-1)^{i+j+1} \det\Sigma_{(F_1,\ldots,\widehat{F_j},\ldots,F_k)}\big(F_1,\ldots,\widehat{F_i},\ldots,F_k\big)\,\nabla F_i \otimes \nabla F_j + \det\Sigma_{(F_1,\ldots,F_k)}\big(F_1,\ldots,F_k\big)\, g^{-1}, \tag{2.4}$$

where $g^{-1}$ is the cometric 2-tensor constructed from the metric tensor $g$, and the contravariant 2-tensor $\nabla F_i \otimes \nabla F_j$ is defined by the formula

$$\nabla F_i \otimes \nabla F_j(\alpha, \beta) := \alpha(\nabla F_i)\,\beta(\nabla F_j).$$

In Riemannian geometry the gradient vector field of a smooth function $f$ can be defined by the formula $\nabla f = \mathbf{i}_{df}\, g^{-1}$, where $\mathbf{i}$ denotes the interior product defined by $\mathbf{i}_\alpha T(\beta) := T(\alpha, \beta)$. The following result shows that $v_0$ looks like the gradient vector field of the function $G$, with respect to the degenerate symmetric contravariant 2-tensor $T$.

###### Theorem 2.1.

 On the manifold $M$, the standard control vector field is given by the following formula:

$$v_0 = \mathbf{i}_{dG} T.$$

From the above theorem, computing the critical points of $v_0$ (which is written in local coordinates on $M$) that also belong to $S$ is equivalent with solving the following system of equations on $M$:

$$\begin{cases} \begin{bmatrix} T_{11}(x) & \ldots & T_{1m}(x) \\ \vdots & & \vdots \\ T_{m1}(x) & \ldots & T_{mm}(x) \end{bmatrix} \begin{bmatrix} \dfrac{\partial G}{\partial x_1}(x) \\ \vdots \\ \dfrac{\partial G}{\partial x_m}(x) \end{bmatrix} = 0 \\[2mm] F(x) = c, \end{cases} \tag{2.5}$$

where $T_{ij}$ are the components of the tensor $T$ in the coordinates $x = (x_1, \ldots, x_m)$ and $F = (F_1, \ldots, F_k)$. In general, a system of $m + k$ equations with $m$ unknowns might not have any solutions. But the above system, which has $m$ unknowns, namely the coordinates of $x$ in $M$, has exactly $m$ functionally independent equations, due to the fact that the rank of the tensor $T$ is $m - k$ for every point in the open set of regular points of $F$ (the rank of the tensor $T$ is $m - k$ at every point where $\nabla F_1, \ldots, \nabla F_k$ are linearly independent).
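As a concrete instance of (2.5), one can take $M = \mathbb{R}^3$ with the Euclidean metric, the single constraint $F(x) = \langle x, x\rangle$ with regular value $c = 1$ (so the leaf is the sphere $S^2$), and the height function $G(x) = x_3$. For $k = 1$ the tensor (2.4) reduces to $T = \langle\nabla F, \nabla F\rangle\, g^{-1} - \nabla F \otimes \nabla F$, i.e. $T(x) = 4\big(\langle x, x\rangle I_3 - x\, x^{\mathsf T}\big)$. A numeric sketch (the helper names are ours):

```python
import numpy as np

def T(x):
    # Degenerate tensor (2.4) for the single constraint F(x) = <x, x> on R^3
    return 4.0 * (np.dot(x, x) * np.eye(3) - np.outer(x, x))

def v0(x):
    # Standard control vector field v0 = i_{dG} T for G(x) = x_3 (Theorem 2.1)
    grad_G = np.array([0.0, 0.0, 1.0])
    return T(x) @ grad_G

north = np.array([0.0, 0.0, 1.0])   # a critical point of G restricted to S^2
p = np.array([0.6, 0.0, 0.8])       # a non-critical point on S^2
```

The vector field vanishes on $S^2$ exactly at the two poles, the critical points of the restricted height function, while remaining tangent to every sphere $\langle x, x\rangle = \mathrm{const}$, in agreement with the rank discussion above.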

The standard control vector field is tangent to every regular leaf $F^{-1}(c)$, and next we will recall its geometry when restricted to a regular leaf. Let $i_c : F^{-1}(c) \to M$ be the canonical inclusion of the regular leaf into the manifold $M$. The contravariant symmetric 2-tensor $T$ is non-degenerate when restricted to a regular leaf; consequently, the restriction can be inverted and it thus generates the Riemannian metric $\tau_c$ defined by

$$\tau_c(X_c, Y_c) := T^{-1}\big((i_c)_* X_c, (i_c)_* Y_c\big),$$

where $(i_c)_*$ is the push-forward application.

###### Theorem 2.2.

 On a regular leaf $F^{-1}(c)$ we have the following characterizations:

• $\tau_c = \dfrac{1}{\det\Sigma_{(F_1,\ldots,F_k)}(F_1,\ldots,F_k)}\, i_c^*\, g$, where $i_c^*$ is the pull-back operator;

• $v_0|_{F^{-1}(c)} = \nabla_{\tau_c}\big(G \circ i_c\big)$.

The above theorem shows that the tensor $T$ induces a Riemannian metric on every regular leaf, which is the first fundamental form of the submanifold multiplied with a positive function. The vector field $v_0$ is tangent to the submanifold $F^{-1}(c)$, and the restriction $v_0|_{F^{-1}(c)}$ is equal with the gradient vector field of the restricted function $G \circ i_c$ with respect to the metric $\tau_c$. As $S$ is such a regular leaf, the Riemannian metric $\tau$ on $S$ is $\tau_c$, and the above theorem shows that the equality (2.1) holds, where $G_S = G|_S$.

In conclusion, in order to solve our initial averaging problem on the manifold $N$ associated to the cost function $G_N$ by using the standard control vector field, we apply the following embedding algorithm:

• (i) Choose the geometrical setting of a manifold $S$ and a surjective submersion $\Pi : S \to N$, and construct the lifted cost function $G_S = G_N \circ \Pi$.

• (ii) Find a Riemannian ambient space $(M, g)$ and a differentiable function $F : M \to \mathbb{R}^k$ such that $S = F^{-1}(c)$, where $c$ is a regular value of $F$. Construct the contravariant symmetric 2-tensor $T$.

• (iii) Find a prolongation function $G : M \to \mathbb{R}$ such that $G|_S = G_S$, and construct the standard control vector field $v_0$ with the initial data $F_1, \ldots, F_k, G$.

• (iv) Solve the system $v_0(x) = 0$, $x \in S$ (i.e. the system (2.5)), which is equivalent with finding the critical points of the lifted cost function $G_S$. Project these solutions through $\Pi$ and thus obtain the critical points of the initial cost function $G_N$.

• (v) Find the critical points that are local/global minima for the cost function $G_N$.

Examples that fit the geometry underlying this embedding algorithm are given by the rich class of quotient manifolds $N = S/G$, with a group $G$ that acts on $S$. In the case when $N$ can itself be written as the preimage of a regular value of a smooth map $F : M \to \mathbb{R}^k$, step (i) is not a necessary step of the algorithm ($S = N$ and $\Pi$ is the identity).

In the case when $\Pi_*(v_0|_S)$ is a well-defined vector field on the manifold $N$, we have the set inclusion $\Pi\big(\{x \in S \mid v_0(x) = 0\}\big) \subseteq \{y \in N \mid dG_N(y) = 0\}$. If $\Pi$ is also a local diffeomorphism, we will obtain equality between the above two sets and, consequently, the critical points of the cost function $G_N$ are characterized by the equation $\Pi_*(v_0|_S) = 0$. In this particular setting the step (iv) can be replaced by the following:

• (iv') Suppose that $\Pi_*(v_0|_S)$ is a well-defined vector field on $N$ and $\Pi$ is a local diffeomorphism; then solve the equation $\Pi_*(v_0|_S)(y) = 0$ (the solutions are critical points for the cost function $G_N$).

For solving point (v) one can study the second order derivatives of the cost function, or study the convexity properties of the cost function.

## 3 Averaging on SO(3)

A problem frequently encountered in practice is finding a rotation that is, in some sense, an average of a finite set of rotations. The group of special rotations $SO(3)$ is not a Euclidean space, but a differential manifold. Consequently, an averaging problem on such a space has to be considered in the realm of differential geometry.

The special orthogonal group $SO(3)$ can be identified, by a double covering map, with the sphere $S^3$ in $\mathbb{R}^4$. Our manifold will be $S = S^3$ and the ambient space will be the Euclidean space $M = \mathbb{R}^4$. We will consider different distance functions on $SO(3)$ that generate different cost functions, and will solve the associated averaging problems, as we have described in the previous section, by using the embedding algorithm.

From a historical perspective, we will use the quaternion notation on $\mathbb{R}^4$, although the quaternion structure will not be specifically used in our computations. The unit quaternions $q = (q^0, q^1, q^2, q^3)$ and $-q$ correspond to the following rotation in $SO(3)$:

$$R_q = \begin{pmatrix} (q^0)^2+(q^1)^2-(q^2)^2-(q^3)^2 & 2(q^1q^2-q^0q^3) & 2(q^1q^3+q^0q^2) \\ 2(q^1q^2+q^0q^3) & (q^0)^2-(q^1)^2+(q^2)^2-(q^3)^2 & 2(q^2q^3-q^0q^1) \\ 2(q^1q^3-q^0q^2) & 2(q^2q^3+q^0q^1) & (q^0)^2-(q^1)^2-(q^2)^2+(q^3)^2 \end{pmatrix}.$$

This gives rise to the smooth double covering map $P : S^3 \to SO(3)$, $P(q) = R_q$. The covering map $P$ is a local diffeomorphism and, consequently, instead of working with a distance function (cost function) on $SO(3)$, we will work with cost functions on the unit sphere $S^3$, as it has been explained in the previous section. In this case, for the step (ii) of the embedding algorithm, we have $F : \mathbb{R}^4 \to \mathbb{R}$, $F(q) = \langle q, q\rangle$, with $c = 1$ and $S^3 = F^{-1}(1)$. For a single constraint, the tensor (2.4) becomes $T = \langle\nabla F, \nabla F\rangle\, g^{-1} - \nabla F \otimes \nabla F$, where $g$ is the Euclidean metric on $\mathbb{R}^4$. Writing this explicitly, the matrix associated to the symmetric contravariant 2-tensor $T$ is given by $T(q) = 4\big(\langle q, q\rangle I_4 - q\, q^{\mathsf T}\big)$.
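The double covering property can be checked numerically. The sketch below (an illustration we add here; the helper name `R` is ours) implements the matrix $R_q$ displayed above and verifies that both $q$ and $-q$ are mapped to the same rotation:

```python
import numpy as np

def R(q):
    """Rotation matrix R_q associated to a unit quaternion q = (q0, q1, q2, q3)."""
    q0, q1, q2, q3 = q
    return np.array([
        [q0**2 + q1**2 - q2**2 - q3**2, 2*(q1*q2 - q0*q3),             2*(q1*q3 + q0*q2)],
        [2*(q1*q2 + q0*q3),             q0**2 - q1**2 + q2**2 - q3**2, 2*(q2*q3 - q0*q1)],
        [2*(q1*q3 - q0*q2),             2*(q2*q3 + q0*q1),             q0**2 - q1**2 - q2**2 + q3**2],
    ])

q = np.array([0.5, 0.5, 0.5, 0.5])  # a unit quaternion
Rq = R(q)
# Rq is orthogonal with determinant +1, and R(-q) coincides with R(q)
```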

In order to compute the standard control vector field according to the formula given in Theorem 2.1, we need the following elementary computation: if $\omega$ is a one-form on $\mathbb{R}^4$, then we have the vector field

$$\mathbf{i}_\omega T(q) = 4\big(\langle q, q\rangle\, \bar\omega(q) - \langle q, \bar\omega(q)\rangle\, q\big), \tag{3.1}$$

where we have made the notation $\bar\omega(q) := \big(\omega_0(q), \omega_1(q), \omega_2(q), \omega_3(q)\big) \in \mathbb{R}^4$ for the coefficients of the one-form $\omega$.

We will exemplify our construction by using various distance functions used in the literature for measuring distances between Euclidean transformations; their equivalence and functional dependence have also been studied there.

I. On the Lie group $SO(3)$ we will consider the distance function $d_F(R_1, R_2) = \|R_1 - R_2\|_F$, where $\|\cdot\|_F$ is the Frobenius norm. The associated cost function is $G^1_{SO(3)} : SO(3) \to \mathbb{R}$,

$$G^1_{SO(3)}(R) = \sum_{i=1}^{r} \|R - R_i\|_F^2,$$

where $R_1, \ldots, R_r$ are the sample rotation matrices.

According to the step (i) of the embedding algorithm, we will lift the problem of finding the critical points of the cost function $G^1_{SO(3)}$ to the problem of finding the critical points of the lifted cost function $G^1_{S^3} = G^1_{SO(3)} \circ P : S^3 \to \mathbb{R}$.

In order to compute the lifted cost function $G^1_{S^3}$, by using the surjectivity of the covering map $P$, for each sample matrix $R_i$ we will choose a sample quaternion $q_i \in P^{-1}(R_i)$. We have the following computation:

$$\begin{aligned} G^1_{S^3}(q) &= \sum_{i=1}^{r} \|R_q - R_{q_i}\|_F^2 = \sum_{i=1}^{r} \operatorname{tr}\big((R_q - R_{q_i})(R_q - R_{q_i})^{\mathsf T}\big) \\ &= 2\sum_{i=1}^{r} \big(3 - \operatorname{tr}(R_q^{\mathsf T} R_{q_i})\big) = 8\sum_{i=1}^{r} \big(1 - \langle q, q_i\rangle^2\big), \end{aligned}$$

where we have used the equality

$$\langle q, q_i\rangle^2 = \frac{1}{4}\big(\operatorname{tr}(R_q^{\mathsf T} R_{q_i}) + 1\big). \tag{3.2}$$
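Identity (3.2), and the closed form of the lifted cost derived from it, can be confirmed numerically. The following sketch (our illustration, with randomly drawn unit quaternions and helper names of our choosing) compares the Frobenius cost computed from rotation matrices with the quaternionic expression $8\sum_i\big(1 - \langle q, q_i\rangle^2\big)$:

```python
import numpy as np

def R(q):
    # quaternion -> rotation matrix, as displayed at the beginning of Section 3
    q0, q1, q2, q3 = q
    return np.array([
        [q0**2 + q1**2 - q2**2 - q3**2, 2*(q1*q2 - q0*q3),             2*(q1*q3 + q0*q2)],
        [2*(q1*q2 + q0*q3),             q0**2 - q1**2 + q2**2 - q3**2, 2*(q2*q3 - q0*q1)],
        [2*(q1*q3 - q0*q2),             2*(q2*q3 + q0*q1),             q0**2 - q1**2 - q2**2 + q3**2],
    ])

rng = np.random.default_rng(0)
def unit_quat():
    v = rng.normal(size=4)
    return v / np.linalg.norm(v)

q = unit_quat()
qs = [unit_quat() for _ in range(5)]

# (3.2): <q, q_i>^2 = (tr(R_q^T R_{q_i}) + 1) / 4
lhs = [np.dot(q, qi)**2 for qi in qs]
rhs = [(np.trace(R(q).T @ R(qi)) + 1) / 4 for qi in qs]

# chordal cost: sum ||R_q - R_{q_i}||_F^2 = 8 sum (1 - <q, q_i>^2)
cost_rot = sum(np.linalg.norm(R(q) - R(qi), 'fro')**2 for qi in qs)
cost_quat = 8 * sum(1 - np.dot(q, qi)**2 for qi in qs)
```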

For implementing the step (iii) we need to construct the prolongation function $G_1 : \mathbb{R}^4 \to \mathbb{R}$, $G_1(q) = 8\sum_{i=1}^{r}\big(1 - \langle q, q_i\rangle^2\big)$. Note that $G_1$ and $G^1_{S^3}$ are even functions of the sample quaternions, so they do not depend on the choice of the sample quaternions $q_i$ or $-q_i$, which represent the same sample rotation $R_{q_i}$. This subtler problem has also been addressed in a different way in the literature. Computing the differential one-form $dG_1$ we obtain:

$$\overline{dG_1}(q) = \Big(\frac{\partial G_1}{\partial q^0}(q), \ldots, \frac{\partial G_1}{\partial q^3}(q)\Big) = -16\sum_{i=1}^{r} \langle q, q_i\rangle\, q_i.$$

Using (3.1), where the one-form is $\omega = dG_1$, the system of equations (2.5) becomes:

$$\begin{cases} \displaystyle\sum_{i=1}^{r} \langle q, q_i\rangle\big(\langle q, q\rangle\, q_i - \langle q, q_i\rangle\, q\big) = 0 \\[2mm] \langle q, q\rangle = 1, \end{cases} \tag{3.3}$$

and this represents, up to a non-vanishing constant factor, the equation $v_0(q) = 0$, $q \in S^3$. The above system has four equations of third degree in the four unknowns $q^0, q^1, q^2, q^3$, corresponding to the standard control vector field, plus the constraint equation that describes $S^3$.

The four equations corresponding to $v_0(q) = 0$ are not functionally independent because, as we have stated in Section 2, the tensor $T$ is degenerate. Nevertheless, since $\langle v_0(q), q\rangle = 0$ for any $q \in \mathbb{R}^4$, the system (3.3) can be described by only three equations plus the constraint equation.

In order to write $v_0(q) = 0$, $q \in S^3$ as a system of three independent equations, we will push forward the vector field $v_0|_{S^3}$ through the covering map $P$, which will also place us in the hypotheses of the step (iv') of the embedding algorithm. By the definition of the push-forward operator, we have that $P_*(v_0|_{S^3})(R_q) = DP(q)\cdot v_0|_{S^3}(q)$.

As $P$ is not an injective map, in order to obtain a well-defined vector field on the target space $SO(3)$, the equality $DP(q)\cdot v_0|_{S^3}(q) = DP(-q)\cdot v_0|_{S^3}(-q)$ must be satisfied. We will show in what follows that this is the case.

We need to compute the tangent map of the covering map $P$. This map can be written as a restriction of the tangent map of $\tilde P : \mathbb{R}^4 \to \mathbb{R}^9$ defined by $\tilde P(q) = R_q$, where a rotation matrix has been identified with a point in $\mathbb{R}^9$ (the first line is identified with the first three components, and so on).

The matrix corresponding to the linear map $D\tilde P(q)$ is given by the Jacobian matrix

$$2\begin{bmatrix} q^0 & q^1 & -q^2 & -q^3 \\ -q^3 & q^2 & q^1 & -q^0 \\ q^2 & q^3 & q^0 & q^1 \\ q^3 & q^2 & q^1 & q^0 \\ q^0 & -q^1 & q^2 & -q^3 \\ -q^1 & -q^0 & q^3 & q^2 \\ -q^2 & q^3 & -q^0 & q^1 \\ q^1 & q^0 & q^3 & q^2 \\ q^0 & -q^1 & -q^2 & q^3 \end{bmatrix}.$$

By direct computation, where we identify a vector in $\mathbb{R}^9$ with a $3 \times 3$ matrix, for $q \in S^3$ we have that

$$\begin{aligned} DP(q)\cdot v_0|_{S^3}(q) &= D\tilde P(q)\cdot v_0|_{S^3}(q) = \sum_{i=1}^{r} \langle q, q_i\rangle\, D\tilde P(q)\cdot\big(\langle q, q\rangle\, q_i - \langle q, q_i\rangle\, q\big) \\ &\cong \sum_{i=1}^{r} \langle q, q_i\rangle\, R_q\, \Delta_i(q) = R_q \sum_{i=1}^{r} \langle q, q_i\rangle\, \Delta_i(q), \end{aligned}$$

where

$$\Delta_i(q) = \begin{pmatrix} 0 & -q^0 q_i^3 + q^1 q_i^2 - q^2 q_i^1 + q^3 q_i^0 & q^0 q_i^2 + q^1 q_i^3 - q^2 q_i^0 - q^3 q_i^1 \\ q^0 q_i^3 - q^1 q_i^2 + q^2 q_i^1 - q^3 q_i^0 & 0 & -q^0 q_i^1 + q^1 q_i^0 + q^2 q_i^3 - q^3 q_i^2 \\ -q^0 q_i^2 - q^1 q_i^3 + q^2 q_i^0 + q^3 q_i^1 & q^0 q_i^1 - q^1 q_i^0 - q^2 q_i^3 + q^3 q_i^2 & 0 \end{pmatrix}.$$

By identifying the vector $D\tilde P(q)\cdot\big(\langle q, q\rangle\, q_i - \langle q, q_i\rangle\, q\big) \in \mathbb{R}^9$ with a $3 \times 3$ matrix, we obtain precisely the matrix $R_q\,\Delta_i(q)$, where $\Delta_i(q)$ is the skew-symmetric matrix written above.

By noticing that $\Delta_i(-q) = -\Delta_i(q)$, we obtain the equality $DP(q)\cdot v_0|_{S^3}(q) = DP(-q)\cdot v_0|_{S^3}(-q)$ and, consequently, $P_*(v_0|_{S^3})$ is a well-defined vector field on the manifold $SO(3)$.
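The claim above can also be tested numerically: in the sketch below (helper names are ours), $\Delta_i(q)$ is implemented entrywise from the displayed matrix, and the product $\langle q, q_i\rangle\,\Delta_i(q)$ is checked to be skew-symmetric and invariant under $q \mapsto -q$:

```python
import numpy as np

def Delta(q, qi):
    """Skew-symmetric matrix Delta_i(q), entries copied from the display above."""
    q0, q1, q2, q3 = q
    p0, p1, p2, p3 = qi
    a = -q0*p3 + q1*p2 - q2*p1 + q3*p0  # entry (1, 2)
    b =  q0*p2 + q1*p3 - q2*p0 - q3*p1  # entry (1, 3)
    c = -q0*p1 + q1*p0 + q2*p3 - q3*p2  # entry (2, 3)
    return np.array([[0.0, a, b], [-a, 0.0, c], [-b, -c, 0.0]])

rng = np.random.default_rng(1)
q = rng.normal(size=4);  q /= np.linalg.norm(q)
qi = rng.normal(size=4); qi /= np.linalg.norm(qi)

A = np.dot(q, qi) * Delta(q, qi)
B = np.dot(-q, qi) * Delta(-q, qi)  # the antipodal representative -q
```

Since $\Delta_i$ is linear in $q$ and $\langle q, q_i\rangle$ changes sign with $q$, the product is an even function of $q$, which is exactly the well-definedness of the pushed-forward vector field.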

As a result, we obtain that the critical points of the cost function $G^1_{S^3}$ are the solutions of the following system (which is equivalent with (3.3)):

$$\sum_{i=1}^{r} \langle q, q_i\rangle\, \Delta_i(q) = 0, \qquad q \in S^3. \tag{3.4}$$

The advantage of the above system, compared with the system (3.3), is that we have only three equations of second degree instead of four equations of third degree plus the constraint equation.

If we transform the above system into rotations, by direct computation, we obtain:

$$\langle q, q_i\rangle\, \Delta_i(q) = \frac{1}{4}\big(R_{q_i}^{\mathsf T} R_q - R_q^{\mathsf T} R_{q_i}\big) \tag{3.5}$$

and the above system becomes:

$$\sum_{i=1}^{r} \big(R_{q_i}^{\mathsf T} R_q - R_q^{\mathsf T} R_{q_i}\big) = 0.$$

This is equivalent with the following equation on $SO(3)$:

$$\bar R^{\mathsf T} R - R^{\mathsf T} \bar R = 0, \tag{3.6}$$

where $\bar R := \sum_{i=1}^{r} R_i$. Solving this equation corresponds to the step (iv') of the embedding algorithm, and the solutions are the critical points of the cost function $G^1_{SO(3)}$. Also, this equation is the same as the characterization of the Euclidean mean obtained previously in the literature. Thus, we have obtained that searching for the critical points of the cost function $G^1_{SO(3)}$ can be performed in two ways. More precisely, we can solve (3.6), or we can transform the initial problem into quaternions, solve the system (3.4), and then transform the solutions into rotations.
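A natural candidate solving (3.6) is the projection of $\bar R = \sum_i R_i$ onto $SO(3)$, computed via a singular value decomposition; this is the classical construction of the Euclidean (chordal) mean. The sketch below (our illustration; deciding which solution is the actual minimizer is step (v)) verifies that the projected mean annihilates the left-hand side of (3.6):

```python
import numpy as np

def project_to_SO3(M):
    """Nearest rotation matrix to M in the Frobenius norm, via SVD."""
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])  # enforce det = +1
    return U @ D @ Vt

def random_rotation(rng):
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]  # flip one column to land in SO(3)
    return Q

rng = np.random.default_rng(2)
Rs = [random_rotation(rng) for _ in range(4)]
Rbar = sum(Rs)
R = project_to_SO3(Rbar)

residual = Rbar.T @ R - R.T @ Rbar  # left-hand side of (3.6)
```

With $\bar R = U\Sigma V^{\mathsf T}$ and $R = UDV^{\mathsf T}$, the product $\bar R^{\mathsf T} R = V\Sigma D V^{\mathsf T}$ is symmetric, so (3.6) holds by construction.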

II. Next, we will consider the geodesic distance on the Lie group $SO(3)$, which is defined by means of the angle between two rotations. For two rotations $R_1, R_2 \in SO(3)$, $d(R_1, R_2) = \|\operatorname{Log}(R_1^{\mathsf T} R_2)\|_F$, where $\operatorname{Log}$ is the matrix logarithm and the angle of the relative rotation $R_1^{\mathsf T} R_2$ enters through it. The associated cost function is $G^2_{SO(3)}$,

$$G^2_{SO(3)}(R) = \sum_{i=1}^{r} \|\operatorname{Log}(R_i^{\mathsf T} R)\|_F^2,$$

where $R_1, \ldots, R_r$ are the sample rotation matrices, and the closed set $D \subset SO(3)$ on which the logarithm fails to be well defined consists of the rotations by angle $\pi$, i.e. $D = \{R \in SO(3) \mid \operatorname{tr}(R) = -1\}$.

For $G^2_{SO(3)}$ to be well defined we have to remove the rotations included in the closed sets $R_i D$, $i = 1, \ldots, r$. When we lift to quaternions, we use the set equality $P^{-1}(R_i D) = S^3 \cap \Pi_i$, where $\Pi_i := \{q \in \mathbb{R}^4 \mid \langle q, q_i\rangle = 0\}$. Consequently, from $S^3$ we have to subtract the set $\bigcup_{i=1}^{r} \Pi_i$. As before, we will compute the lifted cost function as stated in the step (i) of the embedding algorithm. More precisely, we have:

$$\begin{aligned} G^2_{S^3}(q) &= \sum_{i=1}^{r} \|\operatorname{Log}(R_{q_i}^{\mathsf T} R_q)\|_F^2 = 2\sum_{i=1}^{r} \theta_i^2 = 2\sum_{i=1}^{r} \arccos^2\Big(\frac{\operatorname{tr}(R_{q_i}^{\mathsf T} R_q) - 1}{2}\Big) \\ &\stackrel{(3.2)}{=} 2\sum_{i=1}^{r} \arccos^2\big(2\langle q_i, q\rangle^2 - 1\big) = 8\sum_{i=1}^{r} \arccos^2\big(|\langle q_i, q\rangle|\big).\end{aligned}$$

The lifted cost function $G^2_{S^3}$ does not depend on the choice of the sample quaternions $q_i$ or $-q_i$ that represent the same sample rotation $R_{q_i}$.

For the step (iii) of the embedding algorithm, the prolongation function is given by $G_2 : \mathbb{R}^4 \setminus \{0\} \to \mathbb{R}$, $G_2(q) = 8\sum_{i=1}^{r} \arccos^2\Big(\dfrac{|\langle q, q_i\rangle|}{\|q\|\cdot\|q_i\|}\Big)$. The coefficients of the differential one-form $dG_2$ are given by:

$$\begin{aligned} \overline{dG_2}(q) &= \Big(\frac{\partial G_2}{\partial q^0}(q), \ldots, \frac{\partial G_2}{\partial q^3}(q)\Big) \\ &= -16\sum_{i=1}^{r} \arccos\Big(\frac{|\langle q, q_i\rangle|}{\|q\|\cdot\|q_i\|}\Big) \frac{\operatorname{sgn}(\langle q, q_i\rangle)}{\|q\|^2\sqrt{\|q\|^2\|q_i\|^2 - \langle q, q_i\rangle^2}} \big(\langle q, q\rangle\, q_i - \langle q, q_i\rangle\, q\big).\end{aligned}$$

In order to solve the differentiability problem, we will eliminate from the domain of definition of the function $G_2$ the hyperplanes $\Pi_i$ and the lines that go through the points $q_i$ and $-q_i$.

In this case, by using the same arguments as before, the equation $v_0(q) = 0$ is equivalent with the equation $P_*(v_0|_{S^3})(R_q) = 0$, which has the following form:

$$\begin{cases} \displaystyle\sum_{i=1}^{r} \frac{\operatorname{sgn}(\langle q, q_i\rangle)\arccos(|\langle q, q_i\rangle|)}{\sqrt{1 - \langle q, q_i\rangle^2}}\, \Delta_i(q) = 0 \\[2mm] q \in S^3 \setminus \displaystyle\bigcup_{i=1}^{r} \Pi_i, \quad q \neq \pm q_i. \end{cases} \tag{3.7}$$

The above expression also shows that $P_*(v_0|_{S^3})$ is a well-defined vector field on its domain in $SO(3)$.

In order to transform (3.7) into rotations we need the following computation:

$$\frac{\operatorname{sgn}(\langle q, q_i\rangle)\arccos(|\langle q, q_i\rangle|)}{\sqrt{1 - \langle q, q_i\rangle^2}} = \frac{\arccos(|\langle q, q_i\rangle|)\,\langle q, q_i\rangle}{|\langle q, q_i\rangle|\sqrt{1 - \langle q, q_i\rangle^2}} = \frac{|\theta_i|/2}{|\cos(\theta_i/2)|\sqrt{1 - \cos^2(\theta_i/2)}}\,\langle q, q_i\rangle,$$

where we have used the property $\langle q, q_i\rangle = \cos(\theta_i/2)$, with $\theta_i \in (-\pi, \pi)$, which implies that $\arccos(|\langle q, q_i\rangle|) = |\theta_i|/2$. Using (3.5) we obtain the equivalent system in rotations, and this corresponds to the step (iv') of the embedding algorithm,

 r∑i=1(RqiTRq−Rq
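The trigonometric identity used in the computation above can be confirmed numerically by parametrizing $\langle q, q_i\rangle = \cos(\theta_i/2)$ with $\theta_i \in (-\pi, \pi)$ (a sketch we add for illustration):

```python
import numpy as np

thetas = np.array([0.5, 1.0, -2.0, 3.0, -0.7])  # relative rotation angles in (-pi, pi)
u = np.cos(thetas / 2)                          # <q, q_i> for each sample angle

lhs = np.sign(u) * np.arccos(np.abs(u)) / np.sqrt(1 - u**2)
rhs = (np.abs(thetas) / 2) / (np.abs(np.cos(thetas / 2))
                              * np.sqrt(1 - np.cos(thetas / 2)**2)) * u
```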