
# Curvature-aware Manifold Learning

Yangyang Li Academy of Mathematics and Systems Science Key Lab of MADIS
Chinese Academy of Sciences, Beijing, 100190, China
University of Chinese Academy of Sciences, Beijing, 100049, China
###### Abstract

Traditional manifold learning algorithms assume that the embedded manifold is globally or locally isometric to Euclidean space. Under this assumption, they divide the manifold into a set of overlapping local patches, each treated as isometric to a linear subset of Euclidean space. An analysis of these global or local isometry assumptions shows that the learnt manifold is a flat manifold whose Riemannian curvature tensor vanishes. In general, manifolds need not satisfy these hypotheses, and a major limitation of traditional manifold learning is that it ignores the curvature information of the manifold. To remove these limitations, we present a curvature-aware manifold learning algorithm called CAML. Our algorithm drops the local isometry assumption and reduces the dimension of general manifolds that are not isometric to Euclidean space, by injecting curvature information into the manifold learning process. Experiments comparing neighborhood preserving ratios show that our method CAML is more stable than other manifold learning algorithms.

###### keywords:
Manifold Learning, Riemannian Curvature, Second Fundamental Form, Hessian Operator
journal: Pattern Recognition

## 1 Introduction

In many machine learning tasks, one is confronted with data points of redundant dimension, together with a strong intuition that the data admit an intrinsically lower dimensional representation. The concept of a manifold was first applied to dimension reduction in [3] and [6], giving rise to manifold learning (MAL). Over the past decade, manifold learning has become a significant component of machine learning, pattern recognition, computer vision, and related fields. Traditional manifold learning algorithms aim to reduce the dimensionality of high dimensional data points so that the lower dimensional representations reflect the intrinsic geometric and topological structure of the sampled points. The existing manifold learning algorithms divide mainly into two classes, global and local [21]. Global approaches aim to preserve the global geometric structure of the manifold during dimension reduction, such as IsoMap [3]; local approaches attempt to uncover the geometric structure of local patches, such as LLE [6], LEP [1], LPP [4], LTSA [5], and Hessian Eigenmaps [2]. IsoMap aims to preserve the geodesic distance between any two high dimensional data points and can be viewed as a nonlinear extension of Multidimensional Scaling (MDS) [11]. Locally preserving manifold learning algorithms aim to inherit the local geometric structure of the embedded manifold: for instance, LLE preserves the local linear structure of each patch, while LEP preserves the local similarities among data points during dimension reduction.

### 1.1 Manifold Assumption

One fundamental assumption of manifold learning is that the input data points lie on or near a manifold $\mathcal{M}$, viewed as a sub-manifold of the ambient feature space. Each algorithm adds its own further assumptions. IsoMap assumes that $\mathcal{M}$ is globally isometric to a convex subset of Euclidean space. Locally preserving manifold learning algorithms [20] treat the embedded manifold as a collection of overlapping local patches, with assumptions that differ from one algorithm to another. LLE assumes that $\mathcal{M}$ is an open sub-manifold and that the input data points are dense enough for the neighborhood of each point to be a linear subspace. LEP likewise regards the neighborhood of each sample as a linear subspace and constructs a local weight matrix in which the distance between neighboring samples is measured by the Euclidean metric. HLLE assumes that $\mathcal{M}$ is locally isometric to Euclidean space, so that the embedding can be recovered from the null space of the averaged Hessian over all data points. LTSA applies PCA [12] within each local patch to reduce the dimension of the local samples, implicitly assuming that each local patch of $\mathcal{M}$ is a linear subspace of Euclidean space. PFE [17] uses parallel vector fields to learn a dimension reduction map that is locally isometric to Euclidean space. LSML [10] reduces the dimension of a manifold $\mathcal{M}$ that is not isometric to Euclidean space, but it still regards the local patches of the sub-manifold as linear subspaces. The assumptions of these MAL algorithms are summarized in Table 1.

All the MAL algorithms except LSML assume that $\mathcal{M}$ is globally or locally isometric to Euclidean space. In practice, however, a general manifold rarely satisfies these assumptions. None of the existing algorithms analyzes the reliability and validity of these assumptions, nor the difference between isometric and non-isometric manifolds.

### 1.2 Limitations

Despite the wide application of the existing MAL algorithms in fields such as computer vision, pattern recognition, and machine learning, a few limitations and problems remain to be solved.

• Local linearity assumption: the input data points must be dense enough to guarantee that the local patches are linear subspaces. In practice, there are rarely enough samples to generate patches small enough to guarantee linearity.

• Parameter sensitivity problem: the neighbor-size parameter determines the size of the local patches. The local isometry hypothesis requires the neighbor size to be small enough; otherwise the assumption underlying existing manifold learning algorithms is violated.

• Locally short circuit problem: if the embedded manifold is highly curved, the local Euclidean distance between two points can be markedly shorter than the intrinsic geodesic distance.

• Intrinsic dimension estimation problem: since local patches are simply taken to be tangent spaces, the intrinsic dimension of the manifold cannot be determined from them accurately, particularly when the curvature varies strongly.

• Curvature sensitivity problem: if the curvature of the original manifold is especially high at some point, a smaller patch is needed to represent the neighborhood around that point. In practice this is hard to avoid, especially when the data points are sparse.

All the limitations above stem from the assumption that $\mathcal{M}$ is locally isometric to Euclidean space. Removing this assumption is the main goal of this paper. The problem addressed in this paper is stated as follows.

### 1.3 Problem Statement

The input data points considered in this paper are $x_1, \dots, x_N \in \mathbb{R}^D$, where $N$ is the number of data points and $D$ is their dimension. We assume that these discrete data points lie on a $d$-dimensional manifold $\mathcal{M}$ embedded in the high dimensional feature space $\mathbb{R}^D$, so that $\mathcal{M}$ can be viewed as a sub-manifold of $\mathbb{R}^D$. The aim of manifold learning is to learn an embedding map $f$:

$$x_i = f(y_i) + \epsilon_i, \quad i = 1, \cdots, N, \tag{1}$$

where $y_i \in \mathbb{R}^d$ are the lower dimensional representations of $x_i$ and $\epsilon_i$ are the corresponding noise terms. The map $f$ must preserve the geometric structure of the sub-manifold $\mathcal{M}$ so that the lower dimensional representations uncover its intrinsic structure.

Under the local isometry assumption, the embedding map $f$ locally satisfies:

$$\|f(y_i) - f(y_j)\|^2 = \|y_i - y_j\|^2 + o\left(\|y_i - y_j\|^2\right), \tag{2}$$

where $y_i$ and $y_j$ lie in the same local patch.

For a general manifold the local isometry condition need not hold; the sphere is a standard counterexample [10]. The problem we aim to solve in this paper is the situation in which $\mathcal{M}$ is not locally isometric to Euclidean space. All manifold learning algorithms aim to uncover the intrinsic structure of the embedded manifold $\mathcal{M}$; our method attempts to learn the embedding map $f$ of Eq. (1) under the non-isometric condition, in which Eq. (2) is not satisfied. In the next section we analyze in detail the relationship between local isometry and the curvature tensor of the sub-manifold $\mathcal{M}$; based on this analysis, we present our curvature-aware manifold learning algorithm.

## 2 Geometry Background

In this section we first give the definition of local isometry and a geometric interpretation of the local isometry assumption, from which we uncover the potential limitations of traditional manifold learning algorithms. In the second subsection we review the geometry of general Riemannian sub-manifolds.

### 2.1 Local Isometry

The family of inner products defined on all tangent spaces is known as the Riemannian metric $g$ of the manifold $\mathcal{M}$. On each tangent space $T_p\mathcal{M}$, the Riemannian metric is a scalar inner product $g_p(u, v)$ for $u, v \in T_p\mathcal{M}$.

Definition 2.1. (Local Isometry) [13] Let $(\mathcal{M}, g)$ and $(\mathcal{N}, h)$ be two Riemannian manifolds with Riemannian metrics $g$ and $h$. A map $f: \mathcal{M} \to \mathcal{N}$ between the manifolds is called a local isometry if $g_p(u, v) = h_{f(p)}(df_p(u), df_p(v))$ for all $p \in \mathcal{M}$ and $u, v \in T_p\mathcal{M}$. Here $df_p$ is the differential of $f$ at $p$.

Under a local isometry, the differential $df_p$ is a linear isometry between the corresponding tangent spaces $T_p\mathcal{M}$ and $T_{f(p)}\mathcal{N}$.

Definition 2.2. (Global Isometry) [13] A map $f: \mathcal{M} \to \mathcal{N}$ is called a global isometry between the manifolds if it is a diffeomorphism and also a local isometry.

A Riemannian manifold is said to be flat if it is locally isometric to Euclidean space; that is, if every point has a neighborhood isometric to an open subset of Euclidean space, the Riemannian manifold is called flat.

Theorem 2.1. [14] A Riemannian manifold is flat if and only if its curvature tensor vanishes identically.

So under the local isometry assumption of traditional MAL, the curvature tensor of the sub-manifold is the null tensor everywhere. In general, however, the sub-manifold may be highly curved and not isometric to Euclidean space, in which case traditional manifold learning algorithms cannot accurately uncover its intrinsic structure.

The root cause of these limitations is that traditional manifold learning algorithms do not consider the curvature tensor of the sub-manifold. To our knowledge, several papers have considered the intrinsic curvature of data points [7] [22] [23]. However, K. I. Kim et al. [7] mainly target semi-supervised learning, and Xu et al. [22] [23] use Ricci flow to rectify pairwise non-Euclidean dissimilarities among data points. In this paper, our method adds curvature information to manifold learning in order to remove the limitations of the traditional algorithms; we therefore propose curvature-aware manifold learning.

### 2.2 Riemannian Sub-manifold

In Riemannian geometry, the geometric structure of a sub-manifold is determined by two fundamental forms. The Riemannian metric can be viewed as the first fundamental form, which captures the intrinsic geometric structure of the Riemannian manifold, such as geodesic distance, area, and volume. The second fundamental form uncovers the extrinsic structure of the sub-manifold relative to the ambient space, such as curvature and torsion (for a Riemannian manifold the torsion is zero). How the sub-manifold curves with respect to the ambient space is measured by the second fundamental form.

### 2.3 Second fundamental form

Suppose $\mathcal{M}$ is a Riemannian manifold of dimension $d$ embedded in the ambient space $\mathbb{R}^D$ of dimension $D$. At any point $p \in \mathcal{M}$, the ambient tangent space splits into two perpendicular linear subspaces $T_p\mathbb{R}^D = T_p\mathcal{M} \oplus N_p\mathcal{M}$ [14], where $N_p\mathcal{M}$ is the normal space and $T_p\mathcal{M}$ is the tangent space of $\mathcal{M}$ at $p$. In this paper we regard $\mathcal{M}$ as a Riemannian sub-manifold of $\mathbb{R}^D$, with the Riemannian metric induced from $\mathbb{R}^D$. The Riemannian curvature tensor defined on a Riemannian manifold is a fourth-order tensor. The curvature operator is expressed through second-order derivatives on vector fields of the Riemannian manifold, where the directional derivative is given by the Riemannian connection $\nabla$. For a Riemannian sub-manifold, the Riemannian curvature tensor is computed with the help of the second fundamental form, written $B$.

Definition 2.3. (Riemannian Curvature) [15] Let $\mathcal{M}$ be a Riemannian manifold and $\nabla$ the Riemannian connection. The curvature tensor is a $(1,3)$-tensor defined by:

$$R(X, Y)Z = \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z,$$

on vector fields $X, Y, Z$.

Using the Riemannian metric $g$, the curvature tensor $R$ can be converted into a $(0,4)$-tensor [15]:

$$R(X, Y, Z, W) = g(R(X, Y)Z, W). \tag{3}$$

In Riemannian sub-manifold theory, one main task is to compare the Riemannian curvature of $\mathcal{M}$ with that of the ambient space. Following the definition of the curvature tensor, we first give the relationship between the Riemannian connection $\nabla$ of $\mathcal{M}$ and the ambient connection $\tilde{\nabla}$ [14]:

$$\tilde{\nabla}_X Y = \nabla_X Y + B(X, Y), \tag{4}$$

where the normal component $B(X, Y)$ is known as the second fundamental form of $\mathcal{M}$.

Therefore, we can interpret the second fundamental form as a measure of the difference between the Riemannian connection on $\mathcal{M}$ and the ambient connection. Based on the relationship between $\nabla$ and $\tilde{\nabla}$, the following theorem relates the Riemannian curvature of the sub-manifold to that of the ambient space.

Theorem 2.2. (The Gauss Equation) [14] For any vector fields $X, Y, Z, W$, the following equation holds:

$$\tilde{R}(X, Y, Z, W) = R(X, Y, Z, W) - \langle B(X, W), B(Y, Z)\rangle + \langle B(X, Z), B(Y, W)\rangle.$$

So the Riemannian curvature of the ambient space decomposes into two components. In this paper the ambient space is the Euclidean space $\mathbb{R}^D$, so $\tilde{R} = 0$. In this case, the Riemannian curvature of $\mathcal{M}$ is:

$$R(X, Y, Z, W) = \langle B(X, W), B(Y, Z)\rangle - \langle B(X, Z), B(Y, W)\rangle. \tag{5}$$

In order to compute scalar values of the second fundamental form, we construct a local orthonormal coordinate frame $\{e_1, \dots, e_D\}$ of the ambient space at the point $p$: the restrictions of the first $d$ vectors $\{e_1, \dots, e_d\}$ to $\mathcal{M}$ form a local orthonormal frame of $T_p\mathcal{M}$, and the last $D - d$ vectors $\{e_{d+1}, \dots, e_D\}$ form a local orthonormal frame of the normal space $N_p\mathcal{M}$. Under this locally natural orthonormal coordinate frame, the Riemannian curvature of $\mathcal{M}$ in Eq. (5) is represented as:

$$R_{ijkl} = \sum_{\alpha} \left( h^{\alpha}_{ik} h^{\alpha}_{jl} - h^{\alpha}_{il} h^{\alpha}_{jk} \right). \tag{6}$$

Accordingly, the second fundamental form under this local coordinate frame is $B(e_i, e_j) = \sum_{\alpha} h^{\alpha}_{ij} e_{\alpha}$, where $h^{\alpha}_{ij}$ are the coefficients of $B$ with respect to the normal frame $\{e_{d+1}, \dots, e_D\}$. Under this locally natural coordinate frame, the embedding map can be written componentwise as $f = (f^1, \dots, f^{D-d})$ in the natural parameters $(u_1, \dots, u_d)$, and $h^{\alpha}_{ij} = \partial^2 f^{\alpha} / \partial u_i \partial u_j$ is the second derivative of the embedding component function $f^{\alpha}$; these coefficients constitute the Hessian matrix $H^{\alpha}$. By the above analysis, in order to compute the Riemannian curvature of the Riemannian sub-manifold $\mathcal{M}$, we only need to estimate the Hessian matrices of the embedding map. Next, we give the estimation of the Hessian operator.
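Eq. (6) is a direct contraction of the Hessian coefficients $h^{\alpha}_{ij}$. The following minimal sketch is our own illustration (the array layout and the sphere sanity check are assumptions, not part of the paper):

```python
import numpy as np

def curvature_tensor(h):
    """Riemannian curvature of a submanifold of Euclidean space via the
    Gauss equation (Eq. 6): R_ijkl = sum_a (h^a_ik h^a_jl - h^a_il h^a_jk).
    h: array of shape (codim, d, d) holding the second-fundamental-form
    coefficients in an orthonormal normal frame."""
    return np.einsum('aik,ajl->ijkl', h, h) - np.einsum('ail,ajk->ijkl', h, h)

# Sanity check on a round sphere of radius r in R^3: with respect to the
# unit normal, h = I/r, so R_0101 (the sectional curvature) equals 1/r^2.
r = 2.0
h_sphere = (np.eye(2) / r)[None, :, :]    # shape (1, 2, 2)
R = curvature_tensor(h_sphere)
```

By construction the result is antisymmetric in its first two indices, matching the symmetries of the curvature tensor.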

### 2.4 Hessian Operator

The Hessian matrix is the square matrix of second-order partial derivatives of a scalar-valued function with respect to all of its variables; it describes the concavity, convexity, and local curvature of the function. Suppose $f$ is a multivariate function of $d$ parameters $(u_1, \dots, u_d)$. Then the Hessian matrix of $f$ is given by $H_{ij} = \partial^2 f / \partial u_i \partial u_j$.
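For intuition, the Hessian can be approximated numerically by central differences; this small helper is our own illustration (the function `hessian_fd` and its step size are assumptions, not from the paper):

```python
import numpy as np

def hessian_fd(f, u, eps=1e-5):
    """Estimate the Hessian H_ij = d^2 f / (du_i du_j) of a scalar-valued
    multivariate function f at the point u by central differences."""
    d = len(u)
    H = np.zeros((d, d))
    I = np.eye(d)
    for i in range(d):
        for j in range(d):
            H[i, j] = (f(u + eps*I[i] + eps*I[j]) - f(u + eps*I[i] - eps*I[j])
                       - f(u - eps*I[i] + eps*I[j]) + f(u - eps*I[i] - eps*I[j])) / (4 * eps**2)
    return H

# For the quadratic g(u) = u0^2 + 3*u0*u1, the exact Hessian is [[2, 3], [3, 0]].
g = lambda u: u[0]**2 + 3*u[0]*u[1]
H = hessian_fd(g, np.array([0.5, -1.0]))
```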

In each local patch $U_i$ of $\mathcal{M}$, we choose a set of local natural orthogonal coordinates. In practice, we use PCA [12] to estimate the local orthogonal coordinate system of the tangent space, and the corresponding coordinates of the normal space are computed by the Gram-Schmidt orthogonalization method. The local coordinates of the patch under this new local coordinate system are denoted $u_{ij}$, and the center point $x_{i0}$ is projected to the origin. The second fundamental form coefficients are then estimated as the Hessian entries of the fitted embedding components.
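A simple way to realize this local frame, assuming a full SVD in place of separate PCA and Gram-Schmidt steps (our own simplification), is:

```python
import numpy as np

def local_frame(patch):
    """Express a local patch in an orthonormal frame estimated by PCA.
    patch: (K+1, D) array whose first row is the patch center x_i0.
    The right singular vectors of the centered patch give the principal
    axes; the leading d columns of the returned coordinates play the
    role of tangent coordinates, the trailing ones normal coordinates."""
    centered = patch - patch[0]            # the center is sent to the origin
    _, _, Vt = np.linalg.svd(centered, full_matrices=True)
    coords = centered @ Vt.T               # coordinates in the new frame
    return coords, Vt

# A gently curved 2-D patch in R^3: most variance lies in the first two axes.
rng = np.random.default_rng(0)
t = rng.uniform(-0.1, 0.1, size=(10, 2))
patch = np.column_stack([t, 0.05 * (t**2).sum(axis=1)])
patch[0] = 0.0                             # use the origin as the center point
coords, Vt = local_frame(patch)
```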

Consider the Taylor expansion of each component $f^{\alpha}$ at $x_{i0}$ under this new local coordinate system:

$$f^{\alpha}(u_{ij}) = f^{\alpha}(0) + u_{ij} \nabla f^{\alpha} + \frac{1}{2} u_{ij} H^{\alpha} u_{ij}^T + o\left(\|u_{ij}\|^2\right). \tag{7}$$

Each component of the Hessian matrix $H^{\alpha}$ can be viewed as a second-order coefficient of the quadratic polynomial function $f^{\alpha}$. The local tangent space at $x_{i0}$ is spanned by the linear monomials $\{u_1, \dots, u_d\}$, while the local quadratic polynomial vector space is spanned by $\{u_1, \dots, u_d\} \cup \{u_i u_j\}_{i \le j}$. The Hessian matrix is therefore estimated by projecting the input data points onto this polynomial vector space; we use least squares to compute the projection coefficients, whose closed-form solution applies the pseudo-inverse of the design matrix of monomials evaluated at the local coordinates. The learnt local projection coordinates of each point are $B_{ij} = (\tau_{ij}, \mathrm{vec}(H_{ij}))$, where $\tau_{ij}$ is the tangent component vector and $\mathrm{vec}(H_{ij})$ is the vector-form representation of the Hessian part. The center $x_{i0}$ is projected to the origin.
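The least-squares projection can be sketched as follows; the helper `fit_quadratic` and its monomial ordering are our own assumptions about the construction:

```python
import numpy as np
from itertools import combinations_with_replacement

def fit_quadratic(U, fvals):
    """Least-squares fit of f(u) ~ u.g + 0.5 * u^T H u on a local patch
    whose center sits at the origin (so the constant term is dropped).
    U: (K, d) tangent coordinates; fvals: (K,) values of one normal
    component. Returns the gradient g and the symmetric Hessian H."""
    K, d = U.shape
    pairs = list(combinations_with_replacement(range(d), 2))
    # design matrix columns: u_1..u_d, then the distinct monomials u_i*u_j
    A = np.column_stack([U] + [U[:, i] * U[:, j] for i, j in pairs])
    coef, *_ = np.linalg.lstsq(A, fvals, rcond=None)
    g, quad = coef[:d], coef[d:]
    H = np.zeros((d, d))
    for c, (i, j) in zip(quad, pairs):
        # 0.5*u^T H u contributes 0.5*H_ii*u_i^2 on the diagonal
        # and H_ij*u_i*u_j off the diagonal
        if i == j:
            H[i, i] = 2 * c
        else:
            H[i, j] = H[j, i] = c
    return g, H

# Recover a known gradient and Hessian from noiseless samples.
rng = np.random.default_rng(1)
U = rng.normal(size=(40, 2))
H_true = np.array([[1.0, 0.5], [0.5, -2.0]])
g_true = np.array([0.3, -0.7])
fvals = U @ g_true + 0.5 * np.einsum('ki,ij,kj->k', U, H_true, U)
g_est, H_est = fit_quadratic(U, fvals)
```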

## 3 Curvature-aware Manifold Learning

In this paper we consider only the locally geometry-preserving MAL algorithms, namely LLE, LEP, LTSA, and related methods. These algorithms attempt to recover the local underlying structure of the sub-manifold in a lower dimensional Euclidean space. In general, the procedure of this type of algorithm divides into three steps [8], detailed in the following subsection.

### 3.1 Manifold Learning

In the first step, traditional MAL algorithms assign a local patch to each input point based on the Euclidean metric of the ambient space $\mathbb{R}^D$. Two methods are commonly used. The first chooses an $\epsilon$-ball centered at $x_i$; all points in the ball are the neighbors of $x_i$. The other uses the $K$-nearest-neighbor method to find the neighbors of each input data point. In both methods, $\epsilon$ and $K$ are parameters to which the dimension reduction results are very sensitive.
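The K-nearest-neighbor variant can be sketched with a brute-force distance matrix (our own illustration; practical implementations would use a k-d tree or similar index):

```python
import numpy as np

def knn_patches(X, K):
    """For each x_i, return the indices of its K nearest neighbors under
    the Euclidean metric of the ambient space, excluding x_i itself."""
    D2 = ((X[:, None, :] - X[None, :, :])**2).sum(-1)   # pairwise squared distances
    np.fill_diagonal(D2, np.inf)                        # exclude self-matches
    return np.argsort(D2, axis=1)[:, :K]

X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
nbrs = knn_patches(X, K=2)   # the far-away point never enters the first patch
```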

In the second step, traditional manifold learning algorithms construct a weight matrix $W_i$ on each local patch to represent the local geometric structure of the sub-manifold $\mathcal{M}$; the weight matrices differ from algorithm to algorithm.

The third step reconstructs a set of lower dimensional representations $Y = \{y_1, \dots, y_N\}$, where $y_i$ corresponds to $x_i$. $Y$ is learnt by minimizing a reconstruction error function under some normalization constraints [8]:

$$\Phi(Y) = \sum_{i=1}^{N} \phi(Y_i) = \sum_{i=1}^{N} \|W_i Y_i\|_F^2, \tag{8}$$

with the normalization constraint $Y Y^T = I$ for LLE, LTSA, and HLLE, and $Y D Y^T = I$ for LEP, where $D$ is the diagonal matrix with $D_{ii} = \sum_j W_{ij}$.
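Once the per-patch costs are assembled into a global symmetric alignment matrix (the graph Laplacian for LEP, $(I - W)^T(I - W)$ for LLE), the constrained minimization reduces to a bottom-eigenvector problem. A minimal sketch, using a path-graph Laplacian as a stand-in alignment matrix (our own toy example):

```python
import numpy as np

def spectral_embedding(M, d):
    """Solve min_Y trace(Y M Y^T) s.t. Y Y^T = I: the rows of Y are the
    eigenvectors of the d smallest non-trivial eigenvalues of the
    symmetric PSD alignment matrix M (the constant eigenvector with
    eigenvalue ~0 is discarded)."""
    _, vecs = np.linalg.eigh(M)        # eigenvalues in ascending order
    return vecs[:, 1:d+1].T            # skip the trivial bottom eigenvector

# Laplacian of the path graph 0-1-2-3: its Fiedler vector orders the
# vertices monotonically along the path.
W = np.diag([1.0] * 3, 1)
W = W + W.T
L = np.diag(W.sum(axis=1)) - W
Y = spectral_embedding(L, 1)
order = np.argsort(Y[0])
```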

### 3.2 Curvature-aware Manifold Learning

As analyzed above, one critical assumption of traditional manifold learning algorithms is that the embedded manifold is isometric to Euclidean space. For this type of algorithm, the similarity between two neighboring points is measured by Euclidean distance, which clearly overestimates the similarity when the manifold is highly curved.

In LTSA [5], the authors analyzed the reconstruction error theoretically and found that it is strongly influenced by the curvature of the sub-manifold $\mathcal{M}$: when the sub-manifold is highly curved in the high dimensional feature space, the reconstruction error becomes large. Their analysis shows that accurate determination of the local tangent space depends on several factors: the curvature information embedded in the Hessian matrices, the local sampling density, and the noise level of the data points. So for LTSA it is necessary to account for the curvature of the sub-manifold during dimension reduction.

Beyond the reconstruction error analysis of LTSA, our method aims to improve traditional manifold learning algorithms by adding curvature information. In this paper we focus on improving two algorithms, LLE and LEP, and give a detailed theoretical analysis of the improved versions CA-LLE and CA-LEP. As in other local-structure-preserving methods, we divide the sub-manifold into a set of local patches, one per point, and take the patch $U_i$ as the running example. As shown above, we consider the local patch structure in the quadratic polynomial vector space to obtain curvature information: in the local patch $U_i$, the monomials $\{u_1, \dots, u_d\} \cup \{u_i u_j\}_{i \le j}$ span the local polynomial vector space, and projecting the original input data points onto this space yields the projection coefficients $B_{ij} = (\tau_{ij}, \mathrm{vec}(H_{ij}))$. The local curvature information of $U_i$ is carried by the quadratic component $\mathrm{vec}(H_{ij})$. In the following, we give a detailed description of our CAML algorithm.

CAML Algorithm Procedures:

1. Input a set of data points $\{x_1, \dots, x_N\}$. This step is the same as the first step of traditional manifold learning: we use the $K$-nearest-neighbor method to divide the sub-manifold into a set of local patches, one per point, under the Euclidean metric.
2. Unlike traditional manifold learning algorithms, which project local patches onto local tangent spaces, we project each local patch into a second-order polynomial vector space and obtain the new local coordinate representations $B_{ij} = (\tau_{ij}, \mathrm{vec}(H_{ij}))$ by Eq. (7). The curvature information at each point is embedded in $\mathrm{vec}(H_{ij})$.
3. Use the new local representations $B_{ij}$ to construct the local geometric structure, represented by a weight matrix $W_i$, in each local patch $U_i$. This is the most critical step of CAML: curvature information enters the reconstruction of the local weight matrix $W_i$.
4. After constructing the curvature-aware weight matrix $W$, use it to reconstruct the representations $Y$ in the lower dimensional Euclidean space $\mathbb{R}^d$. $Y$ is learnt by minimizing the reconstruction error function in Eq. (8) under some normalization constraints. This step is the same as the third step of traditional manifold learning.
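The four steps can be sketched end to end as follows. This is our own minimal illustration under simplifying assumptions: quadratic monomial features of the tangent coordinates stand in for the fitted Hessian part of $B_{ij}$, the center's coordinates are taken as zero, and a dense eigensolver is used.

```python
import numpy as np

def caml_lep_sketch(X, d, K, sigma=1.0):
    """Toy CAML pipeline with curvature-aware LEP-style weights."""
    N = X.shape[0]
    # Step 1: K-nearest-neighbor patches under the Euclidean metric.
    D2 = ((X[:, None] - X[None])**2).sum(-1)
    np.fill_diagonal(D2, np.inf)
    nbrs = np.argsort(D2, axis=1)[:, :K]
    W = np.zeros((N, N))
    for i in range(N):
        # Step 2: local PCA frame; tangent coordinates of the neighbors.
        P = X[nbrs[i]] - X[i]
        _, _, Vt = np.linalg.svd(P, full_matrices=True)
        tau = (P @ Vt.T)[:, :d]
        # quadratic monomials standing in for the Hessian coordinates
        quad = np.einsum('ki,kj->kij', tau, tau).reshape(K, -1)
        B = np.hstack([tau, quad])
        # Step 3: curvature-aware heat-kernel weights (Eq. 9), with B_i0 = 0.
        W[i, nbrs[i]] = np.exp(-(B**2).sum(axis=1) / (2 * sigma**2))
    W = np.maximum(W, W.T)                 # symmetrize the weight graph
    # Step 4: spectral embedding from the graph Laplacian.
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)
    return vecs[:, 1:d+1].T

# Embed a one-dimensional curve sampled in R^3 into one dimension.
rng = np.random.default_rng(2)
t = rng.uniform(0.0, 1.0, size=60)
X = np.column_stack([np.cos(2*t), np.sin(2*t), t])
Y = caml_lep_sketch(X, d=1, K=8)
```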

We consider the improvements of LEP and LLE under our curvature-aware framework as two examples:

For Curvature-aware LEP:

$$W_{ij} = \begin{cases} \exp\left(-\dfrac{\|B_{i0} - B_{ij}\|^2}{2\sigma^2}\right), & x_{ij} \in U_i, \\ 0, & x_{ij} \notin U_i, \end{cases} \tag{9}$$

For Curvature-aware LLE: the weight $W_{ij}$ in the local patch $U_i$ is obtained by minimizing the following objective:

$$\arg\min_{W} \left\| B_{i0} - \sum_j W_{ij} B_{ij} \right\|^2. \tag{10}$$

When $B_{ij}$ is used as the local coordinate representation of $x_{ij}$, the curvature information of the local patches enters the local weight matrix $W$. A detailed theoretical analysis is given in the following section.
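For CA-LLE, Eq. (10) with the sum-to-one constraint is the standard LLE weight problem solved through the local Gram matrix, only with the curvature-aware coordinates $B_{ij}$ in place of raw points. A sketch under that reading (the regularization term is our own, standard-practice addition):

```python
import numpy as np

def patch_weights(b0, B, reg=1e-9):
    """Minimize ||b0 - sum_j w_j B_j||^2 subject to sum_j w_j = 1 via the
    regularized local Gram matrix, as in standard LLE.
    b0: (p,) center coordinates; B: (K, p) neighbor coordinates."""
    Z = B - b0                                  # shift the center to the origin
    G = Z @ Z.T
    G += reg * np.trace(G) * np.eye(len(B)) / len(B)   # guard against singular G
    w = np.linalg.solve(G, np.ones(len(B)))
    return w / w.sum()                          # enforce the affine constraint

# If b0 is an affine combination of the neighbors, the true weights come back.
B = np.eye(3)
w_true = np.array([0.2, 0.3, 0.5])
b0 = w_true @ B
w = patch_weights(b0, B)
```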

## 4 Algorithm Analysis

We consider one local patch $U_i$ as an example in analyzing our local curvature-aware manifold learning algorithm. Different from traditional locally preserving MAL algorithms, our method projects the original data points into the local polynomial vector space; the corresponding local projection of $x_{ij}$ is $B_{ij} = (\tau_{ij}, \mathrm{vec}(H_{ij}))$.

### 4.1 Curvature-aware LEP

In the polynomial vector space, the weight between two neighboring points is given as:

$$W_{ij} = \exp\left(-\frac{\|B_{i0} - B_{ij}\|^2}{2\sigma^2}\right) = \exp\left(-\frac{\|\tau_{i0} - \tau_{ij}\|^2}{2\sigma^2}\right) \cdot \exp\left(-\frac{\|H_{i0} - H_{ij}\|_F^2}{2\sigma^2}\right), \tag{11}$$

where $B_{ij} = (\tau_{ij}, \mathrm{vec}(H_{ij}))$ and $H_{ij}$ denotes the Hessian matrix at the point $x_{ij}$. Under the new local normal coordinate frame of $U_i$, the coordinate of $x_{i0}$ is zero, so $\tau_{i0} = 0$ and, obviously, $H_{i0} = 0$.

The Hessian matrix is symmetric; its eigenvalue decomposition $H_{x_j} = U^T \Lambda_j U$ gives:

$$\|H_{x_j}\|_F^2 = \|U^T \Lambda_j U\|_F^2 = \|\Lambda_j\|_F^2, \tag{12}$$

where $\Lambda_j$ is the eigenvalue matrix of $H_{x_j}$. In Riemannian geometry, each eigenvalue of $H_{x_j}$ is a principal curvature along the corresponding principal direction. Based on the above analysis, the weight in Eq. (11) becomes:

$$W_{ij} = \exp\left(-\frac{\|\tau_{i0} - \tau_{ij}\|^2}{2\sigma^2}\right) \cdot \exp\left(-\frac{\|\Lambda_j\|_F^2}{2\sigma^2}\right). \tag{13}$$

This is equivalent to adding a curvature penalty to the similarity weight $W_{ij}$.
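Two facts used here can be checked numerically (our own sanity check, with a made-up value for the tangent distance): the Frobenius norm in Eq. (12) is invariant under the eigendecomposition, and the curvature factor in Eq. (13) can only shrink the LEP weight.

```python
import numpy as np

# Eq. 12: the Frobenius norm of a symmetric Hessian equals that of its
# eigenvalue matrix, so the penalty depends only on principal curvatures.
rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4))
H = (A + A.T) / 2                    # a symmetric Hessian
lam = np.linalg.eigvalsh(H)          # principal curvatures
frob_H = np.linalg.norm(H, 'fro')
frob_lam = np.linalg.norm(lam)

# Eq. 13: the CA-LEP weight is the plain LEP weight times a curvature
# penalty exp(-||Lambda_j||_F^2 / (2 sigma^2)) <= 1.
sigma, tau_dist2 = 1.0, 0.5          # tau_dist2 is an arbitrary example value
w_lep = np.exp(-tau_dist2 / (2 * sigma**2))
w_ca = w_lep * np.exp(-(lam**2).sum() / (2 * sigma**2))
```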

Theorem 4.1. Let the reconstruction error under our curvature-aware weight matrix $W$ be $E$, and the reconstruction error under traditional LEP [1] be $\tilde{E}$. Then we have

$$\|E\|_F \le \|\tilde{E}\|_F. \tag{14}$$

Proof: The weight matrix under our CAML is defined as in Eq. (13):

$$W_{ij} = \exp\left(-\frac{\|\tau_{i0} - \tau_{ij}\|^2}{2\sigma^2}\right) \cdot \exp\left(-\frac{\|\Lambda_j\|_F^2}{2\sigma^2}\right),$$

and the weight under the traditional LEP algorithm is defined as:

$$\tilde{W}_{ij} = \exp\left(-\frac{\|\tau_{i0} - \tau_{ij}\|^2}{2\sigma^2}\right).$$

Obviously $W_{ij} \le \tilde{W}_{ij}$. The corresponding Laplace matrices are defined as $L = D - W$ and $\tilde{L} = \tilde{D} - \tilde{W}$. Therefore, we have:

$$\lambda_i(L) \le \lambda_i(\tilde{L}), \quad i = 1, 2, \cdots, N,$$

where $N$ is the number of input data points.

For LEP, the lower dimensional representations are obtained from the eigenvectors of the $d$ smallest eigenvalues of the Laplace matrix, and the reconstruction error is measured by the sum of those smallest eigenvalues:

$$\|E\|_F = \sum_{i=1}^{N} \|x_i - f(y_i)\| = \sum_{i=1}^{d} \lambda_i.$$

We have shown that each eigenvalue of $L$ is no larger than the corresponding eigenvalue of $\tilde{L}$, so

$$\|E\|_F \le \|\tilde{E}\|_F.$$

Therefore, when the curvature information of the sub-manifold is taken into account, the reconstruction error becomes lower.

### 4.2 Curvature-aware LLE

In each local patch $U_i$, we compute the local linear combination structure by minimizing the following objective:

$$\Phi_i = \left\| B_{i0} - \sum_{j=1}^{K} W_{ij} B_{ij} \right\|^2, \tag{15}$$

where $\sum_{j=1}^{K} W_{ij} = 1$.

Eq. (15) can be rewritten as:

$$\Phi_i = \left\| \tau_{i0} - \sum_{j=1}^{K} W_{ij} \tau_{ij} \right\|^2 + \left\| H_{i0} - \sum_{j=1}^{K} W_{ij} H_{ij} \right\|^2. \tag{16}$$

Traditional LLE minimizes only the first term of $\Phi_i$; our method adds a term measuring the linear combination of the Hessian matrices.

In the following, we give a theoretical derivation explaining why the second, Hessian term of $\Phi_i$ is necessary.

First we give the Taylor expansion of the embedding map $f$ in the local patch $U_i$:

$$f(u) = f(0) + u^T \nabla f + \frac{1}{2} u^T H u + o\left(\|u\|^2\right). \tag{17}$$

From this Taylor expansion, we obtain the linear relationship between $x_{i0}$ and its neighbors:

$$f(0) - \sum_j W_{ij} f(u_{ij}) \approx f(0) - \sum_j W_{ij} f(0) - \sum_j W_{ij} u_{ij}^T \nabla f - \frac{1}{2} \sum_j W_{ij} u_{ij}^T H u_{ij}. \tag{18}$$

Since $\sum_j W_{ij} = 1$, the $f(0)$ terms cancel, and because the weights reconstruct the center in tangent coordinates ($\tau_{i0} = 0 \approx \sum_j W_{ij}\tau_{ij}$), the gradient term also vanishes, giving:

$$f(0) - \sum_j W_{ij} f(u_{ij}) \approx -\frac{1}{2} \sum_j W_{ij} u_{ij}^T H u_{ij}. \tag{19}$$

We have stated that the coordinate of $x_{i0}$ under this local normal coordinate frame is zero, so the corresponding Hessian term satisfies $H_{i0} = 0$. Therefore Eq. (19) can be written as:

$$f(0) - \sum_j W_{ij} f(u_{ij}) \approx \frac{1}{2} H_{i0} - \frac{1}{2} \sum_j W_{ij} H_{ij}. \tag{20}$$

Hence it is necessary for our method to add a Hessian term when constructing the local linear combination structure. Traditional LLE forms this linear combination in the local tangent space, while our method forms it in the local polynomial vector space so as to account for the local curvature information of $\mathcal{M}$.

### 4.3 Time Complexity Analysis

In this subsection, we analyze the time complexity of our algorithm relative to traditional manifold learning algorithms in terms of the number of data points $N$, the input dimension $D$, and the intrinsic dimension $d$. Compared with traditional manifold learning algorithms, the added cost of our algorithm lies mainly in computing the Riemannian curvature information of the data, whose main step is estimating the local analytic structure by fitting the second-order polynomial of Eq. (7). We obtain the Riemannian curvature of each local patch from the eigenvalues of its Hessian matrix; each Hessian is of size $d \times d$, so one eigenvalue decomposition costs $O(d^3)$ and the full sample costs $O(N d^3)$. In general, the intrinsic dimension $d$ is far smaller than the input dimension $D$. For comparison, merely finding the $K$ nearest neighbors of all samples already costs $O(D N^2)$ in the naive implementation.

In short, the total time cost of CAML is only slightly higher than that of traditional manifold learning algorithms, and when the number of samples is especially large the added cost is negligible.

## 5 Experiments

In this section, we compare our algorithm CAML with several traditional MAL algorithms on four synthetic databases (Swiss Roll, Punctured Sphere, Gaussian, and Twin Peaks [16]) as well as two real-world data sets. On the synthetic databases we study two tasks: dimension reduction and parameter sensitivity analysis. On the real-world data sets we compare the classification performance of our algorithm with that of traditional MAL algorithms.

### 5.1 Topology Structure

Before dimension reduction, we first analyze the topological structure of the four synthetic databases, all generated from the Matlab code 'mani.m' [16]; each database contains points distributed on the corresponding synthetic manifold. The Swiss Roll is a flat manifold, locally isometric to Euclidean space, so traditional manifold learning algorithms can uncover its intrinsic structure accurately. The Punctured Sphere points lie on a two-dimensional sphere embedded in $\mathbb{R}^3$; the curvature of this sphere is non-zero everywhere, so it is neither locally nor globally isometric to Euclidean space. The Twin Peaks manifold is a highly curved two-dimensional manifold embedded in three-dimensional Euclidean space; it is not locally isometric to Euclidean space, so traditional manifold learning algorithms cannot accurately uncover its intrinsic structure. The two-dimensional Gaussian manifold is likewise not isometric to Euclidean space, its Gauss curvature being non-zero.

Based on this analysis of the four synthetic manifolds, we compare our curvature-aware manifold learning algorithm with traditional manifold learning algorithms in the next two subsections to emphasize the need for curvature information.

### 5.2 Dimension Reduction

In this subsection, we compare CAML with seven traditional MAL algorithms (MDS, PCA, IsoMap, LLE, LEP, DFM, LTSA) on these four datasets. Each dataset is mapped to two-dimensional Euclidean space, and we then analyze the neighborhood preserving ratio (NPR) [18] of the different algorithms. Table 2 shows the comparison results for a fixed neighbor-size parameter $K$. The neighborhood preserving ratio (NPR) is defined as:

$$\mathrm{NPR} = \frac{1}{KN} \sum_{i=1}^{N} \left| N_K(x_i) \cap N_K(y_i) \right|, \tag{21}$$

where $N_K(x_i)$ is the set of indices of the $K$ nearest samples of $x_i$, $N_K(y_i)$ is the set of indices of the $K$ nearest samples of $y_i$, and $|\cdot|$ denotes the number of points in the intersection.
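Eq. (21) can be computed directly; this implementation is our own sketch (brute-force neighbor search), checked on a rotation, which preserves every neighborhood:

```python
import numpy as np

def npr(X, Y, K):
    """Neighborhood preserving ratio (Eq. 21): the average fraction of each
    point's K nearest input-space neighbors that remain among its K nearest
    neighbors in the embedding."""
    def knn(Z):
        D2 = ((Z[:, None] - Z[None])**2).sum(-1)
        np.fill_diagonal(D2, np.inf)
        return np.argsort(D2, axis=1)[:, :K]
    nx, ny = knn(X), knn(Y)
    overlap = sum(len(set(a.tolist()) & set(b.tolist())) for a, b in zip(nx, ny))
    return overlap / (K * len(X))

rng = np.random.default_rng(4)
X = rng.normal(size=(50, 2))
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
score = npr(X, X @ Q.T, K=5)         # an isometry should score ~1.0
```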

Table 2 shows that on all datasets except the Swiss Roll, the NPRs of our CA-LEP and CA-LLE exceed those of the traditional MAL algorithms. The Swiss Roll is a flat Riemannian manifold, so our algorithm offers almost no advantage on it. The Gaussian dataset is a symmetric, convex manifold, so the NPRs of all algorithms are high. On the Punctured Sphere and Twin Peaks, the NPRs of our algorithms clearly outperform those of the traditional MAL algorithms. These results demonstrate that our CAML algorithm is more stable and better uncovers the local structure of the data points.

### 5.3 Parameter Sensitivity Analysis

As analyzed above, traditional MAL algorithms are sensitive to several parameters, e.g. the neighbor-size parameter $K$ and the intrinsic dimension $d$. There is no fully reliable method for estimating the intrinsic dimension exactly, so in this paper we assume that the intrinsic dimension of the sub-manifold is unique and can be approximately estimated [19]. In this experiment we mainly analyze the sensitivity of the neighbor-size parameter $K$.

We compare the neighborhood preserving ratios of the different manifold learning algorithms over a range of values of $K$. All experiments use two datasets (Punctured Sphere and Twin Peaks) with 2000 data points; neither synthetic manifold is isometric to Euclidean space, which lets us assess the effect of curvature information on dimension reduction. To highlight the improvement of our algorithm, we compare CAML against five traditional manifold learning algorithms (LEP, IsoMap, LLE, HLLE, and LTSA). The final comparison results are shown in Figure 1. Our method outperforms the traditional MAL algorithms over the tested range of $K$. Moreover, plots (a) and (b) show that traditional MAL algorithms are very sensitive to the neighbor-size parameter: their NPR curves fluctuate sharply as $K$ varies, whereas the NPRs of CA-LEP and CA-LLE grow steadily as $K$ increases.

### 5.4 Real World Experiments

In this experiment, we apply our algorithm to two real-world data sets: the Extended Yale Face Database B and the USPS database. The main purpose is to test classification accuracy in the lower dimensional space after manifold learning algorithms have reduced the dimension of the data points.

The Extended Yale Face Database B (YFB DB for short) contains single-light-source images of individuals, each seen under a number of near-frontal poses and different illuminations; for every subject in a particular pose, an image with ambient illumination was also captured. The face region in each image is resized to a fixed resolution, which determines the original dimension of this database.

The USPS database consists of images of handwritten digits obtained from the scanning of envelopes by the U.S. Postal Service. The original scanned digits are binary and of different sizes and orientations; the images here have been deslanted and size-normalized, resulting in 16×16 grayscale images, so the original dimension of this database is 256.

In this experiment, we first analyze the curvature distributions of the YaleB and USPS databases, shown in Figure 1. From Figure 1, we can see that the embedded manifold of the USPS database is highly curved in the higher dimensional Euclidean space: the curvature value at almost every data point of the USPS database is large. One reason is that handwritten digits from different classes vary greatly. For the Extended Yale Face B database, however, the curvature values are concentrated in a narrow range close to zero, which means the local geometric structure of the YaleB database is close to a flat space.
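The paper's per-point curvature values come from the second fundamental form; as a rough illustrative proxy only (not the Riemannian curvature the paper computes), one can gauge how far each local patch deviates from flatness by the residual variance of local PCA, since a locally flat patch has near-zero residual. The function name and parameters below are our own.

```python
import numpy as np

def local_flatness_residual(X, k=10, d=2):
    """For each point, the fraction of its k-neighborhood's variance
    lying outside the top-d principal directions (0 = locally flat).
    A rough proxy for local curvedness, not Riemannian curvature."""
    n = X.shape[0]
    dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(dist, np.inf)
    nbrs = np.argsort(dist, axis=1)[:, :k]
    res = np.empty(n)
    for i in range(n):
        P = X[nbrs[i]] - X[nbrs[i]].mean(axis=0)   # centered local patch
        s = np.linalg.svd(P, compute_uv=False) ** 2  # local variances
        res[i] = s[d:].sum() / s.sum()
    return res
```

On a flat plane embedded in 3D this residual is essentially zero at every point, while on a sphere it is strictly positive, mirroring the qualitative contrast between the YaleB and USPS curvature distributions described above.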

In the second step of this experiment, we compare our algorithm with traditional manifold learning algorithms on these two databases. The experiment is designed as follows: first, we use each manifold learning algorithm to reduce the dimension of the data; second, in the low dimensional space, we use a nearest neighbor classifier to measure the classification accuracy on each database.
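The two-step protocol above (reduce dimension, then classify with a nearest-neighbor rule) can be sketched as follows. PCA stands in here for the manifold learning step, since the paper also uses it as a baseline; all function names are illustrative.

```python
import numpy as np

def pca_embed(X_train, X_test, d):
    # Fit a linear embedding on the training set, apply it to both sets.
    mu = X_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_train - mu, full_matrices=False)
    W = Vt[:d].T                      # top-d principal directions
    return (X_train - mu) @ W, (X_test - mu) @ W

def nn_accuracy(Z_train, y_train, Z_test, y_test):
    # 1-nearest-neighbor classification accuracy in the embedded space.
    d = ((Z_test[:, None, :] - Z_train[None, :, :]) ** 2).sum(-1)
    pred = y_train[np.argmin(d, axis=1)]
    return float((pred == y_test).mean())
```

Swapping `pca_embed` for any manifold learning embedding (with out-of-sample extension) reproduces the evaluation used for Table 3.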

For the YFB database, we choose a fixed number of images per subject. We run the experiment four times with each algorithm. In each run, we randomly choose a subset of the images per subject as the training set and use the remaining images per subject as the testing set. The classification accuracies of the different algorithms are shown in the upper part of Table 3. The main purpose of this experiment is to measure the improvement of our curvature-aware manifold learning algorithm over traditional manifold learning algorithms that ignore curvature information. From Table 3, the classification results of the manifold learning algorithms mostly outperform the linear dimension reduction algorithm PCA. In addition, the classification results of LPP and LEP are notably higher than those of LLE. One main reason is that LLE assumes each local patch of the data is a linear space and only recovers linear reconstruction relationships. Among these results, we focus on the comparison between traditional manifold learning algorithms and our curvature-aware algorithm. After adding curvature information to LLE, CA-LLE slightly outperforms LLE; a main reason for the small margin is that the curvature distribution of the YFB database is close to zero. Overall, the performance of our curvature-aware manifold learning algorithm is better than that of all the other algorithms.

For the USPS database, we choose a fixed number of images per class. We again run the experiment four times with each algorithm. As with the YFB DB, in each run we randomly choose a subset of the images per class for training and use the rest for testing, drawing a different random training set each time. The classification accuracies of the different algorithms are shown in the lower part of Table 3. From these results, we can see that the traditional manifold learning algorithms outperform PCA in every case. Our curvature-aware manifold learning additionally exploits the curvature information of the data points, and its classification accuracies are higher than those of all the other algorithms. Among these results, our focus is the comparison of LEP and LLE against CA-LEP and CA-LLE: our method significantly outperforms both.
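The repeated random-split protocol used for both databases can be sketched as follows; the exact per-class training counts are not reproduced here, and the helper name `repeated_split_accuracy` is ours. The `evaluate` callback is any embed-then-classify routine, such as the nearest-neighbor evaluation described above.

```python
import numpy as np

def repeated_split_accuracy(X, y, n_train_per_class, evaluate,
                            n_runs=4, seed=0):
    """Average an evaluation function over several random train/test
    splits with a fixed number of training samples per class."""
    rng = np.random.default_rng(seed)
    accs = []
    for _ in range(n_runs):
        train_idx = []
        for c in np.unique(y):
            idx = np.flatnonzero(y == c)
            # Sample training images for this class without replacement.
            train_idx.extend(rng.choice(idx, n_train_per_class,
                                        replace=False))
        train_idx = np.array(train_idx)
        test_mask = np.ones(len(y), dtype=bool)
        test_mask[train_idx] = False
        accs.append(evaluate(X[train_idx], y[train_idx],
                             X[test_mask], y[test_mask]))
    return float(np.mean(accs))
```

Reporting the mean over the four runs, as Table 3 does, reduces the variance introduced by any single random split.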

In summary, when the curvature information of the data points is added into manifold learning, our algorithm outperforms the other traditional manifold learning algorithms in every case.

## 6 Conclusions and Future Works

To precisely describe the continuous variation of a point cloud, one critical step of manifold learning is to assume that the dataset is distributed on a lower dimensional embedded manifold. The mathematical theory of manifolds can then be brought to bear on these datasets for tasks such as dimensionality reduction, classification, clustering, and recognition. Whether the manifold structure is uncovered exactly or not directly impacts the learning results. Traditional MAL algorithms consider only the distance metric. However, a general Riemannian manifold may not be isometric to Euclidean space. Our method therefore extracts a higher order geometric quantity, the Riemannian curvature of the Riemannian submanifold, and uses curvature information together with the distance metric to uncover the intrinsic geometric structure of local patches. Extensive experiments have shown that our method is more stable than other traditional manifold learning algorithms. To our knowledge, this is the first attempt to incorporate curvature information of high dimensional data points into dimensionality reduction.

In future work, we will try to use Ricci flow to dynamically uncover the intrinsic curvature structure of the submanifold. We will further study Ricci flow theory and apply the results to manifold learning.

## Acknowledgments

This work is supported by the National Key Research and Development Program of China under grant 2016YFB1000902, NSFC projects No. 61232015, No. 61472412, and No. 61621003, the Beijing Science and Technology Project "Machine Learning based Stomatology", and the Tsinghua-Tencent-AMSS Joint Project "WWW Knowledge Structure and its Application".

## References

• (1) M. Belkin and P. Niyogi, Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering. In NIPS, vol.14, pp.585-591, 2001.
• (2) D.L. Donoho and C. Grimes, Hessian Eigenmaps: Locally Linear Embedding Techniques for High-dimensional Data. Proceedings of the National Academy of Sciences of the United States of America, vol.100, no.10, pp.5591-5596, 2003.
• (3) J. Tenenbaum, V. de Silva and J. Langford, A Global Geometric Framework for Nonlinear Dimensionality Reduction. Science, vol.290, pp.2319-2323, 2000.
• (4) Xiaofei He and P. Niyogi, Locality Preserving Projections. In NIPS, 2003.
• (5) Z. Zhang and H. Zha, Principal Manifolds and Nonlinear Dimension Reduction via Local Tangent Space Alignment. SIAM J. Scientific Computing, vol.26, no.1, pp.313-338, 2005.
• (6) S. Roweis and L. Saul, Nonlinear Dimensionality Reduction by Locally Linear Embedding. Science, vol.290, pp.2323-2326, 2000.
• (7) K. I. Kim, J. Tompkin and C. Theobalt, Curvature-aware Regularization on Riemannian Submanifolds. In ICCV, 2013.
• (8) Y. Goldberg, A. Zakai, D. Kushnir and Y. Ritov, Manifold Learning: the Price of Normalization. Journal of Machine Learning Research, vol.9, pp.1909-1939, 2008.
• (9) R.R. Coifman and S. Lafon, Diffusion Maps. Applied and Computational Harmonic Analysis, vol.21, no.1, pp.5-30, 2006.
• (10) P. Dollar, V. Rabaud and S. Belongie, Non-isometric Manifold Learning: Analysis and Algorithm. In ICML, pp.241-248, 2007.
• (11) T. Cox and M. Cox, Multidimensional Scaling. London, 1994.
• (12) I.T. Jolliffe, Principal Component Analysis. Springer-Verlag, New York, 1989.
• (13) J.M. Lee, Introduction to Smooth Manifolds. Springer-Verlag, New York, 2nd, 2003.
• (14) J.M. Lee, Riemannian Manifolds. Springer, New York, 1997.
• (15) P. Petersen, Riemannian Geometry. Springer, New York, 1998.
• (16) T. Wittman, Manifold Learning Matlab Demo. http://www.math.umn.edu/~wittman/mani/, 2005.
• (17) Binbin Lin, Xiaofei He, Chiyuan Zhang and Ming Ji, Parallel Vector Field Embedding. Journal of Machine Learning Research, vol.14, pp.2945-2977, 2013.
• (18) G. Sanguinetti, Dimensionality Reduction of Clustered Data Sets. IEEE TPAMI, vol.30, no.3, 2008.
• (19) P. Mordohai and G. Medioni, Dimensionality Estimation, Manifold Learning and Function Approximation using Tensor Voting. Journal of Machine Learning Research, vol.11, pp.411-450, 2010.
• (20) X. Xing, K. Wang, Z. Lv, Y. Zhou and S. Du, Fusion of Local Manifold Learning Methods. IEEE Signal Processing Letters, vol.22, no.4, pp.395-399, 2015.
• (21) V.D. Silva and J.B. Tenenbaum, Global versus Local Methods in Nonlinear Dimensionality Reduction. In NIPS, pp.705-712, 2003.
• (22) W. Xu, E.R. Hancock and R.C. Wilson, Rectifying the Ricci Flow Embedding. pp.579-588, 2010.
• (23) W. Xu, E.R. Hancock and R.C. Wilson, Ricci Flow Embedding for Rectifying Non-Euclidean Dissimilarity Data. Pattern Recognition, vol.47, pp.3709-3725, 2014.