K-means clustering for efficient and robust registration of multi-view point sets
Efficiency and robustness are two important performance criteria for the registration of multi-view point sets. To address these two issues, this paper casts multi-view registration as a clustering problem, which can be solved by an extended K-means clustering algorithm. Before the clustering, all the centroids are uniformly sampled from the initially aligned point sets involved in the multi-view registration. Then, the two standard K-means steps are used to assign each point to a single cluster and to update each cluster centroid. Subsequently, the shape formed by all cluster centroids is used to sequentially estimate the rigid transformation for each point set. These two standard K-means steps and the transformation estimation step constitute the extended K-means clustering algorithm, which achieves both the clustering and the multi-view registration by iteration. To show its superiority, the proposed approach has been tested on public data sets and compared with state-of-the-art algorithms. Experimental results illustrate its good efficiency and robustness for the registration of multi-view point sets.
As a fundamental issue in many areas, the problem of point set registration has attracted immense attention in computer vision [1, 2], computer graphics [3, 4], and robotics [5, 6]. It addresses the problem of transformation estimation between different point sets. According to the number of point sets involved, this problem can be divided into pair-wise registration and multi-view registration.
Usually, the pair-wise registration problem is solved by the iterative closest point (ICP) algorithm or its variants [8, 9, 10]. To achieve registration, most of these approaches alternately build hard correspondences and estimate the transformation. These approaches are efficient but may not be very accurate. Instead, some registration approaches [11, 12, 13, 14] replace the hard assignment with a soft assignment so as to obtain more accurate results. As the soft assignment must be built from each point to all points in the opposite point set, these approaches are time-consuming. Besides, most registration approaches are only locally convergent. To obtain the desired global minimum, a particle filter [15, 16] or a genetic algorithm [17, 18] can be combined with them to estimate the transformation. Moreover, 3D features can be extracted from the point sets and matched to provide an initial transformation for the pair-wise registration [19, 20]. As the basis of multi-view registration, the problem of pair-wise registration has been well addressed.
Compared to pair-wise registration, multi-view registration is more difficult and has attracted comparatively less attention. The original method for multi-view registration is to sequentially align and merge two point sets until all point sets are merged into one model. Although this approach may be efficient, it suffers from the error accumulation problem. To address this issue, the pair-wise registration algorithm can be sequentially used to align one point set to the coarse model constructed from all point sets . The results are then used to update the coarse model, which can be further used to align each point set. Because the reconstructed model contains so many points, the efficiency of these approaches needs to be further improved.
Besides, Govindu and Pooja proposed the motion averaging algorithm [22, 23], which recovers all rigid transformations for multi-view registration simultaneously from a set of relative motions estimated by pair-wise registration. Given reliable and accurate relative motions [24, 25], the motion averaging algorithm can achieve accurate results. The multi-view registration can also be cast as a low-rank and sparse (LRS) matrix decomposition problem . For scan pairs with high overlap percentages, the relative motions can be estimated and concatenated into a large matrix, which has missing data corresponding to the scan pairs with low overlap percentages. By the LRS decomposition, the matrix can be completed and the multi-view registration results can be recovered. Compared to other approaches, it is more likely to be affected by the sparsity of the incomplete matrix.
Recently, Evangelidis and Horaud took the view that all the points involved in multi-view registration are drawn from a central Gaussian mixture . Therefore, the multi-view registration can be cast as a clustering problem, where the expectation maximization (EM) algorithm is used to estimate both the Gaussian mixture model (GMM) parameters and the rigid transformations that optimally align the point sets. Although this approach is very accurate, it must estimate many parameters and is time-consuming.
Actually, we can assume that all the points are drawn from clusters, which are represented by the same number of centroids. This idea initially appeared in , where the K-means clustering algorithm was used to refine the 3D model from multi-view registered range images. However, that work supposes that the multi-view registration has already been accomplished. In this paper, the multi-view registration itself is cast as a clustering problem. To achieve the clustering, the proposed approach starts with initial estimates for the centroids, uniformly sampled from all initially posed point sets. Then the two standard steps of the K-means algorithm are used to assign each point to a single cluster and then update the cluster centroids. As the point sets are not yet well aligned, the shape formed by all the updated centroids is used to sequentially estimate the rigid transformation for each point set by pair-wise registration. To obtain the desired results, the K-means steps and the transformation estimation are alternately and iteratively applied to all point sets.
The remainder of this paper is organized as follows. Section 2 gives a brief introduction to the K-means clustering algorithm. Following that, Section 3 proposes the novel approach for registration of multi-view point sets. In Section 4, the proposed approach is tested and evaluated on some public datasets. Section 5 concludes this paper.
II K-means clustering algorithm
As an unsupervised learning method, the K-means algorithm is very effective for clustering. Given the number of clusters and the data set, this algorithm starts with initial estimates for the centroids, which can either be randomly selected or generated from the data set. It achieves the clustering by iterating two steps.
(1) Data assignment
(2) Centroid update
Usually, good clustering results can be obtained by iterating these two steps until some convergence criteria are met.
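The two steps can be illustrated with a minimal NumPy sketch. It is only an illustration under assumptions made here: the function name `kmeans`, the brute-force distance computation, and the stopping rule (centroids no longer move) are choices for this example and are not taken from the paper.

```python
import numpy as np

def kmeans(data, centroids, max_iter=100):
    """Plain K-means: alternate data assignment and centroid update."""
    centroids = centroids.copy()
    for _ in range(max_iter):
        # (1) Data assignment: label each point with its nearest centroid.
        d2 = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # (2) Centroid update: move each centroid to the mean of its points.
        new_centroids = centroids.copy()
        for k in range(len(centroids)):
            members = data[labels == k]
            if len(members) > 0:
                new_centroids[k] = members.mean(axis=0)
        # Stop once the centroids no longer move (assumed convergence criterion).
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Toy usage: two well-separated 2D blobs, initial centroids taken from the data.
data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                 [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centroids, labels = kmeans(data, data[[0, 3]])
```

In this toy case the first three points end up in one cluster and the last three in the other, and each centroid settles at the mean of its blob.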
III Multi-view registration by K-means clustering
In this section, we will formulate the multi-view registration problem and present the motivation of the proposed approach. Then, we propose the extended K-means clustering algorithm for the registration of multi-view point sets.
III-A Problem formulation
The development of scanning equipment makes it possible to reconstruct precise 3D object models. Due to occlusion, an object cannot be scanned in its entirety from a single viewpoint. Therefore, scanners acquire point sets from different viewpoints so as to cover the entire object surface. These point sets should then be transformed into one common reference frame for the 3D model reconstruction.
Consider a collection of point sets, each consisting of range points acquired from one viewpoint. Given initial rigid transformations, the goal of multi-view registration is to estimate accurate rigid transformations between each point set and the reference frame. Without loss of generality, the reference frame can be attached to the first point set. Therefore, there is no need to estimate the first rigid transformation, which stays fixed during the multi-view registration.
Given an accurate model, the multi-view registration problem can be divided into multiple subproblems, in which each point set is registered pair-wise to the accurate model. However, no accurate model is available before the multi-view registration. What is available is just the coarse model reconstructed from the initially posed point sets, which contains many redundant points due to the overlapping areas among different point sets.
For the multi-view registration, we can suppose that all the points are drawn from clusters, which can be represented by the same number of centroids. These centroids make up an accurate model. Therefore, if all the points can be well clustered, the accurate model can be constructed from the cluster centroids. Different from other clustering problems, this problem contains both a clustering problem and a registration problem. The K-means algorithm is known to be very effective for clustering. To solve this combined problem, it should be extended so as to cluster points and estimate the rigid transformations simultaneously.
Accordingly, we propose a novel multi-view registration approach based on the extended K-means clustering algorithm. The flowchart of the proposed approach is displayed in Fig. 1. As shown in Fig. 1, given the initial rigid transformations, the proposed approach achieves accurate registration of multi-view point sets as follows:
(1). Generate all the initial centroids from initially aligned point sets.
(2). Based on the current registration results, assign each point to a single cluster and then update the centroids for all clusters.
(3). Based on the updated centroids, sequentially update the rigid transformations for each point set.
(4). Iterate Steps 2 and 3 until some stop criteria are met.
Subsequently, the extended K-means clustering algorithm will be presented to solve the registration of multi-view point sets.
III-C The extended K-means clustering algorithm
Given the clustering centroids, the multi-view registration can be formulated as the following least-squares (LS) problem:
Here, each aligned point is paired with its nearest cluster centroid, and each residual is weighted by a binary variable. If a cluster only contains points that belong to the point set being aligned, the binary variable is set to 0; otherwise, it is set to 1. Eq. (3) can be solved by a variant of the K-means clustering algorithm.
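To make the structure of such a least-squares objective concrete, one possible way of writing it is sketched below. The notation is an assumption introduced for illustration only and is not taken from the paper: $N$ point sets, $p_{i,j}$ the $j$th of the $N_i$ points in the $i$th set, $(R_i, t_i)$ the rigid transformation of the $i$th set, $m_k$ the $k$th centroid, $c(i,j)$ the index of the centroid nearest to the aligned point $R_i p_{i,j} + t_i$, and $b_{i,k} \in \{0, 1\}$ the binary variable.

```latex
\min_{\{R_i,\, t_i\}_{i=2}^{N}} \;
\sum_{i=1}^{N} \sum_{j=1}^{N_i}
b_{i,\,c(i,j)} \,
\bigl\| R_i\, p_{i,j} + t_i - m_{c(i,j)} \bigr\|_2^2
\quad \text{s.t.} \quad R_i^{\top} R_i = I_3, \;\; \det(R_i) = 1 .
```

In this form the first transformation $(R_1, t_1)$ is held fixed, and the centroids and assignments are treated as given while the transformations are estimated.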
Before the multi-view registration, initial centroids are required for the K-means clustering. As mentioned before, the initial centroids can either be randomly selected or generated from the data set. Given the initial rigid transformations, the coarse model can be reconstructed from all point sets involved in the multi-view registration as follows:
Then, the initial centroids of all clusters can be uniformly sampled from this coarse model. The shape formed by all centroids is similar to the coarse model. The only difference is the resolution of points.
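The initialization described above can be sketched as follows. The function names are hypothetical, and "uniformly sampled" is interpreted here as taking points at evenly spaced indices of the stacked coarse model, which is an assumption; uniform random sampling would be another reasonable reading.

```python
import numpy as np

def build_coarse_model(point_sets, rotations, translations):
    """Stack all point sets after applying their initial rigid transformations."""
    aligned = [P @ R.T + t for P, R, t in zip(point_sets, rotations, translations)]
    return np.vstack(aligned)

def sample_initial_centroids(coarse_model, num_clusters):
    """Pick num_clusters points at evenly spaced indices of the coarse model."""
    idx = np.linspace(0, len(coarse_model) - 1, num_clusters).astype(int)
    return coarse_model[idx]

# Toy usage: two small 3D point sets with identity initial transformations.
rng = np.random.default_rng(0)
point_sets = [rng.random((50, 3)), rng.random((70, 3))]
rotations = [np.eye(3), np.eye(3)]
translations = [np.zeros(3), np.zeros(3)]

coarse = build_coarse_model(point_sets, rotations, translations)
centroids = sample_initial_centroids(coarse, num_clusters=12)
```

Each point set is stored as an array with one 3D point per row, so applying a rotation to all points is written as `P @ R.T`.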
Given the initial rigid transformations, the extended K-means clustering algorithm achieves the registration of multi-view point sets by iteration. Each iteration includes the following three steps:
(1) Assign each point to a single cluster:
(2) Update the centroids for all clusters:
(3) Sequentially estimate each rigid transformation except the first one:
Similar to the standard K-means algorithm, Steps (1)-(3) are repeated until the iteration number exceeds the maximum value or none of the estimated rigid transformations changes noticeably between two iterations. Finally, the algorithm achieves the clustering and obtains the desired registration results for the multi-view point sets.
In fact, Eq. (5) is a nearest neighbor search problem, which can be efficiently solved by a search method based on the k-d tree. After data assignment, each cluster centroid can be directly calculated from all the points assigned to it. Besides, Eq. (7) can be solved by the singular value decomposition (SVD) method . The main difficulty lies in determining each binary variable. In multi-view registration, each point set covers some regions that the other point sets cannot cover. If these regions are used to update the rigid transformation of that point set itself in Eq. (7), the accuracy will go down. Therefore, these regions are invalid and should be eliminated. As these regions are covered by only one point set, the clusters lying in them contain fewer points than clusters in other regions, which may be covered by two or more point sets. Hence, the value of each binary variable can be determined from the number of points in the corresponding cluster. Here, if the number of points in a cluster is less than half of the mean number of points over all clusters, we set the binary variable to 0 so as to eliminate the regions covered by only one point set. Otherwise, we set it to 1.
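The thresholding rule for the binary variables can be written in a few lines. The sketch below assumes that cluster labels are stored as one integer per point; the function name `valid_cluster_flags` is hypothetical.

```python
import numpy as np

def valid_cluster_flags(labels, num_clusters):
    """Return a 0/1 flag per cluster: 0 if the cluster holds fewer points
    than half of the mean cluster size, 1 otherwise."""
    counts = np.bincount(labels, minlength=num_clusters)
    return (counts >= 0.5 * counts.mean()).astype(int)

# Toy usage: cluster sizes are 4, 3, 1 and 4, so the mean size is 3 and the
# threshold is 1.5; only cluster 2 falls below it.
labels = np.array([0, 0, 0, 0, 1, 1, 1, 2, 3, 3, 3, 3])
flags = valid_cluster_flags(labels, num_clusters=4)
```

Point pairs whose centroid carries a flag of 0 would then be left out when a rigid transformation is estimated.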
Based on the above description, we can summarize the extended K-means clustering algorithm for the multi-view registration as Algorithm 1.
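A minimal Python sketch of how the three steps might fit together is given below. It is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, a brute-force nearest-centroid search stands in for the k-d tree search, the initial centroids are taken at evenly spaced indices of the coarse model, and the stopping rule is a simple tolerance on the change of each transformation.

```python
import numpy as np

def nearest_centroid(points, centroids):
    """Brute-force nearest-centroid labels (stand-in for a k-d tree search)."""
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def rigid_fit(src, dst):
    """SVD-based least-squares R, t such that dst is close to src @ R.T + t."""
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_mean).T @ (dst - dst_mean)
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0  # avoid reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, dst_mean - R @ src_mean

def kmeans_registration(point_sets, rotations, translations, num_clusters,
                        max_iter=50, tol=1e-8):
    Rs = [R.copy() for R in rotations]
    ts = [t.copy() for t in translations]
    splits = np.cumsum([len(P) for P in point_sets])[:-1]
    # Initial centroids: evenly spaced samples of the coarse model.
    model = np.vstack([P @ R.T + t for P, R, t in zip(point_sets, Rs, ts)])
    centroids = model[np.linspace(0, len(model) - 1, num_clusters).astype(int)]
    for _ in range(max_iter):
        aligned = np.vstack([P @ R.T + t for P, R, t in zip(point_sets, Rs, ts)])
        # Step 1: assign each aligned point to its nearest centroid.
        labels = nearest_centroid(aligned, centroids)
        # Step 2: update each non-empty centroid as the mean of its points.
        counts = np.bincount(labels, minlength=num_clusters)
        sums = np.zeros_like(centroids)
        np.add.at(sums, labels, aligned)
        nonempty = counts > 0
        centroids[nonempty] = sums[nonempty] / counts[nonempty][:, None]
        # Binary flags: ignore clusters with fewer than half the mean size.
        valid = counts >= 0.5 * counts.mean()
        # Step 3: re-estimate every transformation except the first one.
        change = 0.0
        for i, lab in enumerate(np.split(labels, splits)):
            if i == 0:
                continue
            keep = valid[lab]
            R_new, t_new = rigid_fit(point_sets[i][keep], centroids[lab[keep]])
            change = max(change, np.abs(R_new - Rs[i]).max(),
                         np.abs(t_new - ts[i]).max())
            Rs[i], ts[i] = R_new, t_new
        if change < tol:
            break
    return Rs, ts, centroids

# Toy usage: the second view is the first one seen from a slightly different frame.
rng = np.random.default_rng(0)
base = rng.random((200, 3))
a = 0.05
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a), np.cos(a), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.02, -0.01, 0.01])
second = (base - t_true) @ R_true  # second @ R_true.T + t_true equals base
Rs, ts, centroids = kmeans_registration(
    [base, second], [np.eye(3), np.eye(3)], [np.zeros(3), np.zeros(3)],
    num_clusters=20)
```

The sketch reuses the cluster labels from Step 1 when pairing points with centroids in Step 3, which keeps each iteration to a single nearest-neighbor search.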
As shown in Algorithm 1, to efficiently achieve the multi-view registration, all points in the coarse reconstructed model are clustered so as to obtain a reasonable model formed by the cluster centroids. As this model consists only of the cluster centroids, the rigid transformation can be efficiently refined for each point set. To obtain good registration results, the clustering and the transformation estimation are alternately and iteratively applied to all point sets. In this approach, the only parameter is the number of clusters. This parameter has a great impact on the performance of the multi-view registration. If it is too small, the registration may be efficient but inaccurate, whereas if it is too large, the registration may be time-consuming and easily trapped in a local minimum. However, the appropriate number of clusters is unknown and varies with the multi-view point sets to be registered. In this paper, we set it to a fixed value during the experiments.
This section discusses the complexity of the proposed approach for registration of multi-view point sets. Since our approach is proposed for registration of multi-view point sets, the number of points is the central quantity. Like the standard K-means clustering algorithm, the extended K-means clustering algorithm accomplishes the clustering by iteration. Each iteration includes three steps:
(1) Assign each point to a single cluster. In this step, one cluster centroid is assigned to each point in the multi-view registration. To accelerate the assignment, the proposed approach uses a nearest neighbor search method based on the k-d tree, which leads to a complexity of .
(2) Update the centroids for all clusters. By sequentially traversing each point, all the cluster centroids can be updated. Therefore, this step introduces a complexity of .
(3) Sequentially estimate each rigid transformation except the first one. As shown in Eq. (7), point pairs are used to estimate one rigid transformation, so the estimation of rigid transformations can introduce a complexity of .
According to the above statement, the complexity of each individual operation is displayed in Table I.
To test the proposed approach, a set of experiments was conducted on 6 datasets in the Stanford 3D Scanning Repository . Table II displays some information about these datasets. These data sets contain the multi-view point sets and the ground truth of the transformations for the multi-view registration. To reduce the runtime of multi-view registration, all the raw point sets were down-sampled by a factor of 8 before the experiments. For comparison, a rotation error and a translation error are defined between the ground truth of each rigid transformation and the one estimated by the multi-view registration approach.
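As an illustration of how such error measures can be computed, the sketch below assumes one common choice: the Frobenius norm of the difference between estimated and ground-truth rotation matrices and the Euclidean norm of the difference between translations, both averaged over the point sets. These exact definitions and the function name `registration_errors` are assumptions made for this example.

```python
import numpy as np

def registration_errors(R_est, t_est, R_gt, t_gt):
    """Mean Frobenius-norm rotation error and mean Euclidean translation error."""
    e_R = np.mean([np.linalg.norm(Re - Rg, ord="fro") for Re, Rg in zip(R_est, R_gt)])
    e_t = np.mean([np.linalg.norm(te - tg) for te, tg in zip(t_est, t_gt)])
    return e_R, e_t

# Toy usage: one exact estimate and one with a translation offset of length 0.2.
R_gt = [np.eye(3), np.eye(3)]
t_gt = [np.zeros(3), np.ones(3)]
R_est = [np.eye(3), np.eye(3)]
t_est = [np.zeros(3), np.ones(3) + np.array([0.2, 0.0, 0.0])]
e_R, e_t = registration_errors(R_est, t_est, R_gt, t_gt)
```

With these toy inputs the rotation error is zero and the translation error is the average of 0 and 0.2.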
During experiments, some parameters are set as follows: and . All the compared approaches adopted the nearest-neighbor search method based on the k-d tree to establish correspondences. Experiments were implemented in MATLAB and performed on a four-core 3.6 GHz computer with 8 GB of memory.
As mentioned before, the reconstructed model is formed by the cluster centroids, which are initially sampled from all the aligned point sets. Since each point set covers some regions that other point sets cannot cover, the proposed approach eliminates these regions when estimating the rigid transformation of that point set. To verify the effectiveness of this strategy, we compare the proposed approach with and without elimination of the invalid cluster centroids, abbreviated as KmeansReg and KmeansRegWOR, respectively. They were tested on the Stanford Bunny and Armadillo under different noise levels, where five levels of uniformly distributed noise were added to the ground-truth transformations. To reduce the effect of randomness, 10 Monte Carlo (MC) trials were conducted at each of the five noise levels for both strategies. Fig. 2 illustrates the mean rotation error, the mean translation error, and the mean runtime of the two strategies.
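One way to generate such perturbed initial transformations is sketched below. The noise model is an assumption made for illustration: three Euler angles and three translation components are drawn uniformly from symmetric intervals and composed with the ground truth. The function names and the interval widths `max_angle` and `max_shift` are hypothetical and do not correspond to the noise levels used in the experiments.

```python
import numpy as np

def euler_to_rotation(rx, ry, rz):
    """Rotation matrix Rz @ Ry @ Rx built from three angles in radians."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    return Rz @ Ry @ Rx

def perturb_transformation(R_gt, t_gt, max_angle, max_shift, rng):
    """Compose the ground truth with uniformly distributed rotation and translation noise."""
    angles = rng.uniform(-max_angle, max_angle, size=3)
    shift = rng.uniform(-max_shift, max_shift, size=3)
    return euler_to_rotation(*angles) @ R_gt, t_gt + shift

# Toy usage: perturb an identity ground truth with small uniform noise.
rng = np.random.default_rng(0)
R_init, t_init = perturb_transformation(
    np.eye(3), np.zeros(3), max_angle=0.05, max_shift=0.01, rng=rng)
```

Repeating the call with fresh random draws gives the independent trials needed for a Monte Carlo evaluation at a given noise level.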
As shown in Fig. 2, the proposed approach without elimination of the invalid cluster centroids consistently obtains less accurate registration results under varied noise levels for different datasets. This is because the pair-wise registration of two identical point sets can only yield the identity transformation. Since each rigid transformation is estimated by aligning the corresponding point set to the model formed by all the cluster centroids, the cluster centroids located in regions covered only by that point set should be eliminated. Otherwise, the registration accuracy will go down. Therefore, the proposed approach with elimination of the invalid cluster centroids obtains more accurate registration results. Besides, the execution times of the two strategies are approximately equal. Accordingly, the proposed approach is valid for multi-view registration of initially posed point sets.
To illustrate its superiority, the proposed approach is compared with three state-of-the-art approaches: the coarse-to-fine TrICP approach , the motion averaging TrICP approach  and the approach based on the low-rank and sparse decomposition , which are abbreviated as CFTrICP, MATrICP and LRS, respectively.
IV-B1 Accuracy and efficiency
As locally convergent registration approaches, all four approaches require initial registration parameters. Before the experiment, initial registration parameters were generated by adding random noise to the ground truth of the rigid transformations. Subsequently, the four compared approaches were applied to the registration of multi-view point sets. For comparison, Table III records the registration error and runtime of all compared approaches for different data sets, where bold numbers denote the best performance among them. To evaluate the registration accuracy in a more intuitive way, Fig. 3 displays the multi-view registration results of the four approaches in the form of cross-sections.
As shown in Table III, the proposed approach is the most efficient of the four compared registration approaches. For the registration, the proposed approach only needs to establish correspondences between each point set and the model formed by the cluster centroids. In contrast, CFTrICP must establish correspondences between each point set and the model constructed from the other aligned point sets, and the other two approaches must establish correspondences between pairs of point sets. As the number of cluster centroids is much smaller than the number of points involved in multi-view registration, the establishment of correspondences, the most time-consuming operation in multi-view registration, is much faster in the proposed approach.
As shown in Table III and Fig. 3, the proposed approach is not very sensitive to noise and always obtains acceptable accuracy compared to the other approaches. Since the proposed approach uses the model formed by all cluster centroids to estimate each transformation, the resolution of this model is much lower than that of the model constructed from the raw point sets. Therefore, the proposed approach may not always obtain the most accurate registration results. However, the other approaches may fail to achieve multi-view registration without good initial registration parameters. Therefore, the proposed approach is comparable to other state-of-the-art approaches in terms of accuracy.
To further verify its robustness, the proposed approach was tested on the Stanford Dragon under different noise levels, where five levels of uniformly distributed noise were added to the ground-truth transformations. To reduce the effect of randomness, 10 MC trials were conducted at each of the five noise levels for all compared approaches. Table IV records the mean rotation error, the mean translation error, and the mean runtime of these approaches.
As shown in Table IV, the proposed approach obtains the most robust registration results under varied noise levels, whereas the other three registration approaches are all vulnerable to noise. As the noise level increases, the registration errors of the proposed approach increase only gradually. Although CFTrICP performs well under low noise levels, its registration errors rise sharply as the noise level increases. Meanwhile, both the MATrICP and LRS approaches have difficulty achieving accurate registration of the Stanford Dragon under all noise levels. Therefore, the proposed approach is the most robust of all the compared approaches. Besides, its runtime is far less than that of the other three approaches.
This paper proposes a novel approach for the registration of multi-view point sets. To the best of our knowledge, it is the first time that K-means clustering has been applied to solve the multi-view registration problem. Viewed as a clustering problem, the multi-view registration can be solved by the proposed extended K-means clustering algorithm. The proposed approach has been tested on the Stanford 3D Scanning Repository, and the experimental results demonstrate that it can achieve the multi-view registration of initially posed point sets with good efficiency and robustness. Similar to most related approaches, the proposed approach requires initial parameters for the multi-view registration.
Our future work will investigate how to estimate the initial rigid transformations so as to automatically achieve the multi-view registration of unordered point sets without any prior information.
This work is supported by the National Natural Science Foundation of China under Grant nos. 61573273, 61573280 and 61503300. We would like to thank Federica Arrigoni for providing the MATLAB implementation of .
- [1] Shiratori T, Berclaz J, Harville M, et al, "Efficient Large-Scale Point Cloud Registration Using Loop Closures," International Conference on 3D Vision, 232-240 (2015).
- [2] Yang J L, Li H D, Campbell D, Jia Y D, "Go-ICP: A Globally Optimal Solution to 3D ICP Point-Set Registration," IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(1), 2241-2254 (2016).
- [3] Dai A, Nießner M, Zollhöfer M, Izadi S, Theobalt C, "Real-Time Globally Consistent 3D Reconstruction Using On-the-Fly Surface Reintegration," ACM Transactions on Graphics, 36(3), 24(1-18) (2017).
- [4] Aiger D, Mitra N J, Cohen-Or D, "4-points Congruent Sets for Robust Surface Registration," ACM Transactions on Graphics, 27(3), 85(1-10) (2008).
- [5] Yu F, Xiao J, Funkhouser T, "Semantic alignment of LiDAR data at city scale," Computer Vision and Pattern Recognition, 1722-1731 (2015).
- [6] Ma L, Zhu J, Zhu L, Du S, Cui J, "Merging grid maps of different resolutions by scaling registration," Robotica, 34(11), 2516-2531 (2016).
- [7] Besl P J, McKay N D, "A method for registration of 3-D shapes," IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2), 239-256 (1992).
- [8] Chetverikov D, Stepanov D, Krsek P, "Robust Euclidean alignment of 3D point sets: the trimmed iterative closest point algorithm," Image and Vision Computing, 23(3), 299-309 (2005).
- [9] Phillips J M, Liu R, Tomasi C, "Outlier Robust ICP for Minimizing Fractional RMSD," Sixth International Conference on 3-D Digital Imaging and Modeling, 427-434 (2007).
- [10] Langis C, Greenspan M, Godin G, "The Parallel Iterative Closest Point Algorithm," 3rd International Conference on 3-D Digital Imaging and Modeling, 195-202 (2001).
- [11] Granger S, Pennec X, "Multi-scale EM-ICP: A Fast and Robust Approach for Surface Registration," European Conference on Computer Vision, 418-432 (2002).
- [12] Jian B, Vemuri B C, "Robust Point Set Registration Using Gaussian Mixture Models," IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(8), 1633 (2011).
- [13] Myronenko A, Song X, "Point set registration: coherent point drift," IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(12), 2262-2275 (2010).
- [14] Tsin Y, Kanade T, "A Correlation-Based Approach to Robust Point Set Registration," European Conference on Computer Vision, 558-569 (2004).
- [15] Sandhu R, Dambreville S, Tannenbaum A, "Point Set Registration via Particle Filtering and Stochastic Dynamics," IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(8), 1459-1473 (2010).
- [16] Kolesov I, Lee J, Sharp G, et al, "A Stochastic Approach to Diffeomorphic Point Set Registration with Landmark Constraints," IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(2), 238 (2016).
- [17] Lomonosov E, Chetverikov D, Ekárt A, "Pre-registration of arbitrarily oriented 3D surfaces using a genetic algorithm," Pattern Recognition Letters, 27(11), 1201-1208 (2006).
- [18] Zhu J, Meng D, Li Z, et al, "Robust registration of partially overlapping point sets via genetic algorithm with growth operator," IET Image Processing, 8(10), 582-590 (2014).
- [19] Guo Y, Bennamoun M, Sohel F, et al, "An Integrated Framework for 3-D Modeling, Object Detection, and Pose Estimation From Point-Clouds," IEEE Transactions on Instrumentation and Measurement, 64(3), 683-693 (2015).
- [20] Rusu R B, Blodow N, Beetz M, "Fast Point Feature Histograms (FPFH) for 3D registration," IEEE International Conference on Robotics and Automation, 1848-1853 (2009).
- [21] Zhu J, "Surface reconstruction via efficient and accurate registration of multiview range scans," Optical Engineering, 53(10), 102104 (2014).
- [22] Govindu V M, "Lie-Algebraic Averaging for Globally Consistent Motion Estimation," Computer Vision and Pattern Recognition, (1), 684-691 (2004).
- [23] Govindu V M, Pooja A, "On averaging multiview relations for 3D scan registration," IEEE Transactions on Image Processing, 23(3), 1289-1302 (2014).
- [24] Govindu V M, "Robustness in Motion Averaging," Asian Conference on Computer Vision, 457-466 (2006).
- [25] Li Z, Zhu J, Lan K, et al, "Improved Techniques for Multi-view Registration with Motion Averaging," International Conference on 3D Vision, 713-719 (2014).
- [26] Arrigoni F, Rossi B, Fusiello A, "Global Registration of 3D Point Sets via LRS Decomposition," European Conference on Computer Vision, 489-504 (2016).
- [27] Evangelidis D G, Horaud R, "Joint Alignment of Multiple Point Sets with Batch and Incremental Expectation-Maximization," IEEE Transactions on Pattern Analysis and Machine Intelligence, PP(99), 1-1 (2017).
- [28] Zhou H, Liu Y, "Accurate integration of multi-view range images using k-means clustering," Pattern Recognition, 41(1), 152-175 (2008).
- [29] Zhu J, Wang D, Bai X, et al, "Registration of Point Clouds Based on the Ratio of Bidirectional Distances," International Conference on 3D Vision, 102-107 (2016).
- [30] Levoy M, "The Stanford 3D Scanning Repository," September 2 2013, http://graphics.stanford.edu/data/3Dscanrep/ (10 October 2017).