3D Dynamic Point Cloud Denoising via
Spatiotemporal Graph Modeling
Abstract.
The prevalence of accessible depth sensing and 3D laser scanning techniques has enabled the convenient acquisition of 3D dynamic point clouds, which provide an efficient representation of arbitrarily-shaped objects in motion. Nevertheless, dynamic point clouds are often perturbed by noise due to hardware, software or other causes. While many methods have been proposed for the denoising of static point clouds, dynamic point cloud denoising has not been studied in the literature yet. Hence, we address this problem based on the proposed spatiotemporal graph modeling, exploiting both the intraframe similarity and interframe consistency. Specifically, we first represent a point cloud sequence on graphs and model it via spatiotemporal Gaussian Markov Random Fields on defined patches. Then for each target patch, we pose a Maximum a Posteriori estimation, and propose the corresponding likelihood and prior functions via spectral graph theory, leveraging its similar patches within the same frame and its corresponding patch in the previous frame. This leads to our problem formulation, which jointly optimizes the underlying dynamic point cloud and spatiotemporal graph. Finally, we propose an efficient algorithm for patch construction, similar/corresponding patch search, intraframe and interframe graph construction, and the optimization of our problem formulation via alternating minimization. Experimental results show that the proposed method outperforms frame-by-frame denoising with state-of-the-art static point cloud denoising approaches.
1. Introduction
The maturity of depth sensing and 3D laser scanning techniques has enabled convenient acquisition of 3D dynamic point clouds, a natural representation for arbitrarily-shaped objects varying over time (Rusu and Cousins, 2011). A dynamic point cloud consists of a sequence of static point clouds, each of which is composed of a set of points defined on irregular grids, as shown in Fig. 1. Each point has geometry information (i.e., 3D coordinates) and possibly attribute information such as color. We focus on the geometry of point clouds in this paper due to its vital role. Because of this efficient representation, dynamic point clouds have been widely deployed in various fields, such as 3D immersive telepresence, navigation for autonomous vehicles, gaming and animation (Tulvan et al., 2016).
Point clouds are often perturbed by noise, which comes from hardware, software or other causes. Hardware-wise, noise occurs due to the inherent limitations of the acquisition equipment. Software-wise, when point clouds are generated with existing algorithms, points may be placed at completely wrong locations due to imprecise triangulation (e.g., a false epipolar matching). The noise corruption directly affects the subsequent applications of dynamic point clouds.
However, the denoising of dynamic point clouds has not been studied in the literature yet, while many approaches have been proposed for static point cloud denoising. Existing denoising methods for static point clouds mainly include moving least squares (MLS)-based methods, locally optimal projection (LOP)-based methods, sparsity-based methods, and non-local similarity-based methods. MLS-based methods (Alexa et al., 2003; Guennebaud and Gross, 2007; Oztireli et al., 2009) approximate a smooth surface for the input point clouds and project the points to the estimated surface. LOP-based methods (Huang et al., 2013; Hui et al., 2009; Lipman et al., 2007) also apply surface approximation but the operator is non-parametric. Sparsity-based methods (Avron et al., 2010; Mattei and Castrodad, 2017) assume sparse representation of point normals, and solve a global minimization problem to obtain the sparse reconstruction of the point normals. Non-local similarity-based methods (Dinesh et al., 2018; Zeng et al., 2018) exploit self-similarities among surface patches in a point cloud. Besides, several other approaches have been proposed for static point cloud denoising (Huang et al., 2009; Yan and Zhai, 2015; Rusu et al., 2008; Gao et al., 2018), in which the key idea is to detect noise in point clouds via certain characteristics and then remove it.
While it is possible to apply existing static point cloud denoising methods to each frame of a dynamic point cloud sequence separately, the interframe correlation would be neglected, which may lead to inconsistent denoising results in the temporal domain. Hence, we propose joint denoising of dynamic point clouds by exploiting the interframe correlation, which not only enforces the temporal consistency but also provides additional information for denoising. Since point clouds are irregular, it is challenging to acquire the temporal correspondence between neighboring frames. We address this issue by representing dynamic point clouds naturally on graphs, where each vertex represents a point, each edge captures the relationship between neighboring points, and the corresponding graph signal refers to the coordinates of points. We then propose a graph-based method to search the temporal correspondence and estimate the underlying clean dynamic point cloud.
Specifically, since it is computationally inefficient to process an entire frame of a point cloud at once, we first divide each frame into overlapping patches. Each irregular patch is defined as a local point set consisting of a centering point and its nearest neighbors. Then we propose a spatial-temporal model under Gaussian Markov Random Fields (GMRF) (Rue and Held, 2005), which play a crucial role in describing both the intraframe and interframe correlations over patches. Next, we estimate the underlying current frame via Maximum a Posteriori (MAP) estimation, given the previous and current noisy frames. We propose the likelihood function and prior distribution, based on the GMRF modeling and graph-signal smoothness prior (Shuman et al., 2013). This leads to the proposed problem formulation of dynamic point cloud denoising, where the underlying frame and its graph representation (the graph Laplacian in particular) are jointly optimized. In spectral graph theory (Chung, 1997), a graph Laplacian matrix is an algebraic representation of the connectivities of the corresponding graph; it is introduced in Section 3.
Based on the above problem formulation, we propose an efficient algorithm to address the denoising problem of dynamic point clouds. For each target patch in the current frame, we first search for its similar patches in the same frame to exploit the intraframe correlation, and search for its corresponding patch in the previous frame to explore the interframe correlation. Similar to (Zeng et al., 2018), the similarity metric between two patches depends on the distance from each point in the two patches to the tangent plane at each patch center. Based on the similar patches and corresponding patch, we address the problem formulation by designing an efficient alternating minimization algorithm to solve the underlying frame and graph Laplacian alternately. In particular, since the computational complexity of solving the graph Laplacian would be high and the numerical computation might be unstable, we propose to construct the intraframe graph and interframe graph manually based on the patch similarity at each update of the underlying frame. Experimental results show that the proposed method outperforms separate denoising of each frame with state-of-the-art static point cloud denoising methods on five widely used dynamic point cloud sequences.
In summary, the main contributions of our work include:

To the best of our knowledge, we are the first to address the dynamic point cloud denoising problem in the literature. The key idea is to exploit the interframe correlation of irregular point clouds for temporal consistency.

We propose a spatial-temporal model of dynamic point clouds under GMRF, and derive the MAP estimation from graph-signal priors, which finally casts dynamic point cloud denoising as an optimization problem.

We propose an efficient algorithm to solve the optimization problem. Experimental results validate the effectiveness of our method.
2. Related Work
To the best of our knowledge, there has been no research on dynamic point cloud denoising yet in the literature. Previous works on static point cloud denoising can be divided into four classes: moving least squares (MLS)-based methods, locally optimal projection (LOP)-based methods, sparsity-based methods, and non-local methods.
MLS-based methods. MLS-based methods aim to approximate a smooth surface from the input point cloud and minimize the geometric error of the approximation. Alexa et al. obtain a polynomial function on a local reference domain to best fit neighboring points in terms of MLS (Alexa et al., 2003). Other similar solutions are algebraic point set surfaces (APSS) (Guennebaud and Gross, 2007) and robust implicit MLS (RIMLS) (Oztireli et al., 2009). However, the results may be over-smoothed, and these methods may not perform well in terms of removing outliers.
LOP-based methods. LOP-based methods also apply surface approximation for denoising point clouds. Unlike MLS-based methods, however, the operator is non-parametric, so it performs well in cases of ambiguous orientation. For example, Lipman et al. define a set of points that represent the estimated surface by minimizing the sum of Euclidean distances to the data points (Lipman et al., 2007). Two extensions of (Lipman et al., 2007) are weighted LOP (WLOP) (Hui et al., 2009) and anisotropic WLOP (AWLOP) (Huang et al., 2013). (Hui et al., 2009) produces a set of denoised, outlier-free and more evenly distributed particles over the original dense point cloud to keep the sample distance of neighboring points. (Huang et al., 2013) modifies WLOP with an anisotropic weighting function so as to preserve sharp features better. However, LOP-based methods may also lead to over-smoothed results.
Sparsity-based methods. Sparsity-based methods are based on the theory of sparse representation of the point normals. With sparsity regularization, they solve a global minimization problem to obtain a sparse reconstruction of the point normals. Then the positions of points are updated by solving another global minimization problem based on a local planar assumption, such as (Mattei and Castrodad, 2017) and (Avron et al., 2010). However, when locally high noise-to-signal ratios yield redundant features, these methods may not perform well and lead to over-smoothing or over-sharpening (Sun et al., 2015).
Non-local methods. Non-local methods exploit self-similarities among surface patches in a point cloud. These methods are inspired by the non-local means (NLM) (Buades et al., 2005) and BM3D (Dabov et al., 2007) image denoising algorithms. For example, Digne et al. utilize an NLM algorithm to denoise static point clouds (Digne, 2012), while Rosman et al. implement a BM3D method to smooth point clouds (Rosman et al., 2013). Besides, Zeng et al. define the self-similarity among patches in point clouds formally as a low-dimensional manifold prior (Zeng et al., 2018). Dinesh et al. approximate a nearest-neighbor graph of 3D points as a bipartite graph and then deploy graph total variation on the surface normals of neighboring 3D points as regularization (Dinesh et al., 2018). However, the computational complexity of the above methods is usually high.
Besides, deep learning has been recently deployed for static point cloud denoising (Almonacid et al., 2018). A CNN model is trained with a set of real and synthetic scans with clean and noisy areas, and then applied to perform denoising. However, finer geometric precision is infeasible for now given the high computational complexity of the model.
3. Preliminaries
We represent dynamic point clouds on undirected graphs. An undirected graph $\mathcal{G}=\{\mathcal{V},\mathcal{E},\mathbf{W}\}$ is composed of a vertex set $\mathcal{V}$ of cardinality $|\mathcal{V}|=N$, an edge set $\mathcal{E}$ connecting vertices, and a weighted adjacency matrix $\mathbf{W}$. $\mathbf{W}$ is a real and symmetric $N\times N$ matrix, where $w_{i,j}$ is the weight assigned to the edge $(i,j)$ connecting vertices $i$ and $j$. Edge weights often measure the similarity between connected vertices.
The graph Laplacian matrix is defined from the adjacency matrix. Among different variants of Laplacian matrices, the combinatorial graph Laplacian used in (Shen et al., 2010; Hu et al., 2015) is defined as $\mathbf{L}:=\mathbf{D}-\mathbf{W}$, where $\mathbf{D}$ is the degree matrix, a diagonal matrix where $d_{i,i}=\sum_{j=1}^{N}w_{i,j}$.
A graph signal refers to data that resides on the vertices of a graph. In our case, the coordinates of each point in the input dynamic point cloud are the graph signal. A graph signal $\mathbf{x}$ defined on a graph $\mathcal{G}$ is smooth with respect to the topology of $\mathcal{G}$ if
$$\sum_{i\sim j} w_{i,j}\,(x_i-x_j)^2<\epsilon, \tag{1}$$
where $\epsilon$ is a small positive scalar, and $i\sim j$ denotes that two vertices $i$ and $j$ are one-hop neighbors in the graph. In order to satisfy (1), $x_i$ and $x_j$ have to be similar for a large edge weight $w_{i,j}$, and could be quite different for a small $w_{i,j}$. Hence, (1) enforces $\mathbf{x}$ to adapt to the topology of $\mathcal{G}$, which is thus coined the graph-signal smoothness prior.
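As a small illustration of the smoothness measure, the following sketch builds the combinatorial Laplacian of a toy graph and checks the standard identity that the quadratic form $\mathbf{x}^\top\mathbf{L}\mathbf{x}$ equals the weighted sum of squared differences over edges. The function names and the three-vertex path graph are illustrative assumptions, not part of the proposed method.

```python
def combinatorial_laplacian(W):
    """Return L = D - W for a symmetric adjacency matrix W (list of lists)."""
    n = len(W)
    L = [[-W[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        # Diagonal entry: degree d_ii = sum_j w_ij, minus any self-loop weight.
        L[i][i] = sum(W[i]) - W[i][i]
    return L


def quadratic_form(x, L):
    """Evaluate x^T L x for a scalar graph signal x."""
    n = len(x)
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))


def edge_sum(x, W):
    """Evaluate the sum over edges i~j of w_ij * (x_i - x_j)^2 (each edge once)."""
    n = len(x)
    return sum(W[i][j] * (x[i] - x[j]) ** 2
               for i in range(n) for j in range(i + 1, n))


# Toy path graph 0 - 1 - 2 with edge weights 1.0 and 0.5 (hypothetical values).
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.5],
     [0.0, 0.5, 0.0]]
L = combinatorial_laplacian(W)
x = [1.0, 1.0, 3.0]

# The signal is constant across the strong edge and varies across the weak one.
print(quadratic_form(x, L), edge_sum(x, W))  # both evaluate to 2.0
```

A signal that changed across the strongly weighted edge instead would produce a larger value, which is exactly what the smoothness prior penalizes.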
4. Problem Formulation
In this section, we elaborate on the proposed problem formulation. We start from the modeling of a dynamic point cloud sequence via spatiotemporal GMRFs, and propose such modeling on patch basis. Then we pose a MAP estimation of the underlying dynamic point cloud, and come up with the likelihood function and prior distribution. Finally, we arrive at the problem formulation from the MAP estimation.
4.1. Spatial-Temporal Modeling
A dynamic point cloud sequence $\mathcal{P}=\{\mathbf{U}_1,\mathbf{U}_2,\ldots,\mathbf{U}_m\}$ consists of $m$ frames of point clouds. The coordinates $\mathbf{U}_t=[\mathbf{u}_{t,1},\mathbf{u}_{t,2},\ldots,\mathbf{u}_{t,N}]^\top\in\mathbb{R}^{N\times 3}$ denote the position of each point in the point cloud at frame $t$, in which $\mathbf{u}_{t,i}\in\mathbb{R}^{3}$ represents the coordinates of the $i$-th point at frame $t$. Let $\mathbf{U}_t$ denote the ground truth coordinates of the $t$-th frame, and $\hat{\mathbf{U}}_{t-1}$, $\hat{\mathbf{U}}_t$ denote the noise-corrupted coordinates of the $(t-1)$-th and $t$-th frame respectively. Then we formulate the dynamic point cloud denoising problem as
$$\hat{\mathbf{U}}_t=\mathbf{U}_t+\mathbf{E}_t, \tag{2}$$
where $\mathbf{E}_t$ is zero-mean signal-independent noise. For point clouds acquired by sensing equipment, the noise distribution depends on the acquisition equipment. Several previous works (Nguyen et al., 2012; Sun et al., 2008) have shown through statistics that the noise in point clouds approximately follows a Gaussian distribution for 3D scanning equipment such as Microsoft Kinect and 3D laser scanners. As these are popular sensors, we assume the noise follows a Gaussian distribution.
Spatiotemporal GMRF modeling. In particular, we model the relationship in consecutive frames of a dynamic point cloud via spatiotemporal GMRF models. A spatial GMRF is a restrictive multivariate Gaussian distribution that satisfies additional conditional independence assumptions. A graph is often used to represent the conditional independence assumption. Here is the formal definition:
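Definition: A random vector $\mathbf{x}=(x_1,\ldots,x_n)^\top$ is a GMRF with respect to a graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$ with mean $\boldsymbol{\mu}$ and precision matrix $\mathbf{Q}\succ 0$, if its density has the form
$$p(\mathbf{x})=(2\pi)^{-n/2}\,|\mathbf{Q}|^{1/2}\exp\!\left(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^\top\mathbf{Q}\,(\mathbf{x}-\boldsymbol{\mu})\right) \tag{3}$$
and
$$Q_{i,j}\neq 0 \iff \{i,j\}\in\mathcal{E}\quad\text{for all } i\neq j. \tag{4}$$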
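To make the definition concrete, the sketch below evaluates the log of the density in (3) for a small precision matrix and reads the graph edges off its nonzero off-diagonal entries as in (4). The helper names and the tridiagonal example matrix are assumptions chosen for illustration only.

```python
import math


def cholesky(Q):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(Q)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = Q[i][j] - sum(C[i][k] * C[j][k] for k in range(j))
            C[i][j] = math.sqrt(s) if i == j else s / C[j][j]
    return C


def gmrf_log_density(x, mu, Q):
    """log p(x) = -n/2 log(2 pi) + 1/2 log|Q| - 1/2 (x - mu)^T Q (x - mu)."""
    n = len(x)
    d = [xi - mi for xi, mi in zip(x, mu)]
    quad = sum(d[i] * Q[i][j] * d[j] for i in range(n) for j in range(n))
    C = cholesky(Q)
    # log|Q| is twice the sum of the logs of the Cholesky diagonal.
    log_det = 2.0 * sum(math.log(C[i][i]) for i in range(n))
    return -0.5 * n * math.log(2.0 * math.pi) + 0.5 * log_det - 0.5 * quad


def gmrf_edges(Q, tol=1e-12):
    """Edges implied by the precision matrix: {i, j} is an edge iff Q_ij != 0."""
    n = len(Q)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(Q[i][j]) > tol}


# Hypothetical 3-vertex chain: vertices 0 and 2 are not adjacent because
# Q[0][2] == 0, i.e. they are conditionally independent given vertex 1.
Q = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
mu = [0.0, 0.0, 0.0]
print(gmrf_edges(Q))
print(gmrf_log_density([0.1, 0.0, -0.1], mu, Q))
```

The sparsity of the precision matrix is what lets a graph stand in for the conditional independence structure.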
Spatiotemporal GMRF models are extensions of spatial GMRF models to account for additional temporal variation. In our case, we represent a dynamic point cloud of frames on a sequence of subgraphs. Each subgraph describes the intraframe connectivities within each frame, and temporal connectivities exist between neighboring subgraphs to describe the interframe connectivities.
Patch representation. Further, as it is computationally expensive to process an entire point cloud, we model both intraframe and interframe dependencies on a patch basis. Unlike images or videos defined on regular grids, point clouds reside on irregular domains with uncertain local neighborhoods, thus the definition of a patch is nontrivial. We define a patch in the point cloud at frame $t$ as a local point set of $K$ points, consisting of a centering point and its nearest neighbors in terms of Euclidean distance. Then the entire set of patches at frame $t$ is
$$\mathbf{P}_t=\mathbf{S}_t\mathbf{U}_t-\mathbf{C}_t, \tag{5}$$
where $\mathbf{S}_t$ is a sampling matrix to select points from point cloud $\mathbf{U}_t$ so as to form $M$ patches of $K$ points each, and $\mathbf{C}_t$ contains the coordinates of patch centers for each point.
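The patch representation can be sketched as follows: for each patch center, gather the nearest points and express them relative to the center, which mirrors selecting points and subtracting the patch centers. This is a minimal brute-force sketch; the function name `build_patches`, the convention that the patch size `k` includes the center itself, and the toy coordinates are assumptions.

```python
import math


def build_patches(points, center_ids, k):
    """For each center index, collect the k closest points (the center
    included, assuming distinct points) and subtract the center coordinates."""
    patch_ids, patches = [], []
    for c in center_ids:
        center = points[c]
        # Brute-force nearest-neighbor search by Euclidean distance.
        order = sorted(range(len(points)),
                       key=lambda i: math.dist(points[i], center))
        members = order[:k]
        patch_ids.append(members)  # plays the role of the selected rows
        patches.append([[a - b for a, b in zip(points[i], center)]
                        for i in members])  # centered coordinates
    return patch_ids, patches


# Hypothetical toy frame with four points.
points = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [5.0, 5.0, 5.0]]
ids, patches = build_patches(points, center_ids=[1], k=3)
print(ids)      # indices of the points forming the patch around point 1
print(patches)  # the same points expressed relative to the patch center
```

For real point clouds a spatial index such as a k-d tree would replace the brute-force sort, but the output structure would be the same.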
Based on the patch representation, we model the intraframe dependency by building graph connectivities among similar patches within a frame, and model the interframe dependency by constructing graph connectivities between corresponding patches over consecutive frames. The details of searching similar patches within a frame and corresponding patches between frames will be discussed in section 5.2.
4.2. MAP Estimation of Dynamic Point Clouds
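Under the spatiotemporal GMRF modeling, we pose a MAP estimation for the underlying patches $\mathbf{P}_t$ in the point cloud at frame $t$: given the observed noisy previous frame $\hat{\mathbf{U}}_{t-1}$ and current noisy frame $\hat{\mathbf{U}}_t$, find the most probable signal $\mathbf{P}_t$,
$$\tilde{\mathbf{P}}_t^{\mathrm{MAP}}=\arg\max_{\mathbf{P}_t}\; f(\hat{\mathbf{U}}_{t-1},\hat{\mathbf{U}}_t\mid\mathbf{P}_t)\,g(\mathbf{P}_t), \tag{6}$$
where $f(\hat{\mathbf{U}}_{t-1},\hat{\mathbf{U}}_t\mid\mathbf{P}_t)$ is the likelihood function, and $g(\mathbf{P}_t)$ is the signal prior. Because $\mathbf{P}_t$ are patches that cover the entire $\mathbf{U}_t$, Eq. (6) also gives the MAP estimation of $\mathbf{U}_t$:
$$\tilde{\mathbf{U}}_t^{\mathrm{MAP}}=\arg\max_{\mathbf{U}_t}\; f(\hat{\mathbf{U}}_{t-1},\hat{\mathbf{U}}_t\mid\mathbf{U}_t)\,g(\mathbf{U}_t). \tag{7}$$
The proposed likelihood function. $f(\hat{\mathbf{U}}_{t-1},\hat{\mathbf{U}}_t\mid\mathbf{U}_t)$ is the probability of obtaining the observed point clouds $\hat{\mathbf{U}}_{t-1}$ and $\hat{\mathbf{U}}_t$ given the desired current frame $\mathbf{U}_t$. We have
$$f(\hat{\mathbf{U}}_{t-1},\hat{\mathbf{U}}_t\mid\mathbf{U}_t)=f(\hat{\mathbf{U}}_{t-1}\mid\hat{\mathbf{U}}_t,\mathbf{U}_t)\,f(\hat{\mathbf{U}}_t\mid\mathbf{U}_t)=f(\hat{\mathbf{U}}_{t-1}\mid\mathbf{U}_t)\,f(\hat{\mathbf{U}}_t\mid\mathbf{U}_t), \tag{8}$$
where $f(\hat{\mathbf{U}}_{t-1}\mid\hat{\mathbf{U}}_t,\mathbf{U}_t)$ is equivalent to $f(\hat{\mathbf{U}}_{t-1}\mid\mathbf{U}_t)$ because we assume the noise of the $(t-1)$-th frame and that of the $t$-th frame are independent.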
For the second term in Eq. (8), according to the linear relationship of and as in Eq. (5) and assuming zeromean Gaussian distribution for the noise, we have
(9) 
where is a normalization factor to keep the integral of the probability function equal to 1, and is a variancerelated parameter.
For the first term in Eq. (8), since the variation between adjacent frames is often trivial, we assume the current frame is a perturbed version of the previous frame. In particular, we propose to adopt a weighting parameter to represent the perturbation at the th patch, leading to
(10) 
where is a normalization factor, and is a variancerelated parameter. In the proposed algorithm, is a variable depending on and , which describes the similarity between and .
The proposed prior distribution. Since follows GMRF modeling, assuming zero mean, we have its prior distribution from Eq. (3) as:
(11) 
where is the precision matrix of the th frame, and is a normalization factor.
4.3. Final Problem Formulation
Combining Eq. (7), Eq. (8), Eq. (9), Eq. (10), Eq. (12), we have
(13) 
Due to the dependency of and on , and are optimization variables as well as .
Taking the logarithm of Eq. (13) and multiplying by , we arrive at the final problem formulation:
(14) 
As , and are optimization variables, Eq. (14) is nontrivial to solve. We develop an efficient algorithm to address this problem formulation in the next section.
5. The Proposed Algorithm
As demonstrated in Fig. 2, for a given dynamic point cloud, we perform denoising on each frame sequentially. The proposed algorithm consists of four major steps: 1) patch construction, in which we form overlapping patches from chosen patch centers; 2) similar/corresponding patch search, in which we search similar patches for each patch in the current frame, and search the corresponding patch in the previous frame; 3) graph construction, in which we build a spatiotemporal graph with intra-connectivities among similar patches and inter-connectivities among corresponding patches; 4) optimization, in which we solve the proposed problem formulation in Eq. (14) via alternating minimization, thus performing steps 2-4 iteratively. Note that the interframe reference is bypassed when denoising the first frame, as there is no previous frame. We discuss the four steps in detail below.
5.1. Patch Construction
As each patch is formed around a patch center, we first select $M$ points from the current frame as the patch centers. In order to keep the patches distributed as uniformly as possible, we first choose a random point in the frame as the first patch center, and then repeatedly add the point that is farthest from the previously chosen patch centers as the next patch center, until there are $M$ patch centers. We then search the nearest neighbors of each patch center in terms of Euclidean distance, which leads to $M$ patches in the frame.
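The center selection described above resembles farthest point sampling. The sketch below interprets "farthest from the previous patch centers" as maximizing the distance to the closest already-chosen center, which is the common max-min reading and an assumption here; the function name and the optional `first` argument (used to make the example deterministic) are also illustrative.

```python
import math
import random


def farthest_point_sampling(points, m, first=None, seed=0):
    """Return m center indices: a (random) first point, then repeatedly the
    point whose distance to its closest chosen center is largest."""
    if first is None:
        first = random.Random(seed).randrange(len(points))
    centers = [first]
    # min_dist[i] = distance from point i to its closest chosen center.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(centers) < m:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        centers.append(nxt)
        min_dist = [min(d, math.dist(p, points[nxt]))
                    for d, p in zip(min_dist, points)]
    return centers


# Hypothetical points on a line; m must not exceed the number of points.
pts = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
print(farthest_point_sampling(pts, m=3, first=0))  # [0, 3, 2]
```

Keeping the running minimum distance per point makes each new center cost one pass over the frame rather than a comparison against every previous center.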
5.2. Similar/Corresponding Patch Search
For each constructed patch in the current frame, we search for its similar patches locally in the same frame, and its corresponding patch in the previous frame. A metric is necessary to measure the similarity between patches, which remains a challenging problem as the patches are irregular.
Similarity Metric. We deploy a simplified method of (Zeng et al., 2018) to measure the similarity between patch and patch . The key idea is to compare the distance of the two patches, from each point to the tangent plane at the patch center.
Firstly, we structure the tangent planes of the two patches. A point cloud describes the surface of the object. We thus calculate the surface normals and for patch and patch respectively. Then we acquire the tangent planes of the patches at the patch center and .
Secondly, we measure the difference of patches with the distance of the two patches from each point to the corresponding tangent plane. Specifically, we project each point in patch and patch to the tangent plane of patch . For the th point in patch , we find the point in , whose projection on the tangent plane is closest to that of . We then define and as the distance of the two points to their projections on the tangent plane. is regarded as the difference of the two patches in point and . Then we acquire the average difference between the two patches at all the points:
(15) 
Similarly, projecting each point in patch and patch to the tangent plane of patch , we acquire an average difference . The final mean difference between the two patches is:
(16) 
Finally, we measure the patch similarity with a thresholded Gaussian function using the above mean difference:
(17) 
where is a threshold determined by the density of the point cloud, and is a variancerelated parameter. The larger is, the more similar and are.
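The similarity measure can be sketched roughly as follows. This is not a reproduction of Eqs. (15)-(17): it assumes patches are already centered at their patch centers, that unit surface normals are supplied by the caller, that the per-point difference is the absolute gap between the two point-to-plane distances, and that the thresholded Gaussian has the form $\exp(-d^2/(2\sigma^2))$ for $d<\tau$ and zero otherwise. All names are hypothetical.

```python
import math


def _project(p, n):
    """Return (distance to the plane through the origin with unit normal n,
    projection of p onto that plane)."""
    h = sum(a * b for a, b in zip(p, n))  # signed height along the normal
    return abs(h), [a - h * b for a, b in zip(p, n)]


def one_sided_difference(patch_a, patch_b, normal_a):
    """Average distance mismatch measured on the tangent plane of patch_a."""
    proj_a = [_project(p, normal_a) for p in patch_a]
    proj_b = [_project(p, normal_a) for p in patch_b]
    total = 0.0
    for d_a, q_a in proj_a:
        # Partner in patch_b: the point whose projection is closest to q_a.
        d_b, _ = min(proj_b, key=lambda dq: math.dist(dq[1], q_a))
        total += abs(d_a - d_b)
    return total / len(patch_a)


def patch_similarity(patch_a, patch_b, normal_a, normal_b, tau, sigma):
    """Thresholded Gaussian of the mean of the two one-sided differences."""
    d = 0.5 * (one_sided_difference(patch_a, patch_b, normal_a)
               + one_sided_difference(patch_b, patch_a, normal_b))
    return math.exp(-d * d / (2.0 * sigma ** 2)) if d < tau else 0.0


# Hypothetical flat patch and a copy lifted by 0.2 along the normal.
flat = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
lifted = [[x, y, z + 0.2] for x, y, z in flat]
up = [0.0, 0.0, 1.0]
print(patch_similarity(flat, flat, up, up, tau=0.5, sigma=0.2))    # 1.0
print(patch_similarity(flat, lifted, up, up, tau=0.5, sigma=0.2))
print(patch_similarity(flat, lifted, up, up, tau=0.1, sigma=0.2))  # 0.0
```

In practice the normals would be estimated from the noisy points, for example by a local plane fit, before this comparison is made.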
Local Patch Search. Given the similarity measure, we search for similar patches within the current frame. The number of similar patches depends on the size of the point cloud. As to the previous frame, we search for only one patch as the corresponding patch: given a target patch in the $t$-th frame, we choose the patch most similar to it in the $(t-1)$-th frame as the corresponding patch.
In order to reduce the computational complexity, we set a local window in the $(t-1)$-th frame for the corresponding patch search, which contains the patches centered at the K-nearest neighbors of the target patch center. Thus we evaluate the patch similarity between the target patch and these K-nearest patches instead of all the patches in the $(t-1)$-th frame. Once we acquire the corresponding patch, we use its similarity to the target patch, measured by Eq. (17), as the weighting parameter in Eq. (10). Similarly, we set a local window for the similar patch search in the $t$-th frame.
5.3. Graph Construction
Having searched intraframe similar patches and interframe corresponding patches, we construct a spatiotemporal graph over the patches. Though this graph is supposed to be learned via Eq. (14), the computational complexity of solving the optimization problem would be high and the numerical computation might be unstable. Instead, we propose to manually build intraframe graph connectivities and interframe graph connectivities based on the patch similarity, as shown in Fig. 3.
Intraframe graph construction. Given a target patch in the th frame, we construct a bipartite graph between and each of its similar patches . Specifically, each point in is connected with its nearest neighbors in , where the distance is in terms of their projections on the tangent plane decided by the surface normal of at the patch center. Similarly, each point in is connected with the nearest points in in terms of their projections on the tangent plane decided by the surface normal of at the patch center. The intraframe connectivities are undirected and share the same weight as in Eq. (17). We build intraframe connectivities over all the patches in this way, which leads to the graph Laplacian , where is the number of points in each patch and is the number of patches in the th frame.
Note that, we do not connect points within each patch explicitly in order to avoid bringing the coordinates close to each other in a patch. However, connectivities may exist among some points if they are nearest neighbors in overlapping patches.
Interframe graph construction. In order to leverage the interframe correlation and keep the temporal consistency, we connect corresponding patches between the th frame and th frame. Similar to the intragraph construction, we connect each point in patch with its nearest points in patch , where the distance is in terms of their projections on the tangent plane decided by the surface normal of patch at the patch center. The edges are undirected and share the same weight , which is the similarity measurement in Eq. (17).
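Both the intraframe and the interframe connectivities follow the same pattern: points of one patch are linked to the points of the other patch whose tangent-plane projections are nearest, and all edges between the two patches share one weight. A minimal sketch of that pattern is given below; it assumes centered patches, caller-supplied unit normals, and a hypothetical neighbor count `k` and weight value.

```python
import math


def _project(p, n):
    """Project p onto the plane through the origin with unit normal n."""
    h = sum(a * b for a, b in zip(p, n))
    return [a - h * b for a, b in zip(p, n)]


def _nearest(src, dst, normal, k):
    """For each point in src, indices of the k points in dst whose
    projections on the plane defined by `normal` are closest."""
    proj_dst = [_project(p, normal) for p in dst]
    result = []
    for p in src:
        q = _project(p, normal)
        order = sorted(range(len(dst)),
                       key=lambda j: math.dist(proj_dst[j], q))
        result.append(order[:k])
    return result


def bipartite_edges(patch_a, patch_b, normal_a, normal_b, k, weight):
    """Undirected edges {(i, j): weight}, i indexing patch_a, j patch_b."""
    edges = {}
    for i, js in enumerate(_nearest(patch_a, patch_b, normal_a, k)):
        for j in js:
            edges[(i, j)] = weight
    for j, iis in enumerate(_nearest(patch_b, patch_a, normal_b, k)):
        for i in iis:
            edges[(i, j)] = weight
    return edges


# Hypothetical two-point patches with a shared normal along z.
a = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
b = [[0.1, 0.0, 0.3], [0.9, 0.0, -0.2]]
z = [0.0, 0.0, 1.0]
print(bipartite_edges(a, b, z, z, k=1, weight=0.7))
```

Because neighbors are matched on the tangent plane rather than in 3D, displacement along the normal, where most of the noise is expected to act, does not affect which points get connected.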
5.4. Optimization Algorithm
We first rewrite Eq. (14) for efficient optimization. We define a matrix to describe the weights between corresponding patches:
(18) 
Then we rewrite Eq. (14) in the following form:
(19) 
Eq. (19) is nontrivial to solve with three optimization variables. We propose an efficient alternating minimization approach as follows. Firstly, we initialize with the noisy observation , based on which we calculate from the proposed intraframe graph construction and from the proposed interframe graph construction. Secondly, we fix both and , take the derivative of Eq. (19) with respect to and set the derivative to zero. This leads to the closed-form solution of :
(20) 
Then we update and from the solved . The iterations are repeated until convergence, i.e., when the difference of , , and from their values in the previous iteration is trivial.
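The alternation between a graph update and a closed-form signal update can be illustrated on a deliberately simplified objective. The sketch below is not Eq. (19) or Eq. (20): it assumes a single frame, a fully connected graph with Gaussian weights on the current coordinates, and the objective $\|\mathbf{U}-\hat{\mathbf{U}}\|_F^2+\lambda\,\mathrm{tr}(\mathbf{U}^\top\mathbf{L}\mathbf{U})$, whose minimizer for a fixed Laplacian is $\mathbf{U}=(\mathbf{I}+\lambda\mathbf{L})^{-1}\hat{\mathbf{U}}$. The parameter values and function names are hypothetical.

```python
import math


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [[M[i][n + j] / M[i][i] for j in range(len(B[0]))]
            for i in range(n)]


def denoise_step(noisy, L, lam):
    """Minimizer of ||U - noisy||_F^2 + lam * tr(U^T L U) for a fixed L."""
    n = len(L)
    A = [[(1.0 if i == j else 0.0) + lam * L[i][j] for j in range(n)]
         for i in range(n)]
    return solve(A, noisy)


def gaussian_laplacian(U, sigma):
    """Combinatorial Laplacian of a fully connected Gaussian-weighted graph."""
    n = len(U)
    W = [[0.0 if i == j
          else math.exp(-math.dist(U[i], U[j]) ** 2 / (2.0 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    return [[sum(W[i]) if i == j else -W[i][j] for j in range(n)]
            for i in range(n)]


def alternating_denoise(noisy, lam=0.5, sigma=1.0, iters=3):
    """Alternate a graph update and a closed-form signal update."""
    U = [list(p) for p in noisy]             # initialize with the observation
    for _ in range(iters):
        L = gaussian_laplacian(U, sigma)     # graph from the current estimate
        U = denoise_step(noisy, L, lam)      # signal with the graph fixed
    return U


noisy = [[0.0, 0.0, 0.0], [1.0, 0.1, 0.0], [2.0, -0.1, 0.0]]
print(alternating_denoise(noisy))
```

A fixed iteration count stands in for the convergence test described in the text; in the full method the graph update would be the patch-based intraframe and interframe construction rather than this dense kernel.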
Note that, we first perform denoising on the first frame with only intracorrelations. Then for the next frame, in order to take advantage of the previously reconstructed frame for better reference, we take as patches in the denoised previous frame instead of those in the observed noisy previous frame. Hence, the final solution of in Eq. (19) serves as the reference frame for the denoising of the next frame. A summary of the proposed algorithm is shown in Algorithm 1.
6. Experimental Results
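Table 1. MSE comparison of different methods.

| Model | Noisy | APSS | RIMLS | MRPCA | Baseline | Ours |
|---|---|---|---|---|---|---|
| Soldier | 1.4984 | 1.4125 | 1.3572 | 1.3488 | 1.2805 | |
| Longdress | 1.4746 | 1.3985 | 1.3360 | 1.3247 | 1.2475 | |
| Loot | 1.4715 | 1.3571 | 1.3279 | 1.3101 | 1.2208 | |
| Redandblack | 1.4589 | 1.3892 | 1.3499 | 1.3221 | 1.2506 | |
| UlliWegner | 1.3359 | 1.2652 | 1.2065 | 1.1989 | 1.1091 | |

Table 2. MSE comparison of different methods.

| Model | Noisy | APSS | RIMLS | MRPCA | Baseline | Ours |
|---|---|---|---|---|---|---|
| Soldier | 2.1453 | 1.8047 | 1.8105 | 1.8116 | 1.7815 | |
| Longdress | 2.1260 | 1.8007 | 1.7955 | 1.7922 | 1.7329 | |
| Loot | 2.1286 | 1.7668 | 1.7883 | 1.7703 | 1.7079 | |
| Redandblack | 2.1110 | 1.7915 | 1.8061 | 1.7939 | 1.7439 | |
| UlliWegner | 2.0865 | 1.7830 | 1.7807 | 1.8024 | 1.7444 | |

Table 3. MSE comparison of different methods.

| Model | Noisy | APSS | RIMLS | MRPCA | Baseline | Ours |
|---|---|---|---|---|---|---|
| Soldier | 2.5417 | 1.9675 | 2.0450 | 1.9999 | 2.0111 | |
| Longdress | 2.5139 | 1.9630 | 2.0297 | 1.9754 | 1.9748 | |
| Loot | 2.5205 | 1.9359 | 2.0271 | 1.9487 | 1.9299 | |
| Redandblack | 2.5035 | 1.9726 | 2.0537 | 1.9849 | 2.0037 | |
| UlliWegner | 2.5600 | 2.0730 | 2.1060 | 2.0895 | 2.0405 | |

Table 4. MSE comparison of different methods.

| Model | Noisy | APSS | RIMLS | MRPCA | Baseline | Ours |
|---|---|---|---|---|---|---|
| Soldier | 3.0127 | 2.1404 | 2.3901 | 2.1874 | 2.1442 | |
| Longdress | 2.9761 | 2.1236 | 2.3748 | 2.1360 | 2.0975 | |
| Loot | 2.9853 | 2.1118 | 2.3338 | 2.1037 | 2.0451 | |
| Redandblack | 2.9622 | 2.1433 | 2.3266 | 2.1563 | 2.1339 | |
| UlliWegner | 3.0492 | 2.2737 | 2.4086 | 2.2988 | 2.2925 | |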
6.1. Experimental Setup
We evaluate our algorithm by testing on dynamic point clouds from MPEG (Ebner et al., 2018) and JPEG Pleno (Eugene d’Eon and Chou, 2017), including Soldier, Longdress, Loot, Redandblack, and UlliWegner. We randomly choose 6 consecutive frames as the sample data: frames 601-606 in Soldier, frames 1201-1206 in Longdress, frames 1201-1206 in Loot, frames 1501-1506 in Redandblack, and frames 1411-1416 in UlliWegner. The number of points in each frame is about 1 million, so we perform downsampling with a sampling rate of 0.05 prior to the denoising. Because the point clouds in the dataset are clean, we add white Gaussian noise with a range of variance . Then we compare our algorithm with three static point cloud denoising methods: MRPCA (Mattei and Castrodad, 2017), APSS (Guennebaud and Gross, 2007), and RIMLS (Oztireli et al., 2009), where we perform each static denoising method frame by frame independently on dynamic point clouds. Also, we compare with our Baseline scheme for an ablation study, in which we remove the temporal reference by setting in Eq. (19). That is, Baseline performs denoising on each frame independently. Regarding the evaluation metric, we adopt the mean squared error (MSE), i.e., the average Euclidean distance between the denoised point cloud sequence and the ground truth. That is, we take the average of the MSE on frames as our metric. Besides, for the first frames in all the datasets, we set because they have no previous frame.
6.2. Experimental Results
Objective results. We list the denoising results of different methods in Tables 1-4, and mark the lowest MSE in bold. We see that our method outperforms all four compared schemes (the three static point cloud denoising methods and Baseline) on the five datasets under all the noise levels. Specifically, we reduce the average MSE by on average over APSS, on average over RIMLS, on average over MRPCA, and on average over Baseline. This validates the effectiveness of our method. In particular, the improvement over Baseline validates that the temporal correlation we exploit is beneficial to dynamic point cloud denoising. Further, the MSE reduction over Baseline is respectively with increasing noise levels. This indicates that the temporal correlation has more impact at high noise levels, because the interframe difference becomes negligible compared to noise with large variance.
For easier comparison with static point cloud denoising methods, we compute the average MSE on the five datasets under each noise level for different methods. The results are visualized in Fig. 4. We see that we achieve the best performance under various noise levels.
Subjective results. As illustrated in Fig. 5, the proposed method also has competitive visual results, especially in local details and temporal consistency. In order to demonstrate the temporal consistency, instead of the previous chosen frames as in Sec. 6.1, we choose another 6 consecutive frames that exhibit apparent movement in and under noise variance 0.05. We show the visual comparison with APSS and MRPCA because they have comparatively better objective performance as presented in Fig. 4. We see that, our results preserve the local structure and keep the temporal consistency better. For example, in the dataset, the boundary of the left hand in our result is much cleaner than that in APSS, and smoother than that in MRPCA. Also, our result exhibits better temporal consistency in general.
7. Conclusion
While the denoising of static 3D point clouds has been widely studied, it remains a challenge to denoise dynamic point clouds. In order to address the problem, we propose a graph-based method to exploit both the intraframe self-similarity and interframe consistency. Specifically, we propose spatiotemporal graph modeling of patches in dynamic point clouds, and pose a MAP estimation on the underlying patches. The key is to construct intraframe connectivities among searched similar patches within the same frame, as well as interframe connectivities between searched corresponding patches over consecutive frames. We then accordingly cast dynamic point cloud denoising as an optimization problem, which leverages the similar/corresponding patches and a graph-signal smoothness prior based on the constructed graph. Experimental results show that our method outperforms frame-by-frame denoising with state-of-the-art static point cloud denoising approaches.
References
 Alexa et al. (2003) Marc Alexa, Johannes Behr, Daniel Cohen-Or, Shachar Fleishman, David Levin, and Claudio T. Silva. 2003. Computing and Rendering Point Set Surfaces. IEEE Transactions on Visualization and Computer Graphics 9, 1 (2003), 3–15.
 Almonacid et al. (2018) Jonathan Almonacid, Celia Cintas, Claudio Derieux, and Mirtha Lewis. 2018. Point Cloud Denoising using Deep Learning. In Congreso Argentino de Ciencias de la Informática y Desarrollos de Investigación (CACIDI). 1–5.
 Avron et al. (2010) Haim Avron, Andrei Sharf, Chen Greif, and Daniel Cohen-Or. 2010. ℓ1-Sparse Reconstruction of Sharp Point Set Surfaces. ACM Transactions on Graphics (TOG) 29 (10 2010), 135.
 Buades et al. (2005) Antoni Buades, Bartomeu Coll, and JM Morel. 2005. A non-local algorithm for image denoising. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vol. 2. 60–65.
 Chung (1997) Fan RK Chung. 1997. Spectral graph theory. In Conference Board of the Mathematical Sciences. American Mathematical Society.
 Dabov et al. (2007) Kostadin Dabov, Alessandro Foi, Vladimir Katkovnik, and Karen Egiazarian. 2007. Image Denoising by Sparse 3D Transform-Domain Collaborative Filtering. IEEE Transactions on Image Processing (TIP) 16, 8 (Aug 2007), 2080–2095.
 Digne (2012) Julie Digne. 2012. Similarity based filtering of point clouds. In IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). 73–79.
 Dinesh et al. (2018) Chinthaka Dinesh, Gene Cheung, Ivan V. Bajic, and Yang Cheng. 2018. Fast 3D Point Cloud Denoising via Bipartite Graph Approximation & Total Variation. (2018).
 Ebner et al. (2018) Thomas Ebner, Ingo Feldmann, Oliver Schreer, Peter Kauff, and Tanja Unger. 2018. HHI Point cloud dataset of a boxing trainer. http://mpegfs.intevry.fr/MPEG/PCC/DataSets/pointCloud/CfP/. In ISO/IEC JTC1/SC29/WG11 (MPEG2018) input document M42921.
 Eugene d’Eon and Chou (2017) Eugene d’Eon, Bob Harrison, Taos Myers, and Philip A. Chou. 2017. 8i Voxelized Full Bodies, version 2 - A Voxelized Point Cloud Dataset. ISO/IEC JTC1/SC29 Joint WG11/WG1 (MPEG/JPEG) input document m40059/M74006 (January 2017).
 Gao et al. (2018) Xiang Gao, Wei Hu, and Zongming Guo. 2018. Graph-Based Point Cloud Denoising. In IEEE Fourth International Conference on Multimedia Big Data (BigMM). 1–6.
 Guennebaud and Gross (2007) Gaël Guennebaud and Markus Gross. 2007. Algebraic Point Set Surfaces. In ACM SIGGRAPH. 23.
 Hu et al. (2015) Wei Hu, Gene Cheung, Antonio Ortega, and Oscar C. Au. 2015. Multiresolution graph Fourier transform for compression of piecewise smooth images. IEEE Transactions on Image Processing (TIP) 24 (January 2015), 419–433.
 Huang et al. (2013) Hui Huang, Shihao Wu, Minglun Gong, Daniel Cohen-Or, and Hao Zhang. 2013. Edge-Aware Point Set Resampling. ACM Transactions on Graphics (TOG) 32, 1 (2013), 1–12.
 Huang et al. (2009) Wenming Huang, Yuanwang Li, Peizhi Wen, and Xiaojun Wu. 2009. Algorithm for 3D Point Cloud Denoising. In International Conference on Genetic and Evolutionary Computing.
 Hui et al. (2009) Huang Hui, Li Dan, Zhang Hao, Uri Ascher, and Daniel Cohen-Or. 2009. Consolidation of unorganized point clouds for surface reconstruction. In ACM SIGGRAPH Asia.
 Lipman et al. (2007) Yaron Lipman, Daniel Cohen-Or, David Levin, and Hillel Tal-Ezer. 2007. Parameterization-free projection for geometry reconstruction. ACM Transactions on Graphics 26, 3 (2007), 22.
 Mattei and Castrodad (2017) Enrico Mattei and Alexey Castrodad. 2017. Point Cloud Denoising via Moving RPCA. Computer Graphics Forum 36 (11 2017).
 Nguyen et al. (2012) Chuong V Nguyen, Shahram Izadi, and David Lovell. 2012. Modeling Kinect sensor noise for improved 3D reconstruction and tracking. In International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT). 524–530.
 Oztireli et al. (2009) Cengiz Oztireli, Gaël Guennebaud, and Markus Gross. 2009. Feature Preserving Point Set Surfaces based on Non-Linear Kernel Regression. 28 (2009), 493–501.
 Rosman et al. (2013) Guy Rosman, Anastasia Dubrovina, and Ron Kimmel. 2013. Patch-Collaborative Spectral Point-Cloud Denoising. In Computer Graphics Forum, Vol. 32. Wiley Online Library, 1–12.
 Rue and Held (2005) Havard Rue and Leonhard Held. 2005. Gaussian Markov random fields: theory and applications. Chapman and Hall/CRC.
 Rusu and Cousins (2011) Radu Bogdan Rusu and Steve Cousins. 2011. 3D is here: Point Cloud Library (PCL). IEEE International Conference on Robotics and Automation (2011), 1–4.
 Rusu et al. (2008) Radu Bogdan Rusu, Zoltan Csaba Marton, Nico Blodow, Mihai Dolha, and Michael Beetz. 2008. Towards 3D Point cloud based object maps for household environments. Robotics and Autonomous Systems 56, 11 (2008), 927–941.
 Shen et al. (2010) Godwin Shen, Woo Shik Kim, Sunil K. Narang, Antonio Ortega, and Ho Cheon Wey. 2010. Edge-adaptive transforms for efficient depth map coding. In IEEE Picture Coding Symposium (PCS). 566–569.
 Shuman et al. (2013) David I Shuman, Sunil K. Narang, Pascal Frossard, Antonio Ortega, and Pierre Vandergheynst. 2013. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine 30 (2013), 83–98.
 Spielman (2004) DA Spielman. 2004. Lecture 2 of spectral graph theory and its applications.
 Sun et al. (2008) Xianfang Sun, Paul L. Rosin, Ralph R. Martin, and Frank C. Langbein. 2008. Noise in 3D laser range scanner data. IEEE International Conference on Shape Modeling and Applications (2008), 37–45.
 Sun et al. (2015) Yujing Sun, Scott Schaefer, and Wenping Wang. 2015. Denoising point sets via L0 minimization.
 Tulvan et al. (2016) Christian Tulvan, Rufael Mekuria, Zhu Li, and Sebastien Laserre. 2016. Use Cases for Point Cloud Compression. In ISO/IEC JTC1/SC29/WG11 (MPEG) output document N16331.
 Yan and Zhai (2015) Fu Yan and Jinlei Zhai. 2015. Research on scattered points cloud denoising algorithm. In IEEE International Conference on Signal Processing.
 Zeng et al. (2018) Jin Zeng, Gene Cheung, Michael Ng, Jiahao Pang, and Cheng Yang. 2018. 3D point cloud denoising using graph Laplacian regularization of a low dimensional manifold model. arXiv preprint arXiv:1803.07252 (2018).
 Zhang et al. (2014) Cha Zhang, Dinei Florencio, and Charles Loop. 2014. Point cloud attribute compression with graph transform. In IEEE International Conference on Image Processing (ICIP). 2066–2070.