Meshlet Priors for 3D Mesh Reconstruction
Estimating a mesh from an unordered set of sparse, noisy 3D points is a challenging problem that requires carefully selected priors.
Existing hand-crafted priors, such as smoothness regularizers, impose an undesirable trade-off between attenuating noise and preserving local detail.
Recent deep-learning approaches produce impressive results by learning priors directly from the data.
However, the priors are learned at the object level, which makes these algorithms class-specific, and even sensitive to the pose of the object.
We introduce meshlets, small patches of mesh that we use to learn local shape priors.
Meshlets act as a dictionary of local features and thus allow us to use learned priors to reconstruct object meshes in any pose and from unseen classes, even when the noise is large and the samples are sparse.
The ability to capture, represent, and digitally manipulate objects is crucial for a wide range of important applications, from content creation to animation, robotics, and virtual reality. Among the different representations for 3D objects (which also include depth maps, occupancy grids, and point clouds), meshes are particularly appealing.
Estimating meshes of real-world objects, however, is not straightforward, since common capture strategies, such as structured light or multi-view stereo [14, 36], produce point clouds or depth maps instead. These intermediate representations are noisy and sparse, and, when used to estimate a continuous surface, they introduce a trade-off between over-fitting to the noise and over-smoothing.
Traditional methods require hand-crafted priors (\eg, local smoothness) to balance noise and detail, as is the case for Laplacian reconstruction. Figure 1 shows that this balance is difficult to strike when the point cloud is noisy. Recent learning-based methods learn priors directly from a large number of examples [16, 34, 15, 26, 29]. Because of their ability to learn priors directly from the data, these approaches can produce impressive results from both point clouds and a single image. However, they learn priors at the object level, which limits their ability to reconstruct objects from classes not seen during training. They also struggle to disentangle the shape priors from the object pose: state-of-the-art learning methods can fail completely on a rotated point cloud, even though they can reconstruct the same point cloud when its pose resembles that of the training set (rows marked as P in Figure 1). Tatarchenko \etal go further and suggest that many of these methods may actually learn a form of classification and nearest-neighbor retrieval from the dataset, rather than a proper 3D reconstruction. In fact, even for rows P and T in Figure 1, a closer look seems to indicate that AtlasNet and OccNet are reconstructing a different couch and chair from the training set.
We present a learning-based method to extract a 3D mesh from a set of sparse, noisy, unordered points that bridges traditional and learning-based approaches. Our key intuition is to learn geometric priors locally while enforcing their consistency globally. To represent shape priors, we introduce meshlets, small patches of mesh that, loosely speaking, serve as a learned dictionary of local features.
Specifically, we use a variational auto-encoder (VAE) to learn the latent space of meshlets that can be observed in natural shapes; we call these natural meshlets. Learning these local features offers two key advantages. First, it allows us to reconstruct objects from classes never seen in training: at some scale, a couch exhibits local features similar to those of a bunny.
Second, it disentangles the global pose of the object and the parametrization learned by the network, which allows our algorithm to be robust to dramatic changes of the object’s pose, see Figure 1.
To fit the meshlets to a point cloud we minimize their distance to the points, while enforcing that they belong to the latent space of natural meshlets. Therefore, the resulting surface will locally satisfy the priors we learned. However, because the meshlets are optimized independently of each other, the mesh extracted from their union will not be watertight.
Therefore, we define an auxiliary, watertight mesh and propose to use it in an alternating optimization that ensures that the meshlets are consistent with each other and with the observed point cloud.
We show with extensive comparisons that this iterative method outperforms the state of the art on challenging scenarios, such as noisy point clouds in arbitrary poses, as shown in Figure 1. In summary, our contributions are:
We present meshlets, a new way of representing local shape priors in the latent space of a variational autoencoder that is trained on local patches from a dataset of real-world objects.
We propose an alternating optimization which fits meshlets to the measured point samples (enforcing local constraints) while maintaining global consistency for the mesh.
We demonstrate for the first time, to our knowledge, successful reconstructions of 3D meshes from very sparse, noisy point measurements with a category-agnostic, learning-based method.
2 Related Work
Extracting a mesh from a point cloud is an important problem that has been the focus of much research since the early days of computer graphics. Traditional methods such as marching cubes or Ball-Pivoting work well when the noise is small compared to the density of the point cloud.
In general, however, noise does pose issues. One traditional solution, then, is to use the points and the relative normals to compute a signed-distance function whose zero crossing is the desired surface [8, 13, 2, 17, 18]. An alternative is to use hand-crafted priors, such as smoothness of the vertices and normals of the estimated mesh. However, these priors introduce a trade-off between suppressing the noise and preserving sharp features that becomes increasingly brittle for sparser and noisier point clouds (see Figure 1).
Priors can be more effectively learned from data with neural networks. Deep learning methods, for instance, have shown great success in estimating depth maps from images, whether from multiple views [14, 36], stereo, or even a single image [9, 10, 39, 21]. Even meshes can be directly extracted from a single image, provided that the class of the object is known [16, 34, 15].
Rather than requiring manual tinkering with the traditional noise/sharpness trade-off, methods that learn priors to extract meshes from point clouds introduce a new one: generally speaking, the lower the quality of the observations (\eg, strong noise or sparsity of the point cloud), the stronger the priors need to be, which affects the algorithm's ability to generalize to different and unseen classes. For instance, methods that learn local priors are class-agnostic but tend to require dense point clouds with low levels of noise [12, 38, 37, 35].
The recent works of Park \etal, Groueix \etal, and Mescheder \etal produce impressive results even with sparser and potentially noisier data, but fail to generalize to completely new classes of objects. Often they even struggle when the point cloud is in a pose that differs significantly from the training pose, as shown in Figure 1. This issue is due, in part, to the fact that these methods lack a mechanism to enforce geometric constraints at inference time. Our method is class-agnostic thanks to its ability to learn and enforce local priors while minimizing the error with respect to the point cloud at inference time.
The idea of learning priors from data and enforcing geometric constraints at inference time was recently explored for depth map, point cloud, and surface estimation [23, 22, 29]. These approaches use low-dimensional representations that allow for inference-time optimization.
However, these approaches learn priors at the object level [40, 23, 22, 29] and, thus, tend to be category-specific. Our meshlet priors directly encode the (local) shape of the surface instead of a viewer-centric depth. Meshlets are class-agnostic and can be used to learn and enforce priors at different scales.
Key to solving the mesh estimation problem is how to represent the mesh. Different representations exist that are amenable to use with neural networks, but they also tend to be class-specific [31, 3]. One key ingredient of our method is the use of small mesh patches, called meshlets, which simplify the processing of the mesh, among other things. A related approach is the work of Groueix \etal, who also represent the mesh as a collection of large parts, which they call charts. However, their method does not offer a mechanism to enforce global consistency and does not leverage local shape priors.
3 Method
Our goal is to estimate a mesh from a set of unordered, non-oriented points. The task is easy when the point cloud is dense and the noise is low. However, when the quality of the observations degrades, \eg, with sparser or noisier points, the choice of priors and heuristics becomes central. Hand-crafted priors, such as smoothness, introduce a trade-off between overly smooth and noisy reconstructions, as shown in Figure 1 (Lap-low/Lap-high). On the other hand, neural networks can learn priors directly from data, but they introduce other challenges. First, capturing the distribution of generic objects requires training on a large number of examples, possibly larger than what existing datasets can supply. Moreover, generalization can be an issue: the performance of existing learning-based methods quickly degrades when the test objects differ from the ones used in training, as shown in Figure 1.
Finally, it is not straightforward to disentangle object-level priors and the pose of the object. Figure 1 shows that OccNet and AtlasNet, both recent state-of-the-art methods, fail for classes never seen in training, or even when the pose of the object is significantly different from the poses seen in training.
To overcome these issues, we propose to learn priors locally: even if the Stanford Bunny in Figure 1 was never seen in training, its local features are similar to those found in more common objects from the training set. We introduce meshlets, which can be regarded as small patches of a mesh, see Figure 17. Loosely speaking, meshlets act as a dictionary of basic shape features. Meshlets are local and of limited size, and thus offer a simple mechanism to disentangle the (local) priors from the object’s pose. If meshlets are adapted to the point cloud independently of each other, however, they may not result in a watertight surface. Therefore, we explicitly enforce their consistency globally. In the following we describe these two stages, and the overall process to extract a mesh.
3.1 Local Shape Priors with Meshlets
In this section we introduce meshlets and describe how we leverage them to enforce local shape priors. Intuitively, a meshlet is a small patch of mesh deformed to adhere to a region of another, larger mesh; see Figures 17 and 90(a).
To extract a meshlet at a vertex $v$ of mesh $\mathcal{M}$, we first compute a local geodesic parametrization that maps the 3D coordinates of the vertices in a neighborhood of $v$ to coordinates on $T_v$, the plane tangent to $\mathcal{M}$ at $v$. We then re-sample the geodesic distance function at integer coordinates $(i, j)$. This gives us the correspondence between a vertex $(i, j)$ on the meshlet at $v$, and a vertex on the mesh in the neighborhood of $v$.
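As a rough illustration of this local parametrization, the sketch below projects a vertex neighborhood onto the tangent plane at the center vertex. It is a simplification of the paper's procedure: we only build tangent-plane coordinates rather than re-sampling a true geodesic parametrization, and all names are ours.

```python
import numpy as np

def tangent_plane_coords(center, normal, neighbors):
    """Express neighboring vertices in 2D coordinates on the tangent plane.

    A simplified stand-in for the geodesic parametrization: we build an
    orthonormal basis (u, v) of the plane orthogonal to `normal` and
    project each neighbor's offset from `center` onto that basis.
    """
    n = normal / np.linalg.norm(normal)
    # Pick any vector not parallel to n to seed the basis.
    a = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(n, a)
    u /= np.linalg.norm(u)
    v = np.cross(n, u)
    d = neighbors - center                    # offsets from the center vertex
    return np.stack([d @ u, d @ v], axis=1)   # 2D coordinates on the plane
```

Re-sampling the resulting 2D coordinates on an integer grid would then give the meshlet-to-mesh correspondences described above.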
Because they only require a local parametrization computed with respect to the center vertex $v$, meshlets work well even for objects with large, varying curvature. Because they are local, they can learn shape priors that are independent of the pose and class of the object.
Learning local shape priors with meshlets. We want to learn the distribution of “natural meshlets,” \ie, those meshlets that capture the local features of real-world objects. Inspired by recent methods [40, 5], we use a variational auto-encoder (VAE). By training the VAE to reconstruct a large number of meshlets, we force its bottleneck to learn the latent space of natural meshlets. Differently put, vectors sampled on this manifold and fed into the decoder result in natural meshlets. We extract meshlets from objects in the ShapeNet dataset and feed their 3D coordinates for training. However, we first roto-translate the meshlets to bring them into a canonical pose. This transformation is necessary to ensure that similar meshlets sampled from different 3D locations and orientations map to similar regions in the VAE's latent space. More specifically, given a meshlet, we first translate and rotate it so that its center is at the origin and the normal at the center is aligned with the z-axis; then we rotate it around the z-axis so that the local coordinates of the meshlet are aligned with the x and y axes. We call this the canonical pose. A meshlet, then, is completely defined by $T$, the transformation from global to canonical pose, and $z$, the latent vector corresponding to the meshlet in canonical pose:
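The roto-translation to the canonical pose can be sketched as follows. This is our own minimal implementation: it centers the meshlet and aligns its normal with the z-axis via Rodrigues' formula; the final in-plane rotation that aligns the local meshlet axes with x and y is omitted for brevity.

```python
import numpy as np

def rotation_to_z(normal):
    """Rotation matrix that maps `normal` onto the +z axis (Rodrigues' formula)."""
    n = normal / np.linalg.norm(normal)
    z = np.array([0.0, 0.0, 1.0])
    axis = np.cross(n, z)
    s = np.linalg.norm(axis)        # sin of the rotation angle
    c = n @ z                       # cos of the rotation angle
    if s < 1e-9:                    # already (anti-)aligned with z
        return np.eye(3) if c > 0 else np.diag([1.0, -1.0, -1.0])
    axis /= s
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + s * K + (1.0 - c) * (K @ K)

def to_canonical_pose(vertices, center, normal):
    """Translate the meshlet center to the origin and align its normal with z."""
    R = rotation_to_z(normal)
    return (vertices - center) @ R.T
```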
Section 4.2 details the network's architecture. Since we disentangle pose and shape, smoothly traversing the latent space smoothly varies the shape of the reconstructed meshlet, as shown in Figure 17, where we take the latent vectors $z_A$ and $z_B$ corresponding to meshlets A and B, and progressively interpolate between them to get vectors $z_i$. The meshlets reconstructed from the $z_i$'s smoothly interpolate between the shapes of meshlets A and B. (Please use a media-enabled PDF viewer, such as Adobe Reader, to view the animations in Figure 17.)
Fitting a meshlet to 3D points. Assume now that we are given a set of 3D points roughly corresponding to the size of a meshlet (we generalize this to a complete point cloud in Section 3.2). Deforming a natural meshlet to fit it is now straightforward: we simply traverse the latent space learned by the VAE to minimize the distance between the meshlet and the points. Specifically, we take $m^0$, an initialization of the meshlet, and run it through the encoder to find the corresponding latent vector $z^0$. This is the starting point of our optimization. We then freeze the weights of the VAE, compute the error between the meshlet and the points, and take a gradient descent step through the decoder. This brings us to a new point in latent space, $z^1$, and the corresponding meshlet $m^1$, a natural meshlet that is closer to the given 3D points. We iterate until convergence (see Figure 18). We note that, although other approaches have also proposed to optimize the latent vector of a VAE to match some measured samples (\eg, [29, 5, 1, 23]), they do so at the object (or scene) level. Because our method learns local surface patches, and therefore reuses surface priors across different object categories, it can generalize better.
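The latent-space fitting can be illustrated with a toy stand-in: a fixed linear "decoder" and plain gradient descent on the latent vector, with the decoder weights frozen. The real method uses the trained VAE decoder and a distance to the 3D points; the dimensions and learning rate below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the frozen VAE decoder: a fixed linear map from an
# 8-D latent vector to 24 numbers (an 8-vertex meshlet, 3 coords each).
W = rng.normal(size=(24, 8))
decode = lambda z: W @ z

target = rng.normal(size=24)            # the 3D points the meshlet should fit
z = rng.normal(size=8)                  # latent code of the initial meshlet

initial_err = np.linalg.norm(decode(z) - target)
for _ in range(500):
    residual = decode(z) - target       # error between meshlet and points
    z -= 0.01 * (W.T @ residual)        # gradient step in latent space only
final_err = np.linalg.norm(decode(z) - target)
```

Only the latent vector moves; the decoder stays fixed, so every intermediate meshlet remains on the learned manifold.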
3.2 Overall Optimization
Having explained how meshlets can be used to learn local priors and how they can be fit to a set of 3D points, we can now describe the overall algorithm, which is fairly straightforward at its core. We start with $\mathcal{M}^0$, an initial, rough approximation of the complete mesh. This could be a sphere, or any other surface that satisfies our meshlet priors, \ie, one whose meshlets lie on the manifold learned by the VAE. From $\mathcal{M}^0$ we extract overlapping meshlets and find the corresponding transformations and latent vectors (Figure 20). We sample the meshlets densely enough that each vertex of the mesh is covered by multiple meshlets. We also compute the distance between the mesh and the point cloud, whose gradient we can propagate to the meshlets, since we have the correspondences between mesh and meshlets by construction. This allows us to update the meshlets to adapt to the points (Section 3.1).
However, this optimization is performed on each meshlet independently, which results in small gaps between the meshlets (see Figure 19(a)). Therefore, we enforce and maintain global consistency by adding a step in which we deform $\mathcal{M}$ to match the meshlets and update the meshlets to match $\mathcal{M}$. Deforming $\mathcal{M}$ brings it closer to the point cloud; deforming the meshlets forces them to be globally consistent. Finally, we iterate between two steps:
Optimize meshlets to fit the point cloud (3.2.1).
Optimize meshlets and mesh to match each other (3.2.2).
At convergence, the auxiliary mesh $\mathcal{M}$, watertight by construction, is our estimate of the mesh. We now explain the two steps in detail.
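The two alternating steps can be caricatured on raw point sets, with no actual meshes or VAE, just to show the structure of the loop: a pull-toward-points step stands in for the meshlet fit, and a neighbor-blending step stands in for the global-consistency update. All constants below are arbitrary choices of ours.

```python
import numpy as np

def alternate(mesh, point_cloud, steps=50):
    """Toy sketch of the alternating scheme on point sets.

    (1) Local fit: pull per-vertex 'meshlet' copies toward their nearest
        cloud point (stand-in for the latent-space meshlet optimization).
    (2) Global consistency: fold the copies back into a single mesh by
        blending each vertex with its neighbors along the contour.
    """
    for _ in range(steps):
        d = np.linalg.norm(mesh[:, None, :] - point_cloud[None, :, :], axis=2)
        targets = point_cloud[d.argmin(axis=1)]
        meshlets = mesh + 0.5 * (targets - mesh)           # step 1: local fit
        mesh = 0.5 * meshlets + 0.25 * (np.roll(meshlets, 1, axis=0)
                                        + np.roll(meshlets, -1, axis=0))
        # step 2: global consistency (neighbor blending keeps the contour coherent)
    return mesh
```

On a 2D circle of points, a small initial contour grows outward and settles near the data, slightly shrunk by the smoothing step, mirroring how the real method trades off data fit against consistency.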
3.2.1 Enforcing Local Shape Priors
To optimize the meshlets with respect to the point cloud we need to define an error. Unfortunately, the correspondences between the point cloud and the vertices of the meshlets are not readily available. A Chamfer distance, then, is not straightforward to use because, without correspondences, all the points in the point cloud would contribute to the error of all the meshlets, even if they are on opposite sides of the object. However, we do have the correspondences between the vertices of $\mathcal{M}$ and the meshlets. Therefore, we compute the Chamfer distance between the point cloud and the mesh $\mathcal{M}$ instead:
$$E_{PC} = \frac{1}{|PC|}\sum_{p \in PC} \min_{v \in \mathcal{M}} \|p - v\|_2^2 + \frac{1}{|\mathcal{M}|}\sum_{v \in \mathcal{M}} \min_{p \in PC} \|p - v\|_2^2, \qquad (2)$$
where $p$ is a 3D point in the input point cloud PC. Equation 2 gives us a per-vertex error on the mesh, which we can propagate to the corresponding meshlets. We then update the meshlets to minimize this error, as explained in Section 3.1, and obtain a new set of natural meshlets.
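A direct, brute-force implementation of this symmetric Chamfer distance between a point cloud and the mesh vertices might look as follows (function and variable names are ours; the exact weighting in the paper's formulation may differ):

```python
import numpy as np

def chamfer(points, vertices):
    """Symmetric Chamfer distance between a point cloud and mesh vertices.

    Also returns the per-vertex nearest-point distances, the term that can
    be propagated back to the meshlets through the mesh-meshlet
    correspondences.
    """
    d = np.linalg.norm(points[:, None, :] - vertices[None, :, :], axis=2)
    point_to_mesh = d.min(axis=1)       # each point -> closest mesh vertex
    vertex_to_points = d.min(axis=0)    # each vertex -> closest cloud point
    total = point_to_mesh.mean() + vertex_to_points.mean()
    return total, vertex_to_points      # per-vertex error for the meshlets
```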
3.2.2 Enforcing Global Consistency
To enforce that the meshlets are globally consistent, \ie, that their union is a watertight mesh, we once again use the auxiliary mesh $\mathcal{M}$. Specifically, we compute the Chamfer distance between the vertices of $\mathcal{M}$ and the vertices of all the meshlets as:
$$E_{cons} = \frac{1}{|\mathcal{M}|}\sum_{v \in \mathcal{M}} \min_{u \in \cup_i m_i} \|v - u\|_2^2 + \frac{1}{|\cup_i m_i|}\sum_{u \in \cup_i m_i} \min_{v \in \mathcal{M}} \|v - u\|_2^2. \qquad (3)$$
First, we keep the meshlets fixed and deform $\mathcal{M}$ to minimize this distance. Then we fix the resulting mesh and adjust the meshlets with the algorithm described in Section 3.1, but this time to minimize Equation 3. We iterate until Equation 3 is minimized. At this point the meshlets are consistent with the mesh and, in turn, globally consistent with each other. This process corresponds to the block “global consistency” in Figure 19.
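One round of this stage can be sketched on bare vertex sets. In the real method the meshlet update goes through the VAE latent space so that the meshlets stay natural; here both updates are plain point moves, and the step size is an arbitrary choice of ours.

```python
import numpy as np

def consistency_step(mesh_v, meshlet_v, step=0.5):
    """One round of the global-consistency stage.

    First pull the mesh vertices toward the (fixed) meshlet vertices,
    then pull the meshlet vertices toward the updated mesh.
    """
    d = np.linalg.norm(mesh_v[:, None] - meshlet_v[None, :], axis=2)
    mesh_v = mesh_v + step * (meshlet_v[d.argmin(axis=1)] - mesh_v)
    d = np.linalg.norm(meshlet_v[:, None] - mesh_v[None, :], axis=2)
    meshlet_v = meshlet_v + step * (mesh_v[d.argmin(axis=1)] - meshlet_v)
    return mesh_v, meshlet_v
```

Repeating this step shrinks the gap between mesh and meshlets, which is exactly what closing the inter-meshlet gaps of Figure 19(a) requires.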
4 Implementation Details
We start by describing a few details that improve the efficiency of the optimization procedure or the quality of the resulting meshes.
Mesh initialization. The auxiliary mesh $\mathcal{M}^0$ can be any genus-zero mesh that satisfies the meshlets' priors (see Section 3.2). The actual choice, however, does have a bearing on the number of iterations required to converge. We initialize our approach with an overly-smoothed Laplacian reconstruction. Empirically, we have observed that the results of our algorithm initialized in this way are effectively indistinguishable from those obtained by using a sphere as initialization; convergence, however, takes a fraction of the time. For reference, we show a few examples of $\mathcal{M}^0$ in the Supplementary.
Meshlets re-sampling. As the optimization progresses, the shape and the size of the auxiliary mesh may change significantly. On the one hand, this is a desirable behavior: if the mesh can scale to arbitrary sizes, it can properly match the size of the underlying mesh, even when the initialization is far from it. On the other hand, it results in a sparser meshlet coverage and, potentially, no coverage in some areas. Moreover, it could cause meshlets to be overly-stretched. Therefore, every 20 iterations of enforcing local shape priors and global consistency (blue arrow in Figure 19), we re-sample the meshlets on the current mesh.
Re-meshing. Large changes from the initialization may also cause issues for the mesh itself, which may stretch in some regions or become otherwise irregular. One way to prevent this is to use strong smoothness priors when enforcing global consistency, but that would hinder our ability to reconstruct sharp features. Instead, at the end of every iteration, we re-mesh using Screened Poisson Reconstruction to encourage smoothness while respecting the priors enforced by our approach, \ie, preserving the sharpness of local features. We provide more details in the Supplementary.
4.2 Meshlet training
To train the meshlet network, we sample meshlets from the ShapeNet dataset. We extract meshlets by randomly selecting objects across several classes. We then apply three different scales to each object and extract 256 meshlets per scale, so that our meshlet dataset captures both fine and coarse details. Note that we disregard problematic meshlets: specifically, we use the geodesic distance algorithm by Melvær \etal and reject those meshlets for which the geodesic distance computation results in a large anisotropic stretch, or fails altogether. The network, then, is trained to reconstruct these meshlets. In all of our experiments we use meshlets of a fixed size. To learn the latent space of natural meshlets, we use a fully-connected encoder-decoder network that takes as input a vectorized version of the meshlet. The encoder and the decoder are symmetric, with 6 layers each, and the latent code vector is one third of the input dimension.
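The layer layout can be sketched as below. The meshlet grid size is not stated in this excerpt, so a 9x9 grid (243 inputs) is an assumption of ours, the hidden widths are ours, and the variational part of the VAE (mean/variance heads and sampling) is omitted; only the layer counts and the one-third latent size follow the text.

```python
import numpy as np

rng = np.random.default_rng(1)

def mlp(dims):
    """Random fully-connected layers (weights only; a stand-in for training)."""
    return [rng.normal(scale=0.1, size=(dims[i + 1], dims[i]))
            for i in range(len(dims) - 1)]

def forward(layers, x):
    for W in layers[:-1]:
        x = np.maximum(W @ x, 0.0)   # ReLU hidden layers
    return layers[-1] @ x            # linear output layer

D = 9 * 9 * 3    # assumed 9x9 meshlet grid, 3 coordinates per vertex -> 243
Z = D // 3       # latent code is one third of the input dimension

# Six layers each for the symmetric encoder and decoder.
enc_dims = [D, 220, 200, 180, 160, 140, Z]
dec_dims = enc_dims[::-1]
encoder, decoder = mlp(enc_dims), mlp(dec_dims)

meshlet = rng.normal(size=D)         # vectorized meshlet
recon = forward(decoder, forward(encoder, meshlet))
```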
5 Results
In this section, we evaluate our method against state-of-the-art approaches. We then compare our meshlets to other local shape priors to validate their importance.
We compare our method with several state-of-the-art mesh reconstruction approaches. The first is Screened Poisson, a widely used, traditional technique that creates watertight surfaces from oriented point clouds. Because our input is a raw point cloud, we need to estimate normals, for which we use two methods. One is MeshLab's normal estimation, which fits local planes and uses them to estimate normals. The other is a recently published, learning-based method called PCPNet, which estimates geometric properties such as normals and curvature from local regions of the point cloud. The second mesh reconstruction approach is the method by Öztireli \etal, which also requires oriented points; they propose a variant of marching cubes that preserves sharp features using non-linear regression. In addition, we compare against Laplacian mesh optimization. Leveraging the fact that the norm of the mesh's Laplacian captures the local mean curvature, this algorithm optimizes the Laplacian at the vertices in a weighted least-squares sense. The algorithm has a free parameter that regulates the smoothness of the resulting surface. After a parameter sweep, we found that no single value yields the best results over the whole dataset; therefore, we settled on two values, each offering a different compromise between denoising and over-smoothing.
The data. We test all the methods on 20 objects. To validate that our method generalizes well, we also include four objects that are commonly used by the graphics community (Suzanne, the Stanford Bunny, the Armadillo, and the Utah Teapot). We select the remaining meshes from the test set of the ShapeNet dataset. We show all the objects in the Supplementary. Because the ShapeNet meshes are not always watertight, we pre-process them with a simple algorithm described in the Supplementary. Given the watertight meshes, we randomly decimate the number of vertices by different factors, obtaining three sparsity levels. For each sparsity level we also add an increasingly large amount of Gaussian noise.
We describe the parameters we use, and offer visualization of the different levels of noise in the Supplementary.
Numerical evaluation. For our numerical evaluation, we use the symmetric Hausdorff distance, which reports the largest vertex reconstruction error for each mesh, and the Chamfer distance, which computes the distance between two meshes after assigning correspondences based on closest vertices. Table 1 shows that our method performs consistently better than all competitors. The gap is most apparent when comparing against deep-learning methods that learn priors at the object level, further suggesting that our strategy of learning local priors is a promising direction. We list the numbers for each object across the different noise settings in the Supplementary.
| Method | Hausdorff | Chamfer |
|---|---|---|
| MeshLab + Scr. Pois. | 0.0285 / 0.0112 | 0.339 / 0.102 |
| MeshLab + RIMLS | 0.0177 / 0.0166 | 0.149 / 0.148 |
| PCPNet + Scr. Pois. | 0.0122 / 0.0109 | 0.147 / 0.140 |
| PCPNet + RIMLS | 0.0181 / 0.0176 | 0.151 / 0.153 |
| Laplacian (low) | 0.0104 / 0.0103 | 0.100 / 0.065 |
| Laplacian (high) | 0.0096 / 0.0094 | 0.103 / 0.069 |
| Deep Geometric Prior | 0.0128 / 0.0130 | 0.147 / 0.148 |
| AtlasNet | 0.0415 / 0.0377 | 0.293 / 0.263 |
| OccNet | 0.0630 / 0.0627 | 0.304 / 0.285 |
| Ours | 0.0090 / 0.0092 | 0.054 / 0.047 |
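For reference, the symmetric Hausdorff distance used in the evaluation can be computed with a brute-force pass over the two vertex sets (function name ours):

```python
import numpy as np

def symmetric_hausdorff(A, B):
    """Largest nearest-neighbor distance, taken over both directions.

    A and B are (N, 3) and (M, 3) arrays of vertex positions.
    """
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    a_to_b = d.min(axis=1).max()   # worst-covered vertex of A
    b_to_a = d.min(axis=0).max()   # worst-covered vertex of B
    return max(a_to_b, b_to_a)
```

For large meshes, a KD-tree nearest-neighbor query replaces the quadratic distance matrix, but the definition is the same.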
In Figure 90, we show both the meshlets at the end of our optimization (a) and the final mesh reconstructed by our algorithm (b). Note that, thanks to the re-meshing steps during our optimization (Section 4.1), our output is a high-quality, regular mesh. We also show a subset of the objects used in the numerical evaluation in Figures 1 and 87, for different levels of sparsity and noise; additional results are in the Supplementary. Competing methods are significantly impacted by noise and produce overly smooth results to attenuate its effect. For example, the Laplacian reconstructions obtained with low regularization (Lap-low) are still noisy, while those with high regularization (Lap-high) are over-smoothed. Even Screened Poisson reconstruction, the de facto standard among traditional methods, used in conjunction with PCPNet to estimate the normals, produces visibly noisy results. Finally, as also shown in Figure 1, state-of-the-art deep learning methods only work on objects seen in training and for low levels of noise. Our results, on the other hand, offer the best trade-off between detail and noise, recovering locally sharp features and small details despite the sparsity and noise of the point clouds.
On the importance of natural meshlets.
Our meshlet priors are at the core of our reconstruction. Here we compare our natural meshlets with other shape priors to isolate their contribution to the overall quality of the result. The first is a Laplacian regularizer, a standard smoothness prior. The second is the recent work by Williams \etal, which suggests that a neural network is, in itself, a prior for local geometry. We use these priors, as well as our natural meshlet prior, to fit small patches of mesh to small point clouds extracted from real objects.
Figure 99 shows two representative examples. Despite the complexity of the local shape and the level of noise, the optimization that uses our strategy (Section 3.1) correctly estimates the underlying meshlet. On the contrary, Deep Geometric Prior over-fits to the noise, and the Laplacian regularizer over-smooths the surface. Across these meshlets, our method also achieves a lower average symmetric Hausdorff distance than both DGP and the Laplacian regularizer.
6 Discussion and Limitations
Our approach optimizes a mesh and a number of meshlets based on the gradients available at the mesh vertices, while enforcing the meshlet priors. In this paper, the gradients for the mesh are obtained by computing the distance of the mesh to the point cloud. However, our method can take gradients from any source, including a differentiable renderer, which adds to the flexibility of our approach.
In our work we learn and enforce priors for mesh estimation using meshlets, which have an intrinsic scale and resolution. Our current approach uses a single, fixed meshlet scale for all object reconstructions, although we effectively learn meshlets at multiple scales (see Section 4.2). This limits the level of detail we can reconstruct: features cannot be smaller than the resolution of the meshlet. Using a single meshlet scale throughout the mesh deformation process may also lead to local minima. A natural extension, then, would be a coarse-to-fine approach.
Our current approach is computationally expensive and not optimized for speed: depending on the initialization, the full optimization can take from hours to dozens of hours. Several existing acceleration techniques for finding correspondences between the mesh and the meshlets would help, as would improving the efficiency of the meshlet extraction.
We have presented a novel geometrical representation, meshlets, which allows us to robustly reconstruct 3D meshes from sparse, noisy point samples. By training a variational autoencoder to learn the low-dimensional manifold of natural local surface patches, meshlets provide us with a strong prior that can be used to properly reconstruct geometry from sparse samples. However, because meshlets are localized representations, optimizing them independently would result in an inconsistent surface. Therefore, we propose an alternating optimization which first optimizes the meshlets to match the point samples and then enforces consistency across all of them to match the global shape of the reconstructed mesh. The resulting algorithm is able to reconstruct surfaces from very sparse and noisy samples more reliably than state-of-the-art approaches.
We thank Kihwan Kim, Alejandro Troccoli, and Ben Eckart for helpful discussions about evaluation. We also thank Arash Vahdat for valuable discussions on VAE training. Abhishek was supported by the NVIDIA Fellowship. This work was partially funded by National Science Foundation grant #IIS-1619376.
1 Additional Experimentation Details
In this section we give additional details about the experiments shown in the main paper.
1.1 Generating Watertight Meshes
We use meshlets extracted from the ShapeNet dataset for training. The test dataset for mesh reconstruction is formed by selecting objects from the test set of the ShapeNet dataset, as well as a few objects from outside ShapeNet. However, most ShapeNet objects are not watertight. To generate watertight meshes, we use the process described by Stutz and Geiger. First, the object is scaled to a canonical bounding volume. Next, depth maps are rendered from 200 views and used to perform TSDF fusion. Finally, a mesh simplification step is performed with MeshLab to produce a final mesh with 50k vertices, roughly uniform over the surface of the mesh.
1.2 Noise, Sparsity and Outliers Parameters used in Experiments
To test the different approaches, we designed three settings for noise and sparsity. Given a ground-truth object with 50k vertices, we first randomly sample a fraction of the vertices to obtain a sparse point cloud. We then add Gaussian noise to each point in the sparse point cloud.
The three settings, S1, S2, and S3, combine progressively sparser sampling with progressively larger noise.
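The test-data generation described above can be sketched as follows; `keep_frac` and `noise_sigma` are placeholders for the per-setting values, which are not reproduced here.

```python
import numpy as np

def make_test_cloud(vertices, keep_frac, noise_sigma, rng):
    """Decimate a vertex set and perturb it with isotropic Gaussian noise.

    `keep_frac` is the fraction of vertices to keep; `noise_sigma` is the
    per-coordinate noise standard deviation. Both are setting-dependent.
    """
    n_keep = max(1, int(len(vertices) * keep_frac))
    idx = rng.choice(len(vertices), size=n_keep, replace=False)
    sparse = vertices[idx]
    return sparse + rng.normal(scale=noise_sigma, size=sparse.shape)
```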
To appreciate the level of noise we provide a visualization of these noise settings in Figure 40.
1.3 Test dataset and Initialization
In Figure 21 we show all the ground-truth objects used for the mesh reconstruction evaluation. We also show the initial mesh $\mathcal{M}^0$. Note that we used the exact same $\mathcal{M}^0$ to initialize both our method and the Laplacian mesh optimization.
2 Additional Results
Figure 162 shows additional qualitative comparisons between different approaches.
3 Additional Algorithm Details
The pseudo-code for our optimization procedure, which estimates a watertight mesh while enforcing the meshlet priors, is given in Algorithm 1.
In our optimization procedure, we update both the meshlets and the auxiliary mesh. While the meshlet priors make the meshlet updates stable, a prior is also needed when updating the mesh, to ensure that it remains watertight and that its vertices are uniformly distributed. Using smoothness or other hand-crafted priors for the mesh would hinder our ability to reconstruct sharp features. Hence, at the end of every iteration, we use Screened Poisson Reconstruction, taking the vertices and normals of the globally consistent meshlets to update the mesh.
- This work was done while Abhishek Badki was interning at NVIDIA.
- (2018) Modeling facial geometry using compositional VAEs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (1995) Automatic reconstruction of surfaces and scalar fields from 3D scans. In Proceedings of SIGGRAPH.
- (2018) Multi-chart generative surface modeling. ACM Transactions on Graphics.
- (1999) The ball-pivoting algorithm for surface reconstruction. IEEE Transactions on Visualization and Computer Graphics (TVCG).
- (2018) CodeSLAM: learning a compact, optimisable representation for dense visual SLAM. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (2015) ShapeNet: An information-rich 3D model repository. Technical Report arXiv:1512.03012.
- (2008) MeshLab: An open-source mesh processing tool. In Eurographics Italian Chapter Conference.
- (1996) A volumetric method for building complex models from range images. In ACM Transactions on Graphics (SIGGRAPH).
- (2014) Depth map prediction from a single image using a multi-scale deep network. In Advances in Neural Information Processing Systems (NIPS).
- (2017) Unsupervised monocular depth estimation with left-right consistency. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (2018) A papier-mâché approach to learning 3D surface generation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (2018) PCPNet: Learning local shape properties from raw point clouds. In Computer Graphics Forum.
- (1992) Surface reconstruction from unorganized points. In Proceedings of SIGGRAPH.
- (2018) DeepMVS: Learning multi-view stereopsis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (2018) Learning category-specific mesh reconstruction from image collections. In Proceedings of the European Conference on Computer Vision (ECCV).
- (2018) Neural 3D mesh renderer. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (2006) Poisson surface reconstruction. In Eurographics Symposium on Geometry Processing.
- (2013) Screened Poisson surface reconstruction. ACM Transactions on Graphics.
- (2017) End-to-end learning of geometry and context for deep stereo regression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (2013) Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.
- (2019) Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer. arXiv preprint arXiv:1907.01341.
- (2018) Deformable shape completion with graph convolutional autoencoders. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (1987) Marching cubes: A high resolution 3D surface construction algorithm. In Proceedings of SIGGRAPH.
- (2012) Geodesic polar coordinates on polygonal meshes. In Computer Graphics Forum.
- (2019) Occupancy networks: Learning 3D reconstruction in function space. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (2006) Laplacian mesh optimization. In ACM International Conference on Computer Graphics and Interactive Techniques (GRAPHITE).
- (2009) Feature preserving point set surfaces based on non-linear kernel regression. In Computer Graphics Forum.
- (2019) DeepSDF: Learning continuous signed distance functions for shape representation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (2003) High-accuracy stereo depth maps using structured light. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (2017) SurfNet: Generating 3D shape surfaces using deep residual networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (2018) Learning 3D shape completion under weak supervision. CoRR.
- (2019) What do single-view 3D reconstruction networks learn? In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (2018) Pixel2Mesh: Generating 3D mesh models from single RGB images. In Proceedings of the European Conference on Computer Vision (ECCV).
- (2019) Deep geometric prior for surface reconstruction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (2018) MVSNet: Depth inference for unstructured multi-view stereo. In Proceedings of the European Conference on Computer Vision (ECCV).
- (2018) EC-Net: An edge-aware point set consolidation network. In Proceedings of the European Conference on Computer Vision (ECCV).
- (2018) PU-Net: Point cloud upsampling network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (2017) Unsupervised learning of depth and ego-motion from video. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- (2018) Object-centric photometric bundle adjustment with deep shape prior. In IEEE Winter Conference on Applications of Computer Vision (WACV).