Masked Surfel Prediction for Self-Supervised Point Cloud Learning
- URL: http://arxiv.org/abs/2207.03111v1
- Date: Thu, 7 Jul 2022 06:47:26 GMT
- Title: Masked Surfel Prediction for Self-Supervised Point Cloud Learning
- Authors: Yabin Zhang, Jiehong Lin, Chenhang He, Yongwei Chen, Kui Jia, Lei Zhang
- Abstract summary: We make the first attempt to incorporate local geometry information explicitly into masked auto-encoding, and propose a novel Masked Surfel Prediction (MaskSurf) method.
Specifically, given the input point cloud masked at a high ratio, we learn a transformer-based encoder-decoder network to estimate the underlying masked surfels.
MaskSurf is validated on six downstream tasks under three fine-tuning strategies.
- Score: 40.16043026141161
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Masked auto-encoding is a popular and effective self-supervised learning
approach to point cloud learning. However, most of the existing methods
reconstruct only the masked points and overlook the local geometry information,
which is also important for understanding point cloud data. In this work, we
make the first attempt, to the best of our knowledge, to incorporate local
geometry information explicitly into masked auto-encoding, and propose a
novel Masked Surfel Prediction (MaskSurf) method. Specifically, given the input
point cloud masked at a high ratio, we learn a transformer-based
encoder-decoder network to estimate the underlying masked surfels by
simultaneously predicting the surfel positions (i.e., points) and per-surfel
orientations (i.e., normals). The predictions of points and normals are
supervised by the Chamfer Distance and a newly introduced Position-Indexed
Normal Distance in a set-to-set manner. Our MaskSurf is validated on six
downstream tasks under three fine-tuning strategies. In particular, MaskSurf
outperforms its closest competitor, Point-MAE, by 1.2% on the real-world
dataset of ScanObjectNN under the OBJ-BG setting, justifying the advantages of
masked surfel prediction over masked point cloud reconstruction. Code will be
available at https://github.com/YBZh/MaskSurf.
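The two supervision signals named in the abstract can be illustrated with a minimal pure-Python sketch. The Chamfer Distance below is the standard symmetric form with squared Euclidean distances. The normal term is only one plausible reading of "Position-Indexed Normal Distance": each normal is compared with the normal of the nearest neighbour found on positions. The sign-invariant gap 1 - |cos| and all function names here are assumptions made for illustration, not taken from the paper or its repository.

```python
import math


def _sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def _nearest(p, points):
    """Index of the point in `points` that is closest to `p`."""
    return min(range(len(points)), key=lambda j: _sq_dist(p, points[j]))


def chamfer_distance(pred, gt):
    """Symmetric Chamfer Distance, averaged over each set."""
    pred_to_gt = sum(_sq_dist(p, gt[_nearest(p, gt)]) for p in pred) / len(pred)
    gt_to_pred = sum(_sq_dist(g, pred[_nearest(g, pred)]) for g in gt) / len(gt)
    return pred_to_gt + gt_to_pred


def _normal_gap(n1, n2):
    """1 - |cos(angle)|: zero for parallel or anti-parallel normals.

    Ignoring the sign of the normal is an assumption of this sketch.
    Both normals must be non-zero vectors.
    """
    dot = sum(a * b for a, b in zip(n1, n2))
    norms = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    return 1.0 - abs(dot) / norms


def position_indexed_normal_distance(pred_pts, pred_nrm, gt_pts, gt_nrm):
    """Compare normals across the nearest-neighbour pairs found on positions."""
    pred_to_gt = sum(
        _normal_gap(pred_nrm[i], gt_nrm[_nearest(p, gt_pts)])
        for i, p in enumerate(pred_pts)
    ) / len(pred_pts)
    gt_to_pred = sum(
        _normal_gap(gt_nrm[j], pred_nrm[_nearest(g, pred_pts)])
        for j, g in enumerate(gt_pts)
    ) / len(gt_pts)
    return pred_to_gt + gt_to_pred


# Toy surfels: two ground-truth points on the z = 0 plane and two predictions.
gt_pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt_nrm = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
pred_pts = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.0)]
pred_nrm = [(0.0, 0.0, -1.0), (0.0, 1.0, 0.0)]

cd = chamfer_distance(pred_pts, gt_pts)
nd = position_indexed_normal_distance(pred_pts, pred_nrm, gt_pts, gt_nrm)
print(cd, nd)
```

In this toy case the first predicted normal is flipped relative to the ground truth and contributes no penalty, while the second is perpendicular and contributes the maximum gap. A training loss would typically combine the two terms with a weighting factor, which is left out here.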
Related papers
- MGMap: Mask-Guided Learning for Online Vectorized HD Map Construction [17.16231247910372]
We propose MGMap, a mask-guided approach that effectively highlights the informative regions and achieves precise map element localization.
Specifically, MGMap employs learned masks based on the enhanced multi-scale BEV features from two perspectives.
Compared to the baselines, our proposed MGMap achieves a notable improvement of around 10 mAP for different input modalities.
arXiv Detail & Related papers (2024-04-01T03:13:32Z)
- CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation [60.0893353960514]
We study the task of weakly-supervised point cloud semantic segmentation with sparse annotations.
We propose a Contextual Point Cloud Modeling (CPCM) method that consists of two parts: a region-wise masking (RegionMask) strategy and a contextual masked training (CMT) method.
arXiv Detail & Related papers (2023-07-19T04:41:18Z)
- Self-supervised adversarial masking for 3D point cloud representation learning [0.38233569758620056]
We introduce PointCAM, a novel adversarial method for learning a masking function for point clouds.
Compared to previous techniques, we postulate applying an auxiliary network that learns how to select masks instead of choosing them randomly.
Our results show that the learned masking function achieves state-of-the-art or competitive performance on various downstream tasks.
arXiv Detail & Related papers (2023-07-11T15:11:06Z)
- GeoMAE: Masked Geometric Target Prediction for Self-supervised Point Cloud Pre-Training [16.825524577372473]
We introduce a point cloud representation learning framework, based on geometric feature reconstruction.
We identify three self-supervised learning objectives peculiar to point clouds, namely centroid prediction, normal estimation, and curvature prediction.
Our pipeline is conceptually simple and it consists of two major steps: first, it randomly masks out groups of points, followed by a Transformer-based point cloud encoder.
arXiv Detail & Related papers (2023-05-15T17:14:55Z)
- Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding [106.0876425365599]
Masked Shape Prediction (MSP) is a new framework to conduct masked signal modeling in 3D scenes.
MSP uses the essential 3D semantic cue, i.e., geometric shape, as the prediction target for masked points.
arXiv Detail & Related papers (2023-05-08T20:09:19Z)
- MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition [160.49403075559158]
We propose a Masked Pseudo-Labeling autoEncoder (MAPLE) framework for point cloud action recognition.
In particular, we design a novel and efficient Decoupled spatial-temporal TransFormer (DestFormer) as the backbone of MAPLE.
MAPLE achieves superior results on three public benchmarks and outperforms the state-of-the-art method by 8.08% accuracy on the MSR-Action3
arXiv Detail & Related papers (2022-09-01T12:32:40Z)
- Layered Depth Refinement with Mask Guidance [61.10654666344419]
We formulate a novel problem of mask-guided depth refinement that utilizes a generic mask to refine the depth prediction of SIDE models.
Our framework performs layered refinement and inpainting/outpainting, decomposing the depth map into two separate layers signified by the mask and the inverse mask.
We empirically show that our method is robust to different types of masks and initial depth predictions, accurately refining depth values in inner and outer mask boundary regions.
arXiv Detail & Related papers (2022-06-07T06:42:44Z)
- Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training [56.81809311892475]
Masked Autoencoders (MAE) have shown great potential in self-supervised pre-training for language and 2D image transformers.
We propose Point-M2AE, a strong Multi-scale MAE pre-training framework for hierarchical self-supervised learning of 3D point clouds.
arXiv Detail & Related papers (2022-05-28T11:22:53Z)
- Masked Discrimination for Self-Supervised Learning on Point Clouds [27.652157544218234]
Masked autoencoding has achieved great success for self-supervised learning in the image and language domains.
Standard backbones like PointNet are unable to properly handle the training versus testing distribution mismatch introduced by masking during training.
We bridge this gap by proposing a discriminative mask pretraining Transformer framework, MaskPoint, for point clouds.
arXiv Detail & Related papers (2022-03-21T17:57:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.