Point2Vec for Self-Supervised Representation Learning on Point Clouds
- URL: http://arxiv.org/abs/2303.16570v2
- Date: Wed, 11 Oct 2023 10:41:11 GMT
- Title: Point2Vec for Self-Supervised Representation Learning on Point Clouds
- Authors: Karim Abou Zeid, Jonas Schult, Alexander Hermans, and Bastian Leibe
- Abstract summary: We extend data2vec to the point cloud domain and report encouraging results on several downstream tasks.
We propose point2vec, which unleashes the full potential of data2vec-like pre-training on point clouds.
- Score: 66.53955515020053
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, the self-supervised learning framework data2vec has shown inspiring
performance for various modalities using a masked student-teacher approach.
However, it remains open whether such a framework generalizes to the unique
challenges of 3D point clouds. To answer this question, we extend data2vec to
the point cloud domain and report encouraging results on several downstream
tasks. In an in-depth analysis, we discover that the leakage of positional
information reveals the overall object shape to the student even under heavy
masking, and thus prevents data2vec from learning strong representations for point
clouds. We address this 3D-specific shortcoming by proposing point2vec, which
unleashes the full potential of data2vec-like pre-training on point clouds. Our
experiments show that point2vec outperforms other self-supervised methods on
shape classification and few-shot learning on ModelNet40 and ScanObjectNN,
while achieving competitive results on part segmentation on ShapeNetParts.
These results suggest that the learned representations are strong and
transferable, highlighting point2vec as a promising direction for
self-supervised learning of point cloud representations.
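To make the pre-training scheme referenced in the abstract more concrete, below is a minimal PyTorch sketch (not the authors' code) of data2vec-style masked student-teacher pre-training adapted to point-cloud patch embeddings. All names and hyperparameters here (Encoder, ShallowDecoder, the 384-dimensional tokens, the 0.65 mask ratio, the EMA rate) are illustrative assumptions. The sketch only shows the general idea suggested by the abstract's diagnosis of positional-information leakage: the student encodes visible patches only, and positions of masked patches are introduced solely in a shallow decoder. As a simplification, the teacher's final-layer output is used as the regression target, whereas data2vec averages several top layers.

```python
# Hedged sketch of data2vec-like masked student-teacher pre-training on
# point-cloud patch embeddings. Hypothetical module names and sizes.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    """Thin Transformer encoder shared by student and teacher (illustrative sizes)."""
    def __init__(self, dim=384, depth=6, heads=6):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, depth)

    def forward(self, x):
        return self.blocks(x)


class ShallowDecoder(nn.Module):
    """Shallow decoder over visible latents plus learnable mask tokens."""
    def __init__(self, dim=384, depth=2, heads=6):
        super().__init__()
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, depth)

    def forward(self, x):
        return self.blocks(x)


@torch.no_grad()
def ema_update(teacher, student, tau=0.999):
    """Exponential-moving-average teacher update, as in data2vec."""
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(tau).add_(ps, alpha=1.0 - tau)


def pretrain_step(tokens, pos, student, teacher, decoder, mask_ratio=0.65):
    """One masked student-teacher step on point-patch embeddings.

    tokens, pos: (B, N, D) patch embeddings and positional embeddings of
    patch centers, produced upstream (e.g. by a small point network over
    grouped points; not shown here). The teacher encodes the full sequence,
    while the student encodes only the visible patches, so positions of
    masked patches never reach it. A shallow decoder then regresses the
    teacher's latents at the masked locations with a smooth-L1 loss, as in
    data2vec. Simplification: the teacher's final-layer output is the
    target; data2vec averages the outputs of several top layers.
    """
    B, N, D = tokens.shape
    n_mask = int(mask_ratio * N)
    perm = torch.rand(B, N, device=tokens.device).argsort(dim=1)
    masked, visible = perm[:, :n_mask], perm[:, n_mask:]

    def take(x, idx):
        # Gather the rows of x selected by idx along the patch dimension.
        return x.gather(1, idx.unsqueeze(-1).expand(-1, -1, D))

    with torch.no_grad():  # teacher targets from the full, unmasked input
        targets = take(teacher(tokens + pos), masked)

    # Student sees only visible patches (and only their positions).
    latent = student(take(tokens, visible) + take(pos, visible))

    # Mask tokens receive the masked positions inside the shallow decoder.
    mask_tok = decoder.mask_token.expand(B, n_mask, -1) + take(pos, masked)
    pred = decoder(torch.cat([latent, mask_tok], dim=1))[:, -n_mask:]

    return F.smooth_l1_loss(pred, targets)


# Usage sketch: the teacher starts as a frozen copy of the student.
student, decoder = Encoder(), ShallowDecoder()
teacher = copy.deepcopy(student).requires_grad_(False)
tokens, pos = torch.randn(2, 64, 384), torch.randn(2, 64, 384)
loss = pretrain_step(tokens, pos, student, teacher, decoder)
loss.backward()
ema_update(teacher, student)
```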
Related papers
- HVDistill: Transferring Knowledge from Images to Point Clouds via Unsupervised Hybrid-View Distillation [106.09886920774002]
We present a hybrid-view-based knowledge distillation framework, termed HVDistill, to guide the feature learning of a point cloud neural network.
Our method achieves consistent improvements over the baseline trained from scratch and significantly outperforms the existing schemes.
arXiv Detail & Related papers (2024-03-18T14:18:08Z) - Cross-Modal Information-Guided Network using Contrastive Learning for Point Cloud Registration [17.420425069785946]
We present a novel Cross-Modal Information-Guided Network (CMIGNet) for point cloud registration.
We first incorporate images projected from the point clouds and fuse the cross-modal features using an attention mechanism.
We employ two contrastive learning strategies, namely overlapping contrastive learning and cross-modal contrastive learning.
arXiv Detail & Related papers (2023-11-02T12:56:47Z) - Clustering based Point Cloud Representation Learning for 3D Analysis [80.88995099442374]
We propose a clustering-based supervised learning scheme for point cloud analysis.
Unlike the current de facto scene-wise training paradigm, our algorithm conducts within-class clustering on the point embedding space.
Our algorithm shows notable improvements on widely used point cloud segmentation datasets.
arXiv Detail & Related papers (2023-07-27T03:42:12Z) - Explore In-Context Learning for 3D Point Cloud Understanding [71.20912026561484]
We introduce a novel framework, named Point-In-Context, designed especially for in-context learning in 3D point clouds.
We propose the Joint Sampling module, carefully designed to work in tandem with the general point sampling operator.
We conduct extensive experiments to validate the versatility and adaptability of our proposed methods in handling a wide range of tasks.
arXiv Detail & Related papers (2023-06-14T17:53:21Z) - Self-supervised Learning for Pre-Training 3D Point Clouds: A Survey [25.51613543480276]
Self-supervised point cloud representation learning has attracted increasing attention in recent years.
This paper presents a comprehensive survey of self-supervised point cloud representation learning using DNNs.
arXiv Detail & Related papers (2023-05-08T13:20:55Z) - Variational Relational Point Completion Network for Robust 3D Classification [59.80993960827833]
Existing point cloud completion methods tend to generate global shape skeletons and hence lack fine local details.
This paper proposes a variational framework, the Variational Relational point Completion Network (VRCNet), with two appealing properties.
VRCNet shows great generalizability and robustness on real-world point cloud scans.
arXiv Detail & Related papers (2023-04-18T17:03:20Z) - Self-Supervised Feature Learning from Partial Point Clouds via Pose Disentanglement [35.404285596482175]
We propose a novel self-supervised framework to learn informative representations from partial point clouds.
We leverage partial point clouds scanned by LiDAR that contain both content and pose attributes.
Our method not only outperforms existing self-supervised methods, but also shows better generalizability across synthetic and real-world datasets.
arXiv Detail & Related papers (2022-01-09T14:12:50Z) - Unsupervised Representation Learning for 3D Point Cloud Data [66.92077180228634]
We propose a simple yet effective approach for unsupervised point cloud learning.
In particular, we identify a very useful transformation which generates a good contrastive version of an original point cloud.
We conduct experiments on three downstream tasks: 3D object classification, shape part segmentation, and scene segmentation.
arXiv Detail & Related papers (2021-10-13T10:52:45Z) - PnP-3D: A Plug-and-Play for 3D Point Clouds [38.05362492645094]
We propose a plug-and-play module, PnP-3D, to improve the effectiveness of existing networks in analyzing point cloud data.
To thoroughly evaluate our approach, we conduct experiments on three standard point cloud analysis tasks.
In addition to achieving state-of-the-art results, we present comprehensive studies to demonstrate our approach's advantages.
arXiv Detail & Related papers (2021-08-16T23:59:43Z) - DRINet: A Dual-Representation Iterative Learning Network for Point Cloud Segmentation [45.768040873409824]
DRINet serves as the basic network structure for dual-representation learning.
Our network achieves state-of-the-art results for point cloud classification and segmentation tasks.
For large-scale outdoor scenarios, our method outperforms state-of-the-art methods with a real-time inference speed of 62 ms per frame.
arXiv Detail & Related papers (2021-08-09T13:23:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.