Related papers: GeoMask3D: Geometrically Informed Mask Selection for Self-Supervised Point Cloud Learning in 3D

GeoMask3D: Geometrically Informed Mask Selection for Self-Supervised Point Cloud Learning in 3D

URL: http://arxiv.org/abs/2405.12419v1
Date: Mon, 20 May 2024 23:53:42 GMT
Title: GeoMask3D: Geometrically Informed Mask Selection for Self-Supervised Point Cloud Learning in 3D
Authors: Ali Bahri, Moslem Yazdanpanah, Mehrdad Noori, Milad Cheraghalikhani, Gustavo Adolfo Vargas Hakim, David Osowiechi, Farzad Beizaee, Ismail Ben Ayed, Christian Desrosiers,
Abstract summary: We introduce a pioneering approach to self-supervised learning for point clouds. We employ a geometrically informed mask selection strategy called GeoMask3D (GM3D) to boost the efficiency of Masked Autos.
Score: 18.33878596057853
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce a pioneering approach to self-supervised learning for point clouds, employing a geometrically informed mask selection strategy called GeoMask3D (GM3D) to boost the efficiency of Masked Auto Encoders (MAE). Unlike the conventional method of random masking, our technique utilizes a teacher-student model to focus on intricate areas within the data, guiding the model's focus toward regions with higher geometric complexity. This strategy is grounded in the hypothesis that concentrating on harder patches yields a more robust feature representation, as evidenced by the improved performance on downstream tasks. Our method also presents a complete-to-partial feature-level knowledge distillation technique designed to guide the prediction of geometric complexity utilizing a comprehensive context from feature-level information. Extensive experiments confirm our method's superiority over State-Of-The-Art (SOTA) baselines, demonstrating marked improvements in classification, and few-shot tasks.

Related papers

MaskHOI: Robust 3D Hand-Object Interaction Estimation via Masked Pre-training [23.200848479769903]
MaskHOI is a novel Masked Autoencoder-driven pretraining framework for enhanced HOI pose estimation.<n>Our core idea is to leverage the masking-then-reconstruction strategy of MAE to encourage the feature encoder to infer missing spatial and structural information.<n>To enhance the geometric awareness of the pretrained encoder, we introduce a novel Masked Signed Distance Field (SDF)-driven multimodal learning mechanism.
arXiv Detail & Related papers (2025-07-18T05:52:37Z)
Topology-Aware Modeling for Unsupervised Simulation-to-Reality Point Cloud Recognition [63.55828203989405]
We introduce a novel Topology-Aware Modeling (TAM) framework for Sim2Real UDA on object point clouds.<n>Our approach mitigates the domain gap by leveraging global spatial topology, characterized by low-level, high-frequency 3D structures.<n>We propose an advanced self-training strategy that combines cross-domain contrastive learning with self-training.
arXiv Detail & Related papers (2025-06-26T11:53:59Z)
Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries. We propose a novel Priors Distillation (RPD) method to extract priors from the well-trained transformers on massive images. Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
Masked Generative Extractor for Synergistic Representation and 3D Generation of Point Clouds [6.69660410213287]
We propose an innovative framework called Point-MGE to explore the benefits of deeply integrating 3D representation learning and generative learning. In shape classification, Point-MGE achieved an accuracy of 94.2% (+1.0%) on the ModelNet40 dataset and 92.9% (+5.5%) on the ScanObjectNN dataset. Experimental results also confirmed that Point-MGE can generate high-quality 3D shapes in both unconditional and conditional settings.
arXiv Detail & Related papers (2024-06-25T07:57:03Z)
Towards Compact 3D Representations via Point Feature Enhancement Masked Autoencoders [52.66195794216989]
We propose Point Feature Enhancement Masked Autoencoders (Point-FEMAE) to learn compact 3D representations. Point-FEMAE consists of a global branch and a local branch to capture latent semantic features. Our method significantly improves the pre-training efficiency compared to cross-modal alternatives.
arXiv Detail & Related papers (2023-12-17T14:17:05Z)
Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding [106.0876425365599]
Masked Shape Prediction (MSP) is a new framework to conduct masked signal modeling in 3D scenes. MSP uses the essential 3D semantic cue, i.e., geometric shape, as the prediction target for masked points.
arXiv Detail & Related papers (2023-05-08T20:09:19Z)
GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds [72.60362979456035]
Masked Autoencoders (MAE) are challenging to explore in large-scale 3D point clouds. We propose a textbfGenerative textbfDecoder for MAE (GD-MAE) to automatically merges the surrounding context. We demonstrate the efficacy of the proposed method on several large-scale benchmarks: KITTI, and ONCE.
arXiv Detail & Related papers (2022-12-06T14:32:55Z)
Unsupervised Domain Adaptation for Monocular 3D Object Detection via Self-Training [57.25828870799331]
We propose STMono3D, a new self-teaching framework for unsupervised domain adaptation on Mono3D. We develop a teacher-student paradigm to generate adaptive pseudo labels on the target domain. STMono3D achieves remarkable performance on all evaluated datasets and even surpasses fully supervised results on the KITTI 3D object detection dataset.
arXiv Detail & Related papers (2022-04-25T12:23:07Z)
Unsupervised Learning on 3D Point Clouds by Clustering and Contrasting [11.64827192421785]
unsupervised representation learning is a promising direction to auto-extract features without human intervention. This paper proposes a general unsupervised approach, named textbfConClu, to perform the learning of point-wise and global features.
arXiv Detail & Related papers (2022-02-05T12:54:17Z)
Efficient 3D Deep LiDAR Odometry [16.388259779644553]
An efficient 3D point cloud learning architecture, named PWCLO-Net, is first proposed in this paper. The entire architecture is holistically optimized end-to-end to achieve adaptive learning of cost volume and mask.
arXiv Detail & Related papers (2021-11-03T11:09:49Z)
FG-Net: Fast Large-Scale LiDAR Point CloudsUnderstanding Network Leveraging CorrelatedFeature Mining and Geometric-Aware Modelling [15.059508985699575]
FG-Net is a general deep learning framework for large-scale point clouds understanding without voxelizations. We propose a deep convolutional neural network leveraging correlated feature mining and deformable convolution based geometric-aware modelling. Our approaches outperform state-of-the-art approaches in terms of accuracy and efficiency.
arXiv Detail & Related papers (2020-12-17T08:20:09Z)
Attention Model Enhanced Network for Classification of Breast Cancer Image [54.83246945407568]
AMEN is formulated in a multi-branch fashion with pixel-wised attention model and classification submodular. To focus more on subtle detail information, the sample image is enhanced by the pixel-wised attention map generated from former branch. Experiments conducted on three benchmark datasets demonstrate the superiority of the proposed method under various scenarios.
arXiv Detail & Related papers (2020-10-07T08:44:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.