A Review and A Robust Framework of Data-Efficient 3D Scene Parsing with
Traditional/Learned 3D Descriptors
- URL: http://arxiv.org/abs/2312.01262v1
- Date: Sun, 3 Dec 2023 02:51:54 GMT
- Title: A Review and A Robust Framework of Data-Efficient 3D Scene Parsing with
Traditional/Learned 3D Descriptors
- Authors: Kangcheng Liu
- Abstract summary: Existing state-of-the-art 3D point cloud understanding methods perform well only in a fully supervised manner.
This work presents a general and simple framework for tackling point cloud understanding when labels are limited.
- Score: 10.497309421830671
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing state-of-the-art 3D point cloud understanding methods perform
well only in a fully supervised manner. To the best of our knowledge, no unified
framework simultaneously solves the downstream high-level understanding tasks of
both segmentation and detection, especially when labels are extremely limited.
This work presents a general and simple framework for tackling point cloud
understanding when labels are limited. Our first contribution is an extensive
methodological comparison of traditional and learned 3D descriptors for weakly
supervised 3D scene understanding, which validates that our adapted traditional
PFH-based 3D descriptors generalize well across different domains. Our second
contribution is a learning-based region merging strategy driven by the affinity
provided by both the traditional/learned 3D descriptors and the learned
semantics; the merging process takes both low-level geometric and high-level
semantic feature correlations into consideration. Experimental results
demonstrate that our framework achieves the best performance on the three most
important weakly supervised point cloud understanding tasks, namely semantic
segmentation, instance segmentation, and object detection, even when only a
very limited number of points is labeled. Our method, termed Region Merging 3D
(RM3D), achieves superior performance on the ScanNet data-efficient learning
online benchmarks and four other large-scale 3D understanding benchmarks under
various experimental settings, outperforming the current state of the art by a
clear margin on various 3D understanding tasks without complicated learning
strategies such as active learning.
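Two concrete mechanisms carry the abstract's contributions: PFH-family geometric descriptors and a region merging step scored by a combined geometric/semantic affinity. The sketch below is a minimal illustration under stated assumptions, not the authors' RM3D implementation: it substitutes Open3D's FPFH (the fast PFH variant) for the paper's adapted PFH descriptors, and the semantic probabilities `sem_prob`, the mixing weight `w_geo`, and the threshold `tau` are hypothetical placeholders.

```python
# Minimal sketch of the two ingredients the abstract describes; NOT the
# authors' RM3D code. FPFH (Open3D's fast PFH variant) stands in for the
# paper's adapted PFH descriptors; sem_prob, w_geo, and tau are hypothetical.
import numpy as np
import open3d as o3d

def fpfh_descriptors(points: np.ndarray, radius: float = 0.25) -> np.ndarray:
    """Per-point FPFH descriptors, shape (N, 33), from an (N, 3) array."""
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    pcd.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=radius, max_nn=30))
    feat = o3d.pipelines.registration.compute_fpfh_feature(
        pcd, o3d.geometry.KDTreeSearchParamHybrid(radius=2 * radius, max_nn=100))
    return np.asarray(feat.data).T  # Open3D returns (33, N)

def _cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def merge_regions(regions, geo_desc, sem_prob, w_geo=0.5, tau=0.9):
    """Greedily merge regions whose combined affinity exceeds tau.

    regions:  list of point-index arrays (an initial over-segmentation)
    geo_desc: per-point geometric descriptors, e.g. FPFH, shape (N, 33)
    sem_prob: per-point class probabilities from any weak classifier, (N, C)
    """
    merged = True
    while merged:
        merged = False
        for i in range(len(regions)):
            for j in range(i + 1, len(regions)):
                # Affinity mixes low-level geometry with high-level semantics.
                aff = (w_geo * _cos(geo_desc[regions[i]].mean(0),
                                    geo_desc[regions[j]].mean(0))
                       + (1 - w_geo) * _cos(sem_prob[regions[i]].mean(0),
                                            sem_prob[regions[j]].mean(0)))
                if aff > tau:
                    regions[i] = np.concatenate([regions[i], regions[j]])
                    del regions[j]
                    merged = True
                    break
            if merged:
                break
    return regions
```

Hand-weighting the two affinities with a fixed `w_geo` is the simplification here; the paper's learning-based strategy learns how to combine the descriptor affinity with the learned semantics rather than fixing the weight.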
Related papers
- Point-PRC: A Prompt Learning Based Regulation Framework for Generalizable Point Cloud Analysis [39.85109385954641]
Recent works demonstrate that the performance of 3D point cloud recognition can be boosted remarkably by parameter-efficient prompt tuning.
We present a comprehensive regulation framework that allows the learnable prompts to actively interact with the well-learned general knowledge in large 3D models.
Surprisingly, our method not only consistently improves generalization ability but also enhances task-specific 3D recognition performance across various 3DDG benchmarks by a clear margin.
arXiv Detail & Related papers (2024-10-27T10:35:47Z)
- A Data-efficient Framework for Robotics Large-scale LiDAR Scene Parsing [10.497309421830671]
Existing state-of-the-art 3D point cloud understanding methods perform well only in a fully supervised manner.
This work presents a general and simple framework for tackling point cloud understanding when labels are limited.
arXiv Detail & Related papers (2023-12-03T02:38:51Z)
- Generalized Label-Efficient 3D Scene Parsing via Hierarchical Feature Aligned Pre-Training and Region-Aware Fine-tuning [55.517000360348725]
This work presents a framework for dealing with 3D scene understanding when the labeled scenes are quite limited.
To extract knowledge for novel categories from the pre-trained vision-language models, we propose a hierarchical feature-aligned pre-training and knowledge distillation strategy.
Experiments on both indoor and outdoor scenes demonstrate the effectiveness of our approach in both data-efficient learning and open-world few-shot learning.
arXiv Detail & Related papers (2023-09-19T14:29:57Z)
- Cross-modal and Cross-domain Knowledge Transfer for Label-free 3D Segmentation [23.110443633049382]
We propose a novel approach for the challenging cross-modal and cross-domain adaptation task by fully exploring the relationship between images and point clouds.
Our method achieves state-of-the-art performance for 3D point cloud semantic segmentation on SemanticKITTI by using the knowledge of KITTI360 and GTA5.
arXiv Detail & Related papers (2023-09-19T14:29:57Z)
- SL3D: Self-supervised-Self-labeled 3D Recognition [89.19932178712065]
We propose a Self-supervised-Self-Labeled 3D Recognition (SL3D) framework.
SL3D simultaneously solves two coupled objectives, i.e., clustering and learning feature representation.
It can be applied to solve different 3D recognition tasks, including classification, object detection, and semantic segmentation.
arXiv Detail & Related papers (2022-10-30T11:08:25Z)
- 3D Spatial Recognition without Spatially Labeled 3D [127.6254240158249]
We introduce WyPR, a Weakly-supervised framework for Point cloud Recognition.
We show that WyPR can detect and segment objects in point cloud data without access to any spatial labels at training time.
arXiv Detail & Related papers (2021-05-13T17:58:07Z)
- Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection [102.62963605429508]
Point cloud semantic segmentation plays an essential role in autonomous driving.
Current 3D semantic segmentation networks focus on convolutional architectures that perform well for well-represented classes.
We propose a novel Detection Aware 3D Semantic Segmentation (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
arXiv Detail & Related papers (2020-09-22T14:17:40Z)
- PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding [107.02479689909164]
In this work, we aim at facilitating research on 3D representation learning.
We measure the effect of unsupervised pre-training on a large source set of 3D scenes.
arXiv Detail & Related papers (2020-07-21T17:59:22Z)
- SESS: Self-Ensembling Semi-Supervised 3D Object Detection [138.80825169240302]
We propose SESS, a self-ensembling semi-supervised 3D object detection framework. Specifically, we design a thorough perturbation scheme to enhance the generalization of the network on unlabeled and new unseen data.
Our SESS achieves competitive performance compared to the state-of-the-art fully-supervised method using only 50% of the labeled data.
arXiv Detail & Related papers (2019-12-26T08:48:04Z)
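The SESS entry above describes the one mechanism in this list concrete enough to sketch: perturb unlabeled inputs and enforce consistency between a student network and an exponential-moving-average teacher. The sketch below is a generic mean-teacher illustration, not SESS's actual scheme; the sub-sampling-plus-jitter perturbation, the MSE consistency loss, and the momentum value are illustrative assumptions standing in for the paper's perturbation scheme and proposal-level consistency losses.

```python
# Generic mean-teacher consistency sketch of SESS's self-ensembling idea;
# the perturbation, loss, and momentum below are illustrative assumptions.
import torch
import torch.nn.functional as F

def ema_update(teacher: torch.nn.Module, student: torch.nn.Module,
               m: float = 0.999):
    """Teacher weights track an exponential moving average of the student."""
    with torch.no_grad():
        for pt, ps in zip(teacher.parameters(), student.parameters()):
            pt.mul_(m).add_(ps, alpha=1.0 - m)

def consistency_step(student, teacher, unlabeled_pts, optimizer):
    """One semi-supervised step on an unlabeled batch of shape (B, N, 3)."""
    n = unlabeled_pts.shape[1]
    idx = torch.randperm(n)[: n // 2]               # hypothetical sub-sampling
    sub = unlabeled_pts[:, idx]
    perturbed = sub + 0.01 * torch.randn_like(sub)  # hypothetical jitter
    with torch.no_grad():
        target = teacher(sub)                       # stable teacher prediction
    loss = F.mse_loss(student(perturbed), target)   # consistency objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ema_update(teacher, student)
    return loss.item()
```

In a full pipeline the teacher starts as a deep copy of the student (e.g., via copy.deepcopy), and this consistency term is added to the usual supervised detection loss on the labeled fraction of the data.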