Language-Assisted 3D Scene Understanding
- URL: http://arxiv.org/abs/2312.11451v2
- Date: Sun, 31 Dec 2023 07:38:13 GMT
- Title: Language-Assisted 3D Scene Understanding
- Authors: Yanmin Wu, Qiankun Gao, Renrui Zhang, and Jian Zhang
- Abstract summary: We propose a language-assisted approach to point cloud feature learning (LAST-PCL).
We achieve de-redundancy and feature dimensionality reduction without compromising textual priors.
The proposed method learns semantically meaningful point cloud features and achieves state-of-the-art or comparable performance in 3D semantic segmentation, 3D object detection, and 3D scene classification tasks.
- Score: 17.663583203177197
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The scale and quality of point cloud datasets constrain the advancement of
point cloud learning. Recently, with the development of multi-modal learning,
the incorporation of domain-agnostic prior knowledge from other modalities,
such as images and text, to assist in point cloud feature learning has been
considered a promising avenue. Existing methods have demonstrated the
effectiveness of multi-modal contrastive training and feature distillation on
point clouds. However, challenges remain, including the requirement for paired
triplet data, redundancy and ambiguity in supervised features, and the
disruption of the original priors. In this paper, we propose a
language-assisted approach to point cloud feature learning (LAST-PCL),
enriching semantic concepts through LLM-based text enrichment. We achieve
de-redundancy and feature dimensionality reduction without compromising textual
priors through statistics-based, training-free selection of significant features.
Furthermore, we analyze in depth the impact of text
contrastive training on the point cloud. Extensive experiments validate that
the proposed method learns semantically meaningful point cloud features and
achieves state-of-the-art or comparable performance in 3D semantic
segmentation, 3D object detection, and 3D scene classification tasks.
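The abstract does not spell out how the statistics-based, training-free feature selection works. As a rough illustration only, the sketch below assumes one plausible criterion: rank each embedding dimension by its variance across the text features and keep the top k. The function names `select_significant_dims` and `project`, the variance criterion, and the toy data are all assumptions made for this example, not the procedure used in LAST-PCL.

```python
import math
import random
from statistics import pvariance


def select_significant_dims(text_feats, k):
    """Return, in ascending order, the indices of the k columns whose values
    vary most across rows (assumed criterion: population variance)."""
    num_dims = len(text_feats[0])
    variances = [pvariance([row[d] for row in text_feats]) for d in range(num_dims)]
    ranked = sorted(range(num_dims), key=lambda d: variances[d], reverse=True)
    return sorted(ranked[:k])


def project(feats, dims):
    """Keep only the selected columns and L2-normalise each row."""
    out = []
    for row in feats:
        kept = [row[d] for d in dims]
        norm = math.sqrt(sum(v * v for v in kept)) or 1.0
        out.append([v / norm for v in kept])
    return out


# Toy stand-in for text embeddings: 20 rows, 32 dimensions, where the first
# four dimensions are drawn with a much larger spread than the rest.
rng = random.Random(0)
text_feats = [
    [rng.gauss(0, 10 if d < 4 else 1) for d in range(32)] for _ in range(20)
]

dims = select_significant_dims(text_feats, k=4)
reduced = project(text_feats, dims)
```

No weights are updated here, which is the sense in which such a selection is training-free: the retained dimensions are copied unchanged from the original text features, so the reduced features remain a subset of the original embedding rather than a learned projection of it.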
Related papers
- PointMoment: Mixed-Moment-based Self-Supervised Representation Learning for 3D Point Clouds [11.980787751027872]
We propose PointMoment, a novel framework for point cloud self-supervised representation learning.
Our framework does not require any special techniques such as asymmetric network architectures, gradient stopping, etc.
arXiv Detail & Related papers (2023-12-06T08:49:55Z)
- Edge Aware Learning for 3D Point Cloud [8.12405696290333]
This paper proposes an innovative approach to Hierarchical Edge Aware 3D Point Cloud Learning (HEA-Net).
It seeks to address the challenges of noise in point cloud data, and improve object recognition and segmentation by focusing on edge features.
We present an innovative edge-aware learning methodology, specifically designed to enhance point cloud classification and segmentation.
arXiv Detail & Related papers (2023-09-23T20:12:32Z)
- PointLLM: Empowering Large Language Models to Understand Point Clouds [63.39876878899682]
PointLLM understands colored object point clouds with human instructions.
It generates contextually appropriate responses, illustrating its grasp of point clouds and common sense.
arXiv Detail & Related papers (2023-08-31T17:59:46Z)
- Point Contrastive Prediction with Semantic Clustering for Self-Supervised Learning on Point Cloud Videos [71.20376514273367]
We propose a unified point cloud video self-supervised learning framework for object-centric and scene-centric data.
Our method outperforms supervised counterparts on a wide range of downstream tasks.
arXiv Detail & Related papers (2023-08-18T02:17:47Z)
- Explore In-Context Learning for 3D Point Cloud Understanding [71.20912026561484]
We introduce a novel framework, named Point-In-Context, designed especially for in-context learning in 3D point clouds.
We propose the Joint Sampling module, carefully designed to work in tandem with the general point sampling operator.
We conduct extensive experiments to validate the versatility and adaptability of our proposed methods in handling a wide range of tasks.
arXiv Detail & Related papers (2023-06-14T17:53:21Z)
- A Survey of Label-Efficient Deep Learning for 3D Point Clouds [109.07889215814589]
This paper presents the first comprehensive survey of label-efficient learning of point clouds.
We propose a taxonomy that organizes label-efficient learning methods based on the data prerequisites provided by different types of labels.
For each approach, we outline the problem setup and provide an extensive literature review that showcases relevant progress and challenges.
arXiv Detail & Related papers (2023-05-31T12:54:51Z)
- CLR-GAM: Contrastive Point Cloud Learning with Guided Augmentation and Feature Mapping [12.679625717350113]
We present CLR-GAM, a contrastive learning-based framework with Guided Augmentation (GA) for an efficient dynamic exploration strategy.
We empirically demonstrate that the proposed approach achieves state-of-the-art performance on both simulated and real-world 3D point cloud datasets.
arXiv Detail & Related papers (2023-02-28T04:38:52Z)
- PointVST: Self-Supervised Pre-training for 3D Point Clouds via View-Specific Point-to-Image Translation [64.858505571083]
This paper proposes a translative pre-training framework, namely PointVST.
It is driven by a novel self-supervised pretext task of cross-modal translation from 3D point clouds to their corresponding diverse forms of 2D rendered images.
arXiv Detail & Related papers (2022-12-29T07:03:29Z)
- Self-Supervised Feature Learning from Partial Point Clouds via Pose Disentanglement [35.404285596482175]
We propose a novel self-supervised framework to learn informative representations from partial point clouds.
We leverage partial point clouds scanned by LiDAR that contain both content and pose attributes.
Our method not only outperforms existing self-supervised methods but also generalizes better across synthetic and real-world datasets.
arXiv Detail & Related papers (2022-01-09T14:12:50Z)
- Point Discriminative Learning for Unsupervised Representation Learning on 3D Point Clouds [54.31515001741987]
We propose a point discriminative learning method for unsupervised representation learning on 3D point clouds.
We achieve this by imposing a novel point discrimination loss on the middle level and global level point features.
Our method learns powerful representations and achieves new state-of-the-art performance.
arXiv Detail & Related papers (2021-08-04T15:11:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.