Cross-modal and Cross-domain Knowledge Transfer for Label-free 3D
Segmentation
- URL: http://arxiv.org/abs/2309.10649v2
- Date: Mon, 16 Oct 2023 08:28:07 GMT
- Authors: Jingyu Zhang, Huitong Yang, Dai-Jie Wu, Jacky Keung, Xuesong Li, Xinge
Zhu, Yuexin Ma
- Abstract summary: We propose a novel approach for the challenging cross-modal and cross-domain adaptation task by fully exploring the relationship between images and point clouds.
Our method achieves state-of-the-art performance for 3D point cloud semantic segmentation on SemanticKITTI by using the knowledge of KITTI360 and GTA5.
- Score: 23.110443633049382
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current state-of-the-art point cloud-based perception methods usually rely on
large-scale labeled data, which requires expensive manual annotations. A
natural option is to explore the unsupervised methodology for 3D perception
tasks. However, such methods often suffer a substantial drop in performance.
Fortunately, we found that large amounts of image-based datasets exist, so an
alternative can be proposed, i.e., transferring the knowledge
in the 2D images to 3D point clouds. Specifically, we propose a novel approach
for the challenging cross-modal and cross-domain adaptation task by fully
exploring the relationship between images and point clouds and designing
effective feature alignment strategies. Without any 3D labels, our method
achieves state-of-the-art performance for 3D point cloud semantic segmentation
on SemanticKITTI by using the knowledge of KITTI360 and GTA5, compared to
existing unsupervised and weakly-supervised baselines.
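As a rough illustration of the image-to-point-cloud relationship the abstract refers to, the sketch below projects LiDAR points into a calibrated camera image and copies per-pixel class ids onto the points. This is a generic pinhole-projection baseline, not the authors' method; the function name `transfer_labels_2d_to_3d`, its parameters, and the `ignore_label` convention are assumptions made for this example.

```python
import numpy as np

def transfer_labels_2d_to_3d(points, label_map, K, T_cam_from_lidar, ignore_label=-1):
    """Copy per-pixel labels onto 3D points via pinhole projection (hypothetical helper).

    points:            (N, 3) point coordinates in the LiDAR frame.
    label_map:         (H, W) integer class ids from some 2D segmentation.
    K:                 (3, 3) camera intrinsic matrix.
    T_cam_from_lidar:  (4, 4) rigid transform from LiDAR to camera frame.
    """
    h, w = label_map.shape
    # Move points into the camera frame using homogeneous coordinates.
    homo = np.hstack([points, np.ones((points.shape[0], 1))])
    cam = (T_cam_from_lidar @ homo.T).T[:, :3]

    # Points that never land on a valid pixel keep the ignore label.
    labels = np.full(points.shape[0], ignore_label, dtype=np.int64)

    # Only points in front of the camera can be projected.
    in_front = cam[:, 2] > 1e-6
    uvw = (K @ cam[in_front].T).T
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(int)  # pixel column
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(int)  # pixel row

    # Keep projections that fall inside the image bounds.
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    idx = np.flatnonzero(in_front)[inside]
    labels[idx] = label_map[v[inside], u[inside]]
    return labels
```

Points behind the camera or outside the image stay at `ignore_label`, so only a subset of each scan receives a 2D-derived label.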
Related papers
- Pic@Point: Cross-Modal Learning by Local and Global Point-Picture Correspondence [0.0]
We present Pic@Point, an effective contrastive learning method based on structural 2D-3D correspondences.
We leverage image cues rich in semantic and contextual knowledge to provide a guiding signal for point cloud representations.
arXiv Detail & Related papers (2024-10-12T12:43:41Z)
- DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields [68.94868475824575]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations.
We leverage the strong semantic prior within a 3D generative model to train a semantic decoder.
Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z)
- Cross-Modal Information-Guided Network using Contrastive Learning for
Point Cloud Registration [17.420425069785946]
We present a novel Cross-Modal Information-Guided Network (CMIGNet) for point cloud registration.
We first incorporate the projected images from the point clouds and fuse the cross-modal features using the attention mechanism.
We employ two contrastive learning strategies, namely overlapping contrastive learning and cross-modal contrastive learning.
arXiv Detail & Related papers (2023-11-02T12:56:47Z)
- Weakly Supervised Monocular 3D Object Detection using Multi-View
Projection and Direction Consistency [78.76508318592552]
Monocular 3D object detection has become a mainstream approach in autonomous driving because it is easy to deploy.
Most current methods still rely on 3D point cloud data for labeling the ground truths used in the training phase.
We propose a new weakly supervised monocular 3D object detection method, which can train the model with only 2D labels marked on images.
arXiv Detail & Related papers (2023-03-15T15:14:00Z)
- Image Understands Point Cloud: Weakly Supervised 3D Semantic
Segmentation via Association Learning [59.64695628433855]
We propose a novel cross-modality weakly supervised method for 3D segmentation, incorporating complementary information from unlabeled images.
We design a dual-branch network equipped with an active labeling strategy to make the most of a very small set of labels.
Our method even outperforms the state-of-the-art fully supervised competitors with less than 1% actively selected annotations.
arXiv Detail & Related papers (2022-09-16T07:59:04Z)
- Dual Adaptive Transformations for Weakly Supervised Point Cloud
Segmentation [78.6612285236938]
We propose a novel DAT (Dual Adaptive Transformations) model for weakly supervised point cloud segmentation.
We evaluate our proposed DAT model with two popular backbones on the large-scale S3DIS and ScanNet-V2 datasets.
arXiv Detail & Related papers (2022-07-19T05:43:14Z)
- CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D
Point Cloud Understanding [2.8661021832561757]
CrossPoint is a simple cross-modal contrastive learning approach to learn transferable 3D point cloud representations.
Our approach outperforms the previous unsupervised learning methods on a diverse range of downstream tasks including 3D object classification and segmentation.
arXiv Detail & Related papers (2022-03-01T18:59:01Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based
Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to use 3D voxelization and 3D convolution networks.
We propose a new framework for outdoor LiDAR segmentation, in which cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
- 3D Spatial Recognition without Spatially Labeled 3D [127.6254240158249]
We introduce WyPR, a Weakly-supervised framework for Point cloud Recognition.
We show that WyPR can detect and segment objects in point cloud data without access to any spatial labels at training time.
arXiv Detail & Related papers (2021-05-13T17:58:07Z)
- Weakly Supervised Semantic Segmentation in 3D Graph-Structured Point
Clouds of Wild Scenes [36.07733308424772]
The deficiency of 3D segmentation labels is one of the main obstacles to effective point cloud segmentation.
We propose a novel deep graph convolutional network-based framework for large-scale semantic scene segmentation in point clouds with sole 2D supervision.
arXiv Detail & Related papers (2020-04-26T23:02:23Z)