LWSIS: LiDAR-guided Weakly Supervised Instance Segmentation for
Autonomous Driving
- URL: http://arxiv.org/abs/2212.03504v1
- Date: Wed, 7 Dec 2022 08:08:01 GMT
- Title: LWSIS: LiDAR-guided Weakly Supervised Instance Segmentation for
Autonomous Driving
- Authors: Xiang Li, Junbo Yin, Botian Shi, Yikang Li, Ruigang Yang, Jianbing Shen
- Abstract summary: We present a more artful framework, LiDAR-guided Weakly Supervised Instance Segmentation (LWSIS).
LWSIS uses off-the-shelf 3D data, i.e., point clouds together with 3D boxes, as natural weak supervision for training 2D image instance segmentation models.
Our LWSIS not only exploits the complementary information in multimodal data during training, but also significantly reduces the annotation cost of dense 2D masks.
- Score: 34.119642131912485
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image instance segmentation is a fundamental research topic in autonomous
driving, which is crucial for scene understanding and road safety. Advanced
learning-based approaches often rely on the costly 2D mask annotations for
training. In this paper, we present a more artful framework, LiDAR-guided
Weakly Supervised Instance Segmentation (LWSIS), which leverages the
off-the-shelf 3D data, i.e., point clouds together with 3D boxes, as
natural weak supervision for training 2D image instance segmentation
models. Our LWSIS not only exploits the complementary information in multimodal
data during training, but also significantly reduces the annotation cost of the
dense 2D masks. In detail, LWSIS consists of two crucial modules, Point Label
Assignment (PLA) and Graph-based Consistency Regularization (GCR). The former
module aims to automatically assign the 3D point cloud as 2D point-wise labels,
while the latter further refines the predictions by enforcing geometry and
appearance consistency of the multimodal data. Moreover, we provide a secondary
instance segmentation annotation of the nuScenes dataset, named nuInsSeg, to
encourage further research on multimodal perception tasks. Extensive experiments
on nuInsSeg, as well as the large-scale Waymo dataset, show that LWSIS can substantially
improve existing weakly supervised segmentation models by only involving 3D
data during training. Additionally, LWSIS can also be incorporated into 3D
object detectors like PointPainting to boost the 3D detection performance for
free. The code and dataset are available at https://github.com/Serenos/LWSIS.
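The core of the Point Label Assignment idea above is projecting box-tagged LiDAR points into the image plane to obtain sparse 2D point-wise labels. A minimal NumPy sketch (not the authors' code; function names, the calibration inputs `K` and `T_cam_lidar`, and the `-1` ignore value are illustrative assumptions) could look like:

```python
import numpy as np

def project_points_to_image(points, T_cam_lidar, K, img_h, img_w):
    """Project Nx3 LiDAR points into the image plane.

    Returns pixel coordinates (u, v) and a mask of points that land
    inside the image with positive depth.
    """
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])  # homogeneous coords
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]                  # LiDAR -> camera frame
    z = pts_cam[:, 2]
    uv = (K @ pts_cam.T).T                                      # pinhole projection
    uv = uv[:, :2] / np.maximum(z[:, None], 1e-6)
    valid = (z > 0) & (uv[:, 0] >= 0) & (uv[:, 0] < img_w) \
                    & (uv[:, 1] >= 0) & (uv[:, 1] < img_h)
    return uv, valid

def assign_point_labels(points, instance_ids, T_cam_lidar, K, img_h, img_w):
    """Turn LiDAR points tagged with the id of their enclosing 3D box
    into a sparse 2D point-wise label map (-1 = unlabeled)."""
    uv, valid = project_points_to_image(points, T_cam_lidar, K, img_h, img_w)
    label_map = np.full((img_h, img_w), -1, dtype=np.int64)
    u = uv[valid, 0].astype(int)
    v = uv[valid, 1].astype(int)
    label_map[v, u] = instance_ids[valid]
    return label_map
```

Such sparse labels would then supervise a 2D instance segmentation model in place of dense masks; the GCR module described above would further refine them, which this sketch does not attempt.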
Related papers
- Bayesian Self-Training for Semi-Supervised 3D Segmentation [59.544558398992386]
3D segmentation is a core problem in computer vision.
Densely labeling 3D point clouds for fully supervised training remains too labor-intensive and expensive.
Semi-supervised training provides a more practical alternative, where only a small set of labeled data is given, accompanied by a larger unlabeled set.
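A common ingredient of such semi-supervised pipelines is self-training: predict on the unlabeled set and keep only confident, low-uncertainty pseudo-labels. A minimal sketch under that assumption (not this paper's method; thresholds and the MC-dropout-style input shape are hypothetical):

```python
import numpy as np

def select_pseudo_labels(prob_samples, conf_thresh=0.9, var_thresh=0.05):
    """Pick pseudo-labels from T stochastic forward passes (e.g. MC dropout).

    prob_samples: (T, N, C) class probabilities for N unlabeled points.
    Keeps points whose mean confidence is high and predictive variance low.
    """
    mean_prob = prob_samples.mean(axis=0)          # (N, C) averaged prediction
    labels = mean_prob.argmax(axis=1)              # (N,) candidate pseudo-labels
    conf = mean_prob.max(axis=1)                   # confidence of the argmax class
    var = prob_samples.var(axis=0)[np.arange(len(labels)), labels]
    keep = (conf >= conf_thresh) & (var <= var_thresh)
    return labels, keep
```

Selected points would be added to the labeled pool for the next training round.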
arXiv Detail & Related papers (2024-09-12T14:54:31Z)
- MWSIS: Multimodal Weakly Supervised Instance Segmentation with 2D Box Annotations for Autonomous Driving [13.08936676096554]
We propose a novel framework called Multimodal Weakly Supervised Instance Segmentation (MWSIS).
MWSIS incorporates various fine-grained label generation and correction modules for both 2D and 3D modalities.
It outperforms fully supervised instance segmentation using only 5% of the fully supervised annotations.
arXiv Detail & Related papers (2023-12-12T05:12:22Z)
- DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields [68.94868475824575]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations.
We leverage the strong semantic prior within a 3D generative model to train a semantic decoder.
Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z)
- Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation [67.07112533415116]
We present a novel framework that adapts various foundational models for the 3D point cloud segmentation task.
Our approach involves making initial predictions of 2D semantic masks using different large vision models.
To generate robust 3D semantic pseudo labels, we introduce a semantic label fusion strategy that effectively combines all the results via voting.
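The voting-based fusion described above can be sketched as a per-point majority vote over the labels produced by the different 2D models (an illustrative sketch, not this paper's implementation; the `ignore` convention is an assumption):

```python
import numpy as np

def fuse_labels_by_voting(per_model_labels, ignore=-1):
    """Majority-vote fusion of per-point labels from several 2D models.

    per_model_labels: (M, N) integer labels for N points from M models;
    `ignore` marks points a model left unlabeled. Returns (N,) fused
    labels, with `ignore` where no model voted.
    """
    M, N = per_model_labels.shape
    fused = np.full(N, ignore, dtype=per_model_labels.dtype)
    for i in range(N):
        votes = per_model_labels[:, i]
        votes = votes[votes != ignore]          # discard abstentions
        if votes.size:
            vals, counts = np.unique(votes, return_counts=True)
            fused[i] = vals[counts.argmax()]    # most-voted class wins
    return fused
```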
arXiv Detail & Related papers (2023-11-03T15:41:15Z)
- Image Understands Point Cloud: Weakly Supervised 3D Semantic Segmentation via Association Learning [59.64695628433855]
We propose a novel cross-modality weakly supervised method for 3D segmentation, incorporating complementary information from unlabeled images.
Basically, we design a dual-branch network equipped with an active labeling strategy to make the most of a tiny fraction of labels.
Our method even outperforms the state-of-the-art fully supervised competitors with less than 1% actively selected annotations.
arXiv Detail & Related papers (2022-09-16T07:59:04Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize the 3D voxelization and 3D convolution network.
We propose a new framework for the outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
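The cylindrical partition amounts to binning points by radius, azimuth and height instead of Cartesian x/y/z, so that sparser distant regions fall into larger Cartesian cells. A minimal sketch of the coordinate mapping (illustrative only; the resolutions are hypothetical defaults, not the paper's):

```python
import numpy as np

def cylindrical_voxel_indices(points, rho_res=0.5, phi_res=np.deg2rad(2.0), z_res=0.2):
    """Map Cartesian LiDAR points (N, 3) to cylindrical voxel indices.

    Bins by radius rho, azimuth phi and height z; a fixed angular bin
    covers more Cartesian area the farther it is from the sensor.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rho = np.hypot(x, y)                   # distance from the sensor axis
    phi = np.arctan2(y, x) + np.pi         # shift azimuth to [0, 2*pi)
    idx = np.stack([
        (rho / rho_res).astype(int),
        (phi / phi_res).astype(int),
        ((z - z.min()) / z_res).astype(int),
    ], axis=1)
    return idx
```

A 3D convolution network would then operate on features scattered into this cylindrical grid.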
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
- Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection [102.62963605429508]
Point cloud semantic segmentation plays an essential role in autonomous driving.
Current 3D semantic segmentation networks focus on convolutional architectures that perform well for well-represented classes.
We propose a novel Detection Aware 3D Semantic Segmentation (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
arXiv Detail & Related papers (2020-09-22T14:17:40Z)
- Weakly Supervised Semantic Segmentation in 3D Graph-Structured Point Clouds of Wild Scenes [36.07733308424772]
The deficiency of 3D segmentation labels is one of the main obstacles to effective point cloud segmentation.
We propose a novel deep graph convolutional network-based framework for large-scale semantic scene segmentation in point clouds with sole 2D supervision.
arXiv Detail & Related papers (2020-04-26T23:02:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.