3D Guided Weakly Supervised Semantic Segmentation
- URL: http://arxiv.org/abs/2012.00242v1
- Date: Tue, 1 Dec 2020 03:34:15 GMT
- Title: 3D Guided Weakly Supervised Semantic Segmentation
- Authors: Weixuan Sun, Jing Zhang, Nick Barnes
- Abstract summary: We propose a weakly supervised 2D semantic segmentation model by incorporating sparse bounding box labels with available 3D information.
We manually labeled a subset of the 2D-3D Semantics(2D-3D-S) dataset with bounding boxes, and introduce our 2D-3D inference module to generate accurate pixel-wise segment proposal masks.
- Score: 27.269847900950943
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pixel-wise clean annotation is necessary for fully-supervised semantic
segmentation, which is laborious and expensive to obtain. In this paper, we
propose a weakly supervised 2D semantic segmentation model by incorporating
sparse bounding box labels with available 3D information, which is much easier
to obtain with advanced sensors. We manually labeled a subset of the 2D-3D
Semantics(2D-3D-S) dataset with bounding boxes, and introduce our 2D-3D
inference module to generate accurate pixel-wise segment proposal masks. Guided
by 3D information, we first generate a point cloud of objects and calculate
objectness probability score for each point. Then we project the point cloud
with objectness probabilities back to 2D images followed by a refinement step
to obtain segment proposals, which are treated as pseudo labels to train a
semantic segmentation network. Our method works in a recursive manner to
gradually refine the above-mentioned segment proposals. Extensive experimental
results on the 2D-3D-S dataset show that the proposed method can generate
accurate segment proposals when bounding box labels are available on only a
small subset of training images. Performance comparison with recent
state-of-the-art methods further illustrates the effectiveness of our method.
Related papers
- Label-Efficient 3D Brain Segmentation via Complementary 2D Diffusion Models with Orthogonal Views [10.944692719150071]
We propose a novel 3D brain segmentation approach using complementary 2D diffusion models.
Our goal is to achieve reliable segmentation quality without requiring complete labels for each individual subject.
arXiv Detail & Related papers (2024-07-17T06:14:53Z) - DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance
Fields [73.97131748433212]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations.
We leverage the strong semantic prior within a 3D generative model to train a semantic decoder.
Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z) - PartSLIP: Low-Shot Part Segmentation for 3D Point Clouds via Pretrained
Image-Language Models [56.324516906160234]
Generalizable 3D part segmentation is important but challenging in vision and robotics.
This paper explores an alternative way for low-shot part segmentation of 3D point clouds by leveraging a pretrained image-language model, GLIP.
We transfer the rich knowledge from 2D to 3D through GLIP-based part detection on point cloud rendering and a novel 2D-to-3D label lifting algorithm.
arXiv Detail & Related papers (2022-12-03T06:59:01Z) - Image Understands Point Cloud: Weakly Supervised 3D Semantic
Segmentation via Association Learning [59.64695628433855]
We propose a novel cross-modality weakly supervised method for 3D segmentation, incorporating complementary information from unlabeled images.
Basically, we design a dual-branch network equipped with an active labeling strategy, to maximize the power of tiny parts of labels.
Our method even outperforms the state-of-the-art fully supervised competitors with less than 1% actively selected annotations.
arXiv Detail & Related papers (2022-09-16T07:59:04Z) - Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene
Segmentation [48.677336052620895]
We present a novel 3D-to-2D label transfer method, Panoptic NeRF, which aims for obtaining per-pixel 2D semantic and instance labels.
By inferring in 3D space and rendering to 2D labels, our 2D semantic and instance labels are multi-view consistent by design.
arXiv Detail & Related papers (2022-03-29T04:16:40Z) - Multi-Modality Task Cascade for 3D Object Detection [22.131228757850373]
Many methods train two models in isolation and use simple feature concatenation to represent 3D sensor data.
We propose a novel Multi-Modality Task Cascade network (MTC-RCNN) that leverages 3D box proposals to improve 2D segmentation predictions.
We show that including a 2D network between two stages of 3D modules significantly improves both 2D and 3D task performance.
arXiv Detail & Related papers (2021-07-08T17:55:01Z) - FGR: Frustum-Aware Geometric Reasoning for Weakly Supervised 3D Vehicle
Detection [81.79171905308827]
We propose frustum-aware geometric reasoning (FGR) to detect vehicles in point clouds without any 3D annotations.
Our method consists of two stages: coarse 3D segmentation and 3D bounding box estimation.
It is able to accurately detect objects in 3D space with only 2D bounding boxes and sparse point clouds.
arXiv Detail & Related papers (2021-05-17T07:29:55Z) - Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic
Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a 3D cylinder partition and a 3D cylinder convolution based framework, termed as Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
arXiv Detail & Related papers (2020-08-04T13:56:19Z) - Weakly Supervised Semantic Segmentation in 3D Graph-Structured Point
Clouds of Wild Scenes [36.07733308424772]
The deficiency of 3D segmentation labels is one of the main obstacles to effective point cloud segmentation.
We propose a novel deep graph convolutional network-based framework for large-scale semantic scene segmentation in point clouds with sole 2D supervision.
arXiv Detail & Related papers (2020-04-26T23:02:23Z) - 3D-MiniNet: Learning a 2D Representation from Point Clouds for Fast and
Efficient 3D LIDAR Semantic Segmentation [9.581605678437032]
3D-MiniNet is a novel approach for LIDAR semantic segmentation that combines 3D and 2D learning layers.
It first learns a 2D representation from the raw points through a novel projection which extracts local and global information from the 3D data.
These 2D semantic labels are re-projected back to the 3D space and enhanced through a post-processing module.
arXiv Detail & Related papers (2020-02-25T14:33:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.