3D-MiniNet: Learning a 2D Representation from Point Clouds for Fast and
Efficient 3D LIDAR Semantic Segmentation
- URL: http://arxiv.org/abs/2002.10893v5
- Date: Tue, 27 Apr 2021 15:31:54 GMT
- Title: 3D-MiniNet: Learning a 2D Representation from Point Clouds for Fast and
Efficient 3D LIDAR Semantic Segmentation
- Authors: Iñigo Alonso, Luis Riazuelo, Luis Montesano, Ana C. Murillo
- Abstract summary: 3D-MiniNet is a novel approach for LIDAR semantic segmentation that combines 3D and 2D learning layers.
It first learns a 2D representation from the raw points through a novel projection which extracts local and global information from the 3D data.
These 2D semantic labels are re-projected back to the 3D space and enhanced through a post-processing module.
- Score: 9.581605678437032
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LIDAR semantic segmentation, which assigns a semantic label to each 3D point
measured by the LIDAR, is becoming an essential task for many robotic
applications such as autonomous driving. Fast and efficient semantic
segmentation methods are needed to match the strong computational and temporal
restrictions of many of these real-world applications.
This work presents 3D-MiniNet, a novel approach for LIDAR semantic
segmentation that combines 3D and 2D learning layers. It first learns a 2D
representation from the raw points through a novel projection which extracts
local and global information from the 3D data. This representation is fed to an
efficient 2D Fully Convolutional Neural Network (FCNN) that produces a 2D
semantic segmentation. These 2D semantic labels are re-projected back to the 3D
space and enhanced through a post-processing module. The main novelty in our
strategy relies on the projection learning module. Our detailed ablation study
shows how each component contributes to the final performance of 3D-MiniNet. We
validate our approach on well-known public benchmarks (SemanticKITTI and
KITTI), where 3D-MiniNet gets state-of-the-art results while being faster and
more parameter-efficient than previous methods.
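The project-then-re-project pipeline described in the abstract can be illustrated with a minimal sketch. It uses the fixed spherical (range-image) projection common in LIDAR work, not the learned projection module that the paper proposes, and it omits the FCNN and the post-processing step. The function names, the image size, and the field-of-view values (3 degrees up, 25 degrees down, typical of a 64-beam sensor) are assumptions made for illustration and do not come from the paper.

```python
import math


def spherical_project(points, height=64, width=2048,
                      fov_up_deg=3.0, fov_down_deg=-25.0):
    """Map each (x, y, z) point to a (row, col) pixel of a range image.

    Assumed sensor model: vertical field of view from fov_down_deg to
    fov_up_deg. Points at the origin have no direction and map to None.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down
    pixels = []
    for x, y, z in points:
        depth = math.sqrt(x * x + y * y + z * z)
        if depth == 0.0:
            pixels.append(None)
            continue
        yaw = math.atan2(y, x)        # horizontal angle in [-pi, pi]
        pitch = math.asin(z / depth)  # vertical angle
        col = int(0.5 * (1.0 - yaw / math.pi) * width)
        row = int((1.0 - (pitch - fov_down) / fov) * height)
        # Clamp points that fall on or outside the image border.
        col = min(max(col, 0), width - 1)
        row = min(max(row, 0), height - 1)
        pixels.append((row, col))
    return pixels


def reproject_labels(pixels, label_image, unlabeled=0):
    """Give each 3D point the label of the pixel it was projected to."""
    return [unlabeled if p is None else label_image[p[0]][p[1]]
            for p in pixels]
```

In use, `label_image` would be the 2D segmentation produced by the network. Several points can land on the same pixel and therefore share one label, which is one reason a post-processing step on the 3D labels is useful.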
Related papers
- Bayesian Self-Training for Semi-Supervised 3D Segmentation [59.544558398992386]
3D segmentation is a core problem in computer vision.
Densely labeling 3D point clouds to employ fully-supervised training remains too labor-intensive and expensive.
Semi-supervised training provides a more practical alternative, where only a small set of labeled data is given, accompanied by a larger unlabeled set.
arXiv Detail & Related papers (2024-09-12T14:54:31Z)
- ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z)
- DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields [68.94868475824575]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations.
We leverage the strong semantic prior within a 3D generative model to train a semantic decoder.
Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z)
- LWSIS: LiDAR-guided Weakly Supervised Instance Segmentation for Autonomous Driving [34.119642131912485]
We present a more artful framework, LiDAR-guided Weakly Supervised Instance Segmentation (LWSIS).
LWSIS uses the off-the-shelf 3D data, i.e., Point Cloud, together with the 3D boxes, as natural weak supervisions for training the 2D image instance segmentation models.
Our LWSIS not only exploits the complementary information in multimodal data during training, but also significantly reduces the cost of the dense 2D masks.
arXiv Detail & Related papers (2022-12-07T08:08:01Z)
- MvDeCor: Multi-view Dense Correspondence Learning for Fine-grained 3D Segmentation [91.6658845016214]
We propose to utilize self-supervised techniques in the 2D domain for fine-grained 3D shape segmentation tasks.
We render a 3D shape from multiple views, and set up a dense correspondence learning task within the contrastive learning framework.
As a result, the learned 2D representations are view-invariant and geometrically consistent.
arXiv Detail & Related papers (2022-08-18T00:48:15Z)
- MNet: Rethinking 2D/3D Networks for Anisotropic Medical Image Segmentation [13.432274819028505]
A novel mesh network (MNet) is proposed to balance the spatial representation inter axes via learning.
Comprehensive experiments are performed on four public datasets (CT & MR).
arXiv Detail & Related papers (2022-05-10T12:39:08Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize the 3D voxelization and 3D convolution network.
We propose a new framework for the outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
- Multi-Modality Task Cascade for 3D Object Detection [22.131228757850373]
Many methods train two models in isolation and use simple feature concatenation to represent 3D sensor data.
We propose a novel Multi-Modality Task Cascade network (MTC-RCNN) that leverages 3D box proposals to improve 2D segmentation predictions.
We show that including a 2D network between two stages of 3D modules significantly improves both 2D and 3D task performance.
arXiv Detail & Related papers (2021-07-08T17:55:01Z)
- 3D Guided Weakly Supervised Semantic Segmentation [27.269847900950943]
We propose a weakly supervised 2D semantic segmentation model by incorporating sparse bounding box labels with available 3D information.
We manually labeled a subset of the 2D-3D Semantics (2D-3D-S) dataset with bounding boxes, and introduced our 2D-3D inference module to generate accurate pixel-wise segment proposal masks.
arXiv Detail & Related papers (2020-12-01T03:34:15Z)
- Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a framework based on 3D cylinder partition and 3D cylinder convolution, termed Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
arXiv Detail & Related papers (2020-08-04T13:56:19Z)