S3Net: 3D LiDAR Sparse Semantic Segmentation Network
- URL: http://arxiv.org/abs/2103.08745v1
- Date: Mon, 15 Mar 2021 22:15:24 GMT
- Title: S3Net: 3D LiDAR Sparse Semantic Segmentation Network
- Authors: Ran Cheng, Ryan Razani, Yuan Ren and Liu Bingbing
- Abstract summary: S3Net is a novel convolutional neural network for LiDAR point cloud semantic segmentation.
It adopts an encoder-decoder backbone that consists of a Sparse Intra-channel Attention Module (SIntraAM) and a Sparse Inter-channel Attention Module (SInterAM).
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantic Segmentation is a crucial component in the perception systems of
many applications, such as robotics and autonomous driving that rely on
accurate environmental perception and understanding. In the literature, several
approaches have been introduced to tackle the LiDAR semantic segmentation task,
such as projection-based (range-view or bird's-eye-view) and voxel-based
approaches. However, projection-based methods abandon valuable 3D topology and
geometric relations and suffer from the information loss introduced by the
projection process, while voxel-based methods are computationally inefficient.
Therefore, there is a need for accurate models capable of
processing the 3D driving-scene point cloud in 3D space. In this paper, we
propose S3Net, a novel convolutional neural network for LiDAR point cloud
semantic segmentation. It adopts an encoder-decoder backbone that consists of
a Sparse Intra-channel Attention Module (SIntraAM) and a Sparse Inter-channel
Attention Module (SInterAM) to emphasize fine details both within each feature
map and among nearby feature maps. To extract global context in deeper layers,
we introduce a Sparse Residual Tower built upon sparse convolution, which suits
the varying sparsity of LiDAR point clouds. In addition, a geo-aware
anisotropic loss is leveraged to emphasize semantic boundaries and penalize
noise within each predicted region, leading to a robust
prediction. Our experimental results show that the proposed method leads to a
large improvement (12%) over its baseline counterpart (MinkNet42, Choy et al.,
2019) on the SemanticKITTI (Behley et al., 2019) test set and achieves
state-of-the-art mIoU among semantic segmentation approaches.
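
The abstract names the two attention modules without implementation details. As a rough illustration only, here is a minimal sketch of inter-channel (squeeze-and-excitation-style) and intra-channel gating over the (N, C) feature matrix of a sparse tensor's occupied voxels; the module names, shapes, and gating design are our assumptions, not the paper's actual modules:

```python
import torch
import torch.nn as nn

class InterChannelAttention(nn.Module):
    """Squeeze-and-excitation-style gate across channels (illustrative)."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C) features of the occupied voxels of a sparse tensor
        gate = self.mlp(x.mean(dim=0, keepdim=True))  # (1, C) global channel gate
        return x * gate

class IntraChannelAttention(nn.Module):
    """Per-voxel gate within each feature map (illustrative)."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(channels, channels), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.gate(x)  # elementwise gate, (N, C) -> (N, C)

x = torch.randn(1024, 64)  # features of 1024 occupied voxels
x = InterChannelAttention(64)(IntraChannelAttention(64)(x))
```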
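The Sparse Residual Tower is likewise only named. Since the baseline is MinkNet42, one plausible building block is a standard sparse residual block in MinkowskiEngine, stacked with growing dilation to capture global context in deeper layers; this is a sketch under that assumption, not the paper's architecture:

```python
import torch.nn as nn
import MinkowskiEngine as ME

class SparseResidualBlock(nn.Module):
    """Generic 3D sparse residual block (illustrative)."""
    def __init__(self, channels: int, dilation: int = 1):
        super().__init__()
        self.net = nn.Sequential(
            ME.MinkowskiConvolution(channels, channels, kernel_size=3,
                                    dilation=dilation, dimension=3),
            ME.MinkowskiBatchNorm(channels),
            ME.MinkowskiReLU(),
            ME.MinkowskiConvolution(channels, channels, kernel_size=3,
                                    dilation=dilation, dimension=3),
            ME.MinkowskiBatchNorm(channels),
        )
        self.relu = ME.MinkowskiReLU()

    def forward(self, x: ME.SparseTensor) -> ME.SparseTensor:
        # stride-1 sparse convs keep the coordinate map, so the skip add is valid
        return self.relu(self.net(x) + x)

# a "tower": stacked blocks with growing dilation enlarge the receptive field
tower = nn.Sequential(*[SparseResidualBlock(64, dilation=2 ** i) for i in range(3)])
```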
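The geo-aware anisotropic loss is described only by its effect (emphasize boundaries, penalize noise inside regions). One plausible reading is a per-point cross-entropy whose weight grows near label boundaries in 3D space; in the sketch below, the k-NN boundary test and the weighting scheme are our assumptions, not the paper's formulation:

```python
import torch
import torch.nn.functional as F

def boundary_weighted_ce(logits, labels, coords, k=8, boundary_weight=2.0):
    """Cross-entropy up-weighted at points whose spatial neighbours carry a
    different label -- a stand-in for geometry-aware boundary emphasis."""
    # coords: (N, 3) point positions; brute-force k-NN for clarity only
    dists = torch.cdist(coords, coords)                      # (N, N)
    knn = dists.topk(k + 1, largest=False).indices[:, 1:]    # drop self-match
    on_boundary = (labels[knn] != labels.unsqueeze(1)).any(dim=1)
    weights = 1.0 + (boundary_weight - 1.0) * on_boundary.float()
    per_point = F.cross_entropy(logits, labels, reduction="none")
    return (weights * per_point).mean()
```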
Related papers
- Bayesian Self-Training for Semi-Supervised 3D Segmentation
3D segmentation is a core problem in computer vision.
Densely labeling 3D point clouds for fully-supervised training, however, remains too labor-intensive and expensive.
Semi-supervised training provides a more practical alternative, where only a small set of labeled data is given, accompanied by a larger unlabeled set.
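As a reminder of what the self-training loop looks like, here is a minimal confidence-thresholded pseudo-labeling step; it is purely illustrative, and the paper's Bayesian uncertainty criterion would replace the naive softmax threshold used here:

```python
import torch

@torch.no_grad()
def pseudo_label(model, unlabeled_points, threshold=0.9):
    """Keep only high-confidence predictions as pseudo-labels (illustrative)."""
    probs = model(unlabeled_points).softmax(dim=-1)  # (N, num_classes)
    conf, labels = probs.max(dim=-1)
    mask = conf > threshold                          # retrain on these points only
    return labels[mask], mask
```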
arXiv Detail & Related papers (2024-09-12T14:54:31Z)
- Large Generative Model Assisted 3D Semantic Communication
We propose a Generative AI Model assisted 3D SC (GAM-3DSC) system.
First, we introduce a 3D Semantic Extractor (3DSE) to extract key semantics from a 3D scenario based on user requirements.
We then present an Adaptive Semantic Compression Model (ASCM) for encoding these multi-perspective images.
Finally, we design a conditional Generative adversarial network and Diffusion model aided Channel Estimation (GDCE) scheme to estimate and refine the Channel State Information (CSI) of physical channels.
arXiv Detail & Related papers (2024-03-09T03:33:07Z)
- PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic Occupancy Prediction
We propose a cylindrical tri-perspective view to represent point clouds effectively and comprehensively.
Considering the distance distribution of LiDAR point clouds, we construct the tri-perspective view in the cylindrical coordinate system.
We employ spatial group pooling to maintain structural details during projection and adopt 2D backbones to efficiently process each TPV plane.
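Below is a minimal sketch of the cylindrical transform this summary refers to, plus a scatter-max projection of point features onto one TPV plane; the grid size and the pooling operator are illustrative assumptions:

```python
import torch

def to_cylindrical(xyz: torch.Tensor) -> torch.Tensor:
    """(x, y, z) -> (rho, phi, z): coordinates matched to LiDAR's radial sampling."""
    rho = torch.linalg.norm(xyz[:, :2], dim=1)
    phi = torch.atan2(xyz[:, 1], xyz[:, 0])
    return torch.stack([rho, phi, xyz[:, 2]], dim=1)

def pool_to_plane(feats, u_idx, v_idx, size=(480, 360)):
    """Scatter-max (N, C) point features onto a 2D plane, e.g. the rho-phi plane."""
    H, W = size
    plane = feats.new_full((H * W, feats.shape[1]), float("-inf"))
    flat = (u_idx * W + v_idx).unsqueeze(1).expand(-1, feats.shape[1])
    plane.scatter_reduce_(0, flat, feats, reduce="amax", include_self=True)
    return plane.view(H, W, -1).permute(2, 0, 1)  # (C, H, W); empty cells stay -inf
```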
arXiv Detail & Related papers (2023-08-31T17:57:17Z)
- Push-the-Boundary: Boundary-aware Feature Propagation for Semantic Segmentation of 3D Point Clouds
We propose a boundary-aware feature propagation mechanism to improve semantic segmentation near object boundaries.
With one shared encoder, our network outputs (i) boundary localization, (ii) prediction of directions pointing to the object's interior, and (iii) semantic segmentation, in three parallel streams.
Our proposed approach yields consistent improvements by reducing boundary errors.
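A minimal sketch of the shared-encoder, three-stream head described above, assuming per-point features of shape (N, C); the layer sizes are placeholders:

```python
import torch.nn as nn

class ThreeStreamHead(nn.Module):
    """Shared features -> (i) boundary score, (ii) interior direction, (iii) semantics."""
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.boundary = nn.Linear(feat_dim, 1)         # per-point boundary logit
        self.direction = nn.Linear(feat_dim, 3)        # vector toward object interior
        self.semantics = nn.Linear(feat_dim, num_classes)

    def forward(self, feats):                          # feats: (N, feat_dim)
        d = self.direction(feats)
        d = d / (d.norm(dim=-1, keepdim=True) + 1e-8)  # unit direction vectors
        return self.boundary(feats), d, self.semantics(feats)
```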
arXiv Detail & Related papers (2022-12-23T15:42:01Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize 3D voxelization and 3D convolution networks.
We propose a new framework for outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
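The "asymmetrical" convolutions are non-cubic kernels; a dense-tensor sketch of such a residual unit is shown below (our reading of the idea, using dense Conv3d for brevity where the paper operates on sparse voxels):

```python
import torch.nn as nn

class AsymmetricBlock3D(nn.Module):
    """Two non-cubic 3D convolutions covering a 3x3x3 neighbourhood more cheaply."""
    def __init__(self, channels: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=(3, 1, 3), padding=(1, 0, 1)),
            nn.BatchNorm3d(channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, kernel_size=(1, 3, 3), padding=(0, 1, 1)),
            nn.BatchNorm3d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):  # x: (B, C, D, H, W) voxel grid
        return self.relu(self.net(x) + x)
```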
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
- S3CNet: A Sparse Semantic Scene Completion Network for LiDAR Point Clouds
We present S3CNet, a sparse convolution based neural network that predicts the semantically completed scene from a single, unified LiDAR point cloud.
We show that our proposed method outperforms all counterparts on the 3D task, achieving state-of-the-art results on the SemanticKITTI benchmark.
arXiv Detail & Related papers (2020-12-16T20:14:41Z)
- Exploring Deep 3D Spatial Encodings for Large-Scale 3D Scene Understanding
We propose an alternative approach that overcomes the limitations of CNN-based methods by encoding the spatial features of raw 3D point clouds into undirected graph models.
The proposed method achieves accuracy on par with the state of the art, with improved training time and model stability, indicating strong potential for further research.
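A minimal sketch of turning raw points into an undirected graph, here a brute-force k-nearest-neighbour edge list (the value of k and the symmetrization are illustrative choices):

```python
import torch

def knn_graph(points: torch.Tensor, k: int = 16) -> torch.Tensor:
    """Undirected k-NN edge list (2, E) from (N, 3) points.
    Brute force O(N^2); use a KD-tree for large clouds."""
    dists = torch.cdist(points, points)                      # (N, N) distances
    nbrs = dists.topk(k + 1, largest=False).indices[:, 1:]   # drop self-loop
    src = torch.arange(points.shape[0]).repeat_interleave(k)
    edges = torch.stack([src, nbrs.reshape(-1)], dim=0)
    return torch.cat([edges, edges.flip(0)], dim=1)          # both directions
```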
arXiv Detail & Related papers (2020-11-29T12:56:19Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation
State-of-the-art methods for large-scale driving-scene LiDAR segmentation often project the point clouds to 2D space and then process them via 2D convolution.
We propose a new framework for outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
Our method achieves 1st place on the SemanticKITTI leaderboard and outperforms existing methods on nuScenes by a noticeable margin of about 4%.
arXiv Detail & Related papers (2020-11-19T18:53:11Z)
- Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection
Point cloud semantic segmentation plays an essential role in autonomous driving.
Current 3D semantic segmentation networks focus on convolutional architectures that perform well on well-represented classes.
We propose a novel Detection Aware 3D Semantic Segmentation (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
arXiv Detail & Related papers (2020-09-22T14:17:40Z)
- TORNADO-Net: mulTiview tOtal vaRiatioN semAntic segmentation with Diamond inceptiOn module
TORNADO-Net is a neural network for 3D LiDAR point cloud semantic segmentation.
It combines multi-view (bird's-eye and range) projection feature extraction with an encoder-decoder ResNet architecture.
It also takes advantage of the fact that LiDAR data covers a 360-degree field of view by using circular padding.
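Circular padding exploits the fact that the leftmost and rightmost columns of a 360-degree range image are physically adjacent. A small sketch, assuming a (B, C, H, W) layout with W as the azimuth axis:

```python
import torch
import torch.nn.functional as F

def circular_pad_azimuth(range_image: torch.Tensor, pad: int) -> torch.Tensor:
    """Wrap the azimuth (width) axis so convolutions see across the 0/360 seam;
    the vertical axis gets ordinary zero padding."""
    x = F.pad(range_image, (pad, pad, 0, 0), mode="circular")    # wrap width only
    return F.pad(x, (0, 0, pad, pad), mode="constant", value=0.0)

x = torch.randn(1, 5, 64, 2048)      # e.g. a 64-beam LiDAR range image
y = circular_pad_azimuth(x, pad=1)   # -> (1, 5, 66, 2050)
```

nn.Conv2d's padding_mode="circular" wraps every padded axis, which is usually acceptable too when the vertical padding is small.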
arXiv Detail & Related papers (2020-08-24T16:32:41Z)
- Weakly Supervised Semantic Segmentation in 3D Graph-Structured Point Clouds of Wild Scenes
The deficiency of 3D segmentation labels is one of the main obstacles to effective point cloud segmentation.
We propose a novel deep graph convolutional network-based framework for large-scale semantic scene segmentation of point clouds with only 2D supervision.
arXiv Detail & Related papers (2020-04-26T23:02:23Z)