(AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network
- URL: http://arxiv.org/abs/2102.04530v1
- Date: Mon, 8 Feb 2021 21:04:21 GMT
- Title: (AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network
- Authors: Ran Cheng, Ryan Razani, Ehsan Taghavi, Enxu Li, Bingbing Liu
- Abstract summary: We propose AF2-S3Net, an end-to-end encoder-decoder CNN network for 3D LiDAR semantic segmentation.
We present a novel multi-branch attentive feature fusion module in the encoder and a unique adaptive feature selection module with feature map re-weighting in the decoder.
Our experimental results show that the proposed method outperforms the state-of-the-art approaches on the large-scale SemanticKITTI benchmark.
- Score: 3.6967381030744515
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autonomous robotic systems and self-driving cars rely on accurate
perception of their surroundings, as the safety of passengers and pedestrians
is the top priority. Semantic segmentation is one of the essential components
of environmental perception, providing semantic information about the scene.
Recently, several methods have been introduced for 3D LiDAR semantic
segmentation. While they can lead to improved performance, they are either
afflicted by high computational complexity, and therefore inefficient, or they
lack the fine details of smaller instances. To alleviate this problem, we
propose AF2-S3Net, an end-to-end encoder-decoder CNN for 3D LiDAR semantic
segmentation. We present a novel multi-branch attentive feature fusion module
in the encoder and a unique adaptive feature selection module with feature map
re-weighting in the decoder. Our AF2-S3Net fuses voxel-based and point-based
learning into a single framework to effectively process large 3D scenes. Our
experimental results show that the proposed method outperforms
state-of-the-art approaches on the large-scale SemanticKITTI benchmark,
ranking 1st on the competitive public leaderboard upon publication.
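The abstract names the fusion and selection modules but gives no implementation details. As a rough illustration only, the sketch below shows one common way to fuse multiple encoder branches with learned feature-map re-weighting in PyTorch; the class name, dense tensor layout, and gating design are assumptions rather than the authors' code, which operates on sparse voxel features.

```python
# Hypothetical sketch of multi-branch fusion with learned feature-map
# re-weighting; names and shapes are assumptions, not the authors' code.
import torch
import torch.nn as nn

class AttentiveBranchFusion(nn.Module):
    """Fuses parallel branch feature maps with learned per-branch weights."""
    def __init__(self, channels: int, num_branches: int = 3):
        super().__init__()
        # A small gating network predicts one scalar weight per branch
        # from globally pooled statistics of the concatenated branches.
        self.gate = nn.Sequential(
            nn.Linear(channels * num_branches, channels),
            nn.ReLU(inplace=True),
            nn.Linear(channels, num_branches),
            nn.Softmax(dim=-1),
        )

    def forward(self, branches: list[torch.Tensor]) -> torch.Tensor:
        # branches: list of (N, C, D, H, W) dense voxel features.
        pooled = torch.cat([b.mean(dim=(2, 3, 4)) for b in branches], dim=1)
        weights = self.gate(pooled)  # (N, num_branches), sums to 1 per sample
        fused = sum(w.view(-1, 1, 1, 1, 1) * b
                    for w, b in zip(weights.unbind(dim=1), branches))
        return fused

# Usage sketch: three branches sharing one dense shape, for brevity.
fusion = AttentiveBranchFusion(channels=64, num_branches=3)
out = fusion([torch.randn(2, 64, 16, 16, 16) for _ in range(3)])
```

In the actual model, point-based features would also be fused alongside the voxel features; here all branches are assumed to share one dense shape to keep the example self-contained.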
Related papers
- ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z)
- Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z)
- Domain Adaptive Semantic Segmentation by Optimal Transport [13.133890240271308]
Semantic scene segmentation has received a great deal of attention due to the richness of the semantic information it provides.
Current approaches are mainly based on convolutional neural networks (CNNs), but they rely on a large number of labels.
We propose a domain adaptation (DA) framework based on optimal transport (OT) and attention mechanism to address this issue.
arXiv Detail & Related papers (2023-03-29T03:33:54Z)
- LENet: Lightweight And Efficient LiDAR Semantic Segmentation Using Multi-Scale Convolution Attention [0.0]
We propose a projection-based semantic segmentation network called LENet with an encoder-decoder structure for LiDAR-based semantic segmentation.
The encoder is composed of a novel multi-scale convolutional attention (MSCA) module with varying receptive field sizes to capture features.
We show that our proposed method is lighter, more efficient, and robust compared to state-of-the-art semantic segmentation methods.
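The summary above does not specify the MSCA block's internals. Below is a minimal, hypothetical sketch of multi-scale convolutional attention for a projection-based (range-image) network: parallel depthwise convolutions with growing kernel sizes approximate the varying receptive fields, and their mixed response re-weights the input features. All names and design choices here are assumptions, not LENet's implementation.

```python
# Hypothetical multi-scale convolutional attention block for a
# range-image (2D projection) network; MSCA details are assumed.
import torch
import torch.nn as nn

class MultiScaleConvAttention(nn.Module):
    def __init__(self, channels: int, kernel_sizes=(3, 5, 7)):
        super().__init__()
        # Depthwise convolutions with increasing receptive field sizes.
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
            for k in kernel_sizes
        )
        # 1x1 conv mixes the summed multi-scale responses into an
        # attention map that re-weights the input features.
        self.mix = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = sum(branch(x) for branch in self.branches)
        return x * torch.sigmoid(self.mix(attn))
```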
arXiv Detail & Related papers (2023-01-11T02:51:38Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task: reconstructing the surface from which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
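As a hedged illustration of this kind of pretext objective (not the ALSO reference code), surface reconstruction can be posed as binary occupancy classification of query points sampled around the observed surface:

```python
# Hypothetical sketch of an occupancy-reconstruction pretext objective;
# the backbone interface and sampling scheme are assumptions.
import torch
import torch.nn.functional as F

def occupancy_pretext_loss(backbone, points, queries, labels):
    """points:  (N, 3) LiDAR points fed to the perception backbone.
    queries: (M, 3) 3D query locations sampled around the surface.
    labels:  (M,) float, 1.0 if a query lies inside/behind the surface,
             0.0 if it lies in free space.
    `backbone` is assumed to return per-query occupancy logits; this
    interface is illustrative, not the ALSO implementation."""
    logits = backbone(points, queries)  # (M,) occupancy logits
    return F.binary_cross_entropy_with_logits(logits, labels)
```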
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- Omni-supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning [41.83979510282989]
We bring the first omni-scale supervision method to point cloud segmentation via the proposed gradual Receptive Field Component Reasoning (RFCR).
Our method achieves new state-of-the-art performance on S3DIS and Semantic3D, and ranks 1st on the ScanNet benchmark among all point-based methods.
arXiv Detail & Related papers (2021-05-21T08:32:02Z)
- S3Net: 3D LiDAR Sparse Semantic Segmentation Network [1.330528227599978]
S3Net is a novel convolutional neural network for LiDAR point cloud semantic segmentation.
It adopts an encoder-decoder backbone that consists of a Sparse Intra-channel Attention Module (SIntraAM) and a Sparse Inter-channel Attention Module (SInterAM).
arXiv Detail & Related papers (2021-03-15T22:15:24Z)
- F2Net: Learning to Focus on the Foreground for Unsupervised Video Object Segmentation [61.74261802856947]
We propose a novel Focus on Foreground Network (F2Net), which delves into intra- and inter-frame details of the foreground objects.
Our proposed network consists of three main parts: Siamese Module, Center Guiding Appearance Diffusion Module, and Dynamic Information Fusion Module.
Experiments on the DAVIS2016, YouTube-Objects, and FBMS datasets show that our proposed F2Net achieves state-of-the-art performance with significant improvements.
arXiv Detail & Related papers (2020-12-04T11:30:50Z)
- LiDAR-based Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
LiDAR-based panoptic segmentation aims to parse both objects and scenes in a unified manner.
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
Our proposed DS-Net achieves superior accuracy over current state-of-the-art methods.
arXiv Detail & Related papers (2020-11-24T08:44:46Z)
- Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection [102.62963605429508]
Point cloud semantic segmentation plays an essential role in autonomous driving.
Current 3D semantic segmentation networks focus on convolutional architectures that perform well for well-represented classes.
We propose a Detection Aware 3D Semantic Segmentation (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
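To make the auxiliary-task idea concrete, here is a minimal, hypothetical multi-task wrapper (not the DASS architecture): a shared point-cloud backbone feeds both a segmentation head and a detection head, so the detection loss shapes the shared localization features. All module names and the loss weighting are assumptions.

```python
# Hypothetical multi-task objective: semantic segmentation with an
# auxiliary 3D detection head sharing the point-cloud backbone.
import torch.nn as nn

class SegWithAuxDetection(nn.Module):
    def __init__(self, backbone, seg_head, det_head, aux_weight: float = 0.5):
        super().__init__()
        self.backbone = backbone
        self.seg_head = seg_head
        self.det_head = det_head
        self.aux_weight = aux_weight  # assumed trade-off hyperparameter

    def forward(self, points, seg_labels, det_targets):
        feats = self.backbone(points)
        seg_loss = self.seg_head(feats, seg_labels)
        det_loss = self.det_head(feats, det_targets)  # localization supervision
        # The detection loss regularizes the shared features used for
        # segmentation; both heads are trained jointly.
        return seg_loss + self.aux_weight * det_loss
```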
arXiv Detail & Related papers (2020-09-22T14:17:40Z)