Meta-RangeSeg: LiDAR Sequence Semantic Segmentation Using Multiple Feature Aggregation
- URL: http://arxiv.org/abs/2202.13377v2
- Date: Thu, 3 Mar 2022 09:01:17 GMT
- Title: Meta-RangeSeg: LiDAR Sequence Semantic Segmentation Using Multiple Feature Aggregation
- Authors: Song Wang, Jianke Zhu, Ruixiang Zhang
- Abstract summary: We propose a novel approach to semantic segmentation for LiDAR sequences named Meta-RangeSeg.
A novel range residual image representation is introduced to capture the spatial-temporal information.
An efficient U-Net backbone is used to obtain the multi-scale features.
- Score: 21.337629798133324
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The LiDAR sensor is essential to the perception systems of autonomous vehicles and intelligent robots. To meet the real-time requirements of real-world applications, LiDAR scans must be segmented efficiently. Most previous approaches directly project the 3D point cloud onto a 2D spherical range image so that efficient 2D convolutional operations can be used for segmentation. Although these methods achieve encouraging results, neighborhood information is not well preserved under the spherical projection. Moreover, temporal information is not taken into consideration in the single-scan segmentation task. To tackle these problems, we propose Meta-RangeSeg, a novel approach to semantic segmentation for LiDAR sequences, in which a novel range residual image representation captures the spatio-temporal information. Specifically, a Meta-Kernel is employed to extract meta features, which reduces the inconsistency between the 2D range-image coordinates of the input and the Cartesian coordinates of the output. An efficient U-Net backbone is used to obtain multi-scale features. Furthermore, a Feature Aggregation Module (FAM) aggregates the meta features and multi-scale features, strengthening the role of the range channel. We have conducted extensive experiments on SemanticKITTI, the de facto benchmark for LiDAR semantic segmentation. The promising results show that our proposed Meta-RangeSeg method is more efficient and effective than existing approaches. Our full implementation is publicly available at
https://github.com/songw-zju/Meta-RangeSeg .
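The core of the method is the range residual image: past scans are projected onto the same 2D spherical grid as the current scan, and per-pixel range residuals expose moving objects to the 2D backbone. Below is a minimal NumPy sketch of how such a representation could be built, assuming past scans have already been ego-motion aligned to the current frame; the (64, 2048) resolution, field-of-view bounds, and residual normalization are illustrative assumptions, not the authors' code.

```python
import numpy as np

def spherical_project(points, h=64, w=2048, fov_up=3.0, fov_down=-25.0):
    """Project an (N, 3) point cloud onto an (h, w) spherical range image."""
    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    r = np.linalg.norm(points, axis=1)                     # range per point
    yaw = np.arctan2(points[:, 1], points[:, 0])           # azimuth angle
    pitch = np.arcsin(points[:, 2] / np.maximum(r, 1e-8))  # elevation angle
    u = (0.5 * (1.0 - yaw / np.pi) * w).astype(np.int32)   # column index
    v = ((1.0 - (pitch - fov_down) / (fov_up - fov_down)) * h).astype(np.int32)
    img = np.zeros((h, w), dtype=np.float32)
    img[np.clip(v, 0, h - 1), np.clip(u, 0, w - 1)] = r    # last hit wins per pixel
    return img

def range_residual_image(current, past_scans):
    """Stack the current range image with residual images against past scans."""
    cur = spherical_project(current)
    channels = [cur]
    for past in past_scans:  # past scans assumed already aligned to the current pose
        prev = spherical_project(past)
        res = np.zeros_like(cur)
        valid = (cur > 0) & (prev > 0)
        res[valid] = np.abs(cur[valid] - prev[valid]) / cur[valid]  # normalized residual
        channels.append(res)
    return np.stack(channels, axis=0)  # (1 + T, h, w) input for the 2D backbone
```

Pixels whose range changes across frames receive large residual values, so motion becomes an explicit input channel rather than something the network must infer.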
Related papers
- ShapeSplat: A Large-scale Dataset of Gaussian Splats and Their Self-Supervised Pretraining [104.34751911174196]
We build a large-scale dataset of 3DGS using ShapeNet and ModelNet datasets.
Our dataset ShapeSplat consists of 65K objects from 87 unique categories.
We introduce Gaussian-MAE, which highlights the unique benefits of representation learning from Gaussian parameters.
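As a rough illustration of MAE-style pretraining on Gaussian parameters, here is a hypothetical masking step; the 0.6 mask ratio and the packing of all parameters into one vector per Gaussian are assumptions, not the paper's design.

```python
import torch

def mask_gaussian_tokens(params, mask_ratio=0.6, generator=None):
    """Randomly mask Gaussian-parameter tokens for MAE-style pretraining."""
    # params: (N, D) Gaussians; D packs position, scale, rotation, opacity, color
    n_keep = int(params.shape[0] * (1 - mask_ratio))
    perm = torch.randperm(params.shape[0], generator=generator)
    keep, masked = perm[:n_keep], perm[n_keep:]
    return params[keep], keep, masked  # encoder sees only the visible tokens
```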
arXiv Detail & Related papers (2024-08-20T14:49:14Z)
- FASTC: A Fast Attentional Framework for Semantic Traversability Classification Using Point Cloud [7.711666704468952]
We address the problem of traversability assessment using point clouds.
We propose a pillar feature extraction module that utilizes PointNet to capture features from point clouds organized into vertical volumes.
We then propose a new temporal attention module to fuse multi-frame information, which properly handles the varying density of LiDAR point clouds.
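A minimal sketch of a PointNet-style pillar encoder in the spirit of the module above, not the authors' implementation; the MLP widths and the per-pillar point budget are assumptions.

```python
import torch
import torch.nn as nn

class PillarPointNet(nn.Module):
    """Shared MLP + masked max-pool over the points inside each vertical pillar."""
    def __init__(self, in_dim=3, feat_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim),
        )

    def forward(self, pillars, mask):
        # pillars: (P, M, 3) zero-padded points; mask: (P, M) True for real points
        x = self.mlp(pillars)                              # (P, M, feat_dim)
        x = x.masked_fill(~mask.unsqueeze(-1), float("-inf"))
        return x.max(dim=1).values                         # (P, feat_dim) per pillar
```

In pillar pipelines of this kind, the per-pillar features are typically scattered back onto a 2D grid to form a pseudo-image for ordinary 2D convolutions.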
arXiv Detail & Related papers (2024-06-24T12:01:55Z)
- Human Semantic Segmentation using Millimeter-Wave Radar Sparse Point Clouds [3.3888257250564364]
This paper presents a framework for semantic segmentation on sparse sequential point clouds of millimeter-wave radar.
Handling the sparsity of mmWave data and capturing its temporal-topological features remain open problems.
We introduce graph structure and topological features to the point cloud and propose a semantic segmentation framework.
Our model achieves a mean accuracy of 82.31% on a custom dataset, outperforming state-of-the-art algorithms.
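Introducing graph structure over a sparse radar point cloud typically starts from a k-nearest-neighbor graph; a small sketch follows, where k and the brute-force distance computation are illustrative choices, not the paper's.

```python
import numpy as np

def knn_graph(points, k=8):
    """Build k-nearest-neighbor edges over a sparse (N, 3) point cloud."""
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)                  # exclude self-loops
    nbrs = np.argsort(d, axis=1)[:, :k]          # k nearest neighbors per point
    src = np.repeat(np.arange(len(points)), k)
    return np.stack([src, nbrs.ravel()])         # (2, N*k) edge index
```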
arXiv Detail & Related papers (2023-04-27T12:28:06Z)
- Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [67.94357336206136]
We present a new cross-modal 3D object detector, namely UPIDet, which aims to unleash the potential of the image branch from two aspects.
First, UPIDet introduces a new 2D auxiliary task called normalized local coordinate map estimation.
Second, we discover that the representational capability of the point cloud backbone can be enhanced through the gradients backpropagated from the training objectives of the image branch.
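One plausible reading of the normalized local coordinate (NLC) target is each point's position normalized to the ground-truth box that contains it; the sketch below is hypothetical, and the paper's exact target definition may differ.

```python
import numpy as np

def normalized_local_coords(points, box_center, box_size, box_yaw):
    """Map (N, 3) points inside a 3D box to box-local coordinates in [0, 1]."""
    c, s = np.cos(-box_yaw), np.sin(-box_yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    local = (points - box_center) @ rot.T  # rotate into the box-aligned frame
    return local / box_size + 0.5          # normalize each axis to [0, 1]
```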
arXiv Detail & Related papers (2023-01-22T08:26:58Z)
- LENet: Lightweight And Efficient LiDAR Semantic Segmentation Using Multi-Scale Convolution Attention [0.0]
We propose LENet, a projection-based LiDAR semantic segmentation network with an encoder-decoder structure.
The encoder is composed of a novel multi-scale convolutional attention (MSCA) module with varying receptive field sizes to capture features.
We show that our proposed method is lighter, more efficient, and robust compared to state-of-the-art semantic segmentation methods.
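A toy multi-scale convolutional attention block in the spirit described above: depthwise branches at several kernel sizes produce summed context that gates the input. The kernel sizes and gating are assumptions, not LENet's exact design.

```python
import torch
import torch.nn as nn

class MultiScaleConvAttention(nn.Module):
    """Depthwise convolutions at several receptive fields, used as attention."""
    def __init__(self, dim, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(dim, dim, k, padding=k // 2, groups=dim) for k in kernel_sizes
        ])
        self.proj = nn.Conv2d(dim, dim, 1)

    def forward(self, x):
        attn = sum(branch(x) for branch in self.branches)  # multi-scale context
        return x * torch.sigmoid(self.proj(attn))          # gate the input features
```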
arXiv Detail & Related papers (2023-01-11T02:51:38Z)
- LWSIS: LiDAR-guided Weakly Supervised Instance Segmentation for Autonomous Driving [34.119642131912485]
We present a more artful framework, LiDAR-guided Weakly Supervised Instance Segmentation (LWSIS).
LWSIS uses off-the-shelf 3D data, i.e., point clouds together with 3D boxes, as natural weak supervision for training 2D image instance segmentation models.
Our LWSIS not only exploits the complementary information in multimodal data during training, but also significantly reduces the annotation cost of dense 2D masks.
arXiv Detail & Related papers (2022-12-07T08:08:01Z)
- FusionRCNN: LiDAR-Camera Fusion for Two-stage 3D Object Detection [11.962073589763676]
Existing 3D detectors significantly improve the accuracy by adopting a two-stage paradigm.
The sparsity of point clouds, especially for the points far away, makes it difficult for the LiDAR-only refinement module to accurately recognize and locate objects.
We propose a novel multi-modality two-stage approach named FusionRCNN, which effectively and efficiently fuses point clouds and camera images in the regions of interest (RoI).
FusionRCNN improves the strong SECOND baseline by 6.14% mAP and outperforms competing two-stage approaches.
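RoI-level fusion of the two modalities can be pictured as cross-attention from per-RoI point features to image features cropped from the same RoI; the tensor layout and residual connection in this sketch are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class RoIFusion(nn.Module):
    """Fuse per-RoI point features with image features via cross-attention."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, point_feats, image_feats):
        # point_feats: (R, Np, dim) sampled points per RoI
        # image_feats: (R, Hw, dim) image tokens cropped from each RoI
        fused, _ = self.attn(point_feats, image_feats, image_feats)
        return point_feats + fused  # residual fusion inside the RoI
```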
arXiv Detail & Related papers (2022-09-22T02:07:25Z)
- CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
We adopt transformers in this work and incorporate them into a hierarchical framework for shape classification as well as part and scene segmentation.
We also compute efficient and dynamic global cross-attention by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art mean accuracy on shape classification and yields segmentation results on par with previous methods.
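The sampling idea can be approximated by letting all points attend to a small sampled anchor set, dropping attention cost from O(N^2) to O(NK); this sketch uses an assumed random sampler, and the paper's sampling strategy may differ.

```python
import torch
import torch.nn as nn

class SampledCrossAttention(nn.Module):
    """All N points attend to a randomly sampled subset of K anchor points."""
    def __init__(self, dim, heads=4, k=128):
        super().__init__()
        self.k = k
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, feats):
        # feats: (B, N, dim) per-point features
        idx = torch.randperm(feats.shape[1], device=feats.device)[: self.k]
        anchors = feats[:, idx]                      # (B, K, dim) sampled subset
        out, _ = self.attn(feats, anchors, anchors)  # cost O(N*K) instead of O(N^2)
        return out
```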
arXiv Detail & Related papers (2022-07-31T21:39:15Z)
- DS-Net: Dynamic Spatiotemporal Network for Video Salient Object Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of spatial and temporal information.
We show that the proposed method achieves superior performance compared with state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z)
- LiDAR-based Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
LiDAR-based panoptic segmentation aims to parse both objects and scenes in a unified manner.
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
Our proposed DS-Net achieves superior accuracy over current state-of-the-art methods.
arXiv Detail & Related papers (2020-11-24T08:44:46Z)
- Segment as Points for Efficient Online Multi-Object Tracking and Segmentation [66.03023110058464]
We propose a highly effective method for learning instance embeddings based on segments by converting the compact image representation into an unordered 2D point cloud representation.
Our method generates a new tracking-by-points paradigm where discriminative instance embeddings are learned from randomly selected points rather than images.
The resulting online MOTS framework, named PointTrack, surpasses all the state-of-the-art methods by large margins.
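The segment-to-points conversion could look like the following hypothetical sketch: foreground pixels of an instance mask are sampled into an unordered 2D point cloud carrying centered position and color. The sample count and feature choice are assumptions, not PointTrack's exact recipe.

```python
import numpy as np

def mask_to_point_cloud(image, mask, n_points=256, rng=None):
    """Sample an instance mask into an unordered 2D point cloud with color."""
    rng = rng or np.random.default_rng()
    ys, xs = np.nonzero(mask)                          # foreground pixel coordinates
    pick = rng.choice(len(xs), size=n_points, replace=len(xs) < n_points)
    xy = np.stack([xs[pick], ys[pick]], axis=1).astype(np.float32)
    xy -= xy.mean(axis=0)                              # center for translation invariance
    rgb = image[ys[pick], xs[pick]].astype(np.float32) / 255.0
    return np.concatenate([xy, rgb], axis=1)           # (n_points, 5) unordered set
```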
arXiv Detail & Related papers (2020-07-03T08:29:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.