RangeSeg: Range-Aware Real Time Segmentation of 3D LiDAR Point Clouds
- URL: http://arxiv.org/abs/2205.01570v1
- Date: Mon, 2 May 2022 09:57:59 GMT
- Title: RangeSeg: Range-Aware Real Time Segmentation of 3D LiDAR Point Clouds
- Authors: Tzu-Hsuan Chen and Tian-Sheuan Chang
- Abstract summary: This paper takes advantage of the uneven range distribution of different LiDAR laser beams to propose a range-aware instance segmentation network, RangeSeg.
Experiments on the KITTI dataset show that RangeSeg outperforms state-of-the-art semantic segmentation methods with an enormous speedup.
The whole RangeSeg pipeline meets the real-time requirement on an NVIDIA® Jetson AGX Xavier, running at 19 frames per second on average.
- Score: 0.6119392435448721
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Semantic outdoor scene understanding based on 3D LiDAR point clouds is a
challenging task for autonomous driving due to the sparse and irregular data
structure. This paper takes advantage of the uneven range distribution of
different LiDAR laser beams to propose a range-aware instance segmentation
network, RangeSeg. RangeSeg uses a shared encoder backbone with two
range-dependent decoders. A heavy decoder computes only the top of the range
image, where far and small objects are located, to improve small-object
detection accuracy, and a light decoder computes the whole range image at low
computational cost. The
results are further clustered by the DBSCAN method with a resolution-weighted
distance function to get instance-level segmentation results. Experiments on
the KITTI dataset show that RangeSeg outperforms the state-of-the-art semantic
segmentation methods with an enormous speedup and improves the instance-level
segmentation performance on small and far objects. The whole RangeSeg pipeline
meets the real-time requirement on an NVIDIA® Jetson AGX Xavier, running at 19
frames per second on average.
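To make the two-decoder idea concrete, here is a minimal sketch in PyTorch. The layer sizes and the `top_rows` split are invented for illustration; the paper's actual backbone and decoder designs are not reproduced here.

```python
import torch
import torch.nn as nn

class RangeAwareSeg(nn.Module):
    """Toy shared-encoder / dual-decoder model over a 64 x 2048 range image."""

    def __init__(self, num_classes=4, top_rows=16):
        super().__init__()
        self.top_rows = top_rows  # assumed band that holds far/small objects
        # stand-in for the shared encoder backbone
        self.encoder = nn.Sequential(
            nn.Conv2d(5, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        # heavy decoder: extra capacity, applied only to the top band
        self.heavy_decoder = nn.Sequential(
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, num_classes, 1),
        )
        # light decoder: one cheap layer over the full range image
        self.light_decoder = nn.Conv2d(64, num_classes, 1)

    def forward(self, x):  # x: (B, 5, 64, 2048) with x, y, z, range, remission
        feats = self.encoder(x)
        logits = self.light_decoder(feats).clone()  # whole image, low cost
        top = self.heavy_decoder(feats[:, :, :self.top_rows])
        logits[:, :, :self.top_rows] = top          # heavy output wins on the top band
        return logits

model = RangeAwareSeg()
print(model(torch.randn(1, 5, 64, 2048)).shape)  # torch.Size([1, 4, 64, 2048])
```

The DBSCAN step can likewise be sketched with scikit-learn's callable-metric support; the axis weights below are placeholders, not the paper's resolution weighting:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def weighted_dist(p, q, w=np.array([1.0, 1.0, 0.5])):
    # weight axes unevenly to mimic a resolution-dependent metric (assumed values)
    return np.sqrt(np.sum(w * (p - q) ** 2))

points = np.random.rand(300, 3) * 10   # x, y, z of points from one predicted class
labels = DBSCAN(eps=0.8, min_samples=5, metric=weighted_dist).fit_predict(points)
print(np.unique(labels))               # cluster ids per point; -1 marks noise
```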
Related papers
- DiffCut: Catalyzing Zero-Shot Semantic Segmentation with Diffusion Features and Recursive Normalized Cut [62.63481844384229]
Foundation models have emerged as powerful tools across various domains including language, vision, and multimodal tasks.
In this paper, we use a diffusion UNet encoder as a foundation vision encoder and introduce DiffCut, an unsupervised zero-shot segmentation method.
Our work highlights the remarkably accurate semantic knowledge embedded within diffusion UNet encoders that could then serve as foundation vision encoders for downstream tasks.
arXiv Detail & Related papers (2024-06-05T01:32:31Z)
- UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase [43.95911443801265]
We present a unified multi-modal LiDAR segmentation network, termed UniSeg.
It accomplishes semantic segmentation and panoptic segmentation simultaneously.
We also construct OpenPCSeg, the largest and most comprehensive outdoor LiDAR segmentation codebase.
arXiv Detail & Related papers (2023-09-11T16:00:22Z)
- V-DETR: DETR with Vertex Relative Position Encoding for 3D Object Detection [73.37781484123536]
We introduce a highly performant 3D object detector for point clouds using the DETR framework.
To address this limitation, we introduce a novel 3D Vertex Relative Position Encoding (3DV-RPE) method.
We show exceptional results on the challenging ScanNetV2 benchmark.
arXiv Detail & Related papers (2023-08-08T17:14:14Z)
- Rethinking Range View Representation for LiDAR Segmentation [66.73116059734788]
"Many-to-one" mapping, semantic incoherence, and shape deformation are possible impediments against effective learning from range view projections.
We present RangeFormer, a full-cycle framework comprising novel designs across network architecture, data augmentation, and post-processing.
We show that, for the first time, a range view method is able to surpass the point, voxel, and multi-view fusion counterparts in the competing LiDAR semantic and panoptic segmentation benchmarks.
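For context, the spherical projection behind the range view is easy to write down; the sketch below (typical 64-beam field-of-view values assumed, not RangeFormer's code) shows where the "many-to-one" mapping comes from: several 3D points can land in the same pixel, and only the nearest one survives.

```python
import numpy as np

def project_to_range_image(points, H=64, W=2048, fov_up=3.0, fov_down=-25.0):
    """Project (N, 3) x, y, z points to an H x W range image (depths in meters)."""
    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(points[:, 1], points[:, 0])
    pitch = np.arcsin(points[:, 2] / r)
    u = ((0.5 * (1.0 - yaw / np.pi)) * W).astype(int) % W
    v = ((1.0 - (pitch - fov_down) / (fov_up - fov_down)) * H).astype(int)
    v = np.clip(v, 0, H - 1)
    image = np.full((H, W), np.inf)
    order = np.argsort(-r)                 # write far points first ...
    image[v[order], u[order]] = r[order]   # ... so nearer points overwrite them
    return image

image = project_to_range_image(np.random.randn(1000, 3) * 20)
print(np.isfinite(image).sum(), "pixels filled from 1000 points")  # < 1000: many-to-one
```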
arXiv Detail & Related papers (2023-03-09T16:13:27Z)
- Super Sparse 3D Object Detection [48.684300007948906]
LiDAR-based 3D object detection plays an ever-increasing role in long-range perception for autonomous driving.
To enable efficient long-range detection, we first propose a fully sparse object detector termed FSD.
Its extension, FSD++, generates residual points, which indicate the point changes between consecutive frames.
arXiv Detail & Related papers (2023-01-05T17:03:56Z)
- Fully Sparse 3D Object Detection [57.05834683261658]
We build a fully sparse 3D object detector (FSD) for long-range LiDAR-based object detection.
FSD is built upon the general sparse voxel encoder and a novel sparse instance recognition (SIR) module.
SIR avoids the time-consuming neighbor queries in previous point-based methods by grouping points into instances.
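A toy version of that grouping idea (the instance ids below are hypothetical predictions, not FSD's actual SIR module) pools point features per instance rather than querying each point's neighbors:

```python
import numpy as np

feats = np.random.rand(1000, 16)            # per-point features
inst = np.random.randint(0, 10, size=1000)  # assumed predicted instance ids
# one pooled feature vector per instance, no k-NN or radius search needed
pooled = np.stack([feats[inst == i].mean(axis=0) for i in np.unique(inst)])
print(pooled.shape)  # (10, 16)
```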
arXiv Detail & Related papers (2022-07-20T17:01:33Z)
- Meta-RangeSeg: LiDAR Sequence Semantic Segmentation Using Multiple Feature Aggregation [21.337629798133324]
We propose a novel approach to semantic segmentation for LiDAR sequences named Meta-RangeSeg.
A novel range residual image representation is introduced to capture the spatial-temporal information.
An efficient U-Net backbone is used to obtain the multi-scale features.
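One plausible reading of a range residual image, offered purely as an illustration rather than the paper's exact definition, is a per-pixel difference between the current range image and an aligned earlier one:

```python
import numpy as np

current = np.random.rand(64, 2048)     # range image at time t
previous = np.random.rand(64, 2048)    # earlier frame, assumed ego-motion aligned
residual = np.abs(current - previous)  # large values hint at moving objects
print(residual.mean())
```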
arXiv Detail & Related papers (2022-02-27T14:46:13Z)
- FIDNet: LiDAR Point Cloud Semantic Segmentation with Fully Interpolation Decoding [5.599306291149907]
Projecting the point cloud onto the 2D spherical range image transforms LiDAR semantic segmentation into a 2D segmentation task on the range image.
We propose a new projection-based LiDAR semantic segmentation pipeline that consists of a novel network structure and an efficient post-processing step.
Our pipeline achieves the best performance among all projection-based methods with $64 \times 2048$ resolution and all point-wise solutions.
arXiv Detail & Related papers (2021-09-08T17:20:09Z)
- RangeDet: In Defense of Range View for LiDAR-based 3D Object Detection [48.76483606935675]
We propose an anchor-free single-stage LiDAR-based 3D object detector -- RangeDet.
Compared with the commonly used voxelized or Bird's Eye View (BEV) representations, the range view representation is more compact and free of quantization error.
Our best model achieves 72.9/75.9/65.8 3D AP on vehicle/pedestrian/cyclist.
arXiv Detail & Related papers (2021-03-18T06:18:51Z)
- Range Conditioned Dilated Convolutions for Scale Invariant 3D Object Detection [41.59388513615775]
This paper presents a novel 3D object detection framework that processes LiDAR data directly on its native representation: range images.
Benefiting from the compactness of range images, 2D convolutions can efficiently process dense LiDAR data of a scene.
arXiv Detail & Related papers (2020-05-20T09:24:43Z)
- 3D Object Detection From LiDAR Data Using Distance Dependent Feature Extraction [7.04185696830272]
This work proposes an improvement for 3D object detectors by taking into account the properties of LiDAR point clouds over distance.
Results show that training separate networks for close-range and long-range objects boosts performance for all KITTI benchmark difficulties.
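The routing itself is simple to sketch; the 25 m threshold below is an assumption for illustration, not the paper's value:

```python
import numpy as np

def split_by_range(points, threshold=25.0):
    """Route points to a close-range or a long-range network by distance."""
    r = np.linalg.norm(points[:, :3], axis=1)
    return points[r < threshold], points[r >= threshold]

cloud = np.random.rand(500, 4) * 60  # x, y, z, intensity
near, far = split_by_range(cloud)    # near -> close-range net, far -> long-range net
print(len(near), "near points,", len(far), "far points")
```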
arXiv Detail & Related papers (2020-03-02T13:16:35Z)