RangeDet: In Defense of Range View for LiDAR-based 3D Object Detection
- URL: http://arxiv.org/abs/2103.10039v1
- Date: Thu, 18 Mar 2021 06:18:51 GMT
- Title: RangeDet: In Defense of Range View for LiDAR-based 3D Object Detection
- Authors: Lue Fan, Xuan Xiong, Feng Wang, Naiyan Wang, Zhaoxiang Zhang
- Abstract summary: We propose an anchor-free single-stage LiDAR-based 3D object detector -- RangeDet.
Compared with the commonly used voxelized or Bird's Eye View (BEV) representations, the range view representation is more compact and free of quantization error.
Our best model achieves 72.9/75.9/65.8 3D AP on vehicle/pedestrian/cyclist.
- Score: 48.76483606935675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose an anchor-free single-stage LiDAR-based 3D object
detector -- RangeDet. The most notable difference with previous works is that
our method is purely based on the range view representation. Compared with the
commonly used voxelized or Bird's Eye View (BEV) representations, the range
view representation is more compact and free of quantization error. Although
some works adopt it for semantic segmentation, its performance in object
detection lags far behind that of voxelized or BEV counterparts. We first
analyze the existing range-view-based methods and find two issues overlooked by
previous works: 1) the scale variation between nearby and far away objects; 2)
the inconsistency between the 2D range image coordinates used in feature
extraction and the 3D Cartesian coordinates used in output. Then we
deliberately design three components to address these issues in our RangeDet.
We evaluate RangeDet on the large-scale Waymo Open Dataset (WOD). Our best
model achieves 72.9/75.9/65.8 3D AP on vehicle/pedestrian/cyclist. These
results outperform other range-view-based methods by a large margin (~20 3D AP
in vehicle detection), and are overall comparable with the state-of-the-art
multi-view-based methods. Code will be made public.
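For context on the range view representation the abstract defends, here is a minimal sketch of the standard spherical projection of a LiDAR sweep into a range image (the image size and vertical field of view are illustrative assumptions, not RangeDet's exact configuration):

```python
import numpy as np

def points_to_range_image(points, height=64, width=2048,
                          fov_up_deg=2.0, fov_down_deg=-25.0):
    """Project an (N, 3) LiDAR point cloud into an (H, W) range image.

    Rows come from elevation, columns from azimuth, and each pixel
    stores the range (distance) of the point that falls into it.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)                   # range per point
    azimuth = np.arctan2(y, x)                           # [-pi, pi]
    elevation = np.arcsin(z / np.clip(r, 1e-6, None))    # vertical angle

    fov_up, fov_down = np.deg2rad(fov_up_deg), np.deg2rad(fov_down_deg)
    cols = ((azimuth + np.pi) / (2 * np.pi) * width).astype(np.int64)
    rows = ((fov_up - elevation) / (fov_up - fov_down) * height).astype(np.int64)
    cols = np.clip(cols, 0, width - 1)
    rows = np.clip(rows, 0, height - 1)

    range_image = np.full((height, width), -1.0, dtype=np.float32)
    order = np.argsort(-r)           # write far points first ...
    range_image[rows[order], cols[order]] = r[order]  # ... near ones overwrite
    return range_image

pts = np.random.uniform(-50, 50, size=(100_000, 3))
print(points_to_range_image(pts).shape)  # (64, 2048)
```

Note that the Waymo Open Dataset ships range images as its native format, which is one reason range-view methods are a natural fit there.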
Related papers
- What Matters in Range View 3D Object Detection [15.147558647138629]
LiDAR-based perception pipelines rely on 3D object detection models to interpret complex scenes.
We achieve state-of-the-art performance among range-view 3D object detection models without using several techniques proposed in past range-view literature.
arXiv Detail & Related papers (2024-07-23T18:42:37Z)
- Far3D: Expanding the Horizon for Surround-view 3D Object Detection [15.045811199986924]
This paper proposes a novel sparse query-based framework, dubbed Far3D.
By utilizing high-quality 2D object priors, we generate 3D adaptive queries that complement the 3D global queries.
We demonstrate SoTA performance on the challenging Argoverse 2 dataset, covering a perception range of 150 meters.
arXiv Detail & Related papers (2023-08-18T15:19:17Z)
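The summary above does not spell out how the 2D priors become queries. As a hedged illustration, one plausible way to lift 2D detections into 3D query reference points is pinhole unprojection with a per-box predicted depth (the function, camera model, and inputs below are assumptions for illustration, not Far3D's actual implementation):

```python
import numpy as np

def lift_2d_boxes_to_3d_queries(boxes_2d, depths, K, cam_to_world):
    """Unproject 2D box centers into 3D reference points for adaptive queries.

    boxes_2d:     (N, 4) as (x1, y1, x2, y2) in pixels.
    depths:       (N,) predicted metric depth per box (assumed given here).
    K:            (3, 3) camera intrinsics.
    cam_to_world: (4, 4) camera-to-ego/world transform.
    """
    centers = np.stack([(boxes_2d[:, 0] + boxes_2d[:, 2]) / 2,
                        (boxes_2d[:, 1] + boxes_2d[:, 3]) / 2], axis=1)
    ones = np.ones((centers.shape[0], 1))
    pix = np.concatenate([centers, ones], axis=1)      # homogeneous pixels
    rays = (np.linalg.inv(K) @ pix.T).T                # camera-frame rays
    pts_cam = rays * depths[:, None]                   # scale rays by depth
    pts_h = np.concatenate([pts_cam, ones], axis=1)
    return (cam_to_world @ pts_h.T).T[:, :3]           # 3D query anchors

# Example: two detections at 20 m and 80 m.
K = np.array([[1000., 0., 640.], [0., 1000., 360.], [0., 0., 1.]])
boxes = np.array([[600., 300., 680., 420.], [630., 340., 650., 380.]])
print(lift_2d_boxes_to_3d_queries(boxes, np.array([20., 80.]), K, np.eye(4)))
```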
- Fully Sparse Fusion for 3D Object Detection [69.32694845027927]
Currently prevalent multimodal 3D detection methods are built upon LiDAR-based detectors that usually use dense Bird's-Eye-View feature maps.
Fully sparse architectures are gaining attention as they are highly efficient for long-range perception.
In this paper, we study how to effectively leverage image modality in the emerging fully sparse architecture.
arXiv Detail & Related papers (2023-04-24T17:57:43Z)
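The summary does not describe the paper's actual fusion mechanism. As a generic illustration of attaching image features to a sparse LiDAR representation, here is a point-painting-style sketch (point painting is an explicitly named stand-in technique, not this paper's method; the projection matrix and feature map are assumed inputs):

```python
import numpy as np

def paint_points_with_image_features(points, img_feats, lidar_to_img):
    """Attach image features to sparse LiDAR points (point-painting style).

    points:       (N, 3) LiDAR points in the sensor frame.
    img_feats:    (C, H, W) feature map from an image backbone.
    lidar_to_img: (3, 4) projection matrix (intrinsics @ extrinsics).
    Returns (M, 3 + C) painted points for the M points projecting in-bounds.
    """
    C, H, W = img_feats.shape
    pts_h = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    proj = (lidar_to_img @ pts_h.T).T                  # (N, 3) homogeneous
    depth = proj[:, 2]
    uv = proj[:, :2] / np.clip(depth[:, None], 1e-6, None)
    u, v = uv[:, 0].astype(int), uv[:, 1].astype(int)
    valid = (depth > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    feats = img_feats[:, v[valid], u[valid]].T         # (M, C) sampled features
    return np.concatenate([points[valid], feats], axis=1)

# Example, assuming the LiDAR frame coincides with the camera frame.
K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
P = np.hstack([K, np.zeros((3, 1))])
pts = np.random.uniform([-5, -5, 1], [5, 5, 50], (1000, 3))
print(paint_points_with_image_features(pts, np.random.rand(64, 480, 640), P).shape)
```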
- Embracing Single Stride 3D Object Detector with Sparse Transformer [63.179720817019096]
In LiDAR-based 3D object detection for autonomous driving, the ratio of object size to input scene size is significantly smaller than in 2D detection.
Many 3D detectors directly follow the common practice of 2D detectors and downsample the feature maps, even after quantizing the point clouds.
We propose Single-stride Sparse Transformer (SST) to maintain the original resolution from the beginning to the end of the network.
arXiv Detail & Related papers (2021-12-13T02:12:02Z)
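The size-ratio observation is easy to quantify: a ~4 m car in a ~150 m scene rasterized to a 512-cell grid spans roughly 14 cells, and every 2x downsample halves that footprint. Below is a minimal single-stride sketch, a dense stand-in that keeps stride 1 everywhere and omits SST's sparse window attention (layer sizes are illustrative):

```python
import torch
import torch.nn as nn

class SingleStrideBackbone(nn.Module):
    """Toy dense stand-in: every conv keeps stride 1, so the grid resolution
    (and thus a small object's footprint) is never reduced end to end."""
    def __init__(self, channels=64, depth=4):
        super().__init__()
        layers = []
        for _ in range(depth):
            layers += [nn.Conv2d(channels, channels, 3, stride=1, padding=1),
                       nn.BatchNorm2d(channels), nn.ReLU(inplace=True)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

bev = torch.randn(1, 64, 256, 256)    # pseudo-image from voxelized points
print(SingleStrideBackbone()(bev).shape)  # (1, 64, 256, 256) -- no downsampling
```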
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first, depth estimation is performed and a pseudo-LiDAR point cloud is computed from the depth estimates; then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
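For reference, the pseudo-LiDAR step of the two-step baseline described above back-projects a dense depth map into a point cloud; a minimal sketch (the KITTI-like intrinsics are illustrative):

```python
import numpy as np

def depth_to_pseudo_lidar(depth, K):
    """Back-project a dense depth map (H, W) into an (H*W, 3) point cloud:
    the classic pseudo-LiDAR step that PLUME's unified design avoids."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    z = depth
    x = (u - cx) * z / fx          # pinhole back-projection
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

K = np.array([[721.5, 0., 609.6], [0., 721.5, 172.9], [0., 0., 1.]])
cloud = depth_to_pseudo_lidar(np.full((375, 1242), 10.0), K)
print(cloud.shape)  # (465750, 3)
```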
- RangeRCNN: Towards Fast and Accurate 3D Object Detection with Range Image Representation [35.6155506566957]
RangeRCNN is a novel and effective 3D object detection framework based on the range image representation.
In this paper, we utilize the dilated residual block (DRB) to better adapt to different object scales and obtain a more flexible receptive field.
Experiments show that RangeRCNN achieves state-of-the-art performance on the KITTI dataset and the Waymo Open Dataset.
arXiv Detail & Related papers (2020-09-01T03:28:13Z)
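The summary does not give the DRB's exact structure; the sketch below is one plausible reading: a residual block with parallel dilated convolutions over the range image, so a single block covers multiple receptive-field sizes (branch layout and dilation rates are assumptions for illustration):

```python
import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    """Residual block mixing several dilation rates so one block sees both
    nearby (large, many-pixel) and distant (small, few-pixel) objects."""
    def __init__(self, channels=32, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d),
                nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
            for d in dilations])
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1)

    def forward(self, x):
        out = torch.cat([b(x) for b in self.branches], dim=1)
        return torch.relu(x + self.fuse(out))   # residual connection

feat = torch.randn(1, 32, 64, 512)   # range-image-shaped feature map (downsized demo)
print(DilatedResidualBlock()(feat).shape)  # (1, 32, 64, 512)
```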
- BirdNet+: End-to-End 3D Object Detection in LiDAR Bird's Eye View [117.44028458220427]
On-board 3D object detection in autonomous vehicles often relies on geometry information captured by LiDAR devices.
We present a fully end-to-end 3D object detection framework that can infer oriented 3D boxes solely from BEV images.
arXiv Detail & Related papers (2020-03-09T15:08:40Z)
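As a sketch of the BEV-image input such detectors consume, here is a minimal rasterization of a LiDAR sweep into height/intensity/density channels, a common BEV encoding (the cell size and ranges are illustrative, and this is not guaranteed to match BirdNet+'s exact preprocessing):

```python
import numpy as np

def lidar_to_bev_image(points, intensity, x_range=(0, 80), y_range=(-40, 40),
                       cell=0.1):
    """Rasterize a LiDAR sweep into a 3-channel BEV image:
    max height, max intensity, and log-normalized point density."""
    W = int((x_range[1] - x_range[0]) / cell)
    H = int((y_range[1] - y_range[0]) / cell)
    ix = ((points[:, 0] - x_range[0]) / cell).astype(int)
    iy = ((points[:, 1] - y_range[0]) / cell).astype(int)
    keep = (ix >= 0) & (ix < W) & (iy >= 0) & (iy < H)
    ix, iy = ix[keep], iy[keep]
    z, inten = points[keep, 2], intensity[keep]

    bev = np.zeros((3, H, W), dtype=np.float32)
    np.maximum.at(bev[0], (iy, ix), z)      # max height per cell
                                            # (toy: 0-init clips below-ground heights)
    np.maximum.at(bev[1], (iy, ix), inten)  # max intensity per cell
    np.add.at(bev[2], (iy, ix), 1.0)        # point count per cell
    bev[2] = np.log1p(bev[2])               # normalized density
    return bev

pts = np.random.uniform([0, -40, -2], [80, 40, 4], (50_000, 3))
print(lidar_to_bev_image(pts, np.random.rand(50_000)).shape)  # (3, 800, 800)
```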
- ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection [69.68263074432224]
We present a novel framework named ZoomNet for stereo imagery-based 3D detection.
The pipeline of ZoomNet begins with an ordinary 2D object detection model used to obtain pairs of left-right bounding boxes.
To further exploit the abundant texture cues in RGB images for more accurate disparity estimation, we introduce a conceptually straightforward module -- adaptive zooming.
arXiv Detail & Related papers (2020-03-01T17:18:08Z)
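A hedged sketch of the adaptive-zooming idea: crop a detected box, resize it to a fixed resolution, and rescale the camera intrinsics to match so that downstream geometry (e.g., disparity) stays consistent across object scales (the function and its details are illustrative, not ZoomNet's actual module):

```python
import numpy as np

def adaptive_zoom(image, box, out_size=(128, 128), K=None):
    """Crop a detected box, 'zoom' it to a fixed resolution, and return the
    intrinsics rescaled to match, so metric geometry stays consistent."""
    x1, y1, x2, y2 = [int(round(c)) for c in box]
    crop = image[y1:y2, x1:x2]
    sy = out_size[0] / max(y2 - y1, 1)
    sx = out_size[1] / max(x2 - x1, 1)
    # Nearest-neighbor resize via index sampling (no external dependencies).
    rows = (np.arange(out_size[0]) / sy).astype(int).clip(0, crop.shape[0] - 1)
    cols = (np.arange(out_size[1]) / sx).astype(int).clip(0, crop.shape[1] - 1)
    zoomed = crop[rows[:, None], cols[None, :]]

    K_new = None
    if K is not None:
        K_new = K.astype(float).copy()
        K_new[0, :] *= sx                    # fx scales with horizontal zoom
        K_new[1, :] *= sy                    # fy scales with vertical zoom
        K_new[0, 2] = (K[0, 2] - x1) * sx    # principal point shifts with crop
        K_new[1, 2] = (K[1, 2] - y1) * sy
    return zoomed, K_new

img = np.random.rand(720, 1280, 3)
K = np.array([[1000., 0., 640.], [0., 1000., 360.], [0., 0., 1.]])
patch, Kp = adaptive_zoom(img, (600, 300, 680, 420), K=K)
print(patch.shape, Kp[0, 0])  # (128, 128, 3) and the rescaled fx
```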
This list is automatically generated from the titles and abstracts of the papers on this site.