R2Det: Redemption from Range-view for Accurate 3D Object Detection
- URL: http://arxiv.org/abs/2307.11482v2
- Date: Thu, 24 Aug 2023 05:14:34 GMT
- Title: R2Det: Redemption from Range-view for Accurate 3D Object Detection
- Authors: Yihan Wang, Qiao Yan and Yi Wang
- Abstract summary: Redemption from Range-view Module (R2M) is a plug-and-play approach for 3D surface texture enhancement from the 2D range view to the 3D point view.
R2M can be seamlessly integrated into state-of-the-art LiDAR-based 3D object detectors as preprocessing.
- Score: 16.855672228478074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LiDAR-based 3D object detection is of paramount importance for autonomous
driving. Recent trends show remarkable improvements in bird's-eye-view (BEV)
based and point-based methods, which demonstrate superior performance compared
to their range-view counterparts. This paper presents an insight that leverages
range-view representation to enhance 3D points for accurate 3D object
detection. Specifically, we introduce a Redemption from Range-view Module
(R2M), a plug-and-play approach for 3D surface texture enhancement from the 2D
range view to the 3D point view. R2M comprises BasicBlock for 2D feature
extraction, Hierarchical-dilated (HD) Meta Kernel for expanding the 3D
receptive field, and Feature Points Redemption (FPR) for recovering 3D surface
texture information. R2M can be seamlessly integrated into state-of-the-art
LiDAR-based 3D object detectors as preprocessing and achieve appealing
improvement, e.g., 1.39%, 1.67%, and 1.97% mAP improvement on the easy, moderate,
and hard difficulty levels of the KITTI val set, respectively. Based on R2M, we
further propose R2Detector (R2Det) with the Synchronous-Grid RoI Pooling for
accurate box refinement. R2Det outperforms existing range-view-based methods by
a significant margin on both the KITTI benchmark and the Waymo Open Dataset.
Codes will be made publicly available.
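For intuition, here is a minimal PyTorch sketch of the plug-and-play pattern the abstract describes: extract 2D features from the range image (BasicBlock), widen the receptive field with dilated convolutions (a stand-in for the HD Meta Kernel), and gather the features back onto the 3D points (the redemption step). All names, channel sizes, and the gather-based projection are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class R2MSketch(nn.Module):
    def __init__(self, in_ch=5, feat_ch=32, dilations=(1, 2, 4)):
        super().__init__()
        # "BasicBlock": plain 2D feature extraction on the range image
        self.basic = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1),
            nn.BatchNorm2d(feat_ch),
            nn.ReLU(inplace=True),
        )
        # Stand-in for the Hierarchical-Dilated (HD) Meta Kernel: parallel
        # dilated convolutions that expand the effective 3D receptive field
        self.hd = nn.ModuleList(
            [nn.Conv2d(feat_ch, feat_ch, 3, padding=d, dilation=d)
             for d in dilations]
        )
        self.fuse = nn.Conv2d(feat_ch * len(dilations), feat_ch, 1)

    def forward(self, range_img, px_idx):
        # range_img: (B, in_ch, H, W) range-view projection of a LiDAR scan
        # px_idx:    (B, N) long tensor, flattened pixel index of each point
        f = self.basic(range_img)
        f = self.fuse(torch.cat([conv(f) for conv in self.hd], dim=1))
        # "Feature Points Redemption": gather per-pixel features back onto
        # the 3D points so a downstream detector sees enriched inputs
        b, c, hw = f.shape[0], f.shape[1], f.shape[2] * f.shape[3]
        point_feats = torch.gather(
            f.view(b, c, hw), 2, px_idx.unsqueeze(1).expand(-1, c, -1)
        )
        return point_feats  # (B, feat_ch, N)

m = R2MSketch()
feats = m(torch.rand(2, 5, 64, 512), torch.randint(64 * 512, (2, 1024)))
```

In this reading, "preprocessing" amounts to concatenating the gathered per-point features with the raw point attributes before handing the points to any standard LiDAR detector.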
Related papers
- What Matters in Range View 3D Object Detection [15.147558647138629]
LiDAR-based perception pipelines rely on 3D object detection models to interpret complex scenes.
We achieve state-of-the-art results among range-view 3D object detection models without using multiple techniques proposed in past range-view literature.
arXiv Detail & Related papers (2024-07-23T18:42:37Z)
- Fully Sparse Fusion for 3D Object Detection [69.32694845027927]
Currently prevalent multimodal 3D detection methods are built upon LiDAR-based detectors that usually use dense Bird's-Eye-View feature maps.
Fully sparse architectures are gaining attention as they are highly efficient for long-range perception.
In this paper, we study how to effectively leverage image modality in the emerging fully sparse architecture.
arXiv Detail & Related papers (2023-04-24T17:57:43Z)
- CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds [55.44204039410225]
We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D.
Our proposed method first generates high-quality 3D proposals by leveraging a class-aware local grouping strategy on object surface voxels.
To recover the features of voxels missed due to incorrect voxel-wise segmentation, we build a fully sparse convolutional RoI pooling module.
arXiv Detail & Related papers (2022-10-09T13:38:48Z)
- SM3D: Simultaneous Monocular Mapping and 3D Detection [1.2183405753834562]
We present an innovative and efficient multi-task deep learning framework (SM3D) for Simultaneous Mapping and 3D Detection.
By training both modules end-to-end, the proposed mapping and 3D detection method outperforms the state-of-the-art baseline by 10.0% and 13.2% in accuracy, respectively.
Our monocular multi-task SM3D is more than twice as fast as a pure stereo 3D detector, and 18.3% faster than running the two modules separately.
arXiv Detail & Related papers (2021-11-24T17:23:37Z)
- Improved Pillar with Fine-grained Feature for 3D Object Detection [23.348710029787068]
3D object detection with LiDAR point clouds plays an important role in the perception module of autonomous driving.
Existing point-based methods struggle to meet speed requirements because they must process too many raw points.
The 2D grid-based methods, such as PointPillar, can easily achieve a stable and efficient speed based on simple 2D convolution.
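A hedged sketch of the pillar idea credited to PointPillar above: scatter the points into a bird's-eye-view grid, pool one feature vector per cell, and run plain 2D convolutions. The ranges, resolution, and mean pooling below are illustrative assumptions; PointPillars itself learns the per-pillar encoding with a small PointNet.

```python
import torch
import torch.nn as nn

def pillarize(points, x_range=(0.0, 69.12), y_range=(-39.68, 39.68),
              resolution=0.16):
    # points: (N, 4) with columns x, y, z, intensity
    w = int((x_range[1] - x_range[0]) / resolution)
    h = int((y_range[1] - y_range[0]) / resolution)
    xi = ((points[:, 0] - x_range[0]) / resolution).long().clamp(0, w - 1)
    yi = ((points[:, 1] - y_range[0]) / resolution).long().clamp(0, h - 1)
    flat = yi * w + xi                        # (N,) pillar index per point
    grid = torch.zeros(points.shape[1], h * w)
    grid.index_add_(1, flat, points.t())      # sum point features per pillar
    count = torch.zeros(h * w).index_add_(0, flat, torch.ones(len(points)))
    return (grid / count.clamp(min=1)).view(points.shape[1], h, w)

# Once the scan is a dense 2D tensor, an ordinary convolutional backbone
# runs at a stable cost regardless of how many raw points arrived.
backbone = nn.Sequential(
    nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
)
pts = torch.rand(1000, 4) * torch.tensor([69.12, 79.36, 3.0, 1.0])
pts[:, 1] -= 39.68                            # y in [-39.68, 39.68]
bev_feats = backbone(pillarize(pts).unsqueeze(0))
```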
arXiv Detail & Related papers (2021-10-12T14:53:14Z)
- M3DSSD: Monocular 3D Single Stage Object Detector [82.25793227026443]
We propose a Monocular 3D Single Stage object Detector (M3DSSD) with feature alignment and asymmetric non-local attention.
The proposed M3DSSD achieves significantly better performance than existing monocular 3D object detection methods on the KITTI dataset.
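The summary names asymmetric non-local attention without detail. For orientation only, below is a standard non-local (self-attention) block in the style of Wang et al.; M3DSSD's asymmetric variant modifies this design, which is not reproduced here.

```python
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    def __init__(self, ch, reduced=None):
        super().__init__()
        r = reduced or ch // 2        # reduced channels for query/key/value
        self.q = nn.Conv2d(ch, r, 1)
        self.k = nn.Conv2d(ch, r, 1)
        self.v = nn.Conv2d(ch, r, 1)
        self.out = nn.Conv2d(r, ch, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)   # (B, HW, r)
        k = self.k(x).flatten(2)                   # (B, r, HW)
        v = self.v(x).flatten(2).transpose(1, 2)   # (B, HW, r)
        attn = torch.softmax(q @ k, dim=-1)        # (B, HW, HW) affinities
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                     # residual connection
```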
arXiv Detail & Related papers (2021-03-24T13:09:11Z)
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first, depth estimation is performed and a pseudo-LiDAR point cloud representation is computed from the depth estimates; then, object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
- RangeRCNN: Towards Fast and Accurate 3D Object Detection with Range Image Representation [35.6155506566957]
RangeRCNN is a novel and effective 3D object detection framework based on the range image representation.
In this paper, we utilize the dilated residual block (DRB) to better adapt to different object scales and obtain a more flexible receptive field.
Experiments show that RangeRCNN achieves state-of-the-art performance on the KITTI dataset and the Waymo Open Dataset.
arXiv Detail & Related papers (2020-09-01T03:28:13Z)
- End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection [62.34374949726333]
Pseudo-LiDAR (PL) has led to a drastic reduction in the accuracy gap between methods based on LiDAR sensors and those based on cheap stereo cameras.
PL combines state-of-the-art deep neural networks for 3D depth estimation with those for 3D object detection by converting 2D depth map outputs to 3D point cloud inputs.
We introduce a new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end.
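The conversion at the heart of this pipeline is classical pinhole back-projection: a pixel (u, v) with predicted depth Z maps to X = (u - cx) * Z / fx and Y = (v - cy) * Z / fy. A minimal NumPy sketch with KITTI-like placeholder intrinsics (the CoR modules make a step like this differentiable; the values below are assumptions, not the paper's):

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    # depth: (H, W) predicted depth map in meters
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx            # X = (u - cx) * Z / fx
    y = (v - cy) * z / fy            # Y = (v - cy) * Z / fy
    # One 3D point per pixel: the "pseudo-LiDAR" cloud, (H*W, 3)
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

cloud = depth_to_points(np.full((375, 1242), 10.0),
                        fx=721.5, fy=721.5, cx=609.6, cy=172.9)
```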
arXiv Detail & Related papers (2020-04-07T02:18:38Z)
- ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection [69.68263074432224]
We present a novel framework named ZoomNet for stereo imagery-based 3D detection.
The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes.
To further exploit the abundant texture cues in RGB images for more accurate disparity estimation, we introduce a conceptually straightforward module -- adaptive zooming.
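Why better disparity matters: under the standard stereo model, depth is Z = f * B / d for focal length f, baseline B, and disparity d, so small disparity errors translate into large depth errors at range. A tiny illustration with KITTI-like placeholder values (not ZoomNet's code):

```python
def disparity_to_depth(disparity, focal=721.5, baseline=0.54):
    # focal length in pixels, baseline in meters (KITTI-like placeholders)
    return focal * baseline / max(disparity, 1e-6)

# A half-pixel disparity error around 50 m already shifts depth by ~3 m,
# which is why sharper disparity from zoomed crops helps 3D localization.
print(disparity_to_depth(7.79))   # ~50.0 m
print(disparity_to_depth(8.29))   # ~47.0 m
```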
arXiv Detail & Related papers (2020-03-01T17:18:08Z)
- SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation [3.1542695050861544]
Estimating 3D orientation and translation of objects is essential for infrastructure-less autonomous navigation and driving.
We propose a novel 3D object detection method, named SMOKE, that combines a single keypoint estimate with regressed 3D variables.
Despite its structural simplicity, our proposed SMOKE network outperforms all existing monocular 3D detection methods on the KITTI dataset.
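As a rough mental model of keypoint-style decoding (the head layout and the 8-channel regression block below are assumptions, not SMOKE's exact heads): keep local maxima of the class heatmap and read the regressed 3D variables at the same pixels.

```python
import torch
import torch.nn.functional as F

def decode(heatmap, regression, k=50):
    # heatmap:    (C, H, W) per-class keypoint scores
    # regression: (8, H, W) 3D variables (e.g., depth, size, orientation)
    c, h, w = heatmap.shape
    # Keep only local maxima: a cheap NMS via 3x3 max pooling
    pooled = F.max_pool2d(heatmap.unsqueeze(0), 3, stride=1, padding=1)[0]
    heat = heatmap * (pooled == heatmap)
    scores, idx = heat.view(-1).topk(k)
    cls, pix = idx // (h * w), idx % (h * w)
    ys, xs = pix // w, pix % w
    boxes = regression[:, ys, xs].t()      # (k, 8) variables per keypoint
    return cls, scores, xs, ys, boxes

cls, scores, xs, ys, boxes = decode(torch.rand(3, 96, 320),
                                    torch.rand(8, 96, 320))
```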
arXiv Detail & Related papers (2020-02-24T08:15:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.