Fully Convolutional One-Stage 3D Object Detection on LiDAR Range Images
- URL: http://arxiv.org/abs/2205.13764v1
- Date: Fri, 27 May 2022 05:42:16 GMT
- Title: Fully Convolutional One-Stage 3D Object Detection on LiDAR Range Images
- Authors: Zhi Tian, Xiangxiang Chu, Xiaoming Wang, Xiaolin Wei, Chunhua Shen
- Abstract summary: FCOS-LiDAR is a fully convolutional one-stage 3D object detector for LiDAR point clouds of autonomous driving scenes.
We show that an RV-based 3D detector with standard 2D convolutions alone can achieve comparable performance to state-of-the-art BEV-based detectors.
- Score: 96.66271207089096
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present a simple yet effective fully convolutional one-stage 3D object
detector for LiDAR point clouds of autonomous driving scenes, termed
FCOS-LiDAR. Unlike the dominant methods that use the bird's-eye view (BEV), our
proposed detector detects objects from the range view (RV, a.k.a. range image)
of the LiDAR points. Due to the range view's compactness and compatibility with
the LiDAR sensors' sampling process on self-driving cars, the range view-based
object detector can be realized by solely exploiting the vanilla 2D
convolutions, departing from the BEV-based methods which often involve
complicated voxelization operations and sparse convolutions.
For the first time, we show that an RV-based 3D detector with standard 2D
convolutions alone can achieve comparable performance to state-of-the-art
BEV-based detectors while being significantly faster and simpler. More
importantly, almost all previous range view-based detectors only focus on
single-frame point clouds, since it is challenging to fuse multi-frame point
clouds into a single range view. In this work, we tackle this challenging issue
with a novel range view projection mechanism, and for the first time
demonstrate the benefits of fusing multi-frame point clouds for a range-view
based detector. Extensive experiments on nuScenes show the superiority of our
proposed method and we believe that our work can be strong evidence that an
RV-based 3D detector can compare favourably with the current mainstream
BEV-based detectors.
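For intuition, the sketch below shows a conventional single-frame spherical range-view projection in NumPy. The image resolution and vertical field of view are sensor-dependent placeholders, and this is not the paper's novel multi-frame projection mechanism, only the standard baseline it departs from.

```python
import numpy as np

def points_to_range_image(points, h=32, w=1024, fov_up=10.0, fov_down=-30.0):
    """Project an (N, 4) LiDAR point cloud (x, y, z, intensity) onto an
    h x w range image via a standard spherical projection. The beam count,
    azimuth resolution, and vertical field of view are sensor-specific
    placeholders, not values from the paper."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points[:, :3], axis=1)             # range of each point

    yaw = np.arctan2(y, x)                                # azimuth in [-pi, pi]
    pitch = np.arcsin(z / np.maximum(r, 1e-8))            # elevation angle

    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)

    # Map angles to pixel coordinates.
    u = 0.5 * (1.0 - yaw / np.pi) * w                     # column from azimuth
    v = (fov_up_r - pitch) / (fov_up_r - fov_down_r) * h  # row from elevation
    u = np.clip(np.floor(u), 0, w - 1).astype(np.int64)
    v = np.clip(np.floor(v), 0, h - 1).astype(np.int64)

    # Keep the nearest point per pixel: write far-to-near so close points win.
    image = np.zeros((h, w, 5), dtype=np.float32)   # (range, x, y, z, intensity)
    order = np.argsort(-r)
    image[v[order], u[order]] = np.stack([r, x, y, z, points[:, 3]], axis=1)[order]
    return image
```

Multi-frame fusion, the paper's key contribution, additionally requires compensating for ego-motion before projection and resolving collisions between frames; that mechanism is not reproduced here.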
Related papers
- Better Monocular 3D Detectors with LiDAR from the Past [64.6759926054061]
Camera-based 3D detectors often suffer from inferior performance compared to LiDAR-based counterparts due to inherent depth ambiguities in images.
In this work, we seek to improve monocular 3D detectors by leveraging unlabeled historical LiDAR data.
We show consistent and significant performance gain across multiple state-of-the-art models and datasets with a negligible additional latency of 9.66 ms and a small storage cost.
arXiv Detail & Related papers (2024-04-08T01:38:43Z)
- RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection [33.07575082922186]
Three-dimensional object detection is one of the key tasks in autonomous driving.
Relying solely on cameras makes it difficult to achieve highly accurate and robust 3D object detection.
RCBEVDet is a radar-camera fusion 3D object detection method in the bird's-eye view (BEV).
RadarBEVNet consists of a dual-stream radar backbone and a Radar Cross-Section (RCS) aware BEV encoder.
arXiv Detail & Related papers (2024-03-25T06:02:05Z)
- Boosting Single-Frame 3D Object Detection by Simulating Multi-Frame Point Clouds [47.488158093929904]
We present a new approach to train a detector to simulate features and responses following a detector trained on multi-frame point clouds.
Our approach needs multi-frame point clouds only when training the single-frame detector; once trained, it detects objects from single-frame point clouds alone at inference.
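The summary describes a teacher-student (distillation) setup. A hypothetical sketch of one training step follows, assuming a frozen teacher that consumes multi-frame point clouds and a student supervised by both ground truth and the teacher's features and responses; the module names, loss terms, and weighting are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_step(student, teacher, single_frame, multi_frame, labels,
                      detection_loss_fn, mimic_weight=1.0):
    """One hypothetical training step (all names illustrative): the
    single-frame student mimics a frozen multi-frame teacher while also
    optimizing the usual detection loss against ground-truth labels."""
    with torch.no_grad():
        t_feats, t_logits = teacher(multi_frame)     # frozen multi-frame teacher

    s_feats, s_logits = student(single_frame)        # single-frame student

    det_loss = detection_loss_fn(s_logits, labels)   # standard detection loss
    # Mimic intermediate features (MSE) and output responses (KL divergence).
    mimic_loss = F.mse_loss(s_feats, t_feats) + F.kl_div(
        F.log_softmax(s_logits, dim=-1),
        F.softmax(t_logits, dim=-1),
        reduction="batchmean")
    return det_loss + mimic_weight * mimic_loss
```

The multimodality paper below follows the same pattern, swapping in a LiDAR-image teacher and a LiDAR-only student.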
arXiv Detail & Related papers (2022-07-03T12:59:50Z)
- Boosting 3D Object Detection by Simulating Multimodality on Point Clouds [51.87740119160152]
This paper presents a new approach to boost a single-modality (LiDAR) 3D object detector by teaching it to simulate features and responses that follow a multi-modality (LiDAR-image) detector.
The approach needs LiDAR-image data only when training the single-modality detector, and once well-trained, it only needs LiDAR data at inference.
Experimental results on the nuScenes dataset show that our approach outperforms all SOTA LiDAR-only 3D detectors.
arXiv Detail & Related papers (2022-06-30T01:44:30Z)
- A Simple Baseline for BEV Perception Without LiDAR [37.00868568802673]
Building 3D perception systems for autonomous vehicles that do not rely on LiDAR is a critical research problem.
Current methods use multi-view RGB data collected from cameras around the vehicle.
We propose a simple baseline model, where the "lifting" step simply averages features from all projected image locations.
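Since the lifting step here is parameter-free, a rough sketch is easy to give. The version below assumes per-camera feature maps, voxel centers in the ego frame, and pinhole camera matrices; the shapes and conventions are assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def lift_by_averaging(img_feats, voxel_xyz, intrinsics, extrinsics):
    """Parameter-free 'lift' sketch: project each 3D voxel center into every
    camera, bilinearly sample the image feature there, and average over all
    cameras with a valid projection.

    img_feats:  (num_cams, C, H, W) per-camera feature maps
    voxel_xyz:  (V, 3) voxel centers in the ego frame
    intrinsics: (num_cams, 3, 3); extrinsics: (num_cams, 4, 4) ego -> camera
    """
    num_cams, C, H, W = img_feats.shape
    V = voxel_xyz.shape[0]
    homo = torch.cat([voxel_xyz, voxel_xyz.new_ones(V, 1)], dim=1)  # (V, 4)

    acc = img_feats.new_zeros(V, C)
    count = img_feats.new_zeros(V, 1)
    for cam in range(num_cams):
        cam_pts = (extrinsics[cam] @ homo.T).T[:, :3]   # ego -> camera frame
        pix = (intrinsics[cam] @ cam_pts.T).T           # perspective projection
        z = pix[:, 2:3]
        uv = pix[:, :2] / z.clamp(min=1e-5)
        valid = (z[:, 0] > 0) & (uv[:, 0] >= 0) & (uv[:, 0] < W) \
                & (uv[:, 1] >= 0) & (uv[:, 1] < H)
        # Normalize to [-1, 1] and bilinearly sample the feature map.
        grid = torch.stack([uv[:, 0] / (W - 1) * 2 - 1,
                            uv[:, 1] / (H - 1) * 2 - 1], dim=-1)
        sampled = F.grid_sample(img_feats[cam:cam + 1], grid.view(1, V, 1, 2),
                                align_corners=True).view(C, V).T  # (V, C)
        acc += sampled * valid[:, None]
        count += valid[:, None].float()
    return acc / count.clamp(min=1)                     # mean over valid cameras
```

No depth estimation or learned lifting weights are involved, which is what makes this a baseline.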
arXiv Detail & Related papers (2022-06-16T06:57:32Z)
- PillarGrid: Deep Learning-based Cooperative Perception for 3D Object Detection from Onboard-Roadside LiDAR [15.195933965761645]
We propose PillarGrid, a novel cooperative perception method fusing information from multiple 3D LiDARs.
PillarGrid consists of four main phases: 1) cooperative preprocessing of point clouds, 2) pillar-wise voxelization and feature extraction, 3) grid-wise deep fusion of features from multiple sensors, and 4) convolutional neural network (CNN)-based augmented 3D object detection.
Extensive experimentation shows that PillarGrid outperforms the SOTA single-LiDAR-based 3D object detection methods with respect to both accuracy and range by a large margin.
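As an illustration of phase 2, a hypothetical pillar-wise voxelization might look like the sketch below; the grid extents, pillar size, and per-pillar point cap are placeholders rather than the paper's settings.

```python
import numpy as np

def pillarize(points, x_range=(0.0, 100.0), y_range=(-40.0, 40.0),
              pillar_size=0.5, max_pts=32):
    """Bucket an (N, >=3) point cloud into vertical columns ("pillars") on a
    2D grid, keeping at most max_pts points per pillar for later per-pillar
    feature extraction. All grid parameters are illustrative."""
    nx = int((x_range[1] - x_range[0]) / pillar_size)
    ny = int((y_range[1] - y_range[0]) / pillar_size)

    ix = np.floor((points[:, 0] - x_range[0]) / pillar_size).astype(np.int64)
    iy = np.floor((points[:, 1] - y_range[0]) / pillar_size).astype(np.int64)
    keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    ix, iy, pts = ix[keep], iy[keep], points[keep]

    pillars = {}                                # (ix, iy) -> list of points
    for i, j, p in zip(ix, iy, pts):
        bucket = pillars.setdefault((int(i), int(j)), [])
        if len(bucket) < max_pts:               # truncate overly dense pillars
            bucket.append(p)
    return pillars                              # sparse dict of non-empty pillars
```

Phase 1 would presumably bring the onboard and roadside point clouds into a common coordinate frame before this step.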
arXiv Detail & Related papers (2022-03-12T02:28:41Z)
- SGM3D: Stereo Guided Monocular 3D Object Detection [62.11858392862551]
We propose a stereo-guided monocular 3D object detection network, termed SGM3D.
We exploit robust 3D features extracted from stereo images to enhance the features learned from the monocular image.
Our method can be integrated into many other monocular approaches to boost performance without introducing any extra computational cost.
arXiv Detail & Related papers (2021-12-03T13:57:14Z)
- RAANet: Range-Aware Attention Network for LiDAR-based 3D Object Detection with Auxiliary Density Level Estimation [11.180128679075716]
Range-Aware Attention Network (RAANet) is developed for 3D object detection from LiDAR data for autonomous driving.
RAANet extracts more powerful BEV features and generates superior 3D object detections.
Experiments on nuScenes dataset demonstrate that our proposed approach outperforms the state-of-the-art methods for LiDAR-based 3D object detection.
arXiv Detail & Related papers (2021-11-18T04:20:13Z)
- Anchor-free 3D Single Stage Detector with Mask-Guided Attention for Point Cloud [79.39041453836793]
We develop a novel single-stage 3D detector for point clouds in an anchor-free manner.
To this end, we convert the voxel-based sparse 3D feature volumes into sparse 2D feature maps.
We propose an IoU-based detection confidence re-calibration scheme to improve the correlation between the detection confidence score and the accuracy of the bounding box regression.
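One common form of IoU-based re-calibration blends the classification score with a predicted IoU so the final confidence better tracks box quality; the sketch below shows that general idea, though the paper's exact scheme may differ.

```python
import numpy as np

def recalibrate_scores(cls_scores, pred_ious, beta=0.5):
    """Blend classification confidence with a predicted IoU. beta controls
    how strongly localization quality influences the final score; this is a
    generic form, not necessarily the paper's exact formulation."""
    pred_ious = np.clip(pred_ious, 0.0, 1.0)
    return cls_scores ** (1.0 - beta) * pred_ious ** beta
```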
arXiv Detail & Related papers (2021-08-08T13:42:13Z)