Boosting Single-Frame 3D Object Detection by Simulating Multi-Frame Point Clouds
- URL: http://arxiv.org/abs/2207.01030v1
- Date: Sun, 3 Jul 2022 12:59:50 GMT
- Title: Boosting Single-Frame 3D Object Detection by Simulating Multi-Frame Point Clouds
- Authors: Wu Zheng, Li Jiang, Fanbin Lu, Yangyang Ye, Chi-Wing Fu
- Abstract summary: We present a new approach to train a detector to simulate features and responses following a detector trained on multi-frame point clouds.
Our approach needs multi-frame point clouds only when training the single-frame detector, and once trained, it can detect objects with only single-frame point clouds as inputs during the inference.
- Score: 47.488158093929904
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: To boost a detector for single-frame 3D object detection, we present a new
approach to train it to simulate features and responses following a detector
trained on multi-frame point clouds. Our approach needs multi-frame point
clouds only when training the single-frame detector, and once trained, it can
detect objects with only single-frame point clouds as inputs during the
inference. We design a novel Simulated Multi-Frame Single-Stage object Detector
(SMF-SSD) framework to realize the approach: multi-view dense object fusion to
densify ground-truth objects to generate a multi-frame point cloud;
self-attention voxel distillation to facilitate one-to-many knowledge transfer
from multi- to single-frame voxels; multi-scale BEV feature distillation to
transfer knowledge in low-level spatial and high-level semantic BEV features;
and adaptive response distillation to activate single-frame responses of high
confidence and accurate localization. Experimental results on the Waymo test
set show that our SMF-SSD consistently outperforms all state-of-the-art
single-frame 3D object detectors for all object classes at both difficulty
levels 1 and 2 in terms of both mAP and mAPH.
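The adaptive response distillation component described above can be illustrated with a small sketch. This is a hypothetical, simplified rendition (not the authors' code): the single-frame "student" response is pulled toward the multi-frame "teacher" response, with each location weighted by the teacher's classification confidence so that only confident, well-localized teacher responses are transferred. The function name, inputs, and weighting scheme here are assumptions for illustration.

```python
# Hypothetical sketch of confidence-weighted response distillation.
# Each location has a classification score and a box-regression vector;
# the teacher's score is reused as an adaptive per-location weight.

def adaptive_response_distillation_loss(student_scores, teacher_scores,
                                        student_boxes, teacher_boxes):
    """Confidence-weighted L2 distillation over per-location responses.

    student_scores / teacher_scores: lists of classification scores in [0, 1].
    student_boxes / teacher_boxes: lists of box-regression vectors,
    one per location (same length as the score lists).
    """
    assert len(student_scores) == len(teacher_scores)
    total, weight_sum = 0.0, 0.0
    for s_cls, t_cls, s_box, t_box in zip(student_scores, teacher_scores,
                                          student_boxes, teacher_boxes):
        w = t_cls  # adaptive weight: trust the teacher where it is confident
        cls_term = (s_cls - t_cls) ** 2
        box_term = sum((a - b) ** 2 for a, b in zip(s_box, t_box))
        total += w * (cls_term + box_term)
        weight_sum += w
    return total / max(weight_sum, 1e-8)


# Toy example: two anchor locations with 2-D box offsets for brevity.
loss = adaptive_response_distillation_loss(
    student_scores=[0.4, 0.9],
    teacher_scores=[0.8, 0.95],
    student_boxes=[[0.1, 0.2], [0.0, 0.0]],
teacher_boxes=[[0.15, 0.25], [0.05, -0.05]],
)
print(round(loss, 4))  # ≈ 0.0795
```

In this toy run the low-confidence location contributes less to the loss than its raw error would suggest, which mirrors the stated goal of activating only high-confidence, accurately localized responses.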
Related papers
- MV2DFusion: Leveraging Modality-Specific Object Semantics for Multi-Modal 3D Detection [28.319440934322728]
MV2DFusion is a multi-modal detection framework that integrates the strengths of both worlds through an advanced query-based fusion mechanism.
Our framework's flexibility allows it to integrate with any image and point cloud-based detectors, showcasing its adaptability and potential for future advancements.
arXiv Detail & Related papers (2024-08-12T06:46:05Z)
- Boosting 3D Object Detection with Semantic-Aware Multi-Branch Framework [44.44329455757931]
In autonomous driving, LiDAR sensors are vital for acquiring 3D point clouds, providing reliable geometric information.
To address this, we propose a multi-branch two-stage 3D object detection framework using a Semantic-aware Multi-branch Sampling (SMS) module.
The experimental results on KITTI 3D object detection benchmark dataset show that our method achieves excellent detection performance improvement for a variety of backbones.
arXiv Detail & Related papers (2024-07-08T09:25:45Z)
- PoIFusion: Multi-Modal 3D Object Detection via Fusion at Points of Interest [65.48057241587398]
PoIFusion is a framework to fuse information of RGB images and LiDAR point clouds at the points of interest (PoIs)
Our approach maintains the view of each modality and obtains multi-modal features via computation-friendly projection.
We conducted extensive experiments on nuScenes and Argoverse2 datasets to evaluate our approach.
arXiv Detail & Related papers (2024-03-14T09:28:12Z)
- A Simple Baseline for Multi-Camera 3D Object Detection [94.63944826540491]
3D object detection with surrounding cameras has been a promising direction for autonomous driving.
We present SimMOD, a Simple baseline for Multi-camera Object Detection.
We conduct extensive experiments on the 3D object detection benchmark of nuScenes to demonstrate the effectiveness of SimMOD.
arXiv Detail & Related papers (2022-08-22T03:38:01Z)
- TransPillars: Coarse-to-Fine Aggregation for Multi-Frame 3D Object Detection [47.941714033657675]
3D object detection using point clouds has attracted increasing attention due to its wide applications in autonomous driving and robotics.
We design TransPillars, a novel transformer-based feature aggregation technique that exploits temporal features of consecutive point cloud frames.
Our proposed TransPillars achieves state-of-the-art performance compared to existing multi-frame detection approaches.
arXiv Detail & Related papers (2022-08-04T15:41:43Z)
- Boosting 3D Object Detection by Simulating Multimodality on Point Clouds [51.87740119160152]
This paper presents a new approach to boost a single-modality (LiDAR) 3D object detector by teaching it to simulate features and responses that follow a multi-modality (LiDAR-image) detector.
The approach needs LiDAR-image data only when training the single-modality detector, and once well-trained, it only needs LiDAR data at inference.
Experimental results on the nuScenes dataset show that our approach outperforms all SOTA LiDAR-only 3D detectors.
arXiv Detail & Related papers (2022-06-30T01:44:30Z)
- Fully Convolutional One-Stage 3D Object Detection on LiDAR Range Images [96.66271207089096]
FCOS-LiDAR is a fully convolutional one-stage 3D object detector for LiDAR point clouds of autonomous driving scenes.
We show that an RV-based 3D detector with standard 2D convolutions alone can achieve comparable performance to state-of-the-art BEV-based detectors.
arXiv Detail & Related papers (2022-05-27T05:42:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.