Multimodal Virtual Point 3D Detection
- URL: http://arxiv.org/abs/2111.06881v1
- Date: Fri, 12 Nov 2021 18:58:01 GMT
- Title: Multimodal Virtual Point 3D Detection
- Authors: Tianwei Yin, Xingyi Zhou, Philipp Krähenbühl
- Abstract summary: Lidar-based sensing drives current autonomous vehicles.
Current Lidar sensors lag two decades behind traditional color cameras in terms of resolution and cost.
We present an approach to seamlessly fuse RGB sensors into Lidar-based 3D recognition.
- Score: 6.61319085872973
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lidar-based sensing drives current autonomous vehicles. Despite rapid
progress, current Lidar sensors still lag two decades behind traditional color
cameras in terms of resolution and cost. For autonomous driving, this means
that large objects close to the sensors are easily visible, but far-away or
small objects comprise only one or two measurements. This is an issue,
especially when these objects turn out to be driving hazards. On the other
hand, these same objects are clearly visible in onboard RGB sensors. In this
work, we present an approach to seamlessly fuse RGB sensors into Lidar-based 3D
recognition. Our approach takes a set of 2D detections to generate dense 3D
virtual points to augment an otherwise sparse 3D point cloud. These virtual
points naturally integrate into any standard Lidar-based 3D detector along
with regular Lidar measurements. The resulting multi-modal detector is simple
and effective. Experimental results on the large-scale nuScenes dataset show
that our framework improves a strong CenterPoint baseline by a significant 6.6
mAP, and outperforms competing fusion approaches. Code and more visualizations
are available at https://tianweiy.github.io/mvp/
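The core step described in the abstract lifts 2D detections into dense 3D virtual points by borrowing depth from nearby lidar returns. Below is a minimal numpy sketch of that idea, assuming 2D boxes rather than the instance masks the released code uses; the function names, the per-box sampling budget, and the nearest-neighbor depth assignment are illustrative choices, not the authors' implementation (see the project page above for that).

```python
import numpy as np

def unproject(pixels, depths, K):
    """Lift pixel coordinates (N, 2) with per-pixel depths (N,) into 3D
    camera-frame points (N, 3) using the pinhole intrinsics K (3x3)."""
    homo = np.concatenate([pixels, np.ones((pixels.shape[0], 1))], axis=1)
    rays = homo @ np.linalg.inv(K).T      # back-projected rays with z = 1
    return rays * depths[:, None]         # scale each ray by its depth

def generate_virtual_points(lidar_uv, lidar_depth, boxes_2d, K,
                            points_per_box=50, rng=None):
    """For each 2D detection, sample pixels inside it, borrow the depth of the
    nearest projected lidar point, and unproject them as 3D virtual points.

    lidar_uv:    (M, 2) lidar points projected into the image plane
    lidar_depth: (M,)   their depths in the camera frame
    boxes_2d:    (B, 4) detections given as [x1, y1, x2, y2]
    """
    if rng is None:
        rng = np.random.default_rng(0)
    virtual = []
    for x1, y1, x2, y2 in boxes_2d:
        # keep only lidar points whose projection falls inside this box
        inside = ((lidar_uv[:, 0] >= x1) & (lidar_uv[:, 0] <= x2) &
                  (lidar_uv[:, 1] >= y1) & (lidar_uv[:, 1] <= y2))
        if not inside.any():
            continue
        uv_in, d_in = lidar_uv[inside], lidar_depth[inside]
        # sample random pixels inside the detection
        samples = np.stack([rng.uniform(x1, x2, points_per_box),
                            rng.uniform(y1, y2, points_per_box)], axis=1)
        # each sample takes the depth of its nearest (in 2D) lidar point
        nearest = np.linalg.norm(samples[:, None, :] - uv_in[None, :, :],
                                 axis=2).argmin(axis=1)
        virtual.append(unproject(samples, d_in[nearest], K))
    return np.concatenate(virtual, axis=0) if virtual else np.zeros((0, 3))
```

In line with the abstract, such virtual points would simply be concatenated with the raw lidar sweep before being fed to a standard detector such as CenterPoint.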
Related papers
- Sparse Points to Dense Clouds: Enhancing 3D Detection with Limited LiDAR Data [68.18735997052265]
We propose a balanced approach that combines the advantages of monocular and point cloud-based 3D detection.
Our method requires only a small number of 3D points, which can be obtained from a low-cost, low-resolution sensor.
The accuracy of 3D detection improves by 20% over state-of-the-art monocular detection methods.
arXiv Detail & Related papers (2024-04-10T03:54:53Z) - Multi-Modal 3D Object Detection by Box Matching [109.43430123791684]
We propose a novel Fusion network by Box Matching (FBMNet) for multi-modal 3D detection.
With the learned assignments between 3D and 2D object proposals, the fusion for detection can be effectively performed by combining their ROI features.
arXiv Detail & Related papers (2023-05-12T18:08:51Z) - SemanticBEVFusion: Rethink LiDAR-Camera Fusion in Unified Bird's-Eye View Representation for 3D Object Detection [14.706717531900708]
LiDAR and camera are two essential sensors for 3D object detection in autonomous driving.
Recent methods focus on point-level fusion, which paints the LiDAR point cloud with camera features in the perspective view (a minimal sketch of this painting step appears after the Related papers list).
We present SemanticBEVFusion to deeply fuse camera features with LiDAR features in a unified BEV representation.
arXiv Detail & Related papers (2022-12-09T05:48:58Z) - Far3Det: Towards Far-Field 3D Detection [67.38417186733487]
We focus on the task of far-field 3D detection (Far3Det) of objects beyond a certain distance from an observer.
Far3Det is particularly important for autonomous vehicles (AVs) operating at highway speeds.
We develop a method to identify well-annotated scenes in the nuScenes dataset and use them to derive a well-annotated far-field validation set.
We propose a Far3Det evaluation protocol and explore various 3D detection methods for Far3Det.
arXiv Detail & Related papers (2022-11-25T02:07:57Z) - DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection [83.18142309597984]
Lidars and cameras are critical sensors that provide complementary information for 3D detection in autonomous driving.
We develop a family of generic multi-modal 3D detection models named DeepFusion, which is more accurate than previous methods.
arXiv Detail & Related papers (2022-03-15T18:46:06Z) - Embracing Single Stride 3D Object Detector with Sparse Transformer [63.179720817019096]
In LiDAR-based 3D object detection for autonomous driving, the ratio of the object size to input scene size is significantly smaller compared to 2D detection cases.
Many 3D detectors directly follow the common practice of 2D detectors, which downsample the feature maps even after quantizing the point clouds.
We propose Single-stride Sparse Transformer (SST) to maintain the original resolution from the beginning to the end of the network.
arXiv Detail & Related papers (2021-12-13T02:12:02Z) - RoIFusion: 3D Object Detection from LiDAR and Vision [7.878027048763662]
We propose a novel fusion algorithm by projecting a set of 3D Regions of Interest (RoIs) from the point clouds to the 2D RoIs of the corresponding images.
Our approach achieves state-of-the-art performance on the challenging KITTI 3D object detection benchmark.
arXiv Detail & Related papers (2020-09-09T20:23:27Z) - End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection [62.34374949726333]
Pseudo-LiDAR (PL) has led to a drastic reduction in the accuracy gap between methods based on LiDAR sensors and those based on cheap stereo cameras.
PL combines state-of-the-art deep neural networks for 3D depth estimation with those for 3D object detection by converting 2D depth map outputs to 3D point cloud inputs (this conversion is sketched after the Related papers list).
We introduce a new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end.
arXiv Detail & Related papers (2020-04-07T02:18:38Z)
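The pseudo-LiDAR entry above rests on one conversion: unprojecting a predicted depth map into a point cloud that a lidar-based detector can consume. A minimal sketch of that pinhole unprojection follows, assuming a 3x3 intrinsics matrix and treating zero depth as invalid; the end-to-end Change of Representation modules from the paper are not reproduced here.

```python
import numpy as np

def depth_map_to_point_cloud(depth, K):
    """Unproject a dense depth map (H, W) into an (N, 3) pseudo-lidar point
    cloud in the camera frame, using pinhole intrinsics K (3x3)."""
    h, w = depth.shape
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    # build the pixel grid; u indexes columns, v indexes rows
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.reshape(-1)
    x = (u.reshape(-1) - cx) * z / fx
    y = (v.reshape(-1) - cy) * z / fy
    points = np.stack([x, y, z], axis=1)
    return points[z > 0]  # drop pixels with no valid depth prediction
```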
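The point-level fusion that the SemanticBEVFusion entry contrasts itself with, i.e. painting each lidar point with camera features sampled at its image projection, can also be sketched in a few lines. This is a generic illustration rather than code from any of the papers above; the function name, the feature-map layout, and the transform convention are assumptions.

```python
import numpy as np

def paint_points(points, features, K, T_cam_from_lidar):
    """Append per-pixel camera features to each lidar point (point-level fusion).

    points:            (N, 3) lidar points in the lidar frame
    features:          (H, W, C) camera feature map (e.g. semantic scores)
    K:                 (3, 3) camera intrinsics
    T_cam_from_lidar:  (4, 4) rigid transform from lidar to camera frame
    """
    h, w, c = features.shape
    # transform points into the camera frame
    homo = np.concatenate([points, np.ones((points.shape[0], 1))], axis=1)
    cam = (homo @ T_cam_from_lidar.T)[:, :3]
    # project into the image plane
    uv = cam @ K.T
    z = uv[:, 2]
    z_safe = np.clip(z, 1e-6, None)               # avoid division by zero
    u = np.round(uv[:, 0] / z_safe).astype(int)
    v = np.round(uv[:, 1] / z_safe).astype(int)
    # keep only points in front of the camera that land inside the image
    valid = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    painted = np.zeros((points.shape[0], c), dtype=features.dtype)
    painted[valid] = features[v[valid], u[valid]]
    return np.concatenate([points, painted], axis=1)   # (N, 3 + C)
```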