SpotNet: An Image Centric, Lidar Anchored Approach To Long Range Perception
- URL: http://arxiv.org/abs/2405.15843v1
- Date: Fri, 24 May 2024 17:25:48 GMT
- Title: SpotNet: An Image Centric, Lidar Anchored Approach To Long Range Perception
- Authors: Louis Foucard, Samar Khanna, Yi Shi, Chi-Kuei Liu, Quinn Z Shen, Thuyen Ngo, Zi-Xiang Xia,
- Abstract summary: SpotNet is a fast, single stage, image-centric but LiDAR anchored approach for long range 3D object detection.
We demonstrate that our approach to LiDAR/image sensor fusion, combined with the joint learning of 2D and 3D detection tasks, can lead to accurate 3D object detection with very sparse LiDAR support.
- Score: 3.627834388176496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose SpotNet: a fast, single stage, image-centric but LiDAR anchored approach for long range 3D object detection. We demonstrate that our approach to LiDAR/image sensor fusion, combined with the joint learning of 2D and 3D detection tasks, can lead to accurate 3D object detection with very sparse LiDAR support. Unlike more recent bird's-eye-view (BEV) sensor-fusion methods which scale with range $r$ as $O(r^2)$, SpotNet scales as $O(1)$ with range. We argue that such an architecture is ideally suited to leverage each sensor's strength, i.e. semantic understanding from images and accurate range finding from LiDAR data. Finally we show that anchoring detections on LiDAR points removes the need to regress distances, and so the architecture is able to transfer from 2MP to 8MP resolution images without re-training.
Related papers
- Sparse Points to Dense Clouds: Enhancing 3D Detection with Limited LiDAR Data [68.18735997052265]
We propose a balanced approach that combines the advantages of monocular and point cloud-based 3D detection.
Our method requires only a small number of 3D points, that can be obtained from a low-cost, low-resolution sensor.
The accuracy of 3D detection improves by 20% compared to the state-of-the-art monocular detection methods.
arXiv Detail & Related papers (2024-04-10T03:54:53Z) - Towards Long-Range 3D Object Detection for Autonomous Vehicles [4.580520623362462]
3D object detection at long range is crucial for ensuring the safety and efficiency of self driving vehicles.
Most current state of the art LiDAR based methods are range limited due to sparsity at long range.
We investigate two ways to improve long range performance of current LiDAR based 3D detectors.
arXiv Detail & Related papers (2023-10-07T13:39:46Z) - Fully Sparse Fusion for 3D Object Detection [69.32694845027927]
Currently prevalent multimodal 3D detection methods are built upon LiDAR-based detectors that usually use dense Bird's-Eye-View feature maps.
Fully sparse architecture is gaining attention as they are highly efficient in long-range perception.
In this paper, we study how to effectively leverage image modality in the emerging fully sparse architecture.
arXiv Detail & Related papers (2023-04-24T17:57:43Z) - ImLiDAR: Cross-Sensor Dynamic Message Propagation Network for 3D Object
Detection [20.44294678711783]
We propose ImLiDAR, a new 3OD paradigm to narrow the cross-sensor discrepancies by progressively fusing the multi-scale features of camera Images and LiDAR point clouds.
First, we propose a cross-sensor dynamic message propagation module to combine the best of the multi-scale image and point features.
Second, we raise a direct set prediction problem that allows designing an effective set-based detector.
arXiv Detail & Related papers (2022-11-17T13:31:23Z) - M$^2$-3DLaneNet: Exploring Multi-Modal 3D Lane Detection [30.250833348463633]
M$2$-3DLaneNet lifts 2D features into 3D space by incorporating geometry information from LiDAR data through depth completion.
Experiments on the large-scale OpenLane dataset demonstrate the effectiveness of M$2$-3DLaneNet, regardless of the range.
arXiv Detail & Related papers (2022-09-13T13:45:18Z) - VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and
Stereo Data Fusion [62.24001258298076]
VPFNet is a new architecture that cleverly aligns and aggregates the point cloud and image data at the virtual' points.
Our VPFNet achieves 83.21% moderate 3D AP and 91.86% moderate BEV AP on the KITTI test set, ranking the 1st since May 21th, 2021.
arXiv Detail & Related papers (2021-11-29T08:51:20Z) - Frustum Fusion: Pseudo-LiDAR and LiDAR Fusion for 3D Detection [0.0]
We propose a novel data fusion algorithm to combine accurate point clouds with dense but less accurate point clouds obtained from stereo pairs.
We train multiple 3D object detection methods and show that our fusion strategy consistently improves the performance of detectors.
arXiv Detail & Related papers (2021-11-08T19:29:59Z) - Deep Continuous Fusion for Multi-Sensor 3D Object Detection [103.5060007382646]
We propose a novel 3D object detector that can exploit both LIDAR as well as cameras to perform very accurate localization.
We design an end-to-end learnable architecture that exploits continuous convolutions to fuse image and LIDAR feature maps at different levels of resolution.
arXiv Detail & Related papers (2020-12-20T18:43:41Z) - End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection [62.34374949726333]
Pseudo-LiDAR (PL) has led to a drastic reduction in the accuracy gap between methods based on LiDAR sensors and those based on cheap stereo cameras.
PL combines state-of-the-art deep neural networks for 3D depth estimation with those for 3D object detection by converting 2D depth map outputs to 3D point cloud inputs.
We introduce a new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end.
arXiv Detail & Related papers (2020-04-07T02:18:38Z) - ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object
Detection [69.68263074432224]
We present a novel framework named ZoomNet for stereo imagery-based 3D detection.
The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes.
To further exploit the abundant texture cues in RGB images for more accurate disparity estimation, we introduce a conceptually straight-forward module -- adaptive zooming.
arXiv Detail & Related papers (2020-03-01T17:18:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.