VaLID: Verification as Late Integration of Detections for LiDAR-Camera Fusion
- URL: http://arxiv.org/abs/2409.15529v1
- Date: Mon, 23 Sep 2024 20:27:10 GMT
- Title: VaLID: Verification as Late Integration of Detections for LiDAR-Camera Fusion
- Authors: Vanshika Vats, Marzia Binta Nizam, James Davis
- Abstract summary: Methods using LiDAR generally outperform those using cameras only.
We propose a model-independent late fusion method, VaLID, which validates whether each predicted bounding box is acceptable.
Our approach is model-agnostic and demonstrates performance competitive with the state of the art even when using generic camera detectors.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vehicle object detection is possible using both LiDAR and camera data. Methods using LiDAR generally outperform those using cameras only. The highest-accuracy methods utilize both of these modalities through data fusion. In our study, we propose a model-independent late fusion method, VaLID, which validates whether each predicted bounding box is acceptable. Our method verifies the higher-performing, yet overly optimistic, LiDAR model detections using camera detections obtained from either specially trained, general, or open-vocabulary models. VaLID uses a simple multi-layer perceptron trained with a high recall bias to reduce the false predictions made by the LiDAR detector while still preserving the true ones. Evaluating with multiple combinations of LiDAR and camera detectors on the KITTI dataset, we reduce false positives by an average of 63.9%, thus outperforming the individual detectors on 2D average precision (2DAP). Our approach is model-agnostic and demonstrates performance competitive with the state of the art even when using generic camera detectors that were not trained specifically for this dataset.
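To make the late-fusion verification idea concrete, here is a minimal sketch of a VaLID-style verifier: a small MLP scores each LiDAR box from features pairing it with the best-matching camera detection, and a recall-biased loss preserves true boxes while pruning false ones. The input features, layer sizes, and positive-class weight below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a VaLID-style verifier (feature choices, layer sizes,
# and the recall-bias weight are assumptions, not the paper's design).
import torch
import torch.nn as nn

class BoxVerifier(nn.Module):
    """Small MLP that scores whether a LiDAR detection should be kept."""

    def __init__(self, in_dim: int = 6, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # logit: keep vs. discard
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.net(feats).squeeze(-1)

def pair_features(lidar_score: float, cam_score: float, iou: float,
                  box_w: float, box_h: float) -> torch.Tensor:
    """Hypothetical per-box features: detector confidences, the best IoU
    with any camera detection, and projected box geometry."""
    return torch.tensor([lidar_score, cam_score, iou, box_w, box_h,
                         float(iou > 0.0)])

# High-recall bias: rejecting a true box costs more than keeping a false
# positive, so true detections are preserved while false ones are pruned.
loss_fn = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([5.0]))  # assumed weight
```

At inference, LiDAR detections whose verifier score falls below a threshold would be discarded; the threshold and the positive-class weight trade recall against the false-positive reduction reported above.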
Related papers
- Better Monocular 3D Detectors with LiDAR from the Past [64.6759926054061]
Camera-based 3D detectors often perform worse than their LiDAR-based counterparts due to inherent depth ambiguities in images.
In this work, we seek to improve monocular 3D detectors by leveraging unlabeled historical LiDAR data.
We show consistent and significant performance gains across multiple state-of-the-art models and datasets, with a negligible additional latency of 9.66 ms and a small storage cost.
arXiv Detail & Related papers (2024-04-08T01:38:43Z)
- Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatially quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
- Towards a Robust Sensor Fusion Step for 3D Object Detection on Corrupted Data [4.3012765978447565]
This work presents a novel fusion step that addresses data corruptions and makes sensor fusion for 3D object detection more robust.
We demonstrate that our method performs on par with state-of-the-art approaches on normal data and outperforms them on misaligned data.
arXiv Detail & Related papers (2023-06-12T18:06:29Z)
- Generalized Few-Shot 3D Object Detection of LiDAR Point Cloud for Autonomous Driving [91.39625612027386]
We propose a novel task, called generalized few-shot 3D object detection, where we have a large amount of training data for common (base) objects, but only a few data for rare (novel) classes.
Specifically, we analyze the differences between images and point clouds in depth, and then present a practical principle for the few-shot setting in 3D LiDAR datasets.
To solve this task, we propose an incremental fine-tuning method to extend existing 3D detection models to recognize both common and rare objects.
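Read as an incremental fine-tuning recipe, this amounts to freezing the representation learned on abundant base-class data and training only an extended classification head on the few novel examples. A minimal sketch under that assumption follows; the detector interface and `cls_head` attribute are hypothetical placeholders, not the paper's architecture.

```python
# Hedged sketch of incremental fine-tuning for novel classes; the detector
# interface and feature dimension are hypothetical placeholders.
import torch.nn as nn

def extend_detector(detector: nn.Module, feat_dim: int,
                    num_base: int, num_novel: int) -> nn.Module:
    for p in detector.parameters():
        p.requires_grad = False          # freeze base-class knowledge
    # Widen the classifier to cover base + novel classes; only this new
    # head is then trained on the few novel-class examples.
    detector.cls_head = nn.Linear(feat_dim, num_base + num_novel)
    return detector
```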
arXiv Detail & Related papers (2023-02-08T07:11:36Z)
- Boosting 3D Object Detection by Simulating Multimodality on Point Clouds [51.87740119160152]
This paper presents a new approach to boost a single-modality (LiDAR) 3D object detector by teaching it to simulate features and responses that follow a multi-modality (LiDAR-image) detector.
The approach needs LiDAR-image data only when training the single-modality detector; once well trained, it needs only LiDAR data at inference.
Experimental results on the nuScenes dataset show that our approach outperforms all SOTA LiDAR-only 3D detectors.
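This is a cross-modal distillation setup: during training, a LiDAR-only student imitates the intermediate features and output responses of a LiDAR-image teacher, so the teacher and the image data can be dropped at inference. A sketch of such a loss, with assumed weights and MSE as the imitation criterion, is below; it is one plausible reading, not the paper's exact objective.

```python
# Sketch of a cross-modal distillation loss (the MSE criterion and the
# loss weights are assumptions, not the paper's exact objective).
import torch.nn.functional as F

def simulate_multimodality_loss(student_feat, teacher_feat,
                                student_resp, teacher_resp,
                                w_feat: float = 1.0, w_resp: float = 1.0):
    feature_term = F.mse_loss(student_feat, teacher_feat)   # imitate features
    response_term = F.mse_loss(student_resp, teacher_resp)  # imitate responses
    return w_feat * feature_term + w_resp * response_term
```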
arXiv Detail & Related papers (2022-06-30T01:44:30Z)
- LET-3D-AP: Longitudinal Error Tolerant 3D Average Precision for Camera-Only 3D Detection [26.278496981844317]
We propose variants of the 3D AP metric that are more permissive with respect to depth estimation errors.
Specifically, our novel longitudinal error tolerant metrics, LET-3D-AP and LET-3D-APL, allow longitudinal localization errors up to a given tolerance.
We find that, under our new metrics, state-of-the-art camera-based detectors can outperform popular LiDAR-based detectors once the depth error tolerance exceeds 10%.
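The tolerance can be illustrated with a simple center-matching rule: split the localization error into a longitudinal component along the ground-truth line of sight and forgive it up to a fraction of the range. The sketch below is one plausible reading, not the official LET-3D-AP implementation; the 0.5 m lateral gate is an assumed value.

```python
# Illustrative longitudinal-error-tolerant match test (not the official
# LET-3D-AP code; the lateral gate of 0.5 m is an assumed value).
import numpy as np

def let_match(pred_center, gt_center, tol: float = 0.10,
              lateral_gate: float = 0.5) -> bool:
    pred = np.asarray(pred_center, dtype=float)
    gt = np.asarray(gt_center, dtype=float)
    rng = np.linalg.norm(gt)                      # ground-truth range from sensor
    los = gt / rng                                # line-of-sight unit vector
    err = pred - gt
    lon = float(np.dot(err, los))                 # longitudinal (depth) error
    lat = float(np.linalg.norm(err - lon * los))  # remaining lateral error
    # Forgive depth error up to tol * range; lateral error still counts.
    return abs(lon) <= tol * rng and lat <= lateral_gate
```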
arXiv Detail & Related papers (2022-06-15T17:57:41Z)
- Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection [58.81316192862618]
Two critical sensors for 3D perception in autonomous driving are the camera and the LiDAR.
Fusing these two modalities can significantly boost the performance of 3D perception models.
We benchmark the robustness of state-of-the-art fusion methods for the first time.
arXiv Detail & Related papers (2022-05-30T09:35:37Z)
- Self-Supervised Person Detection in 2D Range Data using a Calibrated Camera [83.31666463259849]
We propose a method to automatically generate training labels (called pseudo-labels) for 2D LiDAR-based person detectors.
We show that self-supervised detectors, trained or fine-tuned with pseudo-labels, outperform detectors trained using manual annotations.
Our method is an effective way to improve person detectors during deployment without any additional labeling effort.
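One way to picture the pipeline: back-project each camera person detection to a viewing ray, rotate it into the LiDAR frame using the calibration, and mark the nearby scan azimuths as person pseudo-labels. The sketch below makes a simplifying single-ray assumption, and the function interface is hypothetical rather than the paper's actual label-generation code.

```python
# Hedged sketch of camera-to-LiDAR pseudo-label generation; the single-ray
# approximation and the angular width are illustrative assumptions.
import numpy as np

def pseudo_labels_from_camera(box_centers, K_inv, R_cam_to_lidar,
                              scan_angles, half_width: float = 0.05):
    labels = np.zeros_like(scan_angles, dtype=bool)
    for (u, v) in box_centers:                        # camera person detections
        ray_cam = K_inv @ np.array([u, v, 1.0])       # back-project the pixel
        ray_lid = R_cam_to_lidar @ ray_cam            # calibrated rotation
        azimuth = np.arctan2(ray_lid[1], ray_lid[0])  # bearing in the scan plane
        labels |= np.abs(scan_angles - azimuth) < half_width
    return labels  # boolean per-beam person pseudo-labels
```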
arXiv Detail & Related papers (2020-12-16T12:10:04Z)