Far3Det: Towards Far-Field 3D Detection
- URL: http://arxiv.org/abs/2211.13858v1
- Date: Fri, 25 Nov 2022 02:07:57 GMT
- Title: Far3Det: Towards Far-Field 3D Detection
- Authors: Shubham Gupta, Jeet Kanjani, Mengtian Li, Francesco Ferroni, James
Hays, Deva Ramanan, Shu Kong
- Abstract summary: We focus on the task of far-field 3D detection (Far3Det) of objects beyond a certain distance from an observer.
Far3Det is particularly important for autonomous vehicles (AVs) operating at highway speeds.
We develop a method to find well-annotated scenes from the nuScenes dataset and derive a well-annotated far-field validation set.
We propose a Far3Det evaluation protocol and explore various 3D detection methods for Far3Det.
- Score: 67.38417186733487
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We focus on the task of far-field 3D detection (Far3Det) of objects beyond a
certain distance from an observer, e.g., $>$50m. Far3Det is particularly
important for autonomous vehicles (AVs) operating at highway speeds, which
require detections of far-field obstacles to ensure sufficient braking
distances. However, contemporary AV benchmarks such as nuScenes underemphasize
this problem because they evaluate performance only up to a certain distance
(50m). One reason is that obtaining far-field 3D annotations is difficult,
particularly for lidar sensors that produce very few point returns for far-away
objects. Indeed, we find that almost 50% of far-field objects (beyond 50m)
contain zero lidar points. Secondly, current metrics for 3D detection employ a
"one-size-fits-all" philosophy, using the same tolerance thresholds for near
and far objects, inconsistent with tolerances for both human vision and stereo
disparities. Both factors lead to an incomplete analysis of the Far3Det task.
For example, while conventional wisdom tells us that high-resolution RGB
sensors should be vital for 3D detection of far-away objects, lidar-based
methods still rank higher compared to RGB counterparts on the current benchmark
leaderboards. As a first step towards a Far3Det benchmark, we develop a method
to find well-annotated scenes from the nuScenes dataset and derive a
well-annotated far-field validation set. We also propose a Far3Det evaluation
protocol and explore various 3D detection methods for Far3Det. Our results
convincingly justify the long-held conventional wisdom that high-resolution
RGB improves 3D detection in the far-field. We further propose a simple yet
effective method that fuses detections from RGB and lidar detectors based on
non-maximum suppression, which remarkably outperforms state-of-the-art 3D
detectors in the far-field.
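The NMS-based late fusion described above can be sketched as a greedy pass over the pooled RGB and lidar detections; this is a minimal illustration under stated assumptions, not the authors' implementation. The `Detection` fields, the `fuse_nms` helper, and the 2 m suppression radius (a nuScenes-style center-distance tolerance) are all hypothetical choices for the sketch.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    x: float       # bird's-eye-view center (m), ego frame
    y: float
    score: float
    sensor: str    # "lidar" or "rgb"

def fuse_nms(lidar_dets, rgb_dets, radius=2.0):
    """Greedy NMS over the pooled detections: visit boxes in descending
    score order and suppress any box whose BEV center lies within
    `radius` metres of an already-kept box."""
    pooled = sorted(lidar_dets + rgb_dets, key=lambda d: d.score, reverse=True)
    kept = []
    for d in pooled:
        if all((d.x - k.x) ** 2 + (d.y - k.y) ** 2 > radius ** 2 for k in kept):
            kept.append(d)
    return kept
```

Because suppression keeps the higher-scoring box regardless of sensor, a confident far-field RGB detection survives even when a weak lidar box overlaps it, which is the behavior the abstract attributes to the fusion.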
Related papers
- Towards Long-Range 3D Object Detection for Autonomous Vehicles [4.580520623362462]
3D object detection at long range is crucial for ensuring the safety and efficiency of self-driving vehicles.
Most current state-of-the-art LiDAR-based methods are range-limited due to sparsity at long range.
We investigate two ways to improve the long-range performance of current LiDAR-based 3D detectors.
arXiv Detail & Related papers (2023-10-07T13:39:46Z)
- Far3D: Expanding the Horizon for Surround-view 3D Object Detection [15.045811199986924]
This paper proposes a novel sparse query-based framework, dubbed Far3D.
By utilizing high-quality 2D object priors, we generate 3D adaptive queries that complement the 3D global queries.
We demonstrate SoTA performance on the challenging Argoverse 2 dataset, covering a wide range of 150 meters.
arXiv Detail & Related papers (2023-08-18T15:19:17Z)
- FocalFormer3D: Focusing on Hard Instance for 3D Object Detection [97.56185033488168]
False negatives (FN) in 3D object detection can lead to potentially dangerous situations in autonomous driving.
In this work, we propose Hard Instance Probing (HIP), a general pipeline that identifies FN in a multi-stage manner.
We instantiate this method as FocalFormer3D, a simple yet effective detector that excels at excavating difficult objects.
arXiv Detail & Related papers (2023-08-08T20:06:12Z)
- An Empirical Analysis of Range for 3D Object Detection [70.54345282696138]
We present an empirical analysis of far-field 3D detection using the long-range detection dataset Argoverse 2.0.
Near-field LiDAR measurements are dense and optimally encoded by small voxels, while far-field measurements are sparse and are better encoded with large voxels.
We propose simple techniques to efficiently ensemble models for long-range detection that improve efficiency by 33% and boost accuracy by 3.2% CDS.
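One such range-partitioned ensemble can be sketched as follows; a hypothetical illustration, not the paper's code. The helper name, the `(x, y, score)` tuple format, and the 50 m split are assumptions: a fine-voxel model supplies near-field boxes and a coarse-voxel model supplies far-field boxes.

```python
import math

def ensemble_by_range(fine_dets, coarse_dets, split_range=50.0):
    """Keep the fine-voxel model's detections in the near field and the
    coarse-voxel model's detections in the far field (hypothetical 50 m
    split). Each detection is an (x, y, score) tuple in the BEV ego frame."""
    near = [d for d in fine_dets if math.hypot(d[0], d[1]) <= split_range]
    far = [d for d in coarse_dets if math.hypot(d[0], d[1]) > split_range]
    return near + far
```

Each model only ever needs to be run on its own range band, which is where the efficiency gain of such an ensemble would come from.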
arXiv Detail & Related papers (2023-08-08T05:29:26Z)
- Multi-Modal 3D Object Detection by Box Matching [109.43430123791684]
We propose a novel Fusion network by Box Matching (FBMNet) for multi-modal 3D detection.
With the learned assignments between 3D and 2D object proposals, the fusion for detection can be effectively performed by combining their ROI features.
arXiv Detail & Related papers (2023-05-12T18:08:51Z)
- Multimodal Virtual Point 3D Detection [6.61319085872973]
Lidar-based sensing drives current autonomous vehicles.
Current Lidar sensors lag two decades behind traditional color cameras in terms of resolution and cost.
We present an approach to seamlessly fuse RGB sensors into Lidar-based 3D recognition.
arXiv Detail & Related papers (2021-11-12T18:58:01Z)
- Is Pseudo-Lidar needed for Monocular 3D Object detection? [32.772699246216774]
We propose an end-to-end, single stage, monocular 3D object detector, DD3D, that can benefit from depth pre-training like pseudo-lidar methods, but without their limitations.
Our architecture is designed for effective information transfer between depth estimation and 3D detection, allowing us to scale with the amount of unlabeled pre-training data.
arXiv Detail & Related papers (2021-08-13T22:22:51Z)
- Progressive Coordinate Transforms for Monocular 3D Object Detection [52.00071336733109]
We propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations.
arXiv Detail & Related papers (2021-08-12T15:22:33Z)
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first, depth estimation is performed and a pseudo-LiDAR point cloud representation is computed from the depth estimates; then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.