Accurate 3D Object Detection using Energy-Based Models
- URL: http://arxiv.org/abs/2012.04634v2
- Date: Tue, 7 Nov 2023 11:38:32 GMT
- Title: Accurate 3D Object Detection using Energy-Based Models
- Authors: Fredrik K. Gustafsson, Martin Danelljan, Thomas B. Schön
- Abstract summary: Regressing 3D bounding boxes in cluttered environments based on sparse LiDAR data is a highly challenging problem.
We explore recent advances in conditional energy-based models (EBMs) for probabilistic regression.
Our proposed approach consistently outperforms the SA-SSD baseline across all 3DOD metrics.
- Score: 46.05502630457458
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate 3D object detection (3DOD) is crucial for safe navigation of complex
environments by autonomous robots. Regressing accurate 3D bounding boxes in
cluttered environments based on sparse LiDAR data is however a highly
challenging problem. We address this task by exploring recent advances in
conditional energy-based models (EBMs) for probabilistic regression. While
methods employing EBMs for regression have demonstrated impressive performance
on 2D object detection in images, these techniques are not directly applicable
to 3D bounding boxes. In this work, we therefore design a differentiable
pooling operator for 3D bounding boxes, serving as the core module of our EBM
network. We further integrate this general approach into the state-of-the-art
3D object detector SA-SSD. On the KITTI dataset, our proposed approach
consistently outperforms the SA-SSD baseline across all 3DOD metrics,
demonstrating the potential of EBM-based regression for highly accurate 3DOD.
Code is available at https://github.com/fregu856/ebms_3dod.
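The abstract does not spell out how an EBM is used for regression, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the common formulation p(y | x) = exp(f(x, y)) / Z(x), where f is a learned scalar function, and assumes that a prediction is obtained by refining an initial estimate with gradient ascent on f. The names `toy_energy`, `energy_grad_y`, `refine`, and `density` are hypothetical, and a scalar target stands in for a 3D bounding box.

```python
import math

def toy_energy(x, y):
    """Hypothetical stand-in for a learned network f(x, y); it peaks at y = 2 * x."""
    return -0.5 * (y - 2.0 * x) ** 2

def energy_grad_y(f, x, y, eps=1e-5):
    """Central finite-difference estimate of df/dy (a real model would use autograd)."""
    return (f(x, y + eps) - f(x, y - eps)) / (2.0 * eps)

def refine(f, x, y_init, step=0.1, n_steps=200):
    """Gradient ascent on y, moving an initial estimate towards a local maximum of f(x, .)."""
    y = y_init
    for _ in range(n_steps):
        y += step * energy_grad_y(f, x, y)
    return y

def density(f, x, y, grid):
    """Approximate p(y | x) = exp(f(x, y)) / Z(x), with Z(x) from a Riemann sum on a uniform grid."""
    dy = grid[1] - grid[0]
    z = sum(math.exp(f(x, g)) for g in grid) * dy
    return math.exp(f(x, y)) / z

x = 1.5
y_hat = refine(toy_energy, x, y_init=0.0)       # assumed initial estimate of 0.0
grid = [-10.0 + 0.01 * i for i in range(2601)]  # uniform grid covering [-10, 16]
print(round(y_hat, 3), round(density(toy_energy, x, y_hat, grid), 3))
```

In this toy case the normalizer Z(x) can be approximated on a 1D grid. For a multi-dimensional box parameterization that is not practical, which is one reason such models are usually trained and evaluated with sampling-based approximations instead.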
Related papers
- SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection [19.75965521357068]
We propose a novel approach called SOGDet (Semantic-Occupancy Guided Multi-view 3D Object Detection) to improve the accuracy of 3D object detection.
Our results show that SOGDet consistently enhances the performance of three baseline methods in terms of nuScenes Detection Score (NDS) and mean Average Precision (mAP).
This indicates that combining 3D object detection with 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby helping to build more robust autonomous driving systems.
arXiv Detail & Related papers (2023-08-26T07:38:21Z) - BSH-Det3D: Improving 3D Object Detection with BEV Shape Heatmap [10.060577111347152]
We propose a novel LiDAR-based 3D object detection model named BSH-Det3D.
It applies an effective way to enhance spatial features by estimating complete shapes from a bird's eye view.
Experiments on the KITTI benchmark achieve state-of-the-art (SOTA) performance in terms of accuracy and speed.
arXiv Detail & Related papers (2023-03-03T15:13:11Z) - Homography Loss for Monocular 3D Object Detection [54.04870007473932]
A differentiable loss function, termed Homography Loss, is proposed, which exploits both 2D and 3D information.
Our method yields the best performance compared with other state-of-the-art methods by a large margin on KITTI 3D datasets.
arXiv Detail & Related papers (2022-04-02T03:48:03Z) - FGR: Frustum-Aware Geometric Reasoning for Weakly Supervised 3D Vehicle Detection [81.79171905308827]
We propose frustum-aware geometric reasoning (FGR) to detect vehicles in point clouds without any 3D annotations.
Our method consists of two stages: coarse 3D segmentation and 3D bounding box estimation.
It is able to accurately detect objects in 3D space with only 2D bounding boxes and sparse point clouds.
arXiv Detail & Related papers (2021-05-17T07:29:55Z) - ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection [78.71826145162092]
We present a new domain adaptive self-training pipeline, named ST3D, for unsupervised domain adaptation on 3D object detection from point clouds.
Our ST3D achieves state-of-the-art performance on all evaluated datasets and even surpasses fully supervised results on KITTI 3D object detection benchmark.
arXiv Detail & Related papers (2021-03-09T10:51:24Z) - PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first depth estimation is performed, a pseudo LiDAR point cloud representation is computed from the depth estimates, and then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z) - Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z) - SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation [3.1542695050861544]
Estimating 3D orientation and translation of objects is essential for infrastructure-less autonomous navigation and driving.
We propose a novel 3D object detection method, named SMOKE, that combines a single keypoint estimate with regressed 3D variables.
Despite its structural simplicity, our proposed SMOKE network outperforms all existing monocular 3D detection methods on the KITTI dataset.
arXiv Detail & Related papers (2020-02-24T08:15:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.