EA-LSS: Edge-aware Lift-splat-shot Framework for 3D BEV Object Detection
- URL: http://arxiv.org/abs/2303.17895v4
- Date: Wed, 30 Aug 2023 02:10:53 GMT
- Title: EA-LSS: Edge-aware Lift-splat-shot Framework for 3D BEV Object Detection
- Authors: Haotian Hu, Fanyi Wang, Jingwen Su, Yaonong Wang, Laifeng Hu, Weiye
Fang, Jingwei Xu, Zhiwang Zhang
- Abstract summary: We propose a novel Edge-aware Lift-splat-shot (EA-LSS) framework for 3D object detection.
Our EA-LSS framework is compatible with any LSS-based 3D object detection model.
- Score: 9.289537252177048
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, great progress has been made in Lift-Splat-Shot-based
(LSS-based) 3D object detection methods. However, inaccurate depth estimation
remains an important constraint on the accuracy of camera-only and multi-modal
3D object detection models, especially in regions where the depth changes
significantly (i.e., the "depth jump" problem). In this paper, we propose a
novel Edge-aware Lift-splat-shot (EA-LSS) framework. Specifically, an edge-aware
depth fusion (EADF) module is proposed to alleviate the "depth jump" problem,
and a fine-grained depth (FGD) module further enforces refined supervision on
depth. Our EA-LSS framework is compatible with any LSS-based 3D object detection
model and effectively boosts its performance with a negligible increase in
inference time. Experiments on the nuScenes benchmark demonstrate that EA-LSS is
effective for both camera-only and multi-modal models. Notably, EA-LSS achieves
state-of-the-art performance on the nuScenes test benchmark, with an mAP and NDS
of 76.5% and 77.6%, respectively.
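The "depth jump" problem in the abstract refers to object boundaries where depth changes sharply between adjacent pixels. As a rough illustration of the kind of edge cue an edge-aware depth module could compute, here is a minimal, hypothetical sketch (not the paper's actual EADF implementation; the function name and threshold are illustrative) that flags pixels whose depth differs sharply from a neighbour's:

```python
def depth_jump_mask(depth, threshold=1.0):
    """Return a boolean mask marking pixels whose depth differs from a
    right or bottom neighbour by more than `threshold` (a "depth jump")."""
    h, w = len(depth), len(depth[0])
    mask = [[False] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di, dj in ((0, 1), (1, 0)):  # right and bottom neighbours
                ni, nj = i + di, j + dj
                if ni < h and nj < w and abs(depth[i][j] - depth[ni][nj]) > threshold:
                    mask[i][j] = True
                    mask[ni][nj] = True
    return mask

# A toy 2x3 depth map with a sharp jump between columns 1 and 2.
depth = [
    [2.0, 2.1, 9.0],
    [2.0, 2.2, 9.1],
]
mask = depth_jump_mask(depth)
```

In a full pipeline, such a mask would select the regions where depth supervision is concentrated, rather than treating all pixels uniformly.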
Related papers
- DoF-Gaussian: Controllable Depth-of-Field for 3D Gaussian Splatting [52.52398576505268]
We introduce DoF-Gaussian, a controllable depth-of-field method for 3D-GS.
We develop a lens-based imaging model based on geometric optics principles to control DoF effects.
Our framework is customizable and supports various interactive applications.
arXiv Detail & Related papers (2025-03-02T05:57:57Z)
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- Introducing Depth into Transformer-based 3D Object Detection [24.224177932086455]
We present a Depth-Aware Transformer framework designed for camera-based 3D detection.
We show that DAT achieves a significant improvement of +2.8 NDS on nuScenes val under the same settings.
When using pre-trained VoVNet-99 as the backbone, DAT achieves strong results of 60.0 NDS and 51.5 mAP on nuScenes test.
arXiv Detail & Related papers (2023-02-25T06:28:32Z) - MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth
Seeds for 3D Object Detection [89.26380781863665]
Fusing LiDAR and camera information is essential for achieving accurate and reliable 3D object detection in autonomous driving systems.
Recent approaches explore the semantic density of camera features by lifting points in 2D camera images into 3D space for fusion.
We propose a novel framework that focuses on the multi-scale progressive interaction of the multi-granularity LiDAR and camera features.
arXiv Detail & Related papers (2022-09-07T12:29:29Z)
- Is Pseudo-Lidar needed for Monocular 3D Object detection? [32.772699246216774]
We propose an end-to-end, single stage, monocular 3D object detector, DD3D, that can benefit from depth pre-training like pseudo-lidar methods, but without their limitations.
Our architecture is designed for effective information transfer between depth estimation and 3D detection, allowing us to scale with the amount of unlabeled pre-training data.
arXiv Detail & Related papers (2021-08-13T22:22:51Z)
- Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth [64.29043589521308]
We propose a rendering module to augment the training data by synthesizing images with virtual-depths.
The rendering module takes as input an RGB image and its corresponding sparse depth image, and outputs a variety of photo-realistic synthetic images.
Besides, we introduce an auxiliary module to improve the detection model by jointly optimizing it through a depth estimation task.
arXiv Detail & Related papers (2021-07-28T11:00:47Z)
- IAFA: Instance-aware Feature Aggregation for 3D Object Detection from a Single Image [37.83574424518901]
3D object detection from a single image is an important task in Autonomous Driving.
We propose an instance-aware approach to aggregate useful information for improving the accuracy of 3D object detection.
arXiv Detail & Related papers (2021-03-05T05:47:52Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3d parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
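The refinement loop described above, changing a single 3D parameter per step until the prediction matches the ground truth, can be sketched in miniature. The sketch below is purely illustrative: it uses a greedy oracle that can see the target, whereas the paper trains a reinforcement-learning policy that must choose moves without access to the ground truth, and the flat parameter vector here is a hypothetical stand-in for full 3D box parameters.

```python
def refine_step(pred, target, step=0.1):
    """Greedily change the single parameter (by +/-step) that most
    reduces the squared error to the target -- an oracle stand-in for
    the learned policy's one-parameter-per-step action space."""
    best = None
    for i in range(len(pred)):
        for delta in (-step, step):
            cand = pred[:i] + [pred[i] + delta] + pred[i + 1:]
            err = sum((c - t) ** 2 for c, t in zip(cand, target))
            if best is None or err < best[0]:
                best = (err, cand)
    return best[1]

# Iteratively refine an initial prediction toward a toy "ground truth".
pred, target = [0.0, 0.0, 0.0], [0.5, -0.3, 0.2]
for _ in range(20):
    pred = refine_step(pred, target)
```

The point of the example is the action space: each step perturbs exactly one parameter, so a sequence of small, interpretable moves carries the box toward the ground truth.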
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
- InfoFocus: 3D Object Detection for Autonomous Driving with Dynamic Information Modeling [65.47126868838836]
We propose a novel 3D object detection framework with dynamic information modeling.
Coarse predictions are generated in the first stage via a voxel-based region proposal network.
Experiments are conducted on the large-scale nuScenes 3D detection benchmark.
arXiv Detail & Related papers (2020-07-16T18:27:08Z)
- Generative Sparse Detection Networks for 3D Single-shot Object Detection [43.91336826079574]
3D object detection has been widely studied due to its potential applicability to many promising areas such as robotics and augmented reality.
Yet, the sparse nature of the 3D data poses unique challenges to this task.
We propose Generative Sparse Detection Network (GSDN), a fully-convolutional single-shot sparse detection network.
arXiv Detail & Related papers (2020-06-22T15:54:24Z)
- SESS: Self-Ensembling Semi-Supervised 3D Object Detection [138.80825169240302]
We propose SESS, a self-ensembling semi-supervised 3D object detection framework. Specifically, we design a thorough perturbation scheme to enhance the generalization of the network on unlabeled and unseen data.
Our SESS achieves competitive performance compared to the state-of-the-art fully-supervised method by using only 50% labeled data.
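Self-ensembling of the kind SESS describes is commonly implemented with the mean-teacher pattern: a teacher network whose weights track an exponential moving average (EMA) of the student's weights supervises the student on perturbed unlabeled data. A minimal sketch of just the EMA update, with weights shown as flat lists for illustration (the actual framework operates on full network parameters):

```python
def ema_update(teacher, student, decay=0.99):
    """Mean-teacher update: each teacher weight tracks an exponential
    moving average of the corresponding student weight."""
    return [decay * t + (1 - decay) * s for t, s in zip(teacher, student)]

# After each student optimisation step, the teacher drifts smoothly
# toward the student, smoothing out noisy individual updates.
teacher, student = [0.0, 1.0], [1.0, 1.0]
teacher = ema_update(teacher, student, decay=0.9)
```

The high decay value is what gives the teacher its stability: it averages the student over many past steps, which is what makes its predictions a useful consistency target on unlabeled data.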
arXiv Detail & Related papers (2019-12-26T08:48:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.