Group Equivariant BEV for 3D Object Detection
- URL: http://arxiv.org/abs/2304.13390v2
- Date: Thu, 29 Jun 2023 03:22:08 GMT
- Title: Group Equivariant BEV for 3D Object Detection
- Authors: Hongwei Liu, Jian Yang, Jianfeng Zhang, Dongheng Shao, Jielong Guo,
Shaobo Li, Xuan Tang, Xian Wei
- Abstract summary: 3D object detection has attracted significant attention and achieved continuous improvement in real road scenarios.
We propose a group equivariant bird's eye view network (GeqBevNet) based on group equivariance theory.
GeqBevNet can extract more rotation-equivariant features for 3D object detection in real road scenes.
- Score: 28.109682816379998
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, 3D object detection has attracted significant attention and
achieved continuous improvement in real road scenarios. The environmental
information is collected from a single sensor or multi-sensor fusion to detect
objects of interest. However, most of the current 3D object detection approaches
focus on developing advanced network architectures to improve the detection
precision of the object rather than considering the dynamic driving scenes,
where data collected from sensors mounted on the vehicle contain various
perturbation features. As a result, existing work still cannot tackle the
perturbation issue. In order to solve this problem, we propose a group
equivariant bird's eye view network (GeqBevNet) based on group equivariance
theory, which introduces the concept of group equivariance into the BEV fusion
object detection network. The group equivariant network is embedded into the
fused BEV feature map to facilitate the BEV-level rotational equivariant
feature extraction, thus leading to lower average orientation error. To
demonstrate the effectiveness of GeqBevNet, the network is evaluated on the
nuScenes validation set, where the mean average orientation error (mAOE) is reduced to 0.325.
Experimental results demonstrate that GeqBevNet can extract more
rotation-equivariant features for 3D object detection in real road scenes and
improve the performance of object orientation prediction.
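As a rough illustration of what rotational equivariance means for a BEV feature map, the sketch below checks the property for the four-element rotation group C4 (rotations by multiples of 90 degrees). It is not the GeqBevNet implementation: the choice of C4, the max-pooling over rotated filter copies, the integer toy data, and all function names are assumptions made for this example.

```python
import random


def rot90(grid):
    """Rotate a square 2D list by 90 degrees counter-clockwise."""
    n = len(grid)
    return [[grid[j][n - 1 - i] for j in range(n)] for i in range(n)]


def correlate_valid(x, w):
    """Cross-correlate square map x with square filter w, without padding."""
    n, k = len(x), len(w)
    m = n - k + 1
    return [
        [sum(x[i + a][j + b] * w[a][b] for a in range(k) for b in range(k))
         for j in range(m)]
        for i in range(m)
    ]


def c4_pooled_response(x, w):
    """Correlate x with all four rotations of w, then take the max per cell.

    Hypothetical helper: pooling over the group axis is one simple way to
    obtain a map that rotates together with the input.
    """
    responses = []
    w_r = w
    for _ in range(4):
        responses.append(correlate_valid(x, w_r))
        w_r = rot90(w_r)
    m = len(responses[0])
    return [[max(r[i][j] for r in responses) for j in range(m)] for i in range(m)]


# Toy stand-ins for a fused BEV feature map and a learned filter (assumed data).
random.seed(0)
bev = [[random.randint(-5, 5) for _ in range(8)] for _ in range(8)]
kernel = [[random.randint(-2, 2) for _ in range(3)] for _ in range(3)]

# Equivariance: rotating the input first gives the same result as rotating the output.
is_equivariant = (
    c4_pooled_response(rot90(bev), kernel) == rot90(c4_pooled_response(bev, kernel))
)
print(is_equivariant)  # True
```

Integer inputs keep the comparison exact; with floating-point features the two sides would agree only up to rounding.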
Related papers
- EVT: Efficient View Transformation for Multi-Modal 3D Object Detection [2.9848894641223302]
We propose a novel 3D object detector via efficient view transformation (EVT).
EVT uses Adaptive Sampling and Adaptive Projection (ASAP) to generate 3D sampling points and adaptive kernels.
It is designed to effectively utilize the obtained multi-modal BEV features within the transformer decoder.
arXiv Detail & Related papers (2024-11-16T06:11:10Z)
- OCBEV: Object-Centric BEV Transformer for Multi-View 3D Object Detection [29.530177591608297]
Multi-view 3D object detection is becoming popular in autonomous driving due to its high effectiveness and low cost.
Most of the current state-of-the-art detectors follow the query-based bird's-eye-view (BEV) paradigm.
We propose an Object-Centric query-BEV detector OCBEV, which can carve the temporal and spatial cues of moving targets more effectively.
arXiv Detail & Related papers (2023-06-02T17:59:48Z)
- DuEqNet: Dual-Equivariance Network in Outdoor 3D Object Detection for Autonomous Driving [4.489333751818157]
We propose DuEqNet, which first introduces the concept of equivariance into a 3D object detection network.
The dual equivariance of our model can extract equivariant features at both local and global levels.
Our model presents higher accuracy on orientation and better prediction efficiency.
arXiv Detail & Related papers (2023-02-27T08:30:02Z)
- OA-BEV: Bringing Object Awareness to Bird's-Eye-View Representation for Multi-Camera 3D Object Detection [78.38062015443195]
OA-BEV is a network that can be plugged into the BEV-based 3D object detection framework.
Our method achieves consistent improvements over the BEV-based baselines in terms of both average precision and nuScenes detection score.
arXiv Detail & Related papers (2023-01-13T06:02:31Z)
- Transformation-Equivariant 3D Object Detection for Autonomous Driving [44.17100476968737]
Transformation-Equivariant 3D Detector (TED) is an efficient way to detect 3D objects in autonomous driving.
TED ranks 1st among all submissions on the KITTI 3D car detection leaderboard.
arXiv Detail & Related papers (2022-11-22T02:51:56Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- Cycle and Semantic Consistent Adversarial Domain Adaptation for Reducing Simulation-to-Real Domain Shift in LiDAR Bird's Eye View [110.83289076967895]
We present a BEV domain adaptation method based on CycleGAN that uses prior semantic classification in order to preserve the information of small objects of interest during the domain adaptation process.
The quality of the generated BEVs has been evaluated using a state-of-the-art 3D object detection framework on the KITTI 3D Object Detection Benchmark.
arXiv Detail & Related papers (2021-04-22T12:47:37Z)
- Delving into Localization Errors for Monocular 3D Object Detection [85.77319416168362]
Estimating 3D bounding boxes from monocular images is an essential component in autonomous driving.
In this work, we quantify the impact introduced by each sub-task and find that localization error is the vital factor restricting monocular 3D detection.
arXiv Detail & Related papers (2021-03-30T10:38:01Z)
- ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection [78.71826145162092]
We present a new domain adaptive self-training pipeline, named ST3D, for unsupervised domain adaptation on 3D object detection from point clouds.
Our ST3D achieves state-of-the-art performance on all evaluated datasets and even surpasses fully supervised results on the KITTI 3D object detection benchmark.
arXiv Detail & Related papers (2021-03-09T10:51:24Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)