Transformation-Equivariant 3D Object Detection for Autonomous Driving
- URL: http://arxiv.org/abs/2211.11962v2
- Date: Wed, 23 Nov 2022 01:51:39 GMT
- Title: Transformation-Equivariant 3D Object Detection for Autonomous Driving
- Authors: Hai Wu, Chenglu Wen, Wei Li, Xin Li, Ruigang Yang, Cheng Wang
- Abstract summary: The Transformation-Equivariant 3D Detector (TED) is an efficient detector for 3D objects in autonomous driving.
TED ranks 1st among all submissions on the KITTI 3D car detection leaderboard.
- Score: 44.17100476968737
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D object detection has received increasing attention in autonomous driving
recently. Objects in 3D scenes are distributed with diverse orientations.
Ordinary detectors do not explicitly model the variations of rotation and
reflection transformations. Consequently, large networks and extensive data
augmentation are required for robust detection. Recent equivariant networks
explicitly model the transformation variations by applying shared networks on
multiple transformed point clouds, showing great potential in object geometry
modeling. However, it is difficult to apply such networks to 3D object
detection in autonomous driving due to their large computation cost and slow
inference speed. In this work, we present TED, an efficient
Transformation-Equivariant 3D Detector to overcome the computation cost and
speed issues. TED first applies a sparse convolution backbone to extract
multi-channel transformation-equivariant voxel features, and then aligns and
aggregates these equivariant features into lightweight and compact
representations for high-performance 3D object detection. On the highly
competitive KITTI 3D car detection leaderboard, TED ranked 1st among all
submissions with competitive efficiency.
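The extract, align, and aggregate idea in the abstract can be illustrated with a toy sketch. This is not the TED implementation: it assumes a dense square bird's-eye-view grid instead of sparse voxels, only the four 90-degree rotations instead of TED's rotation and reflection set, and a hypothetical `shared_backbone` that stands in for the sparse convolution backbone. All function names and the kernel are made up for illustration.

```python
import numpy as np

# Deliberately asymmetric 3x3 kernel, so the backbone alone is NOT rotation-equivariant.
KERNEL = np.array([[1.0, 2.0, 0.0],
                   [0.0, 1.0, -1.0],
                   [3.0, 0.0, 0.5]])


def shared_backbone(bev):
    """Hypothetical stand-in for a shared network: 3x3 correlation with wrap-around, then ReLU."""
    out = np.zeros_like(bev, dtype=float)
    for di in range(3):
        for dj in range(3):
            shifted = np.roll(bev, shift=(1 - di, 1 - dj), axis=(0, 1))
            out += KERNEL[di, dj] * shifted
    return np.maximum(out, 0.0)


def equivariant_features(bev):
    """Apply the shared backbone to 4 rotated copies, align the outputs, and max-aggregate."""
    aligned = []
    for k in range(4):
        feat = shared_backbone(np.rot90(bev, k))  # same weights on each transformed copy
        aligned.append(np.rot90(feat, -k))        # rotate features back to the input frame
    return np.max(np.stack(aligned), axis=0)      # compact, aggregated representation
```

Under these assumptions, rotating the input grid by 90 degrees rotates the aggregated features by the same amount, that is, `equivariant_features(np.rot90(x))` equals `np.rot90(equivariant_features(x))` for any square grid `x`, while `shared_backbone` on its own does not have this property.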
Related papers
- Cross-Cluster Shifting for Efficient and Effective 3D Object Detection in Autonomous Driving [69.20604395205248]
We present a new 3D point-based detector model, named Shift-SSD, for precise 3D object detection in autonomous driving.
We introduce an intriguing Cross-Cluster Shifting operation to unleash the representation capacity of the point-based detector.
We conduct extensive experiments on the KITTI, Waymo, and nuScenes datasets, and the results demonstrate the state-of-the-art performance of Shift-SSD.
arXiv Detail & Related papers (2024-03-10T10:36:32Z)
- 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z)
- S$^3$-MonoDETR: Supervised Shape&Scale-perceptive Deformable Transformer for Monocular 3D Object Detection [21.96072831561483]
This paper proposes a novel "Supervised Shape&Scale-perceptive Deformable Attention" (S$^3$-DA) module for monocular 3D object detection.
Benefiting from this, S$^3$-DA effectively estimates receptive fields for query points belonging to any category, enabling them to generate robust query features.
Experiments on the KITTI and Waymo Open datasets demonstrate that S$^3$-DA significantly improves the detection accuracy.
arXiv Detail & Related papers (2023-09-02T12:36:38Z)
- Group Equivariant BEV for 3D Object Detection [28.109682816379998]
3D object detection has attracted significant attention and achieved continuous improvement in real road scenarios.
We propose a group equivariant bird's eye view network (GeqBevNet) based on the group equivariant theory.
GeqBevNet can extract more rotation-equivariant features for 3D object detection in real road scenes.
arXiv Detail & Related papers (2023-04-26T09:00:31Z)
- SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds [44.635939022626744]
3D object detection in point clouds is a core component for modern robotics and autonomous driving systems.
A key challenge in 3D object detection comes from the inherently sparse nature of point occupancy within the 3D scene.
We propose Sparse Window Transformer (SWFormer), a scalable and accurate model for 3D object detection.
arXiv Detail & Related papers (2022-10-13T21:37:53Z)
- SRCN3D: Sparse R-CNN 3D for Compact Convolutional Multi-View 3D Object Detection and Tracking [12.285423418301683]
This paper proposes Sparse R-CNN 3D (SRCN3D), a novel two-stage fully-sparse detector that incorporates sparse queries, sparse attention with box-wise sampling, and sparse prediction.
Experiments on the nuScenes dataset demonstrate that SRCN3D achieves competitive performance in both 3D object detection and multi-object tracking tasks.
arXiv Detail & Related papers (2022-06-29T07:58:39Z)
- ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection [78.71826145162092]
We present a new domain adaptive self-training pipeline, named ST3D, for unsupervised domain adaptation on 3D object detection from point clouds.
Our ST3D achieves state-of-the-art performance on all evaluated datasets and even surpasses fully supervised results on the KITTI 3D object detection benchmark.
arXiv Detail & Related papers (2021-03-09T10:51:24Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them. However, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
- siaNMS: Non-Maximum Suppression with Siamese Networks for Multi-Camera 3D Object Detection [65.03384167873564]
A siamese network is integrated into the pipeline of a well-known 3D object detector.
Associations are exploited to enhance the 3D box regression of the object.
The experimental evaluation on the nuScenes dataset shows that the proposed method outperforms traditional NMS approaches.
arXiv Detail & Related papers (2020-02-19T15:32:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.