SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection
- URL: http://arxiv.org/abs/2101.02672v3
- Date: Wed, 17 Mar 2021 17:35:41 GMT
- Title: SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection
- Authors: Prarthana Bhattacharyya, Chengjie Huang and Krzysztof Czarnecki
- Abstract summary: We propose two variants of self-attention for contextual modeling in 3D object detection.
We first incorporate the pairwise self-attention mechanism into the current state-of-the-art BEV, voxel and point-based detectors.
Next, we propose a self-attention variant that samples a subset of the most representative features by learning deformations over randomly sampled locations.
- Score: 9.924083358178239
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing point-cloud based 3D object detectors use convolution-like operators
to process information in a local neighbourhood with fixed-weight kernels and
aggregate global context hierarchically. However, non-local neural networks and
self-attention for 2D vision have shown that explicitly modeling long-range
interactions can lead to more robust and competitive models. In this paper, we
propose two variants of self-attention for contextual modeling in 3D object
detection by augmenting convolutional features with self-attention features. We
first incorporate the pairwise self-attention mechanism into the current
state-of-the-art BEV, voxel and point-based detectors and show consistent
improvement over strong baseline models of up to 1.5 3D AP while simultaneously
reducing their parameter footprint and computational cost by 15-80% and 30-50%,
respectively, on the KITTI validation set. We next propose a self-attention
variant that samples a subset of the most representative features by learning
deformations over randomly sampled locations. This not only allows us to scale
explicit global contextual modeling to larger point-clouds, but also leads to
more discriminative and informative feature descriptors. Our method can be
flexibly applied to most state-of-the-art detectors with increased accuracy and
parameter and compute efficiency. We show our proposed method improves 3D
object detection performance on KITTI, nuScenes and Waymo Open datasets. Code
is available at https://github.com/AutoVision-cloud/SA-Det3D.
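The two attention variants described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation (see the linked repository for that). The function names, the use of concatenation to "augment" the convolutional features, the nearest-neighbour gathering at shifted locations, and the random stand-in for the learned offsets are all assumptions made here for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along one axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def pairwise_self_attention(feats, w_q, w_k, w_v):
    """Full pairwise attention: every one of N features attends to all N.

    feats: (N, C) features from a convolutional backbone.
    Returns (N, C) context features. Cost grows as O(N^2).
    """
    q, k, v = feats @ w_q, feats @ w_k, feats @ w_v
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]), axis=-1)  # (N, N)
    return attn @ v

def sampled_self_attention(feats, coords, idx, offsets, w_q, w_k, w_v):
    """Attention over a subset of M sampled, shifted locations.

    coords:  (N, 3) point or voxel centres.
    idx:     (M,) indices of randomly sampled locations.
    offsets: (M, 3) deformations; learned in the paper, given here as input.
    Returns (N, C) context features. Cost grows as O(N * M).
    """
    shifted = coords[idx] + offsets                                   # (M, 3)
    # Assumption: gather the feature of the nearest original location.
    d2 = ((shifted[:, None, :] - coords[None, :, :]) ** 2).sum(-1)   # (M, N)
    subset = feats[d2.argmin(axis=1)]                                 # (M, C)
    q = feats @ w_q
    k, v = subset @ w_k, subset @ w_v
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]), axis=-1)          # (N, M)
    return attn @ v

rng = np.random.default_rng(0)
N, C, M = 64, 16, 8                      # hypothetical sizes
feats = rng.normal(size=(N, C))
coords = rng.uniform(-1.0, 1.0, size=(N, 3))
w_q, w_k, w_v = (rng.normal(size=(C, C)) * 0.1 for _ in range(3))

# Variant 1: augment convolutional features with full self-attention context.
full_ctx = pairwise_self_attention(feats, w_q, w_k, w_v)
augmented = np.concatenate([feats, full_ctx], axis=-1)               # (N, 2C)

# Variant 2: attend only to M randomly sampled, deformed locations.
idx = rng.choice(N, size=M, replace=False)
offsets = rng.normal(scale=0.05, size=(M, 3))   # stand-in for learned offsets
sparse_ctx = sampled_self_attention(feats, coords, idx, offsets, w_q, w_k, w_v)
```

In this sketch the second variant reduces the attention matrix from N x N to N x M, which is one way to read the abstract's claim that sampling a representative subset lets global context modeling scale to larger point clouds.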
Related papers
- 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- AutoAlignV2: Deformable Feature Aggregation for Dynamic Multi-Modal 3D Object Detection [17.526914782562528]
We propose AutoAlignV2, a faster and stronger multi-modal 3D detection framework, built on top of AutoAlign.
Our best model reaches 72.4 NDS on nuScenes test leaderboard, achieving new state-of-the-art results.
arXiv Detail & Related papers (2022-07-21T06:17:23Z)
- Structure Aware and Class Balanced 3D Object Detection on nuScenes Dataset [0.0]
NuTonomy's nuScenes dataset greatly extends commonly used datasets such as KITTI.
The localization precision of this model is affected by the loss of spatial information in the downscaled feature maps.
We propose to enhance the performance of the CBGS model by designing an auxiliary network, that makes full use of the structure information of the 3D point cloud.
arXiv Detail & Related papers (2022-05-25T06:18:49Z)
- HVPR: Hybrid Voxel-Point Representation for Single-stage 3D Object Detection [39.64891219500416]
3D object detection methods exploit either voxel-based or point-based features to represent 3D objects in a scene.
We introduce in this paper a novel single-stage 3D detection method having the merit of both voxel-based and point-based features.
arXiv Detail & Related papers (2021-04-02T06:34:49Z)
- ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection [78.71826145162092]
We present a new domain adaptive self-training pipeline, named ST3D, for unsupervised domain adaptation on 3D object detection from point clouds.
Our ST3D achieves state-of-the-art performance on all evaluated datasets and even surpasses fully supervised results on KITTI 3D object detection benchmark.
arXiv Detail & Related papers (2021-03-09T10:51:24Z)
- InfoFocus: 3D Object Detection for Autonomous Driving with Dynamic Information Modeling [65.47126868838836]
We propose a novel 3D object detection framework with dynamic information modeling.
Coarse predictions are generated in the first stage via a voxel-based region proposal network.
Experiments are conducted on the large-scale nuScenes 3D detection benchmark.
arXiv Detail & Related papers (2020-07-16T18:27:08Z)
- PerMO: Perceiving More at Once from a Single Image for Autonomous Driving [76.35684439949094]
We present a novel approach to detect, segment, and reconstruct complete textured 3D models of vehicles from a single image.
Our approach combines the strengths of deep learning and the elegance of traditional techniques.
We have integrated these algorithms with an autonomous driving system.
arXiv Detail & Related papers (2020-07-16T05:02:45Z)
- D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features [51.04841465193678]
We leverage a 3D fully convolutional network for 3D point clouds.
We propose a novel and practical learning mechanism that densely predicts both a detection score and a description feature for each 3D point.
Our method achieves state-of-the-art results in both indoor and outdoor scenarios.
arXiv Detail & Related papers (2020-03-06T12:51:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.