IAFA: Instance-aware Feature Aggregation for 3D Object Detection from a
Single Image
- URL: http://arxiv.org/abs/2103.03480v1
- Date: Fri, 5 Mar 2021 05:47:52 GMT
- Title: IAFA: Instance-aware Feature Aggregation for 3D Object Detection from a
Single Image
- Authors: Dingfu Zhou, Xibin Song, Yuchao Dai, Junbo Yin, Feixiang Lu, Jin Fang,
Miao Liao and Liangjun Zhang
- Abstract summary: 3D object detection from a single image is an important task in Autonomous Driving.
We propose an instance-aware approach to aggregate useful information for improving the accuracy of 3D object detection.
- Score: 37.83574424518901
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: 3D object detection from a single image is an important task in Autonomous
Driving (AD), where various approaches have been proposed. However, the task is
intrinsically ambiguous and challenging as single image depth estimation is
already an ill-posed problem. In this paper, we propose an instance-aware
approach to aggregate useful information for improving the accuracy of 3D
object detection with the following contributions. First, an instance-aware
feature aggregation (IAFA) module is proposed to collect local and global
features for 3D bounding boxes regression. Second, we empirically find that the
spatial attention module can be well learned by taking coarse-level instance
annotations as a supervision signal. The proposed module has significantly
boosted the performance of the baseline method on both 3D detection and 2D
bird's-eye view of vehicle detection among all three categories. Third, our
proposed method outperforms all single image-based approaches (even those
methods trained with depth as auxiliary inputs) and achieves state-of-the-art
3D detection performance on the KITTI benchmark.
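
The abstract describes two ideas: aggregating local and global features per instance, and supervising a spatial attention map with coarse-level instance annotations. The following is a minimal NumPy sketch of that general pattern, not the authors' implementation. The function names, the additive fusion of local and pooled features, and the binary cross-entropy loss are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def aggregate_instance_features(features, attention_logits, center):
    """Combine a local feature with an attention-pooled feature (hypothetical).

    features:         array of shape (C, H, W)
    attention_logits: array of shape (H, W), one spatial attention map
                      for the instance located at `center`
    center:           (row, col) of the instance on the feature map
    """
    attention = sigmoid(attention_logits)
    # Normalize so the attention weights sum to (approximately) one.
    weights = attention / (attention.sum() + 1e-8)
    # Attention-weighted average over all spatial positions.
    pooled = (features * weights[None, :, :]).sum(axis=(1, 2))
    # Local feature at the instance center.
    local = features[:, center[0], center[1]]
    # Assumption: fuse by simple addition.
    return local + pooled


def attention_supervision_loss(attention_logits, coarse_mask):
    """Binary cross-entropy between the attention map and a coarse
    instance mask (assumed form of the supervision signal)."""
    p = np.clip(sigmoid(attention_logits), 1e-7, 1.0 - 1e-7)
    bce = -(coarse_mask * np.log(p) + (1.0 - coarse_mask) * np.log(1.0 - p))
    return float(bce.mean())
```

In this sketch the coarse mask only needs to mark roughly which pixels belong to the instance, which is consistent with the abstract's claim that coarse-level annotations suffice as a supervision signal.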
Related papers
- S$^3$-MonoDETR: Supervised Shape&Scale-perceptive Deformable Transformer for Monocular 3D Object Detection [21.96072831561483]
This paper proposes a novel "Supervised Shape&Scale-perceptive Deformable Attention" (S$^3$-DA) module for monocular 3D object detection.
Benefiting from this, S$^3$-DA effectively estimates receptive fields for query points belonging to any category, enabling them to generate robust query features.
Experiments on KITTI and Open datasets demonstrate that S$3$-DA significantly improves the detection accuracy.
arXiv Detail & Related papers (2023-09-02T12:36:38Z)
- SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection [19.75965521357068]
We propose a novel approach called SOGDet (Semantic-Occupancy Guided Multi-view 3D Object Detection) to improve the accuracy of 3D object detection.
Our results show that SOGDet consistently enhances the performance of three baseline methods in terms of nuScenes Detection Score (NDS) and mean Average Precision (mAP).
This indicates that the combination of 3D object detection and 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby helping to build more robust autonomous driving systems.
arXiv Detail & Related papers (2023-08-26T07:38:21Z) - PAI3D: Painting Adaptive Instance-Prior for 3D Object Detection [22.41785292720421]
Painting Adaptive Instance-prior for 3D object detection (PAI3D) is a sequential instance-level fusion framework.
It first extracts instance-level semantic information from images.
The extracted information, including object categorical label, point-to-object membership and object position, is then used to augment each LiDAR point in the subsequent 3D detection network.
arXiv Detail & Related papers (2022-11-15T11:15:25Z) - SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object
Detection [78.90102636266276]
We propose a novel set abstraction method named Semantics-Augmented Set Abstraction (SASA).
Based on the estimated point-wise foreground scores, we then propose a semantics-guided point sampling algorithm to help retain more important foreground points during down-sampling.
In practice, SASA proves effective in identifying valuable points related to foreground objects and improving feature learning for point-based 3D detection.
arXiv Detail & Related papers (2022-01-06T08:54:47Z) - The Devil is in the Task: Exploiting Reciprocal Appearance-Localization
Features for Monocular 3D Object Detection [62.1185839286255]
Low-cost monocular 3D object detection plays a fundamental role in autonomous driving.
We introduce a Dynamic Feature Reflecting Network, named DFR-Net.
We rank 1st among all the monocular 3D object detectors in the KITTI test set.
arXiv Detail & Related papers (2021-12-28T07:31:18Z) - M3DSSD: Monocular 3D Single Stage Object Detector [82.25793227026443]
We propose a Monocular 3D Single Stage object Detector (M3DSSD) with feature alignment and asymmetric non-local attention.
The proposed M3DSSD achieves significantly better performance than the monocular 3D object detection methods on the KITTI dataset.
arXiv Detail & Related papers (2021-03-24T13:09:11Z) - PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first depth estimation is performed, a pseudo LiDAR point cloud representation is computed from the depth estimates, and then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z) - Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z) - SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint
Estimation [3.1542695050861544]
Estimating 3D orientation and translation of objects is essential for infrastructure-less autonomous navigation and driving.
We propose a novel 3D object detection method, named SMOKE, that combines a single keypoint estimate with regressed 3D variables.
Despite its structural simplicity, our proposed SMOKE network outperforms all existing monocular 3D detection methods on the KITTI dataset.
arXiv Detail & Related papers (2020-02-24T08:15:36Z) - SESS: Self-Ensembling Semi-Supervised 3D Object Detection [138.80825169240302]
We propose SESS, a self-ensembling semi-supervised 3D object detection framework. Specifically, we design a thorough perturbation scheme to enhance generalization of the network on unlabeled and new unseen data.
Our SESS achieves competitive performance compared to the state-of-the-art fully-supervised method by using only 50% labeled data.
arXiv Detail & Related papers (2019-12-26T08:48:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.