SGM3D: Stereo Guided Monocular 3D Object Detection
- URL: http://arxiv.org/abs/2112.01914v1
- Date: Fri, 3 Dec 2021 13:57:14 GMT
- Title: SGM3D: Stereo Guided Monocular 3D Object Detection
- Authors: Zheyuan Zhou and Liang Du and Xiaoqing Ye and Zhikang Zou and Xiao Tan
and Errui Ding and Li Zhang and Xiangyang Xue and Jianfeng Feng
- Abstract summary: We propose a stereo-guided monocular 3D object detection network, termed SGM3D.
We exploit robust 3D features extracted from stereo images to enhance the features learned from the monocular image.
Our method can be integrated into many other monocular approaches to boost performance without introducing any extra computational cost.
- Score: 62.11858392862551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monocular 3D object detection is a critical yet challenging task for
autonomous driving, due to the lack of the accurate depth information that
LiDAR sensors capture. In this paper, we propose a stereo-guided monocular 3D object
detection network, termed SGM3D, which leverages robust 3D features extracted
from stereo images to enhance the features learned from the monocular image. We
innovatively investigate a multi-granularity domain adaptation module (MG-DA)
that exploits the network's ability to generate stereo-mimicking features based
only on monocular cues. Both coarse BEV feature-level and fine anchor-level
domain adaptation are leveraged to guide the monocular branch. We
present an IoU matching-based alignment module (IoU-MA) for object-level domain
adaptation between the stereo and monocular predictions to alleviate the
mismatches in previous stages. We conduct extensive experiments on the most
challenging KITTI and Lyft datasets and achieve new state-of-the-art
performance. Furthermore, our method can be integrated into many other
monocular approaches to boost performance without introducing any extra
computational cost.
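The object-level alignment idea behind IoU-MA, pairing stereo predictions with their best-overlapping monocular counterparts, can be illustrated with a minimal 2D sketch. The box format, function names, and greedy matching below are illustrative assumptions for clarity; the paper's module operates on 3D/BEV predictions inside the network, not on final box lists.

```python
def iou_2d(a, b):
    """Axis-aligned IoU between two boxes given as [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def match_predictions(stereo_boxes, mono_boxes, iou_thresh=0.5):
    """Greedily pair each stereo prediction with the best-overlapping
    unused monocular prediction; pairs below the threshold are dropped."""
    pairs, used = [], set()
    for i, sb in enumerate(stereo_boxes):
        best_j, best_iou = -1, iou_thresh
        for j, mb in enumerate(mono_boxes):
            if j in used:
                continue
            iou = iou_2d(sb, mb)
            if iou > best_iou:
                best_j, best_iou = j, iou
        if best_j >= 0:
            used.add(best_j)
            pairs.append((i, best_j, best_iou))
    return pairs
```

Matched pairs of this kind give the stereo branch a per-object target against which the monocular branch can be adapted.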
Related papers
- MonoMM: A Multi-scale Mamba-Enhanced Network for Real-time Monocular 3D Object Detection [9.780498146964097]
We propose an innovative network architecture, MonoMM, for real-time monocular 3D object detection.
MonoMM consists of Focused Multi-Scale Fusion (FMF) and Depth-Aware Feature Enhancement Mamba (DMB) modules.
Our method outperforms previous monocular methods and achieves real-time detection.
arXiv Detail & Related papers (2024-08-01T10:16:58Z)
- VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection [80.62052650370416]
Monocular 3D object detection holds significant importance across various applications, including autonomous driving and robotics.
In this paper, we present VFMM3D, an innovative framework that leverages the capabilities of Vision Foundation Models (VFMs) to accurately transform single-view images into LiDAR point cloud representations.
arXiv Detail & Related papers (2024-04-15T03:12:12Z)
- Towards Unified 3D Object Detection via Algorithm and Data Unification [70.27631528933482]
We build the first unified multi-modal 3D object detection benchmark MM-Omni3D and extend the aforementioned monocular detector to its multi-modal version.
We name the designed monocular and multi-modal detectors as UniMODE and MM-UniMODE, respectively.
arXiv Detail & Related papers (2024-02-28T18:59:31Z)
- ODM3D: Alleviating Foreground Sparsity for Semi-Supervised Monocular 3D Object Detection [15.204935788297226]
ODM3D framework entails cross-modal knowledge distillation at various levels to inject LiDAR-domain knowledge into a monocular detector during training.
By identifying foreground sparsity as the main culprit behind existing methods' suboptimal training, we exploit the precise localisation information embedded in LiDAR points.
Our method ranks 1st in both KITTI validation and test benchmarks, significantly surpassing all existing monocular methods, supervised or semi-supervised.
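The core observation, that LiDAR points carry precise foreground localisation, can be sketched as a toy BEV foreground mask built from points that fall inside ground-truth box footprints. The grid size, box format, and function name below are illustrative assumptions, not ODM3D's actual pipeline.

```python
import numpy as np

def bev_foreground_mask(points, boxes, grid=(50, 50), extent=50.0):
    """Mark BEV cells containing LiDAR points that fall inside any
    ground-truth box footprint given as [x_min, y_min, x_max, y_max]."""
    mask = np.zeros(grid, dtype=np.float32)
    cell = extent / grid[0]  # metres per BEV cell
    for x, y in points:
        # keep only points inside some ground-truth footprint
        if not any(b[0] <= x <= b[2] and b[1] <= y <= b[3] for b in boxes):
            continue
        i, j = int(x / cell), int(y / cell)
        if 0 <= i < grid[0] and 0 <= j < grid[1]:
            mask[i, j] = 1.0
    return mask
```

A mask like this can concentrate a training loss on the sparse foreground cells instead of the dominant empty background.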
arXiv Detail & Related papers (2023-10-28T07:12:09Z)
- SVDM: Single-View Diffusion Model for Pseudo-Stereo 3D Object Detection [0.0]
A recently proposed framework for monocular 3D detection based on Pseudo-Stereo has received considerable attention in the community.
In this work, we propose an end-to-end, efficient pseudo-stereo 3D detection framework by introducing a Single-View Diffusion Model.
SVDM allows the entire pseudo-stereo 3D detection pipeline to be trained end-to-end and can benefit from the training of stereo detectors.
arXiv Detail & Related papers (2023-07-05T13:10:37Z)
- MonoDistill: Learning Spatial Features for Monocular 3D Object Detection [80.74622486604886]
We propose a simple and effective scheme to introduce the spatial information from LiDAR signals to the monocular 3D detectors.
We use the resulting data to train a 3D detector with the same architecture as the baseline model.
Experimental results show that the proposed method can significantly boost the performance of the baseline model.
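Cross-modal schemes of this kind are commonly implemented as a feature-imitation loss between a LiDAR-trained teacher and the monocular student. The minimal sketch below is a generic illustration of such a loss, optionally masked to foreground regions; it is an assumption for clarity, not MonoDistill's exact formulation.

```python
import numpy as np

def feature_distill_loss(student_feat, teacher_feat, fg_mask=None):
    """Mean squared error between student and teacher feature maps,
    optionally restricted to foreground cells via a 0/1 mask."""
    diff = (student_feat - teacher_feat) ** 2
    if fg_mask is not None:
        diff = diff * fg_mask  # zero out background contributions
    return float(diff.mean())
```

During training this term would be added to the detector's usual classification and regression losses, with the teacher's features held fixed.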
arXiv Detail & Related papers (2022-01-26T09:21:41Z)
- The Devil is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection [62.1185839286255]
Low-cost monocular 3D object detection plays a fundamental role in autonomous driving.
We introduce a Dynamic Feature Reflecting Network, named DFR-Net.
We rank 1st among all the monocular 3D object detectors in the KITTI test set.
arXiv Detail & Related papers (2021-12-28T07:31:18Z)
- M3DSSD: Monocular 3D Single Stage Object Detector [82.25793227026443]
We propose a Monocular 3D Single Stage object Detector (M3DSSD) with feature alignment and asymmetric non-local attention.
The proposed M3DSSD achieves significantly better performance than the monocular 3D object detection methods on the KITTI dataset.
arXiv Detail & Related papers (2021-03-24T13:09:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.