MonoTDP: Twin Depth Perception for Monocular 3D Object Detection in
Adverse Scenes
- URL: http://arxiv.org/abs/2305.10974v2
- Date: Thu, 25 May 2023 06:12:02 GMT
- Title: MonoTDP: Twin Depth Perception for Monocular 3D Object Detection in
Adverse Scenes
- Authors: Xingyuan Li and Jinyuan Liu and Yixin Lei and Long Ma and Xin Fan and
Risheng Liu
- Abstract summary: This paper proposes a monocular 3D detection model designed to perceive twin depth in adverse scenes, termed MonoTDP.
We first introduce an adaptive learning strategy to aid the model in handling uncontrollable weather conditions, significantly resisting degradation caused by various degrading factors.
Then, to address the depth/content loss in adverse regions, we propose a novel twin depth perception module that simultaneously estimates scene and object depth.
- Score: 49.21187418886508
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D object detection plays a crucial role in numerous intelligent vision
systems. Detection in the open world inevitably encounters various adverse
scenes, such as dense fog, heavy rain, and low light conditions. Although
existing efforts primarily focus on diversifying network architecture or
training schemes, resulting in significant progress in 3D object detection,
most of these learnable modules fail in adverse scenes, thereby hindering
detection performance. To address this issue, this paper proposes a monocular
3D detection model designed to perceive twin depth in adverse scenes, termed
MonoTDP, which effectively mitigates the degradation of detection performance
in various harsh environments. Specifically, we first introduce an adaptive
learning strategy to aid the model in handling uncontrollable weather
conditions, significantly resisting degradation caused by various degrading
factors. Then, to address the depth/content loss in adverse regions, we propose
a novel twin depth perception module that simultaneously estimates scene and
object depth, enabling the integration of scene-level features and object-level
features. Additionally, we assemble a new adverse 3D object detection dataset
encompassing a wide range of challenging scenes, including rainy, foggy, and
low light weather conditions, with each type of scene containing 7,481 images.
Experimental results demonstrate that our proposed method outperforms current
state-of-the-art approaches by an average of 3.12% AP_R40 for the car category
across various adverse environments.
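The core idea of twin depth perception can be illustrated with a minimal sketch. This is an assumed toy fusion, not the authors' implementation: the object-level depth estimate is weighted more heavily where degradation (e.g. fog) weakens scene-level cues, with `w` standing in for a hypothetical confidence weight.

```python
# Minimal sketch (assumed, not MonoTDP's code): blend a scene-level and
# an object-level depth estimate for one object. `w` in [0, 1] is a
# hypothetical confidence weight for the object branch; in adverse
# regions the object branch would be trusted more.
def fuse_depth(scene_depth: float, object_depth: float, w: float) -> float:
    """Convex combination of the two depth estimates, in meters."""
    return w * object_depth + (1.0 - w) * scene_depth

# Foggy region: trust the object branch more (w = 0.75).
d = fuse_depth(scene_depth=18.4, object_depth=20.0, w=0.75)
# d = 0.75 * 20.0 + 0.25 * 18.4 = 19.6
```

The actual module fuses feature maps rather than scalar depths, but the weighting intuition is the same.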
Related papers
- Towards Unified 3D Object Detection via Algorithm and Data Unification [70.27631528933482]
We build the first unified multi-modal 3D object detection benchmark MM-Omni3D and extend the aforementioned monocular detector to its multi-modal version.
We name the designed monocular and multi-modal detectors as UniMODE and MM-UniMODE, respectively.
arXiv Detail & Related papers (2024-02-28T18:59:31Z) - S$^3$-MonoDETR: Supervised Shape&Scale-perceptive Deformable Transformer for Monocular 3D Object Detection [21.96072831561483]
This paper proposes a novel "Supervised Shape&Scale-perceptive Deformable Attention" (S$^3$-DA) module for monocular 3D object detection.
Benefiting from this, S$^3$-DA effectively estimates receptive fields for query points belonging to any category, enabling them to generate robust query features.
Experiments on the KITTI and Waymo Open datasets demonstrate that S$^3$-DA significantly improves the detection accuracy.
arXiv Detail & Related papers (2023-09-02T12:36:38Z) - DetZero: Rethinking Offboard 3D Object Detection with Long-term
Sequential Point Clouds [55.755450273390004]
Existing offboard 3D detectors always follow a modular pipeline design to take advantage of unlimited sequential point clouds.
We have found that the full potential of offboard 3D detectors is not explored mainly due to two reasons: (1) the onboard multi-object tracker cannot generate sufficient complete object trajectories, and (2) the motion state of objects poses an inevitable challenge for the object-centric refining stage.
To tackle these problems, we propose a novel paradigm of offboard 3D object detection, named DetZero.
arXiv Detail & Related papers (2023-06-09T16:42:00Z) - AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z) - ORA3D: Overlap Region Aware Multi-view 3D Object Detection [11.58746596768273]
Current multi-view 3D object detection methods often fail to detect objects in the overlap region properly.
We propose using the following two main modules: (1) Stereo Disparity Estimation for Weak Depth Supervision and (2) Adversarial Overlap Region Discriminator.
Our proposed method outperforms current state-of-the-art models, i.e., DETR3D and BEVDet.
arXiv Detail & Related papers (2022-07-02T15:28:44Z) - The Devil is in the Task: Exploiting Reciprocal Appearance-Localization
Features for Monocular 3D Object Detection [62.1185839286255]
Low-cost monocular 3D object detection plays a fundamental role in autonomous driving.
We introduce a Dynamic Feature Reflecting Network, named DFR-Net.
We rank 1st among all the monocular 3D object detectors in the KITTI test set.
arXiv Detail & Related papers (2021-12-28T07:31:18Z) - MonoGRNet: A General Framework for Monocular 3D Object Detection [23.59839921644492]
We propose MonoGRNet for the amodal 3D object detection from a monocular image via geometric reasoning.
MonoGRNet decomposes the monocular 3D object detection task into four sub-tasks including 2D object detection, instance-level depth estimation, projected 3D center estimation and local corner regression.
Experiments are conducted on KITTI, Cityscapes and MS COCO datasets.
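Two of MonoGRNet's sub-tasks, instance-level depth estimation and projected 3D center estimation, combine via the standard pinhole camera model to recover the metric 3D center. A minimal sketch of that back-projection step (intrinsics `fx, fy, cx, cy` are illustrative, not taken from the paper):

```python
# Sketch of pinhole back-projection: given an estimated instance depth z
# and the projected 3D center (u, v) in pixels, recover the 3D center in
# camera coordinates. Not the paper's code; intrinsic values below are
# only KITTI-like placeholders.
def backproject_center(u: float, v: float, z: float,
                       fx: float, fy: float, cx: float, cy: float):
    """Invert the pinhole projection u = fx*x/z + cx, v = fy*y/z + cy."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return x, y, z

# Example with illustrative intrinsics.
x, y, z = backproject_center(u=640.0, v=200.0, z=20.0,
                             fx=721.5, fy=721.5, cx=609.6, cy=172.9)
```

The remaining sub-tasks (2D detection and local corner regression) then place the box extent around this recovered center.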
arXiv Detail & Related papers (2021-04-18T10:07:52Z) - Geometry-aware data augmentation for monocular 3D object detection [18.67567745336633]
This paper focuses on monocular 3D object detection, one of the essential modules in autonomous driving systems.
A key challenge is that the depth recovery problem is ill-posed in monocular data.
We conduct a thorough analysis to reveal how existing methods fail to robustly estimate depth when different geometry shifts occur.
We convert the aforementioned manipulations into four corresponding 3D-aware data augmentation techniques.
arXiv Detail & Related papers (2021-04-12T23:12:48Z) - Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
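The one-parameter-per-step refinement loop can be sketched as follows. This is an illustrative greedy stand-in, not the paper's learned policy: a hand-written reward (negative L1 distance to a target box) replaces the reward the RL agent would receive after several steps.

```python
# Illustrative sketch (not the paper's RL policy): refine a 3D box one
# parameter at a time, keeping a step only if it improves a score. The
# score here is a toy stand-in for the learned reward.
def score(box, target):
    """Negative L1 distance between box parameters and the target."""
    return -sum(abs(a - b) for a, b in zip(box, target))

def refine(box, target, step=0.1, iters=100):
    """Greedy axial refinement: adjust one parameter per step."""
    box = list(box)
    for _ in range(iters):
        improved = False
        for i in range(len(box)):            # one axis at a time
            for delta in (+step, -step):
                cand = list(box)
                cand[i] += delta
                if score(cand, target) > score(box, target):
                    box, improved = cand, True
        if not improved:                      # local optimum reached
            break
    return box

refined = refine([0.0, 0.0, 0.0], target=[0.5, -0.3, 0.2])
```

The paper replaces the greedy acceptance rule with a policy trained by reinforcement learning, since the true reward is only available after several steps.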
arXiv Detail & Related papers (2020-08-31T17:10:48Z) - 3D Object Detection on Point Clouds using Local Ground-aware and
Adaptive Representation of scenes' surface [1.9336815376402714]
A novel, adaptive ground-aware, and cost-effective 3D Object Detection pipeline is proposed.
The pipeline achieves new state-of-the-art 3D object detection performance among two-stage LiDAR object detection pipelines.
arXiv Detail & Related papers (2020-02-02T05:42:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.