ORA3D: Overlap Region Aware Multi-view 3D Object Detection
- URL: http://arxiv.org/abs/2207.00865v4
- Date: Thu, 29 Jun 2023 09:20:24 GMT
- Title: ORA3D: Overlap Region Aware Multi-view 3D Object Detection
- Authors: Wonseok Roh, Gyusam Chang, Seokha Moon, Giljoo Nam, Chanyoung Kim,
Younghyun Kim, Jinkyu Kim, Sangpil Kim
- Abstract summary: Current multi-view 3D object detection methods often fail to detect objects in the overlap region properly.
We propose using the following two main modules: (1) Stereo Disparity Estimation for Weak Depth Supervision and (2) Adversarial Overlap Region Discriminator.
Our proposed method outperforms current state-of-the-art models, i.e., DETR3D and BEVDet.
- Score: 11.58746596768273
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current multi-view 3D object detection methods often fail to detect objects
in the overlap region properly, and the networks' understanding of the scene is
often limited to that of a monocular detection network. Moreover, objects in
the overlap region are often largely occluded or suffer from deformation due to
camera distortion, causing a domain shift. To mitigate this issue, we propose
using the following two main modules: (1) Stereo Disparity Estimation for Weak
Depth Supervision and (2) Adversarial Overlap Region Discriminator. The former
utilizes the traditional stereo disparity estimation method to obtain reliable
disparity information from the overlap region. Given the disparity estimates as
supervision, we propose regularizing the network to fully utilize the geometric
potential of binocular images and improve the overall detection accuracy
accordingly. Further, the latter module minimizes the representational gap
between non-overlap and overlapping regions. We demonstrate the effectiveness
of the proposed method with the nuScenes large-scale multi-view 3D object
detection data. Our experiments show that our proposed method outperforms
current state-of-the-art models, i.e., DETR3D and BEVDet.
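The first module leans on a traditional stereo disparity estimator applied to the overlapping image regions. As an illustration of that classical step, here is a minimal pure-NumPy SAD block-matching sketch; the paper does not specify its exact estimator, and the function name, block size, and disparity range below are my own illustrative choices, not the authors'.

```python
import numpy as np

def block_match_disparity(left, right, block=5, max_disp=8):
    """Sketch of classical SAD block matching: for each pixel in the
    left image, slide a patch left-ward over the right image and keep
    the horizontal shift (disparity) with the lowest sum of absolute
    differences. Borders narrower than half a block are left at 0."""
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1]
            best, best_d = np.inf, 0
            # d <= x - half keeps the candidate window inside the image
            for d in range(min(max_disp, x - half) + 1):
                cand = right[y - half:y + half + 1,
                             x - d - half:x - d + half + 1]
                sad = np.abs(patch - cand).sum()
                if sad < best:
                    best, best_d = sad, d
            disp[y, x] = best_d
    return disp
```

With a rectified pair, such disparities convert to metric depth as depth = focal_length x baseline / disparity, which is the kind of geometric signal the paper uses as weak supervision rather than as a ground-truth depth map.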
Related papers
- Cross-Cluster Shifting for Efficient and Effective 3D Object Detection
in Autonomous Driving [69.20604395205248]
We present a new 3D point-based detector model, named Shift-SSD, for precise 3D object detection in autonomous driving.
We introduce an intriguing Cross-Cluster Shifting operation to unleash the representation capacity of the point-based detector.
We conduct extensive experiments on the KITTI and nuScenes datasets, and the results demonstrate the state-of-the-art performance and efficient runtime of Shift-SSD.
arXiv Detail & Related papers (2024-03-10T10:36:32Z) - Towards Unified 3D Object Detection via Algorithm and Data Unification [70.27631528933482]
We build the first unified multi-modal 3D object detection benchmark MM-Omni3D and extend the aforementioned monocular detector to its multi-modal version.
We name the designed monocular and multi-modal detectors as UniMODE and MM-UniMODE, respectively.
arXiv Detail & Related papers (2024-02-28T18:59:31Z) - SVDM: Single-View Diffusion Model for Pseudo-Stereo 3D Object Detection [0.0]
A recently proposed framework for monocular 3D detection based on Pseudo-Stereo has received considerable attention in the community.
In this work, we propose an end-to-end, efficient pseudo-stereo 3D detection framework by introducing a Single-View Diffusion Model.
SVDM allows the entire pseudo-stereo 3D detection pipeline to be trained end-to-end and can benefit from the training of stereo detectors.
arXiv Detail & Related papers (2023-07-05T13:10:37Z) - Towards Model Generalization for Monocular 3D Object Detection [57.25828870799331]
We present an effective unified camera-generalized paradigm (CGP) for Mono3D object detection.
We also propose the 2D-3D geometry-consistent object scaling strategy (GCOS) to bridge the gap via an instance-level augment.
Our method called DGMono3D achieves remarkable performance on all evaluated datasets and surpasses the SoTA unsupervised domain adaptation scheme.
arXiv Detail & Related papers (2022-05-23T23:05:07Z) - Homography Loss for Monocular 3D Object Detection [54.04870007473932]
A differentiable loss function, termed as Homography Loss, is proposed to achieve the goal, which exploits both 2D and 3D information.
Our method yields the best performance compared with the other state-of-the-arts by a large margin on KITTI 3D datasets.
arXiv Detail & Related papers (2022-04-02T03:48:03Z) - MVM3Det: A Novel Method for Multi-view Monocular 3D Detection [0.0]
MVM3Det simultaneously estimates the 3D position and orientation of the object according to the multi-view monocular information.
We present a first dataset for multi-view 3D object detection named MVM3D.
arXiv Detail & Related papers (2021-09-22T01:31:00Z) - MonoGRNet: A General Framework for Monocular 3D Object Detection [23.59839921644492]
We propose MonoGRNet for the amodal 3D object detection from a monocular image via geometric reasoning.
MonoGRNet decomposes the monocular 3D object detection task into four sub-tasks including 2D object detection, instance-level depth estimation, projected 3D center estimation and local corner regression.
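The sub-tasks above are stitched together by pinhole-camera geometry: once the projected 3D center and an instance-level depth are estimated, the 3D center is recovered by back-projection. A minimal sketch of that step follows; the function and parameter names are illustrative and not taken from MonoGRNet's code.

```python
import numpy as np

def backproject_center(u, v, z, fx, fy, cx, cy):
    """Recover a 3D point from its projected pixel location (u, v) and
    an estimated depth z under the pinhole model with focal lengths
    (fx, fy) and principal point (cx, cy):
        X = (u - cx) * z / fx,  Y = (v - cy) * z / fy,  Z = z."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])
```

This is the standard inverse of the pinhole projection, so a point projected with the same intrinsics round-trips exactly; in MonoGRNet the local corner regression then places the 3D box around this recovered center.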
Experiments are conducted on KITTI, Cityscapes and MS COCO datasets.
arXiv Detail & Related papers (2021-04-18T10:07:52Z) - Delving into Localization Errors for Monocular 3D Object Detection [85.77319416168362]
Estimating 3D bounding boxes from monocular images is an essential component in autonomous driving.
In this work, we quantify the impact introduced by each sub-task and find that the 'localization error' is the vital factor in restricting monocular 3D detection.
arXiv Detail & Related papers (2021-03-30T10:38:01Z) - M3DSSD: Monocular 3D Single Stage Object Detector [82.25793227026443]
We propose a Monocular 3D Single Stage object Detector (M3DSSD) with feature alignment and asymmetric non-local attention.
The proposed M3DSSD achieves significantly better performance than the monocular 3D object detection methods on the KITTI dataset.
arXiv Detail & Related papers (2021-03-24T13:09:11Z) - SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint
Estimation [3.1542695050861544]
Estimating 3D orientation and translation of objects is essential for infrastructure-less autonomous navigation and driving.
We propose a novel 3D object detection method, named SMOKE, that combines a single keypoint estimate with regressed 3D variables.
Despite its structural simplicity, our proposed SMOKE network outperforms all existing monocular 3D detection methods on the KITTI dataset.
arXiv Detail & Related papers (2020-02-24T08:15:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.