3D Video Object Detection with Learnable Object-Centric Global
Optimization
- URL: http://arxiv.org/abs/2303.15416v1
- Date: Mon, 27 Mar 2023 17:39:39 GMT
- Title: 3D Video Object Detection with Learnable Object-Centric Global
Optimization
- Authors: Jiawei He, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang
- Abstract summary: Correspondence-based optimization is the cornerstone for 3D scene reconstruction but is less studied in 3D video object detection.
We propose BA-Det, an end-to-end optimizable object detector with object-centric temporal correspondence learning and featuremetric object bundle adjustment.
- Score: 65.68977894460222
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We explore long-term temporal visual correspondence-based optimization for 3D
video object detection in this work. Visual correspondence refers to one-to-one
mappings for pixels across multiple images. Correspondence-based optimization
is the cornerstone for 3D scene reconstruction but is less studied in 3D video
object detection, because moving objects violate multi-view geometry
constraints and are treated as outliers during scene reconstruction. We address
this issue by treating objects as first-class citizens during
correspondence-based optimization. In this work, we propose BA-Det, an
end-to-end optimizable object detector with object-centric temporal
correspondence learning and featuremetric object bundle adjustment.
Empirically, we verify the effectiveness and efficiency of BA-Det for multiple
baseline 3D detectors under various setups. Our BA-Det achieves SOTA
performance on the large-scale Waymo Open Dataset (WOD) with only marginal
computation cost. Our code is available at
https://github.com/jiaweihe1996/BA-Det.
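The abstract's idea of optimizing over correspondences attached to an object, rather than to the static scene, can be illustrated with a small residual computation. The following is a minimal sketch only, not the BA-Det implementation: it uses the classical geometric reprojection error instead of the featuremetric error named in the abstract, and the pinhole intrinsics, the yaw-plus-translation object pose, and all function names are assumptions introduced for illustration.

```python
from math import cos, sin


def project(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame 3D point to pixel coordinates."""
    x, y, z = point_cam
    return (fx * x / z + cx, fy * y / z + cy)


def object_to_camera(point_obj, yaw, translation):
    """Map an object-frame point into the camera frame.

    Hypothetical pose model: rotation by `yaw` about the vertical (y) axis,
    followed by a translation.
    """
    x, y, z = point_obj
    tx, ty, tz = translation
    xc = cos(yaw) * x + sin(yaw) * z + tx
    zc = -sin(yaw) * x + cos(yaw) * z + tz
    return (xc, y + ty, zc)


def reprojection_error(point_obj, poses, observations, intrinsics):
    """Sum of squared pixel residuals for one object point tracked over frames."""
    total = 0.0
    for (yaw, t), (u_obs, v_obs) in zip(poses, observations):
        u, v = project(object_to_camera(point_obj, yaw, t), *intrinsics)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return total


# Toy setup (all values are made up): one point on a moving object, two frames.
intrinsics = (1000.0, 1000.0, 640.0, 360.0)  # fx, fy, cx, cy
point_obj = (0.5, 0.0, 0.0)
poses = [(0.0, (0.0, 0.0, 10.0)), (0.0, (1.0, 0.0, 10.0))]

# Synthesize the per-frame correspondences from the true point and poses.
observations = [
    project(object_to_camera(point_obj, yaw, t), *intrinsics) for yaw, t in poses
]

perfect = reprojection_error(point_obj, poses, observations, intrinsics)
perturbed = reprojection_error((0.6, 0.0, 0.0), poses, observations, intrinsics)
```

Because the point is expressed in the object's own frame, the object's motion is absorbed by the per-frame poses, so the residual stays small for a correct estimate even though the object moves relative to the scene. An optimizer would adjust the poses and points to minimize this sum.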
Related papers
- 3DGS-CD: 3D Gaussian Splatting-based Change Detection for Physical Object Rearrangement [2.2122801766964795]
We present 3DGS-CD, the first 3D Gaussian Splatting (3DGS)-based method for detecting physical object rearrangements in 3D scenes.
Our approach estimates 3D object-level changes by comparing two sets of unaligned images taken at different times.
Our method can detect changes in cluttered environments using sparse post-change images within as little as 18s, using as few as a single new image.
(2024-11-06T07:08:41Z)
- SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection [19.75965521357068]
We propose a novel approach called SOGDet (Semantic-Occupancy Guided Multi-view 3D Object Detection) to improve the accuracy of 3D object detection.
Our results show that SOGDet consistently enhances the performance of three baseline methods in terms of nuScenes Detection Score (NDS) and mean Average Precision (mAP).
This indicates that combining 3D object detection with 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby helping to build more robust autonomous driving systems.
(2023-08-26T07:38:21Z)
- Learning Object-level Point Augmentor for Semi-supervised 3D Object Detection [85.170578641966]
We propose an object-level point augmentor (OPA) that performs local transformations for semi-supervised 3D object detection.
In this way, the resultant augmentor is derived to emphasize object instances rather than irrelevant backgrounds.
Experiments on the ScanNet and SUN RGB-D datasets show that the proposed OPA performs favorably against the state-of-the-art methods.
(2022-12-19T06:56:14Z)
- CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection [57.44434974289945]
We propose the Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework.
Our framework takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene.
In addition to 3D object detection, we investigate the effectiveness of our framework for the problem of 3D object counting.
(2022-09-13T05:26:09Z)
- RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of the synthetic dataset, which consists of CAD object models, to boost the learning on real datasets.
Recent work on 3D pre-training fails when transferring features learned on synthetic objects to real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
(2021-08-17T17:56:12Z)
- Objects are Different: Flexible Monocular 3D Object Detection [87.82253067302561]
We propose a flexible framework for monocular 3D object detection which explicitly decouples the truncated objects and adaptively combines multiple approaches for object depth estimation.
Experiments demonstrate that our method outperforms the state-of-the-art method by a relative 27% for the moderate level and 30% for the hard level on the test set of the KITTI benchmark.
(2021-04-06T07:01:28Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
(2020-08-31T17:10:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.