A Review on Object Pose Recovery: from 3D Bounding Box Detectors to Full 6D Pose Estimators
- URL: http://arxiv.org/abs/2001.10609v2
- Date: Sun, 19 Apr 2020 11:59:38 GMT
- Title: A Review on Object Pose Recovery: from 3D Bounding Box Detectors to Full 6D Pose Estimators
- Authors: Caner Sahin, Guillermo Garcia-Hernando, Juil Sock, Tae-Kyun Kim
- Abstract summary: We present the first comprehensive and most recent review of the methods on object pose recovery.
The methods mathematically model the problem as a classification, regression, classification & regression, template matching, and point-pair feature matching task.
- Score: 40.049600223903546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object pose recovery has gained increasing attention in the computer vision
field as it has become an important problem in rapidly evolving technological
areas related to autonomous driving, robotics, and augmented reality. Existing review-oriented studies have addressed the problem at the visual level in 2D, covering methods that produce 2D bounding boxes of objects of interest in RGB images. The 2D search space is enlarged either by using the geometry information available in 3D space along with RGB (Mono/Stereo) images, or by utilizing depth data from LIDAR sensors and/or RGB-D cameras. 3D bounding box
detectors, producing category-level amodal 3D bounding boxes, are evaluated on
gravity-aligned images, while full 6D object pose estimators are mostly tested at the instance level on images where the alignment constraint is removed. More recently, 6D object pose estimation has also been tackled at the category level. In
this paper, we present the first comprehensive and most recent review of the
methods on object pose recovery, from 3D bounding box detectors to full 6D pose
estimators. The methods mathematically model the problem as a classification,
regression, classification & regression, template matching, and point-pair
feature matching task. Based on this, a mathematical-model-based categorization
of the methods is established. Datasets used for evaluating the methods are
investigated with respect to the challenges, and evaluation metrics are
studied. Quantitative results of experiments reported in the literature are analyzed to show which categories of methods perform best on which types of challenges. The analyses are further extended by comparing two methods, our own implementations, so that the conclusions drawn from the publicly reported results are further corroborated. The current state of the field in object pose recovery is summarized, and possible research directions are identified.
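As an illustration of the kind of evaluation metric surveyed, the following minimal sketch (in NumPy, with function names of our choosing) implements the widely used Average Distance of model points (ADD) criterion together with the common acceptance rule of 10% of the model diameter; it is a conventional example, not code from the paper.

```python
import numpy as np

def add_metric(model_pts, R_gt, t_gt, R_est, t_est):
    """Average Distance of model points (ADD) between two 6D poses.

    model_pts: (N, 3) points sampled from the object model.
    R_*: (3, 3) rotation matrices, t_*: (3,) translation vectors.
    """
    pts_gt = model_pts @ R_gt.T + t_gt     # model under the ground-truth pose
    pts_est = model_pts @ R_est.T + t_est  # model under the estimated pose
    return np.linalg.norm(pts_gt - pts_est, axis=1).mean()

def pose_is_correct(add_value, model_diameter, threshold=0.1):
    # Common convention: accept a pose if ADD < 10% of the model diameter.
    return add_value < threshold * model_diameter
```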
Related papers
- Open Vocabulary Monocular 3D Object Detection [10.424711580213616]
We pioneer the study of open-vocabulary monocular 3D object detection, a novel task that aims to detect and localize objects in 3D space from a single RGB image.
We introduce a class-agnostic approach that leverages open-vocabulary 2D detectors and lifts 2D bounding boxes into 3D space.
Our approach decouples the recognition and localization of objects in 2D from the task of estimating 3D bounding boxes, enabling generalization across unseen categories.
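The lifting step can be pictured with a short sketch (an assumed illustration, not the paper's pipeline): given camera intrinsics and an estimated depth, the center of a 2D box is back-projected along its viewing ray, while metric dimensions and yaw are assumed to come from separate prediction heads.

```python
import numpy as np

def backproject_center(box_2d, depth, K):
    """Back-project the center of a 2D box to a 3D point in camera coordinates.

    box_2d: (x1, y1, x2, y2) in pixels; depth: estimated depth in metres;
    K: 3x3 camera intrinsic matrix.
    """
    u = 0.5 * (box_2d[0] + box_2d[2])
    v = 0.5 * (box_2d[1] + box_2d[3])
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing ray with z = 1
    return ray * depth                              # 3D center (X, Y, Z)

def lift_to_3d_box(box_2d, depth, K, dims, yaw):
    """Assemble a 3D box (center, metric size, heading) from a 2D detection
    plus an estimated depth, dimensions (w, h, l) and yaw angle."""
    return {"center": backproject_center(box_2d, depth, K),
            "dims": dims, "yaw": yaw}
```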
arXiv Detail & Related papers (2024-11-25T18:59:17Z)
- Rigidity-Aware Detection for 6D Object Pose Estimation [60.88857851869196]
Most recent 6D object pose estimation methods first use object detection to obtain 2D bounding boxes before actually regressing the pose.
We propose a rigidity-aware detection method exploiting the fact that, in 6D pose estimation, the target objects are rigid.
Key to the success of our approach is a visibility map, which we propose to build using a minimum barrier distance between every pixel in the bounding box and the box boundary.
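The minimum barrier distance (MBD) mentioned here can be approximated with a Dijkstra-style relaxation; the sketch below is a simplified approximation for illustration only (exact MBD needs more careful bookkeeping) and is not the implementation used in the paper.

```python
import heapq
import numpy as np

def approx_min_barrier_distance(patch):
    """Approximate minimum barrier distance from every pixel of an image
    patch (e.g. a detected box) to the patch boundary.

    The barrier cost of a path is max(I) - min(I) along it. Keeping a single
    (hi, lo) pair per pixel makes this a known approximation, not exact MBD.
    """
    h, w = patch.shape
    dist = np.full((h, w), np.inf)
    hi = patch.astype(float)
    lo = patch.astype(float)
    heap = []
    # Seed with all boundary pixels (barrier distance 0 to the boundary).
    for y in range(h):
        for x in range(w):
            if y in (0, h - 1) or x in (0, w - 1):
                dist[y, x] = 0.0
                heapq.heappush(heap, (0.0, y, x))
    while heap:
        d, y, x = heapq.heappop(heap)
        if d > dist[y, x]:
            continue  # stale heap entry
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                nhi = max(hi[y, x], patch[ny, nx])
                nlo = min(lo[y, x], patch[ny, nx])
                nd = nhi - nlo
                if nd < dist[ny, nx]:
                    dist[ny, nx], hi[ny, nx], lo[ny, nx] = nd, nhi, nlo
                    heapq.heappush(heap, (nd, ny, nx))
    return dist
```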
arXiv Detail & Related papers (2023-03-22T09:02:54Z)
- Monocular 3D Object Detection with Depth from Motion [74.29588921594853]
We take advantage of camera ego-motion for accurate object depth estimation and detection.
Our framework, named Depth from Motion (DfM), then uses the established geometry to lift 2D image features to the 3D space and detects 3D objects thereon.
Our framework outperforms state-of-the-art methods by a large margin on the KITTI benchmark.
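The two-view geometry underlying this idea can be illustrated with classical linear (DLT) triangulation, assuming the relative pose (R, t) is supplied by ego-motion; this is only the textbook construction such methods build on, not the DfM method itself.

```python
import numpy as np

def triangulate_point(uv1, uv2, K, R, t):
    """Linear (DLT) triangulation of one pixel correspondence.

    uv1, uv2: matched pixel coordinates in frame 1 and frame 2.
    K: shared 3x3 intrinsics; (R, t): relative motion such that a point X in
    frame 1 maps to R @ X + t in frame 2 (e.g. obtained from ego-motion).
    """
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # projection, frame 1
    P2 = K @ np.hstack([R, t.reshape(3, 1)])           # projection, frame 2
    A = np.stack([
        uv1[0] * P1[2] - P1[0],
        uv1[1] * P1[2] - P1[1],
        uv2[0] * P2[2] - P2[0],
        uv2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # 3D point in frame-1 camera coordinates
```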
arXiv Detail & Related papers (2022-07-26T15:48:46Z)
- Homography Loss for Monocular 3D Object Detection [54.04870007473932]
A differentiable loss function, termed the Homography Loss, which exploits both 2D and 3D information, is proposed to achieve this goal.
Our method outperforms other state-of-the-art approaches by a large margin on the KITTI 3D datasets.
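The exact form of the Homography Loss is not given in this summary; purely as an illustration of coupling 2D and 3D predictions, the sketch below shows a generic differentiable projection-consistency loss in PyTorch (names and formulation are assumptions, not the paper's loss).

```python
import torch

def projection_consistency_loss(centers_3d, boxes_2d, K):
    """Illustrative differentiable 2D-3D consistency loss.

    centers_3d: (N, 3) predicted 3D box centers in camera coordinates.
    boxes_2d:   (N, 4) matched 2D boxes (x1, y1, x2, y2) in pixels.
    K:          (3, 3) camera intrinsics.
    """
    proj = centers_3d @ K.T                           # project onto the image
    uv = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)   # perspective division
    centers_2d = 0.5 * (boxes_2d[:, :2] + boxes_2d[:, 2:])
    return torch.nn.functional.smooth_l1_loss(uv, centers_2d)
```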
arXiv Detail & Related papers (2022-04-02T03:48:03Z)
- Pose Estimation of Specific Rigid Objects [0.7931904787652707]
We address the problem of estimating the 6D pose of rigid objects from a single RGB or RGB-D input image.
This problem is of great importance to many application fields such as robotic manipulation, augmented reality, and autonomous driving.
arXiv Detail & Related papers (2021-12-30T14:36:47Z)
- Learning Stereopsis from Geometric Synthesis for 6D Object Pose Estimation [11.999630902627864]
Current monocular-based 6D object pose estimation methods generally achieve less competitive results than RGB-D-based methods.
This paper proposes a 3D geometric volume based pose estimation method with a short baseline two-view setting.
Experiments show that our method outperforms state-of-the-art monocular-based methods and is robust across different objects and scenes.
arXiv Detail & Related papers (2021-09-25T02:55:05Z)
- Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection [70.71934539556916]
We learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
Specifically, a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network is devised.
Our method improves the detection performance of the state-of-the-art monocular-based method by 2.80% on the moderate test setting, without using extra data.
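The projective formula itself is not reproduced in this summary, but a typical pinhole relation used in monocular 3D detection links 2D box height, metric object height, and depth; the sketch below shows that standard relation, not the paper's exact formulation.

```python
import numpy as np

def depth_from_heights(f_y, height_3d, height_2d):
    """Recover object depth from the pinhole relation
    h_2d ~= f_y * H_3d / Z  =>  Z ~= f_y * H_3d / h_2d.

    f_y: vertical focal length in pixels; height_3d: object height in metres;
    height_2d: height of the 2D box in pixels.
    """
    return f_y * height_3d / np.maximum(height_2d, 1e-6)

# Example: a 1.6 m tall car whose 2D box is 80 px high, with f_y = 720 px,
# gives a depth of roughly 14.4 m.
print(depth_from_heights(720.0, 1.6, 80.0))
```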
arXiv Detail & Related papers (2021-07-29T12:30:39Z)
- Geometry-aware data augmentation for monocular 3D object detection [18.67567745336633]
This paper focuses on monocular 3D object detection, one of the essential modules in autonomous driving systems.
A key challenge is that the depth recovery problem is ill-posed in monocular data.
We conduct a thorough analysis to reveal how existing methods fail to robustly estimate depth when different geometry shifts occur.
We convert these geometric manipulations into four corresponding 3D-aware data augmentation techniques.
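The four techniques are not spelled out in this summary; as one common example of a geometry-consistent augmentation, the sketch below rescales an image together with the camera intrinsics so that the 3D labels stay valid (an assumed illustration, not the paper's recipe).

```python
import numpy as np
import cv2

def scale_image_and_intrinsics(image, K, scale):
    """Resize an image and rescale the 3x3 camera intrinsics accordingly,
    so that 3D box labels (in camera coordinates) remain consistent."""
    h, w = image.shape[:2]
    resized = cv2.resize(image, (int(w * scale), int(h * scale)))
    K_new = K.astype(float)
    K_new[0, :] *= scale  # fx, skew, cx
    K_new[1, :] *= scale  # fy, cy
    return resized, K_new
```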
arXiv Detail & Related papers (2021-04-12T23:12:48Z)
- Single View Metrology in the Wild [94.7005246862618]
We present a novel approach to single view metrology that can recover the absolute scale of a scene represented by 3D heights of objects or camera height above the ground.
Our method relies on data-driven priors learned by a deep network specifically designed to imbibe weakly supervised constraints from the interplay of the unknown camera with 3D entities such as object heights.
We demonstrate state-of-the-art qualitative and quantitative results on several datasets as well as applications including virtual object insertion.
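The classical single-view metrology relation behind such scale recovery (assuming a roughly level camera and an upright object standing on the ground plane) can be written in one line; the sketch below is that textbook relation, not the paper's learned model.

```python
def object_height_from_image(v_top, v_bottom, v_horizon, camera_height):
    """Classical single-view metrology relation for an upright object on the
    ground plane seen by a (roughly) level pinhole camera:

        H = camera_height * (v_bottom - v_top) / (v_bottom - v_horizon)

    v_*: image rows in pixels (y grows downwards); camera_height in metres.
    """
    return camera_height * (v_bottom - v_top) / (v_bottom - v_horizon)

# Example: horizon at row 360, object spans rows 300..560, camera 1.5 m high
# => estimated object height = 1.5 * 260 / 200 = 1.95 m.
print(object_height_from_image(300.0, 560.0, 360.0, 1.5))
```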
arXiv Detail & Related papers (2020-07-18T22:31:33Z)