RGB-based Category-level Object Pose Estimation via Decoupled Metric
Scale Recovery
- URL: http://arxiv.org/abs/2309.10255v2
- Date: Wed, 18 Oct 2023 08:21:34 GMT
- Title: RGB-based Category-level Object Pose Estimation via Decoupled Metric
Scale Recovery
- Authors: Jiaxin Wei, Xibin Song, Weizhe Liu, Laurent Kneip, Hongdong Li and Pan
Ji
- Abstract summary: We propose a novel pipeline that decouples the 6D pose and size estimation to mitigate the influence of imperfect scales on rigid transformations.
Specifically, we leverage a pre-trained monocular estimator to extract local geometric information.
A separate branch is designed to directly recover the metric scale of the object based on category-level statistics.
- Score: 72.13154206106259
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While showing promising results, recent RGB-D camera-based category-level
object pose estimation methods have restricted applications due to the heavy
reliance on depth sensors. RGB-only methods provide an alternative to this
problem yet suffer from inherent scale ambiguity stemming from monocular
observations. In this paper, we propose a novel pipeline that decouples the 6D
pose and size estimation to mitigate the influence of imperfect scales on rigid
transformations. Specifically, we leverage a pre-trained monocular estimator to
extract local geometric information, mainly facilitating the search for inlier
2D-3D correspondence. Meanwhile, a separate branch is designed to directly
recover the metric scale of the object based on category-level statistics.
Finally, we advocate using the RANSAC-P$n$P algorithm to robustly solve for 6D
object pose. Extensive experiments have been conducted on both synthetic and
real datasets, demonstrating the superior performance of our method over
previous state-of-the-art RGB-based approaches, especially in terms of rotation
accuracy. Code: https://github.com/goldoak/DMSR.
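
The scale ambiguity mentioned in the abstract, and the reason pose and metric scale can be treated separately, can be illustrated with a small pinhole-camera sketch. This is not the authors' implementation: the camera intrinsics, the pose, the sample points, and the helper names (`rot_z`, `project`) are all hypothetical values chosen for illustration. It only shows that an object at unit scale with translation `t_unit` projects to the same pixels as the same object at metric scale `s` with translation `s * t_unit`. The rotation is therefore unaffected by the scale, and a separately estimated scale only rescales the translation.

```python
import math

def rot_z(theta):
    """Rotation matrix for an angle theta (radians) about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def project(point, R, t, f=500.0, cx=320.0, cy=240.0):
    """Pinhole projection of an object-frame point (intrinsics are assumed values)."""
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    return (f * xc[0] / xc[2] + cx, f * xc[1] / xc[2] + cy)

# Hypothetical pose of an object expressed in normalized (unit-scale) coordinates.
R = rot_z(0.3)
t_unit = [0.1, -0.2, 2.0]
points_unit = [[0.5, 0.5, 0.5], [-0.5, 0.5, -0.5], [0.5, -0.5, 0.0]]

# Hypothetical metric scale, as a separate scale branch might predict it.
scale = 0.25

# Pixels seen for the unit-scale object.
pixels_unit = [project(p, R, t_unit) for p in points_unit]

# Same rotation, scaled object, scaled translation: the metric interpretation.
t_metric = [scale * v for v in t_unit]
pixels_metric = [
    project([scale * c for c in p], R, t_metric) for p in points_unit
]

# Largest pixel difference between the two interpretations.
max_diff = max(
    abs(a - b)
    for pu, pm in zip(pixels_unit, pixels_metric)
    for a, b in zip(pu, pm)
)
print(max_diff)
```

Because both interpretations give the same image, a single RGB view cannot determine the scale on its own, which is why it has to come from another source such as the category-level statistics the abstract describes.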
Related papers
- SEMPose: A Single End-to-end Network for Multi-object Pose Estimation [13.131534219937533]
SEMPose is an end-to-end multi-object pose estimation network.
It can perform inference at 32 FPS without requiring inputs other than the RGB image.
It can accurately estimate the poses of multiple objects in real time, with inference time unaffected by the number of target objects.
Date: 2024-11-21T10:37:54Z
- LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation [43.549593231397644]
LaPose is a novel framework that models the object shape as the Laplacian mixture model for Pose estimation.
By representing each point as a probabilistic distribution, we explicitly quantify the shape uncertainty.
LaPose yields state-of-the-art performance in category-level object pose estimation.
Date: 2024-09-24T04:20:18Z
- Towards Human-Level 3D Relative Pose Estimation: Generalizable, Training-Free, with Single Reference [62.99706119370521]
Humans can easily deduce the relative pose of an unseen object, without label/training, given only a single query-reference image pair.
We propose a novel 3D generalizable relative pose estimation method by elaborating (i) with a 2.5D shape from an RGB-D reference, (ii) with an off-the-shelf differentiable renderer, and (iii) with semantic cues from a pretrained model like DINOv2.
Date: 2024-06-26T16:01:10Z
- RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images [13.051302134031808]
We introduce a novel method for calculating the 6DoF pose of an object using a single RGB-D image.
Unlike existing methods that either directly predict objects' poses or rely on sparse keypoints for pose recovery, our approach addresses this challenging task using dense correspondence.
Date: 2024-05-14T10:10:45Z
- MV-ROPE: Multi-view Constraints for Robust Category-level Object Pose and Size Estimation [23.615122326731115]
We propose a novel solution that makes use of RGB video streams.
Our framework consists of three modules: a scale-aware monocular dense SLAM solution, a lightweight object pose predictor, and an object-level pose graph.
Our experimental results demonstrate that when utilizing public dataset sequences with high-quality depth information, the proposed method exhibits comparable performance to state-of-the-art RGB-D methods.
Date: 2023-08-17T08:29:54Z
- ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation [76.31125154523056]
We present a discrete descriptor, which can represent the object surface densely.
We also propose a coarse to fine training strategy, which enables fine-grained correspondence prediction.
Date: 2022-03-17T16:16:24Z
- Single-stage Keypoint-based Category-level Object Pose Estimation from an RGB Image [27.234658117816103]
We propose a single-stage, keypoint-based approach for category-level object pose estimation.
The proposed network performs 2D object detection, detects 2D keypoints, estimates 6-DoF pose, and regresses relative bounding cuboid dimensions.
We conduct extensive experiments on the challenging Objectron benchmark, outperforming state-of-the-art methods on the 3D IoU metric.
Date: 2021-09-13T17:55:00Z
- FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism [49.89268018642999]
We propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation.
The proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation.
Date: 2021-03-12T03:07:24Z
- Spatial Attention Improves Iterative 6D Object Pose Estimation [52.365075652976735]
We propose a new method for 6D pose estimation refinement from RGB images.
Our main insight is that after the initial pose estimate, it is important to pay attention to distinct spatial features of the object.
We experimentally show that this approach learns to attend to salient spatial features and learns to ignore occluded parts of the object, leading to better pose estimation across datasets.
Date: 2021-01-05T17:18:52Z
- Robust 6D Object Pose Estimation by Learning RGB-D Features [59.580366107770764]
We propose a novel discrete-continuous formulation for rotation regression to resolve the local-optimum problem.
We uniformly sample rotation anchors in SO(3), and predict a constrained deviation from each anchor to the target, as well as uncertainty scores for selecting the best prediction.
Experiments on two benchmarks, LINEMOD and YCB-Video, show that the proposed method outperforms state-of-the-art approaches.
Date: 2020-02-29T06:24:55Z
This list is automatically generated from the titles and abstracts of the papers on this site.