Improving 2D-3D Dense Correspondences with Diffusion Models for 6D
Object Pose Estimation
- URL: http://arxiv.org/abs/2402.06436v1
- Date: Fri, 9 Feb 2024 14:27:40 GMT
- Title: Improving 2D-3D Dense Correspondences with Diffusion Models for 6D
Object Pose Estimation
- Authors: Peter Hönig, Stefan Thalhammer, Markus Vincze
- Abstract summary: Estimating 2D-3D correspondences between RGB images and 3D space is a fundamental problem in 6D object pose estimation.
Recent pose estimators use dense correspondence maps and Point-to-Point algorithms to estimate object poses.
Recent advancements in image-to-image translation have led to diffusion models being the superior choice when evaluated on benchmarking datasets.
- Score: 9.760487761422326
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating 2D-3D correspondences between RGB images and 3D space is a
fundamental problem in 6D object pose estimation. Recent pose estimators use
dense correspondence maps and Point-to-Point algorithms to estimate object
poses. The accuracy of pose estimation depends heavily on the quality of the
dense correspondence maps and their ability to withstand occlusion, clutter,
and challenging material properties. Currently, dense correspondence maps are
estimated using image-to-image translation models based on GANs, Autoencoders,
or direct regression models. However, recent advancements in image-to-image
translation have led to diffusion models being the superior choice when
evaluated on benchmarking datasets. In this study, we compare image-to-image
translation networks based on GANs and diffusion models for the downstream task
of 6D object pose estimation. Our results demonstrate that the diffusion-based
image-to-image translation model outperforms the GAN, revealing potential for
further improvements in 6D object pose estimation models.
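The role of the dense correspondence map in the pipeline described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes the map stores normalized object coordinates per pixel, that a foreground mask is available, and that the camera follows a pinhole model. The intrinsics, the `extent` parameter, and all function names are hypothetical. The pose solver itself is omitted (for example a Perspective-n-Point solver with RANSAC, which is an assumption, since the abstract does not name a specific algorithm). The sketch only shows how 2D-3D pairs are read from the map and how a candidate pose is scored by reprojection error.

```python
from math import hypot

# Hypothetical pinhole intrinsics (fx, fy, cx, cy); not taken from the paper.
FX, FY, CX, CY = 500.0, 500.0, 32.0, 32.0


def decode_correspondences(coord_map, mask, extent):
    """Turn a dense correspondence map into (pixel, 3D point) pairs.

    coord_map[v][u] holds normalized object coordinates in [0, 1]^3, as an
    image-to-image model might predict them; extent is the assumed object
    bounding-box size used to scale them back to model space.
    """
    pairs = []
    for v, row in enumerate(coord_map):
        for u, rgb in enumerate(row):
            if mask[v][u]:  # keep only pixels that belong to the object
                point = tuple((c - 0.5) * e for c, e in zip(rgb, extent))
                pairs.append(((float(u), float(v)), point))
    return pairs


def project(point, rotation, translation):
    """Project a 3D model point into the image under a candidate pose."""
    # Camera-frame coordinates: R @ p + t, written out row by row.
    x, y, z = (
        sum(r * p for r, p in zip(row, point)) + t
        for row, t in zip(rotation, translation)
    )
    return (FX * x / z + CX, FY * y / z + CY)


def mean_reprojection_error(pairs, rotation, translation):
    """Average pixel distance between observed and reprojected points."""
    total = 0.0
    for (u, v), point in pairs:
        pu, pv = project(point, rotation, translation)
        total += hypot(pu - u, pv - v)
    return total / len(pairs)
```

A pose solver searches for the rotation and translation that minimize this error over the decoded pairs. Noisy or wrong map values feed straight into that error, which is why the quality of the correspondence map affects pose accuracy so strongly.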
Related papers
- FocalPose++: Focal Length and Object Pose Estimation via Render and Compare [35.388094104164175]
We introduce FocalPose++, a neural render-and-compare method for jointly estimating the camera-object 6D pose and camera focal length.
We show results on three challenging benchmark datasets that depict known 3D models in uncontrolled settings.
arXiv Detail & Related papers (2023-11-15T13:28:02Z)
- Shape-Constraint Recurrent Flow for 6D Object Pose Estimation [15.238626453460666]
We propose a shape-constraint recurrent matching framework for 6D object pose estimation.
We first compute a pose-induced flow based on the displacement of 2D reprojection between the initial pose and the currently estimated pose.
We then use this pose-induced flow to construct the correlation map for the following matching iterations.
arXiv Detail & Related papers (2023-06-23T02:36:34Z)
- Unseen Object 6D Pose Estimation: A Benchmark and Baselines [62.8809734237213]
We propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondences network, our method finds corresponding points between an unseen object and a partial view RGBD image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z)
- Coupled Iterative Refinement for 6D Multi-Object Pose Estimation [64.7198752089041]
Given a set of known 3D objects and an RGB or RGB-D input image, we detect and estimate the 6D pose of each object.
Our approach iteratively refines both pose and correspondence in a tightly coupled manner, allowing us to dynamically remove outliers to improve accuracy.
arXiv Detail & Related papers (2022-04-26T18:00:08Z)
- Focal Length and Object Pose Estimation via Render and Compare [36.177948726394874]
We introduce FocalPose, a neural render-and-compare method for jointly estimating the camera-object 6D pose and camera focal length.
We show results on three challenging benchmark datasets that depict known 3D models in uncontrolled settings.
arXiv Detail & Related papers (2022-04-11T14:26:53Z)
- RNNPose: Recurrent 6-DoF Object Pose Refinement with Robust
Correspondence Field Estimation and Pose Optimization [46.144194562841435]
We propose a framework based on a recurrent neural network (RNN) for object pose refinement.
The problem is formulated as a non-linear least squares problem based on the estimated correspondence field.
The correspondence field estimation and pose refinement are conducted alternately in each iteration to recover accurate object poses.
arXiv Detail & Related papers (2022-03-24T06:24:55Z)
- ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose
Estimation [76.31125154523056]
We present a discrete descriptor, which can represent the object surface densely.
We also propose a coarse to fine training strategy, which enables fine-grained correspondence prediction.
arXiv Detail & Related papers (2022-03-17T16:16:24Z)
- Category-Level 6D Object Pose Estimation via Cascaded Relation and
Recurrent Reconstruction Networks [22.627704070200863]
Category-level 6D pose estimation is fundamental to many scenarios such as robotic manipulation and augmented reality.
We achieve accurate category-level 6D pose estimation via cascaded relation and recurrent reconstruction networks.
Our method exceeds the latest state-of-the-art SPD by $4.9\%$ and $17.7\%$ on the CAMERA25 dataset.
arXiv Detail & Related papers (2021-08-19T15:46:52Z)
- SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation [98.83762558394345]
SO-Pose is a framework for regressing all 6 degrees-of-freedom (6DoF) for the object pose in a cluttered environment from a single RGB image.
We introduce novel reasoning about self-occlusion to establish a two-layer representation for 3D objects.
Using cross-layer consistencies that align correspondences, self-occlusion, and 6D pose, we can further improve accuracy and robustness.
arXiv Detail & Related papers (2021-08-18T19:49:29Z)
- Spatial Attention Improves Iterative 6D Object Pose Estimation [52.365075652976735]
We propose a new method for 6D pose estimation refinement from RGB images.
Our main insight is that after the initial pose estimate, it is important to pay attention to distinct spatial features of the object.
We experimentally show that this approach learns to attend to salient spatial features and learns to ignore occluded parts of the object, leading to better pose estimation across datasets.
arXiv Detail & Related papers (2021-01-05T17:18:52Z)
- Shape Prior Deformation for Categorical 6D Object Pose and Size
Estimation [62.618227434286]
We present a novel learning approach to recover the 6D poses and sizes of unseen object instances from an RGB-D image.
We propose a deep network to reconstruct the 3D object model by explicitly modeling the deformation from a pre-learned categorical shape prior.
arXiv Detail & Related papers (2020-07-16T16:45:05Z)