Self-Supervised Geometric Correspondence for Category-Level 6D Object
Pose Estimation in the Wild
- URL: http://arxiv.org/abs/2210.07199v3
- Date: Mon, 3 Apr 2023 05:35:31 GMT
- Title: Self-Supervised Geometric Correspondence for Category-Level 6D Object
Pose Estimation in the Wild
- Authors: Kaifeng Zhang, Yang Fu, Shubhankar Borse, Hong Cai, Fatih Porikli,
Xiaolong Wang
- Abstract summary: We introduce a self-supervised learning approach trained directly on large-scale real-world object videos for category-level 6D pose estimation in the wild.
Our framework reconstructs the canonical 3D shape of an object category and learns dense correspondences between input images and the canonical shape via surface embedding.
Surprisingly, our method, without any human annotations or simulators, can achieve on-par or even better performance than previous supervised or semi-supervised methods on in-the-wild images.
- Score: 47.80637472803838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While 6D object pose estimation has wide applications across computer vision
and robotics, it remains far from being solved due to the lack of annotations.
The problem becomes even more challenging when moving to category-level 6D
pose, which requires generalization to unseen instances. Current approaches
remain restricted by their reliance on annotations from simulation or human labeling.
In this paper, we overcome this barrier by introducing a self-supervised
learning approach trained directly on large-scale real-world object videos for
category-level 6D pose estimation in the wild. Our framework reconstructs the
canonical 3D shape of an object category and learns dense correspondences
between input images and the canonical shape via surface embedding. For
training, we propose novel geometrical cycle-consistency losses which construct
cycles across 2D-3D spaces, across different instances and different time
steps. The learned correspondence can be applied for 6D pose estimation and
other downstream tasks such as keypoint transfer. Surprisingly, our method,
without any human annotations or simulators, can achieve on-par or even better
performance than previous supervised or semi-supervised methods on in-the-wild
images. Our project page is: https://kywind.github.io/self-pose .
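To make the 2D-3D cycle idea concrete, here is a minimal PyTorch sketch of one cycle-consistency term built from per-pixel and per-surface-point embeddings; all tensor names and shapes are illustrative assumptions, and the paper's full objective also builds cycles across instances and time steps.

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(pixel_emb, surface_emb, pixel_coords, tau=0.07):
    """Hypothetical 2D->3D->2D cycle term (not the authors' exact loss).

    pixel_emb:    (N, D) per-pixel embeddings from the image encoder
    surface_emb:  (M, D) embeddings of canonical surface points
    pixel_coords: (N, 2) 2D coordinates of the N sampled pixels
    """
    sim = pixel_emb @ surface_emb.t() / tau   # (N, M) similarities
    p2s = F.softmax(sim, dim=1)               # soft pixel -> surface assignment
    s2p = F.softmax(sim.t(), dim=1)           # soft surface -> pixel assignment
    # Following the 2D -> 3D -> 2D cycle should land each pixel on itself.
    cycled = p2s @ (s2p @ pixel_coords)       # (N, 2) round-trip coordinates
    return F.mse_loss(cycled, pixel_coords)
```

Minimizing the round-trip error pushes the two soft assignments toward mutually consistent dense correspondences without any pose labels.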
Related papers
- Unsupervised Learning of Category-Level 3D Pose from Object-Centric Videos [15.532504015622159]
Category-level 3D pose estimation is a fundamentally important problem in computer vision and robotics.
We tackle the problem of learning to estimate the category-level 3D pose only from casually taken object-centric videos.
arXiv Detail & Related papers (2024-07-05T09:43:05Z)
- FreeZe: Training-free zero-shot 6D pose estimation with geometric and vision foundation models [5.754251195342313]
We show how to tackle the same task but without training on specific data.
We propose FreeZe, a novel solution that harnesses the capabilities of pre-trained geometric and vision foundation models.
FreeZe consistently outperforms all state-of-the-art approaches, including competitors extensively trained on synthetic 6D pose estimation data.
arXiv Detail & Related papers (2023-12-01T22:00:14Z)
- Unseen Object 6D Pose Estimation: A Benchmark and Baselines [62.8809734237213]
We propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondences network, our method accurately and efficiently finds corresponding points between an unseen object and a partial-view RGB-D image (a toy version of this matching step is sketched below).
arXiv Detail & Related papers (2022-06-23T16:29:53Z)
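As a toy illustration of that correspondence step, the sketch below matches unseen-object points to pixels of a partial-view RGB-D image by nearest neighbors in a learned embedding space; every name and shape here is an assumption, not the paper's API.

```python
import torch
import torch.nn.functional as F

def match_correspondences(obj_feat, img_feat, img_xyz):
    """Hypothetical nearest-neighbor matching in an embedding space.

    obj_feat: (N, D) per-point features of the unseen object model
    img_feat: (M, D) per-pixel features of the partial-view RGB-D image
    img_xyz:  (M, 3) back-projected 3D coordinates of those pixels
    Returns the (N, 3) scene points matched to the N object points.
    """
    # Cosine similarity between every object point and every pixel.
    obj = F.normalize(obj_feat, dim=1)
    img = F.normalize(img_feat, dim=1)
    sim = obj @ img.t()        # (N, M) similarity matrix
    idx = sim.argmax(dim=1)    # hard nearest neighbor per object point
    return img_xyz[idx]
```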
- Coupled Iterative Refinement for 6D Multi-Object Pose Estimation [64.7198752089041]
Given a set of known 3D objects and an RGB or RGB-D input image, we detect and estimate the 6D pose of each object.
Our approach iteratively refines both pose and correspondence in a tightly coupled manner, allowing us to dynamically remove outliers to improve accuracy.
arXiv Detail & Related papers (2022-04-26T18:00:08Z)
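To give a feel for what coupled refinement with dynamic outlier removal means, here is an ICP-style NumPy loop that alternates pose fitting and inlier selection; the paper's refiner is learned end-to-end, so this sketch is only an analogy under assumed inputs.

```python
import numpy as np

def refine_pose(obj_pts, scene_pts, iters=10, inlier_thresh=0.05):
    """Toy coupled refinement: alternate pose fitting and outlier rejection.

    obj_pts, scene_pts: (N, 3) putative correspondences (object vs. scene),
    assumed to start from a rough initial alignment.
    """
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        # Residuals of the correspondences under the current pose.
        resid = np.linalg.norm((obj_pts @ R.T + t) - scene_pts, axis=1)
        inliers = resid < inlier_thresh          # dynamically drop outliers
        if inliers.sum() < 3:
            break
        # Re-fit the pose on inliers via the Kabsch/Procrustes solution.
        P, Q = obj_pts[inliers], scene_pts[inliers]
        Pc, Qc = P - P.mean(0), Q - Q.mean(0)
        U, _, Vt = np.linalg.svd(Pc.T @ Qc)
        d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
        t = Q.mean(0) - P.mean(0) @ R.T
    return R, t
```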
- NeRF-Pose: A First-Reconstruct-Then-Regress Approach for Weakly-supervised 6D Object Pose Estimation [44.42449011619408]
We present a weakly-supervised reconstruction-based pipeline, named NeRF-Pose, which needs only 2D object segmentation and known relative camera poses during training.
A NeRF-enabled PnP+RANSAC algorithm is used to estimate stable and accurate poses from the predicted correspondences.
Experiments on LineMod-Occlusion show that the proposed method has state-of-the-art accuracy in comparison to the best 6D pose estimation methods.
arXiv Detail & Related papers (2022-03-09T15:28:02Z)
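The pose-from-correspondences step can be approximated with standard OpenCV PnP+RANSAC, sketched below; the NeRF-enabled variant in the paper differs, so treat this as a plain baseline.

```python
import cv2
import numpy as np

def pose_from_correspondences(pts3d, pts2d, K):
    """Plain PnP+RANSAC on predicted 2D-3D correspondences.

    pts3d: (N, 3) object-space points, pts2d: (N, 2) image points,
    K: (3, 3) camera intrinsics. Returns rotation R and translation t.
    """
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts3d.astype(np.float64), pts2d.astype(np.float64), K, None,
        iterationsCount=100, reprojectionError=3.0,
        flags=cv2.SOLVEPNP_EPNP)
    if not ok:
        raise RuntimeError("PnP failed: too few consistent correspondences")
    R, _ = cv2.Rodrigues(rvec)   # rotation vector -> 3x3 rotation matrix
    return R, tvec.reshape(3)
```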
- 3D Registration for Self-Occluded Objects in Context [66.41922513553367]
We introduce the first deep learning framework capable of effectively handling this scenario.
Our method consists of an instance segmentation module followed by a pose estimation one.
It allows us to perform 3D registration in a one-shot manner, without requiring an expensive iterative procedure.
arXiv Detail & Related papers (2020-11-23T08:05:28Z)
- Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation [62.618227434286]
We present a novel learning approach to recover the 6D poses and sizes of unseen object instances from an RGB-D image.
We propose a deep network to reconstruct the 3D object model by explicitly modeling the deformation from a pre-learned categorical shape prior.
arXiv Detail & Related papers (2020-07-16T16:45:05Z)
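A minimal sketch of the deform-a-prior idea: predict per-point offsets of a categorical shape prior conditioned on an image embedding. The module name and dimensions are hypothetical, not taken from the paper.

```python
import torch
import torch.nn as nn

class PriorDeformer(nn.Module):
    """Toy deformation head: reconstruct an instance by offsetting a
    pre-learned categorical shape prior (illustrative assumption only)."""

    def __init__(self, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3, 128), nn.ReLU(),
            nn.Linear(128, 3))  # per-point 3D offset

    def forward(self, img_feat, prior_pts):
        # img_feat: (B, feat_dim) image embedding; prior_pts: (B, N, 3) prior.
        B, N, _ = prior_pts.shape
        cond = img_feat[:, None, :].expand(B, N, img_feat.shape[1])
        delta = self.mlp(torch.cat([cond, prior_pts], dim=-1))
        return prior_pts + delta  # deformed instance shape
```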
- CPS++: Improving Class-level 6D Pose and Shape Estimation From Monocular Images With Self-Supervised Learning [74.53664270194643]
Modern monocular 6D pose estimation methods can only cope with a handful of object instances.
We propose a novel method for class-level monocular 6D pose estimation, coupled with metric shape retrieval.
We experimentally demonstrate that we can retrieve precise 6D poses and metric shapes from a single RGB image.
arXiv Detail & Related papers (2020-03-12T15:28:13Z)