CPS++: Improving Class-level 6D Pose and Shape Estimation From Monocular
Images With Self-Supervised Learning
- URL: http://arxiv.org/abs/2003.05848v3
- Date: Fri, 11 Sep 2020 10:20:19 GMT
- Authors: Fabian Manhardt and Gu Wang and Benjamin Busam and Manuel Nickel and
Sven Meier and Luca Minciullo and Xiangyang Ji and Nassir Navab
- Abstract summary: Modern monocular 6D pose estimation methods can only cope with a handful of object instances.
We propose a novel method for class-level monocular 6D pose estimation, coupled with metric shape retrieval.
We experimentally demonstrate that we can retrieve precise 6D poses and metric shapes from a single RGB image.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contemporary monocular 6D pose estimation methods can only cope with a
handful of object instances. This naturally hampers possible applications as,
for instance, robots seamlessly integrated in everyday processes necessarily
require the ability to work with hundreds of different objects. To tackle this
problem of immediate practical relevance, we propose a novel method for
class-level monocular 6D pose estimation, coupled with metric shape retrieval.
Unfortunately, acquiring adequate annotations is very time-consuming and
labor-intensive. This is especially true for class-level 6D pose estimation, as one
is required to create a highly detailed reconstruction for all objects and then
annotate each object and scene using these models. To overcome this
shortcoming, we additionally propose the idea of synthetic-to-real domain
transfer for class-level 6D poses by means of self-supervised learning, which
removes the burden of collecting numerous manual annotations. In essence, after
training our proposed method fully supervised with synthetic data, we leverage
recent advances in differentiable rendering to self-supervise the model with
unannotated real RGB-D data, thereby improving inference at test time. We experimentally
demonstrate that we can retrieve precise 6D poses and metric shapes from a
single RGB image.
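The self-supervision stage described above can be sketched in code. The following is an illustrative toy example, not the authors' implementation: it replaces the paper's differentiable rendering term with a simple chamfer distance between the predicted metric shape, transformed by the estimated pose, and points back-projected from an observed depth map. All names, the loss choice, and the optimization of translation only are assumptions made for brevity.

```python
# Hedged sketch of render-and-compare self-supervision on RGB-D data:
# align a predicted shape to observed depth points via a differentiable loss.
import torch

def chamfer(a, b):
    """Symmetric chamfer distance between point sets a (N,3) and b (M,3)."""
    d = torch.cdist(a, b)                    # (N, M) pairwise distances
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

# Toy "predicted" metric shape in the object frame, and scene points that
# stand in for the depth back-projection (identity rotation for simplicity).
torch.manual_seed(0)
model_pts = torch.randn(64, 3)
t_true = torch.tensor([0.1, -0.2, 0.5])
scene_pts = model_pts + t_true

# Refine only the translation here; the paper's self-supervision similarly
# refines pose and shape by backpropagating through a differentiable renderer.
t = torch.zeros(3, requires_grad=True)
opt = torch.optim.Adam([t], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    loss = chamfer(model_pts + t, scene_pts)  # self-supervised alignment loss
    loss.backward()
    opt.step()

print(t.detach())  # moves toward t_true = [0.1, -0.2, 0.5]
```

The key design point mirrored here is that no pose annotation is needed: the supervisory signal comes entirely from agreement between the model's own prediction and the unannotated sensor observation.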
Related papers
- Self-Supervised Geometric Correspondence for Category-Level 6D Object
  Pose Estimation in the Wild
We introduce a self-supervised learning approach trained directly on large-scale real-world object videos for category-level 6D pose estimation in the wild.
Our framework reconstructs the canonical 3D shape of an object category and learns dense correspondences between input images and the canonical shape via surface embedding.
Surprisingly, our method, without any human annotations or simulators, can achieve on-par or even better performance than previous supervised or semi-supervised methods on in-the-wild images.
arXiv Detail & Related papers (2022-10-13T17:19:22Z)
- Imitrob: Imitation Learning Dataset for Training and Evaluating 6D
  Object Pose Estimators
This paper introduces a dataset for training and evaluating methods for 6D pose estimation of hand-held tools in task demonstrations captured by a standard RGB camera.
The dataset contains image sequences of nine different tools and twelve manipulation tasks, with two camera viewpoints, four human subjects, and both left- and right-hand executions.
arXiv Detail & Related papers (2022-09-16T14:43:46Z)
- Unseen Object 6D Pose Estimation: A Benchmark and Baselines
We propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondences network, our method finds corresponding points between an unseen object and a partial view RGBD image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z)
- Occlusion-Aware Self-Supervised Monocular 6D Object Pose Estimation
We propose a novel monocular 6D pose estimation approach by means of self-supervised learning.
We leverage current trends in noisy student training and differentiable rendering to further self-supervise the model.
Our proposed self-supervision outperforms all other methods relying on synthetic data.
arXiv Detail & Related papers (2022-03-19T15:12:06Z)
- OVE6D: Object Viewpoint Encoding for Depth-based 6D Object Pose
  Estimation
We propose a universal framework, called OVE6D, for model-based 6D object pose estimation from a single depth image and a target object mask.
Our model is trained using purely synthetic data rendered from ShapeNet, and, unlike most of the existing methods, it generalizes well on new real-world objects without any fine-tuning.
We show that OVE6D outperforms some contemporary deep learning-based pose estimation methods specifically trained for individual objects or datasets with real-world training data.
arXiv Detail & Related papers (2022-03-02T12:51:33Z)
- Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose
  Estimation
We present a novel approach for scalable 6D pose estimation, by self-supervised learning on synthetic data of multiple objects using a single autoencoder.
We test our method on two multi-object benchmarks with real data, T-LESS and NOCS REAL275, and show it outperforms existing RGB-based methods in terms of pose estimation accuracy and generalization.
arXiv Detail & Related papers (2021-07-27T01:55:30Z)
- 3D Registration for Self-Occluded Objects in Context
We introduce the first deep learning framework capable of effectively handling registration of self-occluded objects.
Our method consists of an instance segmentation module followed by a pose estimation one.
It allows us to perform 3D registration in a one-shot manner, without requiring an expensive iterative procedure.
arXiv Detail & Related papers (2020-11-23T08:05:28Z)
- Self6D: Self-Supervised Monocular 6D Object Pose Estimation
We propose the idea of monocular 6D pose estimation by means of self-supervised learning.
We leverage recent advances in neural rendering to further self-supervise the model on unannotated real RGB-D data.
arXiv Detail & Related papers (2020-04-14T13:16:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.