Self6D: Self-Supervised Monocular 6D Object Pose Estimation
- URL: http://arxiv.org/abs/2004.06468v3
- Date: Mon, 3 Aug 2020 23:47:56 GMT
- Title: Self6D: Self-Supervised Monocular 6D Object Pose Estimation
- Authors: Gu Wang, Fabian Manhardt, Jianzhun Shao, Xiangyang Ji, Nassir Navab,
Federico Tombari
- Abstract summary: We propose the idea of monocular 6D pose estimation by means of self-supervised learning.
We leverage recent advances in neural rendering to further self-supervise the model on unannotated real RGB-D data.
- Score: 114.18496727590481
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 6D object pose estimation is a fundamental problem in computer vision.
Convolutional Neural Networks (CNNs) have recently proven to be capable of
predicting reliable 6D pose estimates even from monocular images. Nonetheless,
CNNs are identified as being extremely data-driven, and acquiring adequate
annotations is oftentimes very time-consuming and labor intensive. To overcome
this shortcoming, we propose the idea of monocular 6D pose estimation by means
of self-supervised learning, removing the need for real annotations. After
training our proposed network fully supervised with synthetic RGB data, we
leverage recent advances in neural rendering to further self-supervise the
model on unannotated real RGB-D data, seeking a visually and geometrically
optimal alignment. Extensive evaluations demonstrate that our proposed
self-supervision is able to significantly enhance the model's original
performance, outperforming all other methods relying on synthetic data or
employing elaborate techniques from the domain adaptation realm.
Related papers
- Hierarchical Graph Neural Networks for Proprioceptive 6D Pose Estimation
of In-hand Objects [1.8263882169310044]
We introduce a hierarchical graph neural network architecture for combining multimodal (vision and touch) data.
We also introduce a hierarchical message passing operation that flows the information within and across modalities to learn a graph-based object representation.
arXiv Detail & Related papers (2023-06-28T01:18:53Z)
- GEO-Bench: Toward Foundation Models for Earth Monitoring [139.77907168809085]
We propose a benchmark comprised of six classification and six segmentation tasks.
This benchmark will be a driver of progress across a variety of Earth monitoring tasks.
arXiv Detail & Related papers (2023-06-06T16:16:05Z)
- FS6D: Few-Shot 6D Pose Estimation of Novel Objects [116.34922994123973]
6D object pose estimation networks are limited in their capability to scale to large numbers of object instances.
In this work, we study a new open-set problem, few-shot 6D object pose estimation: estimating the 6D pose of an unknown object from a few support views without extra training.
arXiv Detail & Related papers (2022-03-28T10:31:29Z)
- Occlusion-Aware Self-Supervised Monocular 6D Object Pose Estimation [88.8963330073454]
We propose a novel monocular 6D pose estimation approach by means of self-supervised learning.
We leverage current trends in noisy student training and differentiable rendering to further self-supervise the model.
Our proposed self-supervision outperforms all other methods relying on synthetic data.
arXiv Detail & Related papers (2022-03-19T15:12:06Z)
- VIPose: Real-time Visual-Inertial 6D Object Pose Tracking [3.44942675405441]
We introduce a novel Deep Neural Network (DNN) called VIPose to address the object pose tracking problem in real-time.
The key contribution is the design of a novel DNN architecture which fuses visual and inertial features to predict the objects' relative 6D pose.
The approach achieves accuracy comparable to state-of-the-art techniques, with the additional benefit of running in real time.
arXiv Detail & Related papers (2021-07-27T06:10:23Z)
- Spatial Attention Improves Iterative 6D Object Pose Estimation [52.365075652976735]
We propose a new method for 6D pose estimation refinement from RGB images.
Our main insight is that after the initial pose estimate, it is important to pay attention to distinct spatial features of the object.
We experimentally show that this approach learns to attend to salient spatial features and learns to ignore occluded parts of the object, leading to better pose estimation across datasets.
arXiv Detail & Related papers (2021-01-05T17:18:52Z)
- se(3)-TrackNet: Data-driven 6D Pose Tracking by Calibrating Image
Residuals in Synthetic Domains [12.71983073907091]
This work proposes a data-driven optimization approach for long-term, 6D pose tracking.
It aims to identify the optimal relative pose given the current RGB-D observation and a synthetic image conditioned on the previous best estimate and the object's model.
The proposed approach achieves consistently robust estimates and outperforms alternatives, even though they have been trained with real images.
arXiv Detail & Related papers (2020-07-27T21:09:36Z)
- CPS++: Improving Class-level 6D Pose and Shape Estimation From Monocular
Images With Self-Supervised Learning [74.53664270194643]
Modern monocular 6D pose estimation methods can only cope with a handful of object instances.
We propose a novel method for class-level monocular 6D pose estimation, coupled with metric shape retrieval.
We experimentally demonstrate that we can retrieve precise 6D poses and metric shapes from a single RGB image.
arXiv Detail & Related papers (2020-03-12T15:28:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.