Introducing Pose Consistency and Warp-Alignment for Self-Supervised 6D
Object Pose Estimation in Color Images
- URL: http://arxiv.org/abs/2003.12344v2
- Date: Fri, 16 Oct 2020 09:49:53 GMT
- Authors: Juil Sock, Guillermo Garcia-Hernando, Anil Armagan, Tae-Kyun Kim
- Abstract summary: Most successful approaches to estimate the 6D pose of an object typically train a neural network by supervising the learning with annotated poses in real world images.
A two-stage 6D object pose estimator framework that can be applied on top of existing neural-network-based approaches is proposed.
- Score: 38.9238085806793
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most successful approaches to estimate the 6D pose of an object typically
train a neural network by supervising the learning with annotated poses in real
world images. These annotations are generally expensive to obtain and a common
workaround is to generate and train on synthetic scenes, with the drawback of
limited generalisation when the model is deployed in the real world. In this
work, a two-stage 6D object pose estimator framework that can be applied on top
of existing neural-network-based approaches and that does not require pose
annotations on real images is proposed. The first self-supervised stage
enforces the pose consistency between rendered predictions and real input
images, narrowing the gap between the two domains. The second stage fine-tunes
the previously trained model by enforcing the photometric consistency between
pairs of different object views, where one image is warped and aligned to match
the view of the other and thus enabling their comparison. In the absence of
both real image annotations and depth information, applying the proposed
framework on top of two recent approaches results in state-of-the-art
performance when compared to methods trained only on synthetic data, domain
adaptation baselines and a concurrent self-supervised approach on LINEMOD,
LINEMOD OCCLUSION and HomebrewedDB datasets.
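The abstract does not give the loss used in the first stage, so the following is only a minimal sketch of what a pose-consistency penalty could look like, not the authors' formulation. It assumes the estimator returns a rotation matrix and a translation vector for both the real image and the rendering of its own prediction, and it penalises the disagreement between the two. The names pose_consistency, rotation_geodesic, rot_z and the weight w_trans are hypothetical and chosen for illustration.

```python
import numpy as np


def rotation_geodesic(R_a, R_b):
    """Angle (radians) of the relative rotation between two 3x3 rotation matrices."""
    cos_theta = (np.trace(R_a.T @ R_b) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))


def pose_consistency(R_real, t_real, R_rend, t_rend, w_trans=1.0):
    """Hypothetical penalty: rotation angle plus weighted translation distance.

    (R_real, t_real): pose predicted from the real image.
    (R_rend, t_rend): pose predicted from a rendering of that prediction.
    """
    rot_err = rotation_geodesic(R_real, R_rend)
    trans_err = float(np.linalg.norm(t_real - t_rend))
    return rot_err + w_trans * trans_err


def rot_z(angle):
    """Rotation about the z-axis, used only to build toy inputs."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Toy example: the two predictions differ by 0.05 rad and 0.1 units of depth.
R_real, t_real = rot_z(0.30), np.array([0.0, 0.0, 1.0])
R_rend, t_rend = rot_z(0.25), np.array([0.0, 0.0, 1.1])
penalty = pose_consistency(R_real, t_real, R_rend, t_rend)
print(penalty)
```

A penalty of zero means the two predictions agree exactly. In a training setting such a term would be computed with a differentiable framework and a differentiable renderer rather than NumPy.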
Related papers
- FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects [55.77542145604758]
FoundationPose is a unified foundation model for 6D object pose estimation and tracking.
Our approach can be instantly applied at test-time to a novel object without fine-tuning.
arXiv Detail & Related papers (2023-12-13T18:28:09Z)
- GS-Pose: Category-Level Object Pose Estimation via Geometric and
Semantic Correspondence [5.500735640045456]
Category-level pose estimation is a challenging task with many potential applications in computer vision and robotics.
We propose to utilize both geometric and semantic features obtained from a pre-trained foundation model.
This requires significantly less data to train than prior methods since the semantic features are robust to object texture and appearance.
arXiv Detail & Related papers (2023-11-23T02:35:38Z)
- Pseudo Flow Consistency for Self-Supervised 6D Object Pose Estimation [14.469317161361202]
We propose a 6D object pose estimation method that can be trained with pure RGB images without any auxiliary information.
We evaluate our method on three challenging datasets and demonstrate that it outperforms state-of-the-art self-supervised methods significantly.
arXiv Detail & Related papers (2023-08-19T13:52:18Z)
- DiffPose: SpatioTemporal Diffusion Model for Video-Based Human Pose
Estimation [16.32910684198013]
We present DiffPose, a novel diffusion architecture that formulates video-based human pose estimation as a conditional heatmap generation problem.
We show two unique characteristics of DiffPose on the pose estimation task: (i) the ability to combine multiple sets of pose estimates to improve prediction accuracy, particularly for challenging joints, and (ii) the ability to adjust the number of iterative steps for feature refinement without retraining the model.
arXiv Detail & Related papers (2023-07-31T14:00:23Z)
- TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose
Estimation [55.94900327396771]
We introduce neural texture learning for 6D object pose estimation from synthetic data.
We learn to predict realistic texture of objects from real image collections.
We learn pose estimation from pixel-perfect synthetic data.
arXiv Detail & Related papers (2022-12-25T13:36:32Z)
- RNNPose: Recurrent 6-DoF Object Pose Refinement with Robust
Correspondence Field Estimation and Pose Optimization [46.144194562841435]
We propose a framework based on a recurrent neural network (RNN) for object pose refinement.
The problem is formulated as a non-linear least squares problem based on the estimated correspondence field.
The correspondence field estimation and pose refinement are conducted alternately in each iteration to recover accurate object poses.
arXiv Detail & Related papers (2022-03-24T06:24:55Z)
- Perspective Flow Aggregation for Data-Limited 6D Object Pose Estimation [121.02948087956955]
For some applications, such as those in space or deep under water, acquiring real images, even unannotated, is virtually impossible.
We propose a method that can be trained solely on synthetic images, or optionally using a few additional real images.
It performs on par with methods that require annotated real images for training when not using any, and outperforms them considerably when using as few as twenty real images.
arXiv Detail & Related papers (2022-03-18T10:20:21Z)
- Deep Bingham Networks: Dealing with Uncertainty and Ambiguity in Pose
Estimation [74.76155168705975]
Deep Bingham Networks (DBN) can handle pose-related uncertainties and ambiguities arising in almost all real life applications concerning 3D data.
DBN extends state-of-the-art direct pose regression networks with (i) a multi-hypotheses prediction head which can yield different distribution modes.
We propose new training strategies so as to avoid mode or posterior collapse during training and to improve numerical stability.
arXiv Detail & Related papers (2020-12-20T19:20:26Z)
- MirrorNet: A Deep Bayesian Approach to Reflective 2D Pose Estimation
from Human Images [42.27703025887059]
The main problem with the standard supervised approach is that it often yields anatomically implausible poses.
We propose a semi-supervised method that can make effective use of images with and without pose annotations.
The results of experiments show that the proposed reflective architecture makes estimated poses anatomically plausible.
arXiv Detail & Related papers (2020-04-08T05:02:48Z)
- Two-shot Spatially-varying BRDF and Shape Estimation [89.29020624201708]
We propose a novel deep learning architecture with a stage-wise estimation of shape and SVBRDF.
We create a large-scale synthetic training dataset with domain-randomized geometry and realistic materials.
Experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.
arXiv Detail & Related papers (2020-04-01T12:56:13Z)