MSDA: Monocular Self-supervised Domain Adaptation for 6D Object Pose
Estimation
- URL: http://arxiv.org/abs/2302.07300v1
- Date: Tue, 14 Feb 2023 19:34:41 GMT
- Title: MSDA: Monocular Self-supervised Domain Adaptation for 6D Object Pose
Estimation
- Authors: Dingding Cai, Janne Heikkilä, Esa Rahtu
- Abstract summary: We propose a self-supervised domain adaptation approach that leverages real RGB(-D) images without requiring labeled 6D poses.
We first pre-train the model with synthetic RGB images and then utilize real RGB(-D) images to fine-tune the pre-trained model.
We experimentally demonstrate that our method achieves performance comparable to its fully-supervised counterpart.
- Score: 12.773040823634908
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Acquiring labeled 6D poses from real images is an expensive and
time-consuming task. Though massive amounts of synthetic RGB images are easy to
obtain, the models trained on them suffer from noticeable performance
degradation due to the synthetic-to-real domain gap. To mitigate this
degradation, we propose a practical self-supervised domain adaptation approach
that takes advantage of real RGB(-D) data without needing real pose labels. We
first pre-train the model with synthetic RGB images and then utilize real
RGB(-D) images to fine-tune the pre-trained model. The fine-tuning process is
self-supervised by an RGB-based pose-aware consistency and a depth-guided
object distance pseudo-label, and it does not require time-consuming online
differentiable rendering. We build our domain adaptation method based on the
recent pose estimator SC6D and evaluate it on the YCB-Video dataset. We
experimentally demonstrate that our method achieves performance comparable
to its fully-supervised counterpart while outperforming existing
state-of-the-art approaches.
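To make the fine-tuning objective concrete, below is a minimal PyTorch-style sketch of how the two self-supervision signals could be combined. The estimator interface (a `model` returning a rotation representation and an object distance), the two-view augmentation, the mask-median pseudo-label, and the loss weights `w_cons`/`w_dist` are illustrative assumptions, not the paper's exact formulation or SC6D's actual API.

```python
import torch.nn.functional as F


def distance_pseudo_label(depth, mask):
    """Depth-guided object distance pseudo-label: median depth value inside
    the detected object mask of the real depth image (hypothetical choice)."""
    return depth[mask > 0].median()


def self_supervised_step(model, rgb, depth, mask, augment, w_cons=1.0, w_dist=1.0):
    """One fine-tuning step on a single real RGB(-D) object crop.

    rgb: (1, 3, H, W) real RGB crop; depth, mask: (H, W) aligned depth and mask.
    model: assumed to return (rotation representation, predicted object distance).
    """
    # RGB-based pose-aware consistency: predictions from two augmented views
    # of the same real image should agree.
    view_a, view_b = augment(rgb), augment(rgb)
    rot_a, dist_a = model(view_a)
    rot_b, dist_b = model(view_b)
    loss_cons = F.mse_loss(rot_a, rot_b) + F.l1_loss(dist_a, dist_b)

    # Depth-guided object distance pseudo-label: supervise the predicted
    # distance with the depth-derived estimate (kept fixed as a target).
    z_pseudo = distance_pseudo_label(depth, mask).detach()
    loss_dist = F.l1_loss(dist_a, z_pseudo.expand_as(dist_a))

    return w_cons * loss_cons + w_dist * loss_dist
```

Because the distance pseudo-label is read directly from the real depth map, no online differentiable rendering is involved in the fine-tuning loop.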
Related papers
- RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images [13.051302134031808]
We introduce a novel method for calculating the 6DoF pose of an object using a single RGB-D image.
Unlike existing methods that either directly predict objects' poses or rely on sparse keypoints for pose recovery, our approach addresses this challenging task using dense correspondence.
arXiv Detail & Related papers (2024-05-14T10:10:45Z) - RGB-based Category-level Object Pose Estimation via Decoupled Metric
Scale Recovery [72.13154206106259]
We propose a novel pipeline that decouples the 6D pose and size estimation to mitigate the influence of imperfect scales on rigid transformations.
Specifically, we leverage a pre-trained monocular estimator to extract local geometric information.
A separate branch is designed to directly recover the metric scale of the object based on category-level statistics.
arXiv Detail & Related papers (2023-09-19T02:20:26Z) - Learning 6D Pose Estimation from Synthetic RGBD Images for Robotic
Applications [0.6299766708197883]
The proposed pipeline can efficiently generate large amounts of photo-realistic RGBD images for the object of interest.
We develop a real-time two-stage 6D pose estimation approach by integrating the object detector YOLO-V4-tiny and the 6D pose estimation algorithm PVN3D.
The resulting network shows competitive performance compared to state-of-the-art methods when evaluated on the LineMod dataset.
arXiv Detail & Related papers (2022-08-30T14:17:15Z) - Unseen Object Instance Segmentation with Fully Test-time RGB-D
Embeddings Adaptation [14.258456366985444]
A popular recent solution leverages RGB-D features learned from large-scale synthetic data and applies the model to unseen real-world scenarios.
We re-emphasize the adaptation process across Sim2Real domains in this paper.
We propose a framework to conduct Fully Test-time RGB-D Embeddings Adaptation (FTEA) based on the parameters of the BatchNorm layers; a generic sketch of this style of adaptation is given after the list below.
arXiv Detail & Related papers (2022-04-21T02:35:20Z) - Occlusion-Aware Self-Supervised Monocular 6D Object Pose Estimation [88.8963330073454]
We propose a novel monocular 6D pose estimation approach by means of self-supervised learning.
We leverage current trends in noisy student training and differentiable rendering to further self-supervise the model.
Our proposed self-supervision outperforms all other methods relying on synthetic data.
arXiv Detail & Related papers (2022-03-19T15:12:06Z) - SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation [98.83762558394345]
SO-Pose is a framework for regressing all 6 degrees-of-freedom (6DoF) for the object pose in a cluttered environment from a single RGB image.
We introduce a novel reasoning about self-occlusion, in order to establish a two-layer representation for 3D objects.
By enforcing cross-layer consistencies that align correspondences, self-occlusion, and 6D pose, we can further improve accuracy and robustness.
arXiv Detail & Related papers (2021-08-18T19:49:29Z) - Robust RGB-based 6-DoF Pose Estimation without Real Pose Annotations [92.5075742765229]
We introduce an approach to robustly and accurately estimate the 6-DoF pose in challenging conditions without using any real pose annotations.
We achieve state-of-the-art performance on LINEMOD and Occluded-LINEMOD in the setting without real pose annotations, even outperforming methods that rely on real annotations during training on Occluded-LINEMOD.
arXiv Detail & Related papers (2020-08-19T12:07:01Z) - Self6D: Self-Supervised Monocular 6D Object Pose Estimation [114.18496727590481]
We propose the idea of monocular 6D pose estimation by means of self-supervised learning.
We leverage recent advances in neural rendering to further self-supervise the model on unannotated real RGB-D data.
arXiv Detail & Related papers (2020-04-14T13:16:36Z) - Introducing Pose Consistency and Warp-Alignment for Self-Supervised 6D
Object Pose Estimation in Color Images [38.9238085806793]
Most successful approaches to estimating the 6D pose of an object typically train a neural network by supervising the learning with annotated poses in real-world images.
We propose a two-stage 6D object pose estimation framework that can be applied on top of existing neural-network-based approaches.
arXiv Detail & Related papers (2020-03-27T11:53:38Z) - CPS++: Improving Class-level 6D Pose and Shape Estimation From Monocular
Images With Self-Supervised Learning [74.53664270194643]
Modern monocular 6D pose estimation methods can only cope with a handful of object instances.
We propose a novel method for class-level monocular 6D pose estimation, coupled with metric shape retrieval.
We experimentally demonstrate that we can retrieve precise 6D poses and metric shapes from a single RGB image.
arXiv Detail & Related papers (2020-03-12T15:28:13Z)
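As mentioned in the FTEA entry above, the following is a minimal, generic sketch of test-time adaptation through BatchNorm parameters: all network weights stay frozen while the BatchNorm layers re-estimate their normalization statistics from unlabeled target-domain (real) batches. This only illustrates the general family of techniques; the exact FTEA procedure may differ, and the function names below are illustrative.

```python
import torch
import torch.nn as nn


def configure_bn_adaptation(model: nn.Module) -> nn.Module:
    """Freeze all weights and let only BatchNorm layers adapt their statistics."""
    model.eval()
    for p in model.parameters():
        p.requires_grad_(False)
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.train()                # use and update batch statistics at test time
            m.reset_running_stats()  # discard the synthetic-domain statistics
            m.momentum = None        # cumulative moving average over target batches
    return model


@torch.no_grad()
def adapt_on_target(model: nn.Module, target_loader) -> nn.Module:
    """One pass over unlabeled real images recalibrates the BN running statistics."""
    for batch in target_loader:
        model(batch)
    return model
```

With `momentum` set to `None`, the running statistics after a single pass equal the cumulative averages over the whole unlabeled target set.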
This list is automatically generated from the titles and abstracts of the papers in this site.