Sim-to-Real 6D Object Pose Estimation via Iterative Self-training for
Robotic Bin-picking
- URL: http://arxiv.org/abs/2204.07049v1
- Date: Thu, 14 Apr 2022 15:54:01 GMT
- Title: Sim-to-Real 6D Object Pose Estimation via Iterative Self-training for
Robotic Bin-picking
- Authors: Kai Chen, Rui Cao, Stephen James, Yichuan Li, Yun-Hui Liu, Pieter
Abbeel, and Qi Dou
- Abstract summary: We propose an iterative self-training framework for sim-to-real 6D object pose estimation to facilitate cost-effective robotic grasping.
We establish a photo-realistic simulator to synthesize abundant virtual data, and use this to train an initial pose estimation network.
This network then takes the role of a teacher model, which generates pose predictions for unlabeled real data.
- Score: 98.5984733963713
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose an iterative self-training framework for
sim-to-real 6D object pose estimation to facilitate cost-effective robotic
grasping. Given a bin-picking scenario, we establish a photo-realistic
simulator to synthesize abundant virtual data, and use this to train an initial
pose estimation network. This network then takes the role of a teacher model,
which generates pose predictions for unlabeled real data. With these
predictions, we further design a comprehensive adaptive selection scheme to
distinguish reliable results, and leverage them as pseudo labels to update a
student model for pose estimation on real data. To continuously improve the
quality of pseudo labels, we iterate the above steps by taking the trained
student model as a new teacher and re-label real data using the refined teacher
model. We evaluate our method on a public benchmark and our newly-released
dataset, achieving an ADD(-S) improvement of 11.49% and 22.62% respectively.
Our method is also able to improve robotic bin-picking success by 19.54%,
demonstrating the potential of iterative sim-to-real solutions for robotic
applications.
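The teacher-student loop described in the abstract (train a teacher on synthetic data, pseudo-label unlabeled real images, keep only the reliable predictions, train a student, then promote the student to teacher and repeat) can be summarized in a short sketch. The code below is a minimal illustration only, not the authors' released implementation: the injected training routine, the (pose, confidence) interface of the pose network, and the single fixed confidence threshold standing in for the paper's comprehensive adaptive selection scheme are all hypothetical assumptions.

from typing import Callable, Iterable, List, Tuple

def select_reliable(
    predictions: Iterable[Tuple[object, object, float]],
    threshold: float,
) -> List[Tuple[object, object]]:
    """Keep only pseudo labels whose confidence exceeds a threshold.

    The paper uses a more comprehensive adaptive selection scheme; a single
    fixed confidence cutoff is used here purely for illustration.
    """
    return [(img, pose) for img, pose, conf in predictions if conf >= threshold]

def iterative_self_training(
    synthetic_data: List[Tuple[object, object]],   # (image, ground-truth pose) from the simulator
    real_images: List[object],                     # unlabeled real bin-picking images
    train: Callable[[List[Tuple[object, object]]], Callable],  # fits a model on (image, pose) pairs
    num_rounds: int = 3,
    confidence_threshold: float = 0.8,
) -> Callable:
    """Sketch of the sim-to-real teacher/student loop described in the abstract.

    A trained model is assumed to be a callable mapping an image to a
    (pose, confidence) pair.
    """
    # 1. Train the initial teacher purely on photo-realistic synthetic data.
    teacher = train(synthetic_data)

    for _ in range(num_rounds):
        # 2. The teacher pseudo-labels the unlabeled real images.
        predictions = [(img, *teacher(img)) for img in real_images]

        # 3. Adaptive selection: keep only reliable pseudo labels.
        pseudo_labeled = select_reliable(predictions, confidence_threshold)

        # 4. Train a student on synthetic data plus the reliable real pseudo labels.
        student = train(synthetic_data + pseudo_labeled)

        # 5. The student becomes the new teacher and re-labels the real data next round.
        teacher = student

    return teacher

In the paper, reliability is judged with an adaptive selection scheme rather than a fixed cutoff, and the real data is re-labeled by the refined teacher at every round, which is what the loop above mirrors.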
Related papers
- In-Simulation Testing of Deep Learning Vision Models in Autonomous Robotic Manipulators [11.389756788049944]
Testing autonomous robotic manipulators is challenging due to the complex software interactions between vision and control components.
A crucial element of modern robotic manipulators is the deep learning based object detection model.
We propose the MARTENS framework, which integrates a photorealistic NVIDIA Isaac Sim simulator with evolutionary search to identify critical scenarios.
arXiv Detail & Related papers (2024-10-25T03:10:42Z)
- Good Grasps Only: A data engine for self-supervised fine-tuning of pose estimation using grasp poses for verification [0.0]
We present a novel method for self-supervised fine-tuning of pose estimation for bin-picking.
Our approach enables the robot to automatically obtain training data without manual labeling.
Our pipeline allows the system to fine-tune while the process is running, removing the need for a learning phase.
arXiv Detail & Related papers (2024-09-17T19:26:21Z)
- Exploring the Effectiveness of Dataset Synthesis: An application of Apple Detection in Orchards [68.95806641664713]
We explore the usability of Stable Diffusion 2.1-base for generating synthetic datasets of apple trees for object detection.
We train a YOLOv5m object detection model to predict apples in a real-world apple detection dataset.
Results demonstrate that the model trained on generated data slightly underperforms a baseline model trained on real-world images.
arXiv Detail & Related papers (2023-06-20T09:46:01Z)
- Markerless Camera-to-Robot Pose Estimation via Self-supervised Sim-to-Real Transfer [26.21320177775571]
We propose an end-to-end pose estimation framework that is capable of online camera-to-robot calibration and a self-supervised training method.
Our framework combines deep learning and geometric vision for solving the robot pose, and the pipeline is fully differentiable.
arXiv Detail & Related papers (2023-02-28T05:55:42Z)
- Towards Precise Model-free Robotic Grasping with Sim-to-Real Transfer Learning [11.470950882435927]
We present an end-to-end robotic grasping network with a grasp.
In physical robotic experiments, our grasping framework grasped single known objects and novel complex-shaped household objects with a success rate of 90.91%.
The proposed grasping framework outperformed two state-of-the-art methods in both known and unknown object robotic grasping.
arXiv Detail & Related papers (2023-01-28T16:57:19Z)
- Self-Distillation for Further Pre-training of Transformers [83.84227016847096]
We propose self-distillation as a regularization for a further pre-training stage.
We empirically validate the efficacy of self-distillation on a variety of benchmark datasets for image and text classification tasks.
arXiv Detail & Related papers (2022-09-30T02:25:12Z)
- Real-to-Sim: Predicting Residual Errors of Robotic Systems with Sparse Data using a Learning-based Unscented Kalman Filter [65.93205328894608]
We learn the residual errors between a dynamics and/or simulator model and the real robot.
We show that with the learned residual errors, we can further close the reality gap between dynamic models, simulations, and actual hardware.
arXiv Detail & Related papers (2022-09-07T15:15:12Z)
- Sim2Real Instance-Level Style Transfer for 6D Pose Estimation [0.4893345190925177]
We introduce a simulation-to-reality (sim2real) instance-level style transfer for 6D pose estimation network training.
Our approach transfers the style of target objects individually, from synthetic to real, without human intervention.
arXiv Detail & Related papers (2022-03-03T23:46:47Z)
- SimAug: Learning Robust Representations from Simulation for Trajectory Prediction [78.91518036949918]
We propose a novel approach to learn robust representations by augmenting the simulation training data.
We show that SimAug achieves promising results on three real-world benchmarks using zero real training data.
arXiv Detail & Related papers (2020-04-04T21:22:01Z)
- CPS++: Improving Class-level 6D Pose and Shape Estimation From Monocular Images With Self-Supervised Learning [74.53664270194643]
Modern monocular 6D pose estimation methods can only cope with a handful of object instances.
We propose a novel method for class-level monocular 6D pose estimation, coupled with metric shape retrieval.
We experimentally demonstrate that we can retrieve precise 6D poses and metric shapes from a single RGB image.
arXiv Detail & Related papers (2020-03-12T15:28:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.