Markerless Camera-to-Robot Pose Estimation via Self-supervised
Sim-to-Real Transfer
- URL: http://arxiv.org/abs/2302.14332v2
- Date: Tue, 21 Mar 2023 03:57:07 GMT
- Title: Markerless Camera-to-Robot Pose Estimation via Self-supervised
Sim-to-Real Transfer
- Authors: Jingpei Lu, Florian Richter, Michael C. Yip
- Abstract summary: We propose an end-to-end pose estimation framework that is capable of online camera-to-robot calibration and a self-supervised training method.
Our framework combines deep learning and geometric vision for solving the robot pose, and the pipeline is fully differentiable.
- Score: 26.21320177775571
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Solving the camera-to-robot pose is a fundamental requirement for
vision-based robot control, and is a process that takes considerable effort and
care to make accurate. Traditional approaches require modification of the
robot via markers, and subsequent deep learning approaches enabled markerless
feature extraction. Mainstream deep learning methods use only synthetic data
and rely on Domain Randomization to bridge the sim-to-real gap, because acquiring
the 3D annotation is labor-intensive. In this work, we go beyond the limitation
of 3D annotations for real-world data. We propose an end-to-end pose estimation
framework that is capable of online camera-to-robot calibration and a
self-supervised training method to scale the training to unlabeled real-world
data. Our framework combines deep learning and geometric vision for solving the
robot pose, and the pipeline is fully differentiable. To train the
Camera-to-Robot Pose Estimation Network (CtRNet), we leverage foreground
segmentation and differentiable rendering for image-level self-supervision. The
pose prediction is rendered into an image, and the image loss against the
input image is back-propagated to train the neural network. Our experimental
results on two public real datasets confirm the effectiveness of our approach
over existing works. We also integrate our framework into a visual servoing
system to demonstrate the promise of real-time precise robot pose estimation
for automation tasks.
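To make the training signal concrete, below is a minimal, hypothetical PyTorch sketch of image-level self-supervision in the spirit of CtRNet: a network predicts the robot pose as camera-frame 3D keypoints, a toy differentiable "renderer" turns them into a soft silhouette, and the mismatch with a foreground segmentation mask of the input image is back-propagated. All names and the Gaussian-splat renderer are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of image-level self-supervision (assumed names, not CtRNet code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoseNet(nn.Module):
    """Toy CNN that regresses the robot pose as N camera-frame 3D keypoints."""
    def __init__(self, n_keypoints=7):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, n_keypoints * 3))
        self.n = n_keypoints

    def forward(self, img):
        return self.backbone(img).view(-1, self.n, 3)

def project(pts, f=320.0, cx=160.0, cy=120.0):
    """Pinhole projection of (B, N, 3) camera-frame points to pixel coordinates."""
    z = pts[..., 2:3].clamp(min=1e-3)
    return torch.cat([f * pts[..., 0:1] / z + cx,
                      f * pts[..., 1:2] / z + cy], dim=-1)

def render_soft_silhouette(px, h=240, w=320, sigma=8.0):
    """Differentiable stand-in renderer: splat one Gaussian blob per keypoint."""
    ys = torch.arange(h, dtype=px.dtype).view(1, 1, h, 1)
    xs = torch.arange(w, dtype=px.dtype).view(1, 1, 1, w)
    d2 = ((xs - px[..., 0].view(*px.shape[:2], 1, 1)) ** 2
          + (ys - px[..., 1].view(*px.shape[:2], 1, 1)) ** 2)
    return torch.exp(-d2 / (2 * sigma ** 2)).max(dim=1).values  # union of blobs

net = PoseNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-4)
img = torch.rand(2, 3, 240, 320)    # unlabeled real images
fg_mask = torch.rand(2, 240, 320)   # from a foreground segmentation model

silhouette = render_soft_silhouette(project(net(img)))
loss = F.mse_loss(silhouette, fg_mask)  # image loss against the input's mask
opt.zero_grad()
loss.backward()                         # gradients flow through the renderer
opt.step()
```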
Related papers
- CtRNet-X: Camera-to-Robot Pose Estimation in Real-world Conditions Using a Single Camera [18.971816395021488]
Markerless pose estimation methods have eliminated the need for time-consuming physical setups for camera-to-robot calibration.
We propose a novel framework capable of estimating the robot pose with partially visible robot manipulators.
arXiv Detail & Related papers (2024-09-16T16:22:43Z)
- HRP: Human Affordances for Robotic Pre-Training [15.92416819748365]
We present a framework for pre-training representations on hand, object, and contact affordances.
We experimentally demonstrate (using 3000+ robot trials) that this affordance pre-training scheme boosts performance by a minimum of 15% on 5 real-world tasks.
arXiv Detail & Related papers (2024-07-26T17:59:52Z)
- Robot Learning with Sensorimotor Pre-training [98.7755895548928]
We present a self-supervised sensorimotor pre-training approach for robotics.
Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens.
We find that sensorimotor pre-training consistently outperforms training from scratch, has favorable scaling properties, and enables transfer across different tasks, environments, and robots.
arXiv Detail & Related papers (2023-06-16T17:58:10Z)
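As a rough illustration of the sensorimotor pre-training idea above, the following hypothetical PyTorch sketch interleaves image, proprioception, and action tokens, masks one token, and trains a Transformer to predict it. Dimensions, tokenizers, and the masked objective are assumptions for exposition, not the RPT architecture.

```python
# Toy Transformer over interleaved sensorimotor tokens (assumed design).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SensorimotorTransformer(nn.Module):
    def __init__(self, d_model=256, n_heads=8, n_layers=4, act_dim=7):
        super().__init__()
        self.embed_image = nn.Linear(512, d_model)     # visual features
        self.embed_proprio = nn.Linear(act_dim, d_model)
        self.embed_action = nn.Linear(act_dim, d_model)
        self.mask_token = nn.Parameter(torch.zeros(d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, act_dim)

    def forward(self, img_feats, proprio, actions, mask_idx):
        # One image, proprioception, and action token per timestep.
        tokens = torch.stack([self.embed_image(img_feats),
                              self.embed_proprio(proprio),
                              self.embed_action(actions)], dim=2)
        B, T, M, D = tokens.shape
        tokens = tokens.reshape(B, T * M, D).clone()
        tokens[:, mask_idx] = self.mask_token   # hide one token
        return self.head(self.encoder(tokens)[:, mask_idx])

model = SensorimotorTransformer()
img_feats = torch.randn(2, 8, 512)   # 8 timesteps of visual features
proprio = torch.randn(2, 8, 7)       # joint angles
actions = torch.randn(2, 8, 7)       # commanded actions
mask_idx = 1 * 3 + 2                 # the action token at timestep 1
pred = model(img_feats, proprio, actions, mask_idx)
loss = F.mse_loss(pred, actions[:, 1])   # predict the masked action
loss.backward()
```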
- Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
arXiv Detail & Related papers (2023-03-02T18:51:38Z)
- Image-based Pose Estimation and Shape Reconstruction for Robot Manipulators and Soft, Continuum Robots via Differentiable Rendering [20.62295718847247]
State estimation from measured data is crucial for robotic applications as autonomous systems rely on sensors to capture the motion and localize in the 3D world.
In this work, we achieve image-based robot pose estimation and shape reconstruction from camera images.
We demonstrate that our method of using geometrical shape primitives can achieve high accuracy in shape reconstruction for a soft continuum robot and pose estimation for a robot manipulator.
arXiv Detail & Related papers (2023-02-27T18:51:29Z)
- Neural Scene Representation for Locomotion on Structured Terrain [56.48607865960868]
We propose a learning-based method to reconstruct the local terrain for a mobile robot traversing urban environments.
Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the method estimates the topography in the robot's vicinity.
We propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement.
arXiv Detail & Related papers (2022-06-16T10:45:17Z)
- Sim-to-Real 6D Object Pose Estimation via Iterative Self-training for Robotic Bin-picking [98.5984733963713]
We propose an iterative self-training framework for sim-to-real 6D object pose estimation to facilitate cost-effective robotic grasping.
We establish a photo-realistic simulator to synthesize abundant virtual data, and use this to train an initial pose estimation network.
This network then takes the role of a teacher model, which generates pose predictions for unlabeled real data.
arXiv Detail & Related papers (2022-04-14T15:54:01Z)
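The teacher-student loop described above might look like the following sketch; function names, models, and optimizer settings are assumptions, not the paper's code.

```python
# Illustrative teacher-student self-training loop for sim-to-real pose estimation.
import copy
import torch
from torch.utils.data import DataLoader, TensorDataset

def train_epoch(model, loader, optimizer):
    """One supervised epoch on (image, pose) pairs."""
    model.train()
    for images, poses in loader:
        loss = torch.nn.functional.mse_loss(model(images), poses)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

def self_training(student, synthetic_loader, real_images, rounds=3):
    optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)
    # 1. Train the initial network purely on synthetic data.
    train_epoch(student, synthetic_loader, optimizer)
    for _ in range(rounds):
        # 2. Freeze a copy of the student as the teacher and let it
        #    generate pose pseudo-labels for the unlabeled real images.
        teacher = copy.deepcopy(student).eval()
        with torch.no_grad():
            pseudo_poses = teacher(real_images)
        pseudo_loader = DataLoader(TensorDataset(real_images, pseudo_poses),
                                   batch_size=32, shuffle=True)
        # 3. Retrain the student on the pseudo-labeled real data; iterate.
        train_epoch(student, pseudo_loader, optimizer)
    return student

# Toy usage with dummy data.
student = torch.nn.Sequential(torch.nn.Flatten(),
                              torch.nn.Linear(3 * 32 * 32, 6))
synthetic = DataLoader(TensorDataset(torch.rand(64, 3, 32, 32),
                                     torch.rand(64, 6)), batch_size=32)
self_training(student, synthetic, real_images=torch.rand(128, 3, 32, 32))
```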
- A Kinematic Bottleneck Approach For Pose Regression of Flexible Surgical Instruments directly from Images [17.32860829016479]
We propose a self-supervised image-based method that exploits, at training time only, the kinematic information provided by the robot.
In order to avoid introducing time-consuming manual annotations, the problem is formulated as an auto-encoder.
Validation of the method was performed on semi-synthetic, phantom and in-vivo datasets, obtained using a flexible robotized endoscope.
arXiv Detail & Related papers (2021-02-28T18:41:18Z)
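As a loose illustration of the auto-encoder formulation above, this hypothetical sketch forces the image through a low-dimensional pose bottleneck that is supervised by the robot's kinematics at training time only; the architecture and loss weighting are assumptions, not the paper's design.

```python
# Auto-encoder with a pose bottleneck (assumed architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoseBottleneckAE(nn.Module):
    def __init__(self, pose_dim=6):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(16, 32, 4, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, pose_dim))          # the kinematic bottleneck
        self.decoder = nn.Sequential(
            nn.Linear(pose_dim, 32 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (32, 8, 8)),
            nn.Upsample(scale_factor=8),
            nn.Conv2d(32, 3, 3, padding=1))

    def forward(self, img):
        pose = self.encoder(img)              # low-dimensional pose code
        return pose, self.decoder(pose)       # reconstruct the input image

model = PoseBottleneckAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
img = torch.rand(2, 3, 64, 64)
robot_pose = torch.rand(2, 6)   # kinematics from the robot, train time only
pose, recon = model(img)
loss = F.mse_loss(recon, img) + F.mse_loss(pose, robot_pose)
opt.zero_grad()
loss.backward()
opt.step()
```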
- Where is my hand? Deep hand segmentation for visual self-recognition in humanoid robots [129.46920552019247]
We propose the use of a Convolutional Neural Network (CNN) to segment the robot hand from an image in an egocentric view.
We fine-tuned the Mask-RCNN network for the specific task of segmenting the hand of the humanoid robot Vizzy.
arXiv Detail & Related papers (2021-02-09T10:34:32Z)
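Fine-tuning a pre-trained Mask R-CNN for a new segmentation class, as described above for the Vizzy hand, typically follows the standard torchvision recipe; the hyperparameters and dummy sample below are illustrative, not the paper's setup.

```python
# Sketch of fine-tuning torchvision's Mask R-CNN for background + hand.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

num_classes = 2  # background + hand
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box and mask heads so they predict our classes.
in_feats = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feats, num_classes)
in_feats_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_feats_mask, 256,
                                                   num_classes)

optimizer = torch.optim.SGD(model.parameters(), lr=5e-3, momentum=0.9)
model.train()
# One illustrative training step on a dummy sample.
images = [torch.rand(3, 480, 640)]
targets = [{
    "boxes": torch.tensor([[100., 120., 300., 360.]]),
    "labels": torch.tensor([1]),
    "masks": torch.zeros(1, 480, 640, dtype=torch.uint8),
}]
losses = model(images, targets)   # dict of detection and mask losses
total = sum(losses.values())
optimizer.zero_grad()
total.backward()
optimizer.step()
```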
- Self-Supervised Linear Motion Deblurring [112.75317069916579]
Deep convolutional neural networks are state-of-the-art for image deblurring.
We present a differentiable reblur model for self-supervised motion deblurring.
Our experiments demonstrate that self-supervised single-image deblurring is feasible.
arXiv Detail & Related papers (2020-02-10T20:15:21Z)
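A minimal sketch of the reblur idea above: the network predicts a sharp image, which is differentiably reblurred and compared against the blurry input. The fixed horizontal motion kernel here is a stand-in assumption; the actual method's reblur model differs.

```python
# Self-supervised deblurring via a differentiable reblur model (toy version).
import torch
import torch.nn as nn
import torch.nn.functional as F

deblur_net = nn.Sequential(               # stand-in for a deblurring CNN
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1))

def reblur(sharp, length=9):
    """Differentiable linear motion blur: depthwise conv with a line kernel."""
    kernel = torch.zeros(3, 1, length, length)
    kernel[:, :, length // 2, :] = 1.0 / length   # horizontal streak
    return F.conv2d(sharp, kernel, padding=length // 2, groups=3)

optimizer = torch.optim.Adam(deblur_net.parameters(), lr=1e-4)
blurry = torch.rand(4, 3, 128, 128)      # unlabeled blurry images
sharp_pred = deblur_net(blurry)          # predicted sharp image
loss = F.mse_loss(reblur(sharp_pred), blurry)  # reblur must match the input
optimizer.zero_grad()
loss.backward()
optimizer.step()
```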