Training Robots without Robots: Deep Imitation Learning for
Master-to-Robot Policy Transfer
- URL: http://arxiv.org/abs/2202.09574v2
- Date: Mon, 26 Feb 2024 10:27:33 GMT
- Title: Training Robots without Robots: Deep Imitation Learning for
Master-to-Robot Policy Transfer
- Authors: Heecheol Kim, Yoshiyuki Ohmura, Akihiko Nagakubo, and Yasuo Kuniyoshi
- Abstract summary: Deep imitation learning is promising for robot manipulation because it only requires demonstration samples.
Existing demonstration methods have deficiencies; bilateral teleoperation requires a complex control scheme and is expensive.
This research proposes a new master-to-robot (M2R) policy transfer system that does not require robots for teaching force feedback-based manipulation tasks.
- Score: 4.318590074766604
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep imitation learning is promising for robot manipulation because it only
requires demonstration samples. In this study, deep imitation learning is
applied to tasks that require force feedback. However, existing demonstration
methods have deficiencies: bilateral teleoperation requires a complex control
scheme and is expensive, and kinesthetic teaching suffers from visual
distractions from human intervention. This research proposes a new
master-to-robot (M2R) policy transfer system that does not require robots for
teaching force feedback-based manipulation tasks. The human directly
demonstrates a task using a controller. This controller matches the kinematic
parameters of the robot arm and uses the same end-effector with force/torque
(F/T) sensors to measure the force feedback. Using this controller, the
operator can feel force feedback without a bilateral system. The proposed
method can overcome domain gaps between the master and robot using gaze-based
imitation learning and a simple calibration method. Furthermore, a Transformer
is applied to infer policy from F/T sensory input. The proposed system was
evaluated on a bottle-cap-opening task that requires force feedback.
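The abstract describes applying a Transformer to infer a policy from F/T sensory input. As a rough illustration only (the paper's actual architecture, dimensions, and training are not given here), the following NumPy sketch runs single-head self-attention over a window of 6-axis force/torque readings and pools the result into an arm command; all weights and sizes are hypothetical stand-ins for a trained model.

```python
import numpy as np

# Hypothetical dimensions: a 6-axis F/T sensor window and a 7-DoF arm command.
SEQ_LEN, FT_DIM, D_MODEL, ACT_DIM = 10, 6, 16, 7

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over the sequence."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores) @ v

# Randomly initialised weights stand in for a trained policy.
W_in = rng.normal(size=(FT_DIM, D_MODEL))
Wq, Wk, Wv = (rng.normal(size=(D_MODEL, D_MODEL)) for _ in range(3))
W_out = rng.normal(size=(D_MODEL, ACT_DIM))

ft_window = rng.normal(size=(SEQ_LEN, FT_DIM))    # recent force/torque readings
h = self_attention(ft_window @ W_in, Wq, Wk, Wv)  # attend over the sensor history
action = h.mean(axis=0) @ W_out                   # pooled features -> arm command
print(action.shape)  # (7,)
```

The attention step lets the policy weight informative moments in the force history (e.g. the onset of contact) rather than treating all timesteps equally.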
Related papers
- Human-Agent Joint Learning for Efficient Robot Manipulation Skill Acquisition [48.65867987106428]
We introduce a novel system for joint learning between human operators and robots.
It enables human operators to share control of a robot end-effector with a learned assistive agent.
It reduces the need for human adaptation while ensuring the collected data is of sufficient quality for downstream tasks.
arXiv Detail & Related papers (2024-06-29T03:37:29Z)
- Learning Variable Compliance Control From a Few Demonstrations for Bimanual Robot with Haptic Feedback Teleoperation System [5.497832119577795]
Learning dexterous, contact-rich manipulation tasks with rigid robots is a significant challenge in robotics.
Compliance control schemes have been introduced to mitigate these issues by controlling forces via external sensors.
Learning from Demonstrations offers an intuitive alternative, allowing robots to learn manipulations through observed actions.
arXiv Detail & Related papers (2024-06-21T09:03:37Z)
- A Framework for Learning from Demonstration with Minimal Human Effort [11.183124892686239]
We consider robot learning in the context of shared autonomy, where control of the system can switch between a human teleoperator and autonomous control.
In this setting we address reinforcement learning, and learning from demonstration, where there is a cost associated with human time.
We show that our approach to controller selection reduces the human cost to perform two simulated tasks and a single real-world task.
arXiv Detail & Related papers (2023-06-15T15:49:37Z)
- Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
arXiv Detail & Related papers (2023-03-02T18:51:38Z)
- NeRF in the Palm of Your Hand: Corrective Augmentation for Robotics via Novel-View Synthesis [50.93065653283523]
SPARTN (Synthetic Perturbations for Augmenting Robot Trajectories via NeRF) is a fully-offline data augmentation scheme for improving robot policies.
Our approach leverages neural radiance fields (NeRFs) to synthetically inject corrective noise into visual demonstrations.
In a simulated 6-DoF visual grasping benchmark, SPARTN improves success rates by 2.8× over imitation learning without the corrective augmentations.
arXiv Detail & Related papers (2023-01-18T23:25:27Z)
- Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance [71.36749876465618]
We describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks.
Our system includes a framework for users to define a final task and intermediate sub-tasks with image examples.
We present experimental results with a four-finger robotic hand learning multi-stage object manipulation tasks directly in the real world.
arXiv Detail & Related papers (2022-12-19T22:50:40Z)
- Learning Reward Functions for Robotic Manipulation by Observing Humans [92.30657414416527]
We use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-agnostic reward function for robotic manipulation policies.
The learned rewards are based on distances to a goal in an embedding space learned using a time-contrastive objective.
arXiv Detail & Related papers (2022-11-16T16:26:48Z)
- REvolveR: Continuous Evolutionary Models for Robot-to-robot Policy Transfer [57.045140028275036]
We consider the problem of transferring a policy across two different robots with significantly different parameters such as kinematics and morphology.
Existing approaches that train a new policy by matching the action or state transition distribution, including imitation learning methods, fail because the optimal action and/or state distributions are mismatched between the robots.
We propose REvolveR, a novel method that uses continuous evolutionary models for robotic policy transfer, implemented in a physics simulator.
arXiv Detail & Related papers (2022-02-10T18:50:25Z)
- Transformer-based deep imitation learning for dual-arm robot manipulation [5.3022775496405865]
In a dual-arm manipulation setup, the increased number of state dimensions caused by the additional robot manipulators causes distractions.
We address this issue using a self-attention mechanism that computes dependencies between elements in a sequential input and focuses on important elements.
A Transformer, a variant of self-attention architecture, is applied to deep imitation learning to solve dual-arm manipulation tasks in the real world.
arXiv Detail & Related papers (2021-08-01T07:42:39Z)
- Learning of Long-Horizon Sparse-Reward Robotic Manipulator Tasks with Base Controllers [26.807673929816026]
We propose a method of learning long-horizon sparse-reward tasks utilizing one or more traditional base controllers.
Our algorithm incorporates the existing base controllers into stages of exploration, value learning, and policy update.
Our method bears the potential of leveraging existing industrial robot manipulation systems to build more flexible and intelligent controllers.
arXiv Detail & Related papers (2020-11-24T14:23:57Z)
- Learning Force Control for Contact-rich Manipulation Tasks with Rigid Position-controlled Robots [9.815369993136512]
We propose a learning-based force control framework combining RL techniques with traditional force control.
Within said control scheme, we implemented two different conventional approaches to achieve force control with position-controlled robots.
Finally, we developed a fail-safe mechanism for safely training an RL agent on manipulation tasks using a real rigid robot manipulator.
arXiv Detail & Related papers (2020-03-02T01:58:03Z)
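The last entry above combines learning with conventional force control on position-controlled robots. One classical building block for that combination is an admittance-style law that nudges the position command until the measured contact force tracks a setpoint. The sketch below is a hypothetical 1-D toy, not the paper's framework; the gains, setpoint, and spring-surface environment are all illustrative assumptions.

```python
import numpy as np

# Hypothetical 1-D admittance law for a position-controlled robot:
# shift the commanded position in proportion to the force error so the
# measured contact force converges to a desired value.
def admittance_step(x_cmd, f_meas, f_des=5.0, kp=0.002, dt=0.01):
    """Return the updated position command given the measured force."""
    return x_cmd + kp * (f_des - f_meas) * dt

# Toy environment: a linear spring surface, f = k * penetration depth.
k_env, x_surface = 1000.0, 0.0
x = 0.001  # start slightly inside the surface
for _ in range(2000):
    f = max(0.0, k_env * (x - x_surface))  # simulated F/T reading
    x = admittance_step(x, f)              # adjust the position command
print(round(f, 2))  # converges toward the 5.0 N setpoint
```

The loop converges geometrically (ratio 1 − kp·dt·k_env here), which is why rigid position-controlled robots can track a force setpoint without a torque-controlled joint interface.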
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.