SafeAPT: Safe Simulation-to-Real Robot Learning using Diverse Policies Learned in Simulation
- URL: http://arxiv.org/abs/2201.13248v1
- Date: Thu, 27 Jan 2022 16:40:36 GMT
- Title: SafeAPT: Safe Simulation-to-Real Robot Learning using Diverse Policies Learned in Simulation
- Authors: Rituraj Kaushik, Karol Arndt and Ville Kyrki
- Abstract summary: Policy learned in the simulation may not always generate a safe behaviour on the real robot.
In this work, we introduce a novel learning algorithm called SafeAPT that leverages a diverse repertoire of policies evolved in the simulation.
We show that SafeAPT finds high-performance policies within a few minutes in the real world while minimizing safety violations during the interactions.
- Score: 12.778412161239466
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The framework of simulation-to-real learning, i.e., learning policies in
simulation and transferring those policies to the real world, is one of the most
promising approaches towards data-efficient learning in robotics. However, due
to the inevitable reality gap between the simulation and the real world, a
policy learned in the simulation may not always generate a safe behaviour on
the real robot. As a result, during adaptation of the policy in the real world,
the robot may damage itself or cause harm to its surroundings. In this work, we
introduce a novel learning algorithm called SafeAPT that leverages a diverse
repertoire of policies evolved in the simulation and transfers the most
promising safe policy to the real robot through episodic interaction. To
achieve this, SafeAPT iteratively learns a probabilistic reward model as well
as a safety model using real-world observations combined with simulated
experiences as priors. Then, it performs Bayesian optimization on the
repertoire with the reward model while maintaining the specified safety
constraint using the safety model. SafeAPT allows a robot to adapt to a wide
range of goals safely with the same repertoire of policies evolved in the
simulation. We compare SafeAPT with several baselines, both in simulated and
real robotic experiments and show that SafeAPT finds high-performance policies
within a few minutes in the real world while minimizing safety violations
during the interactions.
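The abstract describes SafeAPT's core loop: learn a probabilistic reward model and a safety model from real-world observations with simulated experiences as priors, then run Bayesian optimization over the policy repertoire under a safety constraint. The paper's implementation details are not given here, so the following is a minimal sketch under assumptions: Gaussian-process surrogates over policy descriptors whose prior mean comes from simulation, a UCB acquisition for reward, and a hard safety constraint enforced through a lower confidence bound. All names, kernels, and numeric choices are illustrative, not taken from the paper.

```python
import numpy as np

def rbf(X, Y, length_scale=1.0):
    """Squared-exponential kernel between the row vectors of X and Y."""
    sq = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq / length_scale**2)

class GPWithSimPrior:
    """Tiny GP regressor whose prior mean is the simulated prediction,
    so real-world data only needs to account for the reality gap."""
    def __init__(self, sim_prior, length_scale=0.2, noise=1e-2):
        self.m = sim_prior          # callable: descriptors -> sim estimate
        self.ls = length_scale
        self.noise = noise
        self.X = None

    def fit(self, X, y):
        self.X, self.y = X, y
        K = rbf(X, X, self.ls) + self.noise * np.eye(len(X))
        self.Kinv = np.linalg.inv(K)

    def predict(self, Xs):
        if self.X is None:          # no real data yet: simulation prior only
            return self.m(Xs), np.ones(len(Xs))
        Ks = rbf(Xs, self.X, self.ls)
        mu = self.m(Xs) + Ks @ self.Kinv @ (self.y - self.m(self.X))
        var = 1.0 - np.sum((Ks @ self.Kinv) * Ks, axis=1)
        return mu, np.sqrt(np.maximum(var, 1e-9))

def select_policy(repertoire, reward_gp, safety_gp, beta=2.0, threshold=0.0):
    """One acquisition step: maximize the reward UCB over policies whose
    safety lower confidence bound clears the specified threshold."""
    mu_r, sd_r = reward_gp.predict(repertoire)
    mu_s, sd_s = safety_gp.predict(repertoire)
    safe = (mu_s - beta * sd_s) >= threshold
    if not safe.any():              # fall back to the most plausibly safe policy
        return int(np.argmax(mu_s - beta * sd_s))
    ucb = np.where(safe, mu_r + beta * sd_r, -np.inf)
    return int(np.argmax(ucb))
```

In an episodic loop, the robot would execute the selected policy, append the observed reward and safety margin to the datasets, and refit both surrogates before the next selection.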
Related papers
- Don't Let Your Robot be Harmful: Responsible Robotic Manipulation [57.70648477564976]
Unthinking execution of human instructions in robotic manipulation can lead to severe safety risks.
We present Safety-as-policy, which includes (i) a world model to automatically generate scenarios containing safety risks and conduct virtual interactions, and (ii) a mental model to infer consequences with reflections.
We show that Safety-as-policy can avoid risks and efficiently complete tasks in both synthetic dataset and real-world experiments.
arXiv Detail & Related papers (2024-11-27T12:27:50Z)
- TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction [25.36756787147331]
Learning in simulation and transferring the learned policy to the real world has the potential to enable generalist robots.
We propose a data-driven approach to enable successful sim-to-real transfer based on a human-in-the-loop framework.
We show that our approach can achieve successful sim-to-real transfer in complex and contact-rich manipulation tasks such as furniture assembly.
arXiv Detail & Related papers (2024-05-16T17:59:07Z)
- Evaluating Real-World Robot Manipulation Policies in Simulation [91.55267186958892]
Control and visual disparities between real and simulated environments are key challenges for reliable simulated evaluation.
We propose approaches for mitigating these gaps without needing to craft full-fidelity digital twins of real-world environments.
We create SIMPLER, a collection of simulated environments for manipulation policy evaluation on common real robot setups.
arXiv Detail & Related papers (2024-05-09T17:30:16Z)
- Robust Visual Sim-to-Real Transfer for Robotic Manipulation [79.66851068682779]
Learning visuomotor policies in simulation is much safer and cheaper than in the real world.
However, due to discrepancies between the simulated and real data, simulator-trained policies often fail when transferred to real robots.
One common approach to bridging the visual sim-to-real domain gap is domain randomization (DR).
arXiv Detail & Related papers (2023-07-28T05:47:24Z)
- Residual Physics Learning and System Identification for Sim-to-real Transfer of Policies on Buoyancy Assisted Legged Robots [14.760426243769308]
In this work, we demonstrate robust sim-to-real transfer of control policies on the BALLU robots via system identification.
Rather than relying on standard supervised learning formulations, we utilize deep reinforcement learning to train an external force policy.
We analyze the improved simulation fidelity by comparing the simulation trajectories against the real-world ones.
arXiv Detail & Related papers (2023-03-16T18:49:05Z)
- DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality [64.51295032956118]
We train a policy that can perform robust dexterous manipulation on an anthropomorphic robot hand.
Our work reaffirms the possibilities of sim-to-real transfer for dexterous manipulation in diverse kinds of hardware and simulator setups.
arXiv Detail & Related papers (2022-10-25T01:51:36Z)
- Sim-to-Lab-to-Real: Safe Reinforcement Learning with Shielding and Generalization Guarantees [7.6347172725540995]
Safety is a critical component of autonomous systems and remains a challenge for learning-based policies to be utilized in the real world.
We propose Sim-to-Lab-to-Real to bridge the reality gap with a probabilistically guaranteed safety-aware policy distribution.
arXiv Detail & Related papers (2022-01-20T18:41:01Z)
- Robot Learning from Randomized Simulations: A Review [59.992761565399185]
Deep learning has caused a paradigm shift in robotics research, favoring methods that require large amounts of data.
State-of-the-art approaches learn in simulation where data generation is fast as well as inexpensive.
We focus on a technique named 'domain randomization' which is a method for learning from randomized simulations.
arXiv Detail & Related papers (2021-11-01T13:55:41Z)
- TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors [74.67698916175614]
We propose TrafficSim, a multi-agent behavior model for realistic traffic simulation.
In particular, we leverage an implicit latent variable model to parameterize a joint actor policy.
We show TrafficSim generates significantly more realistic and diverse traffic scenarios as compared to a diverse set of baselines.
arXiv Detail & Related papers (2021-01-17T00:29:30Z)
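Two of the related papers above rely on domain randomization: sampling a fresh simulator configuration each training episode so the policy cannot overfit to one fixed simulator instance. A minimal sketch of that idea, with parameter names and ranges that are purely illustrative and not taken from any of the papers:

```python
import random

def randomized_sim_params(rng):
    """Sample one simulator configuration per training episode.
    Parameter names and ranges are illustrative placeholders."""
    return {
        "mass_kg": rng.uniform(0.8, 1.2),
        "friction": rng.uniform(0.5, 1.5),
        "motor_gain": rng.uniform(0.9, 1.1),
        "light_intensity": rng.uniform(0.3, 1.0),
    }

# Each episode would reset the simulator with freshly sampled parameters,
# forcing the learned policy to be robust across the sampled range.
```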
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.