SafeAPT: Safe Simulation-to-Real Robot Learning using Diverse Policies Learned in Simulation
- URL: http://arxiv.org/abs/2201.13248v1
- Date: Thu, 27 Jan 2022 16:40:36 GMT
- Title: SafeAPT: Safe Simulation-to-Real Robot Learning using Diverse Policies Learned in Simulation
- Authors: Rituraj Kaushik, Karol Arndt and Ville Kyrki
- Abstract summary: Policy learned in the simulation may not always generate a safe behaviour on the real robot.
In this work, we introduce a novel learning algorithm called SafeAPT that leverages a diverse repertoire of policies evolved in the simulation.
We show that SafeAPT finds high-performance policies within a few minutes in the real world while minimizing safety violations during the interactions.
- Score: 12.778412161239466
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The framework of simulation-to-real learning, i.e., learning policies in
simulation and transferring those policies to the real world, is one of the most
promising approaches towards data-efficient learning in robotics. However, due
to the inevitable reality gap between the simulation and the real world, a
policy learned in the simulation may not always generate a safe behaviour on
the real robot. As a result, during adaptation of the policy in the real world,
the robot may damage itself or cause harm to its surroundings. In this work, we
introduce a novel learning algorithm called SafeAPT that leverages a diverse
repertoire of policies evolved in the simulation and transfers the most
promising safe policy to the real robot through episodic interaction. To
achieve this, SafeAPT iteratively learns a probabilistic reward model as well
as a safety model using real-world observations combined with simulated
experiences as priors. Then, it performs Bayesian optimization on the
repertoire with the reward model while maintaining the specified safety
constraint using the safety model. SafeAPT allows a robot to adapt to a wide
range of goals safely with the same repertoire of policies evolved in the
simulation. We compare SafeAPT with several baselines, both in simulated and
real robotic experiments and show that SafeAPT finds high-performance policies
within a few minutes in the real world while minimizing safety violations
during the interactions.
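The abstract describes SafeAPT's core loop: learn a probabilistic reward model and a safety model from real-world observations with simulated experiences as priors, then run Bayesian optimization over the policy repertoire under a safety constraint. The paper's implementation details are not given here, so the following is a minimal sketch under assumptions: Gaussian-process surrogates over policy descriptors whose prior mean comes from simulation, a UCB acquisition for reward, and a hard safety constraint enforced through a lower confidence bound. All names, kernels, and numeric choices are illustrative, not taken from the paper.

```python
import numpy as np

def rbf(X, Y, length_scale=1.0):
    """Squared-exponential kernel between the row vectors of X and Y."""
    sq = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq / length_scale**2)

class GPWithSimPrior:
    """Tiny GP regressor whose prior mean is the simulated prediction,
    so real-world data only needs to account for the reality gap."""
    def __init__(self, sim_prior, length_scale=0.2, noise=1e-2):
        self.m = sim_prior          # callable: descriptors -> sim estimate
        self.ls = length_scale
        self.noise = noise
        self.X = None

    def fit(self, X, y):
        self.X, self.y = X, y
        K = rbf(X, X, self.ls) + self.noise * np.eye(len(X))
        self.Kinv = np.linalg.inv(K)

    def predict(self, Xs):
        if self.X is None:          # no real data yet: simulation prior only
            return self.m(Xs), np.ones(len(Xs))
        Ks = rbf(Xs, self.X, self.ls)
        mu = self.m(Xs) + Ks @ self.Kinv @ (self.y - self.m(self.X))
        var = 1.0 - np.sum((Ks @ self.Kinv) * Ks, axis=1)
        return mu, np.sqrt(np.maximum(var, 1e-9))

def select_policy(repertoire, reward_gp, safety_gp, beta=2.0, threshold=0.0):
    """One acquisition step: maximize the reward UCB over policies whose
    safety lower confidence bound clears the specified threshold."""
    mu_r, sd_r = reward_gp.predict(repertoire)
    mu_s, sd_s = safety_gp.predict(repertoire)
    safe = (mu_s - beta * sd_s) >= threshold
    if not safe.any():              # fall back to the most plausibly safe policy
        return int(np.argmax(mu_s - beta * sd_s))
    ucb = np.where(safe, mu_r + beta * sd_r, -np.inf)
    return int(np.argmax(ucb))
```

In an episodic loop, the robot would execute the selected policy, append the observed reward and safety margin to the datasets, and refit both surrogates before the next selection.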
Related papers
- Don't Let Your Robot be Harmful: Responsible Robotic Manipulation [57.70648477564976]
Unthinking execution of human instructions in robotic manipulation can lead to severe safety risks.
We present Safety-as-policy, which includes (i) a world model to automatically generate scenarios containing safety risks and conduct virtual interactions, and (ii) a mental model to infer consequences with reflections.
We show that Safety-as-policy can avoid risks and efficiently complete tasks in both synthetic dataset and real-world experiments.
arXiv Detail & Related papers (2024-11-27T12:27:50Z)
- TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction [25.36756787147331]
Learning in simulation and transferring the learned policy to the real world has the potential to enable generalist robots.
We propose a data-driven approach to enable successful sim-to-real transfer based on a human-in-the-loop framework.
We show that our approach can achieve successful sim-to-real transfer in complex and contact-rich manipulation tasks such as furniture assembly.
arXiv Detail & Related papers (2024-05-16T17:59:07Z)
- Evaluating Real-World Robot Manipulation Policies in Simulation [91.55267186958892]
Control and visual disparities between real and simulated environments are key challenges for reliable simulated evaluation.
We propose approaches for mitigating these gaps without needing to craft full-fidelity digital twins of real-world environments.
We create SIMPLER, a collection of simulated environments for manipulation policy evaluation on common real robot setups.
arXiv Detail & Related papers (2024-05-09T17:30:16Z)
- Robust Visual Sim-to-Real Transfer for Robotic Manipulation [79.66851068682779]
Learning visuomotor policies in simulation is much safer and cheaper than in the real world.
However, due to discrepancies between the simulated and real data, simulator-trained policies often fail when transferred to real robots.
One common approach to bridging the visual sim-to-real domain gap is domain randomization (DR).
arXiv Detail & Related papers (2023-07-28T05:47:24Z)
- Residual Physics Learning and System Identification for Sim-to-real Transfer of Policies on Buoyancy Assisted Legged Robots [14.760426243769308]
In this work, we demonstrate robust sim-to-real transfer of control policies on the BALLU robots via system identification.
Rather than relying on standard supervised learning formulations, we utilize deep reinforcement learning to train an external force policy.
We analyze the improved simulation fidelity by comparing the simulation trajectories against the real-world ones.
arXiv Detail & Related papers (2023-03-16T18:49:05Z)
- DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality [64.51295032956118]
We train a policy that can perform robust dexterous manipulation on an anthropomorphic robot hand.
Our work reaffirms the possibilities of sim-to-real transfer for dexterous manipulation in diverse kinds of hardware and simulator setups.
arXiv Detail & Related papers (2022-10-25T01:51:36Z)
- Sim-to-Lab-to-Real: Safe Reinforcement Learning with Shielding and Generalization Guarantees [7.6347172725540995]
Safety is a critical component of autonomous systems and remains a challenge for learning-based policies to be utilized in the real world.
We propose Sim-to-Lab-to-Real to bridge the reality gap with a probabilistically guaranteed safety-aware policy distribution.
arXiv Detail & Related papers (2022-01-20T18:41:01Z)
- Robot Learning from Randomized Simulations: A Review [59.992761565399185]
Deep learning has caused a paradigm shift in robotics research, favoring methods that require large amounts of data.
State-of-the-art approaches learn in simulation where data generation is fast as well as inexpensive.
We focus on a technique named 'domain randomization' which is a method for learning from randomized simulations.
arXiv Detail & Related papers (2021-11-01T13:55:41Z)
- TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors [74.67698916175614]
We propose TrafficSim, a multi-agent behavior model for realistic traffic simulation.
In particular, we leverage an implicit latent variable model to parameterize a joint actor policy.
We show TrafficSim generates significantly more realistic and diverse traffic scenarios as compared to a diverse set of baselines.
arXiv Detail & Related papers (2021-01-17T00:29:30Z)
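Two of the related papers above rely on domain randomization: sampling a fresh simulator configuration each training episode so the policy cannot overfit to one fixed simulator instance. A minimal sketch of that idea, with parameter names and ranges that are purely illustrative and not taken from any of the papers:

```python
import random

def randomized_sim_params(rng):
    """Sample one simulator configuration per training episode.
    Parameter names and ranges are illustrative placeholders."""
    return {
        "mass_kg": rng.uniform(0.8, 1.2),
        "friction": rng.uniform(0.5, 1.5),
        "motor_gain": rng.uniform(0.9, 1.1),
        "light_intensity": rng.uniform(0.3, 1.0),
    }

# Each episode would reset the simulator with freshly sampled parameters,
# forcing the learned policy to be robust across the sampled range.
```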
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.