SPiDR: A Simple Approach for Zero-Shot Safety in Sim-to-Real Transfer
- URL: http://arxiv.org/abs/2509.18648v4
- Date: Tue, 21 Oct 2025 19:23:04 GMT
- Title: SPiDR: A Simple Approach for Zero-Shot Safety in Sim-to-Real Transfer
- Authors: Yarden As, Chengrui Qu, Benjamin Unger, Dongho Kang, Max van der Hart, Laixi Shi, Stelian Coros, Adam Wierman, Andreas Krause
- Abstract summary: We propose SPiDR, short for Sim-to-real via Pessimistic Domain Randomization. SPiDR is a scalable algorithm with provable guarantees for safe sim-to-real transfer. We demonstrate that SPiDR effectively ensures safety despite the sim-to-real gap while maintaining strong performance.
- Score: 60.19411648245077
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deploying reinforcement learning (RL) safely in the real world is challenging, as policies trained in simulators must face the inevitable sim-to-real gap. Robust safe RL techniques are provably safe but difficult to scale, while domain randomization is more practical yet prone to unsafe behaviors. We address this gap by proposing SPiDR, short for Sim-to-real via Pessimistic Domain Randomization -- a scalable algorithm with provable guarantees for safe sim-to-real transfer. SPiDR uses domain randomization to incorporate the uncertainty about the sim-to-real gap into the safety constraints, making it versatile and highly compatible with existing training pipelines. Through extensive experiments on sim-to-sim benchmarks and two distinct real-world robotic platforms, we demonstrate that SPiDR effectively ensures safety despite the sim-to-real gap while maintaining strong performance.
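The abstract does not include pseudocode; a minimal toy sketch of the pessimistic-domain-randomization idea it describes (all names and the toy dynamics are hypothetical, not the authors' implementation) might look like:

```python
import random

def rollout_cost(policy, dynamics, horizon=100):
    """Toy stand-in for a simulator rollout: returns accumulated safety cost."""
    state, cost = 0.0, 0.0
    for _ in range(horizon):
        action = policy(state)
        # dynamics["friction"] perturbs the transition; cost accrues past a limit
        state += action * dynamics["friction"]
        cost += max(0.0, abs(state) - 1.0)  # safety constraint: |state| <= 1
    return cost

def pessimistic_constraint_cost(policy, n_domains=32, quantile=0.9):
    """Sample randomized dynamics and take an upper quantile of the cost, so the
    constraint is enforced pessimistically over the sim-to-real uncertainty."""
    costs = sorted(
        rollout_cost(policy, {"friction": random.uniform(0.5, 1.5)})
        for _ in range(n_domains)
    )
    return costs[int(quantile * (n_domains - 1))]

# A policy is treated as feasible only if its pessimistic cost meets the budget:
policy = lambda s: -0.5 * s  # simple stabilizing controller
budget = 0.1
feasible = pessimistic_constraint_cost(policy) <= budget
```

The key point is that the constraint estimate is taken over a distribution of randomized simulators rather than a single nominal one, which is how domain-randomization uncertainty enters the safety constraint.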
Related papers
- Towards provable probabilistic safety for scalable embodied AI systems [79.31011047593492]
Embodied AI systems are increasingly prevalent across various applications. Ensuring their safety in complex operating environments remains a major challenge. This Perspective offers a pathway toward safer, large-scale adoption of embodied AI systems in safety-critical applications.
arXiv Detail & Related papers (2025-06-05T15:46:25Z) - Neural Fidelity Calibration for Informative Sim-to-Real Adaptation [10.117298045153564]
Deep reinforcement learning can seamlessly transfer agile locomotion and navigation skills from the simulator to the real world. However, bridging the sim-to-real gap with domain randomization or adversarial methods often demands expert physics knowledge to ensure policy robustness. We propose Neural Fidelity Calibration (NFC), a novel framework that employs conditional score-based diffusion models to calibrate simulator physical coefficients and residual fidelity domains online during robot execution.
arXiv Detail & Related papers (2025-04-11T15:12:12Z) - Safe Continual Domain Adaptation after Sim2Real Transfer of Reinforcement Learning Policies in Robotics [3.7491742648742568]
Domain randomization is a technique to facilitate the transfer of policies from simulation to real-world robotic applications. We propose a method to enable safe deployment-time policy adaptation in real-world robot control.
arXiv Detail & Related papers (2025-03-13T23:28:11Z) - ReGentS: Real-World Safety-Critical Driving Scenario Generation Made Stable [88.08120417169971]
Machine learning based autonomous driving systems often face challenges with safety-critical scenarios that are rare in real-world data.
This work explores generating safety-critical driving scenarios by modifying complex real-world regular scenarios through trajectory optimization.
Our approach addresses unrealistic diverging trajectories and unavoidable collision scenarios that are not useful for training a robust planner.
arXiv Detail & Related papers (2024-09-12T08:26:33Z) - Leveraging Approximate Model-based Shielding for Probabilistic Safety Guarantees in Continuous Environments [63.053364805943026]
We extend the approximate model-based shielding framework to the continuous setting.
In particular we use Safety Gym as our test-bed, allowing for a more direct comparison of AMBS with popular constrained RL algorithms.
arXiv Detail & Related papers (2024-02-01T17:55:08Z) - SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries [94.84458417662407]
We introduce SAFE-SIM, a controllable closed-loop safety-critical simulation framework.
Our approach yields two distinct advantages: 1) generating realistic long-tail safety-critical scenarios that closely reflect real-world conditions, and 2) providing controllable adversarial behavior for more comprehensive and interactive evaluations.
We validate our framework empirically using the nuScenes and nuPlan datasets across multiple planners, demonstrating improvements in both realism and controllability.
arXiv Detail & Related papers (2023-12-31T04:14:43Z) - A Conservative Approach for Few-Shot Transfer in Off-Dynamics Reinforcement Learning [3.1515473193934778]
Off-dynamics Reinforcement Learning seeks to transfer a policy from a source environment to a target environment characterized by distinct yet similar dynamics.
We propose an innovative approach inspired by recent advancements in Imitation Learning and conservative RL algorithms.
arXiv Detail & Related papers (2023-12-24T13:09:08Z) - Scaling #DNN-Verification Tools with Efficient Bound Propagation and Parallel Computing [57.49021927832259]
Deep Neural Networks (DNNs) are powerful tools that have shown extraordinary results in many scenarios.
However, their intricate designs and lack of transparency raise safety concerns when applied in real-world applications.
Formal Verification (FV) of DNNs has emerged as a valuable solution to provide provable guarantees on the safety aspect.
arXiv Detail & Related papers (2023-12-10T13:51:25Z) - A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.
The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns.
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
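The summary above describes a safety critic that predicts the probability of constraint violation and discounts a reward critic. A toy numeric sketch of that multiplicative combination (hypothetical values, not the paper's learned critics) might look like:

```python
def multiplicative_value(q_reward, p_violation):
    """Toy multiplicative value: the reward estimate is discounted by the
    probability of staying safe (1 - p_violation), as the summary describes."""
    assert 0.0 <= p_violation <= 1.0
    return (1.0 - p_violation) * q_reward

# An action with high return but likely violation scores below a safer one:
risky = multiplicative_value(q_reward=10.0, p_violation=0.8)  # ~2.0
safe = multiplicative_value(q_reward=5.0, p_violation=0.05)   # ~4.75
best = max([("risky", risky), ("safe", safe)], key=lambda kv: kv[1])[0]
```

The multiplicative form means an action's value collapses toward zero as its violation probability approaches one, regardless of how large its constraint-free return is.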
arXiv Detail & Related papers (2023-03-07T18:29:15Z) - Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments [84.3830478851369]
We propose a safe reinforcement learning approach that can jointly learn the environment and optimize the control policy.
Our approach can effectively enforce hard safety constraints and significantly outperform CMDP-based baseline methods in the system safety rate measured via simulations.
arXiv Detail & Related papers (2022-09-29T20:49:25Z) - SafeAPT: Safe Simulation-to-Real Robot Learning using Diverse Policies Learned in Simulation [12.778412161239466]
A policy learned in simulation may not always produce safe behaviour on the real robot.
In this work, we introduce a novel learning algorithm called SafeAPT that leverages a diverse repertoire of policies evolved in the simulation.
We show that SafeAPT finds high-performance policies within a few minutes in the real world while minimizing safety violations during the interactions.
arXiv Detail & Related papers (2022-01-27T16:40:36Z) - Sim-to-Lab-to-Real: Safe Reinforcement Learning with Shielding and Generalization Guarantees [7.6347172725540995]
Safety is a critical component of autonomous systems and remains a challenge for learning-based policies to be utilized in the real world.
We propose Sim-to-Lab-to-Real to bridge the reality gap with a probabilistically guaranteed safety-aware policy distribution.
arXiv Detail & Related papers (2022-01-20T18:41:01Z) - Understanding Domain Randomization for Sim-to-real Transfer [41.33483293243257]
We propose a theoretical framework for sim-to-real transfers, in which the simulator is modeled as a set of MDPs with tunable parameters.
We prove that sim-to-real transfer can succeed under mild conditions without any real-world training samples.
arXiv Detail & Related papers (2021-10-07T07:45:59Z)
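The last entry models the simulator as a set of MDPs with tunable parameters. A minimal sketch of that viewpoint (a one-state toy MDP family with a single tunable parameter, purely illustrative) might look like:

```python
import random

random.seed(0)

def make_mdp(theta):
    """One simulator instance: a one-state MDP whose reward depends on theta."""
    return lambda action: -(action - theta) ** 2  # reward peaks at action == theta

# Domain randomization: pick one action that maximizes average return
# over MDPs sampled from the parameterized family.
thetas = [random.uniform(0.4, 0.6) for _ in range(100)]
candidates = [i / 100 for i in range(101)]
policy_action = max(
    candidates,
    key=lambda a: sum(make_mdp(t)(a) for t in thetas),  # average sampled return
)

# The "real" MDP is an unseen member of the same family; no real-world
# samples were used during training, mirroring the zero-real-sample setting.
real_reward = make_mdp(0.5)(policy_action)
```

Because the real environment lies inside the randomized family, the policy optimized over sampled simulators already performs well on it, which is the intuition behind transfer succeeding without real-world training samples.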
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.