Safe Guaranteed Dynamics Exploration with Probabilistic Models
- URL: http://arxiv.org/abs/2509.16650v1
- Date: Sat, 20 Sep 2025 11:55:24 GMT
- Title: Safe Guaranteed Dynamics Exploration with Probabilistic Models
- Authors: Manish Prajapat, Johannes Köhler, Melanie N. Zeilinger, Andreas Krause
- Abstract summary: We introduce a notion of maximum safe dynamics learning via sufficient exploration in the space of safe policies. We propose a $\textit{pessimistically}$ safe framework that ensures continuous online learning of dynamics. We demonstrate the effectiveness of our approach in challenging domains such as autonomous car racing and drone navigation.
- Score: 34.655934881761446
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensuring both optimality and safety is critical for the real-world deployment of agents, but becomes particularly challenging when the system dynamics are unknown. To address this problem, we introduce a notion of maximum safe dynamics learning via sufficient exploration in the space of safe policies. We propose a $\textit{pessimistically}$ safe framework that $\textit{optimistically}$ explores informative states and, despite not reaching them due to model uncertainty, ensures continuous online learning of dynamics. The framework achieves first-of-its-kind results: learning the dynamics model sufficiently, up to an arbitrarily small tolerance (subject to noise), in finite time, while ensuring provably safe operation throughout with high probability and without requiring resets. Building on this, we propose an algorithm to maximize rewards while learning the dynamics $\textit{only to the extent needed}$ to achieve close-to-optimal performance. Unlike typical reinforcement learning (RL) methods, our approach operates online in a non-episodic setting and ensures safety throughout the learning process. We demonstrate the effectiveness of our approach in challenging domains such as autonomous car racing and drone navigation under aerodynamic effects, scenarios where safety is critical and accurate modeling is difficult.
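The idea of being pessimistic about safety while seeking out informative data can be illustrated with a deliberately simplified, one-step sketch. This is not the paper's algorithm: the function names, the scalar state, the interval form `mean ± beta * std`, and the numbers below are all assumptions made for illustration. An action counts as safe only if every next state in its confidence interval satisfies the constraint, and among those safe actions the most uncertain one is chosen.

```python
import numpy as np


def pessimistically_safe(mean, std, beta, x_min, x_max):
    """Certify an action only if its whole confidence interval is feasible.

    mean, std: predicted next-state mean and standard deviation per action
    (hypothetical outputs of some probabilistic dynamics model).
    """
    lower = mean - beta * std
    upper = mean + beta * std
    return (lower >= x_min) & (upper <= x_max)


def pick_action(mean, std, beta, x_min, x_max):
    """Return the index of the most uncertain certified-safe action.

    Returns None when no action can be certified safe.
    """
    safe = pessimistically_safe(mean, std, beta, x_min, x_max)
    if not safe.any():
        return None
    # Unsafe actions get a score of -inf so they can never be selected.
    scores = np.where(safe, std, -np.inf)
    return int(np.argmax(scores))


# Illustrative predictions for four candidate actions (made-up numbers).
mean = np.array([0.0, 0.5, 0.9, 0.2])
std = np.array([0.1, 0.2, 0.3, 0.05])

choice = pick_action(mean, std, beta=2.0, x_min=-1.0, x_max=1.0)
print(choice)
```

In this example the third action has the largest uncertainty, but its interval extends beyond `x_max`, so it is excluded and the second action is selected. The framework described in the abstract reasons over policies and multi-step behavior, which this sketch does not attempt to capture.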
Related papers
- Safe Reinforcement Learning via Recovery-based Shielding with Gaussian Process Dynamics Models [57.006252510102506]
Reinforcement learning (RL) is a powerful framework for optimal decision-making and control but often lacks provable guarantees for safety-critical applications. We introduce a novel recovery-based shielding framework that enables safe RL with a provable safety lower bound for unknown and non-linear continuous dynamical systems.
arXiv Detail & Related papers (2026-02-12T22:03:35Z)
- Safely Learning Controlled Stochastic Dynamics [61.82896036131116]
We introduce a method that ensures safe exploration and efficient estimation of system dynamics. After training, the learned model enables predictions of the system's dynamics and permits safety verification of any given control. We provide theoretical guarantees for safety and derive adaptive learning rates that improve with increasing Sobolev regularity of the true dynamics.
arXiv Detail & Related papers (2025-06-03T11:17:07Z)
- Amortized Safe Active Learning for Real-Time Data Acquisition: Pretrained Neural Policies from Simulated Nonparametric Functions [23.406516455945653]
We propose an amortized safe AL framework that replaces expensive online computations with a pretrained neural policy. Our framework is modular and can be adapted to unconstrained, time-sensitive AL tasks by omitting the safety requirement.
arXiv Detail & Related papers (2025-01-26T09:05:52Z)
- ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning [48.536695794883826]
We present ActSafe, a novel model-based RL algorithm for safe and efficient exploration. We show that ActSafe guarantees safety during learning while also obtaining a near-optimal policy in finite time. In addition, we propose a practical variant of ActSafe that builds on the latest model-based RL advancements.
arXiv Detail & Related papers (2024-10-12T10:46:02Z)
- When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
Accurate models of expertise in executing a task have applications in safety-sensitive domains such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm [4.128216503196621]
We propose an On-policy Model-based Safe Deep RL algorithm in which we learn the transition dynamics of the environment in an online manner.
We show that our algorithm is more sample efficient and results in lower cumulative hazard violations as compared to constrained model-free approaches.
arXiv Detail & Related papers (2022-10-14T06:53:02Z)
- Safely Learning Dynamical Systems from Short Trajectories [12.184674552836414]
A fundamental challenge in learning to control an unknown dynamical system is to reduce model uncertainty by making measurements while maintaining safety.
We formulate a mathematical definition of what it means to safely learn a dynamical system by sequentially deciding where to initialize the next trajectory.
We present a linear programming-based algorithm that either safely recovers the true dynamics from trajectories of length one, or certifies that safe learning is impossible.
arXiv Detail & Related papers (2020-11-24T18:06:10Z)
- Safe Active Dynamics Learning and Control: A Sequential Exploration-Exploitation Framework [30.58186749790728]
We propose a theoretically-justified approach to maintaining safety in the presence of dynamics uncertainty.
Our framework guarantees the high-probability satisfaction of all constraints at all times jointly.
This theoretical analysis also motivates two regularizers of last-layer meta-learning models that improve online adaptation capabilities.
arXiv Detail & Related papers (2020-08-26T17:39:58Z)
- Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z)
- Safe Mission Planning under Dynamical Uncertainties [15.533842336139063]
This paper considers safe robot mission planning in uncertain dynamical environments.
It is a challenging problem due to modeling and integrating dynamical uncertainties into a safe planning framework.
arXiv Detail & Related papers (2020-03-05T20:45:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.