Fugu-MT Paper Translation (Abstract): Safe Guaranteed Dynamics Exploration with Probabilistic Models

Paper summary: Safe Guaranteed Dynamics Exploration with Probabilistic Models

arxiv url: http://arxiv.org/abs/2509.16650v1
Date: Sat, 20 Sep 2025 11:55:24 GMT
Status: Translation complete
Last updated in system: 2025-09-23 18:58:15.906619
Title: Safe Guaranteed Dynamics Exploration with Probabilistic Models
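Title (reference translation): Safety-Guaranteed Dynamics Exploration with Probabilistic Models
Authors: Manish Prajapat, Johannes Köhler, Melanie N. Zeilinger, Andreas Krause
Abstract summary: We introduce a notion of maximum safe dynamics learning via sufficient exploration in the space of safe policies. We propose a $\textit{pessimistically}$ safe framework that guarantees continuous online learning of the dynamics. We demonstrate the effectiveness of the approach in challenging domains such as autonomous car racing and drone navigation.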
Reference score (independently computed attention metric): 34.655934881761446
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Ensuring both optimality and safety is critical for the real-world deployment of agents, but becomes particularly challenging when the system dynamics are unknown. To address this problem, we introduce a notion of maximum safe dynamics learning via sufficient exploration in the space of safe policies. We propose a $\textit{pessimistically}$ safe framework that $\textit{optimistically}$ explores informative states and, despite not reaching them due to model uncertainty, ensures continuous online learning of dynamics. The framework achieves first-of-its-kind results: learning the dynamics model sufficiently $-$ up to an arbitrary small tolerance (subject to noise) $-$ in a finite time, while ensuring provably safe operation throughout with high probability and without requiring resets. Building on this, we propose an algorithm to maximize rewards while learning the dynamics $\textit{only to the extent needed}$ to achieve close-to-optimal performance. Unlike typical reinforcement learning (RL) methods, our approach operates online in a non-episodic setting and ensures safety throughout the learning process. We demonstrate the effectiveness of our approach in challenging domains such as autonomous car racing and drone navigation under aerodynamic effects $-$ scenarios where safety is critical and accurate modeling is difficult.
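Abstract (reference translation): Ensuring both optimality and safety is essential for deploying agents in the real world, but it becomes especially difficult when the system dynamics are unknown. To address this problem, we introduce a notion of maximum safe dynamics learning via sufficient exploration in the space of safe policies. We propose a $\textit{pessimistically}$ safe framework that $\textit{optimistically}$ explores informative states and, even when it does not reach them because of model uncertainty, guarantees continuous online learning of the dynamics. The framework achieves first-of-its-kind results: it learns the dynamics model sufficiently $-$ up to an arbitrarily small tolerance (subject to noise) $-$ in finite time, while guaranteeing provably safe operation throughout with high probability and without requiring resets. Building on this, we propose an algorithm that maximizes rewards while learning the dynamics $\textit{only to the extent needed}$ to achieve close-to-optimal performance. Unlike typical reinforcement learning (RL) methods, our approach operates online in a non-episodic setting and ensures safety throughout the learning process. We demonstrate the effectiveness of the approach in challenging domains such as autonomous car racing and drone navigation under aerodynamic effects, scenarios where safety is critical and accurate modeling is difficult.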

List of related papers