Fugu-MT Paper Translation (Abstract): Synthesis and Deployment of Maximal Robust Control Barrier Functions through Adversarial Reinforcement Learning

Paper summary: Synthesis and Deployment of Maximal Robust Control Barrier Functions through Adversarial Reinforcement Learning

arxiv url: http://arxiv.org/abs/2604.13192v1
Date: Tue, 14 Apr 2026 18:16:17 GMT
Status: Translation complete
System updated: 2026-04-16 20:38:32.246628
Title: Synthesis and Deployment of Maximal Robust Control Barrier Functions through Adversarial Reinforcement Learning
Title (reference translation): Synthesis and Deployment of Maximal Robust Control Barrier Functions via Adversarial Reinforcement Learning
Authors: Donggeon David Oh, Duy P. Nguyen, Haimin Hu, Jaime Fernández Fisac
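Abstract summary: We propose a new robust CBF framework for general nonlinear systems under bounded uncertainty. First, we show that the safety value function solving the dynamic programming Isaacs equation is a valid robust discrete-time CBF that enforces safety on the maximal robust safe set. Next, we adopt the key reinforcement learning (RL) notion of the quality function (Q-function), which removes the need for explicit dynamics by lifting the barrier certificate into state-action space.
Reference score (independently computed attention score): 4.062182903323153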
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Robust control barrier functions (CBFs) provide a principled mechanism for smooth safety enforcement under worst-case disturbances. However, existing approaches typically rely on explicit, closed-form structure in the dynamics (e.g., control-affine) and uncertainty models. This has led to limited scalability and generality, with most robust CBFs certifying only conservative subsets of the maximal robust safe set. In this paper, we introduce a new robust CBF framework for general nonlinear systems under bounded uncertainty. We first show that the safety value function solving the dynamic programming Isaacs equation is a valid robust discrete-time CBF that enforces safety on the maximal robust safe set. We then adopt the key reinforcement learning (RL) notion of quality function (or Q-function), which removes the need for explicit dynamics by lifting the barrier certificate into state-action space and yields a novel robust Q-CBF constraint for safety filtering. Combined with adversarial RL, this enables the synthesis and deployment of robust Q-CBFs on general nonlinear systems with black-box dynamics and unknown uncertainty structure. We validate the framework on a canonical inverted pendulum benchmark and a 36-D quadruped simulator, achieving substantially less conservative safe sets than barrier-based baselines on the pendulum and reliable safety enforcement even under adversarial uncertainty realizations on the quadruped.
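Abstract (reference translation): Robust control barrier functions (CBFs) provide a principled mechanism for smoothly enforcing safety under worst-case disturbances. However, existing approaches typically rely on explicit, closed-form structure in the dynamics (e.g., control-affine) and in the uncertainty model. This limits scalability and generality, and most robust CBFs certify only conservative subsets of the maximal robust safe set. In this paper, we propose a new robust CBF framework for general nonlinear systems under bounded uncertainty. First, we show that the safety value function solving the dynamic programming Isaacs equation is a valid robust discrete-time CBF that enforces safety on the maximal robust safe set. Next, we adopt the key reinforcement learning (RL) notion of the quality function (Q-function), which eliminates the need for explicit dynamics by lifting the barrier certificate into state-action space and yields a new robust Q-CBF constraint for safety filtering. Combined with adversarial RL, this enables the synthesis and deployment of robust Q-CBFs on general nonlinear systems with black-box dynamics and unknown uncertainty structure. We validate the method on a canonical inverted pendulum benchmark and a 36-dimensional quadruped simulator, obtaining substantially less conservative safe sets than barrier-based baselines on the pendulum and reliable safety enforcement on the quadruped even under adversarial uncertainty realizations.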
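
The idea of filtering actions with a safety Q-function can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it swaps adversarial RL for plain tabular value iteration on a hypothetical 1-D integer grid, and the constraint Q(x, u) >= (1 - gamma) * V(x) is one plausible discrete-time CBF-style condition chosen for illustration, not the robust Q-CBF constraint stated in the paper. The names `margin`, `step`, `safety_q_table`, `q_cbf_filter` and all numeric values are assumptions.

```python
import numpy as np

# Toy setup: every value here is an illustrative assumption, not from the paper.
STATES = np.arange(-8, 9)               # integer positions x
CONTROLS = np.array([-2, -1, 0, 1, 2])  # controller's choices u
DISTURBANCES = np.array([-1, 0, 1])     # adversary's choices d
LIMIT = 5                               # state is safe iff |x| <= LIMIT


def margin(x):
    """Safety margin g(x): nonnegative exactly on the constraint set."""
    return LIMIT - np.abs(x)


def step(x, u, d):
    """Transition x' = x + u + d, clipped so the state stays on the grid."""
    return np.clip(x + u + d, STATES[0], STATES[-1])


def safety_q_table(num_sweeps=50):
    """Value iteration on Q(x,u) = min(g(x), min_d V(x')), V(x) = max_u Q(x,u)."""
    g = margin(STATES).astype(float)
    value = g.copy()  # initialize V with the margin
    # next_idx[i, j, k]: grid index reached from state i, control j, disturbance k
    next_idx = step(STATES[:, None, None],
                    CONTROLS[None, :, None],
                    DISTURBANCES[None, None, :]) - STATES[0]
    for _ in range(num_sweeps):
        worst_case = value[next_idx].min(axis=2)  # adversary minimizes
        q = np.minimum(g[:, None], worst_case)
        value = q.max(axis=1)                     # controller maximizes
    return q, value


def q_cbf_filter(x, u_nominal, q, value, gamma=0.5):
    """Return the admissible control closest to u_nominal.

    A control is admissible if Q(x,u) >= (1 - gamma) * V(x). If none is,
    fall back to the control with the largest Q-value.
    """
    i = int(x - STATES[0])
    admissible = q[i] >= (1.0 - gamma) * value[i]
    if not admissible.any():
        return int(CONTROLS[np.argmax(q[i])])
    candidates = CONTROLS[admissible]
    return int(candidates[np.argmin(np.abs(candidates - u_nominal))])


q, value = safety_q_table()
# Near the boundary the filter overrides an outward-pushing nominal control;
# in the interior it lets the nominal control through.
u_at_boundary = q_cbf_filter(5, 2, q, value)
u_in_interior = q_cbf_filter(0, 2, q, value)
```

Note that the filter only queries the Q-table at the current state, so it never calls `step` at deployment time. That is the sense in which lifting the certificate into state-action space removes the need for an explicit dynamics model in this toy setting.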
