Fugu-MT Paper Translation (Abstract): Shielded Reinforcement Learning Under Dynamic Temporal Logic Constraints

Paper overview: Shielded Reinforcement Learning Under Dynamic Temporal Logic Constraints

arxiv url: http://arxiv.org/abs/2603.17152v1
Date: Tue, 17 Mar 2026 21:29:50 GMT
Status: Translation complete
Last updated in system: 2026-03-19 18:32:57.402694
Title: Shielded Reinforcement Learning Under Dynamic Temporal Logic Constraints
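Title (reference translation): Shielded Reinforcement Learning Under Dynamic Temporal Logic Constraints
Authors: Sadık Bera Yüksel, Ali Tevfik Buyukkocak, Derya Aksaray
Abstract summary: Reinforcement Learning (RL) has shown promise in various robotics applications, but its deployment on real systems remains limited by safety and operational constraints. This paper proposes a framework that leverages sequential control barrier functions and model-free RL to ensure that the given tasks are satisfied throughout the learning process.
Reference score (independently computed attention measure): 0.0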
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reinforcement Learning (RL) has shown promise in various robotics applications, yet its deployment on real systems is still limited due to safety and operational constraints. The safe RL field has gained considerable attention in recent years, which focuses on imposing safety constraints throughout the learning process. However, real systems often require more complex constraints than just safety, such as periodic recharging or time-bounded visits to specific regions. Imposing such spatio-temporal tasks during learning still remains a challenge. Signal Temporal Logic (STL) is a formal language for specifying temporal properties of real-valued signals and provides a way to express such complex tasks. In this paper, we propose a framework that leverages sequential control barrier functions and model-free RL to ensure that the given STL tasks are satisfied throughout the learning process. Our method extends beyond traditional safety constraints by enforcing rich STL specifications, which can involve visits to dynamic targets with unknown trajectories. We also demonstrate the effectiveness of our framework through various simulations.
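Abstract (reference translation): Reinforcement Learning (RL) has shown promise in various robotics applications, but its deployment on real systems remains limited by safety and operational constraints. In recent years the safe RL field has attracted attention, with a focus on imposing safety constraints throughout the learning process. However, real systems often require constraints more complex than safety alone, such as periodic recharging or time-bounded visits to specific regions. Enforcing such spatio-temporal tasks during learning remains a challenge. Signal Temporal Logic (STL) is a formal language for specifying temporal properties of real-valued signals and provides a way to express such complex tasks. This paper proposes a framework that leverages sequential control barrier functions and model-free RL to ensure that the given STL tasks are satisfied throughout the learning process. The method extends beyond traditional safety constraints by enforcing rich STL specifications, which can involve visits to dynamic targets with unknown trajectories. The effectiveness of the framework is also demonstrated through various simulations.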
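
To make the notion of a time-bounded STL task more concrete, the following is a minimal Python sketch of the standard quantitative (robustness) semantics for "eventually" and "always" over a discrete-time 2D trajectory. It is not the paper's implementation and does not involve control barrier functions or RL. All names (`in_disk`, `negate`, `eventually`, `always`) are hypothetical, and the sketch assumes one trajectory sample per time step and a window `[a, b]` that lies inside the trajectory.

```python
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]
Predicate = Callable[[Point], float]


def in_disk(center: Point, radius: float) -> Predicate:
    """Robustness of 'the point is inside the disk': positive inside, negative outside."""
    def rho(p: Point) -> float:
        dx, dy = p[0] - center[0], p[1] - center[1]
        return radius - (dx * dx + dy * dy) ** 0.5
    return rho


def negate(rho: Predicate) -> Predicate:
    """Robustness of the negated predicate (sign flip)."""
    return lambda p: -rho(p)


def eventually(rho: Predicate, traj: Sequence[Point], a: int, b: int) -> float:
    """Robustness at time 0 of F_[a,b] rho: the best value inside the window."""
    return max(rho(p) for p in traj[a:b + 1])


def always(rho: Predicate, traj: Sequence[Point], a: int, b: int) -> float:
    """Robustness at time 0 of G_[a,b] rho: the worst value inside the window."""
    return min(rho(p) for p in traj[a:b + 1])


if __name__ == "__main__":
    # Hypothetical trajectory: one sample per time step, moving along the x-axis.
    traj = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
    goal = in_disk((3.0, 0.0), 0.5)      # region to visit
    obstacle = in_disk((1.0, 1.0), 0.5)  # region to avoid

    visit = eventually(goal, traj, 0, 3)
    avoid = always(negate(obstacle), traj, 0, 3)
    # Conjunction of the two requirements is the minimum of their robustness values.
    print("visit:", visit, "avoid:", avoid, "task:", min(visit, avoid))
```

A positive robustness value means the trajectory satisfies the formula with some margin; a negative value means it is violated. In this example, shrinking the visit window to `[0, 2]` makes the "eventually" term negative because the goal is only reached at step 3.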

Related papers