Fugu-MT paper translation (summary): Automating the Refinement of Reinforcement Learning Specifications

Paper summary: Automating the Refinement of Reinforcement Learning Specifications

arxiv url: http://arxiv.org/abs/2512.01047v1
Date: Sun, 30 Nov 2025 19:32:33 GMT
Title: Automating the Refinement of Reinforcement Learning Specifications
Title (reference translation): Automating the Refinement of Reinforcement Learning Specifications
Authors: Tanmay Ambadkar, Đorđe Žikelić, Abhinav Verma
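Abstract summary: AutoSpec is applicable to reinforcement learning tasks specified in the SpectRL specification logic. The authors show how AutoSpec can be integrated with existing reinforcement learning algorithms to learn policies from logical specifications.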
Reference score (attention score computed independently by this site): 1.8033500402815792
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Logical specifications have been shown to help reinforcement learning algorithms achieve complex tasks. However, when a task is under-specified, agents might fail to learn useful policies. In this work, we explore the possibility of improving coarse-grained logical specifications via an exploration-guided strategy. We propose AutoSpec, a framework that searches for a logical specification refinement whose satisfaction implies satisfaction of the original specification, but which provides additional guidance, thereby making it easier for reinforcement learning algorithms to learn useful policies. AutoSpec is applicable to reinforcement learning tasks specified via the SpectRL specification logic. We exploit the compositional nature of specifications written in SpectRL and design four refinement procedures that modify the abstract graph of the specification, either by refining its existing edge specifications or by introducing new edge specifications. We prove that all four procedures maintain specification soundness, i.e., any trajectory satisfying the refined specification also satisfies the original. We then show how AutoSpec can be integrated with existing reinforcement learning algorithms for learning policies from logical specifications. Our experiments demonstrate that using the refined logical specifications produced by AutoSpec yields promising improvements in the complexity of control tasks that can be solved.
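Abstract (reference translation): Logical specifications have been shown to help reinforcement learning algorithms accomplish complex tasks. However, when a task is under-specified, the agent may fail to learn a useful policy. This work examines whether coarse-grained logical specifications can be improved using an exploration-guided strategy. It proposes AutoSpec, a framework that searches for a refinement of a logical specification whose satisfaction implies satisfaction of the original specification. AutoSpec is applicable to reinforcement learning tasks specified in the SpectRL specification logic. The authors exploit the compositional nature of specifications written in SpectRL and design four refinement procedures that modify the abstract graph of a specification, either by refining existing edge specifications or by introducing new ones. They prove that all four procedures preserve specification soundness, that is, any trajectory satisfying the refined specification also satisfies the original. They then show how AutoSpec can be integrated with existing reinforcement learning algorithms to learn policies from logical specifications. The experiments show promising improvements in the complexity of control tasks that can be solved when the refined logical specifications produced by AutoSpec are used.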
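
The soundness property stated in the abstract (any trajectory satisfying the refined specification also satisfies the original) can be illustrated with a small sketch. This is not AutoSpec or the SpectRL API: the `EdgeSpec` class, the `strengthen` rule, and the `is_sound_on` helper are hypothetical names introduced here, and conjoining extra predicates is only one assumed way an edge specification might be refined.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Sequence, Tuple

State = Tuple[float, float]          # hypothetical 2-D state (x, y)
Predicate = Callable[[State], bool]


@dataclass(frozen=True)
class EdgeSpec:
    """Hypothetical reach-avoid edge: reach `goal` while staying in `safe`."""
    goal: Predicate
    safe: Predicate

    def satisfied_by(self, trajectory: Sequence[State]) -> bool:
        # Every state up to and including the first goal state must be safe.
        for state in trajectory:
            if not self.safe(state):
                return False
            if self.goal(state):
                return True
        return False  # the goal was never reached


def strengthen(spec: EdgeSpec, extra_goal: Predicate, extra_safe: Predicate) -> EdgeSpec:
    """Assumed refinement rule: conjoin extra predicates onto an existing edge."""
    return EdgeSpec(
        goal=lambda s: spec.goal(s) and extra_goal(s),
        safe=lambda s: spec.safe(s) and extra_safe(s),
    )


def is_sound_on(original: EdgeSpec, refined: EdgeSpec,
                trajectories: Iterable[Sequence[State]]) -> bool:
    """Empirical check only: refined satisfaction implies original satisfaction."""
    return all(original.satisfied_by(t) for t in trajectories if refined.satisfied_by(t))


# Coarse task: reach x >= 2 while keeping y <= 5.
original = EdgeSpec(goal=lambda s: s[0] >= 2.0, safe=lambda s: s[1] <= 5.0)
# Refined task: a tighter goal and a tighter safety region.
refined = strengthen(original,
                     extra_goal=lambda s: s[0] >= 3.0,
                     extra_safe=lambda s: s[1] <= 4.0)

long_run = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
short_run = [(0.0, 0.0), (2.0, 2.0)]      # satisfies only the original
unsafe_run = [(0.0, 6.0), (3.0, 3.0)]     # violates safety in both

print(is_sound_on(original, refined, [long_run, short_run, unsafe_run]))
```

In this sketch the refined edge accepts fewer trajectories than the original, so the implication holds by construction; the check over sample trajectories is a sanity test, not a proof of the kind the paper reports.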

Related papers

SLD-Spec: Enhancement LLM-assisted Specification Generation for Complex Loop Functions via Program Slicing and Logical Deletion [29.231420590756954]
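SLD-Spec is an LLM-assisted specification generation method suited to programs with complex loop structures. SLD-Spec successfully verifies five more programs than the state-of-the-art AutoSpec and reduces runtime by 23.73%.
Paper reference translation (metadata) (2025-09-12T01:40:27Z)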
Vision to Specification: Automating the Transition from Conceptual Features to Functional Requirements [10.85799957734291]
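The EasyFR approach recommends a semantic role labeling sequence for a given abstract feature and uses it to guide a pre-trained language model (PLM) in generating cohesive functional requirements (FRs). The results represent a notable advance in automated requirements synthesis, with the potential to improve the requirements specification process in future software projects.
Paper reference translation (metadata) (2025-05-18T07:01:50Z)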
GraphRank Pro+: Advancing Talent Analytics Through Knowledge Graphs and Sentiment-Enhanced Skill Profiling [0.0]
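This paper proposes an approach, described by its authors as revolutionary, that leverages structured graphs, natural language processing (NLP), and deep learning. By abstracting complex logic into graph structures, it transforms raw data into a comprehensive knowledge graph. The framework enables precise information extraction and advanced queries.
Paper reference translation (metadata) (2025-02-25T16:07:40Z)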
Learning Task Representations from In-Context Learning [67.66042137487287]
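Large language models (LLMs) have shown remarkable proficiency in in-context learning (ICL). The authors introduce an automated formulation for encoding the task information in ICL prompts as a function of attention heads. The proposed method extracts task-specific information from in-context demonstrations and performs well on both text and regression tasks.
Paper reference translation (metadata) (2025-02-08T00:16:44Z)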
What is Formal Verification without Specifications? A Survey on mining LTL Specifications [5.655251163654288]
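This survey lists and compares advances in mining specifications in Linear Temporal Logic (LTL), the de facto standard specification language for reactive systems. Several approaches have been designed to learn formulas that address different aspects and settings of specification design. The survey reviews the current state of the art and compares the approaches for the convenience of formal methods practitioners.
Paper reference translation (metadata) (2025-01-27T18:06:48Z)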
DeepLTL: Learning to Efficiently Satisfy Complex LTL Specifications for Multi-Task RL [59.01527054553122]
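Linear temporal logic (LTL) has recently been adopted as a powerful formalism for specifying complex, temporally extended tasks. Existing approaches have several shortcomings. The authors propose a new learning approach to address these issues.
Paper reference translation (metadata) (2024-10-06T21:30:38Z)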
Enchanting Program Specification Synthesis by Large Language Models using Static Analysis and Program Verification [15.686651364655958]
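AutoSpec is an automated approach for synthesizing specifications for automated program verification. It overcomes the shortcomings of existing work in specification versatility and synthesizes specifications that are sufficient and adequate for a complete proof. It can be successfully applied to verify programs in a real-world X509 parser project.
Paper reference translation (metadata) (2024-03-31T18:15:49Z)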
Multi-Agent Reinforcement Learning with Temporal Logic Specifications [65.79056365594654]
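This work studies the problem of learning to satisfy temporal logic specifications with a group of agents in an unknown environment. The authors develop the first multi-agent reinforcement learning method for temporal logic specifications and provide correctness and convergence guarantees for the main algorithm.
Paper reference translation (metadata) (2021-02-01T01:13:03Z)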
Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
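The authors propose a model-free RL algorithm in which the goals of an unknown continuous-state/action Markov decision process (MDP) can be formulated using linear temporal logic (LTL). The algorithm is guaranteed to synthesize a control policy whose traces satisfy the specification with maximal probability.
Paper reference translation (metadata) (2019-02-02T20:09:32Z)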

The related papers list is generated automatically from the titles and abstracts of papers on this site.
