Fugu-MT Paper Translation (Abstract): Self-Abstraction from Grounded Experience for Plan-Guided Policy Refinement

Paper summary: Self-Abstraction from Grounded Experience for Plan-Guided Policy Refinement

arxiv url: http://arxiv.org/abs/2511.05931v1
Date: Sat, 08 Nov 2025 08:49:38 GMT
Status: Translation complete
Last updated in system: 2025-11-11 21:18:44.660133
Title: Self-Abstraction from Grounded Experience for Plan-Guided Policy Refinement
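Title (reference translation): Self-Abstraction from Grounded Experience for Plan-Guided Policy Refinement
Authors: Hiroaki Hayashi, Bo Pang, Wenting Zhao, Ye Liu, Akash Gokul, Srijan Bansal, Caiming Xiong, Semih Yavuz, Yingbo Zhou
Abstract summary: Large language model (LLM) based agents are increasingly used to tackle software engineering tasks. The paper proposes Self-Abstraction from Grounded Experience (SAGE), a framework that enables agents to learn from their own task executions.
Reference score (independently computed attention score): 61.35824395228412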
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Large language model (LLM) based agents are increasingly used to tackle software engineering tasks that require multi-step reasoning and code modification, demonstrating promising yet limited performance. However, most existing LLM agents typically operate within static execution frameworks, lacking a principled mechanism to learn and self-improve from their own experience and past rollouts. As a result, their performance remains bounded by the initial framework design and the underlying LLM's capabilities. We propose Self-Abstraction from Grounded Experience (SAGE), a framework that enables agents to learn from their own task executions and refine their behavior through self-abstraction. After an initial rollout, the agent induces a concise plan abstraction from its grounded experience, distilling key steps, dependencies, and constraints. This learned abstraction is then fed back as contextual guidance, refining the agent's policy and supporting more structured, informed subsequent executions. Empirically, SAGE delivers consistent performance gains across diverse LLM backbones and agent architectures. Notably, it yields a 7.2% relative performance improvement over the strong Mini-SWE-Agent baseline when paired with the GPT-5 (high) backbone. SAGE further achieves strong overall performance on the SWE-Bench Verified benchmark, reaching 73.2% and 74% Pass@1 resolve rates with the Mini-SWE-Agent and OpenHands CodeAct agent frameworks, respectively.
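Abstract (reference translation): Large language model (LLM) based agents are increasingly used for software engineering tasks that require multi-step reasoning and code modification, showing promising but limited performance. However, existing LLM agents generally operate within static execution frameworks and lack a principled mechanism for learning and self-improving from their own experience and past rollouts. As a result, their performance remains bounded by the initial framework design and the capabilities of the underlying LLM. This work proposes Self-Abstraction from Grounded Experience (SAGE), a framework in which agents learn from their own task executions and refine their behavior through self-abstraction. After an initial rollout, the agent induces a concise plan abstraction from its grounded experience, distilling key steps, dependencies, and constraints. This learned abstraction is fed back as contextual guidance, refining the agent's policy and supporting more structured, informed subsequent executions. Empirically, SAGE delivers consistent performance gains across diverse LLM backbones and agent architectures. In particular, paired with the GPT-5 (high) backbone, it yields a 7.2% relative performance improvement over the strong Mini-SWE-Agent baseline. SAGE also achieves strong overall performance on the SWE-Bench Verified benchmark, reaching Pass@1 resolve rates of 73.2% and 74% with the Mini-SWE-Agent and OpenHands CodeAct agent frameworks, respectively.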
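
The loop described in the abstract (initial rollout, plan abstraction induced from that rollout, second execution guided by the plan) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the `Rollout` type, the `run_agent` and `induce_plan` callables, and the toy stand-ins are all hypothetical names introduced here, and a real system would presumably call an LLM-backed agent in their place.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Rollout:
    """One task execution: the ordered trajectory plus the final patch."""
    steps: List[str]
    patch: str


# Hypothetical interfaces; the abstract does not specify them.
RunAgent = Callable[[str, Optional[str]], Rollout]  # (task, guidance) -> rollout
InducePlan = Callable[[str, Rollout], str]          # (task, rollout) -> plan text


def sage_episode(
    task: str, run_agent: RunAgent, induce_plan: InducePlan
) -> Tuple[Rollout, str, Rollout]:
    """Run once unguided, abstract a plan from that run, then rerun with it."""
    first = run_agent(task, None)      # initial rollout, no guidance
    plan = induce_plan(task, first)    # plan abstraction from grounded experience
    refined = run_agent(task, plan)    # plan fed back as contextual guidance
    return first, plan, refined


# Toy stand-ins (assumptions) so the sketch runs without any LLM.
def toy_agent(task: str, guidance: Optional[str]) -> Rollout:
    if guidance is None:
        return Rollout(
            steps=["explore repo", "edit wrong file", "edit parser.py"],
            patch="draft",
        )
    return Rollout(steps=["follow plan: " + guidance, "edit parser.py"], patch="refined")


def toy_induce_plan(task: str, rollout: Rollout) -> str:
    # Keeps only the last step as the "key step"; purely illustrative.
    return "key step = " + rollout.steps[-1]


first, plan, refined = sage_episode("fix parsing bug", toy_agent, toy_induce_plan)
print(plan)
print(refined.steps)
```

Passing the agent and the plan inducer in as plain callables keeps the sketch independent of any particular agent framework, which loosely mirrors the abstract's claim that the approach applies across different backbones and agent architectures.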

Related papers