Fugu-MT Paper Translation (Abstract): Tree-of-Experience: A Structured Experience-Management Solution for Self-Evolving Agents under Low-Repetition and Implicit-Reward Environments

Paper overview: Tree-of-Experience: A Structured Experience-Management Solution for Self-Evolving Agents under Low-Repetition and Implicit-Reward Environments

arxiv url: http://arxiv.org/abs/2606.06960v1
Date: Fri, 05 Jun 2026 06:39:16 GMT
Status: Translation complete
Last updated in system: 2026-06-08 14:33:29.599306
Title: Tree-of-Experience: A Structured Experience-Management Solution for Self-Evolving Agents under Low-Repetition and Implicit-Reward Environments
Title (reference translation): Tree-of-Experience: A Structured Experience-Management Solution for Self-Evolving Agents under Low-Repetition and Implicit-Reward Environments
Authors: Zihao Deng, Yining Zhu, Leiming Wang, Jingfei Lu, Junbo Wang, Chuncheng Ran, Yu Yang, Dixuan Yang, Jikun Shen
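Abstract summary: In low-repetition tasks with implicit rewards, past experience is difficult to reuse, and feedback is delayed, noisy, and outcome-level. This work proposes Tree-of-Experience (ToE), a structured experience-management method that organizes, retrieves, validates, and updates agent experience.
Reference score (independently computed attention measure): 7.400600301289333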
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Experience-based self-evolution is crucial for LLM agents, but existing benchmarks often assume explicit goals, stable task patterns, and clear feedback. We study a more challenging setting: low-repetition tasks with implicit rewards, where past experience is difficult to reuse and feedback is delayed, noisy, and outcome-level. We introduce FinEvolveBench, a temporally controlled benchmark for financial sentiment prediction that links daily news-driven predictions to future excess returns. We further propose Tree-of-Experience (ToE), a structured experience-management method that organizes, retrieves, validates, and updates agent experience. Experiments show that general-purpose experience mechanisms do not consistently outperform no-experience baselines, while ToE achieves stronger overall performance. These results highlight the importance of structured experience management for self-evolving agents in implicit-reward environments.
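Abstract (reference translation): Experience-based self-evolution is essential for LLM agents, but existing benchmarks often assume explicit goals, stable task patterns, and clear feedback. We study a more challenging setting: low-repetition tasks with implicit rewards, where past experience is hard to reuse and feedback is delayed, noisy, and available only at the outcome level. We introduce FinEvolveBench, a temporally controlled benchmark for financial sentiment prediction that links daily news-driven predictions to future excess returns. We further propose Tree-of-Experience (ToE), a structured experience-management method that organizes, retrieves, validates, and updates agent experience. Experiments show that general-purpose experience mechanisms do not consistently outperform no-experience baselines, whereas ToE achieves stronger overall performance. These results underscore the importance of structured experience management for self-evolving agents in implicit-reward environments.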

Related papers