Fugu-MT Paper Translation (Abstract): Sibyl-AutoResearch: Autonomous Research Needs Self-Evolving Trial-and-Error Harnesses, Not Paper Generators

Paper abstract: Sibyl-AutoResearch: Autonomous Research Needs Self-Evolving Trial-and-Error Harnesses, Not Paper Generators

arxiv url: http://arxiv.org/abs/2605.22343v1
Date: Thu, 21 May 2026 11:29:08 GMT
Status: Translation complete
Last updated in system: 2026-05-22 16:35:42.2304
Title: Sibyl-AutoResearch: Autonomous Research Needs Self-Evolving Trial-and-Error Harnesses, Not Paper Generators
Authors: Chengcheng Wang, Qinhua Xie, Wei He, Jianyuan Guo, Shiqi Wang, Chang Xu
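Abstract summary: We introduce Sibyl-AutoResearch, a self-evolving AutoResearch framework built around Scientific Trial-and-Error Harnesses. A harness lets agents run bounded trials, preserve positive and negative outcomes, and route lessons into later planning, validation, claim scope, scheduling, critique, writing, and repair. SIBYL is a file-backed autonomous research system that exposes state, roles, memory, gates, and artifact traces so that the conversion paths can be inspected.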
Reference score (independently computed attention score): 37.075000666622074
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Autonomous research systems increasingly make the scientific workflow executable: agents can propose ideas, run code, inspect results, and draft papers. But executable workflows do not by themselves produce research judgment. We analyze where current systems lose trial experience: weak evidence becomes prose, pilot signals become broad claims, memory remains textual, and recurring process failures do not change later behavior. We introduce Sibyl-AutoResearch, a self-evolving AutoResearch framework built around Scientific Trial-and-Error Harnesses. A harness lets agents run bounded trials, preserve positive and negative outcomes, and route lessons into later planning, validation, claim scope, scheduling, critique, writing, and harness repair. We formalize this through two auditable conversion units: trial-to-behavior conversion, which links trial signals to later research actions, and trial-to-harness-behavior conversion, which links recurring process failures to system updates. We implement the framework in SIBYL, a file-backed autonomous research system that exposes the state, roles, memory, gates, and artifact traces needed to inspect these conversion paths. A retrospective audit identifies eight high-confidence conversion events, with a median latency of one iteration and a maximum latency of three iterations. A recovered-failure registry further shows how five naturally occurring failure classes, including duplicate results, stale numbers, and unsupported statistics, were blocked, downgraded, or routed into later repair. These traces do not establish a comparative performance claim; they show that the proposed conversion units are recoverable from realistic autonomous-research workspaces. The SIBYL framework and system are available at https://github.com/Sibyl-Research-Team/AutoResearch-SibylSystem.
