Fugu-MT Paper Translation (Abstract): Trace2Skill: Verifier-Guided Skill Evolution for Long-Context EDA Agents

Paper overview: Trace2Skill: Verifier-Guided Skill Evolution for Long-Context EDA Agents

arxiv url: http://arxiv.org/abs/2605.21810v1
Date: Wed, 20 May 2026 23:10:49 GMT
Status: Translation complete
Last updated in system: 2026-05-22 20:14:18.500165
Title: Trace2Skill: Verifier-Guided Skill Evolution for Long-Context EDA Agents
Title (reference translation): Trace2Skill: Verifier-Guided Skill Evolution for Long-Context EDA Agents
Authors: Zijian Du, Nathaniel Pinckney,
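Abstract summary: We present Trace2Skill, a test-time scaling framework. Rather than training a new model or sampling more candidate solutions, Trace2Skill treats the agent's natural-language skill as an evolvable policy. It mines repeated rollout traces for success and failure modes, converts them into dense diagnostics and oracle lessons, and uses an oracle, mutator, and selector loop to produce task-specific skills.
Reference score (independently computed attention score): 0.3733676450456031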
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Complex Verilog Design Problems (CVDP) challenge hardware LLM agents because solving them requires localizing verifier-relevant RTL, testbenches, include paths, and build dependencies inside large repository snapshots, making precise edits, and recovering from sparse hidden-verifier failures. We present Trace2Skill, a test-time scaling framework that improves a hardware agent without RTL-specialized model fine-tuning. Rather than training a new model or only sampling more candidate solutions, Trace2Skill treats the agent's natural-language skill as an evolvable policy. It mines repeated rollout traces for success and failure modes, converts them into dense diagnostics and oracle lessons, and uses an oracle, mutator, and selector loop to produce task-specific skills that guide later search, editing, validation, and recovery. Because final pass/fail labels are often too coarse for hard failures, Trace2Skill also supports bounded runtime dense verifier feedback that returns sanitized functional observations while keeping hidden harnesses and reference solutions inaccessible to the agent. This feedback helps guide skill evolution and agent execution by connecting skill text, verifier evidence, and downstream behavior. Across hard CVDP tasks that defeat the seed CVDP agent, including tasks that also defeat frontier coding agents, Trace2Skill with dense verifier feedback substantially improves task pass rates and produces breakthrough passes on previously unsolved tasks, without requiring high-quality fine-tuning data, specialized RTL model training, or model weight updates. The same framework provides a general test-time scaling strategy that can extend beyond digital design to other verifiable EDA tasks.
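Abstract (reference translation): Complex Verilog Design Problems (CVDP) challenge hardware LLM agents because they require localizing verifier-relevant RTL, testbenches, include paths, and build dependencies inside large repository snapshots, making precise edits, and recovering from sparse hidden-verifier failures. We present Trace2Skill, a test-time scaling framework. Rather than training a new model or sampling more candidate solutions, Trace2Skill treats the agent's natural-language skill as an evolvable policy. It mines repeated rollout traces for success and failure modes, converts them into dense diagnostics and oracle lessons, and uses an oracle, mutator, and selector loop to produce task-specific skills that guide later search, editing, validation, and recovery. Because final pass/fail labels are often too coarse for hard failures, Trace2Skill also supports bounded runtime dense verifier feedback that returns sanitized functional observations while keeping hidden harnesses and reference solutions inaccessible to the agent. This feedback helps guide skill evolution and agent execution by connecting skill text, verifier evidence, and downstream behavior. Across hard tasks that defeat the seed CVDP agent, including tasks that defeat frontier coding agents, Trace2Skill with dense verifier feedback substantially improves task pass rates and produces breakthrough passes on unsolved tasks, without requiring high-quality fine-tuning data, specialized RTL model training, or model weight updates. The same framework provides a general test-time scaling strategy that can extend beyond digital design to other verifiable EDA tasks.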
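
The oracle, mutator, and selector loop described in the abstract can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: the names `Trace`, `oracle`, `mutator`, `selector`, `evolve_skill`, and `toy_rollout` are all hypothetical, the components in the paper are presumably LLM-driven rather than the simple string rules used here, and the tie-breaking rule (accept a candidate whose pass rate is no worse) is an assumption made so the toy can progress when pass/fail labels alone are too coarse.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Trace:
    """One agent rollout: the skill text used, a pass/fail label, a diagnostic."""
    skill: str
    passed: bool
    diagnostic: str  # hypothetical sanitized verifier observation


def oracle(traces: List[Trace]) -> List[str]:
    """Turn failed traces into deduplicated lessons (stand-in for an LLM oracle)."""
    lessons: List[str] = []
    for t in traces:
        if not t.passed and t.diagnostic not in lessons:
            lessons.append(t.diagnostic)
    return lessons


def mutator(skill: str, lessons: List[str]) -> List[str]:
    """Propose candidate skills, each appending one lesson not already present."""
    return [skill + "\n- " + l for l in lessons if l not in skill]


def selector(candidates: List[str], rollout: Callable[[str], Trace],
             n: int) -> Tuple[str, float, List[Trace]]:
    """Run n rollouts per candidate and return the one with the best pass rate."""
    best = None
    for c in candidates:
        traces = [rollout(c) for _ in range(n)]
        rate = sum(t.passed for t in traces) / n
        if best is None or rate > best[1]:
            best = (c, rate, traces)
    return best


def evolve_skill(seed: str, rollout: Callable[[str], Trace],
                 rounds: int = 3, n: int = 4) -> Tuple[str, float]:
    """Iterate oracle -> mutator -> selector, keeping candidates that are no worse."""
    skill, rate, traces = selector([seed], rollout, n)
    for _ in range(rounds):
        candidates = mutator(skill, oracle(traces))
        if not candidates:  # nothing left to learn from the current traces
            break
        cand, cand_rate, cand_traces = selector(candidates, rollout, n)
        if cand_rate >= rate:  # assumption: ties favor the newer skill
            skill, rate, traces = cand, cand_rate, cand_traces
    return skill, rate


# Toy stand-in for an agent plus verifier: passes only if both hints are in the skill.
REQUIRED = ["check include paths", "rerun testbench"]


def toy_rollout(skill: str) -> Trace:
    missing = [h for h in REQUIRED if h not in skill]
    if missing:
        return Trace(skill, False, missing[0])
    return Trace(skill, True, "")


if __name__ == "__main__":
    final_skill, final_rate = evolve_skill("Edit the RTL.", toy_rollout)
    print(final_rate)
    print(final_skill)
```

In this toy, the seed skill fails every rollout, and the diagnostic attached to each failure (rather than the bare pass/fail label) is what lets the loop add the missing hints one at a time. No model weights are touched at any point; only the skill text changes.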
