Fugu-MT Paper Translation (Abstract): AIP: A Graph Representation for Learning and Governing Agent Skills

Paper overview: AIP: A Graph Representation for Learning and Governing Agent Skills

arxiv url: http://arxiv.org/abs/2606.04781v1
Date: Wed, 03 Jun 2026 12:02:49 GMT
Status: Translation complete
Last updated in system: 2026-06-04 20:44:18.733987
Title: AIP: A Graph Representation for Learning and Governing Agent Skills
Authors: Zachary Blumenfeld, Jim Webber
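Abstract summary: Agent Skills today consist of free-form prose that requires the agent to read, interpret, and re-derive how to act in every session. This imposes two compounding costs: reduced reliability on implementation-heavy tasks, and difficulty in skill creation and improvement. The Agent Instruction Protocol (AIP) addresses both by modeling a skill as a directed execution graph. A compiler meta-skill translates existing human-written skills into this form.
Reference score (independently computed attention score): 1.1458853556386797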
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Agent Skills today consist largely of free-form prose requiring the agent to read, interpret, and re-derive how to act in every session. This imposes two compounding costs: reduced reliability on implementation-heavy tasks, and difficulty in skill creation and improvement, since editing prose is a fragile process that both humans and agents struggle with, particularly for domain-specific procedural knowledge underrepresented in model training. The Agent Instruction Protocol (AIP) addresses both by modeling a skill as a directed execution graph: discrete steps as nodes backed by deterministic scripts or natural-language descriptions, connected by explicit typed input/output edges, and governed by a schema-validated YAML specification. A compiler meta-skill translates existing human-written skills into this form. The benefits are twofold. First, compiling human-written skills to AIP raised Claude Sonnet's mean task reward from 0.60 to 0.71 and pass rate from 53% to 67% across 27 real agent tasks from SkillsBench - a statistically significant gain (Wilcoxon signed-rank p = 0.011), winning 12 tasks to 2 with 13 ties - often in less wall-clock time. The graph delivers vetted, runnable units to the agent rather than asking it to re-derive code, commands, and tool calls from natural language. Second, on creation and improvement, because each skill is schema-validated, functionally testable, and addressable node-by-node, failures can be diagnosed and repaired precisely. Two authored-skill failures were traced to the script level. After adjusting the AIP spec and recompiling, both recovered with zero regressions (one task going from 0/5 to 5/5), turning skill improvement into a measurable tuning loop rather than a prose rewrite. That same graph structure supports corpus-level governance and skill introspection, and provides a natural action space for reinforcement learning over skills.
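The abstract describes a skill as a directed graph whose nodes are scripts or natural-language steps, joined by typed input/output edges and checked against a schema. The following is a minimal Python sketch of that idea, not AIP's actual format: the `Node` and `Edge` classes, the `kind` values `"script"` and `"prompt"`, the type names, and the example skill are all hypothetical, since the abstract does not show the YAML specification. It validates that every edge joins ports of matching type and derives an execution order with the standard-library `graphlib` module (Python 3.9+).

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter

# Hypothetical node kinds; the abstract only says nodes are backed by
# "deterministic scripts or natural-language descriptions".
ALLOWED_KINDS = ("script", "prompt")


@dataclass
class Node:
    name: str
    kind: str
    inputs: dict = field(default_factory=dict)   # port name -> type name
    outputs: dict = field(default_factory=dict)  # port name -> type name


@dataclass
class Edge:
    src: str         # producing node
    src_output: str  # output port on the producing node
    dst: str         # consuming node
    dst_input: str   # input port on the consuming node


def validate(nodes, edges):
    """Return a list of human-readable errors; an empty list means valid."""
    errors = []
    by_name = {n.name: n for n in nodes}
    for n in nodes:
        if n.kind not in ALLOWED_KINDS:
            errors.append(f"{n.name}: unknown kind {n.kind!r}")
    for e in edges:
        src, dst = by_name.get(e.src), by_name.get(e.dst)
        if src is None or dst is None:
            errors.append(f"edge {e.src} -> {e.dst}: unknown node")
            continue
        out_type = src.outputs.get(e.src_output)
        in_type = dst.inputs.get(e.dst_input)
        if out_type is None or in_type is None:
            errors.append(f"edge {e.src} -> {e.dst}: unknown port")
        elif out_type != in_type:
            errors.append(
                f"edge {e.src} -> {e.dst}: type mismatch "
                f"({out_type} vs {in_type})"
            )
    return errors


def execution_order(nodes, edges):
    """Order nodes so each runs after its producers (CycleError on cycles)."""
    sorter = TopologicalSorter({n.name: set() for n in nodes})
    for e in edges:
        sorter.add(e.dst, e.src)  # e.dst depends on e.src
    return list(sorter.static_order())


# A made-up three-step skill: a script loads data, a natural-language
# step summarizes it, and a script writes the result.
NODES = [
    Node("load_csv", "script", outputs={"rows": "table"}),
    Node("summarize", "prompt", inputs={"rows": "table"},
         outputs={"report": "text"}),
    Node("write_file", "script", inputs={"report": "text"}),
]
EDGES = [
    Edge("load_csv", "rows", "summarize", "rows"),
    Edge("summarize", "report", "write_file", "report"),
]

if __name__ == "__main__":
    print(validate(NODES, EDGES))
    print(execution_order(NODES, EDGES))
```

Because each node and edge is addressable by name, a validation error points at one specific connection, which is the property the abstract credits for making failures diagnosable node by node.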
