Fugu-MT Paper Translation (Abstract): Contractual Skills: A GovernSpec Design Framework for Enterprise AI Agents

Paper summary: Contractual Skills: A GovernSpec Design Framework for Enterprise AI Agents

arxiv url: http://arxiv.org/abs/2605.22634v2
Date: Sun, 24 May 2026 12:02:09 GMT
Status: Translation complete
Updated in system: 2026-05-26 16:32:37.965777
Title: Contractual Skills: A GovernSpec Design Framework for Enterprise AI Agents
Title (reference translation): Contractual Skills: A GovernSpec Design Framework for Enterprise AI Agents
Authors: Ting Liu
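Abstract summary: This paper proposes contractual skills, a GovernSpec-inspired design framework for organizing SKILL.md files as readable task contracts. The framework clarifies the boundary between contractual skills, GovernSpec YAML contracts, Model Context Protocol (MCP) surfaces, tool adapters, runtime guardrails, tracing, and evaluation systems.
Reference score (independently computed attention score): 8.419155861590548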
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Skills have become a practical packaging mechanism for agent instructions, workflows, scripts, and reference materials. In enterprise settings, however, a skill often needs to express more than task guidance: goals, input boundaries, permissions, human approval points, evidence requirements, output contracts, quality criteria, verification steps, and handoff rules. This paper proposes contractual skills, a GovernSpec-inspired design framework for organizing SKILL.md files as readable task contracts while preserving lightweight skill discovery and progressive loading. The framework clarifies the boundary between contractual skills, GovernSpec YAML contracts, Model Context Protocol (MCP) surfaces, tool adapters, runtime guardrails, tracing, and evaluation systems. We evaluate the framework with three offline empirical studies. The first text-generation experiment covers three enterprise skills, fifteen synthetic tasks, four instruction conditions, and eight generation models, producing 960 outputs and 1680 cross-judge score records. The second study is a public-skill A/B expansion: eight public skills are compared with contractual rewrites across forty-eight synthetic tasks, six generation models, two repeats, 1152 outputs, and two complete judge files. In this setting, contractual skills raise mean quality from 4.692 to 4.914 and reduce critical-error rate from 0.083 to 0.013. The third study is an offline tool-calling challenge with eight models and 192 simulated tool-call records. The results suggest that contractual skills are best understood as a governance layer that makes task intent, boundaries, and acceptance criteria explicit, not as a standalone safety mechanism.
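The contract elements the abstract lists for an enterprise skill (goals, input boundaries, permissions, and so on) can be pictured as fields of a small record. The sketch below is only an illustration under stated assumptions: the `ContractualSkill` class, the `missing_sections` helper, and the example skill name are hypothetical and do not come from the paper, and the actual SKILL.md layout or GovernSpec YAML schema may look quite different.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ContractualSkill:
    """Hypothetical record; field names mirror the elements listed in the abstract."""
    name: str
    goal: str = ""
    input_boundaries: list[str] = field(default_factory=list)
    permissions: list[str] = field(default_factory=list)
    human_approval_points: list[str] = field(default_factory=list)
    evidence_requirements: list[str] = field(default_factory=list)
    output_contract: str = ""
    quality_criteria: list[str] = field(default_factory=list)
    verification_steps: list[str] = field(default_factory=list)
    handoff_rules: list[str] = field(default_factory=list)


def missing_sections(skill: ContractualSkill) -> list[str]:
    """Return the names of contract sections that are still empty."""
    return [
        f.name
        for f in fields(skill)
        if f.name != "name" and not getattr(skill, f.name)
    ]


# Example with a made-up skill: only the goal and permissions are filled in.
draft = ContractualSkill(
    name="invoice-triage",
    goal="Classify incoming invoices by urgency",
    permissions=["read:invoices"],
)
print(missing_sections(draft))
```

A check like this only reports which sections are unfilled; in line with the abstract's framing, it makes intent and boundaries explicit and is not a safety mechanism on its own.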
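Abstract (reference translation): Skills have become a practical packaging mechanism for agent instructions, workflows, scripts, and reference materials. In enterprise settings, however, a skill often needs to express more than task guidance: goals, input boundaries, permissions, human approval points, evidence requirements, output contracts, quality criteria, verification steps, and handoff rules. This paper proposes contractual skills, a GovernSpec-inspired design framework that organizes SKILL.md files as readable task contracts while preserving lightweight skill discovery and progressive loading. The framework clarifies the boundary between contractual skills, GovernSpec YAML contracts, Model Context Protocol (MCP) surfaces, tool adapters, runtime guardrails, tracing, and evaluation systems. The framework is evaluated with three offline empirical studies. The first, a text-generation experiment, covers three enterprise skills, fifteen synthetic tasks, four instruction conditions, and eight generation models, producing 960 outputs and 1680 cross-judge score records. The second study is a public-skill A/B expansion: eight public skills are compared with their contractual rewrites across forty-eight synthetic tasks, six generation models, and two repeats, yielding 1152 outputs and two complete judge files. In this setting, contractual skills raise mean quality from 4.692 to 4.914 and reduce the critical-error rate from 0.083 to 0.013. The third study is an offline tool-calling challenge with eight models and 192 simulated tool-call records. The results suggest that contractual skills are best understood as a governance layer that makes task intent, boundaries, and acceptance criteria explicit, not as a standalone safety mechanism.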
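The figures reported for the second study can be sanity-checked with a few lines of arithmetic. The sketch below assumes that the 1152 outputs arise from two conditions per task (the original public skill and its contractual rewrite); the abstract describes an A/B comparison but does not spell out this factor, so it is labeled as an assumption. The derived quantities (absolute quality gain, relative reduction in critical-error rate) are computed here for illustration and are not reported in the abstract.

```python
# Counts stated in the abstract for the public-skill A/B study.
tasks = 48
models = 6
repeats = 2
conditions = 2  # assumption: original skill vs. contractual rewrite

outputs = tasks * models * repeats * conditions

# Reported means before and after the contractual rewrite.
quality_before, quality_after = 4.692, 4.914
error_before, error_after = 0.083, 0.013

quality_gain = quality_after - quality_before
relative_error_reduction = (error_before - error_after) / error_before

print(f"outputs: {outputs}")
print(f"quality gain: {quality_gain:.3f}")
print(f"relative critical-error reduction: {relative_error_reduction:.1%}")
```

Under that assumption the product matches the 1152 outputs stated in the abstract, and the quality gain works out to roughly 0.22 points alongside a relative drop in critical errors of about 84%.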

Related papers