Fugu-MT Paper Translation (Abstract): Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis

Paper overview: Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis

arxiv url: http://arxiv.org/abs/2605.00314v1
Date: Fri, 01 May 2026 00:48:47 GMT
Status: Translation complete
System update date: 2026-05-04 17:43:28.799913
Title: Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis
Title (reference translation): Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis
Authors: Hongbo Wen, Ying Li, Hanzhi Liu, Chaofan Shou, Yanju Chen, Yuan Tian, Yu Feng
Abstract summary: We introduce Semia, a static auditing tool for agent skills. Semia lifts each skill into the Skill Description Language (SDL), a Datalog fact base. We evaluate 13,728 real-world skills from public marketplaces.
Reference score (independently computed attention score): 11.214847720120192
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: An agent skill is a configuration package that equips an LLM-driven agent with a concrete capability, such as reading email, executing shell commands, or signing blockchain transactions. Each skill is a hybrid artifact: a structured half declares executable interfaces, while a prose half dictates when and how those interfaces fire, and the prose is reinterpreted probabilistically on every invocation. Conventional static analyzers parse the structured half but ignore the prose; LLM-based tools read the prose but cannot reproducibly prove that a tainted input reaches a high-impact sink. We present Semia, a static auditor for agent skills. Semia lifts each skill into the Skill Description Language (SDL), a Datalog fact base that captures LLM-triggered actions, prose-defined conditions, and human-in-the-loop checkpoints. Synthesizing a fact base that is both structurally sound and semantically faithful to the original prose is the central challenge; we address it with Constraint-Guided Representation Synthesis (CGRS), a propose-verify-evaluate loop that refines LLM candidates until convergence. Security properties over an agent skill (e.g., indirect injection, secret leakage, confused deputies, and unguarded sinks) can then be reduced to Datalog reachability queries. We evaluate Semia on 13,728 real-world skills from public marketplaces. Semia renders all of them auditable and finds that more than half carry at least one critical semantic risk. On a stratified sample of 541 expert-labeled skills, Semia achieves 97.7% recall and an F1 of 90.6%, substantially outperforming signature-based scanners and LLM baselines.
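Abstract (reference translation): An agent skill is a configuration package that equips an LLM-driven agent with a concrete capability, such as reading email, executing shell commands, or signing blockchain transactions. Each skill is a hybrid artifact: the structured half declares executable interfaces, the prose half dictates when and how those interfaces fire, and the prose is reinterpreted probabilistically on every invocation. Conventional static analyzers parse the structured half but ignore the prose; LLM-based tools read the prose but cannot reproducibly prove that a tainted input reaches a high-impact sink. We introduce Semia, a static auditing tool for agent skills. Semia lifts each skill into SDL (Skill Description Language), a Datalog fact base that captures LLM-triggered actions, prose-defined conditions, and human-in-the-loop checkpoints. Using Constraint-Guided Representation Synthesis (CGRS), a propose-verify-evaluate loop that refines LLM candidates until convergence, we synthesize a fact base that is both structurally sound and semantically faithful. Security properties of an agent skill (for example, indirect injection, secret leakage, confused deputies, and unguarded sinks) then reduce to Datalog reachability queries. We evaluate 13,728 real-world skills from public marketplaces. Semia renders all of them auditable and finds that more than half carry at least one critical semantic risk. On a stratified sample of 541 expert-labeled skills, Semia achieves 97.7% recall and an F1 of 90.6%, substantially outperforming signature-based scanners and LLM baselines.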
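
The abstract's reduction of security properties to Datalog reachability can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: the predicates (`flow`, `tainted`, `sink`, `checkpoint`), the example skill, and the rule that a human-in-the-loop checkpoint blocks propagation. The abstract does not give SDL's actual schema, and this is not Semia's implementation.

```python
from collections import deque

# Hypothetical fact base for one skill. Predicate and node names are
# illustrative assumptions, not SDL's real schema.
FLOWS = {                                   # flow(src, dst)
    ("email_body", "summary"),
    ("summary", "shell_exec"),
    ("web_page", "confirm_dialog"),
    ("confirm_dialog", "sign_tx"),
}
TAINTED = {"email_body", "web_page"}        # attacker-controllable inputs
SINKS = {"shell_exec", "sign_tx"}           # high-impact actions
CHECKPOINTS = {"confirm_dialog"}            # human-in-the-loop approvals


def unguarded_reach(flows, sources, checkpoints):
    """Return nodes reachable from sources without passing a checkpoint.

    Mirrors the Datalog rules (assumed, for illustration):
        reach(X) :- tainted(X).
        reach(Y) :- reach(X), flow(X, Y), not checkpoint(Y).
    """
    successors = {}
    for src, dst in flows:
        successors.setdefault(src, set()).add(dst)

    reached = set(sources)
    queue = deque(reached)
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, ()):
            # A checkpoint stops propagation; visited nodes are skipped.
            if nxt in checkpoints or nxt in reached:
                continue
            reached.add(nxt)
            queue.append(nxt)
    return reached


def audit(flows, tainted, sinks, checkpoints):
    """List sinks that tainted data reaches with no checkpoint on the path."""
    return sorted(unguarded_reach(flows, tainted, checkpoints) & sinks)


findings = audit(FLOWS, TAINTED, SINKS, CHECKPOINTS)
print(findings)
```

In this toy fact base, the email-to-shell path is reported because nothing guards it, while the transaction-signing path is not, since it passes through the confirmation step. Because the query is a fixed-point computation over facts, the same fact base always yields the same findings, which is the reproducibility that the abstract contrasts with LLM-only tools.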

Related papers