Fugu-MT Paper Translation (Abstract): Agent Skills Enable a New Class of Realistic and Trivially Simple Prompt Injections

Paper overview: Agent Skills Enable a New Class of Realistic and Trivially Simple Prompt Injections

arxiv url: http://arxiv.org/abs/2510.26328v1
Date: Thu, 30 Oct 2025 10:27:11 GMT
Status: Translation complete
Last updated in system: 2025-10-31 16:05:09.756439
Title: Agent Skills Enable a New Class of Realistic and Trivially Simple Prompt Injections
Title (reference translation): Agent Skills enable a new class of realistic and trivially simple prompt injections
Authors: David Schmotz, Sahar Abdelnabi, Maksym Andriushchenko
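Abstract summary: A frontier LLM company took a step toward continual learning in LLMs by introducing Agent Skills. Agent Skills are shown to be fundamentally insecure because they enable trivially simple prompt injections. We demonstrate how to hide malicious instructions in long Agent Skill files and referenced scripts in order to exfiltrate sensitive data.
Reference score (independently computed attention score): 24.46526203453932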
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Enabling continual learning in LLMs remains a key unresolved research challenge. In a recent announcement, a frontier LLM company made a step towards this by introducing Agent Skills, a framework that equips agents with new knowledge based on instructions stored in simple markdown files. Although Agent Skills can be a very useful tool, we show that they are fundamentally insecure, since they enable trivially simple prompt injections. We demonstrate how to hide malicious instructions in long Agent Skill files and referenced scripts to exfiltrate sensitive data, such as internal files or passwords. Importantly, we show how to bypass system-level guardrails of a popular coding agent: a benign, task-specific approval with the "Don't ask again" option can carry over to closely related but harmful actions. Overall, we conclude that despite ongoing research efforts and scaling model capabilities, frontier LLMs remain vulnerable to very simple prompt injections in realistic scenarios. Our code is available at https://github.com/aisa-group/promptinject-agent-skills.
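Abstract (reference translation): Achieving continual learning in LLMs is still an unresolved research challenge. In a recent announcement, a frontier LLM company took a step in this direction by introducing Agent Skills, a framework that gives agents new knowledge based on instructions stored in simple markdown files. Agent Skills are a very useful tool, but we show that they are fundamentally insecure because they enable trivially simple prompt injections. We demonstrate how to hide malicious instructions in long Agent Skill files and referenced scripts to exfiltrate sensitive data such as internal files or passwords. Importantly, we show how to bypass the system-level guardrails of a popular coding agent: a benign, task-specific approval given with the "Don't ask again" option can carry over to closely related but harmful actions. Overall, we conclude that despite ongoing research efforts and scaling model capabilities, frontier LLMs remain vulnerable to very simple prompt injections in realistic scenarios. Our code is available at https://github.com/aisa-group/promptinject-agent-skills.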

Related papers