Fugu-MT Paper Translation (Abstract): Effective and Stealthy One-Shot Jailbreaks on Deployed Mobile Vision-Language Agents

Paper overview: Effective and Stealthy One-Shot Jailbreaks on Deployed Mobile Vision-Language Agents

arxiv url: http://arxiv.org/abs/2510.07809v1
Date: Thu, 09 Oct 2025 05:34:57 GMT
Status: Translation complete
Last updated in system: 2025-10-10 17:54:14.888156
Title: Effective and Stealthy One-Shot Jailbreaks on Deployed Mobile Vision-Language Agents
Authors: Renhua Ding, Xiao Yang, Zhengwei Fang, Jun Luo, Kun He, Jun Zhu
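Abstract summary: We present a one-shot jailbreak attack that leverages in-app prompt injections. Malicious apps embed short prompts in UI text that stay hidden during human interaction but are revealed when an agent drives the UI via ADB. Our framework comprises three key components: (1) low-privilege perception-chain targeting, which injects payloads into malicious apps as the agent's visual inputs; (2) a touch-based trigger that discriminates agent from human touches using physical touch attributes and exposes the payload only during agent operation; and (3) one-shot prompt efficacy, achieved with a heuristic-guided, character-level search.
Reference score (independently computed attention score): 29.62914440645731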
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large vision-language models (LVLMs) enable autonomous mobile agents to operate smartphone user interfaces, yet vulnerabilities to UI-level attacks remain critically understudied. Existing research often depends on conspicuous UI overlays, elevated permissions, or impractical threat models, limiting stealth and real-world applicability. In this paper, we present a practical and stealthy one-shot jailbreak attack that leverages in-app prompt injections: malicious applications embed short prompts in UI text that remain inert during human interaction but are revealed when an agent drives the UI via ADB (Android Debug Bridge). Our framework comprises three crucial components: (1) low-privilege perception-chain targeting, which injects payloads into malicious apps as the agent's visual inputs; (2) stealthy user-invisible activation, a touch-based trigger that discriminates agent from human touches using physical touch attributes and exposes the payload only during agent operation; and (3) one-shot prompt efficacy, a heuristic-guided, character-level iterative-deepening search algorithm (HG-IDA*) that performs one-shot, keyword-level detoxification to evade on-device safety filters. We evaluate across multiple LVLM backends, including closed-source services and representative open-source models within three Android applications, and we observe high planning and execution hijack rates in single-shot scenarios (e.g., GPT-4o: 82.5% planning / 75.0% execution). These findings expose a fundamental security vulnerability in current mobile agents with immediate implications for autonomous smartphone operation.


Related papers