Fugu-MT Paper Translation (Summary): AgentRAE: Remote Action Execution through Notification-based Visual Backdoors against Screenshots-based Mobile GUI Agents

Paper summary: AgentRAE: Remote Action Execution through Notification-based Visual Backdoors against Screenshots-based Mobile GUI Agents

arxiv url: http://arxiv.org/abs/2603.23007v1
Date: Tue, 24 Mar 2026 09:51:43 GMT
Status: Translation complete
Last updated in system: 2026-03-25 19:53:37.415799
Title: AgentRAE: Remote Action Execution through Notification-based Visual Backdoors against Screenshots-based Mobile GUI Agents
Authors: Yutao Luo, Haotian Zhu, Shuchao Pang, Zhigang Lu, Tian Dong, Yongbin Zhou, Minhui Xue
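Abstract summary: Mobile graphical user interface (GUI) agents autonomously control applications and operating systems (OS). This paper proposes AgentRAE, a novel backdoor attack that induces remote action execution in mobile GUI agents using visually natural triggers. The evaluation shows that the proposed backdoor preserves clean performance with an attack success rate of over 90% across ten mobile operations.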
Reference score (independently calculated attention score): 18.82273534480229
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The rapid adoption of mobile graphical user interface (GUI) agents, which autonomously control applications and operating systems (OS), exposes new system-level attack surfaces. Existing backdoors against web GUI agents and general GenAI models rely on environmental injection or deceptive pop-ups to mislead the agent's operation. However, these techniques do not work on screenshots-based mobile GUI agents because of three challenges: restricted trigger design spaces, OS background interference, and conflicts among multiple trigger-action mappings. We propose AgentRAE, a novel backdoor attack capable of inducing Remote Action Execution in mobile GUI agents using visually natural triggers (e.g., benign app icons in notifications). To address the underfitting caused by natural triggers and achieve accurate multi-target action redirection, we design a novel two-stage pipeline that first enhances the agent's sensitivity to subtle iconographic differences via contrastive learning, and then associates each trigger with a specific mobile GUI agent action through backdoor post-training. Our extensive evaluation reveals that the proposed backdoor preserves clean performance with an attack success rate of over 90% across ten mobile operations. Furthermore, the benign-looking triggers are hard to detect visually, and the attack circumvents eight representative state-of-the-art defenses. These results expose an overlooked backdoor vector in mobile GUI agents, underscoring the need for defenses that scrutinize notification-conditioned behaviors and internal agent representations.


Related papers