Fugu-MT paper translation (abstract): SlowBA: An efficiency backdoor attack towards VLM-based GUI agents

Paper overview: SlowBA: An efficiency backdoor attack towards VLM-based GUI agents

arxiv url: http://arxiv.org/abs/2603.08316v2
Date: Tue, 10 Mar 2026 11:10:35 GMT
Status: Translation complete
Last updated in system: 2026-03-11 12:59:13.120113
Title: SlowBA: An efficiency backdoor attack towards VLM-based GUI agents
Title (reference translation): SlowBA: An efficiency backdoor attack on VLM-based GUI agents
Authors: Junxian Li, Tu Lan, Haozhen Tan, Yan Meng, Haojin Zhu
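Abstract summary: This paper introduces SlowBA, a novel backdoor attack that targets the responsiveness of VLM-based GUI agents. The key idea is to manipulate response latency by inducing excessively long reasoning chains under specific trigger patterns. Experiments show that SlowBA can significantly increase response length and latency while largely preserving task accuracy.
Reference score (independently computed attention measure): 13.613479645526334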
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Modern vision-language-model (VLM) based graphical user interface (GUI) agents are expected not only to execute actions accurately but also to respond to user instructions with low latency. While existing research on GUI-agent security mainly focuses on manipulating action correctness, the security risks related to response efficiency remain largely unexplored. In this paper, we introduce SlowBA, a novel backdoor attack that targets the responsiveness of VLM-based GUI agents. The key idea is to manipulate response latency by inducing excessively long reasoning chains under specific trigger patterns. To achieve this, we propose a two-stage reward-level backdoor injection (RBI) strategy that first aligns the long-response format and then learns trigger-aware activation through reinforcement learning. In addition, we design realistic pop-up windows as triggers that naturally appear in GUI environments, improving the stealthiness of the attack. Extensive experiments across multiple datasets and baselines demonstrate that SlowBA can significantly increase response length and latency while largely preserving task accuracy. The attack remains effective even with a small poisoning ratio and under several defense settings. These findings reveal a previously overlooked security vulnerability in GUI agents and highlight the need for defenses that consider both action correctness and response efficiency. Code can be found in https://github.com/tu-tuing/SlowBA.
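Abstract (reference translation): Modern vision-language-model (VLM) based GUI agents are expected not only to execute actions accurately but also to respond to user instructions with low latency. Existing research on GUI-agent security focuses mainly on manipulating action correctness, while security risks related to response efficiency remain largely unexplored. This paper introduces SlowBA, a novel backdoor attack that targets the responsiveness of VLM-based GUI agents. The key idea is to manipulate response latency by inducing excessively long reasoning chains under specific trigger patterns. To this end, the paper proposes a two-stage reward-level backdoor injection (RBI) strategy. It also designs realistic pop-up windows, which appear naturally in GUI environments, as triggers, improving the stealthiness of the attack. Extensive experiments across multiple datasets and baselines show that SlowBA can significantly increase response length and latency while largely preserving task accuracy. The attack remains effective even with a small poisoning ratio and under several defense settings. These findings reveal a previously overlooked security vulnerability in GUI agents and highlight the need for defenses that consider both action correctness and response efficiency. The code is available at https://github.com/tu-tuing/SlowBA.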

