Fugu-MT Paper Translation (Abstract): TabVLA: Targeted Backdoor Attacks on Vision-Language-Action Models

Paper overview: TabVLA: Targeted Backdoor Attacks on Vision-Language-Action Models

arxiv url: http://arxiv.org/abs/2510.10932v1
Date: Mon, 13 Oct 2025 02:45:48 GMT
Status: Translation complete
Last updated in system: 2025-10-14 18:06:30.159883
Title: TabVLA: Targeted Backdoor Attacks on Vision-Language-Action Models
Authors: Zonghuan Xu, Xiang Zheng, Xingjun Ma, Yu-Gang Jiang
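Abstract summary: A backdoored VLA agent can be covertly triggered by a pre-injected backdoor to execute adversarial actions. The paper studies targeted backdoor attacks on VLA models and introduces TabVLA, a novel framework that enables such attacks via black-box fine-tuning. The work highlights the vulnerability of VLA models to targeted backdoor manipulation and underscores the need for more advanced defenses.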
Reference score (attention score computed independently by this site): 63.51290426425441
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: With the growing deployment of Vision-Language-Action (VLA) models in real-world embodied AI systems, their increasing vulnerability to backdoor attacks poses a serious safety threat. A backdoored VLA agent can be covertly triggered by a pre-injected backdoor to execute adversarial actions, potentially causing system failures or even physical harm. Although backdoor attacks on VLA models have been explored, prior work has focused only on untargeted attacks, leaving the more practically threatening scenario of targeted manipulation unexamined. In this paper, we study targeted backdoor attacks on VLA models and introduce TabVLA, a novel framework that enables such attacks via black-box fine-tuning. TabVLA explores two deployment-relevant inference-time threat models: input-stream editing and in-scene triggering. It formulates poisoned data generation as an optimization problem to improve attack effectiveness. Experiments with OpenVLA-7B on the LIBERO benchmark reveal that the vision channel is the principal attack surface: targeted backdoors succeed with minimal poisoning, remain robust across variations in trigger design, and are degraded only by positional mismatches between fine-tuning and inference triggers. We also investigate a potential detection-based defense against TabVLA, which reconstructs latent visual triggers from the input stream to flag activation-conditioned backdoor samples. Our work highlights the vulnerability of VLA models to targeted backdoor manipulation and underscores the need for more advanced defenses.
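Note on positional mismatch: the abstract reports that the attack is degraded only when the trigger's position at inference differs from its position during fine-tuning. One simple way an analyst might quantify such a mismatch is the intersection-over-union (IoU) of the two trigger regions. The sketch below is purely illustrative: the abstract does not say how the paper measures mismatch, and `Box` and `overlap_ratio` are hypothetical names introduced here, not part of TabVLA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned region in pixel coordinates (hypothetical helper type)."""
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height


def overlap_ratio(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes.

    1.0 means identical placement, 0.0 means no overlap at all.
    This metric is an assumption for illustration, not the paper's.
    """
    # Width and height of the overlapping rectangle (zero if disjoint).
    ix = max(0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = ix * iy
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union else 0.0


# Assumed placements: one region used at fine-tuning time,
# three candidate regions observed at inference time.
train_pos = Box(x=0, y=0, w=16, h=16)
same_pos = Box(x=0, y=0, w=16, h=16)
shifted = Box(x=8, y=0, w=16, h=16)
far = Box(x=100, y=100, w=16, h=16)

for name, box in [("same", same_pos), ("shifted", shifted), ("far", far)]:
    print(f"{name}: IoU with fine-tuning position = {overlap_ratio(train_pos, box):.3f}")
```

Under this assumed metric, a lower IoU corresponds to a larger positional mismatch; how attack success actually varies with mismatch is an empirical question addressed in the paper, not by this sketch.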
