State Backdoor: Towards Stealthy Real-world Poisoning Attack on Vision-Language-Action Model in State Space
- URL: http://arxiv.org/abs/2601.04266v1
- Date: Wed, 07 Jan 2026 08:54:31 GMT
- Title: State Backdoor: Towards Stealthy Real-world Poisoning Attack on Vision-Language-Action Model in State Space
- Authors: Ji Guo, Wenbo Jiang, Yansong Lin, Yijing Liu, Ruichen Zhang, Guomin Lu, Aiguo Chen, Xinshuo Han, Hongwei Li, Dusit Niyato
- Abstract summary: Vision-Language-Action (VLA) models are widely deployed in safety-critical embodied AI applications such as robotics. We introduce the State Backdoor, a novel and practical backdoor attack that leverages the robot arm's initial state as the trigger. Our method achieves over 90% attack success rate without affecting benign task performance, revealing an underexplored vulnerability in embodied AI systems.
- Score: 42.234025453061875
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vision-Language-Action (VLA) models are widely deployed in safety-critical embodied AI applications such as robotics. However, their complex multimodal interactions also expose new security vulnerabilities. In this paper, we investigate a backdoor threat in VLA models, where malicious inputs cause targeted misbehavior while preserving performance on clean data. Existing backdoor methods predominantly rely on inserting visible triggers into the visual modality, which suffer from poor robustness and limited stealthiness in real-world settings due to environmental variability. To overcome these limitations, we introduce the State Backdoor, a novel and practical backdoor attack that leverages the robot arm's initial state as the trigger. To optimize the trigger for stealthiness and effectiveness, we design a Preference-guided Genetic Algorithm (PGA) that efficiently searches the state space for minimal yet potent triggers. Extensive experiments on five representative VLA models and five real-world tasks show that our method achieves over 90% attack success rate without affecting benign task performance, revealing an underexplored vulnerability in embodied AI systems.
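The abstract describes the Preference-guided Genetic Algorithm (PGA) only at a high level, so the sketch below is a minimal, hypothetical illustration of the general idea rather than the authors' method: evolve candidate initial joint states, scoring each candidate by a placeholder attack-effectiveness term minus a stealth penalty that keeps the trigger close to a nominal start pose. All names, bounds, hyperparameters, and the synthetic effectiveness function are assumptions.

```python
# Hypothetical sketch of a genetic search over initial joint states, in the
# spirit of the paper's PGA. Every constant and function here is a placeholder,
# not the authors' implementation.
import numpy as np

NUM_JOINTS = 7                          # assumed 7-DoF arm
NOMINAL_STATE = np.zeros(NUM_JOINTS)    # assumed benign "home" pose (radians)
JOINT_OFFSET_BOUNDS = (-0.3, 0.3)       # assumed search range around the nominal pose

def attack_effectiveness(state):
    """Placeholder stand-in: the real score would come from rolling out the
    poisoned VLA policy from this initial state and measuring how often the
    targeted misbehavior occurs. A synthetic peak is used so the sketch runs."""
    return float(np.exp(-np.sum((state - 0.15) ** 2) / 0.02))

def fitness(state, stealth_weight=0.5):
    # Trade off effectiveness against stealth: a start pose far from the nominal
    # one is easier for an operator to notice, so its distance is penalized.
    return attack_effectiveness(state) - stealth_weight * np.linalg.norm(state - NOMINAL_STATE)

def evolve(pop_size=32, generations=50, mutation_std=0.02, elite_frac=0.25, seed=0):
    rng = np.random.default_rng(seed)
    low, high = JOINT_OFFSET_BOUNDS
    pop = NOMINAL_STATE + rng.uniform(low, high, size=(pop_size, NUM_JOINTS))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        elite = pop[np.argsort(scores)[::-1][: max(2, int(elite_frac * pop_size))]]
        children = []
        while len(children) < pop_size - len(elite):
            a, b = elite[rng.integers(len(elite), size=2)]
            mask = rng.random(NUM_JOINTS) < 0.5                    # uniform crossover
            child = np.where(mask, a, b) + rng.normal(0.0, mutation_std, NUM_JOINTS)
            children.append(np.clip(child, NOMINAL_STATE + low, NOMINAL_STATE + high))
        pop = np.vstack([elite, np.asarray(children)])
    scores = np.array([fitness(ind) for ind in pop])
    return pop[int(np.argmax(scores))]   # best trigger state under this toy fitness

if __name__ == "__main__":
    trigger_state = evolve()
    print("candidate trigger (joint offsets):", np.round(trigger_state, 3))
```

In the paper's setting, the effectiveness term would be estimated by executing the poisoned policy from each candidate state, and the preference guidance would presumably shape the search more directly than this simple weighted sum.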
Related papers
- SilentDrift: Exploiting Action Chunking for Stealthy Backdoor Attacks on Vision-Language-Action Models [43.17292256124026]
Vision-Language-Action (VLA) models are increasingly deployed in safety-critical robotic applications. We identify a fundamental security flaw in modern VLA systems. We propose SILENTDRIFT, a stealthy black-box backdoor attack exploiting this vulnerability.
arXiv Detail & Related papers (2026-01-20T01:24:17Z)
- AttackVLA: Benchmarking Adversarial and Backdoor Attacks on Vision-Language-Action Models [60.39655329875822]
Vision-Language-Action (VLA) models enable robots to interpret natural-language instructions and perform diverse tasks. Despite growing interest in attacking such models, the effectiveness of existing techniques remains unclear. We propose AttackVLA, a unified framework that aligns with the VLA development lifecycle.
arXiv Detail & Related papers (2025-11-15T10:30:46Z)
- TabVLA: Targeted Backdoor Attacks on Vision-Language-Action Models [63.51290426425441]
A backdoored VLA agent can be covertly triggered by a pre-injected backdoor to execute adversarial actions. We study targeted backdoor attacks on VLA models and introduce TabVLA, a novel framework that enables such attacks via black-box fine-tuning. Our work highlights the vulnerability of VLA models to targeted backdoor manipulation and underscores the need for more advanced defenses.
arXiv Detail & Related papers (2025-10-13T02:45:48Z)
- FreezeVLA: Action-Freezing Attacks against Vision-Language-Action Models [124.02734355214325]
Vision-Language-Action (VLA) models are driving rapid progress in robotics. Adversarial images can "freeze" VLA models and cause them to ignore subsequent instructions. FreezeVLA generates and evaluates action-freezing attacks via min-max bi-level optimization.
arXiv Detail & Related papers (2025-09-24T08:15:28Z)
- BadVLA: Towards Backdoor Attacks on Vision-Language-Action Models via Objective-Decoupled Optimization [45.97834622654751]
BadVLA is a backdoor attack method based on Objective-Decoupled Optimization. We show that BadVLA consistently achieves near-100% attack success rates with minimal impact on clean task accuracy. Our work offers the first systematic investigation of backdoor vulnerabilities in VLA models.
arXiv Detail & Related papers (2025-05-22T13:12:46Z)
- Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics [68.36528819227641]
This paper systematically evaluates the robustness of Vision-Language-Action (VLA) models. We introduce two untargeted attack objectives that leverage spatial foundations to destabilize robotic actions, and a targeted attack objective that manipulates the robotic trajectory. We design an adversarial patch generation approach that places a small, colorful patch within the camera's view, effectively executing the attack in both digital and physical environments.
arXiv Detail & Related papers (2024-11-18T01:52:20Z)
- BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models [57.5404308854535]
Safety backdoor attacks in large language models (LLMs) enable the stealthy triggering of unsafe behaviors while evading detection during normal interactions.
We present BEEAR, a mitigation approach leveraging the insight that backdoor triggers induce relatively uniform drifts in the model's embedding space.
Our bi-level optimization method identifies universal embedding perturbations that elicit unwanted behaviors and adjusts the model parameters to reinforce safe behaviors against these perturbations.
arXiv Detail & Related papers (2024-06-24T19:29:47Z)