When Robots Obey the Patch: Universal Transferable Patch Attacks on Vision-Language-Action Models
- URL: http://arxiv.org/abs/2511.21192v2
- Date: Sun, 30 Nov 2025 06:53:52 GMT
- Title: When Robots Obey the Patch: Universal Transferable Patch Attacks on Vision-Language-Action Models
- Authors: Hui Lu, Yi Yu, Yiming Yang, Chenyu Yi, Qixin Zhang, Bingquan Shen, Alex C. Kot, Xudong Jiang
- Abstract summary: Vision-Language-Action (VLA) models are vulnerable to adversarial attacks, yet universal and transferable attacks remain underexplored. We introduce UPA-RFAS (Universal Patch Attack via Robust Feature, Attention, and Semantics), a unified framework that learns a single physical patch in a shared feature space. Experiments across diverse VLA models, manipulation suites, and physical executions show that UPA-RFAS consistently transfers across models, tasks, and viewpoints.
- Score: 81.7618160628979
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vision-Language-Action (VLA) models are vulnerable to adversarial attacks, yet universal and transferable attacks remain underexplored, as most existing patches overfit to a single model and fail in black-box settings. To address this gap, we present a systematic study of universal, transferable adversarial patches against VLA-driven robots under unknown architectures, finetuned variants, and sim-to-real shifts. We introduce UPA-RFAS (Universal Patch Attack via Robust Feature, Attention, and Semantics), a unified framework that learns a single physical patch in a shared feature space while promoting cross-model transfer. UPA-RFAS combines (i) a feature-space objective with an $\ell_1$ deviation prior and repulsive InfoNCE loss to induce transferable representation shifts, (ii) a robustness-augmented two-phase min-max procedure where an inner loop learns invisible sample-wise perturbations and an outer loop optimizes the universal patch against this hardened neighborhood, and (iii) two VLA-specific losses: Patch Attention Dominance to hijack text$\to$vision attention and Patch Semantic Misalignment to induce image-text mismatch without labels. Experiments across diverse VLA models, manipulation suites, and physical executions show that UPA-RFAS consistently transfers across models, tasks, and viewpoints, exposing a practical patch-based attack surface and establishing a strong baseline for future defenses.
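The "repulsive InfoNCE" component of the feature-space objective can be sketched roughly as follows. This is a minimal illustrative implementation assuming a standard cosine-similarity InfoNCE formulation with the sign flipped so that the patched feature is pushed *away* from its own clean counterpart; the function names, the batch-as-negatives setup, and the temperature value are assumptions, not the paper's actual code.

```python
# Hedged sketch: a "repulsive" InfoNCE term that penalizes similarity
# between patched (adversarial) features and their clean counterparts,
# inducing the kind of transferable representation shift the abstract
# describes. Formulation and names are illustrative assumptions.
import numpy as np

def cosine_sim(a, b):
    """Pairwise cosine similarity between rows of two (N, D) matrices."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T  # (N, N)

def repulsive_infonce(f_adv, f_clean, tau=0.1):
    """Negated InfoNCE: minimizing this value DRIVES DOWN the similarity
    between each patched feature and its own clean feature, with the rest
    of the batch serving as the contrastive pool."""
    sim = cosine_sim(f_adv, f_clean) / tau                      # (N, N)
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    # Standard InfoNCE would minimize -diag(log_prob) (attraction);
    # the repulsive variant minimizes +diag(log_prob) (repulsion).
    return float(np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
clean = rng.normal(size=(8, 16))
aligned = repulsive_infonce(clean, clean)    # adv identical to clean: loss high
shifted = repulsive_infonce(-clean, clean)   # adv pushed opposite: loss low
print(aligned > shifted)
```

In the paper's two-phase min-max procedure, a term of this shape would sit in the outer loop: an inner loop hardens each sample with small invisible perturbations, and the universal patch is then optimized against that hardened neighborhood.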
Related papers
- Multi-Faceted Attack: Exposing Cross-Model Vulnerabilities in Defense-Equipped Vision-Language Models [54.61181161508336]
We introduce Multi-Faceted Attack (MFA), a framework that exposes general safety vulnerabilities in leading defense-equipped Vision-Language Models (VLMs). The core component of MFA is the Attention-Transfer Attack (ATA), which hides harmful instructions inside a meta task with competing objectives. MFA achieves a 58.5% success rate and consistently outperforms existing methods.
arXiv Detail & Related papers (2025-11-20T07:12:54Z)
- Model-agnostic Adversarial Attack and Defense for Vision-Language-Action Models [25.45513133247862]
Vision-Language-Action (VLA) models have achieved revolutionary progress in robot learning. Despite this progress, their adversarial robustness remains underexplored. We propose both an adversarial patch attack and corresponding defense strategies for VLA models.
arXiv Detail & Related papers (2025-10-15T07:42:44Z)
- TabVLA: Targeted Backdoor Attacks on Vision-Language-Action Models [63.51290426425441]
A backdoored VLA agent can be covertly triggered by a pre-injected backdoor to execute adversarial actions. We study targeted backdoor attacks on VLA models and introduce TabVLA, a novel framework that enables such attacks via black-box fine-tuning. Our work highlights the vulnerability of VLA models to targeted backdoor manipulation and underscores the need for more advanced defenses.
arXiv Detail & Related papers (2025-10-13T02:45:48Z)
- Universal Camouflage Attack on Vision-Language Models for Autonomous Driving [67.34987318443761]
Visual language modeling for automated driving is emerging as a promising research direction. Yet VLM-AD remains vulnerable to serious security threats from adversarial attacks. We propose the first Universal Camouflage Attack framework for VLM-AD.
arXiv Detail & Related papers (2025-09-24T14:52:01Z)
- IAP: Invisible Adversarial Patch Attack through Perceptibility-Aware Localization and Perturbation Optimization [3.096869664709865]
Adversarial patches can drastically change the prediction of computer vision models. We introduce IAP, a novel attack framework that generates highly invisible adversarial patches. IAP consistently achieves competitive attack success rates in targeted settings.
arXiv Detail & Related papers (2025-07-09T13:58:40Z)
- Boosting Adversarial Transferability with Spatial Adversarial Alignment [56.97809949196889]
Deep neural networks are vulnerable to adversarial examples that exhibit transferability across various models. We propose a technique that employs an alignment loss and leverages a witness model to fine-tune the surrogate model. Experiments on various architectures on ImageNet show that surrogate models aligned via SAA yield adversarial examples with higher transferability.
arXiv Detail & Related papers (2025-01-02T02:35:47Z)
- Generative Adversarial Patches for Physical Attacks on Cross-Modal Pedestrian Re-Identification [24.962600785183582]
Visible-infrared pedestrian Re-identification (VI-ReID) aims to match pedestrian images captured by infrared cameras and visible cameras.
This paper introduces the first physical adversarial attack against VI-ReID models.
arXiv Detail & Related papers (2024-10-26T06:40:10Z)
- PG-Attack: A Precision-Guided Adversarial Attack Framework Against Vision Foundation Models for Autonomous Driving [23.13958600806388]
Vision foundation models are increasingly employed in autonomous driving systems due to their advanced capabilities.
These models are susceptible to adversarial attacks, posing significant risks to the reliability and safety of autonomous vehicles.
We propose a novel Precision-Guided Adversarial Attack framework that combines two techniques: Precision Mask Perturbation Attack and Deceptive Text Patch Attack.
arXiv Detail & Related papers (2024-07-18T02:39:31Z)
- Towards Robust Semantic Segmentation against Patch-based Attack via Attention Refinement [68.31147013783387]
We observe that the attention mechanism is vulnerable to patch-based adversarial attacks.
In this paper, we propose a Robust Attention Mechanism (RAM) to improve the robustness of the semantic segmentation model.
arXiv Detail & Related papers (2024-01-03T13:58:35Z)
- Generating Transferable and Stealthy Adversarial Patch via Attention-guided Adversarial Inpainting [12.974292128917222]
We propose an innovative two-stage adversarial patch attack called Adv-Inpainting.
In the first stage, we extract style features and identity features from the attacker and target faces, respectively.
The proposed layer can adaptively fuse identity and style embeddings by fully exploiting priority contextual information.
In the second stage, we design an Adversarial Patch Refinement Network (APR-Net) with a novel boundary variance loss.
arXiv Detail & Related papers (2023-08-10T03:44:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.