Related papers: Beyond the Patch: Exploring Vulnerabilities of Visuomotor Policies via Viewpoint-Consistent 3D Adversarial Object

URL: http://arxiv.org/abs/2603.04913v1
Date: Thu, 05 Mar 2026 07:57:47 GMT
Title: Beyond the Patch: Exploring Vulnerabilities of Visuomotor Policies via Viewpoint-Consistent 3D Adversarial Object
Authors: Chanmi Lee, Minsung Yoon, Woojae Kim, Sebin Lee, Sung-eui Yoon
Abstract summary: This work proposes a viewpoint-consistent adversarial texture optimization method for 3D objects through differentiable rendering. As optimization strategies, we employ Expectation over Transformation (EOT) with a Coarse-to-Fine (C2F) curriculum. We further integrate saliency-guided perturbations to redirect policy attention and design a targeted loss that persistently drives robots toward adversarial objects.
Score: 26.15314358613966
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Neural network-based visuomotor policies enable robots to perform manipulation tasks but remain susceptible to perceptual attacks. For example, conventional 2D adversarial patches are effective under fixed-camera setups, where appearance is relatively consistent; however, their efficacy often diminishes under dynamic viewpoints from moving cameras, such as wrist-mounted setups, due to perspective distortions. To proactively investigate potential vulnerabilities beyond 2D patches, this work proposes a viewpoint-consistent adversarial texture optimization method for 3D objects through differentiable rendering. As optimization strategies, we employ Expectation over Transformation (EOT) with a Coarse-to-Fine (C2F) curriculum, exploiting distance-dependent frequency characteristics to induce textures effective across varying camera-object distances. We further integrate saliency-guided perturbations to redirect policy attention and design a targeted loss that persistently drives robots toward adversarial objects. Our comprehensive experiments show that the proposed method is effective under various environmental conditions, while confirming its black-box transferability and real-world applicability.
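The abstract names Expectation over Transformation (EOT) and a coarse-to-fine schedule without giving details, so the following is only a toy sketch of the general idea, not the authors' method. Everything in it is an assumption made for illustration: the "texture" is a 1D list of 16 values, the "policy" is a hypothetical linear score, the "viewpoint change" is a random brightness scale plus a one-texel circular shift (standing in for differentiable rendering), and the function names are invented here. The sketch averages gradients over sampled transformations (EOT) and optimizes the texture at 4, then 8, then 16 texels (coarse to fine).

```python
import random

def transform(texture, scale, shift):
    """Toy stand-in for a viewpoint change: brightness scale plus circular shift."""
    n = len(texture)
    return [scale * texture[(i - shift) % n] for i in range(n)]

def upsample(coarse, n):
    """Nearest-neighbour upsampling of a coarse texture to n texels."""
    block = n // len(coarse)
    return [coarse[i // block] for i in range(n)]

def eot_gradient(weights, n, samples, rng):
    """Monte Carlo estimate of the gradient of E_t[score(t(x))] w.r.t. x.

    The hypothetical score is linear, sum_i w[i] * view[i], so the gradient
    does not depend on the current texture; a real policy would need it.
    """
    grad = [0.0] * n
    for _ in range(samples):
        scale = rng.uniform(0.5, 1.5)   # assumed brightness range
        shift = rng.randint(-1, 1)      # assumed small viewpoint jitter
        for j in range(n):
            # view[i] = scale * x[(i - shift) % n], hence
            # d score / d x[j] = scale * w[(j + shift) % n]
            grad[j] += scale * weights[(j + shift) % n] / samples
    return grad

def exact_expected_score(weights, texture):
    """Exact expectation for this toy: E[scale] = 1 and three equally likely shifts."""
    views = [transform(texture, 1.0, s) for s in (-1, 0, 1)]
    return sum(sum(w * v for w, v in zip(weights, view)) for view in views) / 3

def optimize(weights, stages=(4, 8, 16), steps=20, lr=0.1, samples=32, seed=0):
    """Gradient ascent on the expected score, coarse resolution first."""
    rng = random.Random(seed)
    n = len(weights)
    coarse = [0.5] * stages[0]
    for res in stages:
        coarse = upsample(coarse, res)  # carry the previous stage to a finer grid
        block = n // res
        for _ in range(steps):
            grad = eot_gradient(weights, n, samples, rng)
            for k in range(res):
                # A coarse texel covers `block` fine texels, so it sums their gradients.
                g = sum(grad[k * block:(k + 1) * block])
                coarse[k] = min(1.0, max(0.0, coarse[k] + lr * g))
    return upsample(coarse, n)

if __name__ == "__main__":
    # Hypothetical score that rewards brightness on the left half of the texture.
    w = [1.0] * 8 + [-1.0] * 8
    tex = optimize(w)
    print(tex)
    print(exact_expected_score(w, tex))
```

Because the toy score is linear, the coarse-to-fine schedule does not change the optimum here; it only shows how such a schedule could be wired in. The paper ties its curriculum to distance-dependent frequency characteristics, which this sketch does not model.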

Related papers

Reinforced Embodied Active Defense: Exploiting Adaptive Interaction for Robust Visual Perception in Adversarial 3D Environments [26.37868865624549]
Adversarial attacks in 3D environments have emerged as a critical threat to the reliability of visual perception systems. We introduce Reinforced Embodied Active Defense (Rein-EAD), a proactive defense framework that leverages adaptive exploration and interaction with the environment. Rein-EAD exhibits robust generalization to unseen and adaptive attacks, making it suitable for real-world complex tasks.
arXiv Detail & Related papers (2025-07-24T14:56:21Z)
3D Gaussian Splatting Driven Multi-View Robust Physical Adversarial Camouflage Generation [50.03578546845548]
Physical adversarial attack methods expose the vulnerabilities of deep neural networks and pose a significant threat to safety-critical scenarios such as autonomous driving. Camouflage-based physical attack is a more promising approach compared to the patch-based attack, offering stronger adversarial effectiveness in complex physical environments. We propose a physical attack framework based on 3D Gaussian Splatting (3DGS), named PGA, which provides rapid and precise reconstruction with few images.
arXiv Detail & Related papers (2025-07-02T05:10:16Z)
The Meeseeks Mesh: Spatially Consistent 3D Adversarial Objects for BEV Detector [37.74333887056029]
3D object detection is a critical component in autonomous driving systems. In this paper, we investigate the vulnerability of 3D object detection models to 3D adversarial attacks. We generate non-invasive 3D adversarial objects tailored for real-world attack scenarios.
arXiv Detail & Related papers (2025-05-28T15:49:54Z)
AdvReal: Physical Adversarial Patch Generation Framework for Security Evaluation of Object Detection Systems [13.653653250544004]
We propose a unified joint adversarial training framework for both 2D and 3D domains. We develop a realism enhancement mechanism that incorporates non-rigid deformation modeling and texture remapping. Our method achieves an average attack success rate (ASR) of 70.13% on YOLOv12 in physical scenarios.
arXiv Detail & Related papers (2025-05-22T08:54:03Z)
Towards Physically Realizable Adversarial Attacks in Embodied Vision Navigation [18.218913010189237]
We propose a practical attack method for embodied vision navigation by attaching adversarial patches to objects. Our adversarial patches decrease the navigation success rate by an average of 22.39%, outperforming previous methods in practicality, effectiveness, and naturalness.
arXiv Detail & Related papers (2024-09-16T08:21:22Z)
ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection [70.11264880907652]
Recent camouflaged object detection (COD) attempts to segment objects visually blended into their surroundings, which is extremely complex and difficult in real-world scenarios. We propose an effective unified collaborative pyramid network that mimics human behavior when observing vague images, namely zooming in and out. Our framework consistently outperforms existing state-of-the-art methods on image and video COD benchmarks.
arXiv Detail & Related papers (2023-10-31T06:11:23Z)
Unified Adversarial Patch for Visible-Infrared Cross-modal Attacks in the Physical World [11.24237636482709]
We design a unified adversarial patch that can perform cross-modal physical attacks, achieving evasion in both modalities simultaneously with a single patch. We propose a novel boundary-limited shape optimization approach that aims to achieve compact and smooth shapes for the adversarial patch. Our method is evaluated against several state-of-the-art object detectors, achieving an Attack Success Rate (ASR) of over 80%.
arXiv Detail & Related papers (2023-07-27T08:14:22Z)
On the Real-World Adversarial Robustness of Real-Time Semantic Segmentation Models for Autonomous Driving [59.33715889581687]
The existence of real-world adversarial examples (commonly in the form of patches) poses a serious threat for the use of deep learning models in safety-critical computer vision tasks. This paper presents an evaluation of the robustness of semantic segmentation models when attacked with different types of adversarial patches. A novel loss function is proposed to improve the capabilities of attackers in inducing a misclassification of pixels.
arXiv Detail & Related papers (2022-01-05T22:33:43Z)
MoCaNet: Motion Retargeting in-the-wild via Canonicalization Networks [77.56526918859345]
We present a novel framework that brings the 3D motion retargeting task from controlled environments to in-the-wild scenarios. It is capable of retargeting body motion from a character in a 2D monocular video to a 3D character without using any motion capture system or 3D reconstruction procedure.
arXiv Detail & Related papers (2021-12-19T07:52:05Z)
Nonprehensile Riemannian Motion Predictive Control [57.295751294224765]
We introduce a novel Real-to-Sim reward analysis technique to reliably imagine and predict the outcome of taking possible actions for a real robotic platform. We produce a closed-loop controller to reactively push objects in a continuous action space. We observe that RMPC is robust in cluttered as well as occluded environments and outperforms the baselines.
arXiv Detail & Related papers (2021-11-15T18:50:04Z)
Evaluating the Robustness of Semantic Segmentation for Autonomous Driving against Real-World Adversarial Patch Attacks [62.87459235819762]
In a real-world scenario like autonomous driving, more attention should be devoted to real-world adversarial examples (RWAEs). This paper presents an in-depth evaluation of the robustness of popular SS models by testing the effects of both digital and real-world adversarial patches.
arXiv Detail & Related papers (2021-08-13T11:49:09Z)
Exploring Adversarial Robustness of Multi-Sensor Perception Systems in Self Driving [87.3492357041748]
In this paper, we showcase practical susceptibilities of multi-sensor detection by placing an adversarial object on top of a host vehicle. Our experiments demonstrate that successful attacks are primarily caused by easily corrupted image features. Towards more robust multi-modal perception systems, we show that adversarial training with feature denoising can boost robustness to such attacks significantly.
arXiv Detail & Related papers (2021-01-17T21:15:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.