Beyond the Patch: Exploring Vulnerabilities of Visuomotor Policies via Viewpoint-Consistent 3D Adversarial Object
- URL: http://arxiv.org/abs/2603.04913v1
- Date: Thu, 05 Mar 2026 07:57:47 GMT
- Title: Beyond the Patch: Exploring Vulnerabilities of Visuomotor Policies via Viewpoint-Consistent 3D Adversarial Object
- Authors: Chanmi Lee, Minsung Yoon, Woojae Kim, Sebin Lee, Sung-eui Yoon,
- Abstract summary: This work proposes a viewpoint-consistent adversarial texture optimization method for 3D objects through differentiable rendering.<n>As optimization strategies, we employ Expectation over Transformation (EOT) with a Coarse-to-Fine (C2F) curriculum.<n>We further integrate saliency-guided perturbations to redirect policy attention and design a targeted loss that persistently drives robots toward adversarial objects.
- Score: 26.15314358613966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural network-based visuomotor policies enable robots to perform manipulation tasks but remain susceptible to perceptual attacks. For example, conventional 2D adversarial patches are effective under fixed-camera setups, where appearance is relatively consistent; however, their efficacy often diminishes under dynamic viewpoints from moving cameras, such as wrist-mounted setups, due to perspective distortions. To proactively investigate potential vulnerabilities beyond 2D patches, this work proposes a viewpoint-consistent adversarial texture optimization method for 3D objects through differentiable rendering. As optimization strategies, we employ Expectation over Transformation (EOT) with a Coarse-to-Fine (C2F) curriculum, exploiting distance-dependent frequency characteristics to induce textures effective across varying camera-object distances. We further integrate saliency-guided perturbations to redirect policy attention and design a targeted loss that persistently drives robots toward adversarial objects. Our comprehensive experiments show that the proposed method is effective under various environmental conditions, while confirming its black-box transferability and real-world applicability.
Related papers
- Reinforced Embodied Active Defense: Exploiting Adaptive Interaction for Robust Visual Perception in Adversarial 3D Environments [26.37868865624549]
Adversarial attacks in 3D environments have emerged as a critical threat to the reliability of visual perception systems.<n>We introduce Reinforced Embodied Active Defense (Rein-EAD), a proactive defense framework that leverages adaptive exploration and interaction with the environment.<n>Rein-EAD exhibits robust generalization to unseen and adaptive attacks, making it suitable for real-world complex tasks.
arXiv Detail & Related papers (2025-07-24T14:56:21Z) - 3D Gaussian Splatting Driven Multi-View Robust Physical Adversarial Camouflage Generation [50.03578546845548]
Physical adversarial attack methods expose the vulnerabilities of deep neural networks and pose a significant threat to safety-critical scenarios such as autonomous driving.<n> Camouflage-based physical attack is a more promising approach compared to the patch-based attack, offering stronger adversarial effectiveness in complex physical environments.<n>We propose a physical attack framework based on 3D Gaussian Splatting (3DGS), named PGA, which provides rapid and precise reconstruction with few images.
arXiv Detail & Related papers (2025-07-02T05:10:16Z) - The Meeseeks Mesh: Spatially Consistent 3D Adversarial Objects for BEV Detector [37.74333887056029]
3D object detection is a critical component in autonomous driving systems.<n>In this paper, we investigate the vulnerability of 3D object detection models to 3D adversarial attacks.<n>We generate non-invasive 3D adversarial objects tailored for real-world attack scenarios.
arXiv Detail & Related papers (2025-05-28T15:49:54Z) - AdvReal: Physical Adversarial Patch Generation Framework for Security Evaluation of Object Detection Systems [13.653653250544004]
We propose a unified joint adversarial training framework for both 2D and 3D domains.<n>We develop a realism enhancement mechanism that incorporates non-rigid deformation modeling and texture remapping.<n>Our method achieves an average attack success rate (ASR) of 70.13% on YOLOv12 in physical scenarios.
arXiv Detail & Related papers (2025-05-22T08:54:03Z) - Towards Physically Realizable Adversarial Attacks in Embodied Vision Navigation [18.218913010189237]
We propose a practical attack method for embodied vision navigation by attaching adversarial patches to objects.<n>Our adversarial patches decrease the navigation success rate by an average of 22.39%, outperforming previous methods in practicality, effectiveness, and naturalness.
arXiv Detail & Related papers (2024-09-16T08:21:22Z) - ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection [70.11264880907652]
Recent object (COD) attempts to segment objects visually blended into their surroundings, which is extremely complex and difficult in real-world scenarios.
We propose an effective unified collaborative pyramid network that mimics human behavior when observing vague images and camouflaged zooming in and out.
Our framework consistently outperforms existing state-of-the-art methods in image and video COD benchmarks.
arXiv Detail & Related papers (2023-10-31T06:11:23Z) - Unified Adversarial Patch for Visible-Infrared Cross-modal Attacks in
the Physical World [11.24237636482709]
We design a unified adversarial patch that can perform cross-modal physical attacks, achieving evasion in both modalities simultaneously with a single patch.
We propose a novel boundary-limited shape optimization approach that aims to achieve compact and smooth shapes for the adversarial patch.
Our method is evaluated against several state-of-the-art object detectors, achieving an Attack Success Rate (ASR) of over 80%.
arXiv Detail & Related papers (2023-07-27T08:14:22Z) - On the Real-World Adversarial Robustness of Real-Time Semantic
Segmentation Models for Autonomous Driving [59.33715889581687]
The existence of real-world adversarial examples (commonly in the form of patches) poses a serious threat for the use of deep learning models in safety-critical computer vision tasks.
This paper presents an evaluation of the robustness of semantic segmentation models when attacked with different types of adversarial patches.
A novel loss function is proposed to improve the capabilities of attackers in inducing a misclassification of pixels.
arXiv Detail & Related papers (2022-01-05T22:33:43Z) - MoCaNet: Motion Retargeting in-the-wild via Canonicalization Networks [77.56526918859345]
We present a novel framework that brings the 3D motion task from controlled environments to in-the-wild scenarios.
It is capable of body motion from a character in a 2D monocular video to a 3D character without using any motion capture system or 3D reconstruction procedure.
arXiv Detail & Related papers (2021-12-19T07:52:05Z) - Nonprehensile Riemannian Motion Predictive Control [57.295751294224765]
We introduce a novel Real-to-Sim reward analysis technique to reliably imagine and predict the outcome of taking possible actions for a real robotic platform.
We produce a closed-loop controller to reactively push objects in a continuous action space.
We observe that RMPC is robust in cluttered as well as occluded environments and outperforms the baselines.
arXiv Detail & Related papers (2021-11-15T18:50:04Z) - Evaluating the Robustness of Semantic Segmentation for Autonomous
Driving against Real-World Adversarial Patch Attacks [62.87459235819762]
In a real-world scenario like autonomous driving, more attention should be devoted to real-world adversarial examples (RWAEs)
This paper presents an in-depth evaluation of the robustness of popular SS models by testing the effects of both digital and real-world adversarial patches.
arXiv Detail & Related papers (2021-08-13T11:49:09Z) - Exploring Adversarial Robustness of Multi-Sensor Perception Systems in
Self Driving [87.3492357041748]
In this paper, we showcase practical susceptibilities of multi-sensor detection by placing an adversarial object on top of a host vehicle.
Our experiments demonstrate that successful attacks are primarily caused by easily corrupted image features.
Towards more robust multi-modal perception systems, we show that adversarial training with feature denoising can boost robustness to such attacks significantly.
arXiv Detail & Related papers (2021-01-17T21:15:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.