Embodied Active Defense: Leveraging Recurrent Feedback to Counter Adversarial Patches
- URL: http://arxiv.org/abs/2404.00540v1
- Date: Sun, 31 Mar 2024 03:02:35 GMT
- Title: Embodied Active Defense: Leveraging Recurrent Feedback to Counter Adversarial Patches
- Authors: Lingxuan Wu, Xiao Yang, Yinpeng Dong, Liuwei Xie, Hang Su, Jun Zhu
- Abstract summary: The vulnerability of deep neural networks to adversarial patches has motivated numerous defense strategies for boosting model robustness.
We develop Embodied Active Defense (EAD), a proactive defensive strategy that actively contextualizes environmental information to address misaligned adversarial patches in 3D real-world settings.
- Score: 37.317604316147985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The vulnerability of deep neural networks to adversarial patches has motivated numerous defense strategies for boosting model robustness. However, the prevailing defenses depend on a single observation or pre-established adversary information to counter adversarial patches, often failing against unseen or adaptive adversarial attacks and performing poorly in dynamic 3D environments. Inspired by active human perception and recurrent feedback mechanisms, we develop Embodied Active Defense (EAD), a proactive defensive strategy that actively contextualizes environmental information to address misaligned adversarial patches in 3D real-world settings. To achieve this, EAD comprises two central recurrent sub-modules, i.e., a perception module and a policy module, to implement two critical functions of active vision. These modules recurrently process a series of beliefs and observations, facilitating progressive refinement of their comprehension of the target object and enabling the development of strategic actions to counter adversarial patches in 3D environments. To optimize learning efficiency, we incorporate a differentiable approximation of environmental dynamics and deploy patches that are agnostic to the adversary's strategies. Extensive experiments demonstrate that EAD substantially enhances robustness against a variety of patches within just a few steps through its action policy in safety-critical tasks (e.g., face recognition and object detection), without compromising standard accuracy. Furthermore, owing to its attack-agnostic design, EAD generalizes well to unseen attacks, reducing the average attack success rate by 95 percent across a range of unseen adversarial attacks.
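The abstract describes EAD's two recurrent sub-modules: a perception module that refines a belief about the target object from successive observations, and a policy module that maps that belief to the next action (e.g., a viewpoint change). Below is a minimal sketch of that perception-action loop, assuming a PyTorch implementation; the GRU backbone, layer sizes, task head, and the `env_step` callable standing in for the (differentiable) environment dynamics are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of the recurrent perception/policy loop described in the
# abstract. All module sizes and the GRU backbone are illustrative assumptions.
import torch
import torch.nn as nn


class PerceptionModule(nn.Module):
    """Recurrently fuses the current observation with the running belief."""

    def __init__(self, obs_dim: int = 512, belief_dim: int = 256, num_classes: int = 10):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, belief_dim)        # stand-in for a visual backbone
        self.recurrent = nn.GRUCell(belief_dim, belief_dim)  # belief refinement across steps
        self.head = nn.Linear(belief_dim, num_classes)       # task prediction (e.g., identity)

    def forward(self, obs, belief):
        feat = torch.relu(self.encoder(obs))
        belief = self.recurrent(feat, belief)
        return self.head(belief), belief


class PolicyModule(nn.Module):
    """Maps the current belief to a viewpoint-changing action."""

    def __init__(self, belief_dim: int = 256, action_dim: int = 3):
        super().__init__()
        self.actor = nn.Sequential(
            nn.Linear(belief_dim, 128), nn.Tanh(), nn.Linear(128, action_dim)
        )

    def forward(self, belief):
        return self.actor(belief)


def active_defense_rollout(env_step, obs, steps: int = 4):
    """Run a few perception/action steps.

    env_step is a hypothetical callable mapping an action to the next
    observation; it stands in for the (differentiable) 3D environment.
    """
    perception, policy = PerceptionModule(), PolicyModule()
    belief = torch.zeros(obs.shape[0], 256)
    prediction = None
    for _ in range(steps):
        prediction, belief = perception(obs, belief)  # refine belief with new evidence
        action = policy(belief)                       # choose the next viewpoint
        obs = env_step(action)                        # observe the scene from the new pose
    return prediction


if __name__ == "__main__":
    # Toy environment: a random observation generator, purely for shape checking.
    dummy_env = lambda action: torch.randn(2, 512)
    out = active_defense_rollout(dummy_env, torch.randn(2, 512))
    print(out.shape)  # torch.Size([2, 10])
```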
Related papers
- Game-Theoretic Defenses for Robust Conformal Prediction Against Adversarial Attacks in Medical Imaging [12.644923600594176]
Adversarial attacks pose significant threats to the reliability and safety of deep learning models.
This paper introduces a novel framework that integrates conformal prediction with game-theoretic defensive strategies.
arXiv Detail & Related papers (2024-11-07T02:20:04Z)
- Real-world Adversarial Defense against Patch Attacks based on Diffusion Model [34.86098237949215]
This paper introduces DIFFender, a novel DIFfusion-based DeFender framework to counter adversarial patch attacks.
At the core of our approach is the discovery of the Adversarial Anomaly Perception (AAP) phenomenon.
DIFFender seamlessly integrates the tasks of patch localization and restoration within a unified diffusion model framework.
arXiv Detail & Related papers (2024-09-14T10:38:35Z)
- Robust Image Classification: Defensive Strategies against FGSM and PGD Adversarial Attacks [0.0]
Adversarial attacks pose significant threats to the robustness of deep learning models in image classification.
This paper explores and refines defense mechanisms against these attacks to enhance the resilience of neural networks.
arXiv Detail & Related papers (2024-08-20T02:00:02Z)
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach.
arXiv Detail & Related papers (2024-06-13T15:55:04Z)
- Towards Robust Semantic Segmentation against Patch-based Attack via Attention Refinement [68.31147013783387]
We observe that the attention mechanism is vulnerable to patch-based adversarial attacks.
In this paper, we propose a Robust Attention Mechanism (RAM) to improve the robustness of the semantic segmentation model.
arXiv Detail & Related papers (2024-01-03T13:58:35Z)
- Mutual-modality Adversarial Attack with Semantic Perturbation [81.66172089175346]
We propose a novel approach that generates adversarial attacks in a mutual-modality optimization scheme.
Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.
arXiv Detail & Related papers (2023-12-20T05:06:01Z)
- DIFFender: Diffusion-Based Adversarial Defense against Patch Attacks [34.86098237949214]
Adversarial attacks, particularly patch attacks, pose significant threats to the robustness and reliability of deep learning models.
This paper introduces DIFFender, a novel defense framework that harnesses the capabilities of a text-guided diffusion model to combat patch attacks.
DIFFender integrates dual tasks of patch localization and restoration within a single diffusion model framework.
arXiv Detail & Related papers (2023-06-15T13:33:27Z)
- Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer in adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
- Evaluating the Robustness of Semantic Segmentation for Autonomous Driving against Real-World Adversarial Patch Attacks [62.87459235819762]
In a real-world scenario like autonomous driving, more attention should be devoted to real-world adversarial examples (RWAEs).
This paper presents an in-depth evaluation of the robustness of popular SS models by testing the effects of both digital and real-world adversarial patches.
arXiv Detail & Related papers (2021-08-13T11:49:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.