Related papers: Fighting Fire with Fire (F3): A Training-free and Efficient Visual Adversarial Example Purification Method in LVLMs

URL: http://arxiv.org/abs/2506.01064v3
Date: Sun, 14 Sep 2025 09:51:48 GMT
Title: Fighting Fire with Fire (F3): A Training-free and Efficient Visual Adversarial Example Purification Method in LVLMs
Authors: Yudong Zhang, Ruobing Xie, Yiqing Huang, Jiansheng Chen, Xingwu Sun, Zhanhui Kang, Di Wang, Yu Wang
Abstract summary: We introduce F3, a novel adversarial purification framework that employs a counterintuitive "fighting fire with fire" strategy. By injecting noise into adversarial examples, F3 effectively refines their attention, resulting in cleaner and more reliable model outputs. F3 offers several distinct advantages: it is training-free and straightforward to implement, and exhibits significant computational efficiency improvements.
Score: 53.59536976915476
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advances in large vision-language models (LVLMs) have showcased their remarkable capabilities across a wide range of multimodal vision-language tasks. However, these models remain vulnerable to visual adversarial attacks, which can substantially compromise their performance. In this paper, we introduce F3, a novel adversarial purification framework that employs a counterintuitive "fighting fire with fire" strategy: intentionally introducing simple perturbations to adversarial examples to mitigate their harmful effects. Specifically, F3 leverages cross-modal attentions derived from randomly perturbed adversarial examples as reference targets. By injecting noise into these adversarial examples, F3 effectively refines their attention, resulting in cleaner and more reliable model outputs. Remarkably, this seemingly paradoxical approach of employing noise to counteract adversarial attacks yields impressive purification results. Furthermore, F3 offers several distinct advantages: it is training-free and straightforward to implement, and exhibits significant computational efficiency improvements compared to existing purification methods. These attributes render F3 particularly suitable for large-scale industrial applications where both robust performance and operational efficiency are critical priorities. The code is available at https://github.com/btzyd/F3.
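The two-step idea in the abstract (take the attention of a randomly perturbed adversarial example as a reference, then add noise that pulls the adversarial example's attention toward it) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper or its repository: `toy_attention` is a plain softmax over patch/query dot products standing in for an LVLM's cross-modal attention, the squared-error loss and the sign-gradient update are one plausible choice, and the names `purify`, `eps`, `step`, and `iters` are hypothetical.

```python
import math
import random

def toy_attention(patches, query):
    """Softmax over patch/query dot products (a stand-in for cross-modal attention)."""
    scores = [sum(p * q for p, q in zip(patch, query)) for patch in patches]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_loss_grad(patches, query, reference):
    """Return 0.5 * ||attn - reference||^2 and its gradient w.r.t. every patch value."""
    attn = toy_attention(patches, query)
    diff = [a - r for a, r in zip(attn, reference)]
    loss = 0.5 * sum(d * d for d in diff)
    # Backpropagate through the softmax: dL/ds_i = a_i * (diff_i - sum_j a_j * diff_j).
    weighted = sum(a * d for a, d in zip(attn, diff))
    dscores = [a * (d - weighted) for a, d in zip(attn, diff)]
    # Each score is a dot product with the query, so dL/dpatch_i = dL/ds_i * query.
    grad = [[ds * q for q in query] for ds in dscores]
    return loss, grad

def purify(adv_patches, query, eps=0.1, step=0.02, iters=5, seed=0):
    """Hypothetical purification loop; assumed for illustration, not the paper's code."""
    rng = random.Random(seed)
    # 1. Reference attention from a randomly perturbed copy of the adversarial input.
    noisy = [[v + rng.uniform(-eps, eps) for v in patch] for patch in adv_patches]
    reference = toy_attention(noisy, query)
    # 2. Optimize a bounded perturbation that pulls the attention toward the reference.
    delta = [[0.0] * len(patch) for patch in adv_patches]
    for _ in range(iters):
        current = [[v + d for v, d in zip(p, dp)] for p, dp in zip(adv_patches, delta)]
        _, grad = attention_loss_grad(current, query, reference)
        for i, row in enumerate(grad):
            for j, g in enumerate(row):
                sign = (g > 0) - (g < 0)
                # Signed descent step, clipped so the perturbation stays within eps.
                delta[i][j] = max(-eps, min(eps, delta[i][j] - step * sign))
    return [[v + d for v, d in zip(p, dp)] for p, dp in zip(adv_patches, delta)]

# Usage: three 2-dimensional "patches" and one "text query" vector (made-up values).
adv = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
query = [1.0, -1.0]
purified = purify(adv, query)
```

The sketch involves no training and only a handful of forward and backward passes through the attention function, which is the kind of cost profile the abstract describes. A real LVLM would replace the hand-written gradient with automatic differentiation.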

Related papers

DiffCAP: Diffusion-based Cumulative Adversarial Purification for Vision Language Models [45.126261544696185]
Vision Language Models (VLMs) have shown remarkable capabilities in multimodal understanding, yet their susceptibility to perturbations poses a significant threat to their reliability in real-world applications. This paper introduces DiffCAP, a novel diffusion-based purification strategy that can effectively neutralize adversarial corruptions in VLMs.
arXiv Detail & Related papers (2025-06-04T13:26:33Z)
Towards Effective and Efficient Adversarial Defense with Diffusion Models for Robust Visual Tracking [15.806472680573297]
This paper proposes, for the first time, a novel adversarial defense method based on denoising diffusion probabilistic models, termed DiffDf. Experiments show that DiffDf achieves real-time inference speeds of over 30 FPS, showcasing outstanding defense performance and efficiency.
arXiv Detail & Related papers (2025-05-31T00:37:28Z)
Beyond Classification: Evaluating Diffusion Denoised Smoothing for Security-Utility Trade off [4.497768222083102]
Diffusion Denoised Smoothing is emerging as a promising technique to enhance model robustness. We analyze three datasets with four distinct downstream tasks under three different adversarial attack algorithms. Applying high-noise diffusion denoising to clean images without any distortions significantly degrades performance, by as much as 57%. We introduce a novel attack strategy specifically targeting the diffusion process itself, capable of circumventing defenses in the low-noise regime.
arXiv Detail & Related papers (2025-05-21T14:49:24Z)
Eidos: Efficient, Imperceptible Adversarial 3D Point Clouds [16.604139389480615]
This paper adds to the understanding of adversarial attacks by presenting Eidos, a framework providing Efficient Imperceptible aDversarial attacks on 3D pOint cloudS.
arXiv Detail & Related papers (2024-05-23T06:09:08Z)
Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders [101.42201747763178]
Unlearnable examples (UEs) seek to maximize testing error by making subtle modifications to training examples that are correctly labeled. Our work provides a novel disentanglement mechanism to build an efficient pre-training purification method.
arXiv Detail & Related papers (2024-05-02T16:49:25Z)
FedRDF: A Robust and Dynamic Aggregation Function against Poisoning Attacks in Federated Learning [0.0]
Federated Learning (FL) represents a promising approach to typical privacy concerns associated with centralized Machine Learning (ML) deployments. Despite its well-known advantages, FL is vulnerable to security attacks such as Byzantine behaviors and poisoning attacks. Our proposed approach was tested against various model poisoning attacks, demonstrating superior performance over state-of-the-art aggregation methods.
arXiv Detail & Related papers (2024-02-15T16:42:04Z)
DALA: A Distribution-Aware LoRA-Based Adversarial Attack against Language Models [64.79319733514266]
Adversarial attacks can introduce subtle perturbations to input data. Recent attack methods can achieve a relatively high attack success rate (ASR). We propose a Distribution-Aware LoRA-based Adversarial Attack (DALA) method.
arXiv Detail & Related papers (2023-11-14T23:43:47Z)
On Evaluating Adversarial Robustness of Large Vision-Language Models [64.66104342002882]
We evaluate the robustness of large vision-language models (VLMs) in the most realistic and high-risk setting. In particular, we first craft targeted adversarial examples against pretrained models such as CLIP and BLIP. Black-box queries on these VLMs can further improve the effectiveness of targeted evasion.
arXiv Detail & Related papers (2023-05-26T13:49:44Z)
Dynamic Transformers Provide a False Sense of Efficiency [75.39702559746533]
Multi-exit models make a trade-off between efficiency and accuracy, where the saving of computation comes from an early exit. We propose a simple yet effective attacking framework, SAME, which is specially tailored to reduce the efficiency of the multi-exit models. Experiments on the GLUE benchmark show that SAME can effectively diminish the efficiency gain of various multi-exit models by 80% on average.
arXiv Detail & Related papers (2023-05-20T16:41:48Z)
PointACL: Adversarial Contrastive Learning for Robust Point Clouds Representation under Adversarial Attack [73.3371797787823]
Adversarial contrastive learning (ACL) is considered an effective way to improve the robustness of pre-trained models. We present a robustness-aware loss function to adversarially train a self-supervised contrastive learning framework. We validate our method, PointACL, on downstream tasks, including 3D classification and 3D segmentation with multiple datasets.
arXiv Detail & Related papers (2022-09-14T22:58:31Z)
Diffusion Models for Adversarial Purification [69.1882221038846]
Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model. We propose DiffPure, which uses diffusion models for adversarial purification. Our method achieves state-of-the-art results, outperforming current adversarial training and adversarial purification methods.
arXiv Detail & Related papers (2022-05-16T06:03:00Z)
Improving White-box Robustness of Pre-processing Defenses via Joint Adversarial Training [106.34722726264522]
A range of adversarial defense techniques have been proposed to mitigate the interference of adversarial noise. Pre-processing methods may suffer from the robustness degradation effect. A potential cause of this negative effect is that adversarial training examples are static and independent of the pre-processing model. We propose a method called Joint Adversarial Training based Pre-processing (JATP) defense.
arXiv Detail & Related papers (2021-06-10T01:45:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.