Related papers: MMAD-Purify: A Precision-Optimized Framework for Efficient and Scalable Multi-Modal Attacks

URL: http://arxiv.org/abs/2410.14089v1
Date: Thu, 17 Oct 2024 23:52:39 GMT
Title: MMAD-Purify: A Precision-Optimized Framework for Efficient and Scalable Multi-Modal Attacks
Authors: Xinxin Liu, Zhongliang Guo, Siyuan Huang, Chun Pong Lau
Abstract summary: We introduce an innovative framework that incorporates a precision-optimized noise predictor to enhance the effectiveness of our attack framework. Our framework provides a cutting-edge solution for multi-modal adversarial attacks, ensuring reduced latency. We demonstrate that our framework achieves outstanding transferability and robustness against purification defenses.
Score: 21.227398434694724
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Neural networks have achieved remarkable performance across a wide range of tasks, yet they remain susceptible to adversarial perturbations, which pose significant risks in safety-critical applications. With the rise of multimodality, diffusion models have emerged as powerful tools not only for generative tasks but also for various applications such as image editing, inpainting, and super-resolution. However, these models still lack robustness due to limited research on attacking them to enhance their resilience. Traditional attack techniques, such as gradient-based adversarial attacks and diffusion model-based methods, are hindered by computational inefficiencies and scalability issues due to their iterative nature. To address these challenges, we introduce an innovative framework that leverages the distilled backbone of diffusion models and incorporates a precision-optimized noise predictor to enhance the effectiveness of our attack framework. This approach not only enhances the attack's potency but also significantly reduces computational costs. Our framework provides a cutting-edge solution for multi-modal adversarial attacks, ensuring reduced latency and the generation of high-fidelity adversarial examples with superior success rates. Furthermore, we demonstrate that our framework achieves outstanding transferability and robustness against purification defenses, outperforming existing gradient-based attack models in both effectiveness and efficiency.

Related papers

CoDefend: Cross-Modal Collaborative Defense via Diffusion Purification and Prompt Optimization [4.6467356929461925]
Multimodal Large Language Models (MLLMs) have achieved remarkable success in tasks such as image captioning, visual question answering, and cross-modal reasoning. Their multimodal nature exposes them to adversarial threats, where attackers can perturb either modality or both jointly to induce harmful, misleading, or policy violating outputs. Existing defense strategies, such as adversarial training and input purification, face notable limitations. We propose a supervised diffusion based denoising framework that leverages paired adversarial clean image datasets to fine-tune diffusion models.
arXiv Detail & Related papers (2025-10-13T07:44:54Z)
The Power of Many: Synergistic Unification of Diverse Augmentations for Efficient Adversarial Robustness [6.471349369877151]
Adversarial perturbations pose a significant threat to deep learning models. Adversarial Training (AT) faces challenges of high computational costs and a degradation in standard performance. We propose the Universal Adversarial Augmenter (UAA) framework, which is characterized by its plug-and-play nature and training efficiency.
arXiv Detail & Related papers (2025-08-05T08:42:14Z)
Exploiting Edge Features for Transferable Adversarial Attacks in Distributed Machine Learning [54.26807397329468]
This work explores a previously overlooked vulnerability in distributed deep learning systems. An adversary who intercepts the intermediate features transmitted between them can still pose a serious threat. We propose an exploitation strategy specifically designed for distributed settings.
arXiv Detail & Related papers (2025-07-09T20:09:00Z)
MISLEADER: Defending against Model Extraction with Ensembles of Distilled Models [56.09354775405601]
Model extraction attacks aim to replicate the functionality of a black-box model through query access. Most existing defenses presume that attacker queries have out-of-distribution (OOD) samples, enabling them to detect and disrupt suspicious inputs. We propose MISLEADER, a novel defense strategy that does not rely on OOD assumptions.
arXiv Detail & Related papers (2025-06-03T01:37:09Z)
Enhancing Variational Autoencoders with Smooth Robust Latent Encoding [54.74721202894622]
Variational Autoencoders (VAEs) have played a key role in scaling up diffusion-based generative models. We introduce Smooth Robust Latent VAE, a novel adversarial training framework that boosts both generation quality and robustness. Experiments show that SRL-VAE improves both generation quality, in image reconstruction and text-guided image editing, and robustness, against Nightshade attacks and image editing attacks.
arXiv Detail & Related papers (2025-04-24T03:17:57Z)
One-Step Diffusion Model for Image Motion-Deblurring [85.76149042561507]
We propose a one-step diffusion model for deblurring (OSDD), a novel framework that reduces the denoising process to a single step. To tackle fidelity loss in diffusion models, we introduce an enhanced variational autoencoder (eVAE), which improves structural restoration. Our method achieves strong performance on both full and no-reference metrics.
arXiv Detail & Related papers (2025-03-09T09:39:57Z)
SAP-DIFF: Semantic Adversarial Patch Generation for Black-Box Face Recognition Models via Diffusion Models [4.970240615354004]
Impersonation attacks are a significant threat because adversarial perturbations allow attackers to disguise themselves as legitimate users. We propose a novel method to generate adversarial patches via semantic perturbations in the latent space rather than direct pixel manipulation. Our method achieves an average attack success rate improvement of 45.66%, and a reduction in the number of queries by about 40%.
arXiv Detail & Related papers (2025-02-27T02:57:29Z)
Improving the Transferability of Adversarial Examples by Inverse Knowledge Distillation [15.362394334872077]
Inverse Knowledge Distillation (IKD) is designed to enhance adversarial transferability effectively. IKD integrates with gradient-based attack methods, promoting diversity in attack gradients and mitigating overfitting to specific model architectures. Experiments on the ImageNet dataset validate the effectiveness of our approach.
arXiv Detail & Related papers (2025-02-24T09:35:30Z)
Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models [93.76814568163353]
We propose a novel bilevel optimization framework for pruned diffusion models. This framework consolidates the fine-tuning and unlearning processes into a unified phase. It is compatible with various pruning and concept unlearning methods.
arXiv Detail & Related papers (2024-12-19T19:13:18Z)
Improving Transferable Targeted Attacks with Feature Tuning Mixup [12.707753562907534]
Deep neural networks exhibit vulnerability to examples that can transfer across different models. We propose Feature Tuning Mixup (FTM) to enhance targeted attack transferability. Our method achieves significant improvements over state-of-the-art methods while maintaining low computational cost.
arXiv Detail & Related papers (2024-11-23T13:18:25Z)
Pixel Is Not A Barrier: An Effective Evasion Attack for Pixel-Domain Diffusion Models [9.905296922309157]
Diffusion Models have emerged as powerful generative models for high-quality image synthesis, with many subsequent image editing techniques based on them. Previous works have attempted to safeguard images from diffusion-based editing by adding imperceptible perturbations. Our work proposes a novel attacking framework with a feature representation attack loss that exploits vulnerabilities in denoising UNets and a latent optimization strategy to enhance the naturalness of protected images.
arXiv Detail & Related papers (2024-08-21T17:56:34Z)
Multi-granular Adversarial Attacks against Black-box Neural Ranking Models [111.58315434849047]
We create high-quality adversarial examples by incorporating multi-granular perturbations. We transform the multi-granular attack into a sequential decision-making process. Our attack method surpasses prevailing baselines in both attack effectiveness and imperceptibility.
arXiv Detail & Related papers (2024-04-02T02:08:29Z)
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content [62.685566387625975]
Current mitigation strategies, while effective, are not resilient under adversarial attacks. This paper introduces Resilient Guardrails for Large Language Models (RigorLLM), a novel framework designed to efficiently moderate harmful and unsafe inputs.
arXiv Detail & Related papers (2024-03-19T07:25:02Z)
Mutual-modality Adversarial Attack with Semantic Perturbation [81.66172089175346]
We propose a novel approach that generates adversarial attacks in a mutual-modality optimization scheme. Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.
arXiv Detail & Related papers (2023-12-20T05:06:01Z)
Introducing Foundation Models as Surrogate Models: Advancing Towards More Practical Adversarial Attacks [15.882687207499373]
No-box adversarial attacks are becoming more practical and challenging for AI systems. This paper recasts adversarial attack as a downstream task by introducing foundational models as surrogate models.
arXiv Detail & Related papers (2023-07-13T08:10:48Z)
LEAT: Towards Robust Deepfake Disruption in Real-World Scenarios via Latent Ensemble Attack [11.764601181046496]
Deepfakes, malicious visual contents created by generative models, pose an increasingly harmful threat to society. To proactively mitigate deepfake damages, recent studies have employed adversarial perturbation to disrupt deepfake model outputs. We propose a simple yet effective disruption method called Latent Ensemble ATtack (LEAT), which attacks the independent latent encoding process.
arXiv Detail & Related papers (2023-07-04T07:00:37Z)
Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically. Our method learns the optimizer in adversarial attacks parameterized by a recurrent neural network. We develop a model-agnostic training algorithm to improve the ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications. We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths. Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
A Perceptual Distortion Reduction Framework for Adversarial Perturbation Generation [58.6157191438473]
We propose a perceptual distortion reduction framework to tackle this problem from two perspectives. We propose a perceptual distortion constraint and add it into the objective function of adversarial attack to jointly optimize the perceptual distortions and attack success rate.
arXiv Detail & Related papers (2021-05-01T15:08:10Z)
Adversarial example generation with AdaBelief Optimizer and Crop Invariance [8.404340557720436]
Adversarial attacks can be an important method to evaluate and select robust models in safety-critical applications. We propose AdaBelief Iterative Fast Gradient Method (ABI-FGM) and Crop-Invariant attack Method (CIM) to improve the transferability of adversarial examples. Our method has higher success rates than state-of-the-art gradient-based attack methods.
arXiv Detail & Related papers (2021-02-07T06:00:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.