Related papers: Exploring Frequencies via Feature Mixing and Meta-Learning for Improving Adversarial Transferability

URL: http://arxiv.org/abs/2405.03193v1
Date: Mon, 6 May 2024 06:32:58 GMT
Title: Exploring Frequencies via Feature Mixing and Meta-Learning for Improving Adversarial Transferability
Authors: Juanjuan Weng, Zhiming Luo, Shaozi Li
Abstract summary: We introduce a frequency decomposition-based feature mixing method to exploit frequency characteristics in both clean and adversarial samples. Our findings suggest that incorporating features of clean samples into adversarial features extracted from adversarial examples is more effective in attacking normally-trained models. We propose a cross-frequency meta-optimization approach comprising the meta-train step, meta-test step, and final update.
Score: 26.159434438078968
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent studies have shown that Deep Neural Networks (DNNs) are susceptible to adversarial attacks, with frequency-domain analysis underscoring the significance of high-frequency components in influencing model predictions. Conversely, targeting low-frequency components has been effective in enhancing attack transferability on black-box models. In this study, we introduce a frequency decomposition-based feature mixing method to exploit these frequency characteristics in both clean and adversarial samples. Our findings suggest that incorporating features of clean samples into adversarial features extracted from adversarial examples is more effective in attacking normally-trained models, while combining clean features with the adversarial features extracted from low-frequency parts decomposed from the adversarial samples yields better results in attacking defense models. However, a conflict issue arises when these two mixing approaches are employed simultaneously. To tackle the issue, we propose a cross-frequency meta-optimization approach comprising the meta-train step, meta-test step, and final update. In the meta-train step, we leverage the low-frequency components of adversarial samples to boost the transferability of attacks against defense models. Meanwhile, in the meta-test step, we utilize adversarial samples to stabilize gradients, thereby enhancing the attack's transferability against normally trained models. For the final update, we update the adversarial sample based on the gradients obtained from both meta-train and meta-test steps. Our proposed method is evaluated through extensive experiments on the ImageNet-Compatible dataset, affirming its effectiveness in improving the transferability of attacks on both normally-trained CNNs and defense models. The source code is available at https://github.com/WJJLL/MetaSSA.
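
The three-step structure described in the abstract (meta-train on the low-frequency component, meta-test on the adversarial sample, final update from both gradients) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which is available at the repository linked above. Everything here is an assumption made for illustration: the toy linear surrogate with a logistic loss, the FFT low-pass filter and its `keep` parameter, the temporary sample used in the meta-test step, the sign-based step rule, and all function names. The feature mixing component is omitted entirely.

```python
import numpy as np


def low_pass(x, keep=4):
    """Low-frequency part of a 2-D array via a centered FFT mask.

    `keep` is a hypothetical parameter: the half-width of the retained band.
    """
    rows, cols = x.shape
    spectrum = np.fft.fftshift(np.fft.fft2(x))
    mask = np.zeros((rows, cols))
    cr, cc = rows // 2, cols // 2
    mask[cr - keep:cr + keep + 1, cc - keep:cc + keep + 1] = 1.0
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))


def loss(x, y, w):
    """Logistic loss of a toy linear surrogate model; y is +1 or -1."""
    return np.logaddexp(0.0, -y * np.sum(w * x))


def loss_grad(x, y, w):
    """Analytic gradient of `loss` with respect to the input x."""
    return -y * w / (1.0 + np.exp(y * np.sum(w * x)))


def cross_frequency_attack(x, y, w, eps=0.1, steps=10):
    """Iterative attack with a meta-train / meta-test / final-update loop."""
    alpha = eps / steps
    x_adv = x.copy()
    for _ in range(steps):
        # Meta-train: gradient of the loss on the low-frequency component.
        # The symmetric mask makes the filter self-adjoint, so the chain
        # rule applies the same filter to the inner gradient.
        g_train = low_pass(loss_grad(low_pass(x_adv), y, w))

        # Meta-test (assumed form): gradient at a temporary full-spectrum
        # sample obtained by stepping along the meta-train gradient.
        x_tmp = x_adv + alpha * np.sign(g_train)
        g_test = loss_grad(x_tmp, y, w)

        # Final update: combine both gradients, then project back into
        # the eps-ball around x and the valid pixel range [0, 1].
        x_adv = x_adv + alpha * np.sign(g_train + g_test)
        x_adv = np.clip(x_adv, x - eps, x + eps)
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv


# Usage on random data standing in for an image and a surrogate model.
rng = np.random.default_rng(0)
image = rng.uniform(0.0, 1.0, size=(16, 16))
weights = rng.normal(size=(16, 16))
label = 1.0 if np.sum(weights * image) >= 0 else -1.0
adversarial = cross_frequency_attack(image, label, weights)
```

Summing the two gradients is only one way to combine them; the sketch uses it because the abstract states that the final update relies on the gradients from both steps without giving the weighting.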

Related papers

Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications. Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space. We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z)
FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning [98.43475653490219]
Federated learning (FL) is susceptible to poisoning attacks. FreqFed is a novel aggregation mechanism that transforms the model updates into the frequency domain. We demonstrate that FreqFed can mitigate poisoning attacks effectively with a negligible impact on the utility of the aggregated model.
arXiv Detail & Related papers (2023-12-07T16:56:24Z)
Improving Adversarial Transferability by Stable Diffusion [36.97548018603747]
Deep neural networks (DNNs) are susceptible to adversarial examples, which introduce imperceptible perturbations to benign samples, deceiving predictions. We introduce a novel attack method called Stable Diffusion Attack Method (SDAM), which incorporates samples generated by Stable Diffusion to augment input images.
arXiv Detail & Related papers (2023-11-18T09:10:07Z)
Enhancing Adversarial Robustness via Score-Based Optimization [22.87882885963586]
Adversarial attacks have the potential to mislead deep neural network classifiers by introducing slight perturbations. We introduce a novel adversarial defense scheme named ScoreOpt, which optimizes adversarial samples at test time. Our experimental results demonstrate that our approach outperforms existing adversarial defenses in terms of both robustness and speed.
arXiv Detail & Related papers (2023-07-10T03:59:42Z)
Boosting Adversarial Transferability via Fusing Logits of Top-1 Decomposed Feature [36.78292952798531]
We propose a Singular Value Decomposition (SVD)-based feature-level attack method. Our approach is inspired by the discovery that eigenvectors associated with the larger singular values from the middle layer features exhibit superior generalization and attention properties.
arXiv Detail & Related papers (2023-05-02T12:27:44Z)
Phase-shifted Adversarial Training [8.89749787668458]
We analyze the behavior of adversarial training through the lens of response frequency. PhaseAT significantly improves the convergence for high-frequency information. This results in improved adversarial robustness by enabling the model to have smoothed predictions near each data point.
arXiv Detail & Related papers (2023-01-12T02:25:22Z)
Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard method in adversarial robustness assumes a framework to defend against samples crafted by minimally perturbing a sample. We use metric learning to frame adversarial regularization as an optimal transport problem. Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
Practical No-box Adversarial Attacks with Training-free Hybrid Image Transformation [94.30136898739448]
We show the existence of a training-free adversarial perturbation under the no-box threat model. Motivated by our observation that the high-frequency component (HFC) dominates in low-level features, we attack an image mainly by manipulating its frequency components. Our method is even competitive with mainstream transfer-based black-box attacks.
arXiv Detail & Related papers (2022-03-09T09:51:00Z)
Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically. Our method learns the optimizer in adversarial attacks, parameterized by a recurrent neural network. We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
Modelling Adversarial Noise for Adversarial Defense [96.56200586800219]
Adversarial defenses typically focus on exploiting adversarial examples to remove adversarial noise or train an adversarially robust target model. Motivated by the observation that the relationship between adversarial data and natural data can help infer clean data from adversarial data to obtain the final correct prediction, we model adversarial noise to learn the transition relationship in the label space, using adversarial labels to improve adversarial accuracy.
arXiv Detail & Related papers (2021-09-21T01:13:26Z)
Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications. We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths. Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
Yet Another Intermediate-Level Attack [31.055720988792416]
The transferability of adversarial examples across deep neural network (DNN) models is the crux of a spectrum of black-box attacks. We propose a novel method to enhance the black-box transferability of baseline adversarial examples.
arXiv Detail & Related papers (2020-08-20T09:14:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.