Strong Transferable Adversarial Attacks via Ensembled Asymptotically Normal Distribution Learning
- URL: http://arxiv.org/abs/2209.11964v2
- Date: Fri, 29 Mar 2024 08:46:46 GMT
- Title: Strong Transferable Adversarial Attacks via Ensembled Asymptotically Normal Distribution Learning
- Authors: Zhengwei Fang, Rui Wang, Tao Huang, Liping Jing
- Abstract summary: We propose an approach named Multiple Asymptotically Normal Distribution Attacks (MultiANDA)
We approximate the posterior distribution over the perturbations by taking advantage of the asymptotic normality property of stochastic gradient ascent (SGA).
Our proposed method outperforms ten state-of-the-art black-box attacks on deep learning models with or without defenses.
- Score: 24.10329164911317
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Strong adversarial examples are crucial for evaluating and enhancing the robustness of deep neural networks. However, the performance of popular attacks is usually sensitive, for instance, to minor image transformations, stemming from limited information -- typically only one input example, a handful of white-box source models, and undefined defense strategies. Hence, the crafted adversarial examples are prone to overfit the source model, which hampers their transferability to unknown architectures. In this paper, we propose an approach named Multiple Asymptotically Normal Distribution Attacks (MultiANDA), which explicitly characterizes adversarial perturbations from a learned distribution. Specifically, we approximate the posterior distribution over the perturbations by taking advantage of the asymptotic normality property of stochastic gradient ascent (SGA), then employ the deep ensemble strategy as an effective proxy for Bayesian marginalization in this process, aiming to estimate a mixture of Gaussians that facilitates a more thorough exploration of the potential optimization space. The approximated posterior essentially describes the stationary distribution of SGA iterations, which captures the geometric information around the local optimum. Thus, MultiANDA allows drawing an unlimited number of adversarial perturbations for each input and reliably maintains the transferability. Our proposed method outperforms ten state-of-the-art black-box attacks on deep learning models with or without defenses through extensive experiments on seven normally trained and seven defense models.
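As a rough illustration of the idea described in the abstract, the sketch below (not the authors' code; `model`, the step sizes, and the noise-based augmentation are assumptions) runs stochastic gradient ascent on the attack loss, fits a Gaussian to the late iterates of each run, and ensembles several runs into a mixture from which any number of perturbations can be drawn:

```python
import torch
import torch.nn.functional as F

def sga_gaussian(model, x, y, eps=8/255, alpha=1/255, steps=50, burn_in=10):
    """One SGA run: collect the late iterates and fit a diagonal Gaussian to them."""
    delta = torch.zeros_like(x)
    iterates = []
    for t in range(steps):
        delta = delta.detach().requires_grad_(True)
        # Stochasticity comes from a random input augmentation (here: additive noise).
        x_aug = (x + delta + 0.03 * torch.randn_like(x)).clamp(0, 1)
        loss = F.cross_entropy(model(x_aug), y)
        grad, = torch.autograd.grad(loss, delta)
        # Gradient *ascent* on the loss, kept inside the L_inf budget.
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        if t >= burn_in:                               # iterates near the stationary distribution
            iterates.append(delta.detach().clone())
    stack = torch.stack(iterates)
    return stack.mean(dim=0), stack.std(dim=0) + 1e-8  # per-run Gaussian (mean, std)

def multianda_sample(model, x, y, n_runs=4, n_samples=20, eps=8/255):
    """Deep-ensemble proxy: a mixture of per-run Gaussians, sampled for new perturbations."""
    components = [sga_gaussian(model, x, y, eps=eps) for _ in range(n_runs)]
    adv_examples = []
    for _ in range(n_samples):
        mu, sigma = components[torch.randint(n_runs, (1,)).item()]  # pick a mixture component
        delta = (mu + sigma * torch.randn_like(mu)).clamp(-eps, eps)
        adv_examples.append((x + delta).clamp(0, 1))
    return adv_examples  # arbitrarily many adversarial examples for a single input
```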
Related papers
- Indiscriminate Disruption of Conditional Inference on Multivariate Gaussians [60.22542847840578]
Despite advances in adversarial machine learning, inference for Gaussian models in the presence of an adversary is notably understudied.
We consider a self-interested attacker who wishes to disrupt a decision-maker's conditional inference and subsequent actions by corrupting a set of evidentiary variables.
To avoid detection, the attacker also desires the attack to appear plausible wherein plausibility is determined by the density of the corrupted evidence.
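A toy sketch of this setting (illustrative only; the joint Gaussian, the plausibility radius, and the closed-form direction are assumptions, not the paper's formulation): the attacker shifts the defender's conditional mean E[u | z] while keeping the corrupted evidence inside a Mahalanobis ball, i.e. sufficiently dense:

```python
import numpy as np

rng = np.random.default_rng(0)
# Joint Gaussian over (u, z1, z2): u is what the defender infers, (z1, z2) is the evidence.
mu = np.array([0.0, 0.0, 0.0])
Sigma = np.array([[1.0, 0.6, 0.3],
                  [0.6, 1.0, 0.2],
                  [0.3, 0.2, 1.0]])
S_uz, S_zz = Sigma[:1, 1:], Sigma[1:, 1:]
A = S_uz @ np.linalg.inv(S_zz)          # E[u | z] = mu_u + A (z - mu_z): linear in the evidence

def cond_mean(z):
    return (mu[0] + A @ (z - mu[1:])).item()

z_obs = rng.multivariate_normal(mu[1:], S_zz)   # honest evidence
radius = 1.0                                    # plausibility budget (Mahalanobis distance)

# The largest shift of E[u | z] under a Mahalanobis constraint on the corruption has a
# closed form: move along S_zz @ A^T and scale the step to the boundary of the ball.
d = (S_zz @ A.T).ravel()
d = radius * d / np.sqrt(d @ np.linalg.inv(S_zz) @ d)
z_adv = z_obs + d

print("clean    E[u|z] =", cond_mean(z_obs))
print("attacked E[u|z] =", cond_mean(z_adv))
```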
arXiv Detail & Related papers (2024-11-21T17:46:55Z)
- Boosting the Targeted Transferability of Adversarial Examples via Salient Region & Weighted Feature Drop [2.176586063731861]
A prevalent approach for adversarial attacks relies on the transferability of adversarial examples.
We propose a novel framework based on Salient Region & Weighted Feature Drop (SWFD), designed to enhance the targeted transferability of adversarial examples.
arXiv Detail & Related papers (2024-11-11T08:23:37Z)
- Boosting Adversarial Transferability by Achieving Flat Local Maxima [23.91315978193527]
Recently, various adversarial attacks have emerged to boost adversarial transferability from different perspectives.
In this work, we assume and empirically validate that adversarial examples at a flat local region tend to have good transferability.
We propose an approximation optimization method to simplify the gradient update of the objective function.
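One simple proxy for this idea (not the paper's exact update; `model` and all hyperparameters below are assumptions) is to average gradients over randomly perturbed neighbours of the current iterate, so the attack prefers ascent directions whose loss stays high over a whole neighbourhood:

```python
import torch
import torch.nn.functional as F

def flat_region_attack(model, x, y, eps=8/255, alpha=1/255, steps=10, n_neighbors=5, zeta=2/255):
    delta = torch.zeros_like(x)
    for _ in range(steps):
        grad = torch.zeros_like(x)
        for _ in range(n_neighbors):
            # Sample a neighbour inside a small ball around the current adversarial point.
            neighbor = (x + delta + zeta * torch.empty_like(x).uniform_(-1, 1)).requires_grad_(True)
            loss = F.cross_entropy(model(neighbor.clamp(0, 1)), y)
            grad += torch.autograd.grad(loss, neighbor)[0] / n_neighbors
        # Ascend along the neighbourhood-averaged gradient, within the L_inf budget.
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
    return (x + delta).clamp(0, 1)
```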
arXiv Detail & Related papers (2023-06-08T14:21:02Z)
- Boosting Adversarial Transferability via Fusing Logits of Top-1 Decomposed Feature [36.78292952798531]
We propose a Singular Value Decomposition (SVD)-based feature-level attack method.
Our approach is inspired by the discovery that eigenvectors associated with the larger singular values from the middle layer features exhibit superior generalization and attention properties.
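A rough sketch of how such a feature-level loss could be wired up (not the authors' implementation; the `feature_extractor`/`classifier_head` split of the network and the hyperparameters are assumptions):

```python
import torch
import torch.nn.functional as F

def top1_component(feat):
    """Rank-1 reconstruction of a (C, H, W) feature map, treated as a C x (H*W) matrix."""
    c, h, w = feat.shape
    U, S, Vh = torch.linalg.svd(feat.reshape(c, h * w), full_matrices=False)
    return (S[0] * torch.outer(U[:, 0], Vh[0])).reshape(c, h, w)

def svd_feature_attack(feature_extractor, classifier_head, x, y, eps=8/255, alpha=1/255, steps=10):
    delta = torch.zeros_like(x)
    for _ in range(steps):
        delta = delta.detach().requires_grad_(True)
        feat = feature_extractor((x + delta).clamp(0, 1))[0]         # (C, H, W) for a single image
        logits = classifier_head(top1_component(feat).unsqueeze(0))  # logits of the rank-1 feature
        loss = F.cross_entropy(logits, y)                            # ascend: leave the true class
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
    return (x + delta).clamp(0, 1)
```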
arXiv Detail & Related papers (2023-05-02T12:27:44Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach, known as adversarial training (AT), has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations [70.05004034081377]
We first propose a novel method for generating composite adversarial examples.
Our method can find the optimal attack composition by utilizing component-wise projected gradient descent.
We then propose generalized adversarial training (GAT) to extend model robustness from the $\ell_p$-ball to composite semantic perturbations.
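A simplified sketch of component-wise projected gradient descent over a two-component composite perturbation (a brightness shift plus L_inf noise); the real method covers more semantic components and also optimizes their ordering, and all names and bounds below are assumptions:

```python
import torch
import torch.nn.functional as F

def composite_pgd(model, x, y, steps=10, eps=8/255, alpha=1/255, b_max=0.2, b_step=0.02):
    delta = torch.zeros_like(x)                   # L_inf noise component
    bright = torch.zeros(x.shape[0], 1, 1, 1)     # per-image brightness component
    for _ in range(steps):
        delta = delta.detach().requires_grad_(True)
        bright = bright.detach().requires_grad_(True)
        x_adv = (x + bright + delta).clamp(0, 1)  # apply the components in a fixed order
        loss = F.cross_entropy(model(x_adv), y)
        g_delta, g_bright = torch.autograd.grad(loss, (delta, bright))
        # Each component takes its own ascent step and is projected onto its own set.
        delta = (delta + alpha * g_delta.sign()).clamp(-eps, eps)
        bright = (bright + b_step * g_bright.sign()).clamp(-b_max, b_max)
    return (x + bright + delta).clamp(0, 1)
```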
arXiv Detail & Related papers (2022-02-09T02:41:56Z)
- Patch-wise++ Perturbation for Adversarial Targeted Attacks [132.58673733817838]
We propose a patch-wise iterative method (PIM) aimed at crafting adversarial examples with high transferability.
Specifically, we introduce an amplification factor to the step size in each iteration, and one pixel's overall gradient overflowing the $\epsilon$-constraint is properly assigned to its surrounding regions.
Compared with the current state-of-the-art attack methods, we significantly improve the success rate by 35.9% for defense models and 32.7% for normally trained models.
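A hedged sketch of this patch-wise mechanism (not the authors' code; the kernel size and the beta/gamma values are illustrative): the step is amplified, and any excess over the epsilon budget is redistributed to neighbouring pixels with a uniform projection kernel:

```python
import torch
import torch.nn.functional as F

def patchwise_attack(model, x, y, eps=16/255, steps=10, beta=10.0, gamma=1.0, k=3):
    alpha = eps / steps
    # Uniform per-channel "project" kernel used to spread overflowing gradient to neighbours.
    kernel = torch.ones(x.shape[1], 1, k, k, device=x.device) / (k * k)
    delta, accum = torch.zeros_like(x), torch.zeros_like(x)
    for _ in range(steps):
        delta = delta.detach().requires_grad_(True)
        loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
        grad, = torch.autograd.grad(loss, delta)
        accum = accum + beta * alpha * grad.sign()                # amplified accumulated update
        overflow = (accum.abs() - eps).clamp(min=0) * accum.sign()
        spread = F.conv2d(overflow, kernel, padding=k // 2, groups=x.shape[1])
        delta = (accum + gamma * alpha * spread.sign()).clamp(-eps, eps)
    return (x + delta).clamp(0, 1)
```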
arXiv Detail & Related papers (2020-12-31T08:40:42Z)
- Shaping Deep Feature Space towards Gaussian Mixture for Visual Classification [74.48695037007306]
We propose a Gaussian mixture (GM) loss function for deep neural networks for visual classification.
With a classification margin and a likelihood regularization, the GM loss facilitates both high classification performance and accurate modeling of the feature distribution.
The proposed model can be implemented easily and efficiently without using extra trainable parameters.
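A loose sketch of a Gaussian-mixture style loss built from the stated ingredients (distance-based logits, a margin on the true class, and a likelihood regularizer); the class centers are passed in rather than learned here, and `margin`/`lambda_lkd` are illustrative values, not the paper's:

```python
import torch
import torch.nn.functional as F

def gm_loss(features, labels, centers, margin=0.1, lambda_lkd=0.1):
    # features: (N, D), labels: (N,), centers: (num_classes, D) with identity covariance assumed.
    sq_dist = torch.cdist(features, centers).pow(2)          # (N, num_classes)
    one_hot = F.one_hot(labels, centers.shape[0]).float()
    # Classification term: distance-based logits, with the true-class distance inflated by the margin.
    logits = -(sq_dist + margin * sq_dist * one_hot) / 2
    cls_term = F.cross_entropy(logits, labels)
    # Likelihood term: pull each feature toward its own class center.
    lkd_term = (sq_dist * one_hot).sum(dim=1).mean() / 2
    return cls_term + lambda_lkd * lkd_term
```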
arXiv Detail & Related papers (2020-11-18T03:32:27Z)
- Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
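As a loose illustration of training against a distribution of perturbations rather than a single worst case (not the paper's exact objective; the squashed-Gaussian parameterization and the step sizes are assumptions):

```python
import torch
import torch.nn.functional as F

def adt_step(model, optimizer, x, y, eps=8/255, inner_steps=5, inner_lr=0.1):
    mu = torch.zeros_like(x, requires_grad=True)
    log_sigma = torch.full_like(x, -3.0).requires_grad_(True)
    inner_opt = torch.optim.Adam([mu, log_sigma], lr=inner_lr)
    for _ in range(inner_steps):                    # inner max: fit the perturbation distribution
        xi = torch.randn_like(x)
        delta = eps * torch.tanh(mu + log_sigma.exp() * xi)   # reparameterized sample in the eps-ball
        loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
        inner_opt.zero_grad()
        (-loss).backward()                          # ascend on the classification loss
        inner_opt.step()
    # Outer min: update the model on a fresh sample from the learned distribution.
    delta = eps * torch.tanh(mu + log_sigma.exp() * torch.randn_like(x)).detach()
    optimizer.zero_grad()
    F.cross_entropy(model((x + delta).clamp(0, 1)), y).backward()
    optimizer.step()
```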
This list is automatically generated from the titles and abstracts of the papers in this site.