Assessing Robustness via Score-Based Adversarial Image Generation
- URL: http://arxiv.org/abs/2310.04285v1
- Date: Fri, 6 Oct 2023 14:37:22 GMT
- Title: Assessing Robustness via Score-Based Adversarial Image Generation
- Authors: Marcel Kollovieh, Lukas Gosch, Yan Scholten, Marten Lienen, Stephan Günnemann
- Abstract summary: We introduce Score-Based Adversarial Generation (ScoreAG) to generate adversarial examples beyond $\ell_p$-norm constraints.
ScoreAG maintains the core semantics of images while generating realistic adversarial examples, either by transforming existing images or synthesizing new ones entirely from scratch.
Our empirical evaluation demonstrates that ScoreAG matches the performance of state-of-the-art attacks and defenses across multiple benchmarks.
- Score: 7.640804709462919
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most adversarial attacks and defenses focus on perturbations within small
$\ell_p$-norm constraints. However, $\ell_p$ threat models cannot capture all
relevant semantic-preserving perturbations, and hence, the scope of robustness
evaluations is limited. In this work, we introduce Score-Based Adversarial
Generation (ScoreAG), a novel framework that leverages the advancements in
score-based generative models to generate adversarial examples beyond
$\ell_p$-norm constraints, so-called unrestricted adversarial examples,
overcoming their limitations. Unlike traditional methods, ScoreAG maintains the
core semantics of images while generating realistic adversarial examples,
either by transforming existing images or synthesizing new ones entirely from
scratch. We further exploit the generative capability of ScoreAG to purify
images, empirically enhancing the robustness of classifiers. Our extensive
empirical evaluation demonstrates that ScoreAG matches the performance of
state-of-the-art attacks and defenses across multiple benchmarks. This work
highlights the importance of investigating adversarial examples bounded by
semantics rather than $\ell_p$-norm constraints. ScoreAG represents an
important step towards more encompassing robustness assessments.
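The abstract does not include code; the sketch below only illustrates the general idea of guiding a pretrained score-based (diffusion) model with a classifier-adversarial term during sampling. The sampler, the score_model and classifier interfaces, and the guidance scale are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def adversarially_guided_sampling(score_model, classifier, y_true, shape,
                                  n_steps=500, guidance_scale=1.0, device="cpu"):
    """Hypothetical sketch: draw a sample from a score-based model while nudging
    it away from the true class, so the image stays realistic (score prior)
    but fools the classifier (adversarial guidance)."""
    x = torch.randn(shape, device=device)                 # start from Gaussian noise
    ts = torch.linspace(1.0, 1e-3, n_steps, device=device)
    dt = ts[0] - ts[1]                                    # (positive) step size
    for t in ts:
        with torch.enable_grad():
            x_in = x.detach().requires_grad_(True)
            adv_loss = F.cross_entropy(classifier(x_in), y_true)
            adv_grad = torch.autograd.grad(adv_loss, x_in)[0]   # push away from y_true
        score = score_model(x, t.expand(shape[0]))        # approx. gradient of log p_t(x)
        drift = score + guidance_scale * adv_grad         # prior + adversarial guidance
        x = x + dt * drift + (2 * dt).sqrt() * torch.randn_like(x)  # Langevin-like step
    return x.clamp(0.0, 1.0)
```

This sketch only covers generation from scratch; transforming existing images and purification, which the paper also describes, would start from a given image rather than pure noise.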
Related papers
- Counterfactual Image Generation for adversarially robust and
interpretable Classifiers [1.3859669037499769]
We propose a unified framework leveraging image-to-image translation Generative Adversarial Networks (GANs) to produce counterfactual samples.
This is achieved by combining the classifier and discriminator into a single model that attributes real images to their respective classes and flags generated images as "fake".
We show that the model exhibits improved robustness to adversarial attacks and that the discriminator's "fakeness" value serves as an uncertainty measure for its predictions.
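As a rough illustration of this combined classifier/discriminator idea, a single network can expose K class logits plus one extra "fake" logit whose probability doubles as an uncertainty score. All names below are hypothetical; this is not the authors' code.

```python
import torch.nn as nn

class ClassifierDiscriminator(nn.Module):
    """Hypothetical sketch: a single model with K + 1 outputs.
    Logits 0..K-1 correspond to the real classes; logit K means "fake"
    (generated), and its softmax probability serves as the "fakeness" /
    uncertainty measure of the prediction."""
    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone                          # any feature extractor
        self.head = nn.Linear(feat_dim, num_classes + 1)  # K real classes + "fake"

    def forward(self, x):
        logits = self.head(self.backbone(x))
        probs = logits.softmax(dim=-1)
        class_pred = probs[:, :-1].argmax(dim=-1)         # predicted real class
        fakeness = probs[:, -1]                           # "fakeness" score
        return logits, class_pred, fakeness
```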
arXiv Detail & Related papers (2023-10-01T18:50:29Z)
- Alternating Objectives Generates Stronger PGD-Based Adversarial Attacks [78.2700757742992]
Projected Gradient Descent (PGD) is one of the most effective and conceptually simple algorithms to generate adversarial examples.
We experimentally verify this assertion on a synthetic-data example and by evaluating our proposed method across 25 different $\ell_\infty$-robust models and 3 datasets.
Our strongest adversarial attack outperforms all of the white-box components of the AutoAttack ensemble.
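For reference, a standard $\ell_\infty$ PGD attack (the baseline this paper builds on, not its alternating-objectives variant) can be sketched as follows; the model interface, budgets, and step counts are illustrative.

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Sketch of vanilla l_inf PGD: repeat a signed-gradient ascent step on the
    cross-entropy loss and project back into the eps-ball around the input."""
    x = x.detach()
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)  # random start
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()      # ascent step
        x_adv = x + (x_adv - x).clamp(-eps, eps)          # project onto the eps-ball
        x_adv = x_adv.clamp(0, 1)                         # keep a valid image
    return x_adv.detach()
```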
arXiv Detail & Related papers (2022-12-15T17:44:31Z)
- Resisting Adversarial Attacks in Deep Neural Networks using Diverse Decision Boundaries [12.312877365123267]
Deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify.
We develop a new ensemble-based solution that constructs defender models with diverse decision boundaries with respect to the original model.
We present extensive experiments on standard image classification datasets, namely MNIST, CIFAR-10, and CIFAR-100, against state-of-the-art adversarial attacks.
arXiv Detail & Related papers (2022-08-18T08:19:26Z)
- CARBEN: Composite Adversarial Robustness Benchmark [70.05004034081377]
This paper demonstrates how a composite adversarial attack (CAA) affects the resulting image.
It provides real-time inference for different models, which facilitates users' configuration of the attack-level parameters.
A leaderboard to benchmark adversarial robustness against CAA is also introduced.
arXiv Detail & Related papers (2022-07-16T01:08:44Z)
- Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training has proven to be the most effective strategy, injecting adversarial examples into model training.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining (LADDER).
arXiv Detail & Related papers (2022-06-08T07:40:55Z)
- Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations [70.05004034081377]
We first propose a novel method for generating composite adversarial examples.
Our method can find the optimal attack composition by utilizing component-wise projected gradient descent.
We then propose generalized adversarial training (GAT) to extend model robustness from the $\ell_p$-ball to composite semantic perturbations.
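A toy sketch of the component-wise idea follows: here only two components, a brightness shift and an $\ell_\infty$ perturbation, each updated with its own projected gradient step under its own budget. The actual method and its semantic attack components are more general, and all names are illustrative.

```python
import torch
import torch.nn.functional as F

def composite_attack(model, x, y, steps=10, eps_linf=8/255, eps_bright=0.2,
                     lr_linf=2/255, lr_bright=0.05):
    """Hypothetical sketch of a composite attack: a brightness shift and an
    l_inf perturbation are optimized jointly, each with its own projected
    gradient step onto its own (semantic) budget."""
    delta = torch.zeros_like(x, requires_grad=True)              # l_inf component
    bright = torch.zeros(x.size(0), 1, 1, 1, device=x.device, requires_grad=True)
    for _ in range(steps):
        x_adv = (x + bright + delta).clamp(0, 1)                 # apply both components
        loss = F.cross_entropy(model(x_adv), y)
        g_delta, g_bright = torch.autograd.grad(loss, (delta, bright))
        with torch.no_grad():
            delta += lr_linf * g_delta.sign()
            delta.clamp_(-eps_linf, eps_linf)                    # project onto l_inf ball
            bright += lr_bright * g_bright.sign()
            bright.clamp_(-eps_bright, eps_bright)               # project onto brightness budget
    return (x + bright + delta).clamp(0, 1).detach()
```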
arXiv Detail & Related papers (2022-02-09T02:41:56Z)
- ADC: Adversarial attacks against object Detection that evade Context consistency checks [55.8459119462263]
We show that even context consistency checks can be brittle to properly crafted adversarial examples.
We propose an adaptive framework to generate examples that subvert such defenses.
Our results suggest that how to robustly model context and check its consistency is still an open problem.
arXiv Detail & Related papers (2021-10-24T00:25:09Z)
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
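The abstract gives no architectural details, so the following is only a generic sketch of a generator-based (discriminator-free) attack: an auto-encoder produces a bounded perturbation and is trained against a frozen target model. SSAE's saliency mechanism is not reproduced here, and all names are hypothetical.

```python
import torch
import torch.nn.functional as F

def generator_attack_step(generator, target_model, optimizer, x, y,
                          eps=8/255, w_fool=1.0):
    """Generic sketch of a generator-based attack training step: the generator
    maps a clean image to a bounded perturbation and is trained to make the
    frozen target model misclassify while keeping the perturbation small."""
    delta = eps * torch.tanh(generator(x))              # bounded perturbation
    x_adv = (x + delta).clamp(0, 1)
    logits = target_model(x_adv)                         # target_model is frozen (eval mode)
    fool_loss = -F.cross_entropy(logits, y)              # maximize loss on the true label
    quality_loss = delta.abs().mean()                    # keep perturbation small
    loss = w_fool * fool_loss + quality_loss
    optimizer.zero_grad()                                # optimizer holds generator params only
    loss.backward()
    optimizer.step()
    return loss.item()
```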
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
- Generating Structured Adversarial Attacks Using Frank-Wolfe Method [7.84752424025677]
Constraining adversarial search with different norms results in disparately structured adversarial examples.
Structured adversarial examples can be used for adversarial regularization of models to make them more robust or to improve their performance on structurally different datasets.
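A minimal sketch of a Frank-Wolfe attack step under an $\ell_\infty$ constraint is given below; this is one possible instantiation (the paper studies several norms), and names and budgets are illustrative.

```python
import torch
import torch.nn.functional as F

def frank_wolfe_linf_attack(model, x, y, eps=8/255, steps=20):
    """Sketch of a Frank-Wolfe attack under an l_inf constraint: each step
    solves a linear maximization over the eps-ball (whose solution is
    x + eps * sign(grad)) and moves toward it with a decaying step size,
    so no explicit projection is needed."""
    x_adv = x.clone().detach()
    for t in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        s = x + eps * grad.sign()             # linear maximization oracle: ball vertex
        gamma = 2.0 / (t + 2.0)               # standard Frank-Wolfe step size
        x_adv = ((1 - gamma) * x_adv + gamma * s).clamp(0, 1).detach()
    return x_adv
```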
arXiv Detail & Related papers (2021-02-15T06:36:50Z)
- Perceptually Constrained Adversarial Attacks [2.0305676256390934]
We replace the usually applied $L_p$ norms with the structural similarity index (SSIM) measure.
Our SSIM-constrained adversarial attacks can break state-of-the-art adversarially trained classifiers and achieve a similar or higher success rate than the elastic net attack.
We evaluate the performance of several defense schemes in a perceptually much more meaningful way than was done previously in the literature.
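A rough sketch of replacing the $L_p$ bound with an SSIM constraint via a penalty term follows. Here ssim_fn stands for any differentiable SSIM implementation (e.g., from torchmetrics or kornia), and the threshold and weights are illustrative, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def ssim_constrained_attack(model, ssim_fn, x, y, steps=50, lr=0.01,
                            ssim_min=0.95, penalty=10.0):
    """Sketch of a perceptually constrained attack: maximize the classification
    loss while penalizing drops of SSIM(x_adv, x) below a threshold, instead
    of bounding an L_p norm. ssim_fn(a, b) is any differentiable SSIM."""
    x_adv = x.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([x_adv], lr=lr)
    for _ in range(steps):
        loss_cls = F.cross_entropy(model(x_adv), y)
        ssim_gap = torch.relu(ssim_min - ssim_fn(x_adv, x))  # > 0 only if too dissimilar
        loss = -loss_cls + penalty * ssim_gap                 # maximize cls loss, keep SSIM high
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            x_adv.clamp_(0, 1)                                # keep a valid image
    return x_adv.detach()
```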
arXiv Detail & Related papers (2021-02-14T12:28:51Z)
- Trace-Norm Adversarial Examples [24.091216490378567]
Constraining the adversarial search with different norms results in disparately structured adversarial examples.
Structured adversarial perturbations may allow for larger distortion sizes than their $\ell_p$ counterparts.
They allow some control over the generation of the adversarial perturbation, such as (localized) blurriness.
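As an illustration of norm-structured perturbations, the sketch below alternates a gradient ascent step with a nuclear-norm (trace-norm) proximal step that soft-thresholds the singular values of the perturbation, encouraging low-rank, spatially structured noise. This is a hedged approximation of the general idea, not the paper's exact algorithm.

```python
import torch
import torch.nn.functional as F

def svd_soft_threshold(delta, tau):
    """Proximal step for the nuclear (trace) norm: soft-threshold the singular
    values of each channel of the perturbation, encouraging low-rank structure."""
    u, s, vh = torch.linalg.svd(delta, full_matrices=False)  # SVD of each H x W slice
    s = torch.clamp(s - tau, min=0.0)
    return u @ torch.diag_embed(s) @ vh

def trace_norm_attack(model, x, y, steps=20, lr=0.05, tau=0.01):
    """Sketch of a structured attack: gradient ascent on the loss followed by a
    nuclear-norm proximal step on the perturbation (instead of an l_p
    projection), yielding low-rank, spatially structured perturbations."""
    delta = torch.zeros_like(x)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = svd_soft_threshold(delta.detach() + lr * grad, tau)
    return (x + delta).clamp(0, 1).detach()
```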
arXiv Detail & Related papers (2020-07-02T13:37:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.