Adversarial Examples on Segmentation Models Can be Easy to Transfer
- URL: http://arxiv.org/abs/2111.11368v1
- Date: Mon, 22 Nov 2021 17:26:21 GMT
- Title: Adversarial Examples on Segmentation Models Can be Easy to Transfer
- Authors: Jindong Gu, Hengshuang Zhao, Volker Tresp, Philip Torr
- Abstract summary: The transferability of adversarial examples on classification models has attracted a growing interest.
We study the overfitting phenomenon of adversarial examples on classification and segmentation models.
We propose a simple and effective method, dubbed dynamic scaling, to overcome the limitation.
- Score: 21.838878497660353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural network-based image classification can be misled by adversarial
examples with small and quasi-imperceptible perturbations. Furthermore, the
adversarial examples created on one classification model can also fool another
different model. The transferability of the adversarial examples has recently
attracted a growing interest since it makes black-box attacks on classification
models feasible. As an extension of classification, semantic segmentation has
also received much attention towards its adversarial robustness. However, the
transferability of adversarial examples on segmentation models has not been
systematically studied. In this work, we intensively study this topic. First,
we explore the overfitting phenomenon of adversarial examples on classification
and segmentation models. In contrast to the observation made on classification
models that the transferability is limited by overfitting to the source model,
we find that the adversarial examples on segmentation models do not always overfit
the source models. Even when no overfitting is present, the transferability
of adversarial examples is limited. We attribute the limitation to the
architectural traits of segmentation models, i.e., multi-scale object
recognition. Then, we propose a simple and effective method, dubbed dynamic
scaling, to overcome the limitation. The high transferability achieved by our
method shows that, in contrast to the observations in previous work,
adversarial examples on a segmentation model can be easy to transfer to other
segmentation models. Our analysis and proposals are supported by extensive
experiments.
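The abstract attributes the limited transferability to the multi-scale object recognition built into segmentation architectures and proposes dynamic scaling to overcome it, but it does not spell out the algorithm here. The following is only a minimal PyTorch-style sketch of one plausible reading: a PGD-style attack on a source segmentation model whose input is resized to a randomly chosen scale at each iteration, so the perturbation is not tuned to a single input resolution. The function name, scale range, and step sizes are illustrative assumptions, not the authors' exact method.

```python
import torch
import torch.nn.functional as F

def dynamic_scaling_attack(model, image, target, eps=8/255, alpha=2/255,
                           steps=40, scales=(0.75, 1.0, 1.25)):
    """PGD-style attack on a segmentation model; the input is resized to a
    randomly chosen scale at every step so the perturbation does not overfit
    a single input resolution of the source model (illustrative sketch only).

    image:  (N, 3, H, W) float tensor in [0, 1]
    target: (N, H, W) long tensor of ground-truth class indices
    """
    adv = image.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        # pick a random scale for this iteration
        s = scales[torch.randint(len(scales), (1,)).item()]
        h, w = adv.shape[-2:]
        scaled = F.interpolate(adv, size=(int(h * s), int(w * s)),
                               mode="bilinear", align_corners=False)
        logits = model(scaled)                       # (N, C, h', w') class scores
        logits = F.interpolate(logits, size=(h, w),  # back to label resolution
                               mode="bilinear", align_corners=False)
        loss = F.cross_entropy(logits, target)       # mean per-pixel cross-entropy
        grad, = torch.autograd.grad(loss, adv)
        # untargeted sign step, projected back onto the L_inf ball around image
        adv = adv.detach() + alpha * grad.sign()
        adv = image + (adv - image).clamp(-eps, eps)
        adv = adv.clamp(0, 1)
    return adv
```

In this reading, rescaling during the attack plays a role similar to input-diversity tricks used in classification transfer attacks; whether it matches the paper's dynamic scaling exactly should be checked against the full text.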
Related papers
- Scaling Laws for Black-box Adversarial Attacks [37.744814957775965]
Adversarial examples exhibit cross-model transferability, enabling attacks on black-box models.
Model ensembling is an effective strategy to improve the transferability by attacking multiple surrogate models simultaneously.
We show that scaled ensemble attacks yield better semantic interpretability, indicating that features common across models are captured.
arXiv Detail & Related papers (2024-11-25T08:14:37Z) - Transcending Adversarial Perturbations: Manifold-Aided Adversarial
Examples with Legitimate Semantics [10.058463432437659]
Deep neural networks are highly vulnerable to adversarial examples crafted with tiny, malicious perturbations.
In this paper, we propose a supervised semantic-transformation generative model to generate adversarial examples with real and legitimate semantics.
Experiments on MNIST and industrial defect datasets showed that our adversarial examples not only exhibited better visual quality but also achieved superior attack transferability.
arXiv Detail & Related papers (2024-02-05T15:25:40Z) - Learning Debiased and Disentangled Representations for Semantic
Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z) - Contrastive Learning for Fair Representations [50.95604482330149]
Trained classification models can unintentionally lead to biased representations and predictions.
Existing debiasing methods for classification models, such as adversarial training, are often expensive to train and difficult to optimise.
We propose a method for mitigating bias by incorporating contrastive learning, in which instances sharing the same class label are encouraged to have similar representations.
arXiv Detail & Related papers (2021-09-22T10:47:51Z) - Harnessing Perceptual Adversarial Patches for Crowd Counting [92.79051296850405]
Crowd counting is vulnerable to adversarial examples in the physical world.
This paper proposes the Perceptual Adversarial Patch (PAP) generation framework to learn the shared perceptual features between models.
arXiv Detail & Related papers (2021-09-16T13:51:39Z) - On the Transferability of Adversarial Attacks against Neural Text
Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z) - On the Benefits of Models with Perceptually-Aligned Gradients [8.427953227125148]
We show that interpretable and perceptually aligned gradients are present even in models that do not show high robustness to adversarial attacks.
We leverage models with interpretable perceptually-aligned features and show that adversarial training with low max-perturbation bound can improve the performance of models for zero-shot and weakly supervised localization tasks.
arXiv Detail & Related papers (2020-05-04T14:05:38Z) - Learning What Makes a Difference from Counterfactual Examples and
Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z) - Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial
Perturbations [65.05561023880351]
Adversarial examples are malicious inputs crafted to induce misclassification.
This paper studies a complementary failure mode, invariance-based adversarial examples.
We show that defenses against sensitivity-based attacks actively harm a model's accuracy on invariance-based attacks.
arXiv Detail & Related papers (2020-02-11T18:50:23Z)